[R] about strauss process
I am having trouble using the spatstat package. I want to simulate a community under the Strauss process, which has a parameter gamma that controls the interaction strength between points; the Strauss process is defined only for 0 ≤ gamma ≤ 1 and is a model for inhibition between points. My problem is that in my data, the estimated gamma for many species is larger than one. If I still want to simulate the point process of such a species with interaction between points, what can I do? Is there another method I can use?

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] about strauss process
On 24/05/2009, at 3:29 PM, echo_july wrote: [...] in my data, the estimated gamma for many species is larger than one. If I still want to simulate the point process of such a species with interaction between points, what can I do? Is there another method I can use?

You might consider trying the Geyer process. Use cif = "geyer" in forming the model to be used by rmh(). In fitting Geyer models to data, one way to estimate the saturation parameter is via profile pseudolikelihood. See ?profilepl.

cheers,

Rolf Turner
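A minimal sketch of what Rolf's suggestion can look like in spatstat. The window, interaction radius, and parameter values below are invented for illustration; in practice they would come from a model fitted to the data (e.g. with ppm() and profilepl()):

```r
library(spatstat)

# Hypothetical parameter values, for illustration only
mod <- rmhmodel(cif = "geyer",
                par = list(beta  = 100,  # intensity parameter
                           gamma = 1.5,  # gamma > 1 is allowed for Geyer (clustering)
                           r     = 0.05, # interaction radius
                           sat   = 2),   # saturation parameter
                w = owin(c(0, 1), c(0, 1)))

# Simulate one realisation by Metropolis-Hastings
X <- rmh(mod, control = rmhcontrol(nrep = 1e5))
plot(X)
```

Unlike the Strauss model, the Geyer saturation model remains well-defined for gamma > 1, so it can represent the clustering that the fitted gamma values suggest.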
[R] Creating a list by just using start and final component
Hi there, say I have 100 matrices (m1, m2, ..., m100) that I want to combine in a list, so that the list contains the matrices as components. Is it necessary to name all 100 matrices in the list() call? I would like to refer to just the first and last matrix, or something similar. Best, Holger
Re: [R] Creating a list by just using start and final component
Hollix wrote: Say I have 100 matrices (m1, m2, ..., m100) that I want to combine in a list. Is it necessary to name all 100 matrices in the list() call? [...]

Hi, you can do something like that:

matrices <- ls(pattern = "^m[0-9]+$")
res <- lapply(matrices, get)

Romain

--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
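If the matrices really are named m1 through m100, another base-R option is to build the names explicitly and fetch them in one step with mget(); the small matrices below are placeholders standing in for the poster's data:

```r
# Placeholder matrices standing in for m1 ... m100
m1 <- matrix(1:4, 2)
m2 <- matrix(5:8, 2)
m3 <- matrix(9:12, 2)

# Construct the names and fetch the objects as a named list
nms <- paste("m", 1:3, sep = "")
res <- mget(nms, envir = environment())

length(res)    # 3
res[["m2"]]    # the matrix m2
```

This avoids the regular-expression match entirely, which matters if other objects with similar names happen to be in the workspace.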
[R] [newbie] conditional density of a bivariate
Hello, I have a two-dimensional dataset and have successfully (with the help of this list, btw) plotted the density of the data with smoothScatter. I have just one other issue: I would like to see that plot normalized in X, that is, I would like a 2d density plot of Y|X, to see where the critical points are for every X value. Should I create a density matrix and then apply some kind of transformation to it, or is there a more direct way to plot it? thanks, Francesco Stablum

P.S. sorry for my poor English

--
"The generation of random numbers is too important to be left to chance" - Robert R. Coveyou
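One way to do the normalization the poster describes: estimate the joint density on a grid (here with kde2d from the MASS package, which ships with R) and rescale each fixed-x slice so it integrates to 1 over y, turning the joint density f(x, y) into the conditional f(y | x). The simulated data are illustrative only:

```r
library(MASS)  # for kde2d; MASS is shipped with R

set.seed(1)
x <- rnorm(1000)
y <- x + rnorm(1000)

# Joint density on a 50 x 50 grid: d$z[i, j] is the density at (d$x[i], d$y[j])
d <- kde2d(x, y, n = 50)

# Normalize each row of z (each fixed x) so it integrates to 1 over y,
# giving an estimate of the conditional density f(y | x)
dy <- diff(d$y[1:2])
d$z <- sweep(d$z, 1, rowSums(d$z) * dy, "/")

image(d$x, d$y, d$z, xlab = "x", ylab = "y")  # conditional-density plot
```

Every vertical slice of the resulting image now has the same total mass, so regions of x with few points are no longer visually drowned out by the dense center.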
Re: [R] Creating a list by just using start and final component
Wow, thank you so much! Where can I learn such creative approaches? Best, Holger

Romain Francois wrote: Hi, you can do something like that:

matrices <- ls(pattern = "^m[0-9]+$")
res <- lapply(matrices, get)

[...]
[R] help with replacing factors
Hi, in the example dataset below, how can I change "gray20" to "blue"?

# data
black <- rep(c("black", "red"), 10)
gray <- rep(c("gray10", "gray20"), 10)
black_gray <- data.frame(black, gray)

# none of these desperate attempts works
# replace(black_gray$gray, gray == "gray20", "red")
# if (black_gray$gray == "gray20") { black_gray$gray <- "blue" }
# for (i in black_gray$gray) if (black_gray$gray[i] == "gray20") { black_gray$gray[i] <- "blue" }
# black_gray$gray == "gray14" <- "blue"
# black_gray$gray[gray == "gray20"] <- "blue"
# subset(black_gray, gray == "gray20", gray) <- rep("blue", 10)

I have a feeling this is me misunderstanding some very basic stuff about the R engine... so any help will be much appreciated. Thanks in advance, Andreas
Re: [R] Cream Text Editor
JiHO wrote: On 2009-May-23, at 20:16, Jakson Alves de Aquino wrote: Just a note: there is no need of Esc before F9. Almost all key bindings work in insert, normal and visual modes. -- Well, without switching to the non-insert mode, I find that pressing F9 prints the commands in the file instead of executing them. Maybe that's specific to Cream.

I installed and tested Cream here, and F9 in insert mode works for me, but within a few minutes I found other problems. The customization of key bindings in .vimrc is ignored by Cream, and omni completion doesn't work correctly: Cream inserts a spurious '.x:call Cream - redo(i)' in addition to the correct completion. It seems that Cream is unable to work properly in expert mode. Instead of using Cream in expert mode it might be easier to use gvim. Alternatively, if you prefer to use Cream, it might be easier to copy and paste commands into a regular R session running directly in a terminal emulator, because you will benefit from R's built-in tab completion.

--
Jakson
[R] accuracy of a neural net
Hi. I started with a file which was a sparse 982x923 matrix where the last column was a variable to be predicted. I did principal component analysis on it and arrived at a new 982x923 matrix. Then I ran the code below to fit a neural network using nnet, and then wanted to get a confusion matrix, or at least know how accurate the neural net was. I used only the first 22 principal components as inputs for the neural net. I got a perfect prediction rate, which is somewhat suspect (I was using the same data for training and prediction, but I did not expect perfect prediction anyway). So I tried using only a sample of records to build the neural net. Even with this sample I got 980 out of 982 correct. Can anyone spot an error here?

crs$dataset <- read.csv("file:///C:/dataForR/textsTweet1/cleanForPC.csv",
                        na.strings = c(".", "NA", "", "?"))
crs$nnet <- nnet(Value ~ ., data = crs$dataset[, c(1:22, 922)],
                 size = 10, linout = TRUE, skip = TRUE, trace = FALSE, maxit = 1000)
targets <- crs$dataset[, 922]
rawpredictions <- predict(crs$nnet, crs$dataset[, c(1:22)], type = "raw")
roundedpredictions <- round(rawpredictions[, 1], digits = 0)
trueAndPredicted <- cbind(roundedpredictions, targets)
howManyEqual <- trueAndPredicted[, 1] == trueAndPredicted[, 2]
sum(howManyEqual)

samp <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))
samp <- c(sample(1:250, 125), sample(251:500, 125), sample(500:920, 300))
crs$nnet <- nnet(Value ~ ., data = crs$dataset[samp, c(1:22, 922)],
                 size = 10, linout = TRUE, skip = TRUE, trace = FALSE, maxit = 1000)
Re: [R] help with replacing factors
This should work:

levels(black_gray$gray)[levels(black_gray$gray) == 'gray20'] <- 'blue'

On Sun, May 24, 2009 at 8:15 AM, Andreas Christoffersen achristoffer...@gmail.com wrote: Hi, in the example dataset below, how can I change "gray20" to "blue"? [...]

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~
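A self-contained illustration of the levels() recoding, using made-up data like the original example. The point is that for a factor you rename the level once, rather than replacing element by element:

```r
# Example factor resembling the poster's data
gray <- factor(rep(c("gray10", "gray20"), 10))

# Recode the level itself: every "gray20" becomes "blue" in one step
levels(gray)[levels(gray) == "gray20"] <- "blue"

levels(gray)   # "gray10" "blue"
table(gray)    # 10 of each
```

This also explains why the element-wise attempts in the original post fail: assigning a string that is not among the factor's levels produces NA, so the level set has to be changed first.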
[R] A question on type=h plot lines
Dear R users, I need to produce a plot with a single panel and a few lines on it. Each line represents a different data set. The line type must be "h", i.e. 'histogram'-like (or 'high-density') vertical lines. The problem is that the vertical lines comprising a plot line of type = "h" are drawn from the y = 0 line to the (x, y) locations. What I need is vertical lines drawn from a line y = value that I specify to the (x, y) locations. Is there a way for me to achieve that? Thank you very much for your responsiveness and attention. Regards, Martin
Re: [R] help with replacing factors
Try storing them as character strings rather than factors:

black_gray <- data.frame(black, gray, stringsAsFactors = FALSE)

Try this to view what you've got:

str(black_gray)

On Sun, May 24, 2009 at 7:15 AM, Andreas Christoffersen achristoffer...@gmail.com wrote: Hi, in the example dataset below, how can I change "gray20" to "blue"? [...]
Re: [R] using optimize() correctly ...
Hello Berend,

Berend Hasselman wrote: Your function is not unimodal. The help for optimize states: "If f is not unimodal, then optimize() may approximate a local, but perhaps non-global, minimum to the same accuracy."

Ah ok, I didn't read the manual page carefully enough. Do you know if R has a function to find the global maximum/minimum of a function of x over a given interval? nlminb(), optim() (in particular the option method = "L-BFGS-B"), and the function spg() in the BB package were recommended to me if I wanted to optimize a function over x and y given their respective intervals. Will they also potentially only give me the local maxima/minima? I am not a regular R user, so my knowledge is clearly not where it could/should be.

Thanks,
Esmail
Re: [R] Cream Text Editor
Thank you very much for the help; I will work on this over the weekend. Is there a way in Windows to connect R and Cream?

Jakson Alves de Aquino wrote: As pointed out by JiHO, the biggest disadvantage of using the plugin is that R is running through a pipe and consequently it is less interactive. Just a note: there is no need of Esc before F9. Almost all key bindings work in insert, normal and visual modes. The latest version of the plugin allows the user to set the terminal emulator in the vimrc, and now all key bindings are customizable. The details are in the plugin's documentation. Please write to me if you find bugs in the plugin. Jakson

JiHO wrote: On 2009-May-23, at 17:40, Paul Heinrich Dietrich wrote: I'm interested in easing my way into learning Vim by first using the Cream text editor, liking the idea that it will work on both my Linux and Windows computers. I've installed Cream on my Linux machine, but can't figure out how to make Cream talk to R. Does anybody know? I'm using Ubuntu if it makes a difference. Thanks.

You should install the R Vim plugin and its dependencies: http://www.vim.org/scripts/script.php?script_id=2628 This creates commands and icons dedicated to the interaction between Vim and R. Then switch Cream to expert mode (Settings > Preferences > Expert Mode). This will allow you to work in Cream and have all the simple keyboard shortcuts (Control-C, Control-V, etc.) but still be able to switch between modes as in Vim. By default you are in insert mode. You need to switch to normal mode (by pressing ESC) to be able to use the commands of the R-Vim plugin. The workflow is therefore: open an R file; edit stuff; press ESC (to switch to non-edit mode); start R in a terminal (click the icon or press F2); send lines/selection (F9) or document (F5); press ESC (to switch back to insert mode); edit two lines; ESC; F9; F9; ESC; edit again; etc.
The terminal opened this way does not work completely like a regular one, and there are some caveats when reading help and using general command-line editing shortcuts (Ctrl-R for backward search, for example). I haven't found a way around them, so I usually open a second terminal to read the help in, or set R to display the help as HTML files in a browser window. I must say that those caveats can be quite serious, and I often find myself just using copy-paste from gedit into a terminal: set your desktop to focus-follows-mouse; select text in your editor; move the mouse to the terminal; click the middle mouse button; move the mouse back to the editor; etc. More cumbersome, but reliable.

Final note: since you are on Ubuntu, you may want to change the terminal from the default xterm to gnome-terminal. You have to edit the file .vim/ftplugin/r.vim. There is a commented-out line with the gnome-terminal command instead of xterm. Uncomment that one and comment out the xterm one.

JiHO
---
http://jo.irisson.free.fr/
Re: [R] sciplot question
You define your own function for the confidence intervals. The function needs to return the two values representing the upper and lower CI values. So:

qt.fun <- function(x) qt(p = .975, df = length(x) - 1) * sd(x) / sqrt(length(x))
my.ci <- function(x) c(mean(x) - qt.fun(x), mean(x) + qt.fun(x))
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun = my.ci)

Manuel

On Fri, 2009-05-22 at 18:38 +0200, Jarle Bjørgeengen wrote: Hi, I would like lineplot.CI and barplot.CI to actually plot confidence intervals instead of standard errors. I understand I have to use the ci.fun option, but I'm not quite sure how. Like this: qt(0.975, df = n - 1) * s / sqrt(n), but how can I apply it to visualize the length of Student's t confidence intervals rather than the standard error of the plotted means?

--
http://mutualism.williams.edu
Re: [R] help with replacing factors
Hi Mike and Gabor, thanks for the help. It seems I made a mistake in my original question. While Mike's solution worked on the example data I provided, I now see that my actual data gives

is(df100_lang$gray)
[1] "character"           "vector"              "data.frameRowLabels"

and the solution doesn't work. I am sorry for posting the wrong data. If you'd still like to help, this is the original data (thanks for the help and patience!):

df100_lang$gray <- structure(rep(c("gray0", "gray3", "gray7", "gray10", "gray14",
    "gray17", "gray21", "gray24", "gray28", "gray31", "gray35", "gray38",
    "gray42", "gray45", "gray48", "gray52", "gray55", "gray59", "gray62",
    "gray66", "gray69", "gray73", "gray76", "gray80", "gray83", "gray87",
    "gray90"), 10), .Label = character(0))

On Sun, May 24, 2009 at 2:12 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: Try storing them as character strings rather than factors: black_gray <- data.frame(black, gray, stringsAsFactors = FALSE) Try this to view what you've got: str(black_gray) [...]
Re: [R] sciplot question
Great, thanks Manuel. Just out of curiosity: any particular reason you chose the standard error, and not the confidence interval, as the default error indication (the naming of the plotting functions associates more closely with confidence intervals)?

- Jarle Bjørgeengen

On May 24, 2009, at 3:02, Manuel Morales wrote: You define your own function for the confidence intervals. The function needs to return the two values representing the upper and lower CI values. So:

qt.fun <- function(x) qt(p = .975, df = length(x) - 1) * sd(x) / sqrt(length(x))
my.ci <- function(x) c(mean(x) - qt.fun(x), mean(x) + qt.fun(x))
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun = my.ci)

[...]
Re: [R] using optimize() correctly ...
On 24-05-2009, at 14:24, Esmail wrote: [...] Do you know if R has a function to find the global maximum/minimum of a function of x over a given interval?

If you do

resopt <- optim(-5, f, method = "SANN", control = list(fnscale = -1))

you will get the global maximum. SANN: simulated annealing. But starting at -4 takes you to the local maximum. And the help for optim recommends optimize for one-dimensional maximization. As far as I know there is no general foolproof method for finding a global optimum except trying different initial points. No method can really replace one's own knowledge about the function.

Berend

btw. I am away during this week, so I won't be replying.
Re: [R] using optimize() correctly ...
Yes. Most classical optimization methods (e.g. gradient-type, Newton-type) are local, i.e. they do not attempt to locate the global optimum. The primary difficulty with global optimization is that there are no mathematical conditions that characterize the global optimum in multi-modal problems. For a local optimum, you have the first- and second-order Kuhn-Tucker conditions. A simplistic strategy to find the global optimum is to use local methods with multiple starting values. Again, the problem is that you don't have any guarantee that you have found the global optimum. The larger the number of starting values, the greater your chances of finding the global optimum. There are more principled strategies than the random multi-start approach, but even they are not guaranteed to work.

Ravi.

Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University
Ph. (410) 502-2619
email: rvarad...@jhmi.edu

- Original Message -
From: Esmail esmail...@gmail.com
Date: Sunday, May 24, 2009 8:27 am
Subject: Re: [R] using optimize() correctly ...
To: Berend Hasselman b...@xs4all.nl
Cc: r-help@r-project.org

[...] Do you know if R has a function to find the global maximum/minimum of a function of x over a given interval? [...]

Thanks,
Esmail
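The random multi-start strategy Ravi describes can be sketched in a few lines. The test function and the number of starting points below are arbitrary choices for illustration:

```r
# An illustrative multimodal function on [0, 10]; its global maximum
# is near x = 1.47, with a smaller local maximum near x = 7.75
f  <- function(x) sin(x) * exp(-0.1 * x)
nf <- function(x) -f(x)  # optim() minimizes, so negate to maximize

# Run a local optimizer from several random starting points and keep the best
set.seed(42)
starts <- runif(20, min = 0, max = 10)
fits <- lapply(starts, function(s)
  optim(s, nf, method = "L-BFGS-B", lower = 0, upper = 10))
best <- fits[[which.min(sapply(fits, `[[`, "value"))]]

best$par     # location of the best maximum found (about 1.47)
-best$value  # the corresponding value of f
```

With enough starts scattered over the interval, at least one lands in the basin of the global maximum, but as Ravi notes there is no guarantee in general.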
Re: [R] sciplot question
Jarle Bjørgeengen wrote: Great, thanks Manuel. Just for curiosity, any particular reason you chose standard error, and not confidence interval, as the default error indication? [...]

On May 24, 2009, at 3:02, Manuel Morales wrote: You define your own function for the confidence intervals. The function needs to return the two values representing the upper and lower CI values. So:

qt.fun <- function(x) qt(p = .975, df = length(x) - 1) * sd(x) / sqrt(length(x))
my.ci <- function(x) c(mean(x) - qt.fun(x), mean(x) + qt.fun(x))

Minor improvement:

mean(x) + qt.fun(x) * c(-1, 1)

but in general confidence limits should be asymmetric (a la bootstrap). I'm not sure how NAs are handled.

Frank

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
Re: [R] A question on type=h plot lines
Create your own using 'segments'.

On Sun, May 24, 2009 at 8:08 AM, Martin Ivanov tra...@abv.bg wrote: Dear R users, I need to produce a plot with a single panel and a few lines on it. Each line represents a different data set. The line type must be "h", i.e. histogram-like (or high-density) vertical lines. The problem is that the vertical lines comprising a plot line of type = "h" are drawn from the y = 0 line to the (x, y) locations. What I need is vertical lines drawn from a line y = value that I specify to the (x, y) locations. Is there a way for me to achieve that? [...]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
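A sketch of Jim's suggestion with made-up data: draw the vertical lines yourself with segments(), from a baseline of your choosing instead of the y = 0 that type = "h" uses:

```r
set.seed(1)
x <- 1:20
y <- rnorm(20, mean = 5)
base <- 3  # the y = value baseline (instead of the y = 0 used by type = "h")

plot(x, y, type = "n", ylim = range(c(y, base)))
segments(x0 = x, y0 = base, x1 = x, y1 = y)  # type = "h"-style vertical lines
abline(h = base, lty = 2)                    # show the baseline itself
points(x, y, pch = 16)
```

Repeating the segments() call with different data and a different baseline puts several such "h"-style series on the same panel.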
Re: [R] Cream Text Editor
Paul Heinrich Dietrich wrote: Thank you much for the help, I will work on this over the weekend. Is there a way in Windows to connect R and Cream? Perhaps, although I can't help... It would be necessary to write another plugin: https://stat.ethz.ch/pipermail/r-help/2009-May/197794.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sciplot question
On May 24, 2009, at 3:34 , Frank E Harrell Jr wrote: Jarle Bjørgeengen wrote: Great, thanks Manuel. Just for curiosity, any particular reason you chose standard error, and not confidence interval, as the default (the naming of the plotting functions associates closer to the confidence interval) error indication. - Jarle Bjørgeengen On May 24, 2009, at 3:02 , Manuel Morales wrote: You define your own function for the confidence intervals. The function needs to return the two values representing the upper and lower CI values. So: qt.fun <- function(x) qt(p=.975,df=length(x)-1)*sd(x)/sqrt(length(x)) my.ci <- function(x) c(mean(x)-qt.fun(x), mean(x)+qt.fun(x)) Minor improvement: mean(x) + qt.fun(x)*c(-1,1) but in general confidence limits should be asymmetric (a la bootstrap). Thanks, if the data is normally distributed, a symmetric confidence interval should be OK, right? When plotting the individual sample, it looks normally distributed. Best regards. Jarle Bjørgeengen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] accuracy of a neural net
You might want to use cross-validation or the bootstrap to get error estimates. Also, you should include the PCA step in the resampling since it does add noise to the model. Look at the pcaNNet and train functions in the caret package. Also your code for the nnet would imply that you are predicting a continuous outcome (i.e. linear function on the hidden units), so a confusion matrix wouldn't be appropriate. Max On Sun, May 24, 2009 at 7:28 AM, onyourmark william...@gmail.com wrote: Hi. I started with a file which was a sparse 982x923 matrix and where the last column was a variable to be predicted. I did principal component analysis on it and arrived at a new 982x923 matrix. Then I ran the code below to get a neural network using nnet and then wanted to get a confusion matrix or at least know how accurate the neural net was. I used the first 22 principal components only as the inputs for the neural net. I got a perfect prediction rate, which is somewhat suspect (I was using the same data for training and prediction, but I did not expect perfect prediction anyway). So I tried using only a sample of records to build the neural net. Even with this sample I got 980 out of 982 correct. Can anyone spot an error here? 
crs$dataset <- read.csv("file:///C:/dataForR/textsTweet1/cleanForPC.csv", na.strings=c(".", "NA", "", "?"))
crs$nnet <- nnet(Value ~ ., data=crs$dataset[,c(1:22,922)], size=10, linout=TRUE, skip=TRUE, trace=FALSE, maxit=1000)
targets <- crs$dataset[,922]
rawpredictions <- predict(crs$nnet, crs$dataset[, c(1:22)], type="raw")
roundedpredictions <- round(rawpredictions[,1], digits = 0)
trueAndPredicted <- cbind(roundedpredictions, targets)
howManyEqual <- trueAndPredicted[,1] == trueAndPredicted[,2]
sum(howManyEqual)
samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25))
samp <- c(sample(1:250,125), sample(251:500,125), sample(500:920,300))
crs$nnet <- nnet(Value ~ ., data=crs$dataset[samp,c(1:22,922)], size=10, linout=TRUE, skip=TRUE, trace=FALSE, maxit=1000)
-- Max __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
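[Editor's note] Max's point about resubstitution error can be illustrated with a simple hold-out split - a sketch with made-up data (pcaNNet/train in caret automate proper resampling and would also fold the PCA step into it):

```r
library(nnet)
set.seed(1)
n <- 200
x <- matrix(rnorm(n * 5), n, 5)
y <- factor(ifelse(x[, 1] + x[, 2] + rnorm(n) > 0, "A", "B"))
dat <- data.frame(x, y)

train <- sample(n, 140)                 # hold out 60 rows never seen in fitting
fit <- nnet(y ~ ., data = dat[train, ], size = 3, trace = FALSE, maxit = 500)
pred <- predict(fit, dat[-train, ], type = "class")
table(pred, dat$y[-train])              # honest confusion matrix on held-out data
```

Note that with a factor response nnet fits a classifier; the linout=TRUE used in the original post treats the outcome as continuous, which is why a confusion matrix was not appropriate there.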
Re: [R] build CONTENTS or 00Index.html without installing whole package
OK, one more for the records. This script is now written so that it uses Rscript instead of bash. The last line still does not work. I don't know what make.packages.html requires, but apparently it requires more than DESCRIPTION and 00Index.html in order to include a package. (The line about buildVignettes seems useless, so I commented it out.) The command line can include *. So I use this after download.packages(), with inst.R *.tar.gz. (inst.R is what I call the script.)
#!/usr/bin/Rscript --vanilla
# makes indexable help files for R packages, including pdf vignettes
# usage: inst.R [files]
FILES <- commandArgs(TRUE)
print(FILES)
for (PKG in FILES) {
  system(paste("tar xfz", PKG))
  PK <- strsplit(PKG, "_")[[1]][1]
  print(PK)
  system(paste("mkdir -pv /usr/lib/R/library/", PK, "/html", sep=""))
  # copy description (which contains version number)
  system(paste("cp ", PK, "/DESCRIPTION /usr/lib/R/library/", PK, sep=""))
  # move vignettes if present
  system(paste("cp -r ", PK, "/inst/doc /usr/lib/R/library/", PK, " 2> /dev/null", sep=""))
  # buildVignettes(PK, paste("/usr/lib/R/library/", PK, sep=""))
  # make html files
  system(paste("cd /usr/share/R/perl; ", "perl build-help.pl --html /home/baron/", PK, " /usr/lib/R/library/", "; cd /home/baron", sep=""))
  # make indices
  tools:::.writePkgIndices(PK, paste("/usr/lib/R/library/", PK, sep=""))
  system(paste("rm -rf", PK))
}
# try to build package list, doesn't work
# make.packages.html()
-- Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page: http://www.sas.upenn.edu/~baron __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Naming a random effect in lmer
Hi Bill, I'm about to take a look at this. If I understand the issue, very long expressions for what I call the grouping factor of a random-effects term (the expressions on the right-hand side of the vertical bar) are encountering problems with deparse. I should have realized that, any time one uses deparse, disaster looms. I can tell you the reason that the collection of random-effects terms is being named: partly for the printed form and partly so that terms with the same grouping factor can be associated. I guess my simplistic solution to the problem would be to precompute these sums and give them names. If it is the sum like Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 that is important, why not evaluate the sum testGroupSamp <- within(testGroupSamp, Z29 <- Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9) and use Z29 as the grouping factor? Even the use of variables with names like Z1, Z2, ... and the use of expressions like paste("Z", 2:9, sep = "") is not idiomatic R/S code. It's an SPSS/SASism. (You know I never realized before how close the word SASism, meaning a construction that is natural in SAS, is to Sadism.) Why not create a matrix Z and evaluate these sums as matrix/vector products? Zs2 <- paste("Z", 2:9, sep="") On Fri, May 22, 2009 at 5:30 PM, William Dunlap wdun...@tibco.com wrote: -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg Sent: Friday, May 22, 2009 3:01 PM To: Leigh Ann Starcevich Cc: r-help@r-project.org Subject: Re: [R] Naming a random effect in lmer [ ... elided statistical advice ... ] If you and your advisor still feel that what you are doing makes sense, I suggest you first get the source code via svn checkout svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading lme4_0.999375-30.tar.gz from http://cran.fhcrc.org/web/packages/lme4/index.html), then walk through the code line by line using the debug function (or browser or the debug package). 
From this, you will likely see either (a) how you can do what you want differently to achieve the same result or (b) how to modify the code so it does what you want. The coding error is right in the error message: Error in names(bars) <- unlist(lapply(bars, function(x) deparse(x[[3]]))) and I suspect that traceback() would tell you that came from a call to lmerFactorList. That code implicitly assumes that deparse() will produce a scalar character vector, but it doesn't if the input expression is complicated enough. Changing the deparse(x[[3]]) to deparse(x[[3]])[1] or paste(collapse=" ", deparse(x[[3]])) would fix it. The first truncates the name and the second may make a very long name. There is at least one other use of that idiom in the lme4 code, and your dataset and analysis may require that all of them be fixed. Hope this helps. Spencer Leigh Ann Starcevich wrote: Here is a test data set and code. I am including the data set after the code and discussion to make reading easier. Apologies for the size of the data set, but my problem occurs when there are a lot of Z variables. Thanks for your time. # Enter data below # Sample code
library(lme4)
mb <- length(unique(testsamp$WYear))
# Create the formula for the set of identically distributed random effects
Zs <- paste("Z", 2:(mb-1), sep="")
randommodel <- paste(Zs, collapse="+")
Trendformula <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|", randommodel, ")"))
fittest <- lmer(Trendformula, data = testsamp)
summary(fittest)
# Here I get an error because the name of the random effect is too long to print
# in the random effects output (I think).
# The error message is: Error in names(bars) <- unlist(lapply(bars, function(x)
# deparse(x[[3]]))) : 'names' attribute [3] must be the same length as the vector [2]
# However, when fewer Z variables are used in the random portion of the model,
# there is no error.
# Using only Z2 + ... 
+ Z9 for the random intercept
Zs2 <- paste("Z", 2:9, sep="")
randommodel2 <- paste(Zs2, collapse="+")
Trendformula2 <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|", randommodel2, ")"))
fittest2 <- lmer(Trendformula2, data = testsamp)
summary(fittest2)
# Is there a way to either name the set of iid random effects something else or
# to define a random variable that could be used in the model to create a
# random intercept?
# I have had some success in lme, but it would be helpful for my simulation if I
# could conduct this analysis with lmer. My model in lme is not correctly
# estimating one of the variance components (random Site intercept).
# I am using:
detach(package:lme4)
library(nlme)
random.model.lme <- as.formula(paste("~ -1 +", paste(paste("Z", 2:(mb-1), sep=""), collapse="+")))
n <- dim(testsamp)[1]
testsampgroup <- rep(1, n)
testsamp.lme <- cbind(testsamp, testsampgroup)
testgroupSamp <- groupedData(LogY ~ WYearCen|testsampgroup, inner= ~Site, data=
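[Editor's note] Doug Bates' suggestion - precompute the long sum once and use a short name as the grouping factor, which sidesteps the deparse problem entirely - can be sketched like this (a hypothetical fragment; testsamp is assumed to hold columns Z2..Z9 as in the thread):

```r
library(lme4)

# build the combined variable once, under a short name
testsamp <- within(testsamp, Z29 <- Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9)

# the formula now contains a short grouping-factor expression
fit <- lmer(LogY ~ WYear + (1 + WYear | Site) + (1 | Z29), data = testsamp)
```

Because the right-hand side of the bar is now a single name, deparse() returns a length-one character vector and the names(bars) assignment no longer fails.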
Re: [R] sciplot question
Jarle Bjørgeengen wrote: On May 24, 2009, at 3:34 , Frank E Harrell Jr wrote: Jarle Bjørgeengen wrote: Great, thanks Manuel. Just for curiosity, any particular reason you chose standard error, and not confidence interval, as the default (the naming of the plotting functions associates closer to the confidence interval) error indication. - Jarle Bjørgeengen On May 24, 2009, at 3:02 , Manuel Morales wrote: You define your own function for the confidence intervals. The function needs to return the two values representing the upper and lower CI values. So: qt.fun <- function(x) qt(p=.975,df=length(x)-1)*sd(x)/sqrt(length(x)) my.ci <- function(x) c(mean(x)-qt.fun(x), mean(x)+qt.fun(x)) Minor improvement: mean(x) + qt.fun(x)*c(-1,1) but in general confidence limits should be asymmetric (a la bootstrap). Thanks, if the data is normally distributed, a symmetric confidence interval should be OK, right? Yes; I do see a normal distribution about once every 10 years. When plotting the individual sample, it looks normally distributed. An appropriate qqnorm plot is a better way to check, but often the data cannot tell you about the normality of themselves. It's usually better to use methods (e.g., bootstrap) that do not assume normality and that provide skewed confidence intervals if the data are skewed. Frank Best regards. Jarle Bjørgeengen -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
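[Editor's note] Frank's two suggestions - check with qqnorm() rather than by eye, and prefer a bootstrap percentile interval that is allowed to be asymmetric - in a small sketch with deliberately skewed (exponential) data:

```r
set.seed(2)
x <- rexp(50)                           # skewed sample: a symmetric t-interval misleads

qqnorm(x); qqline(x)                    # points bend away from the line for skewed data

B <- 2000                               # bootstrap percentile interval for the mean
boot.means <- replicate(B, mean(sample(x, replace = TRUE)))
quantile(boot.means, c(0.025, 0.975))   # typically asymmetric around mean(x)
```

The boot package's boot() and boot.ci() provide more refined intervals (BCa, studentized) than this bare percentile version.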
Re: [R] sciplot question
Dear Frank, et al.: Frank E Harrell Jr wrote: snip Yes; I do see a normal distribution about once every 10 years. To what do you attribute the nonnormality you see in most cases? (1) Unmodeled components of variance that can generate errors in interpretation if ignored, even with bootstrapping? (2) Honest outliers that do not relate to the phenomena of interest and would better be removed through improved checks on data quality, but where bootstrapping is appropriate (provided the data are not also contaminated with (1))? (3) Situations where the physical application dictates a different distribution such as binomial, lognormal, gamma, etc., possibly also contaminated with (1) and (2)? I've fit mixtures of normals to data before, but one needs to be careful about not carrying that to extremes, as the mixture may be a result of (1) and therefore not replicable. George Box once remarked that he thought most designed experiments included split plotting that had been ignored in the analysis. That is only a special case of (1). Thanks, Spencer Graves __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sciplot question
spencerg wrote: Dear Frank, et al.: Frank E Harrell Jr wrote: snip Yes; I do see a normal distribution about once every 10 years. To what do you attribute the nonnormality you see in most cases? (1) Unmodeled components of variance that can generate errors in interpretation if ignored, even with bootstrapping? (2) Honest outliers that do not relate to the phenomena of interest and would better be removed through improved checks on data quality, but where bootstrapping is appropriate (provided the data are not also contaminated with (1))? (3) Situations where the physical application dictates a different distribution such as binomial, lognormal, gamma, etc., possibly also contaminated with (1) and (2)? I've fit mixtures of normals to data before, but one needs to be careful about not carrying that to extremes, as the mixture may be a result of (1) and therefore not replicable. George Box once remarked that he thought most designed experiments included split plotting that had been ignored in the analysis. That is only a special case of (1). Thanks, Spencer Graves Spencer, Those are all important reasons for non-normality of margin distributions. But the biggest reason of all is that the underlying process did not know about the normal distribution. Normality in raw data is usually an accident. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Assigning variable names from one object to another object
Hello I have 2 datasets, say Data1 and Data2; both are of different dimensions. Data1: 120 rows and 6 columns (Varname, Vartype, Labels, Description, ) The column Varname has 120 rows which has variable names such as id, age, gender, and so on Data2: 12528 rows and 120 columns The column names in this case are V1, V2, . V120 (which are the default names in R when we say header=FALSE in read.csv) I want to assign the variable names from Data1 to Data2 as the column headings in Data2, i.e. V1 should be id, V2 should be age, and so on. Is it possible to do in R? I tried assigning variable names from Data1 in one object and transposing them and then used rbind but it does not work. Can I use colnames? I could not apply it in this case. Can anyone tell me how I can apply it for this case? Or should I paste the column names into the csv file (from where I have imported the data) and then import in R? Thank you in advance Regards Sunita -- View this message in context: http://www.nabble.com/Assigning-variable-names-from-one-object-to-another-object-tp23695359p23695359.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] XML parse error
Um, this isn't an XML file. An XML file should look something like this: ?xml version=1.0 encoding=utf-8 ? Regards, Richie. Mathematical Sciences Unit HSL It is for sure little complicated then a plain XML file. The format of binary file is according to XML schema. I have been able to get C parser going to get information from binary with one caveat - I have to manually read the XML schema and figure out which byte means what in binary and then code it in C. I do not know if it will be of interest or worthwhile to incorporate capability in R to interpret schema and decode binary file based on schema interpretation. Or may be this already exists and it is just that I do not know how to do it. Anyway thanks for looking into it. - Kulwinder Banipal To: kbani...@hotmail.com CC: dun...@wald.ucdavis.edu; r-help@r-project.org; r-help-boun...@r-project.org Subject: Re: [R] XML parse error From: richard.cot...@hsl.gov.uk Date: Thu, 21 May 2009 11:05:02 +0100 I am trying to parse XML file ( binary hex) but get an error. Code I am using is: xsd = xmlTreeParse(system.file(exampleData, norel.xsd, package = XML), isSchema =TRUE) doc = xmlInternalTreeParse(system. file(exampleData, LogCallSummary.bin, package = XML)) Start tag expected, '' not found xmlParse command results in same error as well: f = system.file(exampleData, LogCallSummary.bin, package = XML) doc = xmlParse(f)Start tag expected, '' not found I am at beginner level with XML and will appreciate any help with this error or general guidance. Thanks Kulwinder Banipal file is: 000: 0281 0001 0201 0098 c1d5 c000 010: 000a c0a8 db35 0055 6000 00af 0001 0001 .5.U`... 020: 5f00 2200 4530 4411 2233 4455 0f08 _..E0..D.3DU.. 030: 0123 4567 8901 2340 04d2 .#eg...@ 040: 0002 0100 0001 0003 0303 050: 0100 6400 0100 d... 060: 6401 0300 0900 00fe fe00 012f 0001 d../ 070: 0101 0001 0001 2200 0033 .. 3080: 3306 0022 1100 3...33. 090: 0033 3400 2300 0011 0001 3335 .34.#. 350a0: 0024 1100 0200 0033 3600 2500 .$.36.%. 
0b0: 0011 0003 3337 0026 1100 37. 0c0: 0400 0033 3800 2700 0011 0005 .38.'... 0d0: 5504 7700 8800 0044 4406 2323 0099 U.wDD...##.. 0e0: 0100 0200 0023 2400 9901 0002 .#$. 0f0: 0001 2325 0099 0100 0200 ..#% 100: 0200 0023 2600 9901 0002 0003 ...#... 110: 2327 0099 0100 0200 0400 0023 2800 #'...#(. 120: 9901 0002 0005 0102 0008 0100 0066 ... f130: 6600 0055 5533 3400 0a35 f..UU34 5140: 0014 3600 1e37 0028 3800 67...(8. 150: 3239 003c 3a00 463b ..29...:...F;.. 160: 0050 3c00 5a00 0088 8800 0077 7744 .P...Z.. wwD170: 4500 0a46 0014 4700 EF G.180: 1e48 0028 4900 324a ...H...(I...2J.. 190: 003c 4b00 464c 0050 4d00 .K...FL...PM... 1a0: 5a02 2207 7766 6604 0500 1100 0088 Z..wff. 1b0: 8800 0106 0011 8889 1c0: 0011 0700 1100 0088 8a00 2108 ..!. 1d0: 0011 888b 0031 0405 ...1 1e0: 0022 0044 0001 0600 2200 D. 1f0: 4500 1107 0022 0046 ..E... F200: 0021 0800 2200 4700 ...!...G... 210: 3106 0001 0002 0003 0004 0005 0200 1... 220: 0033 4400 0055 6609 0101 0202 0303 0404 .3D..Uf. 230: 0505 0606 0707 0808 0909 0405 0011 240: 0044 0022 0088 0500 ...D... 250: 1200 4500 2300 8905 E...#... 260: 0013 0046 0024 008a 0500 .F...$.. 270: 1400 4700 2500 8bfa ..G...%.280: ae Um, this isn't an XML file. An XML file should look something like this: ?xml version=1.0 encoding=utf-8 ? tag subtagvalue/subtag /tag The wikipedia entry on XML gives a reasonable intro to the format. http://en.wikipedia.org/wiki/Xml Regards, Richie. Mathematical Sciences Unit HSL ATTENTION: This message contains privileged and confidential info...{{dropped:26}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
Re: [R] Assigning variable names from one object to another object
Hello Duncan Thank you so much, it worked. I think I was doing it in a more complicated way, so I didn't get a solution. Thank you very much once again Regards Sunita On Sun, May 24, 2009 at 9:59 PM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 24/05/2009 12:21 PM, Sunita22 wrote: Hello I have 2 datasets say Data1 and Data2 both are of different dimesions. Data1: 120 rows and 6 columns (Varname, Vartype, Labels, Description, ) The column Varname has 120 rows which has variable names such id, age, gender,.so on Data2: 12528 rows and 120 columns The column names in this case are V1, V2, . V120 (which are default names in R when we say head=F in read.csv) I want to assign the variable names from Data1 to Data2 as the column headings in Data2 i.e V1 should be id, V2 should be age, .. so on Is it possible to do in R? I tired assigning variable names from Data1 in one object and transposing them and then used rbind but it doesnot work. Can I use colnames? I could not apply it in this case. Can any1 tell me how can i apply it for this case? names or colnames should work: colnames(Data2) <- Data1$Varname or names(Data2) <- Data1$Varname or should I paste the column names in csv file (from where I have imported the data) and then import in R? That's another way that would work. Duncan Murdoch Thank you in advance Regards Sunita -- Our Thoughts have the Power to Change our Destiny. Sunita __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Assigning variable names from one object to another object
On 24/05/2009 12:21 PM, Sunita22 wrote: Hello I have 2 datasets say Data1 and Data2 both are of different dimesions. Data1: 120 rows and 6 columns (Varname, Vartype, Labels, Description, ) The column Varname has 120 rows which has variable names such id, age, gender,.so on Data2: 12528 rows and 120 columns The column names in this case are V1, V2, . V120 (which are default names in R when we say head=F in read.csv) I want to assign the variable names from Data1 to Data2 as the column headings in Data2 i.e V1 should be id, V2 should be age, .. so on Is it possible to do in R? I tired assigning variable names from Data1 in one object and transposing them and then used rbind but it doesnot work. Can I use colnames? I could not apply it in this case. Can any1 tell me how can i apply it for this case? names or colnames should work: colnames(Data2) <- Data1$Varname or names(Data2) <- Data1$Varname or should I paste the column names in csv file (from where I have imported the data) and then import in R? That's another way that would work. Duncan Murdoch Thank you in advance Regards Sunita __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
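[Editor's note] A self-contained toy version of Duncan's answer, with two small made-up frames standing in for Data1 and Data2:

```r
# Data1 holds the metadata, including the intended column names
Data1 <- data.frame(Varname = c("id", "age", "gender"),
                    Vartype = c("int", "int", "chr"))

# Data2 was read with header=FALSE, so it has default names V1, V2, V3
Data2 <- data.frame(V1 = 1:3, V2 = c(21, 34, 28), V3 = c("F", "M", "F"))

colnames(Data2) <- Data1$Varname  # names(Data2) <- Data1$Varname works too
names(Data2)                      # "id" "age" "gender"
```

The assignment works because a data frame's column names are just a character vector; the only requirement is that Data1 has exactly one Varname entry per column of Data2.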
Re: [R] XML parse error
On Sun, May 24, 2009 at 12:28 PM, kulwinder banipal kbani...@hotmail.comwrote: It is for sure little complicated then a plain XML file. The format of binary file is according to XML schema. I have been able to get C parser going to get information from binary with one caveat - I have to manually read the XML schema and figure out which byte means what in binary and then code it in C. There are many ways of encoding XML in a compact binary form (cf. http://en.wikipedia.org/wiki/Binary_XML), none widely accepted yet. The XML schema does not specify the binary form. -s [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Deleting columns from a matrix
useR's, I have a matrix given by the code: mat <- matrix(c(rep(NA,10),1,2,3,4,5,6,7,8,9,10,10,9,8,NA,6,5,4,NA,2,1,rep(NA,10),1,2,3,4,NA,6,7,8,9,10),10,5) This is a 10x5 matrix containing missing values. All columns except the second contain missing values. I want to delete all columns that contain ALL missing values, and in this case, it would be the first and fourth columns. Any column that has at least one real number would remain. I know I can use mat[,-1] to delete the first column, but I have a much larger matrix where it is impossible to tell how many columns contain all missing values and which don't. Is there a function or something else that may be able to help me accomplish this? Thanks in advance. dxc13 -- View this message in context: http://www.nabble.com/Deleting-columns-from-a-matrix-tp23695656p23695656.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Animal Morphology: Deriving Classification Equation with Linear Discriminat Analysis (lda)
Fellow R Users: I'm not extremely familiar with lda or R programming, but a recent editorial review of a manuscript submission has prompted a crash course. I am on this forum hoping I could solicit some much-needed advice for deriving a classification equation. I have used three basic measurements in lda to predict two groups: male and female. I have a working model, low Wilks' lambda, graphs, coefficients, eigenvalues, etc. (see below). I adjusted the sample analysis for Fisher's or Anderson's Iris data provided in the MASS library for my own data. My final step is simply to form the classification equation. The classification equation simply uses standardized coefficients to classify each group - in this case male or female. A more thorough explanation is provided: For cases with an equal sample size for each group the classification function coefficient (Cj) is expressed by the following equation: Cj = cj0 + cj1*x1 + cj2*x2 + ... + cjp*xp where Cj is the score for the jth group, j = 1 ... k, cj0 is the constant for the jth group, and x = raw scores of each predictor. If W = within-group variance-covariance matrix, and M = column matrix of means for group j, then the constant cj0 = (-1/2)CjMj (Julia Barfield, John Poulsen, and Aaron French http://userwww.sfsu.edu/~efc/classes/biol710/discrim/discriminant.htm). I am unable to navigate this last step based on the R output I have. I only have the linear discriminant coefficients for each predictor that would be needed to complete this equation. Please, if anybody is familiar or able to help, please let me know. There is a spot in the acknowledgments for you. 
All the best, Chase Mendenhall Below is the R Commander Windows http://www.nabble.com/file/p23693355/LDA-WRMA.csv LDA-WRMA.csv : Script Window: #Dataset workWRMA = read.csv(C:\\Users\\Chase\\Documents\\Interpubic Distance\\LDA\\LDA-WRMA.csv) workWRMA #Linear Discriminant Function model-lda(WRMA_SEX~WRMA_WG+WRMA_WT+WRMA_ID,data=workWRMA) model plot(model) predict(model) #Wilk's Lambda FYI:(Sqrt(1- Wilks’ lamda)=canonical correlation) X-as.matrix(workWRMA[-4]) Y-workWRMA$WRMA_SEX workWRMA.manova-manova(X~Y) workWRMA.wilks-summary(workWRMA.manova,test=Wilks) workWRMA.wilks #Group Centroids sum(LD1*(workWRMA$WRMA_SEX==F))/sum(workWRMA$WRMA_SEX==F) sum(LD1*(workWRMA$WRMA_SEX==M))/sum(workWRMA$WRMA_SEX==M) #Eigenvalue/Canonical Correlation model$svd Output Window: #Dataset workWRMA = read.csv(C:\\Users\\Chase\\Documents\\Interpubic Distance\\LDA\\LDA-WRMA.csv) workWRMA WRMA_WG WRMA_WT WRMA_ID WRMA_SEX 1 57.5 11.98 5.30F 2 60.5 12.25 8.10F 3 59.1 13.28 6.65F 4 61.0 12.20 7.30F 5 59.42857 13.042857 7.70F 6 59.2 13.34 10.20F 7 60.2 12.60 5.00F 8 61.0 13.25 8.00F 9 59.7 12.16 5.30F 10 59.0 12.425000 8.00F 11 59.71429 12.33 6.00F 12 60.16667 12.50 4.40F 13 60.2 13.70 7.60F 14 61.0 12.10 6.90F 15 57.9 12.10 6.55F 16 58.4 12.74 6.00F 17 60.5 12.90 7.50F 18 60.0 13.85 9.40F 19 56.5 12.60 5.60F 20 59.0 11.70 6.50F 21 60.0 12.80 9.10F 22 59.0 12.00 6.30F 23 56.0 11.90 4.00F 24 60.0 11.80 6.20F 25 60.0 13.15 3.80F 26 61.0 12.30 6.10F 27 61.25000 12.05 4.00F 28 57.0 11.95 5.70F 29 59.0 12.70 5.20F 30 58.5 13.35 4.05F 31 54.0 12.00 5.10F 32 60.0 12.00 4.10F 33 58.0 12.30 6.50F 34 57.0 12.60 4.60F 35 57.0 11.50 6.30F 36 60.0 12.20 6.70F 37 60.6 12.525000 8.77F 38 58.3 13.53 11.50F 39 59.25000 12.575000 5.70F 40 61.12500 12.73 7.00F 41 60.0 12.20 5.90F 42 60.0 11.85 4.90F 43 59.25000 12.60 8.10F 44 60.0 12.30 7.90F 45 61.0 12.40 4.50F 46 57.0 12.60 5.80F 47 59.0 12.90 6.40F 48 57.0 12.00 7.15F 49 55.0 12.70 5.20F 50 61.0 12.20 5.70F 51 60.0 11.70 7.00F 52 59.0 14.40 5.00F 53 60.5 13.00 
6.30F 54 60.0 12.80 8.00F 55 56.5 12.10 6.30F 56 56.3 12.80 3.45F 57 59.0 11.85 4.00F 58 57.0
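[Editor's note] MASS::lda does not return the classification-function coefficients described above (Cj = cj0 + cj1*x1 + ... + cjp*xp), but they can be computed from the pooled within-group covariance W and the group means M. A sketch on two groups of the iris data (standing in for the male/female case; it assumes equal priors, as the quoted formula does):

```r
library(MASS)

d <- subset(iris, Species != "setosa", select = c(Sepal.Length, Sepal.Width, Species))
d$Species <- droplevels(d$Species)
X <- as.matrix(d[, 1:2])
g <- d$Species

# pooled within-group covariance W and group-mean matrix M (one row per group)
W <- Reduce(`+`, lapply(levels(g), function(l) {
  Xl <- X[g == l, , drop = FALSE]
  (nrow(Xl) - 1) * cov(Xl)
})) / (nrow(X) - nlevels(g))
M <- t(sapply(levels(g), function(l) colMeans(X[g == l, , drop = FALSE])))

C  <- M %*% solve(W)             # classification coefficients c_j1 .. c_jp
c0 <- -0.5 * rowSums(C * M)      # constants c_j0 = -(1/2) C_j M_j

scores <- sweep(X %*% t(C), 2, c0, "+")  # one score column per group
pred   <- levels(g)[max.col(scores)]     # classify to the highest score
mean(pred == g)                          # resubstitution accuracy
```

Each row of C together with its c0 entry is one classification equation; an observation is assigned to the group whose equation gives the largest score.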
[R] how to implement a circular buffer with R
Some wavelet analysis experts have implemented periodic boundary conditions for signals. I need to implement a circular buffer. Something like: 12345abcdefgh12345abcdefgh so that at each step the rightmost element is moved to the leftmost index and everything else is properly shifted: h12345abcdefgh12345abcdefg, gh12345abcdefgh12345abcdef, My implementation (still debugging) seems to start working but is terribly clumsy. I am sure that some expert can suggest a more elegant solution. Thank you. Maura __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Deleting columns from a matrix
one way is:
mat <- matrix(c(rep(NA,10),1,2,3,4,5,6,7,8,9,10,10,9,8,NA,6,5,4,NA,2,1,rep(NA,10),1,2,3,4,NA,6,7,8,9,10), 10, 5)
ind <- colSums(is.na(mat)) != nrow(mat)
mat[, ind]
I hope it helps. Best, Dimitris dxc13 wrote: useR's, I have a matrix given by the code: mat <- matrix(c(rep(NA,10),1,2,3,4,5,6,7,8,9,10,10,9,8,NA,6,5,4,NA,2,1,rep(NA,10),1,2,3,4,NA,6,7,8,9,10),10,5) This is a 10x5 matrix containing missing values. All columns except the second contain missing values. I want to delete all columns that contain ALL missing values, and in this case, it would be the first and fourth columns. Any column that has at least one real number would remain. I know I can use mat[,-1] to delete the first column, but I have a much larger matrix where it is impossible to tell how many columns contain all missing values and which don't. Is there a function or something else that may be able to help me accomplish this? Thanks in advance. dxc13 -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] subset dataframe by number of rows of equal values
Hi R helpers! I have the following dataframe «choose»:
choose <- data.frame(firm=c(1,1,2,2,2,2,3,3,4,4,4,4,4,4), year=c(2000,2001,2000,2001,2002,2003,2000,2003,2001,2002,2003,2004,2005,2006), code=c(10,10,11,11,11,11,12,12,13,13,13,13,13,13))
choose
I want to subset it to obtain another one with those observations for which there are more than 2 observations in the column «code». So I want a dataframe «chosen» like this:
chosen <- data.frame(firm=c(2,2,2,2,4,4,4,4,4,4), year=c(2000,2001,2002,2003,2001,2002,2003,2004,2005,2006), code=c(11,11,11,11,13,13,13,13,13,13))
chosen
I've tried split() and then nrow() but I got nothing. Could anyone help me with this? Thanks Cecília (Universidade de Aveiro, Portugal) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
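[Editor's note] One way to get «chosen» without split()/nrow(): count rows per code with table() and keep the codes that appear more than twice:

```r
choose <- data.frame(
  firm = c(1,1,2,2,2,2,3,3,4,4,4,4,4,4),
  year = c(2000,2001,2000,2001,2002,2003,2000,2003,2001,2002,2003,2004,2005,2006),
  code = c(10,10,11,11,11,11,12,12,13,13,13,13,13,13))

tab <- table(choose$code)                              # rows per code
chosen <- choose[choose$code %in% names(tab)[tab > 2], ]
chosen                                                 # only codes 11 and 13 remain
```

An equivalent one-liner uses ave(): choose[ave(choose$code, choose$code, FUN = length) > 2, ].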
Re: [R] Deleting columns from a matrix
Dear dxc13,

Here is another way:

index <- apply(mat, 2, function(x) !all(is.na(x)))
mat[, index]

HTH, Jorge

On Sun, May 24, 2009 at 12:53 PM, dxc13 dx...@health.state.ny.us wrote: useR's, I have a matrix given by the code:

mat <- matrix(c(rep(NA,10), 1,2,3,4,5,6,7,8,9,10, 10,9,8,NA,6,5,4,NA,2,1,
                rep(NA,10), 1,2,3,4,NA,6,7,8,9,10), 10, 5)

This is a 10x5 matrix containing missing values. All columns except the second contain missing values. I want to delete all columns that contain ALL missing values; in this case, that is the first and fourth columns. Any column that has at least one real number should remain. I know I can use mat[,-1] to delete the first column, but I have a much larger matrix where it is impossible to tell which columns contain all missing values and which don't. Is there a function or something else that may be able to help me accomplish this? Thanks in advance. dxc13
Re: [R] how to implement a circular buffer with R
Hi Maura,

It is not elegant but may work.

actual.string <- "12345abcdefgh12345abcdefgh"
actual.string
actual.string <- paste(substr(actual.string, nchar(actual.string), nchar(actual.string)),
                       substr(actual.string, 1, nchar(actual.string) - 1), sep="")
actual.string

# in a loop
actual.string <- "12345abcdefgh12345abcdefgh"
number.buffers <- 10
my.buffers <- actual.string
for (i in 1:number.buffers) {
  actual.string <- paste(substr(actual.string, nchar(actual.string), nchar(actual.string)),
                         substr(actual.string, 1, nchar(actual.string) - 1), sep="")
  my.buffers <- c(my.buffers, actual.string)
}
my.buffers

Ciao, milton
brazil=toronto

On Sun, May 24, 2009 at 1:09 PM, mau...@alice.it wrote: Some wavelet analysis experts have implemented periodic boundary conditions for signals. I need to implement a circular buffer. Something like 12345abcdefgh12345abcdefgh, so that at each step the rightmost element is moved to the leftmost index and everything else is properly shifted: h12345abcdefgh12345abcdefg, gh12345abcdefgh12345abcdef, ... My implementation (still debugging) seems to start working but is terribly clumsy. I am sure that some expert can suggest a more elegant solution. Thank you. Maura
[R] filling area under a function line
Hi R collective,

I quite like the curve() function because you can chuck an R function into it and see the graph in one line of R. I had a google and found some threads on filling under the line: http://tolstoy.newcastle.edu.au/R/e2/help/07/09/25457.html However they seem to miss the point of the simplicity of going "I wonder what that looks like, and can I have some colour with that please" ;-)

So I found the easiest way to do that was to attach a polygon() call onto the end of the curve() function definition:

    else plot(x, y, type = type, ylab = ylab, xlim = xlim, log = lg, ...)
    polygon(c(from, x, to),
            c(min(y), y, min(y)),
            col = areacol, lty = 0)
}

areacurve(sin(2*pi*6*x + pi/2), areacol = "red")

and stick that in my R init file. Now the question is, am I missing something elementary here?

Cheers, T

areacurve <- function(expr, from = NULL, to = NULL, n = 101, add = FALSE,
    type = "l", ylab = NULL, log = NULL, xlim = NULL, areacol = "gray", ...)
{
    sexpr <- substitute(expr)
    if (is.name(sexpr)) {
        fcall <- paste(sexpr, "(x)")
        expr <- parse(text = fcall)
        if (is.null(ylab)) ylab <- fcall
    } else {
        if (!(is.call(sexpr) && match("x", all.vars(sexpr), nomatch = 0L)))
            stop("'expr' must be a function or an expression containing 'x'")
        expr <- sexpr
        if (is.null(ylab)) ylab <- deparse(sexpr)
    }
    if (is.null(xlim))
        delayedAssign("lims", {
            pu <- par("usr")[1L:2]
            if (par("xaxs") == "r") pu <- extendrange(pu, f = -1/27)
            if (par("xlog")) 10^pu else pu
        })
    else lims <- xlim
    if (is.null(from)) from <- lims[1L]
    if (is.null(to)) to <- lims[2L]
    lg <- if (length(log)) log else
        paste(if (add && par("xlog")) "x", if (add && par("ylog")) "y", sep = "")
    if (length(lg) == 0) lg <- ""
    x <- if (lg != "" && "x" %in% strsplit(lg, NULL)[[1L]]) {
        if (any(c(from, to) <= 0))
            stop("'from' and 'to' must be > 0 with log=\"x\"")
        exp(seq.int(log(from), log(to), length.out = n))
    } else seq.int(from, to, length.out = n)
    y <- eval(expr, envir = list(x = x), enclos = parent.frame())
    if (add) {
        lines(x, y, type = type, ...)
    } else plot(x, y, type = type, ylab = ylab, xlim = xlim, log = lg, ...)
    polygon(c(from, x, to),
            c(min(y), y, min(y)),
            col = areacol, lty = 0)
}
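A lighter-weight sketch of the same idea, avoiding a copy of curve()'s internals altogether: compute the x/y grid yourself and close the polygon along the baseline. The function and interval here are just the example from the post:

```r
# Fill the area under sin(2*pi*6*x + pi/2) on [0, 1] without touching curve()
x <- seq(0, 1, length.out = 101)
y <- sin(2*pi*6*x + pi/2)

plot(x, y, type = "l")
polygon(c(x[1], x, x[length(x)]),  # close the path along the baseline
        c(min(y), y, min(y)),
        col = "gray", border = NA)
```

This loses curve()'s convenience of auto-deparsing the expression for the y-axis label, which is exactly the convenience the areacurve() wrapper tries to keep.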
Re: [R] filling area under a function line
On Sun, 2009-05-24 at 19:32 +0100, Tom H wrote:

polygon(c(from, x, to), c(min(y), y, min(y)), col = areacol, lty = 0)

I guess my question should have been: I don't seem to be able to query or reflect a plot object to get at its attributes to fill in the polygon call, without actually modifying the curve/plot function. E.g.

myplot <- curve(sin)

returns NULL.

Tom
Re: [R] how to implement a circular buffer with R
Still not elegant, but I would split the string first:

spl.str <- unlist(strsplit("12345abcdefgh12345abcdefgh", ""))

Measure its length:

len.str <- length(spl.str)

Shift it:

spl.str <- c(spl.str[len.str], spl.str[seq(len.str - 1)])

Then paste it back together:

paste(spl.str, collapse="")  # "h12345abcdefgh12345abcdefg"

Shift it again (same command):

spl.str <- c(spl.str[len.str], spl.str[seq(len.str - 1)])

Paste it again (same command):

paste(spl.str, collapse="")  # "gh12345abcdefgh12345abcdef"

And so on. Hth, Adrian

milton ruser wrote: Hi Maura, It is not elegant but may work.

actual.string <- "12345abcdefgh12345abcdefgh"
actual.string <- paste(substr(actual.string, nchar(actual.string), nchar(actual.string)),
                       substr(actual.string, 1, nchar(actual.string) - 1), sep="")
actual.string

# in a loop
actual.string <- "12345abcdefgh12345abcdefgh"
number.buffers <- 10
my.buffers <- actual.string
for (i in 1:number.buffers) {
  actual.string <- paste(substr(actual.string, nchar(actual.string), nchar(actual.string)),
                         substr(actual.string, 1, nchar(actual.string) - 1), sep="")
  my.buffers <- c(my.buffers, actual.string)
}
my.buffers

Ciao, milton

On Sun, May 24, 2009 at 1:09 PM, mau...@alice.it wrote: Some wavelet analysis experts have implemented periodic boundary conditions for signals. I need to implement a circular buffer. Something like 12345abcdefgh12345abcdefgh, so that at each step the rightmost element is moved to the leftmost index and everything else is properly shifted: h12345abcdefgh12345abcdefg, gh12345abcdefgh12345abcdef, ... My implementation (still debugging) seems to start working but is terribly clumsy. I am sure that some expert can suggest a more elegant solution. Thank you. Maura
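Both replies above can be folded into a small helper; rotate_right() is a hypothetical name, not anything from this thread, and the modular arithmetic makes a k-step shift agree with k repeated single shifts:

```r
# Right-rotate a string by k positions (k defaults to 1)
rotate_right <- function(s, k = 1) {
  ch <- strsplit(s, "")[[1]]   # split into single characters
  n  <- length(ch)
  k  <- k %% n                 # rotating by n is a no-op
  paste(c(tail(ch, k), head(ch, n - k)), collapse = "")
}

rotate_right("12345abcdefgh12345abcdefgh")     # "h12345abcdefgh12345abcdefg"
rotate_right("12345abcdefgh12345abcdefgh", 2)  # "gh12345abcdefgh12345abcdef"
```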
Re: [R] Animal Morphology: Deriving Classification Equation with
[Your data and output listings removed. For comments, see at end]

On 24-May-09 13:01:26, cdm wrote: Fellow R Users: I'm not extremely familiar with lda or R programming, but a recent editorial review of a manuscript submission has prompted a crash course. I am on this forum hoping I could solicit some much-needed advice for deriving a classification equation. I have used three basic measurements in lda to predict two groups: male and female. I have a working model, low Wilks' lambda, graphs, coefficients, eigenvalues, etc. (see below). I adapted the sample analysis of Fisher's/Anderson's iris data provided in the MASS library for my own data. My final step is simply to form the classification equation. The classification equation simply uses standardized coefficients to classify each group, in this case male or female. A more thorough explanation: for cases with an equal sample size for each group, the classification function coefficient (Cj) is expressed by the following equation:

Cj = cj0 + cj1*x1 + cj2*x2 + ... + cjp*xp

where Cj is the score for the jth group, j = 1, ..., k, cj0 is the constant for the jth group, and x = raw scores of each predictor. If W = within-group variance-covariance matrix and Mj = column matrix of means for group j, then the constant cj0 = (-1/2)CjMj (Julia Barfield, John Poulsen, and Aaron French http://userwww.sfsu.edu/~efc/classes/biol710/discrim/discriminant.htm). I am unable to navigate this last step based on the R output I have. I only have the linear discriminant coefficients for each predictor that would be needed to complete this equation. Please, if anybody is familiar or able to help, let me know. There is a spot in the acknowledgments for you. All the best, Chase Mendenhall

The first thing I did was to plot your data.
This indicates in the first place that a perfect discrimination can be obtained on the basis of your variables WRMA_WT and WRMA_ID alone (names abbreviated to WG, WT, ID, SEX):

D0 <- read.csv("horsesLDA.csv")
# names(D0): WRMA_WG WRMA_WT WRMA_ID WRMA_SEX
WG <- D0$WRMA_WG; WT <- D0$WRMA_WT; ID <- D0$WRMA_ID; SEX <- D0$WRMA_SEX
ix.M <- (SEX == "M"); ix.F <- (SEX == "F")

## Plot WT vs ID (M & F)
plot(ID, WT, xlim=c(0,12), ylim=c(8,15))
points(ID[ix.M], WT[ix.M], pch="+", col="blue")
points(ID[ix.F], WT[ix.F], pch="+", col="red")
lines(ID, 15.5 - 1.0*ID)

and that there is a lot of possible variation in the discriminating line

WT = 15.5 - 1.0*ID

Also, it is apparent that the covariance between WT and ID for Females is different from the covariance between WT and ID for Males. Hence the assumption (of a common covariance matrix in the two groups) for standard LDA (which you have been applying) does not hold. Given that the sexes can be perfectly discriminated within the data on the basis of the linear discriminator (WT + ID) (and others), the variable WG is in effect a close approximation to noise. However, to the extent that there was a common covariance matrix for the two groups (in all three variables WG, WT, ID), and this was well estimated from the data, inclusion of the third variable WG could yield a slightly improved discriminator, in that the probability of misclassification (a rare event for such data) could be minimised. But it would not make much difference! However, since that assumption does not hold, this analysis would not be valid.
If you plot WT vs WG, a common covariance is more plausible, but there is considerable overlap for these two variables:

plot(WG, WT)
points(WG[ix.M], WT[ix.M], pch="+", col="blue")
points(WG[ix.F], WT[ix.F], pch="+", col="red")

If you plot WG vs ID, there is perhaps not much overlap, but a considerable difference in covariance between the two groups:

plot(ID, WG)
points(ID[ix.M], WG[ix.M], pch="+", col="blue")
points(ID[ix.F], WG[ix.F], pch="+", col="red")

This looks better on a log scale, however:

lWG <- log(WG); lWT <- log(WT); lID <- log(ID)

## Plot log(WG) vs log(ID) (M & F)
plot(lID, lWG)
points(lID[ix.M], lWG[ix.M], pch="+", col="blue")
points(lID[ix.F], lWG[ix.F], pch="+", col="red")

and common covariance still looks good for WT vs WG:

## Plot log(WT) vs log(WG) (M & F)
plot(lWG, lWT)
points(lWG[ix.M], lWT[ix.M], pch="+", col="blue")
points(lWG[ix.F], lWT[ix.F], pch="+", col="red")

but there is no improvement for WT vs ID:

## Plot log(WT) vs log(ID) (M & F)
plot(ID, WT, xlim=c(0,12), ylim=c(8,15))
points(ID[ix.M], WT[ix.M], pch="+", col="blue")
points(ID[ix.F], WT[ix.F], pch="+", col="red")

So there is no simple road to applying a routine LDA to your data. To take account of different covariances between the two groups, you would normally be looking at a quadratic discriminator. However, as indicated above, the fact that a linear discriminator using the variables ID & WT alone works so well would leave considerable imprecision in conclusions to be drawn from its results. Sorry this is not the straightforward answer you were hoping for (which I confess I have not sought); it is simply a
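A sketch of the quadratic alternative Ted mentions, using qda() from the MASS package. The original horsesLDA.csv was not posted, so the data here are a simulated stand-in with the unequal group covariances he describes; names and numbers are illustrative only:

```r
library(MASS)  # for qda()

# Simulated stand-in for the (unposted) data: two groups whose
# covariance structures differ, as described in the thread
set.seed(1)
n <- 50
d <- data.frame(
  SEX = factor(rep(c("M", "F"), each = n)),
  WT  = c(rnorm(n, 13, 0.5), rnorm(n, 10, 1.0)),
  ID  = c(rnorm(n,  2, 0.5), rnorm(n,  8, 2.0)))

# qda() fits a separate covariance matrix per group, dropping the
# equal-covariance assumption that standard lda() makes
fit  <- qda(SEX ~ WT + ID, data = d)
pred <- predict(fit, d)$class
table(pred, d$SEX)  # training-set confusion matrix
```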
Re: [R] subset dataframe by number of rows of equal values
Here is one way of doing it:

moreThan <- ave(choose$code, choose$code, FUN=length)
moreThan
# [1] 2 2 4 4 4 4 2 2 6 6 6 6 6 6
choose[moreThan > 2, ]
#    firm year code
# 3     2 2000   11
# 4     2 2001   11
# 5     2 2002   11
# 6     2 2003   11
# 9     4 2001   13
# 10    4 2002   13
# 11    4 2003   13
# 12    4 2004   13
# 13    4 2005   13
# 14    4 2006   13

On Sun, May 24, 2009 at 1:46 PM, Cecilia Carmo cecilia.ca...@ua.pt wrote: Hi R helpers! I have the following dataframe «choose»:

choose <- data.frame(firm=c(1,1,2,2,2,2,3,3,4,4,4,4,4,4),
                     year=c(2000,2001,2000,2001,2002,2003,2000,2003,
                            2001,2002,2003,2004,2005,2006),
                     code=c(10,10,11,11,11,11,12,12,13,13,13,13,13,13))

I want to subset it to obtain another one with those observations for which there are more than 2 observations in the column «code». So I want a dataframe «chosen» like this:

chosen <- data.frame(firm=c(2,2,2,2,4,4,4,4,4,4),
                     year=c(2000,2001,2002,2003,2001,2002,2003,2004,2005,2006),
                     code=c(11,11,11,11,13,13,13,13,13,13))

I've tried split() and then nrow() but I got nothing. Could anyone help me with this? Thanks, Cecília (Universidade de Aveiro, Portugal)

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390
What is the problem that you are trying to solve?
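An equivalent subset can be written with table(), which counts occurrences per code; this is just another idiom on the example data from the question, not a claim that it beats ave():

```r
choose <- data.frame(
  firm = c(1,1,2,2,2,2,3,3,4,4,4,4,4,4),
  year = c(2000,2001,2000,2001,2002,2003,2000,2003,
           2001,2002,2003,2004,2005,2006),
  code = c(10,10,11,11,11,11,12,12,13,13,13,13,13,13))

counts <- table(choose$code)  # observations per code value
chosen <- choose[choose$code %in% names(counts)[counts > 2], ]
chosen$code                   # only codes 11 and 13 remain
```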
[R] Timing issue using locator() in loop containing print()
I am attempting to use locator(n=2) to select the corners of several (5 in this case) rectangles on an image displayed in a JavaGD window. The returned coords are used to draw labeled rectangles around the selected region. I have tried several things to get this to work, including Sys.sleep, to correct what appears to be a timing issue with this loop. The first print in the loop doesn't appear before locator() has consumed several mouse clicks, and the order of pt(1) and pt(2) in each execution of the loop gets out of sync. Please offer a suggestion. I am using Windows, Java GUI for R 1.6-3, R version 2.8.1.

Example:

#.. Plot the image in a Java window
JavaGD(name="JavaGD", width=640, height=480)
#.. suppress margins all around
par(mar=c(0,0,0,0))
image(xraw, col=my.grays(256), axes=FALSE)

#.. Set up loop
tot_subsets <- 5
ss <- matrix(0, nrow=tot_subsets, ncol=4)
print("begin selections")

#.. Loop for multiple rectangle selections on image
for (i in 1:tot_subsets) {
  print("Select lower left and upper right for this rectangle")
  sub_0 <- locator(n=2)
  rect(sub_0$x[1], sub_0$y[1], sub_0$x[2], sub_0$y[2], col="red", density=0)
  text(sub_0$x[1] + (sub_0$x[2] - sub_0$x[1])/2,
       sub_0$y[1] + (sub_0$y[2] - sub_0$y[1])/2,
       paste("S", i), col="red")
  #... Add each rectangle's coords to the master subset list
  ss[i,1] <- sub_0$x[1]
  ss[i,2] <- sub_0$y[1]
  ss[i,3] <- sub_0$x[2]
  ss[i,4] <- sub_0$y[2]
}
print("finished selection")

Thanks, Bob Meglen, Boulder, CO
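One common cause of this symptom on Windows is buffered console output: print() output sits in the buffer until R returns to the prompt, so the prompt message appears after locator() has already started collecting clicks. The usual remedy is flush.console() after each print(); whether that also fixes the JavaGD click ordering is an assumption, untested here. A non-interactive sketch of the buffering fix, with Sys.sleep() standing in for the blocking locator() call:

```r
# Force each message out before the blocking call that follows it
for (i in 1:3) {
  print(paste("prompt for rectangle", i))
  flush.console()  # push buffered output to the console now
  Sys.sleep(0.2)   # stand-in for a blocking call such as locator(n = 2)
}
```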
[R] Getting an older version of a package
Hi there,

Thanks for your time in advance. I am using an add-on package from CRAN. After I updated this package, some of my programs don't work any more. I was wondering if there is anything like version control so that I could use the older version of that package, or whether I could manually install the previous version, and how I could achieve it. I am not a regular R user; although it is supposed to be very easy, after spending many hours on this I still haven't figured out how to proceed. Your help will be greatly appreciated. Thanks.

Le Wang, Ph.D
Population Center, University of Minnesota
Re: [R] Deleting columns from a matrix
Thanks, both of these methods work great!

Dimitris Rizopoulos-4 wrote: One way is:

mat <- matrix(c(rep(NA,10), 1,2,3,4,5,6,7,8,9,10, 10,9,8,NA,6,5,4,NA,2,1,
                rep(NA,10), 1,2,3,4,NA,6,7,8,9,10), 10, 5)
ind <- colSums(is.na(mat)) != nrow(mat)
mat[, ind]

I hope it helps. Best, Dimitris

dxc13 wrote: useR's, I have a matrix given by the code above. This is a 10x5 matrix containing missing values. All columns except the second contain missing values. I want to delete all columns that contain ALL missing values; in this case, that is the first and fourth columns. Any column that has at least one real number should remain. I know I can use mat[,-1] to delete the first column, but I have a much larger matrix where it is impossible to tell which columns contain all missing values and which don't. Is there a function or something else that may be able to help me accomplish this? Thanks in advance. dxc13

-- Dimitris Rizopoulos, Assistant Professor, Department of Biostatistics, Erasmus University Medical Center. Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands. Tel: +31/(0)10/7043478, Fax: +31/(0)10/7043014
Re: [R] filling area under a function line
On 24/05/2009 2:50 PM, Tom H wrote:

polygon(c(from, x, to), c(min(y), y, min(y)), col = areacol, lty = 0)

I guess my question should have been: I don't seem to be able to query or reflect a plot object to get at its attributes to fill in the polygon call, without actually modifying the curve/plot function.

There isn't an easy way to get what you want. Modifying curve() to return the x and y vectors seems like a reasonable thing to do; I'll take a look at that. Then your areacurve() wouldn't need to reproduce all of curve().

Duncan Murdoch
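For readers of the archive: the change Duncan describes did land in later versions of R, where curve() invisibly returns list(x, y). Assuming a sufficiently recent R (this thread predates the change), the fill then needs no copy of curve()'s internals:

```r
# Assumes a version of R in which curve() invisibly returns list(x, y)
xy <- curve(sin(2*pi*6*x + pi/2), from = 0, to = 1, n = 101)

# Close the path along the baseline and fill under the drawn curve
polygon(c(xy$x[1], xy$x, xy$x[length(xy$x)]),
        c(min(xy$y), xy$y, min(xy$y)),
        col = "red", border = NA)
```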
Re: [R] Getting an older version of a package
On 24/05/2009 4:00 PM, Le Wang wrote: Hi there, Thanks for your time in advance. I am using an add-on package from CRAN. After I updated this package, some of my programs don't work any more. I was wondering if there is anything like version control so that I could use the older version of that package, or whether I could manually install the previous version, and how I could achieve it. I am not a regular R user; although it is supposed to be very easy, after spending many hours on this I still haven't figured out how to proceed. Your help will be greatly appreciated.

CRAN has the older versions of the package available in source form. You need to download one of those and install it, but watch out for other packages that depend on the newer one. In the long run, it's probably a better investment of your time to fix your programs to work with the new package. (Or possibly report to the package maintainer if they have introduced a bug.) Prior to R 2.9.0 it was possible to install multiple different versions of packages, but this never worked perfectly, and it has been dropped.

Duncan Murdoch
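For the record, the mechanics Duncan describes look roughly like this; the package name and version below are placeholders, so check the package's directory under CRAN's src/contrib/Archive area for the real file names:

```r
# Placeholder name/version: browse https://cran.r-project.org/src/contrib/Archive/
# to find the tarball you actually need
url <- "https://cran.r-project.org/src/contrib/Archive/somePkg/somePkg_1.0-1.tar.gz"

# Needs network access and build tools; equivalently, download the tarball
# and run `R CMD INSTALL somePkg_1.0-1.tar.gz` from the shell
# install.packages(url, repos = NULL, type = "source")
```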
Re: [R] Animal Morphology: Deriving Classification Equation with
Dear Ted,

Thank you for taking the time out to help me with this analysis. I'm seeing that I may have left out a crucial detail concerning this analysis. The ID measurement (interpubic distance) is a new measurement that has never been used in the field of ornithology (to my knowledge). The objective of the paper is to demonstrate the usefulness of ID. The paper compared ID with a plumage criterion, a categorical variable at best, but under peer review there is a request to use other morphological data to compare/contrast ID. Unfortunately, wing (WG) and weight (WT) were the only measurements taken in addition to ID in this study. The purpose of the LDA is to demonstrate the power of ID in the context of WG and WT. I agree that WG is a terrible metric for discrimination, WT is good but there is significant overlap between groups, but ID is a good discriminator on its own (classified 97-100% of all individuals based on a 92.5% CI). You pointed out that I am violating assumptions with LDA based on different covariances between sexes (thank you... I never would have caught it). I'm wondering how to proceed. Should I:

1) Perform linear discrimination with WT and ID, and then determine a classification equation? And, if I do, how do I derive the classification equation (e.g. Cj = cj0 + cjWT*xWT + cjID*xID, for j = male, female)?

2) Demonstrate that ID is important based on linear discriminant coefficients and structure coefficients from this WG, WT, and ID LDA; discuss the assumption violation and argue for its use as a demonstration of variable predicting power; and NOT provide a classification equation, because we already have ID ranges and it would be inappropriate.

3) Both #1 and #2, because WT and ID provide such a good discriminating function, and use the WG, WT, and ID LDA as a demonstration of variable prediction value.

4) ??? Better suggestions?

THANK YOU so much for responding and all of your insight. I'm humbled by your R skills...
That code nearly took me all day to write (little by little I'm learning). Chase

Ted.Harding-2 wrote: [Your data and output listings removed. For comments, see at end]

On 24-May-09 13:01:26, cdm wrote: Fellow R Users: I'm not extremely familiar with lda or R programming, but a recent editorial review of a manuscript submission has prompted a crash course. I am on this forum hoping I could solicit some much-needed advice for deriving a classification equation. I have used three basic measurements in lda to predict two groups: male and female. I have a working model, low Wilks' lambda, graphs, coefficients, eigenvalues, etc. (see below). I adapted the sample analysis of Fisher's/Anderson's iris data provided in the MASS library for my own data. My final step is simply to form the classification equation. The classification equation simply uses standardized coefficients to classify each group, in this case male or female. A more thorough explanation: for cases with an equal sample size for each group, the classification function coefficient (Cj) is expressed by the following equation:

Cj = cj0 + cj1*x1 + cj2*x2 + ... + cjp*xp

where Cj is the score for the jth group, j = 1, ..., k, cj0 is the constant for the jth group, and x = raw scores of each predictor. If W = within-group variance-covariance matrix and Mj = column matrix of means for group j, then the constant cj0 = (-1/2)CjMj (Julia Barfield, John Poulsen, and Aaron French http://userwww.sfsu.edu/~efc/classes/biol710/discrim/discriminant.htm). I am unable to navigate this last step based on the R output I have. I only have the linear discriminant coefficients for each predictor that would be needed to complete this equation. Please, if anybody is familiar or able to help, let me know. There is a spot in the acknowledgments for you. All the best, Chase Mendenhall

The first thing I did was to plot your data.
This indicates in the first place that a perfect discrimination can be obtained on the basis of your variables WRMA_WT and WRMA_ID alone (names abbreviated to WG, WT, ID, SEX):

D0 <- read.csv("horsesLDA.csv")
# names(D0): WRMA_WG WRMA_WT WRMA_ID WRMA_SEX
WG <- D0$WRMA_WG; WT <- D0$WRMA_WT; ID <- D0$WRMA_ID; SEX <- D0$WRMA_SEX
ix.M <- (SEX == "M"); ix.F <- (SEX == "F")

## Plot WT vs ID (M & F)
plot(ID, WT, xlim=c(0,12), ylim=c(8,15))
points(ID[ix.M], WT[ix.M], pch="+", col="blue")
points(ID[ix.F], WT[ix.F], pch="+", col="red")
lines(ID, 15.5 - 1.0*ID)

and that there is a lot of possible variation in the discriminating line

WT = 15.5 - 1.0*ID

Also, it is apparent that the covariance between WT and ID for Females is different from the covariance between WT and ID for Males. Hence the assumption (of a common covariance matrix in the two groups) for standard LDA (which you have been applying) does not hold. Given that the sexes can be perfectly discriminated within the data on the basis of the linear discriminator (WT + ID) (and
[R] unit of grid size of s.class plot (ade4)?
Dear R-helpers,

I have performed a BPCA (dudi.pca, between, package=ade4) and visualised the result in a scatter plot (s.class, package=ade4). I would like to know the unit of "d" in the scatter plot, which represents the size of the grid in the background of the plot. So to make it short: what is d?

Thank you very much in advance,
Heike

Heike Zimmermann, University of Halle/Wittenberg, Department of Botany, Am Kirchtor 1, 06108 Halle, Germany, +49 (0)345-5526264, http://www.geobotanik.uni-halle.de/index.en.php
Re: [R] [correction] Animal Morphology: Deriving Classification Equation with
[Apologies, I made an error (see at [***] near the end)]

On 24-May-09 19:07:46, Ted Harding wrote: [Your data and output listings removed. For comments, see at end]

On 24-May-09 13:01:26, cdm wrote: Fellow R Users: I'm not extremely familiar with lda or R programming, but a recent editorial review of a manuscript submission has prompted a crash course. I am on this forum hoping I could solicit some much-needed advice for deriving a classification equation. I have used three basic measurements in lda to predict two groups: male and female. I have a working model, low Wilks' lambda, graphs, coefficients, eigenvalues, etc. (see below). I adapted the sample analysis of Fisher's/Anderson's iris data provided in the MASS library for my own data. My final step is simply to form the classification equation. The classification equation simply uses standardized coefficients to classify each group, in this case male or female. A more thorough explanation: for cases with an equal sample size for each group, the classification function coefficient (Cj) is expressed by the following equation:

Cj = cj0 + cj1*x1 + cj2*x2 + ... + cjp*xp

where Cj is the score for the jth group, j = 1, ..., k, cj0 is the constant for the jth group, and x = raw scores of each predictor. If W = within-group variance-covariance matrix and Mj = column matrix of means for group j, then the constant cj0 = (-1/2)CjMj (Julia Barfield, John Poulsen, and Aaron French http://userwww.sfsu.edu/~efc/classes/biol710/discrim/discriminant.htm). I am unable to navigate this last step based on the R output I have. I only have the linear discriminant coefficients for each predictor that would be needed to complete this equation. Please, if anybody is familiar or able to help, let me know. There is a spot in the acknowledgments for you. All the best, Chase Mendenhall

The first thing I did was to plot your data.
This indicates in the first place that a perfect discrimination can be obtained on the basis of your variables WRMA_WT and WRMA_ID alone (names abbreviated to WG, WT, ID, SEX):

D0 <- read.csv("horsesLDA.csv")
# names(D0): WRMA_WG WRMA_WT WRMA_ID WRMA_SEX
WG <- D0$WRMA_WG; WT <- D0$WRMA_WT; ID <- D0$WRMA_ID; SEX <- D0$WRMA_SEX
ix.M <- (SEX == "M"); ix.F <- (SEX == "F")

## Plot WT vs ID (M & F)
plot(ID, WT, xlim=c(0,12), ylim=c(8,15))
points(ID[ix.M], WT[ix.M], pch="+", col="blue")
points(ID[ix.F], WT[ix.F], pch="+", col="red")
lines(ID, 15.5 - 1.0*ID)

and that there is a lot of possible variation in the discriminating line

WT = 15.5 - 1.0*ID

Also, it is apparent that the covariance between WT and ID for Females is different from the covariance between WT and ID for Males. Hence the assumption (of a common covariance matrix in the two groups) for standard LDA (which you have been applying) does not hold. Given that the sexes can be perfectly discriminated within the data on the basis of the linear discriminator (WT + ID) (and others), the variable WG is in effect a close approximation to noise. However, to the extent that there was a common covariance matrix for the two groups (in all three variables WG, WT, ID), and this was well estimated from the data, inclusion of the third variable WG could yield a slightly improved discriminator, in that the probability of misclassification (a rare event for such data) could be minimised. But it would not make much difference! However, since that assumption does not hold, this analysis would not be valid.
If you plot WT vs WG, a common covariance is more plausible; but there is considerable overlap for these two variables:

plot(WG, WT)
points(WG[ix.M], WT[ix.M], pch="+", col="blue")
points(WG[ix.F], WT[ix.F], pch="+", col="red")

If you plot WG vs ID, there is perhaps not much overlap, but a considerable difference in covariance between the two groups:

plot(ID, WG)
points(ID[ix.M], WG[ix.M], pch="+", col="blue")
points(ID[ix.F], WG[ix.F], pch="+", col="red")

This looks better on a log scale, however:

lWG <- log(WG) ; lWT <- log(WT) ; lID <- log(ID)
## Plot log(WG) vs log(ID) (M & F)
plot(lID, lWG)
points(lID[ix.M], lWG[ix.M], pch="+", col="blue")
points(lID[ix.F], lWG[ix.F], pch="+", col="red")

and common covariance still looks good for WG vs WT:

## Plot log(WT) vs log(WG) (M & F)
plot(lWG, lWT)
points(lWG[ix.M], lWT[ix.M], pch="+", col="blue")
points(lWG[ix.F], lWT[ix.F], pch="+", col="red")

but there is no improvement for WT vs ID:

## Plot log(WT) vs log(ID) (M & F)
plot(ID, WT, xlim=c(0,12), ylim=c(8,15))
points(ID[ix.M], WT[ix.M], pch="+", col="blue")
points(ID[ix.F], WT[ix.F], pch="+", col="red")

[***] The above is incorrect! Apologies. I plotted the raw WT and ID instead of their logs. In fact, if you do plot the logs:

## Plot log(WT) vs log(ID) (M & F)
plot(lID, lWT)
points(lID[ix.M], lWT[ix.M], pch="+", col="blue")
points(lID[ix.F], lWT[ix.F], pch="+", col="red")

you now get what looks like much closer agreement between
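[An aside on the log-scale point above: on a multiplicative scale, taking logs tends to equalise group spreads, which is why the covariances look more comparable after logging. A hedged illustration on made-up lognormal data -- nothing here uses the actual measurements in this thread:]

```r
# Two groups whose raw spreads differ a lot, but whose spreads on the
# log scale are comparable (both were generated with sdlog = 0.3).
set.seed(1)
small <- rlnorm(200, meanlog = 1, sdlog = 0.3)
large <- rlnorm(200, meanlog = 3, sdlog = 0.3)
ratios <- c(raw    = sd(large) / sd(small),
            logged = sd(log(large)) / sd(log(small)))
round(ratios, 2)  # raw ratio far from 1; logged ratio close to 1
```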
Re: [R] Animal Morphology: Deriving Classification Equation with
On 24-May-09 20:32:06, cdm wrote: Dear Ted, Thank you for taking the time out to help me with this analysis. I'm seeing that I may have left out a crucial detail concerning this analysis. The ID measurement (interpubic distance) is a new measurement that has never been used in the field of ornithology (to my knowledge). The objective of the paper is to demonstrate the usefulness of ID. The paper compared ID with a plumage criterion, a categorical variable at best, but under peer review there is a request to use other morphological data to compare/contrast ID. Unfortunately, wing (WG) and weight (WT) were the only measurements taken in addition to ID in this study. Many thanks for the above additional explanation, Chase. It leads to an interpretation of the log(ID) vs log(WT) plot which could be fruitful. Namely, ID is a linear dimension, and WT could be considered as closely reflecting a (linear dimension)^3. If you look at the plot of log(WT) vs log(ID):

## Plot log(WT) vs log(ID) (M & F)
plot(lID, lWT)
points(lID[ix.M], lWT[ix.M], pch="+", col="blue")
points(lID[ix.F], lWT[ix.F], pch="+", col="red")

it is apparent that a linear increase in log(ID) as log(WT) increases is a very good description of what is happening. Also, the scatter about the linear relationship is very uniform. Therefore, a linear regression of log(ID) on log(WT) should be closely related to the linear discrimination. First, the linear regression:

lLM <- lm(lID ~ lWT)
summary(lLM)$coef
#               Estimate Std. Error   t value     Pr(>|t|)
# (Intercept) -10.657775  0.6562166 -16.24125 5.971407e-35
# lWT           4.901037  0.2671783  18.34369 2.899008e-40

so the slope is 4.901037, and the slope of a linear discriminant is likely to be close to -1/4.901037 = -0.2040385. So:

library(MASS)
lda(SEX ~ lWG + lWT + lID)
# [...]
# Coefficients of linear discriminants:
#            LD1
# lWG   5.304967
# lWT -11.604919
# lID  -2.707374

so the slope of a linear discriminant (based on all 3 variables) with respect to variation in log(WT) and log(ID) alone is -2.707374/11.604919 = -0.2332954, which is quite close to the above. It is also interesting to do the discrimination using only log(WT) and log(ID):

lda(SEX ~ lWT + lID)
# [...]
# Coefficients of linear discriminants:
#            LD1
# lWT -11.352949
# lID  -2.673019

So *very little change* compared with using all three variables; and the slope of this discriminant is -2.673019/11.352949 = -0.2354471, almost unchanged compared with the three variables. You can see the performance of the discriminator by plotting histograms of it (here I'll use the 2-variable one):

ix.M <- (SEX=="M") ; ix.F <- (SEX=="F")
LD <- 11.352949*lWT + 2.673019*lID
hist((2.673019*lID + 11.352949*lWT)[ix.M],
     breaks=0.5*(40:80), col="blue")
hist((2.673019*lID + 11.352949*lWT)[ix.F],
     breaks=0.5*(40:80), col="red", add=TRUE)

Inspection of this, however, raises some interesting questions which I'd prefer to discuss with you off-list (also your queries relating to the efficacy of ID). Ted. [But see just one short comment below] The purpose of the LDA is to demonstrate the power of ID in the context of WG and WT. I agree that WG is a terrible metric for discrimination; WT is good but there is significant overlap between groups; but ID is a good discriminator on its own (classified 97-100% of all individuals based on 92.5% CI). You pointed out that I am violating assumptions with LDA based on different covariances between sexes (thank you... I never would have caught it). I'm wondering how to proceed. As pointed out in my correction, if you work with logs it looks OK on that front! More later. Should I: 1) Perform linear discrimination with WT and ID, and then determine a classification equation? And, if I do, how do I derive the classification equation (e.g.
[Cj = cj0 + cjWT*xWT + cjID*xID; one Cj for males, one for females]) 2) Demonstrate that ID is important based on linear discriminant coefficients and structure coefficients from this WG, WT, and ID LDA; discuss the assumption violation and argue for its use as a demonstration of variable predicting power; and NOT provide a classification equation, because we already have ID ranges and it would be inappropriate. 3) Both #1 and #2, because WT and ID provide such a good discriminating function, and use the WG, WT, and ID LDA for demonstration of variable prediction value. 4) ??? better suggestions. THANK YOU so much for responding and all of your insight. I'm humbled by your R skills... that code nearly took me all day to write (little by little I'm learning). Chase
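[Stepping back to the classification-function formula quoted at the top of this thread (Cj = cj0 + cj1*x1 + ... + cjp*xp, with cj0 = (-1/2)Cj'Mj): here is a hedged sketch, on simulated data only, of how those coefficients can be computed from the group means and a pooled within-group covariance W, assuming equal group sizes. None of the names or numbers come from the data in this thread:]

```r
# Simulated stand-in data: two groups, two predictors.
set.seed(1)
X <- rbind(matrix(rnorm(40, mean = 0), 20, 2),
           matrix(rnorm(40, mean = 2), 20, 2))
g <- factor(rep(c("F", "M"), each = 20))
W <- (cov(X[g == "F", ]) + cov(X[g == "M", ])) / 2  # pooled (equal n)
Winv <- solve(W)
classfun <- sapply(levels(g), function(j) {
  Mj  <- colMeans(X[g == j, , drop = FALSE])
  cj  <- Winv %*% Mj                     # cj1 ... cjp
  cj0 <- -0.5 * drop(t(Mj) %*% cj)       # the constant cj0
  c(constant = cj0, cj)
})
# A case x is assigned to the group with the larger score cj0 + sum(cj*x):
scores <- drop(t(classfun[-1, ]) %*% X[1, ]) + classfun[1, ]
names(which.max(scores))
```

lda() in MASS reports discriminant coefficients rather than these per-group classification functions, which is exactly the gap the original question is about; the sketch above is one conventional way to bridge it.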
Re: [R] Getting an older version of a package
Duncan Murdoch, Many thanks for your reply. I did try to download the older versions from CRAN. But I am not quite sure how to compile the source form. I tried using the option "install package(s) from local zip files" in R, but it didn't work. It simply gave the following msg:

utils:::menuInstallLocal()
updating HTML package descriptions

On Sun, May 24, 2009 at 4:36 PM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 24/05/2009 4:00 PM, Le Wang wrote: Hi there, Thanks for your time in advance. I am using an add-on package from CRAN. After I updated this package, some of my programs don't work any more. I was wondering if there is anything like version control so that I could use the older version of that package; or if I could manually install the previous version, and how I could achieve it? I am not a regular R user; although it is supposed to be very easy, after spending many hours on this, I still haven't figured out how to proceed. Your help will be greatly appreciated. CRAN has the older versions of the package available in source form. You need to download one of those and install it: but watch out for other packages that depend on the newer one. In the long run, it's probably a better investment of your time to fix your programs to work with the new package. (Or possibly report to the package maintainer if they have introduced a bug.) Prior to R 2.9.0, it was possible to install multiple different versions of packages, but this never worked perfectly, and it has been dropped. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
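[A sketch of Duncan's pointer for a Unix-alike; the package name and version below are placeholders, not a real package. Old source versions live under src/contrib/Archive on CRAN. On Windows, building a source package additionally needs the Rtools toolchain installed:]

```shell
# Hypothetical package "foo", version 1.0 -- substitute the real name/version.
wget https://cran.r-project.org/src/contrib/Archive/foo/foo_1.0.tar.gz
R CMD INSTALL foo_1.0.tar.gz
```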
Re: [R] Timing issue using locator() in loop containing print()
Is the output buffered on the Rgui? If so, uncheck it and see if the problem clears up. On Sun, May 24, 2009 at 3:24 PM, Bob Meglen bmeg...@comcast.net wrote: I am attempting to use locator(n=2) to select the corners of several (5 in this case) rectangles on an image displayed in a JavaGD window. The returned coords are used to draw labeled rectangles around the selected region. I have tried several things to get this to work, including Sys.sleep, to correct what appears to be a timing issue with this loop. The first-time print in the loop doesn't print before locator executes, several mouse clicks and the order of pt(1) and pt(2) in each execution of the loop get out of sync. Please offer a suggestion. I am using Windows, Java GUI for R 1.6-3, R version 2.8.1. Example:

#....PLOT the image in a Java window....
JavaGD(name="JavaGD", width=640, height=480)
#....suppress margins all around....
par(mar=c(0,0,0,0))
image(xraw, col=my.grays(256), axes=F)
#....Set up loop....
tot_subsets <- 5
ss <- matrix(0, nrow=tot_subsets, ncol=4)
print("begin selections")
#....Loop for multiple rectangle selections on image....
for (i in 1:tot_subsets) {
  print("Select lower left and upper right for this rectangle")
  sub_0 <- locator(n=2)
  rect(sub_0$x[1], sub_0$y[1], sub_0$x[2], sub_0$y[2], col="red", density=0)
  text(sub_0$x[1] + (sub_0$x[2]-sub_0$x[1])/2,
       sub_0$y[1] + (sub_0$y[2]-sub_0$y[1])/2,
       paste("S", i), col="red")
  #...Add each rectangle's coords to master subset list.
  ss[i,1] <- sub_0$x[1]
  ss[i,2] <- sub_0$y[1]
  ss[i,3] <- sub_0$x[2]
  ss[i,4] <- sub_0$y[2]
}
print("finished selection")

Thanks, Bob Meglen Boulder, CO -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
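[Jim's buffering suggestion can also be handled in code: a minimal sketch (rectangle count arbitrary) that flushes the console before each locator() call so the prompt is visible first. On Rgui the same effect comes from unchecking Misc > Buffered output:]

```r
# Make each prompt visible before locator() starts waiting for clicks.
# flush.console() is a no-op where output is unbuffered, so it is safe
# to leave in cross-platform code.
tot_subsets <- 5
for (i in 1:tot_subsets) {
  cat("Select lower left and upper right for rectangle", i, "\n")
  flush.console()            # push pending output out now
  # sub_0 <- locator(n = 2)  # then collect the two corner clicks
}
```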
Re: [R] [correction] Animal Morphology: Deriving Classification Equation with
Ted, I just ran everything using the log of all variables. Much better analysis, and it doesn't violate the assumptions. I'm still in the dark concerning the classification equation -- other than the fact that it now will contain log functions. Thank you for your help, Chase Ted.Harding-2 wrote: [Earlier message quoted in full; removed.]
[R] The setting of .Library.site in Rprofile.site doesn't take effect
Dear R users, I am trying to customize .Library.site in the file etc/Rprofile.site under Windows XP, but it seems that the setting doesn't take effect. My setting is:

.Library.site <- "d:/site-library"

But after I launched R 2.9.0, the value is always d:/PROGRA~1/R/R-29~1.0/site-library. Could anyone help explain what the matter is? Thanks. Best wishes, Leon
Re: [R] The setting of .Library.site in Rprofile.site doesn't take effect
Oh, sorry, my own fault. I had set a HOME environment variable, and there was a .Rprofile in it, and the setting of .Library.site was overridden there. Regards, Leon Leon Yee wrote: [Original question quoted; removed.]
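[Leon's diagnosis matches the documented startup order: Rprofile.site is read first, then the user .Rprofile (found via HOME), so the latter wins. A minimal sketch of the two files -- the paths and values are examples only:]

```r
# In etc/Rprofile.site -- read first at startup:
.Library.site <- "d:/site-library"

# In $HOME/.Rprofile -- read afterwards; any .Library.site assignment
# placed here would silently override the site file, as happened above.
```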
[R] vignette problem
Dear R People: I'm using R-2.8.1 on Ubuntu Jaunty Jackalope (or whatever its name is), and having a problem with the vignette function:

vignette("snowfall")
sh: /usr/bin/xpdf: not found

Has anyone run into this, please? Or is this for the Debian R list, please? Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com
Re: [R] Getting an older version of a package
If you are on a *nix system, then in a terminal move to the directory that contains the tarball of the package and type R CMD INSTALL foo.tar.gz. Hope this helps, Stephen Sefick On Sun, May 24, 2009 at 5:50 PM, Le Wang ruser...@gmail.com wrote: [Earlier messages from Le Wang and Duncan Murdoch quoted; removed.]
-- Stephen Sefick Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
Re: [R] Naming a random effect in lmer
Thanks for your thoughts on this. I tried the approach for the grouping variable using the within function. I looked at a subset of my data for which I do not get the deparse error in lmer and compared the results. The approach using the within function to form the grouping variable underestimates that variance component. Strangely, the Zt matrix for the random effects in the proposed approach does not appear to contain rows for the Zgrouped variable, although it does report a variance estimate for the random effect. The results for the standard approach appear correct for this simulated data set. Any other ideas on how to make the grouping work? Thanks -- Leigh Ann

# Assume that only mb=10 years of data are available, to compare to results
# that avoid the deparse error
# Group the grouping variable using within
testsamp1 <- testsamp
Zs2 <- paste("Z", 2:9, sep="")
Zsum2 <- paste(Zs2, collapse="+")
testsamp1 <- within(testsamp1, Zgrouped <- Zsum2)
fittest29.1 <- lmer(LogY ~ WYear + (1+WYear|Site) + (1|Zgrouped),
                    data = testsamp1[testsamp1$WYear <= 10,])

Linear mixed model fit by REML
Formula: LogY ~ WYear + (1 + WYear | Site) + (1 | Zgrouped)
   Data: testsamp1[testsamp1$WYear <= 10, ]
    AIC    BIC logLik deviance REMLdev
 -42.96 -26.55  28.48   -63.37  -56.96
Random effects:
 Groups   Name        Variance   Std.Dev. Corr
 Site     (Intercept) 0.07351401 0.271135
          WYear       0.02620422 0.161877 0.057
 Zgrouped (Intercept) 0.00036891 0.019207
 Residual             0.01065189 0.103208
Number of obs: 77, groups: Site, 7; Zgrouped, 1
Fixed effects:
            Estimate Std.
Error t value
(Intercept)  0.9857992 0.1065573  9.251
WYear       -0.0002676 0.0612968 -0.004
Correlation of Fixed Effects:
      (Intr)
WYear 0.044

# Use the standard formulation for the grouping variable with a reduced
# number of terms to avoid the deparse error
randommodel <- paste(Zs2, collapse="+")
Trendformula2 <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|",
                                  randommodel, ")"))
fittest29.2 <- lmer(Trendformula2, data = testsamp[testsamp$WYear <= 10,])
summary(fittest29.2)

Linear mixed model fit by REML
Formula: Trendformula2
   Data: testsamp[testsamp$WYear <= 10, ]
    AIC    BIC logLik deviance REMLdev
 -147.5 -131.1  80.77   -167.7  -161.5
Random effects:
 Groups                                Name        Variance  Std.Dev. Corr
 Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 (Intercept) 0.0095237 0.097589
 Site                                  (Intercept) 0.0765445 0.276667
                                       WYear       0.0262909 0.162145 0.046
 Residual                                          0.0011282 0.033589
Number of obs: 77, groups: Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9, 11; Site, 7
Fixed effects:
            Estimate Std. Error t value
(Intercept)  0.9857992 0.1183912  8.327
WYear       -0.0002676 0.0619989 -0.004
Correlation of Fixed Effects:
      (Intr)
WYear -0.020

fittest2...@zt 15 x 77 sparse Matrix of class "dgCMatrix"
fittest2...@zt 25 x 77 sparse Matrix of class "dgCMatrix"

At 09:37 AM 5/24/2009 -0500, Douglas Bates wrote: Hi Bill, I'm about to take a look at this. If I understand the issue, very long expressions for what I call the grouping factor of a random-effects term (the expressions on the right-hand side of the vertical bar) are encountering problems with deparse. I should have realized that, any time one uses deparse, disaster looms. I can tell you that the reason the collection of random-effects terms is being named is partly for the printed form and partly so that terms with the same grouping factor can be associated.
I guess my simplistic solution to the problem would be to precompute these sums and give them names. If it is the sum Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 that is important, why not evaluate the sum

testGroupSamp <- within(testGroupSamp,
                        Z29 <- Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9)

and use Z29 as the grouping factor. Even the use of variables with names like Z1, Z2, ... and the use of expressions like

Zs2 <- paste("Z", 2:9, sep="")

is not idiomatic R/S code. It's an SPSS/SASism. (You know, I never realized before how close the word SASism, meaning a construction that is natural in SAS, is to Sadism.) Why not create a matrix Z and evaluate these sums as matrix/vector products? On Fri, May 22, 2009 at 5:30 PM, William Dunlap wdun...@tibco.com wrote: -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg Sent: Friday, May 22, 2009 3:01 PM To: Leigh Ann Starcevich Cc: r-help@r-project.org Subject: Re: [R] Naming a random effect in lmer [ ... elided statistical advice ... ] If you and your advisor still feel that what you are doing makes sense, I suggest you first get the source code via

svn checkout svn://svn.r-forge.r-project.org/svnroot/lme4

(or by downloading lme4_0.999375-30.tar.gz from
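[A possible explanation -- an assumption on my part, shown with toy data -- for why the within() version earlier in the thread reported only one Zgrouped group: Zsum2 is the character string "Z2+Z3+...", so every row gets the same label rather than the evaluated sum, which is exactly the difference Bates's suggestion addresses:]

```r
# Toy data frame standing in for testsamp (column names are illustrative).
d  <- data.frame(Z2 = 1:3, Z3 = 4:6)
d1 <- within(d, Zgrouped <- "Z2+Z3")   # one identical string per row
d2 <- within(d, Zgrouped <- Z2 + Z3)   # the evaluated sum, as suggested
d1$Zgrouped   # "Z2+Z3" "Z2+Z3" "Z2+Z3"  -> a single grouping level
d2$Zgrouped   # 5 7 9
```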
Re: [R] vignette problem
On Sun, 24 May 2009, Erin Hodgess wrote: Dear R People: I'm using R-2.8.1 on Ubuntu Jaunty Jackalope (or whatever its name is), and having a problem with the vignette function:

vignette("snowfall")
sh: /usr/bin/xpdf: not found

xpdf is configured to be the default PDF viewer, but it is not installed on your machine at the moment:

R> getOption("pdfviewer")
[1] "/usr/bin/xpdf"

Either install xpdf or set options(pdfviewer = ...) to a PDF viewer that is already installed. hth, Z Has anyone run into this, please? Or is this for the Debian R list, please? Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com
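[Concretely, the second route might look like this; "/usr/bin/evince" is only an example path, so substitute whatever viewer is actually installed:]

```r
# Point R at an installed PDF viewer instead of the missing xpdf.
options(pdfviewer = "/usr/bin/evince")   # example path, adjust as needed
getOption("pdfviewer")
# [1] "/usr/bin/evince"
# vignette("snowfall")  # would now open in that viewer, assuming the
#                       # snowfall package is installed
```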