Re: [R] trouble with Rcmdr
Hello,

On 8/18/09, Erin Hodgess erinm.hodg...@gmail.com wrote:
> I used it on Saturday, and it worked fine. When I used it today, the
> window did not appear. This is version 2.9.1 on Ubuntu Jaunty Jackalope.

Did you make any system updates between Saturday and today?

Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plot(x,y)
Hi Malcolm,

Sorry, but your post is still of poor quality, because it is not reproducible. By the way, gavin.simp...@ucl.ac.uk wrote:

> You really shouldn't call your data 'data', as it will get very confusing
> when you start using modelling functions and the data argument, or using
> inbuilt data sets and thus the data() function.

For a funny aside on this, do the following:

install.packages("fortunes")
require(fortunes)
Loading required package: fortunes
fortune("dog")

Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'?
Anyway, it might clash with the function 'matrix'.
   -- Barry Rowlingson, R-help (October 2004)

So don't name your dataset object 'data' :-)

cheers
milton

On Sun, Aug 16, 2009 at 3:19 PM, malcolm Crouch malcolm.crouc...@gmail.com wrote:

> My apologies for the poor quality of the last post.
>
> Query: plot(V6, V5, col = "red") or plot(V6, V5); the graph is attached.
>
> Data:
>
>               V5            V6
> 2063   0.009179148  -8117667.885
> 2064   0.03854349   -8117963.045
> 2065  -0.018345998  -8117980.935
> 2066   0.023662906  -8118013.245
> 2067   0.011029293  -8118253.705
> 2068   0.01772285   -8118287.025
> 2069   0.013767122  -8118297.385
> 2070   0.005579858  -8118832.465
> 2071  -0.011432977  -8118905.025
> 2072   0.017675178  -8118990.045
> 2073   0.037850056  -8118993.905
> 2074  -0.002187153  -8119290.345
> 2075   0.03605708   -8119671.645
> 2076   0.029300877  -8119714.665
> 2077  -0.008533494  -8119716.585
> 2078   0.016006132  -8119989.885
> 2079   0.005745295  -8120185.745
> 2080   0.00444089   -8120193.425
> 2081  -0.006769723  -8120293.745
> 2082  -0.058917022  -8121208.925
> 2083   0.016650977  -8121585.625
> 2084   0.013880279  -8123869.985
> 2085   0.036072047  -8124184.185
> 2086   0.030211163  -8124396.485
> 2087   0.036530434  -8124771.525
> 2088   0.039814047  -8125542.285
> 2089   0.039217756  -8126208.525
> 2090   0.044641797  -8126532.865
> 2091  -0.007088835  -8126578.105
> 2092   0.043141658  -8126678.505
> 2093   0.053877516  -8131283.565
> 2094   0.055036843  -8143173.245
> 2095   0.128109562  -8160652.125
>
>       V1                       V2                    V3  V4
> 2074  SHIELD MARKETING         C AND C INDEPENDANTS      REDISTRIBUTION
> 2075  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2076  PEP                      NON FOOD RETAIL           CONVENIENCE
> 2077  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2078  PEP                      NON FOOD RETAIL           CONVENIENCE
> 2079  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2080  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2081  SHIELD MARKETING         C AND C INDEPENDANTS      REDISTRIBUTION
> 2082  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2083  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2084  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2085  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2086  INDEPENDENT CAFES        IMPULSE                   CONVENIENCE
> 2087  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2088  SHIELD MARKETING         C AND C INDEPENDANTS      REDISTRIBUTION
> 2089  SHIELD MARKETING         C AND C INDEPENDANTS      REDISTRIBUTION
> 2090  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2091  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2092  ACKERMANS                NON FOOD RETAIL           CONVENIENCE
> 2093  INDEPENDENT WHOLESALERS  C AND C INDEPENDANTS      REDISTRIBUTION
> 2094  SHOPRITE                 GROCERY                   RETAIL
> 2095  OK FRANCHISE DIVISION    GROCERY TOP UP            RETAIL
>
> Regards
> Malcolm
>
> On Sun, Aug 16, 2009 at 8:47 PM, malcolm Crouch malcolm.crouc...@gmail.com wrote:
>
>> Hi, I am using the plot function for some data, and the plot is coming
>> back pure black, with scales on the side.
>>
>> Regards
>> Malcolm
>
> --
> Malcolm A.B Croucher
Re: [R] ggplot 2 semi-transparency error
I really don't know much about this, but I tried device=pdf (no quotes) and I got a PDF file with transparent plots. ?dev.print gives help, with links to the other devices.

hth

rajesh j wrote:
> Hi, I used the command ggplot as follows:
>
> p <- ggplot(a, aes(x = a$V1, colour = a$V2, fill = a$V2))
> p + geom_density(alpha = 0.2, xlim = c(-10, 10), ylim = c(0, 0.5))
>
> When I say
>
> dev.print(device = postscript, file = "/alpha/dct.pdf")
>
> I get:
>
> Warning message:
> In grid.Call.graphics(L_polygon, x$x, x$y, list(as.integer(1L:length(x$x :
>   semi-transparency is not supported on this device: reported only once per page
>
> and the pdf has an error. Any help?
>
> -- Rajesh.J
[R] histogram scaling
Hi,

I'm drawing two histograms in the same plot. However, my point of comparison is the difference in their x coordinates. My problem is that one histogram is much taller than the other. How can I get them both to the same height?

-- Rajesh.J
Re: [R] histogram scaling
Hey, Rajesh

A bit light on detail here; being a mind reader is not an R-help prerequisite. However, since I have been working on histograms today and you've just posted a question using ggplot, let me guess that it's ggplot2 you are referring to. Here is an example, which you can find in my post a few posts previous to yours. I've added scale_ commands to restrict the x and y scales. Actually, in this example both histograms get plotted on the same y scale automatically, so that command can be removed. Check out the ggplot2 online reference http://had.co.nz/ggplot2/ and the manual http://cran.r-project.org/web/packages/ggplot2/ggplot2.pdf

hth

require(ggplot2)
x <- data.frame(value = rnorm(5000, mean = 0), case = "A")
y <- data.frame(value = rnorm(5000, mean = 3), case = "B")
xy <- rbind(x, y)
ggplot(xy, aes(x = value, fill = case, group = case)) +
  geom_histogram(alpha = 0.5, binwidth = 0.1, position = "identity") +
  scale_x_continuous(limits = c(-2, 3)) +
  scale_y_continuous(limits = c(0, 250))
[R] Why are there small circles in my plot
Dear R-listers,

Here are my data and my code to create a plot. I want to know why there are two small circles in the upper right and lower left of the plot, respectively. Could you please share your experience or advice with me?

# dummy data
factor <- rep(c("Alice", "Jone", "Mike"), each = 100)
factor <- factor(factor)
traits <- c(rnorm(100, mean = 1, sd = 1),
            rnorm(100, mean = 3, sd = 3),
            rnorm(100, mean = 6, sd = 6))
myda <- data.frame(factor, traits)

# my plot
plot(c(min(myda$traits), max(myda$traits)), c(-0.03, 0.5),
     xlab = "State", ylab = "ylab")
lines(density(myda$traits[factor == "Alice"]), lwd = 2, col = 2)
lines(density(myda$traits[factor == "Jone"]),  lwd = 2, col = 3)
lines(density(myda$traits[factor == "Mike"]),  lwd = 2, col = 4)
points(myda$traits[factor == "Alice"],
       rep(-0.01, sum(factor == "Alice")), pch = "|", col = 2)
points(myda$traits[factor == "Jone"],
       rep(-0.02, sum(factor == "Jone")), pch = "|", col = 3)
points(myda$traits[factor == "Mike"],
       rep(-0.03, sum(factor == "Mike")), pch = "|", col = 4)

Best Regards,
Mao J-F
[R] Odp: Remove columns
Hi

r-help-boun...@r-project.org wrote on 18.08.2009 10:14:26:

> Hi Everybody,
>
> Could somebody help me? I need to remove the columns where the sum of
> their components is equal to zero. For example:
>
> a <- matrix(c(0,0,0,0,0,0, 0,1,0,1,0,0, 0,0,0,0,0,0, 1,1,0,0,1,0), ncol = 4)
> a
>      [,1] [,2] [,3] [,4]
> [1,]    0    0    0    1
> [2,]    0    1    0    1
> [3,]    0    0    0    0
> [4,]    0    1    0    0
> [5,]    0    0    0    1
> [6,]    0    0    0    0
>
> Columns 1 and 3 should be removed; the result should be the following
> matrix:
>
>      [,2] [,4]
> [1,]    0    1
> [2,]    1    1
> [3,]    0    0
> [4,]    1    0
> [5,]    0    1
> [6,]    0    0

a[, !colSums(a) == 0]

Beware of == and the finite precision of floating point numbers (see the FAQ).

Regards
Petr

> Thanks again
> --
> Alberto Lora Michiels
> Rue du Progrès, 6B
> 7860 Lessines
> GSM 32(0)496659457
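A runnable sketch of Petr's one-liner, with two refinements that are general R practice rather than part of the thread: `drop = FALSE` keeps the result a matrix even if only one column survives, and a small tolerance avoids the exact `==` comparison with floating-point sums that Petr warns about.

```r
# Remove columns whose sum is (numerically) zero.
a <- matrix(c(0,0,0,0,0,0, 0,1,0,1,0,0, 0,0,0,0,0,0, 1,1,0,0,1,0), ncol = 4)

keep <- abs(colSums(a)) > 1e-8      # tolerant test for "sum is zero"
b <- a[, keep, drop = FALSE]        # drop = FALSE: stay a matrix either way

dim(b)   # 6 rows, 2 columns: columns 1 and 3 are gone
```

For integer 0/1 matrices like this one the plain `!colSums(a) == 0` is already exact; the tolerance only matters once the entries are real-valued.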
Re: [R] moving color key in heatmap
Booboo Qu wrote:
> I have a question on moving color keys when side color bars are added to
> a heatmap. The R code below produces the color key in the upper left
> corner. Notice I have added side bars to the heatmap, but how could I
> move the color key below the image?

Thanks for the self-contained example. heatmap internally uses layout (see the docs) to arrange the plot, and its parameters can be controlled by lwid and lhei. So give the following a try; it makes the histogram and the key the same size. However, this does not help in putting the key below the heatmap, because everything is plotted in an array arrangement, as you can see from the example. You would have to change the code to put the key in a third row; see the region around layout() in the heatmap function, but it is not easy.

res <- heatmap.3(x, RowSideColors = rc, ColSideColors = cc,
                 lwid = c(1.5, 1.5), lhei = c(1.5, 1.5))
Re: [R] ggplot 2 semi-transparency error
rajesh j wrote:
> I used the command ggplot as follows:
>
> p <- ggplot(a, aes(x = a$V1, colour = a$V2, fill = a$V2))
> p + geom_density(alpha = 0.2, xlim = c(-10, 10), ylim = c(0, 0.5))
>
> When I say
>
> dev.print(device = postscript, file = "/alpha/dct.pdf")
>
> I get:
>
> Warning message:
> In grid.Call.graphics(L_polygon, x$x, x$y, list(as.integer(1L:length(x$x :
>   semi-transparency is not supported on this device: reported only once per page

It is a bit unconventional to expect a PDF when using the postscript device; that might explain the error when you open the file, because you actually created a ps file. My personal preference is not to use dev.print at all, but to open the output device before doing the plot. A self-contained example would be nice.

Dieter
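A minimal sketch of Dieter's suggestion (the data and file name are made up, and base graphics are used so the example is self-contained without ggplot2): open the pdf() device before plotting instead of redirecting with dev.print(device = postscript, ...). The pdf device supports alpha blending; postscript does not.

```r
# Open the output device first, plot, then close it.
d <- density(rnorm(200))             # made-up data

pdf("alpha-demo.pdf")                # hypothetical output file
plot(d, main = "semi-transparent fill")
polygon(d, col = rgb(1, 0, 0, 0.2), border = NA)  # alpha = 0.2 fill
dev.off()
```

The same pattern works with a ggplot object: call pdf(...), print(p + geom_density(alpha = 0.2)), then dev.off().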
Re: [R] Why are there small circles in my plot
Hi,

Mao Jianfeng wrote:
> plot(c(min(myda$traits), max(myda$traits)), c(-0.03, 0.5),
>      xlab = 'State', ylab = 'ylab')

Here you are plotting two data points: (min(myda$traits), -0.03) and (max(myda$traits), 0.5). Try this:

plot(c(min(myda$traits), max(myda$traits)), c(-0.03, 0.5),
     xlab = 'State', ylab = 'ylab', type = 'n')

HTH,
Stephan
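A compact runnable sketch of Stephan's fix, trimmed down from the original post (one group instead of three; the pdf file name is made up so the script runs headless): type = 'n' sets up the axes and coordinate system without drawing the two corner points, and lines() and points() then add the real content.

```r
set.seed(1)                               # any reproducible data will do
traits <- rnorm(100, mean = 1, sd = 1)    # stand-in for myda$traits

pdf("type-n-demo.pdf")                    # hypothetical output file
plot(range(traits), c(-0.03, 0.5), type = "n",
     xlab = "State", ylab = "ylab")       # sets up coordinates, draws nothing
lines(density(traits), lwd = 2, col = 2)  # density curve, as in the original
points(traits, rep(-0.01, length(traits)), pch = "|", col = 2)  # rug marks
dev.off()
```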
Re: [R] Polygon function
Cetinyürek Aysun wrote:
> Dear all,
>
> I would like to plot a credible interval for a function estimate in R,
> as a shaded region using the polygon function. Has anyone ever done
> that? I tried several times but I could not obtain the right figure.
>
> xis <- sort(xi, decreasing = TRUE)
> plot(xi, fm)
> polygon(c(xi, xis), c(f05m, f95m))
>
> The above piece of code produces something else.

Hi Aysun,

Have a look at the dispersion function in the plotrix package.

Jim
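A self-contained sketch of the usual polygon() recipe for a shaded band (the thread's xi, fm, f05m and f95m objects are not shown, so made-up stand-ins are used): trace the lower bound left to right, then the upper bound right to left, so the outline closes on itself. Note the y-coordinates must be reversed together with the x-coordinates, which is where the original attempt goes wrong.

```r
xi  <- seq(0, 10, length.out = 50)
fm  <- sin(xi)                  # hypothetical function estimate
f05 <- fm - 0.3                 # hypothetical lower credible bound
f95 <- fm + 0.3                 # hypothetical upper credible bound

pdf("band.pdf")                 # hypothetical output file
plot(xi, fm, type = "n", ylim = range(f05, f95))
polygon(c(xi, rev(xi)),         # x: left-to-right, then right-to-left
        c(f05, rev(f95)),       # y: lower bound out, upper bound back
        col = "grey80", border = NA)
lines(xi, fm, lwd = 2)          # estimate drawn on top of the band
dev.off()
```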
[R] Performance measure for probabilistic predictions
Hello,

I'm using an SVM to fit a predictive model, but I'm most interested in the probability output. This is easy enough to calculate. My challenge is how to measure the relative performance of the SVM for different settings/parameters/etc. An AUC curve comes to mind, but I'm NOT interested in predicting true vs. false; I am interested in finding the most accurate probability predictions possible. I've seen some literature where the probability range is cut into segments and the predicted probability is compared to the actual. This looks nice, but I need a more tangible numeric measure. One thought was a measure of probability accuracy for each range, but how would I calculate this? Any thoughts?

-N
[R] how to fill the area under the density line with semitransparent colors
Dear R-listers,

I have created a plot to display the density lines for the same variable for different entities. Now I want to fill the area under the density lines with semi-transparent colors. Though I have tried web searching and book reading, I still cannot achieve that. Could anyone please give me any help or advice? Thank you in advance. The data and code I used are listed below:

# dummy data
factor <- rep(c("Alice", "Jone", "Mike"), each = 100)
factor <- factor(factor)
traits <- c(rnorm(100, mean = 1, sd = 1),
            rnorm(100, mean = 3, sd = 3),
            rnorm(100, mean = 6, sd = 6))
myda <- data.frame(factor, traits)

# my plot
plot(c(min(myda$traits), max(myda$traits)), c(-0.03, 0.5),
     xlab = "State", ylab = "ylab")
lines(density(myda$traits[factor == "Alice"]), lwd = 2, col = 2)
lines(density(myda$traits[factor == "Jone"]),  lwd = 2, col = 3)
lines(density(myda$traits[factor == "Mike"]),  lwd = 2, col = 4)
points(myda$traits[factor == "Alice"],
       rep(-0.01, sum(factor == "Alice")), pch = "|", col = 2)
points(myda$traits[factor == "Jone"],
       rep(-0.02, sum(factor == "Jone")), pch = "|", col = 3)
points(myda$traits[factor == "Mike"],
       rep(-0.03, sum(factor == "Mike")), pch = "|", col = 4)

Regards,
Mao J-F
[R] Move legend text to the left legend box border
Hi,

in legend I'm coloring my text rather than using symbols or lines:

legend("bottomleft", txt, text.col = col, cex = 0.7)

However, between the left legend box border and the text in txt there is a large empty space. Can I somehow move the text more to the left and also get a smaller legend box?

Thanks in advance
Sigbert
Re: [R] ggplot 2 semi-transparency error
ggplot2 has its own built-in wrapper function for storing plots: ggsave.

p <- ggplot(a, aes(x = V1, colour = V2, fill = V2)) +
  geom_density(alpha = 0.2, xlim = c(-10, 10), ylim = c(0, 0.5))
ggsave(p, filename = "/alpha/dtc.pdf")

HTH,
Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
Re: [R] Move legend text to the left legend box border
On Wed, 19 Aug 2009 10:20:03 +0200 Sigbert Klinke sigb...@wiwi.hu-berlin.de wrote:

> in legend I'm coloring my text rather than using symbols or lines:
>
> legend("bottomleft", txt, text.col = col, cex = 0.7)
>
> However, between the left legend box border and the text in txt there is
> a large empty space. Can I somehow move the text more to the left and
> also get a smaller legend box?

Try

legend("bottomleft", txt, text.col = col, cex = 0.7, inset = -0.05)

and play around with the inset value.

hth
Stefan
[R] ggplot2 transparent pdf
Hi,

I plotted a histogram using ggplot2 and saved it as a pdf. However, the portions outside the histogram don't appear transparent when I use a non-white bg colour in my latex document. What can I do to make them transparent?

-- Rajesh.J
Re: [R] ggplot2 transparent pdf
Try this:

print(p + opts(plot.background = theme_rect(fill = NA)))

HTH,
baptiste

2009/8/19 rajesh j akshay.raj...@gmail.com:
> I plotted a histogram using ggplot2 and saved it as a pdf. However, the
> portions outside the histogram don't appear transparent when I use a
> non-white bg colour in my latex document. What can I do to make them
> transparent?

--
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
http://newton.ex.ac.uk/research/emag
Re: [R] lmer with random slopes for 2 or more first-level factors?
Dear Jason,

Both models have correct syntax, although I would write the last model as

lmer(DV ~ IV1 + IV2 + (1 | Subject) + (IV1 - 1 | Subject) + (IV2 - 1 | Subject))

The only difference is indeed the correlation between the random effects. The random effects in the model I wrote are all independent (not correlated). In your first model all random effects are correlated; in your second model the first random intercept is correlated with the random slope of IV1, and the second random intercept with the random slope of IV2. Depending on whether you want your interaction to be independent or not, you can use either

lmer(DV ~ IV1 + IV2 + (IV1 * IV2 | Subject))

or

lmer(DV ~ IV1 + IV2 + (1 | Subject) + (IV1 - 1 | Subject) + (IV2 - 1 | Subject) + (IV1:IV2 - 1 | Subject))

Note that IV1 * IV2 is equivalent to 1 + IV1 + IV2 + IV1:IV2.

HTH,
Thierry

PS: R-Sig-mixed-models is a better list for this kind of question.

-----Original message-----
From: Jason R. Finley
Sent: Wednesday, 19 August 2009 3:47
To: r-help@r-project.org
Subject: [R] lmer with random slopes for 2 or more first-level factors?

I have data from a design in which items are completely nested within subjects.
Subject is the only second-level factor, but I have multiple first-level factors (IVs). Say there are 2 such independent variables that I am interested in. What is the proper syntax to fit a mixed-effects model with a by-subject random intercept, and by-subject random slopes for both of the 2 IVs? I can think of at least two possibilities:

lmer(DV ~ IV1 + IV2 + (1 + IV1 + IV2 | Subject))
lmer(DV ~ IV1 + IV2 + (1 + IV1 | Subject) + (1 + IV2 | Subject))

Or maybe there is some other way to do it? Maybe the correct syntax depends on whether the random effects of subjects on the intercept and slopes are correlated or not? (If so, how do I proceed?) Finally, what would be the syntax if I wanted to include a random subject effect for the INTERACTION of IV1 and IV2?

Thanks very much,
~jason

PS: additional search terms: multi-level linear model, MLM, hierarchical, repeated measures

~~~
Jason R. Finley
Department of Psychology
University of Illinois, Urbana-Champaign
~~~
[R] Package read large file
Dear R-Users,

I am looking for packages that can read large files into R. Any suggestions are welcome.

Regards,
ML
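The digest carries no reply to this question. As a sketch of standard base-R advice rather than a particular package (the file and columns are made up): read.table()/read.csv() get considerably faster on big files when you pre-declare the column types with colClasses and supply nrows, since R can then skip type guessing and repeated re-allocation.

```r
# Write a small stand-in file, then read it back with type hints.
tmp <- tempfile(fileext = ".csv")                     # hypothetical large file
write.csv(data.frame(x = 1:1000, y = rnorm(1000)), tmp, row.names = FALSE)

dat <- read.csv(tmp,
                colClasses = c("integer", "numeric"), # skip type guessing
                nrows = 1000)                         # pre-allocate once
nrow(dat)   # 1000
```

For files too large for memory, the usual base-R route is to process them in chunks with repeated read.table(..., skip =, nrows =) calls or a connection plus readLines().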
Re: [R] how to fill the area under the density line with semitransparent colors
Hi,

One way using ggplot2:

library(ggplot2)
ggplot(data = myda, mapping = aes(x = traits, y = ..density..)) +
  stat_density(aes(fill = factor), alpha = 0.5, col = NA, position = 'identity') +
  stat_density(aes(colour = factor), geom = "path", position = 'identity', size = 2)

HTH,
baptiste

2009/8/19 Mao Jianfeng jianfeng....@gmail.com:
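A base-graphics alternative to the ggplot2 answer (not from the thread; one group shown, made-up data and file name): draw each density with polygon() and a semi-transparent fill built with rgb() and an alpha value, which works on alpha-capable devices such as pdf().

```r
set.seed(1)
traits1 <- rnorm(100, mean = 1, sd = 1)   # stand-ins for the poster's groups
traits2 <- rnorm(100, mean = 3, sd = 3)

pdf("filled-densities.pdf")               # hypothetical output file
plot(range(traits1, traits2), c(0, 0.5), type = "n",
     xlab = "State", ylab = "Density")
polygon(density(traits1), col = rgb(1, 0, 0, 0.4), border = 2, lwd = 2)
polygon(density(traits2), col = rgb(0, 1, 0, 0.4), border = 3, lwd = 2)
dev.off()
```

polygon() accepts the density object directly because it carries x and y components; overlapping regions blend thanks to the alpha channel.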
Re: [R] how to Transpose a dataset
rajclinasia wrote:
> Hi everyone,
>
> How do I transpose an R dataset by a specified variable? If possible
> send the code; it will be very helpful for us. Thanks in advance.

Written code for your problem would be useful for us to help you. If you have a data frame you may also want to take a look at

?stack
?reshape

--
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28, 50144 Firenze, Italia
Tel 39 055 3288 202 (348 lab)
Fax 39 055 333 273
olpant...@unifi.it
http://www4.unifi.it/dssnp/
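A minimal sketch of the two routes hinted at in the reply (the poster's dataset is not shown, so made-up data is used): t() for a plain matrix transpose, and reshape() to pivot a data frame on a chosen variable.

```r
# Plain transpose of a matrix-like object:
m  <- matrix(1:6, nrow = 2)
tm <- t(m)                       # now 3 rows, 2 columns

# Pivot a long data frame to wide on the "time" variable:
long <- data.frame(id    = c(1, 1, 2, 2),
                   time  = c("a", "b", "a", "b"),
                   value = c(10, 11, 20, 21))
wide <- reshape(long, idvar = "id", timevar = "time", direction = "wide")
# wide has columns id, value.a, value.b
```

Note that t() on a data frame coerces everything to a common type, so reshape() (or stack/unstack) is usually the better tool when the columns are mixed.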
Re: [R] Move legend text to the left legend box border
I believe you want x.intersp:

txt <- c("Setosa Petals", "Versicolor Sepals")
plot(1, 1, t = "n")
legend("top", txt, text.col = 1:2, cex = 0.7, inset = c(0, 1/3))
legend("center", txt, text.col = 1:2, cex = 0.7, x.intersp = -0.5)

HTH,
baptiste

2009/8/19 Stefan Grosse singularit...@gmx.net:
> try
>
> legend("bottomleft", txt, text.col = col, cex = 0.7, inset = -0.05)
>
> and play around with the inset value.
Re: [R] radial.plot() in plotrix module
Jon Case wrote:
> Hello,
>
> Is it possible to plot differently colored points with the radial.plot()
> function in the plotrix package? For example, the code:
>
> radial.plot(c(.5, .6, .7), c(1, 2, 3), rp.type = "s",
>             point.col = c("red", "green", "blue"))
>
> produces a plot whose three points are all red, instead of one red, one
> green, and one blue. Any suggestions?

Hi Jon,

When I try your code, I get the points in the requested colors. Of course, I am running the very latest version of plotrix.

Jim
Re: [R] Why are there small circles in my plot
Mao Jianfeng jianfeng....@gmail.com wrote on 19/08/2009 08:10:54:

> There is my data and my codes to create a plot. I want to know why
> there are two small circles in the upper right and lower left of the
> plot respectively.

Because your first plot() command put them there. Add type = "n" to the first plot command.
Re: [R] Embedding lists in matrices and matrices in lists
Unfortunately the matrix(list(), x, y) way seems to have some limitations. I want to continue working with the matrices which are saved in the database matrix. But for example the following doesn't work:

tetrahedron <- matrix(c(
  0,1,1,1,
  1,0,1,1,
  1,1,0,1,
  1,1,1,0
), nrow = 4, byrow = TRUE)            # an example matrix
database <- matrix(list(), 5, 5)      # create database matrix
database[[4,4]] <- list(tetrahedron)  # save example matrix in database matrix
## try to access the first row of the saved example matrix ##
database[[4,4]][1]      # prints the whole example matrix
database[[4,4]][1][1,]  # should have printed the first row of the example matrix, but gives:
Error in database[[4, 4]][1][1,] : incorrect number of dimensions

The same happens with database[[4,4]][1,]... Is there any way to access the saved matrices like normal ones? Thanks, Michael

jim holtman schrieb: Is this what you want:

x <- matrix(list(), 3, 3)   # create a matrix of lists
# create matrices for testing
for (i in 1:3) {
  for (j in 1:3) {
    x[[i,j]] <- matrix(runif((i+1) * (j+1)), i+1)
  }
}
x
     [,1]      [,2]       [,3]
[1,] Numeric,4 Numeric,6  Numeric,8
[2,] Numeric,6 Numeric,9  Numeric,12
[3,] Numeric,8 Numeric,12 Numeric,16
x[[2,2]]   # extract one of them
           [,1]      [,2]      [,3]
[1,] 0.26722067 0.3823880 0.4820801
[2,] 0.38611409 0.8696908 0.5995658
[3,] 0.01339033 0.3403490 0.4935413
str(x)     # structure of the data
List of 9
 $ : num [1:2, 1:2] 0.266 0.372 0.573 0.908
 $ : num [1:3, 1:2] 0.38 0.777 0.935 0.212 0.652 ...
 $ : num [1:4, 1:2] 0.7894 0.0233 0.4772 0.7323 0.6927 ...
 $ : num [1:2, 1:3] 0.202 0.898 0.945 0.661 0.629 ...
 $ : num [1:3, 1:3] 0.2672 0.3861 0.0134 0.3824 0.8697 ...
 $ : num [1:4, 1:3] 0.2448 0.0707 0.0995 0.3163 0.5186 ...
 $ : num [1:2, 1:4] 0.206 0.177 0.687 0.384 0.77 ...
 $ : num [1:3, 1:4] 0.186 0.827 0.668 0.794 0.108 ...
 $ : num [1:4, 1:4] 0.258 0.4785 0.7663 0.0842 0.8753 ...
 - attr(*, "dim")= int [1:2] 3 3

On Tue, Aug 18, 2009 at 4:48 AM, Michael Kogan michael.ko...@gmx.net wrote: Hi, I'm new to programming, new to R, and even new to mailing lists, so please be patient with me. I need to manage many matrices generated by an R program. These matrices have different dimensions and I'd like to group them somehow. The best way would be to have a big matrix (let's call it database) where every element database[x,y] consists of a list of matrices that all have the dimensions ncol(matrix1)=x and nrow(matrix1)=y. So those matrices have to be embedded into lists and the lists have to be embedded in the big database matrix. If I simply try

database <- matrix(0, 10, 10)
database[4,4] <- c(matrix1, matrix2)

I get

Error in database[4, 4] = c(matrix1, matrix2) : number of items to replace is not a multiple of replacement length

which makes sense of course... Is there any possibility to make this work? Or maybe there is a better way to organize those matrices? Regards, Michael
[R] Fitting a logistic regression
Hello, I have this data:

Time AMP
  0  0.200
 10  0.1958350
 20  0.2914560
 40  0.6763628
 60  0.8494534
 90  0.9874526
120  1.0477692

where AMP is the concentration of this metabolite over time. If you plot the data, you can see that it could be fitted using a logistic curve. For this purpose, I used this code:

AMP.nls <- nls(AMP ~ SSlogis(Time, Asym, xmid, scal), data = concentrations, model = TRUE)

When plotting the fitted function, it seems to fit quite well at the end of the time course. However, at the beginning the fit is not so good. How can I achieve a better fit? Forgive me if it is a stupid question, but I am just starting with nonlinear regression. Thank you, Dani
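One way to inspect the fit, using the numbers from the post (a sketch; SSfpl, the four-parameter logistic with a free lower asymptote, is one option to try if the three-parameter curve is too rigid at early times, not necessarily the right model):

```r
concentrations <- data.frame(
  Time = c(0, 10, 20, 40, 60, 90, 120),
  AMP  = c(0.2000000, 0.1958350, 0.2914560, 0.6763628,
           0.8494534, 0.9874526, 1.0477692)
)
# self-starting logistic: Asym / (1 + exp((xmid - Time)/scal))
AMP.nls <- nls(AMP ~ SSlogis(Time, Asym, xmid, scal), data = concentrations)
plot(AMP ~ Time, data = concentrations)
curve(predict(AMP.nls, newdata = data.frame(Time = x)), add = TRUE)
```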
[R] font size on graphics
Dear R users, my question is about finding the proper font size for graphics. For this I wrote code that creates 4 different graphics and saves them as PNG files. From these PNG graphics I select the one of the proper size and paste it into a Word document. I have experimented with lots of settings and lost track a bit: there are cex, cex.axis, cex.lab and so on, and I lost track of which cex does what exactly. Furthermore, these graphs work well in an R window, yet when I open the PNG file the labels of the bottom x axis are not printed. One other problem I have is that often the '10' on the bottom x axis is not printed. I tried to patch it up this way (below), yet it is unsatisfactory; sometimes it works, sometimes it doesn't.

axis(1, at = c(28:30), c('', '10', ''), padj = -1.5, cex.axis = ifelse((rol == 1), setting[rol, 8] - 0.15, setting[rol, 8] - 0.1))

And finally, when creating the graphs, I prefer the size of graf_1_small: it pastes easily into a Word document (two on one line), yet the labels are not always clear to read, so I copy the one without a suffix into Word and then size it to 75%, which means it loses some quality. The question here is: can anyone help me with the right cex sizes? I guess that is the question in general, since it is the cex settings that are either printed or not on all these graphs.
Thanks in advance, T

Here is my example:

# saving directory
WinDir <- "C:/Documents and Settings/towil/Projecten"

# data
trials <- rep(1:10, 3)
test <- c(rep("Low", 10), rep("Normal", 10), rep("Randomised", 10))
means <- rbeta(1:30, 11, 3) + rbeta(1:30, 8, 3)
ci <- rbeta(1:30, 1, 2)
meanstable <- data.frame(Trial = trials, Test = test, Means = means,
                         Upper = means + ci, Lower = means - ci)
graf_1 <- "Means 1"
graf_2 <- "Means 2"
ylimits <- c(0, 3)      # (not defined in the original post; example value)
lg10_label <- "Means"   # (not defined in the original post; example value)

# settings for the different graphs
setting <- data.frame('adname' = c(paste(graf_1, '_small'), graf_1, paste(graf_1, '_big')),
                      'width' = c(300, 400, 500), 'height' = c(300, 400, 500),
                      'pointsetting' = c(10, 12, 14), 'Directory' = rep(WinDir, 3),
                      'cex.lab' = c(0.85, 0.9, 1), 'cexsize' = c(0.8, 1, 1.2),
                      'cex.axis' = c(0.6, 0.9, 1))

# loop for the different graphs
for (rol in 1:3) {
  save_at <- WinDir
  setwd(save_at)
  png(filename = paste(graf_1, setting[rol, 1], ".png", sep = ""),
      width = setting[rol, 2], height = setting[rol, 3],
      pointsize = setting[rol, 4], bg = "white", res = NA)
  plot(1:nrow(meanstable), meanstable$Means, xlim = c(0, nrow(meanstable)),
       ylim = ylimits, xaxt = 'n', pch = 19, frame.plot = FALSE,
       xlab = '', ylab = lg10_label,
       cex.lab = setting[rol, 6], cex.axis = setting[rol, 8])
  arrows(1:nrow(meanstable), meanstable$Upper, 1:nrow(meanstable), meanstable$Lower,
         lty = 3, code = 3, angle = 45, length = .1)
  # upper x axis
  axis(3, at = 1:2, unique(meanstable$Test)[c(1, 2)], las = 1, hadj = 0.5,
       padj = c(1, 0), cex.axis = setting[rol, 8])
  axis(3, at = 2:3, unique(meanstable$Test)[c(2, 2)], las = 1, hadj = 0.5,
       padj = c(0, -1), cex.axis = setting[rol, 8])
  axis(3, at = 3, unique(meanstable$Test)[3], las = 1, hadj = 0,
       padj = -1, cex.axis = setting[rol, 8])
  # lower x axis: this does not show on the png files, yet it works in an R graphic
  for (k in 1:10) {
    axis(1, at = (3 * k - 2):(3 * k), c('', as.character(k), ''),
         padj = -1, cex.axis = setting[rol, 8])
  }
  dev.off()
}
# graphics.off()

End of example.
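Since the recurring question is which cex does what, a minimal sketch of the main cex parameters in base graphics:

```r
png("cex_demo.png", width = 400, height = 400, pointsize = 12)
plot(1:10, 1:10,
     main = "cex demo", xlab = "x", ylab = "y",
     cex = 2,         # size of the plotting symbols
     cex.axis = 0.8,  # size of the axis tick labels (also per axis() call)
     cex.lab = 1.2,   # size of the x/y axis titles
     cex.main = 1.5)  # size of the main title
dev.off()
```

All of these are multipliers of the base character size, which for a png device is set by pointsize; enlarging pointsize scales all text at once.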
Re: [R] how to fill the area under the density line with semitransparent colors
On Aug 19, 2009, at 4:05 AM, Mao Jianfeng wrote: Dear R-listers, I have created a plot to display the density lines for the same variable for different entities. Now, I want to fill the area under the density lines with semitransparent colors. Though I have searched the web and my books, I still cannot manage it. Could anyone please give me any help or advice? Thank you in advance. The data and code I used are listed below:

# dummy data
factor <- rep(c("Alice", "Jone", "Mike"), each = 100)
factor <- factor(factor)
traits <- c(rnorm(100, mean = 1, sd = 1), rnorm(100, mean = 3, sd = 3), rnorm(100, mean = 6, sd = 6))
myda <- data.frame(factor, traits)

# my plot
plot(c(min(myda$traits), max(myda$traits)), c(-0.03, 0.5), xlab = 'State', ylab = 'ylab')
lines(density(myda$traits[factor == "Alice"]), lwd = 2, col = 2)
lines(density(myda$traits[factor == "Jone"]), lwd = 2, col = 3)
lines(density(myda$traits[factor == "Mike"]), lwd = 2, col = 4)
points(myda$traits[factor == "Alice"], rep(-0.01, length(myda$traits[factor == "Alice"])), pch = "|", col = 2)
points(myda$traits[factor == "Jone"], rep(-0.02, length(myda$traits[factor == "Jone"])), pch = "|", col = 3)
points(myda$traits[factor == "Mike"], rep(-0.03, length(myda$traits[factor == "Mike"])), pch = "|", col = 4)

Searching with the terms "polygon density" produces a worked example of filling the area under a density plot at this help page: http://finzi.psych.upenn.edu/R/library/mclust02/html/density.html
-- David Winsemius, MD Heritage Laboratories West Hartford, CT
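For the base-graphics route, the filling itself can be done with polygon() and an alpha colour (a sketch with dummy data, not the poster's exact plot):

```r
set.seed(1)
traits <- rnorm(100, mean = 1, sd = 1)
d <- density(traits)
plot(d, main = "Density with shaded area")
# rgb() with an alpha channel gives a semitransparent fill; polygon()
# closes the curve down to the zero baseline.
polygon(c(d$x, rev(d$x)), c(d$y, rep(0, length(d$y))),
        col = rgb(1, 0, 0, alpha = 0.3), border = NA)
```

Repeating the polygon() call per group, each with its own alpha colour, gives overlapping shaded densities.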
Re: [R] Embedding lists in matrices and matrices in lists
Strange, it doesn't work for me:

Error in database[4, 4][[1]][1, ] : incorrect number of dimensions

R version 2.9.0 (2009-04-17) on Arch Linux, no additional packages installed.

David Winsemius schrieb: On Aug 19, 2009, at 6:02 AM, Michael Kogan wrote: Unfortunately the matrix(list(), x, y) way seems to have some limitations. I want to continue working with the matrices which are saved in the database matrix. But for example the following doesn't work:

tetrahedron <- matrix(c(
  0,1,1,1,
  1,0,1,1,
  1,1,0,1,
  1,1,1,0
), nrow = 4, byrow = TRUE)            # an example matrix
database <- matrix(list(), 5, 5)      # create database matrix
database[[4,4]] <- list(tetrahedron)  # save example matrix in database matrix
## try to access the first row of the saved example matrix ##
database[[4,4]][1]   # prints the whole example matrix

I was surprised that this worked. I would have expected that you would have used database[4,4][[1]], since database is a matrix and generally one accesses matrix elements with [n,m] addressing, while the element is a list and would need [[n]] addressing.

database[[4,4]][1][1,]  # should have printed the first row of the example matrix, but gives:

So I tried:

database[4,4][[1]][1,]
# [1] 0 1 1 1

Success.

Error in database[[4, 4]][1][1,] : incorrect number of dimensions

The same happens with database[[4,4]][1,]... Is there any way to access the saved matrices like normal ones? Thanks, Michael
David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] R function for Probabilities for the (standard) normal distribution
Dear All, I need to write an R function which computes values of the (standard) normal distribution function Φ(z) for z > 0 by summing this power series. (We should accumulate terms until the sum doesn't change.) I also need to make the function work for a vector of values z. The initial formula is

Φ(z) = (1/√(2π)) ∫ from -∞ to z of e^(-t²/2) dt
     = 1/2 + (1/√(2π)) ∫ from 0 to z of e^(-t²/2) dt.

Substituting x = -t²/2 into the series expansion for the exponential function, e^x = Σ (from n = 0 to ∞) x^n/n!, I obtain the series

e^(-t²/2) = Σ (from k = 0 to ∞) (-1)^k t^(2k) / (2^k k!).

This series can be integrated term by term to obtain the formula

Φ(z) = 1/2 + (1/√(2π)) Σ (from k = 0 to ∞) (-1)^k z^(2k+1) / (2^k k! (2k+1)).

I know how to write the R function for the exponential function e^x:

expf <- function(x) {
  x <- ifelse((neg <- (x < 0)), -x, x)
  n <- 0; term <- 1
  oldsum <- 0; newsum <- 1
  while (any(newsum != oldsum)) {
    oldsum <- newsum
    n <- n + 1
    term <- term * x / n
    newsum <- newsum + term
  }
  ifelse(neg, 1/newsum, newsum)
}

I know it will be similar to the above code, but I don't know exactly how to modify it in order to get the normal probabilities Φ(z) for z > 0. Can anybody advise me on this? Thanks a lot. Rene.
[R] font size on graphics question (correction in example,sorry)
Thanks in advance, T

Here is my example:

# saving directory
WinDir <- "C:/Documents and Settings/Project"
dir.create(file.path("C:/Documents and Settings", "Project"))

# data
trials <- rep(1:10, 3)
test <- c(rep("Low", 10), rep("Normal", 10), rep("Randomised", 10))
means <- rbeta(1:30, 11, 3) + rbeta(1:30, 8, 3)
ci <- rbeta(1:30, 1, 2)
meanstable <- data.frame(Trial = trials, Test = test, Means = means,
                         Upper = means + ci, Lower = means - ci)
graf_1 <- "Means 1"
graf_2 <- "Means 2"
ylimits <- c(0, 3)      # (not defined in the original post; example value)
lg10_label <- "Means"   # (not defined in the original post; example value)

# settings for the different graphs
setting <- data.frame('adname' = c(paste(graf_1, '_small'), graf_1, paste(graf_1, '_big')),
                      'width' = c(300, 400, 500), 'height' = c(300, 400, 500),
                      'pointsetting' = c(10, 12, 14), 'Directory' = rep(WinDir, 3),
                      'cex.lab' = c(0.85, 0.9, 1), 'cexsize' = c(0.8, 1, 1.2),
                      'cex.axis' = c(0.6, 0.9, 1))

# loop for the different graphs
for (rol in 1:3) {
  save_at <- WinDir
  setwd(save_at)
  png(filename = paste(graf_1, setting[rol, 1], ".png", sep = ""),
      width = setting[rol, 2], height = setting[rol, 3],
      pointsize = setting[rol, 4], bg = "white", res = NA)
  plot(1:nrow(meanstable), meanstable$Means, xlim = c(0, nrow(meanstable)),
       ylim = ylimits, xaxt = 'n', pch = 19, frame.plot = FALSE,
       xlab = '', ylab = lg10_label,
       cex.lab = setting[rol, 6], cex.axis = setting[rol, 8])
  arrows(1:nrow(meanstable), meanstable$Upper, 1:nrow(meanstable), meanstable$Lower,
         lty = 3, code = 3, angle = 45, length = .1)
  # upper x axis
  axis(3, at = 1:2, unique(meanstable$Test)[c(1, 2)], las = 1, hadj = 0.5,
       padj = c(1, 0), cex.axis = setting[rol, 8])
  axis(3, at = 2:3, unique(meanstable$Test)[c(2, 2)], las = 1, hadj = 0.5,
       padj = c(0, -1), cex.axis = setting[rol, 8])
  axis(3, at = 3, unique(meanstable$Test)[3], las = 1, hadj = 0,
       padj = -1, cex.axis = setting[rol, 8])
  # lower x axis: this does not show on the png files, yet it works in an R graphic
  for (k in 1:10) {
    axis(1, at = (3 * k - 2):(3 * k), c('', as.character(k), ''),
         padj = -1, cex.axis = setting[rol, 8])
  }
  dev.off()
}
# graphics.off()

End of example.
Re: [R] R function for Probabilities for the (standard) normal distribution
On Aug 19, 2009, at 7:04 AM, Rene wrote: Dear All, I need to write an R function which computes values of the (standard) normal distribution function Φ(z) for z > 0 by summing this power series. (We should accumulate terms until the sum doesn't change.) I also need to make the function work for a vector of values z.
Two problems: One, this appears to be a homework exercise.
The initial formula is Φ(z) = (1/√(2π)) ∫ from -∞ to z of e^(-t²/2) dt = 1/2 + (1/√(2π)) ∫ from 0 to z of e^(-t²/2) dt.
Two, you are posting in HTML mail, and the characters that you may think we should be seeing as Greek letters, including I presume a capital sigma for summation, are not displayed properly.
[... the remainder of the quoted question, given in full above, trimmed ...]
David Winsemius, MD Heritage Laboratories West Hartford, CT
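For reference, the series derived in the question can be summed with the same term-ratio trick as the expf function in the post (a sketch, not checked against the exercise's exact requirements):

```r
# Phi(z) = 1/2 + (1/sqrt(2*pi)) * sum_{k>=0} (-1)^k z^(2k+1) / (2^k k! (2k+1)),
# accumulating terms until the sum stops changing; vectorised over z.
Phi <- function(z) {
  k <- 0
  term <- z                 # the k = 0 term is z itself
  oldsum <- 0; newsum <- term
  while (any(newsum != oldsum)) {
    oldsum <- newsum
    k <- k + 1
    # ratio of consecutive terms: -z^2 * (2k - 1) / (2k * (2k + 1))
    term <- term * (-z^2) * (2*k - 1) / (2*k * (2*k + 1))
    newsum <- newsum + term
  }
  0.5 + newsum / sqrt(2*pi)
}
max(abs(Phi(c(0.5, 1, 2)) - pnorm(c(0.5, 1, 2))))  # should be near machine precision
```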
Re: [R] three dimensions barchart
Thanks David and Berton, but I am so sorry for my mistake. I have just found that what I want is a three-dimensional *histogram*. So the correct problem is: I draw the relative distribution graph, which looks like a histogram (X,Y plot), but I have ten (yearly) relative distribution graphs. Is there any command that can combine ten histograms (X,Y plots) into a three-dimensional histogram (X,Y,Z plot)? I checked the website David provided, http://lmdvr.r-forge.r-project.org/figures/figures.html#%20Figure%206.15 , but can't find a three-dimensional histogram there. Any suggestions? All help highly appreciated. Best, Yichih

2009/8/19 David Winsemius dwinsem...@comcast.net: On Aug 18, 2009, at 7:18 AM, Yichih Hsieh wrote: Dear R community, I have one problem with figures. I draw the relative distribution graph, it looks like a barchart (X,Y plot), but I have ten (yearly) relative distribution graphs. Is there any command that can combine ten barcharts (X,Y plots) into a three-dimensional barchart (X,Y,Z plot)?
It has been my experience that the R graphics experts generally deprecate 3-D bar graphs, but there are some worked examples in the supporting webpages for Sarkar's Lattice book: Fig 6.4 (image followed by code) http://dsarkar.fhcrc.org/lattice/book/images/Figure_06_04_stdBW.png

cloud(density ~ long + lat, state.info,
      subset = !(name %in% c("Alaska", "Hawaii")),
      type = "h", lwd = 2, zlim = c(0, max(state.info$density)),
      scales = list(arrows = FALSE))
library(maps)
state.map <- map("state", plot = FALSE, fill = FALSE)
panel.3dmap <- function(..., rot.mat, distance, xlim, ylim, zlim,
                        xlim.scaled, ylim.scaled, zlim.scaled) {
  scaled.val <- function(x, original, scaled) {
    scaled[1] + (x - original[1]) * diff(scaled) / diff(original)
  }
  m <- ltransform3dto3d(rbind(scaled.val(state.map$x, xlim, xlim.scaled),
                              scaled.val(state.map$y, ylim, ylim.scaled),
                              zlim.scaled[1]),
                        rot.mat, distance)
  panel.lines(m[1,], m[2,], col = "grey76")
}

Fig 6.15 (image) http://dsarkar.fhcrc.org/lattice/book/images/Figure_06_15_stdBW.png code at: http://lmdvr.r-forge.r-project.org/figures/figures.html#%20Figure%206.15

One relative distribution graph:

fig2b <- reldist(y = mu96$adjwage, yo = mu68$adjwage, ci = FALSE, smooth = 0.4,
                 bar = TRUE, yolabs = seq(-1, 3, by = 0.5),
                 ylim = c(0, 7), cex = 0.8, ylab = "Relative Density",
                 xlab = "Proportion of the Original Cohort")
title(main = "Fig2(b):2007", cex = 0.6)

-- Yichih Hsieh e-mail: yichih.hs...@gmail.com
David Winsemius, MD Heritage Laboratories West Hartford, CT
[R] Gam (mgcv) function
Dear all, I have obtained a model for my data set using the gam function in the mgcv library. The model is as follows:

Family: binomial
Link function: logit
RU ~ s(time, by = Status, bs = "ps") + Region*Status + Gender*Status

I have obtained the results. While presenting the results for Region and Gender, do I treat the model as a usual logistic regression and estimate the ORs by exponentiating the coefficient estimates (and likewise for the 95% CIs of the ORs)? Or is there any change in the calculation of the OR? I have been searching for this, and what I found was related to model prediction and plotting of smooth terms. Thank you in advance. Kind regards, Aysun
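On the interpretation question: with a logit link, the parametric part of a mgcv::gam fit is read like an ordinary logistic regression, so ORs and their Wald CIs come from exponentiating the parametric coefficients. A sketch with made-up data (the variable names mimic the post; the data are random, so the estimates themselves are meaningless):

```r
library(mgcv)
set.seed(1)
dat <- data.frame(time   = runif(300),
                  Status = factor(sample(c("S1", "S2"), 300, replace = TRUE)),
                  Region = factor(sample(c("R1", "R2"), 300, replace = TRUE)),
                  Gender = factor(sample(c("F", "M"), 300, replace = TRUE)))
dat$RU <- rbinom(300, 1, 0.5)
fit <- gam(RU ~ s(time, by = Status, bs = "ps") + Region*Status + Gender*Status,
           family = binomial, data = dat)
est <- summary(fit)$p.table            # parametric coefficients only
OR    <- exp(est[, "Estimate"])
lower <- exp(est[, "Estimate"] - 1.96 * est[, "Std. Error"])
upper <- exp(est[, "Estimate"] + 1.96 * est[, "Std. Error"])
cbind(OR, lower, upper)
```

With interactions in the model, each OR is conditional on the reference level of the interacting factor, exactly as in glm().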
Re: [R] Embedding lists in matrices and matrices in lists
Try this:

database[[4,4]] <- tetrahedron
database[[4,4]][1,]
[1] 0 1 1 1

On Wed, Aug 19, 2009 at 6:02 AM, Michael Kogan michael.ko...@gmx.net wrote: Unfortunately the matrix(list(), x, y) way seems to have some limitations. I want to continue working with the matrices which are saved in the database matrix. But for example the following doesn't work:

tetrahedron <- matrix(c(
  0,1,1,1,
  1,0,1,1,
  1,1,0,1,
  1,1,1,0
), nrow = 4, byrow = TRUE)            # an example matrix
database <- matrix(list(), 5, 5)      # create database matrix
database[[4,4]] <- list(tetrahedron)  # save example matrix in database matrix
## try to access the first row of the saved example matrix ##
database[[4,4]][1]      # prints the whole example matrix
database[[4,4]][1][1,]  # should have printed the first row, but gives:
Error in database[[4, 4]][1][1,] : incorrect number of dimensions

The same happens with database[[4,4]][1,]... Is there any way to access the saved matrices like normal ones? Thanks, Michael
[... earlier quoted messages, given in full above, trimmed ...]
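Put together, the two access patterns that work (a self-contained sketch of the thread's example):

```r
tetrahedron <- matrix(c(0,1,1,1, 1,0,1,1, 1,1,0,1, 1,1,1,0),
                      nrow = 4, byrow = TRUE)

# store the matrix itself, without an extra list() wrapper:
database <- matrix(list(), 5, 5)
database[[4, 4]] <- tetrahedron
database[[4, 4]][1, ]            # first row, indexed like a normal matrix

# or, if it was stored wrapped as list(tetrahedron), unwrap first:
database2 <- matrix(list(), 5, 5)
database2[[4, 4]] <- list(tetrahedron)
database2[[4, 4]][[1]][1, ]      # the same first row
```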
[R] Canonical link for the GLM gamma model
Hello! When I estimate a glm gamma model using the canonical link (inverse), I get the opposite signs on almost all coefficients compared to the same model (i.e. with the same linear predictor) estimated using other suitable links (log, logit). What confuses me is that most sources quote the canonical link for the gamma model as 1/mu, but some quote -1/mu! For example: http://support.sas.com/rnd/app/da/new/802ce/stat/chap5/sect19.htm Which is the correct link, and can this explain the sign flips I get in my models? Thank you very much for your help / advice! Serguei
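Whichever sign convention a textbook uses for the canonical link, note that the inverse is a decreasing function of the mean, so a covariate that raises the mean gets a negative coefficient on the inverse scale; that alone flips signs relative to a log link. A sketch with simulated data (not the poster's model):

```r
set.seed(1)
x  <- runif(200, 1, 2)
mu <- exp(0.5 + 1 * x)                       # mean increases with x
y  <- rgamma(200, shape = 5, rate = 5 / mu)  # E[y] = mu
f.inv <- glm(y ~ x, family = Gamma(link = "inverse"))
f.log <- glm(y ~ x, family = Gamma(link = "log"))
coef(f.inv)["x"]   # negative: 1/mu falls as mu rises
coef(f.log)["x"]   # positive
```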
Re: [R] Odd results with Chi-square test. (Not an R problem, but general statistics, I think.)
Anybody any ideas? Any help would be appreciated! Cheers, Mika -- View this message in context: http://www.nabble.com/Odd-results-with-Chi-square-test.-%28Not-an-R-problem%2C-but-general-statistics%2C-I-think.%29-tp25026167p25041900.html Sent from the R help mailing list archive at Nabble.com.
[R] mild and extreme outliers in boxplot
Dear all, could somebody tell me how I can plot mild outliers as a circle (°) and extreme outliers as an asterisk (*) in a box-and-whisker plot? Thanks very much in advance -- View this message in context: http://www.nabble.com/mild-and-extreme-outliers-in-boxplot-tp25040545p25040545.html
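One base-graphics approach: suppress boxplot()'s own outlier symbols and overlay the two classes yourself (a sketch with dummy data; note that boxplot() uses Tukey hinges, which can differ slightly from quantile() on small samples):

```r
set.seed(2)
x <- c(rnorm(50), 3.5, 5, -6)
boxplot(x, outline = FALSE)                   # box and whiskers only
q <- quantile(x, c(.25, .75)); iqr <- diff(q)
# mild outliers: between 1.5 and 3 IQRs beyond the quartiles
mild <- x[(x < q[1] - 1.5*iqr & x >= q[1] - 3*iqr) |
          (x > q[2] + 1.5*iqr & x <= q[2] + 3*iqr)]
# extreme outliers: more than 3 IQRs beyond the quartiles
extreme <- x[x < q[1] - 3*iqr | x > q[2] + 3*iqr]
points(rep(1, length(mild)), mild, pch = 1)        # circles for mild
points(rep(1, length(extreme)), extreme, pch = 8)  # asterisks for extreme
```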
Re: [R] ggplot2 legend problem
On Tue, Aug 18, 2009 at 11:10 PM, Chris Friedl cfrieda...@gmail.com wrote: Still struggling with this. A further example using a slightly different organisation of the data. The factor levels A and B are included in the dataframe in an attempt to get ggplot to generate a legend automatically.

x <- data.frame(value = rnorm(5000, mean = 0), case = "A")
y <- data.frame(value = rnorm(5000, mean = 3), case = "B")
xy <- rbind(x, y)
ggplot(xy, aes(x = value, fill = case, group = case)) + geom_histogram(binwidth = 0.1)
ggplot(xy, aes(x = value, fill = case, group = case)) + geom_density(alpha = 0.2)

Whilst the legend is generated as expected, the histogram and density plots are different. The density plots overlap each other, whereas the histograms stack. I'm trying to get the histograms to overlap and retain the legend. Is the histogram stacking by design? Can stacking be changed to overlapping?
I'm skeptical that this will create a useful plot, but geom_histogram(binwidth = 0.1, position = "identity") will do what you want. You might also want to look at geom_freqpoly. Alternatively, to use your previous approach, you just need to make a couple of small changes:

g + geom_histogram(aes(x = X, fill = "A"), colour = "black", binwidth = 0.1) +
  geom_histogram(aes(x = Y, fill = "B"), colour = "black", binwidth = 0.1) +
  scale_fill_manual("Case", c("A" = alpha("red", 0.5), "B" = alpha("blue", 0.5)))

Previously you weren't supplying the fill aesthetic, so the scale had nothing to work with. Hadley -- http://had.co.nz/
Re: [R] Performance measure for probabilistic predictions
Noah Silverman wrote: Hello, I'm using an SVM for prediction, but I'm most interested in the probability output. This is easy enough to calculate. My challenge is how to measure the relative performance of the SVM for different settings/parameters/etc. An AUC comes to mind, but I'm NOT interested in predicting true vs. false; I am interested in finding the most accurate probability predictions possible. I've seen some literature where the probability range is cut into segments and then the predicted probability is compared to the actual. This looks nice, but I need a more tangible numeric measure. One thought was a measure of probability accuracy for each range, but how to calculate this? Any thoughts? -N
Noah, this is a big area, but I'm glad you are interested in probability accuracy rather than the more frequently (mis)used classification accuracy. There are many measures available. For independent test samples the val.prob function in the Design package provides many. When making a calibration plot to demonstrate absolute prediction accuracy, it is not a good idea to bin the predicted probabilities; val.prob uses loess to produce a smooth calibration curve. Frank
-- Frank E Harrell Jr, Professor and Chair, Department of Biostatistics, School of Medicine, Vanderbilt University
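To see what these measures look like, val.prob can be run on simulated predictions (a sketch; the Design package has since been superseded by rms, which carries the same function):

```r
library(rms)               # val.prob was in Design; it lives on in rms
set.seed(4)
p <- runif(500)            # predicted probabilities
y <- rbinom(500, 1, p)     # outcomes actually generated from those probabilities
val.prob(p, y)             # Brier score, calibration intercept/slope, C-index, plot
```

Because the outcomes here are generated from p, the calibration curve should sit near the diagonal; miscalibrated predictions would bend it away.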
[R] scale or not to scale that is the question - prcomp
Dear all, here is my data, called rglp: structure(list(vzorek = structure(1:17, .Label = c("179/1/1", "179/2/1", "180/1", "181/1", "182/1", "183/1", "184/1", "185/1", "186/1", "187/1", "188/1", "189/1", "190/1", "191/1", "192/1", "R310", "R610L"), class = "factor"), iep = c(7.51, 7.79, 5.14, 6.35, 5.82, 7.13, 5.95, 7.27, 6.29, 7.5, 7.3, 7.27, 6.46, 6.95, 6.32, 6.32, 6.34), skupina = c(7.34, 7.34, 5.14, 6.23, 6.23, 7.34, 6.23, 7.34, 6.23, 7.34, 7.34, 7.34, 6.23, 7.34, 6.23, 6.23, 6.23), sio2 = c(0.023, 0.011, 0.88, 0.028, 0.031, 0.029, 0.863, 0.898, 0.95, 0.913, 0.933, 0.888, 0.922, 0.882, 0.923, 1, 1), p2o5 = c(0.78, 0.784, 1.834, 1.906, 1.915, 0.806, 1.863, 0.775, 0.817, 0.742, 0.783, 0.759, 0.787, 0.758, 0.783, 3, 2), al2o3 = c(5.812, 5.819, 3.938, 5.621, 3.928, 3.901, 5.621, 5.828, 4.038, 5.657, 3.993, 5.735, 4.002, 5.728, 4.042, 6, 5), dus = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("ano", "ne"), class = "factor")), .Names = c("vzorek", "iep", "skupina", "sio2", "p2o5", "al2o3", "dus"), class = "data.frame", row.names = c(NA, -17L)) and I try to do principal component analysis. Here is one without scaling: fit <- prcomp(~iep+sio2+al2o3+p2o5+as.numeric(dus), data=rglp) ; biplot(fit, choices=2:3, xlabs=rglp$vzorek, cex=.8) You can see that the data make 3 groups according to the variables sio2 and dus, which seems reasonable, as the lowest group has a different value of dus = "ano" while the highest group has a low value of sio2. But when I do the same with scale=TRUE, fit <- prcomp(~iep+sio2+al2o3+p2o5+as.numeric(dus), data=rglp, scale=TRUE) ; biplot(fit, choices=2:3, xlabs=rglp$vzorek, cex=.8), I get a completely different picture which is not possible to interpret in such an easy way. So can anybody advise me whether I should follow the recommendation from the help page (which says "The default is FALSE for consistency with S, but in general scaling is advisable."), or whether I should stay with scale = FALSE and the simply interpretable result? Thank you.
Petr Pikal, petr.pi...@precheza.cz
Re: [R] Embedding lists in matrices and matrices in lists
This works, but then I can only save a single matrix in each database[x,y] while I need to save a list of matrices. Gabor Grothendieck schrieb: Try this: database[[4,4]] <- tetrahedron ; database[[4,4]][1,] gives [1] 0 1 1 1 On Wed, Aug 19, 2009 at 6:02 AM, Michael Kogan michael.ko...@gmx.net wrote: Unfortunately the matrix(list(),x,y) way seems to have some limitations. I want to continue working with the matrices which are saved in the database matrix. But for example the following doesn't work: tetrahedron <- matrix(c(0,1,1,1, 1,0,1,1, 1,1,0,1, 1,1,1,0), nrow=4, byrow=TRUE) # an example matrix ; database <- matrix(list(),5,5) # create database matrix ; database[[4,4]] <- list(tetrahedron) # save example matrix in database matrix ## try to access the first row of the saved example matrix ## database[[4,4]][1] # prints the whole example matrix ; database[[4,4]][1][1,] # should have printed the first row of the example matrix, but gives: Error in database[[4, 4]][1][1,] : incorrect number of dimensions Execution halted The same happens with database[[4,4]][1,]... Is there any way to access the saved matrices like normal ones? Thanks, Michael
Re: [R] Embedding lists in matrices and matrices in lists
Hi, r-help-boun...@r-project.org napsal dne 19.08.2009 13:04:39: Strange, it doesn't work for me: Error in database[4, 4][[1]][1, ] : incorrect number of dimensions Execution halted R version 2.9.0 (2009-04-17) on Arch Linux, no additional packages installed. database[4,4][[1]][,1] does not work for me either, but that is a result of your complicated structure of nested lists. This works:
database[4,4][[1]][[1]][,1]
[1] 0 1 1 1
See the structure. Matrix of lists:
database
     [,1] [,2] [,3] [,4]   [,5]
[1,] NULL NULL NULL NULL   NULL
[2,] NULL NULL NULL NULL   NULL
[3,] NULL NULL NULL NULL   NULL
[4,] NULL NULL NULL List,1 NULL
[5,] NULL NULL NULL NULL   NULL
List of lists:
database[4,4]
[[1]]
[[1]][[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0
List which contains your matrix:
database[4,4][[1]]
[[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0
Here is your matrix:
database[4,4][[1]][[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0
Regards, Petr
David Winsemius schrieb: On Aug 19, 2009, at 6:02 AM, Michael Kogan wrote: Unfortunately the matrix(list(),x,y) way seems to have some limitations. I want to continue working with the matrices which are saved in the database matrix. But for example the following doesn't work: [snip - code as in the original post] database[[4,4]][1] # prints the whole example matrix I was surprised that this worked. I would have expected that you would have used database[4,4][[1]], since database is a matrix and generally one accesses matrix elements with [n,m] addressing, while the element is a list and would need [[n]] addressing.
database[[4,4]][1][1,] # should have printed the first row of the example matrix, but gives: Error in database[[4, 4]][1][1,] : incorrect number of dimensions Execution halted The same happens with database[[4,4]][1,]... Is there any way to access the saved matrices like normal ones? Thanks, Michael So I tried: database[4,4][[1]][1,] # [1] 0 1 1 1 -- success. David Winsemius, MD, Heritage Laboratories, West Hartford, CT
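The access problem in this sub-thread disappears if the matrix is stored directly with `[[ ]]` rather than wrapped in list(); a minimal sketch of the working version:

```r
# A matrix whose cells are list elements, each able to hold a matrix.
tetrahedron <- matrix(c(0, 1, 1, 1,
                        1, 0, 1, 1,
                        1, 1, 0, 1,
                        1, 1, 1, 0), nrow = 4, byrow = TRUE)

database <- matrix(list(), 5, 5)
database[[4, 4]] <- tetrahedron   # no extra list() wrapper

# The stored matrix can now be indexed like a normal one:
database[[4, 4]][1, ]             # first row: 0 1 1 1
```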
Re: [R] scale or not to scale that is the question - prcomp
On 19/08/2009 8:31 AM, Petr PIKAL wrote: Dear all [snip - data and code as in the original post] So if anybody can advise me whether I should follow the recommendation from the help page (which says "The default is FALSE for consistency with S, but in general scaling is advisable."), or whether I should stay with scale = FALSE and the simply interpretable result?
I would say the answer depends on the meaning of the variables. In the unusual case that they are measured in dimensionless units, it might make sense not to scale. But if you are using arbitrary units of measurement, do you want your answer to depend on them? For example, if you change from kg to mg, the numbers will become much larger, the variable will contribute much more variance, and it will become a more important part of the largest principal component. Is that sensible? Duncan Murdoch
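Duncan's point about units can be demonstrated directly (a toy example of mine, not from the thread):

```r
# The same two uncorrelated variables, with column 2 converted from
# kg to mg. Without scaling, the rescaled column swamps the PCA;
# with scale. = TRUE the result is unchanged by the unit change.
set.seed(42)
X    <- matrix(rnorm(200), ncol = 2)
X_mg <- X
X_mg[, 2] <- X_mg[, 2] * 1e6      # kg -> mg

# Unscaled: PC1 is driven almost entirely by the mg column
round(prcomp(X_mg)$rotation[, 1], 3)

# Scaled: the loadings are the same whichever units are used
all.equal(prcomp(X,    scale. = TRUE)$rotation,
          prcomp(X_mg, scale. = TRUE)$rotation)
```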
Re: [R] Package read large file
A little more detail would be appropriate. How large is large? What is the format of the file? What do you want to do with the data? Do you have to have it all in memory at once? Does it exist in a database already? What type of system are you running on? How much memory do you have? On Wed, Aug 19, 2009 at 4:51 AM, Mohamed Lajnef mohamed.laj...@inserm.fr wrote: Dear R-Users, I am looking for packages that could read large files in R. Any suggestions are welcome. Regards, ML -- Jim Holtman, Cincinnati, OH, +1 513 646 9390. What is the problem that you are trying to solve?
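Two of the stock answers to this question, sketched with a hypothetical file (general recommendations of mine, not quotes from the thread):

```r
# For read.table/read.csv, declaring the column types and an upper
# bound on the row count lets R skip type-guessing and pre-allocate,
# which saves a great deal of time and memory on big files.
# "big.csv" and its three columns are placeholders.
dat <- read.csv("big.csv",
                colClasses   = c("character", "numeric", "numeric"),
                nrows        = 2e6,   # any known upper bound helps
                comment.char = "")

# If the data need not all be in memory at once, scan() or readLines()
# on an open connection allows chunk-by-chunk processing instead.
```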
Re: [R] scale or not to scale that is the question - prcomp
Thank you. Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 14:49:52: On 19/08/2009 8:31 AM, Petr PIKAL wrote: Dear all [snip] I would say the answer depends on the meaning of the variables. In the unusual case that they are measured in dimensionless units, it might make sense not to scale. But if you are using arbitrary units of measurement, do you want your answer to depend on them? For example, if you change from kg to mg, the numbers will become much larger, the variable will contribute much more variance, and it will become a more important part of the largest principal component. Is that sensible? Basically the variables are in percentages (all between 0 and 6%), except dus, which is present or not present (for the purpose of prcomp transformed to 0/1 by as.numeric). The only variable which is not such is iep, which is basically in the range 5-8. So the ranges of all the variables are quite similar. What surprises me is that the biplot without scaling I can interpret by the used variables, while the biplot with scaling is totally different, and the two pictures do not match at all. This surprised me, as I would have expected just a small difference between the results from the two settings, since all the numbers are quite comparable and do not differ much. Best regards, Petr
Re: [R] ggplot2 legend problem
Dear Chris, First of all I would go for the density plot. The 'extra' info from the histogram is just noise, so I guess you are not interested in it. ggplot(xy, aes(x=value, colour=case, group=case)) + geom_density() But if you want to stick with a histogram, then I would use one of the two below: ggplot(xy, aes(x=value, fill=case, group=case)) + geom_histogram(binwidth=0.1, position = "identity", alpha = 0.2) ggplot(xy, aes(x=value, fill=case, group=case)) + geom_histogram(binwidth=0.1, position = "dodge") HTH, Thierry -- ir. Thierry Onkelinx, Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest, Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance, Gaverstraat 4, 9500 Geraardsbergen, Belgium, tel. +32 54/436 185, thierry.onkel...@inbo.be, www.inbo.be. To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher. The plural of anecdote is not data. ~ Roger Brinner. The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of hadley wickham Sent: Wednesday, 19 August 2009 14:18 To: Chris Friedl CC: r-help@r-project.org Subject: Re: [R] ggplot2 legend problem On Tue, Aug 18, 2009 at 11:10 PM, Chris Friedl cfrieda...@gmail.com wrote: Still struggling with this. A further example using a slightly different organisation of the data. The factors A and B are included in the dataframe in an attempt to get ggplot to generate a legend automatically.
[snip - quoted reply from Hadley Wickham, as in the earlier message]
Re: [R] three dimensions barchart
On Aug 19, 2009, at 7:44 AM, Yichih Hsieh wrote: Thanks David and Berton, but I am so sorry for my mistake. I have just found that what I want is a three-dimensional histogram. So the correct problem is: I draw the relative distribution graph, which looks like a histogram (X,Y plot), but I have ten (yearly) relative distribution graphs. Is there any command that can combine ten histograms (X,Y plots) into one three-dimensional histogram (X,Y,Z plot)? I checked the website David provided, http://lmdvr.r-forge.r-project.org/figures/figures.html#%20Figure%206.15, but can't find a three-dimensional histogram. I suggested looking at Figures 6.4 and 6.15. (I did not say they would be labeled "3D histogram".) Did the displayed webpage not offer those possibilities? I am using Firefox 3.5.2 and having no problems with displaying the figure or viewing the code when using the above URL. -- DW 2009/8/19 David Winsemius dwinsem...@comcast.net On Aug 18, 2009, at 7:18 AM, Yichih Hsieh wrote: snipped David Winsemius, MD, Heritage Laboratories, West Hartford, CT
Re: [R] graph label greek symbol failure
On 18/08/2009, Gavin Simpson gavin.simp...@ucl.ac.uk wrote: On Tue, 2009-08-18 at 13:06 +0100, e-letter wrote: On 17/08/2009, Michael Knudsen micknud...@gmail.com wrote: On Mon, Aug 17, 2009 at 12:51 PM, e-letter inp...@gmail.com wrote: I have tried to add the delta (δ) symbol to the y axis label, and the result is "D", using the command: ...ylab="δt"... Try ylab = expression(delta*t) instead. This does not work, the result is "expression(delta*t)". It works for the rest of us who suggested this: plot(1:10, ylab = expression(delta*t)) True, but the following command fails: plot(1:10, ylab="temperature" expression(delta*t)) Error: syntax error, unexpected SYMBOL, expecting ',' So I want to be able to have 'δt' and 'temperature δt' as a y-axis label.
Re: [R] scale or not to scale that is the question - prcomp
On 19/08/2009 9:02 AM, Petr PIKAL wrote: [snip - earlier exchange quoted in full] If you look at the standard deviations in the two cases, I think you can see why this happens: Scaled: Standard deviations: [1] 1.3335175 1.2311551 1.0583667 0.7258295 0.2429397 Not scaled: Standard deviations: [1] 1.0030048 0.8400923 0.5679976 0.3845088 0.1531582 The first two sds are close, so small changes to the data will affect their direction a lot. Your biplots look at the 2nd and 3rd components. Duncan Murdoch
Re: [R] graph label greek symbol failure
plot(1:10, ylab=expression(temperature~delta*t)) 2009/8/19 e-letter inp...@gmail.com: [snip - quoted message as in the previous post] -- Baptiste Auguié, School of Physics, University of Exeter, Stocker Road, Exeter, Devon, EX4 4QL, UK, http://newton.ex.ac.uk/research/emag
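Putting the two requested labels side by side (in plotmath, `~` juxtaposes with a space while `*` juxtaposes without one):

```r
# The two y-axis labels the poster asked for, via plotmath expressions.
plot(1:10, ylab = expression(delta * t))               # delta-t
plot(1:10, ylab = expression(temperature ~ delta * t)) # "temperature", a space, then delta-t
```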
[R] how to specify two variance effects in gls
Hello everybody, I have a dataset where each row has a number of subjects, and that gives me natural weights for the variance function. Additionally I see that variance increases with Age, which is a regressor. So using gls I have weights=varFixed(~1/n), but I don't know how to include the extra effect of the regressor. Fitted values show a quadratic curve vs. age; not sure if that helps. Thanks everybody. Stephen
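One way to combine the two variance effects, not given in the thread, is nlme's varComb(), which multiplies variance functions; the data frame and column names below are placeholders:

```r
# gls with a combined variance function: the fixed 1/n weights from
# the group sizes times a power-of-Age component.
library(nlme)

fit <- gls(response ~ Age, data = dat,
           weights = varComb(varFixed(~ 1/n),
                             varPower(form = ~ Age)))
summary(fit)   # the estimated power shows how variance grows with Age
```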
[R] Confidence interval on parameters from optim function
Hi everyone, I have two questions. I would like to get confidence intervals on the coefficients derived from the optim() function. I apply optim() to a given function f: res <- optim(c(0.08,0.04,1.), f, NULL, method="L-BFGS-B", lower=c(0.,0.,0.)) and I would like to get the p-value and confidence intervals associated with res$par. My second question deals with error messages. I am running optim() in a loop, and when I get an error message like the one below, the loop is stopped. I would like to change my initial values to avoid this error message and keep the loop going, so that if it crashes while I am away the program can still run. Any idea? Error in optim(c(0.08, 0.04, 1), f, NULL, method = "L-BFGS-B", lower = c(0, : L-BFGS-B needs finite values of 'fn' Thank you for any information on these two problems. Emmanuel --- Dr. Emmanuel Devred, Bedford Institute of Oceanography, 1 Challenger Drive, Dartmouth, Nova Scotia, B2Y 4A2, Canada. Ph: (1) 902 426-4681, Fax: (1) 902 426-9388, devr...@mar.dfo-mpo.gc.ca, http://myweb.dal.ca/edevred/
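A sketch of the two usual answers (my suggestions, not from the thread): ask optim() for the Hessian and invert it for approximate standard errors (valid when f is a negative log-likelihood), and wrap the call in tryCatch() so a failed fit does not abort the loop. The objective f below is a stand-in, not the poster's function.

```r
# Stand-in objective with a known minimum near c(0.1, 0.05, 1).
f <- function(p) sum((p - c(0.1, 0.05, 1))^2) + 1

res <- optim(c(0.08, 0.04, 1), f, method = "L-BFGS-B",
             lower = c(0, 0, 0), hessian = TRUE)

# If f is a negative log-likelihood, the inverse Hessian approximates
# the covariance matrix of the estimates.
se <- sqrt(diag(solve(res$hessian)))
ci <- cbind(lower = res$par - 1.96 * se,
            upper = res$par + 1.96 * se)

# Keep a loop alive: catch the error, then retry from jittered starts.
fit <- NULL
for (i in 1:5) {
  start <- c(0.08, 0.04, 1) + runif(3, 0, 0.01)
  fit <- tryCatch(optim(start, f, method = "L-BFGS-B", lower = c(0, 0, 0)),
                  error = function(e) NULL)
  if (!is.null(fit)) break   # success; otherwise jitter and retry
}
```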
Re: [R] Help understanding lrm function of Design library
On Tue, Aug 18, 2009 at 8:13 AM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: Noah Silverman wrote: Hi, I'm developing an experiment with logistic regression. I've come across the lrm function in the Design library. While I understand and can use the basic functionality, there are a ton of options that go beyond my knowledge. I've carefully read the help page for lrm, but don't understand many of the arguments and optional return values (penalty, penalty.matrix, strata.penalty, var.penalty, weights). Additionally, there are several items returned in the model that I don't understand (Max Deriv, Model L.R., C, Dxy, Gamma, Tau-a, R2, Brier, Wald Z). 1) Could someone point me to an online resource where I could learn more? (I'm big on trying to teach myself.) 2) Or, alternately, does anybody want to explain a few things for me? Thanks! -- Noah My book Regression Modeling Strategies covers much of that. Its case study on penalized ordinal logistic regression is also largely in a paper in Statistics in Medicine, if you have access to that. Once you do some background reading I'll be glad to answer some focused questions. Frank -- Frank E Harrell Jr, Professor and Chair, Department of Biostatistics, School of Medicine, Vanderbilt University Noah, I have found the Regression Modeling Strategies book to be incredibly valuable. I also took a Regression course with Frank Harrell this past February (http://biostat.mc.vanderbilt.edu/wiki/Main/RmS#2009_Upcoming_short_courses_in_R). I cannot say enough about this course. I am an ecologist but had no problem with the mostly medical examples. Dr. Harrell is a great teacher! There was so much great information presented that I hope to repeat the course if at all possible! Good luck, Michael -- Michael Denslow, Graduate Student, I.W. Carpenter Jr. Herbarium [BOON], Department of Biology, Appalachian State University, Boone, North Carolina, U.S.A.
-- AND -- Communications Manager, Southeast Regional Network of Expertise and Collections, sernec.org
Re: [R] RGoogleDocs/RCurl through proxy
Hi Remko, There is a new version (1.1-0) of the RCurl package (on which RGoogleDocs depends); there is no binary for Windows at this point. This version allows one to specify default curl options that are used each time a new curl handle/object is created. You set these defaults in R's own options() list, e.g. options(RCurlOptions = list(verbose = TRUE, proxy = "host:port", noproxy = "www.nytimes.com")) The proxy option is most likely what you need to set. You can specify the host, port, user and password in this string. Or you can specify the user:password string via the 'proxyuserpwd' option. You can even specify the mechanism to use for authenticating the user via the 'proxyauth' option. libcurl (and hence RCurl) has several different options related to proxies: http://curl.haxx.se/libcurl/c/curl_easy_setopt.html D. Remko Duursma wrote: Dear list, I am trying to use RGoogleDocs, but I am connecting through a proxy server. I know RCurl is used for the connection, which should be able to deal with proxies and such. How do I set this up for RCurl? And can I use those settings with RGoogleDocs as well? I have the name of the proxy server and the port number. (Windows XP.) Thanks, Remko ----- Remko Duursma, Post-Doctoral Fellow, Centre for Plants and the Environment, University of Western Sydney, Hawkesbury Campus, Richmond NSW 2753; Dept of Biological Science, Macquarie University, North Ryde NSW 2109, Australia. Mobile: +61 (0)422 096908, www.remkoduursma.com
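In code, the advice above looks like this (the proxy host, port and credentials are placeholders; assumes RCurl >= 1.1-0):

```r
# Set default curl options once; RCurl applies them each time a new
# curl handle is created, so RGoogleDocs inherits them too.
library(RCurl)

options(RCurlOptions = list(
  proxy        = "proxy.example.com:8080",
  proxyuserpwd = "user:password"))

# Any subsequent request then goes through the proxy:
txt <- getURL("https://www.google.com/")
```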
Re: [R] scale or not to scale that is the question - prcomp
Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 15:25:00: On 19/08/2009 9:02 AM, Petr PIKAL wrote: [snip - earlier exchange quoted in full] If you look at the standard deviations in the two cases, I think you can see why this happens: Scaled: Standard deviations: [1] 1.3335175 1.2311551 1.0583667 0.7258295 0.2429397 Not scaled: Standard deviations: [1] 1.0030048 0.8400923 0.5679976 0.3845088 0.1531582 The first two sds are close, so small changes to the data will affect their direction a lot. Your biplots look at the 2nd and 3rd components. I see. But I would have expected that the changes made by scaling would not alter the data in such a way that the unscaled and scaled results are completely different.
Yes, because the grouping in the 2nd and 3rd component biplot can be easily explained by the values of some variables (without scaling). I must admit that I do not use prcomp very often, and usually scaling gives me an explainable result, especially if I use it for variable reduction. Therefore I am reluctant to use it in this case. When I try a more standard way:
fit <- lm(iep ~ sio2 + al2o3 + p2o5 + as.numeric(dus), data = rglp)
summary(fit)
Call: lm(formula = iep ~ sio2 + al2o3 + p2o5 + as.numeric(dus), data = rglp)
Residuals: Min -0.41751, 1Q -0.15568, Median -0.03613, 3Q 0.20124, Max 0.43046
Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)      7.12085    0.62257  11.438 8.24e-08 ***
sio2            -0.67250    0.20953  -3.210 0.007498 **
al2o3            0.40534    0.08641   4.691 0.000522 ***
p2o5            -0.76909    0.11103  -6.927 1.59e-05 ***
as.numeric(dus) -0.64020    0.18101  -3.537 0.004094 **
I get quite a plausible result which can be interpreted without problems. My data are the result of a designed experiment (more or less :) and therefore all variables are significant. Is that the reason why scaling may be inappropriate in this case? Regards, Petr Pikal
Re: [R] lm.fit algo
Thank you, Steve! 2009/8/18 Steve Lianoglou mailinglist.honey...@gmail.com: Hi, On Aug 17, 2009, at 5:09 PM, Pavlo Kononenko wrote: Hi everyone, This is a little silly, but I can't figure out the algorithm behind the lm.fit function used in the context of the promax rotation algorithm. The promax function is:
promax <- function(x, m = 4) {
    if(ncol(x) < 2) return(x)
    dn <- dimnames(x)
    xx <- varimax(x)
    x <- xx$loadings
    Q <- x * abs(x)^(m-1)
    U <- lm.fit(x, Q)$coefficients
    d <- diag(solve(t(U) %*% U))
    U <- U %*% diag(sqrt(d))
    dimnames(U) <- NULL
    z <- x %*% U
    U <- xx$rotmat %*% U
    dimnames(z) <- dn
    class(z) <- "loadings"
    list(loadings = z, rotmat = U, crap = x, coeff = Q)
}
And the line I'm having trouble with is: U <- lm.fit(x, Q)$coefficients Isn't this doing a least squares regression using the predictor variables in x and the (I guess) real valued numbers in Q? x is a matrix of n (observations) by p (predictors). The $coefficients is just taking the vector of coefficients/weights over the predictors -- this would be a vector of length p -- such that x %*% t(t(U)) ~ Q (t(t(U)) is ugly, but I just want to say: get U to be a column vector; ~ is used as "almost equals"). You'll need some numerical/scientific/matrix library in Java; perhaps this could be a place to start: http://commons.apache.org/math/userguide/stat.html#a1.5_Multiple_linear_regression Hope that helps, -steve -- Steve Lianoglou, Graduate Student: Computational Systems Biology, Memorial Sloan-Kettering Cancer Center, Weill Medical College of Cornell University. Contact Info: http://cbio.mskcc.org/~lianos/contact
Re: [R] need help for t.test using by
any one 1Rnwb wrote: I am trying to do a t-test for each plate, which has equal numbers of disease and control samples. By searching this forum I found one posting suggesting the OP use by(eo, PlateID, function(.sub) t.test(IL1Ra~Group, data=.sub)). When I modified this for my use I used to get the p-values for each plate; I recently upgraded to R 2.9.x and now I get the following error when I use this:

by(eo, PlateID, function(.sub) t.test(IL1Ra~Group, data=.sub))
Error in t.test.formula(IL1Ra ~ Group, data = .sub) :
  grouping factor must have exactly 2 levels

I checked my data file: the plate IDs are P16-P47 and the disease column is Y/N. I have searched everywhere but could not locate the answer. Can anyone point me to where I am going wrong? For more detail I am posting part of my data.

SampleID PlateID Sex Disease Duration Age prt1 prt2 prt3 prt4
1 P16 F N 25.33282811 836.08979 20.04582692 295.74 11731.43
2 P16 F N 32.74912883 243.0652116 53.16487056 383.63 4451.36
3 P16 M N 3.49961999 181.5587298 23.24604762 522.52 325.52
4 P16 M N 5.249771666 261.0978097 19.69024684 833.61 2229.04
5 P16 M N 39.16612385 237.334794 83.97284692 694.03 1204.36
6 P16 F N 33.49929145 540.486536 29.69569346 895.36 72105.75
7 P16 F N 3.915997361 5215.378446 35.44324704 1023.63 101680.69
8 P16 F N 33.49929145 466.5188732 12.93422535 814.26 7169.61
9 P16 F N 38.1661348 706.5117791 19.69024684 331.62 7296.72
10 P16 M N 0.916030215 242.3011559 36.40117263 90.13 5962.08
11 P16 F Y 14975.25 42.08281552 133.8052501 29.0570764 260.55 1280.29
12 P16 M Y 1095.75 40.41593932 247.6495456 19.69024684 161.64 6685.6
13 P16 F Y 2922 25.08300169 231.6043765 9.022844487 902.53 37571.39
14 P16 F Y 12053.25 37.49924765 158.6370595 24.90606549 1.63 471.15
15 P16 F Y 14610 76.58209602 278.3837309 41.51010914 152.13 16285.49
16 P16 F Y 3287.25 49.58239171 128.4568603 17.91234645 274.99 41823.21
17 P17 M N 2.333079994 397.4440508 55.20987654 366.37 90011.27
18 P17 F N 4.749435461 222.0839813 73.62489675 271.65 903.89
19 P17 F N 1.749468315 676.8904636 47.7037037 721.19 663.15
20 P17 M N 33.24946503 413.712486 27.42386831 611.64 4195.86
21 P17 F N 15.4162131 346.9913086 30.71604938 302.29 2151.47
22 P17 F N 33.3327405 2598.071161 107.8088094 1800.44 102005.12
23 P17 M N 9.832656179 3296.535581 54.1563786 2642.73 676.78
24 P17 F N 3.749446413 488.2589867 15.29151952 562.88 11619.62
25 P17 M N 9.749380705 159.1536586 23.41690259 132.64 931.14
26 P17 M Y 12053.25 50.91548265 173.7764722 18.90280088 52.25 15962.05
27 P17 M Y 3652.5 25.83248096 264.8254141 35.32510288 497.33 15504.95
28 P17 F Y 28.41607073 415.3599225 35.0617284 1600.67 14224.39
29 P17 F Y 6209.25 60.74882219 213.7280879 44.41152263 111.36 9018.48
30 P17 M Y 4748.25 47.33258719 264.0016959 31.24279835 366.81 3809.83
31 P17 M Y 16071 54.49906148 112.4128794 34.40329218 114.56 1727.02
32 P17 M Y 7670.25 52.33253243 145.5753317 32.2962963 44.92 15264.84
33 P17 M Y 8766 31.74913978 220.2561296 34.40329218 130.31 2377.36
34 P17 F Y 8400.75 40.83231669 194.6662059 48.09876543 407.83 19587.61
35 P17 M Y 12053.25 36.1661567 212.1613578 96.51775947 621.42 3269.9
36 P17 M Y 8035.5 57.91540599 115.2852178 24.79012346 282.88 8960.27
37 P17 M Y 5478.75 36.33270765 160.4592669 40.72427984 102.32 5006.77
38 P17 F Y 9496.5 34.74910692 329.8991551 40.32921811 417.32 248857.58
39 P18 F N 14.2496731
Re: [R] Confidence interval on parameters from optim function
Hi Emmanuel, You can obtain standard error estimates from the Hessian matrix evaluated at the optimum as sqrt(diag(solve(ans$hessian))), where `ans' is the object returned by `optim'. However, `optim' does not return the Hessian matrix by default, so you need to specify `hessian = TRUE' when you call `optim'. There are two issues that I would think about: 1. You need to pay attention to the standard regularity conditions that are required in order for the confidence interval to be valid. One important condition is that the solution be in the interior of the parameter space. Since you are using box-constraints, I would check to make sure that the solution is not on the boundary. 2. The Hessian returned by `optim' is a bit inaccurate. Often, this is not a big deal. There are situations, however, where this matters, as you might get an indefinite Hessian due to inaccuracy. So, I generally prefer to use the `hessian' function in the numDeriv package. Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Devred, Emmanuel Sent: Wednesday, August 19, 2009 9:41 AM To: r-help@r-project.org Subject: [R] Confidence interval on parameters from optim function Hi everyone, I have two questions: I would like to get confidence intervals on the coefficients derived from the optim() function. I apply optim() to a given function f:

res <- optim(c(0.08, 0.04, 1), f, NULL, method = "L-BFGS-B", lower = c(0, 0, 0))

and I would like to get the p-values and confidence intervals associated with res$par. My second question deals with error messages. I am running optim() in a loop; when I get an error message like the one below, the loop stops. I would like to change my initial values to avoid this error and keep the loop going, so that if it crashes while I am away the program can still run. Any ideas?

Error in optim(c(0.08, 0.04, 1), f, NULL, method = "L-BFGS-B", lower = c(0, :
  L-BFGS-B needs finite values of 'fn'

Thank you for any information on these two problems. Emmanuel --- Dr. Emmanuel Devred Bedford Institute of Oceanography, 1 Challenger Drive, Dartmouth, Nova Scotia, B2Y 4A2, Canada Ph: (1) 902 426-4681 Fax: (1) 902 426-9388 devr...@mar.dfo-mpo.gc.ca http://myweb.dal.ca/edevred/ --- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
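Ravi's recipe, sketched on a toy problem (a two-parameter normal negative log-likelihood, since Emmanuel's `f' is not shown):

```r
set.seed(42)
y <- rnorm(200, mean = 5, sd = 2)

# Negative log-likelihood of N(mu, sigma); par = c(mu, log_sigma) keeps sigma > 0
nll <- function(par) {
  -sum(dnorm(y, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# hessian = TRUE asks optim to return the Hessian at the optimum
ans <- optim(c(0, 0), nll, method = "BFGS", hessian = TRUE)

# Standard errors from the inverse of the observed information matrix
se <- sqrt(diag(solve(ans$hessian)))

# Approximate 95% Wald interval for mu
ans$par[1] + c(-1.96, 1.96) * se[1]
```

For p-values, the corresponding Wald statistic is ans$par / se, referred to a standard normal; all of this assumes the usual regularity conditions Ravi mentions (interior solution, well-behaved likelihood).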
Re: [R] Fitting a logistic regression
With my limited understanding, I am not surprised to see this data fitting nicely only at the end; you can tell just by eyeballing it. The reaction at the early time points is not complete; as time passes (around 20 units) the reaction generates enough metabolite to be measured reliably. Your t=0 and t=10 values are basically the same: if I limit the digits to 2, both values are 0.20. You could do a log transform and then try to fit it. Dani Valverde-4 wrote: Hello, I have this data:

Time AMP
0    0.200
10   0.1958350
20   0.2914560
40   0.6763628
60   0.8494534
90   0.9874526
120  1.0477692

where AMP is the concentration of this metabolite over time. If you plot the data, you can see that it could be fitted using a logistic curve. For this purpose, I used this code:

AMP.nls <- nls(AMP ~ SSlogis(Time, Asym, xmid, scal), data = concentrations, model = TRUE)

When plotting the fitted function, it seems to fit quite well at the end of the time range. However, at the beginning the fit is not so good. How can I achieve a better fit? Forgive me if it is a stupid question, but I am just starting with nonlinear regression. Thank you, Dani -- [?] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Fitting-a-logistic-regression-tp25041444p25043712.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ridge regression
Dear all, I considered an ordinary ridge regression problem. I followed three different ways: 1. estimate beta without any standardization; 2. estimate standardized beta (standardizing X and y) and then convert back; 3. estimate beta using the lm.ridge() function.

X <- matrix(c(1,2,9,3,2,4,7,2,3,5,9,1), 4, 3)
y <- t(as.matrix(cbind(2,3,4,5)))
n <- nrow(X)
p <- ncol(X)

# Without standardization
intercept <- rep(1, n)
Xn <- cbind(intercept, X)
K <- diag(c(0, rep(1.5, p)))
beta1 <- solve(t(Xn) %*% Xn + K) %*% t(Xn) %*% y
beta1

# With standardization
ys <- scale(y)
Xs <- scale(X)
K <- diag(1.5, p)
bs <- solve(t(Xs) %*% Xs + K) %*% t(Xs) %*% ys
b <- sd(y) * (bs/sd(X))
intercept <- mean(y) - sum(as.matrix(colMeans(X)) * b)
beta2 <- rbind(intercept, b)
beta2

# Using the lm.ridge function of the MASS package
beta3 <- lm.ridge(y ~ X, lambda = 1.5)

I'm getting three different outputs from the three approaches above, whereas I expected them all to be the same.

beta1
                [,1]
intercept  3.4007944
           0.3977462
           0.2082025
          -0.4829115

beta2
                [,1]
intercept  3.3399855
           0.1639469
           0.0262021
          -0.1228987

beta3
                    X1          X2          X3
3.35158977  0.19460958  0.03152778 -0.15546775

It will be very helpful to me if anybody can explain why the outputs differ. regards. -- View this message in context: http://www.nabble.com/ridge-regression-tp25045020p25045020.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
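Part of the answer is that, unlike OLS, ridge estimates are not equivariant under rescaling of X: the same lambda applied on the raw scale and on a standardized scale defines genuinely different estimators, and lm.ridge additionally does its own internal centering and scaling. A hedged sketch of the non-equivariance on synthetic data (penalty on all coefficients, no intercept, for simplicity):

```r
set.seed(7)
X <- matrix(rnorm(40), 10, 4)
y <- rnorm(10)

# Ridge estimate: (X'X + kI)^{-1} X'y
ridge <- function(X, y, k) drop(solve(crossprod(X) + diag(k, ncol(X)), crossprod(X, y)))

b_raw <- ridge(X, y, k = 1.5)

# Rescale one column by 1000 (e.g. kg -> g), refit with the same k, convert back
X2 <- X
X2[, 1] <- X2[, 1] * 1000
b_back <- ridge(X2, y, k = 1.5)
b_back[1] <- b_back[1] * 1000

# OLS (k = 0) would round-trip exactly; ridge does not
all.equal(b_raw, b_back)
```

So the three posted approaches would only agree if the penalty were applied on exactly the same scale in each.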
Re: [R] Confidence interval on parameters from optim function
Emmanuel, I didn't answer your second question. You can use the `try' function to capture errors and keep proceeding through simulations without crashing out: ?try If `L-BFGS-B' does not work well, you could try the `spg' function in the BB package. Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Devred, Emmanuel Sent: Wednesday, August 19, 2009 9:41 AM To: r-help@r-project.org Subject: [R] Confidence interval on parameters from optim function Hi everyone, I have two questions: I would like to get confidence intervals on the coefficients derived from the optim() function. I apply optim() to a given function f:

res <- optim(c(0.08, 0.04, 1), f, NULL, method = "L-BFGS-B", lower = c(0, 0, 0))

and I would like to get the p-values and confidence intervals associated with res$par. My second question deals with error messages. I am running optim() in a loop; when I get an error message like the one below, the loop stops. I would like to change my initial values to avoid this error and keep the loop going, so that if it crashes while I am away the program can still run. Any ideas?

Error in optim(c(0.08, 0.04, 1), f, NULL, method = "L-BFGS-B", lower = c(0, :
  L-BFGS-B needs finite values of 'fn'

Thank you for any information on these two problems. Emmanuel --- Dr. Emmanuel Devred Bedford Institute of Oceanography, 1 Challenger Drive, Dartmouth, Nova Scotia, B2Y 4A2, Canada Ph: (1) 902 426-4681 Fax: (1) 902 426-9388 devr...@mar.dfo-mpo.gc.ca http://myweb.dal.ca/edevred/ --- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
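Ravi's try() suggestion in loop form, with a stand-in objective (the original f is not shown; the stop() below merely simulates the "needs finite values" failure for one of the starting points):

```r
# Stand-in objective: errors for some starting values, quadratic otherwise
f <- function(p) {
  if (p[1] < 0.1) stop("non-finite value")   # simulate the L-BFGS-B failure
  sum((p - c(1, 2, 3))^2)
}

starts <- list(c(0.08, 0.04, 1), c(0.5, 0.5, 1), c(1, 1, 2))
results <- vector("list", length(starts))

for (i in seq_along(starts)) {
  fit <- try(optim(starts[[i]], f, method = "L-BFGS-B", lower = c(0, 0, 0)),
             silent = TRUE)
  if (inherits(fit, "try-error")) {
    results[[i]] <- NA        # record the failure and move on to the next start
  } else {
    results[[i]] <- fit$par
  }
}
```

Here the first start fails and is recorded as NA, while the loop carries on and the other two starts converge near c(1, 2, 3).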
Re: [R] Problem with predict.coxph
In examining the predict.coxph functions for the library I have with 2.7.1 versus the library with 2.9.1, I find a major rewrite of the function. A number of internal survival functions are no longer present, so much of the code has changed. This makes identifying the specific problem beyond my capabilities. What I want to do is generate predictions for specific combinations of covariates. The number of combinations I am interested in is different from the number of records in the original data file. Any help would be appreciated, as some of the graphics routines I want to use on the data are only available in 2.8 or greater - meaning I am currently looking at trying to run two different versions of R to get the project done. TIA, Michael Conklin -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Conklin Sent: Tuesday, August 18, 2009 8:26 PM To: r-help@r-project.org Subject: [R] Problem with predict.coxph We occasionally utilize the coxph function in the survival library to fit multinomial logit models. (The Breslow method produces the same likelihood function as the multinomial logit.) We then utilize the predict function to create summary results for various combinations of covariates. For example:

mod1 <- coxph(Depvar ~ Price:Product + strata(ID), data = MyDCMData2, na.action = na.omit, method = "breslow")

The model runs fine. Then we create some new data that is all combinations of Price and Product and retrieve the summary linear predictors.
newdata <- expand.grid(Price = factor(as.character(1:5)), Product = factor(as.character(1:5)))
## create a utility matrix for all combinations of prices and products
totalut <- predict(mod1, newdata = newdata, type = "lp")

Under R 2.7.1 this produces the following output:

totalut
          [,1]
1   0.01534582
2  -0.07628528
3  -0.88085189
4  -1.19458045
5  -1.03579684
6   0.40065672
7   0.15922492
8  -0.49233524
9  -0.65483441
10 -1.07739920
11  0.27589201
12  0.48055065
13  0.33638585
14 -0.28416678
15 -0.48762319
16  1.06071986
17  0.69041596
18  0.67479476
19  0.36360168
20 -0.09492167
21  0.66554276
22  0.55748465
23  0.37596413
24  0.01612020
25 -0.03567735

The problem is that under R 2.8.1 and R 2.9.1 the same line fails with the following error:

totalut <- predict(mod1, newdata = newdata, type = "lp")
Error in model.frame.default(Terms2, newdata, xlev = object$xlevels) :
  variable lengths differ (found for 'Price')
In addition: Warning message:
'newdata' had 25 rows but variable(s) found have 43350 rows

Does anyone have an idea what is going on? Best regards, Michael Conklin W. Michael Conklin Chief Methodologist MarketTools, Inc. | http://www.markettools.com 6465 Wayzata Blvd | Suite 170 | St. Louis Park, MN 55426. PHONE: 952.417.4719 | CELL: 612.201.8978 This email and attachment(s) may contain confidential and/or proprietary information and is intended only for the intended addressee(s) or its authorized agent(s). Any disclosure, printing, copying or use of such information is strictly prohibited. If this email and/or attachment(s) were received in error, please immediately notify the sender and delete all copies [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
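One thing worth checking, independently of the survival internals, is that the factors in newdata carry exactly the levels of the fitting data (predict matches them against object$xlevels). A generic sketch of the pattern with lm (illustrative names and invented data, not Michael's model):

```r
set.seed(9)
d <- expand.grid(Price = factor(1:5), Product = factor(1:5))
d <- rbind(d, d)                    # two replicates per Price x Product cell
d$y <- rnorm(nrow(d))

fit <- lm(y ~ Price + Product, data = d)

# newdata: every combination, with factor levels matching the fitting data
newdata <- expand.grid(Price   = factor(levels(d$Price),   levels = levels(d$Price)),
                       Product = factor(levels(d$Product), levels = levels(d$Product)))

lp <- predict(fit, newdata = newdata)   # 25 linear predictors, one per cell
```

If the levels do match and the error persists, it points at the predict.coxph rewrite itself rather than at the newdata construction.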
Re: [R] scale or not to scale that is the question - prcomp
On 8/19/2009 10:14 AM, Petr PIKAL wrote: Duncan Murdoch murd...@stats.uwo.ca wrote on 19.08.2009 15:25:00: On 19/08/2009 9:02 AM, Petr PIKAL wrote: Thank you Duncan Murdoch murd...@stats.uwo.ca wrote on 19.08.2009 14:49:52: On 19/08/2009 8:31 AM, Petr PIKAL wrote: Dear all snip I would say the answer depends on the meaning of the variables. In the unusual case that they are measured in dimensionless units, it might make sense not to scale. But if you are using arbitrary units of measurement, do you want your answer to depend on them? For example, if you change from kg to mg, the numbers will become much larger, the variable will contribute much more variance, and it will become a more important part of the largest principal component. Is that sensible? Basically the variables are in percentages (all between 0 and 6%) except dus, which is present or not present (for the purpose of prcomp transformed to 0/1 by as.numeric). The only variable which is not like this is iep, which is basically in the range 5-8. So the ranges of all variables are quite similar. What surprises me is that I can interpret the biplot without scaling in terms of the variables used, while the biplot with scaling is totally different, and the two pictures do not match at all. This is what surprised me, as I would have expected just a small difference between the results from the two settings, since all numbers are quite comparable and do not differ much. If you look at the standard deviations in the two cases, I think you can see why this happens:

Scaled:
Standard deviations:
[1] 1.3335175 1.2311551 1.0583667 0.7258295 0.2429397

Not Scaled:
Standard deviations:
[1] 1.0030048 0.8400923 0.5679976 0.3845088 0.1531582

The first two sds are close, so small changes to the data will affect I see. But I would expect that changes to data made by scaling would not change it in such a way that unscaled and scaled results are completely different. their direction a lot. Your biplots look at the 2nd and 3rd components.
Yes, because the grouping in the 2nd and 3rd component biplot can be easily explained by the values of some variables (without scaling). I must admit that I do not use prcomp very often, and usually scaling gives me an explainable result, especially if I use it for variable reduction. Therefore I am reluctant to use it in this case. When I try a more standard way

fit <- lm(iep~sio2+al2o3+p2o5+as.numeric(dus), data=rglp)
summary(fit)

Call:
lm(formula = iep ~ sio2 + al2o3 + p2o5 + as.numeric(dus), data = rglp)

Residuals:
     Min       1Q   Median       3Q      Max
-0.41751 -0.15568 -0.03613  0.20124  0.43046

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)      7.12085    0.62257  11.438 8.24e-08 ***
sio2            -0.67250    0.20953  -3.210 0.007498 **
al2o3            0.40534    0.08641   4.691 0.000522 ***
p2o5            -0.76909    0.11103  -6.927 1.59e-05 ***
as.numeric(dus) -0.64020    0.18101  -3.537 0.004094 **

I get a quite plausible result which can be interpreted without problems. My data are the result of a designed experiment (more or less :) and therefore all variables are significant. Is that the reason why scaling may be inappropriate in this case? No, I think it's just that the cloud of points is approximately spherical in the first 2 or 3 principal components, so the principal component directions are somewhat arbitrary. You just got lucky that the 2nd and 3rd components are interpretable: I wouldn't put too much faith in being able to repeat that if you went out and collected a new set of data using the same design. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ridge regression
If you didn't post anonymously I would have made a suggestion. Full names and affiliations should be given. Frank spime wrote: Dear all, I considered an ordinary ridge regression problem. I followed three different ways: 1. estimate beta without any standardization; 2. estimate standardized beta (standardizing X and y) and then convert back; 3. estimate beta using the lm.ridge() function.

X <- matrix(c(1,2,9,3,2,4,7,2,3,5,9,1), 4, 3)
y <- t(as.matrix(cbind(2,3,4,5)))
n <- nrow(X)
p <- ncol(X)

# Without standardization
intercept <- rep(1, n)
Xn <- cbind(intercept, X)
K <- diag(c(0, rep(1.5, p)))
beta1 <- solve(t(Xn) %*% Xn + K) %*% t(Xn) %*% y
beta1

# With standardization
ys <- scale(y)
Xs <- scale(X)
K <- diag(1.5, p)
bs <- solve(t(Xs) %*% Xs + K) %*% t(Xs) %*% ys
b <- sd(y) * (bs/sd(X))
intercept <- mean(y) - sum(as.matrix(colMeans(X)) * b)
beta2 <- rbind(intercept, b)
beta2

# Using the lm.ridge function of the MASS package
beta3 <- lm.ridge(y ~ X, lambda = 1.5)

I'm getting three different outputs from the three approaches above, whereas I expected them all to be the same.

beta1
                [,1]
intercept  3.4007944
           0.3977462
           0.2082025
          -0.4829115

beta2
                [,1]
intercept  3.3399855
           0.1639469
           0.0262021
          -0.1228987

beta3
                    X1          X2          X3
3.35158977  0.19460958  0.03152778 -0.15546775

It will be very helpful to me if anybody can explain why the outputs differ. regards. -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] need help for t.test using by
On Aug 19, 2009, at 9:00 AM, 1Rnwb wrote: any one I looked at your posting yesterday and could not figure out what your statistical question was, nor could I figure out how the t.test formula related to the column names in the data.frame. I saw no Group or IL1Ra variables in your data. I also tried to import that data and found that you had more column headers than columns. I gave up. -- DW 1Rnwb wrote: I am trying to do ttest for each plate which has equal number of disease and controls. by searching this forum I found one posting suggesting OP to use by(eo,PlateID, function(.sub) t.test(IL1Ra~Group,data=.sub)). when i modified this for my use I used to get the pvalues for each plate, recently upgraded to R2.9.x and now i am getting following error when i use this by(eo,PlateID, function(.sub) t.test(IL1Ra~Group,data=.sub)) Error in t.test.formula(IL1Ra ~ Group, data = .sub) : grouping factor must have exactly 2 levels I checked my data file and the plate ID has P16-P47 and the disease column is Y/N. I have search all but could not locate the answer. Can anyone suggest me the direction where i am going wrong. for more detail i amm posting a part of my data. 
SampleIDPlateID Sex Disease DurationAge prt1prt2 prt3prt4 1 P16 F N 25.33282811 836.08979 20.04582692 295.74 11731.43 2 P16 F N 32.74912883 243.0652116 53.16487056 383.63 4451.36 3 P16 M N 3.49961999 181.5587298 23.24604762 522.52 325.52 4 P16 M N 5.249771666 261.0978097 19.69024684 833.61 2229.04 5 P16 M N 39.16612385 237.334794 83.97284692 694.03 1204.36 6 P16 F N 33.49929145 540.486536 29.69569346 895.36 72105.75 7 P16 F N 3.915997361 5215.378446 35.44324704 1023.63 101680.69 8 P16 F N 33.49929145 466.5188732 12.93422535 814.26 7169.61 9 P16 F N 38.1661348 706.5117791 19.69024684 331.62 7296.72 10 P16 M N 0.916030215 242.3011559 36.40117263 90.135962.08 11 P16 F Y 14975.2542.08281552 133.8052501 29.0570764 260.55 1280.29 12 P16 M Y 1095.75 40.41593932 247.6495456 19.69024684 161.64 6685.6 13 P16 F Y 292225.08300169 231.6043765 9.022844487 902.53 37571.39 14 P16 F Y 12053.2537.49924765 158.6370595 24.90606549 1.63471.15 15 P16 F Y 14610 76.58209602 278.3837309 41.51010914 152.13 16285.49 16 P16 F Y 3287.25 49.58239171 128.4568603 17.91234645 274.99 41823.21 17 P17 M N 2.333079994 397.4440508 55.20987654 366.37 90011.27 18 P17 F N 4.749435461 222.0839813 73.62489675 271.65 903.89 19 P17 F N 1.749468315 676.8904636 47.7037037 721.19 663.15 20 P17 M N 33.24946503 413.712486 27.42386831 611.64 4195.86 21 P17 F N 15.4162131 346.9913086 30.71604938 302.29 2151.47 22 P17 F N 33.3327405 2598.071161 107.8088094 1800.44 102005.12 23 P17 M N 9.832656179 3296.535581 54.1563786 2642.73 676.78 24 P17 F N 3.749446413 488.2589867 15.29151952 562.88 11619.62 25 P17 M N 9.749380705 159.1536586 23.41690259 132.64 931.14 26 P17 M Y 12053.25 50.91548265 173.7764722 18.90280088 52.25 15962.05 27 P17 M Y 3652.5 25.83248096 264.8254141 35.32510288 497.33 15504.95 28 P17 F Y 28.41607073 415.3599225 35.0617284 1600.67 14224.39 29 P17 F Y 6209.25 60.74882219 213.7280879 44.41152263 111.36 9018.48 30 P17 M Y 4748.25 47.33258719 264.0016959 31.24279835 366.81 3809.83 31 P17 M Y 16071 
54.49906148 112.4128794 34.40329218 114.56 1727.02 32 P17 M Y 7670.25 52.33253243 145.5753317 32.2962963 44.92 15264.84 33 P17 M Y 876631.74913978 220.2561296 34.40329218 130.31 2377.36 34 P17 F Y 8400.75 40.83231669 194.6662059 48.09876543 407.83 19587.61 35 P17 M Y 12053.2536.1661567 212.1613578 96.51775947 621.42 3269.9 36 P17 M Y 8035.5 57.91540599 115.2852178
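For what it's worth, the "grouping factor must have exactly 2 levels" error after subsetting often comes from unused factor levels surviving the split; re-factoring the grouping variable inside the function is a common fix. A sketch on toy data in the same shape as the post (invented values, not the poster's):

```r
set.seed(1)
eo <- data.frame(
  PlateID = rep(c("P16", "P17"), each = 10),
  Group   = rep(rep(c("N", "Y"), each = 5), 2),
  IL1Ra   = rnorm(20, mean = 30, sd = 8)
)

# One t-test per plate; factor() drops any unused levels in each subset,
# so each t.test sees exactly the two levels present on that plate
pvals <- by(eo, eo$PlateID, function(.sub) {
  .sub$Group <- factor(.sub$Group)
  t.test(IL1Ra ~ Group, data = .sub)$p.value
})
pvals
```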
Re: [R] Fitting a logistic regression
I would suggest a model with a baseline level, something like:

nls(AMP ~ E0 + (Emax - E0)*Time**gamma/(EC50**gamma + Time**gamma), data = your data,
    start = list(EC50 = 50, gamma = 2, E0 = 0.2, Emax = 1.2)) -> mod.test

AIC(mod.test) does improve. Hope this helps. Jun On Wed, Aug 19, 2009 at 5:04 AM, Dani Valverde daniel.valve...@uab.cat wrote: Hello, I have this data:

Time AMP
0    0.200
10   0.1958350
20   0.2914560
40   0.6763628
60   0.8494534
90   0.9874526
120  1.0477692

where AMP is the concentration of this metabolite over time. If you plot the data, you can see that it could be fitted using a logistic curve. For this purpose, I used this code:

AMP.nls <- nls(AMP ~ SSlogis(Time, Asym, xmid, scal), data = concentrations, model = TRUE)

When plotting the fitted function, it seems to fit quite well at the end of the time range. However, at the beginning the fit is not so good. How can I achieve a better fit? Forgive me if it is a stupid question, but I am just starting with nonlinear regression. Thank you, Dani -- [?] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jun Shen PhD PK/PD Scientist BioPharma Services Millipore Corporation 15 Research Park Dr. St Charles, MO 63304 Direct: 636-720-1589 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
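Both models can be fitted directly to the posted numbers; a sketch (assuming the data frame is named `concentrations`, as in Dani's call):

```r
concentrations <- data.frame(
  Time = c(0, 10, 20, 40, 60, 90, 120),
  AMP  = c(0.200, 0.1958350, 0.2914560, 0.6763628, 0.8494534, 0.9874526, 1.0477692)
)

# Dani's 3-parameter logistic: the curve is pulled toward 0 at small Time
AMP.nls <- nls(AMP ~ SSlogis(Time, Asym, xmid, scal), data = concentrations)

# Jun's suggestion: an extra baseline E0 absorbs the early plateau near 0.2
mod.test <- nls(AMP ~ E0 + (Emax - E0) * Time^gamma / (EC50^gamma + Time^gamma),
                data = concentrations,
                start = list(EC50 = 50, gamma = 2, E0 = 0.2, Emax = 1.2))

AIC(AMP.nls)
AIC(mod.test)
```

Comparing the two AIC values shows whether the baseline term is worth its extra parameter on this data.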
Re: [R] Package read large file
On Wed, Aug 19, 2009 at 10:51 AM, Mohamed Lajnef mohamed.laj...@inserm.fr wrote: I am looking for packages that can read large files into R; any suggestions are welcome. As already pointed out by Jim, your question is not very specific. My wild guess is that you probably have some memory issues -- if that is the case, maybe the package bigmemory can alleviate your pain. Best, Michael -- Michael Knudsen micknud...@gmail.com http://lifeofknudsen.blogspot.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
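If the file does fit in memory and the bottleneck is read.table itself, pre-declaring column classes and a row count already helps a lot before reaching for extra packages. A sketch on a throwaway file:

```r
# Build a throwaway CSV just to have something to read
tmp <- tempfile(fileext = ".csv")
df <- data.frame(id = 1:100000, x = runif(100000),
                 g = sample(letters, 100000, replace = TRUE))
write.csv(df, tmp, row.names = FALSE)

# colClasses avoids type-guessing, nrows pre-allocates, and comment.char = ""
# disables comment scanning; together these speed up read.csv noticeably
big <- read.csv(tmp,
                colClasses = c("integer", "numeric", "character"),
                nrows = 100000,
                comment.char = "")
nrow(big)
unlink(tmp)
```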
Re: [R] mild and extreme outliers in boxplot
Rnewbie wrote: dear all, could somebody tell me how I can plot mild outliers as a circle (°) and extreme outliers as an asterisk (*) in a box-whisker plot? Thanks very much in advance ?boxplot or help(bxp) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
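boxplot() itself only distinguishes outliers from non-outliers, but the mild/extreme split can be drawn by hand using the usual 1.5 x IQR and 3 x IQR fences. A sketch (quantile-based fences, which differ very slightly from boxplot's hinges):

```r
set.seed(2)
x <- c(rnorm(100), 3.5, 8)                 # two planted outliers

q   <- quantile(x, c(0.25, 0.75))
iqr <- diff(q)
lo15 <- q[1] - 1.5 * iqr; hi15 <- q[2] + 1.5 * iqr   # inner fences
lo3  <- q[1] - 3   * iqr; hi3  <- q[2] + 3   * iqr   # outer fences

mild    <- x[(x < lo15 | x > hi15) & x >= lo3 & x <= hi3]
extreme <- x[x < lo3 | x > hi3]

boxplot(x, outline = FALSE)                # box and whiskers only
points(rep(1, length(mild)),    mild,    pch = 1)   # circles: mild outliers
points(rep(1, length(extreme)), extreme, pch = 8)   # asterisks: extreme outliers
```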
Re: [R] Embedding lists in matrices and matrices in lists
Thanks, that was the solution! But in fact I didn't want to have this list-of-lists layer at all. And now I'm having trouble writing matrices into the database. It's really, really strange... If I write a matrix into the database manually everything works, but if I create a function which adds a matrix to the database and then run the function, it fails... Look at this (sorry for the long code block; in fact you can skip the sums and test_sums functions):

## test matrix ###
m1 = matrix(c(
  1,0,
  0,1
), nrow=2, byrow=TRUE)

# sums ###
sums = function(matrix) {
  c(sort(colSums(matrix)), sort(rowSums(matrix)))
}

## test_sums #
test_sums = function(matrix) {
  liste = database[[nrow(matrix), ncol(matrix)]][[1]]
  w = 1
  if (length(liste) > 0) {
    for (i in 1:length(liste)) {
      if (all(sums(matrix) == sums(liste[[i]]))) {
        w0 = 0
      } else {
        w0 = 1
      }
      w = w * w0
    }
  } else {
    w = 2
  }
  w
}

# write to db
write_to_db = function(matrix) {
  en = nrow(matrix)
  fn = ncol(matrix)
  if (test_sums(matrix) == 1) {
    print("matrix isn't in the DB yet")
    database[en,fn][[1]] = list(database[en,fn][[1]], matrix)
  } else if (test_sums(matrix) == 2) {
    print("matrix isn't in the DB entry yet and DB is empty")
    database[en,fn][[1]][[1]] = list(matrix)
  } else {
    print("matrix already exists in the DB")
  }
}

## create database ##
database = matrix(list(), 2, 2)

## trying to write a matrix into db using the function ##
write_to_db(m1)
database

### writing a matrix into db without using a function ###
en = nrow(m1)
fn = ncol(m1)
if (test_sums(m1) == 1) {
  print("matrix isn't in the DB yet")
  database[en,fn][[1]] = list(database[en,fn][[1]], m1)
} else if (test_sums(m1) == 2) {
  print("matrix isn't in the DB entry yet and DB is empty")
  database[en,fn][[1]][[1]] = list(m1)
} else {
  print("matrix already exists in the DB")
}
database

### test whether matrix already is in db ##
write_to_db(m1)

And the output is:

[1] "matrix isn't in the DB entry yet and DB is empty"
     [,1] [,2]
[1,] NULL NULL
[2,] NULL NULL
[1] "matrix isn't in the DB entry yet and DB is empty"
     [,1] [,2]
[1,] NULL NULL
[2,] NULL List,1
[1] "matrix already exists in the DB"

So
writing the matrix into the database using the function fails, while writing it with the same commands but not through a function works! Maybe it's a problem with all these layers in the database matrix? Petr PIKAL wrote: Hi r-help-boun...@r-project.org wrote on 19.08.2009 13:04:39: Strange, it doesn't work for me: Error in database[4, 4][[1]][1, ] : incorrect number of dimensions Execution halted R version 2.9.0 (2009-04-17) on Arch Linux, no additional packages installed. database[4,4][[1]][,1] does not work for me either, but that is a result of your complicated structure of nested lists. This works:

database[4,4][[1]][[1]][,1]
[1] 0 1 1 1

See the structure. Matrix of lists:

database
     [,1] [,2] [,3] [,4]   [,5]
[1,] NULL NULL NULL NULL   NULL
[2,] NULL NULL NULL NULL   NULL
[3,] NULL NULL NULL NULL   NULL
[4,] NULL NULL NULL List,1 NULL
[5,] NULL NULL NULL NULL   NULL

List of lists:

database[4,4]
[[1]]
[[1]][[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0

List which contains your matrix:

database[4,4][[1]]
[[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0

Here is your matrix:

database[4,4][[1]][[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0

Regards Petr __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
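The reason the function version fails is most likely not the nesting but R's copy-on-assign semantics: `database[en,fn][[1]] = ...` inside write_to_db modifies the function's local copy of `database`, which is thrown away when the function returns. Two standard fixes, sketched on a fresh 2x2 database (simplified bookkeeping, not the original test_sums logic):

```r
database <- matrix(list(), 2, 2)

# Fix 1 (preferred): take the database as an argument, return the modified
# copy, and reassign at the call site
write_to_db <- function(db, m) {
  en <- nrow(m); fn <- ncol(m)
  db[[en, fn]] <- c(db[[en, fn]], list(m))  # append m to that cell's list
  db
}
database <- write_to_db(database, diag(2))
length(database[[2, 2]])   # one stored matrix

# Fix 2: superassignment reaches the global `database` directly
write_to_db2 <- function(m) {
  en <- nrow(m); fn <- ncol(m)
  database[[en, fn]] <<- c(database[[en, fn]], list(m))
}
write_to_db2(matrix(1, 2, 2))
length(database[[2, 2]])   # now two
```

Fix 1 keeps the data flow explicit; Fix 2 reproduces the original calling style at the cost of a hidden global side effect.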
[R] Hist kernel density estimates
Dear All, Attached are the code for a histogram and a kernel density estimate and the output they produced. In fact the variable q is a vector of 1000 simulated values; that is, I generated 1000 samples from the Pareto distribution, from each sample I calculated the value of q (a certain function of the sample observations), and thus I was left with 1000 values of q whose distribution I don't know. Hence, I used the attached code for histogram and kernel density estimation to estimate the density of q. But what I'm really interested in is estimating the probability that q is greater than a certain value, for example P(q > 11000), using the density estimates I obtained. Could you help me with a function or some document to do this? Thank you so much Maram __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Hist kernel density estimates
Hi, I think I've not received the attachment. Please check. Thanks and Regards. - Indranil Basu, Bangalore, India On Wed, Aug 19, 2009 at 9:01 PM, maram salem marammagdysa...@yahoo.comwrote: Dear All, Attached are the codes of a histogram a kernel density estimate and the output they produced. In fact the variable q is a vector of 1000 simulated values; that is I generated 1000 samples from the pareto distribution, from each sample I calculated the value of q ( a certain fn in the sample observations), and thus I was left with 1000 values of q and I don't know the distribution of q. Hence, I used the attached codes for histogram and kernel density estimation toestimate the density of q. But what I'm really intersed in is to estimate the probability that q is greater than a certain value , for ex.,P(q11000), using the density estimates I obtained. Could u help me with a fn or some document to do this? Thank u so much Maram __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
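Two quick ways to turn the 1000 simulated q values into an estimate of P(q > c): the empirical proportion, and numerical integration of the density() output. A sketch with a Pareto-like stand-in for q (Maram's actual sample isn't available):

```r
set.seed(3)
q <- 10000 * runif(1000)^(-1/3)     # stand-in: Pareto(alpha = 3) via inversion
cutoff <- 11000

# 1. Empirical estimate: the fraction of simulated values above the cutoff
p_emp <- mean(q > cutoff)

# 2. Smoothed estimate: Riemann sum of the kernel density above the cutoff
d <- density(q, from = cutoff, to = max(q), n = 2048)
p_kde <- sum(d$y) * diff(d$x[1:2])

c(empirical = p_emp, kde = p_kde)
```

The empirical version is simpler and usually preferable; the KDE version inherits the bandwidth's boundary and tail bias, which is noticeable for heavy-tailed samples like this one.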
Re: [R] scale or not to scale that is the question - prcomp
Ok Thank you for your time. Best regards Petr Pikal Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 16:29:07: On 8/19/2009 10:14 AM, Petr PIKAL wrote: Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 15:25:00: On 19/08/2009 9:02 AM, Petr PIKAL wrote: Thank you Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 14:49:52: On 19/08/2009 8:31 AM, Petr PIKAL wrote: Dear all snip I would say the answer depends on the meaning of the variables. In the unusual case that they are measured in dimensionless units, it might make sense not to scale. But if you are using arbitrary units of measurement, do you want your answer to depend on them? For example, if you change from Kg to mg, the numbers will become much larger, the variable will contribute much more variance, and it will become a more important part of the largest principal component. Is that sensible? Basically variables are in percentages (all between 0 and 6%) except dus which is present or not present (for the purpose of prcomp transformed to 0/1 by as.numeric:). The only variable which is not such is iep which is basically in range 5-8. So ranges of all variables are quite similar. What surprises me is that biplot without scaling I can interpret by used variables while biplot with scaling is totally different and those two pictures does not match at all. This is what surprised me as I would expected just a small difference between results from those two settings as all numbers are quite comparable and does not differ much. If you look at the standard deviations in the two cases, I think you can see why this happens: Scaled: Standard deviations: [1] 1.3335175 1.2311551 1.0583667 0.7258295 0.2429397 Not Scaled: Standard deviations: [1] 1.0030048 0.8400923 0.5679976 0.3845088 0.1531582 The first two sds are close, so small changes to the data will affect I see. 
But I would expect that changes to the data made by scaling would not change it in such a way that unscaled and scaled results are completely different. their direction a lot. Your biplots look at the 2nd and 3rd components. Yes, because grouping in the 2nd and 3rd component biplot can be easily explained by values of some variables (without scaling). I must admit that I do not use prcomp very often and usually scaling gives me an explainable result, especially if I use it for variable reduction. Therefore I am reluctant to use it in this case. When I try a more standard way:
fit <- lm(iep ~ sio2 + al2o3 + p2o5 + as.numeric(dus), data = rglp)
summary(fit)
Call: lm(formula = iep ~ sio2 + al2o3 + p2o5 + as.numeric(dus), data = rglp)
Residuals: Min 1Q Median 3Q Max -0.41751 -0.15568 -0.03613 0.20124 0.43046
Coefficients: Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.12085 0.62257 11.438 8.24e-08 ***
sio2 -0.67250 0.20953 -3.210 0.007498 **
al2o3 0.40534 0.08641 4.691 0.000522 ***
p2o5 -0.76909 0.11103 -6.927 1.59e-05 ***
as.numeric(dus) -0.64020 0.18101 -3.537 0.004094 **
I get quite a plausible result which can be interpreted without problems. My data is the result of a designed experiment (more or less :) and therefore all variables are significant. Is that the reason why scaling may be inappropriate in this case? No, I think it's just that the cloud of points is approximately spherical in the first 2 or 3 principal components, so the principal component directions are somewhat arbitrary. You just got lucky that the 2nd and 3rd components are interpretable: I wouldn't put too much faith in being able to repeat that if you went out and collected a new set of data using the same design. Duncan Murdoch
Re: [R] edit.row.names taking row names from the edited dataframe
Erik Iverson wrote: This really has nothing to do with the row names issue you mention, as far as I can tell. It has to do with this: http://stackoverflow.com/questions/1195826/dropping-factor-levels-in-a-subsetted-data-frame-in-r Best, Erik Iverson Perhaps you may also take a look at this http://wiki.r-project.org/rwiki/doku.php?id=tips:data-manip:drop_unused_levels df.mydata$problemFactor <- df.mydata$problemFactor[, drop = TRUE] 8rino
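A tiny illustration of the [, drop = TRUE] idiom above (a toy factor, not the poster's df.mydata):

```r
f   <- factor(c("a", "b", "c"))
sub <- f[f != "c"]
levels(sub)                # still "a" "b" "c" after subsetting

sub <- sub[, drop = TRUE]  # drop the now-unused level
levels(sub)                # "a" "b"
```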
Re: [R] graph label greek symbol failure
On Wed, 2009-08-19 at 14:20 +0100, e-letter wrote: On 18/08/2009, Gavin Simpson gavin.simp...@ucl.ac.uk wrote: On Tue, 2009-08-18 at 13:06 +0100, e-letter wrote: On 17/08/2009, Michael Knudsen micknud...@gmail.com wrote: On Mon, Aug 17, 2009 at 12:51 PM, e-letter inp...@gmail.com wrote: I have tried to add the delta (δ) symbol to the y axis label and the result is D, using the command: ...ylab="δt"... Try ylab = expression(delta*t) instead. This does not work, the result is expression(delta*t) It works for the rest of us who suggested this. plot(1:10, ylab = expression(delta*t)) True, but the following command fails: plot(1:10, ylab="temperature" expression(delta*t)) Error: syntax error, unexpected SYMBOL, expecting ',' in plot(1:10, ylab="temperature" expression So I want to be able to have 'δt' and 'temperature δt' as a y-axis label. Ah, but you never said that. We aren't mind readers you know ;-) What you need is an expression that will, when used, give you a text label containing Temperature δt. What you have done is create a character string of the literal "expression(delta*t)", which is of course why it is printed as the label - after all, you asked R to do this. I suggest you read the plotmath help page, accessed via: ?plotmath executed at the prompt. I found this expression stuff complicated when I first started out, until I realised that whatever gets passed to expression() has to be a syntactically valid R command. So for your particular example we need: plot(1:10, ylab = expression(Temperature ~ delta*t)) Temperature is treated as the character string "Temperature"; the ~ means leave some space between the bits either side of ~; delta is a special name that will get replaced by the relevant glyph; the * juxtaposes delta and t, i.e. places them next to one another without any space in between. I see that Baptiste has also just replied with a similar solution to mine.
Baptiste's solution quotes temperature with the spacing being stated explicitly. Whilst both yield similar results, using ~ and not having to quote things results in a simpler solution, IMHO. HTH Gavin -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
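Collecting the variants from this thread into one runnable snippet (plain demo data; the labels are the point):

```r
plot(1:10, ylab = expression(delta*t))                # symbol juxtaposed with t
plot(1:10, ylab = expression(Temperature ~ delta*t))  # text, spacing via ~
# Baptiste-style alternative: spacing written into the quoted string
plot(1:10, ylab = expression(paste("Temperature ", delta*t)))
```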
Re: [R] Hist kernel density estimates
On Aug 19, 2009, at 11:34 AM, indranil basu wrote: Hi, I think I've not received the attachment. Please check. In all likelihood the OP did not read and follow the instructions regarding attachments that are to be found in the Posting Guide. snip David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] visualizzazione colonne
sabrina.michie...@alice.it wrote: hi, I have opened a file in R as a data frame of 15000 rows and 29 columns. In the console, however, only the first and last columns and roughly the last 8000 rows are displayed. Is a complete view possible? Thanks Sabrina Hi Sabrina, please do not forget that this list is in English. Remember it for the future. I find it very inconvenient to have all the rows displayed at one time, especially if there are 15000 of them !!! Should I need to look at the whole dataset I would use page(mydataframe) (emacs or xemacs + ESS needed) or, from the console, fix(mydataframe) - very dangerous !!! - or edit(mydataframe). Before attempting this, read the help: ?edit ?fix Should you want to look at only some part of the data frame, try for example mydataframe[3:15, 7:8] This will show you the data contained in rows 3 to 15 and columns 7 to 8. Ciao -- Ottorino-Luca Pantani, Università di Firenze
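Beyond page()/fix()/edit(), head() and str() are safe read-only ways to peek at a big data frame; a sketch with made-up data of the same size:

```r
# A stand-in for Sabrina's 15000 x 29 data frame (hypothetical contents)
mydataframe <- as.data.frame(matrix(rnorm(15000 * 29), nrow = 15000))

head(mydataframe, 10)    # first 10 rows, all 29 columns
str(mydataframe)         # one-line-per-column structural summary
mydataframe[3:15, 7:8]   # rows 3 to 15, columns 7 and 8, as in the reply
```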
Re: [R] Confidence interval on parameters from optim function
Regarding your second question, I guess somehow you get an undefined value, like the logarithm of zero, of your target function for some unfortunate parameter values in the parameter space. Devred, Emmanuel wrote: Hi everyone, I have two questions: I would like to get confidence intervals on the coefficients derived from the optim() function. I apply optim() to a given function f: res <- optim(c(0.08, 0.04, 1), f, NULL, method = "L-BFGS-B", lower = c(0, 0, 0)) And I would like to get the p-value and confidence intervals associated with res$par. My second question deals with the error message. I am doing a loop with the optim() function in it; when I get an error message like below, the loop is stopped. However, I would like to change my initial values to avoid this error message and keep the loop going, so if it crashes when I am away the program can still run. Any idea? Error in optim(c(0.08, 0.04, 1), f, NULL, method = "L-BFGS-B", lower = c(0, : L-BFGS-B needs finite values of 'fn' Thank you for any information on these two problems. Emmanuel --- Dr. Emmanuel Devred Bedford Institute of Oceanography, 1 Challenger Drive, Dartmouth, Nova Scotia, B2Y 4A2, Canada Ph: (1) 902 426-4681 Fax: (1) 902 426-9388 devr...@mar.dfo-mpo.gc.ca http://myweb.dal.ca/edevred/ ---
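One common route for the first question, not given in the thread and only approximate, is to request the Hessian from optim() and invert it; this is statistically meaningful only when f is a negative log-likelihood. For the second question, tryCatch() keeps a loop alive past failures. A sketch (the function f below is a made-up toy objective, not the poster's):

```r
# Toy objective standing in for the poster's f (assumed, for illustration only)
f <- function(p) sum((p - c(1, 2, 3))^2)

# hessian = TRUE returns the matrix of second derivatives at the optimum
fit <- optim(c(0, 0, 0), f, method = "L-BFGS-B", hessian = TRUE)
se  <- sqrt(diag(solve(fit$hessian)))     # approximate standard errors
ci  <- cbind(lower = fit$par - 1.96 * se, # rough 95% Wald intervals
             upper = fit$par + 1.96 * se)

# Keep a loop running when optim() fails for some start values:
res <- tryCatch(optim(c(0, 0, 0), f, method = "L-BFGS-B"),
                error = function(e) NULL)
if (is.null(res)) message("optim failed; try different starting values")
```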
Re: [R] how to compute other columns without a column for sample name
sandsky wrote: The data has the first row for variable names and the first column for sample names. I want to take the log of all the data, but how do I compute it without the first column of sample names? log.raw_data = log(raw_data, base = 2) Error in Math.data.frame(list(sample_id = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, : non-numeric variable in data frame: sample_id Thank you in advance, Jin You are trying to calculate a logarithm of a character variable. You most probably imported the data with read.table(header = FALSE). Try sapply(raw_data, is.numeric) to see which columns contain values a logarithm can be taken of. 8rino
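Following 8rino's sapply() hint, one way to log only the numeric columns (a toy data frame standing in for raw_data; column names invented):

```r
raw_data <- data.frame(sample_id = c("s1", "s2", "s3"),
                       geneA = c(2, 4, 8),
                       geneB = c(16, 32, 64))

num <- sapply(raw_data, is.numeric)      # flags the loggable columns
log.raw_data <- raw_data
log.raw_data[num] <- log2(raw_data[num]) # log2 only where it makes sense
log.raw_data
```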
Re: [R] Problem with predict.coxph
-- begin included message --- We occasionally utilize the coxph function in the survival library to fit multinomial logit models. (The Breslow method produces the same likelihood function as the multinomial logit.) We then utilize the predict function to create summary results for various combinations of covariates. For example: ... The problem is that under R 2.8.1 and R 2.9.1 the previous line fails with the following error: totalut <- predict(mod1, newdata = newdata, type = "lp") Error in model.frame.default(Terms2, newdata, xlev = object$xlevels) : variable lengths differ (found for 'Price') In addition: Warning message: 'newdata' had 25 rows but variable(s) found have 43350 rows --- end inclusion --- The coxph code was updated to use the standard R methods for prediction with factors. I even added test cases -- but obviously I've missed something. I'll look into a fix and get back to you. Can I assume that Price and Product were both factors with the 5 levels 1:5? Terry Therneau
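For reference, a minimal predict.coxph() call that does work with a factor covariate (using the survival package's built-in lung data, not the poster's Price/Product model):

```r
library(survival)

# Fit with a factor covariate, then predict the linear predictor for a small
# newdata frame; the factor levels are recovered from the fit's xlevels.
fit <- coxph(Surv(time, status) ~ factor(sex), data = lung)
nd  <- data.frame(sex = c(1, 2))
predict(fit, newdata = nd, type = "lp")
```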
[R] Fw: Hist kernel density estimates
For the histogram estimate:
par(mex = 1.3)
dens <- density(q)
options(scipen = 4)
ylim <- range(dens$y)
h <- hist(q, breaks = "scott", freq = FALSE, probability = TRUE,
          right = FALSE, xlim = c(9000, 16000), ylim = ylim,
          main = "Histogram of q (scott)")
lines(dens)
box()
For the kernel estimate:
options(scipen = 4)
d <- density(q, bw = "nrd0", kernel = "gaussian")
d
plot(d)
snip
Dear All, Attached are the codes of a histogram and a kernel density estimate and the output they produced. I'll copy the codes here in case there's something wrong with the attachment
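On the P(q > 11000) question itself: since q is a simulated sample, the simplest estimate is the empirical tail proportion; the kernel density estimate gives a smoothed alternative by numerically integrating its tail. A sketch (the Pareto draw below is a made-up stand-in for the poster's q):

```r
set.seed(1)
q <- 9000 * runif(1000)^(-1/3)   # stand-in Pareto(3) sample; use your own q

# 1. Empirical estimate: proportion of simulated values above the cutoff
mean(q > 11000)

# 2. Smoothed estimate: Riemann sum over the kernel density's tail grid
d <- density(q, bw = "nrd0", kernel = "gaussian")
sum(d$y[d$x > 11000]) * diff(d$x[1:2])
```

For this stand-in sample the true value is (9000/11000)^3 ≈ 0.55, and both estimates should land near it.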
Re: [R] graph label greek symbol failure
On 19/08/2009, Gavin Simpson gavin.simp...@ucl.ac.uk wrote: snip
Being familiar with latex, I interpret your description of the tilde (~) as a non-breaking space, therefore I used the command: plot(1:10, ylab = expression(temperature~delta*t)) to give me the result I wanted; in other words the space characters are not needed between temperature, ~ and delta in your suggestion. Thank you.
Re: [R] scale or not to scale that is the question - prcomp
Scaling changes the metric, i.e. which things are close to each other. There is no reason to expect the picture to look the same when you change the metric. On the other hand, your two pictures don't look so different to me. It appears that the scaled plot is similar to the unscaled plot, with the roles of the second and third pc reversed, i.e. the scaled plot is similar but rotated and distorted. For example, the observations forming the strip across the bottom of the first plot form a vertical strip on the right hand side of the second plot. albyn On Wed, Aug 19, 2009 at 02:31:23PM +0200, Petr PIKAL wrote: Dear all here is my data called rglp structure(list(vzorek = structure(1:17, .Label = c("179/1/1", "179/2/1", "180/1", "181/1", "182/1", "183/1", "184/1", "185/1", "186/1", "187/1", "188/1", "189/1", "190/1", "191/1", "192/1", "R310", "R610L"), class = "factor"), iep = c(7.51, 7.79, 5.14, 6.35, 5.82, 7.13, 5.95, 7.27, 6.29, 7.5, 7.3, 7.27, 6.46, 6.95, 6.32, 6.32, 6.34), skupina = c(7.34, 7.34, 5.14, 6.23, 6.23, 7.34, 6.23, 7.34, 6.23, 7.34, 7.34, 7.34, 6.23, 7.34, 6.23, 6.23, 6.23), sio2 = c(0.023, 0.011, 0.88, 0.028, 0.031, 0.029, 0.863, 0.898, 0.95, 0.913, 0.933, 0.888, 0.922, 0.882, 0.923, 1, 1), p2o5 = c(0.78, 0.784, 1.834, 1.906, 1.915, 0.806, 1.863, 0.775, 0.817, 0.742, 0.783, 0.759, 0.787, 0.758, 0.783, 3, 2), al2o3 = c(5.812, 5.819, 3.938, 5.621, 3.928, 3.901, 5.621, 5.828, 4.038, 5.657, 3.993, 5.735, 4.002, 5.728, 4.042, 6, 5), dus = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("ano", "ne"), class = "factor")), .Names = c("vzorek", "iep", "skupina", "sio2", "p2o5", "al2o3", "dus"), class = "data.frame", row.names = c(NA, -17L)) and I try to do principal component analysis.
Here is one without scaling:
fit <- prcomp(~iep+sio2+al2o3+p2o5+as.numeric(dus), data=rglp, factors=2)
biplot(fit, choices=2:3, xlabs=rglp$vzorek, cex=.8)
you can see that the data make 3 groups according to variables sio2 and dus, which seems to be reasonable as the lowest group has a different value of dus = "ano" while the highest group has a low value of sio2. But when I do the same with scale=T:
fit <- prcomp(~iep+sio2+al2o3+p2o5+as.numeric(dus), data=rglp, factors=2, scale=T)
biplot(fit, choices=2:3, xlabs=rglp$vzorek, cex=.8)
I get a completely different picture which is not possible to interpret in such an easy way. So can anybody advise me whether I shall follow the recommendation from the help page (which says "The default is FALSE for consistency with S, but in general scaling is advisable.") or whether I shall stay with scale = FALSE and the simply interpretable result? Thank you. Petr Pikal petr.pi...@precheza.cz
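The effect of measurement scale on prcomp() can be seen on toy data (made up here, not the rglp set above):

```r
# Sketch: two uncorrelated variables on very different scales.
set.seed(42)
X <- data.frame(a = rnorm(20), b = rnorm(20, sd = 100))

unscaled <- prcomp(X)                 # 'b' dominates the first component
scaled   <- prcomp(X, scale. = TRUE)  # both variables contribute equally

unscaled$sdev   # first sd is huge, driven entirely by 'b'
scaled$sdev     # sds are comparable after scaling
```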
Re: [R] scale or not to scale that is the question - prcomp
If you mentally rotate the second biplot by 90 degrees, the plots are not so different. This just indicates that the 2nd and 3rd principal components have switched roles. Kevin Wright On Wed, Aug 19, 2009 at 10:09 AM, Petr PIKAL petr.pi...@precheza.cz wrote: snip
Re: [R] graph label greek symbol failure
On Wed, 2009-08-19 at 17:06 +0100, e-letter wrote: snip / Being familiar with latex, I interpret your description of the tilde (~) as non-breaking space, therefore I used the command: plot(1:10, ylab = expression(temperature~delta*t)) to give me the result I wanted; in other words the space characters are not needed between temperature, ~ and delta in your suggestion. No, the spacing is not needed; just like any other spacing in R, it is ignored, but my eye-brain wetware system is /not/ the R parser and as such I like to space things out so *I* can parse and *understand* my code; I find it easier to find syntax errors that way. Note that there was a difference between what I posted and Baptiste's example; there the space was explicitly encoded within "temperature ", whereas I use ~ as the indicator for spacing; any other spacing was just for the sake of legibility. I also don't think this is non-breaking, because unless you add an explicit \n (newline) in your label, the entire label will be rendered on a single line. Better to think of x ~ y (or x~y if you think that is more readable) as just putting spacing between x and y. As ?plotmath shows, you can string multiple ~ together to give extra spacing; x ~~ y And so on... Glad you got this working as you wanted. G
[R] BUGS
I am running a BUGS function with the following: schools.sim <- bugs(data, inits, parameters, model.file = "schools.txt", n.chains = 3, n.iter = 1000, bugs.directory = "E:/Rprograms") My model.file IS in the directory E:/Rprograms and the code is:
model {
  for (j in 1:J) {
    y[j] ~ dnorm(theta[j], tau.y[j])
    theta[j] ~ dnorm(mu.theta, tau.theta)
    tau.y[j] <- pow(sigma.y[j], -2)
  }
  mu.theta ~ dnorm(0.0, 1.0E-6)
  tau.theta <- pow(sigma.theta, -2)
  sigma.theta ~ dunif(0, 1000)
}
I am getting the following error which I can't understand. Error in bugs(data, inits, parameters, model.file = "schools.txt", n.chains = 3, : schools.txt does not exist.
Re: [R] Is there a construct for conditional comment?
I believe that #if lines for C++ programs are handled by the preprocessor, not the compiler. So if you want the same functionality for R programs, it would make sense to just preprocess the R file. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peng Yu Sent: Tuesday, August 18, 2009 3:55 PM To: r-h...@stat.math.ethz.ch Subject: [R] Is there a construct for conditional comment? Hi, In C++, I can use the following construct to choose either of the two blocks to comment out, but not both. Depending on whether the number after #if is zero or not, the commented block can be chosen. I'm wondering if such a thing is possible in R? #if 0 commented with 0 #else commented with 1 #endif Regards, Peng
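R has no preprocessor, but the usual idiom for the same effect, offered here as a sketch, is an if (FALSE) block: the disabled code must still parse, yet it is never evaluated, and flipping one flag switches blocks:

```r
use_first <- FALSE   # flip to TRUE to enable the first block, like "#if 1"

if (use_first) {
  result <- "commented with 0"   # parsed but skipped while use_first is FALSE
} else {
  result <- "commented with 1"
}
result   # "commented with 1"
```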
Re: [R] BUGS
Naively it seems that there is no schools.txt file in your specified directory. Stephen Sefick On Wed, Aug 19, 2009 at 12:05 PM, James Lenihan jamesleni...@sbcglobal.net wrote: snip -- Stephen Sefick Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
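A quick way to check Stephen's diagnosis (paths are the poster's, adjust to your machine): bugs() looks the model file up via R's working directory or a full path, while bugs.directory is meant to point at the WinBUGS installation itself, so passing the full model path avoids the ambiguity. The WinBUGS path in the comment below is an assumption:

```r
# Hypothetical paths mirroring the poster's setup
model.path <- file.path("E:/Rprograms", "schools.txt")
file.exists(model.path)   # should print TRUE before you call bugs()

# schools.sim <- bugs(data, inits, parameters,
#                     model.file = model.path, n.chains = 3, n.iter = 1000,
#                     bugs.directory = "C:/Program Files/WinBUGS14")
```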
[R] Best performance measure?
Hello,

I'm working on a model to predict probabilities. I don't really care about binary prediction accuracy; I do care about the accuracy of my probability predictions. Frank was nice enough to point me to the val.prob function from the Design library. It looks very promising for my needs. I've put together some tests and run the val.prob analysis. It produces some very informative graphs along with a bunch of performance measures. Unfortunately, I'm not sure which measure, if any, is the best one. I'm comparing hundreds of different models/parameter combinations/etc., so ideally I'd like a single value or two as the performance measure for each one. That way I can pick the best model from all my experiments. As mentioned above, I'm mainly interested in the accuracy of my probability predictions. Does anyone have an opinion about which measure I should look at? (I see Dxy, C, R2, D, U, Brier, Emax, Eavg, etc.)

Thanks!

-N
[R] Errors with RVM and LSSVM from the kernlab library
Hello,

In my ongoing quest to develop the best model, I'm testing various forms of SVM to see which is best for my application. I have been using the SVM from the e1071 library without problems for several weeks. Now I'm interested in RVM and LSSVM to see if I get better performance. When running RVM or LSSVM on the exact same data as svm from e1071, I get an error that I don't understand:

Error in .local(x, ...) : kernel must inherit from class 'kernel'

Does this make sense to anyone? Can you suggest how to resolve this?

Thanks!

-N
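That error usually means the `kernel` argument did not resolve to something kernlab recognizes as a kernel. A hedged sketch of the two forms that should be accepted (the data here are made up; whether this fixes the poster's exact call depends on which e1071-style arguments were being carried over):

```r
library(kernlab)

set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
y <- x[, 1] + rnorm(100, sd = 0.1)

# Either name a built-in kernel and pass its parameters via kpar...
fit1 <- rvm(x, y, kernel = "rbfdot", kpar = list(sigma = 0.5))

# ...or pass a kernel object built by a kernlab constructor,
# which inherits from class 'kernel':
rbf <- rbfdot(sigma = 0.5)
fit2 <- rvm(x, y, kernel = rbf)
```

e1071-specific arguments such as `gamma` or `cost` are not understood by kernlab and may leave `kernel` holding something of the wrong class.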
Re: [R] BUGS
stephen sefick wrote:

Naively it seems that there is no schools.txt file in your specified directory.

On Wed, Aug 19, 2009 at 12:05 PM, James Lenihan jamesleni...@sbcglobal.net wrote:

I am running a BUGS function as follows:

schools.sim <- bugs(data, inits, parameters, model.file = "schools.txt",
                    n.chains = 3, n.iter = 1000, bugs.directory = "E:/Rprograms")

My model.file IS in the directory E:/Rprograms [model code and error trimmed]

According to ?bugs:

bugs.directory: directory that contains the WinBUGS executable. If the global option R2WinBUGS.bugs.directory is not NULL, it will be used as the default.

working.directory: sets working directory during execution of this function; WinBUGS' in- and output will be stored in this directory; if NULL, a temporary working directory via tempdir is used.

So either specify working.directory so that it points to the model file, or specify a full path name for your model.file. I don't believe that bugs.directory = "E:/Rprograms" will do anything sensible, as it should point to your WinBUGS installation.

Uwe Ligges
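Putting Uwe's advice together, a sketch of a call along these lines might work, reusing the poster's objects; the WinBUGS path is an assumption (the usual default install location), and E:/Rprograms is where the poster said the model file lives:

```r
library(R2WinBUGS)

# model.file: full path to the model text file (or set working.directory)
# bugs.directory: where WinBUGS itself is installed, not where your files are
schools.sim <- bugs(data, inits, parameters,
                    model.file = "E:/Rprograms/schools.txt",
                    n.chains = 3, n.iter = 1000,
                    bugs.directory = "C:/Program Files/WinBUGS14")
```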
Re: [R] Best performance measure?
Noah Silverman wrote:

I'm working on a model to predict probabilities. I don't really care about binary prediction accuracy; I do care about the accuracy of my probability predictions. [...] I'm comparing hundreds of different models/parameter combinations/etc., so ideally I'd like a single value or two as the performance measure for each one. That way I can pick the best model from all my experiments. As mentioned above, I'm mainly interested in the accuracy of my probability predictions. Does anyone have an opinion about which measure I should look at? (I see Dxy, C, R2, D, U, Brier, Emax, Eavg, etc.)

It all depends on the goal, i.e., the relative value you place on absolute accuracy vs. discrimination ability. The Brier score combines both and, other than interpretability, has many advantages.

Frank

--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University
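For readers following along, the Brier score mentioned here is simply the mean squared difference between the predicted probabilities and the 0/1 outcomes. A minimal sketch with simulated data (the helper name `brier` is mine, not from the Design package):

```r
# Brier score: mean squared difference between predicted probability
# and the 0/1 outcome; lower is better, 0 is perfect.
brier <- function(p, y) mean((p - y)^2)

set.seed(1)
y <- rbinom(100, 1, 0.3)   # simulated 0/1 outcomes
p <- runif(100)            # some (here uninformative) predicted probabilities

brier(p, y)
```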
[R] a bug in the offset parameter syntax in the geepack package?
Dear R-users,

I was doing some Poisson regression using the geepack package and I found strange behaviour of the geeglm command when defining an offset. Maybe it's my limited knowledge of R syntax, or maybe it's something else. Consider an example. After loading the geepack library, you may write

model <- geeglm(y ~ covariates + offset(log(xxx)), data = data, family = poisson, id = idx)

which is equivalent to

model <- geeglm(y ~ covariates, offset = log(xxx), data = data, family = poisson, id = idx)

However, the command also runs with the syntax

model <- geeglm(y ~ covariates, offset(log(xxx)), data = data, family = poisson, id = idx)

giving different (wrong) results! I discovered it because I was calculating some incidence rates by hand and then comparing with the glm estimates. Indeed, the way in which the offset should be written is not so easy to understand from the manual.

Thanks for any response,
Mattia Prosperi
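A likely explanation, rather than a bug, is R's positional argument matching: in the third call `offset(log(xxx))` is unnamed, so it is matched to the next free formal argument of the function rather than to `offset`, silently changing the model. The same trap can be demonstrated with base glm (illustrative data; my guess is the unnamed argument lands on `weights`):

```r
set.seed(1)
d <- data.frame(y = rpois(50, 5), x = rnorm(50), t = runif(50, 1, 10))

f1 <- glm(y ~ x + offset(log(t)), data = d, family = poisson)  # offset in formula
f2 <- glm(y ~ x, offset = log(t), data = d, family = poisson)  # named offset arg
f3 <- glm(y ~ x, offset(log(t)), data = d, family = poisson)   # UNNAMED: matched
# positionally to the next free formal (presumably 'weights' here),
# so it is not used as an offset at all

all.equal(coef(f1), coef(f2))   # the first two calls agree; f3 typically differs
```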
[R] Urgent Help
Dear Sir,

I am using RClimDex and get the following error:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  scan() expected 'a real', got 'T'

I copied the climate data from an Excel file into Notepad and tried to load it. I do not have any knowledge of programming languages. Please help me.

Regards,
Binaya Pasakhala
Re: [R] Error with acf/pacf functions
OK, strangely, both acf and pacf work with the xts object as long as the xts package hasn't been loaded. As soon as I load the package, I get the error again.

tmakino wrote:

I'm trying to apply the acf/pacf functions to an xts object and get the following error:

Error in if (frequency > 1 && abs(frequency - round(frequency)) < ts.eps) frequency <- round(frequency) :
  missing value where TRUE/FALSE needed

Any ideas? Thanks.
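One possible workaround, assuming the problem comes from the xts class attributes confusing frequency detection once the package's methods are loaded: strip the class and hand acf/pacf a plain numeric vector. A sketch with made-up data:

```r
library(xts)

x <- xts(rnorm(100), order.by = Sys.Date() - 99:0)

# Pass a plain numeric vector instead of the xts object, so nothing
# in the index/class attributes can interfere with acf's ts handling:
acf(coredata(x))
pacf(coredata(x))
```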
[R] Ridge regression [Repost]
Dear all,

For an ordinary ridge regression problem, I followed three different approaches:

1. estimate beta without any standardization
2. estimate standardized beta (standardizing X and y) and then convert back
3. estimate beta using the lm.ridge() function

X <- matrix(c(1,2,9,3,2,4,7,2,3,5,9,1), 4, 3)
y <- as.matrix(c(2,3,4,5))
n <- nrow(X)
p <- ncol(X)

# Without standardization
intercept <- rep(1, n)
Xn <- cbind(intercept, X)
K <- diag(c(0, rep(1.5, p)))
beta1 <- solve(t(Xn) %*% Xn + K) %*% t(Xn) %*% y
beta1

# With standardization
ys <- scale(y)
Xs <- scale(X)
K <- diag(1.5, p)
bs <- solve(t(Xs) %*% Xs + K) %*% t(Xs) %*% ys
b <- sd(y) * (bs / sd(X))
intercept <- mean(y) - sum(as.matrix(colMeans(X)) * b)
beta2 <- rbind(intercept, b)
beta2

# Using the lm.ridge function of the MASS package
beta3 <- lm.ridge(y ~ X, lambda = 1.5)

I'm getting three different results using the approaches described above:

beta1
                [,1]
intercept  3.4007944
           0.3977462
           0.2082025
          -0.4829115

beta2
                [,1]
intercept  3.3399855
           0.1639469
           0.0262021
          -0.1228987

beta3
                    X1          X2          X3
3.35158977  0.19460958  0.03152778 -0.15546775

It would be very helpful if anybody could explain why the outputs come out different. I am extremely sorry for my previous anonymous post.

regards,
Sabyasachi Patra
PhD Scholar, Indian Institute of Technology Kanpur, India
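Part of the discrepancy is expected: the three approaches apply the penalty on different scales. Approach 1 penalizes the raw coefficients directly, and, if I read the MASS sources correctly, lm.ridge rescales the centered columns with a divisor-n root mean square rather than the divisor-(n-1) sd that scale() uses, so the same lambda shrinks by different amounts. A sketch of that last difference:

```r
X <- matrix(c(1,2,9,3,2,4,7,2,3,5,9,1), 4, 3)
n <- nrow(X)

# scale() standardizes with sd() (divisor n - 1); lm.ridge uses a
# divisor-n scaling, so the standardized design matrices differ:
sd_nm1 <- apply(X, 2, sd)                        # divisor n - 1 (scale())
sd_n   <- sqrt(apply(X, 2, var) * (n - 1) / n)   # divisor n (lm.ridge-style)

sd_nm1
sd_n   # uniformly smaller, so the same lambda means more shrinkage
```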
Re: [R] lmer with random slopes for 2 or more first-level factors?
Since nobody has answered yet, let me try. I'm not 100% sure. (So why bother? To check my own understanding.)

On 08/18/09 20:47, Jason R. Finley wrote:

I have data from a design in which items are completely nested within subjects. Subject is the only second-level factor, but I have multiple first-level factors (IVs). Say there are 2 such independent variables that I am interested in. What is the proper syntax to fit a mixed-effects model with a by-subject random intercept, and by-subject random slopes for both of the 2 IVs? I can think of at least two possibilities:

lmer(DV ~ IV1 + IV2 + (1 + IV1 + IV2 | Subject))
lmer(DV ~ IV1 + IV2 + (1 + IV1 | Subject) + (1 + IV2 | Subject))

Or maybe there is some other way to do it? Maybe the correct syntax depends on whether the random effects of subjects on the intercept and slopes are correlated or not? (If so, how do I proceed?)

I think that the first way assumes correlations among IV1, IV2, and the Subject intercept. The second way assumes that each of IV1 and IV2 is correlated with the Subject intercept, and the intercept is computed separately for IV1 and IV2. There are other ways. You can do, for example,

lmer(DV ~ IV1 + IV2 + (1 | Subject) + (0 + IV1 | Subject) + (0 + IV2 | Subject))

This assumes, I think, that intercepts and slopes are uncorrelated. If you can tolerate that assumption, an advantage of doing it this way is that you can use MCMC sampling, and thus pvals.fnc() in the languageR package. (What I do not know are the consequences of making this assumption if it is false.)

Finally, what would be the syntax if I wanted to include a random subject effect for the INTERACTION of IV1 and IV2?
lmer(DV ~ IV1 + IV2 + (1 + IV1 * IV2 | Subject))

--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
Re: [R] double precision
Roger Bivand wrote:

On Tue, 5 Dec 2006, Yoni Schamroth wrote:

Hi, I am attempting to query a data frame from a MySQL database. One of the variables is a unique identification number (numeric), 18 digits long. I am struggling to retrieve this variable exactly, without any rounding.

Read it as a character - a double is a double:

x <- 6527600583317876352
y <- 6527600583317876380
all.equal(x, y)
[1] TRUE
storage.mode(x)
[1] "double"

and why they are equal is a FAQ (only ~16 digits in a double). Integer is 4-byte. Since they are IDs, not to be used for math, leave them as character strings - which they are, like telephone numbers.

Resurrecting this post for a moment: the same issue arose when interfacing R with a Postgres database using the bigint data type (a signed 64-bit integer ranging from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 as of this writing). While the underlying cause is summarized above, I'd like to recommend the inclusion of a 64-bit integer data type in the R base. For performance reasons, I use R to independently generate a unique transaction ID that is equivalent to the Postgres-generated bigint (with some consistency checks -- generally bad design, but vastly quicker than querying the database for the same value). I currently generate a string representation and pass that to the DBI, though the process is cumbersome and likely not as efficient as an arithmetic equivalent (particularly on a 64-bit processor architecture). Furthermore, additional gyrations need to occur when querying the database for bigint values. Do significant practical challenges exist in the implementation of a 64-bit integer that would outweigh the faster and cleaner compatibility with database backends?
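To make the character-string advice concrete, a small sketch with the IDs from the quoted thread; as strings they stay distinct, while as doubles both round to the same ~16-significant-digit value:

```r
# Keep 64-bit IDs as character strings end to end
x <- "6527600583317876352"
y <- "6527600583317876380"
identical(x, y)                  # FALSE: strings keep every digit

# Coerced to doubles, both IDs land on the same representable value,
# because a double carries only about 16 significant digits:
as.numeric(x) == as.numeric(y)   # TRUE
```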
Re: [R] Urgent Help
On Wed, 19 Aug 2009 10:01:19 -0700 (PDT) Bin1aya bpasakh...@gmail.com wrote:

> I have done copy paste of climate data from excel file to notepad
> and tried to upload. I do not have any knowledge about programming
> languages. Please help me.

There are better ways of importing data from Excel. Have a look at the wiki:

http://wiki.r-project.org/rwiki/doku.php?id=tips:data-io:ms_windows

The first possible solution path should work...

hth
Stefan
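One more hedged suggestion for the original error: the 'T' that scan() choked on is presumably a non-numeric code (e.g. "trace" precipitation) sitting in a numeric column. If the Excel sheet is exported as CSV (the file name below is hypothetical), read.csv can be told to treat it as missing:

```r
# Hypothetical file exported from Excel as CSV.
# Assumption: "T" marks trace/unmeasurable values in the climate data,
# so tell R to read it as NA rather than fail on a non-numeric token.
dat <- read.csv("climate.csv", na.strings = c("NA", "T"))
str(dat)
```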