Re: [R] Doing pairwise comparisons using either Duncan, Tukey or LSD
try:

    example(TukeyHSD)

hth, Kingsford Jones

On Thu, Feb 19, 2009 at 9:51 AM, Saeed Ahmadi ahmadi_sa...@yahoo.com wrote:
Hi, I have a basic and simple question on how to code pairwise (multiple) mean comparisons between levels of a factor using one of Duncan, Tukey or LSD. Thanks in advance, Saeed
--
View this message in context: http://www.nabble.com/Doing-pairwise-comparisons-using-either-Duncan%2C-Tukey-or-LSD-tp22104786p22104786.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
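[A minimal sketch of the Tukey route, using simulated data and made-up variable names; `TukeyHSD` wants an `aov` fit, and `pairwise.t.test` with no adjustment gives LSD-style comparisons:]

```r
# simulated one-way layout: response y, three-level factor grp
set.seed(1)
dat <- data.frame(y   = rnorm(30, mean = rep(c(0, 1, 3), each = 10)),
                  grp = factor(rep(c("a", "b", "c"), each = 10)))

fit <- aov(y ~ grp, data = dat)   # TukeyHSD requires an aov fit
tk  <- TukeyHSD(fit)              # all pairwise differences, Tukey-adjusted
tk$grp                            # columns: diff, lwr, upr, p adj

# Fisher's LSD amounts to unadjusted pairwise t tests:
pairwise.t.test(dat$y, dat$grp, p.adjust.method = "none")
```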
Re: [R] Ideal (possible) configuration for an exalted R system
Hi Harsh, The useR! 2008 site has useful information, e.g. talks by Graham Williams: http://www.statistik.uni-dortmund.de/useR-2008/slides/Williams.pdf Dirk Eddelbuettel http://www.statistik.uni-dortmund.de/useR-2008/tutorials/useR2008introhighperfR.pdf and others http://www.statistik.uni-dortmund.de/useR-2008/abstracts/AbstractsByTopic.html#High%20Performance%20Computing A few days ago I was googling to see what types of workstations are available these days. Here are some with up to 64 GB of RAM: http://www.colfax-intl.com/jlrid/SpotLight.asp?IT=0RID=80 Perhaps it won't be long before we see such memory in laptops: http://www.ubergizmo.com/15/archives/2009/01/samsung_opens_door_to_32gb_ram_stick.html Like you, I'd also be interested in hearing about configurations folks have used to work with large datasets. hth, Kingsford Jones

On Mon, Feb 16, 2009 at 5:10 AM, Harsh singhal...@gmail.com wrote:
Hi All, I am trying to assemble a system that will allow me to work with large datasets (45-50 million rows, 300-400 columns), possibly amounting to 10 GB+ in size. I am aware that R 64-bit implementations on Linux boxes are suitable for such an exercise, but I am looking for configurations that R users out there may have used in creating a high-end R system. Due to a lot of apprehensions that SAS users have about R's data limitations, I want to demonstrate R's usability even with very large datasets as mentioned above. I would be glad to hear from users (share configurations and system-specific information) who have desktops/servers on which they use R to crunch massive datasets. Any suggestions on expanding R's functionality in the face of gigabyte-class datasets would be appreciated. Thanks, Harsh Singhal, Decision Systems, Mu Sigma Inc., Chicago, IL
Re: [R] Problems in Recommending R
On Thu, Feb 12, 2009 at 3:12 PM, Johannes Huesing johan...@huesing.name wrote:
Last time I tried, rseek.org yielded no results when searching for inferno.

...although, if you hit the 'Support Lists' tab it finds the thread in which Patrick announced it.

--
Johannes Hüsing mailto:johan...@huesing.name http://derwisch.wikidot.com
"There is something fascinating about science. One gets such wholesale returns of conjecture from such a trifling investment of fact." (Mark Twain, Life on the Mississippi)
Re: [R] Linear model
On Wed, Feb 11, 2009 at 1:36 PM, kayj kjaj...@yahoo.com wrote:
I want to know how accurate the p-values are when you do linear regression in R. I was looking at the variable x3, where t = 10.843 and the corresponding p-value is <2e-16, which is the same p-value as for the intercept, where the t-value is 48.402. I tried to calculate the p-value in R and I got 0:

    x <- 2*(1 - pt(10.843, 2838))
    x
    [1] 0

Some comments:
i) the printout says that the value is less than 2e-16
ii) It seems strange to interpret a p-value at that level of precision
iii) you're confusing what is printed vs what is stored
iv) see http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
v) rather than subtracting from 1, use the 'lower.tail' argument to pt:

    2*pt(10.843, 2838, lower = F)
    [1] 7.185635e-27

hth, Kingsford Jones

    G <- lm(y ~ x1 + x2 + x3 + x4 + x5)
    summary(G)

    Call:
    lm(formula = y ~ x1 + x2 + x3 + x4 + x5)

    Residuals:
         Min       1Q   Median       3Q      Max
    -14.3172  -3.2197  -0.2913   2.6938  23.3602

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  22.9461     0.4741  48.402  < 2e-16 ***
    x1           -0.1139     0.3734  -0.305  0.76031
    x2           -0.0405     0.1936  -0.209  0.83437
    x3            2.0165     0.1860  10.843  < 2e-16 ***
    x4            0.5313     0.1782   2.982  0.00289 **
    x5            0.5879     0.1779   3.305  0.00096 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 4.724 on 2838 degrees of freedom
      (138 observations deleted due to missingness)
    Multiple R-squared: 0.05279, Adjusted R-squared: 0.05112
    F-statistic: 31.63 on 5 and 2838 DF,  p-value: < 2.2e-16

Thanks for the help
--
View this message in context: http://www.nabble.com/Linear--model-tp21963370p21963370.html
Sent from the R help mailing list archive at Nabble.com.
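[A short sketch of point (iii), printed vs stored: the "<2e-16" in the printout is only a display threshold, and the exact values can be pulled from the coefficient matrix of the summary. Simulated data; names are illustrative only:]

```r
set.seed(42)
x <- rnorm(200)
y <- 2 * x + rnorm(200)
fit <- lm(y ~ x)

ctab  <- coef(summary(fit))   # a plain numeric matrix, not the pretty printout
pvals <- ctab[, "Pr(>|t|)"]   # the stored p-values, far below the print threshold
pvals

# the same values by hand, avoiding the 1 - pt() cancellation problem:
2 * pt(abs(ctab[, "t value"]), df.residual(fit), lower.tail = FALSE)
```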
Re: [R] Help regarding White's Heteroscedasticity Correction
or simultaneously estimate the coefficients and variance structure via nlme::gls and its 'weights' argument...

On Tue, Feb 10, 2009 at 7:57 AM, John Fox j...@mcmaster.ca wrote:
Dear Kishore, Yes, White's heteroscedasticity-consistent standard errors are just that -- standard errors for the OLS coefficients that are consistent in the presence of heteroscedasticity. The coefficients themselves don't change. There is an issue here: although the standard errors and OLS coefficients are consistent, the OLS estimates lose efficiency. If you know the pattern of heteroscedasticity, then you might get more efficient estimates by taking it into account, e.g., in weighted-least-squares regression, or, if the residual spread increases with the fitted values, by transforming the response. I hope this helps, John
-- John Fox, Professor, Department of Sociology, McMaster University, Hamilton, Ontario, Canada; web: socserv.mcmaster.ca/jfox

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kishore; Sent: February-10-09 9:19 AM; To: r-help@r-project.org; r-h...@stat.math.ethz.ch; Subject: [R] Help regarding White's Heteroscedasticity Correction

Hi, I am actually running the White test for correcting heteroscedasticity. I used sandwich() and car(); however, the output shows the updated t test of coefficients, with revised standard errors, while the estimates remained the same. My problem is that the residuals formed a pattern in the original regression equation. After running White's test, I got some new standard errors - but from here I didn't understand how to plot the residuals (or) how to correct the estimates?? Can someone direct me in this regard.. Best, Kishore/..
http://kaykayatisb.blogspot.com
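[A hedged sketch of the sandwich/lmtest route, assuming both packages are installed: `vcovHC` with `type = "HC0"` is White's original estimator, and as John notes the coefficient estimates are untouched, only the standard errors change:]

```r
library(sandwich)  # vcovHC
library(lmtest)    # coeftest

# simulated data where the error variance grows with x
set.seed(7)
x <- runif(100)
y <- 1 + 2 * x + rnorm(100, sd = x)
fit <- lm(y ~ x)

coeftest(fit)                                    # usual OLS standard errors
coeftest(fit, vcov = vcovHC(fit, type = "HC0"))  # same estimates, White SEs
```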
Re: [R] What is the R equivalent of STATA's 'drop' command?
See ?'[' and its examples. Also, section 2.7 of An Introduction to R is a good place to start: http://cran.r-project.org/doc/manuals/R-intro.html#Index-vectors

hth, Kingsford Jones

On Mon, Feb 9, 2009 at 5:27 PM, jjh21 jjhar...@gmail.com wrote:
Hello, I am trying to do some data cleaning in R. I need to drop observations that take on certain values of a variable. In STATA I might type something like:

    drop if variable name == 3
    drop if variable name == 4

Is there an R equivalent of this? I have tried playing around with the subset command, but it seems a bit clunky. What would an advanced R user's approach be for something like this? Thank you!
--
View this message in context: http://www.nabble.com/What-is-the-R-equivalent-of-STATA%27s-%27drop%27-command--tp21925249p21925249.html
Sent from the R help mailing list archive at Nabble.com.
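[A minimal sketch with a made-up data frame; either logical indexing or `subset` reproduces the two STATA `drop if` lines:]

```r
df <- data.frame(id = 1:6, grp = c(3, 1, 4, 2, 3, 4))

# keep rows where grp is neither 3 nor 4 (STATA's two 'drop if' lines)
kept <- df[!(df$grp %in% c(3, 4)), ]
kept$grp                                 # 1 2

# the same thing via subset()
kept2 <- subset(df, !(grp %in% c(3, 4)))
```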
Re: [R] Introduction to maps (of Denmark)?
Also, see this page for more ideas on mapping an auxiliary variable by county: http://r-spatial.sourceforge.net/gallery/ (e.g. plots 9, 14, 15 and 21)

Kingsford Jones

On Fri, Feb 6, 2009 at 3:22 PM, Greg Snow greg.s...@imail.org wrote:
Googling for Denmark Shapefile leads to this website: http://www.vdstech.com/map_data.htm where a few clicks lead to downloading a set of shapefiles (different levels of subdividing the country; hopefully you know more about what they correspond to than I do -- about all I know about Denmark is from Hamlet, the fact that I have some ancestors from there (but I don't remember which branch), and how to make aebleskivers). These can be read into R using the maptools package as:

    library(maptools)
    dmk1 <- readShapePoly('g:/downloads/DNK_adm/DNK0')
    dmk2 <- readShapePoly('g:/downloads/DNK_adm/DNK1')
    dmk3 <- readShapePoly('g:/downloads/DNK_adm/DNK2')

Simple plots can be done like:

    plot(dmk1)
    plot(dmk2)
    plot(dmk3)

a little more complex with:

    plot(dmk3, col = topo.colors(248))

though of course replacing topo.colors(248) with something that is meaningful. If you need more control, then see the sp package. Hope this helps,
-- Gregory (Greg) L. Snow Ph.D., Statistical Data Center, Intermountain Healthcare, greg.s...@imail.org, 801.408.8111

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peter Jepsen; Sent: Friday, February 06, 2009 1:18 PM; To: r-help@r-project.org; Subject: [R] Introduction to maps (of Denmark)?

Dear R-listers, I am using R for Windows Vista 64-bit. I have no experience working with maps, but now I have to plot a map of Denmark in which each Danish county is color-coded according to its incidence rate of a particular disease. My problem is that I don't know where to start. I have read the help files to packages like 'mapplot', and their examples indicate that I am on the right track, but where do I find a suitably detailed map of Denmark, and which format should I prefer?
The terminology in the help files etc. is overwhelming. Thank you in advance for any help. Best regards, Peter.

    *** R version 2.8.1 (2008-12-22) i386-pc-mingw32
    locale: LC_COLLATE=Danish_Denmark.1252;LC_CTYPE=Danish_Denmark.1252;LC_MONETARY=Danish_Denmark.1252;LC_NUMERIC=C;LC_TIME=Danish_Denmark.1252
    attached base packages: [1] stats graphics grDevices utils datasets methods base
Re: [R] Matlab inv() and R solve() differences
I suppose the solution is unstable because x is ill-conditioned:

    > x
           [,1]   [,2]   [,3]  [,4]
    [1,]  0.133  0.254 -0.214 0.116
    [2,]  0.254  0.623 -0.674 0.139
    [3,] -0.214 -0.674  0.910 0.011
    [4,]  0.116  0.139  0.011 0.180
    > cor(x)
               [,1]       [,2]       [,3]       [,4]
    [1,]  1.0000000  0.9963557 -0.9883690  0.8548065
    [2,]  0.9963557  1.0000000 -0.9976663  0.8084090
    [3,] -0.9883690 -0.9976663  1.0000000 -0.7663847
    [4,]  0.8548065  0.8084090 -0.7663847  1.0000000
    > kappa(x)
    [1] 2813.326

hth, Kingsford Jones

On Thu, Jan 29, 2009 at 7:00 PM, Joseph P Gray jpg...@uwm.edu wrote:
I submit the following matrix to both MATLAB and R:

    x =  0.133  0.254 -0.214  0.116
         0.254  0.623 -0.674  0.139
        -0.214 -0.674  0.910  0.011
         0.116  0.139  0.011  0.180

MATLAB's inv(x) provides the following:

     137.21  -50.68   -4.70  -46.42
    -120.71   27.28   -8.94   62.19
     -58.15    6.93   -7.89   36.94
       8.35   11.17   10.42  -14.82

R's solve(x) provides:

     261.94  116.22  150.92 -267.78
     116.22  344.30  286.68 -358.30
     150.92  286.68  252.96 -334.09
    -267.78 -358.30 -334.09  475.22

inv(x)*x = I(4) and solve(x)%*%x = I(4). Is there a way to obtain the MATLAB result in R? Thanks for any help. Pat Gray
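[A quick way to see the point, using the matrix from the post: the condition number is large, R's inverse does reproduce the identity to machine precision, and because the entries were reported to only three decimals, a tiny perturbation of the input moves the inverse substantially:]

```r
x <- matrix(c( 0.133,  0.254, -0.214, 0.116,
               0.254,  0.623, -0.674, 0.139,
              -0.214, -0.674,  0.910, 0.011,
               0.116,  0.139,  0.011, 0.180), 4, 4, byrow = TRUE)

kappa(x)                          # large estimated condition number

xi <- solve(x)
max(abs(xi %*% x - diag(4)))      # residual from the identity: tiny

# rounding-sized perturbation of every entry changes the inverse noticeably
max(abs(solve(x + 1e-4) - xi))
```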
Re: [R] Correlation matrix one side with significance
On the topic of visualizing correlation, see also Murdoch, D.J. and Chow, E.D. (1996). A graphical display of large correlation matrices. The American Statistician 50, 178-180. With examples here:

    # install.packages('ellipse')
    example(plotcorr, package = 'ellipse')

On Sat, Mar 8, 2008 at 3:01 AM, Liviu Andronic landronim...@gmail.com wrote:
On 3/5/08, Martin Kaffanke tech...@roomandspace.com wrote: Now I'd like to have it one-sided, meaning only the lower-left side to be printed (the others are the same), and I'd like to have * where the p-value is lower than 0.05 and ** where lower than 0.01.

Look here [1], at Visualizing Correlations. You might find interesting the example of a plotted correlation matrix. Liviu
[1] http://www.statmethods.net/stats/correlations.html
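[A base-R sketch of what Martin asked for -- lower triangle only, starred at the thresholds he named. Everything here (data, variable names) is made up for illustration:]

```r
set.seed(1)
m <- matrix(rnorm(100 * 4), ncol = 4, dimnames = list(NULL, paste0("v", 1:4)))
m[, 2] <- m[, 1] + rnorm(100, sd = 0.3)   # induce one strong correlation

k <- ncol(m)
out <- matrix("", k, k, dimnames = list(colnames(m), colnames(m)))
for (i in 2:k) for (j in 1:(i - 1)) {     # lower triangle only
  ct <- cor.test(m[, i], m[, j])
  stars <- if (ct$p.value < 0.01) "**" else if (ct$p.value < 0.05) "*" else ""
  out[i, j] <- paste0(round(ct$estimate, 2), stars)
}
print(out, quote = FALSE)
```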
Re: [R] Omitting a desired line from a table [Beginner Question]
see ?subset

Or use indexing, which is covered in section 2.7 of An Introduction to R (but note that a data frame has 2 dimensions)

hth, Kingsford Jones

On Sun, Jan 25, 2009 at 3:06 PM, pfc_ivan pfc_i...@hotmail.com wrote:
I am a beginner using this R software and have a quick question. I added a file into R called fish.txt using this line:

    fish <- read.table("fish.txt", head = T, fill = T)

The .txt file looks like this. Since it contains like 30 lines of data I will copy/paste the first 5 lines:

    Year GeoArea SmpNo Month
    1970       1    13     7
    1971       1    13    10
    1972       1    13     8
    1973       2    13    10
    1974       1    13    11

Now what I want to do is to omit all the lines in the file that aren't happening in GeoArea 1, and that aren't happening in Month 10. So basically the only lines that I want to keep are the lines that have GeoArea=1 and Month=10 at the same time. So if GeoArea=2 and Month=10 I don't need it. So I just need the lines that have both of those values correct. How do I delete the rest of the lines that I don't need? Thank you everyone.
--
View this message in context: http://www.nabble.com/Omitting-a-desired-line-from-a-table--Beginner-Question--tp21657416p21657416.html
Sent from the R help mailing list archive at Nabble.com.
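[Both suggestions applied to the five rows pasted above; either form keeps only the GeoArea 1, Month 10 lines:]

```r
fish <- data.frame(Year = 1970:1974, GeoArea = c(1, 1, 1, 2, 1),
                   SmpNo = 13, Month = c(7, 10, 8, 10, 11))

keep <- subset(fish, GeoArea == 1 & Month == 10)
keep                                             # only the 1971 row survives

# the same result via logical indexing on the rows
fish[fish$GeoArea == 1 & fish$Month == 10, ]
```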
Re: [R] comparing the previous and next entry of a vector to a criterium
How about:

    a <- c(1,2,3,3,2,1,6,3,2)
    b <- c(NA, a[-length(a)])   # previous element
    c <- c(a[-1], NA)           # next element
    a[b==1 & c==3]
    [1] 2 6

hth, Kingsford Jones

On Sun, Jan 25, 2009 at 3:02 PM, Jörg Groß jo...@licht-malerei.de wrote:
Hi, I have a quite abstract problem; hope someone can help me here. I have a vector like this:

    x <- c(1,2,3,4,5,2,6)
    x
    [1] 1 2 3 4 5 2 6

Now I want to get the number where the previous number is 1 and the next number is 3 (that is the 2 in the second place). I tried something with tail(x, -1) ... with that, I can check the next number, but how can I check the previous number?
Re: [R] predict function problem for glmmPQL
On Thu, Jan 22, 2009 at 10:48 PM, Chun Zhang usame...@yahoo.com wrote:
Hi all, I am using cross-validation to validate a generalized linear mixed effects model fitted using glmmPQL. I found that the predict function has a problem and I wonder if anyone has encountered the same problem?

    glmm1 <- glmmPQL(y ~ aX + b, random = ~1|sample, data = traindata)
    predict(glmm1, newdata = testdata, level = 1, type = "response")

gives me all NAs. It works for level=0 (the fixed effects), but not for level=1. When I use newdata=traindata, the predict function works perfectly. I wonder if this is a problem with the predict function or some bug in my code?

...perhaps the levels of 'sample' differ between 'traindata' and 'testdata'?

hth, Kingsford Jones

Thanks much for your help! Chun
Chun Zhang, Statistician at Roche
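[A quick base-R check of that guess -- which grouping levels appear in the test fold but not the training fold. The two data frames here are hypothetical stand-ins for the poster's 'traindata' and 'testdata'; level-1 prediction has no estimated random effect to attach to such levels:]

```r
traindata <- data.frame(sample = factor(c("s1", "s2", "s3")))
testdata  <- data.frame(sample = factor(c("s2", "s4")))

# grouping levels present only in the test fold
unmatched <- setdiff(unique(as.character(testdata$sample)),
                     unique(as.character(traindata$sample)))
unmatched   # "s4" -- the rows that would predict to NA at level = 1
```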
Re: [R] Joint significance of more regressors in summary
try:

    install.packages('car')
    ?car::linear.hypothesis

hth, Kingsford Jones

On Wed, Jan 21, 2009 at 4:20 PM, Roberto Patuelli roberto.patue...@lu.unisi.ch wrote:
Dear All, I was wondering if it is possible to generate a regression summary (it does not matter at this stage if from an lm or, for example, a glm estimate) in which to obtain the joint significance of a set of regressors. Examples could be looking at the joint significance level of a polynomial, or of a set of exogenous variables of which the linear combination suggested by the regression parameters is of interest. With regard to the latter, it would also be cool to visualize directly the linear combination of such a group of variables, which will obviously have a regression coefficient of 1. The standard error and significance level, though, are less obvious. I would expect - please correct me if I'm wrong - that a simple ANOVA comparison between two models with and without this set of variables would give the significance level. But what if there are two sets of variables included in the model for which to find joint significance (that is, set by set)? I hope someone can help. As an example, please see the regression output below, from a quasipoisson estimation. I have two large sets of eigenvector decomposition variables, one marked by _o and one by _d. For these two sets of variables, I would like to have, in the regression summary, only two lines, with Estimate, Std. Error, t-value and Pr(>|t|). Obviously I can do this by hand, constructing the linear combinations, rerunning the model, and thereby obtaining a standard error and a p-value for each set. But the degrees of freedom of the model would in reality be different... Thanks in advance for any help!
Cheers, Roberto Patuelli
Post-doc researcher, Institute for Economic Research (IRE), University of Lugano
Email: roberto.patue...@lu.unisi.ch Homepage: http://www.people.lu.unisi.ch/patuellr

    dep.qglm <- glm(dep ~ lndist + com_lang + contig + history + fta + lnarea_i +
        lngdppc_i + lngdp_i + island_i + landl_i + lnarea_e + lngdp_e + lngdppc_e +
        island_e + landl_e +
        e1_o + e3_o + e4_o + e5_o + e7_o + e8_o + e9_o + e10_o + e11_o + e12_o +
        e13_o + e14_o + e15_o + e17_o + e18_o + e19_o + e20_o + e21_o + e22_o +
        e23_o + e24_o +
        e1_d + e2_d + e4_d + e5_d + e7_d + e8_d + e9_d + e10_d + e12_d + e13_d +
        e14_d + e16_d + e17_d + e18_d + e19_d + e20_d + e22_d + e23_d + e24_d +
        e25_d + e26_d + e27_d + e28_d + e29_d + e30_d,
        family = quasipoisson(link = log))
    summary(dep.qglm)

    Call:
    glm(formula = dep ~ lndist + com_lang + contig + history + fta +
        lnarea_i + lngdppc_i + lngdp_i + island_i + landl_i + lnarea_e +
        lngdp_e + lngdppc_e + island_e + landl_e + e1_o + e3_o + e4_o +
        e5_o + e7_o + e8_o + e9_o + e10_o + e11_o + e12_o + e13_o +
        e14_o + e15_o + e17_o + e18_o + e19_o + e20_o + e21_o + e22_o +
        e23_o + e24_o + e1_d + e2_d + e4_d + e5_d + e7_d + e8_d +
        e9_d + e10_d + e12_d + e13_d + e14_d + e16_d + e17_d + e18_d +
        e19_d + e20_d + e22_d + e23_d + e24_d + e25_d + e26_d + e27_d +
        e28_d + e29_d + e30_d, family = quasipoisson(link = log))

    Deviance Residuals:
          Min        1Q    Median        3Q       Max
    -137.3970   -4.3775   -1.8095   -0.6143  195.3221

    Coefficients:
                  Estimate Std. Error  t value Pr(>|t|)
    (Intercept) -29.311658   0.243063 -120.593  < 2e-16 ***
    lndist       -0.608668   0.009603  -63.386  < 2e-16 ***
    com_lang      0.162357   0.021064    7.708 1.34e-14 ***
    contig        0.578563   0.023609   24.506  < 2e-16 ***
    history       0.176760   0.023113    7.647 2.15e-14 ***
    fta           0.411314   0.018823   21.851  < 2e-16 ***
    lnarea_i     -0.137816   0.008402  -16.404  < 2e-16 ***
    lngdppc_i     0.003957   0.018315    0.216 0.828937
    lngdp_i       0.816396   0.010770   75.801  < 2e-16 ***
    island_i      0.118761   0.030618    3.879 0.000105 ***
    landl_i      -0.337145   0.040638   -8.296  < 2e-16 ***
    lnarea_e     -0.054909   0.006349   -8.649  < 2e-16 ***
    lngdp_e       0.808997   0.009182   88.111  < 2e-16 ***
    lngdppc_e     0.012582   0.012363    1.018 0.308837
    island_e     -0.202474   0.029096   -6.959 3.55e-12 ***
    landl_e      -0.226312   0.041144   -5.501 3.84e-08 ***
    e1_o          0.685095   0.130636    5.244 1.59e-07 ***
    e3_o         -1.204244   0.140884   -8.548  < 2e-16 ***
    e4_o         -1.311745   0.433108   -3.029 0.002460 **
    e5_o         -1.539045   0.278576   -5.525 3.34e-08 ***
    e7_o          1.722945   0.145778   11.819  < 2e-16 ***
    e8_o          1.286667   0.124809   10.309  < 2e-16 ***
    e9_o          0.359851   0.111494    3.228 0.001251 **
    e10_o         3.783921   0.288042   13.137  < 2e-16 ***
    e11_o         0.429692   0.138996    3.091 0.001995 **
    e12_o        -0.707160   0.087880   -8.047 9.00e-16 ***
    e13_o        -2.231826   0.225201   -9.910  < 2e-16 ***
    e14_o        -0.256754   0.108398   -2.369 0.017865 *
    e15_o        -0.408286   0.158939
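[The model-comparison route Roberto guessed at, sketched on simulated data: fit the model with and without a block of regressors and let anova() produce the single F test for the block. This is an illustration, not his model; for a quasipoisson glm the analogue is anova(reduced, full, test = "F"):]

```r
set.seed(3)
n <- 200
d <- data.frame(x = rnorm(n), z1 = rnorm(n), z2 = rnorm(n))
d$y <- 1 + 2 * d$x + 0.5 * d$z1 + 0.5 * d$z2 + rnorm(n)

full    <- lm(y ~ x + z1 + z2, data = d)
reduced <- lm(y ~ x, data = d)          # drop the whole block z1, z2

# one line per block: F test of H0: the z1 and z2 coefficients are both 0
anova(reduced, full)
```

For set-by-set tests (the _o block and the _d block), repeat with a reduced model dropping each block in turn; car::linear.hypothesis gives the same block tests without refitting.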
Re: [R] I'm looking for a book about spatial point patterns (Diggle, 2003)
Unangu, If you haven't seen the 200-page workshop notes that Adrian Baddeley has made available from his spatstat webpage, I highly recommend them: http://www.csiro.au/files/files/pn0y.pdf

hth, Kingsford Jones

On Sat, Jan 10, 2009 at 9:04 PM, Unangu una...@gmail.com wrote:
To understand some functions about spatial point patterns in spatstat, I should know some background about it, and the best way is to read the monograph; Statistical Analysis of Spatial Point Patterns (2nd edn.) is a better choice. But I cannot find it anywhere. Who can help me? Thank you!
- una...@gmail.com
--
View this message in context: http://www.nabble.com/I%27m-looking-for-a-book-about-spatial-point-patterns-%28Diggle%2C2003%29-tp21395908p21395908.html
Sent from the R help mailing list archive at Nabble.com.
[R] R, clinical trials and the FDA
I hope that Marc doesn't mind, but I felt that part of his recent post was important enough to deserve its own subject line rather than being lost in a 60-msg-long thread...

On Sun, Jan 11, 2009 at 10:08 AM, Marc Schwartz marc_schwa...@comcast.net wrote:
... I strongly believe that the comments regarding R and the FDA are overly negative and pessimistic. The hurdles to the use of R for clinical trials are shrinking. There has been substantive activity over the past several years, both internally at the FDA and within the R community, to increase R's acceptance in this domain. At the Joint Statistical Meetings in 2006, Sue Bell from the FDA spoke during a session with a presentation entitled Times 'R' A Changing: FDA Perspectives on Use of Open Source. A copy of this presentation is available here: http://www.fda.gov/cder/Offices/Biostatistics/Bell.pdf In 2007, during an FDA committee meeting reviewing the safety profile of Avandia (Rosiglitazone), the internal FDA meta-analysis performed by Joy Mele, the FDA statistician, was done using R. A copy of this presentation is available here: http://www.fda.gov/ohrms/dockets/ac/07/slides/2007-4308s1-05-fda-mele.ppt Given the high-profile nature of drug safety issues today, that R was used for this analysis by the FDA itself speaks volumes. Also in 2007, at the annual R user meeting at Iowa State University, I had the pleasure and privilege of chairing a session on the use of R for clinical trials. The speakers included Frank Harrell (well known to R users here), Tony Rossini and David James (Novartis Pharmaceuticals) and Mat Soukup (FDA statistician). Copies of our presentations are available here, a little more than half way down the page: http://user2007.org/program/ At that meeting, we also introduced a document that has been updated since then and approved formally by the R Foundation for Statistical Computing.
The document provides guidance for the use of R in the regulated clinical trials domain, addresses R's compliance with the relevant regulations (e.g. 21 CFR 11) as well as describing the development, testing and quality processes in place for R, also known as the Software Development Life Cycle. That document is available here: http://www.r-project.org/doc/R-FDA.pdf I have heard directly from colleagues in industry that this document has provided significant value in their internal discussions regarding implementing the use of R within their respective environments and assuaging many fears regarding R's use. Additionally, presentations regarding the use of open source software and R specifically for clinical trials have been made at DIA and other industry meetings. This fall, there is a session on the use of R scheduled for the FDA's Industry Statistics Workshop in Washington, D.C. For those unfamiliar, I would also point out the membership and financial donors of the R Foundation for Statistical Computing, and take note of the plethora of large pharma companies and clinical research institutions: http://www.r-project.org/foundation/memberlist.html The use of R within this domain is increasing and will only continue to progress as R's value becomes increasingly clear to even risk-averse industry decision makers. Regards, Marc Schwartz
Re: [R] R in the NY Times
The reactions to the NYT article have certainly made for some interesting reading. Here are some of the links:

http://overdetermined.net/site/content/new-york-times-article-r
http://jackman.stanford.edu/blog/?p=1053
http://ggorjan.blogspot.com/2009/01/new-york-times-on-r.html
several posts on Andrew Gelman's blog: http://www.stat.columbia.edu/~gelman/blog/
http://www.reddit.com/r/programming/comments/7nwgq/the_new_york_times_notices_the_r_programming/
comments here: http://bits.blogs.nytimes.com/2009/01/08/r-you-ready-for-r/

It's too bad that SAS has reacted to the negative reactions to their NYT quote with more FUD. The quote that Tony posted is just a thinly-veiled jab at R (veiled by a disingenuous "we value open source" veneer). Perhaps SAS is shooting themselves in the foot with their reactions; aren't they making it harder if they should ever decide the best thing to do is to embrace R and the philosophies behind it? Four years ago, Marc Schwartz posted interesting comments related to this: http://tolstoy.newcastle.edu.au/R/help/04/12/9497.html

On another note, I wonder why in the various conversations there seem to be pervasive views that a) the FDA won't accept work done in R, and b) SAS is the only way to effectively handle data?

best, Kingsford Jones

On 7 Jan, 14:50, Marc Schwartz marc_schwa...@comcast.net wrote: on 01/07/2009 08:44 AM Kevin E. Thorpe wrote: Zaslavsky, Alan M. wrote: This article is accompanied by nice pictures of Robert and Ross. Data Analysts Captivated by Power of R http://www.nytimes.com/2009/01/07/technology/business-computing/07pro... January 7, 2009, Data Analysts Captivated by R's Power, By ASHLEE VANCE. SAS says it has noticed R's rising popularity at universities, despite educational discounts on its own software, but it dismisses the technology as being of interest to a limited set of people working on very hard tasks.
"I think it addresses a niche market for high-end data analysts that want free, readily available code," said Anne H. Milley, director of technology product marketing at SAS. She adds, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet."

Thanks for posting. Does anyone else find the statement by SAS to be humourous yet arrogant and short-sighted? Kevin

It is an ignorant comment by a marketing person who has been spoon-fed her lines... it is also a comment being made from a very defensive and insecure posture. Congrats to R Core and the R Community. This is yet another sign of R's growth and maturity. Regards, Marc Schwartz
Re: [R] install.views()
did you install and load the 'ctv' package? install.packages('ctv') library('ctv') then try your code... Kingsford Jones On Sat, Jan 10, 2009 at 1:00 PM, oscar linares wins...@gmail.com wrote: Dear Rxperts, Using R 2.8.1 and trying install.views(Cluster) getting error Error: could not find function install.views Please help:-( -- Oscar Oscar A. Linares Molecular Medicine Unit Bolles Harbor Monroe, Michigan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
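The "could not find function" error almost always means the package that defines the function has not been installed and loaded. A small sketch of the general guard (the package name 'ctv' comes from the thread; requireNamespace is the check available in modern R, newer than the R 2.8 era of this post):

```r
# Guard against "could not find function": verify the package is
# installed before loading it and calling its functions.
pkg <- "ctv"
if (requireNamespace(pkg, quietly = TRUE)) {
  library(pkg, character.only = TRUE)
  # install.views("Cluster")  # now resolvable
} else {
  message("Install it first: install.packages('", pkg, "')")
}
```

Either branch is safe to run; the message branch simply tells the user what to do rather than failing with an opaque error.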
Re: [R] Matrix: Problem with the code
On Fri, Jan 9, 2009 at 6:36 PM, markle...@verizon.net wrote: Charlotte: I ran your code because I wasn't clear on it and your way would cause more matrices than the person requested.

Bhargab gave us

x <- c(23,67,2,87,9,63,8,2,35,6,91,41,22,3)

and said: "I want to have a matrix with p columns such that each column will have the elements of x^(column#)."

so, I think Charlotte's code was spot-on:

p <- 3
outer(x, 1:p, '^')
      [,1] [,2]   [,3]
 [1,]   23  529  12167
 [2,]   67 4489 300763
 [3,]    2    4      8
 [4,]   87 7569 658503
 [5,]    9   81    729
 [6,]   63 3969 250047
 [7,]    8   64    512
 [8,]    2    4      8
 [9,]   35 1225  42875
[10,]    6   36    216
[11,]   91 8281 753571
[12,]   41 1681  68921
[13,]   22  484  10648
[14,]    3    9     27

Here's another way -- a bit less elegant, but a gentle introduction to thinking in vectors rather than elements:

mat <- matrix(0, nrow=length(x), ncol=p)
for(i in 1:p) mat[,i] <- x^i
mat
      [,1] [,2]   [,3]
 [1,]   23  529  12167
 [2,]   67 4489 300763
 [3,]    2    4      8
 [4,]   87 7569 658503
 [5,]    9   81    729
 [6,]   63 3969 250047
 [7,]    8   64    512
 [8,]    2    4      8
 [9,]   35 1225  42875
[10,]    6   36    216
[11,]   91 8281 753571
[12,]   41 1681  68921
[13,]   22  484  10648
[14,]    3    9     27

best, Kingsford Jones

So I think the code below it, although not too short, does what the person asked. Thanks though because I understand outer better now.

temp <- matrix(c(1,2,3,4,5,6), ncol=2)
print(temp)
# One of those more elegant ways:
outer(temp, 1:p, '^')
# THIS WAY I THINK GIVES WHAT THEY WANT
sapply(1:ncol(temp), function(.col) { temp[,.col]^.col })

On Fri, Jan 9, 2009 at 7:40 PM, Charlotte Wickham wrote: One of those more elegant ways: outer(x, 1:p, "^") Charlotte

On Fri, Jan 9, 2009 at 4:24 PM, Sarah Goslee sarah.gos...@gmail.com wrote: Well, mat doesn't have any dimensions / isn't a matrix, and we don't know what p is supposed to be.
But leaving aside those little details, do you perhaps want something like this:

x <- c(23,67,2,87,9,63,8,2,35,6,91,41,22,3)
p <- 5
mat <- matrix(0, nrow=p, ncol=length(x))
for(j in 1:length(x)) {
  for(i in 1:p) mat[i,j] <- x[j]^i
}

Two notes: I didn't try it out, and if that's what you want rather than a toy example of a larger problem, there are more elegant ways to do it in R. Sarah

On Fri, Jan 9, 2009 at 6:42 PM, Bhargab Chattopadhyay bharga...@yahoo.com wrote: Hi, Can any one please explain why the following code doesn't work? Or can anyone suggest an alternative. Suppose

x <- c(23,67,2,87,9,63,8,2,35,6,91,41,22,3)
mat <- 0;
for(j in 1:length(x)) {
  for(i in 1:p) mat[i,j] <- x[j]^i;
}

Actually I want to have a matrix with p columns such that each column will have the elements of x^(column#). Thanks in advance. Bhargab

-- Sarah Goslee http://www.functionaldiversity.org

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
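As a self-contained check of the equivalence of the approaches discussed in this thread (a sketch using the thread's x and p = 3; the sapply variant is an extra alternative not posted by anyone):

```r
x <- c(23, 67, 2, 87, 9, 63, 8, 2, 35, 6, 91, 41, 22, 3)
p <- 3

m1 <- outer(x, 1:p, `^`)                    # Charlotte's vectorized version
m2 <- sapply(1:p, function(i) x^i)          # column-at-a-time
m3 <- matrix(0, nrow = length(x), ncol = p) # explicit loop
for (i in 1:p) m3[, i] <- x^i

stopifnot(all(dim(m1) == c(length(x), p)),
          all(m1 == m2), all(m1 == m3),
          m1[1, 3] == 23^3)
```

All three build the same length(x)-by-p matrix whose i-th column is x^i; outer() is the most concise and avoids explicit iteration entirely.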
Re: [R] New York Times Article: Data Analysts Captivated by R's Power
On Wed, Jan 7, 2009 at 5:15 PM, Simon Blomberg s.blombe...@uq.edu.au wrote: Some people familiar with R describe it as a supercharged version of Microsoft's Excel spreadsheet software... Maybe it's my dry Australian humour, but I think this should go into the fortunes package. The irony of that quote is great, but I'd guess it's also an effective bit of advertising. Akin to Gary Larson's famous what dogs hear cartoon (link below), I can envision many managers reading the article and seeing something like: blah blah blah blah supercharged version of Microsoft's Excel spreadsheet software that can help illuminate data trends more clearly blah blah blah blah ;-) Kingsford Jones http://www.flickr.com/photos/sluggerotoole/153603564/ Simon. -- Simon Blomberg, BSc (Hons), PhD, MAppStat. Lecturer and Consultant Statistician Faculty of Biological and Chemical Sciences The University of Queensland St. Lucia Queensland 4072 Australia Room 320 Goddard Building (8) T: +61 7 3365 2506 http://www.uq.edu.au/~uqsblomb email: S.Blomberg1_at_uq.edu.au Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] the first and last observation for each subject
Here's some more timings of Bill's function. Although in this example sapply has a clear performance advantage for smaller numbers of groups (k), gm is substantially faster for k > 1000:

gm <- function(x, group){
  # medians by group:
  o <- order(group, x)
  group <- group[o]
  x <- x[o]
  changes <- group[-1] != group[-length(group)]
  first <- which(c(TRUE, changes))
  last <- which(c(changes, TRUE))
  lowerMedian <- x[floor((first+last)/2)]
  upperMedian <- x[ceiling((first+last)/2)]
  median <- (lowerMedian+upperMedian)/2
  names(median) <- group[first]
  median
}

##
for(k in 10^(1:6)){
  group <- sample(1:k, size=10^5, replace=TRUE)
  x <- rnorm(length(group))*10 + group
  cat('number of groups:', k, '\n')
  cat('sapply\n')
  print(s <- unix.time(sapply(split(x,group), median)))
  cat('gm\n')
  print(g <- unix.time(gm(x,group)))
  cat(' sapply/gm user ratio:', s[1]/g[1], '\n\n\n')
}

number of groups: 10
sapply
   user  system elapsed
   0.01    0.00    0.01
gm
   user  system elapsed
   0.14    0.00    0.16
 sapply/gm user ratio: 0.07142857

number of groups: 100
sapply
   user  system elapsed
   0.02    0.00    0.02
gm
   user  system elapsed
   0.11    0.00    0.09
 sapply/gm user ratio: 0.1818182

number of groups: 1000
sapply
   user  system elapsed
   0.11    0.00    0.11
gm
   user  system elapsed
   0.11    0.00    0.11
 sapply/gm user ratio: 1

number of groups: 10000
sapply
   user  system elapsed
   1.00    0.00    1.04
gm
   user  system elapsed
   0.10    0.00    0.09
 sapply/gm user ratio: 10

number of groups: 1e+05
sapply
   user  system elapsed
   6.24    0.01    6.31
gm
   user  system elapsed
   0.16    0.00    0.17
 sapply/gm user ratio: 39

number of groups: 1e+06
sapply
   user  system elapsed
   9.00    0.03    8.92
gm
   user  system elapsed
   0.15    0.00    0.20
 sapply/gm user ratio: 60

R.Version()
$platform
[1] "i386-pc-mingw32"
$arch
[1] "i386"
$os
[1] "mingw32"
$system
[1] "i386, mingw32"
$status
[1] ""
$major
[1] "2"
$minor
[1] "8.0"
$year
[1] "2008"
$month
[1] "10"
$day
[1] "20"
$`svn rev`
[1] "46754"
$language
[1] "R"
$version.string
[1] "R version 2.8.0 (2008-10-20)"

Kingsford Jones

On Mon, Jan 5, 2009 at 10:18 AM, William Dunlap wdun...@tibco.com wrote: Arg, the 'sapply(...)' in the function was in the
initial comment,

gm <- function(x, group){ # medians by group: sapply(split(x,group),median)

but someone's mailer put a newline before the sapply

gm <- function(x, group){
  # medians by group:
  sapply(split(x,group),median)

so it got executed, wasting all that time. (Of course the same mailer will mess up this message in the same way - just delete the sapply() call from gm().)

Bill Dunlap TIBCO Software Inc - Spotfire Division wdunlap tibco.com

-----Original Message----- From: hadley wickham [mailto:h.wick...@gmail.com] Sent: Monday, January 05, 2009 9:10 AM To: William Dunlap Cc: gallon...@gmail.com; R help Subject: Re: [R] the first and last observation for each subject

Another application of that technique can be used to quickly compute medians by groups:

gm <- function(x, group){ # medians by group: sapply(split(x,group),median)
  o <- order(group, x)
  group <- group[o]
  x <- x[o]
  changes <- group[-1] != group[-length(group)]
  first <- which(c(TRUE, changes))
  last <- which(c(changes, TRUE))
  lowerMedian <- x[floor((first+last)/2)]
  upperMedian <- x[ceiling((first+last)/2)]
  median <- (lowerMedian+upperMedian)/2
  names(median) <- group[first]
  median
}

For a 10^5 long x and somewhat fewer than 3*10^4 distinct groups (in random order) the times are:

group <- sample(1:3e4, size=10^5, replace=TRUE)
x <- rnorm(length(group))*10 + group
unix.time(z0 <- sapply(split(x,group), median))
   user  system elapsed
   2.72    0.00    3.20
unix.time(z1 <- gm(x,group))
   user  system elapsed
   0.12    0.00    0.16
identical(z1,z0)
[1] TRUE

I get:

unix.time(z0 <- sapply(split(x,group), median))
   user  system elapsed
  2.733   0.017   2.766
unix.time(z1 <- gm(x,group))
   user  system elapsed
  2.897   0.032   2.946

Hadley -- http://had.co.nz/

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] the first and last observation for each subject
whoops -- I left the group size unchanged so k became greater than the length of the group vector. When I increase the size to 1e7, sapply is faster until it gets to k = 1e6.

warning: this takes awhile (particularly on my machine, which seems to be using just 1 of its 2 cpus)

for(k in 10^(1:6)){
+ group <- sample(1:k, size=1e7, replace=TRUE)
+ x <- rnorm(length(group))*10 + group
+ cat('number of groups:', k, '\n')
+ cat('sapply\n')
+ print(s <- unix.time(sapply(split(x,group), median)))
+ cat('gm\n')
+ print(g <- unix.time(gm(x,group)))
+ cat(' sapply/gm user ratio:', s[1]/g[1], '\n\n\n')
+ }

number of groups: 10
sapply
   user  system elapsed
   1.18    0.38    1.56
gm
   user  system elapsed
  53.43    0.70   55.05
 sapply/gm user ratio: 0.02208497

number of groups: 100
sapply
   user  system elapsed
   1.17    0.23    1.42
gm
   user  system elapsed
  49.89    0.61   51.60
 sapply/gm user ratio: 0.02345159

number of groups: 1000
sapply
   user  system elapsed
   1.29    0.25    1.72
gm
   user  system elapsed
  45.23    0.34   46.55
 sapply/gm user ratio: 0.02852089

number of groups: 10000
sapply
   user  system elapsed
   2.80    0.09    2.87
gm
   user  system elapsed
  41.04    0.55   42.85
 sapply/gm user ratio: 0.06822612

number of groups: 1e+05
sapply
   user  system elapsed
  14.65    0.16   15.18
gm
   user  system elapsed
  38.28    0.65   39.55
 sapply/gm user ratio: 0.3827064

number of groups: 1e+06
sapply
   user  system elapsed
 126.63    0.42  129.21
gm
   user  system elapsed
  37.56    0.84   38.78
 sapply/gm user ratio: 3.371406

On Mon, Jan 5, 2009 at 2:37 PM, Kingsford Jones kingsfordjo...@gmail.com wrote: Here's some more timings of Bill's function.
Although in this example sapply has a clear performance advantage for smaller numbers of groups (k), gm is substantially faster for k > 1000: [remainder of quoted message trimmed -- it repeats the previous post in this thread verbatim]
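The sort-once gm() function discussed in this thread is hard to read through the mailer mangling, so here is a cleaned-up, self-contained transcription (a sketch of Bill Dunlap's technique as posted, checked against the naive sapply/split approach on simulated data):

```r
gm <- function(x, group) {
  # Sort once by (group, x); each group's values then sit in a
  # contiguous, already-sorted block.
  o <- order(group, x)
  group <- group[o]
  x <- x[o]
  changes <- group[-1] != group[-length(group)]
  first <- which(c(TRUE, changes))  # index of each block's first element
  last  <- which(c(changes, TRUE))  # index of each block's last element
  # median of a sorted block = mean of the two middle elements
  med <- (x[floor((first + last) / 2)] + x[ceiling((first + last) / 2)]) / 2
  names(med) <- group[first]
  med
}

set.seed(1)
grp <- sample(1:200, 5000, replace = TRUE)
x <- rnorm(length(grp))
stopifnot(isTRUE(all.equal(gm(x, grp), sapply(split(x, grp), median))))
```

The speed difference in the thread comes from doing a single order() over the whole vector instead of one median() call per group, which is why gm() wins when the number of groups is large.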
Re: [R] Basic Question about use of R
Hi David, A general answer to your question is: yes, R would be useful for such analyses - particularly when interfaced with a GIS. For an introduction to the types of spatial tools available in R see the Task View for spatial data: http://cran.r-project.org/web/views/Spatial.html Below are a few more specific comments: On Fri, Jan 2, 2009 at 12:12 PM, David Greene greene...@ntelos.net wrote: Dear Sirs: I am not yet a user of R. My background includes the use of (Turbo) Pascal for scientific analysis of underwater acoustics problems (e.g. sound ray tracing and array gain in directional noise fields.) I have become interested in the following type of problem: (1) select , say, 1000 random locations within the continental United States; This could be as simple as using the runif function, but more likely you'll want to look at sp::spsample, or for more advanced tools see the spsurvey and spatstat packages. (2) characterize (statistically) the probabilities of: (a) distance to the nearest paved road; (b) elevation above sea level; (c) (?) ownership (public or private); etc. R is outstanding for the types of 'statistical characterization' I guess you are interested in. It also has excellent capabilities for importing and manipulating spatial data (e.g. see the Reading and writing spatial data section of the Task View). However for doing things like calculating geographic distances using objects of varying types (points, lines, polygons, grids) it's generally easiest to use a GIS (such as GRASS, SAGA, ArcInfo, ...). You can then use the available tools for importing the GIS results into R for statistical analysis, and if you wish, exporting back to the GIS. However if you do not want to put the effort into learning a GIS, it is usually possible to work out a solution using only R. As you run into specific problems the R-sig-geo list is a good place to get helpful answers to well formulated questions. 
Would R be useful , perhaps in combination with Google Earth, to carry out As far as I know Google Earth is designed for visualization rather than analysis. R in combination with a GIS is really the way to go. Here is a current book that covers many of the spatial tools available in R http://www.springer.com/public+health/epidemiology/book/978-0-387-78170-9 hope that helps, Kingsford Jones this kind of study? Thank you for your consideration. David C. Greene greene...@ntelos.net __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
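As a zero-dependency illustration of step (1), uniform random sampling in a bounding box can be done with runif alone (the coordinates below are rough continental-US extents chosen for the example; real work would use sp::spsample with a polygon, since a box also covers ocean and neighboring countries):

```r
set.seed(42)
n <- 1000
# approximate continental US bounding box (illustrative values only)
pts <- data.frame(lon = runif(n, -124.7, -66.9),
                  lat = runif(n,   24.5,  49.4))

stopifnot(nrow(pts) == n,
          all(pts$lon >= -124.7, pts$lon <= -66.9),
          all(pts$lat >=   24.5, pts$lat <=  49.4))
```

Each point would then be passed to a GIS (or R spatial packages) to look up distance to roads, elevation, and ownership before the statistical summaries are computed in R.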
Re: [R] Spatial Data Analysis in R [was: Basic Question about use of R]
resending to provide a more informative subject line On Fri, Jan 2, 2009 at 3:21 PM, Kingsford Jones kingsfordjo...@gmail.com wrote: Hi David, A general answer to your question is: yes, R would be useful for such analyses - particularly when interfaced with a GIS. For an introduction to the types of spatial tools available in R see the Task View for spatial data: http://cran.r-project.org/web/views/Spatial.html Below are a few more specific comments: On Fri, Jan 2, 2009 at 12:12 PM, David Greene greene...@ntelos.net wrote: Dear Sirs: I am not yet a user of R. My background includes the use of (Turbo) Pascal for scientific analysis of underwater acoustics problems (e.g. sound ray tracing and array gain in directional noise fields.) I have become interested in the following type of problem: (1) select , say, 1000 random locations within the continental United States; This could be as simple as using the runif function, but more likely you'll want to look at sp::spsample, or for more advanced tools see the spsurvey and spatstat packages. (2) characterize (statistically) the probabilities of: (a) distance to the nearest paved road; (b) elevation above sea level; (c) (?) ownership (public or private); etc. R is outstanding for the types of 'statistical characterization' I guess you are interested in. It also has excellent capabilities for importing and manipulating spatial data (e.g. see the Reading and writing spatial data section of the Task View). However for doing things like calculating geographic distances using objects of varying types (points, lines, polygons, grids) it's generally easiest to use a GIS (such as GRASS, SAGA, ArcInfo, ...). You can then use the available tools for importing the GIS results into R for statistical analysis, and if you wish, exporting back to the GIS. However if you do not want to put the effort into learning a GIS, it is usually possible to work out a solution using only R. 
As you run into specific problems the R-sig-geo list is a good place to get helpful answers to well formulated questions. Would R be useful , perhaps in combination with Google Earth, to carry out As far as I know Google Earth is designed for visualization rather than analysis. R in combination with a GIS is really the way to go. Here is a current book that covers many of the spatial tools available in R http://www.springer.com/public+health/epidemiology/book/978-0-387-78170-9 hope that helps, Kingsford Jones this kind of study? Thank you for your consideration. David C. Greene greene...@ntelos.net __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] interpretation of conf.type in predict.Design {Design}
Hi Dylan, I haven't looked at the code for predict.Design or predict.lm, but I think it's safe to assume that mean and confidence refer to the same concept, as do individual and prediction. Here's my logic: In general, confidence intervals refer to parameter estimates and prediction intervals to predicted point values. For the linear model, the fitted line represents the estimated conditional mean of y given the x-values and we form a confidence interval around it. In this case, our best prediction of any individual y-value given the x-values is also the fitted line. However, upon repeated sampling the fitted line will vary less than the observed y values at any given set of x-values, and this is reflected in the fact that the confidence interval is narrower than the prediction interval. hth, Kingsford Jones On Wed, Dec 31, 2008 at 2:15 PM, Dylan Beaudette dylan.beaude...@gmail.com wrote: Hi, I am not quite sure how to interpret the differences in output when changing conf.type from the default mean to individual. Are these analogous to the differences between confidence and prediction intervals, as defined in predict.lm {stats} ? Thanks in advance. Dylan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
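The mean/individual distinction is easy to demonstrate with predict.lm on simulated data: both intervals share the fitted value as their center, but the prediction ("individual") interval must be wider than the confidence ("mean") interval:

```r
set.seed(1)
d <- data.frame(x = 1:30)
d$y <- 2 + 3 * d$x + rnorm(30)
fit <- lm(y ~ x, data = d)

new <- data.frame(x = 15)
ci <- predict(fit, new, interval = "confidence")  # interval for E[y | x]
pi <- predict(fit, new, interval = "prediction")  # interval for a new y

stopifnot(ci[, "fit"] == pi[, "fit"],  # same center
          pi[, "lwr"] < ci[, "lwr"],   # prediction interval is wider
          pi[, "upr"] > ci[, "upr"])
```

The extra width of the prediction interval reflects the residual variance of an individual observation on top of the sampling variance of the fitted mean.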
Re: [R] How to display y-axis labels in Multcomp plot
See ?par and note the 'mar' parameter. Here's an example:

library(multcomp)
labs <- c('short', 'medium', 'long')
treatment <- gl(3, 10, labels = labs)
response <- rnorm(30, mean=as.numeric(treatment))
mult <- glht(lm(response ~ treatment), linfct=mcp(treatment='Means'))
par(mar=c(4,8,4,2))
plot(confint(mult))

hth, Kingsford Jones

On Mon, Dec 8, 2008 at 5:06 PM, Metconnection [EMAIL PROTECTED] wrote: Dear R-users, I'm currently using the multcomp package to produce plots of means with 95% confidence intervals i.e.

mult <- glht(lm(response~treatment, data=statdata), linfct=mcp(treatment="Means"))
plot(confint(mult, calpha = sig))

Unfortunately the y-axis on the plot appears to be fixed and hence if the labels on the y-axis (treatment levels) are too long, then they are not displayed in full on the plot. Of course I could always make the labels shorter but I was wondering if there was a way to make the position of the y-axis on the plot more flexible, such as in the scatterplot produced using the xyplot function, that would allow me to view the labels in full. Thanks in advance for any advice! Simon -- View this message in context: http://www.nabble.com/How-to-display-y-axis-labels-in-Multcomp-plot-tp20904977p20904977.html Sent from the R help mailing list archive at Nabble.com.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
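The margin mechanics can be shown without multcomp at all: par("mar") gives the plot margins in lines of text as c(bottom, left, top, right), so widening the second entry makes room for long y-axis labels on any base-graphics plot, including the one from plot(confint(glht_object)):

```r
# Widen the left margin for long y-axis labels.
old <- par(no.readonly = TRUE)  # save current settings
par(mar = c(4, 8, 4, 2))        # 8 lines on the left instead of the default
stopifnot(par("mar")[2] == 8)   # the setting took effect
par(old)                        # restore the saved settings
```

Saving and restoring par() settings around the change is good practice so later plots are not drawn with the unusual margins.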
Re: [R] Create unique sets of 3 from a vector of IDs?
However, I believe Brandon was trying to get the permutations of size 3, rather than combinations. Dylan provided a solution including repeats. Here's one without: library(gtools) permutations(5, 3, LETTERS[1:5]) [,1] [,2] [,3] [1,] A B C [2,] A B D [3,] A B E [4,] A C B [5,] A C D [6,] A C E [7,] A D B [8,] A D C [9,] A D E [10,] A E B [11,] A E C [12,] A E D [13,] B A C [14,] B A D [15,] B A E [16,] B C A [17,] B C D [18,] B C E [19,] B D A [20,] B D C [21,] B D E [22,] B E A [23,] B E C [24,] B E D [25,] C A B [26,] C A D [27,] C A E [28,] C B A [29,] C B D [30,] C B E [31,] C D A [32,] C D B [33,] C D E [34,] C E A [35,] C E B [36,] C E D [37,] D A B [38,] D A C [39,] D A E [40,] D B A [41,] D B C [42,] D B E [43,] D C A [44,] D C B [45,] D C E [46,] D E A [47,] D E B [48,] D E C [49,] E A B [50,] E A C [51,] E A D [52,] E B A [53,] E B C [54,] E B D [55,] E C A [56,] E C B [57,] E C D [58,] E D A [59,] E D B [60,] E D C Kingsford Jones On Tue, Dec 2, 2008 at 9:41 PM, G. Jay Kerns [EMAIL PROTECTED] wrote: Dear Brandon, On Tue, Dec 2, 2008 at 10:46 PM, Dylan Beaudette [EMAIL PROTECTED] wrote: On Tue, Dec 2, 2008 at 7:42 PM, philozine [EMAIL PROTECTED] wrote: Dear all: This is one of those should be easy problems that I'm having great difficulty solving. I have a vector containing ID codes, and I need to generate a 3-column matrix that contains all possible combinations of three. For example, my ID vector looks like this: A B C D E I need to generate a matrix that looks like this: A B C A B D A B E A C B A C D A C E A D B A D C A D E Hi, Does this do what you want? expand.grid(letters[1:5], letters[1:5], letters[1:5]) D Have a look at urnsamples() in the prob package. ID - LETTERS[1:5] urnsamples(ID, size = 3, replace = FALSE, ordered = FALSE) Best, Jay *** G. Jay Kerns, Ph.D. 
Associate Professor Department of Mathematics Statistics Youngstown State University Youngstown, OH 44555-0002 USA Office: 1035 Cushwa Hall Phone: (330) 941-3310 Office (voice mail) -3302 Department -3170 FAX E-mail: [EMAIL PROTECTED] http://www.cc.ysu.edu/~gjkerns/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
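Base R can handle both cases in this thread without add-on packages: combn gives the combinations, and filtering expand.grid gives the permutations. A quick check of the counts (10 unordered vs 60 ordered triples of 3 from 5, matching the 60-row table above):

```r
ids <- LETTERS[1:5]

combs <- t(combn(ids, 3))  # unordered triples: choose(5, 3) = 10 rows

perms <- expand.grid(a = ids, b = ids, c = ids,
                     stringsAsFactors = FALSE)     # all 125 ordered triples
perms <- perms[perms$a != perms$b & perms$a != perms$c &
               perms$b != perms$c, ]               # drop repeated elements

stopifnot(nrow(combs) == choose(5, 3),  # 10
          nrow(perms) == 5 * 4 * 3)     # 60
```

This makes the distinction in the thread concrete: Dylan's expand.grid call alone includes repeats such as A A A, and the filtering step is what reduces it to permutations.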
Re: [R] function for simultaneous confidence interval of regression coefficients
see ?coef # extract the estimates ?vcov # extract their covariance matrix ?qf # get the F quantile of interest Also, you may be interested in ?car::ellipse ?ellipse::ellipse.lm ?gmodels::glht.test hth, Kingsford Jones On Sat, Nov 29, 2008 at 4:30 PM, Kyle Matoba [EMAIL PROTECTED] wrote: List, Would someone be so kind as to point me to a function that will calculate simultaneous confidence intervals of regression coefficients based upon their distribution as (under the assumption of normal errors, with \mathbf{X} as the design matrix): $\hat{\mathbf{\beta}} \sim N(\mathbf{\beta}, \sigma^2(\mathbf{X}^T\mathbf{X})^{-1})$. 'confint' calculates individual coefficients so far as I can tell, but I need simultaneous CIs based on the confidence ellipse/ F distribution. Inverting the ellipse gives this equation: \mathbf{\hat{\beta}} \pm \sqrt{\mathrm{diag}(s^2(\mathbf{X}^T\mathbf{X})^{-1}) \times p \times F_{p, n-p, .95}} Thanks, and sorry for such a dumb question. Either I am not searching for the right thing or this hasn't already been addressed in the lists (perhaps because it is so easy). Kyle [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
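Putting those pieces together, here is a sketch of Scheffe-type simultaneous intervals built from coef, vcov and qf, following the poster's formula (simulated data; this is an illustration, not a packaged function):

```r
set.seed(1)
n <- 50
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(n)
fit <- lm(y ~ x1 + x2, data = d)

b <- coef(fit)
se <- sqrt(diag(vcov(fit)))           # sqrt of diag(s^2 (X'X)^{-1})
p <- length(b)
crit <- sqrt(p * qf(0.95, p, n - p))  # sqrt(p * F_{p, n-p, .95})
simult <- cbind(lwr = b - crit * se, upr = b + crit * se)

# simultaneous intervals are wider than one-at-a-time confint() intervals
stopifnot(all(simult[, "lwr"] < confint(fit)[, 1]),
          all(simult[, "upr"] > confint(fit)[, 2]))
```

The widening factor comes from replacing the t quantile used by confint with the F-based critical value, which controls the joint coverage over all p coefficients at once.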
Re: [R] how to select cases based on value of one or more variables
It's generally easier to work with data frames, so read your data with students - read.spss(yourFile, to.data.frame=TRUE) Then subset will work as expected: subset(students, Sex == 1) If you would rather keep the data as a list you could do something like lapply(students, function(x) x[students$Sex == 1]) hth, Kingsford Jones On Sun, Nov 30, 2008 at 2:15 PM, Simone Gabbriellini [EMAIL PROTECTED] wrote: sorry for my bad presentation... read.spss gives me this: students $Auno [1] 6 1 2 2 1 3 4 2 4 2 4 4 1 1 NA 1 4 2 1 1 1 5 4 [24] 2 1 2 1 2 1 4 4 1 1 1 2 1 6 1 1 1 1 1 2 1 2 1 [47] 2 2 1 4 2 4 3 1 1 1 1 3 2 1 4 4 4 4 2 4 1 2 4 [70] 1 3 4 5 2 4 3 5 5 4 2 1 1 1 1 4 5 2 4 4 1 4 2 [93] 1 2 3 3 2 1 2 2 2 1 1 1 3 5 5 5 2 NA 2 1 NA 5 2 [116] 1 4 2 NA 1 4 5 2 3 1 1 1 1 4 2 1 1 3 2 4 2 4 2 [139] 1 4 1 2 4 1 2 3 2 1 1 2 4 4 3 4 1 1 3 2 1 1 2 [162] 1 2 5 5 5 1 4 3 2 3 3 2 1 1 5 1 2 1 1 2 1 2 1 [185] 1 2 1 1 1 1 3 4 2 1 4 2 4 1 4 2 1 1 1 2 1 4 1 [208] 5 1 1 4 4 2 1 1 5 4 1 1 5 5 4 1 4 $Sex [1] 2 1 2 1 2 2 2 2 1 2 1 1 2 1 0 2 2 2 2 2 2 2 2 1 2 2 1 2 2 1 2 2 2 1 [35] 2 2 2 1 1 2 2 2 2 2 2 2 2 2 1 1 1 1 2 2 2 2 2 2 2 1 1 2 2 2 2 2 1 2 [69] 2 1 2 1 2 1 2 2 2 2 2 2 1 2 2 2 1 2 2 2 1 1 1 2 1 2 1 2 2 2 2 2 2 2 [103] 2 2 2 2 1 2 2 2 2 2 2 2 2 2 1 2 0 2 2 2 1 2 2 1 2 1 2 2 1 1 2 1 2 1 [137] 2 1 2 1 1 1 1 1 1 2 1 1 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2 2 2 1 1 2 2 [171] 2 2 2 2 2 1 1 1 1 2 2 2 2 2 2 2 2 1 2 2 1 1 1 1 1 2 0 2 2 1 2 1 2 2 [205] 1 2 2 2 2 2 2 2 2 2 1 2 2 1 1 2 2 2 2 2 I would like to filter - or subset - the dataset for $Sex = 1 (in this case means male...), for example... thanks anyway, Simone Il giorno 30/nov/08, alle ore 21:49, Don MacQueen ha scritto: It is. 
For example, if you have a variable stored as a vector named x, and another variable stored as a vector named y, you can select cases of y where x is greater than 3 by using y[x > 3]. However, you're going to have to provide more information in order to get a better answer than that (see the posting guide, link included with every post to r-help). In particular, I'm guessing that the answer you really want looks somewhat different than my example -- but this depends on the exact structure of what read.spss() produces. I'd also suggest reading some of the documentation available from the R website (CRAN), notably, An Introduction to R. -Don

At 9:36 PM +0100 11/30/08, Simone Gabbriellini wrote: dear list, I have read a spss file with read.spss(), now I have a list with all my variables stored as vectors. is it possible to select cases based on the value of one or more variables? thank you, Simone

-- - Don MacQueen Lawrence Livermore National Laboratory Livermore, CA, USA 925-423-1062 [EMAIL PROTECTED] -

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
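Both suggestions in this thread can be checked on a tiny stand-in for the SPSS data (the values below are invented for the example; Sex == 1 plays the role of "male" as in the original post):

```r
students <- data.frame(Auno = c(6, 1, 2, 2, 1),
                       Sex  = c(2, 1, 2, 1, 2))

# data-frame route, as when read.spss(..., to.data.frame = TRUE) is used
males <- subset(students, Sex == 1)

# list route, as when read.spss() returns a plain list of vectors
slist <- as.list(students)
males_list <- lapply(slist, function(x) x[slist$Sex == 1])

stopifnot(nrow(males) == 2,
          all(males$Sex == 1),
          identical(males_list$Auno, c(1, 2)))
```

The data-frame route is usually preferable because subset keeps the variables aligned as rows, whereas the list route must repeat the logical index for every vector.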
Re: [R] AIC function and Step function
On Sun, Nov 30, 2008 at 5:05 PM, Dana77 [EMAIL PROTECTED] wrote: Thanks for kind help from Steven and Christos last time. Now I have a new problem regarding the code for calculating the weights (w) in the AIC() function. The original code is as below:

getAnywhere(logLik.lm)
function (object, REML = FALSE, ...)
{
    res <- object$residuals
    p <- object$rank
    N <- length(res)
    if (is.null(w <- object$weights)) {
        w <- rep.int(1, N)
    } else {
        excl <- w == 0
        if (any(excl)) {
            res <- res[!excl]
            N <- length(res)
            w <- w[!excl]
        }
    }

Now my question is, if I use the lm() function to fit a multiple linear regression model, such as mod.fit <- lm(formula = Y ~ X1 + X2 + X3, data = set1), what code could I use to extract the weights (w)? Or how do I calculate the weights (w) shown in the above code?

mod.fit won't have weights because you didn't specify any through the weights argument to lm. If you had, you could extract them using the same technique used in the above code: w <- mod.fit$weights hth, Kingsford Jones

Thanks for your time and kind help! Dana

Steven McKinney wrote: Hi Dana, Many thanks to Christos Hatzis who sent me an offline response, pointing out the new functions that make this much easier than my last suggestions: methods() and getAnywhere()

methods(extractAIC)
[1] extractAIC.aov* extractAIC.coxph* extractAIC.glm* extractAIC.lm* extractAIC.negbin*
[6] extractAIC.survreg*
Non-visible functions are asterisked

getAnywhere(extractAIC.coxph)
A single object matching 'extractAIC.coxph' was found
It was found in the following places
  registered S3 method for extractAIC from namespace stats
  namespace:stats
with value
function (fit, scale, k = 2, ...)
{
    edf <- length(fit$coef)
    loglik <- fit$loglik[length(fit$loglik)]
    c(edf, -2 * loglik + k * edf)
}
environment: namespace:stats

Thank you Christos. That said, one of the advantages of getting the source code is that it has comments that are stripped out when the code is sourced into R, e.g.
from the command line:

getAnywhere(AIC.default)
A single object matching 'AIC.default' was found
It was found in the following places
  registered S3 method for AIC from namespace stats
  namespace:stats
with value

function (object, ..., k = 2)
{
    ll <- if ("stats4" %in% loadedNamespaces())
        stats4:::logLik
    else logLik
    if (length(list(...))) {
        object <- list(object, ...)
        val <- lapply(object, ll)
        val <- as.data.frame(t(sapply(val, function(el) c(attr(el,
            "df"), AIC(el, k = k)))))
        names(val) <- c("df", "AIC")
        Call <- match.call()
        Call$k <- NULL
        row.names(val) <- as.character(Call[-1])
        val
    }
    else AIC(ll(object), k = k)
}
environment: namespace:stats

From the source file:

# File src/library/stats/R/AIC.R
# Part of the R package, http://www.R-project.org
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# A copy of the GNU General Public License is available at
# http://www.r-project.org/Licenses/

## Return the object's value of the Akaike Information Criterion (or "An Inf.. Crit..")

AIC <- function(object, ..., k = 2) UseMethod("AIC")

## AIC for logLik objects
AIC.logLik <- function(object, ..., k = 2) -2 * c(object) + k * attr(object, "df")

AIC.default <- function(object, ..., k = 2)
{
    ## AIC for various fitted objects --- any for which there's a logLik() method:
    ll <- if("stats4" %in% loadedNamespaces()) stats4:::logLik else logLik
    if(length(list(...))) { # several objects: produce data.frame
        object <- list(object, ...)
        val <- lapply(object, ll)
        val <- as.data.frame(t(sapply(val, function(el) c(attr(el, "df"), AIC(el, k = k)))))
        names(val) <- c("df", "AIC")
        Call <- match.call()
        Call$k <- NULL
        row.names(val) <- as.character(Call[-1])
        val
    }
    else AIC(ll(object), k = k)
}

Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre
email: smckinney +at+ bccrc +dot+ ca
tel: 604-675-8000 x7561
BCCRC Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. V5Z 1L3 Canada
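To see the point about weights concretely, here is a small sketch on simulated data (all variable names invented): an unweighted lm() fit has a NULL weights component, a weighted fit stores what was passed, and AIC can be reproduced from logLik exactly as AIC.logLik does above.

```r
# Made-up data to illustrate: weights exist only if passed to lm()
set.seed(1)
set1 <- data.frame(Y = rnorm(20), X1 = rnorm(20), X2 = rnorm(20), X3 = rnorm(20))

fit.unw <- lm(Y ~ X1 + X2 + X3, data = set1)
fit.unw$weights            # NULL: no weights were specified

w.in  <- runif(20)
fit.w <- lm(Y ~ X1 + X2 + X3, data = set1, weights = w.in)
w.out <- fit.w$weights     # the same technique Kingsford shows above

# AIC reproduced by hand from logLik, matching AIC.logLik (k = 2)
ll         <- logLik(fit.unw)
aic.manual <- -2 * c(ll) + 2 * attr(ll, "df")
```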
Re: [R] aov help
Rather than estimating the variance components via the method-of-moments estimators, have a look at the 'nlme' and 'lme4' packages, which provide likelihood-based tools for fitting random- and mixed-effects models (and, in the case of nlme, gls models too). Advantages of the likelihood-based approaches include working with unbalanced data, not producing negative variance estimates when the MS_{error} is larger than the MS_{between groups}, and providing a great deal of flexibility in structuring both the random effects and error covariance matrices. hth, Kingsford Jones On Thu, Nov 13, 2008 at 7:14 PM, [EMAIL PROTECTED] wrote: Please pardon an extremely naive question. I see related earlier posts, but no responses which answer my particular question. In general, I'm very confused about how to do variance decomposition with random and mixed effects. Pointers to good tutorials or texts would be greatly appreciated. To give a specific example, page 193 of VR, 3rd Edition, illustrates using raov assuming pure random effects on a subset of coop: raov(Conc ~ Lab / Bat, data = coop, subset = Spc == "S1") I realize ('understand' would be a bit too strong) that the same analysis, resulting in identical sums of squares, degrees of freedom, and residuals can be generated in R by doing op <- options(contrasts = c("contr.helmert", "contr.poly")) aov(Conc ~ Lab + Error(Lab / Bat), data = coop, subset = Spc == "S1") However, as shown in VR, raov also equated the expected and observed mean squares, to solve for and display the variance components associated with the random factors, \sigma_\epsilon^2, \sigma_B^2, and \sigma_L^2 in a column labeled Est. Var. Given the analytical forms of the expected mean squares for each stratum, I can obviously do this manually. But is there a way to get R to do it automatically, a la raov? This would be particularly useful for mixed cases in which the analytical formulations of the expected mean squares may not be immediately obvious to a novice. Thanks in advance!
Regards, -jh-
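As a concrete sketch of the likelihood-based route suggested above, here is a one-way random-effects fit on simulated data using nlme (which ships with R); VarCorr() reports the between-group and residual variance components directly, with no need to equate expected and observed mean squares by hand:

```r
# Likelihood-based variance components with nlme, on simulated
# one-way random-effects data (all names invented)
library(nlme)

set.seed(42)
lab  <- gl(6, 10)                              # 6 labs, 10 measurements each
conc <- 5 + rnorm(6, sd = 2)[lab] + rnorm(60)  # lab effect + residual noise
d    <- data.frame(conc = conc, lab = lab)

fit <- lme(conc ~ 1, random = ~ 1 | lab, data = d)
VarCorr(fit)   # REML estimates of sigma^2_lab and sigma^2_error
vc <- VarCorr(fit)
```

Unlike the method-of-moments solution, the REML estimates here are constrained to be non-negative and the same call works for unbalanced designs.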
Re: [R] VEC Operator in R
RSiteSearch('vec vech', restrict = 'fun') turns up several packages with vec and vech functions. E.g., matrixcalc, fUtilities, ks, ... hth, Kingsford Jones On Thu, Oct 23, 2008 at 11:47 AM, megh [EMAIL PROTECTED] wrote: Can anyone please tell whether there is any R function to act as a VEC or VECH operator on a matrix? Yes of course, I can write a user-defined function for that, or else I can put dim(mat) <- NULL. However I am looking for some R function. Your help will be highly appreciated. Regards, -- View this message in context: http://www.nabble.com/VEC-Operator-in-R-tp20136201p20136201.html Sent from the R help mailing list archive at Nabble.com.
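For reference, minimal base-R versions of the two operators are easy to sketch (these are illustrative definitions, not the packaged functions the reply points to):

```r
# Minimal base-R sketches of the vec and vech operators
vec <- function(m) {
  # stack the columns of m into one long vector
  dim(m) <- NULL
  m
}

vech <- function(m) {
  # half-vectorization: stack the lower triangle, including the diagonal
  m[lower.tri(m, diag = TRUE)]
}

A <- matrix(1:4, 2, 2)   # columns (1,2) and (3,4)
vec(A)    # 1 2 3 4
vech(A)   # 1 2 4
```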
Re: [R] Calculating confidence limits for the difference between means
On Wed, Oct 22, 2008 at 11:09 AM, Dennis Fisher [EMAIL PROTECTED] wrote: Colleagues, I am working on a problem at the edge of my knowledge of statistics. Basically, I have values for two groups - I am trying to calculate the 90% confidence limits for the difference between means. I can implement this without difficulty based on standard equations that are available in stat textbooks: (difference between means) / (pooled standard error) My problem arises when I try to correct for the value of a covariate. I can do the ANOVA with the covariate as a factor. But, I don't know how to use the output to obtain the confidence limits of the difference. Generally 'factor' refers to qualitative variables and 'covariate' to quantitative, but if you are looking to estimate the difference in two means using an ANCOVA perhaps something like... fit <- lm(y ~ trt + covar) # trt is a 2-level factor and covar is quantitative confint(fit, level = .90) hth, Kingsford Jones I suspect that several participants in this board have implemented code to do this. I hope that someone is willing to share the code. Thanks in advance. Dennis Dennis Fisher MD P (The P Less Than Company) Phone: 1-866-PLessThan (1-866-753-7784) Fax: 1-415-564-2220 www.PLessThan.com
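A self-contained sketch of the suggested ANCOVA on simulated data (all names invented): the coefficient on the treatment factor is the covariate-adjusted difference in means, and confint() gives its 90% interval.

```r
# Sketch of the suggested ANCOVA on simulated data
set.seed(7)
trt   <- gl(2, 15, labels = c("A", "B"))         # 2-level treatment factor
covar <- rnorm(30)                               # quantitative covariate
y     <- 1 + 2 * (trt == "B") + 0.5 * covar + rnorm(30)

fit <- lm(y ~ trt + covar)
ci  <- confint(fit, level = 0.90)
ci   # the "trtB" row is the 90% CI for the adjusted difference in means
```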
Re: [R] Ecological Niche Modelling on R
On Wed, Oct 8, 2008 at 6:17 AM, milton ruser [EMAIL PROTECTED] wrote: [snip] Finally, in fact GRASP does what I am looking for, and I am starting to compare the results of this package with other very well-known software (DesktopGarp, Maxent, OpenModeller). If someone of you has suggestions of other R solutions for Ecological Niche Models, please let me know. If you're interested in prediction methods based on presence-only data, you might find Gillian Ward's dissertation interesting ( http://www-stat.stanford.edu/~hastie/students.htm ). It includes reference to an accompanying R package (although it does not appear to be on CRAN). hth, Kingsford Jones Regards a lot. Savava. Miltinho Astronauta Brazil On Wed, Oct 8, 2008 at 6:02 AM, Clément Calenge [EMAIL PROTECTED] wrote: It's very kind of Stephen to plug my book, but it's not what you're looking for. You need to read more about this general topic, and about the particular packages: try http://www.unine.ch/CSCF/grasp/grasp-r/index.html and http://www.unine.ch/CSCF/grasp/ Based on downloading grasp, it doesn't look as though it will handle presence-only data, though -- you may need to look further. It doesn't look like adehabitat is what you want. From Calenge, Clement. 2006. The package adehabitat for the R software: A tool for the analysis of space and habitat use by animals. Ecological Modelling 197, no. 3-4 (August 25): 516-519. doi:10.1016/j.ecolmodel.2006.03.017. '... the adehabitat package for the R software, which offers basic GIS (Geographic Information System) functions, methods to analyze radio-tracking data and habitat selection by wildlife, and interfaces with other R packages.' General advice about I want to do X in R -- (expanding on Stephen's advice above): 1. read about X in general (perhaps you have already done this); 2. search for R packages and functions that do what you want (you've already done this, although you misidentified adehabitat); 3. install those packages and see what they do.
Look at the documentation included with the packages, including any citations referenced. Try the examples. 4. If you don't know enough R to understand the examples or how to get your data into R, back up and read the introductory R documentation. Actually, the confusion could be explained by the fact that many analysis methods (and especially factor analyses) originally developed in community ecology and biogeography to study the niche are also used in habitat selection studies (e.g., OMI analysis, ENFA, etc.). As the statistical issues (predict the species/animal presence on an area, given the value of environmental variables) and type of data (presence-only data to be compared with a sample/census of available units, etc.) involved in studies of the niche and habitat selection are often similar, the methods used are often similar too... However, most of the functions in adehabitat implement /exploratory/ methods of the ecological niche, and methods suitable for prediction are rare in the package (except one or two functions which have already been used for that, such as mahasuhab or domain, but they are probably not the best choice given your aim)... The package grasp may indeed be a better choice if your aim is prediction... But I concur with Ben and Stephen on the fact that you should first read the (large) literature on niche modelling before choosing the method that seems appropriate to your data/issue, and then search the R archives/packages for a solution. A good start: @ARTICLE{Elith2006, author = {Elith, J. and Graham, C.H. and Anderson, R.P. and Dudik, M. and Ferrier, S. and Guisan, A. and Hijmans, R.J. and Huettmann, F. and Leathwick, J.R. and Lehmann, A. and Li, J. and Lohmann, L.G. and Loiselle, B.A. and Manion, G. and Moritz, C. and Nakamura, M. and Nakazawa, Y. and McC. Overton, J. and Peterson, A.T. and Phillips, S.J. and Richardson, K. and Scachetti-Pereira, R. and Schapire, R.E. and Soberon, J. and Williams, S. and Wisz, M.S.
and Zimmermann, N.E.}, title = {Novel methods improve prediction of species distributions from occurrence data}, journal = {Ecography}, year = {2006}, volume = {29}, pages = {129-151} } and references therein. Cheers, Clément Calenge. -- Clément CALENGE Office national de la chasse et de la faune sauvage Saint Benoist - 78610 Auffargis tel. (33) 01.30.46.54.14
Re: [R] lme and lmer df's and F-statistics again
You may find this site useful: http://lme4.r-forge.r-project.org/bib/lme4bib.html On Tue, Oct 7, 2008 at 9:08 AM, Dieter Menne [EMAIL PROTECTED] wrote: Julia S. julia.schroeder at gmail.com writes: Now, I did that in my article and I got a response from a reviewer that I additionally should give the degrees of freedom and the F-statistics. From what I read here, that would be incorrect to do, and I sort of intuitively also understand why (at least I think I do). ... Well, writing my rebuttal, I find myself unable to explain in a few easy-to-understand (and, at the same time, correct) sentences that it is not a good idea to report (most likely wrong) dfs and F-statistics. Can somebody here help me out with a correct explanation for laymen? Feeling with you, and hoping some day this will be resolved. I am sure you have read Douglas Bates' http://finzi.psych.upenn.edu/R/Rhelp02a/archive/76742.html but I thought this was temporary. The only workaround I have is not to use lmer for gaussian models. Dieter
Re: [R] ANOVA between within variance
So, rather than the estimated between- and within-group variances from a standard fixed-effects ANOVA (i.e. the Mean Sqs in the anova table), you are looking for the estimated variance components from a model with crossed random effects? If so, you may find the lmer function in package lme4 useful (formulas for crossed random effects are much more straightforward than in nlme). If not, this page provides some helpful tips on eliciting useful responses from this list: http://www.r-project.org/posting-guide.html HTH, Kingsford Jones On Sat, Sep 27, 2008 at 2:21 AM, Gregor Rolshausen [EMAIL PROTECTED] wrote: dear Dr. Kubovy, I am sorry, but the variance table is not exactly what I want. I want the partitioned VARIANCE for between and within the groups. the anova()-table just gives me the Sum Sq and the Mean Sq... I know how to run t.test and ANOVA! in the nlme package there is the VarCorr function, which extracts the between and within variances, but only for nested ANOVAs. so my question was, if there is a function like that for non-nested ANOVAs? sorry. maybe I should reformulate the question. cheers, gregor On Sep 27, 2008, at 7:19 AM, Michael Kubovy wrote: Then all you need is to run a t-test, no? More generally (from ?lm):

ctl <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
trt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)
anova(lm.D9 <- lm(weight ~ group))

This gives you what you need:

Analysis of Variance Table

Response: weight
          Df Sum Sq Mean Sq F value Pr(>F)
group      1   0.69    0.69    1.42   0.25
Residuals 18   8.73    0.48

I am concerned that you have not spent enough time either studying stats or reading up on R. There are many good introductions to stats using R.
_ Professor Michael Kubovy University of Virginia Department of Psychology USPS: P.O. Box 400400, Charlottesville, VA 22904-4400 Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903 Office: B011, +1-434-982-4729 Lab: B019, +1-434-982-4751 Fax: +1-434-982-4766 WWW: http://www.people.virginia.edu/~mk9y/ Gregor Rolshausen PhD Student Department of Evolutionary Ecology University of Freiburg im Breisgau; Hauptstrasse 1, 79108 Freiburg phone - +49.761.2559 email - [EMAIL PROTECTED]
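For a balanced one-way layout, the variance components the poster is after can also be computed by hand from the Mean Sq column of the anova table, by equating observed and expected mean squares (a method-of-moments sketch on simulated data; for crossed or unbalanced designs, lmer as suggested above is the better route):

```r
# Hand-rolled method-of-moments variance components for balanced
# one-way data, read off the Mean Sq values of the anova table
set.seed(3)
g <- gl(5, 8)                        # 5 groups of 8 observations
y <- rnorm(5, sd = 2)[g] + rnorm(40)

tab <- anova(lm(y ~ g))
ms.between <- tab["g", "Mean Sq"]
ms.within  <- tab["Residuals", "Mean Sq"]

sigma2.within  <- ms.within                      # E[MS_within]  = sigma^2_e
sigma2.between <- (ms.between - ms.within) / 8   # E[MS_between] = sigma^2_e + n * sigma^2_g
```

Note the between-group estimate can come out negative when MS_within exceeds MS_between, which is one of the arguments made elsewhere in this digest for the likelihood-based estimators.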
Re: [R] Bug in is ?
On Thu, Sep 25, 2008 at 3:23 PM, Wacek Kusnierczyk [EMAIL PROTECTED] wrote: Rolf Turner wrote: [snip] Now what on earth does ``integer type'' mean? The concept ``type'' is not defined anywhere, and there is no help on ``type''. There is no type() function. One has to intuit, from the discussion of integer vectors existing so that they can be properly passed to .C() or .Fortran(), that type has something to do with storage mode. indeed. one more example that R man pages are often rather uninformative, despite verbosity. Try ?type which correctly guesses the user is looking for the 'typeof' page. Or even example(type) Also, after a brief introduction, the R Language Definition document begins with a discussion of types. Kingsford Jones It would have been better to have called the function now known as ``is.integer()'' something like ``is.storedAsInteger()'' and have a function is.integer() which does what people expect. E.g. is.integer(c(5.1, 5.4, 4.8, 5.0)) would return [1] FALSE FALSE FALSE TRUE Despite what fortune(85) says, it is not unreasonable to expect functions to do what one would intuitively think that they should do. E.g. sin(x) should not return 1/(1+x^2) even if the help page for sin() says clearly and explicitly that this is what it does. (Aside: help pages rarely if ever say *anything* clearly and explicitly, at least from the point of view of the person who does not already understand everything about the concepts being explained.) indeed. one more opinion that R man pages are often rather uninformative, despite verbosity. Be that as it may, this is all academic maundering. The is.integer() function does what it does and THAT IS NOT GOING TO CHANGE. We'll just have to deal with it. Once one is *aware* that the results of is.integer are counter-intuitive, one can adjust one's expectations, and it's no big deal.
I do think, however, that there ought to be a WARNING section in the help on is.integer() saying something like: NOTE: is.integer() DOES NOT DO what you expect it to do. hehe. this should be printed on the first page in every R tutorial: NOTE: R DOES NOT DO what you expect it to do (seems i'm in a bad mood, sorry, R is just fine). In large friendly letters. cheers, Rolf Turner P. S. Those who want a function that does what one would naturally expect is.integer() to do could define, e.g. is.whole.number <- function(x) { abs(x - round(x)) < sqrt(.Machine$double.eps) } Then is.whole.number(c(5.1, 5.4, 4.8, 5.0)) returns [1] FALSE FALSE FALSE TRUE just as one would want, hope, and expect. if *this* is what one would want, hope, and expect from is.integer, why can't we want, hope, and expect that it eventually happens? vQ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
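To make the distinction concrete, here is a minimal base-R sketch contrasting storage type with whole-number-ness; the is.whole.number helper is the one suggested in the thread, and the literal values are illustrative:

```r
## is.integer() tests the storage type, not whether a value is a whole number
typeof(5)        # "double" -- numeric literals are stored as doubles
is.integer(5)    # FALSE
is.integer(5L)   # TRUE -- the L suffix creates an integer literal
typeof(1:3)      # "integer" -- the colon operator yields integer storage

## a tolerance-based whole-number test, as suggested in the thread
is.whole.number <- function(x) abs(x - round(x)) < sqrt(.Machine$double.eps)
is.whole.number(c(5.1, 5.4, 4.8, 5.0))  # FALSE FALSE FALSE TRUE
```

The tolerance matters: a naive `x == round(x)` can fail for values that are whole numbers mathematically but carry floating-point error.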
Re: [R] How to do Clustered Standard Errors for Regression in R?
Bo, Try using RSiteSearch with the strings 'huber-white', 'sandwich' or even 'clustered standard errors'. You may also want to consider a mixed models approach -- see: http://www.stat.columbia.edu/~cook/movabletype/archives/2007/11/clustered_stand.html HTH, Kingsford Jones On Tue, Sep 16, 2008 at 12:01 PM, Bo Cowgill [EMAIL PROTECTED] wrote: I can't seem to find the right set of commands to enable me to perform a regression with cluster-adjusted standard errors. There seems to be nothing in the archives about this -- so this thread could help generate some useful content. I've searched everywhere. Can anyone point me to the right set of commands? An example would be most helpful as well. Bo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
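For a concrete starting point, here is a hedged base-R sketch of the cluster-robust ("sandwich") covariance computation on simulated data; in practice one would normally use the sandwich package found via the RSiteSearch strings above rather than hand-rolling it, and all object names here are illustrative:

```r
## cluster-robust (CR0 "sandwich") standard errors by hand, base R only
set.seed(1)
g <- rep(1:10, each = 20)                   # 10 clusters of 20 observations
x <- rnorm(200)
y <- 1 + 2 * x + rnorm(10)[g] + rnorm(200)  # cluster-level noise induces within-cluster correlation
fit <- lm(y ~ x)

X <- model.matrix(fit)
u <- residuals(fit)
## "meat": sum over clusters of (X_g' u_g)(X_g' u_g)'
meat <- Reduce(`+`, lapply(split(seq_along(u), g), function(i) {
  s <- crossprod(X[i, , drop = FALSE], u[i])
  tcrossprod(s)
}))
bread <- solve(crossprod(X))                # (X'X)^{-1}
vc <- bread %*% meat %*% bread              # cluster-robust covariance matrix
cbind(naive = sqrt(diag(vcov(fit))), clustered = sqrt(diag(vc)))
```

With cluster-level noise in the data-generating process, the clustered standard error for the intercept is typically well above the naive one, which is the point of the adjustment.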
Re: [R] creating rainbow gradients
On Wed, Sep 17, 2008 at 3:11 PM, Gillian Silver [EMAIL PROTECTED] wrote: What would I do if I have something like: x <- rnorm(1:1000) y <- rnorm(1:1000) z <- x + y and I want the rainbow to increase with z? (i.e., red for lowest z...all the way up to the last color in the rainbow for the highest z) Do you mean something like the following? z <- rnorm(1000, sd = sqrt(2)) plot(z, col = rainbow(length(z), end = 5/6)[rank(z)], pch = 19) On Wed, Sep 17, 2008 at 2:05 PM, stephen sefick [EMAIL PROTECTED] wrote: plot(1:20, col=rainbow(20)) On Wed, Sep 17, 2008 at 4:58 PM, Gillian Silver [EMAIL PROTECTED] wrote: Hi, how can I create a rainbow gradient in R? For example, let's say I have a plot of y = x...and I want the plot to go from red -> orange -> yellow -> green -> blue -> etc. Right now, I know how to do something like go from red to blue, using the plotrix library: library(plotrix) redToBlue <- color.scale(x, redrange=c(0,1), greenrange=c(0,1), bluerange=c(0,1), extremes=c("red","blue")) plot(x, y, col=redToBlue) But I can't figure out how to make the colors a rainbow. (I don't understand how the redrange, greenrange, and bluerange parameters in color.scale work.) Could someone please help? Thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Stephen Sefick Research Scientist Southeastern Natural Sciences Academy Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K.
Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
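To see why indexing the palette by rank() orders the colors with z, here is a small base-R check (object names are illustrative):

```r
## indexing a rainbow palette by rank(z) assigns the first palette color (red)
## to the smallest z and the last color to the largest
set.seed(42)
z <- rnorm(20)
pal <- rainbow(length(z), end = 5/6)   # stop at 5/6 so the hues don't wrap back to red
cols <- pal[rank(z)]
cols[which.min(z)] == pal[1]           # TRUE: minimum z is red
cols[which.max(z)] == pal[length(pal)] # TRUE: maximum z gets the last hue
plot(z, col = cols, pch = 19)
```

The `end = 5/6` argument is the key detail: without it the rainbow wraps around the hue circle, so the largest values would come back to red and the gradient would be ambiguous.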
Re: [R] creating rainbow gradients
On second thought, this is more likely to be what you're looking for... library(rgl) x <- rnorm(1000) y <- rnorm(1000) z <- x + y plot3d(x, y, z, col = rainbow(1000, end = 5/6)[rank(z)], size = 3) On Wed, Sep 17, 2008 at 4:06 PM, Kingsford Jones [EMAIL PROTECTED] wrote: On Wed, Sep 17, 2008 at 3:11 PM, Gillian Silver [EMAIL PROTECTED] wrote: What would I do if I have something like: x <- rnorm(1:1000) y <- rnorm(1:1000) z <- x + y and I want the rainbow to increase with z? (i.e., red for lowest z...all the way up to the last color in the rainbow for the highest z) Do you mean something like the following? z <- rnorm(1000, sd = sqrt(2)) plot(z, col = rainbow(length(z), end = 5/6)[rank(z)], pch = 19) On Wed, Sep 17, 2008 at 2:05 PM, stephen sefick [EMAIL PROTECTED] wrote: plot(1:20, col=rainbow(20)) On Wed, Sep 17, 2008 at 4:58 PM, Gillian Silver [EMAIL PROTECTED] wrote: Hi, how can I create a rainbow gradient in R? For example, let's say I have a plot of y = x...and I want the plot to go from red -> orange -> yellow -> green -> blue -> etc. Right now, I know how to do something like go from red to blue, using the plotrix library: library(plotrix) redToBlue <- color.scale(x, redrange=c(0,1), greenrange=c(0,1), bluerange=c(0,1), extremes=c("red","blue")) plot(x, y, col=redToBlue) But I can't figure out how to make the colors a rainbow. (I don't understand how the redrange, greenrange, and bluerange parameters in color.scale work.) Could someone please help? Thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Stephen Sefick Research Scientist Southeastern Natural Sciences Academy Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods.
We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] autocorrelation in gams
Keeping Gavin's advice in mind, you may also want to look at ?acf (and see section 14.1 of MASS) and help(ACF, package=nlme) (see section 5.3 of MEMSS). These are useful functions for exploring the 1d empirical autocorrelation structure of model residuals. hth, Kingsford Jones On Fri, Aug 15, 2008 at 1:18 AM, Gavin Simpson [EMAIL PROTECTED] wrote: On Thu, 2008-08-14 at 16:12 +0100, Abigail McQuatters-Gollop wrote: Hi, I am looking at the effects of two explanatory variables on chlorophyll. The data are an annual time-series (so are autocorrelated) and the relationships are non-linear. I want to account for autocorrelation in my model. The model I am trying to use is this: library(mgcv) gam1 <- gam(Chl ~ s(wintersecchi) + s(SST), family=gaussian, na.action=na.omit, correlation=corAR1(form = ~ Year)) There is no correlation argument in mgcv::gam; you need gamm(). gam() has a ... argument which I suspect is silently mopping up your correlation argument so that no error/warning is raised. Note that gamm() uses lme from the nlme package (in the Gaussian case) and works that function very hard (see Wood 2006 GAM book). In my experience with this function, separating trend from the correlation is quite difficult when also estimating the degree of smoothness, and one has to work hard with the options. As an alternative you might also take a look at the paper by Ferguson et al (2007): http://www3.interscience.wiley.com/journal/119392062/abstract?CRETRY=1SRETRY=0 which has R code using the sm package to do a very similar thing. HTH G the result I get is this: Family: gaussian Link function: identity Formula: CPRChl ~ s(wintersecchi) + s(SST) Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 3.57000 0.05061 70.54 <2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Approximate significance of smooth terms: edf Est.rank F p-value s(wintersecchi) 2.4455 4.672 0.00887 ** s(SST) 2.4085 4.301 0.01237 * --- Signif.
codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 R-sq.(adj) = 0.676 Deviance explained = 75.4% GCV score = 0.074563 Scale est. = 0.053781 n = 21 The results look good - significant, with a lot of deviance explained - but I am not convinced the model is actually accounting for the autocorrelation (based on the formula in the results). How can I tell? Many thanks, Dr Abigail McQuatters-Gollop Sir Alister Hardy Foundation for Ocean Science (SAHFOS) The Laboratory Citadel Hill Plymouth UK PL1 2PB tel: +44 1752 633233 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
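As a small illustration of the ?acf suggestion above, the following base-R sketch simulates AR(1) errors around a trend and checks the residual autocorrelation; all names and parameter values are illustrative:

```r
## residual diagnostics for autocorrelation with acf()
set.seed(1)
n <- 100
e <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))  # AR(1) noise
tt <- seq_len(n)
y <- 0.05 * tt + e
fit <- lm(y ~ tt)
r <- residuals(fit)
acf(r, plot = FALSE)$acf[2]  # lag-1 autocorrelation of the residuals, well above 0
acf(r)                       # the usual plot, with approximate confidence bounds
```

If a fitted model has genuinely accounted for the serial correlation, the (normalized) residuals should show little structure in such a plot; residual lags poking well outside the bounds are the warning sign the original poster is asking about.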
Re: [R] Inconsistent linear model calculations
Also, it's worth pointing out the reason for the numerical instability of the parameter estimates: the predictors are nearly collinear.

(dubious <- read.table('clipboard'))
  V1  V2    V3    V4    V5       V6 V7
1  1 300 39.87 39.85 39.90 39.87333  9
2  2 400 45.16 45.23 45.17 45.18667 16
3  3 500 50.72 51.03 50.90 50.88333 25
4  4 600 56.85 56.80 57.02 56.89000 36
5  5 700 63.01 63.09 63.14 63.08000 49
6  6 800 69.52 59.68 69.63 66.27667 64

round(cor(dubious), 3)
      V1    V2    V3    V4    V5    V6    V7
V1 1.000 1.000 0.999 0.951 0.999 0.997 0.991
V2 1.000 1.000 0.999 0.951 0.999 0.997 0.991
V3 0.999 0.999 1.000 0.942 1.000 0.995 0.996
V4 0.951 0.951 0.942 1.000 0.943 0.970 0.912
V5 0.999 0.999 1.000 0.943 1.000 0.995 0.995
V6 0.997 0.997 0.995 0.970 0.995 1.000 0.983
V7 0.991 0.991 0.996 0.912 0.995 0.983 1.000

Note that the correlation for V2 and V7 is about 0.991. Kingsford Jones On Thu, May 15, 2008 at 3:01 PM, e-letter [EMAIL PROTECTED] wrote: Thank you to all you eagle eyes; amendment made accordingly and solved. Not sure how the difference occurred... __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
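A quick way to quantify this kind of near-collinearity is the condition number of the design matrix; here is a base-R sketch on simulated data (variable names and the simulation itself are illustrative, not the poster's data):

```r
## near-collinear predictors inflate the condition number of X and
## destabilize the least-squares coefficient estimates
set.seed(1)
x1 <- 1:6
x2 <- x1^2 + rnorm(6, sd = 0.01)    # nearly a deterministic function of x1
X <- cbind(Intercept = 1, x1, x2)
kappa(X, exact = TRUE)              # large condition number flags instability
cor(x1, x2)                         # very high, about 0.98
```

Large condition numbers mean small perturbations in the data (like the 59.68 value in row 6 above) can move the coefficient estimates a long way, which is exactly the instability observed in this thread.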
Re: [R] [R-sig-ME] lme nesting/interaction advice
Hi Rolf, On Tue, May 13, 2008 at 1:59 PM, Rolf Turner [EMAIL PROTECTED] wrote: in response to Doug Bates' useful tutorial... Thanks very much for your long, detailed, patient, and lucid response to my cri de coeur. That helps a *great* deal. Hear Hear! snip One point that I'd like to spell out very explicitly, to make sure that I'm starting from the right square: The model that you start with in the Machines/Workers example is fm1 <- lmer(score ~ Machine + (1|Worker) + (1|Machine:Worker), Machines) My understanding is that this is the ``only'' model that could be fitted by the Package-Which-Must-Not-Be-Named. I.e. this package *could not* fit the second, more complex model: fm2 <- lmer(score ~ Machine + (Machine|Worker), Machines) (At least not directly.) Can you (or someone) confirm that I've got that right? Compare:

## R
m4
Linear mixed model fit by REML
Formula: score ~ Machine + (0 + Machine | Worker)
   Data: Machines
   AIC   BIC logLik deviance REMLdev
 228.3 248.2 -104.2    216.6   208.3
Random effects:
 Groups   Name     Variance Std.Dev. Corr
 Worker   MachineA 16.64098 4.07934
          MachineB 74.39558 8.62529  0.803
          MachineC 19.26646 4.38936  0.623 0.771
 Residual           0.92463 0.96158
Number of obs: 54, groups: Worker, 6

## The-Package
proc mixed data = machines;
  class worker machine;
  model score = machine / solution;
  random machine / subject = worker type = un;
run;

Covariance Parameter Estimates
 Cov Parm  Subject  Estimate
 UN(1,1)   Worker    16.6405
 UN(2,1)   Worker    28.2447
 UN(2,2)   Worker    74.3956
 UN(3,1)   Worker    11.1465
 UN(3,2)   Worker    29.1841
 UN(3,3)   Worker    19.2675
 Residual            0.9246

The two outputs report essentially the same thing. Note that e.g. UN(2,1) = 28.2447 approx= 0.803*4.07934*8.62529 (And, as usual, the fixed effects estimates match too once the contrasts and 'types' of SS for an ANOVA table are set up) UN is short for 'unstructured' - a term Doug has pointed out is not particularly fitting because the covariance matrix is symmetric positive definite.
It seems to me to be the key to why I've had such trouble in the past in grappling with mixed models in R. I.e. I've been thinking like the Package-Which-Must-Not-Be-Named --- that the simple, everything-independent model was the only possible model. Although this may well not apply to you, another area of confusion arises not so much from differences between stats packages but from differences between methods. I'm not an expert in the estimation methods but, as I understand it, classic texts describe fitting mixed models in terms of ANOVA in the OLS framework, calculating method of moments estimators for the variances of the random effects by equating observed and expected mean squares (I believe using aov and lm with an 'Error' term would fall into this category, and proc anova and proc glm would also). Starting in the 90's these methods started falling out of fashion in favor of ML/REML/GLS methods (likelihood based), which offer more flexibility in structuring both the error and random effects covariance matrices, will not produce negative variance estimates, and have other nice properties that someone more 'mathy' than me could explain. Tools like lme, lmer, proc mixed and proc glimmix fall into this category. hoping this helps, Kingsford Jones Thanks again. cheers, Rolf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
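For readers who want to try the two random-effects structures discussed above, here is a hedged sketch using the recommended nlme package and its built-in Orthodont data (the dataset and model choices are illustrative, not the Machines example itself):

```r
## "everything independent" random intercept vs. correlated random
## intercept and slope, fit by REML with nlme::lme
library(nlme)
fm1 <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)
fm2 <- lme(distance ~ age, random = ~ age | Subject, data = Orthodont)
anova(fm1, fm2)   # likelihood-based comparison of the two covariance structures
VarCorr(fm2)      # variances plus the intercept-slope correlation
```

The second fit estimates the full symmetric positive-definite covariance matrix for the subject-level effects, which is the "unstructured" (type = un) specification discussed in the thread.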
Re: [R] lme nesting/interaction advice
On Fri, May 9, 2008 at 4:04 AM, Federico Calboli [EMAIL PROTECTED] wrote: Note that random can be a list: a one-sided formula of the form ~x1+...+xn, or a pdMat object with a formula (i.e. a non-NULL value for formula(object)), or a list of such formulas or pdMat objects. If you can translate that into *informative* English I'd be grateful. I have the Pinheiro and Bates book under my nose now, and I suspect it's rather more extensive than the helpfile, but it's still nowhere close to providing a straightforward answer to my question. Federico, I think you'll be more likely to receive the type of response you're looking for if you formulate your question more clearly. The inclusion of commented, minimal, self-contained, reproducible code (as is requested at the bottom of every email sent by r-help) is an effective way to clarify the issues. Also, when asking a question about fitting a model it's helpful to describe the specific research questions you want the model to answer. I'll offer my interpretation of your study design so you can see where questions might arise. It sounds like you have protein measures (on how many units?) at various levels of 'male' (which at first you described as presence/absence, but later as continuous - also you described 'male' as fixed and continuous but then entered it in the formula as though it were a random grouping factor), within the second level of 'selection' (e.g. large1), within the first level of selection (e.g. large), within a random block (what are the blocks?) within a random month. Is this right -- multiple observations within 4 levels of nesting - some of which are random and some fixed? Finally, I'll point out that there's an R list dedicated to mixed models, with a particular focus on the lmer function, which might be the right tool for your analyses ( https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models ). Kingsford Jones Cheers, Federico -- Federico C. F.
Calboli Department of Epidemiology and Public Health Imperial College, St Mary's Campus Norfolk Place, London W2 1PG Tel +44 (0)20 7594 1602 Fax (+44) 020 7594 3193 f.calboli [.a.t] imperial.ac.uk f.calboli [.a.t] gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Which gls models to use?
Tom, I've never used lm.gls, but a quick look at the help page suggests that, unlike gls, it doesn't have built-in functionality to estimate parameters to structure the response/error covariance matrix. I believe it assumes the covariance matrix is known (hopefully someone will correct me if I'm wrong about that). So, building off the answer I sent to you on April 29 (pasted below rather than hyper-linked because parts of the conversation appear to be missing from the archive), if y ~ N(XB, Sigma) where Sigma = sigma^2(Lambda) and Lambda is decomposed into VCV, where V is diagonal and describes a variance structure and C has 1's on the diagonal and the off-diagonals describe the correlation of the errors, then the 'weights' argument to gls will allow you to estimate a variance structure for V (see ?varClasses) and the 'correlation' argument allows you to estimate the correlation structure for C (see ?corClasses). You'll notice the first corClass listed on the help page is corAR1 (there's also corCAR1 if you don't have discrete lags). See example(gls) for an example of how it's used, and Pinheiro and Bates (2000) for detailed examples. Frank Harrell has commented on this list that gls is underused ( http://tolstoy.newcastle.edu.au/R/help/06/08/32487.html ). Given the fact that it can drastically reduce the constraints of the constant variance and independence assumptions of the ubiquitous linear model, I agree. good luck, Kingsford Jones Discussion on weights in gls from 2008-Apr-29 Thanks Kingsford! I will cc r-help. varPower() works for me. I want to use lm because predict.gls does not give lwr and upr values, and also the bptest and ncv.test only work with lm models... 
On 4/29/08, Kingsford Jones [EMAIL PROTECTED] wrote: hi Tom, Basically, a simple way to model non-constant variance when the variance increases (the common case) or decreases with the conditional mean of your response is to use weights = varPower(), and gls will estimate the parameter \alpha that defines the power relationship between the regression line and the error variances. If for some reason you wanted to use lm rather than gls for your final model, then after estimating the gls model you could supply a weights argument to lm of the form: weights = fitted(your.gls.model)^(-2*the.alpha.estimate.from.the.gls.model). and you should get the same (or very close) estimates as from the gls model but have a model of class 'lm'. There are many different ways to model non-constant variance and you'll need to choose one that is appropriate for your data. If the model I described isn't appropriate then you should look at Ch 5 of PB to learn about the other varFunc classes. good luck, Kingsford ps - would you mind forwarding to r-help in case others have the same question. On Tue, Apr 29, 2008 at 3:24 PM, tom soyer [EMAIL PROTECTED] wrote: Thanks Kingsford! What are the varClasses? I don't know how to use them. If I choose varPower(), then does it mean the function would generate the weights, w, so that w gives the most likely explanation of the relationship between the variance and the independent variable? On 4/29/08, Kingsford Jones [EMAIL PROTECTED] wrote: On Tue, Apr 29, 2008 at 6:20 AM, tom soyer [EMAIL PROTECTED] wrote: Hi, I would like to use a weighted lm model to reduce heteroscedasticity. I am wondering if the only way to generate the weights in R is through the laborious process of trial and error by hand. Does anyone know if R has a function that would automatically generate the weights needed for lm?
Hi Tom, The 'weights' argument to the 'gls' function in the nlme package provides a great deal of flexibility in estimating weighting parameters and model coefficients. For example, if you want to model monotonic heteroscedasticity by estimating the weights $E(Y)^{-2\alpha}$, you can use the varPower variance function class. E.g., something like f1 <- gls(y ~ x1 + x2, data = your.data, weights = varPower()) will estimate the regression coefficients and alpha parameter together via maximum likelihood. (note that the usual specification for varPower is varPower(form = ~ your.formula), but by default the mean is used. See Ch 5 of the Pinheiro and Bates Mixed-effects Models book for details) Kingsford Jones -- Tom On Fri, May 9, 2008 at 8:05 AM, tom soyer [EMAIL PROTECTED] wrote: Hi, I need to correct for ar(1) behavior of my residuals of my model. I noticed that there are multiple gls models in R. I am wondering if anyone has experience in choosing between gls models. For example, how should one decide whether to use lm.gls in MASS, or gls in nlme
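To make the 'correlation' argument concrete, here is a hedged sketch fitting AR(1) errors with nlme::gls on simulated data; all names and parameter values are illustrative:

```r
## gls with an AR(1) error structure, as suggested above
library(nlme)
set.seed(1)
n <- 50
e <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))  # AR(1) errors
dat <- data.frame(t = 1:n)
dat$y <- 0.1 * dat$t + e
f1 <- gls(y ~ t, data = dat, correlation = corAR1(form = ~ t))
intervals(f1)   # confidence intervals for the coefficients and the AR(1) Phi
```

The estimated Phi parameter reported by intervals() is the lag-1 correlation of the errors; if its interval is well away from zero, the independence assumption of ordinary lm was indeed being violated.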
Re: [R] predict lmer
One question that arises is: at what level is the prediction desired? Within a given ID:TRKPT2 level? Within a given ID level? At the marginal level (which Bert's code appears to produce)? Also, there is the question: how confident can you be in your predictions? This thread discusses possible ways to get prediction intervals: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q2/thread.html#841 Finally, why assume a Poisson error distribution for a binary response? Kingsford Jones On Wed, May 7, 2008 at 10:13 AM, Bert Gunter [EMAIL PROTECTED] wrote: Sorry, my reply below may be too terse. You'll need to also construct the appropriate design matrix to which to apply the fixef() results. If newDat is a data.frame containing **exactly the same named regressor and response columns** as your original vdata dataframe, and if me.fit.of is your fitted lmer object as you have defined it below, then model.matrix(terms(me.fit.of),newDat) %*% fixef(me.fit.of) gives your predictions. Note that while the response column in newDat is obviously unnecessary for prediction (you can fill it with 0's, say), it is nevertheless needed for model.matrix to work. This seems clumsy to me, so there may well be better ways to do this, and **I would appreciate suggestions for improvement.*** Cheers, Bert -Original Message- From: bgunter Sent: Wednesday, May 07, 2008 9:53 AM To: May, Roel; r-help@r-project.org Subject: RE: [R] predict lmer ?fixef gets you the coefficient vector, from which you can make your predictions.
-- Bert Gunter Genentech -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of May, Roel Sent: Wednesday, May 07, 2008 7:23 AM To: r-help@r-project.org Subject: [R] predict lmer Hi, I am using lmer to analyze habitat selection in wolverines using the following model: (me.fit.of <- lmer(USED ~ 1 + STEP + ALT + ALT2 + relM + relM:ALT + (1|ID) + (1|ID:TRKPT2), data=vdata, control=list(usePQL=TRUE), family=poisson, method=Laplace)) Here, the habitat selection is calculated using a so-called discrete choice model where each used location has a certain number of alternatives which the animal could have chosen. These sets of locations are captured using the TRKPT2 random grouping. However, these sets are also clustered over the different individuals (ID). USED is my binary dependent variable, which is 1 for used locations and zero for unused locations. The others are my predictors. I would like to predict the model fit at different values of the predictors, but does anyone know whether it is possible to do this? I have looked around at the R-sites and in help but it seems that there doesn't exist a predict function for lmer??? I hope someone can help me with this; point me to the right functions or tell me to just forget it. Thanks in advance! Cheers Roel Roel May Norwegian Institute for Nature Research Tungasletta 2, NO-7089 Trondheim, Norway __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
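Bert's model.matrix trick can be checked against predict() in the plain lm case, where both routes are available; here is a base-R sketch using the built-in mtcars data (the filler response column is included only so the model frame can be built, as Bert notes):

```r
## manual prediction: design matrix for new data times the coefficient vector
fit <- lm(mpg ~ wt + hp, data = mtcars)
newDat <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150), mpg = 0)  # mpg is a filler
pred_manual <- drop(model.matrix(terms(fit), newDat) %*% coef(fit))
pred_builtin <- predict(fit, newdata = newDat)
all.equal(unname(pred_manual), as.numeric(pred_builtin))  # TRUE
```

For a mixed model the same design-matrix mechanics give the prediction from the fixed effects alone (the marginal level); predictions at the ID or ID:TRKPT2 level would additionally need the corresponding random-effect estimates added in.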
Re: [R] pwr package
see ?library You need to load the package before R can access its functions. Kingsford Jones On Mon, May 5, 2008 at 12:18 PM, [EMAIL PROTECTED] wrote: Hello, I'm a new user of the R environment. I need to do some power analysis. For this purpose, I installed the pwr package from the R window, but unfortunately something went wrong. The installation of the package was successful (I got a message saying so in R) but when I enter a function of that package it says that the function does not exist! I appreciate your help, Sincerely, Samar Mouchawrab __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] data transformation
Hi Christian, Here's a way using the reshape package:

dfr
   site lat lon spec1 spec2 spec3 spec4
1 site1  10  11     1     0     1     0
2 site2  20  21     1     1     1     0
3 site3  30  31     0     1     1     1

library(reshape)
dfr <- melt(dfr[, -1], id=1:2, variable_name='species')
dfr <- dfr[dfr$value > 0, ]
dfr
   lat lon species value
1   10  11   spec1     1
2   20  21   spec1     1
5   20  21   spec2     1
6   30  31   spec2     1
7   10  11   spec3     1
8   20  21   spec3     1
9   30  31   spec3     1
12  30  31   spec4     1

The 'value' variable is not interesting here, but if you had counts rather than presence/absence it could be. best, Kingsford Jones On Fri, May 2, 2008 at 2:27 PM, Christian Hof [EMAIL PROTECTED] wrote: Dear all, how can I, with R, transform a presence-absence (0/1) matrix of species occurrences into a presence-only table (3 columns) with the names of the species (1st column), the lat information of the sites (2nd column) and the lon information of the sites (3rd column), as given in the below example? Thanks a lot for your help! Christian my dataframe:
 site lat lon spec1 spec2 spec3 spec4
site1  10  11     1     0     1     0
site2  20  21     1     1     1     0
site3  30  31     0     1     1     1
my desired new dataframe:
species lat lon
  spec1  10  11
  spec1  20  21
  spec2  20  21
  spec2  30  31
  spec3  10  11
  spec3  20  21
  spec3  30  31
  spec4  30  31
-- Christian Hof, PhD student Center for Macroecology Evolution University of Copenhagen www.macroecology.ku.dk Biodiversity Global Change Lab Museo Nacional de Ciencias Naturales, Madrid www.biochange-lab.eu mobile ES .. +34 697 508 519 mobile DE .. +49 176 205 189 27 mail .. [EMAIL PROTECTED] mail2 .. [EMAIL PROTECTED] blog .. www.vogelwart.de __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
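For the record, the same reshaping can be done in base R without the reshape package, using which(..., arr.ind = TRUE) on the 0/1 block; the data are recreated from the question and the object names are illustrative:

```r
## presence-absence matrix to presence-only long format, base R only
dfr <- data.frame(site = paste0("site", 1:3),
                  lat = c(10, 20, 30), lon = c(11, 21, 31),
                  spec1 = c(1, 1, 0), spec2 = c(0, 1, 1),
                  spec3 = c(1, 1, 1), spec4 = c(0, 0, 1))
m <- as.matrix(dfr[, grep("^spec", names(dfr))])
idx <- which(m == 1, arr.ind = TRUE)      # (row, col) of every presence
out <- data.frame(species = colnames(m)[idx[, "col"]],
                  lat = dfr$lat[idx[, "row"]],
                  lon = dfr$lon[idx[, "row"]])
out <- out[order(out$species, out$lat), ]
out   # 8 rows, matching the desired data frame in the question
```

The arr.ind = TRUE trick gives the row and column of each presence directly, so the site coordinates and species names can be looked up by index without any melting.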
Re: [R] Making a map in R?
There are many ways to do this, depending on what type of spatial data you are working with and what you want your maps to look like. The sp package provides S4 classes and methods for working with GIS data. Two likely candidates for importing data are the rgdal and maptools packages. See the Spatial Task View: http://cran.r-project.org/web/packages/rgdal/index.html Also, there is a r-sig for spatial data: https://stat.ethz.ch/mailman/listinfo/r-sig-geo Kingsford Jones On Thu, May 1, 2008 at 6:56 AM, stephen sefick [EMAIL PROTECTED] wrote: Does anyone know of a package to make a map from GIS data, and/or would it be easier in one of the free GIS programs. I would like to make a map of the savannah river area with our sampling locations. thanks stephen -- Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Making a map in R?
I sent the wrong link for the spatial Task View. Here it is http://cran.r-project.org/web/views/Spatial.html On Thu, May 1, 2008 at 8:21 AM, Kingsford Jones [EMAIL PROTECTED] wrote: There are many ways to do this, depending on what type of spatial data you are working with and what you want your maps to look like. The sp package provides S4 classes and methods for working with GIS data. Two likely candidates for importing data are the rgdal and maptools packages. See the Spatial Task View: http://cran.r-project.org/web/packages/rgdal/index.html Also, there is a r-sig for spatial data: https://stat.ethz.ch/mailman/listinfo/r-sig-geo Kingsford Jones On Thu, May 1, 2008 at 6:56 AM, stephen sefick [EMAIL PROTECTED] wrote: Does anyone know of a package to make a map from GIS data, and/or would it be easier in one of the free GIS programs. I would like to make a map of the savannah river area with our sampling locations. thanks stephen -- Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] efficiency profiling?
On Wed, Apr 30, 2008 at 4:33 PM, Duncan Murdoch [EMAIL PROTECTED] wrote: On 30/04/2008 6:59 PM, esmail bonakdarian wrote: snip Is there a good collection of hints/suggestions for R language idioms in terms of efficiency? snip See ?Rprof for the tool. For the tips, I think you just need to hang around here a while. I don't know of a nice collection (but I'm sure there are several). Duncan Murdoch ;-) here's one: https://stat.ethz.ch/pipermail/r-help/2005-October/080991.html Kingsford Jones __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
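A tiny self-contained illustration of ?Rprof in action. The function names here are invented for the example; the lesson it surfaces (avoid growing objects inside loops) is one of the classic tips collected at the link above:

```r
grow_fun <- function(n) {            # grows a vector element-by-element (slow)
  x <- numeric(0)
  for (i in 1:n) x <- c(x, i^2)
  x
}
vec_fun <- function(n) (1:n)^2       # vectorized equivalent

out <- tempfile()
Rprof(out)                           # start the sampling profiler
invisible(grow_fun(5e4))
Rprof(NULL)                          # stop profiling
head(summaryRprof(out)$by.self, 3)   # top time consumers, by self time

## Sanity check: same answer, very different cost
stopifnot(identical(grow_fun(1e3), vec_fun(1e3)))
```

The $by.self table typically shows c() and the memory machinery dominating, which points straight at the growing-vector anti-pattern.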
Re: [R] function to generate weights for lm?
On Tue, Apr 29, 2008 at 6:20 AM, tom soyer [EMAIL PROTECTED] wrote: Hi, I would like to use a weighted lm model to reduce heteroscedasticity. I am wondering if the only way to generate the weights in R is through the laborious process of trial and error by hand. Does anyone know if R has a function that would automatically generate the weights needed for lm? Hi Tom, The 'weights' argument to the 'gls' function in the nlme package provides a great deal of flexibility in estimating weighting parameters and model coefficients. For example, if you want to model monotonic heteroscedasticity by estimating weights proportional to E(Y)^(-2*alpha), you can use the varPower variance function class. E.g., something like f1 <- gls(y ~ x1 + x2, data = your.data, weights = varPower()) will estimate the regression coefficients and the alpha parameter together via maximum likelihood. (Note that the usual specification is varPower(form = ~ your.formula), but by default the fitted values are used. See Ch 5 of the Pinheiro and Bates mixed-effects models book for details.) Kingsford Jones __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
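To make the suggestion concrete, here is a self-contained sketch with simulated data (nlme ships with R). The data and parameter values are invented so that the error standard deviation grows proportionally with the mean, i.e. the true power is about 1:

```r
library(nlme)
set.seed(1)
x1 <- runif(200, 1, 10)
mu <- 2 + 3 * x1
y  <- mu + rnorm(200, sd = 0.4 * mu)   # error sd proportional to the mean
d  <- data.frame(y, x1)

## varPower() with no formula uses a power of the fitted values by default
f1 <- gls(y ~ x1, data = d, weights = varPower())

summary(f1)$tTable              # coefficients under the estimated weights
coef(f1$modelStruct$varStruct)  # estimated power; should be near 1 here
```

Comparing this fit with an unweighted lm(y ~ x1, data = d) shows how the weighting down-weights the high-variance observations at large fitted values.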
Re: [R] nested anova and multiple comparisons
Hi Stephen, On Sat, Apr 26, 2008 at 10:29 AM, Stephen Cole [EMAIL PROTECTED] wrote: ...snip... I have managed to accomplish my first two goals by analyzing the data as a 3-level nested ANOVA with the following code mod1 <- aov(Count ~ Region/Location/Site, data = data) This allows me to get the MS for an ANOVA table. However R does not compute the correct F-statistics (because it uses the residual MS for the denominator in all calculations, which it shouldn't), so I computed the F-stats and variance components by hand. R's just doing what it was asked to do. The fact that it used the residual MS isn't a surprise given your code. From reading the help guide I learned about and tried using the Error(Region/Location/Site) term, but then I can only get MS and no F-statistics, and still had to compute the rest by hand. Yes, if you want to use Method-of-Moments estimation to solve for expected mean squares for the variance components and calculate F-stats with the 'correct' MS, then specifying error strata is reasonable for balanced data (assuming you've met model assumptions). However, I believe most statisticians these days are more inclined to use (Restricted) Maximum Likelihood estimation (e.g. using lme or lmer) because of the added flexibility it provides (also it won't produce negative variance estimates when the mean squared error is larger than the mean square between the groups of interest). As for why you are only getting MS and no F-stats, it's hard to say without a reproducible example (see the posting guide). In my experience this will occur when there are insufficient degrees of freedom for calculating an F-stat at a given level. My problem now is that I would like to use TukeyHSD for multiple comparisons. However, since R is unable to compute the correct F-statistics in this nested design I also think it is using the wrong MS and df in calculating Tukey's q. For example, when I use Again, it's not that R is unable to, it's that you asked R not to. 
TukeyHSD(mod1, "Region") I will get values; however, I do not think they have been calculated correctly. Furthermore, when I use the Error(Region/Location/Site) term I can then no longer use TukeyHSD, as I get the error message that there is no applicable method for this object. Looking at the methods available for TukeyHSD shows that it will work for an object of class aov: methods(TukeyHSD) [1] TukeyHSD.aov But ?aov states: Value: An object of class 'c(aov, lm)' or for multiple responses of class 'c(maov, aov, mlm, lm)' or for multiple error strata of class 'aovlist'. There are 'print' and 'summary' methods available for these. So your model with error strata is of class aovlist, not aov. I am just wondering if there is any way to use multiple comparisons with R in a nested design like mine. I'm not sure, but the package 'multcomp' might be of help. I have thought about using lme or lmer, but if I understand them right, with a balanced design I should be able to get the same result using aov. Even with balanced data, you don't get the exact same thing with aov. As mentioned above, lme and lmer use different estimation methods. One of the advantages of using (RE)ML is a lot more flexibility in how you specify the structure of random effects covariance matrices, and (at least with lme) you have a lot of flexibility in how you structure the residual covariance matrix as well. This is particularly useful when you are not meeting assumptions of constant variance and/or independence of observations (which seems to be the rule rather than the exception with ecological data with a spatial component, such as you appear to have). The theory behind the analyses you want to do is quite complex, with many potential pitfalls. If at all possible, I highly recommend finding the help of someone who is an expert in these types of models (known as mixed, hierarchical, multilevel, random effects, variance components, nested ANOVA, etc.). 
best, Kingsford Jones Thanks __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
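To illustrate the (RE)ML route suggested above, a hedged sketch using lme from nlme (bundled with R). The data are simulated to mimic the Region/Location/Site design; all names, effect sizes, and variance components are invented:

```r
library(nlme)
set.seed(42)
d <- expand.grid(rep = 1:5, Site = 1:2, Location = 1:3, Region = c("A", "B"))
## Code the grouping factors so nesting is explicit in the labels
d$Location <- factor(paste(d$Region, d$Location, sep = "."))
d$Site     <- factor(paste(d$Location, d$Site, sep = "."))
d$Count    <- 10 + 2 * (d$Region == "B") +
  rnorm(nlevels(d$Location), sd = 1)[d$Location] +  # Location effects
  rnorm(nlevels(d$Site),     sd = 1)[d$Site] +      # Site-within-Location effects
  rnorm(nrow(d))                                    # observation-level noise

## Fixed Region effect; random intercepts for Location and Site-in-Location
m1 <- lme(Count ~ Region, random = ~ 1 | Location/Site, data = d)
anova(m1)      # F-test for the fixed Region effect
VarCorr(m1)    # REML variance components: Location, Site, residual
```

For multiple comparisons on the fixed effect, multcomp's glht() accepts lme fits, which sidesteps the TukeyHSD/aovlist limitation discussed above.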
Re: [R] Documentation General Comments
I just read through this thread and I didn't see the R Language Definition mentioned. As with An Introduction to R it can be accessed -- at least in my Windows GUI -- via the menu bar: Help -> Manuals (in PDF). If An Introduction to R is too basic, then the Language Definition should be a good place to look for more details on R objects (Ch 2). However, An Introduction to R does include authoritative introductions to the data types mentioned by the original poster: factors (Ch 4), arrays and matrices (Ch 5), and lists and data frames (Ch 6). That said, I agree that learning efficiency could be improved by augmenting the manuals with tables similar to the table 2.1 that was referenced earlier in the thread (aside: are functions, or even lists, really Data Objects?). Of course, as pointed out by Duncan, we are collaborators not consumers, so if I think there should be more tables in the documents then the onus is on me to try to get my ideas incorporated (see http://wiki.r-project.org/rwiki/doku.php?id=misc:rpatch ). Kingsford Jones On Thu, Apr 24, 2008 at 9:37 AM, Duncan Murdoch [EMAIL PROTECTED] wrote: On 4/24/2008 12:08 PM, Martin Maechler wrote: Hmm, KeBe == Beck, Kenneth (STP) [EMAIL PROTECTED] on Thu, 24 Apr 2008 10:12:19 -0500 writes: KeBe OK I've spent a lot of time with the core KeBe documentation, and I never found anything as simple as KeBe their table 2.1, which elucidated the difference KeBe between a vector, matrix and array first, then the KeBe higher level structures, frame and list. Maybe I'm KeBe not a good searcher, but believe me for every initial KeBe posting I submit to this group, I have spent hours KeBe trying to find the answer elsewhere. And, as you KeBe state, maybe I am now deluded by that presentation, KeBe maybe it is not this simple! Well, I get the impression that you've never read the manual Introduction to R (or some good book such as Peter Dalgaard's) but have directly jumped into reading help() pages ??? That's not correct. 
Kenneth started the thread (on Monday) saying: The basic tutorial Introduction to R is so basic, it hardly helps at all, then digging through documentation is really an exercise in frustration. Duncan Murdoch Maybe a good idea would be to improve the Introduction to R rather than thinking of misusing the help() collection {which is the reference manual, not the user manual !!} by making it easy to understand (and consequently less precise) ?? Patches (well reflected ..) to the Introduction are quite welcome, indeed. The (development) source is always available at https://svn.r-project.org/R/trunk/doc/manual/R-intro.texi (and yes, the source does look a bit less user-friendly than its PDF output, e.g. http://cran.r-project.org/doc/manuals/R-intro.pdf or its daily updated HTML output at http://stat.ethz.ch/R-manual/R-devel/doc/manual/R-intro.html ) Regards, Martin KeBe Look at the help for data.frame. VERY terse KeBe explanation, with not a good comparison to the other KeBe data types. Then, look at the titles list. Where is a KeBe topic for data types? Every other programming KeBe language I have used (C++, Pascal, SAS, Java) has a KeBe basic chapter in the documentation that goes over data KeBe types, what arrays are, higher level structures, etc. KeBe When I typed help.search("data type") I get the KeBe following: KeBe Help files with alias or concept or title matching KeBe 'data type' using fuzzy matching: KeBe character-class(methods) Classes Corresponding to KeBe Basic Data Types sqlTypeInfo(RODBC) Request KeBe Information about DataTypes in an ODBC Database KeBe Looking for the term character-class(methods) yields KeBe nothing. I don't think that is what I want! KeBe Given all this complaining, I actually have completed KeBe several nice projects using R; it is an impressive KeBe package. Somehow, though, we need to make the KeBe documentation better. 
KeBe -Original Message- From: Duncan Murdoch KeBe [mailto:[EMAIL PROTECTED] Sent: Thursday, April KeBe 24, 2008 9:51 AM To: Beck, Kenneth (STP) Cc: Bert KeBe Gunter; r-help@r-project.org Subject: Re: [R] KeBe Documentation General Comments KeBe On 4/24/2008 10:22 AM, Beck, Kenneth (STP) wrote: Agree that terseness is good, but I also agree with other posters that better cross referencing or maybe an index of synonyms would be good. So far, the best suggestion is the pdf at this link (http://www.medepi.net/epir/epir_chap02.pdf). Is there a way to pop at least part of this into the R-base help page
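For anyone who reached this thread hunting for the vector/matrix/array/list/data-frame distinction being discussed, a few lines at the prompt show it more directly than any table (the object names are arbitrary):

```r
v <- c(1, 2, 3)                      # atomic vector: one type, one dimension
m <- matrix(1:6, nrow = 2)           # matrix: one type, two dimensions
a <- array(1:24, dim = c(2, 3, 4))   # array: one type, any number of dimensions
l <- list(x = 1, y = "a", z = TRUE)  # list: elements may differ in type
d <- data.frame(x = 1:2, y = c("a", "b"))  # data frame: list of equal-length columns

str(l)    # str() is often the quickest way to see what an object really is
class(d)  # "data.frame"
```

Running str() on each of these, as An Introduction to R suggests, is a fast substitute for the comparison table the thread asks for.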