[R] Grouping distances
Hi R-listers, I am trying to group my HTL data: a column of distances to the HTL in the data set turtlehatch. I would like to create an index of distances (0-5m, 6-10, 11-15, 16-20, ... up to 60) and then create a new file with this HTLIndex as a column. So far I have gotten this far: HTL.index <- function(values, weights=c(0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60)) { hope <- values * weights; return(apply(hope, 1, sum)/apply(values, 1, sum)) } write.csv(turtlehatch, HTLIndex, row.names=FALSE) But I do not seem to be able to create a new column in a new file. Please advise, Jean -- View this message in context: http://r.789695.n4.nabble.com/Grouping-distances-tp4632985.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
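The grouping Jean describes is usually done with base R's cut() rather than a weighted sum. A minimal sketch, using an invented turtlehatch data frame with a hypothetical Distance column, since the real data isn't shown:

```r
# Invented stand-in for the poster's data
turtlehatch <- data.frame(Distance = c(2, 7, 13, 41, 60))

# cut() assigns each distance to one of the 12 classes (0-5], (5-10], ..., (55-60];
# labels = FALSE returns the class number 1..12 as the index
turtlehatch$HTLIndex <- cut(turtlehatch$Distance,
                            breaks = c(0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60),
                            include.lowest = TRUE, labels = FALSE)

turtlehatch$HTLIndex
# with the sample data above: 1 2 3 9 12

# write the data frame, with the new column, to a new file
write.csv(turtlehatch, file.path(tempdir(), "turtlehatch_indexed.csv"),
          row.names = FALSE)
```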
Re: [R] Grouping distances
Hello, It's easy to create a new column. Since you haven't said where, nor the type of data structure you are using, I'll try to answer both. Suppose that 'x' is a matrix. Then: newcolumn <- newvalues; x2 <- cbind(x, newcolumn) # new column added to x, result in x2. Suppose that 'y' is a data.frame. Then the same would do it, or: y$newcolumn <- newvalues. Now, I believe that the new values come from your function. If so, you must assign the function value to some variable outside the function: htlindex <- HTL.index(...etc...) # 'htlindex' is the 'newvalues' above. Two extra notes. One, rowSums() does what your apply() instructions do. Two, you first multiply and then divide to give 'weights'; I think this is just an example, not the real function. Hope this helps, Rui Barradas On 11-06-2012 07:01, Jhope wrote: [original message quoted in full above]
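Putting Rui's advice together end to end, a minimal sketch; the data frame and the index function here are invented stand-ins for illustration only:

```r
# Invented stand-ins for the poster's objects
turtlehatch <- data.frame(Distance = c(2, 7, 13))
HTL.index   <- function(values) findInterval(values, c(0, 5, 10, 15))  # toy stand-in

# 1. assign the function's value to a variable outside the function
htlindex <- HTL.index(turtlehatch$Distance)

# 2. add it as a new column (the data.frame form: y$newcolumn <- newvalues)
turtlehatch$HTLIndex <- htlindex

# 3. write.csv takes the data first and a file name second, not two objects
write.csv(turtlehatch, file.path(tempdir(), "turtlehatch_new.csv"),
          row.names = FALSE)
```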
Re: [R] how to add a vertical line for each panel in a lattice dotplot with log scale?
sorry but I can't close this thread with a viable solution other than the following one (i.e. by defining a user function to add the lines); I understand that the problem is related to the fact that mean(log(.)) != log(mean(.)), but for some reason I can't put all that into practice inside panel.abline(...). I would appreciate it if someone can show me how (sorry, but at this point I must give up...), thank you all for the help # code start addLine <- function(a=NULL, b=NULL, v=NULL, h=NULL, ..., once=FALSE) { tcL <- trellis.currentLayout(); k <- 0; for (i in 1:nrow(tcL)) for (j in 1:ncol(tcL)) if (tcL[i,j] > 0) { k <- k+1; trellis.focus("panel", j, i, highlight = FALSE); if (once) panel.abline(a=a[k], b=b[k], v=v[k], h=h[k], ...) else panel.abline(a=a, b=b, v=v, h=h, ...); trellis.unfocus() } } dotplot(date_sampl_time_recs ~ lower_b_i | site, data=teq, scales=list(x=list(log=TRUE)), xscale.components = xscale.components.logpower, layout=c(5,1), panel = function(x,y,...) { panel.grid(h=53, v=-1, lty="dotted", col="gray"); panel.dotplot(x,y,...); medians <- median(x); panel.abline(v=medians, col.line="red", lty="dotted") } ) medie <- as.vector(tapply(teq$lower_b_i, teq$site, mean)) addLine(v=log10(medie), once=TRUE, col="blue", lty="dotted") # code end -- View this message in context: http://r.789695.n4.nabble.com/how-to-add-a-vertical-line-for-each-panel-in-a-lattice-dotplot-with-log-scale-tp4632513p4632991.html Sent from the R help mailing list archive at Nabble.com.
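One way to avoid the separate addLine() pass is to draw the per-panel line inside the panel function itself, using lattice's panel.number() to pick the right group mean. A sketch against invented data (only the variable names lower_b_i, site, and medie follow the post; everything else is made up): the means are computed once on the raw scale, and log10 is applied at plotting time because the panel works in log10 coordinates, which is exactly why mean(log(x)) != log(mean(x)) bites here.

```r
library(lattice)

set.seed(42)
# Invented stand-in data; the post's real data frame is 'teq'
teq <- data.frame(lower_b_i = rexp(100, rate = 0.1) + 1,
                  recs      = factor(rep(letters[1:10], 10)),
                  site      = rep(c("A", "B"), each = 50))

# Per-site means on the raw scale, named by site
medie <- tapply(teq$lower_b_i, teq$site, mean)

dotplot(recs ~ lower_b_i | site, data = teq,
        scales = list(x = list(log = TRUE)), layout = c(2, 1),
        panel = function(x, y, ...) {
          panel.dotplot(x, y, ...)
          # panel.number() gives the current panel's index, matching 'medie';
          # log10() converts the raw mean into the panel's log10 coordinates
          panel.abline(v = log10(medie[panel.number()]),
                       col.line = "blue", lty = "dotted")
        })
```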
Re: [R] Determinant and inverse using cholsky parameter
Thanks to all who replied to this topic. I have the L (Cholesky-decomposed matrix) as a vector listed column-wise, as mentioned in the article http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.494, and am looking for functions to convert the vector into the inverse matrix (chol2inv, mentioned by Kjetil, takes the matrix form) and the corresponding determinant value (the square of the product of the diagonal elements of L). I am going to use the function in a maximum likelihood optimization routine for estimating variables in a variance-covariance matrix. Thanks and regards, B.Nataraj -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Özgür Asar Sent: Friday, June 08, 2012 7:38 PM To: r-help@r-project.org Subject: Re: [R] Determinant and inverse using cholsky parameter Hi, Isn't the Cholesky decomposition A = L (L)^T, where T stands for transpose and L is the Cholesky factor of A? You say you have the Cholesky decomposition, isn't it L (above)? A <- L %*% t(L); det(A); solve(A) would be your answer. Hope this helps, Ozgur -- View this message in context: http://r.789695.n4.nabble.com/Determinant-and-inverse-using-cholsky-parameter-tp4632769p4632808.html Sent from the R help mailing list archive at Nabble.com.
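A sketch of what the poster seems to need, using only base R: rebuild L from a column-wise vector of its lower triangle, then get the determinant and inverse without re-factorizing. Note that chol2inv() expects the upper-triangular factor, i.e. t(L). The matrix A below is a toy example used to generate a reference answer:

```r
# Toy positive-definite matrix, used only to produce a reference answer
A <- matrix(c(4, 2, 2,
              2, 3, 1,
              2, 1, 5), nrow = 3)
L <- t(chol(A))                 # lower-triangular Cholesky factor: A = L %*% t(L)

# Suppose, as in the post, only the column-wise vector of L's lower triangle is stored
lvec <- L[lower.tri(L, diag = TRUE)]

# Rebuild L: logical subsetting with lower.tri() is column-major,
# matching the column-wise storage the poster describes
n  <- 3
L2 <- matrix(0, n, n)
L2[lower.tri(L2, diag = TRUE)] <- lvec

detA <- prod(diag(L2))^2        # det(A) = (product of L's diagonal)^2
invA <- chol2inv(t(L2))         # chol2inv() takes the upper factor t(L)

all.equal(detA, det(A))         # TRUE
all.equal(invA, solve(A))       # TRUE
```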
Re: [R] compute Mcdonald's omega ω
Hi, I keep getting this error when I use omega(my.data): Error in cor(m, use = "pairwise") : 'x' must be numeric. I'm quite sure my data is numeric, as I have tried several methods such as opening the csv in Notepad to make sure the numbers are not strings, and I also tried SPSS and made sure the numbers are set to numeric. Can somebody advise me what is wrong? Thanks. On 11 June 2012 04:17, William Revelle li...@revelle.net wrote: Dear codecat, You can get the most recent version of psych from CRAN. The current version is 1.2.4. Then, the help page for omega should be of use: library(psych); ?omega. In addition, try the vignette for the psych package. Or, as of today, there is a more detailed instruction for newbies on using the psych package and R to find omega: http://personality-project.org/r/tutorials/R_for_omega.pdf Let me know if any of this helps. Bill On Jun 10, 2012, at 4:00 PM, codec cat wrote: Dear all, I am a newbie to R and I would appreciate it very much if someone can give me some advice on this. Please note that I am not a programmer, so some of the questions might sound really stupid. I would like to compute McDonald's omega using R; I'm aware I can use the omega function in the psych package, but I'm really not sure how to do it. I have read these two articles that explain omega - http://personality-project.org/r/html/omega.html and http://personality-project.org/r/r.omega.html - but I'm still not sure how to do it. Can someone correct where I have gone wrong? I'm following the instructions from this site, and I have downloaded the psych package - http://personality-project.org/r/ 1. Load and read the data: datafilename <- file.choose() # use the OS to find the file; person.data <- read.table(datafilename, header=TRUE) # read the data file 2. Use the code examples from http://personality-project.org/r/r.omega.html: copy and paste the whole code from this website into the R console. I know I must have understood Step 2 incorrectly, but I am really not sure what to do with the omega function. Can someone give me more specific help on this please? Thanks. William Revelle http://personality-project.org/revelle.html Professor http://personality-project.org Department of Psychology http://www.wcas.northwestern.edu/psych/ Northwestern University http://www.northwestern.edu/ Use R for psychology http://personality-project.org/r It is 5 minutes to midnight http://www.thebulletin.org
Re: [R] Grouping distances
Thank you Rui, I am trying to create a column in the data file turtlehatch.csv Saludos, Jean -- View this message in context: http://r.789695.n4.nabble.com/Grouping-distances-tp4632985p4632989.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] mgcv (bam) very large standard error difference between versions 1.7-11 and 1.7-17, bug?
Dear Martijn, You are right that the change in the summary defaults was not a sensible thing to do, and I'll change it back. However, in the meantime you can use 1.7-17 with summary(..., freq=FALSE) when you have random effects in the model (it really shouldn't make a statistically meaningful difference otherwise). I made the change because it leads to better p-value performance when parametric model components are penalized using gam's `paraPen' argument, but as you have demonstrated, this then uses a fixed-effects distribution that is not really appropriate when there are random effects present. I should probably just deal with the paraPen case in the paraPen documentation. Technically the difference here is that with freq=TRUE the covariance matrix for the model coefficient estimators/predictions is taken to be (X'X+S)^{-1} X'X (X'X+S)^{-1} \sigma^2, where X is the model matrix and S the penalty matrix for the model (Gaussian case; non-Gaussian also has weights). This is based on the fact that the coefficient estimates/predictions are \hat\beta = (X'X+S)^{-1} X'y, where y is the response data, with errors iid N(0, \sigma^2). When freq=FALSE the covariance matrix is the one that comes from the Bayesian/random-effects view of the smoothing process, and is (X'X+S)^{-1} \sigma^2. The latter is usually what you want when you have true random effects in the model (rather than smooths, which are usually not in the model as true random effects). best, Simon On 03/06/12 18:45, Martijn Wieling wrote: Dear useRs, I've run some additional analyses (see below), which strongly suggest the standard errors of the bam (and gam) function are much too low in mgcv version 1.7-17, at least when including an s(X, bs="re") term. Until this issue has been clarified, it's perhaps best to use an older version of mgcv (unfortunately, however, in earlier versions the p-value calculation of s(X, bs="re") is not correct). All analyses were conducted in R 2.15.0. My approach was the following: I created a mixed-effects regression model with a single random intercept and only linear predictors. In my view, the results using lmer (lme4) should be comparable to those of bam and gam (mgcv). This was the case when using an older version of mgcv (version 1.7-13), but it is no longer the case in version 1.7-17. In version 1.7-17, the standard errors and p-values are much lower and very similar to those of a linear model (which does not take the random-effects structure into account). The R code and results are shown below. (The results using gam are not shown, but show the same pattern.) Furthermore, note that the differences in standard errors become less severe (but still noticeable) when less data is involved (e.g., using only 500 rows as opposed to 100,000). Finally, when not including an s(X, bs="re") term, but another non-random-effect smooth, the standard errors do not appear to be structurally lower (only for some variables, and not by a great deal - see also below). With kind regards, Martijn Wieling, University of Groningen lme4 model (most recent version of lme4): modelLMER <- lmer(RefPMIdistMeanLog.c ~ SpYearBirth.z*IsAragon + SpIsMale + (1|Key), data=wrddst) # Estimate Std. Error t value # SpYearBirth.z -0.012084 0.004577 -2.640 # IsAragon 0.138959 0.010040 13.840 # SpIsMale -0.003087 0.008290 -0.372 # SpYearBirth.z:IsAragon 0.015429 0.010159 1.519 mgcv 1.7-13, default (method="REML") - almost identical to modelLMER: modelBAMold <- bam(RefPMIdistMeanLog.c ~ SpYearBirth.z*IsAragon + SpIsMale + s(Key, bs="re"), data=wrddst) # Estimate Std. Error t value Pr(>|t|) # SpYearBirth.z -0.012084 0.004578 -2.640 0.00829 ** # IsAragon 0.138959 0.010042 13.838 <2e-16 *** # SpIsMale -0.003087 0.008292 -0.372 0.70968 # SpYearBirth.z:IsAragon 0.015429 0.010160 1.519 0.12886 mgcv 1.7-17, method="REML" - standard errors greatly reduced (comparable to standard errors of an LM without random intercept): modelBAMnew <- bam(RefPMIdistMeanLog.c ~ SpYearBirth.z*IsAragon + SpIsMale + s(Key, bs="re"), data=wrddst); print(testje, cor=F) # Estimate Std. Error t value Pr(>|t|) # SpYearBirth.z -0.012084 0.001159 -10.428 <2e-16 *** # IsAragon 0.138959 0.002551 54.472 <2e-16 *** # SpIsMale -0.003087 0.002098 -1.471 0.141 # SpYearBirth.z:IsAragon 0.015429 0.002587 5.965 2.45e-09 *** lm results, standard errors comparable to mgcv 1.7-17: modelLM <- lm(RefPMIdistMeanLog.c ~ SpYearBirth.z*IsAragon + SpIsMale, data=wrddst) # Estimate Std. Error t value Pr(>|t|) # (Intercept) -0.025779 0.001653 -15.595 <2e-16 *** # SpYearBirth.z -0.011906 0.001182 -10.070 <2e-16 *** # IsAragon 0.139323 0.002603 53.531 <2e-16 *** # SpIsMale -0.003076 0.002140 -1.437
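Simon's two covariance formulas (the freq=TRUE sandwich form and the freq=FALSE Bayesian form) can be checked numerically on a toy penalized regression. This is only an illustration of the algebra, not of mgcv internals; X, S, and sigma2 below are invented:

```r
set.seed(1)
n <- 50; p <- 3
X <- cbind(1, matrix(rnorm(n * (p - 1)), n))  # toy model matrix
S <- diag(c(0, 1, 1))                         # toy penalty (intercept unpenalized)
sigma2 <- 0.25                                # toy error variance

B <- solve(crossprod(X) + S)                  # (X'X + S)^{-1}

# freq = TRUE: (X'X+S)^{-1} X'X (X'X+S)^{-1} sigma^2
V_freq  <- B %*% crossprod(X) %*% B * sigma2
# freq = FALSE: Bayesian/random-effects form (X'X+S)^{-1} sigma^2
V_bayes <- B * sigma2

# Their difference is B %*% S %*% B * sigma2, which is positive semi-definite,
# so the Bayesian variances are never smaller than the frequentist ones
round(diag(V_freq), 5)
round(diag(V_bayes), 5)
```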
Re: [R] mgcv (bam) very large standard error difference between versions 1.7-11 and 1.7-17, bug?
Hi Simon, Thanks for your reply. That is very helpful. However, in the logistic regression example, there also appears to be a large difference in the p-values and estimates when comparing summary(model) # mgcv 1.7-11 with summary(model, freq=F) # mgcv 1.7-17. These should be the same, right? But why is this not the case? The results are shown below, and are also explained in this post: http://r.789695.n4.nabble.com/mgcv-bam-very-large-standard-error-difference-between-versions-1-7-11-and-1-7-17-bug-tp4632173p4632491.html With kind regards, Martijn # call in all versions model = bam(NoStandard3 ~ te(GeoX,GeoY,WordFreqLog.z,by=OldSpeakerContrast,d=c(2,1)) + OldSpeakerContrast + PopCntLog.z + s(Word,bs="re") + s(Placename,bs="re") + s(Word,PopAvgIncomeLog.z,bs="re") + s(Word,PopAvgAgeLog_residIncome.z,bs="re") + s(Word,PopCntLog.z,bs="re") + s(Word,YearRec.z,bs="re"), data=lexdst, family=binomial, method="REML") ### mgcv, version 1.7-11 # Family: binomial # Link function: logit # Formula: NoStandard3 ~ te(GeoX, GeoY, WordFreqLog.z, by = OldSpeakerContrast, d = c(2, 1)) + OldSpeakerContrast + PopCntLog.z + s(Word, bs = "re") + s(Placename, bs = "re") + s(Word, PopAvgIncomeLog.z, bs = "re") + s(Word, PopAvgAgeLog_residIncome.z, bs = "re") + s(Word, PopCntLog.z, bs = "re") + s(Word, YearRec.z, bs = "re") # Parametric coefficients: # Estimate Std. Error z value Pr(>|z|) # (Intercept) -0.41610 0.14027 -2.966 0.00301 ** # OldSpeakerContrast1 0.44508 0.01934 23.009 <2e-16 *** # PopCntLog.z -0.11673 0.02725 -4.284 1.83e-05 *** # Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 # Approximate significance of smooth terms: # edf Ref.df Chi.sq p-value # te(GeoX,GeoY,WordFreqLog.z):OldSpeakerContrast0 54.05 69.06 223.3 <2e-16 # te(GeoX,GeoY,WordFreqLog.z):OldSpeakerContrast1 59.17 74.83 327.6 <2e-16 # s(Word) 166.41 167.84 13756.0 <2e-16 # s(Placename) 157.64 188.02 692.7 <2e-16 # s(Word,PopAvgIncomeLog.z) 137.83 160.05 829.9 <2e-16 # s(Word,PopAvgAgeLog_residIncome.z) 121.13 151.13 435.5 <2e-16 # s(Word,PopCntLog.z) 97.12 133.97 205.5 7.01e-05 # s(Word,YearRec.z) 128.98 155.27 585.7 <2e-16 # R-sq.(adj) = 0.372 Deviance explained = 32.4% # REML score = 97450 Scale est. = 1 n = 69259 ### mgcv 1.7-17, summary(model, freq=F) # Family: binomial # Link function: logit # Formula: NoStandard3 ~ te(GeoX, GeoY, WordFreqLog.z, by = OldSpeakerContrast, d = c(2, 1)) + OldSpeakerContrast + PopCntLog.z + s(Word, bs = "re") + s(Placename, bs = "re") + s(Word, PopAvgIncomeLog.z, bs = "re") + s(Word, PopAvgAgeLog_residIncome.z, bs = "re") + s(Word, PopCntLog.z, bs = "re") + s(Word, YearRec.z, bs = "re") # Parametric coefficients: # Estimate Std. Error z value Pr(>|z|) # (Intercept) -0.98150 0.52485 -1.870 0.0615 . # OldSpeakerContrast1 0.65497 0.68628 0.954 0.3399 # PopCntLog.z -0.11724 0.02726 -4.300 1.71e-05 *** # Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 # Approximate significance of smooth terms: # edf Ref.df Chi.sq p-value # te(GeoX,GeoY,WordFreqLog.z):OldSpeakerContrast0 53.62 68.48 221.4 <2e-16 # te(GeoX,GeoY,WordFreqLog.z):OldSpeakerContrast1 58.64 74.14 324.9 <2e-16 # s(Word) 166.41 168.00 14351.7 <2e-16 # s(Placename) 157.79 209.00 1160.0 <2e-16 # s(Word,PopAvgIncomeLog.z) 137.84 170.00 1062.1 <2e-16 # s(Word,PopAvgAgeLog_residIncome.z) 121.15 170.00 686.7 <2e-16 # s(Word,PopCntLog.z) 97.10 169.00 434.0 <2e-16 # s(Word,YearRec.z) 128.98 170.00 867.4 <2e-16 # R-sq.(adj) = 0.372 Deviance explained = 32.4% # REML score = 97449 Scale est. = 1 n = 69259 With kind regards, Martijn -- *** Martijn Wieling http://www.martijnwieling.nl wiel...@gmail.com +31(0)614108622 *** University of Groningen http://www.rug.nl/staff/m.b.wieling *** On Mon, Jun 11, 2012 at 10:34 AM, Simon Wood s.w...@bath.ac.uk wrote: [Simon's reply quoted in full above]
Re: [R] compute Mcdonald's omega ω
Dear codecat, To track down where the trouble is, try lapply(my.data, is.numeric). That will tell you which columns of your data are giving you problems. It is possible that you read the data in from SPSS and did not turn off the use.value.labels switch. On Jun 11, 2012, at 1:55 AM, codec cat wrote: [earlier message quoted in full above] William Revelle http://personality-project.org/revelle.html Professor http://personality-project.org Department of Psychology http://www.wcas.northwestern.edu/psych/ Northwestern University http://www.northwestern.edu/ Use R for psychology http://personality-project.org/r It is 5 minutes to midnight http://www.thebulletin.org
Re: [R] mgcv (bam) very large standard error difference between versions 1.7-11 and 1.7-17, bug?
Hi Martijn, The issue in this case is that the estimates for the terms are rather different (not just the p-values). I think the difference is down to this change made in version 1.7-13 (from the changeLog): * "re" smooths are no longer subject to side constraints under nesting (since this is almost always unnecessary and undesirable, and often unexpected). In 1.7-11, "re" terms were checked for identifiability under nesting in exactly the same way as any other smooth. This led to the imposition of identifiability constraints in your model, since you have nested random effects. However, "re" terms are a special case in this context, and it is usually undesirable to impose such constraints, so from version 1.7-13 onwards "re" terms are excluded when identifiability constraints are imposed. best, Simon On 11/06/12 09:47, Martijn Wieling wrote: [Martijn's message quoted in full above]
Re: [R] mgcv (bam) very large standard error difference between versions 1.7-11 and 1.7-17, bug?
Hi Martijn, The negative edf from fREML was a matrix indexing bug triggered by the "re" term (other fREML discrepancies followed from this). Fixed for 1.7-18. Thanks for reporting this, and for the supporting offline info! best, Simon On 02/06/12 23:17, Simon Wood wrote: the fREML results must be a bug. I'd modified the bam covariance and edf code in 1.7-17 and it looks like I must have messed something up. Any chance you could send me the data offline? Also, can you try summary(..., freq=FALSE) under 1.7-17 and see if you get the same results as before? On 06/02/2012 05:25 PM, Martijn Wieling wrote: Dear useRs, I reran an analysis with bam (mgcv, version 1.7-17) originally conducted using an older version of bam (mgcv, version 1.7-11), and this resulted in the same estimates but much lower standard errors (in some cases 20 times as low) and lower p-values. This obviously results in a larger set of significant predictors. Is this result expected given the improvements in the new version? Or is this a bug, and are the p-values of bam in mgcv 1.7-17 too low? The summaries of both versions are shown below to enable a comparison. In addition, applying the default method="fREML" (mgcv version 1.7-17) to the same dataset yields only non-significant results, while all results are highly significant using method="REML". Furthermore, it also results in large negative (e.g., -8757) edf values linked to s(X, bs="re") terms. Is this correct, or is this a bug? The summary of the model using method="fREML" is also shown below. I hope someone can shed some light on this. With kind regards, Martijn Wieling, University of Groningen ### mgcv version 1.7-11 Family: gaussian Link function: identity Formula: RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + s(Word, bs = "re") + s(Key, bs = "re") Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.099757 0.020234 -4.930 8.23e-07 *** RefVratio.z 0.105705 0.013328 7.931 2.19e-15 *** IsSemiwordOrDemonstrative 0.289828 0.046413 6.245 4.27e-10 *** RefSoundCnt.z 0.119981 0.021202 5.659 1.53e-08 *** SpYearBirth.z -0.011396 0.002407 -4.734 2.21e-06 *** IsAragon 0.055678 0.033137 1.680 0.09291 . PopCntLog_residGeo.z -0.006504 0.003279 -1.984 0.04731 * SpYearBirth.z:IsAragon 0.015871 0.005459 2.907 0.00365 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Approximate significance of smooth terms: edf Ref.df F p-value s(GeoX,GeoY) 24.01 24.21 31.16 <2e-16 *** s(Word) 352.29 347.00 501.57 <2e-16 *** s(Key) 269.75 289.25 10.76 <2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 R-sq.(adj) = 0.693 Deviance explained = 69.4% REML score = -22476 Scale est. = 0.038177 n = 112608 ### mgcv version 1.7-17, much lower p-values and standard errors than version 1.7-11 Family: gaussian Link function: identity Formula: RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + s(Word, bs = "re") + s(Key, bs = "re") Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0997566 0.0014139 -70.552 <2e-16 *** RefVratio.z 0.1057049 0.0006565 161.010 <2e-16 *** IsSemiwordOrDemonstrative 0.2898285 0.0020388 142.155 <2e-16 *** RefSoundCnt.z 0.1199813 0.0009381 127.905 <2e-16 *** SpYearBirth.z -0.0113956 0.0006508 -17.509 <2e-16 *** IsAragon 0.0556777 0.0057143 9.744 <2e-16 *** PopCntLog_residGeo.z -0.0065037 0.0007938 -8.193 2.58e-16 *** SpYearBirth.z:IsAragon 0.0158712 0.0014829 10.703 <2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Approximate significance of smooth terms: edf Ref.df F p-value s(GeoX,GeoY) 24.01 24.21 31.15 <2e-16 *** s(Word) 352.29 347.00 587.26 <2e-16 *** s(Key) 269.75 313.00 4246.76 <2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 R-sq.(adj) = 0.693 Deviance explained = 69.4% REML score = -22476 Scale est. = 0.038177 n = 112608 ### mgcv version 1.7-17, default method="fREML": all p-values non-significant and negative edf's for s(X, bs="re") terms Family: gaussian Link function: identity Formula: RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + s(Word, bs = "re") + s(Key, bs = "re") Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.099757 1.730235 -0.058 0.954 RefVratio.z 0.105705 1.145329 0.092 0.926 IsSemiwordOrDemonstrative 0.289828 4.167237 0.070 0.945 RefSoundCnt.z 0.119981 1.901158 0.063 0.950 SpYearBirth.z -0.011396 0.034236 -0.333 0.739 IsAragon 0.055678 0.298629 0.186 0.852 PopCntLog_residGeo.z -0.006504 0.041981 -0.155 0.877 SpYearBirth.z:IsAragon 0.015871
Re: [R] Gaps on merging xts objects
Eric, I'd be happy to help. Please follow the posting guide (specifically the "Surprising behavior and bugs" section) and provide a *minimal*, reproducible example and the output from sessionInfo(). http://www.r-project.org/posting-guide.html Best, -- Joshua Ulrich | FOSS Trading: www.fosstrading.com On Sun, Jun 10, 2012 at 3:19 PM, eric ericst...@aol.com wrote: An update ... I did a bit more search on the internet and got some ideas. I set the start month for the series to the same date. That didn't help. Then I tried .index(x2) == .index(msci.m), which returned FALSE. I was able to fix the problem with: index(x2) <- as.Date(index(x2)) index(msci.m) <- as.Date(index(msci.m)) Now the xts objects merged just fine. But I'm not exactly sure what the root cause was or why this fix worked. One other part of this is that the x2 data started as weekly and the msci data started as daily. Both were converted to monthly data with to.monthly. I suspect that has something to do with it, but I don't understand why. Any words of wisdom in gaining a better understanding would be appreciated. -- View this message in context: http://r.789695.n4.nabble.com/Gaps-on-merging-xts-objects-tp4632941p4632950.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
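For readers hitting the same gap, a sketch of the likely mechanism (this assumes the xts and zoo packages are installed, and the series here are invented): to.monthly() gives the result a yearmon index, and two yearmon indexes that print identically need not compare equal on their underlying timestamps, which is why coercing both to Date makes merge() line up.

```r
library(xts)  # assumes xts (and its dependency zoo) are installed

set.seed(1)
daily  <- xts(rnorm(90), order.by = as.Date("2012-01-01") + 0:89)
weekly <- xts(rnorm(13), order.by = as.Date("2012-01-01") + 7 * (0:12))

dm <- to.monthly(daily)    # index is now of class "yearmon"
wm <- to.monthly(weekly)
class(index(dm))           # "yearmon"

# Putting both indexes on a common Date representation, as in the fix
# above, guarantees that equal months compare equal in merge():
index(dm) <- as.Date(index(dm))
index(wm) <- as.Date(index(wm))
merged <- merge(dm, wm)    # one row per month, no spurious gaps
```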
Re: [R] compute Mcdonald's omega ω
You are right, it shows FALSE for all variables. The use.value.labels switch is not on; it also does not work when I type the numbers manually in Notepad (csv) or Excel. Is there a way to force the variables to numeric? On 11 June 2012 12:23, William Revelle li...@revelle.net wrote: Dear codecat, To track down where the trouble is, try lapply(my.data, is.numeric) That will tell you which column of your data is giving you problems. It is possible that you read the data in from SPSS and did not turn off the use.value.labels switch. On Jun 11, 2012, at 1:55 AM, codec cat wrote: Hi, I keep getting this error when I use omega(my.data): Error in cor(m, use = "pairwise") : 'x' must be numeric I'm quite sure my data is numeric, as I have tried several methods such as opening the csv in Notepad to make sure the numbers are not strings, and also tried SPSS to make sure the numbers are set to numeric. Can somebody advise me on what is wrong? Thanks. On 11 June 2012 04:17, William Revelle li...@revelle.net wrote: Dear codecat, You can get the most recent version of psych from CRAN. The current version is 1.2.4. Then, the help page for omega should be of use: library(psych) ?omega In addition, try the vignette for the psych package. Or, as of today, there is a more detailed instruction for newbies on using the psych package and R to find omega: http://personality-project.org/r/tutorials/R_for_omega.pdf Let me know if any of this helps. Bill On Jun 10, 2012, at 4:00 PM, codec cat wrote: Dear all, I am a newbie to R and I would appreciate it very much if someone can give me some advice on this. Please note that I am not a programmer, so some of the questions might sound really stupid. I would like to compute McDonald's omega using R; I'm aware I can use the omega function in the psych package, but I'm really not sure how to do it.
I have read these two articles that explain omega, http://personality-project.org/r/html/omega.html and http://personality-project.org/r/r.omega.html, but I'm still not sure how to do it. Can someone point out where I have gone wrong? I'm following the instructions from this site and I have downloaded the psych package: http://personality-project.org/r/ *1. Load and read the data* datafilename <- file.choose() # use the OS to find the file person.data <- read.table(datafilename, header=TRUE) # read the data file *2. Use the code examples from http://personality-project.org/r/r.omega.html* Copy and paste the whole code from this website into the R console. I know I must have understood Step 2 incorrectly, but I am really not sure what to do with the omega function. Can someone give me more specific help on this please? Thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
William Revelle http://personality-project.org/revelle.html Professor http://personality-project.org Department of Psychology http://www.wcas.northwestern.edu/psych/ Northwestern University http://www.northwestern.edu/ Use R for psychology http://personality-project.org/r It is 5 minutes to midnight http://www.thebulletin.org [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
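To answer the remaining question in the thread (how to force the variables to numeric), one base-R approach is sketched below; the data frame and column names are invented for illustration. Going through as.character() first matters: calling as.numeric() directly on a factor would return its internal integer codes rather than its labels.

```r
# Invented example data: numbers stored as character and as factor
my.data <- data.frame(q1 = c("1", "2", "3"),
                      q2 = factor(c("4", "5", "6")),
                      stringsAsFactors = FALSE)

sapply(my.data, is.numeric)   # FALSE FALSE -- the columns omega() rejects

# Coerce every column to numeric, converting to character first
my.data[] <- lapply(my.data, function(x) as.numeric(as.character(x)))

sapply(my.data, is.numeric)   # TRUE TRUE
```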
Re: [R] mgcv: inclusion of random intercept in model - based on p-value of smooth or anova?
Dear Simon, I ran an additional analysis using bam (mgcv 1.7-17) with three random intercepts and no non-linearities, and compared the results to those of lmer (lme4). Using bam results in a significant random intercept (even though it has a very low edf value), while the lmer results show no variance associated with the random intercept of Placename. Should I drop the random intercept of Placename, and if so, how is this apparent from the results of bam? Summaries of both models are shown below. With kind regards, Martijn

l1 = lmer(RefPMIdistMeanLog.c ~ geogamfit + (1|Word) + (1|Key) + (1|Placename), data=wrddst); print(l1, cor=F)

Linear mixed model fit by REML
Formula: RefPMIdistMeanLog.c ~ geogamfit + (1 | Word) + (1 | Key) + (1 | Placename)
   Data: wrddst
    AIC    BIC logLik deviance REMLdev
 -44985 -44927  22498   -45009  -44997
Random effects:
 Groups    Name        Variance  Std.Dev.
 Word      (Intercept) 0.0800944 0.283009
 Key       (Intercept) 0.0013641 0.036933
 Placename (Intercept) 0.0000000 0.000000
 Residual              0.0381774 0.195390
Number of obs: 112608, groups: Word, 357; Key, 320; Placename, 40

Fixed effects:
            Estimate Std. Error t value
(Intercept) -0.00342    0.01513   -0.23
geogamfit    0.99249    0.02612   37.99

m1 = bam(RefPMIdistMeanLog.c ~ geogamfit + s(Word,bs="re") + s(Key,bs="re") + s(Placename,bs="re"), data=wrddst, method="REML"); summary(m1, freq=F)

Family: gaussian
Link function: identity

Formula:
RefPMIdistMeanLog.c ~ geogamfit + s(Word, bs = "re") + s(Key, bs = "re") + s(Placename, bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.00342    0.01513  -0.226    0.821
geogamfit    0.99249    0.02612  37.991   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                   edf Ref.df       F p-value
s(Word)      3.554e+02    347 634.716  <2e-16 ***
s(Key)       2.946e+02    316  23.054  <2e-16 ***
s(Placename) 1.489e-04     38   7.282  <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.693   Deviance explained = 69.4%
REML score = -22498   Scale est. = 0.038177   n = 112608

On Wed, May 23, 2012 at 11:30 AM, Simon Wood-4 [via R] ml-node+s789695n4631060...@n4.nabble.com wrote: Having looked at this further, I've made some changes in mgcv_1.7-17 to the p-value computations for terms that can be penalized to zero during fitting (e.g. s(x,bs="re"), s(x,m=1) etc.). The Wald-statistic-based p-values from summary.gam and anova.gam (i.e. what you get from e.g. anova(a) where a is a fitted gam object) are quite well founded for smooth terms that are non-zero under full penalization (e.g. a cubic spline is a straight line under full penalization). For such smooths, an extension of Nychka's (1988) result on CIs for splines gives a well-founded distributional result on which to base a Wald statistic. However, the Nychka result requires the smoothing bias to be substantially less than the smoothing estimator variance, and this will often not be the case if smoothing can actually penalize a term to zero (to understand why, see the argument in the appendix of Marra & Wood, 2012, Scandinavian Journal of Statistics, 39, 53-74). Simulation testing shows that this theoretical concern has serious practical consequences. So for terms that can be penalized to zero, alternative approximations have to be used, and these are now implemented in mgcv_1.7-17 (see ?summary.gam). The approximate test performed by anova(a,b) (a and b are fitted gam objects) is less well founded. It is a reasonable approximation when each smooth term in the models could in principle be well approximated by an unpenalized term of rank approximately equal to the edf of the smooth term, but otherwise the p-values produced are likely to be much too small. In particular, simulation testing suggests that the test is not to be trusted with s(...,bs="re") terms, and can be poor if the models being compared involve any terms that can be penalized to zero during fitting.
(Although the mechanisms are a little different, this is similar to the problem we would have if the models were viewed as regular mixed models and we tried to use a GLRT to test variance components for equality to zero). These issues are now documented in ?anova.gam and ?summary.gam... Simon On 08/05/12 15:01, Martijn Wieling wrote: Dear useRs, I am using mgcv version 1.7-16. When I create a model with a few non-linear terms and a random intercept for (in my case) country using s(Country,bs=re), the representative line in my model (i.e. approximate significance of smooth terms) for the random intercept reads: edf Ref.df F p-value s(Country) 36.127 58.551 0.644 0.982 Can I interpret this as there being no support for a random intercept for country? However, when I compare the simpler
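As a small illustration of how a superfluous random intercept shows up in mgcv (a sketch on simulated data; mgcv ships with R, and all variable names here are invented): when the grouping factor has no true effect, the s(g, bs = "re") term should be penalized towards an edf near 0, mirroring lmer estimating a zero variance component.

```r
library(mgcv)  # recommended package, shipped with R

set.seed(1)
n <- 400
g <- factor(rep(1:40, each = 10))  # grouping factor with no true effect
x <- runif(n)
y <- 2 * x + rnorm(n)

m <- gam(y ~ x + s(g, bs = "re"), method = "REML")
summary(m)$s.table   # edf for s(g) should be close to 0
gam.vcomp(m)         # estimated sd of the random intercepts should be ~0
```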
Re: [R] controlling spatial autocorrelation in linear regression models
Hi, Maybe this is helpful? http://www.ats.ucla.edu/stat/r/faq/spatial_regression.htm Best, Dimitrios -- View this message in context: http://r.789695.n4.nabble.com/controlling-spatial-autocorrelation-in-linear-regression-models-tp4632940p4632998.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to build a large identity matrix faster?
diag(n) is all right when n = 5e3; it took 0.7 sec on my machine for diag(5e3). However, it's slow when n = 23000: diag(23000) took 15 sec. On 9 June 2012 06:32, R. Michael Weylandt michael.weyla...@gmail.com wrote: On Fri, Jun 8, 2012 at 5:31 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: For the matter at hand, you can probably get to it by simply trying base:::diag. In this case, it's not too hard because what you're seeing is the S4 generic that the Matrix package is defining _over_ the regular base function generic. Sorry -- the regular base function is not a generic in this case. Everything else still holds though. (The same tricks to find things work with print, which becomes both an S3 and S4 generic on loading Matrix.) More generally, going down the rabbit hole of S4: as it suggests, first try showMethods("diag") and you'll see a long list of types. The parallel to the *.default method is the one with signature "ANY", so you can try that: getMethod("diag", "ANY") which gets you where you need to be. Hope this helps, Michael On Fri, Jun 8, 2012 at 5:11 PM, Spencer Graves spencer.gra...@structuremonitoring.com wrote: How can one get the source code for diag? I tried the following: diag standardGeneric for "diag" defined from package "base" function (x = 1, nrow, ncol) standardGeneric("diag") environment: 0x09dc1ab0 Methods may be defined for arguments: x, nrow, ncol Use showMethods("diag") for currently available ones. How can I look at the code past the methods dispatch? methods('diag') [1] diag.panel.splom Warning message: In methods("diag") : function 'diag' appears not to be generic So diag is an S4 generic. I tried the following: dumpMethods('diag', file='diag.R') readLines('diag.R') character(0) More generally, what do you recommend I read to learn about S4 generics?
I've read fair portions of Chambers (1998, 2008), which produced more frustration than enlightenment for me. Thanks, Spencer On 6/8/2012 12:07 PM, Uwe Ligges wrote: I quickly looked at it, and the difference comes from: n <- 5e3 system.time(x <- array(0, c(n, n))) # from diag() system.time(x <- matrix(0, n, n)) # from Rdiag() Replaced in R-devel. Best, Uwe Ligges On 07.06.2012 12:11, Spencer Graves wrote: On 6/7/2012 2:27 AM, Rui Barradas wrote: Hello, To my great surprise, on my system (Windows 7, R 2.15.0, 32-bit) an R version is faster! I was also surprised; Windows 7, R 2.15.0, 64-bit:

rbind(diag=t1, Rdiag=t2, ratio=t1/t2)
      user.self sys.self elapsed user.child sys.child
diag       0.72     0.08    0.81         NA        NA
Rdiag      0.09     0.03    0.12         NA        NA
ratio      8.00     2.67    6.75         NA        NA

sessionInfo() R version 2.15.0 (2012-03-30) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] splines stats graphics grDevices utils datasets methods [8] base other attached packages: [1] fda_2.2.9 Matrix_1.0-6 lattice_0.20-6 zoo_1.7-7 loaded via a namespace (and not attached): [1] grid_2.15.0 tools_2.15.0 Spencer

Rdiag <- function(n){
  m <- matrix(0, nrow=n, ncol=n)
  m[matrix(rep(seq_len(n), 2), ncol=2)] <- 1
  m
}
Rdiag(4)
n <- 5e3
t1 <- system.time(d1 <- diag(n))
t2 <- system.time(d2 <- Rdiag(n))
all.equal(d1, d2)
rbind(diag=t1, Rdiag=t2, ratio=t1/t2)

Anyway, why don't you create it once, save a copy and use it many times? Hope this helps, Rui Barradas On 07-06-2012 08:55, Ceci Tam wrote: Hello, I am trying to build a large identity matrix using diag(). The size is around 23000 and I've tried diag(23000); that took a long time. Since I have to use this operation several times in my program, the running time is too long to be tolerable. Are there any alternatives to diag(N)?
Thanks Cheers, yct [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
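If the identity matrix is only needed for linear algebra rather than as a literal dense array, it may also be worth noting that the recommended Matrix package (shipped with R) can represent a 23000 x 23000 identity without ever allocating 23000^2 doubles (roughly 4 GB):

```r
library(Matrix)  # recommended package, shipped with R

I <- Diagonal(23000)  # sparse (unit-diagonal) identity; effectively instantaneous
dim(I)                # 23000 23000
print(object.size(I), units = "Kb")  # kilobytes, not gigabytes
```

Such a sparse identity can be used directly in Matrix arithmetic (products, solves) without densifying.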
[R] gamm (mgcv) interaction with linear term
Hello, I am trying to fit a gamm (package mgcv) model with a smooth term, a linear term, and an interaction between the two. The reason I am using gamm rather than gam is that there are repeated measures in time (which is the smooth term x1), so I am including an AR1 autocorrelation term. The model I have so far ended up with is of the type gamm(y ~ s(x1) + s(x1, by=x2), correlation = corAR1(form= ~ x1|Unit)) where Units are replicate experimental units from which we have sampled. I have a few questions that I have been unable to find answers to: 1) Is this model doing what I hope it is doing (see above)? Prior to adding the AR1 component, I used gam and was able to run the (more intuitive) model gam(y ~ s(x1) + x2 + s(x1, by=x2)) with a separate term (and output) for the linear x2, but unfortunately I don't seem to get this to work with gamm. Can I somehow estimate the significance (and slope) of the linear term with a gamm model? 2) When I run the gamm model above, I end up with a significant intercept, significant smooth term x1 and a significant interaction s(x1):x2. The interaction has an edf of 6.87, i.e. it is far from linear. My next question is how to interpret this interaction: If I plot the interaction with x1 (time) on x-axis and s(x1):x2 on y-axis, can I somehow relate the value of the interaction term at a particular point in time to the slope of the linear x2 term at that point in time? Appreciate any help with this as I am relatively new to both r and gam(m). Aino Hosia Postdoc Institute of Marine Research PO Box 1870 Nordnes, N-5817 Bergen, Norway (Nordnesgaten 50) Tel: +47 55 23 53 49 E-mail: aino.ho...@imr.no __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
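On question 2, a numeric by variable makes s(x1, by = x2) a varying-coefficient term: the fitted curve estimates the coefficient of x2 at each value of x1, so its plotted value at a given time is directly the (partial) slope of x2 at that time. A sketch on simulated data follows (mgcv and its dependency nlme ship with R; all names and the data-generating process below are invented):

```r
library(mgcv)  # loads nlme, which provides corAR1

set.seed(1)
d <- data.frame(Unit = factor(rep(1:10, each = 20)),
                x1   = rep(1:20, times = 10))
d$x2 <- rnorm(nrow(d))
# true x2 slope drifts upward over time x1:
d$y  <- sin(d$x1 / 3) + (0.2 + 0.05 * d$x1) * d$x2 + rnorm(nrow(d), sd = 0.3)

# The model form from the question, with AR1 errors within Unit:
m <- gamm(y ~ s(x1) + s(x1, by = x2),
          correlation = corAR1(form = ~ x1 | Unit), data = d)

summary(m$gam)
# The by-smooth is the time-varying slope of x2; its value at x1 = t
# estimates the x2 coefficient at time t (here roughly 0.2 + 0.05*t).
plot(m$gam, select = 2)
```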
Re: [R] Storing datasets
This does tend to look like homework, but... If you want them in one vector, then that vector will have length 225*100, of course. So rt(225*100,225) would do it. Or you could use the matrix() function to convert this to a matrix. See ?matrix. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 6/9/12 3:16 AM, Rody rodric_seu...@hotmail.com wrote: Hi everyone I need to make a work for school in R and one of my questions is to create 225 datasets of 100 observations and they need to be t_225 distributed. So, I know how to make one dataset (rt(100,df=225)), but how can I store those 225 in one vector, array,.. ? -- View this message in context: http://r.789695.n4.nabble.com/Storing-datasets-tp4632874.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
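Concretely, since rt() can draw all the values in one call, a single matrix() call yields all 225 datasets, one per column (a sketch; the object name is invented):

```r
set.seed(42)  # for reproducibility
sims <- matrix(rt(225 * 100, df = 225), nrow = 100, ncol = 225)

dim(sims)       # 100 rows (observations) by 225 columns (datasets)
first <- sims[, 1]      # the first dataset
means <- colMeans(sims) # e.g. one summary statistic per dataset
```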
Re: [R] generating random samples of IG distribution
For the normal inverse Gaussian: Package 'GeneralizedHyperbolic' For the generalized inverse Gaussian: Package 'GeneralizedHyperbolic' -- David L Carlson Associate Professor of Anthropology Texas AM University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of shirin nezampour Sent: Sunday, June 10, 2012 11:37 AM To: r-help@r-project.org Subject: [R] generating random samples of IG distribution Dear R users, I want to generating random samples from Inverse Gaussian distribution . How can I do? and what package should I install? Thanks. Shirin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
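If only the plain (non-generalized) inverse Gaussian is needed, another option is rinvgauss() in the statmod package, parameterized by mean and shape; this assumes the poster is willing to install a package beyond those named above.

```r
library(statmod)  # assumes statmod is installed from CRAN

set.seed(1)
x <- rinvgauss(1000, mean = 1, shape = 2)
mean(x)  # should be close to the specified mean of 1
```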
[R] Error in if (rank) { : argument is not interpretable as logical
Hello all, I am running the function rda of the vegan library, as I have done many times without trouble, and I get the following error message: Error in if (rank) { : argument is not interpretable as logical. I am just testing a response matrix against a single explanatory matrix, which is the result of a model matrix with 4 dummy variables (coding for a factor with 5 levels). myrda <- rda(resp, expl) When I change the response matrix, it goes well. Does anyone know what might be happening? Thanks, Duarte __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] caret: compare linear models of different degree
The same statement is true for the plotmo function. It also does not handle the situation correctly if the training formula contains interactions. You can try this out using this code: library(caret) data(trees) m = train(Volume~(Girth+Height)^2, data=trees, method="lm") plotmo(m$finalModel) --- which leads to this error: Error in eval(expr, envir, enclos) : object 'Girth:Height' not found Is this a bug or am I missing something? Thanks! On 09/06/12 18:09, Dominik Bruhn wrote: Max, thanks for your answer! predict.train() will handle the formulas. If you want to compare the models in terms of their predictive performance, set the seeds prior to running the model. This will ensure that the same resampling indices are used in train(). If you do this, the resamples() function can be used to make formal comparisons between the models: I think I did not express my question in the right way: I want to use the extractPrediction function because I want to plot some stats using the plotObsVsPred method. As stated in my previous mail, the extractPrediction method does not work if the formula is changed. Perhaps my question is clearer now. Thanks again! -- Dominik Bruhn mailto: domi...@dbruhn.de __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] controlling spatial autocorrelation in linear regression models
Thanks! This link is very helpful. Best Gary On Mon, Jun 11, 2012 at 4:28 AM, D.Soudis d.sou...@rug.nl wrote: Hi, Maybe this is helpful..?? http://www.ats.ucla.edu/stat/r/faq/spatial_regression.htm Best, Dimitrios -- View this message in context: http://r.789695.n4.nabble.com/controlling-spatial-autocorrelation-in-linear-regression-models-tp4632940p4632998.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Round down to earliest hour or half hour
Hello, See if the following function does it.

#--
# Round POSIXct date/time
#
round.POSIXct <- function(x, units = c("mins", "5 mins", "10 mins", "15 mins", "quarter hours", "30 mins", "half hours", "hours")){
  if(is.numeric(units)) units <- as.character(units)
  units <- match.arg(units)
  r <- switch(units,
    "mins" = 60,
    "5 mins" = 60*5,
    "10 mins" = 60*10,
    "15 mins" = ,
    "quarter hours" = 60*15,
    "30 mins" = ,
    "half hours" = 60*30,
    "hours" = 60*60
  )
  H <- as.integer(format(x, "%H"))
  M <- as.integer(format(x, "%M"))
  S <- as.integer(format(x, "%S"))
  D <- format(x, "%Y-%m-%d")
  secs <- 3600*H + 60*M + S
  as.POSIXct(round(secs/r)*r, origin=D)
}

(x <- Sys.time() + 1:10*60)
round(x, "5 mins")
round(x, 5)
round(x, "10 mins")
round(x, "quarter")
round(x, "15 mins")

Hope this helps, Rui Barradas On 09-06-2012 03:04, Phil Hurvitz wrote: I wanted to round rather than truncate timestamps.

# create a data set to work with
s <- 30 * rnorm(10, 1, 0.1)
gps.timestamp <- Sys.time() + s*1:10
gps.timestamp <- c(round(gps.timestamp[1], "min"), gps.timestamp,
                   as.POSIXct("2012-06-08 17:32:59"), as.POSIXct("2012-06-08 17:32:30"))

f <- function(gps.timestamp=gps.timestamp, interval=30){
  # is the interval a multiple of 60? if so, just round
  if(interval %% 60 == 0){
    return(data.frame(gps.timestamp, gps.timestamp.round=round(gps.timestamp, "min")))
  }
  # a POSIXlt timestamp for fiddling with seconds
  lt <- as.POSIXlt(gps.timestamp)
  # extract rounded seconds from the timestamp
  gps.seconds <- round(lt$sec, 0)
  # the number of segments in a minute is 60/interval
  seg <- 60/interval
  # the cutpoints
  rounds <- c(0, 1:seg * interval)
  # the midpoints for cutting
  cuts <- c(-1, 1:seg * 60/seg - interval/2, 60)
  # and a numeric representation of the cut position (the factor position, really)
  numcut <- as.numeric(cut(gps.seconds, cuts))
  # the rounded seconds (*not* truncated)
  round.seconds <- rounds[numcut]
  # truncate the date to the nearest low minute
  lt <- lt1 <- trunc(lt, "min")
  # add the rounded seconds to the truncated minutes
  lt1 <- lt1 + round.seconds
  # how does that look?
  return(data.frame(gps.timestamp, gps.timestamp.round=lt1))
}

# try it:
f(gps.timestamp, 15)
f(gps.timestamp, 30)
f(gps.timestamp, 10)

-P. ** Philip M. Hurvitz, PhD | Research Assistant Professor | UW-CBE Urban Form Lab | 1107 NE 45th Street, Suite 535 | Box 354802 University of Washington, Seattle, Washington 98195-4802, USA phurv...@u.washington.edu | http://gis.washington.edu/phurvitz "What is essential is invisible to the eye." -de Saint-Exupéry ** I would use package chron and trunc(). One example from the help of trunc.times (modified): library(chron) tt <- times(c("12:13:14", "15:46:17")) trunc(tt, times("00:30:00")) HTH Jannis --- Schatzi adele_thompson at cargill.com wrote on Mon, 9 May 2011: From: Schatzi adele_thompson at cargill.com Subject: [R] Round down to earliest hour or half hour To: r-help at r-project.org Date: Monday, 9 May 2011, 14:06 I have times and would like to round down to the earliest 30 minute increment. For instance, a time of 2011-04-28 09:02:00 (the as.numeric value = 1303999320) should be rounded down to 2011-04-28 09:00:00 (the as.numeric value = 1303999200). Any ideas of how to do this? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
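Since the original question asked for rounding *down* rather than to the nearest increment, a compact base-R alternative is to floor the numeric seconds-since-epoch value to any interval; the function name here is invented.

```r
# Floor a POSIXct time to an interval given in seconds (1800 = 30 min)
floor_time <- function(x, secs = 1800) {
  as.POSIXct(floor(as.numeric(x) / secs) * secs,
             origin = "1970-01-01", tz = attr(x, "tzone"))
}

t0 <- as.POSIXct("2011-04-28 09:02:00", tz = "UTC")
floor_time(t0)        # 2011-04-28 09:00:00 UTC
floor_time(t0, 3600)  # 2011-04-28 09:00:00 UTC (floored to the whole hour)
```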
Re: [R] Correlation in Rattle
Same problem here. I have updated rattle from the repository as suggested by Graham, but this, unfortunately, did not solve the issue. (The release date in the 'About Rattle' dialog is 2012-04-22.) Any other ideas? Graham Williams wrote: It is fixed and 2.6.19 will include the fix. When will version 2.6.19 be released? Thanks, Marco -- View this message in context: http://r.789695.n4.nabble.com/Correlation-in-Rattle-tp4630471p4633009.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] bubbleplot3: R equivalent of the MATLAB function?
All, Does there exist an R equivalent of the MATLAB function bubbleplot3? Semi-naive Googleing has thus far revealed no such package to me. Your insight is appreciated! Regards, TJM [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Why is my data always imported as a list?
Hi, I am relatively new to R, so this is probably a really basic issue that I keep hitting. I read my data into R using the read.csv command: x <- rep("numeric", 3) CB_un <- read.csv("Corn_Belt_unirr.csv", header=TRUE, colClasses=c("factor", x)) # I have clearly told R that I have one factor variable and 3 numeric variables in this table. # But if I try to do anything with them, I get an error: boxplot(CB_un["Value"]~CB_un["State.Fips"]) Error in model.frame.default(formula = CB_un["Value"] ~ CB_un["State.Fips"]) : invalid type (list) for variable 'CB_un["Value"]' # because these variables are all stored as lists. # So, I have to unpack them: CB_unirr_rent <- as.numeric(unlist(CB_un["Value"])) CB_unirr_State <- as.factor(unlist(CB_un["State.Fips"])) # before I can do anything with them: boxplot(CB_unirr_rent~CB_unirr_State) Is there a reason my data is always imported as lists? Is there a way to skip this unpacking step? Thanks, Sam [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Order all the columns ascending elements on a matrix or a data frame
Many thanks to all. 2012/6/10 Bert Gunter gunter.ber...@gene.com On Sun, Jun 10, 2012 at 2:22 PM, arun smartpink...@yahoo.com wrote: Hi Bert, I tried the code. dat2 <- data.frame(dat1) do.call(order, dat2) [1] 3 6 1 10 2 5 7 9 8 4 Here, I get the order of the 1st column. Is there anything I am missing here? No you don't. You get a vector of row indices, but ... My oversight. Subscript the rows of dat2 by this: dat2[do.call(order, dat2), ] -- Bert Thanks, A.K. - Original Message - From: Bert Gunter gunter.ber...@gene.com To: arun smartpink...@yahoo.com Cc: Trying To learn again tryingtolearnag...@gmail.com; R help r-help@r-project.org Sent: Sunday, June 10, 2012 4:47 PM Subject: Re: [R] Order all the columns ascending elements on a matrix or a data frame Inline ... -- Bert On Sun, Jun 10, 2012 at 10:46 AM, arun smartpink...@yahoo.com wrote: Hi, If your intention is to order by the first column ascending, then by the 2nd and so on, try this:

set.seed(1)
dat1 <- cbind(x=rnorm(10,5,0.5), y=runif(10,0.4), z=rnorm(10,15,0.2))
dat1
             x         y        z
 [1,] 4.686773 0.9608231 14.99101
 [2,] 5.091822 0.5272855 14.99676
 [3,] 4.582186 0.7910043 15.18877
 [4,] 5.797640 0.4753331 15.16424
 [5,] 5.164754 0.5603324 15.11878
 [6,] 4.589766 0.6316685 15.18380
 [7,] 5.243715 0.4080342 15.15643
 [8,] 5.369162 0.6294328 15.01491
 [9,] 5.287891 0.9218145 14.60213
[10,] 4.847306 0.6042094 15.12397
dat1[order(dat1[,1], dat1[,2], dat1[,3]),]

## if dat1 is a data frame, e.g. dat1 <- data.frame(dat1), this can be shortened to: do.call(order, dat1) ?do.call -- Bert

             x         y        z
 [1,] 4.582186 0.7910043 15.18877
 [2,] 4.589766 0.6316685 15.18380
 [3,] 4.686773 0.9608231 14.99101
 [4,] 4.847306 0.6042094 15.12397
 [5,] 5.091822 0.5272855 14.99676
 [6,] 5.164754 0.5603324 15.11878
 [7,] 5.243715 0.4080342 15.15643
 [8,] 5.287891 0.9218145 14.60213
 [9,] 5.369162 0.6294328 15.01491
[10,] 5.797640 0.4753331 15.16424

But if you want to order all the columns at once:

apply(dat1, 2, sort)
             x         y        z
 [1,] 4.582186 0.4080342 14.60213
 [2,] 4.589766 0.4753331 14.99101
 [3,] 4.686773 0.5272855 14.99676
 [4,] 4.847306 0.5603324 15.01491
 [5,] 5.091822 0.6042094 15.11878
 [6,] 5.164754 0.6294328 15.12397
 [7,] 5.243715 0.6316685 15.15643
 [8,] 5.287891 0.7910043 15.16424
 [9,] 5.369162 0.9218145 15.18380
[10,] 5.797640 0.9608231 15.18877

Here, all the columns are sorted ascending, but the catch is that the correspondence between elements within each row of the original dataset is lost. A.K. - Original Message - From: Trying To learn again tryingtolearnag...@gmail.com To: r-help@r-project.org Cc: Sent: Sunday, June 10, 2012 2:36 AM Subject: [R] Order all the columns ascending elements on a matrix or a data frame Imagine I have a csv KT.csv. I want to create a new data frame, or convert KT into a matrix and create a new matrix with each column of KT in ascending order. I have tried this: b <- read.csv("KT.csv") for(i in 1:ncol(b)){ b[,i] <- sort(b[,i]) } But it gives a message that the number of rows doesn't correspond. Can someone give me a clue on how to order? Many thanks. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
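For reference, the thread's two behaviours can be put side by side in one self-contained sketch (a recap of the answers above, not new advice):

```r
set.seed(1)
dat1 <- data.frame(x = rnorm(10, 5, 0.5),
                   y = runif(10, 0.4),
                   z = rnorm(10, 15, 0.2))

# Order rows by column 1, breaking ties by columns 2, 3, ...
# Rows stay intact. Because a data frame is a list of columns,
# do.call(order, dat1) is equivalent to order(dat1$x, dat1$y, dat1$z).
sorted_rows <- dat1[do.call(order, dat1), ]

# Sort every column independently; the correspondence between
# elements within each row is lost (and the result is a matrix).
sorted_cols <- apply(dat1, 2, sort)
```

The first form is what you want for "sort the table by its columns"; the second only makes sense if the columns are independent samples.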
Re: [R] Why is my data always imported as a list?
On Jun 11, 2012, at 12:29 PM, Samantha Sifleet wrote:

Hi, I am relatively new to R, so this is probably a really basic issue that I keep hitting. I read my data into R using the read.csv command:

x = rep("numeric", 3)
CB_un = read.csv("Corn_Belt_unirr.csv", header=TRUE, colClasses=c("factor", x))
# I have clearly told R that I have one factor variable and 3 numeric variables in this table.
# But if I try to do anything with them, I get an error:
boxplot(CB_un["Value"] ~ CB_un["State.Fips"])
Error in model.frame.default(formula = CB_un["Value"] ~ CB_un["State.Fips"]) :
  invalid type (list) for variable 'CB_un["Value"]'

If you were steadfastly intent on using direct extraction in the formula, then this would be the way to do so:

boxplot(CB_un[["Value"]] ~ CB_un[["State.Fips"]])

Better would be to use the formula interface the way it was designed to operate:

boxplot(Value ~ State.Fips, data=CB_un)

-- David.

# Because these variables are all stored as lists.
# So, I have to unpack them:
CB_unirr_rent <- as.numeric(unlist(CB_un["Value"]))
CB_unirr_State <- as.factor(unlist(CB_un["State.Fips"]))
# before I can do anything with them
boxplot(CB_unirr_rent ~ CB_unirr_State)

Is there a reason my data is always imported as lists?

It's in a data frame; data frames are lists.

Is there a way to skip this unpacking step? Thanks, Sam

-- David Winsemius, MD West Hartford, CT
Re: [R] Why is my data always imported as a list?
Hi, have you tried str(CB_un) to make sure the structure of your data is what you expect? Does

boxplot(CB_un[, "Value"] ~ CB_un[, "State.Fips"])

work? Look at this:

testdf <- data.frame(a=1:3, b=11:13)
class(testdf["a"])
[1] "data.frame"
class(testdf[["a"]])
[1] "integer"
class(testdf[, "a"])
[1] "integer"

A data frame is a special form of list, so the usual list subsetting rules apply. Extracting a named component of a data frame with single square brackets gives you a data frame. Using row, column notation or double brackets gives a vector. ?"[" will give you more detail.

You have to use a data frame, and thus a list, for your data, since you can't mix factor and numeric data types in a matrix.

Sarah

On Mon, Jun 11, 2012 at 12:29 PM, Samantha Sifleet sifleet.saman...@epamail.epa.gov wrote:

Hi, I am relatively new to R, so this is probably a really basic issue that I keep hitting. I read my data into R using the read.csv command:

x = rep("numeric", 3)
CB_un = read.csv("Corn_Belt_unirr.csv", header=TRUE, colClasses=c("factor", x))
# I have clearly told R that I have one factor variable and 3 numeric variables in this table.
# But if I try to do anything with them, I get an error:
boxplot(CB_un["Value"] ~ CB_un["State.Fips"])
Error in model.frame.default(formula = CB_un["Value"] ~ CB_un["State.Fips"]) :
  invalid type (list) for variable 'CB_un["Value"]'

# Because these variables are all stored as lists.
# So, I have to unpack them:
CB_unirr_rent <- as.numeric(unlist(CB_un["Value"]))
CB_unirr_State <- as.factor(unlist(CB_un["State.Fips"]))
# before I can do anything with them
boxplot(CB_unirr_rent ~ CB_unirr_State)

Is there a reason my data is always imported as lists? Is there a way to skip this unpacking step? Thanks, Sam

-- Sarah Goslee http://www.functionaldiversity.org
Re: [R] Why is my data always imported as a list?
They aren't quite lists --- they are actually data.frames, which are a special sort of list with row names and other nice things. To your immediate question, I think you're looking for the formula interface:

boxplot(Value ~ State.Fips, data = CB_un)

The data= argument is important so boxplot knows where to look for Value and State.Fips.

Best, Michael

On Mon, Jun 11, 2012 at 11:29 AM, Samantha Sifleet sifleet.saman...@epamail.epa.gov wrote:

Hi, I am relatively new to R, so this is probably a really basic issue that I keep hitting. I read my data into R using the read.csv command:

x = rep("numeric", 3)
CB_un = read.csv("Corn_Belt_unirr.csv", header=TRUE, colClasses=c("factor", x))
# I have clearly told R that I have one factor variable and 3 numeric variables in this table.
# But if I try to do anything with them, I get an error:
boxplot(CB_un["Value"] ~ CB_un["State.Fips"])
Error in model.frame.default(formula = CB_un["Value"] ~ CB_un["State.Fips"]) :
  invalid type (list) for variable 'CB_un["Value"]'

# Because these variables are all stored as lists.
# So, I have to unpack them:
CB_unirr_rent <- as.numeric(unlist(CB_un["Value"]))
CB_unirr_State <- as.factor(unlist(CB_un["State.Fips"]))
# before I can do anything with them
boxplot(CB_unirr_rent ~ CB_unirr_State)

Is there a reason my data is always imported as lists? Is there a way to skip this unpacking step? Thanks, Sam
Re: [R] bubbleplot3: R equivalent of the MATLAB function?
John,

It does 3D bubble plots, allowing you to represent 4 data dimensions: the x, y, and z coordinates in 3D space, and the radius of the bubble (sphere). There also seem to be coloring options. I'd like to use something like this for a project I'm working on: a simulation I've done in R. Of course I can just bring the data over to MATLAB and do it there, but this seems like something that would be a valuable addition to the R environment. I'm considering attempting to implement it as a package in R if it hasn't already been done. Here's a link to the MATLAB source code: http://www.mathworks.com/matlabcentral/fileexchange/8231-bubbleplot3/content/bubbleplot3.m

Arun, the links you sent seem to do only two-dimensional bubble plots; am I missing something?

Thanks, TJ Murphy

On Mon, Jun 11, 2012 at 11:51 AM, John Kane jrkrid...@inbox.com wrote:

What exactly does it do? Any pointers to relevant plots?

John Kane Kingston ON Canada

-----Original Message-----
From: timothyjosephmur...@gmail.com
Sent: Mon, 11 Jun 2012 10:54:28 -0500
To: r-help@r-project.org
Subject: [R] bubbleplot3: R equivalent of the MATLAB function?

All, does there exist an R equivalent of the MATLAB function bubbleplot3? Semi-naive Googling has thus far revealed no such package to me. Your insight is appreciated! Regards, TJM
Re: [R] mgcv: inclusion of random intercept in model - based on p-value of smooth or anova?
Hi Martijn,

Irrespective of the p-value, 'bam' and 'lmer' agree that the variance component for 'Placename' is practically zero. In the 'bam' output see the 'edf' for s(Placename), or for a more direct comparison call gam.vcomp(m1). As mentioned in ?summary.gam, the p-values for "re" terms are fairly crude (and certainly don't solve all the usual problems with testing variance components for equality to zero), so I would not take them too seriously for testing whether your random effect is exactly zero when the estimates/predictions are this close to zero (the typical random effect size for Placename is about 1e-8, after all). That said, when I randomly re-shuffle Placename, so that the null hypothesis is true, then the p-value distribution does look uniform, as it should, despite some edfs even smaller than that for the original data. So the low p-value may simply reflect the common problem that even very tiny effects are often statistically significant at high sample sizes, I suppose. Anyway, unless effects of size 1e-8 are meaningful here, I would drop the Placename term.

best, Simon

ps. I'm not sure what effect the rather heavy tails on the residuals may be having here?

On 11/06/12 14:56, Martijn Wieling wrote:

Dear Simon, I ran an additional analysis using bam (mgcv 1.7-17) with three random intercepts and no non-linearities, and compared the results to those of lmer (lme4). Using bam results in a significant random intercept (even though it has a very low edf value), while the lmer results show no variance associated with the random intercept of Placename. Should I drop the random intercept of Placename, and if so, how is this apparent from the results of bam? Summaries of both models are shown below.
With kind regards, Martijn

l1 <- lmer(RefPMIdistMeanLog.c ~ geogamfit + (1|Word) + (1|Key) + (1|Placename), data=wrddst); print(l1, cor=F)

Linear mixed model fit by REML
Formula: RefPMIdistMeanLog.c ~ geogamfit + (1 | Word) + (1 | Key) + (1 | Placename)
   Data: wrddst
    AIC    BIC logLik deviance REMLdev
 -44985 -44927  22498   -45009  -44997
Random effects:
 Groups    Name        Variance  Std.Dev.
 Word      (Intercept) 0.0800944 0.283009
 Key       (Intercept) 0.0013641 0.036933
 Placename (Intercept) 0.0000000 0.000000
 Residual              0.0381774 0.195390
Number of obs: 112608, groups: Word, 357; Key, 320; Placename, 40

Fixed effects:
            Estimate Std. Error t value
(Intercept) -0.00342    0.01513   -0.23
geogamfit    0.99249    0.02612   37.99

m1 <- bam(RefPMIdistMeanLog.c ~ geogamfit + s(Word, bs="re") + s(Key, bs="re") + s(Placename, bs="re"), data=wrddst, method="REML"); summary(m1, freq=F)

Family: gaussian
Link function: identity
Formula: RefPMIdistMeanLog.c ~ geogamfit + s(Word, bs = "re") + s(Key, bs = "re") + s(Placename, bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.00342    0.01513  -0.226    0.821
geogamfit    0.99249    0.02612  37.991   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                   edf Ref.df       F p-value
s(Word)      3.554e+02    347 634.716  <2e-16 ***
s(Key)       2.946e+02    316  23.054  <2e-16 ***
s(Placename) 1.489e-04     38   7.282  <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.693   Deviance explained = 69.4%
REML score = -22498  Scale est. = 0.038177  n = 112608

On Wed, May 23, 2012 at 11:30 AM, Simon Wood-4 [via R] ml-node+s789695n4631060...@n4.nabble.com wrote:

Having looked at this further, I've made some changes in mgcv_1.7-17 to the p-value computations for terms that can be penalized to zero during fitting (e.g. s(x, bs="re"), s(x, m=1), etc.). The Wald statistic based p-values from summary.gam and anova.gam (i.e. what you get from e.g.
anova(a) where a is a fitted gam object) are quite well founded for smooth terms that are non-zero under full penalization (e.g. a cubic spline is a straight line under full penalization). For such smooths, an extension of Nychka's (1988) result on CIs for splines gives a well founded distributional result on which to base a Wald statistic. However, the Nychka result requires the smoothing bias to be substantially less than the smoothing estimator variance, and this will often not be the case if smoothing can actually penalize a term to zero (to understand why, see the argument in the appendix of Marra & Wood, 2012, Scandinavian Journal of Statistics, 39, 53-74). Simulation testing shows that this theoretical concern has serious practical consequences. So for terms that can be penalized to zero, alternative approximations have to be used, and these are now implemented in mgcv_1.7-17 (see ?summary.gam). The approximate test performed by anova(a,b) (a and b are fitted gam objects) is less well founded. It is a reasonable approximation when each smooth term in the models could in principle be well
[R] snow, ssh, and socket connections
I'm trying to set up a snow grid using sockets (Windows 7). On the test grid (my computer and another) I have an SSHD server up and running, and can connect OK via public key authentication. Running makeSOCKcluster on just my local machine works OK; running it to the other computer fails. My SSH logs show the connection accepted, authenticated, and then dropped. I added manual=TRUE to get the command line for the remote machine. Running what is printed on that machine from the console is OK: my session connects and pushes work out OK, so I know the command line is not the problem. I then modified the newSOCKnode function to print the cmd variable before calling it on the system. This gives me the ssh command with the remote command appended. It looks OK. When I paste this into the local command line, it connects to the remote machine and starts Rscript.exe without issue. I've tried the outfile= option without success -- the file is not created on the remote machine. I'm at a loss as to where to go and what to try next. Any help would be greatly appreciated. Thanks, Dominic

Dominic J. Pazzula +-+-+-+-+-+-+-+-+-+-+- domp...@yahoo.com
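For reference, the kind of call being debugged looks like the sketch below; the host name and log file name are placeholders, not taken from the original post:

```r
library(snow)

# Two-node socket cluster: the local machine plus one remote worker
# reached over ssh. manual = TRUE prints the command used to start each
# worker instead of running it automatically, which is exactly the
# debugging technique described above; outfile redirects worker output
# to a log file on the worker's side.
cl <- makeSOCKcluster(c("localhost", "remote-host"),
                      manual  = TRUE,
                      outfile = "snow_worker.log")

# Quick sanity check that both workers respond:
clusterCall(cl, function() Sys.info()[["nodename"]])

stopCluster(cl)
```

If the manually started command works but the automatic launch does not, comparing the printed command against what ssh actually executes is usually the next step.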
[R] access to raw data sets for six-sigma training
Dear R-help, I'm trying to teach myself six-sigma analysis (with R) and visualization (with d3.js), but all I can find is summary xls sheets, never the raw data the charts were made from. Does anyone know where I can get access to large amounts of quality data? Industry/application doesn't matter. Thanks! Peter
Re: [R] Why is my data always imported as a list?
A data.frame is a list with some extra attributes. When you subset a data.frame as z["Column"] you get a one-column data.frame (which boxplot rejects because it wants numeric or character data). Subsetting it as either z[, "Column"] or z[["Column"]] gives you the column itself, not a data.frame containing one column.

z <- data.frame(One=log(1:10), Two=rep(c("i","ii","iii"), c(3,4,3)))
str(z["One"])
'data.frame': 10 obs. of 1 variable:
 $ One: num 0 0.693 1.099 1.386 1.609 ...
str(z[, "One"])
 num [1:10] 0 0.693 1.099 1.386 1.609 ...
str(z[["One"]])
 num [1:10] 0 0.693 1.099 1.386 1.609 ...

In the particular case of the formula interface to boxplot (and to other functions), you can avoid having to choose the column-extraction operator by using the data= argument. The following three examples give the same result:

boxplot(data=z, One ~ Two)
boxplot(z[["One"]] ~ z[["Two"]])
boxplot(z[, "One"] ~ z[, "Two"])

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Samantha Sifleet
Sent: Monday, June 11, 2012 9:29 AM
To: r-help@r-project.org
Subject: [R] Why is my data always imported as a list?

Hi, I am relatively new to R, so this is probably a really basic issue that I keep hitting. I read my data into R using the read.csv command:

x = rep("numeric", 3)
CB_un = read.csv("Corn_Belt_unirr.csv", header=TRUE, colClasses=c("factor", x))
# I have clearly told R that I have one factor variable and 3 numeric variables in this table.
# But if I try to do anything with them, I get an error:
boxplot(CB_un["Value"] ~ CB_un["State.Fips"])
Error in model.frame.default(formula = CB_un["Value"] ~ CB_un["State.Fips"]) :
  invalid type (list) for variable 'CB_un["Value"]'

# Because these variables are all stored as lists.
# So, I have to unpack them.
CB_unirr_rent <- as.numeric(unlist(CB_un["Value"]))
CB_unirr_State <- as.factor(unlist(CB_un["State.Fips"]))
# before I can do anything with them
boxplot(CB_unirr_rent ~ CB_unirr_State)

Is there a reason my data is always imported as lists? Is there a way to skip this unpacking step? Thanks, Sam
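Pulling the advice in this thread together, here is a minimal self-contained sketch; the data frame below is made up to resemble the poster's file (CB_un, Value, and State.Fips stand in for the real CSV, which was not shared):

```r
# One factor column plus a numeric column, like the poster's CSV
set.seed(1)
CB_un <- data.frame(State.Fips = factor(rep(c("17", "18", "19"), each = 5)),
                    Value      = c(rnorm(5, 150, 20),
                                   rnorm(5, 170, 25),
                                   rnorm(5, 130, 15)))

str(CB_un)               # a data.frame, i.e. a list of columns

class(CB_un["Value"])    # "data.frame" -- single brackets keep the list wrapper
class(CB_un[["Value"]])  # "numeric"    -- double brackets extract the column

# The formula interface sidesteps extraction entirely:
boxplot(Value ~ State.Fips, data = CB_un)
```

No unlist() step is needed: the "unpacking" in the original post was only undoing the single-bracket subsetting.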
Re: [R] bubbleplot3: R equivalent of the MATLAB function?
What exactly does it do? Any pointers to relevant plots?

John Kane Kingston ON Canada

-----Original Message-----
From: timothyjosephmur...@gmail.com
Sent: Mon, 11 Jun 2012 10:54:28 -0500
To: r-help@r-project.org
Subject: [R] bubbleplot3: R equivalent of the MATLAB function?

All, does there exist an R equivalent of the MATLAB function bubbleplot3? Semi-naive Googling has thus far revealed no such package to me. Your insight is appreciated! Regards, TJM
[R] Define a variable on a non-standard year interval (Water Years)
Hello, I am trying to define a different interval for a year. In hydrology, a water year is defined as the period between October 1st and September 30th of the following year. I was wondering how I might do this in R. Say I have a data.frame like the following, and I want to extract a variable with the water-year specs as defined above:

df <- data.frame(Date=seq(as.Date("2000/10/1"), as.Date("2003/9/30"), "days"))
## Extract the normal year
df$year <- factor(format(as.Date(df$Date), "%Y"))

So the question is: how might I define a variable that extends from October 1st to September 30th rather than the normal January 1st to December 31st? Thanks in advance! Sam
Re: [R] generating random samples of IG distribution
Should have been:

For the normal inverse Gaussian: package 'GeneralizedHyperbolic'
For the generalized inverse Gaussian: package 'HyperbolicDist'

-- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352

-----Original Message-----
From: David L Carlson [mailto:dcarl...@tamu.edu]
Sent: Monday, June 11, 2012 10:26 AM
To: 'shirin nezampour'; 'r-help@r-project.org'
Subject: RE: [R] generating random samples of IG distribution

For the normal inverse Gaussian: package 'GeneralizedHyperbolic'
For the generalized inverse Gaussian: package 'GeneralizedHyperbolic'

-- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of shirin nezampour
Sent: Sunday, June 10, 2012 11:37 AM
To: r-help@r-project.org
Subject: [R] generating random samples of IG distribution

Dear R users, I want to generate random samples from the inverse Gaussian distribution. How can I do this, and what package should I install? Thanks. Shirin
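Besides the packages named above, the (plain) inverse Gaussian can be sampled in a few lines of base R using the Michael, Schucany & Haas (1976) transformation method; the function name rinvgauss_basic below is made up for this sketch (contributed packages such as statmod provide a ready-made rinvgauss):

```r
# Sample n draws from the inverse Gaussian IG(mu, lambda)
# via the Michael, Schucany & Haas (1976) transformation method.
rinvgauss_basic <- function(n, mu, lambda) {
  y <- rnorm(n)^2                           # chi-squared(1) variates
  x <- mu + mu^2 * y / (2 * lambda) -
       mu / (2 * lambda) * sqrt(4 * mu * lambda * y + mu^2 * y^2)
  u <- runif(n)
  # Accept the smaller root x with probability mu/(mu + x),
  # otherwise return the paired root mu^2/x.
  ifelse(u <= mu / (mu + x), x, mu^2 / x)
}

set.seed(42)
s <- rinvgauss_basic(1e5, mu = 2, lambda = 3)
mean(s)   # should be close to mu = 2
var(s)    # should be close to mu^3 / lambda = 8/3
```

The mean/variance check at the end is a quick way to validate the sampler against the known moments of IG(mu, lambda).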
Re: [R] bubbleplot3: R equivalent of the MATLAB function?
On 12-06-11 1:49 PM, Timothy Murphy wrote:

John, it does 3D bubble plots, allowing you to represent 4 data dimensions: the x, y, and z coordinates in 3D space, and the radius of the bubble (sphere). There also seem to be coloring options.

Sounds like plot3d with type="s" and size specified. plot3d is in the rgl package.

Duncan Murdoch

I'd like to use something like this for a project I'm working on: a simulation I've done in R. Of course I can just bring the data over to MATLAB and do it there, but this seems like something that would be a valuable addition to the R environment. I'm considering attempting to implement it as a package in R if it hasn't already been done. Here's a link to the MATLAB source code: http://www.mathworks.com/matlabcentral/fileexchange/8231-bubbleplot3/content/bubbleplot3.m

Arun, the links you sent seem to do only two-dimensional bubble plots; am I missing something?

Thanks, TJ Murphy

On Mon, Jun 11, 2012 at 11:51 AM, John Kane jrkrid...@inbox.com wrote:

What exactly does it do? Any pointers to relevant plots?

John Kane Kingston ON Canada

-----Original Message-----
From: timothyjosephmur...@gmail.com
Sent: Mon, 11 Jun 2012 10:54:28 -0500
To: r-help@r-project.org
Subject: [R] bubbleplot3: R equivalent of the MATLAB function?

All, does there exist an R equivalent of the MATLAB function bubbleplot3? Semi-naive Googling has thus far revealed no such package to me. Your insight is appreciated! Regards, TJM
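A minimal sketch of Duncan's suggestion; the data and the radius/colour scaling below are made up for illustration:

```r
# 3D bubble plot with rgl: x, y, z positions plus a 4th data
# dimension encoded as sphere radius (and optionally a 5th as colour).
library(rgl)

set.seed(1)
n <- 30
x <- runif(n); y <- runif(n); z <- runif(n)
w <- runif(n, 1, 5)                       # 4th dimension -> bubble size

plot3d(x, y, z,
       type   = "s",                      # "s" draws spheres
       radius = w / 50,                   # scale radii to the data range
       col    = heat.colors(5)[cut(w, 5)])  # bin w into 5 colour classes
```

The radius scaling (w / 50 here) usually needs tuning by eye so the bubbles neither overlap completely nor vanish.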
Re: [R] bubbleplot3: R equivalent of the MATLAB function?
On 2012-06-11 10:49, Timothy Murphy wrote:

John, it does 3D bubble plots, allowing you to represent 4 data dimensions: the x, y, and z coordinates in 3D space, and the radius of the bubble (sphere). There also seem to be coloring options. I'd like to use something like this for a project I'm working on: a simulation I've done in R. Of course I can just bring the data over to MATLAB and do it there, but this seems like something that would be a valuable addition to the R environment. I'm considering attempting to implement it as a package in R if it hasn't already been done.

[snip]

You might check out the rgl package. It does 3D spheres.

Peter Ehlers
Re: [R] bubbleplot3: R equivalent of the MATLAB function?
Duncan,

Excellent, that looks like it will do the trick exactly. Thank you for your help.

TJ

On Mon, Jun 11, 2012 at 2:27 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 12-06-11 1:49 PM, Timothy Murphy wrote:

John, it does 3D bubble plots, allowing you to represent 4 data dimensions: the x, y, and z coordinates in 3D space, and the radius of the bubble (sphere). There also seem to be coloring options.

Sounds like plot3d with type="s" and size specified. plot3d is in the rgl package.

Duncan Murdoch

I'd like to use something like this for a project I'm working on: a simulation I've done in R. Of course I can just bring the data over to MATLAB and do it there, but this seems like something that would be a valuable addition to the R environment. I'm considering attempting to implement it as a package in R if it hasn't already been done. Here's a link to the MATLAB source code: http://www.mathworks.com/matlabcentral/fileexchange/8231-bubbleplot3/content/bubbleplot3.m

Arun, the links you sent seem to do only two-dimensional bubble plots; am I missing something?

Thanks, TJ Murphy

On Mon, Jun 11, 2012 at 11:51 AM, John Kane jrkrid...@inbox.com wrote:

What exactly does it do? Any pointers to relevant plots?

John Kane Kingston ON Canada

-----Original Message-----
From: timothyjosephmur...@gmail.com
Sent: Mon, 11 Jun 2012 10:54:28 -0500
To: r-help@r-project.org
Subject: [R] bubbleplot3: R equivalent of the MATLAB function?

All, does there exist an R equivalent of the MATLAB function bubbleplot3? Semi-naive Googling has thus far revealed no such package to me. Your insight is appreciated!
Regards, TJM
Re: [R] Define a variable on a non-standard year interval (Water Years)
Here's one way (not including the conversion to factor):

wdf <- data.frame(Date=seq(as.Date("2000/10/1"), as.Date("2003/9/30"), "days"))
wdf$wyr <- as.numeric(format(wdf$Date, '%Y'))
is.nxt <- as.numeric(format(wdf$Date, '%m')) %in% 1:9
wdf$wyr[is.nxt] <- wdf$wyr[is.nxt] - 1

## and you can do some rudimentary checking with:
table(format(wdf$Date, '%Y'), wdf$wyr)

Note that df is the name of an R-supplied function and therefore not a good choice for one's own use.

(By the way, thank you for providing easy-to-recreate example data. As a minor point, this:

wdf <- data.frame(Date=seq(as.Date("2000/10/1"), as.Date("2003/9/30"), "3 weeks"))

would be just a little easier to work with for testing.)

-- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062

On 6/11/12 10:22 AM, Sam Albers tonightstheni...@gmail.com wrote:

Hello, I am trying to define a different interval for a year. In hydrology, a water year is defined as the period between October 1st and September 30th of the following year. I was wondering how I might do this in R. Say I have a data.frame like the following, and I want to extract a variable with the water-year specs as defined above:

df <- data.frame(Date=seq(as.Date("2000/10/1"), as.Date("2003/9/30"), "days"))
## Extract the normal year
df$year <- factor(format(as.Date(df$Date), "%Y"))

So the question is: how might I define a variable that extends from October 1st to September 30th rather than the normal January 1st to December 31st? Thanks in advance! Sam
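One detail worth noting: the code above labels each water year by the calendar year in which it begins (Jan-Sep dates get the previous year). The USGS convention instead names a water year for the calendar year in which it ends, which can be written without the intermediate logical vector; a sketch using the same test data:

```r
wdf <- data.frame(Date = seq(as.Date("2000/10/1"), as.Date("2003/9/30"), "3 weeks"))

yr <- as.numeric(format(wdf$Date, "%Y"))
mo <- as.numeric(format(wdf$Date, "%m"))

# Oct-Dec dates belong to the water year that ends the following September
wdf$wyr <- ifelse(mo >= 10, yr + 1, yr)

table(wdf$wyr)   # each water year spans Oct 1 through Sep 30
```

Either labelling works for grouping; just pick one convention and state it.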
Re: [R] bubbleplot3: R equivalent of the MATLAB function?
Hi, hope these links will be useful:

http://flowingdata.com/2010/11/23/how-to-make-bubble-charts/
http://sas-and-r.blogspot.com/2010/03/example-728-bubble-plots.html

A.K.

----- Original Message -----
From: Timothy Murphy timothyjosephmur...@gmail.com
To: r-help@r-project.org
Sent: Monday, June 11, 2012 11:54 AM
Subject: [R] bubbleplot3: R equivalent of the MATLAB function?

All, does there exist an R equivalent of the MATLAB function bubbleplot3? Semi-naive Googling has thus far revealed no such package to me. Your insight is appreciated! Regards, TJM
[R] Kruskal Wallis Post hoc
Hi, I have searched and found a response to a question similar to mine, but when I tried the code, R says it's not an actual function, so I thought I'd ask here. http://r.789695.n4.nabble.com/file/n4633035/Cookies.csv Cookies.csv

I have attached the data I am using. I am trying to look at two things: how moisture content changes over time, and how it changes along the length of a log (bolt). My data is not normal and doesn't become normal after ArcSin or Log transformations. Because 'Days' has 4 levels, I performed a Kruskal-Wallis test (kruskal.test) and got significant results; same for comparing 'Cookie'. Neither of these has only 2 levels, so significance here doesn't really explain it in depth, but everything I'm finding for post hoc tests after a KW says Mann-Whitney. But when I try this (wilcox.test) in R it doesn't work, because it should only have 2 levels. Can I simply run a TukeyHSD to see differences between levels?

-- View this message in context: http://r.789695.n4.nabble.com/Kruskal-Wallis-Post-hoc-tp4633035.html Sent from the R help mailing list archive at Nabble.com.
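A common follow-up to a significant Kruskal-Wallis test is all pairwise Wilcoxon (Mann-Whitney) comparisons with a multiplicity correction, which base R provides directly via pairwise.wilcox.test. A sketch with made-up data (the moisture/Days names mirror the question; the attached Cookies.csv was not used):

```r
# Fake skewed moisture data at 4 time points
set.seed(7)
dat <- data.frame(moisture = c(rexp(10, 1), rexp(10, 0.5),
                               rexp(10, 0.25), rexp(10, 0.125)),
                  Days     = factor(rep(c(0, 7, 14, 21), each = 10)))

# Omnibus test across the 4 levels
kruskal.test(moisture ~ Days, data = dat)

# Post hoc: every pair of levels, two-sample Wilcoxon, Holm-adjusted p-values
pairwise.wilcox.test(dat$moisture, dat$Days, p.adjust.method = "holm")
```

TukeyHSD, by contrast, operates on an aov() fit and assumes normal errors, so it is not an appropriate follow-up to a Kruskal-Wallis test; rank-based alternatives such as Dunn's test (available in contributed packages) are another option.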
Re: [R] R2wd error in wdGet
I've tried again and it worked fine this time. I can't figure out why it didn't last time. Thanks for helping anyway! On Mon, Jun 11, 2012 at 4:58 AM, Robert Baer rb...@atsu.edu wrote: wdGet() Error in if (wdapp[["Documents"]][["Count"]] == 0) wdapp[["Documents"]]$Add() : argument is of length zero Not sure what your error means without more context; you should see: Loading required package: rcom Loading required package: rscproxy Do you have RDCOMClient installed and working properly? rcom requires a current version of statconnDCOM installed. To install statconnDCOM type installstatconnDCOM(). This will download and install the current version of statconnDCOM. You will need a working Internet connection because installation needs to download a file. HTH, Rob -Original Message- From: Andreia Leite Sent: Thursday, June 07, 2012 2:53 PM To: r-help@r-project.org Subject: [R] R2wd error in wdGet Dear list, I'm trying to use the R2wd package. I've installed the package and tried wdGet(). However an error message came up. I'm presently using R 2.15.0. wdGet() Error in if (wdapp[["Documents"]][["Count"]] == 0) wdapp[["Documents"]]$Add() : argument is of length zero Does anyone know what this means? Thanks a lot. Andreia Leite -- View this message in context: http://r.789695.n4.nabble.com/R2wd-error-in-wdGet-tp4632737.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Andreia Leite [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R and Ruby integration using RSruby gem
Hi All, Could anyone please describe the installation steps for the RSRuby gem on Windows XP? I have the latest versions of Ruby and R installed on Windows. Thanks -- View this message in context: http://r.789695.n4.nabble.com/R-and-Ruby-integration-using-RSruby-gem-tp4633020.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Do YOU know an equation for splines (ns)?
I was able to get the predicted values from the splines. Thanks so much for the help. I wrote a loop with some of the code that Bill suggested. It seems that when using predict with nlme, it is important to be specific about what one is using as newdata. This does come through in Pinheiro and Bates; I just didn't recognize it to begin with. Bert, I did try your code, but was only getting coefficients, so I may have neglected a step. ##The successful code: library(nlme) library(splines) rootCN <- read.table("spline.txt", header=TRUE) rootCN$plotF <- as.factor(rootCN$plot) rcn10G <- groupedData(N ~ day | plotF, data=rootCN) fit10 <- lme(N ~ ns(day, 3), data = rcn10G) plot(augPred(fit10)) t <- 152:305 subject <- rootCN[11:22, 2] sim <- NULL for(i in 1:12){ sim <- cbind(sim, predict(fit10, data.frame(day=t, plotF=rep(subject[i], length(t))))) } colnames(sim) <- c(subject) par(mfrow=c(4,3)) for(i in 1:12){ plot(t, sim[,i], type="l", main=subject[i]) } -Ranae -- View this message in context: http://r.789695.n4.nabble.com/Do-YOU-know-an-equation-for-splines-ns-tp4632440p4633037.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] replacing values of matrix with random values of another dataframe
Hello, I'm having trouble performing a certain operation within R and I was hoping someone might be able to help. I have a matrix (1000x21) that contains whole-number values ranging from 1-7, and I want to replace all entries within this matrix that have a value of 1 with a random number contained within a different dataframe that has 21 rows and 1000 columns. I've tried using ifelse, but this always seems to return a list with strange output. I've also tried using the replace and apply functions, but can't seem to get anything to work. My code is as follows: #create data frame of 21000 random values m_good_initial <- rnorm(21000, 2.79, 0.18) m_good_D1 <- ifelse(m_good_initial < 0, 0, m_good_initial) m_good_D1mat <- matrix(m_good_D1, byrow=TRUE, 21) m_good_D1 <- as.data.frame(m_good_D1mat) #create matrix of (1000x21) that contains whole-number values ranging from 1-7 #using sample to select values with a respective probability search_strat_good <- sample(landscenarios, 1000, replace=TRUE, prob=com_avgpgood[1,]) for (i in 2:21) { search_strat_good <- cbind(search_strat_good, sample(landscenarios, 1000, replace=TRUE, prob=com_avgpgood[i,])) } #replace all search strategies of value 1 within matrix search_strat_good #with a random value from dataframe m_good_D1 bengood1 <- ifelse(search_strat_good==1, sample(m_good_D1, replace=FALSE), search_strat_good) Any help would be greatly appreciated. -- Curtis Burkhalter [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Kruskal Wallis Post hoc
Look at kruskalmc in package pgirmess, and package multcompView for plotting the results. -- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of jhartsho Sent: Monday, June 11, 2012 2:27 PM To: r-help@r-project.org Subject: [R] Kruskal Wallis Post hoc Hi, I have searched and found a response to a question similar to mine, but when I tried the code, R says it's not an actual function, so I thought I'd ask here. http://r.789695.n4.nabble.com/file/n4633035/Cookies.csv Cookies.csv I have attached the data I am using. I am trying to look at two things: how moisture content changes over time, and how it changes along the length of a log (bolt). My data is not normal and doesn't become normal after arcsine or log transformations. Because 'Days' has 4 levels, I performed a Kruskal-Wallis test (kruskal.test) and got significant results; same for comparing 'Cookie'. Neither of these has only 2 levels, so significance here doesn't really explain it in depth, but everything I'm finding for a post hoc with a KW test says Mann-Whitney. But when I try this (wilcox.test) in R it doesn't work, because it expects only 2 levels. Can I simply run a TukeyHSD to see differences between levels? -- View this message in context: http://r.789695.n4.nabble.com/Kruskal-Wallis-Post-hoc-tp4633035.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
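Following the pointer above, a sketch of the post hoc step. The column names Moisture and Days are assumptions (the attached CSV is not reproduced in this thread), so adjust to your data:

```r
# install.packages("pgirmess")  # if not already installed
library(pgirmess)

cookies <- read.csv("Cookies.csv")             # assumed file and columns
kruskal.test(Moisture ~ Days, data = cookies)  # overall test, as the poster ran

# Nonparametric multiple comparisons after a Kruskal-Wallis test
kruskalmc(cookies$Moisture, factor(cookies$Days))

# Alternative in base R: pairwise Wilcoxon tests with a p-value correction
pairwise.wilcox.test(cookies$Moisture, cookies$Days,
                     p.adjust.method = "holm")
```

Note that TukeyHSD() operates on a fitted aov model and inherits its normality assumptions, so it is generally not the right follow-up to a Kruskal-Wallis test.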
Re: [R] TukeyHSD troubles
Thanks! I was having the same issue. My treatments were 'days', so they were in intervals of 15 days. My ANOVA was fine, but I couldn't get my Tukey test to produce results. I used your code and it worked. -- View this message in context: http://r.789695.n4.nabble.com/TukeyHSD-troubles-tp1570205p4633045.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] replacing values of matrix with random values of another dataframe
Hi Curtis, This isn't quite a reproducible example, though very close. As far as I can tell, you're over-thinking it. Instead of: bengood1 <- ifelse(search_strat_good==1, sample(m_good_D1, replace=FALSE), search_strat_good) try: bengood1 <- search_strat_good bengood1[bengood1 == 1] <- sample(m_good_D1, size=sum(bengood1 == 1), replace=FALSE) I made a couple of other comments inline: On Mon, Jun 11, 2012 at 3:44 PM, Curtis Burkhalter curtisburkhal...@gmail.com wrote: Hello, I'm having trouble performing a certain operation within R and I was hoping someone might be able to help. I have a matrix (1000x21) that contains whole-number values ranging from 1-7, and I want to replace all entries within this matrix that have a value of 1 with a random number contained within a different dataframe that has 21 rows and 1000 columns. I've tried using ifelse, but this always seems to return a list with strange output. I've also tried using the replace and apply functions, but can't seem to get anything to work. My code is as follows: #create data frame of 21000 random values m_good_initial <- rnorm(21000, 2.79, 0.18) m_good_D1 <- ifelse(m_good_initial < 0, 0, m_good_initial) Again, you don't need the ifelse: m_good_D1 <- m_good_initial m_good_D1[m_good_D1 < 0] <- 0 m_good_D1mat <- matrix(m_good_D1, byrow=TRUE, 21) m_good_D1 <- as.data.frame(m_good_D1mat) #create matrix of (1000x21) that contains whole-number values ranging from 1-7 #using sample to select values with a respective probability search_strat_good <- sample(landscenarios, 1000, replace=TRUE, prob=com_avgpgood[1,]) for (i in 2:21) { search_strat_good <- cbind(search_strat_good, sample(landscenarios, 1000, replace=TRUE, prob=com_avgpgood[i,])) } Using cbind is a lot less efficient than creating a matrix of the desired size beforehand and filling in each column. 
Neater yet: search_strat_good <- sapply(1:21, function(i) sample(landscenarios, 1000, replace=TRUE, prob=com_avgpgood[i,])) #replace all search strategies of value 1 within matrix search_strat_good #with a random value from dataframe m_good_D1 bengood1 <- ifelse(search_strat_good==1, sample(m_good_D1, replace=FALSE), search_strat_good) Any help would be greatly appreciated. -- Curtis Burkhalter -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
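A self-contained toy version of the logical-indexing replacement discussed in this thread (dimensions and distribution parameters are shrunk and made up for illustration):

```r
set.seed(1)
# Stand-ins for search_strat_good (integer codes 1-7) and the replacement pool
strat <- matrix(sample(1:7, 50, replace = TRUE), nrow = 10)
pool  <- pmax(rnorm(50, 2.79, 0.18), 0)  # pmax() clamps at 0, like the ifelse()

# Replace every coded 1 with a draw (without replacement) from the pool
ones <- strat == 1
strat[ones] <- sample(pool, sum(ones), replace = FALSE)
```

One caveat worth knowing: sample() applied to a data.frame samples its columns, not its individual values, so drawing from a numeric vector (e.g. unlist(m_good_D1)) is the safer form.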
Re: [R] Data.frames can not hold objects...What can be done in the following scenario?
Rui and the R-help team, In Rui's helpful answer below, the function returns a list as output. When we apply() the function to the data.frame, dataframe$newcolumn has 2 layers of list before we can access each vector's elements. For instance, dataframe$newcolumn[[1]][[1]] is a vector whereas dataframe$newcolumn and dataframe$newcolumn[[1]] are lists. Is there a solution that involves fewer layers of lists? I am just trying to understand the R language better. Thank you. On Sun, Jun 10, 2012 at 3:18 PM, Rui Barradas ruipbarra...@sapo.pt wrote: Hello, What you need is to have your function return a list, not a vector. Like this: testfun <- function(x, y) list(seq(x, y, 1)) testframe <- data.frame(xvalues=c(2,3), yvalues=c(4,5)) testframe$newcolumn <- apply(testframe, 1, function(x) testfun(x[1], x[2])) class(testframe$newcolumn) # [1] "list" Then you access lists and list elements. testframe$newcolumn[[1]] # a list with just one element testframe$newcolumn[[1]][[1]] # that element, a vector testframe$newcolumn[[1]][[1]][2] # the vector's 2nd element Since you want the function to return vectors in order to do further computations, you'll access those vectors by varying the list index: testframe$newcolumn[[1]][[1]] # first list, its only vector testframe$newcolumn[[2]][[1]] # second list, its only vector Etc. Hope this helps, Rui Barradas Em 10-06-2012 12:29, Onur Uncu escreveu: Thank you Duncan. A follow-up question is, how can I achieve the desired result in the earlier email? (i.e. add the resulting vectors as a new column to the existing data.frame?) I tried the following: testframe$newcolumn <- apply(testframe, 1, function(x) testfun(x[1], x[2])) but I am getting the following error: Error in `$<-.data.frame`(`*tmp*`, "vecss", value = c(2, 3, 4, 3, 4, 5 : replacement has 3 rows, data has 2 Thanks for the help. 
On Sun, Jun 10, 2012 at 12:02 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 12-06-10 6:41 AM, Onur Uncu wrote: R-Help community, I understand that data.frames can hold elements of type double, string etc. but NOT objects (such as a matrix etc). That is incorrect. Data frames can hold list vectors. For example: A <- data.frame(x = 1:3) A$y <- list(matrix(1, 2, 2), matrix(2, 3, 3), matrix(3, 4, 4)) A[1,2] will now extract the 2x2 matrix, A[2,2] will extract the 3x3, etc. Duncan Murdoch This is not convenient for me in the following situation. I have a function that takes 2 inputs and returns a vector: testfun <- function(x, y) seq(x, y, 1) I have a data.frame defined as follows: testframe <- data.frame(xvalues=c(2,3), yvalues=c(4,5)) I would like to apply testfun to every row of testframe and then create a new column in the data.frame which holds the returned vectors as objects. Why do I want this? Because the returned vectors are an intermediate step towards further calculations. It would be great to keep adding new columns to the data.frame with the intermediate objects. But this is not possible since data.frames can not hold objects as elements. What do you suggest as an elegant solution in this scenario? Thank you for any help! I would love to hear if forum __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data.frames can not hold objects...What can be done in the following scenario?
It is possible to chain together uses of `[[` -- e.g., x <- list(1:5, list(letters[1:5], list(LETTERS[1:5]))) x[[c(1,2)]] # 2L x[[c(2,1,3)]] # "c" x[[c(2,2,1,3)]] # "C" which is sometimes useful. Best, Michael On Mon, Jun 11, 2012 at 4:35 PM, Onur Uncu onuru...@gmail.com wrote: Rui and the R-help team, In Rui's helpful answer below, the function returns a list as output. When we apply() the function to the data.frame, dataframe$newcolumn has 2 layers of list before we can access each vector's elements. For instance, dataframe$newcolumn[[1]][[1]] is a vector whereas dataframe$newcolumn and dataframe$newcolumn[[1]] are lists. Is there a solution that involves fewer layers of lists? I am just trying to understand the R language better. Thank you. On Sun, Jun 10, 2012 at 3:18 PM, Rui Barradas ruipbarra...@sapo.pt wrote: Hello, What you need is to have your function return a list, not a vector. Like this: testfun <- function(x, y) list(seq(x, y, 1)) testframe <- data.frame(xvalues=c(2,3), yvalues=c(4,5)) testframe$newcolumn <- apply(testframe, 1, function(x) testfun(x[1], x[2])) class(testframe$newcolumn) # [1] "list" Then you access lists and list elements. testframe$newcolumn[[1]] # a list with just one element testframe$newcolumn[[1]][[1]] # that element, a vector testframe$newcolumn[[1]][[1]][2] # the vector's 2nd element Since you want the function to return vectors in order to do further computations, you'll access those vectors by varying the list index: testframe$newcolumn[[1]][[1]] # first list, its only vector testframe$newcolumn[[2]][[1]] # second list, its only vector Etc. Hope this helps, Rui Barradas Em 10-06-2012 12:29, Onur Uncu escreveu: Thank you Duncan. A follow-up question is, how can I achieve the desired result in the earlier email? (i.e. add the resulting vectors as a new column to the existing data.frame?) I tried the following: testframe$newcolumn <- apply(testframe, 1, function(x) testfun(x[1], x[2])) but I am getting the following error: Error in `$<-.data.frame`(`*tmp*`, "vecss", value = c(2, 3, 4, 3, 4, 5 : replacement has 3 rows, data has 2 Thanks for the help. On Sun, Jun 10, 2012 at 12:02 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 12-06-10 6:41 AM, Onur Uncu wrote: R-Help community, I understand that data.frames can hold elements of type double, string etc. but NOT objects (such as a matrix etc). That is incorrect. Data frames can hold list vectors. For example: A <- data.frame(x = 1:3) A$y <- list(matrix(1, 2, 2), matrix(2, 3, 3), matrix(3, 4, 4)) A[1,2] will now extract the 2x2 matrix, A[2,2] will extract the 3x3, etc. Duncan Murdoch This is not convenient for me in the following situation. I have a function that takes 2 inputs and returns a vector: testfun <- function(x, y) seq(x, y, 1) I have a data.frame defined as follows: testframe <- data.frame(xvalues=c(2,3), yvalues=c(4,5)) I would like to apply testfun to every row of testframe and then create a new column in the data.frame which holds the returned vectors as objects. Why do I want this? Because the returned vectors are an intermediate step towards further calculations. It would be great to keep adding new columns to the data.frame with the intermediate objects. But this is not possible since data.frames can not hold objects as elements. What do you suggest as an elegant solution in this scenario? Thank you for any help! I would love to hear if forum __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
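As a footnote to the thread above, the list column can also be built in one step at assignment time. A minimal sketch using base Map() and I() ("as is"), with the same testfun/testframe values discussed in the thread:

```r
testframe <- data.frame(xvalues = c(2, 3), yvalues = c(4, 5))
# Map() applies seq() row by row and returns a plain list;
# I() protects it so data.frame assignment keeps it as a list column
testframe$newcolumn <- I(Map(seq, testframe$xvalues, testframe$yvalues))
testframe$newcolumn[[1]]     # the vector 2 3 4 (one layer of list, not two)
testframe$newcolumn[[2]][2]  # 4
```

This avoids the extra list layer that wrapping the return value in list() introduces.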
[R] Decision Trees or Markov Models for Cost Effectiveness
Hello, I was just assigned to perform a cost effectiveness study in healthcare. We are studying the cost effectiveness of a proposed diagnostic vs. current screening procedures. One of the team members suggested a commercial software package called TreeAge Pro. Looking at the description, it appears to be a nice GUI to some very simple models that could be easily constructed in R. Are there any packages in R for this type of analysis? Additionally, does anyone have any suggestions in general regarding doing this type of analysis in R? Thank You, -- Noah Silverman UCLA Department of Statistics 8117 Math Sciences Building Los Angeles, CA 90095 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data.frames can not hold objects...What can be done in the following scenario?
Hello, There are also other possibilities. What I believe is the easiest is to go back to the beginning, i.e., have the function return a vector as before, and then use lapply on the data.frame's rows. testfun <- function(x, y) seq(x, y, 1) testframe$newcolumn <- lapply(1:nrow(testframe), function(i) testfun(testframe[i, 1], testframe[i, 2])) class(testframe$newcolumn) # [1] "list" testframe$newcolumn[[1]] # a vector, no longer a list testframe$newcolumn[[1]][2] # 2nd element of that vector The main point is that data.frames are lists of a special kind: they implement the statistical concept of variables and their observations, the columns and the rows. And like all lists, their elements can be any R object, including lists. Rui Barradas Em 11-06-2012 23:02, R. Michael Weylandt escreveu: It is possible to chain together uses of `[[` -- e.g., x <- list(1:5, list(letters[1:5], list(LETTERS[1:5]))) x[[c(1,2)]] # 2L x[[c(2,1,3)]] # "c" x[[c(2,2,1,3)]] # "C" which is sometimes useful. Best, Michael On Mon, Jun 11, 2012 at 4:35 PM, Onur Uncu onuru...@gmail.com wrote: Rui and the R-help team, In Rui's helpful answer below, the function returns a list as output. When we apply() the function to the data.frame, dataframe$newcolumn has 2 layers of list before we can access each vector's elements. For instance, dataframe$newcolumn[[1]][[1]] is a vector whereas dataframe$newcolumn and dataframe$newcolumn[[1]] are lists. Is there a solution that involves fewer layers of lists? I am just trying to understand the R language better. Thank you. On Sun, Jun 10, 2012 at 3:18 PM, Rui Barradas ruipbarra...@sapo.pt wrote: Hello, What you need is to have your function return a list, not a vector. Like this: testfun <- function(x, y) list(seq(x, y, 1)) testframe <- data.frame(xvalues=c(2,3), yvalues=c(4,5)) testframe$newcolumn <- apply(testframe, 1, function(x) testfun(x[1], x[2])) class(testframe$newcolumn) # [1] "list" Then you access lists and list elements. testframe$newcolumn[[1]] # a list with just one element testframe$newcolumn[[1]][[1]] # that element, a vector testframe$newcolumn[[1]][[1]][2] # the vector's 2nd element Since you want the function to return vectors in order to do further computations, you'll access those vectors by varying the list index: testframe$newcolumn[[1]][[1]] # first list, its only vector testframe$newcolumn[[2]][[1]] # second list, its only vector Etc. Hope this helps, Rui Barradas Em 10-06-2012 12:29, Onur Uncu escreveu: Thank you Duncan. A follow-up question is, how can I achieve the desired result in the earlier email? (i.e. add the resulting vectors as a new column to the existing data.frame?) I tried the following: testframe$newcolumn <- apply(testframe, 1, function(x) testfun(x[1], x[2])) but I am getting the following error: Error in `$<-.data.frame`(`*tmp*`, "vecss", value = c(2, 3, 4, 3, 4, 5 : replacement has 3 rows, data has 2 Thanks for the help. On Sun, Jun 10, 2012 at 12:02 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 12-06-10 6:41 AM, Onur Uncu wrote: R-Help community, I understand that data.frames can hold elements of type double, string etc. but NOT objects (such as a matrix etc). That is incorrect. Data frames can hold list vectors. For example: A <- data.frame(x = 1:3) A$y <- list(matrix(1, 2, 2), matrix(2, 3, 3), matrix(3, 4, 4)) A[1,2] will now extract the 2x2 matrix, A[2,2] will extract the 3x3, etc. Duncan Murdoch This is not convenient for me in the following situation. I have a function that takes 2 inputs and returns a vector: testfun <- function(x, y) seq(x, y, 1) I have a data.frame defined as follows: testframe <- data.frame(xvalues=c(2,3), yvalues=c(4,5)) I would like to apply testfun to every row of testframe and then create a new column in the data.frame which holds the returned vectors as objects. Why do I want this? Because the returned vectors are an intermediate step towards further calculations. It would be great to keep adding new columns to the data.frame with the intermediate objects. But this is not possible since data.frames can not hold objects as elements. What do you suggest as an elegant solution in this scenario? Thank you for any help! I would love to hear if forum __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Decision Trees or Markov Models for Cost Effectiveness
Noah: On Mon, Jun 11, 2012 at 3:28 PM, Noah Silverman noahsilver...@ucla.edu wrote: Hello, I was just assigned to perform a cost effectiveness study in healthcare. We are studying the cost effectiveness of a proposed diagnostic vs. current screening procedures. -- Please! -- Do you think this really describes your problem sufficiently for a coherent answer? What sort of data do you have? -- what are the measures of cost and effectiveness? -- How complete are your data? -- Is there censoring, missing values? What are the covariates? How measured and recorded? ... I would strongly recommend that you consult with fellow local statisticians. At the very least, they might be able to point you in the right (R package) direction. But I think you need a good deal more help and guidance than that to formulate the questions, both scientific and statistical. -- Bert One of the team members suggest a commercial software package called TreeAge Pro. Looking at the description, it appears to be a nice GUI to some very simple models that could be easily constructed in R. Are there any packages in R for this type of analysis? Additionally, does anyone have any suggestions in general regarding doing this type of analysis in R? Thank You, -- Noah Silverman UCLA Department of Statistics 8117 Math Sciences Building Los Angeles, CA 90095 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] saving sublist lda object with save.image()
Greetings R experts, I'm having some difficulty recovering lda objects that I've saved within sublists using the save.image() function. I am running a script that exports a variety of different information as a list; included within that list is an lda object. I then take that list and create a list of that with all the different replications I've run. Unfortunately I've been having difficulty accessing my lda data when I attempt to get it back after closing and reopening R. Here's an example that would give me the error I'm running into: library(MASS) ldaobject <- lda(Species ~ ., data=iris) someotherresults <- c(1:5) list1 <- list(someotherresults, ldaobject) list2 <- list(list1, list1, list1) plot(list2[[1]][[2]]) #plots the ldaobject save.image('ldalists.Rdata') ###Now if I close my R buffer and reopen it I get: load('ldalists.Rdata') plot(list2[[1]][[2]]) Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' is a list, but does not have components 'x' and 'y' ### And my lda objects appear to have changed and look like: list2[[1]][[2]] $prior setosa versicolor virginica 0.333 0.333 0.333 $counts setosa versicolor virginica 50 50 50 $means Sepal.Length Sepal.Width Petal.Length Petal.Width setosa 5.006 3.428 1.462 0.246 versicolor 5.936 2.770 4.260 1.326 virginica 6.588 2.974 5.552 2.026 $scaling LD1 LD2 Sepal.Length 0.8293776 0.02410215 Sepal.Width 1.5344731 2.16452123 Petal.Length -2.2012117 -0.93192121 Petal.Width -2.8104603 2.83918785 $lev [1] "setosa" "versicolor" "virginica" $svd [1] 48.642644 4.579983 $N [1] 150 $call lda(formula = Species ~ ., data = iris) $terms Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width attr(,"variables") list(Species, Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) attr(,"factors") Sepal.Length Sepal.Width Petal.Length Petal.Width Species 0 0 0 0 Sepal.Length 1 0 0 0 Sepal.Width 0 1 0 0 Petal.Length 0 0 1 0 Petal.Width 0 0 0 1 attr(,"term.labels") [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" attr(,"order") [1] 1 1 1 1 attr(,"intercept") [1] 1 
attr(,"response") [1] 1 attr(,".Environment") <environment: R_GlobalEnv> attr(,"predvars") list(Species, Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) attr(,"dataClasses") Species Sepal.Length Sepal.Width Petal.Length Petal.Width factor numeric numeric numeric numeric $xlevels named list() attr(,"class") [1] "lda" If anyone can help me out with a way to circumvent this problem or to recover my data, that would be an immense help! Thanks again! -Dan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
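A workaround worth trying here (a suggestion based on general R serialization behavior, not a confirmed diagnosis of this particular report): serialize the results list explicitly with saveRDS()/readRDS() instead of saving the whole workspace image, and check with inherits() that the class attribute survives the round trip:

```r
library(MASS)

ldaobject <- lda(Species ~ ., data = iris)
list2 <- list(list(1:5, ldaobject), list(1:5, ldaobject))

# Serialize only this object, not the whole workspace
saveRDS(list2, "ldalists.rds")
rm(list2)

list2 <- readRDS("ldalists.rds")
# The class attribute should survive, so S3 dispatch to the lda plot method works
stopifnot(inherits(list2[[1]][[2]], "lda"))
plot(list2[[1]][[2]])
```

saveRDS() also avoids a common save.image() pitfall: load() restores objects into the current environment under their original names, which can silently clobber or be clobbered by other variables.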
Re: [R] replacing values of matrix with random values of another dataframe
Hi, Not sure if this is what you want: dat1 <- matrix(sample(1:7, 70, replace=TRUE), ncol=10, byrow=FALSE) dat2 <- matrix(sample(1:7, 70, replace=TRUE), ncol=7, byrow=FALSE) dat1[dat1[,1:10]==1] <- NA dat1 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [1,] 3 4 3 2 3 7 6 2 4 3 [2,] 2 5 NA 7 2 4 2 5 4 6 [3,] 7 NA 2 2 5 7 2 6 7 3 [4,] NA 5 2 5 NA 6 5 5 NA 7 [5,] 5 5 6 5 7 6 4 4 7 6 [6,] 5 7 5 NA 6 5 3 6 3 2 [7,] 3 7 NA 3 2 4 7 4 5 7 dat1[is.na(dat1)] <- sample(dat2[,1:7]) dat1 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [1,] 3 4 3 2 3 7 6 2 4 3 [2,] 2 5 1 7 2 4 2 5 4 6 [3,] 7 7 2 2 5 7 2 6 7 3 [4,] 1 5 2 5 1 6 5 5 1 7 [5,] 5 5 6 5 7 6 4 4 7 6 [6,] 5 7 5 3 6 5 3 6 3 2 [7,] 3 7 4 3 2 4 7 4 5 7 A.K. - Original Message - From: Curtis Burkhalter curtisburkhal...@gmail.com To: r-help@r-project.org Cc: Sent: Monday, June 11, 2012 3:44 PM Subject: [R] replacing values of matrix with random values of another dataframe Hello, I'm having trouble performing a certain operation within R and I was hoping someone might be able to help. I have a matrix (1000x21) that contains whole-number values ranging from 1-7, and I want to replace all entries within this matrix that have a value of 1 with a random number contained within a different dataframe that has 21 rows and 1000 columns. I've tried using ifelse, but this always seems to return a list with strange output. I've also tried using the replace and apply functions, but can't seem to get anything to work. 
My code is as follows: #create data frame of 21000 random values m_good_initial <- rnorm(21000, 2.79, 0.18) m_good_D1 <- ifelse(m_good_initial < 0, 0, m_good_initial) m_good_D1mat <- matrix(m_good_D1, byrow=TRUE, 21) m_good_D1 <- as.data.frame(m_good_D1mat) #create matrix of (1000x21) that contains whole-number values ranging from 1-7 #using sample to select values with a respective probability search_strat_good <- sample(landscenarios, 1000, replace=TRUE, prob=com_avgpgood[1,]) for (i in 2:21) { search_strat_good <- cbind(search_strat_good, sample(landscenarios, 1000, replace=TRUE, prob=com_avgpgood[i,])) } #replace all search strategies of value 1 within matrix search_strat_good #with a random value from dataframe m_good_D1 bengood1 <- ifelse(search_strat_good==1, sample(m_good_D1, replace=FALSE), search_strat_good) Any help would be greatly appreciated. -- Curtis Burkhalter [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] eclipse cran r
Hi, I see you can create LaTeX PDF files using Eclipse, and that R code can be put into them directly. I have tried, but it seems very difficult. Is there an easy step-by-step manual explaining all the programs and folders needed?
Re: [R] Decision Trees or Markov Models for Cost Effectiveness
Bert, I AM a statistician. Your lack of familiarity with this particular subset of stats is no reason to insult me or my posting. Cost effectiveness is a very standard field of study in health care and epidemiology. The methods and models are common and well defined. I did not ask for help with my model, my math, or my stats. It isn't relevant to my question to describe the entire study being performed. My question was simply in relation to any existing packages in R that might make the job easier. I tend to prefer using R over myopic commercial packages that often don't offer much more than R does. Instead of having the department purchase an expensive software license, I'd rather do this in R. (You can do almost anything in R.) -- Noah Silverman UCLA Department of Statistics 8117 Math Sciences Building Los Angeles, CA 90095

On Jun 11, 2012, at 4:35 PM, Bert Gunter wrote:

Noah: On Mon, Jun 11, 2012 at 3:28 PM, Noah Silverman noahsilver...@ucla.edu wrote: Hello, I was just assigned to perform a cost effectiveness study in healthcare. We are studying the cost effectiveness of a proposed diagnostic vs. current screening procedures. -- Please! -- Do you think this really describes your problem sufficiently for a coherent answer? What sort of data do you have? What are the measures of cost and effectiveness? How complete are your data? Is there censoring, or missing values? What are the covariates? How are they measured and recorded? ... I would strongly recommend that you consult with fellow local statisticians. At the very least, they might be able to point you in the right (R package) direction. But I think you need a good deal more help and guidance than that to formulate the questions, both scientific and statistical. -- Bert

One of the team members suggested a commercial software package called TreeAge Pro. Looking at the description, it appears to be a nice GUI to some very simple models that could be easily constructed in R.
Are there any packages in R for this type of analysis? Additionally, does anyone have any suggestions in general regarding doing this type of analysis in R? Thank You, -- Noah Silverman UCLA Department of Statistics 8117 Math Sciences Building Los Angeles, CA 90095 -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
[R] Simple Binning of Values
Hello, I am very new to R. I have an R task to complete that I have not been able to find a straightforward answer to as of yet. I have a list of values. I would like to count the number of values that fall in one bin, the number that fall in the next bin, etc. For example, my input file is:

123 48 342 442 43 232 32 129 191 147

I would like the output to be similar to:

0-100   3
100-200 4
200-300 1
300-400 1
400-500 1

Thus far I have tried working with hist, cut and split, but have not gotten very far. Thanks, Kat
[R] ff-package: * operator overloading setOldClass
Did you ever find an answer to your posted question? I find I get results if I grab parts of my ff object (indexing or chunks), do the multiplication, and then loop through the rest of the data, but that strikes me as a weak workaround. One pass should be sufficient, as we see with traditional R objects. More help appreciated. -- View this message in context: http://r.789695.n4.nabble.com/operator-overloading-setOldClass-tp4607889p4633053.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Abrupt closure of R when using .C function
Thanks for your response, Peter. Indeed, when I run the R code in a Linux terminal, I get the following error message: address (nil), cause 'memory not mapped'. Thanks for these tips.

2012/6/7 peter dalgaard pda...@gmail.com

On Jun 7, 2012, at 12:52 , Nouedoui Laetitia wrote: Hi Everyone, This is my first message on this discussion list. I created an R function which includes a .C call. I didn't get any errors, neither from the C side nor from the R side. I tried to use the proper types in R. But the problem is that I get an abrupt closure of R, with the following message: R for Windows GUI front-end encountered a problem and needs to close. Does anyone have an idea about where this abrupt closure comes from?

Usually, the C code did something disastrous like writing to memory it doesn't own. First step to find out what happened is to run the same code from Rterm; second step is to learn how to use a debugger (which I have so far avoided having to do under Windows, so don't ask me about it). Thank you.

-- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] eclipse cran r
There's no such thing as cran r -- there's just R, possibly using the binaries distributed by CRAN. [Both are capitalized, por favor] With that said, I believe the term you are looking for is Sweave or, for a slightly different take on things, knitr. Both are well documented with a little bit of google-fu. Best, Michael

On Mon, Jun 11, 2012 at 5:32 PM, Trying To learn again tryingtolearnag...@gmail.com wrote: hi i see you can create latex pdf files using eclipse and that cran r code can be put on it directly. I have tried but it seems very difficult there is an easy way step by step manual and explaining all programs and folders needed?
Re: [R] saving sublist lda object with save.image()
Untested guess -- are you reloading MASS in the new R session? I don't believe loaded packages are saved in the session image... If MASS isn't around, R won't know how to plot an lda object. Best, Michael

On Mon, Jun 11, 2012 at 4:55 PM, dga...@huskers.unl.edu wrote:

Greetings R experts, I'm having some difficulty recovering lda objects that I've saved within sublists using the save.image() function. I am running a script that exports a variety of different information as a list; included within that list is an lda object. I then take that list and create a list of it with all the different replications I've run. Unfortunately I've been having difficulty accessing my lda data when I attempt to get it back after closing and reopening R. Here's an example that would give me the error I'm running into:

library(MASS)
ldaobject <- lda(Species ~ ., data = iris)
someotherresults <- c(1:5)
list1 <- list(someotherresults, ldaobject)
list2 <- list(list1, list1, list1)
plot(list2[[1]][[2]])  # plots the ldaobject
save.image('ldalists.Rdata')

### Now if I close my R buffer and reopen it I get:

load('ldalists.Rdata')
plot(list2[[1]][[2]])
Error in xy.coords(x, y, xlabel, ylabel, log) :
  'x' is a list, but does not have components 'x' and 'y'

### And my lda objects appear to have changed and look like:

list2[[1]][[2]]
$prior
    setosa versicolor  virginica
     0.333      0.333      0.333

$counts
    setosa versicolor  virginica
        50         50         50

$means
           Sepal.Length Sepal.Width Petal.Length Petal.Width
setosa            5.006       3.428        1.462       0.246
versicolor        5.936       2.770        4.260       1.326
virginica         6.588       2.974        5.552       2.026

$scaling
                    LD1         LD2
Sepal.Length  0.8293776  0.02410215
Sepal.Width   1.5344731  2.16452123
Petal.Length -2.2012117 -0.93192121
Petal.Width  -2.8104603  2.83918785

$lev
[1] "setosa"     "versicolor" "virginica"

$svd
[1] 48.642644  4.579983

$N
[1] 150

$call
lda(formula = Species ~ ., data = iris)

$terms
Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width
attr(,"variables")
list(Species, Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)
attr(,"factors")
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Species                 0           0            0           0
Sepal.Length            1           0            0           0
Sepal.Width             0           1            0           0
Petal.Length            0           0            1           0
Petal.Width             0           0            0           1
attr(,"term.labels")
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"
attr(,"order")
[1] 1 1 1 1
attr(,"intercept")
[1] 1
attr(,"response")
[1] 1
attr(,".Environment")
<environment: R_GlobalEnv>
attr(,"predvars")
list(Species, Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)
attr(,"dataClasses")
     Species Sepal.Length  Sepal.Width Petal.Length  Petal.Width
      factor      numeric      numeric      numeric      numeric

$xlevels
named list()

attr(,"class")
[1] "lda"

If anyone can help me out with a way to circumvent this problem or to recover my data, that would be an immense help! Thanks again! -Dan
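Assuming Michael's diagnosis is right, the fix is simply to attach MASS before plotting in the new session, so that plot() can find the registered S3 method plot.lda. A self-contained sketch of the round trip, using a temporary file in place of ldalists.Rdata:

```r
library(MASS)

# reproduce the thread's setup
ldaobject <- lda(Species ~ ., data = iris)
list2 <- list(list(1:5, ldaobject))
f <- tempfile(fileext = ".Rdata")
save(list2, file = f)

# simulate reopening R: drop the object, then reload the saved image
rm(list2)
load(f)

# the "lda" class survives the save/load round trip; plot() dispatches to
# MASS's plot.lda method only while MASS is loaded in the session
stopifnot(inherits(list2[[1]][[2]], "lda"))
plot(list2[[1]][[2]])
```

The saved image stores the objects but not the set of loaded packages, which is why a fresh session that only calls load() falls through to plot.default and produces the xy.coords error.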
Re: [R] Simple Binning of Values
Hi Kat, Something along the lines of the following might work:

x <- c(123, 48, 342, 442, 43, 232, 32, 129, 191, 147)
table(cut(x, c(0, 100, 200, 300, 400, 500)))

See ?cut and ?table for more details. Best, Jorge.-

On Mon, Jun 11, 2012 at 5:22 PM, Kathryn B Walters-Conte wrote:

Hello, I am very new to R. I have an R task to complete that I have not been able to find a straightforward answer to as of yet. I have a list of values. I would like to count the number of values that fall in one bin, the number that fall in the next bin, etc. For example, my input file is:

123 48 342 442 43 232 32 129 191 147

I would like the output to be similar to:

0-100   3
100-200 4
200-300 1
300-400 1
400-500 1

Thus far I have tried working with hist, cut and split, but have not gotten very far. Thanks, Kat
[R] how to skip out and go on read while the read.table meet with a null file
Hi, R users. I read a series of files with the command shown below:

cop_x_data <- read.table(flnm, skip=2)

The first two lines of each file are header lines, so I skip them. Sometimes the original data file is empty: there is no data in the file except for the header information. In that situation my command stops with an error telling me that there is no data in the file, and the loop cannot go on. How can I keep reading the remaining files even if one file is empty? Thank you. -- TANG Jie
Re: [R] how to skip out and go on read while the read.table meet with a null file
On Jun 11, 2012, at 10:06 PM, Jie Tang wrote:

hi, R users. I read a series of files with the command shown below: cop_x_data <- read.table(flnm, skip=2) The first two lines of the files are header lines and I skip them. And sometimes the original data file is null and there is no data in the file except for the header information. In this situation, my command errors out, telling me that there is no data in the file, and cannot go on. How can I go on reading the files even if the file is null?

There are many examples in the archives that show how to recover from errors using 'tr'y and 'tryCatch'. -- David Winsemius, MD West Hartford, CT
Re: [R] how to skip out and go on read while the read.table meet with a null file
On Jun 11, 2012, at 10:16 PM, David Winsemius wrote:

On Jun 11, 2012, at 10:06 PM, Jie Tang wrote: hi, R users. I read a series of files with the command shown below: cop_x_data <- read.table(flnm, skip=2) The first two lines of the files are header lines and I skip them. And sometimes the original data file is null and there is no data in the file except for the header information. In this situation, my command errors out, telling me that there is no data in the file, and cannot go on. How can I go on reading the files even if the file is null?

There are many examples in the archives that show how to recover from errors using 'tr'y and 'tryCatch'.

Well. 'try' anyway. -- David Winsemius, MD West Hartford, CT
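Concretely, a wrapper along these lines keeps a loop alive when a file holds only the two header lines; the function name and the NULL fallback are my own choices, not from the thread:

```r
read_or_null <- function(flnm) {
  tryCatch(read.table(flnm, skip = 2),
           error = function(e) NULL)  # header-only file -> NULL instead of stopping
}

# demo files: one with data after the headers, one without
f1 <- tempfile(); writeLines(c("header1", "header2", "1 2 3"), f1)
f2 <- tempfile(); writeLines(c("header1", "header2"), f2)

d1 <- read_or_null(f1)  # a 1-row data frame
d2 <- read_or_null(f2)  # NULL; the calling loop can test is.null() and move on
```

Checking the result with is.null() at each iteration lets the loop skip empty files and continue with the rest of the series.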
[R] Cost-effectiveness studies post -- my error
In an (appropriately) private reply, Noah Silverman told me that cost effectiveness study is a well-defined term in epidemiology and that my frustration at his vague description was therefore just a reflection of my ignorance. I accept that and wish to publicly acknowledge my error. My apologies to Noah and the list. -- Bert -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] Simple Binning of Values
Hi Kat, Try this:

x1 <- c(123, 48, 342, 442, 43, 232, 32, 129, 191, 147)
res <- cut(x1, breaks=seq(0, 500, by=100))
library(reshape)
res2 <- melt(table(res))
colnames(res2) <- c("bin", "count")
res2
        bin count
1   (0,100]     3
2 (100,200]     4
3 (200,300]     1
4 (300,400]     1
5 (400,500]     1

I am trying to figure out how to get rid of the brackets. Otherwise, the results look fine. A.K.

----- Original Message ----- From: Kathryn B Walters-Conte kbw1...@yahoo.com To: r-help@r-project.org r-help@r-project.org Cc: Sent: Monday, June 11, 2012 5:22 PM Subject: [R] Simple Binning of Values

Hello, I am very new to R. I have an R task to complete that I have not been able to find a straightforward answer to as of yet. I have a list of values. I would like to count the number of values that fall in one bin, the number that fall in the next bin, etc. For example, my input file is:

123 48 342 442 43 232 32 129 191 147

I would like the output to be similar to:

0-100   3
100-200 4
200-300 1
300-400 1
400-500 1

Thus far I have tried working with hist, cut and split, but have not gotten very far. Thanks, Kat
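As for getting rid of the brackets: cut() takes a labels argument, so the bins can be named directly in the 0-100 style Kat asked for, with no reshape step needed. A small sketch:

```r
x1 <- c(123, 48, 342, 442, 43, 232, 32, 129, 191, 147)
brks <- seq(0, 500, by = 100)
labs <- paste(head(brks, -1), tail(brks, -1), sep = "-")  # "0-100" "100-200" ...
res <- table(cut(x1, breaks = brks, labels = labs))
res
#   0-100 100-200 200-300 300-400 400-500
#       3       4       1       1       1
```

Note that cut() still bins with right-closed intervals by default (so 100 would land in 0-100), only the printed labels change.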