Re: [R] Finding data association in R
Johannes Hüsing wrote:
> On 19.05.2009 at 05:39, phen_ys wrote:
>> surgery <- data.frame(outcome = c(0, 0, 0, 0, 0, 0, 0, 0, 0,
>>     0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0,
>>     0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0), age = c(50, 50, 51,
>>     51, 53, 54, 54, 54, 55, 55, 56, 56, 56, 57, 57, 57, 57, 58,
>>     59, 60, 61, 61, 61, 62, 62, 62, 62, 63, 63, 63, 64, 64, 65,
>>     67, 67, 68, 68, 69, 70, 71))
>>
>> How can I use R to find the association between death rate and age in the above data?
>
> with(surgery, boxplot(age ~ outcome))

I tried to fit this with the lm function, but the result doesn't make much sense. The question I'm trying to answer is whether the death rate is associated with age, e.g. whether the death rate is higher at older ages.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Finding data association in R
Your problem is statistical and has nothing particularly to do with R. It looks like homework to me. You may care to look at it this way:

###
fm <- glm(outcome ~ age, binomial, surgery)
summary(fm)

Call:
glm(formula = outcome ~ age, family = binomial, data = surgery)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.6601  -0.8099  -0.5839   1.0491   1.7079

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -10.48174    4.30409  -2.435   0.0149
age           0.16295    0.07018   2.322   0.0202

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 51.796  on 39  degrees of freedom
Residual deviance: 45.301  on 38  degrees of freedom
AIC: 49.301

Number of Fisher Scoring iterations: 3
###

There are still a few questions left, of course.

Bill Venables
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of phen_ys
Sent: Tuesday, 19 May 2009 5:17 PM
To: r-help@r-project.org
Subject: Re: [R] Finding data association in R

> I try to fit this into the model use lm function, but it doesn't make much sense. The question i'm trying to answer is whether death rate is associated with age. E.g the death rate is higher when the age is older.
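As a hedged illustration of how the logistic fit in this thread might be read (this is not part of the original replies): the sketch below refits the same model on the posted data, converts the age coefficient into an odds ratio per year of age, and computes fitted death probabilities at three ages. The ages 50, 60 and 70 and the names `or_per_year` and `p_hat` are chosen here purely for illustration.

```r
# Data exactly as posted in the thread
surgery <- data.frame(
  outcome = c(0, 0, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0,
              0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0),
  age = c(50, 50, 51,
          51, 53, 54, 54, 54, 55, 55, 56, 56, 56, 57, 57, 57, 57, 58,
          59, 60, 61, 61, 61, 62, 62, 62, 62, 63, 63, 63, 64, 64, 65,
          67, 67, 68, 68, 69, 70, 71)
)

# Logistic regression of death (1) versus survival (0) on age
fm <- glm(outcome ~ age, family = binomial, data = surgery)

# Multiplicative change in the odds of death per additional year of age
or_per_year <- exp(coef(fm)[["age"]])

# Fitted probabilities of death at a few illustrative ages
p_hat <- predict(fm, newdata = data.frame(age = c(50, 60, 70)),
                 type = "response")

print(or_per_year)
print(round(p_hat, 3))
```

An odds ratio above 1 together with rising fitted probabilities is what "death rate increases with age" looks like on this scale; whether the effect is convincing with 40 cases is a separate question.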
Re: [R] Predicting complicated GAMMs on response scale
On Mon, 2009-05-18 at 11:48 -0700, William Paterson wrote:
> Hi,
>
> I am using GAMMs to show a relationship of temperature differential over time with a model that looks like this:
>
>   gamm(Diff~s(DaysPT)+AirToC,method="REML")
>
> where DaysPT is time in days since injury and Diff is repeated measures of temperature differentials at injury sites compared to non-injured sites in individuals over the course of 0-24 days.
>
> I use the following code to plot this model on the response scale with 95% CIs, which works fine:
>
>   g.m<-gamm(Diff~s(DaysPT)+AirToC,method="REML")
>   p.d<-data.frame(DaysPT=seq(min(DaysPT),max(DaysPT)))
>   p.d$AirToC<-(6.7)
>   b<-predict.gam(g.m$gam,p.d,se=TRUE)
>   range<-c(min(b$fit-2*b$se.fit),max(b$fit+2*b$se.fit))
>   plot(p.d$DaysPT,b$fit,ylim=c(-4,12),xlab="Days post-tagging",ylab="dTmax (ºC)",type="l",lab=c(24,4,12),las=1,cex.lab=1.5, cex.axis=1,lwd=2)
>   lines(p.d$DaysPT,b$fit+b$se.fit*1.96,lty=2,lwd=1.5)
>   lines(p.d$DaysPT,b$fit-b$se.fit*1.96,lty=2,lwd=1.5)
>   points(DaysPT,Diff)
>
> However, when I add a correlation structure and/or a variance structure so that the model may look like:
>
>   gamm(Diff~s(DaysPT3)+AirToC,correlation=corCAR1(form=~DaysPT|Animal),weights=varPower(form=~DaysPT),method="REML")
>
> I get this message at the point of inputting the line
>
>   b<-predict.gam(g.m$gam,p.d,se=TRUE)

Note that p.d doesn't contain Animal. I'm not sure this is the problem, but I would have thought you'd need to supply new values of Animal for the data you wish to predict for in order to get the CAR(1) errors correct. Is it possible that the model is finding another Animal variable in the global environment?

I have predicted from several thousand GAMMs containing correlation structures in a way similar to what you do above, so this does work in general. If the above change to p.d doesn't work, you'll probably need to speak to Simon Wood to take this further. Is mgcv up-to-date? I am using 1.5-5, which was released in the last week or so.
For example, this dummy example runs without error for me and is similar to your model:

  y1 <- arima.sim(list(order = c(1,0,0), ar = 0.5), n = 200, sd = 1)
  y2 <- arima.sim(list(order = c(1,0,0), ar = 0.8), n = 200, sd = 3)
  x1 <- rnorm(200)
  x2 <- rnorm(200)
  ind <- rep(1:2, each = 200)
  d <- data.frame(Y = c(y1,y2), X = c(x1,x2), ind = ind, time = rep(1:200, times = 2))
  require(mgcv)
  mod <- gamm(Y ~ s(X), data = d, corr = corCAR1(form = ~ time | ind),
              weights = varPower(form = ~ time))
  p.d <- data.frame(X = rep(seq(min(d$X), max(d$X), len = 20), 2),
                    ind = rep(1:2, each = 20), time = rep(1:20, times = 2))
  pred <- predict(mod$gam, newdata = p.d, se = TRUE)

Does this work for you? If not, the above would be a reproducible example (as asked for in the posting guide) and might help Simon track down the problem if you are running an up-to-date mgcv.

HTH

G

> Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
>   variable lengths differ (found for 'DaysPT')
> In addition: Warning messages:
> 1: not all required variables have been supplied in newdata! in: predict.gam(g.m$gam, p.d, se = TRUE)
> 2: 'newdata' had 25 rows but variable(s) found have 248 rows
>
> Is it possible to predict a more complicated model like this on the response scale? How can I incorporate a correlation structure and variance structure in a dataframe when using the predict function for GAMMs?
>
> Any help would be greatly appreciated.
>
> William Paterson

--
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
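The "variable lengths differ" error quoted in this thread can be reproduced without mgcv. The following is only a sketch of the suspected mechanism, using `lm` as a stand-in for the GAMM: when `newdata` lacks a variable that the model formula needs, R falls back to the variable of the same name in the environment where the model was fitted, and its length (248 here) clashes with the rows of `newdata` (25 here). All variable names and the simulated data are hypothetical.

```r
set.seed(1)

demo_predict <- function() {
  n <- 248
  days <- runif(n, 0, 24)          # stand-in for DaysPT
  air  <- rnorm(n, 6.7, 2)         # stand-in for AirToC
  diff_t <- 5 - 0.2 * days + 0.1 * air + rnorm(n)

  # Model fitted from free-standing vectors, not from a data frame
  fit <- lm(diff_t ~ days + air)

  incomplete <- data.frame(days = 0:24)             # 'air' is missing
  complete   <- data.frame(days = 0:24, air = 6.7)  # every covariate supplied

  # 'air' is silently taken from the fitting environment (length 248),
  # which clashes with the 25 rows of 'incomplete'
  bad  <- tryCatch(predict(fit, newdata = incomplete), error = function(e) e)
  good <- predict(fit, newdata = complete, se.fit = TRUE)

  list(bad = bad, good = good)
}

res <- demo_predict()
print(conditionMessage(res$bad))
print(length(res$good$fit))
```

Fitting with an explicit `data =` argument and building `newdata` with every variable named in the formula (and, for a GAMM, in the correlation and variance formulas) avoids this kind of silent lookup.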
Re: [R] Overdispersion using repeated measures lmer
Dear Christine,

The poisson family does not allow for overdispersion (nor underdispersion). Try using the quasipoisson family instead.

HTH,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Christine Griffiths
Sent: Monday 18 May 2009 13:26
To: r-help@r-project.org
Subject: [R] Overdispersion using repeated measures lmer

Dear All

I am trying to do a repeated measures analysis using lmer and have a number of issues. I have non-orthogonal, unbalanced data. Count data was obtained over 10 months for three treatments, which were arranged into 6 blocks. Treatment is not nested in Block but crossed, as I originally designed an orthogonal, balanced experiment but subsequently lost a treatment from 2 blocks. My fixed effects are Treatment and Month, and my random effect is Block, which was repeatedly sampled. My model is:

  Model <- lmer(Count~Treatment*Month+(Month|Block),data=dataset,family=poisson(link="sqrt"))

Is this the only way in which I can specify my random effects? I.e. can I specify them as: (1|Block)+(1|Month)?

When I run this model, I do not get any residuals in the error term or estimated scale parameters and so do not know how to check if I have overdispersion. Below is the output I obtained.

Generalized linear mixed model fit by the Laplace approximation
Formula: Count ~ Treatment * Month + (Month | Block)
   Data: dataset
   AIC   BIC logLik deviance
 310.9 338.5 -146.4    292.9
Random effects:
 Groups Name        Variance   Std.Dev. Corr
 Block  (Intercept) 0.06882396 0.262343
        Month       0.00011693 0.010813 1.000
Number of obs: 160, groups: Block, 6

Fixed effects:
                          Estimate Std. Error z value Pr(>|z|)
(Intercept)               1.624030   0.175827   9.237  < 2e-16 ***
Treatment2.Radiata        0.150957   0.207435   0.728 0.466777
Treatment3.Aldabra       -0.005458   0.207435  -0.026 0.979009
Month                    -0.079955   0.022903  -3.491 0.000481 ***
Treatment2.Radiata:Month  0.048868   0.033340   1.466 0.142717
Treatment3.Aldabra:Month  0.077697   0.033340   2.330 0.019781 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Trt2.R Trt3.A Month  T2.R:M
Trtmnt2.Rdt -0.533
Trtmnt3.Ald -0.533  0.450
Month       -0.572  0.585  0.585
Trtmnt2.R:M  0.474 -0.882 -0.402 -0.661
Trtmnt3.A:M  0.474 -0.402 -0.882 -0.661  0.454

Any advice on how to account for overdispersion would be much appreciated.

Many thanks in advance
Christine

--
Christine Griffiths
School of Biological Sciences
University of Bristol
Woodland Road
Bristol BS8 1UG
Tel: 0117 9287593
Fax 0117 925 7374
christine.griffi...@bristol.ac.uk
http://www.bio.bris.ac.uk/research/mammal/tortoises.html

The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
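One common way to gauge overdispersion is the sum of squared Pearson residuals divided by the residual degrees of freedom. The sketch below shows this for an ordinary `glm` on simulated counts, not for the mixed model in this thread; the simulated data, the negative binomial generator and all object names are assumptions made for illustration. It also shows that the quasipoisson fit keeps the same coefficients and inflates the standard errors by the square root of that ratio.

```r
set.seed(42)

# Simulated overdispersed counts: negative binomial with a log-linear trend
n <- 200
month <- rep(1:10, length.out = n)
count <- rnbinom(n, size = 1, mu = exp(1.5 - 0.05 * month))

# Poisson fit assumes variance equal to the mean
fit_p <- glm(count ~ month, family = poisson)

# Pearson chi-square over residual df; values well above 1 suggest overdispersion
dispersion <- sum(residuals(fit_p, type = "pearson")^2) / df.residual(fit_p)

# Quasipoisson: same point estimates, standard errors scaled by sqrt(dispersion)
fit_q <- glm(count ~ month, family = quasipoisson)
se_ratio <- coef(summary(fit_q))[, "Std. Error"] /
            coef(summary(fit_p))[, "Std. Error"]

print(dispersion)
print(se_ratio)
```

For a mixed model the choice of residual degrees of freedom is less clear-cut, so a ratio computed this way is best read as a rough diagnostic.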
[R] As mat.or.vec
Is there a command like mat.or.vec for an array that I have to fill in a for loop?
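A possible answer, offered only as a sketch: base R's `array()` preallocates an array of any dimension in the same way `mat.or.vec()` preallocates a vector or matrix, and the slices can then be filled in a `for` loop. The dimensions and the fill values below are made up for illustration.

```r
# Preallocate a 2 x 3 x 4 array of zeros (dimensions are illustrative)
dims <- c(2, 3, 4)
a <- array(0, dim = dims)

# Fill one 2 x 3 slice per iteration; here slice k simply holds the value k
for (k in seq_len(dims[3])) {
  a[, , k] <- matrix(k, nrow = dims[1], ncol = dims[2])
}

print(dim(a))
print(a[, , 4])
```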
Re: [R] Generic 'diff'
Stavros Macrakis wrote:
> On Mon, May 18, 2009 at 6:00 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
>> I understood what you were asking but R is an oo language so that's the model to use to do this sort of thing.
>
> I am not talking about creating a new class with an analogue to the subtraction function. I am talking about a function which applies another function to a sequence and its lagged version.
>
> Functional arguments are used all over the place in R's base package (Xapply, sweep, outer, by, not to mention Map, Reduce, Filter, etc.) and they seem perfectly natural here.

perhaps 'diff' would not be the best name, something like 'lag' would be better for the more generic function, but 'lag' is already taken.

i agree it would be reasonable to have diff (lag) to accept an extra argument for the function to be applied. the solution of wrapping the vector into a new class to be diff'ed with a non-default diff does not seem to make much sense, as (a) what you seem to want is to custom-diff plain vectors, (b) to keep the diff family coherent, you'd need to upgrade the other diffs to have the extra argument anyway.

as you say, it's trivial to implement an extended diff, say difff, reusing code from diff:

  difff = function(x, ...)
      UseMethod('difff')

  difff.default = function(x, lag=1, differences=1, fun=`-`, ...) {
      ismat = is.matrix(x)
      xlen = if (ismat) dim(x)[1L] else length(x)
      if (length(lag) > 1L || length(differences) > 1L || lag < 1L || differences < 1L)
          stop("'lag' and 'differences' must be integers >= 1")
      if (lag * differences >= xlen)
          return(x[0])
      r = unclass(x)
      i1 = -1L:-lag
      if (ismat)
          for (i in 1L:differences)
              r = fun(r[i1, , drop = FALSE], r[-nrow(r):-(nrow(r) - lag + 1), , drop = FALSE])
      else
          for (i in 1L:differences)
              r = fun(r[i1], r[-length(r):-(length(r) - lag + 1)])
      class(r) = oldClass(x)
      r
  }

now, this naive version seems to work close to what you'd like:

  difff(1:4)
  # 1 1 1
  difff(1:4, fun=`+`)
  # 3 5 7

it might be useful if the original diff were working this way.

vQ
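For plain vectors, the idea of applying an arbitrary function to a sequence and its lagged version can also be sketched more compactly with `head()` and `tail()`. The helper name `lag_apply` is hypothetical and not part of base R; unlike `difff` above it handles neither matrices nor a `differences` argument.

```r
# Apply 'fun' to each element and the element 'lag' positions before it.
# Hypothetical helper, vectors only.
lag_apply <- function(x, fun = `-`, lag = 1L) {
  stopifnot(length(lag) == 1L, lag >= 1L, lag < length(x))
  later   <- tail(x, -lag)   # drops the first 'lag' elements
  earlier <- head(x, -lag)   # drops the last 'lag' elements
  fun(later, earlier)
}

print(lag_apply(1:4))              # same as diff(1:4)
print(lag_apply(1:4, fun = `+`))   # sums of neighbouring elements
print(lag_apply(c(2, 4, 8, 16), fun = `/`))  # successive ratios
```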
Re: [R] fitting distribution
On Tue, 19 May 2009 14:04:19 +1000
Kon Knafelman konk2...@hotmail.com wrote:

KK> i have the sample variances for 1000 samples, and i want to fit it
KK> to a chi-squared distribution.
KK> can someone please help me fit this to a chi-squared distribution
KK> with n degrees of freedom. Thanks a lot

Dear Kon,

1. Please mail only to r-h...@stat.math.ethz.ch OR to r-help@r-project.org; we receive every mail of yours twice if you mail to both.

2. Please read the posting guide. A link is attached to every e-mail:

   PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.

   Note especially the section "do your homework". It means: use the search on r-project.org and rseek.org and have a look at the documentation. Don't expect others to do your homework, and show some effort! A search would have pointed you, for example, to Ricci's guide on fitting distributions:
   http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf

If you have programmed something and it does not work as expected, THEN mail the list.

hth
Stefan

PS You don't know Debbie Zhang by coincidence?...
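For readers landing on this thread, here is one hedged sketch of what such a fit could look like. It assumes the samples are normal with a known variance `sigma2`, so that (n - 1) s^2 / sigma2 follows a chi-squared distribution with n - 1 degrees of freedom, and it estimates the degrees of freedom by maximum likelihood with `optimize()`. The sample size of 10, the variance of 4 and all object names are assumptions; nothing here comes from the original poster's data.

```r
set.seed(123)

n_obs  <- 10   # observations per sample (assumed)
sigma2 <- 4    # population variance, assumed known

# 1000 sample variances from normal samples
s2 <- replicate(1000, var(rnorm(n_obs, mean = 0, sd = sqrt(sigma2))))

# Scale so that the values should follow chi-squared with n_obs - 1 df
scaled <- (n_obs - 1) * s2 / sigma2

# Negative log-likelihood of a chi-squared distribution as a function of df
negloglik <- function(df) -sum(dchisq(scaled, df = df, log = TRUE))

# One-dimensional maximum likelihood estimate of the degrees of freedom
fit <- optimize(negloglik, interval = c(0.1, 100))
df_hat <- fit$minimum

print(df_hat)   # should be near n_obs - 1
```

A quantile-quantile plot of `scaled` against `qchisq(ppoints(1000), df = df_hat)` is a natural visual check of the fit.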
Re: [R] Generic 'diff'
Wacek Kusnierczyk wrote:
> Stavros Macrakis wrote:
>> [...]
>> I am not talking about creating a new class with an analogue to the subtraction function. I am talking about a function which applies another function to a sequence and its lagged version.
>>
>> Functional arguments are used all over the place in R's base package (Xapply, sweep, outer, by, not to mention Map, Reduce, Filter, etc.) and they seem perfectly natural here.
>
> [...]
> as you say, it's trivial to implement an extended diff, say difff, reusing code from diff:
>
>   difff = function(x, ...)
>       UseMethod('difff')
>
>   difff.default = function(x, lag=1, differences=1, fun=`-`, ...) {
>       ismat = is.matrix(x)
>       xlen = if (ismat) dim(x)[1L] else length(x)
>       if (length(lag) > 1L || length(differences) > 1L || lag < 1L || differences < 1L)
>           stop("'lag' and 'differences' must be integers >= 1")

btw., the error message here is confusing:

  lag = 1:2
  diff(1:10, lag=lag)
  # Error in diff.default(1:10, lag = lag) :
  #   'lag' and 'differences' must be integers >= 1
  is.integer(lag)
  # TRUE
  all(lag >= 1)
  # TRUE

what is meant is that lag and differences must be atomic 1-element vectors of positive integers. or rather integer-representing numerics:

  lag = 1
  diff(1:5, lag=1)
  # fine
  is.integer(lag)
  # FALSE

(the usual confusion between 'integer' as the underlying representation and 'integer' as the represented number.)

vQ
Re: [R] Overdispersion using repeated measures lmer
Thanks. I did try using quasipoisson and a negative binomial error but am unsure of the degree of overdispersion and whether it is simply due to missing values. I am investigating whether I can replace these missing values so that I have a balanced orthogonal design and can use lme or aov instead, which is easier to interpret. Any ideas on whether it is feasible to replace missing values for a small dataset with repeated measures? I have 6 blocks with 3 treatments sampled over 10 months. Two blocks are missing one treatment, albeit a different one in each. Any suggestions about how I would go about this would also be much appreciated.

I am also unsure whether my random effects (Month|Block), for repeated measures with random slope and intercept, are correct and whether (1|Month) + (1|Block) represents repeated measures. Any confirmation would be great.

Cheers
Christine
[R] R copula - empirical distributions
Dear list,

Has anyone used the 'copula' or 'fCopulae' package with empirical distributions? I have two distributions (10,000 samples each) which I need to combine using Archimedean copulas (probably Clayton and/or Frank). Is this possible? Is there an existing empirical distribution function defined which I can use, or do I need to define my own?

Thanks and best regards,
Matej
Re: [R] Generic 'diff'
Wacek Kusnierczyk wrote:
> btw., the error message here is confusing:
>
>   lag = 1:2
>   diff(1:10, lag=lag)
>   # Error in diff.default(1:10, lag = lag) :
>   #   'lag' and 'differences' must be integers >= 1
>   is.integer(lag)
>   # TRUE
>   all(lag >= 1)
>   # TRUE
>
> what is meant is that lag and differences must be atomic 1-element vectors of positive integers. or rather integer-representing numerics:
>
>   lag = 1
>   diff(1:5, lag=1)
>   # fine
>   is.integer(lag)
>   # FALSE

... and even non-integer-representing non-integers are fine:

  diff(1:5, lag=pi)
  # 3 3

vQ
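The distinction drawn in this thread, between `integer` as a storage type and "integer" as a whole-number value, can be made explicit with a small check. The helpers `is_whole` and `is_valid_lag` below are hypothetical names written for this sketch; they are not what `diff.default` does internally.

```r
# TRUE if every element of x is numerically a whole number,
# regardless of whether it is stored as integer or double
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  is.numeric(x) && all(abs(x - round(x)) < tol)
}

# A stricter validity check for a 'lag'-like argument:
# exactly one element, whole-valued, and at least 1
is_valid_lag <- function(lag) {
  length(lag) == 1L && is_whole(lag) && lag >= 1
}

print(is.integer(1))        # storage type of the literal 1 is double
print(is.integer(1L))       # the L suffix gives integer storage
print(is_valid_lag(1))      # a whole-valued double passes
print(is_valid_lag(1:2))    # wrong length
print(is_valid_lag(pi))     # not a whole number
```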
Re: [R] Wilcoxon nonparametric p-values
cvandy wrote:
> When I use wilcox.test, I get vastly different p-values than the problems from Statistics textbooks. For example:
>
> The following problem comes from "Applied Statistics and Probability for Engineers", 2nd Edition, by D. C. Montgomery, page 736, problem 14.7. The problem is to compare the sample data with a population median of 8.5. The book answer is p = 0.25; the wilcox.test answer is p = 0.573. I've tried several other similar problems with similar results.
>
> I've copied the following directly from my workspace.

wilcox.exact (from exactRankTests) gives

  wilcox.exact(x - 8.5)

          Exact Wilcoxon signed rank test

  data:  x - 8.5
  V = 80.5, p-value = 0.5748

so I'd suspect the textbook. One-sided p-value perhaps? Or a table limitation (as in p > .25)? If you want to dig deeper, you'll probably have to check the computations implied by the text.

> Thanks for any help,
> CHV
>
>   x<-c(8.32,8.05, 8.93,8.65,8.25,8.46,8.52,8.35,8.36,8.41,8.42,8.30,8.71,8.75,8.6,8.83,8.5,8.38,8.29,8.46)
>   wilcox.test(x,y=NULL,mu=8.5)
>
>           Wilcoxon signed rank test with continuity correction
>
>   data:  x
>   V = 80.5, p-value = 0.573
>   alternative hypothesis: true location is not equal to 8.5
>
>   Warning messages:
>   1: In wilcox.test.default(x, y = NULL, mu = 8.5) :
>     cannot compute exact p-value with ties
>   2: In wilcox.test.default(x, y = NULL, mu = 8.5) :
>     cannot compute exact p-value with zeroes
>
> Charles H Van deZande

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
Re: [R] Wilcoxon nonparametric p-values
I just tried it in Minitab and got

--------------------------------------------------
Test of median = 8.500 versus median not = 8.500

        N for   Wilcoxon         Estimated
     N   Test  Statistic      P     Median
C1  20     19       80.5  0.573      8.460
--------------------------------------------------

The one-tailed test gave me something closer to the textbook, but still not very close.

--------------------------------------------------
Test of median = 8.500 versus median < 8.500

        N for   Wilcoxon         Estimated
     N   Test  Statistic      P     Median
C1  20     19       80.5  0.287      8.460
--------------------------------------------------

I agree with Peter Dalgaard: the book has it wrong (for some value of wrong).

Regards
KJ

Peter Dalgaard p.dalga...@biostat.ku.dk wrote in message news:4a127d6d.1060...@biostat.ku.dk...
> wilcox.exact (from exactRankTests) gives
>
>   wilcox.exact(x - 8.5)
>
>           Exact Wilcoxon signed rank test
>
>   data:  x - 8.5
>   V = 80.5, p-value = 0.5748
>
> so I'd suspect the textbook. One-sided p-value perhaps? or table limitation (as in p > .25). If you want to dig deeper, you'll probably have to check the computations implied by the text.
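The two figures discussed in this thread can be put side by side in R. The sketch below reruns the signed rank test on the posted data with the normal approximation requested explicitly (`exact = FALSE`), once two-sided and once one-sided; that the one-sided alternative is "less" is an inference from the statistic lying below its null expectation, not something the posts state outright.

```r
# Data as posted in the thread
x <- c(8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41,
       8.42, 8.30, 8.71, 8.75, 8.6, 8.83, 8.5, 8.38, 8.29, 8.46)

# Two-sided test against a median of 8.5; exact = FALSE asks directly for the
# normal approximation, since exact p-values are unavailable with ties/zeroes
two_sided <- wilcox.test(x, mu = 8.5, exact = FALSE)

# One-sided alternative: true location is below 8.5
one_sided <- wilcox.test(x, mu = 8.5, alternative = "less", exact = FALSE)

print(two_sided$statistic)
print(round(two_sided$p.value, 3))
print(round(one_sided$p.value, 3))
```

With the continuity correction the one-sided p-value is half the two-sided one here, which matches the pair of Minitab results quoted above and still leaves the textbook's 0.25 unexplained.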
Re: [R] Overdispersion using repeated measures lmer
Dear Christine,

(Month|Block) and (1|Block) + (1|Month) are completely different random effects. The first assumes that each Block exhibits a different linear trend along Month. The latter assumes that each block has a random effect, each month has a random effect, and that the random effects of block and month are independent. So each month has a different effect, but within a given month that effect is the same for every block. It is up to you to see whether that kind of assumption is valid in your design.

Missing values should not be a problem, as long as they are missing at random. I would not try to impute the missing values. How would you determine the imputed values? That requires a lot of assumptions, and they could affect your model parameters.

HTH,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Christine Griffiths
Sent: Tuesday 19 May 2009 11:01
To: r-help@r-project.org
Subject: Re: [R] Overdispersion using repeated measures lmer

Thanks. I did try using quasipoisson and a negative binomial error but am unsure of the degree of overdispersion and whether it is simply due to missing values. I am investigating to see if I can replace these missing values so that I can have a balanced orthogonal design and use lme or aov instead which is easier to interpret. Any ideas on whether it is feasible to replace missing values for a small dataset with repeated measures? I have 6 blocks with 3 treatments sampled over 10 months. Two blocks are missing one treatment, albeit a different one. Also any suggestions about how I would go about this would be much appreciated.

I am also unsure of whether my random effects (Month|Block) for repeated measures with random slope and intercept is correct and whether (1|Month) + (1|Block) represents repeated measures. Any confirmation would be great.

Cheers
Christine
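The difference between the two random-effects structures discussed in this thread can be made concrete by simulating the random part of the linear predictor under each one. This is a sketch in base R only, with made-up standard deviations, and it draws intercepts and slopes independently, whereas a fitted (Month|Block) term also estimates their correlation.

```r
set.seed(7)
n_blocks <- 6
months   <- 1:10

# Structure 1, (Month | Block): each block has its own intercept and its own
# linear trend over months
b0 <- rnorm(n_blocks, sd = 0.30)   # block-specific intercepts
b1 <- rnorm(n_blocks, sd = 0.05)   # block-specific slopes
eta_slope <- outer(b0, rep(1, length(months))) + outer(b1, months)

# Structure 2, (1 | Block) + (1 | Month): one shift per block plus one shift
# per month, the month shift being shared by all blocks
u <- rnorm(n_blocks, sd = 0.30)        # block effects
v <- rnorm(length(months), sd = 0.20)  # month effects
eta_crossed <- outer(u, v, `+`)

# Gap between block 1 and block 2 over the months under each structure
gap_slope   <- eta_slope[1, ]   - eta_slope[2, ]    # changes linearly with month
gap_crossed <- eta_crossed[1, ] - eta_crossed[2, ]  # constant across months

print(round(gap_slope, 3))
print(round(gap_crossed, 3))
```

Under the crossed structure two blocks differ by the same amount in every month; under the random-slope structure they drift apart or together along a straight line.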
[R] problem with installing a local zip file : GFCURE
Dear all,

I am trying to install a package called GFCURE from a local zip file. This package fits a cure survival model and has been downloaded from: http://post.queensu.ca/~pengp/software.html

The problem is that when I try to install this package from a local zip file using R, I get the following error message:

Error in gzfile(file, "r") : cannot open the connection
In addition: Warning message:
In gzfile(file, "r") : cannot open compressed file 'gfcureWinR/DESCRIPTION', probable reason 'No such file or directory'

At first I thought it was an internal problem. I then asked some of my colleagues to do the same thing and they got the same error message. I would be very grateful if you could help me with this matter.

All the best
Marc

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Coord_equal in ggplot2
Dear all,

I'm plotting some points on a graph where both axes need to have the same scale. See the example below. coord_equal does the trick, but in this case it wastes a lot of space on the y-axis. Setting the limits of the y-axis myself was to no avail. Any suggestions to solve this problem?

library(ggplot2)
ds <- data.frame(x = runif(1000, min = 0, max = 30), y = runif(1000, min = 14, max = 26))
ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal()
ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal() + scale_x_continuous(limits = c(0, 30)) + scale_y_continuous(limits = c(14, 26))

Regards,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be
[R] remove empty objects from workspace
Hi,

how can I remove all empty objects (which are NA or have zero rows) from my workspace?

Thanks,
Katharina
[R] Remove objects names like character String
Hi,

how can I use rm() on objects named like paste("site", i, "_data", sep="") while looping through i?

I tried rm(paste("site", i, "_data", sep="")), but I get the error that rm() must contain names or text strings, which confuses me, as I thought paste() would create something like that...?

Thanks,
Katharina
Re: [R] Coord_equal in ggplot2
If you use coord_equal on data where the range on the x-axis is larger than the range on the y-axis, then of course you'll observe extra space on the y-axis. What did you expect?

Also, this post may be better suited to the ggplot2 mailing list: http://had.co.nz/ggplot2/

On Tue, May 19, 2009 at 7:17 AM, ONKELINX, Thierry thierry.onkel...@inbo.be wrote:
> Dear all,
>
> I'm plotting some points on a graph where both axes need to have the same scale. See the example below. Coord_equal does that trick but in this case it wastes a lot of space on the y-axis. Setting the limits of the y-axis myself was no avail. Any suggestions to solve this problem?
>
> library(ggplot2)
> ds <- data.frame(x = runif(1000, min = 0, max = 30), y = runif(1000, min = 14, max = 26))
> ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal()
> ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal() + scale_x_continuous(limits = c(0, 30)) + scale_y_continuous(limits = c(14, 26))
>
> Regards,
>
> Thierry

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University
Re: [R] remove empty objects from workspace
Katharina May wrote:
> Hi,
>
> how can I remove all empty objects (which are NA or have zero rows) from my workspace?

Hi Katharina,

To remove objects that are all NA:

for(object in objects()) if(all(is.na(get(object)))) rm(list=object)

If by zero rows you mean objects that do not have a dimension:

for(object in objects()) if(is.null(dim(get(object)))) rm(list=object)

Jim
Re: [R] remove empty objects from workspace
Thanks Jim,

the removal of objects which are NA works perfectly!

For my second problem I didn't express myself correctly: I actually meant objects with rows (attributes?) but no data in them. I solved this by adjusting your approach:

for(object in objects()) if(is.null(dim(get(object))[1]) || dim((get(object)))[1] == 0) rm(list=object)

Thanks a lot!

2009/5/19 Jim Lemon j...@bitwrit.com.au:
> Katharina May wrote:
>> Hi,
>>
>> how can I remove all empty objects (which are NA or have zero rows) from my workspace?
>
> Hi Katharina,
>
> To remove objects that are all NA:
>
> for(object in objects()) if(all(is.na(get(object)))) rm(list=object)
>
> If by zero rows you mean objects that do not have a dimension:
>
> for(object in objects()) if(is.null(dim(get(object)))) rm(list=object)
>
> Jim
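The two loops in this thread can be folded into one helper that works on any environment, which makes it easier to try out without touching the real workspace. This is only a sketch: the function name remove_empty and its definition of "empty" (zero rows, zero length, or all NA) are assumptions, not something proposed in the thread.

```r
# Hypothetical helper: drop objects that are all NA, zero-length, or zero-row.
remove_empty <- function(env = globalenv()) {
  is_empty <- function(obj) {
    # never treat functions or environments as empty
    if (is.function(obj) || is.environment(obj)) return(FALSE)
    # objects with dimensions (data frames, matrices): empty means zero rows
    if (!is.null(dim(obj))) return(nrow(obj) == 0)
    # plain vectors and lists: zero length or nothing but NA
    length(obj) == 0 || all(is.na(obj))
  }
  nms <- ls(envir = env)
  drop <- nms[vapply(nms, function(n) is_empty(get(n, envir = env)), logical(1))]
  rm(list = drop, envir = env)
  invisible(drop)
}

# Try it on a scratch environment rather than the real workspace
demo <- new.env()
assign("a", NA, envir = demo)
assign("b", data.frame(x = numeric(0)), envir = demo)
assign("c", 1:3, envir = demo)
remove_empty(demo)
ls(demo)
```

Calling remove_empty() with no argument would act on the global workspace, so it is worth checking the returned names on a copy first.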
[R] loglinear analysis
Dear R Users,

I would like to fit a loglinear model to a three-dimensional contingency table, but I don't want to fit the fully saturated model. Is there any package in R that can do some kind of stepwise search to choose the best solution? And how can I fit a non-saturated model that uses only the important interactions?

Best Regards
Zoltan Kmetty
Re: [R] loglinear analysis
Look at the glm function, then pass the output to the step function.

Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147

Zoltan Kmetty zkme...@gmail.com wrote to r-help@r-project.org on 05/19/2009 02:12 PM (Subject: [R] loglinear analysis):
> Dear R Users,
>
> A would like to fit a loglinear analysis to a three dimensional contingency table. But I Don't want to run a full saturated modell. Is there any package in R that could handle somekind of stepwise search to choose out the best soultion? And how can I fit a non fully saturated modell, which only use the important interactions?
>
> Best Regards
> Zoltan Kmetty
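The glm-then-step suggestion can be sketched on a built-in three-way table. HairEyeColor is used here purely as a stand-in for the poster's data, and backward selection by AIC is just one possible search strategy, assumed for illustration.

```r
# Built-in 3-way contingency table (Hair x Eye x Sex) as stand-in data
tab <- as.data.frame(HairEyeColor)   # columns: Hair, Eye, Sex, Freq

# Saturated loglinear model, fitted as a Poisson GLM on the cell counts
saturated <- glm(Freq ~ Hair * Eye * Sex, family = poisson, data = tab)

# Backward stepwise search by AIC, respecting the model hierarchy
reduced <- step(saturated, direction = "backward", trace = 0)
formula(reduced)

# A hand-specified non-saturated model with a single two-way interaction
partial <- glm(Freq ~ Hair + Eye + Sex + Hair:Eye, family = poisson, data = tab)
anova(partial, saturated, test = "Chisq")
```

The anova call compares the hand-picked model against the saturated one, which is the usual goodness-of-fit check for a loglinear model.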
Re: [R] problem with installing a local zip file : GFCURE
On Tue, 19 May 2009, marc bernard wrote:
> Dear all,
>
> I am trying to install a package called GFCURE from a local zip file. This package fits a cure survival model and has been downloaded from: http://post.queensu.ca/~pengp/software.html

However, it is not an R package. Read the Readme.txt in the zip file for the instructions for use with R (under Windows).

> The problem is that when I try to install this package from a local zip file using R, I've got the following error message:
>
> Error in gzfile(file, "r") : cannot open the connection
> In addition: Warning message:
> In gzfile(file, "r") : cannot open compressed file 'gfcureWinR/DESCRIPTION', probable reason 'No such file or directory'
>
> First, I thought it was an internal problem. I then asked some of my colleagues to do the same thing and they had the same error message. I would be very grateful if you can help me on that matter.

It would have been reasonable to ask the author (Cc:ed here) for help, since then he will become aware that potential users have been confused by his instructions.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
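The error message points at a missing DESCRIPTION file, which every R package carries at the top level of its directory. A quick way to see whether a zip archive is a package at all is to list its entries before installing. The sketch below is an assumption-laden illustration: the helper name looks_like_package and the example file names are made up, and the real entries would come from unzip(..., list = TRUE).

```r
# Hypothetical helper: does a vector of archive entries contain <pkg>/DESCRIPTION?
looks_like_package <- function(entries) {
  any(grepl("^[^/]+/DESCRIPTION$", entries))
}

# For a real archive the entries would come from:
#   entries <- unzip("path/to/file.zip", list = TRUE)$Name
# The file names below are invented for illustration.
looks_like_package(c("gfcureWinR/Readme.txt", "gfcureWinR/gfcure.R"))
looks_like_package(c("mypkg/DESCRIPTION", "mypkg/R/mypkg.rdb"))
```

If the check fails, the archive is most likely a plain collection of scripts or binaries and should be used as its own instructions describe.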
Re: [R] Generic 'diff'
Note that this could be done like this for ordinary vectors:

x <- seq(1:4)^2
apply(embed(x, 2), 1, function(x, f) f(rev(x)), f = diff)
[1] 3 5 7
apply(embed(x, 2), 1, function(x, f) f(rev(x)), f = sum)
[1]  5 13 25

or a method to rollapply in zoo could be added for ordinary vectors. Here it is applied to zoo objects:

library(zoo)
rollapply(zoo(x), 2, diff)
1 2 3
3 5 7
rollapply(zoo(x), 2, sum)
 1  2  3
 5 13 25

On Tue, May 19, 2009 at 4:23 AM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
> Stavros Macrakis wrote:
>> On Mon, May 18, 2009 at 6:00 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
>>> I understood what you were asking but R is an oo language so that's the model to use to do this sort of thing.
>>
>> I am not talking about creating a new class with an analogue to the subtraction function. I am talking about a function which applies another function to a sequence and its lagged version.
>>
>> Functional arguments are used all over the place in R's base package (Xapply, sweep, outer, by, not to mention Map, Reduce, Filter, etc.) and they seem perfectly natural here.
>
> perhaps 'diff' would not be the best name, something like 'lag' would be better for the more generic function, but 'lag' is already taken.
>
> i agree it would be reasonable to have diff (lag) to accept an extra argument for the function to be applied. the solution of wrapping the vector into a new class to be diff'ed with a non-default diff does not seem to make much sense, as (a) what you seem to want is to custom-diff plain vectors, (b) to keep the diff family coherent, you'd need to upgrade the other diffs to have the extra argument anyway.
>
> as you say, it's trivial to implement an extended diff, say difff, reusing code from diff:
>
> difff = function(x, ...) UseMethod('difff')
>
> difff.default = function(x, lag=1, differences=1, fun=`-`, ...) {
>     ismat = is.matrix(x)
>     xlen = if (ismat) dim(x)[1L] else length(x)
>     if (length(lag) > 1L || length(differences) > 1L || lag < 1L || differences < 1L)
>         stop("'lag' and 'differences' must be integers >= 1")
>     if (lag * differences >= xlen) return(x[0])
>     r = unclass(x)
>     i1 = -1L:-lag
>     if (ismat)
>         for (i in 1L:differences)
>             r = fun(r[i1, , drop = FALSE], r[-nrow(r):-(nrow(r) - lag + 1), , drop = FALSE])
>     else
>         for (i in 1L:differences)
>             r = fun(r[i1], r[-length(r):-(length(r) - lag + 1)])
>     class(r) = oldClass(x)
>     r
> }
>
> now, this naive version seems to work close to what you'd like:
>
> difff(1:4)
> # 1 1 1
> difff(1:4, fun=`+`)
> # 3 5 7
>
> it might be useful if the original diff were working this way.
>
> vQ
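For plain vectors the core idea, applying a binary function to a sequence and its lagged copy, fits in a few lines. The sketch below is a stripped-down illustration and not a replacement for the difff method quoted above; the name lag_apply is hypothetical and matrices and repeated differencing are deliberately left out.

```r
# Hypothetical helper: apply a binary function to x and its lagged version.
lag_apply <- function(x, fun = `-`, lag = 1L) {
  n <- length(x)
  if (lag >= n) return(x[0])            # nothing to pair up
  fun(x[(lag + 1L):n], x[1L:(n - lag)]) # later element first, as diff() does
}

x <- (1:4)^2          # 1 4 9 16
lag_apply(x)          # successive differences: 3 5 7
lag_apply(x, `+`)     # successive sums: 5 13 25
```

Because fun receives two whole vectors, any vectorised binary function (pmax, `/`, a custom closure) can be passed in the same way.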
Re: [R] remove empty objects from workspace
Try this also:

rm(list=names(which(unlist(eapply(globalenv(), function(a) is.null(a) || all(is.na(a)))))))

On Tue, May 19, 2009 at 9:07 AM, Katharina May may.kathar...@googlemail.com wrote:
> Thanks Jim,
>
> the removal of objects which are NA works perfectly!
>
> For my second problem it didn't express myself correctly: I actually meant objects with rows (attributes?) but no data in it but I solved this adjusting your approach:
>
> for(object in objects()) if(is.null(dim(get(object))[1]) || dim((get(object)))[1] == 0) rm(list=object)
>
> Thanks a lot!
>
> 2009/5/19 Jim Lemon j...@bitwrit.com.au:
>> Katharina May wrote:
>>> Hi,
>>>
>>> how can I remove all empty objects (which are NA or have zero rows) from my workspace?
>>
>> Hi Katharina,
>>
>> To remove objects that are all NA:
>>
>> for(object in objects()) if(all(is.na(get(object)))) rm(list=object)
>>
>> If by zero rows you mean objects that do not have a dimension:
>>
>> for(object in objects()) if(is.null(dim(get(object)))) rm(list=object)
>>
>> Jim

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
Re: [R] problem with installing a local zip file : GFCURE
On Tue, May 19, 2009 at 6:17 AM, marc bernard marc_bern...@hotmail.co.uk wrote:
> Dear all,
>
> I am trying to install a package called GFCURE from a local zip file. This package fits a cure survival model and has been downloaded from:

You are assuming it's in the form of an R *package*, but it's not. Unzip it and read the readme.txt file. If you still have problems, contact the author.

> http://post.queensu.ca/~pengp/software.html
>
> The problem is that when I try to install this package from a local zip file using R, I've got the following error message:
>
> Error in gzfile(file, "r") : cannot open the connection
> In addition: Warning message:
> In gzfile(file, "r") : cannot open compressed file 'gfcureWinR/DESCRIPTION', probable reason 'No such file or directory'
>
> First, I thought it was an internal problem. I then asked some of my colleagues to do the same thing and they had the same error message. I would be very grateful if you can help me on that matter.
>
> All the best
> Marc
[R] memory stack overflow
Dear colleagues,

I am trying to fit a glm.nb for the distribution of a plant species with 93 environmental variables. When I execute the instruction I get the following message:

Error: C stack usage is too close to the limit

How can I increase the memory of R?

Yours sincerely,
Nora
Re: [R] memory stack overflow
Have you tried principal component analysis to reduce the number of variables?

2009/5/19 Nora Pérez norichu...@hotmail.com
> Dear colleagues,
>
> I am trying a glm.nb for the distribution of a plant species with 93 environmental variables. I execute the instruction and I get the following message: Error: C stack usage is too close to the limit. How can I increase the memory of R?
>
> Your sincerely,
> Nora.

--
Luis Iván Ortiz Valencia
Statistician, MSc.
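The PCA suggestion could look roughly like the sketch below: compress the 93 predictors into a handful of component scores and fit the negative binomial model on those. Everything here is assumed for illustration: the data are simulated, keeping five components is an arbitrary choice, and whether this avoids the C stack error in the original setting is not known.

```r
library(MASS)  # provides glm.nb

# Simulated stand-in: 200 sites, 93 environmental variables (hypothetical data)
set.seed(42)
n <- 200
p <- 93
X <- matrix(rnorm(n * p), n, p)
y <- rnbinom(n, size = 2, mu = exp(0.5 + 0.3 * X[, 1]))

# Principal components of the standardised predictors
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Keep the first five component scores (an arbitrary cut-off for this sketch)
scores <- as.data.frame(pca$x[, 1:5])
scores$y <- y

# Negative binomial GLM on the reduced predictor set
fit <- glm.nb(y ~ ., data = scores)
coef(fit)
```

In practice the number of components would be chosen from the explained variance, for example via summary(pca), and the components are harder to interpret than the original variables.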
Re: [R] Remove objects names like character String
Katharina May wrote:
> Hi,
>
> how can I use rm() on objects named like: paste("site", i, "_data", sep="") while looping through i?
>
> I tried rm(paste("site", i, "_data", sep="")) but I get the error that rm() must contain names or text strings which is confusing me as I thought paste() would create something like that...?

Well, I would try to avoid creating so many objects, but once you have them you can do it even without a loop, e.g. for the first 5:

i <- 1:5
do.call(rm, list(paste("site", i, "_data", sep="")))

Uwe Ligges

> Thanks,
>
> Katharina
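The error in the question arises because rm() inspects its unnamed arguments without evaluating them, so it sees a call to paste() rather than a string. The list argument of rm() does accept a character vector, which gives a loop-free alternative. The sketch below assumes the objects are named site1_data to site5_data and creates them itself so that it is self-contained.

```r
# Create some throwaway objects named site1_data ... site5_data (illustration only)
for (i in 1:5) assign(paste0("site", i, "_data"), i)

# Build the names as a character vector and hand them to rm() via 'list'
target <- paste0("site", 1:5, "_data")
rm(list = target)

# Confirm they are gone
exists("site1_data", inherits = FALSE)
```

Inside a loop the same idea reads rm(list = paste0("site", i, "_data")).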
Re: [R] stringsAsFactors param in expand.grid not working
"RT" == Rolf Turner r.tur...@auckland.ac.nz on Tue, 19 May 2009 11:02:08 +1200 writes:

RT> On 19/05/2009, at 10:20 AM, Steve Lianoglou wrote:
>> Hi all,
>>
>> I've (tried) to look through the bug tracker, and gmane-search the R list to see if this has been mentioned before, and it looks like it hasn't.

Yes, thank you. That's a bug on which we (R-core) are currently working. More about this, and notably about Rolf's (.)*^%)(#%$) proposal, on the R-devel list.

Martin Maechler

>> According to the R 2.9.0 release notes[1], the expand.grid function should now take a stringsAsFactors=LOGICAL argument which controls whether or not the function coerces strings as factors. While the parameter is indeed in the function, a quick examination of the function's source shows that the value of this argument is never checked, and all strings are converted to factors as a matter of course.
>>
>> The fix is pretty easy, and I believe only requires changing the `if` check here:
>>
>> if (!is.factor(x) && is.character(x)) x <- factor(x, levels = unique(x))
>>
>> To:
>>
>> if (!is.factor(x) && is.character(x) && stringsAsFactors) x <- factor(x, levels = unique(x))
>>
>> I can open a ticket regarding this issue and add this there if necessary.
>>
>> Thanks,
>> -steve
>>
>> [1] http://article.gmane.org/gmane.comp.lang.r.general/146891

RT> While we're at it --- would it not make sense to have the stringsAsFactors
RT> argument (once it's working) of expand.grid() default to options()$stringsAsFactors,
RT> rather than to FALSE?

RT> This would make no difference to me personally, since I set
RT> options(stringsAsFactors=FALSE) in my .Rprofile. But it might make some
RT> people happier

RT> cheers,

RT> Rolf Turner
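A quick way to see whether a given R installation honours the argument is to inspect the column classes of the result. The sketch below describes the behaviour of R versions in which the bug discussed above has been fixed; on R 2.9.0 itself, per the report, both calls would have produced factors. The column names are arbitrary.

```r
# With the argument honoured, character input stays character ...
g_chr <- expand.grid(letter = c("a", "b"), n = 1:2, stringsAsFactors = FALSE)
sapply(g_chr, class)

# ... and is converted to a factor only when asked for
g_fac <- expand.grid(letter = c("a", "b"), n = 1:2, stringsAsFactors = TRUE)
sapply(g_fac, class)
```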
Re: [R] problem with installing a local zip file : GFCURE
Dear Gabor,

Many thanks for your answer. I indeed didn't check the readme text.

Bests

From: ggrothendi...@gmail.com
Date: Tue, 19 May 2009 08:44:13 -0400
Subject: Re: [R] problem with installing a local zip file : GFCURE
To: marc_bern...@hotmail.co.uk
CC: r-help@r-project.org

> On Tue, May 19, 2009 at 6:17 AM, marc bernard marc_bern...@hotmail.co.uk wrote:
>> Dear all,
>>
>> I am trying to install a package called GFCURE from a local zip file. This package fits a cure survival model and has been downloaded from:
>
> You are assuming its in the form of an R *package* but its not. Unzip it and read the readme.txt file. If you still have problems contact the author.
>
>> http://post.queensu.ca/~pengp/software.html
>>
>> The problem is that when I try to install this package from a local zip file using R, I've got the following error message:
>>
>> Error in gzfile(file, "r") : cannot open the connection
>> In addition: Warning message:
>> In gzfile(file, "r") : cannot open compressed file 'gfcureWinR/DESCRIPTION', probable reason 'No such file or directory'
>>
>> First, I thought it was an internal problem. I then asked some of my colleagues to do the same thing and they had the same error message. I would be very grateful if you can help me on that matter.
>>
>> All the best
>> Marc
Re: [R] Coord_equal in ggplot2
ONKELINX, Thierry Thierry.ONKELINX at inbo.be writes:
> I'm plotting some points on a graph where both axes need to have the same scale. See the example below. Coord_equal does that trick but in this case it wastes a lot of space on the y-axis. Setting the limits of the y-axis myself was no avail. Any suggestions to solve this problem?
>
> library(ggplot2)
> ds <- data.frame(x = runif(1000, min = 0, max = 30), y = runif(1000, min = 14, max = 26))
> ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal()
> ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal() + scale_x_continuous(limits = c(0, 30)) + scale_y_continuous(limits = c(14, 26))

I think you need to set the ratio in addition, to cut off the extra space. (Not tried.)

From the docs:

"Equal scales. coord_equal ensures that the x and y axes have equal scales: i.e. 1 cm along the x axis represents the same range of data as 1 cm along the y axis. By default it will assume that you want a one-to-one ratio, but you can change this with the ratio parameter. The aspect ratio will also be set to ensure that the mapping is maintained regardless of the shape of the output device."

See the documentation of coord_equal() for more details.

Dieter
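With equal scales the plotting panel has a fixed shape, here 12 data units high by 30 wide, so the unused space comes from the output device being a different shape. One way around it, offered as an assumption rather than something tested in the thread, is to size the output file to match the data's aspect ratio. The file name, the 10 inch width and the 0.5 inch allowance for axis labels are arbitrary choices in this sketch.

```r
# Aspect ratio (height / width) implied by the data ranges under equal scales
x_range <- c(0, 30)
y_range <- c(14, 26)
aspect <- diff(y_range) / diff(x_range)   # 12 / 30

# Only attempt the plot if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  ds <- data.frame(x = runif(1000, min = 0, max = 30),
                   y = runif(1000, min = 14, max = 26))
  p <- ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal()
  # Device height follows the panel shape, plus a rough allowance for axis text
  ggsave(file.path(tempdir(), "equal_scales.pdf"), p,
         width = 10, height = 10 * aspect + 0.5)
}
```

The scales stay one-to-one; only the shape of the output changes, so no data are distorted.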
Re: [R] Wilcoxon nonparametric p-values
Thanks Peter,

There are 8 measurements less than 8.5, so calculating the probability (binomial) of 8, or fewer, happening by chance with n = 20 and p = 0.50 gives P = 0.25 -- the book answer. I've tried several problems in other textbooks and in each case I get vastly different P-values than I get with wilcox.test or wilcox.exact.

However, upon further testing, I've found good agreement when the calculated P-values are small, but disagreement when P-values are large. This might mean a problem with wilcox.test and wilcox.exact when P-values are large, or I might be misinterpreting something.

CHV

Charles H Van deZande

---Original Message---
From: Peter Dalgaard
Date: 5/19/2009 5:35:07 AM
To: cvandy
Cc: r-help@r-project.org
Subject: Re: [R] Wilcoxon nonparametric p-values

> cvandy wrote:
>> When I use wilcox.test, I get vastly different p-values than the problems from Statistics textbooks. For example: The following problem comes from Applied Statistics and Probability for Engineers, 2nd Edition, by D. C. Montgomery. Page 736, problem 14.7. The problem is to compare the sample data with a population median of 8.5. The book answer is p = 0.25, wilcox.test answer is p = 0.573. I've tried several other similar problems with similar results. I've copied the following directly from my workspace.
>
> wilcox.exact (from exactRankTests) gives
>
> wilcox.exact(x - 8.5)
>
>         Exact Wilcoxon signed rank test
>
> data:  x - 8.5
> V = 80.5, p-value = 0.5748
>
> so I'd suspect the textbook. One-sided p-value perhaps? or table limitation (as in p > .25). If you want to dig deeper, you'll probably have to check the computations implied by the text.
>
>> Thanks for any help,
>> CHV
>>
>> x <- c(8.32,8.05,8.93,8.65,8.25,8.46,8.52,8.35,8.36,8.41,8.42,8.30,8.71,8.75,8.6,8.83,8.5,8.38,8.29,8.46)
>> wilcox.test(x,y=NULL,mu=8.5)
>>
>>         Wilcoxon signed rank test with continuity correction
>>
>> data:  x
>> V = 80.5, p-value = 0.573
>> alternative hypothesis: true location is not equal to 8.5
>>
>> Warning messages:
>> 1: In wilcox.test.default(x, y = NULL, mu = 8.5) : cannot compute exact p-value with ties
>> 2: In wilcox.test.default(x, y = NULL, mu = 8.5) : cannot compute exact p-value with zeroes
>>
>> Charles H Van deZande
>
> --
> Peter Dalgaard, Øster Farimagsgade 5, Entr.B
> Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
> University of Copenhagen, Denmark
> Ph: (+45) 35327918
> (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907
Re: [R] how to calculate means of matrix elements
Easy enough. What if some of the matrix elements contained missing values? Then how could you still calculate the means? Example code below:

mat1 <- matrix(c(1,2,3,4,5,NA,7,8,9),3,3)
mat2 <- matrix(c(NA,6,1,9,0,5,8,2,7),3,3)
mat3 <- matrix(c(5,9,1,8,NA,3,7,2,4),3,3)

Gabor Grothendieck wrote:
> Try this:
>
> (mat1 + mat2 + mat3) / 3
>
> On Mon, May 18, 2009 at 8:40 PM, dxc13 dx...@health.state.ny.us wrote:
>> useR's,
>>
>> I have several matrices of size 4x4 that I want to calculate means of their respective positions with. For example, consider I have 3 matrices given by the code:
>>
>> mat1 <- matrix(sample(1:20,16,replace=T),4,4)
>> mat2 <- matrix(sample(-5:15,16,replace=T),4,4)
>> mat3 <- matrix(sample(5:25,16,replace=T),4,4)
>>
>> The result I want is one matrix of size 4x4 in which position [1,1] is the mean of position [1,1] of the given three matrices. The same goes for all other positions of the matrix. If these three matrices are given in separate text files, how can I write code that will get this result I need?
>>
>> Thanks in advance,
>> dxc13
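One way to handle the missing values, sketched here as a suggestion that does not come from the thread, is to stack the matrices into a three-dimensional array and take the mean over the third dimension with na.rm = TRUE, so each cell is averaged over whichever values are present.

```r
mat1 <- matrix(c(1, 2, 3, 4, 5, NA, 7, 8, 9), 3, 3)
mat2 <- matrix(c(NA, 6, 1, 9, 0, 5, 8, 2, 7), 3, 3)
mat3 <- matrix(c(5, 9, 1, 8, NA, 3, 7, 2, 4), 3, 3)

# Stack the three 3x3 matrices along a third dimension
stacked <- array(c(mat1, mat2, mat3), dim = c(3, 3, 3))

# Mean over the third dimension for every [row, column] cell, skipping NAs
cell_means <- apply(stacked, c(1, 2), mean, na.rm = TRUE)
cell_means
```

For example, cell [1,1] averages only 1 and 5, because mat2 is NA there. A cell that is NA in every matrix would come out as NaN.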
Re: [R] Wilcoxon nonparametric p-values
Charles Van deZande wrote:
> Thanks Peter,
> There are 8 measurements less than 8.5, so calculating the probability (binomial) of 8, or fewer, happening by chance with n = 20 and p = 0.50 gives P = 0.25, the book answer. I've tried several problems in other textbooks and in each case I get vastly different P-values than I get with wilcox.test or wilcox.exact.

Ah, but that is NOT a signed-rank test, just a sign test. (Using the former as a test of the median is BTW not really a good idea unless you assume symmetry of the distribution.) It is also still a one-sided test; with two tails you get

    binom.test(8,20)

            Exact binomial test

    data:  8 and 20
    number of successes = 8, number of trials = 20, p-value = 0.5034
    alternative hypothesis: true probability of success is not equal to 0.5
    95 percent confidence interval:
     0.1911901 0.6394574
    sample estimates:
    probability of success
                       0.4

(and that is disregarding that one observation is exactly 8.5, so you should really look at 7 in 19 rather than 8 in 20.)

> However, upon further testing, I've found good agreement when the calculated P-values are small, but disagreement when P-values are large. This might mean a problem with wilcox.test and wilcox.exact when P-values are large or I might be misinterpreting something.

You need to read some more theory. The extreme cases (all signs equal) are equally unlikely for the sign test and the signed-rank test.

> CHV
> Charles H Van deZande
>
> -------Original Message-------
> From: Peter Dalgaard
> Date: 5/19/2009 5:35:07 AM
> To: cvandy
> Cc: r-help@r-project.org
> Subject: Re: [R] Wilcoxon nonparametric p-values
>
> cvandy wrote:
> > When I use wilcox.test, I get vastly different p-values than the problems from Statistics textbooks. For example: the following problem comes from Applied Statistics and Probability for Engineers, 2nd Edition, by D. C. Montgomery, page 736, problem 14.7. The problem is to compare the sample data with a population median of 8.5. The book answer is p = 0.25, wilcox.test answer is p = 0.573. I've tried several other similar problems with similar results. I've copied the following directly from my workspace.
>
> wilcox.exact (from exactRankTests) gives
>
>     wilcox.exact(x - 8.5)
>
>             Exact Wilcoxon signed rank test
>
>     data:  x - 8.5
>     V = 80.5, p-value = 0.5748
>
> so I'd suspect the textbook. One-sided p-value perhaps? Or table limitation (as in p > .25). If you want to dig deeper, you'll probably have to check the computations implied by the text.
>
> > Thanks for any help,
> > CHV
> >
> >     x <- c(8.32,8.05,8.93,8.65,8.25,8.46,8.52,8.35,8.36,8.41,8.42,8.30,8.71,8.75,8.6,8.83,8.5,8.38,8.29,8.46)
> >     wilcox.test(x,y=NULL,mu=8.5)
> >
> >             Wilcoxon signed rank test with continuity correction
> >
> >     data:  x
> >     V = 80.5, p-value = 0.573
> >     alternative hypothesis: true location is not equal to 8.5
> >
> >     Warning messages:
> >     1: In wilcox.test.default(x, y = NULL, mu = 8.5) : cannot compute exact p-value with ties
> >     2: In wilcox.test.default(x, y = NULL, mu = 8.5) : cannot compute exact p-value with zeroes
> >
> > Charles H Van deZande

--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
(p.dalga...@biostat.ku.dk)
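
The difference between the two tests discussed in this message can be checked directly. The following is only a sketch using the data posted in the thread; the variable names (mu0, n_above, n_used and the p_* values) are chosen here for illustration and do not come from the original posts.

```r
# Data from the thread (Montgomery, problem 14.7); hypothesised median 8.5
x <- c(8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41,
       8.42, 8.30, 8.71, 8.75, 8.6, 8.83, 8.5, 8.38, 8.29, 8.46)
mu0 <- 8.5

# Sign test: only the signs of (x - mu0) matter; exact zeros are dropped
n_above <- sum(x > mu0)    # observations above the hypothesised median
n_used  <- sum(x != mu0)   # observations that carry a sign

# One-sided binomial tail for the counts quoted in the thread (8 of 20)
p_one_sided <- pbinom(8, size = 20, prob = 0.5)
# Two-sided sign test on the same counts
p_two_sided <- binom.test(8, 20)$p.value
# Sign test after dropping the observation equal to 8.5
p_sign <- binom.test(n_above, n_used)$p.value

# Signed-rank test: uses the ranks of |x - mu0| as well as the signs
p_signed_rank <- suppressWarnings(wilcox.test(x, mu = mu0)$p.value)

round(c(one_sided = p_one_sided, two_sided = p_two_sided,
        sign_dropping_zero = p_sign, signed_rank = p_signed_rank), 4)
```

The one-sided binomial tail is where a value near 0.25 comes from; the signed-rank p-value answers a different question, so the two are not expected to agree.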
Re: [R] Remove objects names like character String
Try this:

    rm(list=ls(patt="site[0-9]$"))

On Tue, May 19, 2009 at 7:47 AM, Katharina May may.kathar...@googlemail.com wrote:
> Hi,
> how can I use rm() on objects named like: paste("site",i,"_data",sep="") while looping through i? I tried rm(paste("site",i,"_data",sep="")) but I get the error that rm() must contain names or text strings, which is confusing me as I thought paste() would create something like that...?
> Thanks,
> Katharina
> --
> Time flies like an arrow, fruit flies like bananas.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
Re: [R] memory stack overflow
That is very unusual, but the C stack can be increased under Windows by recompiling R (see src/gnuwin32/front-ends/Makefile). On most other OSes it is much easier: just adjust the setting via ulimit or limit in your shell.

But I suspect the problem is that your model is too complex. Incidentally, is your R current? Stack usage in expanding models was reduced recently.

On Tue, 19 May 2009, Nora Pérez wrote:
> Dear colleagues,
> I am trying a glm.nb for the distribution of a plant species with 93 environmental variables. I execute the instruction and I get the following message: Error: C stack usage is too close to the limit.

Well, we don't know what you did (see the footer of this message) but if you mean 93 explanatory variables I hope you have tens of thousands of cases and are just looking for a predictive model.

> How can I increase the memory of R?
> Yours sincerely,
> Nora.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
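
Before changing any limit it can help to see what R itself reports. This is a small sketch, not part of the original reply: Cstack_info() is a base R function, and the variable name info is only illustrative.

```r
# Cstack_info() reports the C stack as seen by R:
#   size      - stack size in bytes (NA if it cannot be determined)
#   current   - current usage in bytes
#   direction - 1 if the stack grows down, -1 if it grows up
info <- Cstack_info()
print(info)
```

On Unix-alikes the limit itself is raised in the shell (for example with ulimit -s) before R is started, as described in the message above; it cannot be changed from inside a running session.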
Re: [R] Remove objects names like character String
Thanks for all your solutions, they work perfectly!

2009/5/19 Henrique Dallazuanna www...@gmail.com:
> Try this:
>
>     rm(list=ls(patt="site[0-9]$"))

--
Time flies like an arrow, fruit flies like bananas.
Re: [R] Remove objects names like character String
I don't get the error you mention:

    site1_data <- 1
    site2_data <- 2
    site3_data <- 3
    for (i in 1:3) paste("site",i,"_data",sep="")

In my example, another way is:

    rm(list=paste("site",1:3,"_data",sep=""))

Or you can use rm(list=ls(pattern="your pattern")); in my example, it is:

    rm(list=ls(pattern="site[1-3]_data"))

Ronggui

2009/5/19 Katharina May may.kathar...@googlemail.com:
> Hi,
> how can I use rm() on objects named like: paste("site",i,"_data",sep="") while looping through i? I tried rm(paste("site",i,"_data",sep="")) but I get the error that rm() must contain names or text strings, which is confusing me as I thought paste() would create something like that...?
> Thanks,
> Katharina

--
HUANG Ronggui, Wincent
PhD Candidate
Dept of Public and Social Administration
City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html
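
The two approaches suggested in this thread can be tried safely in a scratch environment. The sketch below assumes objects named site1_data, site2_data and so on, as in the question; env, obj_names and remaining are names introduced here for illustration.

```r
# Work in a scratch environment so nothing in the workspace is touched
env <- new.env()
for (i in 1:3) {
  assign(paste("site", i, "_data", sep = ""), i, envir = env)
}
ls(env)

# rm() does not evaluate its unnamed arguments, so rm(paste(...)) fails;
# a character vector of names goes in the 'list' argument instead
obj_names <- paste("site", 1:3, "_data", sep = "")
rm(list = obj_names, envir = env)

# Alternative: select the names with a regular expression
assign("site4_data", 4, envir = env)
rm(list = ls(env, pattern = "^site[0-9]+_data$"), envir = env)
remaining <- ls(env)
```

Anchoring the pattern with ^ and $ is a design choice made here so that unrelated objects whose names merely contain "site" are left alone.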
[R] error glmpath()
Hi R-users!

I am trying to learn how to use the glmpath package. I have a data frame like this

    dim(data)
    [1] 605 109

and selected the following

    response <- data[,1]
    features <- as.matrix(data[,3:109])
    mymodel <- glmpath(features, response, family = binomial)
    Error in if (lambda <= min.lambda) { : missing value where TRUE/FALSE expected

Reading the glmpath pdf, I don't understand why I get this error, since lambda and min.lambda seem to have default values. Any suggestions will be very much welcomed.

Dave

    sessionInfo()
    R version 2.9.0 Under development (unstable) (2009-01-14 r47602)
    x86_64-unknown-linux-gnu

    locale:
    LC_CTYPE=es_ES.UTF-8;LC_NUMERIC=C;LC_TIME=es_ES.UTF-8;LC_COLLATE=es_ES.UTF-8;LC_MONETARY=C;LC_MESSAGES=es_ES.UTF-8;LC_PAPER=es_ES.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=es_ES.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] splines stats graphics grDevices utils datasets methods
    [8] base

    other attached packages:
    [1] foreign_0.8-34 glmpath_0.94 survival_2.35-4
[R] nlrwr package. Error when fitting the optimal Box-Cox transformation with two variables
Dear all:

I'm trying to fit the optimal Box-Cox transformation related to nls (see the code below) for the money demand data in Green (3rd Edition), but in the last step R gives the following error message:

    Error in `[.data.frame`(eval(object$data), , as.character(formula(object)[[2]])[2]) : undefined columns selected

Any idea how to solve the problem? Thanks in advance,

    library(nlrwr)
    r <- c(4.50,4.19,5.16,5.87,5.95,4.88,4.50,6.44,7.83,6.25,5.50,5.46,7.46,10.28,11.77,13.42,11.02,8.50,8.80,7.69)
    M <- c(480.00,524.30,566.30,589.50,628.20,712.80,805.20,861.00,908.40,1023.10,1163.60,1286.60,1388.90,1497.90,1631.40,1794.40,1954.90,2188.80,2371.70,2563.60)
    Y <- c(2208.30,2271.40,2365.60,2423.30,2416.20,2484.80,2608.50,2744.10,2729.30,2695.00,2826.70,2958.60,3115.20,3192.40,3187.10,3248.80,3166.00,3277.70,3492.00,3573.50)
    money <- data.frame(r,M,Y)
    attach(money)
    ols1 <- lm(log(M)~log(r)+log(Y))
    output1 <- summary(ols1)
    coef1 <- ols1$coefficients
    a1 <- coef1[[1]]
    b11 <- coef1[[2]]
    b21 <- coef1[[3]]
    money.m1 <- nls(log(M)~a+b*r^g+c*Y^g,data=money,start=list(a=a1,b=b11,g=1,c=b21))
    summary(money.m1)
    money.m2 <- boxcox(money.m1)

Prof. Ikerne del Valle Erkiaga
Department of Applied Economics V
Faculty of Economic and Business Sciences
University of the Basque Country
Avda. Lehendakari Agirre, Nº 83
48015 Bilbao (Bizkaia) Spain
[R] Spearman rho
I read that Spearman's rho can be used to detect the presence of a trend in a time series. However, I cannot figure out how to use such a test for this purpose. First of all, which of the available functions should I use, and how do I pass it my mono-channel time series, which contains both positive and negative values? I would love to see some examples.

Thank you very much.
Maura
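
One common way to set this up, offered only as a sketch and not as an answer from the list, is to correlate the observations with their time index using cor.test() from the stats package. The series x below is made up for illustration, and the test treats the observations as independent under the null hypothesis, which strongly autocorrelated series may violate.

```r
# Illustrative series with positive and negative values (made-up numbers)
x <- c(-2.1, -0.5, -1.3, 0.4, 1.2, 0.9, 2.5, 3.1, 2.8, 4.0)
time_index <- seq_along(x)

# Spearman's rho between the observations and their time order;
# a value near +1 or -1 suggests a monotone trend
res <- cor.test(x, time_index, method = "spearman")
rho <- unname(res$estimate)
p_value <- res$p.value

print(res)
```

Because the statistic depends only on ranks, the sign and scale of the values do not matter.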
[R] File too big for filehash?
Dear R users,

I am trying to use a very large file (~3 GiB) with the filehash package. The dataset has around 4,000,000 observations. I get this message from R while trying to load the dataset (named cc084.csv):

    dumpDF(read.csv("cc084.csv", header=T), dbName="db01")
    Erreur : impossible d'allouer un vecteur de taille 15.6 Mo

(French; in English: Error: cannot allocate vector of size 15.6 Mb)

Is there anything I can do? My R version is 2.8.1.

-
Rock Ouimet
Forest Soil Scientist
MRNF-QC
[R] Using while statements to insert rows in a dataframe
Hi. I am very new to R and have been diligently working my way through the manual and various tutorials. I am now trying to work with some of my own data and have encountered a problem that I need to fix.

I have a data frame with 8 columns and approximately 850 rows. I have provided an excerpt of the data frame below. Within column 6 (Question) the numbers 1:33 repeat down the entire column. Occasionally, however, another value (-32767) appears. I need to locate this value every time it appears and in its place insert 33 rows that are numbered 1:33 in column Question. Additionally, I need to maintain the integrity of the other columns so that the values at that location in each column are also repeated 33 times. So, in the example below, I currently have 68 rows of data, but I actually need 132 rows (two -32767 values need to be replaced).

Based on my reading I am guessing that I need to use a while loop, but I cannot seem to get it right. Is this the appropriate function, or is there another more efficient method for achieving my goal? Again, I am quite new to R. Thanks for your help!

Year Month Day   Time PartID Question Latency Response
2008     2   7 194556      6        1     265       -1
2008     2   7 194556      6        2     466       84
2008     2   7 194556      6        3     199       68
2008     2   7 194556      6        4     152       83
2008     2   7 194556      6        5     177      100
2008     2   7 194556      6        6     177       61
2008     2   7 194556      6        7     400       43
2008     2   7 194556      6        8     225       88
2008     2   7 194556      6        9     249       32
2008     2   7 194556      6       10     172        8
2008     2   7 194556      6       11     163       17
2008     2   7 194556      6       12     326       70
2008     2   7 194556      6       13     232       26
2008     2   7 194556      6       14     157       22
2008     2   7 194556      6       15     135       -1
2008     2   7 194556      6       16     133        2
2008     2   7 194556      6       17     222        2
2008     2   7 194556      6       18     357        4
2008     2   7 194556      6       19     131       -1
2008     2   7 194556      6       20     222       90
2008     2   7 194556      6       21     230       35
2008     2   7 194556      6       22     374       32
2008     2   7 194556      6       23     275       85
2008     2   7 194556      6       24     141       -1
2008     2   7 194556      6       25     264       19
2008     2   7 194556      6       26     380       17
2008     2   7 194556      6       27     240       21
2008     2   7 194556      6       28     127       -1
2008     2   7 194556      6       29     232       92
2008     2   7 194556      6       30     205       95
2008     2   7 194556      6       31     185       96
2008     2   7 194556      6       32     319       61
2008     2   7 194556      6       33     101       -1
2008     2   8 122203      6   -32767       0       NA
2008     2   7 194556      6        1     265       -1
2008     2   7 194556      6        2     466       84
2008     2   7 194556      6        3     199       68
2008     2   7 194556      6        4     152       83
2008     2   7 194556      6        5     177      100
2008     2   7 194556      6        6     177       61
2008     2   7 194556      6        7     400       43
2008     2   7 194556      6        8     225       88
2008     2   7 194556      6        9     249       32
2008     2   7 194556      6       10     172        8
2008     2   7 194556      6       11     163       17
2008     2   7 194556      6       12     326       70
2008     2   7 194556      6       13     232       26
2008     2   7 194556      6       14     157       22
2008     2   7 194556      6       15     135       -1
2008     2   7 194556      6       16     133        2
2008     2   7 194556      6       17     222        2
2008     2   7 194556      6       18     357        4
2008     2   7 194556      6       19     131       -1
2008     2   7 194556      6       20     222       90
2008     2   7 194556      6       21     230       35
2008     2   7 194556      6       22     374       32
2008     2   7 194556      6       23     275       85
2008     2   7 194556      6       24     141       -1
2008     2   7 194556      6       25     264       19
2008     2   7 194556      6       26     380       17
2008     2   7 194556      6       27     240       21
2008     2   7 194556      6       28     127       -1
2008     2   7 194556      6       29     232       92
2008     2   7 194556      6       30     205       95
2008     2   7 194556      6       31     185       96
2008     2   7 194556      6       32     319       61
2008     2   7 194556      6       33     101       -1
2008     2   8 143056      6   -32767       0       NA

Eric S McKibben
Industrial-Organizational Psychology Graduate Student
Clemson University
Clemson, SC
Re: [R] Using while statements to insert rows in a dataframe
Eric McKibben wrote:
> Within column 6 (Question) the numbers 1:33 repeat down the entire column. Occasionally, however, another value (-32767) appears. I need to locate this value every time it appears and in its place insert 33 rows that are numbered 1:33 in column Question. Additionally, I need to maintain the integrity of the other columns so that the values at that location in each column are also repeated 33 times. So, in the example below, I currently have 68 rows of data, but I actually need 132 rows (two -32767 values need to be replaced).
>
> Year Month Day Time PartID Question Latency Response
> 2008 2 7 194556 6 1 265 -1
> 2008 2 7 194556 6 2 466 84
> 2008 2 7 194556 6 3 199 68
> ..
> 2008 2 8 122203 6 -32767 0 NA

It's always good to boil down your example to the minimum possible; your example is too big. To clarify your point, assuming there are only two questions:

You have:

    Question Latency Response
           1     265       -1
           2     466       84
      -32767       0       NA

You need?

    Question Latency Response
           1     265       -1
           2     466       84
           1     265       -1
           2     466       84
Re: [R] Using while statements to insert rows in a dataframe
Eric McKibben wrote:
> Within column 6 (Question) the numbers 1:33 repeat down the entire column. Occasionally, however, another value (-32767) appears. I need to locate this value every time it appears and in its place insert 33 rows that are numbered 1:33 in column Question. Additionally, I need to maintain the integrity of the other columns so that the values at that location in each column are also repeated 33 times. So, in the example below, I currently have 68 rows of data, but I actually need 132 rows (two -32767 values need to be replaced). Based on my reading I am guessing that I need to use a while loop, but I cannot seem to get it right.
>
> Year Month Day Time PartID Question Latency Response
> 2008 2 7 194556 6 1 265 -1
> 2008 2 7 194556 6 2 466 84
> 2008 2 7 194556 6 3 199 68
> ..
> 2008 2 8 122203 6 -32767 0 NA

Hi Eric,

Using a while statement would probably work, but it would mean not making use of R's convenient indexing. What I suggest is the following (my.data is the data.frame you provided):

    ## To locate the rows
    row.pos = which(my.data$Question == -32767)
    repeat.index = rep(row.pos, 33)

    ## To output the result data.frame
    index.vector = sort(c(seq_along(my.data$Question)[my.data$Question != -32767], repeat.index))
    final.result = my.data[index.vector,]

This should do the trick.

Cheers,
--
Luc Villandré
Biostatistician
McGill University Health Center - Montreal Children's Hospital Research Institute
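
The indexing idea in this reply repeats each -32767 row 33 times; the Question column of the repeated rows still has to be renumbered 1:33 afterwards. The sketch below combines both steps under that reading of the original request. The function expand_missing and its arguments are hypothetical names introduced here, and the small data frame uses two questions instead of 33 to keep the example short.

```r
# Repeat every sentinel row n_questions times and number the copies 1..n_questions.
# expand_missing() is an illustrative helper, not part of any package.
expand_missing <- function(df, sentinel = -32767, n_questions = 33) {
  is_missing <- !is.na(df$Question) & df$Question == sentinel
  # each ordinary row is kept once, each sentinel row n_questions times
  times <- ifelse(is_missing, n_questions, 1)
  idx <- rep(seq_len(nrow(df)), times = times)
  out <- df[idx, ]
  # renumber the Question column within each block of repeated rows
  out$Question[is_missing[idx]] <- rep(seq_len(n_questions), times = sum(is_missing))
  rownames(out) <- NULL
  out
}

# Cut-down example with two questions instead of 33
small <- data.frame(Year = 2008, PartID = 6,
                    Question = c(1, 2, -32767),
                    Latency  = c(265, 466, 0),
                    Response = c(-1, 84, NA))
expanded <- expand_missing(small, n_questions = 2)
print(expanded)
```

All other columns of a sentinel row (date, time, Latency 0, Response NA) are carried into each of its copies, which is what keeps the remaining columns aligned.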
[R] exists function on list objects gives always a FALSE
Dear R-users,

in a minimal example exists() gives FALSE on an object which obviously does exist. How else can I check on that list object, please?

    SmoothData <- list(exists=TRUE, span=0.001)
    SmoothData
    $exists
    [1] TRUE

    $span
    [1] 0.001

    exists("SmoothData")
    TRUE
    exists("SmoothData$span")
    FALSE
    exists("SmoothData[[2]]")
    FALSE

Thank you for any opinion regarding this topic.

Zroutik
[R] create string of comma-separated content of vector
Hi,

how do I create a string of the comma-separated content of a vector? I've got the vector i with several numeric values as content:

    str(i)
    num 99

and want to create a SQL statement that looks like the following, where the part '(2, 4, 6, 7)' should be the content of the vector i:

    select * from [biomass_data$] where site_no in (2, 4, 6, 7)

Here is my approach (which doesn't work):

    site_all_data = sqlQuery(channel, "select * from [biomass_data$] where site_no in (",paste(i,sep=","),")" )

Sorry for spamming the mailing list so much today...

-Katharina
--
Time flies like an arrow, fruit flies like bananas.
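
A likely reason the attempt above does not give one string is that the sep argument of paste() separates the arguments of a single call, whereas collapse joins the elements of a vector. The sketch below builds only the query text; the values in i are taken from the example statement in the question, and in_list and query are names chosen here for illustration.

```r
i <- c(2, 4, 6, 7)

# 'collapse' joins the elements of one vector into a single string
in_list <- paste(i, collapse = ", ")

# Assemble the full statement; sep = "" avoids extra spaces around the parentheses
query <- paste("select * from [biomass_data$] where site_no in (", in_list, ")",
               sep = "")
print(query)
```

The resulting string can then be passed as the query argument to the same sqlQuery(channel, ...) call used in the question.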
Re: [R] exists function on list objects gives always a FALSE
SmoothData$span is not an object which can be checked by exists(), but part of an object, which can be checked by is.null().

On Wed, May 20, 2009 at 12:07 AM, Žroutík zrou...@gmail.com wrote:
> Dear R-users,
> in a minimal example exists() gives FALSE on an object which obviously does exist. How else can I check on that list object, please?
>
>     SmoothData <- list(exists=TRUE, span=0.001)
>     exists("SmoothData")
>     TRUE
>     exists("SmoothData$span")
>     FALSE
>     exists("SmoothData[[2]]")
>     FALSE
>
> Thank you for any opinion regarding this topic.
> Zroutik
Re: [R] exists function on list objects gives always a FALSE
On 5/19/2009 12:07 PM, Žroutík wrote:
> Dear R-users,
> in a minimal example exists() gives FALSE on an object which obviously does exist. How else can I check on that list object, please?
>
>     SmoothData <- list(exists=TRUE, span=0.001)
>     exists("SmoothData")
>     TRUE
>     exists("SmoothData$span")
>     FALSE
>     exists("SmoothData[[2]]")
>     FALSE
>
> Thank you for any opinion regarding this topic.

There is no variable with name SmoothData$span; there is an element of SmoothData with name span. To test for that, the safest test is probably

    "span" %in% names(SmoothData)

but a common convention is to use

    is.null(SmoothData$span)

because NULL elements are rare in lists.

Duncan Murdoch
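
The distinction drawn in this reply, between a variable name and an element name, can be seen in a few lines. This is a sketch using the list from the question; has_span, has_other and the element name "bandwidth" are made up for illustration.

```r
SmoothData <- list(exists = TRUE, span = 0.001)

# exists() takes the *name* of a variable as a string
var_found     <- exists("SmoothData")        # a variable of that name is defined
element_found <- exists("SmoothData$span")   # looks for a variable literally
                                             # named "SmoothData$span"

# Testing for a list element by name
has_span  <- "span" %in% names(SmoothData)
has_other <- "bandwidth" %in% names(SmoothData)   # "bandwidth" is a made-up name

c(var_found = var_found, element_found = element_found,
  has_span = has_span, has_other = has_other)
```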
Re: [R] exists function on list objects gives always a FALSE
Žroutík wrote:
> Dear R-users,
> in a minimal example exists() gives FALSE on an object which obviously does exist. How else can I check on that list object, please?
>
>     SmoothData <- list(exists=TRUE, span=0.001)
>     exists("SmoothData")
>     TRUE
>     exists("SmoothData$span")
>     FALSE

This checks for the existence of an object called SmoothData$span, as in:

    `SmoothData$span` <- 1:10
    exists("SmoothData$span")

You can do:

    is.list( SmoothData ) && !is.null(names(SmoothData)) && "span" %in% names(SmoothData)

>     exists("SmoothData[[2]]")
>     FALSE

Similarly:

    `SmoothData[[2]]` <- 1
    exists("SmoothData[[2]]")

You can do:

    is.list( SmoothData ) && length(SmoothData) > 1

> Thank you for any opinion regarding this topic.
> Zroutik

--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
Re: [R] exists function on list objects gives always a FALSE
Žroutík wrote:
>     SmoothData <- list(exists=TRUE, span=0.001)
>     exists("SmoothData")
>     TRUE
>     exists("SmoothData$span")
>     FALSE

    'SmoothData$span' = 'foo'
    exists("SmoothData$span")
    # TRUE

>     exists("SmoothData[[2]]")

    'SmoothData[[2]]' = 'bar'
    exists("SmoothData[[2]]")
    # TRUE

the problem in your case is that you have an object named 'SmoothData' with a nested component named 'span', but you're testing for the existence of an object named 'SmoothData$span'. as shown in a recent post, one attempt to do your task would be

    exists('SmoothData') && 'span' %in% names(SmoothData)
    # TRUE

vQ
Re: [R] Coord_equal in ggplot2
Dear Dieter,

I tried that. But it rescales one of the axes. The resulting graph is still square, but now 1 cm on the Y-axis equals 2.5 cm on the X-axis. This does not seem to be the documented behaviour.

    ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal(ratio = 2/5)

From sessionInfo(): R 2.9.0 on WinXP; ggplot2_0.8.3 reshape_0.8.3 plyr_0.1.8 proto_0.3-8

Regards,
Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Dieter Menne
Sent: Tuesday 19 May 2009 16:15
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] Coord_equal in ggplot2

ONKELINX, Thierry Thierry.ONKELINX at inbo.be writes:
> I'm plotting some points on a graph where both axes need to have the same scale. See the example below. Coord_equal does that trick but in this case it wastes a lot of space on the y-axis. Setting the limits of the y-axis myself was to no avail. Any suggestions to solve this problem?
>
>     library(ggplot2)
>     ds <- data.frame(x = runif(1000, min = 0, max = 30), y = runif(1000, min = 14, max = 26))
>     ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal()
>     ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal() + scale_x_continuous(limits = c(0, 30)) + scale_y_continuous(limits = c(14, 26))

I think you need to set ratio in addition to cut off the extra space. (Not tried)

From Docs: Equal scales. coord_equal ensures that the x and y axes have equal scales: i.e. 1 cm along the x axis represents the same range of data as 1 cm along the y axis. By default it will assume that you want a one-to-one ratio, but you can change this with the ratio parameter. The aspect ratio will also be set to ensure that the mapping is maintained regardless of the shape of the output device. See the documentation of coord_equal() for more details.

Dieter

The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
Re: [R] exists function on list objects gives always a FALSE
Linlin Yan wrote:
> SmoothData$span is not an object which can be checked by exists(), but part of an object which can be checked by is.null().

is.null is unhelpful here, in that lists can contain NULL as a named element, and retrieving a non-existent element returns NULL:

    foo = list(bar=NULL)
    is.null(foo$bar)
    # TRUE
    is.null(foo$foo)
    # TRUE

i must admit i find it surprising that ?'$' does not appropriately explain what happens if a list is indexed with a name not included in the list's names. the closest is

    "When extracting, a numerical, logical or character 'NA' index picks an unknown element and so returns 'NA' in the corresponding element of a logical, integer, numeric, complex or character result, and 'NULL' for a list."

but it's valid for NAs in the index, and

    "If no match is found then 'NULL' is returned."

but it's in the section on environments.

vQ
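
The pitfall described in this message can be reproduced with a slightly larger list. This is only a sketch; the element names baz and qux are made up, and present is an illustrative variable name.

```r
foo <- list(bar = NULL, baz = 1)

# is.null() cannot tell a NULL-valued element from a missing one
null_but_present <- is.null(foo$bar)   # element present, value NULL
null_and_absent  <- is.null(foo$qux)   # element absent ("qux" is a made-up name)

# names() reflects which elements the list actually has
present <- c("bar", "baz", "qux") %in% names(foo)

c(null_but_present = null_but_present, null_and_absent = null_and_absent)
setNames(present, c("bar", "baz", "qux"))
```

Both is.null() calls give the same answer even though only one of the elements is missing, which is why the names()-based test is the safer of the two.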
Re: [R] Wilcoxon nonparametric p-values
Thanks Peter, You are correct! After I sent the previous message, I realized that I was comparing the sign test against the Wilcoxon test. I would have replied sooner, but I realized that while I was out walking my dogs. CHV   Charles H Van deZande ---Original Message--- From: Peter Dalgaard Date: 05/19/09 10:32:30 To: Charles Van deZande Cc: r-help@r-project.org Subject: Re: [R] Wilcoxon nonparametric p-values Charles Van deZande wrote: Thanks Peter, There are 8 measurements less than 8.5, so calculating the probability (binomial) of 8, or fewer, happening by chance with n = 20 and p = 0.50 gives P = 0.25-- the book answer. I've tried several problems in other textbooks and in each case I get vastly different P-values than I get with wilcox.test or wilcox.exact. Ah, but that is NOT a signed-rank test, just a sign test. (Using the former as a test of the median is BTW not really a good idea unless you assume symmetry of the distribution.) It is also still a one-sided test, with two tails you get binom.test(8,20) Exact binomial test data: 8 and 20 number of successes = 8, number of trials = 20, p-value = 0.5034 alternative hypothesis: true probability of success is not equal to 0.5 95 percent confidence interval: 0.1911901 0.6394574 sample estimates: probability of success 0.4 (and that is disregarding that one observation is exactly 8.5, so you should really look at 7 in 19 rather than 8 in 20.) However, upon further testing, I've found good agreement when the calculated P-values are small, but disagreement when P-values are large. This might mean a problem with wilcox.test and wilcox.exact when P-values are large or I might be misinterpreting something. You need to read some more theory. The extreme cases (all signs equal) are equally unlikely for the sign test and the signed-rank test. 
> CHV
> Charles H Van deZande

---Original Message---
From: Peter Dalgaard p.dalga...@biostat.ku.dk
Date: 5/19/2009 5:35:07 AM
To: cvandy cvand...@gmail.com
Cc: r-help@r-project.org
Subject: Re: [R] Wilcoxon nonparametric p-values

cvandy wrote:
> When I use wilcox.test, I get vastly different p-values than the problems from Statistics textbooks. For example: The following problem comes from "Applied Statistics and Probability for Engineers", 2nd Edition, by D. C. Montgomery. Page 736, problem 14.7. The problem is to compare the sample data with a population median of 8.5. The book answer is p = 0.25, wilcox.test answer is p = 0.573. I've tried several other similar problems with similar results. I've copied the following directly from my workspace.

wilcox.exact (from exactRankTests) gives

    > wilcox.exact(x - 8.5)

            Exact Wilcoxon signed rank test

    data:  x - 8.5
    V = 80.5, p-value = 0.5748

so I'd suspect the textbook. One-sided p-value perhaps? or table limitation (as in p .25). If you want to dig deeper, you'll probably have to check the computations implied by the text.

> Thanks for any help, CHV
>
>     > x <- c(8.32,8.05,8.93,8.65,8.25,8.46,8.52,8.35,8.36,8.41,8.42,8.30,8.71,8.75,8.6,8.83,8.5,8.38,8.29,8.46)
>     > wilcox.test(x,y=NULL,mu=8.5)
>
>             Wilcoxon signed rank test with continuity correction
>
>     data:  x
>     V = 80.5, p-value = 0.573
>     alternative hypothesis: true location is not equal to 8.5
>
>     Warning messages:
>     1: In wilcox.test.default(x, y = NULL, mu = 8.5) : cannot compute exact p-value with ties
>     2: In wilcox.test.default(x, y = NULL, mu = 8.5) : cannot compute exact p-value with zeroes
>
> Charles H Van deZande

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
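The difference between the two tests discussed in this thread can be checked directly on the posted data. This is only a sketch: the value that appears as "8 38" in the archived post is assumed to be 8.38, and the names mu0, n_above, n_nonzero, sign_p, book_p and rank_p are introduced here for illustration.

```r
# Data from the original post; "8 38" is assumed to mean 8.38
x <- c(8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41,
       8.42, 8.30, 8.71, 8.75, 8.60, 8.83, 8.50, 8.38, 8.29, 8.46)
mu0 <- 8.5

n_above   <- sum(x > mu0)    # observations above the hypothesised median
n_nonzero <- sum(x != mu0)   # observations not equal to it

# Sign test, two-sided, dropping the observation equal to 8.5
sign_p <- binom.test(n_above, n_nonzero)$p.value

# One-sided binomial tail for "8 or fewer out of 20", the textbook-style figure
book_p <- pbinom(8, 20, 0.5)

# Signed-rank test; exact p-values are unavailable because of ties and a zero,
# so R falls back to the normal approximation and warns
rank_p <- suppressWarnings(wilcox.test(x, mu = mu0)$p.value)

c(sign = sign_p, book = book_p, signed_rank = rank_p)
```

The counts reproduce the "7 in 19" mentioned in the reply, and the one-sided tail is where a figure of about 0.25 comes from.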
Re: [R] anova(cph(..) output
Hi,

Thank you very much for the answer. However, I still have some misunderstandings. From the output, can we say that plant and leaf age are significant but not their interaction? And the last question, I promise: what would you advise me to write in the paper to explain the different method and acknowledge the df?

Thank you again,
julien.

Frank E Harrell Jr wrote:
> pompon wrote:
>> Hello,
>>
>> I am a beginner in R and statistics, so my question may be trivial. Sorry in advance. I performed a Cox proportional hazards regression with 2 categorical variables with cph{design}. Then an anova on the results. The output is
>>
>>     anova(cph(surv(survival, censor) ~ plant + leaf.age + plant*leaf.age, Mpnymph)
>>
>>                     Wald Statistics          Response: Surv(survival, censored)
>>
>>      Factor                                          Chi-Square d.f. P
>>      plant  (Factor+Higher Order Factors)             96.96     12   <.0001
>>       All Interactions                                10.58      6   0.1022
>>      leaf.age  (Factor+Higher Order Factors)          29.11      7   0.0001
>>       All Interactions                                10.58      6   0.1022
>>      plant * leaf.age  (Factor+Higher Order Factors)  10.58      6   0.1022
>>      TOTAL                                           106.63     13   <.0001
>>
>> What does "All Interactions" stand for? The real df is 6 for plant and 1 for leaf.age. Then, which chi square is the one for my main factors and their interaction?
>>
>> thank you,
>> Julien.
>
> Julien,
>
> I know what you mean when you say 'real df' but that's not the whole story as plant has 6 more df by interacting with a single df variable. There is no such thing as 'the' main effect test for plant. The 12 df test is unique and tests whether plant is associated with Y for any level of leaf.age.
>
> You can see exactly what is being tested by using various print options for anova.Design, as described in the help file. The 'dots' option is easy on the eyes.
>
> Frank
>
> --
> Frank E Harrell Jr   Professor and Chair   School of Medicine
> Department of Biostatistics   Vanderbilt University
Re: [R] text() to label points in ggplot
# Here are two options:

    p <- ggplot(mtcars, aes(wt, mpg)) + geom_point() +
      geom_text(aes(x = 5, y = 30, label = "A Label"))

    # or

    response <- c(2,4)
    xvar <- c(1,2)
    label <- response
    myData <- data.frame(response, xvar, label)
    p <- ggplot(myData, aes(y=response, x=xvar))
    p + geom_bar(position="dodge", stat="identity") + geom_text(aes(y=response+1, label=label))

On Thu, May 14, 2009 at 1:22 PM, stephenb sten...@go.com wrote:
> is there a way to label points in a graph using text(locator(1),text) after ggplot() or qplot() ?
>
>     qplot(date, psavert, data = economics, geom = "line", main="jhdjd") -> p
>     p+opts(text(locator(1),),new=T)
>
> does not work.
Re: [R] create string of comma-separated content of vector
see ?paste

e.g.

    x <- seq(0,10,1)
    paste(x, collapse=", ")

2009/5/19 Katharina May may.kathar...@googlemail.com:
> Hi,
>
> how do I create a string of the comma-separated content of a vector? I've got the vector i with several numeric values as content:
>
>     str(i)
>     num 99
>
> and want to create a SQL statement to look like the following where the part '(2, 4, 6, 7)' should be the content of the vector i:
>
>     select * from [biomass_data$] where site_no in (2, 4, 6, 7)
>
> Here my approach (which doesn't work):
>
>     site_all_data = sqlQuery(channel, "select * from [biomass_data$] where site_no in (", paste(i, sep=","), ")")
>
> sorry for spamming so much today to the mailing list...
>
> -Katharina
>
> --
> Time flies like an arrow, fruit flies like bananas.

--
Dr. Mark Wardle
Specialist registrar, Neurology
Cardiff, UK
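Applied to the SQL statement in the question, the collapse argument can be used as follows. This is a minimal sketch: the values in i are made up to match the '(2, 4, 6, 7)' example, and in_list and query are names introduced here.

```r
# Hypothetical site numbers standing in for the poster's vector `i`
i <- c(2, 4, 6, 7)

# sep= separates the *arguments* of paste(); collapse= joins the
# elements of a single vector into one string
in_list <- paste(i, collapse = ", ")

# Assemble the whole statement as one character string
query <- paste0("select * from [biomass_data$] where site_no in (", in_list, ")")
query
```

The resulting single string could then be passed as the query argument of the database call, for example sqlQuery(channel, query).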
Re: [R] anova(cph(..) output
pompon wrote:
> Hi,
>
> Thank you very much for the answer. However, I still have some misunderstandings. From the output, can we say that plant and leaf age are significant but not their interaction? And the last question, I promise: what would you advise me to write in the paper to explain the different method and acknowledge the df?
>
> Thank you again,
> julien.

I would say there is moderate evidence for an interaction (P=0.10) and strong evidence for both a plant effect (at least at some level of leaf) and a leaf effect (at least at some level of plant).

Frank

> Frank E Harrell Jr wrote:
>> pompon wrote:
>>> Hello,
>>>
>>> I am a beginner in R and statistics, so my question may be trivial. Sorry in advance. I performed a Cox proportional hazards regression with 2 categorical variables with cph{design}. Then an anova on the results. The output is
>>>
>>>     anova(cph(surv(survival, censor) ~ plant + leaf.age + plant*leaf.age, Mpnymph)
>>>
>>>                     Wald Statistics          Response: Surv(survival, censored)
>>>
>>>      Factor                                          Chi-Square d.f. P
>>>      plant  (Factor+Higher Order Factors)             96.96     12   <.0001
>>>       All Interactions                                10.58      6   0.1022
>>>      leaf.age  (Factor+Higher Order Factors)          29.11      7   0.0001
>>>       All Interactions                                10.58      6   0.1022
>>>      plant * leaf.age  (Factor+Higher Order Factors)  10.58      6   0.1022
>>>      TOTAL                                           106.63     13   <.0001
>>>
>>> What does "All Interactions" stand for? The real df is 6 for plant and 1 for leaf.age. Then, which chi square is the one for my main factors and their interaction?
>>>
>>> thank you,
>>> Julien.
>>
>> Julien,
>>
>> I know what you mean when you say 'real df' but that's not the whole story as plant has 6 more df by interacting with a single df variable. There is no such thing as 'the' main effect test for plant. The 12 df test is unique and tests whether plant is associated with Y for any level of leaf.age.
>>
>> You can see exactly what is being tested by using various print options for anova.Design, as described in the help file. The 'dots' option is easy on the eyes.
>>
>> Frank

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
Department of Biostatistics   Vanderbilt University
[R] length(grep(...
Colleagues

R 2.8.1 on OS X

I often combine two commands as follows:

    length(grep(TEXT, OBJECT)) > 0

to see if a particular snippet of text exists within an object. Is there a single command that would accomplish this?

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com
Re: [R] length(grep(...
Hi,

R >= 2.9.0 ships grepl, which lets you do that:

    any( grepl( TEXT, OBJECT) )

You can also:

    install.packages( "operators" )
    require( operators )
    OBJECT %~+% TEXT

Romain

Dennis Fisher wrote:
> Colleagues
>
> R 2.8.1 on OS X
>
> I often combine two commands as follows:
>
>     length(grep(TEXT, OBJECT)) > 0
>
> to see if a particular snippet of text exists within an object. Is there a single command that would accomplish this?
>
> Dennis
>
> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone: 1-866-PLessThan (1-866-753-7784)
> Fax: 1-415-564-2220
> www.PLessThan.com

--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
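A short sketch of the grepl() idiom next to the older grep() one, on made-up data (OBJECT, found, found_old and not_found are illustrative names only):

```r
# Hypothetical character vector to search in
OBJECT <- c("alpha", "beta", "gamma")

# grepl() returns one logical per element; any() reduces that to a single answer
found     <- any(grepl("et", OBJECT))
# The idiom from the question gives the same answer
found_old <- length(grep("et", OBJECT)) > 0
# A pattern that matches nothing
not_found <- any(grepl("zz", OBJECT))
```

For a literal snippet rather than a regular expression, fixed = TRUE can be passed to grepl().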
[R] S data sets in R?
Greetings. I'm trying to learn to program in R. (I'm definitely NOT new to programming, just to R.) A colleague suggested that I have a look at the book:

    An Introduction to S and S-Plus
    by: Phil Spector

I've glanced at the book, and it does indeed seem to be the kind of thing I wanted, but in the Introduction to the book, the author says he'll be using several example data sets throughout the book, including:

1. auto.stats
2. saving.x
3. rain.nyc1
4. state.x77

The author states: "These data sets should be available as part of the standard S distribution, so you can simply refer to them as they are used in the examples."

Of course I want to use R, not S. I have every R-* package installed on my Fedora linux system, but I can't find any of the data sets mentioned above. (The command "locate rain.nyc" produces no output, for instance.)

It's entirely possible that these data sets are installed, but I just don't know enough about R to determine that. Hence, I need help to find out if the data sets are installed, or if they CAN be installed, etc. If you can steer me in the right direction, please do so.

Thanks.

-- Mike
Re: [R] S data sets in R?
Maybe you should just bypass that book for one of these? http://www.springer.com/series/6991

-Ro

On Tue, May 19, 2009 at 12:01 PM, Michael Hannon jm_han...@yahoo.com wrote:
> Greetings. I'm trying to learn to program in R. (I'm definitely NOT new to programming, just to R.) A colleague suggested that I have a look at the book:
>
>     An Introduction to S and S-Plus
>     by: Phil Spector
>
> I've glanced at the book, and it does indeed seem to be the kind of thing I wanted, but in the Introduction to the book, the author says he'll be using several example data sets throughout the book, including:
>
> 1. auto.stats
> 2. saving.x
> 3. rain.nyc1
> 4. state.x77
>
> The author states: "These data sets should be available as part of the standard S distribution, so you can simply refer to them as they are used in the examples."
>
> Of course I want to use R, not S. I have every R-* package installed on my Fedora linux system, but I can't find any of the data sets mentioned above. (The command "locate rain.nyc" produces no output, for instance.)
>
> It's entirely possible that these data sets are installed, but I just don't know enough about R to determine that. Hence, I need help to find out if the data sets are installed, or if they CAN be installed, etc. If you can steer me in the right direction, please do so.
>
> Thanks.
>
> -- Mike
Re: [R] exists function on list objects gives always a FALSE
On Tue, May 19, 2009 at 12:07 PM, routík zrou...@gmail.com wrote:
>     SmoothData <- list(exists=TRUE, span=0.001)
>     exists("SmoothData$span")
>     FALSE

As others have said, this just checks for the existence of a variable with the (strange) name "SmoothData$span". In some sense, in R semantics, xxx$yyy *always* exists if xxx is a list (or other recursive object):

    > xxx <- list()
    > xxx$hello
    NULL

You might think that you can check names(xxx) to see if the slot has been explicitly set, but it depends on *how* you have explicitly set the slot to NULL:

    > xxx$hello <- 3
    > xxx$hello <- NULL
    > names(xxx)
    character(0)     # no names -- assigning to NULL kills slot
    > xxx <- list(hello=NULL)
    > names(xxx)
    [1] "hello"      # 1 name -- constructing with NULL-valued slot

Welcome to R!

-s
Re: [R] exists function on list objects gives always a FALSE
Stavros Macrakis wrote:
> You might think that you can check names(xxx) to see if the slot has been explicitly set, but it depends on *how* you have explicitly set the slot to NULL:
>
>     > xxx$hello <- 3
>     > xxx$hello <- NULL
>     > names(xxx)
>     character(0)     # no names -- assigning to NULL kills slot

kills indeed:

    foo = list(bar=1)
    with(foo, bar) # 1
    foo$bar = NULL
    with(foo, bar) # error: object 'bar' not found

>     > xxx <- list(hello=NULL)
>     > names(xxx)
>     [1] "hello"      # 1 name -- constructing with NULL-valued slot

but:

    # cleanup -- don't do it in mission critical session
    rm(list=ls())
    foo # error: object 'foo' not found
    foo = NULL
    foo # NULL

that is, foo$bar = NULL kills bar within foo (even though NULL is a valid component of lists), but foo = NULL does *not* kill foo.

> Welcome to R!

... and its zemanticks.

vQ
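For completeness, an existing element can be set to NULL without being removed by assigning a one-element list through single-bracket indexing. A small sketch of both behaviours, reusing the foo example from this message (kept and removed are illustrative names):

```r
# Single-bracket assignment of list(NULL) keeps the element, with value NULL
kept <- list(bar = 1)
kept["bar"] <- list(NULL)
names(kept)      # still contains "bar"

# $-assignment of NULL removes the element altogether
removed <- list(bar = 1)
removed$bar <- NULL
length(removed)  # no elements left
```

This is the same distinction as between list(hello=NULL) and xxx$hello <- NULL in the quoted text.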
[R] Replace / swap values of subset of a data.frame
Dear R users,

I have one data.frame of 1500x80, data1. I found out that there are a few cells of data that I have misplaced, and I need to fix their ordering. In an attempt to swap columns 22 & 23 of the Subject with misplaced data, I did the following:

    data2 <- data1
    subset(data1, (Subject==25 & Session==1))[,22] <- subset(data2, (Subject==25 & Session==1))[,23]
    (error messages... Could not find function "subset<-")
    subset(data1, (Subject==25 & Session==1))[,23] <- subset(data2, (Subject==25 & Session==1))[,22]
    (error messages... Could not find function "subset<-")

Please, please point me to some ways to achieve the swapping. Thanks a lot!

Cheers,
John
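subset() has no replacement form, which is what the error says, but ordinary [ indexing does. One possible approach, sketched on a small made-up data frame (the columns a and b stand in for columns 22 and 23 of the real data1; rows is a name introduced here):

```r
# Hypothetical stand-in for data1 (the real one is 1500 x 80)
data1 <- data.frame(Subject = c(25, 25, 26),
                    Session = c(1, 2, 1),
                    a = c(1, 2, 3),
                    b = c(10, 20, 30))

# Row numbers of the misplaced records; which() also drops NA comparisons
rows <- which(data1$Subject == 25 & data1$Session == 1)

# Swap the two columns for those rows only. The right-hand side is evaluated
# in full before assignment and is matched by position, so no copy is needed.
data1[rows, c("a", "b")] <- data1[rows, c("b", "a")]
data1
```

With numeric column positions the same line would read data1[rows, c(22, 23)] <- data1[rows, c(23, 22)].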
[R] help with error
    while (theta1!=theta) {...}

gives the error message:

    Error in while (theta1 != theta) { : missing value where TRUE/FALSE needed

but when I extract theta1!=theta and paste it into the console it comes up with the output TRUE, which contradicts the error message. I'm not sure what I am doing wrong.
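This message is what R reports when the condition of while() or if() evaluates to NA. One possible explanation, which cannot be confirmed from the post alone, is that theta1 becomes NA or NaN inside the loop on a later iteration even though the first comparison is TRUE. A small sketch with made-up values:

```r
theta  <- 1
theta1 <- NA_real_   # hypothetical: e.g. the result of an update such as 0/0

cond <- theta1 != theta   # NA, neither TRUE nor FALSE

# Capture the error raised by the loop instead of stopping the script
msg <- tryCatch({
  while (theta1 != theta) {}
  "no error"
}, error = function(e) conditionMessage(e))

# isTRUE() turns an NA condition into FALSE, so a guarded loop ends cleanly
guarded <- isTRUE(theta1 != theta)
```

Printing theta1 inside the loop body is a simple way to see at which iteration it stops being a number. Separately, comparing floating-point values with != rarely terminates an iteration reliably; a tolerance such as abs(theta1 - theta) > 1e-8 is a common alternative.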
[R] what is wrong with this code?
    dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))
    d2logl <- (n/theta^2) - sum((-2y/theta^3)*(1-exp(y/theta))/(1+exp(y/theta))) - sum(((2*y/theta^4)*exp(y/theta))/((1+exp(y/theta))^2))

returns the error message:

    Error: unexpected symbol in:
    "dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))
    d2logl"

do you know what I have done wrong?
Re: [R] (no subject)
On Mon, May 18, 2009 at 9:22 AM, Thomas Lumley tlum...@u.washington.edu wrote:
> On Mon, 18 May 2009, Debbie Zhang wrote:
>> Based on a set of binomial sample data, how would you utilize the nlm function in R to estimate the true proportion of the population?
>
> I can't see why anyone would want to use nlm() for this. The sample proportion is the MLE, and binom.test() gives an exact confidence interval.

Homework exercise intended to teach the use of optimization when you can separately work out what the answer should be?

And, as you probably know, the "exact" confidence interval from binom.test is not as good as the approximate interval described by Agresti and B.A. Coull in a 1998 American Statistician article. (The coverage of the exact interval is at least the nominal value but it can be greater because the binomial is discrete.)
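The three quantities mentioned in this exchange can be put side by side. The sketch below uses a made-up sample (13 successes in 40 trials), and the Agresti-Coull interval is computed by hand from its usual textbook form rather than taken from any package; negll, fit, p_hat, exact_ci and ac_ci are names introduced here.

```r
# Hypothetical sample: 13 successes in 40 trials
x <- 13
n <- 40

# Negative log-likelihood on the logit scale, so that nlm() can search
# over an unconstrained parameter
negll <- function(eta) -dbinom(x, n, plogis(eta), log = TRUE)
fit   <- nlm(negll, p = 0)
p_hat <- plogis(fit$estimate)   # should agree closely with x / n

# "Exact" (Clopper-Pearson) 95% interval from binom.test()
exact_ci <- binom.test(x, n)$conf.int

# Agresti-Coull approximate 95% interval: add z^2/2 successes and
# z^2/2 failures, then apply the usual Wald formula
z     <- qnorm(0.975)
n_adj <- n + z^2
p_adj <- (x + z^2 / 2) / n_adj
ac_ci <- p_adj + c(-1, 1) * z * sqrt(p_adj * (1 - p_adj) / n_adj)

rbind(exact = as.numeric(exact_ci), agresti_coull = ac_ci)
```

As the thread notes, the optimisation adds nothing over x / n here; it is only useful as an exercise where the answer is known in advance.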
Re: [R] what is wrong with this code?
You're missing a ")" off the end of the first line. You should consider using an editor (e.g. ESS/Emacs) that does parentheses matching. I found this in less than 5 sec (less time than I'm taking to write you a note) by cut and pasting into Emacs.

--sundar

On Tue, May 19, 2009 at 12:52 PM, deanj2k dl...@le.ac.uk wrote:
>     dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))
>     d2logl <- (n/theta^2) - sum((-2y/theta^3)*(1-exp(y/theta))/(1+exp(y/theta))) - sum(((2*y/theta^4)*exp(y/theta))/((1+exp(y/theta))^2))
>
> returns the error message:
>
>     Error: unexpected symbol in:
>     "dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))
>     d2logl"
>
> do you know what I have done wrong?
Re: [R] what is wrong with this code?
On 19-May-09 19:52:20, deanj2k wrote:
>     dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))
>     d2logl <- (n/theta^2) - sum((-2y/theta^3)*(1-exp(y/theta))/(1+exp(y/theta))) - sum(((2*y/theta^4)*exp(y/theta))/((1+exp(y/theta))^2))
>
> returns the error message:
>
>     Error: unexpected symbol in:
>     "dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))
>     d2logl"
>
> do you know what I have done wrong?

The error message strongly suggests that the line beginning "d2logl <-" is being seen as a continuation of the preceding line. Counting parentheses, I find that you are 1 short of what is required to complete the expression in the line beginning "-(n/theta)". In that case, R will continue on to the next line seeking the completion, and will encounter "d2logl" non-syntactically.

Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 19-May-09  Time: 22:12:40
-- XFMail --
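The diagnosis given in the replies can be confirmed without evaluating anything, by asking R only to parse the text. A minimal sketch (broken, fixed, implicit and the helper parses are names introduced here; the second line of the post is shortened to a placeholder):

```r
# First line as posted (one ")" short), followed by the start of the next line
broken <- "dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))\nd2logl <- 1"

# Same line with the missing ")" appended
fixed <- "dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta))))"

# R has no implicit multiplication, so "-2y" from the second line fails as well
implicit <- "(-2y/theta^3)"

# TRUE if the text is syntactically valid R
parses <- function(code) !inherits(try(parse(text = code), silent = TRUE), "try-error")

c(broken = parses(broken), fixed = parses(fixed), implicit = parses(implicit))
```

Note that once the parenthesis is fixed, the -2y in the second line would still need to be written as -2*y.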
Re: [R] S data sets in R?
On Tue, May 19, 2009 at 2:01 PM, Michael Hannon jm_han...@yahoo.com wrote:
> Greetings. I'm trying to learn to program in R. (I'm definitely NOT new to programming, just to R.) A colleague suggested that I have a look at the book:
>
>     An Introduction to S and S-Plus
>     by: Phil Spector
>
> I've glanced at the book, and it does indeed seem to be the kind of thing I wanted, but in the Introduction to the book, the author says he'll be using several example data sets throughout the book, including:
>
> 1. auto.stats
> 2. saving.x
> 3. rain.nyc1
> 4. state.x77
>
> The author states: "These data sets should be available as part of the standard S distribution, so you can simply refer to them as they are used in the examples."
>
> Of course I want to use R, not S. I have every R-* package installed on my Fedora linux system, but I can't find any of the data sets mentioned above. (The command "locate rain.nyc" produces no output, for instance.)

Not an unreasonable first guess but in R you need parentheses around the arguments in function calls and you would need to quote the name of the object. Even when you do those things and guess at the function name being "find" instead of "locate" you still won't get any joy.

    > find("rain.nyc")
    character(0)

The state.x77 data set is part of the datasets package but the others never seemed to make it from S to R. If you want to find out what is available you can try

    ls.str("package:datasets")

and stare at the output for a while until it begins to make sense. In general, an experienced programmer can learn a lot about the structure of an object in R by applying Martin Maechler's str function to it. The ls.str function is the equivalent of asking for a listing of the objects in a namespace and applying str to each of those names.

Two recent books that I would recommend for learning R are Robert Gentleman's "R Programming for Bioinformatics" and John Chambers' "Software for Data Analysis". Robert (one of the two R's who started the R Project) gives you a broad overview of tools available and considerable detail on the important parts. John, the designer and implementor of the S language that preceded R, describes how to think about the programming task in R. Both are worth reading.

> It's entirely possible that these data sets are installed, but I just don't know enough about R to determine that. Hence, I need help to find out if the data sets are installed, or if they CAN be installed, etc. If you can steer me in the right direction, please do so.
>
> Thanks.
>
> -- Mike
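The checks described in this reply can be run as follows in a default R session, where the datasets package is attached. The names where_x77 and where_rain are introduced here for illustration.

```r
# find() reports where on the search path an object of that name lives
where_x77  <- find("state.x77")   # part of the datasets package
where_rain <- find("rain.nyc1")   # one of the S data sets with no R counterpart

where_x77
where_rain

# str() gives a compact description of the object's structure
str(state.x77)
```

data(package = "datasets") lists the available data sets with one-line descriptions, which may be easier to scan than the ls.str() output.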
Re: [R] binom package (was: no subject)
There are 17 different help pages in 5 different packages citing Agresti and Coull. This is quickly displayed using the RSiteSearch package as follows:

    library(RSiteSearch)
    HTML(RSiteSearch.function("Agresti and Coull"))

I have not checked all these 17, but they doubtless help explain Agresti and Coull's point that the term "exact confidence interval" is like a lot of terms in Marketing: The substance falls far short of the hype for most purposes.

Hope this helps.
Spencer Graves

Douglas Bates wrote:
> On Mon, May 18, 2009 at 9:22 AM, Thomas Lumley tlum...@u.washington.edu wrote:
>> On Mon, 18 May 2009, Debbie Zhang wrote:
>>> Based on a set of binomial sample data, how would you utilize the nlm function in R to estimate the true proportion of the population?
>>
>> I can't see why anyone would want to use nlm() for this. The sample proportion is the MLE, and binom.test() gives an exact confidence interval.
>
> Homework exercise intended to teach the use of optimization when you can separately work out what the answer should be?
>
> And, as you probably know, the "exact" confidence interval from binom.test is not as good as the approximate interval described by Agresti and B.A. Coull in a 1998 American Statistician article. (The coverage of the exact interval is at least the nominal value but it can be greater because the binomial is discrete.)
Re: [R] S data sets in R?
My favorite tool for finding things like this is RSiteSearch.function in the RSiteSearch package. For the objects you mention, I get the following:

    library(RSiteSearch)
    hits(a.s <- RSiteSearch.function("auto.stats")) # 0
    hits(sx <- RSiteSearch.function("saving.x")) # 0
    hits(rn <- RSiteSearch.function("rain.nyc1")) # 0
    hits(s77 <- RSiteSearch.function("state.x77")) # 12
    HTML(s77) # View the 12 and find "states" in the "datasets" package.
    hits(ps <- RSiteSearch.function("Phil Spector")) # 0

If you are still interested in that book, you might write to the author, suggesting he might get more readers by providing a package that includes those data sets. If he were really interested in having more readers, he might also include script files providing R scripts for working all the examples in the book, as Doug Bates does in the nlme package, which can be found using system.file('scripts', package='nlme'). These provide R code to work essentially all the examples in Pinheiro and Bates (2000) Mixed-Effects Models in S and S-Plus (Springer). For me, those files made reading that book much easier, more pleasant and memorable.

Hope this helps.
Spencer Graves

Douglas Bates wrote:
> On Tue, May 19, 2009 at 2:01 PM, Michael Hannon jm_han...@yahoo.com wrote:
>> Greetings. I'm trying to learn to program in R. (I'm definitely NOT new to programming, just to R.) A colleague suggested that I have a look at the book:
>>
>>     An Introduction to S and S-Plus
>>     by: Phil Spector
>>
>> I've glanced at the book, and it does indeed seem to be the kind of thing I wanted, but in the Introduction to the book, the author says he'll be using several example data sets throughout the book, including:
>>
>> 1. auto.stats
>> 2. saving.x
>> 3. rain.nyc1
>> 4. state.x77
>>
>> The author states: "These data sets should be available as part of the standard S distribution, so you can simply refer to them as they are used in the examples."
>>
>> Of course I want to use R, not S. I have every R-* package installed on my Fedora linux system, but I can't find any of the data sets mentioned above. (The command "locate rain.nyc" produces no output, for instance.)
>
> Not an unreasonable first guess but in R you need parentheses around the arguments in function calls and you would need to quote the name of the object. Even when you do those things and guess at the function name being "find" instead of "locate" you still won't get any joy.
>
>     > find("rain.nyc")
>     character(0)
>
> The state.x77 data set is part of the datasets package but the others never seemed to make it from S to R. If you want to find out what is available you can try
>
>     ls.str("package:datasets")
>
> and stare at the output for a while until it begins to make sense. In general, an experienced programmer can learn a lot about the structure of an object in R by applying Martin Maechler's str function to it. The ls.str function is the equivalent of asking for a listing of the objects in a namespace and applying str to each of those names.
>
> Two recent books that I would recommend for learning R are Robert Gentleman's "R Programming for Bioinformatics" and John Chambers' "Software for Data Analysis". Robert (one of the two R's who started the R Project) gives you a broad overview of tools available and considerable detail on the important parts. John, the designer and implementor of the S language that preceded R, describes how to think about the programming task in R. Both are worth reading.
>
>> It's entirely possible that these data sets are installed, but I just don't know enough about R to determine that. Hence, I need help to find out if the data sets are installed, or if they CAN be installed, etc. If you can steer me in the right direction, please do so.
>>
>> Thanks.
>>
>> -- Mike
[R] Getting lm() to work with a matrix
Hi

I'm fairly new to R and am trying to analyse some large spectral datasets using stepwise regression (fairly standard in this area). I have a field-sampled dataset, of which a proportion has been held back for validation. I gather that step() needs to be fed a regression model and lm() can produce a multiple regression. I had thought something like:

spectra.lm <- lm(response[,3]~spectra.spec[,2:20])

might work, but lm() doesn't appear to like being fed a range of columns. I suspect I've missed something fairly fundamental here. Any help much appreciated.

best wishes

mike
[R] panel question (plm)
Hello,

I am working on a data set (already as a plm.data object) located here: http://econsteve.com/arch/plmWithDensity.Robj

With the following R session:

library(plm)
...
load("plmWithDensity.Robj")
model <- plm(RATE ~ density08, data=plmWithDensity)
Error: subscript out of bounds

I am not understanding the subscript out of bounds error, as this is a balanced panel and there are no holes in the data set. Any help would be very much appreciated.

The model I am trying to run is

model2 <- plm(RATE ~ AGR.PCT+SVC.PCT+IND.PCT+density08, data=plmWithDensity)

This code runs fine, but I do not get any coefficients for density08

summary(model2)
Oneway (individual) effect Within Model

Call:
plm(formula = RATE ~ AGR.PCT + SVC.PCT + IND.PCT + density08, data = plmWithDensity)

Balanced Panel: n=89, T=26, N=2314

Residuals :
    Min. 1st Qu.  Median 3rd Qu.    Max.
 -1860.0  -475.0    11.3   526.0  1250.0

Coefficients :
          Estimate Std. Error  t-value  Pr(>|t|)
AGR.PCT   34192.60    4281.07   7.9869 1.383e-15 ***
SVC.PCT    4024.83     457.17   8.8037 < 2.2e-16 ***
IND.PCT  -16545.62    1541.32 -10.7347 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares:    98515
Residual Sum of Squares: 85206
F-statistic: 115.692 on 3 and DF, p-value: < 2.22e-16

What is going on? Any advice is appreciated.

Thanks, -stephen

== Stephen J. Barr University of Washington WEB: www.econsteve.com ==
Re: [R] panel question (plm)
On Tue, 19 May 2009, Stephen J. Barr wrote:

Hello, I am working on a data set (already as a plm.data object) located here: http://econsteve.com/arch/plmWithDensity.Robj

With the following R session:

library(plm)
...
load("plmWithDensity.Robj")
model <- plm(RATE ~ density08, data=plmWithDensity)
Error: subscript out of bounds

I am not understanding the subscript out of bounds error, as this is

I agree that the error is not very meaningful but the problem is due to your data: density08 does not vary within your id variable (COURT), hence the default within model cannot be estimated. And it is also the reason why density08 gets no coefficient in a larger model.

Also note that your RATE variable is a factor...I'm pretty certain you want a numeric variable here!

Yves & Giovanni: What happens in the code is that the model.matrix() method silently omits the column from the regressor matrix. Hence, this goes unnoticed in the larger model and results in a regressor matrix without any columns in the case above. Thus, the subscript error.

hth, Z
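To see why a regressor that is constant within each id drops out of a within (fixed effects) model, here is a minimal base-R sketch. It does not use plm itself; the toy panel, its column names and its values are made up for illustration, and the within transformation is done by hand with ave().

```r
# Hypothetical toy panel: 3 ids observed 4 times each.
panel <- data.frame(
  id      = rep(c("A", "B", "C"), each = 4),
  x       = c(1, 2, 3, 4,  2, 4, 6, 8,  5, 5, 6, 8),  # varies within id
  density = rep(c(10, 20, 30), each = 4)              # constant within id
)

# Within transformation: subtract the per-id mean from each value.
within_x       <- panel$x       - ave(panel$x,       panel$id)
within_density <- panel$density - ave(panel$density, panel$id)

# The time-invariant regressor is identically zero after demeaning,
# so it carries no information and cannot get a coefficient.
all(within_density == 0)
any(within_x != 0)
```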
[R] create string of comma-separated content of vector
See ?toString

x <- 0:10
toString(x)

See ?sQuote for cases where the vector is a character and needs to be quoted.

Jason Law Statistician City of Portland Bureau of Environmental Services Water Pollution Control Laboratory 6543 N Burlington Avenue Portland, OR 97203-5452 jason@bes.ci.portland.or.us

Hi, how do I create a string of the comma-separated content of a vector? I've got the vector i with several numeric values as content:

str(i) num 99

and want to create a SQL statement to look like the following, where the part '(2, 4, 6, 7)' should be the content of the vector i:

select * from [biomass_data$] where site_no in (2, 4, 6, 7)

Here is my approach (which doesn't work):

site_all_data = sqlQuery(channel, "select * from [biomass_data$] where site_no in (",paste(i,sep=","),")" )

Sorry for spamming the mailing list so much today...

-Katharina

-- Time flies like an arrow, fruit flies like bananas.
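A small sketch of the toString() suggestion applied to the SQL question. The vector contents are taken from the example in the post; the character vector sites is hypothetical, and the single quotes are added with paste0() rather than sQuote(), since sQuote() may produce typographic quotes depending on the session settings.

```r
i <- c(2, 4, 6, 7)

# toString() collapses a vector into one comma-separated string;
# paste(..., collapse = ", ") does the same thing explicitly.
in_list <- toString(i)
same    <- paste(i, collapse = ", ")

# Build the query text as a single string.
query <- paste0("select * from [biomass_data$] where site_no in (", in_list, ")")

# For character values, quote each element before collapsing.
sites  <- c("north", "south")   # hypothetical site names
quoted <- toString(paste0("'", sites, "'"))
```

The original attempt used paste(i, sep = ","), which leaves i as a vector of separate strings; it is the collapse argument (or toString) that joins the elements into one string.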
Re: [R] panel question (plm)
Ah, thank you for the help, and for the explanation of what is going on. I suppose I will have to reload my data with plm.data, set up such that RATE is not a factor. For my time index, will 2000, 2000.25, 2000.5, etc. work? Meaning 2000 quarter 1, 2000 quarter 2, etc.? Or is there some special way that I need to format the time?

Thanks, -stephen

== Stephen J. Barr University of Washington WEB: www.econsteve.com ==
Re: [R] Barchart in lattice - wrong order of groups, data labels on top of each other, and a legend question
On Mon, May 18, 2009 at 11:47 AM, Dimitri Liakhovitski ld7...@gmail.com wrote:

Hello! I have a question about my lattice barchart that I am trying to build in Section 3 below. I can't figure out a couple of things:

1. When I look at the dataframe "test" that I am trying to plot, it looks right to me (the group "Total" is always the first out of 5). However, in the chart it is the last. Why?
2. How can I make sure the value labels (on y) are not sitting on top of each other but on top of the respective bar?
3. Is there any way to make the legend group items horizontally as opposed to now (vertically - taking up too much space)?

For 1 and 3, use

auto.key = list(points = FALSE, rectangles = TRUE, reverse.rows = TRUE, columns = 2, space = "bottom")

From ?xyplot (under 'key'):

'reverse.rows' logical, defaulting to 'FALSE'. If 'TRUE', all components are reversed _after_ being replicated (the details of which may depend on the value of 'rep'). This is useful in certain situations, e.g. with a grouped 'barchart' with 'stack = FALSE' with the categorical variable on the vertical axis, where the bars in the plot will usually be ordered from bottom to top, but the corresponding legend will have the levels from top to bottom (unless, of course, 'reverse.rows = TRUE'). Note that in this case, unless all columns have the same number or rows, they will no longer be aligned.

'columns' the number of columns column-blocks the key is to be divided into, which are drawn side by side.

2 is hard with a simple custom panel function, because you need to replicate some fairly involved calculations that are performed in panel.barchart. Your best bet is to start with a copy of panel.barchart, and then add calls to panel.text at suitable places.

-Deepayan

Thanks a lot!
Dimitri

### Section 1: generates my data set "data" - just run: ###

N <- 100
myset1 <- c(1,2,3,4,5)
probs1 <- c(.05,.10,.15,.40,.30)
myset2 <- c(0,1)
probs2 <- c(.65,.30)
myset3 <- c(1,2,3,4,5,6,7)
probs3 <- c(.02,.03,.10,.15,.20,.30,.20)
group <- unlist(lapply(1:4, function(x) {
  out <- rep(x, 25)
  return(out)
}))
set.seed(1)
a <- sample(myset1, N, replace = TRUE, probs1)
a[which(rbinom(100,2,.01)==1)] <- NA
set.seed(12)
b <- sample(myset1, N, replace = TRUE, probs1)
b[which(rbinom(100,2,.01)==1)] <- NA
set.seed(123)
c <- sample(myset2, N, replace = TRUE, probs2)
set.seed(1234)
d <- sample(myset2, N, replace = TRUE, probs2)
set.seed(12345)
e <- sample(myset3, N, replace = TRUE, probs3)
e[which(rbinom(100,2,.01)==1)] <- NA
set.seed(123456)
f <- sample(myset3, N, replace = TRUE, probs3)
f[which(rbinom(100,2,.01)==1)] <- NA
data <- data.frame(group, a=a, b=b, c=c, d=d, e=e, f=f)
data["group"] <- lapply(data["group"], function(x) {
  x[x %in% 1] <- "Group 1"
  x[x %in% 2] <- "Group 2"
  x[x %in% 3] <- "Group 3"
  x[x %in% 4] <- "Group 4"
  return(x)
})
data$group <- as.factor(data$group)
lapply(data, table, exclude=NULL)
tables <- lapply(data, function(x) {
  out <- table(x)
  out <- prop.table(out)
  out <- round(out,3)*100
  return(out)
})
str(tables[2])

### Section 2: Generating a list of tables with percentages to be plotted in barcharts - just run: ###

listoftables <- list()
for(i in 1:(length(data)-1)) {
  listoftables[[i]] <- data.frame()
}
for(i in 1:length(listoftables)) {
  total <- table(data[[i+1]])
  groups <- table(data[[1]], data[[i+1]])
  total.percents <- as.data.frame(t(as.vector(round(total*100/sum(total),1))))
  groups.percents <- as.data.frame(t(apply(groups, 1, function(x) {
    out <- round(x*100/sum(x),1)
    return(out)
  })))
  names(total.percents) <- names(groups.percents)
  final.table <- rbind(total.percents, groups.percents)
  row.names(final.table)[1] <- "Total"
  final.table <- as.matrix(final.table)
  listoftables[[i]] <- final.table
}
names(listoftables) <- names(data)[2:(length(listoftables)+1)]

### Section 3 - building the graph for the very first table of the listoftables ###

library(lattice)
i <- 1
test <- data.frame(Group = rep(row.names(listoftables[[i]]),5), a = rep(1:5,each=5), Percentage = as.vector(listoftables[[i]]))
par.settings = trellis.par.set(reference.line = list(col = "gray", lty = "dotted"))
barchart(Percentage~a, test, groups = Group, horizontal = FALSE,
  auto.key = list(points = FALSE, rectangles = TRUE, space = "bottom"), ylim = c(0,50),
  panel = function(y, x, ...) {
    panel.grid(h = -1, v = -1)
    panel.barchart(x, y, ...)
    ltext(x, y, labels=round(y,0), cex=.7, col="black", font=2, pos=3)
  })

-- Dimitri Liakhovitski MarketTools, Inc. dimitri.liakhovit...@markettools.com
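The auto.key advice for points 1 and 3 can be tried on a much smaller data set. The sketch below is only an illustration: the data frame toy and its values are invented, and the order of the groups is fixed by setting the factor levels explicitly, which is one common way to control the order in which lattice draws grouped bars.

```r
library(lattice)

# Hypothetical toy data: three groups, three answer categories.
toy <- data.frame(
  Group = factor(rep(c("Total", "Group 1", "Group 2"), times = 3),
                 levels = c("Total", "Group 1", "Group 2")),  # explicit order
  a = factor(rep(1:3, each = 3)),
  Percentage = c(10, 12, 8, 30, 28, 32, 60, 60, 60)
)

# Legend laid out in columns below the plot instead of one tall column.
p <- barchart(Percentage ~ a, data = toy, groups = Group,
              horizontal = FALSE,
              auto.key = list(points = FALSE, rectangles = TRUE,
                              columns = 3, space = "bottom"))
print(p)
```

Without explicit levels, a character column is converted to a factor with alphabetically sorted levels, which would be consistent with "Total" ending up after "Group 1" to "Group 4" in the original chart.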
Re: [R] panel question (plm)
On Tue, 19 May 2009, Stephen J. Barr wrote:

Ah, thank you for the help, and for the explanation of what is going on. I suppose I will have to reload my data with plm.data set such that RATE is not a factor.

plmWithDensity$RATE <- as.numeric(as.character(plmWithDensity$RATE))

should suffice.

For my time index, will 2000, 2000.25, 2000.5, etc. work? Meaning 2000 quarter 1, 2000 quarter 2, etc.? Or is there some special way that I need to format the time?

That's ok. Internally, plm.data always stores it as a factor anyway.

Best, Z
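The as.character() step in the conversion above matters. A short sketch with a made-up factor shows the difference between the internal level codes and the values the labels represent:

```r
# Hypothetical factor whose labels are numbers stored as text.
rate <- factor(c("10.5", "3", "200"))

# as.numeric() on a factor returns the internal level codes ...
codes <- as.numeric(rate)

# ... whereas going through as.character() recovers the actual values.
values <- as.numeric(as.character(rate))
```

Here codes is 1, 3, 2 (positions in the sorted levels "10.5", "200", "3"), while values is 10.5, 3, 200.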
Re: [R] Getting lm() to work with a matrix
Try this (note dot after ~):

lm(response[, 3] ~ ., as.data.frame(spectra.spec[, 2:20]))
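A self-contained sketch of the same idea on simulated data, followed by step(). The matrix spectra, the band names and the response are invented for illustration; only lm(), step() and the dot formula come from the thread.

```r
set.seed(1)

# Hypothetical "spectra": 50 samples, 5 bands, stored as a matrix.
spectra <- matrix(rnorm(50 * 5), nrow = 50,
                  dimnames = list(NULL, paste0("band", 1:5)))
y <- 2 * spectra[, 1] - spectra[, 3] + rnorm(50, sd = 0.5)

# Put response and predictors in one data frame so that each
# column becomes its own term; "." means "all other columns".
dat  <- data.frame(y = y, as.data.frame(spectra))
full <- lm(y ~ ., data = dat)

# step() can now add or drop individual bands.
sel <- step(full, trace = 0)
formula(sel)
```

With a matrix on the right-hand side of the formula, all columns form a single term, so step() has nothing to drop one at a time; converting to a data frame gives one term per column.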
[R] [R-pkgs] New version of actuar
Dear useRs,

A new version of actuar has been available since last Friday. This is mainly a bugfix release. From the NEWS file:

Version 1.0-2
=============

USER-VISIBLE CHANGES

o mfoo() and levfoo() now return Inf instead of NaN for infinite moments. (Thanks to David Humke for the idea.)

BUG FIXES

o Non-ascii characters in one R source file prevented compilation of the package in a C locale (at least on OS X).

o For probability laws that have a strictly positive mode or a mode at zero depending on the value of one or more shape parameters, dfoo(0, ...) did not handle correctly the case exactly at the boundary condition.

actuar is a package offering additional actuarial science functionality to R, mostly in the fields of loss distributions, risk theory (including ruin theory), simulation of compound hierarchical models and credibility theory. See also: http://www.actuar-project.org.

-- Vincent Goulet, Associate Professor École d'actuariat Université Laval, Québec vincent.gou...@act.ulaval.ca http://vgoulet.act.ulaval.ca

___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] panel question (plm)
Thank you for the advice. The density08 variable is the population density in year 2008. I also have population densities for year 2000, so I could put them both in and interpolate between them for the times that are covered by the panel (2000-2008), and then just have a density column that will vary both over time and across the various courts. I would assume that this would fix the problem of density not showing up in my coefficients list, although I think it is more of an econometrics issue :)

Thanks again, -stephen
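The interpolation idea could be sketched with approx() from base R. The two density values below are invented placeholders for a single court, and the quarterly time index follows the 2000, 2000.25, ... convention discussed earlier in the thread.

```r
# Hypothetical densities for one court in the two census years.
dens2000 <- 100
dens2008 <- 120

# Quarterly time index covering the panel period.
quarters <- seq(2000, 2008, by = 0.25)

# Linear interpolation between the two known points.
dens <- approx(x = c(2000, 2008), y = c(dens2000, dens2008),
               xout = quarters)$y

head(data.frame(time = quarters, density = dens))
```

Applied per court, this gives a density column that varies within each id, which is what the within model needs; whether a linear path between two points is a sensible assumption is a separate modelling question.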
[R] how to copy files from one direction to another?
There are 10 files in c:\\ and I want to copy 3 of them to d:\\. How can I do this via R? Thanks!
Re: [R] how to copy files from one direction to another?
?file.copy

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
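A minimal sketch of file.copy() for this task. To keep it runnable anywhere, it uses two throwaway directories under tempdir() instead of c:\\ and d:\\; the directory and file names are made up.

```r
# Hypothetical source and destination directories.
src <- file.path(tempdir(), "src_demo")
dst <- file.path(tempdir(), "dst_demo")
dir.create(src, showWarnings = FALSE)
dir.create(dst, showWarnings = FALSE)

# Create 10 empty files in the source directory.
file.create(file.path(src, paste0("file", 1:10, ".txt")))

# Copy 3 of them; 'to' may be an existing directory.
wanted <- c("file1.txt", "file4.txt", "file7.txt")
ok <- file.copy(from = file.path(src, wanted), to = dst, overwrite = TRUE)

ok               # one TRUE/FALSE per file
list.files(dst)
```

On Windows the same call would use paths such as "c:/" and "d:/" (or doubled backslashes) for from and to.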
Re: [R] Replace / swap values of subset of a data.frame
Exactly what are you trying to do? Are you trying to just change a subset of the values? 'subset' does not have an 'assignment' operator. Maybe you want something like this (but it is not clear from your description). Also, it is not clear whether you have exactly the same set of matching values in the two data frames for the subset conditions. If you do, then this might work:

data1[(data1$Subject==25) & (data1$Session==1), 22] <- data2[(data2$Subject==25) & (data2$Session==1), 23]

On Tue, May 19, 2009 at 3:50 PM, tsunhin wong thjw...@gmail.com wrote:

Dear R users,

I have 1 data.frame of 1500x80 - data1. I found out that there are a few cells of data that I have misplaced, and I need to fix the ordering of them. In an attempt to swap columns 22 & 23 of the Subject with misplaced data, I did the following:

data2 <- data1
subset(data1,(Subject==25 & Session==1))[,22] <- subset(data2,(Subject==25 & Session==1))[,23]
(error messages... Could not find function "subset<-")
subset(data1,(Subject==25 & Session==1))[,23] <- subset(data2,(Subject==25 & Session==1))[,22]
(error messages... Could not find function "subset<-")

Please, please point me to some ways to achieve the swapping. Thanks a lot!

Cheers, John

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
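The logical-indexing approach can be checked on a small made-up data frame. The column names x and y stand in for columns 22 and 23 and are purely illustrative; the untouched copy data2 plays the same role as in the original post.

```r
# Hypothetical miniature version of data1.
data1 <- data.frame(Subject = c(24, 25, 25, 26),
                    Session = c(1, 1, 2, 1),
                    x = c(1, 2, 3, 4),
                    y = c(10, 20, 30, 40))

# Keep an unmodified copy to read the old values from.
data2 <- data1

# Rows to fix: Subject 25, Session 1.
rows <- data1$Subject == 25 & data1$Session == 1

# Swap the two columns for those rows only.
data1[rows, "x"] <- data2[rows, "y"]
data1[rows, "y"] <- data2[rows, "x"]

data1
```

Reading from the copy avoids overwriting a value before it has been moved to the other column.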
Re: [R] Replace / swap values of subset of a data.frame
If DF is your data frame:

DF2 <- edit(DF)

and then make the changes manually in the spreadsheet that pops up.
[R] Too large a data set to be handled by R?
Dear R users,

At the moment I am using a "dynamic data extraction from raw files" strategy, but it takes a very long time. To save time, I am planning to generate a data set of size 1500 x 2, with each data point a 9-digit decimal number. I know R is limited to 2^31-1 and that my data set is not going to exceed this limit. But my laptop only has 2 Gb and is running 32-bit Windows (XP or Vista), and I have run into R memory problems before. Please let me know your opinion based on your experience. Thanks a lot!

- John
[R] Extracting correlation in a nlme model
Hi R users:

Is there a function to obtain the correlation within groups from this very simple lme model?

modeloMx1
Linear mixed-effects model fit by REML
  Data: barrag
  Log-restricted-likelihood: -70.92739
  Fixed: fza_tension ~ 1
(Intercept)
   90.86667

Random effects:
 Formula: ~1 | molde
        (Intercept) Residual
StdDev:    2.610052 2.412176

Number of Observations: 30
Number of Groups: 3

I want to obtain \rho = \sigma_b^2 / (\sigma_b^2 + \sigma^2). I know that I can obtain \sigma_b^2 and \sigma^2 with

VarCorr(modeloMx1)
molde = pdLogChol(1)
            Variance StdDev
(Intercept) 6.812374 2.610052
Residual    5.818593 2.412176

But I want to know if I can obtain \rho = 6.8123/(6.8123 + 5.8185) = 0.53934 in a straightforward way. Thank you for your help.

Kenneth
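One way to compute this ratio is to pull the variances out of the VarCorr() result, which for an lme fit is a character matrix. The sketch below is not a dedicated function for the intraclass correlation; it uses the Orthodont data shipped with nlme as a stand-in for the poster's data, which is not available here.

```r
library(nlme)

# Random-intercept model on a data set that ships with nlme.
fit <- lme(distance ~ 1, random = ~ 1 | Subject, data = Orthodont)

# VarCorr() returns character entries; convert the Variance column.
vc <- as.numeric(VarCorr(fit)[, "Variance"])   # (Intercept), Residual

# Intraclass correlation: between-group variance over total variance.
rho <- vc[1] / sum(vc)
rho

# The same arithmetic with the variances quoted in the post.
rho_post <- 6.812374 / (6.812374 + 5.818593)
```

For the model in the post, the same two lines applied to modeloMx1 should reproduce the value computed by hand.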
[R] glht problem
I am struggling with a simple repeated-measure model:

fit <- lme(trait ~ year * A, random = ~1|subj/year)

A being a factor with three levels. I have the following results for anova(fit):

            numDF denDF   F-value p-value
(Intercept)     1   126 2471.4720  <.0001
year           20    60   10.4126  <.0001
A               2   126   23.0721  <.0001
year:A         40   126    1.6499  0.0193

Now I try to use glht for A, but fail:

Linear Hypotheses:
             Estimate Std. Error z value p value
A2 - A1 == 0     0.25       1.10   0.227   0.972
A3 - A1 == 0     1.00       1.10   0.909   0.634
A3 - A2 == 0     0.75       1.10   0.682   0.774
(Adjusted p values reported -- single-step method)

Warning message:
In mcp2matrix(model, linfct = linfct) : covariate interactions found -- default contrast might be inappropriate

What can be going on with this?

many thanks in advance, Wolf
[R] Package Inline under windows
Hi all,

I installed the package inline (Windows version) but cannot compile any code. I always get the error message

ERROR(s) during compilation : source code errors or compiler configuration errors!

Unfortunately there is no description of where the package finds a C compiler, nor of where to set the configuration. Using the Linux version, everything works.

Thanks for your help!