[R] nlme with a factor in R 2.4.0beta
Hi,

the following R lines work fine in R 2.4.0 alpha (and older R versions), but not in R 2.4.0 beta (details below):

library(drc)   # to load the dataset 'PestSci'
library(nlme)

## Starting values
sv <- c(0.328919, 1.956121, 0.097547, 1.642436, 0.208924)

## No error
m1 <- nlme(SLOPE ~ c + (d-c)/(1+exp(b*(log(DOSE)-log(e)))),
           fixed = list(b+c+d+e~1),
           random = d~1|CURVE,
           start = sv[c(2,3,4,5)], data = PestSci)

## Error: attempt to select more than one element
m2 <- nlme(SLOPE ~ c + (d-c)/(1+exp(b*(log(DOSE)-log(e)))),
           fixed = list(b~HERBICIDE, c+d+e~1),
           random = d~1|CURVE,
           start = sv, data = PestSci)

Output from sessionInfo() for R 2.4.0 alpha

R version 2.4.0 alpha (2006-09-16 r39365)
i386-pc-mingw32

locale:
LC_COLLATE=Danish_Denmark.1252;LC_CTYPE=Danish_Denmark.1252;LC_MONETARY=Danish_Denmark.1252;LC_NUMERIC=C;LC_TIME=Danish_Denmark.1252

attached base packages:
[1] methods stats graphics grDevices utils datasets
[7] base

other attached packages:
   nlme     drc
 3.1-75   1.0-1

Output from sessionInfo() for R 2.4.0 beta

R version 2.4.0 beta (2006-09-24 r39497)
i386-pc-mingw32

locale:
LC_COLLATE=Danish_Denmark.1252;LC_CTYPE=Danish_Denmark.1252;LC_MONETARY=Danish_Denmark.1252;LC_NUMERIC=C;LC_TIME=Danish_Denmark.1252

attached base packages:
[1] methods stats graphics grDevices utils datasets
[7] base

other attached packages:
   nlme     drc
 3.1-76   1.0-1

Christian

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] contrasts in aov
Dear John

?ordered will help you.

Regards,

Christoph Buser

--
Christoph Buser [EMAIL PROTECTED]
Seminar fuer Statistik, LEO C13
ETH Zurich 8092 Zurich SWITZERLAND
phone: x-41-44-632-4673 fax: 632-1228
http://stat.ethz.ch/~buser/
--

John Vokey writes:
> useRs,
>
> A no doubt simple question, but I am baffled. Indeed, I think I once
> knew the answer, but can't recover it. The default contrasts for aov
> (and lm, and...) are contr.treatment and contr.poly for unordered and
> ordered factors, respectively. But, how does one invoke the latter?
> That is, in a data.frame, how does one indicate that a factor is an
> *ordered* factor such that contr.poly is invoked in the aov or lm call?
>
> --
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html
>
> -Dr. John R. Vokey
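To make the pointer to ?ordered concrete, here is a minimal sketch. The data frame `dat` and its columns `dose` and `resp` are made-up illustration data, not anything from the thread, and the sketch assumes the default `options(contrasts = c("contr.treatment", "contr.poly"))`.

```r
# Hypothetical toy data: three dose levels, four replicates each.
dat <- data.frame(
  dose = rep(c("low", "medium", "high"), each = 4),
  resp = c(1.1, 0.9, 1.3, 1.0, 2.1, 1.8, 2.2, 2.0, 2.9, 3.2, 3.1, 3.0)
)

# Mark the factor as ordered; the level order must be given explicitly,
# otherwise the levels are sorted alphabetically.
dat$dose <- factor(dat$dose, levels = c("low", "medium", "high"),
                   ordered = TRUE)

# With the default contrasts option, ordered factors get contr.poly.
fit <- aov(resp ~ dose, data = dat)
coef_names <- names(coef(fit))   # linear (.L) and quadratic (.Q) terms
coef_names
contrasts(dat$dose)
```

An existing factor can be converted the same way with `as.ordered()` or `ordered()`, provided its levels are already in the intended order.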
[R] Initialising Mersenne-Twister with one integer
Hi,

It seems to me that the Mersenne-Twister PRNG can be initialised using one integer instead of 624 integers, since inside RNG.c code there's a function defined as MT_sgenrand(Int32).

How do I actually set this seed within R? I've tried:

.Random.seed <- c(3, 1)
runif(1)
Error in runif(1) : .Random.seed has wrong length

In addition, is '3' actually the correct rng.kind for the Mersenne-Twister?

I'm using R version 2.2.1, 2005-12-20 on Ubuntu Dapper Linux 686.

Thanks,
Gad

--
Gad Abraham
Department of Mathematics and Statistics
University of Melbourne
Parkville 3010, Victoria, Australia
email: [EMAIL PROTECTED]
web: http://www.ms.unimelb.edu.au/~gabraham
Re: [R] Initialising Mersenne-Twister with one integer
On Mon, 25 Sep 2006, Gad Abraham wrote:

> Hi,
>
> It seems to me that the Mersenne-Twister PRNG can be initialised using
> one integer instead of 624 integers, since inside RNG.c code there's a
> function defined as MT_sgenrand(Int32).
>
> How do I actually set this seed within R?

set.seed(), on the help page for ?.Random.seed.

> I've tried:
>
> .Random.seed <- c(3, 1)
> runif(1)
> Error in runif(1) : .Random.seed has wrong length

From the help page:

   '.Random.seed' is an integer vector, containing the random number
   generator (RNG) *state* for random number generation in R. It can
   be saved and restored, but should not be altered by the user.

> In addition, is '3' actually the correct rng.kind for the
> Mersenne-Twister?
>
> I'm using R version 2.2.1, 2005-12-20 on Ubuntu Dapper Linux 686.

Not current, but I suspect the help page is the same in that version.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
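As a small sketch of the set.seed() route (the seed value 3 is arbitrary; the `kind` argument and the layout of `.Random.seed` are as described on the ?Random help page of current R, so older versions may differ in detail):

```r
# Seed the Mersenne-Twister from a single integer; R expands it into the
# full internal state, so .Random.seed never needs to be edited by hand.
set.seed(3, kind = "Mersenne-Twister")
a <- runif(3)

# Re-seeding with the same integer reproduces the same stream.
set.seed(3, kind = "Mersenne-Twister")
b <- runif(3)
same_stream <- identical(a, b)

# Inspect (but do not modify) the state vector.
state <- get(".Random.seed", envir = globalenv())
state_length <- length(state)   # 1 kind code + 1 position + 624 state words
kind_code <- state[1] %% 100    # lowest two decimal digits encode the RNG kind
current_kind <- RNGkind()[1]
```

The lowest two decimal digits of the first element come out as 3 for the Mersenne-Twister, so the '3' in the question is the right kind code; what was wrong was the length of the vector.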
Re: [R] Lattice strip labels for two factors
Dear Gabor and Deepayan,

Many thanks for your help. I used suggestion 3 from Gabor (it worked well with my long df) and will try Deepayan's suggestion.

Rafael

Deepayan Sarkar wrote:
> On 9/23/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
>> 1. You can write a custom strip function:
>>
>> my.strip <- function(which.given, ..., factor.levels) {
>>     levs <- if (which.given == 1) factor.levels
>>             else c("faro", "porto", "lisbon", "setubal")
>>     strip.default(which.given, ..., factor.levels = levs)
>> }
>> xyplot(value ~ year | fact1 * fact2, data = df, strip = my.strip)
>>
>> 2. However, it's probably easier just to change the levels in the
>> data frame. Just do it in a copy if you don't want to change the
>> original one:
>>
>> df2 <- df
>> levels(df2$fact2) <- c("faro", "porto", "lisbon", "setubal")
>> xyplot(value ~ year | fact1 * fact2, data = df2)
>>
>> 3. Or you can even do it inline in the data statement, which
>> similarly won't change the original data frame:
>>
>> levs <- c("faro", "porto", "lisbon", "setubal")
>> xyplot(value ~ year | fact1 * fact2,
>>        data = replace(df, "fact2", structure(df$fact2, levels = levs)))
>> head(df)  # unchanged
>
> or even (untested)
>
> xyplot(value ~ year | fact1 * factor(fact2, levels = levels(fact2),
>                                      labels = levs),
>        data = df)
>
> Deepayan
>
>> On 9/23/06, Rafael Duarte [EMAIL PROTECTED] wrote:
>>> Thank you for your suggestion. This could be a solution that I
>>> didn't think of. But I forgot to say that I didn't want to change
>>> the original data frame (I have other code that depends on the
>>> original df and on the original factor levels). I was looking more
>>> for an implementation directly in the xyplot call (same as I did
>>> for one factor). Is it possible/simple to do?
>>>
>>> Thank you,
>>> Rafael
>>>
>>> Gabor Grothendieck wrote:
>>>> Try this:
>>>>
>>>> levels(df$fact2) <- c("faro", "porto", "lisbon", "setubal")
>>>> xyplot(value ~ year | fact1*fact2, data=df, type="b")
>>>>
>>>> On 9/22/06, Rafael Duarte [EMAIL PROTECTED] wrote:
>>>>> Dear list,
>>>>>
>>>>> My problem is to change the strip text of lattice panels when
>>>>> using two factors. I have a data frame with two factors:
>>>>>
>>>>> df <- expand.grid(fact1=c("y","b","r"),
>>>>>                   fact2=c("far","por","lis","set"),
>>>>>                   year=1991:2000, value=NA)
>>>>> df[,"value"] <- sample(1:50, 120, replace=TRUE)
>>>>>
>>>>> I can make a simple xyplot and change the text of the factor
>>>>> levels with strip.custom:
>>>>>
>>>>> require(lattice)
>>>>> xyplot(value ~ year | fact1, data=df, type="b",
>>>>>        subset = fact2=="far",
>>>>>        strip = strip.custom(bg=gray.colors(1,0.95),
>>>>>                             factor.levels=c("yellow", "black", "red")),
>>>>>        layout=c(1,3))
>>>>>
>>>>> But how can I change the text of the factor levels when using
>>>>> both factors as in this plot:
>>>>>
>>>>> xyplot(value ~ year | fact1*fact2, data=df, type="b")
>>>>>
>>>>> (fact2 levels text should change to:
>>>>> c("faro","porto","lisbon","setubal"))
>>>>>
>>>>> I read the help for strip.default and the emails archive, tried
>>>>> with which.given but could not find out how to accomplish this.
>>>>>
>>>>> Many thanks,
>>>>> Rafael Duarte

--
Rafael Duarte
Marine Resources Department - DRM
IPIMAR - National Research Institute for Agriculture and Fisheries
Av. Brasília, 1449-006 Lisbon - Portugal
Tel:+351 21 302 7000 Fax:+351 21 301 5948
e-mail: [EMAIL PROTECTED]
Re: [R] nlme with a factor in R 2.4.0beta
Christian Ritz [EMAIL PROTECTED] writes:

> Hi,
>
> the following R lines work fine in R 2.4.0 alpha (and older R
> versions), but not in R 2.4.0 beta (details below):
>
> library(drc)   # to load the dataset 'PestSci'
> library(nlme)
>
> ## Starting values
> sv <- c(0.328919, 1.956121, 0.097547, 1.642436, 0.208924)
>
> ## No error
> m1 <- nlme(SLOPE ~ c + (d-c)/(1+exp(b*(log(DOSE)-log(e)))),
>            fixed = list(b+c+d+e~1),
>            random = d~1|CURVE,
>            start = sv[c(2,3,4,5)], data = PestSci)
>
> ## Error: attempt to select more than one element
> m2 <- nlme(SLOPE ~ c + (d-c)/(1+exp(b*(log(DOSE)-log(e)))),
>            fixed = list(b~HERBICIDE, c+d+e~1),
>            random = d~1|CURVE,
>            start = sv, data = PestSci)
...
> other attached packages:
>    nlme     drc
>  3.1-75   1.0-1

>    nlme     drc
>  3.1-76   1.0-1

I presume this is the real issue: the upgrade of nlme, rather than the change of R itself from alpha to beta status.

The culprit would seem to be

    pars[, nm] <- f %*% beta[[fmap[[nm]]]]

inside nlme:::getParsNlme(). fmap[[nm]] is not necessarily a scalar, so the outer set of [[]] should likely be [].

The maintainer of nlme will know for sure.

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907
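The difference between `[[` and `[` that the diagnosis above turns on can be reproduced outside nlme. In this sketch `beta`, `fmap` and `f` are invented stand-ins with made-up values; they only mimic the shape of the objects named in the message and are not taken from the nlme source.

```r
# Hypothetical stand-ins: a named coefficient vector and a map from
# parameter name to the positions of its coefficients in 'beta'.
beta <- c(b1 = 1.9, b2 = 0.3, c = 0.1, d = 1.6, e = 0.2)
fmap <- list(b = 1:2, c = 3, d = 4, e = 5)
nm <- "b"   # 'b' has two coefficients, as with b ~ HERBICIDE

# '[[' on an atomic vector must select exactly one element, so a
# length-2 index is an error.
double_bracket_fails <-
  inherits(try(beta[[fmap[[nm]]]], silent = TRUE), "try-error")

# '[' accepts an index of any length and returns the sub-vector.
sub_beta <- beta[fmap[[nm]]]

# A made-up 3 x 2 design matrix then multiplies the sub-vector cleanly.
f <- cbind(1, c(0, 1, 0))
par_values <- f %*% sub_beta
```

With a parameter that maps to a single coefficient (such as `c`, `d` or `e` here) both forms work, which would be consistent with the first model in the report fitting without error.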
Re: [R] behavior of [<-.foo
I've not seen an actual answer to this, which is that this is a misunderstanding as to how NextMethod works.

First,

    x <- unclass(x)

looks wrong. NextMethod uses the next method at the call to the generic, and subsequent changes to the object 'x' do not alter the class that would be dispatched on. Given that the next method might not be the default method, unclassing here seems potentially damaging.

Second, the matched call is

    Called from: `[<-.foo`(`*tmp*`, , value = 100)

and it is that call which is going to be passed on to the next method. As the help page says:

   'NextMethod' works by creating a special call frame for the next
   method. If no new arguments are supplied, the arguments will be
   the same in number, order and name as those to the current method
   but their values will be promises to evaluate their name in the
   current method and environment.

Since 'j' was not an argument of the original call, it is ignored. Now, [<- is an internal generic without a visible default method, but it is as if you have called

    `[<-.default`(`*tmp*`, 1:5, value = 100)

which explains the result you got.

(That NextMethod invokes the generic as a possible default method is not documented anywhere that I can see, and the description in 2.3.1 is all about methods invoked via UseMethod. There are issues with the above description when the method invoked is a primitive such as [<-, as that uses positional matching. I would not be confident that this works as intended for multi-argument primitives.)

    x[,] <- 100

is perhaps what you intended for a matrix-like class, and that does not work (it seems because the argument matching does indeed not work as intended).

You need something like

`[<-.foo` <- function(x, i, j, value) {
    if(missing(i)) i <- 1:nrow(x)
    if(missing(j)) j <- 1:ncol(x)
    cl <- class(x)
    cll <- length(cl)
    m <- match("foo", cl, cll)
    oldClass(x) <- if(m == cll) NULL else cl[(m+1):cll]
    x[i,j] <- value
    class(x) <- cl
    x
}

On Fri, 22 Sep 2006, Armstrong, Whit wrote:

> Can someone help me understand the following behavior of "[<-" ?
>
> If I define a simple class based on a matrix, the [<- operation only
> inserts into the first column:
>
> x <- matrix(rnorm(10),nrow=5,ncol=2)
> class(x) <- "foo"
> "[<-.foo" <- function(x, i, j, value) {
>     if(missing(i)) i <- 1:nrow(x)
>     if(missing(j)) j <- 1:ncol(x)
>
>     x <- unclass(x)
>     x <- NextMethod(.Generic)
>     class(x) <- "foo"
>     x
> }
>
> x[] <- 100.0
> x
>      [,1]       [,2]
> [1,]  100 -0.1465296
> [2,]  100 -0.2615796
> [3,]  100 -0.8882629
> [4,]  100 -0.2886357
> [5,]  100 -0.9565273
> attr(,"class")
> [1] "foo"
>
> Based on the behavior of [<- for a matrix, I would have thought that
> the data for the whole object would be replaced. For instance:
>
> y <- matrix(rnorm(10),nrow=5,ncol=2)
> y
>             [,1]       [,2]
> [1,] -0.55297049 -1.1896488
> [2,]  0.06157438 -0.6628254
> [3,] -0.28184208 -2.5260177
> [4,]  0.61204398 -0.3492488
> [5,]  0.43971216  1.8990789
> y[] <- 100
> y
>      [,1] [,2]
> [1,]  100  100
> [2,]  100  100
> [3,]  100  100
> [4,]  100  100
> [5,]  100  100

[...]

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
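A quick way to check the suggested replacement method is to run it on a small object. The sketch below repeats the method as given in the reply (with the assignment arrows and quotes that the archive lost written out) and then exercises it; the test matrix and the specific assignments are illustrative choices, not part of the original exchange.

```r
# Replacement method as suggested in the reply: temporarily strip 'foo'
# (and anything before it) from the class vector, assign with the
# remaining dispatch, then restore the full class.
`[<-.foo` <- function(x, i, j, value) {
    if (missing(i)) i <- 1:nrow(x)
    if (missing(j)) j <- 1:ncol(x)
    cl <- class(x)
    cll <- length(cl)
    m <- match("foo", cl, cll)
    oldClass(x) <- if (m == cll) NULL else cl[(m + 1):cll]
    x[i, j] <- value
    class(x) <- cl
    x
}

x <- matrix(rnorm(10), nrow = 5, ncol = 2)
class(x) <- "foo"

x[] <- 100      # both indices missing: every cell is replaced
all_replaced <- all(unclass(x) == 100)

x[2, ] <- 0     # column index missing: the whole of row 2 is replaced
row_two <- unclass(x)[2, ]
class_kept <- identical(class(x), "foo")
```

Because the method never calls NextMethod, it sidesteps the argument-matching problem described above rather than fixing it.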
[R] Beginner question: select cases
Hello all,

I hope I chose the right list, as my question is a beginner question.

I have a data set with 3 columns "London", "Rome" and "Vienna". The location is indicated by a 1, like this:

London  Rome  Vienna  q1
     0     0       1   4
     0     1       0   2
     1     0       0   3

I just want to calculate the means of a variable q1. I tried the following script:

# calculate the mean of all locations
results <- subset(results, subset== 1 )
mean(results$q1)

# calculate the mean of London
results <- subset(results, subset== 1 , select=c(London))
mean(results$q1)

# calculate the mean of Rome
results <- subset(results, subset== 1 , select=c(Rome))
mean(results$q1)

# calculate the mean of Vienna
results <- subset(results, subset== 1 , select=c(Vienna))
mean(results$q1)

As all results are 1.68 and there is definitely a difference between the three locations, I wonder what's going on. I get confused as the Rcmdr asks me to overwrite things and there is no "just filter" option.

Any help would be appreciated. Thank you in advance.

Regards
Peter

___CURE - Center for Usability Research Engineering___
Peter Wolkerstorfer
Usability Engineer
Hauffgasse 3-5, 1110 Wien, Austria
[Tel] +43.1.743 54 51.46
[Fax] +43.1.743 54 51.30
[Mail] [EMAIL PROTECTED]
[Web] http://www.cure.at
Re: [R] Beginner question: select cases
Your problem would be a lot easier if you coded the location in one variable instead of three variables. Then you could calculate the means with one line of code:

by(results$q1, results$location, mean)

With your dataset you could use

by(results$q1, results$London, mean)
by(results$q1, results$Rome, mean)
by(results$q1, results$Vienna, mean)

See ?by for more information.

And take a good look at your code. You take a subset from results and then assign it to results. This means that you replace the original results data frame with a subset of it. When you take the subset for the next city, you won't take a subset from the original dataset but from the previous subset!

Cheers,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Wolkerstorfer - CURE
Sent: Monday, 25 September 2006 13:51
To: r-help@stat.math.ethz.ch
Subject: [R] Beginner question: select cases

[...]
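The recoding advice in this reply (one location variable instead of three indicator columns) can be sketched as follows. The three-row data frame copies the excerpt shown in the question; the column name `location` follows the reply, and using `max.col()` to find the column holding the 1 is an assumption of this sketch, not something proposed in the thread.

```r
# The three rows shown in the question.
results <- data.frame(London = c(0, 0, 1),
                      Rome   = c(0, 1, 0),
                      Vienna = c(1, 0, 0),
                      q1     = c(4, 2, 3))

cities <- c("London", "Rome", "Vienna")

# For each row, find which indicator column holds the 1 and turn that
# into a single factor. This assumes exactly one 1 per row.
hit <- max.col(as.matrix(results[cities]), ties.method = "first")
results$location <- factor(cities[hit], levels = cities)

# Group means of q1 in one call; 'results' itself is never overwritten
# with a subset, so later computations still see all rows.
city_means <- tapply(results$q1, results$location, mean)
city_means
```

With only one row per city in this excerpt, each "mean" is simply that row's q1 value; on the full data the same call returns the per-city averages.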
Re: [R] Beginner question: select cases
Peter,

There is a much easier way to do this. First, you should consider organizing your data as follows:

set.seed(1)  # for replication only

# Here is a sample dataframe
tmp <- data.frame(city = gl(3, 10, labels = c("London", "Rome", "Vienna")),
                  q1 = rnorm(30))

# Compute the means
with(tmp, tapply(q1, city, mean))

    London       Rome     Vienna
 0.1322028  0.2488450 -0.1336732

I hope this helps

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Wolkerstorfer - CURE
Sent: Monday, September 25, 2006 7:51 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Beginner question: select cases

[...]
[R] Selfstarting models for soil hydrology using R
Dear all,

I have developed and tested some models in soil hydrology with the nlme library in R. I would like to ask whether it is possible to submit these to the nlme library (with sample data) as a toolbox or similar, so that anyone downloading new components or new versions of R can simply call them (say, an SSbrookscorey function to predict water retention, in the same way that one can call SSlogis to fit a logistic function in the current version).

I would be grateful for your support. I can also give an in-depth description of the capabilities for white papers concerning the applications of R in soil hydrology.

Please advise me.

Dr. Christian Thine,
University of Nairobi, Kenya.
[R] Beginner Loop Question with dynamic variable names
Dear all,

I have another small scripting-beginner problem which you hopefully can help with:

I compute new variables with:

# Question 1
results$q1 <- with(results, q1_1*1 + q1_2*2 + q1_3*3 + q1_4*4 + q1_5*5)
# Question 2
results$q2 <- with(results, q2_1*1 + q2_2*2 + q2_3*3 + q2_4*4 + q2_5*5)
# Question 3
results$q3 <- with(results, q3_1*1 + q3_2*2 + q3_3*3 + q3_4*4 + q3_5*5)
# Question 4
results$q4 <- with(results, q4_1*1 + q4_2*2 + q4_3*3 + q4_4*4 + q4_5*5)

This is very inefficient so I would like to do this in a loop like:

for (i in 1:20) {results$q1 <- with(results, q1_1*1 + q1_2*2 + q1_3*3 + q1_4*4 + q1_5*5)}

My question now: how do I replace the "1"s (results$q1, q1_1...) in the variable names with the looping variable?

Here is how I would like it (just for illustration - of course I still miss the function to tell R that it should append the value of i to the variable name):

# i is the number of questions - just an illustration, I know it does not work this way
for (i in 1:20) {results$qi <- with(results, qi_1*1 + qi_2*2 + qi_3*3 + qi_4*4 + qi_5*5)}

Help would be greatly appreciated. Thanks in advance.

Peter

___CURE - Center for Usability Research Engineering___
Peter Wolkerstorfer
Usability Engineer
Hauffgasse 3-5, 1110 Wien, Austria
[Tel] +43.1.743 54 51.46
[Fax] +43.1.743 54 51.30
[Mail] [EMAIL PROTECTED]
[Web] http://www.cure.at
[R] Multiple imputation using mice with mean
Hi

I am trying to impute missing values for my data.frame. As I intend to use the complete data for prediction, I am currently measuring the success of an imputation method by its resulting classification error in my training data.

I have tried several approaches to replace missing values:
- mean/median substitution
- substitution by a value selected from the observed values of a variable
- MLE in the mix package
- all available methods for numerical data in the MICE package (i.e. pmm, sample, mean and norm)

I found that the lowest classification error results from using mice with the "mean" option for numerical data. However, I am not sure how the "mean" multiple imputation differs from the simple mean substitution. I tried to read some of the documentation supporting the R package, but couldn't find much theory about the "mean" imputation method.

Are there any good papers that explain the background behind each imputation option in MICE?

I would really appreciate any comments on the above, as my understanding of statistics is very limited.

Many thanks
Eleni Rapsomaniki
Birkbeck College, UK
Re: [R] Beginner question: select cases
--- Peter Wolkerstorfer - CURE [EMAIL PROTECTED] wrote:

> I just want to calculate the means of a variable q1. I tried
> following script:
>
> # calculate the mean of all locations
> results <- subset(results, subset== 1 )
> mean(results$q1)
>
> # calculate the mean of London
> results <- subset(results, subset== 1 , select=c(London))
> mean(results$q1)

[...]

I'm new at R also. However, I don't recognize your syntax. I have not seen select used here. Try

results <- subset(results, London==1 )
Re: [R] Beginner Loop Question with dynamic variable names
I think this does what you are looking for:

dta <- data.frame(q1_1=rep(1,5), q1_2=rep(2,5), q2_1=rep(3,5), q2_2=rep(4,5))

for (i in 1:2) {
    e1 <- paste("q", i, "_1 + q", i, "_2 * 2", sep="")
    assign(paste("q", i, sep=""), with(dta, eval(parse(text=e1))))
}

On 25/09/06, Peter Wolkerstorfer - CURE [EMAIL PROTECTED] wrote:
> Dear all,
>
> I have another small scripting-beginner problem which you hopefully
> can help:

[...]

--
=================================
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
[R] glmmPQL in 2.3.1
Dear R-help,

I recently tried using glmmPQL in 2.3.1, and I discovered a few differences compared to 2.2.1. I am fitting a regression with fixed and random effects and a Gamma error structure.

First, 2.3.1 gives different estimates than 2.2.1, and 2.3.1 takes more iterations to converge.

Second, when I try using the anova function it says "'anova' is not available for PQL fits". Why?

Any help would be greatly appreciated.

Best wishes,
Justin

--
Justin S. Rhodes
Assistant Professor
Department of Psychology
Beckman Institute
Neuroscience Program, Institute for Genomic Biology
University of Illinois
405 N. Mathews Avenue
Urbana, IL 61801
Ph: 217-265-0021 Fax: 217-244-5180
e-mail: [EMAIL PROTECTED]
http://www.psych.uiuc.edu/people/showprofile.php?id=545
Re: [R] Beginner Loop Question with dynamic variable names
----- Original Message -----
From: David Barron [EMAIL PROTECTED]
To: Peter Wolkerstorfer - CURE [EMAIL PROTECTED]; r-help r-help@stat.math.ethz.ch
Sent: Monday, September 25, 2006 3:33 PM
Subject: Re: [R] Beginner Loop Question with dynamic variable names

> I think this does what you are looking for:
>
> dta <- data.frame(q1_1=rep(1,5), q1_2=rep(2,5), q2_1=rep(3,5), q2_2=rep(4,5))
>
> for (i in 1:2) {
>     e1 <- paste("q", i, "_1 + q", i, "_2 * 2", sep="")
>     assign(paste("q", i, sep=""), with(dta, eval(parse(text=e1))))
> }

or something like the following if you want to avoid eval(parse(text = ...)):

dta <- data.frame(q1_1 = rep(1,5), q1_2 = rep(2,5), q1_3 = rep(1,5), q1_4 = rep(2,5),
                  q2_1 = rep(3,5), q2_2 = rep(4,5), q2_3 = rep(3,5), q2_4 = rep(4,5),
                  q3_1 = rep(3,5), q3_2 = rep(4,5), q3_3 = rep(3,5), q3_4 = rep(4,5))

for (i in 1:3) {
    nam <- paste("q", i, sep = "")
    e1 <- data.matrix(dta[grep(nam, names(dta), fixed = TRUE)])
    dta <- cbind(dta, rowSums(e1 * rep(1:ncol(e1), each = nrow(e1))))
    names(dta)[length(dta)] <- nam
}
dta

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

> On 25/09/06, Peter Wolkerstorfer - CURE [EMAIL PROTECTED] wrote:
>> Dear all,
>>
>> I have another small scripting-beginner problem which you hopefully
>> can help:

[...]

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
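Another way to build the variable names in a loop, without eval(parse()) or grep(), is to construct the exact column names with paste0() and index the data frame with `[[`. This is a sketch on invented data: `answers`, `n_q` and `n_opt` are hypothetical names, and the 0/1 layout of the `q<i>_<k>` columns is assumed from the weighted sums in the original question.

```r
n_q   <- 2   # number of questions (20 in the original post)
n_opt <- 5   # answer options per question

# Hypothetical raw answers: the option chosen by each of three respondents.
answers <- list(c(1, 3, 5), c(2, 2, 4))

# Build 0/1 indicator columns q1_1 ... q2_5, mimicking the posted layout.
results <- data.frame(id = 1:3)
for (i in seq_len(n_q)) {
    for (k in seq_len(n_opt)) {
        results[[paste0("q", i, "_", k)]] <- as.numeric(answers[[i]] == k)
    }
}

# The loop asked for: q<i> = q<i>_1*1 + q<i>_2*2 + ... + q<i>_5*5
for (i in seq_len(n_q)) {
    cols <- paste0("q", i, "_", seq_len(n_opt))   # exact column names
    weights <- seq_len(n_opt)
    results[[paste0("q", i)]] <-
        as.vector(as.matrix(results[cols]) %*% weights)
}
results[c("q1", "q2")]
```

Selecting columns by their exact names also avoids a pitfall of pattern matching with 20 questions, where a fixed pattern such as "q1" would also match "q10_1" through "q19_5".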
Re: [R] plotting grouped data object
Hello Xiaohui,

data.grp is just a pseudo example of a grouped data object that is grouped according to the factor y. I just tried your suggestion, but the result is that three separate panels are still created, whereas I would like to have all 3 lines in a single panel.

cheers, dave

-----Original Message-----
From: X.H Chen [mailto:[EMAIL PROTECTED]
Sent: Saturday, September 23, 2006 6:25 PM
To: Afshartous, David; r-help@stat.math.ethz.ch
Subject: RE: [R] plotting grouped data object

I don't get your meaning in what is in data.grp; anyway, try:

    X11()
    par(new=T)

before calling:

    plot(data.grp, outer = ~ y)

Xiaohui Chen
Dept. of Statistics, UBC, Canada

From: Afshartous, David [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R] plotting grouped data object
Date: Sat, 23 Sep 2006 18:09:38 -0400

All,

I'd like to plot the main relationship of a grouped data object for all levels of a factor in a single panel. The sample code below creates a separate panel for each level of the factor. I realize that this could be done in other ways, but I'd like to do it via plotting the grouped data object.

thanks! dave

    z = rnorm(18, mean=0, sd=1)
    x = rep(1:6, 3)
    y = factor(rep(c("I", "C", "P"), each = 6))
    dat = data.frame(x, y, z)
    data.grp = groupedData(z ~ x | y, data = dat)
    plot(data.grp, outer = ~ y)
    ### this produces 1 line each in 3 panels
    ### how to collapse all 3 lines into 1 panel?
Re: [R] Multiple imputation using mice with mean
On 25-Sep-06 Eleni Rapsomaniki wrote:

Hi

I am trying to impute missing values for my data.frame. As I intend to use the complete data for prediction, I am currently measuring the success of an imputation method by its resulting classification error in my training data. I have tried several approaches to replace missing values:

- mean/median substitution
- substitution by a value selected from the observed values of a variable
- MLE in the mix package
- all available methods for numerical data in the MICE package (i.e. pmm, sample, mean and norm)

I found that the least classification error results using mice with the mean option for numerical data. However, I am not sure how the mean multiple imputation differs from the simple mean substitution. I tried to read some of the documentation supporting the R package, but couldn't find much theory about the mean imputation method. Are there any good papers to explain the background behind each imputation option in MICE?

I agree that the MICE documentation tends to be silent about some important questions, both in the R/S help pages, and also in the MICE user's manual which can be found at

http://web.inter.nl.net/users/S.van.Buuren/mi/docs/Manual.pdf

Possibly it could be worth looking at some of the other relevant reports listed at

http://web.inter.nl.net/users/S.van.Buuren/mi/hmtl/mice.htm

but they do not look very hopeful.

That being said, my understanding relating to your query is (glossing over the technicalities of the Gibbs sampling methods used in (b)):

a) mean/median substitution relates to the very basic method of substituting, for a missing value, the arithmetic mean of the non-missing values for that variable, possibly with selection of cases with non-missing values so as to approximately match the observed covariates of the case being imputed.

b) mean imputation in MICE (as far as I can infer it) means that the distribution of the missing value (conditional on its observed covariates) is inferred from the cases with non-missing values, and the mean of this conditional distribution is substituted for the missing value.

These two approaches will in general give different results.

Some further comments.

1. I would suggest that you consider the full multiple imputation approach. Filling in missing values just once, and then using the completed results (for prediction, in your case) in some procedure which treats them as though they were observed values, will not take into account the uncertainty as to what values they should have (as opposed to the values they were imputed to have). When multiple imputation is used, the variation from imputation to imputation in the imputed values will represent this uncertainty, and so a more realistic picture of the overall uncertainty of prediction can be obtained.

2. You stated that one method tried was MLE in the mix package. MLE (maximum likelihood estimation) using the EM algorithm is implemented in the mix functions em.mix and ecm.mix, but neither of these produces values to substitute for missing data. The result is essentially just parameter estimation by MLE based on the incomplete data. Values to substitute for missing data are produced by other functions, such as imp.mix; but these are randomly sampled from the conditional distributions of the missing values and therefore, each time it is done, the results are different. In particular, the first value you sample will be random. Hence the values you impute will be more or less good, in terms of your training set, depending on the luck of the draw when you use (say) imp.mix.

I don't know if I have understood what you meant by MLE in the mix package, but if the above is a correct understanding then the remarks under (1) apply: in particular, as just noted, comparing a single imputation with your training set is an uncertain comparison.

Hoping this helps,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 25-Sep-06 Time: 15:33:59
-- XFMail --
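To make the distinction between (a) and (b) above concrete, here is a minimal base-R sketch on simulated data. It is only an illustration of "marginal mean" versus "conditional mean given a covariate" under Ted's reading; it does not call the mice package and makes no claim about how mice implements its mean method. All variable names and the simulated model are assumptions.

```r
set.seed(1)
n <- 200
x <- rnorm(n)                    # fully observed covariate
y <- 2 + 3 * x + rnorm(n)        # variable that will have missing values
miss <- sample(n, 40)            # indices treated as missing
y_obs <- y
y_obs[miss] <- NA

# (a) simple mean substitution: every gap gets the same marginal mean
y_a <- y_obs
y_a[miss] <- mean(y_obs, na.rm = TRUE)

# (b) conditional-mean imputation: each gap gets the mean of y given its own x,
#     estimated by a regression on the complete cases (lm drops NA rows by default)
fit <- lm(y_obs ~ x)
y_b <- y_obs
y_b[miss] <- predict(fit, newdata = data.frame(x = x[miss]))

# Compare imputed values with the true (normally unknown) values
rmse <- function(est) sqrt(mean((est[miss] - y[miss])^2))
c(mean_substitution = rmse(y_a), conditional_mean = rmse(y_b))
```

Both variants are single imputations, so the caveat in comment 1 applies to each: neither reflects the uncertainty about the imputed values.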
[R] RE : Beginner question: select cases
Hi,

The subset() function is used to select rows of a data frame. For the overall mean, just compute mean(results$q1) without the subset instruction, or mean(results[,4]).

Peter Wolkerstorfer - CURE [EMAIL PROTECTED] wrote:

Hello all,

I hope I chose the right list, as my question is a beginner question. I have a data set with 3 columns, London, Rome and Vienna; the location is indicated by a 1, like this:

    London Rome Vienna q1
    0      0    1      4
    0      1    0      2
    1      0    0      3

I just want to calculate the means of a variable q1. I tried the following script:

    # calculate the mean of all locations
    results <- subset(results, subset== 1 )
    mean(results$q1)
    # calculate the mean of London
    results <- subset(results, subset== 1 , select=c(London))
    mean(results$q1)
    # calculate the mean of Rome
    results <- subset(results, subset== 1 , select=c(Rome))
    mean(results$q1)
    # calculate the mean of Vienna
    results <- subset(results, subset== 1 , select=c(Vienna))
    mean(results$q1)

As all results are 1.68, and there is definitely a difference between the three locations, I wonder what is going on. I get confused as the Rcmdr asks me to overwrite things and there is no "just filter" option.

Any help would be appreciated. Thank you in advance.

Regards
Peter

___CURE - Center for Usability Research Engineering___
Peter Wolkerstorfer, Usability Engineer
Hauffgasse 3-5, 1110 Wien, Austria
[Tel] +43.1.743 54 51.46 [Fax] +43.1.743 54 51.30
[Mail] [EMAIL PROTECTED] [Web] http://www.cure.at
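A short sketch of the per-location means, assuming the data frame looks like the three rows shown in the question (the real data set is larger). The row filter goes on the location column, and results itself is never overwritten, so each mean starts from the full data.

```r
# The three example rows from the question
results <- data.frame(
  London = c(0, 0, 1),
  Rome   = c(0, 1, 0),
  Vienna = c(1, 0, 0),
  q1     = c(4, 2, 3)
)

# Mean over all locations
mean_all <- mean(results$q1)

# Mean for one location: keep only rows where that indicator is 1
mean_london <- mean(results$q1[results$London == 1])
mean_rome   <- mean(subset(results, Rome == 1)$q1)   # same idea with subset()

# All three at once
cities <- c("London", "Rome", "Vienna")
by_city <- sapply(cities, function(city) mean(results$q1[results[[city]] == 1]))
by_city
```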
[R] F values for glm with binomial distribution
Hi Rneters,

I'm running a GLM model with a full factorial design in blocks and binomial error distribution. I would like to have the F values for this model, but I got a message that "using F test with a binomial family is inappropriate in: anova.glm(model, test = "F")". Should I not report F statistics on this kind of analysis? I would appreciate any comment on this. This is my output:

    model=glm(y~agua*fert+bloco, family=binomial)
    anova.glm(model,test="F")

    Analysis of Deviance Table

    Model: binomial, link: logit

    Response: y

    Terms added sequentially (first to last)

              Df Deviance Resid. Df Resid. Dev      F  Pr(>F)
    NULL                         43     85.018
    agua       1    0.908        42     84.110 0.9081 0.34063
    fert       1    5.673        41     78.437 5.6727 0.01723 *
    bloco      1    0.044        40     78.393 0.0444 0.83320
    agua:fert  1    0.616        39     77.777 0.6160 0.43253
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Warning message:
    using F test with a binomial family is inappropriate in: anova.glm(model, test = "F")

Best regards,
André

--
André Tavares Corrêa Dias
Laboratório de Ecologia Vegetal
Universidade Federal do Rio de Janeiro
CCS-IB-Departamento de Ecologia
Caixa Postal 68020, 21941-970 Rio de Janeiro RJ, Brazil
tel: +55 21 25626377 Fax: +55 21 25626320
[EMAIL PROTECTED]
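For reference, a minimal sketch of the usual alternative: with family = binomial the dispersion is fixed at 1, so the analysis-of-deviance table is normally reported with chi-squared tests on the deviance differences (test = "Chisq") rather than F tests. The data below are simulated only so that the example runs; the variable names mirror the post, but the values, factor levels and effect sizes are invented.

```r
set.seed(42)
n <- 44
# Simulated stand-ins for the design in the post (levels and effects are made up)
agua  <- factor(rep(c("low", "high"), each = n / 2))
fert  <- factor(rep(c("no", "yes"), times = n / 2))
bloco <- factor(rep(c(1, 1, 2, 2), length.out = n))
y     <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * (fert == "yes")))

model <- glm(y ~ agua * fert + bloco, family = binomial)

# Sequential analysis of deviance with chi-squared tests
tab <- anova(model, test = "Chisq")
tab
```

An F test is the documented choice for families whose dispersion is estimated (for example quasibinomial); whether that applies here depends on the data, which the post does not settle.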
[R] Best use of LaTeX listings package for pretty printing R code
This is what I have been using. Does anyone have a better way? In particular I would like to see letters in comment strings not stretched so much.

Thanks -Frank

    \documentclass{article}
    \usepackage{listings,relsize}
    \lstloadlanguages{R}
    \newcommand{\lil}[1]{\lstinline|#1|}
    \begin{document}
    \lstset{language=R,basicstyle=\smaller,commentstyle=\rmfamily\smaller,
      showstringspaces=false,%
      xleftmargin=4ex,literate={<-}{{$\leftarrow$}}1 {~}{{$\sim$}}1}
    \lstset{escapeinside={(*}{*)}} % for (*\ref{ }*) inside lstlistings (S code)
    \begin{lstlisting}
    a <- b # this is a test line
    if(i==3) { # another line, for y^2
      y <- 3^3
      z <- 'this string'
      qqcat <- y ~ pol(x,2)
    } else y <- 4
    \end{lstlisting}
    That was \lstinline|x <- 22| \lil{q <- 'cat'}.
    \end{document}

--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University
[R] Splitting a character variable into a numeric one and a character one?
Hi All,

I have a data set with a variable like this:

    Column 1
    123abc
    12cd34
    1e23
    ...

Now I want to do an operation that can split it into two variables:

    Column 1   Column 2   Column 3
    123abc     123        abc
    12cd34     12         cd34
    1e23       1          e23
    ...

So basically, I want to split the original variable into a numeric one and a character one, while the splitting element is the first character in Column 1. I searched the forum with the key words "strsplit" and "substr", but still can't solve this problem. Can anyone give me some hints?

Thanks in advance,
FD
Re: [R] Splitting a character variable into a numeric one and a character one?
On Mon, 2006-09-25 at 11:30 -0500, Marc Schwartz (via MN) wrote:

On Mon, 2006-09-25 at 11:04 -0500, Frank Duan wrote:

Hi All,

I have a data set with a variable like this:

    Column 1
    123abc
    12cd34
    1e23
    ...

Now I want to do an operation that can split it into two variables:

    Column 1   Column 2   Column 3
    123abc     123        abc
    12cd34     12         cd34
    1e23       1          e23
    ...

So basically, I want to split the original variable into a numeric one and a character one, while the splitting element is the first character in Column 1. I searched the forum with the key words "strsplit" and "substr", but still can't solve this problem. Can anyone give me some hints?

Thanks in advance,
FD

Something like this using gsub() should work I think:

    > DF
          V1
    1 123abc
    2 12cd34
    3   1e23

    # Replace letters and any following chars with ""
    DF$V2 <- gsub("[A-Za-Z]+.*", "", DF$V1)

Quick typo correction here. It should be:

    DF$V2 <- gsub("[A-Za-z]+.*", "", DF$V1)

The second 'z' should be lower case.

Marc
Re: [R] Splitting a character variable into a numeric one and a character one?
Here is one more solution:

    library(gsubfn)
    s <- c("123abc", "12cd34", "1e23")
    out <- gsubfn("^([[:digit:]]+)(.*)", paste, s, backref = -2)
    read.table(textConnection(out))

It assumes there are no spaces in the strings. If there are, then choose a sep= that does not appear and do this:

    sep = ","
    f <- function(x, y) paste(x, y, sep = sep)
    out <- gsubfn("^([[:digit:]]+)(.*)", f, s, backref = -2)
    read.table(textConnection(out), sep = sep)

On 9/25/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:

strapply in package gsubfn can do that:

    library(gsubfn)
    s <- c("123abc", "12cd34", "1e23")
    out <- strapply(s, "^([[:digit:]]+)(.*)", c)
    out <- do.call(rbind, out) # as a matrix
    data.frame(x = out[,1], num = as.numeric(out[,2]), char = out[,3]) # as a data.frame

On 9/25/06, Frank Duan [EMAIL PROTECTED] wrote:

Hi All,

I have a data set with a variable like this:

    Column 1
    123abc
    12cd34
    1e23
    ...

Now I want to do an operation that can split it into two variables:

    Column 1   Column 2   Column 3
    123abc     123        abc
    12cd34     12         cd34
    1e23       1          e23
    ...

So basically, I want to split the original variable into a numeric one and a character one, while the splitting element is the first character in Column 1. I searched the forum with the key words "strsplit" and "substr", but still can't solve this problem. Can anyone give me some hints?

Thanks in advance,
FD
[R] Sampling distribution of correlation estimations derived from robust MCD and MVE methods
Dear R users,

I am trying to use MCD and MVE methods in the analysis of functional imaging (fMRI) data. But, before doing that, I want to understand the sampling distribution of the correlation parameter given by MCD and MVE (cov.mcd$cor, cov.mve$cor). To this end, I conducted a simulation where in each of 10 epochs, I

a. construct a matrix from two vectors, each containing 40 numbers randomly sampled from a normal distribution.
b. apply cov.mve and cov.mcd to the resulting matrix.
c. obtain the correlations in the subsets selected by cov.mve: e.g., if the matrix is called cormat20.ans, I request:

    current.mve20 <- round(cov.mve(cormat20.ans, cor=T)$cor[[2]] ,3)

At the end of the day, I have the sampling distribution for these correlations [i.e., what correlations exist in the subsets that MVE and MCD tend to pick up when sampling from a normal distribution].

Here is my question: Because MVE and MCD select the most central 20 points (of the 40), I wanted to compare the resulting sampling distributions to that of a Pearson's r correlation coefficient (i.e., a Pearson's r with N=20; the goal was to establish whether the significance thresholds are similar). However, the three sampling distributions are quite different. That is, the sampling distribution of Pearson's r (N=20) is very different from that of cov.mve and cov.mcd (with N=20 [20 being the subset selected of the 40 points]). The sampling distribution of Pearson's r with N=40 is also very different from that of MVE and MCD.

If anyone knows of, or could point me to, sources of information that discuss the sampling distribution of cov.mve$cor and cov.mcd$cor and their relation to Pearson's r, I would be very grateful.

I have put the simulation code I used here:
http://home.uchicago.edu/~uhasson/pearson-mcd-mve.R.txt

And an image of the resulting sampling distributions here:
http://home.uchicago.edu/~uhasson/correl.comparison.tiff

Sincerely,
Uri Hasson
The Brain Research Imaging Center
The University of Chicago
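The kind of simulation described above can be sketched as follows. This is not the poster's script (which is only linked, not shown); it is a minimal version assuming the MASS package, 200 replications and uncorrelated normal pairs, all of which are choices made for this illustration. It only produces the empirical distributions for comparison. One thing worth keeping in mind when reading the output is that the robust estimate comes from a data-dependent subset of points, so there is no obvious reason for it to behave like Pearson's r on 20 independently drawn points.

```r
library(MASS)   # provides cov.mcd() and cov.mve()

set.seed(2006)
n_rep <- 200    # number of simulated data sets (arbitrary choice)
n     <- 40     # points per data set, as in the post

r_pearson40 <- numeric(n_rep)
r_pearson20 <- numeric(n_rep)
r_mcd       <- numeric(n_rep)

for (k in seq_len(n_rep)) {
  xy <- cbind(rnorm(n), rnorm(n))                   # two independent normal vectors
  r_pearson40[k] <- cor(xy)[1, 2]                   # Pearson's r on all 40 points
  r_pearson20[k] <- cor(xy[1:20, ])[1, 2]           # Pearson's r on 20 unselected points
  r_mcd[k]       <- cov.mcd(xy, cor = TRUE)$cor[1, 2]  # robust correlation from MCD
}

# Spread of each empirical sampling distribution
sapply(list(pearson40 = r_pearson40, pearson20 = r_pearson20, mcd = r_mcd), sd)
```

Replacing cov.mcd with cov.mve in the loop gives the corresponding MVE distribution.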
Re: [R] Splitting a character variable into a numeric one and a character one?
strapply in package gsubfn can do that:

    library(gsubfn)
    s <- c("123abc", "12cd34", "1e23")
    out <- strapply(s, "^([[:digit:]]+)(.*)", c)
    out <- do.call(rbind, out) # as a matrix
    data.frame(x = out[,1], num = as.numeric(out[,2]), char = out[,3]) # as a data.frame

On 9/25/06, Frank Duan [EMAIL PROTECTED] wrote:

Hi All,

I have a data set with a variable like this:

    Column 1
    123abc
    12cd34
    1e23
    ...

Now I want to do an operation that can split it into two variables:

    Column 1   Column 2   Column 3
    123abc     123        abc
    12cd34     12         cd34
    1e23       1          e23
    ...

So basically, I want to split the original variable into a numeric one and a character one, while the splitting element is the first character in Column 1. I searched the forum with the key words "strsplit" and "substr", but still can't solve this problem. Can anyone give me some hints?

Thanks in advance,
FD
Re: [R] Splitting a character variable into a numeric one and a character one?
And here is a third solution not using package gsubfn:

    s <- c("123abc", "12cd34", "1e23")
    out <- gsub("^(([[:digit:]]+)(.*))", "\\1 \\2 \\3", s)
    read.table(textConnection(out), as.is = TRUE)

Again, if spaces appear in the input string, choose a character not appearing, such as comma, and do it like this:

    s <- c("123abc", "12cd34", "1e23")
    out <- gsub("^(([[:digit:]]+)(.*))", "\\1,\\2,\\3", s)
    read.table(textConnection(out), sep = ",", as.is = TRUE)

On 9/25/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:

Here is one more solution:

    library(gsubfn)
    s <- c("123abc", "12cd34", "1e23")
    out <- gsubfn("^([[:digit:]]+)(.*)", paste, s, backref = -2)
    read.table(textConnection(out))

It assumes there are no spaces in the strings. If there are, then choose a sep= that does not appear and do this:

    sep = ","
    f <- function(x, y) paste(x, y, sep = sep)
    out <- gsubfn("^([[:digit:]]+)(.*)", f, s, backref = -2)
    read.table(textConnection(out), sep = sep)

On 9/25/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:

strapply in package gsubfn can do that:

    library(gsubfn)
    s <- c("123abc", "12cd34", "1e23")
    out <- strapply(s, "^([[:digit:]]+)(.*)", c)
    out <- do.call(rbind, out) # as a matrix
    data.frame(x = out[,1], num = as.numeric(out[,2]), char = out[,3]) # as a data.frame

On 9/25/06, Frank Duan [EMAIL PROTECTED] wrote:

Hi All,

I have a data set with a variable like this:

    Column 1
    123abc
    12cd34
    1e23
    ...

Now I want to do an operation that can split it into two variables:

    Column 1   Column 2   Column 3
    123abc     123        abc
    12cd34     12         cd34
    1e23       1          e23
    ...

So basically, I want to split the original variable into a numeric one and a character one, while the splitting element is the first character in Column 1. I searched the forum with the key words "strsplit" and "substr", but still can't solve this problem. Can anyone give me some hints?

Thanks in advance,
FD
Re: [R] Splitting a character variable into a numeric one and a character one?
On Mon, 2006-09-25 at 11:04 -0500, Frank Duan wrote:

Hi All,

I have a data set with a variable like this:

    Column 1
    123abc
    12cd34
    1e23
    ...

Now I want to do an operation that can split it into two variables:

    Column 1   Column 2   Column 3
    123abc     123        abc
    12cd34     12         cd34
    1e23       1          e23
    ...

So basically, I want to split the original variable into a numeric one and a character one, while the splitting element is the first character in Column 1. I searched the forum with the key words "strsplit" and "substr", but still can't solve this problem. Can anyone give me some hints?

Thanks in advance,
FD

Something like this using gsub() should work I think:

    > DF
          V1
    1 123abc
    2 12cd34
    3   1e23

    # Replace letters and any following chars with ""
    DF$V2 <- gsub("[A-Za-Z]+.*", "", DF$V1)

    # Replace any initial numbers with ""
    DF$V3 <- gsub("^[0-9]+", "", DF$V1)

    > DF
          V1  V2   V3
    1 123abc 123  abc
    2 12cd34  12 cd34
    3   1e23   1  e23

See ?gsub and ?regex for more information.

HTH,
Marc Schwartz
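A self-contained version of the approach above, using the character class [A-Za-z] from Marc's follow-up correction, might look like this. The data frame construction and the as.numeric() conversion are additions for the sake of a runnable example.

```r
DF <- data.frame(V1 = c("123abc", "12cd34", "1e23"), stringsAsFactors = FALSE)

# Drop everything from the first letter onwards, then convert to numeric
DF$V2 <- as.numeric(sub("[A-Za-z].*$", "", DF$V1))

# Drop the leading run of digits, keeping the rest as character
DF$V3 <- sub("^[0-9]+", "", DF$V1)

DF
```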
[R] paste? 'cmd /c c:\\pheno\\whap --file c:\\pheno\\smri --alt 1'
Dear R users,

This command works (calling a program called whap, with file specifiers etc.):

    system('cmd /c c:\\pheno\\whap --file c:\\pheno\\smri --alt 1 --perm 500', intern=TRUE)

Now I need to call it from a loop to replace the 1 by a different number; however, I get lost using the quotes. I tried numerous versions of:

    i <- 1
    system(paste(c("'cmd /c c:\\pheno\\whap --file c:\\pheno\\smri --alt", i, "--perm 500'", sep=" ")), intern=TRUE)

However, no luck! I would be grateful for any help.

Thanks,
Marco
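One way to build the command string, sketched here without actually running whap: pass the pieces directly to paste() (not wrapped in c(), and with no extra single quotes inside the strings), and let the default separator of one space join them. The loop range 1:3 is an arbitrary choice for illustration, and the system() call is left commented out because it only makes sense on the poster's Windows machine.

```r
cmds <- character(0)
for (i in 1:3) {
  # paste() joins its arguments with a single space by default
  cmd <- paste("cmd /c c:\\pheno\\whap --file c:\\pheno\\smri --alt", i, "--perm 500")
  cmds <- c(cmds, cmd)
  # out <- system(cmd, intern = TRUE)   # run the command; uncomment where whap exists
}

cat(cmds, sep = "\n")   # shows the commands with single backslashes
```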
Re: [R] Splitting a character variable into a numeric one and a character one?
Now I want to do an operation that can split it into two variables:

    Column 1   Column 2   Column 3
    123abc     123        abc
    12cd34     12         cd34
    1e23       1          e23
    ...

So basically, I want to split the original variable into a numeric one and a character one, while the splitting element is the first character in Column

My first thought on this was to apply the regexp "^([0-9]*)(.*)$" and get the two parts out. But I don't see a way to get both matches in parentheses out in one go. In Python you just do:

    >>> re.findall('^([0-9]*)(.*)$',"123abc")
    [('123', 'abc')]
    >>> re.findall('^([0-9]*)(.*)$',"1e12")
    [('1', 'e12')]

In R you can get the groups and do gsub on them:

    > r="^([0-9]*)(.*)$"
    > gsub(r,"\\1","123abc")
    [1] "123"

But I don't see a way of getting the two values out except as part of one string in gsub (which is right back where you started) or doing gsub twice.

Barry
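In current versions of R (this was not available when the thread was written), base R can return all parenthesised groups in one go via regexec() and regmatches(), which is close to the Python findall() call above. A minimal sketch, with the data frame column names chosen for illustration:

```r
s <- c("123abc", "12cd34", "1e23")

# regexec() records the full match and each parenthesised group;
# regmatches() extracts them as one character vector per input string
m <- regmatches(s, regexec("^([0-9]*)(.*)$", s))

# Each element is c(full match, group 1, group 2); stack them into a matrix
parts <- do.call(rbind, m)

out <- data.frame(orig = parts[, 1],
                  num  = as.numeric(parts[, 2]),
                  char = parts[, 3],
                  stringsAsFactors = FALSE)
out
```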
[R] Passing R connection as argument to a shell command on Windows
Hello,

Is there a way to pass a connection to a file in a zipped archive as an argument (instead of the file name of an unzipped file) to the shell command "cut"? In general, is it possible to pipe the output of an R function to a shell command? How?

I want to do something like:

    z = unz("zipArchive.zip", "fileASCII.ASC")
    # open connection
    open(z)
    # cut lines of the ASCII file in the zipped archive at specific positions
    # and send results to another file.
    shell("cut -c2-3,5-8 z > test2.dat")

Anupam.
Re: [R] Splitting a character variable into a numeric one and a character one?
Great! That's exactly what I want.

Thanks a lot,
FD

On 9/25/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:

strapply in package gsubfn can do that:

    library(gsubfn)
    s <- c("123abc", "12cd34", "1e23")
    out <- strapply(s, "^([[:digit:]]+)(.*)", c)
    out <- do.call(rbind, out) # as a matrix
    data.frame(x = out[,1], num = as.numeric(out[,2]), char = out[,3]) # as a data.frame

On 9/25/06, Frank Duan [EMAIL PROTECTED] wrote:

Hi All,

I have a data set with a variable like this:

    Column 1
    123abc
    12cd34
    1e23
    ...

Now I want to do an operation that can split it into two variables:

    Column 1   Column 2   Column 3
    123abc     123        abc
    12cd34     12         cd34
    1e23       1          e23
    ...

So basically, I want to split the original variable into a numeric one and a character one, while the splitting element is the first character in Column 1. I searched the forum with the key words "strsplit" and "substr", but still can't solve this problem. Can anyone give me some hints?

Thanks in advance,
FD
Re: [R] plotting grouped data object
On 9/23/06, Afshartous, David [EMAIL PROTECTED] wrote:

All,

I'd like to plot the main relationship of a grouped data object for all levels of a factor in a single panel. The sample code below creates a separate panel for each level of the factor. I realize that this could be done in other ways, but I'd like to do it via plotting the grouped data object.

thanks! dave

    z = rnorm(18, mean=0, sd=1)
    x = rep(1:6, 3)
    y = factor(rep(c("I", "C", "P"), each = 6))
    dat = data.frame(x, y, z)
    data.grp = groupedData(z ~ x | y, data = dat)
    plot(data.grp, outer = ~ y)
    ### this produces 1 line each in 3 panels
    ### how to collapse all 3 lines into 1 panel?

The closest I can get is

    dat$one <- gl(1, 18)
    data.grp = groupedData(z ~ x | one, data = dat)
    plot(data.grp, innerGroups = ~y, strip = FALSE)

-Deepayan
Re: [R] Passing R connection as argument to a shell command on Windows
No, the cut command won't understand that z is an R connection and not a file in the current working directory: there is no overlap between the R object name space and the Windows object name space. Unfortunately, you may be forced to unzip to a temporary file, and then read from that.

One thing that you might want to try, if you're using cygwin, is to create a named pipe, and use shell() with wait=FALSE to unzip and pipe into cut and then output to the named pipe. Open an R connection for reading from the named pipe. This leaves open the question of how to deal with failures, and whether you can invoke a command pipeline from R under Windows...

I haven't tried this, so if you manage to make it work, it may be something that's of interest to the list in general.

Regards,
Mike

On 9/25/06, Anupam Tyagi [EMAIL PROTECTED] wrote:

Hello,

Is there a way to pass a connection to a file in a zipped archive as an argument (instead of the file name of an unzipped file) to the shell command "cut"? In general, is it possible to pipe the output of an R function to a shell command? How?

I want to do something like:

    z = unz("zipArchive.zip", "fileASCII.ASC")
    # open connection
    open(z)
    # cut lines of the ASCII file in the zipped archive at specific positions
    # and send results to another file.
    shell("cut -c2-3,5-8 z > test2.dat")

Anupam.

--
Regards, Mike Nielsen
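If the goal is only the column extraction, an alternative that stays inside R (and so needs neither cut nor a temporary unzipped file) is to read the lines from the unz() connection and slice them with substr(). This is a sketch of that idea rather than something proposed in the thread; cut_columns is a hypothetical helper name, and the archive and file names in the commented lines are the placeholders from the question.

```r
# Equivalent of `cut -c2-3,5-8`: keep characters 2-3 and 5-8 of every line
cut_columns <- function(lines) {
  paste(substr(lines, 2, 3), substr(lines, 5, 8), sep = "")
}

# With the real archive one would read from the zip connection, e.g.:
#   z <- unz("zipArchive.zip", "fileASCII.ASC")
#   lines <- readLines(z)
#   close(z)
#   writeLines(cut_columns(lines), "test2.dat")

# Small in-memory demonstration
demo <- c("abcdefghij", "0123456789")
res <- cut_columns(demo)
res
```

For very large files, reading in chunks with readLines(z, n = ...) on an opened connection would keep memory use down.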
[R] Can't mix high level and low level plot functions.
Hey R-Community,

I'd like to print out a histogram of some experimental data and add a smooth curve of a normal distribution, with an ideally generated population having the same mean and standard deviation as the experimental data. The experimental data is stored in the vector x and its name in group.name. I paint the histogram as follows:

    hist(data, freq=FALSE, col="lightgrey", ylab="Density", xlab=group.name)

First I did the normal distribution curve this way:

    lines(x, dnorm(x, mean=mean(x), sd=sd(x)), type="l", lwd=2)

This curve just uses as many values as there are in x. With small samples the curve looks really shaky. I tried this one using a high level plot function as well:

    curve(dnorm, n=1, add=TRUE, xlim=range(x))

The advantage is that now I can set an ideal population of 1 to get the ideal curve really smooth. But the big disadvantage is that I don't know how to add the mean=mean(x), sd=sd(x) arguments to it. It says that it can't mix high level with low level plot functions; when I try to set some kind of parameter like n=1 on the low level function, it says that there aren't enough x values.

So my question is: how do I get a smooth dnorm curve placed over a histogram of sample data, ideally by using the curve method?

TIA,
Lothar Rubusch
Re: [R] Can't mix high level and low level plot functions.
On 9/25/2006 1:56 PM, Lothar Botelho-Machado wrote: Hey R-Comunity, I'd like to print out an histogram of some experimental data and add a smooth curve of a normal distribution with an ideally generated population having the same mean and standard deviation like the experimental data. The experimental data is set as vector x and its name is set to group.name. I paint the histogram as follows: hist(data, freq=FALSE, col=lightgrey, ylab=Density, xlab=group.name) First I did the normal distribution curve this way: lines(x, dnorm(x, mean=mean(x), sd=sd(x)), type=l, lwd=2) This curve just uses as many values as there are in x. When using small amounts of sample populations the curve looks really shaky. This is generally the right way to do it, but you likely want to use a different variable for the first two occurrences of x, e.g. x0 - seq(from=min(x), to=max(x), len=200) lines(x0, dnorm(x0, mean=mean(x), sd=sd(x)), type=l, lwd=2) Duncan Murdoch I tried this one using a high level plot function as well: curve(dnorm, n=1, add=TRUE, xlim=range(x)) The advantage is, now I can set an ideal population of 1 to get the ideal curve really smooth. But the big disadvantage is, I don't know how to add mean=mean(x), sd=sd(x) arguments to it? It says that it can't mix high level with low level plot functions when I try to set some kind of parameter like n=1 to the low level function, it says that there ain't enough x values. So my question is, how to get a smooth curve placed of dnorm over an histogram of sample data, ideally by using the curve method? TIA, Lothar Rubusch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] Adding percentage to Pie Charts
Gabor Grothendieck <ggrothendieck at gmail.com> writes:

> It might also be nice to be able to align the fans at the left or right, not just the center.

Fans that open on one side only: a line that moves like the minute hand of an analog clock, with zero at the top. Clockwise movement of the hand represents the number (percentage).

Anupam.
[R] [PlainText Attempt] Sampling distribution of correlation estimations derived from robust MCD and MVE methods
Dear R users,

I am trying to use MCD and MVE methods in the analysis of functional imaging (fMRI) data. Before doing that, I want to understand the sampling distribution of the correlation parameter given by MCD and MVE (cov.mcd$cor, cov.mve$cor). To this end, I conducted a simulation where, in each of 10 epochs, I

a. construct a matrix from two vectors, each containing 40 numbers randomly sampled from a normal distribution;
b. apply cov.mve and cov.mcd to the resulting matrix;
c. obtain the correlations in the subsets selected by cov.mve and cov.mcd. E.g., if the matrix is called cormat20.ans, I request:

  current.mve20 <- round(cov.mve(cormat20.ans, cor=T)$cor[[2]], 3)

At the end of the day, I have the sampling distribution for these correlations [i.e., what correlations exist in the subsets that MVE and MCD tend to pick up when sampling from a normal distribution].

Here is my question: because MVE and MCD select the most central 20 points (of the 40), I wanted to compare the resulting sampling distributions to that of a Pearson's r correlation coefficient (i.e., a Pearson's r with N=20; the goal was to establish whether the significance thresholds are similar). However, the three sampling distributions are quite different. That is, the sampling distribution of Pearson's r (N=20) is very different from that of cov.mve and cov.mcd (with N=20, 20 being the subset selected from the 40 points). The sampling distribution of Pearson's r with N=40 is also very different from that of MVE and MCD.

If anyone knows of, or could point me to, sources that discuss the sampling distribution of cov.mve$cor and cov.mcd$cor and their relation to Pearson's r, I would be very grateful.

I have put the simulation code I used here:
http://home.uchicago.edu/~uhasson/pearson-mcd-mve.R.txt
And an image of the resulting sampling distributions here:
http://home.uchicago.edu/~uhasson/correl.comparison.tiff

Sincerely,
Uri Hasson
The Brain Research Imaging Center
The University of Chicago
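The simulation described above can be sketched roughly as follows. This is not the linked script, only a minimal illustration under stated assumptions: the MASS package is installed, the number of epochs (200) is picked arbitrarily, and the default settings of cov.mcd() and cov.mve() are used. The object names are made up for the example.

```r
library(MASS)   # provides cov.mcd() and cov.mve()

set.seed(1)
n_epochs <- 200   # arbitrary choice for this sketch
n_obs    <- 40    # observations per epoch, as in the post

# One row per epoch: Pearson, MCD-based and MVE-based correlation
# between two independent standard normal vectors.
sim <- t(replicate(n_epochs, {
  m <- cbind(rnorm(n_obs), rnorm(n_obs))
  c(pearson = cor(m)[1, 2],
    mcd     = cov.mcd(m, cor = TRUE)$cor[1, 2],
    mve     = cov.mve(m, cor = TRUE)$cor[1, 2])
}))

# Spread of each sampling distribution under the null of no correlation
apply(sim, 2, sd)
```

Comparing the columns of sim (for example with quantile() or overlaid density plots) is one way to look at how the robust estimates differ from the classical one on the same simulated data.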
[R] Unidentified warning message in Portuguese
What is the meaning of this message? Warning message: Realizando coerção de LHD para uma lista I tried to do something like this: test - function(x) { rval - NULL m - mean(x) s - sd(x) rval$m - m rval$s - s y - x[abs(x - m) 3 * s] rval$y - y # this is the critical line return(rval) } except that in the example above the critical line does not give any error, while in the real (much bigger) example it does, for things like: test(c(runif(100), 100)) Alberto Monteiro __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
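For readers without Portuguese: the warning is R's localized form of "Coercing LHS to a list". A minimal sketch of one way to trigger it, assuming the object on the left-hand side of `$<-` is an atomic vector rather than NULL or a list. The object names below are made up, and the actual cause in the larger function may differ.

```r
# Hypothetical reproduction: rval starts out as an atomic vector.
rval <- c(m = 0.5)

caught <- character(0)
withCallingHandlers(
  rval$s <- 2,                        # `$<-` on an atomic vector coerces it to a list
  warning = function(w) {
    caught <<- c(caught, conditionMessage(w))
    invokeRestart("muffleWarning")    # record the warning and carry on
  }
)

is.list(rval)     # the vector has been turned into a list
length(caught)    # number of warnings recorded

# Starting from an explicit list avoids the coercion altogether.
rval2 <- list()
rval2$m <- 0.5
rval2$s <- 2
```

If the real function reuses a name that already holds a numeric vector before the `rval$...` assignments, initializing it with list() is one possible remedy.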
Re: [R] Can't mix high level and low level plot functions.
On Mon, 2006-09-25 at 19:56 +0200, Lothar Botelho-Machado wrote:

> So my question is, how to get a smooth curve placed of dnorm over an histogram of sample data, ideally by using the curve method?

This almost seems like it should be a FAQ. I also checked the R Graphics Gallery (http://addictedtor.free.fr/graphiques/index.php) and didn't see an example there either, unless I missed it.

In either case:

  x <- rnorm(50)
  hist(x, freq = FALSE)

  # Create a sequence of x axis values with small
  # increments over the range of 'x' to smooth the lines
  x.hypo <- seq(min(x), max(x), length = 1000)

  # Now use lines()
  lines(x.hypo, dnorm(x.hypo, mean=mean(x), sd=sd(x)), type="l", lwd=2)

HTH,
Marc Schwartz
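Since the original question asked for a curve() version, here is a minimal sketch of that route. It relies on the fact that curve() evaluates its expression in terms of a variable named x, so the data are kept under a different name (dat, chosen here for illustration) and the mean and sd are passed inside the expression.

```r
set.seed(1)
dat <- rnorm(50)   # stand-in for the experimental data

hist(dat, freq = FALSE, col = "lightgrey", ylab = "Density")

# Inside curve(), `x` is the plotting variable, not the data vector.
# n controls how many points are evaluated, hence the smoothness.
pts <- curve(dnorm(x, mean = mean(dat), sd = sd(dat)),
             from = min(dat), to = max(dat),
             n = 1000, add = TRUE, lwd = 2)
```

This draws the same density as the lines() approach; the choice between the two is mostly a matter of taste.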
Re: [R] paste? 'cmd /c c:\\pheno\\whap --file c:\\pheno\\smri --alt 1'
On Mon, 2006-09-25 at 18:58 +0200, Boks, M.P.M. wrote:

> Dear R users,
> This command works (calling a program called whap, with file specifiers etc.):
>   system('cmd /c "c:\\pheno\\whap --file c:\\pheno\\smri --alt 1 --perm 500"', intern=TRUE)
> Now I need to call it from a loop to replace the 1 by a different number, however I get lost using the quotes. I tried numerous versions of:
>   i <- 1
>   system(paste(c("'cmd /c "c:\\pheno\\whap --file c:\\pheno\\smri --alt", i, "--perm 500"'", sep=" ")), intern=TRUE)
> However no luck! I would be grateful for any help.
> Thanks, Marco

You need to escape the quote (") chars in the paste()d string so that they get passed to your command properly. Also, you don't want to use c() within the paste() function, as the paste() function already concatenates the component vectors.

Note:

  i <- 1
  paste("'cmd /c "c:\\pheno\\whap --file c:\\pheno\\smri --alt", i, "--perm 500"'", sep="")
  Error: syntax error in "paste("'cmd /c "c"

R sees the double quote before the second 'c' as the end of the string:

  "'cmd /c "

Now use \ to escape the internal quotes:

  paste("'cmd /c \"c:\\pheno\\whap --file c:\\pheno\\smri --alt ", i, " --perm 500\"'", sep="")
  [1] "'cmd /c \"c:\\pheno\\whap --file c:\\pheno\\smri --alt 1 --perm 500\"'"

Use '\' to escape each of the double quotes within the string, so that R can differentiate string delimiters from characters within the string.

HTH,
Marc Schwartz
Re: [R] Best use of LaTeX listings package for pretty printing R code
On Monday 25 September 2006 09:31, Frank E Harrell Jr wrote:

> This is what I have been using. Does anyone have a better way? In particular I would like to see letters in comment strings not stretched so much. Thanks -Frank
>
> \documentclass{article}
> \usepackage{listings,relsize}
> \lstloadlanguages{R}
> \newcommand{\lil}[1]{\lstinline|#1|}
> \begin{document}
> \lstset{language=R,basicstyle=\smaller,commentstyle=\rmfamily\smaller,
>   showstringspaces=false,%
>   xleftmargin=4ex,literate={<-}{{$\leftarrow$}}1 {~}{{$\sim$}}1}
> \lstset{escapeinside={(*}{*)}} % for (*\ref{ }*) inside lstlistings (S code)
> \begin{lstlisting}
> a <- b # this is a test line
> if(i==3) { # another line, for y^2
>   y <- 3^3
>   z <- 'this string'
>   qqcat <- y ~ pol(x,2)
> } else y <- 4
> \end{lstlisting}
> That was \lstinline|x <- 22| \lil{q <- 'cat'}.
> \end{document}

listings is a great package to highlight R keywords and comments and (that was my main use of the package) index those keywords. I found that I had to slightly redefine the list of keywords included in listings. I still have not taken the time to submit a patch to the author, though...

In any case, here's what I use, if it can be of any help to anyone:

\lstloadlanguages{R}
\lstdefinelanguage{Renhanced}[]{R}{%
  morekeywords={acf,ar,arima,arima.sim,colMeans,colSums,is.na,is.null,%
    mapply,ms,na.rm,nlmin,replicate,row.names,rowMeans,rowSums,seasonal,%
    sys.time,system.time,ts.plot,which.max,which.min},
  deletekeywords={c},
  alsoletter={.\%},%
  alsoother={:_\$}}
\lstset{language=Renhanced,extendedchars=true,
  basicstyle=\small\ttfamily,
  commentstyle=\textsl,
  keywordstyle=\mdseries,
  showstringspaces=false,
  index=[1][keywords],
  indexstyle=\indexfonction}

with [EMAIL PROTECTED]

--
Vincent Goulet, Associate Professor
École d'actuariat
Université Laval, Québec
[EMAIL PROTECTED]
http://vgoulet.act.ulaval.ca
[R] Rows of a data frame to matrix
useRs, I have a data frame where four of the columns of the data frame represent the values of a two-by-two matrix. I'd like to, row-by-row, go through the data frame and use the four columns, in matrix form, to perform calculations necessary to create new values for variables in the data frame. My first idea was to use apply: apply(as.array(data.frame[,1:4]), 1, matrix, nrow=2) Though intuitive, this doesn't work. I've stumbled in the dark with mApply and other functions but can't find anything in the help that works. This can't be that hard, but it has me stumped. Any help greatly appreciated, Damian __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Rows of a data frame to matrix
Using the builtin 11x8 anscombe data frame here are some alternatives:

  # 1
  # list of 2x2 matrices
  lapply(split(anscombe[1:4], 1:nrow(anscombe)), matrix, 2)

  # 2
  # 2x2x11 array
  array(t(anscombe[1:4]), c(2, 2, nrow(anscombe)))

  # 3
  # to create matrix and perform calculations, e.g. det, all in one step
  apply(anscombe[1:4], 1, function(x) det(matrix(x, 2)))

On 9/25/06, Damian Betebenner [EMAIL PROTECTED] wrote:

> I have a data frame where four of the columns of the data frame represent the values of a two-by-two matrix. I'd like to, row-by-row, go through the data frame and use the four columns, in matrix form, to perform calculations necessary to create new values for variables in the data frame.
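To connect the third alternative back to the original goal of creating a new variable in the data frame, here is a small sketch on made-up data. The column names (a11, a21, a12, a22) are hypothetical, and the example assumes the four columns are stored in column-major order, which is how matrix() fills its result by default.

```r
# Two rows, each holding the entries of a 2x2 matrix in column-major order
df <- data.frame(a11 = c(1, 2), a21 = c(0, 1),
                 a12 = c(0, 3), a22 = c(1, 4))

# Rebuild the matrix for each row and store a derived quantity (the determinant)
df$det <- unname(apply(df[1:4], 1, function(r) det(matrix(r, nrow = 2))))

df
```

If the columns are stored row by row instead, passing byrow = TRUE to matrix() would be the corresponding change.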
Re: [R] Beginner Loop Question with dynamic variable names
Is this what you had in mind?

  j <- data.frame(q1=rnorm(10), q2=rnorm(10))
  j
             q1         q2
  1  -0.9189618 -0.2832102
  2   0.9394316  1.1345975
  3  -0.6388848  0.6850255
  4   0.4938245 -0.5825715
  5  -1.2885257 -0.2654023
  6  -0.5278295  0.2382791
  7   0.6517268  0.8923375
  8   0.4124178  1.1231630
  9  -0.1604982  0.2285672
  10 -0.2369713  0.6130197

  for(i in 1:3){j[,paste(sep="","res",i)] <- with(j,q1+q2)}
  j
             q1         q2        res1        res2        res3
  1  -0.9189618 -0.2832102 -1.20217207 -1.20217207 -1.20217207
  2   0.9394316  1.1345975  2.07402913  2.07402913  2.07402913
  3  -0.6388848  0.6850255  0.04614073  0.04614073  0.04614073
  4   0.4938245 -0.5825715 -0.08874699 -0.08874699 -0.08874699
  5  -1.2885257 -0.2654023 -1.55392802 -1.55392802 -1.55392802
  6  -0.5278295  0.2382791 -0.28955044 -0.28955044 -0.28955044
  7   0.6517268  0.8923375  1.54406433  1.54406433  1.54406433
  8   0.4124178  1.1231630  1.53558084  1.53558084  1.53558084
  9  -0.1604982  0.2285672  0.06806901  0.06806901  0.06806901
  10 -0.2369713  0.6130197  0.37604847  0.37604847  0.37604847

Regards,
Mike

On 9/25/06, Peter Wolkerstorfer - CURE [EMAIL PROTECTED] wrote:

> Dear all,
>
> I have another small scripting-beginner problem which you hopefully can help with. I compute new variables with:
>
>   # Question 1
>   results$q1 <- with(results, q1_1*1+ q1_2*2+ q1_3*3+ q1_4*4+ q1_5*5)
>   # Question 2
>   results$q2 <- with(results, q2_1*1+ q2_2*2+ q2_3*3+ q2_4*4+ q2_5*5)
>   # Question 3
>   results$q3 <- with(results, q3_1*1+ q3_2*2+ q3_3*3+ q3_4*4+ q3_5*5)
>   # Question 4
>   results$q4 <- with(results, q4_1*1+ q4_2*2+ q4_3*3+ q4_4*4+ q4_5*5)
>
> This is very inefficient so I would like to do this in a loop like:
>
>   for (i in 1:20) {results$q1 <- with(results, q1_1*1+ q1_2*2+ q1_3*3+ q1_4*4+ q1_5*5)}
>
> My question now: how do I replace the 1s (results$q1, q1_1...) in the variable names with the looping variable?
>
> Here is how I would like it (just for illustration; of course I still miss the function to tell R that it should append the value of i to the variable name):
>
>   # i is the number of questions - just an illustration, I know it does not work this way
>   for (i in 1:20) {results$qi <- with(results, qi_1*1+ qi_2*2+ qi_3*3+ qi_4*4+ qi_5*5)}
>
> Help would be greatly appreciated. Thanks in advance.
>
> Peter
>
> ___CURE - Center for Usability Research Engineering___
> Peter Wolkerstorfer
> Usability Engineer
> Hauffgasse 3-5, 1110 Wien, Austria
> [Tel] +43.1.743 54 51.46
> [Fax] +43.1.743 54 51.30
> [Mail] [EMAIL PROTECTED]
> [Web] http://www.cure.at

--
Regards,
Mike Nielsen
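Applied to the question as posed (weighted sums of q1_1 ... q1_5, q2_1 ... q2_5, and so on), the same paste-based idea might look like the sketch below. The data are simulated and only three questions are used; the real `results` data frame is assumed to have columns named in exactly this pattern.

```r
set.seed(42)
n_q <- 3   # number of questions in this toy example (the post has 20)

# Simulated stand-in for `results`: 4 respondents, 5 indicator columns per question
results <- as.data.frame(matrix(rbinom(4 * n_q * 5, 1, 0.2), nrow = 4))
names(results) <- paste0("q", rep(1:n_q, each = 5), "_", 1:5)

for (i in 1:n_q) {
  cols <- paste0("q", i, "_", 1:5)              # "q1_1", ..., "q1_5"
  # weighted sum q<i>_1*1 + ... + q<i>_5*5, stored in column "q<i>"
  results[[paste0("q", i)]] <- as.vector(as.matrix(results[cols]) %*% 1:5)
}

head(results[paste0("q", 1:n_q)])
```

Indexing with `[[` and a constructed name is what replaces the `results$qi` of the illustration, since `$` does not evaluate its argument.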
Re: [R] plotting grouped data object
thanks!

-----Original Message-----
From: Deepayan Sarkar [mailto:[EMAIL PROTECTED]
Sent: Monday, September 25, 2006 1:34 PM
To: Afshartous, David
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] plotting grouped data object

On 9/23/06, Afshartous, David [EMAIL PROTECTED] wrote:

> All,
> I'd like to plot the main relationship of a grouped data object for all levels of a factor in a single panel. The sample code below creates a separate panel for each level of the factor. I realize that this could be done in other ways, but I'd like to do it via plotting the grouped data object.
> thanks! dave
>
>   z = rnorm(18, mean=0, sd=1)
>   x = rep(1:6, 3)
>   y = factor(rep(c("I", "C", "P"), each = 6))
>   dat = data.frame(x, y, z)
>   data.grp = groupedData(z ~ x | y, data = dat)
>   plot(data.grp, outer = ~ y)
>   ### this produces 1 line each in 3 panels
>   ### how to collapse all 3 lines into 1 panel?

The closest I can get is

  dat$one <- gl(1, 18)
  data.grp = groupedData(z ~ x | one, data = dat)
  plot(data.grp, innerGroups = ~y, strip = FALSE)

-Deepayan
Re: [R] paste? 'cmd /c c:\\pheno\\whap --file c:\\pheno\\smri --alt 1'
Use single outer quotes so that the inner double quotes are not interpreted as the end of the string.

  cmd <- 'cmd /c "...whatever..." '
  system(cmd, intern = TRUE)

On 9/25/06, Boks, M.P.M. [EMAIL PROTECTED] wrote:

> Now I need to call it from a loop to replace the 1 by different number, however I get lost using the quotes
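Combining the single-outer-quote advice with the loop from the question, one possible sketch is shown below. The helper name build_cmd is made up, the exact placement of the inner double quotes is an assumption (the archive has lost the original quote characters), and the system() call is left commented out because it only makes sense on a Windows machine where whap is installed.

```r
# Hypothetical helper: builds the command string for a given --alt value.
# Single outer quotes let the inner double quotes pass through unescaped.
build_cmd <- function(alt, perm = 500L) {
  sprintf('cmd /c "c:\\pheno\\whap --file c:\\pheno\\smri --alt %d --perm %d"',
          as.integer(alt), as.integer(perm))
}

cmds <- vapply(1:3, build_cmd, character(1))
cat(cmds, sep = "\n")   # inspect the strings exactly as the shell would see them

# for (cmd in cmds) out <- system(cmd, intern = TRUE)   # Windows only
```

Printing with cat() rather than print() is a handy check here, because print() shows the escaped form of backslashes and quotes rather than the raw characters.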
Re: [R] Can't mix high level and low level plot functions.
Thanks for your help! I appreciate it; now it works perfectly.

Lothar Rubusch

Duncan Murdoch wrote:

> This is generally the right way to do it, but you likely want to use a different variable for the first two occurrences of x, e.g.
>   x0 <- seq(from=min(x), to=max(x), len=200)
>   lines(x0, dnorm(x0, mean=mean(x), sd=sd(x)), type="l", lwd=2)
Re: [R] fraitly in coxph
For a nested model you want to use the coxme function, which is the much superior successor to frailty(). It is currently found in the kinship library.

Terry Therneau
[R] Sort problem with merge (again)
# R version 2.3.1 (2006-06-01) Debian Linux "testing"
# Is the following behaviour a bug, feature or just a lack of
# understanding on my part? I see that this was discussed here
# last March with no apparent resolution.

d <- as.factor(c("1970-04-04","1970-08-11","1970-10-18"))
x <- c(9,10,11)
ch <- data.frame(Date=d,X=x)
d <- as.factor(c("1970-06-04","1970-08-11","1970-08-18"))
y <- c(109,110,111)
sp <- data.frame(Date=d,Y=y)
df <- merge(ch,sp,all=TRUE,by="Date")

# the rows with dates missing all ch vars are tacked on the end.
# the rows with dates missing all sp vars are sorted in with
# the row with a date with vars from both ch and sp
# is.ordered(df$Date) returns FALSE
# The rows of df are not sorted as they should be as sort=TRUE
# is the default. Adding sort=TRUE does nothing.
# So try this:
# dd <- df[order(df$Date),]
# But that doesn't work.
# Nor does sort(df$Date)
# But sort(as.vector(df$Date)) does work.
# As does order(as.vector(df$Date)), so this works:

dd <- df[order(as.vector(df$Date)),]

# ?
Re: [R] Sort problem with merge (again)
If you want it to act like a date store it as a Date:

  dx <- as.Date(c("1970-04-04","1970-08-11","1970-10-18")) ###
  x <- c(9,10,11)
  ch <- data.frame(Date=dx,X=x)
  dy <- as.Date(c("1970-06-04","1970-08-11","1970-08-18")) ###
  y <- c(109,110,111)
  sp <- data.frame(Date=dy,Y=y)
  merge(ch, sp, all = TRUE)

By the way you might consider using zoo objects here:

  library(zoo)
  chz <- zoo(x, dx)
  spz <- zoo(y, dy)
  merge(chz, spz)

See: vignette("zoo")

On 9/25/06, Bruce LaZerte [EMAIL PROTECTED] wrote:

> # The rows of df are not sorted as they should be as sort=TRUE
> # is the default. Adding sort=TRUE does nothing.
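A self-contained sketch of the Date-based merge follows, with an explicit ordering step added. The reasoning, stated as background rather than as a diagnosis of the 2.3.1 behaviour: ordering a factor goes by its internal level codes, whereas a Date column orders chronologically no matter how the rows were assembled. The name `both` is chosen for the example.

```r
ch <- data.frame(Date = as.Date(c("1970-04-04", "1970-08-11", "1970-10-18")),
                 X = c(9, 10, 11))
sp <- data.frame(Date = as.Date(c("1970-06-04", "1970-08-11", "1970-08-18")),
                 Y = c(109, 110, 111))

both <- merge(ch, sp, by = "Date", all = TRUE)

# Explicit ordering; harmless if merge() has already sorted the rows
both <- both[order(both$Date), ]
both
```

Rows present in only one input carry NA in the other input's column, and the single shared date (1970-08-11) has both X and Y filled in.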
Re: [R] Creating Movies with R
Hello!

J.R. Lockwood <lockwood at rand.org> writes:

> An alternative that I've used a few times is the jpeg() function to create the sequence of images, and then converting these to an mpeg movie using mencoder distributed with mplayer. This works on both windows and linux. I have a pretty self-contained example file written up that I can send to anyone who is interested. Oddly, the most challenging part was creating a sequence of file names that would be correctly ordered - for this I use:
>
>   lex <- function(N){
>     ## produce vector of N lexicographically ordered strings
>     ndig <- nchar(N)
>     substr(formatC((1:N)/10^ndig, digits=ndig, format="f"), 3, 1000)
>   }

RWiki [1] would be a very nice place for such an explanation. I am looking forward to it!

Gregor

[1] http://wiki.r-project.org/
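For the file-name part, a zero-padded counter is another way to get names that sort correctly. The sketch below uses formatC() with a width and a "0" flag; the function name frame_names and the "frame" prefix are made up for the example and are not taken from the poster's script.

```r
# Hypothetical helper: N zero-padded file names that sort in frame order
frame_names <- function(N, prefix = "frame", ext = "jpg") {
  width <- nchar(as.character(N))                      # digits needed for the largest index
  idx   <- formatC(seq_len(N), width = width, flag = "0")
  paste0(prefix, idx, ".", ext)
}

frame_names(12)[c(1, 2, 12)]
# "frame01.jpg" "frame02.jpg" "frame12.jpg"
```

These names can be passed one at a time to jpeg() inside the plotting loop, and a shell glob over the output directory will then list the frames in the intended order.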
Re: [R] Splitting a character variable into a numeric one and a character one?
Here is a slight simplification of the strapply solution using simplify = TRUE

  library(gsubfn)
  s <- c("123abc", "12cd34", "1e23")
  out <- t(strapply(s, "^([[:digit:]]+)(.*)", c, simplify = TRUE)) # matrix
  data.frame(x = out[,1], num = as.numeric(out[,2]), char = out[,3])

On 9/25/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:

> strapply in package gsubfn can do that:
>
>   library(gsubfn)
>   s <- c("123abc", "12cd34", "1e23")
>   out <- strapply(s, "^([[:digit:]]+)(.*)", c)
>   out <- do.call(rbind, out) # as a matrix
>   data.frame(x = out[,1], num = as.numeric(out[,2]), char = out[,3]) # as a data.frame
>
> On 9/25/06, Frank Duan [EMAIL PROTECTED] wrote:
>
> > Hi All,
> > I have a data with a variable like this:
> >
> >   Column 1
> >   123abc
> >   12cd34
> >   1e23
> >   ...
> >
> > Now I want to do an operation that can split it into two variables:
> >
> >   Column 1   Column 2   Column 3
> >   123abc     123        abc
> >   12cd34     12         cd34
> >   1e23       1          e23
> >   ...
> >
> > So basically, I want to split the original variable into a numeric one and a character one, while the splitting element is the first character in Column 1. I searched the forum with key words "strsplit" and "substr", but still can't solve this problem. Can anyone give me some hints?
> >
> > Thanks in advance,
> > FD
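For comparison, the same split can be sketched in base R with two sub() calls, assuming (as in the example data) that every string starts with at least one digit. This is an alternative to the gsubfn approach, not a restatement of it.

```r
s <- c("123abc", "12cd34", "1e23")

# Leading run of digits, converted to numeric
num  <- as.numeric(sub("^([[:digit:]]+).*$", "\\1", s))
# Everything after the leading digits
char <- sub("^[[:digit:]]+", "", s)

data.frame(x = s, num = num, char = char)
```

Strings without a leading digit would pass through the first sub() unchanged and turn into NA (with a warning) in as.numeric(), so real data may need a check for that case first.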
Re: [R] Best use of LaTeX listings package for pretty printing R code
Frank E Harrell Jr <f.harrell at vanderbilt.edu> writes:

> This is what I have been using. Does anyone have a better way? In particular I would like to see letters in comment strings not stretched so much. Thanks -Frank

It may be possible to pass on all comments to a verbatim like environment inside the listings environment, by defining and redefining the preamble to listings. I hope it does not interfere with something else in LaTeX.

Anupam.
[R] linear terms within a nonlinear model
I have a complicated nonlinear function, myfun(a,b,c), that I want to fit to data, allowing one or more of the parameters a, b, and c in turn to have linear dependence on other covariates. In other words, I'd like to specify something like

  nls(y~myfun(a,b,c),linear=list(a~f1,b~1,c~1))

I know this would work in nlme *if I wanted to specify random effects as well*, but I don't, and I wasn't able to figure out how to specify a null random effect. (I have looked in Pinheiro and Bates but haven't yet found a solution ...) I don't see how to do it in nls() or nlsList(), short of implementing the linear structure within myfun(). I looked at Jim Lindsey's gnlm package but haven't yet been able to figure it out.

Does anyone have any ideas or tips?

thanks
Ben Bolker
Re: [R] linear terms within a nonlinear model
Hi,

the contributed package 'drc' allows specification of non-linear regression models with individual parameter models that include covariates. For an example, see section 8 of the accompanying paper in J. Statist. Software (http://www.jstatsoft.org/v12/i05/v12i05.pdf).

Christian
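The fallback mentioned in the question, writing the linear structure directly into the model expression, can be sketched with plain nls() as below. Everything here is illustrative: myfun is a simple exponential-decay stand-in for the real function, the data are simulated, and the parameter names a0 and a1 (intercept and slope of a on the covariate f1) are made up.

```r
set.seed(1)

# Hypothetical stand-in for the "complicated" nonlinear function
myfun <- function(a, b, x) a * exp(-b * x)

# Simulated data: parameter a depends linearly on covariate f1 (a = 2 + 1 * f1)
d <- data.frame(x = rep(1:10, 2), f1 = rep(c(0, 1), each = 10))
d$y <- myfun(2 + 1 * d$f1, 0.3, d$x) + rnorm(nrow(d), sd = 0.01)

# The linear submodel a = a0 + a1 * f1 is written inside the nls() formula
fit <- nls(y ~ myfun(a0 + a1 * f1, b, x), data = d,
           start = list(a0 = 1.5, a1 = 0.5, b = 0.2))
coef(fit)
```

This keeps myfun itself untouched; only the formula changes when a different parameter is given a covariate. It does not provide the formula-list interface asked for in the question, so it is a workaround rather than an answer to that part.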