[R] Adding a We think R rocks page
Hi, there is a reference given for citing R. It is meant to be used to prove R's value to donors. OK, I cited R, but probably nobody will ever notice that. A web page to which beginners and unknown users like me could be pointed, and where we could leave a short statement of use and usefulness, might help demonstrate the impact and spread of R (besides download numbers). At least this would be more visible than merely citing R.
Regards, Joachim
Soft Use, Dr. Joachim Harloff, Tel. 089 74 49 37 95, Mobile 0177 58 24 124, Fax 089 74 49 37 94, http://www.softuse.com
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] R accuracy
Hello,
I am trying to test the precision of R on datasets from the Statistical Reference Datasets Project (http://www.itl.nist.gov/div898/strd/index.html), and I do not understand how R stores its results. For example, I calculate a mean on the michelso dataset (100 values) and find:
  m=mean(michel)
  m
  V1 299.8524
  print(m,digits=15)
  V1 299.8524
  print(m,digits=22)
  V1 299.852393
The certified value of the mean is 299.85240, so I try:
  print(m-299.8524)
  V1 -5.684342e-14
  print(m-299.8524,digits=15)
  V1 -5.68434188608080e-14
Does it make sense to print with more than 15 significant digits? Why is the difference not equal to zero? I am using R 2.0.1 under Windows XP.
Regards, Anthony Landrevie
Re: [R] R accuracy
Anthony Landrevie wrote:
> Hello, I am trying to test the precision of R on datasets from the Statistical Reference Datasets Project (http://www.itl.nist.gov/div898/strd/index.html), and I do not understand how R stores its results. For example, I calculate a mean on the michelso dataset (100 values) and find:
>   m=mean(michel)
>   m
>   V1 299.8524
>   print(m,digits=15)
>   V1 299.8524
>   print(m,digits=22)
>   V1 299.852393
> The certified value of the mean is 299.85240, so I try:
>   print(m-299.8524)
>   V1 -5.684342e-14
>   print(m-299.8524,digits=15)
>   V1 -5.68434188608080e-14
> Does it make sense to print with more than 15 significant digits?
It depends: for example, you might want to see that the result is only -5.684342e-14 off, and hence not identical() to, but all.equal() with, what you expected.
> Why is the difference not equal to zero?
Floating point calculations on a digital computer are involved...
Uwe Ligges
> I am using R 2.0.1 under Windows XP. Regards, Anthony Landrevie
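The distinction between identical() and all.equal() mentioned in the reply can be sketched as follows. This is only an illustration: the values 0.1, 0.2 and 0.3 are stand-ins chosen here, not the michelso data, and the `ulp` calculation assumes IEEE double precision with a 52-bit fraction, which is what R uses on common platforms.

```r
# Two numbers that are equal on paper but not in binary floating point
a <- 0.1 + 0.2
b <- 0.3
identical(a, b)           # exact, bit-for-bit comparison
isTRUE(all.equal(a, b))   # comparison up to a small numerical tolerance

# Spacing between adjacent doubles near 299.8524 (a value in [256, 512)),
# assuming a 52-bit fraction: one "unit in the last place"
ulp <- 2^(floor(log2(299.8524)) - 52)
print(ulp, digits = 15)

# Approximate number of significant decimal digits a double can carry
.Machine$double.digits * log10(2)
```

Under these assumptions the spacing printed for `ulp` is 2^-44, the same magnitude as the difference reported in the question, which suggests the computed mean is one representable step away from the stored value of 299.8524.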
[R] Question on statistics
Hi, can anyone help me with the following (although it is not directly related to R functionality)? I have been looking on the internet but cannot find the answer. My question: what is the variation of the mean of a limited distribution (N points in total, normally distributed) when I have a small sample from that distribution (n < N)? Your help would be very welcome.
Thanks, Roy
Re: [R] Tool for update
Hello,
On Wednesday, 23 March 2005 06:12, Yuandan Zhang wrote:
> Hi, Is there any tool to check whether an updated version of a package is available? I am looking for something like YUM for Linux.
Start R with --no-save on a root console and run update.packages() from within the R environment. Admittedly this is not like YUM, but it does a superb job.
Thomas
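A minimal sketch of checking for outdated packages before updating them. The helper name `check_updates` and the repository URL are assumptions made for this illustration and do not come from the thread; the helper needs network access, so it is only defined here and not called.

```r
# Hypothetical helper: names of installed packages for which the repository
# offers a newer version. old.packages() returns NULL when nothing is outdated.
check_updates <- function(repos = "https://cloud.r-project.org") {
  old <- old.packages(repos = repos)
  if (is.null(old)) return(character(0))
  rownames(old)
}

# Version strings are compared component by component, not as plain text:
# the result is -1, 0 or 1 depending on whether the first version is
# older than, equal to, or newer than the second.
compareVersion("1.9.1", "2.0.1")
compareVersion("1.10", "1.9")
```

Calling update.packages() afterwards, as suggested in the reply, performs the actual download and installation.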
Re: [R] Problem encounter during graphics device driver
On Wed, 23 Mar 2005 [EMAIL PROTECTED] wrote:
> Hello, I am facing the following problem using R version 1.9.1. Neither the PDF nor the PS device driver opens while I am using R-1.9.1; the following error message appears:
>   Error in PS(file, old$paper, old$family, old$encoding, old$bg, old$fg, : unable to start device PostScript
>   In addition: Warning message: problem loading encoding file
>   Execution halted
> But while I am using R-1.9.0, no error message appears, though I have installed all the required packages. Please suggest any workaround.
Use an encoding for which you do have a readable file! The message is quite explicit:
  In addition: Warning message: problem loading encoding file
so it is to do with the encoding you requested. Check what it is, that the file exists, and that it is readable to you. I suspect a permissions problem in your R installation or an incorrect setting of ps.options().
Your R is well out of date, and that message no longer exists in the R sources. Please read the R posting guide, use a current R, and tell us your OS etc.
--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
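The checks suggested in the reply (which encoding is requested, whether the file exists, whether it is readable) might look roughly like this in a current R. This is a sketch under the assumption that encoding files live in the 'enc' directory of the grDevices package, as they do in recent releases; the layout in R 1.9.x was different.

```r
# Which encoding would postscript() use by default?
enc <- ps.options()$encoding
enc

# Encoding files shipped with a current R installation
enc_dir <- system.file("enc", package = "grDevices")
enc_files <- list.files(enc_dir, full.names = TRUE)

# file.access(..., mode = 4) returns 0 for files the current user can read
readable <- file.access(enc_files, mode = 4) == 0
data.frame(file = basename(enc_files), readable = readable)
```

A file reported as not readable here would point to the kind of permissions problem suspected in the reply.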
Re: [R] more classes to one class in one dataset
try this:
  split(dat, dat$class)
where 'dat' is your data.frame. I hope it helps.
Best, Dimitris
Dimitris Rizopoulos, Ph.D. Student, Biostatistical Centre, School of Public Health, Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/16/336899 Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat/ http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm
- Original Message - From: Jan Sabee [EMAIL PROTECTED] To: R-help@stat.math.ethz.ch Sent: Wednesday, March 23, 2005 11:17 AM Subject: [R] more classes to one class in one dataset
> I have a big database which has several classes in the class variable. I want to put each class into its own dataset, for example:
>   x1 x2 x3 x4 class
>   a  b  a  c  cM1
>   c  b  b  c  cM4
>   c  c  a  c  cM2
>   c  a  c  a  aM2
>   c  c  a  a  aM1
>   c  a  b  c  aM3
>   c  c  a  b  cM3
>   c  a  c  a  bM2
>   c  c  a  b  aM1
> How can I make, like:
>   x1 x2 x3 x4 class
>   a  b  a  c  cM1
>   c  c  a  a  aM1
>   c  c  a  b  aM1
>
>   x1 x2 x3 x4 class
>   c  c  a  c  cM2
>   c  a  c  a  aM2
>   c  a  c  a  bM2
>
>   x1 x2 x3 x4 class
>   c  a  b  c  aM3
>   c  c  a  b  cM3
>
>   x1 x2 x3 x4 class
>   c  b  b  c  cM4
> Thanks for your help. Jan Sabee
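A small sketch of what split() returns. The data frame below is a made-up toy example with class labels M1, M2 and M4; it is not the poster's data.

```r
# Toy data frame (hypothetical values) with a grouping column 'class'
dat <- data.frame(
  x1 = c("a", "c", "c", "c"),
  x2 = c("b", "b", "c", "c"),
  class = c("M1", "M4", "M2", "M1"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per class value
parts <- split(dat, dat$class)
names(parts)
parts[["M1"]]
```

Each element of `parts` can then be used as a separate dataset, for instance via lapply(parts, summary).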
Re: [R] more classes to one class in one dataset
Thanks, it works.
Jan Sabee
On Wed, 23 Mar 2005 11:30:22 +0100, Dimitris Rizopoulos [EMAIL PROTECTED] wrote:
> try this:
>   split(dat, dat$class)
> where 'dat' is your data.frame. I hope it helps. Best, Dimitris
[R] smallest/biggest number
Hi, I'm running a Monte Carlo simulation and I wonder what the biggest/smallest number is that can reliably be represented in R. Thanks, Chris
Re: [R] smallest/biggest number
On Wed, 23 Mar 2005, chris desimpelaere wrote:
> I'm running a Monte Carlo simulation and I wonder what the biggest/smallest number is that can reliably be represented in R.
Well, -Inf and Inf, of course. But if you meant a finite number, see ?.Machine: the values are OS-specific.
--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
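As a brief illustration of what ?.Machine documents, the limits for the running R build can be inspected directly. The printed values depend on the platform, so none are shown here.

```r
# Limits of double-precision numbers on this platform
.Machine$double.xmax   # largest finite double
.Machine$double.xmin   # smallest positive normalized double
.Machine$double.eps    # smallest x such that 1 + x != 1

# Largest representable integer
.Machine$integer.max

# Going beyond the largest finite double overflows to Inf
.Machine$double.xmax * 2
```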
[R] R on SuSe 9.2 AMD 64 bit: Solved
Hi everybody,
I just downloaded the file R-2.0.1.tar.gz and followed the instructions in doc/R-admin.html. In particular, we installed the following packages for SuSE 9.2 AMD64: gcc, gcc++, gcc-g77, Perl, te-latex, te-pdf, libpng, libbz, PCRE, Tcl/Tk, BLAS, LAPACK. After that we simply ran:
  ./configure
  make
  make check
  make install (as root)
  make install-info
  make install-pdf
Now everything seems to work properly: no problem at all! Many thanks to all of you who helped. Ciao!
[R] Browser to replace the internal browser?
Hello, the more I work with R and the larger my code gets, the more I would like to have some graphical support for my source code. Is there a browser that could easily be integrated with R? And how do I call it from R? It would be nice if the browser replaced the fix() function. Carsten
Re: [R] Browser to replace the internal browser?
On Mar 23, 2005, at 6:44 AM, Carsten Steinhoff wrote:
> Hello, the more I work with R and the larger my code gets, the more I would like to have some graphical support for my source code. Is there a browser that could easily be integrated with R? And how do I call it from R? It would be nice if the browser replaced the fix() function.
You may want to think about using ESS within Emacs. There are other options that offer similar features, but ESS is what I personally like and use.
Sean
[R] manova and contrasts, again
Hi R-people,
To determine contrasts after MANOVA I found a piece of R code provided by Yves Rosseel (http://tolstoy.newcastle.edu.au/R/help/04/06/0134.html), which has been very helpful. Now I would like to determine contrasts for a model which has a main effect and an interaction effect, both of which were found to be statistically significant. I am a bit puzzled about the contrast matrix to feed the routine, especially for the interactions. Could anybody help me out here? If more information on the underlying model is necessary, please let me know.
Best regards, Tsjerk
RE: [R] Question on statistics
Ehh, by limited distribution, I meant to say a population of N points. ...
-Original Message- From: [EMAIL PROTECTED] On Behalf Of Roy Werkman Sent: Wednesday, March 23, 2005 10:22 AM To: r-help@stat.math.ethz.ch Subject: [R] Question on statistics
RE: [R] Tool for update
If you're talking about R itself, I believe the answer is no. However, the release schedule for R is rather predictable (two major releases per year, one in Spring and another in Fall, with patch releases in between as needed), so the need is not that great, IMHO.
Andy
From: Yuandan Zhang
> Hi, Is there any tool to check whether an updated version of a package is available? I am looking for something like YUM for Linux. YD
RE: [R] Question on statistics
If the sample is drawn with replacement from the finite population, then the usual formula applies (assuming iid samples); i.e., var(sample mean) = var(population) / n. There is a problem in your description: a finite population, I believe, is necessarily discrete (since there are only N possible values), so it cannot be Gaussian (i.e., normal).
Andy
From: Roy Werkman
> Ehh, by limited distribution, I meant to say a population of N points. ...
> Hi, can anyone help me with the following (although it is not directly related to R functionality)? I have been looking on the internet but cannot find the answer. My question: what is the variation of the mean of a limited distribution (N points in total, normally distributed) when I have a small sample from that distribution (n < N)? Your help would be very welcome. Thanks, Roy
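The formula in the reply can be checked by exact enumeration on a tiny population. The five population values below are made up for this illustration. The without-replacement line uses the standard finite population correction (N - n)/(N - 1), which is a textbook result and is not stated in the thread.

```r
pop <- c(2, 4, 4, 7, 9)               # hypothetical finite population
N <- length(pop)
n <- 2                                # sample size
sigma2 <- mean((pop - mean(pop))^2)   # population variance (divisor N)

# With replacement: all N^n ordered samples are equally likely
m_wr <- rowMeans(expand.grid(pop, pop))
var_wr <- mean((m_wr - mean(m_wr))^2)

# Without replacement: all choose(N, n) subsets are equally likely
m_wor <- colMeans(combn(pop, n))
var_wor <- mean((m_wor - mean(m_wor))^2)

all.equal(var_wr, sigma2 / n)                        # formula from the reply
all.equal(var_wor, sigma2 / n * (N - n) / (N - 1))   # finite population correction
```

When n is much smaller than N the correction factor is close to 1, so the two variances nearly coincide.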
Re: [R] Package vignette and build
Yes, you need the package installed first. Something like:
  R CMD build --no-vignettes DLM
  R CMD INSTALL DLM
  R CMD build DLM
  R CMD INSTALL DLM
At least you had to do this with 1.9.1; I can't recall looking again since then.
On Tue, 22 Mar 2005 11:23:36 -0600 (CST), Giovanni Petris [EMAIL PROTECTED] wrote:
> Hello, I am writing a package called 'DLM' containing a vignette. The vignette contains a chunk with the function call 'library(DLM)'. This worked fine with 'R CMD check DLM', but when it comes to building the package with 'R CMD build DLM' I get the following error message:
>   * creating vignettes ... ERROR
>   Error: chunk 1
>   Error in library(DLM) : There is no package called 'DLM'
>   Error in buildVignettes(dir = .) : Error: chunk 1
>   Error in library(DLM) : There is no package called 'DLM'
>   Execution halted
> It looks to me as if I should have the package already installed before building it... I have read the article by F. Leisch on package vignettes in R News 3/2 and looked at the source for 'strucchange', but I can't figure out what I am doing wrong. Any suggestions you can provide are more than welcome! Thank you in advance, Giovanni
>   version
>   platform sparc-sun-solaris2.8
>   arch     sparc
>   os       solaris2.8
>   system   sparc, solaris2.8
>   status
>   major    2
>   minor    0.1
>   year     2004
>   month    11
>   day      15
>   language R
> --
> Giovanni Petris [EMAIL PROTECTED]
> Department of Mathematical Sciences
> University of Arkansas - Fayetteville, AR 72701
> Ph: (479) 575-6324, 575-8630 (fax)
> http://definetti.uark.edu/~gpetris/
--
best, -tony
Commit early, commit often, and commit in a repository from which we can easily roll back your mistakes (AJR, 4Jan05).
A.J. Rossini [EMAIL PROTECTED]
Re: [R] sampling from a mixture distribution
You also have to sample the mixture component membership; check this for a mixture of two normals:
  rnorm.mixture <- function(n, prob=0.5, mu1=0, sigma1=1, mu2=0, sigma2=1){
    u <- runif(n)
    out <- numeric(n)
    for(i in 1:n)
      out[i] <- if(u[i] < prob) rnorm(1, mu1, sigma1) else rnorm(1, mu2, sigma2)
    out
  }
  hist(rnorm.mixture(1000, prob=0.6, mu1=-1, sigma1=0.5, mu2=2, sigma2=0.5))
I hope it helps.
Best, Dimitris
Dimitris Rizopoulos, Ph.D. Student, Biostatistical Centre, School of Public Health, Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/16/336899 Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat/ http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm
- Original Message - From: Vumani Dlamini [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Wednesday, March 23, 2005 2:53 PM Subject: [R] sampling from a mixture distribution
> Dear R users, I would like to sample from a mixture distribution p1*f(x1)+p2*f(x2). I usually sample variates from both distributions and weight them with their respective probabilities, but someone told me that was wrong. What is the correct way? Vumani
[R] replace values in a matrix subject to boolean condition
Hi everybody! I am sorry to bother you with such a simple question, but I think there might be a better solution: I have a matrix of size 360x501 where I want to check the value of each 5th column of each row and replace it (and the 6th, 7th, 8th column) by zero if the value is less than 1000. I have written a double loop to do that, but it takes a lot of time. Is there a faster way to achieve this? Thanks, Werner
Re: [R] sampling from a mixture distribution
For each variate, generate it from f1() with probability p1, and from f2() with probability p2. In other words, flip a p1-biased coin to decide which distribution, f1 or f2, to generate from.
HTH, Giovanni
Date: Wed, 23 Mar 2005 13:53:10 + From: Vumani Dlamini [EMAIL PROTECTED]
> Dear R users, I would like to sample from a mixture distribution p1*f(x1)+p2*f(x2). I usually sample variates from both distributions and weight them with their respective probabilities, but someone told me that was wrong. What is the correct way? Vumani
--
Giovanni Petris [EMAIL PROTECTED]
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/
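The coin-flip description can be sketched without an explicit loop. The function name `rmix_coin` and the choice of two normal components are assumptions made for this illustration only; any pair of generators could be substituted.

```r
# Hypothetical helper: two-component normal mixture via one biased coin
# flip per variate. mu and sigma hold the parameters of components 1 and 2.
rmix_coin <- function(n, p1, mu = c(0, 0), sigma = c(1, 1)) {
  from_first <- runif(n) < p1            # TRUE with probability p1
  comp <- ifelse(from_first, 1L, 2L)     # component label for each variate
  rnorm(n, mean = mu[comp], sd = sigma[comp])
}

set.seed(1)
x <- rmix_coin(10000, p1 = 0.6, mu = c(-1, 2), sigma = c(0.5, 0.5))

# The two components are well separated around 0.5, so the share of values
# below 0.5 should be close to p1
mean(x < 0.5)
```

By contrast, the weighted sum p1*x1 + p2*x2 of two independent normal variates is itself a single normal variate, so it cannot reproduce the two modes of the mixture.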
RE: [R] sampling from a mixture distribution
Here's one possible way:
  rmix2 <- function(n, p1, rF1, rF2, argF1=NULL, argF2=NULL) {
    ## n is the number of deviates to simulate
    ## p1 is the probability of a point coming from the 1st component
    ## rF1, rF2 are functions for generating random deviates
    ## from the two components
    ## argF1, argF2 are lists of arguments to rF1 and rF2
    n1 <- rbinom(1, n, p1)
    n2 <- n - n1
    x1 <- do.call(rF1, c(list(n1), argF1))
    x2 <- do.call(rF2, c(list(n2), argF2))
    c(x1, x2)
  }
To test:
  x <- rmix2(1000, 0.3, rnorm, rnorm, list(mean=5))
  hist(x)
HTH, Andy
From: Vumani Dlamini
> Dear R users, I would like to sample from a mixture distribution p1*f(x1)+p2*f(x2). I usually sample variates from both distributions and weight them with their respective probabilities, but someone told me that was wrong. What is the correct way? Vumani
[R] Negative binomial GLMMs in R
Dear R-users,
A recent post (Feb 16) to R-help inquired about fitting a GLMM with a negative binomial distribution. Professor Ripley responded that this was a difficult problem, with the simpler Poisson model already being a difficult case: https://stat.ethz.ch/pipermail/r-help/2005-February/064708.html
Since we are developing software for fitting general nonlinear random effects models, we thought this might be an interesting challenge. We contacted Professor Ripley, who kindly directed us to the epilepsy data in Venables & Ripley, section 10.4 (4th ed.). While V&R did not actually fit a negative binomial to these data, they did refer to evidence of overdispersion in the response. Fortunately, Booth et al. (2003) did attempt to fit this model with a negative binomial, which gave us something to which we could compare our results. Booth et al. fitted two forms of the model, a simpler one and a more complicated one. They reported some difficulty fitting the more complicated model. We found that we could reliably fit (by maximum likelihood) both the complicated and the simpler model in 20 seconds or less (although the more complicated model turns out to be overparameterized).
Using the random effects module of AD Model Builder, we have developed a shared library (Windows DLL) that can be called from R via the driver function glmm.admb(). The function can be downloaded from http://otter-rsch.com/admbre/examples/nbmm/nbmm.html
The two models of Booth et al. are fit by the commands:
  glmm.admb(y~Base*trt+Age+Visit,random=~1,group=subject,data=epil2)
  glmm.admb(y~Base*trt+Age+Visit,random=~Visit,group=subject,data=epil2)
I will be happy to receive feedback on the function glmm.admb().
Best regards, Hans Skaug
Reference: Booth J.G., Casella G., Friedl H., Hobert J.P. Negative binomial loglinear mixed models. Statistical Modelling, October 2003, vol. 3, no. 3, pp. 179-191.
Re: [R] sampling from a mixture distribution
> I would like to sample from a mixture distribution p1*f(x1)+p2*f(x2).
***Surely*** you mean ``p1*f1(x)+p2*f2(x)'' !!!
> I usually sample variates from both distributions and weight them with their respective probabilities, but someone told me that was wrong. What is the correct way?
If you want a sample of size n, first generate n1 by
  n1 <- rbinom(1,n,p1)
Then generate a vector x1 of n1 observations from the f1(x) distribution and a vector x2 of n2 = n-n1 observations from the f2(x) distribution. Finally, combine the two vectors of observations into a single vector:
  x <- c(x1,x2)
You can then shuffle the order of x
  x <- sample(x,n)
if you want to be obsessive about it.
cheers, Rolf Turner [EMAIL PROTECTED]
[R] Question on class 1, 2 output for RandomForest
Hi All,
I read the R newsletter, Volume 2/3, December 2002, page 18, and tried the example there, too. Then I used randomForest on a different data set from the UCI repository. The results for the credit data included 2 additional columns, a column 1 and a column 2, that the example given in the newsletter did not produce for the fgl data set. For the credit data, what does the output under the headings 1 and 2 imply for ntree=100...500 (below)? Does the 1 refer to the actual data (class 1) and the 2 to a group of synthetic data (class 2)? Did my random forest automatically default to unsupervised learning, create the class 2 synthetic data, and then classify the combined data? If so, which method did R use to generate the synthetic data? The newsletter states that there are 2 ways to generate synthetic data. Further, would the parameters to tune this randomForest ideally optimize the OOB error rate and whatever the column 1 and 2 error rates mean? I tried mtry=2, 3 and 10, but that didn't change the errors much. Are these results reasonable, or should I try to tune different parameters for this special case?
  ntree      OOB      1      2
    100:  20.72% 14.10% 28.99%
    200:  18.99% 13.58% 25.73%
    300:  19.71% 15.14% 25.41%
    400:  20.00% 14.10% 27.36%
    500:  19.13% 13.58% 26.06%
  Call: randomForest(x = V16 ~ ., data = credit, mtry = 3, importance = TRUE, do.trace = 100)
  Type of random forest: classification
  Number of trees: 500
  No. of variables tried at each split: 3
  OOB estimate of error rate: 19.86%
  Confusion matrix:
      -   + class.error
  - 326  57   0.1488251
  +  80 227   0.2605863
Thanks in advance, -Melanie
---
  # Read in the credit table
  credit = read.table(url('ftp://ftp.ics.uci.edu/pub/machine-learning-databases/credit-screening/crx.data'),sep=",")
  str(credit)
  credit$V2 = as.numeric(credit$V2)
  credit$V14 = as.numeric(credit$V14)
  str(credit)
  credit.rf <- randomForest(V16 ~ ., data=credit, mtry=3, importance = TRUE, do.trace=100)
  print(credit.rf)
Re: [R] Will R work on this 64 bit machine?...
I believe R will run out of the box on your setup. I personally haven't tried the RPMs, but you can always build R from the sources (fairly straightforward on a Linux box).
-roger
dsandif wrote:
> Hello, will R work on this 64 bit machine? Here are the specs of our Linux box:
>   Red Hat Enterprise Linux WS (v.3 Standard for AMD64 and Intel EM64T)
>   OS: redhat-release  Release: 3WS  CPU Arch: ia32e-redhat-linux
>   (4) GenuineIntel Intel(R) Xeon(TM) CPU 3.40GHz 3399 MHZ
>   Arch: EM64T  Cache: 1024 KB  Vendor: GenuineIntel  Memory: 2000 MB
>   Stepping: 1  Family: 15  Swap: 4000 MB
> I see that you have it for Unix machines and that you have it for the following Linux platforms:
>   Red Hat  i386    8/9/Fedora1/Fedora2/Fedora3  Martyn Plummer
>            x86_64  Fedora1                      James Henstridge
>            x86_64  Fedora3                      Brian Ripley
>            i386    Enterprise Linux             Matthew P. Cox
> Could I use the Fedora 3 x86_64 version? Thanks for your attention and help. D
--
Roger D. Peng http://www.biostat.jhsph.edu/~rpeng/
Re: [R] Will R work on this 64 bit machine?...
On Wed, 23 Mar 2005, dsandif wrote:
> Hello, will R work on this 64 bit machine?
Yes.
> Here are the specs of our Linux box:
>   Red Hat Enterprise Linux WS (v.3 Standard for AMD64 and Intel EM64T)
>   OS: redhat-release  Release: 3WS  CPU Arch: ia32e-redhat-linux
That's not at all clear: what is `ia32e'? Depending on what it means, R will work in 32- or 64-bit mode. My hesitation is that we tried RHEL3 on an AMD64 box without much happiness: it had lots of 32-bit components.
>   (4) GenuineIntel Intel(R) Xeon(TM) CPU 3.40GHz 3399 MHZ
>   Arch: EM64T  Cache: 1024 KB  Vendor: GenuineIntel  Memory: 2000 MB
>   Stepping: 1  Family: 15  Swap: 4000 MB
> I see that you have it for Unix machines and that you have it for the following Linux platforms:
>   Red Hat  i386    8/9/Fedora1/Fedora2/Fedora3  Martyn Plummer
>            x86_64  Fedora1                      James Henstridge
>            x86_64  Fedora3                      Brian Ripley
>            i386    Enterprise Linux             Matthew P. Cox
> Could I use the Fedora 3 x86_64 version?
Not likely (it has later software than RHEL3). Building from the sources should be straightforward, but watch the compiler versions. Martin Maechler had problems (with, I believe, RHEL3) and needed to update gcc and g77. The problem was with gcc 3.2.x; 3.3.3 and 3.4.3 are both fine.
--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
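Once R is installed, whether the build runs in 32- or 64-bit mode can be checked from within R itself. A minimal sketch; the printed values depend on the machine and the build, so none are shown.

```r
# Pointer size in bytes: 4 on a 32-bit build, 8 on a 64-bit build
bits <- 8 * .Machine$sizeof.pointer

# Architecture string recorded when R was configured
arch <- R.version$arch

cat("This R build is", bits, "bit; arch:", arch, "\n")
```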
[R] nl regression with 8 parameters, help!
I'm doing a nonlinear regression with 8 parameters to be fitted:
  J.Tl.nls <- nls(Gw ~ (a1/(1+exp(-a2*Tl+a3))+a4) * (b1/(1+exp(b2*Tl-b3))+b4),
                  data=Enveloppe,
                  start=list(a1=0.88957, a2=0.36298, a3=10.59241, a4=0.26308,
                             b1=0.391268, b2=1.041856, b3=0.391268, b4=0.03439))
First, I fitted my curve to my data by guessing the parameters' values (by hand) and wrote them down. Then I adjusted my model with only two free parameters at a time (while the others were fixed at previously found values), and did the same for all parameters. Finally, I got 8 fitted values that I eventually put into my nls() call, as above, yet R told me:
  Error in nlsModel(formula, mf, start) : singular gradient matrix at initial parameter estimates
Should I use optim() or optimize()? How could I do that? Thanks for your help,
Guillaume Storchi
RE: [R] Question on class 1, 2 output for RandomForest
The `1' and `2' columns are the error rates within those classes. E.g., the last row of the `1' column should correspond to the class.error for -, and the last row of the `2' column to the class.error for +. (I would have thought that that should be fairly obvious, but I guess not. It mimics what Breiman and Cutler's Fortran code does.) I suspect you showed us the output from two different runs, so they don't match. It does for me:
  library(randomForest)
  randomForest 4.5-4
  Type rfNews() to see new features/changes/bug fixes.
  credit <- read.csv(url("ftp://ftp.ics.uci.edu/pub/machine-learning-databases/credit-screening/crx.data"), header=FALSE, na.string="?")
  credit.rf <- randomForest(V16~., credit, imp=T, do.trace=100, na.action=na.omit)
  ntree      OOB      1      2
    100:  20.37% 14.01% 28.04%
    200:  21.59% 15.41% 29.05%
    300:  20.52% 13.45% 29.05%
    400:  20.52% 13.17% 29.39%
    500:  20.21% 12.61% 29.39%
  credit.rf
  Call: randomForest(x = V16 ~ ., data = credit, imp = T, do.trace = 100, na.action = na.omit)
  Type of random forest: classification
  Number of trees: 500
  No. of variables tried at each split: 3
  OOB estimate of error rate: 20.21%
  Confusion matrix:
      -   + class.error
  - 312  45   0.1260504
  +  87 209   0.2939189
The article in R News was written for the first version of the package. It has changed quite a bit in many respects since then. The `class error' may be important, e.g., if one of the classes only makes up a small proportion of the data.
Andy
Re: [R] R accuracy
Try signif(m,8)

At 1:03 AM -0800 3/23/05, Anthony Landrevie wrote:
> Hello, I am trying to test the precision of R on datasets from The Statistical Reference Datasets Project http://www.itl.nist.gov/div898/strd/index.html and I don't manage to understand how R is storing its results. For example, I calculate a mean on the michelso dataset (100 values) and find:
>   m=mean(michel)
>   m
>   V1 299.8524
>   print(m,digits=15)
>   V1 299.8524
>   print(m,digits=22)
>   V1 299.852393
> The certified value of the mean is 299.85240, so I try
>   print(m-299.8524)
>   V1 -5.684342e-14
>   print(m-299.8524,digits=15)
>   V1 -5.68434188608080e-14
> Does it make sense to print more than 15 significant digits? Why is the difference not equal to zero? I am using R 2.0.1 under Windows XP. Regards, Anthony Landrevie

-- Don MacQueen, Environmental Protection Department, Lawrence Livermore National Laboratory, Livermore, CA, USA
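The difference reported in the question, 5.684342e-14, happens to equal 2^-44, which is the spacing between adjacent double-precision numbers between 256 and 512. In other words, the computed mean and the literal 299.8524 appear to differ by a single unit in the last place. A minimal sketch of how such values compare; the numbers 0.1, 0.2 and 0.3 are illustrative stand-ins, not the NIST data:

```r
# Illustrative values only; not the michelso data from the thread.
a <- 0.1 + 0.2
b <- 0.3

a == b                     # FALSE: the two doubles differ in the last bit
isTRUE(all.equal(a, b))    # TRUE: equal within all.equal()'s default tolerance
signif(a, 8) == signif(b, 8)   # compare after rounding to 8 significant digits

# Spacing between adjacent doubles in [256, 512), the range containing 299.8524
ulp <- 2^(8 - 52)
print(ulp, digits = 7)     # 5.684342e-14
```

Printing more than about 15 to 17 significant digits therefore shows representation noise rather than extra accuracy.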
Re: [R] alternative to 'groups' for lattice bwplot()
Mulholland, Tom wrote:
> I'm afraid you have lost me. What is it that you want that reordering the formula does not achieve?
>   bwplot(yield ~ year | site, data = barley)
> has sites next to each other.

Yes, they are next to each other, but in different panels, as expected when using a formula like that. I should have been more explicit and said that I want the conditioning variable to show within a panel.

> If the lattice structure is your issue (it appears you wish to remove the structure and replace it with a wider space) then I guess you might find writing your own code easier than forcing lattice to be something other than itself.

I disagree. IMHO, the sole purpose of lattice is not to put plots in different panels. There are several cases where (I think) lattice can mark groups of data in a single panel more efficiently than other tools. One may or may not need other conditioning variables to show in different panels.

-- Sebastian P. Luque
Re: [R] alternative to 'groups' for lattice bwplot()
On Wednesday 23 March 2005 00:10, Sebastian Luque wrote:
> Hi, Is there some alternative to the 'groups' argument in lattice's bwplot function for boxplots? Say in the example below:
>   bwplot(yield ~ site | year, data = barley)
> you want to have two side by side boxplots per site, corresponding to each year in the barley data frame. Ideally, the space between boxplots of the same site should be smaller than that between boxplots of different sites. This seemed like a job for the 'groups' argument, but panel.bwplot doesn't take it. I saw that boxplot() might do this for the particular example above, but not for a more complex one with additional conditioning variables (as in my actual problem).

I consider bwplot to already provide a grouped display (box plots are univariate summaries, and bwplot allows you to display several of them together within a panel). What you are looking for may be appropriate in certain situations, but is not general enough to warrant a built-in implementation. In other words, you'll have to write your own panel function.

> I thought I'd find something about this in the archives, but I'm either not using the right keywords or the question hasn't come up yet.

The only instance I can recall is: http://tolstoy.newcastle.edu.au/R/help/04/02/0848.html

Deepayan
Re: [R] nl regression with 8 parameters, help!
Does this error always occur, independently of the starting values that you provide? I guess so, because I think that the parameters in your equation are not identifiable: the first term (a1 to a4) is identical to the second term (b1 to b4) with a1 = b1, -a2 = b2, a3 = -b3, and a4 = b4. Do you really want to have the same explanatory variable (Tl) in both terms? Arne

On Wednesday 23 March 2005 16:28, Guillaume STORCHI wrote:
> I'm doing a non linear regression with 8 parameters to be fitted:
>   J.Tl.nls <- nls(Gw ~ (a1/(1+exp(-a2*Tl+a3))+a4) * (b1/(1+exp(b2*Tl-b3))+b4),
>                   data = Enveloppe,
>                   start = list(a1=0.88957, a2=0.36298, a3=10.59241, a4=0.26308,
>                                b1=0.391268, b2=1.041856, b3=0.391268, b4=0.03439))
> First, I fitted my curve on my data by guessing the parameters' values (by hand), and wrote them. Then, I adjusted my model with only two parameters (whereas the others were fixed with previously found values); I did it the same way for all parameters. Finally, I got 8 fitted values that I eventually embedded in my nls() function, like above, yet R told me:
>   Error in nlsModel(formula, mf, start) : singular gradient matrix at initial parameter estimates
> Should I use optim() or optimize()? How could I perform it? Thanks for help, Guillaume Storchi

-- Arne Henningsen, Department of Agricultural Economics, University of Kiel, Olshausenstr. 40, D-24098 Kiel (Germany). Tel: +49-431-880 4445, Fax: +49-431-880 1397, [EMAIL PROTECTED], http://www.uni-kiel.de/agrarpol/ahenningsen/
[R] How to do such MDS in R
I know that cmdscale and isoMDS in R can do classical and non-metric MDS, but I want to know whether there are packages that can carry out individual differences scaling and multidimensional analysis of preference. Both methods are important, but I cannot find any clue on how to do them in R. Can anyone help? Thank you!
[R] how to test for equality of covariance Matrices in lda
When using two-group discriminant analysis (lda), we need to test for equality of the covariance matrices. When we form our estimate of the within-group covariance matrix by pooling across groups, we implicitly assume that the covariance structure is the same across groups, so it seems important to test that equality. But I cannot find a function in R to do this.
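No reply to this question appears in this part of the archive. One classical test for this hypothesis is Box's M test; the sketch below is a hand-written illustration of its chi-square approximation, not a function from any package mentioned in the thread. The function name box_m and the use of two species from the built-in iris data are assumptions made purely for the example. Box's M is known to be sensitive to departures from multivariate normality, so the result should be read with care.

```r
# Hypothetical helper: Box's M test for equal covariance matrices across groups.
box_m <- function(X, group) {
  X <- as.matrix(X)
  group <- factor(group)
  p <- ncol(X); g <- nlevels(group); N <- nrow(X)
  n <- as.vector(table(group))                          # group sizes
  covs <- lapply(split(as.data.frame(X), group), cov)   # per-group covariances
  pooled <- Reduce(`+`, Map(function(S, ni) (ni - 1) * S, covs, n)) / (N - g)
  logdet <- function(S) as.numeric(determinant(S, logarithm = TRUE)$modulus)
  M <- (N - g) * logdet(pooled) - sum((n - 1) * sapply(covs, logdet))
  # Box's correction factor for the chi-square approximation
  cf <- (sum(1 / (n - 1)) - 1 / (N - g)) *
    (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (g - 1))
  stat <- (1 - cf) * M
  df <- p * (p + 1) * (g - 1) / 2
  list(statistic = stat, df = df,
       p.value = pchisq(stat, df, lower.tail = FALSE))
}

# Two-group illustration with built-in data (not the poster's data)
iris2 <- droplevels(subset(iris, Species != "setosa"))
res <- box_m(iris2[, 1:4], iris2$Species)
res
```

A small p-value would suggest that pooling the covariance matrices, as lda() does, is questionable; qda() in MASS fits a separate covariance matrix per group.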
Re: [R] mixtures as outcome variables
Jason W. Martinez wrote:
> Dear R-users, I have an outcome variable and I'm unsure about how to treat it. Any advice? I have spending data for each county in the state of California (N=58). Each county has been allocated money to spend on any one of the following four categories: A, B, C, and D. Each county may spend the money in any way they see fit. This also means that the county need not spend all the money that was allocated to them. The data structure looks something like the one below:
>   COUNTY A B C D Total
>   alameda 2534221 192 2835475 3063249 9988537
>   alpine 3174 850004555855232
>   amador 0 0 0 0 0
> The goal is to explain variation in spending patterns, which are presumably the result of characteristics for each county. I may treat the problem like a simple linear regression problem for each category, but by definition, money spent in one category will take away the amount of money that can be spent in any other category, and each county is not allocated the same amount of money to spend. I have constructed proportions of amount spent on each category and have conducted quasibinomial regression on each dependent outcome, but that does not seem very convincing to me. Would anyone have any advice about how to treat an outcome variable of this sort? Thanks for any hints! Jason

If you concentrate only on the relative proportions, these are called compositional data. If your data are in mydata (n x 4), you obtain compositions by

  sweep(mydata, 1, apply(mydata, 1, sum), "/")

There are not (AFAIK) specific functions/packages in R for compositional data, but you can try googling. Aitchison has a monograph (Chapman & Hall) and a paper in JRSS B. One way to start might be lm's or anova on the symmetric logratio transform of the compositions. The R function lm can take a multivariate response, but some extra programming will be needed for interpretation.

With simulated data:

  > slr <- function(y) {   # y should sum to 1
  +   v <- log(y)
  +   return( v - mean(v) )
  + }
  > testdata <- matrix( rgamma(120, 2, 3), 30, 4)
  > str(testdata)
   num [1:30, 1:4] 0.200 0.414 0.311 2.145 0.233 ...
  > comp <- sweep(testdata, 1, apply(testdata, 1, sum), "/")
  > # To get the symmetric logratio transform:
  > comp <- t(apply(comp, 1, slr))
  > # Observe:
  > apply(cov(comp), 1, sum)
  [1] -5.551115e-17  2.775558e-17  5.551115e-17 -2.775558e-17
  > lm( comp ~ 1)
  Call: lm(formula = comp ~ 1)
  Coefficients:
                  [,1]      [,2]      [,3]      [,4]
  (Intercept)  0.17606   0.06165  -0.03783  -0.19988
  > summary(lm( comp ~ 1))
  Response Y1 :
  Call: lm(formula = Y1 ~ 1)
  Residuals:
       Min       1Q   Median       3Q      Max
  -1.29004 -0.46725 -0.07657  0.55834  1.20551
  Coefficients:
       Estimate Std. Error t value Pr(>|t|)
  [1,]   0.1761     0.1265   1.391    0.175
  Residual standard error: 0.6931 on 29 degrees of freedom

  Response Y2 :
  Call: lm(formula = Y2 ~ 1)
  Residuals:
      Min      1Q  Median      3Q     Max
  -1.2982 -0.5711 -0.1355  0.5424  1.6598
  Coefficients:
       Estimate Std. Error t value Pr(>|t|)
  [1,]  0.06165    0.15049    0.41    0.685
  Residual standard error: 0.8242 on 29 degrees of freedom

  Response Y3 :
  Call: lm(formula = Y3 ~ 1)
  Residuals:
       Min       1Q   Median       3Q      Max
  -1.97529 -0.41115  0.03666  0.42785  0.88567
  Coefficients:
       Estimate Std. Error t value Pr(>|t|)
  [1,] -0.03783    0.11623  -0.325    0.747
  Residual standard error: 0.6366 on 29 degrees of freedom

  Response Y4 :
  Call: lm(formula = Y4 ~ 1)
  Residuals:
      Min      1Q  Median      3Q     Max
  -2.8513 -0.3955  0.2815  0.5939  1.2475
  Coefficients:
       Estimate Std. Error t value Pr(>|t|)
  [1,]  -0.1999     0.1620  -1.234    0.227
  Residual standard error: 0.8872 on 29 degrees of freedom

Sorry for not being of more help! Kjetil

-- Kjetil Halvorsen. Peace is the most effective weapon of mass construction. -- Mahdi Elmandjra
[R] nested random effects
Hi, I am struggling with nested random effects and hope someone can help. I have individuals (ID) who are nested within families (FAM). I want to model an outcome variable and take account of the intercorrelation of individuals within each family. I think this amounts to two random effects, one nested within the other. How can I model this in R? So far I have tried library(nlme) with

  Y~ID, random=~1|ID*FAM

but this isn't working. Thanks, Philip
Re: [R] Question on statistics
Roy Werkman wrote:
> Yes, it is discrete, but the underlying distribution is Gaussian.

I guess you mean what some call the superpopulation distribution. Kjetil

> Just got the following from a colleague:
>   Var(mean of finite population) = ((N - n)/(N - 1)) * var(population) / n
> This should be it... Greetings, Roy
>
> -----Original Message----- From: Liaw, Andy Sent: Wednesday, March 23, 2005 2:17 PM To: Roy Werkman; r-help@stat.math.ethz.ch Subject: RE: [R] Question on statistics
> > If the sample is drawn with replacement from the finite population, then the usual formula applies (assuming iid samples); i.e., var(sample mean) = var(population) / n. There's some problem in your description: a finite population, I believe, is necessarily discrete (since there are only N possible values), so it cannot be Gaussian (i.e., normal). Andy
> >
> > From: Roy Werkman
> > > Ehh, by limited distribution, I meant to say a population of N points. ...
> > > > Hi, Can anyone help me with the following (although not directly related to R functionality)? I have been looking on the internet but cannot find the answer. My question: what is the variance of the mean of a limited distribution (total N points, normally distributed) when I have a small sample of that distribution (n << N)? Your help would be very welcome. Thanx, Roy

-- Kjetil Halvorsen
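The finite population correction quoted in this thread can be checked by brute force. The sketch below assumes simple random sampling without replacement and a population variance defined with divisor N; the function name var_mean_fpc and the five-value population are made up for illustration.

```r
# Variance of the sample mean when n of N values are drawn without replacement
var_mean_fpc <- function(pop, n) {
  N <- length(pop)
  sigma2 <- mean((pop - mean(pop))^2)   # population variance, divisor N
  (N - n) / (N - 1) * sigma2 / n
}

# Brute-force check: every possible sample of size 2 from a tiny population
pop <- c(2, 4, 4, 7, 9)
means <- combn(pop, 2, FUN = mean)        # all 10 equally likely sample means
brute <- mean((means - mean(means))^2)    # their exact variance

all.equal(brute, var_mean_fpc(pop, 2))    # TRUE
```

When n is much smaller than N the factor (N - n)/(N - 1) is close to 1 and the usual var(population)/n is recovered.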
Re: [R] nl regression with 8 parameters, help!
Arne Henningsen writes:
> Does this error always occur independently of the starting values that you provide? I guess so, because I think that the parameters in your equation are not identifiable, since the first term (a1 to a4) is identical to the second term (b1 to b4) with a1 = b1, -a2 = b2, a3 = -b3, and a4 = b4. Do you really want to have the same explanatory variable (Tl) in both terms?

That's not necessarily a problem. There will of course always be two solutions, but the algorithm may still converge to one of them. This happens all the time with biexponential curves, for example. However, in this case we have a local unidentifiability too: if you multiply a1 and a4 by a constant and divide b1 and b4 by the same constant, you get the same fitted values. This is reflected in the singular gradient.

> On Wednesday 23 March 2005 16:28, Guillaume STORCHI wrote:
> > I'm doing a non linear regression with 8 parameters to be fitted:
> >   J.Tl.nls <- nls(Gw ~ (a1/(1+exp(-a2*Tl+a3))+a4) * (b1/(1+exp(b2*Tl-b3))+b4),
> >                   data = Enveloppe,
> >                   start = list(a1=0.88957, a2=0.36298, a3=10.59241, a4=0.26308,
> >                                b1=0.391268, b2=1.041856, b3=0.391268, b4=0.03439))

-- Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Blegdamsvej 3, 2200 Cph. N, Denmark. Ph: (+45) 35327918, FAX: (+45) 35327907, ([EMAIL PROTECTED])
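The local unidentifiability described above can be seen numerically. In this sketch the model and starting values are taken from the original post, while the function name f, the scaling constant k and the grid of Tl values are hypothetical choices for illustration.

```r
# Model from the thread, as a function of Tl and a named parameter vector
f <- function(Tl, p) {
  with(as.list(p),
       (a1 / (1 + exp(-a2 * Tl + a3)) + a4) *
       (b1 / (1 + exp(b2 * Tl - b3)) + b4))
}

# Starting values quoted in the original post
p0 <- c(a1 = 0.88957, a2 = 0.36298, a3 = 10.59241, a4 = 0.26308,
        b1 = 0.391268, b2 = 1.041856, b3 = 0.391268, b4 = 0.03439)

# Multiply a1 and a4 by k, divide b1 and b4 by k
k <- 3
p1 <- p0
p1[c("a1", "a4")] <- p0[c("a1", "a4")] * k
p1[c("b1", "b4")] <- p0[c("b1", "b4")] / k

Tl <- seq(0, 40, by = 5)          # hypothetical temperatures
all.equal(f(Tl, p0), f(Tl, p1))   # TRUE: different parameters, same curve
```

Because a whole family of parameter vectors gives identical fitted values, the gradient matrix cannot have full rank. One possible remedy, not proposed in the thread, would be to fix one of the four scale parameters (for example a1 = 1) and estimate the remaining seven.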
Re: [R] How to do such MDS in R
On 21 Mar 2005, at 13:29, ronggui wrote:
> I know cmdscale and isoMDS in R can do classical and non-metric MDS, but I want to know if there are packages that can carry out individual differences scaling and multidimensional analysis of preference. Both methods are important, but I cannot find any clue on how to do it using R. Can anyone help? Thank you!

It may be that individual differences scaling is not available in R. The classic piece of software for this purpose is SINDSCAL. It is beautiful Fortran (although this sounds like a contradiction in terms), and it would be easy to port the software into R, but I think the license does not allow this. The hardest bit would be to change the output into R. I suggest you dig up SINDSCAL somewhere (it could be in netlib) and compile it yourself. GNU g77 is quite OK.

cheers, jari oksanen
-- Jari Oksanen, Oulu, Finland
[R] Does R work in 64 bit on apple G5?
Hi, I am working with R on a dual 1.8 GHz G5 from Apple under OS X 10.3.8. The G5 chip is 64-bit, but does R run in 64-bit or 32-bit mode under OS X? How can I find out? I think it runs in 32 bits, but I'm not sure. Anyway, thanks for this fabulous software ;-) David
Re: [R] nested random effects
On Wed, 2005-03-23 at 11:58 -0500, Shaw, Philip (NIH/NIMH) wrote:
> Hi, I am struggling with nested random effects and hope someone can help. I have individuals (ID) who are nested within families (FAM). I want to model an outcome variable, and take account of the intercorrelation of individuals within each family. I think this amounts to two random effects, one nested within the other. How can I model this in R? So far I have tried using library(nlme), and then
>   Y~ID, random=~1|ID*FAM

An interaction random effect/fixed effect is written as random = ~1|random/fixed, in your case random = ~1|ID/FAM (but I don't understand why individuals within families are fixed and families are random, but there you go). Check out Pinheiro and Bates Ch. 1, especially pg 23 onwards.

Cheers, F
-- Federico C. F. Calboli, Department of Epidemiology and Public Health, Imperial College, St Mary's Campus, Norfolk Place, London W2 1PG. Tel +44 (0)20 7594 1602, Fax (+44) 020 7594 3193, f.calboli [.a.t] imperial.ac.uk, f.calboli [.a.t] gmail.com
Re: [R] Does R work in 64 bit on apple G5?
On Wed, 23 Mar 2005, David Ruau wrote:
> Hi, I am working with R on 2xG5 1.8Ghz from Apple under 10.3.8. The G5 chip is 64 bits but does R run in 64 bit or 32 under OS X? How can I know? I think it runs in 32 bits... but not sure...

Under the current OS X it runs 32-bit. You can tell by looking at .Machine$sizeof.pointer, which is 4. The next version of OS X is advertised as having full 64-bit support, so this limitation will go away then.

-thomas
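The check described above can be turned into a one-liner. This is a small sketch; the variable name bits is arbitrary.

```r
# Pointer size in bytes times 8 gives the word size of the running R build
bits <- 8 * .Machine$sizeof.pointer
cat("This R build uses", bits, "bit pointers\n")

# The platform string reported by R is another hint (e.g. the CPU architecture)
R.version$arch
```

A value of 4 for .Machine$sizeof.pointer corresponds to a 32-bit build and 8 to a 64-bit build.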
Re: [R] Does R work in 64 bit on apple G5?
On Wed, 23 Mar 2005, David Ruau wrote:
> I am working with R on 2xG5 1.8Ghz from Apple under 10.3.8. The G5 chip is 64 bits but does R run in 64 bit or 32 under OS X? How can I know?

From the size of the ncells!

32-bit machine:
  > gc()
           used (Mb) gc trigger (Mb)
  Ncells 144907  3.9         35  9.4
  Vcells  61911  0.5     786432  6.0

64-bit machine:
  > gc()
           used (Mb) gc trigger (Mb)
  Ncells 141134  7.6         35 18.7
  Vcells  63088  0.5     786432  6.0

The ncells are 28 bytes on a 32-bit machine and usually 56 on a 64-bit machine (depending on alignment needs).

> I think it runs in 32 bits... but not sure...

The precompiled binary is definitely 32-bit. If you compiled R yourself I suspect you would know if you used a 64-bit compiler (if you had one).

-- Brian D. Ripley, [EMAIL PROTECTED], Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK. Tel: +44 1865 272861 (self), +44 1865 272866 (PA), Fax: +44 1865 272595
Re: [R] mixtures as outcome variables
Jason W. Martinez, 03/22/05 04:11PM:
> Dear R-users, I have an outcome variable and I'm unsure about how to treat it. Any advice?

Below are a couple of ideas/suggestions of things to think about.

> I have spending data for each county in the state of California (N=58). Each county has been allocated money to spend on any one of the following four categories: A, B, C, and D. Each county may spend the money in any way they see fit. This also means that the county need not spend all the money that was allocated to them. The data structure looks something like the one below:

You might want to include a category for the amount of money not spent (for a total of 5 possibilities).

>   COUNTY A B C D Total
>   alameda 2534221 192 2835475 3063249 9988537
>   alpine 3174 850004555855232
>   amador 0 0 0 0 0
> The goal is to explain variation in spending patterns, which are presumably the result of characteristics for each county.

Do you have data representing these characteristics, i.e. the predictor values in a regression-type model? Starting with some good graphics may help determine and show interesting patterns. The maptools package can read in shapefiles and plot the maps. You can download a shapefile with the county boundaries from: http://www.census.gov/geo/www/cob/co2000.html Then you could use the symbols function to plot a star in the center of each county (use get.Pcent from maptools to find the coordinates of the centers). Then just look for groups of counties with similar looking stars, or stars that are different from those close by (I would use the percentage spent in each category for the lengths of the star spokes). Another graph that may prove interesting is the trilinear plot (see the article in Chance from the summer of 2002). Combine your categories into 3 groups (e.g. AB vs. CD vs. not spent; or A vs. B vs. all others), then plot each county's spending on the trilinear plot (functions to do the plot are: triangle.plot in ade4, triplot in klaR, or I have some code that I wrote (not on CRAN yet)). Look for clusters of counties in these plots.

> I may treat the problem like a simple linear regression problem for each category, but by definition, money spent in one category will take away the amount of money that can be spent in any other category, and each county is not allocated the same amount of money to spend. I have constructed proportions of amount spent on each category and have conducted quasibinomial regression on each dependent outcome, but that does not seem very convincing to me. Would anyone have any advice about how to treat an outcome variable of this sort?

Here are a couple of thoughts (there may be better options). Assuming that you have some predictor (x) variables about each county: use the multinom function in the nnet package, the idea being that each dollar spent follows a multinomial with certain probabilities as to which category it will be spent in, and the predictors tell you what the probabilities are. Similarly, you could use package rpart to do a tree model: use the category as the outcome and the percentage spent on the category as the weights (each county would be spread across 4 or 5 lines of the dataset with the predictors replicated on each line). rpart gives the probabilities/proportions for each category based on splits of the predictor variables.

> Thanks for any hints! Jason
> -- Jason W. Martinez, Graduate Student, University of California, Riverside, Department of Sociology. E-mail: [EMAIL PROTECTED]

Hope this helps,
Greg Snow, Ph.D., Statistical Data Center, [EMAIL PROTECTED], (801) 408-8111
RE: [R] nested random effects
> An interaction random effect/fixed effect is noted as random ~1|random/fixed, in your case random =~1|ID/FAM (but I don't understand why individuals within families are fixed and families are random, but there you go).

1. Fixed effects cannot be nested within random effects.
2. The random specification is backwards: nesting, |g1/g2/g3..., is outer to inner, and so FAM/ID.

> Check out Pinheiro and Bates Ch1, especially pg 23 onwards.

Indeed. See the Worker/Machine example on p. 24 for outer to inner nesting.

-- Bert Gunter
RE: [R] nested random effects
It should be random=~1|FAM/ID, indicating that individuals are nested within families.

-----Original Message----- From: [EMAIL PROTECTED] On Behalf Of Federico Calboli Sent: Wednesday, March 23, 2005 12:34 PM To: Shaw, Philip (NIH/NIMH) Cc: r-help Subject: Re: [R] nested random effects
> On Wed, 2005-03-23 at 11:58 -0500, Shaw, Philip (NIH/NIMH) wrote:
> > Hi, I am struggling with nested random effects and hope someone can help. I have individuals (ID) who are nested within families (FAM). I want to model an outcome variable, and take account of the intercorrelation of individuals within each family. I think this amounts to two random effects, one nested within the other. How can I model this in R? So far I have tried using library(nlme), and then
> >   Y~ID, random=~1|ID*FAM
> An interaction random effect/fixed effect is noted as random ~1|random/fixed, in your case random =~1|ID/FAM (but I don't understand why individuals within families are fixed and families are random, but there you go). Check out Pinheiro and Bates Ch1, especially pg 23 onwards. Cheers, F
> -- Federico C. F. Calboli
Re: [R] Does R work in 64 bit on apple G5?
Thanks, I was sure the precompiled version was 32-bit, but not about a version you compile yourself. gc() and .Machine$sizeof.pointer give the same information on OS X client with a precompiled version and on OS X Server with a home-compiled version:

  > .Machine$sizeof.pointer
  [1] 4
  > gc()
           used (Mb) gc trigger (Mb)
  Ncells 140949  3.8         35  9.4
  Vcells  52967  0.5     786432  6.0

Has anybody used R with Xgrid? I am trying, but it's not so easy to send the R job to the controller. David

On Mar 23, 2005, at 18:44, Thomas Lumley wrote:
> On Wed, 23 Mar 2005, David Ruau wrote:
> > Hi, I am working with R on 2xG5 1.8Ghz from Apple under 10.3.8. The G5 chip is 64 bits but does R run in 64 bit or 32 under OS X? How can I know? I think it runs in 32 bits... but not sure...
> Under the current OS X it runs 32bit. You can tell by looking at .Machine$sizeof.pointer which is 4. The next version of OS X is advertised as having full 64bit support so this limitation will go away then. -thomas
RE: [R] nested random effects
On Wed, 2005-03-23 at 10:04 -0800, Berton Gunter wrote:
> > An interaction random effect/fixed effect is noted as random ~1|random/fixed, in your case random =~1|ID/FAM (but I don't understand why individuals within families are fixed and families are random, but there you go).
> 1. Fixed effects cannot be nested within random effects.
> 2. The random specification is backwards: nesting, |g1/g2/g3..., is outer to inner, and so FAM/ID.

The original question had Y~ID, so I assumed ID was/is fixed. I have my reservations about that, but who am I to decide? It's not my data, and anyway I have not seen it.

F
-- Federico C. F. Calboli, Department of Epidemiology and Public Health, Imperial College, St Mary's Campus, Norfolk Place, London W2 1PG. Tel +44 (0)20 7594 1602, Fax (+44) 020 7594 3193, f.calboli [.a.t] imperial.ac.uk, f.calboli [.a.t] gmail.com
RE: [R] nested random effects
I should have added that if you have only one Y observation per ID (within family), then the ID variance component is residual error and the model becomes (without any covariates)

  Y~1, rand=~1|FAM

-- Bert

> On Wed, 2005-03-23 at 11:58 -0500, Shaw, Philip (NIH/NIMH) wrote:
> > Hi, I am struggling with nested random effects and hope someone can help. I have individuals (ID) who are nested within families (FAM). I want to model an outcome variable, and take account of the intercorrelation of individuals within each family. I think this amounts to two random effects, one nested within the other. How can I model this in R? So far I have tried using library(nlme), and then
> >   Y~ID, random=~1|ID*FAM
> An interaction random effect/fixed effect is noted as random ~1|random/fixed, in your case random =~1|ID/FAM (but I don't understand why individuals within families are fixed and families are random, but there you go). Check out Pinheiro and Bates Ch1, especially pg 23 onwards. Cheers, F
> -- Federico C. F. Calboli
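The two specifications discussed in this thread can be tried on simulated data. The sketch below uses lme() from nlme; the data frame d, the effect sizes and the group counts are invented for illustration and have nothing to do with the poster's data.

```r
library(nlme)

set.seed(1)
# 30 families, 3 members each, 2 repeated observations per member (made up)
d <- expand.grid(rep = 1:2, member = 1:3, FAM = 1:30)
d$ID  <- interaction(d$FAM, d$member, drop = TRUE)   # unique individual label
d$FAM <- factor(d$FAM)

fam_eff <- rnorm(nlevels(d$FAM), sd = 1.0)   # family-level random effect
id_eff  <- rnorm(nlevels(d$ID),  sd = 0.7)   # individual-level random effect
d$Y <- 10 + fam_eff[as.integer(d$FAM)] + id_eff[as.integer(d$ID)] +
  rnorm(nrow(d), sd = 0.5)

# Repeated observations per individual: individuals nested in families,
# written outer to inner as FAM/ID
fit_nested <- lme(Y ~ 1, random = ~ 1 | FAM/ID, data = d)
VarCorr(fit_nested)

# Only one observation per individual: the ID component is absorbed into the
# residual, so a family-level random intercept is all that can be estimated
d1 <- d[d$rep == 1, ]
fit_fam <- lme(Y ~ 1, random = ~ 1 | FAM, data = d1)
VarCorr(fit_fam)
```

Covariates, if any, would go on the right-hand side of the fixed formula in place of the bare intercept.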
[R] go to msn!!!!
Some people think that this server is like msn messenger! I see that lots of people talk about some uninteresting things like G5 stuff or whatever, but nobody is able to think about real useRs' problems! Guillaume Storchi
[R] Looking for function for Double to raw to double conversions
Hello,

I am trying to implement functions for reading/writing some XML file format. One feature of that XML format is that a lot of binary data is stored in Base64 format, and since R's XML package does not seem to support it, I just wrote my own converter from raw format to Base64 and back. However, one place where I have problems is the conversion from a vector of doubles to a vector of raws. I was expecting an equivalent of the C casting operation, like:

    double *doubleVec;
    unsigned char* rawVec;
    rawVec = (unsigned char*) doubleVec;

Or, if the compiler complains:

    rawVec = (unsigned char*) ((void*) doubleVec);

Unfortunately I cannot find an equivalent function in R. The simple-minded

    doubleVec = (1:4)*pi
    rawVec = as.raw(doubleVec)
    as.double(rawVec)

does not seem to work (output: 3 6 9 12, instead of: 3.141593 6.283185 9.424778 12.566371). The only way I figured out how to do it is by using:

    raw2double = function(x) {
      writeBin(as.raw(x), "temp.bin")
      return( readBin("temp.bin", "double", n=length(x)%/%8) )
    }
    double2raw = function(x) {
      writeBin(as.double(x), "temp.bin")
      return( readBin("temp.bin", "raw", n=length(x)*8) )
    }

Then:

    rawVec = double2raw(doubleVec)
    raw2double(rawVec)

gives correct results. Is there any other way to do this simple casting that does not use temporary files and does not involve writing my own C code (which I am trying to avoid)?

Jarek

Jarek Tuszynski, PhD.
Science Applications International Corporation
(703) 676-4192
[EMAIL PROTECTED]
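For readers on a current version of R, the round trip asked about here can be done in memory, because writeBin() and readBin() accept a raw vector in place of a connection. This is a sketch under that assumption; the R version used in the original post may not have supported it. The function names simply mirror the poster's.

```r
# Reinterpret doubles as bytes and back, without a temporary file.
double2raw <- function(x) writeBin(as.double(x), raw())
raw2double <- function(r) readBin(r, what = "double", n = length(r) %/% 8)

doubleVec <- (1:4) * pi
rawVec <- double2raw(doubleVec)   # 8 bytes per double
back <- raw2double(rawVec)

length(rawVec)
identical(back, doubleVec)
```

Both functions use the platform's native byte order by default; if the bytes are meant for a file format with a fixed byte order, the endian argument of writeBin() and readBin() would need to be set explicitly.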
[R] R on red hat 2.1 problem while trying to generate image
Running R 1.9.1 under Red Hat 2.1. When we try to generate an image, we get the following error:

    plot(rnorm(100))
    Error in PS(file, old$paper, old$family, old$encoding, old$bg, old$fg, :
      unable to start device PostScript
    In addition: Warning message:
    cannot open `postscript' file argument `Rplots.ps'

If anyone can throw some light on this, or give any pointers as to why we are facing this problem, it would be really helpful.

Thanks,
Sandeep.
[R] Courses***April R/Splus Advanced and Intermediate level courses by XLSolutions
Here are our April courses:

R/Splus Advanced Programming: March 31st - April 1st, San Francisco
    http://www.xlsolutions-corp.com/Radv.htm
R/Splus Programming Techniques: April 14th - April 15th, New York City
    http://www.xlsolutions-corp.com/Rfund.htm
Microarrays Data Analysis with R/S+ and GGobi: April 27th-28th, Princeton
    http://www.xlolutions-corp.com/Rarrays.htm

Please email for our May and summer schedule, and ask for group discounts.

Email Sue Turner: [EMAIL PROTECTED]
Phone: 206-686-1578
Visit us: www.xlsolutions-corp.com/training.htm

Please let us know if you and your colleagues are interested in this class, to take advantage of the group discount. Register now to secure your seat!

Cheers,
Elvis Miller, PhD
Manager Training
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
[EMAIL PROTECTED]
RE: [R] Gini's Importance Value Variable = Inf
That result looks fishy: not only should there be no Inf, but there shouldn't be negative values in that measure either (look at V6). I will look into it.

I hope by now you realize that there's not much point in asking such package-specific questions on R-help... Not all package maintainers are on R-help, and they are the best persons to ask package-specific questions or report bugs.

Andy

> From: Melanie Vida
>
> Hi All,
>
> In the script below, the importance measure for column 4 (i.e.
> MeanDecreaseGini) indicated Inf for V7. Running the getTree command
> showed that V7 had been selected at least twice in one of the trees for
> Random Forest. So the Inf was not generated as a result of dividing the
> sum of the decreases by 0. Any suggestions on what may be causing the
> Inf in V7 would be helpful.
>
> Thanks in advance,
> -Melanie
>
>     library(randomForest)
>     credit <- read.csv(url("ftp://ftp.ics.uci.edu/pub/machine-learning-databases/credit-screening/crx.data"),
>                        header=FALSE, na.string="?")
>     credit.rf <- randomForest(V16~., credit, imp=T, do.trace=100, na.action=na.omit)
>     imp <- round(importance(credit.rf), 2)
>     imp
>
>             -     +  MeanDecreaseAccuracy  MeanDecreaseGini
>     V1   0.00  0.00                  0.00              0.00
>     V2   0.75  0.25                  0.55             19.92
>     V3   0.41  0.57                  0.46             22.13
>     V4   0.39  0.33                  0.33              4.93
>     V5   0.26  0.24                  0.21              0.60
>     V6   0.39  0.50                  0.40            -46.21
>     V7   0.91  0.59                  0.71               Inf
>     V8   1.35  1.35                  1.06             37.15
>     V9   0.00  0.00                  0.00              0.00
>     V10  0.00  0.00                  0.00              0.00
>     V11  1.65  1.59                  1.23             49.16
>     V12  0.00  0.00                  0.00              0.00
>     V13 -0.11 -0.10                 -0.10              0.21
>     V14  0.82  0.57                  0.66             20.71
>     V15  1.36  1.02                  1.01             33.47
>
>     getTree(credit.rf, 1)
>
>           left daughter  right daughter  split var  split point  status  prediction
>      [1,]             2               3         15         492.       1           0
>      [2,]             4               5         11       2.5000       1           0
>      [3,]             6               7          2      38.5000       1           0
>      [4,]             8               9         14          83.       1           0
>      [5,]            10              11          7         207.       1           0
>      [6,]            12              13         11       0.5000       1           0
>      [7,]             0               0          0           0.      -1           2
>      [8,]            14              15          7         117.       1           0
>      [9,]            16              17          8       3.0625       1           0
>     [10,]            18              19          3       0.2700       1           0
>     [11,]             0               0          0           0.      -1           2
>     [12,]            20              21         15        4753.       1           0
>     [13,]            22              23          2      37.0850       1           0
>     [14,]            24              25         14       8.5000       1           0
[R] Error in unitrootTest (fSeries)
Hello,

I am getting the following error message from unitrootTest. Do you have any clue what could be wrong?

Details: AMD64 (x86_64) Gentoo Linux system.

    library(fSeries)
    kmodel <- list(ar=c(.3,0,0,0,0.7,-.4*.7),d=1)
    x=armaSim(nobs,model=kmodel)
    unitrootTest(x,trend="c",statistic="t",method="adf",lags=2)
    Error in file(file, "r") : unable to open connection
    In addition: Warning message:
    cannot open file `library/fSeries/libs/.urc1.tab'

Thank you very much,
Roberto

--
Roberto Bertolusso [EMAIL PROTECTED]
Re: [R] Error in unitrootTest (fSeries)
On Wed, 23 Mar 2005, Roberto Bertolusso wrote:

> Hello, I am getting the following error message from unitrootTest.
> Do you have any clue what could be wrong?

A bug in the package: please contact the maintainer. This *may* work if you run in R_HOME.

Hint to Diethelm: use

    system.file("libs", ".urc1.tab", package="fSeries")

to find the file in a location-independent way.

> Details: AMD64 (x86_64) Gentoo Linux system.
>
>     library(fSeries)
>     kmodel <- list(ar=c(.3,0,0,0,0.7,-.4*.7),d=1)
>     x=armaSim(nobs,model=kmodel)
>     unitrootTest(x,trend="c",statistic="t",method="adf",lags=2)
>     Error in file(file, "r") : unable to open connection
>     In addition: Warning message:
>     cannot open file `library/fSeries/libs/.urc1.tab'

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
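The location-independent lookup suggested in the hint can be illustrated with a file that every R installation has. This sketch uses the DESCRIPTION file of the stats package purely as a stand-in; it says nothing about where fSeries actually keeps its tables.

```r
# system.file() resolves a path inside an installed package, wherever
# the library tree happens to live and whatever the working directory is.
path <- system.file("DESCRIPTION", package = "stats")
file.exists(path)

# If the file is not found, the result is an empty string rather than
# an error, so package code should check for "" before opening it.
absent <- system.file("no-such-file.tab", package = "stats")
identical(absent, "")
```

A relative path such as library/fSeries/libs/.urc1.tab, by contrast, only resolves when the working directory happens to be the R home directory, which matches the behaviour described above.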
[R] Complete Linkage Clustering techniques
Dear R,

I recently asked for a cluster analysis using

    cluster.results <- hclust(iris.dist, method="complete")

but nothing happened, i.e. the previous scatterplot matrix still showed up, whereas I was expecting a dendrogram. Could it be that, because I had used cutree before on the scatter plots, it somehow mucked it up? I tried detach then attach, started making the data matrix again, and followed the procedures through. I'm not sure what I've done wrong here. Can anyone help me?

brett stansfield
[R] non-derivative based optimization and standard errors.
Hi all,

I have this problem: my objective function is discontinuous in the parameters, and I need to use methods such as Nelder-Mead to get around this. My question is: how do I compute standard errors for a problem that does not have a gradient? Any literature on this is greatly appreciated.

Jean
[R] Mapping actual to expected columns for princomp object
I am working with data sets in which the number and order of columns may vary, but each column is uniquely identified by its name. E.g., one data set might have columns

    MW logP Num_Rings Num_H_Donors

while another has columns

    Num_Rings Num_Atoms Num_H_Donors logP MW

I would like to be able to perform a principal component analysis (PCA) on one data set and save the PCA object to a file. In a later R session, I would like to load the object and then apply the loadings to a new data set in order to compute the principal component (PC) values for each row of new data.

I am trying to use the princomp method in R to do this. (I started with prcomp, but found that there is no predict method for objects created by prcomp.) The problem is that when using predict on a princomp object, R ignores the names of columns and simply assumes that the column order is the same as in the original data frame used to do the PCA. (This contrasts, for example, with the behavior of a model produced by lm, which is aware of column names in a data frame.)

What I think I need to do is this:

1. After reloading the princomp object, extract the names and order of columns that it expects. (If you look at the loadings for the object, you can see that this info is there, but I would like to get at it directly somehow.)
2. Reorder the columns in the new data set to correspond to this expected order, and remove any extra columns.
3. Use the predict method to predict the PC values for the new data set.

Is this the best approach to achieve what I am attempting? If so, can anyone tell me how to accomplish steps 1 and 2 above?

Thanks,
Dana Honeycutt

P.S. Here's a script that demonstrates the problem:

    x1 <- rnorm(10)
    x2 <- rnorm(10)
    y <- rnorm(10)
    frx <- data.frame(x1,x2)
    frxy <- data.frame(x1,x2,y)
    lm1 <- lm(y~x1+x2,frxy)
    pca1 <- princomp(frx)
    rm(x1,x2,y,frx,frxy)
    z1 <- rnorm(10)
    z2 <- rnorm(10)
    frz <- data.frame(z1,z2)
    predict(lm1, frz)    # gives error: Object "x1" not found
    predict(pca1, frz)   # gives no error, indicating column names ignored
    z3 <- rnorm(10)
    fr3z <- data.frame(frz,z3)
    predict(pca1,fr3z)   # gives error due to unexpected number of columns
    loadings(pca1)       # shows linear combos of variables corresponding to PCs
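Steps 1 to 3 above can be sketched as follows. The data are simulated, the column names are borrowed from the example in the post, and the extra Num_Atoms column is invented; the idea is simply to read the expected variable names from the row names of the loadings and index the new data frame by them. Newer versions of R may already align columns by name inside predict(), in which case the explicit reordering is harmless.

```r
set.seed(7)
train <- data.frame(MW = rnorm(20, 300, 50), logP = rnorm(20, 2, 1),
                    Num_Rings = rpois(20, 3), Num_H_Donors = rpois(20, 2))
pca <- princomp(train)

# Step 1: the variable names, in the order the PCA expects them.
expected <- rownames(loadings(pca))

# New data: same variables in a different order, plus one extra column.
newdata <- data.frame(Num_Rings = train$Num_Rings,
                      Num_Atoms = rpois(20, 25),
                      Num_H_Donors = train$Num_H_Donors,
                      logP = train$logP,
                      MW = train$MW)

# Step 2: fail loudly if a variable is missing, then reorder and drop extras.
missing_cols <- setdiff(expected, names(newdata))
if (length(missing_cols) > 0) {
  stop("new data lacks columns: ", paste(missing_cols, collapse = ", "))
}
aligned <- newdata[, expected, drop = FALSE]

# Step 3: compute the principal component scores for the new rows.
scores <- predict(pca, aligned)
head(scores, 3)
```

Because the new rows here carry the same values as the training data, the scores should agree with the scores stored in the fitted object, which gives a simple check that the reordering did its job.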
[R] parallel r job on sun gridengine
greetings all,

this may be the wrong forum for my problem - if so please advise. i am addressing this list because of an error i am getting from the snow library rmpi (i think) after lam has booted the mpi nodes.

i have a script (provided by a faculty member - i am not an R user but have the task of making it run successfully as a batch job on the gridengine) that runs with success as an interactive shell script and can be run interactively using qrsh on a sun gridengine, but fails when submitted to the gridengine as a batch job. the lam/mpi nodes boot and shut down properly via a parallel environment defined in the gridengine. where the job falls flat is when the snow RMPInode.sh script is called - or so it seems. the error generated is:

___
/usr/local/lib/R.framework/Versions/2.0.0/Resources/library/snow/RMPInode.sh: line 9: 13465 Trace/BPT trap (core dumped) ${RPROG:-R} --vanilla > ${OUT:-/dev/null} 2>&1 <<EOF
library(Rmpi)
library(snow)
runMPIslave()
EOF
___

environment is darwin (panther 10.3.8), r version is 2.0.0, gridengine version is 5.3.

i get the feeling this is not an r problem, but if you have used r in batch mode in a parallel environment maybe you could point me in the right direction. i also realize that many factors could contribute to this error, but to be able to rule out r (or the snow library) would be helpful.

thanks in advance,
mark+ \ ucsf biostat

--
mark garey
ucsf department of epidemiology and biostatistics
500 parnassus ave, mu420w
san francisco, ca. 94143
415-502-8870
[R] summing values as image
I'm trying to summarize irregularly spaced data (in a data.frame with x, y, z) and need to sum (not average, as the as.image() function in fields does). I'm not sure if there is a function in one of the packages, if I'm going to need to string a few functions together, like fields::as.image() and fields::image.count(), to get what I need, or if I should simply write my own. Suggestions?

--
Jeff D. Hamann
Forest Informatics, Inc.
PO Box 1421
Corvallis, Oregon 97339-1421
phone 541-754-1428
fax 541-752-0288
[EMAIL PROTECTED]
http://www.forestinformatics.com
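One way to get per-cell sums with base R alone is to bin x and y with cut() and let xtabs() add up z within each cell. This is a sketch on simulated data, not a replacement for the fields functions mentioned above; the 10 x 10 grid on the unit square is an arbitrary assumption.

```r
set.seed(42)
d <- data.frame(x = runif(200), y = runif(200), z = rexp(200))

# Assumed grid: 10 x 10 equal-width cells on [0, 1] x [0, 1].
breaks <- seq(0, 1, length.out = 11)
d$xbin <- cut(d$x, breaks, include.lowest = TRUE)
d$ybin <- cut(d$y, breaks, include.lowest = TRUE)

# xtabs() with a left-hand side sums that variable within each cell;
# cells that contain no points come out as 0.
zsum <- xtabs(z ~ xbin + ybin, data = d)
dim(zsum)
```

The result is a table with one row per x bin and one column per y bin. Whether empty cells should be 0 or NA depends on the application; tapply(d$z, list(d$xbin, d$ybin), sum) gives NA for empty cells instead.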
Re: [R] Complete Linkage Clustering techniques
----- Original Message -----
From: Brett Stansfield [EMAIL PROTECTED]
To: R help (E-mail) R-help@stat.math.ethz.ch
Sent: Wednesday, March 23, 2005 5:25 PM
Subject: [R] Complete Linkage Clustering techniques

> Dear R,
>
> I recently asked for a cluster analysis using
>
>     cluster.results <- hclust(iris.dist, method="complete")
>
> but nothing happened, i.e. the previous scatterplot matrix still showed
> up, whereas I was expecting a dendrogram. Could it be that, because I had
> used cutree before on the scatter plots, it somehow mucked it up? I tried
> detach then attach, started making the data matrix again, and followed
> the procedures through. I'm not sure what I've done wrong here. Can
> anyone help me?

You need to plot the result.

    plot(cluster.results)

Sean
[R] font sizes for row.names of dendograms
Dear R,

I recently performed a cluster analysis. It produced the dendrogram with no problem, but unfortunately the row.names were all cluttered due to their large font size. So I tried to change the font size using

    plclust(cluster.results, labels=iris$specie, cex=0.8)

and R came back to me saying

    Error in plclust(cluster.results, labels = iris$specie, cex = 0.8) :
      unused argument(s) (cex ...)

What am I doing wrong here?

brett stansfield
Re: [R] extracting numerical data from text field
Luis Tercero luis.tercero at ebi-wasser.uni-karlsruhe.de writes:

: I have imported a data frame that looks like this:
:
:     Measurement.Date.and.Time         Z.Average..nm.   PDI
: 572 Dienstag, 22. März 2005 11:05:59           366,4 0,468
: 573 Dienstag, 22. März 2005 11:09:30           353,4 0,532
: 574 Dienstag, 22. März 2005 11:12:59             343 0,428
: 575 Dienstag, 22. März 2005 11:16:28           354,1 0,433
: 576 Dienstag, 22. März 2005 11:19:59           341,9 0,349
: 577 Dienstag, 22. März 2005 11:23:29           334,9 0,429
: ...
:
: Would there be a way to extract the time in numerical form from the
: Measurement.Date.and.Time field? What I would like to do is a time
: series where, for example,
: Dienstag, 22. März 2005 11:05:59 is time=0 min
: Dienstag, 22. März 2005 11:09:30 is time=3.5 min, etc.
:
: Thank you in advance for your help.
:
: Luis

Make sure that you are in a German locale:

    # this works on Windows XP. On other OS, "ge" code may differ.
    Sys.setlocale("LC_TIME", "ge")

Then, if DF is your data frame, use strptime (see ?strptime for more on the % codes):

    dat <- strptime(DF[,1], "%A, %d. %B %Y %H:%M:%S")
    dat - dat[1]  # difference in time since the first date time
Re: [R] extracting numerical data from text field
Gabor Grothendieck ggrothendieck at myway.com writes:

> Luis Tercero luis.tercero at ebi-wasser.uni-karlsruhe.de writes:
>
> : I have imported a data frame that looks like this:
> :
> :     Measurement.Date.and.Time         Z.Average..nm.   PDI
> : 572 Dienstag, 22. März 2005 11:05:59           366,4 0,468
> : 573 Dienstag, 22. März 2005 11:09:30           353,4 0,532
> : 574 Dienstag, 22. März 2005 11:12:59             343 0,428
> : 575 Dienstag, 22. März 2005 11:16:28           354,1 0,433
> : 576 Dienstag, 22. März 2005 11:19:59           341,9 0,349
> : 577 Dienstag, 22. März 2005 11:23:29           334,9 0,429
> : ...
> :
> : Would there be a way to extract the time in numerical form from the
> : Measurement.Date.and.Time field? What I would like to do is a time
> : series where, for example,
> : Dienstag, 22. März 2005 11:05:59 is time=0 min
> : Dienstag, 22. März 2005 11:09:30 is time=3.5 min, etc.
> :
> : Thank you in advance for your help.
> :
> : Luis
>
> Make sure that you are in a German locale:
>
>     # this works on Windows XP. On other OS, "ge" code may differ.
>     Sys.setlocale("LC_TIME", "ge")
>
> Then, if DF is your data frame, use strptime (see ?strptime for more on
> the % codes):
>
>     dat <- strptime(DF[,1], "%A, %d. %B %Y %H:%M:%S")
>     dat - dat[1]  # difference in time since the first date time

One other comment. I assumed your date time field is stored as character in the data frame. If it's stored as a factor, then you need to convert it to character first using as.character. If it's already stored as a POSIXct date time, then all you have to do is subtract off the first one. (Note that if you put the output of dput(DF) in your post, then people will be able to recreate your data frame exactly and know what you have.) Also, RNews 4/1 has a table with lots of date time processing idioms.
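The factor-to-character conversion and the "subtract the first one" step can be sketched as below. To keep the example independent of the session's locale, it assumes the timestamps have already been reduced to a numeric year-month-day form, which is not what the original data looked like; with German weekday and month names, the format string given above would be used instead, under a German LC_TIME locale. The data frame and its column names are made up.

```r
# Hypothetical data: timestamps stored as a factor, as read.table often does.
DF <- data.frame(
  stamp = factor(c("2005-03-22 11:05:59",
                   "2005-03-22 11:09:30",
                   "2005-03-22 11:12:59")),
  z_average = c(366.4, 353.4, 343.0)
)

# Convert factor -> character -> date-time, then measure from the first row.
when <- as.POSIXct(as.character(DF$stamp),
                   format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
DF$minutes <- as.numeric(difftime(when, when[1], units = "mins"))
DF$minutes
```

Asking difftime() for units = "mins" explicitly avoids the automatic unit choice made by plain subtraction, which can switch between seconds, minutes and hours depending on the size of the differences.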
[R] Prediction using GAM
Recently I was using GAM and couldn't help noticing the following incoherence in prediction:

    data(gam.data)
    data(gam.newdata)
    gam.object <- gam(y ~ s(x,6) + z, data=gam.data)
    predict(gam.object)[1]
            1
    0.8017407
    predict(gam.object,data.frame(x=gam.data$x[1],z=gam.data$z[1]))
            1
    0.1668452

I would expect that using the two types of predict arguments should give me the same results. When I used this to predict a new data set, it seems OK:

    predict(gam.object,data.frame(x=gam.newdata$x[1],z=gam.newdata$z[1]))
            1
    0.4832136
    predict(gam.object,gam.newdata)[1]
            1
    0.4832136

Could anybody explain the strange behavior of the predict.gam function?

Thanks,
Kai
Re: [R] non-derivative based optimization and standard errors.
Have you considered bootstrap or Monte Carlo?

spencer graves

Jean Eid wrote:

> Hi all,
>
> I have this problem: my objective function is discontinuous in the
> parameters, and I need to use methods such as Nelder-Mead to get around
> this. My question is: how do I compute standard errors for a problem
> that does not have a gradient? Any literature on this is greatly
> appreciated.
>
> Jean
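A nonparametric bootstrap along the lines suggested here might look like the sketch below. Everything in it is an illustrative assumption: the data are simulated, and the objective is a least-absolute-deviations fit, which is non-smooth rather than truly discontinuous, so it only stands in for the poster's real objective function.

```r
set.seed(123)
n <- 100
x <- runif(n)
y <- 1 + 2 * x + rt(n, df = 3)

# Non-smooth objective: sum of absolute residuals for a straight line.
lad <- function(beta, x, y) sum(abs(y - beta[1] - beta[2] * x))

# One derivative-free fit; extra arguments x and y are passed on to lad().
fit_once <- function(x, y) {
  optim(c(0, 0), lad, x = x, y = y, method = "Nelder-Mead")$par
}

estimate <- fit_once(x, y)

# Resample rows with replacement, refit each time, and take the standard
# deviation of the refitted parameters as the bootstrap standard error.
B <- 200
boot <- t(replicate(B, {
  i <- sample.int(n, replace = TRUE)
  fit_once(x[i], y[i])
}))
se <- apply(boot, 2, sd)
rbind(estimate = estimate, std.error = se)
```

No gradient or Hessian is needed at any point, which is the attraction for this kind of problem; the cost is B extra optimizations, and the bootstrap can itself behave poorly when the estimator is very irregular, so the result should be read with some care.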
RE: [R] font sizes for row.names of dendograms
The error message states that you are passing a parameter called cex which has not been used. If you look at ?plclust more closely, you will see it does not have a cex parameter. However, the S3 plot method for class hclust does. So does this help?

    hc <- hclust(dist(USArrests), "ave")
    plot(hc, cex = 0.5)

Tom

-----Original Message-----
From: Brett Stansfield [mailto:[EMAIL PROTECTED]
Sent: Thursday, 24 March 2005 11:42 AM
To: R help (E-mail)
Subject: [R] font sizes for row.names of dendograms

> Dear R,
>
> I recently performed a cluster analysis. It produced the dendrogram with
> no problem, but unfortunately the row.names were all cluttered due to
> their large font size. So I tried to change the font size using
>
>     plclust(cluster.results, labels=iris$specie, cex=0.8)
>
> and R came back to me saying
>
>     Error in plclust(cluster.results, labels = iris$specie, cex = 0.8) :
>       unused argument(s) (cex ...)
>
> What am I doing wrong here?
>
> brett stansfield
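The two clustering threads above come down to the same point: hclust() only returns an object, and drawing it (with smaller labels if needed) is a separate plot() call. A small sketch on the built-in USArrests data follows; the choice of four groups for cutree() is arbitrary.

```r
# Complete-linkage clustering; nothing is drawn at this point.
hc <- hclust(dist(USArrests), method = "complete")
class(hc)

# cutree() only reads the tree and returns group labels; it draws nothing
# and does not alter the clustering result.
groups <- cutree(hc, k = 4)
table(groups)

# Drawing is a separate step; cex shrinks the leaf labels.
# Guarded so that the sketch also runs in a non-interactive session.
if (interactive()) {
  plot(hc, cex = 0.5)
}
```

In an interactive session the plot() call replaces whatever was previously on the graphics device, which is why an old scatterplot matrix stays visible until the dendrogram is actually plotted.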
[R] Robust multivariate regression with rlm
Dear Group,

I am having trouble using rlm on multivariate data sets. When I call rlm I get

    Error in lm.wfit(x, y, w, method = "qr") : incompatible dimensions

lm on the same data sets seems to work well (see code example). Am I doing something wrong? I have already browsed through the forums and Google but could not find any related discussions. I use Windows XP and R Version 2.0.1 (2004-11-15) (if that makes a difference).

Example code:

    Mx
              [,1]      [,2]
    [1,]  49.10899  45.75513
    [2,] 505.92018  48.81037
    [3,] 973.30659  50.28478
    [4,]  55.99533 508.94504
    [5,] 964.96028 513.69579
    [6,]  48.25670 975.94972
    [7,] 510.21291 967.62767
    [8,] 977.12363 978.29216
    My
         [,1] [,2]
    [1,]   50   50
    [2,]  512   50
    [3,]  974   50
    [4,]   50  512
    [5,]  974  512
    [6,]   50  974
    [7,]  512  974
    [8,]  974  974
    model<-lm(My~Mx)
    model

    Call:
    lm(formula = My ~ Mx)

    Coefficients:
                 [,1]       [,2]
    (Intercept)   0.934727   3.918421
    Mx1           1.003517  -0.004202
    Mx2          -0.002624   0.998155

    model<-rlm(My~Mx)
    Error in lm.wfit(x, y, w, method = "qr") : incompatible dimensions
    model<-rlm(My~Mx,psi=psi.bisquare)
    Error in lm.wfit(x, y, w, method = "qr") : incompatible dimensions

Another example (this one seems to work):

    Mx<-matrix(c(0,0,1,0,0,1),ncol=2,byrow=TRUE)+1
    My<-matrix(c(0,0,1,1,-1,1),ncol=2,byrow=TRUE)+1
    model<-rlm(My~Mx)
    model

    Call:
    rlm(formula = My ~ Mx)
    Converged in 0 iterations

    Coefficients:
                [,1] [,2]
    (Intercept)    1   -1
    Mx1            1    1
    Mx2           -1    1

    Degrees of freedom: 6 total; 0 residual
    Scale estimate: 0

Best regards,
Markku Mielityinen
Re: [R] R on red hat 2.1 problem while trying to generate image
Ghosh, Sandeep wrote:

> Running R 1.9.1 under red hat 2.1 version

Please upgrade.

> When I try to generate an image, we get an error as in the following
>
>     plot(rnorm(100))
>     Error in PS(file, old$paper, old$family, old$encoding, old$bg, old$fg, :
>       unable to start device PostScript
>     In addition: Warning message:
>     cannot open `postscript' file argument `Rplots.ps'

Do you have write permission in the current working directory? If not, use postscript() explicitly.

Uwe Ligges

> If anyone can throw some light or any pointers into why we are facing
> this problem then it would be really helpful.
>
> Thanks,
> Sandeep.
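Both suggestions in this reply can be checked from within R. The sketch below tests write permission on the working directory and then opens a PostScript device explicitly on a file in the session's temporary directory; the file name is arbitrary, and any writable path would do.

```r
# Can the current working directory be written to?
# file.access() returns 0 on success and -1 on failure; mode = 2 tests writing.
can_write <- unname(file.access(getwd(), mode = 2) == 0)
can_write

# Open the device explicitly on a path known to be writable, instead of
# relying on the default Rplots.ps in the working directory.
out <- file.path(tempdir(), "example-plot.ps")
postscript(file = out)
plot(rnorm(100))
dev.off()

file.exists(out)
```

When R runs non-interactively, the first plotting call opens a default file-based device in the working directory, so a read-only directory is a likely cause of the reported error.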
RE: [R] Robust multivariate regression with rlm
lm works for multivariate responses; rlm does not. Check what the help file says about the response.

That's about it, really.

Bill Venables.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Markku Mielityinen
Sent: Thursday, 24 March 2005 5:20 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Robust multivariate regression with rlm

> Dear Group,
>
> I am having trouble using rlm on multivariate data sets. When I call rlm
> I get
>
>     Error in lm.wfit(x, y, w, method = "qr") : incompatible dimensions
>
> lm on the same data sets seems to work well (see code example). Am I
> doing something wrong? I have already browsed through the forums and
> Google but could not find any related discussions. I use Windows XP and
> R Version 2.0.1 (2004-11-15) (if that makes a difference).
>
> Example code:
>
>     Mx
>               [,1]      [,2]
>     [1,]  49.10899  45.75513
>     [2,] 505.92018  48.81037
>     [3,] 973.30659  50.28478
>     [4,]  55.99533 508.94504
>     [5,] 964.96028 513.69579
>     [6,]  48.25670 975.94972
>     [7,] 510.21291 967.62767
>     [8,] 977.12363 978.29216
>     My
>          [,1] [,2]
>     [1,]   50   50
>     [2,]  512   50
>     [3,]  974   50
>     [4,]   50  512
>     [5,]  974  512
>     [6,]   50  974
>     [7,]  512  974
>     [8,]  974  974
>     model<-lm(My~Mx)
>     model
>
>     Call:
>     lm(formula = My ~ Mx)
>
>     Coefficients:
>                  [,1]       [,2]
>     (Intercept)   0.934727   3.918421
>     Mx1           1.003517  -0.004202
>     Mx2          -0.002624   0.998155
>
>     model<-rlm(My~Mx)
>     Error in lm.wfit(x, y, w, method = "qr") : incompatible dimensions
>     model<-rlm(My~Mx,psi=psi.bisquare)
>     Error in lm.wfit(x, y, w, method = "qr") : incompatible dimensions
>
> Another example (this one seems to work):
>
>     Mx<-matrix(c(0,0,1,0,0,1),ncol=2,byrow=TRUE)+1
>     My<-matrix(c(0,0,1,1,-1,1),ncol=2,byrow=TRUE)+1
>     model<-rlm(My~Mx)
>     model
>
>     Call:
>     rlm(formula = My ~ Mx)
>     Converged in 0 iterations
>
>     Coefficients:
>                 [,1] [,2]
>     (Intercept)    1   -1
>     Mx1            1    1
>     Mx2           -1    1
>
>     Degrees of freedom: 6 total; 0 residual
>     Scale estimate: 0
>
> Best regards,
> Markku Mielityinen
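Since rlm expects a single response, one possible workaround (not something proposed in the thread) is to fit each response column separately and collect the coefficients. The sketch below uses the Mx and My values from the post; note that this is a column-by-column fit, not a jointly robust multivariate regression, because each column gets its own weights and scale estimate.

```r
library(MASS)  # recommended package, normally installed with R

Mx <- matrix(c( 49.10899,  45.75513,
               505.92018,  48.81037,
               973.30659,  50.28478,
                55.99533, 508.94504,
               964.96028, 513.69579,
                48.25670, 975.94972,
               510.21291, 967.62767,
               977.12363, 978.29216), ncol = 2, byrow = TRUE)
My <- matrix(c( 50,  50,
               512,  50,
               974,  50,
                50, 512,
               974, 512,
                50, 974,
               512, 974,
               974, 974), ncol = 2, byrow = TRUE)

# One robust fit per response column, each with a vector response.
fits <- lapply(seq_len(ncol(My)), function(j) rlm(My[, j] ~ Mx))

# Collect into a coefficient matrix laid out like the lm() output above.
coefs <- sapply(fits, coef)
coefs
```

The rows of coefs are the intercept and the two slopes, and the columns correspond to the two responses, so the result can be compared directly with the lm() coefficients shown in the post.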