Re: [R] A rude question
On Wednesday 26 January 2005 21:09, [EMAIL PROTECTED] wrote: Dear all, I am a beginner using R. I have a question about it. When you use it, since it is written by so many authors, how do you know that the results are trustworthy? (I don't want to offend anyone, and I do trust people.) But I think this should be a question.

Almost all software - certainly all important software - has numerous authors. Windows has hundreds, perhaps thousands, of coders. So too does Unix. The big difference between open source and closed source is not in the number of authors; rather, it is in the open availability of the code. Arguably, if there is sufficient interest in an open source project, studies have indicated that the code is likely to be superior to that of a comparable closed source program. This is a probability, though, not a natural law. If you are concerned about the trustworthiness of R, then perhaps the best gauge is that some of our favorite, if occasionally curmudgeonly, authors on this list are also experts in S and S-PLUS, the proprietary, closed source language of which R is also a dialect. They evidently know what they're doing and work comfortably in both domains. If you compare statistical results using R and Excel, there is no question that R is superior, but that would also be true if you tested Excel against S-PLUS, or SAS, or NCSS - all proprietary programs - or any number of other closed and open source programs designed to do statistical analyses. At the same time, just about any spreadsheet, open or closed source, will suffer in a similar comparison. If you want more information about the safety of Excel, I would suggest this site: http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html Read the various links. Beyond this there is a broad literature available on the risks and benefits of open and closed source programs. Read it.
JWDougherty __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] getting package version inside .First.lib
On Thu, 27 Jan 2005, Adrian Baddeley wrote:

Greetings - Is it possible, inside .First.lib, to find out the version number of the package that is being loaded? If only one version of the package has been installed, we could scan the DESCRIPTION file, something like

    .First.lib <- function(lib, pkg) {
        library.dynam("spatstat", pkg, lib)
        dfile <- system.file("DESCRIPTION", package="spatstat")
        ttt <- scan(dfile, what="", sep="^M", quiet=TRUE)[2]

"\n" not "^M", please, and readLines is better than scan here.

        vvv <- strsplit(ttt, " ")[[1]][2]
        cat("spatstat version number", vvv, "\n")
    }

but even this does not seem very safe (it makes assumptions about the format of the DESCRIPTION file).

It is better to use read.dcf or the installed description information in package.rds. Take a look at how library() does this. Post R-2.0.0 you can assume the format is as library uses. BTW: all installed.packages does is to read the descriptions of all the packages it finds, and in .First.lib you know the path to your package.

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
RE: [R] A rude question
When Haydn was asked about his 100+ symphonies he is reputed to have replied "sunt mala bona mixta", which is kind of dog Latin for "There are good ones and bad ones all mixed together". It's certainly the same with R packages, so to continue the Latin motif: caveat emptor. The R engine, on the other hand, is pretty well uniformly excellent code, but you have to take my word for that. Actually, you don't. The whole engine is open source so, if you wish, you can check every line of it. If people were out to push dodgy software, this is not the way they'd go about it.

Bill Venables.

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: Thursday, 27 January 2005 3:10 PM To: r-help@stat.math.ethz.ch Subject: [R] A rude question

Dear all, I am a beginner using R. I have a question about it. When you use it, since it is written by so many authors, how do you know that the results are trustworthy? (I don't want to offend anyone, and I do trust people.) But I think this should be a question. Thanks, Ming
[R] Request for help
My name is Michela Marignani and I'm an ecologist trying to solve a problem linked to the knight's tour algorithm. I need a program to create random matrices with presence/absence (i.e. 1,0) values, with defined column and row sums, to create null models for statistical comparison of species distribution phenomena. I've seen many solutions to the problem on the web, but none provides the freedom to easily change the row and column constraints, and none of them produces matrices with 1s and 0s. Also, I've tried to use R, but it is too complicated for a non-statistician like me. Can you help me? Thank you for your attention, so long

Michela Marignani University of Siena Environmental Science Dept. Siena, Italy [EMAIL PROTECTED]
Re: [R] Specification of factorial random-effects model
Berton Gunter wrote: If you read the Help file for lme (!), you'll see that ~1|a*b is certainly incorrect. Briefly, the issue has been discussed before on this list: the current version of lme() follows the original Laird/Ware formulation for **nested** random effects. Specifying **crossed** random effects is possible but difficult, and the fitting algorithm is not optimized for this. See p. 163 in Bates and Pinheiro for an example.

The development package lme4 has a version of a linear mixed model function that does handle crossed random effects. In lme4_0.8-1 and later the new version of lme, called lmer (which could mean either "lme revised" or "lme for R"), has a different syntax for specifying mixed models. A random-effects specification is indicated by a '|' character which separates a linear model expression on the left side from the grouping factor on the right side. Because the '|' operator has very low precedence, such terms usually must be enclosed in parentheses. The same type of specification is used for nested or crossed or partially crossed grouping factors. The only restriction is that the grouping factor must have a unique level for each group, which is to say that you must explicitly create nested factors - you cannot specify them implicitly. This example could be fit as

    > (fm1 <- lmer(y ~ c + (1|a) + (1|b) + (1|a:b)))
    Linear mixed-effects model fit by REML
    Formula: y ~ c + (1 | a) + (1 | b) + (1 | a:b)
          AIC      BIC    logLik MLdeviance REMLdeviance
     376.0148 392.7759 -181.0074   369.2869     362.0148
    Random effects:
     Groups   Name        Variance Std.Dev.
     a:b      (Intercept)   1.1118  1.0544
     b        (Intercept) 286.8433 16.9364
     a        (Intercept)  86.2138  9.2851
     Residual                3.4626  1.8608
    # of obs: 81, groups: a:b, 9; b, 3; a, 3

    Fixed effects:
                 Estimate Std. Error DF  t value  Pr(>|t|)
    (Intercept)  65.91259   11.16262 78   5.9048 8.707e-08
    c2           -9.47000    0.50645 78 -18.6989 < 2.2e-16
    c3          -10.88259    0.50645 78 -21.4881 < 2.2e-16

For the random effects the Variance column is the estimate of the variance component. The Std.Dev.
column is simply the square root of the estimated variance. I find it easier to think in terms of standard deviations rather than variances because I can compare the standard deviations to the scale of the data. Note that this column is *not* a standard error of the estimated variance component (and purposely so because I feel that such quantities are often nonsensical). A test of, say, whether the variance component for the interaction could be zero is performed by fitting the reduced model and using the anova function to compare the fitted models. The p-value quoted for this test is conservative because the null hypothesis is on the boundary of the parameter space.

    > (fm2 <- lmer(y ~ c + (1|a) + (1|b)))
    Linear mixed-effects model fit by REML
    Formula: y ~ c + (1 | a) + (1 | b)
          AIC      BIC    logLik MLdeviance REMLdeviance
     379.3209 393.6876 -183.6605   374.8822     367.3209
    Random effects:
     Groups   Name        Variance Std.Dev.
     a        (Intercept)  86.3823  9.2942
     b        (Intercept) 286.5391 16.9275
     Residual                4.0039  2.0010
    # of obs: 81, groups: a, 3; b, 3

    Fixed effects:
                Estimate Std. Error DF  t value Pr(>|t|)
    (Intercept)  65.9126    11.1560 78   5.9083 8.58e-08
    c2           -9.4700     0.5446 78 -17.3890 < 2.2e-16
    c3          -10.8826     0.5446 78 -19.9829 < 2.2e-16

    Warning message:
    optim returned message ERROR: ABNORMAL_TERMINATION_IN_LNSRCH in:
    LMEoptimize<-(`*tmp*`, value = list(maxIter = 50, msMaxIter = 50,

    > anova(fm1, fm2)
    Data:
    Models:
    fm2: y ~ c + (1 | a) + (1 | b)
    fm1: y ~ c + (1 | a) + (1 | b) + (1 | a:b)
        Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
    fm2  6 386.88 401.25 -187.44
    fm1  7 383.29 400.05 -184.64 5.5953      1    0.01801

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nicholas Galwey Sent: Wednesday, January 26, 2005 1:45 PM To: r-help@stat.math.ethz.ch Subject: [R] Specification of factorial random-effects model

I want to specify two factors and their interaction as random effects using the function lme().
This works okay when I specify these terms using the function Error() within the function aov(), but I can't get the same model fitted using lme(). The code below illustrates the problem.

    a <- factor(rep(c(1:3), each = 27))
    b <- factor(rep(rep(c(1:3), each = 9), times = 3))
    c <- factor(rep(rep(c(1:3), each = 3), times = 9))
    y <- c(74.59,75.63,76.7,63.48,63.17,65.99,64,66.35,64.5,
           46.57,44.16,47.96,35.09,36.14,35.16,36.4,34.72,34.58,
           41.82,47.35,45.74,33.33,36.8,33.38,34.13,34.39,34.48,
           89.73,85.24,90.86,82.5,79.44,81.65,77.74,77.02,81.62,
           59.32,62.29,60.7,55.42,55.5,51.17,50.54,53.54,51.85,
           64.5,63.6,65.19,55.07,50.26,53.73,54.57,47.8,48.8,91.56,
           94.49,92.17,82.14,83.16,81.31,83.58,78.63,77.08,60.53,
RE: [R] A rude question
Hi, I don't know if you are asking the question for the same reasons I did, but recently (and ongoing) we have been required to adopt an internationally recognised standard. Being in the bioinformatics field, where open-source software is the beating heart of cutting-edge research, we have obviously had to ask ourselves that exact question: "How can we be sure the software we use works?" In science, this doesn't just apply to software, though. When someone publishes a paper, how can any of us be sure they did what they said they did? Or that their methods are the correct ones to use? Luckily, there is a two-word answer that we hope will satisfy our auditors, and that is "peer review". In the context of R, I would say that you could put a confidence measure on any package based on the number of people who use it; the more people who use a package, the more likely they are to find and remove bugs. I won't get into the open source vs commercial argument, but put simply, all software has bugs at some stage, no matter who has written it. Given that fact, I prefer the code to be open so I can see them, not closed so that I can't. The fact that we can see all code relating to R is surely the biggest quality measure of all?

Cheers Mick

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: 27 January 2005 05:10 To: r-help@stat.math.ethz.ch Subject: [R] A rude question

Dear all, I am a beginner using R. I have a question about it. When you use it, since it is written by so many authors, how do you know that the results are trustworthy? (I don't want to offend anyone, and I do trust people.) But I think this should be a question. Thanks, Ming
[R] Results of MCD estimators in MASS and rrcov
Hi! I tested two different implementations of the robust MCD estimator: cov.mcd from the MASS package and covMcd from the rrcov package. Tests were done on the hbk dataset included in the rrcov package. Unfortunately I get quite different results -- so the question is whether these differences are justified, or an error on my side, or a bug? Here is what I did:

    require(MASS)
    require(rrcov)
    data(hbk)
    mass.mcd <- cov.mcd(hbk, quantile.used=57)
    rrcov.covMcd <- covMcd(hbk, alpha=0.75)

    # output from cov.mcd (MASS)
    > mass.mcd$center
         X1      X2      X3       Y
     1.5583  1.8033  1.6600 -0.0867
    > mass.mcd$cov
               X1         X2         X3           Y
    X1 1.12484463 0.02217514  0.1537288  0.07615819
    X2 0.02217514 1.13897175  0.1814915  0.02029379
    X3 0.15372881 0.18149153  1.0434576 -0.12877966
    Y  0.07615819 0.02029379 -0.1287797  0.31236158

    # output from covMcd (rrcov)
    > rrcov.covMcd$center
             X1          X2          X3           Y
     1.53770492  1.78032787  1.68688525 -0.07377049
    > rrcov.covMcd$cov
               X1          X2         X3            Y
    X1 1.61921813 0.072595397  0.1678300  0.083905209
    X2 0.07259540 1.648137481  0.2013022  0.002657454
    X3 0.16782996 0.201302158  1.5306858 -0.150876964
    Y  0.08390521 0.002657454 -0.1508770  0.453846286

As you can see, the results are quite different. I tried to start both calls with 75% (= 57 of 75) good data points. I cross-checked the results with the MCD implementation in MATLAB by Verboven and Hubert. Those functions give the same results as cov.mcd (MASS). If somebody knows why the results do not match, although both functions are implementations of the same estimator, please tell me. Thanks, Rainer
Re: [R] getting package version inside .First.lib
On Thu, 27 Jan 2005, Prof Brian Ripley wrote:

On Thu, 27 Jan 2005, Adrian Baddeley wrote: Greetings - Is it possible, inside .First.lib, to find out the version number of the package that is being loaded? If only one version of the package has been installed, we could scan the DESCRIPTION file, something like

    .First.lib <- function(lib, pkg) {
        library.dynam("spatstat", pkg, lib)
        dfile <- system.file("DESCRIPTION", package="spatstat")
        ttt <- scan(dfile, what="", sep="^M", quiet=TRUE)[2]

"\n" not "^M", please, and readLines is better than scan here.

        vvv <- strsplit(ttt, " ")[[1]][2]
        cat("spatstat version number", vvv, "\n")
    }

but even this does not seem very safe (it makes assumptions about the format of the DESCRIPTION file).

It is better to use read.dcf or the installed description information in package.rds. Take a look at how library() does this.

Or even packageDescription() in utils, which uses read.dcf() and should be a way of making sure you get the version even if the underlying formatting changes. Roger

Post R-2.0.0 you can assume the format is as library uses. BTW: all installed.packages does is to read the descriptions of all the packages it finds, and in .First.lib you know the path to your package.

-- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Breiviksveien 40, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93 e-mail: [EMAIL PROTECTED]
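An editorial sketch (not part of the original thread) of how Roger Bivand's suggestion might look in practice. "mypackage" is a placeholder name, and the snippet assumes packageDescription() is available (R >= 1.9.0):

```r
## Hypothetical sketch of Roger Bivand's suggestion: packageDescription()
## wraps read.dcf(), so the version lookup keeps working even if the
## DESCRIPTION formatting changes. "mypackage" is a placeholder name.
.First.lib <- function(lib, pkg) {
    library.dynam("mypackage", pkg, lib)
    ver <- packageDescription("mypackage")$Version
    cat("mypackage version", ver, "\n")
}
```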
Re: [R] Cluster analysis using EM algorithm
Hi! Take a look at the packages mclust and flexmix! They use the EM algorithm for mixture modelling, sometimes called model-based cluster analysis. Best, Christian

On Wed, 26 Jan 2005 [EMAIL PROTECTED] wrote: Hi, I am looking for a package to do clustering analysis using the expectation-maximization algorithm. Thanks in advance. Ming

*** Christian Hennig Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg [EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/ ### I recommend www.boag-online.de
Re: [R] Converting yr mo da to dates
David Parkhurst wrote: I'm using R 2.0.1 in Windows XP (and am not currently subscribed to this mailing list). I have a USGS dataset, a text file with fixed-width fields, that includes dates as 6-digit integers in the form yrmoda. I could either read them that way, or with yr, mo, and da as separate integers. In either case, I'd like to convert them to a form that will allow plotting other y variables against the dates (with correct spacing) on the horizontal axis. I've looked in all the manuals, but didn't find a way to do this. I can copy the data to a spreadsheet, make the conversion there, and then move the data to R, but that's a nuisance. I'd appreciate learning whether there is a way to do this all within R. Thanks. Dave Parkhurst

See ?strptime as in:

    strptime(c("050127", "050128"), "%y%m%d")

Uwe Ligges
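To round out Uwe's answer with the plotting step the poster asked about (an editorial sketch with hypothetical values): as.Date understands the same format codes, and plotting against a Date vector spaces the x-axis correctly.

```r
## Editorial sketch (hypothetical values): convert yrmoda integers to
## Date objects, then plot a y variable against them with date spacing.
yrmoda <- c(50127, 50128, 50131)        # yymmdd read as integers, leading zero lost
dates  <- as.Date(sprintf("%06d", yrmoda), "%y%m%d")
flow   <- c(1.2, 3.4, 2.1)              # some measured y variable
plot(dates, flow, type = "b")           # x-axis labelled with real dates
```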
[R] self-written function
Dear all, I've got a simple self-written function to calculate the mean + s.e. from arcsine-transformed data:

    backsin <- function(x, y, ...) {
        backtransf <- list()
        backtransf$back <- ((sin(x[x != NA]))^2) * 100
        backtransf$mback <- tapply(backtransf$back, y[x != NA], mean)
        backtransf$sdback <- tapply(backtransf$back, y[x != NA], stdev) / sqrt(length(y[x != NA]))
        backtransf
    }

I would like to apply this function to whole datasets, such as

    tapply(variable, list(A, B, C, D), backsin)

Of course, this doesn't work with the way in which the backsin() function is specified. Does anyone have suggestions on how I could improve my function? Regards, Christoph
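[Editorial note, not a reply from the thread: the function as posted has two likely bugs worth flagging before the tapply question even arises. x != NA is always NA, so the subsets select nothing (use !is.na(x)), and stdev is not a base R function (sd is). A corrected sketch, under those assumptions:]

```r
## Corrected sketch of the posted function (assumptions: !is.na(x) was
## meant rather than x != NA, and sd() rather than stdev(); the standard
## error here uses per-group sample sizes, which a s.e. normally needs).
backsin <- function(x, y) {
    ok   <- !is.na(x)
    back <- (sin(x[ok]))^2 * 100          # back-transform arcsine data
    n    <- tapply(back, y[ok], length)   # per-group sample sizes
    list(back   = back,
         mback  = tapply(back, y[ok], mean),
         seback = tapply(back, y[ok], sd) / sqrt(n))
}
```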
RE: [R] getting package version inside .First.lib
From: Roger Bivand On Thu, 27 Jan 2005, Prof Brian Ripley wrote: On Thu, 27 Jan 2005, Adrian Baddeley wrote: Greetings - Is it possible, inside .First.lib, to find out the version number of the package that is being loaded? [The DESCRIPTION-scanning example and the read.dcf/package.rds advice quoted again.] Or even packageDescription() in utils, which uses read.dcf() and should be a way of making sure you get the version even if the underlying formatting changes.

This is how I do it in randomForest (using .onAttach instead of .First.lib):

    .onAttach <- function(libname, pkgname) {
        RFver <- if (as.numeric(R.version$major) < 2 &&
                     as.numeric(R.version$minor) < 9.0)
            package.description("randomForest")["Version"]
        else
            packageDescription("randomForest")$Version
        cat(paste("randomForest", RFver), "\n")
        cat("Type rfNews() to see new features/changes/bug fixes.\n")
    }

HTH, Andy

Roger: Post R-2.0.0 you can assume the format is as library uses. BTW: all installed.packages does is to read the descriptions of all the packages it finds, and in .First.lib you know the path to your package. -- Roger Bivand
RE: [R] getting package version inside .First.lib
On Thu, 27 Jan 2005, Liaw, Andy wrote: [The quoted thread again, ending with Andy's .onAttach example:]

    .onAttach <- function(libname, pkgname) {
        RFver <- if (as.numeric(R.version$major) < 2 &&
                     as.numeric(R.version$minor) < 9.0)
            package.description("randomForest")["Version"]
        else
            packageDescription("randomForest")$Version
        cat(paste("randomForest", RFver), "\n")
        cat("Type rfNews() to see new features/changes/bug fixes.\n")
    }

Please don't use functions from utils in such places without explicitly loading them from utils unless your package has an explicit dependence on utils (and randomForest does not). There was a good reason why I suggested what I did: you don't need the utils namespace for this.

-- Brian D. Ripley
Re: [R] getting package version inside .First.lib
Thanks, Brian. So, to print the version number when 'mypackage' is loaded:

    .First.lib <- function(lib, pkg) {
        library.dynam("mypackage", pkg, lib)
        vvv <- read.dcf(file=system.file("DESCRIPTION", package="mypackage"),
                        fields="Version")
        cat(paste("mypackage", vvv, "\n"))
    }
Re: [R] computing roots of bessel function
Hi, the gsl package calculates zeroes of regular Bessel functions of integral order. You need the function bessel_zero_Jnu(). Best wishes, Robin

On Jan 27, 2005, at 11:55 am, coutand wrote: I am not yet an R user but I will be soon. I am looking for the R command and syntax to compute the roots of a Bessel function, i.e. computing the z values that lead to Jnu(z)=0, where J is a Bessel function of order nu. Can you help me? Thanks in advance. Dr Catherine COUTAND Institut National de la Recherche Agronomique (INRA) umr Physiologie Intégrative de l'Arbre Fruitier et Forestier (PIAF) 234 av. du Brézet 63039 Clermont-Ferrand cedex 02 France tel : 00-33-(0)4-73-62-46-73 fax : 00-33-(0)4-73-62-44-54 email : [EMAIL PROTECTED] http://www.clermont.inra.fr/piaf

-- Robin Hankin Uncertainty Analyst Southampton Oceanography Centre European Way, Southampton SO14 3ZH, UK tel 023-8059-7743
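An editorial usage sketch (the argument names are assumed from the underlying GSL routine gsl_sf_bessel_zero_Jnu, and are not confirmed by the thread):

```r
## Hypothetical usage of the function Robin mentions: the s-th zeros of
## the Bessel function J_nu, here the first three zeros of J_0, which
## should come out near 2.405, 5.520 and 8.654.
library(gsl)
bessel_zero_Jnu(nu = 0, s = 1:3)
```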
RE: [R] getting package version inside .First.lib
From: Prof Brian Ripley On Thu, 27 Jan 2005, Liaw, Andy wrote: [The quoted thread again, including Andy's .onAttach example and Brian Ripley's advice:] Please don't use functions from utils in such places without explicitly loading them from utils unless your package has an explicit dependence on utils (and randomForest does not). There was a good reason why I suggested what I did: you don't need the utils namespace for this.

Thanks for the tip! Will remediate... Andy

-- Brian D. Ripley
Re: [R] Request for help
If I understand your problem properly, then your matrices have a known number of zeros and ones in them. So you can create a matrix with just this constraint binding via:

    mat <- matrix(sample(rep(0:1, c(nzeros, nones))), nr, nc)

That command first generates the appropriate number of zeros and ones (via 'rep'), then does a random permutation of them (with 'sample') and finally turns it into a matrix. You could then test for the row and column constraints, and permute the sub-matrix of rows and columns that do not obey their constraints. It could look something like:

    mat[bad.rows, bad.cols] <- sample(mat[bad.rows, bad.cols])

where 'bad.rows' and 'bad.cols' are logical vectors stating whether the constraints are satisfied or not. You do not need to be a statistician to use R -- far from it. "A Guide for the Unwilling" gives you a brief introduction. There is also a lot of introductory material in the contributed documentation section of the R Project website. It would be good to use a more descriptive subject for messages to R-help.

Patrick Burns Burns Statistics [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of "S Poetry" and "A Guide for the Unwilling S User")

Michela Marignani wrote: My name is Michela Marignani and I'm an ecologist trying to solve a problem linked to the knight's tour algorithm. I need a program to create random matrices with presence/absence (i.e. 1,0) values, with defined column and row sums, to create null models for statistical comparison of species distribution phenomena. I've seen many solutions to the problem on the web, but none provides the freedom to easily change the row and column constraints, and none of them produces matrices with 1s and 0s. Also, I've tried to use R, but it is too complicated for a non-statistician like me. Can you help me? Thank you for your attention, so long Michela Marignani University of Siena Environmental Science Dept.
Siena, Italy [EMAIL PROTECTED]
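[Editorial sketch, not from the original reply: Patrick's permute-and-retest approach can take many iterations to satisfy all margins at once. A standard alternative for exactly this ecological null-model problem is the "checkerboard swap", which preserves every row and column sum by construction.]

```r
## Sketch of the classic checkerboard (2x2) swap: starting from any 0/1
## matrix with the desired margins, repeatedly flip submatrices
##     1 0        0 1
##     0 1  <->   1 0
## Each flip leaves all row and column sums unchanged, so the margins
## are preserved exactly while the matrix is randomized.
swap.randomize <- function(mat, nswaps = 1000) {
    for (i in seq_len(nswaps)) {
        r <- sample(nrow(mat), 2)         # pick two rows
        k <- sample(ncol(mat), 2)         # pick two columns
        sub <- mat[r, k]
        off <- sub[cbind(1:2, 2:1)]       # the two off-diagonal cells
        if ((all(diag(sub) == 1) && all(off == 0)) ||
            (all(diag(sub) == 0) && all(off == 1)))
            mat[r, k] <- 1 - sub          # flip the checkerboard
    }
    mat
}
```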
RE: [R] A rude question
Ming, "results are trustable? (I don't want to offend anyone, also I trust people)." Years ago I read about a simplified formula to answer whether I trust someone, and in turn, something: Trustworthiness = Competence + Character. I think a bit of research, as the other R-help posters have so comprehensively covered in their replies to your original question, will convince you, or anyone else you need to convince, that the R core team and the core product of R itself rate at the top of the scale on both character and competence. Packages of course will not be as consistently high on the trustworthiness continuum, but rest assured there are several that are high, which again, you can verify yourself for your and/or your audience's needs.

Best Regards, Bill

--- Bill Pikounis, PhD Nonclinical Statistics Centocor, Inc. 200 Great Valley Parkway MailStop C4-1 Malvern, PA 19355 610 240 8498 fax 610 651 6717

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Thursday, January 27, 2005 12:10 AM To: r-help@stat.math.ethz.ch Subject: [R] A rude question

Dear all, I am a beginner using R. I have a question about it. When you use it, since it is written by so many authors, how do you know that the results are trustworthy? (I don't want to offend anyone, and I do trust people.) But I think this should be a question. Thanks, Ming
[R] Output predictions based on a Cox model
Hi, I've generated a Cox model, but I'm struggling to work out how to output predictions based on the model I've made.

    my.model <- coxph(Surv(duration, status) ~ gender + var1 + var2, data=mydata)

My test data set looks something like this:

    id,actualduration,gender,var1,var2
    a,65,m,1,3
    b,34,f,1,5
    ...

What I need to do is, for each id, output a predicted duration based on my Cox model so that I can compare it with other models. I've looked in the survival package and the Design package, but I can only see how to output survival probabilities. I'm probably missing something obvious, but trawling the mail archives has been fruitless; any suggestions? Cheers, George
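[Editorial sketch, not a reply from the thread: a Cox model does not predict durations directly, but the median of each subject's predicted survival curve is a common stand-in for a "predicted duration". The object names mydata/testdata are the poster's; everything else is an assumption.]

```r
## Hedged sketch: median predicted survival time for each test subject,
## obtained from the fitted Cox model via survfit() with newdata.
library(survival)
fit <- coxph(Surv(duration, status) ~ gender + var1 + var2, data = mydata)
sf  <- survfit(fit, newdata = testdata)       # one survival curve per row
predicted.duration <- summary(sf)$table[, "median"]
```

The median can be undefined (NA) for subjects whose predicted curve never drops below 0.5; in that case a restricted mean (see print.survfit's rmean option) is another possibility.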
Re: [R] A rude question
Ming, You have received a number of excellent replies to your question and should really consider them. Here is another point -- really extending a Bill Venables comment: If people were out to push dodgy software, this is not the way they'd go about it. Definitely! Look at the requirements for submitting a package to R. While the required documentation and uniform approach mandated do not automatically equate to V&V'ed code, they are a strong indication of the commitment of the R core and contributing communities. The imposition of these standards by the core team, and the time committed to the project vis-a-vis development, the help list, etc., speaks volumes about the quality of R. Rest assured such commitment is not the norm. That being said, I do respectfully disagree with Dr. Rossini in one minor detail ;O). It is not 'extremely paranoid' to re-code in another language, and definitely not so to do hand calculations! Murphy's Law is relentless in all matters! If you are like most of us (all of us?) you will find errors in your own coding and, maybe rarely, an R bug. BTW, since you are starting out in R... voraciously read the documentation, help list, newsletter, and other free and commercial material on R; work through the examples relevant to your area of endeavor; read more, code more, read more, code more. The facility with R that you gain as a result will reward you multifold down the road. Best regards, Michael Grant P.S. Whenever you upgrade R, read the CHANGES, NEWS files, etc. R does evolve -- even the core -- although it is very controlled and managed. (You will learn of bug fixes there too.) --- [EMAIL PROTECTED] wrote: Dear all, I am a beginner using R. I have a question about it. When you use it, since it is written by so many authors, how do you know that the results are trustworthy? (I don't want to offend anyone; I also trust people.) But I think this should be a question.
Thanks, Ming
Re: [R] sw
Dear Mahdi, Mahdi Osman writes: Hi list, I am just a new user of R. How can I run stepwise regression in R? If you are looking for a stepwise procedure for linear regression models or generalized linear models, you can use step() (see ?step). Regards, Christoph -- Christoph Buser [EMAIL PROTECTED] Seminar fuer Statistik, LEO C11 ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND phone: x-41-1-632-5414 fax: 632-1228 http://stat.ethz.ch/~buser/ -- Is there a graphical user interface for any of the spatial packages included in R, such as gstat, geoR and some others? I am mainly interested in interactive variogram modelling and mapping. Thanks, Mahdi --- Mahdi Osman (PhD) E-mail: [EMAIL PROTECTED]
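[Editorially added sketch, not part of the original reply.] A minimal illustration of the step() call Christoph mentions, using the built-in mtcars data rather than the poster's own:

```r
## Backward stepwise selection by AIC with step(), on built-in data.
full <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)
sel  <- step(full, trace = 0)   # trace = 0 suppresses the step log
formula(sel)                    # the selected model
```

step() works the same way on glm() fits; for the stepAIC() variant see the MASS package.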
[R] cluster, mona error
Hi, I have a problem using the package cluster on my binary data. I want to try mona first, but I get an error. hc <- read.table("all.txt", header=TRUE, sep="\t", row.names=1) str(hc) `data.frame': 51 obs. of 59 variables: $ G1p : int 2 1 1 1 1 1 1 1 1 1 ... $ G1q : int 1 1 1 1 1 1 1 1 1 1 ... $ G2p : int 1 1 1 1 1 1 1 1 1 1 ... $ G2q : int 1 1 1 1 1 1 1 1 1 1 ... $ G3p : int 1 1 1 1 1 1 1 1 1 1 ... m <- mona(hc) Error in mona(hc) : All variables must be binary (factor with 2 levels). I find this strange, as the cluster dataset animals has the same structure as my data. str(animals) `data.frame': 20 obs. of 6 variables: $ war: int 1 1 2 1 2 2 2 2 2 1 ... $ fly: int 1 2 1 1 1 1 2 2 1 2 ... $ ver: int 1 1 2 1 2 2 2 2 2 1 ... $ end: int 1 1 1 1 2 1 1 2 2 1 ... $ gro: int 2 2 1 1 2 2 2 1 2 1 ... $ hai: int 1 2 2 2 2 2 1 1 1 1 ... m <- mona(animals) # works fine What is this error trying to tell me? mvh morten
Re: [R] getting package version inside .First.lib
This is what I use for all my packages, which I believe handles multiple versions of the same package being installed: .First.lib <- function(lib, pkg) { ver <- read.dcf(file.path(lib, pkg, "DESCRIPTION"), "Version") ver <- as.character(ver) ... } -roger Adrian Baddeley wrote: Greetings - Is it possible, inside .First.lib, to find out the version number of the package that is being loaded? If only one version of the package has been installed, we could scan the DESCRIPTION file, something like .First.lib <- function(lib, pkg) { library.dynam("spatstat", pkg, lib) dfile <- system.file("DESCRIPTION", package="spatstat") ttt <- scan(dfile, what="", sep="\n", quiet=TRUE)[2] vvv <- strsplit(ttt, " ")[[1]][2] cat("spatstat version number", vvv, "\n") } but even this does not seem very safe (it makes assumptions about the format of the DESCRIPTION file). Is there a better way? thanks, Adrian Baddeley -- Roger D. Peng http://www.biostat.jhsph.edu/~rpeng/
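[Editorially added sketch, not part of the original reply.] The point of Roger's suggestion is that read.dcf() parses DESCRIPTION files properly, so no assumptions about line order are needed. Illustrated here on the DESCRIPTION of the base 'stats' package (any installed package would do):

```r
## read.dcf() extracts named fields from a DESCRIPTION file,
## regardless of where in the file they appear.
dfile <- system.file("DESCRIPTION", package = "stats")
ver   <- as.character(read.dcf(dfile, "Version"))
ver   # the installed version of 'stats' (matches the R version)
```

Inside .First.lib the file path is built from the lib and pkg arguments, as in Roger's snippet, so the right copy is found even with multiple installed versions.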
Re: [R] cluster, mona error
On Jan 27, 2005, at 9:06 AM, Morten Mattingsdal wrote: [quoted question with the str(hc) output and the mona() error snipped] You have to be careful that the data are indeed factors, each with 2 levels (numeric variables with values 1 and 2 will not do). A summary of the data will tell you that. Sean
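[Editorially added sketch, not part of Sean's reply.] A quick way to find the offending columns before calling mona() is to count the distinct values per variable; a constant column (one level) or a many-valued column will both trigger the "must be binary" error. The data frame below is a made-up stand-in for the poster's hc:

```r
## Count distinct values per column; mona() needs exactly two.
hc <- data.frame(G1p = c(2, 1, 1, 1),
                 G1q = c(1, 1, 1, 1),   # constant: only one value
                 G2p = c(1, 2, 2, 1))
nlev <- sapply(hc, function(x) length(unique(x)))
names(hc)[nlev != 2]   # columns to fix or drop before mona()
```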
Re: [R] cluster, mona error solved
Sean Davis wrote: You have to be careful that the data are indeed factors, each with 2 levels (numeric variables with values 1 and 2 will not do). A summary of the data will tell you that. Sean Yes, now I understand. There was one single variable among my 59 which had only 1 level. I used summary(mydata) as you said and found: L16p Min. :1 1st Qu.:1 Median :1 Mean :1 3rd Qu.:1 Max. :1 I removed this and now it works fine. Thanks a lot for your quick reply. regards, a grateful morten
Re: [R] cluster, mona error
Morten, just a try: is there a constant variable (only 1s) in the first dataset? Christian On Thu, 27 Jan 2005, Morten Mattingsdal wrote: [quoted question snipped] *** Christian Hennig Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg [EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/ ### ich empfehle www.boag-online.de
[R] Indexing Lists and Partial Matching
I was unaware until recently that partial matching was used to index data frames and lists. This is now causing a great deal of problems in my code, as I sometimes index a list without knowing what elements it contains, expecting a NULL if the column does not exist. However, with partial matching, sometimes R will return an object I do not want. My question: is there an easy way of getting around this? For example: a <- NULL a$abc <- 5 a$a [1] 5 a$a <- a$a a $abc [1] 5 $a [1] 5 Certainly from a coding perspective, one might expect that assigning a$a to itself wouldn't do anything, since either 1) a$a doesn't exist, so nothing happens, or 2) a$a does exist and so it just assigns its value to itself. However, in the above case, it creates a new column entirely because I happen to have another column called a$abc. I do not want this behavior. The solution I came up with was to create another indexing function that uses subset() (which doesn't partial match), then check for an error, and if there is an error substitute NULL (to mimic the [ behavior). However, I don't really want to start using another indexing function altogether just to get around this behavior. Is there a better way? Can I turn off partial matching? Thanks, Robert Robert McGehee Geode Capital Management, LLC 53 State Street, 5th Floor | Boston, MA | 02109 Tel: 617/392-8396 Fax: 617/476-6389 mailto:[EMAIL PROTECTED] This e-mail, and any attachments hereto, are intended for us...{{dropped}}
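[Editorially added note, not part of the original post.] For readers of the archive: in R versions later than this thread, `[[` with a character index matches exactly by default (an `exact` argument was added), which gives precisely the NULL-for-missing behaviour the poster wants without a helper function:

```r
## '$' partial-matches on lists; '[[' with a character index matches
## exactly in recent R, returning NULL for a missing name.
a <- list(abc = 5)
a$a                # partial match: 5
a[["a"]]           # exact match: NULL
"a" %in% names(a)  # FALSE -- the fully explicit existence check
```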
[R] partial ranking models
Dear R-users, is a library available to estimate partial ranking models? Best, Ruud
[R] weighting in nls
I'm fitting nonlinear functions to some growth data but I'm getting radically different results in R compared to another program (Prism). Furthermore, the values from the other program give a better fit and seem more realistic. I think there is a problem with my results from the R nls function. The differences only occur with weighted data, so I think I'm making a mistake in the weighting. I'm following the procedure outlined on p. 244 of MASS (or at least I'm trying to). Thus, I'm using mean data with heteroscedasticity, so I'm weighting by n/variance, where the variance is well known from a large data set. This weighting factor is available as the variable 'novervar'. The function is a von Bertalanffy curve of the form weight ~ a*(1-exp(-b*(age-c)))^3. Thus I'm entering the command in the form: solb1wvb <- nls(~ sqrt(novervar)*(weight - a*(1-exp(-b*(age-c)))^3), data=solb1.na.rm, start=list(a=0.85, b=0.45, c=0.48)) Can anyone suggest what I'm doing wrong? I seem to be following the instructions in MASS. I tried following the similar instructions on page 450 of the white book, but these were a bit cryptic. I'm using R 2.0.0 on a Windows 2000 machine. Regards, Robert Brown *** This email and any attachments are intended for the named re...{{dropped}}
RE: [R] Indexing Lists and Partial Matching
This has been discussed a few times on this list before, so you might want to dig into the archive... You might want to check for the existence of the name instead of checking whether the component is NULL: x <- list(bc="bc", ab="ab") is.null(x$b) [1] FALSE "b" %in% names(x) [1] FALSE Andy From: McGehee, Robert [original question snipped]
RE: [R] Indexing Lists and Partial Matching
This came up a few months ago. Check the thread on hashing and partial matching around Nov 18. The short answer is no, you can't turn it off, because lots of code relies on that behavior. Reid Huntsinger -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of McGehee, Robert Sent: Thursday, January 27, 2005 9:34 AM To: r-help@stat.math.ethz.ch Subject: [R] Indexing Lists and Partial Matching [original question snipped]
RE: [R] weighting in nls
Can you show us the difference; i.e., what are the parameter estimates and associated SEs from the two programs? Even better, can you supply an example data set? [With this `trick' for weighted nls, you need to be careful with the output of predict().] Andy From: Robert Brown FM CEFAS [original question snipped]
RE: [R] weighting in nls
Hi there, this is the output from R: solb2wvb <- nls(~ sqrt(novervar)*(weight - a*(1-exp(-b*(age-c)))^3), data=solb2.na.rm, start=list(a=0.85, b=0.45, c=0.48)) summary(solb2wvb) Formula: ~ sqrt(novervar) * (weight - a * (1 - exp(-b * (age - c)))^3) Parameters: Value Std. Error t value a 1.087370 0.01193090 91.1392 b 0.151838 0.00714963 21.2372 c -1.809770 0.13186000 -13.7250 Residual standard error: 4.41368 on 109 degrees of freedom The output from Prism is: von Bertalanffy Best-fit values: A 0.8957 B 0.2381 C -1.358 Std. Error: A 0.002280 B 0.002568 C 0.02919 95% Confidence Intervals: A 0.8912 to 0.9001 B 0.2331 to 0.2431 C -1.415 to -1.300 The latter has a much better visual fit and reasonable residuals. Furthermore, theory and practice both lead to the expectation that this model should fit the data. Incidentally, I was under the impression that with a weighted nls in R the SE values were not accurate. Finally, I've attached the dataset. -----Original Message----- From: Liaw, Andy [mailto:[EMAIL PROTECTED] Sent: 27 January 2005 15:25 To: Robert Brown FM CEFAS; r-help@stat.math.ethz.ch Subject: RE: [R] weighting in nls [earlier messages snipped]
RE: [R] Indexing Lists and Partial Matching
Thank you both for the references. I had missed the previous discussions before posting. I am surprised to hear that there is code that relies on this indexing behavior, especially if it is in the base package. I'm not sure how a function could even make use of this feature without first asking R what the names of the list or data frame are, and then intentionally shortening them to something else. It even seems reasonable that if code _does_ rely on this behavior, then it may be subject to other problems anyway, such as the wrong data being unintentionally returned (when NULL or an error should be returned instead). (Although I freely acknowledge my ignorance of the uses of this feature, as I only recently discovered it.) From the previous posts, it seems the only way in R to code around this is to _always_ check the names of a list before indexing, as anything else could lead to very subtle errors in complex code, unless one can a priori guarantee that the list names are always distinguishable. Perhaps one easy way to optionally remove this feature without breaking anything would be to have an option/flag in the description or namespace of a package indicating that list-indexing partial matching should not be used for any function within that package. But that might be a bit hackish. However, for my personal code, the a[[match("abc", names(a))]] construct (from one of the Nov 18th posts) is easy enough to use, so no intention to rehash an already well-discussed topic. Thanks, Robert PS. None of this applies to partial matching of function arguments, which is certainly widely used. -----Original Message----- From: Huntsinger, Reid [mailto:[EMAIL PROTECTED] Sent: Thursday, January 27, 2005 10:15 AM To: 'McGehee, Robert'; r-help@stat.math.ethz.ch Subject: RE: [R] Indexing Lists and Partial Matching [earlier messages snipped]
[R] Installing Problems
Hi, I tried installing R on my Mac OS X 10.3 machine. After the R installation I tried installing Bioconductor, which requires R. I ran into some problems with Bioconductor. Right now I want to remove (uninstall) all R and Bioconductor components from my machine and start afresh. Can somebody tell me how I can remove (uninstall) all R and Bioconductor components? Thanks, Regards, Ashok
Re: [R] sw
Also stepAIC in library(MASS). Hope this helps. Spencer Graves Christoph Buser wrote: [earlier reply and original question snipped]
[R] How to generate labels or names?
Hi, I'm new to R and I would like to generate labels like data.frame does: V1 V2 V3. I'm trying to generate an N-vector with labels such as Lab1 Lab2 ... LabN. I guess this is pretty easy when you know R ;) Thanks for the help, Eric
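[Editorially added sketch, not part of the original post.] paste() vectorises over its arguments, so such labels are a one-liner:

```r
## Build "Lab1" ... "LabN" by pasting a prefix onto a sequence.
n <- 5
labs <- paste("Lab", 1:n, sep = "")
labs   # "Lab1" "Lab2" "Lab3" "Lab4" "Lab5"
```

(In later versions of R, paste0("Lab", 1:n) does the same without the sep argument.)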
[R] svd error
Hi, I met a problem recently and need your help. I would really appreciate it. I keep receiving the following error message when running a program: 'Error in svd(X) : infinite or missing values in x'. However, I did not use any svd function in this program directly, though I did include the function pseudoinverse. Is the problem caused by doing the pseudoinverse? Best regards, Tongtong
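[Editorially added note, not part of the original post.] Pseudoinverse routines are typically built on svd(), so the error very likely comes from that indirect call: the matrix being inverted contains NA, NaN or Inf entries. A quick base-R check to locate them before inverting (X here is a made-up stand-in matrix):

```r
## Locate non-finite entries that would make svd() fail.
X <- matrix(c(1, 2, NA, 4), nrow = 2, ncol = 2)
any(!is.finite(X))     # TRUE: svd(X) would raise the error
which(!is.finite(X))   # column-major position(s) of the bad entries
```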
RE: [R] weighting in nls
There seems to be some peculiarity with the weights. If you try the unweighted fit, it comes much closer to the answer from Prism... Andy From: Robert Brown FM CEFAS [earlier messages snipped]
I'm following the procedure outlined on p. 244 of MASS (or at least I'm trying to). Thus, I'm using mean data with heteroscedasticity, so I'm weighting by n/variance, where the variance is well known from a large data set. This weighting factor is available as the variable 'novervar'. The function is a von Bertalanffy curve of the form weight ~ (a * (1 - exp(-b * (age - c))))^3. Thus I'm entering the command in the form:
solb1wvb <- nls(~ sqrt(novervar) * (weight - (a * (1 - exp(-b * (age - c))))^3),
                data = solb1.na.rm, start = list(a = 0.85, b = 0.45, c = 0.48))
Can anyone suggest what I'm doing wrong? I seem to be following the instructions in MASS. I tried following the similar instructions on page 450 of the white book but these were a bit cryptic. I'm using R 2.0.0 on a Windows 2000 machine. Regards, Robert Brown
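[Editor's aside: for readers trying the zero-response weighting trick discussed in this thread, here is a minimal, self-contained sketch on simulated data. All variable names and true parameter values below are invented for illustration; this is not the poster's dataset.]

```r
## Sketch of the MASS p.244 weighting trick for nls: with no left-hand
## side, nls() minimises the sum of squares of the expression itself,
## so sqrt(weight) * residual gives a weighted least-squares fit.
set.seed(1)
age      <- rep(1:12, each = 10)
weight   <- (1 * (1 - exp(-0.3 * (age + 1))))^3 + rnorm(length(age), sd = 0.02)
novervar <- rep(10, length(age))          # n / variance, constant here
dat <- data.frame(age, weight, novervar)

fit <- nls(~ sqrt(novervar) * (weight - (a * (1 - exp(-b * (age - c))))^3),
           data = dat, start = list(a = 0.85, b = 0.45, c = -0.5))
round(coef(fit), 2)   # should land near a = 1, b = 0.3, c = -1
```

As Andy notes above, with this formulation predict() returns the weighted residual expression, not fitted weights, so use the coefficients directly when plotting the fitted curve.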
RE: [R] Request for help (reference details)
I referred in my reply to a paper by Diaconis and Sturmfels. The exact reference is: Diaconis and Sturmfels, Algebraic algorithms for sampling from conditional distributions, Ann. Stat 26 (1998) 363-397. They cite the following: Besag and Clifford, Generalized Monte Carlo significance tests, Biometrika 76 (1989) 633-42. which actually contains your problem (section 3, Testing the Rasch model) and gives a very simple Markov chain for sampling from the uniform distribution on these matrices. If you need other than the uniform distribution, see the modifications Diaconis and Sturmfels make (the Metropolis step). Reid Huntsinger -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Huntsinger, Reid Sent: Thursday, January 27, 2005 10:50 AM To: 'Michela Marignani'; r-help@stat.math.ethz.ch Subject: RE: [R] Request for help Persi Diaconis and Bernd Sturmfels have an article on generating random contingency tables uniformly distributed subject to having fixed marginals for the same purpose (null distribution of conditional test) and they used Markov Chain Monte Carlo to sample. That could perhaps be adapted here. The article is in Annals of Statistics from several years ago, and if you google for algebraic statistics you'll probably find several recent expositions of the ideas, possibly even code. Reid Huntsinger -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michela Marignani Sent: Thursday, January 27, 2005 3:52 AM To: r-help@stat.math.ethz.ch Subject: [R] Request for help My name is Michela Marignani and I'm an ecologist trying to solve a problem linked to knight' s tour algorithm. I need a program to create random matrices with presence/absence (i.e. 1,0 values), with defined colums and rows sums, to create null models for statistical comparison of species distribution phenomena. 
I've seen on the web many solutions to the problem, but none provides the freedom to easily change the row and column constraints, and none of them produces matrices of 1s and 0s. Also, I've tried to use R, but it is too complicated for a non-statistician like me. Can you help me? Thank you for your attention, so long Michela Marignani University of Siena Environmental Science Dept. Siena, Italy [EMAIL PROTECTED]
Re: [R] How to generate labels or names?
Hi Eric, If you want to produce a named vector, you can use:
v <- rnorm(20)
names(v) <- paste("Lab", 1:20, sep = "")
Regards, Christoph -- Christoph Buser [EMAIL PROTECTED] Seminar fuer Statistik, LEO C11 ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND phone: x-41-1-632-5414 fax: 632-1228 http://stat.ethz.ch/~buser/ -- Eric Rodriguez writes: Hi, I'm new to R and I would like to generate labels like data.frame does: V1 V2 V3 ... I'm trying to generate a vector of length N with labels such as Lab1 Lab2 ... LabN. I guess this is pretty easy when you know R ;) Thanks for the help, Eric
Re: [R] svd error
You haven't told us what you used to compute the pseudoinverse, but I can get that error message using ginv in library(MASS). When I then typed ginv (without the parentheses), it listed the code, and I quickly saw Xsvd <- svd(X) [using R 2.0.1 under Windows 2000]. Hope this helps. spencer graves p.s. The posting guide (www.R-project.org/posting-guide.html) can help you find answers to many questions like this yourself, in addition to improving your facility with the language AND improving, I believe, your chances of getting a reply that actually answers your question. In this case, if you are not using ginv in library(MASS) and the discussion above doesn't help you solve the problem otherwise, following the posting guide would have made it much easier for someone like me to provide a more useful answer. WU,TONGTONG wrote: Hi, I met a problem recently and need your help. I would really appreciate it. I kept receiving the following error message when running a program: 'Error in svd(X) : infinite or missing values in x'. However, I did not use any svd function in this program, though I did include the function pseudoinverse. Is the problem caused by the pseudoinverse? Best regards, Tongtong
Re: [R] How to generate labels or names?
Note that sometimes it makes more sense to use a list than a labeled vector. Sean On Jan 27, 2005, at 12:26 PM, Spencer Graves wrote: ?paste One of its examples is:
paste("A", 1:6, sep = "")
[1] "A1" "A2" "A3" "A4" "A5" "A6"
spencer graves Eric Rodriguez wrote: Hi, I'm new to R and I would like to generate labels like data.frame does: V1 V2 V3 ... I'm trying to generate a vector of length N with labels such as Lab1 Lab2 ... LabN. I guess this is pretty easy when you know R ;) Thanks for the help, Eric
Re: [R] svd error
Dear Prof. Ripley: With library(MASS), I got the following in R 2.0.1 under Windows 2000:
> X
     [,1] [,2]
[1,]    1    3
[2,]    2   NA
> ginv(X)
Error in svd(X) : infinite or missing values in x
This may not relate to Tongtong Wu's problem, but it used ginv in library(MASS) as you suggested and did produce the cited error message. spencer graves Prof Brian Ripley wrote: On Thu, 27 Jan 2005, WU,TONGTONG wrote: Hi, I met a problem recently and need your help. I would really appreciate it. I kept receiving the following error message when running a program: 'Error in svd(X) : infinite or missing values in x'. However, I did not use any svd function in this program, though I did include the function pseudoinverse. Is the problem caused by the pseudoinverse? Where did you find that function? It is not part of R as it ships, and it *may* be part of GeneTS, where it calls svd after squaring the matrix. But there are simpler pseudoinverse functions (e.g. ginv in MASS) that will not introduce that error. The tool you needed was traceback(): try it to see what it tells you here.
Re: [R] svd error
On Thu, 27 Jan 2005, Spencer Graves wrote: Dear Prof. Ripley: With library(MASS), I got the following in R 2.0.1 under Windows 2000:
> X
     [,1] [,2]
[1,]    1    3
[2,]    2   NA
> ginv(X)
Error in svd(X) : infinite or missing values in x
This may not relate to Tongtong Wu's problem, but it used ginv in library(MASS) as you suggested and did produce the cited error message. I said `introduce'. The cause of the error is in X, not introduced by ginv. pseudoinverse can introduce NaNs/infinities. Please do remember the care I take when writing things. BDR spencer graves Prof Brian Ripley wrote: On Thu, 27 Jan 2005, WU,TONGTONG wrote: Hi, I met a problem recently and need your help. I would really appreciate it. I kept receiving the following error message when running a program: 'Error in svd(X) : infinite or missing values in x'. However, I did not use any svd function in this program, though I did include the function pseudoinverse. Is the problem caused by the pseudoinverse? Where did you find that function? It is not part of R as it ships, and it *may* be part of GeneTS, where it calls svd after squaring the matrix. But there are simpler pseudoinverse functions (e.g. ginv in MASS) that will not introduce that error. The tool you needed was traceback(): try it to see what it tells you here. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA), 1 South Parks Road, Oxford OX1 3TG, UK. Fax: +44 1865 272595
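[Editor's aside: as the exchange shows, the non-finite values arrive in X itself rather than being introduced by ginv. A small sketch of checking for them before calling ginv() from MASS (the matrix and the complete-cases fix are illustrative, not from the original program):]

```r
library(MASS)  # for ginv()
X <- matrix(c(1, 2, 3, NA), nrow = 2)   # the 2x2 matrix from the example above
## svd() (and hence ginv()) fails on non-finite entries, so test first:
any(!is.finite(X))
# [1] TRUE
## One common remedy is to drop incomplete rows before inverting:
Xc <- X[complete.cases(X), , drop = FALSE]
dim(ginv(Xc))   # pseudoinverse of a 1x2 matrix is 2x1
```

traceback() after the error, as Prof. Ripley suggests, is still the right way to find which caller passed the bad matrix in the first place.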
Re: [R] Help with R and Bioconductor
> seemed successful. Then while attempting to getBioC() I had to force
> quit the R application since I had to attend to something else urgently.
> When I returned and tried to getBioC, I am getting errors
Why not just let it run?
> indicating that there is a lock on some files. So I would like to
The directory will likely be path-to-R/library/00LOCK (I say likely because the 'path-to-R/library' part could be something else if you specified an alternate installation directory or your default .libPaths is different than standard), and removing that directory will solve your issues.
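[Editor's aside: a sketch of removing the stale lock from within R itself, assuming the default library location described above; adapt the path if your .libPaths differs.]

```r
## An interrupted package install can leave a 00LOCK directory behind
## in the first library path; remove it and re-run the installation.
lock <- file.path(.libPaths()[1], "00LOCK")
if (file.exists(lock)) unlink(lock, recursive = TRUE)
file.exists(lock)
# [1] FALSE
```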
[R] Array Manipulation
I have a data set that looks like the following:
ID Responce
 1 57
 1 63
 1 49
 2 31
 2 45
 2 67
 2 91
 3 56
 3 43
 4 23
 4 51
 4 61
 4 76
 4 68
 5 34
 5 35
 5 45
I used sample(unique(ID)) to select a sample of IDs, say (1, 4, 5). Now I want to pull out the rows with IDs 1, 4, and 5. I've tried forcing the matrix into a vector but it does not create an appropriate vector. I've also tried an if statement but it didn't work right either. Any suggestions?
RE: [R] Array Manipulation
Something like:
dat[dat$ID %in% sample(unique(dat$ID), 3), ]
Andy From: [EMAIL PROTECTED] I have a data set that looks like the following: [data snipped] I used sample(unique(ID)) to select a sample of IDs, say (1, 4, 5). Now I want to pull out the rows with IDs 1, 4, and 5. I've tried forcing the matrix into a vector but it does not create an appropriate vector. I've also tried an if statement but it didn't work right either. Any suggestions?
Re: [R] Array Manipulation
Liaw, Andy wrote: Something like: dat[dat$ID %in% sample(unique(dat$ID), 3), ] or subset(dat, ID %in% sample(unique(ID), 3)) which I find to be more readable.
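[Editor's aside: both one-liners from this thread can be checked against a small mock-up of the poster's data; the data frame below is illustrative.]

```r
## Mock-up of the posted data: IDs 1..5 with their group sizes.
dat <- data.frame(ID = rep(1:5, times = c(3, 4, 2, 5, 3)),
                  Responce = c(57, 63, 49, 31, 45, 67, 91, 56, 43,
                               23, 51, 61, 76, 68, 34, 35, 45))
set.seed(1)
keep <- sample(unique(dat$ID), 3)        # e.g. three randomly chosen IDs
sub1 <- dat[dat$ID %in% keep, ]          # Andy's version
sub2 <- subset(dat, ID %in% keep)        # the subset() version
identical(sub1, sub2)
# [1] TRUE
```

Note %in% is the right operator here: `==` would recycle the sampled IDs against the ID column element-by-element and silently drop rows.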
Re: [R] Is glm weird or am I?
roy wilson [EMAIL PROTECTED] writes: Hi, I've written a script that checks all bivariate correlations for variables in a matrix. I'm now trying to run a logistic regression on each pair (x,y) where y is a factor with 2 levels. I don't know how (or whether I want) to try to fathom what's up with glm. What I wrote is attached. Here's what I get. [If you want people to debug your code, you might supply the data as well. People might be more helpful if they can actually run your code. Remember who is asking whom for a favour...]
* source("lrtest.R")
building model: Wgend ~ WAY
construct_and_run_model: class of x: integer nlevels(x): 0 class of y: factor nlevels(y): 2
model built model ran -1.070886 0.01171153
building model: Wgend ~ WBWS
construct_and_run_model: class of x: integer nlevels(x): 0 class of y: factor nlevels(y): 2
model built model ran 0.0837854 0.01898052
building model: Wgend ~ Wcond
construct_and_run_model: class of x: factor nlevels(x): 2 class of y: factor nlevels(y): 2
Error in `contrasts<-`(`*tmp*`, value = contr.treatment) :
  contrasts can be applied only to factors with 2 or more levels
* Both Wcond and Wgend take values in {1,2}. My understanding is that, when family is binomial, glm recodes these to {0, 1}. That's consistent with what I've seen previously. Excuse the possible stupidity :-). What you're seeing is similar to this:
x <- factor(rep(0, 20), levels = 0:1)
y <- rbinom(20, 1, .5)
glm(y ~ x, binomial)
Error in `contrasts<-`(`*tmp*`, value = contr.treatment) :
  contrasts can be applied only to factors with 2 or more levels
I.e. x is a two-level factor, but only one level is actually present in the data. You have
attach(newDataSet)
for (cond in 1:2) {
  # Select rows for each condition
  t <- newDataSet[Wcond == cond, ]
and then you proceed to use Wcond as a regressor within the data frame t. -- Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Blegdamsvej 3, 2200 Cph. N, Denmark. Ph: (+45) 35327918, FAX: (+45) 35327907, [EMAIL PROTECTED]
[R] Where is MASS
Dear List: I have been using the MASS package until 5 minutes ago. I just updated some packages from CRAN, something happened, and R crashed. I then started R again and tried to source in some code that calls MASS, but received an error that there is no package called MASS. I then went to install packages from CRAN and MASS was not visible as an option, and I then went to the CRAN website and did not see MASS as one of the contributed packages available. I looked at the changes in the last two editions of R News and didn't see anything related to MASS. I might be missing something obvious. Has something happened to this package? Thanks, Harold (R 2.0.1 on Windows XP)
[R] clustering
Hi, I have a question (sorry if it is a dumb one), which I'll phrase with the following R code:
group1 <- rnorm(n = 50, mean = 0, sd = 1)
group2 <- rnorm(n = 20, mean = 1, sd = 1.5)
group3 <- c(group1, group2)
Now, if I am given a dataset like group3, what method (discriminant analysis, clustering, maybe) is the best to cluster it using R? The known info includes: 2 clusters, normal distributions (but the parameters are unknown). Thanks, Ed
RE: [R] Where is MASS
It's part of the VR bundle (for a _long_ time...) Andy From: Doran, Harold Dear List: I have been using the MASS package until 5 minutes ago. I just updated some packages from CRAN, something happened, and R crashed. I then started R again and tried to source in some code that calls MASS, but received an error that there is no package called MASS. I then went to install packages from CRAN and MASS was not visible as an option, and I then went to the CRAN website and did not see MASS as one of the contributed packages available. I looked at the changes in the last two editions of R News and didn't see anything related to MASS. I might be missing something obvious. Has something happened to this package? Thanks, Harold
Re: [R] Where is MASS
You need to get the VR bundle from CRAN. MASS has been part of the VR bundle depuis longtemps. Like forever. But the packages in the VR bundle ship with R by default, so if you've installed R you should have those packages there automatically. So it would seem that something got damaged in the R crash that you spoke of. Perhaps you should re-install R. cheers, Rolf Turner [EMAIL PROTECTED] ===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+=== Harold Doran wrote: I have been using the MASS package until 5 minutes ago. I just updated some packages from CRAN, something happened and R crashed. I then started R again and tried to source in some code that calls MASS, but received an error that there is not a package called MASS. I then went to install packages from CRAN and MASS was not visible as an option and I then went to the CRAN website and did not see MASS as one of the contributed packages available. I looked at the changes in the last two editions of R-News and didn't see anything related to MASS. I might be missing something obvious. Has something happened to this package?
[R] Results of MCD estimators in MASS and rrcov
The two implementations use different consistency factors as well as different small-sample correction factors. 1. The search parts of both implementations produce the same result - compare rrcov.mcd$best and mass.mcd$best. 2. The raw MCD covariance matrix is corrected as follows: MASS: Rousseeuw and Leroy (1987), p. 259 (eq. 1.26); Marazzi (1993) (or maybe Rousseeuw and van Zomeren (1990), p. 638, eq. A.9). rrcov: Croux and Haesbroeck (1999), Pison et al. (2002), p. 337; Pison et al. (2002), p. 338. 3. The reweighted (final) covariance matrix is corrected as follows: MASS: no correction. rrcov: Pison et al. (2002), p. 339. This explains the different covariance matrices. As far as the location is concerned, in this particular case the raw MCD estimates in MASS identify one additional outlier - observation 53, which is discarded from the computation of the reweighted estimates. Look at the following plots and judge yourself whether this is an outlier or not:
covPlot(hbk, mcd = rrcov.mcd, which = "distance", id.n = 15)
covPlot(hbk, mcd = mass.mcd, which = "distance", id.n = 15)
valentin
RE: [R] Where is MASS
Harold, Try looking in C:\Program Files\R\rw2001\library (presuming the default install path) to see if the MASS folder is still there. If it is, then what exactly is the error message you are getting when you try library(MASS)? If MASS is no longer there, I would try the "Packages > Install package(s) from CRAN..." menu item in the R Console and see if VR is listed. If so, install that via the usual select-and-OK process and MASS should be restored. Actually, this *might* work even if you still have a MASS folder. Hope that helps, Bill --- Bill Pikounis, PhD Nonclinical Statistics Centocor, Inc. 200 Great Valley Parkway MailStop C4-1 Malvern, PA 19355 I have been using the MASS package until 5 minutes ago. I just updated some packages from CRAN, something happened and R crashed. I then started R again and tried to source in some code that calls MASS, but received an error that there is not a package called MASS. I then went to install packages from CRAN and MASS was not visible as an option and I then went to the CRAN website and did not see MASS as one of the contributed packages available. I looked at the changes in the last two editions of R-News and didn't see anything related to MASS. I might be missing something obvious. Has something happened to this package?
Re: [R] clustering
Cluster analysis should be able to handle that. I think if you know how many clusters you have, kmeans is OK, or the EM algorithm can also do that. On Thu, Jan 27, 2005 at 03:44:42PM -0500, WeiWei Shi wrote: Hi, I have a question (sorry if it is a dumb one), which I'll phrase with the following R code:
group1 <- rnorm(n = 50, mean = 0, sd = 1)
group2 <- rnorm(n = 20, mean = 1, sd = 1.5)
group3 <- c(group1, group2)
Now, if I am given a dataset like group3, what method (discriminant analysis, clustering, maybe) is the best to cluster it using R? The known info includes: 2 clusters, normal distributions (but the parameters are unknown). Thanks, Ed
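[Editor's aside: a runnable sketch of the kmeans suggestion on the simulated data from this thread. Cluster labels from kmeans are arbitrary, so only the sizes are inspected; the EM alternative via the mclust package is noted but not run.]

```r
set.seed(42)
group3 <- c(rnorm(50, mean = 0, sd = 1), rnorm(20, mean = 1, sd = 1.5))

## k-means with the known number of clusters (2); a vector is treated
## as a one-column data matrix.
km <- kmeans(group3, centers = 2)
length(km$cluster)           # one cluster label per observation
# [1] 70
table(km$cluster)            # cluster sizes (labels are arbitrary)

## For a model-based (EM) fit of a two-component normal mixture, see
## e.g. the mclust package: Mclust(group3, G = 2)  (not run here).
```

With heavily overlapping normals like these, any hard-assignment method will draw a somewhat arbitrary boundary, which is consistent with Ed's observation that different methods disagree.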
[R] Finding runs of TRUE in binary vector
I have a binary vector and I want to find all regions of that vector that are runs of TRUE (or FALSE).
a <- rnorm(10)
b <- a > 0.5
b
 [1] TRUE TRUE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE
My function would return something like a list:
region[[1]] 1,3
region[[2]] 5,5
region[[3]] 7,10
Any ideas besides looping and setting starts and ends directly? Thanks, Sean
Re: [R] clustering
Hi, thanks for the reply. In fact, I tried both of them, and I also tried other methods, and I found that all of them gave me different boundaries (on my real datasets). I am thinking about k-median but hoping to get more suggestions from all of you in this forum. Cheers, Ed On Thu, 27 Jan 2005 15:37:16 -0600, [EMAIL PROTECTED] wrote: Cluster analysis should be able to handle that. I think if you know how many clusters you have, kmeans is OK, or the EM algorithm can also do that. On Thu, Jan 27, 2005 at 03:44:42PM -0500, WeiWei Shi wrote: Hi, I have a question (sorry if it is a dumb one), which I'll phrase with the following R code:
group1 <- rnorm(n = 50, mean = 0, sd = 1)
group2 <- rnorm(n = 20, mean = 1, sd = 1.5)
group3 <- c(group1, group2)
Now, if I am given a dataset like group3, what method (discriminant analysis, clustering, maybe) is the best to cluster it using R? The known info includes: 2 clusters, normal distributions (but the parameters are unknown). Thanks, Ed
RE: [R] Finding runs of TRUE in binary vector
Untested:
c(TRUE, b[-1] != b[-length(b)])
gives you the (logical) indexes of the beginnings of the runs;
c(b[-1] != b[-length(b)], TRUE)
gives you the (logical) indexes of the ends of the runs. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sean Davis Sent: Thursday, January 27, 2005 2:14 PM To: r-help Subject: [R] Finding runs of TRUE in binary vector I have a binary vector and I want to find all regions of that vector that are runs of TRUE (or FALSE).
a <- rnorm(10)
b <- a > 0.5
b
 [1] TRUE TRUE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE
My function would return something like a list: region[[1]] 1,3 region[[2]] 5,5 region[[3]] 7,10 Any ideas besides looping and setting starts and ends directly? Thanks, Sean
Re: [R] Finding runs of TRUE in binary vector
Thanks Patrick, Albyn, and Vadim. rle() does what I want and, Vadim, your method gives the same results in a different form. I appreciate the help! Sean On Jan 27, 2005, at 5:29 PM, Vadim Ogranovich wrote: Untested: c(TRUE, b[-1] != b[-length(b)]) gives you the (logical) indexes of the beginnings of the runs; c(b[-1] != b[-length(b)], TRUE) gives you the (logical) indexes of the ends of the runs. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sean Davis Sent: Thursday, January 27, 2005 2:14 PM To: r-help Subject: [R] Finding runs of TRUE in binary vector I have a binary vector and I want to find all regions of that vector that are runs of TRUE (or FALSE). a <- rnorm(10); b <- a > 0.5; b [1] TRUE TRUE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE My function would return something like a list: region[[1]] 1,3 region[[2]] 5,5 region[[3]] 7,10 Any ideas besides looping and setting starts and ends directly? Thanks, Sean
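[Editor's aside: the rle()-based answer from this thread can be wrapped into a small function returning the start and end of each TRUE run, matching the regions Sean asked for. The function name is illustrative.]

```r
## Find start/end indices of each run of TRUE in a logical vector.
true_runs <- function(b) {
  r <- rle(b)
  end   <- cumsum(r$lengths)        # last index of every run
  start <- end - r$lengths + 1      # first index of every run
  cbind(start, end)[r$values, , drop = FALSE]  # keep only TRUE runs
}

b <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE)
true_runs(b)
#      start end
# [1,]     1   3
# [2,]     5   5
# [3,]     7  10
```

`drop = FALSE` keeps the result a matrix even when only one TRUE run exists.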
[R] Survreg with gamma distribution
Dear r-help subscribers, I am working on some survival analysis of interval-censored failure time data in R. I have done similar analysis before using PROC LIFEREG in SAS. In that instance, a gamma survival function was the optimum parametric model for describing the survival and hazard functions. I would like to be able to use a gamma function in R, but apparently the survival package does not support this distribution. I have been googling around for some help, and have found some threads from a similar question posted to the R-help list in October last year. Because I am a bit of a survival analysis and R newbie, I didn't really understand the discussion thread. I've been working with a Weibull distribution, thus:
leafsurv.weibull <- survreg(Surv(minage, maxage, censorcode, type = "interval") ~ 1, dist = "weibull")
And I guess I'd like to be able to do something that's the equivalent of:
leafsurv.gamma <- survreg(Surv(minage, maxage, censorcode, type = "interval") ~ 1, dist = "gamma")
At least one of the R-help comments mentioned using survreg.distributions to customise a gamma distribution, but I can't figure out how to make this work with the resources (intellectual and bibliographical!) that I have available. With thanks in advance for your help, Dr Roger Dungan School of Biological Sciences University of Canterbury Christchurch, New Zealand ph +64 3 366 7001 ext. 4848 fax +64 3 354 2590
[R] Multiple colors in a plot title
R-Help: Is there a way to use multiple colors in the title of a plot? For instance, to have certain words be red, and certain words be blue? thanks in advance, Leo -- 1718 Commonwealth Avenue Apt 2 Brighton, MA 02135 Cell: 617-599-0037
Re: [R] Array Manipulation
And something like this:
dat[dat$ID == sample(unique(dat$ID), 3), 2]
? I'm not sure about the ", 2" - maybe you need the full matrix? ps: first time, I forgot the list
Re: [R] Finding runs of TRUE in binary vector
Sean Davis [EMAIL PROTECTED] writes: I have a binary vector and I want to find all regions of that vector that are runs of TRUE (or FALSE). a <- rnorm(10); b <- a > 0.5; b [1] TRUE TRUE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE My function would return something like a list: region[[1]] 1,3 region[[2]] 5,5 region[[3]] 7,10 Any ideas besides looping and setting starts and ends directly? You could base it on rle(b):
> rle(b)
Run Length Encoding
  lengths: int [1:5] 1 1 2 4 2
  values : logi [1:5] TRUE FALSE TRUE FALSE TRUE
> b
 [1] TRUE FALSE TRUE TRUE FALSE FALSE FALSE FALSE TRUE TRUE
(Notice that my b differs from yours.) Then you might proceed with:
end <- cumsum(rle(b)$lengths)
start <- rev(length(b) + 1 - cumsum(rev(rle(b)$lengths)))
# or: start <- c(1, end[-length(end)] + 1)
cbind(start, end)[rle(b)$values, ]
     start end
[1,]     1   1
[2,]     3   4
[3,]     9  10
-- Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Blegdamsvej 3, 2200 Cph. N, Denmark. Ph: (+45) 35327918, FAX: (+45) 35327907, [EMAIL PROTECTED]
[R] binomial data and mixed model
Hi, I am a first-time user of R. I was hoping I could get some help with some data I need to analyze. The experiment is a completely randomized design with 2 factors (source material and depth). It was supposed to consist of 4 treatments replicated 3 times: sources 1 and 2, each applied at 10 cm and at 20 cm. During construction of the treatments the depths varied considerably, so I can't test my samples against 10 and 20 cm any more; the depths are now considered random rather than fixed. Each treatment was sampled for depth and total density of plants along 3 transects with 28 quadrats per transect. The data are very non-normal (lots of zeros), so the only way to analyze them is to convert to binomial data. Does anyone know what type of analysis I should use? I was told that an NLMIXED model would work, but also that a generalized linear mixed model was appropriate. Is there any info on using these in R?

Dean D. MacKenzie
Master of Science Candidate, Department of Renewable Resources
University of Alberta, Edmonton, AB T6G 2H1
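As a hedged sketch of the second suggestion: one way to fit a binomial mixed model in R is glmmPQL() from the MASS package (penalized quasi-likelihood), treating transect as the random grouping factor. The data frame and column names below are invented for illustration, not taken from the poster's data:

```r
library(MASS)   # provides glmmPQL
library(nlme)   # glmmPQL builds on lme

set.seed(42)
# Invented example: presence/absence per quadrat, 28 quadrats on each
# of 6 transects, with source and measured depth as covariates.
d <- data.frame(
  present  = rbinom(168, 1, 0.3),
  source   = gl(2, 84, labels = c("s1", "s2")),
  depth    = runif(168, 8, 22),
  transect = gl(6, 28)
)
fit <- glmmPQL(present ~ source + depth, random = ~ 1 | transect,
               family = binomial, data = d, verbose = FALSE)
summary(fit)$tTable   # fixed-effect estimates and tests
```

With real data one would want the random-effects structure to mirror the sampling design (quadrats within transects within plots); this sketch only shows the mechanics of the call.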
Re: [R] Threshold Models in gnlm
Eliot McIntire emcintire at forestry.umt.edu writes:

> Hello, I am interested in fitting a generalized nonlinear regression (gnlr)
> model with negative binomial errors. I have found Jim Lindsey's package
> that will do gnlr, but I am having trouble with the particular model I am
> interested in fitting. It is a threshold model, where below a certain value
> of one of the parameters being fitted, the model changes.

[BIG SNIP]

Threshold models (also known as piecewise linear or, more recently, hockey-stick models) are actually surprisingly challenging to fit numerically. There are papers on least-squares fitting going back to Bacon and Watts (1971, Biometrika) and before that to Quandt, and more recent (2000) posts on the S-PLUS lists from Bill Venables, Mary Lindstrom, and Nicholas Barrowman (who has a paper with Ram Myers on hockey-stick models in fisheries). The basic trick is that, unless you do some kind of numerical smoothing, it's very easy to get stuck in local minima. I got a little carried away with the problem and am sending you some code off-list ...

cheers
Ben Bolker
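To illustrate the least-squares version of the problem Ben describes (all names and the simulated data here are mine, not his off-list code): a hockey-stick mean function fit with optim(). The kink in the breakpoint parameter is exactly what makes local minima easy to fall into, so trying several breakpoint starting values is a crude but effective guard:

```r
set.seed(7)
x <- seq(0, 10, length = 200)
y <- 1 + 0.8 * pmax(x - 5, 0) + rnorm(200, sd = 0.2)  # true break at 5

sse <- function(p, x, y) {          # p = (intercept, slope, breakpoint)
  mu <- p[1] + p[2] * pmax(x - p[3], 0)
  sum((y - mu)^2)
}

# try several breakpoint starts; keep the fit with the smallest SSE
fits <- lapply(c(2, 5, 8), function(cp)
  optim(c(mean(y), 1, cp), sse, x = x, y = y))
best <- fits[[which.min(sapply(fits, `[[`, "value"))]]
round(best$par, 2)                  # should land near c(1, 0.8, 5)
```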
Re: [R] Help with R and Bioconductor
Hi Jeff,

First of all, thanks for the response. But I am still encountering some problems with installing Bioconductor. I did remove the directory 00LOCK and ran getBioC("affy"). During the installation I get a warning several times:

  chmod: /Library/Frameworks/R.framework/Version/2.0.1/Resources/library/R.css: Operation Not Permitted

Is this warning message critical to the installation? Moreover, at the end of the installation I also got the message appended to the end of this email.

Question 1: Package annotate was not updated. Why wasn't it updated, and how can I update it?
Question 2: Should I be worried about the other warning messages?

Thanks, regards,
Ashok

Packages that were not updated: annotate
Warning messages:
1: Package annotate version 1.5.1 suggests zebrafish; Package annotate version 1.5.1 suggests xenopuslaevis in: resolve.depends(pkgInfo, repEntry, force, lib = lib, searchOptions = searchOptions)
2: Installation of package annaffy had non-zero exit status in: installPkg(fileName, pkg, pkgVer, type, lib, repEntry, versForce)
3: Package annotate version 1.5.1 suggests zebrafish; Package annotate version 1.5.1 suggests xenopuslaevis in: resolve.depends(pkgInfo, repEntry, force, lib = lib, searchOptions = searchOptions)
4: Installation of package Rgraphviz had non-zero exit status in: installPkg(fileName, pkg, pkgVer, type, lib, repEntry, versForce)
5: Installation of package geneplotter had non-zero exit status in: installPkg(fileName, pkg, pkgVer, type, lib, repEntry, versForce)
6: Package annotate version 1.5.1 suggests zebrafish; Package annotate version 1.5.1 suggests xenopuslaevis in: resolve.depends(pkgInfo, repEntry, force, lib = lib, searchOptions = searchOptions)

On Thu, 27 Jan 2005 13:41:45 -0500 (EST), Jeff Gentry [EMAIL PROTECTED] wrote:
> > seemed successful. Then while attempting to getBioC() I had to force quit
> > the R application since I had to attend to something else urgently.
> Why not just let it run?
> > When I returned and tried to getBioC, I am getting errors indicating that
> > there is a lock on some files. So I would like to
> The directory will likely be path-to-R/library/00LOCK (I say "likely"
> because the path-to-R/library part could be something else if you specified
> an alternate installation directory or your default .libPaths is different
> than standard), and removing that directory will solve your issues.
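Jeff's fix amounts to deleting the stale lock directory from the shell. A hedged illustration (the real path is whatever your R session reports as .libPaths()[1]; a temporary stand-in directory is used here so the commands are safe to run anywhere):

```shell
# Simulate a stale 00LOCK left behind by an interrupted package install,
# then remove it. Substitute your actual R library path for LIBDIR.
LIBDIR=$(mktemp -d)              # stand-in for <path-to-R>/library
mkdir -p "$LIBDIR/00LOCK"        # pretend an install was interrupted
rm -rf "$LIBDIR/00LOCK"          # the actual fix
ls "$LIBDIR"                     # prints nothing: the lock is gone
```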
Re: [R] Finding runs of TRUE in binary vector
use 'rle':

  a <- rnorm(20)
  b <- a > .5
  b
   [1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE
  [13] FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE
  rle(b)
  Run Length Encoding
    lengths: int [1:9] 1 7 2 2 2 3 1 1 1
    values : logi [1:9] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE

--
James Holtman  "What is the problem you are trying to solve?"
Executive Technical Consultant, Office of Technology, Convergys

Sean Davis wrote (01/27/2005 17:13):
> I have a binary vector and I want to find all regions of that vector that
> are runs of TRUE (or FALSE). [SNIP] Any ideas besides looping and setting
> starts and ends directly? Thanks, Sean
RE: [R] agglomerative coefficient in agnes (cluster)
Thanks very much Andy for the code and the explanation. The meaning of AC is much clearer now. I did notice, when I tried the code, that the results were not exactly the same as yours:

  sapply(c(.25, .5), testAC, x=x[1:4], method="single")
  Loading required package: cluster
  Error in FUN(X[[1]], ...) : Object x not found
  x = rnorm(50)
  sapply(c(.25, .5), testAC, x=x[1:4], method="single")
  [1] 0.7450599 0.9926918

  version
           _
  platform i686-pc-linux-gnu
  arch     i686
  os       linux-gnu
  system   i686, linux-gnu
  status
  major    2
  minor    0.1
  year     2004
  month    11
  day      15
  language R

Regards,
Weiguang

--- Liaw, Andy [EMAIL PROTECTED] wrote:
> It has to do with sample sizes. Consider the following:
>
>   testAC <- function(prop1 = 0.5, x = rnorm(50), center = c(0, 100), ...) {
>       stopifnot(require(cluster))
>       n <- length(x)
>       n1 <- ceiling(n * prop1)
>       n2 <- n - n1
>       agnes(x + rep(center, c(n1, n2)), ...)$ac
>   }
>
> Now some tests:
>
>   sapply(c(.25, .5), testAC, x=x[1:4], method="single")
>   [1] 0.7427591 0.9862944
>   sapply(1:5 / 10, testAC, x=x[1:10], method="single")
>   [1] 0.8977139 0.9974224 0.9950061 0.9946366 0.9946366
>   sapply(1:5 / 10, testAC, x=x, method="single")
>   [1] 0.9982955 0.9969757 0.9971114 0.9971127 0.9975111
>
> So it seems like AC does not consider isolated singletons as cluster
> structures. This is only discernible at small sample sizes, though.
>
> Andy

--- Liaw, Andy [EMAIL PROTECTED] wrote:
> BTW, I checked the book. You're not going to find much more than that.

Thanks for checking.
Weiguang
RE: [R] clustering
It depends a lot on what you know or don't know about the data, and what problem you're trying to solve. If you know for sure it's a mixture of Gaussians, likelihood-based approaches might be better. MASS (the book) has an example of fitting a univariate mixture of Gaussians using various optimizers. The code is even in $R_HOME/library/MASS/scripts/ch16.R.

Andy

From: WeiWei Shi
> Hi, thanks for the reply. In fact, I tried both of them and I also tried
> the other method, and I found all of them gave me different boundaries (on
> my real datasets). I am thinking about k-median but hoping to get more
> suggestions from all of you in this forum. Cheers, Ed
>
> On Thu, 27 Jan 2005 15:37:16 -0600, [EMAIL PROTECTED] wrote:
> > Cluster analysis should be able to handle that. I think if you know how
> > many clusters you have, kmeans is OK, or the EM algorithm can also do
> > that.
> >
> > On Thu, Jan 27, 2005 at 03:44:42PM -0500, WeiWei Shi wrote:
> > > Hi, I just have a question (sorry if it is a dumb one); I phrase my
> > > question in the following R code:
> > >
> > >   group1 <- rnorm(n=50, mean=0, sd=1)
> > >   group2 <- rnorm(n=20, mean=1, sd=1.5)
> > >   group3 <- c(group1, group2)
> > >
> > > Now, if I am given a dataset from group3, what method (discriminant
> > > analysis, clustering, maybe?) is best to cluster them using R? The
> > > known info includes: 2 clusters, normal distributions (but the
> > > parameters are unknown). Thanks, Ed
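In the spirit of the MASS ch. 16 example Andy mentions (this is my own minimal sketch, not the book's code): fit a two-component normal mixture by direct maximum likelihood with optim(), parameterizing the mixing proportion through plogis() and the standard deviations through exp() so the optimizer works on unconstrained scales:

```r
set.seed(1)
x <- c(rnorm(50, 0, 1), rnorm(20, 1, 1.5))   # the poster's setup

# Negative log-likelihood; p = (logit pi, mu1, mu2, log s1, log s2)
nll <- function(p, x) {
  pi1 <- plogis(p[1]); s1 <- exp(p[4]); s2 <- exp(p[5])
  -sum(log(pi1 * dnorm(x, p[2], s1) + (1 - pi1) * dnorm(x, p[3], s2)))
}
fit <- optim(c(0, -1, 1, 0, 0), nll, x = x)

# Posterior probability of component 1 for each point -> hard labels
p <- fit$par
d1 <- plogis(p[1]) * dnorm(x, p[2], exp(p[4]))
d2 <- (1 - plogis(p[1])) * dnorm(x, p[3], exp(p[5]))
cluster <- ifelse(d1 / (d1 + d2) > 0.5, 1, 2)
```

With components this overlapped the likelihood surface is nasty (and can even be unbounded at degenerate solutions), so in practice one would try many starting values or use the EM algorithm instead of a single optim() run.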
Re: [R] clustering
Actually, the problem I am trying to solve is to discretize a continuous variable (which is my response variable in my project) so that I can turn a regression problem into a classification one. (There are many reasons for doing this.) Since there is no class label for this variable (because this variable is my class variable :), an unsupervised approach can be applied here. However, checking the related papers shows there is little research (to my knowledge, and I haven't checked the MCC yet) in this field. Using qqnorm to check normality, the histogram indicates there might be two normal distributions. My approach is to split the values of this variable into 2 or 3 intervals and check each interval's normality again. If an approach like clustering, or the one Andy suggests, works well, then I should get much better normality. I will try that tomorrow. I am not sure if my idea works here; please advise!

Thanks,
Ed

On Thu, 27 Jan 2005 18:58:28 -0500, Liaw, Andy [EMAIL PROTECTED] wrote:
> [SNIP]
[R] Matrix multiplication in R is inaccurate!!
If you multiply a matrix by its inverse you should get an identity matrix. In R, you get an answer that is accurate only to about 16 decimal places. Why can't one get a perfect answer? See for example:

  c(5,3) -> x1
  c(3,2) -> x2
  cbind(x1,x2) -> x
  solve(x) -> y
  x %*% y

Vikas Rawal
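This is ordinary floating-point rounding, not an R bug: double precision carries roughly 16 significant decimal digits, so the off-diagonal entries come out on the order of 1e-16 rather than exactly zero. Comparisons should therefore use a tolerance, for example:

```r
x <- cbind(c(5, 3), c(3, 2))
y <- solve(x)
x %*% y                        # identity up to ~1e-16 residues
all.equal(x %*% y, diag(2))    # TRUE: equal within numerical tolerance
zapsmall(x %*% y)              # rounds the tiny residues away for display
```

The same caveat applies to any floating-point comparison in R: prefer all.equal() (or an explicit tolerance) over == for computed numeric results.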
[R] Error: cannot allocate vector of size... but with a twist
Hi, I have a memory problem, one which I've seen pop up on the list a few times, but which seems to be a little different. It is the "Error: cannot allocate vector of size x" problem. I'm running R 2.0 on RH9. My R program is joining big datasets together, so there are lots of duplicate cases of data in memory. This (and other tasks) prompted me to expand my swap partition to 16Gb. I have 0.5Gb of regular, fast DDR. The OS seems to be fine accepting the large amount of memory, and I'm not restricting memory use or vector size in any way. R chews up memory up until the 3.5Gb area, then halts. Here's the last bit of output:

  # join the data together
  cdata01.data <- cbind(c.1,c.2,c.3,c.4,c.5,c.6,c.7,c.8,c.9,c.10,c.11,c.12,c.13,c.14,c.15,c.16,c.17,c.18,c.19,c.20,c.21,c.22,c.23,c.24,c.25,c.26,c.27,c.28,c.29,c.30,c.31,c.32,c.33)
  Error: cannot allocate vector of size 145 Kb
  Execution halted

145 Kb?? This has me rather lost. Maybe an overflow of some sort? Maybe an OS problem of some sort? I'm scratching my head here. Before you question it, there is a legitimate reason for sticking all these components in the one data.frame. One of the problems here is that tinkering is not really feasible: this cbind took 1.5 hrs to finally halt.

Any help greatly appreciated,
James
Re: [R] Error: cannot allocate vector of size... but with a twist
On Fri, 28 Jan 2005, James Muller wrote:
> I have a memory problem [SNIP] I'm running R 2.0 on RH9. [SNIP] R chews up
> memory up until the 3.5Gb area, then halts.

You have, presumably, a 32-bit computer with a 4GB-per-process memory limit. You have hit it (you get less than 4Gb as the OS services need some and there is some fragmentation). The last failed allocation may be small, as you see, if you are allocating lots of smallish pieces. The only way to overcome that is to use a 64-bit OS and version of R.

What was the 'twist' mentioned in the title? You will find a similar overall limit mentioned about weekly on this list if you look in the archives.

--
Brian D. Ripley, Professor of Applied Statistics, University of Oxford
http://www.stats.ox.ac.uk/~ripley/
Re: [R] Error: cannot allocate vector of size... but with a twist
On Fri, 28 Jan 2005, James Muller wrote:
> I have a memory problem, one which I've seen pop up in the list a few
> times, but which seems to be a little different. It is the "Error: cannot
> allocate vector of size x" problem. I'm running R 2.0 on RH9.
[SNIP]
> R chews up memory up until the 3.5Gb area, then halts.

32-bit addressing goes to ~4Gb.

-- SIGSIG -- signature too long (core dumped)