Re: [R] Splitting Data Frame into Two Based on Source Array
    data_main[match(src, data_main$V1), ]

and, for the complement of src (call it srcc),

    data_main[match(srcc, data_main$V1), ]

...this only works so long as each item occurs only once in V1.

--Adam

On Tue, 9 Sep 2008, Gundala Viswanath wrote:

> Dear all,
>
> Suppose I have this data frame:
>
>     data_main
>      V1    V2
>     foo  13.1
>     bar  12.0
>     qux  10.4
>     cho  20.33
>     pox   8.21
>
> I want to split the data into two parts. The first part contains the rows listed in the source array:
>
>     src
>     [1] bar pox
>
> and the other part is the complement. In the end we hope to get these two data frames:
>
>     data_child1
>      V1    V2
>     bar  13.1
>     pox   8.21
>
> and
>
>     data_child2_complement
>     foo  13.1
>     qux  10.4
>     cho  20.33
>
> Is there a compact way to do it in R?
>
> - Gundala Viswanath
> Jakarta - Indonesia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
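The restriction Adam mentions (one occurrence per item) can be avoided with a logical index. The following is only a sketch on toy data modelled on the question; the names `in_src` and `data_child2` are made up for illustration.

```r
# Toy data modelled on the question (values are illustrative)
data_main <- data.frame(
  V1 = c("foo", "bar", "qux", "cho", "pox"),
  V2 = c(13.1, 12.0, 10.4, 20.33, 8.21),
  stringsAsFactors = FALSE
)
src <- c("bar", "pox")

in_src <- data_main$V1 %in% src      # TRUE for rows whose V1 is listed in src
data_child1 <- data_main[in_src, ]   # rows listed in src
data_child2 <- data_main[!in_src, ]  # the complement
```

Because `%in%` returns one logical value per row, repeated values in V1 all end up in the right part, and no explicit complement vector is needed.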
Re: [R] cluster/snow question
Hi Tolga,

in SNOW you have to start a cluster with the commands

    library(snow)
    cluster <- makeCluster(n)   # n = number of nodes

The object cluster is a list with one element per node, and each element is again a list holding all the information about that node (rank, comm, tags). The size of the cluster is the length of the list:

    length(cluster)   # equals the number of nodes

E.g. the rank of node one you can get with

    cluster[[1]]$rank

Best
Markus

[EMAIL PROTECTED] wrote:

> Dear R Users,
>
> I am attempting to use the snow package for clustering. Is there a way to identify, in the environment of each node, a rank for that node and also the total size of the cluster?
>
> By way of analogy, I am looking for the functions in snow equivalent to mpi.comm.rank() and mpi.comm.size() from Rmpi, in case that makes things clearer.
>
> Thanks in advance,
> Tolga

--
Dipl.-Tech. Math. Markus Schmidberger
Ludwig-Maximilians-Universität München
IBE - Institut für medizinische Informationsverarbeitung, Biometrie und Epidemiologie
Marchioninistr. 15, D-81377 Muenchen
URL: http://ibe.web.med.uni-muenchen.de
Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de
Tel: +49 (089) 7095 - 4599
Re: [R] yahoo finance into R
thomastos wrote:

> Hi R,
>
> I am familiar with the basics of R. To learn more I would like to know how to get data from Yahoo! Finance directly into R. So basically I want a data frame or matrix to do some data analysis. How do I do this?

    RSiteSearch("yahoo")

    get.hist.quote() from tseries
    yahooSeries() from fImport (untried)

--
Peter Dalgaard, Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
University of Copenhagen, Denmark
Ph: (+45) 35327918
([EMAIL PROTECTED]) FAX: (+45) 35327907
[R] match problem by rownames
Hi all,

While dat['a1',] and dat['a10',] produce the same results in the following example, I'd like dat['a1',] to return NAs.

    dat <- data.frame(x1 = paste(letters[1:5], 10, sep=''), x2 = rnorm(5))
    rownames(dat) <- dat$x1
    dat['a1',]
    dat['a10',]

    sessionInfo()
    R version 2.7.2 (2008-08-25)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] lattice_0.17-13

    loaded via a namespace (and not attached):
    [1] grid_2.7.2

Regards,
Xianming
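One possible way to get the behaviour asked for here is to look the name up with match(), which only matches exactly, and index by the resulting position. This is a sketch rather than an answer given in the thread; the object names `exact` and `hit` are made up.

```r
set.seed(1)
dat <- data.frame(x1 = paste(letters[1:5], 10, sep = ""), x2 = rnorm(5),
                  stringsAsFactors = FALSE)
rownames(dat) <- dat$x1

# Character row indexing of a data frame matches row names partially,
# which is why dat["a1", ] returns the "a10" row in the question.
# match() compares whole strings, so an incomplete name yields NA.
exact <- dat[match("a1",  rownames(dat)), ]  # no exact match: one row of NAs
hit   <- dat[match("a10", rownames(dat)), ]  # exact name: the "a10" row
```

Indexing a data frame with an NA position returns a row filled with NA, which is the result requested in the post.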
[R] Compiling date
Hi,

I have the following kind of dataset (all are dates) in my Excel sheet:

    09/08/08
    09/05/08
    09/04/08
    09/02/08
    09/01/08
    29/08/2008
    28/08/2008
    27/08/2008
    26/08/2008
    25/08/2008
    22/08/2008
    21/08/2008
    20/08/2008
    18/08/2008
    14/08/2008
    13/08/2008
    08/12/08
    08/11/08
    08/08/08
    08/07/08

However, I want to use R to convert those data so that all dates are in the same format. Can anyone please tell me an automated way of doing that?
[R] Memory allocation problem (during kmeans)
Dear all,

I am trying to apply kmeans clustering to a data file (size is about 300 Mb).

I read this file using

    x=read.table('file path' , sep= )

then I do

    kmeans(x,25)

but the process stops after two minutes with an error:

    Error: cannot allocate vector of size 907.3 Mb

When I read the archive I noticed that the best solution is to use a 64-bit OS:

    Error messages beginning "cannot allocate vector of size" indicate a failure to obtain memory, either because the size exceeded the address-space limit for a process or, more likely, because the system was unable to provide the memory. Note that on a 32-bit OS there may well be enough free memory available, but not a large enough contiguous block of address space into which to map it.

The problem is that I have two machines with two OSes (32-bit and 64-bit), and when I used the 64-bit OS the same error remained.

Thank you if you have any suggestions for me, and excuse me because I am a newbie.

Here is the default information for the 64-bit OS:

    sessionInfo()
    R version 2.7.1 (2008-06-23)
    x86_64-redhat-linux-gnu

    gc()
             used (Mb) gc trigger (Mb) max used (Mb)
    Ncells 137955  7.4         35 18.7       35 18.7
    Vcells 141455  1.1     786432  6.0   601347  4.6

I also tried to start R using the options that control the available memory, and the result is still the same. Or maybe I don't assign the correct values.

Thank you in advance.

--
Rami BATAL
Re: [R] How to preserve date format while aggregating
This is completely wrong: min _is_ defined for date-times:

    min(.leap.seconds)
    [1] "1972-07-01 01:00:00 BST"

Please do study the posting guide and do your homework before posting: you seem unaware of what the POSIXct class is, so ?DateTimeClasses is one place you need to start. And

    methods(Summary)
    [1] Summary.data.frame      Summary.Date            Summary.difftime
    [4] Summary.factor          Summary.numeric_version Summary.POSIXct
    [7] Summary.POSIXlt

so ?Summary is another.

On Mon, 8 Sep 2008, Adam D. I. Kramer wrote:

> Hi Erich,
>
> Since min() is defined for numbers and not dates, the problem is in the min() function. min() is converting from date format to number format.
>
> Your best bet is to make this conversion explicit, such that it is reversible. So, convert the date into UTC, then UTC to seconds since epoch, then take the minimum, then convert back to UTC time. This sounds like a pain, but that's basically what a version of min() designed to work with dates would do.
>
> The reason this is a pain is basically due to timezones: consider a comparison between x = 3:54 PM September 8 in California (right now where I am) and y = 12:54 AM September 9 in Zurich (right now where you are).
>
> Is it earlier here than there? Yes, because it's Sept 8 to your Sept 9.
> Is it earlier there than here? Yes, because your day started 54 minutes ago, mine over 15 hours ago.
> Is it the same time here as there? Yes, because our UTC times are equal.
>
> So it's not clear what min should return, so min is not defined for dates. However, min is defined for numbers, and dates can be converted to numbers, but what those numbers actually mean is not necessarily clear.
>
> --Adam
>
> On Mon, 8 Sep 2008, Erich Studerus wrote:
>
>> Hi
>>
>> I have a dataframe in which some subjects appear in more than one row. I want to extract the subject-rows which have the minimum date per subject. I tried the following aggregate function.
>>
>>     attach(dataframe.xy)
>>     aggregate(Date,list(SubjectID),min)
>>
>> Unfortunately, the format of the Date column changes to numeric when I apply this function. How can I preserve the date format?
>>
>> Thanks
>> Erich

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
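Since Erich's stated goal is the rows with the minimum date per subject, one way to sidestep the class question altogether is to compute the per-subject minimum with ave() and subset on it. This is a sketch on invented data (three rows, two subjects), not code from the thread; only the names dataframe.xy, SubjectID and Date come from the post.

```r
# Invented data: two visits for subject 1, one visit for subject 2
dataframe.xy <- data.frame(
  SubjectID = c(1, 1, 2),
  Date = as.Date(c("2008-09-08", "2008-08-01", "2008-07-15"))
)

# ave() returns, for every row, the minimum of that subject's dates
# (worked out on the underlying day numbers)
first_num <- ave(as.numeric(dataframe.xy$Date), dataframe.xy$SubjectID, FUN = min)

# Keep the rows whose date equals that minimum; the Date column is never
# rebuilt, so it keeps its class
first_rows <- dataframe.xy[as.numeric(dataframe.xy$Date) == first_num, ]
```

If a subject has several rows sharing the earliest date, all of them are kept.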
Re: [R] isolate elements in vector that match one of many possible values
Check out ?match, ?"%in%"

    x <- c(1,2,3,4)
    y <- c(1,2,4)
    match(y,x)
    [1] 1 2 4

--Adam

On Mon, 8 Sep 2008, Andrew Barr wrote:

> Hi all,
>
> I want to get the index numbers of all elements of a vector which match any of a long series of possible values. Say x <- c(1,2,3,4) and I want to know which values are equal to 1, 2 or 4. I could do
>
>     which(x == 1 | x==2 | x==4)
>     [1] 1 2 4
>
> This gets really ugly, though, when the list of values of interest is really long. Is there a nicer way to do this? Something akin to the MySQL construction in(), as in
>
>     #MySQL script example
>     Select * from table where parameter in(x,y,z);
>
> Thanks!
>
> --
> W. Andrew Barr
> Biological Anthropology
> University of Texas at Austin
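The two functions answer slightly different questions, which the example in the thread hides because values and positions coincide. A small sketch with shuffled, made-up values shows the difference:

```r
x <- c(40, 10, 30, 20)
wanted <- c(10, 20, 40)

idx <- which(x %in% wanted)  # positions in x holding any wanted value, in x's order
pos <- match(wanted, x)      # for each wanted value, its first position in x
```

Here `idx` is 1 2 4 and `pos` is 2 4 1. which(x %in% wanted) is the closer analogue of the SQL in() construction from the question.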
Re: [R] make methods work in lapply - remove lapply's environment
This is a side-effect of lapply being in the base namespace and not evaluating its arguments, as explained on its help page, which also points out that using a wrapper is sometimes needed. It also points out that code has been written that relies on the current behaviour.

On Mon, 8 Sep 2008, Tim Hesterberg wrote:

> I've defined my own version of summary.default, that gives a better summary for highly skewed vectors.
>
> If I call
>     summary(x)
> the method is used.
>
> If I call
>     summary(data.frame(x))
> the method is not used.
>
> I've traced this to lapply; this uses the new method:
>     lapply(list(x), function(x) summary(x))
> and this does not:
>     lapply(list(x), summary)
>
> If I make a copy of lapply, WITHOUT the environment, then the method is used.
>
>     lapply <- function (X, FUN, ...) {
>         FUN <- match.fun(FUN)
>         if (!is.vector(X) || is.object(X))
>             X <- as.list(X)
>         .Internal(lapply(X, FUN))
>     }
>
> I'm curious to hear reactions to this. There is a March 2006 thread "object size vs. file size" in which Duncan Murdoch wrote:
>
>> Functions in R consist of 3 parts: the formals, the body, and the environment. You can't remove any part, but you can change it.
>
> That is exactly what I want to do: remove the environment, so that when I define a better version of some function, the better version is used.
>
> Here's a function to automate the process:
>
>     copyFunction <- function(Name){
>         # Copy a function, without its environment.
>         # Name should be quoted
>         # Return the copy
>         file <- tempfile()
>         on.exit(unlink(file))
>         dput(get(Name), file = file)
>         f <- source(file)$value
>         f
>     }
>     lapply <- copyFunction("lapply")

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] naive variance in GEE
On Mon, 8 Sep 2008, Qiong Yang wrote:

> The standard error from logistic regression is slightly different from the naive SE from GEE under an independence working correlation structure. Shouldn't they be identical? Does anyone have insight into this?

They are computed quantities from iterations with different stopping criteria. The coefficients are not 'identical' either.

Your example is incorrect (the first line) and not reproducible (no seed is set, no library(gee)), so we don't know what you saw. But with

    set.seed(1)
    a <- rbinom(1000, 1, 0.2)
    b <- rbinom(1000, 2, 0.1)
    c <- rbinom(1000, 10, 0.5)
    library(gee)
    summary(gee(a ~ b, id=c, family=binomial, corstr="independence"))$coef
    summary(glm(a ~ b, family=binomial))$coef

the differences I see are negligible. I suggest you talk to your supervisor about some courses on numerical methods.

> Thanks,
> Qiong
>
>     a<-rbinom(1000,1)
>     b<-rbinom(1000,2,0.1)
>     c<-rbinom(1000,10,0.5)
>     summary(gee(a~b, id=c,family=binomial,corstr="independence"))$coef
>     summary(glm(a~b,family=binomial))

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] S.O.S try doesnot work in boot?
First, thanks for Jinsong's suggestions.

I would like to do a bootstrap in a nonlinear model, but it fails to converge most of the time (it did converge when I just used nls without boot). Thus, I use the try function to resolve my problem. The following code is from Jinsong's suggestion.

    h1a.nls <- nls(density ~ nmf(time, alpha, delta, psi, tau, gamma), data=h1a,
                   start=c(alpha=0.3, delta=0.08869, psi=1.26523, tau=3.93919, gamma=-1.41927))
    h1a.data <- data.frame(h1a, res=resid(h1a.nls), fitted=fitted(h1a.nls))
    h1a.fun <- function(data,i){
        d <- data
        d$density <- d$fitted + d$res[i]
        try(update(h1a.nls, data=d), silent=T)
        if(!inherits(h1a.nls, "try-error")) h1a.coef <- coef(h1a.nls)
        else h1a.coef <- NA
        h1a.coef
    }
    h1a.boot <- boot(h1a.data, statistic = h1a.fun, R=1000)
    h1a.boot

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = h1a.data, statistic = h1a.fun, R = 1000)

    Bootstrap Statistics :
           original  bias    std. error
    t1*  0.27892590       0           0
    t2*  0.08869433       0           0
    t3*  1.26523275       0           0
    t4*  3.93919567       0           0
    t5* -1.41926966       0           0

All of the values in each column of h1a.boot$t are the same. Does anyone know how I can solve this problem?

Thanks in advance,
Chunhao
Re: [R] correct lme syntax for this problem?
Dear Matthew,

First of all I'm forwarding this to R-SIG-Mixed, which is a more appropriate list for your question.

Fitting a random effect with only 5 levels is a borderline situation. Douglas Bates recommends at least 6 levels in order to get a more or less reliable estimate. So I would consider the populations as fixed effects. Do you have repeated measurements of individuals within your populations? If you do, you could use those as random effects.

Your anova tests whether the variance of the random slope on SPI is zero. I think you might want this:

    mod1 <- lm(height ~ SPI * population + covariate1 + covariate2)
    mod2 <- lm(height ~ SPI + population + covariate1 + covariate2)
    anova(mod1, mod2)

HTH,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data. ~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of Matthew Keller
Sent: Tuesday 9 September 2008 1:10
To: R Help
Subject: [R] correct lme syntax for this problem?

> Hello all,
>
> I am about to send off a manuscript and, although I am fairly confident I have used the lme function correctly, I want to be 100% sure. Could some kind soul out there put my mind at ease?
>
> I am simply interested in whether a predictor (SPI) is related to height. However, there are five different populations, and each may differ in mean level of height as well as in the relationship between SPI and height. Thus, I also want to a) account for mean level differences in height and b) check whether the relationship between height and SPI is different between the groups.
>
> I hope this is sufficient information. height, SPI, covariate1, and covariate2 are numeric. population is a factor with 5 levels.
>
> Here are the steps I took:
>
>     summary(mod1 <- lme(height ~ SPI + covariate1 + covariate2, random = ~ SPI | population))
>     summary(mod2 <- lme(height ~ SPI + covariate1 + covariate2, random = ~ 1 | population))
>     anova(mod1,mod2)
>     #this checks whether there is evidence for IQ SPI being related differently between the 5 populations.
>
> Is this correct?
>
> THANKS!
>
> Matt
>
> --
> Matthew C Keller
> Asst. Professor of Psychology
> University of Colorado at Boulder
> www.matthewckeller.com
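To make Thierry's fixed-effects suggestion concrete, here is a sketch on simulated data. Everything about the data (sample size, effect sizes, population labels) is invented for illustration; only the variable names and the model formulas follow the thread, with a single covariate for brevity.

```r
set.seed(42)
n_per_pop  <- 40
population <- factor(rep(paste0("pop", 1:5), each = n_per_pop))
n          <- length(population)
SPI        <- rnorm(n)
covariate1 <- rnorm(n)
# Simulated heights: a common SPI slope plus a different mean per population
height <- 170 + 2 * SPI + 0.5 * covariate1 + as.integer(population) + rnorm(n)

mod_add <- lm(height ~ SPI + population + covariate1)  # one SPI slope for all populations
mod_int <- lm(height ~ SPI * population + covariate1)  # a separate SPI slope per population
cmp <- anova(mod_add, mod_int)                         # F test of the SPI:population terms
```

With five populations the comparison uses 4 degrees of freedom, one for each slope difference relative to the reference population. A small p-value in `cmp` would suggest that the SPI-height relationship differs between populations.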
Re: [R] S.O.S try doesnot work in boot?
Returning NA (of the correct length, not length 1) will not help you, as all the derived statistics from the bootstrap runs will be NA. But here you never looked at the result of try.

On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote:

> First, thanks for Jinsong's suggestions.
>
> I would like to do a bootstrap in a nonlinear model, but it fails to converge most of the time (it did converge when I just used nls without boot). Thus, I use the try function to resolve my problem. The following code is from Jinsong's suggestion.
>
>     h1a.nls <- nls(density ~ nmf(time, alpha, delta, psi, tau, gamma), data=h1a,
>                    start=c(alpha=0.3, delta=0.08869, psi=1.26523, tau=3.93919, gamma=-1.41927))
>     h1a.data <- data.frame(h1a, res=resid(h1a.nls), fitted=fitted(h1a.nls))
>     h1a.fun <- function(data,i){
>         d <- data
>         d$density <- d$fitted + d$res[i]
>         try(update(h1a.nls, data=d), silent=T)
>         if(!inherits(h1a.nls, "try-error")) h1a.coef <- coef(h1a.nls)

h1a.nls is the original fit, not the result of try().

>         else h1a.coef <- NA
>         h1a.coef
>     }
>     h1a.boot <- boot(h1a.data, statistic = h1a.fun, R=1000)
>     h1a.boot
>
>     ORDINARY NONPARAMETRIC BOOTSTRAP
>
>     Call:
>     boot(data = h1a.data, statistic = h1a.fun, R = 1000)
>
>     Bootstrap Statistics :
>            original  bias    std. error
>     t1*  0.27892590       0           0
>     t2*  0.08869433       0           0
>     t3*  1.26523275       0           0
>     t4*  3.93919567       0           0
>     t5* -1.41926966       0           0
>
> All of the values in each column of h1a.boot$t are the same. Does anyone know how I can solve this problem?
>
> Thanks in advance,
> Chunhao

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
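The point about inspecting the value of try() can be sketched in isolation. The helper `safe_coef` below is hypothetical (it is not part of boot or of the poster's code), and lm on simulated data stands in for the nls fit so that the example runs on its own.

```r
# Hypothetical helper: run a fitting function, keep what try() returns,
# and test *that* object rather than some earlier fit.
safe_coef <- function(fit_fun, n_coef) {
  fit <- try(fit_fun(), silent = TRUE)
  if (inherits(fit, "try-error")) {
    rep(NA_real_, n_coef)   # NA vector of the correct length
  } else {
    coef(fit)
  }
}

set.seed(1)
d <- data.frame(x = 1:20)
d$y <- 3 + 2 * d$x + rnorm(20)

ok  <- safe_coef(function() lm(y ~ x, data = d), 2)        # fit succeeds
bad <- safe_coef(function() stop("failed to converge"), 2) # fit fails
```

Inside a boot statistic the same pattern would wrap the update() call on the resampled data. As the reply notes, the NA rows would still have to be dropped from the replicates (for example with complete.cases() on the t component) before any summary is computed, and a high failure rate is itself worth reporting.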
[R] How do I compute interactions with anova.mlm ?
Hi,

I wish to compute multivariate test statistics for a within-subjects repeated measures design with anova.mlm.

This works great if I only have two factors, but I don't know how to compute interactions with more than two factors. I suspect I have to create a new grouping factor and then test with this factor to get these interactions (as is hinted in R News 2007/2), but I don't really know how to use this approach.

Here is my current code:

Two factors: fac1, fac2

    mlmfit <- lm(mydata~1)
    mlmfit0 <- update(mlmfit, ~0)

    # test fac1, works, produces same output as SAS
    anova(mlmfit, mlmfit0, M = ~ fac1 + fac2, X = ~ fac2, idata = idata, test = "Wilks")

    # test fac1*fac2 interaction, also works, also the same output as SAS
    anova(mlmfit, mlmfit0, X = ~ fac1 + fac2, idata = idata, test = "Wilks")

Three factors: fac1, fac2, fac3

    mlmfit <- lm(mydata~1)
    mlmfit0 <- update(mlmfit, ~0)

    # test fac1, works, same as SAS
    anova(mlmfit, mlmfit0, M = ~ fac1 + fac2 + fac3, X = ~ fac2 + fac3, idata = idata, test = "Wilks")

Now I try to compute the interactions the same way, but this doesn't work:

    # fac1*fac2
    anova(mlmfit, mlmfit0, M = ~ fac1 + fac2 + fac3, X = ~ fac3, idata = idata, test = "Wilks")

    # fac1*fac2*fac3
    anova(mlmfit, mlmfit0, X = ~ fac1 + fac2 + fac3, idata = idata, test = "Wilks")

Both of the above differ quite a lot from the SAS output, and I suspect my understanding of X and M is somewhat flawed.

I would be very happy if someone could tell me how to compute the two interactions above, and an interaction of N factors in general.

I would also be interested in computing linear contrasts using the T matrix and anova.mlm.

Thank you very much,

Stefan

--
Stefan Schadwinkel, Dipl.-Inf.
Neurologische Klinik
Sektion Biomagnetismus
Universität Heidelberg
Im Neuenheimer Feld 400
69120 Heidelberg
Telefon: 06221 - 56 5196
Email: [EMAIL PROTECTED]
[R] exporting tapply objects to csv-files
Dear Everyone,

I am trying to create a csv file with different results from the table function.

Imagine a data frame with two vectors a and b, where b is of class factor. I use the tapply function to count a for the different values of b:

    tapply(a,b,table)

and I use the table function to look at the frequencies in total:

    table(a)

I would like to put both results together in one txt or csv file that I can import into e.g. Excel. The export file should have a layout like

    1,2,3,4,5,6,7   (possible values of a)
    3,6,7,8,8,8,1   (counts of a, total)
    1,2,3,4,5,3,0   (counts of a where b==A)
    2,4,4,4,3,5,1   (counts of a where b==B)

I tried to change the class of the table result to a matrix, but I could not find a way to use the results of tapply. I use tapply because b has 15 different values.

Thanks,

Andreas Kunzler
Bundeszahnärztekammer (BZÄK)
Chausseestraße 13
10115 Berlin
Tel.: 030 40005-113
Fax: 030 40005-119
E-Mail: [EMAIL PROTECTED]
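One way to obtain the layout described above is a two-way table, which already has one row per level of b and one column per value of a, with a totals row bound on top. The sketch below uses a tiny invented data set; the names `by_b`, `out` and `out_file` are placeholders.

```r
# Invented example data: a takes the values 1..3, b has two levels
a <- c(1, 2, 2, 3, 3, 3)
b <- factor(c("A", "B", "A", "B", "A", "B"))

by_b <- unclass(table(b, a))                  # counts of a within each level of b
out  <- rbind(total = colSums(by_b), by_b)    # overall counts as the first row

out_file <- tempfile(fileext = ".csv")
write.csv(out, out_file)                      # column headers are the values of a
```

Because every row of the two-way table covers the same set of a values, zero counts appear explicitly, which the separate tables returned by tapply(a, b, table) do not guarantee.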
Re: [R] Read from url requiring authentication?
René Sachse wrote:

> Damien wrote:
>
>> I'm looking into opening an url on a server which requires authentication.
>
> Under a Windows operating system you could try to start R with the --internet2 option. This worked in my case.

Thanks René, it did the trick for me too!

Best Regards,
Damien
Re: [R] Read from url requiring authentication?
On 8 Sep, 20:15, Prof Brian Ripley [EMAIL PROTECTED] wrote:

> On Mon, 8 Sep 2008, Damien wrote:
>
>> Hi all,
>>
>> I'm looking into opening an url on a server which requires authentication. After failing to find some kind of connection structure to fill in, I turned to explicitly stating the credentials in the url itself (e.g. http://username:[EMAIL PROTECTED]).
>>
>> Sadly this didn't do the trick either, and both source() and url() failed trying to resolve the username ()
>>
>> Is there anything I missed in the documentation/internet/groups? If not, could I maybe add to the existing R functions, as it doesn't seem too far of a stretch to allow the username and password in the url string fed to the web server?
>
> Look at the RCurl package: it is more like download.file than url, though, and you could perhaps use the wget method of download.file.

Thank you for the quick reply. It seems that the argument --internet2 did solve my immediate problem, but I'll have a look at RCurl too.

Best Regards,
Damien
Re: [R] Compiling date
On Mon, 8 Sep 2008, Megh Dal wrote:

> Hi,
>
> I have the following kind of dataset (all are dates) in my Excel sheet:
>
>     09/08/08
>     09/05/08
>     09/04/08
>     09/02/08
>     09/01/08
>     29/08/2008
>     28/08/2008
>     27/08/2008
>     26/08/2008
>     25/08/2008
>     22/08/2008
>     21/08/2008
>     20/08/2008
>     18/08/2008
>     14/08/2008
>     13/08/2008
>     08/12/08
>     08/11/08
>     08/08/08
>     08/07/08
>
> However, I want to use R to convert those data so that all dates are in the same format. Can anyone please tell me an automated way of doing that?

Well, you have to read them in as character first. Then use sub to make the two-digit years into four digits. The following could probably be improved by a regular expression whiz, but works:

    strngs <- c("06/05/08","23/11/2008")
    sub("([0-9][0-9]/[0-9][0-9]/)([0-9][0-9]$)","\\120\\2",strngs)
    [1] "06/05/2008" "23/11/2008"

David Scott

_
David Scott
Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email: [EMAIL PROTECTED]
Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics
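Expanding the year does not settle the day/month order. The list in the question runs in descending order only if the two-digit-year entries are read as month/day/year and the four-digit-year entries as day/month/year. That reading is an assumption, not something stated in the thread, and the sketch below depends on it; it should be checked against the spreadsheet before use.

```r
raw <- c("09/08/08", "09/01/08", "29/08/2008", "13/08/2008", "08/12/08")

# ASSUMPTION: a two-digit year marks a month/day/year entry,
# a four-digit year marks a day/month/year entry.
short <- grepl("^[0-9]{2}/[0-9]{2}/[0-9]{2}$", raw)

dates <- rep(as.Date(NA), length(raw))
dates[short]  <- as.Date(raw[short],  format = "%m/%d/%y")
dates[!short] <- as.Date(raw[!short], format = "%d/%m/%Y")

uniform <- format(dates, "%d/%m/%Y")   # every entry in the same textual format
```

Once the values are of class Date they can be sorted, differenced and written out in any single format with format().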
Re: [R] Memory allocation problem (during kmeans)
rami batal wrote:

> Dear all,
>
> I am trying to apply kmeans clustering to a data file (size is about 300 Mb).
>
> I read this file using
>
>     x=read.table('file path' , sep= )
>
> then I do
>
>     kmeans(x,25)
>
> but the process stops after two minutes with an error:
>
>     Error: cannot allocate vector of size 907.3 Mb
>
> When I read the archive I noticed that the best solution is to use a 64-bit OS:
>
>     Error messages beginning "cannot allocate vector of size" indicate a failure to obtain memory, either because the size exceeded the address-space limit for a process or, more likely, because the system was unable to provide the memory. Note that on a 32-bit OS there may well be enough free memory available, but not a large enough contiguous block of address space into which to map it.
>
> The problem is that I have two machines with two OSes (32-bit and 64-bit), and when I used the 64-bit OS the same error remained.
>
> Thank you if you have any suggestions for me, and excuse me because I am a newbie.
>
> Here is the default information for the 64-bit OS:
>
>     sessionInfo()
>     R version 2.7.1 (2008-06-23)
>     x86_64-redhat-linux-gnu
>
>     gc()
>              used (Mb) gc trigger (Mb) max used (Mb)
>     Ncells 137955  7.4         35 18.7       35 18.7
>     Vcells 141455  1.1     786432  6.0   601347  4.6
>
> I also tried to start R using the options that control the available memory, and the result is still the same. Or maybe I don't assign the correct values.

It might be a good idea first to work out what the actual memory requirements are. 64 bits does not help if you are running out of RAM (+swap).

--
Peter Dalgaard, Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
University of Copenhagen, Denmark
Ph: (+45) 35327918
([EMAIL PROTECTED]) FAX: (+45) 35327907
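A rough way to work out the requirements is plain arithmetic: a numeric (double) value takes 8 bytes, so the size of a matrix follows from its dimensions. The helper `matrix_mb` and the example dimensions below are invented for illustration; the thread does not say how many rows and columns the file has.

```r
# Hypothetical helper: size in Mb of a numeric matrix, ignoring the small header
matrix_mb <- function(n_row, n_col) n_row * n_col * 8 / 1024^2

example_mb <- matrix_mb(1e6, 10)   # a made-up 1,000,000 x 10 matrix

# Number of doubles in the single 907.3 Mb vector from the error message
n_doubles <- 907.3 * 1024^2 / 8

# Cross-check against R's own accounting for a vector of one million doubles
bytes_1e6 <- as.numeric(object.size(numeric(1e6)))
```

The failed allocation alone corresponds to roughly 119 million doubles, and an algorithm such as kmeans typically needs working copies on top of the data itself, so the total requirement can be several times the size of one such object.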
Re: [R] How do I compute interactions with anova.mlm ?
Schadwinkel, Stefan wrote:

> Hi,
>
> I wish to compute multivariate test statistics for a within-subjects repeated measures design with anova.mlm. This works great if I only have two factors, but I don't know how to compute interactions with more than two factors. I suspect I have to create a new grouping factor and then test with this factor to get these interactions (as is hinted in R News 2007/2), but I don't really know how to use this approach.
>
> Here is my current code:
>
> Two factors: fac1, fac2
>
>   mlmfit <- lm(mydata~1)
>   mlmfit0 <- update(mlmfit, ~0)
>
>   # test fac1, works, produces same output as SAS
>   anova(mlmfit, mlmfit0, M = ~ fac1 + fac2, X = ~ fac2, idata = idata, test = "Wilks")
>
>   # test fac1*fac2 interaction, also works, also the same output as SAS
>   anova(mlmfit, mlmfit0, X = ~ fac1 + fac2, idata = idata, test = "Wilks")
>
> Three factors: fac1, fac2, fac3
>
>   mlmfit <- lm(mydata~1)
>   mlmfit0 <- update(mlmfit, ~0)
>
>   # test fac1, works, same as SAS
>   anova(mlmfit, mlmfit0, M = ~ fac1 + fac2 + fac3, X = ~ fac2 + fac3, idata = idata, test = "Wilks")
>
> Now I try to compute the interactions the same way, but this doesn't work:
>
>   # fac1*fac2
>   anova(mlmfit, mlmfit0, M = ~ fac1 + fac2 + fac3, X = ~ fac3, idata = idata, test = "Wilks")
>
>   # fac1*fac2*fac3
>   anova(mlmfit, mlmfit0, X = ~ fac1 + fac2 + fac3, idata = idata, test = "Wilks")
>
> Both of these differ quite a lot from the SAS output, and I suspect my understanding of X and M is somewhat flawed. I would be very happy if someone could tell me how to compute the two interactions above and an interaction of N factors in general.

You need to ensure that the difference between the X and M models is the relevant interaction, so something like

  M=~fac1*fac2*fac3
  X=~fac1*fac2*fac3 - fac1:fac2:fac3

should test for fac1:fac2:fac3.

If the within-subject design is fac1*fac2*fac3 with one observation per cell (NB!), then you can omit M. X can also be written as ~fac1*fac2+fac2*fac3+fac1*fac3 or ~(fac1+fac2+fac3)^2.

For the next step, use, e.g.,

  M=~fac1*fac2+fac2*fac3+fac1*fac3
  X=~fac2*fac3+fac1*fac3

to test the significance of fac1:fac2 (notice that the main effects are still in X because of the meaning of the * operator in R).

> I would also be interested in computing linear contrasts using the T matrix and anova.mlm.
>
> Thank you very much,
>
> Stefan
>
> --
> Stefan Schadwinkel, Dipl.-Inf.
> Neurologische Klinik, Sektion Biomagnetismus
> Universität Heidelberg
> Im Neuenheimer Feld 400, 69120 Heidelberg
> Telefon: 06221 - 56 5196

--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918  FAX: (+45) 35327907
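The rule that M and X should differ only by the term under test can be tried on simulated data. The following is a sketch under stated assumptions: a 2 x 2 x 2 within-subject design with one observation per cell, 20 subjects, and pure-noise responses. The object names follow the thread, but the data are invented, so the resulting p-values mean nothing.

```r
set.seed(1)

# Assumed design: three two-level within-subject factors, one observation per cell
idata <- expand.grid(fac1 = factor(1:2), fac2 = factor(1:2), fac3 = factor(1:2))

# Made-up responses: 20 subjects (rows) by 8 cells (columns)
mydata <- matrix(rnorm(20 * nrow(idata)), nrow = 20)

mlmfit  <- lm(mydata ~ 1)
mlmfit0 <- update(mlmfit, ~ 0)

# Three-way interaction: M and X differ only by fac1:fac2:fac3
a3 <- anova(mlmfit, mlmfit0,
            M = ~ fac1 * fac2 * fac3,
            X = ~ (fac1 + fac2 + fac3)^2,
            idata = idata, test = "Wilks")

# Two-way interaction fac1:fac2: M and X differ only by that term
a2 <- anova(mlmfit, mlmfit0,
            M = ~ fac1 * fac2 + fac2 * fac3 + fac1 * fac3,
            X = ~ fac2 * fac3 + fac1 * fac3,
            idata = idata, test = "Wilks")

print(a3)
print(a2)
```

The same pattern should extend to more factors: put every term up to and including the one of interest in M, and the same terms minus that one in X.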
Re: [R] match problem by rownames
As suggested in ?"[.data.frame", try:

  dat[match('a1', rownames(dat)),]

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

On Sep 9, 2008, at 2:41 AM, Xianming Wei wrote:

> Hi all,
>
> While dat['a1',] and dat['a10',] produce the same results in the following example, I'd like dat['a1',] to return NAs.
>
>   dat <- data.frame(x1 = paste(letters[1:5],10, sep=''), x2=rnorm(5))
>   rownames(dat) <- dat$x1
>   dat['a1',]
>   dat['a10',]
>
>   sessionInfo()
>   R version 2.7.2 (2008-08-25)
>   i386-pc-mingw32
>
>   locale:
>   LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252
>
>   attached base packages:
>   [1] stats graphics grDevices utils datasets methods base
>
>   other attached packages:
>   [1] lattice_0.17-13
>
>   loaded via a namespace (and not attached):
>   [1] grid_2.7.2
>
> Regards,
> Xianming
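A short sketch of why the two forms differ, using the data frame from the question with a fixed seed (the seed and the variable names partial and exact are additions for illustration). Character row indices of a data frame are partially matched against the row names, whereas match() compares whole strings.

```r
set.seed(42)
dat <- data.frame(x1 = paste(letters[1:5], 10, sep = ""), x2 = rnorm(5))
rownames(dat) <- dat$x1

# Partial matching: "a1" is a unique prefix of "a10", so that row is returned
partial <- dat["a1", ]

# Exact matching: match() finds no row called "a1" and returns NA,
# and indexing by NA yields a single row filled with NAs
exact <- dat[match("a1", rownames(dat)), ]

print(partial)
print(exact)
```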
Re: [R] how to split a data framed with sequences
Is this what you want?

  my.df <- data.frame(a = c(1:5, 1:10, 1:20), b = runif(35))
  split(my.df, c(0, cumsum(diff(my.df$a) < 0)))

  $`0`
    a         b
  1 1 0.2655087
  2 2 0.3721239
  3 3 0.5728534
  4 4 0.9082078
  5 5 0.2016819

  $`1`
      a          b
  6   1 0.89838968
  7   2 0.94467527
  8   3 0.66079779
  9   4 0.62911404
  10  5 0.06178627
  11  6 0.20597457
  12  7 0.17655675
  13  8 0.68702285
  14  9 0.38410372
  15 10 0.76984142

  $`2`
      a          b
  16  1 0.49769924
  17  2 0.71761851
  18  3 0.99190609
  19  4 0.38003518
  20  5 0.77744522
  21  6 0.93470523
  22  7 0.21214252
  23  8 0.65167377
  24  9 0.1210
  25 10 0.26722067
  26 11 0.38611409
  27 12 0.01339033
  28 13 0.38238796
  29 14 0.86969085
  30 15 0.34034900
  31 16 0.48208012
  32 17 0.59956583
  33 18 0.49354131
  34 19 0.18621760
  35 20 0.82737332

Here diff(my.df$a) < 0 is TRUE wherever a drops back to the start of a new sequence, and cumsum() turns those restarts into a running group number.

On Tue, Sep 9, 2008 at 5:38 AM, David Carslaw wrote:

> Hi all,
>
> Given a data frame:
>
>   my.df <- data.frame(a = c(1:5, 1:10, 1:20), b = runif(35))
>
> I want to split it by a such that I end up with a list containing 3 components, i.e. the first containing a = 1 to 5, the second a = 1 to 10, etc. In other words, sets of sequences of a.
>
> I can't seem to find the right form using the split function - can you help?
>
> Much appreciated.
>
> David
>
> Institute for Transport Studies
> University of Leeds

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
[R] how to split a data framed with sequences
Hi all,

Given a data frame:

  my.df <- data.frame(a = c(1:5, 1:10, 1:20), b = runif(35))

I want to split it by a such that I end up with a list containing 3 components, i.e. the first containing a = 1 to 5, the second a = 1 to 10, etc. In other words, sets of sequences of a.

I can't seem to find the right form using the split function - can you help?

Much appreciated.

David

Institute for Transport Studies
University of Leeds

--
View this message in context: http://www.nabble.com/how-to-split-a-data-framed-with-sequences-tp19388964p19388964.html
Re: [R] match problem by rownames
try this:

  dat <- data.frame(x1 = paste(letters[1:5],10, sep=''), x2=rnorm(5))
  row.names(dat) <- dat$x1
  dat['a1' %in% row.names(dat), ]
  dat['a10' %in% row.names(dat), ]

I hope it helps.

Best,
Dimitris

> Hi all,
>
> While dat['a1',] and dat['a10',] produce the same results in the following example, I'd like dat['a1',] to return NAs.
>
>   dat <- data.frame(x1 = paste(letters[1:5],10, sep=''), x2=rnorm(5))
>   rownames(dat) <- dat$x1
>   dat['a1',]
>   dat['a10',]
>
>   sessionInfo()
>   R version 2.7.2 (2008-08-25)
>   i386-pc-mingw32
>
>   locale:
>   LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252
>
>   attached base packages:
>   [1] stats graphics grDevices utils datasets methods base
>
>   other attached packages:
>   [1] lattice_0.17-13
>
>   loaded via a namespace (and not attached):
>   [1] grid_2.7.2
>
> Regards,
> Xianming

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043399
Fax: +31/(0)10/7044657
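It may be worth checking what the %in% form returns before relying on it. In the sketch below (seed and result names added for illustration), 'a1' %in% row.names(dat) is a single logical value, so it selects either no rows or, through recycling, all rows. Putting the row names on the left of %in% is one possible variant that gives an exact match per row; it returns an empty data frame rather than a row of NAs when nothing matches.

```r
set.seed(42)
dat <- data.frame(x1 = paste(letters[1:5], 10, sep = ""), x2 = rnorm(5))
row.names(dat) <- dat$x1

none <- dat["a1" %in% row.names(dat), ]    # FALSE: zero rows
all5 <- dat["a10" %in% row.names(dat), ]   # TRUE, recycled: every row

# Possible variant: one logical value per row, exact comparison
one  <- dat[row.names(dat) %in% "a10", ]
zero <- dat[row.names(dat) %in% "a1", ]

print(c(nrow(none), nrow(all5), nrow(one), nrow(zero)))
```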
[R] plotting group means
Hi all,

I want to plot the grouped means of some variables. The dependent variables and the grouping factor are stored in different columns. I want to draw a simple line plot of means, in which the x-axis represents the variables and the y-axis represents the means. The means of the groups should be connected by lines. So far, the function that comes closest to what I'm looking for is error.bars.by in the psych package. To see what I'm looking for, just type:

  library(psych)
  x <- matrix(rnorm(500),ncol=20)
  y <- sample(4,25 ,replace=TRUE)
  x <- x+y
  error.bars.by(x,y,ci=0)

Now, I want to add a legend for the grouping factor to this graph. I would also like to manipulate the line types and colors of the lines. I've read the documentation, but it was not clear to me how to do this. Are there other plotting functions in R which can do the same?

Erich

Erich Studerus
Lic. Phil. Klinische Psychologie
Psychiatric University Hospital Zurich
Division of Clinical Research
Lenggstr. 31, CH-8008 Zurich, Switzerland
Office: +41 44 384 26 66
Mobile: +41 76 563 31 54
Re: [R] Question about multiple regression
On Mon, Sep 8, 2008 at 7:47 PM, Dimitri Liakhovitski wrote:

> Thank you everyone for your responses. I'll answer several questions.
>
> 1.
>> Disclaimer: I have **NO IDEA** of the details of what you want to do or why -- but I am willing to bet that there are better ways of doing it than 1.8 million multiple regressions that take 270 secs each!! (which I find difficult to believe in itself -- are you sure you are doing things right? Something sounds very fishy here: R's regression code is typically very fast).
>
> I probably should not bore everyone, but just to explain where the large number is coming from: I have an experimental design with 7 factors. Each factor has between 3 and 5 levels. Once you cross them all, you end up with 18,000 cells. For each cell, I want to generate a sample of N=100. For each sample I have to analyze the data using 3 different statistical methods of analysis (the goal of the Monte Carlo is to compare those methods). One of the methods requires running up to ~32,000 simple multiple regressions - yes, just for one sample, and it's not a mistake. I test-ran one such analysis for a sample with N=800 and 15 predictors and it took 270 seconds. R was actually very fast - it ran each of the individual regressions in about 0.008 seconds. Still, I need something faster.
>
> 2. Sorry - what was the formula sum(lm.fit(x,y)$residuals^2) for? For example, using it on my data, I got a value of 36,644...
>
> 3. I know that for similarly challenging situations people have used Fortran compilers. So, has anyone heard of a free Fortran library or an efficient piece of code?
>
> Thank you!
> Dimitri

Have you considered the fact that 32000 regressions simply take a lot of time? I don't really have anything to go by, but it sounds unlikely that you will be able to cut computing time by more than, say, ten times, to 27 seconds. That would still leave you with 4 months of running a computer.

Perhaps an alternative approach would be to get access to stronger (super)computers, either at a university or by buying access. A quick googling turns up http://www.clusterondemand.com/ for example.

Anyhow, good luck with your project! I'm sure the R list would be very interested to hear how you solved your problem.

Regards,
Gustaf

--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address: Essingetorget 40, 112 66 Stockholm, SE
skype: gustaf_rydevik
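On question 2 in the quoted message: sum(lm.fit(x, y)$residuals^2) is the residual sum of squares of the regression of y on the columns of x. The sketch below, on simulated data of the size mentioned in the thread (N=800, 15 predictors), checks that it agrees with what lm() reports; the variable names and the random data are assumptions for illustration. lm.fit() skips the formula and model-frame handling that lm() performs, which is why it is often suggested when many small regressions have to be run.

```r
set.seed(1)
n <- 800
p <- 15

# Design matrix with an explicit intercept column, as lm.fit() expects
X <- cbind(1, matrix(rnorm(n * p), n, p))
y <- rnorm(n)

rss_fast <- sum(lm.fit(X, y)$residuals^2)   # residual sum of squares via lm.fit()
rss_lm   <- deviance(lm(y ~ X - 1))         # the same quantity via lm()

print(c(rss_fast, rss_lm))
```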
Re: [R] Compiling date
Try this:

  strptime(x, ifelse(nchar(x) == 8, '%d/%m/%y', '%d/%m/%Y'))

On Tue, Sep 9, 2008 at 3:48 AM, Megh Dal wrote:

> Hi, I have the following kind of dataset (all are dates) in my Excel sheet:
>
> 09/08/08 09/05/08 09/04/08 09/02/08 09/01/08 29/08/2008 28/08/2008 27/08/2008 26/08/2008 25/08/2008 22/08/2008 21/08/2008 20/08/2008 18/08/2008 14/08/2008 13/08/2008 08/12/08 08/11/08 08/08/08 08/07/08
>
> However, I want to use R to put all of those dates into the same format. Can anyone please tell me an automated way of doing that?

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
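A small worked sketch of this approach, carried through to a uniform output format. It assumes all entries are day/month/year, as the format strings above do. That is an assumption worth checking against the spreadsheet: the two-digit-year entries in the question could also be read as month/day/year, in which case '%m/%d/%y' would be needed for them. The sample vector and variable names are illustrative.

```r
x <- c("09/08/08", "29/08/2008", "08/07/08")

# An 8-character string carries a two-digit year, a 10-character one four digits
fmt <- ifelse(nchar(x) == 8, "%d/%m/%y", "%d/%m/%Y")

# strptime() recycles the format vector element by element
d <- as.Date(strptime(x, fmt, tz = "UTC"))

uniform <- format(d, "%d/%m/%Y")
print(uniform)
```

Once the values are Date objects they can be written back out in any single format with format().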
Re: [R] Compiling date
Why not Format -> Cell in Excel?

el

on 9/9/08 1:03 PM Henrique Dallazuanna said the following:

> Try this:
>
>   strptime(x, ifelse(nchar(x) == 8, '%d/%m/%y', '%d/%m/%Y'))
>
> On Tue, Sep 9, 2008 at 3:48 AM, Megh Dal wrote:
>
>> Hi, I have the following kind of dataset (all are dates) in my Excel sheet:
>>
>> 09/08/08 09/05/08 09/04/08 09/02/08 09/01/08 29/08/2008 28/08/2008 27/08/2008 26/08/2008 25/08/2008 22/08/2008 21/08/2008 20/08/2008 18/08/2008 14/08/2008 13/08/2008 08/12/08 08/11/08 08/08/08 08/07/08
>>
>> However, I want to use R to put all of those dates into the same format. Can anyone please tell me an automated way of doing that?
Re: [R] exporting tapply objects to csv-files
Try creating a new object:

  tb <- rbind(table(a), do.call(rbind.data.frame, tapply(a, b, table)))
  names(tb) <- unique(a)

then write it to csv with write.table.

On Tue, Sep 9, 2008 at 5:48 AM, Kunzler, Andreas wrote:

> Dear Everyone,
>
> I am trying to create a csv file with different results from the table function.
>
> Imagine a data frame with two vectors a and b, where b is of class factor.
>
> I use the tapply function to count a for the different values of b:
>
>   tapply(a,b,table)
>
> and I use the table function to look at the overall frequencies:
>
>   table(a)
>
> I would like to put both results together in one txt or csv file that I can import into e.g. Excel.
>
> The export file should have a layout like
>
>   1,2,3,4,5,6,7 (possible values of a)
>   3,6,7,8,8,8,1 (Counts of a total)
>   1,2,3,4,5,3,0 (Counts of a where b==A)
>   2,4,4,4,3,5,1 (Counts of a where b==B)
>
> I tried to change the class of the table result to a matrix but I could not find a way to use the results of tapply. I use tapply because b has 15 different values.
>
> Thanx
>
> Andreas Kunzler
> Bundeszahnärztekammer (BZÄK)
> Chausseestraße 13, 10115 Berlin
> Tel.: 030 40005-113
> Fax: 030 40005-119

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
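An alternative sketch that may be easier to check by eye: a two-way table(b, a) already holds the counts of a within each level of b, and its column sums give the overall counts. This is a different technique from the tapply() call above, shown on made-up data; the names counts, out and csv_file are illustrative, and the output goes to a temporary file.

```r
set.seed(1)

# Made-up data: a takes values 1..7, b is a two-level factor
a <- factor(sample(1:7, 60, replace = TRUE), levels = 1:7)
b <- factor(sample(c("A", "B"), 60, replace = TRUE))

counts <- table(b, a)   # one row per level of b, one column per value of a

# Put the overall counts in the first row, then the per-group counts
out <- rbind(Total = colSums(counts), unclass(counts))
print(out)

csv_file <- tempfile(fileext = ".csv")   # hypothetical output location
write.csv(out, csv_file)
```

Declaring a as a factor with explicit levels keeps a column for every possible value, even one that never occurs in a particular group.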
Re: [R] plotting group means
On 9/9/2008 6:49 AM, Erich Studerus wrote:

> Hi all,
>
> I want to plot the grouped means of some variables. The dependent variables and the grouping factor are stored in different columns. I want to draw a simple line plot of means, in which the x-axis represents the variables and the y-axis represents the means. The means of the groups should be connected by lines. So far, the function that comes closest to what I'm looking for is error.bars.by in the psych package. To see what I'm looking for, just type:
>
>   library(psych)
>   x <- matrix(rnorm(500),ncol=20)
>   y <- sample(4,25 ,replace=TRUE)
>   x <- x+y
>   error.bars.by(x,y,ci=0)
>
> Now, I want to add a legend for the grouping factor to this graph. I would also like to manipulate the line types and colors of the lines. I've read the documentation, but it was not clear to me how to do this. Are there other plotting functions in R which can do the same?

Here is an approach which uses xyplot() in the lattice package and shows how to control line types and colors:

  mydf <- data.frame(x=rep(paste("Group", 1:4, sep=""), 6),
                     v=rep(paste("Variable", 1:6, sep=""), each=4),
                     y=runif(24), stringsAsFactors=TRUE)

  library(lattice)

  xyplot(y ~ v, groups = x, data = mydf, type="b",
         xlab="Dependent Variables", ylab="Mean",
         auto.key=list(lines=TRUE, points=TRUE, space="right"),
         par.settings = list(
           superpose.symbol = list(pch=c(16,8,1,5),
                                   col=c("black","red","green","blue"),
                                   lty=c(1,2,3,4)),
           superpose.line = list(col=c("black","red","green","blue"),
                                 lty=c(1,2,3,4))))

> Erich
>
> Erich Studerus
> Lic. Phil. Klinische Psychologie
> Psychiatric University Hospital Zurich
> Division of Clinical Research
> Lenggstr. 31, CH-8008 Zurich, Switzerland
> Office: +41 44 384 26 66
> Mobile: +41 76 563 31 54

--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
Re: [R] plotting group means
Dear Erich,

Have a look at ggplot2:

  library(ggplot2)
  dataset <- expand.grid(x = 1:20, y = factor(LETTERS[1:4]), value = 1:10)
  dataset$value <- rnorm(nrow(dataset), sd = 0.5) + as.numeric(dataset$y)
  plotdata <- aggregate(dataset$value, list(x = dataset$x, y = dataset$y), mean)
  plotdata <- merge(plotdata, aggregate(dataset$value, list(x = dataset$x, y = dataset$y), sd))
  plotdata$min <- plotdata$x.x - plotdata$x.y
  plotdata$max <- plotdata$x.x + plotdata$x.y
  ggplot(plotdata, aes(x = x, y = x.x, colour = y, min = min, max = max)) +
    geom_pointrange() + geom_line() + geom_point()

HTH,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Erich Studerus
Sent: Tuesday, 9 September 2008 12:49
To: r-help@r-project.org
Subject: [R] plotting group means

Hi all,

I want to plot the grouped means of some variables. The dependent variables and the grouping factor are stored in different columns. I want to draw a simple line plot of means, in which the x-axis represents the variables and the y-axis represents the means. The means of the groups should be connected by lines. So far, the function that comes closest to what I'm looking for is error.bars.by in the psych package. To see what I'm looking for, just type:

  library(psych)
  x <- matrix(rnorm(500),ncol=20)
  y <- sample(4,25 ,replace=TRUE)
  x <- x+y
  error.bars.by(x,y,ci=0)

Now, I want to add a legend for the grouping factor to this graph. I would also like to manipulate the line types and colors of the lines. I've read the documentation, but it was not clear to me how to do this. Are there other plotting functions in R which can do the same?

Erich

Erich Studerus
Lic. Phil. Klinische Psychologie
Psychiatric University Hospital Zurich
Division of Clinical Research
Lenggstr. 31, CH-8008 Zurich, Switzerland
Office: +41 44 384 26 66
Mobile: +41 76 563 31 54

The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
Re: [R] write dataframes
Hi,

Just a thought. You wrote:

> ob1<-object1$ORF
> ob2<-object2$ORF
>
> and then use cbind like,
>
> HG<-cbind(on1,ob2)
>
> but there is an error. Is there any other function I can use?

If you copied and pasted this from R, then your problem is

  HG <- cbind(on1,ob2)

You mean

  HG <- cbind(ob1,ob2)

So perhaps just a typo.

HTH,

Robin Williams
Met Office summer intern - Health Forecasting

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Roberto Olivares-Hernández
Sent: Tuesday, September 09, 2008 12:47 PM
To: r-help@r-project.org
Subject: [R] write dataframes

Hi,

After manipulating my data I have ended up with 5 different data frames with different numbers of observations but the same number of variables (columns).

As an example, if I write str(object1), I see this:

  'data.frame': 47 obs. of 3 variables:
   $ ORF    : Factor w/ 245 levels "YAL038W","YAL054C",..: 10 19 38 39 44 45 50 51 59 60 ...
   $ mRNA   : num 0.891 1.148 1.202 1.479 1.445 ...
   $ Protein: num 1.230 1.288 1.175 0.724 0.851 ..

  str(object2)
  'data.frame': 21 obs. of 3 variables:
   $ ORF    : Factor w/ 245 levels "YAL038W","YAL054C",..: 11 25 40 55 66 78 104 119 141 153 ...
   $ mRNA   : num 0.794 0.741 0.676 1.047 0.912 ...
   $ Protein: num 0.427 0.363 0.468 0.501 0.661 ...

Using the column $ORF from each object, how can I write the results to a file that contains columns of different lengths?

I have tried to generate objects like

  ob1<-object1$ORF
  ob2<-object2$ORF

and then use cbind like

  HG<-cbind(on1,ob2)

but there is an error. Is there any other function I can use?

Thanks for the help

Roberto
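Beyond the typo, cbind() on vectors of different lengths recycles the shorter one rather than leaving blanks, and factors are reduced to their integer codes. One possible way to get columns of unequal length into a single file is to pad the shorter ones with NA first. The sketch below uses made-up stand-ins for object1$ORF and object2$ORF; the names cols, padded and out_file are illustrative.

```r
# Stand-ins for object1$ORF and object2$ORF (made-up values)
ob1 <- factor(c("YAL038W", "YAL054C", "YBR019C"))
ob2 <- factor(c("YAL038W", "YDR050C"))

cols <- list(ob1 = ob1, ob2 = ob2)
n <- max(lengths(cols))

# as.character() keeps the labels instead of the factor codes;
# indexing past the end of a vector yields NA, which pads the short column
padded <- as.data.frame(lapply(cols, function(v) as.character(v)[seq_len(n)]))
print(padded)

out_file <- tempfile(fileext = ".csv")   # hypothetical output location
write.csv(padded, out_file, row.names = FALSE, na = "")
```

With na = "" the padding shows up as empty cells when the file is opened in a spreadsheet.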
[R] write dataframes
Hi,

After manipulating my data I have ended up with 5 different data frames with different numbers of observations but the same number of variables (columns).

As an example, if I write str(object1), I see this:

  'data.frame': 47 obs. of 3 variables:
   $ ORF    : Factor w/ 245 levels "YAL038W","YAL054C",..: 10 19 38 39 44 45 50 51 59 60 ...
   $ mRNA   : num 0.891 1.148 1.202 1.479 1.445 ...
   $ Protein: num 1.230 1.288 1.175 0.724 0.851 ..

  str(object2)
  'data.frame': 21 obs. of 3 variables:
   $ ORF    : Factor w/ 245 levels "YAL038W","YAL054C",..: 11 25 40 55 66 78 104 119 141 153 ...
   $ mRNA   : num 0.794 0.741 0.676 1.047 0.912 ...
   $ Protein: num 0.427 0.363 0.468 0.501 0.661 ...

Using the column $ORF from each object, how can I write the results to a file that contains columns of different lengths?

I have tried to generate objects like

  ob1<-object1$ORF
  ob2<-object2$ORF

and then use cbind like

  HG<-cbind(on1,ob2)

but there is an error. Is there any other function I can use?

Thanks for the help

Roberto
Re: [R] Compiling date
Is this day month year? Look at chron, or maybe the easiest is to use Excel to change the format.

On Tue, Sep 9, 2008 at 7:12 AM, Dr Eberhard Lisse wrote:

> Why not Format -> Cell in Excel?
>
> el
>
> on 9/9/08 1:03 PM Henrique Dallazuanna said the following:
>
>> Try this:
>>
>>   strptime(x, ifelse(nchar(x) == 8, '%d/%m/%y', '%d/%m/%Y'))
>>
>> On Tue, Sep 9, 2008 at 3:48 AM, Megh Dal wrote:
>>
>>> Hi, I have the following kind of dataset (all are dates) in my Excel sheet:
>>>
>>> 09/08/08 09/05/08 09/04/08 09/02/08 09/01/08 29/08/2008 28/08/2008 27/08/2008 26/08/2008 25/08/2008 22/08/2008 21/08/2008 20/08/2008 18/08/2008 14/08/2008 13/08/2008 08/12/08 08/11/08 08/08/08 08/07/08
>>>
>>> However, I want to use R to put all of those dates into the same format. Can anyone please tell me an automated way of doing that?

--
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals.
-K. Mullis
Re: [R] plotting group means
Hi Erich,

Have a look at brkdn.plot in the plotrix package.

Jim
Re: [R] plotting group means
On Tue, Sep 9, 2008 at 6:56 AM, ONKELINX, Thierry wrote:

> Dear Erich,
>
> Have a look at ggplot2:
>
>   library(ggplot2)
>   dataset <- expand.grid(x = 1:20, y = factor(LETTERS[1:4]), value = 1:10)
>   dataset$value <- rnorm(nrow(dataset), sd = 0.5) + as.numeric(dataset$y)

Or with stat_summary:

  qplot(x, value, data=dataset, colour=y, group = y) +
    stat_summary(geom="line", fun=mean, size=2)

--
http://had.co.nz/
Re: [R] R_USER - in which file should I include it?
Hello

Many thanks. It works just fine.

How about the packages issue? That is, the same thing for the installation path.

Cheers

Ed

-----Original Message-----
From: Gabor Grothendieck
Sent: Monday, September 08, 2008 10:01 PM
To: Eduardo M. A. M.Mendes
Cc: r-help@r-project.org
Subject: Re: [R] R_USER - in which file should I include it?

Try adding this at the end of your etc/Rprofile.site file. That file should already be there, so you don't have to create it, just edit it.

  cat("Hello from Rprofile.site\n")
  setwd("C:/Users/eduardo/Documents")

You may need to edit it as Administrator. You should see the Hello message, in which case you will know that the Rprofile.site file is being run. That should work unless Tinn-R runs R in such a way as to ignore Rprofile.site.

On Mon, Sep 8, 2008 at 8:11 PM, Eduardo M. A. M.Mendes wrote:

Hello

I am not sure whether R starts from the same dir. For instance:

a) if I double-click on the R-2.7.2 icon and then issue the command getwd(), the result is:

  getwd()
  [1] "C:/Users/eduardo/Documents"

b) If R starts from within Tinn-R, the result is:

  getwd()
  [1] "C:/Program Files/R/R-2/bin"

I want that, no matter which method of calling R I am using, if I issue the command getwd() (as the first command) the result is C:/Users/eduardo/Documents/R.

Moreover, all new packages should go to C:/Users/eduardo/Documents/R/win-library.

Thanks

Ed

-----Original Message-----
From: Gabor Grothendieck
Sent: Monday, September 08, 2008 8:57 PM
To: Eduardo M. A. M.Mendes
Cc: r-help@r-project.org
Subject: Re: [R] R_USER - in which file should I include it?

Could you explain more clearly what you mean by "the same"? Do you mean that each time you click on the R 2.7.2 icon on your desktop, running this from the R console:

  getwd()

gives the same directory on each startup? Isn't that already the case? I don't think you need to set any environment variables at all. If you don't set any environment variables, then what specifically is happening that you don't want to happen?

On Mon, Sep 8, 2008 at 7:10 PM, Eduardo M. A. M.Mendes wrote:

Hello

I am a newbie.

I had my R upgraded from 2.7.1 to 2.7.2 and in doing so I decided to install all 2.7 versions under c:\program files\R\2.7 from now on (2.7.1 is located under .\2.7.1).

Although I don't like the idea (I am running Vista), I have edited etc\Renviron.site to contain:

  R_USER=c:/Users/eduardo/Documents/R
  R_LIBS_USER=c:/Users/eduardo/Documents/R/win-library/2.7

As far as R always starting from the same location, that is, c:/Users/eduardo/Documents/R, is concerned, etc\Renviron.site didn't help.

So I wonder whether someone from the list could help me to:

a) force R to always start from the same location
b) force R to install all new packages in the same location

Many thanks

Ed

PS. Before sending this email, I read the Windows FAQ and browsed the archives (too many posts on the subject!).
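A sketch of what the end of etc/Rprofile.site might look like if both requests (start directory and package library) are handled there. This is an assumption-laden illustration, not a tested Windows configuration: the two paths are the ones named in the thread, the variable names are made up, and the .libPaths() call is one way to put a personal library first on the search path.

```r
# Sketch of lines to append to etc/Rprofile.site (paths taken from the thread)
local({
  work_dir <- "C:/Users/eduardo/Documents/R"               # desired start directory
  lib_dir  <- "C:/Users/eduardo/Documents/R/win-library"   # desired package library

  # Only change directory if the folder exists, so startup never fails
  if (file.exists(work_dir)) setwd(work_dir)

  # Put the personal library first; install.packages() uses the first entry by default
  if (file.exists(lib_dir)) .libPaths(c(lib_dir, .libPaths()))
})
```

Wrapping the statements in local() keeps the helper variables out of the user's workspace.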
[R] Hardware for R cpu 64 vs 32, dual vs quad
I need to buy a fast computer for running R. Today we use a 2.8 GHz Intel D CPU and the calculations take around 15 days.

Is it possible to get the same calculations down to minutes/hours just by changing the hardware?

Should I go for a really fast dual-core 32-bit CPU and run R on Linux or XP, or go for a quad-core / 64-bit CPU?

Is it effective to run R on 64 bit (and problem-free to install and run)?

I have around 2000-3000 euro to spend.

Thanks for any tips.
Re: [R] plotting group means
Thanks for all the suggestions, but it seems that all these functions need a rearrangement of my data, since in my case the dependent variables are in different columns. The error.bars.by function seems to be the only plotting function that does not need a rearrangement. Are there other functions which can do that, or is there an easy way to rearrange the columns into one? Thanks Erich -----Original Message----- From: hadley wickham [mailto:[EMAIL PROTECTED] Sent: Tuesday, 9 September 2008 15:02 To: ONKELINX, Thierry Cc: Erich Studerus; r-help@r-project.org Subject: Re: [R] plotting group means On Tue, Sep 9, 2008 at 6:56 AM, ONKELINX, Thierry [EMAIL PROTECTED] wrote: Dear Erich, Have a look at ggplot2: library(ggplot2); dataset <- expand.grid(x = 1:20, y = factor(LETTERS[1:4]), value = 1:10); dataset$value <- rnorm(nrow(dataset), sd = 0.5) + as.numeric(dataset$y) Or with stat_summary: qplot(x, value, data = dataset, colour = y, group = y) + stat_summary(geom = "line", fun = mean, size = 2) -- http://had.co.nz/
Re: [R] R_USER - in which file should I include it?
You might look at ?.libPaths (note the dot) and play around with adding a .libPaths command to your Rprofile.site; again, you may need Administrator rights when editing it. If that does not help then you can try clarifying the problem, in particular what "the same" refers to, what is happening now, and what you want to happen. On Tue, Sep 9, 2008 at 9:14 AM, Eduardo M. A. M.Mendes [EMAIL PROTECTED] wrote: Hello Many thanks. It works just fine. How about the packages issue? That is, same thing for the installation path. Cheers Ed -----Original Message----- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Monday, September 08, 2008 10:01 PM To: Eduardo M. A. M.Mendes Cc: r-help@r-project.org Subject: Re: [R] R_USER - in which file should I include it? Try adding this at the end of your etc/Rprofile.site file. That file should already be there so you don't have to create it, just edit it: cat("Hello from Rprofile.site\n"); setwd("C:/Users/eduardo/Documents") You may need to edit it as Administrator. You should see the Hello message, in which case you will know that the Rprofile.site file is being run. That should work unless Tinn-R runs R in such a way as to ignore Rprofile.site. On Mon, Sep 8, 2008 at 8:11 PM, Eduardo M. A. M.Mendes [EMAIL PROTECTED] wrote: Hello I am not sure whether R starts from the same dir. For instance: a) if I double-click on the R-2.7.2 icon and then issue the command getwd(), the result is: [1] "C:/Users/eduardo/Documents" b) if R starts from within Tinn-R, the result of getwd() is: [1] "C:/Program Files/R/R-2/bin" I want getwd(), issued as the first command, to return C:/Users/eduardo/Documents/R no matter which method I use to start R. Moreover, all new packages should go to C:/Users/eduardo/Documents/R/win-library Thanks Ed -----Original Message----- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Monday, September 08, 2008 8:57 PM To: Eduardo M. A. M.Mendes Cc: r-help@r-project.org Subject: Re: [R] R_USER - in which file should I include it? Could you explain more clearly what you mean by "the same"? Do you mean that each time you click on the R 2.7.2 icon on your desktop, running getwd() from the R console gives the same directory on each startup? Isn't that already the case? I don't think you need to set any environment variables at all. If you don't set any environment variables then what specifically is happening that you don't want to happen? On Mon, Sep 8, 2008 at 7:10 PM, Eduardo M. A. M.Mendes [EMAIL PROTECTED] wrote: Hello I am a newbie. I had my R upgraded from 2.7.1 to 2.7.2 and in doing so I decided to install all 2.7 versions under c:\program files\R\2.7 from now on (2.7.1 is located under .\2.7.1). Although I don't like the idea (I am running Vista), I have edited etc\Renviron.site to contain: R_USER=c:/Users/eduardo/Documents/R R_LIBS_USER=c:/Users/eduardo/Documents/R/win-library/2.7 As far as making R always start from the same location (that is, c:/Users/eduardo/Documents/R) is concerned, etc\Renviron.site didn't help. So I wonder whether someone from the list could help me to: a) force R to start always from the same location b) force R to install all new packages in the same location Many thanks Ed PS. Before sending this email, I read the Windows FAQ and browsed the archives (too many posts on the subject!).
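A minimal sketch of the .libPaths suggestion above, as it might appear at the end of etc/Rprofile.site. The directory used here is a throwaway location under tempdir() so the snippet can be run anywhere; on Ed's machine the assumption is that it would be replaced by something like C:/Users/eduardo/Documents/R/win-library/2.7.

```r
# Hypothetical library location; swap in the real per-user directory.
user_lib <- file.path(tempdir(), "win-library", "2.7")

# .libPaths() silently drops directories that do not exist, so create it first.
dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)

# Put the user library first so install.packages() defaults to it.
.libPaths(c(user_lib, .libPaths()))

# The first element is where new packages will be installed by default.
.libPaths()[1]
```

Because install.packages() uses the first element of .libPaths() as its default lib, putting the user directory first should address the "install all new packages in the same location" part of the question.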
[R] PCA and % variance explained
After doing a PCA using princomp, how do you view how much each component contributes to variance in the dataset. I'm still quite new to the theory of PCA - I have a little idea about eigenvectors and eigenvalues (these determine the variance explained?). Are the eigenvalues related to loadings in R? Thanks, Paul -- View this message in context: http://www.nabble.com/PCA-and---variance-explained-tp19388970p19388970.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting group means
On Tue, Sep 9, 2008 at 8:38 AM, Erich Studerus [EMAIL PROTECTED] wrote: Thanks for all the suggestions, but it seems, that all these functions need a rearrangement of my data, since in my case, the dependent variables are in different columns. The error.bars.by-function seems to be the only plotting function, that does not need a rearrangement. Are there other functions, which can do that or is there an easy way to rearrange the columns into one? Try: library(reshape) melt(x) Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
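For readers without the reshape package, the same rearrangement can be sketched in base R with stack(). The data frame below (columns group, dv1, dv2) is made up for illustration and is not Erich's data.

```r
# Hypothetical wide data: one grouping column, two dependent variables.
wide <- data.frame(
  group = rep(c("A", "B"), each = 3),
  dv1   = c(1, 2, 3, 4, 5, 6),
  dv2   = c(10, 20, 30, 40, 50, 60)
)

# stack() puts the measurement columns into one 'values' column and
# records the originating column name in 'ind'.
stacked <- stack(wide[c("dv1", "dv2")])
long <- data.frame(group = rep(wide$group, times = 2), stacked)

# Group means per dependent variable, ready for plotting.
means <- aggregate(values ~ group + ind, data = long, FUN = mean)
means
```

The long format has one row per observation and variable, which is the layout the plotting functions mentioned earlier in the thread expect.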
Re: [R] Vorticity and Divergence
Both vorticity and divergence are defined in terms of partial derivatives. You can compute these derivatives using the `grad' function in the numDeriv package.

    U <- function(X) { your U function }
    V <- function(X) { your V function }
    # where X = c(x, y)
    library(numDeriv)
    grU <- function(X) grad(X, func = U)
    grV <- function(X) grad(X, func = V)
    # For a 2-dimensional vector field
    vorticity <- function(X) grV(X)[1] - grU(X)[2]    # dV/dx - dU/dy
    divergence <- function(X) grU(X)[1] + grV(X)[2]   # dU/dx + dV/dy
    # Here is an example:
    U <- function(X) X[1]^2 + X[1] * X[2]
    V <- function(X) X[2]^2 - X[1] * X[2]
    vorticity(c(2, 1))
    [1] -3
    divergence(c(2, 1))
    [1] 5

Does this help? Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Igor Oliveira Sent: Monday, September 08, 2008 11:37 AM To: r-help@r-project.org Subject: [R] Vorticity and Divergence Hi all, I have some wind data (U and V components) and I would like to compute Vorticity and Divergence of these fields. Is there any R function that can easily do that? Thanks in advance for any help Igor Oliveira CSAG, Dept. Environmental Geographical Science, University of Cape Town, Private Bag X3, Rondebosch 7701. Tel.: +27 (0)21 650 5774 South Africa Fax: +27 (0)21 650 5773 http://www.csag.uct.ac.za/~igor
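Since the original question concerns wind data rather than analytic functions, here is a hedged sketch of the same two quantities computed with central differences on a regular grid, using base R only. The grid, the spacing, and the helper names ddx and ddy are assumptions for illustration; real wind fields on a latitude/longitude grid would also need metric (map-scale) factors that are not handled here.

```r
# Central differences on a regular grid (rows index x, columns index y).
# Each helper returns derivatives at interior points along its own axis only.
ddx <- function(M, dx) {
  n <- nrow(M)
  (M[3:n, , drop = FALSE] - M[1:(n - 2), , drop = FALSE]) / (2 * dx)
}
ddy <- function(M, dy) {
  n <- ncol(M)
  (M[, 3:n, drop = FALSE] - M[, 1:(n - 2), drop = FALSE]) / (2 * dy)
}

# Hypothetical gridded field, built from the example functions in the reply
# so the result can be checked against the analytic derivatives.
x <- seq(0, 4, by = 0.5)
y <- seq(0, 3, by = 0.5)
U <- outer(x, y, function(x, y) x^2 + x * y)
V <- outer(x, y, function(x, y) y^2 - x * y)
dx <- 0.5
dy <- 0.5

inner_x <- 2:(length(x) - 1)
inner_y <- 2:(length(y) - 1)

# Trim each derivative to the points that are interior in both directions.
dUdx <- ddx(U, dx)[, inner_y]
dVdx <- ddx(V, dx)[, inner_y]
dUdy <- ddy(U, dy)[inner_x, ]
dVdy <- ddy(V, dy)[inner_x, ]

vorticity  <- dVdx - dUdy
divergence <- dUdx + dVdy

# Value at the grid point (x, y) = (2, 1).
i <- which(x[inner_x] == 2)
j <- which(y[inner_y] == 1)
c(vorticity = vorticity[i, j], divergence = divergence[i, j])
```

Central differences are exact for quadratic fields, so at (2, 1) this grid version reproduces the analytic values dV/dx - dU/dy = -3 and dU/dx + dV/dy = 5.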
[R] Question
Hi, I'm trying to verify the assumption of homogeneity of variance of the residuals in an ANOVA with levene.test. I don't know how to define the groups. I have 3 factors: A, B and C(AxB). What do I have to change or add in the command to specify that I'm working with the residuals, and to set the groups? library(car) attach(anova.sns2) levene.test(residuals ~ ???)
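One way to think about the grouping: Levene's test is a one-way ANOVA on the absolute deviations of each value from its group centre, so the groups can be taken as the cells of the design. The sketch below does this by hand in base R on simulated data; the design (factors A and B only, six replicates per cell) is an assumption, and the nested factor C(AxB) from the question is not modelled.

```r
set.seed(42)

# Hypothetical balanced two-factor design with six replicates per cell.
d <- expand.grid(rep = 1:6,
                 A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2", "b3")))
d$y <- rnorm(nrow(d))

fit <- aov(y ~ A * B, data = d)
d$res <- residuals(fit)

# One group per cell of the A x B design.
d$cell <- interaction(d$A, d$B)

# Absolute deviation of each residual from its cell median
# (the median-centred, Brown-Forsythe form of Levene's test).
d$absdev <- abs(d$res - ave(d$res, d$cell, FUN = median))

# Levene's test is the one-way ANOVA F test on those deviations.
levene <- anova(lm(absdev ~ cell, data = d))
levene
```

With car loaded, passing residuals(fit) as the response and interaction(A, B) as the grouping factor to its Levene function should give the same kind of test, but the exact call depends on the installed version of car.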
Re: [R] R_USER - in which file should I include it?
Many thanks. I shall look at it. In case I run into trouble again, I'll try to clarify "the same". Ed -----Original Message----- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 09, 2008 10:46 AM To: Eduardo M. A. M.Mendes Cc: r-help@r-project.org Subject: Re: [R] R_USER - in which file should I include it? You might look at ?.libPaths (note the dot) and play around with adding a .libPaths command to your Rprofile.site; again, you may need Administrator rights when editing it. If that does not help then you can try clarifying the problem, in particular what "the same" refers to, what is happening now, and what you want to happen.
Re: [R] Gumbell distribution - minimum case
If you mean you want an EVD with a fat left tail (instead of a fat right tail), then can't you just multiply all the values by -1 to reverse the distribution? A new location parameter could then shift the distribution wherever you want along the number line ... -Aaron On Mon, Sep 8, 2008 at 5:22 PM, Richard Gwozdz [EMAIL PROTECTED] wrote: Hello, I would like to sample from a Gumbel (minimum) distribution. I have installed package {evd} but the Gumbel functions there appear to refer to the maximum case. Unfortunately, setting the scale parameter negative does not appear to work. Is there a separate package for the Gumbel minimum? -- Rich Gwozdz Fire and Mountain Ecology Lab College of Forest Resources University of Washington cell: 206-769-6808 office: 206-543-9138 [EMAIL PROTECTED]
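The sign-flip idea can be written out directly. The sketch below samples from the Gumbel (minimum) distribution by inverting its CDF, F(x) = 1 - exp(-exp((x - loc) / scale)), in base R. The function name rgumbel_min is made up for this example and is not part of evd.

```r
# Hypothetical helper: draw n values from a Gumbel (minimum) distribution.
# Inverting F gives x = loc + scale * log(-log(1 - u)); since 1 - u is also
# uniform on (0, 1), runif(n) can be used in its place.
rgumbel_min <- function(n, loc = 0, scale = 1) {
  stopifnot(scale > 0)
  loc + scale * log(-log(runif(n)))
}

set.seed(1)
x <- rgumbel_min(1e5, loc = 0, scale = 1)

# The theoretical mean is loc - 0.5772 * scale (Euler-Mascheroni constant),
# and the long tail is on the left, so the mean sits below the median.
c(mean = mean(x), median = median(x))
```

This is equivalent to negating draws from the maximum-case Gumbel with location -loc and the same scale, which is the approach Aaron describes.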
[R] How does predict.lm work?
Hi, Please could someone explain how this element of predict.lm works? From the help file ` newdata An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used. ' Does this dataframe (newdata) need to have the same variable names as was used in the original data frame used to fit the model? Or will R just look across consecutive columns of newdata, and apply them to the call as appropriate? For example, if I have fitted a model with four variables (x1,x2,x3,x4) in my original dataframe, and then have a second dataframe which I want to supply to the newdata argument in predict.lm with variable names (x5, x6, x7, x8), do I need to change the variable names in my newdata dataframe to match those of the original dataframe? Or will R treat x5 as x1, x6 as x2, etc, when using predict.lm? I would like to know so that I can design the structure of some somewhat larger dataframes in a manner which will make using predict.lm straight forward and quick. Hope this makes sense. Many thanks for any help. Robin Williams Met Office summer intern - Health Forecasting [EMAIL PROTECTED] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How does predict.lm work?
Just try it:

    BOD   # built in data frame
      Time demand
    1    1    8.3
    2    2   10.3
    3    3   19.0
    4    4   16.0
    5    5   15.6
    6    7   19.8
    BOD.lm <- lm(demand ~ Time, BOD)
    predict(BOD.lm, list(Time = 10))
           1
    25.73571
    predict(BOD.lm, list(10))
    Error in eval(expr, envir, enclos) : object "Time" not found

On Tue, Sep 9, 2008 at 10:59 AM, Williams, Robin [EMAIL PROTECTED] wrote: Hi, Please could someone explain how this element of predict.lm works? From the help file ` newdata An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used. ' Does this dataframe (newdata) need to have the same variable names as was used in the original data frame used to fit the model? Or will R just look across consecutive columns of newdata, and apply them to the call as appropriate? For example, if I have fitted a model with four variables (x1,x2,x3,x4) in my original dataframe, and then have a second dataframe which I want to supply to the newdata argument in predict.lm with variable names (x5, x6, x7, x8), do I need to change the variable names in my newdata dataframe to match those of the original dataframe? Or will R treat x5 as x1, x6 as x2, etc, when using predict.lm? I would like to know so that I can design the structure of some somewhat larger dataframes in a manner which will make using predict.lm straight forward and quick. Hope this makes sense. Many thanks for any help. Robin Williams Met Office summer intern - Health Forecasting [EMAIL PROTECTED]
Re: [R] How does predict.lm work?
on 09/09/2008 09:59 AM Williams, Robin wrote: Hi, Please could someone explain how this element of predict.lm works? From the help file ` newdata An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used. ' Does this dataframe (newdata) need to have the same variable names as was used in the original data frame used to fit the model? Yes. Also, see the Note in ?predict.lm: Variables are first looked for in newdata and then searched for in the usual way (which will include the environment of the formula used in the fit). A warning will be given if the variables found are not of the same length as those in newdata if it was supplied. It also says Variables, not columns. Or will R just look across consecutive columns of newdata, and apply them to the call as appropriate? No. For example, if I have fitted a model with four variables (x1,x2,x3,x4) in my original dataframe, and then have a second dataframe which I want to supply to the newdata argument in predict.lm with variable names (x5, x6, x7, x8), do I need to change the variable names in my newdata dataframe to match those of the original dataframe? Yes. Or will R treat x5 as x1, x6 as x2, etc, when using predict.lm? I would like to know so that I can design the structure of some somewhat larger dataframes in a manner which will make using predict.lm straight forward and quick. Hope this makes sense. Many thanks for any help. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
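To make the name-matching rule concrete, here is a small sketch using the built-in BOD data from earlier in the thread. The column name hours is a made-up stand-in for a new data frame whose names differ from those used in the fit.

```r
fit <- lm(demand ~ Time, data = BOD)

# Hypothetical new data whose column name does not match the model.
new_obs <- data.frame(hours = c(8, 10))

# predict() looks variables up by name, so this fails to find 'Time'.
mismatch <- try(predict(fit, newdata = new_obs), silent = TRUE)
inherits(mismatch, "try-error")

# Renaming the column to the name used in the formula fixes it.
names(new_obs)[names(new_obs) == "hours"] <- "Time"
pred <- predict(fit, newdata = new_obs)
pred
```

For the larger data frames Robin mentions, the practical consequence is to keep predictor names identical between the fitting data and newdata; column order does not matter.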
[R] printing all rows
Hi, my data table has 38939 rows. R prints only the first part of it and then prints the message [ reached getOption("max.print") -- omitted 27821 rows ]. Is it possible to set the max.print parameter so that R prints all the rows? TIA, anjan -- anjan purkayastha, phd bioinformatics analyst whitehead institute for biomedical research nine cambridge center cambridge, ma 02142 purkayas [at] wi [dot] mit [dot] edu 703.740.6939
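A minimal sketch of raising the limit, assuming a made-up data frame of the same length as the one described. Note that max.print counts printed entries (rows times columns for a data frame), not rows.

```r
# Hypothetical table with the row count from the question.
big <- data.frame(id = seq_len(38939), value = rnorm(38939))

# Entries needed to print the whole table.
needed <- prod(dim(big))

# options() returns the previous settings, so they can be restored afterwards.
old <- options(max.print = needed)
limit_is_enough <- getOption("max.print") >= needed

# print(big) would now show every row; it is left out here to keep the
# console readable.

options(old)
```

For tables this size, writing to a file with write.table() or browsing with View() is often more practical than printing to the console.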
Re: [R] Hardwarefor R cpu 64 vs 32, dual vs quad
On Tue, 9 Sep 2008, Nic Larson wrote:

Need to buy fast computer for running R on. Today we use 2.8 GHz Intel D cpu and the calculations take around 15 days. Is it possible to get the same calculations down to minutes/hours by only changing the hardware?

No: you would need to arrange to parallelize the computations. I'd be surprised if you got a computer within your budget that was 3x faster on a single CPU than your current one, and R will only use (unaided) one CPU for most tasks (the exception being some matrix algebra).

Should I go for a really fast dual 32 bit cpu and run R over linux or xp or go for a quad core / 64 bit cpu? Is it effective to run R on 64 bit (and problem free (running/installing))???

All answered in the R-admin manual, so please RTFM.

Have around 2000-3000 euro to spend Thanx for any tip

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] puzzle about contrasts
Hi, I'm trying to redefine the contrasts for a linear model. With a two-level factor x with levels A and B, a fit such as lm(y ~ x) outputs A and B - A as its coefficients. I would like to set the contrasts so that the coefficients output are -0.5 (A + B) and B - A, but I can't get the sign correct for the first coefficient (Intercept). Here is a toy example:

    set.seed(12161952)
    y <- rnorm(10)
    x <- factor(rep(letters[1:2], each = 5))
    ## so A and B =
    tapply(y, x, mean)
             a          b
        -0.719  0.8323837
    ## and with treatment contrasts
    coef(lm(y ~ x))   ## A and B - A
    (Intercept)          xb
         -0.719   1.5522724

Then I try to redefine the contrasts:

    ### would like contrasts: -0.5 (A + B) and B - A
    D1 <- matrix(c(-0.5, -0.5, -1, 1), 2, 2, byrow = TRUE)
    C1 <- solve(D1)
    Cnt <- C1[, -1]
    contrasts(x) <- Cnt
    coef(lm(y ~ x))
    (Intercept)          x1
     0.05624745  1.55227241

but note that the desired value is

    -0.5 * sum(tapply(y, x, mean))
    [1] -0.05624745

I note that the first column of C1 is -1's, not +1's, and that, working by hand, if I tamper with the model matrix

    mm <- model.matrix(y ~ x)
    mm[, 1] <- -1
    mm
       (Intercept)   x1
    1           -1 -0.5
    2           -1 -0.5
    3           -1 -0.5
    4           -1 -0.5
    5           -1 -0.5
    6           -1  0.5
    7           -1  0.5
    8           -1  0.5
    9           -1  0.5
    10          -1  0.5
    attr(,"assign")
    [1] 0 1
    attr(,"contrasts")
    attr(,"contrasts")$x
      [,1]
    a -0.5
    b  0.5
    solve(t(mm) %*% mm) %*% t(mm) %*% y   ## Yes, I know. Use QR
                       [,1]
    (Intercept) -0.05624745
    x1           1.55227241

it gives the correct sign. So I guess my question reduces to how one would set the contrasts so that the model matrix comes out this way. Thank you.

Ken -- Ken Knoblauch Inserm U846 Institut Cellule Souche et Cerveau Département Neurosciences Intégratives 18 avenue du Doyen Lépine 69500 Bron France tel: +33 (0)4 72 91 34 77 fax: +33 (0)4 72 91 34 61 portable: +33 (0)6 84 10 64 10 http://www.sbri.fr
Re: [R] exporting tapply objects to csv-files
On Tue, Sep 9, 2008 at 3:48 AM, Kunzler, Andreas [EMAIL PROTECTED] wrote: Dear Everyone, I am trying to create a csv file with different results from the table function. Imagine a data frame with two vectors a and b, where b is of class factor. I use the tapply function to count a for the different values of b, tapply(a,b,table), and I use the table function to look at the total frequencies, table(a). I would like to put both results together in one txt or csv file that I can import into e.g. Excel. The export file should have a layout like

    1,2,3,4,5,6,7 (possible values of a)
    3,6,7,8,8,8,1 (Counts of a total)
    1,2,3,4,5,3,0 (Counts of a where b==A)
    2,4,4,4,3,5,1 (Counts of a where b==B)

I tried to change the class of the table result to a matrix but I could not find a way to use the results of tapply. I use tapply because b has 15 different values.

An alternative would be to use reshape (http://had.co.nz/reshape):

    mydf <- data.frame(
      a = sample(7, 100, rep = T),
      b = sample(letters[1:15], 100, rep = T))
    library(reshape)
    mydf$value <- 1
    cast(mydf, b ~ a, sum, margins=row.major, fill = 0)

Regards, Hadley -- http://had.co.nz/
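The requested layout can also be sketched in base R: a two-way table gives the counts per level of b, its column sums give the totals, and write.csv() produces a file Excel can open. The simulated data and the temporary output file are assumptions for illustration.

```r
set.seed(1)

# Hypothetical data; fixing the factor levels keeps empty categories in the table.
mydf <- data.frame(
  a = factor(sample(7, 100, replace = TRUE), levels = 1:7),
  b = factor(sample(letters[1:15], 100, replace = TRUE), levels = letters[1:15])
)

# Rows are levels of b, columns are values of a.
by_group <- unclass(table(mydf$b, mydf$a))

# Prepend the overall counts of a as a 'Total' row.
counts <- rbind(Total = colSums(by_group), by_group)

# Write to a temporary file; use a real path for actual use.
out_file <- tempfile(fileext = ".csv")
write.csv(counts, out_file)

head(counts, 3)
```

The header row of the resulting file holds the possible values of a, followed by the total counts and then one row per level of b.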
Re: [R] PCA and % variance explained
I did PCA stuff years ago. There is something called a scree plot, which gives an indication of the number of principal components to keep and the variance each explains. You might want to do a web search on "scree plot" and PCA. -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of pgseye Sent: Tuesday, September 09, 2008 5:39 AM To: r-help@r-project.org Subject: [R] PCA and % variance explained After doing a PCA using princomp, how do you view how much each component contributes to variance in the dataset. I'm still quite new to the theory of PCA - I have a little idea about eigenvectors and eigenvalues (these determine the variance explained?). Are the eigenvalues related to loadings in R? Thanks, Paul -- View this message in context: http://www.nabble.com/PCA-and---variance-explained-tp19388970p19388970.html Sent from the R help mailing list archive at Nabble.com.
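As a concrete illustration of the question, the sketch below uses the built-in USArrests data (an arbitrary choice, not Paul's data). In a princomp fit the squared component standard deviations are the eigenvalues, and the loadings are the corresponding eigenvectors.

```r
pc <- princomp(USArrests, cor = TRUE)

# Eigenvalues: the variance carried by each component.
eig <- pc$sdev^2

# Proportion and cumulative proportion of variance explained.
prop <- eig / sum(eig)
cum_prop <- cumsum(prop)
round(rbind(proportion = prop, cumulative = cum_prop), 3)

# Eigenvectors: the weights of the original variables in each component.
unclass(pc$loadings)
```

summary(pc) reports the same proportions, and screeplot(pc) draws the scree plot mentioned above.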
[R] Linear Modeling the best alternative
I have a data set of mean velocity, discharge, and mean depth. I need to find out which model best fits them out of log linear, linear, some other kind of model... Using excel I have found that linear is not that bad and log10(discharge) vs. the other two variables (I am trying to predict velocity and depth from discharge) is not that bad either. How do I test and see which one of these models is better... better R-squared... I know this is a stats question and not particularly an R question, but I will use R for the models vetting process. any ideas would be greatly appreciated, -- Stephen Sefick Research Scientist Southeastern Natural Sciences Academy Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle about contrasts
-0.5*(A+B) is not a contrast, which is the seat of your puzzlement. All you can get from y ~ x is an intercept (a column of ones) and a single 'contrast' column for 'x'. If you use y ~ 0+x you can get two columns for 'x', but R does not give you an option of what columns in that case: see the source of contrasts(). So you would need to replace contrasts(), which I think will be hard as model.matrix.default will look in the 'stats' namespace. It would probably be easier to create the model matrix yourself. On Tue, 9 Sep 2008, Kenneth Knoblauch wrote: Hi, I'm trying to redefine the contrasts for a linear model. [...] I would like to set the contrasts so that the coefficients output are -0.5 (A + B) and B - A, but I can't get the sign correct for the first coefficient (Intercept). -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
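A sketch of the "create the model matrix yourself" route, using the toy data from the question. The column names m and d are made up. With a first column of -1 and a second column of -0.5 for A and +0.5 for B, the cell means are A = -b0 - 0.5 b1 and B = -b0 + 0.5 b1, so b0 = -0.5 (A + B) and b1 = B - A.

```r
set.seed(12161952)
y <- rnorm(10)
x <- factor(rep(letters[1:2], each = 5))

# Hand-built design matrix: a column of -1 and a +/-0.5 coded column.
X <- cbind(m = rep(-1, length(y)),
           d = ifelse(x == "b", 0.5, -0.5))

# Suppress lm's own intercept so only the supplied columns are used.
fit <- lm(y ~ 0 + X)
coef(fit)

# Check against the cell means.
means <- tapply(y, x, mean)
c(-0.5 * sum(means), means[["b"]] - means[["a"]])
```

The design choice here is to bypass contrasts() entirely, since, as noted above, the intercept column produced by the formula interface is always a column of ones.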
[R] passing graph image data from remote Rserve
Hello, I am using Rserve to create a dedicated computational back-engine. I generate and pass an array of data to a java application on a separate server. I was wondering if the same is possible for an image. I believe that Rserve supports passing certain R objects and JRclient can cast these objects into their Java counterparts. If I generate a barplot in R (remotely), can I pass the graph image back to the Java application for display? Currently, I am reduced to saving the graph as a .pdf locally, passing the .pdf's filepath to the Java application and allowing the application access to the file, which is not an ideal structure. Thanks for your help, Prasad [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Modality Test
Dear Readers: I have two issues in nonparametric statistical analysis that I need help with. First, does R have a package that implements a multimodality test, e.g., the Silverman test, DIP test, MAP test or Runt test? I have seen an earlier thread (sometime in 2003) where someone was trying to write code for the Silverman test of multimodality. Are there any other tests that would let me determine how many modes a distribution has? Second, I would like to test whether two distributions are equal. Does R have a package that implements the Li (1996) test of the equality of two distributions? Is there any other test I can use instead of the Li test? Thank you in advance for your help. Amin Mugera, Graduate Student, AgEcon Dept., Kansas State University
Re: [R] passing graph image data from remote Rserve
I believe I have found my solution, so please disregard. Thanks
Re: [R] Linear Modeling the best alternative
stephen sefick ssefick at gmail.com writes:

I have a data set of mean velocity, discharge, and mean depth. I need to find out which model best fits them out of log linear, linear, some other kind of model... Using excel I have found that linear is not that bad and log10(discharge) vs. the other two variables (I am trying to predict velocity and depth from discharge) is not that bad either. How do I test and see which one of these models is better... better R-squared... I know this is a stats question and not particularly an R question, but I will use R for the models vetting process. any ideas would be greatly appreciated,

AIC is not bad, but see http://www.unc.edu/courses/2006spring/ecol/145/001/docs/lectures/lecture18.htm for computing AIC to compare models where some have transformed response variables ...

Ben Bolker
Re: [R] puzzle about contrasts
Prof Brian Ripley wrote:

-0.5*(A+B) is not a contrast, which is the seat of your puzzlement. All you can get from y ~ x is an intercept (a column of ones) and a single 'contrast' column for 'x'. If you use y ~ 0+x you can get two columns for 'x', but R does not give you an option of what columns in that case: see the source of contrasts(). So you would need to replace contrasts(), which I think will be hard as model.matrix.default will look in the 'stats' namespace. It would probably be easier to create the model matrix yourself.

Or accept the default and do the parameter transformations yourself.

  l <- lm(y~x)
  T <- rbind( c(-1,-.5), c(0,1))
  c2 <- T%*%coef(l)
  V2 <- T%*%vcov(l) %*% t(T)
  cbind(coef=c(c2), s.e.=sqrt(diag(V2)))

On Tue, 9 Sep 2008, Kenneth Knoblauch wrote:

Hi, I'm trying to redefine the contrasts for a linear model. For a two-level factor, x, with levels A and B, an lm fit, say lm(y ~ x), outputs A and B - A. I would like to set the contrasts so that the coefficients output are -0.5 (A + B) and B - A, but I can't get the sign correct for the first coefficient (Intercept).

Here is a toy example,

  set.seed(12161952)
  y <- rnorm(10)
  x <- factor(rep(letters[1:2], each = 5))
  ## so A and B =
  tapply(y, x, mean)
          a         b
     -0.719 0.8323837
  ## and with treatment contrasts
  coef(lm(y ~ x))   ## A and B - A
  (Intercept)        xb
       -0.719 1.5522724

Then, I try to redefine the contrasts

  ### would like contrasts: -0.5 (A + B) and B - A
  D1 <- matrix( c(-0.5, -0.5, -1, 1), 2, 2, byrow = TRUE)
  C1 <- solve(D1)
  Cnt <- C1[, -1]
  contrasts(x) <- Cnt
  coef(lm(y ~ x))
  (Intercept)         x1
   0.05624745 1.55227241

but note that the desired value is

  -0.5 * sum(tapply(y, x, mean))
  [1] -0.05624745

I note that the first column of C1 is -1's not +1's and that working by hand, if I tamper with the model matrix

  mm <- model.matrix(y ~ x)
  mm[, 1] <- -1
  mm
     (Intercept)   x1
  1           -1 -0.5
  2           -1 -0.5
  3           -1 -0.5
  4           -1 -0.5
  5           -1 -0.5
  6           -1  0.5
  7           -1  0.5
  8           -1  0.5
  9           -1  0.5
  10          -1  0.5
  attr(,"assign")
  [1] 0 1
  attr(,"contrasts")
  attr(,"contrasts")$x
    [,1]
  a -0.5
  b  0.5
  solve(t(mm) %*% mm) %*% t(mm) %*% y   ## Yes, I know. Use QR
                     [,1]
  (Intercept) -0.05624745
  x1           1.55227241

gives the correct sign. So, I guess my question reduces to how one would set the contrasts for the model.matrix to be correct for this to work out correctly? Thank you. Ken

-- Ken Knoblauch, Inserm U846, Institut Cellule Souche et Cerveau, Département Neurosciences Intégratives, 18 avenue du Doyen Lépine, 69500 Bron, France. tel: +33 (0)4 72 91 34 77, fax: +33 (0)4 72 91 34 61, portable: +33 (0)6 84 10 64 10, http://www.sbri.fr

-- Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark. Ph: (+45) 35327918, FAX: (+45) 35327907 ([EMAIL PROTECTED])
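The two suggestions in this message (transform the default coefficients, or build the model matrix by hand) can be checked against each other. The following is a minimal sketch, assuming simulated data with an arbitrary seed; the names `Tm`, `X` and `fit_hand` are illustrative and not from the thread.

```r
set.seed(1)                                  # arbitrary seed, for illustration only
y <- rnorm(10)
x <- factor(rep(letters[1:2], each = 5))

# Route 1: keep treatment contrasts, then transform the coefficients.
fit <- lm(y ~ x)                             # coefficients are A and B - A
Tm <- rbind(c(-1, -0.5),                     # -A - 0.5*(B - A) = -0.5*(A + B)
            c( 0,  1))                       # B - A
est <- drop(Tm %*% coef(fit))
V <- Tm %*% vcov(fit) %*% t(Tm)
res <- cbind(coef = est, s.e. = sqrt(diag(V)))

# Route 2: build the model matrix by hand and drop R's own intercept.
X <- cbind(-1, ifelse(x == "b", 0.5, -0.5))
fit_hand <- lm(y ~ 0 + X)

means <- tapply(y, x, mean)
print(res)
print(coef(fit_hand))
```

Both routes should give -0.5 * (A + B) and B - A; the first also carries the standard errors through the same linear map.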
[R] Help with 'spectrum'
For the command 'spectrum' I read:

The spectrum here is defined with scaling 1/frequency(x), following S-PLUS. This makes the spectral density a density over the range (-frequency(x)/2, +frequency(x)/2], whereas a more common scaling is 2π and range (-0.5, 0.5] (e.g., Bloomfield) or 1 and range (-π, π].

Forgive my ignorance but I am having a hard time interpreting this. Does this mean that in the spectrum output every element of the $spec array is scaled by 1/frequency(x)? I am having a hard time determining what is meant by 'frequency'. Say I define a time series for a year with samples for every day. I input a 'frequency' of 365 (which in my mind is the period). On the output of 'spectrum' would this mean that every element of the $spec array is scaled by 1/365? There is a corresponding frequency array on the output from 'spectrum'. If the frequency is 365 and an element in the frequency array output from 'spectrum' is .1, am I to assume that the period is 36.5 and a corresponding sine wave would be sin(2 * pi * 36.5/365)? Thank you in advance for helping me clear up some confusion. Kevin
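One way to explore the question is to feed spec.pgram() a sine wave with a known period and see where the peak lands. This is a small sketch, assuming a two-year daily series built for illustration (the names `days`, `sp`, `peak_freq` and `period_days` are not from the post). Per the spec.pgram help page, the returned freq values are in cycles per unit of time, where one unit of time spans frequency(x) observations.

```r
days <- 1:730                                        # two years of daily samples
x <- ts(sin(2 * pi * 10 * days / 365), frequency = 365)   # 10 cycles per year

# Raw periodogram: no taper, no detrending, no padding, no plot.
sp <- spec.pgram(x, taper = 0, detrend = FALSE, fast = FALSE, plot = FALSE)

peak_freq <- sp$freq[which.max(sp$spec)]             # in cycles per year
period_days <- frequency(x) / peak_freq              # period in samples (days)
print(c(peak_freq = peak_freq, period_days = period_days))
```

On this reading, a value of 0.1 in the frequency array of a series with frequency 365 would mean 0.1 cycles per year, that is a period of 10 years (3650 daily samples) rather than 36.5 days.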
Re: [R] puzzle about contrasts
Peter Dalgaard wrote:

Or accept the default and do the parameter transformations yourself.

  l <- lm(y~x)
  T <- rbind( c(-1,-.5), c(0,1))
  c2 <- T%*%coef(l)
  V2 <- T%*%vcov(l) %*% t(T)
  cbind(coef=c(c2), s.e.=sqrt(diag(V2)))

I forgot: Also have a look at estimable() from the gmodels package.

-- Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark. Ph: (+45) 35327918, FAX: (+45) 35327907 ([EMAIL PROTECTED])
[R] : writeMat
I write a .mat file using the writeMat() command, but when I try to load it in Matlab it says that the file may be corrupt. I did it a month ago and it worked. Is there any option that I can change to make the file readable by Matlab?

  A <- c(1:10)
  dim(A) <- c(2,5)
  library(R.matlab)
  writeMat('A.mat', A=A)

And what Matlab says is: file may be corrupt

Regards
[R] help on wavelet
Hi, I have little experience using wavelets and I would like to know if it is possible, using an R wavelet package, to have a plot of frequency versus time. thank you giov

-- View this message in context: http://www.nabble.com/help-on-wavelet-tp19395583p19395583.html Sent from the R help mailing list archive at Nabble.com.
[R] binomial(link=inverse)
this may be a better question for r-devel, but ... Is there a particular reason (and if so, what is it) that the inverse link is not in the list of allowable link functions for the binomial family? I initially thought this might have something to do with the properties of canonical vs non-canonical link functions, but since other link functions (probit, cloglog, cauchit, log) are allowed, I can't think of any good reason. In fact, it's sort of a mystery to me why the sets of link functions for each family are restricted. Is this from painful experience that some link functions just don't work well? I can go ahead and hack my own version that allows inverse link, but it would be nice to know if I'm doing something dumb. (The reason I want to do this is that the inverse link linearizes the Michaelis-Menten function, y = a*x/(b+x) ...) cheers Ben Bolker
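The linearization mentioned in the parenthesis follows from 1/y = (b + x)/(a*x) = 1/a + (b/a)*(1/x), so 1/y is linear in 1/x. A quick numerical check, assuming noise-free data and arbitrary illustrative values for a and b (it uses an ordinary lm() on the reciprocals, not a binomial GLM):

```r
a <- 0.8                                  # illustrative asymptote
b <- 2                                    # illustrative half-saturation constant
x <- c(0.5, 1, 2, 4, 8, 16)
y <- a * x / (b + x)                      # exact Michaelis-Menten values

# Regress 1/y on 1/x: intercept is 1/a, slope is b/a.
fit <- lm(I(1 / y) ~ I(1 / x))
recovered <- c(a = 1 / coef(fit)[[1]],
               b = coef(fit)[[2]] / coef(fit)[[1]])
print(recovered)
```

With real binomial data the point of the inverse link is to get this linear predictor while keeping the binomial error model, which a plain regression on reciprocals does not do.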
Re: [R] printing all rows
  options("max.print")
  $max.print
  [1] 99999
  options(max.print=100000)
  options("max.print")
  $max.print
  [1] 1e+05

...so check what your max.print is, and figure out whether you need to set it to nrow, ncol, or nrow*ncol of your data frame...then do so...though of course, this is a global option, so everything you print from then on will just keep printing and printing. Really, though, you might get more utility out of write.table and then using a word processor to read the data in your table. --Adam

On Tue, 9 Sep 2008, ANJAN PURKAYASTHA wrote:

Hi, my data table has 38939 rows. R prints the first 1 columns and then prints an error message: [ reached getOption("max.print") -- omitted 27821 rows ]]. is it possible to set the maxprint parameter so that R prints all the rows? tia, anjan

-- anjan purkayastha, phd, bioinformatics analyst, whitehead institute for biomedical research, nine cambridge center, cambridge, ma 02142. purkayas [at] wi [dot] mit [dot] edu, 703.740.6939
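A small sketch of the advice above, assuming a toy data frame (the names `df`, `truncated` and `full` are illustrative). For a data frame the limit counts entries, so nrow * ncol is the value to use; the previous setting is restored afterwards because the option is global.

```r
df <- data.frame(id = 1:200, value = (1:200) / 2)   # toy data: 200 rows, 2 columns

old <- options(max.print = 100)            # deliberately too small: 100 entries
truncated <- capture.output(print(df))     # stops after 100 %/% 2 = 50 rows

options(max.print = nrow(df) * ncol(df))   # enough entries for every row
full <- capture.output(print(df))

options(old)                               # put the previous limit back
print(c(truncated = length(truncated), full = length(full)))
```

For a table of tens of thousands of rows, writing it out with write.table(), as suggested in the reply, is usually easier to read than console output.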
Re: [R] help on wavelet
It depends on what you want to do. In wavelet speak, frequency is scale. These are the packages:

wmtsa - wavCWT (make sure that you pick the wavelet. I suggest Morlet because it is well localized (it decays to zero quickly)). I would also suggest the fields package for the tim.colors function, which produces the familiar red to blue color scheme.

sowas - more complex stuff here; take a look, it is very interesting if you are trying to tell whether two signals are coherent.

hope this helps

stephen

On Tue, Sep 9, 2008 at 12:03 PM, giov [EMAIL PROTECTED] wrote:

Hi, I have little experience using wavelet and I would like to know if it is possible, using R wavelet package, to have a plot of frequency versus time. thank you giov

-- Stephen Sefick, Research Scientist, Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
[R] creating table of averages
Dear Colleagues, I have a dataframe with variables:

  [1] ID             category       a11            a12            a13            a21
  [7] a22            a23            a31            a32            b11            b12
 [13] b13            b21            b31            b32            b33            b41
 [19] b42            c11            c12            c21            c22            c23
 [25] c31            c32            c33            d11            d12            d13
 [31] d14            d21            d22            d23            d24            d25
 [37] d31            d32            d33            e11            e12            e13
 [43] e21            e22            e23            e31            e32            e33
 [49] f11            f12            f13            f14            f21            f22
 [55] f23            f24            g11            g12            g13            g14
 [61] g21            g22            g23            g24            g31            g32
 [67] g33            g41            g42            g43            h11            h12
 [73] h13            h21            h22            h23            C1.Employ      SC11.Ops
 [79] SC12.Unit      SC13.Nonadvers C2.Enterprise  SC21.Structure SC22.Gov       SC23.Culture
 [85] SC24.Stratcomm C3.Manage      SC31.Resource  SC32.Change    SC33.Continue  C4.Stratthink
 [91] SC41.Vision    SC42.Decision  SC43.Adapt     C5.Lead        SC51.Develop   SC52.Care
 [97] SC53.Diversity C6.Foster      SC61.Teams     SC62.Negotiate C7.Embody      SC71.Ethical
[103] SC72.Follower  SC73.Warrior   SC74.Develop   C8.Comm        C81.Speak      C82.Listen
[109] OverallImp

The variable category has four values: Regular, CCM, CFM, and Other.

I'd like to create a table like this to feed into barplot2:

  row.name  C1.Employ  C2.Enterprise  C3.Manage  C4.Stratthink  C5.Lead  C6.Foster  C7.Embody  C8.Comm
  Regular   3.68       4.27           3.22       etc..
  CCM       4.32       4.56           etc.
  CFM       etc.
  Other     etc.

So far, I have been able to get this far:

  mean(subset(impchiefs08, category=="Regular", select=c(C1.Employ,C2.Enterprise,C3.Manage,C4.Stratthink,C5.Lead,C6.Foster,C7.Embody,C8.Comm)))

      C1.Employ C2.Enterprise     C3.Manage C4.Stratthink       C5.Lead     C6.Foster     C7.Embody       C8.Comm
           3.60          3.85          4.48      4.346667      4.608889          4.44          4.60          4.49

But I am stumped as to how to get what I want. Thanks in advance. Larry
[R] randomForest
I am combining many different random forest objects run on the same data set using the combine() function. After combining the forests I am not sure whether the variable importance, local importance, and rsq components are recalculated for the new random forest object or are calculated individually for each tree ensemble. Is it possible to calculate these components for the new random forest object after calling the combine function? Any help would be greatly appreciated. Thanks, Kate
Re: [R] Compiling date
Is this Month-Day or Day-Month or a mixture of both? I still think using the Format - Cell - Date will work much better... el

On 09 Sep 2008, at 11:21, David Scott wrote:

On Mon, 8 Sep 2008, Megh Dal wrote:

Hi, I have following kind of dataset (all are dates) in my Excel sheet.

  09/08/08 09/05/08 09/04/08 09/02/08 09/01/08 29/08/2008 28/08/2008 27/08/2008 26/08/2008 25/08/2008
  22/08/2008 21/08/2008 20/08/2008 18/08/2008 14/08/2008 13/08/2008 08/12/08 08/11/08 08/08/08 08/07/08

However I want to use R to compile those data to make all dates in same format. Can anyone please tell me any automated way for doing that?

Well you have to read them in as character first. Then use sub to make the two digit years into four digits. The following could probably be improved by a regular expression whiz, but works:

  strngs <- c("06/05/08","23/11/2008")
  sub("([0-9][0-9]/[0-9][0-9]/)([0-9][0-9]$)","\\120\\2",strngs)
  [1] "06/05/2008" "23/11/2008"

David Scott
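The sub() approach can be followed by a conversion to Date once the order of day and month is known. The sketch below assumes every entry is day/month/year and every two-digit year belongs to the 2000s; neither assumption is confirmed in the thread (the first reply above raises exactly that doubt), and the names `four_digit`, `dates` and `iso` are illustrative.

```r
strngs <- c("06/05/08", "23/11/2008", "09/08/08")

# Expand two-digit years: "\\1" is the day/month part, "20" is literal text,
# "\\2" is the two-digit year. Four-digit years do not match and pass through.
four_digit <- sub("^([0-9]{2}/[0-9]{2}/)([0-9]{2})$", "\\120\\2", strngs)

# Assumption: day first, then month.
dates <- as.Date(four_digit, format = "%d/%m/%Y")
iso <- format(dates, "%Y-%m-%d")
print(iso)
```

If some entries are really month/day/year, they would need to be identified and parsed with "%m/%d/%Y" instead; nothing in the strings themselves settles that for values such as 09/08/08.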
Re: [R] creating table of averages
Maybe something like this:

  by(df[,c(77,81,86,90,94,98,101,106)],df$category,apply,2,mean)

...which would then need to be reformatted into a data frame (there is probably an easy way to do this which I don't know). aggregate seems like a more reasonable choice, but the function for aggregate must return scalars, not rows...tapply doesn't take data.frame inputs. Maybe someone else has a suggestion? --Adam

On Tue, 9 Sep 2008, Lawrence Hanser wrote:

Dear Colleagues, I have a dataframe with variables: [1] ID category a11 ... [109] OverallImp. The variable category has four values: Regular, CCM, CFM, and Other. I'd like to create a table of the means of C1.Employ, C2.Enterprise, C3.Manage, C4.Stratthink, C5.Lead, C6.Foster, C7.Embody and C8.Comm, with one row per category, to feed into barplot2. [...] But I am stumped as to how to get what I want. Thanks in advance. Larry
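For the aggregate() route mentioned above: aggregate() applies the function to each selected column separately within each group, so a scalar-returning function such as mean is enough to produce one row per category. A minimal sketch, assuming a toy data frame with two of the score columns (the values and the names `score_cols`, `avg` and `tab` are made up for illustration):

```r
df <- data.frame(
  category      = c("Regular", "Regular", "CCM", "CCM", "CFM", "Other"),
  C1.Employ     = c(3, 4, 5, 4, 2, 1),
  C2.Enterprise = c(4, 5, 4, 5, 3, 2)
)
score_cols <- c("C1.Employ", "C2.Enterprise")

# One row per category, one mean per score column.
avg <- aggregate(df[score_cols], by = list(category = df$category), FUN = mean)

# Numeric matrix with categories as row names, the shape a bar plot expects.
tab <- as.matrix(avg[score_cols])
rownames(tab) <- as.character(avg$category)
print(tab)
```

A matrix like `tab` (or its transpose) could then be passed to barplot() with beside = TRUE; barplot2() from gplots is assumed here to take the same kind of input.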
Re: [R] creating table of averages
On 9/9/2008 2:12 PM, Adam D. I. Kramer wrote:

Maybe something like this:

  by(df[,c(77,81,86,90,94,98,101,106)],df$category,apply,2,mean)

...which would then need to be reformatted into a data frame (there is probably an easy way to do this which I don't know).

sparseby() in the reshape package is more flexible than by(). If the function returns a vector with a consistent length, you'll get a dataframe with columns corresponding to its entries.

Duncan Murdoch

aggregate seems like a more reasonable choice, but the function for aggregate must return scalars, not rows...tapply doesn't take data.frame inputs. Maybe someone else has a suggestion? --Adam
Re: [R] creating table of averages
Perfect! Thanks.

On Tue, Sep 9, 2008 at 11:27 AM, Duncan Murdoch [EMAIL PROTECTED] wrote:

sparseby() in the reshape package is more flexible than by(). If the function returns a vector with a consistent length, you'll get a dataframe with columns corresponding to its entries.

Duncan Murdoch
Re: [R] Hardwarefor R cpu 64 vs 32, dual vs quad
On Tue, Sep 9, 2008 at 6:31 AM, Nic Larson [EMAIL PROTECTED] wrote:

We need to buy a fast computer for running R. Today we use a 2.8 GHz Intel D CPU and the calculations take around 15 days. Is it possible to get the same calculations down to minutes/hours by only changing the hardware? Should I go for a really fast dual-core 32-bit CPU and run R on Linux or XP, or go for a quad-core / 64-bit CPU? Is it effective to run R on 64 bit (and problem free to install and run)? We have around 2000-3000 euro to spend.

Faster machines won't do that much. Without knowing what methods and algorithms you are running, I bet you a beer that it can be made twice as fast by just optimizing the code. My claim applies recursively. In other words, by optimizing the algorithms/code you can speed up things quite a bit. From experience, it is not unusual to find bottlenecks in generic algorithms that can be made 10-100 times faster. Here is *one* example illustrating that even when you think the code is fully optimized you can still squeeze out more: http://wiki.r-project.org/rwiki/doku.php?id=tips:programming:code_optim2

So, start profiling your code to narrow down the parts that take most of the CPU time. help(Rprof) is a start. There is also a Section 'Profiling R code for speed' in 'Writing R Extensions'. Good old verbose print out of system.time() also helps.

My $.02 ...or 2000-3000USD if it was bounty?! ;)

/Henrik

Thanx for any tip
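As a small illustration of the "measure first, then optimize" advice, the sketch below times a deliberately slow function against a vectorized equivalent with system.time(). The function `slow_cumsum` and the data are made up for this example and have nothing to do with the original poster's code.

```r
# Hypothetical bottleneck: growing a vector one element at a time.
slow_cumsum <- function(x) {
  out <- numeric(0)
  s <- 0
  for (v in x) {
    s <- s + v
    out <- c(out, s)        # reallocates and copies on every iteration
  }
  out
}

x <- runif(20000)
t_slow <- system.time(r_slow <- slow_cumsum(x))
t_fast <- system.time(r_fast <- cumsum(x))   # built-in, vectorized

print(rbind(slow = t_slow[1:3], fast = t_fast[1:3]))
```

For code where the slow part is not obvious, the same call could be run between Rprof("profile.out") and Rprof(NULL) and then inspected with summaryRprof("profile.out"), as described in help(Rprof).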
Re: [R] cluster/snow question
Hi Markus, Many thanks. Is the cluster variable you mention below available in the environment of the nodes? Specifically, within that environment, how could one identify the rank of that specific node? My code would use that information to partition the problem. Thanks, Tolga

Markus Schmidberger [EMAIL PROTECTED] wrote on 09/09/2008 07:11 (Subject: Re: [R] cluster/snow question, cc: r-help@r-project.org):

Hi Tolga, in SNOW you have to start a cluster with the command

  library(snow)
  cluster <- makeCluster(#nodes)

The object cluster is a list with an object for each node, and each object is again a list with all the information (rank, comm, tags). The size of the cluster is the length of the list: #nodes == length(cluster). E.g. the rank for node one you can get by

  cluster[[1]]$rank

Best Markus

[EMAIL PROTECTED] wrote:

Dear R Users, I am attempting to use the snow package for clustering. Is there a way to identify, in the environment of each node, a rank for that node and also the total size of the cluster? By way of analogy, I am looking for the functions in snow equivalent to mpi.comm.rank() and mpi.comm.size() from RMPI, in case that makes things clearer. Thanks in advance, Tolga

-- Dipl.-Tech. Math. Markus Schmidberger, Ludwig-Maximilians-Universität München, IBE - Institut für medizinische Informationsverarbeitung, Biometrie und Epidemiologie, Marchioninistr. 15, D-81377 Muenchen. URL: http://ibe.web.med.uni-muenchen.de, Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de, Tel: +49 (089) 7095 - 4599
[R] Information on the number of CPU's
Dear R Users, I am on Windows XP SP2 platform, using R version 2.7.2. I was wondering if there is a way to find out, within R, the number of CPUs on my machine? I would use this information to set the number of nodes in a cluster, depending on the machine. Sys.info() and .Platform do not carry this information. Thanks in advance, Tolga Uzuner
[R] Binning
Dear List:

I have a dataset with over 5000 records and I would like to put the Count in bins based on the ForkLength, e.g.

Forklength  Count
32-34       ?
35-37       ?
38-40       ?

and so on. Lastly, I would like to make a scatterplot with SampleDate along the X axis and ForkLength along the Y axis. I recently saw an example similar to this one here, but I don't want a histogram; I just want to see the ForkLength ranges in different colors. For example:

ForkLength 32-34: green
ForkLength 35-37: red
ForkLength 38-40: orange

Thanks in advance

    SampleDate ForkLength Count
1    12/4/2007         32     2
2    12/6/2007         33     1
3    12/7/2007         33     2
4    12/7/2007         33     2
5    12/7/2007         34     1
6    12/9/2007         31     1
7    12/9/2007         33     2
8   12/10/2007         33     5
9   12/10/2007         34     1
10  12/11/2007         33     2
11  12/15/2007         34     1
12  12/16/2007         33     2
13  12/17/2007         35     1
14  12/19/2007         33     1
15  12/19/2007         35     1
16  12/20/2007         31     1
17  12/20/2007         32     1
18  12/20/2007         33     1
19  12/20/2007         34     3
20  12/21/2007         31     1
21  12/21/2007         32     3
22  12/21/2007         33     4
23  12/21/2007         34    11
24  12/21/2007         35    16
25  12/21/2007         36     3
26  12/21/2007         37     1
27  12/22/2007         32     1
28  12/22/2007         33     3
29  12/22/2007         34     1
30  12/22/2007         35     2
31  12/23/2007         32     1
32  12/23/2007         35     1
33  12/25/2007         32     1
34  12/25/2007         36     1
35  12/26/2007         34     1
36  12/26/2007         35     2
37  12/26/2007         36     1
38  12/27/2007         34     4
39  12/27/2007         35     2
40  12/27/2007         36     2
41  12/28/2007         32     1
42  12/28/2007         33     1
43  12/28/2007         34     1
44  12/28/2007         35     3
45  12/28/2007         36     4
46  12/28/2007         37     6
47  12/28/2007         38     2
48  12/28/2007         39     2
49  12/29/2007         34     1
50  12/29/2007         35     5
51  12/29/2007         36     2
52  12/29/2007         37     1
53  12/30/2007         33     3
54  12/30/2007         34    10
55  12/30/2007         35    10
56  12/30/2007         36     6
57  12/30/2007         37    15
58  12/30/2007         38     3
59  12/31/2007         33     3
60  12/31/2007         34     8
61  12/31/2007         35     9
62  12/31/2007         36     6
63  12/31/2007         37     3
64  12/31/2007         38     1
65    1/1/2008         34     6
66    1/1/2008         35     6
67    1/1/2008         35     1
68    1/1/2008         36     6
69    1/1/2008         37     9
70    1/1/2008         38     1
71    1/2/2008         34     2
72    1/2/2008         34     1
73    1/2/2008         35     2
74    1/2/2008         36     2
75    1/2/2008         37     2
76    1/2/2008         39     1
77    1/3/2008         34     3
78    1/3/2008         35     3
79    1/3/2008         36     2
80    1/3/2008         37     3
81    1/8/2008         32     1
82    1/8/2008         33     7
83    1/8/2008         34     6
84    1/8/2008         35    10
85    1/8/2008         36    16
86    1/8/2008         37     7
87    1/8/2008         38     1
88    1/8/2008         39     1
89    1/9/2008         33     1
90    1/9/2008         34    20
91    1/9/2008         35    49
92    1/9/2008         36    49
93    1/9/2008         37    39
94    1/9/2008         37     1
95    1/9/2008         38    18
96    1/9/2008         39     1
97    1/9/2008         40     1
98   1/10/2008         32     3
99   1/10/2008         33    13
100  1/10/2008         34    56
101  1/10/2008         35    33
102  1/10/2008         36    24
103  1/10/2008         37    18
104  1/10/2008         39     1
105  1/11/2008         33     7
106  1/11/2008         34    46
107  1/11/2008         35    41
108  1/11/2008         36    28
109  1/11/2008         37    29

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
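One possible approach to the request above, as a minimal base-R sketch rather than a definitive answer: cut() assigns each ForkLength to a bin, tapply() totals Count per bin, and the bin index picks the plotting colour. The four-row data frame is a small subset of the posted rows used only for illustration, and the names fish, Bin, bin_counts and bin_cols are hypothetical. The breaks assume right-closed bins, so "32-34" means (31, 34]; a ForkLength of 31, which does occur in the full data, would fall outside these bins and become NA unless the breaks are widened.

```r
# Illustrative subset of the posted data (rows 1, 13, 47 and 97)
fish <- data.frame(
  SampleDate = as.Date(c("12/4/2007", "12/17/2007", "12/28/2007", "1/9/2008"),
                       format = "%m/%d/%Y"),
  ForkLength = c(32, 35, 38, 40),
  Count      = c(2, 1, 2, 1)
)

# Right-closed bins: (31,34], (34,37], (37,40]
breaks <- c(31, 34, 37, 40)
fish$Bin <- cut(fish$ForkLength, breaks = breaks,
                labels = c("32-34", "35-37", "38-40"))

# Total Count per ForkLength bin
bin_counts <- tapply(fish$Count, fish$Bin, sum)
print(bin_counts)

# Scatterplot of ForkLength against SampleDate, coloured by bin
bin_cols <- c("green", "red", "orange")
plot(fish$SampleDate, fish$ForkLength,
     col = bin_cols[as.integer(fish$Bin)], pch = 16,
     xlab = "SampleDate", ylab = "ForkLength")
legend("topleft", legend = levels(fish$Bin), col = bin_cols, pch = 16)
```

With the full data set, the same three steps apply once the bin boundaries are extended to cover the whole ForkLength range.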
Re: [R] cluster/snow question
On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote: "Hi Markus, Many thanks. Is the cluster variable you mention below available in the environment of the nodes? Specifically, within that environment, how could one identify the rank of that specific node?"

No -- that isn't the way snow works. With snow the partitioning is done on the master. If you need a node to know how many other nodes there are, or which index it represents in a clusterApply call, then you need to pass that information in the arguments.

luke
[R] splitting time vector into days
Greetings -- I have a dataframe a with one element a vector, time, of POSIXct values. What's a good way to split the data frame into periods of a$time, e.g. days, and apply a function, e.g. mean, to some other column of the dataframe, e.g. a$value? Cheers, Alexy
Re: [R] Modality Test
the diptest package, perhaps?

url: www.econ.uiuc.edu/~roger    Roger Koenker
email [EMAIL PROTECTED]          Department of Economics
vox: 217-333-4558                University of Illinois
fax: 217-244-6678                Champaign, IL 61820

On Sep 9, 2008, at 11:23 AM, Amin W. Mugera wrote:

Dear Readers: I have two issues in nonparametric statistical analysis that I need help with. First, does R have a package that can implement a multimodality test, e.g. the Silverman test, DIP test, MAP test or Runt test? I have seen an earlier thread (sometime in 2003) where someone was trying to write code for the Silverman test of multimodality. Are there any other tests that would let me know how many modes are in a distribution? Second, I would like to test whether two distributions are equal. Does R have a package that can implement the Li (1996) test of the equality of two distributions? Is there any other test I can use instead of the Li test? Thank you in advance for your help. Amin Mugera, Graduate Student, AgEcon Dept., Kansas State University
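As a hedged illustration of the suggestion above: the diptest package on CRAN provides dip.test(), Hartigans' dip test, whose null hypothesis is unimodality. The sketch below simulates a clearly bimodal sample (the data and the names x and res are made up for illustration) and only runs the test if diptest happens to be installed.

```r
set.seed(1)
# Simulated two-component mixture, so the sample is clearly bimodal
x <- c(rnorm(200, mean = -3), rnorm(200, mean = 3))

res <- NULL
if (requireNamespace("diptest", quietly = TRUE)) {
  # Hartigans' dip test: a small p-value is evidence against unimodality
  res <- diptest::dip.test(x)
  print(res)
} else {
  message("diptest is not installed; install.packages('diptest') to run the test")
}
```

The dip test only addresses unimodal versus not unimodal; it does not by itself estimate the number of modes.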
Re: [R] cluster/snow question
Understood, that's what I'll do. I'm thinking of exporting the number of nodes to all nodes and passing in the node rank as 1:nonodes through clusterApply. Thanks all, Tolga

(In reply to Luke Tierney's message of 09/09/2008 20:11.)
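The plan described in this thread, with the master handing each node its rank and the cluster size as arguments, can be sketched as follows. This is a minimal sketch under stated assumptions, not the posters' actual code: it uses the parallel package bundled with later R versions (its makeCluster, clusterApply and stopCluster are modelled on the snow functions of the same names) so that it runs without installing snow, and the worker function partial_sum and the round-robin partitioning are hypothetical choices made for illustration.

```r
library(parallel)  # bundled with later R releases; the calls mirror snow's

# Hypothetical worker: sums the slice of x that belongs to node 'rank'.
# 'rank' and 'size' are supplied by the master, as Luke suggests.
partial_sum <- function(rank, size, x) {
  idx <- seq(rank, length(x), by = size)  # round-robin slice for this node
  sum(x[idx])
}

n_nodes <- 2
cl <- makeCluster(n_nodes)
x <- 1:100

# clusterApply sends element i of seq_len(n_nodes) to node i as the first
# argument (the rank); size and x are passed along as extra arguments.
parts <- clusterApply(cl, seq_len(n_nodes), partial_sum,
                      size = n_nodes, x = x)
stopCluster(cl)

total <- sum(unlist(parts))
print(total)
```

Passing size as an argument avoids a separate export step; exporting it to the nodes once, as Tolga proposes, would be an alternative when many calls share the same value.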
Re: [R] Information on the number of CPU's
On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote: "Dear R Users, I am on Windows XP SP2 platform, using R version 2.7.2. I was wondering if there is a way to find out, within R, the number of CPUs on my machine? I would use this information to set the number of nodes in a cluster, depending on the machine. Sys.info() and .Platform do not carry this information."

Correct, since

a) R does not make use of more than 1.
b) It is really not portable, and not even well-defined. (How many CPUs has a hyperthreaded dual Xeon? Some say 2, some say 4. Do you want CPUs or cores? If this is a virtualized OS, is it the physical number or the logical number?)

In the case of Windows, how depends on the Windows version. The w32api (XP or later) call GetNativeSystemInfo will tell you the number of CPUs, for some (unstated) definition of 'CPU'. Later versions have GetLogicalProcessorInformation, which can give the number of cores.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] Information on the number of CPU's
Many thanks, that's very helpful. Regards, Tolga

- Original Message - From: Prof Brian Ripley [EMAIL PROTECTED] Sent: 09/09/2008 20:57 CET To: Tolga Uzuner Cc: r-help@r-project.org Subject: Re: [R] Information on the number of CPU's
Re: [R] Modality Test
Hi Amin,

"First, does R have a package that can implement the multimodality test, e.g., the Silverman test, DIP test, MAP test or Runt test."

Jeremy Tantrum (a Ph.D. student of Werner Steutzle's, c. 2003/04) did some work on this. There is some useful code on Steutzle's website: http://www.stat.washington.edu/wxs/Stat593-s03/Code/jeremy-unimodality.R

I used it last year when I was trying to solve the problem of how best to compare lots of density curves (age distributions of 3 spp. of tree euphorbias from about 35 very different sites). In particular, I had to ensure that I wasn't creating spurious bimodality at a particular age range when combining sites. You might find it useful. Feel free to contact me off list if the code has gone, as I think I still have it (somewhere).

Regards, Mark.
Re: [R] Modality Test
Whoops! I think that should be Stuetzle --- though I very much doubt that he reads the list.
Re: [R] Information on the number of CPU's
The wmic command line utility can also be used to query this; on a dual-core Vista laptop I get

C:\Users\luke>wmic cpu get NumberOfCores,NumberOfLogicalProcessors
NumberOfCores  NumberOfLogicalProcessors
2              2

luke

--
Luke Tierney
University of Iowa                  Phone: 319-335-3386
Department of Statistics and        Fax: 319-335-3017
Actuarial Science
241 Schaeffer Hall                  email: [EMAIL PROTECTED]
Iowa City, IA 52242                 WWW: http://www.stat.uiowa.edu
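For readers on R releases later than the 2.7.2 discussed in this thread: the bundled parallel package (added in R 2.14.0) has detectCores(), which wraps this kind of platform-specific query. The sketch below is a hedged illustration, not something the posters had available; the names n_logical, n_physical and n_nodes are hypothetical, and the caveats raised above still apply, since the result may be NA and what counts as a core differs between platforms.

```r
library(parallel)

# Logical processors (hyperthreads count separately) and physical cores;
# either may be NA where the platform cannot report it.
n_logical  <- detectCores(logical = TRUE)
n_physical <- detectCores(logical = FALSE)

# Fall back to a single node when nothing could be detected
n_nodes <- max(1L, n_logical, na.rm = TRUE)
print(c(logical = n_logical, physical = n_physical, nodes = n_nodes))
```

The resulting n_nodes could then be passed to makeCluster() when sizing a cluster per machine.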
Re: [R] Modality Test
Hi Amin, And I have just remembered that there is a function called curveRep in Frank Harrell's Hmisc package that might be useful, even if not quite in the channel of your enquiry. curveRep was added to the package after my struggles, so I never used it and so don't know how well it performs (quite well, I would think). Regards, Mark.
[R] NMDS and varimax rotation
Hello, following an NMDS analysis (performed with metaMDS or isoMDS), is it possible to rotate the axes with a varimax rotation? Thanks in advance. Bernd Panassiti
[R] csaps in R?
Is there a function in R equivalent to Matlab's csaps? I need a spline function that computes the smoothing parameter the same way csaps does, so that I can compare some results. AFAIK, the spar argument of smooth.spline is related but not the same.
[R] tsdiag error
Does anyone know why I get the following error when trying tsdiag?

Error in UseMethod("tsdiag") : no applicable method for "tsdiag"

I am invoking it as: tsdiag(mar). Thank you. Kevin
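One possible reading of this error, offered only as a sketch: tsdiag() is an S3 generic, and the message appears when the object passed in has a class for which no tsdiag method is registered. The post does not say how `mar` was created, so the example below uses the built-in `lh` series as a stand-in and a hypothetical helper `has_method` to show which classes the stats package covers.

```r
# Assumption: 'lh' (a built-in dataset) stands in for the poster's data;
# how 'mar' was actually fitted is not stated in the post.
fit <- arima(lh, order = c(1, 0, 0))
class(fit)                                  # "Arima"

# Hypothetical helper: is a tsdiag method registered for this class?
has_method <- function(cls) {
  !is.null(getS3method("tsdiag", cls, optional = TRUE))
}

has_arima_method <- has_method("Arima")     # fits from arima()
has_ar_method    <- has_method("ar")        # fits from ar()

# Works because 'fit' has class "Arima"; draws the diagnostic plots.
tsdiag(fit)
```

If class(mar) turns out to be something other than "Arima", "arima0" or "StructTS", refitting the model with arima() may be one way forward.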
Re: [R] splitting time vector into days
See ?aggregate, ?window.zoo and ?rollapply. In any case, have a look at the zoo package.

On Tue, Sep 9, 2008 at 3:25 PM, Alexy Khrabrov [EMAIL PROTECTED] wrote:

Greetings -- I have a dataframe a, one element of which is a vector, time, of POSIXct values. What's a good way to split the data frame into periods of a$time, e.g. days, and apply a function, e.g. mean, to some other column of the dataframe, e.g. a$value? Cheers, Alexy

-- Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
Re: [R] NMDS and varimax rotation
Have you looked at the vegan vignette? I know there is Procrustes rotation.

On Tue, Sep 9, 2008 at 3:54 PM, Bernd Panassiti [EMAIL PROTECTED] wrote:

hello, subsequently to a NMDS analysis (performed with metaMDS or isoMDS) is it possible to rotate the axis through a varimax-rotation? Thanks in advance. Bernd Panassiti

-- Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
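On the mechanics of the original question, here is a rough sketch rather than a recommendation. It assumes the MASS package is installed and uses the built-in `eurodist` distances as stand-in data. varimax() from the stats package accepts any numeric matrix with at least two columns, so it can be applied to the configuration returned by isoMDS(). The rotation is orthogonal, so the inter-point distances of the solution are unchanged. Whether varimax axes are meaningful for an NMDS ordination is a separate question that this sketch does not settle.

```r
library(MASS)   # assumption: MASS is available (it ships with standard R)

# NMDS in two dimensions on a built-in distance matrix (stand-in data)
fit <- isoMDS(eurodist, k = 2, trace = FALSE)

# varimax() returns the rotated matrix and the rotation matrix 'rotmat'
rot <- varimax(fit$points, normalize = FALSE)

# Apply the rotation to the NMDS configuration
rotated <- fit$points %*% rot$rotmat

# The configuration is only rotated: distances between points are preserved
d_before <- as.vector(dist(fit$points))
d_after  <- as.vector(dist(rotated))
```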
[R] building a package that contains S4 classes and methods
Hello R users,

I am trying to make my first package and I get an error that I can't understand. The package is built out of three files (one for functions, one for S4 classes and one for S4 methods). Once I source them, I run

package.skeleton( name="TDC" )

within an R session and I get

Creating directories ...
Creating DESCRIPTION ...
Creating Read-and-delete-me ...
Saving functions and data ...
Making help files ...
Done.
Further steps are described in './TDC/Read-and-delete-me'.
Warning messages:
1: In dump(internalObjs, file = file.path(code_dir, sprintf("%s-internal.R", : deparse of an S4 object will not be source()able
2: In dump(internalObjs, file = file.path(code_dir, sprintf("%s-internal.R", : deparse of an S4 object will not be source()able
3: In dump(internalObjs, file = file.path(code_dir, sprintf("%s-internal.R", : deparse of an S4 object will not be source()able
4: In dump(internalObjs, file = file.path(code_dir, sprintf("%s-internal.R", : deparse may be incomplete

I keep going in spite of the warnings with

R CMD check --no-examples TDC

and I get

* checking for working pdflatex ... OK
* using log directory '/home/mariepierre/Packages/PermAlgo/PermAlgo/PermAlgo2/TDC.Rcheck'
* using R version 2.7.1 (2008-06-23)
* using session charset: UTF-8
* checking for file 'TDC/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'TDC' version '1.0'
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking whether package 'TDC' can be installed ... ERROR
Installation failed.

The error file says:

* Installing *source* package 'TDC' ...
** R
** preparing package for lazy loading
Error in parse(n = -1, file = file) : unexpected '<' at
102: `.__C__BindArgs` <-
103:
Calls: <Anonymous> -> code2LazyLoadDB -> sys.source -> parse
Execution halted
ERROR: lazy loading failed for package 'TDC'
** Removing '/home/mariepierre/Packages/PermAlgo/PermAlgo/PermAlgo2/TDC.Rcheck/TDC'

The problem is with my classes and methods. The respective files contain:

setClass("BindArgs", signature( "function" ))
setClass("BindArgs2", signature( "function" ))

and

setMethod("initialize", "BindArgs", function( .Object, f, ... ) callNextMethod( .Object, function( x ) f( x, ... ) ))
setMethod("initialize", "BindArgs2", function( .Object, f, ...) callNextMethod( .Object, function( x, y ) f( x, y, ... ) ))

Everything works well within an R session but I can't build the package. If I look at the internal R file that this created, I get

`.__C__BindArgs` <- <S4 object of class structure("classRepresentation", package = "methods")>
`.__C__BindArgs2` <- <S4 object of class structure("classRepresentation", package = "methods")>
`.__M__initialize:methods` <- <S4 object of class structure("MethodsList", package = "methods")>
`.__T__initialize:methods` <- <environment>

Well, let's just say that I am new to classes, so this confuses me greatly. I have checked the documentation and tried a few things but I have reached my personal limits! I am using R 2.7.1 on Linux Fedora 8. Any comments on what is happening and/or help would be greatly appreciated.

MP
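As a hedged illustration of what the warnings point at: the class and method objects that setClass() and setMethod() create in the session cannot be deparsed by dump(), whereas the calls that create them are ordinary source code that can sit in a file under the package's R/ directory. The sketch below is a self-contained version of the first class. It is an assumption-laden rewrite, not the poster's exact code: it declares the class with `contains = "function"` instead of `signature("function")`, and the object name `add3` and the argument `k` are made up for the example.

```r
library(methods)

# Assumption: 'contains = "function"' is used here as the explicit way to say
# that a BindArgs object is a function; the original post wrote the class
# definition differently.
setClass("BindArgs", contains = "function")

# Bind the extra arguments in '...' to f, leaving a one-argument function.
setMethod("initialize", "BindArgs",
          function(.Object, f, ...) {
            callNextMethod(.Object, function(x) f(x, ...))
          })

# Hypothetical usage: fix k = 3 in a two-argument function.
add3 <- new("BindArgs", function(x, k) x + k, k = 3)
result <- add3(1)
```

Saved as a plain .R file inside the package's R/ directory, these calls are evaluated when the package is installed, so nothing has to be reconstructed from a dumped session.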
Re: [R] splitting time vector into days
Here is one way of doing it:

> x <- data.frame(dates=seq(as.POSIXct('2008-09-08'), by='7 hours', length=10),
+ values=1:10)
> # split into days
> x.s <- split(x, format(x$dates, "%Y%m%d"))
> x.s
$`20080908`
                dates values
1 2008-09-08 00:00:00      1
2 2008-09-08 07:00:00      2
3 2008-09-08 14:00:00      3
4 2008-09-08 21:00:00      4

$`20080909`
                dates values
5 2008-09-09 04:00:00      5
6 2008-09-09 11:00:00      6
7 2008-09-09 18:00:00      7

$`20080910`
                 dates values
8  2008-09-10 01:00:00      8
9  2008-09-10 08:00:00      9
10 2008-09-10 15:00:00     10

> lapply(x.s, function(.df) mean(.df$values))
$`20080908`
[1] 2.5

$`20080909`
[1] 6

$`20080910`
[1] 9

On Tue, Sep 9, 2008 at 3:25 PM, Alexy Khrabrov [EMAIL PROTECTED] wrote:

Greetings -- I have a dataframe a, one element of which is a vector, time, of POSIXct values. What's a good way to split the data frame into periods of a$time, e.g. days, and apply a function, e.g. mean, to some other column of the dataframe, e.g. a$value? Cheers, Alexy

-- Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
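A possible variation on the same idea, sketched with base R only: instead of splitting the data frame and then applying the function, derive a day label and let aggregate() do both steps. The time zone is fixed to UTC here purely so that the day boundaries are reproducible. That setting and the column name `day` are assumptions of this example, not part of the original reply.

```r
# Same toy data as above, with an explicit time zone (assumption: UTC)
x <- data.frame(
  dates  = seq(as.POSIXct("2008-09-08", tz = "UTC"), by = "7 hours",
               length.out = 10),
  values = 1:10
)

# Label each observation with its calendar day
x$day <- format(x$dates, "%Y-%m-%d")

# One row per day, holding the mean of 'values' for that day
daily <- aggregate(values ~ day, data = x, FUN = mean)
daily
```

The result is a data frame rather than a list, which can be more convenient if the daily means are to be plotted or merged with other data.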
Re: [R] Help with 'spectrum'
This is why some help pages have references: please use them (Venables & Ripley explain the exact formulae used in R).

On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote:

> For the command 'spectrum' I read: The spectrum here is defined with scaling 1/frequency(x), following S-PLUS. This makes the spectral density a density over the range (-frequency(x)/2, +frequency(x)/2], whereas a more common scaling is 2π and range (-0.5, 0.5] (e.g., Bloomfield) or 1 and range (-π, π].
>
> Forgive my ignorance but I am having a hard time interpreting this. Does this mean that in the spectrum output every element of the $spec array is scaled by 1/frequency(x)? I am having a hard time determining what is meant by 'frequency'.

So please do look up the help for frequency().

> Say I define a time series for a year with samples for every day. I input a 'frequency' of 365 (which in my mind is the period).

The point is that your time unit is 1 year, and your measurements are every 1/365 year. That is unrelated to the 'period' (no one mentioned periodicity yet).

> On the output of 'spectrum' would this mean that every element of the $spec array is scaled by 1/365? There is a corresponding frequency array on the output from 'spectrum'. If the frequency is 365 and an element in the frequency array output from 'spectrum' is .1 am I to assume that the period is 36.5 and a corresponding sin wave would be sin(2 * pi * 36.5/365)?

Hmm, you need a 't' in there (and a phase). The issue is the units for t. A frequency in the 'freq' element of the output of 0.1 corresponds to 10 cycles per unit of time, and in your example the unit of time is 365 observations. So the sine (sic) wave is sin(2*pi*0.1*t + phi), where the increments in 't' are 1/365: that gives 10 complete cycles in observations at, say, c(1990, 1) ... c(1990, 365), the days of 1990 (not a leap year).

> Thank you in advance for helping me clear up some confusion. Kevin

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
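One way to check the units of the 'freq' element empirically is to feed in a series whose number of cycles is known. This is a small sketch under stated assumptions, not part of the original exchange: the series is a pure sine with 10 complete cycles in one year of 365 daily values, and tapering, detrending and padding are switched off in spec.pgram() so that the periodogram ordinates fall exactly on whole cycles per year.

```r
n <- 365
t_years <- (0:(n - 1)) / n              # time in years, increments of 1/365

# Daily series for 1990 with exactly 10 cycles per year (assumed test signal)
x <- ts(sin(2 * pi * 10 * t_years), start = c(1990, 1), frequency = 365)

# Raw periodogram: no taper, no detrending, no padding to a "fast" length
sp <- spec.pgram(x, taper = 0, detrend = FALSE, fast = FALSE, plot = FALSE)

# Frequency at which the estimated spectrum is largest
peak <- sp$freq[which.max(sp$spec)]
peak

# The frequency axis stops at the Nyquist limit frequency(x) / 2
max_freq <- max(sp$freq)
```

In this setup the peak is reported in cycles per unit of time (here, per year), and the frequency axis runs up to frequency(x)/2, which matches the range quoted from the help page.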
Re: [R] Binning
This should do what you want.

x <- read.table('clipboard', header=TRUE, as.is=TRUE)
# convert dates
x$date <- as.POSIXct(strptime(x$SampleDate, "%m/%d/%Y"))
# put ForkLength into bins
x$bins <- cut(x$ForkLength, breaks=c(32, 34, 37, 40), include.lowest=TRUE)
# count the bins
tapply(x$Count, x$bins, sum)
# plot the data
plot(x$date, x$ForkLength, col=c('green', 'red', 'orange')[x$bins])

On Tue, Sep 9, 2008 at 3:12 PM, Felipe Carrillo [EMAIL PROTECTED] wrote:

Dear List: I have a dataset with over 5000 records and I would like to put the Count in bins based on the ForkLength, e.g.

ForkLength  Count
32-34       ?
35-37       ?
38-40       ?
and so on...

Lastly, I would like to plot (scatterplot) the data with SampleDate along the X axis and ForkLength along the Y axis. I recently saw an example similar to this one here, but I don't want a histogram; I just want to see the ForkLength ranges with different colors. For example:

ForkLength 32-34: green
ForkLength 35-37: red
ForkLength 38-40: orange

Thanks in advance

    SampleDate ForkLength Count
1    12/4/2007         32     2
2    12/6/2007         33     1
3    12/7/2007         33     2
4    12/7/2007         33     2
5    12/7/2007         34     1
6    12/9/2007         31     1
7    12/9/2007         33     2
8   12/10/2007         33     5
9   12/10/2007         34     1
10  12/11/2007         33     2
11  12/15/2007         34     1
12  12/16/2007         33     2
13  12/17/2007         35     1
14  12/19/2007         33     1
15  12/19/2007         35     1
16  12/20/2007         31     1
17  12/20/2007         32     1
18  12/20/2007         33     1
19  12/20/2007         34     3
20  12/21/2007         31     1
21  12/21/2007         32     3
22  12/21/2007         33     4
23  12/21/2007         34    11
24  12/21/2007         35    16
25  12/21/2007         36     3
26  12/21/2007         37     1
27  12/22/2007         32     1
28  12/22/2007         33     3
29  12/22/2007         34     1
30  12/22/2007         35     2
31  12/23/2007         32     1
32  12/23/2007         35     1
33  12/25/2007         32     1
34  12/25/2007         36     1
35  12/26/2007         34     1
36  12/26/2007         35     2
37  12/26/2007         36     1
38  12/27/2007         34     4
39  12/27/2007         35     2
40  12/27/2007         36     2
41  12/28/2007         32     1
42  12/28/2007         33     1
43  12/28/2007         34     1
44  12/28/2007         35     3
45  12/28/2007         36     4
46  12/28/2007         37     6
47  12/28/2007         38     2
48  12/28/2007         39     2
49  12/29/2007         34     1
50  12/29/2007         35     5
51  12/29/2007         36     2
52  12/29/2007         37     1
53  12/30/2007         33     3
54  12/30/2007         34    10
55  12/30/2007         35    10
56  12/30/2007         36     6
57  12/30/2007         37    15
58  12/30/2007         38     3
59  12/31/2007         33     3
60  12/31/2007         34     8
61  12/31/2007         35     9
62  12/31/2007         36     6
63  12/31/2007         37     3
64  12/31/2007         38     1
65    1/1/2008         34     6
66    1/1/2008         35     6
67    1/1/2008         35     1
68    1/1/2008         36     6
69    1/1/2008         37     9
70    1/1/2008         38     1
71    1/2/2008         34     2
72    1/2/2008         34     1
73    1/2/2008         35     2
74    1/2/2008         36     2
75    1/2/2008         37     2
76    1/2/2008         39     1
77    1/3/2008         34     3
78    1/3/2008         35     3
79    1/3/2008         36     2
80    1/3/2008         37     3
81    1/8/2008         32     1
82    1/8/2008         33     7
83    1/8/2008         34     6
84    1/8/2008         35    10
85    1/8/2008         36    16
86    1/8/2008         37     7
87    1/8/2008         38     1
88    1/8/2008         39     1
89    1/9/2008         33     1
90    1/9/2008         34    20
91    1/9/2008         35    49
92    1/9/2008         36    49
93    1/9/2008         37    39
94    1/9/2008         37     1
95    1/9/2008         38    18
96    1/9/2008         39     1
97    1/9/2008         40     1
98   1/10/2008         32     3
99   1/10/2008         33    13
100  1/10/2008         34    56
101  1/10/2008         35    33
102  1/10/2008         36    24
103  1/10/2008         37    18
104  1/10/2008         39     1
105  1/11/2008         33     7
106  1/11/2008         34    46
107  1/11/2008         35    41
108  1/11/2008         36    28
109  1/11/2008         37    29

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
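For readers without the clipboard data to hand, the binning step can be tried on a handful of rows taken from the table above (rows 1, 23, 24, 47 and 97). This is a sketch with one deliberate difference from the reply: the breaks start at 31 and explicit labels are supplied, so the bins read "32-34", "35-37" and "38-40" as in the question. That choice of breaks and labels is an assumption of the example, and, as in the reply, lengths outside 32 to 40 (such as 31) fall into no bin and become NA.

```r
# Five rows copied from the posted table
x <- data.frame(
  SampleDate = c("12/4/2007", "12/21/2007", "12/21/2007",
                 "12/28/2007", "1/9/2008"),
  ForkLength = c(32, 34, 35, 38, 40),
  Count      = c(2, 11, 16, 2, 1),
  stringsAsFactors = FALSE
)

# Intervals are (31,34], (34,37], (37,40]; for whole-number lengths these
# are exactly 32-34, 35-37 and 38-40
x$bins <- cut(x$ForkLength, breaks = c(31, 34, 37, 40),
              labels = c("32-34", "35-37", "38-40"))

# Total Count per bin
totals <- tapply(x$Count, x$bins, sum)
totals

# One colour per bin, indexed by the factor's integer codes
cols <- c("green", "red", "orange")[x$bins]
```

The vector `cols` can be passed as the `col` argument of plot(), just as in the reply.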
Re: [R] naive variance in GEE
On Mon, 8 Sep 2008, Qiong Yang wrote:

> Hi, The standard error from logistic regression is slightly different from the naive SE from GEE under independence working correlation structure.

Yes

> Shouldn't they be identical? Anyone has insight about this?

No, they shouldn't. They are different estimators of the same quantity, like the mean and median of a symmetric distribution.

-thomas

> Thanks, Qiong
>
> a <- rbinom(1000,1)
> b <- rbinom(1000,2,0.1)
> c <- rbinom(1000,10,0.5)
> summary(gee(a~b, id=c,family=binomial,corstr="independence"))$coef
> summary(glm(a~b,family=binomial))

Thomas Lumley
Assoc. Professor, Biostatistics
[EMAIL PROTECTED]
University of Washington, Seattle
Re: [R] naive variance in GEE
Sorry, I misread your message. Prof Ripley is right, as usual -- the estimates use different stopping criteria and so are just numerically different.

-thomas

On Tue, 9 Sep 2008, Thomas Lumley wrote:

> On Mon, 8 Sep 2008, Qiong Yang wrote:
>
> > Hi, The standard error from logistic regression is slightly different from the naive SE from GEE under independence working correlation structure.
>
> Yes
>
> > Shouldn't they be identical? Anyone has insight about this?
>
> No, they shouldn't. They are different estimators of the same quantity, like the mean and median of a symmetric distribution.
>
> -thomas
>
> > Thanks, Qiong
> >
> > a <- rbinom(1000,1)
> > b <- rbinom(1000,2,0.1)
> > c <- rbinom(1000,10,0.5)
> > summary(gee(a~b, id=c,family=binomial,corstr="independence"))$coef
> > summary(glm(a~b,family=binomial))

Thomas Lumley
Assoc. Professor, Biostatistics
[EMAIL PROTECTED]
University of Washington, Seattle
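The effect of a stopping criterion can be seen without the gee package at all. The sketch below is an analogy, not a reproduction of the gee/glm comparison in the thread: it fits the same logistic regression twice with glm(), once with a very tight convergence tolerance and once with a loose one, and compares the standard errors. The simulated data, the seed, and the two tolerance values are assumptions chosen for the illustration.

```r
set.seed(1)                                  # assumption: arbitrary seed

# Simulated data loosely modelled on the thread's example
b <- rbinom(1000, 2, 0.1)
a <- rbinom(1000, 1, plogis(-0.5 + 0.8 * b))

# Same model, same algorithm, different stopping rules
fit_tight <- glm(a ~ b, family = binomial,
                 control = glm.control(epsilon = 1e-12))
fit_loose <- glm(a ~ b, family = binomial,
                 control = glm.control(epsilon = 1e-3))

# Standard errors from each fit
se_tight <- sqrt(diag(vcov(fit_tight)))
se_loose <- sqrt(diag(vcov(fit_loose)))

# Iterations used, and the gap between the two sets of standard errors
c(tight = fit_tight$iter, loose = fit_loose$iter)
se_tight - se_loose
```

Any gap between the two sets of standard errors comes purely from where the iteration stopped, which is the kind of numerical difference described above.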
[R] [R-pkgs] survey package
Version 3.9 of the survey package is now on CRAN. Since the last announcement (version 3.6-11, about a year ago) the main changes are

- Database-backed survey objects: the data can live in a SQLite (or other DBI-compatible) database and be loaded as needed.
- Ordinal logistic regression
- Support for the 'mitools' package and multiply-imputed data
- Conditioning plots, transparent scatterplots, survival and CDF plots.

There is more information on the package web page at http://faculty.washington.edu/tlumley/survey/

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
[EMAIL PROTECTED]
University of Washington, Seattle