[R] Need Help for creating a new variable
Hi R users,

I did the research and worked on this for hours, but I still don't know how to solve my silly problem. I am trying to create a new variable in my dataset, such that if diet=="C" and vesl=="P" then trt="CP"; if diet=="C" and vesl=="A" then trt="CA". The following is my code (it does not work correctly). Could anyone give me a hint? Much appreciated!

  diet <- sort(rep(x=c("C","T"),4))
  vesl <- rep(x=c("A","P"),4)
  mydata <- data.frame(diet,vesl)
  mydata$trt <- ifelse(mydata$diet=="C" && mydata$vesl=="A", "CA",
                ifelse(mydata$diet=="C" && mydata$vesl=="P", "CP",
                ifelse(mydata$diet=="T" && mydata$vesl=="A", "TA",
                ifelse(mydata$diet=="T" && mydata$vesl=="P", "TP"))))
  mydata
    diet vesl trt
  1    C    A  CA
  2    C    P  CA
  3    C    A  CA
  4    C    P  CA
  5    T    A  CA
  6    T    P  CA
  7    T    A  CA
  8    T    P  CA

Thank you very much,
Chunhao

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Difference between GEE and Robust Cluster Standard Errors
On Wed, 18 Feb 2009, jjh21 wrote:

> Hello, I know that two possible approaches to dealing with clustered data would be GEE or a robust cluster covariance matrix from a standard regression. What are the differences between these two methods, or are they doing the same thing? Thanks.

There are two components to 'GEE'. The first is the robust cluster (or 'sandwich') covariance, the second is the ability to choose a weight matrix to get higher efficiency ('working correlation').

Using the 'independence working correlation' asks for the same weighting as in ordinary regression, so the estimates are the same as in standard regression, and then the standard errors are the same as the 'robust cluster' ones (up to factors of n/(n-1) and similar implementation details).

The standard errors are also the same as the Horvitz-Thompson estimator gives for cluster sampling from an infinite population, and they are also the same as an approximation to the cluster jackknife standard errors where a cluster is downweighted slightly rather than removed.

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
tlum...@u.washington.edu
University of Washington, Seattle
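To make the 'sandwich' part of the answer concrete, here is a minimal base-R sketch that computes a cluster-robust covariance by hand for an ordinary lm() fit. The simulated data and all variable names are assumptions made up for illustration, and no small-sample factor such as n/(n-1) is applied, so packaged implementations may differ slightly in their numbers.

```r
set.seed(42)
n_clusters <- 20
m <- 5
cluster <- rep(seq_len(n_clusters), each = m)
x <- rnorm(n_clusters * m)
u <- rnorm(n_clusters)[cluster]          # shared cluster effect (illustrative)
y <- 1 + 0.5 * x + u + rnorm(n_clusters * m)

# Ordinary regression: same estimates as GEE with independence working correlation
fit <- lm(y ~ x)
X <- model.matrix(fit)
e <- residuals(fit)

# 'Bread': inverse of X'X
bread <- solve(crossprod(X))

# 'Meat': sum score contributions within each cluster, then take cross-products
scores <- rowsum(X * e, group = cluster)  # one row per cluster
meat <- crossprod(scores)

V_cluster <- bread %*% meat %*% bread

se_naive   <- sqrt(diag(vcov(fit)))       # model-based standard errors
se_cluster <- sqrt(diag(V_cluster))       # cluster-robust standard errors
print(rbind(se_naive, se_cluster))
```

The point estimates come from lm() unchanged; only the covariance is replaced, which is the first of the two GEE components described above.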
[R] SVM regression code
Dear R user,

I am looking for SVM regression in R. It would be helpful if someone could send me SVM regression code.

Thanks,
Alex
[R] Build R-2.8.1 on AIX5.3
Hi R users,

I want to build R-2.8.1 on AIX 5.3, but I got the following error message:

  Error in dyn.load(file, DLLpath = DLLpath, ...) :
    unable to load shared library '/rnd/homes/jixu/tmp/R-2.8.1/library/stats/libs/stats.so':
    rtld: 0712-001 Symbol d1mach was referenced from module /rnd/homes/jixu/tmp/R-2.8.1/library/stats/libs/stats.so(), but a runtime definition of the symbol was not found.
    rtld: 0712-001 Symbol interv was referenced from module /rnd/homes/jixu/tmp/R-2.8.1/library/stats/libs/stats.so(), but a runtime definition of the symbol was not found.
    rtld: 0712-002 fatal error: exiting.
  Error: unable to load R code in package 'methods'
  Execution halted

By searching r-help I found that someone met the same problem when building R-2.7.0, and he used a patch to resolve the issue. The patch's location is http://prs.ism.ac.jp/~nakama/AIX/changefiles. But I cannot find a patch for R-2.8.1 at this path. So I want to know what I can do to build R-2.8.1 on AIX 5.3.

In addition, I use the IBM compiler with the parameters below:

  OBJECT_MODE=64
  LIBICONV=/where/libiconv/installed
  CC="xlc_r -q64"
  CFLAGS="-O -qstrict"
  CXX="xlC_r -q64"
  CXXFLAGS="-O -qstrict"
  F77="xlf_r -q64"
  AR="ar -X64"
  CPPFLAGS="-I$LIBICONV/include -I/usr/lpp/X11/include/X11"
  LDFLAGS="-L$LIBICONV/lib -L/usr/lib -L/usr/X11R6/lib"
  --prefix=/my_R_dir \
  --enable-R-shlib \
  --enable-BLAS-shlib \
  --with-x \
  --with-readline=no

Thanks,
Jin
Re: [R] Need Help for creating a new variable
Hi Chun,

> I did do the research and work on for hours ... I try to creat a new variable in my dataset.

Yes, looks like you did. Look at ?interaction, which gives you more flexibility than ?":".

  ## Example
  diet <- sort(rep(x=c("C","T"),4))
  vesl <- rep(x=c("A","P"),4)
  mydata <- data.frame(diet,vesl)
  mydata$trt <- interaction(mydata$diet, mydata$vesl)
  mydata
  mydata$trt <- mydata$diet:mydata$vesl
  mydata

Regards,
Mark.

Chun-Hao Tu wrote:

> Hi R users,
>
> I did do the research and work on for hours, but I still don't know how to solve my silly problem. I try to creat a new variable in my dataset. such as if diet=="C" vesl=="P" then trt="CP"; if diet=="C" vesl=="A" then trt="CA";. The following is my code (It does not work correctly). Could anyone give me a hint? Appreciate!
>
>   diet <- sort(rep(x=c("C","T"),4))
>   vesl <- rep(x=c("A","P"),4)
>   mydata <- data.frame(diet,vesl)
>   mydata$trt <- ifelse(mydata$diet=="C" && mydata$vesl=="A", "CA",
>                 ifelse(mydata$diet=="C" && mydata$vesl=="P", "CP",
>                 ifelse(mydata$diet=="T" && mydata$vesl=="A", "TA",
>                 ifelse(mydata$diet=="T" && mydata$vesl=="P", "TP"))))
>   mydata
>     diet vesl trt
>   1    C    A  CA
>   2    C    P  CA
>   3    C    A  CA
>   4    C    P  CA
>   5    T    A  CA
>   6    T    P  CA
>   7    T    A  CA
>   8    T    P  CA
>
> Thank you very much
> Chunhao
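As a small self-contained sketch of the suggestion above, the combined treatment label can be built either with paste0() or with interaction(). The column names trt_paste and trt_fac are made up for this illustration, and stringsAsFactors = TRUE is set explicitly because newer versions of R no longer convert strings to factors by default. One plausible reason for the all-"CA" column in the question (an assumption, since the posted code is incomplete) is a scalar && where the vectorised & was needed; neither approach below needs a logical test at all.

```r
diet <- sort(rep(c("C", "T"), 4))
vesl <- rep(c("A", "P"), 4)
mydata <- data.frame(diet, vesl, stringsAsFactors = TRUE)

# Character label: element-wise concatenation of the two columns
mydata$trt_paste <- paste0(mydata$diet, mydata$vesl)

# Factor label: one level per diet/vesl combination, without a separator
mydata$trt_fac <- interaction(mydata$diet, mydata$vesl, sep = "")

print(mydata)
```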
Re: [R] R: R scripts and parameters
You seem not to have put the R bin directory in your path. That is where Rterm, Rscript ... are installed (and the installer does not change the PATH for you).

On Thu, 19 Feb 2009, mau...@alice.it wrote:

> Sorry. This is still unclear to me. I generated a file called Test.R that contains the following lines:
>
>   commandArgs(TRUE)
>   cat("\n A = ",A,"\n")
>   cat("\n B = ",B,"\n")
>   cat("\n C = ",C,"\n")
>
> First of all I have to clarify which command line we are talking about. If I run the command Rscript from a Windows terminal, the system does not recognize such a command:
>
>   C:\Documents and Settings\Monville\Utilities-Dir>Rscript Test.R aa bb cc
>   Rscript non è riconosciuto come comando interno o esterno, un programma eseguibile o un file batch.
>
> The above system response tells me that Rscript is not recognized either as an internal or external command, an executable or a batch file. In fact Rscript is an R command. Nevertheless, I started an R session and tried such a command from the R console command line and got the following:
>
>   > getwd()
>   [1] "C:/Documents and Settings/Monville/Utilities-Dir"
>   > Rscript Test.R aa bb cc
>   Error: unexpected symbol in "Rscript Test.R"
>
> I feel I do not have a good grasp of how to run R scripts the same way as I usually run C programs. Any help is welcome.
>
> Thank you.
> Maura
>
> -----Original Message-----
> From: Duncan Murdoch [mailto:murd...@stats.uwo.ca]
> Sent: Tue 17/02/2009 17.34
> To: mau...@alice.it
> Cc: r-h...@stat.math.ethz.ch
> Subject: Re: [R] R scripts and parameters
>
> On 2/17/2009 10:55 AM, mau...@alice.it wrote:
>> A couple of weeks ago I asked how it is possible to run an R script (not a function) passing some parameters. Someone suggested the function commandArgs(). I read the on-line help and found no clarifying example. Therefore I do not know how to use it appropriately. I noticed this function returns the pathname of the R executable, which is not what I need.
>>
>> I meant to ask if it is possible to open an R session and launch a script passing parameters that the script can retrieve and use itself. Just like in C I can run a program and call it with some arguments:
>>
>>   Example_Prog A B C
>>
>> The program Example_Prog can access its own arguments through the data structures argc and argv. How can I launch an R script simulating the above mechanism? Shall I use source("script-name")? Where are the arguments to be passed, as part of the source call? Is the function commandArgs to be placed as one of the first code lines of the script in order to access its own arguments? Is there any commandArgs usage example?
>
> Gabor gave you a solution from within R. If you want to run a script from the command line, then use commandArgs(TRUE). For example, put this into the file test.R:
>
>   commandArgs(TRUE)
>
> (The TRUE says you only want to see the trailing arguments, not everything else on the command line.) Then from the command line, do
>
>   Rscript test.R A B C
>
> and you'll see the output
>
>   [1] "A" "B" "C"
>
> Duncan Murdoch

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
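A minimal sketch of a script that actually uses its arguments may help here. In the Test.R quoted above, the result of commandArgs(TRUE) is never stored, and A, B and C are not defined anywhere, so the cat() calls would fail even once Rscript is found. The helper name show_args is hypothetical and chosen only for this illustration.

```r
# Test.R: collect the trailing command-line arguments as a character vector
args <- commandArgs(trailingOnly = TRUE)

# Hypothetical helper: label each argument by its position
show_args <- function(args) {
  if (length(args) == 0) {
    return("no arguments supplied")
  }
  paste0("arg", seq_along(args), " = ", args)
}

cat(show_args(args), sep = "\n")
```

The script is run from the operating-system shell, not from the R console, for example as Rscript Test.R aa bb cc, once the R bin directory is on the PATH. Arguments always arrive as character strings, so numeric parameters need as.numeric() inside the script.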
Re: [R] Build R-2.8.1 on AIX5.3
Hi Jin.

> I found there is someone meet same problem when he build R-2.7.0 by searching r-help, and he used a patch to resolve this issue. The patch's location is http://prs.ism.ac.jp/~nakama/AIX/changefiles. But I can not find a patch R-2.8.1 in this path.

GNU patch is necessary (the AIX patch command can't be used).

http://prs.ism.ac.jp/~nakama/AIX/AIX_R-2.8.1.patch

However, this patch has not been tested at all.

--
EI-JI Nakama nakama (a) ki.rim.or.jp
中間栄治
[R] Read.table not reading in all columns
Hello,

I am reading in a file called fit2.txt (Limma). fit2.txt has 38 columns, but when I run dim(fit2) I only get 6 columns. The first column that it does not read in is df.residual.

  fit2 <- read.table(fit2, file="fit2.txt", sep="\t", quote="", comment.char="", as.is=TRUE)

The first few lines of fit2.txt (not including all 38 columns) look like this:

  coefficients.s0vss24 coefficients.s24vss48 coefficients.s48vss96 coefficients.c0vsc24 coefficients.c24vsc48 coefficients.c48vsc96 df.residual sigma stdev.unscaled.s0vss24
  U179971039 0.058663 0.087575 0.074886 0.099245 -0.18102 0.311904 20 0.176096
  empty1 -0.1296 -0.09105 0.238859 -0.25477 0.063964 0.386198 20 0.34345
  empty2 0.136259 0.398073 0.158244 0.175756 -0.10171 0.356534 20 0.425968
  empty3 0.446041 -0.33997 0.345333 0.023821 -0.00783 0.119907 20 0.294745
  empty4 0.097918 0.168314 0.096333 -0.37584 0.268128 -0.10736 20 0.247398
  empty5 -0.07256 0.133791 0.086718 0.078185 -0.19707 -0.4144 20 0.342228
  empty6 0.013663 0.028841 -0.164 0.003989 0.21666 -0.13302 20 0.227787
  empty7 -0.09123 0.006704 0.357164 -0.23903 -0.01792 0.107122 20 0.309987
  empty8 0.164526 0.012946 0.130663 0.526142 -0.68847 0.144673 20 0.276618
  CA054869 0.78055 0.723824 0.491332 0.000452 -0.00143 0.810123 20 0.787196
  CB490276 -0.43612 0.481221 0.19325 -0.83793 0.611478 0.710366 20 0.800111
  CA769480 1.201204 -3.45015 -3.58526 -0.28248 -4.21731 -2.75097 20 1.847347
[R] Meaning of .local and the special token ..1 returned from match.call
I am writing a version of the subset function for a new class. I don't understand the behavior of match.call in this particular case, and I didn't seem to be able to find much help in the language definition or the email archive. Here follows a minimal example:

  setClass("myClass", representation(id = "factor"))
  setMethod("subset", "myClass", function(x, subset, ...) match.call())
  tmp <- new("myClass", id=factor(1:10))
  subset(x=tmp, subset=id > 5)

which gives me

  .local(x = x, subset = ..1)

I want to call a further subset function, subset.data.frame, say, using the unevaluated expression id > 5, but in this setup I don't understand how I should proceed. I can't find any explanation of .local and what the ..1 means, except that ..1 is a special token.

A small modification of the method as

  setMethod("subset", "myClass", function(x, ...) match.call())

where I exclude explicitly mentioning the subset argument gives, however, what I expected:

  subset(x = tmp, subset = id > 5)

This might be OK, and I am able to get everything to work without having an explicit subset argument in the class method, by passing the ..., but I think it would be nice to have the subset argument like in the S3 version of subset.data.frame. It seems that the issue is related to the fact that the generic subset method has the arguments (x, ...). Is there a way to get around this so that my method can have explicit additional arguments like the subset argument?

Thanks for any help,
Niels

--
Niels Richard Hansen
Associate Professor
Department of Mathematical Sciences
University of Copenhagen
Universitetsparken 5
2100 Copenhagen Ø
Denmark
+45 353 20783
[R] Re : SVM regression code
There is the svmpath package by Trevor Hastie.

Justin BEM
BP 1917 Yaoundé
Tel (237) 99597295
(237) 22040246

From: Alex Roy alexroy2...@gmail.com
To: r-help@r-project.org
Sent: Thursday, 19 February 2009, 9:19:18
Subject: [R] SVM regression code

> Dear R user,
>
> I am looking for SVM regression in R. It willl be helpful for me if some one send me SVM regression code.
>
> Thanks
> Alex
[R] Translate R to C code
I am looking for someone who is able to translate a short piece of R code into C. I am not sure where to start searching. It is only a short piece of R code (the wrm.smooth code from the robfilter package). Perhaps someone can give me a hint on where to start.

Thanks in advance.

Sincerely,
Lars
[R] Multiple merge, better solution?
Hello,

My problem is that I would like to merge multiple files with a common column, but merge accepts only two data.frames to merge. In the real situation, I have 26 different data.frames with a common column. I can of course use merge many times (see below), but what would be a more sophisticated solution? A for loop? Any ideas?

  DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5))
  DF2 <- data.frame(var1 = letters[3:7], b = rnorm(5))
  DF3 <- data.frame(var1 = letters[6:10], c = rnorm(5))
  DF4 <- data.frame(var1 = letters[8:12], d = rnorm(5))

  g <- merge(DF1, DF2, by.x="var1", by.y="var1", all=TRUE)
  g <- merge(g, DF3, by.x="var1", by.y="var1", all=TRUE)
  merge(g, DF4, by.x="var1", by.y="var1", all=TRUE)

Thanks in advance.

-Lauri
[R] Matrix package: band matrix
I want to construct a symmetric band matrix in the Matrix package from a matrix where the first column contains data for the main diagonal, the second column has data for the first subdiagonal/superdiagonal and so on.

Since the Matrix will be 10^5 x 10^5 or so, with perhaps 10-20 non-zero elements above the diagonal per row, I can't do it by constructing a full matrix and then using the band() function to subset it to a band matrix.

Any suggestions?

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
tlum...@u.washington.edu
University of Washington, Seattle
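One possible approach, sketched here under the assumption that the installed Matrix version provides bandSparse() (it may postdate this thread), is to pass the diagonals as the columns of a matrix and ask for a symmetric result, so that no dense n x n matrix is ever formed. The small n and the values in d are made up for illustration.

```r
library(Matrix)

n <- 6
# Column 1: main diagonal; column 2: first off-diagonal; column 3: second off-diagonal
d <- cbind(rep(4, n), rep(-1, n), rep(0.5, n))

# k = 0:2 names the diagonals stored in the columns of d; with symmetric = TRUE
# only the upper triangle is specified and the lower one is implied
B <- bandSparse(n, k = 0:2, diagonals = d, symmetric = TRUE)
print(B)
```

Only the first n - k entries of the column for diagonal k are used, so the trailing rows of the off-diagonal columns are ignored.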
Re: [R] Multiple merge, better solution?
Hi:

Below is a TOTAL HACK and I don't recommend it, but it does seem to do what you want. I think that I remember Gabor saying that you can merge multiple data frames using zoo, but I don't know the specifics. I'm sure he'll respond with the correct way. Below uses a global variable to access the data frame inside the loop and keeps adding on to it. Don't use it unless you're really desperate for a solution.

  DF <- DF1
  for ( .df in list(DF2,DF3,DF4) ) {
    DF <- merge(DF, .df, by.x="var1", by.y="var1", all=TRUE)
  }
  print(DF)

On Thu, Feb 19, 2009 at 5:21 AM, Lauri Nikkinen wrote:

> Hello,
>
> My problem is that I would like to merge multiple files with a common column but merge accepts only two data.frames to merge. In the real situation, I have 26 different data.frames with a common column. I can of course use merge many times (see below) but what would be more sophisticated solution? For loop? Any ideas?
>
>   DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5))
>   DF2 <- data.frame(var1 = letters[3:7], b = rnorm(5))
>   DF3 <- data.frame(var1 = letters[6:10], c = rnorm(5))
>   DF4 <- data.frame(var1 = letters[8:12], d = rnorm(5))
>
>   g <- merge(DF1, DF2, by.x="var1", by.y="var1", all=TRUE)
>   g <- merge(g, DF3, by.x="var1", by.y="var1", all=TRUE)
>   merge(g, DF4, by.x="var1", by.y="var1", all=TRUE)
>
> Thanks in advance.
>
> -Lauri
[R] type III effect from glm()
Hi all,

This could be naivety/stupidity on my part rather than a problem with model output, but here goes.

I have fitted a fairly simple model

  m1 <- glm(count~siteall+yrs+yrs:district, family=quasipoisson, weights=weight, data=m[x[[i]],])

I want to know if yrs (a continuous variable) has a significant unique effect in the model, so I fit a simplified model with the main effect omitted...

  m2 <- glm(count~siteall+yrs:district, family=quasipoisson, weights=weight, data=m[x[[i]],])

then compare models using anova()

  anova(m1,m1b,test="F")

  Analysis of Deviance Table

  Model 1: count ~ siteall + yrs + yrs:district
  Model 2: count ~ siteall + yrs:district
    Resid. Df Resid. Dev Df Deviance F Pr(>F)
  1      1936      75913
  2      1936      75913  0        0

The d.f.'s are exactly the same, is this right? Can I only test the significance of a main effect when it is not in an interaction?

Thanks in advance,
Simon.

Dr. Simon Pickett
Research Ecologist
Land Use Department
Terrestrial Unit
British Trust for Ornithology
The Nunnery
Thetford
Norfolk
IP242PU
01842750050
Re: [R] Multiple merge, better solution?
Hi,

I think Reduce could help you.

  DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5))
  DF2 <- data.frame(var1 = letters[3:7], b = rnorm(5))
  DF3 <- data.frame(var1 = letters[6:10], c = rnorm(5))
  DF4 <- data.frame(var1 = letters[8:12], d = rnorm(5))

  g <- merge(DF1, DF2, by.x="var1", by.y="var1", all=TRUE)
  g <- merge(g, DF3, by.x="var1", by.y="var1", all=TRUE)
  g <- merge(g, DF4, by.x="var1", by.y="var1", all=TRUE)

  test <- Reduce(function(x, y) merge(x, y, all=TRUE, by.x="var1", by.y="var1"),
                 list(DF1, DF2, DF3, DF4), accumulate=FALSE)

  all.equal(test, g) # TRUE

As a warning, it's the first time I've ever used it myself...

Hope this helps,

baptiste

On 19 Feb 2009, at 10:21, Lauri Nikkinen wrote:

> Hello,
>
> My problem is that I would like to merge multiple files with a common column but merge accepts only two data.frames to merge. In the real situation, I have 26 different data.frames with a common column. I can of course use merge many times (see below) but what would be more sophisticated solution? For loop? Any ideas?
>
>   DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5))
>   DF2 <- data.frame(var1 = letters[3:7], b = rnorm(5))
>   DF3 <- data.frame(var1 = letters[6:10], c = rnorm(5))
>   DF4 <- data.frame(var1 = letters[8:12], d = rnorm(5))
>
>   g <- merge(DF1, DF2, by.x="var1", by.y="var1", all=TRUE)
>   g <- merge(g, DF3, by.x="var1", by.y="var1", all=TRUE)
>   merge(g, DF4, by.x="var1", by.y="var1", all=TRUE)
>
> Thanks in advance.
>
> -Lauri

_
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag
Re: [R] type III effect from glm()
Sorry, that was a typo in the email, not the model. So I still have the problem.

Cheers,
Simon.

----- Original Message -----
From: Ted Harding ted.hard...@manchester.ac.uk
To: Simon Pickett simon.pick...@bto.org; r-help@r-project.org
Sent: Thursday, February 19, 2009 10:56 AM
Subject: RE: [R] type III effect from glm()

> On 19-Feb-09 10:38:50, Simon Pickett wrote:
>> Hi all,
>>
>> This could be naivety/stupidity on my part rather than a problem with model output, but here goes
>>
>> I have fitted a fairly simple model
>>
>>   m1 <- glm(count~siteall+yrs+yrs:district, family=quasipoisson,
>>             weights=weight, data=m[x[[i]],])
>>
>> I want to know if yrs (a continuous variable) has a significant unique effect in the model, so I fit a simplified model with the main effect ommitted...
>>
>>   m2 <- glm(count~siteall+yrs:district, family=quasipoisson,
>>             weights=weight, data=m[x[[i]],])
>
> So, above, you have fitted two models: m1, m2
>
>> then compare models using anova()
>>
>>   anova(m1,m1b,test="F")
>
> And here you are comparing two models: m1, m1b
>
> Could this be the reason for your result?
>
>>   Analysis of Deviance Table
>>
>>   Model 1: count ~ siteall + yrs + yrs:district
>>   Model 2: count ~ siteall + yrs:district
>>     Resid. Df Resid. Dev Df Deviance F Pr(>F)
>>   1      1936      75913
>>   2      1936      75913  0        0
>>
>> The d.f.'s are exactly the same, is this right? Can I only test the significance of a main effect when it is not in an interaction?
>>
>> Thanks in advance,
>> Simon.
>
> E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
> Fax-to-email: +44 (0)870 094 0861
> Date: 19-Feb-09 Time: 10:56:12
> -- XFMail --
Re: [R] Multiple merge, better solution?
Thanks, both solutions work fine. I tried these solutions on my real data, and I got an error

  Error in match.names(clabs, names(xi)) :
    names do not match previous names

I refined the example data to look more like my real data, and this also produces the same error. Any ideas how to prevent this error?

  DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5), b = rnorm(5), c = rnorm(5))
  DF2 <- data.frame(var1 = letters[3:7], a = rnorm(5), b = rnorm(5), c = rnorm(5))
  DF3 <- data.frame(var1 = letters[6:10], a = rnorm(5), b = rnorm(5), c = rnorm(5))
  DF4 <- data.frame(var1 = letters[8:12], a = rnorm(5), b = rnorm(5), c = rnorm(5))

  g <- merge(DF1, DF2, by.x="var1", by.y="var1", all=TRUE)
  g <- merge(g, DF3, by.x="var1", by.y="var1", all=TRUE)
  merge(g, DF4, by.x="var1", by.y="var1", all=TRUE)
  Error in match.names(clabs, names(xi)) :
    names do not match previous names

  DF <- DF1
  for ( .df in list(DF2,DF3,DF4) ) {
    DF <- merge(DF, .df, by.x="var1", by.y="var1", all=TRUE)
  }
  Error in match.names(clabs, names(xi)) :
    names do not match previous names

  Reduce(function(x, y) merge(x, y, all=TRUE, by.x="var1", by.y="var1"),
         list(DF1, DF2, DF3, DF4), accumulate=FALSE)
  Error in match.names(clabs, names(xi)) :
    names do not match previous names

- Lauri

2009/2/19 baptiste auguie ba...@exeter.ac.uk:

> Hi,
>
> I think Reduce could help you.
>
>   DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5))
>   DF2 <- data.frame(var1 = letters[3:7], b = rnorm(5))
>   DF3 <- data.frame(var1 = letters[6:10], c = rnorm(5))
>   DF4 <- data.frame(var1 = letters[8:12], d = rnorm(5))
>
>   g <- merge(DF1, DF2, by.x="var1", by.y="var1", all=TRUE)
>   g <- merge(g, DF3, by.x="var1", by.y="var1", all=TRUE)
>   g <- merge(g, DF4, by.x="var1", by.y="var1", all=TRUE)
>
>   test <- Reduce(function(x, y) merge(x, y, all=TRUE, by.x="var1", by.y="var1"),
>                  list(DF1, DF2, DF3, DF4), accumulate=FALSE)
>
>   all.equal(test, g) # TRUE
>
> As a warning, it's the first time I've ever used it myself...
>
> Hope this helps,
>
> baptiste
>
> On 19 Feb 2009, at 10:21, Lauri Nikkinen wrote:
>
>> Hello,
>>
>> My problem is that I would like to merge multiple files with a common column but merge accepts only two data.frames to merge. In the real situation, I have 26 different data.frames with a common column. I can of course use merge many times (see below) but what would be more sophisticated solution? For loop? Any ideas?
>>
>>   DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5))
>>   DF2 <- data.frame(var1 = letters[3:7], b = rnorm(5))
>>   DF3 <- data.frame(var1 = letters[6:10], c = rnorm(5))
>>   DF4 <- data.frame(var1 = letters[8:12], d = rnorm(5))
>>
>>   g <- merge(DF1, DF2, by.x="var1", by.y="var1", all=TRUE)
>>   g <- merge(g, DF3, by.x="var1", by.y="var1", all=TRUE)
>>   merge(g, DF4, by.x="var1", by.y="var1", all=TRUE)
>>
>> Thanks in advance.
>>
>> -Lauri
>
> _
> Baptiste Auguié
> School of Physics
> University of Exeter
> Stocker Road, Exeter, Devon, EX4 4QL, UK
> Phone: +44 1392 264187
> http://newton.ex.ac.uk/research/emag
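A likely cause of the error, offered as an assumption rather than a confirmed diagnosis, is that repeated merges of data frames sharing the value columns a, b and c produce the suffixed names a.x, b.x, c.x more than once, so the intermediate result ends up with duplicated column names. One way around it, sketched below, is to make every non-key column name unique before merging. The helper names make_df and tag_columns are hypothetical and exist only for this example.

```r
set.seed(1)
# Hypothetical helper building data frames shaped like those in the question
make_df <- function(keys) {
  data.frame(var1 = keys, a = rnorm(5), b = rnorm(5), c = rnorm(5))
}
dfs <- list(make_df(letters[1:5]), make_df(letters[3:7]),
            make_df(letters[6:10]), make_df(letters[8:12]))

# Append the index of the source data frame to every column except the key
tag_columns <- function(d, i, key = "var1") {
  is_value <- names(d) != key
  names(d)[is_value] <- paste(names(d)[is_value], i, sep = ".")
  d
}
tagged <- Map(tag_columns, dfs, seq_along(dfs))

# With unique names, the Reduce/merge idiom no longer creates clashing suffixes
merged <- Reduce(function(x, y) merge(x, y, by = "var1", all = TRUE), tagged)
print(names(merged))
```

With 26 data frames the same code applies unchanged once they are collected in a list.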
Re: [R] Read.table not reading in all columns
I'd suggest making the data available on the web. Then we can take a closer look. You or some mail tool in between removed the tabs from the message, hence we cannot reproduce in any way.

Best,
Uwe Ligges

Sally wrote:

> Hello,
>
> I am reading in a file called fit2.txt (Limma). fit2.txt has 38 columns but when I dim(fit2) I only get 6 columns. The first column that it does not read in is df.residual.
>
>   fit2 <- read.table(fit2, file="fit2.txt", sep="\t", quote="", comment.char="", as.is=TRUE)
>
> The first few lines of fit2.txt (does not include all 38 columns) looks like this:
>
>   coefficients.s0vss24 coefficients.s24vss48 coefficients.s48vss96 coefficients.c0vsc24 coefficients.c24vsc48 coefficients.c48vsc96 df.residual sigma stdev.unscaled.s0vss24
>   U179971039 0.058663 0.087575 0.074886 0.099245 -0.18102 0.311904 20 0.176096
>   empty1 -0.1296 -0.09105 0.238859 -0.25477 0.063964 0.386198 20 0.34345
>   empty2 0.136259 0.398073 0.158244 0.175756 -0.10171 0.356534 20 0.425968
>   empty3 0.446041 -0.33997 0.345333 0.023821 -0.00783 0.119907 20 0.294745
>   empty4 0.097918 0.168314 0.096333 -0.37584 0.268128 -0.10736 20 0.247398
>   empty5 -0.07256 0.133791 0.086718 0.078185 -0.19707 -0.4144 20 0.342228
>   empty6 0.013663 0.028841 -0.164 0.003989 0.21666 -0.13302 20 0.227787
>   empty7 -0.09123 0.006704 0.357164 -0.23903 -0.01792 0.107122 20 0.309987
>   empty8 0.164526 0.012946 0.130663 0.526142 -0.68847 0.144673 20 0.276618
>   CA054869 0.78055 0.723824 0.491332 0.000452 -0.00143 0.810123 20 0.787196
>   CB490276 -0.43612 0.481221 0.19325 -0.83793 0.611478 0.710366 20 0.800111
>   CA769480 1.201204 -3.45015 -3.58526 -0.28248 -4.21731 -2.75097 20 1.847347
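Without the file itself the cause cannot be pinned down, but a first diagnostic step can be sketched: count.fields() reports how many fields read.table would see on each line, using the same separator and quoting settings. The tab-separated text below is a made-up stand-in for fit2.txt with fewer columns. Note also that in the posted call the first unnamed argument (fit2) is matched positionally to header, because file is given by name; that may not be what was intended.

```r
# Made-up stand-in for the first lines of a tab-separated file
txt <- paste(
  "id\tcoef1\tcoef2\tdf.residual\tsigma",
  "U179971039\t0.058663\t0.087575\t20\t0.176096",
  "empty1\t-0.1296\t-0.09105\t20\t0.34345",
  sep = "\n"
)

# Number of fields per line; unequal counts point to stray separators or quotes
n_fields <- count.fields(textConnection(txt), sep = "\t",
                         quote = "", comment.char = "")
print(n_fields)

# Read with an explicit header and only named arguments after the data source
fit2 <- read.table(text = txt, header = TRUE, sep = "\t",
                   quote = "", comment.char = "", as.is = TRUE)
print(dim(fit2))
```

For a real file, the same two calls would take file = "fit2.txt" in place of the text connection and the text argument.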
Re: [R] type III effect from glm()
On 19-Feb-09 10:38:50, Simon Pickett wrote:

> Hi all,
>
> This could be naivety/stupidity on my part rather than a problem with model output, but here goes
>
> I have fitted a fairly simple model
>
>   m1 <- glm(count~siteall+yrs+yrs:district, family=quasipoisson,
>             weights=weight, data=m[x[[i]],])
>
> I want to know if yrs (a continuous variable) has a significant unique effect in the model, so I fit a simplified model with the main effect ommitted...
>
>   m2 <- glm(count~siteall+yrs:district, family=quasipoisson,
>             weights=weight, data=m[x[[i]],])

So, above, you have fitted two models: m1, m2

> then compare models using anova()
>
>   anova(m1,m1b,test="F")

And here you are comparing two models: m1, m1b

Could this be the reason for your result?

>   Analysis of Deviance Table
>
>   Model 1: count ~ siteall + yrs + yrs:district
>   Model 2: count ~ siteall + yrs:district
>     Resid. Df Resid. Dev Df Deviance F Pr(>F)
>   1      1936      75913
>   2      1936      75913  0        0
>
> The d.f.'s are exactly the same, is this right? Can I only test the significance of a main effect when it is not in an interaction?
>
> Thanks in advance,
> Simon.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 19-Feb-09 Time: 10:56:12
-- XFMail --
Re: [R] type III effect from glm()
Cheers Mark,

I did originally think that too, i.e. that not including the main effect was the problem. However, the same thing happens when I include main effects

  test1 <- glm(count~siteall+yrs*district, family=quasipoisson, weights=weight, data=m[x[[i]],])
  test2 <- glm(count~siteall+district+yrs:district, family=quasipoisson, weights=weight, data=m[x[[i]],])
  anova(test1,test2,test="F")

  Model 1: count ~ siteall + yrs * district
  Model 2: count ~ siteall + district + yrs:district
    Resid. Df Resid. Dev Df Deviance F Pr(>F)
  1      1933      75665
  2      1933      75665  0        0

Simon.

----- Original Message -----
From: markle...@verizon.net
To: Simon Pickett simon.pick...@bto.org
Sent: Thursday, February 19, 2009 10:50 AM
Subject: RE: [R] type III effect from glm()

> Hi Simon: John Fox can say a lot more about below, but I've been reading his book over and over recently and one thing he constantly stresses is marginality, which he defines as always including the lower order term if you include it in a higher order term. So, I think below is problematic because you are including an interaction that includes the main effect but not including the main effect. This definitely causes problems when trying to interpret the anova table or the Anova table. That's as much as I can say. I highly recommend his text for this sort of thing and hopefully he will respond.
>
> Oh, my point is that if you want to check the effect of yrs, then I think you have to take it out of model 2 totally in order to interpret the anova (or the Anova) table.
>
> On Thu, Feb 19, 2009 at 5:38 AM, Simon Pickett wrote:
>
>> Hi all,
>>
>> This could be naivety/stupidity on my part rather than a problem with model output, but here goes
>>
>> I have fitted a fairly simple model
>>
>>   m1 <- glm(count~siteall+yrs+yrs:district, family=quasipoisson, weights=weight, data=m[x[[i]],])
>>
>> I want to know if yrs (a continuous variable) has a significant unique effect in the model, so I fit a simplified model with the main effect ommitted...
>>
>>   m2 <- glm(count~siteall+yrs:district, family=quasipoisson, weights=weight, data=m[x[[i]],])
>>
>> then compare models using anova()
>>
>>   anova(m1,m1b,test="F")
>>
>>   Analysis of Deviance Table
>>
>>   Model 1: count ~ siteall + yrs + yrs:district
>>   Model 2: count ~ siteall + yrs:district
>>     Resid. Df Resid. Dev Df Deviance F Pr(>F)
>>   1      1936      75913
>>   2      1936      75913  0        0
>>
>> The d.f.'s are exactly the same, is this right? Can I only test the significance of a main effect when it is not in an interaction?
>>
>> Thanks in advance,
>> Simon.
>>
>> Dr. Simon Pickett
>> Research Ecologist
>> Land Use Department
>> Terrestrial Unit
>> British Trust for Ornithology
>> The Nunnery
>> Thetford
>> Norfolk
>> IP242PU
>> 01842750050
[R] table with 3 varialbes
I have the following initial data frame:

  data.frame(Subject = rep(100:101, each = 4), Quarter = rep(paste("Q", 1:4, sep = ""), 2), Boolean = rep(c("Y", "N"), 4))

    Subject Quarter Boolean
  1     100      Q1       Y
  2     100      Q2       N
  3     100      Q3       Y
  4     100      Q4       N
  5     101      Q1       Y
  6     101      Q2       N
  7     101      Q3       Y
  8     101      Q4       N
  ...

I would like to arrange the data with one row per Subject and one column per Quarter, using the value of the third variable (Boolean) as the table entries. The final result would look like:

    Subject Q1 Q2 Q3 Q4
  1     100  Y  Y  Y  Y
  2     101  N  N  N  N
  ...

I started with table(Subject, Quarter) but can't find a way to carry the Boolean information into the table.

Thanks in advance for the ideas...

Pascal Candolfi
[R] package knnFinder, kd-trees
Dear R users,

thanks to Samuel for making the package knnFinder available to the public. I was wondering if there is an easy way to build and store only the kd-tree in a first step and then perform NN queries against it from then on? It seems that nn() does both simultaneously.

Thanks!
Markus
[R] Use of ifelse for indicating specific rownumber
Hello,

I have a dataset named b2 with 1521 rows. In that dataset, 64 rows contain specific information. The row numbers of those rows are:

  > i
   [1] 22 53 104 127 151 196 235 238 249 250 263 335 344 353 362 370 389 422 458 459 473 492 502 530 561 624 647 651 666 671
  [31] 715 784 791 807 813 823 830 841 862 865 1036 1051 1062 1068 1092 1109 1171 1187 1283 1293 1325 1335 1342 1360 1379 1414 1419 1425 1447 1452
  [61] 1465 1489 1512 1518

What I want is that every time the row number equals a number in i (which indicates a row number in b2), this is flagged in a vector called b2$totalvac. For example, for row number 22 in b2, b2$totalvac should have the value 1.

So I thought of b2$totalvac <- ifelse(, 1, 0), but I don't know what to put as the condition.

Hope you can help me.
Re: [R] table with 3 varialbes
Hi,

Look, it is simple with reshape:

  x <- data.frame(...)
  reshape(x, idvar = "Subject", direction = "wide", timevar = "Quarter")

Regards,
Patricia

Date: Thu, 19 Feb 2009 11:02:58 +0100
From: pcando...@gmail.com
To: r-help@r-project.org
Subject: [R] table with 3 varialbes

> I have the initial matrice:
>
>   data.frame(Subject = rep(100:101, each = 4), Quarter = rep(paste("Q", 1:4, sep = ""), 2), Boolean = rep(c("Y", "N"), 4))
>
>   Subject Quarter Boolean
> 1     100      Q1       Y
> 2     100      Q2       N
> 3     100      Q3       Y
> 4     100      Q4       N
> 5     101      Q1       Y
> 6     101      Q2       N
> 7     101      Q3       Y
> 8     101      Q4       N
> ...
>
> And I would like to group the Subject by Quarter using as a result in the table the value of the third variable (Boolean). The final result would give:
>
>   Subjet Q1 Q2 Q3 Q4
> 1    100  Y  Y  Y  Y
> 2    101  N  N  N  N
> ...
>
> I started using the table(Subject, Quarter) but can't find a way to correspond the Boolean information in the table
>
> Thanks in advance for the ideas...
>
> Pascal Candolfi
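The reshape() suggestion can be tried directly on the data frame from the original post. What follows is only a minimal sketch: the object names x and wide, and the final step that strips the "Boolean." prefix from the column names, are assumptions added for illustration and are not part of the reply above.

```r
# Sample data as posted in the question
x <- data.frame(
  Subject = rep(100:101, each = 4),
  Quarter = rep(paste("Q", 1:4, sep = ""), 2),
  Boolean = rep(c("Y", "N"), 4)
)

# One row per Subject, one column per Quarter
wide <- reshape(x, idvar = "Subject", timevar = "Quarter", direction = "wide")

# reshape() names the new columns Boolean.Q1, Boolean.Q2, ...;
# drop the prefix to get plain Q1..Q4 (illustrative extra step)
names(wide) <- sub("^Boolean\\.", "", names(wide))
wide
```

With the sample data exactly as posted, each subject ends up with Y, N, Y, N across the four quarters, which is not quite the layout sketched in the question; the mechanics are the same for real data.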
Re: [R] System.time
Wacek Kusnierczyk wrote:
> to contribute my few cents, here's a simple benchmarking routine, inspired by the perl module Benchmark. it allows one to benchmark an arbitrary number of expressions with an arbitrary number of replications, and provides a summary matrix with selected timings.
>
> <snip>
>
> it's rudimentary and not fool-proof, but might be helpful if used with care. (the nested do.call-rbind-lapply sequence can surely be simplified, but i could not resist the pattern. someone once wrote that if you need more than three (five?) levels of indentation in your code, there must be something wrong with it; presumably, he was a fortran programmer.)

i have cleaned up the code, removing the fancy nested structure. the code plus detailed documentation is available from googlecode [1], and i stop the self-marketing here.

vQ

[1] http://code.google.com/p/rbenchmark/
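For readers who want to see the general idea without fetching the package, here is a rough base-R sketch of timing several expressions over a number of replications and collecting the results in a matrix. It is not the rbenchmark code: the function name simple_benchmark and its arguments are hypothetical, and it relies only on system.time().

```r
# Hypothetical helper (not the rbenchmark API): evaluate each expression
# `replications` times and return one row of timings per expression.
simple_benchmark <- function(..., replications = 100) {
  exprs <- as.list(substitute(list(...)))[-1]   # capture expressions unevaluated
  labels <- names(exprs)
  if (is.null(labels)) labels <- rep("", length(exprs))
  unnamed <- labels == ""
  # Unnamed expressions are labelled with their own source text
  labels[unnamed] <- vapply(exprs[unnamed],
                            function(e) paste(deparse(e), collapse = " "),
                            character(1))
  env <- parent.frame()                         # evaluate in the caller's environment
  one <- function(e) {
    tm <- system.time(for (k in seq_len(replications)) eval(e, env))
    c(user = tm[["user.self"]], system = tm[["sys.self"]], elapsed = tm[["elapsed"]])
  }
  timings <- t(vapply(exprs, one, numeric(3)))
  rownames(timings) <- labels
  timings
}

# Usage: compare two ways of summing a vector
v <- runif(1e4)
res <- simple_benchmark(builtin = sum(v), Reduce(`+`, v), replications = 20)
res
```

Timings from a sketch like this are noisy for very fast expressions, so the replication count matters more than the exact numbers.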
Re: [R] Multiple merge, better solution?
Another option using Recall,

  merge.rec <- function(.list, ...){
      # a single data frame left: nothing more to merge
      if(length(.list)==1) return(.list[[1]])
      # merge the first two, then recurse on the shortened list
      Recall(c(list(merge(.list[[1]], .list[[2]], ...)), .list[-(1:2)]), ...)
  }

  my.list <- list(DF1, DF2, DF3, DF4)
  test2 <- merge.rec(my.list, by.x="var1", by.y="var1", all=T)
  all.equal(test2, g)

Note that your second example does not work because in the last step there are no common names between g and DF4 (I think). Using suffixes=c("", "") seems to do the trick but I'm not sure it's giving the result you want/expect.

Hope this helps,

baptiste

On 19 Feb 2009, at 10:21, Lauri Nikkinen wrote:
> Hello,
>
> My problem is that I would like to merge multiple files with a common column but merge accepts only two data.frames to merge. In the real situation, I have 26 different data.frames with a common column. I can of course use merge many times (see below) but what would be more sophisticated solution? For loop? Any ideas?
>
>   DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5))
>   DF2 <- data.frame(var1 = letters[3:7], b = rnorm(5))
>   DF3 <- data.frame(var1 = letters[6:10], c = rnorm(5))
>   DF4 <- data.frame(var1 = letters[8:12], d = rnorm(5))
>
>   g <- merge(DF1, DF2, by.x="var1", by.y="var1", all=T)
>   g <- merge(g, DF3, by.x="var1", by.y="var1", all=T)
>   merge(g, DF4, by.x="var1", by.y="var1", all=T)
>
> Thanks in advance.
>
> -Lauri

_
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag
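To make the recursive approach easy to try, the snippet below puts the sample data, the step-by-step merges and the Recall-based function into one script. It is a sketch under a few assumptions: the function is renamed merge_rec here (an underscore avoids any confusion with an S3 method for merge), set.seed() is added for reproducibility, and by = "var1" is used as shorthand for the by.x/by.y pair.

```r
set.seed(42)
DF1 <- data.frame(var1 = letters[1:5],  a = rnorm(5))
DF2 <- data.frame(var1 = letters[3:7],  b = rnorm(5))
DF3 <- data.frame(var1 = letters[6:10], c = rnorm(5))
DF4 <- data.frame(var1 = letters[8:12], d = rnorm(5))

# Reference result: merge one data frame at a time
g <- merge(DF1, DF2, by = "var1", all = TRUE)
g <- merge(g,   DF3, by = "var1", all = TRUE)
g <- merge(g,   DF4, by = "var1", all = TRUE)

# Recursive version: merge the first two elements, put the result at the
# head of the remaining list, and call the function again via Recall()
merge_rec <- function(.list, ...) {
  if (length(.list) == 1) return(.list[[1]])
  Recall(c(list(merge(.list[[1]], .list[[2]], ...)), .list[-(1:2)]), ...)
}

test2 <- merge_rec(list(DF1, DF2, DF3, DF4), by = "var1", all = TRUE)
isTRUE(all.equal(test2, g))
```

The recursion performs exactly the same sequence of pairwise merges as the manual version, so the two results should agree.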
Re: [R] Multiple merge, better solution?
If you don't mind I've added this example to the R wiki,

http://wiki.r-project.org/rwiki/doku.php?id=tips:data-frames:merge

It would be very nice if a R guru could check that the information I put is not complete fantasy. Feel free to remove as appropriate.

Best wishes,

baptiste

On 19 Feb 2009, at 11:00, Lauri Nikkinen wrote:
> Thanks, both solutions work fine. I tried these solutions to my real data, and I got an error
>
>   Error in match.names(clabs, names(xi)) : names do not match previous names
>
> I refined this example data to look more like my real data, this also produces the same error. Any ideas how to prevent this error?
>
>   DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5), b = rnorm(5), c = rnorm(5))
>   DF2 <- data.frame(var1 = letters[3:7], a = rnorm(5), b = rnorm(5), c = rnorm(5))
>   DF3 <- data.frame(var1 = letters[6:10], a = rnorm(5), b = rnorm(5), c = rnorm(5))
>   DF4 <- data.frame(var1 = letters[8:12], a = rnorm(5), b = rnorm(5), c = rnorm(5))
>
>   g <- merge(DF1, DF2, by.x="var1", by.y="var1", all=T)
>   g <- merge(g, DF3, by.x="var1", by.y="var1", all=T)
>   merge(g, DF4, by.x="var1", by.y="var1", all=T)
>   Error in match.names(clabs, names(xi)) : names do not match previous names
>
>   DF <- DF1
>   for ( .df in list(DF2,DF3,DF4) ) {
>     DF <- merge(DF, .df, by.x="var1", by.y="var1", all=T)
>   }
>   Error in match.names(clabs, names(xi)) : names do not match previous names
>
>   Reduce(function(x, y) merge(x, y, all=T, by.x="var1", by.y="var1"), list(DF1, DF2, DF3, DF4), accumulate=F)
>   Error in match.names(clabs, names(xi)) : names do not match previous names
>
> - Lauri
>
> 2009/2/19 baptiste auguie ba...@exeter.ac.uk:
> > Hi,
> >
> > I think Reduce could help you.
> >
> >   DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5))
> >   DF2 <- data.frame(var1 = letters[3:7], b = rnorm(5))
> >   DF3 <- data.frame(var1 = letters[6:10], c = rnorm(5))
> >   DF4 <- data.frame(var1 = letters[8:12], d = rnorm(5))
> >
> >   g <- merge(DF1, DF2, by.x="var1", by.y="var1", all=T)
> >   g <- merge(g, DF3, by.x="var1", by.y="var1", all=T)
> >   g <- merge(g, DF4, by.x="var1", by.y="var1", all=T)
> >
> >   test <- Reduce(function(x, y) merge(x, y, all=T, by.x="var1", by.y="var1"), list(DF1, DF2, DF3, DF4), accumulate=F)
> >
> >   all.equal(test, g) # TRUE
> >
> > As a warning, it's the first time I've ever used it myself...
> >
> > Hope this helps,
> >
> > baptiste

_
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag
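The Reduce() idea carries over naturally to a longer list of data frames, which is the situation described in the original question (26 of them). The sketch below builds a small list programmatically and folds merge() over it; the names starts, frames and merged are illustrative assumptions, and the data are random stand-ins.

```r
set.seed(1)

# Build a list of data frames that all share the key column var1 but
# each carry one differently named value column (a, b, c, d)
starts <- c(1, 3, 6, 8)
frames <- lapply(seq_along(starts), function(k) {
  df <- data.frame(var1 = letters[starts[k] + 0:4], value = rnorm(5))
  names(df)[2] <- letters[k]
  df
})

# Fold merge() over the list: ((frames[[1]] + frames[[2]]) + frames[[3]]) + ...
merged <- Reduce(function(x, y) merge(x, y, by = "var1", all = TRUE), frames)
merged
```

Because the list can be of any length, the same one-liner would apply unchanged to 26 data frames, provided the non-key column names do not clash.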
Re: [R] type III effect from glm()
Hi Simon:

In the message below, test1 spelled out is

  count ~ siteall + yrs + district + yrs:district

so this is fine. But in test2, you have yrs interacting with district but not the main effect for yrs. This is against the rules of marginality, so I still think there's a problem. I would wait for John or the other wizaRds to respond (you know who you are) because I don't feel particularly confident giving advice on this; I bang my head against it often too. Plus, I gotta go home because it's getting light out soon (I'm in the US on the east coast). Good luck.

On Thu, Feb 19, 2009 at 6:10 AM, Simon Pickett wrote:
> Cheers Mark,
>
> I did originally think too, i.e. that not including the main effect was the problem. However, the same thing happens when I include main effects
>
>   test1 <- glm(count ~ siteall + yrs * district, family = quasipoisson, weights = weight, data = m[x[[i]], ])
>   test2 <- glm(count ~ siteall + district + yrs:district, family = quasipoisson, weights = weight, data = m[x[[i]], ])
>   anova(test1, test2, test = "F")
>
>   Model 1: count ~ siteall + yrs * district
>   Model 2: count ~ siteall + district + yrs:district
>     Resid. Df Resid. Dev Df Deviance F Pr(>F)
>   1      1933      75665
>   2      1933      75665  0        0
>
> Simon.
Re: [R] table with 3 varialbes
Hi,

I think you should take a look at ?reshape.

Regards,
Patricia
Re: [R] Use of ifelse for indicating specific rownumber
I think you are looking for something like:

  ifelse(1:nrow(b2) %in% i, 1, 0)

Patrick Burns
patr...@burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of "The R Inferno" and "A Guide for the Unwilling S User")

joe1985 wrote:
> Hello
>
> I have a dataset named b2 with 1521 rows, in that dataset i have 64 rows containing specific information. the rownumbers with specific info are:
>
>   > i
>    [1] 22 53 104 127 151 196 235 238 249 250 263 335 344 353 362 370 389 422 458 459 473 492 502 530 561 624 647 651 666 671
>   [31] 715 784 791 807 813 823 830 841 862 865 1036 1051 1062 1068 1092 1109 1171 1187 1283 1293 1325 1335 1342 1360 1379 1414 1419 1425 1447 1452
>   [61] 1465 1489 1512 1518
>
> So what i want is that everytime the rownumber equals a number in i (which obviously indicate a rownumber in b2), i want it indicated in a vector called b2$totalvac. Fx. in rownumber 22 in b2 the b2$totalvac vector should have the value 1.
>
> So thought of b2$totalvac <- ifelse(, 1, 0), but i don't know what to put as the if-sentence.
>
> Hope you can help me
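Here is the %in% suggestion as a small runnable sketch. The data frame b2 below is only a stand-in with a single made-up column id, and i is shortened to the first few row numbers from the question; both are assumptions for illustration.

```r
# Stand-in for the real dataset: 1521 rows, contents irrelevant here
b2 <- data.frame(id = seq_len(1521))

# A few of the row numbers of interest (shortened for the example)
i <- c(22, 53, 104, 127)

# 1 where the row number appears in i, 0 everywhere else
b2$totalvac <- ifelse(seq_len(nrow(b2)) %in% i, 1, 0)

# Rows 21 to 23: only row 22 is flagged
b2[21:23, ]
```

Since `%in%` already returns a logical vector, `as.integer(seq_len(nrow(b2)) %in% i)` would produce the same 0/1 coding without ifelse().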
Re: [R] Multiple merge, better solution?
That's perfectly fine. I figured out how to do this with my second example:

  DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5), b = rnorm(5), c = rnorm(5))
  DF2 <- data.frame(var1 = letters[3:7], a = rnorm(5), b = rnorm(5), c = rnorm(5))
  DF3 <- data.frame(var1 = letters[6:10], a = rnorm(5), b = rnorm(5), c = rnorm(5))
  DF4 <- data.frame(var1 = letters[8:12], a = rnorm(5), b = rnorm(5), c = rnorm(5))

  DF <- DF1
  for ( .df in list(DF2,DF3,DF4) ) {
    DF <- merge(DF, .df, by.x="var1", by.y="var1", all=T)
    # make the column names unique before the next merge
    names(DF)[-1] <- paste(names(DF)[-1], 2:length(names(DF)))
  }
  # strip the temporary numeric tags again
  names(DF) <- sub("[[:space:]].+$", "", names(DF), perl=T)
  DF

Thank you all!

-Lauri
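An alternative way around the name clash, shown here only as a sketch, is to give every non-key column a suffix identifying the data frame it came from before merging, so no two inputs ever share a column name. The names frames, renamed and merged are hypothetical, and the ".k" suffix scheme is an assumption rather than something proposed in the thread.

```r
set.seed(7)
DF1 <- data.frame(var1 = letters[1:5],  a = rnorm(5), b = rnorm(5), c = rnorm(5))
DF2 <- data.frame(var1 = letters[3:7],  a = rnorm(5), b = rnorm(5), c = rnorm(5))
DF3 <- data.frame(var1 = letters[6:10], a = rnorm(5), b = rnorm(5), c = rnorm(5))
DF4 <- data.frame(var1 = letters[8:12], a = rnorm(5), b = rnorm(5), c = rnorm(5))
frames <- list(DF1, DF2, DF3, DF4)

# Tag each value column with the index of its source data frame:
# a, b, c in the second frame become a.2, b.2, c.2, and so on
renamed <- lapply(seq_along(frames), function(k) {
  df <- frames[[k]]
  is_value <- names(df) != "var1"
  names(df)[is_value] <- paste0(names(df)[is_value], ".", k)
  df
})

# With unique names, the pairwise merges no longer collide
merged <- Reduce(function(x, y) merge(x, y, by = "var1", all = TRUE), renamed)
names(merged)
```

Unlike the approach that strips the tags afterwards, this keeps the source of each column visible in the final result, which may or may not be what is wanted.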
Re: [R] Unadulterated plot
Good point! Provide your own set of x, y, z co-ords; mine are pretty big but you can use any.

  library(akima)
  fr3d = data.frame(x, y, z)
  xtrp <- interp(fr3d$x, fr3d$y, fr3d$z, linear=FALSE, extrap=TRUE, duplicate="strip")
  op <- par(ann=FALSE, mai=c(0,0,0,0))
  filled.contour(xtrp$x, xtrp$y, xtrp$z, asp = 0.88402366864, col = rev(rainbow(28, start=0, end=8/12)), n = 40)
  par(op)

I tried all these settings too (none of them made a difference)...

  usr=c(0,845,0,747), mfcol=c(1,1), mfrow=c(1,1), oma=c(0,0,0,0), omi=c(0,0,0,0), plt=c(1,1,1,1)

Regards,
James

Peter Alspach wrote:
> Kia ora James
>
> I think it would be easier to provide you with help if you provide commented, minimal, self-contained, reproducible code [see bottom of this, or any, email to R-help].
>
> Hei kona ra ...
>
> Peter Alspach
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of James Nicolson
> > Sent: Thursday, 19 February 2009 11:22 a.m.
> > To: r-help@r-project.org
> > Subject: Re: [R] Unadulterated plot
> >
> > Hi,
> >
> > Thanks for your help. I have looked at the beginners documentation and while there are options to configure various aspects of the plot, none of them seem to have the desired effect. I have managed to ensure that the plot fills the space vertically with no margins, no axes etc (using mai=c(0,0,0,0)). However, horizontally there remains a margin to the right that pads the space between the filled.contour and its legend. I've tried options to par and filled.contour but I can't seem to remove the legend.
> >
> > Kind Regards,
> > James
> >
> > Simon Pickett wrote:
> > > Hi James,
> > >
> > > What you really need to do is to check out the many freely available pdfs for R beginners. Here is a good place to start
> > >
> > > http://cran.r-project.org/other-docs.html
> > >
> > > If I am right interpreting what you want, I think you need to create a blank plot with no axes, axis labels etc. Try
> > >
> > >   plot(x, y, xlab="", ylab="", xaxt="n", yaxt="n", type="n") # blank plot
> > >   points(x, y)
> > >
> > > Type ?par into R and see how you can set parameters like this up as the default.
> > >
> > > Hope this helps?
> > >
> > > Simon.
> > >
> > > ----- Original Message -----
> > > From: James Nicolson jlnicol...@gmail.com
> > > To: r-help@r-project.org
> > > Sent: Sunday, February 15, 2009 10:29 PM
> > > Subject: [R] Unadulterated plot
> > >
> > > > To all,
> > > >
> > > > Apologies if this question has already been asked but I can't find anything. I can't seem to think of more specific search terms. I want to display/create a file of a pure plot with a specific height and width. I want to utilise every single pixel inside the axes. I do not want to display any margins, legends, axes, titles or spaces around the edges. Is this possible? Additionally, the plot I am working with is a filled.contour plot and I cannot remove the legend. How can I do this?
> > > >
> > > > Kind Regards,
> > > > James
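One possible way to get a plot that fills every pixel, offered here only as a sketch and not as something suggested in the thread, is to swap filled.contour() for image(), which draws no colour key. The example uses the built-in volcano matrix as stand-in data, reuses the 845 x 747 size from the usr setting quoted above, and writes to a temporary PNG file; the file name is an assumption.

```r
# Stand-in data: the built-in 'volcano' elevation matrix
z <- volcano
x <- seq_len(nrow(z))
y <- seq_len(ncol(z))

out <- file.path(tempdir(), "bare_plot.png")

# Device size in pixels; change to the size you need
png(out, width = 845, height = 747)

# Zero margins on all four sides so the plot region is the whole device
op <- par(mar = c(0, 0, 0, 0))

# image() has no legend; axes = FALSE also suppresses the surrounding box,
# and its default "i" axis style leaves no padding around the data
image(x, y, z, col = rev(rainbow(28, start = 0, end = 8/12)),
      axes = FALSE, xlab = "", ylab = "")

par(op)
dev.off()
```

The trade-off is that image() colours grid cells rather than drawing smooth contour bands, so a fine interpolation grid may be needed to get a similar look; contour(x, y, z, add = TRUE) could be called before dev.off() to overlay contour lines.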
Re: [R] Multiple merge, better solution?
Yes, even better:

  DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5), b = rnorm(5), c = rnorm(5))
  DF2 <- data.frame(var1 = letters[3:7], a = rnorm(5), b = rnorm(5), c = rnorm(5))
  DF3 <- data.frame(var1 = letters[6:10], a = rnorm(5), b = rnorm(5), c = rnorm(5))
  DF4 <- data.frame(var1 = letters[8:12], a = rnorm(5), b = rnorm(5), c = rnorm(5))

  DF <- DF1
  for ( .df in list(DF2,DF3,DF4) ) {
    DF <- merge(DF, .df, by.x="var1", by.y="var1", all=T, suffixes=c("", ""))
  }
  DF

-Lauri
Re: [R] type III effect from glm()
Hi Simon,

> I want to know if yrs (a continuous variable) has a significant unique effect in the model, so I fit a simplified model with the main effect omitted...

[A different approach...] This is not really a sensible question until you have established that there is no significant interaction between yrs and district. If this interaction is significant then, ipso facto, the effect of yrs is not unique but depends on district. So establish that first.

There is a good section on marginality in MASS (Venables & Ripley) and, as Mark has mentioned, in Prof Fox's texts. From what I can remember, some of these tests are reparametrized behind the scenes to enforce the marginality constraint.

Regards, Mark.

Simon Pickett-4 wrote:

Hi all, This could be naivety/stupidity on my part rather than a problem with model output, but here goes. I have fitted a fairly simple model

    m1 <- glm(count~siteall+yrs+yrs:district, family=quasipoisson, weights=weight, data=m[x[[i]],])

I want to know if yrs (a continuous variable) has a significant unique effect in the model, so I fit a simplified model with the main effect omitted...

    m2 <- glm(count~siteall+yrs:district, family=quasipoisson, weights=weight, data=m[x[[i]],])

then compare models using anova()

    > anova(m1, m1b, test="F")
    Analysis of Deviance Table
    Model 1: count ~ siteall + yrs + yrs:district
    Model 2: count ~ siteall + yrs:district
      Resid. Df Resid. Dev Df Deviance F Pr(>F)
    1      1936      75913
    2      1936      75913  0        0

The d.f.'s are exactly the same, is this right? Can I only test the significance of a main effect when it is not in an interaction?

Thanks in advance, Simon.

Dr. Simon Pickett, Research Ecologist, Land Use Department, Terrestrial Unit, British Trust for Ornithology, The Nunnery, Thetford, Norfolk IP242PU, 01842750050

View this message in context: http://www.nabble.com/type-III-effect-from-glm%28%29-tp22097773p22099331.html
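The identical residual degrees of freedom in Simon's table can be reproduced on made-up data. The sketch below is an illustration under stated assumptions, not his analysis: the data frame `dat`, its three districts and the simulated counts are invented, and the `siteall` term and the weights are left out. It shows that dropping the `yrs` main effect while keeping `yrs:district` only re-parameterises the model (one slope per district instead of a baseline slope plus differences), so nothing is actually removed.

```r
set.seed(42)
n <- 120
# Invented data: three districts, ten survey years cycling within each district
dat <- data.frame(
  district = factor(rep(c("north", "south", "west"), each = n / 3)),
  yrs      = rep(1:10, times = n / 10)
)
dat$count <- rpois(n, lambda = exp(1 + 0.05 * dat$yrs))

m1 <- glm(count ~ yrs + yrs:district, family = quasipoisson, data = dat)
m2 <- glm(count ~ yrs:district,       family = quasipoisson, data = dat)

colnames(model.matrix(m1))  # intercept, yrs, and yrs:district contrasts
colnames(model.matrix(m2))  # intercept and one yrs slope per district

# Same number of parameters and the same fit, hence 0 df in anova()
c(df.residual(m1), df.residual(m2))
all.equal(deviance(m1), deviance(m2), tolerance = 1e-6)
```

A test that does remove something is the one on the interaction itself, that is, comparing `count ~ yrs + yrs:district` against `count ~ yrs`.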
Re: [R] table with 3 variables
Maybe reshape will help you, but I doubt that your posted desired result fits your given data, e.g. shouldn't subject 101 Q3 give Y?

    xx <- data.frame(Subject=rep(100:101, each=4), Quarter=rep(paste("Q",1:4,sep=""),2), Boolean = rep(c("Y","N"),4))
    reshape(xx, timevar="Quarter", idvar="Subject", direction="wide", v.names="Boolean")

hth.

Pascal Candolfi wrote:

I have the initial matrix:

    data.frame(Subject=rep(100:101, each=4), Quarter=rep(paste("Q",1:4, sep=""),2), Boolean = rep(c("Y","N"),4))

      Subject Quarter Boolean
    1     100      Q1       Y
    2     100      Q2       N
    3     100      Q3       Y
    4     100      Q4       N
    5     101      Q1       Y
    6     101      Q2       N
    7     101      Q3       Y
    8     101      Q4       N
    ...

And I would like to group the Subject by Quarter, using as the result in the table the value of the third variable (Boolean). The final result would give:

      Subject Q1 Q2 Q3 Q4
    1     100  Y  Y  Y  Y
    2     101  N  N  N  N
    ...

I started using table(Subject, Quarter) but can't find a way to bring the Boolean information into the table.

Thanks in advance for the ideas... Pascal Candolfi

-- Eik Vettorazzi, Institut für Medizinische Biometrie und Epidemiologie, Universitätsklinikum Hamburg-Eppendorf, Martinistr. 52, 20246 Hamburg, T ++49/40/42803-8243, F ++49/40/42803-7790
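To make the doubt about the desired result concrete, here is a small sketch of what the reshape() call returns for the posted data. The column renaming with sub() and the row-name reset are cosmetic additions that are not part of the reply, and `stringsAsFactors = FALSE` is set only so the cells print as plain strings.

```r
xx <- data.frame(
  Subject = rep(100:101, each = 4),
  Quarter = rep(paste("Q", 1:4, sep = ""), 2),
  Boolean = rep(c("Y", "N"), 4),
  stringsAsFactors = FALSE
)

# Long to wide: one row per Subject, one column per Quarter
wide <- reshape(xx, timevar = "Quarter", idvar = "Subject",
                direction = "wide", v.names = "Boolean")

# reshape() names the new columns Boolean.Q1, Boolean.Q2, ...; strip the prefix
names(wide) <- sub("^Boolean\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

Both subjects come out as Y, N, Y, N across Q1 to Q4, which is what the input data imply, rather than the all-Y and all-N rows in the question.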
Re: [R] Use of ifelse for indicating specific rownumber
Why use ifelse? Shouldn't

    b2$totalvac <- rep(0, 1521)
    b2$totalvac[i] <- 1

do the trick?

joe1985 wrote:

Hello, I have a dataset named b2 with 1521 rows; in that dataset I have 64 rows containing specific information. The row numbers with specific info are:

    > i
     [1]   22   53  104  127  151  196  235  238  249  250  263  335  344  353  362  370  389  422  458  459  473  492  502  530  561  624  647  651  666  671
    [31]  715  784  791  807  813  823  830  841  862  865 1036 1051 1062 1068 1092 1109 1171 1187 1283 1293 1325 1335 1342 1360 1379 1414 1419 1425 1447 1452
    [61] 1465 1489 1512 1518

So what I want is that every time the row number equals a number in i (which obviously indicates a row number in b2), it is indicated in a vector called b2$totalvac. For example, in row number 22 in b2 the b2$totalvac vector should have the value 1. So I thought of b2$totalvac <- ifelse(, 1, 0), but I don't know what to put as the condition. Hope you can help me.

-- Eik Vettorazzi, Institut für Medizinische Biometrie und Epidemiologie, Universitätsklinikum Hamburg-Eppendorf, Martinistr. 52, 20246 Hamburg, T ++49/40/42803-8243, F ++49/40/42803-7790
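For completeness, the ifelse() form that the question was reaching for can also be written down. This is a small sketch on stand-in data: the 20-row `b2` and the four row numbers in `i` are made up in place of the 1521 rows and 64 indices from the post. The missing condition is "is this row number one of those in i", which `%in%` expresses directly.

```r
n_rows <- 20                       # stand-in for the 1521 rows of b2
i <- c(2, 5, 11, 17)               # made-up row numbers, standing in for the 64 in the post
b2 <- data.frame(id = seq_len(n_rows))

# Direct assignment, as suggested in the reply
b2$totalvac <- rep(0, n_rows)
b2$totalvac[i] <- 1

# ifelse() equivalent: test each row number for membership in i
via_ifelse <- ifelse(seq_len(nrow(b2)) %in% i, 1, 0)

identical(b2$totalvac, via_ifelse)
```

Both routes give the same vector; the direct assignment is shorter and avoids building the logical condition at all.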
[R] dotplot points color
Dear list, is it possible to change the background color of dotplot's points? I tried in many ways but unsuccessfully.

Thanks in advance, Gianandrea

    require(lattice)
    dotplot(variety ~ yield | site, data = barley, groups = year, pch=21)
    dotplot(variety ~ yield | site, data = barley, groups = year, pch=21, bg=c(2,3)) # ??!!!

View this message in context: http://www.nabble.com/dotplot-points-color-tp22099530p22099530.html
[R] problem with comparing a part of string with whole string
Hi all, I have a problem with comparing strings. Suppose a string is like "RIGHT, EPICARDIUM: FOCUS, GRAY-WHITE, SINGLE, APPROX 0.6 CM IN DIAMETER." and I have to compare "GRAY-WHITE" with that string, or I have to compare "TUMOR BENIGN" with the string "MEDULLRY TUMOR BENIGN,TYP PHEOCHROMOCYTOMA". I tried with split and compare but it's not working. Can anyone suggest how I can compare these types of strings?

Thanks in advance
Re: [R] Insert value in a Vector Alternately
How about this:

    > dat <- c(0.00377467,0.00377467,0.00377467,0.00380083,0.00380083,0.00380083,0.00380959,
    +          0.00380959,0.00380959,0.00380083,0.00380083,0.00380083)
    > dat[seq(1, by=3, to=length(dat))] <- 0
    > dat
     [1] 0.00000000 0.00377467 0.00377467 0.00000000 0.00380083 0.00380083 0.00000000 0.00380959 0.00380959 0.00000000 0.00380083
    [12] 0.00380083

On Thu, Feb 19, 2009 at 1:47 AM, Gundala Viswanath gunda...@gmail.com wrote:

Hi, I have a vector that looks like this:

    > dat
            V1         V2         V3         V4         V5         V6
    0.00377467 0.00377467 0.00377467 0.00380083 0.00380083 0.00380083
            V7         V8         V9        V10        V11        V12
    0.00380959 0.00380959 0.00380959 0.00380083 0.00380083 0.00380083

What I want to do is to insert 0 (zero) for every 3 positions, yielding:

    V1         V2         V3         V4 V5         V6         V7         V8
     0 0.00377467 0.00377467 0.00377467  0 0.00380083 0.00380083 0.00380083
    V9        V10        V11        V12 V13       V14        V15        V16
     0 0.00380959 0.00380959 0.00380959   0 0.00380083 0.00380083 0.00380083

Is there a quick way to do it in R?

- Gundala Viswanath, Jakarta - Indonesia

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390. What is the problem that you are trying to solve?
Re: [R] problem with comparing a part of string with whole string
'grep' will tell you if there is a match in the string:

    > x <- c("*RIGHT, EPICARDIUM: FOCUS, GRAY-WHITE, SINGLE, APPROX 0.6 CM IN DIAMETER*.", "*MEDULLRY TUMOR BENIGN,TYP PHEOCHROMOCYTOMA*")
    > grep("GRAY-WHITE", x)
    [1] 1
    > grep("TUMOR BENIGN", x)
    [1] 2

On Thu, Feb 19, 2009 at 7:39 AM, venkata kirankumar kiran4u2...@gmail.com wrote:

Hi all, I got one problem with comparing strings like if any string is like *RIGHT, EPICARDIUM: FOCUS, GRAY-WHITE, SINGLE, APPROX 0.6 CM IN DIAMETER*. and i have to compare *GRAY-WHITE* with the above string or otherwise i have to compare *TUMOR BENIGN* this string with *MEDULLRY TUMOR BENIGN,TYP PHEOCHROMOCYTOMA* i tried with split and compare but its not working can any one suggest how can i compare these type of Strings thanks in advance

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390. What is the problem that you are trying to solve?
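A small follow-up sketch on the same two strings, assuming a TRUE/FALSE answer per string is more convenient than the index that grep() returns. grepl() gives that, and `fixed = TRUE` makes the search term a literal substring rather than a regular expression, which matters once the term contains characters such as "." that have a special meaning in a pattern. The third test string ("APPROX 056 CM") is invented for the demonstration.

```r
x <- c("*RIGHT, EPICARDIUM: FOCUS, GRAY-WHITE, SINGLE, APPROX 0.6 CM IN DIAMETER*.",
       "*MEDULLRY TUMOR BENIGN,TYP PHEOCHROMOCYTOMA*")

# One logical value per element of x
has_gray  <- grepl("GRAY-WHITE", x, fixed = TRUE)
has_tumor <- grepl("TUMOR BENIGN", x, fixed = TRUE)
has_gray
has_tumor

# As a regular expression, "." matches any character, so this is a false hit
regex_hit   <- grepl("0.6 CM", "APPROX 056 CM")
# As a fixed string, only a literal "0.6 CM" matches
literal_hit <- grepl("0.6 CM", "APPROX 056 CM", fixed = TRUE)
c(regex_hit, literal_hit)
```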
[R] Odp: problem with comparing a part of string with whole string
Hi

r-help-boun...@r-project.org wrote on 19.02.2009 13:39:42:

> Hi all, I got one problem with comparing strings like if any string is like *RIGHT, EPICARDIUM: FOCUS, GRAY-WHITE, SINGLE, APPROX 0.6 CM IN DIAMETER*. and i have to compare *GRAY-WHITE* with the above string or otherwise i have to compare *TUMOR BENIGN* this string with *MEDULLRY TUMOR BENIGN,TYP PHEOCHROMOCYTOMA* i tried with split and compare but its not working

Isn't this a job for regular expressions?

    > vec = "*RIGHT, EPICARDIUM: FOCUS, GRAY-WHITE, SINGLE, APPROX 0.6 CM IN DIAMETER*."
    > test = "GRAY-WHITE"
    > regexpr(test, vec)
    [1] 28
    attr(,"match.length")
    [1] 10

Regards, Petr

> can any one suggest how can i compare these type of Strings thanks in advance
Re: [R] type III effect from glm()
Hi Simon,

[On my response] > ...not really a sensible question until...

On reading through this: what I mean is that yours seems not to be a sensible approach; the question itself may be reasonable. What you want to be doing is testing whether the interaction term (yrs:district) gets dropped. Do it by comparing nested models (basically as you have done), or use dropterm() or stepAIC() [both are in MASS].

Regards, Mark.

View this message in context: http://www.nabble.com/type-III-effect-from-glm%28%29-tp22097773p22099812.html
Re: [R] Insert value in a Vector Alternately
Perhaps you can try this,

    d <- c(0.00377467, 0.00377467, 0.00377467, 0.00380083, 0.00380083, 0.00380083, 0.00380959, 0.00380959, 0.00380959, 0.00380083, 0.00380083, 0.00380083)
    c( t( cbind(matrix(d, ncol=3, byrow=T), 0)))

I don't know how to avoid the transpose operation that might slow things down in large cases.

Hope this helps, baptiste

Baptiste Auguié, School of Physics, University of Exeter, Stocker Road, Exeter, Devon, EX4 4QL, UK. Phone: +44 1392 264187 http://newton.ex.ac.uk/research/emag
[R] Bug in predict function for naiveBayes?
Dear all, I tried a simple naive Bayes classification on an artificial dataset, but I have trouble getting the predict function to work with the type="class" specification. With type="raw" it works perfectly, but with type="class" I get the following error:

    Error in as.vector(x, mode) : invalid 'mode' argument

Data: mixture.train is a training set with 100 points originating from 2 multivariate Gaussian distributions (class 0 and class 1), with X1 and X2 as coordinates in a 2-dimensional space. mixture.test is a grid going from -15 to +15 in both dimensions. Stupid data, but it's just to test.

Code:

    library(MASS)   # mvrnorm()
    library(e1071)  # naiveBayes()
    Sigma <- matrix(c(10,3,3,2),2,2)
    mixture.train <- cbind(mvrnorm(n=50, c(0, 2), Sigma), rep(0,50))
    mixture.train <- as.data.frame(rbind(mixture.train, cbind(mvrnorm(n=50, c(2, 0), Sigma), rep(1,50))))
    names(mixture.train) <- c("X1","X2","Class")
    X1 <- rep(seq(-15,15,by=1),31)
    X2 <- rep(seq(-15,15,by=1),each = 31)
    mixture.test <- data.frame(X1,X2)
    Bayes.res <- naiveBayes(Class ~ X1 + X2, data=mixture.train)
    pred.bayes <- predict(Bayes.res, cbind(mixture.test$X1, mixture.test$X2), type="class")

I also tried it with pred.bayes <- predict(Bayes.res, mixture.test, type="class"), but that gives the same error. Is this a bug or am I missing something?

Kind regards, Joris Meys, University Ghent
Re: [R] error bars
On Thu, Feb 19, 2009 at 1:19 AM, jdeisenberg catc...@catcode.com wrote:

> Nicole Hackman wrote: Hello, I have a very simple data set i imported from excel including 96 averages in a column along with 96 standard errors associated with those averages (calculated in excel). I plotted the 95 averages using r and I am wondering if it is possible to plot the second column of standard errors while applying error bars to each value so they represent the error corresponding to each average?
>
> You might also find this page to be useful: http://users.fmg.uva.nl/rgrasman/rpages/2005/09/error-bars-in-plots.html It doesn't require you to load any new packages.

On the other hand, it's a fundamentally limited approach. With a small investment in learning ggplot2, you can easily add error bars to absolutely any type of graphic.

Hadley

-- http://had.co.nz/
Re: [R] Insert value in a Vector Alternately
To avoid the transposition you can use

    rbind(matrix(d, nrow=3), 0)

-- Eik Vettorazzi, Institut für Medizinische Biometrie und Epidemiologie, Universitätsklinikum Hamburg-Eppendorf, Martinistr. 52, 20246 Hamburg, T ++49/40/42803-8243, F ++49/40/42803-7790
Re: [R] Python and R
Doran, Harold wrote:

>     lm(y ~ x-1)
>     solve(crossprod(x), t(x))%*%y # probably this can be done more efficiently
>
> You could do crossprod(x,y) instead of t(x))%*%y

That certainly looks more readable (and less error prone) to an R newbie like myself :-)
Re: [R] Insert value in a Vector Alternately
2009/2/19 baptiste auguie ba...@exeter.ac.uk

> Perhaps you can try this,
>
>     d <- c(0.00377467, 0.00377467, 0.00377467, 0.00380083, 0.00380083, 0.00380083, 0.00380959, 0.00380959, 0.00380959, 0.00380083, 0.00380083, 0.00380083)
>     c( t( cbind(matrix(d, ncol=3, byrow=T), 0)))
>
> I don't know how to avoid the transpose operation that might slow things down in large cases

Like that?

    c(rbind(matrix(dat, nrow = 3), 0))

david
Re: [R] Python and R
Hi Kenn,

Thanks for the suggestions. I'll have to see if I can figure out how to convert the relatively simple call to lm with an equation and the data file to the functions you mention (or if that's even feasible). Not being an expert in statistics myself, I am mostly concentrating on the programming aspects of R. The problem is that I suspect my colleagues who are providing some guidance on the stats end are not quite experts themselves, and certainly new to R.

Cheers, Esmail

Kenn Konstabel wrote:

lm does lots of computations, some of which you may never need. If speed really matters, you might want to compute only those things you will really use. If you only need coefficients, then using %*%, solve and crossprod will be remarkably faster than lm.

    # repeating someone else's example
    # lm(DAX~., EuStockMarkets)
    y <- EuStockMarkets[,"DAX"]
    x <- EuStockMarkets
    x[,1] <- 1
    colnames(x)[1] <- "Intercept"
    lm(y ~ x-1)
    solve(crossprod(x), t(x))%*%y # probably this can be done more efficiently

    # and a naive timing
    > system.time( for(i in 1:1000) lm(y ~ x-1))
       user  system elapsed
      14.64    0.33   32.69
    > system.time(for(i in 1:1000) solve(crossprod(x), crossprod(x,y)) )
       user  system elapsed
       0.36    0.00    0.36

Also lsfit() is a bit quicker than lm or lm.fit.

Regards, Kenn
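As a sanity check on the shortcut discussed here, the sketch below compares the coefficients from lm() with those from the normal equations on simulated data. The design matrix, the true coefficients and the noise are all invented for the illustration; the thread itself uses EuStockMarkets. One caveat worth keeping in mind: solving the normal equations squares the condition number of the problem, so on badly scaled or nearly collinear predictors it can be less accurate than a QR-based solve, which is shown alongside for comparison.

```r
set.seed(123)
n <- 200
# Invented, well-conditioned design: an intercept column plus two predictors
x <- cbind(Intercept = 1, x1 = rnorm(n), x2 = rnorm(n))
y <- drop(x %*% c(2, -1, 0.5)) + rnorm(n)

coef_lm <- coef(lm(y ~ x - 1))                    # full lm() machinery
coef_ne <- solve(crossprod(x), crossprod(x, y))   # normal equations: (X'X) b = X'y
coef_qr <- qr.coef(qr(x), y)                      # QR-based least squares solve

all.equal(unname(coef_lm), as.vector(coef_ne))
all.equal(unname(coef_lm), unname(coef_qr))
```

If only the coefficients are needed inside a tight loop, either of the two direct solves skips the formula parsing and the bookkeeping that lm() performs on every call.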
Re: [R] indicator or deviation contrasts in log-linear modelling
Maja,

The need to interpret parameters in log-linear models (and therefore, the need to understand how the model is parameterized) often vanishes if you visualize the fitted model or the residuals in a mosaic display. E.g., ucb1 asserts Admit is jointly independent of Gender and Dept: it fits very badly, but the residuals show the *nature* of the association not accounted for. ucb2 (Admit and Gender conditionally independent, given Dept) fits badly overall, but only in one department.

    > library(vcd)
    > ucb1 <- loglm(~Admit + Gender*Dept, data=UCBAdmissions)
    > ucb1
    Call:
    loglm(formula = ~Admit + Gender * Dept, data = UCBAdmissions)

    Statistics:
                     X^2 df P(> X^2)
    Likelihood Ratio 877 11        0
    Pearson          798 11        0
    > plot(ucb1)

    > ucb2 <- loglm(~Admit*Dept + Gender*Dept, data=UCBAdmissions)
    > ucb2
    Call:
    loglm(formula = ~Admit * Dept + Gender * Dept, data = UCBAdmissions)

    Statistics:
                     X^2 df P(> X^2)
    Likelihood Ratio  22  6   0.0014
    Pearson           20  6   0.0028
    > plot(ucb2)

maiya wrote:

I am fairly new to log-linear modelling, so as opposed to trying to fit models, I am still trying to figure out how it actually works; hence I am looking at the interpretation of parameters. Now it seems most people skip this part and go directly to measuring model fit, so I am finding very few references to actual parameters, and am of course clear on the fact that their choice is irrelevant for the actual model fit.

But here is my question: loglin uses deviation contrasts, so the coefficients in each term add up to zero. Another option is indicator contrasts, where a reference category is chosen in each term and set to zero, while the others are relative to it. My question is whether there is a log-linear command equivalent to loglin that uses this second, dummy-coding style of constraints (I know e.g. SPSS genlog does this). I hope this is not too basic a question!

And if anyone is up for answering the wider question of why log-linear parameters are not something to be looked at (which might just be my impression of the literature), feel free to comment!

Thanks for your help! Maja

-- Michael Friendly, Email: friendly AT yorku DOT ca, Professor, Psychology Dept., York University. Voice: 416 736-5115 x66249, Fax: 416 736-5814. 4700 Keele Street, Toronto, ONT M3J 1P3 CANADA. http://www.math.yorku.ca/SCS/friendly.html
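On the narrower question of indicator (reference-category) coding, one route that is sometimes used is to fit the same log-linear model as a Poisson GLM on the cell counts; glm() applies treatment contrasts to unordered factors under R's default options, and the `contrasts` argument can switch to sum-to-zero coding. The sketch below is offered as an illustration only and is not something proposed in the reply above; it refits the ucb1 model, so the residual deviance should agree with the likelihood-ratio statistic printed there (877 on 11 df).

```r
ucb <- as.data.frame(UCBAdmissions)   # one row per cell: Admit, Gender, Dept, Freq

# Same model as ucb1, fitted as a Poisson GLM with indicator (treatment) coding
fit <- glm(Freq ~ Admit + Gender * Dept, family = poisson, data = ucb)

coef(fit)["AdmitRejected"]            # effect of "Rejected" relative to the reference "Admitted"
c(deviance = deviance(fit), df = df.residual(fit))

# Deviation (sum-to-zero) coding for comparison; the fit itself is unchanged
fit_sum <- glm(Freq ~ Admit + Gender * Dept, family = poisson, data = ucb,
               contrasts = list(Admit = "contr.sum", Gender = "contr.sum",
                                Dept = "contr.sum"))
all.equal(deviance(fit), deviance(fit_sum), tolerance = 1e-6)
```

The two codings give different coefficients but identical fitted counts and deviance, which is the sense in which the choice of constraints does not matter for model fit.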
Re: [R] table with 3 variables
> And I would like to group the Subject by Quarter using as a result in the table the value of the third variable (Boolean). The final result would give:
>
>       Subject Q1 Q2 Q3 Q4
>     1     100  Y  Y  Y  Y
>     2     101  N  N  N  N

Are you sure that this is the final result you want? You could use the reshape package (and not the reshape function):

    library(reshape)
    truc <- data.frame(Subject=rep(100:101, each=4), Quarter=rep(paste("Q",1:4,sep=""),2), Boolean = rep(c("Y","N"),4))
    mtruc <- melt(truc, id = c("Subject", "Quarter"))
    cast(mtruc, Subject ~ Quarter)

Final result:

      Subject Q1 Q2 Q3 Q4
    1     100  Y  N  Y  N
    2     101  Y  N  Y  N
Re: [R] table with 3 variables
Sorry, you don't need 'melt':

    cast(truc, Subject ~ Quarter)
Re: [R] Insert value in a Vector Alternately
Actually,

    c(rbind(0, matrix(d, nrow = 3)))

which has the bonus of giving the desired result ;)

baptiste auguie wrote:
> Perhaps you can try this:
>
>     d <- c(0.00377467, 0.00377467, 0.00377467, 0.00380083, 0.00380083,
>            0.00380083, 0.00380959, 0.00380959, 0.00380959, 0.00380083,
>            0.00380083, 0.00380083)
>     c(t(cbind(matrix(d, ncol = 3, byrow = TRUE), 0)))
>
> I don't know how to avoid the transpose operation, which might slow things down in large cases. Hope this helps, baptiste
>
> On 19 Feb 2009, at 12:47, jim holtman wrote:
>> How about this:
>>
>>     > dat <- c(0.00377467, 0.00377467, 0.00377467, 0.00380083, 0.00380083, 0.00380083, 0.00380959,
>>     +          0.00380959, 0.00380959, 0.00380083, 0.00380083, 0.00380083)
>>     > dat[seq(1, by = 3, to = length(dat))] <- 0
>>     > dat
>>      [1] 0.00000000 0.00377467 0.00377467 0.00000000 0.00380083 0.00380083 0.00000000 0.00380959 0.00380959 0.00000000 0.00380083
>>     [12] 0.00380083
>>
>> On Thu, Feb 19, 2009 at 1:47 AM, Gundala Viswanath gunda...@gmail.com wrote:
>>> Hi, I have a vector that looks like this:
>>>
>>>     > dat
>>>             V1         V2         V3         V4         V5         V6
>>>     0.00377467 0.00377467 0.00377467 0.00380083 0.00380083 0.00380083
>>>             V7         V8         V9        V10        V11        V12
>>>     0.00380959 0.00380959 0.00380959 0.00380083 0.00380083 0.00380083
>>>
>>> What I want to do is to insert 0 (zero) at every third position, yielding:
>>>
>>>     V1         V2         V3         V4 V5         V6         V7         V8
>>>      0 0.00377467 0.00377467 0.00377467  0 0.00380083 0.00380083 0.00380083
>>>     V9        V10        V11        V12 V13       V14        V15        V16
>>>      0 0.00380959 0.00380959 0.00380959  0 0.00380083 0.00380083 0.00380083
>>>
>>> Is there a quick way to do it in R?
>>>
>>> - Gundala Viswanath, Jakarta - Indonesia
>>
>> -- Jim Holtman, Cincinnati, OH, +1 513 646 9390. What is the problem that you are trying to solve?
>
> Baptiste Auguié, School of Physics, University of Exeter, Stocker Road, Exeter, Devon, EX4 4QL, UK. Phone: +44 1392 264187. http://newton.ex.ac.uk/research/emag

-- Eik Vettorazzi, Institut für Medizinische Biometrie und Epidemiologie, Universitätsklinikum Hamburg-Eppendorf, Martinistr. 52, 20246 Hamburg. T ++49/40/42803-8243, F ++49/40/42803-7790
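The rbind() idea in this message can be spelled out step by step. The sketch below is an editorial illustration; the intermediate names `m` and `res` are assumptions added here, and the data are the twelve values from the question.

```r
d <- c(0.00377467, 0.00377467, 0.00377467,
       0.00380083, 0.00380083, 0.00380083,
       0.00380959, 0.00380959, 0.00380959,
       0.00380083, 0.00380083, 0.00380083)

# R fills matrices column by column, so each column of this
# 3-row matrix holds one consecutive group of three values.
m <- matrix(d, nrow = 3)

# Put a row of zeros on top of the groups, then flatten column by
# column: every group comes out preceded by a single 0.
res <- c(rbind(0, m))
print(res)
```

No transpose is needed because the grouping is already along columns, which is the order in which c() unrolls a matrix.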
[R] Problem in converting a string to binary format
Hi all, can anyone suggest how to convert a string into binary format, so that it can be stored in another file and used for further searching? The search must work on binary files only, which is why I need to convert the strings into binary files. Thanks in advance.
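The question leaves the exact binary format open, so the following is only a minimal sketch under the assumption that "binary" means the raw bytes of the string. It uses base R's charToRaw(), writeBin() and readBin(); the names `s`, `path`, `stored` and `restored` are made up for the illustration.

```r
s <- "search me"

# Convert the string to its raw bytes.
bytes <- charToRaw(s)

# Write the bytes to a binary file (a temporary file in this sketch).
path <- tempfile(fileext = ".bin")
writeBin(bytes, path)

# Read the bytes back and turn them into a string again.
stored <- readBin(path, what = "raw", n = file.info(path)$size)
restored <- rawToChar(stored)

# A plain substring search on the restored text.
print(grepl("search", restored, fixed = TRUE))
```

Whether searching should happen on the raw bytes or on the restored text depends on requirements the post does not state.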
Re: [R] Running out of memory when importing SPSS files
2009/2/19 Thomas Lumley tlum...@u.washington.edu:
> On Wed, 18 Feb 2009, Uwe Ligges wrote:
>> dobomode wrote:
>>> Hello R-help, I am trying to import a large dataset from SPSS into R. The SPSS file is in .SAV format and is about 1GB in size. I use read.spss to import the file and get an error saying that I have run out of memory. I am on a Mac OS X 10.5 system with 4GB of RAM. Monitoring the R process tells me that R runs out of memory when reaching about 3GB of RAM, so I suppose the remaining 1GB is used up by the OS. Why would a 1GB SPSS file take up more than 3GB of memory in R?
>>
>> Because SPSS stores data in a compressed way?
>
> Or because R uses quite a lot more memory to read a data set than to store it. Either way, even if the data set eventually took up only 1Gb in R, you still would probably not be able to work usefully with it on a 32-bit machine. You need to either use a 64-bit system or avoid loading the whole data set.
>
> Unfortunately read.spss can't read the data selectively [something I'd like to fix, sometime], but if you had a .csv file you could read a subset of columns or rows using read.table. A better bet is likely to be putting the data set into a database (SQLite is easiest) and reading subsets of the data that way. That's how I handle data sets of a few Gb (on a laptop with 1Gb memory).
>
> -thomas
> Thomas Lumley, Assoc. Professor, Biostatistics, University of Washington, Seattle, tlum...@u.washington.edu

You could try using package memisc and only bring in the variables you need to analyse. See spss.system.file() and the additional subset() methods in memisc.

Paul Bivand
Head of Analysis and Statistics, Inclusion (www.cesi.org.uk)
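The point about reading only part of a .csv file can be illustrated with base R alone. This is a hedged sketch, not the memisc or SQLite route suggested above: the toy file, the column names and the object names `full` and `part` are all made up here. Setting an entry of colClasses to "NULL" tells read.csv() to skip that column, and nrows limits the number of rows read.

```r
# Write a small stand-in file; a real export would be far larger.
tmp <- tempfile(fileext = ".csv")
full <- data.frame(id    = 1:5,
                   score = c(2.5, 3.1, 4.0, 1.2, 3.3),
                   group = letters[1:5])
write.csv(full, tmp, row.names = FALSE)

# Read only the first and third columns, and only the first 3 rows.
part <- read.csv(tmp,
                 colClasses = c("integer", "NULL", "character"),
                 nrows = 3)
print(part)
```

Skipped columns are never stored, so memory use scales with the columns actually kept rather than with the width of the file.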
Re: [R] Python and R
Gabor Grothendieck wrote:
> On Wed, Feb 18, 2009 at 7:27 AM, Esmail Bonakdarian esmail...@gmail.com wrote:
>> Gabor Grothendieck wrote:
>>> See ?Rprof for profiling your R code. If lm is the culprit, rewriting your lm calls using lm.fit might help.
>>
>> Yes, based on my informal benchmarking, lm is the main bottleneck; the rest of the code consists mostly of vector manipulations and control structures. I am not familiar with lm.fit, I'll definitely look it up. I hope it's similar enough to make it easy to substitute one for the other. Thanks for the suggestion, much appreciated. (My runs now sometimes take several hours, it would be great to cut that time down by any amount :-)
>
> Yes, the speedup can be significant. E.g. here we cut the time down to 40% of the lm time by using lm.fit, and we can get down to nearly 10% if we go even lower level:

Wow, those numbers look impressive, that would be a nice speedup to have. I took a look at the manual and found the following at the top of the description for lm.fit:

    These are the basic computing engines called by lm used to fit linear models. These should usually not be used directly unless by experienced users.

I am certainly not an experienced user, so I wonder how different it would be to use lm.fit instead of lm. Right now I cobble together an equation and then call lm with it and the datafile, i.e.

    LM.1 <- lm(as.formula(eqn), data = datafile)
    s <- summary(LM.1)

I then extract some information from the summary stats. I'm not really quite sure what to make of the parameter list in lm.fit. I will look on-line and see if I can find an example showing the use of this. Thanks for pointing me in that direction.

Esmail

>     > system.time(replicate(1000, lm(DAX ~ . - 1, EuStockMarkets)))
>        user  system elapsed
>       26.85    0.07   27.35
>     > system.time(replicate(1000, lm.fit(EuStockMarkets[,-1], EuStockMarkets[,1])))
>        user  system elapsed
>       10.76    0.00   10.78
>     > system.time(replicate(1000, qr.coef(qr(EuStockMarkets[,-1]), EuStockMarkets[,1])))
>        user  system elapsed
>        3.33    0.00    3.34
>
>     > lm(DAX ~ . - 1, EuStockMarkets)
>
>     Call:
>     lm(formula = DAX ~ . - 1, data = EuStockMarkets)
>
>     Coefficients:
>          SMI       CAC      FTSE
>      0.55156   0.45062  -0.09392
>
>     > # They all give the same coefficients:
>     > lm.fit(EuStockMarkets[,-1], EuStockMarkets[,1])$coef
>             SMI         CAC        FTSE
>      0.55156141  0.45062183 -0.09391815
>     > qr.coef(qr(EuStockMarkets[,-1]), EuStockMarkets[,1])
>             SMI         CAC        FTSE
>      0.55156141  0.45062183 -0.09391815
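For someone who builds models from a formula, one possible bridge to lm.fit is model.matrix(), which turns the formula and data into the design matrix that lm.fit expects. The sketch below is an editorial illustration on the built-in EuStockMarkets data used in the timings; the names `dat`, `X`, `y`, `fit_lm` and `fit_fast` are assumptions added here.

```r
dat <- as.data.frame(EuStockMarkets)

# The usual formula interface.
fit_lm <- lm(DAX ~ . - 1, data = dat)

# The same model for lm.fit: build the design matrix from the formula,
# and pass the response as a plain numeric vector.
X <- model.matrix(DAX ~ . - 1, data = dat)
y <- dat$DAX
fit_fast <- lm.fit(X, y)

# Both routes should agree on the coefficients.
print(all.equal(unname(coef(fit_lm)), unname(fit_fast$coefficients)))
```

lm.fit returns a plain list, so summary() statistics such as standard errors are not available directly and would have to be computed from its components; that is part of the trade-off behind the manual's caution.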
Re: [R] Unadulterated plot
James, as I told you previously in my broken English, the functions you're looking for are probably not filled.contour but image and contour. The following code does exactly what you ask for:

    library(akima)
    data(akima)
    akima.smooth <- with(akima, interp(x, y, z, xo = seq(0, 25, length = 500),
                                       yo = seq(0, 20, length = 500)))
    op <- par(ann = FALSE, mai = c(0, 0, 0, 0))
    image(akima.smooth, main = "interp(akima data, *) on finer grid")
    contour(akima.smooth, add = TRUE, drawlabels = FALSE)

cheers
Patrizio

2009/2/19 James Nicolson jlnicol...@gmail.com:
> Good point! Provide your own set of x,y,z co-ords; mine are pretty big but you can use any.
>
>     library(akima)
>     fr3d = data.frame(x, y, z)
>     xtrp <- interp(fr3d$x, fr3d$y, fr3d$z, linear = FALSE, extrap = TRUE, duplicate = "strip")
>     op <- par(ann = FALSE, mai = c(0, 0, 0, 0))
>     filled.contour(xtrp$x, xtrp$y, xtrp$z, asp = 0.88402366864,
>                    col = rev(rainbow(28, start = 0, end = 8/12)), n = 40)
>     par(op)
>
> I tried all these settings too (none of them made a difference): usr=c(0,845,0,747), mfcol=c(1,1), mfrow=c(1,1), oma=c(0,0,0,0), omi=c(0,0,0,0), plt=c(1,1,1,1)
>
> Regards, James
>
> Peter Alspach wrote:
>> Kia ora James. I think it would be easier to provide you with help if you provide commented, minimal, self-contained, reproducible code [see bottom of this, or any, email to R-help]. Hei kona ra ... Peter Alspach
>>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of James Nicolson
>> Sent: Thursday, 19 February 2009 11:22 a.m.
>> To: r-help@r-project.org
>> Subject: Re: [R] Unadulterated plot
>>
>>> Hi, Thanks for your help. I have looked at the beginners' documentation, and while there are options to configure various aspects of the plot, none of them seem to have the desired effect. I have managed to ensure that the plot fills the space vertically with no margins, no axes etc. (using mai=c(0,0,0,0)). However, horizontally there remains a margin to the right that pads the space between the filled.contour and its legend. I've tried options to par and filled.contour but I can't seem to remove the legend. Kind Regards, James
>>>
>>> Simon Pickett wrote:
>>>> Hi James, What you really need to do is to check out the many freely available pdfs for R beginners. Here is a good place to start: http://cran.r-project.org/other-docs.html
>>>>
>>>> If I am right in interpreting what you want, I think you need to create a blank plot with no axes, axis labels etc. Try
>>>>
>>>>     plot(x, y, xlab = "", ylab = "", xaxt = "n", yaxt = "n", type = "n")  # blank plot
>>>>     points(x, y)
>>>>
>>>> Type ?par into R and see how you can set parameters like this up as the default. Hope this helps? Simon.
>>>>
>>>> ----- Original Message -----
>>>> From: James Nicolson jlnicol...@gmail.com
>>>> To: r-help@r-project.org
>>>> Sent: Sunday, February 15, 2009 10:29 PM
>>>> Subject: [R] Unadulterated plot
>>>>
>>>>> To all, Apologies if this question has already been asked but I can't find anything. I can't seem to think of more specific search terms. I want to display/create a file of a pure plot with a specific height and width. I want to utilise every single pixel inside the axes. I do not want to display any margins, legends, axes, titles or spaces around the edges. Is this possible? Additionally, the plot I am working with is a filled.contour plot and I cannot remove the legend. How can I do this? Kind Regards, James
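The image-plus-contour approach can also be tried without the akima package. The sketch below is an editorial illustration that assumes the built-in `volcano` matrix as stand-in data and a PNG device as the output; the file name `out` and the 400 by 300 pixel size are arbitrary choices, and the colour palette is borrowed from the filled.contour call in the thread.

```r
# Open a bitmap device with the exact pixel size wanted.
out <- tempfile(fileext = ".png")
png(out, width = 400, height = 300)

# Zero margins and no annotation, so the plot region fills the device.
op <- par(mar = c(0, 0, 0, 0), ann = FALSE)

# image() draws only the coloured grid (no legend strip), and
# axes = FALSE suppresses the axes and the surrounding box.
image(volcano, axes = FALSE,
      col = rev(rainbow(28, start = 0, end = 8/12)))
contour(volcano, add = TRUE, drawlabels = FALSE)

par(op)
dev.off()
```

Unlike filled.contour(), which lays out a separate panel for its colour key, image() uses a single plot region, which is why the margin on the right disappears.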
Re: [R] Re : SVM regression code
You can find the complete list at: http://cran.r-project.org/web/views/MachineLearning.html

Max
[R] DLM and matrices with 0 eigenvalues
I am using DLM to fit a state space model. The covariance matrix of the states (W) is given by:

    a 0 a 0
    0 0 0 0
    a 0 a 0
    0 0 0 0

where a is a parameter to be estimated. Even though the matrix is positive semidefinite, DLM sometimes gives me an error that W is not a valid variance matrix. As far as I can tell, the reason is that one of R's computed eigenvalues is very slightly negative (something like -5E-17). Is there a way to work around this? Thanks!

Rebecca
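The numerical issue described here can be inspected with base R. The sketch below is an editorial illustration: the value a = 0.7, the jitter size 1e-10 and the names `W` and `W_fix` are assumptions, and whether the dlm package accepts the adjusted matrix (or whether the tiny extra variance is acceptable for the model) is not something this sketch verifies.

```r
a <- 0.7
W <- matrix(c(a, 0, a, 0,
              0, 0, 0, 0,
              a, 0, a, 0,
              0, 0, 0, 0), nrow = 4, byrow = TRUE)

# Exact eigenvalues are 2a, 0, 0, 0; in floating point the zeros can
# come out as tiny positive or tiny negative numbers.
print(eigen(W, symmetric = TRUE)$values)

# One common workaround: add a very small value to the diagonal so that
# every eigenvalue is shifted safely above zero.
W_fix <- W + diag(1e-10, 4)
print(eigen(W_fix, symmetric = TRUE)$values)
```

The shift moves every eigenvalue up by the same amount, so it has to be large compared with rounding error (around 1e-16 here) but small compared with a.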
Re: [R] Insert value in a Vector Alternately
Thanks all for the correction; funny how it's often the complicated solution that comes to mind first.

baptiste

On 19 Feb 2009, at 13:41, Eik Vettorazzi wrote:
> actually
>
>     c(rbind(0, matrix(d, nrow = 3)))
>
> which has the bonus of giving the desired result ;)

Baptiste Auguié, School of Physics, University of Exeter, Stocker Road, Exeter, Devon, EX4 4QL, UK. Phone: +44 1392 264187. http://newton.ex.ac.uk/research/emag
Re: [R] Multiple merge, better solution?
The zoo package has a multi-way merge for zoo objects. It's just do.call(merge, z), where z is a list of zoo objects. In detail:

    set.seed(1)
    DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5), b = rnorm(5), c = rnorm(5))
    DF2 <- data.frame(var1 = letters[3:7], a = rnorm(5), b = rnorm(5), c = rnorm(5))
    DF3 <- data.frame(var1 = letters[6:10], a = rnorm(5), b = rnorm(5), c = rnorm(5))
    DF4 <- data.frame(var1 = letters[8:12], a = rnorm(5), b = rnorm(5), c = rnorm(5))

    # create list of data frames
    DFs <- list(A = DF1, B = DF2, C = DF3, D = DF4)

    library(zoo)
    # convert to list of zoo objects
    z <- lapply(DFs, function(x) zoo(as.matrix(x[, -1, drop = FALSE]), as.character(x[, 1])))

    # perform merge
    zz <- do.call(merge, z)

    # to convert back to data frame
    DF <- data.frame(var1 = time(zz), coredata(zz))

On Thu, Feb 19, 2009 at 6:00 AM, Lauri Nikkinen lauri.nikki...@iki.fi wrote:
> Thanks, both solutions work fine. When I tried these solutions on my real data, I got the error
>
>     Error in match.names(clabs, names(xi)) : names do not match previous names
>
> I refined the example data to look more like my real data, and this also produces the same error. Any ideas how to prevent this error?
>
>     > DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5), b = rnorm(5), c = rnorm(5))
>     > DF2 <- data.frame(var1 = letters[3:7], a = rnorm(5), b = rnorm(5), c = rnorm(5))
>     > DF3 <- data.frame(var1 = letters[6:10], a = rnorm(5), b = rnorm(5), c = rnorm(5))
>     > DF4 <- data.frame(var1 = letters[8:12], a = rnorm(5), b = rnorm(5), c = rnorm(5))
>     > g <- merge(DF1, DF2, by.x = "var1", by.y = "var1", all = TRUE)
>     > g <- merge(g, DF3, by.x = "var1", by.y = "var1", all = TRUE)
>     > merge(g, DF4, by.x = "var1", by.y = "var1", all = TRUE)
>     Error in match.names(clabs, names(xi)) : names do not match previous names
>
>     > DF <- DF1
>     > for (.df in list(DF2, DF3, DF4)) {
>     +   DF <- merge(DF, .df, by.x = "var1", by.y = "var1", all = TRUE)
>     + }
>     Error in match.names(clabs, names(xi)) : names do not match previous names
>
>     > Reduce(function(x, y) merge(x, y, all = TRUE, by.x = "var1", by.y = "var1"),
>     +        list(DF1, DF2, DF3, DF4), accumulate = FALSE)
>     Error in match.names(clabs, names(xi)) : names do not match previous names
>
> - Lauri
>
> 2009/2/19 baptiste auguie ba...@exeter.ac.uk:
>> Hi, I think Reduce could help you.
>>
>>     DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5))
>>     DF2 <- data.frame(var1 = letters[3:7], b = rnorm(5))
>>     DF3 <- data.frame(var1 = letters[6:10], c = rnorm(5))
>>     DF4 <- data.frame(var1 = letters[8:12], d = rnorm(5))
>>
>>     g <- merge(DF1, DF2, by.x = "var1", by.y = "var1", all = TRUE)
>>     g <- merge(g, DF3, by.x = "var1", by.y = "var1", all = TRUE)
>>     g <- merge(g, DF4, by.x = "var1", by.y = "var1", all = TRUE)
>>
>>     test <- Reduce(function(x, y) merge(x, y, all = TRUE, by.x = "var1", by.y = "var1"),
>>                    list(DF1, DF2, DF3, DF4), accumulate = FALSE)
>>     all.equal(test, g)  # TRUE
>>
>> As a warning, it's the first time I've ever used it myself... Hope this helps, baptiste
>>
>> On 19 Feb 2009, at 10:21, Lauri Nikkinen wrote:
>>> Hello, My problem is that I would like to merge multiple files with a common column, but merge accepts only two data.frames at a time. In the real situation, I have 26 different data.frames with a common column. I can of course use merge many times (see below), but what would be a more sophisticated solution? A for loop? Any ideas?
>>>
>>>     DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5))
>>>     DF2 <- data.frame(var1 = letters[3:7], b = rnorm(5))
>>>     DF3 <- data.frame(var1 = letters[6:10], c = rnorm(5))
>>>     DF4 <- data.frame(var1 = letters[8:12], d = rnorm(5))
>>>
>>>     g <- merge(DF1, DF2, by.x = "var1", by.y = "var1", all = TRUE)
>>>     g <- merge(g, DF3, by.x = "var1", by.y = "var1", all = TRUE)
>>>     merge(g, DF4, by.x = "var1", by.y = "var1", all = TRUE)
>>>
>>> Thanks in advance. -Lauri
>>
>> Baptiste Auguié, School of Physics, University of Exeter, Stocker Road, Exeter, Devon, EX4 4QL, UK. Phone: +44 1392 264187. http://newton.ex.ac.uk/research/emag
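A likely cause of the "names do not match previous names" error is that every data frame carries the same non-key column names (a, b, c), so repeated merges generate clashing suffixed names such as a.x and a.y more than once. One way to stay with merge() is to make the column names unique before merging. The sketch below is an editorial illustration, not a reply from the thread; the helper `make_df` and the tags A to D are assumptions.

```r
set.seed(1)

# Hypothetical helper: a small data frame keyed by var1.
make_df <- function(keys) {
  data.frame(var1 = keys, a = rnorm(5), b = rnorm(5), c = rnorm(5),
             stringsAsFactors = FALSE)
}

DFs <- list(A = make_df(letters[1:5]),
            B = make_df(letters[3:7]),
            C = make_df(letters[6:10]),
            D = make_df(letters[8:12]))

# Tag every non-key column with the name of its data frame (a.A, b.A, ...).
renamed <- Map(function(df, tag) {
  names(df)[-1] <- paste(names(df)[-1], tag, sep = ".")
  df
}, DFs, names(DFs))

# Now the pairwise merges never see duplicate column names.
merged <- Reduce(function(x, y) merge(x, y, by = "var1", all = TRUE), renamed)
print(dim(merged))
```

The same pattern extends to 26 data frames by putting them all in the list; only the key column has to share a name.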
[R] code patterns in vector
Dear List, I have this column/vector:

    vec <- c("function", "missing", "string")

and want to compute a second column/vector:

- value if the pattern "unc" is found: 1
- value if the pattern "iss" is found: 2
- value if none of the patterns is found: 0

This should be the result:

    > vec2
    [1] 1 2 0

Any help? I tried it with grep, but the output is not as long as vec, so I'm a bit lost here.

Thanks, Stefan
Re: [R] code patterns in vector
On 19/02/2009 9:26 AM, Stefan Uhmann wrote:
> Dear List, I have this column/vector:
>
>     vec <- c("function", "missing", "string")
>
> and want to compute a second column/vector:
>
> - value if the pattern "unc" is found: 1
> - value if the pattern "iss" is found: 2
> - value if none of the patterns is found: 0
>
> This should be the result:
>
>     > vec2
>     [1] 1 2 0
>
> Any help? I tried it with grep, but the output is not as long as vec, so I'm a bit lost here.

    vec2 <- rep(0, length(vec))
    vec2[grep("iss", vec)] <- 2
    vec2[grep("unc", vec)] <- 1

Note that an entry containing both "unc" and "iss" will get a 1 according to this scheme.

Duncan Murdoch
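The grep-based recipe above can be run as is, and grepl() offers an alternative that sidesteps the length problem mentioned in the question, since it returns one TRUE or FALSE per element. The sketch below is an editorial illustration; the name `alt` is an assumption.

```r
vec <- c("function", "missing", "string")

# The approach from the reply: start with 0, then overwrite matches.
vec2 <- rep(0, length(vec))
vec2[grep("iss", vec)] <- 2
vec2[grep("unc", vec)] <- 1   # assigned last, so "unc" wins ties

# Alternative with grepl(), which is always as long as vec.
alt <- ifelse(grepl("unc", vec), 1,
              ifelse(grepl("iss", vec), 2, 0))

print(vec2)
print(identical(vec2, alt))
```

Both versions give "unc" priority when an entry contains both patterns; swapping the order of the tests reverses that priority.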
Re: [R] Insert value in a Vector Alternately
----- Original Message -----
From: baptiste auguie ba...@exeter.ac.uk
To: Gundala Viswanath gunda...@gmail.com
Cc: R-help r-h...@stat.math.ethz.ch
Sent: Thursday, February 19, 2009 7:12:23 AM GMT -06:00 US/Canada Central
Subject: Re: [R] Insert value in a Vector Alternately

> Perhaps you can try this:
>
>     d <- c(0.00377467, 0.00377467, 0.00377467, 0.00380083, 0.00380083,
>            0.00380083, 0.00380959, 0.00380959, 0.00380959, 0.00380083,
>            0.00380083, 0.00380083)
>     c(t(cbind(matrix(d, ncol = 3, byrow = TRUE), 0)))
>
> I don't know how to avoid the transpose operation, which might slow things down in large cases.

This seems to work:

    c(0, c(rbind(matrix(d, nrow = 3), 0)))

-- David Winsemius
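A small check, added editorially, of how the variant in this message compares with the layout requested in the original question (a zero before each group of three, 16 values in total). The stand-in data `d <- seq_len(12)` and the names `target` and `variant` are assumptions made for the illustration.

```r
d <- seq_len(12)   # stand-in for the twelve values in the thread

# Zero placed before each group of three: 16 values.
target <- c(rbind(0, matrix(d, nrow = 3)))

# Variant from this message: a leading 0, then a 0 after each group.
variant <- c(0, c(rbind(matrix(d, nrow = 3), 0)))

print(length(target))
print(length(variant))

# Dropping the final element of the variant recovers the target layout.
print(identical(variant[-length(variant)], target))
```

The variant therefore ends with one extra zero after the last group, which matters only if the result must have exactly 16 elements.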
[R] Course: Introduction to the R language
To all,

There are still places available on the upcoming courses. For more information and the registration form, please contact train...@mango-solutions.com or visit our website, www.mango-solutions.com (http://www.mango-solutions.com/).

The R Language: Introduction to the R language
Date: 2 and 3 June
Location: Madrid

Kind regards,
Sharon Lazenby

mangosolutions, data analysis that delivers
Tel +44 (0)1249 767700, Mob +44 (0)7966 062462, Fax +44 (0)1249 767707
Mango have moved: our new address is Ground Floor, Unit 2 Greenways Business Park, Chippenham SN15 1BN
Re: [R] Age as time-scale in a cox model
You asked about survival curves with age scale versus follow-up scale.

    > fit1 <- coxph(Surv(time/365.25, status) ~ t5 + id + age, data = stanford2)
    > surv1 <- survfit(fit1)
    > surv1
          n  events  median 0.95LCL 0.95UCL
    157.000 102.000   1.999   0.898   3.608

    > summary(surv1, times = 3)
     time n.risk n.event survival std.err lower 95% CI upper 95% CI
        3     46      85    0.451  0.0425        0.375        0.543

I've taken the liberty of rewriting your query using the standard survival library calls instead of Design, since I don't attempt to keep up with the latter. The above shows a median survival of 1.999 years after enrollment, and a 3 year survival of 45%. I was surprised when you put id in the model, but it turns out to have p=.03! It seems that patients entered later in the study have better survival.

Now for age scale:

    > fit2 <- coxph(Surv(age, age + time/365.25, status) ~ t5 + id, stanford2)
    > surv2 <- survfit(fit2)
    > surv2
          n  events  median 0.95LCL 0.95UCL
        1.0   102.0    12.2    12.2    28.1

This shows a median age at death of 12.2 years. Puzzling, isn't it.

First, note that your code

    cph(Surv(age, age+time, status) ~ t5 + id, data = stanford2 ...

doesn't make sense due to the different time scales: age is in years and time is in days.

As to your final question:

> These are obviously out of sync, so there must be some way I can adjust them to mean the same thing. The first means the probability of surviving 1000 days since they started being followed up, while the second means the probability of surviving up to starting age + 1000 days. How do I get the equivalent risks from the two models?

The first fit is on a time-since-entry scale, and so the survival curve is with respect to time since entry. The second is on an age scale, and so the curve will be in terms of absolute age, not starting age + x. There is no simple way to realign them.

As to the curve above with a median age of 12.2 years: we know that a usual Kaplan-Meier curve can become unstable at the right hand end due to very small n (<= 5), which leads to big steps. With (start, stop) data this can happen at the left end too. In the stanford2 data set there is one subject who enters the study at age 12 and dies at age 12.2. At the time of death there is only 1 person at risk, so the survival curve goes to zero (100% death rate). This curve is mathematically correct, but not at all useful.

Terry Therneau
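The left-end instability described above can be examined directly by counting how many subjects are under observation at the age of the earliest death. The sketch below is an editorial illustration using the stanford2 data from the survival package; the names `s2`, `age_exit`, `first_death` and `at_risk` are assumptions, and the subset to non-missing t5 is meant to mirror the rows used by the fits in the message.

```r
library(survival)

# Keep rows with t5 available and put entry and exit on the age scale,
# converting follow-up time from days to years.
s2 <- subset(stanford2, !is.na(t5))
s2$age_exit <- s2$age + s2$time / 365.25

# Age at the earliest observed death.
first_death <- min(s2$age_exit[s2$status == 1])

# Subjects who have entered the study but not yet left it at that age.
at_risk <- sum(s2$age < first_death & s2$age_exit >= first_death)

print(first_death)
print(at_risk)
```

When the risk set at the first death is this small, a single event removes a large share of it, which is what drives the age-scale curve down so early.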
Re: [R] Need Help for creating a new variable
Dear Chun-Hao,

How about this?

    diet <- sort(rep(x = c("C", "T"), 4))
    vesl <- rep(x = c("A", "P"), 4)
    mydata <- data.frame(diet, vesl)
    mydata$trt <- with(mydata, paste(diet, vesl, sep = ""))
    mydata

HTH,
Jorge

On Thu, Feb 19, 2009 at 2:53 AM, Chun-Hao Tu tc...@hotmail.com wrote:
> Hi R users, I did the research and worked on this for hours, but I still don't know how to solve my silly problem. I am trying to create a new variable in my dataset, such as: if diet=="C" & vesl=="P" then trt="CP"; if diet=="C" & vesl=="A" then trt="CA". The following is my code (it does not work correctly). Could anyone give me a hint? Appreciate!
Re: [R] Need Help for creating a new variable
On Thu, Feb 19, 2009 at 9:50 AM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote:
>     mydata$trt <- with(mydata, paste(diet, vesl, sep = ""))

Besides the above (good!) solution, you might want to understand why your original solution didn't work:

>     mydata$trt <- ifelse(mydata$diet == "C" && mydata$vesl == "A", "CA",
>                   ifelse(mydata$diet == "C" && mydata$vesl == "P", "CP",
>                   ifelse(mydata$diet == "T" && mydata$vesl == "A", "TA",
>                   ifelse(mydata$diet == "T" && mydata$vesl == "P", "TP"

The problem here is that you are using && rather than &. From the man page:

    '&' and '&&' indicate logical AND and '|' and '||' indicate logical OR. The shorter form performs elementwise comparisons in much the same way as arithmetic operators. The longer form evaluates left to right examining only the first element of each vector.

I trust this makes things clearer.

-s
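To make the difference concrete, here is a short editorial sketch that builds the same data and derives trt twice: once with paste(), once with nested ifelse() calls using the elementwise operator &. The name `trt2`, the `stringsAsFactors = FALSE` argument and the choice of "TP" as the final fallback are assumptions added for the example.

```r
diet <- sort(rep(c("C", "T"), 4))
vesl <- rep(c("A", "P"), 4)
mydata <- data.frame(diet, vesl, stringsAsFactors = FALSE)

# Direct construction by pasting the two codes together.
mydata$trt <- with(mydata, paste(diet, vesl, sep = ""))

# The same result with nested ifelse(), using the elementwise & so that
# every row is tested, not just the first one.
trt2 <- with(mydata,
             ifelse(diet == "C" & vesl == "A", "CA",
             ifelse(diet == "C" & vesl == "P", "CP",
             ifelse(diet == "T" & vesl == "A", "TA", "TP"))))

print(mydata)
print(identical(mydata$trt, trt2))
```

With && only a single TRUE or FALSE reaches ifelse(), which explains why every row received the same value in the original attempt; recent versions of R go further and raise an error when && is given vectors longer than one.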
[R] Help on warning message from Neg. Binomial error during glm
I am using glm.nb with the model a ~ b*c (b is categorical and c is continuous). When I run this model I get the warning messages:

    Warning messages:
    1: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace :
      iteration limit reached
    2: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace :
      iteration limit reached

What does this mean?

-- Graduate student, Dr. Renee M. Borges lab, Centre for Ecological Sciences, Indian Institute of Science, Bangalore, Karnataka 560 012, India
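As the warning text itself shows, theta.ml is called with limit = control$maxit, so the message says that the estimation of the dispersion parameter theta stopped at the iteration limit before converging. One thing to try is a larger maxit through the control argument of glm.nb. The sketch below is an editorial illustration on simulated data; the variable names mirror the post (a, b, and `cc` in place of c), and the simulation settings are assumptions.

```r
library(MASS)

set.seed(42)
n  <- 200
b  <- factor(sample(c("x", "y"), n, replace = TRUE))  # categorical predictor
cc <- runif(n)                                        # continuous predictor
mu <- exp(0.5 + 0.3 * (b == "y") + 0.8 * cc)
a  <- rnbinom(n, size = 2, mu = mu)                   # overdispersed counts

# Allow more iterations than the default for both the glm fit and
# the theta estimation.
fit <- glm.nb(a ~ b * cc, control = glm.control(maxit = 100))

print(fit$theta)
print(fit$SE.theta)
```

If the warning persists and the estimated theta is very large, the counts may show little overdispersion, in which case a Poisson glm could be worth comparing; that is a general observation rather than a diagnosis of the poster's data.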
Re: [R] type III effect from glm()
Dear Mark and Simon,

I assume from the variable names that siteall and district are factors and that yrs is numeric. If that's the case, then the second model formula, ~ siteall + district + yrs:district, nests yrs within district, that is, it will fit a separate slope for yrs within each level of district. This is what you'd get from ~ siteall + district/yrs or ~ siteall + district + yrs %in% district. This model is equivalent to ~ siteall + yrs*district, although it's parametrized differently. To see what's happening, check model.matrix(test1) and model.matrix(test2). More generally, R avoids violating marginality.

If you want type-III tests, you could use the Anova() function in the car package, but if I properly interpreted the meaning of the predictors, the type-III test for the main effect of yrs is simply the test that the slope for yrs is 0 in the first (reference) category of district, assuming that you're using the default dummy-coded (contr.treatment) contrasts. That is not generally a particularly interesting hypothesis.

Regards,
John

--
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of markle...@verizon.net
Sent: February-19-09 6:23 AM
To: Simon Pickett; r-help@r-project.org
Subject: Re: [R] type III effect from glm()

Hi Simon:

In the below, test1 spelled out is count ~ siteall + yrs + district + yrs:district, so this is fine. But in test2, you have yrs interacting with district but not the main effect for yrs. This is against the rules of marginality, so I still think there's a problem. I would wait for John or the other wizaRds to respond (you know who you are) because I don't feel particularly confident giving advice on this, because I bang my head against it often also. Plus, I gotta go home because it's getting light out soon (I'm in the US on the east coast). Good luck.
On Thu, Feb 19, 2009 at 6:10 AM, Simon Pickett wrote:

> Cheers Mark,
>
> I did originally think that too, i.e. that not including the main effect was the problem. However, the same thing happens when I include main effects:
>
>   test1 <- glm(count ~ siteall + yrs*district, family = quasipoisson,
>                weights = weight, data = m[x[[i]], ])
>   test2 <- glm(count ~ siteall + district + yrs:district, family = quasipoisson,
>                weights = weight, data = m[x[[i]], ])
>   anova(test1, test2, test = "F")
>
>   Model 1: count ~ siteall + yrs * district
>   Model 2: count ~ siteall + district + yrs:district
>     Resid. Df Resid. Dev Df Deviance F Pr(>F)
>   1      1933      75665
>   2      1933      75665  0        0
>
> Simon.
>
> ----- Original Message -----
> From: markle...@verizon.net
> To: Simon Pickett <simon.pick...@bto.org>
> Sent: Thursday, February 19, 2009 10:50 AM
> Subject: RE: [R] type III effect from glm()
>
> Hi Simon:
>
> John Fox can say a lot more about the below, but I've been reading his book over and over recently, and one thing he constantly stresses is marginality, which he defines as always including the lower-order term if you include it in a higher-order term. So, I think the below is problematic because you are including an interaction that includes the main effect but not including the main effect. This definitely causes problems when trying to interpret the anova table or the Anova table. That's as much as I can say. I highly recommend his text for this sort of thing and hopefully he will respond.
>
> Oh, my point is that if you want to check the effect of yrs, then I think you have to take it out of model 2 totally in order to interpret the anova (or the Anova) table.
>
> On Thu, Feb 19, 2009 at 5:38 AM, Simon Pickett wrote:
>
> Hi all,
>
> This could be naivety/stupidity on my part rather than a problem with model output, but here goes.
>
> I have fitted a fairly simple model
>
>   m1 <- glm(count ~ siteall + yrs + yrs:district, family = quasipoisson,
>             weights = weight, data = m[x[[i]], ])
>
> I want to know if yrs (a continuous variable) has a significant unique effect in the model, so I fit a simplified model with the main effect omitted...
>   m2 <- glm(count ~ siteall + yrs:district, family = quasipoisson,
>             weights = weight, data = m[x[[i]], ])
>
> then compare models using anova()
>
>   anova(m1, m2, test = "F")
>
>   Analysis of Deviance Table
>   Model 1: count ~ siteall + yrs + yrs:district
>   Model 2: count ~ siteall + yrs:district
>     Resid. Df Resid. Dev Df Deviance F Pr(>F)
>   1      1936      75913
>   2      1936      75913  0        0
>
> The d.f.'s are exactly the same, is this right? Can I only test the significance of a main effect when it is not in an interaction?
>
> Thanks in advance,
> Simon.
>
> Dr. Simon Pickett
> Research Ecologist
> Land Use Department
> Terrestrial Unit
> British Trust for Ornithology
> The Nunnery
> Thetford
> Norfolk IP242PU
> 01842750050
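The equivalence described in the reply at the top of this thread can be checked on a small simulated example. This is only a sketch: the data frame `d`, its columns, and the Poisson mean used to simulate `count` are invented for illustration, and siteall and the weights are left out to keep it short.

```r
set.seed(1)
# Hypothetical data: three districts observed over 20 years each.
d <- data.frame(
  district = factor(rep(c("a", "b", "c"), each = 20)),
  yrs      = rep(1:20, times = 3)
)
d$count <- rpois(nrow(d), lambda = exp(1 + 0.05 * d$yrs))

# Two formulas that describe the same model with different parametrizations.
fit1 <- glm(count ~ yrs * district,            family = quasipoisson, data = d)
fit2 <- glm(count ~ district + yrs:district,   family = quasipoisson, data = d)

X1 <- model.matrix(fit1)
X2 <- model.matrix(fit2)
colnames(X1)  # one reference slope plus slope differences
colnames(X2)  # one slope per district

# Same number of columns, and stacking them side by side adds no new
# directions: both matrices span the same column space.
ncol(X1); ncol(X2)
qr(cbind(X1, X2))$rank

# Hence identical fits, zero difference in df and in deviance.
anova(fit1, fit2, test = "F")
```

Since the two design matrices span the same space, anova() has nothing to compare, which matches the 0 df and 0 deviance rows in the tables quoted above.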
[R] read.table : how to condition on error while opening file?
Hi,

I'm using read.table in a loop to read in multiple files. The problem is that when a file is missing there is an error message and the loop is broken; what I'd like to do is to test for the error and simply do next instead of breaking the loop. Does anybody know how to do that?

Example:

  filelist <- c("file1.txt", "file2.txt", "file3.txt")
  for (i in 1:3) {
    if (read.table(filelist[i]) == "ERROR LOADING FILE") {  # this is where I do not know how to write the condition
      print(paste("error opening file ", filelist[i], sep = ""))
      next
    } else {
      tmp <- read.table(filelist[i])
    }
  }

Cheers,
Stephane

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
Re: [R] Unadulterated plot
To All,

I'm using R 2.8.1. I have attached two images: one shows what I get and the other shows what I want. All I want is the plotting region. Surely there must be a way of plotting individual regions/components in R?

Patrizio, I prefer filled.contour() for my data. I still have the same problem (a legend to the right) if I use image() then contour()!

Regards,
James

ps. No need for apologies Simon :D I'm grateful for any assistance, fruitless or not.

Patrizio Frederic wrote:

> James,
> as I previously told you in my broken English, probably the function you're looking for is not filled.contour but image and contour. The following code does exactly what you ask for:
>
>   data(akima)
>   akima
>   akima.smooth <- with(akima, interp(x, y, z, xo = seq(0, 25, length = 500),
>                                      yo = seq(0, 20, length = 500)))
>   op <- par(ann = FALSE, mai = c(0, 0, 0, 0))
>   image(akima.smooth, main = "interp(<akima data>, *) on finer grid")
>   contour(akima.smooth, add = TRUE, drawlabels = F)
>
> cheers
> Patrizio
>
> 2009/2/19 James Nicolson <jlnicol...@gmail.com>:
>
> good point! Provide your own set of x,y,z co-ords; mine are pretty big but you can use any.
>
>   library(akima)
>   fr3d = data.frame(x, y, z)
>   xtrp <- interp(fr3d$x, fr3d$y, fr3d$z, linear = FALSE, extrap = TRUE,
>                  duplicate = "strip")
>   op <- par(ann = FALSE, mai = c(0, 0, 0, 0))
>   filled.contour(xtrp$x, xtrp$y, xtrp$z, asp = 0.88402366864,
>                  col = rev(rainbow(28, start = 0, end = 8/12)), n = 40)
>   par(op)
>
> I tried all these settings too (none of them made a difference)...
>
>   usr=c(0,845,0,747), mfcol=c(1,1), mfrow=c(1,1),
>   oma=c(0,0,0,0), omi=c(0,0,0,0), plt=c(1,1,1,1)
>
> Regards
> James
>
> Peter Alspach wrote:
>
> Kia ora James
>
> I think it would be easier to provide you with help if you provide commented, minimal, self-contained, reproducible code [see bottom of this, or any, email to R-help].
>
> Hei kona ra ...
> Peter Alspach
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of James Nicolson
> Sent: Thursday, 19 February 2009 11:22 a.m.
> To: r-help@r-project.org
> Subject: Re: [R] Unadulterated plot
>
> Hi,
>
> Thanks for your help. I have looked at the beginners' documentation, and while there are options to configure various aspects of the plot, none of them seem to have the desired effect. I have managed to ensure that the plot fills the space vertically with no margins, no axes etc. (using mai=c(0,0,0,0)). However, horizontally there remains a margin to the right that pads the space between the filled.contour and its legend. I've tried options to par and filled.contour but I can't seem to remove the legend.
>
> Kind Regards,
> James
>
> Simon Pickett wrote:
>
> Hi James,
>
> What you really need to do is to check out the many freely available pdfs for R beginners. Here is a good place to start: http://cran.r-project.org/other-docs.html
>
> If I am interpreting what you want correctly, I think you need to create a blank plot with no axes, axis labels etc. Try
>
>   plot(x, y, xlab = "", ylab = "", xaxt = "n", yaxt = "n", type = "n")  # blank plot
>   points(x, y)
>
> Type ?par into R and see how you can set parameters like this up as the default.
>
> Hope this helps?
> Simon.
>
> ----- Original Message -----
> From: James Nicolson <jlnicol...@gmail.com>
> To: r-help@r-project.org
> Sent: Sunday, February 15, 2009 10:29 PM
> Subject: [R] Unadulterated plot
>
> To all,
>
> Apologies if this question has already been asked but I can't find anything. I can't seem to think of more specific search terms. I want to display/create a file of a pure plot with a specific height and width. I want to utilise every single pixel inside the axes. I do not want to display any margins, legends, axes, titles or spaces around the edges. Is this possible? Additionally, the plot I am working with is a filled.contour plot and I cannot remove the legend. How can I do this?
>
> Kind Regards,
> James
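For readers on a newer R than the 2.8.1 mentioned in this thread, one possible route to a legend-free, margin-free filled contour is sketched below. It assumes the graphics package provides the bare-bones helper `.filled.contour()` (it is described on the ?filled.contour help page in current releases); `bare_filled_contour` and its arguments are names invented for this sketch, and the built-in `volcano` matrix stands in for the interpolated data.

```r
# Draw only the filled contours on the current device: no legend, no axes,
# no margins. Returns the contour levels invisibly.
bare_filled_contour <- function(x, y, z, nlevels = 20,
                                col_fun = function(n) rev(rainbow(n, start = 0, end = 8/12))) {
  levels <- pretty(range(z, finite = TRUE), nlevels)
  cols <- col_fun(length(levels) - 1)
  # Zero margins and "i" axis style so the data range fills the device exactly.
  op <- par(mar = c(0, 0, 0, 0), xaxs = "i", yaxs = "i")
  on.exit(par(op))
  plot.new()
  plot.window(xlim = range(x), ylim = range(y))
  .filled.contour(as.double(x), as.double(y), z, levels, cols)
  invisible(levels)
}

# Example: write the plot to a file with an explicit size.
out <- tempfile(fileext = ".pdf")
pdf(out, width = 6, height = 4)
lv <- bare_filled_contour(seq_len(nrow(volcano)), seq_len(ncol(volcano)), volcano)
dev.off()
```

A bitmap device such as png(), with width and height given in pixels, can be swapped in for pdf() when an exact pixel size matters.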
[R] everybody loves R...
Just thought I'd let you guys know about this site I stumbled across: http://riki.wikidot.com/

It is obviously in its early stages (as it does not have any content yet) but is looking like a good place to build a simple knowledge base for the R software. Anyway, if any of you have time on your hands, I'm sure they'd appreciate the help.

--
View this message in context: http://www.nabble.com/everybody-loves-R...-tp22103250p22103250.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] everybody loves R...
I don't want to be mean, I really like wikidot, but isn't it a better solution to use the R wiki instead? http://wiki.r-project.org/rwiki/doku.php

G.

On Thu, Feb 19, 2009 at 4:44 PM, thefurryblur <wrcst...@gmail.com> wrote:

> Just thought I'd let you guys know about this site I stumbled across: http://riki.wikidot.com/
>
> It is obviously in its early stages (as it does not have any content yet) but is looking like a good place to build a simple knowledge base for the R software. Anyway, if any of you have time on your hands, I'm sure they'd appreciate the help.

--
Gabor Csardi <gabor.csa...@unil.ch>
UNIL DGM
Re: [R] read.table : how to condition on error while opening file?
Hi Stephane,

see ?try

hth.

Stephane Bourgeois wrote:

> Hi,
>
> I'm using read.table in a loop to read in multiple files. The problem is that when a file is missing there is an error message and the loop is broken; what I'd like to do is to test for the error and simply do next instead of breaking the loop. Does anybody know how to do that?
>
> Example:
>
>   filelist <- c("file1.txt", "file2.txt", "file3.txt")
>   for (i in 1:3) {
>     if (read.table(filelist[i]) == "ERROR LOADING FILE") {  # this is where I do not know how to write the condition
>       print(paste("error opening file ", filelist[i], sep = ""))
>       next
>     } else {
>       tmp <- read.table(filelist[i])
>     }
>   }
>
> Cheers,
> Stephane

--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/42803-8243
F ++49/40/42803-7790
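As a rough illustration of the ?try pointer, the loop from the question could be written along the following lines. `read_files` is a name chosen for this sketch, and collecting the results in a list is an assumption about what is wanted downstream.

```r
# Read every file that can be read; report and skip the ones that cannot.
read_files <- function(filelist) {
  results <- list()
  for (f in filelist) {
    # try() returns an object of class "try-error" instead of stopping the loop.
    tmp <- try(read.table(f), silent = TRUE)
    if (inherits(tmp, "try-error")) {
      message("error opening file ", f)
      next
    }
    results[[f]] <- tmp
  }
  results
}

# Usage, with the file names from the question:
# tables <- read_files(c("file1.txt", "file2.txt", "file3.txt"))
```

R may still print a "cannot open file" warning for each missing file; checking file.exists(f) before reading is a simpler alternative when a missing file is the only failure that needs handling.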
Re: [R] Questions about biglm
The idea of the biglm function is to have only part of the data in memory at a time. You read in part of the data and run biglm on that section of the data, then delete it from memory, load in the next part of the data and use update to include the new data in the analysis, delete that, read in the next group, run update, and repeat until you have processed all the data. The result will then be the same as if you ran lm on the entire dataset (with possible slight differences due to rounding). The bigglm function or code from other packages (SQLiteDF for one) can automate this a bit more.

The code for VIF below uses the model.matrix command, which returns the x matrix for the analysis when used with an lm object. Since biglm is based on the idea of not having all the data in memory at once, I would be very surprised if model.matrix worked with biglm objects, so that code is unlikely to work as is.

One approach is to do VIF and other diagnostics on a subset of the data (random sample, stratified random sample) that fits easily into memory, then, after making decisions about the model based on the diagnostics, run the final model with biglm to get the precise results using the full data set. You can do the diagnostics on a couple of different random subsets to confirm the decisions made.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of dobomode
Sent: Wednesday, February 18, 2009 9:34 PM
To: r-help@r-project.org
Subject: [R] Questions about biglm

Hello folks,

I am very excited to have discovered R and have been exploring its capabilities. R's regression models are of great interest to me as my company is in the business of running thousands of linear regressions on large datasets. I am using biglm to run linear regressions on datasets that are as large as several GB.

I have been pleasantly surprised that biglm runs the regressions extremely fast (one regression may take minutes in SPSS vs seconds in R). I have been trying to wrap my head around biglm and have a couple of questions.

1. How can I get VIFs (Variance Inflation Factors) using biglm? I was able to get VIFs from the regular lm function using this piece of code I found through Google, but have not been able to adapt it to work with biglm. Has anyone been successful in this?

  vif.lm <- function(object, ...) {
    V <- summary(object)$cov.unscaled
    Vi <- crossprod(model.matrix(object))
    nam <- names(coef(object))
    if (k <- match("(Intercept)", nam, nomatch = F)) {
      v1 <- diag(V)[-k]
      v2 <- (diag(Vi)[-k] - Vi[k, -k]^2 / Vi[k, k])
      nam <- nam[-k]
    } else {
      v1 <- diag(V)
      v2 <- diag(Vi)
      warning("No intercept term detected. Results may surprise.")
    }
    structure(v1 * v2, names = nam)
  }

2. How reliable / stable is biglm's update() function? I was experimenting with running regressions on individual chunks of my large dataset, but the coefficients I got were different compared to those obtained from running biglm on the whole dataset. Am I mistaken when I say that update() is intended to run regressions in chunks (when memory becomes an issue with datasets that are too large) and produce identical results to running a single regression on the dataset as a whole?

Thanks!
Dobo
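The chunk-by-chunk idea described in the reply above can be illustrated in base R. This is only a sketch of the principle, not of biglm's internals: it swaps in the simpler technique of accumulating the cross-products X'X and X'y over chunks, whereas biglm, as far as I understand it, updates a QR decomposition. The simulated data and all object names are made up for the example.

```r
set.seed(42)
n <- 300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 0.5 * dat$x2 + rnorm(n)

# Pretend the data only ever arrive in three chunks of 100 rows.
chunks <- split(dat, rep(1:3, each = n / 3))

# Accumulate the sufficient statistics one chunk at a time.
XtX <- matrix(0, 3, 3)
Xty <- numeric(3)
for (chunk in chunks) {
  X <- model.matrix(y ~ x1 + x2, data = chunk)
  XtX <- XtX + crossprod(X)
  Xty <- Xty + crossprod(X, chunk$y)
}
beta_chunked <- drop(solve(XtX, Xty))

# Reference fit with all rows in memory at once.
beta_full <- coef(lm(y ~ x1 + x2, data = dat))
all.equal(unname(beta_chunked), unname(beta_full))
```

The point for question 2 is that the chunks feed a single accumulating fit. Running a separate regression on each chunk gives a different set of coefficients per chunk, and none of those is expected to match the full-data fit.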
[R] an S idiom for ordering matrix by columns?
There's got to be a better way to use order() on a matrix than this:

  y
      2L-035-3 2L-081-23 2L-143-18 2L-189-1 2R-008-5 2R-068-15 3L-113-4 3L-173-2
  398        1         1         2        2        1         1        2        2
  857        1         1         2        2        1         2        2        2
  911        1         1         2        2        1         2        2        2
  383        1         1         2        2        1         1        2        2
  639        1         2         2        1        2         2        1        2
  756        1         2         2        1        2         2        1        2
      3L-186-1 3R-013-7 3R-032-1 3R-169-10 X-002 X-087
  398        1        2        2         2     1     2
  857        1        2        2         2     1     2
  911        1        2        2         2     1     2
  383        1        2        2         2     1     2
  639        2        2        1         2     1     2
  756        2        2        1         2     1     2

  y[order(y[,1],y[,2],y[,3],y[,4],y[,5],y[,6],y[,7],y[,8],y[,9],y[,10],y[,11],y[,12],y[,13],y[,14]),]
      2L-035-3 2L-081-23 2L-143-18 2L-189-1 2R-008-5 2R-068-15 3L-113-4 3L-173-2
  398        1         1         2        2        1         1        2        2
  383        1         1         2        2        1         1        2        2
  857        1         1         2        2        1         2        2        2
  911        1         1         2        2        1         2        2        2
  639        1         2         2        1        2         2        1        2
  756        1         2         2        1        2         2        1        2
      3L-186-1 3R-013-7 3R-032-1 3R-169-10 X-002 X-087
  398        1        2        2         2     1     2
  383        1        2        2         2     1     2
  857        1        2        2         2     1     2
  911        1        2        2         2     1     2
  639        2        2        1         2     1     2
  756        2        2        1         2     1     2

Thanks for any suggestions!

-Aaron
[R] colored maps again
I'm trying to create a colored map that would show the number of students per state. My data frame consists of two columns, state and count. I'm using the following code:

  library(maps)
  map("usa")
  library(plotrix)
  state.col <- color.scale(gre$count, 0, 0, c(0, 1))
  map("state", fill = TRUE, col = state.col)

I'm getting a map, but the values are not being mapped to the correct states. What do I need to do to fix that?
[R] counting strings in a column
Dear All,

I have a query: what is the command to count the number of repeated words in a column? For example:

  a = oranges oranges apples apples grape oranges apple pine

The result should be:

  oranges 3
  apples  3
  grape   1
  pine    1

Is there an easy way to do this?

Thanks,
Nataraju
GM R D
Bangalore

--
No relationship is Static .. You either Step up or Step down
Re: [R] an S idiom for ordering matrix by columns?
How about this:

  x <- matrix(sample(0:1, 100, TRUE), 10)
  # create a list of all the columns to sort
  col.list <- lapply(seq(ncol(x)), function(a) x[, a])
  # now sort the matrix
  x[do.call(order, col.list), ]
        [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
   [1,]    0    0    1    0    0    1    1    0    0     1
   [2,]    0    0    1    0    1    0    1    1    1     1
   [3,]    0    1    0    1    0    1    1    1    1     1
   [4,]    0    1    0    1    1    0    0    0    0     1
   [5,]    1    0    1    0    0    0    0    1    0     0
   [6,]    1    0    1    1    0    0    0    1    0     0
   [7,]    1    0    1    1    0    1    1    1    0     1
   [8,]    1    1    0    0    0    1    0    1    1     0
   [9,]    1    1    0    0    1    1    1    1    0     0
  [10,]    1    1    0    1    0    1    1    0    1     0

On Thu, Feb 19, 2009 at 11:40 AM, Aaron Mackey <ajmac...@gmail.com> wrote:

> There's got to be a better way to use order() on a matrix than this:
>
>   y[order(y[,1],y[,2],y[,3],y[,4],y[,5],y[,6],y[,7],y[,8],y[,9],y[,10],y[,11],y[,12],y[,13],y[,14]),]
>
> Thanks for any suggestions!
>
> -Aaron

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] an S idiom for ordering matrix by columns?
On Thu, Feb 19, 2009 at 5:40 PM, Aaron Mackey <ajmac...@gmail.com> wrote:

> There's got to be a better way to use order() on a matrix than this:
>
>   y[order(y[,1],y[,2],y[,3],y[,4],y[,5],y[,6],y[,7],y[,8],y[,9],y[,10],y[,11],y[,12],y[,13],y[,14]),]
>
> Thanks for any suggestions!
>
> -Aaron

You mean something like this:

  test <- matrix(sample(1:4, 100, replace = T), ncol = 10)
  test[do.call(order, data.frame(test)), ]

?

Regards,
Gustaf

--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address: Essingetorget 40, 112 66 Stockholm, SE
skype: gustaf_rydevik
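Both replies rest on the same idiom: hand order() one argument per column via do.call(). A small self-contained sketch follows, with a made-up four-row matrix whose row names are borrowed from the question; `cols`, `idx` and `y_sorted` are names chosen for the illustration.

```r
# Toy matrix: rows are named so the reordering is easy to follow.
y <- matrix(c(1, 1, 2,
              1, 2, 2,
              1, 1, 1,
              2, 1, 1),
            ncol = 3, byrow = TRUE,
            dimnames = list(c("398", "857", "911", "383"), c("a", "b", "c")))

# One list element per column, in column order, with no names attached.
cols <- lapply(seq_len(ncol(y)), function(j) y[, j])

# Equivalent to order(y[, 1], y[, 2], y[, 3]) for any number of columns.
idx <- do.call(order, cols)
y_sorted <- y[idx, , drop = FALSE]
print(y_sorted)

# The data.frame shortcut gives the same ordering here.
idx_df <- do.call(order, data.frame(y))
```

Building the list without names is a small precaution: do.call() passes list names on as argument names, so a column that happened to be called decreasing or method would be taken as an option of order() rather than as a sort key.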
Re: [R] counting strings in a column
?table

On Thu, Feb 19, 2009 at 11:48 AM, Nattu <natar...@gmail.com> wrote:

> Dear All,
>
> I have a query: what is the command to count the number of repeated words in a column? For example:
>
>   a = oranges oranges apples apples grape oranges apple pine
>
> The result should be:
>
>   oranges 3
>   apples  3
>   grape   1
>   pine    1
>
> Is there an easy way to do this?
>
> Thanks,
> Nataraju
> GM R D
> Bangalore

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
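A short sketch of what ?table leads to, for anyone reading along. The vector below is typed in by hand, and the lone "apple" from the question is written as "apples" on the assumption that it was a typo, since the expected result lists apples three times.

```r
a <- c("oranges", "oranges", "apples", "apples",
       "grape", "oranges", "apples", "pine")

# table() counts how often each distinct value occurs.
counts <- table(a)
print(counts)

# Most frequent first, and the same counts as a two-column data frame.
sort(counts, decreasing = TRUE)
as.data.frame(counts, stringsAsFactors = FALSE)
```

For a column of a data frame the call is the same, for example table(mydf$fruit), where mydf and fruit are placeholder names.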
Re: [R] Questions about biglm
Dear Greg and Dobo,

The vif() function in the car package computes VIFs (and generalized VIFs) from the covariance matrix of the coefficients; I'm not sure whether it will work directly on objects produced by biglm(), but if not it should be easily adapted to do so.

I hope this helps,
John

--
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Greg Snow
Sent: February-19-09 11:35 AM
To: dobomode; r-help@r-project.org
Subject: Re: [R] Questions about biglm

The idea of the biglm function is to have only part of the data in memory at a time. You read in part of the data and run biglm on that section of the data, then delete it from memory, load in the next part of the data and use update to include the new data in the analysis, delete that, read in the next group, run update, and repeat until you have processed all the data. The result will then be the same as if you ran lm on the entire dataset (with possible slight differences due to rounding). The bigglm function or code from other packages (SQLiteDF for one) can automate this a bit more.

The code for VIF below uses the model.matrix command, which returns the x matrix for the analysis when used with an lm object. Since biglm is based on the idea of not having all the data in memory at once, I would be very surprised if model.matrix worked with biglm objects, so that code is unlikely to work as is.

One approach is to do VIF and other diagnostics on a subset of the data (random sample, stratified random sample) that fits easily into memory, then, after making decisions about the model based on the diagnostics, run the final model with biglm to get the precise results using the full data set. You can do the diagnostics on a couple of different random subsets to confirm the decisions made.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of dobomode
Sent: Wednesday, February 18, 2009 9:34 PM
To: r-help@r-project.org
Subject: [R] Questions about biglm

Hello folks,

I am very excited to have discovered R and have been exploring its capabilities. R's regression models are of great interest to me as my company is in the business of running thousands of linear regressions on large datasets. I am using biglm to run linear regressions on datasets that are as large as several GB. I have been pleasantly surprised that biglm runs the regressions extremely fast (one regression may take minutes in SPSS vs seconds in R). I have been trying to wrap my head around biglm and have a couple of questions.

1. How can I get VIFs (Variance Inflation Factors) using biglm? I was able to get VIFs from the regular lm function using this piece of code I found through Google, but have not been able to adapt it to work with biglm. Has anyone been successful in this?

  vif.lm <- function(object, ...) {
    V <- summary(object)$cov.unscaled
    Vi <- crossprod(model.matrix(object))
    nam <- names(coef(object))
    if (k <- match("(Intercept)", nam, nomatch = F)) {
      v1 <- diag(V)[-k]
      v2 <- (diag(Vi)[-k] - Vi[k, -k]^2 / Vi[k, k])
      nam <- nam[-k]
    } else {
      v1 <- diag(V)
      v2 <- diag(Vi)
      warning("No intercept term detected. Results may surprise.")
    }
    structure(v1 * v2, names = nam)
  }

2. How reliable / stable is biglm's update() function? I was experimenting with running regressions on individual chunks of my large dataset, but the coefficients I got were different compared to those obtained from running biglm on the whole dataset. Am I mistaken when I say that update() is intended to run regressions in chunks (when memory becomes an issue with datasets that are too large) and produce identical results to running a single regression on the dataset as a whole?

Thanks!
Dobo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
Re: [R] indicator or deviation contrasts in log-linear modelling
On Wed, 18 Feb 2009, maiya wrote:

> I realise that in the case of loglin the parameters are calculated post festum from the cell frequencies; however, other programmes that use Newton-Raphson as opposed to IPF work the other way round, right? In which case one would expect the output of parameters to be limited to the particular contrast used. But since loglin uses IPF, I would have thought the choice of style of parameter to be output could be made...
>
> Anyway, this is the line that interests me:
>
>     lm( as.vector( loglin(...,fit=TRUE)$fit ) ~ your favored contrasts )
>
> only I'm not proficient enough in R to figure out the last term :( How would I go about this if my preferred contrast is setting the first categories as reference categories?

See An Introduction to R, Chapter 11, and try this:

    for ( i in ls('package:stats',pat='contr[.]')){
        cat( i, '\n' )
        print( get(i)(letters[1:5]) )
        options(contrasts=c(unordered=i,ordered='contr.poly'))
        print( coef( glm( Freq~ Dept*Gender, as.data.frame(UCBAdmissions),family=poisson)) )
    }

> I literally just need the equivalent of
>
>     loglin(matrix(c(1,2,3,4), nrow=2), list(c(1,2)), param=TRUE)
>
> which would give me parameters under indicator contrast. glm... well, I'd have to work on it.
>
> Regarding the more general points:
>
> ad 2) I would have thought that direct inspection of cell frequencies is precisely the wrong/misleading thing to do - the highest order coefficients can be inspected directly in order to see the interaction without the (lower) marginal effects, or alternatively the table can be standardized to uniform margins for the same sort of inspection.

OK, to each her own. But try this out yourself. What is the story here? (Review ?UCBAdmissions, if you need to.)

    options(contrasts=c(unordered='contr.sum',ordered='contr.poly'))
    print( cbind(coef( glm( Freq~ Admit*Dept*Gender, as.data.frame(UCBAdmissions),family=poisson)) ))
                                 [,1]
    (Intercept)           4.786575880
    Admit1               -0.277614562
    Dept1                 0.067824911
    Dept2                -0.758615446
    Dept3                 0.560293364
    Dept4                 0.446131873
    Dept5                -0.001254892
    Gender1               0.355262130
    Admit1:Dept1          0.786694268
    Admit1:Dept2          0.599494828
    Admit1:Dept3         -0.021374963
    Admit1:Dept4         -0.053867688
    Admit1:Dept5         -0.250913079
    Admit1:Gender1       -0.050744703
    Dept1:Gender1         0.782600986
    Dept2:Gender1         1.216370861
    Dept3:Gender1        -0.646880514
    Dept4:Gender1        -0.308737151
    Dept5:Gender1        -0.691810320
    Admit1:Dept1:Gender1 -0.212274286
    Admit1:Dept2:Gender1 -0.004260932
    Admit1:Dept3:Gender1  0.081975109
    Admit1:Dept4:Gender1  0.030247904
    Admit1:Dept5:Gender1  0.100791458

OK, got the whole story? Could you explain it to someone who is not a statistician?

Now try it again, but with this display:

    ftable(UCBAdmissions)
                    Dept   A   B   C   D   E   F
    Admit    Gender
    Admitted Male        512 353 120 138  53  22
             Female       89  17 202 131  94  24
    Rejected Male        313 207 205 279 138 351
             Female       19   8 391 244 299 317

    round( ftable(prop.table(UCBAdmissions,2:3)) ,2)
                    Dept    A    B    C    D    E    F
    Admit    Gender
    Admitted Male        0.62 0.63 0.37 0.33 0.28 0.06
             Female      0.82 0.68 0.34 0.35 0.24 0.07
    Rejected Male        0.38 0.37 0.63 0.67 0.72 0.94
             Female      0.18 0.32 0.66 0.65 0.76 0.93

You can pretty easily see that admission rates vary by department, that all departments but one have pretty equal admission rates by gender, and that in that department the rate is about 20 percentage points higher for females. (And yes, a significance test confirms this.) Maybe not as statistified as talking about three-way interactions and coefficients of products of contrasts, but I'll bet a lot of scientists would find the tables more compelling.

HTH,
Chuck

> ad 3) and yes, I figured as much! I can't see how lower order terms can be interpreted at all if higher order interactions exist. I've seen it done, e.g. I've seen it claimed that in a standardized table the lower order terms are all equal to zero, which is of course not true?
>
> Thanks!
> Maja
> --
> View this message in context: http://www.nabble.com/indicator-or-deviation-contrasts-in-log-linear-modelling-tp22090104p22093070.html
> Sent from the R help mailing list archive at Nabble.com.

Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
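For the 2 x 2 table in the question, one way to get parameters under indicator (first-category reference) contrasts might be to fit the saturated Poisson model with glm() and contr.treatment. This is only a sketch: the dimension names A and B and the object names m, d and fit are made up for illustration and do not come from the thread.

```r
# 2 x 2 table from the question, with illustrative dimension names
m <- matrix(c(1, 2, 3, 4), nrow = 2,
            dimnames = list(A = c("a1", "a2"), B = c("b1", "b2")))

# Long format: one row per cell, counts in the Freq column
d <- as.data.frame(as.table(m))

# Saturated log-linear model; contr.treatment takes the first level
# of each factor as the reference category
fit <- glm(Freq ~ A * B, family = poisson, data = d,
           contrasts = list(A = "contr.treatment", B = "contr.treatment"))
coef(fit)

# For comparison: loglin() reports deviation (sum-to-zero) parameters
loglin(m, margin = list(c(1, 2)), param = TRUE, print = FALSE)$param
```

Because the model is saturated, the treatment-contrast coefficients are just log cell counts relative to the reference cell: log(2/1) and log(3/1) for the main effects and log((4*1)/(2*3)) for the interaction.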
Re: [R] Unadulterated plot
James Nicolson wrote:

> To All,
>
> I'm using R 2.8.1. I have attached two images: one shows what I get and the other shows what I want. All I want is the plotting region. Surely there must be a way of plotting individual regions/components in R?
>
> Patrizio, I prefer filled.contour() for my data. I still have the same problem (a legend to the right) if I use image() then contour()!

I don't see why you'd get a legend in that case. But it's hard to get the contour lines to match the colour changes when using image() with contour(), so I can see why you might prefer filled.contour().

I think you're going to have to edit filled.contour() to do what you want. It's not too hard: just comment out the code that sets up the layout, then everything until just before the second plot.new(). Insert par(mar=c(0,0,0,0)), and you'll get a filled contour plot with no margins.

Duncan Murdoch

> Regards,
> James
>
> ps. No need for apologies Simon :D I'm grateful for any assistance, fruitless or not.
>
> Patrizio Frederic wrote:
>
> James, as I previously told you in my broken English, the function you're looking for is probably not filled.contour() but image() and contour(). The following code does exactly what you ask for:
>
>     library(akima)
>     data(akima)
>     akima.smooth <- with(akima, interp(x, y, z, xo=seq(0,25, length=500), yo=seq(0,20, length=500)))
>     op <- par(ann=FALSE, mai=c(0,0,0,0))
>     image(akima.smooth, main = "interp(<akima data>, *) on finer grid")
>     contour(akima.smooth, add = TRUE, drawlabels=FALSE)
>
> cheers
> Patrizio
>
> 2009/2/19 James Nicolson jlnicol...@gmail.com:
>
> good point! Provide your own set of x,y,z co-ords; mine are pretty big but you can use any.
>
>     library(akima)
>     fr3d = data.frame(x,y,z)
>     xtrp <- interp(fr3d$x,fr3d$y,fr3d$z,linear=FALSE,extrap=TRUE,duplicate="strip")
>     op <- par(ann=FALSE, mai=c(0,0,0,0))
>     filled.contour(xtrp$x, xtrp$y, xtrp$z, asp = 0.88402366864, col = rev(rainbow(28,start=0, end=8/12)), n = 40)
>     par(op)
>
> I tried all these settings too (none of them made a difference)...
>
>     usr=c(0,845,0,747), mfcol=c(1,1), mfrow=c(1,1), oma=c(0,0,0,0), omi=c(0,0,0,0), plt=c(1,1,1,1)
>
> Regards
> James
>
> Peter Alspach wrote:
>
> Kia ora James
>
> I think it would be easier to provide you with help if you provide commented, minimal, self-contained, reproducible code [see bottom of this, or any, email to R-help].
>
> Hei kona ra ...
> Peter Alspach
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of James Nicolson
> Sent: Thursday, 19 February 2009 11:22 a.m.
> To: r-help@r-project.org
> Subject: Re: [R] Unadulterated plot
>
> Hi,
>
> Thanks for your help. I have looked at the beginners' documentation, and while there are options to configure various aspects of the plot, none of them seem to have the desired effect. I have managed to ensure that the plot fills the space vertically with no margins, no axes etc. (using mai=c(0,0,0,0)). However, horizontally there remains a margin to the right that pads the space between the filled.contour and its legend. I've tried options to par and filled.contour but I can't seem to remove the legend.
>
> Kind Regards,
> James
>
> Simon Pickett wrote:
>
> Hi James,
>
> What you really need to do is to check out the many freely available pdfs for R beginners. Here is a good place to start: http://cran.r-project.org/other-docs.html
>
> If I am interpreting what you want correctly, I think you need to create a blank plot with no axes, axis labels etc. Try
>
>     plot(x,y,xlab="",ylab="",xaxt="n",yaxt="n",type="n") #blank plot
>     points(x,y)
>
> Type ?par into R and see how you can set parameters like this up as the default.
>
> Hope this helps?
> Simon.
>
> ----- Original Message -----
> From: James Nicolson jlnicol...@gmail.com
> To: r-help@r-project.org
> Sent: Sunday, February 15, 2009 10:29 PM
> Subject: [R] Unadulterated plot
>
> To all,
>
> Apologies if this question has already been asked, but I can't find anything, and I can't think of more specific search terms. I want to display/create a file of a pure plot with a specific height and width. I want to utilise every single pixel inside the axes. I do not want to display any margins, legends, axes, titles or spaces around the edges. Is this possible? Additionally, the plot I am working with is a filled.contour plot and I cannot remove the legend. How can I do this?
>
> Kind Regards,
> James
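As an alternative to editing filled.contour() by hand, current versions of R document a bare-bones helper, .filled.contour(), that draws only the coloured regions into an existing plot region. The sketch below assumes that helper is available in your R version; the wrapper name bare_filled_contour and its default x/y grids are made up for illustration, and volcano merely stands in for the interpolated data.

```r
# Draw only the filled-contour region: no legend, axes, titles or margins.
# z is a numeric matrix; x and y are the grid coordinates of its rows/columns.
bare_filled_contour <- function(z, nlevels = 20,
                                x = seq(0, 1, length.out = nrow(z)),
                                y = seq(0, 1, length.out = ncol(z))) {
  levels <- pretty(range(z, finite = TRUE), nlevels)
  cols <- rev(rainbow(length(levels) - 1, start = 0, end = 8 / 12))
  op <- par(mar = c(0, 0, 0, 0))      # no margins around the plot region
  on.exit(par(op))                    # restore the old settings afterwards
  plot.new()
  # xaxs/yaxs = "i" removes the default 4% padding around the data range
  plot.window(xlim = range(x), ylim = range(y), xaxs = "i", yaxs = "i")
  .filled.contour(x, y, z, levels = levels, col = cols)
  invisible(levels)
}
```

To get a file of a given pixel size, open a device of that size first, e.g. png("plot.png", width = 845, height = 747), call bare_filled_contour(volcano), then dev.off().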
Re: [R] an S idiom for ordering matrix by columns?
Thanks to all, "do.call(order, as.data.frame(y))" was the idiom I was missing!

-Aaron

On Thu, Feb 19, 2009 at 11:52 AM, Gustaf Rydevik gustaf.ryde...@gmail.com wrote:

> On Thu, Feb 19, 2009 at 5:40 PM, Aaron Mackey ajmac...@gmail.com wrote:
>
> > There's got to be a better way to use order() on a matrix than this:
> >
> >     > y
> >         2L-035-3 2L-081-23 2L-143-18 2L-189-1 2R-008-5 2R-068-15 3L-113-4 3L-173-2
> >     398        1         1         2        2        1         1        2        2
> >     857        1         1         2        2        1         2        2        2
> >     911        1         1         2        2        1         2        2        2
> >     383        1         1         2        2        1         1        2        2
> >     639        1         2         2        1        2         2        1        2
> >     756        1         2         2        1        2         2        1        2
> >         3L-186-1 3R-013-7 3R-032-1 3R-169-10 X-002 X-087
> >     398        1        2        2         2     1     2
> >     857        1        2        2         2     1     2
> >     911        1        2        2         2     1     2
> >     383        1        2        2         2     1     2
> >     639        2        2        1         2     1     2
> >     756        2        2        1         2     1     2
> >
> >     > y[order(y[,1],y[,2],y[,3],y[,4],y[,5],y[,6],y[,7],y[,8],y[,9],y[,10],y[,11],y[,12],y[,13],y[,14]),]
> >         2L-035-3 2L-081-23 2L-143-18 2L-189-1 2R-008-5 2R-068-15 3L-113-4 3L-173-2
> >     398        1         1         2        2        1         1        2        2
> >     383        1         1         2        2        1         1        2        2
> >     857        1         1         2        2        1         2        2        2
> >     911        1         1         2        2        1         2        2        2
> >     639        1         2         2        1        2         2        1        2
> >     756        1         2         2        1        2         2        1        2
> >         3L-186-1 3R-013-7 3R-032-1 3R-169-10 X-002 X-087
> >     398        1        2        2         2     1     2
> >     383        1        2        2         2     1     2
> >     857        1        2        2         2     1     2
> >     911        1        2        2         2     1     2
> >     639        2        2        1         2     1     2
> >     756        2        2        1         2     1     2
> >
> > Thanks for any suggestions!
> >
> > -Aaron
>
> You mean something like this:
>
>     test <- matrix(sample(1:4,100,replace=T),ncol=10)
>     test[do.call(order,data.frame(test)),]
>
> ?
>
> Regards,
> Gustaf
>
> --
> Gustaf Rydevik, M.Sci.
> tel: +46(0)703 051 451
> address: Essingetorget 40, 112 66 Stockholm, SE
> skype: gustaf_rydevik
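The idiom works because as.data.frame() turns each matrix column into a list element, and do.call() then hands those elements to order() as separate sort keys. A small self-contained sketch, using the first six columns of the matrix shown in the thread (the object names y and y_sorted are illustrative):

```r
# First six columns of the matrix from the thread, row names as in the post
y <- rbind("398" = c(1, 1, 2, 2, 1, 1),
           "857" = c(1, 1, 2, 2, 1, 2),
           "911" = c(1, 1, 2, 2, 1, 2),
           "383" = c(1, 1, 2, 2, 1, 1),
           "639" = c(1, 2, 2, 1, 2, 2),
           "756" = c(1, 2, 2, 1, 2, 2))

# Each column becomes one argument to order(): ties in column 1 are
# broken by column 2, then column 3, and so on, for any number of columns
y_sorted <- y[do.call(order, as.data.frame(y)), , drop = FALSE]
rownames(y_sorted)
```

The resulting row order (398, 383, 857, 911, 639, 756) is the same as the one produced by the long y[order(y[,1], y[,2], ...), ] call in the original question.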
Re: [R] matrix computation???
You are computing the hat matrix to predict stack.loss, but stack.loss is a column in the A matrix, so your predictions are all perfect (given stack.loss, what is stack.loss? A fairly simple answer: all errors are 0). I think you want to redo this using only the 3 columns other than stack.loss in the call to svd(); then you should get the results that you are expecting.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kutlwano Ramaboa
> Sent: Wednesday, February 18, 2009 10:43 PM
> To: r-help@r-project.org
> Subject: [R] matrix computation???
>
> Hello
>
> Can anyone tell me what I am doing wrong below? My Y and y_hat are the same.
>
>     A <- scale(stackloss)
>     n1 <- dim(A)[1]; n2 <- dim(A)[2]
>     X <- svd(A)
>     Y <- matrix(A[,"stack.loss"],nrow=n1)
>     Y
>     y_hat <- matrix((X$u %*% t(X$u)) %*% Y, nrow=n1, byrow=T)
>     y_hat
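A minimal sketch of the suggested fix, taking the SVD of the three predictor columns only. The object names X, U and fit are illustrative; lm.fit() is used only to check that the projection matches ordinary least squares without an intercept (the columns are already centred by scale()).

```r
A <- scale(stackloss)
X <- A[, colnames(A) != "stack.loss"]   # the 3 predictor columns only
Y <- A[, "stack.loss", drop = FALSE]

U <- svd(X)$u                    # orthonormal basis for the column space of X
y_hat <- U %*% crossprod(U, Y)   # U U' Y: projection of Y onto that space

# Same fitted values as a least-squares fit on the same columns
fit <- lm.fit(X, Y)
all.equal(as.vector(y_hat), as.vector(fit$fitted.values))
```

With stack.loss left out of the decomposition, y_hat no longer reproduces Y exactly, which is the behaviour the original poster was expecting.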
Re: [R] colored maps again
Here's a bit of a hacky way to do it:

    library(maps)
    #get the names of each state
    state = map('state',plot=F)$names
    #set up some random state-color data
    cols = data.frame(state=state, x=sample(1:10,length(state),replace=T))
    #do the plot
    map('usa')
    for(this_state in state){
        map('state',region=this_state,add=T,fill=T,col=cols$x[cols$state==this_state])
    }

On Thu, Feb 19, 2009 at 12:45 PM, Alina Sheyman alina...@gmail.com wrote:

> I'm trying to create a colored map that would show the number of students per state. My data frame consists of two columns - state and count. I'm using the following code:
>
>     library(maps)
>     map("usa")
>     library(plotrix)
>     state.col <- color.scale(gre$count,0,0,c(0,1))
>     map("state",fill=TRUE,col=state.col)
>
> I'm getting a map, but the values are not being mapped to the correct states. What do I need to do to fix that?

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University
www.thatmike.com

Looking to arrange a meeting? Check my public calendar: http://www.thatmike.com/mikes-public-calendar

~ Certainty is folly... I think. ~
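The mismatch in the question most likely arises because map("state", fill = TRUE, col = ...) applies colours in polygon order, and some states are drawn as several polygons. A sketch of the matching step in base R only: the polygon names below are a short hand-typed subset of what map("state", plot = FALSE)$names is believed to return, and the contents of gre are invented for illustration.

```r
# Hand-typed subset of polygon names, so the sketch runs without the maps package
poly_names <- c("massachusetts:martha's vineyard", "massachusetts:main",
                "massachusetts:nantucket", "michigan:north", "michigan:south",
                "ohio")

# Hypothetical data: one row per state, in arbitrary order
gre <- data.frame(state = c("Ohio", "Michigan", "Massachusetts"),
                  count = c(12, 30, 7))

# Strip the ":sub-region" suffix, then look up each polygon's state
poly_state <- sub(":.*$", "", poly_names)
idx <- match(poly_state, tolower(gre$state))
poly_count <- gre$count[idx]        # one value per polygon, in polygon order

# Grey ramp: light = few students, dark = many
state_col <- grey(1 - poly_count / max(poly_count))

# With the maps package loaded, the colours would then be used as:
# map("state", fill = TRUE, col = state_col)
```

States missing from the data give NA in idx; they would need a fallback colour before plotting.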
Re: [R] problem with using %in% condition while using in if() condition
The %in% operator returns a vector of logicals the same length as the vector on its left. The if program-flow statement expects a single logical value, not a vector; since you are giving it a vector, it looks at just the first element, ignores the rest, and gives the warning. This warning should be taken seriously, since it indicates that what is happening and what you intend probably do not match.

If you tell us more about what you are trying to do, we can give more help. Some possibilities: use any() or all() to reduce the vector to a single logical value, or use ifelse(), subscripting, or some other method to accomplish what you want.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of venkata kirankumar
> Sent: Wednesday, February 18, 2009 11:15 PM
> To: r-help@r-project.org
> Subject: [R] problem with using %in% condition while using in if() condition
>
> Hi all,
>
> I have a problem with using %in% inside an if() condition. I used the condition
>
>     if(SubFinSpt$SPECIMENTYP %in% CAP$SPECIMENTYP)
>
> This if() condition is inside an else branch. Here SubFinSpt$SPECIMENTYP has only one value, but CAP$SPECIMENTYP has nearly 20 SPECIMENTYPs. When applying this condition I get a warning that says only the first element is checked; after the warning it executes normally and gives results. I want to know why it gives this warning. Can anyone explain why it occurs and how to resolve it?
>
> Thanks in advance
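The two suggestions above can be sketched as follows. The specimen-type values and the names cap_types and check_one are invented for illustration. Note that recent versions of R (4.2.0 and later) turn the length-greater-than-one condition into an error rather than a warning, so collapsing the vector explicitly matters even more there.

```r
cap_types <- c("BLOOD", "URINE", "TISSUE")   # hypothetical lookup values

check_one <- function(x) {
  # any() collapses the logical vector from %in% to a single TRUE/FALSE,
  # which is what if() expects
  if (any(x %in% cap_types)) "at least one match" else "no match"
}
check_one(c("SALIVA", "URINE"))
check_one("SALIVA")

# To act on each element separately, use the vectorised ifelse() instead
ifelse(c("SALIVA", "URINE") %in% cap_types, "known", "unknown")
```

Use all() in place of any() if every element on the left must be present in the lookup vector.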
Re: [R] How to connect R and WinBUGS/OpenBUGS/LinBUGS in Linux in Feb. 2009
I do not know about the Ubuntu instructions; they would not help me on SuSE. My Wine version is 1.1.9. I thought that was the latest, but when I checked, the latest is 1.1.15, which does indeed throw the BlackBox error. So now it does not work for me either. Sorry I gave bad advice.

Kees

On Thu, 19 Feb 2009 01:38:15 +0100, Paul Heinrich Dietrich paul.heinrich.dietr...@gmail.com wrote:

> I went to this webpage (http://ubuntuforums.org/showthread.php?t=624644) and followed the instructions to the letter on getting the latest Wine. I installed WinBUGS again, but this time I cannot open it in Wine. It says Black Box, Trap #101, and some text I can't copy/paste here. Is this the latest Wine that you have, or something different? Thanks.
>
> chaogai-2 wrote:
>
> Hi,
>
> For me, running WinBUGS through Wine just works, even when I do not specify any directories. The example given in the bugs help file was my starting point. My setup is SuSE 11.1, latest Wine, R, R2WinBUGS and WinBUGS. I assume you first tried without specifying directories? The directories you use do not work for me, with WINEPATH the culprit. If you do not have the latest Wine, I advise you to upgrade and not specify directories.
>
> Good luck,
> Kees
>
> On Wed, 18 Feb 2009 01:27:18 +0100, Paul Heinrich Dietrich paul.heinrich.dietr...@gmail.com wrote:
>
> Hi Uwe,
>
> Thank you for your guidance. I have installed R2WinBUGS and WinBUGS14 under Wine. Using ?bugs for help, it tells me:
>
>     useWINE:  logical; attempt to use the Wine emulator to run 'WinBUGS', defaults to 'FALSE' on Windows, and 'TRUE' otherwise. Not available in S-PLUS.
>     WINE:     character, path to 'wine' binary file, it is tried hard (by a guess and the utilities 'which' and 'locate') to get the information automatically if not given.
>     newWINE:  Use new versions of Wine that have 'winepath' utility
>     WINEPATH: character, path to 'winepath' binary file, it is tried hard (by a guess and the utilities 'which' and 'locate') to get the information automatically if not given.
>
> ...and the following code is a simple Bayesian version of a t-test:
>
>     ### Directory Paths ###
>     MyModelPath <- "/home/me/Compound/R/WinBUGS/"
>     MyBUGSPath <- "/home/me/.wine/drive_c/Program Files/WinBUGS14/"
>     MyModelFile <- paste(MyModelPath, "model.bug", sep="")
>     WINEPATH <- "/usr/bin/wine"
>
>     ### Create Data Set ###
>     # Here is some fake data
>     n_draws <- 50
>     x <- round(runif(n_draws, 1, 2))
>     y <- ifelse(x == 1, rnorm(n_draws, 1, 1), rnorm(n_draws, 1.2, 0.8))
>     MyData <- as.data.frame(cbind(y, x))
>     y.n <- NROW(MyData$y)
>     x.j <- length(unique(x))
>     summary(MyData)
>
>     ### Format Data for WinBUGS ###
>     MyBUGSData <- list(y=MyData$y, x=MyData$x, n=y.n, x.j=x.j)
>     MyBUGSData
>
>     ### WinBUGS Model File ###
>     library(R2WinBUGS)
>     cat("model {
>         for (i in 1:n) {
>             y[i] ~ dnorm(mu[i], tau)
>             mu[i] <- alpha + beta[x[i]]
>         }
>         ### STZ (Sum-To-Zero) Constraints
>         beta[1] <- -sum(beta[2:x.j])
>         ### Priors
>         alpha ~ dnorm(0.0, 1.0E-4)
>         for (i in 2:x.j) { beta[i] ~ dnorm(0.0, 1.0E-4) }
>         tau ~ dgamma(0.01, 0.01)
>         precision <- sqrt(1/tau)
>     }", file=MyModelFile)
>     file.show(MyModelFile)
>
>     ### WinBUGS Model ###
>     MyModel <- bugs(MyBUGSData, inits=NULL, model.file=MyModelFile,
>         parameters.to.save=c("alpha", "beta", "precision"),
>         n.chains=3, n.iter=2000, n.burnin=1000, n.thin=1, codaPkg=TRUE,
>         bugs.directory = MyBUGSPath, working.directory=MyModelPath,
>         useWINE=TRUE, WINEPATH=WINEPATH, debug=TRUE)
>
> The output says:
>
>     ERROR: cannot open the connection
>
> I'm wondering if I've misinterpreted how to set my paths with Wine, because I can go to the following path, double-click on WinBUGS14.exe, and open it just fine:
>
>     /home/me/.wine/drive_c/Program Files/WinBUGS14/
>
> I can also go to Applications > Wine > Browse C:\ Drive and navigate to WinBUGS. Please help if I've done something wrong. Thanks.
[R] Source code for nlm()
Hi,

Where can I find the source code for nlm()? I downloaded the R2.8.1.tar.gz file and looked at all the .c and .f files, but couldn't find either nlm.c or nlm.f. There is an nlm.r file, but that is not useful.

Thanks for any help,
Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvarad...@jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
Re: [R] Python and R
On Thu, Feb 19, 2009 at 8:30 AM, Esmail Bonakdarian esmail...@gmail.com wrote:

> Hi Kenn,
>
> Thanks for the suggestions. I'll have to see if I can figure out how to convert the relatively simple call to lm with an equation and the data file to the functions you mention (or if that's even feasible).

    X <- model.matrix(formula, data)

will calculate the X matrix for you.

> Not an expert in statistics myself, I am mostly concentrating on the programming aspects of R. The problem is that I suspect my colleagues who are providing some guidance with the stats end are not quite experts themselves, and certainly new to R.
>
> Cheers,
> Esmail
>
> Kenn Konstabel wrote:
>
> lm does lots of computations, some of which you may never need. If speed really matters, you might want to compute only those things you will really use. If you only need coefficients, then using %*%, solve and crossprod will be remarkably faster than lm.
>
>     # repeating someone else's example
>     # lm(DAX~., EuStockMarkets)
>     y <- EuStockMarkets[,"DAX"]
>     x <- EuStockMarkets
>     x[,1] <- 1
>     colnames(x)[1] <- "Intercept"
>     lm(y ~ x-1)
>     solve(crossprod(x), t(x))%*%y    # probably this can be done more efficiently
>
>     # and a naive timing
>     > system.time( for(i in 1:1000) lm(y ~ x-1))
>        user  system elapsed
>       14.64    0.33   32.69
>     > system.time(for(i in 1:1000) solve(crossprod(x), crossprod(x,y)) )
>        user  system elapsed
>        0.36    0.00    0.36
>
> Also, lsfit() is a bit quicker than lm or lm.fit.
>
> Regards,
> Kenn
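Putting the two suggestions together: model.matrix() builds the design matrix from the same formula one would give to lm(), and the coefficients then follow from the normal equations. A sketch under the assumption that only the coefficients are needed; the object names d, X, b and fit are illustrative.

```r
d <- as.data.frame(EuStockMarkets)

# Design matrix for DAX ~ SMI + CAC + FTSE, including the intercept column
X <- model.matrix(DAX ~ ., data = d)
y <- d$DAX

# Normal equations: solve (X'X) b = X'y for b
b <- solve(crossprod(X), crossprod(X, y))

# Compare with lm(), which uses a QR decomposition instead
fit <- lm(DAX ~ ., data = d)
cbind(normal_equations = as.vector(b), lm = unname(coef(fit)))
```

The normal-equations route is faster but numerically less stable than the QR approach used by lm(), so the two sets of coefficients may differ in the last few digits when the predictors are strongly correlated.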
Re: [R] read.table : how to condition on error while opening file?
Hello Stephane,

here is something you could try:

    filelist <- c("file1.txt", "file2.txt", "file3.txt")
    for (i in 1:3) {
        tmpList <- try(read.table(filelist[[i]]), silent=TRUE)
        if (inherits(tmpList, "try-error")) {
            print(paste("error opening file ", filelist[[i]]))
        } else {
            tmp <- read.table(filelist[[i]]) -> namelist[[i]]
        }
    }

There is, though, a problem that I didn't manage to fix. Suppose file1.txt exists, file2.txt doesn't exist and file3.txt exists: the data frame from file 1 will at first be called tmp, but then it will be replaced by the data frame from file 3. It is as if you did

    c(1,2,3,4) -> tmp

and then

    c(1,6,7,8) -> tmp

so that the second tmp replaces the first one.

Hope this helps,
Laura

> ----- Original message -----
> From: e.vettora...@uke.uni-hamburg.de
> Date: 19.02.2009 17.23
> To: Stephane Bourgeois s...@sanger.ac.uk
> Cc: r-help@r-project.org
> Subject: Re: [R] read.table : how to condition on error while opening file?
>
> Hi Stephane,
>
> see ?try
>
> hth.
>
> Stephane Bourgeois wrote:
>
> > Hi,
> >
> > I'm using read.table in a loop, to read in multiple files. The problem is that when a file is missing there is an error message and the loop is broken; what I'd like to do is to test for the error and simply do "next" instead of breaking the loop. Does anybody know how to do that?
> >
> > Example:
> >
> >     filelist <- c("file1.txt", "file2.txt", "file3.txt")
> >     for (i in 1:3) {
> >         if (read.table(filelist[i]) == "ERROR LOADING FILE") {  # this is where I do not know how to write the condition
> >             print(paste("error opening file ", filelist[i], sep=""))
> >             next
> >         } else {
> >             tmp <- read.table(filelist[i])
> >         }
> >     }
> >
> > Cheers,
> > Stephane
>
> --
> Eik Vettorazzi
> Institut für Medizinische Biometrie und Epidemiologie
> Universitätsklinikum Hamburg-Eppendorf
> Martinistr. 52
> 20246 Hamburg
> T ++49/40/42803-8243
> F ++49/40/42803-7790
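One way around the overwriting problem described above is to collect the results in a named list, so each file that loads keeps its own slot, and to skip failures with tryCatch(). A minimal sketch; the function name read_all is made up for illustration.

```r
read_all <- function(files) {
  out <- list()
  for (f in files) {
    # tryCatch() returns NULL instead of stopping when read.table() fails
    tab <- tryCatch(read.table(f), error = function(e) {
      message("error opening file ", f, ": ", conditionMessage(e))
      NULL
    })
    if (is.null(tab)) next   # skip this file and carry on with the loop
    out[[f]] <- tab          # stored under the file name, nothing is overwritten
  }
  out
}
```

Calling tables <- read_all(c("file1.txt", "file2.txt", "file3.txt")) would then return a list with one data frame per readable file, accessible as tables[["file1.txt"]] and so on; read.table() may still print its own "cannot open file" warning for each missing file.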
[R] TAR Models and predictive residuals
Hello R users. There is a paper from Ruey Tsay with the title: Testing and Modelling Threshold Autoregressive Processes, published in 1989 in the Journal of the American Statistical Association (March, Vol. 84, No. 405). Mr. Tsay describes a very interesting way of identifying and modelling threshold AR processes. 1. Is there a package in R or some routines, which implements his ideas and his methodology? 2. Is there a routine in R to calculate the predictive residuals (like defined in the paper)? Thanks in advance. Regards, Andreas. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Doing pairwise comparisons using either Duncan, Tukey or LSD
Hi,

I have a basic and simple question on how to code pairwise (multiple) mean comparisons between levels of a factor using one of Duncan, Tukey or LSD.

Thanks in advance,
Saeed

--
View this message in context: http://www.nabble.com/Doing-pairwise-comparisons-using-either-Duncan%2C-Tukey-or-LSD-tp22104786p22104786.html
Sent from the R help mailing list archive at Nabble.com.
[R] Using R in Java?
Hi,

2 questions:

1. Is there a package that will allow me to run R scripts (entirely) from Java?

2. If so, is there a way to capture the output of those scripts (including images) and embed them in my SWT Java app?

My challenge is that I have a Java app that does some statistical chores. It would be fantastic if the users could use their R skills to modify a script in whatever R environment they like, and then my app could use that script to calculate the results and display them in the app.

I have found StatET and JavaGD with rJava/JRI and read through all the docs... it seems possible that some combination may give me what I want, but it's not very clear.

Any suggestions, anyone?

Thanks!

--
View this message in context: http://www.nabble.com/Using-R-in-Java--tp22104843p22104843.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Doing pairwise comparisons using either Duncan, Tukey or LSD
On 2/19/2009 11:51 AM, Saeed Ahmadi wrote:

> Hi,
>
> I have a basic and simple question on how to code pairwise (multiple) mean comparisons between levels of a factor using one of Duncan, Tukey or LSD.

Here is one approach:

    library(multcomp)
    summary(glht(lm(Petal.Width ~ Species, data = iris), linfct = mcp(Species = "Tukey")))

> Thanks in advance,
> Saeed

--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
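If installing multcomp is not an option, base R offers a similar route through TukeyHSD() on an aov() fit. The sketch below uses the same iris example; the object names fit and tk are illustrative. The unadjusted pairwise.t.test() call is only an LSD-style comparison (pooled standard deviation, no multiplicity correction), not a full protected-LSD procedure.

```r
# One-way ANOVA, then Tukey's honest significant differences
fit <- aov(Petal.Width ~ Species, data = iris)
tk <- TukeyHSD(fit, "Species", conf.level = 0.95)
tk

# LSD-style comparisons: pairwise t tests with pooled SD and no adjustment
pairwise.t.test(iris$Petal.Width, iris$Species, p.adjust.method = "none")
```

tk$Species is a matrix with one row per pair of species and columns for the difference, the confidence limits and the adjusted p-value; plot(tk) draws the intervals.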
Re: [R] Source code for nlm()
It seems to be in optimize.c.

Rgonzui has a very nice search facility for the source of R or CRAN packages (however, it is against the R 2.8.0 source):

http://rgonzui.nakama.ne.jp/R/markup/R-2.8.0/src/main/optimize.c?fm=cq=nlm#l378

-Christos

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ravi Varadhan
> Sent: Thursday, February 19, 2009 1:00 PM
> To: r-help@r-project.org
> Subject: [R] Source code for nlm()
>
> Hi,
>
> Where can I find the source code for nlm()? I downloaded the R2.8.1.tar.gz file and looked at all the .c and .f files, but couldn't find either nlm.c or nlm.f. There is an nlm.r file, but that is not useful.
>
> Thanks for any help,
> Ravi.
>
> ---
> Ravi Varadhan, Ph.D.
> Assistant Professor, The Center on Aging and Health
> Division of Geriatric Medicine and Gerontology
> Johns Hopkins University
> Ph: (410) 502-2619
> Fax: (410) 614-9625
> Email: rvarad...@jhmi.edu
> Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html