[R] Imputing missing values using LSmeans (i.e., population marginal means) - advice in R?
Hi folks, I have a dataset that consists of counts over a ~30 year period at multiple (200) sites. Only one count is conducted at each site in each year; however, not all sites are surveyed in all years. I need to impute the missing values because I need an estimate of the total population size (i.e., sum of counts across all sites) in each year as input to another model. head(newdat,40) SITE YEAR COUNT 1 1 1975 12620 2 1 1976 13499 3 1 1977 45575 4 1 1978 21919 5 1 1979 33423 ... 372 1975 4 382 1978 40322 392 1979 7 402 1980 16244 It was suggested to me by a statistician to use LSmeans to do this; however, I do not have SAS, nor do I know anything much about SAS. I have spent DAYS reading about these LSmeans and while (I think) I understand what they are, I have absolutely no idea how to a) calculate them in R and b) how to use them to impute my missing values in R. Again, I've searched the mail lists, internet and literature and have not found any documentation to advise on how to do this - I'm lost. I've looked at popMeans, but have no clue how to use this with predict() - if this is even the route to go. Any advice would be much appreciated. Note that YEAR will be treated as a factor and not a linear variable (i.e., the relationship between COUNT and YEAR is not linear - rather there are highs and lows about every 10 or so years). One thought I did have was to just set up a loop to calculate the least-squares estimates as: Yij = (IYi + JYj - Y)/[(I-1)(J-1)] where I = number of treatments and J = number of blocks (so I = sites and J = years). I found this formula in some stats lecture handouts by UC Davis on unbalanced data and LSMeans...but does it yield the same thing as using the LSmeans estimates? Does it make any sense? Thoughts? Many thanks in advance. Jenn __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A trivial plot gives open circles as the plot char but another one gives me qs . . why is this?
On Apr 3, 2012, at 07:19 , Philip Rhoades wrote: People, On my home computer with a real plot I get what I expect - open circles as the plot character - on my work computer I get qs ! So I tried a trivial test plot on the work computer ie: Argh. I have forgotten this before... There was a configuration issue leading to plotting symbols showing up as q, but I can't recall the details. Possibly libpango-related (do you have it installed?) -pd a - c(1,2,3,4,5) b - c(1,2,3,4,5) plot(a,b) and that also gives me open circles as it should! - the plot line in my real script is: plot( means5[ ,4 ], means6[ ,4 ], xlab=640loci-50reps Gen4(-1) Rst, ylab=ADf ) I have tried adding the pch=x command with no effect . . what is going on? (Both computers have the same fonts installed although the home computer is Fedora 16 and the work one is Fedora 15). It is very frustrating . . Thanks, Phil. -- Philip Rhoades GPO Box 3411 Sydney NSW2001 Australia E-mail: p...@pricom.com.au __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in gamma(delta + (complex(0, 0, 1) * (x - mu))/alpha) : unimplemented complex function
I am trying to obtain the grafic of a pdf . but this error keeps showing . Here is the code Or use the complex gamma function gammaz() in package 'pracma'. The following code works, that is produces a plot: library(pracma) MXN.fd - function(x,alpha,beta,mu,delta) { A - (2*cos(beta/2))^(2*delta) B - 2*alpha*pi*gamma(2*delta) C - (beta*(x-mu))/alpha D - abs(gammaz(delta + (complex(0,0,1)*(x-mu))/alpha)^2) M - A/B*exp(C)*D plot(x,M,type=l,lwd=2,col=red) } alpha - 0.02612297; beta - -0.50801886 mu - 0.00575829917; delta - 0.67395 x - seq(-0.04,0.04,length=200) MXN.fd(x,alpha,beta,mu,delta) grid() i think the problem is the gamma function, does anyone know how to compute gamma with imaginary numbers? thanks in advance [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] e1071 tune.control() random parameter
I'm not sure what the parameter specifies: random if an integer value is specified, random parameter vectors are drawn from the parameter space. What are the parameter vectors and what is the parameter space? What means drawn? greetings Jessi [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Package seems to be present but library don't find it
Hi, I try to make my first package? The HelloWorld.R file is: HelloWorld.R #' showHello est une fonction R permettant d'afficher le message #' Hello World! sur la console. #' @title la fonction showHello() showHello -function(){ cat(Hello World!\n) } I use the following procedure to get the tar: # set the working directory where the file is located setwd(...) package.skeleton(HelloWorld,code_files=c(HelloWorld.R)) # to generate .rd files library(roxygen2) roxygenize(HelloWorld,copy.package=FALSE) system(R CMD build '/Users/marcgirondot/Documents/Espace de travail R/Phenology/Source fit/Essai_package/HelloWorld') * checking for file ‘/Users/marcgirondot/Documents/Espace de travail R/Phenology/Source fit/Essai_package/HelloWorld/DESCRIPTION’ ... OK * preparing ‘HelloWorld’: * checking DESCRIPTION meta-information ... OK * checking for LF line-endings in source and make files * checking for empty or unneeded directories Removed empty directory ‘HelloWorld/inst’ * building ‘HelloWorld_1.0.tar.gz’ install.packages(/Users/marcgirondot/Documents/Espace\ de\ travail\ R/Phenology/Source\ fit/Essai_package/HelloWorld_1.0.tar.gz, repos = NULL) Installing package(s) into ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library’ (as ‘lib’ is unspecified) library(HelloWorld) Erreur dans library(HelloWorld) : ‘HelloWorld’ n'est pas un nom correct de package installé Whereas the Helloworld folder is available in the library folder with other packages /Library/Frameworks/R.framework/Versions/2.14/Resources/library/HelloWorld -- __ Marc Girondot, Pr Laboratoire Ecologie, Systématique et Evolution Equipe de Conservation des Populations et des Communautés CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079 Bâtiment 362 91405 Orsay Cedex, France Tel: 33 1 (0)1.69.15.72.30 Fax: 33 1 (0)1.69.15.73.53 e-mail: marc.giron...@u-psud.fr Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html Skype: girondot __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A trivial plot gives open circles as the plot char but another one gives me qs . . why is this?
On Tue, 3 Apr 2012, peter dalgaard wrote: On Apr 3, 2012, at 07:19 , Philip Rhoades wrote: People, On my home computer with a real plot I get what I expect - open circles as the plot character - on my work computer I get qs ! So I tried a trivial test plot on the work computer ie: Argh. I have forgotten this before... There was a configuration issue leading to plotting symbols showing up as q, but I can't recall the details. Possibly libpango-related (do you have it installed?) We were not told the device This is well-known for pdf() and poppler-based PDF viewers so you could try the workaround in ?pdf. I don't really see how this could happen on the default X11 device, as it draws circles rather than uses fonts. -pd a - c(1,2,3,4,5) b - c(1,2,3,4,5) plot(a,b) and that also gives me open circles as it should! - the plot line in my real script is: plot( means5[ ,4 ], means6[ ,4 ], xlab=640loci-50reps Gen4(-1) Rst, ylab=ADf ) I have tried adding the pch=x command with no effect . . what is going on? (Both computers have the same fonts installed although the home computer is Fedora 16 and the work one is Fedora 15). It is very frustrating . . Thanks, Phil. -- Philip Rhoades GPO Box 3411 Sydney NSW 2001 Australia E-mail: p...@pricom.com.au __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] output of several results from a function
Hi, I try my first time to write a summary method for my function. The result of my function is a list of two objects (2 arrays). In my summary both objects should be displayed but with a some introductory text like: ls - list(A=A123,B=B123) summaryout=function(x,...){ cat(This is output A:/n) x$A cat(This is output B:/n) x$B } How can I achieve that both list-elements are displayed and the lines inbetween, so that it looks nice? As the single list elements are arrays (partly 5-dimensional) is there a nice way to display them i a summary or print method? cheers, best regards, Johannes -- Jetzt informieren: http://mobile.1und1.de/?ac=OM.PW.PW003K20328T7073a __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fisher's LSD multiple comparisons in a two-way ANOVA
Hi there, Is there a function that can do a Fisher's LSD multiple comparisons in a two-way ANOVA? I hope to get a result similar with TukeyHSD(). Especially, I hope to know the significance of comparisons between the interactions of two factors. In the following example: x - c(76, 84, 78, 80, 82, 70, 62, 72, 71, 69, 72, 74, 66, 74, 68, 66, 69, 72, 72, 78, 74, 71, 73, 67, 86, 67, 72, 85, 87, 74, 83, 86, 66, 68, 70, 76, 78, 76, 69, 74, 72, 72, 76, 69, 69, 82, 79, 81) a - rep(c(A1,A2), each = 24) b - rep(c(B1,B2,B3), each =8, times = 2) a - factor(a) b - factor(b) x.aov - aov(x~a*b) I hope to obtained the results similar with http://www.gigawiz.com/images12/twowayrmposthoc.jpg Any suggestions or comments will be really appreciated. Regards, Jinsong __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] output of several results from a function
Hi Johannes, You are on the right track. Just need to use the right line break indicators, and explicitly call print. Like this: tmp - list(A=A123,B=B123) summaryout=function(x,...){ cat(This is output A:\n) print(x$A) cat(This is output B:\n) print(x$B) } summaryout(tmp) Best, Ista On Tue, Apr 3, 2012 at 5:12 AM, Johannes Radinger jradin...@gmx.at wrote: Hi, I try my first time to write a summary method for my function. The result of my function is a list of two objects (2 arrays). In my summary both objects should be displayed but with a some introductory text like: ls - list(A=A123,B=B123) summaryout=function(x,...){ cat(This is output A:/n) x$A cat(This is output B:/n) x$B } How can I achieve that both list-elements are displayed and the lines inbetween, so that it looks nice? As the single list elements are arrays (partly 5-dimensional) is there a nice way to display them i a summary or print method? cheers, best regards, Johannes -- Jetzt informieren: http://mobile.1und1.de/?ac=OM.PW.PW003K20328T7073a __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fisher's LSD multiple comparisons in a two-way ANOVA
yes. See ?glht in the multcomp package, and the examples using glht in ?MMC in the HH package. Sent from my iPhone On Apr 3, 2012, at 6:16, Jinsong Zhao jsz...@yeah.net wrote: Hi there, Is there a function that can do a Fisher's LSD multiple comparisons in a two-way ANOVA? I hope to get a result similar with TukeyHSD(). Especially, I hope to know the significance of comparisons between the interactions of two factors. In the following example: x - c(76, 84, 78, 80, 82, 70, 62, 72, 71, 69, 72, 74, 66, 74, 68, 66, 69, 72, 72, 78, 74, 71, 73, 67, 86, 67, 72, 85, 87, 74, 83, 86, 66, 68, 70, 76, 78, 76, 69, 74, 72, 72, 76, 69, 69, 82, 79, 81) a - rep(c(A1,A2), each = 24) b - rep(c(B1,B2,B3), each =8, times = 2) a - factor(a) b - factor(b) x.aov - aov(x~a*b) I hope to obtained the results similar with http://www.gigawiz.com/images12/twowayrmposthoc.jpg Any suggestions or comments will be really appreciated. Regards, Jinsong __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] e1071 tune.control() random parameter
Hello, Jessica, Can you please elaborate what you're trying to find with that function. If e.g. you want to find best parameters C and gamma for RBF model (SVM), you can use grid.py, check here: http://www.csie.ntu.edu.tw/~cjlin/libsvm/ Function tune.control() in package e1071 is an R interface to this algorithm. Best, -Alex From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of Jessica Streicher [j.streic...@micromata.de] Sent: 03 April 2012 11:05 To: r-help@r-project.org Subject: [R] e1071 tune.control() random parameter I'm not sure what the parameter specifies: random if an integer value is specified, random parameter vectors are drawn from the parameter space. What are the parameter vectors and what is the parameter space? What means drawn? greetings Jessi [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] TR: [e1071] Load an SVM model exported with write.svm
Hello, Alexandre, In R you can specify whether to use or not scaling with parameter scale, e.g.: model-svm(class_var ~ ., data=trainset, scale=FALSE); Don't forget to disable it if you already sclaed your data with libsvm svm-scale algorithm. Best, -Alex From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of Alexandre Marié [alexandre.ma...@artelys.com] Sent: 29 March 2012 13:22 To: r-help@r-project.org Subject: [R] TR: [e1071] Load an SVM model exported with write.svm Dear R help mailing list, I work on a project using the SVM implementation from e1071 R package and I really need your help in order to use correctly the write.svm function. I trained my SVM model in R and I would like to use this model in Java. To do that, I would like to use the libsvm Java version (I tried to used jlibsvm, in order to benefit from the refactoring, but it does not seem to work and the lack of documentation push me to switch to the libsvm java version). Thus, I exported my model with the function write.svm and import it with the libsvm java library but I didn’t obtain the expected result from my test data set. I suspect that the load of the model file is not enough and it is necessary to use the .scale file, in Java, to scale the data (I suspect that the model object in R contains the scaling) but I don’t know how to do that. In order to understand how it is working, I would like, in the first place, load the file generated by write.svm in R in order to be sure the model in the file match with my R object. But I don’t find any function to load that kind of file. How could I load a model file generated by write.svm in R? Thanks a lot in advance for your help! Best regards, _ Alexandre Mariι Artelys : www.artelys.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A trivial plot gives open circles as the plot char but another one gives me qs . . why is this?
On Apr 3, 2012, at 1:19 AM, Philip Rhoades wrote: People, On my home computer with a real plot I get what I expect - open circles as the plot character - on my work computer I get qs ! So I tried a trivial test plot on the work computer ie: a - c(1,2,3,4,5) b - c(1,2,3,4,5) plot(a,b) and that also gives me open circles as it should! - the plot line in my real script is: plot( means5[ ,4 ], means6[ ,4 ], xlab=640loci-50reps Gen4(-1) Rst, ylab=ADf ) I have tried adding the pch=x command with no effect . . Did you actually type pch=x ? Not pch=x ? -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to use condition indexes to test multi-collinearity
Dear Frank, The idea of collinearity with the intercept never made a whole lot of sense to me, unless one entirely assimilates the notion of collinearity with numerical instability, rather than thinking about it as the difficulty of separating different partial relationships. In any event, variables with values far from 0, where the ratio of the largest to smallest value is close to 1, will be collinear with the intercept. Thus, a way of dealing with the problem, if you think it's a problem, is to centre the variables at their means. I hope this helps, John John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of frank.chiang Sent: April-02-12 11:47 PM To: r-help@r-project.org Subject: [R] how to use condition indexes to test multi-collinearity Dear Users, I try to calculate condition indexes and variance decomposition proportions in order to test for collinearity using colldiag() in perturb package, I got a large index and two variables with large variance decomposition proportions,but one of them is constant item.I also checked the VIF for that variable, the value is small.The result is as follows: Index interceptV1V2 163.54 0.97 0.810.16 VIF(V1)=1.4 V1,V2 are two variables with largest variance decomposition proportions. I haven't got the Belsley's book, and don't quite understand what's the meaning of collinearity between intercept and one variable.If that's the case, how to deal with it? Discard the V1 variable? Anyone can give me some suggestions?Thanks -- View this message in context: http://r.789695.n4.nabble.com/how-to-use- condition-indexes-to-test-multi-collinearity-tp4527740p4527740.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] grouping
Hi all, Assume that I have the following 10 data points. x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sort x and get the following y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) I want to group the sorted data point (y) into equal number of observation per group. In this case there will be three groups. The first two groups will have three observation and the third will have four observations group 1 = 34, 45, 46 group 2 = 66, 78, 125 group 3 = 193, 209, 242,297 Finally I want to calculate the group mean group 1 = 42 group 2 = 87 group 3 = 234 Can anyone help me out? In SAS I used to do it using proc rank. thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] temporal disaggregation
There is a package called 'tempdisagg' on CRAN that offers a similar functionality. -- View this message in context: http://r.789695.n4.nabble.com/temporal-disaggregation-tp3745205p4528583.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] rpart error message
Hi R-helpers, I am using rpart package for decision tree using R.We are invoking R environment through JRI from our java application.Hence, the result of R command is returned in REXP and we use geterrMessage() to retrieve the error. When we execute the following command, cnr_model-rpart(as.factor(Species)~Sepal Length+Sepal Width+Petal Length, method=class, parms=list(split=gini,prior=c()), control=rpart.control(minsplit=2, na.action=na.pass,cp=0.001,usesurrogate=1,maxsurrogate=2,surrogatestyle=0,maxdepth=20,xval=10)) we get an error message* Error: unexpected symbol in a-cnr_model-rpart(as.factor(Species)~Sepal Length* The REXP returned in not null and the geterrMessage() call doesnt return an error message. But for other errors in R , say summary(cnr_model) , REXP returned is null and we get the following error message from geterrMessage() call. *Error in summary(cnr_model) : object 'cnr_model' not found* Is this behavior of showing error message specific to rpart package ? Why is this difference in the way error message is displayed in rpart package?Can you please let us know how to retrieve error messages of rpart command through JRI? Thanks in advance. -- View this message in context: http://r.789695.n4.nabble.com/rpart-error-message-tp4528104p4528104.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] help in ddply
Hi I've records like this df= x panel 4 1 93 2 21 3 83 4 75 1 87 2 87 3 78 4 50 1 76 2 86 3 65 4 84 1 40 2 39 3 26 4 i want to create histogram out of it . i want all the mid and count values for panel wise my code is histoutput = ddply(df,.(df[2]),hist) i'm not able to get the required result. please help me using for loop takes a lot of time if there are more records - Thanks in Advance Arun -- View this message in context: http://r.789695.n4.nabble.com/help-in-ddply-tp4528064p4528064.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Histogram classwise
Hi I have a data class wise. I want to create a histogram class wise without using for loop as it takes a long time my data looks like this x class 27 1 93 3 65 5 1 2 69 5 2 1 92 4 49 5 55 4 46 1 51 3 100 4 - Thanks in Advance Arun -- View this message in context: http://r.789695.n4.nabble.com/Histogram-classwise-tp4528624p4528624.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] filling small gaps of N/A
Hi everybody, I'm a new R french user. Sorry if my english is not perfect. Hope you'll understand my problem ;) I have to work on temperature data (35000 lines in one file) containing some missing data (N/A). Sometimes I have only 2 or 3 N/A following each other, but I have also sometimes 100 or 200 N/A following each other. Here's an example of my data, when I have only small gaps of missing data (2 or 3 N/A): 09/01/2008 12:00 2 1.93 2.93 4.56 5.43 09/01/2008 12:15 2 *3.93* 3.25 4.93 5.56 09/01/2008 12:30 2NA 3.5 5.06 5.56 09/01/2008 12:45 2NA 3.68 5.25 5.68 09/01/2008 13:00 2 *4.93 * 3.87 5.56 5.93 09/01/2008 13:15 2 5.93 4.25 5.75 6.06 09/01/2008 13:30 2 3.93 4.56 5.93 6.18 My question is: how can I replace these small gaps of N/A by numeric values? I would like a fonction which only replace the small gaps (2 or 3 N/A) in my data, but not the big gaps (more than 5 N/A following each other). For the moment, i'm trying to do it by working with the time gap between the 2 numeric values surrounding the N/A as following: imputation - function(x){ met = NULL temp - met[1] - x[1] ind_temp - 1 tps - time(x) for (i in 2:(length(x)) ){ if((tps[i]-tps[ind_temp] 1)(tps[i]-tps[ind_temp] = 4)(is.na(x[i]))){ met[i] - na.approx(x) } else { temp - met[i] - x[i] ind_temp - i } } return(met) } In this example, I would like to apply the function: na.approx(x) on my N/A, but only when I have maximum 4 N/A following each other. There's no error, but it doesn't work (it was working in the other way, when I had to detect aberrant data and replace it by N/A, but not now). It is maybe not the good way to solve this problem. I don't have a lot of experience in R. Maybe there is an easier way to do it... Does somebody have an idea about it for helping me? Thanks a lot! -- View this message in context: http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528184.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] hclust and plot functions work, cutree does not
Sarah, . clust_tree=hclust(as.dist(x),method=complete) . plot(clust_tree) this produces a dendrogram, whereas . clust_tree=hclust(as.dist(x),method=complete) .cut = cutree(clust_tree,k=1:5) .plot(cut) produces a plot with 2 dots. The dissimilarity matrix x is 100*100 matrix. Thanks for any feedback. Also, I am looking at various methods to retrieve the sub-trees. Any tips on such techniques would be helpful to me. -- View this message in context: http://r.789695.n4.nabble.com/hclust-and-plot-functions-work-cutree-does-not-tp4515010p4528282.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Choice between gwidgetsRGtk2 and gwidgetstcltk
Hi, all What I want to ask is what is better gui between gwidgetsRGtk2 and gwidgetstcltk? In my opinion, gwidgetsRgtk2 has more functionality for gui. For example , in importing image from file, Rgtk2 works with most of formats but tcltk only with gif and pnm. However, gwidgetstcltk is more lovely than Rgtk2. Is there a way to mix two of them or alternative? -- View this message in context: http://r.789695.n4.nabble.com/Choice-between-gwidgetsRGtk2-and-gwidgetstcltk-tp4528350p4528350.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fitting data with constraints to GEV distribution
Hi, Is there a way to fit a set of data with specified constraints (ie specified the percentile values at say 99.9%) to a GEV distribution? Thanks -- View this message in context: http://r.789695.n4.nabble.com/Fitting-data-with-constraints-to-GEV-distribution-tp4528418p4528418.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] e1071 tune.control() random parameter
The function is a parameter to the tune function of library e1071. The other parameters of the tune.control() function specify more or less the sampling method (cross-validation, fixed or bootstrap+attributes for those) tune itself iterates over valuevectors of the parameters (e.g. C and gamma) and calculates errors in the classification for each combination (using the provided sampling method in tune.control() i would guess) also: e1071 = libsvm and i have enough problems with R, i definetly wont start with python now ;) The question is rather of academic interest, i can do what i want to do already. But it COULD be something useful, after all it is the first argument of the function^^ On 03.04.12 14:13, Alekseiy Beloshitskiy wrote: Hello, Jessica, Can you please elaborate what you're trying to find with that function. If e.g. you want to find best parameters C and gamma for RBF model (SVM), you can use grid.py, check here: http://www.csie.ntu.edu.tw/~cjlin/libsvm/ Function tune.control() in package e1071 is an R interface to this algorithm. Best, -Alex From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of Jessica Streicher [j.streic...@micromata.de] Sent: 03 April 2012 11:05 To: r-help@r-project.org Subject: [R] e1071 tune.control() random parameter I'm not sure what the parameter specifies: random if an integer value is specified, random parameter vectors are drawn from the parameter space. What are the parameter vectors and what is the parameter space? What means drawn? greetings Jessi [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
On Apr 3, 2012, at 8:47 AM, Val wrote: Hi all, Assume that I have the following 10 data points. x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sort x and get the following y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) The methods below do not require a sorting step. I want to group the sorted data point (y) into equal number of observation per group. In this case there will be three groups. The first two groups will have three observation and the third will have four observations group 1 = 34, 45, 46 group 2 = 66, 78, 125 group 3 = 193, 209, 242,297 Finally I want to calculate the group mean group 1 = 42 group 2 = 87 group 3 = 234 I hope those weren't answers from SAS. Can anyone help me out? I usually do this with Hmisc::cut2 since it has a `g = n` parameter that auto-magically calls the quantile splitting criterion but this is done in base R. split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) , include.lowest=TRUE) ) $`[36,65.9]` [1] 36 45 46 $`(65.9,189]` [1] 66 78 125 $`(189,297]` [1] 193 209 242 297 lapply( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) , include.lowest=TRUE) ), mean) $`[36,65.9]` [1] 42.3 $`(65.9,189]` [1] 89.7 $`(189,297]` [1] 235.25 Or to get a table instead of a list: tapply( x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) , include.lowest=TRUE) , mean) [36,65.9] (65.9,189] (189,297] 42.3 89.7 235.25000 In SAS I used to do it using proc rank. ?quantile isn't equivalent to Proc Rank but it will provide a useful basis for splitting or tabling functions. thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
Ignoring the fact your desired answers are wrong, I'd split the separating part and the group means parts into three steps: i) quantile() can help you get the split points, ii) findInterval() can assign each y to a group iii) then ave() or tapply() will do group-wise means Something like: y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c here. ave(y, findInterval(y, quantile(y, c(0.33, 0.66 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean) You could also use cut2 from the Hmisc package to combine findInterval and quantile into a single step. Depending on your desired output. Hope that helps, Michael On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote: Hi all, Assume that I have the following 10 data points. x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sort x and get the following y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) I want to group the sorted data point (y) into equal number of observation per group. In this case there will be three groups. The first two groups will have three observation and the third will have four observations group 1 = 34, 45, 46 group 2 = 66, 78, 125 group 3 = 193, 209, 242,297 Finally I want to calculate the group mean group 1 = 42 group 2 = 87 group 3 = 234 Can anyone help me out? In SAS I used to do it using proc rank. thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
Probably something along the following lines: x - c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sorted - c(36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) tapply(sorted, INDEX = (seq_along(sorted) - 1) %/% 3, FUN = mean) 0 1 2 3 42.3 89.7 214.7 297.0 Hope this helps, Giovanni On Tue, 2012-04-03 at 08:47 -0400, Val wrote: Hi all, Assume that I have the following 10 data points. x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sort x and get the following y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) I want to group the sorted data point (y) into equal number of observation per group. In this case there will be three groups. The first two groups will have three observation and the third will have four observations group 1 = 34, 45, 46 group 2 = 66, 78, 125 group 3 = 193, 209, 242,297 Finally I want to calculate the group mean group 1 = 42 group 2 = 87 group 3 = 234 Can anyone help me out? In SAS I used to do it using proc rank. thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Giovanni Petris gpet...@uark.edu Associate Professor Department of Mathematical Sciences University of Arkansas - Fayetteville, AR 72701 Ph: (479) 575-6324, 575-8630 (fax) http://definetti.uark.edu/~gpetris/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rpart error message
On Apr 3, 2012, at 4:16 AM, Raji wrote: Hi R-helpers, I am using rpart package for decision tree using R.We are invoking R environment through JRI from our java application.Hence, the result of R command is returned in REXP and we use geterrMessage() to retrieve the error. When we execute the following command, cnr_model-rpart(as.factor(Species)~Sepal Length+Sepal Width+Petal Length, You have spaces in your variable names. That has nothing to do with JRI and everything to do with base R syntax. (You also have no data argument so are you attaching it. That would seem to be an unwise choice.) method=class, parms=list(split=gini,prior=c()), control=rpart.control(minsplit=2, na .action = na .pass ,cp = 0.001 ,usesurrogate=1,maxsurrogate=2,surrogatestyle=0,maxdepth=20,xval=10)) we get an error message* Error: unexpected symbol in a-cnr_model-rpart(as.factor(Species)~Sepal Length* The unexpected symbol was Length. Spaces are separators to the R parser. The REXP returned in not null and the geterrMessage() call doesnt return an error message. But for other errors in R , say summary(cnr_model) , REXP returned is null and we get the following error message from geterrMessage() call. *Error in summary(cnr_model) : object 'cnr_model' not found* Right. It wasn't found because the command you tried to create it with ... failed. Is this behavior of showing error message specific to rpart package ? No. It would be very generic. Why is this difference in the way error message is displayed in rpart package?Can you please let us know how to retrieve error messages of rpart command through JRI? Thanks in advance. -- View this message in context: http://r.789695.n4.nabble.com/rpart-error-message-tp4528104p4528104.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gamm: tensor product and interaction
On 04/02/2012 01:26 PM, Mabille, Geraldine wrote: Hi list, I'm working with gamm models of this sort, using Simon Wood's mgcv library: gm- gamm(Z~te(x,y),data=DATA,random=list(Group=~1)) gm1-gamm(Z~te(x,y,by=Factor)+Factor,data=DATA,random=list(Group=~1)) with a dataset of about 7 rows and 110 levels for Group I would be inclined to use something like gm- gam(Z~te(x,y)+s(Group,bs=re),data=DATA,method=REML) gm1-gam(Z~te(x,y,by=Factor)+Factor+s(Group,bs=re),data=DATA,method=REML) AIC(gm,gm1) See ?random.effects in the mgcv help for more details. in order to test whether tensor product smooths vary across factor levels. I was wondering if comparing those two models would be enough to conclude? I saw a preceding post on similar issues http://r.789695.n4.nabble.com/Comparing-and-Interpreting-GAMMs-td2234209.html but that example used simple s() smooths instead of tensor product smooths. In this case, the person was comparing between model A and model A1 like: A-gamm4(Z~factor+s(x)+s(x,by=factor) ,data=DATA, random=~(1|Group)) A1- gamm4(Z~factor+s(x) ,data=DATA, random=~(1|Group)) I thus also tried to compare my model gm1, with another gm2 model of that sort: gm2-gamm(Z~te(x,y)+te(x,y,by=Factor)+Factor,data=DATA,random=list(Group=~1)) but this type of models never converge and I obtain error messages of that sort: Error in MEestimate(lmeSt, grps) : Singularity in backsolve at level 0, block 1 - The problem is that te(x,y) and te(x,y,by=Factor) are confounded. You can get around this by making `Factor' into an ordered factor. See te `by' variable section in ?gam.models. 2 questions from that: 1) Keeping in mind that my main question is to check whether tensor products smooths vary across factor levels, would the comparison of models gm and gm1 be sufficient for me? - I'd have thought so. 2) Otherwise, is there something wrong in my gm2 model that make it impossible to converge?? Also, side question that I already posted a few days ago but didn't get an answer for: 3)I don't manage to use vis.gam() to print 3 different 2D-contour plots for the three levels of factors I have in model gm1. It works well with model gm (when only one plot is generated) but not gm1 (3 plots should be generated) Here is the error message I obtain: vis.gam(gm1$gam,plot.type=contour,n.grid=200,color=heat,zlim=c(0,4)) Error in predict.gam(x, newdata = newd, se.fit = TRUE, type = type) : number of items to replace is not a multiple of replacement length - hmm, possibly a bug. I'll look into it. best, Simon Thanks a lot if someone can help with that, Geraldine [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help in ddply
Don't use ddply because you aren't returning a data.frame. Hist objects are lists so use dlply. dats - structure(list(x = c(4L, 93L, 21L, 83L, 75L, 87L, 87L, 78L, 50L, 76L, 86L, 65L, 84L, 40L, 39L, 26L), panel = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L)), .Names = c(x, panel), class = data.frame, row.names = c(NA, -16L)) #please start using dput -- I've asked this before dlply(dats, panel, function(d) hist(d$x)) It's up to you to interpret the results. Michael On Tue, Apr 3, 2012 at 4:02 AM, arunkumar akpbond...@gmail.com wrote: Hi I've records like this df= x panel 4 1 93 2 21 3 83 4 75 1 87 2 87 3 78 4 50 1 76 2 86 3 65 4 84 1 40 2 39 3 26 4 i want to create histogram out of it . i want all the mid and count values for panel wise my code is histoutput = ddply(df,.(df[2]),hist) i'm not able to get the required result. please help me using for loop takes a lot of time if there are more records - Thanks in Advance Arun -- View this message in context: http://r.789695.n4.nabble.com/help-in-ddply-tp4528064p4528064.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] filling small gaps of N/A
It seems like you could benefit from using a zoo [time series] object to hold your data -- then you have a variety of NA filling functions which work for arbitrarily long gaps. E.g., library(zoo) x - zoo(1:100, Sys.Date() + 1:100) x[2:60] - NA # Most of these look the same because the data is simple: will give different results for more complicated examples na.approx(x) na.locf(x) na.spline(x) na.aggregate(x) na.fill # Takes more arguments Hope this helps, Michael On Tue, Apr 3, 2012 at 4:52 AM, jeff6868 geoffrey_kl...@etu.u-bourgogne.fr wrote: Hi everybody, I'm a new R french user. Sorry if my english is not perfect. Hope you'll understand my problem ;) I have to work on temperature data (35000 lines in one file) containing some missing data (N/A). Sometimes I have only 2 or 3 N/A following each other, but I have also sometimes 100 or 200 N/A following each other. Here's an example of my data, when I have only small gaps of missing data (2 or 3 N/A): 09/01/2008 12:00 2 1.93 2.93 4.56 5.43 09/01/2008 12:15 2 *3.93* 3.25 4.93 5.56 09/01/2008 12:30 2 NA 3.5 5.06 5.56 09/01/2008 12:45 2 NA 3.68 5.25 5.68 09/01/2008 13:00 2 *4.93 * 3.87 5.56 5.93 09/01/2008 13:15 2 5.93 4.25 5.75 6.06 09/01/2008 13:30 2 3.93 4.56 5.93 6.18 My question is: how can I replace these small gaps of N/A by numeric values? I would like a fonction which only replace the small gaps (2 or 3 N/A) in my data, but not the big gaps (more than 5 N/A following each other). For the moment, i'm trying to do it by working with the time gap between the 2 numeric values surrounding the N/A as following: imputation - function(x){ met = NULL temp - met[1] - x[1] ind_temp - 1 tps - time(x) for (i in 2:(length(x)) ){ if((tps[i]-tps[ind_temp] 1)(tps[i]-tps[ind_temp] = 4)(is.na(x[i]))){ met[i] - na.approx(x) } else { temp - met[i] - x[i] ind_temp - i } } return(met) } In this example, I would like to apply the function: na.approx(x) on my N/A, but only when I have maximum 4 N/A following each other. There's no error, but it doesn't work (it was working in the other way, when I had to detect aberrant data and replace it by N/A, but not now). It is maybe not the good way to solve this problem. I don't have a lot of experience in R. Maybe there is an easier way to do it... Does somebody have an idea about it for helping me? Thanks a lot! -- View this message in context: http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528184.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] filling small gaps of N/A
Sorry -- left out a major detail: most of these functions have maxgap arguments which allow you to leave larger gaps of NAs as NAs. Best, Michael On Tue, Apr 3, 2012 at 9:24 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: It seems like you could benefit from using a zoo [time series] object to hold your data -- then you have a variety of NA filling functions which work for arbitrarily long gaps. E.g., library(zoo) x - zoo(1:100, Sys.Date() + 1:100) x[2:60] - NA # Most of these look the same because the data is simple: will give different results for more complicated examples na.approx(x) na.locf(x) na.spline(x) na.aggregate(x) na.fill # Takes more arguments Hope this helps, Michael On Tue, Apr 3, 2012 at 4:52 AM, jeff6868 geoffrey_kl...@etu.u-bourgogne.fr wrote: Hi everybody, I'm a new R french user. Sorry if my english is not perfect. Hope you'll understand my problem ;) I have to work on temperature data (35000 lines in one file) containing some missing data (N/A). Sometimes I have only 2 or 3 N/A following each other, but I have also sometimes 100 or 200 N/A following each other. Here's an example of my data, when I have only small gaps of missing data (2 or 3 N/A): 09/01/2008 12:00 2 1.93 2.93 4.56 5.43 09/01/2008 12:15 2 *3.93* 3.25 4.93 5.56 09/01/2008 12:30 2 NA 3.5 5.06 5.56 09/01/2008 12:45 2 NA 3.68 5.25 5.68 09/01/2008 13:00 2 *4.93 * 3.87 5.56 5.93 09/01/2008 13:15 2 5.93 4.25 5.75 6.06 09/01/2008 13:30 2 3.93 4.56 5.93 6.18 My question is: how can I replace these small gaps of N/A by numeric values? I would like a fonction which only replace the small gaps (2 or 3 N/A) in my data, but not the big gaps (more than 5 N/A following each other). For the moment, i'm trying to do it by working with the time gap between the 2 numeric values surrounding the N/A as following: imputation - function(x){ met = NULL temp - met[1] - x[1] ind_temp - 1 tps - time(x) for (i in 2:(length(x)) ){ if((tps[i]-tps[ind_temp] 1)(tps[i]-tps[ind_temp] = 4)(is.na(x[i]))){ met[i] - na.approx(x) } else { temp - met[i] - x[i] ind_temp - i } } return(met) } In this example, I would like to apply the function: na.approx(x) on my N/A, but only when I have maximum 4 N/A following each other. There's no error, but it doesn't work (it was working in the other way, when I had to detect aberrant data and replace it by N/A, but not now). It is maybe not the good way to solve this problem. I don't have a lot of experience in R. Maybe there is an easier way to do it... Does somebody have an idea about it for helping me? Thanks a lot! -- View this message in context: http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528184.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
Hi! Maybe not the most elegant solution, but works: for(i in seq(1,length(data)-(length(data) %% 3), 3)) { ifelse((length(data)-i)3, { print(sort(data)[ c(i:(i+2)) ]); print(mean(sort(data)[ c(i:(i+2)) ])) }, { print(sort(data)[ c(i:length(data)) ]); print(mean(sort(data)[ c(i:length(data)) ])) } ) } Produces: [1] 36 45 46 [1] 42.3 [1] 66 78 125 [1] 89.7 [1] 193 209 242 297 [1] 235.25 HTH, Kimmo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
Thank you all (David, Michael, Giovanni) for your prompt response. First there was a typo error for the group mean it was 89.6 not 87. For a small data set and few groupings I can use prob=c(0, .333, .66 ,1) to group in to three groups in this case. However, if I want to extend the number of groupings say 10 or 15 then do I have to figure it out the split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) Is there a short cut for that? Thanks On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Ignoring the fact your desired answers are wrong, I'd split the separating part and the group means parts into three steps: i) quantile() can help you get the split points, ii) findInterval() can assign each y to a group iii) then ave() or tapply() will do group-wise means Something like: y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c here. ave(y, findInterval(y, quantile(y, c(0.33, 0.66 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean) You could also use cut2 from the Hmisc package to combine findInterval and quantile into a single step. Depending on your desired output. Hope that helps, Michael On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote: Hi all, Assume that I have the following 10 data points. x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sort x and get the following y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) I want to group the sorted data point (y) into equal number of observation per group. In this case there will be three groups. The first two groups will have three observation and the third will have four observations group 1 = 34, 45, 46 group 2 = 66, 78, 125 group 3 = 193, 209, 242,297 Finally I want to calculate the group mean group 1 = 42 group 2 = 87 group 3 = 234 Can anyone help me out? In SAS I used to do it using proc rank. thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
Use cut2 as I suggested and David demonstrated. Michael On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote: Thank you all (David, Michael, Giovanni) for your prompt response. First there was a typo error for the group mean it was 89.6 not 87. For a small data set and few groupings I can use prob=c(0, .333, .66 ,1) to group in to three groups in this case. However, if I want to extend the number of groupings say 10 or 15 then do I have to figure it out the split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) Is there a short cut for that? Thanks On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Ignoring the fact your desired answers are wrong, I'd split the separating part and the group means parts into three steps: i) quantile() can help you get the split points, ii) findInterval() can assign each y to a group iii) then ave() or tapply() will do group-wise means Something like: y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c here. ave(y, findInterval(y, quantile(y, c(0.33, 0.66 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean) You could also use cut2 from the Hmisc package to combine findInterval and quantile into a single step. Depending on your desired output. Hope that helps, Michael On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote: Hi all, Assume that I have the following 10 data points. x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sort x and get the following y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) I want to group the sorted data point (y) into equal number of observation per group. In this case there will be three groups. The first two groups will have three observation and the third will have four observations group 1 = 34, 45, 46 group 2 = 66, 78, 125 group 3 = 193, 209, 242,297 Finally I want to calculate the group mean group 1 = 42 group 2 = 87 group 3 = 234 Can anyone help me out? In SAS I used to do it using proc rank. thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A trivial plot gives open circles as the plot char but another one gives me qs . . why is this?
Brian, On 2012-04-03 18:58, Prof Brian Ripley wrote: On Tue, 3 Apr 2012, peter dalgaard wrote: On Apr 3, 2012, at 07:19 , Philip Rhoades wrote: People, On my home computer with a real plot I get what I expect - open circles as the plot character - on my work computer I get qs ! So I tried a trivial test plot on the work computer ie: Argh. I have forgotten this before... There was a configuration issue leading to plotting symbols showing up as q, but I can't recall the details. Possibly libpango-related (do you have it installed?) At home, yes: pango-1.29.4-1.fc16.x86_64 pangomm-2.28.3-1.fc16.x86_64 (I will need to check work). We were not told the device This is well-known for pdf() and poppler-based PDF viewers so you could try the workaround in ?pdf. Yes, I was outputting to pdf in all cases . . do you mean this workaround?: useDingbats logical. Should small circles be rendered via the Dingbats font? Defaults to TRUE, which produces smaller and better output. Setting this to FALSE can work around font display problems in broken PDF viewers. I don't really see how this could happen on the default X11 device, as it draws circles rather than uses fonts. Ah . . when I was testing on the problem work machine and doing the trivial test . . I was outputting to the screen . . so if I repeat the trivial test on the work machine (I am at home now) to PDF I should get the same problem as I do on the real script? Thanks, Phil. -- Philip Rhoades GPO Box 3411 Sydney NSW 2001 Australia E-mail: p...@pricom.com.au __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A trivial plot gives open circles as the plot char but another one gives me qs . . why is this?
David, On 2012-04-03 22:29, David Winsemius wrote: On Apr 3, 2012, at 1:19 AM, Philip Rhoades wrote: People, On my home computer with a real plot I get what I expect - open circles as the plot character - on my work computer I get qs ! So I tried a trivial test plot on the work computer ie: a - c(1,2,3,4,5) b - c(1,2,3,4,5) plot(a,b) and that also gives me open circles as it should! - the plot line in my real script is: plot( means5[ ,4 ], means6[ ,4 ], xlab=640loci-50reps Gen4(-1) Rst, ylab=ADf ) I have tried adding the pch=x command with no effect . . Did you actually type pch=x ? Not pch=x ? Sorry, I meant x as in 1,2,3, . . . 18 Thanks, Phil. -- Philip Rhoades GPO Box 3411 Sydney NSW 2001 Australia E-mail: p...@pricom.com.au __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
On Tue, Apr 03, 2012 at 09:31:29AM -0400, Val wrote: Thank you all (David, Michael, Giovanni) for your prompt response. First there was a typo error for the group mean it was 89.6 not 87. For a small data set and few groupings I can use prob=c(0, .333, .66 ,1) to group in to three groups in this case. However, if I want to extend the number of groupings say 10 or 15 then do I have to figure it out the split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) Is there a short cut for that? Hi. There may be better ways for the whole task, but specifically c(0, .333, .66 ,1) can be obtained as seq(0, 1, length=3+1) Hope this helps. Petr Savicky. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram classwise
I'm not entirely sure what you mean, but maybe split() and lapply() would help? Sarah On Tue, Apr 3, 2012 at 8:20 AM, arunkumar akpbond...@gmail.com wrote: Hi I have a data class wise. I want to create a histogram class wise without using for loop as it takes a long time my data looks like this x class 27 1 93 3 65 5 1 2 69 5 2 1 92 4 49 5 55 4 46 1 51 3 100 4 - Thanks in Advance Arun -- -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Package seems to be present but library don't find it
In case someone has the competence to check, the file is here: setwd(~) download.file(http://www.ese.u-psud.fr/epc/conservation/r-scripts/HelloWorld_1.0.tar.gz;, HelloWorld_1.0.tar.gz) install.packages(HelloWorld_1.0.tar.gz, repos = NULL) Thanks a lot Marc Le 03/04/12 10:55, Marc Girondot a écrit : Hi, I try to make my first package? The HelloWorld.R file is: HelloWorld.R #' showHello est une fonction R permettant d'afficher le message #' Hello World! sur la console. #' @title la fonction showHello() showHello -function(){ cat(Hello World!\n) } I use the following procedure to get the tar: # set the working directory where the file is located setwd(...) package.skeleton(HelloWorld,code_files=c(HelloWorld.R)) # to generate .rd files library(roxygen2) roxygenize(HelloWorld,copy.package=FALSE) system(R CMD build '/Users/marcgirondot/Documents/Espace de travail R/Phenology/Source fit/Essai_package/HelloWorld') * checking for file ‘/Users/marcgirondot/Documents/Espace de travail R/Phenology/Source fit/Essai_package/HelloWorld/DESCRIPTION’ ... OK * preparing ‘HelloWorld’: * checking DESCRIPTION meta-information ... OK * checking for LF line-endings in source and make files * checking for empty or unneeded directories Removed empty directory ‘HelloWorld/inst’ * building ‘HelloWorld_1.0.tar.gz’ install.packages(/Users/marcgirondot/Documents/Espace\ de\ travail\ R/Phenology/Source\ fit/Essai_package/HelloWorld_1.0.tar.gz, repos = NULL) Installing package(s) into ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library’ (as ‘lib’ is unspecified) library(HelloWorld) Erreur dans library(HelloWorld) : ‘HelloWorld’ n'est pas un nom correct de package installé Whereas the Helloworld folder is available in the library folder with other packages /Library/Frameworks/R.framework/Versions/2.14/Resources/library/HelloWorld -- __ Marc Girondot, Pr Laboratoire Ecologie, Systématique et Evolution Equipe de Conservation des Populations et des Communautés CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079 Bâtiment 362 91405 Orsay Cedex, France Tel: 33 1 (0)1.69.15.72.30 Fax: 33 1 (0)1.69.15.73.53 e-mail: marc.giron...@u-psud.fr Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html Skype: girondot __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote: Use cut2 as I suggested and David demonstrated. Agree that Hmisc::cut2 is extremely handy and I also like that fact that the closed ends of intervals are on the left side (which is not the same behavior as cut()), which has the otehr effect of setting include.lowest = TRUE which is not the default for cut() either (to my continued amazement). But let me add the method I use when doing it by hand: cut(x, quantile(x, prob=seq(0, 1, length=ngrps+1)), include.lowest=TRUE) -- David. Michael On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote: Thank you all (David, Michael, Giovanni) for your prompt response. First there was a typo error for the group mean it was 89.6 not 87. For a small data set and few groupings I can use prob=c(0, .333, . 66 ,1) to group in to three groups in this case. However, if I want to extend the number of groupings say 10 or 15 then do I have to figure it out the split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) Is there a short cut for that? Thanks On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Ignoring the fact your desired answers are wrong, I'd split the separating part and the group means parts into three steps: i) quantile() can help you get the split points, ii) findInterval() can assign each y to a group iii) then ave() or tapply() will do group-wise means Something like: y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c here. ave(y, findInterval(y, quantile(y, c(0.33, 0.66 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean) You could also use cut2 from the Hmisc package to combine findInterval and quantile into a single step. Depending on your desired output. Hope that helps, Michael On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote: Hi all, Assume that I have the following 10 data points. x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sort x and get the following y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) I want to group the sorted data point (y) into equal number of observation per group. In this case there will be three groups. The first two groups will have three observation and the third will have four observations group 1 = 34, 45, 46 group 2 = 66, 78, 125 group 3 = 193, 209, 242,297 Finally I want to calculate the group mean group 1 = 42 group 2 = 87 group 3 = 234 Can anyone help me out? In SAS I used to do it using proc rank. thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
Or just replace c(0, .333, .667, 1) with n - 10 split(x, cut(x, quantile(x, prob= c(0, 1:(n-1)/n, 1)), include.lowest=TRUE)) where n is the number of groups you want. -- David L Carlson Associate Professor of Anthropology Texas AM University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of R. Michael Weylandt Sent: Tuesday, April 03, 2012 8:32 AM To: Val Cc: r-help@r-project.org Subject: Re: [R] grouping Use cut2 as I suggested and David demonstrated. Michael On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote: Thank you all (David, Michael, Giovanni) for your prompt response. First there was a typo error for the group mean it was 89.6 not 87. For a small data set and few groupings I can use prob=c(0, .333, .66 ,1) to group in to three groups in this case. However, if I want to extend the number of groupings say 10 or 15 then do I have to figure it out the split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) Is there a short cut for that? Thanks On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Ignoring the fact your desired answers are wrong, I'd split the separating part and the group means parts into three steps: i) quantile() can help you get the split points, ii) findInterval() can assign each y to a group iii) then ave() or tapply() will do group-wise means Something like: y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c here. ave(y, findInterval(y, quantile(y, c(0.33, 0.66 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean) You could also use cut2 from the Hmisc package to combine findInterval and quantile into a single step. Depending on your desired output. Hope that helps, Michael On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote: Hi all, Assume that I have the following 10 data points. x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sort x and get the following y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) I want to group the sorted data point (y) into equal number of observation per group. In this case there will be three groups. The first two groups will have three observation and the third will have four observations group 1 = 34, 45, 46 group 2 = 66, 78, 125 group 3 = 193, 209, 242,297 Finally I want to calculate the group mean group 1 = 42 group 2 = 87 group 3 = 234 Can anyone help me out? In SAS I used to do it using proc rank. thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] regression for poisson distributed data
Hello all, I would like to get parameter estimates for different models. For one of them I give the code in example. I am estimating the parameters (i,j and k) with the nls function, which sees the error distribution as normal, I would however like to do the same as nls with the assumption that the errors are poisson distributed. Is there a way to do this with R? Are there packages designed for this? I tried with the gnm package, but don't understand how to transform my equation to a generalised equation. Is there an option for nls to choose family = poisson? Lower in the mail the code with the model and visualisations I use to check my results. I also copied the test dataset from my txt file. I am using R 2.15 and Rstudio to visualise it. plot(FR~N0) x - nls(FR~(exp(i+j*N0)/(1+exp(i+j*N0)))*(k*N0/(k+N0)),start=list(i=0.02,j=0.002,k=6)) summary(x) hatx - predict(x) lines(spline(N0,hatx)) N0 FR 10 2 10 3 10 2 10 4 10 2 10 2 10 1 10 2 10 2 10 2 20 2 20 3 20 3 20 3 20 4 20 2 20 4 20 2 20 3 20 2 30 1 30 2 30 3 30 4 30 5 30 6 30 2 30 3 30 2 30 2 40 2 40 3 40 3 40 6 40 5 40 4 40 3 40 3 40 2 40 3 50 4 50 5 50 2 50 3 50 7 50 5 50 4 50 3 50 4 50 5 60 5 60 6 60 8 60 4 60 4 60 3 60 2 60 2 60 5 60 4 Kind Regards Joachim Don't waste paper! Think about the environment before printing this e-mail __ Joachim Audenaert Adviesdienst Gewasbescherming Proefcentrum voor Sierteelt Schaessestraat 18 B-9070 Destelbergen Tel. +32 9 353 94 71 Fax +32 9 353 94 95 E-mail: joachim.audena...@pcsierteelt.be www.pcsierteelt.be __ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Package seems to be present but library don't find it
On 03/04/2012 9:50 AM, Marc Girondot wrote: In case someone has the competence to check, the file is here: setwd(~) download.file(http://www.ese.u-psud.fr/epc/conservation/r-scripts/HelloWorld_1.0.tar.gz;, HelloWorld_1.0.tar.gz) install.packages(HelloWorld_1.0.tar.gz, repos = NULL) The problem is that you try to export the name HelloWorld in your NAMESPACE file, but you don't have an object of that name. You should export showHello instead. Not sure why you didn't see the error message I got: $ R CMD INSTALL HelloWorld_1.0.tar.gz * installing to library 'F:/cygwin/home/murdoch/R/win-library/2.15' * installing *source* package 'HelloWorld' ... ** R ** preparing package for lazy loading ** help Warning: C:/temp/RtmpimSWuh/R.INSTALLe801c296e9/HelloWorld/man/HelloWorld-packag e.Rd:32: All text must be in a section Warning: C:/temp/RtmpimSWuh/R.INSTALLe801c296e9/HelloWorld/man/HelloWorld-packag e.Rd:33: All text must be in a section *** installing help indices ** building package indices ** testing if installed package can be loaded Error in namespaceExport(ns, exports) : undefined exports: HelloWorld Error: loading failed Execution halted ERROR: loading failed * removing 'F:/cygwin/home/murdoch/R/win-library/2.15/HelloWorld' Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] When lack of data is data and not n/a
Greetings. Here is a problem I don't know how to handle, even by brute force. We have an 800k line data file that includes eye fixations for subjects in a 3 x 2 factorial design. There are several screen locations where information is available while Ss do their task. These locations vary by condition so there is no reason for people in some conditions (i.e., the 3-factor one) to look at some locations. So I am analyzing these conditions, two pairs at a time (i.e., the 2-factor one). So the problem: My ANOVA of these data had messed up the within and between factors the way aov does when there are different numbers of Ss contributing data to some of the variables than for others. A bit of sleuthing with plyr revealed that one of our Ss in one of our conditions never looked at one of our locations. Given the nature of the DV, zero is a fine number. Although a little unexpected in this condition, it is reasonable and cannot be ignored. However, R recognizes, rightly, that lack of data is not the same as zero. I guess I could subset and aggregate the dataset to pull out a data.frame that contains the data aggregated at the right level for this aov and then add one record for this one Ss (well - actually I would add 8 records as block is a within-Ss factor that we have been looking at). But there must be a more elegant way of doing this especially as we are still in the exploratory phase and will be pulling out factors such as mean dwell time, total dwell time (dwell time per fix, summed over all fix) and other factors (e.g., tallies of sequential fixation chains -- e.g., obj-A, obj-B, vs obj-A, obj-C, etc, etc, etc). As always, your thoughts, comments, and code would be appreciated. Thanks, Wayne Gray __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
David W and all, Thank you very much for your help. Here is the final output that I want in the form of data frame. The data frame should contain x, group and group_ mean in the following way x group group mean 46 142.3 125 289.6 36 142.3 193 3235.25 209 3235.25 78 289.6 66 289.6 242 3235.25 297 3235.25 45 142.3 Thanks a lot On Tue, Apr 3, 2012 at 9:51 AM, David Winsemius dwinsem...@comcast.netwrote: On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote: Use cut2 as I suggested and David demonstrated. Agree that Hmisc::cut2 is extremely handy and I also like that fact that the closed ends of intervals are on the left side (which is not the same behavior as cut()), which has the otehr effect of setting include.lowest = TRUE which is not the default for cut() either (to my continued amazement). But let me add the method I use when doing it by hand: cut(x, quantile(x, prob=seq(0, 1, length=ngrps+1)), include.lowest=TRUE) -- David. Michael On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote: Thank you all (David, Michael, Giovanni) for your prompt response. First there was a typo error for the group mean it was 89.6 not 87. For a small data set and few groupings I can use prob=c(0, .333, .66 ,1) to group in to three groups in this case. However, if I want to extend the number of groupings say 10 or 15 then do I have to figure it out the split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) Is there a short cut for that? Thanks On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Ignoring the fact your desired answers are wrong, I'd split the separating part and the group means parts into three steps: i) quantile() can help you get the split points, ii) findInterval() can assign each y to a group iii) then ave() or tapply() will do group-wise means Something like: y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c here. ave(y, findInterval(y, quantile(y, c(0.33, 0.66 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean) You could also use cut2 from the Hmisc package to combine findInterval and quantile into a single step. Depending on your desired output. Hope that helps, Michael On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote: Hi all, Assume that I have the following 10 data points. x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sort x and get the following y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) I want to group the sorted data point (y) into equal number of observation per group. In this case there will be three groups. The first two groups will have three observation and the third will have four observations group 1 = 34, 45, 46 group 2 = 66, 78, 125 group 3 = 193, 209, 242,297 Finally I want to calculate the group mean group 1 = 42 group 2 = 87 group 3 = 234 Can anyone help me out? In SAS I used to do it using proc rank. thanks in advance Val [[alternative HTML version deleted]] __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/**posting-guide.htmlhttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/** posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
On Apr 3, 2012, at 10:11 AM, Val wrote: David W and all, Thank you very much for your help. Here is the final output that I want in the form of data frame. The data frame should contain x, group and group_ mean in the following way x group group mean 46 142.3 125 289.6 36 142.3 193 3235.25 209 3235.25 78 289.6 66 289.6 242 3235.25 297 3235.25 45 142.3 I you want group means in a vector the same length as x then instead of using tapply as done in earlier solutions you should use `ave`. -- DW Thanks a lot On Tue, Apr 3, 2012 at 9:51 AM, David Winsemius dwinsem...@comcast.net wrote: On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote: Use cut2 as I suggested and David demonstrated. Agree that Hmisc::cut2 is extremely handy and I also like that fact that the closed ends of intervals are on the left side (which is not the same behavior as cut()), which has the otehr effect of setting include.lowest = TRUE which is not the default for cut() either (to my continued amazement). But let me add the method I use when doing it by hand: cut(x, quantile(x, prob=seq(0, 1, length=ngrps+1)), include.lowest=TRUE) -- David. Michael On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote: Thank you all (David, Michael, Giovanni) for your prompt response. First there was a typo error for the group mean it was 89.6 not 87. For a small data set and few groupings I can use prob=c(0, .333, . 66 ,1) to group in to three groups in this case. However, if I want to extend the number of groupings say 10 or 15 then do I have to figure it out the split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) Is there a short cut for that? Thanks On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Ignoring the fact your desired answers are wrong, I'd split the separating part and the group means parts into three steps: i) quantile() can help you get the split points, ii) findInterval() can assign each y to a group iii) then ave() or tapply() will do group-wise means Something like: y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c here. ave(y, findInterval(y, quantile(y, c(0.33, 0.66 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean) You could also use cut2 from the Hmisc package to combine findInterval and quantile into a single step. Depending on your desired output. Hope that helps, Michael On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote: Hi all, Assume that I have the following 10 data points. x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sort x and get the following y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) I want to group the sorted data point (y) into equal number of observation per group. In this case there will be three groups. The first two groups will have three observation and the third will have four observations group 1 = 34, 45, 46 group 2 = 66, 78, 125 group 3 = 193, 209, 242,297 Finally I want to calculate the group mean group 1 = 42 group 2 = 87 group 3 = 234 Can anyone help me out? In SAS I used to do it using proc rank. thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT David Winsemius, MD West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] regression for poisson distributed data
On Tue, Apr 3, 2012 at 9:58 AM, Joachim Audenaert joachim.audena...@pcsierteelt.be wrote: Hello all, I would like to get parameter estimates for different models. For one of them I give the code in example. I am estimating the parameters (i,j and k) with the nls function, which sees the error distribution as normal, I would however like to do the same as nls with the assumption that the errors are poisson distributed. Is there a way to do this with R? Are there packages designed for this? I tried with the gnm package, but don't understand how to transform my equation to a generalised equation. Is there an option for nls to choose family = poisson? Lower in the mail the code with the model and visualisations I use to check my results. I also copied the test dataset from my txt file. I am using R 2.15 and Rstudio to visualise it. plot(FR~N0) x - nls(FR~(exp(i+j*N0)/(1+exp(i+j*N0)))*(k*N0/(k+N0)),start=list(i=0.02,j=0.002,k=6)) summary(x) hatx - predict(x) lines(spline(N0,hatx)) Just minimize the negative log likelihood from first principles. Define the mean function and then define the negative log likelihood in terms of that: Mean - function(p) { i - p[1]; j - p[2]; k - p[3]; exp(i+j*N0)/(1+exp(i+j*N0))*(k*N0/(k+N0)) } ll - function(p) - sum(dpois(FR, Mean(p), log = TRUE)) out - optim(c(1, 1, 1), ll); out plot(FR ~ N0, pch = 20) lines(Mean(out$par) ~ N0) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] identify with mfcol=c(1,2)
I would like to have a figure with two graphs. This is easily accomplished using mfcol: oldpar - par(mfcol=c(1,2)) plot(x,y) plot(z,x) par(oldpar) I run into trouble if I try to use identify with the two plots. If, after identifying points on my first graph I hit the ESC key, or hitting stop menu bar of my R session, the system stops the identification process, but fails to give me my second graph. Is there a way to allow for the identification of points when one is plotting to graphs in a single graph window? My code follows. plotter - function(first,second) { # Allow for two plots in on graph window. oldpar-par(mfcol=c(1,2)) #Bland-Altman plot. plot((second+first)/2,second-first) abline(0,0) # Allow for indentification of extreme values. BAzap-identify((second+first)/2,second-first,labels = seq_along(data$Line)) print(BAzap) # Plot second as a function of first value. plot(first,second,main=Limin vs. Limin,xlab=First (cm^2),ylab=Second (cm^3)) # Add identity line. abline(0,1,lty=2,col=red) # Allow for identification of extreme values. zap-identify(first,second,labels = seq_along(data$Line)) print(zap) # Add regression line. fit1-lm(first~second) print(summary(fit1)) abline(fit1) print(summary(fit1)$sigma) # reset par to default values. par(oldpar) } plotter(first,second) Thanks, John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] meaning of sigma from LM, is it the same as RMSE
Is the sigma from a lm, i.e. fit1 - lm(y~x) summary(fit1) summary(fit1)$sigma the RMSE (root mean square error) Thanks, John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] indexing in a function doesn't work?
Hi Benjamin, You seem to have the right basic ideas, but a lot of your code had typos and some logic flaws that I guess came from trying to move from just code to in a function. I attached the changes I made. What I would strongly encourage you to do, is work through each of the little functions I made and: 1) make sure you understand what it is doing 2) make sure each small function works properly---this means creating a *variety* of plausible test cases and trying them out 3) once all of the pieces work, then try to wrap them up in your overall plotter() The original function you had was large and had many errors, but it is difficult to debug something when there can be errors coming from multiple places. Easier is to break your work into small, tractable chunks. Plan in advance what your final goal is, and how each piece will fit in to that. Then write each piece and ensure that it works. From this point, you will have a much easier time bundling all the pieces together (even still, there may be additional work, but it will be more doable because you can be reasonably certain all your code works, it just does not quite work together. A few functions I used may be new to you, so I would also suggest reading the documentation (I know this can be tedious, but it is valuable) ?match.arg ?switch ?on.exit note that ... are arguments passed down from the final function plotter to internal ones. They must be named if they are passed to You are off to a good start, and I think with some more work, you can get this going fine. Long run you may have less headaches and stress if you take more time at the beginning to write clean code. I hope this helps, Josh On Sun, Apr 1, 2012 at 4:34 PM, Benjamin Caldwell btcaldw...@berkeley.edu wrote: Josh, Many thanks - here's a subset of the data and a couple examples: plotter(10,3,fram=rwb,framvec=rwb$prcnt.char.depth,obj=prcnt.char.depth,form1= post.f.crwn.length~shigo.av,form2=post.f.crwn.length~shigo.av-1, form3=leaf.area~(1/exp(shigo.av*x))*n,type=2,xlm=70,ylm=35) plotter(10,3,fram=rwb, framvec=rwb$prcnt.char.depth, obj=prcnt.char.depth, form1= post.f.crwn.length~leaf.area, form2=post.f.crwn.length~leaf.area-1, form3=leaf.area~(1/exp(shigo.av*x))*n,type=1, xlm=1500, ylm=35, sx=.01,sn=25) plotter-function(a,b,fram,framvec,obj,form1,form2,form3, type=1, xlm, ylm, sx=.01,sn=25){ g-ceiling(a/b) par(mfrow=c(b,g)) num-rep(0,a) sub.plotter-function(i,fram,framvec,obj,form1,form2,form3,type, xlm,ylm,var1,var2){ temp.i-fram[framvec =(i*.10),] #trees in the list that have an attribute less than or equal to a progressively larger percentage plot(form1, data=temp.i, xlim=c(0,xlm), ylim=c(0,ylm), main=((i-1)*.10)) if(type==1){ mod-lm(form2,data=temp.i) r2-summary(mod)$adj.r.squared num[i]-r2 legend(bottomright, legend=signif(r2), col=black) abline(mod) num} else{ if(type==2){ try(mod-nls(form3, data=temp.i, start=list(x=sx,n=sn), na.action=na.omit), silent=TRUE) try(x1-summary(mod)$coefficients[1,1], silent=TRUE) try(n1-summary(mod)$coefficients[2,1], silent=TRUE) try(lines((1/exp(c(0:70)*x1)*n1)), silent=TRUE) try(num[i]-AIC(mod), silent=TRUE) try(legend(bottomright, legend=round(num[i],3) , col=black), silent=TRUE) try((num), silent=TRUE) } }} for(i in 0:a+1){ num-sub.plotter(i,fram,framvec,obj,form1,form2,form3,type,xlm,ylm) } plot.cor-function(x){ temp-a+1 lengthx-c(1:temp) plot(x~c(1:temp)) m2-lm(x~c(1:temp)) abline(m2) n-summary(m2)$adj.r.squared legend(bottomright, legend=signif(n), col=black) slope-(coef(m2)[2])# slope values-(num)#values for aic or adj r2 r2ofr2-(n) #r2 of r2 or AIC output-data.frame(lengthx,slope,values,r2ofr2) } plot.cor(num) write.csv(plot.cor(num)$output,output.csv) # can't seem to use paste(substitute(form3),.csv,sep=) to name it at the moment par(mfrow=c(1,1)) } On Sun, Apr 1, 2012 at 3:25 PM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hi, Glancing through your code it was not immediately obvious to me why it does not work, but I can see a lot of things that could be simplified. It would really help if you could give us a reproducible example. Find/upload/create (in R) some data, and examples of how you would use the function. Right now, I can only guess what your data etc. are like and based on your description plus what the code you wrote seems to expect to be given. I could try to give code suggestions, but I have no easy way of testing them so it would be very easy to make typos, etc. Then you just get back my edits to your code that still do not work and maybe it is because of something fundamentally wrong with what I have done, a simple typo, or something else still wrong in your code that I did not fix. Anyway, if you send some data and an example using your function (i.e., using the data you send, write our form1, form2, type, etc.), I will take a look at your function and see if I can make it run. Cheers, Josh On Sun,
[R] How to change color in R2HTML?
Hi all, I just wanted to highlight something unusual in the output from R. Lets say we expect most results to be positive and we would like to flag the negative outputs as unusual and hight them in bold red or yellow in the R2HTML output... How to do that? I am not sophisticated enough for handling CSS stuff... and just wanted to get something done quick... Could you please help me? Thanks a lot! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] hclust and plot functions work, cutree does not
On Tue, Apr 3, 2012 at 2:41 AM, vinod1 vinod.hegd...@gmail.com wrote: Sarah, . clust_tree=hclust(as.dist(x),method=complete) . plot(clust_tree) this produces a dendrogram, whereas . clust_tree=hclust(as.dist(x),method=complete) . cut = cutree(clust_tree,k=1:5) . plot(cut) produces a plot with 2 dots. The dissimilarity matrix x is 100*100 matrix. What kind of output do you expect? If you want to see the clustering tree and an indication of which object belongs to which cluster, install package WGCNA and use the function plotDendroAndColors. Continuing with Sarah's example, you can use hc - hclust(dist(USArrests)) labels = cutree(hc, k=1:5) require(WGCNA) plotDendroAndColors(hc, labels) You get the clustering tree plus a color indication of the cluster each object belongs to in each cut. You can also produce MDS plots colored by the cluster label. If you'd like to experiment with more involved ways of identifying branches (or subtrees) in the dendrogram, I can recommend the article (warning, shameless plug) Langfelder P, Zhang B, Horvath S (2007) Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R. Bioinformatics 2008 24(5):719-720 and the package dynamicTreeCut that the short article describes. You can see some example code at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting/ We use this approach in large gene expression data sets. HTH, Peter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] regression for poisson distributed data
¿Just write down the loglikelihood function and send it to optim? Kjetil On Tue, Apr 3, 2012 at 8:58 AM, Joachim Audenaert joachim.audena...@pcsierteelt.be wrote: Hello all, I would like to get parameter estimates for different models. For one of them I give the code in example. I am estimating the parameters (i,j and k) with the nls function, which sees the error distribution as normal, I would however like to do the same as nls with the assumption that the errors are poisson distributed. Is there a way to do this with R? Are there packages designed for this? I tried with the gnm package, but don't understand how to transform my equation to a generalised equation. Is there an option for nls to choose family = poisson? Lower in the mail the code with the model and visualisations I use to check my results. I also copied the test dataset from my txt file. I am using R 2.15 and Rstudio to visualise it. plot(FR~N0) x - nls(FR~(exp(i+j*N0)/(1+exp(i+j*N0)))*(k*N0/(k+N0)),start=list(i=0.02,j=0.002,k=6)) summary(x) hatx - predict(x) lines(spline(N0,hatx)) N0 FR 10 2 10 3 10 2 10 4 10 2 10 2 10 1 10 2 10 2 10 2 20 2 20 3 20 3 20 3 20 4 20 2 20 4 20 2 20 3 20 2 30 1 30 2 30 3 30 4 30 5 30 6 30 2 30 3 30 2 30 2 40 2 40 3 40 3 40 6 40 5 40 4 40 3 40 3 40 2 40 3 50 4 50 5 50 2 50 3 50 7 50 5 50 4 50 3 50 4 50 5 60 5 60 6 60 8 60 4 60 4 60 3 60 2 60 2 60 5 60 4 Kind Regards Joachim Don't waste paper! Think about the environment before printing this e-mail __ Joachim Audenaert Adviesdienst Gewasbescherming Proefcentrum voor Sierteelt Schaessestraat 18 B-9070 Destelbergen Tel. +32 9 353 94 71 Fax +32 9 353 94 95 E-mail: joachim.audena...@pcsierteelt.be www.pcsierteelt.be __ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] regression for poisson distributed data
On Tue, Apr 3, 2012 at 11:03 AM, David Winsemius dwinsem...@comcast.net wrote: On Apr 3, 2012, at 9:58 AM, Joachim Audenaert wrote: Hello all, I would like to get parameter estimates for different models. For one of them I give the code in example. I am estimating the parameters (i,j and k) with the nls function, which sees the error distribution as normal, Doesn't it really just minimize the squared residuals from a nonlinear objective function? I didn't think there was any assumption that there was a particular error structure. I would however like to do the same as nls with the assumption that the errors are poisson distributed. Is there a way to do this with R? Are there packages designed for this? I tried with the gnm package, but don't understand how to transform my equation to a generalised equation. Is there an option for nls to choose family = poisson? dat - read.table(text=N0 FR 10 2 10 3 snipped 60 2 60 2 60 5 60 4', header=TRUE) I was wondering if you could just use `glm` in the base/stats package: plot(jitter(FR)~jitter(N0), data=dat) ( reg=glm(FR ~ N0, data=dat, family=poisson) ) lines(10:60, predict(reg, newdata=list(N0=10:60), type=response)) # It gives you: FR ~ exp(alpha + beta*N0) + pois-error # The plot looks adequate. And I would say arguably better than the one I get with: x - nls(FR~(exp(i+j*N0)/(1+exp(i+j*N0)))*(k*N0/(k+N0)), data=dat, control =nls.control(maxiter=150, tol=0.01), start=list(i=1,j=.02,k=6)) summary(x) hatx - predict(x) lines(spline(dat$N0,hatx), col=red) # I also saw Gabor Grothendieck's suggestion which I'm sure is more # mathematically sophisticated than my hacking, but when I plot its # results, it seems to fit even less well than your original nls function. The normal will tend to overfit to the right side of the plot if the data is truly poisson since it won't take into account that the variance of those points is larger. Note how the normal plot rises toward the right more than the poisson plot suggesting the normal is letting those points unduly influence the fit. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Package seems to be present but library don't find it
Indeed I get this error message when I install the library using R CMD INSTALL but not within the GUI (in MacOsX). Good to know that R CMD INSTALL is more verbose and permits to track bug. Thanks a lot. It works fine now. Marc Le 03/04/12 16:03, Duncan Murdoch a écrit : On 03/04/2012 9:50 AM, Marc Girondot wrote: In case someone has the competence to check, the file is here: setwd(~) download.file(http://www.ese.u-psud.fr/epc/conservation/r-scripts/HelloWorld_1.0.tar.gz;, HelloWorld_1.0.tar.gz) install.packages(HelloWorld_1.0.tar.gz, repos = NULL) The problem is that you try to export the name HelloWorld in your NAMESPACE file, but you don't have an object of that name. You should export showHello instead. Not sure why you didn't see the error message I got: $ R CMD INSTALL HelloWorld_1.0.tar.gz * installing to library 'F:/cygwin/home/murdoch/R/win-library/2.15' * installing *source* package 'HelloWorld' ... ** R ** preparing package for lazy loading ** help Warning: C:/temp/RtmpimSWuh/R.INSTALLe801c296e9/HelloWorld/man/HelloWorld-packag e.Rd:32: All text must be in a section Warning: C:/temp/RtmpimSWuh/R.INSTALLe801c296e9/HelloWorld/man/HelloWorld-packag e.Rd:33: All text must be in a section *** installing help indices ** building package indices ** testing if installed package can be loaded Error in namespaceExport(ns, exports) : undefined exports: HelloWorld Error: loading failed Execution halted ERROR: loading failed * removing 'F:/cygwin/home/murdoch/R/win-library/2.15/HelloWorld' Duncan Murdoch -- __ Marc Girondot, Pr Laboratoire Ecologie, Systématique et Evolution Equipe de Conservation des Populations et des Communautés CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079 Bâtiment 362 91405 Orsay Cedex, France Tel: 33 1 (0)1.69.15.72.30 Fax: 33 1 (0)1.69.15.73.53 e-mail: marc.giron...@u-psud.fr Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html Skype: girondot [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] time-series features skewness kurtosis periodicity trend seasonality
Hello everyone, I found the paper A scalable method for time-series clustering and there are proposed several measures to characterize time-series like trend, seasonality, periodicity, serial correlation, skewness, kurtosis, self-similarity etc. They say they have implemented them on R, do you have any clue if there is a package calculating them? or any other packages that calculate some of them so that i can combine them? Thanks, John [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to map microarray probe to gene, homology
Hi: I have clustered microarray gene expression data and trying to map between microarray probe, gene, pathway, gene ontology, and homology for a set of (affy) microarray probes. Is there any package in R which facilitates this? I am looking at bioconductor, but till now could not find a solution. A link to some worked example would be appreciated. Thanks and regards. John [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How do you distinguish between characters on a pco plot?
I've read everything I can find, maybe I'm just missing something but I can't find an example which colours the points on a pco plot differently based on user specified parameters, I can colour every point differently or all the same, or randomly with a number of colours. What I need to do is specify that points a, b and c, for example will be blue, while points d, e, f, and g will be red and points h and i will be green. Is it possible to colour with uneven distribution, I suppose is my question? I can't find a solution in any manual. -- View this message in context: http://r.789695.n4.nabble.com/How-do-you-distinguish-between-characters-on-a-pco-plot-tp4496217p4529425.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] identify time span in date vector
Hello everyone, i try to identify the first element of a date vector, for which the following condition holds: at least 3 more dates within the next 365 days, but at least one of these must be between 3-12 month later. dates = as.Date(sort(rnorm(10,3000,100)), origin = 2000-1-1) Has anyone an idea how to do this economically? I'll need to apply this to a large dataset with date vectors of various lengths and I can think only of quite difficult algorithms :( Any ideas would be appreciated, Felix [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] filling small gaps of N/A
Michael, First of all, thank you very much for your answer. I've read your 2 answers, but I'm not really sure that they corresponds to my problem of NAs. I'll try to detail you a bit more. This problem concerns the second part of my program. In the first part, I've already created a timeseries object with the library (timeseries). I had to delete first all the wrong values in my data and replace it with NAs. So my data contains already missing data (NAs), as I have cleaned it before. The thing is that sometimes I have small gaps of missing data (only 2 or 3 following) like in example 1 below: example 1: 09/01/2008 12:00 1.93 09/01/2008 12:15 3.93 09/01/2008 12:30 NASo here you have a small gap with only 2 NAs 09/01/2008 12:45 NA 09/01/2008 13:00 4.93 09/01/2008 13:15 5.93 But sometimes, always in the same file, I have big gaps, such as 10 or more NAs following each other like in example 2 below: example 2: 09/01/2008 16:152.93 09/01/2008 16:302.93 09/01/2008 16:45NA 09/01/2008 17:00NA 09/01/2008 17:15NA 09/01/2008 17:30NA 09/01/2008 17:45NA 09/01/2008 18:00NA So here you have a big gap with more than 10 NAs following each other 09/01/2008 18:15NA 09/01/2008 18:30NA 09/01/2008 18:45NA 09/01/2008 19:00NA 09/01/2008 19:15NA 09/01/2008 19:30NA 09/01/2008 19:45NA 09/01/2008 20:00NA 09/01/2008 20:157.93 09/01/2008 20:307.93 So in the whole same file, I can have sometimes big gaps (2 or 3 NAs), sometimes big or very big gaps (10 or 100 NAs following). The aim of my problem is to apply the function: na.approx(x) of the library (zoo) to fill NAs ONLY for small gaps. If I just do: apply(na.approx(x)), it will fill all the NAs of my data (big gaps + small gaps). It's exactly what I DON'T WANT. My problem is to say to R: you apply the function (na.approx) to fill NAs ONLY if you see 4 NAs maximum following each other (small gaps) (like example 1). If you see more than 4 NAs following each other (big gaps like in example 2), you keep these NAs and you DON'T fill this big gap. My question is: how can I say this to R? I don't know how to do it. Hope I've been understandable this time ^^ Thanks a lot again for all your answers! -- View this message in context: http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528907.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] object of type 'S4' is not subsettable
hey there! The object 'cit' contains: cit # # Johansen-Procedure Unit Root / Cointegration Test # # The value of the test statistic is: 5.3484 9.0681 10.6433 --- I want R to save the value 5.3484 in a new object. I am used to use the command x=cit[a] where a stands for the place where R has saved the value. However, the johansen procedure works with S4 objects with which subsetting does not seem to work. Any ideas how I could get R save a certain value of the output into a new object? Thousand thanks, Philipp -- View this message in context: http://r.789695.n4.nabble.com/object-of-type-S4-is-not-subsettable-tp4529063p4529063.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] conversion of Modified Julian Days to YMDHMS format
Does anybody knows how to convert Modified Julian Days(MJD) to YMDHMS format? The sample data set which I have is as follows time - c(1832.415 ,1832.415 ,1832.415 ,1832.415 ,1832.415 ,1832.415 ,1832.415, 1832.415, 1832.415 ,1832.415) which is Days since 1.1.2000 (in UT time, starting at 00:00:00) -- View this message in context: http://r.789695.n4.nabble.com/conversion-of-Modified-Julian-Days-to-YMDHMS-format-tp4528975p4528975.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Import from excel button in R-command
Hello I have been searching for almost 2 hours for a certain plug-in/package, so im making this thread as i hope you can help me find it. I had my first lesson in Statistics in use today, and when we worked on the school computers, we could do this to import data from excel: Data Import Data Import from excel or something else Now i downloaded it for my laptop as i want to work with it from home, but there is no button to import from excel in my edition. Is there anyone who knows how to get this feature, so i dont have to type in commands or change my excel documents everytime? Thanks in advance /Nick -- View this message in context: http://r.789695.n4.nabble.com/Import-from-excel-button-in-R-command-tp4529023p4529023.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] object of type 'S4' is not subsettable
Figured it out. given that object 'cit' is of class S4, extracting information works as follows: 1)finding out names of slots in object 'cit' slotNames(cit) [1] x Z0Z1ZKtype model [7] ecdet lag P seasondumvarcval [13] teststat lambdaVorg V W PI [19] DELTA GAMMA R0RKbpspec [25] call test.name 2) extracting info from wanted slot cit@teststat [1] 5.348440 9.068113 10.643293 -- View this message in context: http://r.789695.n4.nabble.com/object-of-type-S4-is-not-subsettable-tp4529063p4529432.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Create Model Object (setClass?setMethod?)
Hi all, I have a self written likelihood as a model and functions to optimize and get fitted values, confidence intervals ect. I wonder if there is a way to define a 'class', or a 'model' (or a certain object)? so that I can use 'summary' to produce a summary like it does for a lm object. Also, it should be able to use 'predict' and 'plot' and other various generic functions. I am reading bits and pieces on the internet on 'setClass', 'setMethod'. Am I looking for the correct thing? Is there any up to date references that I can get help? I need some examples to get started with. Thanks! Casper - ### PhD candidate in Statistics School of Mathematics, Statistics and Actuarial Science, University of Kent ### -- View this message in context: http://r.789695.n4.nabble.com/Create-Model-Object-setClass-setMethod-tp4529473p4529473.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How do you distinguish between characters on a pco plot?
You can pass a vector of colors to the col= argument, eg col=c(1, 1, 2, 3, 2, 1) to match your parameters. Or, in your case, c(rep(blue, 3), rep(red, 4), rep(2, green)) - untested because you didn't provide an example. You should read ?par and ?plot Unless you're using lattice graphics, but you don't tell us. Sarah On Tue, Apr 3, 2012 at 12:39 PM, leannehaggerty leannehagge...@gmail.com wrote: I've read everything I can find, maybe I'm just missing something but I can't find an example which colours the points on a pco plot differently based on user specified parameters, I can colour every point differently or all the same, or randomly with a number of colours. What I need to do is specify that points a, b and c, for example will be blue, while points d, e, f, and g will be red and points h and i will be green. Is it possible to colour with uneven distribution, I suppose is my question? I can't find a solution in any manual. -- -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] identify time span in date vector
On Apr 3, 2012, at 9:35 AM, Fischer, Felix wrote: Hello everyone, i try to identify the first element of a date vector, for which the following condition holds: at least 3 more dates within the next 365 days, but at least one of these must be between 3-12 month later. dates = as.Date(sort(rnorm(10,3000,100)), origin = 2000-1-1) Has anyone an idea how to do this economically? I'll need to apply this to a large dataset with date vectors of various lengths and I can think only of quite difficult algorithms :( which( dates[4:(length(dates))] -dates[1:(length(dates)-3)] 365 dates[3:(length(dates)-1)] -dates[1:(length(dates)-3)] 90) [1] 2 3 Any ideas would be appreciated, Felix [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] time-series features skewness kurtosis periodicity trend seasonality
John, If the paper is the one by Xiaozhe Wang, Kate A. Smith, Rob Hyndman, and Damminda Alahakoon, the way I read it, it does not say they created an R package (that may be on CRAN). They may have just used R in their analyses; just a guess... Tom On Tue, Apr 3, 2012 at 12:34 PM, John Kohr illuminati...@hotmail.comwrote: Hello everyone, I found the paper A scalable method for time-series clustering and there are proposed several measures to characterize time-series like trend, seasonality, periodicity, serial correlation, skewness, kurtosis, self-similarity etc. They say they have implemented them on R, do you have any clue if there is a package calculating them? or any other packages that calculate some of them so that i can combine them? Thanks, John [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Thomas E Adams National Weather Service Ohio River Forecast Center 1901 South State Route 134 Wilmington, OH 45177 EMAIL: thomas.ad...@noaa.gov VOICE: 937-383-0528 FAX:937-383-0033 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] time-series features skewness kurtosis periodicity trend seasonality
On Apr 3, 2012, at 12:34 PM, John Kohr wrote: Hello everyone, I found the paper A scalable method for time-series clustering and there are proposed several measures to characterize time-series like trend, seasonality, periodicity, serial correlation, skewness, kurtosis, self-similarity etc. They say they have implemented them on R, do you have any clue if there is a package calculating them? or any other packages that calculate some of them so that i can combine them? http://cran.r-project.org/web/views/TimeSeries.html -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] filling small gaps of N/A
x - read.table(text=09/01/2008 12:00 1.93 + 09/01/2008 12:15 3.93 + 09/01/2008 12:30 NA + 09/01/2008 12:45 NA + 09/01/2008 13:00 4.93 + 09/01/2008 13:15 5.93 + 09/01/2008 16:152.93 + 09/01/2008 16:302.93 + 09/01/2008 16:45NA + 09/01/2008 17:00NA + 09/01/2008 17:15NA + 09/01/2008 17:30NA + 09/01/2008 17:45NA + 09/01/2008 18:00NA + 09/01/2008 18:15NA + 09/01/2008 18:30NA + 09/01/2008 18:45NA + 09/01/2008 19:00NA + 09/01/2008 19:15NA + 09/01/2008 19:30NA + 09/01/2008 19:45NA + 09/01/2008 20:00NA + 09/01/2008 20:157.93 + 09/01/2008 20:307.93, as.is = TRUE) # find the NA gaps and process differently gaps - rle(is.na(x$V3)) offsets - c(1, cumsum(head(gaps$lengths, -1)) + 1) (shortgaps - which(gaps$values (gaps$lengths = 4))) [1] 2 (longgaps - which(gaps$values (gaps$lengths 4))) [1] 4 # now that you have the indices of where the short/long gaps are # you can process each individually; left as an exercise to the reader On Tue, Apr 3, 2012 at 10:13 AM, jeff6868 geoffrey_kl...@etu.u-bourgogne.fr wrote: Michael, First of all, thank you very much for your answer. I've read your 2 answers, but I'm not really sure that they corresponds to my problem of NAs. I'll try to detail you a bit more. This problem concerns the second part of my program. In the first part, I've already created a timeseries object with the library (timeseries). I had to delete first all the wrong values in my data and replace it with NAs. So my data contains already missing data (NAs), as I have cleaned it before. The thing is that sometimes I have small gaps of missing data (only 2 or 3 following) like in example 1 below: example 1: 09/01/2008 12:00 1.93 09/01/2008 12:15 3.93 09/01/2008 12:30 NA So here you have a small gap with only 2 NAs 09/01/2008 12:45 NA 09/01/2008 13:00 4.93 09/01/2008 13:15 5.93 But sometimes, always in the same file, I have big gaps, such as 10 or more NAs following each other like in example 2 below: example 2: 09/01/2008 16:15 2.93 09/01/2008 16:30 2.93 09/01/2008 16:45 NA 09/01/2008 17:00 NA 09/01/2008 17:15 NA 09/01/2008 17:30 NA 09/01/2008 17:45 NA 09/01/2008 18:00 NA So here you have a big gap with more than 10 NAs following each other 09/01/2008 18:15 NA 09/01/2008 18:30 NA 09/01/2008 18:45 NA 09/01/2008 19:00 NA 09/01/2008 19:15 NA 09/01/2008 19:30 NA 09/01/2008 19:45 NA 09/01/2008 20:00 NA 09/01/2008 20:15 7.93 09/01/2008 20:30 7.93 So in the whole same file, I can have sometimes big gaps (2 or 3 NAs), sometimes big or very big gaps (10 or 100 NAs following). The aim of my problem is to apply the function: na.approx(x) of the library (zoo) to fill NAs ONLY for small gaps. If I just do: apply(na.approx(x)), it will fill all the NAs of my data (big gaps + small gaps). It's exactly what I DON'T WANT. My problem is to say to R: you apply the function (na.approx) to fill NAs ONLY if you see 4 NAs maximum following each other (small gaps) (like example 1). If you see more than 4 NAs following each other (big gaps like in example 2), you keep these NAs and you DON'T fill this big gap. My question is: how can I say this to R? I don't know how to do it. Hope I've been understandable this time ^^ Thanks a lot again for all your answers! -- View this message in context: http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528907.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] filling small gaps of N/A
Forgot to mention that the offsets were into the 'gaps' (result of the rle) and 'offsets' which is the index into the original data there the gap starts. gaps Run Length Encoding lengths: int [1:5] 2 2 4 14 2 values : logi [1:5] FALSE TRUE FALSE TRUE FALSE offsets [1] 1 3 5 9 23 On Tue, Apr 3, 2012 at 1:12 PM, jim holtman jholt...@gmail.com wrote: x - read.table(text=09/01/2008 12:00 1.93 + 09/01/2008 12:15 3.93 + 09/01/2008 12:30 NA + 09/01/2008 12:45 NA + 09/01/2008 13:00 4.93 + 09/01/2008 13:15 5.93 + 09/01/2008 16:15 2.93 + 09/01/2008 16:30 2.93 + 09/01/2008 16:45 NA + 09/01/2008 17:00 NA + 09/01/2008 17:15 NA + 09/01/2008 17:30 NA + 09/01/2008 17:45 NA + 09/01/2008 18:00 NA + 09/01/2008 18:15 NA + 09/01/2008 18:30 NA + 09/01/2008 18:45 NA + 09/01/2008 19:00 NA + 09/01/2008 19:15 NA + 09/01/2008 19:30 NA + 09/01/2008 19:45 NA + 09/01/2008 20:00 NA + 09/01/2008 20:15 7.93 + 09/01/2008 20:30 7.93, as.is = TRUE) # find the NA gaps and process differently gaps - rle(is.na(x$V3)) offsets - c(1, cumsum(head(gaps$lengths, -1)) + 1) (shortgaps - which(gaps$values (gaps$lengths = 4))) [1] 2 (longgaps - which(gaps$values (gaps$lengths 4))) [1] 4 # now that you have the indices of where the short/long gaps are # you can process each individually; left as an exercise to the reader On Tue, Apr 3, 2012 at 10:13 AM, jeff6868 geoffrey_kl...@etu.u-bourgogne.fr wrote: Michael, First of all, thank you very much for your answer. I've read your 2 answers, but I'm not really sure that they corresponds to my problem of NAs. I'll try to detail you a bit more. This problem concerns the second part of my program. In the first part, I've already created a timeseries object with the library (timeseries). I had to delete first all the wrong values in my data and replace it with NAs. So my data contains already missing data (NAs), as I have cleaned it before. The thing is that sometimes I have small gaps of missing data (only 2 or 3 following) like in example 1 below: example 1: 09/01/2008 12:00 1.93 09/01/2008 12:15 3.93 09/01/2008 12:30 NA So here you have a small gap with only 2 NAs 09/01/2008 12:45 NA 09/01/2008 13:00 4.93 09/01/2008 13:15 5.93 But sometimes, always in the same file, I have big gaps, such as 10 or more NAs following each other like in example 2 below: example 2: 09/01/2008 16:15 2.93 09/01/2008 16:30 2.93 09/01/2008 16:45 NA 09/01/2008 17:00 NA 09/01/2008 17:15 NA 09/01/2008 17:30 NA 09/01/2008 17:45 NA 09/01/2008 18:00 NA So here you have a big gap with more than 10 NAs following each other 09/01/2008 18:15 NA 09/01/2008 18:30 NA 09/01/2008 18:45 NA 09/01/2008 19:00 NA 09/01/2008 19:15 NA 09/01/2008 19:30 NA 09/01/2008 19:45 NA 09/01/2008 20:00 NA 09/01/2008 20:15 7.93 09/01/2008 20:30 7.93 So in the whole same file, I can have sometimes big gaps (2 or 3 NAs), sometimes big or very big gaps (10 or 100 NAs following). The aim of my problem is to apply the function: na.approx(x) of the library (zoo) to fill NAs ONLY for small gaps. If I just do: apply(na.approx(x)), it will fill all the NAs of my data (big gaps + small gaps). It's exactly what I DON'T WANT. My problem is to say to R: you apply the function (na.approx) to fill NAs ONLY if you see 4 NAs maximum following each other (small gaps) (like example 1). If you see more than 4 NAs following each other (big gaps like in example 2), you keep these NAs and you DON'T fill this big gap. My question is: how can I say this to R? I don't know how to do it. Hope I've been understandable this time ^^ Thanks a lot again for all your answers! -- View this message in context: http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528907.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what
Re: [R] Import from excel button in R-command
On Apr 3, 2012, at 10:57 AM, nimoke wrote: Hello I have been searching for almost 2 hours for a certain plug-in/ package, so im making this thread as i hope you can help me find it. I had my first lesson in Statistics in use today, and when we worked on the school computers, we could do this to import data from excel: Data Import Data Import from excel or something else Now i downloaded it for my laptop as i want to work with it from home, but there is no button to import from excel in my edition. Is there anyone who knows how to get this feature, so i dont have to type in commands or change my excel documents everytime? You were probably set up with one of the GUI's, perhaps the Rcmdr GUI http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/ This would be a widely cited set of instructions about how to do it without dependance on a GUI: http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] time-series features skewness kurtosis periodicity trend seasonality
Thx for the link, I have seen these packages before, but looking to them, i couldn't find any that computes these statistics/measures. Any clue if any particular of them computes any of these characteristics? One of the co-authors of the paper is Prof. Hyndman who has contributed in many of these packages.. that's why i was guessing there would be implementations for these measures. best, john CC: r-help@r-project.org From: dwinsem...@comcast.net To: illuminati...@hotmail.com Subject: Re: [R] time-series features skewness kurtosis periodicity trend seasonality Date: Tue, 3 Apr 2012 13:00:42 -0400 On Apr 3, 2012, at 12:34 PM, John Kohr wrote: Hello everyone, I found the paper A scalable method for time-series clustering and there are proposed several measures to characterize time-series like trend, seasonality, periodicity, serial correlation, skewness, kurtosis, self-similarity etc. They say they have implemented them on R, do you have any clue if there is a package calculating them? or any other packages that calculate some of them so that i can combine them? http://cran.r-project.org/web/views/TimeSeries.html -- David Winsemius, MD West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to map microarray probe to gene, homology
On Tue, Apr 3, 2012 at 7:21 AM, email mail email8...@gmail.com wrote: Hi: I have clustered microarray gene expression data and trying to map between microarray probe, gene, pathway, gene ontology, and homology for a set of (affy) microarray probes. Is there any package in R which facilitates this? I am looking at bioconductor, but till now could not find a solution. A link to some worked example would be appreciated. Yes, Bioconductor has annotation packages for microarray chips plus other packages for connecting gene identifiers to (for example) gene ontologies. For example, the package hgu133plus2.db contains annotations for the U133 plus 2 chip and you can use to map affy probe IDs to gene symbols, IDs, GO terms etc. Install the package, load it, then type ls(package:hgu133plus2.db) and look at the help files for each topic for more details and examples. The array annotations are chip-specific so you need to choose the right package for your chip. Peter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] filling small gaps of N/A
On Tue, Apr 3, 2012 at 4:52 AM, jeff6868 geoffrey_kl...@etu.u-bourgogne.fr wrote: Hi everybody, I'm a new R french user. Sorry if my english is not perfect. Hope you'll understand my problem ;) I have to work on temperature data (35000 lines in one file) containing some missing data (N/A). Sometimes I have only 2 or 3 N/A following each other, but I have also sometimes 100 or 200 N/A following each other. Here's an example of my data, when I have only small gaps of missing data (2 or 3 N/A): 09/01/2008 12:00 2 1.93 2.93 4.56 5.43 09/01/2008 12:15 2 *3.93* 3.25 4.93 5.56 09/01/2008 12:30 2 NA 3.5 5.06 5.56 09/01/2008 12:45 2 NA 3.68 5.25 5.68 09/01/2008 13:00 2 *4.93 * 3.87 5.56 5.93 09/01/2008 13:15 2 5.93 4.25 5.75 6.06 09/01/2008 13:30 2 3.93 4.56 5.93 6.18 My question is: how can I replace these small gaps of N/A by numeric values? I would like a fonction which only replace the small gaps (2 or 3 N/A) in my data, but not the big gaps (more than 5 N/A following each other). Try na.locf, na.approx or na.spline in the zoo package noting the maxgap= argument on each. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ff problems
this works: bigcl - read.table(file=bigCL1.csv,sep=',',header=T , colClasses=c(rep(factor,3),numeric,NULL,integer, numeric,integer,NULL,numeric,NULL),nrow=1000) this doesn't: bigcl - read.table.ffdf(file=bigCL1.csv,sep=',',header=T , colClasses=c(TERMGROUP=factor,SN=factor, INS=factor,INCENTIVE=numeric,SRC=character,FLAG=integer, RELAGE=numeric,BURNSUM=integer,prod=character,BAL=numeric,origbal=NULL) ) Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered, : vmode 'character' not implemented neither this: bigcl - read.table.ffdf(file=bigCL1.csv,sep=',',header=T , + colClasses=c(TERMGROUP=factor,SN=factor, + INS=factor,INCENTIVE=numeric,SRC=NULL,FLAG=integer, + RELAGE=numeric,BURNSUM=integer,prod=NULL,BAL=numeric,origbal=NULL) + ) Error in repnam(colClasses, colnames(x), default = NA) : the following argument names do not match'SRC','prod','origbal' if I load as is without classes then: bigcl - read.table.ffdf(file=bigCL1.csv,sep=',',header=T) bigcl$term - factor(bigcl$term) Error in sort.list(y) : 'x' must be atomic for 'sort.list' Have you called 'sort' on a list? attempting to save it: ffsave(bigcl,file=fcl) Error in system(cmd, input = filelist, intern = TRUE) : 'zip' not found attempting to get rid of the unwanted columns: bigcl - bigcl[,c(-5,-9,-11)] Error: cannot allocate vector of size 129.8 Mb In addition: Warning messages: 1: In class(df) - data.frame : Reached total allocation of 1535Mb: see help(memory.size) 2: In class(df) - data.frame : Reached total allocation of 1535Mb: see help(memory.size) 3: In class(df) - data.frame : Reached total allocation of 1535Mb: see help(memory.size) 4: In class(df) - data.frame : Reached total allocation of 1535Mb: see help(memory.size) ff was supposed to be gentle on RAM usage?? I am attaching the first 1000 lines from the csv file. thanks a lot. Stephen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
Hi All, On the same data points x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 ) I want to have have the following output as data frame x group group mean 46 142.3 125 289.6 36 142.3 193 3235.25 209 3235.25 78 289.6 66 289.6 242 3235.25 297 3235.25 45 142.3 I tried the following code dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1 gxc - with(dat, tapply(xc, group, mean)) dat$gxc - gxce[as.character(dat$group)] txc=dat$gxc it did not work for me. On Tue, Apr 3, 2012 at 10:15 AM, David Winsemius dwinsem...@comcast.netwrote: On Apr 3, 2012, at 10:11 AM, Val wrote: David W and all, Thank you very much for your help. Here is the final output that I want in the form of data frame. The data frame should contain x, group and group_ mean in the following way x group group mean 46 142.3 125 289.6 36 142.3 193 3235.25 209 3235.25 78 289.6 66 289.6 242 3235.25 297 3235.25 45 142.3 I you want group means in a vector the same length as x then instead of using tapply as done in earlier solutions you should use `ave`. -- DW Thanks a lot On Tue, Apr 3, 2012 at 9:51 AM, David Winsemius dwinsem...@comcast.netwrote: On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote: Use cut2 as I suggested and David demonstrated. Agree that Hmisc::cut2 is extremely handy and I also like that fact that the closed ends of intervals are on the left side (which is not the same behavior as cut()), which has the otehr effect of setting include.lowest = TRUE which is not the default for cut() either (to my continued amazement). But let me add the method I use when doing it by hand: cut(x, quantile(x, prob=seq(0, 1, length=ngrps+1)), include.lowest=TRUE) -- David. Michael On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote: Thank you all (David, Michael, Giovanni) for your prompt response. First there was a typo error for the group mean it was 89.6 not 87. For a small data set and few groupings I can use prob=c(0, .333, .66 ,1) to group in to three groups in this case. However, if I want to extend the number of groupings say 10 or 15 then do I have to figure it out the split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) Is there a short cut for that? Thanks On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Ignoring the fact your desired answers are wrong, I'd split the separating part and the group means parts into three steps: i) quantile() can help you get the split points, ii) findInterval() can assign each y to a group iii) then ave() or tapply() will do group-wise means Something like: y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c here. ave(y, findInterval(y, quantile(y, c(0.33, 0.66 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean) You could also use cut2 from the Hmisc package to combine findInterval and quantile into a single step. Depending on your desired output. Hope that helps, Michael On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote: Hi all, Assume that I have the following 10 data points. x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45) sort x and get the following y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297) I want to group the sorted data point (y) into equal number of observation per group. In this case there will be three groups. The first two groups will have three observation and the third will have four observations group 1 = 34, 45, 46 group 2 = 66, 78, 125 group 3 = 193, 209, 242,297 Finally I want to calculate the group mean group 1 = 42 group 2 = 87 group 3 = 234 Can anyone help me out? In SAS I used to do it using proc rank. thanks in advance Val [[alternative HTML version deleted]] __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/**posting-guide.htmlhttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/** posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT David Winsemius, MD West Hartford, CT
Re: [R] grouping
On Tue, Apr 03, 2012 at 02:21:36PM -0400, Val wrote: Hi All, On the same data points x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 ) I want to have have the following output as data frame x group group mean 46 142.3 125 289.6 36 142.3 193 3235.25 209 3235.25 78 289.6 66 289.6 242 3235.25 297 3235.25 45 142.3 I tried the following code dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1 gxc - with(dat, tapply(xc, group, mean)) dat$gxc - gxce[as.character(dat$group)] txc=dat$gxc it did not work for me. David Winsemius suggested to use ave(), when you asked this question for the first time. Can you have look at it? Petr Savicky. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] filling small gaps of N/A
Like I said in my followup, please pass the maxgap argument: i.e., na.approx(x, maxgap = 4) x - zoo(1:20, Sys.Date() + 1:20) x[2:4] - NA # Short run of NA's x[10:16] - NA # Long run of NA's na.approx(x) # All filled in na.approx(x, maxgap = 4) # Only the short one filled in Michael On Tue, Apr 3, 2012 at 10:13 AM, jeff6868 geoffrey_kl...@etu.u-bourgogne.fr wrote: Michael, First of all, thank you very much for your answer. I've read your 2 answers, but I'm not really sure that they corresponds to my problem of NAs. I'll try to detail you a bit more. This problem concerns the second part of my program. In the first part, I've already created a timeseries object with the library (timeseries). I had to delete first all the wrong values in my data and replace it with NAs. So my data contains already missing data (NAs), as I have cleaned it before. The thing is that sometimes I have small gaps of missing data (only 2 or 3 following) like in example 1 below: example 1: 09/01/2008 12:00 1.93 09/01/2008 12:15 3.93 09/01/2008 12:30 NA So here you have a small gap with only 2 NAs 09/01/2008 12:45 NA 09/01/2008 13:00 4.93 09/01/2008 13:15 5.93 But sometimes, always in the same file, I have big gaps, such as 10 or more NAs following each other like in example 2 below: example 2: 09/01/2008 16:15 2.93 09/01/2008 16:30 2.93 09/01/2008 16:45 NA 09/01/2008 17:00 NA 09/01/2008 17:15 NA 09/01/2008 17:30 NA 09/01/2008 17:45 NA 09/01/2008 18:00 NA So here you have a big gap with more than 10 NAs following each other 09/01/2008 18:15 NA 09/01/2008 18:30 NA 09/01/2008 18:45 NA 09/01/2008 19:00 NA 09/01/2008 19:15 NA 09/01/2008 19:30 NA 09/01/2008 19:45 NA 09/01/2008 20:00 NA 09/01/2008 20:15 7.93 09/01/2008 20:30 7.93 So in the whole same file, I can have sometimes big gaps (2 or 3 NAs), sometimes big or very big gaps (10 or 100 NAs following). The aim of my problem is to apply the function: na.approx(x) of the library (zoo) to fill NAs ONLY for small gaps. If I just do: apply(na.approx(x)), it will fill all the NAs of my data (big gaps + small gaps). It's exactly what I DON'T WANT. My problem is to say to R: you apply the function (na.approx) to fill NAs ONLY if you see 4 NAs maximum following each other (small gaps) (like example 1). If you see more than 4 NAs following each other (big gaps like in example 2), you keep these NAs and you DON'T fill this big gap. My question is: how can I say this to R? I don't know how to do it. Hope I've been understandable this time ^^ Thanks a lot again for all your answers! -- View this message in context: http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528907.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
I did look at it the result is below, x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 ) #lapply( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) , include.lowest=TRUE) ), mean) ave( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) , include.lowest=TRUE) ), mean) ave( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) , include.lowest=TRUE) ), mean) $`[36,74]` [1] NA $`(74,197]` [1] NA $`(197,297]` [1] NA There were 11 warnings (use warnings() to see them) On Tue, Apr 3, 2012 at 2:35 PM, Petr Savicky savi...@cs.cas.cz wrote: On Tue, Apr 03, 2012 at 02:21:36PM -0400, Val wrote: Hi All, On the same data points x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 ) I want to have have the following output as data frame x group group mean 46 142.3 125 289.6 36 142.3 193 3235.25 209 3235.25 78 289.6 66 289.6 242 3235.25 297 3235.25 45 142.3 I tried the following code dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1 gxc - with(dat, tapply(xc, group, mean)) dat$gxc - gxce[as.character(dat$group)] txc=dat$gxc it did not work for me. David Winsemius suggested to use ave(), when you asked this question for the first time. Can you have look at it? Petr Savicky. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Mixed italic and non-italic in text
Hi, I need to italicize the journal in a citation but have thus far failed. How can I make 'Journal of Something' below italic but leave the rest? mtext( See Author1 and Author2 (2007) , \Title\, Journal of Something , pp. 1-50., side = 3, outer = T, line=-0.75, cex = 0.7, at= 0.04, adj = 0, font = 1, family = Times) -- View this message in context: http://r.789695.n4.nabble.com/Mixed-italic-and-non-italic-in-text-tp4529710p4529710.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R equivalent for SQL query
Hi, I have a query which I would like to translate into R, but I do not know how to do it in an easy way. Assume a data frame has columns A, B and C: A B C 1 1 3 1 1 4 1 1 5 1 2 6 1 2 7 1 3 8 The query is as follows: select A, B, count(*) from data.frame group by A, B order by count(*) desc How do I translate this into R statements in such way that the result is a data frame structured as follows: A B count(*) 1 1 3 1 2 2 1 3 1 Thanks, Steven __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unable to move temporary installation
A final followup. I have identified a rather extreme workaround. The problem arises when the function utils:::unpackPkgZip uses file.rename(...) to move the unzipped binary package from the temporary directory that it was unpacked into into the proper directory in the library tree. If one does debug(utils:::unpackPkgZip) and then steps through the function line by line, it works. Thank you. On Mon, Apr 2, 2012 at 12:06 PM, Drew Tyre aty...@unl.edu wrote: OK - so I followed the following steps, which I think rule out those causes 1) I uninstalled all remaining versions of R, and then deleted all the directories in c:\progra~1\R 2) I restarted the computer 3) I installed 2.14.2, and attempted to install the Rcmdr package. Same error message for both the cars package and the Rcmdr package. 4) I then exited and confirmed that I have write permission to C:\progra~1\R\R-2.14.2\libraries both by looking at the permissions, and by creating a directory in there. I appear to have full control, and I could create the directory. note that R is able to create the temporary directory to install the package, but not the correct, final directory. 5) I then uninstalled 2.14.2, and installed 2.15.0, hoping for a fix. No luck. Same error message. 6) I then tried installing the packages to a different directory, one that I created, c:\test, using install.packages(Rcmdr,c:\\test) This time, the car package installed correctly, but Rcmdr still had the same warning message Warning: unable to move temporary installation c:\test\file136c67c337b3\Rcmdr to c:\test\Rcmdr There is clearly something messed up on this computer, but I'm at a loss for how to get around it. Thanks for the suggestions, and I guess I have to work on a different computer! 2012/3/31 Uwe Ligges lig...@statistik.tu-dortmund.de On 31.03.2012 16:15, Drew Tyre wrote: Hi all, I'm having a strange error that prevents me from installing new packages, or updating packages after reinstalling. The error message is Warning: unable to move temporary installation C:\Program Files\R\R-2.14.2\library\**file15045004ac2\sandwich to C:\Program Files\R\R-2.14.2\library\**sandwich for one of the packages that is failing to install/update. This error started happening after I attempted installing lme4Eigen from the R-Forge repositories - that installation failed too. Any suggestions for fixes welcome. I don't want to upgrade to 2.15 just yet because I'm in the middle of a project (although if that's the solution I guess I'll have to do it). Probably the package is in use by another instance of R. Otherwise, check permissions. Best, Uwe Ligges R version 2.14.2 (2012-02-29) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.14.2 __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-help https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/** posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Drew Tyre School of Natural Resources University of Nebraska-Lincoln 416 Hardin Hall, East Campus 3310 Holdrege Street Lincoln, NE 68583-0974 phone: +1 402 472 4054 fax: +1 402 472 2946 email: aty...@unl.edu http://snr.unl.edu/tyre http://aminpractice.blogspot.com http://www.flickr.com/photos/atiretoo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Drew Tyre School of Natural Resources University of Nebraska-Lincoln 416 Hardin Hall, East Campus 3310 Holdrege Street Lincoln, NE 68583-0974 phone: +1 402 472 4054 fax: +1 402 472 2946 email: aty...@unl.edu http://snr.unl.edu/tyre http://aminpractice.blogspot.com http://www.flickr.com/photos/atiretoo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Italicize journal article in mid text
Hi, I need to italicize the journal in a citation but have thus far failed. How can I make 'Journal of Something' below italic but leave the rest plain? mtext( See Author1 and Author2 (2007) , \Title\, Journal of Something , pp. 1-50., side = 3, outer = T, line=-0.75, cex = 0.7, at= 0.04, adj = 0, font = 1, family = Times) Thanks -- View this message in context: http://r.789695.n4.nabble.com/Italicize-journal-article-in-mid-text-tp4529715p4529715.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
On Tue, Apr 3, 2012 at 2:53 PM, Berend Hasselman b...@xs4all.nl wrote: On 03-04-2012, at 20:21, Val wrote: Hi All, On the same data points x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 ) I want to have have the following output as data frame x group group mean 46 142.3 125 289.6 36 142.3 193 3235.25 209 3235.25 78 289.6 66 289.6 242 3235.25 297 3235.25 45 142.3 I tried the following code dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1 gxc - with(dat, tapply(xc, group, mean)) dat$gxc - gxce[as.character(dat$group)] txc=dat$gxc it did not work for me. I'm not surprised. In the line dat - there are 5 opening parentheses and 4 closing )'s. In the line dat$gxc - you reference an object gxce. Where was it created? So I tried this dat - data.frame(x, group=findInterval(x, quantile(x, prob=c(0, .333, .66 ,1)), all.inside=TRUE)) dat$gmean - ave(dat$x, as.factor(dat$group)) dat x group gmean 1 46 1 42.3 2 125 2 89.7 3 36 1 42.3 4 193 3 235.25000 5 209 3 235.25000 6 78 2 89.7 7 66 2 89.7 8 242 3 235.25000 9 297 3 235.25000 10 45 1 42.3 Thank you very much. It is working now. there was a type error on gxce. But in the r-code it was correct, gxc.. Berend [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question about randomForest
I dont see how this answered the original question of the poster. He was quite clear: the value of the predictions coming out of RF do not match what comes out of the predict function using the same RF object and the same data. Therefore, what is predict() doing that is different from RF? Yes, RF is making its predictions using OOB, but nowhere does it say way predict() is doing; indeed, it says if newdata is not given, then the results are just the OOB predictions. But newdata=oldata, then predict(newdata) != OOB predictions. So what is it then? Opens another issue, which is if newdata is close but not exactly oldata, then you get overfitted results? -- View this message in context: http://r.789695.n4.nabble.com/Question-about-randomForest-tp4111311p4529770.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] np package problem
Hi, Im trying to run a non parametric regression and I wish to use function npreg(), I installed the np package, but I am being told that npreg doesnt exist. Any advice on how I could fix this? -- View this message in context: http://r.789695.n4.nabble.com/np-package-problem-tp4529813p4529813.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Mixed italic and non-italic in text
On Apr 3, 2012, at 2:05 PM, lukevancleve wrote: mtext( See Author1 and Author2 (2007) , \Title\, Journal of Something , pp. 1-50., side = 3, outer = T, line=-0.75, cex = 0.7, at= 0.04, adj = 0, font = 1, family = Times) ?plotmath # especially the italic function mtext(expression(See Author1 and Author2 (2007) Title~italic(Journal of Something)~ , pp. 1-50.), side = 3, outer = T, line=-0.75, cex = 0.7, at= 0.04, adj = 0, font = 1, family = Times) You didn't indicate your OS. The choice of screen or pdf fonts may be installation specific but this seemed to work as expected with output in a serif font and the output in the upper left corner on a Mac with the quartz() device. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
On 03-04-2012, at 21:02, Val wrote: On Tue, Apr 3, 2012 at 2:53 PM, Berend Hasselman b...@xs4all.nl wrote: On 03-04-2012, at 20:21, Val wrote: Hi All, On the same data points x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 ) I want to have have the following output as data frame x group group mean 46 142.3 125 289.6 36 142.3 193 3235.25 209 3235.25 78 289.6 66 289.6 242 3235.25 297 3235.25 45 142.3 I tried the following code dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1 gxc - with(dat, tapply(xc, group, mean)) dat$gxc - gxce[as.character(dat$group)] txc=dat$gxc it did not work for me. I'm not surprised. In the line dat - there are 5 opening parentheses and 4 closing )'s. In the line dat$gxc - you reference an object gxce. Where was it created? So I tried this dat - data.frame(x, group=findInterval(x, quantile(x, prob=c(0, .333, .66 ,1)), all.inside=TRUE)) dat$gmean - ave(dat$x, as.factor(dat$group)) And the as.factor is not necessary. This will do dat$gmean - ave(dat$x, dat$group) Berend __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] np package problem
You need to load it once per session with library(np) Michael On Tue, Apr 3, 2012 at 2:46 PM, dnewbold dwightnewbold...@hotmail.com wrote: Hi, Im trying to run a non parametric regression and I wish to use function npreg(), I installed the np package, but I am being told that npreg doesnt exist. Any advice on how I could fix this? -- View this message in context: http://r.789695.n4.nabble.com/np-package-problem-tp4529813p4529813.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] object of type 'S4' is not subsettable
On 03/04/2012 12:42 PM, phillen wrote: Figured it out. given that object 'cit' is of class S4, extracting information works as follows: 1)finding out names of slots in object 'cit' slotNames(cit) [1] x Z0Z1ZKtype model [7] ecdet lag P seasondumvarcval [13] teststat lambdaVorg V W PI [19] DELTA GAMMA R0RKbpspec [25] call test.name 2) extracting info from wanted slot cit@teststat [1] 5.348440 9.068113 10.643293 That's one way. There may be another: if you print the class of cit using class(cit) there may be a help page, which you would find using class?foo assuming foo is the name of the class. The author may have defined a method to get what you want, or may give other guidance. For example, in the Matrix package: x - Hilbert(6) class(x) [1] dpoMatrix attr(,package) [1] Matrix class?dpoMatrix gives quite useful information about the dpoMatrix class. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
On 03-04-2012, at 20:21, Val wrote: Hi All, On the same data points x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 ) I want to have have the following output as data frame x group group mean 46 142.3 125 289.6 36 142.3 193 3235.25 209 3235.25 78 289.6 66 289.6 242 3235.25 297 3235.25 45 142.3 I tried the following code dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1 gxc - with(dat, tapply(xc, group, mean)) dat$gxc - gxce[as.character(dat$group)] txc=dat$gxc it did not work for me. I'm not surprised. In the line dat - there are 5 opening parentheses and 4 closing )'s. In the line dat$gxc - you reference an object gxce. Where was it created? So I tried this dat - data.frame(x, group=findInterval(x, quantile(x, prob=c(0, .333, .66 ,1)), all.inside=TRUE)) dat$gmean - ave(dat$x, as.factor(dat$group)) dat x group gmean 1 46 1 42.3 2 125 2 89.7 3 36 1 42.3 4 193 3 235.25000 5 209 3 235.25000 6 78 2 89.7 7 66 2 89.7 8 242 3 235.25000 9 297 3 235.25000 10 45 1 42.3 Berend __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] np package problem
On Apr 3, 2012, at 2:46 PM, dnewbold wrote: Hi, Im trying to run a non parametric regression and I wish to use function npreg(), I installed the np package, but I am being told that npreg doesnt exist. Any advice on how I could fix this? The usual error following that set of actions is user failure to load the package. ?require ?library -- View this message in context: http://r.789695.n4.nabble.com/np-package-problem-tp4529813p4529813.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping
Please take a look at my first reply to you: ave(y, findInterval(y, quantile(y, c(0.33, 0.66 Then read ?ave for an explanation of the syntax. ave takes two vectors, the first being the data to be averaged, the second being an index to split by. You don't want to use split() here. Michael On Tue, Apr 3, 2012 at 2:50 PM, Val valkr...@gmail.com wrote: I did look at it the result is below, x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 ) #lapply( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) , include.lowest=TRUE) ), mean) ave( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) , include.lowest=TRUE) ), mean) ave( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) , include.lowest=TRUE) ), mean) $`[36,74]` [1] NA $`(74,197]` [1] NA $`(197,297]` [1] NA There were 11 warnings (use warnings() to see them) On Tue, Apr 3, 2012 at 2:35 PM, Petr Savicky savi...@cs.cas.cz wrote: On Tue, Apr 03, 2012 at 02:21:36PM -0400, Val wrote: Hi All, On the same data points x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 ) I want to have have the following output as data frame x group group mean 46 1 42.3 125 2 89.6 36 1 42.3 193 3 235.25 209 3 235.25 78 2 89.6 66 2 89.6 242 3 235.25 297 3 235.25 45 1 42.3 I tried the following code dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1 gxc - with(dat, tapply(xc, group, mean)) dat$gxc - gxce[as.character(dat$group)] txc=dat$gxc it did not work for me. David Winsemius suggested to use ave(), when you asked this question for the first time. Can you have look at it? Petr Savicky. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Create Model Object (setClass?setMethod?)
Setting a class is quite easy if you are in the S3 (read: easier) system: x - 1:15 class(x) - YourClass summary.YourClass - function(x, ...) cat(The mean of your object is , mean(x), \n) summary(x) Michael On Tue, Apr 3, 2012 at 12:54 PM, casperyc caspe...@hotmail.co.uk wrote: Hi all, I have a self written likelihood as a model and functions to optimize and get fitted values, confidence intervals ect. I wonder if there is a way to define a 'class', or a 'model' (or a certain object)? so that I can use 'summary' to produce a summary like it does for a lm object. Also, it should be able to use 'predict' and 'plot' and other various generic functions. I am reading bits and pieces on the internet on 'setClass', 'setMethod'. Am I looking for the correct thing? Is there any up to date references that I can get help? I need some examples to get started with. Thanks! Casper - ### PhD candidate in Statistics School of Mathematics, Statistics and Actuarial Science, University of Kent ### -- View this message in context: http://r.789695.n4.nabble.com/Create-Model-Object-setClass-setMethod-tp4529473p4529473.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ff usage for glm
On Tue, Apr 3, 2012 at 1:42 AM, Benilton Carvalho beniltoncarva...@gmail.com wrote: Did you try the example described on the ff man page? Also, the last error message you report happens when chunks of the data set give design matrices that don't line up correctly. You said you added one new variable, but there are actually two new variables in the formula you show, compared to the previous run you showed. If you only have 32bit R then doing this from a data frame is not going to be efficient, and you do want to use ff (or SQLite or something). You may also want to decrease the chunk size from the default -- 5000 observations at a time might be too much. Incidentally, putting ATTN: Thomas Lumley on a nabble post would be counterproductive if I read nabble, but since I don't it's completely pointless. -thomas On Monday, April 2, 2012, Bond, Stephen wrote: Thomas, I tried biglm and it does not work see http://r.789695.n4.nabble.com/unable-to-get-bigglm-working-ATTN-Thomas-Lumley-td2276524.html#a2278381 . There are other posts from people who cannot get biglm working and others who get strange results. Please, advise if you can help. I have row based native code which works, but it is inconvenient as it does not produce an R object, which can be fed to anova etc. offered it to the developer forum, but message is still waiting for mod approval. regards Stephen B -Original Message- From: Thomas Lumley [mailto:tlum...@uw.edu javascript:;] Sent: Friday, March 30, 2012 7:32 PM To: Bond, Stephen Cc: r-help@r-project.org javascript:; Subject: Re: [R] ff usage for glm On Sat, Mar 31, 2012 at 9:05 AM, Bond, Stephen stephen.b...@cibc.comjavascript:; wrote: Greetings useRs, Can anyone provide an example how to use ff to feed a very large data frame to glm? The data.frame cannot be loaded in R using conventional read.csv as it is too big. glm(...,data=ff.file) ?? I shouldn't think glm() will work on data that are too big to read into R. However, bigglm() from the biglm package should work. You just need to write a function that supplies chunks of data from ff.file as requested (see the example on ?bigglm). I haven't used ff, but it looks from the documentation as though chunk() will do all the difficult parts. -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland __ R-help@r-project.org javascript:; mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Thomas Lumley Professor of Biostatistics University of Auckland __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R equivalent for SQL query
Hi, here are some solutions: DF - read.table(textConnection( A B C 1 1 3 1 1 4 1 1 5 1 2 6 1 2 7 1 3 8 ), header=TRUE) #using sqldf package library(sqldf) sqldf(select A, B, count(*) from DF group by A, B order by count(*) desc) #using function table as.data.frame(table(DF$A, DF$B)) As you can see, you can use sqldf package for performing sql queries on R data frames. Andrija On Tue, Apr 3, 2012 at 8:26 PM, Steven Raemaekers s.raemaek...@sig.eu wrote: Hi, I have a query which I would like to translate into R, but I do not know how to do it in an easy way. Assume a data frame has columns A, B and C: A B C 1 1 3 1 1 4 1 1 5 1 2 6 1 2 7 1 3 8 The query is as follows: select A, B, count(*) from data.frame group by A, B order by count(*) desc How do I translate this into R statements in such way that the result is a data frame structured as follows: A B count(*) 1 1 3 1 2 2 1 3 1 Thanks, Steven __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R equivalent for SQL query
Here is a solution using the data.table package: x - read.table(text = A B C + 1 1 3 + 1 1 4 + 1 1 5 + 1 2 6 + 1 2 7 + 1 3 8, header = TRUE) require(data.table) x - data.table(x) # convert to a data.table # query result - x[ + , list(count = length(C)) + , by = list(A,B) + ] # unsorted result result A B count [1,] 1 1 3 [2,] 1 2 2 [3,] 1 3 1 # sorted result result[order(result$count, decreasing = TRUE), ] A B count [1,] 1 1 3 [2,] 1 2 2 [3,] 1 3 1 On Tue, Apr 3, 2012 at 3:50 PM, andrija djurovic djandr...@gmail.com wrote: Hi, here are some solutions: DF - read.table(textConnection( A B C 1 1 3 1 1 4 1 1 5 1 2 6 1 2 7 1 3 8 ), header=TRUE) #using sqldf package library(sqldf) sqldf(select A, B, count(*) from DF group by A, B order by count(*) desc) #using function table as.data.frame(table(DF$A, DF$B)) As you can see, you can use sqldf package for performing sql queries on R data frames. Andrija On Tue, Apr 3, 2012 at 8:26 PM, Steven Raemaekers s.raemaek...@sig.eu wrote: Hi, I have a query which I would like to translate into R, but I do not know how to do it in an easy way. Assume a data frame has columns A, B and C: A B C 1 1 3 1 1 4 1 1 5 1 2 6 1 2 7 1 3 8 The query is as follows: select A, B, count(*) from data.frame group by A, B order by count(*) desc How do I translate this into R statements in such way that the result is a data frame structured as follows: A B count(*) 1 1 3 1 2 2 1 3 1 Thanks, Steven __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] object of type 'S4' is not subsettable
Thanks for that hint. Philipp Lentner -- View this message in context: http://r.789695.n4.nabble.com/object-of-type-S4-is-not-subsettable-tp4529063p4529958.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.