Re: [R] Converting a dataframe column from string to datetime
ok..thanks - Original Message - From: Don MacQueen macque...@llnl.gov To: raje...@cse.iitm.ac.in, r-help r-help@r-project.org Sent: Fri, 01 Oct 2010 20:51:06 +0530 (IST) Subject: Re: [R] Converting a dataframe column from string to datetime Youâre working too hard. Use this: tms - as.POSIXct(strptime(v, %a %b %d %H:%M:%OS %Y)) Take note of the fact that there are two types of datetime objects: POSIXct and POSIXlt. Your unlist() gave what seemed a strange result because you used on an âltâ object. Had you given it a âctâ object it would have made sense. To see, try lapply(v,function(x){as.POSIXct(strptime(x, %a %b %d %H:%M:%OS %Y))}) But using lapply() was more complicated than necessary. -Don On 9/30/10 10:59 PM, raje...@cse.iitm.ac.in raje...@cse.iitm.ac.in wrote: Hi, I have a dataframe column of the form v-c(Fri Feb 05 20:00:01.43000 2010,Fri Feb 05 20:00:02.274000 2010,Fri Feb 05 20:00:02.274000 2010,Fri Feb 05 20:00:06.34000 2010) I need to convert this to datetime form. I did the following.. lapply(v,function(x){strptime(x, %a %b %d %H:%M:%OS %Y)}) This gives me a list that looks like this... [[1]] [1] 2010-02-05 20:00:01.43 [[2]] [1] 2010-02-05 20:00:02.274 [[3]] [1] 2010-02-05 20:00:02.274 [[4]] [1] 2010-02-05 20:00:06.34 However, when I do an unlist...I gets converted to something like this... sec minhourmday monyearwdayyday isdst sec minhour mday monyearwdayyday isdst sec 1.430 0.000 20.000 5.000 1.000 110.000 5.000 35.000 0.000 2.274 0.000 20.000 5.000 1.000 110.000 5.000 35.000 0.000 2.274 minhourmday monyearwdayyday isdst sec min hourmday monyearwdayyday isdst 0.000 20.000 5.000 1.000 110.000 5.000 35.000 0.000 6.340 0.000 20.000 5.000 1.000 110.000 5.000 I want it to become a dataframe column except for a change in the datatype to datetime...how can I achieve this? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://BLOCKEDstat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://BLOCKEDwww.BLOCKEDR-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Don MacQueen Environmental Protection Department Lawrence Livermore National Laboratory 925 423-1062 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to iterate through different arguments?
any solutions? -- View this message in context: http://r.789695.n4.nabble.com/How-to-iterate-through-different-arguments-tp2953511p2953608.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating R objects in Java
It seems that you can call Java objects from R but that you cannot call R objects from Java using the RJava interface. -- View this message in context: http://r.789695.n4.nabble.com/Creating-R-objects-in-Java-tp2904497p2953611.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Syntax for Rmpi cf multicore
I'm aiming to compare the workings of Rmpi and multicore on a duel processor quad core machine with 64 bit R-2.11.1 Kubuntu 10.4. It's impossible for me to get a small reproducable code segment to show what I mean, but if I show what works for mclapply, I'd hope it's possible to be shown what would be the equivalent way with mpi.apply. The function lr.gbm has variables trees, folds and minob which I use with mclapply like this: out - mclapply(subsets, lr.gbm, mc.cores = 6, trees=trees, folds=folds, minob=minob) subsets is simply a vector that looks like this: [1] COB_2 CNJ_2 COB_3 CNJ_3 COB_4 CNJ_4 and that works more or less fine to produce a list of length 6 which I've named with the elements of the subsets vector. names(out) - subsets Now, when I use mpi.apply, I make the trees, folds and minob available to all slaves thus: mpi.spawn.Rslaves(nslaves=6) mpi.bcast.Robj2slave(trees) mpi.bcast.Robj2slave(folds) mpi.bcast.Robj2slave(minob) but, I can't work out what sort of 'array' is required for the first argument to mpi.apply. out.mpi - mpi.apply(subsets.l, lr.gbm) What should subsets.l look like? I tried a list of length 6 with elements the same as those in 'subsets', but evidently, something more array-like is required, but I'm out of ideas. TIA -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_ Average minds discuss events (:_~*~_:) Small minds discuss people (_)-(_) . Eleanor Roosevelt ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] i have aproblem --thank you
On Oct 4, 2010, at 7:29 AM, 笑啸 wrote: dear professor: If this is directed to my posting, I am not a professor. thank you for your help,witn your help i develop the nomogram successfully. after that i want to do the internal validation to the model.i ues the bootpred to do it,and then i encounter problem again,just like that.(´íÎóÓÚerror to :complete.cases(x, y, wt) : ²»ÊÇËùÓеIJÎÊý¶¼Ò»Ñù³¤(the length of the augment was different)) i hope you tell me where is the mistake,and maybe i have chosen the wrong function. I think so, at least to the limited extent that I I am understanding your efforts. It appears you are trying to use boot to do simulation, but the goal is rather unclear. I think you would get further along the way if you imported your data into R your data and constructed a model within the rms system. None of your functions in this posting are creating a new instance of dfr$y. They are all depending on the single constructed version of the data which may be peculiar or it may be exactly typical, but one cannot tell. Rather than boot() you might get further with replicate(), but I am not inclined to offer further code in support of what is an unclear and possibly futile direction. thank you turly yours .. load package 'rms' ddist - datadist(dfr) options(datadist='ddist') n-100 set.seed(10) T.Grade-factor(0:3,labels=c(G0, G1, G2,G3)) Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M)) Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes)) dfr$L-with(dfr,0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking) +0.92*as.numeric(Sex)-1.338) dfr$y - with(dfr, ifelse(runif(n) plogis(L), 1, 0) ) dfr - data.frame(T.Grade,Sex,Smoking, L, y) ddist - datadist(dfr) options(datadist='ddist') f-lrm(y~T.Grade +Sex+Smoking, data=dfr) nom-nomogram(f,fun=function(x)1/(1+exp(-x)),fun.at=c(.01,.05,seq(. 1,.9,by=.2),.9,1),funlabel=Risk of Death) plot(nom, xfrac=0.45) load package bootstrap .the problem This is not a problem statement. It is merely evidence that you have some sort of unspecified problem. theta.fit - function(dfr,y){lsfit(dfr,y)} theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef} (There is already a predict() function for regular regression problems and a Predict() function for use with rms fit objects, which if you explained clearly what you were attempting might be urseful.) sq.err - function(y,yhat) { (y-yhat)^2} It's probably possible to get the sum-squared error for a linear fit from function lm() by more direct means. I do not understand what the next line is supposed to be doing, so am unable to offer further suggestion. Refitting the same model 50 times to the same data seems an uninformative exercise. results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err) ´íÎóÓÚerror to :complete.cases(x, y, wt) : ²»ÊÇËùÓеIJÎÊý¶¼Ò»Ñù³¤(the length of the augment was different) È«¹ú×îµÍ¼Û£¬ÌìÌìÔÚ¼Ò³åÕÕƬ£¬24Сʱ· ¢»õÉÏÃÅ£¡ [[alternative HTML version deleted]] -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Scatterplot matrix - Pearson linear correlation and Density Ellipse
Thanks a lot for the answer -- View this message in context: http://r.789695.n4.nabble.com/Scatterplot-matrix-Pearson-linear-correlation-and-Density-Ellipse-tp2763552p2953909.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question about splines in R
Have you looked at lrm in rms? Harrell uses restricted cubic splines, but my understanding (subject to correction by my statistical betters) is that these are the same as natural cubic splines. The penalty matrix is returned as part of the fit object after being constructed and the model fit. You have the option of specifying knot locations or allowing these to be estimated. require(rms) ?lrm ?pentrace Option 2,,, try one of the many methods for searching? RSiteSearch(penalized logistic) ... returned 95 hits -- David Winsemius, MD On Oct 4, 2010, at 1:44 AM, Yan Li wrote: Dear all, I wanted to fit a penalized logistic model to obtain knots-based basis matrix as well as basis coefficients. I was wondering if there is a package in R that can generate a natural cubic spline and construct the corresponding penalty matrix? Thank you! Yan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reading Strings in R
Try thist... your.strings - scan(yourfilename.dat, what=) Michael On 4 October 2010 22:04, Lorenzo Isella lorenzo.ise...@gmail.com wrote: Dear All, I know this must be a one-liner, but I am experiencing problems. I need to read lists of long integers stored on a text files, i.e. which reads like 7359700484475972 7359700484475972 0 7359700484475972 0 0 0 97189954101318722 0 0 7811896690636354 7359700490636354 0 0 0 7811896684475972 7811896684475972 7811896690636354 8447597298304052 7359700484475972 0 7359700498304052 7359700498304052 7359700498304052 0 7359700498304052 7359700484475972 7359700484475972 Now, for me every integer is just a symbol, i.e. the ID of some object and I simply need to find out how many times and where it occurs in this file. Bottom line: I would like to read this file directly as a string to ensure I will not have any trouble of integer overflow or rounding error. I tried readLines, but I cannot get rid of \n. Any help is appreciated. Cheers Lorenzo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to make list() return a list of *named* elements
On Thu, Sep 30, 2010 at 09:10:16AM -0400, Gabor Grothendieck wrote: A data frame is a list in which every component (i.e. every column) must have the same length (i.e. the same number of rows). data.frame() does preserve names: data.frame(b, my.c) b my.c 1 22.48 2 12.29 3 10.9 15 4 8.51 5 9.2 14 Thanks for your suggestion. However, the reason I used list() was that the different vectors to return usually have different lengths. Admittedly, I should have used another example that explicated this. -- Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net GnuPG key: 1024D/7050614E Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E Learn about secure email at http://www.gnupg.org signature.asc Description: Digital signature __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Force evaluation of variable when calling partialPlot
The plot titles aren't pretty, but the following works for me: R library(randomForest) randomForest 4.5-37 Type rfNews() to see new features/changes/bug fixes. R set.seed(1004) R iris.rf - randomForest(iris[-5], iris[[5]], ntree=1001) R par(mfrow=c(2,2)) R for (i in 1:4) partialPlot(iris.rf, iris, names(iris)[i]) Andy From: Ben Bond-Lamberty Dear R Users, I'm using the randomForest package and would like to generate partial dependence plots, one after another, for a variety of variables: m - randomForest( s, ... ) varnames - c( var1, var2, var3, var4 ) # var1..4 are all in data frame s for( v in varnames ) { partialPlot( x=m, pred.data=s, x.var=v ) } ...but this doesn't work, with partialPlot complaining that it can't find the variable v. I think I need to force the evaluation of the loop variable v so that partialPlot sees the correct variable names, but am stumped after trying eval and similar functions. Any suggestions on how to do this? Googling has not turned up anything very useful. Thanks, Ben __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Notice: This e-mail message, together with any attachme...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 2010年10月4日 19:14:45 自动保存草稿
dear professor: thank you for your help,witn your help i develop the nomogram successfully. after that i want to do the internal validation to the model.i ues the bootpred to do it,and then i encounter problem again,just like that.(´íÎóÓÚerror to :complete.cases(x, y, wt) : ²»ÊÇËùÓеIJÎÊý¶¼Ò»Ñù³¤(the length of the augment was different)) i hope you tell me where is the mistake,and maybe i have chosen the wrong function. thank you turly yours .. load package 'rms' ddist - datadist(dfr) options(datadist='ddist') n-100 set.seed(10) T.Grade-factor(0:3,labels=c(G0, G1, G2,G3)) Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M)) Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes)) dfr$L-with(dfr,0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking)+0.92*as.numeric(Sex)-1.338) dfr$y - with(dfr, ifelse(runif(n) plogis(L), 1, 0) ) dfr - data.frame(T.Grade,Sex,Smoking, L, y) ddist - datadist(dfr) options(datadist='ddist') f-lrm(y~T.Grade +Sex+Smoking, data=dfr) nom-nomogram(f,fun=function(x)1/(1+exp(-x)),fun.at=c(.01,.05,seq(.1,.9,by=.2),.9,1),funlabel=Risk of Death) plot(nom, xfrac=0.45) load package bootstrap .the problem theta.fit - function(dfr,y){lsfit(dfr,y)} theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef} sq.err - function(y,yhat) { (y-yhat)^2} results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err) ´íÎóÓÚerror to :complete.cases(x, y, wt) : ²»ÊÇËùÓеIJÎÊý¶¼Ò»Ñù³¤(the length of the augment was different) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] i have aproblem --thank you
dear professor: thank you for your help,witn your help i develop the nomogram successfully. after that i want to do the internal validation to the model.i ues the bootpred to do it,and then i encounter problem again,just like that.(´íÎóÓÚerror to :complete.cases(x, y, wt) : ²»ÊÇËùÓеIJÎÊý¶¼Ò»Ñù³¤(the length of the augment was different)) i hope you tell me where is the mistake,and maybe i have chosen the wrong function. thank you turly yours .. load package 'rms' ddist - datadist(dfr) options(datadist='ddist') n-100 set.seed(10) T.Grade-factor(0:3,labels=c(G0, G1, G2,G3)) Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M)) Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes)) dfr$L-with(dfr,0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking)+0.92*as.numeric(Sex)-1.338) dfr$y - with(dfr, ifelse(runif(n) plogis(L), 1, 0) ) dfr - data.frame(T.Grade,Sex,Smoking, L, y) ddist - datadist(dfr) options(datadist='ddist') f-lrm(y~T.Grade +Sex+Smoking, data=dfr) nom-nomogram(f,fun=function(x)1/(1+exp(-x)),fun.at=c(.01,.05,seq(.1,.9,by=.2),.9,1),funlabel=Risk of Death) plot(nom, xfrac=0.45) load package bootstrap .the problem theta.fit - function(dfr,y){lsfit(dfr,y)} theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef} sq.err - function(y,yhat) { (y-yhat)^2} results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err) ´íÎóÓÚerror to :complete.cases(x, y, wt) : ²»ÊÇËùÓеIJÎÊý¶¼Ò»Ñù³¤(the length of the augment was different) È«¹ú×îµÍ¼Û£¬ÌìÌìÔÚ¼Ò³åÕÕƬ£¬24Сʱ·¢»õÉÏÃÅ£¡ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] have aproblem --thank you
dear professor: thank you for your help,witn your help i develop the nomogram successfully. after that i want to do the internal validation to the model.i ues the bootpred to do it,and then i encounter problem again,just like that.(错误于error to :complete.cases(x, y, wt) : 不是所有的参数都一样长(the length of the augment was different)) i hope you tell me where is the mistake,and maybe i have chosen the wrong function. thank you turly yours .. load package 'rms' ddist - datadist(dfr) options(datadist='ddist') n-100 set.seed(10) T.Grade-factor(0:3,labels=c(G0, G1, G2,G3)) Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M)) Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes)) dfr$L-with(dfr,0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking)+0.92*as.numeric(Sex)-1.338) dfr$y - with(dfr, ifelse(runif(n) plogis(L), 1, 0) ) dfr - data.frame(T.Grade,Sex,Smoking, L, y) ddist - datadist(dfr) options(datadist='ddist') f-lrm(y~T.Grade +Sex+Smoking, data=dfr) nom-nomogram(f,fun=function(x)1/(1+exp(-x)),fun.at=c(.01,.05,seq(.1,.9,by=.2),.9,1),funlabel=Risk of Death) plot(nom, xfrac=0.45) load package bootstrap .the problem theta.fit - function(dfr,y){lsfit(dfr,y)} theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef} sq.err - function(y,yhat) { (y-yhat)^2} results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err) 错误于error to :complete.cases(x, y, wt) : 不是所有的参数都一样长(the length of the augment was different) 全国最低价,天天在家冲照片,24小时发货上门! 全国最低价,天天在家冲照片,24小时发货上门!__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Issue with match.call
Hi, I have a function that I'm writing. The arguments in the function are as follows RFF-function(qtype, qOpt,...){} i.e., I have two args that are compulsary and the rest are optional. Now when my user passes the function call, I need to see what optional args are defined and process accordingly...what I have so far is.. RFF-function(qtype, qOpt,...){ mc - match.call(expand.dots=TRUE) } I need to see what all args have been sent out of vec-c(flag,sep,dec) and define if-else conditions based on whether they have been defined. How do I do this? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] I have aproblem --thank you
dear teacher: thank you for your help,witn your help i develop the nomogram successfully. after that i want to do the internal validation to the nomogram.i wish to use the bootstrap to achieve the prediction accuracy of the nomogram,and then i encounter problem again,just like that.(error to :complete.cases(x, y, wt) : (the length of the augment was different)) i hope you tell me where is the mistake,and maybe i have chosen the wrong function. thank you turly yours .. load package 'rms' n-100 set.seed(10) T.Grade-factor(0:3,labels=c(G0, G1, G2,G3)) Sex-factor(0:1,labels=c(F,M)) Smoking-factor(0:1,labels=c(No,yes)) L-0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking) +0.92*as.numeric(Sex)-1.338 L [1] -0.755 -0.172 0.363 0.946 y - ifelse(runif(n) plogis(L), 1, 0) dfr - data.frame(T.Grade,Sex,Smoking, L, y) ddist - datadist(dfr) # wrap the vectors into a dataframe. options(datadist='ddist') f-lrm(y~T.Grade +Sex+Smoking, data=dfr) # skip the as.numeric()'s ### Gives an error message due to singular X matrix. f-lrm(y~T.Grade +Sex+Smoking, data=dfr) singular information matrix in lrm.fit (rank= 5 ). Offending variable(s): Sex=M Error in lrm(y ~ T.Grade + Sex + Smoking, data = dfr) : Unable to fit model using “lrm.fit” #Try instead: n-100 set.seed(10) T.Grade-factor(0:3,labels=c(G0, G1, G2,G3)) Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M)) Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes)) dfr$L - with(dfr, 0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking) +0.92*as.numeric(Sex)-1.338) dfr$y - with(dfr, ifelse(runif(n) plogis(L), 1, 0) ) dfr - data.frame(T.Grade,Sex,Smoking, L, y) ddist - datadist(dfr) options(datadist='ddist') f-lrm(y~T.Grade +Sex+Smoking, data=dfr) nom - nomogram(f, fun=function(x)1/(1+exp(-x)), # or fun=plogis fun.at=c(.001,.01,.05,seq(.1,.9,by=.1),.95,.99,.999), funlabel=Risk of Death) plot(nom, xfrac=.45) load package bootstrap(achieve the prediction accuracy of the nomogram) .the problem theta.fit - function(dfr,y){lsfit(dfr,y)} theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef} sq.err - function(y,yhat) { (y-yhat)^2} results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err) error to :complete.cases(x, y, wt) : (the length of the augment was different.__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to make list() return a list of *named* elements
On Thu, Sep 30, 2010 at 09:34:26AM -0300, Henrique Dallazuanna wrote: You should try: eapply(.GlobalEnv, I)[c('b', 'my.c')] Great! b - c(22.4, 12.2, 10.9, 8.5, 9.2) my.c - sample.int(round(2*mean(b)), 4) my.return - function (vector.of.variable.names) { eapply(.GlobalEnv, I)[vector.of.variable.names] } str(my.return(c(b,my.c))) List of 2 $ b :Class 'AsIs' num [1:5] 22.4 12.2 10.9 8.5 9.2 $ my.c:Class 'AsIs' int [1:4] 18 22 12 3 much nicer than list(b=b, my.c=my.c), especially in real cases with longer variable names and a lot of variables to return. Thanks Henrique! -- Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net Q. What is that strange attachment in this mail? A. My digital signature, see www.gnupg.org for info on how you could use it to ensure that this mail is from me and has not been altered on the way to you. signature.asc Description: Digital signature __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plotmath: how to use greek symbols in expression(integral(f(tau)*dtau, 0, t))?
I would like to use greek tau as a symbol of variable to integrate over in plotmath expression(integral(f(tau)*dtau, 0,t)) but nothing seems to work. I tried d{\tau}, d\tau, etc., without any success Is it possible? How can I accomplish this? Best regards, Ryszard -- Confidentiality Notice: This message is private and may ...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Inserting a plot into another
Hi everyone. First, thank you for your answers, it helped me alot. I have 1 more question regarding subplot. I'm using this function in a graph that contains 6 plots (3 x 2). This is producing strange behavior. http://img545.imageshack.us/img545/9233/fig6.png See here for the resulting figure. For instance here's my code : pdf(file=C:/Users/Modelisation/Desktop/Fig6.pdf, width = 8.5, height = 11, pointsize = 12); par(mfcol = c(3,2), mai = c(0.7,0.8,0,0), omi = c(1, 0, 0.7, 0.1)); plot(FSL.data_Center$Distance, 4.6/FSL.data_Center$kd.BLUE., pch = 22, bg = gray43, xlab = Distance (km), ylab = 1% penetration depth (m) , ylim = c(0,20), xaxt = n, yaxt = n, xpd = NA); axis(1, at = c(100,200,300,400,500),labels = c(100,200,300,400,500), cex.axis=1, tck = 0.02); axis(2, tck = 0.02); ... #Subplots subplot(plot(lowess(FSL.data_North$Distance, 4.6/FSL.data_North$kd.BLUE.), type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450), ylim = c(0,20)), 330,17, size = c(.96, .6)); subplot(plot(lowess(FSL.data_Center$Distance, 4.6/FSL.data_Center$kd.RED.), type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450), ylim = c(0,20)), 330,17, size = c(.96, .6)); subplot(plot(lowess(FSL.data_South$Distance, 4.6/FSL.data_South$kd.GREEN.), type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450), ylim = c(0,20)), 330,17, size = c(.96, .6)); PLOT CODE FOR THE 5 OTHER FIGURES HERE dev.off(); The last 2 subplots are plotted in at the bottom of the page. I would like to have the last 2 curves on the plot A. With regards, Phil -- View this message in context: http://r.789695.n4.nabble.com/Inserting-a-plot-into-another-tp2720936p2954362.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotmath: how to use greek symbols in expression(integral(f(tau)*dtau, 0, t))?
On Oct 4, 2010, at 15:25 , Czerminski, Ryszard wrote: I would like to use greek tau as a symbol of variable to integrate over in plotmath expression(integral(f(tau)*dtau, 0,t)) but nothing seems to work. I tried d{\tau}, d\tau, etc., without any success Is it possible? How can I accomplish this? I'd try ...f(tau)~d*tau... Best regards, Ryszard -- Confidentiality Notice: This message is private and may ...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to make list() return a list of *named* elements
On Mon, Oct 4, 2010 at 7:51 AM, Berwin A Turlach ber...@maths.uwa.edu.au wrote: On Mon, 4 Oct 2010 19:45:23 +0800 Berwin A Turlach ber...@maths.uwa.edu.au wrote: Mmh, R my.return - function (vector.of.variable.names) { sapply(vector.of.variable.names, function(x) list(get(x))) } make that: R my.return - function (vector.of.variable.names) { sapply(vector.of.variable.names, function(x) get(x)) } Some small tweaks. If you use simplify=FALSE then it will guarantee that a list is returned: sapply(my.names, get, simplify = FALSE) for example, compare the outputs of: sapply(c(letters, LETTERS), get) sapply(c(letters, LETTERS), get, simplify = FALSE) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to make list() return a list of *named* elements
See llist from Hmisc package: library(Hmisc) a=rnorm(10) b=rnorm(5) llist(a,b) On Thu, Sep 30, 2010 at 1:49 PM, Hans Ekbrand hans.ekbr...@sociology.gu.se wrote: If I combine elements into a list b - c(22.4, 12.2, 10.9, 8.5, 9.2) my.c - sample.int(round(2*mean(b)), 5) my.list - list(b, my.c) the names of the elements seems to get lost in the process: str(my.list) List of 2 $ : num [1:5] 22.4 12.2 10.9 8.5 9.2 $ : int [1:5] 11 8 6 9 20 If I explicitly name the elements at list-creation, I get what I want: my.list - list(b=b, my.c=my.c) str(my.list) List of 2 $ b : num [1:5] 22.4 12.2 10.9 8.5 9.2 $ my.c: int [1:5] 11 8 6 9 20 Now, is there a way to get list() (or some other function) to automatically name the elements? I often use list() in return(), and I am getting tired of having to repeat myself. -- Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net A. Because it breaks the logical sequence of discussion Q. Why is top posting bad? -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAkykeUwACgkQfCyHKnBQYU4J+ACgrdjMSoyr/Uzt9fpTsietde3n d8UAnRskbOM7mDhDexiS7T9LkhRs287P =MsDG -END PGP SIGNATURE- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Christophe Pallier christo...@pallier.org tel: +33 (0)1 69 08 79 34 Unité de Neuroimagerie Cognitive INSERM-CEA, Neurospin center, F91191 Gif-sur-Yvette, France web site: http://www.unicog.org personal web site: www.pallier.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotmath: how to use greek symbols in expression(integral(f(tau)*dtau, 0, t))?
Try expression(paste(integral(),f(,tau,) d,tau,sep=))) it works for me. -tgs On Mon, Oct 4, 2010 at 9:25 AM, Czerminski, Ryszard ryszard.czermin...@astrazeneca.com wrote: I would like to use greek tau as a symbol of variable to integrate over in plotmath expression(integral(f(tau)*dtau, 0,t)) but nothing seems to work. I tried d{\tau}, d\tau, etc., without any success Is it possible? How can I accomplish this? Best regards, Ryszard -- Confidentiality Notice: This message is private and may ...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating R objects in Java
Have you looked at JRI ? http://www.rforge.net/JRI/ Michael On 4 October 2010 09:29, lord12 trexi...@yahoo.com wrote: It seems that you can call Java objects from R but that you cannot call R objects from Java using the RJava interface. -- View this message in context: http://r.789695.n4.nabble.com/Creating-R-objects-in-Java-tp2904497p2953611.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fisher exact test?
Hi All, My apologies if this is a totally newbie question. I want to calculate the probability that a particular set of genes is picked repeatedly for 3 samplings. For example, if from a total of 'N' genes, I pick 'A' number of genes in the first pick, 'B' number of genes in the second pick, and 'C' number of genes in the third pick, I would want to calculate p(n) where n = (A intersection B intersection C). Would fisher exact test be the correct method to evaluate the probability? And if so how can I frame the 3x3 (?) contingency table? I can find numerous examples of 2x2 tables, but none for 3x3. Any guidance would be appreciated. many thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help on reading multipe files in R
Hi, How can I read multiple files(in a loop like manner) within a single code of R ? For example, I need to run the same code for different datasets (here list of companies) and since individual files are quite large, appending the files into one file is not a desirable option. Can this be done through a macro or sql kind of command? Thanks and Regards, Rahul S Menon Research Associate ISB, Hyderabad Ph-040-2318 7288 DISCLAIMER : This e-mail (including any attachments) is ...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interpreting the example given by Frank Harrell in the predict.lrm {Design} help
Dear List and Frank, I have calculated the log-odds for my models but maybe i am not getting something but i am not understanding how for a categorical factor this helps? On all the examples i have see it relates to continuous factors where moving from one number to another shows either a increase or decrease, not as in my case a change of catagory. Furthermore, this gives the values for each factor independent of each other, how do i get the log-odds for the entire model? I appreciate i maybe trying to put things in boxes again, i am not i am happy to report the log odds of moving from one response level to the next but would like it for all the factors together not independently. John Low HighDiff. Effect S.E.Lower Upper WO Woody:Non_woody 1 2 NA 0.280.16-0.04 0.6 Odds Ratio 1 2 NA 1.32NA 0.961.82 PD Abiotic:Biotic 2 1 NA -1.21 0.13-1.47 -0.96 Odds Ratio 2 1 NA 0.3 NA 0.230.38 ALT All:Low 3 1 NA 0.470.190.110.84 Odds Ratio 3 1 NA 1.6 NA 1.112.31 ALT High:Low3 2 NA -0.07 0.14-0.35 0.21 Odds Ratio 3 2 NA 0.93NA 0.7 1.24 ALT Mid:Low 3 4 NA 0.390.150.1 0.67 Odds Ratio 3 4 NA 1.48NA 1.111.96 REG Two_plus:One1 2 NA -0.59 0.13-0.84 -0.34 Odds Ratio 1 2 NA 0.55NA 0.430.72 BIO Arctic:Subtropical/Tropical 4 1 NA -1.02 0.81-2.61 0.58 Odds Ratio 4 1 NA 0.36NA 0.071.78 BIO Boreal:Subtropical/Tropical 4 2 NA -1.21 0.81-2.79 0.37 Odds Ratio 4 2 NA 0.3 NA 0.061.44 BIO Mediterranean:Subtropical/Tropical 4 3 NA -1.89 0.48-2.83 -0.95 Odds Ratio 4 3 NA 0.15NA 0.060.39 BIO Temperate:Subtropical/Tropical 4 5 NA -0.09 0.16-0.41 0.23 Odds Ratio 4 5 NA 0.91NA 0.661.26 On 3 Oct 2010, at 15:29, Frank Harrell wrote: You still seem to be hung up on making arbitrary classifications. Instead, look at tendencies using odds ratios or rank correlation measures. My book Regression Modeling Strategies covers this. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Frank-Harrell-in-the-predict-lrm-Design-help-tp2883311p2953220.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 'all subsets' fitting algorithm for Bayesian approach
Bert Thanks for your reply but I already have the Box Steinberg work in my extensive library of 'homework' :o) I have some problem specific priors that I need to use for calculating model probabilities so that I can produce predictive distributions using Bayesian model averaging - hence I need to be able to extract the key summary stats (as an analogue of the likelihood) for each model from an exhaustive selection of possible model terms. If this is available in R already then I would love to hear about it. I am aware of leaps() and regsubsets() but I am not sure that provides exactly what I need here, though it may be possible to adapt it somehow. Michael On 1 Oct 2010, at 18:32, Bert Gunter wrote: u... You are reinventing the wheel. In fact, several wheels: the statistical literature already has several different approaches worked out for this. For example, George Box and David Steinberg did one about 20 years ago, and it has been incorporated as one of the options in the JMP DOE model choice procedure. So do your homework and save yourself some effort. Maybe even all your effort. -- Bert On Fri, Oct 1, 2010 at 7:02 AM, Michael Hopkins hopk...@upstreamsystems.com wrote: Hi R experts I am just wondering if something is already available (or easily adaptable) to do the following. I am planning to build linear models for all possible combinations of terms, so for example if the terms are sent into a function as this string X1 + X2 + X3 + X4 + X1:X2 I would want to build models for all possible combinations of these 5 terms, e.g. m1 - lm( y ~ X1 + X3 ) and capture at least the residual sum of squares and total number of model parameters from each model produced. This will become part of a Bayesian approach to infer actual model probabilities when specialist prior knowledge is also introduced into the problem. At a high level this particular problem requires something like: 1) the term 'string' to be broken down into it's elements which are separated by + and, I suppose, stored in a list for easier manipulation 2) a matrix with 2^5 rows and 5 columns to be formed with a 0 present if the term is not included and 1 if it is. Then a model will be fitted to represent every row of this matrix and the key statistics stored in vectors of length 2^5 For N terms of course the number of models will be 2^N. Is there anything available already? This is a very similar problem to all subsets regression. My skill at manipulating strings in R is very limited; can anyone recommend some links or available functions which would make the separations and constructions required easy to achieve? Thanks in advance to all Michael Hopkins Algorithm and Statistical Modelling Expert -- Bert Gunter Genentech Nonclinical Biostatistics 467-7374 http://devo.gene.com/groups/devo/depts/ncb/home.shtml Michael Hopkins Algorithm and Statistical Modelling Expert Upstream 23 Old Bond Street London W1S 4PZ Mob +44 0782 578 7220 DL +44 0207 290 1326 Fax +44 0207 290 1321 hopk...@upstreamsystems.com www.upstreamsystems.com IMPORTANT NOTICE The information in this e-mail and any attached files is...{{dropped:22}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interpreting the example given by Frank Harrell in the predict.lrm {Design} help
I may be missing a point, but the proportional odds model easily gives you odds ratios for Y=j (independent of j by PO assumption). Other options include examining a rank correlation between the linear predictor and Y, or (if Y is numeric and spacings between categories are meaningful) you can get predicted mean Y (see the Mean.lrm in the R rms package, a replacement for the Design package). Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Frank-Harrell-in-the-predict-lrm-Design-help-tp2883311p2954274.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help on reading multipe files in R
Hello, I think you'll have to provide more details on the file type / format and your analytical objectives (as will Arnab Kumar Maity, Researcher, Indian School of Business who just asked a remarkably similar question :) Michael On 4 October 2010 16:43, Rahul Menon rahul_me...@isb.edu wrote: Hi, How can I read multiple files(in a loop like manner) within a single code of R ? For example, I need to run the same code for different datasets (here list of companies) and since individual files are quite large, appending the files into one file is not a desirable option. Can this be done through a macro or sql kind of command? Thanks and Regards, Rahul S Menon Research Associate ISB, Hyderabad Ph-040-2318 7288 DISCLAIMER : This e-mail (including any attachments) is ...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Reading Strings in R
Dear All, I know this must be a one-liner, but I am experiencing problems. I need to read lists of long integers stored on a text files, i.e. which reads like 7359700484475972 7359700484475972 0 7359700484475972 0 0 0 97189954101318722 0 0 7811896690636354 7359700490636354 0 0 0 7811896684475972 7811896684475972 7811896690636354 8447597298304052 7359700484475972 0 7359700498304052 7359700498304052 7359700498304052 0 7359700498304052 7359700484475972 7359700484475972 Now, for me every integer is just a symbol, i.e. the ID of some object and I simply need to find out how many times and where it occurs in this file. Bottom line: I would like to read this file directly as a string to ensure I will not have any trouble of integer overflow or rounding error. I tried readLines, but I cannot get rid of \n. Any help is appreciated. Cheers Lorenzo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] print-show-display a matrix
?View On Mon, Oct 4, 2010 at 7:48 AM, Alaios ala...@yahoo.com wrote: Hello. I want to print the value a matrix has. The matrix's size is not too big (100*100 cells). I tried print(matrixname) but as it does not fit very well on my screen R splits it into several small matrixes that do overflow my screen. IS it possible somehow to display this matrix as one (even if this needs to go fullscreen) . There might be some function for that like plot cells or display matrixes. I would like to thank you in advance for your help Best Regards Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help on reading multiple files in R
Hi, How can I read multiple files(in a loop like manner) within a single code of R ? For example, I need to run the same code for different datasets (here list of companies) and since individual files are quite large, appending the files into one file is not a desirable option. Can this be done through a macro or sql kind of command? Thank you in advance. Arnab Kumar Maity Researcher Indian School of Business, Hyderabad - 500032 India Ph- +91 40 2318 7886 DISCLAIMER : This e-mail (including any attachments) is ...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to make list() return a list of *named* elements
On Mon, Oct 04, 2010 at 07:45:23PM +0800, Berwin A Turlach wrote: G'day Hans, On Mon, 4 Oct 2010 11:28:15 +0200 Hans Ekbrand h...@sociologi.cjb.net wrote: On Thu, Sep 30, 2010 at 09:34:26AM -0300, Henrique Dallazuanna wrote: You should try: eapply(.GlobalEnv, I)[c('b', 'my.c')] Great! b - c(22.4, 12.2, 10.9, 8.5, 9.2) my.c - sample.int(round(2*mean(b)), 4) my.return - function (vector.of.variable.names) { eapply(.GlobalEnv, I)[vector.of.variable.names] } Well, if you are willing to create a vector with the variable names, then simpler solutions should be possible, i.e. solutions that only operate on the objects of interest and not on all objects in the global environment (which could be a lot depending on your style). Actually, what made me want this list-like function was when coding the return() of the interesting results from a calculation function to what I imagine is the global environment (I have only a vague concept of that though). So, in the global environment there are very few objects, while there are more objects in the function where this list-like function will be used. Your solution does look way cleaner the falling back to hidden stuff as .GlobalEnv, so I will definately use it. In addition, the returned list has is not of a strange class as in Henriques example. Thanks, -- Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E signature.asc Description: Digital signature __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plyr: a*ply with functions that return matrices-- possible bug in aaply?
That is, I want to define something like the following using an a*ply method, but aaply gives a result in which the applied .margin(s) do not appear last in the result, contrary to the documentation for ?aaply. I think this is a bug, either in the function or the documentation, but perhaps there's something I misunderstand for this case. Maybe the documentation isn't clear but I think this is behaving as expected: * the margin you split on comes first in the output * followed by the dimensions created by the applied function. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to make list() return a list of *named* elements
On Mon, 4 Oct 2010 19:45:23 +0800 Berwin A Turlach ber...@maths.uwa.edu.au wrote: Mmh, R my.return - function (vector.of.variable.names) { sapply(vector.of.variable.names, function(x) list(get(x))) } make that: R my.return - function (vector.of.variable.names) { sapply(vector.of.variable.names, function(x) get(x)) } Cheers, Berwin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to make list() return a list of *named* elements
G'day Hans, On Mon, 4 Oct 2010 11:28:15 +0200 Hans Ekbrand h...@sociologi.cjb.net wrote: On Thu, Sep 30, 2010 at 09:34:26AM -0300, Henrique Dallazuanna wrote: You should try: eapply(.GlobalEnv, I)[c('b', 'my.c')] Great! b - c(22.4, 12.2, 10.9, 8.5, 9.2) my.c - sample.int(round(2*mean(b)), 4) my.return - function (vector.of.variable.names) { eapply(.GlobalEnv, I)[vector.of.variable.names] } Well, if you are willing to create a vector with the variable names, then simpler solutions should be possible, i.e. solutions that only operate on the objects of interest and not on all objects in the global environment (which could be a lot depending on your style). For example: R my.return - function (vector.of.variable.names) { sapply(vector.of.variable.names, function(x) list(get(x))) } R str(my.return(c(b,my.c))) List of 2 $ b : num [1:5] 22.4 12.2 10.9 8.5 9.2 $ my.c: int [1:4] 7 5 23 4 Cheers, Berwin == Full address Berwin A Turlach Tel.: +61 (8) 6488 3338 (secr) School of Maths and Stats (M019)+61 (8) 6488 3383 (self) The University of Western Australia FAX : +61 (8) 6488 1028 35 Stirling Highway Crawley WA 6009e-mail: ber...@maths.uwa.edu.au Australiahttp://www.maths.uwa.edu.au/~berwin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Question about splines in R
Dear all, I wanted to fit a penalized logistic model to obtain knots-based basis matrix as well as basis coefficients. I was wondering if there is a package in R that can generate a natural cubic spline and construct the corresponding penalty matrix? Thank you! Yan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Multivariate longitudinal analysis
Dear all, I am looking for an R package that fits multivariate gaussian or non-gaussian longitudinal outcomes. I am especially interested to non-gaussian outcomes since the outcomes I've got are discretes (some are binomial and some are counts). Many thanks in advance, Abderrahim [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Force evaluation of variable when calling partialPlot
Dear R Users, I'm using the randomForest package and would like to generate partial dependence plots, one after another, for a variety of variables: m - randomForest( s, ... ) varnames - c( var1, var2, var3, var4 ) # var1..4 are all in data frame s for( v in varnames ) { partialPlot( x=m, pred.data=s, x.var=v ) } ...but this doesn't work, with partialPlot complaining that it can't find the variable v. I think I need to force the evaluation of the loop variable v so that partialPlot sees the correct variable names, but am stumped after trying eval and similar functions. Any suggestions on how to do this? Googling has not turned up anything very useful. Thanks, Ben __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] print-show-display a matrix
Hello. I want to print the value a matrix has. The matrix's size is not too big (100*100 cells). I tried print(matrixname) but as it does not fit very well on my screen R splits it into several small matrixes that do overflow my screen. IS it possible somehow to display this matrix as one (even if this needs to go fullscreen) . There might be some function for that like plot cells or display matrixes. I would like to thank you in advance for your help Best Regards Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help on reading multiple files in R
allFilesList - lapply(list.file(), read.table, ..args..) On Mon, Oct 4, 2010 at 2:10 AM, Arnab Maity arnab_ma...@isb.edu wrote: Hi, How can I read multiple files(in a loop like manner) within a single code of R ? For example, I need to run the same code for different datasets (here list of companies) and since individual files are quite large, appending the files into one file is not a desirable option. Can this be done through a macro or sql kind of command? Thank you in advance. Arnab Kumar Maity Researcher Indian School of Business, Hyderabad - 500032 India Ph- +91 40 2318 7886 DISCLAIMER : This e-mail (including any attachments) is ...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to make list() return a list of *named* elements
On Mon, Oct 04, 2010 at 07:51:10PM +0800, Berwin A Turlach wrote: R my.return - function (vector.of.variable.names) { sapply(vector.of.variable.names, function(x) get(x)) } Even better :-) -- Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E signature.asc Description: Digital signature __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] spatial interaction (gravity) model as Poisson regression
Dear list, I posted essentially this same question to the r-sig-geo mailing list last week with no response :( Unfortunately I am no closer to reaching a solution, so I now post it here (with some clarifications) in the hope that someone following this list might have an answer for me: Has anyone had much experience with spatial interaction (or gravity) models, specifically in the form of Poisson regression? I'm a bit unsure of how to operationalize this using glm(), and would appreciate any pointers from those with more experience. I have searched the archives extensively, and there are several mentions of this question, but I have yet to find anything concrete that I can wrap my head around... apologies if I have missed something. Basically, the conventional origin constrained model would look something like this: T_{ij} = exp(\delta_{i} + \log{A_{j}} - \beta D_{ij}) ~ \varepsilon_{ij} where \delta_{i} is a constant parameter specific to the ith zone, A_{j} is the attractiveness of the jth location, and D_{ij} is the distance between i and j. Note that \varepsilon_{ij} is just the multiplicative error term of the flow from i to j, and \beta is the distance decay parameter. Similarly, the doubly constrained model follows the form: T_{ij} = exp(\delta_{i} + \gamma_{j} - \beta D_{ij}) ~ \varepsilon_{ij} where everything is defined as above, except exp(\gamma_{j}) is an estimate of the attractiveness of location A_{j}. Hopefully the above description makes things a bit clearer, essentially my question is this: What factors or in what form do I have to have my data in order to be able to run such a model following the glm syntax? I know this should be relatively straight-forward, I just can't seem to get my head wrapped around it at all? If it helps, I can provide some sample data to those who request it. TIA, Carson -- Carson J. Q. Farmer ISSP Doctoral Fellow National Centre for Geocomputation National University of Ireland, Maynooth, http://www.carsonfarmer.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] I have aproblem --thank you
On Oct 4, 2010, at 9:57 AM, 笑啸 wrote: dear teacher: thank you for your help,witn your help i develop the nomogram successfully. after that i want to do the internal validation to the nomogram. So you need some standard for accuracy, which would only come from the original data. Hence my suggestion that you bring the original data into R and rms. You cannot validate that model with the approaches you are attempting. And it would not be correct to say that you are validating the nomogram. It is simply a graphical tool for implementing predictions from a statistical model. You should be thinking of this as exploring the i wish to use the bootstrap to achieve the prediction accuracy of the nomogram,and then i encounter problem again,just like that. (error to :complete.cases(x, y, wt) : (the length of the augment was different)) i hope you tell me where is the mistake,and maybe i have chosen the wrong function. More accurately, I think you have chosen the wrong strategy. I think you need to add some statistical consultation fees to your research budget. Or you may get away with buying Regression Modeling Strategies and using it to work through the many worked examples that accompany the rms package. The first option may be faster, but the second option is probably slower but should result in a higher level of statistical understanding for you. The proper choice for you depends on your deadline, available time and what monetary value you assign to that spare time. -- David. thank you turly yours .. load package 'rms' n-100 set.seed(10) T.Grade-factor(0:3,labels=c(G0, G1, G2,G3)) Sex-factor(0:1,labels=c(F,M)) Smoking-factor(0:1,labels=c(No,yes)) L-0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking) +0.92*as.numeric(Sex)-1.338 L [1] -0.755 -0.172 0.363 0.946 y - ifelse(runif(n) plogis(L), 1, 0) dfr - data.frame(T.Grade,Sex,Smoking, L, y) ddist - datadist(dfr) # wrap the vectors into a dataframe. options(datadist='ddist') f-lrm(y~T.Grade +Sex+Smoking, data=dfr) # skip the as.numeric()'s ### Gives an error message due to singular X matrix. f-lrm(y~T.Grade +Sex+Smoking, data=dfr) singular information matrix in lrm.fit (rank= 5 ). Offending variable(s): Sex=M Error in lrm(y ~ T.Grade + Sex + Smoking, data = dfr) : Unable to fit model using “lrm.fit” #Try instead: n-100 set.seed(10) T.Grade-factor(0:3,labels=c(G0, G1, G2,G3)) Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M)) Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes)) dfr$L - with(dfr, 0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking) +0.92*as.numeric(Sex)-1.338) dfr$y - with(dfr, ifelse(runif(n) plogis(L), 1, 0) ) dfr - data.frame(T.Grade,Sex,Smoking, L, y) ddist - datadist(dfr) options(datadist='ddist') f-lrm(y~T.Grade +Sex+Smoking, data=dfr) nom - nomogram(f, fun=function(x)1/(1+exp(-x)), # or fun=plogis fun.at=c(.001,.01,.05,seq(.1,.9,by=.1),.95,.99,.999), funlabel=Risk of Death) plot(nom, xfrac=.45) load package bootstrap(achieve the prediction accuracy of the nomogram) .the problem theta.fit - function(dfr,y){lsfit(dfr,y)} theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef} sq.err - function(y,yhat) { (y-yhat)^2} results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err) error to :complete.cases(x, y, wt) : (the length of the augment was different.__ -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Issue with match.call
Hi, Something along these lines should get you there: RFF - function(qtype, qOpt, ...) { mc - match.call(expand.dots=TRUE) vec - c(flag,sep,dec) matchedargs - match(vec, names(mc), FALSE) } 'matchedargs' will be a vector of the positions in mc where it matched 'vec' (or 0 if it did not). If you just want to extract the actual arguments passed afterward, use the [[ extraction operator. Cheers, Josh On Mon, Oct 4, 2010 at 6:45 AM, raje...@cse.iitm.ac.in raje...@cse.iitm.ac.in wrote: Hi, I have a function that I'm writing. The arguments in the function are as follows RFF-function(qtype, qOpt,...){} i.e., I have two args that are compulsary and the rest are optional. Now when my user passes the function call, I need to see what optional args are defined and process accordingly...what I have so far is.. RFF-function(qtype, qOpt,...){ mc - match.call(expand.dots=TRUE) } I need to see what all args have been sent out of vec-c(flag,sep,dec) and define if-else conditions based on whether they have been defined. How do I do this? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fisher exact test?
To find the probability directly, you'll need to clearly state how you are sampling. Are you sampling with replacement? (Draw then put back.) Or are you sampling without replacement? (Draw. Draw from the remaining. Draw from the remaining.) Also, of the N genes, do you know how many are of the particular set? How many genes do you draw each try? Does order matter to you? In other words is (A=5, B=0, C=23) equivalent to (A=0,B=5,C=23)? -tgs On Mon, Oct 4, 2010 at 10:13 AM, James Nead james_n...@yahoo.com wrote: Hi All, My apologies if this is a totally newbie question. I want to calculate the probability that a particular set of genes is picked repeatedly for 3 samplings. For example, if from a total of 'N' genes, I pick 'A' number of genes in the first pick, 'B' number of genes in the second pick, and 'C' number of genes in the third pick, I would want to calculate p(n) where n = (A intersection B intersection C). Would fisher exact test be the correct method to evaluate the probability? And if so how can I frame the 3x3 (?) contingency table? I can find numerous examples of 2x2 tables, but none for 3x3. Any guidance would be appreciated. many thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Splitting a DF into rows according to a column
Hi, I'm turning my wheels on this and keep coming around to the same wrong solution - please have a look and give a hand ... The premise is: a DF like so loremIpsum - Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque leo ipsum, ultricies scelerisque volutpat non, volutpat et nulla. Curabitur consequat ullamcorper tellus id imperdiet. Duis semper malesuada nulla, blandit lobortis diam fringilla at. Vestibulum nec tellus orci, eu sollicitudin quam. Phasellus sit amet enim diam. Phasellus mattis hendrerit varius. Curabitur ut tristique enim. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed convallis, tortor id vehicula facilisis, nunc justo facilisis tellus, sed eleifend nisi lacus id purus. Maecenas tempus sollicitudin libero, molestie laoreet metus dapibus eu. Mauris justo ante, mattis et pulvinar a, varius pretium eros. Curabitur fringilla dui ac dui rutrum pretium. Donec sed magna adipiscing nisi accumsan congue sed ac est. Vivamus lorem urna, tristique quis accumsan quis, ullamcorper aliquet velit. tmpDF - data.frame(Column1=rep(unlist(strsplit(loremIpsum, )),length.out=510),Column2=runif(510,min=0,max=1e8)) is to be split into DFs with 50 entries in an ordered manner according to column2 (first DF ist o contain the rows with the 50 largest numbers, ...). Here is what I have been doing: binSize - 50 splitMembership - pmin(ceiling(order(tmpDF[[Column2]],decreasing=TRUE)/binSize),floor(nrow(tmpDF)/binSize)) splitList - split(tmpDF,splitMembership) Distribution seems to work ... sapply(splitList,nrow) But this is NOT what I wanted ... sapply(splitList,function(x){max(x[[Column2]])}) This was supposed to give me bins that are Column2-sorted and bin one should have a higher max than 2 than 3 ... Can anyone point out where (my now 3 reimplementations) fail? Thanks, Stupid Joh __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] input matrix for leaps algorithm
Hello, I was trying to use the leaps algorithm for my variable selection. I tried different matrices calculated from multiple linear regression, linear discriminant analysis, and the basic correlation matrix as inputs to the leaps algorithm. But I always got the error message - the matrix is not positive definite. I have more variables than the number of observations/samples. I know this may be a problem. But the multiple linear regression in SAS seems to be able to handle the case when there are more variables than the number of observations. I wonder if there is a way I can still do my variable selection instead of reducing the number of variables. Or there may be something wrong with my function calls. Thank you. -- View this message in context: http://r.789695.n4.nabble.com/input-matrix-for-leaps-algorithm-tp2954453p2954453.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Scatterplot matrix - Pearson linear correlation and Density Ellipse
Hi, strangely, when I run this script: panel.cor = function(x, y, digits=2, prefix=r=, cex.cor) { usr = par(usr); on.exit(par(usr)) par(usr = c(0, 1, 0, 1)) r = abs(cor(x, y, use=pairwise.complete.obs, method = pearson)) txt = format(c(r, 0.123456789), digits=digits)[1] txt = paste(prefix, txt, sep=) if(missing(cex.cor)) cex.cor = 0.8/strwidth(txt) text(0.5, 0.5, txt, cex = cex.cor) } pairs(cap[8:11], lower.panel=panel.lm, upper.panel=panel.cor, cex=2) I get linear regression and density ellipse, why? Is it possible to add also the data point? Thanks a lot -- View this message in context: http://r.789695.n4.nabble.com/Scatterplot-matrix-Pearson-linear-correlation-and-Density-Ellipse-tp2763552p2954416.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Plot for Binomial GLM
Hi i would like to use some graphs or tables to explore the data and make some sensible guesses of what to expect to see in a glm model to assess if toxin concentration and sex have a relationship with the kill rate of rats. But i cant seem to work it out as i have two predictor variables~help?Thanks.:) Here's my data. rat.toxic-read.table(file=Rats.csv,header=T,row.names=NULL,sep=,) attach(rat.toxic) names(rat.toxic) [1] Dose Sex Dead Alive rat.toxic Dose Sex Dead Alive 110 F119 210 M020 320 F416 420 M416 530 F911 630 M812 740 F 13 7 840 M 13 7 950 F 18 2 10 50 M 17 3 11 60 F 20 0 12 60 M 16 4 13 10 F317 14 10 M119 15 20 F218 16 20 M218 17 30 F 1010 18 30 M812 19 40 F 14 6 20 40 M 12 8 21 50 F 16 4 22 50 M 13 7 23 60 F 18 2 24 60 M 16 4 glm2-glm(deadalive~Dose*Sex,family=binomial,data=rat.toxic) anova(glm2,test=Chi) Analysis of Deviance Table Model: binomial, link: logit Response: deadalive Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev P(|Chi|) NULL23225.455 Dose 1 202.36622 23.0902e-16 *** Sex 14.32821 18.7620.0375 * Dose:Sex 11.14920 17.6130.2838 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 summary(glm2) Call: glm(formula = deadalive ~ Dose * Sex, family = binomial, data = rat.toxic) Deviance Residuals: Min1QMedian3Q Max -1.82241 -0.85632 0.06675 0.61981 1.47874 Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) -3.479390.46167 -7.537 4.83e-14 *** Dose 0.105970.01286 8.243 2e-16 *** SexM 0.155010.63974 0.2420.809 Dose:SexM -0.018210.01707 -1.0670.286 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 225.455 on 23 degrees of freedom Residual deviance: 17.613 on 20 degrees of freedom AIC: 91.115 Number of Fisher Scoring iterations: 4 -- View this message in context: http://r.789695.n4.nabble.com/Plot-for-Binomial-GLM-tp2954406p2954406.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How many R packages are not free?
Thanks to everyone who replied to my post. One person indicated that this question treads on dangerous terrain. I had that sense too but really wanted to know how many packages might not be free in the sense that they could be used at no cost by everyone. I was a little disappointed to learn that there are packages that are free for some people but not for others. But I understand that it's not me who has put time and effort into developing the base software or the various add-on packages. And so clearly it's up to others who have put in the time and effort to make those decisions. Paul [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Issue with match.call
RFF-function(qtype, qOpt,...){} i.e., I have two args that are compulsary and the rest are optional. Now when my user passes the function call, I need to see what optional args are defined and process accordingly...what I have so far is.. RFF-function(qtype, qOpt,...){ mc - match.call(expand.dots=TRUE) } I need to see what all args have been sent out of vec-c(flag,sep,dec) and define if-else conditions based on whether they have been defined. How do I do this? I think you'd be much better off defining those as arguments and using missing(), rather than messing around with match.call (unless there is a specific reason you need the unevaluated expressions). Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plot for Binomial GLM
On Mon, Oct 4, 2010 at 7:21 AM, klsk89 karenkls...@yahoo.com wrote: Hi i would like to use some graphs or tables to explore the data and make some sensible guesses of what to expect to see in a glm model to assess if toxin concentration and sex have a relationship with the kill rate of rats. But i cant seem to work it out as i have two predictor variables~help?Thanks.:) What about xtabs? For instance: xtabs(deadalive ~ Dose + Sex, data = rat.toxic) Regarding graphs, take a look at faceting in ggplot2 (or lattice). You can get something close to the 3 way table but in graphical form that way. I am not sure if this is completely up and running yet, but I know there has been work linking ggobi with R. I have seen a few demonstrations that looked quite promising, and it may work well for you to visualize three variables at once (and interactively). Here is the link: http://www.ggobi.org/rggobi/ Here's my data. rat.toxic-read.table(file=Rats.csv,header=T,row.names=NULL,sep=,) attach(rat.toxic) ^ why attach it? names(rat.toxic) [1] Dose Sex Dead Alive rat.toxic Dose Sex Dead Alive 1 10 F 1 19 2 10 M 0 20 3 20 F 4 16 4 20 M 4 16 5 30 F 9 11 6 30 M 8 12 7 40 F 13 7 8 40 M 13 7 9 50 F 18 2 10 50 M 17 3 11 60 F 20 0 12 60 M 16 4 13 10 F 3 17 14 10 M 1 19 15 20 F 2 18 16 20 M 2 18 17 30 F 10 10 18 30 M 8 12 19 40 F 14 6 20 40 M 12 8 21 50 F 16 4 22 50 M 13 7 23 60 F 18 2 24 60 M 16 4 Please tell me that after this, you converted the counts of dead and alive into a single variable that had a 0 or 1 if dead and the opposite as alive before you used it as the dependent variable in your logistic regression. glm2-glm(deadalive~Dose*Sex,family=binomial,data=rat.toxic) anova(glm2,test=Chi) Analysis of Deviance Table Model: binomial, link: logit Response: deadalive Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev P(|Chi|) NULL 23 225.455 Dose 1 202.366 22 23.090 2e-16 *** Sex 1 4.328 21 18.762 0.0375 * Dose:Sex 1 1.149 20 17.613 0.2838 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 summary(glm2) Call: glm(formula = deadalive ~ Dose * Sex, family = binomial, data = rat.toxic) Deviance Residuals: Min 1Q Median 3Q Max -1.82241 -0.85632 0.06675 0.61981 1.47874 Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) -3.47939 0.46167 -7.537 4.83e-14 *** Dose 0.10597 0.01286 8.243 2e-16 *** SexM 0.15501 0.63974 0.242 0.809 Dose:SexM -0.01821 0.01707 -1.067 0.286 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 225.455 on 23 degrees of freedom Residual deviance: 17.613 on 20 degrees of freedom AIC: 91.115 Number of Fisher Scoring iterations: 4 -- View this message in context: http://r.789695.n4.nabble.com/Plot-for-Binomial-GLM-tp2954406p2954406.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting a DF into rows according to a column
On Oct 4, 2010, at 16:57 , Johannes Graumann wrote: Hi, I'm turning my wheels on this and keep coming around to the same wrong solution - please have a look and give a hand ... The premise is: a DF like so loremIpsum - Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque leo ipsum, ultricies scelerisque volutpat non, volutpat et nulla. Curabitur consequat ullamcorper tellus id imperdiet. Duis semper malesuada nulla, blandit lobortis diam fringilla at. Vestibulum nec tellus orci, eu sollicitudin quam. Phasellus sit amet enim diam. Phasellus mattis hendrerit varius. Curabitur ut tristique enim. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed convallis, tortor id vehicula facilisis, nunc justo facilisis tellus, sed eleifend nisi lacus id purus. Maecenas tempus sollicitudin libero, molestie laoreet metus dapibus eu. Mauris justo ante, mattis et pulvinar a, varius pretium eros. Curabitur fringilla dui ac dui rutrum pretium. Donec sed magna adipiscing nisi accumsan congue sed ac est. Vivamus lorem urna, tristique quis accumsan quis, ullamcorper aliquet velit. tmpDF - data.frame(Column1=rep(unlist(strsplit(loremIpsum, )),length.out=510),Column2=runif(510,min=0,max=1e8)) is to be split into DFs with 50 entries in an ordered manner according to column2 (first DF ist o contain the rows with the 50 largest numbers, ...). Here is what I have been doing: binSize - 50 splitMembership - pmin(ceiling(order(tmpDF[[Column2]],decreasing=TRUE)/binSize),floor(nrow(tmpDF)/binSize)) splitList - split(tmpDF,splitMembership) Distribution seems to work ... sapply(splitList,nrow) But this is NOT what I wanted ... sapply(splitList,function(x){max(x[[Column2]])}) This was supposed to give me bins that are Column2-sorted and bin one should have a higher max than 2 than 3 ... Can anyone point out where (my now 3 reimplementations) fail? Thanks, Stupid Joh Dear Stupid Joh, Have you considered something along the lines of o - order(-x$Column2) xx - x[o,] split(xx, (seq_len(NROW(x))-1) %/% 50) The above is a bit hard to follow, but it seems to work better with rank() instead of order(): splitMembership - + pmin(ceiling(rank(-tmpDF[[Column2]])/binSize),floor(nrow(tmpDF)/binSize)) splitList - split(tmpDF,splitMembership) sapply(splitList,nrow) 1 2 3 4 5 6 7 8 9 10 50 50 50 50 50 50 50 50 50 60 sapply(splitList,function(x){max(x[[Column2]])}) 123456 99877498 90567877 81965382 69112280 59814266 52130373 789 10 41557660 32630212 21226996 11880032 -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotmath: how to use greek symbols in expression(integral(f(tau)*dtau, 0, t))?
On Oct 4, 2010, at 10:11 AM, Thomas Stewart wrote: Try expression(paste(integral(),f(,tau,) d,tau,sep=))) it works for me. It does work (albeit without properly including the requested limits of integrations), but if it is used as a teaching example, it will obscure the syntax of plotmath expressions. A simple version that gets the same result is: expression(integral()*f(tau)*d*tau)) Plotmath expressions are parsed from unquoted text with proper non- printing separators being ~ for space and * for juxtaposition. The plotmath paste() is not really the same as the base function paste(), despite identical names and its main value is as a method for allowing commas to also be used as separators. Frankly I'm a bit surprised you got away with adding the undocumented sep= argument. My reading of the help page for plotmath would have lead me to predict that the text 'sep=' should appear in the output. Instead the sep= is ignored and the text string is appended to the end of the expression Try for instance: plot(1,1, main=expression(paste(a,b,c,d, sep=-) ) ) -- David. -tgs On Mon, Oct 4, 2010 at 9:25 AM, Czerminski, Ryszard ryszard.czermin...@astrazeneca.com wrote: I would like to use greek tau as a symbol of variable to integrate over in plotmath expression(integral(f(tau)*dtau, 0,t)) but nothing seems to work. I tried d{\tau}, d\tau, etc., without any success Is it possible? How can I accomplish this? Best regards, Ryszard -- Confidentiality Notice: This message is private and may ... {{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Debye function
Dear list, Is there another package than gsl* where the Debye function (of first order) is implemented? Google only finds the gsl package. Thanks in advance Christophe * page 12 of reference manual of gsl. http://cran.r-project.org/web/packages/gsl/gsl.pdf -- Christophe DUTANG Ph. D. student at ISFA, Lyon, France [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error during scp transfer
Dear all I am implement one grid computing system using GridR package. I am working on windows platform. I did all the configuration according to the GridR tutorial. But it has some errors when I execute grid.apply function. Here is the output from stderr: R: not found java.io.IOException: Error during SCP transfer. at com.trilead.ssh2.SCPClient.get(SCPClient.java:703) at com.trilead.ssh2.SCPClient.get(SCPClient.java:596) at MySSh.download(MySSh.java:104) at MySSh.main(MySSh.java:151) Caused by: java.io.IOException: Remote SCP error: scp: grid/grid-ORNITORRINCO-1708-2010-10-04-15-28-47-9-script.Rout: No such file or directory at com.trilead.ssh2.SCPClient.receiveFiles(SCPClient.java:325) at com.trilead.ssh2.SCPClient.get(SCPClient.java:699) ... 3 more Grid job had error: SSH Connection Error:Error during SCP transfer. Could you please give some suggestion to solve this problem ? Thanks in advance. Kind Regards W. Mathew [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Debye function
install.packages('sos') # if not already installed library(sos) Debye - ???Debye #found 4 matches Debye # Print method opens table in a web browser Conclusions: 1. There are references to Debye in the CHNOSZ package, but it may not be what you want. 2. If it's otherwise available in R, it's very well concealed. Spencer On 10/4/2010 8:56 AM, Christophe Dutang wrote: Dear list, Is there another package than gsl* where the Debye function (of first order) is implemented? Google only finds the gsl package. Thanks in advance Christophe * page 12 of reference manual of gsl. http://cran.r-project.org/web/packages/gsl/gsl.pdf -- Spencer Graves, PE, PhD President and Chief Operating Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] read columns of quoted numbers as factors
Suppose I have a data file (possibly with a huge number of columns), where the columns with factors are coded as 1, 2, 3, etc ... The default behavior of read.table is to convert these columns to integer vectors. Is there a way to get read.table to recognize that columns of quoted numbers represent factors (while unquoted numbers are interpreted as integers), without explicitly setting them with colClasses ? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error message with scan() function
Hi Everyone, I am new R user and am trying to learn by reading the online manual An Introduction to R from the R web site. I am trying to practice using the scan() function as explained in the manual. For this I first created three vectors (one a character vector and two numeric one) and saved file input.dat in my working directory as: label - c(Bill, Tom, Pat, Will, Sue, Sam) x - 1:6 y - 6:1 save(label, x, y, file = input.dat) Now, when I try the scan() function as explained in the manual as below, I get the error message: inp - scan(input.dat, list(,0,0)) Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : scan() expected 'a real', got 'X' I have not been able to figure out what the problem is. I would appreciate if someone could please guide me as to what I am doing wrong. Regards, Tariq [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help with apply
Suppose I have the following data: tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ for(i in 1:L){ rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] } } rr1 But, I think this is so inefficient for large Q and nrow(tmp). The function I am looping over is: fn - function(x, nodes, weights, s, b) { exp(sum(log((exp(x*(nodes-b))) / (factorial(x) * exp(exp(nodes-b)) * ((1/(s*sqrt(2*pi))) * exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights } I've tried using apply in a few ways, but I can't replicate rr1 from the double loop. I can go through each value of Q one step at a time and get matching values in the rr1 table, but this would still require a loop and that I think can be avoided. apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b = b) Does anyone see an efficient way to compute rr1 without a loop? Harold sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] minqa_1.1.9 Rcpp_0.8.6 MiscPsycho_1.6 statmod_1.4.6 lattice_0.17-26 gdata_2.8.0 loaded via a namespace (and not attached): [1] grid_2.10.1 gtools_2.6.2 tools_2.10.1 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Debye function
Well, not really ... Google on R package Debye . The 5th hit lists the gsl package with the debye() function. Moral: DO make use of standard professional search tools, also. -- Bert On Mon, Oct 4, 2010 at 9:13 AM, Spencer Graves spencer.gra...@structuremonitoring.com wrote: install.packages('sos') # if not already installed library(sos) Debye - ???Debye #found 4 matches Debye # Print method opens table in a web browser Conclusions: 1. There are references to Debye in the CHNOSZ package, but it may not be what you want. 2. If it's otherwise available in R, it's very well concealed. Spencer On 10/4/2010 8:56 AM, Christophe Dutang wrote: Dear list, Is there another package than gsl* where the Debye function (of first order) is implemented? Google only finds the gsl package. Thanks in advance Christophe * page 12 of reference manual of gsl. http://cran.r-project.org/web/packages/gsl/gsl.pdf -- Spencer Graves, PE, PhD President and Chief Operating Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Debye function
OOPs. I Only read Spencer's reply. Sorry for repeating the poster's original comment. -- Bert On Mon, Oct 4, 2010 at 10:10 AM, Bert Gunter bgun...@gene.com wrote: Well, not really ... Google on R package Debye . The 5th hit lists the gsl package with the debye() function. Moral: DO make use of standard professional search tools, also. -- Bert On Mon, Oct 4, 2010 at 9:13 AM, Spencer Graves spencer.gra...@structuremonitoring.com wrote: install.packages('sos') # if not already installed library(sos) Debye - ???Debye #found 4 matches Debye # Print method opens table in a web browser Conclusions: 1. There are references to Debye in the CHNOSZ package, but it may not be what you want. 2. If it's otherwise available in R, it's very well concealed. Spencer On 10/4/2010 8:56 AM, Christophe Dutang wrote: Dear list, Is there another package than gsl* where the Debye function (of first order) is implemented? Google only finds the gsl package. Thanks in advance Christophe * page 12 of reference manual of gsl. http://cran.r-project.org/web/packages/gsl/gsl.pdf -- Spencer Graves, PE, PhD President and Chief Operating Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics -- Bert Gunter Genentech Nonclinical Biostatistics 467-7374 http://devo.gene.com/groups/devo/depts/ncb/home.shtml __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error message with scan() function
On Oct 4, 2010, at 12:51 PM, Tariq Perwez wrote: Hi Everyone, I am new R user and am trying to learn by reading the online manual An Introduction to R from the R web site. I am trying to practice using the scan() function as explained in the manual. For this I first created three vectors (one a character vector and two numeric one) and saved file input.dat in my working directory as: label - c(Bill, Tom, Pat, Will, Sue, Sam) x - 1:6 y - 6:1 save(label, x, y, file = input.dat) Now, when I try the scan() function as explained in the manual as below, I get the error message: Typically the output and input functions are paired and the natural pairing for the save function is with the load() function. The scan() function is for text files (and requires specification if the vector types if not all numeric, while save() stores R objects with a compressed format that preserves the object identities and attributes. Other pairings: dput/dget write/read You probably ought to be also reading the R Import/Export Manual: http://cran.r-project.org/doc/manuals/R-data.pdf inp - scan(input.dat, list(,0,0)) Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : scan() expected 'a real', got 'X' ?scan scan() expects a real when not given an argument for what. Did you look at that file with a text editor? -- David. I have not been able to figure out what the problem is. I would appreciate if someone could please guide me as to what I am doing wrong. Regards, Tariq David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Inserting a plot into another
OK, I found the bug in subplot (it was never tested with multiple figures). The bug has been fixed, but probably won't make it to CRAN any time soon. In the meantime there are a couple of work arounds. 1. create your own local copy of subplot and edit it so that the 1st 3 lines of the body are: type - match.arg(type) old.par - par( c(type, 'usr', names(pars) ) ) on.exit(par(old.par)) (this means moving the 'type -' line and modifying the 'old.par -' line ). Then use this copy instead of the one in TeachingDemos. 2. The problem comes because par('mfg') is getting updated by subplot when it shouldn't be, you can just set par(mfg= ) yourself after each call to subplot. 3. wait a couple of days for the R-forge version to be updated, then install the R-forge version of TeachingDemos 2.8. Sorry for the problems and hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Filoche Sent: Monday, October 04, 2010 7:58 AM To: r-help@r-project.org Subject: Re: [R] Inserting a plot into another Hi everyone. First, thank you for your answers, it helped me alot. I have 1 more question regarding subplot. I'm using this function in a graph that contains 6 plots (3 x 2). This is producing strange behavior. http://img545.imageshack.us/img545/9233/fig6.png See here for the resulting figure. For instance here's my code : pdf(file=C:/Users/Modelisation/Desktop/Fig6.pdf, width = 8.5, height = 11, pointsize = 12); par(mfcol = c(3,2), mai = c(0.7,0.8,0,0), omi = c(1, 0, 0.7, 0.1)); plot(FSL.data_Center$Distance, 4.6/FSL.data_Center$kd.BLUE., pch = 22, bg = gray43, xlab = Distance (km), ylab = 1% penetration depth (m) , ylim = c(0,20), xaxt = n, yaxt = n, xpd = NA); axis(1, at = c(100,200,300,400,500),labels = c(100,200,300,400,500), cex.axis=1, tck = 0.02); axis(2, tck = 0.02); ... #Subplots subplot(plot(lowess(FSL.data_North$Distance, 4.6/FSL.data_North$kd.BLUE.), type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450), ylim = c(0,20)), 330,17, size = c(.96, .6)); subplot(plot(lowess(FSL.data_Center$Distance, 4.6/FSL.data_Center$kd.RED.), type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450), ylim = c(0,20)), 330,17, size = c(.96, .6)); subplot(plot(lowess(FSL.data_South$Distance, 4.6/FSL.data_South$kd.GREEN.), type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450), ylim = c(0,20)), 330,17, size = c(.96, .6)); PLOT CODE FOR THE 5 OTHER FIGURES HERE dev.off(); The last 2 subplots are plotted in at the bottom of the page. I would like to have the last 2 curves on the plot A. With regards, Phil -- View this message in context: http://r.789695.n4.nabble.com/Inserting- a-plot-into-another-tp2720936p2954362.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with apply
You are missing s in your definitions so I can't reproduce your code. tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) str(tmp) 'data.frame': 3 obs. of 3 variables: $ var1: int 9 3 9 $ var2: int 4 6 2 $ var3: int 2 9 3 #I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ + for(i in 1:L){ + rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * + exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] +} +} Error: objeto 's' no encontrado rr1 [,1] [,2] [,3] [1,]000 [2,]000 Gabriela 2010/10/4, Doran, Harold hdo...@air.org: Suppose I have the following data: tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ for(i in 1:L){ rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] } } rr1 But, I think this is so inefficient for large Q and nrow(tmp). The function I am looping over is: fn - function(x, nodes, weights, s, b) { exp(sum(log((exp(x*(nodes-b))) / (factorial(x) * exp(exp(nodes-b)) * ((1/(s*sqrt(2*pi))) * exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights } I've tried using apply in a few ways, but I can't replicate rr1 from the double loop. I can go through each value of Q one step at a time and get matching values in the rr1 table, but this would still require a loop and that I think can be avoided. apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b = b) Does anyone see an efficient way to compute rr1 without a loop? Harold sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] minqa_1.1.9 Rcpp_0.8.6 MiscPsycho_1.6 statmod_1.4.6 lattice_0.17-26 gdata_2.8.0 loaded via a namespace (and not attached): [1] grid_2.10.1 gtools_2.6.2 tools_2.10.1 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- _ Lic. María Gabriela Cendoya Magíster en Biometría Profesor Adjunto Facultad de Ciencias Agrarias UNMdP - Argentina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with apply
Sorry about that; s - 1 -Original Message- From: Gabriela Cendoya [mailto:gabrielacendoya.rl...@gmail.com] Sent: Monday, October 04, 2010 1:44 PM To: Doran, Harold Cc: R-help Subject: Re: [R] Help with apply You are missing s in your definitions so I can't reproduce your code. tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) str(tmp) 'data.frame': 3 obs. of 3 variables: $ var1: int 9 3 9 $ var2: int 4 6 2 $ var3: int 2 9 3 #I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ + for(i in 1:L){ + rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * + exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] +} +} Error: objeto 's' no encontrado rr1 [,1] [,2] [,3] [1,]000 [2,]000 Gabriela 2010/10/4, Doran, Harold hdo...@air.org: Suppose I have the following data: tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ for(i in 1:L){ rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] } } rr1 But, I think this is so inefficient for large Q and nrow(tmp). The function I am looping over is: fn - function(x, nodes, weights, s, b) { exp(sum(log((exp(x*(nodes-b))) / (factorial(x) * exp(exp(nodes-b)) * ((1/(s*sqrt(2*pi))) * exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights } I've tried using apply in a few ways, but I can't replicate rr1 from the double loop. I can go through each value of Q one step at a time and get matching values in the rr1 table, but this would still require a loop and that I think can be avoided. apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b = b) Does anyone see an efficient way to compute rr1 without a loop? Harold sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] minqa_1.1.9 Rcpp_0.8.6 MiscPsycho_1.6 statmod_1.4.6 lattice_0.17-26 gdata_2.8.0 loaded via a namespace (and not attached): [1] grid_2.10.1 gtools_2.6.2 tools_2.10.1 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- _ Lic. María Gabriela Cendoya Magíster en Biometría Profesor Adjunto Facultad de Ciencias Agrarias UNMdP - Argentina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with apply
On Mon, Oct 4, 2010 at 10:44 AM, Gabriela Cendoya gabrielacendoya.rl...@gmail.com wrote: You are missing s in your definitions so I can't reproduce your code. When he uses apply(), he sets s = 1. tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) str(tmp) 'data.frame': 3 obs. of 3 variables: $ var1: int 9 3 9 $ var2: int 4 6 2 $ var3: int 2 9 3 #I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ + for(i in 1:L){ + rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * + exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] + } + } Error: objeto 's' no encontrado rr1 [,1] [,2] [,3] [1,] 0 0 0 [2,] 0 0 0 Gabriela 2010/10/4, Doran, Harold hdo...@air.org: Suppose I have the following data: tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ for(i in 1:L){ rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] } } rr1 But, I think this is so inefficient for large Q and nrow(tmp). The function I am looping over is: fn - function(x, nodes, weights, s, b) { exp(sum(log((exp(x*(nodes-b))) / (factorial(x) * exp(exp(nodes-b)) * ((1/(s*sqrt(2*pi))) * exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights } I've tried using apply in a few ways, but I can't replicate rr1 from the double loop. I can go through each value of Q one step at a time and get matching values in the rr1 table, but this would still require a loop and that I think can be avoided. apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b = b) Does anyone see an efficient way to compute rr1 without a loop? Harold sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] minqa_1.1.9 Rcpp_0.8.6 MiscPsycho_1.6 statmod_1.4.6 lattice_0.17-26 gdata_2.8.0 loaded via a namespace (and not attached): [1] grid_2.10.1 gtools_2.6.2 tools_2.10.1 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- _ Lic. María Gabriela Cendoya Magíster en Biometría Profesor Adjunto Facultad de Ciencias Agrarias UNMdP - Argentina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] vectorizing problem
Hello, I have a two column dataframe that has entries that look like this: 2315100 NR_024005,NR_024004,AK093685 2315106 DQ786314 and I want to change this to look like this: 2315100 NR_024005 2315100 NR_024004 2315100 AK093685 2315106 DQ786314 I can do this with the following for loop but the dataframe (GPL) has ~140,000 rows and this takes about 15 minutes: extGPL - matrix(nrow=0,ncol=2) for (i in 1:length(GPL[,2])){ aa - unlist(strsplit(as.character(GPL[i,2]),\\,)) bb - rep(as.numeric(as.character(GPL[i,1])), length(aa)) cc - matrix(c(bb,aa),ncol = 2) extGPL - rbind(extGPL,cc) } Is there a way to vectorize this? Thanks, Dylan Miracle University of Minnesota GCD Department __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vectorizing problem
Try this: do.call(rbind.data.frame, mapply(cbind, DF$V1, strsplit(as.character(DF$V2), ,))) On Mon, Oct 4, 2010 at 2:54 PM, Dylan Miracle dylan.mira...@gmail.comwrote: Hello, I have a two column dataframe that has entries that look like this: 2315100 NR_024005,NR_024004,AK093685 2315106 DQ786314 and I want to change this to look like this: 2315100 NR_024005 2315100 NR_024004 2315100 AK093685 2315106 DQ786314 I can do this with the following for loop but the dataframe (GPL) has ~140,000 rows and this takes about 15 minutes: extGPL - matrix(nrow=0,ncol=2) for (i in 1:length(GPL[,2])){ aa - unlist(strsplit(as.character(GPL[i,2]),\\,)) bb - rep(as.numeric(as.character(GPL[i,1])), length(aa)) cc - matrix(c(bb,aa),ncol = 2) extGPL - rbind(extGPL,cc) } Is there a way to vectorize this? Thanks, Dylan Miracle University of Minnesota GCD Department __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vectorizing problem
try this: GPL - data.frame(x = c(2315100, 2315106), y = c(NR_024005,NR_024004,AK093685, DQ786314)) sp - strsplit(as.character(GPL$y), ,) ni - sapply(sp, length) data.frame(x = rep(GPL$x, ni), y = unlist(sp)) I hope it helps. Best, Dimitris On 10/4/2010 7:54 PM, Dylan Miracle wrote: Hello, I have a two column dataframe that has entries that look like this: 2315100 NR_024005,NR_024004,AK093685 2315106 DQ786314 and I want to change this to look like this: 2315100 NR_024005 2315100 NR_024004 2315100 AK093685 2315106 DQ786314 I can do this with the following for loop but the dataframe (GPL) has ~140,000 rows and this takes about 15 minutes: extGPL- matrix(nrow=0,ncol=2) for (i in 1:length(GPL[,2])){ aa- unlist(strsplit(as.character(GPL[i,2]),\\,)) bb- rep(as.numeric(as.character(GPL[i,1])), length(aa)) cc- matrix(c(bb,aa),ncol = 2) extGPL- rbind(extGPL,cc) } Is there a way to vectorize this? Thanks, Dylan Miracle University of Minnesota GCD Department __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 Web: http://www.erasmusmc.nl/biostatistiek/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vectorizing problem
On Oct 4, 2010, at 2:02 PM, Henrique Dallazuanna wrote: Try this: do.call(rbind.data.frame, mapply(cbind, DF$V1, strsplit(as.character(DF$V2), ,))) Also try: txt V1 V2 1 2315100 NR_024005,NR_024004,AK093685 2 2315106 DQ786314 str(txt) 'data.frame': 2 obs. of 2 variables: $ V1: int 2315100 2315106 $ V2: chr NR_024005,NR_024004,AK093685 DQ786314 # Note that my V2 was input as chr, Henrique's appears to have been a factor column data.frame(idxs = rep(txt$V1, length(strsplit(txt$V2, split=,) ) ) , vals = unlist(strsplit(txt$V2, split=,) ) ) idxs vals 1 2315100 NR_024005 2 2315106 NR_024004 3 2315100 AK093685 4 2315106 DQ786314 -- David. On Mon, Oct 4, 2010 at 2:54 PM, Dylan Miracle dylan.mira...@gmail.comwrote: Hello, I have a two column dataframe that has entries that look like this: 2315100 NR_024005,NR_024004,AK093685 2315106 DQ786314 and I want to change this to look like this: 2315100 NR_024005 2315100 NR_024004 2315100 AK093685 2315106 DQ786314 I can do this with the following for loop but the dataframe (GPL) has ~140,000 rows and this takes about 15 minutes: extGPL - matrix(nrow=0,ncol=2) for (i in 1:length(GPL[,2])){ aa - unlist(strsplit(as.character(GPL[i,2]),\\,)) bb - rep(as.numeric(as.character(GPL[i,1])), length(aa)) cc - matrix(c(bb,aa),ncol = 2) extGPL - rbind(extGPL,cc) } Is there a way to vectorize this? Thanks, Dylan Miracle University of Minnesota GCD Department __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] glmer or not - glmer model specification
Hello, I'm having some trouble figuring out the correct model specification for my data. The system consists of multiple populations of an organism, which have been genetically sampled for several years. The problem is this: A minority of individuals are found in more than one sample, either they have survived into the next sampling at the same location, or have migrated to another another location and survived into the next sampling there. This pseudoreplication constitutes about 12% of all observations. I have information on the number of heterozygous locus per individual, its location, year, population size and number of immigrants at that location, and want to investigate the effect of population size and migration on heterozygosity through time. Heterozygosity, the response variable is a proportion, and I would like to account for the pseudoreplication. Their heterozygosity score will be the same for each sampling, and is constant. I have been trying with a mixed models approach, more specifically glmer (lme4 package) with a binomial distribution family and the response variable as an odds, eg. glmer(heterozygosity~1+population size+number of immigrants+year+(1| individual id)+(0+year|individual id)+(1+year|location),family=binomial) To my very basic understanding, this poses the following question: Is there a an effect of population size, number of immigrants and year, while taking into account a non-correlated random effect of time and individual, and a correlated random effect of year and location? Does anyone have any advice as to wether a mixed model is the best approach, and if so, what model specification to use? -- - Vennlig hilsen/Kind regards, Atle Torvik Kristiansen M.Sc. student Institutt for biologi (DU2-100) Realfagbygget NTNU 7491 Trondheim Norway Email: atletor...@gmail.com atlet...@stud.ntnu.no __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with apply
Harold - The first way that comes to mind is doit = function(i,j)exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/ dnorm(qq$nodes[j]) * qq$weights[j] t(outer(1:3,1:2,Vectorize(doit))) but you said you wanted to use apply. That leads to doit1 = function(tmpi,nod,wt) exp(sum(log((exp(tmpi*(nod-b))) / (factorial(tmpi) * exp(exp(nod-b)) * ((1/(s*sqrt(2*pi))) * exp(-((nod-0)^2/(2*s^2/dnorm(nod) * qq$weights[j] apply(tmp,1,function(x)mapply(function(n,w)doit1(x,n,w),qq$nodes,qq$weights)) Both seem to agree with your rr1. - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Mon, 4 Oct 2010, Doran, Harold wrote: Sorry about that; s - 1 -Original Message- From: Gabriela Cendoya [mailto:gabrielacendoya.rl...@gmail.com] Sent: Monday, October 04, 2010 1:44 PM To: Doran, Harold Cc: R-help Subject: Re: [R] Help with apply You are missing s in your definitions so I can't reproduce your code. tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) str(tmp) 'data.frame': 3 obs. of 3 variables: $ var1: int 9 3 9 $ var2: int 4 6 2 $ var3: int 2 9 3 #I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ + for(i in 1:L){ + rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * + exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] +} +} Error: objeto 's' no encontrado rr1 [,1] [,2] [,3] [1,]000 [2,]000 Gabriela 2010/10/4, Doran, Harold hdo...@air.org: Suppose I have the following data: tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ for(i in 1:L){ rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] } } rr1 But, I think this is so inefficient for large Q and nrow(tmp). The function I am looping over is: fn - function(x, nodes, weights, s, b) { exp(sum(log((exp(x*(nodes-b))) / (factorial(x) * exp(exp(nodes-b)) * ((1/(s*sqrt(2*pi))) * exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights } I've tried using apply in a few ways, but I can't replicate rr1 from the double loop. I can go through each value of Q one step at a time and get matching values in the rr1 table, but this would still require a loop and that I think can be avoided. apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b = b) Does anyone see an efficient way to compute rr1 without a loop? Harold sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] minqa_1.1.9 Rcpp_0.8.6 MiscPsycho_1.6 statmod_1.4.6 lattice_0.17-26 gdata_2.8.0 loaded via a namespace (and not attached): [1] grid_2.10.1 gtools_2.6.2 tools_2.10.1 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- _ Lic. María Gabriela Cendoya Magíster en Biometría Profesor Adjunto Facultad de Ciencias Agrarias UNMdP - Argentina
Re: [R] Help with apply
A, I see that you wrapped apply around mapply. I was toying with both; but didn't think to use mapply inside apply. As always, thank you, Phil -Original Message- From: Phil Spector [mailto:spec...@stat.berkeley.edu] Sent: Monday, October 04, 2010 2:20 PM To: Doran, Harold Cc: Gabriela Cendoya; R-help Subject: Re: [R] Help with apply Harold - The first way that comes to mind is doit = function(i,j)exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/ dnorm(qq$nodes[j]) * qq$weights[j] t(outer(1:3,1:2,Vectorize(doit))) but you said you wanted to use apply. That leads to doit1 = function(tmpi,nod,wt) exp(sum(log((exp(tmpi*(nod-b))) / (factorial(tmpi) * exp(exp(nod-b)) * ((1/(s*sqrt(2*pi))) * exp(-((nod-0)^2/(2*s^2/dnorm(nod) * qq$weights[j] apply(tmp,1,function(x)mapply(function(n,w)doit1(x,n,w),qq$nodes,qq$weights)) Both seem to agree with your rr1. - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Mon, 4 Oct 2010, Doran, Harold wrote: Sorry about that; s - 1 -Original Message- From: Gabriela Cendoya [mailto:gabrielacendoya.rl...@gmail.com] Sent: Monday, October 04, 2010 1:44 PM To: Doran, Harold Cc: R-help Subject: Re: [R] Help with apply You are missing s in your definitions so I can't reproduce your code. tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) str(tmp) 'data.frame': 3 obs. of 3 variables: $ var1: int 9 3 9 $ var2: int 4 6 2 $ var3: int 2 9 3 #I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ + for(i in 1:L){ + rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * + exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] +} +} Error: objeto 's' no encontrado rr1 [,1] [,2] [,3] [1,]000 [2,]000 Gabriela 2010/10/4, Doran, Harold hdo...@air.org: Suppose I have the following data: tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ for(i in 1:L){ rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] } } rr1 But, I think this is so inefficient for large Q and nrow(tmp). The function I am looping over is: fn - function(x, nodes, weights, s, b) { exp(sum(log((exp(x*(nodes-b))) / (factorial(x) * exp(exp(nodes-b)) * ((1/(s*sqrt(2*pi))) * exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights } I've tried using apply in a few ways, but I can't replicate rr1 from the double loop. I can go through each value of Q one step at a time and get matching values in the rr1 table, but this would still require a loop and that I think can be avoided. apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b = b) Does anyone see an efficient way to compute rr1 without a loop? Harold sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] minqa_1.1.9 Rcpp_0.8.6 MiscPsycho_1.6 statmod_1.4.6 lattice_0.17-26 gdata_2.8.0 loaded via a namespace (and not attached): [1] grid_2.10.1 gtools_2.6.2 tools_2.10.1 [[alternative HTML version deleted]]
[R] Null values from DBI connection
I am connecting to an Oracle database with RJDBC, and creating a data frame with dbGetQuery. When I get values return, values which are NULL in the database are being returned as 0 from my query. How can I have them returned as NA? I am query like the following. -- require(RJDBC) drv - JDBC(oracle.jdbc.driver.OracleDriver,/usr/local/ojdbc/ojdbc14.jar,') conn - dbConnect(drv, dbc:oracle:thin:@ADDRESS,user,pass) results - dbGetQuery(select rownum, col1 from table where col1 is null) -- results ROWNUM COL1 1 10 2 20 3 30 4 40 5 50 I'd rather see: results ROWNUM COL1 1 1NA 2 2NA 3 3NA 4 4NA 5 5NA -- -albert __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Roxygen not truncating files
I'm trying to use Roxygen for the first time, but I'm running into trouble. I call it like so: roxygenize('~/p4/r-packages/IREval', '~/p4/r-packages/IREval', unlink.target=T) But it seems that it's only appending to files in ~/p4/r-packages/IREval/man/, not overwriting them. So for example, for a function that I haven't documented yet I get a long file with just the following 4 lines over and over, as many times as I've called roxygenize(): % head -n12 man/adddots.pr.Rd \name{adddots.pr} \alias{adddots.pr} \title{adddots.pr} \usage{adddots.pr(pr)} \name{adddots.pr} \alias{adddots.pr} \title{adddots.pr} \usage{adddots.pr(pr)} \name{adddots.pr} \alias{adddots.pr} \title{adddots.pr} \usage{adddots.pr(pr)} ... I also get the same behavior for the DESCRIPTION file. Is this a known gotcha that someone's found a workaround for, maybe? -- Ken Williams Sr. Research Scientist Thomson Reuters Phone: 651-848-7712 ken.willi...@thomsonreuters.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] input matrix for leaps algorithm
The regsubsets function in the leaps package can work with fewer points than variables. Though the meaning will be questionable. The lasso (lasso2 or lars packages) may be more informative for your situation. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of CZ Sent: Monday, October 04, 2010 8:43 AM To: r-help@r-project.org Subject: [R] input matrix for leaps algorithm Hello, I was trying to use the leaps algorithm for my variable selection. I tried different matrices calculated from multiple linear regression, linear discriminant analysis, and the basic correlation matrix as inputs to the leaps algorithm. But I always got the error message - the matrix is not positive definite. I have more variables than the number of observations/samples. I know this may be a problem. But the multiple linear regression in SAS seems to be able to handle the case when there are more variables than the number of observations. I wonder if there is a way I can still do my variable selection instead of reducing the number of variables. Or there may be something wrong with my function calls. Thank you. -- View this message in context: http://r.789695.n4.nabble.com/input- matrix-for-leaps-algorithm-tp2954453p2954453.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reading Strings in R
Use scan whith what='' or read.table with colClasses='character'. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Lorenzo Isella Sent: Monday, October 04, 2010 5:05 AM To: r-help Subject: [R] Reading Strings in R Dear All, I know this must be a one-liner, but I am experiencing problems. I need to read lists of long integers stored on a text files, i.e. which reads like 7359700484475972 7359700484475972 0 7359700484475972 0 0 0 97189954101318722 0 0 7811896690636354 7359700490636354 0 0 0 7811896684475972 7811896684475972 7811896690636354 8447597298304052 7359700484475972 0 7359700498304052 7359700498304052 7359700498304052 0 7359700498304052 7359700484475972 7359700484475972 Now, for me every integer is just a symbol, i.e. the ID of some object and I simply need to find out how many times and where it occurs in this file. Bottom line: I would like to read this file directly as a string to ensure I will not have any trouble of integer overflow or rounding error. I tried readLines, but I cannot get rid of \n. Any help is appreciated. Cheers Lorenzo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Inserting a plot into another
Hi Greg. Thank you for the time you took to look at my problem. I'll try your solution until CRAN is updated. With regards, Phil -- View this message in context: http://r.789695.n4.nabble.com/Inserting-a-plot-into-another-tp2720936p2954769.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vectorizing problem
On Mon, Oct 4, 2010 at 1:54 PM, Dylan Miracle dylan.mira...@gmail.com wrote: Hello, I have a two column dataframe that has entries that look like this: 2315100 NR_024005,NR_024004,AK093685 2315106 DQ786314 and I want to change this to look like this: 2315100 NR_024005 2315100 NR_024004 2315100 AK093685 2315106 DQ786314 I can do this with the following for loop but the dataframe (GPL) has ~140,000 rows and this takes about 15 minutes: Try this assuming that the columns of GPL are character. You may need to use as.character first if they are factor: library(reshape2) V2 - strsplit(GPL$V2, ,) names(V2) - GPL$V1 melt(V2) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] title and axis labels placement with layout
Hello, I'm writing a set of functions which we will use with a bunch of different datasets. The functions create a set of graphics pages, with each page split into an upper panel with plots and a lower panel with tables. The plots part might have one plot, or might have multiple. I'm using layout() to split the region into a top part for plots and a bottom part for 2 tables side-by-side. For the tables below, I'm using textplot() from the gplots package. I'm trying to get it so that the layout of the full page looks exactly the same regardless of the plot. My working test case creates a boxplot with a few groups on one page, and 6 boxplots each with 3 groups on another page. R automatically scales the cex parameter such that, for example, the points are smaller on the more complicated plots. I can live with different point sizes. But, I want the tables to have the same size below. After much trial and error, I determined that I had to set cex OUTSIDE of textplot in order to get the sizes of the tables to be the same across pages. But, what I can't get to work are 2 things: (1) the ylabel placement (horizontally) for the top plot depends on the graphics in a way that I can't figure out. (I've tried setting cex,lheight and many combinations of padj, line using both the plot function itself and title, mtext functions after the plot call). (2) I can't figure out how to put a title on the tables such that they end up in repeatable position. Again, I've tried all the various options I tried in (1) here again. One solution to (1) would be if I could place the label using a measurement rather than a line number. (analogous to using par(mai) instead of par(mar)) One solution to (2) would be that textplot include a main option. thanks, Daryl Morris SCHARP, FHCRC, UW Biostatistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Generating weekdays only
Dear all, can anyone please tell me how to generate a sequence of days continuously, however without considering weekends i.e. Saturday and Sunday? I am aware of following code: seq(as.Date(2010-01-01), as.Date(2010-02-01), by=1 day) [1] 2010-01-01 2010-01-02 2010-01-03 2010-01-04 2010-01-05 2010-01-06 2010-01-07 2010-01-08 [9] 2010-01-09 2010-01-10 2010-01-11 2010-01-12 2010-01-13 2010-01-14 2010-01-15 2010-01-16 [17] 2010-01-17 2010-01-18 2010-01-19 2010-01-20 2010-01-21 2010-01-22 2010-01-23 2010-01-24 [25] 2010-01-25 2010-01-26 2010-01-27 2010-01-28 2010-01-29 2010-01-30 2010-01-31 2010-02-01 However this generates all days. Thanks for your time. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Include externally generated pdf in output (without Sweave)
Importing as pixel image then replotting will probably reduce the quality of the graph. Another approach is to use the pdftk program (I have seen versions for unix and windows) to add/insert the original pdf page into your document. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Dieter Menne Sent: Sunday, October 03, 2010 7:39 AM To: r-help@r-project.org Subject: Re: [R] Include externally generated pdf in output (without Sweave) Tal Galili wrote: You could potentially read an image file (like, for example, tiff) using something like read.picture {SoPhy} Thanks to you and Baptiste Auguie. Looks like the best way would be to import the picture as a pixel graphics with read.picture. It might be possible to read ps with grImport (Murells/suggested by Baptiste), but I always found the postscript-detour messy. Dieter -- View this message in context: http://r.789695.n4.nabble.com/Include- externally-generated-pdf-in-output-without-Sweave- tp2953057p2953182.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] title and axis labels placement with layout
You might want to look at the addtable2plot function in the plotrix package as an alternative for your tables. For placing titles/labels/etc. look at the grconvertX and grconvertY functions for ways to get coordinates based on the device converted to other coordinate systems. You can use those with the text function (or others) to place a label at a fixed position on the device. Another option is to include an empty plot area in the layout to place the titles/labels into. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Daryl Morris Sent: Monday, October 04, 2010 12:34 PM To: r-help@r-project.org Subject: [R] title and axis labels placement with layout Hello, I'm writing a set of functions which we will use with a bunch of different datasets. The functions create a set of graphics pages, with each page split into an upper panel with plots and a lower panel with tables. The plots part might have one plot, or might have multiple. I'm using layout() to split the region into a top part for plots and a bottom part for 2 tables side-by-side. For the tables below, I'm using textplot() from the gplots package. I'm trying to get it so that the layout of the full page looks exactly the same regardless of the plot. My working test case creates a boxplot with a few groups on one page, and 6 boxplots each with 3 groups on another page. R automatically scales the cex parameter such that, for example, the points are smaller on the more complicated plots. I can live with different point sizes. But, I want the tables to have the same size below. After much trial and error, I determined that I had to set cex OUTSIDE of textplot in order to get the sizes of the tables to be the same across pages. But, what I can't get to work are 2 things: (1) the ylabel placement (horizontally) for the top plot depends on the graphics in a way that I can't figure out. (I've tried setting cex,lheight and many combinations of padj, line using both the plot function itself and title, mtext functions after the plot call). (2) I can't figure out how to put a title on the tables such that they end up in repeatable position. Again, I've tried all the various options I tried in (1) here again. One solution to (1) would be if I could place the label using a measurement rather than a line number. (analogous to using par(mai) instead of par(mar)) One solution to (2) would be that textplot include a main option. thanks, Daryl Morris SCHARP, FHCRC, UW Biostatistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generating weekdays only
On Oct 4, 2010, at 2:07 PM, Ron Michael wrote: Dear all, can anyone please tell me how to generate a sequence of days continuously, however without considering weekends i.e. Saturday and Sunday? I am aware of following code: seq(as.Date(2010-01-01), as.Date(2010-02-01), by=1 day) [1] 2010-01-01 2010-01-02 2010-01-03 2010-01-04 2010-01-05 2010-01-06 2010-01-07 2010-01-08 [9] 2010-01-09 2010-01-10 2010-01-11 2010-01-12 2010-01-13 2010-01-14 2010-01-15 2010-01-16 [17] 2010-01-17 2010-01-18 2010-01-19 2010-01-20 2010-01-21 2010-01-22 2010-01-23 2010-01-24 [25] 2010-01-25 2010-01-26 2010-01-27 2010-01-28 2010-01-29 2010-01-30 2010-01-31 2010-02-01 ??weekdays ?weekdays # generates a character vector that can be tested against non Sat/Sun values (as.Date(2010-01-01)+0:31)[ !weekdays((as.Date(2010-01-01)+0:31)) %in% c(Saturday , Sunday) ] However this generates all days. Thanks for your time. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generating weekdays only
Here is one way: mydays - seq(as.Date(2010-01-01), as.Date(2010-02-01), by=1 day) myweekdays - mydays[ ! weekdays(mydays) %in% c('Saturday','Sunday') ] myweekdays [1] 2010-01-01 2010-01-04 2010-01-05 2010-01-06 2010-01-07 [6] 2010-01-08 2010-01-11 2010-01-12 2010-01-13 2010-01-14 [11] 2010-01-15 2010-01-18 2010-01-19 2010-01-20 2010-01-21 [16] 2010-01-22 2010-01-25 2010-01-26 2010-01-27 2010-01-28 [21] 2010-01-29 2010-02-01 This generates the sequence of days, then throws out the weekends (you could also list the weekdays instead of the weekend days and drop the !, but I am lazy). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Ron Michael Sent: Monday, October 04, 2010 12:08 PM To: r-help@r-project.org Subject: [R] Generating weekdays only Dear all, can anyone please tell me how to generate a sequence of days continuously, however without considering weekends i.e. Saturday and Sunday? I am aware of following code: seq(as.Date(2010-01-01), as.Date(2010-02-01), by=1 day) [1] 2010-01-01 2010-01-02 2010-01-03 2010-01-04 2010-01-05 2010-01-06 2010-01-07 2010-01-08 [9] 2010-01-09 2010-01-10 2010-01-11 2010-01-12 2010-01-13 2010-01-14 2010-01-15 2010-01-16 [17] 2010-01-17 2010-01-18 2010-01-19 2010-01-20 2010-01-21 2010-01-22 2010-01-23 2010-01-24 [25] 2010-01-25 2010-01-26 2010-01-27 2010-01-28 2010-01-29 2010-01-30 2010-01-31 2010-02-01 However this generates all days. Thanks for your time. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with apply
Hi, It seemed to me like the calculations were implemented in a bit of a redundant way, so I tried to simplify it algebraically. doit1 is Phil's version of your original calculations, and doit2 is my simplified way. doit1 - function(i, j) { exp(sum(log((exp(tmp[i,] * (qq$nodes[j]-b)))/ (factorial(tmp[i,])*exp(exp(qq$nodes[j]-b))* ((1/(s*sqrt(2*pi)))*exp(-((qq$nodes[j]-0)^2/(2*s^2/ dnorm(qq$nodes[j])*qq$weights[j] } doit2 - function(i, j) { (prod(exp((tmp[i,]*(qq$nodes[j]-b))-exp(qq$nodes[j]-b))/ factorial(tmp[i,]))*qq$weights[j])/ (s*sqrt(2*pi)*dnorm(qq$nodes[j])*exp((qq$nodes[j]-0)^2/(2*s^2))) } system.time(t(outer(1:300,1:2,Vectorize(doit1 system.time(t(outer(1:300,1:2,Vectorize(doit2 New vs. Old Characters (for the calculations): 154 vs. 177 Function calls for calculations (i.e., excluding subseting/dnorm, etc.): 22 vs. 27 Timing difference: Not appreciable on my system, but I spent so long wading through exactly what was happening, I figured why not share it. Moving tmp up to 300 rows, I got: system.time(t(outer(1:300,1:2,Vectorize(doit1 user system elapsed 9.044 0.008 10.135 system.time(t(outer(1:300,1:2,Vectorize(doit2 user system elapsed 8.413 0.008 8.893 At this rate, the difference between a few million rows might save you enough time you can stretch your hands once, and those 20 characters should really free up space on your hard disksigh. Josh On Mon, Oct 4, 2010 at 11:22 AM, Doran, Harold hdo...@air.org wrote: A, I see that you wrapped apply around mapply. I was toying with both; but didn't think to use mapply inside apply. As always, thank you, Phil -Original Message- From: Phil Spector [mailto:spec...@stat.berkeley.edu] Sent: Monday, October 04, 2010 2:20 PM To: Doran, Harold Cc: Gabriela Cendoya; R-help Subject: Re: [R] Help with apply Harold - The first way that comes to mind is doit = function(i,j)exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/ dnorm(qq$nodes[j]) * qq$weights[j] t(outer(1:3,1:2,Vectorize(doit))) but you said you wanted to use apply. That leads to doit1 = function(tmpi,nod,wt) exp(sum(log((exp(tmpi*(nod-b))) / (factorial(tmpi) * exp(exp(nod-b)) * ((1/(s*sqrt(2*pi))) * exp(-((nod-0)^2/(2*s^2/dnorm(nod) * qq$weights[j] apply(tmp,1,function(x)mapply(function(n,w)doit1(x,n,w),qq$nodes,qq$weights)) Both seem to agree with your rr1. - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Mon, 4 Oct 2010, Doran, Harold wrote: Sorry about that; s - 1 -Original Message- From: Gabriela Cendoya [mailto:gabrielacendoya.rl...@gmail.com] Sent: Monday, October 04, 2010 1:44 PM To: Doran, Harold Cc: R-help Subject: Re: [R] Help with apply You are missing s in your definitions so I can't reproduce your code. tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) str(tmp) 'data.frame': 3 obs. of 3 variables: $ var1: int 9 3 9 $ var2: int 4 6 2 $ var3: int 2 9 3 #I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ + for(i in 1:L){ + rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * + exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j] + } + } Error: objeto 's' no encontrado rr1 [,1] [,2] [,3] [1,] 0 0 0 [2,] 0 0 0 Gabriela 2010/10/4, Doran, Harold hdo...@air.org: Suppose I have the following data: tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q - 2 b - runif(3) qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 - matrix(0, nrow = Q, ncol = nrow(tmp)) L - nrow(tmp) for(j in 1:Q){ for(i in 1:L){ rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi))) *
Re: [R] Plot for Binomial GLM
On 2010-10-04 8:21, klsk89 wrote: Hi i would like to use some graphs or tables to explore the data and make some sensible guesses of what to expect to see in a glm model to assess if toxin concentration and sex have a relationship with the kill rate of rats. But i cant seem to work it out as i have two predictor variables~help?Thanks.:) Here's my data. rat.toxic-read.table(file=Rats.csv,header=T,row.names=NULL,sep=,) attach(rat.toxic) names(rat.toxic) [1] Dose Sex Dead Alive rat.toxic Dose Sex Dead Alive 110 F119 210 M020 320 F416 420 M416 530 F911 630 M812 740 F 13 7 840 M 13 7 950 F 18 2 10 50 M 17 3 11 60 F 20 0 12 60 M 16 4 13 10 F317 14 10 M119 15 20 F218 16 20 M218 17 30 F 1010 18 30 M812 19 40 F 14 6 20 40 M 12 8 21 50 F 16 4 22 50 M 13 7 23 60 F 18 2 24 60 M 16 4 glm2-glm(deadalive~Dose*Sex,family=binomial,data=rat.toxic) anova(glm2,test=Chi) Analysis of Deviance Table Model: binomial, link: logit Response: deadalive Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev P(|Chi|) NULL23225.455 Dose 1 202.36622 23.0902e-16 *** Sex 14.32821 18.7620.0375 * Dose:Sex 11.14920 17.6130.2838 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 summary(glm2) Call: glm(formula = deadalive ~ Dose * Sex, family = binomial, data = rat.toxic) Deviance Residuals: Min1QMedian3Q Max -1.82241 -0.85632 0.06675 0.61981 1.47874 Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) -3.479390.46167 -7.537 4.83e-14 *** Dose 0.105970.01286 8.243 2e-16 *** SexM 0.155010.63974 0.2420.809 Dose:SexM -0.018210.01707 -1.0670.286 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 225.455 on 23 degrees of freedom Residual deviance: 17.613 on 20 degrees of freedom AIC: 91.115 Number of Fisher Scoring iterations: 4 This is pretty much the budworm example in MASS (the book). I would produce a plot of Prob(dead) vs dose with separate lines for M/F: dose - seq(10, 60, length=51) ypF - predict(glm2, data.frame(Dose=dose, Sex=F), type=response) ypM - predict(glm2, data.frame(Dose=dose, Sex=M), type=response) plot(c(10,60), c(0,1), type=n) lines(dose, ypF, col=2) lines(dose, ypM, col=4) text(locator(1), F, col=2) text(locator(1), M, col=4) See recent posts by Ben Bolker for confidence bands. -Peter Ehlers __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plot for Binomial GLM
Hi, Dennis was kind of enough to remind me that glm() can take a two column matrix, which is probably what you did with deadalive. He also gave a rather elegant graphing solution using xyplot: xyplot(Alive/20 ~ Dose, data = rat.toxic, groups = Sex, type = c('p', 'a')) Josh On Mon, Oct 4, 2010 at 8:23 AM, Joshua Wiley jwiley.ps...@gmail.com wrote: On Mon, Oct 4, 2010 at 7:21 AM, klsk89 karenkls...@yahoo.com wrote: Hi i would like to use some graphs or tables to explore the data and make some sensible guesses of what to expect to see in a glm model to assess if toxin concentration and sex have a relationship with the kill rate of rats. But i cant seem to work it out as i have two predictor variables~help?Thanks.:) What about xtabs? For instance: xtabs(deadalive ~ Dose + Sex, data = rat.toxic) Regarding graphs, take a look at faceting in ggplot2 (or lattice). You can get something close to the 3 way table but in graphical form that way. I am not sure if this is completely up and running yet, but I know there has been work linking ggobi with R. I have seen a few demonstrations that looked quite promising, and it may work well for you to visualize three variables at once (and interactively). Here is the link: http://www.ggobi.org/rggobi/ Here's my data. rat.toxic-read.table(file=Rats.csv,header=T,row.names=NULL,sep=,) attach(rat.toxic) ^ why attach it? names(rat.toxic) [1] Dose Sex Dead Alive rat.toxic Dose Sex Dead Alive 1 10 F 1 19 2 10 M 0 20 3 20 F 4 16 4 20 M 4 16 5 30 F 9 11 6 30 M 8 12 7 40 F 13 7 8 40 M 13 7 9 50 F 18 2 10 50 M 17 3 11 60 F 20 0 12 60 M 16 4 13 10 F 3 17 14 10 M 1 19 15 20 F 2 18 16 20 M 2 18 17 30 F 10 10 18 30 M 8 12 19 40 F 14 6 20 40 M 12 8 21 50 F 16 4 22 50 M 13 7 23 60 F 18 2 24 60 M 16 4 Please tell me that after this, you converted the counts of dead and alive into a single variable that had a 0 or 1 if dead and the opposite as alive before you used it as the dependent variable in your logistic regression. glm2-glm(deadalive~Dose*Sex,family=binomial,data=rat.toxic) anova(glm2,test=Chi) Analysis of Deviance Table Model: binomial, link: logit Response: deadalive Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev P(|Chi|) NULL 23 225.455 Dose 1 202.366 22 23.090 2e-16 *** Sex 1 4.328 21 18.762 0.0375 * Dose:Sex 1 1.149 20 17.613 0.2838 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 summary(glm2) Call: glm(formula = deadalive ~ Dose * Sex, family = binomial, data = rat.toxic) Deviance Residuals: Min 1Q Median 3Q Max -1.82241 -0.85632 0.06675 0.61981 1.47874 Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) -3.47939 0.46167 -7.537 4.83e-14 *** Dose 0.10597 0.01286 8.243 2e-16 *** SexM 0.15501 0.63974 0.242 0.809 Dose:SexM -0.01821 0.01707 -1.067 0.286 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 225.455 on 23 degrees of freedom Residual deviance: 17.613 on 20 degrees of freedom AIC: 91.115 Number of Fisher Scoring iterations: 4 -- View this message in context: http://r.789695.n4.nabble.com/Plot-for-Binomial-GLM-tp2954406p2954406.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fisher exact test?
Hi Thomas, Thanks for the reply. 1. In the first pick, I draw 'A' genes from N, without replacement. 2. Similarly, in the second pick, I draw 'B' genes from N, without replacement (and 'C' genes from 'N' etc.) 3. Order does not matter - so the two cases you cited are equivalent. I would like to find out the probability that 'n' of the genes are common to all three 'picks'. many thanks! From: Thomas Stewart tgstew...@gmail.com Cc: r-help@r-project.org Sent: Mon, October 4, 2010 10:46:37 AM Subject: Re: [R] Fisher exact test? To find the probability directly, you'll need to clearly state how you are sampling. Are you sampling with replacement? (Draw then put back.) Or are you sampling without replacement? (Draw. Draw from the remaining. Draw from the remaining.) Also, of the N genes, do you know how many are of the particular set? How many genes do you draw each try? Does order matter to you? In other words is (A=5, B=0, C=23) equivalent to (A=0,B=5,C=23)? -tgs Hi All, My apologies if this is a totally newbie question. I want to calculate the probability that a particular set of genes is picked repeatedly for 3 samplings. For example, if from a total of 'N' genes, I pick 'A' number of genes in the first pick, 'B' number of genes in the second pick, and 'C' number of genes in the third pick, I would want to calculate p(n) where n = (A intersection B intersection C). Would fisher exact test be the correct method to evaluate the probability? And if so how can I frame the 3x3 (?) contingency table? I can find numerous examples of 2x2 tables, but none for 3x3. Any guidance would be appreciated. many thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Script auto-detecting its own path
I'm not sure this will solve the issue because if I move the script, I would still have to go into the script and edit the /path/to/my/script.r, or do I misunderstand your workaround? I'm looking for something like: file.path.is.here(myscript.r) and which would return something like: [1] c:/user/Desktop/ so that regardless of where the script is, as long as the accompanying scripts are in the same directory, they can be easily sourced with something like: dirX - file.path.is.here(MasterScript.r) source(paste(dirX, AuxillaryFile.r, sep=)) If you use relative paths like so: # master.r source(AuxillaryFile.r) Then source(path/to/master.r, chdir = T) will work. Mastering working directories is a much better idea than coming up with your own workarounds. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] reducing distances between tickmarks
Hello, everybody! I have a code below that creates a data set and then a stacked bar chart based on that data set. No need to look at it - just notice please that my horizontal axis is a date varible (x=my.data$date). I have a question about the last 2 lines of this code: grid(nx=NULL,ny=NULL,col = lightgray, lty = dotted,lwd = par(lwd)) axis(1, las = 2) Could anyone tell me - what are the tick mark labels on my X axis? Why are they not dates? axTicks(side=1) Is there any way to change them to actual dates as in my.data$date? And also - is there a way to force the graph to display all tickmarks on X axis - and not just 4 of them? Thank you! Dimitri my.data-data.frame(date=c(20080301,20080401,20080501,20080601,20080701,20080801,20080901,20081001,20081101,20081201,20090101,20090201,20090301,20090401,20090501,20090601,20090701,20090801,20090901,20091001,20091101,20091201,20100101,20100201,20100301,20100402,20100503), x=c(1.1, 1, 1.6, 1, 2, 1.5, 2.1, 1.3, 1.9, 1.1, 1, 1.6, 1, 2, 1.5, 2.1, 1.3, 1.9, 1.1, 1, 1.6, 1, 2, 1.5, 2.1, 1.3, 1.9), y=c(-4,-3,-6,-5,-7,-5.2,-6,-4,-4.9,-4,-3,-6,-5,-7,-5.2,-6,-4,-4.9,-4,-3,-6,-5,-7,-5.2,-6,-4,-4.9), z=c(-0.2,-0.3,-0.4,-0.1,-0.2,-0.05,-0.2,-0.15,-0.06,-0.2,-0.3,-0.4,-0.1,-0.2,-0.05,-0.2,-0.15,-0.06,-0.06,-0.2,-0.3,-0.4,-0.1,-0.2,-0.05,-0.2,-0.15), a=c(10,13,15,15,16,17,15,16,14,10,13,15,15,16,17,15,16,14,10,13,15,15,16,17,15,16,14)) my.data$date-as.character(my.data$date) my.data$date-as.Date(my.data$date,%Y%m%d) (my.data) str(my.data) positives-which(colSums(my.data[2:ncol(my.data)])0) # which vars have positive column sums? negatives-which(colSums(my.data[2:ncol(my.data)])0) # which vars have negative column sums? y.max-1.1*max(rowSums(my.data[names(positives)])) # the max on the y axis of the chart y.min-1.1*min(rowSums(my.data[names(negatives)])) # the min on the y axis of the chart ylim - c(y.min, y.max) order.positives-rev(rank(positives)) order.of.pos.vars-names(order.positives) order.negatives-rev(rank(negatives)) order.of.neg.vars-names(order.negatives) order-c(order.negatives,order.positives) order.of.vars-names(order) # the order of variables on the chart - from the bottom up ### so, the bottom-most area should be for z, the second from the bottom area- for y (above z) all.colors-c('red','blue','green','orange','yellow','purple') xx - c(my.data$date, rev(my.data$date)) bottom.y.coordinates-rowSums(my.data[names(negatives)]) plot(x=my.data$date, y=bottom.y.coordinates, ylim=ylim, col='white', type='l', xaxt='n', ylab='Title for Y', xlab='Date', main='Chart Title') for(var in order.of.neg.vars){ top.line.coords-bottom.y.coordinates-my.data[[var]] bottom.coords-c(bottom.y.coordinates,rev(top.line.coords)) polygon(xx,bottom.coords,col=all.colors[which(names(my.data) %in% var)]) bottom.y.coordinates-top.line.coords } for(var in order.of.pos.vars){ top.line.coords-bottom.y.coordinates+my.data[[var]] bottom.coords-c(bottom.y.coordinates,rev(top.line.coords)) polygon(xx,bottom.coords,col=all.colors[which(names(my.data) %in% var)]) bottom.y.coordinates-top.line.coords } grid(nx=NULL,ny=NULL,col = lightgray, lty = dotted,lwd = par(lwd)) axis(1, las = 2) -- Dimitri Liakhovitski Ninah Consulting www.ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Debye function
Thank you for your time. I'm looking for another package than the gsl package. But it does not seem to be implemented anywhere else. Christophe Le 4 oct. 2010 à 19:10, Bert Gunter a écrit : Well, not really ... Google on R package Debye . The 5th hit lists the gsl package with the debye() function. Moral: DO make use of standard professional search tools, also. -- Bert On Mon, Oct 4, 2010 at 9:13 AM, Spencer Graves spencer.gra...@structuremonitoring.com wrote: install.packages('sos') # if not already installed library(sos) Debye - ???Debye #found 4 matches Debye # Print method opens table in a web browser Conclusions: 1. There are references to Debye in the CHNOSZ package, but it may not be what you want. 2. If it's otherwise available in R, it's very well concealed. Spencer On 10/4/2010 8:56 AM, Christophe Dutang wrote: Dear list, Is there another package than gsl* where the Debye function (of first order) is implemented? Google only finds the gsl package. Thanks in advance Christophe * page 12 of reference manual of gsl. http://cran.r-project.org/web/packages/gsl/gsl.pdf -- Spencer Graves, PE, PhD President and Chief Operating Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics -- Christophe Dutang Ph.D. student at ISFA, Lyon, France website: http://dutangc.free.fr __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reducing distances between tickmarks
On Oct 4, 2010, at 3:28 PM, Dimitri Liakhovitski wrote: Hello, everybody! I have a code below that creates a data set and then a stacked bar chart based on that data set. No need to look at it - just notice please that my horizontal axis is a date varible (x=my.data$date). I have a question about the last 2 lines of this code: grid(nx=NULL,ny=NULL,col = lightgray, lty = dotted,lwd = par(lwd)) axis(1, las = 2) But you only gave two arguments to axis... the side and las??? Could anyone tell me - what are the tick mark labels on my X axis? Why are they not dates? Because they are the number of days since the origin. Read the help page for Dates. If you want the text value for axis to be formatted, then consult the axis help page for the argument names and use the appropriate call to format: ?Dates ?axis # perhaps ... labels =format(my.data$date, %Y-%m-%d) # but you will also need to decide where you want the tick marks Perhaps: axis(1, labels =format(as.Date(axTicks(1), origin=1970-01-01), %Y-%m-%d), at=axTicks(1), col=red) ?format axTicks(side=1) Is there any way to change them to actual dates as in my.data$date? And also - is there a way to force the graph to display all tickmarks on X axis - and not just 4 of them? Thank you! Dimitri my.data- data.frame(date=c(20080301,20080401,20080501,20080601,20080701,20080801,20080901,20081001,20081101,20081201,20090101,20090201,20090301,20090401,20090501,20090601,20090701,20090801,20090901,20091001,20091101,20091201,20100101,20100201,20100301,20100402,20100503), x=c(1.1, 1, 1.6, 1, 2, 1.5, 2.1, 1.3, 1.9, 1.1, 1, 1.6, 1, 2, 1.5, 2.1, 1.3, 1.9, 1.1, 1, 1.6, 1, 2, 1.5, 2.1, 1.3, 1.9), y = c (-4 ,-3 ,-6 ,-5 ,-7 ,-5.2 ,-6 ,-4 ,-4.9,-4,-3,-6,-5,-7,-5.2,-6,-4,-4.9,-4,-3,-6,-5,-7,-5.2,-6,-4,-4.9), z = c (-0.2 ,-0.3 ,-0.4 ,-0.1 ,-0.2 ,-0.05 ,-0.2 ,-0.15 ,-0.06 ,-0.2 ,-0.3 ,-0.4 ,-0.1 ,-0.2 ,-0.05 ,-0.2,-0.15,-0.06,-0.06,-0.2,-0.3,-0.4,-0.1,-0.2,-0.05,-0.2,-0.15), a = c (10,13,15,15,16,17,15,16,14,10,13,15,15,16,17,15,16,14,10,13,15,15,16,17,15,16,14 )) my.data$date-as.character(my.data$date) my.data$date-as.Date(my.data$date,%Y%m%d) (my.data) str(my.data) positives-which(colSums(my.data[2:ncol(my.data)])0) # which vars have positive column sums? negatives-which(colSums(my.data[2:ncol(my.data)])0) # which vars have negative column sums? y.max-1.1*max(rowSums(my.data[names(positives)])) # the max on the y axis of the chart y.min-1.1*min(rowSums(my.data[names(negatives)])) # the min on the y axis of the chart ylim - c(y.min, y.max) order.positives-rev(rank(positives)) order.of.pos.vars-names(order.positives) order.negatives-rev(rank(negatives)) order.of.neg.vars-names(order.negatives) order-c(order.negatives,order.positives) order.of.vars-names(order) # the order of variables on the chart - from the bottom up ### so, the bottom-most area should be for z, the second from the bottom area- for y (above z) all.colors-c('red','blue','green','orange','yellow','purple') xx - c(my.data$date, rev(my.data$date)) bottom.y.coordinates-rowSums(my.data[names(negatives)]) plot(x=my.data$date, y=bottom.y.coordinates, ylim=ylim, col='white', type='l', xaxt='n', ylab='Title for Y', xlab='Date', main='Chart Title') for(var in order.of.neg.vars){ top.line.coords-bottom.y.coordinates-my.data[[var]] bottom.coords-c(bottom.y.coordinates,rev(top.line.coords)) polygon(xx,bottom.coords,col=all.colors[which(names(my.data) %in% var)]) bottom.y.coordinates-top.line.coords } for(var in order.of.pos.vars){ top.line.coords-bottom.y.coordinates+my.data[[var]] bottom.coords-c(bottom.y.coordinates,rev(top.line.coords)) polygon(xx,bottom.coords,col=all.colors[which(names(my.data) %in% var)]) bottom.y.coordinates-top.line.coords } grid(nx=NULL,ny=NULL,col = lightgray, lty = dotted,lwd = par(lwd)) axis(1, las = 2) -- Dimitri Liakhovitski Ninah Consulting www.ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Reading data into
I have data in the following form: judge poster score poster score poster score a1 89 2 79 392 b 3 45 4 65 and am trying to get it to the following: Poster Judge_A Judge_B Judge_C 1 89 2 79 3 92 45 4 65 Any hints would be appreciated. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Ridge regression and mixed models
Dear R users, An equivalence between linear mixed model formulation and penalized regression models (including the ridge regression and penalized regression splines) has proven to be very useful in many aspects. Examples include the use of the lme() function in the library(nlme) to fit smooth models including the estimation of a smoothing parameter using REML. My question concerns the use of the linear mixed model software to fit a ridge regression with the number of columns in the design matrix X (p) exceeding the number of observations (n). Has anybody in the R community implemented the LME-like approach with estimation of the variance components using REML to find the coefficient estimates (BLUEs) and predictors (BLUPs) in the ridge regression problem in the p n setting? Sample code below summarizes my problem: version$version.string # [1] R version 2.11.1 (2010-05-31) library(nlme) # DATA generation: dim - 200 n - 50 XX - matrix(rnorm(dim*n, 0, 0.1), ncol=dim, nrow=n) beta - matrix(c(rep(1, 40), rep(2,20), rep(0,140)), ncol=1) Y - XX %*% beta + rnorm(n) # MODEL fit: dummyId - factor(rep(1,n)) Z.block - list(dummyId=pdIdent(~-1+XX)) data.fr - data.frame(Y,XX) fit - lme(Y~1, data=data.fr, random=Z.block) # ERROR: Warning message: In lme.formula(Y ~ 1, data = data.fr, random = Z.block) : Fewer observations than random effects in all level 1 groups # Thank you in advance, Jarek Harezlak [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Format of Output of Residuals
Thank you this was worked great. Thank you for the additional information as well. Knowledge is power. Cheers, Michael On Fri, Oct 1, 2010 at 5:34 AM, Dennis Murphy djmu...@gmail.com wrote: Hi: CW.lm - lm(weight ~ Diet, ChickWeight) resid.CW.lm - resid(CW.lm) as.data.frame(resid(CW.lm))[1:10, ] # (nope) [1] -60.645455 -51.645455 -43.645455 -38.645455 -26.645455 -9.645455 [7] 3.354545 22.354545 46.354545 68.354545 # convert residuals to one column matrix, then convert to data frame: as.data.frame(matrix(resid(CW.lm), ncol = 1)) V1 1 -60.6454545 2 -51.6454545 3 -43.6454545 4 -38.6454545 5 -26.6454545 ... HTH, Dennis On Thu, Sep 30, 2010 at 9:09 PM, Michael Just mgj...@gmail.com wrote: An excerpt from dataset ChickWeight: weight Time Chick Diet 1 42 0 1 1 2 51 2 1 1 3 59 4 1 1 I am interested in the residuals of the dataset. Specifically in saving them to another format. I have been creating text files with sink. CW.lm - lm(weight ~ Diet, ChickWeight) resid.CW.lm - resid(CW.lm) But when I call: resid.CW.lm The data appears like this (excerpt) : 1 2 3 4 5 6 -60.6454545 -51.6454545 -43.6454545 -38.6454545 -26.6454545 -9.6454545 How can I get the data to be formatted like below in an output / sink friendly way? 1 -60.6454545 2 -51.6454545 3 -43.6454545 4 -38.6454545 5 -26.6454545 6 -9.6454545 Thank you kindly, Cheers, Mike __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] source package build/installation with subdirectory-lib
hi all - i have a source package i'm writing that i'd like to be able to install with a special library that my R src files rely on. to be more precise, i have a normal R package directory structure (i.e. src/ R/ man/ DESCRIPTION NAMESPACE etc.). i also have another directory here called depPkg, and it has it's own configure file for the canonical './configure make make install' installation. so myPkg looks like so (only showing the relevant pieces): + myPkg |--- DESCRIPTION |--- NAMESPACE |--- R/ (with R files in here) |--- src/ (with my .c files in here) |--- depPkg/ (with all of the package contents (e.g. src, docs, configure script, etc) if i switch into and build depPkg by hand it works just fine, and i set a configure option to locate the .a library file it creates into the depPkg/lib/ directory, like so: bash$ ./configure --prefix=./lib --with-pic --disable-shared this builds depPkglib.a and puts it in the lib directory under depPkg. now when i compile *my* shared library in testing, it also works just fine: bash$ cd src bash$ gcc -std=gnu99 -I/usr/share/R/include -I../depPkg/include -fpic - g -O2 -c mycode.c -o mycode.o bash$ gcc -std=gnu99 -shared -o mycode.so mycode.o -L/usr/lib64/R/lib - lR -L../depPgk/lib -ldepPkg when in R, i can now use dyn.load(mycode.so) and .Call(...) with my C functions perfectly. now i want to build the package so i can install it easily on my colleagues' machines. so i use the normal R build mechanism (R CMD build), but first create a Makevars file that looks like so: PKG_CPPFLAGS = -I../depPkg/include PKG_LIBS = ../depPkg/lib .PHONY: all mylibs all: $(SHLIB) $(SHLIB): mylibs mylibs: (cd ..; gunzip depPgk.tar.gz; tar -xf depPkg.tar; cd depPkg; ./ configure --prefix=./lib --with-pic --disable-shared; make; make install) finally, when i try R CMD INSTALL, i know it sees the Makevars file, because the first build step shows the added include path (from the PKG_CPPFLAGS variable), but it never seems to execute the commands for the mylibs target, thus the compiling step of my code fails. sorry for the long-winded description, but without it i think this would have made no sense (and likely still doesn't). i have also tried moving depPgk INTO the src directory and updating the necessary ../ to ./ in the Makevars, but target mylibs never seems to be built first. anyone ever try something like this and have some pointers on how to debug? thanks much for any help, -murat __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fisher exact test?
On 10/04/2010 09:09 PM, James Nead wrote: Hi Thomas, Thanks for the reply. 1. In the first pick, I draw 'A' genes from N, without replacement. 2. Similarly, in the second pick, I draw 'B' genes from N, without replacement (and 'C' genes from 'N' etc.) 3. Order does not matter - so the two cases you cited are equivalent. I would like to find out the probability that 'n' of the genes are common to all three 'picks'. many thanks! Hmm, a sort of cubic version of Fisher's test? Someone might well have studied this extensively, but it wasn't me However, there are three constraints on the 7 df 2x2x2 table, whereas the usual Fisher case puts two constraints on 3 df, so it's not like one cell determines the rest. With only 'A' and 'B' samples, it's a clear hypergeometric (capture-recapture) situation: Just color the balls that you caught in the first draw. If you draw a third time, coloring the recaptured balls, you'd get a hypergeometric mixture of hypergeometric distributions, which might be workable if the samples are not too large. Alternatively, in the large-sample regime, you can probably work out the mean and variance of 'n'... -- Peter Dalgaard Center for Statistics, Copenhagen Business School Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.