Re: [R] Problem to remove loops in a routine

jim holtman Wed, 01 Aug 2007 15:58:22 -0700

First thing to do is to use Rprof to determine where the time is being
spent and then you can pinpoint what section of code is taking the
time.  A quick look says to do all your subsetting at once.  You might
look into using 'split' to create the subsets and then access the
subsets with "[[...]]".  So use Rprof and see if most of the time is
being spent the the data.frame subsetting functions.


On 8/1/07, Sébastien <[EMAIL PROTECTED]> wrote:
> Dear R-users,
>
> I have written the following code to generate some trellis plots. It
> works perfectly fine except that it is quite slow when it is apply to my
> typical datasets (over several thousands of lines). I believe the
> problem comes from the loops I am using to subset my data.frame. I read
> in the archives that the tapply function is often more efficient than a
> loop in R. Unfortunately , it seems that I am not enough familiar with
> the "philosophy" of this function to implement it in my code. Would you
> have some suggestions to speed up the whole thing?
>
> Thanks in advance
>
> Sebastien
>
> PS: the rationale behind these loops is to split the trellis plots on
> different pages, all plots on a page (or a group of pages) having a
> given combination of values for the PLOT, DVID, PER and GRP parameters.
>
> ###############################################
>
> library(lattice)
> rm(list=ls(all=TRUE))
>
> # Generate a dummy dataset with
> # - 20 individuals (ID)
> # - individuals 1 to 10 belong to group (GRP) 1, 11 to 20 belong to group 2
> # - measurements (DV) done at 10 time points (TIME) per individuals on 2
> occassions (OCC)
> # - modelisation of the DV versus TIME relationships with 4 different
> models (MODEL)
> # - predicted values (Y)
> # - the PLOT column serves as a flag to plot together the models (A and
> B) and (C and D)
>
> PLOT<-rep(1:2,each=40,times=20)
> ID<-rep(1:20,each=80)
> OCC<-rep(1:2,each=10,times=80)
> GRP<-as.numeric(rep(gl(2,80),times=10))
> MODEL<-as.vector(rep(gl(4,20,label=c("A","B","C","D")),times=20))
> TIME<-rep.int(1:10,160)
> DV<-OCC*(1:10)*rep(rnorm(20,50,10),each=80)+rep(rnorm(20,10,1),each=80)
> Y<-jitter(DV)
>
> mydata<-data.frame(PLOT,ID,OCC,GRP,MODEL,TIME,DV,Y)
>
> mydata$DVID<-rep.int(1,1600)   #in a real dataset, DVID could have
> typically 2 to 3 levels
>
> #############
> # Plotting routine
> #############
>
> myPath<-"C:/"                    #TO BE MODIFIED
>
> nTrellisCol<-2                                    #number of columns per
> Trellis plot
> nTrellisRow<-3                                    #number of lines per
> Trellis plot
>
> nDVID<-nlevels(factor(mydata$DVID))                #number of
> DVID=observations types
> nidPlot<-nlevels(factor(mydata$PLOT))            #number of items in the
> PLOT column
> nPer<-nlevels(factor(mydata$OCC))                  #number of occassions
> (OCC, PER, etc...)
> nGRP<-nlevels(factor(mydata$GRP))         #number of groups
>
> pdf(file=paste(myPath,"test.pdf",sep=""))
>
> trellis.par.set(par.main.text=list(cex=1))
> trellis.par.set(par.ylab.text=list(font=2))
> trellis.par.set(par.xlab.text=list(font=2))
>
> for (i in 1:nidPlot) {
> #loop on PLOT id
> #    i=1
>    idata<-subset(mydata,mydata$PLOT==i)
>
> for (j in 1:nDVID) {
> #loop on DVID
> #    j=1
>    ijdata<-subset(idata,idata$DVID==j)
>
>        for (k in 1:nPer) {
> #loop on Period
> #    k=1
>
>    ijkdata<-subset(ijdata,ijdata$OCC==k)
>
>         for (l in 1:nGRP) {                           #loop on Group
> # l=1
>
>  subdata<-subset(ijkdata,ijkdata$GRP==l)
>
>  nModel<-nlevels(factor(subdata$MODEL))
> #number of models to be plotted in this loop
>    mylegend<-c("Raw data",levels(factor(subdata$MODEL)))
>    subID<-nlevels(factor(subdata$ID))                       #number of
> ID in the new dataset
>
>  myplot<-xyplot(Y ~ TIME | ID,                               #creates plot
>            data = subdata, type = "l",
>            groups = MODEL,
>            observed = subdata$DV,
>            as.table=TRUE,
>            panel = function(x, y, ..., subscripts, observed) {
>                panel.points(x, pch=3,col=1,observed[subscripts])
>                panel.xyplot(x, y, ...,
> col=2:nlevels(subdata$MODEL),subscripts = subscripts)},
>
>            strip=function (which.panel,...){
>                col<-rep("Black",subID)
>                llines(c(0,1,1,0,0),c(0,0,1,1,0),col.line=1)
>                ltext(rep(0.5,subID),rep(0.5,subID),
>                paste("Subject
> ",levels(factor(subdata$ID))[which.panel],sep=""),cex=trellis.par.get("axis.text")[2])},
>
>            key=list(space="bottom",
>               lines = list(pch = as.integer(c(3,rep("",nModel))), type
> = c("p", gl(1,nModel,label="l")),
>                             col =
> 1:(nModel+1),"cex"=trellis.par.get("axis.text")[2]),
>                     text=list(mylegend,
> "cex"=trellis.par.get("axis.text")[2])),
>
>            xlab="Time (hr)",
>            ylab="Concentration (ng/mL)",
>            layout=c(nTrellisCol,nTrellisRow),
>
>            main=paste(paste(paste("Plot ",i,sep=""),
>                             paste(paste(", DVID ",j,sep=""),
>                                   paste(paste(", Occasion ",k,sep=""),
>                                         paste(", Group
> ",l,sep="")))),sep=""))
>
>
>  trellis.par.set(par.xlab.text=list(cex=trellis.par.get("axis.text")[2]))
>    trellis.par.set(par.ylab.text=list(cex=trellis.par.get("axis.text")[2]))
>
>
> print(myplot,panel.width=list(x=(0.75/nTrellisCol),units="npc"),panel.height=list(x=(0.50/nTrellisRow),units="npc"))
>
>        }}}}
> dev.off()
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem to remove loops in a routine

Reply via email to