If I have data:

dat<-data.frame(a=rnorm(20),b=rnorm(20),c=rnorm(20),d=rnorm(20),site=rep(letters[5:8],each=5))

And want to plot like this:

ctr<-1
for(i in c('a','b','c','d')){
    png(file=paste('/tmp/plot_number_',ctr,'.png',sep=''),height=8.5,
        width=11,units='in',pointsize=9,res=300)
    print(ggplot(dat[,names(dat) %in% c('site',i)],
                 aes(x=factor(site),y=dat[,i]))+
          geom_boxplot()+
          opts(title=paste('plot number',ctr,sep=' ')))
    dev.off()
    ctr<-ctr+1
}
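
One way to avoid maintaining ctr by hand is to derive the number from the loop position with seq_along(). The sketch below uses base R only and shows just the naming part. plot_file() and plot_title() are hypothetical helper names, and tempdir() stands in for the /tmp path used above.

```r
# Hypothetical helpers (not from the original post): build the file name and
# the title from a plot's position, so no hand-incremented counter is needed.
plot_file <- function(k, dir = tempdir()) {
  file.path(dir, sprintf("plot_number_%d.png", k))
}
plot_title <- function(k) {
  paste("plot number", k)
}

vars <- c("a", "b", "c", "d")

# seq_along(vars) yields 1, 2, 3, 4, which plays the role of ctr.
for (k in seq_along(vars)) {
  cat(vars[k], "->", basename(plot_file(k)), "/", plot_title(k), "\n")
}
```

Because the number depends only on the position of the variable, the same helpers should give the same names whether the iterations run in order, out of order, or in separate processes.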

Is there a way to do the same naming using plyr (or data.table or foreach
which I am not familiar with at all!)?

m.dat<-melt(dat,id.vars='site')
ddply(m.dat,.(variable),function(df)
print(ggplot(df,aes(x=factor(site),y=value))+geom_boxplot()+ ..?)

And better yet, is there a way to do it using .parallel=T?
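
A possible plyr version, as a sketch under a few assumptions: melt() is taken from reshape2 (the post does not say which package), ggtitle() replaces opts(title=...), which was removed from later ggplot2 releases, and tempdir() stands in for /tmp. The plot number is read from the factor level of variable inside each piece, so no shared counter is involved.

```r
library(ggplot2)
library(plyr)
library(reshape2)  # assumption: melt() comes from reshape2

dat <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20), d = rnorm(20),
                  site = rep(letters[5:8], each = 5))
m.dat <- melt(dat, id.vars = "site")

out_dir <- tempdir()  # stand-in for /tmp

# d_ply() is used instead of ddply() because only the side effect
# (writing a file) is wanted and the return value is discarded.
d_ply(m.dat, .(variable), function(df) {
  # 'variable' is a factor whose levels follow the column order a, b, c, d,
  # so its integer code serves as the plot number.
  ctr <- as.integer(df$variable[1])
  p <- ggplot(df, aes(x = factor(site), y = value)) +
    geom_boxplot() +
    ggtitle(paste("plot number", ctr))
  png(file.path(out_dir, sprintf("plot_number_%d.png", ctr)),
      height = 8.5, width = 11, units = "in", pointsize = 9, res = 300)
  print(p)
  dev.off()
})
```

Since each piece computes its own number and writes to its own file, this should also work with .parallel=TRUE once a foreach backend has been registered; that part is not exercised here.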

Faceting is not really an option (unless I can facet onto multiple pages of
a pdf or something) because these need to go into reports as individually
labelled and titled plots.


As a bit of a corollary, is it really worth the headache to resolve this if
I am only using melt/plyr to split on the four variables a-d? With a
larger data set (1e6 rows), the melt/plyr version takes a significant
amount of time, but .parallel=T cuts the elapsed time roughly in half
(timings below). Is the right answer a foreach loop, and can I do that with
the increasing counter? (I haven't gotten beyond Hadley's .parallel feature
in my parallel R dealings.)
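
A foreach loop can carry the counter as a second iteration variable, so each task receives its own number. The following is a minimal sketch, not a tuned solution. It assumes the doParallel backend with two workers (any registered foreach backend should do), uses ggtitle() and the .data pronoun from current ggplot2 in place of opts() and y=dat[,i], and writes to tempdir() rather than /tmp.

```r
library(foreach)
library(doParallel)  # assumption: doParallel as the foreach backend
library(ggplot2)

registerDoParallel(cores = 2)  # assumption: two workers

dat <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20), d = rnorm(20),
                  site = rep(letters[5:8], each = 5))
vars <- c("a", "b", "c", "d")
out_dir <- tempdir()  # stand-in for /tmp

# v and k advance together: ("a", 1), ("b", 2), ("c", 3), ("d", 4).
files <- foreach(v = vars, k = seq_along(vars),
                 .combine = c, .packages = "ggplot2") %dopar% {
  f <- file.path(out_dir, sprintf("plot_number_%d.png", k))
  p <- ggplot(dat[, c("site", v)], aes(x = factor(site), y = .data[[v]])) +
    geom_boxplot() +
    ggtitle(paste("plot number", k))
  png(f, height = 8.5, width = 11, units = "in", pointsize = 9, res = 300)
  print(p)
  dev.off()
  f  # return the file name so the caller can see what was written
}

stopImplicitCluster()
```

This skips melt() entirely and passes only the two needed columns to each plot, which may matter more at 1e6 rows than it does here.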

> dat<-data.frame(a=rnorm(1e6),b=rnorm(1e6),c=rnorm(1e6),d=rnorm(1e6),
+                 site=rep(letters[5:8],each=2.5e5))
> ctr<-1
> system.time(for(i in c('a','b','c','d')){
+     png(file=paste('/tmp/plot_number_',ctr,'.png',sep=''),height=8.5,
+         width=11,units='in',pointsize=9,res=300)
+     print(ggplot(dat[,names(dat) %in% c('site',i)],
+                  aes(x=factor(site),y=dat[,i]))+
+           geom_boxplot()+
+           opts(title=paste('plot number',ctr,sep=' ')))
+     dev.off()
+     ctr<-ctr+1
+ })
   user  system elapsed
 54.630   0.120  54.843

> system.time(
+ ddply(melt(dat,id.vars='site'),.(variable),function(df) {
+     png(file='/tmp/plyr_plot.png',height=8.5,width=11,units='in',
+         pointsize=9,res=300)
+     print(ggplot(df,aes(x=factor(site),y=value))+geom_boxplot())
+     dev.off()
+     },.parallel=F)
+ )
   user  system elapsed
  58.40    0.13   58.63

> system.time(
+ ddply(melt(dat,id.vars='site'),.(variable),function(df) {
+     png(file='/tmp/plyr_plot.png',height=8.5,width=11,units='in',
+         pointsize=9,res=300)
+     print(ggplot(df,aes(x=factor(site),y=value))+geom_boxplot())
+     dev.off()
+     },.parallel=T)
+ )
   user  system elapsed
  70.33    3.46   27.61
>

How might I speed this up and include the sequential plot names?

Thanks a bunch!

Justin


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
