After talking about it, I forgot to put the drop=TRUE in the 'split' call:

x.index <- split(seq(nrow(dat)), dat[,c("tx","day")], drop=TRUE)
results <- lapply(x.index, function(.indx){
    mn <- mean(dat$k[.indx])
    ......
    data.frame(....)
})

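For anyone who wants to run the idea above end to end, here is a minimal sketch on made-up data. The data frame contents, the single column 'var1', and the summary columns are assumptions for illustration only, not the attached file. It also assumes the column name is held in a variable k, as in the for loop quoted below; in that case dat$k looks for a column literally named "k", so the sketch uses dat[[k]] instead.

```r
# Made-up data for illustration only; the real 'mydata.csv' is not shown.
set.seed(1)
dat <- data.frame(
  tx   = rep(c("A", "B"), each = 6),
  day  = rep(c(1, 2, 3), times = 4),
  var1 = rnorm(12)
)
# Remove one tx/day combination entirely, so that an empty group exists.
dat <- dat[!(dat$tx == "B" & dat$day == 3), ]

k <- "var1"  # column name held in a variable, as in the for loop

# Row indices for each tx/day group; drop=TRUE omits empty combinations.
x.index <- split(seq_len(nrow(dat)), dat[, c("tx", "day")], drop = TRUE)

results <- lapply(x.index, function(.indx) {
  vals <- dat[[k]][.indx]   # dat$k would look for a column named "k"
  mn   <- mean(vals)
  se   <- sd(vals) / sqrt(length(vals))
  data.frame(tx = dat$tx[.indx[1]], day = dat$day[.indx[1]],
             mn = mn, se = se, lowb = mn - se, upb = mn + se)
})
summ <- do.call(rbind, results)
print(summ)
```

Without drop=TRUE, the missing B/day 3 combination would still appear in x.index as a zero-length group, and its mean and sd would come back as NaN and NA.
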
On Tue, Apr 22, 2008 at 1:30 PM, Judith Flores <[EMAIL PROTECTED]> wrote:
> Dear R experts,
>
> I am sorry for sending this email again. I would
> imagine yesterday and maybe today have been very busy
> days with the release of R v 2.7.0. I join all the R
> users who are very grateful for your constant work and
> efforts, especially knowing that you are doing this for
> the sake of science, without getting any compensation
> for that.
> Having written that, I decided to send the email
> below again, in case it was forgotten; or maybe I am
> missing something very basic?
>
> I am using version 2.7.0, in Windows XP.
>
> Start of yesterday's email:
>
> I am trying to optimize my script, because right
> now it requires a lot of memory. The goal is to
> generate four plots in one page. Every plot
> corresponds to the means and sem's calculated for a
> given variable at different days. In order to obtain
> the means and sem's I apply the 'by' function. The way
> I have done it so far is like this:
>
> Read the data
> Generate a summary of the mean and sem of a variable
> at every Day.
> Plot the mean and sem of that variable.
>
> Repeat the same process for the other 3 variables.
>
> I tried to optimize the code by using a for loop,
> the code is below.
>
> #Reading the data
> dato<-read.csv('mydata.csv')
> names(dato)<-c("id","day","tx","var1","var2","var3","var4")
> dato<-dato[,1:7]
>
> #Specify variable to be plotted
> variable<-c('var1','var2','var3','var4')
>
> #Define parameters of window where panel: margins, number of plots in the panel
> windows(height=9, width=9, rescale='fixed')
> par(mfrow=c(2,2),xpd=T, bty='l',
>     omi=c(0.8,0.25,1.2,0.15), mai=c(1.1,0.8,0.3,0.3))
>
> for (k in variable) {
>
> dat<-dato[!is.na(k),]
>
> summ<-by(dat,dat[,c("tx","day")], function(x) {
>     mn<-mean(x$k)
>     std<-sd(x$k)
>     n<-length(x$k)
>     se<-std/sqrt(n)
>     lowb<-mn-se
>     upb<-mn+se
>
>     data.frame(tx=x$tx[1],day=x$day[1],mn=mn,std=std,lowb=lowb,upb=upb,se=se)
> })
> summ<-do.call("rbind",summ)
>
> #Defining x axis range
> xmax<-unique(max(summ$day,na.rm=TRUE))
> xmin<-unique(min(summ$day,na.rm=TRUE))
>
> yaxmin<-unique(min(summ$lowb))
> yaxmax<-unique(max(summ$upb))
>
> plot(1,1,type='n',xlab='Day',xlim=c(xmin,xmax),ylim=c(yaxmin,yaxmax),
>      ylab=k,
>      las=1,cex.lab=1,xaxp=c(xmin,xmax,diff(range(c(xmin,xmax)))))
> points(summ$day,summ$mn)
>
> }
>
> Where variable is a vector that specifies all the
> variables I want to plot.
>
> But I am getting the following error:
>
> "Error in var(as.vector(x), na.rm = na.rm) : 'x' is
> empty
> In addition: Warning message:
> In mean.default(x$k) : argument is not numeric or
> logical: returning NA"
>
> Could someone please show me how to structure my
> code to achieve my final goal, which is to simplify
> it?
>
> I am attaching a csv file in case you want to run my
> code.
>
> Thank you very much in advance for your time and help,
>
> Judith
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?