I wonder how you do this (or maybe on what kind of machine you execute it).
I tried it out of curiosity and got:

> df = as.data.frame(lapply(1:300,function(x)sample(200,250000,T)))
> colnames(df) = sample(letters[1:20],300,T)
> system.time(dfmed<-lapply(unique(colnames(df)), function(x)
+ rowMedians(as.matrix(df[,colnames(df) == x]),na.rm=TRUE)))
   user  system elapsed
  5.680   0.952   7.171

and those times are in seconds! The time-consuming part was building the
data.frame, not the calculation.

The only thing I noticed is that my R process claims some 1.4 GB of memory.
That should not be a problem on any recent hardware, but my guess is that
memory might be your problem, especially if you have other memory-hogging
variables like this data frame lying around and you are seeing severe
swapping.

Benno

> Hello Everybody,
>
> The code:
>
> dfmed<-lapply(unique(colnames(df)), function(x)
> rowMedians(as.matrix(df[,colnames(df) == x]),na.rm=TRUE))
>
> takes really long time to execute (in hours). Is there a faster way to do
> this?
>
> Thanks!
>
> On Tue, May 22, 2012 at 3:46 PM, Preeti <pre...@sci.utah.edu> wrote:
>
>> Thanks Henrik! Here is the one-liner that I wrote:
>>
>> dfmed<-lapply(unique(colnames(df)), function(x)
>> rowMedians(as.matrix(df[,colnames(df) == x]),na.rm=TRUE))
>>
>> Thanks again!
>>
>>
>> On Tue, May 22, 2012 at 3:23 PM, Henrik Bengtsson
>> <h...@biostat.ucsf.edu> wrote:
>>
>>> See rowMedians() of the matrixStats package for replacing apply(x,
>>> MARGIN=1, FUN=median). /Henrik
>>>
>>> On Tue, May 22, 2012 at 12:34 PM, Preeti <pre...@sci.utah.edu> wrote:
>>>> Hi,
>>>>
>>>> I have a 250,000 by 300 matrix. I am trying to calculate the median of
>>>> those columns (by row) with column names that are identical. I would like
>>>> this to be efficient since apply(x,1,median) where x is created by choosing
>>>> only those columns with same column name and looping on this is taking a
>>>> really long time. Is there an efficient way to do this?
>>>>
>>>> Thanks!
Benno Pütz
Statistical Genetics
MPI of Psychiatry
Kraepelinstr. 2-10
80804 Munich, Germany
T: ++49-(0)89-306 22 222
F: ++49-(0)89-306 22 601
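
P.S. If memory is indeed the bottleneck, one variation that might help is to keep the data as a matrix from the start and group the column indices by name once, so that no data.frame subsetting or repeated as.matrix() conversion happens inside the loop. The following is only a sketch under assumptions that are not in the thread: the dimensions are made up and much smaller than 250,000 by 300, the names `m`, `groups` and `row_med` are hypothetical, and it falls back to apply() when matrixStats is not installed.

```r
# Sketch only: small, made-up dimensions so that it runs quickly.
set.seed(1)
m <- matrix(sample(200, 1000 * 30, replace = TRUE), nrow = 1000)
colnames(m) <- sample(letters[1:5], 30, replace = TRUE)

# Use matrixStats::rowMedians() when available, else a slower base R fallback.
row_med <- if (requireNamespace("matrixStats", quietly = TRUE)) {
  matrixStats::rowMedians
} else {
  function(x, na.rm = FALSE) apply(x, 1, median, na.rm = na.rm)
}

# Column indices grouped by (duplicated) column name.
groups <- split(seq_len(ncol(m)), colnames(m))

# One vector of row medians per distinct column name.
# drop = FALSE keeps a single-column group as a matrix.
dfmed <- lapply(groups, function(idx)
  row_med(m[, idx, drop = FALSE], na.rm = TRUE))
```

Here dfmed is a list named by column name, with one median per row in each element. Whether this is noticeably faster than the data.frame version will depend on the machine and on how much memory is free.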
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.