Hi everyone,
Is there an existing implementation for efficiently computing
group-specific trimmed statistics (mean and sd of the trimmed vector),
row by row, on large matrices?
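To pin down what I mean by the sd of the trimmed vector, here is a minimal sketch. trimmedSd is just a name I made up for illustration, and it assumes the same trimming rule that mean(trim=) uses (drop floor(n*trim) values from each end), no NAs, and trim < 0.5:

```r
# Hypothetical helper (not from any package): sd of the trimmed vector.
# Assumes no NAs and trim < 0.5; trims like mean(y, trim = trim) does.
trimmedSd <- function(y, trim = 0.05) {
  n <- length(y)
  if (n == 0L) return(NA_real_)
  lo <- floor(n * trim) + 1   # first order statistic that is kept
  hi <- n + 1 - lo            # last order statistic that is kept
  sd(sort(y)[lo:hi])
}
```

With trim = 0.05 and 20 values, this drops the smallest and the largest value before computing the sd.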
For example:
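set.seed(1)
nc = 300      # columns
nr = 250000   # rows
x = matrix(rnorm(nc*nr), ncol=nc)
# group label (1, 2 or 3) for every entry of x
g = matrix(sample(1:3, nr*nc, replace=TRUE), ncol=nc)
# trimmed mean of y within each group; empty groups give NA
trimmedMeanByGroup <- function(y, grp, trim=.05)
  tapply(y, factor(grp, levels=1:3), mean, trim=trim)
# first 10 rows only
sapply(1:10, function(i) trimmedMeanByGroup(x[i,], g[i,]))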
works fine, but running it over all rows:
> system.time(sapply(1:nr, function(i) trimmedMeanByGroup(x[i,], g[i,])))
   user  system elapsed
399.928   0.019 399.988
takes about 400 seconds, which is slower than I would like.
Does any package already implement the above?
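In case it helps frame the question, this is the kind of vectorized approach I have in mind. It is only a sketch: trimmedMeanByGroupFast is a hypothetical name, not a function from any package. It assumes groups coded 1..ngroups, no NAs, and trim < 0.5, and it tries to reproduce the trimming rule of mean(trim=) by sorting all entries once by (row, group, value). I have not timed it on the full matrix, where it would probably have to be run on chunks of rows to keep memory in check.

```r
# Hypothetical sketch: trimmed means for every (row, group) cell at once.
# Returns an ngroups x nrow(x) matrix, laid out like the sapply() result.
trimmedMeanByGroupFast <- function(x, g, trim = 0.05, ngroups = 3L) {
  nr <- nrow(x)
  ncell <- nr * ngroups
  ri <- as.vector(row(x))                     # row index of every entry
  cell <- (ri - 1L) * ngroups + as.vector(g)  # one id per (row, group)
  vals <- as.vector(x)

  # Sort by cell, then by value within each cell.
  ord <- order(cell, vals)
  cell <- cell[ord]
  vals <- vals[ord]

  n <- tabulate(cell, nbins = ncell)          # cell sizes
  start <- cumsum(n) - n                      # entries before each cell
  pos <- seq_along(cell) - start[cell]        # rank of a value in its cell

  # Same bounds as mean(y, trim = trim): keep order statistics lo..hi.
  lo <- floor(n * trim) + 1
  hi <- n + 1 - lo
  keep <- pos >= lo[cell] & pos <= hi[cell]

  # Sum the kept values per cell; cells appear in increasing order.
  sums <- numeric(ncell)
  idx <- unique(cell[keep])
  sums[idx] <- rowsum(vals[keep], cell[keep], reorder = FALSE)[, 1]

  res <- sums / (hi - lo + 1)
  res[n == 0] <- NA                           # empty groups, as with tapply
  matrix(res, nrow = ngroups)
}
```

On a small slice, the result can be compared with the sapply() version, e.g. all.equal(unname(sapply(1:10, function(i) trimmedMeanByGroup(x[i,], g[i,]))), trimmedMeanByGroupFast(x[1:10,], g[1:10,])).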
Thank you very much,
-b
--
Benilton Carvalho
PhD Candidate
Department of Biostatistics
Bloomberg School of Public Health
Johns Hopkins University