Talbot,
Vectorization is not a panacea.
For n == 100, m == 1000:
> system.time( for( i in 1:n ){ p[ g[[i]] ] <- p[ g[[i]] ] + y[[i]] })
[1] 0 0 0 NA NA
> system.time( p2 <- tapply( unlist(y), unlist(g), sum ))
[1] 0.16 0.00 0.16 NA NA
> all.equal(p,as.vector(p2))
[1] TRUE
> system.time( p3 <- xtabs( unlist(y) ~ unlist(g) ) )
[1] 0.08 0.00 0.08 NA NA
> all.equal(p,as.vector(p3))
[1] TRUE
> system.time( p4 <- unlist(y) %*% diag(m)[ unlist(g), ] )
[1] 4.16 0.20 4.36 NA NA
> all.equal(p,as.vector(p4))
[1] TRUE
Vectorization has had no victory, Grasshopper.
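
The timings above assume p, g, and y already exist. A minimal sketch of one way to build test data of that shape follows. The group size of 50, the seed, and the uniform draws are assumptions for illustration, not the data behind the numbers above. Indices are sampled without replacement within each g[[i]], because the loop and the vectorized forms only agree when no index repeats inside a single g[[i]].

```r
# Hypothetical test data: sizes follow the n == 100, m == 1000 case above;
# the group size (50) and the random draws are assumptions.
set.seed(1)
n <- 100
m <- 1000
g <- lapply(seq_len(n), function(i) sample.int(m, 50))  # no repeats within a group
y <- lapply(g, function(idx) runif(length(idx)))

# Reference result from the explicit loop.
p <- numeric(m)
for (i in seq_len(n)) {
  p[g[[i]]] <- p[g[[i]]] + y[[i]]
}

# Vectorized check via xtabs(); the explicit factor levels keep indices
# that happen not to be sampled (they get a sum of 0).
gg <- factor(unlist(g), levels = seq_len(m))
p3 <- xtabs(unlist(y) ~ gg)
print(all.equal(p, as.vector(p3)))
```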
---
For n == 10000, m == 10, the slowest method above becomes the fastest, and
the fastest above becomes the slowest. So, you need to consider the
applications to which you will apply this.
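
To see how the ranking shifts with problem shape, both regimes can be timed with the same code. The sketch below compares only the loop and the matrix-product method; make_data() and compare() are hypothetical helper names, and the group sizes (50 and 5) are assumptions. Timings depend on machine and R version, so none are quoted here.

```r
# Hypothetical helper: n groups of k distinct indices drawn from 1:m.
make_data <- function(n, m, k) {
  g <- lapply(seq_len(n), function(i) sample.int(m, k))
  y <- lapply(g, function(idx) runif(length(idx)))
  list(g = g, y = y)
}

# Hypothetical helper: elapsed seconds for the loop and the matrix product.
compare <- function(n, m, k) {
  d <- make_data(n, m, k)
  g <- d$g
  y <- d$y
  t_loop <- system.time({
    p <- numeric(m)
    for (i in seq_len(n)) p[g[[i]]] <- p[g[[i]]] + y[[i]]
  })
  t_mat <- system.time(p4 <- unlist(y) %*% diag(m)[unlist(g), ])
  # Both methods must agree before the timings mean anything.
  stopifnot(isTRUE(all.equal(p, as.vector(p4))))
  c(loop = t_loop[["elapsed"]], matrix = t_mat[["elapsed"]])
}

set.seed(1)
res_wide <- compare(n = 100, m = 1000, k = 50)  # few groups, long p
res_tall <- compare(n = 10000, m = 10, k = 5)   # many groups, short p
print(rbind(wide = res_wide, tall = res_tall))
```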
Read up on profiling if you really 'feel the need for speed' (see "Writing R
Extensions", section 3.2, "Profiling R code for speed").
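
A minimal profiling sketch using Rprof() and summaryRprof() from the utils package. The data sizes, the repeat count, and the sampling interval are assumptions chosen so that the run is long enough to collect samples. Rprof() needs an R build with profiling support.

```r
# Hypothetical data for the n == 10000, m == 10 case; group size 5 is an assumption.
set.seed(1)
n <- 10000
m <- 10
g <- lapply(seq_len(n), function(i) sample.int(m, 5))
y <- lapply(g, function(idx) runif(length(idx)))

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.005)   # start sampling the call stack
for (r in 1:50) {                    # repeat so the profiler collects enough samples
  p <- numeric(m)
  for (i in seq_len(n)) p[g[[i]]] <- p[g[[i]]] + y[[i]]
  p2 <- tapply(unlist(y), unlist(g), sum)
}
Rprof(NULL)                          # stop profiling

prof_summary <- summaryRprof(prof_file)
print(head(prof_summary$by.self))    # functions ranked by time spent in themselves
```

The by.self table shows where the time actually goes, which is a better guide than guessing which form "should" be faster.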
Chuck
p.s. Please read "Writing R Extensions 3.1 Tidying R code" and follow the
wisdom therein.
On Fri, 2 Feb 2007, Talbot Katz wrote:
> Hi.
>
> You folks are so clever, I thought perhaps you could help me make another
> procedure more efficient.
>
> Right now I have the following setup:
>
> p is a vector of length m
> g is a list of length n; g[[i]] is a vector whose elements are indices of p
> (i.e., integers between 1 and m inclusive). The g[[i]] cover the full set
> 1:m, but they don't have to constitute an exact partition; they can
> overlap.
> y is another list of length n, each y[[i]] is a vector of the same length as
> g[[i]].
>
> Now I build up the vector p as follows:
>
> p=rep(0,m)
> for(i in 1:n){p[g[[i]]]=p[g[[i]]]+y[[i]]}
>
> Can this loop be vectorized?
>
> Thanks!
>
> -- TMK --
> 212-460-5430 home
> 917-656-5351 cell
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Charles C. Berry
Dept of Family/Preventive Medicine, UC San Diego
La Jolla, San Diego 92093-0901
(858) 534-2098
E mailto:[EMAIL PROTECTED]
http://biostat.ucsd.edu/~cberry/