Talbot,

Vectorization is not a panacea.

For n == 100, m == 1000:

> system.time( for( i in 1:n ){ p[ g[[i]] ] <- p[ g[[i]] ] + y[[i]] })
[1]  0  0  0 NA NA
> system.time( p2 <- tapply( unlist(y), unlist(g), sum ))
[1] 0.16 0.00 0.16   NA   NA
> all.equal(p,as.vector(p2))
[1] TRUE
> system.time( p3 <- xtabs( unlist(y) ~ unlist(g) ) )
[1] 0.08 0.00 0.08   NA   NA
> all.equal(p,as.vector(p3))
[1] TRUE
> system.time( p4 <- unlist(y) %*% diag(m)[ unlist(g), ] )
[1] 4.16 0.20 4.36   NA   NA
> all.equal(p,as.vector(p4))
[1] TRUE
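
The transcript above does not show how p, g, and y were built. A minimal 
sketch of one possible setup follows. The seed, the group size k, and the 
use of sample.int() and rnorm() are assumptions for illustration; this is 
not the data behind the timings above.

```r
# Hypothetical test data; not the data behind the timings above.
set.seed(1)
n <- 100
m <- 1000
k <- 50  # assumed number of indices per group

# Each g[[i]] holds k distinct indices into p. Distinct matters: with a
# repeated index inside one g[[i]], the loop's assignment keeps only one
# of the updates, while a grouped sum counts every one of them.
g <- lapply(seq_len(n), function(i) sample.int(m, k))
y <- lapply(g, function(idx) rnorm(length(idx)))

# Reference result from the explicit loop.
p <- rep(0, m)
for (i in seq_len(n)) {
  p[g[[i]]] <- p[g[[i]]] + y[[i]]
}

# Grouped sum. The explicit factor levels keep indices that no g[[i]]
# touches, so p2 always has length m; such cells come back as NA and
# are set to 0 to match the loop.
p2 <- tapply(unlist(y), factor(unlist(g), levels = seq_len(m)), sum)
p2[is.na(p2)] <- 0

all.equal(p, as.vector(p2))
```

The factor with explicit levels is only needed when the g[[i]] might not 
cover all of 1:m, as can happen with random data like this.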


Vectorization has had no victory, Grasshopper.

---

For n == 10000, m == 10, the slowest method above becomes the fastest, and 
the fastest above becomes the slowest. So you need to consider the kind of 
problem (the sizes of n and m) to which you will apply this.
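
To try both regimes yourself, something like the following could be used. 
It is a sketch only: compare_methods and its argument k (indices per group) 
are hypothetical names, and the data are random. Unlike the transcript 
above, unlist() is done once outside the timed expressions, so the numbers 
will not be directly comparable to those shown.

```r
# Hypothetical helper: elapsed time of each approach for given n, m, k.
compare_methods <- function(n, m, k = min(m, 5)) {
  g <- lapply(seq_len(n), function(i) sample.int(m, k))
  y <- lapply(g, function(idx) rnorm(length(idx)))
  gi <- unlist(g)
  yu <- unlist(y)
  gf <- factor(gi, levels = seq_len(m))  # keeps untouched indices

  t_loop <- system.time({
    p <- rep(0, m)
    for (i in seq_len(n)) p[g[[i]]] <- p[g[[i]]] + y[[i]]
  })
  t_tapply <- system.time(p2 <- tapply(yu, gf, sum))
  t_xtabs  <- system.time(p3 <- xtabs(yu ~ gf))
  t_matrix <- system.time(p4 <- yu %*% diag(m)[gi, , drop = FALSE])

  # tapply returns NA for indices that never occur; the loop leaves 0.
  p2[is.na(p2)] <- 0

  # Refuse to report timings unless all four results agree.
  stopifnot(isTRUE(all.equal(p, as.vector(p2))),
            isTRUE(all.equal(p, as.vector(p3))),
            isTRUE(all.equal(p, as.vector(p4))))

  c(loop   = t_loop[["elapsed"]],
    tapply = t_tapply[["elapsed"]],
    xtabs  = t_xtabs[["elapsed"]],
    matrix = t_matrix[["elapsed"]])
}

compare_methods(100, 1000)   # few groups, long p
compare_methods(10000, 10)   # many groups, short p
```

The timings will depend on your machine and R version, so run it at the 
sizes you actually expect rather than trusting anyone's table.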

Read up on profiling if you really 'feel the need for speed'. (Writing R 
Extensions 3.2 Profiling R code for speed.)
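
As a starting point, a minimal profiling sketch with Rprof() and 
summaryRprof() might look like the following. The workload (sizes, seed, 
and the 100 repetitions) is an assumption, chosen only so that the sampling 
profiler has enough running time to record something.

```r
# Hypothetical workload; sizes are assumptions for illustration.
set.seed(1)
n <- 2000
m <- 200
k <- 20
g <- lapply(seq_len(n), function(i) sample.int(m, k))
y <- lapply(g, function(idx) rnorm(length(idx)))

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file, interval = 0.01)  # start sampling the call stack
for (r in 1:100) {
  p2 <- tapply(unlist(y), unlist(g), sum)
}
Rprof(NULL)                        # stop profiling

# Summarise where the time went, by function.
prof <- summaryRprof(prof_file)
head(prof$by.self)

unlink(prof_file)
```

The by.self table lists the functions in which the samples were taken, 
which shows whether the time goes to tapply() itself or to the helpers it 
calls.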

Chuck

p.s. Please read "Writing R Extensions 3.1 Tidying R code" and follow the 
wisdom therein.


On Fri, 2 Feb 2007, Talbot Katz wrote:

> Hi.
>
> You folks are so clever, I thought perhaps you could help me make another
> procedure more efficient.
>
> Right now I have the following setup:
>
> p is a vector of length m
> g is a list of length n, g[[i]] is a vector whose elements are indices of p
> (i.e., integers between 1 and m inclusive); the g[[i]] cover the full set
> 1:m, but they don't have to constitute an exact partition; they can
> overlap.
> y is another list of length n, each y[[i]] is a vector of the same length as
> g[[i]].
>
> Now I build up the vector p as follows:
>
> p=rep(0,m)
> for(i in 1:n){p[g[[i]]]=p[g[[i]]]+y[[i]]}
>
> Can this loop be vectorized?
>
> Thanks!
>
> --  TMK  --
> 212-460-5430  home
> 917-656-5351  cell
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                        (858) 534-2098
                                          Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]               UC San Diego
http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0901
