Re: [R] memory-efficient column aggregation of a sparse matrix

Jon Stearley Thu, 01 Feb 2007 15:50:26 -0800

On Feb 1, 2007, at 6:22 AM, Douglas Bates wrote:

> It turns out that in the sparse matrix code used by the
> Matrix package the triplet representation allows for duplicate index
> positions with the convention that the resulting value at a position
> is the sum of the values of any triplets with that index pair.


Very handy!  I suggest adding this nugget near the "(possibly  
redundant) triplets" phrase in Matrix.pdf.
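
For anyone who wants to see the summing convention in action, here is a
minimal sketch using only the Matrix package. The object names (tm, m) are
made up for illustration; the slots i, j, x and Dim are those of the
dgTMatrix class.

```r
library(Matrix)

# Three triplets with 0-based indices. Position (0, 0) appears twice,
# with values 1 and 2; position (1, 1) appears once, with value 5.
tm <- new("dgTMatrix",
          i = c(0L, 0L, 1L),
          j = c(0L, 0L, 1L),
          x = c(1, 2, 5),
          Dim = c(2L, 2L))

# Converting to an ordinary matrix adds up the duplicated entries,
# so the top-left element becomes 1 + 2.
m <- as.matrix(tm)
print(m)
```

The redundant triplets are stored as given; they are only combined when the
object is converted or used in a computation.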

> If you decide to use this approach please be aware that the indices
> for the triplet representation in the Matrix package are 0-based (as
> in C code) not 1-based (as in R code).  (I imagine that Martin is
> thinking "we really should change that" as he reads this part.)
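
A small sketch of the indexing difference, assuming a Matrix version that
provides sparseMatrix() and the "TsparseMatrix" coercion (object names are
illustrative only):

```r
library(Matrix)

# A single nonzero at R position [2, 3] (1-based row and column).
m <- sparseMatrix(i = 2, j = 3, x = 7, dims = c(3, 4))

# In triplet form the same entry is stored with 0-based indices.
tm <- as(m, "TsparseMatrix")
tm@i   # row index as stored: one less than the R row
tm@j   # column index as stored: one less than the R column

# Add 1 to get back to ordinary R subscripts.
r.row <- tm@i + 1L
r.col <- tm@j + 1L
```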

The Value of the appended function is equivalent to my previous
version, but it runs in a tenth of the time, uses vastly less memory,
and is fewer lines of code to boot!  Sure it's tricky, but it does
the trick.

THANK YOU SO MUCH!

-jon

NEWaggregate.csr <- function(x,fac) {
         # cast into handy Matrix sparse Triplet form
         x.T <- as(as(x, "dgRMatrix"), "dgTMatrix")

         # factor column indexes (compensating for 0 vs 1 indexing)
         x.T@j <- as.integer(as.integer(fac[x.T@j+1])-1)

         # cast back, magically computing factor sums along the way :)
         y <- as(x.T, "matrix.csr")

         # and fix the dimension (doing this on x.T bus errors!)
         y@dimension <- as.integer(c(nrow(y),nlevels(fac)))
         y
}
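
For readers without SparseM installed, the same idea can be sketched with the
Matrix package alone. This is not the function above: aggregate_cols is a
hypothetical name, it assumes x is a general numeric sparse matrix and fac has
one non-missing entry per column of x, and it rebuilds the result with
sparseMatrix() (which adds the values of duplicated index pairs) rather than
editing slots in place.

```r
library(Matrix)

# Hypothetical helper: sum the columns of x within the groups given by fac.
aggregate_cols <- function(x, fac) {
    fac <- as.factor(fac)
    stopifnot(length(fac) == ncol(x))

    # triplet form: slots i and j hold 0-based row and column indices
    x.T <- as(x, "TsparseMatrix")

    # map each stored column index to the (1-based) code of its factor level
    new.j <- as.integer(fac)[x.T@j + 1L]

    # entries that now share an (i, j) pair are summed on construction
    sparseMatrix(i = x.T@i + 1L, j = new.j, x = x.T@x,
                 dims = c(nrow(x), nlevels(fac)))
}

# 2 x 4 example: columns 1, 2 and 4 belong to group "a", column 3 to "b"
x <- sparseMatrix(i = c(1, 1, 2, 2), j = c(1, 2, 3, 4), x = c(1, 2, 3, 4),
                  dims = c(2, 4))
fac <- factor(c("a", "a", "b", "a"))
y <- aggregate_cols(x, fac)
print(y)
```

The result has one column per level of fac, in the order given by
levels(fac).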

______________________________________________
R-help mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
