[GitHub] spark pull request: [SPARK-14533] [MLLIB] RowMatrix.computeCovaria...

srowen Mon, 11 Apr 2016 11:24:06 -0700

Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/12299#issuecomment-208486030
  
    It matters a fair bit for computing the Gramian, since computing a'a for 
each row speeds up by the inverse square of the fraction of non-zero entries. 
For vectors with 20% non-zero entries, exploiting the sparsity makes it go 
about 25x faster (1 / 0.2^2). It's not a big deal to push down a value to 
subtract from the entries in the sparse case, since BLAS.spr handles the 
sparse case explicitly. It is trickier to handle the dense case efficiently, 
since it already calls NativeBLAS.dspr directly and I don't think we can 
modify the array in place. Copying the array and subtracting the mean may be 
the best that can be done, but I'll look more into what BLAS provides for this.
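
To make the 25x figure concrete, here is a minimal Python sketch (not Spark's code) of a symmetric packed rank-one update, A += alpha * x x', done once over every entry and once over non-zero pairs only. The names spr_dense and spr_sparse, the test row, and the column means are all hypothetical and chosen for illustration. The only thing borrowed from BLAS is the packed upper-triangular, column-major layout that dspr uses with uplo = "U".

```python
def spr_dense(x, packed, alpha=1.0):
    """packed += alpha * x x^T over the full upper triangle; returns the update count."""
    updates = 0
    for j, xj in enumerate(x):
        col_start = j * (j + 1) // 2  # offset of column j in packed upper storage
        for i in range(j + 1):
            packed[col_start + i] += alpha * x[i] * xj
            updates += 1
    return updates


def spr_sparse(indices, values, packed, alpha=1.0):
    """Same update, visiting only pairs of non-zero entries (indices sorted ascending)."""
    updates = 0
    for cj, j in enumerate(indices):
        col_start = j * (j + 1) // 2
        for ci in range(cj + 1):
            packed[col_start + indices[ci]] += alpha * values[ci] * values[cj]
            updates += 1
    return updates


n = 100
indices = list(range(0, n, 5))  # 20 of the 100 entries are non-zero
values = [float(k + 1) for k in range(len(indices))]
dense_row = [0.0] * n
for i, v in zip(indices, values):
    dense_row[i] = v

gram_dense = [0.0] * (n * (n + 1) // 2)
gram_sparse = [0.0] * (n * (n + 1) // 2)
dense_updates = spr_dense(dense_row, gram_dense)
sparse_updates = spr_sparse(indices, values, gram_sparse)
print(dense_updates, sparse_updates, dense_updates / sparse_updates)

# Dense fallback: centre a copy so the caller's row is left untouched.
mean = [0.5] * n  # made-up column means, for illustration only
centered = [v - m for v, m in zip(dense_row, mean)]
gram_centered = [0.0] * (n * (n + 1) // 2)
spr_dense(centered, gram_centered)
```

With n = 100 and 20 non-zero entries the dense loop performs 5050 updates against 210 for the sparse loop, a ratio of about 24 that approaches 25 as n grows. The last lines show the copy-and-subtract fallback for the dense case: it costs one extra array per row but never mutates the input.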


