GitHub user vrilleup commented on the pull request:
https://github.com/apache/spark/pull/964#issuecomment-45981092
@rezazadeh Thank you for the comments! I added a new method alongside the
original one, as you suggested.
@mengxr I have tested it with real matrices. Here are some results, measured
in wall-clock time (68 executors with 8 GB of memory each):
  23282735 x 38160 matrix, 51700398 non-zeros: 0.2s per matrix-vector multiplication, total ~10s
  63390753 x 49739 matrix, 441386111 non-zeros: 1s per matrix-vector multiplication, total ~50s
  94589483 x 4820 matrix, 1672664571 non-zeros: 0.5s per matrix-vector multiplication, total ~50s
I also compared the singular values and right singular vectors with the
results from svds in Octave. The differences are within numerical error and
can be controlled by the tolerance.
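For illustration only, a comparison of this kind could look roughly like the sketch below. It is not the code used for the tests above. The function names and tolerance values are hypothetical, and it assumes the singular values and vectors have already been exported as plain lists of floats. Singular vectors are only determined up to sign, so the vector check accepts either orientation.

```python
import math

def singular_values_match(ours, reference, rel_tol=1e-9):
    """Return True if both lists agree elementwise within the tolerance.

    rel_tol is an illustrative default, not a value taken from the PR.
    """
    if len(ours) != len(reference):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=rel_tol)
        for a, b in zip(ours, reference)
    )

def vectors_match_up_to_sign(u, v, tol=1e-9):
    """Compare two singular vectors, allowing for a global sign flip."""
    if len(u) != len(v):
        return False
    # Largest componentwise difference when v is taken as-is ...
    same = max(abs(a - b) for a, b in zip(u, v))
    # ... and when v is negated.
    flipped = max(abs(a + b) for a, b in zip(u, v))
    return min(same, flipped) <= tol
```

A looser solver tolerance would simply call for a larger rel_tol or tol in checks like these.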
Can you also assign the JIRA ticket
(https://issues.apache.org/jira/browse/SPARK-1782) to me and update the status?
Thank you!