GitHub user vrilleup opened a pull request:
https://github.com/apache/spark/pull/1378
use specialized axpy in RowMatrix for SVD
After running more tests on large matrices, I found that the BV axpy
(breeze/linalg/Vector.scala, axpy) is slower than the BSV axpy
(breeze/linalg/operators/SparseVectorOps.scala, sv_dv_axpy): 8s vs. 2s for
each multiplication. The BV axpy operates on an iterator, while the BSV axpy
operates directly on the underlying arrays. I think the overhead comes from
creating the iterator (with a zip) and advancing the pointers.
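To illustrate the difference described above, here is a minimal sketch in plain
Scala. It is not the Breeze source: SparseVec, axpyIterator, and axpyArray are
hypothetical names made up for this example, and the sparse vector is assumed to
be stored as parallel index and value arrays. Both functions compute
y += a * x for a sparse x and a dense y; they differ only in how the entries of
x are traversed.

```scala
// Hypothetical stand-in for a sparse vector (not a Breeze type):
// parallel arrays holding the positions and values of the stored entries.
final case class SparseVec(size: Int, indices: Array[Int], values: Array[Double])

object AxpySketch {
  // Generic path: build an iterator of (index, value) pairs with zip and
  // consume it. Each step allocates a tuple and goes through the iterator.
  def axpyIterator(a: Double, x: SparseVec, y: Array[Double]): Unit = {
    val pairs: Iterator[(Int, Double)] = x.indices.iterator.zip(x.values.iterator)
    pairs.foreach { case (i, v) => y(i) += a * v }
  }

  // Specialized path: index directly into the backing arrays with a while
  // loop, so no iterator or tuple is created per entry.
  def axpyArray(a: Double, x: SparseVec, y: Array[Double]): Unit = {
    val idx = x.indices
    val vs = x.values
    var k = 0
    while (k < idx.length) {
      y(idx(k)) += a * vs(k)
      k += 1
    }
  }
}
```

Both versions produce the same result; any speed difference would have to be
measured on the actual workload, as was done for the 8s vs. 2s figures above.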
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vrilleup/spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/1378.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1378
----
commit e1db950e91c7d9526519626aa252cd711307d857
Author: Li Pu <[email protected]>
Date: 2014-06-04T01:05:18Z
SPARK-1782: svd for sparse matrix using ARPACK
copy ARPACK dsaupd/dseupd code from latest breeze
change RowMatrix to use sparse SVD
change tests for sparse SVD
commit 96d2ecb837843651db70d7505ddb73cfc0b0bf9a
Author: Li Pu <[email protected]>
Date: 2014-06-04T06:03:35Z
improve eigenvalue sorting
commit fe983b0e7d62359275a92c2adaae8a635d7dd5d8
Author: Li Pu <[email protected]>
Date: 2014-06-04T07:01:29Z
improve scala style
commit 9c8051594a88b53ce83b39b127a098b31bd89aad
Author: Li Pu <[email protected]>
Date: 2014-06-04T08:25:58Z
use non-sparse implementation when k = n
commit 827411b7a7c7a44ec9cf0a3a3439bba0a47575f7
Author: Li Pu <[email protected]>
Date: 2014-06-04T08:29:12Z
fix EOF new line
commit e7850ed465ceadd6a45132935013292a4845f8df
Author: Li Pu <[email protected]>
Date: 2014-06-04T23:56:26Z
use aggregate and axpy
commit 4c7aec3d1c5203b4825047c66bed718211f9446c
Author: Li Pu <[email protected]>
Date: 2014-06-07T01:33:47Z
improve comments
commit eb15100052aae878552aa437c41e548243a6a29e
Author: Li Pu <[email protected]>
Date: 2014-06-13T06:36:18Z
fix binary compatibility
commit 819824b85acfc8ace9c15e0a9c5ce317604e4f73
Author: Li Pu <[email protected]>
Date: 2014-06-18T02:11:53Z
add flag for dense svd or sparse svd
commit 5543cce3b7eba1bb3c4b5b8b43ca2c0399295044
Author: Li Pu <[email protected]>
Date: 2014-06-23T23:27:27Z
improve svd api
commit 71484263409c03669be825b50714731fa9c46f6c
Author: Li Pu <[email protected]>
Date: 2014-06-26T07:09:48Z
improve RowMatrix multiply
commit c2737714b696d3cfae3b1efd0bde6a8d44a47b95
Author: Li Pu <[email protected]>
Date: 2014-07-07T20:49:29Z
automatically determine SVD compute mode and parameters
commit 62969fa4e06a715025483ed282b29427075bbbf1
Author: Xiangrui Meng <[email protected]>
Date: 2014-07-09T00:54:54Z
use BDV directly in symmetricEigs
change the computation mode to local-svd, local-eigs, and dist-eigs
update tests and docs
commit 861ec48bc74616b47d45ad3b828097a35045050f
Author: Xiangrui Meng <[email protected]>
Date: 2014-07-09T01:09:23Z
simplify axpy
commit a461082d98828501eccfbb59c8813c5fbd2ef826
Author: Xiangrui Meng <[email protected]>
Date: 2014-07-09T01:43:18Z
make superscript show up correctly in doc
commit 4c618e917607b6d760f6192878173198399302c1
Author: Li Pu <[email protected]>
Date: 2014-07-09T07:10:14Z
Merge pull request #1 from mengxr/vrilleup-master
Some updates to SVD impl
commit 7312ec10b1be13a41e46c4b8d164302c8497514a
Author: Li Pu <[email protected]>
Date: 2014-07-09T07:35:20Z
very minor comment fix
commit 5255f2a23ae979dcf809034bba658491ab8fd72a
Author: Li Pu <[email protected]>
Date: 2014-07-10T18:53:06Z
Merge remote-tracking branch 'upstream/master'
commit 6fb01a31ad967b849f5b738f22a64f8616d3177b
Author: Li Pu <[email protected]>
Date: 2014-07-11T23:12:43Z
use specialized axpy in RowMatrix
----