GitHub user MechCoder commented on the pull request:
https://github.com/apache/spark/pull/7222#issuecomment-118479809
The speedup varies with the input: anywhere from about 2x to about 20x,
and roughly 10x in several cases. Timings are averaged over 100 iterations.
Dot products (two sparse vectors)
Length: 50000 n_values: 5000
In master: 0.031453819274902345
In this branch: 0.0016013431549072267
Length: 50000 n_values: 500
In master: 0.00331263542175293
In this branch: 0.0006479525566101074
Length: 500000 n_values: 50000
In master: 0.04630022764205933
In this branch: 0.014638817310333252
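For readers unfamiliar with the operation being timed, here is a minimal sketch of a sparse-sparse dot product. It assumes each vector is stored as a sorted list of indices plus a parallel list of values; the function name and layout are hypothetical and chosen for illustration. This is not the code in master or in this branch, only a reference for what the benchmark computes.

```python
def sparse_dot(idx_a, val_a, idx_b, val_b):
    """Dot product of two sparse vectors.

    Assumption (illustrative layout): idx_* are sorted, duplicate-free
    index lists and val_* are the matching non-zero values.
    """
    i = j = 0
    total = 0.0
    # Walk both index lists in step; only shared indices contribute.
    while i < len(idx_a) and j < len(idx_b):
        if idx_a[i] == idx_b[j]:
            total += val_a[i] * val_b[j]
            i += 1
            j += 1
        elif idx_a[i] < idx_b[j]:
            i += 1
        else:
            j += 1
    return total
```

Only indices present in both vectors contribute, so the cost grows with the number of stored values (n_values) rather than with the nominal vector length.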
squared_distance
Length: 500000 n_values: 50000
In master: 0.158
In this branch: 0.0178
Length: 50000 n_values: 500
In master: 0.0017
In this branch: 0.0007526993751525879
Length: 50000 n_values: 5000
In master: 0.0158
In this branch: 0.001717
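As a rough illustration of the second benchmark, the sketch below computes the squared Euclidean distance between two sparse vectors and averages its run time over a fixed number of iterations. The storage layout (sorted index list plus value list) and the names sparse_squared_distance and average_time are assumptions made for this example; it is not the implementation in master or in this branch, and it will not reproduce the figures above.

```python
import timeit


def sparse_squared_distance(idx_a, val_a, idx_b, val_b):
    """Squared Euclidean distance between two sparse vectors.

    Assumption (illustrative layout): idx_* are sorted, duplicate-free
    index lists and val_* are the matching non-zero values.
    """
    i = j = 0
    total = 0.0
    while i < len(idx_a) and j < len(idx_b):
        if idx_a[i] == idx_b[j]:
            diff = val_a[i] - val_b[j]
            total += diff * diff
            i += 1
            j += 1
        elif idx_a[i] < idx_b[j]:
            # Index only in a: the other vector holds an implicit zero.
            total += val_a[i] * val_a[i]
            i += 1
        else:
            # Index only in b.
            total += val_b[j] * val_b[j]
            j += 1
    # Entries left over in either vector face implicit zeros.
    total += sum(v * v for v in val_a[i:])
    total += sum(v * v for v in val_b[j:])
    return total


def average_time(func, *args, iterations=100):
    """Mean wall-clock time of func(*args) over `iterations` calls."""
    return timeit.timeit(lambda: func(*args), number=iterations) / iterations
```

Calling average_time(sparse_squared_distance, idx_a, val_a, idx_b, val_b) returns the mean time per call over 100 iterations, which mirrors how the timings in this comment are described.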