Github user debasish83 commented on the pull request:
https://github.com/apache/spark/pull/6213#issuecomment-103925316
For gemv it is not clear how to reuse the scratch space for the result vector; if we can't reuse the result vector across multiple calls to kernel.compute, we won't get much runtime benefit. For the Vector-based IndexedRowMatrix I am considering defining the kernel as the traditional (vector, vector) compute and using level 1 BLAS, as done in this PR. The big runtime benefit will come from the approximate KNN that I will open up next, but we still need brute-force KNN for cross validation.
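
A minimal sketch of what that (vector, vector) kernel could look like, using level 1 BLAS through netlib-java; the VectorKernel and CosineKernel names here are hypothetical, not something in this PR:

```scala
import com.github.fommil.netlib.{BLAS => NetlibBLAS}

// Hypothetical kernel abstraction: compute takes two feature vectors
// and returns a scalar similarity.
trait VectorKernel extends Serializable {
  def compute(x: Array[Double], y: Array[Double]): Double
}

// Cosine similarity via level 1 BLAS: one ddot for the inner product
// and two more for the norms. At level 1 there is no result vector,
// so no scratch space needs to be carried across calls.
class CosineKernel extends VectorKernel {
  @transient private lazy val blas = NetlibBLAS.getInstance()

  override def compute(x: Array[Double], y: Array[Double]): Double = {
    val n = x.length
    val dot = blas.ddot(n, x, 1, y, 1)
    val normX = math.sqrt(blas.ddot(n, x, 1, x, 1))
    val normY = math.sqrt(blas.ddot(n, y, 1, y, 1))
    if (normX == 0.0 || normY == 0.0) 0.0 else dot / (normX * normY)
  }
}
```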
For the (Long, Array[Double]) features from the matrix factorization model (similarUsers and similarProducts) we can use dgemm, specifically for DenseMatrix x DenseMatrix. @mengxr what do you think? That way we can use dgemm whenever the features are dense. Also, the (Long, Array[Double]) data structure can be defined in the recommendation/linalg package and reused by the dense kernel computation. Or perhaps for similarity/KNN computation it is fine to stay in vector space and skip the gemv/gemm optimization?
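
A rough sketch of the dgemm path, assuming features arrive in blocks of (Long, Array[Double]) pairs with a fixed rank; DenseKernel and blockDotProducts are illustrative names, not from this PR:

```scala
import com.github.fommil.netlib.{BLAS => NetlibBLAS}

object DenseKernel {
  private lazy val blas = NetlibBLAS.getInstance()

  // Pack each block's feature vectors as columns of a rank x blockSize
  // column-major array, then a single dgemm produces all pairwise dot
  // products: c(i, j) = left(i) dot right(j).
  def blockDotProducts(
      left: Array[(Long, Array[Double])],
      right: Array[(Long, Array[Double])],
      rank: Int): Array[Double] = {
    val m = left.length
    val n = right.length
    val a = new Array[Double](rank * m)
    val b = new Array[Double](rank * n)
    var i = 0
    while (i < m) {
      System.arraycopy(left(i)._2, 0, a, i * rank, rank)
      i += 1
    }
    var j = 0
    while (j < n) {
      System.arraycopy(right(j)._2, 0, b, j * rank, rank)
      j += 1
    }
    // m x n result, column-major: C = A^T * B
    val c = new Array[Double](m * n)
    blas.dgemm("T", "N", m, n, rank, 1.0, a, rank, b, rank, 0.0, c, m)
    c
  }
}
```

Packing the vectors as columns keeps the arrays column-major as BLAS expects, so one dgemm call replaces a blockSize^2 loop of ddot calls and lets the native library reuse its own workspace.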