Github user debasish83 commented on the pull request:
https://github.com/apache/spark/pull/6213#issuecomment-103925316
For gemv it is not clear how to reuse the scratch space for the result vector; if we can't reuse the result vector across multiple calls to kernel.compute, we won't get much runtime benefit. For the Vector-based IndexedRowMatrix I am considering defining the kernel as the traditional (vector, vector) compute and using level 1 BLAS, as done in this PR. The big runtime benefit will come from the approximate KNN that I will open up next, but we still need brute-force KNN for cross validation.
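
A minimal sketch of what that (vector, vector) kernel could look like, using level 1 BLAS through netlib-java; the VectorKernel and CosineKernel names here are hypothetical, not something in this PR:

```scala
import com.github.fommil.netlib.{BLAS => NetlibBLAS}

// Hypothetical kernel abstraction: compute takes two feature vectors
// and returns a scalar similarity.
trait VectorKernel extends Serializable {
  def compute(x: Array[Double], y: Array[Double]): Double
}

// Cosine similarity via level 1 BLAS: one ddot for the inner product
// and two more for the norms. At level 1 there is no result vector,
// so no scratch space needs to be carried across calls.
class CosineKernel extends VectorKernel {
  @transient private lazy val blas = NetlibBLAS.getInstance()

  override def compute(x: Array[Double], y: Array[Double]): Double = {
    val n = x.length
    val dot = blas.ddot(n, x, 1, y, 1)
    val normX = math.sqrt(blas.ddot(n, x, 1, x, 1))
    val normY = math.sqrt(blas.ddot(n, y, 1, y, 1))
    if (normX == 0.0 || normY == 0.0) 0.0 else dot / (normX * normY)
  }
}
```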
For the (Long, Array[Double]) features from the matrix factorization model (similarUsers and similarProducts) we can use dgemm, specifically for DenseMatrix x DenseMatrix. @mengxr what do you think? That way we can use dgemm whenever the features are dense. Also, the (Long, Array[Double]) data structure can be defined in the recommendation/linalg package and reused by the dense kernel computation. Or perhaps for similarity/KNN computation it is fine to stay in vector space and skip the gemv/gemm optimization?
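
A rough sketch of the dgemm path, assuming features arrive in blocks of (Long, Array[Double]) pairs with a fixed rank; DenseKernel and blockDotProducts are illustrative names, not from this PR:

```scala
import com.github.fommil.netlib.{BLAS => NetlibBLAS}

object DenseKernel {
  private lazy val blas = NetlibBLAS.getInstance()

  // Pack each block's feature vectors as columns of a rank x blockSize
  // column-major array, then a single dgemm produces all pairwise dot
  // products: c(i, j) = left(i) dot right(j).
  def blockDotProducts(
      left: Array[(Long, Array[Double])],
      right: Array[(Long, Array[Double])],
      rank: Int): Array[Double] = {
    val m = left.length
    val n = right.length
    val a = new Array[Double](rank * m)
    val b = new Array[Double](rank * n)
    var i = 0
    while (i < m) {
      System.arraycopy(left(i)._2, 0, a, i * rank, rank)
      i += 1
    }
    var j = 0
    while (j < n) {
      System.arraycopy(right(j)._2, 0, b, j * rank, rank)
      j += 1
    }
    // m x n result, column-major: C = A^T * B
    val c = new Array[Double](m * n)
    blas.dgemm("T", "N", m, n, rank, 1.0, a, rank, b, rank, 0.0, c, m)
    c
  }
}
```

Packing the vectors as columns keeps the arrays column-major as BLAS expects, so one dgemm call replaces a blockSize^2 loop of ddot calls and lets the native library reuse its own workspace.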