[ 
https://issues.apache.org/jira/browse/SPARK-5766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318013#comment-14318013
 ] 

Amaru Cuba Gyllensten commented on SPARK-5766:
----------------------------------------------

Yeah, I noticed it when multiplying a 10,000 by 2,000 IndexedRowMatrix with its 
transpose (represented as a local matrix) and doing some reductions on the 
rows. 
Running on my local machine, the multiplication in Spark took about 7 times 
longer than an implementation where the left-hand matrix was chunked and each 
chunk (consisting of ~1000 rows) was multiplied with gemm (or similar).
This might be an unfair comparison, as it more or less requires the rows to be 
stored locally as dense matrices. (A use case which might be covered by the 
upcoming BlockMatrix?) 
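
For illustration only, here is a rough sketch of the chunking idea in plain 
Python. It is not the code that was benchmarked and it does not use Spark. 
The names chunked_multiply, dense_matmul and chunk_size are made up for this 
example, and dense_matmul is only a stand-in for a real BLAS gemm call, so it 
shows the data flow rather than the speedup.

```python
def dense_matmul(a, b):
    """Stand-in for a BLAS gemm call: multiply two dense row-major matrices."""
    cols_b = list(zip(*b))  # columns of b, so each entry is a row-column dot product
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def chunked_multiply(rows, local, chunk_size=1000):
    """Multiply an iterable of rows by a local matrix, one dense chunk at a time."""
    result = []
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == chunk_size:
            # One matrix-matrix product per chunk instead of one dot product per entry.
            result.extend(dense_matmul(chunk, local))
            chunk = []
    if chunk:  # leftover rows that did not fill a whole chunk
        result.extend(dense_matmul(chunk, local))
    return result


# Tiny example: a 5 x 2 matrix times its 2 x 5 transpose, in chunks of 2 rows.
a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
a_t = [list(col) for col in zip(*a)]
product = chunked_multiply(a, a_t, chunk_size=2)
```

In a real implementation each chunk would be packed into a contiguous dense 
array and handed to gemm, which is where the gain would be expected to come 
from.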

> Slow RowMatrix multiplication
> -----------------------------
>
>                 Key: SPARK-5766
>                 URL: https://issues.apache.org/jira/browse/SPARK-5766
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Amaru Cuba Gyllensten
>            Priority: Minor
>              Labels: matrix
>
> Looking at the source code for RowMatrix multiplication by a local matrix, it 
> seems to go through all column vectors of the matrix, computing a pairwise 
> dot product with each column.  
> It seems like this could be sped up by using gemm to perform a full 
> matrix-matrix multiplication on the local data (or gemv, for vector-matrix 
> multiplication), as is done in BlockMatrix or Matrix.



