[
https://issues.apache.org/jira/browse/SPARK-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152636#comment-14152636
]
Xiangrui Meng commented on SPARK-3434:
--------------------------------------
[~shivaram] Could you post the design of the partitioning strategy for block
matrices? I think we should have a 2D partitioner, which consists of a row
partitioner and a column partitioner. A matrix with partitioner (p1, p2) can
multiply a matrix with partitioner (p2, p3), resulting in a matrix with
partitioner (p1, p3).
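
To make the composition rule concrete, here is a minimal sketch in plain
Python. It is not Spark code and not a proposed API: the names
IndexPartitioner, Partitioner2D and multiply_partitioner are hypothetical, and
the modulo assignment of block indices to partitions is only an assumption
chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IndexPartitioner:
    """Hypothetical 1D partitioner over block indices along one dimension."""
    num_blocks: int  # number of blocks along this dimension
    num_parts: int   # number of partitions along this dimension

    def get_partition(self, block_index: int) -> int:
        if not 0 <= block_index < self.num_blocks:
            raise IndexError(f"block index {block_index} out of range")
        # Assumption: simple modulo assignment; any deterministic rule works.
        return block_index % self.num_parts


@dataclass(frozen=True)
class Partitioner2D:
    """Hypothetical 2D partitioner: a (row partitioner, column partitioner) pair."""
    rows: IndexPartitioner
    cols: IndexPartitioner

    def get_partition(self, i: int, j: int) -> Tuple[int, int]:
        # Block (i, j) goes to the grid cell picked by the two 1D partitioners.
        return (self.rows.get_partition(i), self.cols.get_partition(j))


def multiply_partitioner(left: Partitioner2D, right: Partitioner2D) -> Partitioner2D:
    """Partitioner of A * B when A uses (p1, p2) and B uses (p2, p3)."""
    if left.cols != right.rows:
        raise ValueError("column partitioner of A must equal row partitioner of B")
    return Partitioner2D(left.rows, right.cols)


p1 = IndexPartitioner(num_blocks=4, num_parts=2)
p2 = IndexPartitioner(num_blocks=6, num_parts=3)
p3 = IndexPartitioner(num_blocks=8, num_parts=4)

a = Partitioner2D(p1, p2)        # A is partitioned by (p1, p2)
b = Partitioner2D(p2, p3)        # B is partitioned by (p2, p3)
c = multiply_partitioner(a, b)   # A * B is partitioned by (p1, p3)
```

The check on the shared partitioner p2 is the point of the sketch: when the
column partitioner of A matches the row partitioner of B, the blocks that must
be multiplied together are grouped by the same rule on both sides.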
> Distributed block matrix
> ------------------------
>
> Key: SPARK-3434
> URL: https://issues.apache.org/jira/browse/SPARK-3434
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Reporter: Xiangrui Meng
>
> This JIRA is for discussing distributed matrices stored in block
> sub-matrices. The main challenge is the partitioning scheme to allow adding
> linear algebra operations in the future, e.g.:
> 1. matrix multiplication
> 2. matrix factorization (QR, LU, ...)
> Let's discuss the partitioning and storage and how they fit into the above
> use cases.
> Questions:
> 1. Should it be backed by a single RDD that contains all of the sub-matrices
> or by many RDDs, each containing only one sub-matrix?