[
https://issues.apache.org/jira/browse/FLINK-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282684#comment-15282684
]
Simone Robutti commented on FLINK-1873:
---------------------------------------
Hello Till,
I worked full time on this issue this week and I almost have a draft for a PR.
I would like to submit it with the following features:
2 matrix formats:
*row-based distribution*
*block-based distribution*
*conversion from block-based to row-based*
*conversion from row-based to block-based*
Operations on block-based matrices:
*per-block operations on two matrices*
*sum*
*sub*
*multiplication*
Row-based builders:
*from COO*
Row-based collectors:
*local SparseMatrix*
*local DenseMatrix*
*local Seq of COO entries*
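For readers unfamiliar with the two layouts, the list above can be illustrated with a minimal local sketch. It is plain Python, not Flink code, and it is not the implementation in the PR draft: the function names, the dict-based representation, and the `block_size` parameter are all hypothetical and chosen only to show how COO entries, row-based storage, and block-based storage relate to each other.

```python
from collections import defaultdict

# Hypothetical local model: a COO entry is a (row, col, value) triple.

def rows_from_coo(entries):
    """Build a row-based layout {row: {col: value}} from COO triples."""
    rows = defaultdict(dict)
    for r, c, v in entries:
        rows[r][c] = v
    return dict(rows)

def rows_to_blocks(rows, block_size):
    """Re-key entries by block coordinates: {(bi, bj): {(i, j): value}}."""
    blocks = defaultdict(dict)
    for r, cols in rows.items():
        for c, v in cols.items():
            block_id = (r // block_size, c // block_size)
            blocks[block_id][(r % block_size, c % block_size)] = v
    return dict(blocks)

def blocks_to_rows(blocks, block_size):
    """Inverse of rows_to_blocks: restore global row and column indices."""
    rows = defaultdict(dict)
    for (bi, bj), cells in blocks.items():
        for (i, j), v in cells.items():
            rows[bi * block_size + i][bj * block_size + j] = v
    return dict(rows)

def add_blocks(a, b):
    """Per-block sum: blocks sharing the same block coordinates are combined."""
    out = {}
    for block_id in a.keys() | b.keys():
        cells = dict(a.get(block_id, {}))
        for pos, v in b.get(block_id, {}).items():
            cells[pos] = cells.get(pos, 0.0) + v
        out[block_id] = cells
    return out

def multiply_blocks(a, b):
    """Block multiplication: C[i, j] is the sum over k of A[i, k] * B[k, j]."""
    out = defaultdict(lambda: defaultdict(float))
    for (bi, bk), a_cells in a.items():
        for (bk2, bj), b_cells in b.items():
            if bk != bk2:
                continue  # only blocks with a matching inner index contribute
            for (i, k), av in a_cells.items():
                for (k2, j), bv in b_cells.items():
                    if k == k2:
                        out[(bi, bj)][(i, j)] += av * bv
    return {block_id: dict(cells) for block_id, cells in out.items()}
```

In a distributed setting the outer dicts would be replaced by partitioned data sets keyed by row index or block coordinates, and the matching of block coordinates in the sum and the multiplication would become a join on those keys. That mapping is an assumption of this sketch, not a description of the draft.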
There are many basic features that are actually simpler than the ones I have
already implemented, and many others that may have a rather high priority
(SVD?), but before proceeding I would like to receive a review of what is
already done, to stabilize the structures I'm working on. Also, this is my
first open source contribution, so I would like to receive validation on the
technical and stylistic aspects to avoid repeating the same errors in the work
yet to be done.
If you think there are other core features to consider for this first
iteration, please let me know. Otherwise I plan to open a PR next week.
> Distributed matrix implementation
> ---------------------------------
>
> Key: FLINK-1873
> URL: https://issues.apache.org/jira/browse/FLINK-1873
> Project: Flink
> Issue Type: New Feature
> Components: Machine Learning Library
> Reporter: liaoyuxi
> Assignee: Simone Robutti
> Labels: ML
>
> It would help to implement machine learning algorithms more quickly and
> concisely if Flink provided support for storing data and performing
> computation in distributed matrices. The design of the implementation is attached.