[
https://issues.apache.org/jira/browse/SYSTEMML-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
LI Guobao resolved SYSTEMML-2418.
---------------------------------
Resolution: Fixed
Fix Version/s: SystemML 1.2
> Spark data partitioner
> ----------------------
>
> Key: SYSTEMML-2418
> URL: https://issues.apache.org/jira/browse/SYSTEMML-2418
> Project: SystemML
> Issue Type: Sub-task
> Reporter: LI Guobao
> Assignee: LI Guobao
> Priority: Major
> Fix For: SystemML 1.2
>
>
> For ML workloads, it would be more efficient to partition the data in a
> distributed manner. This task implements data partitioning on Spark: the
> data is first split among the workers, and each worker then partitions its
> share according to the chosen scheme. The partitioned data stays on each
> worker and can be passed directly to model training without being
> materialized on HDFS.
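
The two steps described above can be illustrated with a minimal sketch. It is
plain Python that simulates the workers with in-memory lists. It is not the
SystemML implementation and does not use the Spark API. All function names are
hypothetical, and the round-robin scheme is only an assumed example of a
partitioning scheme.

```python
from typing import Dict, List, Sequence, Tuple

Row = Tuple[int, Sequence[float]]  # (global row index, feature values)


def split_among_workers(rows: List[Row], num_workers: int) -> List[List[Row]]:
    """Step 1: spread the rows over the workers in contiguous blocks."""
    block = -(-len(rows) // num_workers)  # ceiling division
    return [rows[w * block:(w + 1) * block] for w in range(num_workers)]


def partition_on_worker(local_rows: List[Row], num_partitions: int) -> Dict[int, List[Row]]:
    """Step 2: a worker assigns its own rows to partitions.

    Assumed scheme: round-robin on the global row index.
    """
    parts: Dict[int, List[Row]] = {p: [] for p in range(num_partitions)}
    for idx, values in local_rows:
        parts[idx % num_partitions].append((idx, values))
    return parts


def distributed_partition(rows: List[Row], num_workers: int,
                          num_partitions: int) -> List[Dict[int, List[Row]]]:
    """Run both steps; the result stays in memory, one dict per worker."""
    return [partition_on_worker(chunk, num_partitions)
            for chunk in split_among_workers(rows, num_workers)]


if __name__ == "__main__":
    data = [(i, [float(i)]) for i in range(10)]
    for worker_id, parts in enumerate(distributed_partition(data, 3, 2)):
        print(worker_id, {p: [i for i, _ in r] for p, r in parts.items()})
```

In this sketch the per-worker dictionaries play the role of the partitioned
data that stays on each worker: a training routine could consume them directly,
with no intermediate write to a file system.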