[ https://issues.apache.org/jira/browse/SYSTEMML-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

LI Guobao resolved SYSTEMML-2418.
---------------------------------
       Resolution: Fixed
    Fix Version/s: SystemML 1.2

> Spark data partitioner
> ----------------------
>
>                 Key: SYSTEMML-2418
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2418
>             Project: SystemML
>          Issue Type: Sub-task
>            Reporter: LI Guobao
>            Assignee: LI Guobao
>            Priority: Major
>             Fix For: SystemML 1.2
>
>
> In the context of machine learning, it would be more efficient to perform
> data partitioning in a distributed manner. This task aims to do the data
> partitioning on Spark: the data is first split among the workers, each worker
> then partitions its share locally according to the chosen scheme, and the
> partitioned data, which stays on each worker, can be passed directly to
> model training without being materialized on HDFS.
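
The flow described above can be illustrated with a minimal, single-process
sketch. It is not the SystemML or Spark implementation: plain Python lists
stand in for Spark workers, and every name in it (split_among_workers,
keep_order, reshuffle, train_on_partition, run) is hypothetical and chosen
only for illustration. The two schemes shown are likewise assumptions, since
the issue does not say which schemes are supported.

```python
import random


def split_among_workers(rows, num_workers):
    """Stage 1: cut the dataset into contiguous shards, one per worker."""
    size = -(-len(rows) // num_workers)  # ceiling division
    return [rows[i * size:(i + 1) * size] for i in range(num_workers)]


def keep_order(shard, worker_id):
    """Hypothetical scheme: keep the shard's rows in their original order."""
    return list(shard)


def reshuffle(shard, worker_id):
    """Hypothetical scheme: shuffle the shard locally, seeded per worker."""
    shuffled = list(shard)
    random.Random(worker_id).shuffle(shuffled)
    return shuffled


def train_on_partition(partition):
    """Placeholder for model training: returns the mean of the labels."""
    return sum(label for _, label in partition) / len(partition)


def run(rows, num_workers, scheme):
    """Split, partition locally, then train on the in-memory partitions."""
    results = []
    for worker_id, shard in enumerate(split_among_workers(rows, num_workers)):
        # Stage 2: the scheme runs on the worker that already holds the shard.
        partition = scheme(shard, worker_id)
        if partition:
            # The partition is consumed directly; nothing is written out.
            results.append(train_on_partition(partition))
    return results


rows = [(i, float(i)) for i in range(8)]  # (row_id, label) pairs
print(run(rows, 2, keep_order))  # [1.5, 5.5]
```

Because the partitioned rows are handed to the training step as in-memory
objects, no intermediate files are involved, which mirrors the point about
avoiding materialization on HDFS.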



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
