[ 
https://issues.apache.org/jira/browse/SYSTEMML-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LI Guobao updated SYSTEMML-2418:
--------------------------------
    Summary: Spark data partitioner  (was: Distributing data to workers)

> Spark data partitioner
> ----------------------
>
>                 Key: SYSTEMML-2418
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2418
>             Project: SystemML
>          Issue Type: Sub-task
>            Reporter: LI Guobao
>            Assignee: LI Guobao
>            Priority: Major
>
> In the context of the parameter server (ps), the training data will be 
> partitioned according to one of several schemes. This partitioning is 
> executed on the driver node, and the partitioned data should be distributed 
> to the workers via broadcast. Because a Spark broadcast is limited to 2 GB, 
> we could leverage the _PartitionedBroadcast_ class to do this conversion. 
> Afterwards, the partitioned broadcast object can be passed to the workers 
> so that each can launch its job.
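
To make the idea in the description concrete, here is a minimal sketch in plain Java. It is not the SystemML _PartitionedBroadcast_ API and does not call Spark. The class name PartitionSketch, the methods partitionRows and chunkRows, the contiguous-row scheme, and the row-count cap are all hypothetical and chosen only for illustration. It shows the two steps: splitting the training rows into one partition per worker, then cutting each partition into bounded pieces so that no single piece would exceed a size limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionSketch {

    // Hypothetical scheme: split the rows into numWorkers contiguous, disjoint
    // partitions. The first (rows % numWorkers) partitions get one extra row.
    static List<double[][]> partitionRows(double[][] data, int numWorkers) {
        if (numWorkers <= 0) {
            throw new IllegalArgumentException("numWorkers must be positive");
        }
        List<double[][]> parts = new ArrayList<>();
        int base = data.length / numWorkers;
        int extra = data.length % numWorkers;
        int start = 0;
        for (int w = 0; w < numWorkers; w++) {
            int size = base + (w < extra ? 1 : 0);
            parts.add(Arrays.copyOfRange(data, start, start + size));
            start += size;
        }
        return parts;
    }

    // Cut one partition into chunks of at most maxRowsPerChunk rows. The row
    // cap stands in for a byte-size limit such as the 2 GB broadcast limit.
    static List<double[][]> chunkRows(double[][] part, int maxRowsPerChunk) {
        if (maxRowsPerChunk <= 0) {
            throw new IllegalArgumentException("maxRowsPerChunk must be positive");
        }
        List<double[][]> chunks = new ArrayList<>();
        for (int start = 0; start < part.length; start += maxRowsPerChunk) {
            int end = Math.min(part.length, start + maxRowsPerChunk);
            chunks.add(Arrays.copyOfRange(part, start, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        double[][] data = new double[5][2]; // 5 training rows, 2 features
        List<double[][]> parts = partitionRows(data, 2);
        for (int w = 0; w < parts.size(); w++) {
            List<double[][]> chunks = chunkRows(parts.get(w), 2);
            System.out.println("worker " + w + ": " + parts.get(w).length
                + " rows in " + chunks.size() + " chunk(s)");
        }
    }
}
```

In a real job the chunks would be derived from the serialized size of the data rather than from a row count, and each worker would read only the chunks that belong to its own partition.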



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
