[jira] [Updated] (SYSTEMML-2418) Spark data partitioner

LI Guobao (JIRA) Tue, 26 Jun 2018 16:06:06 -0700


     [ 
https://issues.apache.org/jira/browse/SYSTEMML-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


LI Guobao updated SYSTEMML-2418:
--------------------------------
    Summary: Spark data partitioner  (was: Distributing data to workers)

> Spark data partitioner
> ----------------------
>
>                 Key: SYSTEMML-2418
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2418
>             Project: SystemML
>          Issue Type: Sub-task
>            Reporter: LI Guobao
>            Assignee: LI Guobao
>            Priority: Major
>
> In the context of the parameter server (ps), the training data is partitioned 
> according to one of several schemes. This partitioning is executed on the 
> driver node, and the partitioned data is then distributed to the workers via 
> broadcast. Because of the 2 GB limit on a single Spark broadcast, we could 
> use the _PartitionedBroadcast_ class for this step. Afterwards, the 
> partitioned broadcast object can be passed to each worker to launch its job.
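
The idea described above can be sketched in plain Python, without Spark. This is
only an illustration under stated assumptions: the helper names (partition_rows,
chunk_partition, reassemble) are hypothetical and are not part of SystemML or of
the _PartitionedBroadcast_ API, the contiguous row split stands in for whichever
partitioning scheme is chosen, and the 2 GB byte limit is simulated by a maximum
row count per chunk.

```python
# Illustrative sketch only; not SystemML's PartitionedBroadcast API.
# All function names and parameters here are hypothetical.

def partition_rows(rows, num_workers):
    """Driver side: split rows into contiguous, near-equal partitions."""
    base, extra = divmod(len(rows), num_workers)
    parts, start = [], 0
    for w in range(num_workers):
        # The first `extra` workers receive one additional row.
        size = base + (1 if w < extra else 0)
        parts.append(rows[start:start + size])
        start += size
    return parts


def chunk_partition(part, max_rows_per_chunk):
    """Driver side: cut one partition into chunks that each stay under the
    (simulated) per-broadcast size limit."""
    return [part[i:i + max_rows_per_chunk]
            for i in range(0, len(part), max_rows_per_chunk)]


def reassemble(chunks):
    """Worker side: concatenate the received chunks back into one partition."""
    return [row for chunk in chunks for row in chunk]


# Toy data: 10 rows of 2 features, 3 workers, at most 2 rows per chunk.
data = [[float(i), float(i * i)] for i in range(10)]
partitions = partition_rows(data, num_workers=3)
broadcastable = [chunk_partition(p, max_rows_per_chunk=2) for p in partitions]
```

In a real Spark job each chunk would be broadcast separately and each worker
would rebuild only its own partition, roughly as reassemble does here.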



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
