[
https://issues.apache.org/jira/browse/SYSTEMML-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
LI Guobao updated SYSTEMML-2418:
--------------------------------
Summary: Spark data partitioner (was: Distributing data to workers)
> Spark data partitioner
> ----------------------
>
> Key: SYSTEMML-2418
> URL: https://issues.apache.org/jira/browse/SYSTEMML-2418
> Project: SystemML
> Issue Type: Sub-task
> Reporter: LI Guobao
> Assignee: LI Guobao
> Priority: Major
>
> In the context of the parameter server (ps), the training data is
> partitioned according to one of several partitioning schemes. This
> conversion is executed on the driver node, and the partitioned data is then
> distributed to the workers via broadcast. Because a Spark broadcast is
> limited to 2 GB, we could leverage the _PartitionedBroadcast_ class to do
> this conversion. Afterwards, the partitioned broadcast object can be passed
> to the workers so that each one can launch its job.
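
The driver-side partitioning step described above could look roughly like the
following. This is a minimal sketch in plain Java, not SystemML code: the class
name DataPartitionSketch, the method partitionContiguous, and the choice of a
disjoint contiguous row split are all assumptions made for illustration, since
the issue does not name the schemes. In SystemML itself the resulting blocks
would presumably be wrapped in a _PartitionedBroadcast_ before being handed to
the workers; that API is deliberately not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are illustrative and not part of SystemML.
final class DataPartitionSketch {

    // Split the rows of `data` into `numWorkers` disjoint, contiguous blocks.
    // The first (rows % numWorkers) blocks receive one extra row, so block
    // sizes differ by at most one.
    static List<double[][]> partitionContiguous(double[][] data, int numWorkers) {
        if (numWorkers <= 0) {
            throw new IllegalArgumentException("numWorkers must be positive");
        }
        List<double[][]> parts = new ArrayList<>(numWorkers);
        int rows = data.length;
        int base = rows / numWorkers;   // minimum rows per worker
        int extra = rows % numWorkers;  // leftover rows to spread out
        int start = 0;
        for (int w = 0; w < numWorkers; w++) {
            int size = base + (w < extra ? 1 : 0);
            double[][] block = new double[size][];
            // Copies row references only; the row data itself is shared.
            System.arraycopy(data, start, block, 0, size);
            parts.add(block);
            start += size;
        }
        return parts;
    }

    public static void main(String[] args) {
        double[][] data = new double[7][2];
        for (int i = 0; i < data.length; i++) {
            data[i][0] = i; // tag each row with its original index
        }
        List<double[][]> parts = partitionContiguous(data, 3);
        for (int w = 0; w < parts.size(); w++) {
            System.out.println("worker " + w + ": " + parts.get(w).length + " rows");
        }
    }
}
```

Splitting on the driver keeps the scheme logic in one place. The cost is that
the full data set must fit in driver memory, which is also why the size limit
on a single broadcast matters here.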
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)