[ 
https://issues.apache.org/jira/browse/SYSTEMML-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LI Guobao reassigned SYSTEMML-1313:
-----------------------------------

    Assignee: LI Guobao

> Parfor broadcast exploitation
> -----------------------------
>
>                 Key: SYSTEMML-1313
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1313
>             Project: SystemML
>          Issue Type: Sub-task
>          Components: APIs, Runtime
>            Reporter: Matthias Boehm
>            Assignee: LI Guobao
>            Priority: Major
>
> The parfor optimizer may decide to execute the entire loop as a remote Spark 
> job to utilize cluster parallelism. In this case, all inputs to the parfor 
> body (i.e., variables that are created or read outside of the parfor body but 
> used or overwritten inside) are read from HDFS. In the past there was an 
> issue of redundant reads, which has been addressed with SYSTEMML-1879. 
> However, the direct use of Spark broadcast variables would likely improve 
> performance, especially in clusters with many nodes.
> This task aims to leverage Spark broadcast variables for all parfor inputs. 
> In detail, this entails two major aspects. First, we need runtime support to 
> optionally broadcast the inputs via broadcast variables in 
> {{RemoteParForSpark}} and obtain them from these broadcast variables in 
> {{RemoteParForSparkWorker}} without causing unnecessary eviction. In 
> contrast to the existing broadcast primitives, we don't need to blockify the 
> matrix because the matrix is accessed in full by in-memory operations. 
> Second, this requires an extension of the parfor optimizer to reason about 
> scenarios where it is safe to use broadcasts, because broadcasts increase 
> memory requirements by acting as matrices pinned in memory. 
> This second task likely overlaps with SYSTEMML-1349, which requires 
> similar reasoning to handle shared reads.
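The optimizer-side reasoning described above (broadcasts acting as pinned in-memory matrices) can be illustrated with a minimal budget check. This is a hypothetical sketch, not actual SystemML code: the method name {{canBroadcastAll}}, the size estimates, and the budget values are all illustrative assumptions.

```java
// Hypothetical sketch of the parfor optimizer's broadcast-safety reasoning:
// broadcast inputs stay pinned in executor memory for the lifetime of the
// remote parfor job, so their total size plus the memory estimate of the
// parfor body operations must fit into the executor memory budget.
// (Not SystemML code; names and numbers are illustrative assumptions.)
public class BroadcastBudgetSketch {

    // Returns true if all candidate inputs can be safely broadcast,
    // i.e., the pinned broadcast memory plus the per-worker operation
    // memory estimate stays within the available budget.
    static boolean canBroadcastAll(double[] inputSizes, double memBudget, double opMemEstimate) {
        double pinned = 0;
        for (double size : inputSizes)
            pinned += size; // each broadcast acts as a pinned in-memory matrix
        return pinned + opMemEstimate <= memBudget;
    }

    public static void main(String[] args) {
        // Two 100 MB inputs and a 50 MB operation estimate fit a 1 GB budget.
        System.out.println(canBroadcastAll(new double[]{100e6, 100e6}, 1e9, 50e6));
        // Two 600 MB inputs do not; the optimizer would fall back to HDFS reads.
        System.out.println(canBroadcastAll(new double[]{600e6, 600e6}, 1e9, 50e6));
    }
}
```

Under this reasoning, when the check fails the optimizer would keep the existing HDFS-read path rather than broadcasting, which is what makes the overlap with the shared-read analysis in SYSTEMML-1349 plausible.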



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
