[ 
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-2197:
-------------------------------------
    Description: 
All spark instructions that broadcast one of the input operands, rely on a 
shared primitive {{sec.getBroadcastForVariable(var)}} for creating partitioned 
broadcasts, which are wrapper objects around potentially many broadcast 
variables to overcome Spark 2GB limitation for compressed broadcasts. Each 
individual broadcast blocks the matrix into squared blocks for direct access 
without unnecessary copy per task. So far this broadcast creation is 
single-threaded. 

This task aims to parallelize the blocking of the given in-memory matrix into 
squared blocks 
(https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/instructions/spark/data/PartitionedBlock.java#L82)
 as well as the subsequent partition creation and actual broadcasting 
(https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L548).
 

For consistency and in order to avoid excessive over-provisioning, this 
multi-threading should use the common internal thread pool or parallel java 
streams, which similarly calls the shared {{ForkJoinPool.commonPool}}. An 
example is the multi-threaded parallelization of RDDs which similarly blocks a 
given matrix into its squared blocks (see 
https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L679).

> Multi-threaded broadcast creation
> ---------------------------------
>
>                 Key: SYSTEMML-2197
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2197
>             Project: SystemML
>          Issue Type: Task
>            Reporter: Matthias Boehm
>            Priority: Major
>
> All spark instructions that broadcast one of the input operands, rely on a 
> shared primitive {{sec.getBroadcastForVariable(var)}} for creating 
> partitioned broadcasts, which are wrapper objects around potentially many 
> broadcast variables to overcome Spark 2GB limitation for compressed 
> broadcasts. Each individual broadcast blocks the matrix into squared blocks 
> for direct access without unnecessary copy per task. So far this broadcast 
> creation is single-threaded. 
> This task aims to parallelize the blocking of the given in-memory matrix into 
> squared blocks 
> (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/instructions/spark/data/PartitionedBlock.java#L82)
>  as well as the subsequent partition creation and actual broadcasting 
> (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L548).
>  
> For consistency and in order to avoid excessive over-provisioning, this 
> multi-threading should use the common internal thread pool or parallel java 
> streams, which similarly calls the shared {{ForkJoinPool.commonPool}}. An 
> example is the multi-threaded parallelization of RDDs which similarly blocks 
> a given matrix into its squared blocks (see 
> https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L679).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to