[
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthias Boehm updated SYSTEMML-2197:
-------------------------------------
Description:
All spark instructions that broadcast one of the input operands, rely on a
shared primitive {{sec.getBroadcastForVariable(var)}} for creating partitioned
broadcasts, which are wrapper objects around potentially many broadcast
variables to overcome Spark 2GB limitation for compressed broadcasts. Each
individual broadcast blocks the matrix into squared blocks for direct access
without unnecessary copy per task. So far this broadcast creation is
single-threaded.
This task aims to parallelize the blocking of the given in-memory matrix into
squared blocks
(https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/instructions/spark/data/PartitionedBlock.java#L82)
as well as the subsequent partition creation and actual broadcasting
(https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L548).
For consistency and in order to avoid excessive over-provisioning, this
multi-threading should use the common internal thread pool or parallel java
streams, which similarly calls the shared {{ForkJoinPool.commonPool}}. An
example is the multi-threaded parallelization of RDDs which similarly blocks a
given matrix into its squared blocks (see
https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L679).
> Multi-threaded broadcast creation
> ---------------------------------
>
> Key: SYSTEMML-2197
> URL: https://issues.apache.org/jira/browse/SYSTEMML-2197
> Project: SystemML
> Issue Type: Task
> Reporter: Matthias Boehm
> Priority: Major
>
> All spark instructions that broadcast one of the input operands, rely on a
> shared primitive {{sec.getBroadcastForVariable(var)}} for creating
> partitioned broadcasts, which are wrapper objects around potentially many
> broadcast variables to overcome Spark 2GB limitation for compressed
> broadcasts. Each individual broadcast blocks the matrix into squared blocks
> for direct access without unnecessary copy per task. So far this broadcast
> creation is single-threaded.
> This task aims to parallelize the blocking of the given in-memory matrix into
> squared blocks
> (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/instructions/spark/data/PartitionedBlock.java#L82)
> as well as the subsequent partition creation and actual broadcasting
> (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L548).
>
> For consistency and in order to avoid excessive over-provisioning, this
> multi-threading should use the common internal thread pool or parallel java
> streams, which similarly calls the shared {{ForkJoinPool.commonPool}}. An
> example is the multi-threaded parallelization of RDDs which similarly blocks
> a given matrix into its squared blocks (see
> https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L679).
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)