[ https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419113#comment-16419113 ]
LI Guobao commented on SYSTEMML-2197: ------------------------------------- Thanks [~mboehm7] for the details. And I want to know which test should be launched for it? Thanks. > Multi-threaded broadcast creation > --------------------------------- > > Key: SYSTEMML-2197 > URL: https://issues.apache.org/jira/browse/SYSTEMML-2197 > Project: SystemML > Issue Type: Task > Reporter: Matthias Boehm > Priority: Major > > All spark instructions that broadcast one of the input operands, rely on a > shared primitive {{sec.getBroadcastForVariable(var)}} for creating > partitioned broadcasts, which are wrapper objects around potentially many > broadcast variables to overcome Spark 2GB limitation for compressed > broadcasts. Each individual broadcast blocks the matrix into squared blocks > for direct access without unnecessary copy per task. So far this broadcast > creation is single-threaded. > This task aims to parallelize the blocking of the given in-memory matrix into > squared blocks > (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/instructions/spark/data/PartitionedBlock.java#L82) > as well as the subsequent partition creation and actual broadcasting > (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L548). > > For consistency and in order to avoid excessive over-provisioning, this > multi-threading should use the common internal thread pool or parallel java > streams, which similarly calls the shared {{ForkJoinPool.commonPool}}. An > example is the multi-threaded parallelization of RDDs which similarly blocks > a given matrix into its squared blocks (see > https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L679). -- This message was sent by Atlassian JIRA (v7.6.3#76005)