[
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16428839#comment-16428839
]
Matthias Boehm commented on SYSTEMML-2197:
------------------------------------------
No, these tests run fine locally and in our daily Jenkins tests. Please look at
the bottom of the exception (potentially cut off in the pasted stacktrace you
included?). I suspect a simple permission issue, which would show up as
something like {{error running command chmod ...}}. You could resolve that by
recursively setting the permissions for your local systemml root directory with
{{chmod -R 755 systemml}}.
> Multi-threaded broadcast creation
> ---------------------------------
>
> Key: SYSTEMML-2197
> URL: https://issues.apache.org/jira/browse/SYSTEMML-2197
> Project: SystemML
> Issue Type: Task
> Reporter: Matthias Boehm
> Assignee: LI Guobao
> Priority: Major
>
> All Spark instructions that broadcast one of the input operands rely on a
> shared primitive {{sec.getBroadcastForVariable(var)}} for creating
> partitioned broadcasts, which are wrapper objects around potentially many
> broadcast variables to overcome Spark's 2GB limitation for compressed
> broadcasts. Each individual broadcast blocks the matrix into squared blocks
> for direct access without an unnecessary copy per task. So far, this
> broadcast creation is single-threaded.
> This task aims to parallelize the blocking of the given in-memory matrix into
> squared blocks
> (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/instructions/spark/data/PartitionedBlock.java#L82)
> as well as the subsequent partition creation and actual broadcasting
> (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L548).
>
> For consistency and in order to avoid excessive over-provisioning, this
> multi-threading should use the common internal thread pool or parallel Java
> streams, which similarly use the shared {{ForkJoinPool.commonPool}}. An
> example is the multi-threaded parallelization of RDDs, which similarly blocks
> a given matrix into its squared blocks (see
> https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L679).
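
For illustration only, the blocking step described above could be parallelized
roughly as follows. This is a minimal sketch, not the SystemML implementation:
the class BlockingSketch and its methods are hypothetical, and a plain
double[][] stands in for the in-memory matrix. It relies only on the fact that
parallel Java streams run on ForkJoinPool.commonPool() by default, so no
additional thread pool is created.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

/** Hypothetical sketch; not SystemML's PartitionedBlock. */
final class BlockingSketch {

    /** Number of blocks of side blen needed to cover len cells. */
    static int numBlocks(int len, int blen) {
        return (len + blen - 1) / blen;
    }

    /**
     * Cuts a dense matrix into blocks of side blen (boundary blocks may be
     * smaller). Blocks are returned in row-major block order.
     */
    static double[][][] toBlocks(double[][] in, int blen) {
        int rows = in.length;
        int cols = rows == 0 ? 0 : in[0].length;
        int nrb = numBlocks(rows, blen); // number of row blocks
        int ncb = numBlocks(cols, blen); // number of column blocks
        double[][][] out = new double[nrb * ncb][][];

        // One task per block; parallel streams use the common ForkJoinPool.
        IntStream.range(0, nrb * ncb).parallel().forEach(ix -> {
            int r0 = (ix / ncb) * blen;
            int c0 = (ix % ncb) * blen;
            int r1 = Math.min(r0 + blen, rows);
            int c1 = Math.min(c0 + blen, cols);
            double[][] block = new double[r1 - r0][];
            for (int r = r0; r < r1; r++)
                block[r - r0] = Arrays.copyOfRange(in[r], c0, c1);
            out[ix] = block; // each task writes a distinct slot
        });
        return out;
    }

    public static void main(String[] args) {
        double[][] m = new double[5][7];
        for (int i = 0; i < 5; i++)
            for (int j = 0; j < 7; j++)
                m[i][j] = i * 7 + j;
        double[][][] blocks = toBlocks(m, 3);
        System.out.println(blocks.length + " blocks");
    }
}
```

Each task copies exactly one block and writes to its own output slot, so the
tasks need no synchronization among themselves. The same pattern might apply to
the subsequent partition creation, but that part is not sketched here.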
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)