[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation
[ https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16428839#comment-16428839 ]

Matthias Boehm commented on SYSTEMML-2197:
------------------------------------------

No, these tests run fine locally and on our daily Jenkins tests. Please look at the bottom of the exception (potentially cut off in the pasted stack trace you included?). I suspect a simple permission issue, which would show up as something like {{error running command chmod ...}}. You could resolve that by recursively setting the permissions for your local systemml root directory with {{chmod -R 755 systemml}}.

> Multi-threaded broadcast creation
> ---------------------------------
>
>                 Key: SYSTEMML-2197
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2197
>             Project: SystemML
>          Issue Type: Task
>            Reporter: Matthias Boehm
>            Assignee: LI Guobao
>            Priority: Major
>
> All spark instructions that broadcast one of the input operands rely on a
> shared primitive {{sec.getBroadcastForVariable(var)}} for creating
> partitioned broadcasts, which are wrapper objects around potentially many
> broadcast variables to overcome Spark's 2GB limitation for compressed
> broadcasts. Each individual broadcast blocks the matrix into squared blocks
> for direct access without unnecessary copy per task. So far this broadcast
> creation is single-threaded.
>
> This task aims to parallelize the blocking of the given in-memory matrix into
> squared blocks
> (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/instructions/spark/data/PartitionedBlock.java#L82)
> as well as the subsequent partition creation and actual broadcasting
> (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L548).
>
> For consistency and in order to avoid excessive over-provisioning, this
> multi-threading should use the common internal thread pool or parallel java
> streams, which similarly use the shared {{ForkJoinPool.commonPool}}. An
> example is the multi-threaded parallelization of RDDs, which similarly blocks
> a given matrix into its squared blocks (see
> https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L679).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
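The parallel-streams idea in the issue description can be illustrated with a minimal sketch. This is not the SystemML implementation: the class {{ParallelBlocking}}, the method {{toBlocks}}, and the plain {{double[][]}} matrix are hypothetical stand-ins for the real matrix block classes. It only shows how one task per square block can be run on a parallel stream, which by default executes on the shared {{ForkJoinPool.commonPool}}.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical illustration only; names and data layout are assumptions,
// not the SystemML PartitionedBlock API.
class ParallelBlocking {

    // Cuts a dense matrix into blen x blen blocks (edge blocks may be smaller).
    // Blocks are returned in row-major block order.
    static double[][][] toBlocks(double[][] m, int blen) {
        if (blen <= 0)
            throw new IllegalArgumentException("block length must be positive");
        if (m.length == 0)
            return new double[0][][];
        int rows = m.length, cols = m[0].length;
        int nrb = (rows + blen - 1) / blen; // number of row blocks
        int ncb = (cols + blen - 1) / blen; // number of column blocks
        double[][][] blocks = new double[nrb * ncb][][];

        // One task per block; a parallel stream runs on the common pool by
        // default. Each task writes a distinct array slot, so no locking is needed.
        IntStream.range(0, nrb * ncb).parallel().forEach(ix -> {
            int r0 = (ix / ncb) * blen, c0 = (ix % ncb) * blen;
            int r1 = Math.min(r0 + blen, rows), c1 = Math.min(c0 + blen, cols);
            double[][] b = new double[r1 - r0][];
            for (int r = r0; r < r1; r++)
                b[r - r0] = Arrays.copyOfRange(m[r], c0, c1);
            blocks[ix] = b;
        });
        return blocks;
    }

    public static void main(String[] args) {
        double[][] m = new double[5][7];
        double[][][] blocks = toBlocks(m, 3);
        System.out.println("number of blocks: " + blocks.length);
    }
}
```

Using the stream's default pool rather than a dedicated executor matches the stated goal of avoiding over-provisioning, since all such work then shares one bounded set of worker threads.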
[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation
[ https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16428709#comment-16428709 ]

LI Guobao commented on SYSTEMML-2197:
-------------------------------------

Thanks, I successfully launched this test and got 4 failed tests (testDenseDenseMapmmMR, testDenseSparseMapmmMR, testSparseDenseMapmmMR, testSparseSparseMapmmMR). They all seem to concern the MR backend. After switching back to the master branch, I got the same result. Is this a known bug? Could I ignore it? Thanks for the response. Here is the stack trace:

{code:java}
18/04/06 19:38:48 ERROR api.DMLScript: Failed to execute DML script.
org.apache.sysml.runtime.DMLRuntimeException: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 23 and 27 -- Error evaluating instruction:
jobtype = GMR
input labels = [_mVar13, _mVar14]
recReader inst =
rand inst =
mapper inst = MR°mapmm°0·MATRIX·DOUBLE°1·MATRIX·DOUBLE°2·MATRIX·DOUBLE°RIGHT°false
shuffle inst =
agg inst = MR°ak+°2·MATRIX·DOUBLE°3·MATRIX·DOUBLE°true°NONE
other inst =
output labels = [pVar15]
result indices = ,3
num reducers = 10
replication = 1
    at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
    at org.apache.sysml.api.ScriptExecutorUtils.executeRuntimeProgram(ScriptExecutorUtils.java:97)
    at org.apache.sysml.api.DMLScript.execute(DMLScript.java:744)
    at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:515)
    at org.apache.sysml.api.DMLScript.main(DMLScript.java:246)
    at org.apache.sysml.test.integration.AutomatedTestBase.runTest(AutomatedTestBase.java:1214)
    at org.apache.sysml.test.integration.functions.binary.matrix_full_other.FullDistributedMatrixMultiplicationTest.runDistributedMatrixMatrixMultiplicationTest(FullDistributedMatrixMultiplicationTest.java:276)
    at org.apache.sysml.test.integration.functions.binary.matrix_full_other.FullDistributedMatrixMultiplicationTest.testSparseSparseMapmmMR(FullDistributedMatrixMultiplicationTest.java:101)
{code}
[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation
[ https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16427850#comment-16427850 ]

Matthias Boehm commented on SYSTEMML-2197:
------------------------------------------

Sure - for checking result correctness, most of our tests run SystemML and for comparison R, and compare the results accordingly. This error {{Cannot run program "Rscript": error=}} indicates that you might not have R installed or it is not properly set in your PATH environment variable. To test this, please open a command line terminal and type {{Rscript}}. Also note that R is an external program and thus independent of the Java classpath.
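The distinction between PATH and the Java classpath can be checked programmatically. The following is a minimal sketch, not part of the SystemML test harness: the class {{PathCheck}} and the method {{isOnPath}} are hypothetical names. It scans the directories of a PATH-style string for an executable file, which roughly mirrors how the operating system resolves {{Rscript}} when the test calls {{Runtime.exec}}.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of SystemML.
class PathCheck {

    // Returns true if an executable regular file named exe exists in one of
    // the directories listed in pathVar (separated by File.pathSeparator).
    static boolean isOnPath(String exe, String pathVar) {
        if (pathVar == null || pathVar.isEmpty())
            return false;
        for (String dir : pathVar.split(File.pathSeparator)) {
            if (dir.isEmpty())
                continue;
            Path candidate = Paths.get(dir, exe);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate))
                return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // The classpath plays no role here; only the PATH variable is consulted.
        // On Windows the file name would be Rscript.exe instead.
        System.out.println("Rscript on PATH: "
            + isOnPath("Rscript", System.getenv("PATH")));
    }
}
```

If this prints false inside the IDE but {{Rscript}} works in a terminal, the IDE process was probably started with a different PATH than the shell.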
[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation
[ https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426745#comment-16426745 ]

LI Guobao commented on SYSTEMML-2197:
-------------------------------------

OK. I got an error when launching this test. [~mboehm7], could you help me out with this? I have set the classpath to the systemml module, and the generated folders inside target can also be found.

{code:java}
18/04/05 12:40:42 INFO api.DMLScript: END DML run 04/05/2018 12:40:42
starting R script
cmd: Rscript --default-packages=methods,datasets,graphics,grDevices,stats,utils ./src/test/scripts/functions/binary/matrix_full_other/FullDistributedMatrixMultiplication.R target/testTemp/functions/binary/matrix_full_other/FullDistributedMatrixMultiplicationTest/in/ target/testTemp/functions/binary/matrix_full_other/FullDistributedMatrixMultiplicationTest/expected/0.7_0.1/
java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
    at java.lang.Runtime.exec(Runtime.java:620)
    at java.lang.Runtime.exec(Runtime.java:450)
    at java.lang.Runtime.exec(Runtime.java:347)
    at org.apache.sysml.test.integration.AutomatedTestBase.runRScript(AutomatedTestBase.java:990)
    at org.apache.sysml.test.integration.functions.binary.matrix_full_other.FullDistributedMatrixMultiplicationTest.runDistributedMatrixMatrixMultiplicationTest(FullDistributedMatrixMultiplicationTest.java:277)
    at org.apache.sysml.test.integration.functions.binary.matrix_full_other.FullDistributedMatrixMultiplicationTest.testDenseSparseRmmSpark(FullDistributedMatrixMultiplicationTest.java:209)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
    at java.lang.ProcessImpl.start(ProcessImpl.java:134)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
    ... 32 more
{code}
[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation
[ https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419314#comment-16419314 ]

Matthias Boehm commented on SYSTEMML-2197:
------------------------------------------

Our test suite runs hundreds of tests that use this broadcast primitive, but if you want one particular test, you can use {{FullDistributedMatrixMultiplicationTest}}.
[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation
[ https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419113#comment-16419113 ]

LI Guobao commented on SYSTEMML-2197:
-------------------------------------

Thanks [~mboehm7] for the details. Which test should I launch for it? Thanks.
[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation
[ https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418358#comment-16418358 ]

Matthias Boehm commented on SYSTEMML-2197:
------------------------------------------

sure, I just updated the description - let me know if you need more details.
[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation
[ https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418357#comment-16418357 ]

Matthias Boehm commented on SYSTEMML-2197:
------------------------------------------

sure - let me know if you need more details.
[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation
[ https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418050#comment-16418050 ]

LI Guobao commented on SYSTEMML-2197:
-------------------------------------

Hi [~mboehm7], could you give me some more details on this issue? Thanks