[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation

2018-04-06 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16428839#comment-16428839
 ] 

Matthias Boehm commented on SYSTEMML-2197:
--

No, these tests run fine locally and in our daily Jenkins tests. Please look at 
the bottom of the exception (potentially cut off in the pasted stack trace you 
included?). I suspect a simple permission issue, which would show up as 
something like {{error running command chmod ...}}. You could resolve that by 
recursively setting the permissions for your local systemml root directory with 
{{chmod -R 755 systemml}}.

> Multi-threaded broadcast creation
> -
>
> Key: SYSTEMML-2197
> URL: https://issues.apache.org/jira/browse/SYSTEMML-2197
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: LI Guobao
>Priority: Major
>
> All Spark instructions that broadcast one of the input operands rely on a 
> shared primitive {{sec.getBroadcastForVariable(var)}} for creating 
> partitioned broadcasts, which are wrapper objects around potentially many 
> broadcast variables to overcome Spark's 2GB limitation for compressed 
> broadcasts. Each individual broadcast blocks the matrix into squared blocks 
> for direct access without an unnecessary copy per task. So far, this broadcast 
> creation is single-threaded. 
> This task aims to parallelize the blocking of the given in-memory matrix into 
> squared blocks 
> (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/instructions/spark/data/PartitionedBlock.java#L82)
>  as well as the subsequent partition creation and actual broadcasting 
> (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L548).
>  
> For consistency and in order to avoid excessive over-provisioning, this 
> multi-threading should use the common internal thread pool or parallel Java 
> streams, which similarly use the shared {{ForkJoinPool.commonPool}}. An 
> example is the multi-threaded parallelization of RDDs, which similarly blocks 
> a given matrix into its squared blocks (see 
> https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L679).
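
A minimal sketch of the parallel blocking described above, assuming a plain
dense double[][] input rather than SystemML's own matrix classes. The names
ParallelBlockingSketch, toBlocks and sliceBlock are hypothetical and not part
of the SystemML code base; the sketch only illustrates how a parallel Java
stream, which runs on the shared ForkJoinPool.commonPool, could cut a matrix
into squared blocks.

```java
import java.util.stream.IntStream;

/** Hypothetical illustration only; not SystemML code. */
final class ParallelBlockingSketch {

    /** Copies the sub-matrix starting at (r0, c0) with at most blen x blen cells. */
    static double[][] sliceBlock(double[][] m, int r0, int c0, int blen) {
        int rows = Math.min(blen, m.length - r0);
        int cols = Math.min(blen, m[0].length - c0);
        double[][] block = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            System.arraycopy(m[r0 + i], c0, block[i], 0, cols);
        }
        return block;
    }

    /** Splits a non-empty dense matrix into squared blocks, in row-major block order. */
    static double[][][] toBlocks(double[][] m, int blen) {
        int nrb = (m.length + blen - 1) / blen;    // number of row blocks
        int ncb = (m[0].length + blen - 1) / blen; // number of column blocks
        double[][][] blocks = new double[nrb * ncb][][];
        // The parallel stream executes on ForkJoinPool.commonPool(). Every block
        // index writes to its own array slot, so no locking is required.
        IntStream.range(0, nrb * ncb).parallel().forEach(ix ->
            blocks[ix] = sliceBlock(m, (ix / ncb) * blen, (ix % ncb) * blen, blen));
        return blocks;
    }

    public static void main(String[] args) {
        double[][] m = new double[5][3];
        for (int i = 0; i < 5; i++) {
            for (int j = 0; j < 3; j++) {
                m[i][j] = i * 3 + j;
            }
        }
        double[][][] blocks = toBlocks(m, 2);
        System.out.println(blocks.length + " blocks created");
    }
}
```

Because the work is split by block index and each task only reads the input
and writes one output slot, the result does not depend on thread scheduling.
Boundary blocks are simply smaller than blen x blen.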



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation

2018-04-06 Thread LI Guobao (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16428709#comment-16428709
 ] 

LI Guobao commented on SYSTEMML-2197:
-

Thanks, I successfully launched this test and got 4 failed tests 
(testDenseDenseMapmmMR, testDenseSparseMapmmMR, testSparseDenseMapmmMR, 
testSparseSparseMapmmMR). They all seem to concern the MR backend. After 
switching back to the master branch, I got the same result. Is this a known 
bug? Can I ignore it? Thanks for the response. Here is the stack trace:
{code:java}
18/04/06 19:38:48 ERROR api.DMLScript: Failed to execute DML script.
org.apache.sysml.runtime.DMLRuntimeException: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 23 and 27 -- Error evaluating instruction: jobtype = GMR
input labels = [_mVar13, _mVar14]
recReader inst =
rand inst =
mapper inst = MR°mapmm°0·MATRIX·DOUBLE°1·MATRIX·DOUBLE°2·MATRIX·DOUBLE°RIGHT°false
shuffle inst =
agg inst = MR°ak+°2·MATRIX·DOUBLE°3·MATRIX·DOUBLE°true°NONE
other inst =
output labels = [pVar15]
result indices = ,3
num reducers = 10
replication = 1

at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
at org.apache.sysml.api.ScriptExecutorUtils.executeRuntimeProgram(ScriptExecutorUtils.java:97)
at org.apache.sysml.api.DMLScript.execute(DMLScript.java:744)
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:515)
at org.apache.sysml.api.DMLScript.main(DMLScript.java:246)
at org.apache.sysml.test.integration.AutomatedTestBase.runTest(AutomatedTestBase.java:1214)
at org.apache.sysml.test.integration.functions.binary.matrix_full_other.FullDistributedMatrixMultiplicationTest.runDistributedMatrixMatrixMultiplicationTest(FullDistributedMatrixMultiplicationTest.java:276)
at org.apache.sysml.test.integration.functions.binary.matrix_full_other.FullDistributedMatrixMultiplicationTest.testSparseSparseMapmmMR(FullDistributedMatrixMultiplicationTest.java:101)
{code}



[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation

2018-04-05 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16427850#comment-16427850
 ] 

Matthias Boehm commented on SYSTEMML-2197:
--

Sure - for checking result correctness, most of our tests run both SystemML 
and, for comparison, R, and then compare the results. The error {{Cannot run 
program "Rscript": error=}} indicates that you might not have R installed or 
that it is not properly set in your PATH environment variable. To test this, 
please open a command line terminal and type {{Rscript}}. Also note that R is an 
external program and thus independent of the Java classpath. 
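
The same PATH check can also be done from Java. The following is a minimal
sketch, not part of SystemML: the class ExecutableLookupSketch and its method
findOnPath are hypothetical names, and the lookup assumes a Unix-like system
where the executable is named exactly Rscript (on Windows the file would
carry an extension such as .exe, which this sketch does not handle).

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical illustration only; not SystemML code. */
final class ExecutableLookupSketch {

    /** Returns the first executable file called name in the given PATH-style value. */
    static Optional<Path> findOnPath(String name, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty PATH entries
            }
            try {
                Path candidate = Paths.get(dir, name);
                if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                    return Optional.of(candidate);
                }
            } catch (InvalidPathException e) {
                // ignore malformed PATH entries and keep searching
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Path> rscript = findOnPath("Rscript", System.getenv("PATH"));
        System.out.println(rscript
            .map(p -> "Rscript found at " + p)
            .orElse("Rscript not found on PATH"));
    }
}
```

If the lookup reports that Rscript is missing, the test harness will fail in
the same way as in the stack trace with error=2, regardless of how the Java
classpath is configured.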



[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation

2018-04-05 Thread LI Guobao (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426745#comment-16426745
 ] 

LI Guobao commented on SYSTEMML-2197:
-

OK. I got an error when launching this test. [~mboehm7], could you help me out 
with this? I have set the classpath to the systemml module, and the generated 
folders inside target can also be found.
{code:java}
18/04/05 12:40:42 INFO api.DMLScript: END DML run 04/05/2018 12:40:42
starting R script
cmd: Rscript --default-packages=methods,datasets,graphics,grDevices,stats,utils ./src/test/scripts/functions/binary/matrix_full_other/FullDistributedMatrixMultiplication.R target/testTemp/functions/binary/matrix_full_other/FullDistributedMatrixMultiplicationTest/in/ target/testTemp/functions/binary/matrix_full_other/FullDistributedMatrixMultiplicationTest/expected/0.7_0.1/
java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
 at java.lang.Runtime.exec(Runtime.java:620)
 at java.lang.Runtime.exec(Runtime.java:450)
 at java.lang.Runtime.exec(Runtime.java:347)
 at org.apache.sysml.test.integration.AutomatedTestBase.runRScript(AutomatedTestBase.java:990)
 at org.apache.sysml.test.integration.functions.binary.matrix_full_other.FullDistributedMatrixMultiplicationTest.runDistributedMatrixMatrixMultiplicationTest(FullDistributedMatrixMultiplicationTest.java:277)
 at org.apache.sysml.test.integration.functions.binary.matrix_full_other.FullDistributedMatrixMultiplicationTest.testDenseSparseRmmSpark(FullDistributedMatrixMultiplicationTest.java:209)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
 at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
 at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
 at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
 at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
 at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.io.IOException: error=2, No such file or directory
 at java.lang.UNIXProcess.forkAndExec(Native Method)
 at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
 at java.lang.ProcessImpl.start(ProcessImpl.java:134)
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
 ... 32 more
{code}


[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation

2018-03-29 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419314#comment-16419314
 ] 

Matthias Boehm commented on SYSTEMML-2197:
--

Our test suite runs hundreds of tests that use this broadcast primitive, but if 
you want one particular test, you can use 
{{FullDistributedMatrixMultiplicationTest}}.



[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation

2018-03-29 Thread LI Guobao (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419113#comment-16419113
 ] 

LI Guobao commented on SYSTEMML-2197:
-

Thanks [~mboehm7] for the details. Which test should I launch for this issue? 
Thanks.



[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation

2018-03-28 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418358#comment-16418358
 ] 

Matthias Boehm commented on SYSTEMML-2197:
--

Sure, I just updated the description - let me know if you need more details.



[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation

2018-03-28 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418357#comment-16418357
 ] 

Matthias Boehm commented on SYSTEMML-2197:
--

Sure - let me know if you need more details.



[jira] [Commented] (SYSTEMML-2197) Multi-threaded broadcast creation

2018-03-28 Thread LI Guobao (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418050#comment-16418050
 ] 

LI Guobao commented on SYSTEMML-2197:
-

Hi [~mboehm7], could you give me some more details on this issue? Thanks
