[
https://issues.apache.org/jira/browse/SYSTEMML-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Imran Younus updated SYSTEMML-1166:
-----------------------------------
Summary: Bad performance of TSMM (was: Performance issues with tsmm RIGHT)
> Bad performance of TSMM
> -----------------------
>
> Key: SYSTEMML-1166
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1166
> Project: SystemML
> Issue Type: Bug
> Components: Runtime
> Environment: spark 1.6.1
> centos
> Reporter: Imran Younus
>
> I'm running the following simple DML code:
> {code}
> N = 30000
> L = Rand(cols=N, max=1, min=-1, pdf="uniform", rows=N)
> LLT = L %*% t(L)
> fn = sum(LLT * LLT)
> print(fn)
> {code}
> Line 3 is translated to a {{tsmm RIGHT}} instruction, and it takes almost
> 30 min to run in both standalone and Spark modes.
> Here is the output from explain and stats:
> {code}
> > /opt/spark-1.6.2-bin-hadoop2.6/bin/spark-submit
> > --master=spark://rr-ram4.softlayer.com:7077 --executor-memory=20g
> > --driver-memory=60g --executor-cores=4 --num-executors=20
> > sysml/target/SystemML.jar -f tsmm_verify.dml -explain -stats
> 17/01/05 18:11:10 INFO api.DMLScript: BEGIN DML run 01/05/2017 18:11:10
> 17/01/05 18:11:10 INFO conf.DMLConfig: Using internal default configuration
> settings. If you wish to customize any settings, please supply a
> `SystemML-config.xml` file.
> 17/01/05 18:11:10 INFO api.DMLScript: EXPLAIN (RUNTIME):
> # Memory Budget local/remote = 41216MB/?MB/?MB/?MB
> # Degree of Parallelism (vcores) local/remote = 48/?
> PROGRAM ( size CP/SP = 10/0 )
> --MAIN PROGRAM
> ----GENERIC (lines 8-24) [recompile=false]
> ------CP createvar _mVar1 scratch_space//_p22241_10.168.31.117//_t0/temp1
> true MATRIX binaryblock 30000 30000 1000 1000 900000000 copy
> ------CP rand 30000 30000 1000 1000 -1 1 1.0 -1 uniform 1.0 48
> _mVar1.MATRIX.DOUBLE
> ------CP createvar _mVar2 scratch_space//_p22241_10.168.31.117//_t0/temp2
> true MATRIX binaryblock 30000 30000 1000 1000 -1 copy
> ------CP tsmm _mVar1.MATRIX.DOUBLE _mVar2.MATRIX.DOUBLE RIGHT 48
> ------CP rmvar _mVar1
> ------CP uasqk+ _mVar2.MATRIX.DOUBLE _Var3.SCALAR.DOUBLE 48
> ------CP rmvar _mVar2
> ------CP print _Var3.SCALAR.DOUBLE.false _Var4.SCALAR.STRING
> ------CP rmvar _Var3
> ------CP rmvar _Var4
> 17/01/05 18:11:10 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 6.00000141554168E12
> 17/01/05 18:40:38 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time: 1768.639 sec.
> Total compilation time: 0.266 sec.
> Total execution time: 1768.373 sec.
> Number of compiled Spark inst: 0.
> Number of executed Spark inst: 0.
> Cache hits (Mem, WB, FS, HDFS): 2/0/0/0.
> Cache writes (WB, FS, HDFS): 2/1/0.
> Cache times (ACQr/m, RLS, EXP):
> 0.000/0.001/13.410/0.000 sec.
> HOP DAGs recompiled (PRED, SB): 0/0.
> HOP DAGs recompile time: 0.000 sec.
> Spark ctx create time (lazy): 0.000 sec.
> Spark trans counts (par,bc,col):0/0/0.
> Spark trans times (par,bc,col): 0.000/0.000/0.000 secs.
> Total JIT compile time: 1.345 sec.
> Total JVM GC count: 3.
> Total JVM GC time: 4.428 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) tsmm 1762.014 sec 1
> -- 2) rand 5.210 sec 1
> -- 3) uasqk+ 0.541 sec 1
> -- 4) rmvar 0.002 sec 4
> -- 5) print 0.002 sec 1
> -- 6) createvar 0.001 sec 2
> 17/01/05 18:40:38 INFO api.DMLScript: END DML run 01/05/2017 18:40:38
> {code}
> If I calculate {{t(L) %*% L}}, which is {{tsmm LEFT}}, it takes only 5
> min. Here is the output of explain and stats in this case:
> {code}
> > /opt/spark-1.6.2-bin-hadoop2.6/bin/spark-submit
> > --master=spark://rr-ram4.softlayer.com:7077 --executor-memory=20g
> > --driver-memory=60g --executor-cores=4 --num-executors=20
> > sysml/target/SystemML.jar -f tsmm_verify.dml -explain -stats
> 17/01/05 18:03:13 INFO api.DMLScript: BEGIN DML run 01/05/2017 18:03:13
> 17/01/05 18:03:13 INFO conf.DMLConfig: Using internal default configuration
> settings. If you wish to customize any settings, please supply a
> `SystemML-config.xml` file.
> 17/01/05 18:03:13 INFO api.DMLScript: EXPLAIN (RUNTIME):
> # Memory Budget local/remote = 41216MB/?MB/?MB/?MB
> # Degree of Parallelism (vcores) local/remote = 48/?
> PROGRAM ( size CP/SP = 10/0 )
> --MAIN PROGRAM
> ----GENERIC (lines 8-24) [recompile=false]
> ------CP createvar _mVar1 scratch_space//_p19190_10.168.31.117//_t0/temp1
> true MATRIX binaryblock 30000 30000 1000 1000 900000000 copy
> ------CP rand 30000 30000 1000 1000 -1 1 1.0 -1 uniform 1.0 48
> _mVar1.MATRIX.DOUBLE
> ------CP createvar _mVar2 scratch_space//_p19190_10.168.31.117//_t0/temp2
> true MATRIX binaryblock 30000 30000 1000 1000 -1 copy
> ------CP tsmm _mVar1.MATRIX.DOUBLE _mVar2.MATRIX.DOUBLE LEFT 48
> ------CP rmvar _mVar1
> ------CP uasqk+ _mVar2.MATRIX.DOUBLE _Var3.SCALAR.DOUBLE 48
> ------CP rmvar _mVar2
> ------CP print _Var3.SCALAR.DOUBLE.false _Var4.SCALAR.STRING
> ------CP rmvar _Var3
> ------CP rmvar _Var4
> 17/01/05 18:03:13 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 5.999491296172791E12
> 17/01/05 18:09:14 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time: 361.293 sec.
> Total compilation time: 0.267 sec.
> Total execution time: 361.026 sec.
> Number of compiled Spark inst: 0.
> Number of executed Spark inst: 0.
> Cache hits (Mem, WB, FS, HDFS): 2/0/0/0.
> Cache writes (WB, FS, HDFS): 2/1/0.
> Cache times (ACQr/m, RLS, EXP):
> 0.000/0.001/12.827/0.000 sec.
> HOP DAGs recompiled (PRED, SB): 0/0.
> HOP DAGs recompile time: 0.000 sec.
> Spark ctx create time (lazy): 0.000 sec.
> Spark trans counts (par,bc,col):0/0/0.
> Spark trans times (par,bc,col): 0.000/0.000/0.000 secs.
> Total JIT compile time: 1.32 sec.
> Total JVM GC count: 3.
> Total JVM GC time: 4.271 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) tsmm 354.384 sec 1
> -- 2) rand 5.181 sec 1
> -- 3) uasqk+ 0.842 sec 1
> -- 4) rmvar 0.002 sec 4
> -- 5) createvar 0.001 sec 2
> -- 6) print 0.001 sec 1
> 17/01/05 18:09:14 INFO api.DMLScript: END DML run 01/05/2017 18:09:14
> {code}
> The time difference between the two cases is far too large. I suspect a
> bug in the runtime code is causing this.
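> As a sanity check that the two variants should cost the same, here is a small-scale
> NumPy sketch of the script (not SystemML code; {{N}} is shrunk so it runs quickly).
> For a square {{L}}, the cyclic property of the trace gives
> {{sum((L %*% t(L))^2) == sum((t(L) %*% L)^2)}}, so {{tsmm RIGHT}} and {{tsmm LEFT}}
> compute mathematically equivalent results with identical FLOP counts, and the ~5x
> runtime gap can only come from the runtime implementation:
> {code}
> import numpy as np
>
> # Small-scale analogue of the DML script; N reduced from 30000 to 200.
> rng = np.random.default_rng(0)
> N = 200
> L = rng.uniform(-1.0, 1.0, size=(N, N))
>
> llt = L @ L.T          # corresponds to tsmm RIGHT
> ltl = L.T @ L          # corresponds to tsmm LEFT
>
> fn_right = np.sum(llt * llt)
> fn_left = np.sum(ltl * ltl)
>
> # Both objectives agree: trace((L L^T)^2) == trace((L^T L)^2).
> print(fn_right, fn_left)
> {code}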
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)