[ 
https://issues.apache.org/jira/browse/SYSTEMML-843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416458#comment-15416458
 ] 

Matthias Boehm edited comment on SYSTEMML-843 at 8/11/16 3:23 AM:
------------------------------------------------------------------

After a closer look at the algorithm, I'm wondering whether it is a good fit 
for large data. The function x2p produces a dense intermediate whose size is 
O(n^2) in the number of input rows. Subsequently, we run thousands of 
iterations and create dozens of intermediates of the same size per iteration. 
The referenced paper actually discusses tree-based approximate algorithms that 
run in O(n log n). [~iyounus], do you see a chance of reworking the algorithm? 
Apart from that, there is some potential for compiler/runtime improvements, 
but it is limited. 
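
To put the O(n^2) concern in numbers, here is a rough back-of-the-envelope 
sketch in Python. It assumes dense double-precision cells (8 bytes each) and 
ignores any block or header overhead. Apart from the 2500-row case from this 
issue, the row counts are illustrative assumptions, not measurements.

```python
BYTES_PER_DOUBLE = 8  # assumption: dense double-precision cells, no overhead


def dense_square_bytes(n_rows: int) -> int:
    """Bytes needed for one dense n_rows x n_rows matrix of doubles."""
    return n_rows * n_rows * BYTES_PER_DOUBLE


def format_size(num_bytes: int) -> str:
    """Render a byte count in decimal MB or GB."""
    if num_bytes >= 10**9:
        return f"{num_bytes / 10**9:.1f} GB"
    return f"{num_bytes / 10**6:.1f} MB"


if __name__ == "__main__":
    # 2500 is the subset size from this issue; the other sizes are hypothetical.
    for n in (2_500, 10_000, 60_000):
        size = format_size(dense_square_bytes(n))
        print(f"n = {n:>6,}: {size} per dense n x n intermediate")
```

Under these assumptions a single intermediate grows from tens of megabytes at 
2500 rows to tens of gigabytes at 60,000 rows, and that is before counting the 
multiple intermediates created in each iteration.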


was (Author: mboehm7):
After a closer look over the algorithm, I'm wondering if this is a good fit for 
large data. The function x2p gives us a dense intermediate that is O(n^2) in 
the number of rows of the input. Subsequently, we run thousands of iterations 
and create dozens of intermediates of the same size. The referenced paper 
actually talks about tree-based approximate algorithms in O(n log n). 
[~iyounus] do you see a chance of reworking the algorithm? Apart from that 
there is some potential for compiler/runtime improvements but they are limited. 

> leftIndex and cache release extremely slow
> ------------------------------------------
>
>                 Key: SYSTEMML-843
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-843
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Imran Younus
>         Attachments: tSNT.tar.gz
>
>
> I'm running the tSNE script in standalone mode with a subset of MNIST data 
> (2500 points). I ran this with and without `-exec singlenode`. Here are the 
> stats:
> (BTW, the same function implemented in Python takes less than 10 sec!)
> -> with singlenode flag
> {code}
> ./bin/systemml scripts/staging/tSNE.dml -stats -nvargs 
> X=/home/iyounus/workspace/tsne_python/mnist2500_X.txt Y=Y_out.txt C=C_out.txt
> 16/08/01 16:46:54 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:           109.667 sec.
> Total compilation time:               0.407 sec.
> Total execution time:         109.260 sec.
> Number of compiled MR Jobs:   0.
> Number of executed MR Jobs:   0.
> Cache hits (Mem, WB, FS, HDFS):       223692/0/0/1.
> Cache writes (WB, FS, HDFS):  80351/0/2.
> Cache times (ACQr/m, RLS, EXP):       0.289/0.015/85.192/0.043 sec.
> HOP DAGs recompiled (PRED, SB):       0/0.
> HOP DAGs recompile time:      0.007 sec.
> Functions recompiled:         1.
> Functions recompile time:     0.039 sec.
> Total JIT compile time:               4.924 sec.
> Total JVM GC count:           312.
> Total JVM GC time:            1.12 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)         tsne    109.202 sec     1
> -- 2)         x2p     109.189 sec     1
> -- 3)         leftIndex       106.728 sec     32136
> -- 4)         tsmm    0.564 sec       1
> -- 5)         exp     0.376 sec       8034
> -- 6)         rangeReIndex    0.201 sec       40170
> -- 7)         /       0.183 sec       24103
> -- 8)         *       0.161 sec       16069
> -- 9)         +       0.144 sec       22840
> -- 10)        uak+    0.106 sec       8036
> 16/08/01 16:46:54 INFO api.DMLScript: END DML run 08/01/2016 16:46:54
> {code}
> -> without singlenode flag
> {code}
> ./bin/systemml scripts/staging/tSNE.dml -stats -nvargs 
> X=/home/iyounus/workspace/tsne_python/mnist2500_X.txt Y=Y_out.txt C=C_out.txt
> 16/08/01 16:52:59 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:           127.290 sec.
> Total compilation time:               0.396 sec.
> Total execution time:         126.894 sec.
> Number of compiled MR Jobs:   1.
> Number of executed MR Jobs:   0.
> Cache hits (Mem, WB, FS, HDFS):       223693/0/0/1.
> Cache writes (WB, FS, HDFS):  80352/0/2.
> Cache times (ACQr/m, RLS, EXP):       0.421/0.016/100.974/0.041 sec.
> HOP DAGs recompiled (PRED, SB):       0/0.
> HOP DAGs recompile time:      0.009 sec.
> Functions recompiled:         1.
> Functions recompile time:     0.038 sec.
> Total JIT compile time:               4.835 sec.
> Total JVM GC count:           312.
> Total JVM GC time:            1.226 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)         tsne    126.426 sec     1
> -- 2)         x2p     126.412 sec     1
> -- 3)         leftIndex       123.982 sec     32136
> -- 4)         exp     0.427 sec       8034
> -- 5)         MR-Job_CSV_REBLOCK      0.412 sec       1
> -- 6)         tsmm    0.308 sec       1
> -- 7)         rangeReIndex    0.242 sec       40170
> -- 8)         /       0.208 sec       24103
> -- 9)         +       0.172 sec       22840
> -- 10)        *       0.151 sec       16069
> 16/08/01 16:52:59 INFO api.DMLScript: END DML run 08/01/2016 16:52:59
> {code}
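
As a reading aid for the two statistics dumps above, the following Python 
sketch computes how much of the execution time falls under leftIndex and under 
cache release. All figures are copied from the quoted output. Reading the third 
value of "Cache times (ACQr/m, RLS, EXP)" as the release time is an assumption 
based on the label order, and the dictionary keys are hypothetical names chosen 
for this example.

```python
# Figures copied from the two -stats outputs quoted in the issue description.
RUNS = {
    "with singlenode": {
        "exec_sec": 109.260,
        "left_index_sec": 106.728,
        "left_index_calls": 32136,
        "cache_release_sec": 85.192,  # assumed to be the RLS value
    },
    "without singlenode": {
        "exec_sec": 126.894,
        "left_index_sec": 123.982,
        "left_index_calls": 32136,
        "cache_release_sec": 100.974,  # assumed to be the RLS value
    },
}


def summarize(run: dict) -> dict:
    """Derive time shares and the mean cost per leftIndex call for one run."""
    return {
        "left_index_share": run["left_index_sec"] / run["exec_sec"],
        "cache_release_share": run["cache_release_sec"] / run["exec_sec"],
        "ms_per_left_index": 1000.0 * run["left_index_sec"] / run["left_index_calls"],
    }


if __name__ == "__main__":
    for name, run in RUNS.items():
        s = summarize(run)
        print(
            f"{name}: leftIndex {s['left_index_share']:.1%} of execution time, "
            f"cache release {s['cache_release_share']:.1%}, "
            f"{s['ms_per_left_index']:.2f} ms per leftIndex call"
        )
```

The two shares should not be added together: the statistics alone do not say 
whether the release time is spent inside the leftIndex instructions, although 
the issue title suggests the two are related.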



