[ 
https://issues.apache.org/jira/browse/SYSTEMML-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408671#comment-15408671
 ] 

Mike Dusenberry edited comment on SYSTEMML-845 at 8/5/16 12:28 AM:
-------------------------------------------------------------------

[~niketanpansare] A full run of {{mnist_lenet-train.dml}} in Spark local mode 
with 50GB driver memory (and other settings as seen in {{perf.sh}}) had the 
following output:
{code}
Total execution time:   8230.825 sec.
Number of executed Spark inst:  137923.
{code}

However, {{lenet-train.dml}} ran as follows, using the same settings as above:
{code}
Total execution time:           2927.089 sec.
Number of executed Spark inst:  4.
{code}

So, these two scripts have the same performance in forced singlenode mode, but 
very different performance when run with Spark (in local mode): the SystemML-NN 
version takes about 2.8x as long and executes 137923 Spark instructions 
instead of 4.  This is even with an excessive amount of memory (50GB -- will 
run in 10GB or less on a laptop).  Awesome chance to make an optimizer 
improvement for major gains.
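
To put a number on the gap, here is a minimal Python sketch that parses the two statistics blocks above and computes the slowdown.  It assumes the output lines have exactly the format shown; {{parse_stats}} is a hypothetical helper written for this comment, not part of SystemML.

```python
import re

def parse_stats(text):
    """Extract total time (seconds) and Spark instruction count from stats output."""
    # Assumes the two line formats shown in the statistics blocks above.
    time_match = re.search(r"Total execution time:\s*([\d.]+) sec", text)
    inst_match = re.search(r"Number of executed Spark inst:\s*(\d+)", text)
    return float(time_match.group(1)), int(inst_match.group(1))

# Output of mnist_lenet-train.dml (uses SystemML-NN), copied from above.
nn_stats = """Total execution time:   8230.825 sec.
Number of executed Spark inst:  137923."""

# Output of lenet-train.dml (does not use SystemML-NN), copied from above.
plain_stats = """Total execution time:           2927.089 sec.
Number of executed Spark inst:  4."""

nn_time, nn_inst = parse_stats(nn_stats)
plain_time, plain_inst = parse_stats(plain_stats)

slowdown = nn_time / plain_time
print(f"Slowdown with SystemML-NN: {slowdown:.2f}x")
print(f"Spark instructions: {nn_inst} vs {plain_inst}")
```

The same helper could be pointed at the stats section of any other run, provided the two line formats match.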
 
cc [~mboehm7]


> Compare Performance of LeNet Scripts With & Without Using SystemML-NN
> ---------------------------------------------------------------------
>
>                 Key: SYSTEMML-845
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-845
>             Project: SystemML
>          Issue Type: Improvement
>            Reporter: Mike Dusenberry
>         Attachments: convert.dml, lenet-train-spark-explain.log, 
> log08.03.16-1470268602.txt, mnist_lenet-train-spark-explain.log, perf.sh, 
> run.sh
>
>
> This JIRA issue tracks the comparison of the performance of the LeNet scripts 
> with & without using SystemML-NN.  The goal is that they should have equal 
> performance in terms of both accuracy and time.  Any difference will 
> indicate areas of engine improvement.
> Scripts:
> * [mnist_lenet-train.dml | 
> https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/examples/mnist_lenet-train.dml]
>  - LeNet script that *does* use the SystemML-NN library.
> * [lenet-train.dml | 
> https://github.com/apache/incubator-systemml/blob/master/scripts/staging/lenet-train.dml]
>  - LeNet script that *does not* use the SystemML-NN library.
> To fully reproduce, I created a directory, placed the two attached bash 
> scripts in it, grabbed a copy of the NN library and placed it into the 
> directory, ran the library's examples/get_mnist_data.sh script to get the 
> data (placed into examples/data), used the attached convert.dml to create 
> binary copies of the data for both scripts, and then ran run.sh.  I also 
> copied examples/data to the base directory.  Adjust the {{EXEC}} and 
> related variables in {{perf.sh}} to switch between standalone, Spark, memory 
> sizes, explain, stats, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
