Re: Removal of workaround flags

2017-02-16 Thread dusenberrymw
Yeah, I want us to look heavily into this problem in the context of deep 
learning algorithms.  I think we should plan on having first-class support for 
DL in our 1.0 release, including efficient training (distributed SGD, plus 
GPUs) and efficient distributed scoring.  A nice side effect is that when we 
achieve this, we'll end up benefiting most of our existing algorithms as well.

--

Mike Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry


> On Feb 15, 2017, at 12:22 PM, Niketan Pansare wrote:
> 
> Hi Matthias,
> 
> I am OK with removing this flag, but would prefer that we keep the JIRA open 
> until we are sure that caching is not a bottleneck. I have noticed that the 
> gradients become sparse as we execute more iterations. Also, cache release 
> time depends on the memory budget. Here are the statistics from running LeNet 
> on MNIST using 
> https://github.com/apache/incubator-systemml/tree/master/scripts/staging/SystemML-NN/examples
> 
> With 20G driver memory, the statistics after running 10 epochs are as follows:
> Epoch: 10, Iter: 700, Train Loss: 0.20480149054528493, Train Accuracy: 
> 0.984375, Val Loss: 0.026928755962383588, Val Accuracy: 0.9922
> Epoch: 10, Iter: 800, Train Loss: 0.20165772217976913, Train Accuracy: 1.0, 
> Val Loss: 0.027878978005867083, Val Accuracy: 0.9922
> 17/02/14 16:06:58 INFO DMLScript: SystemML Statistics:
> Total elapsed time: 12687.863 sec.
> Total compilation time: 2.168 sec.
> Total execution time: 12685.694 sec.
> Number of compiled Spark inst: 147.
> Number of executed Spark inst: 4.
> Cache hits (Mem, WB, FS, HDFS): 1096424/0/0/2.
> Cache writes (WB, FS, HDFS): 603950/15/8.
> Cache times (ACQr/m, RLS, EXP): 3.704/0.336/61.831/1.242 sec.
> HOP DAGs recompiled (PRED, SB): 0/154885.
> HOP DAGs recompile time: 28.663 sec.
> Functions recompiled: 1.
> Functions recompile time: 0.024 sec.
> Spark ctx create time (lazy): 1.009 sec.
> Spark trans counts (par,bc,col):0/0/2.
> Spark trans times (par,bc,col): 0.000/0.000/3.433 secs.
> Total JIT compile time: 44.711 sec.
> Total JVM GC count: 7459.
> Total JVM GC time: 166.26 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) train 12138.979 sec 1
> -- 2) conv2d_bias_add 10876.708 sec 17362
> -- 3) conv2d_backward_filter 421.303 sec 17200
> -- 4) sel+ 239.660 sec 25881
> -- 5) update 226.687 sec 68800
> -- 6) update_nesterov 223.775 sec 68800
> -- 7) maxpooling_backward 136.709 sec 17200
> -- 8) conv2d_backward_data 134.315 sec 8600
> -- 9) ba+* 118.897 sec 51762
> -- 10) relu_maxpooling 112.283 sec 17362
> -- 11) relu_backward 107.483 sec 34400
> -- 12) uack+ 89.258 sec 34400
> -- 13) r' 74.304 sec 43000
> -- 14) +* 57.193 sec 34400
> -- 15) * 16.493 sec 95178
> -- 16) rand 16.038 sec 8613
> -- 17) / 8.352 sec 86492
> -- 18) rangeReIndex 6.628 sec 17208
> -- 19) + 3.054 sec 96528
> -- 20) uark+ 2.219 sec 43241
> -- 21) sp_csvrblk 2.183 sec 2
> -- 22) rmvar 1.517 sec 1451571
> -- 23) write 1.250 sec 9
> -- 24) - 1.059 sec 86486
> -- 25) createvar 1.026 sec 587259
> -- 26) exp 0.663 sec 17281
> -- 27) *2 0.361 sec 2
> -- 28) uasqk+ 0.277 sec 320
> -- 29) log 0.200 sec 160
> -- 30) uarmax 0.191 sec 17281
> 
> With 5G driver memory, the statistics after running 10 epochs are as follows:
> Epoch: 10, Iter: 700, Train Loss: 0.19313544015858036, Train Accuracy: 1.0, 
> Val Loss: 0.025943927403263182, Val Accuracy: 0.993
> Epoch: 10, Iter: 800, Train Loss: 0.1883995965207449, Train Accuracy: 1.0, 
> Val Loss: 0.0260796819319468, Val Accuracy: 0.9916
> 17/02/14 20:16:40 INFO DMLScript: SystemML Statistics:
> Total elapsed time: 13886.763 sec.
> Total compilation time: 2.148 sec.
> Total execution time: 13884.615 sec.
> Number of compiled Spark inst: 147.
> Number of executed Spark inst: 4.
> Cache hits (Mem, WB, FS, HDFS): 1096422/0/2/2.
> Cache writes (WB, FS, HDFS): 603868/2176/8.
> Cache times (ACQr/m, RLS, EXP): 3.883/0.343/271.757/1.312 sec.
> HOP DAGs recompiled (PRED, SB): 0/154885.
> HOP DAGs recompile time: 28.290 sec.
> Functions recompiled: 1.
> Functions recompile time: 0.023 sec.
> Spark ctx create time (lazy): 0.981 sec.
> Spark trans counts (par,bc,col):0/0/2.
> Spark trans times (par,bc,col): 0.000/0.000/3.501 secs.
> Total JIT compile time: 45.131 sec.
> Total JVM GC count: 7605.
> Total JVM GC time: 157.716 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) train 13301.811 sec 1
> -- 2) conv2d_bias_add 11890.291 sec 17362
> -- 3) conv2d_backward_filter 416.645 sec 17200
> -- 4) ba+* 252.966 sec 51762
> -- 5) sel+ 237.334 sec 25881
> -- 6) update 228.261 sec 68800
> -- 7) update_nesterov 225.383 sec 68800
> -- 8) maxpooling_backward 134.260 sec 17200
> -- 9) +* 133.959 sec 34400
> -- 10) conv2d_backward_data 128.046 sec 8600
> -- 11) relu_maxpooling 106.499 sec 17362
> -- 12) relu_backward 104.062 sec 34400
> -- 13) uack+ 90.104 sec 34400
> -- 14) r' 70.932 sec 43000
> -- 15) * 16.203 sec 95178
> -- 16) rand 16.131 
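
The two runs quoted above can be compared with a little arithmetic. The following is only a minimal sketch: the helper name `parse_stats` and the variable names are made up for illustration, and it assumes the four slash-separated values on the "Cache times" line are, in order, acquire-read, acquire-modify, release (RLS) and export (EXP). Only the two statistics lines it needs are copied from each dump.

```python
import re

# Lines copied verbatim from the two statistics dumps quoted above.
STATS_20G = """\
Total execution time: 12685.694 sec.
Cache times (ACQr/m, RLS, EXP): 3.704/0.336/61.831/1.242 sec.
"""
STATS_5G = """\
Total execution time: 13884.615 sec.
Cache times (ACQr/m, RLS, EXP): 3.883/0.343/271.757/1.312 sec.
"""


def parse_stats(text):
    """Hypothetical helper: extract execution time and cache times from a dump."""
    exec_time = float(
        re.search(r"Total execution time:\s*([\d.]+) sec", text).group(1)
    )
    cache = re.search(
        r"Cache times \(ACQr/m, RLS, EXP\):\s*([\d./]+) sec", text
    ).group(1)
    # Assumption: value order is acquire-read, acquire-modify, release, export.
    acq_r, acq_m, rls, exp = (float(v) for v in cache.split("/"))
    return {"exec": exec_time, "acq_r": acq_r, "acq_m": acq_m,
            "rls": rls, "exp": exp}


big = parse_stats(STATS_20G)    # 20G driver memory
small = parse_stats(STATS_5G)   # 5G driver memory

# How much longer cache release takes with the smaller memory budget.
rls_ratio = small["rls"] / big["rls"]

# Cache release time as a fraction of total execution time in each run.
rls_share_big = big["rls"] / big["exec"]
rls_share_small = small["rls"] / small["exec"]

# Extra seconds spent overall, and in cache release, in the 5G run.
extra_exec = small["exec"] - big["exec"]
extra_rls = small["rls"] - big["rls"]

print(f"RLS ratio (5G / 20G): {rls_ratio:.2f}x")
print(f"RLS share of execution: 20G {rls_share_big:.2%}, 5G {rls_share_small:.2%}")
print(f"Extra execution time: {extra_exec:.3f} sec, of which RLS: {extra_rls:.3f} sec")
```

On these figures, cache release time grows several-fold when the driver memory drops from 20G to 5G, while staying a small share of total execution time in both runs (roughly 0.5% and 2%). This is a reading of two single runs, not a benchmark.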

Re: SystemML 0.13 Release plan

2017-02-16 Thread Arvind Surve
We will plan to have a branch based on the 0.13 tag.

--
Arvind Surve
Spark Technology Center
http://www.spark.tc/

From: Luciano Resende
To: dev@systemml.incubator.apache.org; Arvind Surve
Sent: Thursday, February 16, 2017 9:56 AM
Subject: Re: SystemML 0.13 Release plan

What do you mean by frozen? Will a branch be created from the 0.13 RC tag
if needed, or will master be frozen?

On Thu, Feb 16, 2017 at 8:29 AM, Arvind Surve wrote:

> Hi,
> We are planning to get the SystemML 0.13 release out. One of the major
> purposes of this release is to support SystemML on Spark 2.1.0. We plan to
> do a Release Candidate (RC) build by the end of this week. After the RC
> build, verification will be done to get the code ready to publish.
> Once the RC build starts, the code for SystemML 0.13 will be frozen except
> for stop-shipment issues. If you are working on any critical issues,
> please try to get your changes in by EOD Friday (2/17/2017) PST.
>
> Thanks,
> Arvind
> --
> Arvind Surve
> Spark Technology Center
> http://www.spark.tc/




-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


   

Re: SystemML 0.13 Release plan

2017-02-16 Thread Luciano Resende
What do you mean by frozen? Will a branch be created from the 0.13 RC tag
if needed, or will master be frozen?

On Thu, Feb 16, 2017 at 8:29 AM, Arvind Surve wrote:

> Hi,
> We are planning to get the SystemML 0.13 release out. One of the major
> purposes of this release is to support SystemML on Spark 2.1.0. We plan to
> do a Release Candidate (RC) build by the end of this week. After the RC
> build, verification will be done to get the code ready to publish.
> Once the RC build starts, the code for SystemML 0.13 will be frozen except
> for stop-shipment issues. If you are working on any critical issues,
> please try to get your changes in by EOD Friday (2/17/2017) PST.
>
> Thanks,
> Arvind
> --
> Arvind Surve
> Spark Technology Center
> http://www.spark.tc/




-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


SystemML 0.13 Release plan

2017-02-16 Thread Arvind Surve
Hi,
We are planning to get the SystemML 0.13 release out. One of the major purposes 
of this release is to support SystemML on Spark 2.1.0. We plan to do a Release 
Candidate (RC) build by the end of this week. After the RC build, verification 
will be done to get the code ready to publish.
Once the RC build starts, the code for SystemML 0.13 will be frozen except for 
stop-shipment issues. If you are working on any critical issues, please try to 
get your changes in by EOD Friday (2/17/2017) PST.

Thanks,
Arvind

--
Arvind Surve
Spark Technology Center
http://www.spark.tc/