[jira] [Commented] (SYSTEMML-1875) Support for ppc64le in Caffe2DML

2017-09-08 Thread Nakul Jindal (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16159765#comment-16159765
 ] 

Nakul Jindal commented on SYSTEMML-1875:


Sure [~reinwald], the only reason I put them here instead of in {{devdocs}} is 
that they have nothing to do with SystemML directly.

> Support for ppc64le in Caffe2DML
> 
>
> Key: SYSTEMML-1875
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1875
> Project: SystemML
>  Issue Type: Bug
>Reporter: Nakul Jindal
> Fix For: SystemML 1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SYSTEMML-1875) Support for ppc64le in Caffe2DML

2017-09-08 Thread Berthold Reinwald (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16159764#comment-16159764
 ] 

Berthold Reinwald commented on SYSTEMML-1875:
-

Could you please save these instructions in a Markdown doc in the repo? Thanks.

> Support for ppc64le in Caffe2DML
> 
>
> Key: SYSTEMML-1875
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1875
> Project: SystemML
>  Issue Type: Bug
>Reporter: Nakul Jindal
> Fix For: SystemML 1.0
>
>






[jira] [Updated] (SYSTEMML-1724) Remove Guava from compile-time dependencies

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1724:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Remove Guava from compile-time dependencies
> ---
>
> Key: SYSTEMML-1724
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1724
> Project: SystemML
>  Issue Type: Improvement
>  Components: Build
>Reporter: Dylan Hutchison
>Assignee: Dylan Hutchison
>Priority: Minor
> Fix For: SystemML 0.15
>
>
> SYSTEMML-1663 reintroduced Guava as a compile-time dependency into SystemML 
> during [PR 540|https://github.com/apache/systemml/pull/540]. Let's remove it 
> to reduce the compile-time memory footprint, as per SYSTEMML-698.





[jira] [Updated] (SYSTEMML-1083) Fix spelling of "Demonstration" in VLDB award

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1083:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.11

> Fix spelling of "Demonstration" in VLDB award
> -
>
> Key: SYSTEMML-1083
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1083
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Website
>Reporter: Jeremy Anderson
>Assignee: Jason Azares
>Priority: Minor
> Fix For: SystemML 0.11
>
>






[jira] [Updated] (SYSTEMML-1549) Cox.dml - return S & T in usable format

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1549:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Cox.dml - return S & T in usable format
> ---
>
> Key: SYSTEMML-1549
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1549
> Project: SystemML
>  Issue Type: Improvement
>  Components: Algorithms
>Reporter: Brendan Dwyer
>Assignee: Brendan Dwyer
>Priority: Minor
> Fix For: SystemML 0.15
>
>
> Variables S & T are returned as strings. They should also be returned as a 
> matrix like R4ML 
> [does|https://github.com/SparkTC/r4ml/blob/master/R4ML/inst/sysml/scripts/algorithms/Cox.dml].





[jira] [Updated] (SYSTEMML-1600) Display version in MLContext welcome message

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1600:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Display version in MLContext welcome message
> 
>
> Key: SYSTEMML-1600
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1600
> Project: SystemML
>  Issue Type: Improvement
>  Components: APIs
>Reporter: Deron Eriksson
>Assignee: Krishna Kalyan
>Priority: Minor
> Fix For: SystemML 0.15
>
>
> Append SystemML version number to MLContext welcome message. It is available 
> via the MLContext version() method.





[jira] [Updated] (SYSTEMML-1668) Wrong worst-case estimates for rbind in BinaryOp

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1668:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Wrong worst-case estimates for rbind in BinaryOp
> 
>
> Key: SYSTEMML-1668
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1668
> Project: SystemML
>  Issue Type: Bug
>  Components: Compiler
>Affects Versions: SystemML 0.14
>Reporter: Dylan Hutchison
>Assignee: Dylan Hutchison
>Priority: Minor
> Fix For: SystemML 0.15
>
>
> In {{BinaryOp.inferOutputCharacteristics}}, RBIND is not checked and CBIND is 
> checked twice. The second case should refer to RBIND.
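
The distinction the report asks for can be sketched as follows. This is a minimal illustration in Python, not SystemML's Java code: the helper name `infer_bind_dims` and the `(rows, cols)` tuple representation are hypothetical and are not taken from `BinaryOp`. It only shows that CBIND and RBIND need separate cases, because one grows the column count and the other grows the row count.

```python
def infer_bind_dims(op, left, right):
    """Return (rows, cols) of the result of cbind/rbind on two matrices.

    `left` and `right` are (rows, cols) tuples. Hypothetical helper for
    illustration only; not SystemML's actual implementation.
    """
    (r1, c1), (r2, c2) = left, right
    if op == "cbind":
        # Column-wise append: row counts must match, column counts add up.
        if r1 != r2:
            raise ValueError("cbind requires equal row counts")
        return (r1, c1 + c2)
    if op == "rbind":
        # Row-wise append: column counts must match, row counts add up.
        if c1 != c2:
            raise ValueError("rbind requires equal column counts")
        return (r1 + r2, c1)
    raise ValueError(f"unsupported op: {op}")
```

If both branches tested for `cbind`, the second would never be reached and an rbind would fall through without a size estimate, which matches the symptom described above.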





[jira] [Updated] (SYSTEMML-1737) BufferedReader should be closed in ParameterizedBuiltinCPFileInstruction#createCellResultFile()

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1737:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> BufferedReader should be closed in 
> ParameterizedBuiltinCPFileInstruction#createCellResultFile()
> ---
>
> Key: SYSTEMML-1737
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1737
> Project: SystemML
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: SystemML 0.15
>
>
> {code}
>   BufferedReader fkeyMap = StagingFileUtils.openKeyMap(metaOut);
> {code}
> BufferedReader should be closed upon exit from method.





[jira] [Updated] (SYSTEMML-1455) Change the term PLAIN_R2 to R2 in all algorithms

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1455:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Change the term PLAIN_R2 to R2 in all algorithms
> 
>
> Key: SYSTEMML-1455
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1455
> Project: SystemML
>  Issue Type: Improvement
>  Components: Algorithms
>Reporter: Imran Younus
>Assignee: Krishna Kalyan
>Priority: Minor
> Fix For: SystemML 0.15
>
>
> In some of the regression algorithms, we return several metrics. One of these 
> is R2, but we call it PLAIN_R2. This is unconventional; we should just call 
> it R2. I've never seen the term PLAIN_R2 in any book, paper, or software. 
> There is R2 and Adjusted R2. I think it would be better to use the 
> conventional terminology as much as possible.
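
For reference, the two conventional metrics can be sketched as below. This is plain Python using the textbook definitions, not the DML from the regression scripts; the function names are hypothetical, and the adjusted form assumes a model with an intercept and `num_features` predictors.

```python
def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(y, y_hat, num_features):
    """R2 penalized for the number of predictors.

    Assumes an intercept term, hence the (n - num_features - 1)
    degrees of freedom in the denominator.
    """
    n = len(y)
    return 1.0 - (1.0 - r2(y, y_hat)) * (n - 1) / (n - num_features - 1)
```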





[jira] [Updated] (SYSTEMML-1380) Kmeans isY and verb parameters can be boolean

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1380:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Kmeans isY and verb parameters can be boolean
> -
>
> Key: SYSTEMML-1380
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1380
> Project: SystemML
>  Issue Type: Improvement
>  Components: Algorithms
>Reporter: Deron Eriksson
>Assignee: Krishna Kalyan
>Priority: Minor
> Fix For: SystemML 0.15
>
>
> In the Kmeans.dml script, the 'isY' and 'verb' input parameters are integers. 
> However, they are basically on/off switches, so replacing the integer types 
> with boolean types could potentially be a little clearer to users.





[jira] [Updated] (SYSTEMML-1774) Improve Parfor parallelism for deep learning

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1774:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Improve Parfor parallelism for deep learning
> 
>
> Key: SYSTEMML-1774
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1774
> Project: SystemML
>  Issue Type: Improvement
>  Components: Algorithms, Compiler, ParFor
>Affects Versions: SystemML 1.0
>Reporter: Fei Hu
>Assignee: Fei Hu
>  Labels: deeplearning
> Fix For: SystemML 0.15
>
> Attachments: Explain_For_HYBRID_SPARK_Mode_With_ErrorInfo.txt, 
> Explain_For_Spark_Mode.txt, MNIST_Distrib_Sgd.scala, 
> mnist_lenet_distrib_sgd.dml
>
>
> When running the [distributed MNIST LeNet example | 
> https://github.com/apache/systemml/blob/master/scripts/nn/examples/mnist_lenet_distrib_sgd.dml],
>  each mini-batch could ideally run in parallel without interaction. We tried to 
> force {{parfor (j in 1:parallel_batches)}} at line 137 of 
> {{nn/examples/mnist_lenet_distrib_sgd.dml}} to use {{REMOTE_SPARK}} mode by 
> changing it to {{parfor (j in 1:parallel_batches, mode=REMOTE_SPARK, 
> opt=CONSTRAINED)}}, but got the error 
> {{org.apache.sysml.runtime.DMLRuntimeException: Not supported: Instructions 
> of type other than CP instructions}} when using the mode {{SPARK}}, and the error 
> {{java.lang.NullPointerException}} when using the mode {{HYBRID_SPARK}}. More log 
> information can be found in the comments below.





[jira] [Updated] (SYSTEMML-1736) Add new 2D top_k utility function

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1736:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add new 2D top_k utility function
> -
>
> Key: SYSTEMML-1736
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1736
> Project: SystemML
>  Issue Type: New Feature
>Reporter: Mike Dusenberry
>Assignee: Fei Hu
> Fix For: SystemML 0.15
>
>
> We should add a new {{top_k2d}} utility function (in {{nn/util.dml}}) that 
> accepts a matrix {{X}} and returns matrices {{values}} and {{indices}} with 
> the top {{k}} values (i.e. probabilities) and associated indices (i.e. 
> classes) along a certain dimension.  This will be modeled after the 
> [{{top_k}} function in TensorFlow | 
> https://www.tensorflow.org/api_docs/python/tf/nn/top_k].  For the 2D case, 
> {{top_k}} will operate on the channels dimension.  A typical use case here is 
> that in which {{X}} is the output of a {{softmax2d}} layer (so each channel 
> contains a set of normalized class probabilities), and {{values}} and 
> {{indices}} will contain the top {{k}} probabilities and indices along the 
> channel axis.  This scenario would be common in an image segmentation 
> problem, in which every pixel of the output image will have a set of class 
> probabilities along the channel axis.
> Having these {{top-k}} functions will allow us to either predict a 
> single class for each item, or the top {{k}} classes, and therefore may be 
> more useful than a {{predict_class}} function.
> Although we will use {{values}} and {{indices}} as the names of the returned 
> matrices within the functions, in practice, one is likely to name the results 
> {{probs}} and {{classes}} in the calling environment.
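
The behaviour described above can be sketched in plain Python. This is an illustration under stated assumptions, not the DML that would go into `nn/util.dml`: it handles a single example stored as a list of `C` channels, each a flat list of `H*W` pixel values, and it returns 0-based channel indices, whereas DML indexing is 1-based.

```python
def top_k2d(X, k):
    """Top-k values and channel indices along the channel axis.

    X is a list of C channels, each a list of P pixel values (one example,
    with H*W flattened to P). Returns (values, indices), each a list of k
    lists of length P, ordered by descending value. Indices are 0-based.
    Hypothetical layout chosen for illustration.
    """
    C, P = len(X), len(X[0])
    if not 1 <= k <= C:
        raise ValueError("k must be between 1 and the number of channels")
    values = [[0.0] * P for _ in range(k)]
    indices = [[0] * P for _ in range(k)]
    for p in range(P):
        # Channel ids ordered by descending score at pixel p.
        order = sorted(range(C), key=lambda c: X[c][p], reverse=True)
        for j in range(k):
            indices[j][p] = order[j]
            values[j][p] = X[order[j]][p]
    return values, indices
```

With `k=1` this reduces to predicting a single class per pixel, which is the `predict_class` case mentioned in the description.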





[jira] [Updated] (SYSTEMML-1677) Add a new 2D cross-entropy layer

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1677:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add a new 2D cross-entropy layer
> 
>
> Key: SYSTEMML-1677
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1677
> Project: SystemML
>  Issue Type: New Feature
>  Components: Algorithms
>Reporter: Mike Dusenberry
>Assignee: Fei Hu
> Fix For: SystemML 0.15
>
>






[jira] [Updated] (SYSTEMML-1676) Add a new 2D softmax layer

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1676:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add a new 2D softmax layer
> --
>
> Key: SYSTEMML-1676
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1676
> Project: SystemML
>  Issue Type: New Feature
>  Components: Algorithms
>Reporter: Mike Dusenberry
>Assignee: Fei Hu
> Fix For: SystemML 0.15
>
>
> A 2D softmax layer would accept a tensor of shape {{(N,C,H,W)}}, where the 
> {{C}} axis contains scores for {{D}} classes, and output a tensor of the same 
> shape, with the scores transformed to normalized probabilities.  The typical 
> use case would be a segmentation problem, in which every pixel has a 
> multiclass prediction.
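
As a rough sketch of the computation, assuming a single example stored as a list of `C` channels with the `H*W` pixels flattened (a hypothetical layout, not the one the DML layer would necessarily use), the softmax is applied independently at each pixel across the channel axis:

```python
import math


def softmax2d(X):
    """Softmax along the channel axis, applied independently per pixel.

    X is a list of C channels, each a list of P pixel scores. The result
    has the same shape, and for every pixel the C outputs sum to 1.
    """
    C, P = len(X), len(X[0])
    out = [[0.0] * P for _ in range(C)]
    for p in range(P):
        # Subtract the per-pixel maximum for numerical stability.
        m = max(X[c][p] for c in range(C))
        exps = [math.exp(X[c][p] - m) for c in range(C)]
        total = sum(exps)
        for c in range(C):
            out[c][p] = exps[c] / total
    return out
```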





[jira] [Updated] (SYSTEMML-1694) Add snapshot version number to docs header

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1694:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add snapshot version number to docs header
> --
>
> Key: SYSTEMML-1694
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1694
> Project: SystemML
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Deron Eriksson
>Assignee: Gus Jenkins
> Fix For: SystemML 0.15
>
>
> Currently the latest snapshot documentation 
> (http://apache.github.io/systemml/) has "Latest" in the header. "Latest" can 
> be a little confusing since it's hard to tell whether this is the latest 
> release version (currently 0.14.0-incubating) or the latest snapshot version 
> (1.0.0-SNAPSHOT).
> We should probably change "Latest" to either something like:
>   Latest (1.0.0-SNAPSHOT)
>   or
>   1.0.0-SNAPSHOT





[jira] [Updated] (SYSTEMML-1647) Verify whether StepLinearReg script works with MLContext

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1647:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Verify whether StepLinearReg script works with MLContext
> 
>
> Key: SYSTEMML-1647
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1647
> Project: SystemML
>  Issue Type: Improvement
>  Components: Algorithms
>Reporter: Imran Younus
>Assignee: Imran Younus
> Fix For: SystemML 0.15
>
>
> This JIRA plans to fix the StepLinearReg script so that it works with the new 
> MLContext. Currently it does not work with the new MLContext.





[jira] [Updated] (SYSTEMML-1679) Add a new threshold utility function

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1679:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add a new threshold utility function
> 
>
> Key: SYSTEMML-1679
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1679
> Project: SystemML
>  Issue Type: New Feature
>Reporter: Mike Dusenberry
>Assignee: Fei Hu
> Fix For: SystemML 0.15
>
>
> We should add a new {{threshold}} utility function (in {{nn/util.dml}}) that 
> accepts a matrix {{X}} and a threshold parameter {{thresh}} and returns an 
> indicator matrix {{out}} with values in \{0, 1\} depending on whether or not 
> the values in {{X}} are above {{thresh}}.  We could use this, for example, 
> for determining the predicted class in a binary classification problem given 
> the output of a sigmoid layer.
> We should also add a test case in {{nn/test/test.dml}}.
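
A minimal sketch of the proposed function, written in Python over nested lists rather than DML. Treating "above" as a strict `>` comparison is an assumption; the issue text does not say how values equal to `thresh` should be handled.

```python
def threshold(X, thresh):
    """Return an indicator matrix: 1.0 where X > thresh, else 0.0.

    X is a list of rows. The strict comparison is an assumption made
    for this sketch.
    """
    return [[1.0 if v > thresh else 0.0 for v in row] for row in X]
```

For example, applying `threshold(probs, 0.5)` to sigmoid outputs would give the predicted class for each entry in a binary classification problem.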





[jira] [Updated] (SYSTEMML-1608) Add ALS notebook example

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1608:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add ALS notebook example
> 
>
> Key: SYSTEMML-1608
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1608
> Project: SystemML
>  Issue Type: Task
>Reporter: Imran Younus
>Assignee: Imran Younus
> Fix For: SystemML 0.15
>
>






[jira] [Updated] (SYSTEMML-1567) Remove conditionals from nn layers

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1567:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Remove conditionals from nn layers
> --
>
> Key: SYSTEMML-1567
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1567
> Project: SystemML
>  Issue Type: Improvement
>  Components: APIs
>Affects Versions: SystemML 1.0
>Reporter: Niketan Pansare
> Fix For: SystemML 0.15
>
>
> Conditionals in nn layers introduce transient read/write variables that 
> disable fused operators such as CP relu_maxpooling_backward and hence 
> redundantly execute the sparsity-introducing sel+ operator. This operator causes 
> unnecessary dense-to-sparse-to-dense conversion and becomes the heavy hitter 
> after the native BLAS change. Note: some fused operators such as CP 
> relu_maxpooling are still applied because there is no conditional in between 
> those layers.
> Without conditionals in dropout layer: 
> https://github.com/apache/incubator-systemml/blob/master/scripts/nn/layers/dropout.dml#L49-L53
>  
> {code}
> Iter:2000.0, training loss:0.003149394810197065, training accuracy:100.0
> Iter:2000.0, validation loss:191.9888157354513, validation accuracy:96.875
> SystemML Statistics:
> Total elapsed time: 416.609 sec.
> Total compilation time: 0.000 sec.
> Total execution time:   416.609 sec.
> Number of compiled Spark inst:  69.
> Number of executed Spark inst:  2.
> Native mkl calls (LibMatrixMult/LibMatrixDNN):  4270/10553.
> Cache hits (Mem, WB, FS, HDFS): 277973/0/0/0.
> Cache writes (WB, FS, HDFS):    143616/0/0.
> Cache times (ACQr/m, RLS, EXP): 0.101/0.080/1.988/0.000 sec.
> HOP DAGs recompiled (PRED, SB): 0/2277.
> HOP DAGs recompile time:    6.146 sec.
> Spark ctx create time (lazy):   0.027 sec.
> Spark trans counts (par,bc,col):    0/0/0.
> Spark trans times (par,bc,col): 0.000/0.000/0.000 secs.
> Total JIT compile time: 37.746 sec.
> Total JVM GC count: 3949.
> Total JVM GC time:  56.609 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)   conv2d_bias_add 48.984 sec  4514
> -- 2)   conv2d_backward_filter  47.780 sec  4026
> -- 3)   -*  38.246 sec  16104
> -- 4)   +*  35.902 sec  8052
> -- 5)   +   34.227 sec  30566
> -- 6)   ba+*    30.643 sec  12566
> -- 7)   relu_maxpooling_backward    29.678 sec  4026
> -- 8)   conv2d_backward_data    28.520 sec  2013
> -- 9)   *   26.825 sec  35275
> -- 10)  relu_backward   24.842 sec  6039
> {code}
> With the conditional, sel+ is added to the heavy hitters:
> {code}
> -- 1)   sel+    55.054 sec  6283
> {code}
> [~mwdus...@us.ibm.com] Since you created the layers, I think you should 
> decide how best to restructure the DML. My recommendation would be to create 
> two layers in case of conditionals.
> [~mboehm7] [~reinwald]





[jira] [Updated] (SYSTEMML-1585) Include JCuda jars into SystemML's extra.jar

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1585:

Fix Version/s: (was: SystemML 1.0)
   Not Applicable

> Include JCuda jars into SystemML's extra.jar
> 
>
> Key: SYSTEMML-1585
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1585
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Niketan Pansare
> Fix For: Not Applicable
>
>






[jira] [Updated] (SYSTEMML-1469) Add a new `conv2d_transpose` layer.

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1469:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add a new `conv2d_transpose` layer.
> ---
>
> Key: SYSTEMML-1469
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1469
> Project: SystemML
>  Issue Type: New Feature
>Reporter: Mike Dusenberry
>Assignee: Prithviraj Sen
> Fix For: SystemML 0.15
>
>
> A conv2d transpose layer is the gradient of a conv2d layer.





[jira] [Updated] (SYSTEMML-1068) Add Code Highlighting

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1068:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Add Code Highlighting
> -
>
> Key: SYSTEMML-1068
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1068
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Website
>Reporter: Mike Dusenberry
>Assignee: Dexter Lesaca
> Fix For: SystemML 0.14
>
>
> For our tutorials, it would be nice to have code syntax highlighting to make 
> it easier to understand the code snippets.  Jekyll supports this feature 
> \[1], as do a number of other libraries.  At a minimum, we should have R, 
> Python, and Scala syntax highlighting.
> \[1]: https://jekyllrb.com/docs/posts/#highlighting-code-snippets





[jira] [Updated] (SYSTEMML-230) Blockwise data partitioning

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-230:
---
Fix Version/s: (was: SystemML 1.0)
   Not Applicable

> Blockwise data partitioning
> ---
>
> Key: SYSTEMML-230
> URL: https://issues.apache.org/jira/browse/SYSTEMML-230
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Frederick Reiss
> Fix For: Not Applicable
>
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>






[jira] [Updated] (SYSTEMML-515) 'Sparsity' Parameter For `rand` Statement Should Allow An Expression

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-515:
---
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.13

> 'Sparsity' Parameter For `rand` Statement Should Allow An Expression
> 
>
> Key: SYSTEMML-515
> URL: https://issues.apache.org/jira/browse/SYSTEMML-515
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
> Fix For: SystemML 0.13
>
>
> The {{rand(...)}} function has a {{sparsity}} parameter that allows one to 
> specify the desired sparsity of the generated matrix as in
> {code}
> X = rand(rows=10, cols=20, min=0, max=1, pdf="uniform", sparsity=0.2)
> {code}.
> Currently, the {{sparsity}} parameter only accepts {{Literal}} inputs, i.e. 
> hard-coded double values, or variables that are themselves hard-coded double 
> values.  It would be better to allow an expression that evaluates to a double 
> value, such as the following, simple, contrived example:
> {code}
> s = log(0.2)
> X = rand(rows=10, cols=20, min=0, max=1, pdf="uniform", sparsity=s)
> {code}.





[jira] [Updated] (SYSTEMML-1415) Rename `nn/layers/max_pool.dml` to `nn/layers/max_pool2d.dml`

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1415:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Rename `nn/layers/max_pool.dml` to `nn/layers/max_pool2d.dml`
> -
>
> Key: SYSTEMML-1415
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1415
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
>Priority: Minor
> Fix For: SystemML 0.14
>
>
> Note that this breaks the current API.   This is fine though since the {{nn}} 
> library is currently in staging.





[jira] [Updated] (SYSTEMML-1416) Rename `nn/layers/conv_builtin.dml` to `nn/layers/conv2d_builtin.dml`

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1416:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Rename `nn/layers/conv_builtin.dml` to `nn/layers/conv2d_builtin.dml`
> -
>
> Key: SYSTEMML-1416
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1416
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
>Priority: Minor
> Fix For: SystemML 0.14
>
>
> Note that this breaks the current API.   This is fine though since the {{nn}} 
> library is currently in staging.





[jira] [Updated] (SYSTEMML-1417) Rename `nn/layers/max_pool_builtin.dml` to `nn/layers/max_pool2d_builtin.dml`

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1417:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Rename `nn/layers/max_pool_builtin.dml` to `nn/layers/max_pool2d_builtin.dml`
> -
>
> Key: SYSTEMML-1417
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1417
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
>Priority: Minor
> Fix For: SystemML 0.14
>
>
> Note that this breaks the current API.   This is fine though since the {{nn}} 
> library is currently in staging.





[jira] [Updated] (SYSTEMML-1766) Merge experimental breast cancer project code into main repo

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1766:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Merge experimental breast cancer project code into main repo
> 
>
> Key: SYSTEMML-1766
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1766
> Project: SystemML
>  Issue Type: New Feature
>  Components: Algorithms
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
>
> This aims to consolidate and clean up experimental breast cancer project code, 
> and move it into the main repo.





[jira] [Updated] (SYSTEMML-1813) Preprocessing simplification and cleanup

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1813:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Preprocessing simplification and cleanup
> 
>
> Key: SYSTEMML-1813
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1813
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
>
> In anticipation of near-future algorithmic improvements to the preprocessing 
> to improve model training, this simplifies and cleans up the preprocessing 
> code as follows.
> - Previously, we were processing all slides into one large saved
> DataFrame, and then splitting that DataFrame into train and validation
> DataFrames.  We should simplify this by splitting the slide numbers
> into train and validation sets, and then processing those slides
> separately.  This will effectively skip the creation of the large DataFrame,
> and remove the need to split that large DataFrame into train/val ones,
> which should provide a large performance benefit.  The DataFrame `union`
> method can be used to combine two DataFrames row-wise.
> - Previously, we maintained a list of "broken" slides that were manually
> removed.  We should remove that manual list, and instead add a
> try/except filtering step to automatically remove problematic slides.
> - We should move ad-hoc sampling code into a new `sample` function.
> - We should move code to add row indices to a DataFrame into a new
> `add_row_indices` function.
> The benefit is that near-future algorithmic improvements to the
> preprocessing code will be much easier to incorporate.
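
The first two points above can be sketched as follows. This is a simplified, Spark-free illustration in plain Python: the function names, the `train_frac` and `seed` parameters, and the `process_slide` callback are all hypothetical and are not the project's actual API. The idea is to split the slide numbers first, then process each split separately while skipping slides that raise an error.

```python
import random


def split_slide_nums(slide_nums, train_frac=0.8, seed=42):
    """Shuffle slide numbers and split them into (train, val) lists."""
    nums = list(slide_nums)
    random.Random(seed).shuffle(nums)
    cut = int(len(nums) * train_frac)
    return nums[:cut], nums[cut:]


def process_slides(slide_nums, process_slide):
    """Process each slide, automatically skipping problematic ones.

    `process_slide` is a caller-supplied function (hypothetical here) that
    returns a list of rows for one slide and raises on a broken slide.
    Returns (rows, skipped_slide_nums).
    """
    rows, skipped = [], []
    for n in slide_nums:
        try:
            rows.extend(process_slide(n))
        except Exception:
            # Replaces the manually maintained list of "broken" slides.
            skipped.append(n)
    return rows, skipped
```

In the Spark setting described above, the per-slide results would be DataFrames combined with `union` rather than Python lists extended in place.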





[jira] [Updated] (SYSTEMML-1674) Add a new 2D depthwise convolution layer

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1674:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add a new 2D depthwise convolution layer
> 
>
> Key: SYSTEMML-1674
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1674
> Project: SystemML
>  Issue Type: New Feature
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
>
> A depthwise convolution (1) applies a different set of M filters to each 
> input channel separately, thus expanding each input channel to M output 
> channels, and (2) concatenates the results into a single volume with C*M 
> output channels. This is in contrast to a regular 2D convolution, in which 
> all of the filters would be applied to all of the input channels at once.
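
The channel bookkeeping can be illustrated with a small pure-Python sketch. It assumes a single example, stride 1, no padding, and a hypothetical nested-list layout in which `filters[c]` holds the `M` filters for input channel `c`; it is not the DML layer itself.

```python
def conv2d_single(img, filt):
    """Valid cross-correlation of one 2D channel with one 2D filter, stride 1."""
    H, W = len(img), len(img[0])
    Hf, Wf = len(filt), len(filt[0])
    return [[sum(img[i + a][j + b] * filt[a][b]
                 for a in range(Hf) for b in range(Wf))
             for j in range(W - Wf + 1)]
            for i in range(H - Hf + 1)]


def depthwise_conv2d(X, filters):
    """Depthwise convolution of one example.

    X is a list of C channels (each a 2D list); filters[c] is the list of
    M filters applied only to channel c. The C*M outputs are concatenated
    in channel-major order.
    """
    out = []
    for c, channel in enumerate(X):
        for filt in filters[c]:
            out.append(conv2d_single(channel, filt))
    return out
```

With `C = 2` input channels and `M = 2` filters per channel the result has 4 output channels, and no output channel mixes information from different input channels.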





[jira] [Updated] (SYSTEMML-1575) DataType Change Test Failure

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1575:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> DataType Change Test Failure
> 
>
> Key: SYSTEMML-1575
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1575
> Project: SystemML
>  Issue Type: Bug
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
>
> While working on SYSTEMML-1554, an additional bug was uncovered. 
> Specifically, with the IPA scalar replacement enhancement, the 
> {{org.apache.sysml.test.integration.functions.misc.DataTypeChangeTest#testDataTypeChangeValidate4c}}
>  test has started to fail.  Looking into it, it fails due to trying to cast a 
> Matrix to a Scalar object.  At a deeper level, it looks like the propagated 
> variable map is holding onto the "matrix" `X`, rather than dropping it as it 
> should, since X is turned into a scalar by the call `X = foo(X)`.  
> Interestingly, the FunctionOp for the `foo` function is marked as having an 
> `Unknown` datatype and valuetype.  Overall, this seems like a bug that was 
> just hidden before, rather than being newly introduced.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1675) Add a new 2D depthwise transpose convolution layer

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1675:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add a new 2D depthwise transpose convolution layer
> --
>
> Key: SYSTEMML-1675
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1675
> Project: SystemML
>  Issue Type: New Feature
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
>
> A depthwise transpose convolution (1) applies a different filter to each 
> unique group of M input channels separately, thus condensing each group of M 
> input channels to 1 output channel, and (2) concatenates the results into a 
> single volume with C/M output channels. This is in contrast to a regular 2D 
> transpose convolution, in which all of the filters would be applied to all of 
> the input channels at once.
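
A rough sketch of the grouping described above, in plain Python. It assumes stride 1, no padding, nested lists as tensors, and one 2D filter slice per input channel; the function names are illustrative and not taken from the `nn` library.

```python
def scatter_add(img, filt, out):
    """Transpose-convolve one 2D channel with one 2D filter slice into `out`."""
    for i, row in enumerate(img):
        for j, v in enumerate(row):
            for a, frow in enumerate(filt):
                for b, f in enumerate(frow):
                    out[i + a][j + b] += v * f


def depthwise_conv2d_transpose(x, w, M):
    """x: list of C channels (each H x W); w: list of C filter slices (each Hf x Wf).
    Every group of M consecutive input channels is condensed to 1 output channel."""
    C = len(x)
    if C % M != 0:
        raise ValueError("C must be divisible by M")
    H, W = len(x[0]), len(x[0][0])
    Hf, Wf = len(w[0]), len(w[0][0])
    outs = []
    for g in range(C // M):
        out = [[0] * (W + Wf - 1) for _ in range(H + Hf - 1)]
        for c in range(g * M, (g + 1) * M):   # the M channels of this group
            scatter_add(x[c], w[c], out)
        outs.append(out)
    return outs                               # C/M output channels


# C=4 input channels of 2x2, grouped with M=2, using 2x2 filter slices.
x = [[[1, 1], [1, 1]] for _ in range(4)]
w = [[[1, 1], [1, 1]] for _ in range(4)]
outs = depthwise_conv2d_transpose(x, w, M=2)
print(len(outs), len(outs[0]), len(outs[0][0]))
```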



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1561) Improve constant folding during compilation

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1561:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Improve constant folding during compilation
> ---
>
> Key: SYSTEMML-1561
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1561
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
> Attachments: scenario1_plan.txt, scenario1.py, scenario2_plan.txt, 
> scenario2.py
>
>
> In our `nn` library, our convolution and pooling layers have to pass around 
> the spatial dimensions (height and width) of the images that are stretched 
> out into rows of the input/output matrices.  These output dimensions are 
> computed within the forward functions of the above layers as small scalar 
> equations.  From a mathematical standpoint, these sizes can be determined at 
> compile time, and it is nice to have these size equations in DML (vs. hiding 
> them inside the engine within built-in functions).  However, we do not 
> currently evaluate these expressions during compilation, and thus we are left 
> with unknown sizes even during recompilation.  This naturally leads to max 
> memory estimates and thus often leads to unnecessary distributed runtime ops 
> rather than simple CP ones.
> I have two related scenarios for which this is a problem.  They both involve 
> the {{Houtc1}} & {{Woutc1}} values that are returned from a 
> `conv2d::forward(...)` function.  These represent the spatial dimensions of 
> the volume within each of the rows of the output {{outc1}} of the function, and 
> the third dimension is {{F1}}.  Thus, {{outc1}} has a number of columns equal 
> to {{F1*Houtc1*Woutc1}}.
> In the first scenario ({{scenario1.py}}), a random matrix {{doutc1}} is 
> created that should have the same dimensions as {{outc1}}.  For the columns, 
> if I use {{cols=ncol(outc1)}} in this rand statement, the size will be 
> propagated and CP ops will be compiled and run.  If I instead use 
> {{cols=F1*Houtc1*Woutc1}}, the size will forever be unknown, even during 
> recompilation, and thus Spark ops will be compiled and run.  I have included 
> the recompile hops plan ({{scenario1_plan.txt}}).
> In the second scenario ({{scenario2.py}}), a {{max_pool2d::forward(...)}} 
> function is inserted after the {{conv2d::forward(...)}} function that 
> requires the {{Houtc1}} and {{Woutc1}} variables to be supplied as arguments. 
>  Since those latter variables are not evaluated at compilation time, the 
> max pooling sizes remain unknown, even during recompilation, and thus Spark 
> ops will be compiled and run.  I have included the recompile hops plan 
> ({{scenario2_plan.txt}}).
> We should either improve or fix our constant folding rewrites so that these 
> scenarios are fixed, as they are necessary for performant deep learning 
> applications.  Note too that this issue will be present in other non-deep 
> learning scenarios as well.
> Mailing list thread: 
> https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01657.html
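
To make the "small scalar equations" concrete, here is a sketch of the standard convolution output-size arithmetic in Python. The formula is the usual one for convolution layers and is assumed, not confirmed from the issue, to match what the forward function computes; the sizes (28x28 images, 5x5 filters, 32 filters) are made-up values.

```python
def conv2d_output_dim(in_dim, filter_dim, stride, pad):
    """Standard output size of a convolution along one spatial dimension."""
    return (in_dim + 2 * pad - filter_dim) // stride + 1


# Hypothetical layer configuration, all known before execution.
F1, Hin, Win, Hf, Wf = 32, 28, 28, 5, 5
Houtc1 = conv2d_output_dim(Hin, Hf, stride=1, pad=2)
Woutc1 = conv2d_output_dim(Win, Wf, stride=1, pad=2)

# Every operand is a literal, so this product could be folded at compile time.
cols = F1 * Houtc1 * Woutc1
print(Houtc1, Woutc1, cols)
```

Because every input to these expressions is a constant, a compiler that folds them would know the column count of the downstream matrices without running anything.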



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1563) Add a distributed synchronous SGD MNIST LeNet example

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1563:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add a distributed synchronous SGD MNIST LeNet example
> -
>
> Key: SYSTEMML-1563
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1563
> Project: SystemML
>  Issue Type: New Feature
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
>
> This aims to add a *distributed synchronous SGD* MNIST LeNet example.  In 
> distributed synchronous SGD, multiple mini-batches are run forward & backward 
> simultaneously, and the gradients are aggregated together by addition before 
> the model parameters are updated.  This is mathematically equivalent to 
> simply using a large mini-batch size, i.e. {{new_mini_batch_size = 
> mini_batch_size * number_of_parallel_mini_batches}}.  The benefit is that 
> distributed synchronous SGD can make use of multiple devices, i.e. multiple 
> GPUs or multiple CPU machines, and thus can speed up training time.  More 
> specifically, using an effectively larger mini-batch size can yield a more 
> stable gradient in expectation, and a larger number of epochs can be run in 
> the same amount of time, both of which lead to faster convergence.  
> Alternatives include various forms of distributed _asynchronous_ SGD, such as 
> Downpour, Hogwild, etc.  However, a recent paper \[1] from Google Brain / 
> Open AI has found evidence supporting the claim that distributed synchronous 
> SGD can lead to faster convergence, particularly if it is extended with the 
> notion of "backup workers" as described in the paper.
> We will first aim for distributed synchronous SGD with no backup workers, and 
> then extend this to include backup workers.  The MNIST LeNet model will 
> simply serve as an example, and this same approach can be extended to more 
> recent models, such as ResNets.
> \[1]: https://arxiv.org/abs/1604.00981
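
The equivalence claimed above can be checked on a toy problem. The sketch below assumes a scalar linear model with a summed (not averaged) squared loss; the data and the name `grad_sum_loss` are invented for illustration.

```python
def grad_sum_loss(w, batch):
    """Gradient w.r.t. scalar w of sum over (x, y) of 0.5 * (w*x - y)**2."""
    return sum((w * x - y) * x for x, y in batch)


w = 0.5
# Two mini-batches that would be processed in parallel by two workers.
mini_batches = [[(1.0, 2.0), (2.0, 3.0)],
                [(3.0, 7.0), (4.0, 9.0)]]

# Synchronous SGD: each worker computes its gradient, then they are added.
aggregated = sum(grad_sum_loss(w, b) for b in mini_batches)

# One large mini-batch containing the same examples.
big_batch = [example for b in mini_batches for example in b]
combined = grad_sum_loss(w, big_batch)

print(aggregated, combined)
```

If the loss is averaged over the mini-batch instead of summed, the aggregated gradient would need to be divided by the number of parallel mini-batches to match.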



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1564) Add a Java test suite wrapper around `nn` DML test suite

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1564:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add a Java test suite wrapper around `nn` DML test suite
> 
>
> Key: SYSTEMML-1564
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1564
> Project: SystemML
>  Issue Type: New Feature
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
>
> The {{nn}} library contains its own DML test suite for gradient checks and 
> unit tests.  The test suite produces "ERROR..." messages if any of the 
> mathematical operations return incorrect results.  Note that this has helped 
> to find mathematical bugs in the library & engine that do not result in JVM 
> exceptions, but are equally important.
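
The core of such a wrapper is a check that turns "ERROR..." lines in the script output into a test failure. The proposed wrapper is in Java; the sketch below only illustrates the idea in Python, and the function name and sample output strings are hypothetical.

```python
def assert_no_dml_errors(stdout_text):
    """Fail if the captured output of the DML test suite contains ERROR lines."""
    error_lines = [line for line in stdout_text.splitlines() if "ERROR" in line]
    if error_lines:
        raise AssertionError(
            "DML test suite reported errors:\n" + "\n".join(error_lines))


# Output without error markers passes silently.
assert_no_dml_errors("Starting grad checks\nGrad checks complete")
```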



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1465) Add stain normalization to preprocessing.

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1465:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add stain normalization to preprocessing.
> -
>
> Key: SYSTEMML-1465
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1465
> Project: SystemML
>  Issue Type: New Feature
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1554) IPA Scalar Transient Read Replacement

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1554:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> IPA Scalar Transient Read Replacement
> -
>
> Key: SYSTEMML-1554
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1554
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
> Attachments: convnet_distrib_sgd.dml, parfor_oom_convnet_plan.txt, 
> parfor_oom_convnet.py, parfor_oom_plan.txt, parfor_oom.py
>
>
> Currently, during IPA we collect all variables (scalars & matrices) eligible 
> for propagation across blocks (i.e. not updated in block), and then propagate 
> only the matrix sizes across the blocks.  It seems plausible that we 
> could also replace all eligible scalar transient reads with literals based on 
> the variables that have already been collected.  The benefit is that many ops 
> will be able to determine their respective output sizes during regular 
> compilation, instead of having to wait until dynamic recompilation, and thus 
> we can reduce the pressure on dynamic recompilation.
> Are there drawbacks to this approach?  The use case is that I was seeing a 
> large number of memory warnings while training a convolutional net due to the 
> sizes being unknown during regular compilation, yet the engine only having CP 
> versions of the ops.  Additionally, I was running into actual heap space OOM 
> errors for situations that should not run out of memory, and thus I started 
> exploring.
> I've attached an example script and the explain plan (hops & runtime) w/ and 
> w/o the IPA scalar replacement.
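
As a toy model of the proposal, the sketch below replaces reads of known scalars with literals in a small expression tree and then folds the result. The tuple-based tree and all names are invented for illustration and are not the SystemML hop or IPA classes.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def replace_scalar_reads(expr, known):
    """Replace ("read", name) nodes with ("lit", value) when name is a known scalar."""
    if expr[0] == "read":
        return ("lit", known[expr[1]]) if expr[1] in known else expr
    if expr[0] == "op":
        _, sym, left, right = expr
        return ("op", sym,
                replace_scalar_reads(left, known),
                replace_scalar_reads(right, known))
    return expr


def fold(expr):
    """Evaluate operators whose operands are both literals."""
    if expr[0] == "op":
        _, sym, left, right = expr
        left, right = fold(left), fold(right)
        if left[0] == "lit" and right[0] == "lit":
            return ("lit", OPS[sym](left[1], right[1]))
        return ("op", sym, left, right)
    return expr


# cols = D * K, where D and K are scalars that are not updated in the block.
cols_expr = ("op", "*", ("read", "D"), ("read", "K"))
unresolved = fold(cols_expr)
resolved = fold(replace_scalar_reads(cols_expr, {"D": 784, "K": 10}))
print(unresolved, resolved)
```

Without the replacement the size expression stays symbolic; with it, the size is a literal during regular compilation.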



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1414) Rename `nn/layers/conv.dml` to `nn/layers/conv2d.dml`

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1414:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Rename `nn/layers/conv.dml` to `nn/layers/conv2d.dml`
> -
>
> Key: SYSTEMML-1414
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1414
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.14
>
>
> Note that this breaks the current API.   This is fine though since the {{nn}} 
> library is currently in staging.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1524) Graduate `nn` library from `scripts/staging/SystemML-NN/nn` to `scripts/nn`

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1524:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Graduate `nn` library from `scripts/staging/SystemML-NN/nn` to `scripts/nn`
> ---
>
> Key: SYSTEMML-1524
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1524
> Project: SystemML
>  Issue Type: New Feature
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
>
> For our upcoming 1.0 release, we should release the {{nn}} deep learning 
> library as an official top-level SystemML library.  This would coincide with 
> our Caffe integration via Caffe2DML that targets the {{nn}} library, as well 
> as our native BLAS and GPU runtime targets, which our deep learning use cases 
> will benefit from.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1034) Implement solve builtin function using cublas kernels

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1034:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Implement solve builtin function using cublas kernels
> -
>
> Key: SYSTEMML-1034
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1034
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Runtime
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
> Fix For: SystemML 0.15
>
>
> 1. Extend BinaryOp to enable GPU for solve
> 2. Add MatrixMatrixBuiltinGPUInstruction and use JCuBlas2's 
> cublasDtrsmBatched and cublasDgeqrfBatched (or cublasDgetrfBatched) methods.
> For reference implementation, see 
> https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/runtime/matrix/data/LibCommonsMath.java#L97
> [~nakul02]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1816) toString can return -0

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1816:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> toString can return -0
> --
>
> Key: SYSTEMML-1816
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1816
> Project: SystemML
>  Issue Type: Bug
>  Components: Runtime
>Reporter: Deron Eriksson
>Assignee: Nakul Jindal
> Fix For: SystemML 0.15
>
>
> When displaying matrix values with toString, -0 can be displayed.
> Example:
> {code}
> m = matrix("50 99 100 200",rows=2,cols=2);
> x = 100;
> m = (m - x) * ((m-x) >= 0)
> print(toString(m))
> {code}
> gives:
> {code}
> -0.000 -0.000
> 0.000 100.000
> {code}
> Using as.scalar on the individual cells returns 0:
> {code}
> for (i in 1:nrow(m)) {
>   for (j in 1:ncol(m)) {
>     n = m[i,j]
>     print('[' + i + ',' + j + ']:' + as.scalar(n))
>   }
> }
> {code}
> gives:
> {code}
> [1,1]:0.0
> [1,2]:0.0
> [2,1]:0.0
> [2,2]:100.0
> {code}
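
The negative zeros come from IEEE 754 arithmetic: a negative number multiplied by +0.0 yields -0.0. The Python sketch below mirrors the DML example with the same four values. Adding 0.0 before formatting is shown as one possible normalization and is not necessarily how the issue was fixed in SystemML.

```python
values = [50.0, 99.0, 100.0, 200.0]
x = 100.0

# (m - x) * ((m - x) >= 0): negative entries are multiplied by 0.0, giving -0.0.
masked = [(v - x) * float((v - x) >= 0) for v in values]

formatted = ["%.3f" % v for v in masked]

# -0.0 + 0.0 is +0.0 under round-to-nearest, so the sign disappears.
normalized = ["%.3f" % (v + 0.0) for v in masked]

print(formatted)
print(normalized)
```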



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-704) Host jcu*.jar libraries on mvn repo

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-704:
---
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Host jcu*.jar libraries on mvn repo
> ---
>
> Key: SYSTEMML-704
> URL: https://issues.apache.org/jira/browse/SYSTEMML-704
> Project: SystemML
>  Issue Type: Task
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
> Fix For: SystemML 0.14
>
>
> The PR https://github.com/apache/incubator-systemml/pull/165/ uses system 
> scope for jcu*.jar as they are not published on mvn central. Since we are 
> planning to include them into SystemML, it would be good to host them into a 
> repo we maintain and have provided scope instead. If for LICENSE or some 
> other reasons, we are not able to host them, I am fine with rejecting this 
> issue too. From jcuda's website "JCuda is published under the terms of the 
> MIT/X11 License".
> The current version depends on jcu*-0.7.5b.jar (except jcudnn-0.7.5.jar). The 
> jars are available for download from 
> http://www.jcuda.org/downloads/downloads.html. The source is available at 
> https://github.com/jcuda
> [~nakul02] [~deron] [~luciano resende]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1806) setTextValue in DMLConfig is not behaving correctly

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1806:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> setTextValue in DMLConfig is not behaving correctly
> ---
>
> Key: SYSTEMML-1806
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1806
> Project: SystemML
>  Issue Type: Bug
>  Components: Runtime
>Reporter: Nakul Jindal
>Assignee: Nakul Jindal
> Fix For: SystemML 0.15
>
>
> The problem was discovered when trying to set a configuration property from 
> MLContext.
> Specifically, it was to try and restrict which GPU to use for the program. 
> Currently this is done via a System property.
> This was the script:
> {code}
> unet = Caffe2DML(spark, solver='solver.prototxt', input_shape=img_shape)
> unet.setGPU(True)
> unet.setForceGPU(True)
> unet.setConfigProperty("systemml.stats.extraGPU", "true")
> unet.setConfigProperty("systemml.gpu.availableGPUs", "1")
> {code}
> Here is what I discovered:
> The first time 
> [setText|https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/conf/DMLConfig.java#L266]
>  value is called on an empty DMLConfig (by calling new DMLConfig()) as 
> opposed to by parsing a file, a new _xmlRoot is initialized.
> Thereafter, since the _xmlRoot is not null, it tries to call 
> getElementsByTagName, even when the tag is different. In the example above, 
> the tag is {{systemml.stats.extraGPU}} the first time around and 
> {{systemml.gpu.availableGPUs}} the second time around.
> The bug is in 
> [setText|https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/conf/DMLConfig.java#L253]
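
The intended behavior (create the element when the tag is new, overwrite it when it already exists) can be sketched with Python's `xml.dom.minidom`. This is not the Java code in DMLConfig; the helper names and the `root` element name are assumptions made for this illustration.

```python
from xml.dom.minidom import Document


def set_text_value(doc, tag_name, value):
    """Set the text of <tag_name>, creating the root and the element if needed."""
    root = doc.documentElement
    if root is None:                      # empty config: initialize the root once
        root = doc.createElement("root")
        doc.appendChild(root)
    matches = root.getElementsByTagName(tag_name)
    if matches:                           # tag already present: clear old text
        elem = matches[0]
        while elem.firstChild is not None:
            elem.removeChild(elem.firstChild)
    else:                                 # different tag than before: add it
        elem = doc.createElement(tag_name)
        root.appendChild(elem)
    elem.appendChild(doc.createTextNode(value))


def get_text_value(doc, tag_name):
    matches = doc.getElementsByTagName(tag_name)
    return matches[0].firstChild.data if matches else None


config = Document()
set_text_value(config, "systemml.stats.extraGPU", "true")
set_text_value(config, "systemml.gpu.availableGPUs", "1")
print(get_text_value(config, "systemml.stats.extraGPU"),
      get_text_value(config, "systemml.gpu.availableGPUs"))
```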



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1744) Include jcuda jars in extra assembly jar for easy pip install

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1744:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Include jcuda jars in extra assembly jar for easy pip install
> -
>
> Key: SYSTEMML-1744
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1744
> Project: SystemML
>  Issue Type: Improvement
>  Components: Build
>Reporter: Nakul Jindal
>Assignee: Nakul Jindal
> Fix For: SystemML 0.15
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1735) Add relational operators for GPU

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1735:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add relational operators for GPU
> 
>
> Key: SYSTEMML-1735
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1735
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Runtime
>Reporter: Nakul Jindal
>Assignee: Nakul Jindal
> Fix For: SystemML 0.15
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1758) Add cbind (and rbind) GPU ops

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1758:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add cbind (and rbind) GPU ops
> -
>
> Key: SYSTEMML-1758
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1758
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Runtime
>Reporter: Nakul Jindal
>Assignee: Nakul Jindal
> Fix For: SystemML 0.15
>
>
> Ping [~niketanpansare]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1654) GPU cannot handle nested local parfors

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1654:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> GPU cannot handle nested local parfors
> --
>
> Key: SYSTEMML-1654
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1654
> Project: SystemML
>  Issue Type: Bug
>  Components: Runtime
>Affects Versions: SystemML 0.14
>Reporter: Nakul Jindal
>Assignee: Nakul Jindal
> Fix For: SystemML 0.15
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1713) Verify and correct memory estimates for various ops on the GPU

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1713:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Verify and correct memory estimates for various ops on the GPU
> --
>
> Key: SYSTEMML-1713
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1713
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Nakul Jindal
>Assignee: Nakul Jindal
> Fix For: SystemML 0.15
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1625) Add (Unit) Tests for GPU functions

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1625:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add (Unit) Tests for GPU functions
> --
>
> Key: SYSTEMML-1625
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1625
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Nakul Jindal
>Assignee: Nakul Jindal
> Fix For: SystemML 0.15
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1701) Fix the need to add force to -gpu always

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1701:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Fix the need to add force to -gpu always
> 
>
> Key: SYSTEMML-1701
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1701
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Runtime
>Affects Versions: SystemML 0.14
>Reporter: Nakul Jindal
>Assignee: Nakul Jindal
> Fix For: SystemML 0.15
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1138) Exception thrown when a GPU sparse-sparse matrix multiply is performed

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1138:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.12

> Exception thrown when a GPU sparse-sparse matrix multiply is performed
> --
>
> Key: SYSTEMML-1138
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1138
> Project: SystemML
>  Issue Type: Bug
>Reporter: Nakul Jindal
>Assignee: Nakul Jindal
> Fix For: SystemML 0.12
>
>
> For a simple program like so:
> A = rand(rows=5, cols=10, sparsity=0.0003)
> B = rand(rows=10, cols=100, sparsity=0.7)
> C = A %*% B
> print(toString(C))
> This is the exception:
> Caused by: jcuda.CudaException: cudaErrorIllegalAddress
>   at jcuda.runtime.JCuda.checkResult(JCuda.java:460)
>   at jcuda.runtime.JCuda.cudaDeviceSynchronize(JCuda.java:7361)
>   at 
> org.apache.sysml.runtime.instructions.gpu.context.JCudaObject.columnMajorDenseToRowMajorSparse(JCudaObject.java:1130)
>   at 
> org.apache.sysml.runtime.matrix.data.LibMatrixCUDA.sparseDenseMatmult(LibMatrixCUDA.java:668)
>   at 
> org.apache.sysml.runtime.matrix.data.LibMatrixCUDA.eitherSparseMatmult(LibMatrixCUDA.java:573)
>   at 
> org.apache.sysml.runtime.matrix.data.LibMatrixCUDA.matmult(LibMatrixCUDA.java:538)
>   at 
> org.apache.sysml.runtime.instructions.gpu.AggregateBinaryGPUInstruction.processInstruction(AggregateBinaryGPUInstruction.java:98)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1568) NULL condition not check for Spark version in MLContext

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1568:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> NULL condition not check for Spark version in MLContext
> ---
>
> Key: SYSTEMML-1568
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1568
> Project: SystemML
>  Issue Type: Bug
>Reporter: Niketan Pansare
>Assignee: Niketan Pansare
>Priority: Minor
> Fix For: SystemML 0.15
>
>
> I see the following warning after starting pyspark shell:
> {code}
> 17/04/30 14:05:25 WARN MLContext: Apache Spark null or above is recommended 
> for SystemML null
> Welcome to Apache SystemML!
> {code}
> To reproduce the warning, please use Spark 2.1:
> {code}
> # checkout current master
> mvn package -P distribution
> pip install target/systemml-1.0.0-incubating-SNAPSHOT-python.tgz
> pyspark
> >> run simple script with python mlcontext
> {code} 
> [~deron] Can you please take a look at it ?
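
The message above shows both version strings rendered as "null". A guard of the following kind would avoid that. It is written in Python purely as an illustration of the missing check, and the function name and return convention are hypothetical rather than taken from MLContext.

```python
def spark_version_warning(min_spark_version, systemml_version):
    """Build the recommendation text only when both versions are known."""
    if min_spark_version is None or systemml_version is None:
        return None  # nothing meaningful to report
    return "Apache Spark %s or above is recommended for SystemML %s" % (
        min_spark_version, systemml_version)


print(spark_version_warning(None, None))
print(spark_version_warning("2.1.0", "1.0.0"))
```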



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1589) conv2d_bias_add fails w/ NPE on lenet with random data

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1589:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> conv2d_bias_add fails w/ NPE on lenet with random data
> --
>
> Key: SYSTEMML-1589
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1589
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Niketan Pansare
> Fix For: SystemML 0.15
>
>
> The lenet dml script fails with a null pointer exception for random multi 
> class data, generated with
> {code}
> X_full = rand(rows=6,cols=784);
> y_full = round(rand(rows=nrow(X_full), cols=1, min=1, max=10));
> {code}
> The detailed stacktrace is as follows:
> {code}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.sysml.runtime.matrix.data.LibMatrixDNN.getRowInDenseFormat(LibMatrixDNN.java:1355)
> at 
> org.apache.sysml.runtime.matrix.data.LibMatrixDNN.doIm2colSparse(LibMatrixDNN.java:1382)
> at 
> org.apache.sysml.runtime.matrix.data.LibMatrixDNN.doIm2col(LibMatrixDNN.java:1421)
> at 
> org.apache.sysml.runtime.matrix.data.LibMatrixDNN.doLoopedIm2ColConv2d(LibMatrixDNN.java:406)
> at 
> org.apache.sysml.runtime.matrix.data.LibMatrixDNN.access$400(LibMatrixDNN.java:51)
> at 
> org.apache.sysml.runtime.matrix.data.LibMatrixDNN$ConvTask.call(LibMatrixDNN.java:1143)
> at 
> org.apache.sysml.runtime.matrix.data.LibMatrixDNN$ConvTask.call(LibMatrixDNN.java:1076)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1661) Builtin functions bias_add and bias_multiply not documented

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1661:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Builtin functions bias_add and bias_multiply not documented
> ---
>
> Key: SYSTEMML-1661
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1661
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Niketan Pansare
>Priority: Minor
> Fix For: SystemML 0.15
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-888) Add PNMF algorithm to SystemML

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-888:
---
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Add PNMF algorithm to SystemML
> --
>
> Key: SYSTEMML-888
> URL: https://issues.apache.org/jira/browse/SYSTEMML-888
> Project: SystemML
>  Issue Type: Task
>  Components: Algorithms
>Reporter: Deron Eriksson
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Add the Poisson Nonnegative Matrix Factorization algorithm to the SystemML 
> algorithms.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1878) Perftest: Performance issues MSVM 1M x 1K, sparse

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1878:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Perftest: Performance issues MSVM 1M x 1K, sparse
> -
>
> Key: SYSTEMML-1878
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1878
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> For the MSVM 1M x 1K, sparse performance test, the parfor optimizer currently 
> selects a local plan although the data size is just 244MB. This task aims to 
> make the necessary fixes to automatically compile this outer parfor loop to 
> {{REMOTE_SPARK}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-455) OOM CP transpose in Spark hybrid mode

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-455:
---
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> OOM CP transpose in Spark hybrid mode 
> --
>
> Key: SYSTEMML-455
> URL: https://issues.apache.org/jira/browse/SYSTEMML-455
> Project: SystemML
>  Issue Type: Bug
>  Components: Compiler
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> The following data generation script failed with OOM in hybrid_spark 
> execution mode (config: 20GB driver memory), whereas the same script runs 
> fine with the same memory budget in hybrid_mr execution mode.
> {code}
> n = 3;
> B = Rand (rows = n, cols = n, min = -1, max = 1, pdf = "uniform", seed = 
> 1234);
> v = exp (Rand (rows = n, cols = 1, min = -3, max = 3, pdf = "uniform", seed = 
> 5678));
> A = t(B) %*% (B * v);
> write(A, "./tmp/A", format="binary");
> {code}
> The resulting hop explain output is as follows:
> {code}
> # Memory Budget local/remote = 13739MB/184320MB/8602MB
> # Degree of Parallelism (vcores) local/remote = 16/120
> PROGRAM
> --MAIN PROGRAM
> GENERIC (lines 4-12) [recompile=true]
> --(10) dg(rand) [3,3,1000,1000,9] [0,0,6866 -> 6866MB], CP
> --(21) r(t) (10) [3,3,1000,1000,9] [6866,0,6866 -> 
> 13733MB], CP
> --(19) dg(rand) [3,1,1000,1000,3] [0,0,0 -> 0MB], CP
> --(20) u(exp) (19) [3,1,1000,1000,-1] [0,0,0 -> 0MB], CP
> --(22) b(*) (10,20) [3,3,1000,1000,-1] [6867,0,6866 -> 13733MB], 
> CP
> --(23) ba(+*) (21,22) [3,3,1000,1000,-1] [13733,6866,6866 -> 
> 27466MB], SPARK
> --(28) PWrite A (23) [3,3,1000,1000,-1] [6866,0,0 -> 6866MB], CP
> {code}
> The script fails at CP transpose with
> {code}
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:414)
> at 
> org.apache.sysml.runtime.matrix.data.LibMatrixReorg.transposeDenseToDense(LibMatrixReorg.java:752)
> at 
> org.apache.sysml.runtime.matrix.data.LibMatrixReorg.transpose(LibMatrixReorg.java:136)
> at 
> org.apache.sysml.runtime.matrix.data.LibMatrixReorg.reorg(LibMatrixReorg.java:105)
> at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.reorgOperations(MatrixBlock.java:3458)
> at 
> org.apache.sysml.runtime.instructions.cp.ReorgCPInstruction.processInstruction(ReorgCPInstruction.java:129)
> {code}
> It's noteworthy that the failing CP instruction requires 13733MB at a memory 
> budget of 13739MB. The current guess is that Spark itself incurs 
> substantial memory overhead, which eventually leads to the OOM. We should 
> adjust our memory budget in Spark execution modes to account for this 
> overhead.
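
The arithmetic behind that observation can be made explicit. The following is a minimal sketch that uses only the two numbers quoted above; the class and method names are hypothetical and not part of SystemML.

```java
// Hypothetical helper (not a SystemML class): quantifies how little slack the
// CP transpose has relative to the local memory budget quoted in the issue.
final class BudgetHeadroom {
    // Remaining budget in MB once the operation's memory estimate is subtracted.
    static long headroomMB(long budgetMB, long estimateMB) {
        return budgetMB - estimateMB;
    }

    // Headroom expressed as a fraction of the budget.
    static double headroomFraction(long budgetMB, long estimateMB) {
        return (double) headroomMB(budgetMB, estimateMB) / budgetMB;
    }

    public static void main(String[] args) {
        long budget = 13739;   // local memory budget from the explain output
        long estimate = 13733; // memory estimate of the failing transpose
        System.out.println("headroom MB: " + headroomMB(budget, estimate));
        System.out.println("headroom fraction: " + headroomFraction(budget, estimate));
    }
}
```

With a slack of well under one percent of the budget, even a small amount of memory held by the Spark driver outside SystemML's accounting would be enough to exceed the heap, which is consistent with the guess stated in the issue.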





[jira] [Updated] (SYSTEMML-1262) Keep track of parallelized RDDs and broadcasts

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1262:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Keep track of parallelized RDDs and broadcasts
> --
>
> Key: SYSTEMML-1262
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1262
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>






[jira] [Updated] (SYSTEMML-1319) Statistical estimates over compressed matrix blocks

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1319:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Statistical estimates over compressed matrix blocks
> ---
>
> Key: SYSTEMML-1319
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1319
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Statistical estimates like moment, cov, aggregate, table, median, and 
> quantiles can be efficiently computed over compressed matrix blocks by 
> mapping distinct items + counts to weighted statistical estimates.
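
As a rough illustration of the idea, the sketch below computes a mean, a population variance, and a median directly from distinct values and their counts, without expanding the data. It is an assumption-laden toy: the class `WeightedStats` and its methods are hypothetical and do not mirror SystemML's compressed column groups.

```java
// Hypothetical sketch (not SystemML code): statistics over a dictionary of
// distinct values plus occurrence counts, as found in a compressed column.
final class WeightedStats {
    // Weighted mean; equals the mean of the expanded (uncompressed) data.
    static double mean(double[] values, int[] counts) {
        double sum = 0;
        long n = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i] * counts[i];
            n += counts[i];
        }
        return sum / n;
    }

    // Population variance (second central moment) using counts as weights.
    static double variance(double[] values, int[] counts) {
        double mu = mean(values, counts);
        double acc = 0;
        long n = 0;
        for (int i = 0; i < values.length; i++) {
            double d = values[i] - mu;
            acc += counts[i] * d * d;
            n += counts[i];
        }
        return acc / n;
    }

    // Lower median via cumulative counts; values must be sorted ascending.
    static double median(double[] sortedValues, int[] counts) {
        long n = 0;
        for (int c : counts) n += c;
        long target = (n + 1) / 2; // 1-based rank of the lower median
        long cum = 0;
        for (int i = 0; i < sortedValues.length; i++) {
            cum += counts[i];
            if (cum >= target) return sortedValues[i];
        }
        throw new IllegalArgumentException("empty input");
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 4}; // distinct items
        int[] counts = {3, 1, 2};    // stands for the data 1,1,1,2,4,4
        System.out.println("mean = " + mean(values, counts));
        System.out.println("variance = " + variance(values, counts));
        System.out.println("median = " + median(values, counts));
    }
}
```

The cost of each estimate depends on the number of distinct items rather than the number of rows, which is where the efficiency over compressed blocks would come from.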





[jira] [Updated] (SYSTEMML-1560) Cache-conscious compressed tsmm operations

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1560:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Cache-conscious compressed tsmm operations
> --
>
> Key: SYSTEMML-1560
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1560
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>






[jira] [Updated] (SYSTEMML-1682) Missing in-memory spark csv-reblock w/ unknown sizes

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1682:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Missing in-memory spark csv-reblock w/ unknown sizes
> 
>
> Key: SYSTEMML-1682
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1682
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>






[jira] [Updated] (SYSTEMML-1534) Multi-aggregates w/ dot products as aggregation roots

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1534:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Multi-aggregates w/ dot products as aggregation roots
> -
>
> Key: SYSTEMML-1534
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1534
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>






[jira] [Updated] (SYSTEMML-1538) Improved dynamic recompilation (size update after rewrites)

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1538:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Improved dynamic recompilation (size update after rewrites)
> ---
>
> Key: SYSTEMML-1538
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1538
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Dynamic recompilation currently first updates matrix characteristics and 
> subsequently applies dynamic rewrites and operator selection, which depend on 
> the updated statistics. However, there are various scenarios where applied 
> rewrites simplify the propagation of statistics. Hence, we should 
> additionally update statistics after rewrites in order to increase the 
> potential of subsequent operator selection and code generation.





[jira] [Updated] (SYSTEMML-1519) Old MLContext API setConfig only take affect on execute

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1519:

Fix Version/s: (was: SystemML 1.0)
   Not Applicable

> Old MLContext API setConfig only take affect on execute
> ---
>
> Key: SYSTEMML-1519
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1519
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: Not Applicable
>
>






[jira] [Updated] (SYSTEMML-1881) Tuning parfor degree of parallelism for operations

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1881:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Tuning parfor degree of parallelism for operations
> --
>
> Key: SYSTEMML-1881
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1881
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Currently, we assign remaining parfor parallelism conservatively to 
> operations of the parfor body. Consider, for example, a Kmeans or MSVM 
> scenario with 10 runs or 10 classes respectively. On a box with 16 HW 
> threads, we assign k=10 to the parfor and {{floor(16/10)}} to remaining 
> operations. Since it is usually a good idea to slightly over-provision CPU in 
> order to get full utilization (due to barriers at the end of each operation), 
> we should tune this to {{round(16/10)}} which provides performance 
> improvements of about 15% in the above examples. 
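
The difference between the two assignments for the numbers in the issue can be sketched as follows. The class and method names are hypothetical; only the `floor` and `round` formulas come from the description.

```java
// Hypothetical sketch (not SystemML code): threads assigned to each operation
// inside a parfor body, given the hardware threads and the parfor degree k.
final class ParforThreads {
    // Conservative assignment: floor(hwThreads / k).
    static int conservative(int hwThreads, int k) {
        return (int) Math.floor((double) hwThreads / k);
    }

    // Tuned assignment: round(hwThreads / k), slightly over-provisioning CPU.
    static int tuned(int hwThreads, int k) {
        return (int) Math.round((double) hwThreads / k);
    }

    public static void main(String[] args) {
        int hw = 16, k = 10;
        int c = conservative(hw, k);
        int t = tuned(hw, k);
        System.out.println("floor: " + c + " per op, " + (k * c) + " threads total");
        System.out.println("round: " + t + " per op, " + (k * t) + " threads total");
    }
}
```

For 16 hardware threads and k=10, the conservative variant leaves 6 threads idle (10 in use), while the rounded variant runs 20 threads on 16 hardware threads, which is the slight over-provisioning the issue argues for.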





[jira] [Updated] (SYSTEMML-1879) Parfor remote spark w/ reuse of shared inputs

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1879:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Parfor remote spark w/ reuse of shared inputs
> -
>
> Key: SYSTEMML-1879
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1879
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Currently, we read shared inputs redundantly in each parfor worker. This 
> causes redundant reads and is unnecessarily memory-inefficient.
> This task aims to read shared inputs once per process and reuse them across 
> threads. The most elegant way of handling this is to reuse initially parsed 
> symbol table entries (instances of matrix objects), except for result 
> variables. Then the sharing happens automatically (similar to local parfor) 
> over the shared per-process buffer pool. 
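
A minimal sketch of the read-once-per-process idea, assuming a simple name-keyed cache shared by all worker threads in a JVM. `SharedInputCache` and its reader function are hypothetical stand-ins; the actual proposal reuses symbol table entries and the buffer pool rather than a separate map, and it excludes result variables, which this sketch does not model.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch (not SystemML code): shared inputs are read at most once
// per process and the same instance is handed to every worker thread.
final class SharedInputCache {
    private final ConcurrentHashMap<String, double[]> cache = new ConcurrentHashMap<>();
    private final AtomicInteger reads = new AtomicInteger();
    private final Function<String, double[]> reader;

    SharedInputCache(Function<String, double[]> reader) {
        this.reader = reader;
    }

    // computeIfAbsent applies the reader at most once per key, even under
    // concurrent access from several worker threads.
    double[] get(String name) {
        return cache.computeIfAbsent(name, k -> {
            reads.incrementAndGet();
            return reader.apply(k);
        });
    }

    int physicalReads() {
        return reads.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // The lambda stands in for an expensive read of a shared input.
        SharedInputCache cache = new SharedInputCache(name -> new double[] {1, 2, 3});
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> cache.get("X"));
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        System.out.println("physical reads: " + cache.physicalReads());
    }
}
```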





[jira] [Updated] (SYSTEMML-1877) Perfttest: Univariate statistics 1M x 1K fails on DPESP

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1877:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Perfttest: Univariate statistics 1M x 1K fails on DPESP
> ---
>
> Key: SYSTEMML-1877
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1877
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.14
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> The univariate statistics script fails in hybrid_spark on 1M x 1K with the 
> following exception (this issue has been introduced w/ SYSTEMML-1310)
> {code}
> Caused by: java.lang.RuntimeException: Unsupported partition format: 
> COLUMN_WISE
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.RemoteDPParForSparkWorker.<init>(RemoteDPParForSparkWorker.java:100)
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.RemoteDPParForSpark.runJob(RemoteDPParForSpark.java:98)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.executeRemoteSparkParForDP(ParForProgramBlock.java:1104)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.execute(ParForProgramBlock.java:638)
>   ... 14 more
> {code}





[jira] [Updated] (SYSTEMML-1871) Rework compiler/runtime predicate handling

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1871:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Rework compiler/runtime predicate handling
> --
>
> Key: SYSTEMML-1871
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1871
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Currently, the handling of if, while, and for predicates exhibits a couple of 
> shortcomings. First, there are different representations for operations (as 
> single-root HOP DAGs) and literals (as dedicated constants). Second, the 
> runtime has to explicitly find intermediate variable names, remove rmvar 
> instructions, which is brittle and error-prone. Third, the special handling 
> of operations vs literals renders constant folding during dynamic 
> recompilation invalid because, we would have to handle the transitioning from 
> operation DAGs to constants accordingly. 
> This task aims to resolve all these issues, by properly compiling transient 
> writes to special predicate variables (e.g., _pred that are guaranteed not to 
> conflict with external variables). This requires a complete rework of the 
> entire predicate handling during compilation and runtime.





[jira] [Updated] (SYSTEMML-1862) Perftest: MSVM 800GB fails on buffer pool eviction

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1862:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Perftest: MSVM 800GB fails on buffer pool eviction
> --
>
> Key: SYSTEMML-1862
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1862
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> {code}
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Eviction to local path 
> /tmp/systemml/_p196865_1.12.34.56//cache/cache31072.dat (_mVar372) failed.
> at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.release(CacheableData.java:619)
> at 
> org.apache.sysml.runtime.controlprogram.context.ExecutionContext.setMatrixOutput(ExecutionContext.java:426)
> at 
> org.apache.sysml.runtime.instructions.cp.ScalarMatrixRelationalCPInstruction.processInstruction(ScalarMatrixRelationalCPInstruction.java:64)
> at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:286)
> ... 6 more
> Caused by: java.util.NoSuchElementException
> at 
> java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:721)
> at 
> java.util.LinkedHashMap$LinkedEntryIterator.next(LinkedHashMap.java:752)
> at 
> java.util.LinkedHashMap$LinkedEntryIterator.next(LinkedHashMap.java:750)
> at 
> org.apache.sysml.runtime.controlprogram.caching.LazyWriteBuffer$EvictionQueue.removeFirst(LazyWriteBuffer.java:273)
> at 
> org.apache.sysml.runtime.controlprogram.caching.LazyWriteBuffer.writeBlock(LazyWriteBuffer.java:82)
> at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.release(CacheableData.java:615)
> ... 9 more
> {code}





[jira] [Updated] (SYSTEMML-1861) Performance sparse-sparse binary ops

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1861:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Performance sparse-sparse binary ops
> 
>
> Key: SYSTEMML-1861
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1861
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> This task aims to improve the performance of sparse-sparse binary operations 
> such as elementwise multiply. Currently, we use a merge join with outer join 
> semantics to cover the general case - for operations like multiply this is 
> unnecessary and could be improved by using inner join semantics and 
> branchless position maintenance.
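
To make the proposal concrete, here is a sketch of an elementwise multiply of two sparse rows using a merge join with inner-join semantics. The row layout (sorted index array plus value array) and all names are assumptions for illustration, and the conditional increments are only one possible reading of "branchless position maintenance".

```java
// Hypothetical sketch (not SystemML code): elementwise multiply of two sparse
// rows stored as sorted column indexes plus values.
final class SparseRowMultiply {
    // Writes the product into cix/cvals and returns the number of non-zeros.
    // Only indexes present in BOTH inputs produce output (inner join), since
    // x * 0 = 0 makes unmatched entries irrelevant for multiply.
    static int multiply(int[] aix, double[] avals, int alen,
                        int[] bix, double[] bvals, int blen,
                        int[] cix, double[] cvals) {
        int i = 0, j = 0, k = 0;
        while (i < alen && j < blen) {
            int ai = aix[i], bi = bix[j];
            if (ai == bi) {
                cix[k] = ai;
                cvals[k] = avals[i] * bvals[j];
                k++;
            }
            // Advance the side with the smaller index, or both on a match.
            // Written as arithmetic on comparison results so that a JIT may
            // compile it without jumps; that is not guaranteed.
            i += (ai <= bi) ? 1 : 0;
            j += (ai >= bi) ? 1 : 0;
        }
        return k;
    }

    public static void main(String[] args) {
        int[] aix = {0, 2, 5};
        double[] avals = {1, 2, 3};
        int[] bix = {2, 3, 5};
        double[] bvals = {4, 5, 6};
        int[] cix = new int[3];
        double[] cvals = new double[3];
        int nnz = multiply(aix, avals, 3, bix, bvals, 3, cix, cvals);
        for (int p = 0; p < nnz; p++) {
            System.out.println(cix[p] + " -> " + cvals[p]);
        }
    }
}
```

An outer-join variant would additionally have to emit entries for indexes found in only one input, which is needed for operations such as addition but wasted work for multiply.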





[jira] [Updated] (SYSTEMML-1857) Misc performance issues codegen templates

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1857:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Misc performance issues codegen templates
> -
>
> Key: SYSTEMML-1857
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1857
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> This task covers the following performance improvements:
> * Avoid multi-threaded operations for small inputs (all templates)
> * MAgg: select shared sparse-safe input as driver
> * Row: use flipped outer computation depending on size





[jira] [Updated] (SYSTEMML-1853) StepLinreg is failing w/ recompilation issue

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1853:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> StepLinreg is failing w/ recompilation issue
> 
>
> Key: SYSTEMML-1853
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1853
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Our step-wise LinregDS is currently failing with the following NPE on dynamic 
> recompilation 
> {code}
> Caused by: java.lang.NullPointerException
> at org.apache.sysml.lops.BinaryScalar.getOpcode(BinaryScalar.java:111)
> at 
> org.apache.sysml.lops.BinaryScalar.getInstructions(BinaryScalar.java:87)
> at 
> org.apache.sysml.lops.compile.Dag.generateControlProgramJobs(Dag.java:1406)
> at org.apache.sysml.lops.compile.Dag.doGreedyGrouping(Dag.java:1176)
> at org.apache.sysml.lops.compile.Dag.getJobs(Dag.java:270)
> at 
> org.apache.sysml.hops.recompile.Recompiler.recompileHopsDag(Recompiler.java:239)
> at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:147)
> {code}
> The root cause is that some rewrite modified the input of solve to a scalar, 
> which causes the instruction generation of solve to fail because it is not 
> defined over scalars.





[jira] [Updated] (SYSTEMML-1852) IPA fails w/ issue of simplification rewrite

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1852:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> IPA fails w/ issue of simplification rewrite
> 
>
> Key: SYSTEMML-1852
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1852
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> After the recent IPA and compiler changes (e.g., merge of statement blocks), 
> we encountered the following rewrite issue:
> {code}
> Caused by: org.apache.sysml.hops.HopsException: Failed to retrieve 'to' 
> argument from basic 1-N sequence.
> at 
> org.apache.sysml.hops.rewrite.HopRewriteUtils.getBasic1NSequenceMaxLiteral(HopRewriteUtils.java:1005)
> at 
> org.apache.sysml.hops.rewrite.RewriteAlgebraicSimplificationStatic.simplifyOuterSeqExpand(RewriteAlgebraicSimplificationStatic.java:1644)
> at 
> org.apache.sysml.hops.rewrite.RewriteAlgebraicSimplificationStatic.rule_AlgebraicSimplification(RewriteAlgebraicSimplificationStatic.java:173)
> at 
> org.apache.sysml.hops.rewrite.RewriteAlgebraicSimplificationStatic.rule_AlgebraicSimplification(RewriteAlgebraicSimplificationStatic.java:178)
> at 
> org.apache.sysml.hops.rewrite.RewriteAlgebraicSimplificationStatic.rule_AlgebraicSimplification(RewriteAlgebraicSimplificationStatic.java:178)
> at 
> org.apache.sysml.hops.rewrite.RewriteAlgebraicSimplificationStatic.rule_AlgebraicSimplification(RewriteAlgebraicSimplificationStatic.java:178)
> at 
> org.apache.sysml.hops.rewrite.RewriteAlgebraicSimplificationStatic.rule_AlgebraicSimplification(RewriteAlgebraicSimplificationStatic.java:178)
> at 
> org.apache.sysml.hops.rewrite.RewriteAlgebraicSimplificationStatic.rule_AlgebraicSimplification(RewriteAlgebraicSimplificationStatic.java:178)
> at 
> org.apache.sysml.hops.rewrite.RewriteAlgebraicSimplificationStatic.rule_AlgebraicSimplification(RewriteAlgebraicSimplificationStatic.java:178)
> at 
> org.apache.sysml.hops.rewrite.RewriteAlgebraicSimplificationStatic.rewriteHopDAGs(RewriteAlgebraicSimplificationStatic.java:83)
> at 
> org.apache.sysml.hops.rewrite.ProgramRewriter.rewriteHopDAGs(ProgramRewriter.java:275)
> at 
> org.apache.sysml.hops.rewrite.ProgramRewriter.rewriteStatementBlockHopDAGs(ProgramRewriter.java:265)
> at 
> org.apache.sysml.hops.rewrite.ProgramRewriter.rewriteStatementBlockHopDAGs(ProgramRewriter.java:249)
> at 
> org.apache.sysml.hops.rewrite.ProgramRewriter.rewriteStatementBlockHopDAGs(ProgramRewriter.java:233)
> at 
> org.apache.sysml.hops.rewrite.ProgramRewriter.rewriteProgramHopDAGs(ProgramRewriter.java:206)
> at 
> org.apache.sysml.hops.ipa.IPAPassApplyStaticHopRewrites.rewriteProgram(IPAPassApplyStaticHopRewrites.java:52)
> at 
> org.apache.sysml.hops.ipa.InterProceduralAnalysis.analyzeProgram(InterProceduralAnalysis.java:202)
> at 
> org.apache.sysml.parser.DMLTranslator.rewriteHopsDAG(DMLTranslator.java:281)
> {code}





[jira] [Updated] (SYSTEMML-1854) Transformapply w/ empty recode maps fails w/ NPE

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1854:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Transformapply w/ empty recode maps fails w/ NPE
> 
>
> Key: SYSTEMML-1854
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1854
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> {code}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.sysml.runtime.transform.encode.EncoderRecode.lookupRCDMap(EncoderRecode.java:62)
> at 
> org.apache.sysml.runtime.transform.encode.EncoderRecode.apply(EncoderRecode.java:133)
> at 
> org.apache.sysml.runtime.transform.encode.EncoderComposite.apply(EncoderComposite.java:87)
> at 
> org.apache.sysml.runtime.instructions.cp.ParameterizedBuiltinCPInstruction.processInstruction(ParameterizedBuiltinCPInstruction.java:264)
> at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:286)
> {code}





[jira] [Updated] (SYSTEMML-1850) Transformdecode fails on tokens w/ special characters

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1850:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Transformdecode fails on tokens w/ special characters
> -
>
> Key: SYSTEMML-1850
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1850
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> In the past, we already encountered issues with special tokens on 
> {{transformapply}} where the token included the delimiter between token and 
> code. A similar issue still exists for {{transformdecode}} which causes 
> failures similar to the one below:
> {code}
> Caused by: java.lang.NumberFormatException: For input string: ""
> at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:601)
> at java.lang.Long.parseLong(Long.java:631)
> at 
> org.apache.sysml.runtime.transform.decode.DecoderRecode.initMetaData(DecoderRecode.java:89)
> at 
> org.apache.sysml.runtime.transform.decode.DecoderComposite.initMetaData(DecoderComposite.java:55)
> at 
> org.apache.sysml.runtime.transform.decode.DecoderFactory.createDecoder(DecoderFactory.java:86)
> ... 13 more
> {code}
> This task aims to fix this issue and create global consistency for all 
> construction and splitting of recode map entries.
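
One way to get consistent construction and splitting is to route both through a single helper and to split at the last delimiter, so that delimiters inside the token are harmless. The sketch below assumes a recode map entry of the form token, delimiter, numeric code; the delimiter `|` and the class `RecodeEntry` are hypothetical and not taken from SystemML.

```java
// Hypothetical sketch (not SystemML code): one place that both builds and
// splits recode map entries of the assumed form <token><DELIM><code>.
final class RecodeEntry {
    static final String DELIM = "|"; // assumed delimiter for illustration

    static String construct(String token, long code) {
        return token + DELIM + code;
    }

    // Splits at the LAST delimiter. The numeric code never contains the
    // delimiter, so tokens that include it are still recovered intact.
    static String[] split(String entry) {
        int pos = entry.lastIndexOf(DELIM);
        if (pos < 0) {
            throw new IllegalArgumentException("missing delimiter: " + entry);
        }
        return new String[] {entry.substring(0, pos), entry.substring(pos + 1)};
    }

    public static void main(String[] args) {
        String entry = construct("a|b", 7); // token contains the delimiter
        String[] parts = split(entry);
        System.out.println("token = " + parts[0]);
        System.out.println("code = " + Long.parseLong(parts[1]));
    }
}
```

Splitting at the first delimiter instead would hand a non-numeric remainder to `Long.parseLong`, producing a `NumberFormatException` of the kind shown in the trace above.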





[jira] [Updated] (SYSTEMML-1851) Transformdecode always produces default column names

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1851:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Transformdecode always produces default column names
> 
>
> Key: SYSTEMML-1851
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1851
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Since {{transformdecode}} gets a matrix as input, the output schema always 
> contains default column names (e.g., C1, C2, etc). This task aims to get the 
> column names from the corresponding meta data in a best effort manner. 





[jira] [Updated] (SYSTEMML-1846) Transformapply w/ column names fails with index-out-of-bounds

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1846:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Transformapply w/ column names fails with index-out-of-bounds
> -
>
> Key: SYSTEMML-1846
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1846
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> A simple transformapply scenario as shown in the following script
> {code}
> spec = "{ids: false, recode: [ zipcode, district, view ]}";
> [X, M] = transformencode(target=F, spec=spec);
> spec2 = "{ids: false, recode: [ zipcode ]}";
> X2 = transformapply(target=F[,1], spec=spec2, meta=M);
> {code}
> currently leads to index out-of-bounds exceptions because the column name 
> zipcode is not found in the column names of the meta data frame. The root 
> cause is a wrong assumption of sorted column names in the underlying 
> implementation.
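
The sketch below shows one way an assumption of sorted column names can go wrong, and an order-independent alternative. It is an illustration only: whether the original implementation used a binary search is not stated in the issue, the column order is assumed, and `ColumnLookup` is a hypothetical class.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (not SystemML code): looking up a column position by name.
final class ColumnLookup {
    // Only valid if names are sorted; on unsorted input the result is
    // undefined and may be negative even though the name is present.
    static int lookupAssumingSorted(String[] names, String name) {
        return Arrays.binarySearch(names, name);
    }

    // Order-independent alternative: build a name -> position map once.
    static Map<String, Integer> buildIndex(String[] names) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < names.length; i++) {
            index.put(names[i], i);
        }
        return index;
    }

    public static void main(String[] args) {
        // Column order is an assumption; the point is that it is not sorted.
        String[] names = {"zipcode", "district", "view"};
        System.out.println("sorted assumption: " + lookupAssumingSorted(names, "zipcode"));
        System.out.println("map lookup: " + buildIndex(names).get("zipcode"));
    }
}
```

A negative position used as an array index results in an index-out-of-bounds exception, whereas the map lookup returns the actual position regardless of column order.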





[jira] [Updated] (SYSTEMML-1845) Performance issue codegen cellwise over multiple sparse inputs

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1845:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Performance issue codegen cellwise over multiple sparse inputs
> --
>
> Key: SYSTEMML-1845
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1845
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>






[jira] [Updated] (SYSTEMML-1843) Wrong loop update-in-place decisions

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1843:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Wrong loop update-in-place decisions 
> -
>
> Key: SYSTEMML-1843
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1843
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> For special cases, where a matrix is simply updated in a loop, the rewrite 
> for marking updated loop variables as update-in-place mistakenly flags these 
> variables. For example, consider the following script:
> {code}
> ...
> for(i in 1:100) {
>   q = as.matrix(sum(X * U%*%t(V)))
>   print("at iteration "+i);
> }
> {code}
> and the related hop explain output
> {code}
> FOR (lines 9-13) [in-place=[q]]
> --GENERIC (lines 10-12) [recompile=true]
> (46) TRead X [8026324,2330066,1000,1000,22507155] [0,0,1317 -> 
> 1317MB], CP
> (48) TRead U [8026324,10,1000,1000,80263240] [0,0,612 -> 612MB], CP
> (49) TRead V [2330066,10,1000,1000,23300660] [0,0,178 -> 178MB], CP
> (50) r(t) (49) [10,2330066,1000,1000,23300660] [178,0,178 -> 356MB], 
> CP
> (51) ba(+*) (48,50) [8026324,2330066,1000,1000,-1] 
> [790,85611347,142683904 -> 228296041MB], SPARK
> (52) b(*) (46,51) [8026324,2330066,1000,1000,-1] [142685221,0,1317 -> 
> 142686537MB], SPARK
> (53) ua(+RC) (52) [0,0,-1,-1,-1] [1317,0,0 -> 1317MB], SPARK
> (54) u(cast_as_matrix) (53) [1,1,1000,1000,-1] [0,0,0 -> 0MB]
> (55) TWrite q (54) [1,1,1000,1000,-1] [0,0,0 -> 0MB], CP
> (47) TRead i [0,0,0,0,-1] [0,0,0 -> 0MB], CP
> (57) b(+) (47) [0,0,-1,-1,-1] [0,0,0 -> 0MB], CP
> (58) u(print) (57) [-1,-1,-1,-1,-1] [0,0,0 -> 0MB]
> {code}
> As can be seen above, variable q is mistakenly marked as update-in-place, 
> which causes unnecessary copies and thus can negatively affect performance.





[jira] [Updated] (SYSTEMML-1841) Performance issue codegen outer over ultra-sparse matrices

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1841:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Performance issue codegen outer over ultra-sparse matrices
> --
>
> Key: SYSTEMML-1841
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1841
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Experiments with codegen outer operations over the Amazon Books review 
> dataset (8,026,324 x 2,330,066, nnz=22,507,155, i.e., sparsity=10^(-6)) 
> showed unnecessary overhead for this ultra-sparse data set. This task aims to 
> remove this overhead.
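
The quoted order of magnitude follows directly from the dimensions and the number of non-zeros. A small sketch of the computation, with a hypothetical class name:

```java
// Hypothetical helper (not SystemML code): sparsity = nnz / (rows * cols).
final class Sparsity {
    static double sparsity(long rows, long cols, long nnz) {
        // Multiply in double precision to avoid any integer overflow concerns.
        return (double) nnz / ((double) rows * (double) cols);
    }

    public static void main(String[] args) {
        // Dimensions and nnz as quoted in the issue description.
        System.out.println(sparsity(8_026_324L, 2_330_066L, 22_507_155L));
    }
}
```

The result is roughly 1.2 x 10^(-6), that is, on the order of one non-zero per million cells and fewer than three non-zeros per row on average.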





[jira] [Updated] (SYSTEMML-1842) Compression decision lost after recompilation or codegen

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1842:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Compression decision lost after recompilation or codegen
> 
>
> Key: SYSTEMML-1842
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1842
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Even with forced compression (compressed.linalg=true), compression is 
> currently not applied if the respective HOP DAG is recompiled or subject to 
> code generation. The root cause is an incomplete deep copy of the HOP DAG 
> which loses the compression flag.





[jira] [Updated] (SYSTEMML-1840) Transform spec literals should be checked during validate

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1840:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Transform spec literals should be checked during validate
> -
>
> Key: SYSTEMML-1840
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1840
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Currently, there is no validation happening for transform specifications 
> during initial compilation. This is very annoying, especially when trying to 
> encode large files, which take a while to read in, just to find out that the 
> given transform specification was invalid JSON. Here is an example:
> {code}
> Caused by: org.apache.wink.json4j.JSONException: Expecting '{' on line 1, 
> column 4 instead, obtained token: 'Token: String - 'ids''
> at org.apache.wink.json4j.internal.Parser.parseObject(Parser.java:193)
> at org.apache.wink.json4j.internal.Parser.parse(Parser.java:130)
> at org.apache.wink.json4j.internal.Parser.parse(Parser.java:95)
> at org.apache.wink.json4j.JSONObject.<init>(JSONObject.java:138)
> at 
> org.apache.sysml.runtime.transform.encode.EncoderFactory.createEncoder(EncoderFactory.java:56)
> {code}
> This task aims to parse the transform specification if it is available as a 
> literal string during the language validation step.





[jira] [Updated] (SYSTEMML-1837) Unary aggregate w/ corrections output to large physical blocks

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1837:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Unary aggregate w/ corrections output to large physical blocks
> --
>
> Key: SYSTEMML-1837
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1837
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Many unary aggregate operations store corrections in additional columns or 
> rows. For example, {{rowSums(X)}} uses a two-column output to store sums and 
> corrections. In CP, we drop these corrections immediately after the 
> operations, while in MR and Spark these corrections are dropped after final 
> aggregation. The issue is that {{MatrixBlock::dropLastRowsOrColums}} does 
> not actually drop the corrections but simply shifts all values to the correct 
> starting positions. Hence, the physical output is actually larger than what 
> the memory estimates represent. This leads to unnecessarily large memory 
> consumption during subsequent operations and in the buffer pool, which can 
> lead to OOMs. This task aims to fix {{MatrixBlock::dropLastRowsOrColums}}. 
> In a subsequent task, we could also modify all unary aggregates to never 
> allocate the multi-column/row output when executed in CP. However, this 
> requires custom code paths for the different backends. 
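The difference between shifting values and actually dropping the correction column can be illustrated with a small Python sketch. It assumes a row-major dense buffer and Kahan-style compensated summation for the corrections; all function names are hypothetical and do not mirror the MatrixBlock implementation.

```python
def row_sums_with_correction(matrix):
    """Compensated row sums: each output row holds [sum, correction]."""
    out = []
    for row in matrix:
        total, corr = 0.0, 0.0
        for value in row:
            y = value - corr
            t = total + y
            corr = (t - total) - y
            total = t
        out.append([total, corr])
    return out


def drop_last_column_shift_only(values, rows, cols):
    """Problematic variant: compacts the kept values but keeps the old buffer."""
    for r in range(rows):
        for c in range(cols - 1):
            values[r * (cols - 1) + c] = values[r * cols + c]
    return values  # still holds rows * cols cells


def drop_last_column_reallocate(values, rows, cols):
    """Intended variant: copies into a buffer of exactly rows * (cols - 1) cells."""
    return [values[r * cols + c] for r in range(rows) for c in range(cols - 1)]
```

In the shift-only variant the logical result is correct, but the buffer keeps its original physical size, which is the mismatch with the memory estimates described above.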





[jira] [Updated] (SYSTEMML-1838) Performance issues sparse/ultra-sparse binary read

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1838:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Performance issues sparse/ultra-sparse binary read
> --
>
> Key: SYSTEMML-1838
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1838
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Recent experiments with PageRank (20 iterations) on a 1M x 1M, sp=0.001 input 
> showed that the actual iterations are indeed very fast, at peak memory 
> bandwidth (i.e., ~500ms per iteration in CP only), but the initial read is 
> unnecessarily slow and thus dominates the entire execution time. For 
> example, in this scenario, the read took 41s. 
> This task aims to improve the read performance of sparse and ultra-sparse 
> matrices into CP.





[jira] [Updated] (SYSTEMML-1292) Support spark codegen instructions w/ multiple RDD inputs

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1292:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Support spark codegen instructions w/ multiple RDD inputs
> -
>
> Key: SYSTEMML-1292
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1292
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> This task aims to support spark codegen instructions (for all templates) over 
> multiple RDD inputs if not all side inputs fit into the local and remote 
> broadcast memory budgets. In detail, this might entail either (1) generating 
> custom RDD operations and functions for various combinations of input RDDs, 
> or (2) a generalization of the related spark instructions regarding the input 
> RDD construction and a generic function signature.





[jira] [Updated] (SYSTEMML-1443) Handling of plan selection constraints (e.g., memory/blocksize)

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1443:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Handling of plan selection constraints (e.g., memory/blocksize)
> ---
>
> Key: SYSTEMML-1443
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1443
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>






[jira] [Updated] (SYSTEMML-1833) Arima and MDABivar algorithms failing w/ codegen

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1833:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Arima and MDABivar algorithms failing w/ codegen
> 
>
> Key: SYSTEMML-1833
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1833
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>






[jira] [Updated] (SYSTEMML-1832) Redundant checkpoint instructions before loops

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1832:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Redundant checkpoint instructions before loops
> --
>
> Key: SYSTEMML-1832
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1832
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Static rewrites include HOP DAG and statement block rewrites. We apply these 
> rewrites multiple times during compilation (e.g., rewrites followed by 
> multiple passes of IPA). Some of the static rewrites such as 
> {{RewriteInjectSparkLoopCheckpointing}} assume that they are called once for 
> a program. Applying them multiple times leads to redundant statement blocks 
> with redundant checkpoint instructions. Accordingly, IPA should explicitly 
> disable such rewrites.
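The problem can be reproduced with a toy rewrite pipeline in Python. This is only a sketch under simplifying assumptions: a program is a flat list of block names, and the rewrite name, the skip mechanism and the helper functions are hypothetical.

```python
def inject_loop_checkpoints(blocks):
    """Non-idempotent rewrite: put a checkpoint block before every loop block."""
    out = []
    for block in blocks:
        if block == "loop":
            out.append("checkpoint")
        out.append(block)
    return out


def run_rewrites(blocks, rewrites, skip=()):
    """Apply all rewrites in order, except those whose name is in skip."""
    for name, rewrite in rewrites:
        if name not in skip:
            blocks = rewrite(blocks)
    return blocks


REWRITES = [("inject_loop_checkpoints", inject_loop_checkpoints)]

program = ["assign", "loop"]
first_pass = run_rewrites(program, REWRITES)
# A second, unguarded pass (as during repeated IPA passes) duplicates the checkpoint.
naive_second = run_rewrites(first_pass, REWRITES)
# Explicitly disabling the one-time rewrite leaves the program unchanged.
guarded_second = run_rewrites(first_pass, REWRITES, skip={"inject_loop_checkpoints"})
```

Disabling the rewrite in later passes is one option; making the rewrite itself idempotent would be an alternative design with the same effect.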





[jira] [Updated] (SYSTEMML-1828) New simplification rewrite for merging sequences blocks

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1828:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> New simplification rewrite for merging sequences blocks
> ---
>
> Key: SYSTEMML-1828
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1828
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> https://www.mail-archive.com/dev@systemml.apache.org/msg00205.html





[jira] [Updated] (SYSTEMML-1506) Codegen only supported through dmlscript (spark_submit, hadoop)

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1506:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Codegen only supported through dmlscript (spark_submit, hadoop)
> ---
>
> Key: SYSTEMML-1506
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1506
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> This task aims to support codegen through all APIs, i.e., in addition to 
> DMLScript also through MLContext and JMLC.





[jira] [Updated] (SYSTEMML-1800) Matrix/frame block reader utils from streams

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1800:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Matrix/frame block reader utils from streams
> 
>
> Key: SYSTEMML-1800
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1800
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
>Priority: Minor
> Fix For: SystemML 0.15
>
>
> In JMLC deployments, models and meta data are often read from resource streams 
> of packaged artifacts. This task aims to add some util functions for 
> deserialization of matrix and frame blocks directly from such input streams 
> in order to avoid the expensive code path of reading text formats from 
> streams.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1790) FrameBlock reset fails with ArrayIndexOutOfBoundsException

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1790:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> FrameBlock reset fails with ArrayIndexOutOfBoundsException 
> ---
>
> Key: SYSTEMML-1790
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1790
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> A FrameBlock reset, e.g., on feeding the same reuse frame block multiple 
> times into slice with different data sizes, currently does not work properly, 
> leading to an ArrayIndexOutOfBoundsException on the actual data copy if the 
> target is larger than the previously allocated block.
> {code}
> java.lang.ArrayIndexOutOfBoundsException
> at java.lang.System.arraycopy(Native Method)
> at 
> org.apache.sysml.runtime.matrix.data.FrameBlock$StringArray.set(FrameBlock.java:1280)
> at 
> org.apache.sysml.runtime.matrix.data.FrameBlock.sliceOperations(FrameBlock.java:884)
> {code}
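The failure mode can be sketched in Python with a toy reusable column. The class below is a stand-in, not the FrameBlock code, and reallocating on a size mismatch is only one possible fix, assumed here for illustration.

```python
class StringColumn:
    """Toy stand-in for a reusable string column."""

    def __init__(self, size):
        self.data = [None] * size

    def reset(self, size):
        # Reallocate when the requested size differs from the allocated one;
        # otherwise clear the existing cells in place.
        if len(self.data) != size:
            self.data = [None] * size
        else:
            for i in range(size):
                self.data[i] = None

    def set_range(self, start, values):
        # Explicit bounds check, mimicking the behavior of an array copy.
        end = start + len(values)
        if end > len(self.data):
            raise IndexError(f"range [{start}, {end}) exceeds size {len(self.data)}")
        self.data[start:end] = values
```

A reset that kept the old, smaller allocation would make the subsequent copy fail in set_range, which corresponds to the exception in the trace above.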





[jira] [Updated] (SYSTEMML-1791) Performance features frame blocks

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1791:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Performance features frame blocks
> -
>
> Key: SYSTEMML-1791
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1791
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> Recent experiments have shown that there are unnecessary overheads in various 
> frame block operations. This task is an umbrella for all related performance 
> improvements. In detail, this includes:
> * Shallow copy for column indexing
> * Bidirectional reuse of recode maps in meta data frames
> * Avoid unnecessary long-string-double parsing on transformapply



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1773) Improve JMLC error handling of invalid inputs

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1773:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Improve JMLC error handling of invalid inputs
> -
>
> Key: SYSTEMML-1773
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1773
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> The JMLC API uses two different mechanisms for binding input parameters (aka 
> $ parameters) and input variables. We should exploit this for better error 
> handling in order to avoid silent errors if users, for example, omit the $ 
> prefix for input parameters.
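A sketch of such a check in Python, assuming the set of $ parameters and the set of input variables of a script are known when inputs are bound. The function and argument names are hypothetical and are not part of the JMLC API.

```python
def bind_inputs(script_params, script_vars, args, inputs):
    """Validate $-parameter and variable bindings and raise on mismatches."""
    for name in args:
        if not name.startswith("$"):
            raise ValueError(f"input parameter '{name}' is missing the $ prefix")
        if name not in script_params:
            raise ValueError(f"unknown input parameter '{name}'")
    for name in inputs:
        if name.startswith("$"):
            raise ValueError(f"'{name}' looks like a parameter but was bound as a variable")
        if name not in script_vars:
            raise ValueError(f"unknown input variable '{name}'")
    return dict(args), dict(inputs)
```

Raising on an unexpected name turns a binding that would otherwise be silently ignored into an immediate, descriptive error.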



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (SYSTEMML-1765) Reading of dml scripts from object stores (main, mlcontext)

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1765:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Reading of dml scripts from object stores (main, mlcontext)
> ---
>
> Key: SYSTEMML-1765
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1765
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>






[jira] [Updated] (SYSTEMML-1767) Performance issues codegen rowwise (column aggregation) w/ wide matrices

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1767:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Performance issues codegen rowwise (column aggregation) w/ wide matrices
> 
>
> Key: SYSTEMML-1767
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1767
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> In scenarios with wide matrices of millions of features, the codegen rowwise 
> template shows performance issues due to unnecessary multi-threading, which 
> requires additional memory per thread for partial aggregation and thus leads 
> to cache thrashing. Similarly to the mmchain operator, we should establish a 
> threshold for maximum temporary results and fall back to sequential 
> operations if this threshold is exceeded.
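The proposed fallback can be sketched as a small Python function. It assumes one partial aggregate of num_cols double-precision cells per thread; the function name and the budget value in the usage note are hypothetical, not SystemML constants.

```python
def choose_num_threads(num_cols, max_threads, max_temp_bytes, bytes_per_cell=8):
    """Return max_threads, or 1 if the per-thread partial aggregates exceed the budget."""
    # Each thread needs its own partial result of num_cols cells.
    temp_bytes = max_threads * num_cols * bytes_per_cell
    return max_threads if temp_bytes <= max_temp_bytes else 1
```

For example, with 8 threads and a 1 MB budget, a matrix with 1,000 columns needs 64,000 bytes of temporary results and stays multi-threaded, while 2 million columns would need 128 MB and fall back to a single thread.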





[jira] [Updated] (SYSTEMML-1761) Sparsity-exploiting weighted squared loss w/o weights

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1761:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Sparsity-exploiting weighted squared loss w/o weights
> -
>
> Key: SYSTEMML-1761
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1761
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> There are existing rewrites and fused operators for weighted squared loss 
> (wsloss). However, for the wsloss type {{NONE}}, i.e., without weights 
> {{sum((X-(U%*%t(V)))^2)}}, the implementation is not sparsity-exploiting 
> leading to huge (unnecessary) computation overhead for the outer-product-like 
> multiply of factors. As it turns out, this expression can be rewritten into a 
> sparsity-exploiting form as follows:
> {code}
> sum ((X - U %*% t(V)) ^ 2)
> -> sum(X^2) - sum(2 * (X * (U%*%t(V)))) + sum((U%*%t(V))^2)
> -> sum(X^2) - sum(2 * (X * (U%*%t(V)))) + sum ((t(U) %*% U) * (t(V) %*% V))
> {code}
> This task aims to change the block-level wsloss NONE implementation to 
> exploit this logical rewrite by computing {{sum(X^2) - sum(2 * (X * 
> (U%*%t(V))))}} in a sparsity-exploiting pass over non-zeros in X and a 
> subsequent correction for {{+ sum ((t(U) %*% U) * (t(V) %*% V))}} via two 
> tsmm operations.
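The rewrite can be checked numerically with a small pure-Python sketch. The helper names and the dictionary representation of the non-zeros of X are assumptions made for illustration; the identity itself follows from expanding the square and from sum((U %*% t(V))^2) = sum((t(U) %*% U) * (t(V) %*% V)).

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]


def wsloss_dense(x, u, v):
    """Direct evaluation of sum((X - U %*% t(V))^2) over all cells."""
    uv = matmul(u, transpose(v))
    return sum((x[i][j] - uv[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[0])))


def wsloss_sparse(x_nonzeros, u, v):
    """Rewritten form: one pass over the non-zeros of X plus a rank-sized correction."""
    term1 = sum(val * val for val in x_nonzeros.values())
    term2 = sum(2.0 * val * sum(p * q for p, q in zip(u[i], v[j]))
                for (i, j), val in x_nonzeros.items())
    utu = matmul(transpose(u), u)  # rank x rank
    vtv = matmul(transpose(v), v)  # rank x rank
    term3 = sum(utu[a][b] * vtv[a][b]
                for a in range(len(utu)) for b in range(len(utu)))
    return term1 - term2 + term3
```

The sparse variant touches only the non-zeros of X and two small rank x rank products, so its cost no longer depends on the full outer-product-like result of U and V.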





[jira] [Updated] (SYSTEMML-1755) Failed instruction generation during dynamic recompilation

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1755:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Failed instruction generation during dynamic recompilation
> --
>
> Key: SYSTEMML-1755
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1755
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> {code}
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: Unable to recompile 
> program block.
> at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:159)
> at 
> org.apache.sysml.runtime.controlprogram.FunctionProgramBlock.execute(FunctionProgramBlock.java:115)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at org.apache.sysml.lops.BinaryScalar.getOpcode(BinaryScalar.java:119)
> at 
> org.apache.sysml.lops.BinaryScalar.getInstructions(BinaryScalar.java:84)
> at 
> org.apache.sysml.lops.compile.Dag.generateControlProgramJobs(Dag.java:1405)
> at org.apache.sysml.lops.compile.Dag.doGreedyGrouping(Dag.java:1175)
> at org.apache.sysml.lops.compile.Dag.getJobs(Dag.java:269)
> at 
> org.apache.sysml.hops.recompile.Recompiler.recompileHopsDag(Recompiler.java:240)
> at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:147)
> ... 14 more
> {code}
> The root cause was a simplification rewrite for binary matrix-scalar 
> operations which did not account for unsupported scalar operations such as 
> {{OpOp2.QUANTILE, OpOp2.CENTRALMOMENT, OpOp2.MINUS1_MULT, OpOp2.MINUS_NZ, 
> OpOp2.LOG_NZ}}.
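A guard of the kind implied by this root cause could look as follows in Python. The operation names are taken from the description above, while the function name and its arguments are hypothetical and do not mirror the SystemML rewrite code.

```python
# Binary operations without a scalar instruction, as listed in the description.
UNSUPPORTED_SCALAR_OPS = {
    "QUANTILE", "CENTRALMOMENT", "MINUS1_MULT", "MINUS_NZ", "LOG_NZ",
}


def can_rewrite_to_matrix_scalar(op, operand_is_scalar_like):
    """Allow the matrix-scalar simplification only for supported scalar operations."""
    return operand_is_scalar_like and op not in UNSUPPORTED_SCALAR_OPS
```

Checking the operation before applying the rewrite keeps instruction generation from reaching an opcode lookup that has no entry for the rewritten operation.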





[jira] [Updated] (SYSTEMML-1752) Cache-conscious mmchain matrix multiply for wide matrices

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1752:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Cache-conscious mmchain matrix multiply for wide matrices
> -
>
> Key: SYSTEMML-1752
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1752
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> The fused mmchain matrix multiply for patterns such as {{t(X) %*% (w * (X %*% 
> v))}} uses row-wise {{dotProduct}} and {{vectMultAdd}} operations, which 
> works very well for the common case of tall matrices where individual 
> rows fit into L1 cache. However, for graph and text scenarios with wide 
> matrices this leads to cache thrashing on the input and output vectors.
> This task aims to generalize these dense and sparse operations to perform the 
> computation in a cache-conscious manner when necessary, by accessing 
> fragments of the input and output vector for groups of rows. For dense this 
> is trivial to realize while for sparse it requires a careful determination of 
> the block sizes according to the input sparsity. 
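The access pattern for the dense case can be sketched in Python. The block sizes and function names are illustrative assumptions; Python will not show the cache effect, so the sketch only demonstrates that processing groups of rows over fragments of v and of the output yields the same result as the row-wise variant.

```python
def mmchain_naive(x, v, w):
    """Row-wise t(X) %*% (w * (X %*% v)): full-length dot product and add per row."""
    n = len(x[0])
    out = [0.0] * n
    for i, row in enumerate(x):
        s = w[i] * sum(a * b for a, b in zip(row, v))
        for j in range(n):
            out[j] += s * row[j]
    return out


def mmchain_blocked(x, v, w, row_block=2, col_block=2):
    """Same result, touching only one fragment of v / out at a time per row group."""
    m, n = len(x), len(x[0])
    out = [0.0] * n
    for r0 in range(0, m, row_block):
        rows = range(r0, min(r0 + row_block, m))
        # Pass 1: partial dot products, one fragment of v at a time.
        dots = [0.0] * len(rows)
        for c0 in range(0, n, col_block):
            c1 = min(c0 + col_block, n)
            for k, i in enumerate(rows):
                dots[k] += sum(x[i][j] * v[j] for j in range(c0, c1))
        # Pass 2: scaled row additions, one fragment of the output at a time.
        for c0 in range(0, n, col_block):
            c1 = min(c0 + col_block, n)
            for k, i in enumerate(rows):
                s = w[i] * dots[k]
                for j in range(c0, c1):
                    out[j] += s * x[i][j]
    return out
```

For sparse inputs the fragment boundaries would have to be chosen per row group according to the number of non-zeros, which is the harder part mentioned above and is not covered by this sketch.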





[jira] [Updated] (SYSTEMML-1750) Optional dynamic recompilation for JMLC training

2017-09-08 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1750:

Fix Version/s: (was: SystemML 1.0)
   SystemML 0.15

> Optional dynamic recompilation for JMLC training
> 
>
> Key: SYSTEMML-1750
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1750
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> There are scenarios where JMLC is used for training on moderately sized 
> input. Due to the use of prepared scripts (which are compiled without size 
> information) and forced singlenode execution type, this can lead to 
> performance problems caused by poor plan choices. This task aims to (1) 
> expose compiler configurations such as dynamic recompilation and 
> multi-threading in the JMLC API and (2) rework the recompilation framework for 
> singlenode execution type.




