[systemml] branch master updated: [SYSTEMML-2525] Initial implementation of RESTful model serving system

2019-03-29 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 863c9d5  [SYSTEMML-2525] Initial implementation of RESTful model serving system
863c9d5 is described below

commit 863c9d5cb1752b0e50140f5c6673968b57c2f9d0
Author: Anthony Thomas 
AuthorDate: Fri Mar 29 10:27:54 2019 -0700

[SYSTEMML-2525] Initial implementation of RESTful model serving system

- The current implementation extends JMLC's readMatrix and GPUContext API.
- The serving system is implemented in Scala using Akka and is available in the org.apache.sysml.api.ml.serving package.
- Minor cleanup and refactoring, required before it is ready for use by the general public, will be done in subsequent commits.
- It remains unclear whether the CUDA and serving code should be included in future standalone releases. If so, deployment will be greatly simplified; otherwise, the user will have to build the standalone jar before deployment.
- The serving system can be started by:
```
mvn -Djcuda.scope=compile -Dserving.scope=compile package -P standalone-jar
java -jar systemml-*-standalone.jar org.apache.sysml.api.ml.serving.PredictionService -port 8099 -scheduler scheduler -admin_password admin
```
- The model can be registered using http://localhost:8099/register-model, and the user can invoke prediction using the http://localhost:8099/predict service.
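
As a rough client-side illustration of the two endpoints named above, the following sketch only builds the HTTP requests and does not send them. The use of POST, the JSON content type, and every payload field name are assumptions made for illustration; the commit message does not describe the request schema.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8099"  # port taken from the -port 8099 flag above


def build_request(endpoint, payload):
    """Build (but do not send) a JSON request for the given endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the HTTP method is not stated in the commit
    )


# Hypothetical payloads: the field names are placeholders, not the real schema.
register_req = build_request("register-model", {"name": "my-model"})
predict_req = build_request("predict", {"name": "my-model", "data": [1.0, 2.0]})

print(register_req.get_method(), register_req.full_url)
print(predict_req.get_method(), predict_req.full_url)
```

Passing either object to `urllib.request.urlopen` would issue the call against a running service; the actual payload format should be taken from PredictionService.scala.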

Closes #860.
---
 .travis.yml|   5 +-
 pom.xml|  31 ++
 .../java/org/apache/sysml/api/jmlc/Connection.java |  51 +++
 .../org/apache/sysml/api/jmlc/PreparedScript.java  |  18 +
 .../org/apache/sysml/parser/DataExpression.java|   1 +
 .../runtime/controlprogram/LocalVariableMap.java   |   4 +
 .../org/apache/sysml/utils/PersistentLRUCache.java |  97 ++--
 .../api/ml/serving/BasicBatchingScheduler.scala|  93 
 .../sysml/api/ml/serving/BatchingScheduler.scala   |  99 +
 .../sysml/api/ml/serving/BatchingUtils.scala   |  57 +++
 .../org/apache/sysml/api/ml/serving/Executor.scala | 155 +++
 .../api/ml/serving/LocalityAwareScheduler.scala| 218 +
 .../apache/sysml/api/ml/serving/ModelManager.scala | 176 
 .../api/ml/serving/NonBatchingScheduler.scala  |  69 +++
 .../sysml/api/ml/serving/PredictionService.scala   | 490 +
 .../apache/sysml/api/ml/serving/RLSEstimator.scala |  91 
 .../apache/sysml/api/ml/serving/Scheduler.scala| 133 ++
 .../sysml/api/ml/serving/SchedulerFactory.scala|  29 ++
 18 files changed, 1756 insertions(+), 61 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index a0c308b..3ce9d06 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -46,7 +46,8 @@ before_script:
 
 script:
 #  - mvn clean verify jacoco:report coveralls:report
-  - mvn clean verify
+# The -q parameter tells mvn not to display anything other than ERROR level log messages. This is required because travis kills the job after the log length exceeds its maximum log length (usually 4 MB).
+  - mvn -q clean verify
 
 after_success:
-#  -  mvn test jacoco:report coveralls:report
\ No newline at end of file
+#  -  mvn test jacoco:report coveralls:report
diff --git a/pom.xml b/pom.xml
index ad74276..4b5dd29 100644
--- a/pom.xml
+++ b/pom.xml
@@ -72,6 +72,7 @@
-MM-dd HH:mm:ss 
z
false
provided
+   provided
0.9.0d


@@ -1259,6 +1260,36 @@
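             <version>3.2.0</version>
         </dependency>
         <dependency>
+            <groupId>com.typesafe.akka</groupId>
+            <artifactId>akka-http_2.11</artifactId>
+            <version>10.1.3</version>
+            <scope>${serving.scope}</scope>
+        </dependency>
+        <dependency>
+            <groupId>com.typesafe.akka</groupId>
+            <artifactId>akka-actor_2.11</artifactId>
+            <version>2.5.14</version>
+            <scope>${serving.scope}</scope>
+        </dependency>
+        <dependency>
+            <groupId>com.typesafe.akka</groupId>
+            <artifactId>akka-stream_2.11</artifactId>
+            <version>2.5.14</version>
+            <scope>${serving.scope}</scope>
+        </dependency>
+        <dependency>
+            <groupId>com.typesafe</groupId>
+            <artifactId>config</artifactId>
+            <version>1.2.0</version>
+            <scope>${serving.scope}</scope>
+        </dependency>
+        <dependency>
+            <groupId>com.typesafe.akka</groupId>
+            <artifactId>akka-http-spray-json-experimental_2.11</artifactId>
+            <version>2.4.11.2</version>
+            <scope>${serving.scope}</scope>
+        </dependency>
+        <dependency>
             <groupId>org.jcuda</groupId>
             <artifactId>jcuda</artifactId>
             <version>${jcuda.version}</version>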
diff --git a/src/main/java/org/apache/sysml/api/jmlc/Connection.java b/src/main/java/org/apache/sysml/api/jmlc/Connection.java
index 53b7d04..29df4c0 100644
--- a/src/main/java/org/apache/sysml/api/jmlc/Connection.java
+++ b/src/main/java/org/apache

[systemml] branch master updated: [SYSTEMML-540] Added performance tests for ResNet200

2019-03-29 Thread niketanpansare

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 794c5a2  [SYSTEMML-540] Added performance tests for ResNet200
794c5a2 is described below

commit 794c5a232a3f462e2a85836dea55570f102e1682
Author: Niketan Pansare 
AuthorDate: Fri Mar 29 10:26:04 2019 -0700

[SYSTEMML-540] Added performance tests for ResNet200

These tests compare the effect of different eviction policies when
training ResNet and perform a baseline comparison with Unified
Memory, TF, and TF-GPU.
---
 scripts/perftest/gpu_resnet_perftest/resnet.py | 282 +
 scripts/perftest/gpu_resnet_perftest/run.py| 219 +++
 scripts/perftest/gpu_resnet_perftest/run.sh|  72 +++
 3 files changed, 573 insertions(+)

diff --git a/scripts/perftest/gpu_resnet_perftest/resnet.py b/scripts/perftest/gpu_resnet_perftest/resnet.py
new file mode 100644
index 000..a2e8514
--- /dev/null
+++ b/scripts/perftest/gpu_resnet_perftest/resnet.py
@@ -0,0 +1,282 @@
+# -
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+# -
+
+from __future__ import division
+
+import six
+from keras.models import Model
+from keras.layers import (
+    Input,
+    Activation,
+    Dense,
+    Flatten
+)
+from keras.layers.convolutional import (
+    Conv2D,
+    MaxPooling2D,
+    AveragePooling2D
+)
+from keras.layers.merge import add
+from keras.layers.normalization import BatchNormalization
+from keras.regularizers import l2
+from keras import backend as K
+
+
+def _bn_relu(input):
+    """Helper to build a BN -> relu block
+    """
+    norm = BatchNormalization(axis=CHANNEL_AXIS)(input)
+    return Activation("relu")(norm)
+
+
+def _conv_bn_relu(**conv_params):
+    """Helper to build a conv -> BN -> relu block
+    """
+    filters = conv_params["filters"]
+    kernel_size = conv_params["kernel_size"]
+    strides = conv_params.setdefault("strides", (1, 1))
+    kernel_initializer = conv_params.setdefault("kernel_initializer", "he_normal")
+    padding = conv_params.setdefault("padding", "same")
+    kernel_regularizer = conv_params.setdefault("kernel_regularizer", l2(1.e-4))
+
+    def f(input):
+        conv = Conv2D(filters=filters, kernel_size=kernel_size,
+                      strides=strides, padding=padding,
+                      kernel_initializer=kernel_initializer,
+                      kernel_regularizer=kernel_regularizer)(input)
+        return _bn_relu(conv)
+
+    return f
+
+
+def _bn_relu_conv(**conv_params):
+    """Helper to build a BN -> relu -> conv block.
+    This is an improved scheme proposed in http://arxiv.org/pdf/1603.05027v2.pdf
+    """
+    filters = conv_params["filters"]
+    kernel_size = conv_params["kernel_size"]
+    strides = conv_params.setdefault("strides", (1, 1))
+    kernel_initializer = conv_params.setdefault("kernel_initializer", "he_normal")
+    padding = conv_params.setdefault("padding", "same")
+    kernel_regularizer = conv_params.setdefault("kernel_regularizer", l2(1.e-4))
+
+    def f(input):
+        activation = _bn_relu(input)
+        return Conv2D(filters=filters, kernel_size=kernel_size,
+                      strides=strides, padding=padding,
+                      kernel_initializer=kernel_initializer,
+                      kernel_regularizer=kernel_regularizer)(activation)
+
+    return f
+
+
+def _shortcut(input, residual):
+    """Adds a shortcut between input and residual block and merges them with "sum"
+    """
+    # Expand channels of shortcut to match residual.
+    # Stride appropriately to match residual (width, height)
+    # Should 

[systemml] branch master updated: [SYSTEMML-540] Optimized sparse-to-dense conversion on GPU and added a flag to disable forced memset0

2019-03-28 Thread niketanpansare

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 70bf610  [SYSTEMML-540] Optimized sparse-to-dense conversion on GPU and added a flag to disable forced memset0
70bf610 is described below

commit 70bf61093dc3814ccbec867de4e4753cb9f3e086
Author: Niketan Pansare 
AuthorDate: Thu Mar 28 22:44:24 2019 -0700

[SYSTEMML-540] Optimized sparse-to-dense conversion on GPU and added a flag to disable forced memset0

- Improved the performance of sparse-to-dense conversion of empty matrices.
- Added a flag sysml.gpu.force.memSetZero that allows the user to disable forced memset0.
- This flag is turned on by default for now; after exhaustive testing, the default will be changed to off.
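
As a small illustration of how the new flag might be overridden, the sketch below generates a configuration fragment that sets it to false. The property name comes from the commit message; the `<root>` wrapper element is an assumption about the layout of the SystemML configuration file, so check it against conf/SystemML-config.xml.template before use.

```python
import xml.etree.ElementTree as ET

# Property name taken from the commit message; "false" disables forced memset0.
# Assumption: configuration properties sit directly under a <root> element.
root = ET.Element("root")
flag = ET.SubElement(root, "sysml.gpu.force.memSetZero")
flag.text = "false"

config_xml = ET.tostring(root, encoding="unicode")
print(config_xml)
```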
---
 conf/SystemML-config.xml.template  |  3 +++
 src/main/java/org/apache/sysml/conf/DMLConfig.java |  4 +++-
 .../instructions/gpu/context/CSRPointer.java   |  3 +++
 .../instructions/gpu/context/GPUMemoryManager.java | 20 
 .../instructions/gpu/context/GPUObject.java| 22 ++
 5 files changed, 43 insertions(+), 9 deletions(-)

diff --git a/conf/SystemML-config.xml.template b/conf/SystemML-config.xml.template
index 17cc2cc..cd0d311 100644
--- a/conf/SystemML-config.xml.template
+++ b/conf/SystemML-config.xml.template
@@ -121,4 +121,7 @@


true
+   
+   
+   true
 
\ No newline at end of file
diff --git a/src/main/java/org/apache/sysml/conf/DMLConfig.java b/src/main/java/org/apache/sysml/conf/DMLConfig.java
index 0b5ed78..e435c77 100644
--- a/src/main/java/org/apache/sysml/conf/DMLConfig.java
+++ b/src/main/java/org/apache/sysml/conf/DMLConfig.java
@@ -96,6 +96,7 @@ public class DMLConfig
    public static final String GPU_MEMORY_ALLOCATOR = "sysml.gpu.memory.allocator"; // String to specify the memory allocator to use. Supported values are: cuda, unified_memory
    public static final String FLOATING_POINT_PRECISION = "sysml.floating.point.precision"; // String to specify the datatype to use internally: supported values are double, single
    public static final String PRINT_GPU_MEMORY_INFO = "sysml.gpu.print.memoryInfo";
+   public static final String GPU_FORCE_MEMSET_ZERO = "sysml.gpu.force.memSetZero";
    public static final String EVICTION_SHADOW_BUFFERSIZE = "sysml.gpu.eviction.shadow.bufferSize";
    public static final String GPU_RECOMPUTE_ACTIVATIONS = "sysml.gpu.recompute.activations";
 
@@ -140,6 +141,7 @@ public class DMLConfig
_defaultVals.put(NATIVE_BLAS_DIR,"none" );
_defaultVals.put(EXTRA_FINEGRAINED_STATS,"false" );
_defaultVals.put(PRINT_GPU_MEMORY_INFO,  "false" );
+   _defaultVals.put(GPU_FORCE_MEMSET_ZERO,  "true" );
_defaultVals.put(EVICTION_SHADOW_BUFFERSIZE,  "0.5" );
_defaultVals.put(STATS_MAX_WRAP_LEN, "30" );
_defaultVals.put(GPU_MEMORY_UTILIZATION_FACTOR,  "0.9" );
@@ -431,7 +433,7 @@ public class DMLConfig
    YARN_APPMASTER, YARN_APPMASTERMEM, YARN_MAPREDUCEMEM, 
    CP_PARALLEL_OPS, CP_PARALLEL_IO, NATIVE_BLAS, NATIVE_BLAS_DIR,
    COMPRESSED_LINALG, 
-   CODEGEN, CODEGEN_COMPILER, CODEGEN_OPTIMIZER, CODEGEN_PLANCACHE, CODEGEN_LITERALS,
+   CODEGEN, CODEGEN_COMPILER, CODEGEN_OPTIMIZER, CODEGEN_PLANCACHE, CODEGEN_LITERALS, GPU_FORCE_MEMSET_ZERO,
    EXTRA_FINEGRAINED_STATS, STATS_MAX_WRAP_LEN, PRINT_GPU_MEMORY_INFO, CACHING_BUFFER_SIZE,
    AVAILABLE_GPUS, SYNCHRONIZE_GPU, EAGER_CUDA_FREE, FLOATING_POINT_PRECISION, GPU_EVICTION_POLICY, EVICTION_SHADOW_BUFFERSIZE,
    GPU_MEMORY_ALLOCATOR, GPU_MEMORY_UTILIZATION_FACTOR, GPU_RECOMPUTE_ACTIVATIONS, FORCE_LSTM_CUDNN
diff --git a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/CSRPointer.java b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/CSRPointer.java
index b3ec497..d7bd295 100644
--- a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/CSRPointer.java
+++ b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/CSRPointer.java
@@ -303,6 +303,9 @@ public class CSRPointer {
r.val = gCtx.allocate(null, getDataTypeSizeOf(nnz2));
r.rowPtr = gCtx.allocate(null, getIntSizeOf(rows + 1));
r.colInd = gCtx.allocate(null, getIntSizeOf(nnz2));
+   GPUMemoryManager.postAllocateMemset0(r.

[systemml] branch gh-pages updated: [MINOR][DOC] Updated Keras2DML and Caffe2DML reference guides.

2019-03-27 Thread niketanpansare

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/gh-pages by this push:
 new c71d404  [MINOR][DOC] Updated Keras2DML and Caffe2DML reference guides.
c71d404 is described below

commit c71d404922591300ef6c9e872069ba94ae944cd1
Author: Niketan Pansare 
AuthorDate: Wed Mar 27 09:21:30 2019 -0700

[MINOR][DOC] Updated Keras2DML and Caffe2DML reference guides.
---
 reference-guide-caffe2dml.md | 12 ++--
 reference-guide-keras2dml.md | 30 +-
 2 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/reference-guide-caffe2dml.md b/reference-guide-caffe2dml.md
index 1a3d154..993d587 100644
--- a/reference-guide-caffe2dml.md
+++ b/reference-guide-caffe2dml.md
@@ -1137,17 +1137,17 @@ class   precision   recall  f1-scorenum_true_labels
 
  Design document of Caffe2DML
 
-1. Caffe2DML is designed to fit well into the mllearn framework. Hence, the key methods that were to be implemented are:
+Caffe2DML is designed to fit well into the mllearn framework. Hence, the key methods that were to be implemented are:
 - `getTrainingScript` for the `Estimator` class.
 - `getPredictionScript` for the `Model` class.
 
 These methods should be the starting point of any developer to understand the DML generated for training and prediction respectively.
 
-2. To simplify the DML generation in `getTrainingScript` and `getPredictionScript method`, we use DMLGenerator interface. 
+To simplify the DML generation in `getTrainingScript` and `getPredictionScript method`, we use DMLGenerator interface. 
 This interface generates DML string for common operations such as loops (such as if, for, while) as well as built-in functions (read, write), etc. 
 Also, this interface helps in "code reading" of the Caffe2DML class.
 
-3. Here is an analogy for SystemML developers to think of various moving components of Caffe2DML:
+Here is an analogy for SystemML developers to think of various moving components of Caffe2DML:
 - Like `Dml.g4` in the `org.apache.sysml.parser.dml` package, `caffe.proto` in the `src/main/proto/caffe` directory
 is used to generate classes to parse the input files.
 
@@ -1187,7 +1187,7 @@ trait CaffeSolver {
 }
 ```
 
-4. To simplify the traversal of the network, we created a Network interface:
+To simplify the traversal of the network, we created a Network interface:
 ```
 trait Network {
   def getLayers(): List[String]
@@ -1198,8 +1198,8 @@ trait Network {
 }
 ```
 
-5. One of the key design restriction of Caffe2DML is that every layer is identified uniquely by its name.
+One of the key design restriction of Caffe2DML is that every layer is identified uniquely by its name.
 This restriction simplifies the code significantly.
 To shield from network files that violates this restriction, Caffe2DML performs rewrites in CaffeNetwork class (search for condition 1-5 in Caffe2DML class).
 
-6. Like Caffe, Caffe2DML also expects the layers to be in sorted order.
+Like Caffe, Caffe2DML also expects the layers to be in sorted order.
\ No newline at end of file
diff --git a/reference-guide-keras2dml.md b/reference-guide-keras2dml.md
index a576ee7..d04ff51 100644
--- a/reference-guide-keras2dml.md
+++ b/reference-guide-keras2dml.md
@@ -30,10 +30,30 @@ limitations under the License.
 
 # Layers supported in Keras2DML
 
-TODO:
+If a Keras layer or a hyperparameter is not supported, we throw an error informing that the layer is not supported.
+We follow the Keras specification very closely during DML generation and compare the results of our layers (both forward and backward) with Tensorflow to validate that.
+
+- Following layers are not supported but will be supported in near future: `Reshape, Permute, RepeatVector, ActivityRegularization, Masking, SpatialDropout1D, SpatialDropout2D, SeparableConv1D, SeparableConv2D, DepthwiseConv2D, Cropping1D, Cropping2D, GRU and Embedding`.
+- Following layers are not supported by their 2D variants exists (consider using them instead): `UpSampling1D, ZeroPadding1D, MaxPooling1D, AveragePooling1D and Conv1D`.
+- Specialized `CuDNNGRU and CuDNNLSTM` layers are not required in SystemML. Instead use `LSTM` layer. 
+- We do not have immediate plans to support the following layers: `Lambda, SpatialDropout3D, Conv3D, Conv3DTranspose, Cropping3D, UpSampling3D, ZeroPadding3D, MaxPooling3D, AveragePooling3D and ConvLSTM2D*`.
 
 # Frequently asked questions
 
+ How do I specify the batch size, the number of epochs and the validation dataset?
+
+Like Keras, the user can provide `batch_size` and `epochs` via the `fit` method. 
+
+```python
+# Either:
+sysml_model.fit(features, labels, epochs=10, batch_size=64, validation_split=0.3)
+# Or
+sysml_model.fit(features, labels, epochs=10, batch_size=64, valid

[systemml] branch master updated: [MINOR][DOC] Updated Keras2DML and Caffe2DML reference guides.

2019-03-27 Thread niketanpansare

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 9a86656  [MINOR][DOC] Updated Keras2DML and Caffe2DML reference guides.
9a86656 is described below

commit 9a86656392d0bd36002614366cf980802d146176
Author: Niketan Pansare 
AuthorDate: Wed Mar 27 09:21:30 2019 -0700

[MINOR][DOC] Updated Keras2DML and Caffe2DML reference guides.
---
 docs/reference-guide-caffe2dml.md | 12 ++--
 docs/reference-guide-keras2dml.md | 30 +-
 2 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/docs/reference-guide-caffe2dml.md b/docs/reference-guide-caffe2dml.md
index 1a3d154..993d587 100644
--- a/docs/reference-guide-caffe2dml.md
+++ b/docs/reference-guide-caffe2dml.md
@@ -1137,17 +1137,17 @@ class   precision   recall  f1-scorenum_true_labels
 
  Design document of Caffe2DML
 
-1. Caffe2DML is designed to fit well into the mllearn framework. Hence, the key methods that were to be implemented are:
+Caffe2DML is designed to fit well into the mllearn framework. Hence, the key methods that were to be implemented are:
 - `getTrainingScript` for the `Estimator` class.
 - `getPredictionScript` for the `Model` class.
 
 These methods should be the starting point of any developer to understand the DML generated for training and prediction respectively.
 
-2. To simplify the DML generation in `getTrainingScript` and `getPredictionScript method`, we use DMLGenerator interface. 
+To simplify the DML generation in `getTrainingScript` and `getPredictionScript method`, we use DMLGenerator interface. 
 This interface generates DML string for common operations such as loops (such as if, for, while) as well as built-in functions (read, write), etc. 
 Also, this interface helps in "code reading" of the Caffe2DML class.
 
-3. Here is an analogy for SystemML developers to think of various moving components of Caffe2DML:
+Here is an analogy for SystemML developers to think of various moving components of Caffe2DML:
 - Like `Dml.g4` in the `org.apache.sysml.parser.dml` package, `caffe.proto` in the `src/main/proto/caffe` directory
 is used to generate classes to parse the input files.
 
@@ -1187,7 +1187,7 @@ trait CaffeSolver {
 }
 ```
 
-4. To simplify the traversal of the network, we created a Network interface:
+To simplify the traversal of the network, we created a Network interface:
 ```
 trait Network {
   def getLayers(): List[String]
@@ -1198,8 +1198,8 @@ trait Network {
 }
 ```
 
-5. One of the key design restriction of Caffe2DML is that every layer is identified uniquely by its name.
+One of the key design restriction of Caffe2DML is that every layer is identified uniquely by its name.
 This restriction simplifies the code significantly.
 To shield from network files that violates this restriction, Caffe2DML performs rewrites in CaffeNetwork class (search for condition 1-5 in Caffe2DML class).
 
-6. Like Caffe, Caffe2DML also expects the layers to be in sorted order.
+Like Caffe, Caffe2DML also expects the layers to be in sorted order.
\ No newline at end of file
diff --git a/docs/reference-guide-keras2dml.md b/docs/reference-guide-keras2dml.md
index a576ee7..d04ff51 100644
--- a/docs/reference-guide-keras2dml.md
+++ b/docs/reference-guide-keras2dml.md
@@ -30,10 +30,30 @@ limitations under the License.
 
 # Layers supported in Keras2DML
 
-TODO:
+If a Keras layer or a hyperparameter is not supported, we throw an error informing that the layer is not supported.
+We follow the Keras specification very closely during DML generation and compare the results of our layers (both forward and backward) with Tensorflow to validate that.
+
+- Following layers are not supported but will be supported in near future: `Reshape, Permute, RepeatVector, ActivityRegularization, Masking, SpatialDropout1D, SpatialDropout2D, SeparableConv1D, SeparableConv2D, DepthwiseConv2D, Cropping1D, Cropping2D, GRU and Embedding`.
+- Following layers are not supported by their 2D variants exists (consider using them instead): `UpSampling1D, ZeroPadding1D, MaxPooling1D, AveragePooling1D and Conv1D`.
+- Specialized `CuDNNGRU and CuDNNLSTM` layers are not required in SystemML. Instead use `LSTM` layer. 
+- We do not have immediate plans to support the following layers: `Lambda, SpatialDropout3D, Conv3D, Conv3DTranspose, Cropping3D, UpSampling3D, ZeroPadding3D, MaxPooling3D, AveragePooling3D and ConvLSTM2D*`.
 
 # Frequently asked questions
 
+ How do I specify the batch size, the number of epochs and the validation dataset?
+
+Like Keras, the user can provide `batch_size` and `epochs` via the `fit` method. 
+
+```python
+# Either:
+sysml_model.fit(features, labels, epochs=10, batch_size=64, validation_split=0.3)
+# Or
+sysml_model.fit(featur

[systemml] 01/02: [SYSTEMML-540] Added looped_minibatch training algorithm in Keras2DML

2019-03-27 Thread niketanpansare

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git

commit 9deb19ca8092b20a4cebcb9bdbc91fb444b1918b
Author: Niketan Pansare 
AuthorDate: Mon Mar 25 12:33:50 2019 -0700

[SYSTEMML-540] Added looped_minibatch training algorithm in Keras2DML

- This algorithm performs multiple forward-backward passes (as many as the `parallel_batches` parameter) with the given batch size, aggregates the gradients, and finally updates the model.
- Updated the documentation.
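
The idea described above can be sketched in a few lines of plain Python. This is a toy illustration on a one-parameter linear model, not the DML that Keras2DML generates; the function names are hypothetical, and averaging (rather than summing) the aggregated gradients is an assumption.

```python
def gradient(w, batch):
    """Mean-squared-error gradient of the 1-D model y = w * x on one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def looped_minibatch_step(w, data, batch_size, parallel_batches, lr):
    """Run `parallel_batches` forward-backward passes, then update the model once."""
    aggregated = 0.0
    for i in range(parallel_batches):
        batch = data[i * batch_size:(i + 1) * batch_size]
        aggregated += gradient(w, batch)  # one forward-backward pass
    aggregated /= parallel_batches        # assumption: gradients are averaged
    return w - lr * aggregated            # single model update


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2x
w_looped = looped_minibatch_step(0.0, data, batch_size=2, parallel_batches=2, lr=0.01)
w_full = 0.0 - 0.01 * gradient(0.0, data)  # one pass with an effective batch of 4
print(w_looped, w_full)
```

With equally sized batches, the looped update matches a single step on the combined batch, which is why this scheme can emulate a larger batch when only a small one fits in GPU memory.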
---
 beginners-guide-caffe2dml.md |  2 +-
 beginners-guide-keras2dml.md | 35 ++-
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/beginners-guide-caffe2dml.md b/beginners-guide-caffe2dml.md
index 8814283..db74feb 100644
--- a/beginners-guide-caffe2dml.md
+++ b/beginners-guide-caffe2dml.md
@@ -161,7 +161,7 @@ Iter:2000, validation loss:173.66147359346, validation accuracy:97.4897540983606
 
 Unlike Caffe where default train and test algorithm is `minibatch`, you can specify the
 algorithm using the parameters `train_algo` and `test_algo` (valid values are: `minibatch`, `allreduce_parallel_batches`, 
-and `allreduce`). Here are some common settings:
+`looped_minibatch`, and `allreduce`). Here are some common settings:
 
 |  | PySpark script | Changes to Network/Solver |
 |--|--|--|
diff --git a/beginners-guide-keras2dml.md b/beginners-guide-keras2dml.md
index 4517be5..2259397 100644
--- a/beginners-guide-keras2dml.md
+++ b/beginners-guide-keras2dml.md
@@ -208,4 +208,37 @@ For example: for the expression `Keras2DML(..., display=100, test_iter=10, test_
 To verify that Keras2DML produce same results as other Keras' backend, we have [Python unit tests](https://github.com/apache/systemml/blob/master/src/main/python/tests/test_nn_numpy.py) that compare the results of Keras2DML with that of TensorFlow. We assume that Keras team ensure that all their backends are consistent with their TensorFlow backend.
 
-
+ How can I train very deep models on GPU?
+
+Unlike Keras where default train and test algorithm is `minibatch`, you can specify the
+algorithm using the parameters `train_algo` and `test_algo` (valid values are: `minibatch`, `allreduce_parallel_batches`, 
+`looped_minibatch`, and `allreduce`). Here are some common settings:
+
+|  | PySpark script | Changes to Network/Solver |
+|--|--|--|
+| Single-node CPU execution (similar to Caffe with solver_mode: CPU) | `lenet.set(train_algo="minibatch", test_algo="minibatch")` | Ensure that `batch_size` is set to appropriate value (for example: 64) |
+| Single-node single-GPU execution | `lenet.set(train_algo="minibatch", test_algo="minibatch").setGPU(True).setForceGPU(True)` | Ensure that `batch_size` is set to appropriate value (for example: 64) |
+| Single-node multi-GPU execution (similar to Caffe with solver_mode: GPU) | `lenet.set(train_algo="allreduce_parallel_batches", test_algo="minibatch", parallel_batches=num_gpu).setGPU(True).setForceGPU(True)` | Ensure that `batch_size` is set to appropriate value (for example: 64) |
+| Distributed prediction | `lenet.set(test_algo="allreduce")` |  |
+| Distributed synchronous training | `lenet.set(train_algo="allreduce_parallel_batches", parallel_batches=num_cluster_cores)` | Ensure that `ba

[systemml] branch gh-pages updated (878f757 -> 47ce217)

2019-03-27 Thread niketanpansare

niketanpansare pushed a change to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git.


from 878f757  [SYSTEMML-540] Added ternary aggregate operators for GPU backend
 new 9deb19c  [SYSTEMML-540] Added looped_minibatch training algorithm in Keras2DML
 new 47ce217  [SYSTEMML-540] Updated Keras2DML to match Keras API and improved rmvar performance

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 beginners-guide-caffe2dml.md   |   2 +-
 beginners-guide-keras2dml.md   | 132 
 gpu.md |  12 +-
 index.md   |   4 +-
 reference-guide-caffe2dml.md   |  68 
 ...de-keras2dml.md => reference-guide-keras2dml.md | 171 +++--
 6 files changed, 165 insertions(+), 224 deletions(-)
 copy beginners-guide-keras2dml.md => reference-guide-keras2dml.md (55%)



[systemml] 02/02: [SYSTEMML-540] Updated Keras2DML to match Keras API and improved rmvar performance

2019-03-27 Thread niketanpansare

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git

commit 47ce2179768b9dd2821aae7fc11f11c1fb8de210
Author: Niketan Pansare 
AuthorDate: Wed Mar 27 08:57:02 2019 -0700

[SYSTEMML-540] Updated Keras2DML to match Keras API and improved rmvar performance

- Improved performance of rmvar by refactoring LocalVariableMap. With this change, the end-to-end performance of a sample run of ResNet-200 improved from 55 seconds to 31 seconds.
- The parameters batch_size, max_iter, test_iter, test_interval, and display of the Keras2DML constructor are removed. Instead, use the batch_size, epochs, validation_split, and validation_data parameters of the fit() method.
- Updated the Caffe2DML generator to include the above parameters.
- Updated the documentation.

Closes #859.
---
 beginners-guide-keras2dml.md   | 165 -
 gpu.md |  12 +-
 index.md   |   4 +-
 reference-guide-caffe2dml.md   |  68 +
 ...de-keras2dml.md => reference-guide-keras2dml.md | 142 +++---
 5 files changed, 133 insertions(+), 258 deletions(-)

diff --git a/beginners-guide-keras2dml.md b/beginners-guide-keras2dml.md
index 2259397..788a489 100644
--- a/beginners-guide-keras2dml.md
+++ b/beginners-guide-keras2dml.md
@@ -27,34 +27,23 @@ limitations under the License.
 
 
 
-## Introduction
+# Introduction
 
-Keras2DML is an **experimental API** that converts a Keras specification to DML through the intermediate Caffe2DML module. 
+Keras2DML converts a Keras specification to DML through the intermediate Caffe2DML module. 
 It is designed to fit well into the mllearn framework and hence supports NumPy, Pandas as well as PySpark DataFrame.
 
-### Getting Started 
-
-To create a Keras2DML object, one needs to create a Keras model through the Funcitonal API. please see the [Functional API.](https://keras.io/models/model/)
-This module utilizes the existing [Caffe2DML](beginners-guide-caffe2dml) backend to convert Keras models into DML. Keras models are 
-parsed and translated into Caffe prototext and caffemodel files which are then piped into Caffe2DML. Thus one can follow the Caffe2DML
-documentation for further information.
-
-### Model Conversion
-
-Keras models are parsed based on their layer structure and corresponding weights and translated into the relative Caffe layer and weight
-configuration. Be aware that currently this is a translation into Caffe and there will be loss of information from keras models such as 
-intializer information, and other layers which do not exist in Caffe. 
-
 First, install SystemML and other dependencies for the below demo:
 
 ```
-pip install systemml keras tensorflow mlxtend
+pip install systemml keras tensorflow
 ``` 
 
 To create a Keras2DML object, simply pass the keras object to the Keras2DML constructor. It's also important to note that your models
 should be compiled so that the loss can be accessed for Caffe2DML.
 
+# Training Lenet on the MNIST dataset
 
+Download the MNIST dataset using [mlxtend package](https://pypi.python.org/pypi/mlxtend).
 
 ```python
 # pyspark --driver-memory 20g
@@ -115,130 +104,34 @@ sysml_model.fit(X_train, y_train)
 sysml_model.score(X_test, y_test)
 ```
 
-# Frequently asked questions
-
- How can I get the training and prediction DML script for the Keras model?
-
-The training and prediction DML scripts can be generated using `get_training_script()` and `get_prediction_script()` methods.
+# Prediction using a pretrained ResNet-50
 
 ```python
-from systemml.mllearn import Keras2DML
-sysml_model = Keras2DML(spark, keras_model, input_shape=(3,224,224))
-print(sysml_model.get_training_script())
-```
-
- What is the mapping between Keras' parameters and Caffe's solver specification ? 
-
-|  | Specified via the given parameter in the Keras2DML constructor | From input Keras' model | Corresponding parameter in the Caffe solver file |
-|--|--|--|--|
-| Solver type |  | `type(keras_model.optimizer)`. Supported types: `keras.optimizers.{SGD, Adagrad, Adam}` | `type` |
-| Maximum number of iterations | `max_iter` | The `epoch` parameter in the `fit` method is not suppor

[systemml] branch master updated: [SYSTEMML-540] Updated Keras2DML to match Keras API and improved rmvar performance

2019-03-27 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 8857b89  [SYSTEMML-540] Updated Keras2DML to match Keras API and 
improved rmvar performance
8857b89 is described below

commit 8857b8925004c369c39d969c4312472a0119d425
Author: Niketan Pansare 
AuthorDate: Wed Mar 27 08:57:02 2019 -0700

[SYSTEMML-540] Updated Keras2DML to match Keras API and improved rmvar 
performance

- Improved performance of rmvar by refactoring LocalVariableMap. With this 
change, the end-to-end performance of a sample run of ResNet-200 improved from 
55 seconds to 31 seconds.
- Removed the batch_size, max_iter, test_iter, test_interval and display 
parameters from the Keras2DML constructor. Use the batch_size, epochs, 
validation_split and validation_data parameters of the fit() method instead.
- Updated the Caffe2DML generator to include the above parameters.
- Updated the documentation.

Closes #859.
---
 docs/beginners-guide-keras2dml.md  | 165 ++-
 docs/gpu.md|  12 +-
 docs/index.md  |   4 +-
 docs/reference-guide-caffe2dml.md  |  68 +++
 ...e-keras2dml.md => reference-guide-keras2dml.md} | 142 +-
 .../runtime/controlprogram/LocalVariableMap.java   |  75 ++-
 src/main/python/systemml/mllearn/estimators.py |  41 +-
 src/main/python/tests/test_nn_numpy.py |   4 +-
 .../scala/org/apache/sysml/api/dl/Caffe2DML.scala  | 523 ++---
 .../scala/org/apache/sysml/api/dl/CaffeLayer.scala |  11 +-
 .../org/apache/sysml/api/dl/CaffeSolver.scala  |   1 +
 .../sysml/api/ml/BaseSystemMLClassifier.scala  |  15 +-
 12 files changed, 610 insertions(+), 451 deletions(-)

diff --git a/docs/beginners-guide-keras2dml.md 
b/docs/beginners-guide-keras2dml.md
index 2259397..788a489 100644
--- a/docs/beginners-guide-keras2dml.md
+++ b/docs/beginners-guide-keras2dml.md
@@ -27,34 +27,23 @@ limitations under the License.
 
 
 
-## Introduction
+# Introduction
 
-Keras2DML is an **experimental API** that converts a Keras specification to 
DML through the intermediate Caffe2DML module. 
+Keras2DML converts a Keras specification to DML through the intermediate 
Caffe2DML module. 
 It is designed to fit well into the mllearn framework and hence supports 
NumPy, Pandas as well as PySpark DataFrame.
 
-### Getting Started 
-
-To create a Keras2DML object, one needs to create a Keras model through the 
Funcitonal API. please see the [Functional API.](https://keras.io/models/model/)
-This module utilizes the existing [Caffe2DML](beginners-guide-caffe2dml) 
backend to convert Keras models into DML. Keras models are 
-parsed and translated into Caffe prototext and caffemodel files which are then 
piped into Caffe2DML. Thus one can follow the Caffe2DML
-documentation for further information.
-
-### Model Conversion
-
-Keras models are parsed based on their layer structure and corresponding 
weights and translated into the relative Caffe layer and weight
-configuration. Be aware that currently this is a translation into Caffe and 
there will be loss of information from keras models such as 
-intializer information, and other layers which do not exist in Caffe. 
-
 First, install SystemML and other dependencies for the below demo:
 
 ```
-pip install systemml keras tensorflow mlxtend
+pip install systemml keras tensorflow
 ``` 
 
 To create a Keras2DML object, simply pass the keras object to the Keras2DML 
constructor. It's also important to note that your models
 should be compiled so that the loss can be accessed for Caffe2DML.
 
+# Training Lenet on the MNIST dataset
 
+Download the MNIST dataset using [mlxtend 
package](https://pypi.python.org/pypi/mlxtend).
 
 ```python
 # pyspark --driver-memory 20g
@@ -115,130 +104,34 @@ sysml_model.fit(X_train, y_train)
 sysml_model.score(X_test, y_test)
 ```
 
-# Frequently asked questions
-
- How can I get the training and prediction DML script for the Keras model?
-
-The training and prediction DML scripts can be generated using 
`get_training_script()` and `get_prediction_script()` methods.
+# Prediction using a pretrained ResNet-50
 
 ```python
-from systemml.mllearn import Keras2DML
-sysml_model = Keras2DML(spark, keras_model, input_shape=(3,224,224))
-print(sysml_model.get_training_script())
-```
-
- What is the mapping between Keras' parameters and Caffe's solver 
specification ? 
-
-|| Specified via the 
given parameter in the Keras2DML constructor | From input Keras' model  
   | Corresponding 
parameter in the Caffe solver f

[systemml] branch master updated: [SYSTEMML-540] Added looped_minibatch training algorithm in Keras2DML

2019-03-25 Thread niketanpansare

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new b657820  [SYSTEMML-540] Added looped_minibatch training algorithm in 
Keras2DML
b657820 is described below

commit b657820248fbb42f1c4f27564cdb14865ebeeec1
Author: Niketan Pansare 
AuthorDate: Mon Mar 25 12:33:50 2019 -0700

[SYSTEMML-540] Added looped_minibatch training algorithm in Keras2DML

- This algorithm performs multiple forward-backward passes (the number of 
passes is given by the `parallel_batches` parameter) with the given batch size, 
aggregates the gradients, and finally updates the model.
- Updated the documentation.
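
The behaviour described above can be illustrated with a minimal sketch. It assumes a toy one-parameter least-squares model; the names `grad_fn` and `looped_minibatch_step`, and the choice to average the aggregated gradients before the update, are hypothetical and are not taken from the DML that Caffe2DML generates.

```python
def grad_fn(w, batch):
    """Gradient of the mean of 0.5 * (w*x - y)**2 over one batch, w.r.t. w."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def looped_minibatch_step(w, data, start, batch_size, parallel_batches, lr):
    """Run `parallel_batches` forward-backward passes, then update the model once."""
    aggregated = 0.0
    for i in range(parallel_batches):
        lo = start + i * batch_size
        batch = data[lo:lo + batch_size]
        # The weights are not changed between passes; only gradients accumulate.
        aggregated += grad_fn(w, batch)
    # Single update from the aggregated gradients (averaging is an assumption).
    return w - lr * aggregated / parallel_batches


data = [(float(x), 2.0 * x) for x in range(1, 9)]  # samples of y = 2x
w = 0.0
for _ in range(200):
    w = looped_minibatch_step(w, data, start=0, batch_size=4,
                              parallel_batches=2, lr=0.01)
```

With `parallel_batches=2` and `batch_size=4`, each update sees 8 samples while only 4 are processed per forward-backward pass, which is the property that makes this style of training attractive when a larger batch does not fit in memory.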
---
 docs/beginners-guide-caffe2dml.md  |  2 +-
 docs/beginners-guide-keras2dml.md  | 35 -
 src/main/python/systemml/mllearn/estimators.py | 11 ++--
 .../scala/org/apache/sysml/api/dl/Caffe2DML.scala  | 60 ++
 4 files changed, 82 insertions(+), 26 deletions(-)

diff --git a/docs/beginners-guide-caffe2dml.md 
b/docs/beginners-guide-caffe2dml.md
index 8814283..db74feb 100644
--- a/docs/beginners-guide-caffe2dml.md
+++ b/docs/beginners-guide-caffe2dml.md
@@ -161,7 +161,7 @@ Iter:2000, validation loss:173.66147359346, validation 
accuracy:97.4897540983606
 
 Unlike Caffe where default train and test algorithm is `minibatch`, you can 
specify the
 algorithm using the parameters `train_algo` and `test_algo` (valid values are: 
`minibatch`, `allreduce_parallel_batches`, 
-and `allreduce`). Here are some common settings:
+`looped_minibatch`, and `allreduce`). Here are some common settings:
 
 | | PySpark script | Changes to Network/Solver |
 |---|---|---|
diff --git a/docs/beginners-guide-keras2dml.md 
b/docs/beginners-guide-keras2dml.md
index 4517be5..2259397 100644
--- a/docs/beginners-guide-keras2dml.md
+++ b/docs/beginners-guide-keras2dml.md
@@ -208,4 +208,37 @@ For example: for the expression `Keras2DML(..., 
display=100, test_iter=10, test_
 To verify that Keras2DML produce same results as other Keras' backend, we have 
[Python unit 
tests](https://github.com/apache/systemml/blob/master/src/main/python/tests/test_nn_numpy.py)
 that compare the results of Keras2DML with that of TensorFlow. We assume that 
Keras team ensure that all their backends are consistent with their TensorFlow 
backend.
 
-
+ How can I train very deep models on GPU?
+
+Unlike Keras where default train and test algorithm is `minibatch`, you can 
specify the
+algorithm using the parameters `train_algo` and `test_algo` (valid values are: 
`minibatch`, `allreduce_parallel_batches`, 
+`looped_minibatch`, and `allreduce`). Here are some common settings:
+
+| | PySpark script | Changes to Network/Solver |
+|---|---|---|
+| Single-node CPU execution (similar to Caffe with solver_mode: CPU) | `lenet.set(train_algo="minibatch", test_algo="minibatch")` | Ensure that `batch_size` is set to appropriate value (for example: 64) |
+| Single-node single-GPU execution | `lenet.set(train_algo="minibatch", test_algo="minibatch").setGPU(True).setForceGPU(True)` | Ensure that `batch_size` is set to appropriate value (for example: 64) |
+| Single-node multi-GPU execution (similar to Caffe with solver_mode: GPU) | `lenet.set(train_algo="allreduce_parallel_batches", test_algo="minibatch", parallel_batches=num_gpu).setGPU(True).setForceGPU(True)` | Ensure that `batch_size` is set to appropriate value (for example: 64) |
+| Distributed prediction 

[systemml] 01/02: [SYSTEMML-540] Invoke update immediately after backward call at the script-level.

2019-03-24 Thread niketanpansare

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git

commit 4032594bc737d41efe06606e801dd61be27413ee
Author: Niketan Pansare 
AuthorDate: Sat Mar 23 21:51:45 2019 -0700

[SYSTEMML-540] Invoke update immediately after backward call at the 
script-level.

- This reduces the chance of unnecessary evictions especially when there 
are statement block cuts.
- The configuration property `perform_fused_backward_update` allows the 
user to toggle this behavior and control the script-generation process.
- Also, updated the release creation document to ensure that untracked 
files are not included in the artifacts.
- For ResNet-200, the eviction time was reduced from 173.488 seconds to 
60.048 seconds with a minibatch size of 96.
---
 release-creation-process.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/release-creation-process.md b/release-creation-process.md
index 3115390..bf05000 100644
--- a/release-creation-process.md
+++ b/release-creation-process.md
@@ -42,6 +42,9 @@ Step 1: Prepare the release.
 
# Extract latest code to a directory

+   
+   # Check if there are any untracked files (created by the unit tests) 
and remove them to avoid packing them in the artifacts
+   git status
 
# Go to dev/release directory
cd /dev/release
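
The untracked-file check added in the diff above can also be scripted. The sketch below is one possible approach, not part of the release process itself: it assumes the output of `git status --porcelain`, in which untracked paths are prefixed with `??`. The function name `untracked_files` and the sample paths are hypothetical.

```python
def untracked_files(porcelain_output):
    """Return the paths that `git status --porcelain` reports as untracked."""
    return [line[3:] for line in porcelain_output.splitlines()
            if line.startswith("?? ")]


# Hypothetical sample of `git status --porcelain` output.
sample = " M docs/release-creation-process.md\n?? scratch_space/tmp.txt\n?? result.txt\n"
leftovers = untracked_files(sample)
```

In a real checkout the input string could come from `subprocess.run(["git", "status", "--porcelain"], capture_output=True, text=True).stdout`; a non-empty result means files should be removed before the artifacts are packaged.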



[systemml] branch gh-pages updated (2eed56f -> 878f757)

2019-03-24 Thread niketanpansare

niketanpansare pushed a change to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git.


from 2eed56f  [MINOR][DOC] Updated Deep Learning documentation
 new 4032594  [SYSTEMML-540] Invoke update immediately after backward call 
at the script-level.
 new 878f757  [SYSTEMML-540] Added ternary aggregate operators for GPU 
backend

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 release-creation-process.md |  3 +++
 release-process.md  | 25 -
 2 files changed, 23 insertions(+), 5 deletions(-)



[systemml] 02/02: [SYSTEMML-540] Added ternary aggregate operators for GPU backend

2019-03-24 Thread niketanpansare

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git

commit 878f757f1825a7fe22370dff0e23114fc0434308
Author: Niketan Pansare 
AuthorDate: Sun Mar 24 09:06:55 2019 -0700

[SYSTEMML-540] Added ternary aggregate operators for GPU backend

- Also added steps to upload SystemML's python package to pypi.
---
 release-process.md | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/release-process.md b/release-process.md
index 2477cd0..c50a27e 100644
--- a/release-process.md
+++ b/release-process.md
@@ -388,7 +388,7 @@ file and remove all the `@Ignore` annotations from all the 
tests. Then run the N
 # Run other GPU Unit Tests 
 
rm result.txt
-   for t in AggregateUnaryOpTests  BinaryOpTests  
MatrixMatrixElementWiseOpTests  RightIndexingTests AppendTest  
MatrixMultiplicationOpTest ReorgOpTests ScalarMatrixElementwiseOpTests 
UnaryOpTests LstmTest LstmCPUTest
+   for t in AggregateUnaryOpTests AggregateTernaryTests  BinaryOpTests  
MatrixMatrixElementWiseOpTests  RightIndexingTests AppendTest  
MatrixMultiplicationOpTest ReorgOpTests ScalarMatrixElementwiseOpTests 
UnaryOpTests LstmTest LstmCPUTest
do
mvn -Dit.test="org.apache.sysml.test.gpu."$t verify -PgpuTests 
&> tmp.txt
SUCCESS=`grep "BUILD SUCCESS" tmp.txt`
@@ -503,8 +503,23 @@ The versioned project documentation is now deployed to the 
main website, and the
 
 ## Update Crawler configuration for the search indexing
 
-Create a PR or an issue to update the version number in the crawler 
configuration. 
-Please see the `start_urls` tag in the file 
[https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json](https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json).
-If the Algolia team provides us an updated `apiKey` or `indexName` 
credentials, then please update the corresponding entries in the file 
+- Create a PR or an issue to update the version number in the crawler 
configuration. Please see the `start_urls` tag in the file 
[https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json](https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json).
+- If the Algolia team provides us an updated `apiKey` or `indexName` 
credentials, then please update the corresponding entries in the file 
 
[https://github.com/apache/systemml/blob/master/docs/_layouts/global.html](https://github.com/apache/systemml/blob/master/docs/_layouts/global.html)
 
-(see for `Algolia search section` in the previously mentioned HTML file).
\ No newline at end of file
+(see for `Algolia search section` in the previously mentioned HTML file).
+
+## Upload Python package to PyPI
+
+Download the released `systemml-*-python.tar.gz` and 
`systemml-*-python.tar.gz`.
+
+   $ wget 
https://dist.apache.org/repos/dist/release/systemml/1.0.0/systemml-1.0.0-python.tar.gz
+   $ wget 
https://dist.apache.org/repos/dist/release/systemml/1.0.0/systemml-1.0.0-python.tar.gz.asc
+   
+Rename the files to remove `-python` suffix.
+
+   $ mv systemml-1.0.0-python.tar.gz systemml-1.0.0.tar.gz
+   $ mv systemml-1.0.0-python.tar.gz.asc systemml-1.0.0.tar.gz.asc
+
+Upload the Python package to PyPI using 
[twine](https://pypi.org/project/twine/).
+
+   $ twine upload -u systemml systemml-1.0.0.tar.gz 
systemml-1.0.0.tar.gz.asc 
\ No newline at end of file
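
The rename step in the PyPI section of the diff above can be expressed as a small helper. This is only an illustration written in Python rather than shell; `pypi_name` is a hypothetical name, and the version `1.0.0` is the one used in the documented commands.

```python
def pypi_name(artifact):
    """Drop the `-python` marker that precedes `.tar.gz` in an artifact name."""
    marker = "-python.tar.gz"
    if marker not in artifact:
        raise ValueError("not a Python release artifact: " + artifact)
    # Replace only the first occurrence so a trailing `.asc` is preserved.
    return artifact.replace(marker, ".tar.gz", 1)


renamed = [pypi_name(name) for name in
           ("systemml-1.0.0-python.tar.gz", "systemml-1.0.0-python.tar.gz.asc")]
```

The same function handles both the archive and its `.asc` signature, so the two files stay paired when they are passed to `twine upload`.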



[systemml] branch master updated: [SYSTEMML-540] Added ternary aggregate operators for GPU backend

2019-03-24 Thread niketanpansare

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 7fba4b2  [SYSTEMML-540] Added ternary aggregate operators for GPU 
backend
7fba4b2 is described below

commit 7fba4b29d653747a9ed038d282954a44fea3031c
Author: Niketan Pansare 
AuthorDate: Sun Mar 24 09:06:55 2019 -0700

[SYSTEMML-540] Added ternary aggregate operators for GPU backend

- Also added steps to upload SystemML's python package to pypi.
---
 docs/release-process.md|  25 +++-
 .../java/org/apache/sysml/hops/AggUnaryOp.java |  11 +-
 .../runtime/instructions/GPUInstructionParser.java |   7 ++
 .../gpu/AggregateTernaryGPUInstruction.java| 130 +
 .../runtime/instructions/gpu/GPUInstruction.java   |   1 +
 .../sysml/runtime/matrix/data/LibMatrixCUDA.java   |  13 ++-
 .../sysml/test/gpu/AggregateTernaryTests.java  |  57 +
 .../sysml/test/gpu/AggregateUnaryOpTests.java  |   1 +
 .../apache/sysml/test/gpu/UnaryOpTestsBase.java|  18 +++
 9 files changed, 250 insertions(+), 13 deletions(-)

diff --git a/docs/release-process.md b/docs/release-process.md
index 2477cd0..c50a27e 100644
--- a/docs/release-process.md
+++ b/docs/release-process.md
@@ -388,7 +388,7 @@ file and remove all the `@Ignore` annotations from all the 
tests. Then run the N
 # Run other GPU Unit Tests 
 
rm result.txt
-   for t in AggregateUnaryOpTests  BinaryOpTests  
MatrixMatrixElementWiseOpTests  RightIndexingTests AppendTest  
MatrixMultiplicationOpTest ReorgOpTests ScalarMatrixElementwiseOpTests 
UnaryOpTests LstmTest LstmCPUTest
+   for t in AggregateUnaryOpTests AggregateTernaryTests  BinaryOpTests  
MatrixMatrixElementWiseOpTests  RightIndexingTests AppendTest  
MatrixMultiplicationOpTest ReorgOpTests ScalarMatrixElementwiseOpTests 
UnaryOpTests LstmTest LstmCPUTest
do
mvn -Dit.test="org.apache.sysml.test.gpu."$t verify -PgpuTests 
&> tmp.txt
SUCCESS=`grep "BUILD SUCCESS" tmp.txt`
@@ -503,8 +503,23 @@ The versioned project documentation is now deployed to the 
main website, and the
 
 ## Update Crawler configuration for the search indexing
 
-Create a PR or an issue to update the version number in the crawler 
configuration. 
-Please see the `start_urls` tag in the file 
[https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json](https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json).
-If the Algolia team provides us an updated `apiKey` or `indexName` 
credentials, then please update the corresponding entries in the file 
+- Create a PR or an issue to update the version number in the crawler 
configuration. Please see the `start_urls` tag in the file 
[https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json](https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json).
+- If the Algolia team provides us an updated `apiKey` or `indexName` 
credentials, then please update the corresponding entries in the file 
 
[https://github.com/apache/systemml/blob/master/docs/_layouts/global.html](https://github.com/apache/systemml/blob/master/docs/_layouts/global.html)
 
-(see for `Algolia search section` in the previously mentioned HTML file).
\ No newline at end of file
+(see for `Algolia search section` in the previously mentioned HTML file).
+
+## Upload Python package to PyPI
+
+Download the released `systemml-*-python.tar.gz` and 
`systemml-*-python.tar.gz`.
+
+   $ wget 
https://dist.apache.org/repos/dist/release/systemml/1.0.0/systemml-1.0.0-python.tar.gz
+   $ wget 
https://dist.apache.org/repos/dist/release/systemml/1.0.0/systemml-1.0.0-python.tar.gz.asc
+   
+Rename the files to remove `-python` suffix.
+
+   $ mv systemml-1.0.0-python.tar.gz systemml-1.0.0.tar.gz
+   $ mv systemml-1.0.0-python.tar.gz.asc systemml-1.0.0.tar.gz.asc
+
+Upload the Python package to PyPI using 
[twine](https://pypi.org/project/twine/).
+
+   $ twine upload -u systemml systemml-1.0.0.tar.gz 
systemml-1.0.0.tar.gz.asc 
\ No newline at end of file
diff --git a/src/main/java/org/apache/sysml/hops/AggUnaryOp.java 
b/src/main/java/org/apache/sysml/hops/AggUnaryOp.java
index 48d18b7..92ec22c 100644
--- a/src/main/java/org/apache/sysml/hops/AggUnaryOp.java
+++ b/src/main/java/org/apache/sysml/hops/AggUnaryOp.java
@@ -93,9 +93,12 @@ public class AggUnaryOp extends MultiThreadedHop
return false;

try {
-   if( isTernaryAggregateRewriteApplicable() || 
isUnaryAggregateOuterCPRewriteApplicable() ) {
+   if(isUnaryAggregateOuterCPRewriteApplicable()) {
 

[systemml] branch master updated: [SYSTEMML-540] Invoke update immediately after backward call at the script-level.

2019-03-23 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new b49ac76  [SYSTEMML-540] Invoke update immediately after backward call 
at the script-level.
b49ac76 is described below

commit b49ac760c180baa0582c2168c4b58fb3c0108bc4
Author: Niketan Pansare 
AuthorDate: Sat Mar 23 21:51:45 2019 -0700

[SYSTEMML-540] Invoke update immediately after backward call at the 
script-level.

- This reduces the chance of unnecessary evictions especially when there 
are statement block cuts.
- The configuration property `perform_fused_backward_update` allows the 
user to toggle this behavior and control the script-generation process.
- Also, updated the release creation document to ensure that untracked 
files are not included in the artifacts.
- For ResNet-200, the eviction time was reduced from 173.488 seconds to 
60.048 seconds with a minibatch size of 96.
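
How the toggle is forwarded from Python can be sketched as follows. The `setInput` call and the `$perform_fused_backward_update` name mirror the `estimators.py` change in this commit; `RecordingEstimator` and `set_flags` are hypothetical stand-ins so that the snippet runs without SystemML or Spark.

```python
class RecordingEstimator:
    """Hypothetical stand-in for the JVM-side estimator; records setInput calls."""

    def __init__(self):
        self.inputs = {}

    def setInput(self, name, value):
        self.inputs[name] = value


def set_flags(estimator, perform_fused_backward_update=None):
    # None leaves the default untouched (documented as True in the docstring).
    if perform_fused_backward_update is not None:
        estimator.setInput("$perform_fused_backward_update",
                           str(perform_fused_backward_update).upper())
    return estimator


est = set_flags(RecordingEstimator(), perform_fused_backward_update=False)
```

Passing `False` records the string `FALSE` for the script-level variable, while omitting the argument records nothing, so the generated script keeps its default behaviour.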
---
 docs/release-creation-process.md   |  3 +
 src/main/python/systemml/mllearn/estimators.py |  8 ++-
 .../scala/org/apache/sysml/api/dl/Caffe2DML.scala  | 76 ++
 .../scala/org/apache/sysml/api/dl/CaffeLayer.scala |  2 +
 4 files changed, 76 insertions(+), 13 deletions(-)

diff --git a/docs/release-creation-process.md b/docs/release-creation-process.md
index 3115390..bf05000 100644
--- a/docs/release-creation-process.md
+++ b/docs/release-creation-process.md
@@ -42,6 +42,9 @@ Step 1: Prepare the release.
 
# Extract latest code to a directory

+   
+   # Check if there are any untracked files (created by the unit tests) 
and remove them to avoid packing them in the artifacts
+   git status
 
# Go to dev/release directory
cd /dev/release
diff --git a/src/main/python/systemml/mllearn/estimators.py 
b/src/main/python/systemml/mllearn/estimators.py
index 8d1e164..456280b 100644
--- a/src/main/python/systemml/mllearn/estimators.py
+++ b/src/main/python/systemml/mllearn/estimators.py
@@ -922,7 +922,8 @@ class Caffe2DML(BaseSystemMLClassifier):
 self.estimator.setWeightsToIgnore(ignore_weights)
 
 def set(self, debug=None, train_algo=None, test_algo=None, 
parallel_batches=None,
-output_activations=None, perform_one_hot_encoding=None, 
parfor_parameters=None, inline_nn_library=None, use_builtin_lstm_fn=None):
+output_activations=None, perform_one_hot_encoding=None, 
parfor_parameters=None, inline_nn_library=None, use_builtin_lstm_fn=None,
+perform_fused_backward_update=None):
 """
 Set input to Caffe2DML
 
@@ -933,10 +934,11 @@ class Caffe2DML(BaseSystemMLClassifier):
 test_algo: can be minibatch, batch, allreduce_parallel_batches or 
allreduce (default: minibatch)
 parallel_batches: number of parallel batches
 output_activations: (developer flag) directory to output activations 
of each layer as csv while prediction. To be used only in batch mode (default: 
None)
-perform_one_hot_encoding: should perform one-hot encoding in DML using 
table function (default: False)
+perform_one_hot_encoding: should perform one-hot encoding in DML using 
table function (default: True)
 parfor_parameters: dictionary for parfor parameters when using 
allreduce-style algorithms (default: "")
 inline_nn_library: whether to inline the NN library when generating 
DML using Caffe2DML (default: False)
 use_builtin_lstm_fn: whether to use builtin lstm function for LSTM 
layer (default: True)
+perform_fused_backward_update: whether to perform update immediately 
after backward pass at the script level. Supported for minibatch and batch 
algorithms. (default: True)
 """
 if debug is not None:
 self.estimator.setInput("$debug", str(debug).upper())
@@ -950,6 +952,8 @@ class Caffe2DML(BaseSystemMLClassifier):
 self.estimator.setInput("$parallel_batches", str(parallel_batches))
 if use_builtin_lstm_fn is not None:
 self.estimator.setInput("$use_builtin_lstm_fn", 
str(use_builtin_lstm_fn).upper())
+if perform_fused_backward_update is not None:
+self.estimator.setInput("$perform_fused_backward_update", 
str(perform_fused_backward_update).upper())
 if output_activations is not None:
 self.estimator.setInput(
 "$output_activations",
diff --git a/src/main/scala/org/apache/sysml/api/dl/Caffe2DML.scala 
b/src/main/scala/org/apache/sysml/api/dl/Caffe2DML.scala
index e480dfc..c5a20db 100644
--- a/src/main/scala/org/apache/sysml/api/dl/Caffe2DML.scala
+++ b/src/main/scala/org/apache/sysml/api/dl/Caffe2DML.scala
@@ -304,14 +304,27 @@ class Caffe2DML(val sc: Sp

[systemml] branch gh-pages updated: [MINOR][DOC] Updated Deep Learning documentation

2019-03-22 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/gh-pages by this push:
 new 2eed56f  [MINOR][DOC] Updated Deep Learning documentation
2eed56f is described below

commit 2eed56fd88bbeb633dfa6d452d55110594ff310d
Author: Niketan Pansare 
AuthorDate: Fri Mar 22 19:47:00 2019 -0700

[MINOR][DOC] Updated Deep Learning documentation

- Also, fixed javadoc errors.
---
 deep-learning.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/deep-learning.md b/deep-learning.md
index 2dbb4bb..968c959 100644
--- a/deep-learning.md
+++ b/deep-learning.md
@@ -207,6 +207,7 @@ keras_model.add(Flatten())
 keras_model.add(Dense(512, activation='relu'))
 keras_model.add(Dropout(0.5))
 keras_model.add(Dense(10, activation='softmax'))
+keras_model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.01, 
decay=1e-6, momentum=0.9, nesterov=True))
 keras_model.summary()
 
 # Scale the input features



[systemml] branch master updated: [MINOR][DOC] Updated Deep Learning documentation

2019-03-22 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 392f3d2  [MINOR][DOC] Updated Deep Learning documentation
392f3d2 is described below

commit 392f3d2c8a9d7fd9f1c05454636536d5b4d9e155
Author: Niketan Pansare 
AuthorDate: Fri Mar 22 19:47:00 2019 -0700

[MINOR][DOC] Updated Deep Learning documentation

- Also, fixed javadoc errors.
---
 docs/deep-learning.md| 1 +
 src/main/java/org/apache/sysml/api/ScriptExecutorUtils.java  | 1 +
 .../sysml/runtime/instructions/gpu/context/GPUMemoryManager.java | 5 +++--
 src/main/python/systemml/mllearn/keras2caffe.py  | 2 +-
 4 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/docs/deep-learning.md b/docs/deep-learning.md
index 2dbb4bb..968c959 100644
--- a/docs/deep-learning.md
+++ b/docs/deep-learning.md
@@ -207,6 +207,7 @@ keras_model.add(Flatten())
 keras_model.add(Dense(512, activation='relu'))
 keras_model.add(Dropout(0.5))
 keras_model.add(Dense(10, activation='softmax'))
+keras_model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.01, 
decay=1e-6, momentum=0.9, nesterov=True))
 keras_model.summary()
 
 # Scale the input features
diff --git a/src/main/java/org/apache/sysml/api/ScriptExecutorUtils.java 
b/src/main/java/org/apache/sysml/api/ScriptExecutorUtils.java
index c9d1a5d..5e59204 100644
--- a/src/main/java/org/apache/sysml/api/ScriptExecutorUtils.java
+++ b/src/main/java/org/apache/sysml/api/ScriptExecutorUtils.java
@@ -104,6 +104,7 @@ public class ScriptExecutorUtils {
 * @param api API used to execute the runtime program
 * @param performHOPRewrites should perform hop rewrites
 * @param maintainSymbolTable whether or not all values should be 
maintained in the symbol table after execution.
+* @param init whether to initialize hadoop execution
 * @return compiled runtime program
 */
public static Program compileRuntimeProgram(String script, 
Map nsscripts, Map args, String[] allArgs,
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMemoryManager.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMemoryManager.java
index ce22a7e..cf579ec 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMemoryManager.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMemoryManager.java
@@ -517,14 +517,15 @@ public class GPUMemoryManager {
}

/**
-* Clears up the memory used by non-dirty pointers.
+* Clears up the memory used by non-dirty pointers except output and 
locked matrix objects.
+* 
+* @param outputMatrixObjects list of output matrix objects
 */
public void clearTemporaryMemory(HashSet 
outputMatrixObjects) {
Set donotClearPointers =  new HashSet<>();
// First clean up all GPU objects except:
// 1. Output matrix objects
// 2. GPU objects that are currently being used (i.e. locked)
-   // 3. Matrix object are 
Set allGPUObjects = new 
HashSet<>(matrixMemoryManager.getGpuObjects());
for (GPUObject gpuObj : allGPUObjects) {
boolean isOutput = 
outputMatrixObjects.contains(gpuObj.mat);
diff --git a/src/main/python/systemml/mllearn/keras2caffe.py 
b/src/main/python/systemml/mllearn/keras2caffe.py
index 39a9755..19cde10 100755
--- a/src/main/python/systemml/mllearn/keras2caffe.py
+++ b/src/main/python/systemml/mllearn/keras2caffe.py
@@ -296,7 +296,7 @@ def getDropoutParam(layer):
 if not supported:
 raise Exception('noise_shape=' + str(layer.noise_shape) + ' is not 
supported for Dropout layer with input_shape='
 + str(layer.input_shape))
-return {'dropout_ratio': l.rate}
+return {'dropout_ratio': layer.rate}
 
 layerParamMapping = {
 keras.layers.InputLayer: lambda l:



[systemml] branch gh-pages updated: [SYSTEMML-540] Throw exception whenever parameter of a Keras layer is not supported by SystemML

2019-03-22 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/gh-pages by this push:
 new c81626b  [SYSTEMML-540] Throw exception whenever parameter of a Keras 
layer is not supported by SystemML
c81626b is described below

commit c81626b4476c7859c94aa8750df45de4f4382573
Author: Niketan Pansare 
AuthorDate: Fri Mar 22 19:21:40 2019 -0700

[SYSTEMML-540] Throw exception whenever parameter of a Keras layer is not 
supported by SystemML
---
 reference-guide-caffe2dml.md | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/reference-guide-caffe2dml.md b/reference-guide-caffe2dml.md
index 381b96d..6242e03 100644
--- a/reference-guide-caffe2dml.md
+++ b/reference-guide-caffe2dml.md
@@ -450,6 +450,21 @@ layer {
 
 ## Utility Layers
 
+### Flatten Layer
+
+The Flatten layer is a utility layer that flattens an input of shape n * c * h 
* w to a simple vector output of shape n * (c*h*w).
+
+
+**Sample Usage:**
+```
+layer {
+name: "flatten_1"
+type: "Flatten"
+bottom: "max_pooling2d_2"
+top: "flatten_1"
+}
+```
+
 ### Eltwise Layer
 
 Element-wise operations such as product or sum between two blobs.
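
The shape change performed by the Flatten layer documented in the diff above, from n * c * h * w to n * (c*h*w), can be illustrated in plain Python. This is a sketch only: the nested-list representation, the function name `flatten` and the row-major element order are assumptions made for the example.

```python
def flatten(batch):
    """Flatten a nested list of shape n x c x h x w into n rows of length c*h*w."""
    return [[v for channel in sample for row in channel for v in row]
            for sample in batch]


# A batch with n=2 samples, c=3 channels, h=2 rows and w=2 columns.
batch = [[[[s * 100 + ch * 10 + r * 2 + col for col in range(2)]
           for r in range(2)]
          for ch in range(3)]
         for s in range(2)]
out = flatten(batch)
```

Each sample keeps its own row, and the 3 * 2 * 2 = 12 values of that sample are laid out channel by channel.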



[systemml] branch master updated: [SYSTEMML-540] Throw exception whenever parameter of a Keras layer is not supported by SystemML

2019-03-22 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 7cab282  [SYSTEMML-540] Throw exception whenever parameter of a Keras 
layer is not supported by SystemML
7cab282 is described below

commit 7cab282faa77b3bc66200396803f97ec1375544a
Author: Niketan Pansare 
AuthorDate: Fri Mar 22 19:21:40 2019 -0700

[SYSTEMML-540] Throw exception whenever parameter of a Keras layer is not 
supported by SystemML
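
A minimal sketch of the validation pattern follows. `get2Tuple` and the `use_bias` check follow the `keras2caffe.py` diff in this commit; `FakeConv2D` is a hypothetical stand-in for a Keras layer, and `conv_kernel_and_stride` is an illustrative name that keeps only the kernel and stride handling.

```python
from collections import namedtuple

# Hypothetical stand-in for a keras.layers.Conv2D instance.
FakeConv2D = namedtuple("FakeConv2D",
                        ["filters", "kernel_size", "strides", "use_bias"])


def get2Tuple(val):
    # An int applies to both height and width; otherwise take the given pair.
    return [val, val] if isinstance(val, int) else [val[0], val[1]]


def conv_kernel_and_stride(layer):
    # Fail fast instead of silently dropping an unsupported setting.
    if not layer.use_bias:
        raise Exception("use_bias=False is not supported for the Conv2D layer.")
    stride = (1, 1) if layer.strides is None else get2Tuple(layer.strides)
    kernel = get2Tuple(layer.kernel_size)
    return {"num_output": layer.filters,
            "kernel_h": kernel[0], "kernel_w": kernel[1],
            "stride_h": stride[0], "stride_w": stride[1]}


params = conv_kernel_and_stride(FakeConv2D(32, 3, (2, 1), True))
```

Raising at conversion time means an unsupported Keras setting is reported before any Caffe configuration is produced, rather than yielding a model that quietly differs from the original.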
---
 docs/reference-guide-caffe2dml.md   | 15 ++
 src/main/python/systemml/mllearn/keras2caffe.py | 64 +
 2 files changed, 60 insertions(+), 19 deletions(-)

diff --git a/docs/reference-guide-caffe2dml.md 
b/docs/reference-guide-caffe2dml.md
index 381b96d..6242e03 100644
--- a/docs/reference-guide-caffe2dml.md
+++ b/docs/reference-guide-caffe2dml.md
@@ -450,6 +450,21 @@ layer {
 
 ## Utility Layers
 
+### Flatten Layer
+
+The Flatten layer is a utility layer that flattens an input of shape n * c * h 
* w to a simple vector output of shape n * (c*h*w).
+
+
+**Sample Usage:**
+```
+layer {
+name: "flatten_1"
+type: "Flatten"
+bottom: "max_pooling2d_2"
+top: "flatten_1"
+}
+```
+
 ### Eltwise Layer
 
 Element-wise operations such as product or sum between two blobs.
diff --git a/src/main/python/systemml/mllearn/keras2caffe.py 
b/src/main/python/systemml/mllearn/keras2caffe.py
index 2b97560..39a9755 100755
--- a/src/main/python/systemml/mllearn/keras2caffe.py
+++ b/src/main/python/systemml/mllearn/keras2caffe.py
@@ -192,6 +192,7 @@ def _parseKerasLayer(layer):
 
 
 def _parseBatchNorm(layer):
+# TODO: Ignoring axis
 bnName = layer.name + '_1'
 config = layer.get_config()
 bias_term = 'true' if config['center'] else 'false'
@@ -215,44 +216,51 @@ def getPadding(kernel_size, padding):
 else:
 raise ValueError('Unsupported padding:' + str(padding))
 
+# Used by padding to extract different types of possible padding:
+# int: the same symmetric padding is applied to height and width.
+# tuple of 2 ints: interpreted as two different symmetric padding values for height and width: (symmetric_height_pad, symmetric_width_pad)
+# tuple of 2 tuples of 2 ints: interpreted as  ((top_pad, bottom_pad), (left_pad, right_pad))
+def get2Tuple(val):
+return [val, val] if isinstance(val, int) else [val[0], val[1]]
+
 # Helper method to return Caffe's ConvolutionParameter in JSON-like data structure
 def getConvParam(layer):
-stride = (1, 1) if layer.strides is None else layer.strides
+# TODO: dilation_rate, kernel_constraint and bias_constraint are not supported
+stride = (1, 1) if layer.strides is None else get2Tuple(layer.strides)
+kernel_size = get2Tuple(layer.kernel_size)
 config = layer.get_config()
+if not layer.use_bias:
+raise Exception('use_bias=False is not supported for the Conv2D layer. Consider setting use_bias to true.')
 return {'num_output': layer.filters, 'bias_term': str(config['use_bias']).lower(
-), 'kernel_h': layer.kernel_size[0], 'kernel_w': layer.kernel_size[1], 'stride_h': stride[0], 'stride_w': stride[1],
-'pad_h': getPadding(layer.kernel_size[0], layer.padding), 'pad_w': getPadding(layer.kernel_size[1], layer.padding)}
+), 'kernel_h': kernel_size[0], 'kernel_w': kernel_size[1], 'stride_h': stride[0], 'stride_w': stride[1],
+'pad_h': getPadding(kernel_size[0], layer.padding), 'pad_w': getPadding(kernel_size[1], layer.padding)}
 
 
 # Helper method to return newly added UpsampleParameter
 # (search for UpsampleParameter in the file src/main/proto/caffe/caffe.proto) in JSON-like data structure
 def getUpSamplingParam(layer):
-return {'size_h': layer.size[0], 'size_w': layer.size[1]}
-
-# Used by padding to extract different types of possible padding:
-# int: the same symmetric padding is applied to height and width.
-# tuple of 2 ints: interpreted as two different symmetric padding values for height and width: (symmetric_height_pad, symmetric_width_pad)
-# tuple of 2 tuples of 2 ints: interpreted as  ((top_pad, bottom_pad), (left_pad, right_pad))
-def getPaddingTuple(padding):
-return [padding, padding] if isinstance(padding, int) else [padding[0], padding[1]]
+# TODO: Skipping interpolation type
+size = get2Tuple(layer.size)
+return {'size_h': size[0], 'size_w': size[1]}
 
 # Helper method to return newly added PaddingParameter
 # (search for UpsampleParameter in the file src/main/proto/caffe/caffe.proto) in JSON-like data structure
 def getPaddingParam(layer):
 if isinstance(layer.padding, int):
-padding = getPaddingTuple(layer.padding) + getPaddingTuple(layer.padding)
+padding = get2Tuple(layer.padding) + get2Tuple(layer.padding)
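
To make the behaviour of the `get2Tuple` helper introduced in this patch easier to follow, here is a small standalone sketch. The function body is copied from the diff; the sample inputs are made up for illustration and are not taken from the SystemML test suite.

```python
def get2Tuple(val):
    # Copied from the diff: an int is duplicated, anything else is indexed.
    return [val, val] if isinstance(val, int) else [val[0], val[1]]

# Made-up sample inputs covering the three documented forms.
print(get2Tuple(3))                  # int
print(get2Tuple((2, 1)))             # tuple of 2 ints
print(get2Tuple(((1, 2), (3, 4))))   # tuple of 2 tuples of 2 ints

# The int branch of getPaddingParam concatenates two such lists.
print(get2Tuple(1) + get2Tuple(1))
```

An int becomes a two-element list with the value repeated, and a 2-tuple is copied element by element, which is why the same helper can now serve strides, kernel sizes, upsampling sizes and padding.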
 

[systemml] branch master updated: [SYSTEMML-540] Fixed lstm_backward and python test bug

2019-03-21 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 78b79de  [SYSTEMML-540] Fixed lstm_backward and python test bug
78b79de is described below

commit 78b79de4e0a3966dfa45451ac7f3a7b8c7184806
Author: Niketan Pansare 
AuthorDate: Thu Mar 21 21:22:55 2019 -0700

[SYSTEMML-540] Fixed lstm_backward and python test bug

- Also updated the release documentation to specify the Keras and TensorFlow version
- Fixed Python3 indexing bug when lstm units is not an integer
---
 docs/release-process.md|  6 
 src/main/python/systemml/mllearn/keras2caffe.py|  2 +-
 src/main/python/tests/test_nn_numpy.py | 37 +++---
 .../scala/org/apache/sysml/api/dl/CaffeLayer.scala |  2 +-
 4 files changed, 33 insertions(+), 14 deletions(-)

diff --git a/docs/release-process.md b/docs/release-process.md
index 8ef4693..2477cd0 100644
--- a/docs/release-process.md
+++ b/docs/release-process.md
@@ -255,6 +255,12 @@ this OS X example.
 
 ## Python Tests
 
+
+Install Keras and Tensorflow:
+
+   python3 -m pip install --user keras=='2.1.5'
+   python3 -m pip install --user tensorflow=='1.11.0'
+
 Compile SystemML distribution:
 
mvn package -P distribution
diff --git a/src/main/python/systemml/mllearn/keras2caffe.py b/src/main/python/systemml/mllearn/keras2caffe.py
index ce341fd..892deb2 100755
--- a/src/main/python/systemml/mllearn/keras2caffe.py
+++ b/src/main/python/systemml/mllearn/keras2caffe.py
@@ -485,7 +485,7 @@ def getInputMatrices(layer):
 elif isinstance(layer, keras.layers.LSTM):
 weights = layer.get_weights()
 W, U, b =  weights[0], weights[1], weights[2]
-units = W.shape[1]/4
+units = int(W.shape[1]/4)
 if W.shape[1] != U.shape[1]:
 raise Exception('Number of hidden units of the kernel and the recurrent kernel doesnot match')
 # Note: For the LSTM layer, Keras weights are laid out in [i, f, c, o] format;
diff --git a/src/main/python/tests/test_nn_numpy.py b/src/main/python/tests/test_nn_numpy.py
index 80d3151..43e3303 100644
--- a/src/main/python/tests/test_nn_numpy.py
+++ b/src/main/python/tests/test_nn_numpy.py
@@ -28,6 +28,8 @@
 #   - Python 2: `PYSPARK_PYTHON=python2 spark-submit --master local[*] --driver-memory 10g  --driver-class-path ../../../../target/SystemML.jar,../../../../target/systemml-*-extra.jar test_nn_numpy.py`
 #   - Python 3: `PYSPARK_PYTHON=python3 spark-submit --master local[*] --driver-memory 10g --driver-class-path SystemML.jar,systemml-*-extra.jar test_nn_numpy.py`
 
+# Test with Keras 2.1.5 and Tensorflow 1.11.0
+
 # Make the `systemml` package importable
 import os
 os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'
@@ -81,7 +83,12 @@ def get_input_output_shape(layers):
 return tmp_keras_model.layers[0].input_shape, tmp_keras_model.layers[-1].output_shape
 
 def get_one_hot_encoded_labels(output_shape):
-output_cells = reduce(mul, list(output_shape[1:]), 1)
+try:
+output_cells = reduce(mul, list(output_shape[1:]), 1)
+except NameError:
+# As per https://www.artima.com/weblogs/viewpost.jsp?thread=98196, reduce was moved to functools in later versions
+from functools import reduce
+output_cells = reduce(mul, list(output_shape[1:]), 1)
 y = np.array(np.random.choice(output_cells, batch_size))
 y[0] = output_cells - 1
 one_hot_labels = np_utils.to_categorical(y, num_classes=output_cells)
@@ -97,7 +104,7 @@ def get_sysml_model(keras_model):
 # print('Script:' + str(sysml_model.get_training_script()))
 return sysml_model
 
-def base_test(layers, add_dense=False, test_backward=True, reshuffle_keras_output=False):
+def base_test(layers, add_dense=False, test_backward=True):
 layers = [layers] if not isinstance(layers, list) else layers
 in_shape, output_shape = get_input_output_shape(layers)
 # --
@@ -133,12 +140,6 @@ def base_test(layers, add_dense=False, test_backward=True, reshuffle_keras_outpu
 # --
 if len(output_shape) > 4:
 raise Exception('Unsupported output shape:' + str(output_shape))
-if len(output_shape) == 4 and reshuffle_keras_output:
-# This is not required as of Keras 2.1.5 and Tensorflow 1.11.0, but keeping it for backward compatibility.
-# Flatten doesnot respect channel_first, so reshuffle the dimensions:
-keras_preds = keras_preds.reshape((batch_size, output_shape[2], output_shape[3], output_shape[1]))
-keras_preds = np.swapaxes(keras_preds, 2, 3)  # (h,w,c) -> (h,c,w)
-keras_preds = np.swapaxes(keras_preds, 1, 2)  # (h,c,w) -> (c,h,w)
 # --
 return sy
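
The two Python 3 issues addressed in this patch (true division producing a float, and `reduce` no longer being a builtin) can be reproduced in isolation. The sketch below is an illustration under assumptions: the helper names `lstm_units` and `output_cells` and the sample shapes are invented here, and floor division is shown as an alternative to the `int(...)` cast used in the commit.

```python
from functools import reduce  # works on Python 2.6+ and Python 3
from operator import mul


def lstm_units(kernel_columns):
    # Keras stores the four LSTM gates side by side, so columns = 4 * units.
    # Floor division keeps the result an int on both Python 2 and Python 3.
    return kernel_columns // 4


def output_cells(output_shape):
    # Product of all dimensions except the leading batch dimension.
    return reduce(mul, list(output_shape[1:]), 1)


units = lstm_units(128)
weights = list(range(128))

# An int can be used as a slice bound; a float such as 128 / 4 cannot in Python 3.
first_gate = weights[:units]
print(units, len(first_gate), output_cells((None, 3, 4, 4)))
```

Importing `reduce` from `functools` unconditionally avoids the `try`/`except NameError` fallback, at the cost of dropping support for interpreters older than Python 2.6.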

[systemml] branch gh-pages updated: [MINOR][DOC] Updated the documentation

2019-03-21 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/gh-pages by this push:
 new 208a6fb  [MINOR][DOC] Updated the documentation
208a6fb is described below

commit 208a6fb5e851b8307f31b87c600399ccd3c8552f
Author: Niketan Pansare 
AuthorDate: Thu Mar 21 09:51:42 2019 -0700

[MINOR][DOC] Updated the documentation

- Removed unnecessary external hyperlinks
---
 index.md | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/index.md b/index.md
index e7f16f3..4ceaee6 100644
--- a/index.md
+++ b/index.md
@@ -42,26 +42,26 @@ This version of SystemML supports: Java 8+, Scala 2.11+, Python 2.7/3.5+, Hadoop
 
 * If you are new to SystemML, please refer to the [installation guide](http://systemml.apache.org/install-systemml.html) and try out our [sample notebooks](http://systemml.apache.org/get-started.html#sample-notebook)
 * If you want to invoke one of our [pre-implemented algorithms](algorithms-reference):
-  * Using Python, consider using 
-* the convenient [mllearn API](http://apache.github.io/systemml/python-reference.html#mllearn-api). The usage is describe in our [beginner's guide](http://apache.github.io/systemml/beginners-guide-python.html#invoke-systemmls-algorithms)
-* OR [Spark MLContext](spark-mlcontext-programming-guide) API
-  * Using Java/Scala, consider using 
+  * In Python, consider using 
+* the convenient [mllearn API](http://apache.github.io/systemml/python-reference.html#mllearn-api). The usage is described in our [beginner's guide](http://apache.github.io/systemml/beginners-guide-python.html#invoke-systemmls-algorithms)
+* Or [Spark MLContext](spark-mlcontext-programming-guide) API
+  * In Java/Scala, consider using 
 * [Spark MLContext](spark-mlcontext-programming-guide) API for large datasets
-* OR [JMLC](jmlc) API for in-memory scoring
+* Or [JMLC](jmlc) API for in-memory scoring
   * Via Command-line, follow the usage section in the [Algorithms Reference](algorithms-reference) 
 * If you want to implement a deep neural network, consider
-  * specifying your network in [Keras](https://keras.io/) format and invoking it with our [Keras2DML](beginners-guide-keras2dml) API
-  * OR specifying your network in [Caffe](http://caffe.berkeleyvision.org/) format and invoking it with our [Caffe2DML](beginners-guide-caffe2dml) API
-  * OR Using DML-bodied [NN library](https://github.com/apache/systemml/tree/master/scripts/nn). The usage is described in our [sample notebook](https://github.com/apache/systemml/blob/master/samples/jupyter-notebooks/Deep%20Learning%20Image%20Classification.ipynb)
-* Since training a deep neural network is often compute-bound, you may want to
-  * Enable [native BLAS](native-backend) in SystemML
-  * OR run it [using our GPU backend](gpu)  
+  * Specifying your network in [Keras](https://keras.io/) format and invoking it with [Keras2DML](beginners-guide-keras2dml) API
+  * Or specifying your network in [Caffe](http://caffe.berkeleyvision.org/) format and invoking it with [Caffe2DML](beginners-guide-caffe2dml) API
+  * Or using DML-bodied [NN library](https://github.com/apache/systemml/tree/master/scripts/nn). The usage is described in our [sample notebook](https://github.com/apache/systemml/blob/master/samples/jupyter-notebooks/Deep%20Learning%20Image%20Classification.ipynb)
+* Since training a deep neural network is often compute-bound, you may want to enable SystemML's
+  * [native BLAS](native-backend)
+  * Or [GPU backend](gpu)
 * If you want to implement a custom machine learning algorithm and you are familiar with:
-  * [R](https://www.r-project.org/about.html), consider implementing your algorithm in [DML](dml-language-reference) (recommended)
-  * [Python](https://www.python.org/), you can implement your algorithm in [PyDML](beginners-guide-to-dml-and-pydml) or using the [matrix class](http://apache.github.io/systemml/python-reference.html#matrix-class)
-* If you want to try out SystemML on single machine (for example, your laptop), consider
-  * using the above mentioned APIs with [Apache Spark](https://spark.apache.org/downloads.html) (recommended). Please refer to our [installation guide](http://systemml.apache.org/install-systemml.html).
-  * OR running it using java in [standalone mode](standalone-guide)
+  * R syntax, consider implementing your algorithm in [DML](dml-language-reference) (recommended)
+  * Python syntax, you can implement your algorithm in [PyDML](beginners-guide-to-dml-and-pydml) or using the [matrix class](http://apache.github.io/systemml/python-reference.html#matrix-class)
+* If you want to try out SystemML on your laptop, consider
+  * using the above mentioned APIs with Apache Spark (recommended). Please refer to our [installation

[systemml] branch master updated: [MINOR][DOC] Updated the documentation

2019-03-21 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new bc02283  [MINOR][DOC] Updated the documentation
bc02283 is described below

commit bc022839d489329c5e9bf1ca763c6596697110cb
Author: Niketan Pansare 
AuthorDate: Thu Mar 21 09:51:42 2019 -0700

[MINOR][DOC] Updated the documentation

- Removed unnecessary external hyperlinks
---
 docs/index.md | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/docs/index.md b/docs/index.md
index e7f16f3..4ceaee6 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -42,26 +42,26 @@ This version of SystemML supports: Java 8+, Scala 2.11+, Python 2.7/3.5+, Hadoop
 
 * If you are new to SystemML, please refer to the [installation guide](http://systemml.apache.org/install-systemml.html) and try out our [sample notebooks](http://systemml.apache.org/get-started.html#sample-notebook)
 * If you want to invoke one of our [pre-implemented algorithms](algorithms-reference):
-  * Using Python, consider using 
-* the convenient [mllearn API](http://apache.github.io/systemml/python-reference.html#mllearn-api). The usage is describe in our [beginner's guide](http://apache.github.io/systemml/beginners-guide-python.html#invoke-systemmls-algorithms)
-* OR [Spark MLContext](spark-mlcontext-programming-guide) API
-  * Using Java/Scala, consider using 
+  * In Python, consider using 
+* the convenient [mllearn API](http://apache.github.io/systemml/python-reference.html#mllearn-api). The usage is described in our [beginner's guide](http://apache.github.io/systemml/beginners-guide-python.html#invoke-systemmls-algorithms)
+* Or [Spark MLContext](spark-mlcontext-programming-guide) API
+  * In Java/Scala, consider using 
 * [Spark MLContext](spark-mlcontext-programming-guide) API for large datasets
-* OR [JMLC](jmlc) API for in-memory scoring
+* Or [JMLC](jmlc) API for in-memory scoring
   * Via Command-line, follow the usage section in the [Algorithms Reference](algorithms-reference) 
 * If you want to implement a deep neural network, consider
-  * specifying your network in [Keras](https://keras.io/) format and invoking it with our [Keras2DML](beginners-guide-keras2dml) API
-  * OR specifying your network in [Caffe](http://caffe.berkeleyvision.org/) format and invoking it with our [Caffe2DML](beginners-guide-caffe2dml) API
-  * OR Using DML-bodied [NN library](https://github.com/apache/systemml/tree/master/scripts/nn). The usage is described in our [sample notebook](https://github.com/apache/systemml/blob/master/samples/jupyter-notebooks/Deep%20Learning%20Image%20Classification.ipynb)
-* Since training a deep neural network is often compute-bound, you may want to
-  * Enable [native BLAS](native-backend) in SystemML
-  * OR run it [using our GPU backend](gpu)  
+  * Specifying your network in [Keras](https://keras.io/) format and invoking it with [Keras2DML](beginners-guide-keras2dml) API
+  * Or specifying your network in [Caffe](http://caffe.berkeleyvision.org/) format and invoking it with [Caffe2DML](beginners-guide-caffe2dml) API
+  * Or using DML-bodied [NN library](https://github.com/apache/systemml/tree/master/scripts/nn). The usage is described in our [sample notebook](https://github.com/apache/systemml/blob/master/samples/jupyter-notebooks/Deep%20Learning%20Image%20Classification.ipynb)
+* Since training a deep neural network is often compute-bound, you may want to enable SystemML's
+  * [native BLAS](native-backend)
+  * Or [GPU backend](gpu)
 * If you want to implement a custom machine learning algorithm and you are familiar with:
-  * [R](https://www.r-project.org/about.html), consider implementing your algorithm in [DML](dml-language-reference) (recommended)
-  * [Python](https://www.python.org/), you can implement your algorithm in [PyDML](beginners-guide-to-dml-and-pydml) or using the [matrix class](http://apache.github.io/systemml/python-reference.html#matrix-class)
-* If you want to try out SystemML on single machine (for example, your laptop), consider
-  * using the above mentioned APIs with [Apache Spark](https://spark.apache.org/downloads.html) (recommended). Please refer to our [installation guide](http://systemml.apache.org/install-systemml.html).
-  * OR running it using java in [standalone mode](standalone-guide)
+  * R syntax, consider implementing your algorithm in [DML](dml-language-reference) (recommended)
+  * Python syntax, you can implement your algorithm in [PyDML](beginners-guide-to-dml-and-pydml) or using the [matrix class](http://apache.github.io/systemml/python-reference.html#matrix-class)
+* If you want to try out SystemML on your laptop, consider
+  * using the above mentioned APIs with Apache Spark (recommended). Please refer

[systemml] branch gh-pages updated: [SYSTEMML-540] Bugfix for Python 3+ and updated the documentation

2019-03-21 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/gh-pages by this push:
 new b72f0f8  [SYSTEMML-540] Bugfix for Python 3+ and updated the documentation
b72f0f8 is described below

commit b72f0f883a4b99f69869cf5c4fca1f592d4b4d02
Author: Niketan Pansare 
AuthorDate: Thu Mar 21 09:29:07 2019 -0700

[SYSTEMML-540] Bugfix for Python 3+ and updated the documentation

- Added a quick tour of the documentation in the overview page.
- Updated GPU documentation to explain how to resolve common setup issues.
- Updated Keras2DML documentation to be compatible with the recently added features.
- Updated mllearn documentation to include Keras2DML.
---
 beginners-guide-keras2dml.md |  89 +---
 deep-learning.md |  26 +--
 gpu.md   | 162 ++-
 index.md |  25 +++
 native-backend.md|  10 +++
 python-reference.md  |   8 ++-
 6 files changed, 270 insertions(+), 50 deletions(-)

diff --git a/beginners-guide-keras2dml.md b/beginners-guide-keras2dml.md
index c99334e..60de360 100644
--- a/beginners-guide-keras2dml.md
+++ b/beginners-guide-keras2dml.md
@@ -45,23 +45,88 @@ Keras models are parsed based on their layer structure and corresponding weights
 configuration. Be aware that currently this is a translation into Caffe and there will be loss of information from keras models such as 
 intializer information, and other layers which do not exist in Caffe. 
 
+First, install SystemML and other dependencies for the below demo:
+
+```
+pip install systemml keras tensorflow mlxtend
+``` 
+
 To create a Keras2DML object, simply pass the keras object to the Keras2DML constructor. It's also important to note that your models
-should be compiled so that the loss can be accessed for Caffe2DML
+should be compiled so that the loss can be accessed for Caffe2DML.
 
-```python
-from systemml.mllearn import Keras2DML
-import keras
-from keras.applications.resnet50 import preprocess_input, decode_predictions, ResNet50
 
-keras_model = ResNet50(weights='imagenet',include_top=True,pooling='None',input_shape=(224,224,3))
-keras_model.compile(optimizer='sgd', loss= 'categorical_crossentropy')
 
-sysml_model = Keras2DML(spark, keras_model,input_shape=(3,224,224))
-sysml_model.summary()
+```python
+# pyspark --driver-memory 20g
+
+# Disable Tensorflow from using GPU to avoid unnecessary evictions by SystemML runtime
+import os
+os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'
+os.environ['CUDA_VISIBLE_DEVICES'] = ''
+
+# Import dependencies
+from mlxtend.data import mnist_data
+import numpy as np
+from sklearn.utils import shuffle
+from keras.models import Sequential
+from keras.layers import Input, Dense, Conv2D, MaxPooling2D, Dropout,Flatten
+from keras import backend as K
+from keras.models import Model
+from keras.optimizers import SGD
+
+# Set channel first layer
+K.set_image_data_format('channels_first')
+
+# Download the MNIST dataset
+X, y = mnist_data()
+X, y = shuffle(X, y)
+
+# Split the data into training and test
+n_samples = len(X)
+X_train = X[:int(.9 * n_samples)]
+y_train = y[:int(.9 * n_samples)]
+X_test = X[int(.9 * n_samples):]
+y_test = y[int(.9 * n_samples):]
+
+# Define Lenet in Keras
+keras_model = Sequential()
+keras_model.add(Conv2D(32, kernel_size=(5, 5), activation='relu', input_shape=(1,28,28), padding='same'))
+keras_model.add(MaxPooling2D(pool_size=(2, 2)))
+keras_model.add(Conv2D(64, (5, 5), activation='relu', padding='same'))
+keras_model.add(MaxPooling2D(pool_size=(2, 2)))
+keras_model.add(Flatten())
+keras_model.add(Dense(512, activation='relu'))
+keras_model.add(Dropout(0.5))
+keras_model.add(Dense(10, activation='softmax'))
+keras_model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True))
+keras_model.summary()
+
+# Scale the input features
+scale = 0.00390625
+X_train = X_train*scale
+X_test = X_test*scale
+
+# Train Lenet using SystemML
+from systemml.mllearn import Keras2DML
+sysml_model = Keras2DML(spark, keras_model, weights='weights_dir')
+# sysml_model.setConfigProperty("sysml.native.blas", "auto")
+# sysml_model.setGPU(True).setForceGPU(True)
+sysml_model.fit(X_train, y_train)
+sysml_model.score(X_test, y_test)
 ```
 
 # Frequently asked questions
 
+ How can I get the training and prediction DML script for the Keras model?
+
+The training and prediction DML scripts can be generated using `get_training_script()` and `get_prediction_script()` methods.
+
+```python
+from systemml.mllearn import Keras2DML
+sysml_model = Keras2DML(spark, keras_model, input_shape=(3,224,224))
+print(sysml_model.get_training_script())
+```
+
  What is the mapping between Keras' parameter
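
The data preparation steps in the Keras2DML demo above (a 90/10 train/test split followed by scaling with `0.00390625`, which equals 1/256) can be checked without Keras or SystemML. The sketch below uses plain Python lists and made-up pixel rows; `split_train_test` is a hypothetical helper introduced only for this illustration.

```python
def split_train_test(X, y, train_fraction=0.9):
    """Split parallel sequences at the same cut point, as the demo does by slicing."""
    cut = int(train_fraction * len(X))
    return X[:cut], y[:cut], X[cut:], y[cut:]


# Made-up stand-in for image data: 10 samples with 4 "pixels" each, values in 0..255.
X = [[0, 64, 128, 255] for _ in range(10)]
y = list(range(10))

X_train, y_train, X_test, y_test = split_train_test(X, y)

# 0.00390625 is exactly 1/256, so byte-valued pixels land in [0, 1).
scale = 0.00390625
X_train = [[pixel * scale for pixel in row] for row in X_train]
X_test = [[pixel * scale for pixel in row] for row in X_test]

print(len(X_train), len(X_test), X_train[0])
```

With NumPy arrays, as in the demo, the same scaling is a single broadcasted multiplication (`X_train * scale`).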

[systemml] branch master updated: [SYSTEMML-540] Bugfix for Python 3+ and updated the documentation

2019-03-21 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new ef8b10a  [SYSTEMML-540] Bugfix for Python 3+ and updated the documentation
ef8b10a is described below

commit ef8b10a964a4b620f43e627303b85616d9abb502
Author: Niketan Pansare 
AuthorDate: Thu Mar 21 09:29:07 2019 -0700

[SYSTEMML-540] Bugfix for Python 3+ and updated the documentation

- Added a quick tour of the documentation in the overview page.
- Updated GPU documentation to explain how to resolve common setup issues.
- Updated Keras2DML documentation to be compatible with the recently added features.
- Updated mllearn documentation to include Keras2DML.
---
 docs/beginners-guide-keras2dml.md   |  89 +++--
 docs/deep-learning.md   |  26 +++-
 docs/gpu.md | 162 +++-
 docs/index.md   |  25 
 docs/native-backend.md  |  10 ++
 docs/python-reference.md|   8 +-
 src/main/python/systemml/mllearn/keras2caffe.py |  37 +++---
 7 files changed, 288 insertions(+), 69 deletions(-)

diff --git a/docs/beginners-guide-keras2dml.md b/docs/beginners-guide-keras2dml.md
index c99334e..60de360 100644
--- a/docs/beginners-guide-keras2dml.md
+++ b/docs/beginners-guide-keras2dml.md
@@ -45,23 +45,88 @@ Keras models are parsed based on their layer structure and corresponding weights
 configuration. Be aware that currently this is a translation into Caffe and there will be loss of information from keras models such as 
 intializer information, and other layers which do not exist in Caffe. 
 
+First, install SystemML and other dependencies for the below demo:
+
+```
+pip install systemml keras tensorflow mlxtend
+``` 
+
 To create a Keras2DML object, simply pass the keras object to the Keras2DML constructor. It's also important to note that your models
-should be compiled so that the loss can be accessed for Caffe2DML
+should be compiled so that the loss can be accessed for Caffe2DML.
 
-```python
-from systemml.mllearn import Keras2DML
-import keras
-from keras.applications.resnet50 import preprocess_input, decode_predictions, ResNet50
 
-keras_model = ResNet50(weights='imagenet',include_top=True,pooling='None',input_shape=(224,224,3))
-keras_model.compile(optimizer='sgd', loss= 'categorical_crossentropy')
 
-sysml_model = Keras2DML(spark, keras_model,input_shape=(3,224,224))
-sysml_model.summary()
+```python
+# pyspark --driver-memory 20g
+
+# Disable Tensorflow from using GPU to avoid unnecessary evictions by SystemML runtime
+import os
+os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'
+os.environ['CUDA_VISIBLE_DEVICES'] = ''
+
+# Import dependencies
+from mlxtend.data import mnist_data
+import numpy as np
+from sklearn.utils import shuffle
+from keras.models import Sequential
+from keras.layers import Input, Dense, Conv2D, MaxPooling2D, Dropout,Flatten
+from keras import backend as K
+from keras.models import Model
+from keras.optimizers import SGD
+
+# Set channel first layer
+K.set_image_data_format('channels_first')
+
+# Download the MNIST dataset
+X, y = mnist_data()
+X, y = shuffle(X, y)
+
+# Split the data into training and test
+n_samples = len(X)
+X_train = X[:int(.9 * n_samples)]
+y_train = y[:int(.9 * n_samples)]
+X_test = X[int(.9 * n_samples):]
+y_test = y[int(.9 * n_samples):]
+
+# Define Lenet in Keras
+keras_model = Sequential()
+keras_model.add(Conv2D(32, kernel_size=(5, 5), activation='relu', input_shape=(1,28,28), padding='same'))
+keras_model.add(MaxPooling2D(pool_size=(2, 2)))
+keras_model.add(Conv2D(64, (5, 5), activation='relu', padding='same'))
+keras_model.add(MaxPooling2D(pool_size=(2, 2)))
+keras_model.add(Flatten())
+keras_model.add(Dense(512, activation='relu'))
+keras_model.add(Dropout(0.5))
+keras_model.add(Dense(10, activation='softmax'))
+keras_model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True))
+keras_model.summary()
+
+# Scale the input features
+scale = 0.00390625
+X_train = X_train*scale
+X_test = X_test*scale
+
+# Train Lenet using SystemML
+from systemml.mllearn import Keras2DML
+sysml_model = Keras2DML(spark, keras_model, weights='weights_dir')
+# sysml_model.setConfigProperty("sysml.native.blas", "auto")
+# sysml_model.setGPU(True).setForceGPU(True)
+sysml_model.fit(X_train, y_train)
+sysml_model.score(X_test, y_test)
 ```
 
 # Frequently asked questions
 
+ How can I get the training and prediction DML script for the Keras model?
+
+The training and prediction DML scripts can be generated using `get_training_script()` and `get_prediction_script()` methods.
+
+```python
+from systemml.mllearn import Keras2DML
+sysml_model = Keras2DM

[systemml] branch master updated: [SYSTEMML-540] Integrate the lstm builtin function in Keras2DML

2019-03-20 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new fbd3aab  [SYSTEMML-540] Integrate the lstm builtin function in Keras2DML
fbd3aab is described below

commit fbd3aabbda8027e34744ad97a81f1376cf5f2041
Author: Niketan Pansare 
AuthorDate: Wed Mar 20 10:54:48 2019 -0700

[SYSTEMML-540] Integrate the lstm builtin function in Keras2DML

- Also, migrated the builtin function layer from staging to nn.
- Updated the GPU tests.
---
 scripts/nn/layers/conv2d.dml  |  2 ++
 scripts/nn/layers/lstm.dml|  2 ++
 .../nn/layers/{lstm_staging.dml => lstm_builtin.dml}  |  4 ++--
 scripts/nn/layers/max_pool2d.dml  |  2 ++
 src/main/python/systemml/mllearn/estimators.py|  5 -
 .../scala/org/apache/sysml/api/dl/Caffe2DML.scala |  4 
 .../scala/org/apache/sysml/api/dl/CaffeLayer.scala| 19 +++
 .../java/org/apache/sysml/test/gpu/LstmCPUTest.java   |  2 +-
 src/test/java/org/apache/sysml/test/gpu/LstmTest.java |  2 +-
 9 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/scripts/nn/layers/conv2d.dml b/scripts/nn/layers/conv2d.dml
index 49d887b..de40668 100644
--- a/scripts/nn/layers/conv2d.dml
+++ b/scripts/nn/layers/conv2d.dml
@@ -21,6 +21,8 @@
 
 /*
  * 2D Convolutional layer.
+ *
+ * Consider using conv2d_builtin.dml for better performance.
  */
 source("nn/util.dml") as util
 
diff --git a/scripts/nn/layers/lstm.dml b/scripts/nn/layers/lstm.dml
index cd1557d..838cc44 100644
--- a/scripts/nn/layers/lstm.dml
+++ b/scripts/nn/layers/lstm.dml
@@ -21,6 +21,8 @@
 
 /*
  * LSTM layer.
+ *
+ * Consider using lstm_builtin.dml for better performance.
  */
 source("nn/layers/sigmoid.dml") as sigmoid
 source("nn/layers/tanh.dml") as tanh
diff --git a/scripts/nn/layers/lstm_staging.dml b/scripts/nn/layers/lstm_builtin.dml
similarity index 98%
rename from scripts/nn/layers/lstm_staging.dml
rename to scripts/nn/layers/lstm_builtin.dml
index f1934da..95661f8 100644
--- a/scripts/nn/layers/lstm_staging.dml
+++ b/scripts/nn/layers/lstm_builtin.dml
@@ -21,9 +21,9 @@
 
 /*
  * LSTM layer.
+ * 
+ * This implementation uses a built-in operator for higher performance.
  */
-source("nn/layers/sigmoid.dml") as sigmoid
-source("nn/layers/tanh.dml") as tanh
 
 forward = function(matrix[double] X, matrix[double] W, matrix[double] b, 
boolean return_sequences, matrix[double] out0, 
matrix[double] c0)
diff --git a/scripts/nn/layers/max_pool2d.dml b/scripts/nn/layers/max_pool2d.dml
index fba1a4c..ee57141 100644
--- a/scripts/nn/layers/max_pool2d.dml
+++ b/scripts/nn/layers/max_pool2d.dml
@@ -21,6 +21,8 @@
 
 /*
  * Max Pooling layer.
+ *
+ * Consider using max_pool2d_builtin.dml for better performance.
  */
 source("nn/util.dml") as util
 
diff --git a/src/main/python/systemml/mllearn/estimators.py b/src/main/python/systemml/mllearn/estimators.py
index 144cf66..d6aa8e8 100644
--- a/src/main/python/systemml/mllearn/estimators.py
+++ b/src/main/python/systemml/mllearn/estimators.py
@@ -924,7 +924,7 @@ class Caffe2DML(BaseSystemMLClassifier):
 self.estimator.setWeightsToIgnore(ignore_weights)
 
 def set(self, debug=None, train_algo=None, test_algo=None, parallel_batches=None,
-output_activations=None, perform_one_hot_encoding=None, parfor_parameters=None, inline_nn_library=None):
+output_activations=None, perform_one_hot_encoding=None, parfor_parameters=None, inline_nn_library=None, use_builtin_lstm_fn=None):
 """
 Set input to Caffe2DML
 
@@ -938,6 +938,7 @@ class Caffe2DML(BaseSystemMLClassifier):
 perform_one_hot_encoding: should perform one-hot encoding in DML using table function (default: False)
 parfor_parameters: dictionary for parfor parameters when using allreduce-style algorithms (default: "")
 inline_nn_library: whether to inline the NN library when generating DML using Caffe2DML (default: False)
+use_builtin_lstm_fn: whether to use builtin lstm function for LSTM layer (default: True)
 """
 if debug is not None:
 self.estimator.setInput("$debug", str(debug).upper())
@@ -949,6 +950,8 @@ class Caffe2DML(BaseSystemMLClassifier):
 self.estimator.setInput("$test_algo", str(test_algo).lower())
 if parallel_batches is not None:
 self.estimator.setInput("$parallel_batches", str(parallel_batches))
+if use_builtin_lstm_fn is not None:
+self.estimator.setInput("$use_builtin_lstm_fn", 
str(use_builtin_lstm_fn).upper())
 if output_activations is not None:
 self.es

[systemml] 05/05: [SYSTEMML-2520][DOC] Specified the steps required to update Crawler configuration for the search indexing in our release documentation.

2019-03-19 Thread niketanpansare

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git

commit ff681d850806ac0d1043345fea6a620365c59b15
Author: Niketan Pansare 
AuthorDate: Tue Mar 19 13:22:18 2019 -0700

[SYSTEMML-2520][DOC] Specified the steps required to update Crawler 
configuration for the search indexing in our release documentation.
---
 release-process.md | 8 
 1 file changed, 8 insertions(+)

diff --git a/release-process.md b/release-process.md
index dec6b15..8ef4693 100644
--- a/release-process.md
+++ b/release-process.md
@@ -494,3 +494,11 @@ Commit the update to `documentation.html` to publish the 
website update.
 
 The versioned project documentation is now deployed to the main website, and 
the
 [Documentation Page](http://systemml.apache.org/documentation) contains a link 
to the versioned documentation.
+
+## Update Crawler configuration for the search indexing
+
+Create a PR or an issue to update the version number in the crawler 
configuration. 
+Please see the `start_urls` tag in the file 
[https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json](https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json).
+If the Algolia team provides us an updated `apiKey` or `indexName` 
credentials, then please update the corresponding entries in the file 
+[https://github.com/apache/systemml/blob/master/docs/_layouts/global.html](https://github.com/apache/systemml/blob/master/docs/_layouts/global.html)
 
+(see for `Algolia search section` in the previously mentioned HTML file).
\ No newline at end of file



[systemml] 04/05: [SYSTEMML-2520] Add documentation search with Algolia service

2019-03-19 Thread niketanpansare

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git

commit 834ee0465777e4ac9288b81ff2cd4f0244ecfa37
Author: Janardhan 
AuthorDate: Tue Mar 19 13:08:10 2019 -0700

[SYSTEMML-2520] Add documentation search with Algolia service

Algolia is an API-based service that indexes the documentation every 24 hours.
- When a keyword is queried, the results are rendered in a dropdown.

Also fixes the navigation header dropdown on iPhone and in minimized windows
on normal screens.

Closes #855.
---
 _layouts/global.html | 31 ++-
 css/main.css | 30 +++---
 2 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/_layouts/global.html b/_layouts/global.html
index 4286c9c..734b2a0 100644
--- a/_layouts/global.html
+++ b/_layouts/global.html
@@ -15,10 +15,13 @@
 
 
 
+https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.css; 
/> 
 
 
 
 
 
@@ -93,6 +96,16 @@
 {% endif %}
 
 
+
+
 
 
 
@@ -254,5 +267,21 @@
 d.getElementsByTagName('head')[0].appendChild(script);
 }(document));
 
+
https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.js
+
+// Crawler configuration for the search indexing is available at:
+// 
https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json
+
+docsearch({ 
+apiKey: '78c19564c220d4642a41197baae304ef', 
+indexName: 'apache_systemml', 
+inputSelector: "#s-bar", 
+// For custom styling for the dropdown, please set debug to 
true
+// so that the dropdown won't disappear when the inspect tools 
are 
+// open.
+debug: false 
+});
+
 
 
diff --git a/css/main.css b/css/main.css
index 8a7426b..3dd758b 100644
--- a/css/main.css
+++ b/css/main.css
@@ -61,6 +61,7 @@ h1, h2, h3, h4, h5, h6 {
 pre {
   background-color: #FFF
 }
+
 /* Branding */
 .brand {
   font-weight: normal !important;
@@ -81,7 +82,7 @@ img.logo {
 /* Navigation Bar */
 .navbar {
   background-color: rgba(0, 0, 0, 0.9);
-  height: 68px;
+  /*height: 68px;*/
 }
 
 .navbar-brand {
@@ -96,12 +97,28 @@ img.logo {
   height: 100%;
 }
 
+.navbar-collapse {
+  /*height: 67px !important;*/
+  background: rgba(0,0,0,0);
+}
+
 .navbar-collapse.collapse {
-  height: 67px !important;
+  background: rgba(0, 0, 0, 0);
+  border-top: 0px;
+}
+
+.navbar-collapse.collapsing {
+  background: rgba(0, 0, 0, 0);
+  border-top: 0px;
+}
+
+.navbar-toggle {
+ border-radius: 1px;
 }
 
 .navbar-header {
-  padding-top: 10px;
+  padding-top: 0px;
+  padding-bottom: 10px;
 }
 
 .navbar .container {
@@ -159,6 +176,13 @@ img.logo {
 }
 
 /**
+ * Search bar
+ */
+input#s-bar {
+  margin-left: 10px;
+}
+
+/**
  * MathJax (embedded latex formulas)
  */
 .MathJax .mo { color: inherit }



[systemml] branch gh-pages updated (d38bf4e -> ff681d8)

2019-03-19 Thread niketanpansare

niketanpansare pushed a change to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git.


from d38bf4e  [SYSTEMML-445] Removed batch_norm builtin functions
 new aa7e0a9  [MINOR] Fixes bug causing stats output to be cleared in JMLC
 new 0599687  [SYSTEMML-2499] Built-in functions for binomial distribution
 new 2251f40  [SYSTEMML-540] Improve the performance of GPU lstm backward 
operator by passing the state
 new 834ee04  [SYSTEMML-2520] Add documentation search with Algolia service
 new ff681d8  [SYSTEMML-2520][DOC] Specified the steps required to update 
Crawler configuration for the search indexing in our release documentation.

The 5 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 _layouts/global.html  | 31 ++-
 css/main.css  | 30 +++---
 dml-language-reference.md | 34 ++
 jmlc.md   | 10 +-
 release-process.md| 33 +++--
 5 files changed, 103 insertions(+), 35 deletions(-)



[systemml] 01/05: [MINOR] Fixes bug causing stats output to be cleared in JMLC

2019-03-19 Thread niketanpansare

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git

commit aa7e0a9baa63622bf5a778ae81e8f65313a45f2f
Author: Anthony Thomas 
AuthorDate: Mon Nov 12 18:56:31 2018 +0530

[MINOR] Fixes bug causing stats output to be cleared in JMLC

Closes #843.
---
 jmlc.md | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/jmlc.md b/jmlc.md
index 08d1688..e0d72ea 100644
--- a/jmlc.md
+++ b/jmlc.md
@@ -53,7 +53,7 @@ dependent on the nature of the business use case being 
addressed.
 
 JMLC can be configured to gather runtime statistics, as in the MLContext API, 
by calling Connection's `setStatistics()`
 method with a value of `true`. JMLC can also be configured to gather 
statistics on the memory used by matrices and
-frames in the DML script. To enable collection of memory statistics, call 
Connection's `gatherMemStats()` method
+frames in the DML script. To enable collection of memory statistics, call 
PreparedScript's `gatherMemStats()` method
 with a value of `true`. When finegrained statistics are enabled in 
`SystemML.conf`, JMLC will also report the variables
 in the DML script which used the most memory. An example showing how to enable 
statistics in JMLC is presented in the
 section below.
@@ -122,10 +122,6 @@ the resulting `"predicted_y"` matrix. We repeat this 
process. When done, we clos
  
 // obtain connection to SystemML
 Connection conn = new Connection();
-
-// turn on gathering of runtime statistics and memory use
-conn.setStatistics(true);
-conn.gatherMemStats(true);
  
 // read in and precompile DML script, registering inputs and outputs
 String dml = conn.readScript("scoring-example.dml");
@@ -135,6 +131,10 @@ the resulting `"predicted_y"` matrix. We repeat this 
process. When done, we clos
 String plan = script.explain();
 System.out.println(plan);
 
+// turn on gathering of runtime statistics and memory use
+script.setStatistics(true);
+script.gatherMemStats(true);
+
 double[][] mtx = matrix(4, 3, new double[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 
});
 double[][] result = null;
  



[systemml] 03/05: [SYSTEMML-540] Improve the performance of GPU lstm backward operator by passing the state

2019-03-19 Thread niketanpansare

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git

commit 2251f4031e745635ba308af12851e2a5ffa7255d
Author: Niketan Pansare 
AuthorDate: Tue Mar 19 12:30:01 2019 -0700

[SYSTEMML-540] Improve the performance of GPU lstm backward operator by 
passing the state

- The lstm builtin function extended to return state: [out, c, state] = 
lstm(X, W, b, out0, c0, return_sequences)
- The lstm_backward builtin function extended to accept state: [dX, dW, db, 
dout0, dc0] = lstm_backward(X, W, b, out0, c0, given_sequences, dout, dc, state)
- Updated the DML documentation to reflect this change.
- Updated the release documentation.
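
The first two bullets follow a general pattern: the forward pass hands back intermediate values so that the backward pass does not have to recompute them. A minimal Python sketch of that pattern on a single `tanh` unit is shown here. It is an illustration only: the functions `forward` and `backward` are hypothetical, and the sketch makes no claim about what the `state` returned by the `lstm` builtin actually contains.

```python
import math


def forward(x, w, b):
    """Return the activation plus a state tuple reused by backward()."""
    out = math.tanh(w * x + b)
    state = (x, out)  # cached so backward() does not recompute tanh
    return out, state


def backward(dout, w, state):
    """Gradients of out = tanh(w*x + b) w.r.t. x, w and b, using the cached state."""
    x, out = state
    dz = dout * (1.0 - out * out)  # d tanh(z)/dz = 1 - tanh(z)^2
    return dz * w, dz * x, dz      # dx, dw, db


out, state = forward(0.5, 2.0, -1.0)
dx, dw, db = backward(1.0, 2.0, state)
```

Without the `state` argument, `backward` would need `b` as well and would have to evaluate `tanh` a second time; passing the state trades a little memory for that saved work.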

Closes #856.
---
 dml-language-reference.md | 21 +++--
 release-process.md| 25 +++--
 2 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/dml-language-reference.md b/dml-language-reference.md
index 6f1c854..f64b6ea 100644
--- a/dml-language-reference.md
+++ b/dml-language-reference.md
@@ -1521,16 +1521,17 @@ The images are assumed to be stored NCHW format, where 
N = batch size, C = #chan
 Hence, the images are internally represented as a matrix with dimension (N, C 
* H * W).
 
 
-| Function name   | Input matrices   | 
Dimension of first input matrix   | Dimension of second 
input matrix (if applicable)  | Dimension of (first) output matrix  
| Input Parameters  


| Notes   [...]
-|-|--|---|---|-|---|
 [...]
-| conv2d  | input, filter| 
[batch_size X num_channels* height_image* width_image]| [num_filters X 
num_channels* height_filter* width_filter] | [batch_size X num_channels_out* 
height_out* width_out]  | stride=[stride_h, 
stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, 
height_image, width_image], filter_shape=[num_filters, num_channels, 
height_filter, width_filter] | Performs 2D [...]
-| conv2d_backward_filter  | input, dout  | 
[batch_size X num_channels* height_image* width_image]| [batch_size X 
num_channels_out* height_out* width_out]| [num_filters X num_channels* 
height_filter* width_filter]   | 
stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, 
num_channels, height_image, width_image], filter_shape=[num_filters, 
num_channels, height_filter, width_filter] | Computes th [...]
-| conv2d_backward_data| filter, dout | 
[num_filters X num_channels* height_filter* width_filter] | [batch_size X 
num_channels_out* height_out* width_out]| [batch_size X num_channels* 
height_image* width_image]  | 
stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, 
num_channels, height_image, width_image], filter_shape=[num_filters, 
num_channels, height_filter, width_filter] | Computes th [...]
-| max_pool, avg_pool  | input| 
[batch_size X num_channels* height_image* width_image]| 
  | [batch_size X num_channels* height_out* 
width_out]  | stride=[stride_h, 
stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, 
height_image, width_image], pool_size=[height_pool, width_pool] 
  | Performs ma [...]
-| max_pool_backward, avg_pool_backward| input, dout  | 
[batch_size X num_channels* height_image* width_image]| [batch_size X 
num_channels* height_out* width_out]| [batch_size X num_channels* 
height_image* width_image]  | 
stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, 
num_channels, height_image, width_image], pool_size=[height_pool, width_pool]   
| Computes th [...]
-| bias_add| input, bias  | 
[batch_size X num_channels* height_image

[systemml] 02/05: [SYSTEMML-2499] Built-in functions for binomial distribution

2019-03-19 Thread niketanpansare

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git

commit 0599687e5c561736870e9ce8df20db7c05b84542
Author: Berthold Reinwald 
AuthorDate: Thu Nov 29 17:32:19 2018 -0800

[SYSTEMML-2499] Built-in functions for binomial distribution
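
For readers unfamiliar with the two quantities involved, here is a minimal Python sketch of a binomial CDF and its inverse, parameterised by `trials` and `p` as in the documentation change in this commit. The helper names `binomial_cdf` and `binomial_icdf` are hypothetical, and this is not SystemML's implementation; it only illustrates the values that `cdf(target=q, dist="binomial", ...)` and the matching `icdf` call are documented to return.

```python
from math import comb


def binomial_cdf(q, trials, p):
    """P[X <= q] for X ~ Binomial(trials, p)."""
    if q < 0:
        return 0.0
    upper = min(int(q), trials)
    return sum(comb(trials, k) * p**k * (1 - p) ** (trials - k)
               for k in range(upper + 1))


def binomial_icdf(prob, trials, p):
    """Smallest q such that P[X <= q] >= prob."""
    cumulative = 0.0
    for k in range(trials + 1):
        cumulative += comb(trials, k) * p**k * (1 - p) ** (trials - k)
        if cumulative >= prob:
            return k
    return trials
```

Both parameters are required here, mirroring the note in the diff that `trials` and `p` are mandatory for `dist="binomial"`.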
---
 dml-language-reference.md | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/dml-language-reference.md b/dml-language-reference.md
index cdcc529..6f1c854 100644
--- a/dml-language-reference.md
+++ b/dml-language-reference.md
@@ -691,7 +691,7 @@ moment() | Returns the kth central moment of values in a 
column matrix V, where
 colSums()  colMeans()  colVars()  colSds()  colMaxs() 
 colMins() | Column-wise computations -- for each column, compute the 
sum/mean/variance/stdDev/max/min of cell values | Input: matrix  Output: 
(1 x n) matrix | colSums(X)  colMeans(X)  colVars(X)  colSds(X) 
 colMaxs(X) colMins(X)
 cov() | Returns the covariance between two 1-dimensional column matrices X and 
Y. The function takes an optional weights parameter W. All column matrices X, 
Y, and W (when specified) must have the exact same dimension. | Input: (X 
(n x 1) matrix, Y (n x 1) matrix [, W (n x 1) matrix)]) 
 Output: scalar | cov(X,Y)  cov(X,Y,W)
 table() | Returns the contingency table of two vectors A and B. The resulting 
table F consists of max(A) rows and max(B) columns.  More precisely, 
F[i,j] = \\|{ k \\| A[k] = i and B[k] = j, 1 ≤ k ≤ n }\\|, where A and B are 
two n-dimensional vectors.  This function supports multiple other 
variants, which can be found below, at the end of this Table 7. | Input: 
((n x 1) matrix, (n x 1) matrix), [(n x 1) matrix]) 
 Output: matrix | F = table(A, [...]
-cdf() pnorm() pexp() pchisq() pf() pt() 
icdf() qnorm() qexp() qchisq() qf() qt() | 
p=cdf(target=q, ...) returns the cumulative probability P[X = q].  
q=icdf(target=p, ...) returns the inverse cumulative probability i.e., it 
returns q such that the given target p = P[X=q].  For more details, 
please see the section "Probability Distribution Functions" below Table 7. | 
Input: (target=scalar, dist="...", ...)  pnorm() pbinomial()pexp() pchisq() pf() 
pt() icdf() qnorm() qbinomial()qexp() qchisq() 
qf() qt() | p=cdf(target=q, ...) returns the cumulative probability P[X 
= q].  q=icdf(target=p, ...) returns the inverse cumulative 
probability i.e., it returns q such that the given target p = P[X=q].  
For more details, please see the section "Probability Distribution Functions" 
below Table 7. | Input: (target= [...]
 aggregate() | Splits/groups the values from X according to the corresponding 
values from G, and then applies the function fn on each group.  The result 
F is a column matrix, in which each row contains the value computed from a 
distinct group in G. More specifically, F[k,1] = fn( {X[i,1] \\| 1=i=n 
and G[i,1] = k} ), where n = nrow(X) = nrow(G).  Note that the distinct 
values in G are used as row indexes in the result matrix F. Therefore, nrow(F) 
= max(G). It is thus reco [...]
 interQuartileMean() | Returns the mean of all x in X such that 
xquantile(X, 0.25) and x=quantile(X, 0.75). X, W are column matrices 
(vectors) of the same size. W contains the weights for data in X. | Input: (X 
(n x 1) matrix [, W (n x 1) matrix)])  Output: 
scalar | interQuartileMean(X)  interQuartileMean(X, W)
 quantile () | The p-quantile for a random variable X is the value x such that 
Pr[Xx] = p and Pr[X= x] = p  let n=nrow(X), 
i=ceiling(p*n), quantile() will return X[i]. p is a scalar (0p1) that 
specifies the quantile to be computed. Optionally, a weight vector may be 
provided for X. | Input: (X (n x 1) matrix, [W (n x 1) 
matrix),] p scalar)  Output: scalar | quantile(X, p) 
 quantile(X, W, p)
@@ -749,6 +749,7 @@ This computes the cumulative probability at the given 
quantile i.e., P[X=q],
   * `dist`: name of the distribution specified as a string. Valid values are 
"normal" (for Normal or Gaussian distribution), "f" (for F distribution), "t" 
(for Student t-distribution), "chisq" (for Chi Squared distribution), and "exp" 
(for Exponential distribution). This is a mandatory argument.
   * `...`: parameters of the distribution
 * For `dist="normal"`, valid parameters are mean and sd that specify the 
mean and standard deviation of the normal distribution. The default values for 
mean and sd are 0.0 and 1.0, respectively.
+* For `dist="binomial"`, valid parameters are trials and p that specify 
the number of trials and probability of success. Both parameters are mandatory.
 * For `dist="f"`, valid parameters are df1 and df2 that specify two 
degrees of freedom. Both these parameters are mandatory.
 * For `dist="t"`, and dist="chisq", valid parameter is df that specifies 
the degrees 

[systemml] branch master updated: [SYSTEMML-2520][DOC] Specified the steps required to update Crawler configuration for the search indexing in our release documentation.

2019-03-19 Thread niketanpansare

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 45f72bf  [SYSTEMML-2520][DOC] Specified the steps required to update 
Crawler configuration for the search indexing in our release documentation.
45f72bf is described below

commit 45f72bf4e4aecff98b74382282948a850ed4846d
Author: Niketan Pansare 
AuthorDate: Tue Mar 19 13:22:18 2019 -0700

[SYSTEMML-2520][DOC] Specified the steps required to update Crawler 
configuration for the search indexing in our release documentation.
---
 docs/release-process.md | 8 
 1 file changed, 8 insertions(+)

diff --git a/docs/release-process.md b/docs/release-process.md
index dec6b15..8ef4693 100644
--- a/docs/release-process.md
+++ b/docs/release-process.md
@@ -494,3 +494,11 @@ Commit the update to `documentation.html` to publish the 
website update.
 
 The versioned project documentation is now deployed to the main website, and 
the
 [Documentation Page](http://systemml.apache.org/documentation) contains a link 
to the versioned documentation.
+
+## Update Crawler configuration for the search indexing
+
+Create a PR or an issue to update the version number in the crawler 
configuration. 
+Please see the `start_urls` tag in the file 
[https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json](https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json).
+If the Algolia team provides us an updated `apiKey` or `indexName` 
credentials, then please update the corresponding entries in the file 
+[https://github.com/apache/systemml/blob/master/docs/_layouts/global.html](https://github.com/apache/systemml/blob/master/docs/_layouts/global.html)
 
+(see for `Algolia search section` in the previously mentioned HTML file).
\ No newline at end of file



[systemml] branch master updated: [SYSTEMML-2520] Add documentation search with Algolia service

2019-03-19 Thread niketanpansare

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new ea82102  [SYSTEMML-2520] Add documentation search with Algolia service
ea82102 is described below

commit ea821028bfc5869d5874163885ec73bf4d14670a
Author: Janardhan 
AuthorDate: Tue Mar 19 13:08:10 2019 -0700

[SYSTEMML-2520] Add documentation search with Algolia service

Algolia is an API-based service that indexes the documentation every 24 hours.
- When a keyword is queried, the results are rendered in a dropdown.

Also fixes the navigation header dropdown on iPhone and in minimized windows
on normal screens.

Closes #855.
---
 docs/_layouts/global.html | 31 ++-
 docs/css/main.css | 30 +++---
 2 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index 4286c9c..734b2a0 100644
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -15,10 +15,13 @@
 
 
 
+https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.css; 
/> 
 
 
 
 
 
@@ -93,6 +96,16 @@
 {% endif %}
 
 
+
+
 
 
 
@@ -254,5 +267,21 @@
 d.getElementsByTagName('head')[0].appendChild(script);
 }(document));
 
+
https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.js
+
+// Crawler configuration for the search indexing is available at:
+// 
https://github.com/algolia/docsearch-configs/blob/master/configs/apache_systemml.json
+
+docsearch({ 
+apiKey: '78c19564c220d4642a41197baae304ef', 
+indexName: 'apache_systemml', 
+inputSelector: "#s-bar", 
+// For custom styling for the dropdown, please set debug to 
true
+// so that the dropdown won't disappear when the inspect tools 
are 
+// open.
+debug: false 
+});
+
 
 
diff --git a/docs/css/main.css b/docs/css/main.css
index 8a7426b..3dd758b 100644
--- a/docs/css/main.css
+++ b/docs/css/main.css
@@ -61,6 +61,7 @@ h1, h2, h3, h4, h5, h6 {
 pre {
   background-color: #FFF
 }
+
 /* Branding */
 .brand {
   font-weight: normal !important;
@@ -81,7 +82,7 @@ img.logo {
 /* Navigation Bar */
 .navbar {
   background-color: rgba(0, 0, 0, 0.9);
-  height: 68px;
+  /*height: 68px;*/
 }
 
 .navbar-brand {
@@ -96,12 +97,28 @@ img.logo {
   height: 100%;
 }
 
+.navbar-collapse {
+  /*height: 67px !important;*/
+  background: rgba(0,0,0,0);
+}
+
 .navbar-collapse.collapse {
-  height: 67px !important;
+  background: rgba(0, 0, 0, 0);
+  border-top: 0px;
+}
+
+.navbar-collapse.collapsing {
+  background: rgba(0, 0, 0, 0);
+  border-top: 0px;
+}
+
+.navbar-toggle {
+ border-radius: 1px;
 }
 
 .navbar-header {
-  padding-top: 10px;
+  padding-top: 0px;
+  padding-bottom: 10px;
 }
 
 .navbar .container {
@@ -159,6 +176,13 @@ img.logo {
 }
 
 /**
+ * Search bar
+ */
+input#s-bar {
+  margin-left: 10px;
+}
+
+/**
  * MathJax (embedded latex formulas)
  */
 .MathJax .mo { color: inherit }



[systemml] branch master updated: [SYSTEMML-540] Improve the performance of GPU lstm backward operator by passing the state

2019-03-19 Thread niketanpansare

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 91467c1  [SYSTEMML-540] Improve the performance of GPU lstm backward 
operator by passing the state
91467c1 is described below

commit 91467c164202f70c5a85ba7e0f7f9fcd16ddca1b
Author: Niketan Pansare 
AuthorDate: Tue Mar 19 12:30:01 2019 -0700

[SYSTEMML-540] Improve the performance of GPU lstm backward operator by 
passing the state

- The lstm builtin function extended to return state: [out, c, state] = 
lstm(X, W, b, out0, c0, return_sequences)
- The lstm_backward builtin function extended to accept state: [dX, dW, db, 
dout0, dc0] = lstm_backward(X, W, b, out0, c0, given_sequences, dout, dc, state)
- Updated the DML documentation to reflect this change.
- Updated the release documentation.

Closes #856.
---
 conf/SystemML-config.xml.template  |   3 +
 docs/dml-language-reference.md |  21 +-
 docs/release-process.md|  25 +-
 scripts/nn/layers/lstm_staging.dml |  12 +-
 src/main/java/org/apache/sysml/conf/DMLConfig.java |   4 +-
 .../sysml/parser/BuiltinFunctionExpression.java|  14 +-
 .../org/apache/sysml/parser/StatementBlock.java|  13 +-
 .../controlprogram/caching/CacheableData.java  |  10 +
 .../runtime/instructions/cp/DnnCPInstruction.java  |  72 +-
 .../instructions/gpu/DnnGPUInstruction.java| 278 -
 .../instructions/gpu/context/GPUObject.java|   2 +-
 .../sysml/runtime/matrix/data/LibMatrixCuDNN.java  | 163 ++--
 .../matrix/data/LibMatrixCuDNNRnnAlgorithm.java|  19 +-
 .../runtime/matrix/data/LibMatrixCuMatMult.java|   3 +
 .../org/apache/sysml/test/gpu/LstmCPUTest.java |   5 +-
 .../java/org/apache/sysml/test/gpu/LstmTest.java   |  10 +-
 16 files changed, 443 insertions(+), 211 deletions(-)

diff --git a/conf/SystemML-config.xml.template 
b/conf/SystemML-config.xml.template
index b9189b1..17cc2cc 100644
--- a/conf/SystemML-config.xml.template
+++ b/conf/SystemML-config.xml.template
@@ -118,4 +118,7 @@

false
+   
+   
+   true
 
\ No newline at end of file
diff --git a/docs/dml-language-reference.md b/docs/dml-language-reference.md
index 6f1c854..f64b6ea 100644
--- a/docs/dml-language-reference.md
+++ b/docs/dml-language-reference.md
@@ -1521,16 +1521,17 @@ The images are assumed to be stored NCHW format, where 
N = batch size, C = #chan
 Hence, the images are internally represented as a matrix with dimension (N, C 
* H * W).
 
 
-| Function name   | Input matrices   | 
Dimension of first input matrix   | Dimension of second 
input matrix (if applicable)  | Dimension of (first) output matrix  
| Input Parameters  


| Notes   [...]
-|-|--|---|---|-|---|
 [...]
-| conv2d  | input, filter| 
[batch_size X num_channels* height_image* width_image]| [num_filters X 
num_channels* height_filter* width_filter] | [batch_size X num_channels_out* 
height_out* width_out]  | stride=[stride_h, 
stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, 
height_image, width_image], filter_shape=[num_filters, num_channels, 
height_filter, width_filter] | Performs 2D [...]
-| conv2d_backward_filter  | input, dout  | 
[batch_size X num_channels* height_image* width_image]| [batch_size X 
num_channels_out* height_out* width_out]| [num_filters X num_channels* 
height_filter* width_filter]   | 
stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, 
num_channels, height_image, width_image], filter_shape=[num_filters, 
num_channels, height_filter, width_filter] | Computes th [...]
-| conv2d_backward_data| filter, dout | 
[num_filters X num_channels* height_filter* width_filter] | [batch_size X 
num_channels_out* height_out* width_out]| [batch_size X num_channels* 
height_image

[systemml] branch master updated: [SYSTEMML-540] Fixed the failing Python tests and refactored BaseSystemMLClassifier class

2019-03-19 Thread niketanpansare

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 69c6a1a  [SYSTEMML-540] Fixed the failing Python tests and refactored 
BaseSystemMLClassifier class
69c6a1a is described below

commit 69c6a1acb1481b1537bdc850654a4eb0a8efe20b
Author: Niketan Pansare 
AuthorDate: Tue Mar 19 11:40:28 2019 -0700

[SYSTEMML-540] Fixed the failing Python tests and refactored 
BaseSystemMLClassifier class

- Fixed failing mllearn numpy and df tests.
- Added a fix in converter util method that converts Spark DF to Pandas. 
This is required as of Spark 2.3+
- Also, updated the nn tests to match the results of the latest Keras/TF 
release, especially the Flatten layer.
- Added a warning message when the user attempts to write a metadata 
file with an empty name.
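
The `test_accuracy_score` helper that this commit adds to `test_mllearn_df.py` accepts a prediction in two cases: when it agrees closely enough with scikit-learn, or when it is strictly closer to the ground truth than scikit-learn is. A self-contained Python sketch of that rule follows. The `accuracy` function is a stand-in assumed here so the example runs without scikit-learn (the test itself calls `accuracy_score`), and the name `predictions_acceptable` is hypothetical.

```python
def accuracy(expected, predicted):
    """Fraction of positions where the two label sequences agree."""
    if len(expected) != len(predicted):
        raise ValueError("label sequences must have the same length")
    matches = sum(1 for e, p in zip(expected, predicted) if e == p)
    return matches / len(expected)


def predictions_acceptable(sklearn_predicted, mllearn_predicted, y_test, threshold):
    # Case 1: the two libraries agree closely enough.
    if accuracy(sklearn_predicted, mllearn_predicted) > threshold:
        return True
    # Case 2: they disagree, but mllearn is strictly closer to the ground truth.
    return accuracy(y_test, mllearn_predicted) > accuracy(y_test, sklearn_predicted)
```

The second case keeps a test from failing merely because the two libraries disagree on samples where scikit-learn is the one that is wrong.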
---
 .../apache/sysml/runtime/util/MapReduceTool.java   |   4 +
 src/main/python/systemml/mllearn/estimators.py |   2 +-
 src/main/python/tests/test_mllearn_df.py   |  15 ++-
 src/main/python/tests/test_mllearn_numpy.py|  44 --
 src/main/python/tests/test_nn_numpy.py |  27 ++--
 .../sysml/api/ml/BaseSystemMLClassifier.scala  | 148 ++---
 6 files changed, 134 insertions(+), 106 deletions(-)

diff --git a/src/main/java/org/apache/sysml/runtime/util/MapReduceTool.java 
b/src/main/java/org/apache/sysml/runtime/util/MapReduceTool.java
index cecd0e3..d1f1be5 100644
--- a/src/main/java/org/apache/sysml/runtime/util/MapReduceTool.java
+++ b/src/main/java/org/apache/sysml/runtime/util/MapReduceTool.java
@@ -422,6 +422,9 @@ public class MapReduceTool
throws IOException 
{
Path path = new Path(mtdfile);
+   if(path.getName().equals(" .mtd")) {
+   LOG.warn("Performing a write on a empty mtd path:" + 
mtdfile + ". This can lead to unexpected behavior.");
+   }
FileSystem fs = IOUtilFunctions.getFileSystem(path);
try( BufferedWriter br = new BufferedWriter(new 
OutputStreamWriter(fs.create(path,true))) ) {
String mtd = metaDataToString(vt, schema, dt, mc, 
outinfo, formatProperties);
@@ -429,6 +432,7 @@ public class MapReduceTool
} catch (Exception e) {
throw new IOException("Error creating and writing 
metadata JSON file", e);
}
+   
}
 
public static void writeScalarMetaDataFile(String mtdfile, ValueType 
vt) 
diff --git a/src/main/python/systemml/mllearn/estimators.py 
b/src/main/python/systemml/mllearn/estimators.py
index 2c3b6a2..144cf66 100644
--- a/src/main/python/systemml/mllearn/estimators.py
+++ b/src/main/python/systemml/mllearn/estimators.py
@@ -314,7 +314,7 @@ class BaseSystemMLEstimator(Estimator):
 output: a java-side object (either MatrixBlock or Java DataFrame)
 """
 if isinstance(X, SUPPORTED_TYPES) and self.transferUsingDF:
-retDF = DataFrame(output, self.sparkSession)
+retDF = DataFrame(output, self.sparkSession._wrapped)
 retPDF = retDF.sort('__INDEX').select('prediction').toPandas()
 return retPDF.as_matrix().flatten() if isinstance(X, np.ndarray) 
else retPDF
 elif isinstance(X, SUPPORTED_TYPES):
diff --git a/src/main/python/tests/test_mllearn_df.py 
b/src/main/python/tests/test_mllearn_df.py
index c2f8a3e..4f94589 100644
--- a/src/main/python/tests/test_mllearn_df.py
+++ b/src/main/python/tests/test_mllearn_df.py
@@ -24,6 +24,7 @@
 #   - Python 2: `PYSPARK_PYTHON=python2 spark-submit --master local[*] 
--driver-class-path SystemML.jar test_mllearn_df.py`
 #   - Python 3: `PYSPARK_PYTHON=python3 spark-submit --master local[*] 
--driver-class-path SystemML.jar test_mllearn_df.py`
 
+
 # Make the `systemml` package importable
 import os
 import sys
@@ -45,6 +46,16 @@ from systemml.mllearn import LinearRegression, 
LogisticRegression, NaiveBayes, S
 
 sparkSession = SparkSession.builder.getOrCreate()
 
+def test_accuracy_score(sklearn_predicted, mllearn_predicted, y_test, 
threshold):
+if accuracy_score(sklearn_predicted, mllearn_predicted) > threshold:
+# Our results match that of scikit-learn. No need to measure with the 
ground truth
+return True
+elif accuracy_score(y_test, mllearn_predicted) > accuracy_score(y_test, 
sklearn_predicted):
+# We perform better than scikit-learn, ignore the threshold
+return True
+else:
+return False
+
 # Currently not integrated with JUnit test
 # ~/spark-1.6.1-scala-2.11/bin/spark-submit --master local[*] 
--driver-class-path SystemML.jar test.py
 class TestMLLearn(unittest.TestCase):
@@ -64,7 +75,7 @@ class TestMLLearn(unittest.Te

[systemml] branch master updated: [MINOR] Provide a more informative error message when the dimensions don't match during the validate phase

2019-03-13 Thread niketanpansare

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 881f606  [MINOR] Provide a more informative error message when the 
dimensions don't match during the validate phase
881f606 is described below

commit 881f606a89a5683e1a41a1c974fc0188d8600ade
Author: Niketan Pansare 
AuthorDate: Wed Mar 13 14:47:11 2019 -0700

[MINOR] Provide a more informative error message when the dimensions don't 
match during the validate phase
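
For illustration, the new message format can be sketched in Python. This is only a 
mirror of the string concatenation in the Java diff below; the function name, the 
tuple arguments and the "max" op code in the usage note are assumptions made for 
this sketch, not part of SystemML.

```python
def dimension_mismatch_message(dims1, dims2, op_code, allows_mv):
    """Build the validate-phase message the way the Java patch concatenates it.

    dims1/dims2 are (rows, cols) pairs; all names here are hypothetical.
    """
    # "([r1, c1] and [r2, c2])" fragment, matching str1 in the diff
    str1 = "([%d, %d] and [%d, %d])" % (dims1[0], dims1[1], dims2[0], dims2[1])
    # Optional note, matching str2 in the diff
    str2 = "" if allows_mv else (
        " (Note: %s does not support matrix-vector operations)" % op_code)
    return ("Mismatch in matrix dimensions " + str1
            + " of parameters for function " + op_code + str2)
```

Calling `dimension_mismatch_message((3, 4), (3, 1), "max", False)` would report both 
shapes and append the matrix-vector note, which is the extra information this commit 
adds over the old message.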
---
 .../org/apache/sysml/parser/BuiltinFunctionExpression.java  | 13 ++---
 .../java/org/apache/sysml/parser/RelationalExpression.java  |  7 +--
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git 
a/src/main/java/org/apache/sysml/parser/BuiltinFunctionExpression.java 
b/src/main/java/org/apache/sysml/parser/BuiltinFunctionExpression.java
index fe86dc8..f27958f 100644
--- a/src/main/java/org/apache/sysml/parser/BuiltinFunctionExpression.java
+++ b/src/main/java/org/apache/sysml/parser/BuiltinFunctionExpression.java
@@ -1717,8 +1717,11 @@ public class BuiltinFunctionExpression extends 
DataIdentifier
  || (!allowsMV && expr1.getOutput().getDim2() 
!= expr2.getOutput().getDim2()) 
  || (allowsMV && expr1.getOutput().getDim2() 
!= expr2.getOutput().getDim2() && expr2.getOutput().getDim2() != 1) ) 
{
-   raiseValidateError("Mismatch in matrix 
dimensions of parameters for function "
-   + this.getOpCode(), 
conditional, LanguageErrorCodes.INVALID_PARAMETERS);
+   String str1 = "([" + 
expr1.getOutput().getDim1() + ", " + expr1.getOutput().getDim2()  + "] and [" 
+   + expr2.getOutput().getDim1() + 
", " + expr2.getOutput().getDim2()  + "])";
+   String str2 = !allowsMV ? " (Note: " + 
this.getOpCode() + " does not support matrix-vector operations)" : "";
+   raiseValidateError("Mismatch in matrix 
dimensions " + str1 + " of parameters for function "
+   + this.getOpCode() + str2, 
conditional, LanguageErrorCodes.INVALID_PARAMETERS);
}
}
}
@@ -1726,7 +1729,11 @@ public class BuiltinFunctionExpression extends 
DataIdentifier
private void checkMatchingDimensionsQuantile() 
{
if (getFirstExpr().getOutput().getDim1() != 
getSecondExpr().getOutput().getDim1()) {
-   raiseValidateError("Mismatch in matrix dimensions for "
+   Expression expr1 = getFirstExpr();
+   Expression expr2 = getSecondExpr();
+   String str1 = "([" + expr1.getOutput().getDim1() + ", " 
+ expr1.getOutput().getDim2()  + "] and [" 
+   + expr2.getOutput().getDim1() + ", " + 
expr2.getOutput().getDim2()  + "])";
+   raiseValidateError("Mismatch in matrix dimensions " + 
str1 + " of parameters for "
+ this.getOpCode(), false, 
LanguageErrorCodes.INVALID_PARAMETERS);
}
}
diff --git a/src/main/java/org/apache/sysml/parser/RelationalExpression.java 
b/src/main/java/org/apache/sysml/parser/RelationalExpression.java
index eed568c..8d897c7 100644
--- a/src/main/java/org/apache/sysml/parser/RelationalExpression.java
+++ b/src/main/java/org/apache/sysml/parser/RelationalExpression.java
@@ -182,8 +182,11 @@ public class RelationalExpression extends Expression
  || (!allowsMV && expr1.getOutput().getDim2() 
!= expr2.getOutput().getDim2()) 
  || (allowsMV && expr1.getOutput().getDim2() 
!= expr2.getOutput().getDim2() && expr2.getOutput().getDim2() != 1) ) 
{
-   raiseValidateError("Mismatch in matrix 
dimensions of parameters for function "
-   + this.getOpCode(), false, 
LanguageErrorCodes.INVALID_PARAMETERS);
+   String str1 = "([" + 
expr1.getOutput().getDim1() + ", " + expr1.getOutput().getDim2()  + "] and [" 
+   + expr2.getOutput().getDim1() + 
", " + expr2.getOutput().getDim2()  + "])";
+   String str2 = !allowsMV ? " (Note: " + 
this.getOpCode() + " does not support matrix-vector operations)" : "";
+   raiseValidateError("Mismatch in matrix 
dimensions " + str1 + " of parameters for function "
+   + this.getOpCode() + str2, 
false, LanguageErrorCodes.INVALID_PARAMETERS);
}
}
}



[systemml] branch master updated: [MINOR] Throw a controlled exception when the expected number of inputs of UDF does not match the actual number of inputs

2019-03-11 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new a060f83  [MINOR] Throw a controlled exception when the expected number 
of inputs of UDF does not match the actual number of inputs
a060f83 is described below

commit a060f83f01b268a9fd0582517993d8ebdbe2848a
Author: Niketan Pansare 
AuthorDate: Mon Mar 11 14:40:53 2019 -0700

[MINOR] Throw a controlled exception when the expected number of inputs of 
UDF does not match the actual number of inputs
---
 .../org/apache/sysml/hops/ipa/InterProceduralAnalysis.java| 11 +++
 1 file changed, 11 insertions(+)

diff --git 
a/src/main/java/org/apache/sysml/hops/ipa/InterProceduralAnalysis.java 
b/src/main/java/org/apache/sysml/hops/ipa/InterProceduralAnalysis.java
index 72aa9cb..213991e 100644
--- a/src/main/java/org/apache/sysml/hops/ipa/InterProceduralAnalysis.java
+++ b/src/main/java/org/apache/sysml/hops/ipa/InterProceduralAnalysis.java
@@ -529,6 +529,17 @@ public class InterProceduralAnalysis
ArrayList inputOps = fop.getInput();
String fkey = fop.getFunctionKey();

+   // Throw a controlled exception when the expected number of 
inputs doesnot match the actual number of inputs 
+   // instead of array out of bounds exception.
+   if(inputOps.size() != funArgNames.length) {
+   String argsList = funArgNames.length > 0 ? 
funArgNames[0] : "";
+   for( int i=1; i

[systemml] branch master updated: [SYSTEMML-540] Reduce the memory pressure of CP lstm_backward instruction

2019-03-05 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new c7b9745  [SYSTEMML-540] Reduce the memory pressure of CP lstm_backward 
instruction
c7b9745 is described below

commit c7b9745800e0c71f0c6c76b8284c78e33a5cdb01
Author: Niketan Pansare 
AuthorDate: Tue Mar 5 15:48:09 2019 -0800

[SYSTEMML-540] Reduce the memory pressure of CP lstm_backward instruction

- When lstm_backward is invoked, this commit avoids the memory allocation and 
left indexing of the output and carry activations of the corresponding forward 
invocation.
---
 .../runtime/instructions/cp/DnnCPInstruction.java|  6 +++---
 .../sysml/runtime/matrix/data/LibMatrixDNN.java  | 20 +---
 2 files changed, 12 insertions(+), 14 deletions(-)

diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/cp/DnnCPInstruction.java 
b/src/main/java/org/apache/sysml/runtime/instructions/cp/DnnCPInstruction.java
index 50a11de..35ac5b6 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/cp/DnnCPInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/cp/DnnCPInstruction.java
@@ -388,8 +388,6 @@ public class DnnCPInstruction extends UnaryCPInstruction {
+ "but found [" + dc.getNumRows() + "," 
+ dc.getNumColumns() + "]");
}

-   MatrixBlock out = new MatrixBlock(N, return_seq ? (T*M) : M, 
false);
-   MatrixBlock c = new MatrixBlock(N, M, false);
MatrixBlock cache_out = new MatrixBlock(T, N*M, false);
MatrixBlock cache_c = new MatrixBlock(T, N*M, false);
MatrixBlock cache_ifog = new MatrixBlock(T, N*4*M, false);
@@ -401,7 +399,9 @@ public class DnnCPInstruction extends UnaryCPInstruction {
cache_ifog.allocateDenseBlock();
LibMatrixDNN.lstm(X, W, b, out0, c0, 
return_seq, N, T, D, M,
-   out,  c, cache_out, cache_c, cache_ifog,
+   // Avoid out and c computation in lstm forward 
call
+   null, null, 
+   cache_out, cache_c, cache_ifog,
_numThreads);

MatrixBlock dX = new MatrixBlock(N, T*D, false);
diff --git 
a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java 
b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java
index 0005932..3ec9fb3 100644
--- a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java
+++ b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java
@@ -20,7 +20,6 @@ package org.apache.sysml.runtime.matrix.data;
 
 import java.util.ArrayList;
 import java.util.Arrays;
-import java.util.HashSet;
 import java.util.List;
 import java.util.concurrent.Callable;
 import java.util.concurrent.ExecutorService;
@@ -41,7 +40,6 @@ import org.apache.sysml.runtime.functionobjects.Multiply;
 import org.apache.sysml.runtime.functionobjects.Plus;
 import org.apache.sysml.runtime.functionobjects.PlusMultiply;
 import org.apache.sysml.runtime.functionobjects.Power;
-import org.apache.sysml.runtime.functionobjects.Power2;
 import org.apache.sysml.runtime.functionobjects.SwapIndex;
 import org.apache.sysml.runtime.functionobjects.Builtin.BuiltinCode;
 import org.apache.sysml.runtime.instructions.cp.KahanObject;
@@ -57,8 +55,6 @@ import 
org.apache.sysml.runtime.matrix.operators.UnaryOperator;
 import org.apache.sysml.runtime.util.CommonThreadPool;
 import org.apache.sysml.runtime.util.DnnUtils;
 
-import com.sun.org.apache.xpath.internal.operations.Minus;
-
 /*
  * This class allows users to invoke deep learning related operations 
  * (such as conv2d, conv2d_backward_data, conv2d_backward_filter, maxpooling, 
maxpooling_backward, bias_add)
@@ -514,7 +510,7 @@ public class LibMatrixDNN {

public static void lstm(MatrixBlock X, MatrixBlock W, MatrixBlock b, 
MatrixBlock out0, MatrixBlock c0, 
boolean return_seq, int N, int T, int D, int M,
-   MatrixBlock out, MatrixBlock c, // output 
+   MatrixBlock out, MatrixBlock c, // output: if null, the 
output and c are not passed back
MatrixBlock cache_out, MatrixBlock cache_c, MatrixBlock 
cache_ifog, // if null, the cache values are not computed
int numThreads) {
MatrixBlock out_prev = out0;
@@ -624,7 +620,7 @@ public class LibMatrixDNN {
updateIfogCache(cache_ifog, ifo, g, t, N, M);
}

-   if(return_seq) {
+ 

[systemml] 02/02: [SYSTEMML-540] Improved the performance of lstm builtin function for sparse inputs

2019-03-04 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git

commit 792da5d0aa2abd6e650a3a17f243795d0f9a4b35
Author: Niketan Pansare 
AuthorDate: Mon Mar 4 13:32:35 2019 -0800

[SYSTEMML-540] Improved the performance of lstm builtin function for sparse 
inputs

This commit allows the matrix multiplication operator to exploit sparsity by 
separating lstm into three cases:
1. If W is sparse, perform cbind(X_t, out_prev) %*% W
2. If X_t is sparse, perform X_t %*% W1 + out_prev %*% W2
3. If neither case applies, perform cbind(X_t, out_prev) %*% W to 
maximize parallelism within the matrix multiplication operator
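
The three cases can be sketched with NumPy as follows. This is a minimal 
illustration, not the Java implementation: the `sparsity` helper and the 
`sparse_threshold` value are assumptions for this sketch, and dense NumPy arrays do 
not actually gain speed from sparsity. The point is that both forms compute the 
same result, since W1 holds the first D rows of W and W2 the remaining M rows.

```python
import numpy as np

def sparsity(m):
    """Fraction of zero entries in a dense NumPy array (hypothetical helper)."""
    return 1.0 - np.count_nonzero(m) / m.size

def lstm_gate_input(X_t, out_prev, W, D, sparse_threshold=0.6):
    """Compute cbind(X_t, out_prev) %*% W, picking the form by sparsity.

    sparse_threshold is an assumed cut-off, not a value taken from the commit.
    """
    if sparsity(W) > sparse_threshold:
        # Case 1: W is sparse, so use one multiplication against W.
        return np.hstack((X_t, out_prev)) @ W
    if sparsity(X_t) > sparse_threshold:
        # Case 2: X_t is sparse; keep it separate from the (usually dense)
        # out_prev so the first multiplication can stay sparse.
        W1, W2 = W[:D, :], W[D:, :]
        return X_t @ W1 + out_prev @ W2
    # Case 3: everything is dense; one large multiplication parallelizes best.
    return np.hstack((X_t, out_prev)) @ W
```

With X_t of shape N x D, out_prev of shape N x M and W of shape (D+M) x 4M, every 
branch returns an N x 4M matrix.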
---
 .../sysml/runtime/matrix/data/LibMatrixDNN.java| 114 ++---
 1 file changed, 53 insertions(+), 61 deletions(-)

diff --git 
a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java 
b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java
index e2742d8..365d7a2 100644
--- a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java
+++ b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java
@@ -284,26 +284,34 @@ public class LibMatrixDNN {

private static MatrixBlock add(MatrixBlock matBlock1, MatrixBlock 
matBlock2, boolean inplace) {
BinaryOperator bop = new BinaryOperator(Plus.getPlusFnObject());
-// if(inplace) {
-// matBlock1.binaryOperationsInPlace(bop, matBlock2);
-// return matBlock1;
-// }
-// else {
+   if(inplace && matBlock1.isInSparseFormat() == 
matBlock2.isInSparseFormat() &&
+   matBlock1.getNumRows() == matBlock2.getNumRows() && 
matBlock1.getNumColumns() == matBlock2.getNumColumns()) {
+   matBlock1.binaryOperationsInPlace(bop, matBlock2);
+   return matBlock1;
+   }
+   else {
return (MatrixBlock) matBlock1.binaryOperations(bop, 
matBlock2, new MatrixBlock());
-// }
+   }
+   }
+   private static MatrixBlock plusMultiply(MatrixBlock matBlock1, 
MatrixBlock matBlock2, MatrixBlock matBlock3) {
+   return matBlock1.ternaryOperations(new 
TernaryOperator(PlusMultiply.getFnObject()), 
+   matBlock2, matBlock3, new MatrixBlock());
}

+   
private static MatrixBlock multiply(MatrixBlock matBlock1, MatrixBlock 
matBlock2, boolean inplace) {
BinaryOperator bop = new 
BinaryOperator(Multiply.getMultiplyFnObject());
-// if(inplace) {
-// matBlock1.binaryOperationsInPlace(bop, matBlock2);
-// return matBlock1;
-// }
-// else {
+   if(inplace && matBlock1.isInSparseFormat() == 
matBlock2.isInSparseFormat() &&
+   matBlock1.getNumRows() == matBlock2.getNumRows() && 
matBlock1.getNumColumns() == matBlock2.getNumColumns()) {
+   matBlock1.binaryOperationsInPlace(bop, matBlock2);
+   return matBlock1;
+   }
+   else {
return (MatrixBlock) matBlock1.binaryOperations(bop, 
matBlock2, new MatrixBlock());
-// }
+   }
}

+   
// sigmoid(0)*c_prev + sigmoid(0)*tanh(0);

private static Builtin sigmoidOp = 
Builtin.getBuiltinFnObject(BuiltinCode.SIGMOID);
@@ -311,16 +319,10 @@ public class LibMatrixDNN {
private static MatrixBlock sigmoid(MatrixBlock in, int numThreads, 
boolean inPlace) {
return (MatrixBlock) in.unaryOperations(new 
UnaryOperator(sigmoidOp, numThreads, inPlace), new MatrixBlock());
}
-   
private static MatrixBlock tanh(MatrixBlock in, int numThreads, boolean 
inPlace) {
return (MatrixBlock) in.unaryOperations(new 
UnaryOperator(tanhOp, numThreads, inPlace), new MatrixBlock());
}

-   private static MatrixBlock plusMultiply(MatrixBlock matBlock1, 
MatrixBlock matBlock2, MatrixBlock matBlock3) {
-   return matBlock1.ternaryOperations(new 
TernaryOperator(PlusMultiply.getFnObject()), 
-   matBlock2, matBlock3, new MatrixBlock());
-   }
-   
public static void lstm(MatrixBlock X, MatrixBlock W, MatrixBlock b, 
MatrixBlock out0, MatrixBlock c0, 
boolean return_seq, int N, int T, int D, int M,
MatrixBlock out, MatrixBlock c, // output 
@@ -329,61 +331,56 @@ public class LibMatrixDNN {
MatrixBlock out_prev = out0;
MatrixBlock c_prev = c0;

-   MatrixBlock W1 = W.slice(0, D-1);
-

[systemml] 01/02: added profiling info

2019-03-04 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git

commit adff5ee743992fdcfed9923ee876791e01532220
Author: Niketan Pansare 
AuthorDate: Thu Feb 28 09:29:56 2019 -0800

added profiling info
---
 .../sysml/runtime/matrix/data/LibMatrixDNN.java| 37 --
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git 
a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java 
b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java
index 0f932ba..e2742d8 100644
--- a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java
+++ b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixDNN.java
@@ -333,10 +333,25 @@ public class LibMatrixDNN {
MatrixBlock W2 = W.slice(D, D+M-1);
MatrixBlock c_t = null;
MatrixBlock out_t = null;
+   
+   boolean profile = true;
+   long t1 = 0, t2 = 0, t3 = 0, t4 = 0, t5 = 0;
for(int t = 1; t <= T; t++) {
+   long s =  profile ? System.nanoTime() : 0;
MatrixBlock X_t = X.slice(0, N-1, (t-1)*D, t*D-1, new 
MatrixBlock());
+   if(profile) {
+   long e = System.nanoTime();
+   t1 += e - s;
+   }
+   
+   s =  profile ? System.nanoTime() : 0;
MatrixBlock ifog_raw = add(add(matmult(X_t, W1, 
numThreads), matmult(out_prev, W2, numThreads), true), b, true);
+   if(profile) {
+   long e = System.nanoTime();
+   t2 += e - s;
+   }

+   s =  profile ? System.nanoTime() : 0;
MatrixBlock ifo = ifog_raw.slice(0, N-1, 0, 3*M-1, new 
MatrixBlock());
ifo = sigmoid(ifo, numThreads, true);
MatrixBlock i = ifo.slice(0, N-1, 0, M-1, new 
MatrixBlock());
@@ -345,16 +360,30 @@ public class LibMatrixDNN {

MatrixBlock g = ifog_raw.slice(0, N-1, 3*M, 4*M-1, new 
MatrixBlock());
g = tanh(g, numThreads, true);
+   if(profile) {
+   long e = System.nanoTime();
+   t3 += e - s;
+   }

+   s =  profile ? System.nanoTime() : 0;
// c_t = f*c_prev + i*g
c_t = plusMultiply(multiply(f, c_prev, true), i, g);
-   
// out_t = o*tanh(c)
out_t = multiply(o, tanh(c_t, numThreads, false), true);
+   if(profile) {
+   long e = System.nanoTime();
+   t4 += e - s;
+   }

+   s =  profile ? System.nanoTime() : 0;
if(return_seq) {
out = out.leftIndexingOperations(out_t, 0, N-1, 
(t-1)*M, t*M-1, new MatrixBlock(), UpdateType.INPLACE);
}
+   if(profile) {
+   long e = System.nanoTime();
+   t5 += e - s;
+   }
+   
out_prev = out_t;
c_prev = c_t;

@@ -369,7 +398,11 @@ public class LibMatrixDNN {
c.copy(c_t);
else
c.copy(c0);
-   
+   System.out.println("Time taken in lstm forward call: [X_t 
indexing:" + String.format("%.3f", t1*1e-9) + 
+   ", ifog_raw computation:" + 
String.format("%.3f", t2*1e-9) + 
+   ", lstm_squash computation:" + 
String.format("%.3f", t3*1e-9) +  
+   ", c_t/out_t computation:" + 
String.format("%.3f", t4*1e-9) + 
+   ", out leftIndexing computation:" + 
String.format("%.3f", t5*1e-9));
}

/**



[systemml] branch master updated: [SYSTEMML-540] Improve the performance of lstm builtin function

2019-02-27 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 0cabde0  [SYSTEMML-540] Improve the performance of lstm builtin 
function
0cabde0 is described below

commit 0cabde0ca26c99a55c62f7e7ffac67b450dea850
Author: Niketan Pansare 
AuthorDate: Wed Feb 27 21:03:15 2019 -0800

[SYSTEMML-540] Improve the performance of lstm builtin function

- Allow FunctionOp to be multi-threaded.
- Currently, only the lstm builtin function uses more than one thread.
- Added more tests.
---
 .../java/org/apache/sysml/hops/FunctionOp.java | 10 +++-
 .../java/org/apache/sysml/lops/FunctionCallCP.java | 14 +-
 .../runtime/instructions/cp/DnnCPInstruction.java  | 13 ++---
 .../sysml/runtime/matrix/data/LibMatrixDNN.java| 55 --
 .../org/apache/sysml/test/gpu/LstmCPUTest.java | 50 
 5 files changed, 118 insertions(+), 24 deletions(-)

diff --git a/src/main/java/org/apache/sysml/hops/FunctionOp.java 
b/src/main/java/org/apache/sysml/hops/FunctionOp.java
index 66ce478..5fdc8e7 100644
--- a/src/main/java/org/apache/sysml/hops/FunctionOp.java
+++ b/src/main/java/org/apache/sysml/hops/FunctionOp.java
@@ -39,7 +39,7 @@ import 
org.apache.sysml.runtime.controlprogram.parfor.opt.CostEstimatorHops;
  * Note: Currently, we support expressions in function arguments along with 
function calls
  * in expressions with single outputs, leaving multiple outputs handling as it 
is.
  */
-public class FunctionOp extends Hop
+public class FunctionOp extends MultiThreadedHop
 {
public enum FunctionType{
DML,
@@ -253,8 +253,14 @@ public class FunctionOp extends Hop
tmp.add( in.constructLops() );

//construct function call
+   int numThreads = 0;
+   if(getFunctionType() == FunctionType.MULTIRETURN_BUILTIN && 
isBuiltinFunction() && et == ExecType.CP &&
+   (getFunctionName().equalsIgnoreCase("lstm") || 
getFunctionName().equalsIgnoreCase("lstm_backward"))) {
+   numThreads = 
OptimizerUtils.getConstrainedNumThreads(_maxNumThreads);
+   }
+   
Lop fcall = _singleOutFun ? new FunctionCallCPSingle( tmp, 
_fnamespace, _fname, et ) :
-   new FunctionCallCP(tmp, _fnamespace, _fname, 
_inputNames, _outputNames, _outputHops, et);
+   new FunctionCallCP(tmp, _fnamespace, _fname, 
_inputNames, _outputNames, _outputHops, et, numThreads);
setLineNumbers(fcall);
setLops(fcall);

diff --git a/src/main/java/org/apache/sysml/lops/FunctionCallCP.java 
b/src/main/java/org/apache/sysml/lops/FunctionCallCP.java
index 50d43de..237b806 100644
--- a/src/main/java/org/apache/sysml/lops/FunctionCallCP.java
+++ b/src/main/java/org/apache/sysml/lops/FunctionCallCP.java
@@ -38,10 +38,12 @@ public class FunctionCallCP extends Lop
private String[] _inputNames;
private String[] _outputNames;
private ArrayList _outputLops = null;
+   private int _numThreads;
 
public FunctionCallCP(ArrayList inputs, String fnamespace, String 
fname, 
-   String[] inputNames, String[] outputNames, ArrayList 
outputHops, ExecType et) {
+   String[] inputNames, String[] outputNames, ArrayList 
outputHops, ExecType et, int numThreads) {
this(inputs, fnamespace, fname, inputNames, outputNames, et);
+   _numThreads = numThreads;
if(outputHops != null) {
_outputLops = new ArrayList<>();
setLevel();
@@ -104,6 +106,11 @@ public class FunctionCallCP extends Lop
sb.append(_outputNames[i]);
}

+   if(_numThreads > 0) {
+   sb.append(Lop.OPERAND_DELIMITOR);
+   sb.append(_numThreads);
+   }
+   
return sb.toString();
}

@@ -145,6 +152,11 @@ public class FunctionCallCP extends Lop
inst.append(Lop.OPERAND_DELIMITOR);
inst.append(out);
}
+   
+   if(_numThreads > 0) {
+   inst.append(Lop.OPERAND_DELIMITOR);
+   inst.append(_numThreads);
+   }
 
return inst.toString();
}
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/cp/DnnCPInstruction.java 
b/src/main/java/org/apache/sysml/runtime/instructions/cp/DnnCPInstruction.java
index 4043908..93ffd4f 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/cp/

[systemml] branch master updated: [MINOR] Allow access to classloaders methods

2019-02-27 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new bf4717f  [MINOR] Allow access to classloaders methods
bf4717f is described below

commit bf4717f39aaf3cf70bf99648afd38cd8dd5c8ad3
Author: Niketan Pansare 
AuthorDate: Wed Feb 27 13:05:49 2019 -0800

[MINOR] Allow access to classloaders methods
---
 src/main/python/systemml/__init__.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/main/python/systemml/__init__.py 
b/src/main/python/systemml/__init__.py
index 74c8e50..f268642 100644
--- a/src/main/python/systemml/__init__.py
+++ b/src/main/python/systemml/__init__.py
@@ -22,7 +22,9 @@
 from .mlcontext import *
 from .defmatrix import *
 from .converters import *
+from .classloader import *
 
 __all__ = mlcontext.__all__
 __all__ += defmatrix.__all__
 __all__ += converters.__all__
+__all__ += classloader.__all__



[systemml] branch master updated: [SYSTEMML-540] Added an initial CP operator for lstm builtin function

2019-02-23 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 592f1b0  [SYSTEMML-540] Added an initial CP operator for lstm builtin 
function
592f1b0 is described below

commit 592f1b0e9a5566195d0e73fede318e2a269bb4a0
Author: Niketan Pansare 
AuthorDate: Sat Feb 23 09:05:11 2019 -0800

[SYSTEMML-540] Added an initial CP operator for lstm builtin function

- This operator relies on existing slice, left indexing, binary and matrix 
multiplication operators.
- We can later create a fused implementation especially for lstm activation 
to avoid unnecessary copies in the inner loop.
- To exploit sparsity in the input data and avoid unnecessary 
sparse-to-dense conversion (as out_prev is often dense), we perform two matrix 
multiplications followed by a binary addition.
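
A NumPy sketch of the forward pass built from these primitives (slicing, left 
indexing, element-wise binary operations and matrix multiplication) is given here 
for orientation. It follows the formulas visible in the diffs (c_t = f*c_prev + i*g, 
out_t = o*tanh(c_t)); the [i, f, o, g] column order of the gates and all Python 
names are assumptions of this sketch rather than the Java code itself.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(X, W, b, out0, c0, return_seq, N, T, D, M):
    """X: N x (T*D), W: (D+M) x 4M, b: 1 x 4M, out0 and c0: N x M."""
    W1, W2 = W[:D, :], W[D:D + M, :]          # input and recurrent weights
    out_prev, c_prev = out0, c0
    out = np.zeros((N, T * M)) if return_seq else None
    for t in range(1, T + 1):
        X_t = X[:, (t - 1) * D: t * D]        # slice out time step t
        # Two matrix multiplications followed by binary additions
        ifog_raw = X_t @ W1 + out_prev @ W2 + b
        ifo = sigmoid(ifog_raw[:, :3 * M])
        i, f, o = ifo[:, :M], ifo[:, M:2 * M], ifo[:, 2 * M:3 * M]
        g = np.tanh(ifog_raw[:, 3 * M:4 * M])
        c_t = f * c_prev + i * g              # c_t = f*c_prev + i*g
        out_t = o * np.tanh(c_t)              # out_t = o*tanh(c)
        if return_seq:
            out[:, (t - 1) * M: t * M] = out_t  # left indexing into the output
        out_prev, c_prev = out_t, c_t
    return (out if return_seq else out_prev), c_prev
```

A fused implementation, as suggested above, would avoid the temporary matrices 
created for `ifo`, `g`, `c_t` and `out_t` on each iteration of the loop.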
---
 .../java/org/apache/sysml/hops/FunctionOp.java |  42 +-
 .../runtime/instructions/CPInstructionParser.java  |   2 +
 .../runtime/instructions/cp/DnnCPInstruction.java  |  61 
 .../sysml/runtime/matrix/data/LibMatrixDNN.java|  88 +++
 .../org/apache/sysml/test/gpu/LstmCPUTest.java | 162 +
 5 files changed, 347 insertions(+), 8 deletions(-)

diff --git a/src/main/java/org/apache/sysml/hops/FunctionOp.java 
b/src/main/java/org/apache/sysml/hops/FunctionOp.java
index 5f177bd..66ce478 100644
--- a/src/main/java/org/apache/sysml/hops/FunctionOp.java
+++ b/src/main/java/org/apache/sysml/hops/FunctionOp.java
@@ -274,19 +274,45 @@ public class FunctionOp extends Hop
{
checkAndSetForcedPlatform();

-   if ( getFunctionType() == FunctionType.MULTIRETURN_BUILTIN ) {
-   boolean isBuiltinFunction = isBuiltinFunction();
+   if(getFunctionType() == FunctionType.MULTIRETURN_BUILTIN && 
isBuiltinFunction() &&
+   (getFunctionName().equalsIgnoreCase("lstm") || 
getFunctionName().equalsIgnoreCase("lstm_backward"))) {
+   
+   if(getFunctionName().equalsIgnoreCase("lstm_backward")) 
{
+   if(!ConfigurationManager.isGPU())
+   throw new RuntimeException("The 
function " + getFunctionName() + " is only supported on GPU.");
+   _etype = ExecType.GPU;
+   }
+   
+   ExecType REMOTE = OptimizerUtils.isSparkExecutionMode() 
? ExecType.SPARK : ExecType.MR;
+   
+   if( _etypeForced != null ) {
+   _etype = _etypeForced;
+   }
+   else {  
+   if ( OptimizerUtils.isMemoryBasedOptLevel() ) {
+   _etype = findExecTypeByMemEstimate();
+   }
+   else {
+   _etype = REMOTE;
+   }
+   
+   //check for valid CP dimensions and matrix size
+   checkAndSetInvalidCPDimsAndSize();
+   }
+   
+   // Since lstm builtin functions are not supported on 
Spark
+   _etype = _etype == REMOTE ?  ExecType.CP : _etype;
+   
+   //mark for recompile (forever)
+   setRequiresRecompileIfNecessary();
+   }
+   else if ( getFunctionType() == FunctionType.MULTIRETURN_BUILTIN 
) {
// check if there is sufficient memory to execute this 
function
-   if(isBuiltinFunction && 
getFunctionName().equalsIgnoreCase("transformencode") ) {
+   if(isBuiltinFunction() && 
getFunctionName().equalsIgnoreCase("transformencode") ) {
_etype = ((_etypeForced==ExecType.SPARK 
|| (getMemEstimate() >= 
OptimizerUtils.getLocalMemBudget()
&& 
OptimizerUtils.isSparkExecutionMode())) ? ExecType.SPARK : ExecType.CP);
}
-   else if(isBuiltinFunction && 
(getFunctionName().equalsIgnoreCase("lstm") || 
getFunctionName().equalsIgnoreCase("lstm_backward"))) {
-   if(!ConfigurationManager.isGPU())
-   throw new RuntimeException("The 
function " + getFunctionName() + " is only supported on GPU.");
- 

[systemml] branch master updated: [SYSTEMML-540] Added tests for comparing Keras2DML output with TF

2019-02-18 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 0b9a5fc  [SYSTEMML-540] Added tests for comparing Keras2DML output 
with TF
0b9a5fc is described below

commit 0b9a5fc3e44d649efeebee929f679f0188e57134
Author: Niketan Pansare 
AuthorDate: Mon Feb 18 11:54:57 2019 -0800

[SYSTEMML-540] Added tests for comparing Keras2DML output with TF

- The test framework is generalized to simplify testing of new layers.
- The default values in Keras2DML have been updated to match the default 
invocation of Keras.
- Added a Flatten layer to Caffe2DML.
- If the user attempts to use a dense layer for 3-D inputs, we now throw an 
error instead of silently giving a wrong answer.
- Fixed a bug in the conversion of Conv2D weights.
- Also fixed a bug that occurred when invoking a neural network that has no 
weights.
---
 src/main/python/systemml/mllearn/estimators.py |  32 ++-
 src/main/python/systemml/mllearn/keras2caffe.py|  61 ++--
 src/main/python/tests/test_nn_numpy.py | 307 ++---
 .../scala/org/apache/sysml/api/dl/CaffeLayer.scala |  18 ++
 .../org/apache/sysml/api/dl/CaffeNetwork.scala |   1 +
 5 files changed, 285 insertions(+), 134 deletions(-)

diff --git a/src/main/python/systemml/mllearn/estimators.py 
b/src/main/python/systemml/mllearn/estimators.py
index a2647f1..2c3b6a2 100644
--- a/src/main/python/systemml/mllearn/estimators.py
+++ b/src/main/python/systemml/mllearn/estimators.py
@@ -1009,8 +1009,8 @@ class Keras2DML(Caffe2DML):
 
 """
 
-def __init__(self, sparkSession, keras_model, input_shape, 
transferUsingDF=False, load_keras_weights=True, weights=None, labels=None,
- batch_size=64, max_iter=2000, test_iter=10, 
test_interval=500, display=100, lr_policy="step", weight_decay=5e-4, 
regularization_type="L2"):
+def __init__(self, sparkSession, keras_model, input_shape=None, 
transferUsingDF=False, load_keras_weights=True, weights=None, labels=None,
+ batch_size=64, max_iter=2000, test_iter=0, test_interval=500, 
display=100, lr_policy="step", weight_decay=0, regularization_type="L2"):
 """
 Performs training/prediction for a given keras model.
 
@@ -1018,37 +1018,43 @@ class Keras2DML(Caffe2DML):
 --
 sparkSession: PySpark SparkSession
 keras_model: keras model
-input_shape: 3-element list (number of channels, input height, input 
width)
+input_shape: 3-element list (number of channels, input height, input 
width). If not provided, it is inferred from the input shape of the first layer.
 transferUsingDF: whether to pass the input dataset via PySpark 
DataFrame (default: False)
 load_keras_weights: whether to load weights from the keras_model. If 
False, the weights will be initialized to random value using NN libraries' init 
method  (default: True)
 weights: directory whether learned weights are stored (default: None)
 labels: file containing mapping between index and string labels 
(default: None)
 batch_size: size of the input batch (default: 64)
-max_iter: maximum number of iterations (default: 1)
-test_iter: test_iter for caffe solver (default: 10)
+max_iter: maximum number of iterations (default: 2000)
+test_iter: test_iter for caffe solver (default: 0)
 test_interval: test_interval for caffe solver (default: 500)
 display: display for caffe solver (default: 100)
 lr_policy: learning rate policy for caffe solver (default: "step")
-weight_decay: regularation strength (default: 5e-4)
+weight_decay: regularation strength (default: 0, recommended: 5e-4)
 regularization_type: regularization type (default: "L2")
 """
 from .keras2caffe import convertKerasToCaffeNetwork, 
convertKerasToCaffeSolver, convertKerasToSystemMLModel
 import tempfile, keras
+if keras.backend.image_data_format() != 'channels_first':
+raise Exception('The data format ' + 
str(keras.backend.image_data_format())
++ ' is not supported. Please use 
keras.backend.set_image_data_format("channels_first")')
 if isinstance(keras_model, keras.models.Sequential):
 # Convert the sequential model to functional model
 if keras_model.model is None:
 keras_model.build()
 keras_model = keras_model.model
+if input_shape is None:
+keras_shape = keras_model.layers[0].input_shape
+input_shape = [1, 1, 1]
+if len(keras_shape) > 4 or len(keras_shape) <= 1:
+

[systemml] branch master updated: [SYSTEMML-540] Bugfix for the Keras2DML LSTM layer

2019-02-04 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new dd1a09b  [SYSTEMML-540] Bugfix for the Keras2DML LSTM layer
dd1a09b is described below

commit dd1a09b2df610555b950c39cd4792ee70f326b5d
Author: Niketan Pansare 
AuthorDate: Mon Feb 4 15:31:03 2019 -0800

[SYSTEMML-540] Bugfix for the Keras2DML LSTM layer

- For the LSTM layer, Keras weights are laid out in [i, f, c, o] format,
whereas the SystemML weights are laid out in [i, f, o, c] format.
- This caused inconsistent outputs, especially in transfer-learning or
prediction settings where a model trained in Keras has to be used by
SystemML.
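
The reordering can be sketched in isolation with NumPy. This is a simplified 
illustration of the idea, not the patched getInputMatrices function; the helper 
names are hypothetical, and W, U and b stand for the kernel, recurrent kernel and 
bias returned by a Keras LSTM layer's get_weights().

```python
import numpy as np

def reorder_ifco_to_ifoc(m, units):
    """Reorder the last axis from Keras' [i, f, c, o] blocks to [i, f, o, c]."""
    i = m[..., :units]
    f = m[..., units:2 * units]
    c = m[..., 2 * units:3 * units]
    o = m[..., 3 * units:]
    return np.concatenate((i, f, o, c), axis=-1)

def keras_lstm_to_systemml(W, U, b):
    """Return (weights, bias) in the [i, f, o, c] gate layout.

    W: D x 4*units kernel, U: units x 4*units recurrent kernel, b: 4*units bias.
    """
    if W.shape[1] != U.shape[1]:
        raise ValueError("Kernel and recurrent kernel have different widths")
    units = W.shape[1] // 4  # integer division so the slices get int bounds
    weights = np.vstack((reorder_ifco_to_ifoc(W, units),
                         reorder_ifco_to_ifoc(U, units)))
    bias = reorder_ifco_to_ifoc(b, units).reshape((1, -1))
    return weights, bias
```

The sketch uses `//` because slice bounds must be integers in Python 3; the result 
is a (D+units) x 4*units weight matrix and a 1 x 4*units bias row.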
---
 src/main/python/systemml/mllearn/keras2caffe.py | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/main/python/systemml/mllearn/keras2caffe.py 
b/src/main/python/systemml/mllearn/keras2caffe.py
index a06113c..f6d6440 100755
--- a/src/main/python/systemml/mllearn/keras2caffe.py
+++ b/src/main/python/systemml/mllearn/keras2caffe.py
@@ -477,10 +477,30 @@ def convertKerasToCaffeSolver(kerasModel, 
caffeNetworkFilePath, outCaffeSolverFi
 
 
 def getInputMatrices(layer):
-if isinstance(layer, keras.layers.LSTM) or isinstance(
-layer, keras.layers.SimpleRNN):
+if isinstance(layer, keras.layers.SimpleRNN):
 weights = layer.get_weights()
 return [np.vstack((weights[0], weights[1])), np.matrix(weights[2])]
+elif isinstance(layer, keras.layers.LSTM):
+weights = layer.get_weights()
+W, U, b =  weights[0], weights[1], weights[2]
+units = W.shape[1] // 4
+if W.shape[1] != U.shape[1]:
+raise Exception('Number of hidden units of the kernel and the 
recurrent kernel does not match')
+# Note: For the LSTM layer, Keras weights are laid out in [i, f, c, o] 
format;
+# whereas SystemML weights are laid out in [i, f, o, c] format.
+W_i = W[:, :units]
+W_f = W[:, units: units * 2]
+W_c = W[:, units * 2: units * 3]
+W_o = W[:, units * 3:]
+U_i = U[:, :units]
+U_f = U[:, units: units * 2]
+U_c = U[:, units * 2: units * 3]
+U_o = U[:, units * 3:]
+b_i = b[:units]
+b_f = b[units: units * 2]
+b_c = b[units * 2: units * 3]
+b_o = b[units * 3:]
+return [np.vstack((np.hstack((W_i, W_f, W_o, W_c)), np.hstack((U_i, 
U_f, U_o, U_c)))).reshape((-1, 4*units)), np.hstack((b_i, b_f, b_o, 
b_c)).reshape((1, -1))]
 else:
 return [getNumPyMatrixFromKerasWeight(
 param) for param in layer.get_weights()]



[systemml] branch master updated: [SYSTEMML-540] Make Keras2DML compatible with newer Keras versions

2019-02-01 Thread niketanpansare
This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemml.git


The following commit(s) were added to refs/heads/master by this push:
 new 5288bc0  [SYSTEMML-540] Make Keras2DML compatible with newer Keras 
versions
5288bc0 is described below

commit 5288bc0d536df0574b17363d950e05b3c4bbe0d4
Author: Niketan Pansare 
AuthorDate: Fri Feb 1 16:52:57 2019 -0800

[SYSTEMML-540] Make Keras2DML compatible with newer Keras versions

- After version 2.1.5, Keras underwent a major refactoring that changed its 
layer definitions.
- In version 2.2.4, the model no longer contains an explicit InputLayer.
- This commit addresses both changes so that Keras2DML is compatible with 
older as well as newer Keras versions.
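
One way to stay compatible with both cases is to inspect the layer list rather
than the Keras version. The sketch below only illustrates that idea and is not
the approach taken in this commit; `ensure_input_layer` and the use of plain
class-name strings are assumptions made for the example.

```python
def ensure_input_layer(layer_types):
    """Return the layer type names with an 'InputLayer' entry in front.

    layer_types -- hypothetical list of layer class names, in model order
    """
    if layer_types and layer_types[0] == 'InputLayer':
        # Older layout: the explicit input layer is already present.
        return list(layer_types)
    # Newer layout: no explicit input layer, so add a synthetic one.
    return ['InputLayer'] + list(layer_types)


print(ensure_input_layer(['InputLayer', 'Dense', 'Dropout']))
print(ensure_input_layer(['Dense', 'Dropout']))
```

Both calls yield the same normalized list, so code further down does not need
to distinguish between the two layouts.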
---
 src/main/python/systemml/mllearn/keras2caffe.py | 108 ++--
 1 file changed, 64 insertions(+), 44 deletions(-)

diff --git a/src/main/python/systemml/mllearn/keras2caffe.py 
b/src/main/python/systemml/mllearn/keras2caffe.py
index 6e1e9c3..a06113c 100755
--- a/src/main/python/systemml/mllearn/keras2caffe.py
+++ b/src/main/python/systemml/mllearn/keras2caffe.py
@@ -106,7 +106,7 @@ str_keys = ['name', 'type', 'top', 'bottom']
 
 def toKV(key, value):
 return str(key) + ': "' + str(value) + \
-'"' if key in str_keys else str(key) + ': ' + str(value)
+   '"' if key in str_keys else str(key) + ': ' + str(value)
 
 
 def _parseJSONObject(obj):
@@ -143,7 +143,8 @@ def _parseActivation(layer, customLayerName=None):
   'type': supportedCaffeActivations[kerasActivation], 
'top': layer.name, 'bottom': layer.name}}
 else:
 return {'layer': {'name': layer.name,
-  'type': supportedCaffeActivations[kerasActivation], 
'top': layer.name, 'bottom': _getBottomLayers(layer)}}
+  'type': supportedCaffeActivations[kerasActivation], 
'top': layer.name,
+  'bottom': _getBottomLayers(layer)}}
 
 
 def _shouldParseActivation(layer):
@@ -184,8 +185,10 @@ def _parseBatchNorm(layer):
 bnName = layer.name + '_1'
 config = layer.get_config()
 bias_term = 'true' if config['center'] else 'false'
-return [{'layer': {'name': bnName, 'type': 'BatchNorm', 'bottom': 
_getBottomLayers(layer), 'top': bnName, 'batch_norm_param': 
{'moving_average_fraction': layer.momentum, 'eps': layer.epsilon}}}, {
-'layer': {'name': layer.name, 'type': 'Scale', 'bottom': bnName, 
'top': layer.name, 'scale_param': {'bias_term': bias_term}}}]
+return [{'layer': {'name': bnName, 'type': 'BatchNorm', 'bottom': 
_getBottomLayers(layer), 'top': bnName,
+   'batch_norm_param': {'moving_average_fraction': 
layer.momentum, 'eps': layer.epsilon}}}, {
+'layer': {'name': layer.name, 'type': 'Scale', 'bottom': 
bnName, 'top': layer.name,
+  'scale_param': {'bias_term': bias_term}}}]
 
 
 # The special are redirected to their custom parse function in _parseKerasLayer
@@ -206,7 +209,8 @@ def getConvParam(layer):
 0]
 config = layer.get_config()
 return {'num_output': layer.filters, 'bias_term': 
str(config['use_bias']).lower(
-), 'kernel_h': layer.kernel_size[0], 'kernel_w': layer.kernel_size[1], 
'stride_h': stride[0], 'stride_w': stride[1], 'pad_h': padding[0], 'pad_w': 
padding[1]}
+), 'kernel_h': layer.kernel_size[0], 'kernel_w': layer.kernel_size[1], 
'stride_h': stride[0], 'stride_w': stride[1],
+'pad_h': padding[0], 'pad_w': padding[1]}
 
 
 def getUpSamplingParam(layer):
@@ -227,11 +231,11 @@ def getPoolingParam(layer, pool='MAX'):
 
 
 def getRecurrentParam(layer):
-if(not layer.use_bias):
+if (not layer.use_bias):
 raise Exception('Only use_bias=True supported for recurrent layers')
-if(keras.activations.serialize(layer.activation) != 'tanh'):
+if (keras.activations.serialize(layer.activation) != 'tanh'):
 raise Exception('Only tanh activation supported for recurrent layers')
-if(layer.dropout != 0 or layer.recurrent_dropout != 0):
+if (layer.dropout != 0 or layer.recurrent_dropout != 0):
 raise Exception('Only dropout not supported for recurrent layers')
 return {'num_output': layer.units, 'return_sequences': str(
 layer.return_sequences).lower()}
@@ -242,27 +246,27 @@ layerParamMapping = {
 keras.layers.InputLayer: lambda l:
 {'data_param': {'batch_size': l.batch_size}},
 keras.layers.Dense: lambda l:
-{'inner_product_param': {'num_output': l.units}},
+{'inner_product_param': {'num_output': l.units}},
 keras.layers.Dropout: lambda l:
-{'dropout_param': {'dropout_ratio': l.rate}},
+{'dropout_param': {'dropout_ratio': l.rate}},
 keras.layers.Add: lambda l:
-{'eltwise_param': {'operation': 'SUM'}},
+{'eltwise_param': {'o

systemml git commit: [SYSTEMML-540] Improved performance of prediction via Keras2DML

2018-12-14 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 3b87c2ba9 -> 341a1dc78


[SYSTEMML-540] Improved performance of prediction via Keras2DML

- Reduced the model loading time of VGG by 1.7x by supporting exchange of 
float32 matrices.
- Eliminated an additional mlcontext execution for converting probability to 
predicted labels. This improved the performance of VGG prediction by 15%.
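
The effect of exchanging float32 instead of float64 buffers can be seen in a
small stand-alone sketch. It uses only Python's `struct` module; the
`DATA_TYPE_*` constants mirror the 0/1/2 codes named in the Java comment in
this commit, while `pack_values` and `unpack_values` are hypothetical helpers
and not part of the SystemML API.

```python
import struct

# Data type codes as in the comment "0: int, 1: float and 2: double".
DATA_TYPE_INT, DATA_TYPE_FLOAT, DATA_TYPE_DOUBLE = 0, 1, 2

# struct format characters for each code (4, 4 and 8 bytes per value).
_FORMATS = {DATA_TYPE_INT: 'i', DATA_TYPE_FLOAT: 'f', DATA_TYPE_DOUBLE: 'd'}


def pack_values(values, data_type):
    """Serialize values into a byte string in native byte order."""
    fmt = '=%d%s' % (len(values), _FORMATS[data_type])
    return struct.pack(fmt, *values)


def unpack_values(data, data_type):
    """Read the byte string back and widen every value to a Python float."""
    code = _FORMATS[data_type]
    count = len(data) // struct.calcsize('=' + code)
    return [float(v) for v in struct.unpack('=%d%s' % (count, code), data)]


values = [1.0, 2.5, -3.0, 0.25]
as_float = pack_values(values, DATA_TYPE_FLOAT)
as_double = pack_values(values, DATA_TYPE_DOUBLE)
print(len(as_float), len(as_double))
```

The float32 buffer is half the size of the float64 one, so half as many bytes
have to cross the Python/JVM boundary; the receiving side widens each value
back to double while filling the dense block.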


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/341a1dc7
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/341a1dc7
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/341a1dc7

Branch: refs/heads/master
Commit: 341a1dc789396ff3e46cf952a75bbe6958b77671
Parents: 3b87c2b
Author: Niketan Pansare 
Authored: Fri Dec 14 09:49:48 2018 -0800
Committer: Niketan Pansare 
Committed: Fri Dec 14 09:49:48 2018 -0800

--
 .../spark/utils/RDDConverterUtilsExt.java   | 35 ++-
 src/main/python/systemml/converters.py  | 27 +---
 src/main/python/tests/test_mlcontext.py | 25 +++
 .../sysml/api/ml/BaseSystemMLClassifier.scala   | 45 +++-
 4 files changed, 95 insertions(+), 37 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/341a1dc7/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtilsExt.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtilsExt.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtilsExt.java
index 4871aee..8db7558 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtilsExt.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtilsExt.java
@@ -126,13 +126,19 @@ public class RDDConverterUtilsExt
return df.select(columns.get(0), 
scala.collection.JavaConversions.asScalaBuffer(columnToSelect).toList());
}
 
-   public static MatrixBlock convertPy4JArrayToMB(byte [] data, long rlen, 
long clen) {
-   return convertPy4JArrayToMB(data, (int)rlen, (int)clen, false);
+   // data_type: 0: int, 1: float and 2: double
+   public static MatrixBlock convertPy4JArrayToMB(byte [] data, long rlen, 
long clen, long dataType) {
+   return convertPy4JArrayToMB(data, (int)rlen, (int)clen, false, 
dataType);
}
 
-   public static MatrixBlock convertPy4JArrayToMB(byte [] data, int rlen, 
int clen) {
-   return convertPy4JArrayToMB(data, rlen, clen, false);
+   public static MatrixBlock convertPy4JArrayToMB(byte [] data, int rlen, 
int clen, int dataType) {
+   return convertPy4JArrayToMB(data, rlen, clen, false, dataType);
}
+   
+   public static MatrixBlock convertPy4JArrayToMB(byte [] data, long rlen, 
long clen, boolean isSparse, long dataType) {
+   return convertPy4JArrayToMB(data, (int) rlen, (int) clen, 
isSparse, dataType);
+   }
+
 
public static MatrixBlock convertSciPyCOOToMB(byte [] data, byte [] 
row, byte [] col, long rlen, long clen, long nnz) {
return convertSciPyCOOToMB(data, row, col, (int)rlen, 
(int)clen, (int)nnz);
@@ -158,10 +164,6 @@ public class RDDConverterUtilsExt
return mb;
}
 
-   public static MatrixBlock convertPy4JArrayToMB(byte [] data, long rlen, 
long clen, boolean isSparse) {
-   return convertPy4JArrayToMB(data, (int) rlen, (int) clen, 
isSparse);
-   }
-
public static MatrixBlock allocateDenseOrSparse(int rlen, int clen, 
boolean isSparse) {
MatrixBlock ret = new MatrixBlock(rlen, clen, isSparse);
ret.allocateBlock();
@@ -195,7 +197,8 @@ public class RDDConverterUtilsExt
ret.examSparsity();
}
 
-   public static MatrixBlock convertPy4JArrayToMB(byte [] data, int rlen, 
int clen, boolean isSparse) {
+   // data_type: 0: int, 1: float and 2: double
+   public static MatrixBlock convertPy4JArrayToMB(byte [] data, int rlen, 
int clen, boolean isSparse, long dataType) {
MatrixBlock mb = new MatrixBlock(rlen, clen, isSparse, -1);
if(isSparse) {
throw new DMLRuntimeException("Convertion to sparse 
format not supported");
@@ -207,9 +210,19 @@ public class RDDConverterUtilsExt
double [] denseBlock = new double[(int) limit];
ByteBuffer buf = ByteBuffer.wrap(data);
buf.order(ByteOrder.nativeOrder());
-   for(int i = 0; i < rlen*clen; i++) {
-   denseBlock[i] = buf.getDouble();
+   if(dataType == 

systemml git commit: [SYSTEMML-2505] Generate the DML for Caffe and Keras models

2018-12-08 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 1a58946a0 -> 7019f3bc8


[SYSTEMML-2505] Generate the DML for Caffe and Keras models

Here is an example:

```
from keras.applications.vgg16 import VGG16
keras_model = VGG16(weights="imagenet", pooling="max")
from systemml.mllearn import Keras2DML
sysml_model = Keras2DML(spark, keras_model, input_shape=(3,224,224), 
weights='weights_dir')
sysml_model.set(test_algo='batch', train_algo='minibatch')
print(sysml_model.get_training_script())
print(sysml_model.get_prediction_script())
```


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/7019f3bc
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/7019f3bc
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/7019f3bc

Branch: refs/heads/master
Commit: 7019f3bc805aaae67ef32e281cf99e26cbd26b29
Parents: 1a58946
Author: Niketan Pansare 
Authored: Sat Dec 8 11:20:09 2018 -0800
Committer: Niketan Pansare 
Committed: Sat Dec 8 11:20:09 2018 -0800

--
 src/main/python/systemml/mllearn/estimators.py | 12 
 src/main/scala/org/apache/sysml/api/dl/Caffe2DML.scala |  3 +++
 2 files changed, 15 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/7019f3bc/src/main/python/systemml/mllearn/estimators.py
--
diff --git a/src/main/python/systemml/mllearn/estimators.py 
b/src/main/python/systemml/mllearn/estimators.py
index 8a100b4..a2647f1 100644
--- a/src/main/python/systemml/mllearn/estimators.py
+++ b/src/main/python/systemml/mllearn/estimators.py
@@ -973,6 +973,18 @@ class Caffe2DML(BaseSystemMLClassifier):
 raise TypeError("parfor_parameters should be a dictionary")
 return self
 
+def get_training_script(self):
+"""
+Return the training DML script
+"""
+return self.estimator.get_training_script()
+
+def get_prediction_script(self):
+"""
+Return the prediction DML script
+"""
+return self.estimator.get_prediction_script()
+
 def summary(self):
 """
 Print the summary of the network

http://git-wip-us.apache.org/repos/asf/systemml/blob/7019f3bc/src/main/scala/org/apache/sysml/api/dl/Caffe2DML.scala
--
diff --git a/src/main/scala/org/apache/sysml/api/dl/Caffe2DML.scala 
b/src/main/scala/org/apache/sysml/api/dl/Caffe2DML.scala
index 8ddb1fe..13f8a65 100644
--- a/src/main/scala/org/apache/sysml/api/dl/Caffe2DML.scala
+++ b/src/main/scala/org/apache/sysml/api/dl/Caffe2DML.scala
@@ -221,6 +221,9 @@ class Caffe2DML(val sc: SparkContext,
 mloutput = baseFit(df, sc)
 new Caffe2DMLModel(this)
   }
+  // Public methods to be called from the Python APIs:
+  def get_training_script():String = getTrainingScript(true)._1.getScriptString
+  def get_prediction_script():String = new 
Caffe2DMLModel(this).getPredictionScript(true)._1.getScriptString
   // --
   // Returns true if last 2 of 4 dimensions are 1.
   // The first dimension refers to number of input datapoints.



systemml git commit: [MINOR] Updated the Linear Regression demo notebook

2018-12-07 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master c3fdbb4da -> bda61b600


[MINOR] Updated the Linear Regression demo notebook

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/bda61b60
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/bda61b60
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/bda61b60

Branch: refs/heads/master
Commit: bda61b600a05e71be84848377b3e9ae93811c4d4
Parents: c3fdbb4
Author: Niketan Pansare 
Authored: Fri Dec 7 15:31:48 2018 -0800
Committer: Niketan Pansare 
Committed: Fri Dec 7 15:31:48 2018 -0800

--
 .../Linear Regression Algorithms Demo.ipynb | 595 ---
 .../Linear_Regression_Algorithms_Demo.ipynb | 582 ++
 2 files changed, 582 insertions(+), 595 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/bda61b60/samples/jupyter-notebooks/Linear
 Regression Algorithms Demo.ipynb
--
diff --git a/samples/jupyter-notebooks/Linear Regression Algorithms Demo.ipynb 
b/samples/jupyter-notebooks/Linear Regression Algorithms Demo.ipynb
deleted file mode 100644
index 001f402..000
--- a/samples/jupyter-notebooks/Linear Regression Algorithms Demo.ipynb 
+++ /dev/null
@@ -1,595 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-"# Linear Regression Algorithms using Apache SystemML\n",
-"\n",
-"This notebook shows:\n",
-"- Install SystemML Python package and jar file\n",
-"  - pip\n",
-"  - SystemML 'Hello World'\n",
-"- Example 1: Matrix Multiplication\n",
-"  - SystemML script to generate a random matrix, perform matrix 
multiplication, and compute the sum of the output\n",
-"  - Examine execution plans, and increase data size to obverve changed 
execution plans\n",
-"- Load diabetes dataset from scikit-learn\n",
-"- Example 2: Implement three different algorithms to train linear 
regression model\n",
-"  - Algorithm 1: Linear Regression - Direct Solve (no regularization)\n",
-"  - Algorithm 2: Linear Regression - Batch Gradient Descent (no 
regularization)\n",
-"  - Algorithm 3: Linear Regression - Conjugate Gradient (no 
regularization)\n",
-"- Example 3: Invoke existing SystemML algorithm script LinearRegDS.dml 
using MLContext API\n",
-"- Example 4: Invoke existing SystemML algorithm using 
scikit-learn/SparkML pipeline like API\n",
-"- Uninstall/Clean up SystemML Python package and jar file"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-"# Install SystemML Python package and jar file"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-"collapsed": false
-   },
-   "outputs": [],
-   "source": [
-"!pip uninstall systemml --y\n",
-"!pip install --user 
https://repository.apache.org/content/groups/snapshots/org/apache/systemml/systemml/1.0.0-SNAPSHOT/systemml-1.0.0-20171201.070207-23-python.tar.gz;
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-"collapsed": false
-   },
-   "outputs": [],
-   "source": [
-"!pip show systemml"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-"### Import SystemML API "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-"collapsed": false
-   },
-   "outputs": [],
-   "source": [
-"from systemml import MLContext, dml, dmlFromResource\n",
-"\n",
-"ml = MLContext(sc)\n",
-"\n",
-"print \"Spark Version:\", sc.version\n",
-"print \"SystemML Version:\", ml.version()\n",
-"print \"SystemML Built-Time:\", ml.buildTime()"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-"collapsed": false
-   },
-   "outputs": [],
-   "source": [
-"ml.execute(dml(\"\"\"s = 'Hello World!'\"\"\").output(\"s\")).get(\"s\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-"### Import numpy, sklearn, and define some helper functions"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-"collapsed": true
-   },
-   "outputs": [],
-   "source": [
-"import matplotlib.pyplot as plt\n",
-"import numpy as np\n",
-"from sklearn import datasets\n",
-"plt.switch_backend('agg')"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-"# Example 1: Matrix Multiplication"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-"### SystemML script to generate a random matrix, perform matrix 
multiplication, and compute the sum of the output"
-   ]
-  },
-  {
-   "cell_type": 

[1/4] systemml git commit: [BUGFIX] Revert all the files of commit 95cbbd6

2018-12-07 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 25a10f412 -> c3fdbb4da


http://git-wip-us.apache.org/repos/asf/systemml/blob/c3fdbb4d/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtils.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtils.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtils.java
index c8a0d3e..f775e92 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtils.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtils.java
@@ -71,7 +71,6 @@ import org.apache.sysml.runtime.util.DataConverter;
 import org.apache.sysml.runtime.util.FastStringTokenizer;
 import org.apache.sysml.runtime.util.MapReduceTool;
 import org.apache.sysml.runtime.util.UtilFunctions;
-import org.apache.sysml.utils.IntUtils;
 
 import scala.Tuple2;
 
@@ -272,7 +271,7 @@ public class RDDConverterUtils
//slice blocks into rows, align and convert into data frame rows
JavaRDD rowsRDD = in
.flatMapToPair(new 
SliceBinaryBlockToRowsFunction(mc.getRowsPerBlock()))
-   .groupByKey().map(new 
ConvertRowBlocksToRows(IntUtils.toInt(mc.getCols()), mc.getColsPerBlock(), 
toVector));
+   .groupByKey().map(new 
ConvertRowBlocksToRows((int)mc.getCols(), mc.getColsPerBlock(), toVector));

//create data frame schema
List fields = new ArrayList<>();
@@ -323,7 +322,7 @@ public class RDDConverterUtils
MapReduceTool.deleteFileIfExistOnHDFS(pathY);

//convert libsvm to labeled points
-   int numFeatures = IntUtils.toInt( mcOutX.getCols() );
+   int numFeatures = (int) mcOutX.getCols();
int numPartitions = 
SparkUtils.getNumPreferredPartitions(mcOutX, null);
JavaRDD 
lpoints = 
MLUtils.loadLibSVMFile(sc.sc(), pathIn, 
numFeatures, numPartitions).toJavaRDD();
@@ -486,7 +485,7 @@ public class RDDConverterUtils
_bclen = mc.getColsPerBlock();

//determine upper bounded buffer len
-   _bufflen = IntUtils.toInt( Math.min(_rlen*_clen, 
BUFFER_SIZE) );
+   _bufflen = (int) Math.min(_rlen*_clen, BUFFER_SIZE);
}
 
protected void flushBufferToList( ReblockBuffer rbuff,  
ArrayList> ret ) 
@@ -703,7 +702,7 @@ public class RDDConverterUtils
{
ArrayList> ret = new 
ArrayList<>();
 
-   int ncblks = 
IntUtils.toInt(Math.ceil((double)_clen/_bclen));
+   int ncblks = (int)Math.ceil((double)_clen/_bclen);
MatrixIndexes[] ix = new MatrixIndexes[ncblks];
MatrixBlock[] mb = new MatrixBlock[ncblks];

@@ -725,7 +724,7 @@ public class RDDConverterUtils
if( ix[0] !=null )
flushBlocksToList(ix, mb, ret);
long len = 
UtilFunctions.computeBlockSize(_rlen, rix, _brlen);
-   createBlocks(rowix, 
IntUtils.toInt(len), ix, mb);
+   createBlocks(rowix, (int)len, ix, mb);
}

//process row data
@@ -733,7 +732,7 @@ public class RDDConverterUtils
boolean emptyFound = false;
for( int cix=1, pix=0; cix<=ncblks; cix++ ) 
{
-   int lclen = 
IntUtils.toInt(UtilFunctions.computeBlockSize(_clen, cix, _bclen));
+   int lclen = 
(int)UtilFunctions.computeBlockSize(_clen, cix, _bclen);
if( mb[cix-1].isInSparseFormat() ) {
//allocate row once (avoid 
re-allocations)
int lnnz = 
IOUtilFunctions.countNnz(parts, pix, lclen);
@@ -763,13 +762,13 @@ public class RDDConverterUtils
{
//compute row block index and number of column blocks
long rix = UtilFunctions.computeBlockIndex(rowix, 
_brlen);
-   int ncblks = 
IntUtils.toInt(Math.ceil((double)_clen/_bclen));
+   int ncblks = (int)Math.ceil((double)_clen/_bclen);

//create all column blocks (assume dense since csv is 

[2/4] systemml git commit: [BUGFIX] Revert all the files of commit 95cbbd6

2018-12-07 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/c3fdbb4d/src/main/java/org/apache/sysml/runtime/instructions/spark/AppendGSPInstruction.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/AppendGSPInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/AppendGSPInstruction.java
index 4b9c2e0..3ecb91f 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/AppendGSPInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/AppendGSPInstruction.java
@@ -41,7 +41,6 @@ import org.apache.sysml.runtime.matrix.data.MatrixIndexes;
 import org.apache.sysml.runtime.matrix.operators.Operator;
 import org.apache.sysml.runtime.matrix.operators.ReorgOperator;
 import org.apache.sysml.runtime.util.UtilFunctions;
-import org.apache.sysml.utils.IntUtils;
 
 public class AppendGSPInstruction extends BinarySPInstruction {
private boolean _cbind = true;
@@ -165,8 +164,8 @@ public class AppendGSPInstruction extends 
BinarySPInstruction {
_cbind = cbind;
_startIx = cbind ? 
UtilFunctions.computeBlockIndex(mc1.getCols(), mc1.getColsPerBlock()) :
UtilFunctions.computeBlockIndex(mc1.getRows(), 
mc1.getRowsPerBlock());
-   _blen = IntUtils.toInt(cbind ? mc1.getColsPerBlock() : 
mc1.getRowsPerBlock());
-   _shiftBy = IntUtils.toInt(cbind ? mc1.getCols()%_blen : 
mc1.getRows()%_blen); 
+   _blen = (int) (cbind ? mc1.getColsPerBlock() : 
mc1.getRowsPerBlock());
+   _shiftBy = (int) (cbind ? mc1.getCols()%_blen : 
mc1.getRows()%_blen); 
_outlen = cbind ? mc1.getCols()+mc2.getCols() : 
mc1.getRows()+mc2.getRows();
}
 

http://git-wip-us.apache.org/repos/asf/systemml/blob/c3fdbb4d/src/main/java/org/apache/sysml/runtime/instructions/spark/CastSPInstruction.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/CastSPInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/CastSPInstruction.java
index 6917a7c..3ff878a 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/CastSPInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/CastSPInstruction.java
@@ -36,7 +36,6 @@ import org.apache.sysml.runtime.matrix.data.MatrixBlock;
 import org.apache.sysml.runtime.matrix.data.MatrixIndexes;
 import org.apache.sysml.runtime.matrix.operators.Operator;
 import org.apache.sysml.runtime.util.UtilFunctions;
-import org.apache.sysml.utils.IntUtils;
 
 public class CastSPInstruction extends UnarySPInstruction {
 
@@ -87,7 +86,7 @@ public class CastSPInstruction extends UnarySPInstruction {
//update schema information for output frame
if( opcode.equals(UnaryCP.CAST_AS_FRAME_OPCODE) ) {
sec.getFrameObject(output.getName()).setSchema(
-   
UtilFunctions.nCopies(IntUtils.toInt(mcIn.getCols()), ValueType.DOUBLE));
+   UtilFunctions.nCopies((int)mcIn.getCols(), 
ValueType.DOUBLE));
}
}
 }

http://git-wip-us.apache.org/repos/asf/systemml/blob/c3fdbb4d/src/main/java/org/apache/sysml/runtime/instructions/spark/CentralMomentSPInstruction.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/CentralMomentSPInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/CentralMomentSPInstruction.java
index 2b6575a..f25899f 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/CentralMomentSPInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/CentralMomentSPInstruction.java
@@ -40,7 +40,6 @@ import org.apache.sysml.runtime.matrix.data.MatrixBlock;
 import org.apache.sysml.runtime.matrix.data.MatrixIndexes;
 import org.apache.sysml.runtime.matrix.operators.CMOperator;
 import 
org.apache.sysml.runtime.matrix.operators.CMOperator.AggregateOperationTypes;
-import org.apache.sysml.utils.IntUtils;
 
 public class CentralMomentSPInstruction extends UnarySPInstruction {
 
@@ -105,7 +104,7 @@ public class CentralMomentSPInstruction extends 
UnarySPInstruction {
ScalarObject order = ec.getScalarInput(scalarInput.getName(), 
scalarInput.getValueType(), scalarInput.isLiteral()); 
CMOperator cop = ((CMOperator)_optr); 
if ( cop.getAggOpType() == AggregateOperationTypes.INVALID ) {
-   cop.setCMAggOp(IntUtils.toInt(order.getLongValue()));
+   cop.setCMAggOp((int)order.getLongValue());
}

//get input


[3/4] systemml git commit: [BUGFIX] Revert all the files of commit 95cbbd6

2018-12-07 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/c3fdbb4d/src/main/java/org/apache/sysml/runtime/controlprogram/parfor/RemoteDPParWorkerReducer.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/controlprogram/parfor/RemoteDPParWorkerReducer.java
 
b/src/main/java/org/apache/sysml/runtime/controlprogram/parfor/RemoteDPParWorkerReducer.java
index 9352508..11d1ed9 100644
--- 
a/src/main/java/org/apache/sysml/runtime/controlprogram/parfor/RemoteDPParWorkerReducer.java
+++ 
b/src/main/java/org/apache/sysml/runtime/controlprogram/parfor/RemoteDPParWorkerReducer.java
@@ -48,7 +48,6 @@ import 
org.apache.sysml.runtime.matrix.mapred.MRConfigurationNames;
 import org.apache.sysml.runtime.matrix.mapred.MRJobConfiguration;
 import org.apache.sysml.runtime.util.LocalFileUtils;
 import org.apache.sysml.runtime.util.ProgramConverter;
-import org.apache.sysml.utils.IntUtils;
 import org.apache.sysml.utils.Statistics;
 
 public class RemoteDPParWorkerReducer extends ParWorker
@@ -121,8 +120,8 @@ public class RemoteDPParWorkerReducer extends ParWorker
_dpf = MRJobConfiguration.getPartitioningFormat( job );
MatrixCharacteristics mc = 
MRJobConfiguration.getPartitionedMatrixSize(job);
PartitionFormat pf = new PartitionFormat(_dpf, 
MRJobConfiguration.getPartitioningSizeN(job));
-   _rlen = IntUtils.toInt(pf.getNumRows(mc));
-   _clen = IntUtils.toInt(pf.getNumColumns(mc));
+   _rlen = (int)pf.getNumRows(mc);
+   _clen = (int)pf.getNumColumns(mc);
_brlen = mc.getRowsPerBlock();
_bclen = mc.getColsPerBlock();
_iterVar = MRJobConfiguration.getPartitioningItervar( job );
@@ -130,9 +129,9 @@ public class RemoteDPParWorkerReducer extends ParWorker
_info = MRJobConfiguration.getPartitioningOutputInfo( job );
_tSparseCol = MRJobConfiguration.getPartitioningTransposedCol( 
job ); 
if( _tSparseCol )
-   _partition = new MatrixBlock(IntUtils.toInt(_clen), 
_rlen, true);
+   _partition = new MatrixBlock((int)_clen, _rlen, true);
else
-   _partition = new MatrixBlock(IntUtils.toInt(_rlen), 
_clen, false);
+   _partition = new MatrixBlock((int)_rlen, _clen, false);
 
//Step 1: configure parworker
String taskID = job.get(MRConfigurationNames.MR_TASK_ID);
@@ -153,7 +152,7 @@ public class RemoteDPParWorkerReducer extends ParWorker

//create local runtime program
String in = MRJobConfiguration.getProgramBlocks(job);
-   ParForBody body = ProgramConverter.parseParForBody(in, 
IntUtils.toInt(_workerID));
+   ParForBody body = ProgramConverter.parseParForBody(in, 
(int)_workerID);
_childBlocks = body.getChildBlocks();
_ec  = body.getEc();
_resultVars  = body.getResultVariables();
@@ -243,8 +242,8 @@ public class RemoteDPParWorkerReducer extends ParWorker
while( valueList.hasNext() )
{
PairWritableBlock pairValue = 
(PairWritableBlock)valueList.next();
-   int row_offset = 
IntUtils.toInt(pairValue.indexes.getRowIndex()-1)*_brlen;
-   int col_offset = 
IntUtils.toInt(pairValue.indexes.getColumnIndex()-1)*_bclen;
+   int row_offset = 
(int)(pairValue.indexes.getRowIndex()-1)*_brlen;
+   int col_offset = 
(int)(pairValue.indexes.getColumnIndex()-1)*_bclen;
MatrixBlock block = pairValue.block;
if( !_partition.isInSparseFormat() ) //DENSE
{
@@ -298,7 +297,7 @@ public class RemoteDPParWorkerReducer extends ParWorker
PairWritableCell pairValue = 
(PairWritableCell)valueList.next();
if( 
pairValue.indexes.getColumnIndex()<0 )
continue; //cells used to 
ensure empty partitions
-   _partition.quickSetValue(0, 
IntUtils.toInt(pairValue.indexes.getColumnIndex()-1), 
pairValue.cell.getValue());
+   _partition.quickSetValue(0, 
(int)pairValue.indexes.getColumnIndex()-1, pairValue.cell.getValue());
}
break;
case COLUMN_WISE:
@@ -308,9 +307,9 @@ public class RemoteDPParWorkerReducer extends ParWorker
if( pairValue.indexes.getRowIndex()<0 )
   

[4/4] systemml git commit: [BUGFIX] Revert all the files of commit 95cbbd6

2018-12-07 Thread niketanpansare
[BUGFIX] Revert all the files of commit 95cbbd6

- This also contains an instruction-parsing-related bugfix introduced by 
commit 25a10f4.
- However, there is also an NPE from recent commits related to the Spark 
bcumoffk instruction. This error will be fixed in later commits, as it is 
independent of commit 95cbbd6.

Closes #851.


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/c3fdbb4d
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/c3fdbb4d
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/c3fdbb4d

Branch: refs/heads/master
Commit: c3fdbb4da7c6cac4d363b31366ae21ef976cde92
Parents: 25a10f4
Author: Niketan Pansare 
Authored: Fri Dec 7 10:51:36 2018 -0800
Committer: Niketan Pansare 
Committed: Fri Dec 7 10:51:36 2018 -0800

--
 .../api/mlcontext/MLContextConversionUtil.java  |  5 +-
 src/main/java/org/apache/sysml/hops/DnnOp.java  | 17 ++--
 .../java/org/apache/sysml/hops/IndexingOp.java  |  5 +-
 .../org/apache/sysml/hops/OptimizerUtils.java   | 13 ++-
 .../sysml/hops/codegen/cplan/CNodeNary.java |  5 +-
 .../hops/recompile/LiteralReplacement.java  |  9 +-
 .../java/org/apache/sysml/lops/compile/Dag.java | 15 ++--
 .../apache/sysml/lops/runtime/RunMRJobs.java|  3 +-
 .../controlprogram/ParForProgramBlock.java  | 19 ++--
 .../controlprogram/caching/ByteBuffer.java  |  5 +-
 .../controlprogram/caching/CacheStatistics.java | 14 +--
 .../controlprogram/caching/FrameObject.java | 15 ++--
 .../controlprogram/caching/MatrixObject.java| 19 ++--
 .../context/ExecutionContext.java   |  3 +-
 .../context/SparkExecutionContext.java  | 31 ---
 .../controlprogram/paramserv/LocalPSWorker.java |  3 +-
 .../paramserv/ParamservUtils.java   |  5 +-
 .../paramserv/dp/DRLocalScheme.java |  3 +-
 .../paramserv/dp/DRSparkScheme.java |  5 +-
 .../paramserv/dp/DataPartitionSparkScheme.java  |  3 +-
 .../dp/DataPartitionerSparkAggregator.java  |  9 +-
 .../paramserv/dp/SparkDataPartitioner.java  |  7 +-
 .../paramserv/rpc/PSRpcObject.java  |  3 +-
 .../controlprogram/parfor/DataPartitioner.java  |  7 +-
 .../parfor/DataPartitionerLocal.java| 25 +++---
 .../parfor/DataPartitionerRemoteMR.java |  3 +-
 .../parfor/DataPartitionerRemoteMapper.java | 13 ++-
 .../parfor/DataPartitionerRemoteSpark.java  |  3 +-
 .../DataPartitionerRemoteSparkMapper.java   |  5 +-
 .../controlprogram/parfor/RemoteDPParForMR.java | 23 +++--
 .../parfor/RemoteDPParForSpark.java |  5 +-
 .../parfor/RemoteDPParForSparkWorker.java   | 19 ++--
 .../parfor/RemoteDPParWorkerReducer.java| 21 +++--
 .../RemoteParForColocatedNLineInputFormat.java  |  3 +-
 .../controlprogram/parfor/RemoteParForMR.java   | 19 ++--
 .../parfor/RemoteParForSparkWorker.java |  5 +-
 .../parfor/RemoteParWorkerMapper.java   |  3 +-
 .../parfor/ResultMergeLocalFile.java|  5 +-
 .../parfor/ResultMergeLocalMemory.java  |  5 +-
 .../parfor/ResultMergeRemoteMR.java |  3 +-
 .../parfor/ResultMergeRemotePartitioning.java   |  5 +-
 .../parfor/ResultMergeRemoteSpark.java  |  5 +-
 .../parfor/TaskPartitionerStatic.java   |  3 +-
 .../instructions/CPInstructionParser.java   |  4 +-
 .../instructions/SPInstructionParser.java   |  6 +-
 .../cp/CentralMomentCPInstruction.java  |  3 +-
 .../instructions/cp/CtableCPInstruction.java|  5 +-
 .../instructions/cp/DataGenCPInstruction.java   |  5 +-
 .../cp/DataPartitionCPInstruction.java  |  5 +-
 .../instructions/cp/DnnCPInstruction.java   | 11 ++-
 .../cp/FrameIndexingCPInstruction.java  |  3 +-
 .../instructions/cp/IndexingCPInstruction.java  |  9 +-
 .../cp/ListIndexingCPInstruction.java   |  9 +-
 .../runtime/instructions/cp/ListObject.java |  5 +-
 .../cp/MatrixIndexingCPInstruction.java |  7 +-
 .../cp/MatrixReshapeCPInstruction.java  |  5 +-
 .../instructions/cp/PMMJCPInstruction.java  |  3 +-
 .../cp/ParameterizedBuiltinCPInstruction.java   |  3 +-
 .../cp/ParamservBuiltinCPInstruction.java   |  3 +-
 .../instructions/cp/ReorgCPInstruction.java |  3 +-
 .../cp/StringInitCPInstruction.java |  7 +-
 .../gpu/AggregateBinaryGPUInstruction.java  |  5 +-
 .../gpu/AggregateUnaryGPUInstruction.java   |  5 +-
 .../instructions/gpu/DnnGPUInstruction.java | 58 ++--
 .../instructions/gpu/MMTSJGPUInstruction.java   |  3 +-
 .../gpu/MatrixIndexingGPUInstruction.java   |  9 +-
 .../MatrixMatrixArithmeticGPUInstruction.java   |  3 +-
 .../gpu/MatrixMatrixAxpyGPUInstruction.java |  3 +-
 ...rixMatrixRelationalBinaryGPUInstruction.java |  3 +-
 .../gpu/MatrixReshapeGPUInstruction.java|  5 +-
 .../instructions/gpu/ReorgGPUInstruction.java   |  5 +-
 

[2/4] systemml git commit: [MINOR] Throw an exception if incorrect long-to-int conversion occurs

2018-11-30 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/95cbbd65/src/main/java/org/apache/sysml/runtime/instructions/spark/CentralMomentSPInstruction.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/CentralMomentSPInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/CentralMomentSPInstruction.java
index f25899f..2b6575a 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/CentralMomentSPInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/CentralMomentSPInstruction.java
@@ -40,6 +40,7 @@ import org.apache.sysml.runtime.matrix.data.MatrixBlock;
 import org.apache.sysml.runtime.matrix.data.MatrixIndexes;
 import org.apache.sysml.runtime.matrix.operators.CMOperator;
 import 
org.apache.sysml.runtime.matrix.operators.CMOperator.AggregateOperationTypes;
+import org.apache.sysml.utils.IntUtils;
 
 public class CentralMomentSPInstruction extends UnarySPInstruction {
 
@@ -104,7 +105,7 @@ public class CentralMomentSPInstruction extends 
UnarySPInstruction {
ScalarObject order = ec.getScalarInput(scalarInput.getName(), 
scalarInput.getValueType(), scalarInput.isLiteral()); 
CMOperator cop = ((CMOperator)_optr); 
if ( cop.getAggOpType() == AggregateOperationTypes.INVALID ) {
-   cop.setCMAggOp((int)order.getLongValue());
+   cop.setCMAggOp(IntUtils.toInt(order.getLongValue()));
}

//get input

http://git-wip-us.apache.org/repos/asf/systemml/blob/95cbbd65/src/main/java/org/apache/sysml/runtime/instructions/spark/CpmmSPInstruction.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/CpmmSPInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/CpmmSPInstruction.java
index 308e60f..d83a6ed 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/CpmmSPInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/CpmmSPInstruction.java
@@ -49,6 +49,7 @@ import 
org.apache.sysml.runtime.matrix.operators.AggregateBinaryOperator;
 import org.apache.sysml.runtime.matrix.operators.AggregateOperator;
 import org.apache.sysml.runtime.matrix.operators.Operator;
 import org.apache.sysml.runtime.matrix.operators.ReorgOperator;
+import org.apache.sysml.utils.IntUtils;
 
 /**
  * Cpmm: cross-product matrix multiplication operation (distributed matrix 
multiply
@@ -165,9 +166,9 @@ public class CpmmSPInstruction extends BinarySPInstruction {
}

private static int getMaxParJoin(MatrixCharacteristics mc1, 
MatrixCharacteristics mc2) {
-   return mc1.colsKnown() ? (int)mc1.getNumColBlocks() :
-   mc2.rowsKnown() ? (int)mc2.getNumRowBlocks() :
-   Integer.MAX_VALUE;
+   return IntUtils.toInt(mc1.colsKnown() ? mc1.getNumColBlocks() :
+   mc2.rowsKnown() ? mc2.getNumRowBlocks() :
+   Integer.MAX_VALUE);
}
 
private static class CpmmIndexFunction implements 
PairFunction, Long, IndexedMatrixValue>

http://git-wip-us.apache.org/repos/asf/systemml/blob/95cbbd65/src/main/java/org/apache/sysml/runtime/instructions/spark/CumulativeAggregateSPInstruction.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/CumulativeAggregateSPInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/CumulativeAggregateSPInstruction.java
index 68cc6db..a0dfb85 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/CumulativeAggregateSPInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/CumulativeAggregateSPInstruction.java
@@ -39,6 +39,7 @@ import org.apache.sysml.runtime.matrix.data.MatrixIndexes;
 import org.apache.sysml.runtime.matrix.data.OperationsOnMatrixValues;
 import org.apache.sysml.runtime.matrix.operators.AggregateUnaryOperator;
 import org.apache.sysml.runtime.matrix.operators.UnaryOperator;
+import org.apache.sysml.utils.IntUtils;
 
 public class CumulativeAggregateSPInstruction extends 
AggregateUnarySPInstruction {
 
@@ -76,7 +77,7 @@ public class CumulativeAggregateSPInstruction extends 
AggregateUnarySPInstructio
//merge partial aggregates, adjusting for correct number of 
partitions
//as size can significant shrink (1K) but also grow 
(sparse-dense)
int numParts = SparkUtils.getNumPreferredPartitions(mcOut);
-   int minPar = 
(int)Math.min(SparkExecutionContext.getDefaultParallelism(true), 
mcOut.getNumBlocks());
+   int minPar = 
IntUtils.toInt(Math.min(SparkExecutionContext.getDefaultParallelism(true), 

[4/4] systemml git commit: [MINOR] Throw an exception if incorrect long-to-int conversion occurs

2018-11-30 Thread niketanpansare
[MINOR] Throw an exception if incorrect long-to-int conversion occurs

- Without this fix, an overflowing long-to-int conversion is silently ignored.
The wrong value propagates further down the engine and either surfaces as an
unrelated error such as "Invalid block dimensions" or yields an incorrect result.
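
For illustration, a checked narrowing conversion of the kind this commit
describes could look like the sketch below. The class name CheckedCast and the
choice of ArithmeticException are assumptions made for this example; this is
not the actual org.apache.sysml.utils.IntUtils source, which is not shown in
this thread.

```java
// Hypothetical helper for illustration; not the actual IntUtils implementation.
final class CheckedCast {
    private CheckedCast() {}

    /**
     * Narrows a long to an int. A plain (int) cast keeps only the low 32 bits,
     * so out-of-range values silently wrap; this method throws instead.
     */
    static int toInt(long value) {
        if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
            throw new ArithmeticException("long-to-int overflow: " + value);
        }
        return (int) value;
    }
}
```

With a plain cast, (int) 3_000_000_000L evaluates to -1294967296 without any
error, whereas CheckedCast.toInt(3_000_000_000L) fails at the call site. The
JDK's Math.toIntExact(long) offers the same check for long inputs.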

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/95cbbd65
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/95cbbd65
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/95cbbd65

Branch: refs/heads/master
Commit: 95cbbd656b9c2c85b79536dd5175ce49ff0c1d22
Parents: bc6e941
Author: Niketan Pansare 
Authored: Fri Nov 30 13:58:46 2018 -0800
Committer: Niketan Pansare 
Committed: Fri Nov 30 13:59:26 2018 -0800

--
 .../api/mlcontext/MLContextConversionUtil.java  |  5 +-
 src/main/java/org/apache/sysml/hops/DnnOp.java  | 17 ++--
 .../java/org/apache/sysml/hops/IndexingOp.java  |  5 +-
 .../org/apache/sysml/hops/OptimizerUtils.java   | 13 +--
 .../sysml/hops/codegen/cplan/CNodeNary.java |  5 +-
 .../hops/recompile/LiteralReplacement.java  |  9 +-
 .../java/org/apache/sysml/lops/compile/Dag.java | 15 ++--
 .../apache/sysml/lops/runtime/RunMRJobs.java|  3 +-
 .../controlprogram/ParForProgramBlock.java  | 19 ++--
 .../controlprogram/caching/ByteBuffer.java  |  5 +-
 .../controlprogram/caching/CacheStatistics.java | 14 +--
 .../controlprogram/caching/FrameObject.java | 15 ++--
 .../controlprogram/caching/MatrixObject.java| 19 ++--
 .../context/ExecutionContext.java   |  3 +-
 .../context/SparkExecutionContext.java  | 31 +++
 .../controlprogram/paramserv/LocalPSWorker.java |  3 +-
 .../paramserv/ParamservUtils.java   |  5 +-
 .../paramserv/dp/DRLocalScheme.java |  3 +-
 .../paramserv/dp/DRSparkScheme.java |  5 +-
 .../paramserv/dp/DataPartitionSparkScheme.java  |  3 +-
 .../dp/DataPartitionerSparkAggregator.java  |  9 +-
 .../paramserv/dp/SparkDataPartitioner.java  |  7 +-
 .../paramserv/rpc/PSRpcObject.java  |  3 +-
 .../controlprogram/parfor/DataPartitioner.java  |  7 +-
 .../parfor/DataPartitionerLocal.java| 25 +++---
 .../parfor/DataPartitionerRemoteMR.java |  3 +-
 .../parfor/DataPartitionerRemoteMapper.java | 13 +--
 .../parfor/DataPartitionerRemoteSpark.java  |  3 +-
 .../DataPartitionerRemoteSparkMapper.java   |  5 +-
 .../controlprogram/parfor/RemoteDPParForMR.java | 23 ++---
 .../parfor/RemoteDPParForSpark.java |  5 +-
 .../parfor/RemoteDPParForSparkWorker.java   | 19 ++--
 .../parfor/RemoteDPParWorkerReducer.java| 21 ++---
 .../RemoteParForColocatedNLineInputFormat.java  |  3 +-
 .../controlprogram/parfor/RemoteParForMR.java   | 19 ++--
 .../parfor/RemoteParForSparkWorker.java |  5 +-
 .../parfor/RemoteParWorkerMapper.java   |  3 +-
 .../parfor/ResultMergeLocalFile.java|  5 +-
 .../parfor/ResultMergeLocalMemory.java  |  5 +-
 .../parfor/ResultMergeRemoteMR.java |  3 +-
 .../parfor/ResultMergeRemotePartitioning.java   |  5 +-
 .../parfor/ResultMergeRemoteSpark.java  |  5 +-
 .../parfor/TaskPartitionerStatic.java   |  3 +-
 .../cp/CentralMomentCPInstruction.java  |  3 +-
 .../instructions/cp/CtableCPInstruction.java|  5 +-
 .../instructions/cp/DataGenCPInstruction.java   |  5 +-
 .../cp/DataPartitionCPInstruction.java  |  5 +-
 .../instructions/cp/DnnCPInstruction.java   | 11 +--
 .../cp/FrameIndexingCPInstruction.java  |  3 +-
 .../instructions/cp/IndexingCPInstruction.java  |  9 +-
 .../cp/ListIndexingCPInstruction.java   |  9 +-
 .../runtime/instructions/cp/ListObject.java |  5 +-
 .../cp/MatrixIndexingCPInstruction.java |  7 +-
 .../cp/MatrixReshapeCPInstruction.java  |  5 +-
 .../instructions/cp/PMMJCPInstruction.java  |  3 +-
 .../cp/ParameterizedBuiltinCPInstruction.java   |  3 +-
 .../cp/ParamservBuiltinCPInstruction.java   |  3 +-
 .../instructions/cp/ReorgCPInstruction.java |  3 +-
 .../cp/StringInitCPInstruction.java |  7 +-
 .../gpu/AggregateBinaryGPUInstruction.java  |  5 +-
 .../gpu/AggregateUnaryGPUInstruction.java   |  5 +-
 .../instructions/gpu/DnnGPUInstruction.java | 58 ++--
 .../instructions/gpu/MMTSJGPUInstruction.java   |  3 +-
 .../gpu/MatrixIndexingGPUInstruction.java   |  9 +-
 .../MatrixMatrixArithmeticGPUInstruction.java   |  3 +-
 .../gpu/MatrixMatrixAxpyGPUInstruction.java |  3 +-
 ...rixMatrixRelationalBinaryGPUInstruction.java |  3 +-
 .../gpu/MatrixReshapeGPUInstruction.java|  5 +-
 .../instructions/gpu/ReorgGPUInstruction.java   |  5 +-
 .../ScalarMatrixArithmeticGPUInstruction.java   |  5 +-
 ...larMatrixRelationalBinaryGPUInstruction.java |  5 +-
 .../mr/AggregateBinaryInstruction.java  |  

[1/4] systemml git commit: [MINOR] Throw an exception if incorrect long-to-int conversion occurs

2018-11-30 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master bc6e941ce -> 95cbbd656


http://git-wip-us.apache.org/repos/asf/systemml/blob/95cbbd65/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtils.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtils.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtils.java
index f775e92..c8a0d3e 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtils.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDConverterUtils.java
@@ -71,6 +71,7 @@ import org.apache.sysml.runtime.util.DataConverter;
 import org.apache.sysml.runtime.util.FastStringTokenizer;
 import org.apache.sysml.runtime.util.MapReduceTool;
 import org.apache.sysml.runtime.util.UtilFunctions;
+import org.apache.sysml.utils.IntUtils;
 
 import scala.Tuple2;
 
@@ -271,7 +272,7 @@ public class RDDConverterUtils
//slice blocks into rows, align and convert into data frame rows
JavaRDD rowsRDD = in
.flatMapToPair(new 
SliceBinaryBlockToRowsFunction(mc.getRowsPerBlock()))
-   .groupByKey().map(new 
ConvertRowBlocksToRows((int)mc.getCols(), mc.getColsPerBlock(), toVector));
+   .groupByKey().map(new 
ConvertRowBlocksToRows(IntUtils.toInt(mc.getCols()), mc.getColsPerBlock(), 
toVector));

//create data frame schema
List fields = new ArrayList<>();
@@ -322,7 +323,7 @@ public class RDDConverterUtils
MapReduceTool.deleteFileIfExistOnHDFS(pathY);

//convert libsvm to labeled points
-   int numFeatures = (int) mcOutX.getCols();
+   int numFeatures = IntUtils.toInt( mcOutX.getCols() );
int numPartitions = 
SparkUtils.getNumPreferredPartitions(mcOutX, null);
JavaRDD 
lpoints = 
MLUtils.loadLibSVMFile(sc.sc(), pathIn, 
numFeatures, numPartitions).toJavaRDD();
@@ -485,7 +486,7 @@ public class RDDConverterUtils
_bclen = mc.getColsPerBlock();

//determine upper bounded buffer len
-   _bufflen = (int) Math.min(_rlen*_clen, BUFFER_SIZE);
+   _bufflen = IntUtils.toInt( Math.min(_rlen*_clen, 
BUFFER_SIZE) );
}
 
protected void flushBufferToList( ReblockBuffer rbuff,  
ArrayList> ret ) 
@@ -702,7 +703,7 @@ public class RDDConverterUtils
{
ArrayList> ret = new 
ArrayList<>();
 
-   int ncblks = (int)Math.ceil((double)_clen/_bclen);
+   int ncblks = 
IntUtils.toInt(Math.ceil((double)_clen/_bclen));
MatrixIndexes[] ix = new MatrixIndexes[ncblks];
MatrixBlock[] mb = new MatrixBlock[ncblks];

@@ -724,7 +725,7 @@ public class RDDConverterUtils
if( ix[0] !=null )
flushBlocksToList(ix, mb, ret);
long len = 
UtilFunctions.computeBlockSize(_rlen, rix, _brlen);
-   createBlocks(rowix, (int)len, ix, mb);
+   createBlocks(rowix, 
IntUtils.toInt(len), ix, mb);
}

//process row data
@@ -732,7 +733,7 @@ public class RDDConverterUtils
boolean emptyFound = false;
for( int cix=1, pix=0; cix<=ncblks; cix++ ) 
{
-   int lclen = 
(int)UtilFunctions.computeBlockSize(_clen, cix, _bclen);
+   int lclen = 
IntUtils.toInt(UtilFunctions.computeBlockSize(_clen, cix, _bclen));
if( mb[cix-1].isInSparseFormat() ) {
//allocate row once (avoid 
re-allocations)
int lnnz = 
IOUtilFunctions.countNnz(parts, pix, lclen);
@@ -762,13 +763,13 @@ public class RDDConverterUtils
{
//compute row block index and number of column blocks
long rix = UtilFunctions.computeBlockIndex(rowix, 
_brlen);
-   int ncblks = (int)Math.ceil((double)_clen/_bclen);
+   int ncblks = 
IntUtils.toInt(Math.ceil((double)_clen/_bclen));

//create all column blocks (assume dense since csv is 

[3/4] systemml git commit: [MINOR] Throw an exception if incorrect long-to-int conversion occurs

2018-11-30 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/95cbbd65/src/main/java/org/apache/sysml/runtime/controlprogram/parfor/RemoteDPParWorkerReducer.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/controlprogram/parfor/RemoteDPParWorkerReducer.java
 
b/src/main/java/org/apache/sysml/runtime/controlprogram/parfor/RemoteDPParWorkerReducer.java
index 11d1ed9..9352508 100644
--- 
a/src/main/java/org/apache/sysml/runtime/controlprogram/parfor/RemoteDPParWorkerReducer.java
+++ 
b/src/main/java/org/apache/sysml/runtime/controlprogram/parfor/RemoteDPParWorkerReducer.java
@@ -48,6 +48,7 @@ import 
org.apache.sysml.runtime.matrix.mapred.MRConfigurationNames;
 import org.apache.sysml.runtime.matrix.mapred.MRJobConfiguration;
 import org.apache.sysml.runtime.util.LocalFileUtils;
 import org.apache.sysml.runtime.util.ProgramConverter;
+import org.apache.sysml.utils.IntUtils;
 import org.apache.sysml.utils.Statistics;
 
 public class RemoteDPParWorkerReducer extends ParWorker
@@ -120,8 +121,8 @@ public class RemoteDPParWorkerReducer extends ParWorker
_dpf = MRJobConfiguration.getPartitioningFormat( job );
MatrixCharacteristics mc = 
MRJobConfiguration.getPartitionedMatrixSize(job);
PartitionFormat pf = new PartitionFormat(_dpf, 
MRJobConfiguration.getPartitioningSizeN(job));
-   _rlen = (int)pf.getNumRows(mc);
-   _clen = (int)pf.getNumColumns(mc);
+   _rlen = IntUtils.toInt(pf.getNumRows(mc));
+   _clen = IntUtils.toInt(pf.getNumColumns(mc));
_brlen = mc.getRowsPerBlock();
_bclen = mc.getColsPerBlock();
_iterVar = MRJobConfiguration.getPartitioningItervar( job );
@@ -129,9 +130,9 @@ public class RemoteDPParWorkerReducer extends ParWorker
_info = MRJobConfiguration.getPartitioningOutputInfo( job );
_tSparseCol = MRJobConfiguration.getPartitioningTransposedCol( 
job ); 
if( _tSparseCol )
-   _partition = new MatrixBlock((int)_clen, _rlen, true);
+   _partition = new MatrixBlock(IntUtils.toInt(_clen), 
_rlen, true);
else
-   _partition = new MatrixBlock((int)_rlen, _clen, false);
+   _partition = new MatrixBlock(IntUtils.toInt(_rlen), 
_clen, false);
 
//Step 1: configure parworker
String taskID = job.get(MRConfigurationNames.MR_TASK_ID);
@@ -152,7 +153,7 @@ public class RemoteDPParWorkerReducer extends ParWorker

//create local runtime program
String in = MRJobConfiguration.getProgramBlocks(job);
-   ParForBody body = ProgramConverter.parseParForBody(in, 
(int)_workerID);
+   ParForBody body = ProgramConverter.parseParForBody(in, 
IntUtils.toInt(_workerID));
_childBlocks = body.getChildBlocks();
_ec  = body.getEc();
_resultVars  = body.getResultVariables();
@@ -242,8 +243,8 @@ public class RemoteDPParWorkerReducer extends ParWorker
while( valueList.hasNext() )
{
PairWritableBlock pairValue = 
(PairWritableBlock)valueList.next();
-   int row_offset = 
(int)(pairValue.indexes.getRowIndex()-1)*_brlen;
-   int col_offset = 
(int)(pairValue.indexes.getColumnIndex()-1)*_bclen;
+   int row_offset = 
IntUtils.toInt(pairValue.indexes.getRowIndex()-1)*_brlen;
+   int col_offset = 
IntUtils.toInt(pairValue.indexes.getColumnIndex()-1)*_bclen;
MatrixBlock block = pairValue.block;
if( !_partition.isInSparseFormat() ) //DENSE
{
@@ -297,7 +298,7 @@ public class RemoteDPParWorkerReducer extends ParWorker
PairWritableCell pairValue = 
(PairWritableCell)valueList.next();
if( 
pairValue.indexes.getColumnIndex()<0 )
continue; //cells used to 
ensure empty partitions
-   _partition.quickSetValue(0, 
(int)pairValue.indexes.getColumnIndex()-1, pairValue.cell.getValue());
+   _partition.quickSetValue(0, 
IntUtils.toInt(pairValue.indexes.getColumnIndex()-1), 
pairValue.cell.getValue());
}
break;
case COLUMN_WISE:
@@ -307,9 +308,9 @@ public class RemoteDPParWorkerReducer extends ParWorker
if( pairValue.indexes.getRowIndex()<0 )
   

systemml git commit: [MINOR] Added an external UDF to split string

2018-11-29 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 6ca9be1f5 -> 62647de61


[MINOR] Added an external UDF to split string

- Also, updated ListObject to specify the valuetype of the list. This takes 
care of the "wrong value type warning".

Closes #844.
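
The body of SplitWrapper is truncated in this thread, so the following is only
a sketch of the split semantics the Javadoc signature (String s, String regex,
int limit) suggests. It assumes the UDF follows java.lang.String.split(regex,
limit); the class name SplitSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not the SplitWrapper source.
final class SplitSketch {
    private SplitSketch() {}

    /** Splits s around matches of regex, using String.split's limit rules. */
    static List<String> split(String s, String regex, int limit) {
        // limit > 0: at most `limit` pieces, the last holding the remainder.
        // limit == 0: as many pieces as possible, trailing empty strings dropped.
        return Arrays.asList(s.split(regex, limit));
    }
}
```

Under that assumption, split("foo_goo_boo", "_", 2) returns the two elements
"foo" and "goo_boo", while a limit of 0 returns "foo", "goo" and "boo".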


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/62647de6
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/62647de6
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/62647de6

Branch: refs/heads/master
Commit: 62647de614a549cba7e89446b89308bb531e61a1
Parents: 6ca9be1
Author: Niketan Pansare 
Authored: Thu Nov 29 16:01:09 2018 -0800
Committer: Niketan Pansare 
Committed: Thu Nov 29 16:01:09 2018 -0800

--
 .../runtime/instructions/cp/ListObject.java | 14 +++-
 .../org/apache/sysml/udf/lib/SplitWrapper.java  | 85 
 2 files changed, 96 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/62647de6/src/main/java/org/apache/sysml/runtime/instructions/cp/ListObject.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/cp/ListObject.java 
b/src/main/java/org/apache/sysml/runtime/instructions/cp/ListObject.java
index 863b77c..576e57c 100644
--- a/src/main/java/org/apache/sysml/runtime/instructions/cp/ListObject.java
+++ b/src/main/java/org/apache/sysml/runtime/instructions/cp/ListObject.java
@@ -37,11 +37,19 @@ public class ListObject extends Data {
private int _nCacheable;

public ListObject(List data) {
-   this(data, null);
+   this(data, null, ValueType.UNKNOWN);
}
-
+   
+   public ListObject(List data, ValueType vt) {
+   this(data, null, vt);
+   }
+   
public ListObject(List data, List names) {
-   super(DataType.LIST, ValueType.UNKNOWN);
+   this(data, names, ValueType.UNKNOWN);
+   }
+
+   public ListObject(List data, List names, ValueType vt) {
+   super(DataType.LIST, vt);
_data = data;
_names = names;
_nCacheable = (int) data.stream().filter(

http://git-wip-us.apache.org/repos/asf/systemml/blob/62647de6/src/main/java/org/apache/sysml/udf/lib/SplitWrapper.java
--
diff --git a/src/main/java/org/apache/sysml/udf/lib/SplitWrapper.java 
b/src/main/java/org/apache/sysml/udf/lib/SplitWrapper.java
new file mode 100644
index 000..75cb27a
--- /dev/null
+++ b/src/main/java/org/apache/sysml/udf/lib/SplitWrapper.java
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.sysml.udf.lib;
+
+import java.util.ArrayList;
+
+import org.apache.sysml.parser.Expression.ValueType;
+import org.apache.sysml.runtime.instructions.cp.Data;
+import org.apache.sysml.runtime.instructions.cp.ListObject;
+import org.apache.sysml.runtime.instructions.cp.StringObject;
+import org.apache.sysml.udf.FunctionParameter;
+import org.apache.sysml.udf.PackageFunction;
+import org.apache.sysml.udf.Scalar;
+import org.apache.sysml.udf.List;
+
+/**
+ * Wrapper class for split invocation
+ * 
+ * split = externalFunction(String s, String regex, int limit) return 
(list[String] out) implemented in
+ * (classname="org.apache.sysml.udf.lib.SplitWrapper",exectype="mem");
+ * 
+ * out = split ("foo_goo_boo", "_", 2); 
+ * for ( i in 1:3) { print(as.scalar(out[i])); }
+ * 
+ */
+public class SplitWrapper extends PackageFunction {
+   private static final long serialVersionUID = 1L;
+
+   private List outputList;
+
+   @Override
+   public int getNumFunctionOutputs() {
+   return 1;
+   }
+
+   @Override
+   public FunctionParameter getFunctionOutput(int pos) {
+   if (pos == 0)
+   return outputList;
+   else
+   throw new RuntimeException("Invalid function output 
being 

systemml git commit: [SYSTEMML-445] Support recomputation of activations to reduce the memory footprint

2018-11-04 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 8606754ea -> beb1a1d19


[SYSTEMML-445] Support recomputation of activations to reduce the memory 
footprint

- Added a configuration property sysml.gpu.recompute.activations to enable 
recomputation of ReLU.
- This configuration is disabled by default, but can be enabled for large 
networks.

Closes #841.


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/beb1a1d1
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/beb1a1d1
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/beb1a1d1

Branch: refs/heads/master
Commit: beb1a1d19a5a2710b55bd41d36a5d8085fb0afda
Parents: 8606754
Author: Niketan Pansare 
Authored: Sun Nov 4 14:19:38 2018 +0530
Committer: Niketan Pansare 
Committed: Sun Nov 4 14:19:38 2018 +0530

--
 conf/SystemML-config.xml.template   |  4 +
 .../java/org/apache/sysml/conf/DMLConfig.java   |  4 +-
 src/main/java/org/apache/sysml/hops/DnnOp.java  | 12 ++-
 .../instructions/GPUInstructionParser.java  |  2 +
 .../instructions/gpu/DnnGPUInstruction.java | 60 +++---
 .../runtime/matrix/data/LibMatrixCuDNN.java | 87 +++-
 6 files changed, 155 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/beb1a1d1/conf/SystemML-config.xml.template
--
diff --git a/conf/SystemML-config.xml.template 
b/conf/SystemML-config.xml.template
index 7b535c9..b9189b1 100644
--- a/conf/SystemML-config.xml.template
+++ b/conf/SystemML-config.xml.template
@@ -114,4 +114,8 @@


cuda
+   
+   
+   false
 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/systemml/blob/beb1a1d1/src/main/java/org/apache/sysml/conf/DMLConfig.java
--
diff --git a/src/main/java/org/apache/sysml/conf/DMLConfig.java 
b/src/main/java/org/apache/sysml/conf/DMLConfig.java
index 7f0ecbc..8459fd4 100644
--- a/src/main/java/org/apache/sysml/conf/DMLConfig.java
+++ b/src/main/java/org/apache/sysml/conf/DMLConfig.java
@@ -96,6 +96,7 @@ public class DMLConfig
public static final String FLOATING_POINT_PRECISION = 
"sysml.floating.point.precision"; // String to specify the datatype to use 
internally: supported values are double, single
public static final String PRINT_GPU_MEMORY_INFO = 
"sysml.gpu.print.memoryInfo";
public static final String EVICTION_SHADOW_BUFFERSIZE = 
"sysml.gpu.eviction.shadow.bufferSize";
+   public static final String GPU_RECOMPUTE_ACTIVATIONS = 
"sysml.gpu.recompute.activations";
 
// supported prefixes for custom map/reduce configurations
public static final String PREFIX_MAPRED = "mapred";
@@ -147,6 +148,7 @@ public class DMLConfig
_defaultVals.put(SYNCHRONIZE_GPU,"false" );
_defaultVals.put(CACHING_BUFFER_SIZE,"0.15" );
_defaultVals.put(EAGER_CUDA_FREE,"false" );
+   _defaultVals.put(GPU_RECOMPUTE_ACTIVATIONS, "false" );
_defaultVals.put(FLOATING_POINT_PRECISION,   
"double" );
}

@@ -430,7 +432,7 @@ public class DMLConfig
CODEGEN, CODEGEN_COMPILER, CODEGEN_OPTIMIZER, 
CODEGEN_PLANCACHE, CODEGEN_LITERALS,
EXTRA_FINEGRAINED_STATS, STATS_MAX_WRAP_LEN, 
PRINT_GPU_MEMORY_INFO, CACHING_BUFFER_SIZE,
AVAILABLE_GPUS, SYNCHRONIZE_GPU, 
EAGER_CUDA_FREE, FLOATING_POINT_PRECISION, GPU_EVICTION_POLICY, 
EVICTION_SHADOW_BUFFERSIZE,
-   GPU_MEMORY_ALLOCATOR, 
GPU_MEMORY_UTILIZATION_FACTOR
+   GPU_MEMORY_ALLOCATOR, 
GPU_MEMORY_UTILIZATION_FACTOR, GPU_RECOMPUTE_ACTIVATIONS
}; 

StringBuilder sb = new StringBuilder();

http://git-wip-us.apache.org/repos/asf/systemml/blob/beb1a1d1/src/main/java/org/apache/sysml/hops/DnnOp.java
--
diff --git a/src/main/java/org/apache/sysml/hops/DnnOp.java 
b/src/main/java/org/apache/sysml/hops/DnnOp.java
index 7cf5061..cc94111 100644
--- a/src/main/java/org/apache/sysml/hops/DnnOp.java
+++ b/src/main/java/org/apache/sysml/hops/DnnOp.java
@@ -20,6 +20,7 @@
 package org.apache.sysml.hops;
 
 import org.apache.sysml.conf.ConfigurationManager;
+import org.apache.sysml.conf.DMLConfig;
 import org.apache.sysml.hops.rewrite.HopRewriteUtils;
 import org.apache.sysml.lops.DnnTransform;
 import org.apache.sysml.lops.DnnTransform.OperationTypes;
@@ -47,6 +48,8 @@ public class DnnOp extends MultiThreadedHop
private static final boolean 

systemml git commit: [SYSTEMML-1325] Setting floating point precision in JMLC

2018-11-04 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 912b47018 -> 8606754ea


[SYSTEMML-1325] Setting floating point precision in JMLC

- In current master, the configuration sysml.floating.point.precision is
ignored. This commit fixes that issue.

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/8606754e
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/8606754e
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/8606754e

Branch: refs/heads/master
Commit: 8606754eaf6af43dbeab5cf5aa5a3d7621bef889
Parents: 912b470
Author: Niketan Pansare 
Authored: Sun Nov 4 14:16:02 2018 +0530
Committer: Niketan Pansare 
Committed: Sun Nov 4 14:16:02 2018 +0530

--
 src/main/java/org/apache/sysml/api/jmlc/Connection.java | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/8606754e/src/main/java/org/apache/sysml/api/jmlc/Connection.java
--
diff --git a/src/main/java/org/apache/sysml/api/jmlc/Connection.java 
b/src/main/java/org/apache/sysml/api/jmlc/Connection.java
index 7f5d7c9..53b7d04 100644
--- a/src/main/java/org/apache/sysml/api/jmlc/Connection.java
+++ b/src/main/java/org/apache/sysml/api/jmlc/Connection.java
@@ -44,7 +44,6 @@ import org.apache.sysml.conf.DMLConfig;
 import org.apache.sysml.conf.DMLOptions;
 import org.apache.sysml.hops.codegen.SpoofCompiler;
 import org.apache.sysml.parser.DataExpression;
-import org.apache.sysml.parser.LanguageException;
 import org.apache.sysml.runtime.DMLRuntimeException;
 import org.apache.sysml.runtime.controlprogram.Program;
 import org.apache.sysml.runtime.controlprogram.caching.CacheableData;
@@ -61,7 +60,6 @@ import org.apache.sysml.runtime.matrix.data.MatrixBlock;
 import org.apache.sysml.runtime.transform.TfUtils;
 import org.apache.sysml.runtime.transform.meta.TfMetaUtils;
 import org.apache.sysml.runtime.util.DataConverter;
-import org.apache.sysml.runtime.util.UtilFunctions;
 import org.apache.sysml.utils.Explain;
 import org.apache.wink.json4j.JSONObject;
 
@@ -945,5 +943,6 @@ public class Connection implements Closeable
//set thread-local configurations for compilation and read
ConfigurationManager.setLocalConfig(_dmlconf);
ConfigurationManager.setLocalConfig(_cconf);
+   DMLScript.setGlobalFlags(_dmlconf);
}
 }
\ No newline at end of file



systemml git commit: [SYSTEMML-540] Improve the performance of LSTM forward on GPU

2018-11-02 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master bf4ba16b9 -> 912b47018


[SYSTEMML-540] Improve the performance of LSTM forward on GPU

- This commit improves the performance of LSTM forward by reducing unnecessary 
ping-pongs between the CPU and GPU due to left indexing.
- There are no performance gains for CPU execution.

Closes #756.
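
As a rough illustration of the restructured timestep, the sketch below computes
the gates into separate buffers (ifo and g) instead of writing them back into
the raw pre-activation buffer, mirroring the c_t = f*prev_c + i*g and
out_t = o*tanh(c) comments in lstm.dml. It handles a single example (N = 1)
with plain arrays; the class and method names are hypothetical and this is not
SystemML code.

```java
// Hypothetical single-example sketch of one LSTM timestep; not SystemML code.
final class LstmStepSketch {
    private LstmStepSketch() {}

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /**
     * ifogRaw holds the pre-activations [i | f | o | g], each of length M.
     * Returns {c, out}, each of length M. ifogRaw is never modified.
     */
    static double[][] step(double[] ifogRaw, double[] cPrev) {
        int m = cPrev.length;
        if (ifogRaw.length != 4 * m) {
            throw new IllegalArgumentException("expected 4*M pre-activations");
        }
        // i, f, o gates squashed with sigmoid into their own buffer
        double[] ifo = new double[3 * m];
        for (int j = 0; j < 3 * m; j++) {
            ifo[j] = sigmoid(ifogRaw[j]);
        }
        // g gate squashed with tanh into its own buffer
        double[] g = new double[m];
        for (int j = 0; j < m; j++) {
            g[j] = Math.tanh(ifogRaw[3 * m + j]);
        }
        double[] c = new double[m];
        double[] out = new double[m];
        for (int j = 0; j < m; j++) {
            double i = ifo[j];
            double f = ifo[m + j];
            double o = ifo[2 * m + j];
            c[j] = f * cPrev[j] + i * g[j];   // c_t = f*prev_c + i*g
            out[j] = o * Math.tanh(c[j]);     // out_t = o*tanh(c)
        }
        return new double[][] { c, out };
    }
}
```

Keeping ifo and g as fresh outputs means no slice of an existing matrix is
overwritten, which is the left-indexing pattern the commit message identifies
as the source of CPU-GPU transfers.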


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/912b4701
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/912b4701
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/912b4701

Branch: refs/heads/master
Commit: 912b4701875d4de0db8327479398c32607f4687d
Parents: bf4ba16
Author: Niketan Pansare 
Authored: Sat Nov 3 05:52:00 2018 +0530
Committer: Niketan Pansare 
Committed: Sat Nov 3 05:52:00 2018 +0530

--
 scripts/nn/layers/lstm.dml | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/912b4701/scripts/nn/layers/lstm.dml
--
diff --git a/scripts/nn/layers/lstm.dml b/scripts/nn/layers/lstm.dml
index 0b0016b..cd1557d 100644
--- a/scripts/nn/layers/lstm.dml
+++ b/scripts/nn/layers/lstm.dml
@@ -89,13 +89,13 @@ forward = function(matrix[double] X, matrix[double] W, matrix[double] b, int T,
   for (t in 1:T) {  # each timestep
 X_t = X[,(t-1)*D+1:t*D]  # shape (N, D)
 input = cbind(X_t, out_prev)  # shape (N, D+M)
-ifog = input %*% W + b  # input, forget, output, and g gates; shape (N, 4M)
-ifog[,1:3*M] = sigmoid::forward(ifog[,1:3*M])  # i,f,o gates squashed with sigmoid
-ifog[,3*M+1:4*M] = tanh::forward(ifog[,3*M+1:4*M])  # g gate squashed with tanh
+ifog_raw = input %*% W + b  # input, forget, output, and g gates; shape (N, 4M)
+ifo = sigmoid::forward(ifog_raw[,1:3*M])  # i,f,o gates squashed with sigmoid
+g = tanh::forward(ifog_raw[,3*M+1:4*M])  # g gate squashed with tanh
 # c_t = f*prev_c + i*g
-c = ifog[,M+1:2*M]*c_prev + ifog[,1:M]*ifog[,3*M+1:4*M]  # shape (N, M)
+c = ifo[,M+1:2*M]*c_prev + ifo[,1:M]*g  # shape (N, M)
 # out_t = o*tanh(c)
-out_t = ifog[,2*M+1:3*M] * tanh::forward(c)  # shape (N, M)
+out_t = ifo[,2*M+1:3*M] * tanh::forward(c)  # shape (N, M)
 
 # store
 if (return_sequences) {
@@ -108,7 +108,7 @@ forward = function(matrix[double] X, matrix[double] W, matrix[double] b, int T,
 c_prev = c
 cache_out[t,] = matrix(out_t, rows=1, cols=N*M)  # reshape
 cache_c[t,] = matrix(c, rows=1, cols=N*M)  # reshape
-cache_ifog[t,] = matrix(ifog, rows=1, cols=N*4*M)  # reshape
+cache_ifog[t,] = matrix(cbind(ifo, g), rows=1, cols=N*4*M)  # reshape
   }
 }
 



systemml git commit: [SYSTEMML-1325] Fixes formatting issues and warnings. Fixes bug causing explain to sometimes not be printed.

2018-11-02 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master cc7ec7625 -> bf4ba16b9


[SYSTEMML-1325] Fixes formatting issues and warnings. Fixes bug causing explain 
to sometimes not be printed.

Closes #838.


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/bf4ba16b
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/bf4ba16b
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/bf4ba16b

Branch: refs/heads/master
Commit: bf4ba16b9aaa9afee20a3f1c03b0ff49c5346a9d
Parents: cc7ec76
Author: Anthony Thomas 
Authored: Sat Nov 3 05:44:54 2018 +0530
Committer: Niketan Pansare 
Committed: Sat Nov 3 05:44:54 2018 +0530

--
 .../java/org/apache/sysml/api/DMLScript.java| 104 -
 .../apache/sysml/api/ScriptExecutorUtils.java   |  20 +-
 .../apache/sysml/api/jmlc/PreparedScript.java   |   8 +-
 .../sysml/api/mlcontext/ScriptExecutor.java | 232 +--
 .../apache/sysml/conf/ConfigurationManager.java |   8 +-
 .../controlprogram/LocalVariableMap.java|   1 -
 .../org/apache/sysml/test/gpu/JMLCTests.java| 186 ---
 .../integration/mlcontext/MLContextTest.java|  10 +-
 8 files changed, 170 insertions(+), 399 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/bf4ba16b/src/main/java/org/apache/sysml/api/DMLScript.java
--
diff --git a/src/main/java/org/apache/sysml/api/DMLScript.java 
b/src/main/java/org/apache/sysml/api/DMLScript.java
index 16d8986..293a810 100644
--- a/src/main/java/org/apache/sysml/api/DMLScript.java
+++ b/src/main/java/org/apache/sysml/api/DMLScript.java
@@ -376,56 +376,56 @@ public class DMLScript
// (core compilation and execute)

 
-/**
- * The running body of DMLScript execution. This method should be called 
after execution properties have been correctly set,
- * and customized parameters have been put into _argVals
- *
- * @param dmlScriptStr DML script string
- * @param fnameOptConfig configuration file
- * @param argVals map of argument values
- * @param allArgs arguments
- * @param scriptType type of script (DML or PyDML)
- * @throws IOException if IOException occurs
- */
-private static void execute(String dmlScriptStr, String fnameOptConfig, 
Map argVals, String[] allArgs, ScriptType scriptType)
-throws IOException
-{
-SCRIPT_TYPE = scriptType;
-
-//print basic time and environment info
-printStartExecInfo( dmlScriptStr );
-
-//Step 1: parse configuration files & write any configuration specific 
global variables
-DMLConfig dmlconf = DMLConfig.readConfigurationFile(fnameOptConfig);
-ConfigurationManager.setGlobalConfig(dmlconf);
-CompilerConfig cconf = OptimizerUtils.constructCompilerConfig(dmlconf);
-ConfigurationManager.setGlobalConfig(cconf);
-LOG.debug("\nDML config: \n" + dmlconf.getConfigInfo());
-
-setGlobalFlags(dmlconf);
-Program rtprog = 
ScriptExecutorUtils.compileRuntimeProgram(dmlScriptStr, argVals, allArgs,
-scriptType, dmlconf, SystemMLAPI.DMLScript);
-List gCtxs = ConfigurationManager.getDMLOptions().gpu ? 
GPUContextPool.getAllGPUContexts() : null;
-
-//double costs = CostEstimationWrapper.getTimeEstimate(rtprog, 
ExecutionContextFactory.createContext());
-//System.out.println("Estimated costs: "+costs);
-
-//Step 10: execute runtime program
-ExecutionContext ec = null;
-try {
-ec = ScriptExecutorUtils.executeRuntimeProgram(
-rtprog, dmlconf, ConfigurationManager.isStatistics() ?
-
ConfigurationManager.getDMLOptions().getStatisticsMaxHeavyHitters() : 0,
-new LocalVariableMap(), null, SystemMLAPI.DMLScript, 
gCtxs);
-}
-finally {
-if(ec != null && ec instanceof SparkExecutionContext)
-((SparkExecutionContext) ec).close();
-LOG.info("END DML run " + getDateTime() );
-//cleanup scratch_space and all working dirs
-cleanupHadoopExecution( dmlconf );
-}
-}
+   /**
+* The running body of DMLScript execution. This method should be 
called after execution properties have been correctly set,
+* and customized parameters have been put into _argVals
+*
+* @param dmlScriptStr DML script string
+* @param fnameOptConfig configuration file
+* @param argVals map of argument values
+* @param allArgs arguments
+* @param scriptType type of script (DML or PyDML)
+* @throws IOException if IOException occurs
+*/
+   private static void execute(String 

systemml git commit: [SYSTEMML-1325] Bugfix for GPU memory manager clear temporary memory. Fixes bug in GPU cleanup with JMLC.

2018-11-02 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master be2b3e220 -> cc7ec7625


[SYSTEMML-1325] Bugfix for GPU memory manager clear temporary memory. Fixes bug 
in GPU cleanup with JMLC.

In GPUMemoryManager.clearTemporaryMemory() we deallocate pointers but do not 
set the corresponding Pointer slots to null in the associated GPUObject 
instances. This can lead to attempted double-freeing of Pointers which results 
in an exception. This commit fixes this issue by creating a list of GPU objects 
associated with Pointers that have been freed as part of clearTemporaryMemory() 
and setting the corresponding pointer slots to null. This commit also addresses 
a minor issue with cleanup in JMLC which was causing Pointers for pinned data 
to be improperly cleared. Note this commit will reduce performance of 
GPUMemoryManager.clearTemporaryMemory() because it is now necessary to search 
through the list of managed GPUObjects to find the ones corresponding to 
pointers being freed. However, this method is only called once at the end of 
script invocation and so the performance cost will be small.

Closes #839.
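
The fix described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the SystemML implementation: FakePointer and FakeGpuObject are hypothetical stand-ins for the JCuda Pointer and GPUObject classes, and each object holds a single pointer slot. The owners of the pointers are collected first, the pointers are freed, and the owners' slots are then set to null so that a later cleanup cannot free them again.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ClearTemporaryMemorySketch {
    /** Hypothetical stand-in for a device pointer; freeing it twice is an error. */
    static final class FakePointer {
        boolean freed = false;

        void free() {
            if (freed) {
                throw new IllegalStateException("double free");
            }
            freed = true;
        }
    }

    /** Hypothetical stand-in for a GPU object with one pointer slot. */
    static final class FakeGpuObject {
        FakePointer densePtr;

        FakeGpuObject(FakePointer p) {
            densePtr = p;
        }

        Set<FakePointer> pointers() {
            Set<FakePointer> s = new HashSet<>();
            if (densePtr != null) {
                s.add(densePtr);
            }
            return s;
        }

        /** Regular cleanup path: frees the pointer only if the slot is still set. */
        void clear() {
            if (densePtr != null) {
                densePtr.free();
                densePtr = null;
            }
        }
    }

    final List<FakeGpuObject> managed = new ArrayList<>();

    void clearTemporaryMemory(Set<FakePointer> toFree) {
        // Linear search for the objects owning any pointer about to be freed.
        Set<FakeGpuObject> owners = new HashSet<>();
        for (FakeGpuObject g : managed) {
            if (!Collections.disjoint(g.pointers(), toFree)) {
                owners.add(g);
            }
        }
        for (FakePointer p : toFree) {
            p.free();
        }
        // Null the slots so a later clear() does not free the pointer again.
        for (FakeGpuObject g : owners) {
            g.densePtr = null;
        }
    }
}
```

The owner search is the linear scan over managed objects that the commit message names as the source of the extra cost.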


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/cc7ec762
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/cc7ec762
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/cc7ec762

Branch: refs/heads/master
Commit: cc7ec762583aec137ed0bd10a8754e29140b2e35
Parents: be2b3e2
Author: Anthony Thomas 
Authored: Sat Nov 3 05:41:42 2018 +0530
Committer: Niketan Pansare 
Committed: Sat Nov 3 05:41:42 2018 +0530

--
 .../gpu/context/GPUMatrixMemoryManager.java | 27 +---
 .../gpu/context/GPUMemoryManager.java   | 13 +++---
 .../instructions/gpu/context/GPUObject.java |  2 +-
 3 files changed, 34 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/cc7ec762/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMatrixMemoryManager.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMatrixMemoryManager.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMatrixMemoryManager.java
index 47a8391..f69d340 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMatrixMemoryManager.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMatrixMemoryManager.java
@@ -18,7 +18,9 @@
  */
 package org.apache.sysml.runtime.instructions.gpu.context;
 
+import java.util.Collections;
 import java.util.HashSet;
+import java.util.List;
 import java.util.Set;
 import java.util.stream.Collectors;
 
@@ -43,8 +45,7 @@ public class GPUMatrixMemoryManager {
void addGPUObject(GPUObject gpuObj) {
gpuObjects.add(gpuObj);
}
-   
-   
+
/**
 * Get list of all Pointers in a GPUObject 
 * @param gObj gpu object 
@@ -81,6 +82,20 @@ public class GPUMatrixMemoryManager {
 * so that an extraneous host to dev transfer can be avoided
 */
HashSet<GPUObject> gpuObjects = new HashSet<>();
+
+   /**
+* Return a set of GPU Objects associated with a list of pointers
+* @param pointers A list of pointers
+* @return A set of GPU objects corresponding to any of these pointers
+*/
+   Set<GPUObject> getGpuObjects(Set<Pointer> pointers) {
+   Set<GPUObject> gObjs = new HashSet<>();
+   for (GPUObject g : gpuObjects) {
+   if (!Collections.disjoint(getPointers(g), pointers))
+   gObjs.add(g);
+   }
+   return gObjs;
+   }

/**
 * Return all pointers in the first section
@@ -94,10 +109,14 @@ public class GPUMatrixMemoryManager {
 * Get pointers from the first memory sections "Matrix Memory"
 * @param locked return locked pointers if true
 * @param dirty return dirty pointers if true
+* @param isCleanupEnabled return pointers marked for cleanup if true
 * @return set of pointers
 */
-   Set getPointers(boolean locked, boolean dirty) {
-   return gpuObjects.stream().filter(gObj -> gObj.isLocked() == 
locked && gObj.isDirty() == dirty).flatMap(gObj -> 
getPointers(gObj).stream()).collect(Collectors.toSet());
+   Set getPointers(boolean locked, boolean dirty, boolean 
isCleanupEnabled) {
+   return gpuObjects.stream().filter(
+   gObj -> (gObj.isLocked() == locked && 
gObj.isDirty() == dirty) ||
+   (gObj.mat.isCleanupEnabled() == 
isCleanupEnabled)).flatMap(
+   gObj -> 

systemml git commit: [SYSTEMML-445] Extend shadow buffer for double precision

2018-11-01 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 3cbd9d5ab -> be2b3e220


[SYSTEMML-445] Extend shadow buffer for double precision

- This commit also prepares SystemML for very low precision: the shadow pointer 
can now be a double[], float[] or short[].
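
A minimal Java sketch of the idea, assuming the host buffer type is chosen from the element size in bytes. The class, constants and method names below are hypothetical and do not use the JCuda or SystemML APIs. The buffer is held as an Object so that one field can refer to an array of any of the three element types.

```java
final class ShadowBufferSketch {
    // Element sizes in bytes (hypothetical constants for this sketch).
    static final int SIZEOF_DOUBLE = 8;
    static final int SIZEOF_FLOAT = 4;
    static final int SIZEOF_SHORT = 2;

    /** Allocates a host array whose element type matches the given element size. */
    static Object allocate(int numElems, int sizeOfDataType) {
        if (sizeOfDataType == SIZEOF_DOUBLE) {
            return new double[numElems];
        } else if (sizeOfDataType == SIZEOF_FLOAT) {
            return new float[numElems];
        } else if (sizeOfDataType == SIZEOF_SHORT) {
            return new short[numElems];
        }
        throw new IllegalArgumentException("Unsupported element size: " + sizeOfDataType);
    }

    /** Number of host bytes occupied by a buffer returned by allocate. */
    static long sizeInBytes(Object buffer) {
        if (buffer instanceof double[]) {
            return (long) ((double[]) buffer).length * SIZEOF_DOUBLE;
        } else if (buffer instanceof float[]) {
            return (long) ((float[]) buffer).length * SIZEOF_FLOAT;
        } else if (buffer instanceof short[]) {
            return (long) ((short[]) buffer).length * SIZEOF_SHORT;
        }
        throw new IllegalArgumentException("Unsupported buffer type");
    }
}
```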

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/be2b3e22
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/be2b3e22
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/be2b3e22

Branch: refs/heads/master
Commit: be2b3e220401c0244bb5df33ddfa8125996066b6
Parents: 3cbd9d5
Author: Niketan Pansare 
Authored: Thu Nov 1 05:05:10 2018 -0700
Committer: Niketan Pansare 
Committed: Thu Nov 1 17:36:03 2018 +0530

--
 .../instructions/gpu/context/ShadowBuffer.java  | 98 ++--
 1 file changed, 72 insertions(+), 26 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/be2b3e22/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/ShadowBuffer.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/ShadowBuffer.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/ShadowBuffer.java
index 88ea972..1aeec6f 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/ShadowBuffer.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/ShadowBuffer.java
@@ -22,9 +22,9 @@ import static jcuda.runtime.JCuda.cudaMemcpy;
 
 import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;
-import org.apache.sysml.api.DMLScript;
 import org.apache.sysml.conf.ConfigurationManager;
 import org.apache.sysml.conf.DMLConfig;
+import org.apache.sysml.runtime.DMLRuntimeException;
 import 
org.apache.sysml.runtime.controlprogram.parfor.stat.InfrastructureAnalyzer;
 import org.apache.sysml.runtime.matrix.data.LibMatrixCUDA;
 import org.apache.sysml.runtime.matrix.data.MatrixBlock;
@@ -36,21 +36,17 @@ import jcuda.Sizeof;
 public class ShadowBuffer {
private static final Log LOG = 
LogFactory.getLog(ShadowBuffer.class.getName());

-   GPUObject gpuObj;
-   float[] shadowPointer = null;
+   private GPUObject gpuObj;
+   // shadowPointer can be double[], float[] or short[].
+   private Object shadowPointer = null;
private static boolean _warnedAboutShadowBuffer = false;
private static long EVICTION_SHADOW_BUFFER_CURR_BYTES = 0;
private static long EVICTION_SHADOW_BUFFER_MAX_BYTES;
static {
-   if(DMLScript.FLOATING_POINT_PRECISION.equals("double")) {
-   EVICTION_SHADOW_BUFFER_MAX_BYTES = 0;
-   }
-   else {
-   double shadowBufferSize = 
ConfigurationManager.getDMLConfig().getDoubleValue(DMLConfig.EVICTION_SHADOW_BUFFERSIZE);
-   if(shadowBufferSize < 0 || shadowBufferSize > 1) 
-   throw new RuntimeException("Incorrect value (" 
+ shadowBufferSize + ") for the configuration:" + 
DMLConfig.EVICTION_SHADOW_BUFFERSIZE);
-   EVICTION_SHADOW_BUFFER_MAX_BYTES = (long) 
(((double)InfrastructureAnalyzer.getLocalMaxMemory())*shadowBufferSize);
-   }
+   double shadowBufferSize = 
ConfigurationManager.getDMLConfig().getDoubleValue(DMLConfig.EVICTION_SHADOW_BUFFERSIZE);
+   if(shadowBufferSize < 0 || shadowBufferSize > 1) 
+   throw new RuntimeException("Incorrect value (" + 
shadowBufferSize + ") for the configuration:" + 
DMLConfig.EVICTION_SHADOW_BUFFERSIZE);
+   EVICTION_SHADOW_BUFFER_MAX_BYTES = (long) 
(((double)InfrastructureAnalyzer.getLocalMaxMemory())*shadowBufferSize);
}

public ShadowBuffer(GPUObject gpuObj) {
@@ -73,9 +69,21 @@ public class ShadowBuffer {
public void moveFromDevice(String instName) {
long start = ConfigurationManager.isStatistics() ? 
System.nanoTime() : 0;
int numElems = 
GPUObject.toIntExact(gpuObj.mat.getNumRows()*gpuObj.mat.getNumColumns());
-   shadowPointer = new float[numElems];
-   EVICTION_SHADOW_BUFFER_CURR_BYTES += 
getSizeOfFloat(shadowPointer.length);
-   cudaMemcpy(Pointer.to(shadowPointer), 
gpuObj.jcudaDenseMatrixPtr, getSizeOfDataType(numElems), 
jcuda.runtime.cudaMemcpyKind.cudaMemcpyDeviceToHost);
+   if(LibMatrixCUDA.sizeOfDataType == Sizeof.DOUBLE) {
+   shadowPointer = new double[numElems];
+   }
+   else if(LibMatrixCUDA.sizeOfDataType == Sizeof.FLOAT) {
+   shadowPointer = new float[numElems];
+   }
+   else if(LibMatrixCUDA.sizeOfDataType == Sizeof.SHORT) {
+   

systemml git commit: [MINOR] Bugfix in Large Dense Block

2018-10-22 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 0a957e4c9 -> 73e1e40d7


[MINOR] Bugfix in Large Dense Block

- Current master throws java.lang.ArrayIndexOutOfBoundsException when
counting the number of non-zeros.

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/73e1e40d
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/73e1e40d
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/73e1e40d

Branch: refs/heads/master
Commit: 73e1e40d766fda53210b5176597b182024cac344
Parents: 0a957e4
Author: Niketan Pansare 
Authored: Mon Oct 22 09:34:26 2018 -0700
Committer: Niketan Pansare 
Committed: Mon Oct 22 09:40:59 2018 -0700

--
 .../java/org/apache/sysml/runtime/matrix/data/DenseBlockLDRB.java  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/73e1e40d/src/main/java/org/apache/sysml/runtime/matrix/data/DenseBlockLDRB.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/matrix/data/DenseBlockLDRB.java 
b/src/main/java/org/apache/sysml/runtime/matrix/data/DenseBlockLDRB.java
index fda1b30..6e99fbd 100644
--- a/src/main/java/org/apache/sysml/runtime/matrix/data/DenseBlockLDRB.java
+++ b/src/main/java/org/apache/sysml/runtime/matrix/data/DenseBlockLDRB.java
@@ -164,7 +164,7 @@ public class DenseBlockLDRB extends DenseBlock
nnz += UtilFunctions.computeNnz(data[bi], lpos, 
len);
else
for(int i=lpos; i

[1/2] systemml git commit: [SYSTEMML-445] Support non-CuDNN GPU operator for LSTM forward and backward

2018-10-20 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master ef842da9c -> bd34292d4


http://git-wip-us.apache.org/repos/asf/systemml/blob/bd34292d/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
index 4ad4155..0424114 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
@@ -38,6 +38,15 @@ import org.apache.sysml.runtime.util.DnnUtils;
 import org.apache.sysml.utils.GPUStatistics;
 
 public class DnnGPUInstruction extends GPUInstruction {
+   
+   public static enum LstmOperator {
+   CUDNN,
+   DENSE_NN,
+   NONE
+   }
+   
+   public static LstmOperator FORCED_LSTM_OP = LstmOperator.NONE;
+   
private CPOperand _input1;
private CPOperand _input2;
private CPOperand _input3;
@@ -638,43 +647,36 @@ public class DnnGPUInstruction extends GPUInstruction {
return (int)num;
}

+   public static long getMemRequiredForCuDNNLSTMBackward(long N, long T, 
long M, long D, boolean return_sequences) {
+   double memRequired = (D+M)*4*M // sysmlWPointer
+   + 2*(D+M+2)*(4*M) // cudnnWPointer and 
cudnnDwPointer
+   + 3*N*T*D  // cudnnInput, cudnnDx and smlDx
+   + 2*N*T*M // dy and yPointer
+   + (return_sequences ? T*M : M); // dout
+   memRequired *= LibMatrixCUDA.sizeOfDataType;
+   // Assume the workspace to be proportional to cudnnWPointer 
(add 20% additional overhead for workspace)
+   memRequired += 1.2*(D+M+2)*(4*M)*LibMatrixCUDA.sizeOfDataType;
+   return (long)memRequired;
+   }
+   
private void processLstmBackwardInstruction(ExecutionContext ec) throws 
DMLRuntimeException {
MatrixObject out0 = getMatrixInputForGPUInstruction(ec, 
_input4.getName());
long M = out0.getNumColumns(); // hiddenSize .. since out0: (N, 
M)
+   long N1 = out0.getNumRows();
Pointer out0Pointer =  LibMatrixCUDA.getDensePointer(gCtx, 
out0, instName);

MatrixObject W = getMatrixInputForGPUInstruction(ec, 
_input2.getName());
MatrixObject bias = getMatrixInputForGPUInstruction(ec, 
_input3.getName());
long numRowsW = W.getNumRows();
-   long D = numRowsW - M; // since W:(D+M, 4M) ... numFeatures 
-   Pointer sysmlWPointer = 
LibMatrixCuDNN.getDensePointerForCuDNN(gCtx, W, instName, D+M, 4*M);
-   Pointer sysmlBiasPointer = 
LibMatrixCuDNN.getDensePointerForCuDNN(gCtx, bias, instName, 1, 4*M);
-   Pointer cudnnWPointer = gCtx.allocate(instName, 
(D+M+2)*(4*M)*LibMatrixCUDA.sizeOfDataType);
-   
LibMatrixCUDA.getCudaKernels(gCtx).launchKernel("prepare_lstm_weight",
-   
ExecutionConfig.getConfigForSimpleVectorOperations(toInt((D+M+2)*(4*M))),
-   sysmlWPointer, sysmlBiasPointer, cudnnWPointer, 
D, M);
-   ec.releaseMatrixInputForGPUInstruction(_input2.getName());
-   ec.releaseMatrixInputForGPUInstruction(_input3.getName());
-   
-   
+   long D = numRowsW - M; // since W:(D+M, 4M) ... numFeatures
MatrixObject X = getMatrixInputForGPUInstruction(ec, 
_input1.getName());
-   Pointer xPointer = LibMatrixCUDA.getDensePointer(gCtx, X, 
instName); 
int N = toInt(X.getNumRows()); // batchSize .. since X:(N, T*D)
long numColsX = X.getNumColumns();
int T = toInt(numColsX/ D); // since X:(N, T*D) ... seqLength
-   Pointer cudnnInput = gCtx.allocate(instName, 
(N*T*D)*LibMatrixCUDA.sizeOfDataType);
-   
LibMatrixCUDA.getCudaKernels(gCtx).launchKernel("prepare_lstm_input",
-   
ExecutionConfig.getConfigForSimpleVectorOperations(toInt(N*T*D)),
-   xPointer, cudnnInput, N, D, T*D, N*T*D);
-   ec.releaseMatrixInputForGPUInstruction(_input1.getName());
-   
-   Pointer c0Pointer = LibMatrixCUDA.getDensePointer(gCtx, 
getMatrixInputForGPUInstruction(ec, _input5.getName()), instName);
boolean return_sequences = ec.getScalarInput(_input6.getName(), 
_input6.getValueType(), _input6.isLiteral()).getBooleanValue();

-   // LibMatrixCuDNN.lstm(ec, gCtx, instName, 
-   // cudnnInput, cudnnWPointer, 

[2/2] systemml git commit: [SYSTEMML-445] Support non-CuDNN GPU operator for LSTM forward and backward

2018-10-20 Thread niketanpansare
[SYSTEMML-445] Support non-CuDNN GPU operator for LSTM forward and backward

- Added corresponding GPU tests that compare the result of CuDNN operator with 
the newly added operator. Also, the results are compared with DML-bodied LSTM 
implementation in the nn layer.
- The LSTM forward operator supports sparse weights.
- Sparse support for LSTM backward is disabled in the initial implementation.
- Unnecessary intermediates are removed from lstm.dml
- Extended LibMatrixCuMatMult to support arbitrary alpha and beta during matrix 
multiplication.
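
For readers unfamiliar with the alpha and beta parameters: assuming they follow the usual BLAS GEMM convention, the operation is C = alpha * A * B + beta * C. The following Java sketch shows that convention on row-major arrays. It is a reference illustration only; the class and method names are hypothetical and it is not the LibMatrixCuMatMult code.

```java
final class GemmSketch {
    /**
     * Computes C = alpha * A * B + beta * C in place.
     * A is m x k, B is k x n, C is m x n, all stored row-major.
     */
    static void gemm(double alpha, double[] a, double[] b,
                     double beta, double[] c, int m, int k, int n) {
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int p = 0; p < k; p++) {
                    sum += a[i * k + p] * b[p * n + j];
                }
                // beta scales the existing content of C before accumulation.
                c[i * n + j] = alpha * sum + beta * c[i * n + j];
            }
        }
    }
}
```

With beta = 1 a product is accumulated onto an existing result, and with beta = 0 the output is overwritten.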


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/bd34292d
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/bd34292d
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/bd34292d

Branch: refs/heads/master
Commit: bd34292d4e521ffaa5118f89ab9350ffe4e89af0
Parents: ef842da
Author: Niketan Pansare 
Authored: Sat Oct 20 11:03:53 2018 -0700
Committer: Niketan Pansare 
Committed: Sat Oct 20 11:08:11 2018 -0700

--
 scripts/nn/layers/lstm.dml  |1 -
 src/main/cpp/kernels/SystemML.cu|  315 +++
 src/main/cpp/kernels/SystemML.ptx   | 2074 +-
 .../instructions/gpu/DnnGPUInstruction.java |  232 +-
 .../gpu/context/GPUMemoryManager.java   |4 +
 .../runtime/matrix/data/LibMatrixCUDA.java  |   21 +-
 .../runtime/matrix/data/LibMatrixCuDNN.java |  236 +-
 .../runtime/matrix/data/LibMatrixCuMatMult.java |   34 +-
 .../org/apache/sysml/test/gpu/LstmTest.java |  318 +++
 9 files changed, 3130 insertions(+), 105 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/bd34292d/scripts/nn/layers/lstm.dml
--
diff --git a/scripts/nn/layers/lstm.dml b/scripts/nn/layers/lstm.dml
index 44942d2..0b0016b 100644
--- a/scripts/nn/layers/lstm.dml
+++ b/scripts/nn/layers/lstm.dml
@@ -182,7 +182,6 @@ backward = function(matrix[double] dout, matrix[double] dc,
   for (iter in 1:T) {  # each timestep in reverse order
 X_t = X[,(t-1)*D+1:t*D]  # shape (N, D)
 dout_t = dout[,(t-1)*M+1:t*M]  # shape (N, M)
-out_t = matrix(cache_out[t,], rows=N, cols=M)  # shape (N, M)
 ct = matrix(cache_c[t,], rows=N, cols=M)  # shape (N, M)
 if (t == 1) {
   out_prev = out0  # shape (N, M)

http://git-wip-us.apache.org/repos/asf/systemml/blob/bd34292d/src/main/cpp/kernels/SystemML.cu
--
diff --git a/src/main/cpp/kernels/SystemML.cu b/src/main/cpp/kernels/SystemML.cu
index 26d7f43..ab5f326 100644
--- a/src/main/cpp/kernels/SystemML.cu
+++ b/src/main/cpp/kernels/SystemML.cu
@@ -2406,3 +2406,318 @@ extern "C" __global__ void backward_dgamma_tmp_f(double 
*ema_mean, double *dout,
int N, int C, int HW, int CHW, int NCHW) {
   backward_dgamma_tmp(ema_mean, dout, X, ema_var, ret, N, C, HW, CHW, NCHW);
 }
+
+
+// Performs the operation:
+// X_t = X[,(t-1)*D+1:t*D]  # shape (N, D)
+// ret = cbind(X_t, out_prev)  # shape (N, D+M)
+// size => N*(D+M)
+template <typename T>
+__device__ void prepareInputNNLstm(T *X, T* out_prev, T *ret, int t, int M, int D, int TD, int DPlusM, unsigned int size) {
+  int index = blockIdx.x * blockDim.x + threadIdx.x;
+  if (index < size) {
+int n = index / DPlusM;
+   int iy = index % DPlusM;
+if(iy < D) {
+   ret[index] = X[n*TD + t*D + iy];
+}
+else {
+   ret[index] = out_prev[n*M + (iy-D)];
+}
+  }
+}
+
+extern "C" __global__ void prepareInputNNLstm_d(double *X, double* out_prev, double *ret, int t, int M, int D, int TD, int DPlusM, unsigned int size) {
+  prepareInputNNLstm(X, out_prev, ret, t, M, D, TD, DPlusM, size);
+}
+
+extern "C" __global__ void prepareInputNNLstm_f(float *X, float* out_prev, float *ret, int t, int M, int D, int TD, int DPlusM, unsigned int size) {
+  prepareInputNNLstm(X, out_prev, ret, t, M, D, TD, DPlusM, size);
+}
+
+
+// Performs the operations:
+// ifog = ifog + b
+// ifog[,1:3*M] = sigmoid::forward(ifog[,1:3*M])  # i,f,o gates squashed with sigmoid
+// ifog[,3*M+1:4*M] = tanh::forward(ifog[,3*M+1:4*M])  # g gate squashed with tanh
+template <typename T>
+__device__ void squashIFOG(T *ifog, T *b, int M, unsigned int size) {
+  int index = blockIdx.x * blockDim.x + threadIdx.x;
+  if (index < size) {
+int M4 = M*4;
+   int n = index / M4;
+   int iy = index % M4; 
+   T ifogVal = ifog[index] + b[iy];
+   if(iy < M*3) {
+   ifogVal = 0.5 * tanh(0.5 * ifogVal) + 0.5; // sigmoid
+   }
+   else {
+   ifogVal = tanh(ifogVal);
+   }
+   ifog[index] = ifogVal;
+  }
+}
+
+extern "C" __global__ void squashIFOG_d(double *ifog, double *b, int M, unsigned int size) {
+  squashIFOG(ifog, b, M, size);
+}
+

[10/11] systemml git commit: [SYSTEMML-2496] Skip MapReduce tests as it is in maintenance mode

2018-10-14 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/applications/parfor/ParForUnivariateStatsTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/applications/parfor/ParForUnivariateStatsTest.java
 
b/src/test/java/org/apache/sysml/test/integration/applications/parfor/ParForUnivariateStatsTest.java
index 4303e50..82deb95 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/applications/parfor/ParForUnivariateStatsTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/applications/parfor/ParForUnivariateStatsTest.java
@@ -22,7 +22,6 @@ package org.apache.sysml.test.integration.applications.parfor;
 import java.util.HashMap;
 
 import org.junit.Test;
-
 import org.apache.sysml.hops.Hop;
 import org.apache.sysml.lops.LopProperties.ExecType;
 import org.apache.sysml.runtime.controlprogram.ParForProgramBlock.PExecMode;
@@ -100,6 +99,9 @@ public class ParForUnivariateStatsTest extends 
AutomatedTestBase
 */
private void runParForUnivariateStatsTest( boolean parallel, PExecMode 
outer, PExecMode inner, ExecType instType )
{
+   if(shouldSkipTest())
+   return;
+   
//inst exec type, influenced via rows
int rows = -1;
if( instType == ExecType.CP )

http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/conversion/RDDConverterUtilsExtTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/conversion/RDDConverterUtilsExtTest.java
 
b/src/test/java/org/apache/sysml/test/integration/conversion/RDDConverterUtilsExtTest.java
index 89de76d..4ffdb68 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/conversion/RDDConverterUtilsExtTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/conversion/RDDConverterUtilsExtTest.java
@@ -19,8 +19,6 @@
 
 package org.apache.sysml.test.integration.conversion;
 
-import static org.junit.Assert.assertTrue;
-
 import java.util.ArrayList;
 import java.util.List;
 
@@ -77,6 +75,9 @@ public class RDDConverterUtilsExtTest extends 
AutomatedTestBase {
 
@Test
public void testStringDataFrameToVectorDataFrame() {
+   if(shouldSkipTest())
+   return;
+   
List list = new ArrayList();
list.add("((1.2, 4.3, 3.4))");
list.add("(1.2, 3.4, 2.2)");
@@ -105,6 +106,8 @@ public class RDDConverterUtilsExtTest extends 
AutomatedTestBase {
 
@Test
public void testStringDataFrameToVectorDataFrameNull() {
+   if(shouldSkipTest())
+   return;
List list = new ArrayList();
list.add("[1.2, 3.4]");
list.add(null);
@@ -129,6 +132,8 @@ public class RDDConverterUtilsExtTest extends 
AutomatedTestBase {
 
@Test(expected = SparkException.class)
public void testStringDataFrameToVectorDataFrameNonNumbers() {
+   if(shouldSkipTest())
+   return;
List list = new ArrayList();
list.add("[cheeseburger,fries]");
JavaRDD javaRddString = sc.parallelize(list);

http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/aggregate/AggregateInfTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/aggregate/AggregateInfTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/aggregate/AggregateInfTest.java
index e89da18..bc2a211 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/aggregate/AggregateInfTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/aggregate/AggregateInfTest.java
@@ -22,7 +22,6 @@ package org.apache.sysml.test.integration.functions.aggregate;
 import java.util.HashMap;
 
 import org.junit.Test;
-
 import org.apache.sysml.api.DMLScript.RUNTIME_PLATFORM;
 import org.apache.sysml.lops.LopProperties.ExecType;
 import org.apache.sysml.runtime.matrix.data.MatrixValue.CellIndex;
@@ -111,9 +110,9 @@ public class AggregateInfTest extends AutomatedTestBase
 */
private void runInfAggregateOperationTest( boolean pos, boolean sparse, 
ExecType instType)
{
-   //rtplatform for MR
-   RUNTIME_PLATFORM platformOld = rtplatform;
-   rtplatform = (instType==ExecType.MR) ? RUNTIME_PLATFORM.HADOOP 
: RUNTIME_PLATFORM.HYBRID;
+   RUNTIME_PLATFORM platformOld = setRuntimePlatform(instType);
+   if(shouldSkipTest())
+   return;

try
{


[03/11] systemml git commit: [SYSTEMML-2496] Skip MapReduce tests as it is in maintenance mode

2018-10-14 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAAssignConstantPropagationTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAAssignConstantPropagationTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAAssignConstantPropagationTest.java
index 8d0736b..47f9463 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAAssignConstantPropagationTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAAssignConstantPropagationTest.java
@@ -22,7 +22,6 @@ package org.apache.sysml.test.integration.functions.recompile;
 import java.util.HashMap;
 
 import org.junit.Test;
-
 import org.apache.sysml.hops.OptimizerUtils;
 import org.apache.sysml.runtime.matrix.data.MatrixValue.CellIndex;
 import org.apache.sysml.test.integration.AutomatedTestBase;
@@ -69,6 +68,9 @@ public class IPAAssignConstantPropagationTest extends 
AutomatedTestBase

private void runIPAAssignConstantPropagationTest( boolean 
branchRemoval, boolean IPA )
{   
+   if(shouldSkipTest())
+   return;
+   
boolean oldFlagBranchRemoval = 
OptimizerUtils.ALLOW_BRANCH_REMOVAL;
boolean oldFlagIPA = 
OptimizerUtils.ALLOW_INTER_PROCEDURAL_ANALYSIS;


http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAComplexAppendTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAComplexAppendTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAComplexAppendTest.java
index 58f9f46..c8a422b 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAComplexAppendTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAComplexAppendTest.java
@@ -74,6 +74,9 @@ public class IPAComplexAppendTest extends AutomatedTestBase

private void runIPAAppendTest( boolean IPA, boolean rewrites ) throws 
IOException
{
+   if(shouldSkipTest())
+   return;
+   
boolean oldFlagIPA = 
OptimizerUtils.ALLOW_INTER_PROCEDURAL_ANALYSIS;
boolean oldFlagRewrites = 
OptimizerUtils.ALLOW_ALGEBRAIC_SIMPLIFICATION;


http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAConstantPropagationTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAConstantPropagationTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAConstantPropagationTest.java
index 871d3f1..a865262 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAConstantPropagationTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAConstantPropagationTest.java
@@ -22,7 +22,6 @@ package org.apache.sysml.test.integration.functions.recompile;
 import java.util.HashMap;
 
 import org.junit.Test;
-
 import org.apache.sysml.hops.OptimizerUtils;
 import org.apache.sysml.runtime.matrix.data.MatrixValue.CellIndex;
 import org.apache.sysml.test.integration.AutomatedTestBase;
@@ -87,6 +86,9 @@ public class IPAConstantPropagationTest extends 
AutomatedTestBase

private void runIPAConstantPropagationTest( boolean update, boolean 
branchRemoval, boolean IPA )
{   
+   if(shouldSkipTest())
+   return;
+   
boolean oldFlagBranchRemoval = 
OptimizerUtils.ALLOW_BRANCH_REMOVAL;
boolean oldFlagIPA = 
OptimizerUtils.ALLOW_INTER_PROCEDURAL_ANALYSIS;


http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAPropagationSizeMultipleFunctionsTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAPropagationSizeMultipleFunctionsTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAPropagationSizeMultipleFunctionsTest.java
index 7103775..d8c54c3 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAPropagationSizeMultipleFunctionsTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/recompile/IPAPropagationSizeMultipleFunctionsTest.java
@@ -108,6 +108,9 @@ public class IPAPropagationSizeMultipleFunctionsTest 
extends 

[06/11] systemml git commit: [SYSTEMML-2496] Skip MapReduce tests as it is in maintenance mode

2018-10-14 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMatrixReblockTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMatrixReblockTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMatrixReblockTest.java
index c3b5311..4991183 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMatrixReblockTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMatrixReblockTest.java
@@ -180,17 +180,10 @@ public class FrameMatrixReblockTest extends 
AutomatedTestBase

private void runFrameReblockTest( String testname, boolean multColBlks, 
boolean sparse, String ofmt, ExecType et)
{
-   //rtplatform for MR
-   RUNTIME_PLATFORM platformOld = rtplatform;
-   switch( et ){
-   case MR: rtplatform = RUNTIME_PLATFORM.HADOOP; break;
-   case SPARK: rtplatform = RUNTIME_PLATFORM.SPARK; break;
-   default: rtplatform = RUNTIME_PLATFORM.HYBRID; break;
-   }
-   
boolean sparkConfigOld = DMLScript.USE_LOCAL_SPARK_CONFIG;
-   if( rtplatform == RUNTIME_PLATFORM.SPARK )
-   DMLScript.USE_LOCAL_SPARK_CONFIG = true;
+   RUNTIME_PLATFORM platformOld = setRuntimePlatform(et);
+   if(shouldSkipTest())
+   return;

try
{

http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMatrixWriteTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMatrixWriteTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMatrixWriteTest.java
index 2cbbf55..144f536 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMatrixWriteTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMatrixWriteTest.java
@@ -123,17 +123,10 @@ public class FrameMatrixWriteTest extends 
AutomatedTestBase
 */
private void runFrameWriteTest( String testname, boolean multColBlks, 
String ofmt, ExecType et)
{
-   //rtplatform for MR
-   RUNTIME_PLATFORM platformOld = rtplatform;
-   switch( et ){
-   case MR: rtplatform = RUNTIME_PLATFORM.HADOOP; break;
-   case SPARK: rtplatform = RUNTIME_PLATFORM.SPARK; break;
-   default: rtplatform = RUNTIME_PLATFORM.HYBRID; break;
-   }
-   
boolean sparkConfigOld = DMLScript.USE_LOCAL_SPARK_CONFIG;
-   if( rtplatform == RUNTIME_PLATFORM.SPARK )
-   DMLScript.USE_LOCAL_SPARK_CONFIG = true;
+   RUNTIME_PLATFORM platformOld = setRuntimePlatform(et);
+   if(shouldSkipTest())
+   return;

try
{

http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMetaReadWriteTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMetaReadWriteTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMetaReadWriteTest.java
index ceeec07..19eec76 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMetaReadWriteTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/frame/FrameMetaReadWriteTest.java
@@ -31,7 +31,6 @@ import org.apache.sysml.runtime.util.DataConverter;
 import org.apache.sysml.test.integration.AutomatedTestBase;
 import org.apache.sysml.test.integration.TestConfiguration;
 import org.apache.sysml.test.utils.TestUtils;
-import org.junit.Assert;
 import org.junit.Test;
 
 public class FrameMetaReadWriteTest extends AutomatedTestBase
@@ -87,17 +86,10 @@ public class FrameMetaReadWriteTest extends 
AutomatedTestBase
 */
private void runFrameReadWriteTest( OutputInfo oinfo, ExecType et)
{
-   //rtplatform for MR
-   RUNTIME_PLATFORM platformOld = rtplatform;
-   switch( et ){
-   case MR: rtplatform = RUNTIME_PLATFORM.HADOOP; break;
-   case SPARK: rtplatform = RUNTIME_PLATFORM.SPARK; break;
-   default: rtplatform = RUNTIME_PLATFORM.HYBRID; break;
-   }
-   
boolean sparkConfigOld = DMLScript.USE_LOCAL_SPARK_CONFIG;
-   if( rtplatform == 

[02/11] systemml git commit: [SYSTEMML-2496] Skip MapReduce tests as it is in maintenance mode

2018-10-14 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/ternary/FullIfElseTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/ternary/FullIfElseTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/ternary/FullIfElseTest.java
index 95f43ac..efb87d8 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/ternary/FullIfElseTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/ternary/FullIfElseTest.java
@@ -297,15 +297,11 @@ public class FullIfElseTest extends AutomatedTestBase

private void runIfElseTest(boolean matrix1, boolean matrix2, boolean 
matrix3, boolean sparse, ExecType et)
{
-   //rtplatform for MR
-   RUNTIME_PLATFORM platformOld = rtplatform;
-   switch( et ){
-   case MR: rtplatform = RUNTIME_PLATFORM.HADOOP; break;
-   case SPARK: rtplatform = RUNTIME_PLATFORM.SPARK; break;
-   default: rtplatform = RUNTIME_PLATFORM.HYBRID_SPARK; 
break;
-   }
-   
boolean sparkConfigOld = DMLScript.USE_LOCAL_SPARK_CONFIG;
+   RUNTIME_PLATFORM platformOld = setRuntimePlatform(et);
+   if(shouldSkipTest())
+   return;
+   
boolean rewritesOld = 
OptimizerUtils.ALLOW_ALGEBRAIC_SIMPLIFICATION;
if( rtplatform == RUNTIME_PLATFORM.SPARK || rtplatform == 
RUNTIME_PLATFORM.HYBRID_SPARK )
DMLScript.USE_LOCAL_SPARK_CONFIG = true;

http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/ternary/QuantileWeightsTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/ternary/QuantileWeightsTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/ternary/QuantileWeightsTest.java
index 44b4282..c8efd90 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/ternary/QuantileWeightsTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/ternary/QuantileWeightsTest.java
@@ -206,17 +206,10 @@ public class QuantileWeightsTest extends AutomatedTestBase

private void runQuantileTest( String TEST_NAME, double p, boolean 
sparse, ExecType et)
{
-   //rtplatform for MR
-   RUNTIME_PLATFORM platformOld = rtplatform;
-   switch( et ){
-   case MR: rtplatform = RUNTIME_PLATFORM.HADOOP; break;
-   case SPARK: rtplatform = RUNTIME_PLATFORM.SPARK; break;
-   default: rtplatform = RUNTIME_PLATFORM.HYBRID; break;
-   }
-   
boolean sparkConfigOld = DMLScript.USE_LOCAL_SPARK_CONFIG;
-   if( rtplatform == RUNTIME_PLATFORM.SPARK )
-   DMLScript.USE_LOCAL_SPARK_CONFIG = true;
+   RUNTIME_PLATFORM platformOld = setRuntimePlatform(et);
+   if(shouldSkipTest())
+   return;

try
{

http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/ternary/TableOutputTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/ternary/TableOutputTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/ternary/TableOutputTest.java
index 2c51ad8..91b66db 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/ternary/TableOutputTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/ternary/TableOutputTest.java
@@ -21,7 +21,6 @@ package org.apache.sysml.test.integration.functions.ternary;
 
 import java.util.HashMap;
 
-import org.junit.Assert;
 import org.junit.Test;
 
 import org.apache.sysml.api.DMLScript;
@@ -114,18 +113,10 @@ public class TableOutputTest extends AutomatedTestBase
 */
private void runTableOutputTest( ExecType et, int delta)
{
-   //rtplatform for MR
-   RUNTIME_PLATFORM platformOld = rtplatform;
-
-   switch( et ){
-   case MR: rtplatform = RUNTIME_PLATFORM.HADOOP; break;
-   case SPARK: rtplatform = RUNTIME_PLATFORM.SPARK; break;
-   default: rtplatform = RUNTIME_PLATFORM.HYBRID; break;
-   }
-   
boolean sparkConfigOld = DMLScript.USE_LOCAL_SPARK_CONFIG;
-   if( rtplatform == RUNTIME_PLATFORM.SPARK )
-   DMLScript.USE_LOCAL_SPARK_CONFIG = true;
+   RUNTIME_PLATFORM platformOld = 

[09/11] systemml git commit: [SYSTEMML-2496] Skip MapReduce tests as it is in maintenance mode

2018-10-14 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MapMultLimitTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MapMultLimitTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MapMultLimitTest.java
index 3ac6bc3..16ee1df 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MapMultLimitTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MapMultLimitTest.java
@@ -19,10 +19,10 @@
 
 package org.apache.sysml.test.integration.functions.binary.matrix;
 
-import org.junit.Assert;
 import org.junit.Test;
 
 import org.apache.sysml.api.DMLScript.RUNTIME_PLATFORM;
+import org.apache.sysml.lops.LopProperties.ExecType;
 import org.apache.sysml.runtime.matrix.MatrixCharacteristics;
 import org.apache.sysml.test.integration.AutomatedTestBase;
 import org.apache.sysml.test.integration.TestConfiguration;
@@ -57,8 +57,9 @@ public class MapMultLimitTest extends AutomatedTestBase
public void testMapMultLimit()
{
 
-   RUNTIME_PLATFORM rtold = rtplatform;
-   rtplatform = RUNTIME_PLATFORM.HADOOP;
+   RUNTIME_PLATFORM rtold = setRuntimePlatform(ExecType.MR);
+   if(shouldSkipTest())
+   return;
 
try
{
@@ -91,7 +92,7 @@ public class MapMultLimitTest extends AutomatedTestBase
// Expected 3 jobs: 1 Reblock, 2 MapMults
runTest(true, exceptionExpected, null, 3); 
//System.out.println("#Jobs: " + 
Statistics.getNoOfExecutedMRJobs() + ", " + Statistics.getNoOfCompiledMRJobs());
-   
Assert.assertTrue(Statistics.getNoOfExecutedMRJobs()==3);
+   assertTrue(Statistics.getNoOfExecutedMRJobs()==3);
}
finally
{
@@ -99,4 +100,4 @@ public class MapMultLimitTest extends AutomatedTestBase
}
}

-}
\ No newline at end of file
+}

http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MatrixMultiplicationTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MatrixMultiplicationTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MatrixMultiplicationTest.java
index 60e97e0..e02d0c1 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MatrixMultiplicationTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MatrixMultiplicationTest.java
@@ -53,6 +53,9 @@ public class MatrixMultiplicationTest extends 
AutomatedTestBase
 
@Test
public void testMatrixMultiplication() {
+   if(shouldSkipTest())
+   return;
+   
int m = 20;
int n = 20;
int k = 20;
@@ -80,6 +83,9 @@ public class MatrixMultiplicationTest extends 
AutomatedTestBase

@Test
public void testSparseMatrixMultiplication() {
+   if(shouldSkipTest())
+   return;
+   
int m = 40;
int n = 10;
int k = 30;
@@ -107,6 +113,9 @@ public class MatrixMultiplicationTest extends 
AutomatedTestBase
 
@Test
public void testWrongDimensions() {
+   if(shouldSkipTest())
+   return;
+   
int m = 6;
int n1 = 8;
int n2 = 10;
@@ -128,6 +137,9 @@ public class MatrixMultiplicationTest extends 
AutomatedTestBase
 
@Test
public void testAMultASpecial1() {
+   if(shouldSkipTest())
+   return;
+   
int rows = 10;
int cols = 10;
 
@@ -151,6 +163,9 @@ public class MatrixMultiplicationTest extends 
AutomatedTestBase
 
@Test
public void testAMultBSpecial2() {
+   if(shouldSkipTest())
+   return;
+   
int rows = 10;
int cols = 10;
 

http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MatrixVectorTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MatrixVectorTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/binary/matrix/MatrixVectorTest.java
index acc0355..79bd463 100644
--- 

[08/11] systemml git commit: [SYSTEMML-2496] Skip MapReduce tests as it is in maintenance mode

2018-10-14 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/codegen/APICodegenTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/codegen/APICodegenTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/codegen/APICodegenTest.java
index fcd2326..4e94993 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/codegen/APICodegenTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/codegen/APICodegenTest.java
@@ -36,7 +36,6 @@ import org.apache.sysml.runtime.util.DataConverter;
 import org.apache.sysml.test.integration.AutomatedTestBase;
 import org.apache.sysml.utils.Statistics;
 import org.junit.After;
-import org.junit.Assert;
 import org.junit.Test;
 
 
@@ -67,6 +66,9 @@ public class APICodegenTest extends AutomatedTestBase
 
private void runMLContextParforDatasetTest(boolean jmlc) 
{
+   if(shouldSkipTest())
+   return;
+   
try {
double[][] X = getRandomMatrix(rows, cols, -10, 10, 
sparsity, 76543); 
MatrixBlock mX = DataConverter.convertToMatrixBlock(X); 
@@ -107,7 +109,7 @@ public class APICodegenTest extends AutomatedTestBase
}

//check for generated operator
-   
Assert.assertTrue(heavyHittersContainsSubString("spoofRA"));
+   assertTrue(heavyHittersContainsSubString("spoofRA"));
}
catch(Exception ex) {
throw new RuntimeException(ex);

http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/codegen/CPlanComparisonTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/codegen/CPlanComparisonTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/codegen/CPlanComparisonTest.java
index 0b85fae..084e560 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/codegen/CPlanComparisonTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/codegen/CPlanComparisonTest.java
@@ -19,7 +19,6 @@
 
 package org.apache.sysml.test.integration.functions.codegen;
 
-import org.junit.Assert;
 import org.junit.Test;
 import org.apache.sysml.hops.DataOp;
 import org.apache.sysml.hops.Hop;
@@ -58,140 +57,168 @@ public class CPlanComparisonTest extends AutomatedTestBase

@Test
public void testEqualLiteral() {
+   if(shouldSkipTest())
+   return;
CNodeData c1 = new CNodeData(new LiteralOp(7), 0, 0, 
DataType.SCALAR);
CNodeData c2 = new CNodeData(new LiteralOp(7), 0, 0, 
DataType.SCALAR);
-   Assert.assertEquals(c1.hashCode(), c2.hashCode());
-   Assert.assertEquals(c1, c2);
+   assertEquals(c1.hashCode(), c2.hashCode());
+   assertEquals(c1, c2);
c1.setLiteral(true);
c2.setLiteral(true);
-   Assert.assertEquals(c1.hashCode(), c2.hashCode());
-   Assert.assertEquals(c1, c2);
+   assertEquals(c1.hashCode(), c2.hashCode());
+   assertEquals(c1, c2);
c1.setStrictEquals(true);
c2.setStrictEquals(true);
-   Assert.assertEquals(c1.hashCode(), c2.hashCode());
-   Assert.assertEquals(c1, c2);
+   assertEquals(c1.hashCode(), c2.hashCode());
+   assertEquals(c1, c2);
}

@Test
public void testNotEqualLiteral() {
+   if(shouldSkipTest())
+   return;
CNodeData c1 = new CNodeData(new LiteralOp(7), 0, 0, 
DataType.SCALAR);
CNodeData c2 = new CNodeData(new LiteralOp(3), 0, 0, 
DataType.SCALAR);
-   Assert.assertNotEquals(c1.hashCode(), c2.hashCode());
-   Assert.assertNotEquals(c1, c2);
+   assertNotEquals(c1.hashCode(), c2.hashCode());
+   assertNotEquals(c1, c2);
c1.setLiteral(true);
c2.setLiteral(true);
-   Assert.assertNotEquals(c1.hashCode(), c2.hashCode());
-   Assert.assertNotEquals(c1, c2);
+   assertNotEquals(c1.hashCode(), c2.hashCode());
+   assertNotEquals(c1, c2);
c1.setStrictEquals(true);
c2.setStrictEquals(true);
-   Assert.assertNotEquals(c1.hashCode(), c2.hashCode());
-   Assert.assertNotEquals(c1, c2);
+   assertNotEquals(c1.hashCode(), c2.hashCode());
+   assertNotEquals(c1, c2);
}

@Test
public 

[01/11] systemml git commit: [SYSTEMML-2496] Skip MapReduce tests as it is in maintenance mode

2018-10-14 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 7907c0ea5 -> 95bf8cfe6


http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/FullDistributionTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/FullDistributionTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/FullDistributionTest.java
index 7122f99..92707a0 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/FullDistributionTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/FullDistributionTest.java
@@ -187,16 +187,11 @@ public class FullDistributionTest extends 
AutomatedTestBase
 */
private void runDFTest(TEST_TYPE type, boolean inverse, Double param1, 
Double param2, ExecType instType) 
{
-   //setup multi backend configuration
-   RUNTIME_PLATFORM platformOld = rtplatform;
-   switch( instType ){
-   case MR: rtplatform = RUNTIME_PLATFORM.HADOOP; break;
-   case SPARK: rtplatform = RUNTIME_PLATFORM.SPARK; break;
-   default: rtplatform = RUNTIME_PLATFORM.HYBRID; break;
-   }
boolean sparkConfigOld = DMLScript.USE_LOCAL_SPARK_CONFIG;
-   if( rtplatform == RUNTIME_PLATFORM.SPARK )
-   DMLScript.USE_LOCAL_SPARK_CONFIG = true;
+   RUNTIME_PLATFORM platformOld = setRuntimePlatform(instType);
+   if(shouldSkipTest())
+   return;
+

try
{

http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/NotTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/NotTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/NotTest.java
index 102656a..67d81e6 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/NotTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/NotTest.java
@@ -51,6 +51,9 @@ public class NotTest extends AutomatedTestBase

@Test
public void testNot() {
+   if(shouldSkipTest())
+   return;
+   
TestConfiguration config = getTestConfiguration("NotTest");
loadTestConfiguration(config);


http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/PrintTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/PrintTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/PrintTest.java
index f73f37f..66e0d40 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/PrintTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/unary/scalar/PrintTest.java
@@ -44,6 +44,9 @@ public class PrintTest extends AutomatedTestBase
 
@Test
public void testInt() {
+   if(shouldSkipTest())
+   return;
+   
int value = 0;
 
TestConfiguration config = 
availableTestConfigurations.get("PrintTest");
@@ -57,6 +60,9 @@ public class PrintTest extends AutomatedTestBase

@Test
public void testDouble() {
+   if(shouldSkipTest())
+   return;
+   
double value = 1337.3;
 
TestConfiguration config = 
availableTestConfigurations.get("PrintTest");
@@ -70,6 +76,9 @@ public class PrintTest extends AutomatedTestBase

@Test
public void testBoolean() {
+   if(shouldSkipTest())
+   return;
+   
String value = "TRUE";
 
TestConfiguration config = 
availableTestConfigurations.get("PrintTest");
@@ -83,6 +92,9 @@ public class PrintTest extends AutomatedTestBase

@Test
public void testString() {
+   if(shouldSkipTest())
+   return;
+   
String value = "\"Hello World!\"";
 
TestConfiguration config = 
availableTestConfigurations.get("PrintTest");
@@ -96,6 +108,9 @@ public class PrintTest extends AutomatedTestBase

@Test
public void testStringWithoutMsg() {
+   if(shouldSkipTest())
+   return;
+   
String value = 

[11/11] systemml git commit: [SYSTEMML-2496] Skip MapReduce tests as it is in maintenance mode

2018-10-14 Thread niketanpansare
[SYSTEMML-2496] Skip MapReduce tests as it is in maintenance mode

- This commit makes hybrid_spark the default runtime for JUnit tests to increase 
the coverage of our Spark backend.
- The FrameConverterTest and parfor tests are kept as is and can be modified in 
subsequent commits.
- The MapReduce tests can be turned on using the TEST_MR_BACKEND flag in the 
AutomatedTestBase class.
- We only disable the HADOOP runtime and not the HYBRID runtime. We can disable 
the latter in a subsequent commit.
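
The skip pattern that recurs throughout this diff (`setRuntimePlatform(et)` followed by `if(shouldSkipTest()) return;`) can be sketched roughly as below. This is a minimal, self-contained illustration and not the actual AutomatedTestBase code, which is not shown in this message: the enums are simplified stand-ins, the default value of TEST_MR_BACKEND is assumed to be false, and the bodies of both helpers are guesses based on the removed switch statements and the bullets above.

```java
public class SkipMrSketch {
    // Simplified stand-ins for SystemML's ExecType and RUNTIME_PLATFORM enums.
    enum ExecType { CP, MR, SPARK }
    enum RuntimePlatform { HADOOP, SPARK, HYBRID, HYBRID_SPARK }

    // Mirrors the TEST_MR_BACKEND flag named in the commit message (default assumed).
    static boolean TEST_MR_BACKEND = false;

    // Current platform; hybrid_spark is the default per the commit message.
    static RuntimePlatform rtplatform = RuntimePlatform.HYBRID_SPARK;

    // Hypothetical helper: selects the platform for the requested execution
    // type and returns the previous platform so the caller can restore it.
    static RuntimePlatform setRuntimePlatform(ExecType et) {
        RuntimePlatform old = rtplatform;
        switch (et) {
            case MR:    rtplatform = RuntimePlatform.HADOOP; break;
            case SPARK: rtplatform = RuntimePlatform.SPARK; break;
            default:    rtplatform = RuntimePlatform.HYBRID_SPARK; break;
        }
        return old;
    }

    // Hypothetical helper: only the HADOOP runtime is skipped, and only
    // while the MapReduce tests are switched off.
    static boolean shouldSkipTest() {
        return rtplatform == RuntimePlatform.HADOOP && !TEST_MR_BACKEND;
    }

    public static void main(String[] args) {
        RuntimePlatform platformOld = setRuntimePlatform(ExecType.MR);
        try {
            if (shouldSkipTest()) {
                System.out.println("skipped");
                return;
            }
            System.out.println("running on " + rtplatform);
        } finally {
            // Restore the previous platform on both the skip and the run path.
            rtplatform = platformOld;
        }
    }
}
```

In this sketch the skip check sits inside the try block so that the finally clause restores the previous platform even when the test is skipped; that placement is a choice made for the illustration.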


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/95bf8cfe
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/95bf8cfe
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/95bf8cfe

Branch: refs/heads/master
Commit: 95bf8cfe6c0ae4f45b460a03c397203b7b354fc5
Parents: 7907c0e
Author: Niketan Pansare 
Authored: Sun Oct 14 12:14:19 2018 -0700
Committer: Niketan Pansare 
Committed: Sun Oct 14 12:14:19 2018 -0700

--
 .../test/integration/AutomatedTestBase.java | 161 ++-
 .../applications/ApplyTransformTest.java| 258 +--
 .../integration/applications/ArimaTest.java |   3 +
 .../integration/applications/CsplineCGTest.java |   3 +
 .../integration/applications/CsplineDSTest.java |   3 +
 .../test/integration/applications/GLMTest.java  |   3 +
 .../test/integration/applications/GNMFTest.java |   3 +
 .../test/integration/applications/HITSTest.java |   2 +
 .../test/integration/applications/ID3Test.java  |   8 +-
 .../integration/applications/L2SVMTest.java |   3 +
 .../applications/LinearRegressionTest.java  |   3 +
 .../applications/MDABivariateStatsTest.java |   3 +
 .../applications/MultiClassSVMTest.java |   3 +
 .../applications/NaiveBayesParforTest.java  |   3 +
 .../applications/NaiveBayesTest.java|   3 +
 .../integration/applications/PageRankTest.java  |   3 +
 .../integration/applications/WelchTTest.java|   3 +
 .../BivariateCategoricalCategoricallTest.java   |   9 +
 .../BivariateOrdinalOrdinalTest.java|   6 +
 .../BivariateScaleCategoricalTest.java  |   5 +
 .../BivariateScaleScaleTest.java|   8 +-
 .../descriptivestats/OrderStatisticsTest.java   |   3 +
 .../UnivariateCategoricalTest.java  |   6 +
 .../descriptivestats/UnivariateStatsBase.java   |  10 +-
 .../dml/ScalableDecompositionTest.java  |  14 +-
 .../parfor/ParForBivariateStatsTest.java|   5 +-
 .../parfor/ParForCVMulticlassSVMTest.java   |   5 +-
 .../parfor/ParForCorrelationTest.java   |   8 +-
 .../parfor/ParForCorrelationTestLarge.java  |   4 +-
 .../parfor/ParForNaiveBayesTest.java|  12 +-
 .../applications/parfor/ParForSampleTest.java   |  28 +-
 .../parfor/ParForUnivariateStatsTest.java   |   4 +-
 .../conversion/RDDConverterUtilsExtTest.java|   9 +-
 .../functions/aggregate/AggregateInfTest.java   |   7 +-
 .../functions/aggregate/AggregateNaNTest.java   |  11 +-
 .../functions/aggregate/ColStdDevsTest.java |  21 +-
 .../functions/aggregate/ColSumTest.java |   6 +
 .../functions/aggregate/ColSumsSqTest.java  |  28 +-
 .../functions/aggregate/ColVariancesTest.java   |  28 +-
 .../functions/aggregate/FullAggregateTest.java  |  17 +-
 .../aggregate/FullColAggregateTest.java |  12 +-
 .../FullGroupedAggregateMatrixTest.java |  13 +-
 .../aggregate/FullGroupedAggregateTest.java |  15 +-
 .../aggregate/FullRowAggregateTest.java |  13 +-
 .../functions/aggregate/LengthTest.java |   6 +
 .../functions/aggregate/MaxTest.java|   6 +
 .../functions/aggregate/MinTest.java|   6 +
 .../functions/aggregate/NColTest.java   |   6 +
 .../functions/aggregate/NRowTest.java   |   6 +
 .../functions/aggregate/ProdTest.java   |   6 +
 .../aggregate/PushdownSumBinaryTest.java|  17 +-
 .../aggregate/RowColProdsAggregateTest.java |  14 +-
 .../functions/aggregate/RowStdDevsTest.java |  21 +-
 .../functions/aggregate/RowSumTest.java |   6 +
 .../functions/aggregate/RowSumsSqTest.java  |  28 +-
 .../functions/aggregate/RowVariancesTest.java   |  27 +-
 .../functions/aggregate/StdDevTest.java |  21 +-
 .../functions/aggregate/SumSqTest.java  |  27 +-
 .../functions/aggregate/SumTest.java|   6 +
 .../functions/aggregate/TraceTest.java  |   6 +
 .../functions/aggregate/VarianceTest.java   |  20 +-
 .../functions/append/AppendChainTest.java   |   8 +-
 .../functions/append/AppendMatrixTest.java  |  14 +-
 .../functions/append/AppendVectorTest.java  |  11 +-
 .../functions/append/RBindCBindMatrixTest.java  |  15 +-
 .../functions/append/StringAppendTest.java  |  15 +-
 .../binary/matrix/BinUaggChainTest.java |  15 +-
 .../binary/matrix/CentralMomentTest.java|  15 +-
 

[05/11] systemml git commit: [SYSTEMML-2496] Skip MapReduce tests as it is in maintenance mode

2018-10-14 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/misc/FunctionInExpressionTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/misc/FunctionInExpressionTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/misc/FunctionInExpressionTest.java
index ad7aecc..867899b 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/misc/FunctionInExpressionTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/misc/FunctionInExpressionTest.java
@@ -20,7 +20,6 @@
 package org.apache.sysml.test.integration.functions.misc;
 
 
-import org.junit.Assert;
 import org.junit.Test;
 import org.apache.sysml.runtime.matrix.data.MatrixValue.CellIndex;
 import org.apache.sysml.test.integration.AutomatedTestBase;
@@ -82,6 +81,9 @@ public class FunctionInExpressionTest extends 
AutomatedTestBase

private void runFunInExpressionTest( String testName )
{
+   if(shouldSkipTest())
+   return;
+   
TestConfiguration config = getTestConfiguration(testName);
loadTestConfiguration(config);

@@ -97,6 +99,6 @@ public class FunctionInExpressionTest extends 
AutomatedTestBase

//compare results
double val = readDMLMatrixFromHDFS("R").get(new CellIndex(1,1));
-   Assert.assertTrue("Wrong result: 7 vs "+val, 
Math.abs(val-7)

http://git-wip-us.apache.org/repos/asf/systemml/blob/95bf8cfe/src/test/java/org/apache/sysml/test/integration/functions/misc/FunctionInliningTest.java
--
diff --git 
a/src/test/java/org/apache/sysml/test/integration/functions/misc/FunctionInliningTest.java
 
b/src/test/java/org/apache/sysml/test/integration/functions/misc/FunctionInliningTest.java
index 0605c02..6d230fa 100644
--- 
a/src/test/java/org/apache/sysml/test/integration/functions/misc/FunctionInliningTest.java
+++ 
b/src/test/java/org/apache/sysml/test/integration/functions/misc/FunctionInliningTest.java
@@ -19,7 +19,6 @@
 
 package org.apache.sysml.test.integration.functions.misc;
 
-import org.junit.Assert;
 import org.junit.Test;
 
 import org.apache.sysml.hops.OptimizerUtils;
@@ -93,6 +92,8 @@ public class FunctionInliningTest extends AutomatedTestBase
private void runInliningTest( String testname, boolean IPA )
{   
boolean oldIPA = OptimizerUtils.ALLOW_INTER_PROCEDURAL_ANALYSIS;
+   if(shouldSkipTest())
+   return;

try
{
@@ -111,22 +112,23 @@ public class FunctionInliningTest extends 
AutomatedTestBase

//compare output
double ret = 
MapReduceTool.readDoubleFromHDFSFile(output("Rout"));
-   Assert.assertEquals(Double.valueOf(rows*cols*val*6), 
Double.valueOf(ret));
+   assertEquals(Double.valueOf(rows*cols*val*6), 
Double.valueOf(ret));

//compiled MR jobs
-   int expectNumCompiled = IPA ? 0 : 
(testname.equals(TEST_NAME1)?2: //2GMR in foo1 and foo2 (not removed w/o IPA)
-  
(testname.equals(TEST_NAME2)?4: //3GMR in foo1 and foo2, 1GMR for subsequent 
sum  
-   5 )); //5GMR in 
foo1-foo5 (not removed w/o IPA) 
-   Assert.assertEquals("Unexpected number of compiled MR 
jobs.", 
+// int expectNumCompiled = IPA ? 0 : 
(testname.equals(TEST_NAME1)?2: //2GMR in foo1 and foo2 (not removed w/o IPA)
+//
(testname.equals(TEST_NAME2)?4: //3GMR in foo1 and foo2, 1GMR for subsequent 
sum  
+// 5 )); //5GMR in 
foo1-foo5 (not removed w/o IPA) 
+   int expectNumCompiled = 0;
+   assertEquals("Unexpected number of compiled MR jobs.", 
expectNumCompiled, 
Statistics.getNoOfCompiledMRJobs());

//check executed MR jobs
int expectNumExecuted = 0; //executed jobs should 
always be 0 due to dynamic recompilation
-   Assert.assertEquals("Unexpected number of executed MR 
jobs.", expectNumExecuted, Statistics.getNoOfExecutedMRJobs());
+   assertEquals("Unexpected number of executed MR jobs.", 
expectNumExecuted, Statistics.getNoOfExecutedMRJobs());
}
catch(Exception ex)
{
-   

systemml git commit: [SYSTEMML-1325] Harmonize Compilation Execution Pipelines and Add GPU Support to JMLC

2018-10-14 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 41de8dcdc -> 7907c0ea5


[SYSTEMML-1325] Harmonize Compilation Execution Pipelines and Add GPU
Support to JMLC

This PR adds support for compilation and execution of GPU-enabled
scripts in JMLC and harmonizes the pipeline used to compile and execute
DML programs across JMLC, MLContext and DMLScript. Specifically, the
following changes were made:

1. All three APIs now call ScriptExecutorUtils.compileRuntimeProgram to
compile DML scripts. The original logic in MLContext and JMLC for
pinning inputs and persisting outputs has been preserved.
2. All three APIs now use ScriptExecutorUtils.executeRuntimeProgram to
execute the compiled program. Previously, JMLC called the Script.execute
method directly.
3. jmlc.Connection.prepareScript now supports compiling a script to use
GPU. Note that following #832 the issue noted in #830 has been resolved.
4. A PreparedScript is now statically assigned a GPU context when it is
compiled and instantiated. This has potential performance implications
because it means that a PreparedScript must be executed on a specific
GPU. However, it reduces the overhead of creating a GPU context each
time a script is executed and ensures that a user cannot compile a
script to use GPU and then forget to assign a GPU context when the
script is run.
5. Per (3) I have added a unit test which compiles and executes a GPU
enabled script in JMLC both with and without pinned data and just
asserts that no errors occur.

Closes #836.
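
The design choice in item 4 can be illustrated with a minimal sketch.
All class names below (SketchGpuContext, SketchGpuContextPool,
SketchPreparedScript) are hypothetical stand-ins invented for this
illustration; they are not the SystemML classes touched by this commit.
The sketch only shows the idea of reserving a context once, when the
script object is created, and reusing it on every execution.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a GPU context; not SystemML's GPUContext.
final class SketchGpuContext {
    final int deviceId;
    SketchGpuContext(int deviceId) { this.deviceId = deviceId; }
}

// Hypothetical pool that hands out one context per device.
final class SketchGpuContextPool {
    private final Deque<SketchGpuContext> free = new ArrayDeque<>();

    SketchGpuContextPool(int numDevices) {
        for (int i = 0; i < numDevices; i++) free.add(new SketchGpuContext(i));
    }

    SketchGpuContext reserve() {
        SketchGpuContext ctx = free.poll();
        if (ctx == null) throw new IllegalStateException("no free GPU context");
        return ctx;
    }

    void release(SketchGpuContext ctx) { free.add(ctx); }

    int available() { return free.size(); }
}

// A "compiled script" that owns its context from construction to close().
final class SketchPreparedScript implements AutoCloseable {
    private final SketchGpuContextPool pool;
    private final SketchGpuContext ctx; // null when compiled for CPU only
    private boolean closed = false;

    SketchPreparedScript(SketchGpuContextPool pool, boolean useGpu) {
        this.pool = pool;
        // The context is bound here, at "compile" time, not per execution.
        this.ctx = useGpu ? pool.reserve() : null;
    }

    // Every execution reuses the context bound in the constructor.
    String execute() {
        return ctx == null ? "cpu" : "gpu:" + ctx.deviceId;
    }

    @Override
    public void close() {
        if (!closed && ctx != null) pool.release(ctx);
        closed = true;
    }
}
```

The trade-off matches the one described above: the caller cannot forget
to attach a context, but the script stays pinned to one device until it
is closed.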


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/7907c0ea
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/7907c0ea
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/7907c0ea

Branch: refs/heads/master
Commit: 7907c0ea5109e9b33465b7d7a2ac2bf0c42ab380
Parents: 41de8dc
Author: Anthony Thomas 
Authored: Sun Oct 14 09:24:33 2018 -0700
Committer: Niketan Pansare 
Committed: Sun Oct 14 09:25:55 2018 -0700

--
 .../java/org/apache/sysml/api/DMLScript.java| 172 ---
 .../apache/sysml/api/ScriptExecutorUtils.java   | 290 ---
 .../org/apache/sysml/api/jmlc/Connection.java   | 158 +-
 .../apache/sysml/api/jmlc/PreparedScript.java   |  43 +--
 .../sysml/api/mlcontext/ScriptExecutor.java | 100 +++
 .../apache/sysml/conf/ConfigurationManager.java |  38 ++-
 .../controlprogram/LocalVariableMap.java|  11 +-
 .../runtime/controlprogram/ProgramBlock.java|   4 +-
 .../controlprogram/caching/CacheableData.java   |   4 +-
 .../context/ExecutionContext.java   |   4 +-
 .../context/SparkExecutionContext.java  |   5 +-
 .../gpu/context/GPUContextPool.java |  31 +-
 .../java/org/apache/sysml/utils/Statistics.java |   6 +-
 .../org/apache/sysml/test/gpu/GPUTests.java |  13 +-
 .../org/apache/sysml/test/gpu/JMLCTests.java| 126 
 .../jmlc/JMLCParfor2ForCompileTest.java |   7 +-
 16 files changed, 661 insertions(+), 351 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/7907c0ea/src/main/java/org/apache/sysml/api/DMLScript.java
--
diff --git a/src/main/java/org/apache/sysml/api/DMLScript.java 
b/src/main/java/org/apache/sysml/api/DMLScript.java
index 9976adc..16d8986 100644
--- a/src/main/java/org/apache/sysml/api/DMLScript.java
+++ b/src/main/java/org/apache/sysml/api/DMLScript.java
@@ -32,6 +32,7 @@ import java.util.Arrays;
 import java.util.Collections;
 import java.util.Date;
 import java.util.HashSet;
+import java.util.List;
 import java.util.Map;
 import java.util.Scanner;
 
@@ -48,6 +49,7 @@ import org.apache.hadoop.security.UserGroupInformation;
 import org.apache.hadoop.util.GenericOptionsParser;
 import org.apache.log4j.Level;
 import org.apache.log4j.Logger;
+import org.apache.sysml.api.ScriptExecutorUtils.SystemMLAPI;
 import org.apache.sysml.api.mlcontext.ScriptType;
 import org.apache.sysml.conf.CompilerConfig;
 import org.apache.sysml.conf.ConfigurationManager;
@@ -65,13 +67,14 @@ import org.apache.sysml.parser.ParserFactory;
 import org.apache.sysml.parser.ParserWrapper;
 import org.apache.sysml.runtime.DMLRuntimeException;
 import org.apache.sysml.runtime.DMLScriptException;
+import org.apache.sysml.runtime.controlprogram.LocalVariableMap;
 import org.apache.sysml.runtime.controlprogram.Program;
 import org.apache.sysml.runtime.controlprogram.caching.CacheableData;
 import org.apache.sysml.runtime.controlprogram.context.ExecutionContext;
-import org.apache.sysml.runtime.controlprogram.context.ExecutionContextFactory;
 import org.apache.sysml.runtime.controlprogram.context.SparkExecutionContext;
 import 
org.apache.sysml.runtime.controlprogram.parfor.stat.InfrastructureAnalyzer;
 import 

systemml git commit: [SYSTEMML-540] Allow user to generate an inlined DML script in Caffe2DML

2018-10-11 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 11c67055a -> ef1945d70


[SYSTEMML-540] Allow user to generate an inlined DML script in Caffe2DML

- The inlining code is generic enough to be extended to perform parser-level 
inlining. This commit allows us to compare the tradeoffs of script-level 
inlining vs. hop-level inlining.
- Refactored DMLParserWrapper and also added javadoc.


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/ef1945d7
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/ef1945d7
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/ef1945d7

Branch: refs/heads/master
Commit: ef1945d70a85df4f646c315d06a1a094dad6ebb2
Parents: 11c6705
Author: Niketan Pansare 
Authored: Thu Oct 11 14:28:59 2018 -0700
Committer: Niketan Pansare 
Committed: Thu Oct 11 14:32:35 2018 -0700

--
 .../parser/common/CustomErrorListener.java  |   8 +
 .../sysml/parser/dml/DMLParserWrapper.java  | 130 ++-
 .../java/org/apache/sysml/parser/dml/Dml.g4 |   6 +-
 .../sysml/parser/dml/DmlPreprocessor.java   |  13 +-
 .../apache/sysml/parser/dml/InlineHelper.java   | 798 +++
 .../sysml/parser/dml/InlineableMethods.java |  98 +++
 .../controlprogram/caching/CacheableData.java   |   9 +-
 .../gpu/context/GPUMemoryManager.java   |   2 +-
 src/main/python/systemml/mllearn/estimators.py  |   5 +-
 .../org/apache/sysml/api/dl/Caffe2DML.scala |   8 +-
 .../org/apache/sysml/api/dl/CaffeLayer.scala|  12 +-
 .../org/apache/sysml/api/dl/CaffeSolver.scala   |  35 +-
 .../org/apache/sysml/api/dl/DMLGenerator.scala  |  36 +-
 .../scala/org/apache/sysml/api/dl/Utils.scala   |  58 +-
 14 files changed, 1114 insertions(+), 104 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/ef1945d7/src/main/java/org/apache/sysml/parser/common/CustomErrorListener.java
--
diff --git 
a/src/main/java/org/apache/sysml/parser/common/CustomErrorListener.java 
b/src/main/java/org/apache/sysml/parser/common/CustomErrorListener.java
index 2af5f69..b82afc9 100644
--- a/src/main/java/org/apache/sysml/parser/common/CustomErrorListener.java
+++ b/src/main/java/org/apache/sysml/parser/common/CustomErrorListener.java
@@ -22,6 +22,7 @@ package org.apache.sysml.parser.common;
 import java.util.ArrayList;
 import java.util.Collections;
 import java.util.List;
+import java.util.Set;
 
 import org.antlr.v4.runtime.BaseErrorListener;
 import org.antlr.v4.runtime.RecognitionException;
@@ -38,6 +39,9 @@ public class CustomErrorListener extends BaseErrorListener {
private boolean atLeastOneError = false;
private boolean atLeastOneWarning = false;
private String currentFileName = null;
+   
+   // Names of user internal and external functions definitions
+   public Set functions;
 
/**
 * List of parse issues.
@@ -55,6 +59,10 @@ public class CustomErrorListener extends BaseErrorListener {
public void unsetCurrentFileName() {
currentFileName = null;
}
+   
+   public Set getFunctionDefs() {
+   return functions;
+   }
 
/**
 * Validation error occurred. Add the error to the list of parse issues.

http://git-wip-us.apache.org/repos/asf/systemml/blob/ef1945d7/src/main/java/org/apache/sysml/parser/dml/DMLParserWrapper.java
--
diff --git a/src/main/java/org/apache/sysml/parser/dml/DMLParserWrapper.java 
b/src/main/java/org/apache/sysml/parser/dml/DMLParserWrapper.java
index 1d7daa1..9b3f8c4 100644
--- a/src/main/java/org/apache/sysml/parser/dml/DMLParserWrapper.java
+++ b/src/main/java/org/apache/sysml/parser/dml/DMLParserWrapper.java
@@ -23,12 +23,14 @@ import java.io.ByteArrayInputStream;
 import java.io.FileNotFoundException;
 import java.io.IOException;
 import java.io.InputStream;
+import java.util.HashMap;
 import java.util.Map;
 
 import org.antlr.v4.runtime.ANTLRInputStream;
 import org.antlr.v4.runtime.BailErrorStrategy;
 import org.antlr.v4.runtime.CommonTokenStream;
 import org.antlr.v4.runtime.DefaultErrorStrategy;
+import org.antlr.v4.runtime.TokenStreamRewriter;
 import org.antlr.v4.runtime.atn.PredictionMode;
 import org.antlr.v4.runtime.misc.ParseCancellationException;
 import org.antlr.v4.runtime.tree.ParseTree;
@@ -74,6 +76,15 @@ import 
org.apache.sysml.parser.dml.DmlParser.StatementContext;
 public class DMLParserWrapper extends ParserWrapper
 {
private static final Log LOG = 
LogFactory.getLog(DMLScript.class.getName());
+   
+   // Rewriter is only used in getInlineableMethods
+   private TokenStreamRewriter rewriter = null;
+   
+   // The below fields are set 

systemml git commit: [SYSTEMML-445] Improved the performance of batchnorm backward

2018-10-09 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 512fb9e11 -> 3702df7c1


[SYSTEMML-445] Improved the performance of batchnorm backward

- Added a custom kernel for computing dgamma in batch normalization
layer.
- Also, fixed a minor bug in GPUDenseInputPointerFetcher class.
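
As a reading aid for the kernel added in SystemML.cu below, here is a
CPU reference sketch in Java of the same element-wise computation over
an NCHW-flattened array. The class and method names are hypothetical.
The per-channel sum at the end is an assumption about how the temporary
result is later reduced to dgamma; that reduction is not part of the
kernel shown in this diff.

```java
final class BatchNormDgammaSketch {
    // Mirrors the kernel body: for flat index tid, the channel is
    // (tid % CHW) / HW and ret[tid] = dout * ((x - mean[c]) * var[c]).
    static double[] backwardDgammaTmp(double[] emaMean, double[] dout,
            double[] x, double[] emaVar, int n, int c, int hw) {
        int chw = c * hw;
        double[] ret = new double[n * chw];
        for (int tid = 0; tid < ret.length; tid++) {
            int channel = (tid % chw) / hw;
            ret[tid] = dout[tid] * ((x[tid] - emaMean[channel]) * emaVar[channel]);
        }
        return ret;
    }

    // Assumed follow-up step (not in the kernel): sum the temporary
    // values over N and HW to obtain one value per channel.
    static double[] sumPerChannel(double[] tmp, int n, int c, int hw) {
        int chw = c * hw;
        double[] out = new double[c];
        for (int tid = 0; tid < n * chw; tid++) {
            out[(tid % chw) / hw] += tmp[tid];
        }
        return out;
    }
}
```

Note that the kernel multiplies by ema_var rather than dividing by a
square root, which suggests the caller passes an already inverted
quantity; the sketch keeps the multiplication as written.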

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/3702df7c
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/3702df7c
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/3702df7c

Branch: refs/heads/master
Commit: 3702df7c1890b8c87c42715260240c604a5c3c64
Parents: 512fb9e
Author: Niketan Pansare 
Authored: Tue Oct 9 14:58:09 2018 -0700
Committer: Niketan Pansare 
Committed: Tue Oct 9 14:58:09 2018 -0700

--
 src/main/cpp/kernels/SystemML.cu|  21 +++
 src/main/cpp/kernels/SystemML.ptx   | 188 ---
 src/main/java/org/apache/sysml/hops/DnnOp.java  |   8 +-
 src/main/java/org/apache/sysml/hops/Hop.java|   3 +-
 .../hops/rewrite/RewriteGPUSpecificOps.java |  22 ++-
 .../org/apache/sysml/lops/DnnTransform.java |   7 +-
 .../instructions/GPUInstructionParser.java  |   1 +
 .../instructions/gpu/DnnGPUInstruction.java |  51 -
 .../gpu/GPUDenseInputPointerFetcher.java|   4 +-
 .../runtime/matrix/data/LibMatrixCUDA.java  |  19 +-
 10 files changed, 285 insertions(+), 39 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/3702df7c/src/main/cpp/kernels/SystemML.cu
--
diff --git a/src/main/cpp/kernels/SystemML.cu b/src/main/cpp/kernels/SystemML.cu
index a53d07a..26d7f43 100644
--- a/src/main/cpp/kernels/SystemML.cu
+++ b/src/main/cpp/kernels/SystemML.cu
@@ -2385,3 +2385,24 @@ extern "C" __global__ void invVar_f(float *X, float *C, 
double eps, unsigned int
   invVar(X, C, eps, size);
 }
 
+template <typename T>
+__device__ void backward_dgamma_tmp(T *ema_mean, T *dout, T *X, T*ema_var, 
T*ret, int N, int C,
+ int HW, int CHW, unsigned int NCHW) {
+  int tid = blockIdx.x * blockDim.x + threadIdx.x;
+  int ix = tid / CHW;
+  int iy = tid % CHW;
+  if (ix < N && iy < CHW) {
+int c = iy / HW;
+ret[tid] = dout[tid] * ((X[tid] - ema_mean[c]) * ema_var[c]);
+  }
+}
+
+extern "C" __global__ void backward_dgamma_tmp_d(double *ema_mean, double 
*dout, double *X, double* ema_var, double* ret, 
+   int N, int C, int HW, int CHW, unsigned int NCHW) {
+  backward_dgamma_tmp(ema_mean, dout, X, ema_var, ret, N, C, HW, CHW, NCHW);
+}
+
+extern "C" __global__ void backward_dgamma_tmp_f(double *ema_mean, double 
*dout, double *X, double* ema_var, double* ret, 
+   int N, int C, int HW, int CHW, int NCHW) {
+  backward_dgamma_tmp(ema_mean, dout, X, ema_var, ret, N, C, HW, CHW, NCHW);
+}

http://git-wip-us.apache.org/repos/asf/systemml/blob/3702df7c/src/main/cpp/kernels/SystemML.ptx
--
diff --git a/src/main/cpp/kernels/SystemML.ptx 
b/src/main/cpp/kernels/SystemML.ptx
index ac04967..3043373 100644
--- a/src/main/cpp/kernels/SystemML.ptx
+++ b/src/main/cpp/kernels/SystemML.ptx
@@ -15084,12 +15084,146 @@ BB123_2:
ret;
 }
 
+   // .globl   backward_dgamma_tmp_d
+.visible .entry backward_dgamma_tmp_d(
+   .param .u64 backward_dgamma_tmp_d_param_0,
+   .param .u64 backward_dgamma_tmp_d_param_1,
+   .param .u64 backward_dgamma_tmp_d_param_2,
+   .param .u64 backward_dgamma_tmp_d_param_3,
+   .param .u64 backward_dgamma_tmp_d_param_4,
+   .param .u32 backward_dgamma_tmp_d_param_5,
+   .param .u32 backward_dgamma_tmp_d_param_6,
+   .param .u32 backward_dgamma_tmp_d_param_7,
+   .param .u32 backward_dgamma_tmp_d_param_8,
+   .param .u32 backward_dgamma_tmp_d_param_9
+)
+{
+   .reg .pred  %p<4>;
+   .reg .b32   %r<11>;
+   .reg .f64   %fd<8>;
+   .reg .b64   %rd<18>;
+
+
+   ld.param.u64%rd1, [backward_dgamma_tmp_d_param_0];
+   ld.param.u64%rd2, [backward_dgamma_tmp_d_param_1];
+   ld.param.u64%rd3, [backward_dgamma_tmp_d_param_2];
+   ld.param.u64%rd4, [backward_dgamma_tmp_d_param_3];
+   ld.param.u64%rd5, [backward_dgamma_tmp_d_param_4];
+   ld.param.u32%r4, [backward_dgamma_tmp_d_param_5];
+   ld.param.u32%r2, [backward_dgamma_tmp_d_param_7];
+   ld.param.u32%r3, [backward_dgamma_tmp_d_param_8];
+   mov.u32 %r5, %ctaid.x;
+   mov.u32 %r6, %ntid.x;
+   mov.u32 %r7, %tid.x;
+   mad.lo.s32  %r1, %r6, %r5, %r7;
+   div.s32 %r8, %r1, %r3;
+   setp.lt.s32 %p1, %r8, %r4;
+   setp.gt.s32 %p2, %r3, -1;
+   and.pred%p3, %p1, %p2;
+   @!%p3 bra   BB124_2;
+ 

systemml git commit: [SYSTEMML-445] Extend coverage for GPU batchnorm test rewrite

2018-10-09 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master fab31fd1f -> 512fb9e11


[SYSTEMML-445] Extend coverage for GPU batchnorm test rewrite

- Previously, if the inv_var rewrite had already been applied, the GPU
batchnorm test rewrite (and hence the CuDNN batchnorm kernel) was
skipped. This commit fixes that performance regression.
- Also, this commit allows for forcing of GPU rewrites in case of forced
GPU mode.
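
The matching rule added in HopDagPatternMatcher below accepts either
the fused inv_var operator or the unfused expression 1/sqrt(var+eps).
A minimal, self-contained sketch of that two-way check follows. The
SketchNode class and its string opcodes are hypothetical and stand in
for SystemML's Hop classes and HopRewriteUtils helpers.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical expression node; stands in for a Hop.
final class SketchNode {
    final String op;      // e.g. "div", "sqrt", "plus", "inv_var", "lit", "leaf"
    final double value;   // only meaningful when op is "lit"
    final List<SketchNode> inputs;

    SketchNode(String op, double value, SketchNode... inputs) {
        this.op = op;
        this.value = value;
        this.inputs = Arrays.asList(inputs);
    }

    static SketchNode of(String op, SketchNode... inputs) {
        return new SketchNode(op, Double.NaN, inputs);
    }

    static SketchNode lit(double v) {
        return new SketchNode("lit", v);
    }
}

final class InvVarMatcherSketch {
    // True for the fused operator, or for div(1, sqrt(plus(_, _))).
    static boolean isInvVar(SketchNode h) {
        if (h.op.equals("inv_var")) {
            return true;
        }
        if (!h.op.equals("div") || h.inputs.size() != 2) {
            return false;
        }
        SketchNode numerator = h.inputs.get(0);
        SketchNode denominator = h.inputs.get(1);
        return numerator.op.equals("lit") && numerator.value == 1.0
            && denominator.op.equals("sqrt") && denominator.inputs.size() == 1
            && denominator.inputs.get(0).op.equals("plus");
    }
}
```

Accepting both shapes is what lets the batchnorm pattern still match
after an earlier rewrite has already collapsed 1/sqrt(var+eps) into a
single operator.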

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/512fb9e1
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/512fb9e1
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/512fb9e1

Branch: refs/heads/master
Commit: 512fb9e119541ae9d7dae58c0812a89d569d1ca0
Parents: fab31fd
Author: Niketan Pansare 
Authored: Tue Oct 9 13:56:47 2018 -0700
Committer: Niketan Pansare 
Committed: Tue Oct 9 13:56:47 2018 -0700

--
 .../sysml/hops/rewrite/HopDagPatternMatcher.java | 15 +++
 .../sysml/hops/rewrite/RewriteGPUSpecificOps.java|  2 +-
 2 files changed, 16 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/512fb9e1/src/main/java/org/apache/sysml/hops/rewrite/HopDagPatternMatcher.java
--
diff --git 
a/src/main/java/org/apache/sysml/hops/rewrite/HopDagPatternMatcher.java 
b/src/main/java/org/apache/sysml/hops/rewrite/HopDagPatternMatcher.java
index 51b1812..33cd5ed 100644
--- a/src/main/java/org/apache/sysml/hops/rewrite/HopDagPatternMatcher.java
+++ b/src/main/java/org/apache/sysml/hops/rewrite/HopDagPatternMatcher.java
@@ -294,6 +294,19 @@ public class HopDagPatternMatcher {
return new HopDagPatternMatcher().addPredicate("sqrt", h -> 
HopRewriteUtils.isUnary(h, OpOp1.SQRT))
.addChildMatcher(child);
}
+   public static HopDagPatternMatcher inv_var(HopDagPatternMatcher var, 
HopDagPatternMatcher eps) {
+   return new HopDagPatternMatcher().addPredicate("sqrt", h ->  {
+   if(HopRewriteUtils.isDnn(h, OpOpDnn.INV_VAR)) {
+   return true;
+   }
+   else {
+   return HopRewriteUtils.isBinary(h, OpOp2.DIV) 
&& HopRewriteUtils.isLiteralOfValue(h.getInput().get(0), 1.0) &&
+   HopRewriteUtils.isUnary(h.getInput().get(1), 
OpOp1.SQRT) && 
+   
HopRewriteUtils.isBinary(h.getInput().get(1).getInput().get(0), OpOp2.PLUS);
+   }
+   })
+   .addChildMatcher(var, eps);
+   }
public static HopDagPatternMatcher div(HopDagPatternMatcher child1, 
HopDagPatternMatcher child2) {
return new HopDagPatternMatcher().addPredicate("div", h -> 
HopRewriteUtils.isBinary(h, OpOp2.DIV))
.addChildMatcher(child1, child2);
@@ -370,6 +383,8 @@ public class HopDagPatternMatcher {
.addChildMatcher(child1, dummy);
}
private static boolean _fitsOnGPU(Hop h, double multiplier) {
+   if(ConfigurationManager.isForcedGPU())
+   return true;
double memEst = multiplier*h.getMemEstimate();
return ConfigurationManager.isGPU() && h.dimsKnown() && 
OptimizerUtils.isMemoryBasedOptLevel() &&
memEst < OptimizerUtils.getLocalMemBudget() && 
memEst < GPUContextPool.initialGPUMemBudget();

http://git-wip-us.apache.org/repos/asf/systemml/blob/512fb9e1/src/main/java/org/apache/sysml/hops/rewrite/RewriteGPUSpecificOps.java
--
diff --git 
a/src/main/java/org/apache/sysml/hops/rewrite/RewriteGPUSpecificOps.java 
b/src/main/java/org/apache/sysml/hops/rewrite/RewriteGPUSpecificOps.java
index 53d368b..577adc3 100644
--- a/src/main/java/org/apache/sysml/hops/rewrite/RewriteGPUSpecificOps.java
+++ b/src/main/java/org/apache/sysml/hops/rewrite/RewriteGPUSpecificOps.java
@@ -178,7 +178,7 @@ public class RewriteGPUSpecificOps extends 
HopRewriteRuleWithPatternMatcher {
HopDagPatternMatcher norm = 
bias_multiply(
bias_add(leaf("X", MATRIX), 
unaryMinus(leaf("mean", MATRIX))), // bias_add(X, -mean)
-   div(1, sqrt(plus(leaf("var", MATRIX), 
leaf("eps", SCALAR); // 1/sqrt(var+eps)
+   inv_var(leaf("var", MATRIX), 
leaf("eps", SCALAR))); // 1/sqrt(var+eps)
// hi = bias_add(bias_multiply(norm, gamma), beta)
_batchNormTest = 
bias_add(



systemml git commit: [SYSTEMML-445] Fixed the error handling during GPU memory cleanup

2018-10-09 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 8a144f2b3 -> fab31fd1f


[SYSTEMML-445] Fixed the error handling during GPU memory cleanup

If an error occurred while cleaning up temporary memory and freeing the
GPU context, SystemML did not display the correct error message. This
commit fixes this issue.
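
The general problem can be sketched as follows: when both the main
work and the cleanup throw, the cleanup exception must not replace the
original one. The diff below is truncated, so this is not the code
from the commit; CleanupSketch and runWithCleanup are hypothetical
names, and attaching the cleanup failure via addSuppressed is one
possible policy chosen for the illustration.

```java
final class CleanupSketch {
    // Runs body, then cleanup. If body failed, its exception is rethrown
    // and any cleanup failure is attached as a suppressed exception.
    // If only cleanup failed, that failure is thrown.
    static void runWithCleanup(Runnable body, Runnable cleanup) {
        RuntimeException primary = null;
        try {
            body.run();
        } catch (RuntimeException e) {
            primary = e;
        }

        RuntimeException finalizeException = null;
        try {
            cleanup.run(); // always attempted, even after a failure
        } catch (RuntimeException e) {
            finalizeException = e;
        }

        if (primary != null) {
            if (finalizeException != null) {
                primary.addSuppressed(finalizeException);
            }
            throw primary;
        }
        if (finalizeException != null) {
            throw finalizeException;
        }
    }
}
```

With this shape the caller always sees the execution error first, and
the cleanup error is still available through getSuppressed().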

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/fab31fd1
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/fab31fd1
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/fab31fd1

Branch: refs/heads/master
Commit: fab31fd1f3b8c832641ba2cd8f2a678ecdfcf043
Parents: 8a144f2
Author: Niketan Pansare 
Authored: Tue Oct 9 13:36:45 2018 -0700
Committer: Niketan Pansare 
Committed: Tue Oct 9 13:36:45 2018 -0700

--
 .../apache/sysml/api/ScriptExecutorUtils.java   | 45 
 1 file changed, 27 insertions(+), 18 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/fab31fd1/src/main/java/org/apache/sysml/api/ScriptExecutorUtils.java
--
diff --git a/src/main/java/org/apache/sysml/api/ScriptExecutorUtils.java 
b/src/main/java/org/apache/sysml/api/ScriptExecutorUtils.java
index e32fa29..9956518 100644
--- a/src/main/java/org/apache/sysml/api/ScriptExecutorUtils.java
+++ b/src/main/java/org/apache/sysml/api/ScriptExecutorUtils.java
@@ -75,6 +75,7 @@ public class ScriptExecutorUtils {
boolean exceptionThrown = false;

Statistics.startRunTimer();
+   Exception finalizeException = null;
try {
// run execute (w/ exception handling to ensure proper 
shutdown)
if (ConfigurationManager.isGPU() && ec != null) {
@@ -92,29 +93,34 @@ public class ScriptExecutorUtils {
throw e;
} finally { // ensure cleanup/shutdown
if (ConfigurationManager.isGPU() && 
!ec.getGPUContexts().isEmpty()) {
-   // 
-
-   // The below code pulls the output variables on 
the GPU to the host. This is required especially when:
-   // The output variable was generated as part of 
a MLContext session with GPU enabled
-   // and was passed to another MLContext with GPU 
disabled
-   // The above scenario occurs in our gpu test 
suite (eg: BatchNormTest).
-   if(outputVariables != null) {
-   for(String outVar : outputVariables) {
-   Data data = 
ec.getVariable(outVar);
-   if(data != null && data 
instanceof MatrixObject) {
-   for(GPUContext gCtx : 
ec.getGPUContexts()) {
-   GPUObject 
gpuObj = ((MatrixObject)data).getGPUObject(gCtx);
-   if(gpuObj != 
null && gpuObj.isDirty()) {
-   
gpuObj.acquireHostRead(null);
+   try {
+   // 
-
+   // The below code pulls the output 
variables on the GPU to the host. This is required especially when:
+   // The output variable was generated as 
part of a MLContext session with GPU enabled
+   // and was passed to another MLContext 
with GPU disabled
+   // The above scenario occurs in our gpu 
test suite (eg: BatchNormTest).
+   if(outputVariables != null) {
+   for(String outVar : 
outputVariables) {
+   Data data = 
ec.getVariable(outVar);
+   if(data != null && data 
instanceof MatrixObject) {
+   for(GPUContext 
gCtx : ec.getGPUContexts()) {
+   
GPUObject gpuObj = ((MatrixObject)data).getGPUObject(gCtx);
+   
if(gpuObj != null && gpuObj.isDirty()) {
+   
gpuObj.acquireHostRead(null);
+ 

systemml git commit: [MINOR] Bugfix in Keras2DML API when loading weights from Keras

2018-10-09 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master ef15e582b -> 8a144f2b3


[MINOR] Bugfix in Keras2DML API when loading weights from Keras

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/8a144f2b
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/8a144f2b
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/8a144f2b

Branch: refs/heads/master
Commit: 8a144f2b35343a7aa8fbb4bf7aedd31dd36a3852
Parents: ef15e58
Author: Niketan Pansare 
Authored: Tue Oct 9 13:25:47 2018 -0700
Committer: Niketan Pansare 
Committed: Tue Oct 9 13:25:47 2018 -0700

--
 src/main/python/systemml/mllearn/estimators.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/8a144f2b/src/main/python/systemml/mllearn/estimators.py
--
diff --git a/src/main/python/systemml/mllearn/estimators.py 
b/src/main/python/systemml/mllearn/estimators.py
index d231b08..fbcd3e2 100644
--- a/src/main/python/systemml/mllearn/estimators.py
+++ b/src/main/python/systemml/mllearn/estimators.py
@@ -1017,7 +1017,7 @@ class Keras2DML(Caffe2DML):
 weight_decay: regularation strength (default: 5e-4)
 regularization_type: regularization type (default: "L2")
 """
-from .keras2caffe import convertKerasToCaffeNetwork, 
convertKerasToCaffeSolver
+from .keras2caffe import convertKerasToCaffeNetwork, 
convertKerasToCaffeSolver, convertKerasToSystemMLModel
 import tempfile, keras
 if isinstance(keras_model, keras.models.Sequential):
 # Convert the sequential model to functional model



systemml git commit: [SYSTEMML-445] Added memory stats for GPU allocation/eviction

2018-09-20 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 69624850e -> f46279a17


[SYSTEMML-445] Added memory stats for GPU allocation/eviction

- Also, reverted the shadow buffer to the original implementation as we are 
getting OOM for lstm scripts. This likely has to do with pessimistic GC.
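
The memory statistics are printed with a byte-count formatter that the
diff below moves from GPUMemoryManager to GPUStatistics. A standalone
sketch of such a formatter is shown here. It is a variation, not the
removed method: it derives the power of 1024 from the bit length of
the value instead of a floating-point logarithm, and it pins the
locale so the decimal separator is stable. The class name is
hypothetical.

```java
import java.util.Locale;

final class ByteDisplaySketch {
    // Formats a byte count with three decimals and a binary unit prefix.
    static String byteCountToDisplaySize(long numBytes) {
        if (numBytes < 1024) {
            return numBytes + " bytes";
        }
        // Index of the highest set bit, divided by 10, gives the power
        // of 1024 (2^10 = K, 2^20 = M, ...).
        int exp = (63 - Long.numberOfLeadingZeros(numBytes)) / 10;
        double scaled = numBytes / Math.pow(1024, exp);
        return String.format(Locale.ROOT, "%.3f %sB", scaled, "KMGTPE".charAt(exp - 1));
    }
}
```

Using integer bit arithmetic avoids rounding at exact powers of 1024,
where a logarithm ratio could in principle land just below the integer.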


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/f46279a1
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/f46279a1
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/f46279a1

Branch: refs/heads/master
Commit: f46279a17031d3f8827923f6eddd614c3eac77d3
Parents: 6962485
Author: Niketan Pansare 
Authored: Thu Sep 20 14:56:51 2018 -0700
Committer: Niketan Pansare 
Committed: Thu Sep 20 14:56:51 2018 -0700

--
 conf/SystemML-config.xml.template   |   8 +-
 .../gpu/context/GPUMemoryManager.java   |  61 
 .../instructions/gpu/context/GPUObject.java |  18 +--
 .../instructions/gpu/context/ShadowBuffer.java  | 154 +--
 .../org/apache/sysml/utils/GPUStatistics.java   |  29 
 .../apache/sysml/utils/PersistentLRUCache.java  |   8 +-
 6 files changed, 108 insertions(+), 170 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/f46279a1/conf/SystemML-config.xml.template
--
diff --git a/conf/SystemML-config.xml.template 
b/conf/SystemML-config.xml.template
index 3925c4e..7b535c9 100644
--- a/conf/SystemML-config.xml.template
+++ b/conf/SystemML-config.xml.template
@@ -105,11 +105,9 @@

0.15

-   
-   
0.5
+   
+   
0.0


0.9

http://git-wip-us.apache.org/repos/asf/systemml/blob/f46279a1/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMemoryManager.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMemoryManager.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMemoryManager.java
index 033051a..57b76f6 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMemoryManager.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUMemoryManager.java
@@ -191,7 +191,7 @@ public class GPUMemoryManager {
GPUStatistics.cudaAllocCount.increment();
}
if(printDebugMessage != null && (PRINT_GPU_MEMORY_INFO 
|| LOG.isTraceEnabled()) )  {
-   LOG.info("Success: " + printDebugMessage + ":" 
+ byteCountToDisplaySize(size));
+   LOG.info("Success: " + printDebugMessage + ":" 
+ GPUStatistics.byteCountToDisplaySize(size));
}
return A;
} catch(jcuda.CudaException e) {
@@ -203,7 +203,7 @@ public class GPUMemoryManager {
GPUStatistics.cudaAllocCount.increment();
}
if(printDebugMessage != null && (PRINT_GPU_MEMORY_INFO 
|| LOG.isTraceEnabled()) )  {
-   LOG.info("Failed: " + printDebugMessage + ":" + 
byteCountToDisplaySize(size));
+   LOG.info("Failed: " + printDebugMessage + ":" + 
GPUStatistics.byteCountToDisplaySize(size));
LOG.info("GPU Memory info " + printDebugMessage 
+ ":" + toString());
}
return null;
@@ -224,28 +224,15 @@ public class GPUMemoryManager {
return "->" + stackTrace[index].getClassName() + "." + 
stackTrace[index].getMethodName() + "(" + stackTrace[index].getFileName() + ":" 
+ stackTrace[index].getLineNumber() + ")";
}

-   /**
-* Pretty printing utility to print bytes
-* 
-* @param numBytes number of bytes
-* @return a human-readable display value
-*/
-   private String byteCountToDisplaySize(long numBytes) {
-   // return 
org.apache.commons.io.FileUtils.byteCountToDisplaySize(bytes); // performs 
rounding
-   if (numBytes < 1024) { 
-   return numBytes + " bytes";
-   }
-   else {
-   int exp = (int) (Math.log(numBytes) / 6.931471805599453);
-   return String.format("%.3f %sB", ((double)numBytes) / 
Math.pow(1024, exp), "KMGTP".charAt(exp-1));
-   }
-   }

public boolean canAllocateWithoutEviction(String opcode, long size) {
return lazyCudaFreeMemoryManager.contains(opcode, size) || 
allocator.canAllocate(size) ||

lazyCudaFreeMemoryManager.containsRmvarPointerMinSize(opcode, size) ;
}

+

systemml git commit: [SYSTEMML-445] Use PersistentLRUCache for shadow buffering

2018-09-20 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 69f2d377c -> 69624850e


[SYSTEMML-445] Use PersistentLRUCache for shadow buffering

- Shadow buffer is cleared eagerly in case of garbage collection to avoid OOM 
and is backed by org.apache.sysml.utils.PersistentLRUCache.
- Setting the configuration property sysml.gpu.eviction.shadow.bufferSize to 
zero disables shadow buffering. If you intend to train a network larger than 
the GPU memory size, consider using a large driver memory and setting 
sysml.gpu.eviction.shadow.bufferSize to a value greater than 0.
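
To make the idea of a size-bounded buffer concrete, here is a minimal
sketch of an LRU cache with a byte budget, built on an access-ordered
LinkedHashMap. It is not PersistentLRUCache: the class name is
hypothetical, and evicted entries are only recorded in a list, whereas
a persistent cache would write them to disk.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ByteBoundedLruSketch {
    // accessOrder = true: iteration goes from least to most recently used.
    private final LinkedHashMap<String, double[]> entries =
        new LinkedHashMap<>(16, 0.75f, true);
    private final long maxBytes;
    private long currentBytes = 0;
    // Stand-in for "spilled to disk": just remembers what was evicted.
    final List<String> evictedKeys = new ArrayList<>();

    ByteBoundedLruSketch(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    void put(String key, double[] value) {
        double[] old = entries.put(key, value);
        if (old != null) {
            currentBytes -= (long) old.length * Double.BYTES;
        }
        currentBytes += (long) value.length * Double.BYTES;

        // Evict least recently used entries until the budget is met,
        // but never the entry that was just inserted.
        Iterator<Map.Entry<String, double[]>> it = entries.entrySet().iterator();
        while (currentBytes > maxBytes && entries.size() > 1) {
            Map.Entry<String, double[]> eldest = it.next();
            currentBytes -= (long) eldest.getValue().length * Double.BYTES;
            evictedKeys.add(eldest.getKey());
            it.remove();
        }
    }

    double[] get(String key) {
        return entries.get(key); // also marks the entry as recently used
    }

    boolean contains(String key) {
        return entries.containsKey(key); // does not change the LRU order
    }
}
```

A budget of zero in this sketch would keep only the most recent entry;
the real property instead disables shadow buffering entirely at zero,
as stated above.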


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/69624850
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/69624850
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/69624850

Branch: refs/heads/master
Commit: 69624850ea872841daef1f99251d793e103502f3
Parents: 69f2d37
Author: Niketan Pansare 
Authored: Thu Sep 20 11:22:37 2018 -0700
Committer: Niketan Pansare 
Committed: Thu Sep 20 11:22:37 2018 -0700

--
 conf/SystemML-config.xml.template   |   8 +-
 .../java/org/apache/sysml/conf/DMLConfig.java   |   2 +-
 .../instructions/gpu/context/GPUObject.java |  19 ++-
 .../instructions/gpu/context/ShadowBuffer.java  | 152 ++-
 4 files changed, 134 insertions(+), 47 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/69624850/conf/SystemML-config.xml.template
--
diff --git a/conf/SystemML-config.xml.template 
b/conf/SystemML-config.xml.template
index d773f79..3925c4e 100644
--- a/conf/SystemML-config.xml.template
+++ b/conf/SystemML-config.xml.template
@@ -105,9 +105,11 @@

0.15

-   
-   
0.0
+   
+   
0.5


0.9

http://git-wip-us.apache.org/repos/asf/systemml/blob/69624850/src/main/java/org/apache/sysml/conf/DMLConfig.java
--
diff --git a/src/main/java/org/apache/sysml/conf/DMLConfig.java 
b/src/main/java/org/apache/sysml/conf/DMLConfig.java
index 5b30609..7f0ecbc 100644
--- a/src/main/java/org/apache/sysml/conf/DMLConfig.java
+++ b/src/main/java/org/apache/sysml/conf/DMLConfig.java
@@ -138,7 +138,7 @@ public class DMLConfig
_defaultVals.put(NATIVE_BLAS_DIR,"none" );
_defaultVals.put(EXTRA_FINEGRAINED_STATS,"false" );
_defaultVals.put(PRINT_GPU_MEMORY_INFO,  "false" );
-   _defaultVals.put(EVICTION_SHADOW_BUFFERSIZE,  "0.0" );
+   _defaultVals.put(EVICTION_SHADOW_BUFFERSIZE,  "0.5" );
_defaultVals.put(STATS_MAX_WRAP_LEN, "30" );
_defaultVals.put(GPU_MEMORY_UTILIZATION_FACTOR,  "0.9" );
_defaultVals.put(GPU_MEMORY_ALLOCATOR,   "cuda");

http://git-wip-us.apache.org/repos/asf/systemml/blob/69624850/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUObject.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUObject.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUObject.java
index 552ee3b..43e2727 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUObject.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUObject.java
@@ -24,6 +24,7 @@ import static jcuda.runtime.JCuda.cudaMemset;
 import static jcuda.runtime.cudaMemcpyKind.cudaMemcpyDeviceToDevice;
 import static jcuda.runtime.cudaMemcpyKind.cudaMemcpyDeviceToHost;
 
+import java.io.IOException;
 import java.util.concurrent.atomic.AtomicLong;
 import java.util.concurrent.atomic.LongAdder;
 
@@ -110,7 +111,11 @@ public class GPUObject {
 */
public Pointer getDensePointer() {
if(jcudaDenseMatrixPtr == null && shadowBuffer.isBuffered() && 
getJcudaSparseMatrixPtr() == null) {
-   shadowBuffer.moveToDevice();
+   try {
+   shadowBuffer.moveToDevice();
+   } catch (IOException e) {
+   throw new DMLRuntimeException("Error moving the 
data from shadow buffer to the device", e);
+   }
}
return jcudaDenseMatrixPtr;
}
@@ -934,13 +939,21 @@ public class GPUObject {
else {
// If already copied to shadow buffer as part 
of previous eviction and this is not an eviction (i.e. bufferpool call for 
subsequent CP/Spark instruction),
// then copy from shadow buffer to MatrixObject.
-   shadowBuffer.moveToHost();
+ 

systemml git commit: [SYSTEMML-445] Write to disk when the cache is used in the write-mode

2018-09-20 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 3fbfbaecb -> 69f2d377c


[SYSTEMML-445] Write to disk when the cache is used in the write-mode

- This avoids the need to depend on finalize to perform writing.

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/69f2d377
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/69f2d377
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/69f2d377

Branch: refs/heads/master
Commit: 69f2d377c456f9baea1e248818d544b54ee00e6f
Parents: 3fbfbae
Author: Niketan Pansare 
Authored: Thu Sep 20 10:44:27 2018 -0700
Committer: Niketan Pansare 
Committed: Thu Sep 20 10:44:27 2018 -0700

--
 .../apache/sysml/utils/PersistentLRUCache.java  | 100 ---
 1 file changed, 64 insertions(+), 36 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/69f2d377/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
--
diff --git a/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java 
b/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
index bf356bb..71a1e28 100644
--- a/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
+++ b/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
@@ -86,7 +86,7 @@ public class PersistentLRUCache extends LinkedHashMap {
private String _prefixFilePath;
final AtomicLong _currentNumBytes = new AtomicLong();
private final long _maxNumBytes;
-   Random _rand = new Random();
+   private static final Random _rand = new Random();
boolean isInReadOnlyMode;
HashSet persistedKeys = new HashSet<>();

@@ -101,6 +101,9 @@ public class PersistentLRUCache extends 
LinkedHashMap {
for(long i = 0; i < numIter; ++i) {
LOG.debug("Putting a double array of size 50MB.");
cache.put("file_" + i, new double[numDoubleIn50MB]);
+   try {
+   Thread.sleep(100);
+   } catch (InterruptedException e) {}
}
cache.clear();
}
@@ -127,13 +130,13 @@ public class PersistentLRUCache extends 
LinkedHashMap {
_prefixFilePath = tmp.getAbsolutePath();
}
public ValueWrapper put(String key, double[] value) throws 
FileNotFoundException, IOException {
-   return putImplm(key, new ValueWrapper(new DataWrapper(key, 
value, this)), value.length*Double.BYTES);
+   return putImplm(key, new ValueWrapper(new DataWrapper(key, 
value, this), isInReadOnlyMode), value.length*Double.BYTES);
}
public ValueWrapper put(String key, float[] value) throws 
FileNotFoundException, IOException {
-   return putImplm(key, new ValueWrapper(new DataWrapper(key, 
value, this)), value.length*Float.BYTES);
+   return putImplm(key, new ValueWrapper(new DataWrapper(key, 
value, this), isInReadOnlyMode), value.length*Float.BYTES);
}
public ValueWrapper put(String key, MatrixBlock value) throws 
FileNotFoundException, IOException {
-   return putImplm(key, new ValueWrapper(new DataWrapper(key, 
value, this)), value.getInMemorySize());
+   return putImplm(key, new ValueWrapper(new DataWrapper(key, 
value, this), isInReadOnlyMode), value.getInMemorySize());
}

private ValueWrapper putImplm(String key, ValueWrapper value, long 
sizeInBytes) throws FileNotFoundException, IOException {
@@ -206,7 +209,7 @@ public class PersistentLRUCache extends 
LinkedHashMap {
 }

float [] tmp = new float[0];
-   String dummyKey = "RAND_KEY_" + Math.abs(_rand.nextLong()) + "_" + 
Math.abs(_rand.nextLong());
+   static String dummyKey = "RAND_KEY_" + Math.abs(_rand.nextLong()) + "_" 
+ Math.abs(_rand.nextLong());
void ensureCapacity(long newNumBytes) throws FileNotFoundException, 
IOException {
if(newNumBytes > _maxNumBytes) {
throw new DMLRuntimeException("Exceeds maximum 
capacity. Cannot put a value of size " + newNumBytes + 
@@ -217,7 +220,7 @@ public class PersistentLRUCache extends 
LinkedHashMap {
synchronized(this) {
if(LOG.isDebugEnabled())
LOG.debug("The required capacity (" + 
newCapacity + ") is greater than max capacity:" + _maxNumBytes);
-   ValueWrapper dummyValue = new ValueWrapper(new 
DataWrapper(dummyKey, tmp, this));
+   ValueWrapper dummyValue = new ValueWrapper(new 
DataWrapper(dummyKey, tmp, this), isInReadOnlyMode);
int maxIter = 

systemml git commit: [SYSTEMML-445] Dynamically decide whether to perform float-to-double conversion in the single precision mode on the host or device

2018-09-19 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 61139e400 -> 3fbfbaecb


[SYSTEMML-445] Dynamically decide whether to perform float-to-double conversion 
in the single precision mode on the host or device

- Fixed an int-to-long conversion bug in the shadow buffer.
- Updated javadocs for GPULazyCudaFreeMemoryManager.


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/3fbfbaec
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/3fbfbaec
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/3fbfbaec

Branch: refs/heads/master
Commit: 3fbfbaecb9d1e31341df8084ff28035bede47766
Parents: 61139e4
Author: Niketan Pansare 
Authored: Wed Sep 19 09:19:30 2018 -0700
Committer: Niketan Pansare 
Committed: Wed Sep 19 09:19:30 2018 -0700

--
 .../context/GPULazyCudaFreeMemoryManager.java   | 51 +++-
 .../gpu/context/GPUMemoryManager.java   |  4 ++
 .../instructions/gpu/context/ShadowBuffer.java  | 18 +--
 .../matrix/data/CudaSupportFunctions.java   |  1 -
 .../SinglePrecisionCudaSupportFunctions.java| 37 +-
 .../apache/sysml/utils/PersistentLRUCache.java  | 15 +++---
 6 files changed, 100 insertions(+), 26 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/3fbfbaec/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPULazyCudaFreeMemoryManager.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPULazyCudaFreeMemoryManager.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPULazyCudaFreeMemoryManager.java
index ba98b3f..db21ae3 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPULazyCudaFreeMemoryManager.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPULazyCudaFreeMemoryManager.java
@@ -45,6 +45,7 @@ public class GPULazyCudaFreeMemoryManager {
 */
private HashMap<Long, Set<Pointer>> rmvarGPUPointers = new 
HashMap<Long, Set<Pointer>>();

+   
/**
 * Get any pointer of the given size from rmvar-ed pointers (applicable 
if eager cudaFree is set to false)
 * 
@@ -85,10 +86,17 @@ public class GPULazyCudaFreeMemoryManager {
GPUStatistics.maintainCPMiscTimes(opcode, 
instructionLevelTimer, System.nanoTime() - startTime);
}

+   /**
+* 
+* @return set of all pointers managed by this memory manager.
+*/
public Set getAllPointers() {
return rmvarGPUPointers.values().stream().flatMap(ptrs -> 
ptrs.stream()).collect(Collectors.toSet());
}

+   /**
+* Frees up all the cached rmvar-ed pointers
+*/
public void clearAll() {
Set toFree = new HashSet();
for(Set ptrs : rmvarGPUPointers.values()) {
@@ -100,9 +108,16 @@ public class GPULazyCudaFreeMemoryManager {
}
}

+   /**
+* Helper method to get the rmvar pointer that is greater than equal to 
min size
+* 
+* @param opcode instruction name
+* @param minSize size in bytes
+* @return the rmvar pointer that is greater than equal to min size
+* @throws DMLRuntimeException if error
+*/
public Pointer getRmvarPointerMinSize(String opcode, long minSize) 
throws DMLRuntimeException {
-   Optional toClear = 
rmvarGPUPointers.entrySet().stream().filter(e -> e.getValue().size() > 0).map(e 
-> e.getKey())
-   .filter(size -> size >= minSize).min((s1, s2) 
-> s1 < s2 ? -1 : 1);
+   Optional toClear = getRmvarSize(minSize);
if(toClear.isPresent()) {
boolean measureTime = opcode != null && 
ConfigurationManager.isFinegrainedStatistics();
long t0 = measureTime ?  System.nanoTime() : 0;
@@ -118,6 +133,38 @@ public class GPULazyCudaFreeMemoryManager {
return null;
}

+   /**
+* Helper method to check if the lazy memory manager contains a pointer 
of the given size
+* 
+* @param opcode instruction name
+* @param size size in bytes
+* @return true if the lazy memory manager contains a pointer of the 
given size
+*/
+   boolean contains(String opcode, long size) {
+   return rmvarGPUPointers.containsKey(size);
+   }
+   
+   /**
+* Helper method to check if the lazy memory manager contains a pointer 
>= minSize
+* 
+* @param opcode instruction name
+* @param minSize size in bytes
+* @return true if the lazy memory manager contains a pointer >= minSize
+*/
+   boolean 

[1/3] systemml git commit: [SYSTEMML-445] Added sparse scalar-matrix arithmetic/relational operators

2018-09-18 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 0e323ec26 -> 61139e400


http://git-wip-us.apache.org/repos/asf/systemml/blob/61139e40/src/main/java/org/apache/sysml/runtime/controlprogram/context/ExecutionContext.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/controlprogram/context/ExecutionContext.java
 
b/src/main/java/org/apache/sysml/runtime/controlprogram/context/ExecutionContext.java
index 443adf4..ce53aea 100644
--- 
a/src/main/java/org/apache/sysml/runtime/controlprogram/context/ExecutionContext.java
+++ 
b/src/main/java/org/apache/sysml/runtime/controlprogram/context/ExecutionContext.java
@@ -336,7 +336,7 @@ public class ExecutionContext {
public Pair 
getSparseMatrixOutputForGPUInstruction(String varName, long numRows, long 
numCols, long nnz) {
MatrixObject mo = allocateGPUMatrixObject(varName, numRows, 
numCols);
mo.getMatrixCharacteristics().setNonZeros(nnz);
-   boolean allocated = 
mo.getGPUObject(getGPUContext(0)).acquireDeviceModifySparse();
+   boolean allocated = 
mo.getGPUObject(getGPUContext(0)).acquireDeviceModifySparse();
return new Pair<>(mo, allocated);
}
 

http://git-wip-us.apache.org/repos/asf/systemml/blob/61139e40/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/CSRPointer.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/CSRPointer.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/CSRPointer.java
index 135e0b1..b3ec497 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/CSRPointer.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/CSRPointer.java
@@ -25,6 +25,8 @@ import static jcuda.jcusparse.JCusparse.cusparseSetMatType;
 import static jcuda.jcusparse.JCusparse.cusparseSetPointerMode;
 import static jcuda.jcusparse.JCusparse.cusparseXcsrgeamNnz;
 import static jcuda.jcusparse.JCusparse.cusparseXcsrgemmNnz;
+import static jcuda.jcusparse.JCusparse.cusparseXcsr2coo;
+
 import static jcuda.jcusparse.cusparseIndexBase.CUSPARSE_INDEX_BASE_ZERO;
 import static jcuda.jcusparse.cusparseMatrixType.CUSPARSE_MATRIX_TYPE_GENERAL;
 import static jcuda.runtime.JCuda.cudaMemcpy;
@@ -111,6 +113,24 @@ public class CSRPointer {
colInd = new Pointer();
allocateMatDescrPointer();
}
+   
+   /**
+* Note: the user is expected to free the returned pointer.
+* 
+* @param handle cusparse handle
+* @param rows number of rows of the CSR pointer
+* @return integer array of nnz uncompressed row indices (with index 
base 0).
+*/
+   public Pointer getCooRowPointer(cusparseHandle handle, int rows) {
+   if(nnz > 0) {
+   Pointer cooRowInd = gpuContext.allocate(null, 
getIntSizeOf(nnz));
+   cusparseXcsr2coo(handle, rowPtr, 
LibMatrixCUDA.toInt(nnz), rows, cooRowInd, CUSPARSE_INDEX_BASE_ZERO);
+   return cooRowInd;
+   }
+   else {
+   throw new DMLRuntimeException("csr2coo only support 
when nnz > 0, but instead found " +  nnz);
+   }
+   }
 
private static long getDataTypeSizeOf(long numElems) {
return numElems * ((long) LibMatrixCUDA.sizeOfDataType);
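
For readers unfamiliar with the csr2coo conversion used by getCooRowPointer above, the following host-side sketch shows what the conversion computes: the compressed row-offset array is expanded into one zero-based row index per non-zero. It is a plain Java illustration, not the cuSPARSE call; the class Csr2CooSketch and the method csrToCooRows are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical CPU illustration of what a CSR-to-COO row expansion produces.
class Csr2CooSketch {
    // rowPtr has rows+1 entries; rowPtr[r]..rowPtr[r+1]-1 are the non-zeros of row r.
    static int[] csrToCooRows(int[] rowPtr, int rows) {
        int nnz = rowPtr[rows];
        int[] cooRowInd = new int[nnz];
        for (int r = 0; r < rows; r++)
            for (int k = rowPtr[r]; k < rowPtr[r + 1]; k++)
                cooRowInd[k] = r; // index base 0
        return cooRowInd;
    }

    public static void main(String[] args) {
        // 3 rows: row 0 has two non-zeros, row 1 is empty, row 2 has one.
        int[] rowPtr = {0, 2, 2, 3};
        System.out.println(Arrays.toString(csrToCooRows(rowPtr, 3))); // prints [0, 0, 2]
    }
}
```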

http://git-wip-us.apache.org/repos/asf/systemml/blob/61139e40/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixCUDA.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixCUDA.java 
b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixCUDA.java
index e2d5824..46ab3f7 100644
--- a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixCUDA.java
+++ b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixCUDA.java
@@ -830,7 +830,7 @@ public class LibMatrixCUDA {
 
// Subtract mean from every element in the 
matrix
ScalarOperator minusOp = new 
RightScalarOperator(Minus.getMinusFnObject(), mean);
-   matrixScalarOp(gCtx, instName, in, mean, rlen, 
clen, tmp, minusOp);
+   denseMatrixScalarOp(gCtx, instName, in, mean, 
rlen, clen, tmp, minusOp);
 
squareMatrix(gCtx, instName, tmp, tmp2, rlen, 
clen);
 
@@ -899,7 +899,7 @@ public class LibMatrixCUDA {
reduceCol(gCtx, instName, "reduce_col_sum", tmp2, tmpCol, rlen, 
clen);
 
ScalarOperator divideOp = new 
RightScalarOperator(Divide.getDivideFnObject(), rlen - 1);
-   matrixScalarOp(gCtx, instName, tmpCol, rlen - 1, 1, clen, out, 
divideOp);
+   

[3/3] systemml git commit: [SYSTEMML-445] Added sparse scalar-matrix arithmetic/relational operators

2018-09-18 Thread niketanpansare
[SYSTEMML-445] Added sparse scalar-matrix arithmetic/relational operators


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/61139e40
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/61139e40
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/61139e40

Branch: refs/heads/master
Commit: 61139e40052ea6591840a6925258a0dcbfdf0f8e
Parents: 0e323ec
Author: Niketan Pansare 
Authored: Tue Sep 18 16:20:14 2018 -0700
Committer: Niketan Pansare 
Committed: Tue Sep 18 16:22:06 2018 -0700

--
 src/main/cpp/kernels/SystemML.cu|   41 +
 src/main/cpp/kernels/SystemML.ptx   | 4053 +-
 .../context/ExecutionContext.java   |2 +-
 .../instructions/gpu/context/CSRPointer.java|   20 +
 .../runtime/matrix/data/LibMatrixCUDA.java  |   99 +-
 5 files changed, 3026 insertions(+), 1189 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/61139e40/src/main/cpp/kernels/SystemML.cu
--
diff --git a/src/main/cpp/kernels/SystemML.cu b/src/main/cpp/kernels/SystemML.cu
index b874cdd..a53d07a 100644
--- a/src/main/cpp/kernels/SystemML.cu
+++ b/src/main/cpp/kernels/SystemML.cu
@@ -772,6 +772,47 @@ extern "C" __global__ void matrix_scalar_op_f(float *A, 
double scalar, float *C,
   matrix_scalar_op(A, (float)scalar, C, size, op, isLeftScalar);
 }
 
+
+/**
+ * Performs sparse-dense arithmetic operation between a matrix and a scalar.
+ * C = s op A or C = A op s (where A is the matrix, s is the scalar and op is
+ * the operation)
+ * @param cooRowPtrA    row pointers for input matrix allocated on GPU in coo 
format
+ * @param colPtrA       col index pointers for input matrix allocated on GPU
+ * @param valA          val array for input matrix allocated on GPU
+ * @param scalar        scalar input
+ * @param C             output matrix allocated on GPU
+ * @param nnz           number of non-zero elements in matrix A
+ * @param colsA         number of columns in matrix A
+ * @param op            number code of the arithmetic operation to perform
+ * @param isLeftScalar  whether the scalar is on the left side
+ */
+template <typename T>
+__device__ void sparse_dense_matrix_scalar_op(int* cooRowPtrA, int* colPtrA, T 
*valA, T scalar, T *C, int nnz, int colsA, int op,
+ int isLeftScalar) {
+  int index = blockIdx.x * blockDim.x + threadIdx.x;
+  if (index < nnz) {
+T inputVal = valA[index];
+int outIndex = cooRowPtrA[index]*colsA + colPtrA[index];
+if (isLeftScalar) {
+  C[outIndex] = binaryOp(scalar, inputVal, op);
+} else {
+  C[outIndex] = binaryOp(inputVal, scalar, op);
+}
+  }
+  __syncthreads();
+}
+
+extern "C" __global__ void sparse_dense_matrix_scalar_op_d(int* cooRowPtrA, 
int* colPtrA, double *valA, double scalar, double *C, 
+   int nnz, int colsA, int op, int isLeftScalar) {
+  sparse_dense_matrix_scalar_op(cooRowPtrA, colPtrA, valA, scalar, C, nnz, 
colsA, op, isLeftScalar);
+}
+
+extern "C" __global__ void sparse_dense_matrix_scalar_op_f(int* cooRowPtrA, 
int* colPtrA, float *valA, double scalar, float *C, 
+   int nnz, int colsA, int op, int isLeftScalar) {
+  sparse_dense_matrix_scalar_op(cooRowPtrA, colPtrA, valA, (float) scalar, C, 
nnz, colsA, op, isLeftScalar);
+}
+
 /**
  * Sets all elements (fills) of a double array of given length with a given
  * scalar value
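
The index arithmetic in sparse_dense_matrix_scalar_op above can be followed with a sequential Java sketch: each non-zero of A is written to position row*colsA + col of a dense, row-major output. The kernel only touches the non-zero positions, so this sketch assumes the caller has already filled the output with the result for zero entries (0 op s); that assumption, the fixed "+" operation, and the names SparseScalarOpSketch and scatterAdd are all illustrative and not taken from the commit.

```java
import java.util.Arrays;

// Hypothetical CPU illustration of the kernel's scatter step, with op fixed to "+".
class SparseScalarOpSketch {
    // C is a dense rows x colsA output in row-major order, pre-filled by the caller.
    static void scatterAdd(int[] cooRow, int[] col, double[] val,
                           double scalar, double[] C, int colsA) {
        for (int i = 0; i < val.length; i++) {
            int outIndex = cooRow[i] * colsA + col[i]; // same formula as the kernel
            C[outIndex] = val[i] + scalar;
        }
    }

    public static void main(String[] args) {
        // 2 x 3 matrix with non-zeros A[0][1] = 5 and A[1][2] = 7; scalar = 1.
        double[] C = new double[6];
        Arrays.fill(C, 0.0 + 1.0); // result for the zero entries (assumed pre-fill)
        scatterAdd(new int[] {0, 1}, new int[] {1, 2}, new double[] {5, 7}, 1.0, C, 3);
        System.out.println(Arrays.toString(C)); // prints [1.0, 6.0, 1.0, 1.0, 1.0, 8.0]
    }
}
```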



[2/3] systemml git commit: [SYSTEMML-445] Added sparse scalar-matrix arithmetic/relational operators

2018-09-18 Thread niketanpansare
http://git-wip-us.apache.org/repos/asf/systemml/blob/61139e40/src/main/cpp/kernels/SystemML.ptx
--
diff --git a/src/main/cpp/kernels/SystemML.ptx 
b/src/main/cpp/kernels/SystemML.ptx
index 1ab32f5..ac04967 100644
--- a/src/main/cpp/kernels/SystemML.ptx
+++ b/src/main/cpp/kernels/SystemML.ptx
@@ -4595,6 +4595,1739 @@ BB31_126:
ret;
 }
 
+   // .globl   sparse_dense_matrix_scalar_op_d
+.visible .entry sparse_dense_matrix_scalar_op_d(
+   .param .u64 sparse_dense_matrix_scalar_op_d_param_0,
+   .param .u64 sparse_dense_matrix_scalar_op_d_param_1,
+   .param .u64 sparse_dense_matrix_scalar_op_d_param_2,
+   .param .f64 sparse_dense_matrix_scalar_op_d_param_3,
+   .param .u64 sparse_dense_matrix_scalar_op_d_param_4,
+   .param .u32 sparse_dense_matrix_scalar_op_d_param_5,
+   .param .u32 sparse_dense_matrix_scalar_op_d_param_6,
+   .param .u32 sparse_dense_matrix_scalar_op_d_param_7,
+   .param .u32 sparse_dense_matrix_scalar_op_d_param_8
+)
+{
+   .reg .pred  %p<133>;
+   .reg .b32   %r<92>;
+   .reg .f64   %fd<99>;
+   .reg .b64   %rd<28>;
+
+
+   ld.param.u64%rd4, [sparse_dense_matrix_scalar_op_d_param_0];
+   ld.param.u64%rd5, [sparse_dense_matrix_scalar_op_d_param_1];
+   ld.param.u64%rd6, [sparse_dense_matrix_scalar_op_d_param_2];
+   ld.param.f64%fd68, [sparse_dense_matrix_scalar_op_d_param_3];
+   ld.param.u64%rd7, [sparse_dense_matrix_scalar_op_d_param_4];
+   ld.param.u32%r9, [sparse_dense_matrix_scalar_op_d_param_5];
+   ld.param.u32%r6, [sparse_dense_matrix_scalar_op_d_param_6];
+   ld.param.u32%r7, [sparse_dense_matrix_scalar_op_d_param_7];
+   ld.param.u32%r8, [sparse_dense_matrix_scalar_op_d_param_8];
+   mov.u32 %r10, %ntid.x;
+   mov.u32 %r11, %ctaid.x;
+   mov.u32 %r12, %tid.x;
+   mad.lo.s32  %r1, %r10, %r11, %r12;
+   setp.ge.s32 %p3, %r1, %r9;
+   @%p3 braBB32_142;
+
+   cvta.to.global.u64  %rd8, %rd7;
+   cvta.to.global.u64  %rd9, %rd6;
+   mul.wide.s32%rd10, %r1, 8;
+   add.s64 %rd11, %rd9, %rd10;
+   ld.global.f64   %fd1, [%rd11];
+   cvta.to.global.u64  %rd12, %rd4;
+   mul.wide.s32%rd13, %r1, 4;
+   add.s64 %rd14, %rd12, %rd13;
+   ld.global.u32   %r13, [%rd14];
+   cvta.to.global.u64  %rd15, %rd5;
+   add.s64 %rd16, %rd15, %rd13;
+   ld.global.u32   %r14, [%rd16];
+   mad.lo.s32  %r15, %r13, %r6, %r14;
+   mul.wide.s32%rd17, %r15, 8;
+   add.s64 %rd1, %rd8, %rd17;
+   setp.eq.s32 %p4, %r8, 0;
+   @%p4 braBB32_72;
+
+   mov.f64 %fd94, 0d7FEF;
+   setp.gt.s32 %p5, %r7, 8;
+   @%p5 braBB32_19;
+
+   setp.gt.s32 %p19, %r7, 3;
+   @%p19 bra   BB32_11;
+
+   setp.gt.s32 %p26, %r7, 1;
+   @%p26 bra   BB32_8;
+
+   setp.eq.s32 %p29, %r7, 0;
+   @%p29 bra   BB32_70;
+   bra.uni BB32_6;
+
+BB32_70:
+   add.f64 %fd94, %fd1, %fd68;
+   bra.uni BB32_71;
+
+BB32_72:
+   mov.f64 %fd98, 0d7FEF;
+   setp.gt.s32 %p69, %r7, 8;
+   @%p69 bra   BB32_89;
+
+   setp.gt.s32 %p83, %r7, 3;
+   @%p83 bra   BB32_81;
+
+   setp.gt.s32 %p90, %r7, 1;
+   @%p90 bra   BB32_78;
+
+   setp.eq.s32 %p93, %r7, 0;
+   @%p93 bra   BB32_140;
+   bra.uni BB32_76;
+
+BB32_140:
+   add.f64 %fd98, %fd1, %fd68;
+   bra.uni BB32_141;
+
+BB32_19:
+   setp.gt.s32 %p6, %r7, 13;
+   @%p6 braBB32_28;
+
+   setp.gt.s32 %p13, %r7, 10;
+   @%p13 bra   BB32_24;
+
+   setp.eq.s32 %p17, %r7, 9;
+   @%p17 bra   BB32_48;
+   bra.uni BB32_22;
+
+BB32_48:
+   setp.eq.f64 %p44, %fd1, %fd68;
+   selp.f64%fd94, 0d3FF0, 0d, %p44;
+   bra.uni BB32_71;
+
+BB32_89:
+   setp.gt.s32 %p70, %r7, 13;
+   @%p70 bra   BB32_98;
+
+   setp.gt.s32 %p77, %r7, 10;
+   @%p77 bra   BB32_94;
+
+   setp.eq.s32 %p81, %r7, 9;
+   @%p81 bra   BB32_118;
+   bra.uni BB32_92;
+
+BB32_118:
+   setp.eq.f64 %p108, %fd1, %fd68;
+   selp.f64%fd98, 0d3FF0, 0d, %p108;
+   bra.uni BB32_141;
+
+BB32_11:
+   setp.gt.s32 %p20, %r7, 5;
+   @%p20 bra   BB32_15;
+
+   setp.eq.s32 %p24, %r7, 4;
+   @%p24 bra   BB32_51;
+   bra.uni BB32_13;
+
+BB32_51:
+   {
+   .reg .b32 %temp; 
+   mov.b64 {%temp, %r2}, %fd68;
+   }
+   {
+   .reg .b32 %temp; 
+   mov.b64 {%temp, %r3}, %fd1;
+   }
+ 

systemml git commit: [MINOR] Throw an error if the user attempts to put null keys

2018-09-17 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 104b20e0b -> d2894feea


[MINOR] Throw an error if the user attempts to put null keys

- Also, added checks to verify persisted keys for debugging purposes


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/d2894fee
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/d2894fee
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/d2894fee

Branch: refs/heads/master
Commit: d2894feea6b274db46149c44fb697aa1c998fdca
Parents: 104b20e
Author: Niketan Pansare 
Authored: Mon Sep 17 15:12:59 2018 -0700
Committer: Niketan Pansare 
Committed: Mon Sep 17 15:12:59 2018 -0700

--
 .../apache/sysml/utils/PersistentLRUCache.java  | 42 +---
 1 file changed, 37 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/d2894fee/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
--
diff --git a/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java 
b/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
index f053cd5..22f74c6 100644
--- a/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
+++ b/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
@@ -27,6 +27,7 @@ import java.io.ObjectInputStream;
 import java.io.ObjectOutputStream;
 import java.lang.ref.SoftReference;
 import java.nio.file.Files;
+import java.util.HashSet;
 import java.util.LinkedHashMap;
 import java.util.Map;
 import java.util.Random;
@@ -87,6 +88,7 @@ public class PersistentLRUCache extends LinkedHashMap {
private final long _maxNumBytes;
Random _rand = new Random();
boolean isInReadOnlyMode;
+   HashSet persistedKeys = new HashSet<>();

public static void main(String [] args) throws IOException {
org.apache.log4j.Logger.getRootLogger().setLevel(Level.DEBUG);
@@ -132,6 +134,8 @@ public class PersistentLRUCache extends 
LinkedHashMap {
}

private ValueWrapper putImplm(String key, ValueWrapper value, long 
sizeInBytes) throws FileNotFoundException, IOException {
+   if(key == null)
+   throw new IOException("Null keys are not supported by 
PersistentLRUCache");
ValueWrapper prev = null;
if(containsKey(key))
prev = remove(key);
@@ -237,6 +241,10 @@ public class PersistentLRUCache extends 
LinkedHashMap {
}

public double [] getAsDoubleArray(String key) throws 
FileNotFoundException, IOException {
+   if(key == null)
+   throw new IOException("Null keys are not supported by 
PersistentLRUCache");
+   if(!containsKey(key))
+   throw new DMLRuntimeException("The map doesnot contains 
the given key:" + key);
ValueWrapper value = super.get(key);
if(!value.isAvailable()) {
// Fine-grained synchronization: only one read per key, 
but will allow parallel loading
@@ -254,6 +262,10 @@ public class PersistentLRUCache extends 
LinkedHashMap {
}

public float [] getAsFloatArray(String key) throws 
FileNotFoundException, IOException {
+   if(key == null)
+   throw new DMLRuntimeException("Null keys are not 
supported by PersistentLRUCache");
+   if(!containsKey(key))
+   throw new DMLRuntimeException("The map doesnot contains 
the given key:" + key);
ValueWrapper value = super.get(key);
if(!value.isAvailable()) {
// Fine-grained synchronization: only one read per key, 
but will allow parallel loading
@@ -271,6 +283,10 @@ public class PersistentLRUCache extends 
LinkedHashMap {
}

public MatrixBlock getAsMatrixBlock(String key) throws 
FileNotFoundException, IOException {
+   if(key == null)
+   throw new DMLRuntimeException("Null keys are not 
supported by PersistentLRUCache");
+   if(!containsKey(key))
+   throw new DMLRuntimeException("The map doesnot contains 
the given key:" + key);
ValueWrapper value = super.get(key);
if(!value.isAvailable()) {
// Fine-grained synchronization: only one read per key, 
but will allow parallel loading
@@ -360,6 +376,7 @@ class DataWrapper {
os.writeDouble(_dArr[i]);
}
}
+   _cache.persistedKeys.add(_key);

systemml git commit: [MINOR] Fixes a minor bug in LRU cache

2018-09-17 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 1d2f4b630 -> 104b20e0b


[MINOR] Fixes a minor bug in LRU cache

Closes #834.


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/104b20e0
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/104b20e0
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/104b20e0

Branch: refs/heads/master
Commit: 104b20e0bca906a6b76f962145254f5a1fb02ba6
Parents: 1d2f4b6
Author: Anthony Thomas 
Authored: Mon Sep 17 14:31:31 2018 -0700
Committer: Niketan Pansare 
Committed: Mon Sep 17 15:07:49 2018 -0700

--
 src/main/java/org/apache/sysml/utils/PersistentLRUCache.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/104b20e0/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
--
diff --git a/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java 
b/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
index 83a0dcf..f053cd5 100644
--- a/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
+++ b/src/main/java/org/apache/sysml/utils/PersistentLRUCache.java
@@ -462,7 +462,7 @@ class DataWrapper {
return _dArr.length*Double.BYTES;
else if(_fArr != null)
return _fArr.length*Float.BYTES;
-   else if(_fArr != null)
+   else if(_mb != null)
return _mb.getInMemorySize();
else
throw new DMLRuntimeException("Not implemented");



systemml git commit: [SYSTEMML-445] Acquire read lock before copying from host to device

2018-09-17 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 4d8df33cc -> 1d2f4b630


[SYSTEMML-445] Acquire read lock before copying from host to device
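
The ordering change can be illustrated with a small sketch. It assumes, without confirmation from the commit, that the point of locking first is to keep the object from being treated as evictable while the host-to-device copy is in progress. ReadLockOrderSketch and all of its members are hypothetical stand-ins, not the GPUObject API.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of "lock first, then copy"; not the GPUObject code.
class ReadLockOrderSketch {
    private final AtomicInteger readLocks = new AtomicInteger();
    private boolean allocated = false;

    // An evictor is assumed to skip any object that holds a read lock.
    boolean isEvictable() {
        return readLocks.get() == 0;
    }

    // Stand-in for the host-to-device copy; reports whether the object
    // was protected from eviction while the copy ran.
    private boolean copyFromHostToDevice() {
        boolean protectedDuringCopy = !isEvictable();
        allocated = true;
        return protectedDuringCopy;
    }

    // Takes the read lock before any transfer, mirroring the new ordering.
    boolean acquireDeviceRead() {
        readLocks.incrementAndGet();
        boolean wasProtected = true;
        if (!allocated)
            wasProtected = copyFromHostToDevice();
        return wasProtected;
    }

    void releaseRead() {
        readLocks.decrementAndGet();
    }

    public static void main(String[] args) {
        ReadLockOrderSketch obj = new ReadLockOrderSketch();
        System.out.println(obj.acquireDeviceRead()); // prints true
    }
}
```

With the previous ordering (copy, then lock), the same check inside copyFromHostToDevice would see a lock count of zero.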

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/1d2f4b63
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/1d2f4b63
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/1d2f4b63

Branch: refs/heads/master
Commit: 1d2f4b630ebf800be5009b182880b03682077ccd
Parents: 4d8df33
Author: Niketan Pansare 
Authored: Mon Sep 17 11:12:26 2018 -0700
Committer: Niketan Pansare 
Committed: Mon Sep 17 11:12:26 2018 -0700

--
 .../apache/sysml/runtime/instructions/gpu/context/GPUObject.java   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/1d2f4b63/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUObject.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUObject.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUObject.java
index 1564f48..552ee3b 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUObject.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/context/GPUObject.java
@@ -570,6 +570,7 @@ public class GPUObject {
LOG.trace("GPU : acquireDeviceRead on " + this);
}
boolean transferred = false;
+   addReadLock();
if (!isAllocated()) {
if(LOG.isTraceEnabled()) {
LOG.trace("GPU : in acquireDeviceRead, data is 
not allocated, copying from host, on " + this + ", GPUContext="
@@ -578,7 +579,6 @@ public class GPUObject {
copyFromHostToDevice(opcode);
transferred = true;
}
-   addReadLock();
if (!isAllocated())
throw new DMLRuntimeException("Expected device data to 
be allocated");
return transferred;



systemml git commit: [SYSTEMML-445] Removed unnecessary long-to-int conversion in LSTM

2018-09-13 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 77c98d693 -> e2dc85688


[SYSTEMML-445] Removed unnecessary long-to-int conversion in LSTM

- Minor cleanup of the GPUObject class.
- Also, fixed incorrect forced GPU configuration flag.
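
A short sketch of why keeping the LSTM dimensions as long can matter: a product such as (D+M+2)*(4*M) can exceed the int range even when D and M individually fit. The example narrows to int only at the point where an int is required, using Math.toIntExact as a stand-in for the project's own toInt helper. The class and method names, and the sample sizes, are hypothetical.

```java
// Hypothetical illustration of long vs. int size arithmetic.
class LstmSizeSketch {
    // Number of weight elements, computed entirely in long arithmetic.
    static long weightElems(long D, long M) {
        return (D + M + 2) * (4 * M);
    }

    // The same expression in int arithmetic silently wraps around on overflow.
    static int weightElemsInt(int D, int M) {
        return (D + M + 2) * (4 * M);
    }

    // Narrow only where an int is required; throws ArithmeticException on overflow.
    static int toIntChecked(long v) {
        return Math.toIntExact(v);
    }

    public static void main(String[] args) {
        System.out.println(weightElems(30000, 20000));    // prints 4000160000
        System.out.println(weightElemsInt(30000, 20000)); // prints -294807296
        System.out.println(toIntChecked(weightElems(100, 50))); // prints 30400
    }
}
```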


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/e2dc8568
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/e2dc8568
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/e2dc8568

Branch: refs/heads/master
Commit: e2dc8568855d353265ac4e0755b9ac3d2b30b1d8
Parents: 77c98d6
Author: Niketan Pansare 
Authored: Thu Sep 13 11:17:33 2018 -0700
Committer: Niketan Pansare 
Committed: Thu Sep 13 11:17:33 2018 -0700

--
 .../apache/sysml/conf/ConfigurationManager.java |  2 +-
 .../instructions/gpu/DnnGPUInstruction.java | 20 +++---
 .../instructions/gpu/context/CSRPointer.java|  8 ---
 .../gpu/context/ExecutionConfig.java|  4 +-
 .../gpu/context/GPUMemoryManager.java   | 12 +++-
 .../instructions/gpu/context/GPUObject.java | 72 ++--
 .../runtime/matrix/data/LibMatrixCuDNN.java | 38 +++
 .../matrix/data/LibMatrixCuDNNRnnAlgorithm.java | 56 ---
 .../sysml/runtime/matrix/data/MatrixBlock.java  |  3 +-
 9 files changed, 100 insertions(+), 115 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/e2dc8568/src/main/java/org/apache/sysml/conf/ConfigurationManager.java
--
diff --git a/src/main/java/org/apache/sysml/conf/ConfigurationManager.java 
b/src/main/java/org/apache/sysml/conf/ConfigurationManager.java
index d9f1906..96c3885 100644
--- a/src/main/java/org/apache/sysml/conf/ConfigurationManager.java
+++ b/src/main/java/org/apache/sysml/conf/ConfigurationManager.java
@@ -258,7 +258,7 @@ public class ConfigurationManager
 * @return true if GPU is enabled in forced mode
 */
public static boolean isForcedGPU() {
-   return _ldmlOptions.get().isGPU();
+   return _ldmlOptions.get().isForceGPU();
}

/**

http://git-wip-us.apache.org/repos/asf/systemml/blob/e2dc8568/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
index d620de9..6094b6c 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
@@ -595,18 +595,18 @@ public class DnnGPUInstruction extends GPUInstruction {

private void processLstmBackwardInstruction(ExecutionContext ec) throws 
DMLRuntimeException {
MatrixObject out0 = getMatrixInputForGPUInstruction(ec, 
_input4.getName());
-   int M = toInt(out0.getNumColumns()); // hiddenSize .. since 
out0: (N, M)
+   long M = out0.getNumColumns(); // hiddenSize .. since out0: (N, 
M)
Pointer out0Pointer =  LibMatrixCUDA.getDensePointer(gCtx, 
out0, instName);

MatrixObject W = getMatrixInputForGPUInstruction(ec, 
_input2.getName());
MatrixObject bias = getMatrixInputForGPUInstruction(ec, 
_input3.getName());
long numRowsW = W.getNumRows();
-   int D = toInt(numRowsW) - M; // since W:(D+M, 4M) ... 
numFeatures 
+   long D = numRowsW - M; // since W:(D+M, 4M) ... numFeatures 
Pointer sysmlWPointer = 
LibMatrixCuDNN.getDensePointerForCuDNN(gCtx, W, instName, D+M, 4*M);
Pointer sysmlBiasPointer = 
LibMatrixCuDNN.getDensePointerForCuDNN(gCtx, bias, instName, 1, 4*M);
Pointer cudnnWPointer = gCtx.allocate(instName, 
(D+M+2)*(4*M)*LibMatrixCUDA.sizeOfDataType);

LibMatrixCUDA.getCudaKernels(gCtx).launchKernel("prepare_lstm_weight",
-   
ExecutionConfig.getConfigForSimpleVectorOperations((D+M+2)*(4*M)),
+   
ExecutionConfig.getConfigForSimpleVectorOperations(toInt((D+M+2)*(4*M))),
sysmlWPointer, sysmlBiasPointer, cudnnWPointer, 
D, M);
ec.releaseMatrixInputForGPUInstruction(_input2.getName());
ec.releaseMatrixInputForGPUInstruction(_input3.getName());
@@ -619,7 +619,7 @@ public class DnnGPUInstruction extends GPUInstruction {
int T = toInt(numColsX/ D); // since X:(N, T*D) ... seqLength
Pointer cudnnInput = gCtx.allocate(instName, 
(N*T*D)*LibMatrixCUDA.sizeOfDataType);
 

systemml git commit: [MINOR] Fixed import error in Keras2DML

2018-09-11 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 2fc26b3dc -> 77c98d693


[MINOR] Fixed import error in Keras2DML


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/77c98d69
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/77c98d69
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/77c98d69

Branch: refs/heads/master
Commit: 77c98d693c3b2d407094de50accac615a638183f
Parents: 2fc26b3
Author: Niketan Pansare 
Authored: Tue Sep 11 13:26:23 2018 -0700
Committer: Niketan Pansare 
Committed: Tue Sep 11 13:26:23 2018 -0700

--
 src/main/python/systemml/mllearn/estimators.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/77c98d69/src/main/python/systemml/mllearn/estimators.py
--
diff --git a/src/main/python/systemml/mllearn/estimators.py 
b/src/main/python/systemml/mllearn/estimators.py
index 0b3de41..d231b08 100644
--- a/src/main/python/systemml/mllearn/estimators.py
+++ b/src/main/python/systemml/mllearn/estimators.py
@@ -1018,7 +1018,7 @@ class Keras2DML(Caffe2DML):
 regularization_type: regularization type (default: "L2")
 """
 from .keras2caffe import convertKerasToCaffeNetwork, convertKerasToCaffeSolver
-import tempfile
+import tempfile, keras
 if isinstance(keras_model, keras.models.Sequential):
 # Convert the sequential model to functional model
 if keras_model.model is None:
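
The hunk above uses `keras.models.Sequential` right after the import line, so the fix is presumably to make the name `keras` resolvable inside the method. The following Python sketch shows the same failure mode and fix using a standard-library module in place of `keras`. The function names are hypothetical, and it assumes the module is not imported at file level.

```python
def mean_broken(values):
    # 'statistics' is never imported in this scope, so the name lookup
    # fails with NameError when the function is called.
    return statistics.mean(values)


def mean_fixed(values):
    # A function-local import binds the name before it is used.
    import statistics
    return statistics.mean(values)
```

A function-local import also defers loading the dependency until the method is actually called, which suits an optional package such as Keras.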



systemml git commit: [MINOR] Allow non-literal values in parameterized built-in functions

2018-09-10 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master ab251f6ee -> 2fc26b3dc


[MINOR] Allow non-literal values in parameterized built-in functions


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/2fc26b3d
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/2fc26b3d
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/2fc26b3d

Branch: refs/heads/master
Commit: 2fc26b3dced89a473055828b08550ed6e6a8d7be
Parents: ab251f6
Author: Niketan Pansare 
Authored: Mon Sep 10 15:05:05 2018 -0700
Committer: Niketan Pansare 
Committed: Mon Sep 10 15:05:05 2018 -0700

--
 .../gpu/GPUDenseInputPointerFetcher.java|  1 -
 .../gpu/MatrixReshapeGPUInstruction.java|  3 +-
 .../ParameterizedBuiltinSPInstruction.java  | 75 
 3 files changed, 63 insertions(+), 16 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/2fc26b3d/src/main/java/org/apache/sysml/runtime/instructions/gpu/GPUDenseInputPointerFetcher.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/GPUDenseInputPointerFetcher.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/GPUDenseInputPointerFetcher.java
index 8fcaec3..1ab3420 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/GPUDenseInputPointerFetcher.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/GPUDenseInputPointerFetcher.java
@@ -20,7 +20,6 @@ package org.apache.sysml.runtime.instructions.gpu;
 
 import java.util.HashMap;
 
-import org.apache.sysml.api.DMLScript;
 import org.apache.sysml.conf.ConfigurationManager;
 import org.apache.sysml.runtime.DMLRuntimeException;
 import org.apache.sysml.runtime.controlprogram.caching.MatrixObject;

http://git-wip-us.apache.org/repos/asf/systemml/blob/2fc26b3d/src/main/java/org/apache/sysml/runtime/instructions/gpu/MatrixReshapeGPUInstruction.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/MatrixReshapeGPUInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/MatrixReshapeGPUInstruction.java
index 61cb643..ee2166e 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/MatrixReshapeGPUInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/MatrixReshapeGPUInstruction.java
@@ -79,7 +79,8 @@ public class MatrixReshapeGPUInstruction extends 
GPUInstruction {
GPUContext gCtx = ec.getGPUContext(0); 
MatrixObject mat = getMatrixInputForGPUInstruction(ec, 
_input.getName());
if(rows*cols != mat.getNumRows()*mat.getNumColumns()) {
-   throw new DMLRuntimeException("Incorrect number of rows and cols in rshape instruction");
+   throw new DMLRuntimeException("Cannot reshape a matrix of dimensions: [" + mat.getNumRows() + ", " + mat.getNumColumns() + "] to a matrix of"
+   + " dimensions [" + rows + ", " + cols + "]");
}
// We currently support only dense rshape
Pointer inPtr = LibMatrixCUDA.getDensePointer(gCtx, mat, 
instName);
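
The reshape check in the hunk above compares total cell counts and, after this change, reports both shapes in the error message. A minimal Python sketch of the same validation follows. The function name `check_reshape` is hypothetical and the code is not part of SystemML.

```python
def check_reshape(in_rows, in_cols, out_rows, out_cols):
    """Raise if the requested shape does not preserve the number of cells."""
    if in_rows * in_cols != out_rows * out_cols:
        raise ValueError(
            f"Cannot reshape a matrix of dimensions: [{in_rows}, {in_cols}] "
            f"to a matrix of dimensions [{out_rows}, {out_cols}]"
        )


check_reshape(2, 6, 3, 4)  # 12 cells in, 12 cells out: accepted
```

Including the actual and requested dimensions in the message lets the user see which side of the reshape is wrong without attaching a debugger.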

http://git-wip-us.apache.org/repos/asf/systemml/blob/2fc26b3d/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java
index f9d7ef3..4a1c710 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java
@@ -174,6 +174,54 @@ public class ParameterizedBuiltinSPInstruction extends 
ComputationSPInstruction
}
}

+   private double getDoubleParam(ExecutionContext ec, String key) {
+   String val = params.get(key);
+   try {
+   if(val != null)
+   return Double.parseDouble( val );
+   else
+   throw new RuntimeException("Expected parameter " + key);
+   } catch(NumberFormatException e) {
+   return ec.getScalarInput(val, ValueType.DOUBLE, false).getDoubleValue();
+   } 
+   }
+   
+   private boolean getBooleanParam(ExecutionContext ec, String key) {
+   String val = params.get(key);
+   try {
+  
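
The `getDoubleParam` helper shown above first tries to parse the parameter as a numeric literal and, if that fails, treats the string as a variable name to look up in the execution context. The same fallback can be sketched in Python as follows. The dictionaries `params` and `variables` are hypothetical stand-ins for the instruction's parameter map and the `ExecutionContext`.

```python
def get_double_param(params, variables, key):
    """Return params[key] as a float, resolving non-literals via variables."""
    val = params.get(key)
    if val is None:
        raise RuntimeError(f"Expected parameter {key}")
    try:
        # Literal case: the parameter text is itself a number.
        return float(val)
    except ValueError:
        # Non-literal case: the parameter text names a scalar variable.
        return float(variables[val])


params = {"pattern": "3.5", "replacement": "x"}
variables = {"x": 7}
literal = get_double_param(params, variables, "pattern")
looked_up = get_double_param(params, variables, "replacement")
```

One consequence of this design is that a variable whose name happens to parse as a number would be read as a literal; the sketch inherits that behavior.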

systemml git commit: [MINOR] Fixed javadoc errors

2018-08-30 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 0f36780a8 -> ab251f6ee


[MINOR] Fixed javadoc errors

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/ab251f6e
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/ab251f6e
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/ab251f6e

Branch: refs/heads/master
Commit: ab251f6ee42fe44eabf51483184c95a5a3e472d9
Parents: 0f36780
Author: Niketan Pansare 
Authored: Thu Aug 30 15:59:37 2018 -0700
Committer: Niketan Pansare 
Committed: Thu Aug 30 15:59:37 2018 -0700

--
 .../java/org/apache/sysml/runtime/matrix/data/LibMatrixCUDA.java  | 3 +++
 src/main/java/org/apache/sysml/udf/lib/RemoveDuplicates.java  | 3 +--
 2 files changed, 4 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/ab251f6e/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixCUDA.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixCUDA.java 
b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixCUDA.java
index f3f8434..e2d5824 100644
--- a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixCUDA.java
+++ b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixCUDA.java
@@ -946,6 +946,7 @@ public class LibMatrixCUDA {
/**
 * Do a simple reduction, the output of which is a single value
 * @param gCtx   a valid {@link GPUContext}
+* @param instName instruction name
 * @param kernelFunction name of the kernel function to invoke
 * @param in {@link Pointer} to matrix in device memory
 * @param n size of array
@@ -988,6 +989,7 @@ public class LibMatrixCUDA {
 * Do a reduction by row. Data is reduced per row and the
 * resulting vector is calculated.
 * @param gCtx a valid {@link GPUContext}
+* @param instName instruction name
 * @param kernelFunction name of the kernel function to invoke
 * @param in {@link Pointer} to input matrix in device memory (size - rows * columns)
 * @param out {@link Pointer} to output matrix in device memory (size - rows * 1)
@@ -1015,6 +1017,7 @@ public class LibMatrixCUDA {
 * Do a reduction by column. Data is reduced per column and the
 * resulting vector is calculated.
 * @param gCtx a valid {@link GPUContext}
+* @param instName instruction name
 * @param kernelFunction name of the kernel function to invoke
 * @param in {@link Pointer} to input matrix in device memory (size - rows * columns)
 * @param out {@link Pointer} to output matrix in device memory (size - 1 * cols)

http://git-wip-us.apache.org/repos/asf/systemml/blob/ab251f6e/src/main/java/org/apache/sysml/udf/lib/RemoveDuplicates.java
--
diff --git a/src/main/java/org/apache/sysml/udf/lib/RemoveDuplicates.java 
b/src/main/java/org/apache/sysml/udf/lib/RemoveDuplicates.java
index 5c5e0d5..97edadb 100644
--- a/src/main/java/org/apache/sysml/udf/lib/RemoveDuplicates.java
+++ b/src/main/java/org/apache/sysml/udf/lib/RemoveDuplicates.java
@@ -49,12 +49,11 @@ import org.apache.sysml.udf.Matrix.ValueType;
  * W = X*sum(X);
  * inL = list(Y, Z, W)
  * [outL, idx] = distinct(inL);
- * print(">>\n" + toString(idx));
+ * print(toString(idx));
  * 
  * 
  * 
  * The above code prints:
- * >>
  * 1.000
  * 2.000
  * 1.000



systemml git commit: [SYSTEMML-445] Removed batch_norm builtin functions

2018-08-30 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/gh-pages 14049d257 -> d38bf4ee9


[SYSTEMML-445] Removed batch_norm builtin functions

- Removed batch_norm builtin functions to exploit codegen in CP.
- Added rewrites for compiling efficient CuDNN operators.
- Added rewrites for SGD update operations.
- To simplify adding new GPU rewrites, added HopDagPatternMatcher that allows 
for pattern matching at the HOP-level. This can be extended for other rewrites 
as well.
- Added GPU tests to validate the rewrites.
- Updated the DML language documentation.
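
As a rough illustration of what pattern matching over an operator DAG involves, here is a small Python sketch. It is not the `HopDagPatternMatcher` API: the `Node` and `Pattern` classes, the `match` function, and the `relu(bias_add(X, b))` pattern are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A node of a toy operator DAG: an operator name plus its inputs."""
    op: str
    inputs: List["Node"] = field(default_factory=list)


@dataclass
class Pattern:
    """Matches a node by operator; op=None is a wildcard leaf.

    `bind`, if set, records the matched node under that name.
    """
    op: Optional[str] = None
    inputs: List["Pattern"] = field(default_factory=list)
    bind: Optional[str] = None


def match(pattern: Pattern, node: Node, found: Dict[str, Node]) -> bool:
    if pattern.op is not None:
        # A concrete operator must agree in name and in number of inputs.
        if pattern.op != node.op or len(pattern.inputs) != len(node.inputs):
            return False
        for p, n in zip(pattern.inputs, node.inputs):
            if not match(p, n, found):
                return False
    if pattern.bind:
        found[pattern.bind] = node
    return True


# Hypothetical pattern: relu(bias_add(X, b)), binding the two leaves.
pattern = Pattern("relu", [Pattern("bias_add", [Pattern(bind="X"), Pattern(bind="b")])])
dag = Node("relu", [Node("bias_add", [Node("conv2d"), Node("read")])])
found: Dict[str, Node] = {}
matched = match(pattern, dag, found)
```

A rewrite rule would then use the bound nodes in `found` to build the replacement operator. Declaring patterns as data keeps each new rewrite to a few lines.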


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/d38bf4ee
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/d38bf4ee
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/d38bf4ee

Branch: refs/heads/gh-pages
Commit: d38bf4ee982946a7d06c855690c26f072d2ab17d
Parents: 14049d2
Author: Niketan Pansare 
Authored: Thu Aug 30 15:40:44 2018 -0700
Committer: Niketan Pansare 
Committed: Thu Aug 30 15:40:44 2018 -0700

--
 dml-language-reference.md | 2 --
 1 file changed, 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/d38bf4ee/dml-language-reference.md
--
diff --git a/dml-language-reference.md b/dml-language-reference.md
index 924336a..cdcc529 100644
--- a/dml-language-reference.md
+++ b/dml-language-reference.md
@@ -1522,8 +1522,6 @@ Hence, the images are internally represented as a matrix 
with dimension (N, C *
 | bias_add | input, bias | [batch_size X num_channels* height_image* width_image] | [num_channels X 1] | [batch_size X num_channels* height_image* width_image] | | Adds the bias (row vector of size num_channels) to input with the given num_channels |
 | bias_multiply | input, bias | [batch_size X num_channels* height_image* width_image] | [num_channels X 1] | [batch_size X num_channels* height_image* width_image] | | Multiplies the bias (row vector of size num_channels) to input with the given num_channels |
 | lstm | X, W, bias, out0, c0 | [batch_size X seq_length*num_features] | [num_features+hidden_size X 4*hidden_size] | [batch_size X seq_length*hidden_size] if return_sequences else [batch_size X hidden_size] | return_sequences | Perform computation for single-layer unidirectional LSTM (outputs: out, carryOut) |
-| batch_norm2d | input | [batch_size X num_channels* height_image* width_image] | | [batch_size X num_channels* height_image* width_image] | scale, shift, exponentialMovingAverage_Mean, exponentialMovingAverage_Variance, mode, epsilon, momentum | Performs batch normalization operation (outputs: updated exponential moving average mean and variance, cache of the batch mean and variance) |
-| batch_norm2d_backward | input, dout | [batch_size X num_channels* height_image* width_image] | [batch_size X num_channels* height_image* width_image] | [batch_size X num_channels* height_image* width_image] | scale, epsilon, cache_mean (from forward), cache_inv_var (from forward) | Computed backpropagation error for batch normalization operation |
 
 Note: the builtin functions `batch_norm2d` and `batch_norm2d_backward` are 
deprecated and will be removed 

[1/3] systemml git commit: [SYSTEMML-445] Removed batch_norm builtin functions

2018-08-30 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 81419ae6a -> 0f36780a8


http://git-wip-us.apache.org/repos/asf/systemml/blob/0f36780a/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
--
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
index e736a1c..d620de9 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/gpu/DnnGPUInstruction.java
@@ -19,8 +19,8 @@
 package org.apache.sysml.runtime.instructions.gpu;
 
 import java.util.ArrayList;
-
 import jcuda.Pointer;
+import jcuda.jcudnn.JCudnn;
 
 import org.apache.sysml.runtime.DMLRuntimeException;
 import org.apache.sysml.runtime.controlprogram.caching.MatrixObject;
@@ -32,7 +32,6 @@ import 
org.apache.sysml.runtime.instructions.gpu.context.ExecutionConfig;
 import org.apache.sysml.runtime.instructions.gpu.context.GPUContext;
 import org.apache.sysml.runtime.matrix.data.LibMatrixCUDA;
 import org.apache.sysml.runtime.matrix.data.LibMatrixCuDNN;
-import org.apache.sysml.runtime.matrix.data.MatrixBlock;
 import org.apache.sysml.runtime.matrix.data.LibMatrixDNN.PoolingType;
 import org.apache.sysml.runtime.matrix.operators.ReorgOperator;
 import org.apache.sysml.runtime.util.DnnUtils;
@@ -57,12 +56,14 @@ public class DnnGPUInstruction extends GPUInstruction {
private ArrayList _stride = new ArrayList<>();
private ArrayList _padding = new ArrayList<>();
private double _intermediateMemoryBudget = 0;
+   private GPUContext gCtx;
+   private String instName;

public DnnGPUInstruction(CPOperand in1, CPOperand in2, CPOperand out, 
String opcode, String istr, double intermediateMemoryBudget) {
super(new ReorgOperator(SwapIndex.getSwapIndexFnObject()), 
opcode, istr);
-   if (!(opcode.equals("bias_add") || opcode.equals("bias_multiply") || opcode.equals("relu_backward"))) {
+   if (!(opcode.equals("bias_add") || opcode.equals("bias_multiply") || opcode.equals("relu_backward") || opcode.equals("inv_var") )) {
throw new DMLRuntimeException(
-   "Incorrect usage. Expected the opcode to be bias_add or bias_multiply or relu_backward, but found "
+   "Incorrect usage. Expected the opcode to be bias_add or bias_multiply or relu_backward or inv_var, but found "
+ opcode);
}
_input1 = in1;
@@ -112,8 +113,8 @@ public class DnnGPUInstruction extends GPUInstruction {
public DnnGPUInstruction(CPOperand in1, CPOperand in2, CPOperand in3, 
CPOperand out, String opcode, String istr, 
double intermediateMemoryBudget) throws 
DMLRuntimeException {
super(new ReorgOperator(SwapIndex.getSwapIndexFnObject()), 
opcode, istr);
-   if( !opcode.equals("channel_sums") ) {
-   throw new DMLRuntimeException("Incorrect usage. Expected the opcode to be channel_sums, but found " + opcode);
+   if( !(opcode.equals("channel_sums") || opcode.equals("reshape_colmeans") || opcode.equals("update_ema") ) ) {
+   throw new DMLRuntimeException("Incorrect usage. Expected the opcode to be channel_sums or reshape_colmeans or update_ema, but found " + opcode);
}
_input1 = in1;
_input2 = in2;
@@ -126,7 +127,7 @@ public class DnnGPUInstruction extends GPUInstruction {
public DnnGPUInstruction(CPOperand in1, CPOperand in2, CPOperand in3, 
CPOperand in4, CPOperand out, String opcode, String istr, 
double intermediateMemoryBudget) throws 
DMLRuntimeException {
super(new ReorgOperator(SwapIndex.getSwapIndexFnObject()), 
opcode, istr);
-   if( !opcode.equals("update_nesterov_x") ) {
+   if( !( opcode.equals("update_nesterov_x")) ) {
throw new DMLRuntimeException("Incorrect opcode: " + 
opcode);
}
_input1 = in1;
@@ -182,6 +183,22 @@ public class DnnGPUInstruction extends GPUInstruction {
_intermediateMemoryBudget = intermediateMemoryBudget;
}

+   public DnnGPUInstruction(CPOperand in, CPOperand in2, CPOperand in3, 
CPOperand in4, CPOperand in5, 
+   CPOperand out, String opcode, String istr, double 
intermediateMemoryBudget) {
+   super(new ReorgOperator(SwapIndex.getSwapIndexFnObject()), 
opcode, istr);
+   if( !(opcode.equals("update_ema_var") || 
opcode.equals("batch_norm2d_bwd_dx")) ) {
+   throw new DMLRuntimeException("Incorrect 

[3/3] systemml git commit: [SYSTEMML-445] Removed batch_norm builtin functions

2018-08-30 Thread niketanpansare
[SYSTEMML-445] Removed batch_norm builtin functions

- Removed batch_norm builtin functions to exploit codegen in CP.
- Added rewrites for compiling efficient CuDNN operators.
- Added rewrites for SGD update operations.
- To simplify adding new GPU rewrites, added HopDagPatternMatcher that allows 
for pattern matching at the HOP-level. This can be extended for other rewrites 
as well.
- Added GPU tests to validate the rewrites.
- Updated the DML language documentation.


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/0f36780a
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/0f36780a
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/0f36780a

Branch: refs/heads/master
Commit: 0f36780a8244c6e728d37c32a79e00ed181211ad
Parents: 81419ae
Author: Niketan Pansare 
Authored: Thu Aug 30 15:40:44 2018 -0700
Committer: Niketan Pansare 
Committed: Thu Aug 30 15:40:44 2018 -0700

--
 docs/dml-language-reference.md  |2 -
 scripts/nn/layers/batch_norm2d.dml  |   60 +-
 scripts/nn/layers/batch_norm2d_old.dml  |  200 
 src/main/cpp/kernels/SystemML.cu|   56 +-
 src/main/cpp/kernels/SystemML.ptx   |  321 +-
 src/main/java/org/apache/sysml/hops/DnnOp.java  |   56 +-
 .../java/org/apache/sysml/hops/FunctionOp.java  |   30 +-
 src/main/java/org/apache/sysml/hops/Hop.java|8 +-
 .../hops/rewrite/HopDagPatternMatcher.java  |  378 +++
 .../sysml/hops/rewrite/HopPatternRewriter.java  |   72 ++
 .../HopRewriteRuleWithPatternMatcher.java   |   98 ++
 .../sysml/hops/rewrite/HopRewriteUtils.java |   20 +
 .../hops/rewrite/RewriteGPUSpecificOps.java | 1027 +-
 .../org/apache/sysml/lops/DnnTransform.java |   53 +-
 .../sysml/parser/BuiltinFunctionExpression.java |   61 +-
 .../org/apache/sysml/parser/DMLTranslator.java  |2 -
 .../org/apache/sysml/parser/Expression.java |2 +-
 .../instructions/GPUInstructionParser.java  |   10 +-
 .../instructions/gpu/DnnGPUInstruction.java |  526 +
 .../gpu/GPUDenseInputPointerFetcher.java|  111 ++
 .../gpu/context/GPUMemoryManager.java   |2 +-
 .../runtime/matrix/data/LibMatrixCUDA.java  |  110 +-
 .../runtime/matrix/data/LibMatrixCuDNN.java |   37 +-
 .../apache/sysml/test/gpu/BatchNormTest.java|   47 +-
 24 files changed, 1818 insertions(+), 1471 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/0f36780a/docs/dml-language-reference.md
--
diff --git a/docs/dml-language-reference.md b/docs/dml-language-reference.md
index 924336a..cdcc529 100644
--- a/docs/dml-language-reference.md
+++ b/docs/dml-language-reference.md
@@ -1522,8 +1522,6 @@ Hence, the images are internally represented as a matrix 
with dimension (N, C *
 | bias_add | input, bias | [batch_size X num_channels* height_image* width_image] | [num_channels X 1] | [batch_size X num_channels* height_image* width_image] | | Adds the bias (row vector of size num_channels) to input with the given num_channels |
 | bias_multiply | input, bias | [batch_size X num_channels* height_image* width_image] | [num_channels X 1] | [batch_size X num_channels* height_image* width_image] | | Multiplies the bias (row vector of size num_channels) to input with the given num_channels |
 | lstm | X, W, bias, out0, c0 | [batch_size X seq_length*num_features] | [num_features+hidden_size X 4*hidden_size] | [batch_size X seq_length*hidden_size] if return_sequences else [batch_size X hidden_size] | return_sequences | Perform computation for single-layer unidirectional LSTM (outputs: out, carryOut) |
-| 

systemml git commit: Updates standalone jar dependencies with commons-lang3

2018-08-28 Thread niketanpansare
Repository: systemml
Updated Branches:
  refs/heads/master 9e7ee19a4 -> be465dd65


Updates standalone jar dependencies with commons-lang3


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/be465dd6
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/be465dd6
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/be465dd6

Branch: refs/heads/master
Commit: be465dd65f9435f24fc195c25d9b1c4781bbb459
Parents: 9e7ee19
Author: Anthony Thomas 
Authored: Thu Aug 23 10:58:45 2018 -0700
Committer: Niketan Pansare 
Committed: Tue Aug 28 09:21:07 2018 -0700

--
 src/assembly/standalone-jar.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/systemml/blob/be465dd6/src/assembly/standalone-jar.xml
--
diff --git a/src/assembly/standalone-jar.xml b/src/assembly/standalone-jar.xml
index 0d1bda7..2a7a183 100644
--- a/src/assembly/standalone-jar.xml
+++ b/src/assembly/standalone-jar.xml
@@ -81,7 +81,7 @@
*:commons-configuration*
*:commons-httpclient*
*:commons-io*
-   *:commons-lang
+   *:commons-lang*
*:commons-logging*
*:commons-math3*
*:guava*
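
The one-character change above widens the pattern so that it also covers the `commons-lang3` artifact. The effect can be illustrated with Python's `fnmatch`. This assumes the lines are wildcard include patterns over `groupId:artifactId`, and the real Maven Assembly matching rules may differ in detail. The artifact list is illustrative.

```python
from fnmatch import fnmatchcase

# groupId:artifactId coordinates of the two Commons Lang generations.
artifacts = [
    "commons-lang:commons-lang",
    "org.apache.commons:commons-lang3",
]


def included(pattern):
    """Return the artifacts selected by a shell-style wildcard pattern."""
    return [a for a in artifacts if fnmatchcase(a, pattern)]


old = included("*:commons-lang")   # pattern before the commit
new = included("*:commons-lang*")  # pattern after the commit
```

Under these assumptions, `old` selects only `commons-lang:commons-lang`, while `new` selects both artifacts.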


