Repository: systemml
Updated Branches:
  refs/heads/master 8b054804e -> 7e11deaa8


[SYSTEMML-2135] Extended removeEmpty with outputs of zero rows/cols

This patch extends the support for matrices with zero rows and columns
to removeEmpty. So far, this operation returned a single row/column of
zeros for empty inputs. For backwards compatibility, we keep these
semantics as the default, but introduce a new flag, empty.return, that
allows disabling this special case in order to return matrices with
zero rows or zero columns, respectively. In addition, this patch
includes the related update of the DML language guide.
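
The intended semantics for margin="rows" can be illustrated with a minimal,
standalone sketch. It is not the SystemML implementation: the class name
RemoveEmptySketch, the method removeEmptyRows, and the use of plain double[][]
arrays are assumptions made only for illustration. Because a Java array with
zero rows cannot carry a column count, the sketch takes the column count as an
explicit argument.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the empty.return semantics; not SystemML code.
public class RemoveEmptySketch {

    /**
     * Removes all rows of x that contain only zeros.
     * If no row remains, returns a single row of zeros when emptyReturn is
     * true (the backwards-compatible default), or zero rows otherwise.
     */
    public static double[][] removeEmptyRows(double[][] x, int ncol, boolean emptyReturn) {
        List<double[]> kept = new ArrayList<>();
        for (double[] row : x) {
            boolean nonZero = false;
            for (double v : row) {
                if (v != 0) { nonZero = true; break; }
            }
            if (nonZero)
                kept.add(row.clone());
        }
        // special case: empty output
        if (kept.isEmpty() && emptyReturn)
            kept.add(new double[ncol]);
        return kept.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        double[][] zeros = new double[3][4];
        System.out.println(removeEmptyRows(zeros, 4, true).length);  // 1
        System.out.println(removeEmptyRows(zeros, 4, false).length); // 0
    }
}
```

The margin="cols" case is symmetric, with the roles of rows and columns
swapped.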


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/e0b66b30
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/e0b66b30
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/e0b66b30

Branch: refs/heads/master
Commit: e0b66b300c8bd4582579bac8b6dea26e132447b0
Parents: 8b05480
Author: Matthias Boehm <[email protected]>
Authored: Thu Feb 8 18:12:32 2018 -0800
Committer: Matthias Boehm <[email protected]>
Committed: Thu Feb 8 22:04:03 2018 -0800

----------------------------------------------------------------------
 docs/dml-language-reference.md                  |  2 +-
 .../sysml/hops/ParameterizedBuiltinOp.java      |  3 +-
 .../ParameterizedBuiltinFunctionExpression.java | 21 ++++++-
 .../java/org/apache/sysml/parser/dml/Dml.g4     |  2 +-
 .../java/org/apache/sysml/parser/pydml/Pydml.g4 |  2 +-
 .../runtime/compress/CompressedMatrixBlock.java |  8 +--
 .../estim/CompressedSizeEstimatorSample.java    |  2 +-
 .../cp/ParameterizedBuiltinCPInstruction.java   | 14 ++---
 .../ParameterizedBuiltinSPInstruction.java      | 44 +++++++-------
 .../spark/utils/RDDAggregateUtils.java          |  6 +-
 .../sysml/runtime/io/WriterBinaryBlock.java     |  9 ++-
 .../runtime/matrix/data/LibMatrixReorg.java     | 20 ++++---
 .../sysml/runtime/matrix/data/MatrixBlock.java  |  8 +--
 .../functions/misc/ZeroRowsColsMatrixTest.java  | 60 ++++++++++----------
 .../functions/misc/ZeroMatrix_RemoveEmpty.R     |  2 +-
 .../functions/misc/ZeroMatrix_RemoveEmpty.dml   |  6 +-
 16 files changed, 113 insertions(+), 96 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/docs/dml-language-reference.md
----------------------------------------------------------------------
diff --git a/docs/dml-language-reference.md b/docs/dml-language-reference.md
index 5fb9de5..355b507 100644
--- a/docs/dml-language-reference.md
+++ b/docs/dml-language-reference.md
@@ -648,7 +648,7 @@ nrow(), <br/> ncol(), <br/> length() | Return the number of 
rows, number of colu
 prod() | Return the product of all cells in matrix | Input: matrix <br/> 
Output: scalar | prod(X)
 rand() | Generates a random matrix | Input: (rows=&lt;value&gt;, 
cols=&lt;value&gt;, min=&lt;value&gt;, max=&lt;value&gt;, 
sparsity=&lt;value&gt;, pdf=&lt;string&gt;, seed=&lt;value&gt;) <br/> 
rows/cols: Number of rows/cols (expression) <br/> min/max: Min/max value for 
cells (either constant value, or variable that evaluates to constant value) 
<br/> sparsity: fraction of non-zero cells (constant value) <br/> pdf: 
"uniform" (min, max) distribution, or "normal" (0,1) distribution; or "poisson" 
(lambda=1) distribution. string; default value is "uniform". Note that, for the 
Poisson distribution, users can provide the mean/lambda parameter as follows: 
<br/> rand(rows=1000,cols=1000, pdf="poisson", lambda=2.5). <br/> The default 
value for lambda is 1. <br/> seed: Every invocation of rand() internally 
generates a random seed with which the cell values are generated. One can 
optionally provide a seed when repeatability is desired.  <br/> Output: matrix 
| X = rand(rows=10, cols=20, min=0, max=1,
pdf="uniform", sparsity=0.2) <br/> The example generates a 10 x 20 
matrix, with cell values uniformly chosen at random between 0 and 1, and 
approximately 20% of cells will have non-zero values.
 rbind() | Row-wise matrix concatenation. Concatenates the second matrix as 
additional rows to the first matrix | Input: (X &lt;matrix&gt;, Y 
&lt;matrix&gt;) <br/>Output: &lt;matrix&gt; <br/> X and Y are matrices, where 
the number of columns in X and the number of columns in Y are the same. | A = 
matrix(1, rows=2,cols=3) <br/> B = matrix(2, rows=2,cols=3) <br/> C = 
rbind(A,B) <br/> print("Dimensions of C: " + nrow(C) + " X " + ncol(C)) <br/> 
Output: <br/> Dimensions of C: 4 X 3
-removeEmpty() | Removes all empty rows or columns from the input matrix target 
X according to the specified margin. Also, allows to apply a filter F before 
removing the empty rows/cols. | Input : (target= X &lt;matrix&gt;, 
margin="...", select=F) <br/> Output : &lt;matrix&gt; <br/> Valid values for 
margin are "rows" or "cols". | A = removeEmpty(target=X, margin="rows", 
select=F)
+removeEmpty() | Removes all empty rows or columns from the input matrix target 
X according to the specified margin. The optional select vector F specifies 
selected rows or columns; if not provided, the semantics are 
F=(rowSums(X!=0)&gt;0) and F=(colSums(X!=0)&gt;0) for removeEmpty "rows" and 
"cols", respectively. The optional empty.return flag indicates if a row or 
column of zeros should be returned for empty inputs. | Input : (target= X 
&lt;matrix&gt;, margin="..."[, select=F][, empty.return=TRUE]) <br/> Output : 
&lt;matrix&gt; <br/> Valid values for margin are "rows" or "cols". | A = 
removeEmpty(target=X, margin="rows", select=F)
 replace() | Creates a copy of input matrix X, where all values that are equal 
to the scalar pattern s1 are replaced with the scalar replacement s2. | Input : 
(target= X &lt;matrix&gt;, pattern=&lt;scalar&gt;, replacement=&lt;scalar&gt;) 
<br/> Output : &lt;matrix&gt; <br/> If s1 is NaN, then all NaN values of X are 
treated as equal and hence replaced with s2. Positive and negative infinity are 
treated as different values. | A = replace(target=X, pattern=s1, replacement=s2)
 rev() | Reverses the rows in a matrix | Input : (&lt;matrix&gt;) <br/> Output 
: &lt;matrix&gt; | <span style="white-space: nowrap;">A = matrix("1 2 3 4", 
rows=2, cols=2)</span> <br/> <span style="white-space: nowrap;">B = matrix("1 2 
3 4", rows=4, cols=1)</span> <br/> <span style="white-space: nowrap;">C = 
matrix("1 2 3 4", rows=1, cols=4)</span> <br/> revA = rev(A) <br/> revB = 
rev(B) <br/> revC = rev(C) <br/> Matrix revA: [[3, 4], [1, 2]]<br/> Matrix 
revB: [[4], [3], [2], [1]]<br/> Matrix revC: [[1, 2, 3, 4]]<br/>
 seq() | Creates a single column vector with values starting from &lt;from&gt;, 
to &lt;to&gt;, in increments of &lt;increment&gt; | Input: (&lt;from&gt;, 
&lt;to&gt;, &lt;increment&gt;) <br/> Output: &lt;matrix&gt; | S = seq (10, 200, 
10)

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/hops/ParameterizedBuiltinOp.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/sysml/hops/ParameterizedBuiltinOp.java 
b/src/main/java/org/apache/sysml/hops/ParameterizedBuiltinOp.java
index aa48061..e9800a5 100644
--- a/src/main/java/org/apache/sysml/hops/ParameterizedBuiltinOp.java
+++ b/src/main/java/org/apache/sysml/hops/ParameterizedBuiltinOp.java
@@ -726,7 +726,8 @@ public class ParameterizedBuiltinOp extends Hop implements 
MultiThreadedHop
                        inMap.put("offset", loffset);
                        inMap.put("maxdim", lmaxdim);
                        inMap.put("margin", inputlops.get("margin"));
-               
+                       inMap.put("empty.return", 
inputlops.get("empty.return"));
+                       
                        if ( !FORCE_DIST_RM_EMPTY && isRemoveEmptyBcSP())
                                _bRmEmptyBC = true;
                        

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/parser/ParameterizedBuiltinFunctionExpression.java
----------------------------------------------------------------------
diff --git 
a/src/main/java/org/apache/sysml/parser/ParameterizedBuiltinFunctionExpression.java
 
b/src/main/java/org/apache/sysml/parser/ParameterizedBuiltinFunctionExpression.java
index b365abb..dbb2a1e 100644
--- 
a/src/main/java/org/apache/sysml/parser/ParameterizedBuiltinFunctionExpression.java
+++ 
b/src/main/java/org/apache/sysml/parser/ParameterizedBuiltinFunctionExpression.java
@@ -23,10 +23,13 @@ import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.HashMap;
 import java.util.HashSet;
+import java.util.Set;
+import java.util.stream.Collectors;
 
 import org.antlr.v4.runtime.ParserRuleContext;
 import org.apache.sysml.hops.Hop.ParamBuiltinOp;
 import org.apache.sysml.parser.LanguageException.LanguageErrorCodes;
+import org.apache.sysml.runtime.util.UtilFunctions;
 import org.apache.wink.json4j.JSONObject;
 
 
@@ -476,6 +479,15 @@ public class ParameterizedBuiltinFunctionExpression 
extends DataIdentifier
        }
 
        private void validateRemoveEmpty(DataIdentifier output, boolean 
conditional) throws LanguageException {
+               
+               //check for invalid parameters
+               Set<String> valid = UtilFunctions.asSet("target", "margin", 
"select", "empty.return");
+               Set<String> invalid = _varParams.keySet().stream()
+                       .filter(k -> 
!valid.contains(k)).collect(Collectors.toSet());
+               if( !invalid.isEmpty() )
+                       raiseValidateError("Invalid parameters for removeEmpty: 
"
+                               + Arrays.toString(invalid.toArray(new 
String[0])), false);
+               
                //check existence and correctness of arguments
                Expression target = getVarParam("target");
                if( target==null ) {
@@ -498,11 +510,18 @@ public class ParameterizedBuiltinFunctionExpression 
extends DataIdentifier
                        raiseValidateError("Index matrix 'select' is of type 
'"+select.getOutput().getDataType()+"'. Please specify the select matrix.", 
conditional, LanguageErrorCodes.INVALID_PARAMETERS);
                }
                
+               Expression empty = getVarParam("empty.return");
+               if( empty!=null && (!empty.getOutput().getDataType().isScalar() 
|| empty.getOutput().getValueType() != ValueType.BOOLEAN) ){
+                       raiseValidateError("Boolean parameter 'empty.return' is 
of type "+empty.getOutput().getDataType()
+                               +"["+empty.getOutput().getValueType()+"].", 
conditional, LanguageErrorCodes.INVALID_PARAMETERS);
+               }
+               if( empty == null ) //default handling
+                       _varParams.put("empty.return", new 
BooleanIdentifier(true));
+               
                // Output is a matrix with unknown dims
                output.setDataType(DataType.MATRIX);
                output.setValueType(ValueType.DOUBLE);
                output.setDimensions(-1, -1);
-               
        }
        
        private void validateGroupedAgg(DataIdentifier output, boolean 
conditional) 

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/parser/dml/Dml.g4
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/sysml/parser/dml/Dml.g4 
b/src/main/java/org/apache/sysml/parser/dml/Dml.g4
index fb72ed2..9491855 100644
--- a/src/main/java/org/apache/sysml/parser/dml/Dml.g4
+++ b/src/main/java/org/apache/sysml/parser/dml/Dml.g4
@@ -182,7 +182,7 @@ strictParameterizedKeyValueString : paramName=ID '=' 
paramVal=STRING ;
 ID : (ALPHABET (ALPHABET|DIGIT|'_')*  '::')? ALPHABET (ALPHABET|DIGIT|'_')*
     // Special ID cases:
    // | 'matrix' // --> This is a special case which causes lot of headache
-   | 'as.scalar' | 'as.matrix' | 'as.frame' | 'as.double' | 'as.integer' | 
'as.logical' | 'index.return' | 'lower.tail'
+   | 'as.scalar' | 'as.matrix' | 'as.frame' | 'as.double' | 'as.integer' | 
'as.logical' | 'index.return' | 'empty.return' | 'lower.tail'
 ;
 // Unfortunately, we have datatype name clashing with builtin function name: 
matrix :(
 // Therefore, ugly work around for checking datatype

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/parser/pydml/Pydml.g4
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/sysml/parser/pydml/Pydml.g4 
b/src/main/java/org/apache/sysml/parser/pydml/Pydml.g4
index 320793e..34d0c34 100644
--- a/src/main/java/org/apache/sysml/parser/pydml/Pydml.g4
+++ b/src/main/java/org/apache/sysml/parser/pydml/Pydml.g4
@@ -302,7 +302,7 @@ ID : (ALPHABET (ALPHABET|DIGIT|'_')*  '.')? ALPHABET 
(ALPHABET|DIGIT|'_')*
     // Special ID cases:
    // | 'matrix' // --> This is a special case which causes lot of headache
    // | 'scalar' |  'float' | 'int' | 'bool' // corresponds to as.scalar, 
as.double, as.integer and as.logical
-   | 'index.return'
+   | 'index.return' | 'empty.return'
 ;
 // Unfortunately, we have datatype name clashing with builtin function name: 
matrix :(
 // Therefore, ugly work around for checking datatype

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/compress/CompressedMatrixBlock.java
----------------------------------------------------------------------
diff --git 
a/src/main/java/org/apache/sysml/runtime/compress/CompressedMatrixBlock.java 
b/src/main/java/org/apache/sysml/runtime/compress/CompressedMatrixBlock.java
index 0ce3dc8..d9cfc72 100644
--- a/src/main/java/org/apache/sysml/runtime/compress/CompressedMatrixBlock.java
+++ b/src/main/java/org/apache/sysml/runtime/compress/CompressedMatrixBlock.java
@@ -2165,19 +2165,19 @@ public class CompressedMatrixBlock extends MatrixBlock 
implements Externalizable
        }
 
        @Override
-       public MatrixBlock removeEmptyOperations(MatrixBlock ret, boolean rows, 
MatrixBlock select) 
+       public MatrixBlock removeEmptyOperations(MatrixBlock ret, boolean rows, 
boolean emptyReturn, MatrixBlock select) 
                        throws DMLRuntimeException {
                printDecompressWarning("removeEmptyOperations");
                MatrixBlock tmp = isCompressed() ? decompress() : this;
-               return tmp.removeEmptyOperations(ret, rows, select);
+               return tmp.removeEmptyOperations(ret, rows, emptyReturn, 
select);
        }
 
        @Override
-       public MatrixBlock removeEmptyOperations(MatrixBlock ret, boolean rows)
+       public MatrixBlock removeEmptyOperations(MatrixBlock ret, boolean rows, 
boolean emptyReturn)
                        throws DMLRuntimeException {
                printDecompressWarning("removeEmptyOperations");
                MatrixBlock tmp = isCompressed() ? decompress() : this;
-               return tmp.removeEmptyOperations(ret, rows);
+               return tmp.removeEmptyOperations(ret, rows, emptyReturn);
        }
 
        @Override

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/compress/estim/CompressedSizeEstimatorSample.java
----------------------------------------------------------------------
diff --git 
a/src/main/java/org/apache/sysml/runtime/compress/estim/CompressedSizeEstimatorSample.java
 
b/src/main/java/org/apache/sysml/runtime/compress/estim/CompressedSizeEstimatorSample.java
index 2492c0b..f3d842a 100644
--- 
a/src/main/java/org/apache/sysml/runtime/compress/estim/CompressedSizeEstimatorSample.java
+++ 
b/src/main/java/org/apache/sysml/runtime/compress/estim/CompressedSizeEstimatorSample.java
@@ -63,7 +63,7 @@ public class CompressedSizeEstimatorSample extends 
CompressedSizeEstimator
                        for( int i=0; i<sampleSize; i++ )
                                select.quickSetValue(_sampleRows[i], 0, 1);
                        _data = _data.removeEmptyOperations(new MatrixBlock(), 
-                                       !CompressedMatrixBlock.TRANSPOSE_INPUT, 
select);
+                                       !CompressedMatrixBlock.TRANSPOSE_INPUT, 
true, select);
                }
                
                //establish estimator-local cache for numeric solve

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/instructions/cp/ParameterizedBuiltinCPInstruction.java
----------------------------------------------------------------------
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/cp/ParameterizedBuiltinCPInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/cp/ParameterizedBuiltinCPInstruction.java
index 5a7442a..18ec72c 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/cp/ParameterizedBuiltinCPInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/cp/ParameterizedBuiltinCPInstruction.java
@@ -197,19 +197,17 @@ public class ParameterizedBuiltinCPInstruction extends 
ComputationCPInstruction
                        
                }
                else if ( opcode.equalsIgnoreCase("rmempty") ) {
+                       String margin = params.get("margin");
+                       if( !(margin.equals("rows") || margin.equals("cols")) )
+                               throw new DMLRuntimeException("Unsupported 
margin identifier '"+margin+"'.");
+                       
                        // acquire locks
                        MatrixBlock target = 
ec.getMatrixInput(params.get("target"), getExtendedOpcode());
                        MatrixBlock select = params.containsKey("select")? 
ec.getMatrixInput(params.get("select"), getExtendedOpcode()):null;
                        
                        // compute the result
-                       String margin = params.get("margin");
-                       MatrixBlock soresBlock = null;
-                       if( margin.equals("rows") )
-                               soresBlock = target.removeEmptyOperations(new 
MatrixBlock(), true, select);
-                       else if( margin.equals("cols") ) 
-                               soresBlock = target.removeEmptyOperations(new 
MatrixBlock(), false, select);
-                       else
-                               throw new DMLRuntimeException("Unspupported 
margin identifier '"+margin+"'.");
+                       MatrixBlock soresBlock = 
target.removeEmptyOperations(new MatrixBlock(),
+                               margin.equals("rows"), 
Boolean.parseBoolean(params.get("empty.return").toLowerCase()), select);
                        
                        //release locks
                        ec.setMatrixOutput(output.getName(), soresBlock, 
getExtendedOpcode());

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java
----------------------------------------------------------------------
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java
index 49dd6e5..425739d 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java
@@ -320,6 +320,7 @@ public class ParameterizedBuiltinSPInstruction extends 
ComputationSPInstruction
                        String rddOffVar = params.get("offset");
                        
                        boolean rows = sec.getScalarInput(params.get("margin"), 
ValueType.STRING, true).getStringValue().equals("rows");
+                       boolean emptyReturn = 
Boolean.parseBoolean(params.get("empty.return").toLowerCase());
                        long maxDim = sec.getScalarInput(params.get("maxdim"), 
ValueType.DOUBLE, false).getLongValue();
                        MatrixCharacteristics mcIn = 
sec.getMatrixCharacteristics(rddInVar);
                        
@@ -340,14 +341,14 @@ public class ParameterizedBuiltinSPInstruction extends 
ComputationSPInstruction
                                        broadcastOff = 
sec.getBroadcastForVariable( rddOffVar );
                                        // Broadcast offset vector
                                        out = in
-                                               .flatMapToPair(new 
RDDRemoveEmptyFunctionInMem(rows, maxDim, brlen, bclen, broadcastOff));         
     
+                                               .flatMapToPair(new 
RDDRemoveEmptyFunctionInMem(rows, maxDim, brlen, bclen, broadcastOff));
                                }
                                else {
                                        off = 
sec.getBinaryBlockRDDHandleForVariable( rddOffVar );
                                        out = in
                                                .join( off.flatMapToPair(new 
ReplicateVectorFunction(!rows,numRep)) )
-                                               .flatMapToPair(new 
RDDRemoveEmptyFunction(rows, maxDim, brlen, bclen));         
-                               }                               
+                                               .flatMapToPair(new 
RDDRemoveEmptyFunction(rows, maxDim, brlen, bclen));
+                               }
        
                                out = RDDAggregateUtils.mergeByKey(out, false);
                                
@@ -365,7 +366,8 @@ public class ParameterizedBuiltinSPInstruction extends 
ComputationSPInstruction
                        }
                        else //special case: empty output (ensure valid dims)
                        {
-                               MatrixBlock out = new 
MatrixBlock(rows?1:(int)mcIn.getRows(), rows?(int)mcIn.getCols():1, true); 
+                               int n = emptyReturn ? 1 : 0;
+                               MatrixBlock out = new 
MatrixBlock(rows?n:(int)mcIn.getRows(), rows?(int)mcIn.getCols():n, true); 
                                sec.setMatrixOutput(output.getName(), out, 
getExtendedOpcode());
                        }
                }
@@ -521,13 +523,12 @@ public class ParameterizedBuiltinSPInstruction extends 
ComputationSPInstruction
        {
                private static final long serialVersionUID = 
4906304771183325289L;
 
-               private boolean _rmRows; 
-               private long _len;
-               private long _brlen;
-               private long _bclen;
-                               
-               public RDDRemoveEmptyFunction(boolean rmRows, long len, long 
brlen, long bclen) 
-               {
+               private final boolean _rmRows;
+               private final long _len;
+               private final long _brlen;
+               private final long _bclen;
+               
+               public RDDRemoveEmptyFunction(boolean rmRows, long len, long 
brlen, long bclen) {
                        _rmRows = rmRows;
                        _len = len;
                        _brlen = brlen;
@@ -545,7 +546,7 @@ public class ParameterizedBuiltinSPInstruction extends 
ComputationSPInstruction
                        //execute remove empty operations
                        ArrayList<IndexedMatrixValue> out = new ArrayList<>();
                        LibMatrixReorg.rmempty(data, offsets, _rmRows, _len, 
_brlen, _bclen, out);
-
+                       
                        //prepare and return outputs
                        return 
SparkUtils.fromIndexedMatrixBlock(out).iterator();
                }
@@ -555,13 +556,13 @@ public class ParameterizedBuiltinSPInstruction extends 
ComputationSPInstruction
        {
                private static final long serialVersionUID = 
4906304771183325289L;
 
-               private boolean _rmRows; 
-               private long _len;
-               private long _brlen;
-               private long _bclen;
+               private final boolean _rmRows;
+               private final long _len;
+               private final long _brlen;
+               private final long _bclen;
                
                private PartitionedBroadcast<MatrixBlock> _off = null;
-                               
+               
                public RDDRemoveEmptyFunctionInMem(boolean rmRows, long len, 
long brlen, long bclen, PartitionedBroadcast<MatrixBlock> off) 
                {
                        _rmRows = rmRows;
@@ -577,12 +578,9 @@ public class ParameterizedBuiltinSPInstruction extends 
ComputationSPInstruction
                {
                        //prepare inputs (for internal api compatibility)
                        IndexedMatrixValue data = 
SparkUtils.toIndexedMatrixBlock(arg0._1(),arg0._2());
-                       //IndexedMatrixValue offsets = 
SparkUtils.toIndexedMatrixBlock(arg0._1(),arg0._2()._2());
-                       IndexedMatrixValue offsets = null;
-                       if(_rmRows)
-                               offsets = 
SparkUtils.toIndexedMatrixBlock(arg0._1(), 
_off.getBlock((int)arg0._1().getRowIndex(), 1));
-                       else
-                               offsets = 
SparkUtils.toIndexedMatrixBlock(arg0._1(), _off.getBlock(1, 
(int)arg0._1().getColumnIndex()));
+                       IndexedMatrixValue offsets = _rmRows ?
+                               SparkUtils.toIndexedMatrixBlock(arg0._1(), 
_off.getBlock((int)arg0._1().getRowIndex(), 1)) :
+                               SparkUtils.toIndexedMatrixBlock(arg0._1(), 
_off.getBlock(1, (int)arg0._1().getColumnIndex()));
                        
                        //execute remove empty operations
                        ArrayList<IndexedMatrixValue> out = new ArrayList<>();

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDAggregateUtils.java
----------------------------------------------------------------------
diff --git 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDAggregateUtils.java
 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDAggregateUtils.java
index 70174a9..97870e3 100644
--- 
a/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDAggregateUtils.java
+++ 
b/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDAggregateUtils.java
@@ -564,7 +564,7 @@ public class RDDAggregateUtils
                private MatrixBlock _corr = null;
                
                public AggregateSingleBlockFunction( AggregateOperator op ) {
-                       _op = op;       
+                       _op = op;
                }
                
                @Override
@@ -572,11 +572,11 @@ public class RDDAggregateUtils
                        throws Exception 
                {
                        //prepare combiner block
-                       if( arg0.getNumRows() <= 0 || arg0.getNumColumns() <= 
0) {
+                       if( arg0.getNumRows() == 0 && arg0.getNumColumns() == 
0) {
                                arg0.copy(arg1);
                                return arg0;
                        }
-                       else if( arg1.getNumRows() <= 0 || arg1.getNumColumns() 
<= 0 ) {
+                       else if( arg1.getNumRows() == 0 && arg1.getNumColumns() 
== 0 ) {
                                return arg0;
                        }
                        

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/io/WriterBinaryBlock.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/sysml/runtime/io/WriterBinaryBlock.java 
b/src/main/java/org/apache/sysml/runtime/io/WriterBinaryBlock.java
index a8f31cb..0d8e221 100644
--- a/src/main/java/org/apache/sysml/runtime/io/WriterBinaryBlock.java
+++ b/src/main/java/org/apache/sysml/runtime/io/WriterBinaryBlock.java
@@ -77,15 +77,14 @@ public class WriterBinaryBlock extends MatrixWriter
                JobConf job = new 
JobConf(ConfigurationManager.getCachedJobConf());
                Path path = new Path( fname );
                FileSystem fs = IOUtilFunctions.getFileSystem(path, job);
-               
                SequenceFile.Writer writer = null;
                try {
                        writer = new SequenceFile.Writer(fs, job, path,
-                                                               
MatrixIndexes.class, MatrixBlock.class);
-                       
+                               MatrixIndexes.class, MatrixBlock.class);
                        MatrixIndexes index = new MatrixIndexes(1, 1);
-                       MatrixBlock block = new MatrixBlock((int)Math.min(rlen, 
brlen),
-                                                                               
                (int)Math.min(clen, bclen), true);
+                       MatrixBlock block = new MatrixBlock(
+                               (int)Math.max(Math.min(rlen, brlen),1),
+                               (int)Math.max(Math.min(clen, bclen),1), true);
                        writer.append(index, block);
                }
                finally {

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixReorg.java
----------------------------------------------------------------------
diff --git 
a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixReorg.java 
b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixReorg.java
index 59df934..dcf8c2f 100644
--- a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixReorg.java
+++ b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixReorg.java
@@ -512,28 +512,30 @@ public class LibMatrixReorg
         * @param in input matrix
         * @param ret output matrix
         * @param rows ?
+        * @param emptyReturn return row/column of zeros for empty input
         * @param select ?
         * @return matrix block
         * @throws DMLRuntimeException if DMLRuntimeException occurs
         */
-       public static MatrixBlock rmempty(MatrixBlock in, MatrixBlock ret, 
boolean rows, MatrixBlock select) 
+       public static MatrixBlock rmempty(MatrixBlock in, MatrixBlock ret, 
boolean rows, boolean emptyReturn, MatrixBlock select) 
                throws DMLRuntimeException
        {
                //check for empty inputs 
                //(the semantics of removeEmpty are that for an empty m-by-n 
matrix, the output 
                //is an empty 1-by-n or m-by-1 matrix because we don't allow 
matrices with dims 0)
                if( in.isEmptyBlock(false) && select == null  ) {
+                       int n = emptyReturn ? 1 : 0;
                        if( rows )
-                               ret.reset(1, in.clen, in.sparse);
+                               ret.reset(n, in.clen, in.sparse);
                        else //cols
-                               ret.reset(in.rlen, 1, in.sparse);
+                               ret.reset(in.rlen, n, in.sparse);
                        return ret;
                }
                
                if( rows )
-                       return removeEmptyRows(in, ret, select);
+                       return removeEmptyRows(in, ret, select, emptyReturn);
                else //cols
-                       return removeEmptyColumns(in, ret, select);
+                       return removeEmptyColumns(in, ret, select, emptyReturn);
        }
 
        /**
@@ -1704,7 +1706,7 @@ public class LibMatrixReorg
                        new MatrixIndexes(ci, cj);
        }
 
-       private static MatrixBlock removeEmptyRows(MatrixBlock in, MatrixBlock 
ret, MatrixBlock select) 
+       private static MatrixBlock removeEmptyRows(MatrixBlock in, MatrixBlock 
ret, MatrixBlock select, boolean emptyReturn) 
                throws DMLRuntimeException 
        {
                final int m = in.rlen;
@@ -1769,7 +1771,7 @@ public class LibMatrixReorg
                //Step 2: reset result and copy rows
                //dense stays dense if correct input representation (but robust for any input), 
                //sparse might be dense/sparse
-               rlen2 = Math.max(rlen2, 1); //ensure valid output
+               rlen2 = Math.max(rlen2, emptyReturn ? 1 : 0); //ensure valid output
                boolean sp = MatrixBlock.evalSparseFormatInMemory(rlen2, n, in.nonZeros);
                ret.reset(rlen2, n, sp);
                if( in.isEmptyBlock(false) )
@@ -1825,7 +1827,7 @@ public class LibMatrixReorg
                return ret;
        }
 
-       private static MatrixBlock removeEmptyColumns(MatrixBlock in, MatrixBlock ret, MatrixBlock select) 
+       private static MatrixBlock removeEmptyColumns(MatrixBlock in, MatrixBlock ret, MatrixBlock select, boolean emptyReturn) 
                throws DMLRuntimeException 
        {
                final int m = in.rlen;
@@ -1872,7 +1874,7 @@ public class LibMatrixReorg
                //Step 3: reset result and copy columns
                //dense stays dense if correct input representation (but robust for any input), 
                // sparse might be dense/sparse
-               clen2 = Math.max(clen2, 1); //ensure valid output
+               clen2 = Math.max(clen2, emptyReturn ? 1 : 0); //ensure valid output
                boolean sp = MatrixBlock.evalSparseFormatInMemory(m, clen2, in.nonZeros);
                ret.reset(m, clen2, sp);
                if( in.isEmptyBlock(false) )

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/matrix/data/MatrixBlock.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/sysml/runtime/matrix/data/MatrixBlock.java b/src/main/java/org/apache/sysml/runtime/matrix/data/MatrixBlock.java
index 8767b13..c59ec51 100644
--- a/src/main/java/org/apache/sysml/runtime/matrix/data/MatrixBlock.java
+++ b/src/main/java/org/apache/sysml/runtime/matrix/data/MatrixBlock.java
@@ -4873,17 +4873,17 @@ public class MatrixBlock extends MatrixValue implements CacheBlock, Externalizab
                return result;
        }
 
-       public MatrixBlock removeEmptyOperations( MatrixBlock ret, boolean rows, MatrixBlock select )
+       public MatrixBlock removeEmptyOperations( MatrixBlock ret, boolean rows, boolean emptyReturn, MatrixBlock select )
                throws DMLRuntimeException 
        {       
                MatrixBlock result = checkType(ret);
-               return LibMatrixReorg.rmempty(this, result, rows, select);
+               return LibMatrixReorg.rmempty(this, result, rows, emptyReturn, select);
        }
 
-       public MatrixBlock removeEmptyOperations( MatrixBlock ret, boolean rows)
+       public MatrixBlock removeEmptyOperations( MatrixBlock ret, boolean rows, boolean emptyReturn)
                throws DMLRuntimeException 
        {
-               return removeEmptyOperations(ret, rows, null);
+               return removeEmptyOperations(ret, rows, emptyReturn, null);
        }
 
        public MatrixBlock rexpandOperations( MatrixBlock ret, double max, boolean rows, boolean cast, boolean ignore, int k )

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/test/java/org/apache/sysml/test/integration/functions/misc/ZeroRowsColsMatrixTest.java
----------------------------------------------------------------------
diff --git a/src/test/java/org/apache/sysml/test/integration/functions/misc/ZeroRowsColsMatrixTest.java b/src/test/java/org/apache/sysml/test/integration/functions/misc/ZeroRowsColsMatrixTest.java
index 349828a..960eac5 100644
--- a/src/test/java/org/apache/sysml/test/integration/functions/misc/ZeroRowsColsMatrixTest.java
+++ b/src/test/java/org/apache/sysml/test/integration/functions/misc/ZeroRowsColsMatrixTest.java
@@ -53,35 +53,35 @@ public class ZeroRowsColsMatrixTest extends AutomatedTestBase
                addTestConfiguration(TEST_NAME4, new TestConfiguration(TEST_CLASS_DIR, TEST_NAME4, new String[] { "R" })); 
        }
        
-//     @Test
-//     public void testEmptyMatrixRemoveEmptyNoRewritesCP() {
-//             runEmptyMatrixTest(TEST_NAME1, false, ExecType.CP);
-//     }
-//     
-//     @Test
-//     public void testEmptyMatrixRemoveEmptyRewritesCP() {
-//             runEmptyMatrixTest(TEST_NAME1, true, ExecType.CP);
-//     }
-//     
-//     @Test
-//     public void testEmptyMatrixRemoveEmptyNoRewritesMR() {
-//             runEmptyMatrixTest(TEST_NAME1, false, ExecType.MR);
-//     }
-//     
-//     @Test
-//     public void testEmptyMatrixRemoveEmptyRewritesMR() {
-//             runEmptyMatrixTest(TEST_NAME1, true, ExecType.MR);
-//     }
-//     
-//     @Test
-//     public void testEmptyMatrixRemoveEmptyNoRewritesSP() {
-//             runEmptyMatrixTest(TEST_NAME1, false, ExecType.SPARK);
-//     }
-//     
-//     @Test
-//     public void testEmptyMatrixRemoveEmptyRewritesSP() {
-//             runEmptyMatrixTest(TEST_NAME1, true, ExecType.SPARK);
-//     }
+       @Test
+       public void testEmptyMatrixRemoveEmptyNoRewritesCP() {
+               runEmptyMatrixTest(TEST_NAME1, false, ExecType.CP);
+       }
+       
+       @Test
+       public void testEmptyMatrixRemoveEmptyRewritesCP() {
+               runEmptyMatrixTest(TEST_NAME1, true, ExecType.CP);
+       }
+       
+       @Test
+       public void testEmptyMatrixRemoveEmptyNoRewritesMR() {
+               runEmptyMatrixTest(TEST_NAME1, false, ExecType.MR);
+       }
+       
+       @Test
+       public void testEmptyMatrixRemoveEmptyRewritesMR() {
+               runEmptyMatrixTest(TEST_NAME1, true, ExecType.MR);
+       }
+       
+       @Test
+       public void testEmptyMatrixRemoveEmptyNoRewritesSP() {
+               runEmptyMatrixTest(TEST_NAME1, false, ExecType.SPARK);
+       }
+       
+       @Test
+       public void testEmptyMatrixRemoveEmptyRewritesSP() {
+               runEmptyMatrixTest(TEST_NAME1, true, ExecType.SPARK);
+       }
        
        @Test
        public void testEmptyMatrixCbindNoRewritesCP() {
@@ -196,7 +196,7 @@ public class ZeroRowsColsMatrixTest extends AutomatedTestBase
                        
                        String HOME = SCRIPT_DIR + TEST_DIR;
                        fullDMLScriptName = HOME + TEST_NAME + ".dml";
-                       programArgs = new String[]{"-args", String.valueOf(dim), output("R")};
+                       programArgs = new String[]{"-explain","recompile_runtime","-args", String.valueOf(dim), output("R")};
                        
                        fullRScriptName = HOME + TEST_NAME +".R";
                        rCmd = getRCmd(String.valueOf(dim), expectedDir());

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.R
----------------------------------------------------------------------
diff --git a/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.R b/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.R
index 900e15e..0814e92 100644
--- a/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.R
+++ b/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.R
@@ -24,6 +24,6 @@ args <- commandArgs(TRUE)
 options(digits=22)
 library("Matrix")
 
-R = matrix(7, as.integer(args[1]), 7);
+R = matrix(7, as.integer(args[1]), 3);
 
 writeMM(as(R, "CsparseMatrix"), paste(args[2], "R", sep="")); 

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.dml
----------------------------------------------------------------------
diff --git a/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.dml b/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.dml
index 7ca8634..479bd9e 100644
--- a/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.dml
+++ b/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.dml
@@ -21,8 +21,8 @@
 
 
 A = matrix(0, $1, $1);
-B = removeEmpty(target=A, margin="rows", outEmpty=FALSE) + 7;
-C = removeEmpty(target=A, margin="cols", outEmpty=FALSE) + 3;
-R = matrix(7+sum(B)+sum(C), $1, 7);
+B = removeEmpty(target=A, margin="rows", empty.return=FALSE);
+C = removeEmpty(target=A, margin="cols", empty.return=FALSE);
+R = matrix(7+nrow(B)+ncol(C), $1, 3);
 
 write(R, $2);
\ No newline at end of file
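
The effect of the new flag on output dimensions can be illustrated with a minimal, self-contained sketch. It is not SystemML code: the class `RemoveEmptySketch`, the helper `removeEmptyRows`, and its plain `double[][]` representation are hypothetical stand-ins for the `MatrixBlock`-based implementation, and only the clamp `Math.max(rlen2, emptyReturn ? 1 : 0)` is taken from the patch above.

```java
import java.util.ArrayList;
import java.util.List;

public class RemoveEmptySketch {
    // Hypothetical stand-in for removeEmpty with margin="rows" on a dense array.
    // ncol is passed explicitly because it cannot be inferred from a 0-row result.
    public static double[][] removeEmptyRows(double[][] in, int ncol, boolean emptyReturn) {
        // Collect all rows that contain at least one non-zero value
        List<double[]> kept = new ArrayList<>();
        for (double[] row : in) {
            boolean nonEmpty = false;
            for (double v : row) {
                if (v != 0) { nonEmpty = true; break; }
            }
            if (nonEmpty)
                kept.add(row.clone());
        }
        // Same clamp as in the patch: keep one row of zeros only if emptyReturn is set
        int rlen2 = Math.max(kept.size(), emptyReturn ? 1 : 0);
        double[][] out = new double[rlen2][ncol];
        for (int i = 0; i < kept.size(); i++)
            out[i] = kept.get(i);
        return out;
    }

    public static void main(String[] args) {
        double[][] zeros = new double[4][3]; // all-zero 4-by-3 input
        System.out.println(removeEmptyRows(zeros, 3, true).length);  // prints 1
        System.out.println(removeEmptyRows(zeros, 3, false).length); // prints 0
    }
}
```

With the flag disabled, the all-zero input yields a result with zero rows, which corresponds to the updated test script above, where `nrow(B)` and `ncol(C)` are expected to contribute 0 to the constant 7.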
