Repository: systemml
Updated Branches:
  refs/heads/master 8b054804e -> 7e11deaa8

[SYSTEMML-2135] Extended removeEmpty with outputs of zero rows/cols

This patch extends the support for matrices with zero rows and columns
to removeEmpty. So far this operation returned a single row/column of
zeros for empty inputs. For backwards compatibility, we keep these
semantics, but introduce a new flag that allows to disable this special
case in order to return matrices with zero rows and columns,
respectively. In addition, this patch also includes the related update
of the dml language guide.

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/e0b66b30
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/e0b66b30
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/e0b66b30

Branch: refs/heads/master
Commit: e0b66b300c8bd4582579bac8b6dea26e132447b0
Parents: 8b05480
Author: Matthias Boehm <[email protected]>
Authored: Thu Feb 8 18:12:32 2018 -0800
Committer: Matthias Boehm <[email protected]>
Committed: Thu Feb 8 22:04:03 2018 -0800

----------------------------------------------------------------------
 docs/dml-language-reference.md | 2 +-
 .../sysml/hops/ParameterizedBuiltinOp.java | 3 +-
 .../ParameterizedBuiltinFunctionExpression.java | 21 ++++++-
 .../java/org/apache/sysml/parser/dml/Dml.g4 | 2 +-
 .../java/org/apache/sysml/parser/pydml/Pydml.g4 | 2 +-
 .../runtime/compress/CompressedMatrixBlock.java | 8 +--
 .../estim/CompressedSizeEstimatorSample.java | 2 +-
 .../cp/ParameterizedBuiltinCPInstruction.java | 14 ++---
 .../ParameterizedBuiltinSPInstruction.java | 44 +++++++-------
 .../spark/utils/RDDAggregateUtils.java | 6 +-
 .../sysml/runtime/io/WriterBinaryBlock.java | 9 ++-
 .../runtime/matrix/data/LibMatrixReorg.java | 20 ++++---
 .../sysml/runtime/matrix/data/MatrixBlock.java | 8 +--
 .../functions/misc/ZeroRowsColsMatrixTest.java | 60 ++++++++++----------
 .../functions/misc/ZeroMatrix_RemoveEmpty.R | 2 +-
 .../functions/misc/ZeroMatrix_RemoveEmpty.dml | 6 +-
 16 files changed, 113 insertions(+), 96 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/docs/dml-language-reference.md
----------------------------------------------------------------------
diff --git a/docs/dml-language-reference.md b/docs/dml-language-reference.md
index 5fb9de5..355b507 100644
--- a/docs/dml-language-reference.md
+++ b/docs/dml-language-reference.md
@@ -648,7 +648,7 @@ nrow(), <br/> ncol(), <br/> length() | Return the number of rows, number of colu
 prod() | Return the product of all cells in matrix | Input: matrix <br/> Output: scalarj | prod(X)
 rand() | Generates a random matrix | Input: (rows=<value>, cols=<value>, min=<value>, max=<value>, sparsity=<value>, pdf=<string>, seed=<value>) <br/> rows/cols: Number of rows/cols (expression) <br/> min/max: Min/max value for cells (either constant value, or variable that evaluates to constant value) <br/> sparsity: fraction of non-zero cells (constant value) <br/> pdf: "uniform" (min, max) distribution, or "normal" (0,1) distribution; or "poisson" (lambda=1) distribution. string; default value is "uniform". Note that, for the Poisson distribution, users can provide the mean/lambda parameter as follows: <br/> rand(rows=1000,cols=1000, pdf="poisson", lambda=2.5). <br/> The default value for lambda is 1. <br/> seed: Every invocation of rand() internally generates a random seed with which the cell values are generated. One can optionally provide a seed when repeatability is desired. <br/> Output: matrix | X = rand(rows=10, cols=20, min=0, max=1, pdf="uniform", sparsity=0.2) <br/> The example generates a 10 x 20 matrix, with cell values uniformly chosen at random between 0 and 1, and approximately 20% of cells will have non-zero values.
 rbind() | Row-wise matrix concatenation.
Concatenates the second matrix as additional rows to the first matrix | Input: (X <matrix>, Y <matrix>) <br/>Output: <matrix> <br/> X and Y are matrices, where the number of columns in X and the number of columns in Y are the same. | A = matrix(1, rows=2,cols=3) <br/> B = matrix(2, rows=2,cols=3) <br/> C = rbind(A,B) <br/> print("Dimensions of C: " + nrow(C) + " X " + ncol(C)) <br/> Output: <br/> Dimensions of C: 4 X 3 -removeEmpty() | Removes all empty rows or columns from the input matrix target X according to the specified margin. Also, allows to apply a filter F before removing the empty rows/cols. | Input : (target= X <matrix>, margin="...", select=F) <br/> Output : <matrix> <br/> Valid values for margin are "rows" or "cols". | A = removeEmpty(target=X, margin="rows", select=F) +removeEmpty() | Removes all empty rows or columns from the input matrix target X according to the specified margin. The optional select vector F specifies selected rows or columns; if not provided, the semantics are F=(rowSums(X!=0)>0) and F=(colSums(X!=0)>0) for removeEmpty "rows" and "cols", respectively. The optional empty.return flag indicates if a row or column of zeros should be returned for empty inputs. | Input : (target= X <matrix>, margin="..."[, select=F][, empty.return=TRUE]) <br/> Output : <matrix> <br/> Valid values for margin are "rows" or "cols". | A = removeEmpty(target=X, margin="rows", select=F) replace() | Creates a copy of input matrix X, where all values that are equal to the scalar pattern s1 are replaced with the scalar replacement s2. | Input : (target= X <matrix>, pattern=<scalar>, replacement=<scalar>) <br/> Output : <matrix> <br/> If s1 is NaN, then all NaN values of X are treated as equal and hence replaced with s2. Positive and negative infinity are treated as different values. 
| A = replace(target=X, pattern=s1, replacement=s2) rev() | Reverses the rows in a matrix | Input : (<matrix>) <br/> Output : <matrix> | <span style="white-space: nowrap;">A = matrix("1 2 3 4", rows=2, cols=2)</span> <br/> <span style="white-space: nowrap;">B = matrix("1 2 3 4", rows=4, cols=1)</span> <br/> <span style="white-space: nowrap;">C = matrix("1 2 3 4", rows=1, cols=4)</span> <br/> revA = rev(A) <br/> revB = rev(B) <br/> revC = rev(C) <br/> Matrix revA: [[3, 4], [1, 2]]<br/> Matrix revB: [[4], [3], [2], [1]]<br/> Matrix revC: [[1, 2, 3, 4]]<br/> seq() | Creates a single column vector with values starting from <from>, to <to>, in increments of <increment> | Input: (<from>, <to>, <increment>) <br/> Output: <matrix> | S = seq (10, 200, 10) http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/hops/ParameterizedBuiltinOp.java ---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/hops/ParameterizedBuiltinOp.java b/src/main/java/org/apache/sysml/hops/ParameterizedBuiltinOp.java index aa48061..e9800a5 100644 --- a/src/main/java/org/apache/sysml/hops/ParameterizedBuiltinOp.java +++ b/src/main/java/org/apache/sysml/hops/ParameterizedBuiltinOp.java @@ -726,7 +726,8 @@ public class ParameterizedBuiltinOp extends Hop implements MultiThreadedHop inMap.put("offset", loffset); inMap.put("maxdim", lmaxdim); inMap.put("margin", inputlops.get("margin")); - + inMap.put("empty.return", inputlops.get("empty.return")); + if ( !FORCE_DIST_RM_EMPTY && isRemoveEmptyBcSP()) _bRmEmptyBC = true; http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/parser/ParameterizedBuiltinFunctionExpression.java ---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/parser/ParameterizedBuiltinFunctionExpression.java b/src/main/java/org/apache/sysml/parser/ParameterizedBuiltinFunctionExpression.java 
index b365abb..dbb2a1e 100644 --- a/src/main/java/org/apache/sysml/parser/ParameterizedBuiltinFunctionExpression.java +++ b/src/main/java/org/apache/sysml/parser/ParameterizedBuiltinFunctionExpression.java @@ -23,10 +23,13 @@ import java.util.ArrayList; import java.util.Arrays; import java.util.HashMap; import java.util.HashSet; +import java.util.Set; +import java.util.stream.Collectors; import org.antlr.v4.runtime.ParserRuleContext; import org.apache.sysml.hops.Hop.ParamBuiltinOp; import org.apache.sysml.parser.LanguageException.LanguageErrorCodes; +import org.apache.sysml.runtime.util.UtilFunctions; import org.apache.wink.json4j.JSONObject; @@ -476,6 +479,15 @@ public class ParameterizedBuiltinFunctionExpression extends DataIdentifier } private void validateRemoveEmpty(DataIdentifier output, boolean conditional) throws LanguageException { + + //check for invalid parameters + Set<String> valid = UtilFunctions.asSet("target", "margin", "select", "empty.return"); + Set<String> invalid = _varParams.keySet().stream() + .filter(k -> !valid.contains(k)).collect(Collectors.toSet()); + if( !invalid.isEmpty() ) + raiseValidateError("Invalid parameters for removeEmpty: " + + Arrays.toString(invalid.toArray(new String[0])), false); + //check existence and correctness of arguments Expression target = getVarParam("target"); if( target==null ) { @@ -498,11 +510,18 @@ public class ParameterizedBuiltinFunctionExpression extends DataIdentifier raiseValidateError("Index matrix 'select' is of type '"+select.getOutput().getDataType()+"'. 
Please specify the select matrix.", conditional, LanguageErrorCodes.INVALID_PARAMETERS); } + Expression empty = getVarParam("empty.return"); + if( empty!=null && (!empty.getOutput().getDataType().isScalar() || empty.getOutput().getValueType() != ValueType.BOOLEAN) ){ + raiseValidateError("Boolean parameter 'empty.return' is of type "+empty.getOutput().getDataType() + +"["+empty.getOutput().getValueType()+"].", conditional, LanguageErrorCodes.INVALID_PARAMETERS); + } + if( empty == null ) //default handling + _varParams.put("empty.return", new BooleanIdentifier(true)); + // Output is a matrix with unknown dims output.setDataType(DataType.MATRIX); output.setValueType(ValueType.DOUBLE); output.setDimensions(-1, -1); - } private void validateGroupedAgg(DataIdentifier output, boolean conditional) http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/parser/dml/Dml.g4 ---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/parser/dml/Dml.g4 b/src/main/java/org/apache/sysml/parser/dml/Dml.g4 index fb72ed2..9491855 100644 --- a/src/main/java/org/apache/sysml/parser/dml/Dml.g4 +++ b/src/main/java/org/apache/sysml/parser/dml/Dml.g4 @@ -182,7 +182,7 @@ strictParameterizedKeyValueString : paramName=ID '=' paramVal=STRING ; ID : (ALPHABET (ALPHABET|DIGIT|'_')* '::')? 
ALPHABET (ALPHABET|DIGIT|'_')* // Special ID cases: // | 'matrix' // --> This is a special case which causes lot of headache - | 'as.scalar' | 'as.matrix' | 'as.frame' | 'as.double' | 'as.integer' | 'as.logical' | 'index.return' | 'lower.tail' + | 'as.scalar' | 'as.matrix' | 'as.frame' | 'as.double' | 'as.integer' | 'as.logical' | 'index.return' | 'empty.return' | 'lower.tail' ; // Unfortunately, we have datatype name clashing with builtin function name: matrix :( // Therefore, ugly work around for checking datatype http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/parser/pydml/Pydml.g4 ---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/parser/pydml/Pydml.g4 b/src/main/java/org/apache/sysml/parser/pydml/Pydml.g4 index 320793e..34d0c34 100644 --- a/src/main/java/org/apache/sysml/parser/pydml/Pydml.g4 +++ b/src/main/java/org/apache/sysml/parser/pydml/Pydml.g4 @@ -302,7 +302,7 @@ ID : (ALPHABET (ALPHABET|DIGIT|'_')* '.')? 
ALPHABET (ALPHABET|DIGIT|'_')* // Special ID cases: // | 'matrix' // --> This is a special case which causes lot of headache // | 'scalar' | 'float' | 'int' | 'bool' // corresponds to as.scalar, as.double, as.integer and as.logical - | 'index.return' + | 'index.return' | 'empty.return' ; // Unfortunately, we have datatype name clashing with builtin function name: matrix :( // Therefore, ugly work around for checking datatype http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/compress/CompressedMatrixBlock.java ---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/runtime/compress/CompressedMatrixBlock.java b/src/main/java/org/apache/sysml/runtime/compress/CompressedMatrixBlock.java index 0ce3dc8..d9cfc72 100644 --- a/src/main/java/org/apache/sysml/runtime/compress/CompressedMatrixBlock.java +++ b/src/main/java/org/apache/sysml/runtime/compress/CompressedMatrixBlock.java @@ -2165,19 +2165,19 @@ public class CompressedMatrixBlock extends MatrixBlock implements Externalizable } @Override - public MatrixBlock removeEmptyOperations(MatrixBlock ret, boolean rows, MatrixBlock select) + public MatrixBlock removeEmptyOperations(MatrixBlock ret, boolean rows, boolean emptyReturn, MatrixBlock select) throws DMLRuntimeException { printDecompressWarning("removeEmptyOperations"); MatrixBlock tmp = isCompressed() ? decompress() : this; - return tmp.removeEmptyOperations(ret, rows, select); + return tmp.removeEmptyOperations(ret, rows, emptyReturn, select); } @Override - public MatrixBlock removeEmptyOperations(MatrixBlock ret, boolean rows) + public MatrixBlock removeEmptyOperations(MatrixBlock ret, boolean rows, boolean emptyReturn) throws DMLRuntimeException { printDecompressWarning("removeEmptyOperations"); MatrixBlock tmp = isCompressed() ? 
decompress() : this; - return tmp.removeEmptyOperations(ret, rows); + return tmp.removeEmptyOperations(ret, rows, emptyReturn); } @Override http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/compress/estim/CompressedSizeEstimatorSample.java ---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/runtime/compress/estim/CompressedSizeEstimatorSample.java b/src/main/java/org/apache/sysml/runtime/compress/estim/CompressedSizeEstimatorSample.java index 2492c0b..f3d842a 100644 --- a/src/main/java/org/apache/sysml/runtime/compress/estim/CompressedSizeEstimatorSample.java +++ b/src/main/java/org/apache/sysml/runtime/compress/estim/CompressedSizeEstimatorSample.java @@ -63,7 +63,7 @@ public class CompressedSizeEstimatorSample extends CompressedSizeEstimator for( int i=0; i<sampleSize; i++ ) select.quickSetValue(_sampleRows[i], 0, 1); _data = _data.removeEmptyOperations(new MatrixBlock(), - !CompressedMatrixBlock.TRANSPOSE_INPUT, select); + !CompressedMatrixBlock.TRANSPOSE_INPUT, true, select); } //establish estimator-local cache for numeric solve http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/instructions/cp/ParameterizedBuiltinCPInstruction.java ---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/runtime/instructions/cp/ParameterizedBuiltinCPInstruction.java b/src/main/java/org/apache/sysml/runtime/instructions/cp/ParameterizedBuiltinCPInstruction.java index 5a7442a..18ec72c 100644 --- a/src/main/java/org/apache/sysml/runtime/instructions/cp/ParameterizedBuiltinCPInstruction.java +++ b/src/main/java/org/apache/sysml/runtime/instructions/cp/ParameterizedBuiltinCPInstruction.java @@ -197,19 +197,17 @@ public class ParameterizedBuiltinCPInstruction extends ComputationCPInstruction } else if ( opcode.equalsIgnoreCase("rmempty") ) { + String margin = 
params.get("margin"); + if( !(margin.equals("rows") || margin.equals("cols")) ) + throw new DMLRuntimeException("Unspupported margin identifier '"+margin+"'."); + // acquire locks MatrixBlock target = ec.getMatrixInput(params.get("target"), getExtendedOpcode()); MatrixBlock select = params.containsKey("select")? ec.getMatrixInput(params.get("select"), getExtendedOpcode()):null; // compute the result - String margin = params.get("margin"); - MatrixBlock soresBlock = null; - if( margin.equals("rows") ) - soresBlock = target.removeEmptyOperations(new MatrixBlock(), true, select); - else if( margin.equals("cols") ) - soresBlock = target.removeEmptyOperations(new MatrixBlock(), false, select); - else - throw new DMLRuntimeException("Unspupported margin identifier '"+margin+"'."); + MatrixBlock soresBlock = target.removeEmptyOperations(new MatrixBlock(), + margin.equals("rows"), margin.equals("empty.return"), select); //release locks ec.setMatrixOutput(output.getName(), soresBlock, getExtendedOpcode()); http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java ---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java b/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java index 49dd6e5..425739d 100644 --- a/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java +++ b/src/main/java/org/apache/sysml/runtime/instructions/spark/ParameterizedBuiltinSPInstruction.java @@ -320,6 +320,7 @@ public class ParameterizedBuiltinSPInstruction extends ComputationSPInstruction String rddOffVar = params.get("offset"); boolean rows = sec.getScalarInput(params.get("margin"), ValueType.STRING, true).getStringValue().equals("rows"); + boolean emptyReturn = 
Boolean.parseBoolean(params.get("empty.return").toLowerCase()); long maxDim = sec.getScalarInput(params.get("maxdim"), ValueType.DOUBLE, false).getLongValue(); MatrixCharacteristics mcIn = sec.getMatrixCharacteristics(rddInVar); @@ -340,14 +341,14 @@ public class ParameterizedBuiltinSPInstruction extends ComputationSPInstruction broadcastOff = sec.getBroadcastForVariable( rddOffVar ); // Broadcast offset vector out = in - .flatMapToPair(new RDDRemoveEmptyFunctionInMem(rows, maxDim, brlen, bclen, broadcastOff)); + .flatMapToPair(new RDDRemoveEmptyFunctionInMem(rows, maxDim, brlen, bclen, broadcastOff)); } else { off = sec.getBinaryBlockRDDHandleForVariable( rddOffVar ); out = in .join( off.flatMapToPair(new ReplicateVectorFunction(!rows,numRep)) ) - .flatMapToPair(new RDDRemoveEmptyFunction(rows, maxDim, brlen, bclen)); - } + .flatMapToPair(new RDDRemoveEmptyFunction(rows, maxDim, brlen, bclen)); + } out = RDDAggregateUtils.mergeByKey(out, false); @@ -365,7 +366,8 @@ public class ParameterizedBuiltinSPInstruction extends ComputationSPInstruction } else //special case: empty output (ensure valid dims) { - MatrixBlock out = new MatrixBlock(rows?1:(int)mcIn.getRows(), rows?(int)mcIn.getCols():1, true); + int n = emptyReturn ? 
1 : 0; + MatrixBlock out = new MatrixBlock(rows?n:(int)mcIn.getRows(), rows?(int)mcIn.getCols():n, true); sec.setMatrixOutput(output.getName(), out, getExtendedOpcode()); } } @@ -521,13 +523,12 @@ public class ParameterizedBuiltinSPInstruction extends ComputationSPInstruction { private static final long serialVersionUID = 4906304771183325289L; - private boolean _rmRows; - private long _len; - private long _brlen; - private long _bclen; - - public RDDRemoveEmptyFunction(boolean rmRows, long len, long brlen, long bclen) - { + private final boolean _rmRows; + private final long _len; + private final long _brlen; + private final long _bclen; + + public RDDRemoveEmptyFunction(boolean rmRows, long len, long brlen, long bclen) { _rmRows = rmRows; _len = len; _brlen = brlen; @@ -545,7 +546,7 @@ public class ParameterizedBuiltinSPInstruction extends ComputationSPInstruction //execute remove empty operations ArrayList<IndexedMatrixValue> out = new ArrayList<>(); LibMatrixReorg.rmempty(data, offsets, _rmRows, _len, _brlen, _bclen, out); - + //prepare and return outputs return SparkUtils.fromIndexedMatrixBlock(out).iterator(); } @@ -555,13 +556,13 @@ public class ParameterizedBuiltinSPInstruction extends ComputationSPInstruction { private static final long serialVersionUID = 4906304771183325289L; - private boolean _rmRows; - private long _len; - private long _brlen; - private long _bclen; + private final boolean _rmRows; + private final long _len; + private final long _brlen; + private final long _bclen; private PartitionedBroadcast<MatrixBlock> _off = null; - + public RDDRemoveEmptyFunctionInMem(boolean rmRows, long len, long brlen, long bclen, PartitionedBroadcast<MatrixBlock> off) { _rmRows = rmRows; @@ -577,12 +578,9 @@ public class ParameterizedBuiltinSPInstruction extends ComputationSPInstruction { //prepare inputs (for internal api compatibility) IndexedMatrixValue data = SparkUtils.toIndexedMatrixBlock(arg0._1(),arg0._2()); - //IndexedMatrixValue offsets = 
SparkUtils.toIndexedMatrixBlock(arg0._1(),arg0._2()._2()); - IndexedMatrixValue offsets = null; - if(_rmRows) - offsets = SparkUtils.toIndexedMatrixBlock(arg0._1(), _off.getBlock((int)arg0._1().getRowIndex(), 1)); - else - offsets = SparkUtils.toIndexedMatrixBlock(arg0._1(), _off.getBlock(1, (int)arg0._1().getColumnIndex())); + IndexedMatrixValue offsets = _rmRows ? + SparkUtils.toIndexedMatrixBlock(arg0._1(), _off.getBlock((int)arg0._1().getRowIndex(), 1)) : + SparkUtils.toIndexedMatrixBlock(arg0._1(), _off.getBlock(1, (int)arg0._1().getColumnIndex())); //execute remove empty operations ArrayList<IndexedMatrixValue> out = new ArrayList<>(); http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDAggregateUtils.java ---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDAggregateUtils.java b/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDAggregateUtils.java index 70174a9..97870e3 100644 --- a/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDAggregateUtils.java +++ b/src/main/java/org/apache/sysml/runtime/instructions/spark/utils/RDDAggregateUtils.java @@ -564,7 +564,7 @@ public class RDDAggregateUtils private MatrixBlock _corr = null; public AggregateSingleBlockFunction( AggregateOperator op ) { - _op = op; + _op = op; } @Override @@ -572,11 +572,11 @@ public class RDDAggregateUtils throws Exception { //prepare combiner block - if( arg0.getNumRows() <= 0 || arg0.getNumColumns() <= 0) { + if( arg0.getNumRows() == 0 && arg0.getNumColumns() == 0) { arg0.copy(arg1); return arg0; } - else if( arg1.getNumRows() <= 0 || arg1.getNumColumns() <= 0 ) { + else if( arg1.getNumRows() == 0 && arg1.getNumColumns() == 0 ) { return arg0; } http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/io/WriterBinaryBlock.java 
---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/runtime/io/WriterBinaryBlock.java b/src/main/java/org/apache/sysml/runtime/io/WriterBinaryBlock.java index a8f31cb..0d8e221 100644 --- a/src/main/java/org/apache/sysml/runtime/io/WriterBinaryBlock.java +++ b/src/main/java/org/apache/sysml/runtime/io/WriterBinaryBlock.java @@ -77,15 +77,14 @@ public class WriterBinaryBlock extends MatrixWriter JobConf job = new JobConf(ConfigurationManager.getCachedJobConf()); Path path = new Path( fname ); FileSystem fs = IOUtilFunctions.getFileSystem(path, job); - SequenceFile.Writer writer = null; try { writer = new SequenceFile.Writer(fs, job, path, - MatrixIndexes.class, MatrixBlock.class); - + MatrixIndexes.class, MatrixBlock.class); MatrixIndexes index = new MatrixIndexes(1, 1); - MatrixBlock block = new MatrixBlock((int)Math.min(rlen, brlen), - (int)Math.min(clen, bclen), true); + MatrixBlock block = new MatrixBlock( + (int)Math.max(Math.min(rlen, brlen),1), + (int)Math.max(Math.min(clen, bclen),1), true); writer.append(index, block); } finally { http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixReorg.java ---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixReorg.java b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixReorg.java index 59df934..dcf8c2f 100644 --- a/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixReorg.java +++ b/src/main/java/org/apache/sysml/runtime/matrix/data/LibMatrixReorg.java @@ -512,28 +512,30 @@ public class LibMatrixReorg * @param in input matrix * @param ret output matrix * @param rows ? + * @param emptyReturn return row/column of zeros for empty input * @param select ? 
* @return matrix block * @throws DMLRuntimeException if DMLRuntimeException occurs */ - public static MatrixBlock rmempty(MatrixBlock in, MatrixBlock ret, boolean rows, MatrixBlock select) + public static MatrixBlock rmempty(MatrixBlock in, MatrixBlock ret, boolean rows, boolean emptyReturn, MatrixBlock select) throws DMLRuntimeException { //check for empty inputs //(the semantics of removeEmpty are that for an empty m-by-n matrix, the output //is an empty 1-by-n or m-by-1 matrix because we don't allow matrices with dims 0) if( in.isEmptyBlock(false) && select == null ) { + int n = emptyReturn ? 1 : 0; if( rows ) - ret.reset(1, in.clen, in.sparse); + ret.reset(n, in.clen, in.sparse); else //cols - ret.reset(in.rlen, 1, in.sparse); + ret.reset(in.rlen, n, in.sparse); return ret; } if( rows ) - return removeEmptyRows(in, ret, select); + return removeEmptyRows(in, ret, select, emptyReturn); else //cols - return removeEmptyColumns(in, ret, select); + return removeEmptyColumns(in, ret, select, emptyReturn); } /** @@ -1704,7 +1706,7 @@ public class LibMatrixReorg new MatrixIndexes(ci, cj); } - private static MatrixBlock removeEmptyRows(MatrixBlock in, MatrixBlock ret, MatrixBlock select) + private static MatrixBlock removeEmptyRows(MatrixBlock in, MatrixBlock ret, MatrixBlock select, boolean emptyReturn) throws DMLRuntimeException { final int m = in.rlen; @@ -1769,7 +1771,7 @@ public class LibMatrixReorg //Step 2: reset result and copy rows //dense stays dense if correct input representation (but robust for any input), //sparse might be dense/sparse - rlen2 = Math.max(rlen2, 1); //ensure valid output + rlen2 = Math.max(rlen2, emptyReturn ? 
1 : 0); //ensure valid output boolean sp = MatrixBlock.evalSparseFormatInMemory(rlen2, n, in.nonZeros); ret.reset(rlen2, n, sp); if( in.isEmptyBlock(false) ) @@ -1825,7 +1827,7 @@ public class LibMatrixReorg return ret; } - private static MatrixBlock removeEmptyColumns(MatrixBlock in, MatrixBlock ret, MatrixBlock select) + private static MatrixBlock removeEmptyColumns(MatrixBlock in, MatrixBlock ret, MatrixBlock select, boolean emptyReturn) throws DMLRuntimeException { final int m = in.rlen; @@ -1872,7 +1874,7 @@ public class LibMatrixReorg //Step 3: reset result and copy columns //dense stays dense if correct input representation (but robust for any input), // sparse might be dense/sparse - clen2 = Math.max(clen2, 1); //ensure valid output + clen2 = Math.max(clen2, emptyReturn ? 1 : 0); //ensure valid output boolean sp = MatrixBlock.evalSparseFormatInMemory(m, clen2, in.nonZeros); ret.reset(m, clen2, sp); if( in.isEmptyBlock(false) ) http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/main/java/org/apache/sysml/runtime/matrix/data/MatrixBlock.java ---------------------------------------------------------------------- diff --git a/src/main/java/org/apache/sysml/runtime/matrix/data/MatrixBlock.java b/src/main/java/org/apache/sysml/runtime/matrix/data/MatrixBlock.java index 8767b13..c59ec51 100644 --- a/src/main/java/org/apache/sysml/runtime/matrix/data/MatrixBlock.java +++ b/src/main/java/org/apache/sysml/runtime/matrix/data/MatrixBlock.java @@ -4873,17 +4873,17 @@ public class MatrixBlock extends MatrixValue implements CacheBlock, Externalizab return result; } - public MatrixBlock removeEmptyOperations( MatrixBlock ret, boolean rows, MatrixBlock select ) + public MatrixBlock removeEmptyOperations( MatrixBlock ret, boolean rows, boolean emptyReturn, MatrixBlock select ) throws DMLRuntimeException { MatrixBlock result = checkType(ret); - return LibMatrixReorg.rmempty(this, result, rows, select); + return LibMatrixReorg.rmempty(this, result, rows, 
emptyReturn, select); } - public MatrixBlock removeEmptyOperations( MatrixBlock ret, boolean rows) + public MatrixBlock removeEmptyOperations( MatrixBlock ret, boolean rows, boolean emptyReturn) throws DMLRuntimeException { - return removeEmptyOperations(ret, rows, null); + return removeEmptyOperations(ret, rows, emptyReturn, null); } public MatrixBlock rexpandOperations( MatrixBlock ret, double max, boolean rows, boolean cast, boolean ignore, int k ) http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/test/java/org/apache/sysml/test/integration/functions/misc/ZeroRowsColsMatrixTest.java ---------------------------------------------------------------------- diff --git a/src/test/java/org/apache/sysml/test/integration/functions/misc/ZeroRowsColsMatrixTest.java b/src/test/java/org/apache/sysml/test/integration/functions/misc/ZeroRowsColsMatrixTest.java index 349828a..960eac5 100644 --- a/src/test/java/org/apache/sysml/test/integration/functions/misc/ZeroRowsColsMatrixTest.java +++ b/src/test/java/org/apache/sysml/test/integration/functions/misc/ZeroRowsColsMatrixTest.java @@ -53,35 +53,35 @@ public class ZeroRowsColsMatrixTest extends AutomatedTestBase addTestConfiguration(TEST_NAME4, new TestConfiguration(TEST_CLASS_DIR, TEST_NAME4, new String[] { "R" })); } -// @Test -// public void testEmptyMatrixRemoveEmptyNoRewritesCP() { -// runEmptyMatrixTest(TEST_NAME1, false, ExecType.CP); -// } -// -// @Test -// public void testEmptyMatrixRemoveEmptyRewritesCP() { -// runEmptyMatrixTest(TEST_NAME1, true, ExecType.CP); -// } -// -// @Test -// public void testEmptyMatrixRemoveEmptyNoRewritesMR() { -// runEmptyMatrixTest(TEST_NAME1, false, ExecType.MR); -// } -// -// @Test -// public void testEmptyMatrixRemoveEmptyRewritesMR() { -// runEmptyMatrixTest(TEST_NAME1, true, ExecType.MR); -// } -// -// @Test -// public void testEmptyMatrixRemoveEmptyNoRewritesSP() { -// runEmptyMatrixTest(TEST_NAME1, false, ExecType.SPARK); -// } -// -// @Test -// public void 
testEmptyMatrixRemoveEmptyRewritesSP() { -// runEmptyMatrixTest(TEST_NAME1, true, ExecType.SPARK); -// } + @Test + public void testEmptyMatrixRemoveEmptyNoRewritesCP() { + runEmptyMatrixTest(TEST_NAME1, false, ExecType.CP); + } + + @Test + public void testEmptyMatrixRemoveEmptyRewritesCP() { + runEmptyMatrixTest(TEST_NAME1, true, ExecType.CP); + } + + @Test + public void testEmptyMatrixRemoveEmptyNoRewritesMR() { + runEmptyMatrixTest(TEST_NAME1, false, ExecType.MR); + } + + @Test + public void testEmptyMatrixRemoveEmptyRewritesMR() { + runEmptyMatrixTest(TEST_NAME1, true, ExecType.MR); + } + + @Test + public void testEmptyMatrixRemoveEmptyNoRewritesSP() { + runEmptyMatrixTest(TEST_NAME1, false, ExecType.SPARK); + } + + @Test + public void testEmptyMatrixRemoveEmptyRewritesSP() { + runEmptyMatrixTest(TEST_NAME1, true, ExecType.SPARK); + } @Test public void testEmptyMatrixCbindNoRewritesCP() { @@ -196,7 +196,7 @@ public class ZeroRowsColsMatrixTest extends AutomatedTestBase String HOME = SCRIPT_DIR + TEST_DIR; fullDMLScriptName = HOME + TEST_NAME + ".dml"; - programArgs = new String[]{"-args", String.valueOf(dim), output("R")}; + programArgs = new String[]{"-explain","recompile_runtime","-args", String.valueOf(dim), output("R")}; fullRScriptName = HOME + TEST_NAME +".R"; rCmd = getRCmd(String.valueOf(dim), expectedDir()); http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.R ---------------------------------------------------------------------- diff --git a/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.R b/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.R index 900e15e..0814e92 100644 --- a/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.R +++ b/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.R @@ -24,6 +24,6 @@ args <- commandArgs(TRUE) options(digits=22) library("Matrix") -R = matrix(7, as.integer(args[1]), 7); +R = matrix(7, as.integer(args[1]), 3); writeMM(as(R, 
"CsparseMatrix"), paste(args[2], "R", sep="")); http://git-wip-us.apache.org/repos/asf/systemml/blob/e0b66b30/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.dml ---------------------------------------------------------------------- diff --git a/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.dml b/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.dml index 7ca8634..479bd9e 100644 --- a/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.dml +++ b/src/test/scripts/functions/misc/ZeroMatrix_RemoveEmpty.dml @@ -21,8 +21,8 @@ A = matrix(0, $1, $1); -B = removeEmpty(target=A, margin="rows", outEmpty=FALSE) + 7; -C = removeEmpty(target=A, margin="cols", outEmpty=FALSE) + 3; -R = matrix(7+sum(B)+sum(C), $1, 7); +B = removeEmpty(target=A, margin="rows", empty.return=FALSE); +C = removeEmpty(target=A, margin="cols", empty.return=FALSE); +R = matrix(7+nrow(B)+ncol(C), $1, 3); write(R, $2); \ No newline at end of file

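The effect of the new flag on output dimensions can be illustrated with a minimal standalone sketch. It mirrors only the empty-input branch of LibMatrixReorg.rmempty shown in the diff above (`int n = emptyReturn ? 1 : 0`), for an all-zero input without a select vector. The class name RemoveEmptyDimsSketch and the method emptyInputDims are hypothetical names chosen for illustration; they are not part of SystemML.

```java
// Hypothetical, self-contained sketch (not SystemML code): it reproduces only
// the dimension handling for an empty (all-zero) input without a select vector.
public class RemoveEmptyDimsSketch {

    /** Returns {rows, cols} of the removeEmpty result for an all-zero rlen x clen input. */
    public static int[] emptyInputDims(int rlen, int clen, boolean rows, boolean emptyReturn) {
        // n is the size of the collapsed dimension: 1 keeps the backwards-compatible
        // single row/column of zeros, 0 yields a matrix with zero rows/columns
        int n = emptyReturn ? 1 : 0;
        return rows ? new int[] { n, clen } : new int[] { rlen, n };
    }

    public static void main(String[] args) {
        int[] legacy = emptyInputDims(5, 3, true, true);     // margin="rows", empty.return=TRUE
        int[] zeroRows = emptyInputDims(5, 3, true, false);  // margin="rows", empty.return=FALSE
        int[] zeroCols = emptyInputDims(5, 3, false, false); // margin="cols", empty.return=FALSE
        System.out.println(legacy[0] + " x " + legacy[1]);
        System.out.println(zeroRows[0] + " x " + zeroRows[1]);
        System.out.println(zeroCols[0] + " x " + zeroCols[1]);
    }
}
```

Under these assumptions, the updated test script ZeroMatrix_RemoveEmpty.dml is consistent with the sketch: with empty.return=FALSE, nrow(B) and ncol(C) are both 0, so R is filled with the value 7.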