[SYSTEMML-2221] Performance checkpointing of ultra-sparse matrices

This patch improves the checkpointing (i.e., distributed in-memory
caching) of ultra-sparse matrices by skipping the MCSR-to-CSR
conversion for empty blocks. For such blocks the conversion has no
benefit and causes substantial GC overhead if the fraction of empty
1kx1k blocks is very large.
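
The pass-through rule described above can be illustrated with a minimal,
self-contained sketch. It does not use the SystemML classes: `Block`,
`toCsr`, and the `conversions` counter are hypothetical stand-ins, assumed
here only to show that an empty sparse block is returned as-is while a
non-empty sparse block is copied into a new representation.

```java
import java.util.function.Function;

public class EmptyBlockPassThrough {
    /** Hypothetical stand-in for a matrix block; NOT the SystemML MatrixBlock API. */
    static final class Block {
        final boolean sparse;   // true if the block is held in a sparse format
        final long nonZeros;    // number of non-zero cells
        final String format;    // illustrative label: "MCSR", "CSR", or "DENSE"

        Block(boolean sparse, long nonZeros, String format) {
            this.sparse = sparse;
            this.nonZeros = nonZeros;
            this.format = format;
        }

        boolean isEmpty() {
            return nonZeros == 0;
        }
    }

    /** Counts how many conversion copies were allocated (illustration only). */
    static int conversions = 0;

    /** Converts non-empty sparse blocks to "CSR"; passes everything else through. */
    static final Function<Block, Block> toCsr = b -> {
        if (b.sparse && !b.isEmpty()) {
            conversions++;
            return new Block(true, b.nonZeros, "CSR"); // new object allocated
        }
        return b; // dense or empty: shallow pass-through, no allocation
    };

    public static void main(String[] args) {
        Block empty = new Block(true, 0, "MCSR");
        Block filled = new Block(true, 42, "MCSR");

        System.out.println(toCsr.apply(empty) == empty); // same object is returned
        System.out.println(toCsr.apply(filled).format);  // converted copy
        System.out.println(conversions);                 // only the filled block was converted
    }
}
```

In this sketch, a matrix consisting mostly of empty blocks triggers almost
no allocations during the conversion pass, which is the effect the patch
aims for.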


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/022e046d
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/022e046d
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/022e046d

Branch: refs/heads/master
Commit: 022e046d3c47386d5020b45c57e61e3e0ee306e7
Parents: 42e30a1
Author: Matthias Boehm <[email protected]>
Authored: Fri Mar 30 20:57:48 2018 -0700
Committer: Matthias Boehm <[email protected]>
Committed: Fri Mar 30 20:57:48 2018 -0700

----------------------------------------------------------------------
 .../instructions/spark/functions/CreateSparseBlockFunction.java | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/systemml/blob/022e046d/src/main/java/org/apache/sysml/runtime/instructions/spark/functions/CreateSparseBlockFunction.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/sysml/runtime/instructions/spark/functions/CreateSparseBlockFunction.java b/src/main/java/org/apache/sysml/runtime/instructions/spark/functions/CreateSparseBlockFunction.java
index 93f91ec..ded0360 100644
--- a/src/main/java/org/apache/sysml/runtime/instructions/spark/functions/CreateSparseBlockFunction.java
+++ b/src/main/java/org/apache/sysml/runtime/instructions/spark/functions/CreateSparseBlockFunction.java
@@ -45,9 +45,10 @@ public class CreateSparseBlockFunction implements Function<MatrixBlock,MatrixBlo
        {
                //convert given block to CSR representation if in sparse format
                //but allow shallow pass-through if already in CSR representation. 
-               if( arg0.isInSparseFormat() && !(arg0 instanceof CompressedMatrixBlock) )
+               if( arg0.isInSparseFormat() && !arg0.isEmptyBlock(false)
+                       && !(arg0 instanceof CompressedMatrixBlock) )
                        return new MatrixBlock(arg0, _stype, false);
                else //pass through dense
-                       return arg0;    
+                       return arg0;
        }
 }
\ No newline at end of file
