This is an automated email from the ASF dual-hosted git repository.

mboehm7 pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/systemds.git


The following commit(s) were added to refs/heads/main by this push:
     new 99ab0bb231 [SYSTEMDS-3501] Fix perftest regression: tsmm transitive spark exec
99ab0bb231 is described below

commit 99ab0bb23162ead74c7cbe9094aea49745d00858
Author: Matthias Boehm <[email protected]>
AuthorDate: Fri Feb 24 20:31:36 2023 +0100

    [SYSTEMDS-3501] Fix perftest regression: tsmm transitive spark exec
    
    This patch fixes a performance regression observed on perftest LinregDS
    1M_1K_dense with a 20GB driver, where tsmm was executed as a spark
    operation. For a long time we have had dedicated memory estimates for
    tsmm, which allowed its compilation into CP despite the default estimate
    exceeding the memory budget. However, the recent change to transitive
    spark exec type selection overrode these decisions because it did not
    handle tsmm.
    
    A small reproduction scenario of rand -> tsmm with the same data
    characteristics showed an improvement from 105s (incl 25s spark context
    creation) to 10s (7s tsmm).
---
 src/main/java/org/apache/sysds/hops/AggBinaryOp.java | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/main/java/org/apache/sysds/hops/AggBinaryOp.java b/src/main/java/org/apache/sysds/hops/AggBinaryOp.java
index e8bf19f40b..231435fdfa 100644
--- a/src/main/java/org/apache/sysds/hops/AggBinaryOp.java
+++ b/src/main/java/org/apache/sysds/hops/AggBinaryOp.java
@@ -417,8 +417,9 @@ public class AggBinaryOp extends MultiThreadedHop {
                        if( _etype == ExecType.CP
                                && checkMapMultChain() != ChainType.NONE
                                && OptimizerUtils.getLocalMemBudget() < 
-                               getInput().get(0).getInput().get(0).getOutputMemEstimate() )
+                               getInput().get(0).getInput().get(0).getOutputMemEstimate() ) {
                                _etype = ExecType.SPARK;
+                       }
                        
                        //check for valid CP dimensions and matrix size
                        checkAndSetInvalidCPDimsAndSize();
@@ -426,9 +427,10 @@ public class AggBinaryOp extends MultiThreadedHop {
                
                //spark-specific decision refinement (execute binary aggregate w/ left or right spark input and 
                //single parent also in spark because it's likely cheap and reduces data transfer)
+               MMTSJType mmtsj = checkTransposeSelf(); //determine tsmm pattern
                if( transitive && _etype == ExecType.CP && _etypeForced != ExecType.CP 
-                       && (isApplicableForTransitiveSparkExecType(true) 
-                       || isApplicableForTransitiveSparkExecType(false)) )
+                       && ((!mmtsj.isLeft() && isApplicableForTransitiveSparkExecType(true))
+                       || ( !mmtsj.isRight() && isApplicableForTransitiveSparkExecType(false))) )
                {
                {
                        //pull binary aggregate into spark 
                        _etype = ExecType.SPARK;
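
The effect of the refined condition in the second hunk can be illustrated in isolation. The following is a minimal sketch, not the SystemDS implementation: the enum is a simplified stand-in for MMTSJType (its values NONE, LEFT, RIGHT are assumed here), the two boolean parameters stand in for the results of isApplicableForTransitiveSparkExecType(true) and isApplicableForTransitiveSparkExecType(false), and the remaining guards (transitive, _etype, _etypeForced) are omitted. The class and method names are hypothetical.

```java
public class TransitiveExecSketch {
    // Simplified stand-in for MMTSJType (values assumed for illustration).
    enum MMTSJType {
        NONE, LEFT, RIGHT;
        boolean isLeft() { return this == LEFT; }
        boolean isRight() { return this == RIGHT; }
    }

    // Condition before the patch: either applicable input pulls the
    // operation into spark, regardless of a detected tsmm pattern.
    static boolean pullIntoSparkOld(boolean leftApplicable, boolean rightApplicable) {
        return leftApplicable || rightApplicable;
    }

    // Condition after the patch: the left-input check is skipped for a left
    // tsmm pattern, and the right-input check for a right tsmm pattern.
    static boolean pullIntoSparkNew(MMTSJType mmtsj, boolean leftApplicable, boolean rightApplicable) {
        return (!mmtsj.isLeft() && leftApplicable)
            || (!mmtsj.isRight() && rightApplicable);
    }

    public static void main(String[] args) {
        // Left tsmm with only the left input applicable: old pulls into spark, new does not.
        System.out.println(pullIntoSparkOld(true, false));                 // true
        System.out.println(pullIntoSparkNew(MMTSJType.LEFT, true, false)); // false
        // Without a tsmm pattern, the old and new conditions agree.
        System.out.println(pullIntoSparkNew(MMTSJType.NONE, true, false)); // true
    }
}
```

In this sketch, a left tsmm pattern whose left input alone would trigger the transitive rule is no longer pulled into spark, so the CP decision made earlier from the dedicated tsmm memory estimate is kept.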
