Qi Zhu created SPARK-47284:
------------------------------

             Summary: We should ensure enough parallelism when 
ShuffleExchangeLike join with specs without shuffle
                 Key: SPARK-47284
                 URL: https://issues.apache.org/jira/browse/SPARK-47284
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Qi Zhu


The following case is introduced by 
https://issues.apache.org/jira/browse/SPARK-35703


// When choosing specs, we should consider those children with no 
`ShuffleExchangeLike` node
// first. For instance, if we have:
// A: (No_Exchange, 100) <---> B: (Exchange, 120)
// it's better to pick A and change B to (Exchange, 100) instead of picking B 
and insert a
// new shuffle for A.



But we'd better improve it in some cases, for example:
A: (No_Exchange, 2) <---> B: (Exchange, 100)


The current logic will change to:
A: (No_Exchange, 2) <---> B: (Exchange,2)

It actually not ensure enough parallelism, it will reduce the performance i 
think.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to