GitHub user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19257
  
    > We need to know the exact partitioning of the children which dummy nodes 
won't give
    
    We only add the dummy shuffle node when it's necessary, e.g.
    ```
           hash-join
              /      \
        child1   child2
    ```
    
    Let's say `hash-join` needs its children to be clustered by a, b; `child1` 
is already partitioned by a, and `child2` has no partitioning. After adding the 
dummy node:
    ```
               hash-join
                /     \
               /   dummy-shuffle
              /         |
          child1     child2
    ```
    Now we still keep the exact partitioning, i.e. the left child is partitioned 
by a, and the right child is partitioned by a, b.
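
The rule described above can be sketched as a small toy model. This is not Spark's planner code: `Node`, `satisfies`, and `ensure_clustered` are hypothetical names invented for illustration, and the check assumes a simplified rule in which hash partitioning on a non-empty subset of the required clustering keys is good enough.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Toy plan node (hypothetical); `partitioning` holds the hash keys, or None."""
    name: str
    partitioning: Optional[Tuple[str, ...]] = None
    child: Optional["Node"] = None


def satisfies(partitioning, clustering):
    """Assumed rule: a non-empty subset of the clustering keys is sufficient."""
    return bool(partitioning) and set(partitioning) <= set(clustering)


def ensure_clustered(child, clustering):
    """Wrap `child` in a dummy shuffle only when its partitioning is insufficient."""
    if satisfies(child.partitioning, clustering):
        return child  # left untouched, so its exact partitioning stays visible
    return Node("dummy-shuffle", partitioning=tuple(clustering), child=child)


required = ("a", "b")  # hash-join needs its children clustered by a, b
left = ensure_clustered(Node("child1", partitioning=("a",)), required)
right = ensure_clustered(Node("child2"), required)

print(left.name, left.partitioning)
print(right.name, right.partitioning)
```

In this sketch `child1` is returned as-is with partitioning by a, while `child2` ends up under a `dummy-shuffle` node partitioned by a, b, which matches the tree shown above.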

