Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19257
> We need to know the exact partitioning of the children which dummy nodes won't give

We only add the dummy shuffle node when it's necessary, e.g.
```
  hash-join
   /     \
child1  child2
```
Let's say `hash-join` needs its children to be clustered by a, b. `child1`
is already partitioned by a, and `child2` has no partitioning. After adding the
dummy nodes:
```
     hash-join
      /     \
     /   dummy-shuffle
    /         |
child1     child2
```
Now we still keep the exact partitioning, i.e. the left child is partitioned by a
and the right child is partitioned by a, b.
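
The rule above can be sketched roughly as follows. This is a toy model in Python, not Spark's planner code: `Node`, `satisfies`, and `ensure_clustered` are hypothetical names made up for illustration. It assumes that a child hash-partitioned on a non-empty subset of the required keys already counts as clustered by those keys, which is why `child1` gets no dummy node.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Toy plan node (hypothetical, not a Spark class)."""
    name: str
    # Keys the output is hash-partitioned by; None means no known partitioning.
    partitioned_by: Optional[Tuple[str, ...]] = None
    child: Optional["Node"] = None


def satisfies(partitioned_by, required_keys):
    # Assumption: partitioning on a non-empty subset of the required keys
    # is enough to count as clustered by those keys.
    return bool(partitioned_by) and set(partitioned_by) <= set(required_keys)


def ensure_clustered(child, required_keys):
    # Leave the child untouched when its partitioning is already good enough,
    # so its exact partitioning stays visible to the parent.
    if satisfies(child.partitioned_by, required_keys):
        return child
    # Otherwise wrap it in a dummy shuffle that reports the required keys.
    return Node("dummy-shuffle", tuple(required_keys), child)


child1 = Node("child1", ("a",))  # already partitioned by a
child2 = Node("child2")          # no partitioning

left = ensure_clustered(child1, ("a", "b"))
right = ensure_clustered(child2, ("a", "b"))

print(left.name, left.partitioned_by)    # child1 ('a',)
print(right.name, right.partitioned_by)  # dummy-shuffle ('a', 'b')
```

Only the right side gets wrapped, matching the second diagram: the left side still reports partitioning by a, and the dummy node on the right reports a, b.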