[ https://issues.apache.org/jira/browse/SPARK-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhenhua Wang updated SPARK-20366:
---------------------------------
Description:
If a plan has multi-level successive joins, e.g.:
```
            Join
           /    \
       Union     t5
       /    \
    Join     t4
    /   \
 Join    t3
 /  \
t1   t2
```
Currently we fail to reorder the inside joins, i.e. the joins over t1, t2 and t3.
In join reorder, we use `OrderedJoin` to mark a join as already ordered, so
that when transforming down the plan, these joins don't need to be reordered
again.
But there's a problem in the definition of `OrderedJoin`:
The real join node is a parameter of `OrderedJoin`, not one of its children.
This breaks the transform procedure, because `mapChildren` applies the
transform function only to children, so it never reaches a join that is held
as a plain parameter.
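
The failure mode described above can be sketched with a toy tree model. This is not Catalyst code: the `Node`, `Leaf`, `Join`, `WrappedJoin` and `MarkedJoin` classes and the `upper_leaves` rule are hypothetical names made up for illustration, and the sketch assumes only what the description states, namely that the transform recurses through children and that the broken marker holds the join as a parameter rather than a child.

```python
from dataclasses import dataclass
from typing import Callable, List


class Node:
    """Toy stand-in for a plan node (hypothetical, not Spark's TreeNode)."""

    def children(self) -> List["Node"]:
        return []

    def with_new_children(self, new_children: List["Node"]) -> "Node":
        return self

    def map_children(self, f: Callable[["Node"], "Node"]) -> "Node":
        # Only nodes reported by children() are ever passed to f.
        kids = self.children()
        return self.with_new_children([f(c) for c in kids]) if kids else self

    def transform_down(self, rule: Callable[["Node"], "Node"]) -> "Node":
        # Apply the rule to this node first, then recurse into its children.
        rewritten = rule(self)
        return rewritten.map_children(lambda c: c.transform_down(rule))


@dataclass(frozen=True)
class Leaf(Node):
    name: str


@dataclass(frozen=True)
class Join(Node):
    left: Node
    right: Node

    def children(self) -> List[Node]:
        return [self.left, self.right]

    def with_new_children(self, new_children: List[Node]) -> Node:
        return Join(*new_children)


@dataclass(frozen=True)
class WrappedJoin(Node):
    """Marker that keeps the join as a plain parameter (assumed broken shape)."""
    join: Join  # not reported by children(), so the transform never sees it


@dataclass(frozen=True)
class MarkedJoin(Node):
    """Marker whose inputs are real children (one possible fixed shape)."""
    left: Node
    right: Node

    def children(self) -> List[Node]:
        return [self.left, self.right]

    def with_new_children(self, new_children: List[Node]) -> Node:
        return MarkedJoin(*new_children)


def upper_leaves(node: Node) -> Node:
    # Stand-in for any rewrite rule: upper-case every leaf name.
    return Leaf(node.name.upper()) if isinstance(node, Leaf) else node


inner = Join(Leaf("t1"), Leaf("t2"))
broken = WrappedJoin(Join(inner, Leaf("t3")))
fixed = MarkedJoin(inner, Leaf("t3"))

broken_result = broken.transform_down(upper_leaves)
fixed_result = fixed.transform_down(upper_leaves)
```

In this sketch `broken_result` is identical to `broken`, because nothing under the wrapper is visited, while `fixed_result` has all three leaves rewritten. The same reasoning suggests why the joins under an already-marked node are skipped in the plan above.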
was:
If a plan has multi-level successive joins, e.g.:
```
            Join
           /    \
       Union     t5
       /    \
    Join     t4
    /   \
 Join    t3
 /  \
t1   t2
```
Currently we fail to reorder the inside joins, i.e. t1, t2, t3.
In join reorder, we use `OrderedJoin` to indicate a join has been ordered, such
that when transforming down the plan, these joins don't need to be reordered
again.
But there's a problem in the definition of `OrderedJoin`:
The real join node is a parameter, rather than a child. This breaks the
transform procedure because `mapChildren` applies transform on parameters which
should be children.
> Fix recursive join reordering: inside joins are not reordered
> -------------------------------------------------------------
>
> Key: SPARK-20366
> URL: https://issues.apache.org/jira/browse/SPARK-20366
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Zhenhua Wang
>
> If a plan has multi-level successive joins, e.g.:
> ```
>             Join
>            /    \
>        Union     t5
>        /    \
>     Join     t4
>     /   \
>  Join    t3
>  /  \
> t1   t2
> ```
> Currently we fail to reorder the inside joins, i.e. the joins over t1, t2
> and t3.
> In join reorder, we use `OrderedJoin` to mark a join as already ordered,
> so that when transforming down the plan, these joins don't need to be
> reordered again.
> But there's a problem in the definition of `OrderedJoin`:
> The real join node is a parameter of `OrderedJoin`, not one of its
> children. This breaks the transform procedure, because `mapChildren`
> applies the transform function only to children, so it never reaches a
> join that is held as a plain parameter.