GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/14867
[SPARK-17296][SQL] Simplify parser join processing.
## What changes were proposed in this pull request?
Join processing in the parser relies on the assumption that the grammar produces
right-nested trees. For instance, the parse tree for `select * from a join b
join c` is expected to look like `JOIN(a, JOIN(b, c))`. However,
there are cases in which this invariant is violated, such as:
```sql
SELECT COUNT(1)
FROM test T1
CROSS JOIN test T2
JOIN test T3
ON T3.col = T1.col
JOIN test T4
ON T4.col = T1.col
```
In this case the parser returns a tree in which Joins are located on both
the left and the right sides of the parent join node.
This PR introduces a different grammar rule that does not rely on this
assumption. The new rule takes a relation and then matches zero or more joined
relations. As a bonus, join processing in the parser becomes much simpler.
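To illustrate the "relation followed by zero or more joined relations" shape, here is a minimal sketch of how such a flat list could be turned into a plan with a single fold. The types `Plan`, `Table`, `Join`, `JoinedRelation` and the helper `buildJoins` are hypothetical stand-ins, not Spark's actual classes or the code in this PR. Building a left-deep tree is an assumption of the sketch, chosen because SQL joins associate left to right.

```scala
// Hypothetical, simplified plan nodes (not Spark's LogicalPlan classes).
sealed trait Plan
case class Table(name: String) extends Plan
case class Join(
    left: Plan,
    right: Plan,
    joinType: String,
    condition: Option[String]) extends Plan

// One "joined relation" as a flat grammar rule might yield it: the join type,
// the right-hand relation, and an optional ON condition.
case class JoinedRelation(
    joinType: String,
    right: Plan,
    condition: Option[String])

object JoinFolding {
  // Start from the first relation and fold each joined relation onto the
  // plan built so far. No assumption is made about how the parse tree nests;
  // with zero joined relations the first relation is returned unchanged.
  def buildJoins(first: Plan, joins: Seq[JoinedRelation]): Plan =
    joins.foldLeft(first) { (left, j) =>
      Join(left, j.right, j.joinType, j.condition)
    }
}
```

Under these assumptions, the query above would be passed as `JoinFolding.buildJoins(Table("T1"), Seq(JoinedRelation("cross", Table("T2"), None), JoinedRelation("inner", Table("T3"), Some("T3.col = T1.col")), JoinedRelation("inner", Table("T4"), Some("T4.col = T1.col"))))`, and every join would end up on the left spine of the resulting tree.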
## How was this patch tested?
Existing tests, plus a new regression test added to the plan parser suite.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/hvanhovell/spark SPARK-17296
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14867.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14867
----
commit b25e2db00f327ff81f8d243948ef4df77e31af15
Author: Herman van Hovell <[email protected]>
Date: 2016-08-29T20:12:59Z
Simplify join processing.
commit f9cb0d267bf1d4c5f0e16d832e90684236fe74ce
Author: Herman van Hovell <[email protected]>
Date: 2016-08-29T20:21:58Z
Merge remote-tracking branch 'apache-github/master' into SPARK-17296
# Conflicts:
#	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
----