GitHub user JoshRosen opened a pull request:
https://github.com/apache/spark/pull/10997
[SPARK-13105] Reject NATURAL JOIN queries rather than returning wrong
answers
In Spark 1.6 and earlier, Spark SQL does not support `NATURAL JOIN`
queries. However, its SQL parser does not consider `NATURAL` to be a reserved
word, which causes natural joins to be parsed as regular joins where the left
table has been aliased. For instance,
```
SELECT * FROM foo NATURAL JOIN bar
```
gets interpreted as `foo JOIN bar` where `foo` is aliased to `natural`.
Rather than doing this, which leads to confusing / wrong results for users
who expect NATURAL JOIN behavior, Spark should immediately reject these queries
at analysis time and should provide an informative error message.
This patch is targeted at Spark 1.6 and earlier, where the problem occurs.
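To illustrate why the misparse matters, here is a minimal sketch that uses SQLite (through Python's `sqlite3` module) as a stand-in engine. It is not Spark, and the table contents are made up for illustration. It assumes that the condition-less `foo JOIN bar` produced by the misparse behaves as a cross product, and compares that against a real natural join:

```python
import sqlite3

# SQLite is only a stand-in here; this does not exercise Spark's parser.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER, a TEXT);
    CREATE TABLE bar (id INTEGER, b TEXT);
    INSERT INTO foo VALUES (1, 'x'), (2, 'y');
    INSERT INTO bar VALUES (1, 'p'), (3, 'q');
""")

# What the user expects: rows matched on the shared column `id`.
natural_rows = conn.execute(
    "SELECT * FROM foo NATURAL JOIN bar"
).fetchall()

# What the misparse amounts to: foo aliased to "natural",
# then joined to bar with no join condition at all.
misparsed_rows = conn.execute(
    'SELECT * FROM foo AS "natural" JOIN bar'
).fetchall()

print(natural_rows)         # [(1, 'x', 'p')]
print(len(misparsed_rows))  # 4
```

The two queries look nearly identical but return one row versus four, which is the kind of silent wrong answer this patch is meant to prevent.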
I chose to implement this check entirely within the parser in order to
minimize the scope of the changes and to not introduce any new classes into the
logical plan layer. I considered introducing a new `NaturalJoin` join type,
parsing the query, then detecting and throwing an error from the analyzer but
ended up rejecting this approach because I was concerned that adding a new
class to a sealed trait would break compilation for third-party code which
pattern-matches on Catalyst join types.
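The parser-level approach can be sketched with a toy grammar. This is only an illustration in Python, not the actual change to Spark's SQL parser: `parse_from_clause`, `ParseError`, and the `reject_natural` flag are hypothetical names, and the grammar handles only `SELECT * FROM <table> [alias] JOIN <table>`.

```python
import re

class ParseError(Exception):
    """Raised when the toy parser cannot accept a query."""

# Words that can never be used as a table alias in this toy grammar.
KEYWORDS = {"SELECT", "FROM", "JOIN"}

def parse_from_clause(sql, reject_natural=True):
    """Parse `SELECT * FROM <table> [alias] JOIN <table>` (toy grammar)."""
    tokens = re.findall(r"\*|\w+", sql)
    if [t.upper() for t in tokens[:3]] != ["SELECT", "*", "FROM"]:
        raise ParseError("expected SELECT * FROM ...")
    if len(tokens) < 4:
        raise ParseError("expected table name after FROM")
    pos = 3
    left = tokens[pos]
    pos += 1

    alias = None
    if pos < len(tokens) and tokens[pos].upper() not in KEYWORDS:
        # Without this check, NATURAL is just another identifier and
        # silently becomes the alias of the left table.
        if reject_natural and tokens[pos].upper() == "NATURAL":
            raise ParseError("NATURAL JOIN is not supported")
        alias = tokens[pos]
        pos += 1

    if pos >= len(tokens) or tokens[pos].upper() != "JOIN":
        raise ParseError("expected JOIN")
    pos += 1
    if pos >= len(tokens):
        raise ParseError("expected table name after JOIN")
    return {"left": left, "alias": alias, "right": tokens[pos]}
```

With `reject_natural=False` the sketch reproduces the misparse (`alias` comes back as `"NATURAL"`); with the default it fails fast with an explicit message, and no new join type has to be added to the plan representation.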
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JoshRosen/spark SPARK-13105
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/10997.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #10997
----
commit fc15a91de0e163c0ea5b7f7ca031aa62317ed301
Author: Josh Rosen
Date: 2016-01-31T01:57:36Z
Add regression test for SPARK-13105
commit 21e42d893283a754f3d25758562864ceb57043d4
Author: Josh Rosen
Date: 2016-01-31T01:57:44Z
Fix SPARK-13105 at the parser level.
----