GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/10509
[SPARK-12362][SQL][WIP] Inline Hive Parser
This is a WIP. The PR has been taken over from @nongli (see
https://github.com/apache/spark/pull/10420). I have removed some additional
dead code and fixed a few issues caused by the fact that the inlined
Hive parser is newer than the Hive parser we currently use in Spark.
I am submitting this PR in order to get some feedback and testing done.
There is still quite a bit of work to do:
- [ ] Get it to pass the Jenkins build/test.
- [ ] Refactorings between HiveQl and the Java classes.
- [ ] Create our own ASTNode and integrate the current implicit
extensions.
- [ ] Move remaining ```SemanticAnalyzer``` and ```ParseUtils```
functionality to ```HiveQl```.
- [ ] Remove Hive dependencies from the parser. This will require some
edits in the grammar files.
- [ ] Introduce our own context which needs to contain a
```TokenRewriteStream```.
- [ ] Add ```useSQL11ReservedKeywordsForIdentifier``` and
```allowQuotedId``` to the catalyst or sql configuration.
- [ ] Remove ```HiveConf``` from the grammar files and HiveQl, and pass in our
own configuration.
- [ ] Move the parser into sql/core.
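
To make the configuration items above more concrete, here is a minimal sketch of what a Hive-independent parser configuration could look like. The class name ```ParserConf``` and its constructor are hypothetical and not part of this PR; only the two flag names come from the list above. The sketch deliberately sets no defaults, since those would need to match whatever ```HiveConf``` currently supplies.

```java
// Hypothetical sketch: ParserConf is an illustrative name, not a class from this PR.
// It carries the two parser flags named in the to-do list so that the grammar
// files and HiveQl would no longer need a HiveConf instance.
final class ParserConf {
    private final boolean useSQL11ReservedKeywordsForIdentifier;
    private final boolean allowQuotedId;

    // Both values are passed in explicitly; no defaults are assumed here.
    ParserConf(boolean useSQL11ReservedKeywordsForIdentifier, boolean allowQuotedId) {
        this.useSQL11ReservedKeywordsForIdentifier = useSQL11ReservedKeywordsForIdentifier;
        this.allowQuotedId = allowQuotedId;
    }

    // Returns the flag value supplied at construction time.
    boolean useSQL11ReservedKeywordsForIdentifier() {
        return useSQL11ReservedKeywordsForIdentifier;
    }

    // Returns the flag value supplied at construction time.
    boolean allowQuotedId() {
        return allowQuotedId;
    }
}
```

An immutable value object like this could be built from the catalyst or sql configuration and handed to the parser, which keeps the grammar files free of any Hive types.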
cc @nongli @rxin
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/hvanhovell/spark SPARK-12362
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/10509.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #10509
----
commit 9d51dd10aa1542b55e68cd661485e1f49688fffd
Author: Nong Li <[email protected]>
Date: 2015-12-16T05:03:41Z
[SPARK-12363] [SQL] Inline Hive parser into spark sql.
This is a WIP. This inlines the hive sql grammar parser into spark sql in
the hive subproject. This should eventually be moved into the SQL core
project once all the hive dependencies are removed.
This patch does some of that by cleaning up the hive code to remove much of
semantic analysis.
commit 7e1a14582fc32fda2016072138b4a431c7ba9333
Author: Nong Li <[email protected]>
Date: 2015-12-16T05:59:26Z
Add anti join to grammar as an example.
commit 0cbf502356ca70d2455d385c4fb0540c38ef9301
Author: Nong Li <[email protected]>
Date: 2015-12-21T19:57:18Z
Updates to support antlr 3.5.2 and SBT build.
commit cd07d7f1391af8b3f777c56b4017c71a5a77c725
Author: Herman van Hovell <[email protected]>
Date: 2015-12-28T14:23:48Z
Remove dead code from the parser.
commit 8ced9c0f7736401fb13b591b6a465aeb4501e96d
Author: Herman van Hovell <[email protected]>
Date: 2015-12-29T13:00:51Z
Remove tests no longer supported by parser (HIVE-11145).
commit cb60ba045ff6663ed83c308b2423bdb87152a092
Author: Herman van Hovell <[email protected]>
Date: 2015-12-29T13:01:06Z
Remove ASTNodeOrigin
----