GitHub user hvanhovell opened a pull request:

    https://github.com/apache/spark/pull/10509

    [SPARK-12362][SQL][WIP] Inline Hive Parser

    This is a WIP. The PR has been taken over from @nongli (see 
https://github.com/apache/spark/pull/10420). I have removed some additional 
dead code, and fixed a few issues which were caused by the fact that the inlined 
Hive parser is newer than the Hive parser we currently use in Spark.
    
    I am submitting this PR in order to get some feedback and testing done. 
There is still quite a bit of work to do:
    - [ ] Get it to pass the Jenkins build/test.
    - [ ] Refactorings between HiveQl and the Java classes.
      - [ ] Create our own ASTNode and integrate the current implicit 
extensions.
      - [ ] Move remaining ```SemanticAnalyzer``` and ```ParseUtils``` 
functionality to ```HiveQl```.
    - [ ] Removing Hive dependencies from the parser. This will require some 
edits in the grammar files.
      - [ ] Introduce our own context which needs to contain a 
```TokenRewriteStream```.
      - [ ] Add ```useSQL11ReservedKeywordsForIdentifier``` and 
```allowQuotedId``` to the catalyst or sql configuration.
      - [ ] Remove ```HiveConf``` from the grammar files and HiveQl, and pass in our 
own configuration.
    - [ ] Moving the parser into sql/core.
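
    As a rough illustration of the "pass in our own configuration" item above, 
    the two flags could be carried by a small immutable object instead of 
    ```HiveConf```. This is only a sketch: the class name ```ParserConf``` is 
    hypothetical and is not part of this PR, Spark, or Hive; only the two flag 
    names come from the list above.

```java
// Hypothetical sketch: ParserConf is an illustrative name, not an existing
// Spark or Hive class. It only shows how the two parser flags from the task
// list could be carried without depending on HiveConf.
final class ParserConf {
    private final boolean useSQL11ReservedKeywordsForIdentifier;
    private final boolean allowQuotedId;

    ParserConf(boolean useSQL11ReservedKeywordsForIdentifier, boolean allowQuotedId) {
        this.useSQL11ReservedKeywordsForIdentifier = useSQL11ReservedKeywordsForIdentifier;
        this.allowQuotedId = allowQuotedId;
    }

    // Accessor named after the flag mentioned in the task list.
    boolean useSQL11ReservedKeywordsForIdentifier() {
        return useSQL11ReservedKeywordsForIdentifier;
    }

    // Accessor named after the flag mentioned in the task list.
    boolean allowQuotedId() {
        return allowQuotedId;
    }
}
```

    A parser (or the context object mentioned above) would then take a 
    ```ParserConf``` at construction time, which keeps the grammar actions free 
    of any Hive configuration types.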
    
    cc @nongli @rxin

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hvanhovell/spark SPARK-12362

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10509.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10509
    
----
commit 9d51dd10aa1542b55e68cd661485e1f49688fffd
Author: Nong Li <[email protected]>
Date:   2015-12-16T05:03:41Z

    [SPARK-12363] [SQL] Inline Hive parser into spark sql.
    
    This is a WIP. This inlines the hive sql grammar parser into spark sql in 
the hive
    subproject. This should eventually be moved into the SQL core project once 
all the
    hive dependencies are removed.
    
    This patch does some of that by cleaning up the hive code to remove much of 
semantic
    analysis.

commit 7e1a14582fc32fda2016072138b4a431c7ba9333
Author: Nong Li <[email protected]>
Date:   2015-12-16T05:59:26Z

    Add anti join to grammar as an example.

commit 0cbf502356ca70d2455d385c4fb0540c38ef9301
Author: Nong Li <[email protected]>
Date:   2015-12-21T19:57:18Z

    Updates to support antlr 3.5.2 and SBT build.

commit cd07d7f1391af8b3f777c56b4017c71a5a77c725
Author: Herman van Hovell <[email protected]>
Date:   2015-12-28T14:23:48Z

    Remove dead code from the parser.

commit 8ced9c0f7736401fb13b591b6a465aeb4501e96d
Author: Herman van Hovell <[email protected]>
Date:   2015-12-29T13:00:51Z

    Remove tests no longer supported by parser (HIVE-11145).

commit cb60ba045ff6663ed83c308b2423bdb87152a092
Author: Herman van Hovell <[email protected]>
Date:   2015-12-29T13:01:06Z

    Remove ASTNodeOrigin

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
