Let me add Reynold to the thread.

On Fri, Dec 18, 2015 at 12:36 PM, Gopal Vijayaraghavan <gop...@apache.org> wrote:
>
> >We have looked into various options, and it looks like the best option is
> >to copy the ANTLR grammar file from Hive into Spark. Because the grammar
> >file is tightly coupled with Hive's semantic analysis, we need to refactor
> >some code to use them so it will end up becoming the .g file plus some
> >coupled code.
>
> Is the eventual goal to contribute that fork back into Hive & have Hive
> devs maintain a compatible parser for SparkSQL?
>
> Would that affect Hive's ability to refactor the SQL parser in the future
> or is this a one-time only deal?
>
> >parser. From Hive's perspective this does not provide any immediate
> >benefits. From Spark's perspective, we iterate very quickly so having to
> >depend on an external component also slows down our development. We also
> >have some requirements that simply don't apply in other projects (e.g.
> >being able to parse DataFrame expressions).
>
> From that I assume, this involves some form of cut-paste duplication of
> the code into SparkSQL project with that version diverging away from
> Hive's.
>
> > Thanks a lot for developing this parser, and we will try our best to
> > contribute back as we fix bugs. I will also make sure we have the proper
> > acknowledgment when we do this.
>
> Under the Apache license, there's no actual restriction against a hostile
> embrace-extend by copying hive's code verbatim as long as the fork retains
> license notices.
>
> The maintainability concerns are mostly around whether this is intended as
> an ongoing relationship, including any compatibility commitments from
> hive-dev@.
>
> Cheers,
> Gopal
>