I think this proposal makes sense. We've done well in enabling parallel development for different Hive versions so far, but it is a burden. For example, we still don't have precommit tests for Hive 3+ (I like that name), and I don't think we want to make the suite of precommit tests even larger.
On Fri, Jan 17, 2020 at 4:29 PM Joe McDonnell <joemcdonn...@cloudera.com> wrote:

> I wanted to start a conversation around moving to develop against Hive 3+
> by default. (I describe this as Hive 3+ because it is close to Hive master,
> which is well beyond any released Hive 3.) There has been considerable
> development effort towards implementing features integrating Impala with
> Hive 3+ and Hive ACID. This is currently developed under the
> USE_CDP_HIVE=true configuration while regular development has continued
> with Hive 2. The Hive 3+ development is now stable enough to be used for
> regular development. It would be nice to reduce our test and compatibility
> matrix and have a unified development environment.
>
> Changing the major version of Hive is a breaking change, so it would
> require an Impala 4.x code line. I have a specific proposal, but this is
> mainly a frame for getting the discussion going.
>
> I propose that we release Impala 3.4.0 and then update master to 4.0 and
> allow breaking changes until the Impala 4.0 release. The main breaking
> change would be to set USE_CDP_HIVE=true, enabling Hive 3+ development by
> default. The Hive 2 configuration would be removed over time. Other
> breaking changes can be proposed and voted on.
>
> If there are developers interested in maintaining a 3.x branch, we can
> create this branch and add appropriate support to any infrastructure (e.g.
> bin/push_to_asf.py) to allow that.
>
> Thoughts?
>
> Thanks,
>
> Joe McDonnell