If https://issues.apache.org/jira/browse/HIVE-16391 can be resolved, we will no longer need to keep our fork of Hive.
Sean Owen <sro...@gmail.com> wrote on Tue, Jan 15, 2019 at 10:44 AM:
> It's almost certainly needed just to get off the fork of Hive we're
> not supposed to have. Yes, it's going to impact dependencies, so it
> would need to happen at Spark 3.
> Separately, its usage could be reduced or removed -- I don't know
> much about that. But it doesn't really make the upgrade harder or easier.
>
> On Tue, Jan 15, 2019 at 12:40 PM Xiao Li <gatorsm...@gmail.com> wrote:
> >
> > Since Spark 2.0, we have been trying to move all the Hive-specific
> > logic into a separate package and make Hive a data source like the
> > other built-in data sources. You might see a lot of refactoring PRs
> > toward this goal. Hive will certainly remain an important data source
> > that Spark supports.
> >
> > Now, upgrading the Hive execution JAR touches so much code and changes
> > so many dependencies. Any PR like this looks very risky to me. Both
> > quality and stability are my major concerns. This could impact the
> > adoption rate of our upcoming Spark 3.0 release, which will contain
> > many important features. I doubt whether upgrading the Hive execution
> > JAR is really needed.
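
For context, "Hive as a data source" here means that, from the user's side, Hive tables are already reached through Spark's ordinary SparkSession API rather than any Hive-specific entry point. A minimal sketch in Scala of that usage (the table name default.src is a placeholder, and this assumes a deployment where the Hive metastore is reachable):

    import org.apache.spark.sql.SparkSession

    object HiveSourceExample {
      def main(args: Array[String]): Unit = {
        // enableHiveSupport() wires the Hive metastore catalog into the session.
        val spark = SparkSession.builder()
          .appName("HiveSourceExample")
          .enableHiveSupport()
          .getOrCreate()

        // With Hive support enabled, Hive tables are queried like any other
        // built-in data source, via SQL or the DataFrame API.
        spark.sql("SELECT * FROM default.src LIMIT 10").show()

        spark.stop()
      }
    }

Because this surface API is independent of the Hive version Spark bundles for execution, swapping the execution JAR is in principle invisible to users, which is the premise of the dependency-risk discussion above.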