If https://issues.apache.org/jira/browse/HIVE-16391 can be resolved, we will no longer need to keep our fork of Hive.
Sean Owen <sro...@gmail.com> wrote on Tue, Jan 15, 2019 at 10:44 AM:
> It's almost certainly needed just to get off the fork of Hive we're
> not supposed to have. Yes, it's going to impact dependencies, so it
> would need to happen at Spark 3.
> Separately, its usage could be reduced or removed -- I don't know
> much about that. But it doesn't really make the upgrade harder or easier.
>
> On Tue, Jan 15, 2019 at 12:40 PM Xiao Li <gatorsm...@gmail.com> wrote:
> >
> > Since Spark 2.0, we have been trying to move all the Hive-specific
> > logic into a separate package and make Hive a data source like the
> > other built-in data sources. You might see a lot of refactoring PRs
> > toward this goal. Hive will certainly remain an important data source
> > that Spark supports.
> >
> > Now, upgrading the Hive execution JAR touches so much code and changes
> > so many dependencies. Any PR like this looks very risky to me. Both
> > quality and stability are my major concerns. This could impact the
> > adoption rate of our upcoming Spark 3.0 release, which will contain
> > many important features. I doubt whether upgrading the Hive execution
> > JAR is really needed.
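
For context, "Hive as a data source" here means that, from the user's side, Hive tables are already reached through Spark's ordinary SparkSession API rather than any Hive-specific entry point. A minimal sketch in Scala of that usage (the table name default.src is a placeholder, and this assumes a deployment where the Hive metastore is reachable):

    import org.apache.spark.sql.SparkSession

    object HiveSourceExample {
      def main(args: Array[String]): Unit = {
        // enableHiveSupport() wires the Hive metastore catalog into the session.
        val spark = SparkSession.builder()
          .appName("HiveSourceExample")
          .enableHiveSupport()
          .getOrCreate()

        // With Hive support enabled, Hive tables are queried like any other
        // built-in data source, via SQL or the DataFrame API.
        spark.sql("SELECT * FROM default.src LIMIT 10").show()

        spark.stop()
      }
    }

Because this surface API is independent of the Hive version Spark bundles for execution, swapping the execution JAR is in principle invisible to users, which is the premise of the dependency-risk discussion above.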