Hello,

Hive PMC members or committers could share insider knowledge about the
status of the Hive project, but here is my impression of Hive 3.1.2 as an
outsider.

Hive 3.1.2 is widely used in production, but it is not seriously maintained.
(You can check the number of commits in branch-3.1 over the last couple of
years.) Many critical patches have not been backported, and some patches
have been committed without proper testing. As a result, running Hive 3.1.2
in production requires you to maintain your own fork of Hive 3.1.2,
backporting patches as necessary. Or you could use a commercial solution
like CDP. There is nothing unusual here, as Hive is an open source project.
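
A rough way to check the commit activity in branch-3.1 yourself is sketched
below. It assumes a local clone of the Apache Hive repository whose remote is
named origin; the function names and the default branch reference are only
illustrative, not part of any Hive tooling.

```python
import subprocess
from collections import Counter


def commits_per_year(log_output: str) -> dict:
    """Count commits per year, given one four-digit year per line."""
    years = [line.strip() for line in log_output.splitlines() if line.strip()]
    return dict(sorted(Counter(years).items()))


def branch_years(repo_dir: str, branch: str = "origin/branch-3.1") -> str:
    """Return one committer-date year per commit on the given branch.

    Assumption: repo_dir is a local clone of Hive and the branch ref exists.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", "--date=format:%Y", "--format=%cd", branch],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Usage (the path is hypothetical):
#   print(commits_per_year(branch_years("/path/to/hive")))
```

Running the same count against origin/master gives a point of comparison.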

On the other hand, bugs and performance issues in Hive 3.1.2 are constantly
reported in Hive JIRAs, while important bug-fixes and performance
improvements are contributed by many individuals. What seems to happen
afterwards is that the contributed code is either merged only into the
master branch or not accepted at all. (I see quite a few important patches
go unnoticed without any discussion.) Occasionally you see new Hive JIRAs
reporting bugs in Hive 3.1.2 which have actually been fixed in earlier
JIRAs that were never merged into branch-3.1. In order to take advantage of
new patches, one has to backport the patches oneself. (I guess the Hive PMC
is mostly focused on the master branch.)

As for Hive 4.0, I know nothing about its status, but in the virtual meetup
last March, it was briefly mentioned that no concrete release plan was
ready. (I could be wrong, so someone could correct me.)

We maintain our own fork of Hive 3.1.2, to which over 300 additional
patches have been backported. More important patches from the master branch
are currently being backported. This repository is getting increasingly
popular, so it might be useful to you.

https://github.com/mr3project/hive-mr3

To deal with the difficulty of operating Hive 3.1.2, there is Hive on
MR3: no need to configure LLAP daemons, little dependence on the Hadoop
version, as fast as Hive-LLAP, and so on. Quickstart guides are available
(https://mr3docs.datamonad.com/docs/quick/hadoop/), and tutorials will be
published in the next release. If you can use Kubernetes, you can easily
run Hive on MR3 with Ranger 2.1.0/2.0.0
(https://mr3docs.datamonad.com/docs/k8s/guide/).

Disclaimer: I am the main developer of MR3.

--- Sungwoo

On Tue, Sep 14, 2021 at 9:43 PM Antoine DUBOIS <antoine.dub...@cc.in2p3.fr>
wrote:

> Hello,
> After trying to use Hive 3.1.2 with Ranger for several weeks, I am giving up.
> It seems way too complicated and tedious.
> I wonder when, or even if, there will be any more releases in the 3.0 branch.
> I wonder if Hive 3.0 was just an experiment, as it seems maintenance is not
> really there.
> Is there any plan for Hive 4.0, or should I use Hive 2.8, knowing I'm using
> Hadoop 3?
> Any insight on the Hive release cycle would be awesome.
>
> I hope you have a nice day.
>
> Antoine DUBOIS
>
>
