Thanks Sungwoo for sharing this!

A few questions:
- Are the patches you mention below bug fixes or new features on top of Hive 3.1.3?
(This might be a typo, as I think the latest Hive release is 3.1.2.)
- Could you backport these patches to the Apache branch-3 and branch-3.1?
- Is there any reason not to?

I am asking because I think the best way forward is to consolidate
these backports into a single repo, preferably the Apache one, so that everyone
can benefit from them.

What do you think?

Thanks,
Peter

> On Mar 18, 2021, at 09:02, Sungwoo Park <glap...@gmail.com> wrote:
> 
> Hello Hive users,
> 
> After attending the Hive meetup yesterday (huge thanks to the organizers!), I 
> thought that perhaps many organizations were maintaining their own Hive 2 and 
> 3 branches by backporting important patches to vanilla Hive. Ideally it would 
> be great if all the important patches were regularly merged to Hive 2 and 3 
> branches (e.g., branch-2.3 and branch-3.1), but I guess this would take a lot 
> of time and effort on the Hive committer side, and it also seems like at the 
> moment, most of the efforts are directed at the master branch.
> 
> I find this process of backporting patches to Hive 2 and 3 branches to be 
> quite challenging and time-consuming, especially for those "outsiders" who 
> have not implemented/reviewed the patches. The problem is two-fold: 1) you 
> have to decide what patches to apply and in what order; 2) you have to run 
> all the tests to make sure that new patches are compatible with the code base 
> and do not introduce new bugs.
> 
> 1) is not easy because sometimes a patch from the master branch fails to 
> merge because of missing dependencies. In such a case, you have to go back 
> through the commit history, identify those dependency commits, and merge them 
> first. Depending on the level of changes made in the patch, this can be a big 
> pain.
> 
> 2) can also be a problem if applying a new patch produces different test 
> results. Sometimes a patch is merged with no conflicts, but some tests fail. 
> Besides, it may take a lot of time to run the tests themselves.
> 
> So, I wonder if anyone could share their experience and wisdom on how to 
> maintain Hive 2 and 3 branches, or share their git repos. For us, we have 
> applied about 210 patches to Hive 3.1.3 (since Nov 2, 2020), and are in the 
> middle of applying an additional 100+ patches. You can find our work at the 
> following repo. (You can ignore the last commit which is internal to our 
> work.)
> 
> https://github.com/mr3project/hive-mr3/commits/master3
> 
> Thanks,
> 
> --- Sungwoo Park
