Thanks Sungwoo for sharing this!

A few questions:
- Are these patches you mention below bugfixes, or new features on Hive 3.1.3? (This might be a typo as I think the last Hive release is 3.1.2)
- Could you backport these patches to the apache branch-3, and branch-3.1?
- Is there any reason not to?

I am asking this because I think the best way to move forward is to consolidate these backports to a single repo, preferably to the apache one, so everyone can benefit from it. What do you think?

Thanks,
Peter

> On Mar 18, 2021, at 09:02, Sungwoo Park <glap...@gmail.com> wrote:
>
> Hello Hive users,
>
> After attending the Hive meetup yesterday (huge thanks to the organizers!), I
> thought that perhaps many organizations were maintaining their own Hive 2 and
> 3 branches by backporting important patches to vanilla Hive. Ideally it would
> be great if all the important patches were regularly merged to Hive 2 and 3
> branches (e.g., branch-2.3 and branch-3.1), but I guess this would take a lot
> of time and effort on the Hive committer side, and it also seems like at the
> moment, most of the efforts are directed at the master branch.
>
> I find this process of backporting patches to Hive 2 and 3 branches to be
> quite a challenge and time-consuming, especially to those "outsiders" who
> have not implemented/reviewed the patches. The problem is two-fold: 1) you
> have to decide what patches to apply and in what order; 2) you have to run
> all the tests to make sure that new patches are compatible with the code base
> and do not introduce new bugs.
>
> 1) is not easy because sometimes a patch from the master branch fails to
> merge because of missing dependencies. In such a case, you have to go back to
> the history of commits, identify those dependency commits, and merge them
> first. Depending on the level of changes made in the patch, this can be a big
> pain.
>
> 2) can be also a problem if applying a new patch produces different test
> results. Sometimes a patch is merged with no conflicts, but some tests fail.
> Besides it may take a lot of time to run tests themselves.
>
> So, I wonder if anyone could share their experience and wisdom on how to
> maintain Hive 2 and 3 branches, or share their git repos. For us, we have
> applied about 210 patches to Hive 3.1.3 (since Nov 2, 2020), and are in the
> middle of applying additional 100+ patches. You can find our work at the
> following repo. (You can ignore the last commit which is internal to our
> work.)
>
> https://github.com/mr3project/hive-mr3/commits/master3
>
> Thanks,
>
> --- Sungwoo Park