Thanks everyone for sharing your thoughts. I am happy to see so many people
involved in the discussion.

I would say that the current 4.0.0-alpha-1 is better in many respects than
previous stable releases, although this might be a bit subjective.

I am afraid that if we keep supporting older releases it will take too much
time until people start using 4.x.
Having real deployments of Hive 4 is the only way to go from alpha to
stable releases with confidence.

I checked the download statistics for Hive releases [1], [2] for the past
month and the results show that the vast majority of downloads are for
older releases.
I am not posting the stats here since I am not sure if this would violate
some policies. Hive committers can access the stats using their ASF
credentials.
To some degree this is expected but at the same time problematic given the
number of open issues which affect older releases.

I would definitely like to have multiple maintenance branches with high
quality standards but I don't think there are enough active committers in
the project to successfully maintain those.
The https://github.com/mr3project/hive-mr3 repo may be a great fit for an
upcoming ASF Hive release.
However, according to what Sungwoo said, this seems more like a new
maintenance branch than a continuation of Hive 3.
Moving towards this direction would certainly require more time from all of
us.

Lastly, it seems that there are some issues preventing people from using
4.0.0-alpha-1.
As Peter already mentioned, these issues are probably release blockers and
they should be taken into account in the next Hive 4 release.
The thread about the next steps after 4.0.0-alpha-1 [3] is the perfect
place to discuss those.
If you have specific requirements for Hive 4, please reply to [3] and
include any specific JIRAs that need to be in the scope of the next release.

Best,
Stamatis

[1] https://logging1-he-de.apache.org/stats/
[2] https://repository.apache.org/#central-stat
[3] https://lists.apache.org/thread/n245dd23kb2v3qrrfp280w3pto89khxj


On Tue, May 10, 2022 at 10:55 AM Sungwoo Park <glap...@gmail.com> wrote:

> We maintain our own fork of Hive 3 because we are not always adding new
> commits to the tip of the branch. To backport a new patch, sometimes we
> have to add new commits between existing commits, update earlier commits,
> and so on. This makes it impractical to keep adding new patches only to the
> tip of the branch while reverting commits if necessary. Maintaining the
> Hive 3 branch would mean frequent force-updates, which might produce more
> problems. (If this is not an issue, we could try to completely rebuild the
> Hive 3 branch.)
>
> I hope the Apache community can make a concerted effort to figure out what
> patches to include in Hive 3. For us, the challenge was 1) to decide which
> patch to include; 2) to figure out its dependencies if any; 3) to resolve
> conflicts. Testing was also another source of pain.
>
> Thanks,
>
> --- Sungwoo
>
>
>
>
>
> On Tue, May 10, 2022 at 4:26 PM Peter Vary <pv...@cloudera.com> wrote:
>
>> When we were brainstorming about the future of the Hive 3 branch with
>> Zoltan Haindrich, he mentioned this letter:
>> https://lists.apache.org/thread/by9ppc2z8oqdzpqotzv5bs34yrxrd84l
>>
>> I think Sungwoo Park and his team make a huge effort to maintain this
>> branch, and maybe it would be better to help them do this inside the Apache
>> Hive project. They should not need to maintain their own branch if there is
>> no particular reason behind it, or we can remove those blockers. This could
>> be beneficial for every Hive user who still uses Hive 3.
>>
>> @Sungwoo: Do you have any specific reason to keep your own fork of Hive 3?
>>
>> That would mean we could have a much better Hive 3.x branch than we have
>> now.
>>
>> What do you think?
>>
>> Thanks,
>> Peter
>>
>>
>>
>> On 2022. May 10., at 8:40, Battula, Brahma Reddy <
>> bbatt...@visa.com.INVALID> wrote:
>>
>> I agree with Peter and sunchao.
>>
>> We are also using Hive 3.x, and we might contribute bug fixes.
>>
>> I am also +1 on 1.x EOL, as it's hard to maintain so many releases and
>> it is time for users to migrate to 2.x and 3.x.
>>
>>
>> On 09/05/22, 10:51 PM, "Chao Sun" <sunc...@apache.org> wrote:
>>
>>    I agree with Peter above. I know quite a few projects such as Spark,
>>    Iceberg and Trino/Presto are depending on Hive 2.x and 3.x, and
>>    periodically they may need new fixes in these. Upgrading them to use
>>    4.x seems not an option for now since the core classified artifact has
>>    been removed and the shading issue has to be solved before they can
>>    consume the new jar.
>>
>>    On Mon, May 9, 2022 at 4:10 AM Peter Vary <pv...@cloudera.com> wrote:
>>
>>
>> Hi Team,
>>
>> My experience with the Iceberg community shows that there is a sizeable
>> user base around Hive 2.x. I have seen patches, contributions to
>> Hive 2.3.x branches, and the tests are in much better shape there.
>>
>> I would definitely vote for EOL Hive 1.x, but until we have a stable 4.x,
>> I would be cautious about slashing 2.x, 3.x branches.
>>
>> Just my 2 cents.
>>
>> Peter
>>
>> On 2022. May 9., at 10:51, Alessandro Solimando <
>> alessandro.solima...@gmail.com> wrote:
>>
>> Hi Stamatis,
>> thanks for bringing up this topic, I basically agree on everything you
>> wrote.
>>
>> I just wanted to add that this kind of proposal might sound harsh,
>> because in many contexts upgrading is a complex process, but it's in
>> nobody's interest to keep release branches that are missing important
>> fixes/improvements and that might not meet the quality standards that
>> people expect, as mentioned.
>>
>> Since we don't yet have a stable 4.x release (only alpha for now) we
>> might want to keep supporting the 3.x branch until the first 4.x stable
>> release and EOL < 3.x branches, WDYT?
>>
>> Best regards,
>> Alessandro
>>
>> On Fri, 6 May 2022 at 23:14, Stamatis Zampetakis <zabe...@gmail.com>
>> wrote:
>>
>>
>> Hi all,
>>
>> The current master has many critical bug fixes as well as important
>> performance improvements that are not backported (and most likely never
>> will) to the maintenance branches.
>>
>> Backporting changes from master usually requires adapting the code and
>> tests in question, making it a non-trivial and time-consuming task.
>>
>> The ASF bylaws require PMCs to deliver high quality software which
>> satisfies certain criteria. Cutting new releases from maintenance branches
>> with known critical bugs is not compliant with the ASF.
>>
>> CI is unstable in all maintenance branches making the quality of a
>> release questionable and merging new PRs rather difficult. Enabling and
>> running it frequently in all maintenance branches would require a big
>> amount of resources on top of what we already need for master.
>>
>> History has shown that it is very difficult or impossible to properly
>> maintain multiple release branches for Hive.
>>
>> I think it would be in the best interest of the project if the PMC
>> decided to drop support for maintenance branches and focused on releasing
>> exclusively from master.
>>
>> This mail is related to the discussion about the release cadence [1]
>> since it would certainly help make Hive releases more regular. I decided
>> to start a separate thread to avoid mixing multiple topics together.
>>
>> Looking forward to your thoughts.
>>
>> Best,
>> Stamatis
>>
>> [1]
>> https://lists.apache.org/thread/n245dd23kb2v3qrrfp280w3pto89khxj
>>
>>
>>
