CC user@f.a.o

Is anyone aware of something that blocks us from doing the upgrade?

D.

On Tue, Dec 21, 2021 at 5:50 PM David Morávek <david.mora...@gmail.com>
wrote:

> Hi Martijn,
>
> from personal experience, most Hadoop users are lagging behind the release
> lines by a lot, because upgrading a Hadoop cluster is not really a simple
> task to achieve. I think for now we can stay a bit conservative; nothing
> blocks us from using 2.8.5 as we don't use any "newer" APIs in the code.
>
> As for Till's concern, we can still wrap the reflection-based logic so that
> it is skipped in case of "NoClassDefFoundError" instead of
> "ClassNotFoundException" as we do now.
>
> D.
>
>
> On Tue, Dec 14, 2021 at 5:23 PM Martijn Visser <mart...@ververica.com>
> wrote:
>
>> Hi David,
>>
>> Thanks for bringing this up for discussion! Given that Hadoop 2.8 is
>> considered EOL, shouldn't we bump the version to Hadoop 2.10? [1]
>>
>> Best regards,
>>
>> Martijn
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Active+Release+Lines
>>
>> On Tue, 14 Dec 2021 at 10:28, Till Rohrmann <trohrm...@apache.org> wrote:
>>
>> > Hi David,
>> >
>> > I think we haven't updated our Hadoop dependencies in a long time. Hence,
>> > it is probably time to do so. So +1 for upgrading to the latest patch
>> > release.
>> >
>> > If newer 2.x Hadoop versions are compatible with 2.y with x >= y, then I
>> > don't see a problem with dropping support for pre-bundled Hadoop versions
>> > < 2.8. This could indeed help us decrease our build matrix a bit and,
>> > thus, save some build time.
>> >
>> > Concerning simplifying our code base to get rid of reflection logic etc.,
>> > we still might have to add a safeguard for features that are not
>> > supported by earlier versions. According to the docs
>> >
>> > > YARN applications that attempt to use new APIs (including new fields in
>> > > data structures) that have not yet been deployed to the cluster can
>> > > expect link exceptions
>> >
>> > we can see link exceptions. We could get around this by saying that Flink
>> > no longer supports Hadoop < 2.8. But this should be checked with our
>> > users on the user ML at least.
>> >
>> > Cheers,
>> > Till
>> >
>> > On Tue, Dec 14, 2021 at 9:25 AM David Morávek <d...@apache.org> wrote:
>> >
>> > > Hi,
>> > >
>> > > I'd like to start a discussion about upgrading the minimum Hadoop
>> > > version that Flink supports.
>> > >
>> > > Even though the default value for the `hadoop.version` property is set
>> > > to 2.8.3, we're still ensuring both runtime and compile compatibility
>> > > with Hadoop 2.4.x with the scheduled pipeline[1].
>> > >
>> > > Here is a list of dates of the latest releases for each minor version up
>> > > to 2.8.x:
>> > >
>> > > - Hadoop 2.4.1: Last commit on 6/30/2014
>> > > - Hadoop 2.5.2: Last commit on 11/15/2014
>> > > - Hadoop 2.6.5: Last commit on 10/11/2016
>> > > - Hadoop 2.7.7: Last commit on 7/18/2018
>> > > - Hadoop 2.8.5: Last commit on 9/8/2018
>> > >
>> > > Since then there were two more minor releases in the 2.x branch and four
>> > > more minor releases in the 3.x branch.
>> > >
>> > > Supporting the older version involves reflection-based "hacks" for
>> > > supporting multiple versions.
>> > >
>> > > My proposal would be changing the minimum supported version *to 2.8.5*.
>> > > This should simplify the Hadoop-related codebase and simplify the CI
>> > > build infrastructure as we won't have to test for the older versions.
>> > >
>> > > Please note that this only involves the minimum *client side*
>> > > compatibility. The wire protocol should remain compatible with earlier
>> > > versions [2], so we should be able to talk with any servers in the 2.x
>> > > major branch.
>> > >
>> > > One small note for the 2.8.x branch: some of the classes we need are only
>> > > available in version 2.8.4 and above, but I'm not sure we should take an
>> > > eventual need for upgrading a patch version into consideration here,
>> > > because both 2.8.4 and 2.8.5 are pretty old.
>> > >
>> > > WDYT, is it already time to upgrade? Looking forward to any thoughts on
>> > > the topic!
>> > >
>> > > [1]
>> > > https://github.com/apache/flink/blob/release-1.14.0/tools/azure-pipelines/build-apache-repo.yml#L123
>> > > [2]
>> > > https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-common/Compatibility.html#Wire_compatibility
>> > >
>> > > Best,
>> > > D.
>> > >
>> >
>>
>
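
For illustration, the safeguard discussed in this thread (skipping a feature
when a newer Hadoop API is missing at runtime) could look roughly like the
sketch below. It assumes the code is compiled directly against the newer API,
so a missing class or method surfaces as a link error rather than as the
ClassNotFoundException that reflective lookups throw. The names LinkageGuard
and callIfLinked are hypothetical and are not taken from the Flink code base;
the missing class is simulated so the example runs without Hadoop.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper, not part of Flink: guards a call that was compiled
// against a newer API than the one that may be deployed at runtime.
final class LinkageGuard {

    // Runs the call and returns its result, or an empty Optional if the JVM
    // cannot link the class, method, or field the call depends on.
    static <T> Optional<T> callIfLinked(Supplier<T> newerApiCall) {
        try {
            return Optional.ofNullable(newerApiCall.get());
        } catch (NoClassDefFoundError | NoSuchMethodError | NoSuchFieldError e) {
            // All three are subclasses of java.lang.LinkageError.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A call whose dependencies are all present.
        Optional<String> present = callIfLinked(() -> "feature-enabled");

        // Simulates a class that is absent from the runtime classpath.
        Optional<String> missing = callIfLinked(() -> {
            throw new NoClassDefFoundError("org/example/NewerHadoopClass");
        });

        System.out.println(present.isPresent()); // prints true
        System.out.println(missing.isPresent()); // prints false
    }
}
```

Catching the three specific errors rather than LinkageError as a whole is a
design choice in this sketch: it keeps unrelated failures, such as errors in
static initializers, visible instead of silently disabling the feature.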
