It sounds like the Hadoop compatibility guarantees should make it simpler and 
less risky to bump the default Hadoop version used by HBase. One question that 
comes to mind: how much complication and risk is introduced by HBase's use of 
private Hadoop interfaces, which have weaker (or nonexistent?) compatibility 
guarantees across Hadoop minor/patch versions? I know that AsyncFSWAL in 
particular depends heavily on private HDFS client internals and is therefore 
very sensitive to the Hadoop version, falling back to FSHLog when compatibility 
issues are detected at runtime: 
https://hbase.apache.org/book.html#trouble.rs.startup.asyncfs 

This has been touched on, but I think it would be good to make it more explicit 
and document it: do said Hadoop compatibility guarantees mean that users do not 
have to upgrade their existing Hadoop cluster before upgrading to an HBase 
version whose default Hadoop 3.x version is minor versions ahead of that 
cluster? For example, take an HBase 2.5.y -> 2.5.z patch version upgrade where 
Hadoop 3.4.1 is made the default for branch-2.5 as proposed below and the user 
is currently running Hadoop 3.2.3. Or is that something that may work, but in 
case of issues the user is expected to fall back to compiling their own HBase 
against a lower Hadoop version, as you mentioned, Istvan?

> > We also have the packaging
> > (smoke) tests for running the official binaries against earlier Hadoop
> > clusters, which would catch (some of) the Hadoop wire api incompatibilities

I am admittedly not familiar with the relevant HBase/Hadoop compatibility 
testing that is done, or with its depth and breadth. I also don't know off the 
top of my head of places outside of the WAL providers that use Hadoop internals 
and that I would expect to be particularly sensitive to the Hadoop version. I 
stumbled on an older issue, HBASE-13740 "Stop using Hadoop private interfaces", 
where Busbey mentions Animal Sniffer as a possible tool to detect the places 
where HBase relies on Hadoop private interfaces. The places where reflection is 
used to take advantage of newer Hadoop features/APIs, with fallback to older 
behavior/APIs, are a lot more obvious. 
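
To make the reflection-with-fallback pattern concrete, here is a minimal 
sketch. It is not taken from the HBase code base: the class and method names 
(ReflectiveFallback, findOptional, callWithFallback) are hypothetical, and 
java.lang.String stands in for a Hadoop class so that the example runs without 
any Hadoop dependency. The idea is only that the newer method is looked up by 
name at runtime and the older code path is used when the lookup fails.

```java
import java.lang.reflect.Method;

// Hypothetical illustration only; not HBase or Hadoop code.
public class ReflectiveFallback {

    /** Returns the public no-arg method with this name, or null if this runtime lacks it. */
    static Method findOptional(Class<?> type, String name) {
        try {
            return type.getMethod(name);
        } catch (NoSuchMethodException e) {
            return null; // older library version: the method does not exist
        }
    }

    /** Calls the newer method when it is available, otherwise uses the older code path. */
    static String callWithFallback(Object target, String newerMethod) {
        Method m = findOptional(target.getClass(), newerMethod);
        if (m != null) {
            try {
                return String.valueOf(m.invoke(target));
            } catch (ReflectiveOperationException e) {
                // The method exists but could not be invoked; use the fallback below.
            }
        }
        // Stand-in for the "older behavior/API" branch.
        return "fallback:" + target;
    }

    public static void main(String[] args) {
        // String.trim() exists, so the "newer" path is taken.
        System.out.println(callWithFallback(" x ", "trim"));
        // No such method on String, so the fallback path is taken.
        System.out.println(callWithFallback("abc", "noSuchMethodHere"));
    }
}
```

Call sites like this are easy to find by searching for getMethod/invoke, which 
is why I consider them more obvious than direct compile-time use of private 
interfaces.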

Thank you,
Daniel

----- Original Message -----
From: 张铎 <dev@hbase.apache.org>
To: dev@hbase.apache.org
At: 11/15/24 23:35:53 UTC-05:00


+1

Istvan Toth <st...@cloudera.com.invalid> wrote on Fri, Nov 15, 2024 at 14:26:
>
> With Duo's fix for HBASE-28965, there are no more known blockers.
>
> Should I go ahead with making 3.4.1 the default for branch-2.5 and
> branch-2.6?
>
> On Tue, Nov 5, 2024 at 7:16 AM Istvan Toth <st...@cloudera.com> wrote:
>
> > I've just committed the - hopefully - last test change, so the nightlies
> > will not always fail on the packaging tests.
> >
> > All non-released branches (i.e. 2.7+) now default to building with 3.4.1.
> >
> > I wanted to revisit updating the default version on the released (2.5,
> > 2.6) branches, because Nick has expressed concerns about it.
> >
> > The dev tests are good (the test failures seem to be "normal" flakies). As
> > we've not updated the default version, we're not yet running the packaging
> > tests on branch-2.5 and 2.6 against the older Hadoop versions, but we do on
> > the rest of the branches, and if we update it, we will also run them on 2.5
> > and 2.6.
> >
> > Without repeating the arguments I have already made for it, I want to add
> > a new one:
> >
> > Security and CVEs are getting more and more emphasis, which is great, but
> > has some drawbacks.
> > While the proliferation of static analyzers leads to a lot of frustrating
> > CVE witch hunts and false positives, the majority of users cannot evaluate
> > the actual security impact,
> > or even if they can, they are tied by inflexible policies.
> >
> > We at Phoenix had recent security discussions with Trino, and ended up
> > having to dependencyManage the transitive Hadoop dependencies in our shaded
> > uberjar to address their concerns.
> > (Which is an antipattern)
> >
> > While updating the Hadoop version in a patch release undoubtedly increases
> > the risk of regressions, IMO we are protected by the Hadoop backwards
> > compatibility promises, and we have added a reasonable number of tests to
> > catch any issues. I am confident in our test coverage when building HBase
> > with any of the supported Hadoop versions. We also have the packaging
> > (smoke) tests for running the official binaries against earlier Hadoop
> > clusters, which would catch (some of) the Hadoop wire API incompatibilities.
> >
> > However, IMO not updating Hadoop is net negative to the project's health,
> > as the binary (maven or assembly) releases are used primarily either by
> > other libraries interfacing with HBase, or by new users, POC clusters, etc.
> > and these are the use cases where the transitive CVEs can prevent projects
> > from adding (or maintaining as in the case of Trino) HBase support, or
> > discourage new users from adopting HBase.
> >
> > Obviously, I have a very skewed view of HBase users, but I think that most
> > production HBase users either use a vendor or cloud provider version, or
> > have enough in-house expertise to rebuild HBase with their Hadoop version
> > if something goes wrong (despite the Hadoop backwards compatibility
> > promises).
> >
> >
> >
> >
> >
> > On Wed, Oct 30, 2024 at 12:41 PM Istvan Toth <st...@cloudera.com> wrote:
> >
> >> Thanks!
> >>
> >> I will backport the test changes, but keep the default Hadoop version.
> >>
> >> We will have more information then.
> >>
> >> Istvan
> >>
> >> On Wed, Oct 30, 2024 at 10:22 AM Nick Dimiduk <ndimi...@apache.org> wrote:
> >>
> >>> On Mon, Oct 28, 2024 at 11:00 AM Istvan Toth <st...@cloudera.com.invalid> wrote:
> >>> >
> >>> > I have looked at branch-2.5, but the nightly looks off there, as it
> >>> > runs the packaging tests with Hadoop 3.1.1, which it doesn't even
> >>> > officially support anymore.
> >>> >
> >>> > What should we do with branch-2.5?
> >>> >
> >>> > I think that it would not be a lot of extra work to backport
> >>> > everything, both the backwards compatibility tests and defaulting
> >>> > Hadoop to 3.4.1.
> >>> > We just have to update the version in the pom, and add 3.2.4 to the
> >>> > list of versions to test for backwards compatibility and integration
> >>> > (and remove 3.1.1).
> >>> >
> >>> > I would prefer to have uniform tests and default to Hadoop 3.4.1 on all
> >>> > active branches.
> >>> > Having a (few) final 2.5.x release(s) with tested Hadoop 3.4.x support
> >>> > may be useful for users for migration and CVE mitigation purposes.
> >>> >
> >>> > WDYT ?
> >>>
> >>> branch-2.5's default hadoop3 version is 3.2.4. That's a big dependency
> >>> to change for a patch release. I don't think that we can get away with
> >>> that change and maintain our compatibility obligations. I'm not up to
> >>> speed on the current state of CVEs for this older (EOL?) version, so
> >>> we have that dimension to consider. If the newer version is "drop-in"
> >>> compatible (and only if), then I have no issue with moving that
> >>> release line forward. Ultimately it's the release manager for 2.5 to
> >>> make a determination, so I defer to Andrew's assessment.
> >>>
> >>> I am in favor of backporting the improved testing coverage you've
> >>> added to branch-2.5. It would be great to understand if branch-2.5 (1)
> >>> compiled against 3.2.4 will run on Hadoop 3.4.1 and (2) builds and
> >>> tests out on 3.4.1. That will give the more security-minded users
> >>> additional confidence in bumping their hadoop dependency on their own.
> >>>
> >>> Thanks,
> >>> Nick
> >>>
> >>> > On Mon, Oct 21, 2024 at 6:54 PM Istvan Toth <st...@cloudera.com> wrote:
> >>> >
> >>> > > We could also move the default to 3.4.1 directly.
> >>> > > We already test for 3.4.0 in the nightly job.
> >>> > >
> >>> > > On Mon, Oct 21, 2024 at 3:49 PM 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
> >>> > >
> >>> > >> And it seems Hadoop 3.4.1 is out. Could we see whether to bump to
> >>> > >> this version later?
> >>> > >>
> >>> > >> Istvan Toth <st...@cloudera.com.invalid> wrote on Mon, Oct 21, 2024 at 20:56:
> >>> > >> >
> >>> > >> > I have merged the new tests to the nightly Jenkins runs on master.
> >>> > >> >
> >>> > >> > They have identified another 3.4.0 incompatibility:
> >>> > >> > HBASE-28929 <https://issues.apache.org/jira/browse/HBASE-28929>
> >>> > >> >
> >>> > >> > I will hold off backporting the test changes until HBASE-28929 is
> >>> > >> > resolved.
> >>> > >>
> >>> > >
> >>> > >
> >>> > > --
> >>> > > *István Tóth* | Sr. Staff Software Engineer
> >>> > > *Email*: st...@cloudera.com
> >>> > > cloudera.com <https://www.cloudera.com>
> >>> > > [image: Cloudera] <https://www.cloudera.com/>
> >>> > > [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
> >>> > > Cloudera on Facebook] <https://www.facebook.com/cloudera> [image:
> >>> > > Cloudera on LinkedIn] <https://www.linkedin.com/company/cloudera>
> >>> > > ------------------------------
> >>> > > ------------------------------
> >>> > >
> >>> >
> >>> >
> >>>
> >>
> >>
> >>
> >
> >
> > --
> > *István Tóth* | Sr. Staff Software Engineer
> > *Email*: st...@cloudera.com
> > cloudera.com <https://www.cloudera.com>
> > [image: Cloudera] <https://www.cloudera.com/>
> > [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
> > Cloudera on Facebook] <https://www.facebook.com/cloudera> [image:
> > Cloudera on LinkedIn] <https://www.linkedin.com/company/cloudera>
> > ------------------------------
> > ------------------------------
> >
>
>
