I don't think you'll need to manage downloading anything. Just run
something like `mvn clean test -Dhadoop.version=... -PrunDevTests=true` for
each hadoop version we need to check. The actual command will be a little
more complex than that because of our flaky test exclusions. It would
probably be good if this was integrated into the yetus personality
(dev-support/hbase-personality.sh).
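A rough sketch of that loop follows; the version list and the exact flags are illustrative only, not the official support matrix, and the `echo` makes it a dry run (drop it to actually execute):

```shell
# Sketch: run the devTests profile against each candidate Hadoop 3 release.
# Versions listed are examples only -- use the branch's real support matrix.
for hv in 3.3.5 3.3.6 3.4.0; do
  echo mvn clean test -Dhadoop.version="${hv}" -PrunDevTests=true
done
```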

This is going to add a lot of time to our nightlies for branch-2...

On Mon, Sep 16, 2024 at 3:45 PM Istvan Toth <st...@cloudera.com.invalid>
wrote:

> OK, I'm going to look into downloading multiple Hadoop 3 versions and
> running those tests with each one.
>
> On Mon, Sep 16, 2024 at 3:08 PM 张铎(Duo Zhang) <palomino...@gmail.com>
> wrote:
>
> > And if we can verify compatibility, I agree that we could depend on
> > the newest possible Hadoop version by default. As you said, it can
> > eliminate most transitive security issues.
> >
> > There are still 3 security issues on the master branch because of
> > netty 3, which should be fixed in 3.4.0.
> >
> > 张铎(Duo Zhang) <palomino...@gmail.com> wrote on Mon, Sep 16, 2024 at 21:03:
> > >
> > > There is a devTests profile in our pom; we can make use of it first.
> > >
> > > And on integration tests, I mean this one:
> > >
> > > https://github.com/apache/hbase/blob/4446d297112899dab59c0952489457c4419366d3/dev-support/Jenkinsfile#L755
> > >
> > > We could extend this test to cover different combinations.
> > >
> > > Istvan Toth <st...@cloudera.com.invalid> wrote on Mon, Sep 16, 2024 at 19:48:
> > > >
> > > > On Wed, Sep 11, 2024 at 4:30 PM 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
> > > >
> > > > > There is a problem: usually you can use an old Hadoop client to
> > > > > communicate with a new Hadoop server, but not vice versa.
> > > > >
> > > >
> > > > Do we have examples of that?
> > > >
> > > >
> > > > https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html
> > > > specifically states otherwise:
> > > >
> > > > In addition to the limitations imposed by being Stable
> > > > <https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/InterfaceClassification.html#Stable>,
> > > > Hadoop’s wire protocols MUST also be forward compatible across minor
> > > > releases within a major version according to the following:
> > > >
> > > >    - Client-Server compatibility MUST be maintained so as to allow
> > > >    users to continue using older clients even after upgrading the
> > > >    server (cluster) to a later version (or vice versa). For example,
> > > >    a Hadoop 2.1.0 client talking to a Hadoop 2.3.0 cluster.
> > > >    - Client-Server compatibility MUST be maintained so as to allow
> > > >    users to upgrade the client before upgrading the server (cluster).
> > > >    For example, a Hadoop 2.4.0 client talking to a Hadoop 2.3.0
> > > >    cluster. This allows deployment of client-side bug fixes ahead of
> > > >    full cluster upgrades. Note that new cluster features invoked by
> > > >    new client APIs or shell commands will not be usable. YARN
> > > >    applications that attempt to use new APIs (including new fields in
> > > >    data structures) that have not yet been deployed to the cluster
> > > >    can expect link exceptions.
> > > >    - Client-Server compatibility MUST be maintained so as to allow
> > > >    upgrading individual components without upgrading others. For
> > > >    example, upgrade HDFS from version 2.1.0 to 2.2.0 without
> > > >    upgrading MapReduce.
> > > >    - Server-Server compatibility MUST be maintained so as to allow
> > > >    mixed versions within an active cluster so the cluster may be
> > > >    upgraded without downtime in a rolling fashion.
> > > >
> > > > Admittedly, I don't have a lot of experience with mismatched Hadoop
> > > > versions, but my proposal should be covered by the second clause.
> > > >
> > > > Usage of newer APIs should be caught when compiling with older Hadoop
> > > > versions.
> > > > The only risk I can see is when we use a new feature that was added
> > > > without changing the API signature (such as adding a new constant
> > > > value for some new behaviour).
> > > >
> > > >
> > > > > When deploying HBase, HBase itself acts as a client of Hadoop;
> > > > > that's why we always stay on the oldest supported Hadoop version.
> > > > >
> > > > >
> > > > That is not true for 2.6, which according to the docs supports
> > > > Hadoop 3.2 but defaults to Hadoop 3.3.
> > > >
> > > >
> > > > > For me, technically I think bumping to the newest patch release of
> > > > > a minor release should be fine, which is proposal 1.
> > > > >
> > > > > But the current hadoopcheck is not enough, since it can only ensure
> > > > > that there is no compilation error.
> > > > > Maybe we should also run some simple dev tests in the hadoopcheck
> > > > > stage, and in integration tests, we should try to build with all
> > > > > the supported Hadoop versions and run the basic read/write tests.
> > > >
> > > >
> > > > Do we need to test all versions?
> > > > If we test with, say, 3.3.0 and 3.3.6, do we need to test with
> > > > 3.3.[1-5]?
> > > > Or if we test with 3.2.5 and 3.3.6, do we need to test with any of
> > > > the interim versions?
> > > >
> > > > Basically, how much do we trust Hadoop to keep to its compatibility
> > > > rules?
> > > >
> > > > Running a limited number of tests should not be a problem.
> > > > Should we add a new test category, so that they can be easily
> > > > started from Maven?
> > > >
> > > > Can you suggest some tests that we should run for the compatibility
> > > > check?
> > > >
> > > >
> > > > > Thanks.
> > > > >
> > > > > Istvan Toth <st...@cloudera.com.invalid> wrote on Wed, Sep 11, 2024 at 21:05:
> > > > > >
> > > > > > Let me summarize my take on the discussion so far:
> > > > > >
> > > > > > There are two aspects to the Hadoop version we build with:
> > > > > > 1. Source code quality/compatibility
> > > > > > 2. Security and usability of the public binary assemblies and
> > > > > > (shaded) HBase Maven artifacts.
> > > > > >
> > > > > > 1. Source code quality/compatibility
> > > > > >
> > > > > > AFAICT we have the following hard goals:
> > > > > > 1.a: Ensure that HBase compiles and runs well with the earliest
> > > > > > supported Hadoop version on the given branch
> > > > > > 1.b: Ensure that HBase compiles and runs well with the latest
> > > > > > supported Hadoop version on the given branch
> > > > > >
> > > > > > In my opinion we should also strive for these goals:
> > > > > > 1.c: Aim to officially support the newest possible Hadoop releases
> > > > > > 1.d: Take advantage of new features in newer Hadoop versions
> > > > > >
> > > > > > 2. Public binary usability wish list:
> > > > > >
> > > > > > 2.a: We want them to work OOB for as many use cases as possible
> > > > > > 2.b: We want them to work as well as possible
> > > > > > 2.c: We want to have as few CVEs in them as possible
> > > > > > 2.d: We want to make upgrades as painless as possible, especially
> > > > > > for patch releases
> > > > > >
> > > > > > The fact that Hadoop does not have an explicit end-of-life
> > > > > > policy of course complicates things.
> > > > > >
> > > > > > Our current policy seems to be that we pick a Hadoop version to
> > > > > > build with when releasing a minor version, and stay on that
> > > > > > version until there is a newer patch release of that minor
> > > > > > version with direct CVE fixes.
> > > > > > This does not seem to be absolute; for example, the recently
> > > > > > released HBase 2.4.18 still defaults to Hadoop 3.1.2, which has
> > > > > > several old CVEs, many of which are reportedly fixed in 3.1.3
> > > > > > and 3.1.4.
> > > > > >
> > > > > > My proposals are:
> > > > > >
> > > > > > Proposal 1:
> > > > > >
> > > > > > Whenever a new Hadoop patch release comes out for a minor
> > > > > > version, then unless it breaks source compatibility, we should
> > > > > > automatically update the default Hadoop version for all branches
> > > > > > that use the same minor version.
> > > > > > The existing hadoopcheck mechanism should be good enough to
> > > > > > guarantee that we do not break compatibility with the earlier
> > > > > > patch releases.
> > > > > >
> > > > > > This would ensure that the binaries use the latest and greatest
> > > > > > Hadoop (of that minor branch), that users of the binaries get the
> > > > > > latest fixes, both CVE- and functionality-wise, and that the
> > > > > > binaries also get the transitive CVE fixes in that release.
> > > > > > For example, if we did this we could use the new feature in 3.3.6
> > > > > > in HBASE-27769 (via reflection) and also test it, thereby
> > > > > > improving Ozone support.
> > > > > >
> > > > > > On the other hand we minimize changes and maximize compatibility
> by
> > > > > > sticking to the same Hadoop minor release.
> > > > > >
> > > > > > Proposal 2:
> > > > > >
> > > > > > We should default to the latest Hadoop version (currently 3.4.0)
> > > > > > on unreleased branches.
> > > > > > This should ensure that when we do release, we default to the
> > > > > > latest version and have tested it as thoroughly as possible.
> > > > > >
> > > > > > Again, the existing hadoopcheck mechanism should ensure that we
> > > > > > do not break compatibility with earlier supported versions.
> > > > > >
> > > > > > Istvan
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Sep 9, 2024 at 9:41 PM Nick Dimiduk <ndimi...@apache.org> wrote:
> > > > > >
> > > > > > > Yes, we’ll use reflection to make use of APIs introduced in
> > newer HDFS
> > > > > > > versions than the stated dependency until the stated dependency
> > finally
> > > > > > > catches up.
> > > > > > >
> > > > > > > On Mon, 9 Sep 2024 at 19:55, Wei-Chiu Chuang <weic...@apache.org> wrote:
> > > > > > >
> > > > > > > > Reflection is probably the way to go to ensure maximum
> > compatibility
> > > > > TBH
> > > > > > > >
> > > > > > > > On Mon, Sep 9, 2024 at 10:40 AM Istvan Toth <st...@cloudera.com.invalid> wrote:
> > > > > > > >
> > > > > > > > > Stephen Wu has kindly sent me the link for the previous
> > > > > > > > > email thread:
> > > > > > > > > https://lists.apache.org/thread/2k4tvz3wpg06sgkynkhgvxrodmj86vsj
> > > > > > > > >
> > > > > > > > > Reading it, I cannot see anything there that would
> > > > > > > > > contraindicate upgrading to 3.3.6 from 3.3.5, at least on
> > > > > > > > > the branches that already default to 3.3.5, i.e. 2.6+.
> > > > > > > > >
> > > > > > > > > At first glance, the new logic in HBASE-27769 could also be
> > > > > > > > > implemented with the usual reflection hacks, while
> > > > > > > > > preserving the old logic for Hadoop 3.3.5 and earlier.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Istvan
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Sep 9, 2024 at 1:42 PM Istvan Toth <st...@cloudera.com> wrote:
> > > > > > > > >
> > > > > > > > > > Thanks for your reply, Nick.
> > > > > > > > > >
> > > > > > > > > > There are no listed direct CVEs in either Hadoop 3.2.4 or
> > > > > > > > > > 3.3.5, but there are CVEs in their transitive dependencies.
> > > > > > > > > >
> > > > > > > > > > My impression is that rather than shipping the oldest
> > > > > > > > > > 'safe' version, HBase does seem to update the default
> > > > > > > > > > Hadoop version to the latest-ish at the time of the start
> > > > > > > > > > of the release process, otherwise 2.6 would still default
> > > > > > > > > > to 3.2.4. (The HBase 2.6 release was already underway when
> > > > > > > > > > Hadoop 3.4.0 was released.)
> > > > > > > > > >
> > > > > > > > > > For now, we (Phoenix) have resorted to dependency-managing
> > > > > > > > > > transitive dependencies coming in (only) via Hadoop in
> > > > > > > > > > Phoenix, but that is a slippery slope and adds a layer of
> > > > > > > > > > uncertainty, as it may introduce incompatibilities in
> > > > > > > > > > Hadoop that we don't have tests for.
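[For illustration only, such a pin looks like the sketch below; the artifact and version shown are hypothetical examples, not what Phoenix actually pins:]

```xml
<!-- Hypothetical sketch: pinning one transitive dependency that arrives
     via Hadoop. Artifact and version are illustrative only. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-handler</artifactId>
      <version>4.1.100.Final</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```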
> > > > > > > > > >
> > > > > > > > > > Our situation is similar to that of the HBase shaded
> > > > > > > > > > artifacts, where we ship a huge uberjar that includes much
> > > > > > > > > > of both HBase and Hadoop on top of (or rather below)
> > > > > > > > > > Phoenix, similar to the hbase-shaded-client jar.
> > > > > > > > > >
> > > > > > > > > > I will look into the hadoop check CI tests that you've
> > > > > > > > > > mentioned, then I will try to resurrect HBASE-27931, and
> > > > > > > > > > if I don't find any issues, and there are no objections,
> > > > > > > > > > then I will put up a PR to update the unreleased versions
> > > > > > > > > > to default to 3.4.0.
> > > > > > > > > >
> > > > > > > > > > Istvan
> > > > > > > > > >
> > > > > > > > > > On Mon, Sep 9, 2024 at 11:06 AM Nick Dimiduk <ndimi...@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > >> My understanding of our hadoop dependency policy is that
> > > > > > > > > >> we ship poms with hadoop versions pinned to the oldest
> > > > > > > > > >> compatible, "safe" version that is supported. Our test
> > > > > > > > > >> infrastructure has a "hadoop check" procedure that does
> > > > > > > > > >> some validation against other patch release versions.
> > > > > > > > > >>
> > > > > > > > > >> I don't know if anyone has done a CVE sweep recently. If
> > > > > > > > > >> there are new CVEs, we do bump the minimum supported
> > > > > > > > > >> version specified in the pom as part of patch releases.
> > > > > > > > > >> These changes need to include a pretty thorough
> > > > > > > > > >> compatibility check so that we can include release notes
> > > > > > > > > >> about any introduced incompatibilities.
> > > > > > > > > >>
> > > > > > > > > >> I am in favor of a dependency bump so as to address known
> > > > > > > > > >> CVEs as best as we reasonably can.
> > > > > > > > > >>
> > > > > > > > > >> Thanks,
> > > > > > > > > >> Nick
> > > > > > > > > >>
> > > > > > > > > >> On Mon, Sep 9, 2024 at 10:59 AM Istvan Toth <st...@apache.org> wrote:
> > > > > > > > > >>
> > > > > > > > > >> > Hi!
> > > > > > > > > >> >
> > > > > > > > > >> > I'm working on building the Phoenix uberjars with newer
> > > > > > > > > >> > Hadoop versions by default to improve its CVE stance,
> > > > > > > > > >> > and I realized that HBase itself does not use the
> > > > > > > > > >> > latest releases.
> > > > > > > > > >> >
> > > > > > > > > >> > branch-2.5 defaults to 3.2.4
> > > > > > > > > >> > branch-2.6 and later defaults to 3.3.5
> > > > > > > > > >> >
> > > > > > > > > >> > I can kind of understand that we don't want to bump
> > > > > > > > > >> > the minor version for branch-2.5 from the one it was
> > > > > > > > > >> > released with.
> > > > > > > > > >> >
> > > > > > > > > >> > However, I don't see the rationale for not upgrading
> > > > > > > > > >> > branch-2.6 to at least 3.3.6, and the unreleased
> > > > > > > > > >> > branches (branch-2, branch-3, master) to 3.4.0.
> > > > > > > > > >> >
> > > > > > > > > >> > I found a mention of wanting to stay off the latest
> > > > > > > > > >> > patch release in HBASE-27931, but I could not figure
> > > > > > > > > >> > out whether it has a technical reason, or if this is a
> > > > > > > > > >> > written (or unwritten) policy.
> > > > > > > > > >> >
> > > > > > > > > >> > best regards
> > > > > > > > > >> > Istvan
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > *István Tóth* | Sr. Staff Software Engineer
> > > > > > > > > > *Email*: st...@cloudera.com
> > > > > > > > > > cloudera.com <https://www.cloudera.com>
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> >
>
>
>
