Thanks for the assessment, Wei-Chiu. Transitive dependency updates in Hadoop are normal (desired, even); that is something HBase needs to manage.
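For what it's worth, by "manage" I mean pinning or excluding the offending artifacts ourselves. A minimal sketch of the pom side of that, with a made-up artifact and version rather than a concrete recommendation:

    <!-- Sketch only: pin a transitive dependency that slips in via Hadoop.
         The artifact and version here are illustrative, not a recommendation. -->
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>io.netty</groupId>
          <artifactId>netty-handler</artifactId>
          <version>4.1.100.Final</version>
        </dependency>
      </dependencies>
    </dependencyManagement>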
As for the tests:

- Duo's suggestion is to extend the Hadoop compatibility tests and run them with multiple Hadoop 3 releases. Looking at the nightly results, those tests are fast: about 14 minutes for Hadoop 2 and Hadoop 3. I've peeked into hbase_nightly_pseudo-distributed-test.sh; the tests there are indeed quite minimal, more of a smoke test, and seem to be targeted at checking the shaded artifacts.
- Nick's suggestion is to run the dev tests with the Hadoop version set explicitly. The runAllTests step in the current nightly takes 8+ hours. On my 6+8 core laptop my last attempt failed after 90 minutes, so let's say a full run takes 120 minutes. I don't know how many free resources HBase has, but if we can utilize another VM per branch, we could run the dev tests with four Hadoop versions and still finish in about the same time as the full test job does. We don't need to test with the default version, as we already run the full suite for that one.

Assuming that we officially support 3.4.0 on all active branches, also default to 3.4.0 on all branches, and trust Hadoop's compatibility enough that we don't need to test interim patch releases within a minor version, we could go with these versions:

branch-2.5: 3.2.3, 3.2.4, 3.3.2, 3.3.6
branch-2.6: 3.3.5, 3.3.6
branch-3: 3.3.5, 3.3.6
branch-4: 3.3.5, 3.3.6

If we trust Hadoop not to break compatibility in patch releases, we could reduce this to only the oldest patch releases:

branch-2.5: 3.2.3, 3.3.2
branch-2.6: 3.3.5
branch-3: 3.3.5
branch-4: 3.3.5

Or, if we also trust it not to break compatibility across minor versions, we could further reduce it to just the oldest supported release:

branch-2.5: 3.2.3
branch-2.6: 3.3.5
branch-3: 3.3.5
branch-4: 3.3.5

Of course, running every dev test is overkill: the vast majority of the tests use the same set of Hadoop APIs and features, and we'd only really need to run the tests that cover that feature set. Figuring out a subset of tests that exercises the full Hadoop API surface (that we use) is a hard and error-prone task, so if we have the resources, we can just brute-force it with the dev tests.

As a base for further discussion: let's take the first set of versions above (the first and last supported patch level for each minor release) and run both the pseudo-distributed tests and the dev tests on them.

Does that sound good? Do we have the resources for that? Do we have a better idea?

Istvan

On Mon, Sep 16, 2024 at 7:20 PM Wei-Chiu Chuang <weic...@apache.org> wrote:

> I strive to meet that stated compatibility goal when I release Hadoop.
> But we don't have a rigorous compatibility/upgrade test in Hadoop, so YMMV
> (we now have one in Ozone!)
>
> There are so many gotchas that it really depends on the RM to do the
> hard work: checking protobuf definitions, running the API compat report,
> compiling against downstream applications.
> The other thing is third-party dependency updates. Whenever I bump the
> Netty or Jetty version, new transitive dependencies slip in as part of the
> update, which sometimes breaks HBase because of the dependency check in
> shading.
>
> On Mon, Sep 16, 2024 at 4:48 AM Istvan Toth <st...@cloudera.com.invalid>
> wrote:
>
> > On Wed, Sep 11, 2024 at 4:30 PM 张铎 (Duo Zhang) <palomino...@gmail.com>
> > wrote:
> >
> > > There is a problem that, usually, you can use an old hadoop client to
> > > communicate with a new hadoop server, but not vice versa.
> >
> > Do we have examples of that?
> >
> > https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html
> > specifically states otherwise:
> >
> > In addition to the limitations imposed by being Stable
> > <https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/InterfaceClassification.html#Stable>,
> > Hadoop's wire protocols MUST also be forward compatible across minor
> > releases within a major version according to the following:
> >
> > - Client-Server compatibility MUST be maintained so as to allow users to
> > continue using older clients even after upgrading the server (cluster)
> > to a later version (or vice versa). For example, a Hadoop 2.1.0 client
> > talking to a Hadoop 2.3.0 cluster.
> > - Client-Server compatibility MUST be maintained so as to allow users to
> > upgrade the client before upgrading the server (cluster). For example, a
> > Hadoop 2.4.0 client talking to a Hadoop 2.3.0 cluster. This allows
> > deployment of client-side bug fixes ahead of full cluster upgrades. Note
> > that new cluster features invoked by new client APIs or shell commands
> > will not be usable. YARN applications that attempt to use new APIs
> > (including new fields in data structures) that have not yet been
> > deployed to the cluster can expect link exceptions.
> > - Client-Server compatibility MUST be maintained so as to allow
> > upgrading individual components without upgrading others. For example,
> > upgrade HDFS from version 2.1.0 to 2.2.0 without upgrading MapReduce.
> > - Server-Server compatibility MUST be maintained so as to allow mixed
> > versions within an active cluster so the cluster may be upgraded without
> > downtime in a rolling fashion.
> >
> > Admittedly, I don't have a lot of experience with mismatched Hadoop
> > versions, but my proposal should be covered by the second clause.
> >
> > Usage of newer APIs should be caught when compiling with older Hadoop
> > versions.
> > The only risk I can see is when we use a new feature which was added
> > without changing the API signature (such as adding a new constant value
> > for some new behaviour).
> >
> > > When deploying HBase, HBase itself acts as a client of hadoop, that's
> > > why we always stay on the oldest supported hadoop version.
> >
> > Not true for 2.6, which according to the docs supports Hadoop 3.2, but
> > defaults to Hadoop 3.3.
> >
> > > For me, technically I think bumping to the newest patch release of a
> > > minor release should be fine, which is the proposal 1.
> > >
> > > But the current hadoopcheck is not enough, since it can only ensure
> > > that there is no compilation error.
> > > Maybe we should also run some simple dev tests in the hadoopcheck
> > > stage, and in integration tests, we should try to build with all the
> > > supported hadoop versions and run the basic read/write tests.
> >
> > Do we need to test all versions?
> > If we test with, say, 3.3.0 and 3.3.6, do we need to test with 3.3.[1-5]?
> > Or if we test with 3.2.5 and 3.3.6, do we need to test with any of the
> > interim versions?
> >
> > Basically, how much do we trust Hadoop to keep to its compatibility
> > rules?
> >
> > Running a limited number of tests should not be a problem.
> > Should we add a new test category, so that they can be easily started
> > from Maven?
> >
> > Can you suggest some tests that we should run for the compatibility
> > check?
> >
> > > Thanks.
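Coming back to my own question above about starting these from Maven: concretely, I imagine each entry of the version matrix to be little more than the following. This is only a sketch; the hadoop.profile flag and the hadoop-three.version property are what hadoopcheck already uses on branch-2, IIRC, and the test selection is a placeholder:

    # Build against a pinned Hadoop 3 release (drop -Dhadoop.profile=3.0 on
    # branch-3/master, where Hadoop 3 is the default).
    mvn clean install -DskipTests -Dhadoop.profile=3.0 -Dhadoop-three.version=3.3.5
    # Run the agreed-upon subset; a full devTests run would drop -Dtest.
    mvn test -Dhadoop.profile=3.0 -Dhadoop-three.version=3.3.5 \
        -DfailIfNoTests=false -Dtest=TestHFileOutputFormat2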
> > > Istvan Toth <st...@cloudera.com.invalid> wrote on Wed, Sep 11, 2024
> > > at 21:05:
> > >
> > > > Let me summarize my take on the discussion so far:
> > > >
> > > > There are two aspects to the Hadoop version we build with:
> > > > 1. Source code quality/compatibility
> > > > 2. Security and usability of the public binary assemblies and
> > > > (shaded) HBase Maven artifacts.
> > > >
> > > > 1. Source code quality/compatibility
> > > >
> > > > AFAICT we have the following hard goals:
> > > > 1.a: Ensure that HBase compiles and runs well with the earliest
> > > > supported Hadoop version on the given branch.
> > > > 1.b: Ensure that HBase compiles and runs well with the latest
> > > > supported Hadoop version on the given branch.
> > > >
> > > > In my opinion we should also strive for these goals:
> > > > 1.c: Aim to officially support the newest possible Hadoop releases.
> > > > 1.d: Take advantage of new features in newer Hadoop versions.
> > > >
> > > > 2. Public binary usability wish list:
> > > >
> > > > 2.a: We want them to work OOB for as many use cases as possible.
> > > > 2.b: We want them to work as well as possible.
> > > > 2.c: We want to have as few CVEs in them as possible.
> > > > 2.d: We want to make upgrades as painless as possible, especially
> > > > for patch releases.
> > > >
> > > > The fact that Hadoop does not have an explicit end-of-life policy
> > > > of course complicates things.
> > > >
> > > > Our current policy seems to be that we pick a Hadoop version to
> > > > build with when releasing a minor version, and stay on that version
> > > > until a newer patch release of that minor version ships with direct
> > > > CVE fixes.
> > > > This does not seem to be absolute; for example, the recently
> > > > released HBase 2.4.18 still defaults to Hadoop 3.1.2, which has
> > > > several old CVEs, many of which are reportedly fixed in 3.1.3 and
> > > > 3.1.4.
> > > >
> > > > My proposals are:
> > > >
> > > > Proposal 1:
> > > >
> > > > Whenever a new Hadoop patch release comes out for a minor version,
> > > > then unless it breaks source compatibility, we should automatically
> > > > update the default Hadoop version for all branches that use the
> > > > same minor version.
> > > > The existing hadoopcheck mechanism should be good enough to
> > > > guarantee that we do not break compatibility with the earlier patch
> > > > releases.
> > > >
> > > > This would ensure that the binaries use the latest and greatest
> > > > Hadoop (of that minor branch), that users of the binaries get the
> > > > latest fixes, both CVE- and functionality-wise, and that the
> > > > binaries also get the transitive CVE fixes in that release.
> > > > For example, if we did this we could use the new 3.3.6 feature in
> > > > HBASE-27769 (via reflection) and also test it, thereby improving
> > > > Ozone support.
> > > >
> > > > On the other hand, we minimize changes and maximize compatibility
> > > > by sticking to the same Hadoop minor release.
> > > >
> > > > Proposal 2:
> > > >
> > > > We should default to the latest Hadoop version (currently 3.4.0) on
> > > > unreleased branches.
> > > > This should ensure that when we do release, we default to the
> > > > latest version and have tested it as thoroughly as possible.
> > > >
> > > > Again, the existing hadoopcheck mechanism should ensure that we do
> > > > not break compatibility with earlier supported versions.
> > > > Istvan
> > > >
> > > > On Mon, Sep 9, 2024 at 9:41 PM Nick Dimiduk <ndimi...@apache.org>
> > > > wrote:
> > > >
> > > > > Yes, we'll use reflection to make use of APIs introduced in newer
> > > > > HDFS versions than the stated dependency, until the stated
> > > > > dependency finally catches up.
> > > > >
> > > > > On Mon, 9 Sep 2024 at 19:55, Wei-Chiu Chuang <weic...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Reflection is probably the way to go to ensure maximum
> > > > > > compatibility, TBH.
> > > > > >
> > > > > > On Mon, Sep 9, 2024 at 10:40 AM Istvan Toth
> > > > > > <st...@cloudera.com.invalid> wrote:
> > > > > >
> > > > > > > Stephen Wu has kindly sent me the link to the previous email
> > > > > > > thread:
> > > > > > > https://lists.apache.org/thread/2k4tvz3wpg06sgkynkhgvxrodmj86vsj
> > > > > > >
> > > > > > > Reading it, I cannot see anything there that would
> > > > > > > contraindicate upgrading from 3.3.5 to 3.3.6, at least on the
> > > > > > > branches that already default to 3.3.5, i.e. 2.6+.
> > > > > > >
> > > > > > > At first glance, the new logic in HBASE-27769 could also be
> > > > > > > implemented with the usual reflection hacks, while preserving
> > > > > > > the old logic for Hadoop 3.3.5 and earlier.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Istvan
> > > > > > >
> > > > > > > On Mon, Sep 9, 2024 at 1:42 PM Istvan Toth <st...@cloudera.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks for your reply, Nick.
> > > > > > > >
> > > > > > > > There are no listed direct CVEs in either Hadoop 3.2.4 or
> > > > > > > > 3.3.5, but there are CVEs in their transitive dependencies.
> > > > > > > >
> > > > > > > > My impression is that rather than shipping the oldest
> > > > > > > > 'safe' version, HBase does seem to update the default
> > > > > > > > Hadoop version to the latest-ish at the start of the
> > > > > > > > release process; otherwise 2.6 would still default to
> > > > > > > > 3.2.4. (The HBase 2.6 release was already underway when
> > > > > > > > Hadoop 3.4.0 was released.)
> > > > > > > >
> > > > > > > > For now, we (Phoenix) have resorted to dependency-managing
> > > > > > > > transitive dependencies coming in (only) via Hadoop in
> > > > > > > > Phoenix, but that is a slippery slope and adds a layer of
> > > > > > > > uncertainty, as it may introduce incompatibilities in
> > > > > > > > Hadoop that we don't have tests for.
> > > > > > > >
> > > > > > > > Our situation is similar to that of the HBase shaded
> > > > > > > > artifacts, where we ship a huge uberjar that includes much
> > > > > > > > of both HBase and Hadoop on top of (or rather below)
> > > > > > > > Phoenix, similar to the hbase-shaded-client jar.
> > > > > > > >
> > > > > > > > I will look into the hadoopcheck CI tests that you've
> > > > > > > > mentioned, then I will try to resurrect HBASE-27931, and if
> > > > > > > > I don't find any issues, and there are no objections, I
> > > > > > > > will put up a PR to update the unreleased versions to
> > > > > > > > default to 3.4.0.
> > > > > > > > Istvan
> > > > > > > >
> > > > > > > > On Mon, Sep 9, 2024 at 11:06 AM Nick Dimiduk
> > > > > > > > <ndimi...@apache.org> wrote:
> > > > > > > >
> > > > > > > >> My understanding of our hadoop dependency policy is that
> > > > > > > >> we ship poms with hadoop versions pinned to the oldest
> > > > > > > >> compatible, "safe" version that is supported. Our test
> > > > > > > >> infrastructure has a "hadoop check" procedure that does
> > > > > > > >> some validation against other patch release versions.
> > > > > > > >>
> > > > > > > >> I don't know if anyone has done a CVE sweep recently. If
> > > > > > > >> there are new CVEs, we do bump the minimum supported
> > > > > > > >> version specified in the pom as part of patch releases.
> > > > > > > >> These changes need to include a pretty thorough
> > > > > > > >> compatibility check so that we can include release notes
> > > > > > > >> about any introduced incompatibilities.
> > > > > > > >>
> > > > > > > >> I am in favor of a dependency bump so as to address known
> > > > > > > >> CVEs as best as we reasonably can.
> > > > > > > >>
> > > > > > > >> Thanks,
> > > > > > > >> Nick
> > > > > > > >>
> > > > > > > >> On Mon, Sep 9, 2024 at 10:59 AM Istvan Toth
> > > > > > > >> <st...@apache.org> wrote:
> > > > > > > >>
> > > > > > > >> > Hi!
> > > > > > > >> >
> > > > > > > >> > I'm working on building the Phoenix uberjars with newer
> > > > > > > >> > Hadoop versions by default to improve their CVE stance,
> > > > > > > >> > and I realized that HBase itself does not use the
> > > > > > > >> > latest releases.
> > > > > > > >> >
> > > > > > > >> > branch-2.5 defaults to 3.2.4
> > > > > > > >> > branch-2.6 and later default to 3.3.5
> > > > > > > >> >
> > > > > > > >> > I can kind of understand that we don't want to bump the
> > > > > > > >> > minor version for branch-2.5 from the one it was
> > > > > > > >> > released with.
> > > > > > > >> >
> > > > > > > >> > However, I don't see the rationale for not upgrading
> > > > > > > >> > branch-2.6 to at least 3.3.6, and the unreleased
> > > > > > > >> > branches (branch-2, branch-3, master) to 3.4.0.
> > > > > > > >> >
> > > > > > > >> > I found a mention of wanting to stay off the latest
> > > > > > > >> > patch release in HBASE-27931, but I could not figure
> > > > > > > >> > out whether there is a technical reason, or whether
> > > > > > > >> > this is a written (or unwritten) policy.
> > > > > > > >> >
> > > > > > > >> > best regards
> > > > > > > >> > Istvan
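P.S. Since reflection came up several times above: what I have in mind for HBASE-27769-style features is the usual probe-once pattern. A sketch with a made-up method name, not the actual Hadoop API:

    import java.lang.reflect.Method;
    import org.apache.hadoop.fs.FileSystem;

    // Sketch: detect once whether a Hadoop API that only exists in newer
    // releases is present, and let callers pick the new or the legacy path.
    // "newCapability" is an illustrative name, not a real Hadoop method.
    public final class NewHadoopApiProbe {

      private static final Method NEW_API = probe();

      private static Method probe() {
        try {
          return FileSystem.class.getMethod("newCapability");
        } catch (NoSuchMethodException e) {
          return null; // older Hadoop on the classpath: use the legacy logic
        }
      }

      /** True when the runtime Hadoop supports the new call. */
      public static boolean isSupported() {
        return NEW_API != null;
      }

      private NewHadoopApiProbe() {}
    }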