The security posture of Hadoop 2 in general is a problem because maintenance on that branch is spotty. That is just how it goes; we had the same situation with our now EOL branch-1. I know Hadoop released 2.10.2 to address some CVE-worthy problems, but it is unclear whether 2.10.2 addresses all known issues, unlike 3.3.4. Also, as you know, Hadoop 2 has unpatchable dependencies on the org.codehaus version of Jackson and the org.mortbay version of Jetty, which themselves have high-scoring CVEs that will never be fixed because they are EOL, and there are other similar issues. Hadoop 3 doesn’t completely solve such problems, but it is the only realistic place we can hope they will be addressed as required. For organizations that implement or require a top-to-bottom security audit of their software bill of materials, it seems possible to avoid user pain by providing supported convenience artifacts *and* libraries built against Hadoop 3 APIs in the Apache repository, addressable with a Maven classifier.
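To make the classifier idea concrete, here is a rough sketch of what a downstream POM (Phoenix, for example) might declare. The classifier name `hadoop3` is the one floated in this thread, and the artifact and version are only illustrative; none of this describes coordinates that have actually been published.

```xml
<!-- Hypothetical downstream dependency declaration.
     ASSUMPTION: HBase publishes a Hadoop 3 build under a "hadoop3" classifier;
     the version below is illustrative only. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>2.5.0</version>
  <classifier>hadoop3</classifier>
</dependency>
```

Users who are comfortable with the Hadoop 2 default would omit the `<classifier>` element and continue to resolve the existing artifacts unchanged.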
My employer has some interests in this area that align so I would like to sponsor (implement, review, commit, RM backfill releases, etc.) this work. Would there be any objections?

Read through the thread for some thoughts on approach. Summarized:

- Amend create-release to build, stage, and deploy a -hadoop3 variant build by activating the Hadoop 3 build profile.
- Amend the Hadoop 3 build profile to flatten POMs before deployment to resolve potential downstream issues due to Hadoop 3 being a non-default build profile. (This could also be applied to all builds.)
- Amend hbase-vote to be aware of and evaluate if present -hadoop3 variant artifacts.

> On Aug 25, 2022, at 10:40 AM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
>
> Thanks, that would work.
>
>> On Aug 25, 2022, at 11:35 AM, Sean Busbey <bus...@apache.org> wrote:
>>
>> yes, the flatten plugin. We use it in hbase-connectors already.
>>
>> https://www.mojohaus.org/flatten-maven-plugin/
>>
>> this sounds like it could also be a use case for BOMs, which would also
>> benefit users of our client artifacts that use build tools that don't
>> respect maven profiles generally, like gradle.
>>
>>> On Thu, Aug 25, 2022 at 10:30 AM Andrew Purtell <andrew.purt...@gmail.com>
>>> wrote:
>>>
>>> Thinking about this a bit more, we will have an issue in that the POMs
>>> published from our -hadoop3 build will not have a default activation of our
>>> Hadoop 3 build profile. The convenience binaries will function as expected
>>> but Maven will read and process eg Phoenix POMs, then download and perform
>>> substitutions on HBase POMs, and then etc, so downstreamers like Phoenix
>>> will have to set up the hadoop.profile variable for us in their default
>>> build profile or else the transitive paths through us may be wrong.
>>> I wonder if there is a Maven plugin available for deploying POMs with all
>>> variable substitutions performed before deployment, that would solve that
>>> problem and all conceivable related issues.
>>>
>>>> On Aug 25, 2022, at 11:03 AM, Andrew Purtell <andrew.purt...@gmail.com>
>>>> wrote:
>>>>
>>>> I think 2.x is going to have a few years of life remaining so it would
>>>> be best, if we are going to address this, to have a 2.x solution as well
>>>> as a 3.x one.
>>>>
>>>> In my opinion we can continue to publish 2.4 and 2.5 (and 2.6) unchanged
>>>> and then also introduce a Hadoop 3 release using “hadoop3” or similar as
>>>> Maven classifier. Phoenix could specify this classifier in their POMs.
>>>> Everyone should be happy. Users who already are comfortable with the Hadoop
>>>> 2 default don’t have to change anything. A one time POM change on the
>>>> Phoenix side is required but that’s it.
>>>>
>>>> The additional build time complexity for generating two releases can be
>>>> incorporated into create-release. Nobody does manual releases any more as
>>>> far as I know. Likewise, download and verification of -hadoop3 convenience
>>>> binaries can be added to hbase-vote. I believe we are all using that tool
>>>> for verification of releases now. After these one time changes are landed
>>>> the cost for RMs and PMC will be only in a roughly doubled amount of time
>>>> needed to build and verify releases.
>>>>
>>>>> On Aug 17, 2022, at 9:06 AM, Nick Dimiduk <ndimi...@apache.org> wrote:
>>>>>
>>>>> Hi Geoffrey,
>>>>>
>>>>> I have no complaints with shipping convenience binaries built against both
>>>>> Hadoop2 and Hadoop3. The primary challenge is implementing the
>>>>> necessary build changes, the secondary challenge is verifying/testing it
>>>>> works reliably.
>>>>>
>>>>> But for Phoenix, are you asking for convenience binaries, or are you asking
>>>>> for artifacts published into maven that have the Hadoop3 profile activated
>>>>> and specify the associated dependencies?
>>>>>
>>>>> I'm afraid that the 2.5.0 release ship has already sailed. I've heard talk
>>>>> of a 2.6 "fast-follow", so maybe someone can have the build changes ready
>>>>> for that? Also, isn't this a too little, too late situation? Shouldn't we
>>>>> shift our focus to releasing 3.0, which has dropped support for Hadoop2?
>>>>>
>>>>> Thanks,
>>>>> Nick
>>>>>
>>>>>> On Tue, Aug 16, 2022 at 9:30 PM Geoffrey Jacoby <gjac...@apache.org> wrote:
>>>>>>
>>>>>> I see that the next HBase 2.5 RC is imminent, and before that's set in
>>>>>> stone, I wanted to bring up the question of whether there will be official
>>>>>> HBase 2.5 binaries built with the Hadoop 3 profile and available in the
>>>>>> usual Maven repositories. (In addition to the usual Hadoop 2 profile
>>>>>> binaries)
>>>>>>
>>>>>> The HBase 2.x line has a commitment to maintain support for Hadoop 2.x, but
>>>>>> Hadoop 3.3 is the current stable Hadoop line and the most recent release
>>>>>> notes [1] encourage all users of Hadoop 2.x to upgrade to Hadoop 3.
>>>>>>
>>>>>> Without convenience artifacts built against Hadoop 3, no end-users with
>>>>>> Hadoop 3 clusters will be able to use the Apache-distributed binaries and
>>>>>> will instead have to recompile HBase from source themselves, or use a 3rd
>>>>>> party distribution that does so for them.
>>>>>>
>>>>>> This is especially inconvenient for downstream projects such as Apache
>>>>>> Phoenix, which has never officially supported the HBase 2.x / Hadoop 2.10
>>>>>> combination. (It currently supports only HBase 2.3 or 2.4 with Hadoop 3.
>>>>>> HBase 2.5 support will be added very shortly after its release as part of
>>>>>> Phoenix 5.2.)
>>>>>>
>>>>>> To even run the Phoenix IT tests locally requires contributors to download
>>>>>> the HBase source release and manually mvn install to their local maven repo
>>>>>> using the Hadoop 3 profile, to avoid crashes in the HBase minicluster.[2]
>>>>>> This is a barrier to new contributors and confuses even veteran ones, and
>>>>>> has to be done again for every new HBase release.
>>>>>>
>>>>>> In general, I expect the Hadoop 3 user base to grow and the Hadoop 2.10
>>>>>> user base to shrink with every future HBase 2 release, so I think this is a
>>>>>> worthwhile improvement.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Geoffrey
>>>>>>
>>>>>> [1] https://hadoop.apache.org/release/3.3.4.html
>>>>>> [2] https://github.com/apache/phoenix/blob/master/BUILDING.md