I think changing the default hadoop profile for builds in branch-2 would
unnecessarily complicate our compatibility messaging so long as Hadoop 2
hasn't gone EOL.

On Mon, Aug 29, 2022 at 5:30 AM Nick Dimiduk <ndimi...@apache.org> wrote:

> Should we also make hadoop3 the default active profile for branch-2 going
> forward?
>
> On Fri, Aug 26, 2022 at 5:25 PM Andrew Purtell <andrew.purt...@gmail.com>
> wrote:
>
> > The security posture of Hadoop 2 in general is a problem, because
> > maintenance on that branch is spotty, that is just how it goes. We had the
> > same situation with our now EOL branch-1. I know Hadoop released 2.10.2 to
> > address some CVE worthy problems but it is unclear if 2.10.2 addresses all
> > known issues, unlike 3.3.4. Also as you know Hadoop 2 has unpatchable
> > dependencies on org.codehaus versions of Jackson and Jetty, which
> > themselves have high scoring CVEs that will never be fixed because they are
> > EOL, and other similar issues. Hadoop 3 doesn’t completely solve such
> > problems but is the only realistic place we can hope they can be addressed
> > as required. For organizations that implement or require a top to bottom
> > security audit of their software bill of materials, it seems possible to
> > avoid user pain by providing supported convenience artifacts *and*
> > libraries built against Hadoop 3 APIs in the Apache repository addressable
> > with a Maven classifier.
> >
> > My employer has some interests in this area that align so I would like to
> > sponsor (implement, review, commit, RM backfill releases, etc.) this work.
> > Would there be any objections? Read through the thread for some thoughts on
> > approach. Summarized:
> >
> > - Amend create-release to build, stage, and deploy a -hadoop3 variant
> > build by activating the Hadoop 3 build profile.
> >
> > - Amend the Hadoop 3 build profile to flatten POMs before deployment to
> > resolve potential downstream issues due to Hadoop 3 being a non-default
> > build profile. (This could also be applied to all builds.)
> >
> > - Amend hbase-vote to be aware of, and evaluate if present, -hadoop3
> > variant artifacts.
> >
> >
> > > On Aug 25, 2022, at 10:40 AM, Andrew Purtell <andrew.purt...@gmail.com>
> > > wrote:
> > >
> > > Thanks, that would work.
> > >
> > >> On Aug 25, 2022, at 11:35 AM, Sean Busbey <bus...@apache.org> wrote:
> > >>
> > >> yes, the flatten plugin. We use it in hbase-connectors already.
> > >>
> > >> https://www.mojohaus.org/flatten-maven-plugin/
> > >>
> > >> this sounds like it could also be a use case for BOMs, which would also
> > >> benefit users of our client artifacts that use build tools that don't
> > >> respect maven profiles generally, like gradle.
> > >>
> > >>> On Thu, Aug 25, 2022 at 10:30 AM Andrew Purtell
> > >>> <andrew.purt...@gmail.com> wrote:
> > >>>
> > >>> Thinking about this a bit more, we will have an issue in that the POMs
> > >>> published from our -hadoop3 build will not have a default activation of
> > >>> our Hadoop 3 build profile. The convenience binaries will function as
> > >>> expected but Maven will read and process e.g. Phoenix POMs, then download
> > >>> and perform substitutions on HBase POMs, and so on, so downstreamers like
> > >>> Phoenix will have to set up the hadoop.profile variable for us in their
> > >>> default build profile or else the transitive paths through us may be
> > >>> wrong. I wonder if there is a Maven plugin available for deploying POMs
> > >>> with all variable substitutions performed before deployment, that would
> > >>> solve that problem and all conceivable related issues.
> > >>>
> > >>>> On Aug 25, 2022, at 11:03 AM, Andrew Purtell
> > >>>> <andrew.purt...@gmail.com> wrote:
> > >>>>
> > >>>> I think 2.x is going to have a few years of life remaining so it would
> > >>>> be best, if we are going to address this, to have a 2.x solution as well
> > >>>> as a 3.x one.
> > >>>>
> > >>>> In my opinion we can continue to publish 2.4 and 2.5 (and 2.6)
> > >>>> unchanged and then also introduce a Hadoop 3 release using “hadoop3” or
> > >>>> similar as Maven classifier. Phoenix could specify this classifier in
> > >>>> their POMs. Everyone should be happy. Users who already are comfortable
> > >>>> with the Hadoop 2 default don’t have to change anything. A one time POM
> > >>>> change on the Phoenix side is required but that’s it.
> > >>>>
> > >>>> The additional build time complexity for generating two releases can be
> > >>>> incorporated into create-release. Nobody does manual releases any more
> > >>>> as far as I know. Likewise, download and verification of -hadoop3
> > >>>> convenience binaries can be added to hbase-vote. I believe we are all
> > >>>> using that tool for verification of releases now. After these one time
> > >>>> changes are landed the cost for RMs and PMC will be only in a roughly
> > >>>> doubled amount of time needed to build and verify releases.
> > >>>>
> > >>>>>> On Aug 17, 2022, at 9:06 AM, Nick Dimiduk <ndimi...@apache.org> wrote:
> > >>>>>>
> > >>>>>> Hi Geoffrey,
> > >>>>>>
> > >>>>> I have no complaints with shipping convenience binaries built against
> > >>>>> both Hadoop2 and Hadoop3. The primary challenge is implementing the
> > >>>>> necessary build changes, the secondary challenge is verifying/testing
> > >>>>> it works reliably.
> > >>>>>
> > >>>>> But for Phoenix, are you asking for convenience binaries, or are you
> > >>>>> asking for artifacts published into maven that have the Hadoop3 profile
> > >>>>> activated and specify the associated dependencies?
> > >>>>>
> > >>>>> I'm afraid that the 2.5.0 release ship has already sailed. I've heard
> > >>>>> talk of a 2.6 "fast-follow", so maybe someone can have the build changes
> > >>>>> ready for that? Also, isn't this a too little, too late situation?
> > >>>>> Shouldn't we shift our focus to releasing 3.0, which has dropped support
> > >>>>> for Hadoop2?
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Nick
> > >>>>>
> > >>>>>>> On Tue, Aug 16, 2022 at 9:30 PM Geoffrey Jacoby
> > >>>>>>> <gjac...@apache.org> wrote:
> > >>>>>>
> > >>>>>> I see that the next HBase 2.5 RC is imminent, and before that's set in
> > >>>>>> stone, I wanted to bring up the question of whether there will be
> > >>>>>> official HBase 2.5 binaries built with the Hadoop 3 profile and
> > >>>>>> available in the usual Maven repositories. (In addition to the usual
> > >>>>>> Hadoop 2 profile binaries.)
> > >>>>>>
> > >>>>>> The HBase 2.x line has a commitment to maintain support for Hadoop 2.x,
> > >>>>>> but Hadoop 3.3 is the current stable Hadoop line and the most recent
> > >>>>>> release notes [1] encourage all users of Hadoop 2.x to upgrade to
> > >>>>>> Hadoop 3.
> > >>>>>>
> > >>>>>> Without convenience artifacts built against Hadoop 3, no end-users with
> > >>>>>> Hadoop 3 clusters will be able to use the Apache-distributed binaries
> > >>>>>> and will instead have to recompile HBase from source themselves, or use
> > >>>>>> a 3rd party distribution that does so for them.
> > >>>>>>
> > >>>>>> This is especially inconvenient for downstream projects such as Apache
> > >>>>>> Phoenix, which has never officially supported the HBase 2.x / Hadoop
> > >>>>>> 2.10 combination. (It currently supports only HBase 2.3 or 2.4 with
> > >>>>>> Hadoop 3. HBase 2.5 support will be added very shortly after its
> > >>>>>> release as part of Phoenix 5.2.)
> > >>>>>>
> > >>>>>> To even run the Phoenix IT tests locally requires contributors to
> > >>>>>> download the HBase source release and manually mvn install to their
> > >>>>>> local maven repo using the Hadoop 3 profile, to avoid crashes in the
> > >>>>>> HBase minicluster.[2] This is a barrier to new contributors and
> > >>>>>> confuses even veteran ones, and has to be done again for every new
> > >>>>>> HBase release.
> > >>>>>>
> > >>>>>> In general, I expect the Hadoop 3 user base to grow and the Hadoop 2.10
> > >>>>>> user base to shrink with every future HBase 2 release, so I think this
> > >>>>>> is a worthwhile improvement.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>>
> > >>>>>> Geoffrey
> > >>>>>>
> > >>>>>> [1] https://hadoop.apache.org/release/3.3.4.html
> > >>>>>> [2] https://github.com/apache/phoenix/blob/master/BUILDING.md
> > >>>>>>
> > >>>
> >
>
