Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
Thanks a lot, Duo!

Looking at HBASE-27359, the -hadoop3 artifacts should be exactly the same as the ones we're rebuilding in our build process.

I ran a quick test:
- Added the RC repo to my settings.xml
- Built Phoenix locally with
  mvn clean install -Dhbase.profile=2.5 -Dhbase.version=2.5.2-hadoop3

The test suite passed, everything looks good.

Thank you again!

On Fri, Nov 25, 2022 at 5:25 AM 张铎(Duo Zhang) wrote:
> I've put up 2.5.2RC0, which contains a hadoop3 dist and also hadoop3
> maven artifacts. It is built with hadoop 3.2.4.
>
> The dist is available here:
> https://dist.apache.org/repos/dist/dev/hbase/2.5.2RC0/
>
> And the maven artifacts are available here:
> https://repository.apache.org/content/repositories/orgapachehbase-1504/
>
> Notice that the version for the hadoop3 maven artifacts is 2.5.2-hadoop3.
>
> Please take a look and give it a try.
>
> Thanks.
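For anyone who wants to repeat the quick test above, the "added the RC repo to my settings.xml" step could look roughly like the following. This is a minimal sketch, not the file that was actually used: the profile id and repository id are made-up names, and only the staging URL is taken from the 2.5.2RC0 announcement in this thread.

```xml
<!-- ~/.m2/settings.xml (sketch; the two ids below are hypothetical) -->
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
  <profiles>
    <profile>
      <id>hbase-252-rc0</id>
      <repositories>
        <repository>
          <id>hbase-252-rc0-staging</id>
          <!-- Staging repository URL quoted from the RC0 announcement -->
          <url>https://repository.apache.org/content/repositories/orgapachehbase-1504/</url>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>false</enabled></snapshots>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <!-- Activate the profile for every build run with this settings file -->
  <activeProfiles>
    <activeProfile>hbase-252-rc0</activeProfile>
  </activeProfiles>
</settings>
```

With such a profile active, a build that asks for HBase 2.5.2-hadoop3 should be able to resolve it from the staging repository. Staging repositories are temporary, so the entry is only useful while the release candidate is under vote.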
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
I've put up 2.5.2RC0, which contains a hadoop3 dist and also hadoop3 maven artifacts. It is built with hadoop 3.2.4.

The dist is available here:
https://dist.apache.org/repos/dist/dev/hbase/2.5.2RC0/

And the maven artifacts are available here:
https://repository.apache.org/content/repositories/orgapachehbase-1504/

Notice that the version for the hadoop3 maven artifacts is 2.5.2-hadoop3.

Please take a look and give it a try.

Thanks.

张铎(Duo Zhang) wrote on Mon, Oct 31, 2022 at 12:02:
> Some progress here.
> With other developers' help (especially Nick, Andrew and Guanghao), I've
> successfully made the release scripts able to publish binaries and
> maven artifacts for hadoop3, in dry run mode:
>
> https://github.com/apache/hbase/pull/4856
>
> I've put up a discussion thread for quickly releasing 2.5.2 for the
> 2.5 release line, with hadoop3 binaries. Please shout if you have any
> ideas.
>
> Thanks.
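As the announcement notes, the Hadoop 3 build is distinguished by its version string (2.5.2-hadoop3), not by a separate artifactId or a classifier. A downstream POM might therefore pick it up as sketched below. This is only an illustration: hbase-client is used as an example module, and the hbase.version property name is an assumption borrowed from the Phoenix command line quoted in this thread.

```xml
<!-- Fragment of a downstream pom.xml (sketch) -->
<properties>
  <!-- Use 2.5.2 for the default Hadoop 2 build, 2.5.2-hadoop3 for the Hadoop 3 build -->
  <hbase.version>2.5.2-hadoop3</hbase.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>${hbase.version}</version>
  </dependency>
</dependencies>
```

Keeping the version in a single property means switching between the two variants is a one-line change, or a -Dhbase.version=... override on the command line.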
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
Some progress here.

With other developers' help (especially Nick, Andrew and Guanghao), I've successfully made the release scripts able to publish binaries and maven artifacts for hadoop3, in dry run mode:

https://github.com/apache/hbase/pull/4856

I've put up a discussion thread for quickly releasing 2.5.2 for the 2.5 release line, with hadoop3 binaries. Please shout if you have any ideas.

Thanks.

张铎(Duo Zhang) wrote on Mon, Oct 24, 2022 at 12:27:
> HBASE-27434 has landed on branch-2.5+. Branch-2.4 does not have a
> flatten plugin, so HBASE-27434 is not applied to it.
>
> Filed HBASE-27442 for changing the way versions are bumped in the
> release scripts.
>
> After this change, let's finally go back to HBASE-27359 to make the
> release scripts publish different artifacts for hadoop2 and hadoop3.
>
> Thanks.
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
HBASE-27434 has landed on branch-2.5+. Branch-2.4 does not have a flatten plugin, so HBASE-27434 is not applied to it.

Filed HBASE-27442 for changing the way versions are bumped in the release scripts.

After this change, let's finally go back to HBASE-27359 to make the release scripts publish different artifacts for hadoop2 and hadoop3.

Thanks.

Andrew Purtell wrote on Wed, Oct 19, 2022 at 23:36:
> Suggestions:
>
> - For HBase 2.x releases, we should continue to publish default builds,
> those without any -hadoop3- or -widgetfoo- modifiers, against Hadoop 2.
>
> - For HBase 3, it makes sense to move the default to Hadoop 3, no other
> build variants needed there. This is the kind of thing a major version
> increment allows us to do per our dependency compatibility guidelines.
>
> - While eventually it may be necessary to differentiate between minor
> release lines of Hadoop, it would be simpler to pick one Hadoop 3 version,
> like 3.3.4, and build and publish a -hadoop3- artifact for each currently
> releasing 2.x code line: 2.4.15-hadoop3, 2.5.2-hadoop3, 2.6.0-hadoop3.
>
> - The process of building releases is automated by create-release, which
> all RMs use now. create-release automates the process of building and
> signing tarballs and publishing to Nexus. There should be no significant
> new burden on the RM, beyond an increase in time for create-release
> execution, to parameterize it and iterate over one or more variant builds.
> That is a long way of suggesting we do publish variant tarballs too; they
> are almost "for free" if we've gone to the trouble to build for publishing
> to Nexus.
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
Suggestions:

- For HBase 2.x releases, we should continue to publish default builds, those without any -hadoop3- or -widgetfoo- modifiers, against Hadoop 2.

- For HBase 3, it makes sense to move the default to Hadoop 3, no other build variants needed there. This is the kind of thing a major version increment allows us to do per our dependency compatibility guidelines.

- While eventually it may be necessary to differentiate between minor release lines of Hadoop, it would be simpler to pick one Hadoop 3 version, like 3.3.4, and build and publish a -hadoop3- artifact for each currently releasing 2.x code line: 2.4.15-hadoop3, 2.5.2-hadoop3, 2.6.0-hadoop3.

- The process of building releases is automated by create-release, which all RMs use now. create-release automates the process of building and signing tarballs and publishing to Nexus. There should be no significant new burden on the RM, beyond an increase in time for create-release execution, to parameterize it and iterate over one or more variant builds. That is a long way of suggesting we do publish variant tarballs too; they are almost "for free" if we've gone to the trouble to build for publishing to Nexus.

On Wed, Oct 19, 2022 at 12:52 AM 张铎(Duo Zhang) wrote:
> After some investigating, I think using the $revision placeholder can
> solve the problem here, i.e., using different command lines to publish
> different artifacts for hadoop2 and hadoop3 from the same source code.
> You can see the comment on HBASE-27359 for more details.
>
> Next I will open an issue to land the $revision change. But first, I
> think we need to discuss how many new artifacts we want to publish.
> For example, for 2.6.0, do we only want to publish a 2.6.0-hadoop3,
> with the default hadoop3 version? Or do we publish 2.6.0-hadoop3.2 and
> 2.6.0-hadoop3.3 for different hadoop minor release lines? And do we
> want to publish different tarballs for hadoop2 and hadoop3?
>
> Thanks.
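The split between a default build against Hadoop 2 and a -hadoop3 variant rests on Maven build profiles selected by a property; this thread refers to that selector as the hadoop.profile variable. A simplified sketch of property-based profile activation is shown below. The profile ids, the 3.0 selector value, and the example.hadoop.version property are illustrative assumptions and are not copied from the HBase POM; the Hadoop versions are the ones mentioned in this thread.

```xml
<!-- Fragment of a pom.xml (simplified sketch, names are hypothetical) -->
<profiles>
  <!-- Default: active whenever -Dhadoop.profile is NOT given -->
  <profile>
    <id>hadoop-2.0</id>
    <activation>
      <property>
        <name>!hadoop.profile</name>
      </property>
    </activation>
    <properties>
      <example.hadoop.version>2.10.2</example.hadoop.version>
    </properties>
  </profile>

  <!-- Variant: active only when -Dhadoop.profile=3.0 is given -->
  <profile>
    <id>hadoop-3.0</id>
    <activation>
      <property>
        <name>hadoop.profile</name>
        <value>3.0</value>
      </property>
    </activation>
    <properties>
      <example.hadoop.version>3.3.4</example.hadoop.version>
    </properties>
  </profile>
</profiles>
```

The weakness of this layout for published artifacts is that activation happens in whichever build reads the POM. A downstream project that does not set the selector property gets the default profile, which is why flattening the POMs before deployment was proposed for the -hadoop3 variant.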
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
Filed HBASE-27434 for landing the '${revision}' change.

张铎(Duo Zhang) wrote on Wed, Oct 19, 2022 at 15:52:
> After some investigating, I think using the $revision placeholder can
> solve the problem here, i.e., using different command lines to publish
> different artifacts for hadoop2 and hadoop3 from the same source code.
> You can see the comment on HBASE-27359 for more details.
>
> Next I will open an issue to land the $revision change. But first, I
> think we need to discuss how many new artifacts we want to publish.
> For example, for 2.6.0, do we only want to publish a 2.6.0-hadoop3,
> with the default hadoop3 version? Or do we publish 2.6.0-hadoop3.2 and
> 2.6.0-hadoop3.3 for different hadoop minor release lines? And do we
> want to publish different tarballs for hadoop2 and hadoop3?
>
> Thanks.
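For readers unfamiliar with the '${revision}' approach, the sketch below shows the general Maven "CI friendly versions" pattern combined with the flatten plugin. It is an illustration of the technique under stated assumptions, not the actual HBASE-27434 change: the org.example coordinates, the default version value, and the plugin version are placeholders.

```xml
<!-- Parent pom.xml (sketch of the CI-friendly version pattern) -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.example</groupId>           <!-- placeholder coordinates -->
  <artifactId>example-parent</artifactId>
  <version>${revision}</version>           <!-- resolved at build time -->
  <packaging>pom</packaging>

  <properties>
    <!-- Default value; override with -Drevision=... on the command line -->
    <revision>2.6.0-SNAPSHOT</revision>
  </properties>

  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>flatten-maven-plugin</artifactId>
        <version>1.3.0</version>           <!-- example version only -->
        <configuration>
          <updatePomFile>true</updatePomFile>
          <flattenMode>resolveCiFriendliesOnly</flattenMode>
        </configuration>
        <executions>
          <!-- Write a POM with ${revision} replaced by its actual value -->
          <execution>
            <id>flatten</id>
            <phase>process-resources</phase>
            <goals><goal>flatten</goal></goals>
          </execution>
          <!-- Remove the generated flattened POM on "mvn clean" -->
          <execution>
            <id>flatten.clean</id>
            <phase>clean</phase>
            <goals><goal>clean</goal></goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```

Under this pattern, the same source tree can be built once as is and once with something like -Drevision=2.6.0-hadoop3 plus whatever property selects the Hadoop 3 profile, producing two differently versioned sets of artifacts. The flatten step matters because the installed and deployed POMs must carry a concrete version, not the unresolved placeholder.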
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
After some investigation, I think using the $revision placeholder can solve the problem here, i.e., we can use different command lines to publish different artifacts for hadoop2 and hadoop3 from the same source code. You can see the comment on HBASE-27359 for more details.

Next I will open an issue to land the $revision change.

And here, I think first we need to discuss how many new artifacts we want to publish. For example, for 2.6.0, do we only want to publish a 2.6.0-hadoop3, with the default hadoop3 version? Or do we publish 2.6.0-hadoop3.2 and 2.6.0-hadoop3.3 for the different hadoop minor release lines? And do we want to publish different tarballs for hadoop2 and hadoop3?

Thanks.

Andrew Purtell wrote on Wed, Aug 31, 2022 at 00:19:
> I also don't think we should change the defaults in branch-2 until Hadoop 2 is EOLed.
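As a rough sketch of the $revision idea in the message above (not the actual HBase POM; see HBASE-27359 for the real discussion): Maven 3.5.0 and later accept a `${revision}` placeholder in the project version, so one source tree could in principle be published under two versions by changing only the command line. The coordinates, the default value, and the `-Dhadoop.profile=3.0` switch below are illustrative assumptions.

```xml
<!-- Hypothetical parent POM fragment; not copied from HBase. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.example</groupId>
  <artifactId>example-parent</artifactId>
  <!-- Resolved at build time; can be overridden with -Drevision=... -->
  <version>${revision}</version>
  <packaging>pom</packaging>

  <properties>
    <!-- Default used when -Drevision is not given (the hadoop2 build). -->
    <revision>2.6.0</revision>
  </properties>
</project>

<!--
  Two publishing runs from the same source tree (assumed flags):
    mvn deploy
    mvn deploy -Drevision=2.6.0-hadoop3 -Dhadoop.profile=3.0
-->
```

With `${revision}` in the version, the POMs that get installed or deployed still contain the placeholder unless they are flattened first, which is where the flatten plugin discussed elsewhere in this thread comes in.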
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
I also don't think we should change the defaults in branch-2 until Hadoop 2 is EOLed.

On Mon, Aug 29, 2022 at 10:22 AM Sean Busbey wrote:
> I think changing the default hadoop profile for builds in branch-2 would unnecessarily complicate our compatibility messaging so long as Hadoop 2 hasn't gone EOL.
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
FYI, there's some experimentation reported on https://issues.apache.org/jira/browse/HBASE-27340.

On Mon, Aug 29, 2022 at 7:22 PM Sean Busbey wrote:
> I think changing the default hadoop profile for builds in branch-2 would unnecessarily complicate our compatibility messaging so long as Hadoop 2 hasn't gone EOL.
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
I think changing the default hadoop profile for builds in branch-2 would unnecessarily complicate our compatibility messaging so long as Hadoop 2 hasn't gone EOL.

On Mon, Aug 29, 2022 at 5:30 AM Nick Dimiduk wrote:
> Should we also make hadoop3 the default active profile for branch-2 going forward?
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
Should we also make hadoop3 the default active profile for branch-2 going forward?

On Fri, Aug 26, 2022 at 5:25 PM Andrew Purtell wrote:
> The security posture of Hadoop 2 in general is a problem, because maintenance on that branch is spotty; that is just how it goes. We had the same situation with our now EOL branch-1. I know Hadoop released 2.10.2 to address some CVE-worthy problems, but it is unclear if 2.10.2 addresses all known issues, unlike 3.3.4. Also, as you know, Hadoop 2 has unpatchable dependencies on org.codehaus versions of Jackson and Jetty, which themselves have high-scoring CVEs that will never be fixed because they are EOL, and other similar issues. Hadoop 3 doesn’t completely solve such problems but is the only realistic place we can hope they can be addressed as required. For organizations that implement or require a top-to-bottom security audit of their software bill of materials, it seems possible to avoid user pain by providing supported convenience artifacts *and* libraries built against Hadoop 3 APIs in the Apache repository addressable with a Maven classifier.
>
> My employer has some interests in this area that align so I would like to sponsor (implement, review, commit, RM backfill releases, etc.) this work. Would there be any objections? Read through the thread for some thoughts on approach. Summarized:
>
> - Amend create-release to build, stage, and deploy a -hadoop3 variant build by activating the Hadoop 3 build profile.
>
> - Amend the Hadoop 3 build profile to flatten POMs before deployment to resolve potential downstream issues due to Hadoop 3 being a non-default build profile. (This could also be applied to all builds.)
>
> - Amend hbase-vote to be aware of and evaluate -hadoop3 variant artifacts, if present.
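For context on what "default active profile" means in the question above, here is a minimal sketch of property-based profile activation in a Maven POM. The `hadoop.profile` property is the one named in this thread; the profile ids and version numbers are assumptions for illustration, and the fragment is not the actual HBase POM.

```xml
<!-- Illustrative fragment; profile ids and versions are assumptions. -->
<profiles>
  <!-- Active when -Dhadoop.profile is NOT set: the default build. -->
  <profile>
    <id>hadoop-2.0</id>
    <activation>
      <property>
        <name>!hadoop.profile</name>
      </property>
    </activation>
    <properties>
      <hadoop.version>2.10.2</hadoop.version>
    </properties>
  </profile>

  <!-- Active only when the build is run with -Dhadoop.profile=3.0. -->
  <profile>
    <id>hadoop-3.0</id>
    <activation>
      <property>
        <name>hadoop.profile</name>
        <value>3.0</value>
      </property>
    </activation>
    <properties>
      <hadoop.version>3.3.4</hadoop.version>
    </properties>
  </profile>
</profiles>
```

In a layout like this, making hadoop3 the default would mean swapping which profile activates when the property is absent, and that changes what every downstream build resolves unless it sets the property explicitly.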
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
The security posture of Hadoop 2 in general is a problem, because maintenance on that branch is spotty; that is just how it goes. We had the same situation with our now EOL branch-1. I know Hadoop released 2.10.2 to address some CVE-worthy problems, but it is unclear if 2.10.2 addresses all known issues, unlike 3.3.4. Also, as you know, Hadoop 2 has unpatchable dependencies on org.codehaus versions of Jackson and Jetty, which themselves have high-scoring CVEs that will never be fixed because they are EOL, and other similar issues. Hadoop 3 doesn’t completely solve such problems but is the only realistic place we can hope they can be addressed as required. For organizations that implement or require a top-to-bottom security audit of their software bill of materials, it seems possible to avoid user pain by providing supported convenience artifacts *and* libraries built against Hadoop 3 APIs in the Apache repository addressable with a Maven classifier.

My employer has some interests in this area that align so I would like to sponsor (implement, review, commit, RM backfill releases, etc.) this work. Would there be any objections? Read through the thread for some thoughts on approach. Summarized:

- Amend create-release to build, stage, and deploy a -hadoop3 variant build by activating the Hadoop 3 build profile.

- Amend the Hadoop 3 build profile to flatten POMs before deployment to resolve potential downstream issues due to Hadoop 3 being a non-default build profile. (This could also be applied to all builds.)

- Amend hbase-vote to be aware of and evaluate -hadoop3 variant artifacts, if present.

> On Aug 25, 2022, at 10:40 AM, Andrew Purtell wrote:
>
> Thanks, that would work.
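A minimal sketch of how a downstream project such as Phoenix might address a Hadoop 3 build through a Maven classifier, as proposed in the message above. The `hadoop3` classifier and the version number are hypothetical; nothing in this message confirms that artifacts are published this way.

```xml
<!-- Hypothetical downstream POM fragment. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>2.5.0</version>
  <!-- Selects the variant jar built against Hadoop 3, if one is published. -->
  <classifier>hadoop3</classifier>
</dependency>
```

One design point worth keeping in mind: a classifier selects a different jar under the same coordinates, so both variants share a single published POM and therefore a single declared dependency list.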
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
Thanks, that would work.

> On Aug 25, 2022, at 11:35 AM, Sean Busbey wrote:
>
> yes, the flatten plugin. We use it in hbase-connectors already.
>
> https://www.mojohaus.org/flatten-maven-plugin/
>
> this sounds like it could also be a use case for BOMs, which would also benefit users of our client artifacts that use build tools that don't respect maven profiles generally, like gradle.
>
>> On Thu, Aug 25, 2022 at 10:30 AM Andrew Purtell wrote:
>>
>> Thinking about this a bit more, we will have an issue in that the POMs published from our -hadoop3 build will not have a default activation of our Hadoop 3 build profile. The convenience binaries will function as expected but Maven will read and process e.g. Phoenix POMs, then download and perform substitutions on HBase POMs, and so on, so downstreamers like Phoenix will have to set up the hadoop.profile variable for us in their default build profile or else the transitive paths through us may be wrong. I wonder if there is a Maven plugin available for deploying POMs with all variable substitutions performed before deployment, that would solve that problem and all conceivable related issues.
>>
>>> On Aug 25, 2022, at 11:03 AM, Andrew Purtell wrote:
>>>
>>> I think 2.x is going to have a few years of life remaining so it would be best, if we are going to address this, to have a 2.x solution as well as a 3.x one.
>>>
>>> In my opinion we can continue to publish 2.4 and 2.5 (and 2.6) unchanged and then also introduce a Hadoop 3 release using “hadoop3” or similar as Maven classifier. Phoenix could specify this classifier in their POMs. Everyone should be happy. Users who already are comfortable with the Hadoop 2 default don’t have to change anything. A one time POM change on the Phoenix side is required but that’s it.
>>>
>>> The additional build time complexity for generating two releases can be incorporated into create-release. Nobody does manual releases any more as far as I know. Likewise, download and verification of -hadoop3 convenience binaries can be added to hbase-vote. I believe we are all using that tool for verification of releases now. After these one time changes are landed the cost for RMs and PMC will be only in a roughly doubled amount of time needed to build and verify releases.
>>>
>>>> On Aug 17, 2022, at 9:06 AM, Nick Dimiduk wrote:
>>>>
>>>> Hi Geoffrey,
>>>>
>>>> I have no complaints with shipping convenience binaries built against both Hadoop2 and Hadoop3. The primary challenge is implementing the necessary build changes, the secondary challenge is verifying/testing it works reliably.
>>>>
>>>> But for Phoenix, are you asking for convenience binaries, or are you asking for artifacts published into maven that have the Hadoop3 profile activated and specify the associated dependencies?
>>>>
>>>> I'm afraid that the 2.5.0 release ship has already sailed. I've heard talk of a 2.6 "fast-follow", so maybe someone can have the build changes ready for that? Also, isn't this a too little, too late situation? Shouldn't we shift our focus to releasing 3.0, which has dropped support for Hadoop2?
>>>>
>>>> Thanks,
>>>> Nick
>>>>
>>>>> On Tue, Aug 16, 2022 at 9:30 PM Geoffrey Jacoby wrote:
>>>>>
>>>>> I see that the next HBase 2.5 RC is imminent, and before that's set in stone, I wanted to bring up the question of whether there will be official HBase 2.5 binaries built with the Hadoop 3 profile and available in the usual Maven repositories. (In addition to the usual Hadoop 2 profile binaries)
>>>>>
>>>>> The HBase 2.x line has a commitment to maintain support for Hadoop 2.x, but Hadoop 3.3 is the current stable Hadoop line and the most recent release notes [1] encourage all users of Hadoop 2.x to upgrade to Hadoop 3.
>>>>>
>>>>> Without convenience artifacts built against Hadoop 3, no end-users with Hadoop 3 clusters will be able to use the Apache-distributed binaries and will instead have to recompile HBase from source themselves, or use a 3rd party distribution that does so for them.
>>>>>
>>>>> This is especially inconvenient for downstream projects such as Apache Phoenix, which has never officially supported the HBase 2.x / Hadoop 2.10 combination. (It currently supports only HBase 2.3 or 2.4 with Hadoop 3. HBase 2.5 support will be added very shortly after its release as part of Phoenix 5.2.)
>>>>>
>>>>> To even run the Phoenix IT tests locally requires contributors to download the HBase source release and manually mvn install to their local maven repo using the Hadoop 3 profile, to avoid crashes in the HBase minicluster.[2] This is a barrier to new contributors and confuses even veteran ones, and has to be done again for every new HBase
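A minimal sketch of the flatten-maven-plugin setup referred to in the quoted discussion above, following the usage pattern in the plugin's documentation. The version number and the parameter choices are assumptions for illustration, not the configuration used in hbase-connectors or HBase.

```xml
<!-- Illustrative build fragment; version and parameters are assumptions. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>flatten-maven-plugin</artifactId>
      <version>1.3.0</version>
      <configuration>
        <!-- Use the flattened POM for pom-packaged modules as well. -->
        <updatePomFile>true</updatePomFile>
        <!-- Embed dependencies contributed by profiles that are active
             in this build (for example a Hadoop 3 profile). -->
        <embedBuildProfileDependencies>true</embedBuildProfileDependencies>
      </configuration>
      <executions>
        <!-- Write the flattened POM early in the build. -->
        <execution>
          <id>flatten</id>
          <phase>process-resources</phase>
          <goals><goal>flatten</goal></goals>
        </execution>
        <!-- Remove the generated file on "mvn clean". -->
        <execution>
          <id>flatten.clean</id>
          <phase>clean</phase>
          <goals><goal>clean</goal></goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

The intent is that the POM which reaches the repository already has its variables substituted and the active profile's dependencies written out, so a downstream build would not need to set `hadoop.profile` itself.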
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
Yes, the flatten plugin. We already use it in hbase-connectors.

https://www.mojohaus.org/flatten-maven-plugin/

This sounds like it could also be a use case for BOMs, which would also benefit users of our client artifacts whose build tools, such as Gradle, generally don't respect Maven profiles.

On Thu, Aug 25, 2022 at 10:30 AM Andrew Purtell wrote:
> Thinking about this a bit more, we will have an issue in that the POMs
> published from our -hadoop3 build will not have a default activation of our
> Hadoop 3 build profile. The convenience binaries will function as expected
> but Maven will read and process eg Phoenix POMs, then download and perform
> substitutions on HBase POMs, and then etc, so downstreamers like Phoenix
> will have to set up the hadoop.profile variable for us in their default
> build profile or else the transitive paths through us may be wrong. I
> wonder if there is a Maven plugin available for deploying POMs with all
> variable substitutions performed before deployment, that would solve that
> problem and all conceivable related issues.
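For reference, a minimal sketch of how the flatten plugin is typically wired into a parent POM so that the POM that gets installed and deployed is a generated, flattened one rather than the source POM. The plugin coordinates, the flatten and clean goals, and the updatePomFile parameter are the plugin's own. The version number and the phase bindings shown here are assumptions for illustration, not necessarily what hbase-connectors uses, and how the default flatten mode treats profiles should be checked against the plugin documentation before relying on it.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>flatten-maven-plugin</artifactId>
      <!-- Version is an assumption; pin whatever the project has validated. -->
      <version>1.3.0</version>
      <configuration>
        <!-- Also swap in the flattened POM for modules with pom packaging. -->
        <updatePomFile>true</updatePomFile>
      </configuration>
      <executions>
        <!-- Generate the flattened POM early in the build. -->
        <execution>
          <id>flatten</id>
          <phase>process-resources</phase>
          <goals>
            <goal>flatten</goal>
          </goals>
        </execution>
        <!-- Remove the generated file on mvn clean. -->
        <execution>
          <id>flatten.clean</id>
          <phase>clean</phase>
          <goals>
            <goal>clean</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

With something like this in place, downstream builds would read the flattened POM from the repository, which is the property Andrew was asking for in the quoted message.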
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
Thinking about this a bit more, we will have an issue: the POMs published from our -hadoop3 build will not activate our Hadoop 3 build profile by default. The convenience binaries will function as expected, but Maven will read and process, e.g., the Phoenix POMs, then download and perform substitutions on the HBase POMs, and so on. Downstreamers like Phoenix will therefore have to set the hadoop.profile variable for us in their default build profile, or else the transitive paths through us may be wrong.

I wonder if there is a Maven plugin for deploying POMs with all variable substitutions performed before deployment. That would solve this problem and all conceivable related issues.
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
I think 2.x is going to have a few years of life remaining, so it would be best, if we are going to address this, to have a 2.x solution as well as a 3.x one.

In my opinion we can continue to publish 2.4 and 2.5 (and 2.6) unchanged and then also introduce a Hadoop 3 release using “hadoop3” or similar as a Maven classifier. Phoenix could specify this classifier in their POMs. Everyone should be happy. Users who are already comfortable with the Hadoop 2 default don’t have to change anything. A one-time POM change on the Phoenix side is required, but that’s it.

The additional build-time complexity of generating two releases can be incorporated into create-release. Nobody does manual releases any more as far as I know. Likewise, download and verification of -hadoop3 convenience binaries can be added to hbase-vote. I believe we are all using that tool for verification of releases now. After these one-time changes land, the only cost for RMs and the PMC will be a roughly doubled amount of time needed to build and verify releases.
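To make the classifier proposal concrete, this is roughly what the one-time change in a downstream POM such as Phoenix's could look like. It is only a sketch of the proposal: hbase-client and version 2.5.0 are picked for illustration, and "hadoop3" is the classifier name floated in this message, not a coordinate this message says is published.

```xml
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <!-- Illustrative version; use the release line actually being targeted. -->
  <version>2.5.0</version>
  <!-- Hypothetical classifier from the proposal; omitting it would keep
       the default Hadoop 2 build. -->
  <classifier>hadoop3</classifier>
</dependency>
```

Users of the default Hadoop 2 artifacts would leave the classifier element out and see no change.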
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
Hi Geoffrey,

I have no complaints with shipping convenience binaries built against both Hadoop2 and Hadoop3. The primary challenge is implementing the necessary build changes; the secondary challenge is verifying and testing that it works reliably.

But for Phoenix, are you asking for convenience binaries, or are you asking for artifacts published into Maven that have the Hadoop3 profile activated and specify the associated dependencies?

I'm afraid that the 2.5.0 release ship has already sailed. I've heard talk of a 2.6 "fast-follow", so maybe someone can have the build changes ready for that? Also, isn't this a too little, too late situation? Shouldn't we shift our focus to releasing 3.0, which has dropped support for Hadoop2?

Thanks,
Nick
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
Ah, one piece of good news: for branch-2, I tried locally to build with the hadoop 2 profile (the hadoop version is 2.10.2), then removed all the hadoop 2.10.2 jars from the binary and copied all the hadoop 3.3.4 jars in their place. Starting a mini cluster is fine, hbase shell is fine, the LTT tool is fine, and scanning the LTT result in the shell is also fine.

So maybe upgrading to 2.10.2 is enough for hbase to maintain drop-in replacement. You could try running the Phoenix IT tests to see if it works.

Thanks.
Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts
In general, it is fine to set up an hbase cluster that ships with the hadoop 2.x client library against a hadoop 3.x hdfs and yarn cluster. The wire communication is still compatible. That's why we still keep the client library at 2.x and do not do what we did in the 0.98 days, when we published two binaries for hadoop1 and hadoop2.

So I do not understand why 'no end-users with Hadoop 3 clusters will be able to use the Apache-distributed binaries'.

As for the Phoenix IT tests, that is a problem, since HBase does not support drop-in replacement of the hadoop libraries. There are several possible directions:
1. Support drop-in replacement.
2. Publish two binaries with hadoop2 and hadoop3, and also publish two maven dependencies with hadoop2 and hadoop3.
3. Only publish hadoop3 binaries and maven dependencies.

The problems with these directions:
1. Not sure if this is easy to implement...
2. Will increase the complexity when users just want to use hbase.
3. May have compatibility issues when using the hadoop3 libraries against a hadoop 2 cluster.

Thanks.
[DISCUSS] HBase 2.5 / Hadoop 3 artifacts
I see that the next HBase 2.5 RC is imminent, and before that's set in stone, I wanted to bring up the question of whether there will be official HBase 2.5 binaries built with the Hadoop 3 profile and available in the usual Maven repositories. (In addition to the usual Hadoop 2 profile binaries)

The HBase 2.x line has a commitment to maintain support for Hadoop 2.x, but Hadoop 3.3 is the current stable Hadoop line and the most recent release notes [1] encourage all users of Hadoop 2.x to upgrade to Hadoop 3.

Without convenience artifacts built against Hadoop 3, no end-users with Hadoop 3 clusters will be able to use the Apache-distributed binaries and will instead have to recompile HBase from source themselves, or use a 3rd party distribution that does so for them.

This is especially inconvenient for downstream projects such as Apache Phoenix, which has never officially supported the HBase 2.x / Hadoop 2.10 combination. (It currently supports only HBase 2.3 or 2.4 with Hadoop 3. HBase 2.5 support will be added very shortly after its release as part of Phoenix 5.2.)

To even run the Phoenix IT tests locally requires contributors to download the HBase source release and manually mvn install to their local maven repo using the Hadoop 3 profile, to avoid crashes in the HBase minicluster.[2] This is a barrier to new contributors and confuses even veteran ones, and has to be done again for every new HBase release.

In general, I expect the Hadoop 3 user base to grow and the Hadoop 2.10 user base to shrink with every future HBase 2 release, so I think this is a worthwhile improvement.

Thanks,

Geoffrey

[1] https://hadoop.apache.org/release/3.3.4.html
[2] https://github.com/apache/phoenix/blob/master/BUILDING.md