Re: [VOTE] Release Apache Hadoop 3.4.0 RC0

2024-01-16 Thread Xiaoqiao He
Hi Shilun,

Thanks for your great work!

Comparing branch-3.4 and branch-3.4.0, I found that commit [1]
is only checked in to branch-3.4.0 but not to branch-3.4. Please double
check whether we need to backport it to branch-3.4.

Moreover, before the 3.4.0 release we should cherry-pick from trunk
to branch-3.4 first if necessary, then to branch-3.4.0.

Thanks again.

Best Regards,
- He Xiaoqiao

[1]
https://github.com/apache/hadoop/commit/5e30d28d7e524dd3eed179d81c38e2eb82ae4673
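
For anyone who wants to repeat the branch comparison, here is a minimal sketch in Python. It assumes a local clone with both branches fetched and relies on `git cherry`, which marks with `+` the commits on one branch that have no equivalent change on the other. The function names are illustrative only and are not part of any Hadoop tooling.

```python
import subprocess


def parse_git_cherry(output: str) -> list[str]:
    """Return the commit ids that `git cherry` marks with '+', i.e. commits
    on the head branch with no equivalent change on the upstream branch."""
    missing = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "+":
            missing.append(parts[1])
    return missing


def commits_only_on(head: str, upstream: str, repo: str = ".") -> list[str]:
    """Run `git cherry <upstream> <head>` inside `repo` and parse the result."""
    result = subprocess.run(
        ["git", "cherry", upstream, head],
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_git_cherry(result.stdout)
```

In a Hadoop checkout, `commits_only_on("branch-3.4.0", "branch-3.4")` would list the commits that are candidates for a backport to branch-3.4.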

On Wed, Jan 17, 2024 at 9:19 AM slfan1989  wrote:

> Hi, all
>
> Since the hadoop-3.4.0-RC0 vote, I have received valuable feedback. I
> encountered some issues during the preparation of hadoop-3.4.0-RC0, and I
> will address these issues in hadoop-3.4.0-RC1.
>
> The voting for RC0 will be closed. After the release of RC1, I will invite
> members of the community to review and vote once again.
>
> Thank you all once again for your support!
>
> Best Regards,
> Shilun Fan.
>
> On Wed, Jan 17, 2024 at 9:03 AM slfan1989  wrote:
>
> > Thank you very much for the response!
> >
> > The content is very comprehensive and valuable.
> >
> > I will prepare hadoop-3.4.0-RC1 according to the instructions provided by
> > you, and after RC1 is packaged, I will use validate-hadoop-client-artifacts
> > for validation.
> >
> > Best Regards,
> > Shilun Fan.
> >
> > On Tue, Jan 16, 2024 at 12:34 AM Steve Loughran
> >  wrote:
> >
> >> -1 I'm afraid, just due to staging/packaging issues.
> >>
> >> This took me a few goes to get right myself, so nothing unusual.
> >>
> >> Note I used my validator project which is set to retrieve binaries, check
> >> signatures, run maven builds against staged artifacts *and clean up any
> >> local copies first* and more.
> >>
> >> This uses apache ant to manage all this:
> >>
> >> https://github.com/steveloughran/validate-hadoop-client-artifacts
> >>
> >> Here's the initial build.properties file I used to try and manage this
> >>
> >> ## build.properties:
> >> hadoop.version=3.4.0
> >> rc=RC0
> >> amd.src.dir=https://home.apache.org/~slfan1989/hadoop-3.4.0-RC0-amd64/
> >> http.source=https://home.apache.org/~slfan1989/hadoop-3.4.0-RC0-amd64
> >>
> >> release=hadoop-${hadoop.version}-RC0
> >> rc.dirname=${release}
> >> release.native.binaries=false
> >> git.commit.id=cdb8af4f22ec
> >> nexus.staging.url=https://repository.apache.org/content/repositories/orgapachehadoop-1391/
> >> hadoop.source.dir=${local.dev.dir}/hadoop-trunk
> >> ##
> >>
> >> When I did my own builds, all the artifacts created were without the RC0
> >> suffix. It is critical that this happens, because the .sha512 checksums
> >> include the artifact name:
> >>
> >> > cat hadoop-3.4.0-RC0.tar.gz.sha512
> >> SHA512 (hadoop-3.4.0-RC0.tar.gz) =
> >> e50e68aecb36867c610db8309ccd3aae812184da21354b50d2a461b29c73f21d097fb27372c73c150e1c035003bb99a61c64db26c090fe0fb9e7ed6041722eab
> >>
> >>
> >> Maven artifacts: staging problems
> >>
> >> Couldn't build with a -Pstaging profile as the staging repository wasn't
> >> yet closed -I tried to do that myself.
> >>
> >> This failed with some rule problem
> >>
> >> Event: Failed: Checksum Validation
> >> Monday, January 15, 2024 14:37:13 GMT (GMT+)
> >> typeId checksum-staging
> >> failureMessage INVALID SHA-1:
> >> '/org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.4.0/hadoop-mapreduce-client-jobclient-3.4.0-tests.jar.sha1'
> >> failureMessage Requires one-of SHA-1:
> >> /org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.4.0/hadoop-mapreduce-client-jobclient-3.4.0-tests.jar.sha1,
> >> SHA-256:
> >> /org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.4.0/hadoop-mapreduce-client-jobclient-3.4.0-tests.jar.sha256,
> >> SHA-512:
> >> /org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.4.0/hadoop-mapreduce-client-jobclient-3.4.0-tests.jar.sha512
> >>
> >> I don't know precisely what this means...my guess is that the upload didn't
> >> include everything.
> >>
> >> Note my client-validator module can check this; just run its maven test
> >> commands
> >>
> >> mvn clean test -U -P3.4 -Pstaging
> >>
> >> GPG signing: all good.
> >>
> >> Picked your key up from the site ( ant gpg.keys ) ... first validation with
> >> ant gpg.verify was unhappy as your key wasn't trusted. I've signed it and
> >> pushed that signature up, so people who trust me get some reassurance about
> >> you.
> >>
> >> My build then failed as the gpg code couldn't find the
> >> hadoop-3.4.0-aarch64.tar.gz.asc
> >>
> >> The problem here is that although we want separate arm and x86 tar files,
> >> we don't really want separate binaries as it only creates different jars in
> >> the wild.
> >>
> >> The way I addressed that was after creating that x86 release on an ec2 vm
> >> and downloading it, I then did a local arm64 build and then created an arm
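
The point in the quoted message about the .sha512 files naming the artifact can be checked mechanically. Below is a minimal Python sketch, assuming the BSD-style layout shown in the quoted `cat` output (`SHA512 (name) = digest`, with the digest possibly wrapped onto its own line). The function names are hypothetical and this is not the code used by validate-hadoop-client-artifacts.

```python
import hashlib
import re

# BSD-style checksum line, e.g. "SHA512 (hadoop-3.4.0.tar.gz) = <128 hex chars>"
_BSD_LINE = re.compile(
    r"^SHA512 \((?P<name>.+)\)\s*=\s*(?P<digest>[0-9a-fA-F]{128})$"
)


def parse_sha512_file(text: str) -> tuple[str, str]:
    """Return (file name, lowercase hex digest) from a BSD-style .sha512 file.

    Whitespace is collapsed first because the digest may sit on its own line.
    """
    flat = " ".join(text.split())
    match = _BSD_LINE.match(flat)
    if match is None:
        raise ValueError("not a BSD-style SHA512 checksum line")
    return match.group("name"), match.group("digest").lower()


def verify(artifact_name: str, data: bytes, checksum_text: str) -> list[str]:
    """Return a list of problems; an empty list means the checksum file matches."""
    name, digest = parse_sha512_file(checksum_text)
    problems = []
    if name != artifact_name:
        problems.append(f"checksum names {name!r}, artifact is {artifact_name!r}")
    if hashlib.sha512(data).hexdigest() != digest:
        problems.append("SHA-512 digest mismatch")
    return problems
```

A tarball that is renamed after its checksum file was generated would show up here as a name problem even though the digest itself still matches.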
> 

Re: [VOTE] Release Apache Hadoop 3.4.0 RC0

2024-01-12 Thread slfan1989
Thank you for reporting this issue and I will follow up on HDFS-17129.

I acknowledge that I haven't checked for issues marked as critical/blocker
in RC0. I intend to complete this check before the release of RC1.

Thank you again for your valuable suggestions!

Best Regards
Shilun Fan.

On Sat, Jan 13, 2024 at 05:10, Ayush Saxena wrote:

> We should consider including
> https://issues.apache.org/jira/browse/HDFS-17129
>
> It looks like a misordering between IBR & FBR, which can potentially lead
> to strange issues. If the fix can’t be merged, we should revert the change
> that causes it.
>
> I think we should check for any ticket that has a target version or
> affects version of 3.4.0 and is marked critical/blocker before spinning up
> a new RC; I think I mentioned that somewhere before.
>
> -1, unless HDFS-17129 is a false alarm or we can prove it won't cause any
> issues. There is a comment saying a block was reported missing after the
> patch that induced it: [1]
>
> [1] https://github.com/apache/hadoop/pull/6244#issuecomment-1793981740
>
> -Ayush


Re: [VOTE] Release Apache Hadoop 3.4.0 RC0

2024-01-12 Thread Ayush Saxena
We should consider including
https://issues.apache.org/jira/browse/HDFS-17129

It looks like a misordering between IBR & FBR, which can potentially lead
to strange issues. If the fix can’t be merged, we should revert the change
that causes it.

I think we should check for any ticket that has a target version or
affects version of 3.4.0 and is marked critical/blocker before spinning up
a new RC; I think I mentioned that somewhere before.

-1, unless HDFS-17129 is a false alarm or we can prove it won't cause any
issues. There is a comment saying a block was reported missing after the
patch that induced it: [1]

[1] https://github.com/apache/hadoop/pull/6244#issuecomment-1793981740

-Ayush
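
The pre-RC check suggested above could be expressed as a JIRA query. The following Python sketch only builds the JQL string and a browser URL; it does not contact the server. It assumes that the ASF JIRA exposes a custom field named "Target Version/s" and that the four project keys listed are the ones to scan; both are assumptions to verify before relying on the result.

```python
from urllib.parse import urlencode

# Assumption: the JIRA project keys that make up a Hadoop release.
PROJECTS = ("HADOOP", "HDFS", "YARN", "MAPREDUCE")


def blocker_jql(version: str) -> str:
    """Build a JQL query for unresolved Blocker/Critical issues that
    target or affect the given version."""
    projects = ", ".join(PROJECTS)
    return (
        f"project in ({projects}) "
        f"AND priority in (Blocker, Critical) "
        f"AND resolution = Unresolved "
        # "Target Version/s" is assumed to be the custom field name.
        f'AND ("Target Version/s" = "{version}" OR affectedVersion = "{version}")'
    )


def search_url(version: str) -> str:
    """Return an issue-navigator URL that runs the query in a browser."""
    query = urlencode({"jql": blocker_jql(version)})
    return f"https://issues.apache.org/jira/issues/?{query}"
```

Opening `search_url("3.4.0")` before cutting an RC would show whether any such tickets are still open.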


On Fri, 12 Jan 2024 at 07:37, slfan1989  wrote:

> Thank you very much for your help in verifying this version! We will use
> 3.5.0 as the fix version in JIRA from now on.
>
> Best Regards,
> Shilun Fan.
>
>  > wonderful! I'll be testing over the weekend
>
>  > Meanwhile, new changes I'm putting in to trunk are tagged as fixed in 3.5.0
>  > -correct?
>
>  > steve


Re: [VOTE] Release Apache Hadoop 3.4.0 RC0

2024-01-11 Thread slfan1989
Thank you very much for your help in verifying this version! We will use
3.5.0 as the fix version in JIRA from now on.

Best Regards,
Shilun Fan.

 > wonderful! I'll be testing over the weekend

 > Meanwhile, new changes I'm putting in to trunk are tagged as fixed in 3.5.0
 > -correct?

 > steve


Re: [VOTE] Release Apache Hadoop 3.4.0 RC0

2024-01-11 Thread Steve Loughran
wonderful! I'll be testing over the weekend

Meanwhile, new changes I'm putting in to trunk are tagged as fixed in 3.5.0
-correct?

steve


Re: [VOTE] Release Apache Hadoop 3.4.0 RC0

2024-01-10 Thread Masatake Iwasaki

Thanks for driving this release, Shilun Fan.

The top page of site documentation (in hadoop-3.4.0-RC0-site.tar.gz) 
looks the same as 3.3.5.


While the index.md.vm is updated in branch-3.4.0[1], it seems not to be 
reflected.

release-3.4.0-RC0 tag should be pushed to make checking easier.

In addition, the description of the previous release's new features
should be removed from the index.md.vm.


[1] 
https://github.com/apache/hadoop/blob/branch-3.4.0/hadoop-project/src/site/markdown/index.md.vm


Masatake Iwasaki

On 2024/01/11 14:15, slfan1989 wrote:

Hello all,

We plan to release Hadoop 3.4.0 based on Hadoop trunk; this is the first
Hadoop 3.4.0 release candidate (RC0).

The RC is available at:
https://home.apache.org/~slfan1989/hadoop-3.4.0-RC0-amd64/ (for amd64)
https://home.apache.org/~slfan1989/hadoop-3.4.0-RC0-arm64/ (for arm64)

Maven artifacts were built on an x86 machine and are staged at
https://repository.apache.org/content/repositories/orgapachehadoop-1391/

My public key:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

Changelog:
https://home.apache.org/~slfan1989/hadoop-3.4.0-RC0-amd64/CHANGELOG.md

Release notes:
https://home.apache.org/~slfan1989/hadoop-3.4.0-RC0-amd64/RELEASENOTES.md

This is a relatively big release (by Hadoop standards) containing about 2852
commits.

Please give it a try, this RC vote will run for 7 days.

Feature highlights:

DataNode FsDatasetImpl Fine-Grained Locking via BlockPool

[HDFS-15180](https://issues.apache.org/jira/browse/HDFS-15180) splits the
FsDatasetImpl datasetLock by block pool to relieve heavy contention on the
FsDatasetImpl datasetLock when there are many namespaces in a large cluster.
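
As a rough illustration of the idea only (this is not the Hadoop implementation, which is written in Java), the Python sketch below replaces a single dataset-wide lock with one lock per block pool. All class names and the block pool ids are hypothetical.

```python
import threading


class BlockPool:
    """State for one block pool (namespace), guarded by its own lock."""

    def __init__(self) -> None:
        self.lock = threading.RLock()
        self.blocks: set[int] = set()


class Dataset:
    """Toy dataset whose operations lock only the block pool they touch."""

    def __init__(self) -> None:
        self._table_lock = threading.Lock()  # guards only the pool table
        self._pools: dict[str, BlockPool] = {}

    def _pool(self, block_pool_id: str) -> BlockPool:
        # Short critical section: look up or create the pool entry.
        with self._table_lock:
            pool = self._pools.get(block_pool_id)
            if pool is None:
                pool = BlockPool()
                self._pools[block_pool_id] = pool
            return pool

    def add_block(self, block_pool_id: str, block_id: int) -> None:
        pool = self._pool(block_pool_id)
        with pool.lock:  # contends only with work on the same block pool
            pool.blocks.add(block_id)

    def block_count(self, block_pool_id: str) -> int:
        pool = self._pool(block_pool_id)
        with pool.lock:
            return len(pool.blocks)
```

With this layout, threads working on different namespaces wait on each other only for the brief table lookup, not for the block operations themselves.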

YARN Federation improvements

[YARN-5597](https://issues.apache.org/jira/browse/YARN-5597) brings many
improvements, including the following:

1. YARN Router now boasts a full implementation of all relevant interfaces
including the ApplicationClientProtocol,
ResourceManagerAdministrationProtocol, and RMWebServiceProtocol.
2. Enhanced support for Application cleanup and automatic offline
mechanisms for SubCluster are now facilitated by the YARN Router.
3. Code optimization for Router and AMRMProxy was undertaken, coupled with
improvements to previously pending functionalities.
4. Audit logs and Metrics for Router received upgrades.
5. A boost in cluster security features was achieved, with the inclusion of
Kerberos support.
6. The page function of the router has been enhanced.

Upgrade AWS SDK to V2

[HADOOP-18073](https://issues.apache.org/jira/browse/HADOOP-18073)
The S3A connector now uses the V2 AWS SDK.  This is a significant change at
the source code level.
Any applications using the internal extension/override points in the
filesystem connector are likely to break.
Consult the document aws_sdk_upgrade for the full details.

hadoop-thirdparty will also provide the new RC0 soon.

Best Regards,
Shilun Fan.


