Thanks, Ayush, for finding this issue. It looks like there is a problem with
the ARM binaries. The likely cause is that the create-release script doesn't
work in the ARM Docker container, although the same steps work when run
manually after logging into the container, so I created the tars manually.
It seems that didn't work properly.
Two YARN issues forced me to create them manually:
https://issues.apache.org/jira/browse/YARN-11712
https://issues.apache.org/jira/browse/YARN-11713
So we have decided to remove the aarch* binaries from the 3.4.1 release and
aim to fix this in 3.4.2.
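
For anyone who wants to double-check a rebuilt tarball against the x86 one, here is a minimal sketch in Python. It is not part of create-release or hadoop-release-support; the function names and the file name in the usage note are purely illustrative. It lists the jar/war files that are present in one share/hadoop/yarn directory but missing from another:

```python
import os
import sys

# Only top-level archives are compared; subdirectories such as lib/ or
# sources/ are ignored because their names do not end in these suffixes.
ARCHIVE_SUFFIXES = (".jar", ".war")


def archive_names(entries):
    """Return the set of entry names that look like jar/war archives."""
    return {name for name in entries if name.endswith(ARCHIVE_SUFFIXES)}


def missing_archives(reference_entries, candidate_entries):
    """Return sorted archive names found in the reference listing only."""
    return sorted(archive_names(reference_entries) - archive_names(candidate_entries))


def compare_dirs(reference_dir, candidate_dir):
    """Compare two directories on disk (hypothetical helper for CLI use)."""
    return missing_archives(os.listdir(reference_dir), os.listdir(candidate_dir))


if __name__ == "__main__" and len(sys.argv) == 3:
    # argv[1]: directory from the known-good tar, argv[2]: directory to check
    for name in compare_dirs(sys.argv[1], sys.argv[2]):
        print(name)
```

Assuming the script is saved as compare_jars.py (a hypothetical name), it could be run as `python compare_jars.py x86/hadoop-3.4.1/share/hadoop/yarn aarch/hadoop-3.4.1/share/hadoop/yarn` against the two extracted trees; an empty output would mean no top-level jar is missing from the second directory.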




On Fri, Oct 18, 2024 at 1:41 PM Ayush Saxena <ayush...@gmail.com> wrote:

> Thanks, Mukund, for sharing the details; I might have missed that
> discussion. Do mention the purpose on the website once we upload the
> lean jar, so that relevant people can use it.
>
> I am holding my vote because of [3], which looks like a blocker to me,
> unless I have messed something up. It would be great if someone could
> help validate it; I believe it is the cause of [2].
>
> * Built from source
> * Verified checksums
> * Verified signatures
> * Validated no code diff b/w the git tag & src tar
> * Verified the output of `hadoop version` [1]
> * Checked the NOTICE & LICENSE file
> * Ran some basic HDFS shell commands
> * Ran some Erasure Coding related commands
> * Ran example jobs (TeraGen, TeraSort, WordCount) [2]
> * Browsed through the UI (NN, DN, RM, NM & JHS)
> * Skimmed over the contents of maven artifacts, Release Notes & ChangeLog
> * Discrepancy in contents of aarch64 tar & x86 tar [3]
>
> [1]
>
> The hadoop version output inside the aarch64 tar shows the source
> repository as Unknown, which isn't the case in the other tar, nor was it
> the case in the last 3.4.0 release. Not blocking, but we should figure it
> out & fix it in the next release.
>
> ```
> ayushsaxena@ayushsaxena hadoop-3.4.1 % bin/hadoop version
> Hadoop 3.4.1
> Source code repository Unknown -r Unknown
> Compiled by mthakur on 2024-10-10T08:24Z
> Compiled on platform linux-aarch_64
> Compiled with protoc 3.23.4
> From source with checksum 7292fe9dba5e2e44e3a9f763fce3e680
> ```
>
> [2] While trying to start ResourceManager via the ARM binary, it is
> failing for me:
> ```
> ayushsaxena@ayushsaxena hadoop-3.4.1 % bin/yarn resourcemanager
> Error: Could not find or load main class
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
> ```
> This didn't happen earlier, but I worked around it with
> "export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/Users/ayushsaxena/code/RC/hadoop-3.4.1/hadoop-3.4.1/share/hadoop/tools/lib/*".
> Since I am no YARN expert, I might have messed up the setup.
> My mapred settings have Tez-related stuff as well, but the same setup
> works if I use the x86 binary.
>
>
> [3] Investigating [2], I think the YARN jars are missing from the
> aarch64 share/hadoop/yarn directory & they are present in the
> share/hadoop/tools directory.
>
> In x86:
> ```
> ayushsaxena@ayushsaxena hadoop-3.4.1 % ls x86/hadoop-3.4.1/share/hadoop/yarn
> csi
> hadoop-yarn-api-3.4.1.jar
> hadoop-yarn-applications-catalog-webapp-3.4.1.war
> hadoop-yarn-applications-distributedshell-3.4.1.jar
> hadoop-yarn-applications-mawo-core-3.4.1.jar
> hadoop-yarn-applications-unmanaged-am-launcher-3.4.1.jar
> hadoop-yarn-client-3.4.1.jar
> hadoop-yarn-common-3.4.1.jar
> hadoop-yarn-registry-3.4.1.jar
> hadoop-yarn-server-applicationhistoryservice-3.4.1.jar
> hadoop-yarn-server-common-3.4.1.jar
> hadoop-yarn-server-globalpolicygenerator-3.4.1.jar
> hadoop-yarn-server-nodemanager-3.4.1.jar
> hadoop-yarn-server-resourcemanager-3.4.1.jar
> hadoop-yarn-server-router-3.4.1.jar
> hadoop-yarn-server-sharedcachemanager-3.4.1.jar
> hadoop-yarn-server-tests-3.4.1.jar
> hadoop-yarn-server-timeline-pluginstorage-3.4.1.jar
> hadoop-yarn-server-web-proxy-3.4.1.jar
> hadoop-yarn-services-api-3.4.1.jar
> hadoop-yarn-services-core-3.4.1.jar
> lib
> sources
> test
> timelineservice
> webapps
> yarn-service-examples
> ayushsaxena@ayushsaxena hadoop-3.4.1 %
>
> ```
>
> And in aarch64 they aren't there
>
> ```
> ayushsaxena@ayushsaxena hadoop-3.4.1 % ls -l aarch/hadoop-3.4.1/share/hadoop/yarn
> total 0
> drwxr-xr-x   3 ayushsaxena  staff    96 Oct 10 15:34 csi
> drwxr-xr-x  47 ayushsaxena  staff  1504 Oct 10 15:34 lib
> drwxr-xr-x  28 ayushsaxena  staff   896 Oct 10 15:34 sources
> drwxr-xr-x   3 ayushsaxena  staff    96 Oct 10 15:34 timelineservice
> drwxr-xr-x   3 ayushsaxena  staff    96 Oct 10 15:34 webapps
> drwxr-xr-x   6 ayushsaxena  staff   192 Oct 10 15:34 yarn-service-examples
> ayushsaxena@ayushsaxena hadoop-3.4.1 %
> ```
>
> Can someone help double-check this? I downloaded the 3.4.0 binary,
> and this is not the case there. I also did a local build on aarch64,
> and there too the jars are present in the share/hadoop/yarn dir.
>
> -Ayush
>
> On Fri, 18 Oct 2024 at 11:57, Mukund Madhav Thakur <mtha...@cloudera.com>
> wrote:
> >
> > Hi Ayush,
> > The "lean" tar is a smaller tar file that doesn't contain the AWS SDK,
> > because the AWS SDK alone is 500 MB.
> > This eases usage for non-AWS users. AWS users can still add the jar
> > explicitly if desired.
> >
> > It is created using https://github.com/apache/hadoop-release-support,
> > the new and easy way to create and validate Hadoop releases, which
> > Steve created during 3.4.0. A link to it is also present in [1].
> > There was a discussion about this during the RC2 testing.
> >
> > I will check with Steve on why we don't have the equivalent for aarch64.
> >
> > [1] https://cwiki.apache.org/confluence/display/HADOOP2/HowToRelease
> >
> > On Thu, Oct 17, 2024 at 10:00 PM Ayush Saxena <ayush...@gmail.com> wrote:
> >>
> >> What exactly is this "lean tar"? I couldn’t find any mention of it in
> >> [1], nor did I come across any thread establishing consensus on adding
> >> it, as we did for the aarch64 tar. It’s also not clear to me from the
> >> create-release script how it is getting generated. So, where is this
> >> coming from? Additionally, why don’t we have an equivalent for
> >> aarch64?
> >>
> >> -Ayush
> >>
> >> [1] https://cwiki.apache.org/confluence/display/HADOOP2/HowToRelease
> >>
> >> On Thu, 17 Oct 2024 at 13:21, Mukund Madhav Thakur
> >> <mtha...@cloudera.com.invalid> wrote:
> >> >
> >> > Adding my +1 (binding) after all the testing.
> >> > Ran a lot of validation workflows from hadoop-release-support; all
> >> > went fine.
> >> > Verified checksums and signatures.
> >> > Built orc and reran the failing vectored IO tests, which succeeded.
> >> > Built parquet and ran the hadoop parquet module tests.
> >> > @Steve Loughran <ste...@apache.org> Built gcs and ran all the tests,
> >> > which went fine.
> >> >
> >> > Also, Ahmar is a committer, so his vote can be counted as +1 as well.
> >> >
> >> > Thanks everyone for testing! We need just one more binding vote for
> >> > this to be released.
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Thu, Oct 17, 2024 at 10:01 AM Sneha Vijayarajan <
> >> > sneha.vijayara...@gmail.com> wrote:
> >> >
> >> > > +1 (non-binding)
> >> > >
> >> > > Ran the ABFS driver tests and all the tests passed.
> >> > >
> >> > > ============================================================
> >> > > HNS-OAuth
> >> > > ============================================================
> >> > >
> >> > > [WARNING] Tests run: 154, Failures: 0, Errors: 0, Skipped: 2
> >> > > [WARNING] Tests run: 646, Failures: 0, Errors: 0, Skipped: 82
> >> > > [WARNING] Tests run: 433, Failures: 0, Errors: 0, Skipped: 57
> >> > >
> >> > > ============================================================
> >> > > HNS-SharedKey
> >> > > ============================================================
> >> > >
> >> > > [WARNING] Tests run: 154, Failures: 0, Errors: 0, Skipped: 3
> >> > > [WARNING] Tests run: 646, Failures: 0, Errors: 0, Skipped: 34
> >> > > [WARNING] Tests run: 433, Failures: 0, Errors: 0, Skipped: 44
> >> > >
> >> > > ============================================================
> >> > > NonHNS-SharedKey
> >> > > ============================================================
> >> > >
> >> > > [WARNING] Tests run: 154, Failures: 0, Errors: 0, Skipped: 9
> >> > > [WARNING] Tests run: 630, Failures: 0, Errors: 0, Skipped: 274
> >> > > [WARNING] Tests run: 433, Failures: 0, Errors: 0, Skipped: 47
> >> > >
> >> > > ============================================================
> >> > > AppendBlob-HNS-OAuth
> >> > > ============================================================
> >> > >
> >> > > [WARNING] Tests run: 154, Failures: 0, Errors: 0, Skipped: 2
> >> > > [WARNING] Tests run: 646, Failures: 0, Errors: 0, Skipped: 84
> >> > > [WARNING] Tests run: 433, Failures: 0, Errors: 0, Skipped: 81
> >> > >
> >> > > Thanks,
> >> > > Sneha Vijayarajan
> >> > >
> >> > > On Thu, Oct 17, 2024 at 8:57 AM Xiaoqiao He <hexiaoq...@apache.org> wrote:
> >> > > >
> >> > > > Updating my vote to +1 (binding).
> >> > > >
> >> > > > Addendum: Verified that the signatures are correct for the src,
> >> > > > binary and site tarballs.
> >> > > >
> >> > > >
> >> > > > On Mon, Oct 14, 2024 at 5:26 PM Xiaoqiao He <hexiaoq...@apache.org> wrote:
> >> > > >
> >> > > > > Thanks Mukund and Steve for driving this release.
> >> > > > >
> >> > > > > +0. Will +1 once the signature check passes.
> >> > > > >
> >> > > > > [Y] LICENSE files exist and NOTICE is included.
> >> > > > > [Y] Rat check is ok. mvn clean apache-rat:check
> >> > > > > [Y] Built the source code on Ubuntu with OpenJDK 11 using
> >> > > > > `mvn clean package -DskipTests -Pnative -Pdist -Dtar`.
> >> > > > > [Y] Set up a pseudo cluster with HDFS and YARN.
> >> > > > > [Y] Ran simple FsShell commands (mkdir/put/get/mv/rm) and checked the results.
> >> > > > > [Y] Ran example MR jobs (Pi & wordcount) and checked the results.
> >> > > > > [Y] Spot-checked and ran some unit tests.
> >> > > > > [Y] Skimmed the Web UI of NameNode/DataNode/ResourceManager/NodeManager etc.
> >> > > > > [Y] Skimmed over the contents of site documentation.
> >> > > > > [Y] Skimmed over the contents of maven repo.
> >> > > > > [N] Checksum verification passed but signature verification didn't.
> >> > > > > Did you use another key to sign?
> >> > > > >
> >> > > > >> hexiaoqiao@hexiaoqiao-ubuntu:~/Public/hadoop-3.4.1$ gpg --list-key | grep -B2 -A1 "Mukund"
> >> > > > >> pub   rsa4096 2023-09-28 [SC] [expires: 2027-09-28]
> >> > > > >>       41C7034C031FB3AF4BBFB5B89F070EBA4202CEC1
> >> > > > >> uid           [ unknown] Mukund Thakur <mtha...@apache.org>
> >> > > > >> sub   rsa4096 2023-09-28 [E] [expires: 2027-09-28]
> >> > > > >> hexiaoqiao@hexiaoqiao-ubuntu:~/Public/hadoop-3.4.1$ gpg --verify hadoop-3.4.1.tar.gz.asc hadoop-3.4.1.tar.gz
> >> > > > >> gpg: Signature made Thu Oct 10 01:10:30 2024 CST
> >> > > > >> gpg:                using RSA key 53931DAA708291409958BD474D22BB7D32882201
> >> > > > >> gpg: Can't check signature: No public key
> >> > > > >
> >> > > > >
> >> > > > > Best Regards,
> >> > > > > - He Xiaoqiao
> >> > > > >
> >> > > > > On Fri, Oct 11, 2024 at 1:31 AM Steve Loughran <ste...@cloudera.com.invalid> wrote:
> >> > > > >
> >> > > > >> Hey, you did the -lean one too! Nice.
> >> > > > >>
> >> > > > >> Anyway, yes, I'll test. I will try not to find new problems.
> >> > > > >>
> >> > > > >>
> >> > > > >> On Thu, 10 Oct 2024 at 12:18, Mukund Madhav Thakur
> >> > > > >> <mtha...@cloudera.com.invalid> wrote:
> >> > > > >>
> >> > > > >> > Apache Hadoop 3.4.1
> >> > > > >> >
> >> > > > >> > With Steve's help, I have put together a release candidate (RC3) for
> >> > > > >> > Hadoop 3.4.1.
> >> > > > >> >
> >> > > > >> > What we would like is for anyone who can to verify the tarballs,
> >> > > > >> > especially anyone who can try the arm64 binaries, as we want to
> >> > > > >> > include them too.
> >> > > > >> >
> >> > > > >> > The RC is available at:
> >> > > > >> > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.4.1-RC3/
> >> > > > >> >
> >> > > > >> > The git tag is release-3.4.1-RC3, commit
> >> > > > >> > 4d7825309348956336b8f06a08322b78422849b1
> >> > > > >> >
> >> > > > >> > The maven artifacts are staged at
> >> > > > >> > https://repository.apache.org/content/repositories/orgapachehadoop-1430
> >> > > > >> >
> >> > > > >> > You can find my public key at:
> >> > > > >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >> > > > >> >
> >> > > > >> > Change log
> >> > > > >> > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.4.1-RC3/CHANGELOG.md
> >> > > > >> >
> >> > > > >> > Release notes
> >> > > > >> > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.4.1-RC3/RELEASENOTES.md
> >> > > > >> >
> >> > > > >> > This is off branch-3.4.1
> >> > > > >> >
> >> > > > >> > Key changes include
> >> > > > >> >
> >> > > > >> > * Bulk Delete API. https://issues.apache.org/jira/browse/HADOOP-18679
> >> > > > >> > * Fixes and enhancements in Vectored IO API.
> >> > > > >> > * Improvements in Hadoop Azure connector.
> >> > > > >> > * Fixes and improvements post upgrade to AWS V2 SDK in S3AConnector.
> >> > > > >> > * This release includes Arm64 binaries. Please can anyone with
> >> > > > >> >   compatible systems validate these.
> >> > > > >> >
> >> > > > >> > Note, because the arm64 binaries are built separately on a different
> >> > > > >> > platform and JVM, their jar files may not match those of the x86
> >> > > > >> > release, and therefore the maven artifacts. I don't think this is
> >> > > > >> > an issue (the ASF actually releases source tarballs, the binaries are
> >> > > > >> > there for help only, though with the maven repo that's a bit blurred).
> >> > > > >> >
> >> > > > >> > The only way to be consistent would be to actually untar the x86.tar.gz,
> >> > > > >> > overwrite its binaries with the arm stuff, retar, sign and push out
> >> > > > >> > for the vote. Even automating that would be risky.
> >> > > > >> >
> >> > > > >> > Please try the release and vote. The vote will run for 5 days.
> >> > > > >> >
> >> > > > >>
> >> > > > >
> >> > >
> >>
> >>
>
