Re: [ANNOUNCE] New Hadoop Committer - Haiyang Hu

2024-04-21 Thread Ayush Saxena
Congratulations Haiyang!!!

-Ayush

> On 22 Apr 2024, at 9:41 AM, Xiaoqiao He  wrote:
> 
> I am pleased to announce that Haiyang Hu has been elected as
> a committer on the Apache Hadoop project. We appreciate all of
> Haiyang's work, and look forward to her/his continued contributions.
> 
> Congratulations and Welcome, Haiyang!
> 
> Best Regards,
> - He Xiaoqiao
> (On behalf of the Apache Hadoop PMC)
> 
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> 

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 3.4.0 (RC3)

2024-03-13 Thread Ayush Saxena
>  Counter should be with yourself vote, where the current summary
>  is 5 +1 binding and 1 +1 non-binding. Let's re-count when deadline.

Just on the process: the release manager needs to "explicitly" vote like
any other voter before counting their own vote. There have been a lot of
discussions around that in multiple places, and the official Apache doc has
been updated as well [1]; the last paragraph reads:

"Note that there is no implicit +1 from the release manager, or from anyone
in any ASF vote. Only explicit votes are valid. The release manager is
encouraged to vote on releases, like any reviewer would do."

So, do put in an explicit +1 before you count your own vote. Good Luck!!!

-Ayush

[1] https://www.apache.org/foundation/voting.html#ReleaseVotes

On Tue, 12 Mar 2024 at 17:27, Steve Loughran wrote:

> followup: overnight work happy too.
>
> one interesting pain point is that on a raspberry pi 64 os checknative
> complains that libcrypto is missing
>
> > bin/hadoop checknative
>
> 2024-03-12 11:50:24,359 INFO bzip2.Bzip2Factory: Successfully loaded &
> initialized native-bzip2 library system-native
> 2024-03-12 11:50:24,363 INFO zlib.ZlibFactory: Successfully loaded &
> initialized native-zlib library
> 2024-03-12 11:50:24,370 WARN erasurecode.ErasureCodeNative: ISA-L support
> is not available in your platform... using builtin-java codec where
> applicable
> 2024-03-12 11:50:24,429 INFO nativeio.NativeIO: The native code was built
> without PMDK support.
> 2024-03-12 11:50:24,431 WARN crypto.OpensslCipher: Failed to load OpenSSL
> Cipher.
> java.lang.UnsatisfiedLinkError: Cannot load libcrypto.so (libcrypto.so:
> cannot open shared object file: No such file or directory)!
> at org.apache.hadoop.crypto.OpensslCipher.initIDs(Native Method)
> at
> org.apache.hadoop.crypto.OpensslCipher.(OpensslCipher.java:90)
> at
>
> org.apache.hadoop.util.NativeLibraryChecker.main(NativeLibraryChecker.java:111)
> Native library checking:
> hadoop:  true
>
> /home/stevel/Projects/hadoop-release-support/target/arm-untar/hadoop-3.4.0/lib/native/libhadoop.so.1.0.0
> zlib:true /lib/aarch64-linux-gnu/libz.so.1
> zstd  :  true /lib/aarch64-linux-gnu/libzstd.so.1
> bzip2:   true /lib/aarch64-linux-gnu/libbz2.so.1
> openssl: false Cannot load libcrypto.so (libcrypto.so: cannot open shared
> object file: No such file or directory)!
> ISA-L:   false libhadoop was built without ISA-L support
> PMDK:false The native code was built without PMDK support.
>
> which happens because it's not in /lib/aarch64-linux-gnu but instead in
> /usr/lib/aarch64-linux-gnu:
>
> ls -l /usr/lib/aarch64-linux-gnu/libcrypto*
> -rw-r--r-- 1 root root 2739952 Sep 19 13:09 /usr/lib/aarch64-linux-gnu/libcrypto.so.1.1
> -rw-r--r-- 1 root root 4466856 Oct 27 13:40 /usr/lib/aarch64-linux-gnu/libcrypto.so.3
>
> Anyone got any insights on how I should set up this (debian-based) OS here?
> I know it's only a small box but with arm64 VMs becoming available in cloud
> infras, it'd be good to know if they are similar.
>
> Note: checknative itself is happy; but checknative -a will fail because of
> this -though it's an OS setup issue, nothing related to the hadoop
> binaries.
>
> steve
>
> On Tue, 12 Mar 2024 at 02:26, Xiaoqiao He  wrote:
>
> > Hi Shilun, Counter should be with yourself vote, where the current
> summary
> > is 5 +1 binding and 1 +1 non-binding. Let's re-count when deadline.
> > Thanks again.
> >
> > Best Regards,
> > - He Xiaoqiao
> >
> > On Tue, Mar 12, 2024 at 9:00 AM slfan1989  wrote:
> >
> > > As of now, we have collected 5 affirmative votes, with 4 votes binding
> > and
> > > 1 vote non-binding.
> > >
> > > Thank you very much for voting and verifying!
> > >
> > > This voting will continue until March 15th, this Friday.
> > >
> > > Best Regards,
> > > Shilun Fan.
> > >
> > > > On Tue, Mar 12, 2024 at 4:29 AM Steve Loughran wrote:
> > >
> > > > +1 binding
> > > >
> > > > (sorry, this had ended in the yarn-dev folder, otherwise I'd have
> seen
> > it
> > > > earlier. been testing it this afternoon:
> > > >
> > > > pulled the latest version of
> > > > https://github.com/apache/hadoop-release-support
> > > > (note, this module is commit-then-review; whoever is working
> > > on/validating
> > > > a release can commit as they go along. This is not production
> code...)
> > > >
> > > > * went through the "validating a release" step, validating maven
> > > artifacts
> > > > * building the same downstream modules which built for me last time
> > (avro
> > > > too complex; hboss not aws v2 in apache yet)
> > > >
> > > > spark build is still ongoing, but I'm not going to wait. It is
> > building,
> > > > which is key.
> > > >
> > > > The core changes I needed in are at the dependency level and I've
> > > > verified they are good.
> > > >
> > > > Oh, and I've also got my raspberry p5 doing the download of the arm
> > > > stuff for its checknative; not expecting problems.
> > > >
> > > > So: i've got some stuff still 
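On the checknative libcrypto failure above: one plausible remedy on a Debian-based OS like the Raspberry Pi one in the thread. This is an assumption on my part, not something confirmed in the thread — the loader error says it dlopens the unversioned `libcrypto.so`, while only the versioned `.so.1.1`/`.so.3` files are installed; whether the Hadoop native code accepts OpenSSL 3 is also an assumption here.

```shell
# The unversioned libcrypto.so symlink ships in the OpenSSL dev package
# on Debian-based systems:
sudo apt-get install -y libssl-dev

# Alternatively (assuming OpenSSL 3 is acceptable to the native code),
# create the symlink by hand:
# sudo ln -s /usr/lib/aarch64-linux-gnu/libcrypto.so.3 \
#            /usr/lib/aarch64-linux-gnu/libcrypto.so

# Then re-run the check; the "openssl:" line should flip to true:
bin/hadoop checknative
```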

Re: [DISCUSS] Support/Fate of HBase v1 in Hadoop

2024-03-12 Thread Ayush Saxena
Thanx everyone. I think we have an agreement around dropping the HBase v1
support.

I have created a ticket: https://issues.apache.org/jira/browse/HADOOP-19107

FYI, the HBase v2 build is broken now. I tried "mvn clean install
-DskipTests -Dhbase.profile=2.0" & it didn't work (it worked a couple of
months back IIRC); some slf4j exclusion needs to be added, I believe.
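For anyone reproducing this, the failing build command from the thread plus a hedged way to see where slf4j artifacts enter the graph (the exact failure isn't shown here, so treating it as a binding conflict is an assumption):

```shell
# Build Hadoop with the HBase v2 profile (the command reported to fail):
mvn clean install -DskipTests -Dhbase.profile=2.0

# List which dependencies pull in org.slf4j artifacts, to decide where an
# <exclusion> would need to go in the relevant pom.xml:
mvn dependency:tree -Dhbase.profile=2.0 -Dincludes=org.slf4j
```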

Maybe an upgrade of the HBase v2 version might solve it; if not, we need to
handle it in our code. We can chase the v2 version upgrade, as Bryan
mentioned, either in a different ticket in parallel or together with this as
well :-)

In case anyone has objections/suggestions, do shout out here, on the
ticket, or both.

-Ayush

On Mon, 11 Mar 2024 at 19:22, Steve Loughran wrote:

>  +1 for cutting hbase 1; it only reduces dependency pain (no more protobuf
> 2.5!)
>
> Created the JIRA on that a few days back
> https://issues.apache.org/jira/browse/YARN-11658
>
> On Tue, 5 Mar 2024 at 12:08, Bryan Beaudreault 
> wrote:
>
> > Hbase v1 is EOL for a while now, so option 2 probably makes sense. While
> > you are at it you should probably update the hbase2 version, because
> 2.2.x
> > is also very old and EOL. 2.5.x is the currently maintained release for
> > hbase2, with 2.5.7 being the latest. We’re soon going to release 2.6.0 as
> > well.
> >
> > On Tue, Mar 5, 2024 at 6:56 AM Ayush Saxena  wrote:
> >
> > > Hi Folks,
> > > As of now we have two profiles for HBase: one for HBase v1(1.7.1) &
> other
> > > for v2(2.2.4). The versions are specified over here: [1], how to build
> is
> > > mentioned over here: [2]
> > >
> > > As of now we by default run our Jenkins "only" for HBase v1, so we have
> > > seen HBase v2 profile silently breaking a couple of times.
> > >
> > > Considering there are stable versions for HBase v2 as per [3] & HBase
> v2
> > > seems not too new, I have some suggestions, we can consider:
> > >
> > > * Make HBase v2 profile as the default profile & let HBase v1 profile
> > stay
> > > in our code.
> > > * Ditch HBase v1 profile & just lets support HBase v2 profile.
> > > * Let everything stay as is, just add a Jenkins job/ Github action
> which
> > > compiles HBase v2 as well, so we make sure no change breaks it.
> > >
> > > Personally I would go with the second option, the last HBase v1 release
> > > seems to be 2 years back, it might be pulling in some
> > > problematic transitive dependencies & it will open scope for us to
> > support
> > > HBase 3.x when they have a stable release in future.
> > >
> > >
> > > Let me know your thoughts!!!
> > >
> > > -Ayush
> > >
> > >
> > > [1]
> > >
> > >
> >
> https://github.com/apache/hadoop/blob/dae871e3e0783e1fe6ea09131c3f4650abfa8a1d/hadoop-project/pom.xml#L206-L207
> > >
> > > [2]
> > >
> > >
> >
> https://github.com/apache/hadoop/blob/dae871e3e0783e1fe6ea09131c3f4650abfa8a1d/BUILDING.txt#L168-L172
> > >
> > > [3] https://hbase.apache.org/downloads.html
> > >
> >
>


Re: [VOTE] Release Apache Hadoop 3.4.0 (RC3)

2024-03-07 Thread Ayush Saxena
I just gave it a quick try; Hive master didn't compile, and I gave up after that...
[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile
(default-compile) on project hive-common: Compilation failure
[ERROR]
/Users/ayushsaxena/code/hive/common/src/java/org/apache/hadoop/hive/common/JvmMetrics.java:[23,37]
package org.apache.hadoop.log.metrics does not exist

I think it's due to HADOOP-17524
<https://issues.apache.org/jira/browse/HADOOP-17524>; it will be quite an
effort to upgrade from 3.3.x to 3.4.x, and I don't think we would chase that
upgrade in Hive anytime soon.

I will abstain

-Ayush

On Thu, 7 Mar 2024 at 08:33, Xiaoqiao He  wrote:

> cc @Chao Sun @zhang...@apache.org @Ayush Saxena
> Would you mind helping to check the upstream systems(Spark, HBase, Hive)
> dependency with
> the new hadoop release version if necessary? Thanks.
>
> Best Regards,
> - He Xiaoqiao
>
>
> On Thu, Mar 7, 2024 at 10:23 AM Xiaoqiao He  wrote:
>
>> Shilun, Thanks for your great work. I am checking but not finished yet.
>> Will confirm once done.
>>
>> Best Regards,
>> - He Xiaoqiao
>>
>> On Thu, Mar 7, 2024 at 7:23 AM slfan1989  wrote:
>>
>>> @Xiaoqiao He @Ayush Saxena @Steve Loughran @Mukund Madhav Thakur
>>> @Takanobu Asanuma @Masatake Iwasaki
>>>
>>> Can you help review and vote for hadoop-3.4.0-RC3? Thank you very much!
>>>
>>> Best Regards,
>>> Shilun Fan.
>>>
>>> On Tue, Mar 5, 2024 at 6:07 AM slfan1989  wrote:
>>>
>>> > Hi folks,
>>> >
>>> > Xiaoqiao He and I have put together a release candidate (RC3) for
>>> Hadoop
>>> > 3.4.0.
>>> >
>>> > What we would like is for anyone who can to verify the tarballs,
>>> especially
>>> > anyone who can try the arm64 binaries as we want to include them too.
>>> >
>>> > The RC is available at:
>>> > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.4.0-RC3/
>>> >
>>> > The git tag is release-3.4.0-RC3, commit bd8b77f398f
>>> >
>>> > The maven artifacts are staged at
>>> >
>>> https://repository.apache.org/content/repositories/orgapachehadoop-1408
>>> >
>>> > You can find my public key at:
>>> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>>> >
>>> > Change log
>>> >
>>> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.4.0-RC3/CHANGELOG.md
>>> >
>>> > Release notes
>>> >
>>> >
>>> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.4.0-RC3/RELEASENOTES.md
>>> >
>>> > This is off branch-3.4.0 and is the first big release since 3.3.6.
>>> >
>>> > Key changes include
>>> >
>>> > * S3A: Upgrade AWS SDK to V2
>>> > * HDFS DataNode Split one FsDatasetImpl lock to volume grain locks
>>> > * YARN Federation improvements
>>> > * YARN Capacity Scheduler improvements
>>> > * HDFS RBF: Code Enhancements, New Features, and Bug Fixes
>>> > * HDFS EC: Code Enhancements and Bug Fixes
>>> > * Transitive CVE fixes
>>> >
>>> > Differences from Hadoop-3.4.0-RC2
>>> >
>>> > * From branch-3.4 to branch-3.4.0 backport 2 Prs
>>> > * HADOOP-18088: Replacing log4j 1.x with reload4j. (ad8b6541117b)
>>> > * HADOOP-19084: Pruning hadoop-common transitive dependencies.
>>> > (80b4bb68159c)
>>> > * Use hadoop-release-support[1] for packaging and verification.
>>> > * Add protobuf compatibility issue description
>>> >
>>> > Note, because the arm64 binaries are built separately on a different
>>> > platform and JVM, their jar files may not match those of the x86
>>> > release -and therefore the maven artifacts. I don't think this is
>>> > an issue (the ASF actually releases source tarballs, the binaries are
>>> > there for help only, though with the maven repo that's a bit blurred).
>>> >
>>> > The only way to be consistent would actually untar the x86.tar.gz,
>>> > overwrite its binaries with the arm stuff, retar, sign and push out
>>> > for the vote. Even automating that would be risky.
>>> >
>>> > [1] hadoop-release-support:
>>> > https://github.com/apache/hadoop-release-support
>>> > Thanks to steve for providing hadoop-release-support.
>>> >
>>> > Best Regards,
>>> > Shilun Fan.
>>> >
>>> >
>>>
>>


[DISCUSS] Support/Fate of HBase v1 in Hadoop

2024-03-05 Thread Ayush Saxena
Hi Folks,
As of now we have two profiles for HBase: one for HBase v1 (1.7.1) & the
other for v2 (2.2.4). The versions are specified over here: [1]; how to
build is mentioned over here: [2]

As of now we by default run our Jenkins "only" for HBase v1, so we have
seen the HBase v2 profile silently breaking a couple of times.

Considering there are stable versions for HBase v2 as per [3] & HBase v2
seems not too new, I have some suggestions we can consider:

* Make the HBase v2 profile the default & let the HBase v1 profile stay
in our code.
* Ditch the HBase v1 profile & just support the HBase v2 profile.
* Let everything stay as is; just add a Jenkins job/GitHub action which
compiles HBase v2 as well, so we make sure no change breaks it.

Personally I would go with the second option. The last HBase v1 release
seems to be from 2 years back; it might be pulling in some problematic
transitive dependencies, & dropping it opens scope for us to support
HBase 3.x when they have a stable release in the future.


Let me know your thoughts!!!

-Ayush


[1]
https://github.com/apache/hadoop/blob/dae871e3e0783e1fe6ea09131c3f4650abfa8a1d/hadoop-project/pom.xml#L206-L207

[2]
https://github.com/apache/hadoop/blob/dae871e3e0783e1fe6ea09131c3f4650abfa8a1d/BUILDING.txt#L168-L172

[3] https://hbase.apache.org/downloads.html


Re: [VOTE] Release Apache Hadoop Thirdparty 1.2.0 (RC1)

2024-02-06 Thread Ayush Saxena
+1 (Binding)

* Built from source
* Verified Checksums
* Verified Signatures
* Verified no diff b/w the git tag & src tar
* Verified LICENSE & NOTICE files
* Compiled hadoop trunk with 1.2.0-RC1*
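The checksum item in a checklist like the one above boils down to generating and re-verifying a `.sha512` file. A self-contained sketch, with a scratch file standing in for the real source tarball (the filename is a placeholder, and in a real vote the `.sha512` comes from the dist area rather than being generated locally):

```shell
# Create a stand-in for the release artifact:
workdir=$(mktemp -d)
cd "$workdir"
echo "release payload" > hadoop-thirdparty-1.2.0-src.tar.gz

# The release manager publishes the checksum file alongside the artifact;
# here we generate it ourselves for the demo:
sha512sum hadoop-thirdparty-1.2.0-src.tar.gz > hadoop-thirdparty-1.2.0-src.tar.gz.sha512

# Verifiers re-check it; "OK" means the download matches the checksum:
sha512sum -c hadoop-thirdparty-1.2.0-src.tar.gz.sha512
```

Signature checking works the same way, with `gpg --verify` against the KEYS file linked in the vote mail.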

Thanx Shilun Fan & Xiaoqiao He for driving the release. Good Luck!!!

* You need to change that protobuf artifact name to compile, which ain't
fancy at all: the exclusions and all need to be updated after every
release. This is something we should change, maybe from the next release,
and let the dependency stay without the version suffix. Incompatible for
sure, but the current approach isn't very compatible either: if someone has
a dependency on any hadoop module which pulls in this shaded jar
transitively, they need to update their exclusions after every release, and
that IMO isn't a very good experience. Let's chase that in our next
release, or have some discussions around it.

-Ayush


On Tue, 6 Feb 2024 at 19:14, Takanobu Asanuma  wrote:

> +1 (binding).
>
> * Verified signatures and checksums
> * Reviewed the documents
> * Successfully built from source with `mvn clean install`
> * Successfully compiled Hadoop trunk and branch-3.4 with `mvn clean install
> -DskipTests` using the thirdparty 1.2.0-RC1
> * There are not any diffs between tag and src
>
> Regards,
> - Takanobu Asanuma
>
> 2024年2月6日(火) 11:03 Xiaoqiao He :
>
> > cc @PJ Fanning @Ayush Saxena @Steve Loughran @Takanobu Asanuma
> > @Shuyan Zhang @inigo...@apache.org
> >
> > On Mon, Feb 5, 2024 at 11:30 AM Xiaoqiao He 
> wrote:
> >
> >> +1(binding).
> >>
> >> I checked the following items:
> >> - [X] Download links are valid.
> >> - [X] Checksums and PGP signatures are valid.
> >> - [X] LICENSE and NOTICE files are correct for the repository.
> >> - [X] Source code artifacts have correct names matching the current
> >> release.
> >> - [X] All files have license headers if necessary.
> >> - [X] Building is OK using `mvn clean install` on JDK_1.8.0_202.
> >> - [X] Built Hadoop trunk successfully with updated thirdparty (include
> >> update protobuf shaded path).
> >> - [X] No difference between tag and release src tar.
> >>
> >> Good Luck!
> >>
> >> Best Regards,
> >> - He Xiaoqiao
> >>
> >>
> >> On Sun, Feb 4, 2024 at 10:29 PM slfan1989  wrote:
> >>
> >>> Hi folks,
> >>>
> >>> Xiaoqiao He and I have put together a release candidate (RC1) for
> Hadoop
> >>> Thirdparty 1.2.0.
> >>>
> >>> The RC is available at:
> >>>
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-thirdparty-1.2.0-RC1
> >>>
> >>> The RC tag is
> >>>
> >>>
> https://github.com/apache/hadoop-thirdparty/releases/tag/release-1.2.0-RC1
> >>>
> >>> The maven artifacts are staged at
> >>>
> https://repository.apache.org/content/repositories/orgapachehadoop-1401
> >>>
> >>> Comparing to 1.1.1, there are three additional fixes:
> >>>
> >>> HADOOP-18197. Upgrade Protobuf-Java to 3.21.12
> >>> https://github.com/apache/hadoop-thirdparty/pull/26
> >>>
> >>> HADOOP-18921. Upgrade to avro 1.11.3
> >>> https://github.com/apache/hadoop-thirdparty/pull/24
> >>>
> >>> HADOOP-18843. Guava version 32.0.1 bump to fix CVE-2023-2976 (#23)
> >>> https://github.com/apache/hadoop-thirdparty/pull/23
> >>>
> >>> You can find my public key at :
> >>> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >>>
> >>> Best Regards,
> >>> Shilun Fan.
> >>>
> >>>
>


Re: [VOTE] Release Apache Hadoop Thirdparty 1.2.0 RC0

2024-02-01 Thread Ayush Saxena
There is some diff b/w the git tag & the src tar: the Dockerfile & the
create-release script are different. Why?
Files hadoop-thirdparty/dev-support/bin/create-release and
hadoop-thirdparty-1.2.0-src/dev-support/bin/create-release differ

Files hadoop-thirdparty/dev-support/docker/Dockerfile and
hadoop-thirdparty-1.2.0-src/dev-support/docker/Dockerfile differ


ayushsaxena@ayushsaxena hadoop-thirdparty-1.2.0-RC0 % diff
hadoop-thirdparty/dev-support/bin/create-release
hadoop-thirdparty-1.2.0-src/dev-support/bin/create-release

444,446c444,446

< echo "RUN groupadd --non-unique -g ${group_id} ${user_name}"

< echo "RUN useradd -g ${group_id} -u ${user_id} -m ${user_name}"

< echo "RUN chown -R ${user_name} /home/${user_name}"

---

> echo "RUN groupadd --non-unique -g ${group_id} ${user_name}; exit 0;"

> echo "RUN useradd -g ${group_id} -u ${user_id} -m ${user_name}; exit
0;"

> echo "RUN chown -R ${user_name} /home/${user_name}; exit 0;"

ayushsaxena@ayushsaxena hadoop-thirdparty-1.2.0-RC0 % diff
hadoop-thirdparty/dev-support/docker/Dockerfile
hadoop-thirdparty-1.2.0-src/dev-support/docker/Dockerfile

103a104,105

> RUN rm -f /etc/maven/settings.xml && ln -s /home/root/.m2/settings.xml
/etc/maven/settings.xml

>

126a129,130

> RUN pip2 install setuptools-scm==5.0.2

> RUN pip2 install lazy-object-proxy==1.5.0

159d162

<




Other things look OK:
* Built from source
* Verified Checksums
* Verified Signatures
* Validated files have ASF header

Not sure if having a diff b/w the git tag & src tar is OK; this doesn't
look like a core code change though. Can anybody check & confirm?
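The tag-vs-tarball comparison above is just a recursive diff. A minimal self-contained sketch, with scratch directories standing in for the real `git checkout release-1.2.0-RC0` tree and the extracted source tar (file contents here are invented for the demo):

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Stand-ins for the git checkout and the untarred src release:
mkdir -p tag/dev-support/bin src/dev-support/bin
printf 'echo build\n'         > tag/dev-support/bin/create-release
printf 'echo build patched\n' > src/dev-support/bin/create-release

# -r recurses, -q reports only which files differ (the "Files ... differ"
# lines quoted above); diff exits non-zero when differences exist:
diff -rq tag src || true
```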

-Ayush


On Thu, 1 Feb 2024 at 13:39, Xiaoqiao He  wrote:

> Gentle ping. @Ayush Saxena @Steve Loughran @inigo...@apache.org
> @Masatake Iwasaki and some other folks.
>
> On Wed, Jan 31, 2024 at 10:17 AM slfan1989  wrote:
>
> > Thank you for the review and vote! Looking forward to other forks helping
> > with voting and verification.
> >
> > Best Regards,
> > Shilun Fan.
> >
> > On Tue, Jan 30, 2024 at 6:20 PM Xiaoqiao He 
> wrote:
> >
> > > Thanks Shilun for driving it and making it happen.
> > >
> > > +1(binding).
> > >
> > > [x] Checksums and PGP signatures are valid.
> > > [x] LICENSE files exist.
> > > [x] NOTICE is included.
> > > [x] Rat check is ok. `mvn clean apache-rat:check`
> > > [x] Built from source works well: `mvn clean install`
> > > [x] Built Hadoop trunk with updated thirdparty successfully (include
> > update
> > > protobuf shaded path).
> > >
> > > BTW, hadoop-thirdparty-1.2.0 will be included in release-3.4.0, hope we
> > > could finish this vote before 2024/02/06(UTC) if there are no concerns.
> > > Thanks all.
> > >
> > > Best Regards,
> > > - He Xiaoqiao
> > >
> > >
> > >
> > > On Mon, Jan 29, 2024 at 10:42 PM slfan1989 
> wrote:
> > >
> > > > Hi folks,
> > > >
> > > > Xiaoqiao He and I have put together a release candidate (RC0) for
> > Hadoop
> > > > Thirdparty 1.2.0.
> > > >
> > > > The RC is available at:
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-thirdparty-1.2.0-RC0
> > > >
> > > > The RC tag is
> > > >
> > >
> >
> https://github.com/apache/hadoop-thirdparty/releases/tag/release-1.2.0-RC0
> > > >
> > > > The maven artifacts are staged at
> > > >
> > https://repository.apache.org/content/repositories/orgapachehadoop-1398
> > > >
> > > > Comparing to 1.1.1, there are three additional fixes:
> > > >
> > > > HADOOP-18197. Upgrade Protobuf-Java to 3.21.12
> > > > https://github.com/apache/hadoop-thirdparty/pull/26
> > > >
> > > > HADOOP-18921. Upgrade to avro 1.11.3
> > > > https://github.com/apache/hadoop-thirdparty/pull/24
> > > >
> > > > HADOOP-18843. Guava version 32.0.1 bump to fix CVE-2023-2976
> > > > https://github.com/apache/hadoop-thirdparty/pull/23
> > > >
> > > > You can find my public key at :
> > > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > > >
> > > > Best Regards,
> > > > Shilun Fan.
> > > >
> > >
> >
>


Re: [VOTE] Release Apache Hadoop 3.4.0 RC0

2024-01-12 Thread Ayush Saxena
We should consider including
https://issues.apache.org/jira/browse/HDFS-17129

which looks like it induces some misordering between IBRs & FBRs that can
potentially lead to strange issues. If that can't be merged, we should
revert the change which causes it.

I think we should check for any ticket which has a target version or
affects version & is marked critical/blocker for 3.4.0 before spinning up a
new RC; I think I mentioned that somewhere before.

-1, unless HDFS-17129 is a false alarm or we can prove it won't cause
any issues. There is a comment which says a block was reported missing
after the patch that induced it: [1]

[1] https://github.com/apache/hadoop/pull/6244#issuecomment-1793981740

-Ayush


On Fri, 12 Jan 2024 at 07:37, slfan1989  wrote:

> Thank you very much for your help in verifying this version! We will use
> version 3.5.0 for fix jira in the future.
>
> Best Regards,
> Shilun Fan.
>
>  > wonderful! I'll be testing over the weekend
>
>  > Meanwhile, new changes I'm putting in to trunk are tagged as fixed in
> 3.5.0
>  > -correct?
>
>  > steve
>
>
> > On Thu, 11 Jan 2024 at 05:15, slfan1989 wrote:
>
> > Hello all,
> >
> > We plan to release hadoop 3.4.0 based on hadoop trunk, which is the first
> > hadoop 3.4.0-RC version.
> >
> > The RC is available at:
> > https://home.apache.org/~slfan1989/hadoop-3.4.0-RC0-amd64/ (for amd64)
> > https://home.apache.org/~slfan1989/hadoop-3.4.0-RC0-arm64/ (for arm64)
> >
> > Maven artifacts is built by x86 machine and are staged at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1391/
> >
> > My public key:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > Changelog:
> > https://home.apache.org/~slfan1989/hadoop-3.4.0-RC0-amd64/CHANGELOG.md
> >
> > Release notes:
> >
> https://home.apache.org/~slfan1989/hadoop-3.4.0-RC0-amd64/RELEASENOTES.md
> >
> > This is a relatively big release (by Hadoop standard) containing about
> 2852
> > commits.
> >
> > Please give it a try, this RC vote will run for 7 days.
> >
> > Feature highlights:
> >
> > DataNode FsDatasetImpl Fine-Grained Locking via BlockPool
> > 
> > [HDFS-15180](https://issues.apache.org/jira/browse/HDFS-15180) Split
> > FsDatasetImpl datasetLock via blockpool to solve the issue of heavy
> > FsDatasetImpl datasetLock
> > When there are many namespaces in a large cluster.
> >
> > YARN Federation improvements
> > 
> > [YARN-5597](https://issues.apache.org/jira/browse/YARN-5597) brings many
> > improvements, including the following:
> >
> > 1. YARN Router now boasts a full implementation of all relevant
> interfaces
> > including the ApplicationClientProtocol,
> > ResourceManagerAdministrationProtocol, and RMWebServiceProtocol.
> > 2. Enhanced support for Application cleanup and automatic offline
> > mechanisms for SubCluster are now facilitated by the YARN Router.
> > 3. Code optimization for Router and AMRMProxy was undertaken, coupled
> with
> > improvements to previously pending functionalities.
> > 4. Audit logs and Metrics for Router received upgrades.
> > 5. A boost in cluster security features was achieved, with the inclusion
> of
> > Kerberos support.
> > 6. The page function of the router has been enhanced.
> >
> > Upgrade AWS SDK to V2
> > 
> > [HADOOP-18073](https://issues.apache.org/jira/browse/HADOOP-18073)
> > The S3A connector now uses the V2 AWS SDK. This is a significant change
> at
> > the source code level.
> > Any applications using the internal extension/override points in the
> > filesystem connector are likely to break.
> > Consult the document aws\_sdk\_upgrade for the full details.
> >
> > hadoop-thirdparty will also provide the new RC0 soon.
> >
> > Best Regards,
> > Shilun Fan.
> >
>


Re: Re: [DISCUSS] Release Hadoop 3.4.0

2024-01-05 Thread Ayush Saxena
Thanx @slfan1989 for volunteering. Please remove this [1] from the new
branch-3.4 & branch-3.4.0 when you create them as part of preparing for the
release, else it would be putting trunk labels on backport PRs to those
branches as well.

There are some tickets marked as Critical/Blocker for 3.4.0 [2]; just
give them a check to see if they are actually critical or not, and if yes,
we should get them in. Most of them were not looking relevant to me at
first glance.

-Ayush


[1] https://github.com/apache/hadoop/blob/trunk/.github/labeler.yml
[2] 
https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20HADOOP%2C%20MAPREDUCE%2C%20YARN)%20AND%20priority%20in%20(Blocker%2C%20Critical)%20AND%20resolution%20%3D%20Unresolved%20AND%20affectedVersion%20in%20(3.4.0%2C%203.4.1)


On Thu, 4 Jan 2024 at 19:57, slfan1989  wrote:
>
> Hey all,
>
> We are planning to release Hadoop 3.4.0 base on trunk. I made some 
> preparations and changed the target version of JIRA for non-blockers in 
> HADOOP, HDFS, YARN, and MAPREDUCE from 3.4.0 to 3.5.0. If we want to create a 
> new JIRA, the target version can directly select version 3.5.0.
>
> If you have any thoughts, suggestions, or concerns, please feel free to share 
> them.
>
> Best Regards,
> Shilun Fan.
>
> > +1 from me.
> >> It will include the new AWS V2 SDK upgrade as well.
>
> > On Wed, Jan 3, 2024 at 6:35 AM Xiaoqiao He wrote:
>
> > >
> > > I think the release discussion can be in public ML?
> >
> > Good idea. cc common-dev/hdfs-dev/yarn-dev/mapreduce-dev ML.
> >
> > Best Regards,
> > - He Xiaoqiao
> >
> > On Tue, Jan 2, 2024 at 6:18 AM Ayush Saxena wrote:
> >
> > > +1 from me as well.
> > >
> > > We should definitely attempt to upgrade the thirdparty version for
> > > 3.4.0 & check if there are any pending critical/blocker issues as
> > > well.
> > >
> > > I think the release discussion can be in public ML?
> > >
> > > -Ayush
> > >
> > > > On Mon, 1 Jan 2024 at 18:25, Steve Loughran wrote:
> > > >
> > > > +1 from me
> > > >
> > > > ant and maven repo to build and validate things, including making arm
> > > > binaries if you work from an arm macbook.
> > > > https://github.com/steveloughran/validate-hadoop-client-artifacts
> > > >
> > > > do we need to publish an up to date thirdparty release for this?
> > > >
> > > >
> > > >
> > > > On Mon, 25 Dec 2023 at 16:06, slfan1989 wrote:
> > > >
> > > > > Dear PMC Members,
> > > > >
> > > > > First of all, Merry Christmas to everyone!
> > > > >
> > > > > In our community discussions, we collectively finalized the plan to
> > > release
> > > > > Hadoop 3.4.0 based on the current trunk branch. I am applying to take
> > > on
> > > > > the responsibility for the initial release of version 3.4.0, and the
> > > entire
> > > > > process is set to officially commence in January 2024.
> > > > > I have created a new JIRA: HADOOP-19018. Release 3.4.0.
> > > > >
> > > > > The specific work plan includes:
> > > > >
> > > > > 1. Following the guidance in the HowToRelease document, completing
> > all
> > > the
> > > > > relevant tasks required for the release of version 3.4.0.
> > > > > 2. Pointing the trunk branch to 3.5.0-SNAPSHOT.
> > > > > 3. Currently, the Fix Versions of all tasks merged into trunk are set
> > > as
> > > > > 3.4.0; I will move them to 3.5.0.
> > > > >
> > > > > Confirmed features to be included in the release:
> > > > >
> > > > > 1. Enhanced functionality for YARN Federation.
> > > > > 2. Optimization of HDFS RBF.
> > > > > 3. Introduction of fine-grained global locks for DataNodes.
> > > > > 4. Improvements in the stability of HDFS EC, and more.
> > > > > 5. Fixes for important CVEs.
> > > > >
> > > > > If you have any thoughts, suggestions, or concerns, please feel free
> > to
> > > > > share them.
> > > > >
> > > > > Looking forward to a successful release!
> > > > >
> > > > > Best Regards,
> > > > > Shilun Fan.
> > > > >
> > >
> >




Re: [VOTE] Hadoop 3.2.x EOL

2023-12-05 Thread Ayush Saxena
+1

-Ayush

> On 06-Dec-2023, at 9:40 AM, Xiaoqiao He  wrote:
> 
> Dear Hadoop devs,
> 
> Given the feedback from the discussion thread [1], I'd like to start
> an official thread for the community to vote on release line 3.2 EOL.
> 
> It will include,
> a. An official announcement informs no further regular Hadoop 3.2.x
> releases.
> b. Issues which target 3.2.5 will not be fixed.
> 
> This vote will run for 7 days and conclude by Dec 13, 2023.
> 
> I’ll start with my +1.
> 
> Best Regards,
> - He Xiaoqiao
> 
> [1] https://lists.apache.org/thread/bbf546c6jz0og3xcl9l3qfjo93b65szr




Re: [DISCUSS] Make some release lines EOL

2023-12-04 Thread Ayush Saxena
Thanx He Xiaoqiao for starting the thread.

+1 to marking 3.2 EOL. I am not sure about the branch-2 stuff; I think a
bunch of folks are still using Hadoop 2.x, but we hardly push anything
over there & I am not sure if anyone is interested in releasing it
or not.

> Hadoop 3.3 Release (release-3.3.5 at Jun 22 2022),  360 commits checked in
since last release.

I think you got the year wrong here, it is 2023

-Ayush

On Mon, 4 Dec 2023 at 18:09, Xiaoqiao He  wrote:
>
> Hi folks,
>
> There are many discussions about which release lines should we still
> consider actively
> maintained in history. I want to launch this topic again, and try to get a
> consensus.
>
> From download page[1] and active branches page[2], we have the following
> release lines:
> Hadoop 3.3 Release (release-3.3.5 at Jun 22 2022),  360 commits checked in
> since last release.
> Hadoop 3.2 Release (release-3.2.4 at Jul 11, 2022) 36 commits checked in
> since last release.
> Hadoop 2.10 Release (release-2.10.2 at May 17, 2022) 24 commits checked in
> since last release.
>
> And Hadoop 3.4.0 will be coming soon which Shilun Fan (maybe cooperating
> with Ahmar Suhail?)
> has been actively working on getting the 3.4.0 release out.
>
> Considering the less updates for some active branches, should we declare to
> our downstream
> users that some of these lines will EOL?
>
> IMO we should announce EOL branch-2.10 and branch-3.2 which are not active
> now.
> Then we could focus on minor active branches (branch-3.3 and branch-3.4)
> and increase release pace.
>
> So how about keeping the branch-3.3 and branch-3.4 release lines as actively
> maintained, and marking branch-2.10 and branch-3.2 EOL? Any opinions? Thanks.
>
> Best Regards,
> - He Xiaoqiao
>
> [1] https://hadoop.apache.org/releases.html
> [2]
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Active+Release+Lines




Re: [DISCUSS] Hadoop 3.4.0 release

2023-12-01 Thread Ayush Saxena
Hi Ahmar,

Makes sense to have a 3.4.0 rather than a 3.3.x if there are major
changes. A few things on the process:

* trunk should now point to 3.5.0-SNAPSHOT
* Fix Versions of all the tickets merged to trunk would be 3.4.0 as of
now, you need to move it to 3.5.0, else while chasing the release, the
release notes will get messed up.
* Since we are moving to 3.4.x, we can even explore a few more changes
in trunk as well, which weren't merged to 3.3.x due to compat issues.
HADOOP-13386 is one such change, which was asked for some time back.
* Now we have an additional release line, we can start a thread and
think about marking some existing release lines as EOL, so, the number
of active release lines stay in control, maybe 3.2

Rest, Good Luck, Thanx for volunteering!!!

-Ayush

On Fri, 1 Dec 2023 at 17:30, Ahmar Suhail  wrote:
>
> Hey all,
>
>
> HADOOP-18073 was recently merged to trunk, which upgrades the AWS SDK to
> V2. Since this merge, there has been work for stabilising this SDK version
> upgrade, as tracked in HADOOP-18886. Further, HADOOP-18995 and HADOOP-18996
> added support for Amazon Express One Zone.
>
>
> The SDK upgrade has brought major code changes to the hadoop-aws module.
> For example, In the v2 SDK, all package, class and method names have
> changed. Due to the upgrade being a large change, it might be good to
> release these changes as Hadoop 3.4.0. For this, I started creating a
> branch-3.4 from branch-3.3 and branch-3.4.0 for a possible release early
> next year.
>
>
> The other option would be to release from trunk as a 3.3.x release, but due
> to the large number of changes there this will be really difficult.
>
>
> Thoughts on a 3.4.0 release?




[jira] [Resolved] (MAPREDUCE-7459) Fixed TestHistoryViewerPrinter flakiness during string comparison

2023-11-03 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved MAPREDUCE-7459.
-
Fix Version/s: 3.4.0
   (was: 3.3.6)
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Fixed TestHistoryViewerPrinter flakiness during string comparison 
> --
>
> Key: MAPREDUCE-7459
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7459
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.3.6
> Environment: Java version: openjdk 11.0.20.1
> Maven version: Apache Maven 3.6.3
>Reporter: Rajiv Ramachandran
>Assignee: Rajiv Ramachandran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> The test 
> {{_org.apache.hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter#testHumanPrinterAll_}}
> can fail due to flakiness. This flakiness occurs because the test takes a
> HashMap's values and converts them to a string to perform the comparison,
> and the iteration order of the values returned is not guaranteed.
> The stack trace is as follows:
> testHumanPrinterAll(org.apache.hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter)
>   Time elapsed: 0.297 s  <<< FAILURE!
> org.junit.ComparisonFailure:
> expected:<...8501754_0001_m_0[7    6-Oct-2011 19:15:09    6-Oct-2011 
> 19:15:16 (7sec)
> SUCCEEDED MAP task list for job_1317928501754_0001
> TaskId        StartTime    FinishTime    Error    InputSplits
> 
> task_1317928501754_0001_m_06    6-Oct-2011 19:15:08    6-Oct-2011 
> 19:15:14 (6sec)
> ...
> /tasklog?attemptid=attempt_1317928501754_0001_m_03]_1
> REDUCE task list...> but was:<...8501754_0001_m_0[5    6-Oct-2011 
> 19:15:07    6-Oct-2011 19:15:12 (5sec)
> SUCCEEDED MAP task list for job_1317928501754_0001
> TaskId        StartTime    FinishTime    Error    InputSplits
> 
> task_1317928501754_0001_m_06    6-Oct-2011 19:15:08    6-Oct-2011 
> 19:15:14 (6sec)
> SUCCEEDED MAP task list for job_1317928501754_0001
> TaskId        StartTime    FinishTime    Error    InputSplits
> 
> task_1317928501754_0001_m_04    6-Oct-2011 19:15:06    6-Oct-2011 
> 19:15:10 (4sec)
> SUCCEEDED MAP task list for job_1317928501754_0001
> TaskId        StartTime    FinishTime    Error    InputSplits
> 
> task_1317928501754_0001_m_07    6-Oct-2011 19:15:09    6-Oct-2011 
> 19:15:16 (7sec)
> ...
> /tasklog?attemptid=attempt_1317928501754_0001_m_06]_1
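The HashMap-ordering flakiness described above is a common test antipattern: rendering a `HashMap`'s values into one string and comparing it against a fixed expected string. A minimal sketch of the failure mode and the usual remedy, using hypothetical names (illustrative only, not the actual MAPREDUCE-7459 patch):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FlakyComparisonDemo {

    // Flaky: HashMap iteration order is unspecified, so the rendered string
    // can differ between JVMs and runs even for identical map contents.
    static String flakyRender(Map<String, String> tasks) {
        StringBuilder sb = new StringBuilder();
        for (String line : tasks.values()) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    // Deterministic: impose an order (here, sorting the values) before
    // rendering, so the comparison no longer depends on iteration order.
    static String stableRender(Map<String, String> tasks) {
        List<String> lines = new ArrayList<>(tasks.values());
        Collections.sort(lines);
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> tasks = new HashMap<>();
        tasks.put("task_m_07", "task_m_07 7sec");
        tasks.put("task_m_04", "task_m_04 4sec");
        tasks.put("task_m_06", "task_m_06 6sec");
        // Sorted output is stable regardless of HashMap internals.
        System.out.println(stableRender(tasks));
    }
}
```

An alternative with the same effect is iterating a `TreeMap` view of the map, or comparing parsed structures instead of whole rendered strings.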



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




Re: [ANNOUNCE] New Hadoop PMC Member - Shilun Fan

2023-10-31 Thread Ayush Saxena
Congratulations!!!

-Ayush

> On 31-Oct-2023, at 6:15 PM, Samrat Deb  wrote:
> 
> Congratulations Shilun Fan
> 
> Bests,
> Samrat
> 
>> On Tue, Oct 31, 2023 at 6:07 PM Tao Li  wrote:
>> 
>> Congratulations!!!
>> 
>> Xiaoqiao He wrote on Tue, 31 Oct 2023 at 20:19:
>> 
>>> On behalf of the Apache Hadoop PMC, I am pleased to announce that
>>> Shilun Fan(slfan1989) has accepted the PMC's invitation to become
>>> PMC member on the project. We appreciate all of Shilun's generous
>>> contributions thus far and look forward to his continued involvement.
>>> 
>>> Congratulations and welcome, Shilun!
>>> 
>>> Best Regards,
>>> He Xiaoqiao
>>> (On behalf of the Apache Hadoop PMC)
>>> 
>> 




Re: [ANNOUNCE] New Hadoop Committer - Simbarashe Dzinamarira

2023-10-03 Thread Ayush Saxena
Congratulations!!!

-Ayush

> On 03-Oct-2023, at 5:42 AM, Erik Krogen  wrote:
> 
> Congratulations Simba! Thanks for the great work you've done on making HDFS
> more scalable!
> 
>> On Mon, Oct 2, 2023 at 4:31 PM Iñigo Goiri  wrote:
>> 
>> I am pleased to announce that Simbarashe Dzinamarira has been elected as a
>> committer on the Apache Hadoop project.
>> We appreciate all of Simbarashe's work, and look forward to his continued
>> contributions.
>> 
>> Congratulations and welcome !
>> 
>> Best Regards,
>> Inigo Goiri
>> (On behalf of the Apache Hadoop PMC)
>> 




Re: [ANNOUNCE] New Hadoop Committer - Shuyan Zhang

2023-09-26 Thread Ayush Saxena
Congratulations!!!

Sent from my iPhone

> On 27-Sep-2023, at 9:56 AM, Xiaoqiao He  wrote:
> 
> I am pleased to announce that Shuyan Zhang has been elected as
> a committer on the Apache Hadoop project. We appreciate all of
> Shuyan's work, and look forward to her/his continued contributions.
> 
> Congratulations and Welcome, Shuyan!
> 
> Best Regards,
> - He Xiaoqiao
> (On behalf of the Apache Hadoop PMC)




Re: HADOOP-18207 hadoop-logging module about to land

2023-07-28 Thread Ayush Saxena
> "site configs" to "log4j properties", we don't seem to have any lost
> abilities. This is also not a lost ability, it's the way of configuring it
> that is different.
>
>
>> Most of the time when this entire activity breaks & like usual we are on
> a follow-up or on an addendum PR, there is generally some sarcastic or a
> response like: 'We can't do it without breaking things', and I am not
> taking any of these for now.
>
> My sincere apologies if you think I have not addressed your comments. Could
> you please provide a specific reference and I will be happy to provide
> detailed answers. Many of the questions we are dealing with at this point
> in time have already been discussed on the parent Jira in the past.
>
> I would also like to share some of my opinions: I understand that none of
> this is simpler to deal with, none of this is an interesting work either,
> as opposed to working on a big feature or providing some bug fixes
> (specifically when it takes hours and days to figure out where we have the
> bug, despite how small the fix might be), however we still cannot give up
> on the work that helps maintain our project, can we?
> Just to provide an example, we don't have Java 11 compile support because
> of (let's say) the lack of migration from Jersey 1.x to 2.x (HADOOP-15984
> <https://issues.apache.org/jira/browse/HADOOP-15984>). At this point, it
> seems extremely difficult to migrate to Jersey 2 because of multiple
> factors (e.g. guice support + hk2 dependency injection don't work well in
> Jersey 2), I have initiated a mailing thread on eclipse community and also
> created an issue on jersey tracker to get some insights:
> https://github.com/eclipse-ee4j/jersey/issues/5357
> There has been no update and it is well known that guice support does not
> work with jersey 2, we are not aware of which direction we are going to go
> with, and how can we possibly upgrade jersey, but we still cannot give up
> on it, right? We do need to find some way, don't we?
> Similarly, the whole log4j2 upgrade is also a big deal, but we are not
> going to lose any hadoop functionality, we have an alternative way of
> achieving the same output (as mentioned about async logger).
>
> The point I am trying to make is: all this work is really not that
> interesting, but we still need to work on it, we still need to
> maintain hadoop project for the rest of the world (no matter how complex it
> is to maintain it) as a community, correct? Hadoop is the only project that
> doesn't have log4j2 support among all the other big data projects that we
> are using as of today, we can stay in this state for a while but how long
> are we willing to compromise an inevitable boring maintenance effort?
> Moreover, once we migrate to log4j2, we could get a real boost from some of
> their async logger efforts, where we don't even need async appender.
>
> I would be extremely happy if anyone else is willing to put efforts and get
> this work rolling forward. I am committed to ensuring that my work on the
> project does not give rise to "multiple community conflicts." Therefore, if
> we can confidently determine that using log4j1 for the foreseeable future
> is acceptable, there would be no need for additional dev and review cycles.
> I am more than willing to refrain from creating any further patches in such
> a scenario. Similar discussion is required for jersey as well, while it
> might be tempting to hope that integrating jersey 2 will magically resolve
> all our issues and make our Jackson and other dependency upgrades reliable,
> it is essential to be realistic about the potential challenges ahead.
> Despite the appeal of adopting jersey 2, we must be prepared to face a
> substantial number of incompatibilities that would arise with jersey 2
> migration.
>
> I am really grateful to all reviewers that have reviewed the tasks so far,
> I am not expecting reviewers to be able to provide their reviews as quickly
> as possible, this particular sub-task has been dev and test ready for the
> last 4 months, and it is absolutely okay to wait longer, but what really
> hurts a bit is the fact that despite the whole discussion that took place
> on the parent Jira, and the clear agreements/directions we have agreed
> upon, we are still engaging in the discussion to determine the value this
> work brings in.
> I sincerely apologize if any aspects were not adequately clarified during
> our discussions of each sub-task. I am more than willing to revisit any
> line of code and engage in a detailed conversation to share insights into
> the factors that influenced the changes made.
>
>
> 1. https://lists.apache.org/thread/gvfb3jkg6t11cyds4jmpo7lrswmx28w3
> 2. https://lists.apache
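For readers unfamiliar with the async-logger point above: log4j2's all-async mode is not configured through appenders at all; it is switched on globally via a context-selector property and requires the LMAX Disruptor jar on the classpath. A minimal, illustrative fragment for a `log4j2.component.properties` file (not taken from any Hadoop patch under discussion):

```properties
# Route every Logger through log4j2's async machinery (AsyncLogger).
# Requires com.lmax:disruptor on the classpath.
# The equivalent JVM flag is:
#   -Dlog4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
log4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
```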

Re: HADOOP-18207 hadoop-logging module about to land

2023-07-27 Thread Ayush Saxena
Hi Wei-Chiu,
I am glad this activity finally made it to the dev mailing list. Just
sharing the context being the guy who actually reverted this last time
it was in: It had a test failure on the PR itself and it went in, that
had nothing to do with the nature of the PR, generic for all PR and
all projects.

Some thoughts & Questions?
* Regarding this entire activity including the parent tickets: Do we
have any dev list agreement for this?
* What incompatibilities have been introduced till now for this and
what are planned.
* What does this activity bring up for the downstream folks adapting
this? Upgrading Hadoop is indeed important for a lot of projects and
for "us as well" and it is already a big pain (my past experience)
* What tests have been executed verifying all these changes including
this and the ones already in, apart from the Jenkins results, and
what's the plan.
* Considering you are heavily involved, any insights around perf stuff?
* This Comment 
[https://github.com/apache/hadoop/pull/5503#discussion_r1199614640],
this says it isn't moving all the instances? So, when do you plan to
work on this? Should that be a release blocker for us, since part of
the activity is in? Needless to say: "Best Effort, whatever could move
in, moves is, isn't an answer"
* The above comment thread even says losing some available abilities,
even some past one said so, what all is getting compromised, and how
do you plan to get it back? Most of the lost abilities are related to
HDFS, I don't think we are in a state to lose stuff there, if we
aren't having enough to make people adapt. Our ultimate goal isn't to
have something in, but to make people use it.
* What advantages do we get with all of these activities over existing
branch-3 stuff? Considering what are the trade-offs, Was discussing
with some folks offline & that seems to be a good question to have an
answer beforehand.

PS. Most of the time when this entire activity breaks & like usual we
are on a follow-up or on an addendum PR, there is generally some
sarcastic or a response like: 'We can't do it without breaking
things', and I am not taking any of these for now.

Most importantly since we are discussing it now and if there are
incompatibilities introduced already, is there a possible way out and
get rid of them, if not, if there ain't an agreement, how tough is
going back, because if it introduces incompatibilities for HDFS, you
won't get an agreement most probably, not sure about others but I will
veto that...


TLDR, Please hold unless all the concerns are addressed and we have an
agreement for this as well as anything done in past or planned for
future, Shouldn't compromise the adaptability of the product at any
cost

-Ayush

On Thu, 27 Jul 2023 at 03:47, Wei-Chiu Chuang  wrote:
>
> Hi,
>
> I am preparing to resolve HADOOP-18207
>  (
> https://github.com/apache/hadoop/pull/5717).
>
> This change affects all modules. With this change, it will eliminate almost
> all the direct log4j usage.
>
> As always, landing such a big piece is tricky. I am sorry for the mishaps
> last time and am doing more due diligence to make it a smoother transition.
> I am triggering one last precommit check. Once the change is merged, Viraj
> and I will pay attention to any potential problems.
>
> Weichiu




Re: Signing releases using automated release infra

2023-07-25 Thread Ayush Saxena
Yep, thirdparty could be a good candidate to try; building a thirdparty
release is relatively easy as well
-Ayush

On Thu, 20 Jul 2023 at 15:25, Steve Loughran  wrote:
>
>
> could be good.
>
> why not set it up for the third-party module first to see how well it works?
>
> On Tue, 18 Jul 2023 at 21:05, Ayush Saxena  wrote:
>>
>> Something we can explore as well!!
>>
>> -Ayush
>>
>> Begin forwarded message:
>>
>> > From: Volkan Yazıcı 
>> > Date: 19 July 2023 at 1:24:49 AM IST
>> > To: d...@community.apache.org
>> > Subject: Signing releases using automated release infra
>> > Reply-To: d...@community.apache.org
>> >
>> > Abstract: Signing release artifacts using an automated release
>> > infrastructure has been officially approved by LEGAL. This enables
>> > projects to sign artifacts using, say, GitHub Actions.
>> >
>> > I have been trying to overhaul the Log4j release process and make it
>> > as frictionless as possible since last year. As a part of that effort,
>> > I wanted to sign artifacts in CI during deployment and in a
>> > `members@a.o` thread[0] I explained how one can do that securely with
>> > the help of Infra. That was in December 2022. It has been a long,
>> > rough journey, but we succeeded. In this PR[1], Legal has updated the
>> > release policy to reflect that this process is officially allowed.
>> > Further, Infra put together guides[2][3] to assist projects. Logging
>> > Services PMC has already successfully performed 4 Log4j Tools releases
>> > using this approach, see its release process[4] for a demonstration.
>> >
>> > [0] (members only!)
>> > https://lists.apache.org/thread/1o12mkjrhyl45f9pof94pskg55vhs61n
>> > [1] https://github.com/apache/www-site/pull/235
>> > [2] https://infra.apache.org/release-publishing.html#signing
>> > [3] https://infra.apache.org/release-signing.html#automated-release-signing
>> > [4] 
>> > https://github.com/apache/logging-log4j-tools/blob/master/RELEASING.adoc
>> >
>> > # F.A.Q.
>> >
>> > ## Why shall a project be interested in this?
>> >
>> > It greatly simplifies the release process. See Log4j Tools release
>> > process[4], probably the simplest among all Java-based ASF projects.
>> >
>> > ## How can a project get started?
>> >
>> > 1. Make sure your project builds are reproducible (otherwise there is
>> > no way PMC can verify the integrity of CI-produced and -signed
>> > artifacts)
>> > 2. Clone and adapt INFRA-23996 (GPG keys in GitHub secrets)
>> > 3. Clone and adapt INFRA-23974 (Nexus creds. in GitHub secrets for
>> > snapshot deployments)
>> > 4. Clone and adapt INFRA-24051 (Nexus creds. in GitHub secrets for
>> > staging deployments)
>> >
>> > You might also want to check this[5] GitHub Action workflow for 
>> > inspiration.
>> >
>> > [5] 
>> > https://github.com/apache/logging-log4j-tools/blob/master/.github/workflows/build.yml
>> >
>> > ## Does the "automated release infrastructure" (CI) perform the full 
>> > release?
>> >
>> > No. CI *only* uploads signed artifacts to Nexus. The release manager
>> > (RM) still needs to copy the CI-generated files to SVN, PMC needs to
>> > vote, and, upon consensus, RM needs to "close" the release in Nexus
>> > and so on.
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@community.apache.org
>> > For additional commands, e-mail: dev-h...@community.apache.org
>> >




Fwd: Signing releases using automated release infra

2023-07-18 Thread Ayush Saxena
Something we can explore as well!!

-Ayush

Begin forwarded message:

> From: Volkan Yazıcı 
> Date: 19 July 2023 at 1:24:49 AM IST
> To: d...@community.apache.org
> Subject: Signing releases using automated release infra
> Reply-To: d...@community.apache.org
> 
> Abstract: Signing release artifacts using an automated release
> infrastructure has been officially approved by LEGAL. This enables
> projects to sign artifacts using, say, GitHub Actions.
> 
> I have been trying to overhaul the Log4j release process and make it
> as frictionless as possible since last year. As a part of that effort,
> I wanted to sign artifacts in CI during deployment and in a
> `members@a.o` thread[0] I explained how one can do that securely with
> the help of Infra. That was in December 2022. It has been a long,
> rough journey, but we succeeded. In this PR[1], Legal has updated the
> release policy to reflect that this process is officially allowed.
> Further, Infra put together guides[2][3] to assist projects. Logging
> Services PMC has already successfully performed 4 Log4j Tools releases
> using this approach, see its release process[4] for a demonstration.
> 
> [0] (members only!)
> https://lists.apache.org/thread/1o12mkjrhyl45f9pof94pskg55vhs61n
> [1] https://github.com/apache/www-site/pull/235
> [2] https://infra.apache.org/release-publishing.html#signing
> [3] https://infra.apache.org/release-signing.html#automated-release-signing
> [4] https://github.com/apache/logging-log4j-tools/blob/master/RELEASING.adoc
> 
> # F.A.Q.
> 
> ## Why shall a project be interested in this?
> 
> It greatly simplifies the release process. See Log4j Tools release
> process[4], probably the simplest among all Java-based ASF projects.
> 
> ## How can a project get started?
> 
> 1. Make sure your project builds are reproducible (otherwise there is
> no way PMC can verify the integrity of CI-produced and -signed
> artifacts)
> 2. Clone and adapt INFRA-23996 (GPG keys in GitHub secrets)
> 3. Clone and adapt INFRA-23974 (Nexus creds. in GitHub secrets for
> snapshot deployments)
> 4. Clone and adapt INFRA-24051 (Nexus creds. in GitHub secrets for
> staging deployments)
> 
> You might also want to check this[5] GitHub Action workflow for inspiration.
> 
> [5] 
> https://github.com/apache/logging-log4j-tools/blob/master/.github/workflows/build.yml
> 
> ## Does the "automated release infrastructure" (CI) perform the full release?
> 
> No. CI *only* uploads signed artifacts to Nexus. The release manager
> (RM) still needs to copy the CI-generated files to SVN, PMC needs to
> vote, and, upon consensus, RM needs to "close" the release in Nexus
> and so on.
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@community.apache.org
> For additional commands, e-mail: dev-h...@community.apache.org
> 


Re: [VOTE] Release Apache Hadoop 3.3.6 RC1

2023-06-24 Thread Ayush Saxena
+1 (Binding)

* Built from source (x86 & Arm)
* Successful native build on ubuntu 18.04(x86) & ubuntu 20.04(Arm)
* Verified Checksums (x86 & Arm)
* Verified Signatures (x86 & Arm)
* Successful RAT check (x86 & Arm)
* Verified the diff b/w the tag & the source tar
* Built Ozone with 3.3.6, green build after a retrigger due to some OOM
issues [1]
* Built Tez with 3.3.6 green build [2]
* Ran basic HDFS shell commands (Fs
Operations/EC/RBF/StoragePolicy/Snapshots) (x86 & Arm)
* Ran some basic Yarn shell commands.
* Browsed through the UI (NN, DN, RM, NM, JHS) (x86 & Arm)
* Ran some example Jobs (TeraGen, TeraSort, TeraValidate, WordCount,
WordMean, Pi) (x86 & Arm)
* Verified the output of `hadoop version` (x86 & Arm)
* Ran some HDFS unit tests around FsOperations/EC/Observer Read/RBF/SPS
* Skimmed over the contents of site jar
* Skimmed over the staging repo.
* Checked the NOTICE & Licence files.

Thanx Wei-Chiu for driving the release, Good Luck!!!

-Ayush


[1] https://github.com/ayushtkn/hadoop-ozone/actions/runs/5282707769
[2] https://github.com/apache/tez/pull/285#issuecomment-1590962978
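For context on the "Verified Checksums" step in the checklist above: it boils down to hashing the downloaded artifact and comparing the hex digest against the published `.sha512` sidecar file (in practice with a tool such as `shasum -a 512 -c`). A minimal Java sketch of the hashing half, for illustration only:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha512Demo {

    // Compute the SHA-512 digest of a byte array and render it as lowercase
    // hex, the same form used in a release's .sha512 sidecar files.
    static String hexDigest(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-512").digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is a mandatory algorithm in every JRE.
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        // "abc" is the classic SHA-512 test vector (digest starts ddaf35a1...).
        System.out.println(hexDigest("abc".getBytes()));
    }
}
```

In a real verification you would hash the tarball's bytes (streamed, not read whole) and string-compare against the digest published alongside the release, and separately check the GPG signature with `gpg --verify`.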

On Sat, 24 Jun 2023 at 09:43, Nilotpal Nandi 
wrote:

> +1 (Non-binding).
> Thanks a lot Wei-Chiu for driving it.
>
> Thanks,
> Nilotpal Nandi
>
> On 2023/06/23 21:51:56 Wei-Chiu Chuang wrote:
> > +1 (binding)
> >
> > Note: according to the Hadoop bylaw, release vote is open for 5 days,
> not 7
> > days. So technically the time is almost up.
> > https://hadoop.apache.org/bylaws#Decision+Making
> >
> > If you plan to cast a vote, please do so soon. In the meantime, I'll
> start
> > to prepare to wrap up the release work.
> >
> > On Fri, Jun 23, 2023 at 6:09 AM Xiaoqiao He 
> wrote:
> >
> > > +1(binding)
> > >
> > > * Verified signature and checksum of all source tarballs.
> > > * Built source code on Ubuntu and OpenJDK 11 by `mvn clean package
> > > -DskipTests -Pnative -Pdist -Dtar`.
> > > * Setup pseudo cluster with HDFS and YARN.
> > > * Run simple FsShell - mkdir/put/get/mv/rm and check the result.
> > > * Run example mr applications and check the result - Pi & wordcount.
> > > * Checked the Web UI of NameNode/DataNode/Resourcemanager/NodeManager
> etc.
> > > * Checked git and JIRA using dev-support tools
> > > `git_jira_fix_version_check.py` .
> > >
> > > Thanks WeiChiu for your work.
> > >
> > > NOTE: I believe the build fatal error report from me above is only
> related
> > > to my own environment.
> > >
> > > Best Regards,
> > > - He Xiaoqiao
> > >
> > > On Thu, Jun 22, 2023 at 4:17 PM Chen Yi  wrote:
> > >
> > > > Thanks Wei-Chiu for leading this effort !
> > > >
> > > > +1(Binding)
> > > >
> > > >
> > > > + Verified the signature and checksum of all tarballs.
> > > > + Started a web server and viewed documentation site.
> > > > + Built from the source tarball on macOS 12.3 and OpenJDK 8.
> > > > + Launched a pseudo distributed cluster using released binary
> packages,
> > > > done some HDFS dir/file basic opeations.
> > > > + Run grep, pi and wordcount MR tasks on the pseudo cluster.
> > > >
> > > > Bests,
> > > > Sammi Chen
> > > > 
> > > > From: Wei-Chiu Chuang 
> > > > Sent: 19 June 2023, 8:52
> > > > To: Hadoop Common ; Hdfs-dev <
> > > > hdfs-...@hadoop.apache.org>; yarn-dev ;
> > > > mapreduce-dev 
> > > > Subject: [VOTE] Release Apache Hadoop 3.3.6 RC1
> > > >
> > > > I am inviting anyone to try and vote on this release candidate.
> > > >
> > > > Note:
> > > > This is exactly the same as RC0, except the CHANGELOG.
> > > >
> > > > The RC is available at:
> > > > https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/ (for amd64)
> > > > https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-arm64/ (for arm64)
> > > >
> > > > Git tag: release-3.3.6-RC1
> > > > https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC1
> > > >
> > > > Maven artifacts is built by x86 machine and are staged at
> > > >
> https://repository.apache.org/content/repositories/orgapachehadoop-1380/
> > > >
> > > > My public key:
> > > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > > >
> > > > Changelog:
> > > > https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/CHANGELOG.md
> > > >
> > > > Release notes:
> > > >
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/RELEASENOTES.md
> > > >
> > > > This is a relatively small release (by Hadoop standard) containing
> about
> > > > 120 commits.
> > > > Please give it a try, this RC vote will run for 7 days.
> > > >
> > > >
> > > > Feature highlights:
> > > >
> > > > SBOM artifacts
> > > > 
> > > > Starting from this release, Hadoop publishes Software Bill of
> Materials
> > > > (SBOM) using
> > > > CycloneDX Maven plugin. For more information about SBOM, please go to
> > > > [SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
> > > >
> > > > HDFS RBF: RDBMS based token storage support
> > > > 
> > > > HDFS Router-Router Based Federation now 

Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-20 Thread Ayush Saxena
Folks, you have put your votes on the wrong thread; this is the RC0 thread, you
need to put them on the RC1:
https://lists.apache.org/thread/022p4rml5tvsx9xpq6t7b3n1td8lzz1d

-Ayush

Sent from my iPhone

> On 20-Jun-2023, at 10:03 PM, Ashutosh Gupta  
> wrote:
> 
> Hi
> 
> Thanks Wei-Chiu for driving the release.
> 
> +1 (non-binding)
> 
> * Builds from source looks good.
> * Checksums and signatures look good.
> * Running basic HDFS commands and running simple MapReduce jobs looks good.
> * hadoop-tools/hadoop-aws UTs and ITs looks good
> 
> Thanks,
> Ash
> 
>> On Tue, Jun 20, 2023 at 5:18 PM Mukund Madhav Thakur
>>  wrote:
>> 
>> Hi Wei-Chiu,
>> Thanks for driving the release.
>> 
>> +1 (binding)
>> Verified checksum and signature.
>> Built from source successfully.
>> Ran aws Itests
>> Ran azure Itests
>> Compiled hadoop-api-shim
>> Compiled google cloud storage.
>> 
>> 
>> I did see the two test failures in GCS connector as well but those are
>> harmless.
>> 
>> 
>> 
>> On Thu, Jun 15, 2023 at 8:21 PM Wei-Chiu Chuang
>>  wrote:
>> 
>>> Overall so far so good.
>>> 
>>> hadoop-api-shim:
>>> built, tested successfully.
>>> 
>>> cloudstore:
>>> built successfully.
>>> 
>>> Spark:
>>> built successfully. Passed hadoop-cloud tests.
>>> 
>>> Ozone:
>>> One test failure due to unrelated Ozone issue. This test is being
>> disabled
>>> in the latest Ozone code.
>>> 
>>> org.apache.hadoop.hdds.utils.NativeLibraryNotLoadedException: Unable
>>> to load library ozone_rocksdb_tools from both java.library.path &
>>> resource file libozone_rocksdb_t
>>> ools.so from jar.
>>>at
>>> 
>> org.apache.hadoop.hdds.utils.db.managed.ManagedSSTDumpTool.(ManagedSSTDumpTool.java:49)
>>> 
>>> 
>>> Google gcs:
>>> There are two test failures. The tests were added recently by
>> HADOOP-18724
>>>  in Hadoop 3.3.6.
>> This
>>> is okay. Not production code problem. Can be addressed in GCS code.
>>> 
>>> [ERROR] Errors:
>>> [ERROR]
>>> 
>>> 
>> TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testFloatingPointLength:403
>>> » IllegalArgument Unknown mandatory key for gs://fake-in-memory-test-buck
>>> et/contract-test/testFloatingPointLength "fs.option.openfile.length"
>>> [ERROR]
>>> 
>>> 
>> TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testOpenFileApplyAsyncRead:341
>>> » IllegalArgument Unknown mandatory key for gs://fake-in-memory-test-b
>>> ucket/contract-test/testOpenFileApplyAsyncRead
>> "fs.option.openfile.length"
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Wed, Jun 14, 2023 at 5:01 PM Wei-Chiu Chuang 
>>> wrote:
>>> 
 The hbase-filesystem tests passed after reverting HADOOP-18596
  and HADOOP-18633
  from my local
>> tree.
 So I think it's a matter of the default behavior being changed. It's
>> not
 the end of the world. I think we can address it by adding an
>> incompatible
 change flag and a release note.
 
 On Wed, Jun 14, 2023 at 3:55 PM Wei-Chiu Chuang 
 wrote:
 
> Cross referenced git history and jira. Changelog needs some update
> 
> Not in the release
> 
>   1. HDFS-16858
>   2. HADOOP-18532 <https://issues.apache.org/jira/browse/HADOOP-18532>
>   3. HDFS-16861
>   4. HDFS-16866
>   5. HADOOP-18320
> 
> Updated the fix versions. Will generate a new Changelog in the next RC.
> 
> Was able to build HBase and hbase-filesystem without any code change.
> 
> hbase has one unit test failure. This one is reproducible even with
> Hadoop 3.3.5, so maybe a red herring. Local env or something.
> 
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time
>> elapsed:
> 9.007 s <<< FAILURE! - in
> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker
> [ERROR]
> 
>>> 
>> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness
> Time elapsed: 3.13 s  <<< ERROR!
> java.lang.OutOfMemoryError: Java heap space
> at
> 
>>> 
>> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker$RandomTestData.(TestSyncTimeRangeTracker.java:91)
> at
> 
>>> 
>> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness(TestSyncTimeRangeTracker.java:156)
> 
> hbase-filesystem has three test failures in TestHBOSSContractDistCp,
>> and
> is not reproducible with Hadoop 3.3.5.
> [ERROR] Failures: [ERROR]
> 
>>> 
>> 

Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-18 Thread Ayush Saxena
Hi Masatake,
That ticket is just a backport of:
https://issues.apache.org/jira/browse/HADOOP-17612 and the only java code
change is in test [1], which is causing issues for you. That ticket isn't
marked incompatible either, so maybe that is why it got pulled back.

btw. the original PR does mention it to be compatible [2].

Are you trying to compile ZK-3.5 with Hadoop-3.3.6, or Hadoop-3.3.6 with
ZK-3.5? If the latter, and the compilation fails, then it shouldn't be an
incompatible change, right? Or do we need to maintain compat that way as
well?

+1 in maintaining compatibility, incompatible changes should be avoided as
far as possible unless excessively necessary even in trunk or like we need
to do it for some "relevant" security issue or so in those thirdparty libs.
The RC1 vote is already up, do you plan to get this change excluded from
that?

Regarding the test, if you pass ``null`` instead of that DisconnectReason,
then the test passes, but I am pretty sure you would get a
NoSuchMethodError for closeAll after that, because that closeAll isn't
there in ZK-3.5; ZOOKEEPER-3439 removed it
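For illustration only (this is not the fix discussed in the thread): one generic way test code could tolerate both `closeAll` signatures is a reflective lookup. The ZooKeeper names (`ServerCnxnFactory`, `ServerCnxn.DisconnectReason`) appear only in comments; `OldStyleFactory` is a stand-in class so the sketch runs without ZooKeeper on the classpath.

```java
import java.lang.reflect.Method;

/**
 * Sketch of bridging the two closeAll() shapes of ZooKeeper's
 * ServerCnxnFactory: the no-arg overload (3.5) and the
 * closeAll(ServerCnxn.DisconnectReason) overload added by ZOOKEEPER-3439.
 */
public class ZkCloseAllCompat {

  /** Invoke closeAll() or closeAll(SomeEnum) reflectively, whichever exists. */
  static String closeAllCompat(Object factory) {
    try {
      for (Method m : factory.getClass().getMethods()) {
        if (!"closeAll".equals(m.getName())) {
          continue;
        }
        if (m.getParameterCount() == 0) {          // 3.5-style overload
          m.invoke(factory);
          return "no-arg";
        }
        Class<?> p = m.getParameterTypes()[0];
        if (p.isEnum()) {                          // 3.6+-style enum overload
          m.invoke(factory, p.getEnumConstants()[0]);
          return "enum-arg";
        }
      }
      throw new IllegalStateException("no usable closeAll overload");
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException(e);
    }
  }

  /** Stand-in for a 3.5-style factory exposing only the no-arg overload. */
  public static class OldStyleFactory {
    boolean closed;
    public void closeAll() { closed = true; }
  }

  public static void main(String[] args) {
    OldStyleFactory f = new OldStyleFactory();
    String used = closeAllCompat(f);
    if (!f.closed || !"no-arg".equals(used)) {
      throw new AssertionError("expected the no-arg overload");
    }
    System.out.println("closed via " + used + " overload"); // prints: closed via no-arg overload
  }
}
```

In practice a project would more likely compile against the matching ZooKeeper version; this only shows that the two overloads can be bridged from one source tree.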

-Ayush

PS. From this doc: https://zookeeper.apache.org/releases.html, even ZK-3.6
line is also EOL, not sure how those guys operate :-)

[1]
https://github.com/apache/hadoop/pull/3241/files#diff-b273546d6f060e617553eaa49da69039d2c655a77d42022779c2281d0f6cd08eR135
[2] https://github.com/apache/hadoop/pull/3241#issuecomment-889185103

On Mon, 19 Jun 2023 at 06:44, Masatake Iwasaki 
wrote:

> I got compilation error against ZooKeeper 3.5 due to HADOOP-18515.
> It should be marked as incompatible change?
> https://issues.apache.org/jira/browse/HADOOP-18515
>
> ::
>
>[ERROR]
> /home/rocky/srcs/bigtop/build/hadoop/rpm/BUILD/hadoop-3.3.6-src/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/TestZKFailoverControllerStress.java:[135,40]
> cannot find symbol
>  symbol:   variable DisconnectReason
>  location: class org.apache.zookeeper.server.ServerCnxn
>
> While ZooKeeper 3.5 is already EoL, it would be nice to keep compatibility
> in a patch release
> especially if only test code is the cause.
>
> Thanks,
> Masatake Iwasaki
>
> On 2023/06/18 4:57, Wei-Chiu Chuang wrote:
> > I was going to do another RC in case something comes up.
> > But it looks like the only thing that needs to be fixed is the Changelog.
> >
> >
> > 1. HADOOP-18596 
> >
> > HADOOP-18633 
> > are related to cloud store semantics, and I don't want to make a
> judgement
> > call on it. As far as I can tell its effect can be addressed by
> supplying a
> > config option in the application code.
> > It looks like the feature improves fault tolerance by ensuring files are
> > synchronized if modification time is different between the source and
> > destination. So to me it's the better behavior.
> >
> > I can make a RC1 over the weekend to fix the Changelog but that's
> probably
> > the only thing that's going to have.
> > On Sat, Jun 17, 2023 at 2:00 AM Xiaoqiao He 
> wrote:
> >
> >> Thanks Wei-Chiu for driving this release. The next RC will be prepared,
> >> right?
> >> If true, I would like to try and vote on the next RC.
> >> Just notice that some JIRAs are not included and need to revert some
> PRs to
> >> pass HBase verification which are mentioned above.
> >>
> >> Best Regards,
> >> - He Xiaoqiao
> >>
> >>
> >> On Fri, Jun 16, 2023 at 9:20 AM Wei-Chiu Chuang
> >>  wrote:
> >>
> >>> Overall so far so good.
> >>>
> >>> hadoop-api-shim:
> >>> built, tested successfully.
> >>>
> >>> cloudstore:
> >>> built successfully.
> >>>
> >>> Spark:
> >>> built successfully. Passed hadoop-cloud tests.
> >>>
> >>> Ozone:
> >>> One test failure due to unrelated Ozone issue. This test is being
> >> disabled
> >>> in the latest Ozone code.
> >>>
> >>> org.apache.hadoop.hdds.utils.NativeLibraryNotLoadedException: Unable
> >>> to load library ozone_rocksdb_tools from both java.library.path &
>>> resource file libozone_rocksdb_tools.so from jar.
>>>  at
>>> org.apache.hadoop.hdds.utils.db.managed.ManagedSSTDumpTool.(ManagedSSTDumpTool.java:49)
> >>>
> >>>
> >>> Google gcs:
> >>> There are two test failures. The tests were added recently by
> >> HADOOP-18724
> >>>  in Hadoop 3.3.6.
> >> This
> >>> is okay. Not production code problem. Can be addressed in GCS code.
> >>>
> >>> [ERROR] Errors:
>>> [ERROR]
>>> TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testFloatingPointLength:403
>>> » IllegalArgument Unknown mandatory key for
>>> gs://fake-in-memory-test-bucket/contract-test/testFloatingPointLength "fs.option.openfile.length"
>>> [ERROR]
>>> TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testOpenFileApplyAsyncRead:341
> >>> » IllegalArgument Unknown mandatory key for 

Fwd: Call for Presentations, Community Over Code Asia 2023

2023-06-05 Thread Ayush Saxena
Forwarding from dev@hadoop to the ML we use.
Original Mail:
https://lists.apache.org/thread/3c16xyvmdyvg4slmtfywcwjbck1b4x02

-Ayush

-- Forwarded message -
From: Rich Bowen 
Date: Mon, 5 Jun 2023 at 21:43
Subject: Call for Presentations, Community Over Code Asia 2023
To: rbo...@apache.org 


You are receiving this message because you are subscribed to one or
more developer mailing lists at the Apache Software Foundation.

The call for presentations is now open at
"https://apachecon.com/acasia2023/cfp.html", and will be closed by
Sunday, Jun 18th, 2023 11:59 PM GMT.

The event will be held in Beijing, China, August 18-20, 2023.

We are looking for presentations about anything relating to Apache
Software Foundation projects, open-source governance, community, and
software development.
In particular, this year we are building content tracks around the
following specific topics/projects:

AI / Machine learning
API / Microservice
Community
CloudNative
Data Storage & Computing
DataOps
Data Lake & Data Warehouse
OLAP & Data Analysis
Performance Engineering
Incubator
IoT/IIoT
Messaging
RPC
Streaming
Workflow / Data Processing
Web Server / Tomcat

If your proposed presentation falls into one of these categories,
please select that topic in the CFP entry form. Or select Others if
it’s related to another topic or project area.

Looking forward to hearing from you!

Willem Jiang, and the Community Over Code planners


-
To unsubscribe, e-mail: dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: dev-h...@hadoop.apache.org


Re: Issues with Github PR integration with Jira

2023-05-21 Thread Ayush Saxena
It is sorted now. The GitHub bot got blocked as a spam user. Poor fellow
:-)

-Ayush

On Fri, 19 May 2023 at 05:33, Ayush Saxena  wrote:

> Hi Folks,
> Just wanted to let everyone know, there is an ongoing issue with the
> GitHub PR & Jira integration that has persisted for the last few days.
>
> The Pull Requests aren't getting linked automatically to Jira in some
> cases. If you observe that, please link manually until this issue is
> resolved.
>
> I have an INFRA ticket in place for it [1] and I have validated that the
> yaml files are intact [2], so it shouldn't be Hadoop-specific.
>
>
> -Ayush
>
> [1] https://issues.apache.org/jira/browse/INFRA-24608
> [2] https://github.com/apache/hadoop/blame/trunk/.asf.yaml#L27
>


Re: Call for Presentations, Community Over Code 2023

2023-05-19 Thread Ayush Saxena
Thanx Wei-Chiu,

By any chance if you have access to the presentations, do share!!!

Was looking at the Apache Iceberg website [1]: they maintain a list of
all the blogs shared anywhere. I was thinking we could collect some
blog/YouTube links around some fascinating talks and have such a page
on our website as well, obviously only if people feel good about it. Just
an idea that came to my mind looking at the Iceberg website, and I
thought of sharing it.


[1] https://iceberg.apache.org/blogs/

On Wed, 10 May 2023 at 10:13, Wei-Chiu Chuang  wrote:
>
> There's also a call for presentation for Community over Code Asia 2023
>
> https://www.bagevent.com/event/cocasia-2023-EN
> Happening Aug 18-20. CfP due by 6/6
>
>
> On Tue, May 9, 2023 at 8:39 PM Ayush Saxena  wrote:
>>
>> Forwarding from dev@hadoop to the dev ML which we use.
>>
>> The actual mail lies here:
>> https://www.mail-archive.com/dev@hadoop.apache.org/msg00160.html
>>
>> -Ayush
>>
>> On 2023/05/09 21:24:09 Rich Bowen wrote:
>> > (Note: You are receiving this because you are subscribed to the dev@
>> > list for one or more Apache Software Foundation projects.)
>> >
>> > The Call for Presentations (CFP) for Community Over Code (formerly
>> > Apachecon) 2023 is open at
>> > https://communityovercode.org/call-for-presentations/, and will close
>> > Thu, 13 Jul 2023 23:59:59 GMT.
>> >
>> > The event will be held in Halifax, Canada, October 7-10, 2023.
>> >
>> > We welcome submissions on any topic related to the Apache Software
>> > Foundation, Apache projects, or the communities around those projects.
>> > We are specifically looking for presentations in the following
>> > categories:
>> >
>> > Fintech
>> > Search
>> > Big Data, Storage
>> > Big Data, Compute
>> > Internet of Things
>> > Groovy
>> > Incubator
>> > Community
>> > Data Engineering
>> > Performance Engineering
>> > Geospatial
>> > API/Microservices
>> > Frameworks
>> > Content Wrangling
>> > Tomcat and httpd
>> > Cloud and Runtime
>> > Streaming
>> > Sustainability
>> >
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@hadoop.apache.org
>> > For additional commands, e-mail: dev-h...@hadoop.apache.org
>> >
>> >
>>
>>
>> Sent from my iPhone
>>
>> -
>> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>>

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Issues with Github PR integration with Jira

2023-05-18 Thread Ayush Saxena
Hi Folks,
Just wanted to let everyone know, there is an ongoing issue with the
GitHub PR & Jira integration that has persisted for the last few days.

The Pull Requests aren't getting linked automatically to Jira in some
cases. If you observe that, please link manually until this issue is
resolved.

I have an INFRA ticket in place for it [1] and I have validated that the
yaml files are intact [2], so it shouldn't be Hadoop-specific.


-Ayush

[1] https://issues.apache.org/jira/browse/INFRA-24608
[2] https://github.com/apache/hadoop/blame/trunk/.asf.yaml#L27




RE: Call for Presentations, Community Over Code 2023

2023-05-09 Thread Ayush Saxena
Forwarding from dev@hadoop to the dev ML which we use.

The actual mail lies here:
https://www.mail-archive.com/dev@hadoop.apache.org/msg00160.html

-Ayush

On 2023/05/09 21:24:09 Rich Bowen wrote:
> (Note: You are receiving this because you are subscribed to the dev@
> list for one or more Apache Software Foundation projects.)
>
> The Call for Presentations (CFP) for Community Over Code (formerly
> Apachecon) 2023 is open at
> https://communityovercode.org/call-for-presentations/, and will close
> Thu, 13 Jul 2023 23:59:59 GMT.
>
> The event will be held in Halifax, Canada, October 7-10, 2023.
>
> We welcome submissions on any topic related to the Apache Software
> Foundation, Apache projects, or the communities around those projects.
> We are specifically looking for presentations in the following
> categories:
>
> Fintech
> Search
> Big Data, Storage
> Big Data, Compute
> Internet of Things
> Groovy
> Incubator
> Community
> Data Engineering
> Performance Engineering
> Geospatial
> API/Microservices
> Frameworks
> Content Wrangling
> Tomcat and httpd
> Cloud and Runtime
> Streaming
> Sustainability
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: dev-h...@hadoop.apache.org
>
>


Sent from my iPhone




Re: Nightly Jenkins CI for Hadoop on Windows 10

2023-05-09 Thread Ayush Saxena
One thing I forgot to mention: if you are planning to also run tests on
Windows, then GitHub Actions isn't something to be chased. It has a
6-hour timeout.

So, in case of multi-module or root-level changes, we will get screwed:
the tests take more than 20 hrs in those cases, so we will hit the
timeout

-Ayush

On Tue, 9 May 2023 at 11:49 PM, Ayush Saxena  wrote:

> Hi Gautham,
> The only advantage I can think of is that your node limitation shouldn’t
> bother you much. Though Apache is given limited resources, I have still
> never observed any starvation for resources in any project using GitHub
> Actions.
>
> If Precommit CI for windows is there and it can work in parallel, that
> should be good as well
>
> Good Luck!!!
>
> -Ayush
>
> On Tue, 9 May 2023 at 8:47 PM, Gautham Banasandra 
> wrote:
>
>> Hi Ayush,
>>
>> I was planning to set up the precommit CI job for Windows 10, so that
>> we validate each PR against Windows 10. The precommit CI can actually
>> run in parallel to the Linux precommit CI. But I think we've got very few
>> Windows nodes (about 12 I think) compared to the number of Linux nodes
>> (about 40 in number).
>> However, I think the setting up the precommit CI is quite worth
>> the effort.
>> It was quite hard to get to this point and we certainly don't want to
>> regress
>> in this regard. I think I'll pursue this once I stabilize the nightly CI
>> and after
>> I sort out the rest of the failing parts (mvnsite, javadoc etc).
>>
>> Also, does Github Actions provide any advantage over precommit CI?
>>
>> Thanks,
>> --Gautham
>>
>> On Tue, 9 May 2023 at 02:44, Ayush Saxena  wrote:
>>
>>> Awesome!!!
>>>
>>> Do you plan to run some builds per PR to ensure no commit breaks it
>>> in the future? Worth exploring GitHub Actions in the later stages if we
>>> can have an action for maven install on Windows; it won't affect the
>>> normal build time either, and can run in parallel with the existing
>>> builds.
>>>
>>> -Ayush
>>>
>>> On Mon, 8 May 2023 at 23:39, Gautham Banasandra 
>>> wrote:
>>> >
>>> > Dear Hadoop community,
>>> >
>>> > It is my pleasure to announce that I've set up the Nightly Jenkins CI
>>> for
>>> > Hadoop on the Windows 10 platform[1]. The effort mainly involved
>>> getting
>>> > Yetus to run on Windows against Hadoop.
>>> > The nightly CI will run every 36 hours and send out the build report
>>> to the
>>> > same recipients as this email, upon completion.
>>> > There are still quite a few things that need to be sorted out.
>>> Currently,
>>> > mvninstall has a +1. Other phases like mvnsite, javadoc etc still need
>>> to be
>>> > fixed.
>>> >
>>> > [1]
>>> >
>>> https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-win10-x86_64/
>>> >
>>> > Thanks,
>>> > --Gautham
>>>
>>


Re: Nightly Jenkins CI for Hadoop on Windows 10

2023-05-09 Thread Ayush Saxena
Hi Gautham,
The only advantage I can think of is that your node limitation shouldn’t
bother you much. Though Apache is given limited resources, I have still
never observed any starvation for resources in any project using GitHub
Actions.

If Precommit CI for windows is there and it can work in parallel, that
should be good as well

Good Luck!!!

-Ayush

On Tue, 9 May 2023 at 8:47 PM, Gautham Banasandra 
wrote:

> Hi Ayush,
>
> I was planning to set up the precommit CI job for Windows 10, so that
> we validate each PR against Windows 10. The precommit CI can actually
> run in parallel to the Linux precommit CI. But I think we've got very few
> Windows nodes (about 12 I think) compared to the number of Linux nodes
> (about 40 in number).
> However, I think the setting up the precommit CI is quite worth the effort.
> It was quite hard to get to this point and we certainly don't want to
> regress
> in this regard. I think I'll pursue this once I stabilize the nightly CI
> and after
> I sort out the rest of the failing parts (mvnsite, javadoc etc).
>
> Also, does Github Actions provide any advantage over precommit CI?
>
> Thanks,
> --Gautham
>
> On Tue, 9 May 2023 at 02:44, Ayush Saxena  wrote:
>
>> Awesome!!!
>>
>> Do you plan to run some builds per PR to ensure no commit breaks it
>> in the future? Worth exploring GitHub Actions in the later stages if we
>> can have an action for maven install on Windows; it won't affect the
>> normal build time either, and can run in parallel with the existing
>> builds.
>>
>> -Ayush
>>
>> On Mon, 8 May 2023 at 23:39, Gautham Banasandra 
>> wrote:
>> >
>> > Dear Hadoop community,
>> >
>> > It is my pleasure to announce that I've set up the Nightly Jenkins CI
>> for
>> > Hadoop on the Windows 10 platform[1]. The effort mainly involved getting
>> > Yetus to run on Windows against Hadoop.
>> > The nightly CI will run every 36 hours and send out the build report to
>> the
>> > same recipients as this email, upon completion.
>> > There are still quite a few things that need to be sorted out.
>> Currently,
>> > mvninstall has a +1. Other phases like mvnsite, javadoc etc still need
>> to be
>> > fixed.
>> >
>> > [1]
>> >
>> https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-win10-x86_64/
>> >
>> > Thanks,
>> > --Gautham
>>
>


Re: Nightly Jenkins CI for Hadoop on Windows 10

2023-05-09 Thread Ayush Saxena
Awesome!!!

Do you plan to run some builds per PR to ensure no commit breaks it
in the future? Worth exploring GitHub Actions in the later stages if we
can have an action for maven install on Windows; it won't affect the
normal build time either, and can run in parallel with the existing
builds.

-Ayush

On Mon, 8 May 2023 at 23:39, Gautham Banasandra  wrote:
>
> Dear Hadoop community,
>
> It is my pleasure to announce that I've set up the Nightly Jenkins CI for
> Hadoop on the Windows 10 platform[1]. The effort mainly involved getting
> Yetus to run on Windows against Hadoop.
> The nightly CI will run every 36 hours and send out the build report to the
> same recipients as this email, upon completion.
> There are still quite a few things that need to be sorted out. Currently,
> mvninstall has a +1. Other phases like mvnsite, javadoc etc still need to be
> fixed.
>
> [1]
> https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-win10-x86_64/
>
> Thanks,
> --Gautham




Re: [DISCUSS] Hadoop 3.3.6 release planning

2023-05-09 Thread Ayush Saxena
That openssl change ain't a blocker now from my side, that ABFS-Jdk-17
stuff got sorted out, Steve knew a way out

On Sat, 6 May 2023 at 00:51, Ayush Saxena  wrote:
>
> Thanx Wei-Chiu for the initiative, Good to have quick releases :)
>
> With my Hive developer hat on, I would like to bring some stuff up for
> consideration(feel free to say no, if it is beyond scope or feels even
> a bit unsafe, don't want to mess up the release)
>
> * HADOOP-18662: ListFiles with recursive fails with FNF : This broke
> compaction in Hive, bothers only with HDFS though. There is a
> workaround to that, if it doesn't feel safe. no issues, or if some
> improvements suggested. I can quickly do that :)
>
> * HADOOP-17649: Update wildfly openssl to 2.1.3.Final. Maybe not 2.1.3
> but if it works and is safe then to 2.2.5. I got flagged today that
> this openssl creates a bit of mess with JDK-17 for Hive with ABFS I
> think(need to dig in more),
>
> Now for the dependency upgrades:
>
> A big NO to Jackson, that ain't safe and the wounds are still fresh,
> it screwed the 3.3.3 release for many projects. So, let's not get into
> that. Infact anything that touches those shaded jars is risky, some
> package-json exclusion also created a mess recently. So, Lets not
> touch only and that too when we have less time.
>
> Avoid anything around Jetty upgrade, I have selfish reasons for that.
> Jetty messes something up with Hbase and Hive has a dependency on
> Hbase, and it is crazy, in case interested [1]. So, any upgrade to
> Jetty will block hive from upgrading Hadoop as of today. But that is a
> selfish reason and just up for consideration. Go ahead if necessary. I
> just wanted to let folks know
>
>
> Apart from the Jackson stuff, everything is suggestive in nature, your
> call feel free to ignore.
>
> @Xiaoqiao He , maybe pulling in all those 100+ would be risky,
> considering the timelines, but if we find a few fancy safe tickets,
> maybe if you have identified any already, can put them up on this
> thread and if folks are convinced. We can get them in? Juzz my
> thoughts, it is up to you and Wei-Chiu, (No skin in the game opinion)
>
>
> Good Luck
>
> -Ayush
>
> [1] https://github.com/apache/hive/pull/4290#issuecomment-1536553803
>
> On Fri, 5 May 2023 at 16:13, Steve Loughran  
> wrote:
> >
> > Wei-Chiu has suggested a minimal "things in 3.3.5 which were very broken,
> > api change for ozone and any critical jar updates"
> >
> > so much lower risk/easier to qualify and ship.
> >
> > I need to get https://issues.apache.org/jira/browse/HADOOP-18724 in here;
> > maybe look at a refresh of the "classic" jars (slf4j, reload, jackson*,
> > remove json-smart...)
> >
> > I'd also like to downgrade protobuf 2.5 from required to optional; even
> > though hadoop uses the shaded one, to support hbase etc the IPC code still
> > has direct use of the 2.5 classes. that coud be optional
> >
> > if anyone wants to take up this PR, I would be very happy
> > https://github.com/apache/hadoop/pull/4996
> >
> > On Fri, 5 May 2023 at 04:27, Xiaoqiao He  wrote:
> >
> > > Thanks Wei-Chiu for driving this release.
> > > Cherry-pick YARN-11482 to branch-3.3 and mark 3.3.6 as the fixed version.
> > >
> > > so far only 8 jiras were resolved in the branch-3.3 line.
> > >
> > >
> > > If we should consider both 3.3.6 and 3.3.9 (which is from release-3.3.5
> > > discuss)[1] for this release line?
> > > I try to query with `project in (HDFS, YARN, HADOOP, MAPREDUCE) AND
> > > fixVersion in (3.3.6, 3.3.9)`[2],
> > > there are more than hundred jiras now.
> > >
> > > Best Regards,
> > > - He Xiaoqiao
> > >
> > > [1] https://lists.apache.org/thread/kln96frt2tcg93x6ht99yck9m7r9qwxp
> > > [2]
> > >
> > > https://issues.apache.org/jira/browse/YARN-11482?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20fixVersion%20in%20(3.3.6%2C%203.3.9)
> > >
> > >
> > > On Fri, May 5, 2023 at 1:19 AM Wei-Chiu Chuang  wrote:
> > >
> > > > Hi community,
> > > >
> > > > I'd like to kick off the discussion around Hadoop 3.3.6 release plan.
> > > >
> > > > I'm being selfish but my intent for 3.3.6 is to have the new APIs in
> > > > HADOOP-18671 <https://issues.apache.org/jira/browse/HADOOP-18671> added
> > > so
> > > > we can have HBase to adopt this new API. Other than that, perhaps
> > > > thirdparty dependency updates.
> > > >
> > >

Re: [DISCUSS] Hadoop 3.3.6 release planning

2023-05-05 Thread Ayush Saxena
Thanx Wei-Chiu for the initiative, Good to have quick releases :)

With my Hive developer hat on, I would like to bring some stuff up for
consideration(feel free to say no, if it is beyond scope or feels even
a bit unsafe, don't want to mess up the release)

* HADOOP-18662: ListFiles with recursive fails with FNF: This broke
compaction in Hive, though it bothers only with HDFS. There is a
workaround for that, so if it doesn't feel safe, no issues; or if some
improvements are suggested, I can quickly do that :)

* HADOOP-17649: Update wildfly openssl to 2.1.3.Final. Maybe not 2.1.3
but if it works and is safe then to 2.2.5. I got flagged today that
this openssl creates a bit of mess with JDK-17 for Hive with ABFS I
think(need to dig in more),

Now for the dependency upgrades:

A big NO to Jackson, that ain't safe and the wounds are still fresh;
it screwed the 3.3.3 release for many projects. So, let's not get into
that. In fact anything that touches those shaded jars is risky, some
package-json exclusion also created a mess recently. So, let's not
touch it, and that too when we have less time.

Avoid anything around Jetty upgrade, I have selfish reasons for that.
Jetty messes something up with Hbase and Hive has a dependency on
Hbase, and it is crazy, in case interested [1]. So, any upgrade to
Jetty will block hive from upgrading Hadoop as of today. But that is a
selfish reason and just up for consideration. Go ahead if necessary. I
just wanted to let folks know


Apart from the Jackson stuff, everything is suggestive in nature, your
call feel free to ignore.

@Xiaoqiao He , maybe pulling in all those 100+ would be risky,
considering the timelines, but if we find a few fancy safe tickets,
maybe ones you have identified already, you can put them up on this
thread, and if folks are convinced, we can get them in? Just my
thoughts, it is up to you and Wei-Chiu. (No skin in the game opinion)


Good Luck

-Ayush

[1] https://github.com/apache/hive/pull/4290#issuecomment-1536553803

On Fri, 5 May 2023 at 16:13, Steve Loughran  wrote:
>
> Wei-Chiu has suggested a minimal "things in 3.3.5 which were very broken,
> api change for ozone and any critical jar updates"
>
> so much lower risk/easier to qualify and ship.
>
> I need to get https://issues.apache.org/jira/browse/HADOOP-18724 in here;
> maybe look at a refresh of the "classic" jars (slf4j, reload, jackson*,
> remove json-smart...)
>
> I'd also like to downgrade protobuf 2.5 from required to optional; even
> though hadoop uses the shaded one, to support hbase etc the IPC code still
> has direct use of the 2.5 classes. that coud be optional
>
> if anyone wants to take up this PR, I would be very happy
> https://github.com/apache/hadoop/pull/4996
>
> On Fri, 5 May 2023 at 04:27, Xiaoqiao He  wrote:
>
> > Thanks Wei-Chiu for driving this release.
> > Cherry-pick YARN-11482 to branch-3.3 and mark 3.3.6 as the fixed version.
> >
> > so far only 8 jiras were resolved in the branch-3.3 line.
> >
> >
> > If we should consider both 3.3.6 and 3.3.9 (which is from release-3.3.5
> > discuss)[1] for this release line?
> > I try to query with `project in (HDFS, YARN, HADOOP, MAPREDUCE) AND
> > fixVersion in (3.3.6, 3.3.9)`[2],
> > there are more than hundred jiras now.
> >
> > Best Regards,
> > - He Xiaoqiao
> >
> > [1] https://lists.apache.org/thread/kln96frt2tcg93x6ht99yck9m7r9qwxp
> > [2]
> >
> > https://issues.apache.org/jira/browse/YARN-11482?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20fixVersion%20in%20(3.3.6%2C%203.3.9)
> >
> >
> > On Fri, May 5, 2023 at 1:19 AM Wei-Chiu Chuang  wrote:
> >
> > > Hi community,
> > >
> > > I'd like to kick off the discussion around Hadoop 3.3.6 release plan.
> > >
> > > I'm being selfish but my intent for 3.3.6 is to have the new APIs in
> > > HADOOP-18671  added
> > so
> > > we can have HBase to adopt this new API. Other than that, perhaps
> > > thirdparty dependency updates.
> > >
> > > If you have open items to be added in the coming weeks, please add 3.3.6
> > to
> > > the target release version. Right now I am only seeing three open jiras
> > > targeting 3.3.6.
> > >
> > > I imagine this is going to be a small release as 3.3.5 (hat tip to Steve)
> > > was only made two months back, and so far only 8 jiras were resolved in
> > the
> > > branch-3.3 line.
> > >
> > > Best,
> > > Weichiu
> > >
> >




Re: [VOTE] Release Apache Hadoop 3.3.5 (RC3)

2023-03-30 Thread Ayush Saxena
We have a daily build running for 3.3.5:
https://ci-hadoop.apache.org/job/hadoop-qbt-3.3.5-java8-linux-x86_64/

We have already released it, so I feel we can disable it. Will do it
tomorrow, if nobody objects. In case the one who configured it wants
to do it early, feel free to do so.

We already have one for branch-3.3 which runs weekly, and which
probably most of us don't follow :)
-Ayush

On Wed, 22 Mar 2023 at 00:20, Steve Loughran
 wrote:
>
> ok, here's my summary, even though most of the binding voters forgot to
> declare they were on the PMC.
>
> +1 binding
>
> Steve Loughran
> Chris Nauroth
> Masatake Iwasaki
> Ayush Saxena
> Xiaoqiao He
>
> +1 non-binding
>
> Viraj Jasani
>
>
> 0 or -1 votes: none.
>
>
> Accordingly: the release is good!
>
> I will send the formal announcement out tomorrow
>
> A big thank you to everyone who qualified the RC, I know its a lot of work.
> We can now get this out and *someone else* can plan the followup.
>
>
> steve
>
> On Mon, 20 Mar 2023 at 16:01, Chris Nauroth  wrote:
>
> > +1
> >
> > Thank you for the release candidate, Steve!
> >
> > * Verified all checksums.
> > * Verified all signatures.
> > * Built from source, including native code on Linux.
> > * mvn clean package -Pnative -Psrc -Drequire.openssl -Drequire.snappy
> > -Drequire.zstd -DskipTests
> > * Tests passed.
> > * mvn --fail-never clean test -Pnative -Dparallel-tests
> > -Drequire.snappy -Drequire.zstd -Drequire.openssl
> > -Dsurefire.rerunFailingTestsCount=3 -DtestsThreadCount=8
> > * Checked dependency tree to make sure we have all of the expected library
> > updates that are mentioned in the release notes.
> > * mvn -o dependency:tree
> > * Confirmed that hadoop-openstack is now just a stub placeholder artifact
> > with no code.
> > * For ARM verification:
> > * Ran "file " on all native binaries in the ARM tarball to confirm
> > they actually came out with ARM as the architecture.
> > * Output of hadoop checknative -a on ARM looks good.
> > * Ran a MapReduce job with the native bzip2 codec for compression, and
> > it worked fine.
> > * Ran a MapReduce job with YARN configured to use
> > LinuxContainerExecutor and verified launching the containers through
> > container-executor worked.
> >
> > Chris Nauroth
> >
> >
> > On Mon, Mar 20, 2023 at 3:45 AM Ayush Saxena  wrote:
> >
> > > +1(Binding)
> > >
> > > * Built from source (x86 & ARM)
> > > * Successful Native Build (x86 & ARM)
> > > * Verified Checksums (x86 & ARM)
> > > * Verified Signature (x86 & ARM)
> > > * Checked the output of hadoop version (x86 & ARM)
> > > * Verified the output of hadoop checknative (x86 & ARM)
> > > * Ran some basic HDFS shell commands.
> > > * Ran some basic Yarn shell commands.
> > > * Played a bit with HDFS Erasure Coding.
> > > * Ran TeraGen & TeraSort
> > > * Browsed through NN, DN, RM & NM UI
> > > * Skimmed over the contents of website.
> > > * Skimmed over the contents of maven repo.
> > > * Selectively ran some HDFS & CloudStore tests
> > >
> > > Thanx Steve for driving the release. Good Luck!!!
> > >
> > > -Ayush
> > >
> > > > On 20-Mar-2023, at 12:54 PM, Xiaoqiao He 
> > wrote:
> > > >
> > > > +1
> > > >
> > > > * Verified signature and checksum of the source tarball.
> > > > * Built the source code on Ubuntu and OpenJDK 11 by `mvn clean package
> > > > -DskipTests -Pnative -Pdist -Dtar`.
> > > > * Setup pseudo cluster with HDFS and YARN.
> > > > * Run simple FsShell - mkdir/put/get/mv/rm (include EC) and check the
> > > > result.
> > > > * Run example mr applications and check the result - Pi & wordcount.
> > > > * Check the Web UI of NameNode/DataNode/Resourcemanager/NodeManager
> > etc.
> > > >
> > > > Thanks Steve for your work.
> > > >
> > > > Best Regards,
> > > > - He Xiaoqiao
> > > >
> > > >> On Mon, Mar 20, 2023 at 12:04 PM Masatake Iwasaki <
> > > iwasak...@oss.nttdata.com>
> > > >> wrote:
> > > >>
> > > >> +1
> > > >>
> > > >> + verified the signature and checksum of the source tarball.
> > > >>
> > > >> + built from the source tarball on Rocky Linux 8 (x86_64) and OpenJDK
> > 

A Message from the Board to PMC members

2023-03-29 Thread Ayush Saxena
 d...@hadoop.apache.org received a mail from the apache board,
forwarding it to the mailing lists which we use, sharing the link
here:

https://www.mail-archive.com/dev@hadoop.apache.org/msg00158.html

private@ in bcc

-Ayush




Re: [DISCUSS] hadoop branch-3.3+ going to java11 only

2023-03-28 Thread Ayush Saxena
>
>  it's already hard to migrate from JDK8 why not retarget JDK17.
>

+1, makes sense to me, sounds like a win-win situation, though there
would be some additional issues to chase now :)

-Ayush


On Tue, 28 Mar 2023 at 23:29, Wei-Chiu Chuang  wrote:

> My random thoughts. Probably bad takes:
>
> There are projects experimenting with JDK17 now.
> JDK11 active support will end in 6 months. If it's already hard to migrate
> from JDK8 why not retarget JDK17.
>
> On Tue, Mar 28, 2023 at 10:30 AM Ayush Saxena  wrote:
>
>> I know Jersey upgrade as a blocker. Some folks were chasing that last
>> year during 3.3.4 time, I don’t know where it is now, didn’t see then
>> what’s the problem there but I remember there was some initial PR which
>> did it for HDFS at least, so I never looked beyond that…
>>
>> I too had jdk-11 in my mind, but only for trunk. 3.4.x can stay as
>> java-11 only branch may be, but that is something later to decide, once we
>> get the code sorted…
>>
>> -Ayush
>>
>> > On 28-Mar-2023, at 9:16 PM, Steve Loughran 
>> wrote:
>> >
>> > well, how about we flip the switch and get on with it.
>> >
>> > slf4j seems happy on java11,
>> >
>> > side issue, anyone seen test failures on zulu1.8; somehow my test run is
>> > failing and i'm trying to work out whether its a mismatch in command
>> > line/ide jvm versions, or the 3.3.5 JARs have been built with an openjdk
>> > version which requires IntBuffer implements an overridden method
>> IntBuffer
>> > rewind().
>> >
>> > java.lang.NoSuchMethodError:
>> java.nio.IntBuffer.rewind()Ljava/nio/IntBuffer;
>> >
>> > at
>> org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:341)
>> > at
>> >
>> org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:308)
>> > at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:257)
>> > at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:202)
>> > at java.io.DataInputStream.read(DataInputStream.java:149)
>> >
>> >> On Tue, 28 Mar 2023 at 15:52, Viraj Jasani  wrote:
>> >> IIRC some of the ongoing major dependency upgrades (log4j 1 to 2,
>> jersey 1
>> >> to 2 and junit 4 to 5) are blockers for java 11 compile + test
>> stability.
>> >> On Tue, Mar 28, 2023 at 4:55 AM Steve Loughran
>> > >> wrote:
>> >>> Now that hadoop 3.3.5 is out, i want to propose something new
>> >>> we switch branch-3.3 and trunk to being java11 only
>> >>> 1. java 11 has been out for years
>> >>> 2. oracle java 8 is no longer available under "premier support"; you
>> >>> can't really get upgrades
>> >>> https://www.oracle.com/java/technologies/java-se-support-roadmap.html
>> >>> 3. openJDK 8 releases != oracle ones, and things you compile with them
>> >>> don't always link to oracle java 8 (some classes in java.nio have
>> >> added
>> >>> more overrides)
>> >>> 4. more and more libraries we want to upgrade to/bundle are java 11
>> >> only
>> >>> 5. moving to java 11 would cut our yetus build workload in half, and
>> >>> line up for adding java 17 builds instead.
>> >>> I know there are some outstanding issues still in
>> >>> https://issues.apache.org/jira/browse/HADOOP-16795 -but are they
>> >> blockers?
>> >>> Could we just move to java11 and enhance at our leisure, once java8
>> is no
>> >>> longer a concern.
>>
>> -
>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>
>>


Re: [DISCUSS] hadoop branch-3.3+ going to java11 only

2023-03-28 Thread Ayush Saxena
I know Jersey upgrade as a blocker. Some folks were chasing that last year 
during 3.3.4 time, I don’t know where it is now, didn’t see then what’s the 
problem there but I remember there was some initial PR which did it for HDFS 
at least, so I never looked beyond that…

I too had jdk-11 in my mind, but only for trunk. 3.4.x can stay as java-11 only 
branch may be, but that is something later to decide, once we get the code 
sorted…

-Ayush

> On 28-Mar-2023, at 9:16 PM, Steve Loughran  
> wrote:
> 
> well, how about we flip the switch and get on with it.
> 
> slf4j seems happy on java11,
> 
> side issue, anyone seen test failures on zulu1.8; somehow my test run is
> failing and i'm trying to work out whether its a mismatch in command
> line/ide jvm versions, or the 3.3.5 JARs have been built with an openjdk
> version which requires IntBuffer implements an overridden method IntBuffer
> rewind().
> 
> java.lang.NoSuchMethodError: java.nio.IntBuffer.rewind()Ljava/nio/IntBuffer;
> 
> at org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:341)
> at
> org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:308)
> at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:257)
> at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:202)
> at java.io.DataInputStream.read(DataInputStream.java:149)
> 
>> On Tue, 28 Mar 2023 at 15:52, Viraj Jasani  wrote:
>> IIRC some of the ongoing major dependency upgrades (log4j 1 to 2, jersey 1
>> to 2 and junit 4 to 5) are blockers for java 11 compile + test stability.
>> On Tue, Mar 28, 2023 at 4:55 AM Steve Loughran > wrote:
>>> Now that hadoop 3.3.5 is out, i want to propose something new
>>> we switch branch-3.3 and trunk to being java11 only
>>> 1. java 11 has been out for years
>>> 2. oracle java 8 is no longer available under "premier support"; you
>>> can't really get upgrades
>>> https://www.oracle.com/java/technologies/java-se-support-roadmap.html
>>> 3. openJDK 8 releases != oracle ones, and things you compile with them
>>> don't always link to oracle java 8 (some classes in java.nio have
>> added
>>> more overrides)
>>> 4. more and more libraries we want to upgrade to/bundle are java 11
>> only
>>> 5. moving to java 11 would cut our yetus build workload in half, and
>>> line up for adding java 17 builds instead.
>>> I know there are some outstanding issues still in
>>> https://issues.apache.org/jira/browse/HADOOP-16795 -but are they
>> blockers?
>>> Could we just move to java11 and enhance at our leisure, once java8 is no
>>> longer a concern.




Fwd: TAC supporting Berlin Buzzwords

2023-03-24 Thread Ayush Saxena
Forwarding as received.

-Ayush

-- Forwarded message -
From: Gavin McDonald 
Date: Fri, 24 Mar 2023 at 15:27
Subject: TAC supporting Berlin Buzzwords
To: 


PMCs,

Please forward to your dev and user lists.

Hi All,

The ASF Travel Assistance Committee is supporting taking up to six (6)
people
to attend Berlin Buzzwords In June this year.

This includes Conference passes, and travel & accommodation as needed.

Please see our website at https://tac.apache.org for more information and
how to apply.

Applications close on 15th April.

Good luck to those that apply.

Gavin McDonald (VP TAC)


Re: [VOTE] Release Apache Hadoop 3.3.5 (RC3)

2023-03-20 Thread Ayush Saxena
+1(Binding)

* Built from source (x86 & ARM)
* Successful Native Build (x86 & ARM)
* Verified Checksums (x86 & ARM)
* Verified Signature (x86 & ARM)
* Checked the output of hadoop version (x86 & ARM)
* Verified the output of hadoop checknative (x86 & ARM)
* Ran some basic HDFS shell commands.
* Ran some basic Yarn shell commands.
* Played a bit with HDFS Erasure Coding.
* Ran TeraGen & TeraSort
* Browsed through NN, DN, RM & NM UI
* Skimmed over the contents of website.
* Skimmed over the contents of maven repo.
* Selectively ran some HDFS & CloudStore tests
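The checksum and signature checks in the list above come down to standard `sha512sum` and `gpg` invocations; a minimal sketch of the checksum mechanics, using a throwaway file in place of the real release tarball (the filenames here are illustrative, not the actual release artifacts):

```shell
# Stand-in for the downloaded release artifact; in a real verification
# this would be hadoop-3.3.5.tar.gz plus its published .sha512 file.
printf 'release-bytes' > artifact.tar.gz

# Publisher side: record the SHA-512 digest of the artifact.
sha512sum artifact.tar.gz > artifact.tar.gz.sha512

# Verifier side: -c recomputes the digest and compares it,
# printing "artifact.tar.gz: OK" on a match.
sha512sum -c artifact.tar.gz.sha512

# The signature check additionally needs the project KEYS file:
#   gpg --import KEYS && gpg --verify artifact.tar.gz.asc artifact.tar.gz
```

Against a real RC, the digest file comes from the same dist.apache.org directory as the tarball, and the KEYS file from the location given in the vote mail.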

Thanx Steve for driving the release. Good Luck!!!

-Ayush

> On 20-Mar-2023, at 12:54 PM, Xiaoqiao He  wrote:
> 
> +1
> 
> * Verified signature and checksum of the source tarball.
> * Built the source code on Ubuntu and OpenJDK 11 by `mvn clean package
> -DskipTests -Pnative -Pdist -Dtar`.
> * Setup pseudo cluster with HDFS and YARN.
> * Run simple FsShell - mkdir/put/get/mv/rm (include EC) and check the
> result.
> * Run example mr applications and check the result - Pi & wordcount.
> * Check the Web UI of NameNode/DataNode/Resourcemanager/NodeManager etc.
> 
> Thanks Steve for your work.
> 
> Best Regards,
> - He Xiaoqiao
> 
>> On Mon, Mar 20, 2023 at 12:04 PM Masatake Iwasaki 
>> wrote:
>> 
>> +1
>> 
>> + verified the signature and checksum of the source tarball.
>> 
>> + built from the source tarball on Rocky Linux 8 (x86_64) and OpenJDK 8
>> with native profile enabled.
>>   + launched pseudo distributed cluster including kms and httpfs with
>> Kerberos and SSL enabled.
>>   + created encryption zone, put and read files via httpfs.
>>   + ran example MR wordcount over encryption zone.
>>   + checked the binary of container-executor.
>> 
>> + built rpm packages by Bigtop (with trivial modifications) on Rocky Linux
>> 8 (aarch64).
>>   + ran smoke-tests of hdfs, yarn and mapreduce.
>> + built site documentation and skimmed the contents.
>>   +  Javadocs are contained.
>> 
>> Thanks,
>> Masatake Iwasaki
>> 
>>> On 2023/03/16 4:47, Steve Loughran wrote:
>>> Apache Hadoop 3.3.5
>>> 
>>> Mukund and I have put together a release candidate (RC3) for Hadoop
>> 3.3.5.
>>> 
>>> What we would like is for anyone who can to verify the tarballs,
>> especially
>>> anyone who can try the arm64 binaries as we want to include them too.
>>> 
>>> The RC is available at:
>>> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/
>>> 
>>> The git tag is release-3.3.5-RC3, commit 706d88266ab
>>> 
>>> The maven artifacts are staged at
>>> https://repository.apache.org/content/repositories/orgapachehadoop-1369/
>>> 
>>> You can find my public key at:
>>> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>>> 
>>> Change log
>>> 
>> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/CHANGELOG.md
>>> 
>>> Release notes
>>> 
>> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/RELEASENOTES.md
>>> 
>>> This is off branch-3.3 and is the first big release since 3.3.2.
>>> 
>>> Key changes include
>>> 
>>> * Big update of dependencies to try and keep those reports of
>>>   transitive CVEs under control -both genuine and false positives.
>>> * HDFS RBF enhancements
>>> * Critical fix to ABFS input stream prefetching for correct reading.
>>> * Vectored IO API for all FSDataInputStream implementations, with
>>>   high-performance versions for file:// and s3a:// filesystems.
>>>   file:// through java native io
>>>   s3a:// parallel GET requests.
>>> * This release includes Arm64 binaries. Please can anyone with
>>>   compatible systems validate these.
>>> * and compared to the previous RC, all the major changes are
>>>   HDFS issues.
>>> 
>>> Note, because the arm64 binaries are built separately on a different
>>> platform and JVM, their jar files may not match those of the x86
>>> release -and therefore the maven artifacts. I don't think this is
>>> an issue (the ASF actually releases source tarballs, the binaries are
>>> there for help only, though with the maven repo that's a bit blurred).
>>> 
>>> The only way to be consistent would actually untar the x86.tar.gz,
>>> overwrite its binaries with the arm stuff, retar, sign and push out
>>> for the vote. Even automating that would be risky.
>>> 
>>> Please try the release and vote. The vote will run for 5 days.
>>> 
>>> -Steve
>>> 
>> 
>> -
>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>> 
>> 




Re: [VOTE] Release Apache Hadoop 3.3.5 (RC3)

2023-03-18 Thread Ayush Saxena
Count me in as well. I am almost done. So you have 3 potential votes, you can be 
happy now :) 
Thanx Steve for the efforts!!!

-Ayush

> On 19-Mar-2023, at 2:46 AM, Chris Nauroth  wrote:
> 
> Yes, I'm in progress on verification, so you can expect to get a vote from
> me. Thank you, Steve!
> 
> Chris Nauroth
> 
> 
>> On Sat, Mar 18, 2023 at 9:19 AM Ashutosh Gupta 
>> wrote:
>> Hi Steve
>> I will also do it by today/tomorrow.
>> Thanks,
>> Ashutosh
>> On Sat, 18 Mar, 2023, 4:07 pm Steve Loughran, > wrote:
>>> Thank you for this!
>>> Can anyone else with time do a review too? i really want to get this one
>>> done, now the HDFS issues are all resolved.
>>> I do not want this release to fall by the wayside through lack of votes
>>> alone. In fact, I would be very unhappy
 On Sat, 18 Mar 2023 at 06:47, Viraj Jasani  wrote:
 +1 (non-binding)
 * Signature/Checksum: ok
 * Rat check (1.8.0_341): ok
 - mvn clean apache-rat:check
 * Built from source (1.8.0_341): ok
 - mvn clean install  -DskipTests
 * Built tar from source (1.8.0_341): ok
 - mvn clean package  -Pdist -DskipTests -Dtar
>> -Dmaven.javadoc.skip=true
 Containerized deployments:
 * Deployed and started Hdfs - NN, DN, JN with Hbase 2.5 and Zookeeper
>> 3.7
 * Deployed and started JHS, RM, NM
 * Hbase, hdfs CRUD looks good
 * Sample RowCount MapReduce job looks good
 * S3A tests with scale profile looks good
 On Wed, Mar 15, 2023 at 12:48 PM Steve Loughran
 
 wrote:
> Apache Hadoop 3.3.5
> Mukund and I have put together a release candidate (RC3) for Hadoop
 3.3.5.
> What we would like is for anyone who can to verify the tarballs,
 especially
> anyone who can try the arm64 binaries as we want to include them too.
> The RC is available at:
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/
> The git tag is release-3.3.5-RC3, commit 706d88266ab
> The maven artifacts are staged at
>>> https://repository.apache.org/content/repositories/orgapachehadoop-1369/
> You can find my public key at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> Change log
>> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/CHANGELOG.md
> Release notes
>> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/RELEASENOTES.md
> This is off branch-3.3 and is the first big release since 3.3.2.
> Key changes include
> * Big update of dependencies to try and keep those reports of
> transitive CVEs under control -both genuine and false positives.
> * HDFS RBF enhancements
> * Critical fix to ABFS input stream prefetching for correct reading.
> * Vectored IO API for all FSDataInputStream implementations, with
> high-performance versions for file:// and s3a:// filesystems.
> file:// through java native io
> s3a:// parallel GET requests.
> * This release includes Arm64 binaries. Please can anyone with
> compatible systems validate these.
> * and compared to the previous RC, all the major changes are
> HDFS issues.
> Note, because the arm64 binaries are built separately on a different
> platform and JVM, their jar files may not match those of the x86
> release -and therefore the maven artifacts. I don't think this is
> an issue (the ASF actually releases source tarballs, the binaries are
> there for help only, though with the maven repo that's a bit
>> blurred).
> The only way to be consistent would actually untar the x86.tar.gz,
> overwrite its binaries with the arm stuff, retar, sign and push out
> for the vote. Even automating that would be risky.
> Please try the release and vote. The vote will run for 5 days.
> -Steve




Re: [VOTE] Release Apache Hadoop 3.3.5 (RC2)

2023-03-02 Thread Ayush Saxena
>
> I will highlight that I am completely fed up with doing this  release and
> really want to get it out the way -for which I depend on support from as
> many other developers as possible.


hmm, I can feel the pain. I tried to find if there is any config or any
workaround which can dodge this HDFS issue, but unfortunately couldn't find
any. If someone does a getListing with needLocation and the file doesn't
exist at the Observer, he is gonna get an NPE rather than an FNF. It isn't just
the exception, AFAIK Observer reads have some logic around handling FNF
specifically, that it attempts Active NN or something like that in such
cases, So, that will be broken as well for this use case.

Now, there is no denying the fact there is an issue on the HDFS side, and
it has already been too much work on your side, so you can argue that it
might not be a very frequent use case or so. It's your call.

Just sharing, no intention of saying you should do that. But as an RM,
"nobody" can force you into a new iteration of an RC; it is gonna be your
call and discretion. As far as I know a release cannot be vetoed by
"anybody" as per the Apache bylaws (
https://www.apache.org/legal/release-policy.html#release-approval). Even
our bylaws say that product release requires a Lazy Majority not a
Consensus Approval.

So, you have a way out. You guys are 2 already, and 1 I will give you as a
pass in case you are really in a "Get me out of this mess" state;
my basic validations on x86 & AArch64 are both passing as of now, though I
couldn't reach the end for any of the RCs

-Ayush

On Fri, 3 Mar 2023 at 08:41, Viraj Jasani  wrote:

> While this RC is not going to be final, I just wanted to share the results
> of the testing I have done so far with RC1 and RC2.
>
> * Signature: ok
> * Checksum : ok
> * Rat check (1.8.0_341): ok
>  - mvn clean apache-rat:check
> * Built from source (1.8.0_341): ok
>  - mvn clean install  -DskipTests
> * Built tar from source (1.8.0_341): ok
>  - mvn clean package  -Pdist -DskipTests -Dtar -Dmaven.javadoc.skip=true
>
> * Built images using the tarball, installed and started all of Hdfs, JHS
> and Yarn components
> * Ran Hbase (latest 2.5) tests against Hdfs, ran RowCounter Mapreduce job
> * Hdfs CRUD tests
> * MapReduce wordcount job
>
> * Ran S3A tests with scale profile against us-west-2:
> mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale
>
> ITestS3AConcurrentOps#testParallelRename is timing out after ~960s. This is
> consistently failing, looks like a recent regression.
> I was also able to repro on trunk, will create Jira.
>
>
> On Mon, Feb 27, 2023 at 9:59 AM Steve Loughran  >
> wrote:
>
> > Mukund and I have put together a release candidate (RC2) for Hadoop
> 3.3.5.
> >
> > We need anyone who can to verify the source and binary artifacts,
> > including those JARs staged on maven, the site documentation and the
> arm64
> > tar file.
> >
> > The RC is available at:
> > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC2/
> >
> > The git tag is release-3.3.5-RC2, commit 72f8c2a4888
> >
> > The maven artifacts are staged at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1369/
> >
> > You can find my public key at:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > Change log
> >
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC2/CHANGELOG.md
> >
> > Release notes
> >
> >
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC2/RELEASENOTES.md
> >
> > This is off branch-3.3 and is the first big release since 3.3.2.
> >
> > As to what changed since the RC1 attempt last week
> >
> >
> >1. Version fixup in JIRA (credit due to Takanobu Asanuma there)
> >2. HADOOP-18470. Remove HDFS RBF text in the 3.3.5 index.md file
> >3. Revert "HADOOP-18590. Publish SBOM artifacts (#5281)" (creating
> build
> >issues in maven 3.9.0)
> >4. HADOOP-18641. Cloud connector dependency and LICENSE fixup. (#5429)
> >
> >
> > Note, because the arm64 binaries are built separately on a different
> > platform and JVM, their jar files may not match those of the x86
> > release -and therefore the maven artifacts. I don't think this is
> > an issue (the ASF actually releases source tarballs, the binaries are
> > there for help only, though with the maven repo that's a bit blurred).
> >
> > The only way to be consistent would actually untar the x86.tar.gz,
> > overwrite its binaries with the arm stuff, retar, sign and push out
> > for the vote. Even automating that would be risky.
> >
> > Please try the release and vote. The vote will run for 5 days.
> >
> > Steve and Mukund
> >
>


Re: [VOTE] Release Apache Hadoop 3.3.5

2023-02-24 Thread Ayush Saxena
>
>  And i
> think we need to change the PR template to mention transitive updates in
> the license bit too


Not sure if that is gonna help; people might ignore it or tick it off out of
overconfidence. No harm though…

BTW Ozone has some cool stuff to handle this, it was added here:
https://github.com/apache/ozone/pull/2199

It checks, for each PR, whether the changes bring in any new transitive
dependency, and if they do, it flags that so the license and related files
can be managed. Worth exploring
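The core mechanic of a check like the Ozone one referenced above is a set difference between the resolved dependency lists of the base branch and the PR branch. A minimal sketch of that mechanic; in a real run the two lists would come from `mvn dependency:list` on each branch, and the coordinates below are made-up stand-ins:

```shell
# Resolved dependencies of the base branch and of the PR branch
# (stand-in data; sorted, as comm requires sorted input).
printf 'com.google.guava:guava:31.1\norg.slf4j:slf4j-api:1.7.36\n' | sort > base.txt
printf 'com.google.guava:guava:31.1\norg.slf4j:slf4j-api:1.7.36\ncom.sun.jersey:jersey-core:1.19\n' | sort > pr.txt

# comm -13 keeps only the lines unique to pr.txt: the newly introduced
# (possibly transitive) dependencies that need a LICENSE review.
comm -13 base.txt pr.txt
# -> com.sun.jersey:jersey-core:1.19
```

Anything the diff prints can then be flagged on the PR for a license check before merge.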

-Ayush

On Sat, 25 Feb 2023 at 01:09, Steve Loughran 
wrote:

>  need this pr in too, https://github.com/apache/hadoop/pull/5429
>
>1. cuts back on some transitive dependencies from hadoop-aliyun
>2. fixes LICENSE-bin to be correct
>
> #2 is the blocker...and it looks like 3.2.x will also need fixup as well as
> the later ones -hadoop binaries have shipped without that file being up to
> date, but at least all the transitive stuff is correctly licensed. And i
> think we need to change the PR template to mention transitive updates in
> the license bit too
>
> if this goes in, I will do the rebuild on monday UK time
>
> On Thu, 23 Feb 2023 at 11:18, Steve Loughran  wrote:
>
> >
> > And I've just hit HADOOP-18641. cyclonedx maven plugin breaks on recent
> > maven releases (3.9.0)
> >
> > on a new local build with maven updated on homebrew (which i needed for
> > spark). so a code change too. That issue doesn't surface on our
> > release dockers, but will hit other people. especially over time. Going
> to
> > revert HADOOP-18590. Publish SBOM artifacts (#5281)
> >
> >
> >
> > On Thu, 23 Feb 2023 at 10:29, Steve Loughran 
> wrote:
> >
> >> ok, let me cancel, update those jiras and kick off again. that will save
> >> anyone else having to do their homework
> >>
> >> On Thu, 23 Feb 2023 at 08:56, Takanobu Asanuma 
> >> wrote:
> >>
> >>> I'm now -1 as I found the wrong information on the top page (index.md).
> >>>
> >>> > 1. HDFS-13522, HDFS-16767 & Related Jiras: Allow Observer Reads in
> HDFS
> >>> Router Based Federation.
> >>>
> >>> The fix version of HDFS-13522 and HDFS-16767 also included 3.3.5
> before,
> >>> though it is actually not in branch-3.3. I corrected the fix version
> and
> >>> created HDFS-16889 to backport them to branch-3.3 about a month ago.
> >>> Unfortunately, it won't be fixed soon. I should have let you know at
> that
> >>> time, sorry.  Supporting Observer NameNode in RBF is a prominent
> feature.
> >>> So I think we have to delete the description from the top page not to
> >>> confuse Hadoop users.
> >>>
> >>> - Takanobu
> >>>
> >>> 2023年2月23日(木) 17:17 Takanobu Asanuma :
> >>>
> >>> > Thanks for driving the release, Steve and Mukund.
> >>> >
> >>> > I found that there were some jiras with wrong fix versions.
> >>> >
> >>> > The fix versions included 3.3.5, but actually, it isn't in 3.3.5-RC1:
> >>> > - HDFS-16845
> >>> > - HADOOP-18345
> >>> >
> >>> > The fix versions didn't include 3.3.5, but actually, it is in
> 3.3.5-RC1
> >>> > (and it is not in release-3.3.4) :
> >>> > - HADOOP-17276
> >>> > - HDFS-13293
> >>> > - HDFS-15630
> >>> > - HDFS-16266
> >>> > - HADOOP-18003
> >>> > - HDFS-16310
> >>> > - HADOOP-18014
> >>> >
> >>> > I corrected all the wrong fix versions just now. I'm not sure we
> should
> >>> > revote it since it only affects the changelog.
> >>> >
> >>> > - Takanobu
> >>> >
> >>> > 2023年2月21日(火) 22:43 Steve Loughran :
> >>> >
> >>> >> Apache Hadoop 3.3.5
> >>> >>
> >>> >> Mukund and I have put together a release candidate (RC1) for Hadoop
> >>> 3.3.5.
> >>> >>
> >>> >> What we would like is for anyone who can to verify the tarballs,
> >>> >> especially
> >>> >> anyone who can try the arm64 binaries as we want to include them
> too.
> >>> >>
> >>> >> The RC is available at:
> >>> >> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC1/
> >>> >>
> >>> >> The git tag is release-3.3.5-RC1, commit 274f91a3259
> >>> >>
> >>> >> The maven artifacts are staged at
> >>> >>
> >>>
> https://repository.apache.org/content/repositories/orgapachehadoop-1368/
> >>> >>
> >>> >> You can find my public key at:
> >>> >> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >>> >>
> >>> >> Change log
> >>> >>
> >>> >>
> >>>
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC1/CHANGELOG.md
> >>> >>
> >>> >> Release notes
> >>> >>
> >>> >>
> >>>
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC1/RELEASENOTES.md
> >>> >>
> >>> >> This is off branch-3.3 and is the first big release since 3.3.2.
> >>> >>
> >>> >> Key changes include
> >>> >>
> >>> >> * Big update of dependencies to try and keep those reports of
> >>> >>   transitive CVEs under control -both genuine and false positives.
> >>> >> * HDFS RBF enhancements
> >>> >> * Critical fix to ABFS input stream prefetching for correct reading.
> >>> >> * Vectored IO API for all FSDataInputStream implementations, with
> >>> >>   high-performance versions for file:// and s3a:// filesystems.
> >>> >>   

[jira] [Resolved] (MAPREDUCE-7428) Fix failures related to Junit 4 to Junit 5 upgrade in org.apache.hadoop.mapreduce.v2.app.webapp

2023-02-23 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved MAPREDUCE-7428.
-
Resolution: Fixed

> Fix failures related to Junit 4 to Junit 5 upgrade in 
> org.apache.hadoop.mapreduce.v2.app.webapp
> ---
>
> Key: MAPREDUCE-7428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7428
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.4.0
>Reporter: Ashutosh Gupta
>Assignee: Akira Ajisaka
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> A few tests are failing due to the JUnit 4 to JUnit 5 upgrade in 
> org.apache.hadoop.mapreduce.v2.app.webapp 
> [https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1071/testReport/]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Reopened] (MAPREDUCE-7428) Fix failures related to Junit 4 to Junit 5 upgrade in org.apache.hadoop.mapreduce.v2.app.webapp

2023-02-21 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena reopened MAPREDUCE-7428:
-

> Fix failures related to Junit 4 to Junit 5 upgrade in 
> org.apache.hadoop.mapreduce.v2.app.webapp
> ---
>
> Key: MAPREDUCE-7428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7428
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.4.0
>Reporter: Ashutosh Gupta
>Assignee: Akira Ajisaka
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> A few tests are failing due to the JUnit 4 to JUnit 5 upgrade in 
> org.apache.hadoop.mapreduce.v2.app.webapp 
> [https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1071/testReport/]







Re: yetus reporting javadoc errors on @InterfaceAudience attributes

2023-02-21 Thread Ayush Saxena
>
> I'd be happy to take a look at the tests with a fresh set of eyes.  Was
> there an existing PR for Junit work?


Awesome, thanx for volunteering. There is a ticket MAPREDUCE-7428
<https://issues.apache.org/jira/browse/MAPREDUCE-7428>, which
initially talked about these test failures; they were 150+ at that time and
had similar issues. You can gather details from there. If you want to
figure out the tickets where these Junit upgrades happened, a simple git
query might help "git log --grep Junit"
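The git query mentioned above, made a bit more robust (case-insensitive match, one line per commit so each hash is easy to feed back into a revert):

```shell
# List the commits whose messages mention the JUnit upgrade,
# case-insensitively, one line per commit.
git log --oneline -i --grep='junit'

# Each listed hash can then be undone individually:
#   git revert <hash>
```

Run from a checkout of trunk; the output is only as good as the commit messages, so a manual skim of the hits is still needed.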

Let me know if you need any help, I will be happy to help :-)

-Ayush

On Tue, 21 Feb 2023 at 21:16, Steve Vaughan  wrote:

> I'd be happy to take a look at the tests with a fresh set of eyes.  Was
> there an existing PR for Junit work?
> ----------
> *From:* Ayush Saxena 
> *Sent:* Tuesday, February 21, 2023 1:24 AM
> *To:* ste...@cloudera.com ; Hadoop Common <
> common-...@hadoop.apache.org>; Ashutosh Gupta ;
> Akira Ajisaka ; mapreduce-dev <
> mapreduce-dev@hadoop.apache.org>
> *Subject:* Re: yetus reporting javadoc errors on @InterfaceAudience
> attributes
>
> I think it is 2.5 months and the Junit upgrade tests are still in a mess[1]
> There was an attempt as part of MAPREDUCE-7428, but that didn't fix
> everything or some other commits induced these problems again. I don't
> think anyone is actively chasing that either.
>
> Can add Junit-vintage dependencies here and there and most probably can fix
> this, but last time the approach taken was different, so not exploring in
> that way.
>
> Planning to backtrack and revert/reopen all the Junit upgrades tickets till
> I get a green build. Will hold a minimum of 24 hrs, next whenever I find
> some time, will revert all of these.
>
> Shout out if anyone has concerns around this, or is working on the fix
>
> ++ @Akira Ajisaka  / @mapreduce-dev
> 
>
> -Ayush
>
> [1]
>
> https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1142/testReport/junit/org.apache.hadoop.mapreduce.v2.hs/TestJobHistoryEvents/testEventsFlushOnStop/
>
>
> On Fri, 23 Dec 2022 at 08:50, Ayush Saxena  wrote:
>
> > Have we stopped
> >> with the java8 builds?
> >
> >
> > Nopes, The answer is here just cross posting:
> > https://github.com/apache/hadoop/pull/5226#issuecomment-1354964948
> >
> > Regarding dropping Java-8 and adapting Java-11, We just have runtime
> > support for Java-11 in hadoop, the compile support ain't there, it is
> being
> > tracked here:
> > https://issues.apache.org/jira/browse/HADOOP-16795
> >
> > Some issues are there, one with Jersey I know and may be a couple of
> more.
> >
> > -Ayush
> >
> > On Fri, 16 Dec 2022 at 20:07, Steve Loughran  >
> > wrote:
> >
> >> OK, it's a JDK bug
> >>
> >> both the java8 and java11 javadocs are now using java11. Have we stopped
> >> with the java8 builds?
> >>
> >> as i am happy with that, we just need to make an explicit declaration
> and
> >> wrap up of anything outstanding.
> >>
> >>
> >>
> >> On Thu, 15 Dec 2022 at 22:30, Ayush Saxena  wrote:
> >>
> >> > Thanx Ashutosh, Let me know if you need any help there.
> >> >
> >> > Got some time to recheck the Javadoc stuff, it seems like a JDK bug
> >> > https://bugs.openjdk.org/browse/JDK-8295850
> >> >
> >> > more details over here:
> >> >
> https://github.com/apache/hadoop/pull/5226#pullrequestreview-1220041496
> >> >
> >> > -Ayush
> >> >
> >> > On Mon, 12 Dec 2022 at 19:46, Ashutosh Gupta <
> >> ashutoshgupta...@gmail.com>
> >> > wrote:
> >> >
> >> >> Thanks Ayush for pointing out the failures related to the Junit 5
> >> >> upgrade. As I have closely worked in upgrading Junit 4 to Junit 5
> >> >> throughout the hadoop project. I will create a JIRA for these
> failures
> >> and
> >> >> fix them on priority.
> >> >>
> >> >> -Ashutosh
> >> >>
> >> >> On Mon, Dec 12, 2022 at 1:59 PM Ayush Saxena 
> >> wrote:
> >> >>
> >> >>> Try to fix in the same way it was done here and couple of similar
> PRs:
> >> >>> https://github.com/apache/hadoop/pull/5179
> >> >>>
> >> >>> There are a bunch of PRs in yarn getting the similar error fixed
> >> module
> >> >>> wise, the problem would be there in many other modules as well...
>

Re: yetus reporting javadoc errors on @InterfaceAudience attributes

2023-02-20 Thread Ayush Saxena
I think it is 2.5 months and the Junit upgrade tests are still in a mess[1]
There was an attempt as part of MAPREDUCE-7428, but that didn't fix
everything or some other commits induced these problems again. I don't
think anyone is actively chasing that either.

We could add junit-vintage dependencies here and there and most probably
fix this, but last time the approach taken was different, so I am not
exploring that route.
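For anyone unfamiliar with that approach: "adding junit-vintage dependencies" would amount to a Maven fragment along these lines (illustrative only; the version shown is an assumption, and this is explicitly not the route being taken here):

```xml
<!-- junit-vintage-engine lets existing JUnit 4 tests keep running
     on the JUnit 5 platform alongside migrated Jupiter tests. -->
<dependency>
  <groupId>org.junit.vintage</groupId>
  <artifactId>junit-vintage-engine</artifactId>
  <version>5.8.2</version>
  <scope>test</scope>
</dependency>
```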

I am planning to backtrack and revert/reopen all the JUnit upgrade tickets
till I get a green build. I will hold for a minimum of 24 hrs; then,
whenever I find some time, I will revert all of these.

Shout out if anyone has concerns around this, or is working on a fix.

++ @Akira Ajisaka  / @mapreduce-dev


-Ayush

[1]
https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1142/testReport/junit/org.apache.hadoop.mapreduce.v2.hs/TestJobHistoryEvents/testEventsFlushOnStop/


On Fri, 23 Dec 2022 at 08:50, Ayush Saxena  wrote:

> Have we stopped
>> with the java8 builds?
>
>
> Nopes, The answer is here just cross posting:
> https://github.com/apache/hadoop/pull/5226#issuecomment-1354964948
>
> Regarding dropping Java-8 and adapting Java-11, We just have runtime
> support for Java-11 in hadoop, the compile support ain't there, it is being
> tracked here:
> https://issues.apache.org/jira/browse/HADOOP-16795
>
> Some issues are there, one with Jersey I know and may be a couple of more.
>
> -Ayush
>
> On Fri, 16 Dec 2022 at 20:07, Steve Loughran 
> wrote:
>
>> OK, it's a JDK bug
>>
>> both the java8 and java11 javadocs are now using java11. Have we stopped
>> with the java8 builds?
>>
>> as i am happy with that, we just need to make an explicit declaration and
>> wrap up of anything outstanding.
>>
>>
>>
>> On Thu, 15 Dec 2022 at 22:30, Ayush Saxena  wrote:
>>
>> > Thanx Ashutosh, Let me know if you need any help there.
>> >
>> > Got some time to recheck the Javadoc stuff, it seems like a JDK bug
>> > https://bugs.openjdk.org/browse/JDK-8295850
>> >
>> > more details over here:
>> > https://github.com/apache/hadoop/pull/5226#pullrequestreview-1220041496
>> >
>> > -Ayush
>> >
>> > On Mon, 12 Dec 2022 at 19:46, Ashutosh Gupta <
>> ashutoshgupta...@gmail.com>
>> > wrote:
>> >
>> >> Thanks Ayush for pointing out the failures related to the Junit 5
>> >> upgrade. As I have closely worked in upgrading Junit 4 to Junit 5
>> >> throughout the hadoop project. I will create a JIRA for these failures
>> and
>> >> fix them on priority.
>> >>
>> >> -Ashutosh
>> >>
>> >> On Mon, Dec 12, 2022 at 1:59 PM Ayush Saxena 
>> wrote:
>> >>
>> >>> Try to fix in the same way it was done here and couple of similar PRs:
>> >>> https://github.com/apache/hadoop/pull/5179
>> >>>
>> >>> There are a bunch of PRs in yarn getting the similar error fixed
>> module
>> >>> wise, the problem would be there in many other modules as well...
>> >>>
>> >>> The daily JDK-11 build also shows that failure here:
>> >>>
>> >>>
>> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/410/artifact/out/patch-javadoc-root.txt
>> >>>
>> >>> BTW. the daily build is also broken with some whooping 150+ failures
>> >>>
>> >>>
>> https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1071/testReport/
>> >>>
>> >>> Mostly some Junit upgrade patch being the reason.
>> >>>
>> >>> -Ayush
>> >>>
>> >>> On Mon, 12 Dec 2022 at 18:46, Steve Loughran
>> > >>> >
>> >>> wrote:
>> >>>
>> >>> > yetus is now reporting errors on our @InterfaceAudience tags in
>> java8
>> >>> and
>> >>> > java11 javadoc generation
>> >>> > https://github.com/apache/hadoop/pull/5205#issuecomment-1344664692
>> >>> >
>> >>> >
>> >>>
>> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5205/2/artifact/out/branch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt
>> >>> >
>> >>> > it looks a bit like the javadocs are both being done in the java11
>> >>> version,
>> >>> > and is is unhappy.
>> >>> >
>> >>> > any suggestions as to a fix?
>> >>> >
>> >>>
>> >>
>>
>


Re: [VOTE] Release Apache Hadoop 3.3.5

2023-01-05 Thread Ayush Saxena
I haven't got a chance to deep dive into HADOOP-18324
<https://issues.apache.org/jira/browse/HADOOP-18324> which is claimed to be
the reason for these failures. Most probably will try to check next week if
it is still there.
From the PR uploaded on HDFS-16853
<https://issues.apache.org/jira/browse/HDFS-16853>, it looks like it changes
or tweaks the cleanup logic itself rather than playing with the tests or
MiniDFSCluster. So the cleanup logic has issues, but I still need to check
the impact of that: if I have a service that terminates in a non-test setup,
will the restart be an issue like these tests are facing? My initial hunch
was no, but I need to carefully check what the impact is and what other
issues it can cause. The original logic isn't something that can be decoded
with just a few seconds of a cursory look.

++ @Owen O'Malley  is the original author of the
Hadoop Jira, maybe he can share some pointers about that.

-Ayush

On Thu, 5 Jan 2023 at 07:04, Chris Nauroth  wrote:

> Is it a problem limited to MiniDFSCluster, or is it a broader problem of
> RPC client resource cleanup? The patch is changing connection close
> cleanup, so I assumed the latter. If so, then it could potentially impact
> applications integrating with the RPC clients.
>
> If the problem is limited to MiniDFSCluster and restarts within a single
> JVM, then I agree the impact is smaller. Then, we'd want to consider what
> downstream projects have tests that do restarts on a MiniDFSCluster.
>
> Chris Nauroth
>
>
> On Wed, Jan 4, 2023 at 4:22 PM Ayush Saxena  wrote:
>
> > Hmm I'm looking at HADOOP-11867 related stuff but couldn't find it
> >> mentioned anywhere in change log or release notes. Are they actually
> >> up-to-date?
> >
> >
> > I don't think there is any issue with the ReleaseNotes generation as such
> > but with the Resolution type of this ticket, It ain't marked as Fixed but
> > Done. The other ticket which is marked Done is also not part of the
> release
> > notes. [1]
> >
> > if I'm understanding the potential impact of HDFS-16853
> >> correctly, then it's serious enough to fix before a release. (I could
> >> change my vote if someone wants to make a case that it's not that
> >> serious.)
> >>
> >
> > Chris, I just had a very quick look at HDFS-16853, I am not sure if this
> > can happen outside a MiniDfsCluster setup? Just guessing from the
> > description in the ticket. It looked like when we did a restart of the
> > Namenode in the MiniDfsCluster, I guess that would be in the same single
> > JVM, and that is why a previous blocked thread caused issues with the
> > restart. That is what I understood, I haven't checked the code though.
> >
> > Second, In the same context, Being curious If this lands up being a
> > MiniDfsCluster only issue, do we still consider this a release blocker?
> Not
> > saying in a way it won't be serious, MiniDfsCluster is very widely used
> by
> > downstream projects and all, so just wanted to know
> >
> > Regarding the Hive & Bouncy castle. The PR seems to have a valid binding
> > veto, I am not sure if it will get done any time soon, so if the use case
> > is something required, I would suggest handling it at Hadoop itself. It
> > seems to be centric to Hive-3.x, I tried compiling the Hive master branch
> > with 3.3.5 and it passed. Other than that Hive officially support only
> > Hadoop-3.3.1 and that too only in the last 4.x release[2]
> >
> >
> > [1]
> >
> https://issues.apache.org/jira/browse/HADOOP-11867?jql=project%20%3D%20HADOOP%20AND%20resolution%20%3D%20Done%20AND%20fixVersion%20%3D%203.3.5%20ORDER%20BY%20resolution%20DESC
> > [2] https://issues.apache.org/jira/browse/HIVE-24484
> >
> > -Ayush
> >
> > On Tue, 3 Jan 2023 at 23:51, Chris Nauroth  wrote:
> >
> >> -1, because if I'm understanding the potential impact of HDFS-16853
> >> correctly, then it's serious enough to fix before a release. (I could
> >> change my vote if someone wants to make a case that it's not that
> >> serious.)
> >>
> >> Otherwise, this RC was looking good:
> >>
> >> * Verified all checksums.
> >> * Verified all signatures.
> >> * Built from source, including native code on Linux.
> >> * mvn clean package -Pnative -Psrc -Drequire.openssl
> -Drequire.snappy
> >> -Drequire.zstd -DskipTests
> >> * Tests passed.
> >> * mvn --fail-never clean test -Pnative -Dparallel-tests
> >> -Drequire.snappy -Drequire.zstd -Drequire.openssl
> >> -Dsurefire.rerunFailingTestsCount=3 -Dtests

Re: [VOTE] Release Apache Hadoop 3.3.5

2023-01-04 Thread Ayush Saxena
>
> Hmm I'm looking at HADOOP-11867 related stuff but couldn't find it
> mentioned anywhere in change log or release notes. Are they actually
> up-to-date?


I don't think there is any issue with the release notes generation as such,
but with the resolution type of this ticket: it isn't marked as Fixed but
as Done. The other ticket which is marked Done is also not part of the
release notes [1].

if I'm understanding the potential impact of HDFS-16853
> correctly, then it's serious enough to fix before a release. (I could
> change my vote if someone wants to make a case that it's not that serious.)
>

Chris, I just had a very quick look at HDFS-16853; I am not sure whether
this can happen outside a MiniDFSCluster setup, just guessing from the
description in the ticket. It looked like the restart of the Namenode in
the MiniDFSCluster happens in the same single JVM, and that is why a
previously blocked thread caused issues with the restart. That is what I
understood; I haven't checked the code though.

Second, in the same context, being curious: if this ends up being a
MiniDFSCluster-only issue, do we still consider this a release blocker? Not
saying it won't be serious; MiniDFSCluster is very widely used by
downstream projects, so I just wanted to know.

Regarding Hive & Bouncy Castle: the PR seems to have a valid binding veto,
and I am not sure it will get done any time soon, so if the use case is
required, I would suggest handling it in Hadoop itself. It seems to be
specific to Hive 3.x; I tried compiling the Hive master branch with 3.3.5
and it passed. Other than that, Hive officially supports only Hadoop 3.3.1,
and that too only in the latest 4.x release [2].


[1]
https://issues.apache.org/jira/browse/HADOOP-11867?jql=project%20%3D%20HADOOP%20AND%20resolution%20%3D%20Done%20AND%20fixVersion%20%3D%203.3.5%20ORDER%20BY%20resolution%20DESC
[2] https://issues.apache.org/jira/browse/HIVE-24484

-Ayush

On Tue, 3 Jan 2023 at 23:51, Chris Nauroth  wrote:

> -1, because if I'm understanding the potential impact of HDFS-16853
> correctly, then it's serious enough to fix before a release. (I could
> change my vote if someone wants to make a case that it's not that serious.)
>
> Otherwise, this RC was looking good:
>
> * Verified all checksums.
> * Verified all signatures.
> * Built from source, including native code on Linux.
> * mvn clean package -Pnative -Psrc -Drequire.openssl -Drequire.snappy
> -Drequire.zstd -DskipTests
> * Tests passed.
> * mvn --fail-never clean test -Pnative -Dparallel-tests
> -Drequire.snappy -Drequire.zstd -Drequire.openssl
> -Dsurefire.rerunFailingTestsCount=3 -DtestsThreadCount=8
> * Checked dependency tree to make sure we have all of the expected library
> updates that are mentioned in the release notes.
> * mvn -o dependency:tree
> * Farewell, S3Guard.
> * Confirmed that hadoop-openstack is now just a stub placeholder artifact
> with no code.
> * For ARM verification:
> * Ran "file " on all native binaries in the ARM tarball to confirm
> they actually came out with ARM as the architecture.
> * Output of hadoop checknative -a on ARM looks good.
> * Ran a MapReduce job with the native bzip2 codec for compression, and
> it worked fine.
> * Ran a MapReduce job with YARN configured to use
> LinuxContainerExecutor and verified launching the containers through
> container-executor worked.
>
> My local setup didn't have the test failures mentioned by Viraj, though
> there was some flakiness with a few HDFS snapshot tests timing out.
>
> Regarding Hive and Bouncy Castle, there is an existing issue and pull
> request tracking an upgrade attempt. It's looking like some amount of code
> changes are required:
>
> https://issues.apache.org/jira/browse/HIVE-26648
> https://github.com/apache/hive/pull/3744
>
> Chris Nauroth
>
>
> On Tue, Jan 3, 2023 at 8:57 AM Chao Sun  wrote:
>
> > Hmm I'm looking at HADOOP-11867 related stuff but couldn't find it
> > mentioned anywhere in change log or release notes. Are they actually
> > up-to-date?
> >
> > On Mon, Jan 2, 2023 at 7:48 AM Masatake Iwasaki
> >  wrote:
> > >
> > > >- building HBase 2.4.13 and Hive 3.1.3 against 3.3.5 failed due to
> > dependency change.
> > >
> > > For HBase, classes under com/sun/jersey/json/* and com/sun/xml/* are
> not
> > expected in hbase-shaded-with-hadoop-check-invariants.
> > > Updating hbase-shaded/pom.xml is expected to be the fix as done in
> > HBASE-27292.
> > >
> >
> https://github.com/apache/hbase/commit/00612106b5fa78a0dd198cbcaab610bd8b1be277
> > >
> > >[INFO] --- exec-maven-plugin:1.6.0:exec
> > (check-jar-contents-for-stuff-with-hadoop) @
> > hbase-shaded-with-hadoop-check-invariants ---
> > >[ERROR] Found artifact with unexpected contents:
> >
> '/home/rocky/srcs/bigtop/build/hbase/rpm/BUILD/hbase-2.4.13/hbase-shaded/hbase-shaded-client/target/hbase-shaded-client-2.4.13.jar'
> > >Please check the following and either correct the 

Re: [VOTE] Release Apache Hadoop 3.3.5

2022-12-27 Thread Ayush Saxena
Mostly, or maybe all, of those failures are due to HADOOP-18324; there is a
JIRA tracking the issues with TestLeaseRecovery2 linked to that as well:
HDFS-16853.

-Ayush

On Wed, 28 Dec 2022 at 09:13, Viraj Jasani  wrote:

> -0 (non-binding)
>
> Output of hadoop-vote.sh:
>
> * Signature: ok
> * Checksum : ok
> * Rat check (1.8.0_341): ok
>  - mvn clean apache-rat:check
> * Built from source (1.8.0_341): ok
>  - mvn clean install  -DskipTests
> * Built tar from source (1.8.0_341): ok
>  - mvn clean package  -Pdist -DskipTests -Dtar -Dmaven.javadoc.skip=true
>
> Manual testing on local mini cluster:
> * Basic CRUD tests on Hdfs looks good
> * Sample MapReduce job looks good
> * S3A tests look good with scale profile (ITestS3AContractUnbuffer is
> flaky, but when run individually, it passes)
>
> Full build with all modules UT results for branch-3.3.5 latest HEAD are
> available on
>
> https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-3.3.5-java8-linux-x86_64/
>
> From the above build, there are some consistently failing tests, out of
> which only TestDataNodeRollingUpgrade passed locally, whereas rest of the
> tests are consistently failing locally as well, we might want to fix (or
> ignore, if required) them:
>
>
> org.apache.hadoop.hdfs.TestErasureCodingPolicyWithSnapshot#testSnapshotsOnErasureCodingDirAfterNNRestart
>
> org.apache.hadoop.hdfs.TestFileLengthOnClusterRestart#testFileLengthWithHSyncAndClusterRestartWithOutDNsRegister
>
> org.apache.hadoop.hdfs.TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart
>
> org.apache.hadoop.hdfs.TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart2
>
> org.apache.hadoop.hdfs.TestLeaseRecovery2#testHardLeaseRecoveryWithRenameAfterNameNodeRestart
>
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade#testWithLayoutChangeAndFinalize
>
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot#testSnapshotOpsOnRootReservedPath
>
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap#testReadRenamedSnapshotFileWithCheckpoint
>
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion#testApplyEditLogForDeletion
>
>
>
> On Wed, Dec 21, 2022 at 11:29 AM Steve Loughran
> 
> wrote:
>
> > Mukund and I have put together a release candidate (RC0) for Hadoop
> 3.3.5.
> >
> > Given the time of year it's a bit unrealistic to run a 5 day vote and
> > expect people to be able to test it thoroughly enough to make this the
> one
> > we can ship.
> >
> > What we would like is for anyone who can to verify the tarballs, and test
> > the binaries, especially anyone who can try the arm64 binaries. We've got
> > the building of those done and now the build file will incorporate them
> > into the release -but neither of us have actually tested it yet. Maybe I
> > should try it on my pi400 over xmas.
> >
> > The maven artifacts are up on the apache staging repo -they are the ones
> > from x86 build. Building and testing downstream apps will be incredibly
> > helpful.
> >
> > The RC is available at:
> > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC0/
> >
> > The git tag is release-3.3.5-RC0, commit 3262495904d
> >
> > The maven artifacts are staged at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1365/
> >
> > You can find my public key at:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > Change log
> >
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC0/CHANGELOG.md
> >
> > Release notes
> >
> >
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC0/RELEASENOTES.md
> >
> > This is off branch-3.3 and is the first big release since 3.3.2.
> >
> > Key changes include
> >
> > * Big update of dependencies to try and keep those reports of
> >   transitive CVEs under control -both genuine and false positive.
> > * HDFS RBF enhancements
> > * Critical fix to ABFS input stream prefetching for correct reading.
> > * Vectored IO API for all FSDataInputStream implementations, with
> >   high-performance versions for file:// and s3a:// filesystems.
> >   file:// through java native io
> >   s3a:// parallel GET requests.
> > * This release includes Arm64 binaries. Please can anyone with
> >   compatible systems validate these.
> >
> >
> > Please try the release and vote on it, even though i don't know what is a
> > good timeline here...i'm actually going on holiday in early jan. Mukund
> is
> > around and so can drive the process while I'm offline.
> >
> > Assuming we do have another iteration, the RC1 will not be before mid jan
> > for that reason
> >
> > Steve (and mukund)
> >
>


Re: exciting new content needed for the 3.3.5 index.md file

2022-12-01 Thread Ayush Saxena
Hi Steve,
I just went through the JIRAs fixed in HDFS [1].
Sharing some stuff which I find good enough to merit a mention:
1. HDFS-13522, HDFS-16767 & related JIRAs: Allow Observer Reads in HDFS
Router Based Federation.
2. HDFS-13248: RBF supports Client Locality.
3. HDFS-16400, HDFS-16399, HDFS-16396, HDFS-16397, HDFS-16413, HDFS-16457:
Make a bunch of Datanode-level properties reconfigurable. The list is at
[2], if required.
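As context for item 3: reconfigurable daemon properties can be applied to a running Datanode via `hdfs dfsadmin -reconfig`, without a restart. A command sketch (the hostname below is a placeholder; 9867 is the default Datanode IPC port):

```
# Start applying the edited hdfs-site.xml to a running Datanode
hdfs dfsadmin -reconfig datanode dn1.example.com:9867 start

# Poll progress and see which properties changed
hdfs dfsadmin -reconfig datanode dn1.example.com:9867 status

# List the properties the daemon allows to be reconfigured
hdfs dfsadmin -reconfig datanode dn1.example.com:9867 properties
```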

That is my small list from HDFS; I can come back before the RC if I find
something else, or if I figure out that I messed up the above stuff (I
double-checked the fix versions though...).

BTW, there are 2 unresolved JIRAs with fix version 3.3.5 [3]; do check such
cases before you generate the RC.

-Ayush

[1]
https://issues.apache.org/jira/issues/?jql=project%20%3D%20HDFS%20AND%20fixVersion%20%3D%203.3.5%20ORDER%20BY%20updated%20DESC
[2]
https://github.com/apache/hadoop/blob/branch-3.3.5/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java#L346-L361
[3]
https://issues.apache.org/jira/issues/?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%203.3.5%20ORDER%20BY%20updated%20DESC

On Fri, 2 Dec 2022 at 00:21, Steve Loughran 
wrote:

> The first "smoke test" RC is about to be up for people to play with, we are
> just testing things here and getting that arm build done.
>
> Can I have some content for the index.html page describing what has
> changed?
> hadoop-project/src/site/markdown/index.md.vm
>
> I can (and will) speak highly of stuff I've been involved in, but need
> contributions from others for what is new in this release in HDFS, YARN,
> and MR (other than the manifest committer).
>
> It'd be good to have a list of CVEs fixed by upgrading jars. Maybe we
> should have a transitive-CVE tag for all JIRAs which update a dependency
> for this, so that then we could have the release notes explicitly list
> these in their own section.
>
> Please submit changes to branch-3.3.5; use HADOOP-18470. as the jira for
> all the release notes.
>
>  thanks.
>


Re: [DISCUSS] JIRA Public Signup Disabled

2022-11-26 Thread Ayush Saxena
Thanx Mingliang Liu. Considering the traffic around private@ for new
account requests (surprisingly high), I went ahead and updated our
Contribution Guidelines [1]. If anyone feels I made a mistake in the
content or that it can be rephrased in a better way, feel free to change it
or let me know.

In future, if the traffic of such requests increases, we can obviously come
back here and discuss the 2nd option or any other possible options.

Thanx Everyone!!!

-Ayush

[1]
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute#HowToContribute-RequestingforaJiraaccount

On Fri, 25 Nov 2022 at 06:16, Mingliang Liu  wrote:

> Thanks Ayush for taking care of this. I think option 1 sounds good.
>
> > On Nov 22, 2022, at 2:51 PM, Ayush Saxena  wrote:
> >
> > Hi Folks,
> > Just driving the attention towards the recent change from Infra, which
> > disables new people from creating a Jira account, in order to prevent
> spams
> > around JIRA.
> > So, the new Jira account creation request needs to be routed via the PMC
> of
> > the project.
> > So, we have 2 options, which I can think of:
> > 1. Update the contribution guidelines to route such requests to private@
> > 2. Create a dedicated ML for it. A couple of projects which I know did
> that.
> >
> > The Infra page: https://infra.apache.org/jira-guidelines.html
> >
> > Let me know what folks think, if nothing, I will go with the 1st option
> and
> > update the contribution guidelines mostly by next week or a week after
> that.
> >
> > -Ayush
>
>


[DISCUSS] JIRA Public Signup Disabled

2022-11-22 Thread Ayush Saxena
Hi Folks,
Just drawing attention to the recent change from Infra, which disables new
people from creating a JIRA account, in order to prevent spam around JIRA.
So, new JIRA account creation requests need to be routed via the PMC of the
project.
We have 2 options, which I can think of:
1. Update the contribution guidelines to route such requests to private@
2. Create a dedicated ML for it. A couple of projects which I know did that.

The Infra page: https://infra.apache.org/jira-guidelines.html

Let me know what folks think; if nothing comes up, I will go with the 1st
option and update the contribution guidelines by next week or a week after
that.

-Ayush


2022 ASF Community Survey

2022-08-25 Thread Ayush Saxena

Hello everyone,

The 2022 ASF Community Survey is looking to gather scientific data that
allows us to understand our community better, both in its demographic
composition, and also in collaboration styles and preferences. We want to
find areas where we can continue to do great work, and others where we need
to provide more support so that our projects can keep growing healthy and
diverse.

If you have an apache.org email, you should have received an email with an
invitation to take the 2022 ASF Community Survey. Please take 15 minutes to
complete it.

If you do not have an apache.org email address or you didn’t receive a
link, please follow this link to the survey:

https://edi-asf.limesurvey.net/912832?lang=en

You can find information about privacy on the survey’s Confluence page.
<https://cwiki.apache.org/confluence/display/EDI/Survey+-+Launch+Plan> The
last surveys of this kind were implemented in 2016 and 2020, which means we
are finally in a position to see trends over time.

Your participation is paramount to the success of this project! Please
consider filling out the survey, and share this news with your fellow
Apache contributors. As individuals form the Apache community, your opinion
matters: we want to hear your voice.

If you have any questions about the survey or otherwise, please reach out
to us!

Kindly,

Ayush Saxena
On Behalf of:

Katia Rojas

V.P. of Diversity and Inclusion

The Apache Software Foundation

[jira] [Resolved] (MAPREDUCE-7385) improve JobEndNotifier#httpNotification With recommended methods

2022-08-08 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved MAPREDUCE-7385.
-
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> improve JobEndNotifier#httpNotification With recommended methods
> ---
>
> Key: MAPREDUCE-7385
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7385
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> JobEndNotifier#httpNotification's DefaultHttpClient has been Deprecated, use 
> the recommended method instead
> JobEndNotifier#httpNotification
> {code:java}
> private static int httpNotification(String uri, int timeout)
>       throws IOException, URISyntaxException {
>     DefaultHttpClient client = new DefaultHttpClient();
>     client.getParams()
>         .setIntParameter(CoreConnectionPNames.SO_TIMEOUT, timeout)
>         .setLongParameter(ClientPNames.CONN_MANAGER_TIMEOUT, (long) timeout);
>     HttpGet httpGet = new HttpGet(new URI(uri));
>     httpGet.setHeader("Accept", "*/*");
>     return client.execute(httpGet).getStatusLine().getStatusCode();
>   } {code}
>  * CoreConnectionPNames.SO_TIMEOUT
>  * Use RequestConfig.setSocketTimeout instead
> {code:java}
> Deprecated.Defines the socket timeout (SO_TIMEOUT) in milliseconds, which is 
> the timeout for waiting for data or, put differently, a maximum period 
> inactivity between two consecutive data packets). A timeout value of zero is 
> interpreted as an infinite timeout. {code}
>  
>  * ClientPNames.CONN_MANAGER_TIMEOUT
>  * Use RequestConfig.setConnectionRequestTimeout instead
> {code:java}
> Deprecated. Defines the timeout in milliseconds used when retrieving an 
> instance of ManagedClientConnection from the ClientConnectionManager. {code}
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
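The actual fix on the ticket replaces the deprecated `CoreConnectionPNames`/`ClientPNames` parameters with Apache HttpClient's `RequestConfig` builder, as the description suggests. As a rough, self-contained sketch of the same idea (an explicit connect timeout on the client and a per-request timeout on the request), this uses only the JDK 11+ `java.net.http` client; the class and helper names here are hypothetical, not Hadoop code:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

class JobEndNotifySketch {

    // Mirrors httpNotification's inputs: a notification URI and a timeout in ms.
    static HttpRequest buildNotification(String uri, int timeoutMillis) {
        return HttpRequest.newBuilder(URI.create(uri))
                .timeout(Duration.ofMillis(timeoutMillis)) // per-request timeout
                .header("Accept", "*/*")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // The connect timeout lives on the client; the request timeout lives
        // on the request, replacing the deprecated SO_TIMEOUT and
        // CONN_MANAGER_TIMEOUT parameters of DefaultHttpClient.
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofMillis(5000))
                .build();
        HttpRequest request =
                buildNotification("http://example.org/jobend?status=SUCCEEDED", 5000);
        // Sending and returning the status code, as the original method did,
        // would be:
        //   client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
        System.out.println(request.timeout().get().toMillis());
    }
}
```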




Re: [VOTE] Release Apache Hadoop 3.2.4 - RC0

2022-07-21 Thread Ayush Saxena
+1(Binding)

* Built from Source
* Successful native build on Ubuntu 18.04
* Verified Checksums
* Verified Signatures
* Successful RAT check
* Ran Basic HDFS shell commands
* Ran basic MR example Jobs (TeraGen/TeraSort & TeraValidate)
* Browsed through UI(NN, DN, RM, NM & JHS)
* Skimmed through the contents of ChangeLog & ReleaseNotes. Look Good

Thanx Masatake for driving the release, Good Luck!!!

-Ayush

On Thu, 21 Jul 2022 at 23:28, Chris Nauroth  wrote:

> I'm changing my vote to +1 (binding).
>
> Masatake and Ashutosh, thank you for investigating.
>
> I reran tests without the parallel options, and that mostly addressed the
> failures. Maybe the tests in question are just not sufficiently isolated to
> support parallel execution. That looks to be the case for TestFsck, where
> the failure was caused by missing audit log entries. This test works by
> toggling global logging state, so I can see why multi-threaded execution
> might confuse the test.
>
> Chris Nauroth
>
>
> On Thu, Jul 21, 2022 at 12:01 AM Ashutosh Gupta <
> ashutoshgupta...@gmail.com>
> wrote:
>
> > +1(non-binding)
> >
> > * Builds from source look good.
> > * Checksums and signatures are correct.
> > * Running basic HDFS and MapReduce commands looks good.
> >
> > > * TestAMRMProxy - Not able to reproduce in local
> > > * TestFsck - I can see failure only I can see is
> >  TestFsck.testFsckListCorruptSnapshotFiles which passed after applying
> > HDFS-15038
> > > * TestSLSStreamAMSynth - Not able to reproduce in local
> > > * TestServiceAM - Not able to reproduce in local
> >
> > Thanks Masatake for driving this release.
> >
> > On Thu, Jul 21, 2022 at 5:51 AM Masatake Iwasaki <
> > iwasak...@oss.nttdata.com>
> > wrote:
> >
> > > Hi developers,
> > >
> > > I'm still waiting for your vote.
> > > I'm considering the intermittent test failures mentioned by Chris are
> not
> > > blocker.
> > > Please file a JIRA and let me know if you find a blocker issue.
> > >
> > > I will appreciate your help for the release process.
> > >
> > > Regards,
> > > Masatake Iwasaki
> > >
> > > On 2022/07/20 14:50, Masatake Iwasaki wrote:
> > > >> TestServiceAM
> > > >
> > > > I can see the reported failure of TestServiceAM in some "Apache
> Hadoop
> > > qbt Report: branch-3.2+JDK8 on Linux/x86_64".
> > > > 3.3.0 and above might be fixed by YARN-8867 which added guard using
> > > GenericTestUtils#waitFor for stabilizing the
> > > testContainersReleasedWhenPreLaunchFails.
> > > > YARN 8867 did not modified other code under hadoop-yarn-services.
> > > > If it is the case, TestServiceAM can be tagged as flaky in
> branch-3.2.
> > > >
> > > >
> > > > On 2022/07/20 14:21, Masatake Iwasaki wrote:
> > > >> Thanks for testing the RC0, Chris.
> > > >>
> > > >>> The following are new test failures for me on 3.2.4:
> > > >>> * TestAMRMProxy
> > > >>> * TestFsck
> > > >>> * TestSLSStreamAMSynth
> > > >>> * TestServiceAM
> > > >>
> > > >> I could not reproduce the test failures on my local.
> > > >>
> > > >> For TestFsck, if the failed test case is
> > > testFsckListCorruptSnapshotFiles,
> > > >> cherry-picking HDFS-15038 (fixing only test code) could be the fix.
> > > >>
> > > >> The failure of TestSLSStreamAMSynth looks frequently reported by
> > > >> "Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64".
> > > >> It could be tagged as known flaky test.
> > > >>
> > > >> On 2022/07/20 9:15, Chris Nauroth wrote:
> > > >>> -0 (binding)
> > > >>>
> > > >>> * Verified all checksums.
> > > >>> * Verified all signatures.
> > > >>> * Built from source, including native code on Linux.
> > > >>>  * mvn clean package -Pnative -Psrc -Drequire.openssl
> > > -Drequire.snappy
> > > >>> -Drequire.zstd -DskipTests
> > > >>> * Tests mostly passed, but see below.
> > > >>>  * mvn --fail-never clean test -Pnative -Dparallel-tests
> > > >>> -Drequire.snappy -Drequire.zstd -Drequire.openssl
> > > >>> -Dsurefire.rerunFailingTestsCount=3 -DtestsThreadCount=8
> > > >>>
> > > >>> The following are new test failures for me on 3.2.4:
> > > >>> * TestAMRMProxy
> > > >>> * TestFsck
> > > >>> * TestSLSStreamAMSynth
> > > >>> * TestServiceAM
> > > >>>
> > > >>> The following tests also failed, but they also fail for me on
> 3.2.3,
> > so
> > > >>> they aren't likely to be related to this release candidate:
> > > >>> * TestCapacitySchedulerNodeLabelUpdate
> > > >>> * TestFrameworkUploader
> > > >>> * TestSLSGenericSynth
> > > >>> * TestSLSRunner
> > > >>> * test_libhdfs_threaded_hdfspp_test_shim_static
> > > >>>
> > > >>> I'm not voting a full -1, because I haven't done any root cause
> > > analysis on
> > > >>> these new test failures. I don't know if it's a quirk to my
> > > environment,
> > > >>> though I'm using the start-build-env.sh Docker container, so any
> > build
> > > >>> dependencies should be consistent. I'd be comfortable moving ahead
> if
> > > >>> others are seeing these tests pass.
> > > >>>
> > > >>> Chris Nauroth
> > > >>>
> > > >>>
> > > >>> On Thu, Jul 14, 2022 at 7:57 AM 

[jira] [Resolved] (MAPREDUCE-7389) Typo in description of "mapreduce.application.classpath" in mapred-default.xml

2022-06-21 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved MAPREDUCE-7389.
-
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Typo in description of "mapreduce.application.classpath" in mapred-default.xml
> --
>
> Key: MAPREDUCE-7389
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7389
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 3.3.3
>Reporter: Christian Bartolomäus
>Assignee: Christian Bartolomäus
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> There is a small typo for {variable} in the description of 
> "mapreduce.application.classpath" in 
> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml.
> {noformat}
> If mapreduce.app-submission.cross-platform is false, platform-specific
> environment vairable expansion syntax would be used to construct the default
> CLASSPATH entries.
> {noformat}
> I just stumbled upon that and will open a PR over at github.com.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



[jira] [Resolved] (MAPREDUCE-7387) Fix TestJHSSecurity#testDelegationToken AssertionError due to HDFS-16563

2022-06-20 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved MAPREDUCE-7387.
-
Fix Version/s: 3.4.0
   3.3.4
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Fix TestJHSSecurity#testDelegationToken AssertionError due to HDFS-16563
> 
>
> Key: MAPREDUCE-7387
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7387
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.4.0, 3.3.4
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> During the processing of HADOOP-18284. Fix Repeated Semicolons., PR#4422 was 
> submitted, and an error was reported in 
> hadoop.mapreduce.security.TestJHSSecurity#testDelegationToken in the test 
> report.
> {code:java}
> [ERROR] 
> testDelegationToken(org.apache.hadoop.mapreduce.security.TestJHSSecurity)  
> Time elapsed: 16.344 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at 
> org.apache.hadoop.mapreduce.security.TestJHSSecurity.testDelegationToken(TestJHSSecurity.java:163)
> .{code}
> It can be found that HDFS-16563 is causing this problem.
> The reason is that HDFS-16563 changed the error message, which made MR's
> JUnit test assertion fail.



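The failure mode behind MAPREDUCE-7387 above (a JUnit assertion breaking because HDFS-16563 reworded an error message) can be sketched as follows. This is an illustrative reconstruction, not the actual TestJHSSecurity code; the exception type and message strings are invented.

```java
// Hypothetical sketch of the MAPREDUCE-7387 failure mode: a test pinned to
// the exact wording of an error message breaks when that wording changes
// upstream, even though the tested behavior is unchanged. Names and
// messages here are invented for illustration.
public class BrittleAssertionDemo {

    // Stand-in for a call whose error message was reworded upstream.
    static void renewExpiredToken() {
        throw new IllegalStateException("token (id=1) is expired");
    }

    public static void main(String[] args) {
        try {
            renewExpiredToken();
        } catch (IllegalStateException e) {
            // Brittle: exact-match against the old wording now fails.
            boolean exactMatch = e.getMessage().equals("token 1 is expired");
            // More robust: assert on the exception type plus a stable keyword.
            boolean keywordMatch = e.getMessage().contains("expired");
            System.out.println(exactMatch + " " + keywordMatch); // prints "false true"
        }
    }
}
```

Asserting on the exception type and a stable substring (or an error code) keeps such tests resilient to cosmetic message changes in dependencies.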



Re: [VOTE] Release Apache Hadoop 2.10.2 - RC0

2022-05-30 Thread Ayush Saxena
+1,
* Built from source
* Verified Checksums
* Verified Signatures
* Successful RAT check
* Ran some basic HDFS shell commands.
* Browsed through the UI(NN,DN,RM,NM,JHS)

Thanx Masatake for driving the release, Good Luck!!!

-Ayush

On Mon, 30 May 2022 at 03:14, Chris Nauroth  wrote:

> +1 (binding)
>
> * Verified all checksums.
> * Verified all signatures.
> * Built from source, including native code on Linux.
> * mvn clean package -Pnative -Psrc -Drequire.openssl -Drequire.snappy
> -Drequire.zstd -DskipTests
> * Almost all unit tests passed.
> * mvn clean test -Pnative -Dparallel-tests -Drequire.snappy
> -Drequire.zstd -Drequire.openssl -Dsurefire.rerunFailingTestsCount=3
> -DtestsThreadCount=8
> * TestBookKeeperHACheckpoints consistently has a few failures.
> * TestCapacitySchedulerNodeLabelUpdate is flaky, intermittently timing
> out.
>
> These test failures don't look significant enough to hold up a release, so
> I'm still voting +1.
>
> Chris Nauroth
>
>
> On Sun, May 29, 2022 at 2:35 AM Masatake Iwasaki <
> iwasak...@oss.nttdata.co.jp> wrote:
>
>> Thanks for the help, Ayush.
>>
>> I committed HADOOP-16663/HADOOP-16664 and cherry-picked HADOOP-16985 to
>> branch-2.10 (and branch-3.2).
>> If I need to cut RC1, I will try cherry-picking them to branch-2.10.2
>>
>> Masatake Iwasaki
>>
>>
>> On 2022/05/28 5:23, Ayush Saxena wrote:
>> > The checksum stuff was addressed in HADOOP-16985, so that filename
>> stuff is
>> > sorted only post 3.3.x
>> > BTW it is a known issue:
>> >
>> https://issues.apache.org/jira/browse/HADOOP-16494?focusedCommentId=16927236&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16927236
>> >
>> > Must not be a blocker for us
>> >
>> > The RAT check failing with dependency issue. That also should work post
>> > 3.3.x because there is no Hadoop-maven-plugin dependency in
>> Hadoop-yarn-api
>> > module post 3.3.x, HADOOP-16560 removed it.
>> > Ref:
>> >
>> https://github.com/apache/hadoop/pull/1496/files#diff-f5d219eaf211871f9527ae48da59586e7e9958ea7649de74a1393e599caa6dd6L121-R122
>> >
>> > So, that is why the RAT check passes for 3.3.x+ without the need of this
>> > module. Committing HADOOP-16663, should solve this though.(I haven't
>> tried
>> > though, just by looking at the problem)
>> >
>> > Good to have patches, but doesn't look like blockers to me. kind of
>> build
>> > related stuffs only, nothing bad with our core Hadoop code.
>> >
>> > -Ayush
>> >
>> > On Sat, 28 May 2022 at 01:04, Viraj Jasani  wrote:
>> >
>> >> +0 (non-binding),
>> >>
>> >> * Signature/Checksum looks good, though I am not sure where
>> >> "target/artifacts" is coming from for the tars, here is the diff (this
>> was
>> >> the case for 2.10.1 as well but checksum was correct):
>> >>
>> >> 1c1
>> >> < SHA512 (hadoop-2.10.2-site.tar.gz) =
>> >>
>> >>
>> 3055a830003f5012660d92da68a317e15da5b73301c2c73cf618e724c67b7d830551b16928e0c28c10b66f04567e4b6f0b564647015bacc4677e232c0011537f
>> >> ---
>> >>> SHA512 (target/artifacts/hadoop-2.10.2-site.tar.gz) =
>> >>
>> >>
>> 3055a830003f5012660d92da68a317e15da5b73301c2c73cf618e724c67b7d830551b16928e0c28c10b66f04567e4b6f0b564647015bacc4677e232c0011537f
>> >> 1c1
>> >> < SHA512 (hadoop-2.10.2-src.tar.gz) =
>> >>
>> >>
>> 483b6a4efd44234153e21ffb63a9f551530a1627f983a8837c655ce1b8ef13486d7178a7917ed3f35525c338e7df9b23404f4a1b0db186c49880448988b88600
>> >> ---
>> >>> SHA512 (target/artifacts/hadoop-2.10.2-src.tar.gz) =
>> >>
>> >>
>> 483b6a4efd44234153e21ffb63a9f551530a1627f983a8837c655ce1b8ef13486d7178a7917ed3f35525c338e7df9b23404f4a1b0db186c49880448988b88600
>> >> 1c1
>> >> < SHA512 (hadoop-2.10.2.tar.gz) =
>> >>
>> >>
>> 13e95907073d815e3f86cdcc24193bb5eec0374239c79151923561e863326988c7f32a05fb7a1e5bc962728deb417f546364c2149541d6234221b00459154576
>> >> ---
>> >>> SHA512 (target/artifacts/hadoop-2.10.2.tar.gz) =
>> >>
>> >>
>> 13e95907073d815e3f86cdcc24193bb5eec0374239c79151923561e863326988c7f32a05fb7a1e5bc962728deb417f546364c2149541d6234221b00459154576
>> >>
>> >> However, checksums are correct.
>> >>
>> >> * Builds from source look good
>> >>   - mvn clean 

Re: [VOTE] Release Apache Hadoop 2.10.2 - RC0

2022-05-27 Thread Ayush Saxena
The checksum stuff was addressed in HADOOP-16985, so that filename stuff is
sorted only post 3.3.x
BTW it is a known issue:
https://issues.apache.org/jira/browse/HADOOP-16494?focusedCommentId=16927236&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16927236

Must not be a blocker for us

The RAT check is failing with a dependency issue. That also should work post
3.3.x because there is no hadoop-maven-plugins dependency in the hadoop-yarn-api
module post 3.3.x; HADOOP-16560 removed it.
Ref:
https://github.com/apache/hadoop/pull/1496/files#diff-f5d219eaf211871f9527ae48da59586e7e9958ea7649de74a1393e599caa6dd6L121-R122

So, that is why the RAT check passes for 3.3.x+ without the need of this
module. Committing HADOOP-16663 should solve this, though (I haven't tried
it; just going by the look of the problem).

Good-to-have patches, but they don't look like blockers to me; kind of
build-related stuff only, nothing bad with our core Hadoop code.

-Ayush

On Sat, 28 May 2022 at 01:04, Viraj Jasani  wrote:

> +0 (non-binding),
>
> * Signature/Checksum looks good, though I am not sure where
> "target/artifacts" is coming from for the tars, here is the diff (this was
> the case for 2.10.1 as well but checksum was correct):
>
> 1c1
> < SHA512 (hadoop-2.10.2-site.tar.gz) =
>
> 3055a830003f5012660d92da68a317e15da5b73301c2c73cf618e724c67b7d830551b16928e0c28c10b66f04567e4b6f0b564647015bacc4677e232c0011537f
> ---
> > SHA512 (target/artifacts/hadoop-2.10.2-site.tar.gz) =
>
> 3055a830003f5012660d92da68a317e15da5b73301c2c73cf618e724c67b7d830551b16928e0c28c10b66f04567e4b6f0b564647015bacc4677e232c0011537f
> 1c1
> < SHA512 (hadoop-2.10.2-src.tar.gz) =
>
> 483b6a4efd44234153e21ffb63a9f551530a1627f983a8837c655ce1b8ef13486d7178a7917ed3f35525c338e7df9b23404f4a1b0db186c49880448988b88600
> ---
> > SHA512 (target/artifacts/hadoop-2.10.2-src.tar.gz) =
>
> 483b6a4efd44234153e21ffb63a9f551530a1627f983a8837c655ce1b8ef13486d7178a7917ed3f35525c338e7df9b23404f4a1b0db186c49880448988b88600
> 1c1
> < SHA512 (hadoop-2.10.2.tar.gz) =
>
> 13e95907073d815e3f86cdcc24193bb5eec0374239c79151923561e863326988c7f32a05fb7a1e5bc962728deb417f546364c2149541d6234221b00459154576
> ---
> > SHA512 (target/artifacts/hadoop-2.10.2.tar.gz) =
>
> 13e95907073d815e3f86cdcc24193bb5eec0374239c79151923561e863326988c7f32a05fb7a1e5bc962728deb417f546364c2149541d6234221b00459154576
>
> However, checksums are correct.
>
> * Builds from source look good
>  - mvn clean install  -DskipTests
>  - mvn clean package  -Pdist -DskipTests -Dtar -Dmaven.javadoc.skip=true
>
> * Rat check, if run before building from source locally, fails with error:
>
> [ERROR] Plugin org.apache.hadoop:hadoop-maven-plugins:2.10.2 or one of its
> dependencies could not be resolved: Could not find artifact
> org.apache.hadoop:hadoop-maven-plugins:jar:2.10.2 in central (
> https://repo.maven.apache.org/maven2) -> [Help 1]
> [ERROR]
>
> However, once we build locally, rat check passes (because
> hadoop-maven-plugins 2.10.2 would be present in local .m2).
> Also, hadoop-maven-plugins:2.10.2 is available here
>
> https://repository.apache.org/content/repositories/orgapachehadoop-1350/org/apache/hadoop/hadoop-maven-plugins/2.10.2/
>
> * Ran sample HDFS and MapReduce commands, look good.
>
> Until we release Hadoop artifacts, hadoop-maven-plugins for that release
> would not be present in the central maven repository, hence I am still
> wondering how rat check failed only for this RC and not for any of previous
> release RCs. hadoop-vote.sh always runs rat check before building from
> source locally.
>
>
> On Tue, May 24, 2022 at 7:41 PM Masatake Iwasaki <
> iwasak...@oss.nttdata.co.jp> wrote:
>
> > Hi all,
> >
> > Here's Hadoop 2.10.2 release candidate #0:
> >
> > The RC is available at:
> >https://home.apache.org/~iwasakims/hadoop-2.10.2-RC0/
> >
> > The RC tag is at:
> >https://github.com/apache/hadoop/releases/tag/release-2.10.2-RC0
> >
> > The Maven artifacts are staged at:
> >
> https://repository.apache.org/content/repositories/orgapachehadoop-1350
> >
> > You can find my public key at:
> >https://downloads.apache.org/hadoop/common/KEYS
> >
> > Please evaluate the RC and vote.
> > The vote will be open for (at least) 5 days.
> >
> > Thanks,
> > Masatake Iwasaki
> >
> > -
> > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> >
> >
>


Re: [VOTE] Release Apache Hadoop 3.3.3 (RC1)

2022-05-16 Thread Ayush Saxena
+1,
* Built from source.
* Successful native build on Ubuntu 18.04
* Verified Checksums.
(CHANGELOG.md,RELEASENOTES.md,hadoop-3.3.3-rat.txt,hadoop-3.3.3-site.tar.gz,hadoop-3.3.3-src.tar.gz,hadoop-3.3.3.tar.gz)
* Verified Signature.
* Successful RAT check
* Ran basic HDFS shell commands.
* Ran basic YARN shell commands.
* Verified version in hadoop version command and UI
* Ran some MR example Jobs.
* Browsed UI(Namenode/Datanode/ResourceManager/NodeManager/HistoryServer)
* Browsed the contents of Maven Artifacts.
* Browsed the contents of the website.

Thanx Steve for driving the release, Good Luck!!!

-Ayush
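The "Verified Checksums" step that recurs in the votes above boils down to recomputing a SHA-512 digest over the downloaded bytes and comparing it with the digest recorded in the published .sha512 file. A minimal sketch of that idea; the bytes and file names here are placeholders, not real release artifacts:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Minimal sketch of a release-checksum check: recompute SHA-512 over the
// downloaded bytes and compare (case-insensitively) with the recorded
// digest. In a real verification the bytes come from hadoop-X.Y.Z.tar.gz
// and the recorded digest from the matching .sha512 file.
public class ChecksumCheck {

    static String sha512Hex(byte[] data) throws Exception {
        StringBuilder sb = new StringBuilder();
        for (byte b : MessageDigest.getInstance("SHA-512").digest(data)) {
            sb.append(String.format("%02x", b)); // 64 bytes -> 128 hex chars
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] tarball = "placeholder-release-bytes".getBytes(StandardCharsets.UTF_8);
        String recorded = sha512Hex(tarball); // normally read from the .sha512 file
        String recomputed = sha512Hex(tarball);
        System.out.println(recorded.equalsIgnoreCase(recomputed) ? "OK" : "MISMATCH"); // prints "OK"
    }
}
```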

On Mon, 16 May 2022 at 08:20, Xiaoqiao He  wrote:

> +1(binding)
>
> * Verified signature and checksum of the source tarball.
> * Built the source code on Ubuntu and OpenJDK 11 by `mvn clean package
> -DskipTests -Pnative -Pdist -Dtar`.
> * Setup pseudo cluster with HDFS and YARN.
> * Run simple FsShell - mkdir/put/get/mv/rm and check the result.
> * Run example mr applications and check the result - Pi & wordcount.
> * Check the Web UI of NameNode/DataNode/Resourcemanager/NodeManager etc.
>
> Thanks Steve for your work.
>
> - He Xiaoqiao
>
> On Mon, May 16, 2022 at 4:25 AM Viraj Jasani  wrote:
> >
> > +1 (non-binding)
> >
> > * Signature: ok
> > * Checksum : ok
> > * Rat check (1.8.0_301): ok
> >  - mvn clean apache-rat:check
> > * Built from source (1.8.0_301): ok
> >  - mvn clean install  -DskipTests
> > * Built tar from source (1.8.0_301): ok
> >  - mvn clean package  -Pdist -DskipTests -Dtar -Dmaven.javadoc.skip=true
> >
> > HDFS, MapReduce and HBase (2.5) CRUD functional testing on
> > pseudo-distributed mode looks good.
> >
> >
> > On Wed, May 11, 2022 at 10:26 AM Steve Loughran
> 
> > wrote:
> >
> > > I have put together a release candidate (RC1) for Hadoop 3.3.3
> > >
> > > The RC is available at:
> > > https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC1/
> > >
> > > The git tag is release-3.3.3-RC1, commit d37586cbda3
> > >
> > > The maven artifacts are staged at
> > >
> https://repository.apache.org/content/repositories/orgapachehadoop-1349/
> > >
> > > You can find my public key at:
> > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > >
> > > Change log
> > > https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC1/CHANGELOG.md
> > >
> > > Release notes
> > >
> https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC1/RELEASENOTES.md
> > >
> > > There's a very small number of changes, primarily critical
> code/packaging
> > > issues and security fixes.
> > >
> > > * The critical fixes which shipped in the 3.2.3 release.
> > > * CVEs in our code and dependencies
> > > * Shaded client packaging issues.
> > > * A switch from log4j to reload4j
> > >
> > > reload4j is an active fork of the log4j 1.17 library with the classes
> > > which contain CVEs removed. Even though hadoop never used those
> classes,
> > > they regularly raised alerts on security scans and concern from users.
> > > Switching to the forked project allows us to ship a secure logging
> > > framework. It will complicate the builds of downstream
> > > maven/ivy/gradle projects which exclude our log4j artifacts, as they
> > > need to cut the new dependency instead/as well.
> > >
> > > See the release notes for details.
> > >
> > > This is the second release attempt. It is the same git commit as
> before,
> > > but
> > > fully recompiled with another republish to maven staging, which has been
> > > verified by building spark, as well as a minimal test project.
> > >
> > > Please try the release and vote. The vote will run for 5 days.
> > >
> > > -Steve
> > >
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


[jira] [Resolved] (MAPREDUCE-7376) AggregateWordCount fetches wrong results

2022-05-09 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved MAPREDUCE-7376.
-
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> AggregateWordCount fetches wrong results
> 
>
> Key: MAPREDUCE-7376
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7376
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>        Reporter: Ayush Saxena
>    Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> AggregateWordCount, rather than counting the words, gives a single-line
> output counting the number of rows.
> Wrong Result Looks Like:
> {noformat}
> hadoop-3.4.0-SNAPSHOT % bin/hdfs dfs -cat /testOut1/part-r-0
> record_count 2
> {noformat}
> Correct Should Look Like:
> {noformat}
> hadoop-3.4.0-SNAPSHOT % bin/hdfs dfs -cat /testOut1/part-r-0  
>  
> Bye   1
> Goodbye   1
> Hadoop2
> Hello 2
> World 2
> {noformat}






Re: [VOTE] Release Apache Hadoop 3.3.3

2022-05-06 Thread Ayush Saxena
Hmm, I see the artifacts ideally should have been overwritten by the new RC, but
they weren't. The reason seems to be that the shared staging path doesn't have any
jars…
That is why it was picking up the old jars. I think Steve needs to run mvn deploy
again…

Sent from my iPhone

> On 07-May-2022, at 7:12 AM, Chao Sun  wrote:
> 
> 
>> 
>> Chao can you use the one that Steve mentioned in the mail?
> 
> Hmm how do I do that? Typically after closing the RC in nexus the
> release bits will show up in
> https://repository.apache.org/content/repositories/staging/org/apache/hadoop
> and Spark build will be able to pick them up for testing. However in
> this case I don't see any 3.3.3 jars in the URL.
> 
>> On Fri, May 6, 2022 at 6:24 PM Ayush Saxena  wrote:
>> 
>> There were two 3.3.3 staged. The earlier one was with skipShade, the date 
>> was also april 22, I archived that. Chao can you use the one that Steve 
>> mentioned in the mail?
>> 
>>> On Sat, 7 May 2022 at 06:18, Chao Sun  wrote:
>>> 
>>> Seems there are some issues with the shaded client as I was not able
>>> to compile Apache Spark with the RC
>>> (https://github.com/apache/spark/pull/36474). Looks like it's compiled
>>> with the `-DskipShade` option and the hadoop-client-api JAR doesn't
>>> contain any class:
>>> 
>>> ➜  hadoop-client-api jar tf 3.3.3/hadoop-client-api-3.3.3.jar
>>> META-INF/
>>> META-INF/MANIFEST.MF
>>> META-INF/NOTICE.txt
>>> META-INF/LICENSE.txt
>>> META-INF/maven/
>>> META-INF/maven/org.apache.hadoop/
>>> META-INF/maven/org.apache.hadoop/hadoop-client-api/
>>> META-INF/maven/org.apache.hadoop/hadoop-client-api/pom.xml
>>> META-INF/maven/org.apache.hadoop/hadoop-client-api/pom.properties
>>> 
>>> On Fri, May 6, 2022 at 4:24 PM Stack  wrote:
>>>> 
>>>> +1 (binding)
>>>> 
>>>>  * Signature: ok
>>>>  * Checksum : passed
>>>>  * Rat check (1.8.0_191): passed
>>>>   - mvn clean apache-rat:check
>>>>  * Built from source (1.8.0_191): failed
>>>>   - mvn clean install  -DskipTests
>>>>   - mvn -fae --no-transfer-progress -DskipTests -Dmaven.javadoc.skip=true
>>>> -Pnative -Drequire.openssl -Drequire.snappy -Drequire.valgrind
>>>> -Drequire.zstd -Drequire.test.libhadoop clean install
>>>>  * Unit tests pass (1.8.0_191):
>>>>- HDFS Tests passed (Didn't run more than this).
>>>> 
>>>> Deployed a ten node ha hdfs cluster with three namenodes and five
>>>> journalnodes. Ran a ten node hbase (older version of 2.5 branch built
>>>> against 3.3.2) against it. Tried a small verification job. Good. Ran a
>>>> bigger job with mild chaos. All seems to be working properly (recoveries,
>>>> logs look fine). Killed a namenode. Failover worked promptly. UIs look
>>>> good. Poked at the hdfs cli. Seems good.
>>>> 
>>>> S
>>>> 
>>>> On Tue, May 3, 2022 at 4:24 AM Steve Loughran 
>>>> wrote:
>>>> 
>>>>> I have put together a release candidate (rc0) for Hadoop 3.3.3
>>>>> 
>>>>> The RC is available at:
>>>>> https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC0/
>>>>> 
>>>>> The git tag is release-3.3.3-RC0, commit d37586cbda3
>>>>> 
>>>>> The maven artifacts are staged at
>>>>> https://repository.apache.org/content/repositories/orgapachehadoop-1348/
>>>>> 
>>>>> You can find my public key at:
>>>>> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>>>>> 
>>>>> Change log
>>>>> https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC0/CHANGELOG.md
>>>>> 
>>>>> Release notes
>>>>> https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC0/RELEASENOTES.md
>>>>> 
>>>>> There's a very small number of changes, primarily critical code/packaging
>>>>> issues and security fixes.
>>>>> 
>>>>> 
>>>>>   - The critical fixes which shipped in the 3.2.3 release.
>>>>>   -  CVEs in our code and dependencies
>>>>>   - Shaded client packaging issues.
>>>>>   - A switch from log4j to reload4j
>>>>> 
>>>>> 
>>>>> reload4j is an active fork of the log4j 1.17 library with the classes 
>>>>> which
>>>

Re: [VOTE] Release Apache Hadoop 3.3.3

2022-05-06 Thread Ayush Saxena
There were two 3.3.3 artifacts staged. The earlier one was built with skipShade
(the date was also April 22); I archived that. Chao, can you use the one that
Steve mentioned in the mail?

On Sat, 7 May 2022 at 06:18, Chao Sun  wrote:

> Seems there are some issues with the shaded client as I was not able
> to compile Apache Spark with the RC
> (https://github.com/apache/spark/pull/36474). Looks like it's compiled
> with the `-DskipShade` option and the hadoop-client-api JAR doesn't
> contain any class:
>
> ➜  hadoop-client-api jar tf 3.3.3/hadoop-client-api-3.3.3.jar
> META-INF/
> META-INF/MANIFEST.MF
> META-INF/NOTICE.txt
> META-INF/LICENSE.txt
> META-INF/maven/
> META-INF/maven/org.apache.hadoop/
> META-INF/maven/org.apache.hadoop/hadoop-client-api/
> META-INF/maven/org.apache.hadoop/hadoop-client-api/pom.xml
> META-INF/maven/org.apache.hadoop/hadoop-client-api/pom.properties
>
> On Fri, May 6, 2022 at 4:24 PM Stack  wrote:
> >
> > +1 (binding)
> >
> >   * Signature: ok
> >   * Checksum : passed
> >   * Rat check (1.8.0_191): passed
> >- mvn clean apache-rat:check
> >   * Built from source (1.8.0_191): failed
> >- mvn clean install  -DskipTests
> >- mvn -fae --no-transfer-progress -DskipTests
> -Dmaven.javadoc.skip=true
> > -Pnative -Drequire.openssl -Drequire.snappy -Drequire.valgrind
> > -Drequire.zstd -Drequire.test.libhadoop clean install
> >   * Unit tests pass (1.8.0_191):
> > - HDFS Tests passed (Didn't run more than this).
> >
> > Deployed a ten node ha hdfs cluster with three namenodes and five
> > journalnodes. Ran a ten node hbase (older version of 2.5 branch built
> > against 3.3.2) against it. Tried a small verification job. Good. Ran a
> > bigger job with mild chaos. All seems to be working properly (recoveries,
> > logs look fine). Killed a namenode. Failover worked promptly. UIs look
> > good. Poked at the hdfs cli. Seems good.
> >
> > S
> >
> > On Tue, May 3, 2022 at 4:24 AM Steve Loughran
> 
> > wrote:
> >
> > > I have put together a release candidate (rc0) for Hadoop 3.3.3
> > >
> > > The RC is available at:
> > > https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC0/
> > >
> > > The git tag is release-3.3.3-RC0, commit d37586cbda3
> > >
> > > The maven artifacts are staged at
> > >
> https://repository.apache.org/content/repositories/orgapachehadoop-1348/
> > >
> > > You can find my public key at:
> > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > >
> > > Change log
> > > https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC0/CHANGELOG.md
> > >
> > > Release notes
> > >
> https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC0/RELEASENOTES.md
> > >
> > > There's a very small number of changes, primarily critical
> code/packaging
> > > issues and security fixes.
> > >
> > >
> > >- The critical fixes which shipped in the 3.2.3 release.
> > >-  CVEs in our code and dependencies
> > >- Shaded client packaging issues.
> > >- A switch from log4j to reload4j
> > >
> > >
> > > reload4j is an active fork of the log4j 1.17 library with the classes
> which
> > > contain CVEs removed. Even though hadoop never used those classes, they
> > > regularly raised alerts on security scans and concern from users.
> Switching
> > > to the forked project allows us to ship a secure logging framework. It
> will
> > > complicate the builds of downstream maven/ivy/gradle projects which
> exclude
> > > our log4j artifacts, as they need to cut the new dependency instead/as
> > > well.
> > >
> > > See the release notes for details.
> > >
> > > This is my first release through the new docker build process, do
> please
> > > validate artifact signing  to make sure it is good. I'll be trying
> builds
> > > of downstream projects.
> > >
> > > We know there are some outstanding issues with at least one library we
> are
> > > shipping (okhttp), but I don't want to hold this release up for it. If
> the
> > > docker based release process works smoothly enough we can do a followup
> > > security release in a few weeks.
> > >
> > > Please try the release and vote. The vote will run for 5 days.
> > >
> > > -Steve
> > >
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>


Re: [VOTE] Release Apache Hadoop 3.3.3

2022-05-06 Thread Ayush Saxena
+1,
* Built from source
* Successful native build on Ubuntu 18.04
* Verified Checksums
(CHANGELOG.md,RELEASENOTES.md,hadoop-3.3.3-rat.txt,hadoop-3.3.3-site.tar.gz,hadoop-3.3.3-src.tar.gz,hadoop-3.3.3.tar.gz)
* Successful RAT check
* Ran some basic HDFS shell commands
* Ran some basic YARN shell commands
* Browsed through UI (NN ,DN, RM, NM & JHS)
* Tried some commands on Hive using Hive built on hive-master
* Verified Signature: Says Good Signature but "This key is not certified
with a trusted signature!"
* Ran some MR example jobs(TeraGen, TeraSort, TeraValidate, WordCount,
WordMean & Pi)
* Version & commit hash seems correct in UI as well as in hadoop version
output.
* Browsed through the ChangeLog & Release Notes (One place mentions hadoop
3.4.0 though, but we can survive I suppose)
* Browsed through the documentation.

Thanx Steve for driving the release, Good Luck!!!

-Ayush



On Tue, 3 May 2022 at 16:54, Steve Loughran 
wrote:

> I have put together a release candidate (rc0) for Hadoop 3.3.3
>
> The RC is available at:
> https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC0/
>
> The git tag is release-3.3.3-RC0, commit d37586cbda3
>
> The maven artifacts are staged at
> https://repository.apache.org/content/repositories/orgapachehadoop-1348/
>
> You can find my public key at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Change log
> https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC0/CHANGELOG.md
>
> Release notes
> https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC0/RELEASENOTES.md
>
> There's a very small number of changes, primarily critical code/packaging
> issues and security fixes.
>
>
>- The critical fixes which shipped in the 3.2.3 release.
>-  CVEs in our code and dependencies
>- Shaded client packaging issues.
>- A switch from log4j to reload4j
>
>
> reload4j is an active fork of the log4j 1.17 library with the classes which
> contain CVEs removed. Even though hadoop never used those classes, they
> regularly raised alerts on security scans and concern from users. Switching
> to the forked project allows us to ship a secure logging framework. It will
> complicate the builds of downstream maven/ivy/gradle projects which exclude
> our log4j artifacts, as they need to cut the new dependency instead/as
> well.
>
> See the release notes for details.
>
> This is my first release through the new docker build process, do please
> validate artifact signing  to make sure it is good. I'll be trying builds
> of downstream projects.
>
> We know there are some outstanding issues with at least one library we are
> shipping (okhttp), but I don't want to hold this release up for it. If the
> docker based release process works smoothly enough we can do a followup
> security release in a few weeks.
>
> Please try the release and vote. The vote will run for 5 days.
>
> -Steve
>


Re: Aggregate Word Count from the Mapreduce examples

2022-05-02 Thread Ayush Saxena
>Am I correct in understanding then that Aggregate WordCount and WordCount
do the same thing, apart from the fact that the Aggregate WordCount example
uses the Aggregate framework of Hadoop?
That's what I feel, and the outputs of both are the same as well. The description
of both also seems to be saying that:

  *aggregatewordcount*: An Aggregate based map/reduce program that counts
the words in the input files.

&

  *wordcount*: A map/reduce program that counts the words in the input
files.


BTW. I have created a Jira and raised a PR for this:

https://issues.apache.org/jira/browse/MAPREDUCE-7376


Once it gets reviewed, you can try patching it or wait for 3.4.0
release(not anytime soon).


Thanx...


-Ayush

On Tue, 3 May 2022 at 00:12, Pratyush Das  wrote:

> Thanks!
>
> Am I correct in understanding then that Aggregate WordCount and WordCount
> do the same thing, apart from the fact that the Aggregate WordCount example
> uses the Aggregate framework of Hadoop?  - as mentioned here in
> https://stackoverflow.com/questions/24105117/how-to-execute-aggreagatewordcount-example-in-hadoop-which-uses-hadoop-aggregate#comment37203837_24105117
>
>
> On Mon, 2 May 2022 at 13:16, Ayush Saxena  wrote:
>
>> Hi,
>> I tried it too and it gave me a similar output. Looks like some bug with
>> the code. The code seems to be there since stone age though...
>> I tried a fix; it seems a "." (period) was missing when setting the
>> conf, while on retrieval we were trying to get the key with the period.
>> Have put the code here:
>>
>> https://github.com/ayushtkn/hadoop/commit/ab7da425e204903e867855b05b7c8fc2fbdd8b0e
>>
>> Patched it on top of trunk and gave it a try locally for your use case,
>> seems post that output is correct. Will check and raise a MAPRED Jira to
>> fix, If it gets reviewed & Committed you can either patch your hadoop
>> distro or wait for the next release which would contain a fix.
>>
>> hadoop-3.4.0-SNAPSHOT % bin/hadoop jar
>> share/hadoop/mapreduce/hadoop-mapreduce-examples-3.4.0-SNAPSHOT.jar  
>> aggregatewordcount
>> /testData /testOut 1 textinputformat
>>
>>
>> hadoop-3.4.0-SNAPSHOT % bin/hdfs dfs -cat /testOut/part-r-0
>>
>>
>>
>> Bye 1
>>
>> Goodbye 1
>>
>> Hadoop 2
>>
>> Hello 2
>>
>> World 2
>>
>>
>>
>> > Does this mean that Aggregate WordCount is merely counting the number
>> of files in the input directory?
>>
>> Not in an ideal situation. The JavaDoc says: *It reads the text input
>> files, breaks each line into words and counts them. The output is a locally
>> sorted list of words and the count of how often they occurred.*
>>
>> On Mon, 2 May 2022 at 10:23, Pratyush Das  wrote:
>>
>>> Hi,
>>>
>>> I had some questions about what the Aggregate Word Count example in the
>>> hadoop-mapreduce-examples-3.3.1.jar actually does.
>>>
>>> This is how I executed the AggregateWordCount example - hadoop jar
>>> hadoop-3.3.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar
>>> aggregatewordcount /examples-input/wordcount/ /examples-output/wordcount/ 1
>>> textinputformat
>>>
>>> /examples-input/wordcount/ contains 2 files - wc01.txt and wc02.txt.
>>>
>>> These are the contents of wc01.txt:
>>> Hello World Bye World
>>>
>>> These are the contents of wc02.txt:
>>> Hello Hadoop Goodbye Hadoop
>>>
>>> The generated output file - /examples-output/wordcount/part-r-0
>>> contains the following line:
>>> record_count 2
>>>
>>> I tried adding another file - wc03.txt which changed the content of the
>>> generated file to:
>>> record_count 3
>>>
>>> Does this mean that Aggregate WordCount is merely counting the number of
>>> files in the input directory?
>>>
>>> Regards,
>>>
>>>
>>> --
>>> Pratyush Das
>>>
>>
>
> --
> Pratyush Das
>
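The fix Ayush describes in the thread above, a "." missing when the configuration key was written while the reader looked the key up with the period, is a plain key-mismatch bug. A hypothetical sketch of that pattern follows; this is not the actual Hadoop ValueAggregator code, and the key names are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the MAPREDUCE-7376 bug pattern: the writer builds
// the key without a trailing period, the reader expects one, so the lookup
// silently misses and a default code path runs instead. Key names are
// illustrative, not the actual Hadoop configuration keys.
public class ConfKeyMismatch {
    private static final Map<String, String> conf = new HashMap<>();

    // Buggy writer: stores under "aggregator.descriptor0" (no period).
    static void setDescriptor(int i, String spec) {
        conf.put("aggregator.descriptor" + i, spec);
    }

    // Reader: looks up "aggregator.descriptor.0" (with period).
    static String getDescriptor(int i) {
        return conf.get("aggregator.descriptor." + i);
    }

    public static void main(String[] args) {
        setDescriptor(0, "UserDefined,WordCountDescriptor");
        System.out.println(getDescriptor(0)); // prints "null": the keys never match

        // The fix: write and read with the same key format.
        conf.put("aggregator.descriptor." + 0, "UserDefined,WordCountDescriptor");
        System.out.println(getDescriptor(0)); // prints "UserDefined,WordCountDescriptor"
    }
}
```

When the descriptor lookup misses like this, a default behavior runs instead of the user's descriptor, which would explain the single "record_count" line reported in the thread.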


[jira] [Created] (MAPREDUCE-7376) AggregateWordCount fetches wrong results

2022-05-02 Thread Ayush Saxena (Jira)
Ayush Saxena created MAPREDUCE-7376:
---

 Summary: AggregateWordCount fetches wrong results
 Key: MAPREDUCE-7376
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7376
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Ayush Saxena
Assignee: Ayush Saxena


AggregateWordCount, rather than counting the words, gives a single-line output 
counting the number of rows.
Wrong Result Looks Like:

{noformat}
hadoop-3.4.0-SNAPSHOT % bin/hdfs dfs -cat /testOut1/part-r-0
record_count 2
{noformat}

Correct Should Look Like:

{noformat}
hadoop-3.4.0-SNAPSHOT % bin/hdfs dfs -cat /testOut1/part-r-0
   
Bye 1
Goodbye 1
Hadoop  2
Hello   2
World   2
{noformat}
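For reference, the expected per-word counts can be reproduced with a tiny stand-alone sketch (plain Java, not the actual Hadoop aggregate framework; class and method names here are illustrative):

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: produces the per-word counts the
// aggregatewordcount example is expected to emit, instead of the single
// "record_count" line reported in this bug.
public class WordCountSketch {

    // Tokenize each line on whitespace and sum a count of 1 per occurrence,
    // mirroring what the aggregate word count descriptor should aggregate.
    static Map<String, Long> count(String[] lines) {
        Map<String, Long> counts = new TreeMap<>(); // sorted, like the example output
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] input = {"Hello World Bye World", "Hello Hadoop Goodbye Hadoop"};
        // Prints: Bye 1, Goodbye 1, Hadoop 2, Hello 2, World 2 (one per line)
        count(input).forEach((w, c) -> System.out.println(w + "\t" + c));
    }
}
```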





--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Fwd: Applications for Travel Assistance to ApacheCon NA 2022 now open

2022-04-04 Thread Ayush Saxena

Forwarded as asked in the mail.

-Ayush

Begin forwarded message:

> From: Gavin McDonald 
> Date: 4 April 2022 at 1:56:40 PM IST
> To: travel-assista...@apache.org
> Subject: Applications for Travel Assistance to ApacheCon NA 2022 now open
> Reply-To: tac-ap...@apache.org
> 
> Hello, Apache PMCs!
> 
> Please redistribute the below to your user and dev lists, feel free to also
> use social media to spread the word. Thanks!
> 
> ---
> 
> The ASF Travel Assistance Committee (TAC) is pleased to announce that travel
> assistance applications for ApacheCon NA 2022 are now open!
> 
> We will be supporting ApacheCon North America in New Orleans, Louisiana,
> on October 3rd through 6th, 2022.
> 
> TAC exists to help those that would like to attend ApacheCon events, but
> are unable to do so for financial reasons. This year, we are supporting
> both committers and non-committers involved with projects at the
> Apache Software Foundation, or open source projects in general.
> 
> For more info on this year's applications and qualifying criteria, please
> visit the TAC website at http://www.apache.org/travel/
> Applications opened today and will close on the 1st of July 2022.
> 
> Important: Applicants have until the closing date above to submit their
> applications (which should contain as much supporting material as required
> to efficiently and accurately process their request), this will enable TAC
> to announce successful awards shortly afterwards.
> 
> As usual, TAC expects to deal with a range of applications from a diverse
> range of backgrounds. We therefore encourage (as always) anyone thinking
> about sending in an application to do so ASAP.


Re: [DISCUSS] Race condition in ProtobufRpcEngine2

2022-02-28 Thread Ayush Saxena
Hey Andras,
I had a quick look at HADOOP-18143; the methods in question in
ProtobufRpcEngine2 are identical to the ones in ProtobufRpcEngine, so I am
not very sure how the race condition doesn't happen in ProtobufRpcEngine.
I have to debug and spend some more time. In the meantime I have
reverted HADOOP-18082 to unblock YARN; the issue would still be there, as
you said, but this gives us some time to analyse it.

Thanks
-Ayush

On Mon, 28 Feb 2022 at 21:26, Gyori Andras 
wrote:

> Hey everyone!
>
> We have started seeing test failures in YARN PRs for a while. We have
> identified the problematic commit, which is HADOOP-18082
> , however, this change
> just revealed the race condition lying in ProtobufRpcEngine2 introduced in
> HADOOP-17046 . We have
> also fixed the underlying issue via a locking mechanism, presented in
> HADOOP-18143 , but
> since it is out of our area of expertise, we can neither verify nor
> guarantee that it will not cause some subtle issues in the RPC system.
> As we think it is a core part of Hadoop, we would use feedback from someone
> who is proficient in this part.
>
> Regards:
> Andras
>
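The actual HADOOP-18143 patch is not shown in this thread; purely as illustration, the general shape of a "locking mechanism" fix for racy lazy initialization in Java looks like the sketch below (all names are made up — this is NOT the ProtobufRpcEngine2 code):

```java
// Illustrative only: a common data race of the kind discussed above is
// unsynchronized lazy initialization of shared state. The fix combines a
// lock with a volatile field (double-checked locking).
public class LazyInitSketch {
    // volatile is required for double-checked locking to be safe in Java
    private static volatile Object engine;
    private static final Object LOCK = new Object();

    static Object getEngine() {
        Object local = engine;        // single volatile read on the fast path
        if (local == null) {
            synchronized (LOCK) {     // slow path: at most one thread initializes
                local = engine;
                if (local == null) {
                    engine = local = new Object();
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        // Every caller observes the same fully constructed instance.
        System.out.println(getEngine() == getEngine());  // prints "true"
    }
}
```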


Re: Possibility of using ci-hadoop.a.o for Nutch integration tests

2022-01-05 Thread Ayush Saxena
Moved to Dev lists.

Not sure about this though:
 when a PR is submitted to Nutch project it will run some MR job in Hadoop CI.

Whatever that PR requires should run as part of Nutch Infra. Why in Hadoop CI?
Our CI is already loaded with our own workloads.
If by any chance the above assertion gets a pass, then secondly, we have very 
few people managing work related to CI and Infra. I don’t think most of those 
people have context or a say in the Nutch project, nor the bandwidth to fix 
stuff if it gets broken.

Just my thoughts. Looping in the dev lists in case others have any feedback. As for 
the process, this would require a consensus from the Hadoop PMC.

-Ayush

> On 06-Jan-2022, at 7:02 AM, lewis john mcgibbney  wrote:
> 
> Hi general@,
> 
> Not sure if this is the correct mailing list. Please redirect me if there
> is a more suitable location. Thank you
> 
> I am PMC over on the Nutch project (https://nutch.apache.org). I would like
> to investigate whether we can build an integration testing capability for
> the project. This would involve running a Nutch integration test suite
> (collection of MR jobs) in a Hadoop CI environment. For example whenever a
> pull request is submitted to the Nutch project. This could easily be
> automated through Jenkins.
> 
> I’m not sure if this is something the Hadoop PMC would consider. Thank you
> for the consideration.
> 
> lewismc
> -- 
> http://home.apache.org/~lewismc/
> http://people.apache.org/keys/committer/lewismc

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



[jira] [Resolved] (MAPREDUCE-7368) DBOutputFormat.DBRecordWriter#write must throw exception when it fails

2021-12-08 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved MAPREDUCE-7368.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> DBOutputFormat.DBRecordWriter#write must throw exception when it fails
> --
>
> Key: MAPREDUCE-7368
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7368
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When the 
> [DBRecordWriter#write|https://github.com/apache/hadoop/blob/91af256a5b44925e5dfdf333293251a19685ba2a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DBOutputFormat.java#L120]
>  fails with an {{SQLException}} the problem is not propagated but printed in 
> {{System.err}} instead. 
> {code:java}
> public void write(K key, V value) throws IOException {
>   try {
> key.write(statement);
> statement.addBatch();
>   } catch (SQLException e) {
> e.printStackTrace();
>   }
> }
> {code}
> The consumer of this API has no way to tell that the write failed. Moreover, 
> the exception is not present in the logs which makes the problem very hard 
> debug and can easily lead to data corruption since clients can easily assume 
> that everything went well.
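A minimal, self-contained sketch of the fix pattern — wrap and rethrow instead of swallowing — is below. It mirrors the idea of the patch only; the failing addBatch() is a stand-in for key.write(statement)/statement.addBatch(), not the actual Hadoop code:

```java
import java.io.IOException;
import java.sql.SQLException;

// Sketch of the MAPREDUCE-7368 fix pattern: propagate the SQLException as an
// IOException instead of printing it to System.err and returning normally.
public class WriteFixSketch {

    static void addBatch() throws SQLException {
        throw new SQLException("connection lost");  // simulated DB failure
    }

    static void write() throws IOException {
        try {
            addBatch();
        } catch (SQLException e) {
            // The caller now sees the failure, and the cause is preserved
            // so the stack trace ends up in logs.
            throw new IOException("write to DB failed", e);
        }
    }

    public static void main(String[] args) {
        try {
            write();
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());  // prints "caught: write to DB failed"
        }
    }
}
```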



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [DISCUSS] Merging PRs vs. commit from CLI and keep committer field

2021-10-30 Thread Ayush Saxena
Hey Sean,
I just gave a try to Github CLI and merged a PR, using the command:

*gh pr merge 3600 --squash --body "HDFS-16290.--Message--"*


Unfortunately this also has the same result, nothing different.


-Ayush

On Tue, 26 Oct 2021 at 21:25, Sean Busbey  wrote:

> If you add a line in the commit message that the commit closes a given PR
> # then GitHub will annotate the PR as related to the specific commit and
> close it for you.
>
> i.e. you can add “closes #3454” to the commit message body and then PR
> 3454 will close and link to that commit when it hits the default branch.
>
> I believe these docs describing associating a GitHub issue via a commit
> message also apply to associating commits to pull requests, if you are
> interested in what specific phrases work:
>
>
> https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword
>
>
> Has anyone tried handling a squash merge via the GitHub CLI tooling rather
> than the web or plain git tooling? Does it similarly overwrite committer
> metadata?
>
> e.g. the GitHub CLI example from the merging PRs docs?
>
>
> https://docs.github.com/en/github/collaborating-with-pull-requests/incorporating-changes-from-a-pull-request/merging-a-pull-request#merging-a-pull-request
>
>
>
> On Oct 25, 2021, at 9:57 AM, Szilárd Németh  wrote:
>
> Hi Ayush,
> Thanks a lot for your response.
>
>
> Yahh, remember chasing Github support regarding this last year and they
> reverted back that they are aware of this and have an internal jira for
> this but can not comment upon how much time it would take. (As of now 1
> year & counting)
>
So if I get this right: their internal Jira is in the same untouched state
and they haven't made any progress?
>
>
> Just in case you want to use the CLI & still make the PR marked as merged,
>
> A dirty way to do this is:
> Pull the changes to your local.
> Rebase & Squash all commits, in the commit message in the end put actual
> the PR number, eg. #1234,
> force push to the PR, that is the contributors fork, this will update his
> PR and then merge the Branch to trunk in your local and push. It marks
> the PR as merged, at least last year it did when I tried. :-)
>
>
> Thanks for this dirty hack :)
> Probably I will refrain from doing this, I don't like to force push if it's
> not "mandatory".
> Anyway, if the community is fine with it, I will continue to commit from
> the CLI and close the PR, even though it's not that convenient.
>
>
> Best,
> Szilard
>
>
>
> On Wed, 20 Oct 2021 at 05:07, Ayush Saxena  wrote:
>
> Yahh, remember chasing Github support regarding this last year and they
> reverted back that they are aware of this and have an internal jira for
> this but can not comment upon how much time it would take. (As of now 1
> year & counting)
>
> As far as I remember this is with squash & commit only. If you do say a
> rebase & merge. The commit email id is preserved. But other options we have
> blocked.
>
> I think most of the projects are using squash & merge and many people
> won’t be ok to use CLI rather than UI
> So, I don’t think we have an ALT here.
>
> Just in case you want to use the CLI & still make the PR marked as merged,
> A dirty way to do this is:
> Pull the changes to your local.
> Rebase & Squash all commits, in the commit message in the end put actual
> the PR number, eg. #1234,
> force push to the PR, that is the contributors fork, this will update his
> PR and then merge the Branch to trunk in your local and push. It marks the
> PR as merged, at least last year it did when I tried. :-)
>
> -Ayush
>
>
> On 20-Oct-2021, at 3:27 AM, Szilárd Németh  wrote:
>
> Hi,
>
> I noticed something strange in our commits, in particular the committer
> field is not reflecting the user who committed the commit.
>
> *1. First, I wanted to check Gergely's commits from the last month or so.
> This was getting to be suspicious as I expected to see a bunch of commits
> from Sept / Oct of this year. *
>
> *git log CLI output:*
> ➜ git --no-pager log --format=fuller --committer=shuzirra
> commit 44bab51be44e31224dabbfa548eb27ea5fb2f916
> Author: Gergely Pollak 
> AuthorDate: Wed Aug 4 15:43:07 2021 +0200
> Commit: Gergely Pollak 
> CommitDate: Wed Aug 4 15:43:57 2021 +0200
>
>
>   YARN-10849 Clarify testcase documentation for
> TestServiceAM#testContainersReleasedWhenPreLaunchFails. Contributed by
> Szilard Nemeth
>
>
> commit e9339aa3761295fe65bb786e01938c7c177cd6e7
> Author: Gergely Pollak 
> AuthorDate: Tue Jun 1 15:57:22 2021

Re: [DISCUSS] Merging PRs vs. commit from CLI and keep committer field

2021-10-19 Thread Ayush Saxena
Yahh, I remember chasing GitHub support regarding this last year; they replied 
that they are aware of this and have an internal Jira for it, but cannot comment 
on how much time it would take. (As of now, 1 year & counting)

As far as I remember this happens with squash & merge only. If you do, say, a rebase 
& merge, the commit email id is preserved. But we have blocked the other options. 

I think most of the projects are using squash & merge, and many people won’t be 
ok with using the CLI rather than the UI.
So, I don’t think we have an alternative here.

Just in case you want to use the CLI & still have the PR marked as merged, a 
dirty way to do this is:
Pull the changes to your local.
Rebase & squash all commits; at the end of the commit message put the actual PR 
number, e.g. #1234. 
Force push to the PR, that is, the contributor's fork; this will update his PR. 
Then merge the branch to trunk in your local and push. It marks the PR as 
merged, at least it did last year when I tried. :-)

-Ayush


> On 20-Oct-2021, at 3:27 AM, Szilárd Németh  wrote:
> 
> Hi,
> 
> I noticed something strange in our commits, in particular the committer
> field is not reflecting the user who committed the commit.
> 
> *1. First, I wanted to check Gergely's commits from the last month or so.
> This was getting to be suspicious as I expected to see a bunch of commits
> from Sept / Oct of this year. *
> 
> *git log CLI output:*
> ➜ git --no-pager log --format=fuller --committer=shuzirra
> commit 44bab51be44e31224dabbfa548eb27ea5fb2f916
> Author: Gergely Pollak 
> AuthorDate: Wed Aug 4 15:43:07 2021 +0200
> Commit: Gergely Pollak 
> CommitDate: Wed Aug 4 15:43:57 2021 +0200
> 
> 
>YARN-10849 Clarify testcase documentation for
> TestServiceAM#testContainersReleasedWhenPreLaunchFails. Contributed by
> Szilard Nemeth
> 
> 
> commit e9339aa3761295fe65bb786e01938c7c177cd6e7
> Author: Gergely Pollak 
> AuthorDate: Tue Jun 1 15:57:22 2021 +0200
> Commit: Gergely Pollak 
> CommitDate: Tue Jun 1 15:57:22 2021 +0200
> 
> 
>YARN-10797. Logging parameter issues in scheduler package. Contributed
> by Szilard Nemeth
> 
> 
> *2. Another example of a merged PR, here I was the author and Adam Antal
> was the committer:  *
> PR link: https://github.com/apache/hadoop/pull/3454
> 
> *git log CLI output:*
> ➜ git --no-pager log --format=fuller a9b2469a534 -1
> commit a9b2469a534c5bc554c09aaf2d460a5a00922aca
> Author: Adam Antal 
> AuthorDate: Sun Sep 19 14:42:02 2021 +0200
> Commit: GitHub 
> CommitDate: Sun Sep 19 14:42:02 2021 +0200
> 
> 
>YARN-10950. Code cleanup in QueueCapacities (#3454)
> 
> 
> *3. Let's see another two example of merged PRs by Gergely and how the git
> log CLI output look like for these commits: *
> 
> *3.1.*
> PR link: https://github.com/apache/hadoop/pull/3419
> Commit:
> https://github.com/apache/hadoop/commit/4df4389325254465b52557d6fa99bcd470d64409
> 
> *git log CLI output:*
> ➜ git --no-pager log --format=fuller
> 4df4389325254465b52557d6fa99bcd470d64409 -1
> commit 4df4389325254465b52557d6fa99bcd470d64409
> Author: Szilard Nemeth <954799+szilard-nem...@users.noreply.github.com>
> AuthorDate: Mon Sep 20 16:47:46 2021 +0200
> Commit: GitHub 
> CommitDate: Mon Sep 20 16:47:46 2021 +0200
> 
> 
>YARN-10911. AbstractCSQueue: Create a separate class for usernames and
> weights that are travelling in a Map. Contributed by Szilard Nemeth
> 
> 
> *3.2.  *
> PR link: https://github.com/apache/hadoop/pull/3342
> Commit:
> https://github.com/apache/hadoop/commit/9f6430c9ed2bca5696e77bfe9eda5d4f10b0d280
> 
> *git log CLI output:*
> ➜ git --no-pager log --format=fuller
> 9f6430c9ed2bca5696e77bfe9eda5d4f10b0d280 -1
> commit 9f6430c9ed2bca5696e77bfe9eda5d4f10b0d280
> Author: 9uapaw 
> AuthorDate: Tue Sep 21 16:08:24 2021 +0200
> Commit: GitHub 
> CommitDate: Tue Sep 21 16:08:24 2021 +0200
> 
> 
>YARN-10897. Introduce QueuePath class. Contributed by Andras Gyori
> 
> 
> As you can see, the committer field contains: *"GitHub  >".*
> Is this something specific to Hadoop or our Gitbox commit environment?
> Basically, any PR merged on the github.com UI will lose the committer
> information in the commit, which is very bad.
> 
> As I think reviewing and having discussions on Github's UI is way better
> than in jira, the only workaround that makes sense to me is downloading
> the patch from Github before the commit, then committing from the CLI,
> adding the author info and optionally appending the standard "Contributed
> by " message to the commit message.
> For example:
> 
> git commit -m "YARN-xxx.  Contributed by  name>" --author=
> 
> This way, both the author and committer field will be correct. One downside
> is that the PR won't be merged on Github, it will be in closed state
> because the commit is committed from the CLI, so the Github PR will have a
> misleading status.
> 
> What do you think?
> What is your workflow for commits?
> 
> 
> Thanks,
> Szilard


Re: [DISCUSS] Checkin Hadoop code formatter

2021-09-11 Thread Ayush Saxena
Thanx Viraj for initiating. Makes sense to me to include a formatter in line 
with our checkstyle rules in the code; it would make life simpler for all devs.

-Ayush

> On 12-Sep-2021, at 12:28 AM, Viraj Jasani  wrote:
> 
> + common-...@hadoop.apache.org
> 
> -- Forwarded message -
> From: Viraj Jasani 
> Date: Tue, Sep 7, 2021 at 6:18 PM
> Subject: Checkin Hadoop code formatter
> To: common-...@hadoop.apache.org 
> 
> 
> It seems some recent new devs are not familiar with the common code
> formatter that we use for our codebase.
> While we already have Wiki page [1] for new contributors and it mentions:
> "Code must be formatted according to Sun's conventions
> "
> but this Oracle's code conventions page is not being actively maintained
> (no update has been received after 1999) and hence, I believe we should
> check-in and maintain code formatter xmls for supported IDEs in our
> codebase only (under dev-support) for all devs to be able to import it in
> the respective IDE.
> Keeping this in mind, I have created this PR 3387
> . If you could please take a
> look and if the PR receives sufficient +1s, we might want to update our
> Wiki page to directly refer to our own codebase for code formatters that we
> maintain. Thoughts?
> 
> 
> 1.
> https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute#HowToContribute-MakingChanges

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [DISCUSS] Hadoop 3.3.2 release?

2021-09-08 Thread Ayush Saxena
+1,
Thanx Chao for volunteering!!!

> On 07-Sep-2021, at 10:36 PM, Chao Sun  wrote:
> 
> Hi all,
> 
> It has been almost 3 months since the 3.3.1 release and branch-3.3 has
> accumulated quite a few commits (118 atm). In particular, Spark community
> recently found an issue which prevents one from using the shaded Hadoop
> client together with certain compression codecs such as lz4 and snappy
> codec. The details are recorded in HADOOP-17891 and SPARK-36669.
> 
> Therefore, I'm wondering if anyone is also interested in a 3.3.2 release.
> If there is no objection, I'd like to volunteer myself for the work as well.
> 
> Best Regards,
> Chao

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: Yetus unable to build YARN / HADOOP for any patch

2021-07-05 Thread Ayush Saxena
Got it sorted, made changes in HDFS-PreCommit and triggered the build:

https://ci-hadoop.apache.org/view/Hadoop/job/PreCommit-HDFS-Build/669/console

Got the result:

https://issues.apache.org/jira/browse/HDFS-16101?focusedCommentId=17374991&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17374991


This worked, but I have also changed the user to Hudson from the original 
hadoopci for now.
Maybe I am missing something or the creds got screwed up. Need to check that.

For the time being we can use Hudson? If so, we can make the same changes in all 
other pre-commit jobs as well and see if things work. I can check and get the 
creds stuff sorted out in a couple of days…

-Ayush

> On 05-Jul-2021, at 6:31 PM, Ayush Saxena  wrote:
> Yeps, something is broken since last week. I tried a bunch of things on 
> HDFS-PreCommit this weekend, kept on solving issues one after the other.
> 
> https://issues.apache.org/jira/browse/HADOOP-17787?focusedCommentId=17374212=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17374212
> 
> First it was the hadoop.sh path was coming wrong, I fixed it, post that it 
> wasn’t able to download the patch, added jira as plugin, then the curl 
> started failing with 22 error code, the hadoop ci creds were being rejected. 
> I changed it to Hudson for trying, it downloaded the patch successfully. Post 
> that it started giving error :
> 
> Unprocessed flag(s): --brief-report-file --html-report-file 
> --mvn-custom-repos  --shelldocs --mvn-javadoc-goals --mvn-custom-repos-dir
> 
> https://ci-hadoop.apache.org/view/Hadoop/job/PreCommit-HDFS-Build/664/console
> 
> Here I left it. Not sure what broke it and whether INFRA folks would help us 
> on this. The errors are coming from our scripts. 
> 
> But we need to fix it somehow. Do let me know if anyone has any pointers to 
> any recent change which might affect us.
> 
> -Ayush
> 
>> On 05-Jul-2021, at 6:06 PM, Szilárd Németh  wrote:
>> Hi All,
>> 
>> We are experiencing a build issue for HADOOP / YARN, this phenomenon
>> started around last week.
>> Just reported an infra ticket to tackle this:
>> https://issues.apache.org/jira/browse/INFRA-22079
>> Do you think of anything else we can do at this point?
>> 
>> 
>> Best regards,
>> Szilard


Re: Yetus unable to build YARN / HADOOP for any patch

2021-07-05 Thread Ayush Saxena
Yeps, something is broken since last week. I tried a bunch of things on 
HDFS-PreCommit this weekend, kept on solving issues one after the other.

https://issues.apache.org/jira/browse/HADOOP-17787?focusedCommentId=17374212&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17374212

First, the hadoop.sh path was coming out wrong; I fixed it. After that it 
wasn’t able to download the patch, so I added jira as a plugin. Then curl 
started failing with error code 22: the hadoop ci creds were being rejected. I 
changed it to Hudson as a trial, and it downloaded the patch successfully. After 
that it started giving the error:

Unprocessed flag(s): --brief-report-file --html-report-file --mvn-custom-repos  
--shelldocs --mvn-javadoc-goals --mvn-custom-repos-dir

https://ci-hadoop.apache.org/view/Hadoop/job/PreCommit-HDFS-Build/664/console

Here I left it. Not sure what broke it and whether INFRA folks would help us on 
this. The errors are coming from our scripts. 

But we need to fix it somehow. Do let me know if anyone has any pointers to any 
recent change which might affect us.

-Ayush

> On 05-Jul-2021, at 6:06 PM, Szilárd Németh  wrote:
> 
> Hi All,
> 
> We are experiencing a build issue for HADOOP / YARN, this phenomenon
> started around last week.
> Just reported an infra ticket to tackle this:
> https://issues.apache.org/jira/browse/INFRA-22079
> Do you think of anything else we can do at this point?
> 
> 
> Best regards,
> Szilard


Re: [VOTE] Release Apache Hadoop 3.3.1 RC3

2021-06-12 Thread Ayush Saxena
+1,
Built from Source.
Successful Native Build on Ubuntu 20.04
Verified Checksums
Ran basic hdfs shell commands.
Ran simple MR jobs.
Browsed NN,DN,RM and NM UI.

Thanx Wei-Chiu for driving the release. 

-Ayush


> On 12-Jun-2021, at 1:45 AM, epa...@apache.org wrote:
> 
> +1 (binding)
> Eric
> 
> 
> On Tuesday, June 1, 2021, 5:29:49 AM CDT, Wei-Chiu Chuang 
>  wrote:  
> 
> Hi community,
> 
> This is the release candidate RC3 of Apache Hadoop 3.3.1 line. All blocker
> issues have been resolved [1] again.
> 
> There are 2 additional issues resolved for RC3:
> * Revert "MAPREDUCE-7303. Fix TestJobResourceUploader failures after
> HADOOP-16878
> * Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the source
> and destination are the same
> 
> There are 4 issues resolved for RC2:
> * HADOOP-17666. Update LICENSE for 3.3.1
> * MAPREDUCE-7348. TestFrameworkUploader#testNativeIO fails. (#3053)
> * Revert "HADOOP-17563. Update Bouncy Castle to 1.68. (#2740)" (#3055)
> * HADOOP-17739. Use hadoop-thirdparty 1.1.1. (#3064)
> 
> The Hadoop-thirdparty 1.1.1, as previously mentioned, contains two extra
> fixes compared to hadoop-thirdparty 1.1.0:
> * HADOOP-17707. Remove jaeger document from site index.
> * HADOOP-17730. Add back error_prone
> 
> *RC tag is release-3.3.1-RC3
> https://github.com/apache/hadoop/releases/tag/release-3.3.1-RC3
> 
> *The RC3 artifacts are at*:
> https://home.apache.org/~weichiu/hadoop-3.3.1-RC3/
> ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC3-arm/
> 
> *The maven artifacts are hosted here:*
> https://repository.apache.org/content/repositories/orgapachehadoop-1320/
> 
> *My public key is available here:*
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> 
> 
> Things I've verified:
> * all blocker issues targeting 3.3.1 have been resolved.
> * stable/evolving API changes between 3.3.0 and 3.3.1 are compatible.
> * LICENSE and NOTICE files checked
> * RELEASENOTES and CHANGELOG
> * rat check passed.
> * Built HBase master branch on top of Hadoop 3.3.1 RC2, ran unit tests.
> * Built Ozone master on top fo Hadoop 3.3.1 RC2, ran unit tests.
> * Extra: built 50 other open source projects on top of Hadoop 3.3.1 RC2.
> Had to patch some of them due to commons-lang migration (Hadoop 3.2.0) and
> dependency divergence. Issues are being identified but so far nothing
> blocker for Hadoop itself.
> 
> Please try the release and vote. The vote will run for 5 days.
> 
> My +1 to start,
> 
> [1] https://issues.apache.org/jira/issues/?filter=12350491
> [2]
> https://github.com/apache/hadoop/compare/release-3.3.1-RC1...release-3.3.1-RC3

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [VOTE] Hadoop 3.1.x EOL

2021-06-03 Thread Ayush Saxena
+1

-Ayush

> On 03-Jun-2021, at 11:26 PM, hemanth boyina  
> wrote:
> 
> +1
> 
> Thanks
> HemanthBoyina
> 
>> On Thu, 3 Jun 2021, 20:33 Wanqiang Ji,  wrote:
>> 
>> +1 (non-binding)
>> 
>> Wanqiang Ji
>> 
>>> On Thu, Jun 3, 2021 at 10:47 PM Sangjin Lee  wrote:
>>> 
>>> +1
>>> 
>>> On Thu, Jun 3, 2021 at 7:35 AM Sean Busbey 
>>> wrote:
>>> 
 +1
 
> On Jun 3, 2021, at 1:14 AM, Akira Ajisaka 
>> wrote:
> 
> Dear Hadoop developers,
> 
> Given the feedback from the discussion thread [1], I'd like to start
> an official vote
> thread for the community to vote and start the 3.1 EOL process.
> 
> What this entails:
> 
> (1) an official announcement that no further regular Hadoop 3.1.x
 releases
> will be made after 3.1.4.
> (2) resolve JIRAs that specifically target 3.1.5 as won't fix.
> 
> This vote will run for 7 days and conclude by June 10th, 16:00 JST
>> [2].
> 
> Committers are eligible to cast binding votes. Non-committers are
 welcomed
> to cast non-binding votes.
> 
> Here is my vote, +1
> 
> [1] https://s.apache.org/w9ilb
> [2]
 
>>> 
>> https://www.timeanddate.com/worldclock/fixedtime.html?msg=4=20210610T16=248
> 
> Regards,
> Akira
> 
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> 
 
 
 
 -
 To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
 For additional commands, e-mail: common-dev-h...@hadoop.apache.org
 
 
>>> 
>> 

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [DISCUSS] which release lines should we still consider actively maintained?

2021-05-24 Thread Ayush Saxena
+1 to mark 3.1.x EOL.
Apache Hive depends on 3.1.0 as of now, but due to the guava upgrade on 
branch-3.1, the attempt to migrate to the latest 3.1.x didn’t work for me, at 
least a couple of months back. So, mostly 3.3.1 would be the only option replacing 
3.1.0 there, or at worst 3.3.2 in a couple of months.


-Ayush

> On 24-May-2021, at 8:43 PM, Arpit Agarwal  
> wrote:
> 
> +1 to EOL 3.1.x at least.
> 
> 
>> On May 23, 2021, at 9:51 PM, Wei-Chiu Chuang  
>> wrote:
>> 
>> Sean,
>> 
>> For reasons I don't understand, I never received emails from your new
>> address in the mailing list. Only Akira's response.
>> 
>> I was just able to start a thread like this.
>> 
>> I am +1 to EOL 3.1.5.
>> Reason? Spark is already on Hadoop 3.2. Hive and Tez are actively working
>> to support Hadoop 3.3. HBase supports Hadoop 3.3 already. They are the most
>> common Hadoop applications so I think a 3.1 isn't that necessarily
>> important.
>> 
>> With Hadoop 3.3.1, we have a number of improvements to support a better
>> HDFS upgrade experience, so upgrading from Hadoop 3.1 should be relatively
>> easy. Application upgrade takes some effort though (commons-lang ->
>> commons-lang3 migration for example)
>> I've been maintaining the HDFS code in branch-3.1, so from a
>> HDFS perspective the branch is always in a ready to release state.
>> 
>> The Hadoop 3.1 line is more than 3 years old. Maintaining this branch is
>> getting trickier. I am +100 to reduce the number of actively maintained
>> release line. IMO, 2 Hadoop 3 lines + 1 Hadoop 2 line is a good idea.
>> 
>> 
>> 
>> For Hadoop 3.3 line: If no one beats me, I plan to make a 3.3.2 in 2-3
>> months. And another one in another 2-3 months.
>> The Hadoop 3.3.1 has nearly 700 commits not in 3.3.0. It is very difficult
>> to make/validate a maint release with such a big divergence in the code.
>> 
>> 
>>> On Mon, May 24, 2021 at 12:06 PM Akira Ajisaka >> > wrote:
>>> 
>>> Hi Sean,
>>> 
>>> Thank you for starting the discussion.
>>> 
>>> I think branch-2.10, branch-3.1, branch-3.2, branch-3.3, and trunk
>>> (3.4.x) are actively maintained.
>>> 
>>> The next releases will be:
>>> - 3.4.0
>>> - 3.3.1 (Thanks, Wei-Chiu!)
>>> - 3.2.3
>>> - 3.1.5
>>> - 2.10.2
>>> 
 Are there folks willing to go through being release managers to get more
>>> of these release lines on a steady cadence?
>>> 
>>> Now I'm interested in becoming a release manager of 3.1.5.
>>> 
 If I were to take up maintenance release for one of them which should it
>>> be?
>>> 
>>> 3.2.3 or 2.10.2 seems to be a good choice.
>>> 
 Should we declare to our downstream users that some of these lines
>>> aren’t going to get more releases?
>>> 
>>> Now I think we don't need to declare that. I believe 3.3.1, 3.2.3,
>>> 3.1.5, and 2.10.2 will be released in the near future.
>>> There are some earlier discussions of 3.1.x EoL, so 3.1.5 may be a
>>> final release of the 3.1.x release line.
>>> 
 Is there downstream facing documentation somewhere that I missed for
>>> setting expectations about our release cadence and actively maintained
>>> branches?
>>> 
>>> As you commented, the confluence wiki pages for Hadoop releases were
>>> out of date. Updated [1].
>>> 
 Do we have a backlog of work written up that could make the release
>>> process easier for our release managers?
>>> 
>>> The release process is documented and maintained:
>>> https://cwiki.apache.org/confluence/display/HADOOP2/HowToRelease
>>> Also, there are some backlogs [1], [2].
>>> 
>>> [1]:
>>> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Active+Release+Lines
>>> [2]: https://cwiki.apache.org/confluence/display/HADOOP/Roadmap
>>> 
>>> Thanks,
>>> Akira
>>> 
>>> On Fri, May 21, 2021 at 7:12 AM Sean Busbey 
>>> wrote:
 
 
 Hi folks!
 
 Which release lines do we as a community still consider actively
>>> maintained?
 
 I found an earlier discussion[1] where we had consensus to consider
>>> branches that don’t get maintenance releases on a regular basis end-of-life
>>> for practical purposes. The result of that discussion was written up in our
>>> wiki docs in the “EOL Release Branches” page, summarized here
 
> If no volunteer to do a maintenance release in a short to mid-term
>>> (like 3 months to 1 or 1.5 year).
 
 Looking at release lines that are still on our download page[3]:
 
 * Hadoop 2.10.z - last release 8 months ago
 * Hadoop 3.1.z - last release 9.5 months ago
 * Hadoop 3.2.z - last release 4.5 months ago
 * Hadoop 3.3.z - last release 10 months ago
 
 And then trunk holds 3.4 which hasn’t had a release since the branch-3.3
>>> fork ~14 months ago.
 
 I can see that Wei-Chiu has been actively working on getting the 3.3.1
>>> release out[4] (thanks Wei-Chiu!) but I do not see anything similar for the
>>> other release lines.
 
 We also have pages on the wiki for our project roadmap of release[5],
>>> but it 

[jira] [Resolved] (MAPREDUCE-7343) Increase the job name max length in mapred job -list

2021-05-13 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved MAPREDUCE-7343.
-
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
 Assignee: Ayush Saxena
   Resolution: Fixed

Committed to trunk

> Increase the job name max length in mapred job -list 
> -
>
> Key: MAPREDUCE-7343
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7343
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>        Reporter: Ayush Saxena
>    Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Presently the job name length is capped at 20, but in many cases (one being 
> Hive) the length gets exceeded, and beyond that point the truncated name 
> doesn't convey much value.
>  
> Propose to increase the length limit from 20->35 here:
> {code:java}
> writer.printf(dataPattern, job.getJobID().toString(),
> job.getJobName().substring(0, jobNameLength > 20 ? 20 : jobNameLength),
> {code}
>  
>  
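
For readers following along, the committed change amounts to replacing the hard-coded ternary in the snippet above with a larger cap. A minimal, self-contained sketch of the truncation logic (class, method, and constant names here are illustrative, not Hadoop's actual code):

```java
// Illustrative sketch only: mirrors the ternary from the JIRA snippet above,
// with the cap lifted from 20 to 35 as committed for MAPREDUCE-7343.
// Class, method, and constant names are hypothetical, not Hadoop's own.
public class JobNameTruncation {
    static final int MAX_JOB_NAME_LENGTH = 35;

    static String truncate(String jobName) {
        // Equivalent to: jobName.substring(0, len > 35 ? 35 : len)
        return jobName.substring(0, Math.min(jobName.length(), MAX_JOB_NAME_LENGTH));
    }

    public static void main(String[] args) {
        System.out.println(truncate("short"));  // short names pass through unchanged
        System.out.println(truncate("a-very-long-hive-generated-job-name-indeed"));
    }
}
```

Using `Math.min` rather than a ternary keeps the same behavior while reading a little more clearly.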



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [DISCUSS] check style changes

2021-05-13 Thread Ayush Saxena
If you intend to change the line limit, it should have a proper DISCUSS and VOTE 
thread. If I remember correctly, for Ozone it was discussed to increase the 
limit to 100 from 80, and that got vetoed. 
So, we too should reach an agreement first. I have gone through the Ozone 
discussion, and I also don't support changing the line length limit in general.

Regarding the nightly runs, I think fixing the flaky tests and getting rid of 
the frequent OOM errors in the build is something that should get priority; 
that would be helpful to the project.

Fixing checkstyle warnings would, firstly, be just a look-and-feel change. Many 
of the warnings were accepted by the committer while committing. Most 
importantly, the fixes would make backports tough and checking the git history a 
little tougher, and apart from that these changes would be huge, so reviewers 
need to be extra cautious that some bug doesn't accidentally get introduced.

So, my personal opinion: the line length change needs a proper discussion and 
vote, with a strong justification.

As for the checkstyle modifications: even if it is a lot of change with no big 
advantage, I can at least live with it, given that it might cost me some ease in 
backporting issues internally as well as to other branches. Still, since it 
won't break anything, I don't think I have any right to say no to this. But in 
case you plan to pick this up with priority, get the plan discussed before 
merging.

I would be happy to help in case you plan to pick up the other build-related 
work. :-)

-Ayush

> On 13-May-2021, at 8:40 PM, Sean Busbey  wrote:
> 
> Hi folks!
> 
> I’d like to start cleaning up our nightly tests. As a bit of low hanging 
> fruit I’d like to alter some of our check style rules to match what I think 
> we’ve been doing in the community. How would folks prefer I make sure we have 
> consensus on such changes?
> 
> As an example, our last nightly run had ~81k check style violations (it’s a 
> big number but it’s not that bad given the size of the repo) and roughly 16% 
> of those were for line lengths in excess of 80 characters but <= 100 
> characters.
> 
> If I wanted to change our line length check to be 100 characters rather than 
> the default of 80, would folks rather I have a DISCUSS thread first? Or would 
> they rather a Jira + PR with the discussion of the merits happening there?
> 
> —
> busbey
> 
> 
> 
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> 
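
For concreteness, a line-length change like the one Sean proposes is a one-property tweak to the project's checkstyle rules. A hypothetical fragment (recent Checkstyle versions place `LineLength` directly under `Checker`; the exact layout of Hadoop's checkstyle.xml may differ):

```xml
<module name="Checker">
  <!-- Raise the line-length limit from the default 80 to 100 characters. -->
  <module name="LineLength">
    <property name="max" value="100"/>
  </module>
</module>
```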




[jira] [Created] (MAPREDUCE-7343) Increase the job name max length in mapred job -list

2021-05-02 Thread Ayush Saxena (Jira)
Ayush Saxena created MAPREDUCE-7343:
---

 Summary: Increase the job name max length in mapred job -list 
 Key: MAPREDUCE-7343
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7343
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Ayush Saxena


Presently the job name length is capped at 20, but in many cases (one being 
Hive) the length gets exceeded, and beyond that point the truncated name doesn't 
convey much value.

 

Propose to increase the length limit from 20->40 here:
{code:java}
writer.printf(dataPattern, job.getJobID().toString(),
job.getJobName().substring(0, jobNameLength > 20 ? 20 : jobNameLength),
{code}
 

 







Re: [DISCUSS] hadoop-thirdparty 1.1.0 release

2021-04-26 Thread Ayush Saxena
Yep, you have to do it manually

-Ayush

> On 26-Apr-2021, at 3:23 PM, Wei-Chiu Chuang  wrote:
> 
> 
> Does anyone know how we publish hadoop-thirdparty SNAPSHOT artifacts?
> 
> The main Hadoop arifacts are published by this job 
> https://ci-hadoop.apache.org/view/Hadoop/job/Hadoop-trunk-Commit/ after every 
> commit.
> However, we don't seem to publish hadoop-thirdparty regularly. (Apache nexus: 
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/thirdparty/)
> 
> Are they published manually?
> 
> 
>> On Fri, Apr 23, 2021 at 6:06 PM Ayush Saxena  wrote:
>> Regarding Guava: before the release, once you merge the change to the 
>> thirdparty repo, you can update the Hadoop thirdparty snapshot; the Hadoop 
>> code would pick that up, and you can watch that everything is safe and clean 
>> before the release. Unless you have a better way to verify, or have already 
>> verified!!! 
>> 
>> -Ayush
>> 
>> > On 23-Apr-2021, at 3:16 PM, Wei-Chiu Chuang  
>> > wrote:
>> > 
>> > Another suggestion: looks like the shaded jaeger is not being used by
>> > Hadoop code. Maybe we can remove that from the release for now? I don't
>> > want to release something that's not being used.
>> > We can release the shaded jaeger when it's ready for use. We will have to
>> > update the jaeger version anyway. The version used is too old.
>> > 
>> >> On Fri, Apr 23, 2021 at 10:55 AM Wei-Chiu Chuang  
>> >> wrote:
>> >> 
>> >> Hi community,
>> >> 
>> >> In preparation of the Hadoop 3.3.1 release, I am starting a thread to
>> >> discuss its prerequisite: the release of hadoop-thirdparty 1.1.0.
>> >> 
>> >> My plan:
>> >> update guava to 30.1.1 (latest). I have the PR ready to merge.
>> >> 
>> >> Do we want to update protobuf and jaeger? Anything else?
>> >> 
>> >> I suppose we won't update protobuf too frequently.
>> >> Jaeger is under active development. We're currently on 0.34.2, the latest
>> >> is 1.22.0.
>> >> 
>> >> If there is no change to this plan, I can start the release work as soon
>> >> as possible.
>> >> 
>> 
>> -
>> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>> 


Re: [DISCUSS] hadoop-thirdparty 1.1.0 release

2021-04-23 Thread Ayush Saxena
Regarding Guava: before the release, once you merge the change to the thirdparty 
repo, you can update the Hadoop thirdparty snapshot; the Hadoop code would pick 
that up, and you can watch that everything is safe and clean before the release. 
Unless you have a better way to verify, or have already verified!!! 

-Ayush

> On 23-Apr-2021, at 3:16 PM, Wei-Chiu Chuang  
> wrote:
> 
> Another suggestion: looks like the shaded jaeger is not being used by
> Hadoop code. Maybe we can remove that from the release for now? I don't
> want to release something that's not being used.
> We can release the shaded jaeger when it's ready for use. We will have to
> update the jaeger version anyway. The version used is too old.
> 
>> On Fri, Apr 23, 2021 at 10:55 AM Wei-Chiu Chuang  wrote:
>> 
>> Hi community,
>> 
>> In preparation of the Hadoop 3.3.1 release, I am starting a thread to
>> discuss its prerequisite: the release of hadoop-thirdparty 1.1.0.
>> 
>> My plan:
>> update guava to 30.1.1 (latest). I have the PR ready to merge.
>> 
>> Do we want to update protobuf and jaeger? Anything else?
>> 
>> I suppose we won't update protobuf too frequently.
>> Jaeger is under active development. We're currently on 0.34.2, the latest
>> is 1.22.0.
>> 
>> If there is no change to this plan, I can start the release work as soon
>> as possible.
>> 




Re: [DISCUSS] Hadoop 3.3.1 release

2021-01-27 Thread Ayush Saxena
+1
Just to mention, we would need to release hadoop-thirdparty first. 
Presently we are using its snapshot version.

-Ayush

> On 28-Jan-2021, at 6:59 AM, Wei-Chiu Chuang  wrote:
> 
> Hi all,
> 
> Hadoop 3.3.0 was released half a year ago, and as of now we've accumulated
> more than 400 changes in the branch-3.3. A number of downstreamers are
> eagerly waiting for 3.3.1 which addresses the guava version conflict issue.
> 
> https://issues.apache.org/jira/issues/?filter=-1=project%20in%20(HDFS%2C%20HADOOP%2C%20YARN%2C%20MAPREDUCE)%20and%20fixVersion%20in%20(3.3.1)%20and%20status%20%3D%20Resolved%20
> 
> We should start the release work for 3.3.1 before the diff becomes even
> larger.
> 
> I believe there are  currently only two real blockers for a 3.3.1 (using
> this filter
> https://issues.apache.org/jira/issues/?filter=-1=project%20in%20(HDFS%2C%20HADOOP%2C%20YARN%2C%20MAPREDUCE)%20AND%20cf%5B12310320%5D%20in%20(3.3.1)%20AND%20status%20not%20in%20(Resolved)%20ORDER%20BY%20priority%20DESC
> )
> 
> 
>   1. HDFS-15566
>   2. HADOOP-17112
> 
> 
> 
> Is there anyone who would volunteer to be the 3.3.1 RM?
> 
> Also, the HowToRelease wiki does not describe the ARM build process. That's
> going to be important for future releases.




Re: Intervention. Stabilizing Yetus (Attn. Azure)

2020-11-09 Thread Ayush Saxena
The failing Azure tests are being tracked at HADOOP-17325

https://issues.apache.org/jira/browse/HADOOP-17325

On Mon, 9 Nov 2020 at 23:02, Ahmed Hussein  wrote:

> I created new Jiras for HDFS failures. Please consider doing the same for
> Yarn and Azure.
> For convenience, the list of failures in the qbt report is as follows:
>
> Test Result
> <
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/
> >
> (50
> failures / -7)
>
>-
>
>  
> org.apache.hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination.testGetCachedDatanodeReport
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.hdfs.server.federation.router/TestRouterRpcMultiDestination/testGetCachedDatanodeReport/
> >
>-
>
>  
> org.apache.hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination.testNamenodeMetrics
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.hdfs.server.federation.router/TestRouterRpcMultiDestination/testNamenodeMetrics/
> >
>-
>
>  
> org.apache.hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination.testErasureCoding
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.hdfs.server.federation.router/TestRouterRpcMultiDestination/testErasureCoding/
> >
>-
>
>  
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestBPOfferService/testMissBlocksWhenReregister/
> >
>-
> org.apache.hadoop.yarn.sls.TestReservationSystemInvariants.testSimulatorRunning[Testing
>with: SYNTH,
>
>  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler,
>(nodeFile null)]
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.yarn.sls/TestReservationSystemInvariants/testSimulatorRunning_Testing_with__SYNTH__org_apache_hadoop_yarn_server_resourcemanager_scheduler_fair_FairScheduler___nodeFile_null__/
> >
>-
> org.apache.hadoop.yarn.sls.appmaster.TestAMSimulator.testAMSimulator[1]
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.yarn.sls.appmaster/TestAMSimulator/testAMSimulator_1_/
> >
>-
>
>  
> org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer.testTokenThreadTimeout
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.yarn.server.resourcemanager.security/TestDelegationTokenRenewer/testTokenThreadTimeout/
> >
>-
>
>  
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.yarn.applications.distributedshell/TestDistributedShell/testDSShellWithOpportunisticContainers/
> >
>-
>
>  
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithEnforceExecutionType
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.yarn.applications.distributedshell/TestDistributedShell/testDSShellWithEnforceExecutionType/
> >
>- org.apache.hadoop.fs.azure.TestBlobMetadata.testFolderMetadata
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.fs.azure/TestBlobMetadata/testFolderMetadata/
> >
>-
>
>  org.apache.hadoop.fs.azure.TestBlobMetadata.testFirstContainerVersionMetadata
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.fs.azure/TestBlobMetadata/testFirstContainerVersionMetadata/
> >
>- org.apache.hadoop.fs.azure.TestBlobMetadata.testPermissionMetadata
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.fs.azure/TestBlobMetadata/testPermissionMetadata/
> >
>- org.apache.hadoop.fs.azure.TestBlobMetadata.testOldPermissionMetadata
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.fs.azure/TestBlobMetadata/testOldPermissionMetadata/
> >
>-
>
>  
> org.apache.hadoop.fs.azure.TestNativeAzureFileSystemConcurrency.testNoTempBlobsVisible
><
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/319/testReport/junit/org.apache.hadoop.fs.azure/TestNativeAzureFileSystemConcurrency/testNoTempBlobsVisible/
> >
>-
>
>  org.apache.hadoop.fs.azure.TestNativeAzureFileSystemConcurrency.testLinkBlobs
><
> 

Re: Precommit job fails to write comments in the GitHub pull requests

2020-10-02 Thread Ayush Saxena
Hi Akira,
Thanx for working on this.
Is there anything blocking here? I think it still isn't working.
Can we point to a commit before YETUS-994 instead of master and get by with 
that? Or, if there is no solution handy, we may go ahead with approach #4 and 
then figure out how to proceed.

-Ayush

> On 29-Sep-2020, at 5:31 PM, Masatake Iwasaki  
> wrote:
> 
> Hi Akira,
> 
> Thanks for working on this.
> It looks like you are trying the token on PR #2348.
> https://github.com/apache/hadoop/pull/2348
> 
> If it does not work, I think option #4 is reasonable.
> I prefer using a fixed/static version of the toolchain in order to
> avoid surprising the developers.
> We can upgrade Yetus when JDK11 Javadoc issue is fixed.
> 
> Regards,
> Masatake Iwasaki
> 
>> On 2020/09/29 2:55, Akira Ajisaka wrote:
>> Apache Hadoop developers,
>> After YETUS-994, the jenkins job updates the commit status instead of
>> writing a comment to the pull request. It requires an OAuth token with
>> write access to "repo:status" but now there is no such token for
>> Apache Hadoop. I asked the infra team to create a token:
>> https://issues.apache.org/jira/browse/INFRA-20906
>> For now, I think there 4 options to get the information:
>> 1. Attach a patch to the JIRA instead of GitHub
>> 2. Update the Jenkinsfile for a while (Sample PR:
>> https://github.com/apache/hadoop/pull/2346)
>> 3. Search the job and get the comment from
>> https://ci-hadoop.apache.org/job/hadoop-multibranch/
>> 4. Create a JIRA to update the Jenkinsfile to use 0.12.0 instead of
>> main. (JDK11 javadoc support is broken in Yetus 0.12.0)
>> Sorry for the inconvenience.
>> Regards,
>> Akira
>> -
>> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> 
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> 




Re: [VOTE] Moving Ozone to a separated Apache project

2020-09-25 Thread Ayush Saxena
+1

-Ayush

> On 25-Sep-2020, at 11:30 AM, Elek, Marton  wrote:
> 
> Hi all,
> 
> Thank you for all the feedback and requests,
> 
> As we discussed in the previous thread(s) [1], Ozone is proposed to be a 
> separated Apache Top Level Project (TLP)
> 
> The proposal with all the details, motivation and history is here:
> 
> https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Hadoop+subproject+to+Apache+TLP+proposal
> 
> This voting runs for 7 days and will be concluded at 2nd of October, 6AM GMT.
> 
> Thanks,
> Marton Elek
> 
> [1]: 
> https://lists.apache.org/thread.html/rc6c79463330b3e993e24a564c6817aca1d290f186a1206c43ff0436a%40%3Chdfs-dev.hadoop.apache.org%3E
> 
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> 




Re: [VOTE] Apache Hadoop Ozone 1.0.0 RC1

2020-09-01 Thread Ayush Saxena
+1
* Built from source
* Verified checksums & signature
* Ran some basic shell commands.

Thanx Sammi for driving the release. Good Luck!!!

-Ayush

On Tue, 1 Sep 2020 at 13:20, Mukul Kumar Singh 
wrote:

> Thanks for preparing the RC Sammi.
>
> +1 (binding)
>
> 1. Verified Signatures
>
> 2. Compiled the source
>
> 3. Local Docker based deployed clsuter and some basic commands.
>
> Thanks,
>
> Mukul
>
> On 01/09/20 12:07 pm, Rakesh Radhakrishnan wrote:
> > Thanks Sammi for getting this out!
> >
> > +1 (binding)
> >
> >   * Verified signatures.
> >   * Built from source.
> >   * Deployed small non-HA un-secure cluster.
> >   * Verified basic Ozone file system.
> >   * Tried out a few basic Ozone shell commands - create, list, delete
> >   * Ran a few Freon benchmark tests.
> >
> > Thanks,
> > Rakesh
> >
> > On Tue, Sep 1, 2020 at 11:53 AM Jitendra Pandey
> >  wrote:
> >
> >> +1 (binding)
> >>
> >> 1. Verified signatures
> >> 2. Built from source
> >> 3. deployed with docker
> >> 4. tested with basic s3 apis.
> >>
> >> On Tue, Aug 25, 2020 at 7:01 AM Sammi Chen 
> wrote:
> >>
> >>> RC1 artifacts are at:
> >>> https://home.apache.org/~sammichen/ozone-1.0.0-rc1/
> >>> 
> >>>
> >>> Maven artifacts are staged at:
> >>>
> https://repository.apache.org/content/repositories/orgapachehadoop-1278
> >>> <
> https://repository.apache.org/content/repositories/orgapachehadoop-1277
> >>>
> >>>
> >>> The public key used for signing the artifacts can be found at:
> >>> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >>>
> >>> The RC1 tag in github is at:
> >>> https://github.com/apache/hadoop-ozone/releases/tag/ozone-1.0.0-RC1
> >>> 
> >>>
> >>> Change log of RC1, add
> >>> 1. HDDS-4063. Fix InstallSnapshot in OM HA
> >>> 2. HDDS-4139. Update version number in upgrade tests.
> >>> 3. HDDS-4144, Update version info in hadoop client dependency readme
> >>>
> >>> *The vote will run for 7 days, ending on Aug 31th 2020 at 11:59 pm
> PST.*
> >>>
> >>> Thanks,
> >>> Sammi Chen
> >>>
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>


Re: [DISCUSS] fate of branch-2.9

2020-08-27 Thread Ayush Saxena
+1

-Ayush

> On 27-Aug-2020, at 6:24 PM, Steve Loughran  
> wrote:
> 
> 
> 
> +1
> 
> are there any Hadoop branch-2 releases planned, ever? If so I'll need to 
> backport my s3a directory compatibility patch to whatever is still live.
> 
> 
>> On Thu, 27 Aug 2020 at 06:55, Wei-Chiu Chuang  wrote:
>> Bump up this thread after 6 months.
>> 
>> Is anyone still interested in the 2.9 release line? Or are we good to start
>> the EOL process? The 2.9.2 was released in Nov 2018.
>> 
>> I'd really like to see the community to converge to fewer release lines and
>> make more frequent releases in each line.
>> 
>> Thanks,
>> Weichiu
>> 
>> 
>> On Fri, Mar 6, 2020 at 5:47 PM Wei-Chiu Chuang  wrote:
>> 
>> > I think that's a great suggestion.
>> > Currently, we make 1 minor release per year, and within each minor release
>> > we bring up 1 thousand to 2 thousand commits in it compared with the
>> > previous one.
>> > I can totally understand it is a big bite for users to swallow. Having a
>> > more frequent release cycle, plus LTS and non-LTS releases should help with
>> > this. (Of course we will need to make the release preparation much easier,
>> > which is currently a pain)
>> >
>> > I am happy to discuss the release model further in the dev ML. LTS v.s.
>> > non-LTS is one suggestion.
>> >
>> > Another similar issue: In the past Hadoop strived to
>> > maintain compatibility. However, this is no longer sustainable as more CVEs
>> > coming from our dependencies: netty, jetty, jackson ... etc.
>> > In many cases, updating the dependencies brings breaking changes. More
>> > recently, especially in Hadoop 3.x, I started to make the effort to update
>> > dependencies much more frequently. How do users feel about this change?
>> >
>> > On Thu, Mar 5, 2020 at 7:58 AM Igor Dvorzhak 
>> > wrote:
>> >
>> >> Maybe Hadoop will benefit from adopting a similar release and support
>> >> strategy as Java? I.e. designate some releases as LTS and support them for
>> >> 2 (?) years (it seems that 2.7.x branch was de-facto LTS), other non-LTS
>> >> releases will be supported for 6 months (or until next release). This
>> >> should allow to reduce maintenance cost of non-LTS release and provide
>> >> conservative users desired stability by allowing them to wait for new LTS
>> >> release and upgrading to it.
>> >>
>> >> On Thu, Mar 5, 2020 at 1:26 AM Rupert Mazzucco 
>> >> wrote:
>> >>
>> >>> After recently jumping from 2.7.7 to 2.10 without issue myself, I vote
>> >>> for keeping only the 2.10 line.
>> >>> It would seem all other 2.x branches can upgrade to a 2.10.x easily if
>> >>> they feel like upgrading at all,
>> >>> unlike a jump to 3.x, which may require more planning.
>> >>>
>> >>> I also vote for having only one main 3.x branch. Why are there 3.1.x and
>> >>> 3.2.x seemingly competing,
>> >>> and now 3.3.x? For a community that does not have the resources to
>> >>> manage multiple release lines,
>> >>> you guys sure like to multiply release lines a lot.
>> >>>
>> >>> Cheers
>> >>> Rupert
>> >>>
>> >>> On Wed, 4 Mar 2020 at 19:40, Wei-Chiu Chuang
>> >>> :
>> >>>
>>  Forwarding the discussion thread from the dev mailing lists to the user
>>  mailing lists.
>> 
>>  I'd like to get an idea of how many users are still on Hadoop 2.9.
>>  Please share your thoughts.
>> 
>>  On Mon, Mar 2, 2020 at 6:30 PM Sree Vaddi
>>   wrote:
>> 
>> > +1
>> >
>> > Sent from Yahoo Mail on Android
>> >
>> >   On Mon, Mar 2, 2020 at 5:12 PM, Wei-Chiu Chuang
>> > wrote:   Hi,
>> >
>> > Following the discussion to end branch-2.8, I want to start a
>> > discussion
>> > around what's next with branch-2.9. I am hesitant to use the word "end
>> > of
>> > life" but consider these facts:
>> >
>> > * 2.9.0 was released Dec 17, 2017.
>> > * 2.9.2, the last 2.9.x release, went out Nov 19 2018, which is more
>> > than
>> > 15 months ago.
>> > * no one seems to be interested in being the release manager for 2.9.3.
>> > * Most if not all of the active Hadoop contributors are using Hadoop
>> > 2.10
>> > or Hadoop 3.x.
>> > * We as a community do not have the cycle to manage multiple release
>> > line,
>> > especially since Hadoop 3.3.0 is coming out soon.
>> >
>> > It is perhaps the time to gradually reduce our footprint in Hadoop
>> > 2.x, and
>> > encourage people to upgrade to Hadoop 3.x
>> >
>> > Thoughts?
>> >
>> >


Re: [RESULT][VOTE] Rlease Apache Hadoop-3.3.0

2020-07-17 Thread Ayush Saxena
Hi Brahma,
It seems the link to the changelog for release 3.3.0 isn't correct at:
https://hadoop.apache.org/

It points to :
http://hadoop.apache.org/docs/r3.3.0/hadoop-project-dist/hadoop-common/release/3.3.0/CHANGES.3.3.0.html

CHANGES.3.3.0.html isn't there; instead it should point to:

http://hadoop.apache.org/docs/r3.3.0/hadoop-project-dist/hadoop-common/release/3.3.0/CHANGELOG.3.3.0.html

Please give a check once!!!

-Ayush




On Wed, 15 Jul 2020 at 19:18, Brahma Reddy Battula 
wrote:

> Hi Stephen,
>
> thanks for bringing this to my attention.
>
> Looks like it's too late. I pushed the release tag (which can't be reverted)
> and updated the release date in the JIRA.
>
> Can we plan the next release for the near future?
>
>
> On Wed, Jul 15, 2020 at 5:25 PM Stephen O'Donnell
>  wrote:
>
> > Hi All,
> >
> > Sorry for being a bit late to this, but I wonder if we have a potential
> > blocker to this release.
> >
> > In Cloudera we have recently encountered a serious dataloss issue in HDFS
> > surrounding snapshots. To hit the dataloss issue, you must have
> HDFS-13101
> > and HDFS-15012 on the build (which branch-3.3.0 does). To prevent it, you
> > must also have HDFS-15313 and unfortunately, this was only committed to
> > trunk, so we need to cherry-pick it down the active branches.
> >
> > With data loss being a serious issue, should we pull this Jira into
> > branch-3.3.0 and cut a new release candidate?
> >
> > Thanks,
> >
> > Stephen.
> >
> > On Tue, Jul 14, 2020 at 1:22 PM Brahma Reddy Battula 
> > wrote:
> >
> > > Hi All,
> > >
> > > With 8 binding and 11 non-binding +1s and no -1s the vote for Apache
> > > hadoop-3.3.0 Release
> > > passes.
> > >
> > > Thank you everybody for contributing to the release, testing, and
> voting.
> > >
> > > Special thanks whoever verified the ARM Binary as this is the first
> > release
> > > to support the ARM in hadoop.
> > >
> > >
> > > Binding +1s
> > >
> > > =
> > > Akira Ajisaka
> > > Vinayakumar B
> > > Inigo Goiri
> > > Surendra Singh Lilhore
> > > Masatake Iwasaki
> > > Rakesh Radhakrishnan
> > > Eric Badger
> > > Brahma Reddy Battula
> > >
> > > Non-binding +1s
> > >
> > > =
> > > Zhenyu Zheng
> > > Sheng Liu
> > > Yikun Jiang
> > > Tianhua huang
> > > Ayush Saxena
> > > Hemanth Boyina
> > > Bilwa S T
> > > Takanobu Asanuma
> > > Xiaoqiao He
> > > CR Hota
> > > Gergely Pollak
> > >
> > > I'm going to work on staging the release.
> > >
> > >
> > > The voting thread is:
> > >
> > >  https://s.apache.org/hadoop-3.3.0-Release-vote-thread
> > >
> > >
> > >
> > > --Brahma Reddy Battula
> > >
> >
>
>
> --
>
>
>
> --Brahma Reddy Battula
>


Re: [VOTE] Release Apache Hadoop 3.3.0 - RC0

2020-07-11 Thread Ayush Saxena
+1(non-binding)
* Built from source on Ubuntu-18.04 (ARM & x86)
* Successful native build.
* Ran some basic HDFS shell commands.
* Browsed through the WebUI (NN, DN, RM & NM)
* Executed WordCount, TeraGen, TeraSort & TeraValidate
* Verified checksums and signatures.

Thanx everyone for the efforts towards the release. Good Luck!!!

-Ayush

On Sat, 11 Jul 2020 at 00:05, Iñigo Goiri  wrote:

> +1 (Binding)
>
> Deployed a cluster on Azure VMs with:
> * 3 VMs with HDFS Namenodes and Routers
> * 2 VMs with YARN Resource Managers
> * 5 VMs with HDFS Datanodes and Node Managers
>
> Tests:
> * Executed Tergagen+Terasort+Teravalidate.
> * Executed wordcount.
> * Browsed through the Web UI.
>
>
>
> On Fri, Jul 10, 2020 at 1:06 AM Vinayakumar B 
> wrote:
>
> > +1 (Binding)
> >
> > -Verified all checksums and Signatures.
> > -Verified site, Release notes and Change logs
> >   + Maybe the changelog and release notes could be grouped by project at
> > the second level for a better look (this needs to be supported by Yetus)
> > -Tested in x86 local 3-node docker cluster.
> >   + Built from source with OpenJdk 8 and Ubuntu 18.04
> >   + Deployed 3 node docker cluster
> >   + Ran various Jobs (wordcount, Terasort, Pi, etc)
> >
> > No Issues reported.
> >
> > -Vinay
> >
> > On Fri, Jul 10, 2020 at 1:19 PM Sheng Liu 
> wrote:
> >
> > > +1 (non-binding)
> > >
> > > - checkout the "3.3.0-aarch64-RC0" binaries packages
> > >
> > > - started a clusters with 3 nodes VMs of Ubuntu 18.04 ARM/aarch64,
> > > openjdk-11-jdk
> > >
> > > - checked some web UIs (NN, DN, RM, NM)
> > >
> > > - Executed a wordcount, TeraGen, TeraSort and TeraValidate
> > >
> > > - Executed a TestDFSIO job
> > >
> > > - Executed a Pi job
> > >
> > > BR,
> > > Liusheng
> > >
> > > Zhenyu Zheng  于2020年7月10日周五 下午3:45写道:
> > >
> > > > +1 (non-binding)
> > > >
> > > > - Verified all hashes and checksums
> > > > - Tested on ARM platform for the following actions:
> > > >   + Built from source on Ubuntu 18.04, OpenJDK 8
> > > >   + Deployed a pseudo cluster
> > > >   + Ran some example jobs(grep, wordcount, pi)
> > > >   + Ran teragen/terasort/teravalidate
> > > >   + Ran TestDFSIO job
> > > >
> > > > BR,
> > > >
> > > > Zhenyu
> > > >
> > > > On Fri, Jul 10, 2020 at 2:40 PM Akira Ajisaka 
> > > wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > - Verified checksums and signatures.
> > > > > - Built from the source with CentOS 7 and OpenJDK 8.
> > > > > - Successfully upgraded HDFS to 3.3.0-RC0 in our development
> cluster
> > > > (with
> > > > > RBF, security, and OpenJDK 11) for end-users. No issues reported.
> > > > > - The document looks good.
> > > > > - Deployed pseudo cluster and ran some MapReduce jobs.
> > > > >
> > > > > Thanks,
> > > > > Akira
> > > > >
> > > > >
> > > > > On Tue, Jul 7, 2020 at 7:27 AM Brahma Reddy Battula <
> > bra...@apache.org
> > > >
> > > > > wrote:
> > > > >
> > > > > > Hi folks,
> > > > > >
> > > > > > This is the first release candidate for the first release of
> Apache
> > > > > > Hadoop 3.3.0
> > > > > > line.
> > > > > >
> > > > > > It contains *1644[1]* fixed jira issues since 3.2.1 which
> include a
> > > lot
> > > > > of
> > > > > > features and improvements(read the full set of release notes).
> > > > > >
> > > > > > Below feature additions are the highlights of the release.
> > > > > >
> > > > > > - ARM Support
> > > > > > - Enhancements and new features on S3a,S3Guard,ABFS
> > > > > > - Java 11 Runtime support and TLS 1.3.
> > > > > > - Support Tencent Cloud COS File System.
> > > > > > - Added security to HDFS Router.
> > > > > > - Support non-volatile storage class memory(SCM) in HDFS cache
> > > > directives
> > > > > > - Support Interactive Docker Shell for running Containers.
> > > > > > - Scheduling of opportunistic containers
> > > > > > - A pluggable device plugin framework to ease vendor plugin
> > > development
> > > > > >
> > > > > > *The RC0 artifacts are at*:
> > > > > > http://home.apache.org/~brahma/Hadoop-3.3.0-RC0/
> > > > > >
> > > > > > *First release to include ARM binary, Have a check.*
> > > > > > *RC tag is *release-3.3.0-RC0.
> > > > > >
> > > > > >
> > > > > > *The maven artifacts are hosted here:*
> > > > > >
> > > >
> > https://repository.apache.org/content/repositories/orgapachehadoop-1271/
> > > > > >
> > > > > > *My public key is available here:*
> > > > > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > > > > >
> > > > > > The vote will run for 5 weekdays, until Tuesday, July 13 at
> > > > > > 3:50 AM IST.
> > > > > >
> > > > > >
> > > > > > I have done a few testing with my pseudo cluster. My +1 to start.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Regards,
> > > > > > Brahma Reddy Battula
> > > > > >
> > > > > >
> > > > > > 1. project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in
> > > (3.3.0)
> > > > > AND
> > > > > > fixVersion not in (3.2.0, 3.2.1, 3.1.3) AND status = Resolved
> ORDER
> > > BY
> > > > > > fixVersion ASC
> > > > > >
> > > > >

Re: [VOTE] Release Apache Hadoop 3.1.4 (RC1)

2020-06-24 Thread Ayush Saxena
Hi Gabor,
As you are going ahead with another RC,
please include HDFS-15323 as well, if possible:

https://issues.apache.org/jira/browse/HDFS-15323

I remember tagging you there, but at that time RC0 was already up; please
see if that change could make it into the release as well.

Thanx!!!
-Ayush

On Wed, 24 Jun 2020 at 16:16, Gabor Bota 
wrote:

> Thanks for looking into this Akira, Kihwal!
>
>
> I noted that the situation described in HDFS-14941 is hard to reproduce.
> The issue introduced by HDFS-14941 would be even harder to fix in
> HDFS-15421, to test, and to prove stable, etc.
> That's why I will revert HDFS-14941 and create an RC3.
>
>
>
> * I withdraw this vote now for RC2 because of that blocker issue
> (HDFS-15421). I will create an RC3 with HDFS-14941 reverted. *
>
> Regards,
> Gabor
>
> On Tue, Jun 23, 2020 at 4:59 PM Kihwal Lee
>  wrote:
> >
> > Gabor,
> > If you want to release asap, you can simply revert HDFS-14941 in the
> > release branch for now. It is causing the issue and was committed after
> > 3.1.3. This causes failure of the automated upgrade process and a
> > namenode memory leak.
> >
> > Kihwal
> >
> > On Tue, Jun 23, 2020 at 8:47 AM Akira Ajisaka 
> wrote:
> >
> > > Hi Gabor,
> > >
> > > Thank you for your work!
> > >
> > > Kihwal reported an IBR leak in the standby NameNode:
> > > https://issues.apache.org/jira/browse/HDFS-15421.
> > > I think this is a blocker and this affects 3.1.4-RC1. Would you check
> this?
> > >
> > > Best regards,
> > > Akira
> > >
> > > > On Mon, Jun 22, 2020 at 10:26 PM Gabor Bota  wrote:
> > >
> > > > Hi folks,
> > > >
> > > > I have put together a release candidate (RC1) for Hadoop 3.1.4.
> > > >
> > > > The RC is available at:
> > > http://people.apache.org/~gabota/hadoop-3.1.4-RC1/
> > > > The RC tag in git is here:
> > > > https://github.com/apache/hadoop/releases/tag/release-3.1.4-RC1
> > > > The maven artifacts are staged at
> > > >
> https://repository.apache.org/content/repositories/orgapachehadoop-1267/
> > > >
> > > > You can find my public key at:
> > > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > > > and
> http://keys.gnupg.net/pks/lookup?op=get&search=0xB86249D83539B38C
> > > >
> > > > Please try the release and vote. The vote will run for 5 weekdays,
> > > > until June 30. 2020. 23:00 CET.
> > > >
> > > > Thanks,
> > > > Gabor
> > > >
> > > > -
> > > > To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> > > > For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> > > >
> > > >
> > >
>
>
>


Re: [DISCUSS] Hadoop 3.3.0 Release include ARM binary

2020-06-15 Thread Ayush Saxena
YARN-10314 also seems to be a blocker.

https://issues.apache.org/jira/browse/YARN-10314

We should wait for that as well; it should be concluded in a day or two.

-Ayush

> On 15-Jun-2020, at 7:21 AM, Sheng Liu  wrote:
> 
> HADOOP-17046 has
> been merged :)
> 
> On Thu, 4 Jun 2020 at 10:43 PM, Brahma Reddy Battula  wrote:
> 
>> The following blocker is pending for the 3.3.0 release; its fix is
>> ready for review. Most likely we'll have an RC soon.
>> https://issues.apache.org/jira/browse/HADOOP-17046
>> 
>> The Protobuf dependency was unexpected.
>> 
>>> On Mon, Jun 1, 2020 at 7:11 AM Sheng Liu  wrote:
>>> 
>>> Hi folks,
>>> 
>>> It looks like the 3.3.0 branch was created quite a while ago. Not sure
>>> whether any remaining blocker issues need to be addressed before the
>>> Hadoop 3.3.0 release is published; maybe we can bring them up here and
>>> move the release forward?
>>> 
>>> Thanks.
>>> 
>>> On Wed, 25 Mar 2020 at 1:55 AM, Brahma Reddy Battula  wrote:
>>> 
 Thanks to all.
 
 Will make this optional and will update the wiki accordingly.
 
 On Wed, Mar 18, 2020 at 12:05 AM Vinayakumar B <
>> vinayakum...@apache.org>
 wrote:
 
> Making the ARM artifact optional makes the release process simpler for
> the RM and unblocks the release process (if ARM resources are
> unavailable).
> 
> There are still options to collaborate with the RM (as Brahma mentioned
> earlier) and provide the ARM artifact, maybe before or after the vote.
> If feasible, the RM can decide to add the ARM artifact by collaborating
> with @Brahma Reddy Battula or me.
> 
> -Vinay
> 
> On Tue, Mar 17, 2020 at 11:39 PM Arpit Agarwal
>  wrote:
> 
>> Thanks for the clarification, Brahma. Can you update the proposal to
>> state that it is optional (it may help to put the proposal on cwiki)?
>> 
>> Also, if we go ahead, the RM documentation should be clear that this is
>> an optional step.
>> 
>> 
>>> On Mar 17, 2020, at 11:06 AM, Brahma Reddy Battula <
 bra...@apache.org>
>> wrote:
>>> 
>>> Sure, we can't make it mandatory while voting, and we can upload it
>>> to downloads once the release vote has passed.
>>> 
>>> On Tue, 17 Mar 2020 at 11:24 PM, Arpit Agarwal
>>>  wrote:
>>> 
> Sorry, didn't get you... do you mean, once release voting is
> processed and uploaded by the RM?
 
 Yes, that is what I meant. I don’t want us to make more mandatory work
 for the release manager because the job is hard enough already.
 
 
> On Mar 17, 2020, at 10:46 AM, Brahma Reddy Battula <
> bra...@apache.org>
 wrote:
> 
> Sorry, didn't get you... do you mean, once release voting is
> processed and uploaded by the RM?
> 
> FYI, there is a Docker image for ARM as well, which supports all the
> scripts (createrelease, start-build-env.sh, etc.).
> 
> https://issues.apache.org/jira/browse/HADOOP-16797
> 
> On Tue, Mar 17, 2020 at 10:59 PM Arpit Agarwal
>  wrote:
> 
>> Can ARM binaries be provided after the fact? We cannot increase the
>> RM’s burden by asking them to generate an extra set of binaries.
>> 
>> 
>>> On Mar 17, 2020, at 10:23 AM, Brahma Reddy Battula <
>> bra...@apache.org>
>> wrote:
>>> 
>>> + Dev mailing list.
>>> 
>>> -- Forwarded message -
>>> From: Brahma Reddy Battula 
>>> Date: Tue, Mar 17, 2020 at 10:31 PM
>>> Subject: Re: [DISCUSS] Hadoop 3.3.0 Release include ARM
>> binary
>>> To: junping_du 
>>> 
>>> 
>>> thanks junping for your reply.
>>> 
>>> bq. I think most of us in Hadoop community doesn't want to have
>>> biased on ARM or any other platforms.
>>> 
>>> Yes, release voting will be based on the source code. AFAIK, the
>>> binaries we are providing are for users to download and verify easily.
>>> 
>>> bq. The only thing I try to understand is how much complexity get
>>> involved for our RM work. Does that potentially become a blocker for
>>> future releases? And how we can get rid of this risk.
>>> 
>>> As I mentioned earlier, the RM needs to access the ARM machine (it
>>> will be donated, and the current qbt is also using one ARM machine)
>>> and build the tar using their keys. As it can be a shared machine, the
>>> RM can delete their keys once the release is approved.
>>> This can be sorted out as I mentioned earlier (for accessing the ARM
>>> machine).
>>> 
>>> bq.   If you can list the 

Re: [ANNOUNCE] New Apache Hadoop Committer - He Xiaoqiao

2020-06-11 Thread Ayush Saxena
Congratulations He Xiaoqiao!!!

-Ayush

> On 11-Jun-2020, at 9:30 PM, Wei-Chiu Chuang  wrote:
> 
> In bcc: general@
> 
> It's my pleasure to announce that He Xiaoqiao has been elected as a
> committer on the Apache Hadoop project, recognizing his continued
> contributions to the project.
> 
> Please join me in congratulating him.
> 
> Hearty Congratulations & Welcome aboard Xiaoqiao!
> 
> Wei-Chiu Chuang
> (On behalf of the Hadoop PMC)




Re: [DISCUSS] making Ozone a separate Apache project

2020-05-13 Thread Ayush Saxena
+1

-Ayush

> On 13-May-2020, at 11:26 PM, Aravindan Vijayan 
>  wrote:
> 
> +1.
> 
> Thank you for taking this up Marton.
> 
> 
>> On Wed, May 13, 2020 at 10:48 AM Uma Maheswara Rao G 
>> wrote:
>> 
>> +1
>> 
>> Regards,
>> Uma
>> 
>>> On Wed, May 13, 2020 at 12:53 AM Elek, Marton  wrote:
>>> 
>>> 
>>> 
>>> I would like to start a discussion to make a separate Apache project for
>>> Ozone
>>> 
>>> 
>>> 
>>> ### HISTORY [1]
>>> 
>>>  * Apache Hadoop Ozone development started on a feature branch of
>>> Hadoop repository (HDFS-7240)
>>> 
>>>  * In the October of 2017 a discussion has been started to merge it to
>>> the Hadoop main branch
>>> 
>>>  * After a long discussion it's merged to Hadoop trunk at the March of
>>> 2018
>>> 
>>>  * During the discussion of the merge, it was suggested multiple times
>>> to create a separated project for the Ozone. But at that time:
>>> 1). Ozone was tightly integrated with Hadoop/HDFS
>>> 2). There was an active plan to use Block layer of Ozone (HDDS or
>>> HDSL at that time) as the block level of HDFS
>>> 3). The community of Ozone was a subset of the HDFS community
>>> 
>>>  * The first beta release of Ozone was just released. It seems to be
>>> a good time, before the first GA, to make a decision about the future.
>>> 
>>> 
>>> 
>>> ### WHAT HAS BEEN CHANGED
>>> 
>>>  Over the last years, Ozone became more and more independent, both on
>>> the community and the code side. The separation has been suggested
>>> again and again (for example by Owen [2] and Vinod [3])
>>> 
>>> 
>>> 
>>>  From COMMUNITY point of view:
>>> 
>>> 
>>>   * Fortunately more and more new contributors are helping Ozone.
>>> Originally the Ozone community was a subset of HDFS project. But now a
>>> bigger and bigger part of the community is related to Ozone only.
>>> 
>>>   * It seems to be easier to _build_ the community as a separate
>>> project.
>>> 
>>>   * A new, younger project might have different practices
>>> (communication, committer criteria, development style) compared to an
>>> old, mature project
>>> 
>>>   * It's easier to communicate (and improve) these standards in a
>>> separated projects with clean boundaries
>>> 
>>>   * Separated project/brand can help to increase the adoption rate and
>>> attract more individual contributor (AFAIK it has been seen in Submarine
>>> after a similar move)
>>> 
>>>  * The contribution process can be communicated more easily, and we
>>> can make first-time contributions easier
>>> 
>>> 
>>> 
>>>  From CODE point of view Ozone became more and more independent:
>>> 
>>> 
>>>  * Ozone has different release cycle
>>> 
>>>  * Code is already separated from Hadoop code base
>>> (apache/hadoop-ozone.git)
>>> 
>>>  * It has separated CI (github actions)
>>> 
>>>  * Ozone uses different (more strict) coding style (zero toleration of
>>> unit test / checkstyle errors)
>>> 
>>>  * The code itself became more and more independent from Hadoop on
>>> Maven level. Originally it was compiled together with the in-tree latest
>>> Hadoop snapshot. Now it depends on released Hadoop artifacts (RPC,
>>> Configuration...)
>>> 
>>>  * It starts to use multiple version of Hadoop (on client side)
>>> 
>>>  * The volume of resolved issues is already very high on the Ozone
>>> side (Ozone had slightly more resolved issues than
>>> HDFS/YARN/MAPREDUCE/COMMON all together in the last 2-3 months)
>>> 
>>> 
>>> Summary: Before the first Ozone GA release, it seems to be a good time
>>> to discuss the long-term future of Ozone. Managing it as a separate
>>> TLP project seems to have more benefits.
>>> 
>>> 
>>> Please let me know what your opinion is...
>>> 
>>> Thanks a lot,
>>> Marton
>>> 
>>> 
>>> 
>>> 
>>> 
>>> [1]: For more details, see:
>>> https://github.com/apache/hadoop-ozone/blob/master/HISTORY.md
>>> 
>>> [2]:
>>> 
>>> 
>> https://lists.apache.org/thread.html/0d0253f6e5fa4f609bd9b917df8e1e4d8848e2b7fdb3099b730095e6%40%3Cprivate.hadoop.apache.org%3E
>>> 
>>> [3]:
>>> 
>>> 
>> https://lists.apache.org/thread.html/8be74421ea495a62e159f2b15d74627c63ea1f67a2464fa02c85d4aa%40%3Chdfs-dev.hadoop.apache.org%3E
>>> 
>>> 
>>> 
>> 




Re: [DISCUSS] Shade guava into hadoop-thirdparty

2020-04-06 Thread Ayush Saxena
+1

-Ayush

> On 05-Apr-2020, at 12:43 AM, Wei-Chiu Chuang  wrote:
> 
> Hi Hadoop devs,
> 
> I spent a good part of the past 7 months working with a dozen of colleagues
> to update the guava version in Cloudera's software (that includes Hadoop,
> HBase, Spark, Hive, Cloudera Manager ... more than 20+ projects)
> 
> After 7 months, I finally came to a conclusion: updating to Hadoop 3.3 /
> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0 / 3.1.0, is going to
> be really hard because of Guava. Because of Guava, the amount of work to
> certify a minor release update is almost equivalent to a major release
> update.
> 
> That is because:
> (1) Going from Guava 11 to Guava 27 is a big jump. There are several
> incompatible API changes in many places. Too bad the Google developers
> are not sympathetic to their users.
> (2) guava is used in all Hadoop jars. Not just Hadoop servers but also
> client jars and Hadoop common libs.
> (3) The Hadoop library is used in practically all software at Cloudera.
> 
> Here is my proposal:
> (1) shade guava into hadoop-thirdparty, relocate the classpath to
> org.hadoop.thirdparty.com.google.common.*
> (2) make a hadoop-thirdparty 1.1.0 release.
> (3) update existing references to guava to the relocated path. There are
> more than 2k imports that need an update.
> (4) release Hadoop 3.3.1 / 3.2.2 that contains this change.
> 
> In this way, we will be able to update guava in Hadoop in the future
> without disrupting Hadoop applications.
> 
> Note: HBase already did this and this guava update project would have been
> much more difficult if HBase didn't do so.
> 
> Thoughts? Other options include
> (1) force downstream applications to migrate to Hadoop client artifacts as
> listed here
> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
> but
> that's nearly impossible.
> (2) Migrate Guava to Java APIs. I suppose this is a big project and I can't
> estimate how much work it's going to be.
> 
> Weichiu
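For reference, step (1) of the proposal above corresponds to a maven-shade-plugin relocation of roughly this shape. This is a sketch only: the plugin coordinates and module layout are illustrative assumptions, not the actual hadoop-thirdparty pom; the relocated package name is the one given in the email.

```xml
<!-- Hypothetical pom.xml fragment: bundle Guava into the thirdparty jar
     and relocate its packages so downstream applications can use a
     different Guava version without classpath conflicts. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <artifactSet>
          <includes>
            <!-- Pull Guava itself into the shaded artifact -->
            <include>com.google.guava:guava</include>
          </includes>
        </artifactSet>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <!-- Relocated package named in the proposal -->
            <shadedPattern>org.hadoop.thirdparty.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a relocation like this in place, step (3)'s rewrite of the ~2k imports becomes mechanical: `import com.google.common.base.Preconditions;` turns into `import org.hadoop.thirdparty.com.google.common.base.Preconditions;`.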




Re: [VOTE] Apache Hadoop Ozone 0.5.0-beta RC2

2020-03-22 Thread Ayush Saxena
+1(non binding)
*Built from source
*Verified Checksums
*Ran some basic Shell Commands.

Thanx Dinesh for driving the release.

-Ayush

On Mon, 16 Mar 2020 at 07:57, Dinesh Chitlangia 
wrote:

> Hi Folks,
>
> We have put together RC2 for Apache Hadoop Ozone 0.5.0-beta.
>
> The RC artifacts are at:
> https://home.apache.org/~dineshc/ozone-0.5.0-rc2/
>
> The public key used for signing the artifacts can be found at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> The maven artifacts are staged at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1262
>
> The RC tag in git is at:
> https://github.com/apache/hadoop-ozone/tree/ozone-0.5.0-beta-RC2
>
> This release contains 800+ fixes/improvements [1].
> Thanks to everyone who put in the effort to make this happen.
>
> *The vote will run for 7 days, ending on March 22nd 2020 at 11:59 pm PST.*
>
> Note: This release is beta quality; it’s not recommended for use in
> production, but we believe that it’s stable enough to try out the
> feature set and collect feedback.
>
>
> [1] https://s.apache.org/ozone-0.5.0-fixed-issues
>
> Thanks,
> Dinesh Chitlangia
>


Re: [ANNOUNCE] New Apache Hadoop Committer - Siyao Meng

2020-03-21 Thread Ayush Saxena
Congratulations Siyao!!!

-Ayush 

> On 21-Mar-2020, at 8:13 AM, Zhankun Tang  wrote:
> 
> Welcome aboard, Siyao!
> 
> BR,
> Zhankun
> 
> On Sat, 21 Mar 2020 at 1:24 AM, Xiaoyu Yao wrote:
> 
>> It's my pleasure to announce that Siyao Meng has been elected as a
>> committer on the Apache Hadoop project, recognizing his continued
>> contributions to the project.
>> 
>> Please join me in congratulating him.
>> 
>> Congratulations & Welcome aboard Siyao!
>> 
>> Xiaoyu Yao
>> (On behalf of the Hadoop PMC)
>> 


Re: [VOTE] Release Apache Hadoop Thirdparty 1.0.0 - RC1

2020-03-12 Thread Ayush Saxena
Thanx Vinay for driving the release.
+1(non-binding)
Built trunk with -Dhadoop-thirdparty-protobuf.version=1.0.0
Built from source on Ubuntu 19.10
Verified source checksum.

Good Luck!!!

-Ayush
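The checksum-verification step mentioned in votes like the one above can be sketched as follows. The tarball here is a locally created stand-in — the name `hadoop-thirdparty-1.0.0-src.tar.gz` and its contents are assumptions, not the real RC artifact. With a real RC you would download the artifact and its published `.sha512` file instead, and also verify the GPG signature (`.asc`) against the project's KEYS file.

```shell
# Stand-in for a downloaded release tarball (hypothetical name/contents).
workdir=$(mktemp -d)
cd "$workdir"
printf 'release-bits' > hadoop-thirdparty-1.0.0-src.tar.gz

# A release manager publishes a .sha512 file next to each artifact;
# here we generate one ourselves so the check below can run.
sha512sum hadoop-thirdparty-1.0.0-src.tar.gz \
  > hadoop-thirdparty-1.0.0-src.tar.gz.sha512

# The actual verification step: recompute the digest and compare.
sha512sum -c hadoop-thirdparty-1.0.0-src.tar.gz.sha512

# For a real RC, additionally check the detached signature, e.g.:
#   gpg --import KEYS
#   gpg --verify hadoop-thirdparty-1.0.0-src.tar.gz.asc
```

`sha512sum -c` prints one `<file>: OK` line per verified artifact and exits non-zero on any mismatch, which makes it easy to script over a whole RC directory.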

On Thu, 12 Mar 2020 at 01:56, Vinayakumar B  wrote:

> Hi folks,
>
> Thanks to everyone's help on this release.
>
> I have re-created a release candidate (RC1) for Apache Hadoop Thirdparty
> 1.0.0.
>
> RC Release artifacts are available at :
>
> http://home.apache.org/~vinayakumarb/release/hadoop-thirdparty-1.0.0-RC1/
>
> Maven artifacts are available in staging repo:
>
> https://repository.apache.org/content/repositories/orgapachehadoop-1261/
>
> The RC tag in git is here:
> https://github.com/apache/hadoop-thirdparty/tree/release-1.0.0-RC1
>
> And my public key is at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> *This vote will run for 5 days, ending on March 18th 2020 at 11:59 pm IST.*
>
> For the testing, I have verified Hadoop trunk compilation with
>"-DdistMgmtSnapshotsUrl=
> https://repository.apache.org/content/repositories/orgapachehadoop-1261/
>  -Dhadoop-thirdparty-protobuf.version=1.0.0"
>
> My +1 to start.
>
> -Vinay
>

