HADOOP-18207 hadoop-logging module about to land

2023-07-26 Thread Wei-Chiu Chuang
Hi,

I am preparing to resolve HADOOP-18207
(https://github.com/apache/hadoop/pull/5717).

This change affects all modules. Once merged, it will eliminate almost
all direct log4j usage.

As always, landing such a big piece is tricky. I am sorry for the mishaps
last time and am doing more due diligence to make it a smoother transition.
I am triggering one last precommit check. Once the change is merged, Viraj
and I will pay attention to any potential problems.

Weichiu


[ANNOUNCE] Apache Hadoop 3.3.6 release

2023-06-26 Thread Wei-Chiu Chuang
On behalf of the Apache Hadoop Project Management Committee, I am pleased
to announce the release of Apache Hadoop 3.3.6.

It contains 117 bug fixes, improvements and enhancements since 3.3.5. Users
of Apache Hadoop 3.3.5 and earlier should upgrade to this release.

https://hadoop.apache.org/release/3.3.6.html
Feature highlights:

SBOM artifacts

Starting from this release, Hadoop publishes a Software Bill of Materials
(SBOM) using the CycloneDX Maven plugin. For more information about SBOM, see
[SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).

HDFS RBF: RDBMS based token storage support

HDFS Router-Based Federation now supports storing delegation tokens
in MySQL
([HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)),
which improves token operation throughput over the original ZooKeeper-based
implementation.


New File System APIs

[HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved a
number of HDFS-specific APIs to Hadoop Common to make it possible for certain
applications that depend on HDFS semantics to run on other Hadoop compatible
file systems.

In particular, recoverLease() and isFileClosed() are exposed through the
LeaseRecoverable interface, while setSafeMode() is exposed through the
SafeMode interface.
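
As a rough sketch of how an application might probe for this capability
instead of depending on an HDFS-specific class: the stand-in
LeaseRecoverable interface below only mirrors the method names mentioned
above and uses a plain String for the file (the real interface ships with
Hadoop Common, so check its Javadoc for the exact signatures). LeaseProbe
and ensureClosed are hypothetical names used for illustration only.

```java
import java.io.IOException;

// Stand-in for illustration only: mirrors the method names from the
// announcement. The real interface ships with Hadoop Common.
interface LeaseRecoverable {
  boolean recoverLease(String file) throws IOException;
  boolean isFileClosed(String file) throws IOException;
}

// Hypothetical helper, not part of Hadoop.
final class LeaseProbe {
  private LeaseProbe() {}

  /**
   * Returns true if the file is already closed or lease recovery reports
   * success. Returns false if the file system object does not implement
   * the lease recovery capability at all.
   */
  static boolean ensureClosed(Object fs, String file) throws IOException {
    if (!(fs instanceof LeaseRecoverable)) {
      return false; // capability not offered by this file system
    }
    LeaseRecoverable recoverable = (LeaseRecoverable) fs;
    return recoverable.isFileClosed(file) || recoverable.recoverLease(file);
  }
}
```

The instanceof check is the point of the sketch: callers test for the
interface rather than for a concrete file system class.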

Many thanks to everyone who helped in this release by supplying patches,
reviewing them, helping get this release built, and testing and
reviewing the final artifacts.

Weichiu


Re: [VOTE] Release Apache Hadoop 3.3.6 RC1

2023-06-25 Thread Wei-Chiu Chuang
Thanks all!
The vote passed with 6 binding +1 votes, 4 non-binding +1 votes, and no
+0 or -1 votes.
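
For anyone tallying a future vote, here is a minimal sketch of the pass
check. It assumes the usual majority-approval reading (at least three
binding +1 votes and more binding +1 than binding -1 votes); please check
the bylaws page for the authoritative wording. ReleaseVote and passes are
hypothetical names.

```java
// Hypothetical helper for tallying a release vote.
final class ReleaseVote {
  private ReleaseVote() {}

  /**
   * Assumed rule: at least three binding +1 votes, and more binding +1
   * votes than binding -1 votes. Non-binding votes are not counted.
   */
  static boolean passes(int bindingPlusOne, int bindingMinusOne) {
    return bindingPlusOne >= 3 && bindingPlusOne > bindingMinusOne;
  }
}
```

With the numbers above, passes(6, 0) holds.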

Publishing the release bits and updating webpage and user docs now.

Thanks for the binding votes from Ayush, Xiaoqiao, Sammi, Mukund, Masatake
and the non-binding votes from Nilotpal, Viraj, Stephen, George and Ahmar.

On Fri, Jun 23, 2023 at 11:48 PM Ayush Saxena  wrote:

> +1 (Binding)
>
> * Built from source (x86 & Arm)
> * Successful native build on ubuntu 18.04(x86) & ubuntu 20.04(Arm)
> * Verified Checksums (x86 & Arm)
> * Verified Signatures (x86 & Arm)
> * Successful RAT check (x86 & Arm)
> * Verified the diff b/w the tag & the source tar
> * Built Ozone with 3.3.6, green build after a retrigger due to some OOM
> issues [1]
> * Built Tez with 3.3.6 green build [2]
> * Ran basic HDFS shell commands (Fs
> Operations/EC/RBF/StoragePolicy/Snapshots) (x86 & Arm)
> * Ran some basic Yarn shell commands.
> * Browsed through the UI (NN, DN, RM, NM, JHS) (x86 & Arm)
> * Ran some example Jobs (TeraGen, TeraSort, TeraValidate, WordCount,
> WordMean, Pi) (x86 & Arm)
> * Verified the output of `hadoop version` (x86 & Arm)
> * Ran some HDFS unit tests around FsOperations/EC/Observer Read/RBF/SPS
> * Skimmed over the contents of site jar
> * Skimmed over the staging repo.
> * Checked the NOTICE & Licence files.
>
> Thanx Wei-Chiu for driving the release, Good Luck!!!
>
> -Ayush
>
>
> [1] https://github.com/ayushtkn/hadoop-ozone/actions/runs/5282707769
> [2] https://github.com/apache/tez/pull/285#issuecomment-1590962978
>
> On Sat, 24 Jun 2023 at 09:43, Nilotpal Nandi 
> wrote:
>
>> +1 (Non-binding).
>> Thanks a lot Wei-Chiu for driving it.
>>
>> Thanks,
>> Nilotpal Nandi
>>
>> On 2023/06/23 21:51:56 Wei-Chiu Chuang wrote:
>> > +1 (binding)
>> >
>> > Note: according to the Hadoop bylaws, a release vote is open for 5 days,
>> > not 7 days. So technically the time is almost up.
>> > https://hadoop.apache.org/bylaws#Decision+Making
>> >
>> > If you plan to cast a vote, please do so soon. In the meantime, I'll
>> start
>> > to prepare to wrap up the release work.
>> >
>> > On Fri, Jun 23, 2023 at 6:09 AM Xiaoqiao He 
>> wrote:
>> >
>> > > +1(binding)
>> > >
>> > > * Verified signature and checksum of all source tarballs.
>> > > * Built source code on Ubuntu and OpenJDK 11 by `mvn clean package
>> > > -DskipTests -Pnative -Pdist -Dtar`.
>> > > * Setup pseudo cluster with HDFS and YARN.
>> > > * Run simple FsShell - mkdir/put/get/mv/rm and check the result.
>> > > * Run example mr applications and check the result - Pi & wordcount.
>> > > * Checked the Web UI of NameNode/DataNode/Resourcemanager/NodeManager
>> etc.
>> > > * Checked git and JIRA using dev-support tools
>> > > `git_jira_fix_version_check.py` .
>> > >
>> > > Thanks WeiChiu for your work.
>> > >
>> > > NOTE: I believe the build fatal error report from me above is only
>> related
>> > > to my own environment.
>> > >
>> > > Best Regards,
>> > > - He Xiaoqiao
>> > >
>> > > On Thu, Jun 22, 2023 at 4:17 PM Chen Yi 
>> wrote:
>> > >
>> > > > Thanks Wei-Chiu for leading this effort !
>> > > >
>> > > > +1(Binding)
>> > > >
>> > > >
>> > > > + Verified the signature and checksum of all tarballs.
>> > > > + Started a web server and viewed documentation site.
>> > > > + Built from the source tarball on macOS 12.3 and OpenJDK 8.
>> > > > + Launched a pseudo distributed cluster using released binary packages,
>> > > > done some HDFS dir/file basic operations.
>> > > > + Run grep, pi and wordcount MR tasks on the pseudo cluster.
>> > > >
>> > > > Bests,
>> > > > Sammi Chen
>> > > > 
>> > > > From: Wei-Chiu Chuang
>> > > > Sent: June 19, 2023 8:52
>> > > > To: Hadoop Common; Hdfs-dev <
>> > > > hdfs-...@hadoop.apache.org>; yarn-dev;
>> > > > mapreduce-dev
>> > > > Subject: [VOTE] Release Apache Hadoop 3.3.6 RC1
>> > > >
>> > > > I am inviting anyone to try and vote on this release candidate.
>> > > >
>> > > > Note:
>> > > > This is exactly the same as RC0, except the CHANGE

Re: [VOTE] Release Apache Hadoop 3.3.6 RC1

2023-06-23 Thread Wei-Chiu Chuang
+1 (binding)

Note: according to the Hadoop bylaws, a release vote is open for 5 days, not 7
days. So technically the time is almost up.
https://hadoop.apache.org/bylaws#Decision+Making

If you plan to cast a vote, please do so soon. In the meantime, I'll start
to prepare to wrap up the release work.
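
To make the arithmetic concrete, here is a small sketch. It assumes the
RC1 vote opened on 2023-06-18 (the date of the [VOTE] mail) and counts
whole calendar days, which is a simplification. VoteWindow and closesOn
are hypothetical names.

```java
import java.time.LocalDate;

// Hypothetical helper: computes the calendar day on which a vote closes.
final class VoteWindow {
  private VoteWindow() {}

  /** Adds the voting period, in whole calendar days, to the opening date. */
  static LocalDate closesOn(LocalDate opened, int days) {
    return opened.plusDays(days);
  }
}
```

Under those assumptions a 5-day window closes on 2023-06-23 and a 7-day
window on 2023-06-25.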

On Fri, Jun 23, 2023 at 6:09 AM Xiaoqiao He  wrote:

> +1(binding)
>
> * Verified signature and checksum of all source tarballs.
> * Built source code on Ubuntu and OpenJDK 11 by `mvn clean package
> -DskipTests -Pnative -Pdist -Dtar`.
> * Setup pseudo cluster with HDFS and YARN.
> * Run simple FsShell - mkdir/put/get/mv/rm and check the result.
> * Run example mr applications and check the result - Pi & wordcount.
> * Checked the Web UI of NameNode/DataNode/Resourcemanager/NodeManager etc.
> * Checked git and JIRA using dev-support tools
> `git_jira_fix_version_check.py` .
>
> Thanks WeiChiu for your work.
>
> NOTE: I believe the build fatal error report from me above is only related
> to my own environment.
>
> Best Regards,
> - He Xiaoqiao
>
> On Thu, Jun 22, 2023 at 4:17 PM Chen Yi  wrote:
>
> > Thanks Wei-Chiu for leading this effort !
> >
> > +1(Binding)
> >
> >
> > + Verified the signature and checksum of all tarballs.
> > + Started a web server and viewed documentation site.
> > + Built from the source tarball on macOS 12.3 and OpenJDK 8.
> > + Launched a pseudo distributed cluster using released binary packages,
> > done some HDFS dir/file basic operations.
> > + Run grep, pi and wordcount MR tasks on the pseudo cluster.
> >
> > Bests,
> > Sammi Chen
> > 
> > From: Wei-Chiu Chuang
> > Sent: June 19, 2023 8:52
> > To: Hadoop Common; Hdfs-dev <
> > hdfs-...@hadoop.apache.org>; yarn-dev;
> > mapreduce-dev
> > Subject: [VOTE] Release Apache Hadoop 3.3.6 RC1
> >
> > I am inviting anyone to try and vote on this release candidate.
> >
> > Note:
> > This is exactly the same as RC0, except the CHANGELOG.
> >
> > The RC is available at:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/ (for amd64)
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-arm64/ (for arm64)
> >
> > Git tag: release-3.3.6-RC1
> > https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC1
> >
> > Maven artifacts were built on an x86 machine and are staged at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1380/
> >
> > My public key:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > Changelog:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/CHANGELOG.md
> >
> > Release notes:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/RELEASENOTES.md
> >
> > This is a relatively small release (by Hadoop standard) containing about
> > 120 commits.
> > Please give it a try, this RC vote will run for 7 days.
> >
> >
> > Feature highlights:
> >
> > SBOM artifacts
> > 
> > Starting from this release, Hadoop publishes Software Bill of Materials
> > (SBOM) using
> > CycloneDX Maven plugin. For more information about SBOM, please go to
> > [SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
> >
> > HDFS RBF: RDBMS based token storage support
> > 
> > HDFS Router-Router Based Federation now supports storing delegation
> tokens
> > on MySQL,
> > [HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
> > which improves token operation through over the original Zookeeper-based
> > implementation.
> >
> >
> > New File System APIs
> > 
> > [HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671)
> moved a
> > number of
> > HDFS-specific APIs to Hadoop Common to make it possible for certain
> > applications that
> > depend on HDFS semantics to run on other Hadoop compatible file systems.
> >
> > In particular, recoverLease() and isFileClosed() are exposed through
> > LeaseRecoverable
> > interface. While setSafeMode() is exposed through SafeMode interface.
> >
>


Clean up old Hadoop release tarballs

2023-06-23 Thread Wei-Chiu Chuang
https://dist.apache.org/repos/dist/release/hadoop/common/ has Hadoop
release tarballs 3.3.1 ~ 3.3.5. I plan to remove the tarballs from 3.3.1 to
3.3.4 and leave only 3.3.5 (and the upcoming 3.3.6). Shout out if you have
something depending on the old release tarballs (you shouldn't).

Other release lines (2.10, 3.2) have two release tarballs each, which is
good. I'll leave it that way.

Weichiu


DockerHub admin for Apache Hadoop

2023-06-22 Thread Wei-Chiu Chuang
Ayush and I have acquired the DockerHub admin privilege for the Hadoop
project in order to facilitate the release of Hadoop 3.3.6.

Apache Infra allows only two seats per project. So if you need something,
let Ayush and me know and we will make it happen for you. If you are a
Docker guru and can't help but want to do more to make Hadoop easier and
better on DockerHub, feel free to let me know! I'm happy to give away my
seat (PMC member only).

https://cwiki.apache.org/confluence/display/HADOOP2/HowToRelease#HowToRelease-Dockerimages

https://infra.apache.org/docker-hub-policy.html

Best Regards,
Weichiu


Re: [VOTE] Release Apache Hadoop 3.3.6 RC1

2023-06-21 Thread Wei-Chiu Chuang
I am using Maven 3.6.3 on Mac (x86), JDK 1.8.0_341
No issue for me.

On Wed, Jun 21, 2023 at 5:48 AM Xiaoqiao He  wrote:

> Addendum:
> A. Build the release from sources using: `mvn clean install package
> -DskipTests=true -Dmaven.javadoc.skip=true`
> B. It works well when using the same command to build from the
> source branch trunk.
>
> On Wed, Jun 21, 2023 at 8:44 PM Xiaoqiao He  wrote:
>
> > Hi,
> >
> > I hit a fatal error when building from source on my local macOS machine.
> > It reproduces consistently.
> > I am not sure whether it is related to my local environment. I tried to
> > dig into it, but have no conclusion right now.
> > I will report back once I find the cause.
> >
> > The system environment is appended below; please refer to the attachment
> > for more stack information.
> > OS: Bsd
> > uname: Darwin 21.2.0 Darwin Kernel Version 21.2.0: Sun Nov 28
> > 20:28:54 PST 2021; root:xnu-8019.61.5~1/RELEASE_X86_64 x86_64
> > rlimit: STACK 8192k, CORE 0k, NPROC 2784, NOFILE 10240, AS infinity
> > load average: 13.99 12.30 8.96
> >
> > CPU:total 12 (initial active 12) (6 cores per cpu, 2 threads per core)
> > family 6 model 158 stepping 10, cmov, cx8, fxsr, mmx, sse, sse2, sse3,
> > ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, 3dnowpref,
> > lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx
> >
> > vm_info: Java HotSpot(TM) 64-Bit Server VM (25.202-b08) for bsd-amd64 JRE
> > (1.8.0_202-b08), built on Dec 15 2018 20:16:16 by "java_re" with gcc 4.2.1
> > (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)
> >
> > mvn version: Apache Maven 3.6.0
> >
> > Best Regards,
> > - He Xiaoqiao
> >
> >
> > On Wed, Jun 21, 2023 at 10:09 AM Tak Lon (Stephen) Wu
> > wrote:
> >
> >> +1 (non-binding), and thanks a lot for driving the vote.
> >>
> >> * Signature of sources and binaries: ok
> >> * Checksum of sources and binaries: ok
> >> * Rat check (1.8.0_362): okie
> >>  - mvn clean apache-rat:check
> >> * Built from source (1.8.0_362): ok
> >>  - mvn clean install -DskipTests
> >> * Run Pseudo-Distributed mode with HDFS and YARN (1.8.0_362): ok
> >> * Run Shell command (mkdir/put/ls/get) (1.8.0_362) : ok
> >> * Run MR examples applications and check the result (1.8.0_362): ok
> >>  - bin/hadoop jar
> >> share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar grep input
> >> output 'dfs[a-z.]+'
> >>
> >> -Stephen
> >>
> >> On Tue, Jun 20, 2023 at 6:56 PM Masatake Iwasaki <
> >> iwasak...@oss.nttdata.com>
> >> wrote:
> >>
> >> > +1
> >> >
> >> > + verified the signature and checksum of the source tarball.
> >> > + built from the source tarball on Rocky Linux 8 (x86_64) and OpenJDK 8
> >> > with native profile enabled.
> >> >   + launched pseudo distributed cluster including kms and httpfs with
> >> > Kerberos and SSL enabled.
> >> >   + created encryption zone, put and read files via httpfs.
> >> > + built RPM packages by Bigtop (modified to use ZooKeeper 3.6) on Rocky
> >> > Linux 8 (x86_64).
> >> >   + built HBase and Hive against Hadoop 3.3.6.
> >> >   + ran smoke-tests of hdfs, yarn, mapreduce, hbase and hive.
> >> > + skimmed the contents of site documentation.
> >> >
> >> > Thanks,
> >> > Masatake Iwasaki
> >> >
> >> > On 2023/06/21 8:07, Wei-Chiu Chuang wrote:
> >> > > Bumping this thread to the top.
> >> > > If you are verifying the release, please vote on this thread. RC0 and RC1
> >> > > are exactly the same. The only material difference is the Changelog.
> >> > >
> >> > > Thanks!!
> >> > >
> >> > > On Sun, Jun 18, 2023 at 5:52 PM Wei-Chiu Chuang  >
> >> > wrote:
> >> > >
> >> > >> I am inviting anyone to try and vote on this release candidate.
> >> > >>
> >> > >> Note:
> >> > >> This is exactly the same as RC0, except the CHANGELOG.
> >> > >>
> >> > >> The RC is available at:
> >> > >> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/ (for amd64)
> >> > >> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-arm64/ (for arm64)
> >> > >>
> >> > >> Git tag: release-3.3.6-RC1
> >> > >> https://github.com/apache/hadoop/releases/tag/releas

Re: [VOTE] Release Apache Hadoop 3.3.6 RC1

2023-06-20 Thread Wei-Chiu Chuang
Bumping this thread to the top.
If you are verifying the release, please vote on this thread. RC0 and RC1
are exactly the same. The only material difference is the Changelog.

Thanks!!

On Sun, Jun 18, 2023 at 5:52 PM Wei-Chiu Chuang  wrote:

> I am inviting anyone to try and vote on this release candidate.
>
> Note:
> This is exactly the same as RC0, except the CHANGELOG.
>
> The RC is available at:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/ (for amd64)
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-arm64/ (for arm64)
>
> Git tag: release-3.3.6-RC1
> https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC1
>
> Maven artifacts were built on an x86 machine and are staged at
> https://repository.apache.org/content/repositories/orgapachehadoop-1380/
>
> My public key:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Changelog:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/CHANGELOG.md
>
> Release notes:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/RELEASENOTES.md
>
> This is a relatively small release (by Hadoop standard) containing about
> 120 commits.
> Please give it a try, this RC vote will run for 7 days.
>
>
> Feature highlights:
>
> SBOM artifacts
> 
> Starting from this release, Hadoop publishes Software Bill of Materials
> (SBOM) using
> CycloneDX Maven plugin. For more information about SBOM, please go to
> [SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
>
> HDFS RBF: RDBMS based token storage support
> 
> HDFS Router-Router Based Federation now supports storing delegation tokens
> on MySQL,
> [HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
> which improves token operation through over the original Zookeeper-based
> implementation.
>
>
> New File System APIs
> 
> [HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved
> a number of
> HDFS-specific APIs to Hadoop Common to make it possible for certain
> applications that
> depend on HDFS semantics to run on other Hadoop compatible file systems.
>
> In particular, recoverLease() and isFileClosed() are exposed through
> LeaseRecoverable
> interface. While setSafeMode() is exposed through SafeMode interface.
>
>
>


[VOTE] Release Apache Hadoop 3.3.6 RC1

2023-06-18 Thread Wei-Chiu Chuang
I am inviting anyone to try and vote on this release candidate.

Note:
This is exactly the same as RC0, except the CHANGELOG.

The RC is available at:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/ (for amd64)
https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-arm64/ (for arm64)

Git tag: release-3.3.6-RC1
https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC1

Maven artifacts were built on an x86 machine and are staged at
https://repository.apache.org/content/repositories/orgapachehadoop-1380/

My public key:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

Changelog:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/CHANGELOG.md

Release notes:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/RELEASENOTES.md

This is a relatively small release (by Hadoop standards) containing about
120 commits.
Please give it a try; this RC vote will run for 7 days.


Feature highlights:

SBOM artifacts

Starting from this release, Hadoop publishes a Software Bill of Materials
(SBOM) using the CycloneDX Maven plugin. For more information about SBOM, see
[SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).

HDFS RBF: RDBMS based token storage support

HDFS Router-Based Federation now supports storing delegation tokens
in MySQL
([HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)),
which improves token operation throughput over the original ZooKeeper-based
implementation.


New File System APIs

[HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved a
number of HDFS-specific APIs to Hadoop Common to make it possible for certain
applications that depend on HDFS semantics to run on other Hadoop compatible
file systems.

In particular, recoverLease() and isFileClosed() are exposed through the
LeaseRecoverable interface, while setSafeMode() is exposed through the
SafeMode interface.


Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-17 Thread Wei-Chiu Chuang
I was going to do another RC in case something comes up.
But it looks like the only thing that needs to be fixed is the Changelog.


HADOOP-18596 <https://issues.apache.org/jira/browse/HADOOP-18596> and
HADOOP-18633 <https://issues.apache.org/jira/browse/HADOOP-18633>
are related to cloud store semantics, and I don't want to make a judgement
call on them. As far as I can tell, their effect can be addressed by
supplying a config option in the application code.
It looks like the feature improves fault tolerance by ensuring files are
synchronized if modification time is different between the source and
destination. So to me it's the better behavior.
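
To illustrate what I mean, here is a simplified sketch of the skip
decision as I understand it. It is not the actual DistCp code, which
considers more than this (checksums, for one). The compareModTime flag is
a hypothetical stand-in for the config option mentioned above, and
UpdateSkipCheck and shouldCopy are made-up names.

```java
// Simplified, hypothetical model of an "update" style copy decision.
final class UpdateSkipCheck {
  private UpdateSkipCheck() {}

  /**
   * Returns true if the file should be copied again. A length mismatch
   * always triggers a copy; a modification time mismatch triggers a copy
   * only when compareModTime is enabled.
   */
  static boolean shouldCopy(long srcLen, long srcMtime,
                            long dstLen, long dstMtime,
                            boolean compareModTime) {
    if (srcLen != dstLen) {
      return true;
    }
    return compareModTime && srcMtime != dstMtime;
  }
}
```

With the flag on, a same-length file whose modification time differs is
copied rather than skipped, which would change the skipped and copied
counts that update-mode tests assert on.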

I can make an RC1 over the weekend to fix the Changelog, but that's probably
the only change it's going to have.

On Sat, Jun 17, 2023 at 2:00 AM Xiaoqiao He  wrote:

> Thanks Wei-Chiu for driving this release. The next RC will be prepared,
> right?
> If so, I would like to try and vote on the next RC.
> I just noticed that some JIRAs are not included and that some PRs need to
> be reverted to pass HBase verification, as mentioned above.
>
> Best Regards,
> - He Xiaoqiao
>
>
> On Fri, Jun 16, 2023 at 9:20 AM Wei-Chiu Chuang
>  wrote:
>
> > Overall so far so good.
> >
> > hadoop-api-shim:
> > built, tested successfully.
> >
> > cloudstore:
> > built successfully.
> >
> > Spark:
> > built successfully. Passed hadoop-cloud tests.
> >
> > Ozone:
> > One test failure due to unrelated Ozone issue. This test is being disabled
> > in the latest Ozone code.
> >
> > org.apache.hadoop.hdds.utils.NativeLibraryNotLoadedException: Unable
> > to load library ozone_rocksdb_tools from both java.library.path &
> > resource file libozone_rocksdb_t
> > ools.so from jar.
> > at
> >
> org.apache.hadoop.hdds.utils.db.managed.ManagedSSTDumpTool.(ManagedSSTDumpTool.java:49)
> >
> >
> > Google gcs:
> > There are two test failures. The tests were added recently by
> HADOOP-18724
> > <https://issues.apache.org/jira/browse/HADOOP-18724> in Hadoop 3.3.6.
> This
> > is okay. Not production code problem. Can be addressed in GCS code.
> >
> > [ERROR] Errors:
> > [ERROR]
> >
> >
> TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testFloatingPointLength:403
> > » IllegalArgument Unknown mandatory key for gs://fake-in-memory-test-buck
> > et/contract-test/testFloatingPointLength "fs.option.openfile.length"
> > [ERROR]
> >
> >
> TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testOpenFileApplyAsyncRead:341
> > » IllegalArgument Unknown mandatory key for gs://fake-in-memory-test-b
> > ucket/contract-test/testOpenFileApplyAsyncRead
> "fs.option.openfile.length"
> >
> >
> >
> >
> >
> > On Wed, Jun 14, 2023 at 5:01 PM Wei-Chiu Chuang 
> > wrote:
> >
> > > The hbase-filesystem tests passed after reverting HADOOP-18596
> > > <https://issues.apache.org/jira/browse/HADOOP-18596> and HADOOP-18633
> > > <https://issues.apache.org/jira/browse/HADOOP-18633> from my local
> tree.
> > > So I think it's a matter of the default behavior being changed. It's
> not
> > > the end of the world. I think we can address it by adding an
> incompatible
> > > change flag and a release note.
> > >
> > > On Wed, Jun 14, 2023 at 3:55 PM Wei-Chiu Chuang 
> > > wrote:
> > >
> > >> Cross-referenced git history and JIRA. The Changelog needs some updates.
> > >>
> > >> Not in the release
> > >>
> > >>    1. HDFS-16858 <https://issues.apache.org/jira/browse/HDFS-16858>
> > >>    2. HADOOP-18532 <https://issues.apache.org/jira/browse/HADOOP-18532>
> > >>    3. HDFS-16861 <https://issues.apache.org/jira/browse/HDFS-16861>
> > >>    4. HDFS-16866 <https://issues.apache.org/jira/browse/HDFS-16866>
> > >>    5. HADOOP-18320 <https://issues.apache.org/jira/browse/HADOOP-18320>
> > >>
> > >> Updated the fix version. Will generate a new Changelog in the next RC.
> > >>
> > >> Was able to build HBase and hbase-filesystem without any code change.
> > >>
> > >> hbase has one unit test failure. This one is reproducible even with
> > >> Hadoop 3.3.5, so maybe a red 

Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-15 Thread Wei-Chiu Chuang
Overall so far so good.

hadoop-api-shim:
built, tested successfully.

cloudstore:
built successfully.

Spark:
built successfully. Passed hadoop-cloud tests.

Ozone:
One test failure due to unrelated Ozone issue. This test is being disabled
in the latest Ozone code.

org.apache.hadoop.hdds.utils.NativeLibraryNotLoadedException: Unable
to load library ozone_rocksdb_tools from both java.library.path &
resource file libozone_rocksdb_tools.so from jar.
at org.apache.hadoop.hdds.utils.db.managed.ManagedSSTDumpTool.(ManagedSSTDumpTool.java:49)


Google gcs:
There are two test failures. The tests were added recently by HADOOP-18724
<https://issues.apache.org/jira/browse/HADOOP-18724> in Hadoop 3.3.6. This
is okay: it is not a production code problem and can be addressed in the GCS
code.

[ERROR] Errors:
[ERROR] TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testFloatingPointLength:403
» IllegalArgument Unknown mandatory key for
gs://fake-in-memory-test-bucket/contract-test/testFloatingPointLength "fs.option.openfile.length"
[ERROR] TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testOpenFileApplyAsyncRead:341
» IllegalArgument Unknown mandatory key for
gs://fake-in-memory-test-bucket/contract-test/testOpenFileApplyAsyncRead "fs.option.openfile.length"





On Wed, Jun 14, 2023 at 5:01 PM Wei-Chiu Chuang  wrote:

> The hbase-filesystem tests passed after reverting HADOOP-18596
> <https://issues.apache.org/jira/browse/HADOOP-18596> and HADOOP-18633
> <https://issues.apache.org/jira/browse/HADOOP-18633> from my local tree.
> So I think it's a matter of the default behavior being changed. It's not
> the end of the world. I think we can address it by adding an incompatible
> change flag and a release note.
>
> On Wed, Jun 14, 2023 at 3:55 PM Wei-Chiu Chuang 
> wrote:
>
>> Cross-referenced git history and JIRA. The Changelog needs some updates.
>>
>> Not in the release
>>
>>    1. HDFS-16858 <https://issues.apache.org/jira/browse/HDFS-16858>
>>    2. HADOOP-18532 <https://issues.apache.org/jira/browse/HADOOP-18532>
>>    3. HDFS-16861 <https://issues.apache.org/jira/browse/HDFS-16861>
>>    4. HDFS-16866 <https://issues.apache.org/jira/browse/HDFS-16866>
>>    5. HADOOP-18320 <https://issues.apache.org/jira/browse/HADOOP-18320>
>>
>> Updated the fix version. Will generate a new Changelog in the next RC.
>>
>> Was able to build HBase and hbase-filesystem without any code change.
>>
>> hbase has one unit test failure. This one is reproducible even with
>> Hadoop 3.3.5, so maybe a red herring. Local env or something.
>>
>> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
>> 9.007 s <<< FAILURE! - in
>> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker
>> [ERROR]
>> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness
>>  Time elapsed: 3.13 s  <<< ERROR!
>> java.lang.OutOfMemoryError: Java heap space
>> at
>> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker$RandomTestData.(TestSyncTimeRangeTracker.java:91)
>> at
>> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness(TestSyncTimeRangeTracker.java:156)
>>
>> hbase-filesystem has three test failures in TestHBOSSContractDistCp, and
>> is not reproducible with Hadoop 3.3.5.
>> [ERROR] Failures: [ERROR]
>> TestHBOSSContractDistCp>AbstractContractDistCpTest.testDistCpUpdateCheckFileSkip:976->Assert.fail:88
>> 10 errors in file of length 10
>> [ERROR]
>> TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureNoChange:270->AbstractContractDistCpTest.assertCounterInRange:290->Assert.assertTrue:41->Assert.fail:88
>> Files Skipped value 0 too below minimum 1
>> [ERROR]
>> TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureToRemote:259->AbstractContractDistCpTest.distCpUpdateDeepDirectoryStructure:334->AbstractContractDistCpTest.assertCounterInRange:294->Assert.assertTrue:41->Assert.fail:88
>> Files Copied value 2 above maximum 1
>> [INFO]
>> [ERROR] Tests run: 240, Failures: 3, Errors: 0, Skipped: 58
>>
>>
>> Ozone
>> test in progress. Will report back.
>>
>>
>> On Tue, Jun 13, 2023 at 11:27 PM Wei-Chiu Chuang 
>> wrote:
>>
>>> I am inviting anyone to try and vote on this release candidate.
>>>
>>> Note:
>>

Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-15 Thread Wei-Chiu Chuang
It's branching off branch-3.3

On Thu, Jun 15, 2023 at 3:18 AM Steve Loughran 
wrote:

> Which branch is -3.3.6 off? 3.3.5 or 3.3?
>
> I'm travelling for the next few days and unlikely to be able to test this;
> will do my best
>
> On Wed, 14 Jun 2023 at 07:27, Wei-Chiu Chuang  wrote:
>
> > I am inviting anyone to try and vote on this release candidate.
> >
> > Note:
> > This is built off branch-3.3.6 plus PR#5741 (aws sdk update) and PR#5740
> > (LICENSE file update)
> >
> > The RC is available at:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/ (for amd64)
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-arm64/ (for arm64)
> >
> > Git tag: release-3.3.6-RC0
> > https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC0
> >
> > Maven artifacts is built by x86 machine and are staged at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1378/
> >
> > My public key:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > Changelog:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/CHANGELOG.md
> >
> > Release notes:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/RELEASENOTES.md
> >
> > This is a relatively small release (by Hadoop standard) containing about
> > 120 commits.
> > Please give it a try, this RC vote will run for 7 days.
> >
> >
> > Feature highlights:
> >
> > SBOM artifacts
> > 
> > Starting from this release, Hadoop publishes Software Bill of Materials
> > (SBOM) using
> > CycloneDX Maven plugin. For more information about SBOM, please go to
> > [SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
> >
> > HDFS RBF: RDBMS based token storage support
> > 
> > HDFS Router-Router Based Federation now supports storing delegation
> tokens
> > on MySQL,
> > [HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
> > which improves token operation through over the original Zookeeper-based
> > implementation.
> >
> >
> > New File System APIs
> > 
> > [HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671)
> moved a
> > number of
> > HDFS-specific APIs to Hadoop Common to make it possible for certain
> > applications that
> > depend on HDFS semantics to run on other Hadoop compatible file systems.
> >
> > In particular, recoverLease() and isFileClosed() are exposed through
> > LeaseRecoverable
> > interface. While setSafeMode() is exposed through SafeMode interface.
> >
>


Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-14 Thread Wei-Chiu Chuang
The hbase-filesystem tests passed after reverting HADOOP-18596
<https://issues.apache.org/jira/browse/HADOOP-18596> and HADOOP-18633
<https://issues.apache.org/jira/browse/HADOOP-18633> from my local tree.
So I think it's a matter of the default behavior being changed. It's not
the end of the world. I think we can address it by adding an incompatible
change flag and a release note.

On Wed, Jun 14, 2023 at 3:55 PM Wei-Chiu Chuang  wrote:

> Cross-referenced git history and JIRA. The Changelog needs some updates.
>
> Not in the release
>
>    1. HDFS-16858 <https://issues.apache.org/jira/browse/HDFS-16858>
>    2. HADOOP-18532 <https://issues.apache.org/jira/browse/HADOOP-18532>
>    3. HDFS-16861 <https://issues.apache.org/jira/browse/HDFS-16861>
>    4. HDFS-16866 <https://issues.apache.org/jira/browse/HDFS-16866>
>    5. HADOOP-18320 <https://issues.apache.org/jira/browse/HADOOP-18320>
>
> Updated the fix version. Will generate a new Changelog in the next RC.
>
> Was able to build HBase and hbase-filesystem without any code change.
>
> hbase has one unit test failure. This one is reproducible even with Hadoop
> 3.3.5, so maybe a red herring. Local env or something.
>
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 9.007 s <<< FAILURE! - in
> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker
> [ERROR]
> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness
>  Time elapsed: 3.13 s  <<< ERROR!
> java.lang.OutOfMemoryError: Java heap space
> at org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker$RandomTestData.<init>(TestSyncTimeRangeTracker.java:91)
> at org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness(TestSyncTimeRangeTracker.java:156)
>
> hbase-filesystem has three test failures in TestHBOSSContractDistCp, and
> they are not reproducible with Hadoop 3.3.5.
> [ERROR] Failures: [ERROR]
> TestHBOSSContractDistCp>AbstractContractDistCpTest.testDistCpUpdateCheckFileSkip:976->Assert.fail:88
> 10 errors in file of length 10
> [ERROR]
> TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureNoChange:270->AbstractContractDistCpTest.assertCounterInRange:290->Assert.assertTrue:41->Assert.fail:88
> Files Skipped value 0 too below minimum 1
> [ERROR]
> TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureToRemote:259->AbstractContractDistCpTest.distCpUpdateDeepDirectoryStructure:334->AbstractContractDistCpTest.assertCounterInRange:294->Assert.assertTrue:41->Assert.fail:88
> Files Copied value 2 above maximum 1
> [INFO]
> [ERROR] Tests run: 240, Failures: 3, Errors: 0, Skipped: 58
>
>
> Ozone test in progress. Will report back.
>
>
> On Tue, Jun 13, 2023 at 11:27 PM Wei-Chiu Chuang 
> wrote:
>
>> I am inviting anyone to try and vote on this release candidate.
>>
>> Note:
>> This is built off branch-3.3.6 plus PR#5741 (aws sdk update) and PR#5740
>> (LICENSE file update)
>>
>> The RC is available at:
>> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/ (for amd64)
>> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-arm64/ (for arm64)
>>
>> Git tag: release-3.3.6-RC0
>> https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC0
>>
>> Maven artifacts are built on an x86 machine and are staged at
>> https://repository.apache.org/content/repositories/orgapachehadoop-1378/
>>
>> My public key:
>> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>>
>> Changelog:
>> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/CHANGELOG.md
>>
>> Release notes:
>> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/RELEASENOTES.md
>>
>> This is a relatively small release (by Hadoop standards) containing about
>> 120 commits.
>> Please give it a try; this RC vote will run for 7 days.
>>
>>
>> Feature highlights:
>>
>> SBOM artifacts
>> 
>> Starting from this release, Hadoop publishes Software Bill of Materials
>> (SBOM) using
>> CycloneDX Maven plugin. For more information about SBOM, please go to
>> [SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
>>
>> HDFS RBF: RDBMS based token storage support
>> 
>> HDFS Router-Based Federation now supports storing delegation
>> tokens

Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-14 Thread Wei-Chiu Chuang
Cross referenced git history and jira. Changelog needs some update

Not in the release

   1. HDFS-16858 <https://issues.apache.org/jira/browse/HDFS-16858>
   2. HADOOP-18532 <https://issues.apache.org/jira/browse/HADOOP-18532>
   3. HDFS-16861 <https://issues.apache.org/jira/browse/HDFS-16861>
   4. HDFS-16866 <https://issues.apache.org/jira/browse/HDFS-16866>
   5. HADOOP-18320 <https://issues.apache.org/jira/browse/HADOOP-18320>

Updated the fix versions. Will generate a new Changelog in the next RC.

Was able to build HBase and hbase-filesystem without any code change.

hbase has one unit test failure. This one is reproducible even with Hadoop
3.3.5, so maybe a red herring. Local env or something.

[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
9.007 s <<< FAILURE! - in
org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker
[ERROR]
org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness
 Time elapsed: 3.13 s  <<< ERROR!
java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker$RandomTestData.<init>(TestSyncTimeRangeTracker.java:91)
at org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness(TestSyncTimeRangeTracker.java:156)

hbase-filesystem has three test failures in TestHBOSSContractDistCp, and they
are not reproducible with Hadoop 3.3.5.
[ERROR] Failures: [ERROR]
TestHBOSSContractDistCp>AbstractContractDistCpTest.testDistCpUpdateCheckFileSkip:976->Assert.fail:88
10 errors in file of length 10
[ERROR]
TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureNoChange:270->AbstractContractDistCpTest.assertCounterInRange:290->Assert.assertTrue:41->Assert.fail:88
Files Skipped value 0 too below minimum 1
[ERROR]
TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureToRemote:259->AbstractContractDistCpTest.distCpUpdateDeepDirectoryStructure:334->AbstractContractDistCpTest.assertCounterInRange:294->Assert.assertTrue:41->Assert.fail:88
Files Copied value 2 above maximum 1
[INFO]
[ERROR] Tests run: 240, Failures: 3, Errors: 0, Skipped: 58


Ozone test in progress. Will report back.


On Tue, Jun 13, 2023 at 11:27 PM Wei-Chiu Chuang  wrote:

> I am inviting anyone to try and vote on this release candidate.
>
> Note:
> This is built off branch-3.3.6 plus PR#5741 (aws sdk update) and PR#5740
> (LICENSE file update)
>
> The RC is available at:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/ (for amd64)
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-arm64/ (for arm64)
>
> Git tag: release-3.3.6-RC0
> https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC0
>
> Maven artifacts are built on an x86 machine and are staged at
> https://repository.apache.org/content/repositories/orgapachehadoop-1378/
>
> My public key:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Changelog:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/CHANGELOG.md
>
> Release notes:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/RELEASENOTES.md
>
> This is a relatively small release (by Hadoop standards) containing about
> 120 commits.
> Please give it a try; this RC vote will run for 7 days.
>
>
> Feature highlights:
>
> SBOM artifacts
> 
> Starting from this release, Hadoop publishes Software Bill of Materials
> (SBOM) using
> CycloneDX Maven plugin. For more information about SBOM, please go to
> [SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
>
> HDFS RBF: RDBMS based token storage support
> 
> HDFS Router-Based Federation now supports storing delegation tokens
> on MySQL,
> [HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
> which improves token operation throughput over the original ZooKeeper-based
> implementation.
>
>
> New File System APIs
> 
> [HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved
> a number of
> HDFS-specific APIs to Hadoop Common to make it possible for certain
> applications that
> depend on HDFS semantics to run on other Hadoop compatible file systems.
>
> In particular, recoverLease() and isFileClosed() are exposed through the
> LeaseRecoverable interface, while setSafeMode() is exposed through the
> SafeMode interface.
>
>
>


[VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-14 Thread Wei-Chiu Chuang
I am inviting anyone to try and vote on this release candidate.

Note:
This is built off branch-3.3.6 plus PR#5741 (aws sdk update) and PR#5740
(LICENSE file update)

The RC is available at:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/ (for amd64)
https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-arm64/ (for arm64)

Git tag: release-3.3.6-RC0
https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC0

Maven artifacts are built on an x86 machine and are staged at
https://repository.apache.org/content/repositories/orgapachehadoop-1378/

My public key:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

Changelog:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/CHANGELOG.md

Release notes:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/RELEASENOTES.md

This is a relatively small release (by Hadoop standards) containing about
120 commits.
Please give it a try; this RC vote will run for 7 days.


Feature highlights:

SBOM artifacts

Starting from this release, Hadoop publishes Software Bill of Materials
(SBOM) using
CycloneDX Maven plugin. For more information about SBOM, please go to
[SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).

HDFS RBF: RDBMS based token storage support

HDFS Router-Based Federation now supports storing delegation tokens
on MySQL,
[HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
which improves token operation throughput over the original ZooKeeper-based
implementation.


New File System APIs

[HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved a
number of
HDFS-specific APIs to Hadoop Common to make it possible for certain
applications that
depend on HDFS semantics to run on other Hadoop compatible file systems.

In particular, recoverLease() and isFileClosed() are exposed through the
LeaseRecoverable interface, while setSafeMode() is exposed through the
SafeMode interface.
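
A minimal sketch of how an application might probe for the lease-recovery
capability described above before relying on it. The interface below is a
simplified stand-in and not the Hadoop class: the method names come from the
announcement, while the String path parameter, the two store classes and
ensureClosed() are hypothetical and exist only to keep the sketch free of
Hadoop dependencies.

```java
import java.io.IOException;

final class LeaseRecoverySketch {

    // Stand-in for the LeaseRecoverable interface named in the announcement.
    // The String path parameter is an assumption made for this sketch.
    interface LeaseRecoverable {
        boolean recoverLease(String path) throws IOException;
        boolean isFileClosed(String path) throws IOException;
    }

    // Hypothetical file system without lease recovery support.
    static final class PlainStore { }

    // Hypothetical file system with lease recovery; a file counts as closed
    // once recovery has been requested.
    static final class LeasingStore implements LeaseRecoverable {
        private boolean closed;

        @Override
        public boolean recoverLease(String path) {
            closed = true;
            return true;
        }

        @Override
        public boolean isFileClosed(String path) {
            return closed;
        }
    }

    // Capability probe: only call lease recovery when the store offers it.
    static boolean ensureClosed(Object fs, String path) {
        if (!(fs instanceof LeaseRecoverable)) {
            return false; // caller has to fall back to another strategy
        }
        LeaseRecoverable lr = (LeaseRecoverable) fs;
        try {
            return lr.isFileClosed(path) || lr.recoverLease(path);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(ensureClosed(new LeasingStore(), "/wal/000001")); // true
        System.out.println(ensureClosed(new PlainStore(), "/wal/000001"));   // false
    }
}
```

The point of the instanceof check is that code written against the interface
no longer needs to know whether the underlying file system is HDFS.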


Re: [DISCUSS] Hadoop 3.3.6 release planning

2023-06-14 Thread Wei-Chiu Chuang
Thanks a lot for the help to move this release forward.

Some status update:

branch-3.3.6 has just been forked from branch-3.3.
Starting RC0 now.

I would like to add text to highlight the big features and improvements.
Let me know if you have something that's included in this release that you
would also like to highlight. @Steve Loughran, I am aware the
S3A prefetch code is in this release, but I am not sure if it is in a state
where we can make it public. I'll let you decide.

SBOM artifacts

Starting from this release, Hadoop publishes Software Bill of Materials
(SBOM) using
CycloneDX Maven plugin. For more information about SBOM, please go to
[SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).

HDFS RBF: RDBMS based token storage support

HDFS Router-Based Federation now supports storing delegation tokens
on MySQL,
[HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
which improves token operation throughput over the original ZooKeeper-based
implementation.


New File System APIs

[HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved a
number of
HDFS-specific APIs to Hadoop Common to make it possible for certain
applications that
depend on HDFS semantics to run on other Hadoop compatible file systems.

In particular, recoverLease() and isFileClosed() are exposed through the
LeaseRecoverable interface, while setSafeMode() is exposed through the
SafeMode interface.



On Thu, Jun 8, 2023 at 11:11 AM Wei-Chiu Chuang  wrote:

> Thanks for the comments.
>
> Looking at jiras fixed in 3.3.6 and 3.3.9 (my bad, forgot that most
> commits landing in branch-3.3 were marked 3.3.9), most are okay. We have
> about 119 commits so it's manageable.
>
> I am planning to cut 3.3.6 out of branch-3.3 later today. Anything open
> that is still targeting 3.3.6 will be cherry picked one by one.
> I will also bulk-update jiras fixed in 3.3.9 to 3.3.6.
>
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12337047
> As of now, I am tracking 4 jiras that are targeting 3.3.6 -- update Kerby,
> fix hadoop shaded client to support Spark history server, a small
> regression in HDFS (probably will move this out), protobuf 2.5 dependency
> change.
>
> I did a dry run of the RC. The environment is set up for me and everything
> works, so I expect to have an RC ready to vote on soon. If I can move out some
> of the jiras or help get them resolved, I can probably have an RC0 on Monday
> for the vote.
>
> On Mon, May 8, 2023 at 2:01 PM Ayush Saxena  wrote:
>
>> That openssl change ain't a blocker now from my side, that ABFS-Jdk-17
>> stuff got sorted out, Steve knew a way out
>>
>> On Sat, 6 May 2023 at 00:51, Ayush Saxena  wrote:
>> >
>> > Thanx Wei-Chiu for the initiative, Good to have quick releases :)
>> >
>> > With my Hive developer hat on, I would like to bring some stuff up for
>> > consideration(feel free to say no, if it is beyond scope or feels even
>> > a bit unsafe, don't want to mess up the release)
>> >
> >> > * HADOOP-18662: ListFiles with recursive fails with FNF : This broke
> >> > compaction in Hive, bothers only with HDFS though. There is a
> >> > workaround for that, so if it doesn't feel safe, no issues; or if some
> >> > improvements are suggested, I can quickly do that :)
>> >
>> > * HADOOP-17649: Update wildfly openssl to 2.1.3.Final. Maybe not 2.1.3
>> > but if it works and is safe then to 2.2.5. I got flagged today that
>> > this openssl creates a bit of mess with JDK-17 for Hive with ABFS I
>> > think(need to dig in more),
>> >
>> > Now for the dependency upgrades:
>> >
>> > A big NO to Jackson, that ain't safe and the wounds are still fresh,
>> > it screwed the 3.3.3 release for many projects. So, let's not get into
> >> > that. In fact, anything that touches those shaded jars is risky; some
> >> > package-json exclusion also created a mess recently. So, let's not
> >> > touch them at all, especially when we have so little time.
>> >
>> > Avoid anything around Jetty upgrade, I have selfish reasons for that.
>> > Jetty messes something up with Hbase and Hive has a dependency on
>> > Hbase, and it is crazy, in case interested [1]. So, any upgrade to
>> > Jetty will block hive from upgrading Hadoop as of today. But that is a
>> > selfish reason and just up for consideration. Go ahead if necessary. I
>> > just wanted to let folks know
>> >
>> >
>> > Apart from the Jackson stuff, everything is suggestive in nature, your
>> > call feel free to ignore.
>> >
>> > @Xiaoqiao He , maybe pulling in all those 1

Re: [DISCUSS] Hadoop 3.3.6 release planning

2023-06-08 Thread Wei-Chiu Chuang
> > > >
> > > > If we should consider both 3.3.6 and 3.3.9 (which is from the release-3.3.5
> > > > discussion)[1] for this release line?
> > > > I tried to query with `project in (HDFS, YARN, HADOOP, MAPREDUCE) AND
> > > > fixVersion in (3.3.6, 3.3.9)`[2];
> > > > there are more than a hundred jiras now.
> > > >
> > > > Best Regards,
> > > > - He Xiaoqiao
> > > >
> > > > [1] https://lists.apache.org/thread/kln96frt2tcg93x6ht99yck9m7r9qwxp
> > > > [2] https://issues.apache.org/jira/browse/YARN-11482?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20fixVersion%20in%20(3.3.6%2C%203.3.9)
> > > >
> > > > On Fri, May 5, 2023 at 1:19 AM Wei-Chiu Chuang wrote:
> > > >
> > > > > Hi community,
> > > > >
> > > > > I'd like to kick off the discussion around the Hadoop 3.3.6 release plan.
> > > > >
> > > > > I'm being selfish but my intent for 3.3.6 is to have the new APIs in
> > > > > HADOOP-18671 <https://issues.apache.org/jira/browse/HADOOP-18671> added so
> > > > > we can have HBase adopt this new API. Other than that, perhaps
> > > > > third-party dependency updates.
> > > > >
> > > > > If you have open items to be added in the coming weeks, please add 3.3.6 to
> > > > > the target release version. Right now I am only seeing three open jiras
> > > > > targeting 3.3.6.
> > > > >
> > > > > I imagine this is going to be a small release as 3.3.5 (hat tip to Steve)
> > > > > was only made two months back, and so far only 8 jiras were resolved in the
> > > > > branch-3.3 line.
> > > > >
> > > > > Best,
> > > > > Weichiu
> > > > >
> > > >
>
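
The JQL filter quoted above has to be percent-encoded before it can be put
into a JIRA link like [2]. A minimal sketch using only the JDK; the class and
method names are illustrative, and the base URL is simply the one that appears
in the quoted message. Note that URLEncoder also escapes the parentheses,
which the quoted link leaves as-is; both forms decode to the same query.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class JqlUrlSketch {

    // Percent-encodes a JQL string for use as the "jql" query parameter.
    static String encodeJql(String jql) {
        try {
            // URLEncoder writes spaces as '+'; a literal '+' in the input is
            // already escaped to %2B, so this replace only affects spaces.
            return URLEncoder.encode(jql, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    public static void main(String[] args) {
        String jql = "project in (HDFS, YARN, HADOOP, MAPREDUCE) AND fixVersion in (3.3.6, 3.3.9)";
        System.out.println("https://issues.apache.org/jira/browse/YARN-11482?jql=" + encodeJql(jql));
    }
}
```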


Re: Call for Presentations, Community Over Code 2023

2023-05-09 Thread Wei-Chiu Chuang
There's also a call for presentations for Community over Code Asia 2023

https://www.bagevent.com/event/cocasia-2023-EN
Happening Aug 18-20. CfP due by 6/6


On Tue, May 9, 2023 at 8:39 PM Ayush Saxena  wrote:

> Forwarding from dev@hadoop to the dev ML which we use.
>
> The actual mail lies here:
> https://www.mail-archive.com/dev@hadoop.apache.org/msg00160.html
>
> -Ayush
>
> On 2023/05/09 21:24:09 Rich Bowen wrote:
> > (Note: You are receiving this because you are subscribed to the dev@
> > list for one or more Apache Software Foundation projects.)
> >
> > The Call for Presentations (CFP) for Community Over Code (formerly
> > Apachecon) 2023 is open at
> > https://communityovercode.org/call-for-presentations/, and will close
> > Thu, 13 Jul 2023 23:59:59 GMT.
> >
> > The event will be held in Halifax, Canada, October 7-10, 2023.
> >
> > We welcome submissions on any topic related to the Apache Software
> > Foundation, Apache projects, or the communities around those projects.
> > We are specifically looking for presentations in the following
> > categories:
> >
> > Fintech
> > Search
> > Big Data, Storage
> > Big Data, Compute
> > Internet of Things
> > Groovy
> > Incubator
> > Community
> > Data Engineering
> > Performance Engineering
> > Geospatial
> > API/Microservices
> > Frameworks
> > Content Wrangling
> > Tomcat and httpd
> > Cloud and Runtime
> > Streaming
> > Sustainability
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: dev-h...@hadoop.apache.org
> >
> >
>
>
>


[DISCUSS] Hadoop 3.3.6 release planning

2023-05-04 Thread Wei-Chiu Chuang
Hi community,

I'd like to kick off the discussion around the Hadoop 3.3.6 release plan.

I'm being selfish but my intent for 3.3.6 is to have the new APIs in
HADOOP-18671 <https://issues.apache.org/jira/browse/HADOOP-18671> added so
we can have HBase adopt this new API. Other than that, perhaps
third-party dependency updates.

If you have open items to be added in the coming weeks, please add 3.3.6 to
the target release version. Right now I am only seeing three open jiras
targeting 3.3.6.

I imagine this is going to be a small release as 3.3.5 (hat tip to Steve)
was only made two months back, and so far only 8 jiras were resolved in the
branch-3.3 line.

Best,
Weichiu


Re: [DISCUSS] hadoop branch-3.3+ going to java11 only

2023-03-28 Thread Wei-Chiu Chuang
My random thoughts. Probably bad takes:

There are projects experimenting with JDK17 now.
JDK11 active support will end in 6 months. If it's already hard to migrate
from JDK8, why not retarget to JDK17?

On Tue, Mar 28, 2023 at 10:30 AM Ayush Saxena  wrote:

> I know of the Jersey upgrade as a blocker. Some folks were chasing that last
> year during 3.3.4 time; I don’t know where it is now. I didn’t see then what
> the problem was, but I remember there was some initial PR which did it for
> HDFS at least, so I never looked beyond that…
>
> I too had jdk-11 in my mind, but only for trunk. 3.4.x can stay as a java-11
> only branch maybe, but that is something to decide later, once we get the
> code sorted…
>
> -Ayush
>
> > On 28-Mar-2023, at 9:16 PM, Steve Loughran wrote:
> >
> > well, how about we flip the switch and get on with it.
> >
> > slf4j seems happy on java11,
> >
> > side issue, anyone seen test failures on zulu1.8; somehow my test run is
> > failing and i'm trying to work out whether it's a mismatch in command
> > line/ide jvm versions, or the 3.3.5 JARs have been built with an openjdk
> > version which requires IntBuffer implements an overridden method
> > IntBuffer rewind().
> >
> > java.lang.NoSuchMethodError: java.nio.IntBuffer.rewind()Ljava/nio/IntBuffer;
> >
> > at org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:341)
> > at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:308)
> > at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:257)
> > at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:202)
> > at java.io.DataInputStream.read(DataInputStream.java:149)
> >
> >> On Tue, 28 Mar 2023 at 15:52, Viraj Jasani wrote:
> >>
> >> IIRC some of the ongoing major dependency upgrades (log4j 1 to 2, jersey 1
> >> to 2 and junit 4 to 5) are blockers for java 11 compile + test stability.
> >>
> >> On Tue, Mar 28, 2023 at 4:55 AM Steve Loughran wrote:
> >>
> >>> Now that hadoop 3.3.5 is out, i want to propose something new
> >>>
> >>> we switch branch-3.3 and trunk to being java11 only
> >>>
> >>> 1. java 11 has been out for years
> >>> 2. oracle java 8 is no longer available under "premier support"; you
> >>> can't really get upgrades
> >>> https://www.oracle.com/java/technologies/java-se-support-roadmap.html
> >>> 3. openJDK 8 releases != oracle ones, and things you compile with them
> >>> don't always link to oracle java 8 (some classes in java.nio have added
> >>> more overrides)
> >>> 4. more and more libraries we want to upgrade to/bundle are java 11 only
> >>> 5. moving to java 11 would cut our yetus build workload in half, and
> >>> line up for adding java 17 builds instead.
> >>>
> >>> I know there are some outstanding issues still in
> >>> https://issues.apache.org/jira/browse/HADOOP-16795 -but are they blockers?
> >>> Could we just move to java11 and enhance at our leisure, once java8 is no
> >>> longer a concern.
>
>
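
One common cause of the NoSuchMethodError quoted above is code compiled on
JDK 9 or later, where IntBuffer declares its own rewind() returning IntBuffer,
and then run on Java 8, where only Buffer.rewind() exists. A minimal sketch of
the usual source-level workaround, calling the method through the Buffer
supertype; the class and method names are illustrative and this is not taken
from the Hadoop code base. Building with javac's --release 8 option is the
other usual way to avoid linking against the newer signature.

```java
import java.nio.Buffer;
import java.nio.IntBuffer;

final class BufferRewindSketch {

    // Rewinds through the Buffer supertype, so the compiled call site refers
    // to Buffer.rewind(), which is present on Java 8 and on later releases.
    static IntBuffer rewindPortably(IntBuffer buf) {
        ((Buffer) buf).rewind();
        return buf;
    }

    public static void main(String[] args) {
        IntBuffer sums = IntBuffer.allocate(4);
        sums.put(1).put(2);       // position is now 2
        rewindPortably(sums);     // position is back to 0
        System.out.println("position=" + sums.position() + " first=" + sums.get(0));
    }
}
```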


Re: MapReduce Terasort job is slow on Java 11

2022-11-23 Thread Wei-Chiu Chuang
For the JDK11 case, does everyone on the cluster run on JDK11? or is it the
MR job that is on JDK11?

We have users running JDK11 in production. The NN GC performance was the only
thing we were aware of.
In the past we noticed that, because JDK11 uses G1GC by default, large NameNode
performance was worse than on JDK8, which uses CMS.
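
When comparing runs across JDK versions it can help to confirm which collector
each JVM is actually using. A minimal sketch that relies only on the standard
java.lang.management API; the class name is illustrative. Under G1 the
reported names are typically "G1 Young Generation" and "G1 Old Generation".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcReportSketch {

    // Names of the collectors active in the current JVM, as reported by JMX.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("java.version=" + System.getProperty("java.version"));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and accumulated time of collections since JVM start.
            System.out.println(gc.getName() + ": collections=" + gc.getCollectionCount()
                + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running this inside a task JVM on both setups would show whether the two runs
differ in collector as well as in JDK version.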


On Wed, Nov 23, 2022 at 8:44 AM Prabhu Joseph 
wrote:

> Hi, any pointers on why the MapReduce Terasort job is slower on Java 11
> compared with Java 8? Input data, configs, number of worker nodes, node
> instance type, Hadoop version and resources are the same in both runs.
> I have compared the app logs of both the good and bad runs and observed that
> the average task time (both Map and Reduce) is slower on Java 11.
>
> *Java 8: 7 min 2 secs*
>
> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort
> -Dmapred.reduce.tasks=120
> /tmp/terasort/127130b1-ceb0-422c-a957-48c651b20f30/input/
> /tmp/terasort/127130b1-ceb0-422c-a957-48c651b20f30/output/
> 2022-11-23 12:22:41,948 INFO terasort.TeraSort: starting
> 2022-11-23 12:29:59,520 INFO terasort.TeraSort: done
>
> *Java 11: 9 min 37 secs*
>
> [hadoop@ip-172-31-60-208 ~]$ hadoop jar
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort
> -Dmapred.reduce.tasks=120
> /tmp/terasort/127130b1-ceb0-422c-a957-48c651b20f30/input/
> /tmp/terasort/127130b1-ceb0-422c-a957-48c651b20f30/output/
> 2022-11-23 12:22:44,167 INFO terasort.TeraSort: starting
> 2022-11-23 12:32:21,791 INFO terasort.TeraSort: done
>
> Thanks,
> Prabhu Joseph
>
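
The wall-clock time of each run can be recomputed from the "starting" and
"done" log lines quoted above. A minimal sketch using java.time; the class and
method names are illustrative. From these timestamps the Java 11 run comes to
9 min 37 s, and the Java 8 run to 7 min 17 s, slightly more than the
7 min 2 secs in the heading, so that figure may come from a different
measurement.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class TerasortElapsedSketch {

    // Timestamp layout as it appears in the TeraSort log lines.
    private static final DateTimeFormatter LOG_TS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Wall-clock time between the "starting" and "done" log lines.
    static Duration elapsed(String startTs, String doneTs) {
        return Duration.between(LocalDateTime.parse(startTs, LOG_TS),
                                LocalDateTime.parse(doneTs, LOG_TS));
    }

    // Whole minutes and leftover whole seconds; milliseconds are dropped.
    static String format(Duration d) {
        return d.toMinutes() + " min " + (d.getSeconds() % 60) + " s";
    }

    public static void main(String[] args) {
        Duration java8 = elapsed("2022-11-23 12:22:41,948", "2022-11-23 12:29:59,520");
        Duration java11 = elapsed("2022-11-23 12:22:44,167", "2022-11-23 12:32:21,791");
        System.out.println("Java 8:  " + format(java8));   // 7 min 17 s
        System.out.println("Java 11: " + format(java11));  // 9 min 37 s
    }
}
```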


[DISCUSS] Hadoop 3.3.5 release planning

2022-10-07 Thread Wei-Chiu Chuang
Bumping this up. Adding the [DISCUSS] text to make this message stand out
in your inbox.

I certainly missed this message and only just realized 3.3.5 has more than just
security updates.

What was the issue with the ARM64 build? I was able to publish an ARM64 build
for the 3.3.1 release without problems.


On Tue, Sep 27, 2022 at 9:35 AM Steve Loughran 
wrote:

> Mukund has just created the new Hadoop release JIRA,
> https://issues.apache.org/jira/browse/HADOOP-18470, and is doing the first
> build/test before pushing it up. This is off branch-3.3, so it will have
> more significant changes than the 3.3.3 and 3.3.4 releases, which were just
> CVE/integration fixes.
>
> The new branch, branch-3.3.5 has its maven/hadoop.version set to
> 3.3.5-SNAPSHOT.
>
> All JIRA issues fixed/blocked for 3.3.9 now reference 3.3.5. The next step
> of the release is to actually move those wontfix issues back to being
> against 3.3.9
>
> There is still a 3.3.9 version; branch-3.3's maven build still refers to
> it. Issues found/fixed in branch-3.3 *but not the new branch-3.3.5 branch*
> should still refer to this version. Anything targeting the 3.3.5 release
> must be committed to the new branch, and have its JIRA version tagged
> appropriately.
>
> All changes to be cherrypicked into 3.3.5, except for those ones related to
> the release process itself, MUST be in branch-3.3 first, and SHOULD be in
> trunk unless there is some fundamental reason they can't apply there
> (reload4j etc).
>
> Let's try and stabilise this release, especially bringing up to date all
> the JAR dependencies which we can safely update.
>
> Anyone planning to give talks at ApacheCon about forthcoming features
> already in 3.3 SHOULD
>
>1. reference Hadoop 3.3.5 as the version
>2. make sure their stuff works.
>
> Mukund will be at the conf; find him and offer any help you can in getting
> this release out.
>
> I'd like to get that Arm64 build working. Does anyone else want to get
> involved?
>
> -steve
>


Re: Hadoop BoF at ApacheCon NA'22

2022-10-02 Thread Wei-Chiu Chuang
Hi Junping, yes, it’s all settled. We’ll be meeting Monday 5:50pm Central
time, which will be 6:50am Tuesday for you. Sorry, the complete conference
schedule is only available to participants at this time. It was taken down
from the website.

Hey, so it’s really early for you. I’d suggest moving you to the back of
the session if that’s okay with you.

Zoom link:
https://cloudera.zoom.us/j/94221207158

Dial in: +1 877-853-5257

The Zoom link is available for anyone to join remotely.

On Sun, Oct 2, 2022 at 7:23 AM 俊平堵 wrote:

> Hi Uma and Wei-Chiu,
> Has this schedule update been settled (from Tues. to Mon.)? If
> so, would you provide a meeting link for me so that I can join remotely?
> Thanks!
>
> Best,
>
> Junping
>
> On Tue, Sep 27, 2022 at 2:00 AM Uma Maheswara Rao Gangumalla wrote:
>
> > Guys, there is a schedule change for the Hadoop BoF due to conflicts with
> > the Lightning talks on Wednesday. It has been moved to Monday.
> > Plan accordingly.
> >
> > The Ozone BoF will happen in Rhythms-I on Monday. Depending on the number of
> > people, we could combine them as well.
> >
> > Regards,
> > Uma
> >
> > On Sat, Sep 24, 2022 at 8:52 PM 俊平堵  wrote:
> >
> > > Yes. It should be a short talk and I can only present remotely for this
> > > time. That would be great if you help to coordinate. :)
> > >
> > > Thanks,
> > >
> > > Junping
> > >
> > > On Sat, Sep 24, 2022 at 1:21 AM Wei-Chiu Chuang wrote:
> > >
> > >> That would be great! Will you be presenting in person or remote?
> > >> If it's a short talk (15-20 minutes) we'll have enough time for 2-3 of
> > >> these. If folks want to present remotely I can help coordinate that.
> > >>
> > >> Here I put up a short agenda for the BoF:
> > >> https://docs.google.com/document/d/1_ha1BFeEyIkAJtl5tJ8Z4Vf5Md_0rN23RCrXkyNCY1c/edit?usp=sharing
> > >> Please add more details here.
> > >>
> > >>
> > >> On Thu, Sep 22, 2022 at 10:27 PM 俊平堵  wrote:
> > >>
> > >> > Thanks Wei-Chiu. I am happy to share the status of Hadoop Meetups in
> > >> > China (2019-2022) if that is a suitable topic. :)
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Junping
> > >> >
> > >> > On Wed, Sep 21, 2022 at 2:31 AM Wei-Chiu Chuang wrote:
> > >> >
> > >> > > We've not had a physical event for a long long time and we're way
> > >> > > overdue for one.
> > >> > >
> > >> > > I'm excited to announce we've reserved a room at the upcoming
> > >> > > ApacheCon for
> > >> > > Birds-of-Feather on October 4th from 17:50-18:30 CDT in Rhythms I. I
> > >> > > was also told that participants can stay after that until the hotel
> > >> > > personnel throw us out.
> > >> > >
> > >> > > Feel free to pass along this information.
> > >> > >
> > >> > > On top of that, I was told that this year's ApacheCon is very
> > >> > > popular and a lot of good talk proposals were not selected. If folks
> > >> > > are interested I'm happy to invite you to share online. A physical
> > >> > > meetup in the Bay Area would also be a great idea, if we can find a
> > >> > > sponsor.
> > >> > >
> > >> > > Thanks,
> > >> > > Wei-Chiu
> > >> > >
> > >> >
> > >>
> > >
> >
>


Re: Hadoop BoF at ApacheCon NA'22

2022-09-23 Thread Wei-Chiu Chuang
That would be great! Will you be presenting in person or remote?
If it's a short talk (15-20 minutes) we'll have enough time for 2-3 of
these. If folks want to present remotely I can help coordinate that.

Here I put up a short agenda for the BoF:
https://docs.google.com/document/d/1_ha1BFeEyIkAJtl5tJ8Z4Vf5Md_0rN23RCrXkyNCY1c/edit?usp=sharing
Please add more details here.


On Thu, Sep 22, 2022 at 10:27 PM 俊平堵  wrote:

> Thanks Wei-Chiu. I am happy to share the status of Hadoop Meetups in China
> (2019-2022) if that is a suitable topic. :)
>
> Thanks,
>
> Junping
>
> On Wed, Sep 21, 2022 at 2:31 AM Wei-Chiu Chuang wrote:
>
> > We've not had a physical event for a long long time and we're way overdue
> > for one.
> >
> > I'm excited to announce we've reserved a room at the upcoming ApacheCon for
> > Birds-of-Feather on October 4th from 17:50-18:30 CDT in Rhythms I. I was
> > also told that participants can stay after that until the hotel personnel
> > throw us out.
> >
> > Feel free to pass along this information.
> >
> > On top of that, I was told that this year's ApacheCon is very popular and a
> > lot of good talk proposals were not selected. If folks are interested I'm
> > happy to invite you to share online. A physical meetup in the Bay Area
> > would also be a great idea, if we can find a sponsor.
> >
> > Thanks,
> > Wei-Chiu
> >
>


Hadoop BoF at ApacheCon NA'22

2022-09-20 Thread Wei-Chiu Chuang
We've not had a physical event for a long long time and we're way overdue
for one.

I'm excited to announce we've reserved a room at the upcoming ApacheCon for
Birds-of-Feather on October 4th from 17:50-18:30 CDT in Rhythms I. I was
also told that participants can stay after that until the hotel personnel
throw us out.

Feel free to pass along this information.

On top of that, I was told that this year's ApacheCon is very popular and a
lot of good talk proposals were not selected. If folks are interested I'm
happy to invite you to share online. A physical meetup in the Bay Area
would also be a great idea, if we can find a sponsor.

Thanks,
Wei-Chiu


Re: [DISCUSS] Hadoop on Windows

2022-04-28 Thread Wei-Chiu Chuang
Great!

Sorry I missed the earlier discussion thread. Is there a target version for
this support? I assume the milestone is still in a dev branch?

On Thu, Apr 28, 2022 at 8:26 AM Gautham Banasandra 
wrote:

> Hi Hadoop devs,
>
> I would like to announce that we recently reached a new milestone - we
> recently finished all the tasks in item 3 under Phase 1. This implies that
> all the HDFS native client tools[1] have become cross platform now. We're
> inching closer towards making Hadoop cross platform. Watch this space for
> more updates.
>
> [1] =
>
> https://github.com/apache/hadoop/tree/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/tools
>
> Thanks,
> --Gautham
>
> On Mon, 21 Feb 2022 at 00:12, Gautham Banasandra 
> wrote:
>
> > Hi all,
> >
> > I've been working on getting Hadoop to build on Windows for quite some
> > time now. We're now at a stage where we can parallelize the effort and
> > complete this sooner. I've outlined the parts that are remaining. Please
> > get in touch with me if anyone wishes to join hands in realizing this goal.
> >
> > *Why do we need Hadoop to run on Windows?*
> > Windows has a very large user base. The modern software alternatives to
> > Hadoop (like Kubernetes) are cross platform by design. We have to
> > acknowledge the fact it isn't easy to get Hadoop running on Windows. The
> > reason why we haven't seen much adoption of Hadoop on Windows is probably
> > because of issues like compilation, requiring work-arounds every step of
> > the way etc. If we were to nail these issues, I believe it would
> > tremendously expand the usage of Hadoop.
> >
> > I plan to complete this in 4 phases.
> >
> > *Phase 1 : Building Hadoop on Windows*
> > 1. [HADOOP-17193] Compile Hadoop on Windows natively
> > The Hadoop build on Windows is currently broken because of the POSIX API
> > calls made in the HDFS native client (libhdfspp). MinGW and Cygwin
> > provide POSIX implementations on Windows. While it's possible to use these
> > C++ compilers, it won't be the same as compiling Hadoop with Visual C++.
> > The Visual C++ runtime is the native C++ runtime on Windows and provides
> > many more capabilities (like core dumps etc.) than its alternatives. Thus,
> > it's essential to get Hadoop to compile with Visual Studio on Windows.
> > We'll be using Visual Studio 2019.
> >
> > 2. [HDFS-15843] [libhdfs++] Make write cross platform
> > Until recently, Hadoop was being built with C++11. I upgraded the compiler
> > version to a level where it supports C++17 so that we have access to
> > std::filesystem and a few other modern C++ APIs. However, there are some
> > cases where the C++17 APIs don't suffice. Thus, I wrote the XPlatform
> > library
> > <https://github.com/apache/hadoop/tree/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/x-platform>,
> > which is a collection of system call APIs implemented in a cross-platform
> > friendly manner. The CMake build system will choose the appropriate
> > platform implementation while building so that we can do away with all the
> > platform-based #ifdefs in the code. In summary, if you ever come across
> > a need to use system calls, please put them into the XPlatform library and
> > use its APIs.
> >
> > 3. [HDFS-16474] Make HDFS tail tool cross platform
> > [HDFS-16473] Make HDFS stat tool cross platform
> > [HDFS-16472] Make HDFS setrep tool cross platform
> > [HDFS-16471] Make HDFS ls tool cross platform
> > [HDFS-16470] Make HDFS find tool cross platform
> > The HDFS native client tools use the getopt API to parse the command line
> > arguments. getopt isn't available on Windows. One can follow this PR to
> > make the above tools cross platform compatible - HDFS-16285. Make HDFS
> > ownership tools cross platform by GauthamBanasandra · Pull Request #3588 ·
> > apache/hadoop (github.com).
> >
> > 4. [HDFS-16463] Make dirent.h cross platform compatible
> > [HDFS-16465] Make usage of strings.h cross platform compatible
> > For these JIRAs, the header files aren't there for Windows. Thus, we need
> > to inspect the APIs that have been used from these headers and implement
> > them.
> >
> > 5. 

Re: Institutions running kerberized Hadoop clusters

2022-04-12 Thread Wei-Chiu Chuang
Last time I checked (~2 years ago), there were thousands of Kerberized
clusters among Cloudera customers.
The largest ones had a few thousand nodes.

What are you looking for?

On Wed, Apr 13, 2022 at 7:22 AM Santosh Marella  wrote:

> Hey folks,
>
>   Just curious if we have a list of institutions that are running
> *kerberized* Hadoop clusters? I noticed we have a PoweredBy Hadoop
>  page that
> lists all the institutions running Hadoop, but couldn't find something
> similar for kerberized Hadoop clusters. Appreciate any pointers on this.
>
> Thanks,
> Santosh
>


Re: [VOTE] Release Apache Hadoop 3.2.3 - RC0

2022-03-17 Thread Wei-Chiu Chuang
aarch64 support was only introduced in 3.3.0 and later.

On Thu, Mar 17, 2022 at 2:27 PM Emil Ejbyfeldt
 wrote:

> Hi,
>
>
> There is no aarch64 artifact in the release candidate. Is this something
> that is intended?
>
> Best,
> Emil Ejbyfeldt
>
> On 14/03/2022 08:14, Masatake Iwasaki wrote:
> > Hi all,
> >
> > Here's Hadoop 3.2.3 release candidate #0:
> >
> > The RC is available at:
> >https://home.apache.org/~iwasakims/hadoop-3.2.3-RC0/
> >
> > The RC tag is at:
> >https://github.com/apache/hadoop/releases/tag/release-3.2.3-RC0
> >
> > The Maven artifacts are staged at:
> >
> https://repository.apache.org/content/repositories/orgapachehadoop-1339
> >
> > You can find my public key at:
> >https://downloads.apache.org/hadoop/common/KEYS
> >
> > Please evaluate the RC and vote.
> >
> > Thanks,
> > Masatake Iwasaki
> >
> > -
> > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >
>
>


[ANNOUNCE] New Hadoop PMC Sun Chao

2022-03-08 Thread Wei-Chiu Chuang
On behalf of the Apache Hadoop PMC, I am pleased to announce that Sun
Chao (sunchao) has accepted the PMC's invitation to become a PMC member of
the project. We appreciate all of Sun's generous contributions thus far
and look forward to his continued involvement.

Congratulations and welcome, Sun!


Re: [ANNOUNCE] Apache Hadoop 3.3.2 release

2022-03-03 Thread Wei-Chiu Chuang
Thanks a lot for the tremendous work!

On Fri, Mar 4, 2022 at 9:30 AM Chao Sun  wrote:

> Hi All,
>
> It gives me great pleasure to announce that the Apache Hadoop community has
> voted to release Apache Hadoop 3.3.2.
>
> This is the second stable release of the Apache Hadoop 3.3 line. It contains
> 284 bug fixes, improvements and enhancements since 3.3.1.
>
> Users are encouraged to read the overview of major changes [1] since 3.3.1.
> For details of 284 bug fixes, improvements, and other enhancements since
> the previous 3.3.1 release, please check release notes [2] and changelog
> [3].
>
> [1]: https://hadoop.apache.org/docs/r3.3.2/index.html
> [2]:
>
> http://hadoop.apache.org/docs/r3.3.2/hadoop-project-dist/hadoop-common/release/3.3.2/RELEASENOTES.3.3.2.html
> [3]:
>
> http://hadoop.apache.org/docs/r3.3.2/hadoop-project-dist/hadoop-common/release/3.3.2/CHANGELOG.3.3.2.html
>
> Many thanks to everyone who contributed to the release, and everyone in the
> Apache Hadoop community! This release is a direct result of your great
> contributions.
>
> Many thanks to everyone who helped in this release process!
>
> Many thanks to Viraj Jasani, Michael Stack, Masatake Iwasaki, Xiaoqiao He,
> Mukund Madhav Thakur, Wei-Chiu Chuang, Steve Loughran, Akira Ajisaka and
> other folks who helped with this release process.
>
> Best Regards,
> Chao
>


Re: [VOTE] Release Apache Hadoop 3.3.2 - RC2

2022-01-20 Thread Wei-Chiu Chuang
I'll find time to check out the RC bits.
I just feel bad that the tarball is now more than 600MB in size.

On Fri, Jan 21, 2022 at 2:23 AM Steve Loughran 
wrote:

> *+1 binding.*
>
> reviewed binaries, source, artifacts in the staging maven repository in
> downstream builds. all good.
>
> *## test run*
>
> checked out the asf github repo at commit 6da346a358c into a location
> already set up with aws and azure test credentials
>
> ran the hadoop-aws tests with -Dparallel-tests -DtestsThreadCount=6
>  -Dmarkers=delete -Dscale
> and hadoop-azure against azure cardiff with -Dparallel-tests=abfs
> -DtestsThreadCount=6
>
> all happy
>
>
>
> *## binary*
> downloaded KEYS and imported, so adding your key to my list (also signed
> this and updated the key servers)
>
> downloaded rc tar and verified
> ```
> > gpg2 --verify hadoop-3.3.2.tar.gz.asc hadoop-3.3.2.tar.gz
> gpg: Signature made Sat Jan 15 23:41:10 2022 GMT
> gpg:using RSA key DE7FA241EB298D027C97B2A1D8F1A97BE51ECA98
> gpg: Good signature from "Chao Sun (CODE SIGNING KEY)  >"
> [full]
>
>
> > cat hadoop-3.3.2.tar.gz.sha512
> SHA512 (hadoop-3.3.2.tar.gz) =
>
> cdd3d9298ba7d6e63ed63f93c159729ea14d2b7d5e3a0640b1761c86c7714a721f88bdfa8cb1d8d3da316f616e4f0ceaace4f32845ee4441e6aaa7a12b8c647d
>
> > shasum -a 512 hadoop-3.3.2.tar.gz
>
> cdd3d9298ba7d6e63ed63f93c159729ea14d2b7d5e3a0640b1761c86c7714a721f88bdfa8cb1d8d3da316f616e4f0ceaace4f32845ee4441e6aaa7a12b8c647d
>  hadoop-3.3.2.tar.gz
> ```
>
>
> *# cloudstore against staged artifacts*
> ```
> cd ~/.m2/repository/org/apache/hadoop
> find . -name \*3.3.2\* -print | xargs rm -r
> ```
> ensures no local builds have tainted the repo.
>
> in cloudstore mvn build without tests
> ```
> mci -Pextra -Phadoop-3.3.2 -Psnapshots-and-staging
> ```
> this fetches all from asf staging
>
> ```
> Downloading from ASF Staging:
>
> https://repository.apache.org/content/groups/staging/org/apache/hadoop/hadoop-client/3.3.2/hadoop-client-3.3.2.pom
> Downloaded from ASF Staging:
>
> https://repository.apache.org/content/groups/staging/org/apache/hadoop/hadoop-client/3.3.2/hadoop-client-3.3.2.pom
> (11 kB at 20 kB/s)
> ```
> there's no tests there, but it did audit the download process. FWIW, that
> project has switched to logback, so I now have all hadoop imports excluding
> slf4j and log4j. it takes too much effort right now.
>
> build works.
>
> tested abfs and s3a storediags, all happy
>
>
>
>
> *### google GCS against staged artifacts*
>
> gcs is now java 11 only, so I had to switch JVMs here.
>
> had to add a snapshots and staging profile, after which I could build and
> test.
>
> ```
>  -Dhadoop.three.version=3.3.2 -Psnapshots-and-staging
> ```
> two test failures were related to auth failures where the tests were trying
> to raise exceptions but things failed differently
> ```
> [ERROR] Failures:
> [ERROR]
>
> GoogleHadoopFileSystemTest.eagerInitialization_fails_withInvalidCredentialsConfiguration:122
> unexpected exception type thrown; expected:
> but was:
> [ERROR]
>
> GoogleHadoopFileSystemTest.lazyInitialization_deleteCall_fails_withInvalidCredentialsConfiguration:100
> value of: throwable.getMessage()
> expected: Failed to create GCS FS
> but was : A JSON key file may not be specified at the same time as
> credentials via configuration.
>
> ```
>
> I'm not worried here.
>
> ran cloudstore's diagnostics against gcs.
>
> Nice to see they are now collecting IOStatistics on their input streams. we
> really need to get this collected through the parquet/orc libs and then
> through the query engines.
>
> ```
> > bin/hadoop jar $CLOUDSTORE storediag gs://stevel-london/
>
> ...
> 2022-01-20 17:52:47,447 [main] INFO  diag.StoreDiag
> (StoreDurationInfo.java:(56)) - Starting: Reading a file
> gs://stevel-london/dir-9cbfc774-76ff-49c0-b216-d7800369c3e1/file
> input stream summary: org.apache.hadoop.fs.FSDataInputStream@6cfd9a54:
> com.google.cloud.hadoop.fs.gcs.GoogleHadoopFSInputStream@78c1372d
> {counters=((stream_read_close_operations=1)
> (stream_read_seek_backward_operations=0) (stream_read_total_bytes=7)
> (stream_read_bytes=7) (stream_read_exceptions=0)
> (stream_read_seek_operations=0) (stream_read_seek_bytes_skipped=0)
> (stream_read_operations=3) (stream_read_bytes_backwards_on_seek=0)
> (stream_read_seek_forward_operations=0)
> (stream_read_operations_incomplete=1));
> gauges=();
> minimums=();
> maximums=();
> means=();
> }
> ...
> ```
>
> *### source*
>
> once I'd done builds and tests which fetched from staging, I did a local
> build and test
>
> repeated download/validate of source tarball, unzip/untar
>
> build with java11.
>
> I've not done the test run there, because that directory tree doesn't have
> the credentials, and this morning's run was good.
>
> altogether then: very happy. tests good, downstream libraries building and
> linking.
>
> On Wed, 19 Jan 2022 at 17:50, Chao Sun  wrote:
>
> > Hi all,
> >
> > I've put together Hadoop 3.3.2 RC2 below:
> >
> > The RC is available at:
> > 
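
The binary check quoted above (comparing the `.sha512` file against the
output of `shasum -a 512`) can be scripted. A minimal sketch, assuming the
BSD-style `SHA512 (name) = hex` layout shown in the quoted output; the
function names are made up for illustration and are not part of any Hadoop
release tooling:

```python
import hashlib
import re
from pathlib import Path
from typing import Tuple


def parse_bsd_checksum(text: str) -> Tuple[str, str]:
    """Parse 'SHA512 (name) = <hex>'; the digest may be wrapped across lines."""
    match = re.match(
        r"\s*SHA512\s*\((?P<name>[^)]+)\)\s*=\s*(?P<hex>[0-9a-fA-F\s]+)$", text
    )
    if match is None:
        raise ValueError("not a BSD-style SHA512 checksum entry")
    # Drop any whitespace or line breaks inside the digest and normalise case.
    digest = "".join(match.group("hex").split()).lower()
    return match.group("name"), digest


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a large tarball is never fully in memory."""
    hasher = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify(path: Path, checksum_text: str) -> bool:
    """True only if both the file name and the digest match the checksum entry."""
    name, expected = parse_bsd_checksum(checksum_text)
    return name == path.name and sha512_of(path) == expected
```

Usage would be along the lines of
`verify(Path("hadoop-3.3.2.tar.gz"), Path("hadoop-3.3.2.tar.gz.sha512").read_text())`.
This only checks integrity; the `gpg2 --verify` step is still needed to check
who signed the tarball.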

Re: [DISCUSS] Migrate hadoop from log4j1 to log4j2

2022-01-20 Thread Wei-Chiu Chuang
+1 I think it makes sense to use reload4j in maint releases.
I have a draft PR doing this (https://github.com/apache/hadoop/pull/3906)

log4j2 in Hadoop 3.4.0 makes sense to me. There could be incompatibilities
introduced by log4j2, but I feel we should at least make 3.4.0 a
"preview" release, and try to address the incompatibilities in later versions
(e.g. 3.4.1).

On Fri, Jan 21, 2022 at 8:42 AM Duo Zhang  wrote:

> For the maintenance release lines I also support switching to reload4j to
> address the security issues first. We could file an issue for it.
>
> On Fri, Jan 21, 2022 at 01:15, Andrew Purtell wrote:
>
> > Just to clarify: I think you want to upgrade to Log4J2 (or switch to
> > LogBack) as a strategy for new releases, but you have the option in
> > maintenance releases to use Reload4J to maintain Appender API and
> > operational compatibility, and users who want to minimize risks in
> > production while mitigating the security issues will prefer that.
> >
> > > On Jan 20, 2022, at 8:59 AM, Andrew Purtell 
> > wrote:
> > >
> > > Reload4J has fixed all of those CVEs without requiring an upgrade.
> > >
> > >> On Jan 20, 2022, at 5:56 AM, Duo Zhang  wrote:
> > >>
> > >> There are 3 new CVEs for log4j1 reported recently[1][2][3]. So I think
> > >> it is time to speed up the migration to log4j2 work[4] now.
> > >>
> > >> You can see the discussion on the jira issue[4], our goal is to fully
> > >> migrate to log4j2 and the current most blocking issue is lack of the
> > >> "log4j.rootLogger=INFO,Console" grammar support for log4j2. I've already
> > >> started a discussion thread on the log4j dev mailing list[5] and the
> > >> result is optimistic and I've filed an issue for log4j2[6], but I do not
> > >> think it could be addressed and released soon. If we want to fully
> > >> migrate to log4j2, then either we introduce new environment variables or
> > >> split the old HADOOP_ROOT_LOGGER variable in the startup scripts. And
> > >> considering the complexity of our current startup scripts, the work is
> > >> not easy and it will also break lots of other hadoop deployment systems
> > >> if they do not use our startup scripts...
> > >>
> > >> So after reconsidering the current situation, I prefer we use the
> > >> log4j1.2 bridge to remove the log4j1 dependency first, and once
> > >> LOG4J2-3341 is addressed and released, we start to fully migrate to
> > >> log4j2. Of course we have other problems for log4j1.2 bridge too, as we
> > >> have TaskLogAppender, ContainerLogAppender and
> > >> ContainerRollingLogAppender which inherit FileAppender and
> > >> RollingFileAppender in log4j1, which are not part of the log4j1.2
> > >> bridge. But anyway, at least we could just copy the source code to
> > >> hadoop as we have WriterAppender in log4j1.2 bridge, and these two
> > >> classes do not have related CVEs.
> > >>
> > >> Thoughts? For me I would like us to make a new 3.4.x release line to
> > >> remove the log4j1 dependencies ASAP.
> > >>
> > >> Thanks.
> > >>
> > >> 1. https://nvd.nist.gov/vuln/detail/CVE-2022-23302
> > >> 2. https://nvd.nist.gov/vuln/detail/CVE-2022-23305
> > >> 3. https://nvd.nist.gov/vuln/detail/CVE-2022-23307
> > >> 4. https://issues.apache.org/jira/browse/HADOOP-16206
> > >> 5. https://lists.apache.org/thread/gvfb3jkg6t11cyds4jmpo7lrswmx28w3
> > >> 6. https://issues.apache.org/jira/browse/LOG4J2-3341
> >
>
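
For reference, the log4j 1.x grammar at the heart of the thread above packs a
level and a list of appender names into one comma-separated value. Below is a
minimal sketch of splitting such a value (for example the contents of
HADOOP_ROOT_LOGGER) into its parts. `split_root_logger` and its
`default_level` fallback are hypothetical names used for illustration only;
they are not taken from the Hadoop startup scripts or from log4j:

```python
from typing import List, Tuple


def split_root_logger(value: str, default_level: str = "INFO") -> Tuple[str, List[str]]:
    """Split a log4j 1.x style 'LEVEL,appender1,appender2' value.

    The first comma-separated field is the level; the remaining fields are
    appender names. An empty level field falls back to default_level, which
    is an assumption of this sketch rather than log4j behaviour.
    """
    parts = [part.strip() for part in value.split(",")]
    level = parts[0] if parts[0] else default_level
    appenders = [name for name in parts[1:] if name]
    return level, appenders


level, appenders = split_root_logger("INFO,Console")
```

With the two parts separated, a startup script could pass the level and the
appender list to log4j2 independently, which is roughly what "split the old
HADOOP_ROOT_LOGGER variable" would involve.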


Re: Hadoop-3.2.3 Release Update

2022-01-11 Thread Wei-Chiu Chuang
Is this still making progress?

On Tue, Oct 5, 2021 at 8:45 PM Brahma Reddy Battula 
wrote:

> Hi Akira,
>
> Thanks for your email!!
>
> I am evaluating the CVEs that need to go into this release.
>
> Will update soon!!
>
>
> On Tue, 5 Oct 2021 at 1:46 PM, Akira Ajisaka  wrote:
>
> > Hi Brahma,
> >
> > How is the release process going? Is there any blocker for the RC?
> >
> > -Akira
> >
> > On Wed, Sep 22, 2021 at 7:37 PM Xiaoqiao He  wrote:
> >
> > > Hi Brahma,
> > >
> > > The feature 'BPServiceActor processes commands from NameNode
> > > asynchronously' is ready for both branch-3.2 and branch-3.2.3. There
> > > was only a minor conflict while cherry-picking, so I checked it in
> > > directly. BTW, I ran some unit tests and built a pseudo cluster to
> > > verify, and it seems to work fine.
> > > FYI.
> > >
> > > Regards,
> > > - He Xiaoqiao
> > >
> > > On Thu, Sep 16, 2021 at 10:52 PM Brahma Reddy Battula <
> bra...@apache.org
> > >
> > > wrote:
> > >
> > >> Please go ahead. Let me know any help required on review.
> > >>
> > >> On Tue, Sep 14, 2021 at 6:57 PM Xiaoqiao He 
> > wrote:
> > >>
> > >>> Hi Brahma,
> > >>>
> > >>> I plan to include HDFS-14997 and related JIRAs if possible. I have
> > >>> resolved the conflicts and verified them locally.
> > >>> It will include: HDFS-14997 HDFS-15075 HDFS-15651 HDFS-15113.
> > >>> I would like to hear whether we have enough time to wait for it to
> > >>> be ready.
> > >>> Thanks.
> > >>>
> > >>> Best Regards,
> > >>> - He Xiaoqiao
> > >>>
> > >>> On Tue, Sep 14, 2021 at 3:39 PM Xiaoqiao He 
> > wrote:
> > >>>
> >  Hi Brahma, HDFS-15160 has checked in branch-3.2 & branch-3.2.3. FYI.
> > 
> >  On Tue, Sep 14, 2021 at 3:52 AM Brahma Reddy Battula <
> > bra...@apache.org>
> >  wrote:
> > 
> > > Hi All,
> > >
> > > Waiting for the following jira to be committed to hadoop-3.2.3; this
> > > can most likely be done this week. Then I will try to create the RC
> > > if there is no objection.
> > >
> > > https://issues.apache.org/jira/browse/HDFS-15160
> > >
> > >
> > >
> > > On Mon, Aug 16, 2021 at 2:22 PM Brahma Reddy Battula <
> > > bra...@apache.org>
> > > wrote:
> > >
> > > > @Akira Ajisaka   and @Masatake Iwasaki
> > > > 
> > > > Looks all are build related issues when you try with bigtop. We
> can
> > > > discuss and prioritize this.. Will connect with you guys.
> > > >
> > > > On Mon, Aug 16, 2021 at 1:43 PM Masatake Iwasaki <
> > > > iwasak...@oss.nttdata.co.jp> wrote:
> > > >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch2-exclude-spotbugs-annotations.diff
> > > >> >
> > > >> > This is for building hadoop-3.2.2 against zookeeper-3.4.14.
> > > >> > We do not usually see the issue since branch-3.2 uses
> > > >> > zookeeper-3.4.13, while it would be harmless to add the exclusion
> > > >> > even for zookeeper-3.4.13.
> > > >>
> > > >> I filed HADOOP-17849 for this.
> > > >>
> > > >> On 2021/08/16 12:02, Masatake Iwasaki wrote:
> > > >> > Thanks for bringing this up, Akira. Let me explain some
> > > background.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch2-exclude-spotbugs-annotations.diff
> > > >> >
> > > >> > This is for building hadoop-3.2.2 against zookeeper-3.4.14.
> > > >> > We do not usually see the issue since branch-3.2 uses
> > > >> > zookeeper-3.4.13, while it would be harmless to add the exclusion
> > > >> > even for zookeeper-3.4.13.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch3-fix-broken-dir-detection.diff
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch5-fix-kms-shellprofile.diff
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch6-fix-httpfs-sh.diff
> > > >> >
> > > >> > These are relevant to directory structure used by Bigtop
> > package.
> > > >> > If the fix does not break the tarball dist,
> > > >> > it would be nice to have these on Hadoop too.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch7-remove-phantomjs-in-yarn-ui.diff
> > > >> >
> > > >> > This is for aarch64 and ppc64le lacking required phantomjs.
> > > >> > It is only acceptable for Bigtop because it does not run YARN-UI2
> > > >> > tests on packaging.
> > > >> > Hadoop needs the phantomjs for testing YARN-UI2.
> > > >> >
> > 

Apache Hadoop and CVE-2021-44228 Log4JShell vulnerability

2021-12-19 Thread Wei-Chiu Chuang
Hi,
Given the widespread attention to the recent log4j vulnerability
(CVE-2021-44228), I'd like to share an update from the Hadoop developer
community regarding the incident.

As you probably know, Apache Hadoop depends on the log4j library to keep
log files. The highlighted vulnerability CVE-2021-44228 affects log4j2
2.0-beta9 through 2.15.0. Hadoop has been using log4j 1.2.x for the last 10
years and therefore no release is affected by it.

That said, another CVE, CVE-2021-4104, states that the JMSAppender in log4j
1.2.x, which is used by Apache Hadoop, is vulnerable to the same attack.
Fortunately, Hadoop does not configure or enable it by default.
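
To check whether a given log4j 1.x properties file defines a JMSAppender,
something like the following could be used. This is a minimal sketch under
assumptions: it only handles simple `key=value` lines, the helper name is
made up for illustration, and a match means the appender is defined, not
necessarily that it is attached to a logger:

```python
from typing import List

# Fully qualified class name of the appender named in CVE-2021-4104.
JMS_APPENDER = "org.apache.log4j.net.JMSAppender"
APPENDER_PREFIX = "log4j.appender."


def jms_appender_names(properties_text: str) -> List[str]:
    """Return the names of appenders declared with the JMSAppender class."""
    names = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and properties-file comments.
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue
        key, value = key.strip(), value.strip()
        if key.startswith(APPENDER_PREFIX) and value == JMS_APPENDER:
            name = key[len(APPENDER_PREFIX):]
            # 'log4j.appender.NAME' declares the class; deeper keys such as
            # 'log4j.appender.NAME.option' only set options, so skip those.
            if "." not in name:
                names.append(name)
    return names
```

An empty result for the cluster's log4j.properties is consistent with the
default configuration described above.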

For more information and mitigation, please check out Hadoop's CVE list
page.
https://hadoop.apache.org/cve_list.html

Wei-Chiu


Trunk broken by HDFS-16384

2021-12-16 Thread Wei-Chiu Chuang
My bad. There was a transitive dependency problem in the PR causing trunk
to fail the build.

The commit has since been reverted.

Sorry for the inconvenience.


Re: [VOTE] Release Apache Hadoop 3.3.2 - RC0

2021-12-13 Thread Wei-Chiu Chuang
Thanks a lot for pushing it forward!

A few things I noticed that we should incorporate:

1. The overview page of the doc is for the Hadoop 3.0 release. It would be
best to base the doc on top of the Hadoop 3.3.0 overview page.
(It's a miss on my part... The overview page of 3.3.1 wasn't updated.)

2. ARM binaries are not included.
For the 3.3.1 release, I had to run the create release script on an ARM
machine separately to create the binary tarball.

3. The jdiff version
https://github.com/apache/hadoop/blob/branch-3.3.2/hadoop-project-dist/pom.xml#L137

I am not sure exactly what this is used for but I think it should be
updated to 3.3.2 (or 3.3.1).
(It was updated in trunk but I forgot to update branch-3.3.)

The 3.3.1 binary tarball is 577 MB. The 3.3.2 RC0 is 608 MB. I'm curious what
was added.



On Fri, Dec 10, 2021 at 10:09 AM Chao Sun  wrote:

> Hi all,
>
> Sorry for the long delay. I've prepared RC0 for Hadoop 3.3.2 below:
>
> The RC is available at:
> http://people.apache.org/~sunchao/hadoop-3.3.2-RC0/
> The RC tag is at:
> https://github.com/apache/hadoop/releases/tag/release-3.3.2-RC0
> The Maven artifacts are staged at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1330/
>
> You can find my public key at: https://people.apache.org/~sunchao/KEYS
>
> Please evaluate the RC and vote.
>
> Thanks,
> Chao
>


ApacheCon@Home Big Data tracks recordings!

2021-10-11 Thread Wei-Chiu Chuang
For those who missed the live ApacheCon@Home Big Data tracks, the video
recordings are being uploaded to the official ASF channel!

Big Data:
https://www.youtube.com/playlist?list=PLU2OcwpQkYCzXcumE9UxNirLF1IYLmARj
Big Data Ozone:
https://www.youtube.com/playlist?list=PLU2OcwpQkYCxtPdZ0nSowYLQMgkmoczMl
Big Data SQL/NoSQL:
https://www.youtube.com/playlist?list=PLU2OcwpQkYCwu-bpf3K-OIfAjHpf4kr4L
Big Data Streaming:
https://www.youtube.com/playlist?list=PLU2OcwpQkYCwf7Cl6xsCgHuIa8_NWX2JG

You can find other topics as well:
https://www.youtube.com/c/TheApacheFoundation/playlists

Thanks to all who presented. I saw multiple talks related to Hadoop:

* YARN Resource Management and Dynamic Max by Fang Liu, Fengguang Tian,
  Prashant Golash, Hanxiong Zhang, Shuyi Zhang
* Uber HDFS Unit Storage Cost 10x Deduction by Jeffrey Zhong, Jing Zhao,
  Leon Gao
* Scaling the Namenode - Lessons learnt by Dinesh Chitlangia
* How Uber achieved millions of savings by managing disk IO across HDFS
  cluster by Leon Gao, Ekanth Sethuramalingam
* Containing an Elephant: How we moved Hadoop/HBase into Kubernetes and
  Public Cloud by Dhiraj Hegde


You can also find the recordings for ApacheCon Asia (August 2021). Talks by
our community members include:

* Bigtop 3.0: Rerising community driven Hadoop distribution by Kengo Seki,
  Masatake Iwasaki.
* Technical tips for secure Apache Hadoop cluster by Akira Ajisaka, Kei
  KORI.
* Data Lake accelerator on Hadoop-COS in Tencent Cloud by Li Cheng.

I may have missed a few great talks as I glanced through the list, so
please let me know if you find other relevant talks in other tracks.

Cheers,
Wei-Chiu


Re: Hadoop-3.2.3 Release Update

2021-10-06 Thread Wei-Chiu Chuang
Hi, to raise awareness:
it looks like reverting the FoldedTreeSet (HDFS-13671) breaks
TestBlockManager in branch-3.2. Branch-3.3 is good.

Tracking jira: HDFS-16258

On Tue, Oct 5, 2021 at 8:45 PM Brahma Reddy Battula 
wrote:

> Hi Akira,
>
> Thanks for your email!!
>
> I am evaluating the CVEs that need to go into this release.
>
> Will update soon!!
>
>
> On Tue, 5 Oct 2021 at 1:46 PM, Akira Ajisaka  wrote:
>
> > Hi Brahma,
> >
> > How is the release process going? Is there any blocker for the RC?
> >
> > -Akira
> >
> > On Wed, Sep 22, 2021 at 7:37 PM Xiaoqiao He  wrote:
> >
> > > Hi Brahma,
> > >
> > > The feature 'BPServiceActor processes commands from NameNode
> > > asynchronously' is ready for both branch-3.2 and branch-3.2.3. There
> > > was only a minor conflict while cherry-picking, so I checked it in
> > > directly. BTW, I ran some unit tests and built a pseudo cluster to
> > > verify, and it seems to work fine.
> > > FYI.
> > >
> > > Regards,
> > > - He Xiaoqiao
> > >
> > > On Thu, Sep 16, 2021 at 10:52 PM Brahma Reddy Battula <
> bra...@apache.org
> > >
> > > wrote:
> > >
> > >> Please go ahead. Let me know any help required on review.
> > >>
> > >> On Tue, Sep 14, 2021 at 6:57 PM Xiaoqiao He 
> > wrote:
> > >>
> > >>> Hi Brahma,
> > >>>
> > >>> I plan to include HDFS-14997 and related JIRAs if possible. I have
> > >>> resolved the conflicts and verified them locally.
> > >>> It will include: HDFS-14997 HDFS-15075 HDFS-15651 HDFS-15113.
> > >>> I would like to hear whether we have enough time to wait for it to
> > >>> be ready.
> > >>> Thanks.
> > >>>
> > >>> Best Regards,
> > >>> - He Xiaoqiao
> > >>>
> > >>> On Tue, Sep 14, 2021 at 3:39 PM Xiaoqiao He 
> > wrote:
> > >>>
> >  Hi Brahma, HDFS-15160 has checked in branch-3.2 & branch-3.2.3. FYI.
> > 
> >  On Tue, Sep 14, 2021 at 3:52 AM Brahma Reddy Battula <
> > bra...@apache.org>
> >  wrote:
> > 
> > > Hi All,
> > >
> > > Waiting for the following jira to be committed to hadoop-3.2.3; this
> > > can most likely be done this week. Then I will try to create the RC
> > > if there is no objection.
> > >
> > > https://issues.apache.org/jira/browse/HDFS-15160
> > >
> > >
> > >
> > > On Mon, Aug 16, 2021 at 2:22 PM Brahma Reddy Battula <
> > > bra...@apache.org>
> > > wrote:
> > >
> > > > @Akira Ajisaka   and @Masatake Iwasaki
> > > > 
> > > > Looks all are build related issues when you try with bigtop. We
> can
> > > > discuss and prioritize this.. Will connect with you guys.
> > > >
> > > > On Mon, Aug 16, 2021 at 1:43 PM Masatake Iwasaki <
> > > > iwasak...@oss.nttdata.co.jp> wrote:
> > > >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch2-exclude-spotbugs-annotations.diff
> > > >> >
> > > >> > This is for building hadoop-3.2.2 against zookeeper-3.4.14.
> > > >> > We do not usually see the issue since branch-3.2 uses
> > > >> > zookeeper-3.4.13, while it would be harmless to add the exclusion
> > > >> > even for zookeeper-3.4.13.
> > > >>
> > > >> I filed HADOOP-17849 for this.
> > > >>
> > > >> On 2021/08/16 12:02, Masatake Iwasaki wrote:
> > > >> > Thanks for bringing this up, Akira. Let me explain some
> > > background.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch2-exclude-spotbugs-annotations.diff
> > > >> >
> > > >> > This is for building hadoop-3.2.2 against zookeeper-3.4.14.
> > > >> > We do not usually see the issue since branch-3.2 uses
> > > >> > zookeeper-3.4.13, while it would be harmless to add the exclusion
> > > >> > even for zookeeper-3.4.13.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch3-fix-broken-dir-detection.diff
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch5-fix-kms-shellprofile.diff
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch6-fix-httpfs-sh.diff
> > > >> >
> > > >> > These are relevant to directory structure used by Bigtop
> > package.
> > > >> > If the fix does not break the tarball dist,
> > > >> > it would be nice to have these on Hadoop too.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch7-remove-phantomjs-in-yarn-ui.diff
> > > >> >
> > > >> > This is for 

[ANNOUNCE] Apache Hadoop 3.3.1 release

2021-06-15 Thread Wei-Chiu Chuang
Hi All,

It gives me great pleasure to announce that the Apache Hadoop community has
voted to release Apache Hadoop 3.3.1.

This is the first stable release of the Apache Hadoop 3.3.x line. It contains
697 bug fixes, improvements and enhancements since 3.3.0.

Users are encouraged to read the overview of major changes
<https://hadoop.apache.org/docs/r3.3.1/index.html> since 3.3.0. For details
of the 697 bug fixes, improvements, and other enhancements since the previous
3.3.0 release, please check the release notes
<http://hadoop.apache.org/docs/r3.3.1/hadoop-project-dist/hadoop-common/release/3.3.1/RELEASENOTES.3.3.1.html>
and changelog
<http://hadoop.apache.org/docs/r3.3.1/hadoop-project-dist/hadoop-common/release/3.3.1/CHANGES.3.3.1.html>.

Many thanks to everyone who contributed to the release, and everyone in the
Apache Hadoop community! This release is a direct result of your great
contributions.

Many thanks to everyone who helped in this release process!

Many thanks to Sean Busbey, Chao Sun, Steve Loughran, Masatake Iwasaki,
Michael Stack, Viraj Jasani, Eric Payne, Ayush Saxena, Vinayakumar B,
Takanobu Asanuma, Xiaoqiao He and other folks who continued to help with this
release process.

Best Regards,
Wei-Chiu Chuang


Re: [VOTE] Hadoop 3.1.x EOL

2021-06-15 Thread Wei-Chiu Chuang
Dropped 3.1.4 from website.
Removed from https://dist.apache.org/repos/dist/release/hadoop/common/

On Thu, Jun 10, 2021 at 3:42 PM Akira Ajisaka  wrote:

> This vote has passed with 18 binding +1. I'll update the JIRA and the wiki.
>
> Thanks all for your participation.
>
> On Tue, Jun 8, 2021 at 3:03 AM Steve Loughran  wrote:
> >
> >
> >
> > On Thu, 3 Jun 2021 at 07:14, Akira Ajisaka  wrote:
> >>
> >> Dear Hadoop developers,
> >>
> >> Given the feedback from the discussion thread [1], I'd like to start
> >> an official vote
> >> thread for the community to vote and start the 3.1 EOL process.
> >>
> >> What this entails:
> >>
> >> (1) an official announcement that no further regular Hadoop 3.1.x
> releases
> >> will be made after 3.1.4.
> >> (2) resolve JIRAs that specifically target 3.1.5 as won't fix.
> >>
> >> This vote will run for 7 days and conclude by June 10th, 16:00 JST [2].
> >>
> >> Committers are eligible to cast binding votes. Non-committers are
> welcomed
> >> to cast non-binding votes.
> >>
> >> Here is my vote, +1
> >
> >
> >
> > +1 (binding)
> >>
> >>
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


Re: [VOTE] Release Apache Hadoop 3.3.1 RC3

2021-06-15 Thread Wei-Chiu Chuang
Thanks all for the input!

I released the maven artifacts earlier today, so soon you should be able to
use the 3.3.1 dependencies in downstream applications.
The official website is refreshed to include 3.3.1:
https://hadoop.apache.org/
The tarball has been uploaded, and I verified that the links are good to go.

Will send out the announcement to the user mailing list later.

On Tue, Jun 15, 2021 at 9:56 PM Xiaoqiao He  wrote:

> +0.
>
> - Built successfully from tag release-3.3.1-RC3 on Ubuntu 20.04
> - Verified signature and checksum
> - Deployed pseudo-distributed cluster with 3 nodes
> - Ran basic HDFS shell commands and sample MR Jobs, it worked well.
> - Browsed NN/DN/RM/NM UI. *YARN UI2 DOES NOT work from my side.*
>
> NOTE: attach configuration for YARN. Please correct me if I missed
> something.
>
> Thanks Wei-Chiu for your great work!
>
> - He Xiaoqiao
>
>
> On Mon, Jun 14, 2021 at 12:28 PM Takanobu Asanuma 
> wrote:
>
>> +1.
>>  - Verified hashes
>>  - Confirmed native build on CentOS7
>>  - Started kerberized cluster (using docker)
>>  - Checked NN/RBF Web UI
>>  - Ran basic Erasure Coding shell commands
>>
>> Thanks for the great work, Wei-Chiu.
>>
>> - Takanobu
>>
>> 2021年6月13日(日) 3:25 Vinayakumar B :
>>
>> > +1 (Binding)
>> >
>> > 1. Built from Tag.
>> > 2. Successful Native Build on Ubuntu 20.04
>> > 3. Verified Checksums
>> > 4. Deployed the docker cluster with 3 nodes
>> > 5. Ran sample MR Jobs
>> >
>> > -Vinay
>> >
>> >
>> > On Sat, Jun 12, 2021 at 6:40 PM Ayush Saxena 
>> wrote:
>> >
>> > > +1,
>> > > Built from Source.
>> > > Successful Native Build on Ubuntu 20.04
>> > > Verified Checksums
>> > > Ran basic hdfs shell commands.
>> > > Ran simple MR jobs.
>> > > Browsed NN,DN,RM and NM UI.
>> > >
>> > > Thanx Wei-Chiu for driving the release.
>> > >
>> > > -Ayush
>> > >
>> > >
>> > > > On 12-Jun-2021, at 1:45 AM, epa...@apache.org wrote:
>> > > >
>> > > > +1 (binding)
>> > > > Eric
>> > > >
>> > > >
>> > > > On Tuesday, June 1, 2021, 5:29:49 AM CDT, Wei-Chiu Chuang <
>> > > weic...@apache.org> wrote:
>> > > >
>> > > > Hi community,
>> > > >
>> > > > This is the release candidate RC3 of Apache Hadoop 3.3.1 line. All
>> > > blocker
>> > > > issues have been resolved [1] again.
>> > > >
>> > > > There are 2 additional issues resolved for RC3:
>> > > > * Revert "MAPREDUCE-7303. Fix TestJobResourceUploader failures after
>> > > > HADOOP-16878
>> > > > * Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the
>> > > source
>> > > > and destination are the same
>> > > >
>> > > > There are 4 issues resolved for RC2:
>> > > > * HADOOP-17666. Update LICENSE for 3.3.1
>> > > > * MAPREDUCE-7348. TestFrameworkUploader#testNativeIO fails. (#3053)
>> > > > * Revert "HADOOP-17563. Update Bouncy Castle to 1.68. (#2740)"
>> (#3055)
>> > > > * HADOOP-17739. Use hadoop-thirdparty 1.1.1. (#3064)
>> > > >
>> > > > The Hadoop-thirdparty 1.1.1, as previously mentioned, contains two
>> > extra
>> > > > fixes compared to hadoop-thirdparty 1.1.0:
>> > > > * HADOOP-17707. Remove jaeger document from site index.
>> > > > * HADOOP-17730. Add back error_prone
>> > > >
>> > > > *RC tag is release-3.3.1-RC3
>> > > > https://github.com/apache/hadoop/releases/tag/release-3.3.1-RC3
>> > > >
>> > > > *The RC3 artifacts are at*:
>> > > > https://home.apache.org/~weichiu/hadoop-3.3.1-RC3/
>> > > > ARM artifacts:
>> https://home.apache.org/~weichiu/hadoop-3.3.1-RC3-arm/
>> > > >
>> > > > *The maven artifacts are hosted here:*
>> > > >
>> >
>> https://repository.apache.org/content/repositories/orgapachehadoop-1320/
>> > > >
>> > > > *My public key is available here:*
>> > > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>> > > >
>> > > >
>> > > > Things I've verified:
>> > > > * all blocker issues targeting 3.3.1 have been resolved.
>> > > > * stable/evolving API changes between 3.3.0 and 3.3.1 are
>> compatible.
>> > > > * LICENSE and NOTICE files checked
>> > > > * RELEASENOTES and CHANGELOG
>> > > > * rat check passed.
>> > > > * Built HBase master branch on top of Hadoop 3.3.1 RC2, ran unit
>> tests.
>> > > > * Built Ozone master on top of Hadoop 3.3.1 RC2, ran unit tests.
>> > > > * Extra: built 50 other open source projects on top of Hadoop 3.3.1
>> > RC2.
>> > > > Had to patch some of them due to commons-lang migration (Hadoop
>> 3.2.0)
>> > > and
>> > > > dependency divergence. Issues are being identified but so far
>> nothing
>> > > > blocker for Hadoop itself.
>> > > >
>> > > > Please try the release and vote. The vote will run for 5 days.
>> > > >
>> > > > My +1 to start,
>> > > >
>> > > > [1] https://issues.apache.org/jira/issues/?filter=12350491
>> > > > [2]
>> > > >
>> > >
>> >
>> https://github.com/apache/hadoop/compare/release-3.3.1-RC1...release-3.3.1-RC3
>> > >
>> > >
>> > >
>> >
>>
>


Re: [VOTE] Release Apache Hadoop 3.3.1 RC3

2021-06-04 Thread Wei-Chiu Chuang
Dropped them. Please check again.

On Fri, Jun 4, 2021 at 1:19 PM Chao Sun  wrote:

> Hi Wei-Chiu,
>
> It seems the Maven staging repository is still pointing at RC0:
> https://repository.apache.org/content/repositories/staging/org/apache/hadoop/hadoop-client/3.3.1/,
> and it's probably because you need to drop the old RCs in Apache Nexus
> server. Could you do that? Thanks.
>
> Chao
>
> On Thu, Jun 3, 2021 at 12:52 PM Steve Loughran 
> wrote:
>
>> Extend for a bit from the last RC, as it takes time to qualify.
>>
>> I'm busy testing, doing
>>
>> - packaging 
>> - CLI working with abfs and s3, both fs and cloudstore library calls
>> - building downstream projects (so validating maven artifacts). cloudstore
>> and spark there
>> - building downstream of downstream projects, i.e. my spark cloud
>> IO/committer test module. Moving to spark 3 cost me the afternoon, not
>> through any incompatible changes there but because the upgraded scalatest
>> "moved" their foundational FunTest class to a different package and name.
>> Not happy with Team Scalatest there.
>> - reviewing the docs in the -aws and azure modules to see they link
>> together OK.
>>
>> So far so good.
>>
>> One troublespot (which isn't any reason to hold up the release), is that
>> the table in the directory_markers markdown file doesn't render right.
>>  Created https://issues.apache.org/jira/browse/HADOOP-17746.
>>
>> This is *not a blocker*
>>
>> I can prepare a fix and we can have it in so that if any other changes
>> come
>> in the page will look OK.
>>
>>
>>
>>
>> On Thu, 3 Jun 2021 at 17:30, Wei-Chiu Chuang  wrote:
>>
>> > Hello,
>> > do we want to extend the release vote? I understand a big release like
>> this
>> > takes time to validate.
>> >
>> > I am aware a number of people are testing it: Attila tested Ozone on
>> Hadoop
>> > 3.3.1 RC3, Stack is testing HBase, Chao tested Spark.
>> > I also learned that anecdotally Spark on S3 on Hadoop 3.3 is faster by
>> 20%
>> > over Hadoop 3.2 library.
>> >
>>
>> ooh. That'll be from Mukund's listing improvements translating into query
>> planning speedups,
>>
>> Nice
>>
>> If someone benchmarking this stuff were to enable directory marker
>> retention
>> fs.s3a.directory.marker.retention=keep , I'd be interested to know how
>> much
>> speedup that
>> delivers on versioned and unversioned buckets.
>>
>> Unversioned: reduces risk of IO throttling on writes
>> Versioned: that and should stop subsequent LIST operations from getting
>> slowed down from all the tombstones
>>
>>
>> >
>> > Looks like we may need some more time to test. How about extending it
>> by a
>> > week?
>> >
>>
>>
>> That would be good. This week included some holidays for people in the
>> US/UK which is why I'm a bit behind on my testing.
>>
>>
>> >
>> >
>>
>


Re: [VOTE] Release Apache Hadoop 3.3.1 RC3

2021-06-03 Thread Wei-Chiu Chuang
I was thinking 5 + 7 days from Tuesday, which lands on next Sunday and
gives everyone a whole extra week to validate.
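
The date arithmetic can be checked with a short sketch. It assumes the vote opened on Tuesday, 2021-06-01 (the date of the RC3 vote mail); the variable names are illustrative only.

```python
from datetime import date, timedelta

# Assumption: the RC3 vote opened on Tuesday, 2021-06-01.
vote_start = date(2021, 6, 1)

# Original 5-day voting window plus the proposed 7-day extension.
original_days = 5
extension_days = 7
vote_end = vote_start + timedelta(days=original_days + extension_days)

# weekday(): Monday is 0, so Tuesday is 1 and Sunday is 6.
print(vote_start.isoformat(), vote_start.weekday())
print(vote_end.isoformat(), vote_end.weekday())
```

Under that assumption the extended vote closes on Sunday, 2021-06-13.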

On Fri, Jun 4, 2021 at 12:38 AM Sean Busbey  wrote:

> Sounds good to me. That would be until Thursday June 10th, right?
>
> As a side note it’s concerning that a double-dot maintenance release is a
> big release, but I get that it’s the current state of the project.
>
> > On Jun 3, 2021, at 11:30 AM, Wei-Chiu Chuang  wrote:
> >
> > Hello,
> > do we want to extend the release vote? I understand a big release like
> this
> > takes time to validate.
> >
> > I am aware a number of people are testing it: Attila tested Ozone on
> Hadoop
> > 3.3.1 RC3, Stack is testing HBase, Chao tested Spark.
> > I also learned that anecdotally Spark on S3 on Hadoop 3.3 is faster by
> 20%
> > over Hadoop 3.2 library.
> >
> > Looks like we may need some more time to test. How about extending it by
> a
> > week?
> >
> > On Tue, Jun 1, 2021 at 6:29 PM Wei-Chiu Chuang 
> wrote:
> >
> >> Hi community,
> >>
> >> This is the release candidate RC3 of Apache Hadoop 3.3.1 line. All
> blocker
> >> issues have been resolved [1] again.
> >>
> >> There are 2 additional issues resolved for RC3:
> >> * Revert "MAPREDUCE-7303. Fix TestJobResourceUploader failures after
> >> HADOOP-16878
> >> * Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the
> source
> >> and destination are the same
> >>
> >> There are 4 issues resolved for RC2:
> >> * HADOOP-17666. Update LICENSE for 3.3.1
> >> * MAPREDUCE-7348. TestFrameworkUploader#testNativeIO fails. (#3053)
> >> * Revert "HADOOP-17563. Update Bouncy Castle to 1.68. (#2740)" (#3055)
> >> * HADOOP-17739. Use hadoop-thirdparty 1.1.1. (#3064)
> >>
> >> The Hadoop-thirdparty 1.1.1, as previously mentioned, contains two extra
> >> fixes compared to hadoop-thirdparty 1.1.0:
> >> * HADOOP-17707. Remove jaeger document from site index.
> >> * HADOOP-17730. Add back error_prone
> >>
> >> *RC tag is release-3.3.1-RC3
> >> https://github.com/apache/hadoop/releases/tag/release-3.3.1-RC3
> >>
> >> *The RC3 artifacts are at*:
> >> https://home.apache.org/~weichiu/hadoop-3.3.1-RC3/
> >> ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC3-arm/
> >>
> >> *The maven artifacts are hosted here:*
> >>
> https://repository.apache.org/content/repositories/orgapachehadoop-1320/
> >>
> >> *My public key is available here:*
> >> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >>
> >>
> >> Things I've verified:
> >> * all blocker issues targeting 3.3.1 have been resolved.
> >> * stable/evolving API changes between 3.3.0 and 3.3.1 are compatible.
> >> * LICENSE and NOTICE files checked
> >> * RELEASENOTES and CHANGELOG
> >> * rat check passed.
> >> * Built HBase master branch on top of Hadoop 3.3.1 RC2, ran unit tests.
> >> * Built Ozone master on top of Hadoop 3.3.1 RC2, ran unit tests.
> >> * Extra: built 50 other open source projects on top of Hadoop 3.3.1 RC2.
> >> Had to patch some of them due to commons-lang migration (Hadoop 3.2.0)
> and
> >> dependency divergence. Issues are being identified but so far nothing
> >> blocker for Hadoop itself.
> >>
> >> Please try the release and vote. The vote will run for 5 days.
> >>
> >> My +1 to start,
> >>
> >> [1] https://issues.apache.org/jira/issues/?filter=12350491
> >> [2]
> >>
> https://github.com/apache/hadoop/compare/release-3.3.1-RC1...release-3.3.1-RC3
> >>
> >>
> >>
>
>
>


Re: [VOTE] Release Apache Hadoop 3.3.1 RC3

2021-06-03 Thread Wei-Chiu Chuang
Hello,
do we want to extend the release vote? I understand a big release like this
takes time to validate.

I am aware a number of people are testing it: Attila tested Ozone on Hadoop
3.3.1 RC3, Stack is testing HBase, Chao tested Spark.
I also learned that anecdotally Spark on S3 on Hadoop 3.3 is faster by 20%
over Hadoop 3.2 library.

Looks like we may need some more time to test. How about extending it by a
week?

On Tue, Jun 1, 2021 at 6:29 PM Wei-Chiu Chuang  wrote:

> Hi community,
>
> This is the release candidate RC3 of Apache Hadoop 3.3.1 line. All blocker
> issues have been resolved [1] again.
>
> There are 2 additional issues resolved for RC3:
> * Revert "MAPREDUCE-7303. Fix TestJobResourceUploader failures after
> HADOOP-16878
> * Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the source
> and destination are the same
>
> There are 4 issues resolved for RC2:
> * HADOOP-17666. Update LICENSE for 3.3.1
> * MAPREDUCE-7348. TestFrameworkUploader#testNativeIO fails. (#3053)
> * Revert "HADOOP-17563. Update Bouncy Castle to 1.68. (#2740)" (#3055)
> * HADOOP-17739. Use hadoop-thirdparty 1.1.1. (#3064)
>
> The Hadoop-thirdparty 1.1.1, as previously mentioned, contains two extra
> fixes compared to hadoop-thirdparty 1.1.0:
> * HADOOP-17707. Remove jaeger document from site index.
> * HADOOP-17730. Add back error_prone
>
> *RC tag is release-3.3.1-RC3
> https://github.com/apache/hadoop/releases/tag/release-3.3.1-RC3
>
> *The RC3 artifacts are at*:
> https://home.apache.org/~weichiu/hadoop-3.3.1-RC3/
> ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC3-arm/
>
> *The maven artifacts are hosted here:*
> https://repository.apache.org/content/repositories/orgapachehadoop-1320/
>
> *My public key is available here:*
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
>
> Things I've verified:
> * all blocker issues targeting 3.3.1 have been resolved.
> * stable/evolving API changes between 3.3.0 and 3.3.1 are compatible.
> * LICENSE and NOTICE files checked
> * RELEASENOTES and CHANGELOG
> * rat check passed.
> * Built HBase master branch on top of Hadoop 3.3.1 RC2, ran unit tests.
> * Built Ozone master on top of Hadoop 3.3.1 RC2, ran unit tests.
> * Extra: built 50 other open source projects on top of Hadoop 3.3.1 RC2.
> Had to patch some of them due to commons-lang migration (Hadoop 3.2.0) and
> dependency divergence. Issues are being identified but so far nothing
> blocker for Hadoop itself.
>
> Please try the release and vote. The vote will run for 5 days.
>
> My +1 to start,
>
> [1] https://issues.apache.org/jira/issues/?filter=12350491
> [2]
> https://github.com/apache/hadoop/compare/release-3.3.1-RC1...release-3.3.1-RC3
>
>
>


Re: [VOTE] Hadoop 3.1.x EOL

2021-06-03 Thread Wei-Chiu Chuang
+1

On Thu, Jun 3, 2021 at 2:14 PM Akira Ajisaka  wrote:

> Dear Hadoop developers,
>
> Given the feedback from the discussion thread [1], I'd like to start
> an official vote
> thread for the community to vote and start the 3.1 EOL process.
>
> What this entails:
>
> (1) an official announcement that no further regular Hadoop 3.1.x releases
> will be made after 3.1.4.
> (2) resolve JIRAs that specifically target 3.1.5 as won't fix.
>
> This vote will run for 7 days and conclude by June 10th, 16:00 JST [2].
>
> Committers are eligible to cast binding votes. Non-committers are welcomed
> to cast non-binding votes.
>
> Here is my vote, +1
>
> [1] https://s.apache.org/w9ilb
> [2]
> https://www.timeanddate.com/worldclock/fixedtime.html?msg=4=20210610T16=248
>
> Regards,
> Akira
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>


[VOTE] Release Apache Hadoop 3.3.1 RC3

2021-06-01 Thread Wei-Chiu Chuang
Hi community,

This is release candidate RC3 of the Apache Hadoop 3.3.1 line. All blocker
issues have been resolved [1] again.

There are 2 additional issues resolved for RC3:
* Revert "MAPREDUCE-7303. Fix TestJobResourceUploader failures after
HADOOP-16878"
* Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the source
and destination are the same"

There are 4 issues resolved for RC2:
* HADOOP-17666. Update LICENSE for 3.3.1
* MAPREDUCE-7348. TestFrameworkUploader#testNativeIO fails. (#3053)
* Revert "HADOOP-17563. Update Bouncy Castle to 1.68. (#2740)" (#3055)
* HADOOP-17739. Use hadoop-thirdparty 1.1.1. (#3064)

The Hadoop-thirdparty 1.1.1, as previously mentioned, contains two extra
fixes compared to hadoop-thirdparty 1.1.0:
* HADOOP-17707. Remove jaeger document from site index.
* HADOOP-17730. Add back error_prone

*RC tag is release-3.3.1-RC3
https://github.com/apache/hadoop/releases/tag/release-3.3.1-RC3

*The RC3 artifacts are at*:
https://home.apache.org/~weichiu/hadoop-3.3.1-RC3/
ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC3-arm/

*The maven artifacts are hosted here:*
https://repository.apache.org/content/repositories/orgapachehadoop-1320/

*My public key is available here:*
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS


Things I've verified:
* all blocker issues targeting 3.3.1 have been resolved.
* stable/evolving API changes between 3.3.0 and 3.3.1 are compatible.
* LICENSE and NOTICE files checked
* RELEASENOTES and CHANGELOG
* rat check passed.
* Built HBase master branch on top of Hadoop 3.3.1 RC2, ran unit tests.
* Built Ozone master on top of Hadoop 3.3.1 RC2, ran unit tests.
* Extra: built 50 other open source projects on top of Hadoop 3.3.1 RC2.
Had to patch some of them due to the commons-lang migration (Hadoop 3.2.0) and
dependency divergence. Issues are still being identified, but so far none is a
blocker for Hadoop itself.

Please try the release and vote. The vote will run for 5 days.

My +1 to start,

[1] https://issues.apache.org/jira/issues/?filter=12350491
[2]
https://github.com/apache/hadoop/compare/release-3.3.1-RC1...release-3.3.1-RC3
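
For anyone validating the artifacts, a minimal sketch of a checksum check follows. It assumes the tarball and its expected SHA-512 hex digest have already been downloaded; the function name `sha512_matches` is hypothetical and not part of any Hadoop tooling. A checksum match does not replace verifying the signature against the KEYS file.

```python
import hashlib

def sha512_matches(path, expected_hex, chunk_size=1 << 20):
    """Return True if the SHA-512 of the file at `path` equals `expected_hex`."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read in chunks so large release tarballs are not loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    # Normalise surrounding whitespace and letter case before comparing.
    return digest.hexdigest() == expected_hex.strip().lower()
```

A caller would pass the local tarball path and the digest string copied from the published checksum file.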


[jira] [Reopened] (MAPREDUCE-7303) Fix TestJobResourceUploader failures after HADOOP-16878

2021-06-01 Thread Wei-Chiu Chuang (Jira)


 [ https://issues.apache.org/jira/browse/MAPREDUCE-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang reopened MAPREDUCE-7303:


We're reverting HADOOP-16878 so this should be reverted as well.

> Fix TestJobResourceUploader failures after HADOOP-16878
> ---
>
> Key: MAPREDUCE-7303
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7303
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: MAPREDUCE-7303-001.patch
>
>
> Currently, two test cases fail with NPE:
> {{org.apache.hadoop.mapreduce.TestJobResourceUploader.testOriginalPathIsRoot()}}
> {{org.apache.hadoop.mapreduce.TestJobResourceUploader.testOriginalPathEndsInSlash()}}
> Root cause is the src/dst qualified path check introduced by HADOOP-16878.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop Thirdparty 1.1.1 RC0

2021-05-31 Thread Wei-Chiu Chuang
I am +1 as well

So the result of the vote is:

4 +1, 1 non-binding +1, no +0, no -1.

This vote passed. I'll go ahead and release the bits and 3.3.1 RC2.

Thanks all!!!
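
As a rough sketch of the tally: ASF release votes need at least three binding +1 votes and more binding +1 than -1 votes. The code below assumes the four unqualified +1s above are binding; the function name is illustrative.

```python
def release_vote_passes(binding_plus, binding_minus):
    """ASF release vote rule: at least three binding +1s, and more +1s than -1s."""
    return binding_plus >= 3 and binding_plus > binding_minus

# Assumption: the four unqualified +1 votes are binding.
binding_plus_one = 4
non_binding_plus_one = 1  # recorded, but not counted towards the result
binding_minus_one = 0

passed = release_vote_passes(binding_plus_one, binding_minus_one)
print("vote passed:", passed)
```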

On Fri, May 28, 2021 at 11:23 AM Xiaoqiao He  wrote:

> +1
>
> - Verified checksums and signatures.
> - Checked CHANGELOG / RELEASE NOTES.
> - Built from source with jdk1.8.0_202.
> - Built Hadoop trunk with hadoop-thirdparty 1.1.1 and checked with pseudo
> distribution hadoop cluster deployment.
>
> Thanks Wei-Chiu for your work on this. I am confused about the
> relationship between
> version 1.1.0 and 1.1.1 since we voted 1.1.0-RC0[1] a few days ago. Are we
> prepared
> to release these versions at the same time or given up on 1.1.0?
>
> Thanks again.
>
> Regards,
> - He Xiaoqiao
>
> [1]
> https://lists.apache.org/thread.html/r79b13d0c34d14cd086bf97c1d87c72782fcc3c9e20569f5cdd4c7124%40%3Ccommon-dev.hadoop.apache.org%3E
>
> On Fri, May 28, 2021 at 9:52 AM Akira Ajisaka  wrote:
>
>> +1
>>
>> - Verified checksums and signatures
>> - Built from source with -Psrc profile
>> - Checked the documents
>> - Compiled Hadoop trunk and branch-3.3 with Hadoop third-party 1.1.1.
>>
>> -Akira
>>
>> On Wed, May 26, 2021 at 5:29 PM Wei-Chiu Chuang 
>> wrote:
>> >
>> > Hi folks,
>> >
>> > I have put together a release candidate (RC0) for Hadoop Thirdparty
>> > 1.1.1 which will be consumed by Hadoop 3.3.1 RC2.
>> >
>> >
>> > The RC is available at:
>> > https://people.apache.org/~weichiu/hadoop-thirdparty-1.1.1-RC0/
>> >
>> >
>> > The RC tag in svn is
>> > here:
>> https://github.com/apache/hadoop-thirdparty/releases/tag/release-1.1.1-RC0
>> >
>> > The maven artifacts are staged at
>> >
>> >
>> https://repository.apache.org/content/repositories/orgapachehadoop-1316/
>> >
>> >
>> > Comparing to 1.1.0, there are two additional fixes:
>> >
>> > HADOOP-17707. Remove jaeger document from site index.
>> > <
>> https://github.com/apache/hadoop-thirdparty/commit/e1db87b85117b5694972f2725aa32c9975a83b5b
>> >
>> >
>> > HADOOP-17730. Add back error_prone
>> > <
>> https://github.com/apache/hadoop-thirdparty/commit/db2fc27e2f53637a06c36c3a9d8dae0a8c894cd8
>> >
>> >
>> > You can find my public key
>> > at:https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>> > <http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS>
>> >
>> > Please try the release and vote. The vote will run for 5 days.
>> >
>> > Thanks
>> > Weichiu
>>
>>
>>


Re: [VOTE] Release Apache Hadoop Thirdparty 1.1.1 RC0

2021-05-27 Thread Wei-Chiu Chuang
Hi,
Sorry, I understand it's a lot of work to review each release, so I don't
take it lightly.

While testing Ozone, I realized the build breaks due to the missing
error_prone from hadoop-thirdparty. HADOOP-17730
<https://issues.apache.org/jira/browse/HADOOP-17730>
I have a dirty workaround to resolve this in Ozone, but I imagine the same
issue could occur for other downstream applications.

Since Hadoop 3.3.1 requires an RC2 anyway, I figured we could slip in
thirdparty 1.1.1.

On Fri, May 28, 2021 at 11:23 AM Xiaoqiao He  wrote:

> +1
>
> - Verified checksums and signatures.
> - Checked CHANGELOG / RELEASE NOTES.
> - Built from source with jdk1.8.0_202.
> - Built Hadoop trunk with hadoop-thirdparty 1.1.1 and checked with pseudo
> distribution hadoop cluster deployment.
>
> Thanks Wei-Chiu for your work on this. I am confused about the
> relationship between
> version 1.1.0 and 1.1.1 since we voted 1.1.0-RC0[1] a few days ago. Are we
> prepared
> to release these versions at the same time or given up on 1.1.0?
>
> Thanks again.
>
> Regards,
> - He Xiaoqiao
>
> [1]
> https://lists.apache.org/thread.html/r79b13d0c34d14cd086bf97c1d87c72782fcc3c9e20569f5cdd4c7124%40%3Ccommon-dev.hadoop.apache.org%3E
>
> On Fri, May 28, 2021 at 9:52 AM Akira Ajisaka  wrote:
>
>> +1
>>
>> - Verified checksums and signatures
>> - Built from source with -Psrc profile
>> - Checked the documents
>> - Compiled Hadoop trunk and branch-3.3 with Hadoop third-party 1.1.1.
>>
>> -Akira
>>
>> On Wed, May 26, 2021 at 5:29 PM Wei-Chiu Chuang 
>> wrote:
>> >
>> > Hi folks,
>> >
>> > I have put together a release candidate (RC0) for Hadoop Thirdparty
>> > 1.1.1 which will be consumed by Hadoop 3.3.1 RC2.
>> >
>> >
>> > The RC is available at:
>> > https://people.apache.org/~weichiu/hadoop-thirdparty-1.1.1-RC0/
>> >
>> >
>> > The RC tag in svn is
>> > here:
>> https://github.com/apache/hadoop-thirdparty/releases/tag/release-1.1.1-RC0
>> >
>> > The maven artifacts are staged at
>> >
>> >
>> https://repository.apache.org/content/repositories/orgapachehadoop-1316/
>> >
>> >
>> > Comparing to 1.1.0, there are two additional fixes:
>> >
>> > HADOOP-17707. Remove jaeger document from site index.
>> > <
>> https://github.com/apache/hadoop-thirdparty/commit/e1db87b85117b5694972f2725aa32c9975a83b5b
>> >
>> >
>> > HADOOP-17730. Add back error_prone
>> > <
>> https://github.com/apache/hadoop-thirdparty/commit/db2fc27e2f53637a06c36c3a9d8dae0a8c894cd8
>> >
>> >
>> > You can find my public key
>> > at:https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>> > <http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS>
>> >
>> > Please try the release and vote. The vote will run for 5 days.
>> >
>> > Thanks
>> > Weichiu
>>
>>
>>


Re: [VOTE] Release Apache Hadoop 3.3.1 - RC1

2021-05-27 Thread Wei-Chiu Chuang
Ozone tests:

A few issues were found during the tests.
(1) HADOOP-17730 <https://issues.apache.org/jira/browse/HADOOP-17730>. As
mentioned previously, this is a Hadoop-side problem, which requires
a new Hadoop thirdparty release.
(2) Class path conflicts. When Ozone was split from Hadoop, it copied a few
test utilities from Hadoop but didn't relocate the classpaths. This is an
Ozone-side problem.
(3) Mockito 2 migration caused by HADOOP-14178
<https://issues.apache.org/jira/browse/HADOOP-14178>. Again, Ozone's test
code partially depends on Hadoop, which previously used Mockito 1. Some
migration work is required on the Ozone side.


HBase:

HBASE-25928 <https://issues.apache.org/jira/browse/HBASE-25928>: behavior
change in Configuration in Hadoop 3.3.0, caused by
HADOOP-15708 <https://issues.apache.org/jira/browse/HADOOP-15708>.

I think we had better document this somewhere, because the Configuration
class is widely used.

Additionally,

   1. MAPREDUCE-7348 <https://issues.apache.org/jira/browse/MAPREDUCE-7348>,
   caused by the commons-io update.
   2. HADOOP-17563 <https://issues.apache.org/jira/browse/HADOOP-17563> bumped
   the bouncycastle version, but it breaks Spark and HBase.
   3. HADOOP-17292 <https://issues.apache.org/jira/browse/HADOOP-17292> replaced
   the lz4 codec with lz4-java. But it's declared as provided scope, requiring
   downstream applications such as HBase to declare the dependency explicitly.
   It would be best to add a release note.



I propose starting an RC2 that includes
(1) hadoop-thirdparty 1.1.1 (release vote in progress),
(2) MAPREDUCE-7348,
(3) revert HADOOP-17563,
(4) add release notes to HADOOP-17292 and HADOOP-14178.

Please review the hadoop-thirdparty 1.1.1 RC0 and cast a vote. Thanks!

On Wed, May 26, 2021 at 9:39 AM Wei-Chiu Chuang  wrote:

>
> Found a few issues building downstreams
>
> Ozone: HADOOP-17730 <https://issues.apache.org/jira/browse/HADOOP-17730> --
> we may have to make a hadoop-thirdparty 1.1.1 release to address this problem.
>
> HBase: HBASE-25908 <https://issues.apache.org/jira/browse/HBASE-25908> --
> this is mostly an HBase-side problem.
>
> hbase-filesystem (HBOSS) does not compile because of the changing API on
> the Hadoop side. This is expected since the interface isn't stable.
>
> Tez -- does not compile due to TEZ-4298
> <https://issues.apache.org/jira/browse/TEZ-4298>
>
> ===
> HBase test failures:
>
> TestBackupSmallTests
> java.lang.NoClassDefFoundError:
> org/bouncycastle/asn1/edec/EdECObjectIdentifiers
>
> --> HBase needs to bump the runtime version of bouncycastle to 1.68 to
> find the missing class file.
>
> TestCompressionTest
> java.lang.AssertionError
> at
> org.apache.hadoop.hbase.util.TestCompressionTest.testTestCompression(TestCompressionTest.java:89)
> (This is due to the lz4/lzo compression codec change)
> lz4-java is declared in provided scope. Downstream applications that use
> lz4-java must declare lz4-java explicitly. I think we need to add a release
> note for it.
>
>
> TestHBaseConfiguration
> java.lang.AssertionError: expected null, but was:<1000>
> at
> org.apache.hadoop.hbase.TestHBaseConfiguration.testDeprecatedConfigurations(TestHBaseConfiguration.java:150)
>
> On Mon, May 24, 2021 at 10:36 PM Wei-Chiu Chuang 
> wrote:
>
>> Hi community,
>>
>> This is the release candidate RC1 of Apache Hadoop 3.3.1 line. All
>> blocker issues have been resolved.
>>
>> It contains 697 fixed jira issues [2] since 3.3.0, which include a lot of
>> features and improvements (read the full set of release notes).
>>
>> The feature additions below are the highlights of the release.
>> Using lz4-java in Lz4Codec
>> <https://issues.apache.org/jira/browse/HADOOP-17292>
>> Using snappy-java in SnappyCodec
>> <https://issues.apache.org/jira/browse/HADOOP-17125>
>> Provide Regex Based Mount Point In Inode Tree
>> <https://issues.apache.org/jira/browse/HADOOP-15891>
>> Add Public IOStatistics API
>> <https://issues.apache.org/jira/browse/HADOOP-16830>
>> ABFS: Delegation SAS Generator Updates
>> <https://issues.apache.org/jira/browse/HADOOP-17076>
>> ABFS: Delegation SAS generator for integration with Ranger
>> <https://issues.apache.org/jira/browse/HADOOP-16916>
>> Über-jira: S3A Hadoop 3.3.1 features
>> <https://issues.apache.org/jira/browse/HADOOP-16829>
>> Add Metrics to HttpFS Server
>> <https://issues.apache.org/jira/browse/HDFS-15711>
>> EC: Verify EC reconstruction correctness on DataNode
>> <https://issues.apache.org/jira/browse/HDFS-15759>
>> Standby NameNode process getBl

[VOTE] Release Apache Hadoop Thirdparty 1.1.1 RC0

2021-05-26 Thread Wei-Chiu Chuang
Hi folks,

I have put together a release candidate (RC0) for Hadoop Thirdparty
1.1.1 which will be consumed by Hadoop 3.3.1 RC2.


The RC is available at:
https://people.apache.org/~weichiu/hadoop-thirdparty-1.1.1-RC0/


The RC tag is here:
https://github.com/apache/hadoop-thirdparty/releases/tag/release-1.1.1-RC0

The maven artifacts are staged at

https://repository.apache.org/content/repositories/orgapachehadoop-1316/


Compared to 1.1.0, there are two additional fixes:

HADOOP-17707. Remove jaeger document from site index.


HADOOP-17730. Add back error_prone


You can find my public key at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS


Please try the release and vote. The vote will run for 5 days.

Thanks
Weichiu


Re: [VOTE] Release Apache Hadoop 3.3.1 - RC1

2021-05-25 Thread Wei-Chiu Chuang
Found a few issues building downstreams

Ozone: HADOOP-17730 <https://issues.apache.org/jira/browse/HADOOP-17730> --
we may have to make a hadoop-thirdparty 1.1.1 release to address this problem.

HBase: HBASE-25908 <https://issues.apache.org/jira/browse/HBASE-25908> --
this is mostly an HBase-side problem.

hbase-filesystem (HBOSS) does not compile because of the changing API on
the Hadoop side. This is expected since the interface isn't stable.

Tez -- does not compile due to TEZ-4298
<https://issues.apache.org/jira/browse/TEZ-4298>

===
HBase test failures:

TestBackupSmallTests
java.lang.NoClassDefFoundError:
org/bouncycastle/asn1/edec/EdECObjectIdentifiers

--> HBase needs to bump the runtime version of bouncycastle to 1.68 to find
the missing class file.

TestCompressionTest
java.lang.AssertionError
at
org.apache.hadoop.hbase.util.TestCompressionTest.testTestCompression(TestCompressionTest.java:89)
(This is due to the lz4/lzo compression codec change)
lz4-java is declared in provided scope. Downstream applications that use lz4-java
must declare lz4-java explicitly. I think we need to add a release note for
it.
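
To illustrate what "declare explicitly" means for a downstream build, here is a small sketch that checks whether a pom declares the dependency directly. It assumes the Maven coordinates `org.lz4:lz4-java`; the sample pom, its version number, and the helper name `declares_dependency` are hypothetical and for illustration only.

```python
import xml.etree.ElementTree as ET

# Standard Maven POM namespace.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical downstream pom; the version is a placeholder.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.lz4</groupId>
      <artifactId>lz4-java</artifactId>
      <version>1.7.1</version>
    </dependency>
  </dependencies>
</project>"""

def declares_dependency(pom_xml, group_id, artifact_id):
    """Return True if the pom lists the given artifact as a direct dependency."""
    root = ET.fromstring(pom_xml)
    for dep in root.findall("./m:dependencies/m:dependency", POM_NS):
        if (dep.findtext("m:groupId", namespaces=POM_NS) == group_id
                and dep.findtext("m:artifactId", namespaces=POM_NS) == artifact_id):
            return True
    return False

print(declares_dependency(SAMPLE_POM, "org.lz4", "lz4-java"))
```

The check only looks at direct dependencies, which is the point: a provided-scope dependency of Hadoop is not passed on transitively, so the downstream pom has to list it itself.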


TestHBaseConfiguration
java.lang.AssertionError: expected null, but was:<1000>
at
org.apache.hadoop.hbase.TestHBaseConfiguration.testDeprecatedConfigurations(TestHBaseConfiguration.java:150)

On Mon, May 24, 2021 at 10:36 PM Wei-Chiu Chuang  wrote:

> Hi community,
>
> This is the release candidate RC1 of Apache Hadoop 3.3.1 line. All
> blocker issues have been resolved.
>
> It contains 697 fixed jira issues [2] since 3.3.0, which include a lot of
> features and improvements (read the full set of release notes).
>
> The feature additions below are the highlights of the release.
> Using lz4-java in Lz4Codec
> <https://issues.apache.org/jira/browse/HADOOP-17292>
> Using snappy-java in SnappyCodec
> <https://issues.apache.org/jira/browse/HADOOP-17125>
> Provide Regex Based Mount Point In Inode Tree
> <https://issues.apache.org/jira/browse/HADOOP-15891>
> Add Public IOStatistics API
> <https://issues.apache.org/jira/browse/HADOOP-16830>
> ABFS: Delegation SAS Generator Updates
> <https://issues.apache.org/jira/browse/HADOOP-17076>
> ABFS: Delegation SAS generator for integration with Ranger
> <https://issues.apache.org/jira/browse/HADOOP-16916>
> Über-jira: S3A Hadoop 3.3.1 features
> <https://issues.apache.org/jira/browse/HADOOP-16829>
> Add Metrics to HttpFS Server
> <https://issues.apache.org/jira/browse/HDFS-15711>
> EC: Verify EC reconstruction correctness on DataNode
> <https://issues.apache.org/jira/browse/HDFS-15759>
> Standby NameNode process getBlocks request to reduce Active load
> <https://issues.apache.org/jira/browse/HDFS-13183>
> LocatedFileStatusFetcher to collect/publish IOStatistics
> <https://issues.apache.org/jira/browse/MAPREDUCE-7315>
>
> *RC tag is release-3.3.1-RC1
> https://github.com/apache/hadoop/releases/tag/release-3.3.1-RC1
>
> *The RC1 artifacts are at*:
> https://home.apache.org/~weichiu/hadoop-3.3.1-RC1/
> ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC1-arm/
>
>
> *The maven artifacts are hosted here:*
> https://repository.apache.org/content/repositories/orgapachehadoop-1314/
>
> *My public key is available here:*
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> [1] https://issues.apache.org/jira/issues/?filter=12350491
> [2]
> https://issues.apache.org/jira/issues/?jql=project%20in%20(YARN%2C%20HADOOP%2C%20MAPREDUCE%2C%20HDFS)%20AND%20fixVersion%20in%20(
> 3.3.1)%20AND%20status%20%3D%20Resolved%20ORDER%20BY%0AfixVersion%20ASC
>
>
> My ask:
> (1) please use the bits to test downstream applications. I am aware of a
> number of API changes between 3.1.x and 3.3.1 and even between 3.3.0 and
> 3.3.1. You should use this as an opportunity to test out applications and
> be ready for it.
>
> (2) please check out the release notes and change log, find out if
> anything important should be included in 3.3.1.
>
> Please try the release and vote. The vote will run for 5 days until
> 2021/05/30 at 00:00 CST.
>
> Things I've verified:
> * all blocker issues targeting 3.3.1 have been resolved.
> * stable/evolving API changes between 3.3.0 and 3.3.1 are compatible.
> * LICENSE and NOTICE files checked
> * RELEASENOTES and CHANGELOG
> * rat check passed.
>
>
> Regards,
> Weichiu
>
>


[VOTE] Release Apache Hadoop 3.3.1 - RC1

2021-05-24 Thread Wei-Chiu Chuang
Hi community,

This is the release candidate RC1 of Apache Hadoop 3.3.1 line. All blocker
issues have been resolved.

It contains 697 fixed jira issues [2] since 3.3.0, which include a lot of
features and improvements (read the full set of release notes).

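The following feature additions are the highlights of the release.
Using lz4-java in Lz4Codec
<https://issues.apache.org/jira/browse/HADOOP-17292>
Using snappy-java in SnappyCodec
<https://issues.apache.org/jira/browse/HADOOP-17125>
Provide Regex Based Mount Point In Inode Tree
<https://issues.apache.org/jira/browse/HADOOP-15891>
Add Public IOStatistics API
<https://issues.apache.org/jira/browse/HADOOP-16830>
ABFS: Delegation SAS Generator Updates
<https://issues.apache.org/jira/browse/HADOOP-17076>
ABFS: Delegation SAS generator for integration with Ranger
<https://issues.apache.org/jira/browse/HADOOP-16916>
Über-jira: S3A Hadoop 3.3.1 features
<https://issues.apache.org/jira/browse/HADOOP-16829>
Add Metrics to HttpFS Server
<https://issues.apache.org/jira/browse/HDFS-15711>
EC: Verify EC reconstruction correctness on DataNode
<https://issues.apache.org/jira/browse/HDFS-15759>
Standby NameNode process getBlocks request to reduce Active load
<https://issues.apache.org/jira/browse/HDFS-13183>
LocatedFileStatusFetcher to collect/publish IOStatistics
<https://issues.apache.org/jira/browse/MAPREDUCE-7315>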


*RC tag is release-3.3.1-RC1*:
https://github.com/apache/hadoop/releases/tag/release-3.3.1-RC1

*The RC1 artifacts are at*:
https://home.apache.org/~weichiu/hadoop-3.3.1-RC1/
ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC1-arm/


*The maven artifacts are hosted here:*
https://repository.apache.org/content/repositories/orgapachehadoop-1314/

*My public key is available here:*
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

[1] https://issues.apache.org/jira/issues/?filter=12350491
[2]
https://issues.apache.org/jira/issues/?jql=project%20in%20(YARN%2C%20HADOOP%2C%20MAPREDUCE%2C%20HDFS)%20AND%20fixVersion%20in%20(
3.3.1)%20AND%20status%20%3D%20Resolved%20ORDER%20BY%0AfixVersion%20ASC


My ask:
(1) please use the bits to test downstream applications. I am aware of a
number of API changes between 3.1.x and 3.3.1 and even between 3.3.0 and
3.3.1. You should use this as an opportunity to test out applications and
be ready for it.

(2) please check out the release notes and change log, find out if anything
important should be included in 3.3.1.

Please try the release and vote. The vote will run for 5 days until
2021/05/30 at 00:00 CST.

Things I've verified:
* all blocker issues targeting 3.3.1 have been resolved.
* stable/evolving API changes between 3.3.0 and 3.3.1 are compatible.
* LICENSE and NOTICE files checked
* RELEASENOTES and CHANGELOG
* rat check passed.


Regards,
Weichiu


Re: [DISCUSS] which release lines should we still consider actively maintained?

2021-05-23 Thread Wei-Chiu Chuang
Sean,

For reasons I don't understand, I never received emails from your new
address in the mailing list. Only Akira's response.

I was just able to start a thread like this.

I am +1 to EOL 3.1.5.
Reason? Spark is already on Hadoop 3.2. Hive and Tez are actively working
to support Hadoop 3.3. HBase supports Hadoop 3.3 already. They are the most
common Hadoop applications, so I think the 3.1 line isn't necessarily that
important.

With Hadoop 3.3.1, we have a number of improvements to support a better
HDFS upgrade experience, so upgrading from Hadoop 3.1 should be relatively
easy. Application upgrades take some effort though (the commons-lang ->
commons-lang3 migration, for example).
I've been maintaining the HDFS code in branch-3.1, so from an
HDFS perspective the branch is always in a ready-to-release state.

The Hadoop 3.1 line is more than 3 years old. Maintaining this branch is
getting trickier. I am +100 to reducing the number of actively maintained
release lines. IMO, 2 Hadoop 3 lines + 1 Hadoop 2 line is a good idea.



For the Hadoop 3.3 line: if no one beats me to it, I plan to make a 3.3.2
release in 2-3 months, and another one 2-3 months after that.
Hadoop 3.3.1 has nearly 700 commits that are not in 3.3.0. It is very
difficult to make and validate a maintenance release with such a big
divergence in the code.


On Mon, May 24, 2021 at 12:06 PM Akira Ajisaka  wrote:

> Hi Sean,
>
> Thank you for starting the discussion.
>
> I think branch-2.10, branch-3.1, branch-3.2, branch-3.3, and trunk
> (3.4.x) are actively maintained.
>
> The next releases will be:
> - 3.4.0
> - 3.3.1 (Thanks, Wei-Chiu!)
> - 3.2.3
> - 3.1.5
> - 2.10.2
>
> > Are there folks willing to go through being release managers to get more
> of these release lines on a steady cadence?
>
> Now I'm interested in becoming a release manager of 3.1.5.
>
> > If I were to take up maintenance release for one of them which should it
> be?
>
> 3.2.3 or 2.10.2 seems to be a good choice.
>
> > Should we declare to our downstream users that some of these lines
> aren’t going to get more releases?
>
> Now I think we don't need to declare that. I believe 3.3.1, 3.2.3,
> 3.1.5, and 2.10.2 will be released in the near future.
> There are some earlier discussions of 3.1.x EoL, so 3.1.5 may be a
> final release of the 3.1.x release line.
>
> > Is there downstream facing documentation somewhere that I missed for
> setting expectations about our release cadence and actively maintained
> branches?
>
> As you commented, the confluence wiki pages for Hadoop releases were
> out of date. Updated [1].
>
> > Do we have a backlog of work written up that could make the release
> process easier for our release managers?
>
> The release process is documented and maintained:
> https://cwiki.apache.org/confluence/display/HADOOP2/HowToRelease
> Also, there are some backlogs [1], [2].
>
> [1]:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Active+Release+Lines
> [2]: https://cwiki.apache.org/confluence/display/HADOOP/Roadmap
>
> Thanks,
> Akira
>
> On Fri, May 21, 2021 at 7:12 AM Sean Busbey 
> wrote:
> >
> >
> > Hi folks!
> >
> > Which release lines do we as a community still consider actively
> maintained?
> >
> > I found an earlier discussion[1] where we had consensus to consider
> branches that don’t get maintenance releases on a regular basis end-of-life
> for practical purposes. The result of that discussion was written up in our
> wiki docs in the “EOL Release Branches” page, summarized here
> >
> > >  If no volunteer to do a maintenance release in a short to mid-term
> (like 3 months to 1 or 1.5 year).
> >
> > Looking at release lines that are still on our download page[3]:
> >
> > * Hadoop 2.10.z - last release 8 months ago
> > * Hadoop 3.1.z - last release 9.5 months ago
> > * Hadoop 3.2.z - last release 4.5 months ago
> > * Hadoop 3.3.z - last release 10 months ago
> >
> > And then trunk holds 3.4 which hasn’t had a release since the branch-3.3
> fork ~14 months ago.
> >
> > I can see that Wei-Chiu has been actively working on getting the 3.3.1
> release out[4] (thanks Wei-Chiu!) but I do not see anything similar for the
> other release lines.
> >
> > We also have pages on the wiki for our project roadmap of release[5],
> but it seems out of date since it lists in progress releases that have
> happened or branches we have announced as end of life, i.e. 2.8.
> >
> > We also have a group of pages (sorry, I’m not sure what the confluence
> jargon is for this) for “hadoop active release lines”[6] but this list has
> 2.8, 2.9, 3.0, 3.1, and 3.3. So several declared end of life lines and no
> 2.10 or 3.2 despite those being our release lines with the most recent
> releases.
> >
> > Are there folks willing to go through being release managers to get more
> of these release lines on a steady cadence?
> >
> > If I were to take up maintenance release for one of them which should it
> be?
> >
> > Should we declare to our downstream users that some of these lines
> aren’t going to get more releases?
> >
> 

Publishing Apache Hadoop 3.3.1-RC0 Preview bits

2021-05-20 Thread Wei-Chiu Chuang
Hi community,

These are the preview bits of the first release candidate of the Apache
Hadoop 3.3.1 line. It takes a tremendous amount of effort to produce a
Hadoop RC, and we still have a number of unresolved issues [1]. Meanwhile,
I'd like to publish preview bits so you can check them out sooner rather
than later. I expect to roll the official RC early next week.

It contains 693 fixed jira issues [2] since 3.3.0, which include a lot of
features and improvements (read the full set of release notes).

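The following feature additions are the highlights of the release.
Using lz4-java in Lz4Codec
<https://issues.apache.org/jira/browse/HADOOP-17292>
Using snappy-java in SnappyCodec
<https://issues.apache.org/jira/browse/HADOOP-17125>
Provide Regex Based Mount Point In Inode Tree
<https://issues.apache.org/jira/browse/HADOOP-15891>
Add Public IOStatistics API
<https://issues.apache.org/jira/browse/HADOOP-16830>
ABFS: Delegation SAS Generator Updates
<https://issues.apache.org/jira/browse/HADOOP-17076>
ABFS: Delegation SAS generator for integration with Ranger
<https://issues.apache.org/jira/browse/HADOOP-16916>
Über-jira: S3A Hadoop 3.3.1 features
<https://issues.apache.org/jira/browse/HADOOP-16829>
Add Metrics to HttpFS Server
<https://issues.apache.org/jira/browse/HDFS-15711>
EC: Verify EC reconstruction correctness on DataNode
<https://issues.apache.org/jira/browse/HDFS-15759>
Standby NameNode process getBlocks request to reduce Active load
<https://issues.apache.org/jira/browse/HDFS-13183>
LocatedFileStatusFetcher to collect/publish IOStatistics
<https://issues.apache.org/jira/browse/MAPREDUCE-7315>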


*RC tag is release-3.3.0-RC0 (4a0b8c92f599553a93a39071d287bb2cc3d1a19d)*

*The RC0 artifacts are at*:
https://home.apache.org/~weichiu/hadoop-3.3.1-RC0/

*The maven artifacts are hosted here:*
https://repository.apache.org/content/repositories/orgapachehadoop-1310/

*My public key is available here:*
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

[1] https://issues.apache.org/jira/issues/?filter=12350491
[2]
https://issues.apache.org/jira/issues/?jql=project%20in%20(YARN%2C%20HADOOP%2C%20MAPREDUCE%2C%20HDFS)%20AND%20fixVersion%20in%20(3.3.1)%20AND%20status%20%3D%20Resolved%20ORDER%20BY%0AfixVersion%20ASC


My ask:
(1) please use the bits to test downstream applications. I am aware of a
number of API changes between 3.1.x and 3.3.1 and even between 3.3.0 and
3.3.1. You should use this as an opportunity to test out applications and
be ready for it.
(AFAIK Knox, Ranger, Tez, Phoenix-omid don't compile)

(2) please check out the release notes and change log, find out if anything
important should be included in 3.3.1.

Regards,


1. project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.3.1) AND
status = Resolved ORDER BY
fixVersion ASC
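
The JQL above is what the long percent-encoded link in [2] carries. As a minimal sketch, assuming the search endpoint accepts the query in a `jql` parameter as the link in [2] does, such a URL could be assembled like this (the helper name `build_search_url` is hypothetical):

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Base of the issue search page, taken from the link in [2].
JIRA_SEARCH_URL = "https://issues.apache.org/jira/issues/"


def build_search_url(projects, fix_version, status="Resolved"):
    """Build a search URL listing issues fixed in one version."""
    jql = (
        f"project in ({', '.join(projects)}) "
        f"AND fixVersion in ({fix_version}) "
        f"AND status = {status} "
        "ORDER BY fixVersion ASC"
    )
    # quote_via=quote encodes spaces as %20 instead of "+".
    return JIRA_SEARCH_URL + "?" + urlencode({"jql": jql}, quote_via=quote)


if __name__ == "__main__":
    url = build_search_url(["YARN", "HADOOP", "MAPREDUCE", "HDFS"], "3.3.1")
    print(url)
    # Decoding the query string recovers the readable JQL again.
    print(parse_qs(urlsplit(url).query)["jql"][0])
```

Generating the link this way avoids the hand-wrapped URL breaking across lines, which is easy to hit when pasting it into an email.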


Hadoop Branching: branch-3.3 is now 3.3.2-SNAPSHOT

2021-05-19 Thread Wei-Chiu Chuang
Hi, I created branch-3.3.1. branch-3.3 now tracks 3.3.2-SNAPSHOT.

Please use Fix version 3.3.2 when cherry-picking your commits to branch-3.3.

Regards,
Wei-Chiu


Re: [DISCUSS] Hadoop 3.3.1 release

2021-05-18 Thread Wei-Chiu Chuang
Back at this. hadoop-thirdparty 1.1.0 has been released.

We have one real blocker for the 3.3.1 release (HDFS-15790 Make
ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist
<https://issues.apache.org/jira/browse/HDFS-15790>).
I tried to rebase the PR but it didn't work the way I wanted. Some help is
needed there. If either Vinay comes back to rebase the PR, or someone
else can take over the PR, that would be great.

Meanwhile, I'll cut a branch for 3.3.1 later. If everything works out fine,
I should be able to make an RC0 by tonight (which won't contain
HDFS-15790) so you can check it out quickly.

The first RC will probably not have the aarch64 binary. I don't have much
experience with that.

Cheers,
Wei-Chiu



On Thu, May 13, 2021 at 9:04 AM Wei-Chiu Chuang 
wrote:

> Hello it's me again.
>
> While working on the hadoop-thirdparty release, I would like to make
> progress on the Hadoop 3.3.1 release in parallel as well.
> Looking at the jira dashboard, I think the only real blocker is HDFS-15790
> https://github.com/apache/hadoop/pull/2767
>
> I'll try to review the remaining jiras and push out those that don't have
> much progress today.
>
> Thanks all!
> Weichiu
>
>
> On Tue, Apr 20, 2021 at 12:57 PM Wei-Chiu Chuang 
> wrote:
>
>> Billie and Viraj,
>>
>> As of today we are at 657 resolved jiras in 3.3.1
>> https://issues.apache.org/jira/issues/?filter=-1=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20AND%20status%20in%20(Resolved)%20AND%20fixVersion%20in%20(3.3.1)
>> There are so many new features (I counted 10; see
>> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.3+Release)
>> that it is barely a dot release.
>>
>> Please make every effort to backport the said jiras to branch-3.3. I'd
>> like to cut the branch sooner rather than later. I'd also call for more
>> frequent releases. At the current development pace, a dot release every
>> quarter would make sense IMO.
>>
>> Given the number of release blockers, I propose a tentative code freeze
>> date: end of next Friday. After that I'd like to cut the branch and make an
>> RC.
>>
>> Thoughts? I can't do this all alone. If you find things that are worth
>> stopping the release, please tag the jira with the label 'release-blocker'.
>>
>>
>>
>> On Tue, Apr 20, 2021 at 2:44 AM Billie Rinaldi  wrote:
>>
>>> I was thinking of backporting HADOOP-16948 for 3.3.1.
>>>
>>> Billie
>>>
>>> On Mon, Apr 19, 2021 at 1:33 AM Wei-Chiu Chuang
>>>  wrote:
>>>
>>> > Hello, reviving this thread.
>>> >
>>> > I created a dashboard for Hadoop 3.3.1 release.
>>> >
>>> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12336122
>>> > Also a jira to track the release work: HADOOP-17647
>>> > <https://issues.apache.org/jira/browse/HADOOP-17647>
>>> >
>>> > We are currently at 5 release blockers and 3 critical issues for Hadoop
>>> > 3.3.1. I'll go through each of them and push out the ones that aren't
>>> > really blocking us.
>>> >
>>> > If you believe there are more features/bug fixes we should include in
>>> > 3.3.1 (I spent the past few weeks backporting jiras but I'm sure I
>>> > missed some) please shout out.
>>> >
>>> > Meanwhile, I believe we need to release hadoop-thirdparty 1.1.0 too.
>>> > There are a number of tasks to be done there too. Let's start another
>>> > thread for hadoop-thirdparty 1.1.0 release.
>>> >
>>> > On Mon, Mar 15, 2021 at 7:04 PM hemanth boyina
>>> > <hemanthboyina...@gmail.com> wrote:
>>> >
>>> > > Hi Steve and Wei-Chiu
>>> > >
>>> > > Regarding IPv6: a few years back we rebased HADOOP-11890 onto trunk
>>> > > and tried to get it working with IPv6. We ran into some issues and
>>> > > made the changes required for IPv6 to work. After that, we tested the
>>> > > IPv6 changes rigorously on both IPv4 and IPv6 machines. These changes
>>> > > have been deployed in a production cluster for quite some time and
>>> > > have been in extensive use.
>>> > >
>>> > > I think it's a good time to add this feature.
>>> > >
>>> > > Thanks
>>> > > Hemanth Boyina
>>> > >
>>>

Re: [VOTE] hadoop-thirdparty 1.1.0-RC0

2021-05-18 Thread Wei-Chiu Chuang
My +1 too

So this release vote passed with 4 +1 and 1 non-binding +1.
The artifacts have been released from Nexus.

I'll go ahead and work on the Hadoop 3.3.1 release.

Thanks all!

On Sun, May 16, 2021 at 11:07 PM Xiaoqiao He  wrote:

> +1
>
> - Verified checksums and signatures.
> - Checked CHANGELOG / RELEASE NOTES.
> - Built from source with jdk1.8.0_202.
> - Built Hadoop trunk with hadoop-thirdparty 1.1.0 and checked with pseudo
> distribution hadoop cluster deployment.
>
> Thanks,
> - He Xiaoqiao
>
> On Fri, May 14, 2021 at 6:16 PM Viraj Jasani  wrote:
>
> > +1 (non-binding)
> >
> > * Signature: ok
> > * Checksum : ok
> > * CHANGELOG / RELEASENOTES: ok
> > * Rat check (1.8.0_171): ok
> >  - mvn clean apache-rat:check
> > * Built from source (1.8.0_171): ok
> >  - mvn clean install  -DskipTests
> >  - mvn clean install -DskipTests -Psrc
> > * Built Hadoop trunk against hadoop-thirdparty 1.1.0: ok
> >
> >
> > On Thu, May 13, 2021 at 5:25 PM Wei-Chiu Chuang 
> > wrote:
> >
> > > Hello my fellow Hadoop developers,
> > >
> > > I am putting together the first release candidate (RC0) for
> > > Hadoop-thirdparty 1.1.0. This is going to be consumed by the upcoming
> > > Hadoop 3.3.1 release.
> > >
> > > The RC is available at:
> > > https://people.apache.org/~weichiu/hadoop-thirdparty-1.1.0-RC0/
> > > The RC tag in github is here:
> > > https://github.com/apache/hadoop-thirdparty/tree/release-1.1.0-RC0
> > > The maven artifacts are staged at:
> > >
> https://repository.apache.org/content/repositories/orgapachehadoop-1309/
> > >
> > > You can find my public key at:
> > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS or
> > > https://people.apache.org/keys/committer/weichiu.asc
> > >
> > >
> > > Please try the release and vote. The vote will run for 5 days until
> > > 2021/05/19 at 00:00 CST.
> > >
> > > Note: Our post commit automation builds the code, and pushes the
> SNAPSHOT
> > > artifacts to central Maven, which is consumed by Hadoop trunk and
> > > branch-3.3, so it is a good validation that things are working properly
> > in
> > > hadoop-thirdparty.
> > >
> > > Thanks,
> > > Wei-Chiu
> > >
> >
>


Hadoop LZ4 codec questions

2021-05-18 Thread Wei-Chiu Chuang
Hi, I'm trying to understand the LZ4 codec usage in Hadoop.

Liang-Chi replaced the LZ4 codec implementation with lz4-java (HADOOP-17292).
The intent is to use the native library that is bundled in the jar, so there
is no need to install lz4 native libraries on the host machine.

However, there's another LZ4 codec that we ship inside
hadoop-mapreduce-client-nativetask.
https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/lz4/lz4.c

What's the purpose of this file? Does the MapReduce client use a different
path to do lz4 compression? Maybe it's dead code?


Updated Hadoop fix versions

2021-05-16 Thread Wei-Chiu Chuang
Hi,

I found hundreds of discrepancies between Apache JIRA and git history, and
updated the jira fix versions according to git history.

Some of the noticeable updates:

(1)
HADOOP-16730 ABFS: Support for Shared Access Signatures (SAS) -->
originally a new feature in 3.3.1, is actually in 3.3.0.

(2)
The issue for the following jira was deleted. I am not sure why.

commit 3aae563421748cbf180dd7027b5a3851d5ab1b28
Author: kwangsun 
Date:   Mon Mar 22 11:43:32 2021 +0900

HADOOP-17952. Fix the wrong CIDR range example in Proxy User
documentation. (#2780)

Signed-off-by: Akira Ajisaka 
(cherry picked from commit c8d327a4f1a7f15d6be35051414199d1d3fdc5ef)

(3) there's a branch-3 in the repo which is stale. I don't think we're
using that branch.

Here's a Python script (Python 3 compatible) that compares git history
against Apache JIRA.
Feel free to use it for future releases.
https://gist.github.com/jojochuang/ea6485e7b2e1da41dcd33427eb476fb6

To use it, install two modules:
pip3 install gitpython
pip3 install jira
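
To illustrate the kind of comparison involved, here is a minimal sketch. It is not the script from the gist: the function names and the sample data are hypothetical, and it works on plain lists so it runs without the two modules above. In practice the commit subjects would come from the git log of the release branch and the keys from a JIRA search on the fix version.

```python
import re

# Hadoop commit subjects conventionally begin with the JIRA key,
# e.g. "HADOOP-17292. Using lz4-java in Lz4Codec".
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def keys_in_commits(subjects):
    """Collect the first JIRA key mentioned in each commit subject."""
    keys = set()
    for subject in subjects:
        match = JIRA_KEY.search(subject)
        if match:
            keys.add(match.group(0))
    return keys


def find_discrepancies(commit_subjects, jira_keys_with_fix_version):
    """Compare keys seen in git with keys JIRA lists for the fix version."""
    in_git = keys_in_commits(commit_subjects)
    in_jira = set(jira_keys_with_fix_version)
    return {
        # Committed to the branch, but JIRA lacks the fix version.
        "missing_fix_version": sorted(in_git - in_jira),
        # JIRA claims the fix version, but no matching commit was found.
        "not_in_git": sorted(in_jira - in_git),
    }


if __name__ == "__main__":
    # Made-up sample input; the keys are placeholders, not real issues.
    subjects = [
        "HADOOP-10001. Example change one. (#1)",
        "HDFS-10002. Example change two.",
        "Revert an unrelated commit with no key",
    ]
    jira_keys = ["HADOOP-10001", "YARN-10003"]
    print(find_discrepancies(subjects, jira_keys))
```

Both directions of the difference matter: the first list finds issues whose fix version needs to be added, the second finds issues whose fix version should probably be removed or moved.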


[VOTE] hadoop-thirdparty 1.1.0-RC0

2021-05-13 Thread Wei-Chiu Chuang
Hello my fellow Hadoop developers,

I am putting together the first release candidate (RC0) for
Hadoop-thirdparty 1.1.0. This is going to be consumed by the upcoming
Hadoop 3.3.1 release.

The RC is available at:
https://people.apache.org/~weichiu/hadoop-thirdparty-1.1.0-RC0/
The RC tag in github is here:
https://github.com/apache/hadoop-thirdparty/tree/release-1.1.0-RC0
The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1309/

You can find my public key at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS or
https://people.apache.org/keys/committer/weichiu.asc


Please try the release and vote. The vote will run for 5 days until
2021/05/19 at 00:00 CST.

Note: Our post commit automation builds the code, and pushes the SNAPSHOT
artifacts to central Maven, which is consumed by Hadoop trunk and
branch-3.3, so it is a good validation that things are working properly in
hadoop-thirdparty.

Thanks,
Wei-Chiu


Re: [DISCUSS] Hadoop 3.3.1 release

2021-05-12 Thread Wei-Chiu Chuang
Hello it's me again.

While working on the hadoop-thirdparty release, I would like to make
progress on the Hadoop 3.3.1 release in parallel as well.
Looking at the jira dashboard, I think the only real blocker is HDFS-15790
https://github.com/apache/hadoop/pull/2767

I'll try to review the remaining jiras and push out those that don't have
much progress today.

Thanks all!
Weichiu


On Tue, Apr 20, 2021 at 12:57 PM Wei-Chiu Chuang 
wrote:

> Billie and Viraj,
>
> As of today we are at 657 resolved jiras in 3.3.1
> https://issues.apache.org/jira/issues/?filter=-1=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20AND%20status%20in%20(Resolved)%20AND%20fixVersion%20in%20(3.3.1)
> There are so many new features (I counted 10; see
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.3+Release)
> that it is barely a dot release.
>
> Please make every effort to backport the said jiras to branch-3.3. I'd
> like to cut the branch sooner rather than later. I'd also call for more
> frequent releases. At the current development pace, a dot release every
> quarter would make sense IMO.
>
> Given the number of release blockers, I propose a tentative code freeze
> date: end of next Friday. After that I'd like to cut the branch and make an
> RC.
>
> Thoughts? I can't do this all alone. If you find things that are worth
> stopping the release, please tag the jira with the label 'release-blocker'.
>
>
>
> On Tue, Apr 20, 2021 at 2:44 AM Billie Rinaldi  wrote:
>
>> I was thinking of backporting HADOOP-16948 for 3.3.1.
>>
>> Billie
>>
>> On Mon, Apr 19, 2021 at 1:33 AM Wei-Chiu Chuang
>>  wrote:
>>
>> > Hello, reviving this thread.
>> >
>> > I created a dashboard for Hadoop 3.3.1 release.
>> >
>> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12336122
>> > Also a jira to track the release work: HADOOP-17647
>> > <https://issues.apache.org/jira/browse/HADOOP-17647>
>> >
>> > We are currently at 5 release blockers and 3 critical issues for Hadoop
>> > 3.3.1. I'll go through each of them and push out the ones that aren't
>> > really blocking us.
>> >
>> > If you believe there are more features/bug fixes we should include in
>> > 3.3.1 (I spent the past few weeks backporting jiras but I'm sure I
>> > missed some) please shout out.
>> >
>> > Meanwhile, I believe we need to release hadoop-thirdparty 1.1.0 too.
>> > There are a number of tasks to be done there too. Let's start another
>> > thread for hadoop-thirdparty 1.1.0 release.
>> >
>> > On Mon, Mar 15, 2021 at 7:04 PM hemanth boyina
>> > <hemanthboyina...@gmail.com> wrote:
>> >
>> > > Hi Steve and Wei-Chiu
>> > >
>> > > Regarding IPv6: a few years back we rebased HADOOP-11890 onto trunk
>> > > and tried to get it working with IPv6. We ran into some issues and
>> > > made the changes required for IPv6 to work. After that, we tested the
>> > > IPv6 changes rigorously on both IPv4 and IPv6 machines. These changes
>> > > have been deployed in a production cluster for quite some time and
>> > > have been in extensive use.
>> > >
>> > > I think it's a good time to add this feature.
>> > >
>> > > Thanks
>> > > Hemanth Boyina
>> > >
>> > >
>> > >
>> > > On Thu, 11 Mar 2021, 10:22 Vinayakumar B, 
>> > wrote:
>> > >
>> > > > Hi David,
>> > > >
>> > > > >> Still hoping for help here:
>> > > >
>> > > > >> https://issues.apache.org/jira/browse/HDFS-15790
>> > > >
>> > > > I will raise a PR for the said solution soon (in a day or two).
>> > > >
>> > > > -Vinay
>> > > >
>> > > > On Thu, 11 Mar 2021 at 5:39 AM, David  wrote:
>> > > >
>> > > > > Hello,
>> > > > >
>> > > > > Still hoping for help here:
>> > > > >
>> > > > > https://issues.apache.org/jira/browse/HDFS-15790
>> > > > >
>> > > > > Looks like it has been worked on, not sure how to best move it
>> > forward.
>> > > > >
>> > > > > On Wed, Mar 10, 2021, 12:21 PM Steve Loughran
>> > > > > > > > > >
>> > > > > wr

Re: [DISCUSS] hadoop-thirdparty 1.1.0 release

2021-05-12 Thread Wei-Chiu Chuang
Sorry for the delay in the release work.

I'm almost ready to start a release vote. (Took several days to figure out
all the Docker gimmicks, plus company holidays)

The only missing piece is this PR:
https://github.com/apache/hadoop-thirdparty/pull/14
I've dry-run the release activity with this change so please help review.

Thanks!


On Tue, Apr 27, 2021 at 2:59 PM Wei-Chiu Chuang  wrote:

> I'll start preparing the release vote for third-party 1.1.0. Thanks all
> for the help!
>
> On Mon, Apr 26, 2021 at 7:20 PM Wei-Chiu Chuang 
> wrote:
>
>> Thanks. I created a Jenkins job to upload the SNAPSHOT to Apache nexus
>> repository.
>>
>> https://ci-hadoop.apache.org/view/Hadoop/job/Hadoop-thirdparty-trunk-Commit/
>>
>> I can see the new artifacts uploaded. Let's see if the main Hadoop repo
>> precommit can consume the bits.
>>
>> On Mon, Apr 26, 2021 at 6:02 PM Ayush Saxena  wrote:
>>
>>> Yep, you have to do it manually
>>>
>>> -Ayush
>>>
>>> On 26-Apr-2021, at 3:23 PM, Wei-Chiu Chuang 
>>> wrote:
>>>
>>> 
>>> Does anyone know how we publish hadoop-thirdparty SNAPSHOT artifacts?
>>>
>>> The main Hadoop artifacts are published by this job
>>> https://ci-hadoop.apache.org/view/Hadoop/job/Hadoop-trunk-Commit/ after
>>> every commit.
>>> However, we don't seem to publish hadoop-thirdparty regularly. (Apache
>>> nexus:
>>> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/thirdparty/
>>> )
>>>
>>> Are they published manually?
>>>
>>>
>>> On Fri, Apr 23, 2021 at 6:06 PM Ayush Saxena  wrote:
>>>
>>>> Regarding Guava: before release once you merge the change to
>>>> thirdparty repo, can update the hadoop thirdparty snapshot, the hadoop
>>>> code would pick that up, and watch out everything is safe and clean before
>>>> release. Unless you have a better way to verify or already verified!!!
>>>>
>>>> -Ayush
>>>>
>>>> > On 23-Apr-2021, at 3:16 PM, Wei-Chiu Chuang
>>>>  wrote:
>>>> >
>>>> > Another suggestion: looks like the shaded jaeger is not being used by
>>>> > Hadoop code. Maybe we can remove that from the release for now? I
>>>> don't
>>>> > want to release something that's not being used.
>>>> > We can release the shaded jaeger when it's ready for use. We will
>>>> have to
>>>> > update the jaeger version anyway. The version used is too old.
>>>> >
>>>> >> On Fri, Apr 23, 2021 at 10:55 AM Wei-Chiu Chuang 
>>>> wrote:
>>>> >>
>>>> >> Hi community,
>>>> >>
>>>> >> In preparation of the Hadoop 3.3.1 release, I am starting a thread to
>>>> >> discuss its prerequisite: the release of hadoop-thirdparty 1.1.0.
>>>> >>
>>>> >> My plan:
>>>> >> update guava to 30.1.1 (latest). I have the PR ready to merge.
>>>> >>
>>>> >> Do we want to update protobuf and jaeger? Anything else?
>>>> >>
>>>> >> I suppose we won't update protobuf too frequently.
>>>> >> Jaeger is under active development. We're currently on 0.34.2, the
>>>> latest
>>>> >> is 1.22.0.
>>>> >>
>>>> >> If there is no change to this plan, I can start the release work as
>>>> soon
>>>> >> as possible.
>>>> >>
>>>>
>>>> -
>>>> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
>>>> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>>>>
>>>>


Re: [DISCUSS] hadoop-thirdparty 1.1.0 release

2021-04-27 Thread Wei-Chiu Chuang
I'll start preparing the release vote for third-party 1.1.0. Thanks all for
the help!

On Mon, Apr 26, 2021 at 7:20 PM Wei-Chiu Chuang 
wrote:

> Thanks. I created a Jenkins job to upload the SNAPSHOT to Apache nexus
> repository.
>
> https://ci-hadoop.apache.org/view/Hadoop/job/Hadoop-thirdparty-trunk-Commit/
>
> I can see the new artifacts uploaded. Let's see if the main Hadoop repo
> precommit can consume the bits.
>
> On Mon, Apr 26, 2021 at 6:02 PM Ayush Saxena  wrote:
>
>> Yep, you have to do it manually
>>
>> -Ayush
>>
>> On 26-Apr-2021, at 3:23 PM, Wei-Chiu Chuang  wrote:
>>
>> 
>> Does anyone know how we publish hadoop-thirdparty SNAPSHOT artifacts?
>>
>> The main Hadoop artifacts are published by this job
>> https://ci-hadoop.apache.org/view/Hadoop/job/Hadoop-trunk-Commit/ after
>> every commit.
>> However, we don't seem to publish hadoop-thirdparty regularly. (Apache
>> nexus:
>> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/thirdparty/
>> )
>>
>> Are they published manually?
>>
>>
>> On Fri, Apr 23, 2021 at 6:06 PM Ayush Saxena  wrote:
>>
>>> Regarding Guava: before release once you merge the change to thirdparty
>>> repo, can update the hadoop thirdparty snapshot, the hadoop code would pick
>>> that up, and watch out everything is safe and clean before release. Unless
>>> you have a better way to verify or already verified!!!
>>>
>>> -Ayush
>>>
>>> > On 23-Apr-2021, at 3:16 PM, Wei-Chiu Chuang
>>>  wrote:
>>> >
>>> > Another suggestion: looks like the shaded jaeger is not being used by
>>> > Hadoop code. Maybe we can remove that from the release for now? I don't
>>> > want to release something that's not being used.
>>> > We can release the shaded jaeger when it's ready for use. We will have
>>> to
>>> > update the jaeger version anyway. The version used is too old.
>>> >
>>> >> On Fri, Apr 23, 2021 at 10:55 AM Wei-Chiu Chuang 
>>> wrote:
>>> >>
>>> >> Hi community,
>>> >>
>>> >> In preparation of the Hadoop 3.3.1 release, I am starting a thread to
>>> >> discuss its prerequisite: the release of hadoop-thirdparty 1.1.0.
>>> >>
>>> >> My plan:
>>> >> update guava to 30.1.1 (latest). I have the PR ready to merge.
>>> >>
>>> >> Do we want to update protobuf and jaeger? Anything else?
>>> >>
>>> >> I suppose we won't update protobuf too frequently.
>>> >> Jaeger is under active development. We're currently on 0.34.2, the
>>> latest
>>> >> is 1.22.0.
>>> >>
>>> >> If there is no change to this plan, I can start the release work as
>>> soon
>>> >> as possible.
>>> >>
>>>
>>>


Re: [DISCUSS] hadoop-thirdparty 1.1.0 release

2021-04-26 Thread Wei-Chiu Chuang
Thanks. I created a Jenkins job to upload the SNAPSHOT to Apache nexus
repository.
https://ci-hadoop.apache.org/view/Hadoop/job/Hadoop-thirdparty-trunk-Commit/

I can see the new artifacts uploaded. Let's see if the main Hadoop repo
precommit can consume the bits.

On Mon, Apr 26, 2021 at 6:02 PM Ayush Saxena  wrote:

> Yep, you have to do it manually
>
> -Ayush
>
> On 26-Apr-2021, at 3:23 PM, Wei-Chiu Chuang  wrote:
>
> 
> Does anyone know how we publish hadoop-thirdparty SNAPSHOT artifacts?
>
> The main Hadoop artifacts are published by this job
> https://ci-hadoop.apache.org/view/Hadoop/job/Hadoop-trunk-Commit/ after
> every commit.
> However, we don't seem to publish hadoop-thirdparty regularly. (Apache
> nexus:
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/thirdparty/
> )
>
> Are they published manually?
>
>
> On Fri, Apr 23, 2021 at 6:06 PM Ayush Saxena  wrote:
>
>> Regarding Guava: before release once you merge the change to thirdparty
>> repo, can update the hadoop thirdparty snapshot, the hadoop code would pick
>> that up, and watch out everything is safe and clean before release. Unless
>> you have a better way to verify or already verified!!!
>>
>> -Ayush
>>
>> > On 23-Apr-2021, at 3:16 PM, Wei-Chiu Chuang
>>  wrote:
>> >
>> > Another suggestion: looks like the shaded jaeger is not being used by
>> > Hadoop code. Maybe we can remove that from the release for now? I don't
>> > want to release something that's not being used.
>> > We can release the shaded jaeger when it's ready for use. We will have
>> to
>> > update the jaeger version anyway. The version used is too old.
>> >
>> >> On Fri, Apr 23, 2021 at 10:55 AM Wei-Chiu Chuang 
>> wrote:
>> >>
>> >> Hi community,
>> >>
>> >> In preparation of the Hadoop 3.3.1 release, I am starting a thread to
>> >> discuss its prerequisite: the release of hadoop-thirdparty 1.1.0.
>> >>
>> >> My plan:
>> >> update guava to 30.1.1 (latest). I have the PR ready to merge.
>> >>
>> >> Do we want to update protobuf and jaeger? Anything else?
>> >>
>> >> I suppose we won't update protobuf too frequently.
>> >> Jaeger is under active development. We're currently on 0.34.2, the
>> latest
>> >> is 1.22.0.
>> >>
>> >> If there is no change to this plan, I can start the release work as
>> soon
>> >> as possible.
>> >>
>>


Re: [DISCUSS] hadoop-thirdparty 1.1.0 release

2021-04-26 Thread Wei-Chiu Chuang
Does anyone know how we publish hadoop-thirdparty SNAPSHOT artifacts?

The main Hadoop artifacts are published by this job
https://ci-hadoop.apache.org/view/Hadoop/job/Hadoop-trunk-Commit/ after
every commit.
However, we don't seem to publish hadoop-thirdparty regularly. (Apache
nexus:
https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/thirdparty/
)

Are they published manually?


On Fri, Apr 23, 2021 at 6:06 PM Ayush Saxena  wrote:

> Regarding Guava: before the release, once you merge the change to the
> thirdparty repo, you can update the hadoop-thirdparty snapshot. The Hadoop
> code would pick that up, and you can check that everything is safe and
> clean before the release. Unless you have a better way to verify, or have
> already verified!
>
> -Ayush
>
> > On 23-Apr-2021, at 3:16 PM, Wei-Chiu Chuang 
> wrote:
> >
> > Another suggestion: looks like the shaded jaeger is not being used by
> > Hadoop code. Maybe we can remove that from the release for now? I don't
> > want to release something that's not being used.
> > We can release the shaded jaeger when it's ready for use. We will have to
> > update the jaeger version anyway. The version used is too old.
> >
> >> On Fri, Apr 23, 2021 at 10:55 AM Wei-Chiu Chuang 
> wrote:
> >>
> >> Hi community,
> >>
> >> In preparation of the Hadoop 3.3.1 release, I am starting a thread to
> >> discuss its prerequisite: the release of hadoop-thirdparty 1.1.0.
> >>
> >> My plan:
> >> update guava to 30.1.1 (latest). I have the PR ready to merge.
> >>
> >> Do we want to update protobuf and jaeger? Anything else?
> >>
> >> I suppose we won't update protobuf too frequently.
> >> Jaeger is under active development. We're currently on 0.34.2, the
> latest
> >> is 1.22.0.
> >>
> >> If there is no change to this plan, I can start the release work as soon
> >> as possible.
> >>
>


Sorry for breaking the build

2021-04-26 Thread Wei-Chiu Chuang
If you are seeing precommit failures, like this:

[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile
(default-compile) on project hadoop-annotations: Compilation failure
[ERROR] javac: invalid flag: -Xmaxwarns=
[ERROR] Usage: javac <options> <source files>
[ERROR] use -help for a list of possible options

[ERROR] -> [Help 1]

it was caused by HADOOP-17661. I have just reverted the commit.
(the precommit for the PR didn't fail, so it wasn't caught in the first
place)

Sorry for the trouble
Weichiu
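
The failure signature quoted above can be picked out of a build log mechanically. This is a minimal sketch that assumes log lines keep the `[ERROR] javac: invalid flag: ...` shape shown in this message; `invalid_javac_flags` is a hypothetical helper, not part of Maven or Yetus.

```python
import re

# Matches the javac complaint as it appears in the Maven output quoted above.
INVALID_FLAG = re.compile(r"^\[ERROR\] javac: invalid flag: (\S+)")


def invalid_javac_flags(log_text):
    """Return every flag that javac rejected, in the order it was reported."""
    flags = []
    for line in log_text.splitlines():
        match = INVALID_FLAG.match(line.strip())
        if match:
            flags.append(match.group(1))
    return flags


LOG = """[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile
(default-compile) on project hadoop-annotations: Compilation failure
[ERROR] javac: invalid flag: -Xmaxwarns=
[ERROR] use -help for a list of possible options
"""

print(invalid_javac_flags(LOG))
```

A flag ending in `=` with nothing after it, as here, suggests a value that was never filled in, though the log alone does not confirm where it got lost.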


Re: [DISCUSS] hadoop-thirdparty 1.1.0 release

2021-04-23 Thread Wei-Chiu Chuang
Another suggestion: looks like the shaded jaeger is not being used by
Hadoop code. Maybe we can remove that from the release for now? I don't
want to release something that's not being used.
We can release the shaded jaeger when it's ready for use. We will have to
update the jaeger version anyway. The version used is too old.

On Fri, Apr 23, 2021 at 10:55 AM Wei-Chiu Chuang  wrote:

> Hi community,
>
> In preparation of the Hadoop 3.3.1 release, I am starting a thread to
> discuss its prerequisite: the release of hadoop-thirdparty 1.1.0.
>
> My plan:
> update guava to 30.1.1 (latest). I have the PR ready to merge.
>
> Do we want to update protobuf and jaeger? Anything else?
>
> I suppose we won't update protobuf too frequently.
> Jaeger is under active development. We're currently on 0.34.2, the latest
> is 1.22.0.
>
> If there is no change to this plan, I can start the release work as soon
> as possible.
>


[DISCUSS] hadoop-thirdparty 1.1.0 release

2021-04-22 Thread Wei-Chiu Chuang
Hi community,

In preparation of the Hadoop 3.3.1 release, I am starting a thread to
discuss its prerequisite: the release of hadoop-thirdparty 1.1.0.

My plan:
update guava to 30.1.1 (latest). I have the PR ready to merge.

Do we want to update protobuf and jaeger? Anything else?

I suppose we won't update protobuf too frequently.
Jaeger is under active development. We're currently on 0.34.2, the latest
is 1.22.0.

If there is no change to this plan, I can start the release work as soon as
possible.


Re: [DISCUSS] Hadoop 3.3.1 release

2021-04-19 Thread Wei-Chiu Chuang
Billie and Viraj,

As of today we are at 657 resolved jiras in 3.3.1
https://issues.apache.org/jira/issues/?filter=-1=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20AND%20status%20in%20(Resolved)%20AND%20fixVersion%20in%20(3.3.1)
There are so many new features (I counted 10; see
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.3+Release) that
it is barely a dot release.

Please make every effort to backport the said jiras to branch-3.3. I'd like
to cut the branch sooner rather than later. I'd also call for more frequent
releases. At the current development pace, a dot release every quarter
would make sense IMO.

Given the number of release blockers, I propose a tentative code freeze
date: end of next Friday. After that I'd like to cut the branch and make an
RC.

Thoughts? I can't do this all alone. If you find things that are worth
stopping the release, please tag the jira with the label 'release-blocker'.



On Tue, Apr 20, 2021 at 2:44 AM Billie Rinaldi  wrote:

> I was thinking of backporting HADOOP-16948 for 3.3.1.
>
> Billie
>
> On Mon, Apr 19, 2021 at 1:33 AM Wei-Chiu Chuang
>  wrote:
>
> > Hello, reviving this thread.
> >
> > I created a dashboard for Hadoop 3.3.1 release.
> >
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12336122
> > Also a jira to track the release work: HADOOP-17647
> > <https://issues.apache.org/jira/browse/HADOOP-17647>
> >
> > We are currently at 5 release blockers and 3 critical issues for Hadoop
> > 3.3.1. I'll go through each of them and push out the ones that aren't
> > really blocking us.
> >
> > If you believe there are more features/bug fixes we should include in
> 3.3.1
> > (I spent the past few weeks backporting jiras but I'm sure I missed some)
> > please shout out.
> >
> > Meanwhile, I believe we need to release hadoop-thirdparty 1.1.0 too.
> There
> > are a number of tasks to be done there too. Let's start another thread
> for
> > hadoop-thirdparty 1.1.0 release.
> >
> > On Mon, Mar 15, 2021 at 7:04 PM hemanth boyina <
> hemanthboyina...@gmail.com
> > >
> > wrote:
> >
> > > Hi Steve and Wei-Chiu
> > >
> > > Regarding IPv6: a few years back we rebased HADOOP-11890 onto trunk
> > > and tried to get IPv6 working. We faced some issues and made the
> > > required changes. After that we tested the IPv6 changes rigorously on
> > > both IPv4 and IPv6 machines. These changes have been deployed in a
> > > production cluster for quite some time and are used extensively.
> > >
> > > I think it's good time to add this feature.
> > >
> > > Thanks
> > > Hemanth Boyina
> > >
> > >
> > >
> > > On Thu, 11 Mar 2021, 10:22 Vinayakumar B, 
> > wrote:
> > >
> > > > Hi David,
> > > >
> > > > >> Still hoping for help here:
> > > >
> > > > >> https://issues.apache.org/jira/browse/HDFS-15790
> > > >
> > > > I will raise a PR for the said solution soon (in a day or two).
> > > >
> > > > -Vinay
> > > >
> > > > On Thu, 11 Mar 2021 at 5:39 AM, David  wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > Still hoping for help here:
> > > > >
> > > > > https://issues.apache.org/jira/browse/HDFS-15790
> > > > >
> > > > > Looks like it has been worked on, not sure how to best move it
> > forward.
> > > > >
> > > > > > On Wed, Mar 10, 2021, 12:21 PM Steve Loughran wrote:
> > > > >
> > > > > > > I'm going to argue it's too late to do IPv6 support close to a
> > > > > > > release, as it's best if it's on developer machines for some
> > > > > > > time to let all quirks surface. It's not so much IPv6 itself,
> > > > > > > but do we cause any regressions on IPv4?
> > > > > >
> > > > > > But: it can/should go into trunk and stabilize there
> > > > > >
> > > > > > On Thu, 4 Mar 2021 at 03:52, Muralikrishna Dmmkr <
> > > > > > muralikrishna.dm...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Brahma,
> > > > > > >
> > > 

Re: [DISCUSS] Hadoop 3.3.1 release

2021-04-18 Thread Wei-Chiu Chuang
we
> > > > > >> can
> > > > > >> plan this.But needs to be merged ASAP.
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> On Fri, Feb 19, 2021 at 5:20 PM bilwa st 
> > wrote:
> > > > > >>
> > > > > >> > Hi Brahma,
> > > > > >> >
> > > > > >> > Can we have below features in 3.3.1 release? We have been
> using
> > > > these
> > > > > >> > features for a long time. They are stable and tested in bigger
> > > > > clusters.
> > > > > >> >
> > > > > >> > 1. Container reuse -
> > > > > >> https://issues.apache.org/jira/browse/MAPREDUCE-6749
> > > > > >> > 2. Speculative attempts should not run on the same node -
> > > > > >> > https://issues.apache.org/jira/browse/MAPREDUCE-7169
> > > > > >> >
> > > > > >> > Thanks,
> > > > > >> > Bilwa
> > > > > >> >
> > > > > >> > On Thu, Feb 18, 2021, 1:49 PM Brahma Reddy Battula <
> > > > bra...@apache.org
> > > > > >
> > > > > >> > wrote:
> > > > > >> >
> > > > > >> >> Sorry for the late reply..
> > > > > >> >>
> > > > > >> >> I will come up with a plan.. Please let me know if anybody
> has
> > > some
> > > > > >> >> features/improvements/bugs that need to be included.
> > > > > >> >>
> > > > > >> >> On Mon, Feb 15, 2021 at 9:39 PM Sunil Govindan <
> > > sun...@apache.org>
> > > > > >> wrote:
> > > > > >> >>
> > > > > >> >> > Hi Wei-Chiu,
> > > > > >> >> >
> > > > > >> >> > What will be the next steps here for 3.3.1 planning?
> > > > > >> >> >
> > > > > >> >> > Thanks
> > > > > >> >> > Sunil
> > > > > >> >> >
> > > > > >> >> > On Mon, Feb 8, 2021 at 11:56 PM Stack 
> > > wrote:
> > > > > >> >> >
> > > > > >> >> > > On Wed, Feb 3, 2021 at 6:41 AM Steve Loughran wrote:
> > > > > >> >> > >
> > > > > >> >> > > >
> > > > > >> >> > > > Regarding blockers : how about we have a little
> > hackathon
> > > > > >> where we
> > > > > >> >> > try
> > > > > >> >> > > > and get things in. This means a promise of review time
> > from
> > > > the
> > > > > >> >> people
> > > > > >> >> > > with
> > > > > >> >> > > > commit rights and other people who understand the code
> > > > (Stack?)
> > > > > >> >> > > >
> > > > > >> >> > > >
> > > > > >> >> > >
> > > > > >> >> > > I'm up for helping get 3.3.1 out (reviewing, hackathon,
> > > > testing).
> > > > > >> >> > > Thanks,
> > > > > >> >> > > S
> > > > > >> >> > >
> > > > > >> >> > >
> > > > > >> >> > >
> > > > > >> >> > >
> > > > > >> >> > > > -steve
> > > > > >> >> > > >
> > > > > >> >> > > > On Thu, 28 Jan 2021 at 06:48, Ayush Saxena <
> > > > ayush...@gmail.com
> > > > > >
> > > > > >> >> wrote:
> > > > > >> >> > > >
> > > > > >> >> > > > > +1
> > > > > >> >> > > > > Just to mention we would need to release
> > > hadoop-thirdparty
> > > > > too
> > > > > >> >> > before.
> > > > > >

Re: [DISCUSS] Hadoop 3.3.1 release

2021-03-10 Thread Wei-Chiu Chuang
Without looking into details, I would say that new features like IPv6
support should get into 3.4.0 (and hopefully spend a little time to
stabilize) before backporting to 3.3.x or lower branches.

On Thu, Mar 11, 2021 at 1:21 AM Steve Loughran 
wrote:

> I'm going to argue it's too late to do IPv6 support close to a release, as
> it's best if it's on developer machines for some time to let all quirks
> surface. It's not so much IPv6 itself, but do we cause any regressions on
> IPv4?
>
> But: it can/should go into trunk and stabilize there
>
> On Thu, 4 Mar 2021 at 03:52, Muralikrishna Dmmkr <
> muralikrishna.dm...@gmail.com> wrote:
>
> > Hi Brahma,
> >
> > I forgot to mention the IPv6 feature in my last mail. Support for IPv6
> > has been in development since 2015, and we have done a good amount of
> > testing at our organisation. The feature is stable and has been used
> > extensively by our customers over the last year. I think it is a good
> > time to add IPv6 support to 3.3.1.
> >
> > https://issues.apache.org/jira/browse/HADOOP-11890
> >
> > Thanks
> > D M Murali Krishna Reddy
> >
> > On Wed, Feb 24, 2021 at 9:13 AM Muralikrishna Dmmkr <
> > muralikrishna.dm...@gmail.com> wrote:
> >
> > > Hi Brahma,
> > >
> > > Can we have this new feature "YARN Registry based AM discovery with
> retry
> > > and in-flight task persistent via JHS" in the upcoming 3.1.1 release. I
> > > have also attached a test-report in the below jira.
> > >
> > > https://issues.apache.org/jira/browse/MAPREDUCE-6726
> > >
> > >
> > > Thanks,
> > > D M Murali Krishna Reddy
> > >
> > > On Tue, Feb 23, 2021 at 10:11 AM Brahma Reddy Battula <
> bra...@apache.org
> > >
> > > wrote:
> > >
> > >> Hi Bilwa,
> > >>
> > >> I have commented on the jiras you mentioned. Based on their stability,
> > >> we can plan this, but they need to be merged ASAP.
> > >>
> > >>
> > >>
> > >> On Fri, Feb 19, 2021 at 5:20 PM bilwa st  wrote:
> > >>
> > >> > Hi Brahma,
> > >> >
> > >> > Can we have below features in 3.3.1 release? We have been using
> these
> > >> > features for a long time. They are stable and tested in bigger
> > clusters.
> > >> >
> > >> > 1. Container reuse -
> > >> https://issues.apache.org/jira/browse/MAPREDUCE-6749
> > >> > 2. Speculative attempts should not run on the same node -
> > >> > https://issues.apache.org/jira/browse/MAPREDUCE-7169
> > >> >
> > >> > Thanks,
> > >> > Bilwa
> > >> >
> > >> > On Thu, Feb 18, 2021, 1:49 PM Brahma Reddy Battula <
> bra...@apache.org
> > >
> > >> > wrote:
> > >> >
> > >> >> Sorry for the late reply..
> > >> >>
> > >> >> I will come up with a plan.. Please let me know if anybody has some
> > >> >> features/improvements/bugs that need to be included.
> > >> >>
> > >> >> On Mon, Feb 15, 2021 at 9:39 PM Sunil Govindan 
> > >> wrote:
> > >> >>
> > >> >> > Hi Wei-Chiu,
> > >> >> >
> > >> >> > What will be the next steps here for 3.3.1 planning?
> > >> >> >
> > >> >> > Thanks
> > >> >> > Sunil
> > >> >> >
> > >> >> > On Mon, Feb 8, 2021 at 11:56 PM Stack  wrote:
> > >> >> >
> > >> >> > > On Wed, Feb 3, 2021 at 6:41 AM Steve Loughran wrote:
> > >> >> > >
> > >> >> > > >
> > >> >> > > > Regarding blockers : how about we have a little hackathon
> > >> where we
> > >> >> > try
> > >> >> > > > and get things in. This means a promise of review time from
> the
> > >> >> people
> > >> >> > > with
> > >> >> > > > commit rights and other people who understand the code
> (Stack?)
> > >> >> > > >
> > >> >> > > >
> > >> >> > >
> > >> >> > > I'm up for he

[DISCUSS] Hadoop 3.3.1 release

2021-01-27 Thread Wei-Chiu Chuang
Hi all,

Hadoop 3.3.0 was released half a year ago, and as of now we've accumulated
more than 400 changes in the branch-3.3. A number of downstreamers are
eagerly waiting for 3.3.1 which addresses the guava version conflict issue.

https://issues.apache.org/jira/issues/?filter=-1=project%20in%20(HDFS%2C%20HADOOP%2C%20YARN%2C%20MAPREDUCE)%20and%20fixVersion%20in%20(3.3.1)%20and%20status%20%3D%20Resolved%20
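
For anyone who wants to rebuild a search link like this for another release, here is a minimal sketch. It assumes the JIRA issue navigator accepts the query in a `jql` URL parameter; `jira_search_url` is a hypothetical helper, and the project list simply mirrors the filter above.

```python
from urllib.parse import quote, urlencode

JIRA_SEARCH = "https://issues.apache.org/jira/issues/"


def jira_search_url(projects, fix_version, status="Resolved"):
    """Build an issue-navigator link for issues fixed in one release."""
    jql = "project in ({}) AND fixVersion in ({}) AND status = {}".format(
        ", ".join(projects), fix_version, status
    )
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return JIRA_SEARCH + "?" + urlencode({"jql": jql}, quote_via=quote)


url = jira_search_url(["HADOOP", "HDFS", "YARN", "MAPREDUCE"], "3.3.1")
print(url)
```

Keeping the JQL as a readable string and encoding it in one place makes it easier to change the fix version for the next release.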

We should start the release work for 3.3.1 before the diff becomes even
larger.

I believe there are currently only two real blockers for 3.3.1 (using
this filter
https://issues.apache.org/jira/issues/?filter=-1=project%20in%20(HDFS%2C%20HADOOP%2C%20YARN%2C%20MAPREDUCE)%20AND%20cf%5B12310320%5D%20in%20(3.3.1)%20AND%20status%20not%20in%20(Resolved)%20ORDER%20BY%20priority%20DESC
)


   1. HDFS-15566
   2. HADOOP-17112


Is there anyone who would volunteer to be the 3.3.1 RM?

Also, the HowToRelease wiki does not describe the ARM build process. That's
going to be important for future releases.


Re: [VOTE] Release Apache Hadoop 3.2.2 - RC5

2021-01-07 Thread Wei-Chiu Chuang
+1 (binding)

On Mon, Jan 4, 2021 at 11:11 PM Akira Ajisaka  wrote:

> +1 (binding)
>
> - Verified the checksums and signatures
> - Verified the source is the same as the RC5 tag
> - Built from source with CentOS 7.9 and Java 1.8.0_275
> - Setup pseudo cluster and ran some FsShell commands
>
> Thanks,
> Akira
>
> On Sun, Jan 3, 2021 at 11:40 PM Xiaoqiao He  wrote:
> >
> > Hi folks,
> >
> > The release candidate (RC5) for Hadoop-3.2.2 is available now.
> > There are 4 commits[1] differences between RC5 and RC4[2].
> >
> > The RC5 is available at:
> > http://people.apache.org/~hexiaoqiao/hadoop-3.2.2-RC5
> > The RC5 tag in github is here:
> > https://github.com/apache/hadoop/tree/release-3.2.2-RC5
> > The maven artifacts are staged at:
> > https://repository.apache.org/content/repositories/orgapachehadoop-1298
> >
> > You can find my public key at:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS or
> > https://people.apache.org/keys/committer/hexiaoqiao.asc directly.
> >
> > Please try the release and vote.
> >
> > I have done a simple test.
> > * Verify gpg sign and md5sum.
> > * Check staging repositories point to the correct artifact(
> > https://repository.apache.org/content/repositories/staging/).
> > * Setup pseudo cluster with HDFS and YARN.
> > * Run simple FsShell - mkdir/put/get/mv/rm.
> > * Submit example mr applications and check the result - Pi & wordcount.
> > * Check the web UI and the release year of
> > NameNode/DataNode/Resourcemanager/NodeManager including YARN UI2.
> >
> > My +1 to start.
> >
> > Thanks and Happy New Year!
> > - He Xiaoqiao
> >
> > [1]
> >
> https://github.com/apache/hadoop/compare/release-3.2.2-RC4...release-3.2.2-RC5
> > [2]
> >
> https://lists.apache.org/thread.html/r8911d2934ebfe3869874842be94d1cfb00d99334bc93819d71466243%40%3Chdfs-dev.hadoop.apache.org%3E
> > [3]
> >
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12335948
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>
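
The checksum step listed in the quoted vote can be scripted. This is a sketch under stated assumptions: the `.sha512` file is assumed to contain the digest as one unbroken run of 128 hex characters (some tools print it in spaced groups, which this does not handle), and `verify_sha512` and its helpers are hypothetical names. It does not replace checking the GPG signature.

```python
import hashlib
import re


def sha512_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large release tarballs need not fit in memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest(checksum_text):
    """Pull the first 128-character hex string out of a checksum file's text."""
    match = re.search(r"\b[0-9a-fA-F]{128}\b", checksum_text)
    if match is None:
        raise ValueError("no SHA-512 digest found in checksum text")
    return match.group(0).lower()


def verify_sha512(path, checksum_text):
    """Return True when the file's digest matches the published one."""
    return sha512_of(path) == expected_digest(checksum_text)
```

A caller would read the published `.sha512` file as text and pass it in together with the path of the downloaded artifact.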


Re: [E] Re: [VOTE] Release Apache Hadoop 3.2.2 - RC4

2020-12-21 Thread Wei-Chiu Chuang
We should target:
3.4.0
3.3.1
3.2.3 (or 3.2.2)

I updated YARN and marked 3.3.0 as released there.


Time to clean up some of the old releases. I'll do that.
On Mon, Dec 21, 2020 at 4:04 PM Eric Badger
 wrote:

> I've committed https://issues.apache.org/jira/browse/YARN-10540. Xiaoqiao,
> feel free to cherry-pick this into the 3.2.2 release branch if you think it
> is relevant.
>
> Also, can someone tell me which releases we should be targeting? Currently
> these versions are all Unreleased on JIRA:
> 3.4.1, 3.4.0
> 3.3.1, 3.3.0
> 3.2.3, 3.2.2
>
> As far as I know, neither trunk nor 3.3 have a release going on. So I don't
> know why there are 2 versions as unreleased there.
>
> Eric
>
> On Mon, Dec 21, 2020 at 3:20 PM Jim Brennan
>  wrote:
>
> > I put up a patch for
> > https://issues.apache.org/jira/browse/YARN-10540.
> > Thanks for bringing it to my attention.
> > Jim
> >
> > On Mon, Dec 21, 2020 at 10:36 AM Sunil Govindan 
> wrote:
> >
> > > I had some offline talks with a few folks.
> > > This issue is happening only in Mac, hence ideally it does not cause
> much
> > > of a problem in the supported OS.
> > >
> > > I will wait for feedback here to see whether we need another RC by
> fixing
> > > this. And will continue the discussion in the jira.
> > >
> > > Thanks
> > > Sunil
> > >
> > > On Sat, Dec 19, 2020 at 11:07 PM Sunil Govindan 
> > wrote:
> > >
> > > > Thanks, Xiaoqiao.
> > > > All files are looking good.
> > > >
> > > > However, while I did the tests to verify the RC, I ran into a serious
> > NPE
> > > > in YARN.
> > > > I raised YARN-10540
> > > > <https://issues.apache.org/jira/browse/YARN-10540> to
> > > > analyze this further. I think this issue is due to YARN-10450
> > > > <https://issues.apache.org/jira/browse/YARN-10450>.
> > > > In the trunk, I am not able to see this issue. So It could be
> possible
> > > > that some patches are not backported to branch-3.2.2.
> > > >
> > > > UI1 & UI2 nodes page is not working at this moment. I will check a
> bit
> > > > more to see about this and update here.
> > > >
> > > > Thanks
> > > > Sunil
> > > >
> > > > On Sat, Dec 19, 2020 at 5:36 PM Xiaoqiao He 
> > > wrote:
> > > >
> > > >> Thanks Sunil, md5 files have been removed from RC4. Please have a
> > look.
> > > >> Thanks & Regards.
> > > >>
> > > >> - He Xiaoqiao
> > > >>
> > > >> On Sat, Dec 19, 2020 at 7:22 PM Sunil Govindan 
> > > wrote:
> > > >>
> > > >>> Hi Xiaoqiao,
> > > >>>
> > > >>> Please remove the md5 files from your shared RC4 repo. Thanks,
> @Akira
> > > >>> Ajisaka  for sharing this input.
> > > >>>
> > > >>> Thanks
> > > >>> Sunil
> > > >>>
> > > >>> On Sat, Dec 19, 2020 at 10:21 AM Sunil Govindan  >
> > > >>> wrote:
> > > >>>
> > >  Reference:
> > >  https://www.apache.org/dev/release-distribution.html#sigs-and-sums
> > >  Also, we had a Jira to track this: HADOOP-15930
> > >  <https://issues.apache.org/jira/browse/HADOOP-15930>.
> > > 
> > >  Thanks
> > >  Sunil
> > > 
> > >  On Sat, Dec 19, 2020 at 10:16 AM Sunil Govindan <
> sun...@apache.org>
> > >  wrote:
> > > 
> > > > Hi Xiaoqiao and Wei-chiu
> > > >
> > > > I am a bit confused after seeing both *.sha512 and *.md5 files in
> > the
> > > > RC directory.
> > > > Are we releasing both now?
> > > >
> > > > Thanks
> > > > Sunil
> > > >
> > > > On Wed, Dec 9, 2020 at 10:32 PM Xiaoqiao He <
> hexiaoq...@apache.org
> > >
> > > > wrote:
> > > >
> > > >> Hi folks,
> > > >>
> > > >> The release candidate (RC4) for Hadoop-3.2.2 is available now.
> > > >> There are 10 commits[1] differences between RC4 and RC3[2].
> > > >>
> > > >> The RC4 is available at:
> > > >>
> > >
> >
> 

Re: [VOTE] Release Apache Hadoop 3.2.2 - RC4

2020-12-16 Thread Wei-Chiu Chuang
Forgot to mention: I am +1 (binding) too, based on the RC3-to-RC4 diff and
the fact that there are no blocker issues.

On Mon, Dec 14, 2020 at 1:42 PM Wei-Chiu Chuang  wrote:

> Diff between RC3 and RC4:
>
>  MAPREDUCE-7284. TestCombineFileInputFormat#testMissingBlocks fails (#2136)
> HDFS-13404. Addendum: RBF:
> TestRouterWebHDFSContractAppend.testRenameFileBeingAppended fail.
> Contributed by Takanobu Asanuma.
> HADOOP-16080. hadoop-aws does not work with hadoop-client-api  (#2510).
> Contributed by Chao Sun
> HADOOP-15775. [JDK9] Add missing javax.activation-api dependency.
> Contributed by Akira Ajisaka.
> HDFS-15240. Erasure Coding: dirty buffer causes reconstruction block
> error. Contributed by HuangTao.
> HDFS-15708. TestURLConnectionFactory fails by NoClassDefFoundError in
> branch-3.3 and branch-3.2 (#2517)
> HADOOP-17389. KMS should log full UGI principal. (#2476)
> HDFS-15707. NNTop counts don't add up as expected. (#2516) Contributed by
> Ahmed Hussein and Daryn Sharp
> HDFS-15709. Socket file descriptor leak in
> StripedBlockChecksumReconstructor. (#2518)
>
>
> On Wed, Dec 9, 2020 at 9:01 AM Xiaoqiao He  wrote:
>
>> Hi folks,
>>
>> The release candidate (RC4) for Hadoop-3.2.2 is available now.
>> There are 10 commits[1] differences between RC4 and RC3[2].
>>
>> The RC4 is available at:
>> http://people.apache.org/~hexiaoqiao/hadoop-3.2.2-RC4
>> The RC4 tag in github is here:
>> https://github.com/apache/hadoop/tree/release-3.2.2-RC4
>> The maven artifacts are staged at:
>> https://repository.apache.org/content/repositories/orgapachehadoop-1296
>>
>> You can find my public key at:
>> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS or
>> https://people.apache.org/keys/committer/hexiaoqiao.asc directly.
>>
>> Please try the release and vote.
>>
>> I have done a simple testing with my pseudo cluster.
>> * Verify gpg sign and md5sum.
>> * Setup pseudo cluster with HDFS and YARN.
>> * Run simple FsShell - mkdir/put/get/rename/mv.
>> * Submit example mr job and check the result - Pi/wordcount.
>> * Check the web UI of NameNode/DataNode/Resourcemanager/NodeManager.
>> My +1 to start.
>>
>> Thanks,
>> He Xiaoqiao
>>
>> [1]
>>
>> https://github.com/apache/hadoop/compare/release-3.2.2-RC3...release-3.2.2-RC4
>> [2]
>>
>> https://lists.apache.org/thread.html/rfb74c3a5d4f223c5804d8ee622829263740cd8701c8f3fc8b6f970af%40%3Chdfs-dev.hadoop.apache.org%3E
>> [3]
>> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12335948
>>
>


Re: [VOTE] Release Apache Hadoop 3.2.2 - RC4

2020-12-14 Thread Wei-Chiu Chuang
Diff between RC3 and RC4:

MAPREDUCE-7284. TestCombineFileInputFormat#testMissingBlocks fails (#2136)
HDFS-13404. Addendum: RBF:
  TestRouterWebHDFSContractAppend.testRenameFileBeingAppended fail.
  Contributed by Takanobu Asanuma.
HADOOP-16080. hadoop-aws does not work with hadoop-client-api (#2510).
  Contributed by Chao Sun
HADOOP-15775. [JDK9] Add missing javax.activation-api dependency.
  Contributed by Akira Ajisaka.
HDFS-15240. Erasure Coding: dirty buffer causes reconstruction block error.
  Contributed by HuangTao.
HDFS-15708. TestURLConnectionFactory fails by NoClassDefFoundError in
  branch-3.3 and branch-3.2 (#2517)
HADOOP-17389. KMS should log full UGI principal. (#2476)
HDFS-15707. NNTop counts don't add up as expected. (#2516)
  Contributed by Ahmed Hussein and Daryn Sharp
HDFS-15709. Socket file descriptor leak in
  StripedBlockChecksumReconstructor. (#2518)
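
A list like the one above is easier to cross-check against JIRA when the issue keys are pulled out on their own. A minimal sketch, assuming each commit subject starts with a key such as `HDFS-15709.` at the beginning of a line; `jira_keys` is a hypothetical helper and the sample holds only three of the subjects listed above.

```python
import re

# A key at the start of a line, followed by the period used in commit subjects.
KEY_AT_LINE_START = re.compile(
    r"^[ \t]*((?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+)\.", re.MULTILINE
)


def jira_keys(commit_subjects):
    """Return the JIRA keys that open each commit subject, in order."""
    return KEY_AT_LINE_START.findall(commit_subjects)


SUBJECTS = """\
 MAPREDUCE-7284. TestCombineFileInputFormat#testMissingBlocks fails (#2136)
HDFS-15708. TestURLConnectionFactory fails by NoClassDefFoundError in
branch-3.3 and branch-3.2 (#2517)
HADOOP-17389. KMS should log full UGI principal. (#2476)
"""

print(jira_keys(SUBJECTS))
```

Wrapped continuation lines such as `branch-3.3 and branch-3.2` are skipped because they do not start with a project key.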


On Wed, Dec 9, 2020 at 9:01 AM Xiaoqiao He  wrote:

> Hi folks,
>
> The release candidate (RC4) for Hadoop-3.2.2 is available now.
> There are 10 commits[1] differences between RC4 and RC3[2].
>
> The RC4 is available at:
> http://people.apache.org/~hexiaoqiao/hadoop-3.2.2-RC4
> The RC4 tag in github is here:
> https://github.com/apache/hadoop/tree/release-3.2.2-RC4
> The maven artifacts are staged at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1296
>
> You can find my public key at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS or
> https://people.apache.org/keys/committer/hexiaoqiao.asc directly.
>
> Please try the release and vote.
>
> I have done a simple testing with my pseudo cluster.
> * Verify gpg sign and md5sum.
> * Setup pseudo cluster with HDFS and YARN.
> * Run simple FsShell - mkdir/put/get/rename/mv.
> * Submit example mr job and check the result - Pi/wordcount.
> * Check the web UI of NameNode/DataNode/Resourcemanager/NodeManager.
> My +1 to start.
>
> Thanks,
> He Xiaoqiao
>
> [1]
>
> https://github.com/apache/hadoop/compare/release-3.2.2-RC3...release-3.2.2-RC4
> [2]
>
> https://lists.apache.org/thread.html/rfb74c3a5d4f223c5804d8ee622829263740cd8701c8f3fc8b6f970af%40%3Chdfs-dev.hadoop.apache.org%3E
> [3]
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12335948
>


Re: [VOTE] Release Apache Hadoop 3.2.2 - RC2

2020-11-13 Thread Wei-Chiu Chuang
I suggest we wait for HDFS-15680 (if I understand it correctly, it's a
blocker).

I wish to update jetty too in this release, but it's turning out to be more
involved than I had anticipated.

On Fri, Nov 13, 2020 at 3:22 PM Chao Sun  wrote:

> +1 (non-binding) from me as well:
>
> - download the source and build it on mac successfully
> - verified all md5, sha512 as well as signatures, however @Xiaoqiao I was
> not able to find your public key in
> http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS so had to use the
> other link.
> - started a single-node HDFS cluster with docker and verified that basic
> operations are working
> - checked the NN webui and it is working as expected
>
> Thanks,
> Chao
>
>
> On Tue, Nov 10, 2020 at 1:13 PM Stephen O'Donnell
>  wrote:
>
>> I compiled a native build from source and started a non-HA, non-HA secure
>> (TLS and Kerberos) and a HA hdfs cluster using docker.
>>
>> On each, I created a few files, looked around the webUI etc. Everything
>> looks fine. The only strange thing, was on the Namenode webUI under the
>> overview section - the "unknown" is a little strange, but I am not sure
>> what is usually in there:
>>
>> Version: 3.2.2, rUnknown
>> Compiled: Mon Nov 09 17:30:00 + 2020 by root from Unknown
>>
>> +1 from me.
>>
>> Stephen.
>>
>> On Sat, Nov 7, 2020 at 2:52 PM Xiaoqiao He  wrote:
>>
>> > Hi folks,
>> >
>> > The release candidate (RC2) for Hadoop-3.2.2 is available now.
>> > There are two commits[1] differences between RC2 and RC1[2](Thanks Akira
>> > Ajisaka for the report.):
>> > * revert HADOOP-17306.
>> > * include HDFS-15643.
>> >
>> > The RC2 is available at:
>> > http://people.apache.org/~hexiaoqiao/hadoop-3.2.2-RC2
>> > The RC2 tag in github is here:
>> > https://github.com/apache/hadoop/tree/release-3.2.2-RC2
>> > The maven artifacts are staged at:
>> > https://repository.apache.org/content/repositories/orgapachehadoop-1288
>> >
>> > You can find my public key at:
>> > http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS or
>> > https://people.apache.org/keys/committer/hexiaoqiao.asc directly.
>> >
>> > Please try the release and vote. The vote will close until 2020/11/14 at
>> > 00:00 CST.
>> >
>> > Thanks,
>> > He Xiaoqiao
>> >
>> > [1]
>> >
>> >
>> https://github.com/apache/hadoop/compare/release-3.2.2-RC1...release-3.2.2-RC2
>> > [2]
>> >
>> >
>> https://lists.apache.org/thread.html/rc7247434f5a77b6d0d1d1f3fcd6b6668eb431a5697e582a6338f0eb7%40%3Chdfs-dev.hadoop.apache.org%3E
>> > [3]
>> >
>> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12335948
>> >
>>
>


Committers please subscribe to security@hadoop

2020-11-10 Thread Wei-Chiu Chuang
The security@hadoop mailing list is restricted to Hadoop committers.
Subscribing to the mailing list is not an automatic process. If you are a
committer, please subscribe by sending an email to
security-subscr...@hadoop.apache.org so we can discuss vulnerabilities in
private.

Thanks,
Wei-Chiu


Fwd: Heads-up, Apache Yetus looking for feedback on RC with lots of changes

2020-11-03 Thread Wei-Chiu Chuang
Forwarding to the Hadoop devs

---------- Forwarded message ---------
From: Sean Busbey 
Date: Tue, Nov 3, 2020 at 9:51 AM
Subject: Heads-up, Apache Yetus looking for feedback on RC with lots of
changes
To: dev 


Hi folks!

FYI, Apache Yetus is doing RCs for their 0.13.0 release and it's got a
lot of changes. Summary from the release manager:

> This release is huge, potentially the biggest one we've done since
> project launch. There are a lot of changes, many incompatible. There is
> a lot of new/reworked documentation. Please plan on spending some time
> with it, as this one will almost certainly not be a drop-in replacement.

The current RC is going to fail due to some issues that came up in
testing of the candidate.

d...@yetus.apache.org subject "[VOTE] Apache Yetus 0.13.0-RC2"
https://lists.apache.org/thread.html/re5d9957989e2311d36ee6ffa0e4c0c3d68d25b9429546104a5f02851%40%3Cdev.yetus.apache.org%3E

The Yetus community is looking for some feedback from downstream users
who rely on yetus on ASF build infrastructure. I think HBase is one of
the larger projects using Yetus at the ASF, so it'd be good if we
could try out things and see how they go before things get too far
with additional release candidates.

I don't have cycles at the moment so I thought I'd send this heads up
to the wider audience.

If you're game, please give a ping here and join the yetus dev mailing
list. I'm happy to give pointers to where things would need to change
for folks who aren't already familiar with the community build infra.


Re: Fixing flaky tests in Apache Hadoop

2020-10-22 Thread Wei-Chiu Chuang
I also wondered if the hardware was too stressed, since all Hadoop-related
projects use the same set of Jenkins servers.
However, HBase just recently moved to their own dedicated machines, so I'm
actually surprised to see a lot of resource related failures even now.

On Thu, Oct 22, 2020 at 2:03 PM Wei-Chiu Chuang  wrote:

> Thanks for raising the issue, Akira and Ahmed,
>
> Fixing flaky tests is a thankless job so I want to take this opportunity
> to recognize the time and effort.
>
> We will always have flaky tests due to bad tests or simply infra issues.
> Fixing flaky tests will take time but if they are not addressed it wastes
> everybody's time.
>
> Recognizing this problem, I have two suggestions:
>
> 1. Other projects such as HBase have a tool to exclude flaky tests from
> being executed. They track flaky tests and display them in a dashboard.
> This will allow good tests to pass while leaving time for folks to fix
> the flaky ones. Or we could manually exclude tests (this is what we used
> to do at Cloudera).
>
> 2. Dedicate a community "Bug Bash Day" / "Fix it Day". We had a bug bash
> day two years ago, and maybe it's time to repeat it:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75965105
> This is going to be tricky as we are in a pandemic and most of the
> community is working from home, unlike last time, when we could lock
> ourselves in a conference room and force everybody to work :)
>
> Thoughts?
>
>
> On Thu, Oct 22, 2020 at 12:14 PM Akira Ajisaka 
> wrote:
>
>> Hi Hadoop developers,
>>
>> Now there are a lot of failing unit tests and there is an issue to
>> tackle this bad situation.
>> https://issues.apache.org/jira/browse/HDFS-15646
>>
>> Although this issue is in HDFS project, this issue is related to all
>> the Hadoop developers. Please check the above URL, read the
>> description, and volunteer to dedicate more time to fix flaky tests.
>> Your contribution to fixing the flaky tests will be really
>> appreciated!
>>
>> Thank you Ahmed Hussein for your report.
>>
>> Regards,
>> Akira
>>
>> -
>> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>>
>>


Re: Fixing flaky tests in Apache Hadoop

2020-10-22 Thread Wei-Chiu Chuang
Thanks for raising the issue, Akira and Ahmed,

Fixing flaky tests is a thankless job so I want to take this opportunity to
recognize the time and effort.

We will always have flaky tests due to bad tests or simply infra issues.
Fixing flaky tests will take time but if they are not addressed it wastes
everybody's time.

Recognizing this problem, I have two suggestions:

1. Other projects such as HBase have a tool to exclude flaky tests from
being executed. They track flaky tests and display them in a dashboard.
This will allow good tests to pass while leaving time for folks to fix
the flaky ones. Or we could manually exclude tests (this is what we used
to do at Cloudera).

2. Dedicate a community "Bug Bash Day" / "Fix it Day". We had a bug bash
day two years ago, and maybe it's time to repeat it:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75965105
This is going to be tricky as we are in a pandemic and most of the
community is working from home, unlike last time, when we could lock
ourselves in a conference room and force everybody to work :)
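
To make suggestion 1 a bit more concrete, here is a rough sketch of how
flaky tests could be picked out of a build history and turned into an
exclude list. This is not the HBase tool; the record format, the function
names and the sample history are all assumptions made up for illustration.
A test counts as flaky here only if it both passed and failed on the same
commit.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return the sorted names of tests that both passed and failed on one commit.

    `runs` is an iterable of (commit, test_name, passed) tuples. This record
    format is hypothetical and only serves the illustration.
    """
    outcomes = defaultdict(set)
    for commit, test_name, passed in runs:
        outcomes[(commit, test_name)].add(passed)
    # Mixed outcomes on identical code point at the test or the infra,
    # not at the change under test.
    return sorted({test for (_, test), seen in outcomes.items() if len(seen) == 2})


def exclude_list(flaky_tests):
    """Render a comma-separated list of tests to skip (hypothetical format)."""
    return ",".join(flaky_tests)


# Hypothetical build history, invented for this example.
history = [
    ("abc123", "TestBalancer", True),
    ("abc123", "TestBalancer", False),
    ("abc123", "TestDFSShell", True),
    ("def456", "TestDFSShell", True),
    ("def456", "TestDecommission", False),
]
flaky = find_flaky_tests(history)
```

A test that fails consistently (TestDecommission above) is deliberately
not treated as flaky, so real regressions would still block the build.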

Thoughts?


On Thu, Oct 22, 2020 at 12:14 PM Akira Ajisaka  wrote:

> Hi Hadoop developers,
>
> Now there are a lot of failing unit tests and there is an issue to
> tackle this bad situation.
> https://issues.apache.org/jira/browse/HDFS-15646
>
> Although this issue is in HDFS project, this issue is related to all
> the Hadoop developers. Please check the above URL, read the
> description, and volunteer to dedicate more time to fix flaky tests.
> Your contribution to fixing the flaky tests will be really
> appreciated!
>
> Thank you Ahmed Hussein for your report.
>
> Regards,
> Akira
>
>
>


JIRA privilege for committers/PMCs

2020-10-14 Thread Wei-Chiu Chuang
Hi,

Is there anyone who is a Hadoop committer or PMC member but does not have
the committer/administrator privilege (e.g. to assign/resolve jiras)?

JIRA projects like HADOOP have the "hadoop" and "hadoop-pmc" groups added as
administrators.
[image: Screen Shot 2020-10-14 at 3.20.59 PM.png]

So I would have thought all Hadoop committers/PMC members have administrator
privileges (assuming the hadoop group maps to all Hadoop committers and
hadoop-pmc to PMC members). But that does not seem to be the case, and I had
to add the administrator privileges to committers manually. Does anyone know
how it's supposed to work?

Thanks,
Weichiu


Re: Hadoop 3.2.2 Release Code Freeze Plan

2020-10-01 Thread Wei-Chiu Chuang
Thanks Xiaoqiao!
Glad to see this is moving along. I noticed your dashboard has private
filters and therefore the results are not visible publicly.

On Tue, Sep 29, 2020 at 10:02 PM Xiaoqiao He  wrote:

> Hi All,
>
> The plan is to code freeze for the Hadoop-3.2.2 release on 2020/10/15. As
> of now, most of the issues have been resolved, and two JIRAs targeting
> 3.2.2 are still open, tracked by [1].
>
> * https://issues.apache.org/jira/browse/YARN-10244
> * https://issues.apache.org/jira/browse/HADOOP-17287
>
> Please let us know if these are really blocking for 3.2.2; if not, kindly
> move them out. If they need to be included in 3.2.2, please try to push
> them forward soon.
>
> Thanks & Best Regards,
> He Xiaoqiao
>
> [1]
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12335948
>


Re: [VOTE] Moving Ozone to a separated Apache project

2020-09-29 Thread Wei-Chiu Chuang
+1

Look forward to it.

On Tue, Sep 29, 2020 at 4:48 PM Konstantin Shvachko 
wrote:

> +1
>
> Stay safe,
> --Konstantin
>
> On Thu, Sep 24, 2020 at 10:59 PM Elek, Marton  wrote:
>
> > Hi all,
> >
> > Thank you for all the feedback and requests,
> >
> > As we discussed in the previous thread(s) [1], Ozone is proposed to be a
> > separated Apache Top Level Project (TLP)
> >
> > The proposal with all the details, motivation and history is here:
> >
> >
> >
> > https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Hadoop+subproject+to+Apache+TLP+proposal
> >
> > This voting runs for 7 days and will be concluded at 2nd of October, 6AM
> > GMT.
> >
> > Thanks,
> > Marton Elek
> >
> > [1]: https://lists.apache.org/thread.html/rc6c79463330b3e993e24a564c6817aca1d290f186a1206c43ff0436a%40%3Chdfs-dev.hadoop.apache.org%3E
> >
> > -
> > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >
> >
>


[ANNOUNCE] Hui Fei is a new Apache Hadoop Committer

2020-09-23 Thread Wei-Chiu Chuang
I am pleased to announce that Hui Fei has accepted the invitation to become
a Hadoop committer.

He started contributing to the project in October 2016. Over the past 4
years he has contributed a lot in HDFS, especially in Erasure Coding,
Hadoop 3 upgrade, RBF and Standby Serving reads.

One of the biggest contributions is Hadoop 2->3 rolling upgrade support.
This was a major blocker for any existing Hadoop users to adopt Hadoop 3.
The adoption of Hadoop 3 has gone up after this. In the past the community
discussed a lot about Hadoop 3 rolling upgrade being a must-have, but no
one took the initiative to make it happen. I am personally very grateful
for this.

The work on EC is impressive as well. He managed to onboard EC in
production at scale, fixing tricky problems. Again, I am impressed and
grateful for the contribution in EC.

In addition to code contributions, he invested a lot in the community:

- Apache Hadoop Community 2019 Beijing Meetup
  (https://blogs.apache.org/hadoop/entry/hadoop-community-meetup-beijing-aug),
  where he discussed the operational experience of RBF in production
- Apache Hadoop Storage Community Sync Online
  (https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit#heading=h.irqxw1iy16zo),
  where he discussed the Hadoop 3 rolling upgrade support

Let's congratulate Hui for this new role!

Cheers,
Wei-Chiu Chuang (on behalf of the Apache Hadoop PMC)


[ANNOUNCE] Lisheng Sun is a new Apache Hadoop Committer

2020-09-23 Thread Wei-Chiu Chuang
I am pleased to announce that Lisheng Sun has accepted the invitation to
become a Hadoop committer.

Lisheng has actively contributed to the project since July 2019. He
contributed two new features: a dead datanode detector (HDFS-13571
<https://issues.apache.org/jira/browse/HDFS-13571>) and a new du
implementation (HDFS-14313
<https://issues.apache.org/jira/browse/HDFS-14313>). He also made many
improvements, including a number of short circuit read optimizations
(HDFS-15161 <https://issues.apache.org/jira/browse/HDFS-15161>) and
speeding up NN fsimage loading (HDFS-13694
<https://issues.apache.org/jira/browse/HDFS-13694> and HDFS-13693
<https://issues.apache.org/jira/browse/HDFS-13693>). Code-wise, he has
resolved 57 Hadoop jiras.

Let's congratulate Lisheng for this new role!

Cheers,
Wei-Chiu Chuang (on behalf of the Apache Hadoop PMC)


Re: [VOTE] Release Apache Hadoop 2.10.1 (RC0)

2020-09-20 Thread Wei-Chiu Chuang
+1 (binding)

I did a security scan for the 2.10.1 RC0 and it looks fine to me.

Checked recent critical/blocker HDFS issues that are not in 2.10.1. It
looks mostly fine. Most of them are Hadoop 3.x features (EC, ... etc) but
there is one worth attention:


   1. HDFS-14674 [SBN read] Got an unexpected txid when tail editlog.
      But looking at the jira, it doesn't apply to 2.x so I think we are
      good there.

I wanted to do an API compat check but didn't finish it yet. If someone
can do it quickly that would be great. (Does anyone know of a cloud
service that we can quickly do a Java API compat check?)

Cheers,
Wei-Chiu

On Sun, Sep 20, 2020 at 9:25 AM Sunil Govindan  wrote:

> +1 (binding)
>
> - verified checksum and sign. Shows as a Good signature from "Masatake
> Iwasaki (CODE SIGNING KEY) "
> - built from source
> - ran basic MR job and looks good
> - UI also seems fine
>
> Thanks,
> Sunil
>
> On Sun, Sep 20, 2020 at 11:38 AM Masatake Iwasaki <
> iwasak...@oss.nttdata.co.jp> wrote:
>
> > The RC0 got 2 binding +1's and 2 non-binding +1's [1].
> >
> > Based on the discussion about release vote [2],
> > bylaws[3] defines the periods in minimum terms.
> > We can extend it if there is not enough activity.
> >
> > I would like to extend the period to 7 days,
> > until Monday September 21 at 10:00 am PDT.
> >
> > I will appreciate additional votes.
> >
> > Thanks,
> > Masatake Iwasaki
> >
> > [1] https://lists.apache.org/thread.html/r16a7f36315a0673c7d522c41065e7ef9c9ee15c76ffcb5db80931002%40%3Ccommon-dev.hadoop.apache.org%3E
> > [2] https://lists.apache.org/thread.html/e392b902273ee0c14ba34d72c44630e05f54cb3976109af510592ea2%401403330080%40%3Ccommon-dev.hadoop.apache.org%3E
> > [3] https://hadoop.apache.org/bylaws.html
> >
> > On 2020/09/15 2:59, Masatake Iwasaki wrote:
> > > Hi folks,
> > >
> > > This is the first release candidate for the second release of Apache
> > Hadoop 2.10.
> > > It contains 218 fixes/improvements since 2.10.0 [1].
> > >
> > > The RC0 artifacts are at:
> > > http://home.apache.org/~iwasakims/hadoop-2.10.1-RC0/
> > >
> > > RC tag is release-2.10.1-RC0:
> > > https://github.com/apache/hadoop/tree/release-2.10.1-RC0
> > >
> > > The maven artifacts are hosted here:
> > > https://repository.apache.org/content/repositories/orgapachehadoop-1279/
> > >
> > > My public key is available here:
> > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > >
> > > The vote will run for 5 days, until Saturday, September 19 at 10:00 am
> > PDT.
> > >
> > > [1] https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.1
> > >
> > > Thanks,
> > > Masatake Iwasaki
> > >
> > >
> >
> > -
> > To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
> >
> >
>


Two more binding votes required for Apache Hadoop 2.10.1 RC0

2020-09-18 Thread Wei-Chiu Chuang
Masatake is doing a great job rolling out 2.10.1 RC0. Let's give him a
final push to get it out.

Thanks,
Wei-Chiu


Re: Sep. Y2020 Hadoop Meetup in China

2020-09-16 Thread Wei-Chiu Chuang
This is great news!
Can you also share this with the user@ and user-zh@ mailing lists? We need
to get the message out.

On Tue, Sep 15, 2020 at 10:36 PM 俊平堵  wrote:

> Hi,
> Last August, we had a very successful meetup in Beijing (
> https://blogs.apache.org/hadoop) which helped to promote Hadoop user
> adoption and development locally. This year, we would like to do it again
> in Shanghai on 9/26.
> So if you have a topic to share on Hadoop development or some relevant
> interesting case, please nominate it by replying to this email or
> contacting me directly. Slots are limited, first come first served. Thanks!
> BTW, this meetup is planned to be offline. If many people want to share
> but cannot come to Shanghai by then, we can plan something online later.
>
> Thanks,
>
> Junping
>


[ANNOUNCEMENT] Apache Hadoop 2.9.x release line end of life

2020-09-07 Thread Wei-Chiu Chuang
The Apache Hadoop community has voted to end the release line of 2.9.x.
(Vote thread: https://s.apache.org/ApacheHadoop2.9EOLVote)

The first 2.9 release, 2.9.0, was released on 12/17/2017.
The last 2.9 release, 2.9.2, was released on 11/19/2018.

Existing 2.9.x users are encouraged to upgrade to newer release lines:
2.10.0 / 3.1.4 / 3.2.1 / 3.3.0.

Please check out our Release EOL wiki for details:
https://cwiki.apache.org/confluence/display/HADOOP/EOL+(End-of-life)+Release+Branches

Best Regards,
Wei-Chiu Chuang (On Behalf of the Apache Hadoop PMC)


[DISCUSS] Hadoop 3.2.2 release

2020-09-01 Thread Wei-Chiu Chuang
Hi folks,

I was reminded by Xiaoqiao that Hadoop 3.2.1 was made almost a year ago
(released on September 22, 2019) and we're overdue for a follow-up.

@Rohith Sharma K S, you were the RM for Hadoop 3.2.1. Are we planning to
make the 3.2.2 release soon? Xiaoqiao wants to help with the release.

Thanks,
Weichiu


[DISCUSS] Hadoop 2.10.1 release

2020-08-31 Thread Wei-Chiu Chuang
Hello,

I see that Masatake graciously agreed to volunteer for the Hadoop 2.10.1
release work in the 2.9 branch EOL discussion thread:
https://s.apache.org/hadoop2.9eold

Would anyone else like to contribute as well?

Thanks


Re: [DISCUSS] fate of branch-2.9

2020-08-26 Thread Wei-Chiu Chuang
Bumping this thread after 6 months.

Is anyone still interested in the 2.9 release line? Or are we good to start
the EOL process? 2.9.2 was released in Nov 2018.

I'd really like to see the community converge on fewer release lines and
make more frequent releases in each line.

Thanks,
Weichiu


On Fri, Mar 6, 2020 at 5:47 PM Wei-Chiu Chuang  wrote:

> I think that's a great suggestion.
> Currently, we make 1 minor release per year, and each minor release
> brings in 1,000 to 2,000 commits compared with the previous one.
> I can totally understand it is a big bite for users to swallow. Having a
> more frequent release cycle, plus LTS and non-LTS releases should help with
> this. (Of course we will need to make the release preparation much easier,
> which is currently a pain)
>
> I am happy to discuss the release model further in the dev ML. LTS v.s.
> non-LTS is one suggestion.
>
> Another similar issue: In the past Hadoop strived to
> maintain compatibility. However, this is no longer sustainable as more CVEs
> come from our dependencies: netty, jetty, jackson ... etc.
> In many cases, updating the dependencies brings breaking changes. More
> recently, especially in Hadoop 3.x, I started to make the effort to update
> dependencies much more frequently. How do users feel about this change?
>
> On Thu, Mar 5, 2020 at 7:58 AM Igor Dvorzhak 
> wrote:
>
>> Maybe Hadoop will benefit from adopting a similar release and support
>> strategy as Java? I.e. designate some releases as LTS and support them for
>> 2 (?) years (it seems that 2.7.x branch was de-facto LTS), other non-LTS
>> releases will be supported for 6 months (or until next release). This
>> should allow to reduce maintenance cost of non-LTS release and provide
>> conservative users desired stability by allowing them to wait for new LTS
>> release and upgrading to it.
>>
>> On Thu, Mar 5, 2020 at 1:26 AM Rupert Mazzucco 
>> wrote:
>>
>>> After recently jumping from 2.7.7 to 2.10 without issue myself, I vote
>>> for keeping only the 2.10 line.
>>> It would seem all other 2.x branches can upgrade to a 2.10.x easily if
>>> they feel like upgrading at all,
>>> unlike a jump to 3.x, which may require more planning.
>>>
>>> I also vote for having only one main 3.x branch. Why are there 3.1.x and
>>> 3.2.x seemingly competing,
>>> and now 3.3.x? For a community that does not have the resources to
>>> manage multiple release lines,
>>> you guys sure like to multiply release lines a lot.
>>>
>>> Cheers
>>> Rupert
>>>
>>> Am Mi., 4. März 2020 um 19:40 Uhr schrieb Wei-Chiu Chuang
>>> :
>>>
>>>> Forwarding the discussion thread from the dev mailing lists to the user
>>>> mailing lists.
>>>>
>>>> I'd like to get an idea of how many users are still on Hadoop 2.9.
>>>> Please share your thoughts.
>>>>
>>>> On Mon, Mar 2, 2020 at 6:30 PM Sree Vaddi
>>>>  wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Mon, Mar 2, 2020 at 5:12 PM, Wei-Chiu Chuang wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Following the discussion to end branch-2.8, I want to start a
>>>>> discussion around what's next with branch-2.9. I am hesitant to use
>>>>> the words "end of life" but consider these facts:
>>>>>
>>>>> * 2.9.0 was released Dec 17, 2017.
>>>>> * 2.9.2, the last 2.9.x release, went out Nov 19 2018, which is more
>>>>> than 15 months ago.
>>>>> * no one seems to be interested in being the release manager for 2.9.3.
>>>>> * Most if not all of the active Hadoop contributors are using Hadoop
>>>>> 2.10 or Hadoop 3.x.
>>>>> * We as a community do not have the cycles to manage multiple release
>>>>> lines, especially since Hadoop 3.3.0 is coming out soon.
>>>>>
>>>>> It is perhaps the time to gradually reduce our footprint in Hadoop
>>>>> 2.x, and encourage people to upgrade to Hadoop 3.x
>>>>>
>>>>> Thoughts?
>>>>>
>>>>>


Re: Mandarin Hadoop online sync this week

2020-08-26 Thread Wei-Chiu Chuang
This week's summary:

8/26 Mandarin online sync

Weichiu, Xiaoqiao, Baoloongmao, Hui, wuweiwei, Leon Gao, Lisheng Sun,
Jinglun, zhoubin86

Leon shared a DataNode improvement proposal at Uber.

Different storage density. Balance disk IO among different disk size.

Problem: archive disk’s IO utilization is very low. Want to use it more.

The proposed change will be based on the HSM, with quite minimal change.

Cold data is in GCS. A simple scheme to copy cold data to GCS. The data in
GCS is not intended to be accessible readily, so don’t worry about the
scheme change.

Jinglun shared the solutions to an operational problem: NameNode QPS
dropped, waiting time more than 1 second, processing time more than 400ms.
Solution: (1) migrate a directory to a new namespace.  (2) RBF can hash out
a directory to multiple namespaces, reducing the pressure of a particular
NN.
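
As a rough picture of what solution (2) means, the sketch below spreads the
first-level children of one directory across several namespaces by hashing
the child name. It is not how the RBF resolver is actually implemented; the
function name, the choice of SHA-256 and the namespace names are
assumptions for illustration only.

```python
import hashlib


def pick_namespace(path, mount_point, namespaces):
    """Map a path under `mount_point` to one namespace by hashing its first-level child.

    Hypothetical helper: everything below the same first-level child lands
    in the same namespace, while different children are spread out.
    """
    if not namespaces:
        raise ValueError("at least one namespace is required")
    prefix = mount_point.rstrip("/") + "/"
    if not path.startswith(prefix):
        raise ValueError(f"{path} is not under {mount_point}")
    relative = path[len(prefix):]
    first_level = relative.split("/", 1)[0]
    # A stable hash keeps the mapping identical across processes and restarts.
    digest = hashlib.sha256(first_level.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(namespaces)
    return namespaces[index]


# Hypothetical namespaces and paths, invented for this example.
namespaces = ["ns1", "ns2", "ns3"]
target = pick_namespace("/data/logs/2020/08/26/part-0", "/data", namespaces)
```

The point of hashing on the child name is that no single NameNode has to
serve the whole hot directory, at the cost of the directory's listing being
spread over several namespaces.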

Baoloongmao suggested we can port Ozone features into Hadoop Common. For
example, Java-based configuration is a powerful feature which can benefit
Hadoop as well.


On Tue, Aug 25, 2020 at 9:47 AM Wei-Chiu Chuang  wrote:

> Hello,
>
> There hasn't been a Mandarin online sync for quite some time. I'd like to
> call for one this week:
>
> Date/time:
>
> 8/27 Thursday Beijing Time 1PM
> 8/26 Wednesday US Pacific Time 10PM
>
> Link:
> https://cloudera.zoom.us/j/880548968
>
> Past sync summary:
>
> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit
>
>
>


Re: Mandarin Hadoop online sync this week

2020-08-25 Thread Wei-Chiu Chuang
Brahma,

Thanks for bringing this up. I don't have a specific topic in mind.

For me, I'd like to use this as an opportunity to get a feeling of what
features/problems the community is interested in. And in general, what
Hadoop versions is the community using? Can we stop 2.9.x development and
concentrate on 3.2/3.3 and trunk? I am interested in arranging a meetup
for a broader audience in English for topics like this soon too.

Another thing I'd like to explore (if there's time) is how we can involve
the Chinese community better. How can we grow more Chinese/Asian committers?

On Tue, Aug 25, 2020 at 9:58 AM Brahma Reddy Battula 
wrote:

> Hi,
>
> What are you planning for this week?
>
> On Tue, Aug 25, 2020 at 10:18 PM Wei-Chiu Chuang 
> wrote:
>
>> Hello,
>>
>> There hasn't been a Mandarin online sync for quite some time. I'd like to
>> call for one this week:
>>
>> Date/time:
>>
>> 8/27 Thursday Beijing Time 1PM
>> 8/26 Wednesday US Pacific Time 10PM
>>
>> Link:
>> https://cloudera.zoom.us/j/880548968
>>
>> Past sync summary:
>>
>> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit
>>
>
>
> --
>
>
>
> --Brahma Reddy Battula
>


Mandarin Hadoop online sync this week

2020-08-25 Thread Wei-Chiu Chuang
Hello,

There hasn't been a Mandarin online sync for quite some time. I'd like to
call for one this week:

Date/time:

8/27 Thursday Beijing Time 1PM
8/26 Wednesday US Pacific Time 10PM

Link:
https://cloudera.zoom.us/j/880548968

Past sync summary:
https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit


[ANNOUNCE] New Apache Hadoop Committer - He Xiaoqiao

2020-06-11 Thread Wei-Chiu Chuang
In bcc: general@

It's my pleasure to announce that He Xiaoqiao has been elected as a
committer on the Apache Hadoop project recognizing his continued
contributions to the project.

Please join me in congratulating him.

Hearty Congratulations & Welcome aboard Xiaoqiao!

Wei-Chiu Chuang
(On behalf of the Hadoop PMC)


Re: [DISCUSS] making Ozone a separate Apache project

2020-05-13 Thread Wei-Chiu Chuang
+1

On Wed, May 13, 2020 at 8:32 AM anu engineer  wrote:

> +1
> —Anu
>
> > On May 13, 2020, at 12:53 AM, Elek, Marton  wrote:
> >
> > 
> >
> > I would like to start a discussion to make a separate Apache project for
> Ozone
> >
> >
> >
> > ### HISTORY [1]
> >
> > * Apache Hadoop Ozone development started on a feature branch of Hadoop
> repository (HDFS-7240)
> >
> > * In the October of 2017 a discussion has been started to merge it to
> the Hadoop main branch
> >
> > * After a long discussion it's merged to Hadoop trunk at the March of
> 2018
> >
> > * During the discussion of the merge, it was suggested multiple times to
> create a separated project for the Ozone. But at that time:
> >1). Ozone was tightly integrated with Hadoop/HDFS
> >2). There was an active plan to use Block layer of Ozone (HDDS or
> HDSL at that time) as the block level of HDFS
> >3). The community of Ozone was a subset of the HDFS community
> >
> > * The first beta release of Ozone was just released. Seems to be a good
> time before the first GA to make a decision about the future.
> >
> >
> >
> > ### WHAT HAS BEEN CHANGED
> >
> > During the last years Ozone became more and more independent both at the
> community and code side. The separation has been suggested again and again
> (for example by Owen [2] and Vinod [3])
> >
> >
> >
> > From COMMUNITY point of view:
> >
> >
> >  * Fortunately more and more new contributors are helping Ozone.
> Originally the Ozone community was a subset of HDFS project. But now a
> bigger and bigger part of the community is related to Ozone only.
> >
> >  * It seems to be easier to _build_ the community as a separated project.
> >
> >  * A new, younger project might have different practices (communication,
> > committer criteria, development style) compared to an old, mature project
> >
> >  * It's easier to communicate (and improve) these standards in a
> separated projects with clean boundaries
> >
> >  * Separated project/brand can help to increase the adoption rate and
> attract more individual contributor (AFAIK it has been seen in Submarine
> after a similar move)
> >
> > * Contribution process can be communicated more easily, we can make
> first time contribution more easy
> >
> >
> >
> > From CODE point of view Ozone became more and more independent:
> >
> >
> > * Ozone has different release cycle
> >
> > * Code is already separated from Hadoop code base
> (apache/hadoop-ozone.git)
> >
> > * It has separated CI (github actions)
> >
> > * Ozone uses different (more strict) coding style (zero toleration of
> unit test / checkstyle errors)
> >
> > * The code itself became more and more independent from Hadoop on Maven
> level. Originally it was compiled together with the in-tree latest Hadoop
> snapshot. Now it depends on released Hadoop artifacts (RPC,
> Configuration...)
> >
> > * It starts to use multiple version of Hadoop (on client side)
> >
> > * Volume of resolved issues are already very high on Ozone side (Ozone
> had slightly more resolved issues than HDFS/YARN/MAPREDUCE/COMMON all
> together in the last 2-3 months)
> >
> >
> > Summary: Before the first Ozone GA release, It seems to be a good time
> to discuss the long-term future of Ozone. Managing it as a separated TLP
> project seems to have more benefits.
> >
> >
> > Please let me know what your opinion is...
> >
> > Thanks a lot,
> > Marton
> >
> >
> >
> >
> >
> > [1]: For more details, see:
> https://github.com/apache/hadoop-ozone/blob/master/HISTORY.md
> >
> > [2]:
> https://lists.apache.org/thread.html/0d0253f6e5fa4f609bd9b917df8e1e4d8848e2b7fdb3099b730095e6%40%3Cprivate.hadoop.apache.org%3E
> >
> > [3]:
> https://lists.apache.org/thread.html/8be74421ea495a62e159f2b15d74627c63ea1f67a2464fa02c85d4aa%40%3Chdfs-dev.hadoop.apache.org%3E
> >
> > -
> > To unsubscribe, e-mail: ozone-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: ozone-dev-h...@hadoop.apache.org
> >
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


Re: [VOTE] Release Apache Hadoop 3.1.4 (RC0)

2020-05-04 Thread Wei-Chiu Chuang
Gabor, I'm sorry, there's a test failure in branch-3.1: HDFS-14599.

I just cherry-picked the fix to branch-3.2 and branch-3.1. It's a test-only
fix so technically I could live with it. But it would be best to add
the fix to 3.1.4 as well.


On Mon, May 4, 2020 at 3:20 PM Gabor Bota  wrote:

> Hi folks,
>
> I have put together a release candidate (RC0) for Hadoop 3.1.4.
>
> The RC is available at: http://people.apache.org/~gabota/hadoop-3.1.4-RC0/
> The RC tag in git is here:
> https://github.com/apache/hadoop/releases/tag/release-3.1.4-RC0
> The maven artifacts are staged at
> https://repository.apache.org/content/repositories/orgapachehadoop-1266/
>
> You can find my public key at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> and http://keys.gnupg.net/pks/lookup?op=get&search=0xB86249D83539B38C
>
> Please try the release and vote. The vote will run for 5 weekdays,
> until May 11. 2020. 23:00 CET.
>
> Thanks,
> Gabor
>
>
>

