[jira] [Resolved] (HDFS-17181) WebHDFS not considering whether a DN is good when called from outside the cluster

2024-02-20 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-17181.

Fix Version/s: 3.5.0
   Resolution: Fixed

> WebHDFS not considering whether a DN is good when called from outside the 
> cluster
> -
>
> Key: HDFS-17181
> URL: https://issues.apache.org/jira/browse/HDFS-17181
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, webhdfs
>Affects Versions: 3.3.6
>Reporter: Lars Francke
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
> Attachments: Test_fix_for_HDFS-171811.patch
>
>
> When calling WebHDFS to create a file (I'm sure the same problem occurs for 
> other actions e.g. OPEN but I haven't checked all of them yet) it will 
> happily redirect to nodes that are in maintenance.
> The reason is in the 
> [{{chooseDatanode}}|https://github.com/apache/hadoop/blob/rel/release-3.3.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java#L307C9-L315]
>  method in {{NamenodeWebHdfsMethods}} where it will only call the 
> {{BlockPlacementPolicy}} (which considers all these edge cases) in case the 
> {{remoteAddr}} (i.e. the address making the request to WebHDFS) is also 
> running a DataNode.
>  
> In all other cases it just refers to 
> [{{NetworkTopology#chooseRandom}}|https://github.com/apache/hadoop/blob/rel/release-3.3.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java#L342-L343]
>  which does not consider any of these circumstances (e.g. load, maintenance).
> I don't understand the reason not to just always defer to the placement 
> policy, and we're currently testing a patch to do just that.
> I have attached a draft patch for now.
>  
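As a rough illustration of the two code paths described above, here is a minimal, self-contained sketch. All class and method names are stand-ins, not the actual NamenodeWebHdfsMethods or BlockPlacementPolicy APIs:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Simplified, standalone model of the selection logic described above.
class DatanodeChoiceSketch {
  static class DatanodeInfo {
    final String host;
    final boolean inMaintenance;
    DatanodeInfo(String host, boolean inMaintenance) {
      this.host = host;
      this.inMaintenance = inMaintenance;
    }
  }

  // Shape of the reported bug: the maintenance-aware placement policy is
  // consulted only when the caller's address is itself a DataNode host;
  // every other caller gets an unfiltered random pick, which may land on
  // a node in maintenance.
  static DatanodeInfo chooseBuggy(List<DatanodeInfo> dns, String remoteAddr,
      Random rnd) {
    boolean callerRunsDatanode = false;
    for (DatanodeInfo d : dns) {
      if (d.host.equals(remoteAddr)) {
        callerRunsDatanode = true;
      }
    }
    if (callerRunsDatanode) {
      return choosePerPlacementPolicy(dns, rnd);
    }
    // Analogous to NetworkTopology#chooseRandom: no load/maintenance checks.
    return dns.get(rnd.nextInt(dns.size()));
  }

  // Shape of the proposed fix: always defer to the placement policy.
  static DatanodeInfo chooseFixed(List<DatanodeInfo> dns, String remoteAddr,
      Random rnd) {
    return choosePerPlacementPolicy(dns, rnd);
  }

  // Stand-in for BlockPlacementPolicy: filter out nodes in maintenance
  // before picking one at random.
  static DatanodeInfo choosePerPlacementPolicy(List<DatanodeInfo> dns,
      Random rnd) {
    List<DatanodeInfo> good = new ArrayList<>();
    for (DatanodeInfo d : dns) {
      if (!d.inMaintenance) {
        good.add(d);
      }
    }
    return good.get(rnd.nextInt(good.size()));
  }
}
```

With the fixed path, an external caller can never be redirected to a maintenance node, because every request goes through the same filtering step.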



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-17024) Potential data race introduced by HDFS-15865

2023-10-26 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-17024.

Fix Version/s: 3.4.0
   Resolution: Fixed

> Potential data race introduced by HDFS-15865
> 
>
> Key: HDFS-17024
> URL: https://issues.apache.org/jira/browse/HDFS-17024
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.3.1
>    Reporter: Wei-Chiu Chuang
>Assignee: Segawa Hiroaki
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> After HDFS-15865, we found a client abort due to an NPE.
> {noformat}
> 2023-04-10 16:07:43,409 ERROR 
> org.apache.hadoop.hbase.regionserver.HRegionServer: * ABORTING region 
> server kqhdp36,16020,1678077077562: Replay of WAL required. Forcing server 
> shutdown *
> org.apache.hadoop.hbase.DroppedSnapshotException: region: WAFER_ALL,16|CM 
> RIE.MA1|CP1114561.18|PROC|,1625899466315.0fbdf0f1810efa9e68af831247e6555f.
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2870)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2539)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2511)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2401)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:613)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:582)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$1000(MemStoreFlusher.java:69)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:362)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:880)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:781)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:898)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:850)
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:76)
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:105)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishClose(HFileWriterImpl.java:859)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.close(HFileWriterImpl.java:687)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileWriter.close(StoreFileWriter.java:393)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:78)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:1047)
> at 
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2349)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2806)
> {noformat}
> This is only possible if a data race happened. Filing this jira to track 
> down and eliminate the data race.
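The NPE pattern above is the classic check-then-act race. Below is a minimal, standalone sketch of the failure mode and one common mitigation; the names are illustrative, not the actual DataStreamer code or the actual HDFS-17024 fix:

```java
// One thread can null a shared field while another thread sits between
// its null check and its dereference.
class AckStateSketch {
  // 'volatile' gives cross-thread visibility, but it does not by itself
  // make a check-then-act sequence atomic.
  private volatile StringBuilder lastAck = new StringBuilder("seqno:41");

  // Racy shape: between the null check and lastAck.length(), close() may
  // run on another thread and null the field, producing an NPE.
  int racyLength() {
    if (lastAck != null) {
      return lastAck.length();
    }
    return 0;
  }

  // Safer shape: read the field once into a local, so the check and the
  // use are guaranteed to operate on the same reference.
  int safeLength() {
    StringBuilder snapshot = lastAck;
    return snapshot != null ? snapshot.length() : 0;
  }

  void close() {
    lastAck = null;
  }
}
```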






Join us at the Storage User Group Meetup!

2023-10-16 Thread Wei-Chiu Chuang
Hi

Please join us at the Storage Meetup at Cloudera's office next Wednesday.
https://www.meetup.com/futureofdata-sanfrancisco/events/295917033/

We have HDFS developers from Uber joining us to talk about optimizing HDFS
for high-density disks, and developers from Cloudera talking about Apache
Ozone and Apache Iceberg.

I am being told this is an in-person event but it will be live streamed
too. Please sign up to get more details about this event.

Thanks,
Wei-Chiu


HADOOP-18207 hadoop-logging module about to land

2023-07-26 Thread Wei-Chiu Chuang
Hi,

I am preparing to resolve HADOOP-18207
 (
https://github.com/apache/hadoop/pull/5717).

This change affects all modules. Once it lands, it will eliminate almost
all direct log4j usage.

As always, landing such a big piece is tricky. I am sorry for the mishaps
last time and am doing more due diligence to make it a smoother transition.
I am triggering one last precommit check. Once the change is merged, Viraj
and I will pay attention to any potential problems.

Weichiu


Re: [DISCUSS][HDFS] Add rust binding for libhdfs

2023-07-17 Thread Wei-Chiu Chuang
Inline

On Sat, Jul 15, 2023 at 5:04 AM Ayush Saxena  wrote:

> Forwarding from dev@hadoop to relevant ML
>
> Original mail:
> https://lists.apache.org/thread/r5rcmc7lwwvkysj0320myxltsyokp9kq
>
> -Ayush
>
> On 2023/07/15 09:18:42 Xuanwo wrote:
> > Hello, everyone.
> >
> > I'm the maintainer of [hdfs-sys]: A binding to HDFS Native C API for
> Rust. I want to know whether it would be a good idea to accept hdfs-sys as
> part of the Hadoop project?
> >
> > Users of hdfs-sys for now:
> >
> > - [OpenDAL]: An Apache Incubator project that allows users to easily and
> efficiently retrieve data from various storage services in a unified way.
> > - [Databend]: A modern cloud data warehouse focusing on reducing cost
> and complexity for your massive-scale analytics needs. (via OpenDAL)
> > - [RisingWave]: The distributed streaming database: SQL stream
> processing with Postgres-like experience. (via OpenDAL)
> > - [LakeSoul]: an end-to-end, realtime and cloud native Lakehouse
> framework
> >
> > Licenses information of hdfs-sys:
> >
> > - hdfs-sys itself licensed under Apache-2.0
> > - hdfs-sys only depends on the following libs: cc@1.0.73, glob@0.3.1,
> hdfs-sys@0.3.0, java-locator@0.1.5, lazy_static@1.4.0, they are all dual
> licensed under Apache-2.0 and MIT.

>
> > Works need to do if accept:
> >
> > - Replace libdirent with the same dirent API implemented in HDFS project.
> > - Remove all bundled hdfs C code.
>
What is libdirent? How is it relevant in this context?

How tightly coupled is it to a specific Hadoop version? I am wondering if
it's possible to host it in a separate Hadoop repo, if it's accepted. The
concern I have as a release manager is that it makes my life harder to
ensure the quality of a language binding that I am not familiar with.

> >
> > [hdfs-sys]: https://github.com/Xuanwo/hdfs-sys
> > [OpenDAL]: https://github.com/apache/incubator-opendal
> > [Databend]: https://github.com/datafuselabs/databend
> > [RisingWave]: https://github.com/risingwavelabs/risingwave
> > [LakeSoul]: https://github.com/lakesoul-io/LakeSoul
> >
> > Xuanwo
> >
>
>


[jira] [Created] (HDFS-17080) Fix ec connection leak (GitHub PR#5807)

2023-07-11 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDFS-17080:
--

 Summary: Fix ec connection leak (GitHub PR#5807)
 Key: HDFS-17080
 URL: https://issues.apache.org/jira/browse/HDFS-17080
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Wei-Chiu Chuang


Creating this jira to track GitHub PR #5807.

{quote}Description of PR
This PR fixes an EC connection leak that occurs when an exception is thrown
while constructing the reader.

How was this patch tested?
Cluster: Presto
Data: EC
Query: select col from table(EC data) limit 10

Presto is a long-running process that handles queries.
In this case, once it has fetched 10 records, it interrupts the other threads.
Those threads may be constructing a reader or fetching the next record.
If fetching the next record is interrupted, the exception is caught and Presto 
closes the reader.
But if constructing the reader is interrupted, Presto cannot close it because 
the reader has not yet been created on the Presto side.
So we can observe whether the EC connections are closed after an EC limit 
query, using the netstat command.{quote}
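The leak pattern described in the quote can be sketched in a few lines: if a constructor acquires resources and then throws, the caller has no object to close, so the constructor itself must release what it already acquired. A self-contained illustration follows; all names are hypothetical, not the actual HDFS EC reader code:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// A reader that opens several connections in its constructor. If
// construction fails partway (e.g. the querying thread was interrupted),
// the caller never receives a reader to close, so anything already opened
// leaks unless the constructor cleans up before rethrowing.
class StripedReaderSketch implements Closeable {
  static int openConnections = 0; // visible for the example only

  private final List<Closeable> connections = new ArrayList<>();

  StripedReaderSketch(int needed, int failAt) throws IOException {
    try {
      for (int i = 0; i < needed; i++) {
        if (i == failAt) {
          // stands in for an interruption mid-construction
          throw new IOException("interrupted while constructing reader");
        }
        openConnections++;
        connections.add(() -> openConnections--); // Closeable stub
      }
    } catch (IOException e) {
      close(); // the fix: release partially acquired connections
      throw e;
    }
  }

  @Override
  public void close() throws IOException {
    for (Closeable c : connections) {
      c.close();
    }
    connections.clear();
  }
}
```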






[ANNOUNCE] Apache Hadoop 3.3.6 release

2023-06-26 Thread Wei-Chiu Chuang
On behalf of the Apache Hadoop Project Management Committee, I am pleased
to announce the release of Apache Hadoop 3.3.6.

It contains 117 bug fixes, improvements and enhancements since 3.3.5. Users
of Apache Hadoop 3.3.5 and earlier should upgrade to this release.

https://hadoop.apache.org/release/3.3.6.html
Feature highlights:

SBOM artifacts

Starting from this release, Hadoop publishes Software Bill of Materials
(SBOM) using
CycloneDX Maven plugin. For more information about SBOM, please go to
[SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).

HDFS RBF: RDBMS based token storage support

HDFS Router-Based Federation now supports storing delegation tokens
on MySQL,
[HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
which improves token operation throughput over the original ZooKeeper-based
implementation.


New File System APIs

[HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved a
number of
HDFS-specific APIs to Hadoop Common to make it possible for certain
applications that
depend on HDFS semantics to run on other Hadoop compatible file systems.

In particular, recoverLease() and isFileClosed() are exposed through the
LeaseRecoverable interface, while setSafeMode() is exposed through the
SafeMode interface.
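A short sketch of how an application might probe for these capabilities. The interfaces below are stand-ins that only mirror the shape described (the real ones live in org.apache.hadoop.fs), so the example compiles without Hadoop on the classpath:

```java
// Stand-in for the described LeaseRecoverable capability.
interface LeaseRecoverableModel {
  boolean recoverLease(String path);
  boolean isFileClosed(String path);
}

// Stand-in for the described SafeMode capability.
interface SafeModeModel {
  enum Action { GET, ENTER, LEAVE }
  boolean setSafeMode(Action action);
}

class CapabilityProbe {
  // An application probes for the capability instead of downcasting to a
  // concrete HDFS class, so the same code can run against any Hadoop
  // compatible file system that implements the interface.
  static boolean tryRecoverLease(Object fs, String path) {
    if (fs instanceof LeaseRecoverableModel) {
      return ((LeaseRecoverableModel) fs).recoverLease(path);
    }
    return false; // file system does not expose lease recovery
  }
}
```

This instanceof-probe style is what lets code depending on HDFS semantics degrade gracefully on file systems that lack the capability.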

Many thanks to everyone who helped in this release by supplying patches,
reviewing them, helping get this release building and testing and
reviewing the final artifacts.

Weichiu


Re: [VOTE] Release Apache Hadoop 3.3.6 RC1

2023-06-25 Thread Wei-Chiu Chuang
Thanks all!
The vote passed with 6 binding +1 votes, no +0 or -1 votes, and 4
non-binding +1 votes.

Publishing the release bits and updating webpage and user docs now.

Thanks
to the binding votes from Ayush, Xiaoqiao, Sammi, Mukund, Masatake
and non-binding votes from Nilotpal, Viraj, Stephen, George and Ahmar.

On Fri, Jun 23, 2023 at 11:48 PM Ayush Saxena  wrote:

> +1 (Binding)
>
> * Built from source (x86 & Arm)
> * Successful native build on ubuntu 18.04(x86) & ubuntu 20.04(Arm)
> * Verified Checksums (x86 & Arm)
> * Verified Signatures (x86 & Arm)
> * Successful RAT check (x86 & Arm)
> * Verified the diff b/w the tag & the source tar
> * Built Ozone with 3.3.6, green build after a retrigger due to some OOM
> issues [1]
> * Built Tez with 3.3.6 green build [2]
> * Ran basic HDFS shell commands (Fs
> Operations/EC/RBF/StoragePolicy/Snapshots) (x86 & Arm)
> * Ran some basic Yarn shell commands.
> * Browsed through the UI (NN, DN, RM, NM, JHS) (x86 & Arm)
> * Ran some example Jobs (TeraGen, TeraSort, TeraValidate, WordCount,
> WordMean, Pi) (x86 & Arm)
> * Verified the output of `hadoop version` (x86 & Arm)
> * Ran some HDFS unit tests around FsOperations/EC/Observer Read/RBF/SPS
> * Skimmed over the contents of site jar
> * Skimmed over the staging repo.
> * Checked the NOTICE & Licence files.
>
> Thanx Wei-Chiu for driving the release, Good Luck!!!
>
> -Ayush
>
>
> [1] https://github.com/ayushtkn/hadoop-ozone/actions/runs/5282707769
> [2] https://github.com/apache/tez/pull/285#issuecomment-1590962978
>
> On Sat, 24 Jun 2023 at 09:43, Nilotpal Nandi 
> wrote:
>
>> +1 (Non-binding).
>> Thanks a lot Wei-Chiu for driving it.
>>
>> Thanks,
>> Nilotpal Nandi
>>
>> On 2023/06/23 21:51:56 Wei-Chiu Chuang wrote:
>> > +1 (binding)
>> >
>> > Note: according to the Hadoop bylaw, release vote is open for 5 days,
>> not 7
>> > days. So technically the time is almost up.
>> > https://hadoop.apache.org/bylaws#Decision+Making
>> >
>> > If you plan to cast a vote, please do so soon. In the meantime, I'll
>> start
>> > to prepare to wrap up the release work.
>> >
>> > On Fri, Jun 23, 2023 at 6:09 AM Xiaoqiao He 
>> wrote:
>> >
>> > > +1(binding)
>> > >
>> > > * Verified signature and checksum of all source tarballs.
>> > > * Built source code on Ubuntu and OpenJDK 11 by `mvn clean package
>> > > -DskipTests -Pnative -Pdist -Dtar`.
>> > > * Setup pseudo cluster with HDFS and YARN.
>> > > * Run simple FsShell - mkdir/put/get/mv/rm and check the result.
>> > > * Run example mr applications and check the result - Pi & wordcount.
>> > > * Checked the Web UI of NameNode/DataNode/Resourcemanager/NodeManager
>> etc.
>> > > * Checked git and JIRA using dev-support tools
>> > > `git_jira_fix_version_check.py` .
>> > >
>> > > Thanks WeiChiu for your work.
>> > >
>> > > NOTE: I believe the build fatal error report from me above is only
>> related
>> > > to my own environment.
>> > >
>> > > Best Regards,
>> > > - He Xiaoqiao
>> > >
>> > > On Thu, Jun 22, 2023 at 4:17 PM Chen Yi 
>> wrote:
>> > >
>> > > > Thanks Wei-Chiu for leading this effort !
>> > > >
>> > > > +1(Binding)
>> > > >
>> > > >
>> > > > + Verified the signature and checksum of all tarballs.
>> > > > + Started a web server and viewed documentation site.
>> > > > + Built from the source tarball on macOS 12.3 and OpenJDK 8.
>> > > > + Launched a pseudo distributed cluster using released binary
>> packages,
>> > > > done some HDFS dir/file basic operations.
>> > > > + Run grep, pi and wordcount MR tasks on the pseudo cluster.
>> > > >
>> > > > Bests,
>> > > > Sammi Chen
>> > > > 
>> > > > 发件人: Wei-Chiu Chuang 
>> > > > 发送时间: 2023年6月19日 8:52
>> > > > 收件人: Hadoop Common ; Hdfs-dev <
>> > > > hdfs-dev@hadoop.apache.org>; yarn-dev ;
>> > > > mapreduce-dev 
>> > > > 主题: [VOTE] Release Apache Hadoop 3.3.6 RC1
>> > > >
>> > > > I am inviting anyone to try and vote on this release candidate.
>> > > >
>> > > > Note:
>> > > > This is exactly the same as RC0, except the CHANGE

Re: [VOTE] Release Apache Hadoop 3.3.6 RC1

2023-06-23 Thread Wei-Chiu Chuang
+1 (binding)

Note: according to the Hadoop bylaw, release vote is open for 5 days, not 7
days. So technically the time is almost up.
https://hadoop.apache.org/bylaws#Decision+Making

If you plan to cast a vote, please do so soon. In the meantime, I'll start
to prepare to wrap up the release work.

On Fri, Jun 23, 2023 at 6:09 AM Xiaoqiao He  wrote:

> +1(binding)
>
> * Verified signature and checksum of all source tarballs.
> * Built source code on Ubuntu and OpenJDK 11 by `mvn clean package
> -DskipTests -Pnative -Pdist -Dtar`.
> * Setup pseudo cluster with HDFS and YARN.
> * Run simple FsShell - mkdir/put/get/mv/rm and check the result.
> * Run example mr applications and check the result - Pi & wordcount.
> * Checked the Web UI of NameNode/DataNode/Resourcemanager/NodeManager etc.
> * Checked git and JIRA using dev-support tools
> `git_jira_fix_version_check.py` .
>
> Thanks WeiChiu for your work.
>
> NOTE: I believe the build fatal error report from me above is only related
> to my own environment.
>
> Best Regards,
> - He Xiaoqiao
>
> On Thu, Jun 22, 2023 at 4:17 PM Chen Yi  wrote:
>
> > Thanks Wei-Chiu for leading this effort !
> >
> > +1(Binding)
> >
> >
> > + Verified the signature and checksum of all tarballs.
> > + Started a web server and viewed documentation site.
> > + Built from the source tarball on macOS 12.3 and OpenJDK 8.
> > + Launched a pseudo distributed cluster using released binary packages,
> > done some HDFS dir/file basic operations.
> > + Run grep, pi and wordcount MR tasks on the pseudo cluster.
> >
> > Bests,
> > Sammi Chen
> > 
> > 发件人: Wei-Chiu Chuang 
> > 发送时间: 2023年6月19日 8:52
> > 收件人: Hadoop Common ; Hdfs-dev <
> > hdfs-dev@hadoop.apache.org>; yarn-dev ;
> > mapreduce-dev 
> > 主题: [VOTE] Release Apache Hadoop 3.3.6 RC1
> >
> > I am inviting anyone to try and vote on this release candidate.
> >
> > Note:
> > This is exactly the same as RC0, except the CHANGELOG.
> >
> > The RC is available at:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/ (for amd64)
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-arm64/ (for arm64)
> >
> > Git tag: release-3.3.6-RC1
> > https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC1
> >
> > Maven artifacts were built on an x86 machine and are staged at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1380/
> >
> > My public key:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > Changelog:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/CHANGELOG.md
> >
> > Release notes:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/RELEASENOTES.md
> >
> > This is a relatively small release (by Hadoop standard) containing about
> > 120 commits.
> > Please give it a try, this RC vote will run for 7 days.
> >
> >
> > Feature highlights:
> >
> > SBOM artifacts
> > 
> > Starting from this release, Hadoop publishes Software Bill of Materials
> > (SBOM) using
> > CycloneDX Maven plugin. For more information about SBOM, please go to
> > [SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
> >
> > HDFS RBF: RDBMS based token storage support
> > 
> > HDFS Router-Based Federation now supports storing delegation tokens
> > on MySQL,
> > [HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
> > which improves token operation throughput over the original ZooKeeper-based
> > implementation.
> >
> >
> > New File System APIs
> > 
> > [HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671)
> moved a
> > number of
> > HDFS-specific APIs to Hadoop Common to make it possible for certain
> > applications that
> > depend on HDFS semantics to run on other Hadoop compatible file systems.
> >
> > In particular, recoverLease() and isFileClosed() are exposed through the
> > LeaseRecoverable interface, while setSafeMode() is exposed through the
> > SafeMode interface.
> >
>


Clean up old Hadoop release tarballs

2023-06-23 Thread Wei-Chiu Chuang
https://dist.apache.org/repos/dist/release/hadoop/common/ has Hadoop
release tarballs 3.3.1 ~ 3.3.5. I plan to remove the tarballs from 3.3.1 to
3.3.4 and leave only 3.3.5 (and the upcoming 3.3.6). Shout out if you have
something depending on the old release tarballs (you shouldn't)

Other release lines (2.10, 3.2) have two release tarballs each, which is
good. I'll leave it that way.

Weichiu


DockerHub admin for Apache Hadoop

2023-06-22 Thread Wei-Chiu Chuang
Ayush and I have acquired the DockerHub admin privilege for the Hadoop
project in order to facilitate the release of Hadoop 3.3.6.

Apache Infra allows only two seats per project. So if you need something,
let Ayush and me know and we will make it happen for you. If you are a
Docker guru and can't help but want to do more to make Hadoop easier and
better on DockerHub, feel free to let me know! I'm happy to give away my
seat (PMC member only).

https://cwiki.apache.org/confluence/display/HADOOP2/HowToRelease#HowToRelease-Dockerimages

https://infra.apache.org/docker-hub-policy.html

Best Regards,
Weichiu


Re: [VOTE] Release Apache Hadoop 3.3.6 RC1

2023-06-21 Thread Wei-Chiu Chuang
I am using Maven 3.6.3 on Mac (x86), JDK 1.8.0_341
No issue for me.

On Wed, Jun 21, 2023 at 5:48 AM Xiaoqiao He  wrote:

> Addendum:
> A. Build the release from sources using: `mvn clean install package
> -DskipTests=true -Dmaven.javadoc.skip=true`
> B. It works well when using the same command to build from the
> source branch trunk.
>
> On Wed, Jun 21, 2023 at 8:44 PM Xiaoqiao He  wrote:
>
> > Hi,
> >
> > I met a fatal error when building from source on local Mac OSX. It
> > could reproduce stably.
> > Not sure if it is related to my local environment. Try to dig it, but not
> > any conclusion right now.
> > Will feedback once find reasons.
> >
> > Appendix system environment, some more stack information refer to the
> > attachment please.
> > OS: Bsd; uname: Darwin 21.2.0 Darwin Kernel Version 21.2.0: Sun Nov 28
> > 20:28:54 PST 2021; root:xnu-8019.61.5~1/RELEASE_X86_64 x86_64
> > rlimit: STACK 8192k, CORE 0k, NPROC 2784, NOFILE 10240, AS infinity
> > load average:13.99 12.30 8.96
> >
> > CPU:total 12 (initial active 12) (6 cores per cpu, 2 threads per core)
> > family 6 model 158 stepping 10, cmov, cx8, fxsr, mmx, sse, sse2, sse3,
> > ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, 3dnowpref,
> > lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx
> >
> > vm_info: Java HotSpot(TM) 64-Bit Server VM (25.202-b08) for bsd-amd64 JRE
> > (1.8.0_202-b08), built on Dec 15 2018 20:16:16 by "java_re" with gcc
> 4.2.1
> > (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)
> >
> > mvn version: Apache Maven 3.6.0
> >
> > Best Regards,
> > - He Xiaoqiao
> >
> >
> > On Wed, Jun 21, 2023 at 10:09 AM Tak Lon (Stephen) Wu  >
> > wrote:
> >
> >> +1 (non-binding), and thanks a lot for driving the vote.
> >>
> >> * Signature of sources and binaries: ok
> >> * Checksum of sources and binaries: ok
> >> * Rat check (1.8.0_362): okie
> >>  - mvn clean apache-rat:check
> >> * Built from source (1.8.0_362): ok
> >>  - mvn clean install -DskipTests
> >> * Run Pseudo-Distributed mode with HDFS and YARN (1.8.0_362): ok
> >> * Run Shell command (mkdir/put/ls/get) (1.8.0_362) : ok
> >> * Run MR examples applications and check the result (1.8.0_362): ok
> >>  - bin/hadoop jar
> >> share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar grep input
> >> output 'dfs[a-z.]+'
> >>
> >> -Stephen
> >>
> >> On Tue, Jun 20, 2023 at 6:56 PM Masatake Iwasaki <
> >> iwasak...@oss.nttdata.com>
> >> wrote:
> >>
> >> > +1
> >> >
> >> > + verified the signature and checksum of the source tarball.
> >> > + built from the source tarball on Rocky Linux 8 (x86_64) and OpenJDK
> 8
> >> > with native profile enabled.
> >> >+ launched pseudo distributed cluster including kms and httpfs with
> >> > Kerberos and SSL enabled.
> >> >+ created encryption zone, put and read files via httpfs.
> >> > + built RPM packages by Bigtop (modified to use ZooKeeper 3.6) on Rocky
> >> > Linux 8 (x86_64).
> >> >    + built HBase and Hive against Hadoop 3.3.6.
> >> >+ ran smoke-tests of hdfs, yarn, mapreduce, hbase and hive.
> >> > + skimmed the contents of site documentation.
> >> >
> >> > Thanks,
> >> > Masatake Iwasaki
> >> >
> >> > On 2023/06/21 8:07, Wei-Chiu Chuang wrote:
> >> > > Bumping this thread to the top.
> >> > > If you are verifying the release, please vote on this thread. RC0
> and
> >> RC1
> >> > > are exactly the same. The only material difference is the Changelog.
> >> > >
> >> > > Thanks!!
> >> > >
> >> > > On Sun, Jun 18, 2023 at 5:52 PM Wei-Chiu Chuang  >
> >> > wrote:
> >> > >
> >> > >> I am inviting anyone to try and vote on this release candidate.
> >> > >>
> >> > >> Note:
> >> > >> This is exactly the same as RC0, except the CHANGELOG.
> >> > >>
> >> > >> The RC is available at:
> >> > >> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/ (for
> amd64)
> >> > >> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-arm64/ (for
> arm64)
> >> > >>
> >> > >> Git tag: release-3.3.6-RC1
> >> > >> https://github.com/apache/hadoop/releases/tag/releas

Re: [VOTE] Release Apache Hadoop 3.3.6 RC1

2023-06-20 Thread Wei-Chiu Chuang
Bumping this thread to the top.
If you are verifying the release, please vote on this thread. RC0 and RC1
are exactly the same. The only material difference is the Changelog.

Thanks!!

On Sun, Jun 18, 2023 at 5:52 PM Wei-Chiu Chuang  wrote:

> I am inviting anyone to try and vote on this release candidate.
>
> Note:
> This is exactly the same as RC0, except the CHANGELOG.
>
> The RC is available at:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/ (for amd64)
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-arm64/ (for arm64)
>
> Git tag: release-3.3.6-RC1
> https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC1
>
> Maven artifacts were built on an x86 machine and are staged at
> https://repository.apache.org/content/repositories/orgapachehadoop-1380/
>
> My public key:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Changelog:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/CHANGELOG.md
>
> Release notes:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/RELEASENOTES.md
>
> This is a relatively small release (by Hadoop standard) containing about
> 120 commits.
> Please give it a try, this RC vote will run for 7 days.
>
>
> Feature highlights:
>
> SBOM artifacts
> 
> Starting from this release, Hadoop publishes Software Bill of Materials
> (SBOM) using
> CycloneDX Maven plugin. For more information about SBOM, please go to
> [SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
>
> HDFS RBF: RDBMS based token storage support
> 
> HDFS Router-Based Federation now supports storing delegation tokens
> on MySQL,
> [HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
> which improves token operation throughput over the original ZooKeeper-based
> implementation.
>
>
> New File System APIs
> 
> [HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved
> a number of
> HDFS-specific APIs to Hadoop Common to make it possible for certain
> applications that
> depend on HDFS semantics to run on other Hadoop compatible file systems.
>
> In particular, recoverLease() and isFileClosed() are exposed through the
> LeaseRecoverable interface, while setSafeMode() is exposed through the
> SafeMode interface.
>
>
>


[VOTE] Release Apache Hadoop 3.3.6 RC1

2023-06-18 Thread Wei-Chiu Chuang
I am inviting anyone to try and vote on this release candidate.

Note:
This is exactly the same as RC0, except the CHANGELOG.

The RC is available at:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/ (for amd64)
https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-arm64/ (for arm64)

Git tag: release-3.3.6-RC1
https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC1

Maven artifacts were built on an x86 machine and are staged at
https://repository.apache.org/content/repositories/orgapachehadoop-1380/

My public key:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

Changelog:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/CHANGELOG.md

Release notes:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC1-amd64/RELEASENOTES.md

This is a relatively small release (by Hadoop standard) containing about
120 commits.
Please give it a try, this RC vote will run for 7 days.


Feature highlights:

SBOM artifacts

Starting from this release, Hadoop publishes Software Bill of Materials
(SBOM) using
CycloneDX Maven plugin. For more information about SBOM, please go to
[SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).

HDFS RBF: RDBMS based token storage support

HDFS Router-Based Federation now supports storing delegation tokens
on MySQL,
[HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
which improves token operation throughput over the original ZooKeeper-based
implementation.


New File System APIs

[HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved a
number of
HDFS-specific APIs to Hadoop Common to make it possible for certain
applications that
depend on HDFS semantics to run on other Hadoop compatible file systems.

In particular, recoverLease() and isFileClosed() are exposed through the
LeaseRecoverable interface, while setSafeMode() is exposed through the
SafeMode interface.


Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-17 Thread Wei-Chiu Chuang
I was going to do another RC in case something comes up.
But it looks like the only thing that needs to be fixed is the Changelog.


HADOOP-18596 <https://issues.apache.org/jira/browse/HADOOP-18596> and
HADOOP-18633 <https://issues.apache.org/jira/browse/HADOOP-18633>
are related to cloud store semantics, and I don't want to make a judgement
call on them. As far as I can tell their effect can be addressed by supplying a
config option in the application code.
It looks like the feature improves fault tolerance by ensuring files are
synchronized if the modification time differs between the source and
destination. So to me it's the better behavior.

I can make an RC1 over the weekend to fix the Changelog, but that's probably
the only change it will have.
On Sat, Jun 17, 2023 at 2:00 AM Xiaoqiao He  wrote:

> Thanks Wei-Chiu for driving this release. The next RC will be prepared,
> right?
> If so, I would like to try and vote on the next RC.
> Just noticed that some JIRAs are not included, and some PRs need to be
> reverted to pass HBase verification, as mentioned above.
>
> Best Regards,
> - He Xiaoqiao
>
>
> On Fri, Jun 16, 2023 at 9:20 AM Wei-Chiu Chuang
>  wrote:
>
> > Overall so far so good.
> >
> > hadoop-api-shim:
> > built, tested successfully.
> >
> > cloudstore:
> > built successfully.
> >
> > Spark:
> > built successfully. Passed hadoop-cloud tests.
> >
> > Ozone:
> > One test failure due to unrelated Ozone issue. This test is being
> disabled
> > in the latest Ozone code.
> >
> > org.apache.hadoop.hdds.utils.NativeLibraryNotLoadedException: Unable
> > to load library ozone_rocksdb_tools from both java.library.path &
> > resource file libozone_rocksdb_tools.so from jar.
> > at
> >
> org.apache.hadoop.hdds.utils.db.managed.ManagedSSTDumpTool.<init>(ManagedSSTDumpTool.java:49)
> >
> >
> > Google gcs:
> > There are two test failures. The tests were added recently by
> HADOOP-18724
> > <https://issues.apache.org/jira/browse/HADOOP-18724> in Hadoop 3.3.6.
> This
> > is okay. Not production code problem. Can be addressed in GCS code.
> >
> > [ERROR] Errors:
> > [ERROR]
> >
> >
> TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testFloatingPointLength:403
> > » IllegalArgument Unknown mandatory key for gs://fake-in-memory-test-buck
> > et/contract-test/testFloatingPointLength "fs.option.openfile.length"
> > [ERROR]
> >
> >
> TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testOpenFileApplyAsyncRead:341
> > » IllegalArgument Unknown mandatory key for gs://fake-in-memory-test-bucket/contract-test/testOpenFileApplyAsyncRead "fs.option.openfile.length"
> >
> >
> >
> >
> >
> > On Wed, Jun 14, 2023 at 5:01 PM Wei-Chiu Chuang 
> > wrote:
> >
> > > The hbase-filesystem tests passed after reverting HADOOP-18596
> > > <https://issues.apache.org/jira/browse/HADOOP-18596> and HADOOP-18633
> > > <https://issues.apache.org/jira/browse/HADOOP-18633> from my local
> tree.
> > > So I think it's a matter of the default behavior being changed. It's
> not
> > > the end of the world. I think we can address it by adding an
> incompatible
> > > change flag and a release note.
> > >
> > > On Wed, Jun 14, 2023 at 3:55 PM Wei-Chiu Chuang 
> > > wrote:
> > >
> > >> Cross referenced git history and jira. Changelog needs some update
> > >>
> > >> Not in the release
> > >>
> > >>    1. HDFS-16858 <https://issues.apache.org/jira/browse/HDFS-16858>
> > >>    2. HADOOP-18532 <https://issues.apache.org/jira/browse/HADOOP-18532>
> > >>    3. HDFS-16861 <https://issues.apache.org/jira/browse/HDFS-16861>
> > >>    4. HDFS-16866 <https://issues.apache.org/jira/browse/HDFS-16866>
> > >>    5. HADOOP-18320 <https://issues.apache.org/jira/browse/HADOOP-18320>
> > >>
> > >> Updated the fix versions. Will generate a new changelog in the next RC.
> > >>
> > >> Was able to build HBase and hbase-filesystem without any code change.
> > >>
> > >> hbase has one unit test failure. This one is reproducible even with
> > >> Hadoop 3.3.5, so maybe a red 

Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-15 Thread Wei-Chiu Chuang
Overall so far so good.

hadoop-api-shim:
built, tested successfully.

cloudstore:
built successfully.

Spark:
built successfully. Passed hadoop-cloud tests.

Ozone:
One test failure due to unrelated Ozone issue. This test is being disabled
in the latest Ozone code.

org.apache.hadoop.hdds.utils.NativeLibraryNotLoadedException: Unable
to load library ozone_rocksdb_tools from both java.library.path &
resource file libozone_rocksdb_tools.so from jar.
at 
org.apache.hadoop.hdds.utils.db.managed.ManagedSSTDumpTool.(ManagedSSTDumpTool.java:49)


Google gcs:
There are two test failures. The tests were added recently by HADOOP-18724
<https://issues.apache.org/jira/browse/HADOOP-18724> in Hadoop 3.3.6. This
is okay; it's not a production code problem and can be addressed in GCS code.

[ERROR] Errors:
[ERROR]
TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testFloatingPointLength:403
» IllegalArgument Unknown mandatory key for gs://fake-in-memory-test-bucket/contract-test/testFloatingPointLength "fs.option.openfile.length"
[ERROR]
TestInMemoryGoogleContractOpen>AbstractContractOpenTest.testOpenFileApplyAsyncRead:341
» IllegalArgument Unknown mandatory key for gs://fake-in-memory-test-bucket/contract-test/testOpenFileApplyAsyncRead "fs.option.openfile.length"





On Wed, Jun 14, 2023 at 5:01 PM Wei-Chiu Chuang  wrote:

> The hbase-filesystem tests passed after reverting HADOOP-18596
> <https://issues.apache.org/jira/browse/HADOOP-18596> and HADOOP-18633
> <https://issues.apache.org/jira/browse/HADOOP-18633> from my local tree.
> So I think it's a matter of the default behavior being changed. It's not
> the end of the world. I think we can address it by adding an incompatible
> change flag and a release note.
>
> On Wed, Jun 14, 2023 at 3:55 PM Wei-Chiu Chuang 
> wrote:
>
>> Cross referenced git history and jira. Changelog needs some update
>>
>> Not in the release
>>
>>    1. HDFS-16858 <https://issues.apache.org/jira/browse/HDFS-16858>
>>    2. HADOOP-18532 <https://issues.apache.org/jira/browse/HADOOP-18532>
>>    3. HDFS-16861 <https://issues.apache.org/jira/browse/HDFS-16861>
>>    4. HDFS-16866 <https://issues.apache.org/jira/browse/HDFS-16866>
>>    5. HADOOP-18320 <https://issues.apache.org/jira/browse/HADOOP-18320>
>>
>> Updated the fix versions. Will generate a new changelog in the next RC.
>>
>> Was able to build HBase and hbase-filesystem without any code change.
>>
>> hbase has one unit test failure. This one is reproducible even with
>> Hadoop 3.3.5, so maybe a red herring. Local env or something.
>>
>> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
>> 9.007 s <<< FAILURE! - in
>> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker
>> [ERROR]
>> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness
>>  Time elapsed: 3.13 s  <<< ERROR!
>> java.lang.OutOfMemoryError: Java heap space
>> at
>> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker$RandomTestData.(TestSyncTimeRangeTracker.java:91)
>> at
>> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness(TestSyncTimeRangeTracker.java:156)
>>
>> hbase-filesystem has three test failures in TestHBOSSContractDistCp, which
>> are not reproducible with Hadoop 3.3.5.
>> [ERROR] Failures: [ERROR]
>> TestHBOSSContractDistCp>AbstractContractDistCpTest.testDistCpUpdateCheckFileSkip:976->Assert.fail:88
>> 10 errors in file of length 10
>> [ERROR]
>> TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureNoChange:270->AbstractContractDistCpTest.assertCounterInRange:290->Assert.assertTrue:41->Assert.fail:88
>> Files Skipped value 0 too below minimum 1
>> [ERROR]
>> TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureToRemote:259->AbstractContractDistCpTest.distCpUpdateDeepDirectoryStructure:334->AbstractContractDistCpTest.assertCounterInRange:294->Assert.assertTrue:41->Assert.fail:88
>> Files Copied value 2 above maximum 1
>> [INFO]
>> [ERROR] Tests run: 240, Failures: 3, Errors: 0, Skipped: 58
>>
>>
>> Ozone
>> test in progress. Will report back.
>>
>>
>> On Tue, Jun 13, 2023 at 11:27 PM Wei-Chiu Chuang 
>> wrote:
>>
>>> I am inviting anyone to try and vote on this release candidate.
>>>
>>> Note:
>>

Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-15 Thread Wei-Chiu Chuang
It's branching off branch-3.3

On Thu, Jun 15, 2023 at 3:18 AM Steve Loughran 
wrote:

> Which branch is -3.3.6 off? 3.3.5 or 3.3?
>
> I'm travelling for the next few days and unlikely to be able to test this;
> will do my best
>
> On Wed, 14 Jun 2023 at 07:27, Wei-Chiu Chuang  wrote:
>
> > I am inviting anyone to try and vote on this release candidate.
> >
> > Note:
> > This is built off branch-3.3.6 plus PR#5741 (aws sdk update) and PR#5740
> > (LICENSE file update)
> >
> > The RC is available at:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/ (for amd64)
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-arm64/ (for arm64)
> >
> > Git tag: release-3.3.6-RC0
> > https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC0
> >
> > Maven artifacts are built on an x86 machine and staged at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1378/
> >
> > My public key:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > Changelog:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/CHANGELOG.md
> >
> > Release notes:
> > https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/RELEASENOTES.md
> >
> > This is a relatively small release (by Hadoop standard) containing about
> > 120 commits.
> > Please give it a try, this RC vote will run for 7 days.
> >
> >
> > Feature highlights:
> >
> > SBOM artifacts
> > 
> > Starting from this release, Hadoop publishes Software Bill of Materials
> > (SBOM) using
> > CycloneDX Maven plugin. For more information about SBOM, please go to
> > [SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
> >
> > HDFS RBF: RDBMS based token storage support
> > 
> > HDFS Router-Based Federation now supports storing delegation
> tokens
> > on MySQL,
> > [HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
> > which improves token operation throughput over the original ZooKeeper-based
> > implementation.
> >
> >
> > New File System APIs
> > 
> > [HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671)
> moved a
> > number of
> > HDFS-specific APIs to Hadoop Common to make it possible for certain
> > applications that
> > depend on HDFS semantics to run on other Hadoop compatible file systems.
> >
> > In particular, recoverLease() and isFileClosed() are exposed through the
> > LeaseRecoverable interface, while setSafeMode() is exposed through the
> > SafeMode interface.
> >
>


Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-14 Thread Wei-Chiu Chuang
The hbase-filesystem tests passed after reverting HADOOP-18596
<https://issues.apache.org/jira/browse/HADOOP-18596> and HADOOP-18633
<https://issues.apache.org/jira/browse/HADOOP-18633> from my local tree.
So I think it's a matter of the default behavior being changed. It's not
the end of the world. I think we can address it by adding an incompatible
change flag and a release note.

On Wed, Jun 14, 2023 at 3:55 PM Wei-Chiu Chuang  wrote:

> Cross referenced git history and jira. Changelog needs some update
>
> Not in the release
>
>    1. HDFS-16858 <https://issues.apache.org/jira/browse/HDFS-16858>
>    2. HADOOP-18532 <https://issues.apache.org/jira/browse/HADOOP-18532>
>    3. HDFS-16861 <https://issues.apache.org/jira/browse/HDFS-16861>
>    4. HDFS-16866 <https://issues.apache.org/jira/browse/HDFS-16866>
>    5. HADOOP-18320 <https://issues.apache.org/jira/browse/HADOOP-18320>
>
> Updated the fix versions. Will generate a new changelog in the next RC.
>
> Was able to build HBase and hbase-filesystem without any code change.
>
> hbase has one unit test failure. This one is reproducible even with Hadoop
> 3.3.5, so maybe a red herring. Local env or something.
>
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 9.007 s <<< FAILURE! - in
> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker
> [ERROR]
> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness
>  Time elapsed: 3.13 s  <<< ERROR!
> java.lang.OutOfMemoryError: Java heap space
> at
> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker$RandomTestData.(TestSyncTimeRangeTracker.java:91)
> at
> org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness(TestSyncTimeRangeTracker.java:156)
>
> hbase-filesystem has three test failures in TestHBOSSContractDistCp, which
> are not reproducible with Hadoop 3.3.5.
> [ERROR] Failures: [ERROR]
> TestHBOSSContractDistCp>AbstractContractDistCpTest.testDistCpUpdateCheckFileSkip:976->Assert.fail:88
> 10 errors in file of length 10
> [ERROR]
> TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureNoChange:270->AbstractContractDistCpTest.assertCounterInRange:290->Assert.assertTrue:41->Assert.fail:88
> Files Skipped value 0 too below minimum 1
> [ERROR]
> TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureToRemote:259->AbstractContractDistCpTest.distCpUpdateDeepDirectoryStructure:334->AbstractContractDistCpTest.assertCounterInRange:294->Assert.assertTrue:41->Assert.fail:88
> Files Copied value 2 above maximum 1
> [INFO]
> [ERROR] Tests run: 240, Failures: 3, Errors: 0, Skipped: 58
>
>
> Ozone
> test in progress. Will report back.
>
>
> On Tue, Jun 13, 2023 at 11:27 PM Wei-Chiu Chuang 
> wrote:
>
>> I am inviting anyone to try and vote on this release candidate.
>>
>> Note:
>> This is built off branch-3.3.6 plus PR#5741 (aws sdk update) and PR#5740
>> (LICENSE file update)
>>
>> The RC is available at:
>> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/ (for amd64)
>> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-arm64/ (for arm64)
>>
>> Git tag: release-3.3.6-RC0
>> https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC0
>>
>> Maven artifacts are built on an x86 machine and staged at
>> https://repository.apache.org/content/repositories/orgapachehadoop-1378/
>>
>> My public key:
>> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>>
>> Changelog:
>> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/CHANGELOG.md
>>
>> Release notes:
>> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/RELEASENOTES.md
>>
>> This is a relatively small release (by Hadoop standard) containing about
>> 120 commits.
>> Please give it a try, this RC vote will run for 7 days.
>>
>>
>> Feature highlights:
>>
>> SBOM artifacts
>> 
>> Starting from this release, Hadoop publishes Software Bill of Materials
>> (SBOM) using
>> CycloneDX Maven plugin. For more information about SBOM, please go to
>> [SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
>>
>> HDFS RBF: RDBMS based token storage support
>> 
>> HDFS Router-Router Based Federation now supports storing delegation
>> tokens

Re: [VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-14 Thread Wei-Chiu Chuang
Cross-referenced git history and Jira. The changelog needs some updates.

Not in the release

   1. HDFS-16858 <https://issues.apache.org/jira/browse/HDFS-16858>
   2. HADOOP-18532 <https://issues.apache.org/jira/browse/HADOOP-18532>
   3. HDFS-16861 <https://issues.apache.org/jira/browse/HDFS-16861>
   4. HDFS-16866 <https://issues.apache.org/jira/browse/HDFS-16866>
   5. HADOOP-18320 <https://issues.apache.org/jira/browse/HADOOP-18320>

Updated the fix versions. Will generate a new changelog in the next RC.

Was able to build HBase and hbase-filesystem without any code change.

hbase has one unit test failure. This one is reproducible even with Hadoop
3.3.5, so maybe a red herring. Local env or something.

[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
9.007 s <<< FAILURE! - in
org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker
[ERROR]
org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness
 Time elapsed: 3.13 s  <<< ERROR!
java.lang.OutOfMemoryError: Java heap space
at
org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker$RandomTestData.(TestSyncTimeRangeTracker.java:91)
at
org.apache.hadoop.hbase.regionserver.TestSyncTimeRangeTracker.testConcurrentIncludeTimestampCorrectness(TestSyncTimeRangeTracker.java:156)

hbase-filesystem has three test failures in TestHBOSSContractDistCp, which
are not reproducible with Hadoop 3.3.5.
[ERROR] Failures: [ERROR]
TestHBOSSContractDistCp>AbstractContractDistCpTest.testDistCpUpdateCheckFileSkip:976->Assert.fail:88
10 errors in file of length 10
[ERROR]
TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureNoChange:270->AbstractContractDistCpTest.assertCounterInRange:290->Assert.assertTrue:41->Assert.fail:88
Files Skipped value 0 too below minimum 1
[ERROR]
TestHBOSSContractDistCp>AbstractContractDistCpTest.testUpdateDeepDirectoryStructureToRemote:259->AbstractContractDistCpTest.distCpUpdateDeepDirectoryStructure:334->AbstractContractDistCpTest.assertCounterInRange:294->Assert.assertTrue:41->Assert.fail:88
Files Copied value 2 above maximum 1
[INFO]
[ERROR] Tests run: 240, Failures: 3, Errors: 0, Skipped: 58


Ozone
test in progress. Will report back.


On Tue, Jun 13, 2023 at 11:27 PM Wei-Chiu Chuang  wrote:

> I am inviting anyone to try and vote on this release candidate.
>
> Note:
> This is built off branch-3.3.6 plus PR#5741 (aws sdk update) and PR#5740
> (LICENSE file update)
>
> The RC is available at:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/ (for amd64)
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-arm64/ (for arm64)
>
> Git tag: release-3.3.6-RC0
> https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC0
>
> Maven artifacts are built on an x86 machine and staged at
> https://repository.apache.org/content/repositories/orgapachehadoop-1378/
>
> My public key:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Changelog:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/CHANGELOG.md
>
> Release notes:
> https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/RELEASENOTES.md
>
> This is a relatively small release (by Hadoop standard) containing about
> 120 commits.
> Please give it a try, this RC vote will run for 7 days.
>
>
> Feature highlights:
>
> SBOM artifacts
> 
> Starting from this release, Hadoop publishes Software Bill of Materials
> (SBOM) using
> CycloneDX Maven plugin. For more information about SBOM, please go to
> [SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
>
> HDFS RBF: RDBMS based token storage support
> 
> HDFS Router-Based Federation now supports storing delegation tokens
> on MySQL,
> [HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
> which improves token operation throughput over the original ZooKeeper-based
> implementation.
>
>
> New File System APIs
> 
> [HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved
> a number of
> HDFS-specific APIs to Hadoop Common to make it possible for certain
> applications that
> depend on HDFS semantics to run on other Hadoop compatible file systems.
>
> In particular, recoverLease() and isFileClosed() are exposed through the
> LeaseRecoverable interface, while setSafeMode() is exposed through the
> SafeMode interface.
>
>
>


[VOTE] Release Apache Hadoop 3.3.6 RC0

2023-06-14 Thread Wei-Chiu Chuang
I am inviting anyone to try and vote on this release candidate.

Note:
This is built off branch-3.3.6 plus PR#5741 (aws sdk update) and PR#5740
(LICENSE file update)

The RC is available at:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/ (for amd64)
https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-arm64/ (for arm64)

Git tag: release-3.3.6-RC0
https://github.com/apache/hadoop/releases/tag/release-3.3.6-RC0

Maven artifacts are built on an x86 machine and staged at
https://repository.apache.org/content/repositories/orgapachehadoop-1378/

My public key:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

Changelog:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/CHANGELOG.md

Release notes:
https://home.apache.org/~weichiu/hadoop-3.3.6-RC0-amd64/RELEASENOTES.md

This is a relatively small release (by Hadoop standards) containing about
120 commits.
Please give it a try; this RC vote will run for 7 days.


Feature highlights:

SBOM artifacts

Starting from this release, Hadoop publishes Software Bill of Materials
(SBOM) using
the CycloneDX Maven plugin. For more information about SBOM, please go to
[SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).
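For readers who want to generate a CycloneDX SBOM for their own builds, the plugin declaration looks roughly like the sketch below. This is an illustrative pom.xml fragment, not the exact configuration the Hadoop build uses; the version number in particular is an assumption.

```xml
<!-- Hypothetical pom.xml fragment: binds the CycloneDX plugin so that
     `mvn package` emits an SBOM (target/bom.json and target/bom.xml).
     The version shown is illustrative only. -->
<plugin>
  <groupId>org.cyclonedx</groupId>
  <artifactId>cyclonedx-maven-plugin</artifactId>
  <version>2.7.9</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>makeBom</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```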

HDFS RBF: RDBMS based token storage support

HDFS Router-Based Federation now supports storing delegation tokens
on MySQL,
[HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
which improves token operation throughput over the original ZooKeeper-based
implementation.


New File System APIs

[HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved a
number of
HDFS-specific APIs to Hadoop Common to make it possible for certain
applications that
depend on HDFS semantics to run on other Hadoop compatible file systems.

In particular, recoverLease() and isFileClosed() are exposed through the
LeaseRecoverable interface, while setSafeMode() is exposed through the
SafeMode interface.
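To illustrate the capability-probing pattern this enables, here is a small self-contained sketch. The interfaces below are simplified local stand-ins for the real org.apache.hadoop.fs.LeaseRecoverable and org.apache.hadoop.fs.SafeMode (which take Path and SafeModeAction arguments); everything in this snippet is an assumption for demonstration, not the Hadoop API itself.

```java
// Sketch of capability probing via marker interfaces, in the spirit of
// HADOOP-18671. Applications check `instanceof` on the FileSystem object
// instead of casting to an HDFS-specific class.
interface LeaseRecoverable {            // stand-in for o.a.h.fs.LeaseRecoverable
    boolean recoverLease(String path);
    boolean isFileClosed(String path);
}

interface SafeMode {                    // stand-in for o.a.h.fs.SafeMode
    boolean setSafeMode(String action);
}

public class CapabilityProbe {
    // A toy file system that advertises both capabilities.
    static class DemoFs implements LeaseRecoverable, SafeMode {
        public boolean recoverLease(String path) { return true; }
        public boolean isFileClosed(String path) { return true; }
        public boolean setSafeMode(String action) { return "GET".equals(action); }
    }

    // The check an application would do before calling the HDFS-flavored APIs.
    static String describe(Object fs) {
        StringBuilder sb = new StringBuilder();
        if (fs instanceof LeaseRecoverable) sb.append("lease-recoverable ");
        if (fs instanceof SafeMode) sb.append("safemode-capable");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // prints: lease-recoverable safemode-capable
        System.out.println(describe(new DemoFs()));
    }
}
```

The same `instanceof` checks work unchanged against any Hadoop-compatible file system that opts into these interfaces, which is the point of moving them out of HDFS.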


Re: [DISCUSS] Hadoop 3.3.6 release planning

2023-06-14 Thread Wei-Chiu Chuang
Thanks a lot for the help to move this release forward.

Some status update:

branch-3.3.6 just forked out of branch-3.3.
Starting a RC0 now.

I would like to add text highlighting big features and improvements.
Let me know if you have something included in this release that you would
also like to highlight. @Steve Loughran  I am aware the
S3A prefetch code is in this release, but I am not sure if it is in a state
where we can make it public. I'll let you decide.

SBOM artifacts

Starting from this release, Hadoop publishes Software Bill of Materials
(SBOM) using
the CycloneDX Maven plugin. For more information about SBOM, please go to
[SBOM](https://cwiki.apache.org/confluence/display/COMDEV/SBOM).

HDFS RBF: RDBMS based token storage support

HDFS Router-Based Federation now supports storing delegation tokens
on MySQL,
[HADOOP-18535](https://issues.apache.org/jira/browse/HADOOP-18535)
which improves token operation throughput over the original ZooKeeper-based
implementation.


New File System APIs

[HADOOP-18671](https://issues.apache.org/jira/browse/HADOOP-18671) moved a
number of
HDFS-specific APIs to Hadoop Common to make it possible for certain
applications that
depend on HDFS semantics to run on other Hadoop compatible file systems.

In particular, recoverLease() and isFileClosed() are exposed through the
LeaseRecoverable interface, while setSafeMode() is exposed through the
SafeMode interface.



On Thu, Jun 8, 2023 at 11:11 AM Wei-Chiu Chuang  wrote:

> Thanks for comments
>
> Looking at jiras fixed in 3.3.6 and 3.3.9 (my bad, forgot that most
> commits landing in branch-3.3 was 3.3.9), most are okay. We have about 119
> commits so it's manageable.
>
> I am planning to cut 3.3.6 out of branch-3.3 later today. Anything open
> that is still targeting 3.3.6 will be cherry picked one by one.
> I will also bulk-update jiras fixed in 3.3.9 to 3.3.6.
>
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12337047
> As of now, I am tracking 4 jiras that are targeting 3.3.6 -- update Kerby,
> fix hadoop shaded client to support Spark history server, a small
> regression in HDFS (probably will move this out), protobuf 2.5 dependency
> change.
>
> I had a dry-run of the RC. Env set up for me and everything works. So I
> expect to have a RC ready to vote on soon. If I can move out some of the
> jiras or help them resolved, I can probably have a RC0 on Monday for vote.
>
> On Mon, May 8, 2023 at 2:01 PM Ayush Saxena  wrote:
>
>> That openssl change ain't a blocker now from my side, that ABFS-Jdk-17
>> stuff got sorted out, Steve knew a way out
>>
>> On Sat, 6 May 2023 at 00:51, Ayush Saxena  wrote:
>> >
>> > Thanx Wei-Chiu for the initiative, Good to have quick releases :)
>> >
>> > With my Hive developer hat on, I would like to bring some stuff up for
>> > consideration(feel free to say no, if it is beyond scope or feels even
>> > a bit unsafe, don't want to mess up the release)
>> >
>> > * HADOOP-18662: ListFiles with recursive fails with FNF : This broke
>> > compaction in Hive, bothers only with HDFS though. There is a
>> > workaround to that, if it doesn't feel safe. no issues, or if some
>> > improvements suggested. I can quickly do that :)
>> >
>> > * HADOOP-17649: Update wildfly openssl to 2.1.3.Final. Maybe not 2.1.3
>> > but if it works and is safe then to 2.2.5. I got flagged today that
>> > this openssl creates a bit of mess with JDK-17 for Hive with ABFS I
>> > think(need to dig in more),
>> >
>> > Now for the dependency upgrades:
>> >
>> > A big NO to Jackson, that ain't safe and the wounds are still fresh,
>> > it screwed the 3.3.3 release for many projects. So, let's not get into
>> > that. Infact anything that touches those shaded jars is risky, some
>> > package-json exclusion also created a mess recently. So, Lets not
>> > touch only and that too when we have less time.
>> >
>> > Avoid anything around Jetty upgrade, I have selfish reasons for that.
>> > Jetty messes something up with Hbase and Hive has a dependency on
>> > Hbase, and it is crazy, in case interested [1]. So, any upgrade to
>> > Jetty will block hive from upgrading Hadoop as of today. But that is a
>> > selfish reason and just up for consideration. Go ahead if necessary. I
>> > just wanted to let folks know
>> >
>> >
>> > Apart from the Jackson stuff, everything is suggestive in nature, your
>> > call feel free to ignore.
>> >
>> > @Xiaoqiao He , maybe pulling in all those 1

Fwd: [jira] [Created] (HADOOP-18768) Integrating Apache Hadoop into OSS-Fuzz

2023-06-12 Thread Wei-Chiu Chuang
Are there Hadoop committers who would like to help triage bug reports from
OSS-Fuzz?

-- Forwarded message -
From: Henry Lin (Jira) 
Date: Mon, Jun 12, 2023 at 9:10 AM
Subject: [jira] [Created] (HADOOP-18768) Integrating Apache Hadoop into
OSS-Fuzz
To: 


Henry Lin created HADOOP-18768:
--

 Summary: Integrating Apache Hadoop into OSS-Fuzz
 Key: HADOOP-18768
 URL: https://issues.apache.org/jira/browse/HADOOP-18768
 Project: Hadoop Common
  Issue Type: Test
Reporter: Henry Lin


Hi all,

We have prepared the [initial integration|
https://github.com/google/oss-fuzz/pull/10511] of Apache Hadoop into
[Google OSS-Fuzz|https://github.com/google/oss-fuzz] which will provide
more security for your project.



*Why do you need Fuzzing?*
The Code Intelligence JVM fuzzer [Jazzer|
https://github.com/CodeIntelligenceTesting/jazzer] has already found
[hundreds of bugs|
https://github.com/CodeIntelligenceTesting/jazzer/blob/main/docs/findings.md]
in open source projects including for example [OpenJDK|
https://nvd.nist.gov/vuln/detail/CVE-2022-21360], [Protobuf|
https://nvd.nist.gov/vuln/detail/CVE-2021-22569] or [jsoup|
https://github.com/jhy/jsoup/security/advisories/GHSA-m72m-mhq2-9p6c].
Fuzzing has proved to be very effective, with no false positives. It provides
a crashing input, which helps you reproduce and debug any finding easily.
The integration of your project into the OSS-Fuzz platform will enable
continuous fuzzing of your project by [Jazzer|
https://github.com/CodeIntelligenceTesting/jazzer].



*What do you need to do?*
The integration requires the maintainer or one established project
committer to deal with the bug reports.

You need to create or provide one email address that is associated with a
google account as per [here|
https://google.github.io/oss-fuzz/getting-started/accepting-new-projects/].
When a bug is found, you will receive an email that will provide you with
access to ClusterFuzz, crash reports, code coverage reports and fuzzer
statistics. More than 1 person can be included.



*How can Code Intelligence support you?*
We will continue to add more fuzz targets to improve code coverage over
time. Furthermore, we are permanently enhancing fuzzing technologies by
developing new fuzzers and bug detectors.



Please let me know if you have any questions regarding fuzzing or the
OSS-Fuzz integration.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


Re: [DISCUSS] Mysql-connector-java is GPL licensed; Hadoop can't use it.

2023-06-09 Thread Wei-Chiu Chuang
Thanks Owen,

Yes, I am aware of the optional category X. I was under the impression that
the "optionality" needs to be more explicit, for example by adding an
<optional>true</optional> tag.
https://medium.com/@danismaz.furkan/difference-between-optional-true-optional-and-scope-provided-scope-7404ec24fb59

In this case, RBF is used by a small set of Hadoop users; moreover, an even
smaller subset of them will choose to use the mysql backend.

(And there need to be instructions for how to acquire the mysql connector
jar since it's not shipped in the convenience binary)
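For reference, the kind of "explicitly optional" declaration being discussed looks roughly like the pom.xml fragment below. This is a sketch under assumptions: the coordinates match the mysql-connector-java artifact, but the version and the exact declaration in HADOOP-18535 are illustrative.

```xml
<!-- Sketch of an optional, provided-scope category-X dependency.
     With <optional>true</optional> plus provided scope, the jar is
     available at compile time but is neither bundled in the release
     nor pulled in transitively; users who want the MySQL token store
     must supply the connector jar themselves. Version is illustrative. -->
<dependency>
  <groupId>mysql</groupId>
  <artifactId>mysql-connector-java</artifactId>
  <version>8.0.33</version>
  <scope>provided</scope>
  <optional>true</optional>
</dependency>
```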

On Thu, Jun 8, 2023 at 10:55 PM Owen O'Malley 
wrote:

> We are allowed to use category X software in optional components.
> Furthermore, the dependency is marked as provided, so it won't be pulled
> into a transitive closure.
>
> You are right that I should have included a comment about that on the
> original jira.
>
> .. Owen
>
>
> On Fri, Jun 9, 2023 at 1:58 AM Wei-Chiu Chuang
> 
> wrote:
>
> > Hi community,
> >
> > While preparing for 3.3.6 RC, I realized the mysql-connector-java
> > dependency added by HADOOP-18535
> > <https://issues.apache.org/jira/browse/HADOOP-18535> is GPL licensed.
> >
> >
> > Source:
> > https://github.com/mysql/mysql-connector-j/blob/release/8.0/LICENSE
> > See legal discussion at Apache LEGAL-423
> > <https://issues.apache.org/jira/browse/LEGAL-423>.
> >
> > I looked at the original jira and github PR and I don't think the license
> > issue was noticed.
> >
> > Is it possible to get rid of the mysql connector dependency? As far as I
> > can tell the dependency is very limited.
> >
> > If not, I guess I'll have to revert the commits for now.
> >
>


[DISCUSS] Mysql-connector-java is GPL licensed; Hadoop can't use it.

2023-06-08 Thread Wei-Chiu Chuang
Hi community,

While preparing for 3.3.6 RC, I realized the mysql-connector-java
dependency added by HADOOP-18535
 is GPL licensed.


Source: https://github.com/mysql/mysql-connector-j/blob/release/8.0/LICENSE
See legal discussion at Apache LEGAL-423
.

I looked at the original jira and github PR and I don't think the license
issue was noticed.

Is it possible to get rid of the mysql connector dependency? As far as I
can tell the dependency is very limited.

If not, I guess I'll have to revert the commits for now.


Re: [DISCUSS] Hadoop 3.3.6 release planning

2023-06-08 Thread Wei-Chiu Chuang
> > > >
> > > > If we should consider both 3.3.6 and 3.3.9 (which is from
> release-3.3.5
> > > > discuss)[1] for this release line?
> > > > I try to query with `project in (HDFS, YARN, HADOOP, MAPREDUCE) AND
> > > > fixVersion in (3.3.6, 3.3.9)`[2],
> > > > there are more than hundred jiras now.
> > > >
> > > > Best Regards,
> > > > - He Xiaoqiao
> > > >
> > > > [1] https://lists.apache.org/thread/kln96frt2tcg93x6ht99yck9m7r9qwxp
> > > > [2]
> > > >
> > > >
> https://issues.apache.org/jira/browse/YARN-11482?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20fixVersion%20in%20(3.3.6%2C%203.3.9)
> > > >
> > > >
> > > > On Fri, May 5, 2023 at 1:19 AM Wei-Chiu Chuang 
> wrote:
> > > >
> > > > > Hi community,
> > > > >
> > > > > I'd like to kick off the discussion around Hadoop 3.3.6 release
> plan.
> > > > >
> > > > > I'm being selfish but my intent for 3.3.6 is to have the new APIs
> in
> > > > > HADOOP-18671 <https://issues.apache.org/jira/browse/HADOOP-18671>
> added
> > > > so
> > > > > we can have HBase to adopt this new API. Other than that, perhaps
> > > > > thirdparty dependency updates.
> > > > >
> > > > > If you have open items to be added in the coming weeks, please add
> 3.3.6
> > > > to
> > > > > the target release version. Right now I am only seeing three open
> jiras
> > > > > targeting 3.3.6.
> > > > >
> > > > > I imagine this is going to be a small release as 3.3.5 (hat tip to
> Steve)
> > > > > was only made two months back, and so far only 8 jiras were
> resolved in
> > > > the
> > > > > branch-3.3 line.
> > > > >
> > > > > Best,
> > > > > Weichiu
> > > > >
> > > >
>


[jira] [Created] (HDFS-17024) Potential data race introduced by HDFS-15865

2023-05-23 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDFS-17024:
--

 Summary: Potential data race introduced by HDFS-15865
 Key: HDFS-17024
 URL: https://issues.apache.org/jira/browse/HDFS-17024
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Wei-Chiu Chuang


After HDFS-15865, we found a client aborted due to an NPE.
{noformat}
2023-04-10 16:07:43,409 ERROR 
org.apache.hadoop.hbase.regionserver.HRegionServer: * ABORTING region 
server kqhdp36,16020,1678077077562: Replay of WAL required. Forcing server 
shutdown *
org.apache.hadoop.hbase.DroppedSnapshotException: region: WAFER_ALL,16|CM 
RIE.MA1|CP1114561.18|PROC|,1625899466315.0fbdf0f1810efa9e68af831247e6555f.
at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2870)
at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2539)
at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2511)
at 
org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2401)
at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:613)
at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:582)
at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$1000(MemStoreFlusher.java:69)
at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:362)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:880)
at 
org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:781)
at 
org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:898)
at 
org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:850)
at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:76)
at 
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:105)
at 
org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishClose(HFileWriterImpl.java:859)
at 
org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.close(HFileWriterImpl.java:687)
at 
org.apache.hadoop.hbase.regionserver.StoreFileWriter.close(StoreFileWriter.java:393)
at 
org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
at 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:78)
at 
org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:1047)
at 
org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2349)
at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2806)
{noformat}

This is only possible if a data race happened. Filing this jira to 
investigate and eliminate the data race.
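A minimal, hypothetical reduction of the check-then-act race that can surface as an NPE like the one in the stack trace above. The field and method names below are invented for illustration; this is not the actual DataStreamer code:

```java
public class Main {
    static class Streamer {
        // Shared between the application thread and the streamer thread.
        private volatile Exception lastException;

        void setException(Exception e) { lastException = e; }
        void clearException() { lastException = null; }

        // Racy: another thread can null out lastException between the check
        // and the throw; `throw null` then surfaces as a NullPointerException.
        void checkClosedRacy() throws Exception {
            if (lastException != null) {
                throw lastException;
            }
        }

        // Safe: read the shared field once into a local, then act on the local.
        void checkClosedSafe() throws Exception {
            Exception e = lastException;
            if (e != null) {
                throw e;
            }
        }
    }

    public static void main(String[] args) {
        Streamer s = new Streamer();
        s.setException(new java.io.IOException("pipeline failed"));
        try {
            s.checkClosedSafe();
        } catch (Exception e) {
            System.out.println(e.getMessage()); // pipeline failed
        }
    }
}
```

The single read into a local is the standard fix for this pattern; the racy variant only fails under a specific interleaving, which is why such bugs surface rarely and look like impossible NPEs.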



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: Call for Presentations, Community Over Code 2023

2023-05-09 Thread Wei-Chiu Chuang
There's also a call for presentation for Community over Code Asia 2023

https://www.bagevent.com/event/cocasia-2023-EN
Happening Aug 18-20. CfP due by 6/6


On Tue, May 9, 2023 at 8:39 PM Ayush Saxena  wrote:

> Forwarding from dev@hadoop to the dev ML which we use.
>
> The actual mail lies here:
> https://www.mail-archive.com/dev@hadoop.apache.org/msg00160.html
>
> -Ayush
>
> On 2023/05/09 21:24:09 Rich Bowen wrote:
> > (Note: You are receiving this because you are subscribed to the dev@
> > list for one or more Apache Software Foundation projects.)
> >
> > The Call for Presentations (CFP) for Community Over Code (formerly
> > Apachecon) 2023 is open at
> > https://communityovercode.org/call-for-presentations/, and will close
> > Thu, 13 Jul 2023 23:59:59 GMT.
> >
> > The event will be held in Halifax, Canada, October 7-10, 2023.
> >
> > We welcome submissions on any topic related to the Apache Software
> > Foundation, Apache projects, or the communities around those projects.
> > We are specifically looking for presentations in the following
> > categories:
> >
> > Fintech
> > Search
> > Big Data, Storage
> > Big Data, Compute
> > Internet of Things
> > Groovy
> > Incubator
> > Community
> > Data Engineering
> > Performance Engineering
> > Geospatial
> > API/Microservices
> > Frameworks
> > Content Wrangling
> > Tomcat and httpd
> > Cloud and Runtime
> > Streaming
> > Sustainability
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: dev-h...@hadoop.apache.org
> >
> >
>
>
> Sent from my iPhone
>
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>
>


[DISCUSS] Hadoop 3.3.6 release planning

2023-05-04 Thread Wei-Chiu Chuang
Hi community,

I'd like to kick off the discussion around Hadoop 3.3.6 release plan.

I'm being selfish but my intent for 3.3.6 is to have the new APIs in
HADOOP-18671  added so
we can have HBase adopt this new API. Other than that, perhaps
thirdparty dependency updates.

If you have open items to be added in the coming weeks, please add 3.3.6 to
the target release version. Right now I am only seeing three open jiras
targeting 3.3.6.

I imagine this is going to be a small release as 3.3.5 (hat tip to Steve)
was only made two months back, and so far only 8 jiras were resolved in the
branch-3.3 line.

Best,
Weichiu


Re: [DISCUSS] hadoop branch-3.3+ going to java11 only

2023-03-28 Thread Wei-Chiu Chuang
My random thoughts. Probably bad takes:

There are projects experimenting with JDK17 now.
JDK11 active support will end in 6 months. If it's already hard to migrate
from JDK8, why not retarget JDK17?

On Tue, Mar 28, 2023 at 10:30 AM Ayush Saxena  wrote:

> I know Jersey upgrade as a blocker. Some folks were chasing that last year
> during 3.3.4 time, I don’t know where it is now, didn’t see then what’s the
> problem there but I remember there was some initial PR which did it for
> HDFS at least, so I never looked beyond that…
>
> I too had jdk-11 in my mind, but only for trunk. 3.4.x can stay as java-11
> only branch may be, but that is something later to decide, once we get the
> code sorted…
>
> -Ayush
>
> > On 28-Mar-2023, at 9:16 PM, Steve Loughran 
> wrote:
> >
> > well, how about we flip the switch and get on with it.
> >
> > slf4j seems happy on java11,
> >
> > side issue, anyone seen test failures on zulu1.8; somehow my test run is
> > failing and i'm trying to work out whether its a mismatch in command
> > line/ide jvm versions, or the 3.3.5 JARs have been built with an openjdk
> > version which requires IntBuffer implements an overridden method
> IntBuffer
> > rewind().
> >
> > java.lang.NoSuchMethodError:
> java.nio.IntBuffer.rewind()Ljava/nio/IntBuffer;
> >
> > at
> org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:341)
> > at
> >
> org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:308)
> > at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:257)
> > at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:202)
> > at java.io.DataInputStream.read(DataInputStream.java:149)
> >
> >> On Tue, 28 Mar 2023 at 15:52, Viraj Jasani  wrote:
> >> IIRC some of the ongoing major dependency upgrades (log4j 1 to 2,
> jersey 1
> >> to 2 and junit 4 to 5) are blockers for java 11 compile + test
> stability.
> >> On Tue, Mar 28, 2023 at 4:55 AM Steve Loughran
>  >> wrote:
> >>> Now that hadoop 3.3.5 is out, i want to propose something new
> >>> we switch branch-3.3 and trunk to being java11 only
> >>> 1. java 11 has been out for years
> >>> 2. oracle java 8 is no longer available under "premier support"; you
> >>> can't really get upgrades
> >>> https://www.oracle.com/java/technologies/java-se-support-roadmap.html
> >>> 3. openJDK 8 releases != oracle ones, and things you compile with them
> >>> don't always link to oracle java 8 (some classes in java.nio have
> >> added
> >>> more overrides)
> >>> 4. more and more libraries we want to upgrade to/bundle are java 11
> >> only
> >>> 5. moving to java 11 would cut our yetus build workload in half, and
> >>> line up for adding java 17 builds instead.
> >>> I know there are some outstanding issues still in
> >>> https://issues.apache.org/jira/browse/HADOOP-16795 -but are they
> >> blockers?
> >>> Could we just move to java11 and enhance at our leisure, once java8 is
> no
> >>> longer a concern.
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>
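For context on the NoSuchMethodError quoted above: JDK 9 added covariant `rewind()` overrides to the `java.nio` buffer subclasses, so code compiled against a newer JDK (without `--release 8`) can reference `IntBuffer.rewind()Ljava/nio/IntBuffer;`, which does not exist on Java 8. A sketch of the usual source-level workaround — this illustrates general JDK behaviour, not a claim about how the 3.3.5 JARs were actually built:

```java
import java.nio.Buffer;
import java.nio.IntBuffer;

public class Main {
    public static void main(String[] args) {
        IntBuffer buf = IntBuffer.allocate(4);
        buf.put(42).put(7);
        // Calling buf.rewind() directly compiles to IntBuffer.rewind() on
        // JDK 9+, a method Java 8 does not have. Casting to the Buffer
        // supertype pins the call to Buffer.rewind(), present on every JDK.
        ((Buffer) buf).rewind();
        System.out.println(buf.get()); // 42
    }
}
```

Building with `javac --release 8` (rather than just `-source/-target 8`) also avoids the problem, because it compiles against the Java 8 class-file signatures.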


Re: [DISCUSS] Move HDFS specific APIs to FileSystem abstraction

2023-03-27 Thread Wei-Chiu Chuang
I think moving up interfaces to FileSystem or some abstract FileSystem
class has a few benefits:

1. Applications can potentially be made FS-agnostic, with a
hasPathCapabilities() check.
At least, it makes the code compile.

2. We will be able to add a contract test to ensure the behavior is as expected.

The second one is more critical than (1). For complex applications such as
HBase it is almost impossible to be truly FS-agnostic without proper
contract tests, as I am now starting to realize.

This is where I am coming from. No need to make Hadoop application
development harder than it already is.

On Mon, Mar 27, 2023 at 4:58 AM Steve Loughran 
wrote:

> side issue, as i think about what bulk delete call would also keep hbase
> happy
> https://issues.apache.org/jira/browse/HADOOP-18679
>
> should we think about new API calls only raising RuntimeExceptions?
>
> The more work I do on futures the more the way we always raise IOEs
> complicates life. java has outgrown checked exceptions
>
> On Fri, 24 Mar 2023 at 09:44, Steve Loughran  wrote:
>
> >
> >
> > On Thu, 23 Mar 2023 at 10:07, Ayush Saxena  wrote:
> >
> >>
> >> Second idea mentioned in the original mail is also similar to mentioned
> in
> >> the comment in the above ticket and is still quite acceptable, name can
> be
> >> negotiated though, Add an interface to pull the relevant methods up in
> >> that
> >> without touching FileSystem class, we can have DFS implement that and
> >> Ozone
> >> FS implement them as well. We should be sorted: No Hacking, No Bothering
> >> FileSystem and still things can work
> >>
> >>
> >>
> > This is the way we should be thinking about it. an interface which
> > filesystems MAY implement, but many do not.
> >
> > this has happened with some of the recent apis.
> >
> > presence of the API doesn't guarantee the api is active, only that it may
> > be possible to call...callers should use PathCapabilities api to see if
> it
> > is live
> >
> >
> >>
>


Re: [DISCUSS] Move HDFS specific APIs to FileSystem abstraction

2023-03-20 Thread Wei-Chiu Chuang
Thank you. Makes sense to me. Yes, as part of this effort we are going to
need contract tests.

On Fri, Mar 17, 2023 at 3:52 AM Steve Loughran 
wrote:

>1. I think a new interface would be good as FileContext could do the
>same thing
>2. using PathCapabilities probes should still be mandatory as for
>FileContext it would depend on the back end
>3. Whoever does this gets to specify what the API does and write the
>contract tests. Saying "just to do what HDFS does" isn't enough as it's
> not
>always clear the HDFS team knows how much of that behaviour is intentional
>(rename, anyone?).
>
>
> For any new API (a better rename, a better delete,...) I would normally
> insist on making it cloud friendly, with an extensible builder API and an
> emphasis on asynchronous IO. However this is existing code and does target
> HDFS and Ozone -pulling the existing APIs up into a new interface seems the
> right thing to do here.
>
>  I have a WiP project to do a shim library to offer new FS APIs to older
> Hadoop releases by way of reflection, so that we can get new APIs taken up
> across projects where we cannot choreograph version updates across the
> entire stack. (hello parquet, spark,...). My goal is to actually make this
> a Hadoop managed project, with its own release schedule. You could add an
> equivalent of the new interface in here, which would then use reflection
> behind-the-scenes to invoke the underlying HDFS methods when the FS client
> has them.
>
> https://github.com/steveloughran/fs-api-shim
>
> I've just added vector IO API there; the next step is to copy over a lot of
> the contract tests from hadoop common and apply them through the shim -to
> hadoop 3.2, 3.3.0-3.3.5. That testing against many backends is actually as
> tricky as the reflection itself. However without this library it is going
> to take a long long time for the open source applications to pick up the
> higher performance/Cloud ready Apis. Yes, those of us who can build the
> entire stack can do it, but that gradually adds more divergence from the
> open source libraries, reduces the test coverage overall and only increases
> maintenance costs over time.
>
> steve
>
> On Thu, 16 Mar 2023 at 20:56, Wei-Chiu Chuang  wrote:
>
> > Hi,
> >
> > Stephen and I are working on a project to make HBase run on Ozone.
> >
> > HBase, born out of the Hadoop project, depends on a number of HDFS
> specific
> > APIs, including recoverLease() and isInSafeMode(). The HBase community
> [1]
> > strongly voiced that they don't want the project to have direct
> dependency
> > on additional FS implementations due to dependency and vulnerability
> > management concerns.
> >
> > To make this project successful, we're exploring options to push up
> these
> > APIs to the FileSystem abstraction. Eventually, it would make HBase FS
> > implementation agnostic, and perhaps enable HBase to support other
> storage
> > systems in the future.
> >
> > We'd use the PathCapabilities API to probe if the underlying FS
> > implementation supports these APIs, and would then invoke the
> corresponding
> > FileSystem APIs. This is straightforward but the FileSystem would become
> > bloated.
> >
> > Another option is to create a "RecoverableFileSystem" interface, and have
> > both DistributedFileSystem (HDFS) and RootedOzoneFileSystem (Ozone)
> > implement it. This way the impact on the Hadoop project and the FileSystem
> > abstraction is
> even
> > smaller.
> >
> > Thoughts?
> >
> > [1] https://lists.apache.org/thread/tcrp8vxxs3z12y36mpzx35txhpp7tvxv
> >
>


[DISCUSS] Move HDFS specific APIs to FileSystem abstraction

2023-03-16 Thread Wei-Chiu Chuang
Hi,

Stephen and I are working on a project to make HBase run on Ozone.

HBase, born out of the Hadoop project, depends on a number of HDFS specific
APIs, including recoverLease() and isInSafeMode(). The HBase community [1]
strongly voiced that they don't want the project to have direct dependency
on additional FS implementations due to dependency and vulnerability
management concerns.

To make this project successful, we're exploring options to push up these
APIs to the FileSystem abstraction. Eventually, it would make HBase FS
implementation agnostic, and perhaps enable HBase to support other storage
systems in the future.

We'd use the PathCapabilities API to probe if the underlying FS
implementation supports these APIs, and would then invoke the corresponding
FileSystem APIs. This is straightforward but the FileSystem would become
bloated.

Another option is to create a "RecoverableFileSystem" interface, and have
both DistributedFileSystem (HDFS) and RootedOzoneFileSystem (Ozone) implement
it. This way the impact on the Hadoop project and the FileSystem abstraction is even
smaller.

Thoughts?

[1] https://lists.apache.org/thread/tcrp8vxxs3z12y36mpzx35txhpp7tvxv
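A toy sketch of the "interface that filesystems MAY implement, plus a capability probe" shape discussed in this thread. The interface name, capability key, and stub classes are all invented for illustration; the real types would live in hadoop-common and the probe would be FileSystem's PathCapabilities API:

```java
import java.util.Set;

// Hypothetical opt-in interface; HDFS and Ozone would implement it.
interface RecoverableFileSystem {
    boolean recoverLease(String path);
}

// Minimal stand-in for the FileSystem abstraction with a capability probe.
abstract class FileSystemBase {
    abstract boolean hasPathCapability(String path, String capability);
}

class FakeDfs extends FileSystemBase implements RecoverableFileSystem {
    public boolean recoverLease(String path) { return true; }
    boolean hasPathCapability(String path, String capability) {
        return Set.of("fs.capability.lease.recoverable").contains(capability);
    }
}

class FakeLocalFs extends FileSystemBase {
    boolean hasPathCapability(String path, String capability) { return false; }
}

public class Main {
    // Application code stays FS-agnostic: probe first, then cast.
    static boolean tryRecover(FileSystemBase fs, String path) {
        if (fs.hasPathCapability(path, "fs.capability.lease.recoverable")
                && fs instanceof RecoverableFileSystem) {
            return ((RecoverableFileSystem) fs).recoverLease(path);
        }
        return false; // capability absent: caller falls back or skips
    }

    public static void main(String[] args) {
        System.out.println(tryRecover(new FakeDfs(), "/wal"));      // true
        System.out.println(tryRecover(new FakeLocalFs(), "/wal")); // false
    }
}
```

The design point is that FileSystem itself stays unbloated: only implementations that opt in carry the methods, and callers discover support at runtime instead of compiling against DistributedFileSystem.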


[jira] [Resolved] (HDFS-16947) RBF NamenodeHeartbeatService to report error for not being able to register namenode in state store

2023-03-15 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16947.

Resolution: Fixed

> RBF NamenodeHeartbeatService to report error for not being able to register 
> namenode in state store
> ---
>
> Key: HDFS-16947
> URL: https://issues.apache.org/jira/browse/HDFS-16947
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Namenode heartbeat service should log an error with the full stacktrace if it 
> cannot register the namenode in the state store. As of today, we only log an 
> info msg.
> For the zookeeper-based impl, this might mean either a) the curator manager is 
> not initialized, or b) it fails to write to the znode after exhausting retries. 
> In either case, reporting only an INFO log might not be good enough, and we 
> might have to look for errors elsewhere.
>  
> Sample example:
> {code:java}
> 2023-02-20 23:10:33,714 DEBUG [NamenodeHeartbeatService {ns} nn0-0] 
> router.NamenodeHeartbeatService - Received service state: ACTIVE from HA 
> namenode: {ns}-nn0:nn-0-{ns}.{cluster}:9000
> 2023-02-20 23:10:33,731 INFO  [NamenodeHeartbeatService {ns} nn0-0] 
> impl.MembershipStoreImpl - Inserting new NN registration: 
> nn-0.namenode.{cluster}:->{ns}:nn0:nn-0-{ns}.{cluster}:9000-ACTIVE
> 2023-02-20 23:10:33,731 INFO  [NamenodeHeartbeatService {ns} nn0-0] 
> router.NamenodeHeartbeatService - Cannot register namenode in the State Store
>  {code}
> If we could log full stacktrace:
> {code:java}
> 2023-02-21 00:20:24,691 ERROR [NamenodeHeartbeatService {ns} nn0-0] 
> router.NamenodeHeartbeatService - Cannot register namenode in the State Store
> org.apache.hadoop.hdfs.server.federation.store.StateStoreUnavailableException:
>  State Store driver StateStoreZooKeeperImpl in nn-0.namenode.{cluster} is not 
> ready.
>         at 
> org.apache.hadoop.hdfs.server.federation.store.driver.StateStoreDriver.verifyDriverReady(StateStoreDriver.java:158)
>         at 
> org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.putAll(StateStoreZooKeeperImpl.java:235)
>         at 
> org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreBaseImpl.put(StateStoreBaseImpl.java:74)
>         at 
> org.apache.hadoop.hdfs.server.federation.store.impl.MembershipStoreImpl.namenodeHeartbeat(MembershipStoreImpl.java:179)
>         at 
> org.apache.hadoop.hdfs.server.federation.resolver.MembershipNamenodeResolver.registerNamenode(MembershipNamenodeResolver.java:381)
>         at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateState(NamenodeHeartbeatService.java:317)
>         at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.lambda$periodicInvoke$0(NamenodeHeartbeatService.java:244)
> ...
> ... {code}
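The change described above boils down to passing the Throwable to the logger so the full stack trace is emitted. A sketch using java.util.logging as a stand-in (the Hadoop code uses an SLF4J-style LOG.error(msg, e), but the pattern is the same):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class Main {
    static final Logger LOG = Logger.getLogger("NamenodeHeartbeatService");

    public static void main(String[] args) {
        Exception cause =
            new IllegalStateException("State Store driver is not ready");
        // Before: message only -- the reader has to hunt for the cause elsewhere.
        LOG.info("Cannot register namenode in the State Store");
        // After: ERROR with the exception attached, so the handler prints the
        // full stack trace alongside the message.
        LOG.log(Level.SEVERE, "Cannot register namenode in the State Store", cause);
    }
}
```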






Re: Request to create an ASF Jira account

2023-02-14 Thread Wei-Chiu Chuang
I've submitted a request to create the account. You should receive an email
shortly.


On Tue, Feb 14, 2023 at 1:40 PM ravindra dingankar <
ravindradingan...@gmail.com> wrote:

> Hi,
>
> I am part of LinkedIn's HDFS team, and would like to start contributing to
> HDFS and be part of the mailing list.
>
> I request the project to create an ASF Jira account for me.
> My details are as follows
>
> email address : ravindra.dingan...@asu.edu
> preferred username : rdingankar
> alternate username : rdingank
> display name : Ravindra Dingankar
>
>
> Thanks & Regards,
> Ravindra Dingankar
>


[jira] [Resolved] (HDFS-16873) FileStatus compareTo does not specify ordering

2022-12-20 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16873.

Fix Version/s: 3.4.0
   Resolution: Fixed

> FileStatus compareTo does not specify ordering
> --
>
> Key: HDFS-16873
> URL: https://issues.apache.org/jira/browse/HDFS-16873
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: DDillon
>Assignee: DDillon
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> The Javadoc of FileStatus does not specify the field and manner in which 
> objects are ordered. In order to use the Comparable interface, this is 
> critical to understand to avoid making any assumptions. Inspection of the 
> code quickly showed that ordering is by path name, but we shouldn't have to 
> go into the code to confirm obvious assumptions.
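The clarified contract can be stated directly in the Javadoc. A stand-in sketch — this is not the real org.apache.hadoop.fs.FileStatus, just an illustration of documenting the path-name ordering:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class Main {
    static class FileStatusSketch implements Comparable<FileStatusSketch> {
        final String path;
        FileStatusSketch(String path) { this.path = path; }

        /**
         * Orders by path name, lexicographically. Stating this in the Javadoc
         * means Comparable users no longer have to read the implementation.
         */
        @Override
        public int compareTo(FileStatusSketch other) {
            return this.path.compareTo(other.path);
        }
    }

    public static void main(String[] args) {
        List<FileStatusSketch> statuses = new ArrayList<>();
        statuses.add(new FileStatusSketch("/b"));
        statuses.add(new FileStatusSketch("/a"));
        Collections.sort(statuses); // uses the documented path-name ordering
        System.out.println(statuses.get(0).path); // /a
    }
}
```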






[jira] [Resolved] (HDFS-16871) DiskBalancer process may throw IllegalArgumentException when the target DataNode has a capital letter in its hostname

2022-12-20 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16871.

Fix Version/s: 3.4.0
   Resolution: Fixed

> DiskBalancer process may throw IllegalArgumentException when the target 
> DataNode has a capital letter in its hostname
> 
>
> Key: HDFS-16871
> URL: https://issues.apache.org/jira/browse/HDFS-16871
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Daniel Ma
>Assignee: Daniel Ma
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> The DiskBalancer process reads DataNode hostnames as lowercase letters,
>  !screenshot-1.png! 
>  but there is no letter-case transform in getNodeByName.
>  !screenshot-2.png! 
> For a DataNode with a lowercase hostname, everything is ok.
> But for a DataNode with an uppercase hostname, when the Balancer process 
> tries to migrate onto it, an IllegalArgumentException is thrown, as below:
> {code:java}
> 2022-10-09 16:15:26,631 ERROR tools.DiskBalancerCLI: 
> java.lang.IllegalArgumentException: Unable to find the specified node. 
> node-group-1YlRf0002
> {code}
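A sketch of the fix direction: normalize hostnames to a single case on both registration and lookup, since DNS hostnames are case-insensitive. Method names here are illustrative, not the actual DiskBalancer code:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class Main {
    static final Map<String, String> nodesByName = new HashMap<>();

    // DNS hostnames are case-insensitive, so compare in one canonical case.
    static String normalize(String host) {
        return host.toLowerCase(Locale.ROOT);
    }

    static void register(String host) {
        nodesByName.put(normalize(host), host); // store under normalized key
    }

    static String getNodeByName(String host) {
        String node = nodesByName.get(normalize(host)); // normalized lookup
        if (node == null) {
            throw new IllegalArgumentException(
                "Unable to find the specified node. " + host);
        }
        return node;
    }

    public static void main(String[] args) {
        register("node-group-1YlRf0002");
        // Lookup succeeds regardless of the letter case the caller uses.
        System.out.println(getNodeByName("NODE-GROUP-1YLRF0002"));
    }
}
```

Without the normalized lookup, a mixed-case hostname like the one in the error message misses the lowercased key and triggers exactly the IllegalArgumentException quoted above.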






[jira] [Resolved] (HDFS-16854) TestDFSIO to support non-default file system

2022-12-01 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16854.

Resolution: Duplicate

> TestDFSIO to support non-default file system
> 
>
> Key: HDFS-16854
> URL: https://issues.apache.org/jira/browse/HDFS-16854
> Project: Hadoop HDFS
>  Issue Type: Improvement
>    Reporter: Wei-Chiu Chuang
>        Assignee: Wei-Chiu Chuang
>Priority: Major
>
> TestDFSIO expects a parameter {{-Dtest.build.data=}} which is where the data 
> is located. Only paths on the default file system are supported. Running it 
> against other file systems, such as Ozone, throws an exception.
> It can be worked around by specifying {{-Dfs.defaultFS=}} but it would be 
> even nicer to support non-default file systems out of the box, because no one 
> would know this trick unless they look at the code.






[jira] [Resolved] (HDFS-16839) It should consider EC reconstruction work when we determine if a node is busy

2022-11-29 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16839.

Resolution: Fixed

> It should consider EC reconstruction work when we determine if a node is busy
> -
>
> Key: HDFS-16839
> URL: https://issues.apache.org/jira/browse/HDFS-16839
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kidd5368
>Assignee: Kidd5368
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>
> In chooseSourceDatanodes(), I think it's more reasonable to take EC 
> reconstruction work into consideration when we determine whether a node is 
> busy or not.






[jira] [Created] (HDFS-16854) TestDFSIO to support non-default file system

2022-11-23 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDFS-16854:
--

 Summary: TestDFSIO to support non-default file system
 Key: HDFS-16854
 URL: https://issues.apache.org/jira/browse/HDFS-16854
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Wei-Chiu Chuang


TestDFSIO expects a parameter {{-Dtest.build.data=}} which is where the data is 
located. Only paths on the default file system are supported. Trying to run it 
against other file systems, such as Ozone, throws an exception.

It can be worked around by specifying {{-Dfs.defaultFS=}} but it would be even 
nicer to support non-default file systems out of the box, because no one would 
know this trick unless they look at the code.






Re: Code coverage report on github PRs

2022-11-23 Thread Wei-Chiu Chuang
I believe most of them can be added by us using GitHub Workflow. There's a
marketplace for these tools and most of them are free for open source
projects.

On Wed, Nov 23, 2022 at 11:43 AM Ayush Saxena  wrote:

> A simple Infra ticket I suppose should get it done for us, eg.
> https://issues.apache.org/jira/browse/INFRA-23561
>
> -Ayush
>
> On Thu, 24 Nov 2022 at 01:00, Iñigo Goiri  wrote:
>
> > Now that we are using mostly GitHub PRs for the reviews and we have
> decent
> > integration for the builds etc there, I was wondering about code coverage
> > and reporting.
> > Is code coverage setup at all?
> > Does this come from the INFRA team?
> > What would it take to enable it otherwise?
> >
>


[jira] [Resolved] (HDFS-9536) OOM errors during parallel upgrade to Block-ID based layout

2022-10-29 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-9536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-9536.
---
Resolution: Duplicate

I believe this is no longer an issue after HDFS-15937 and HDFS-15610.

> OOM errors during parallel upgrade to Block-ID based layout
> ---
>
> Key: HDFS-9536
> URL: https://issues.apache.org/jira/browse/HDFS-9536
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
>
> This is a follow-up jira for the OOM errors observed during parallel upgrade 
> to Block-ID based datanode layout using HDFS-8578 fix.
> more clue 
> [here|https://issues.apache.org/jira/browse/HDFS-8578?focusedCommentId=15042012=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15042012]






[DISCUSS] Supporting partial file rewrite/compose

2022-10-07 Thread Wei-Chiu Chuang
There were a number of discussions that happened during ApacheCon. In the
spirit of the Apache Way, I am taking the conversation online, sharing with
the larger community and also capturing requirements. Credits to Owen who
started this discussion.

There are a number of scenarios where users want to partially rewrite file
blocks, and it would make sense to create a file system API to make these
operations efficient.

1. Apache Iceberg or other evolvable table format.
These table formats need to update table schema. The underlying files are
rewritten but only a subset of blocks are changed. It would be much more
efficient if a new file can be composed using some of the existing file
blocks.

2. GDPR compliance "the right to erasure"
Files must be rewritten to remove a person's data on request. Again, composing
would be efficient because only a small set of file blocks is updated.

3. In-place erasure coding conversion.
I had a proposal to support atomically rewriting replicated files into
erasure coded files. This can be the building block to support auto-tiering.

Thoughts? What would be a good FS interface to support these requirements?
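One toy shape such an API could take: compose a new file from existing block references plus a few replacement blocks, so only the changed blocks are ever written. Everything below — the compose name and the block-list model — is hypothetical, an in-memory illustration rather than a proposed FileSystem signature:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class Main {
    // A "file" is modeled as a list of blocks. compose() builds a new file
    // that reuses the unchanged block references and substitutes only the
    // blocks named in `replacements` (index -> new block).
    static List<byte[]> compose(List<byte[]> original, Map<Integer, byte[]> replacements) {
        List<byte[]> out = new ArrayList<>();
        for (int i = 0; i < original.size(); i++) {
            out.add(replacements.getOrDefault(i, original.get(i)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> file = Arrays.asList(
            "aaa".getBytes(), "bbb".getBytes(), "ccc".getBytes());
        // Rewrite only block 1 (e.g. the block holding the erased record).
        List<byte[]> updated = compose(file, Map.of(1, "XXX".getBytes()));
        System.out.println(new String(updated.get(1)));  // XXX
        // Unchanged blocks are the same references -- no data was copied.
        System.out.println(updated.get(0) == file.get(0)); // true
    }
}
```

In a real file system the entries would be block IDs rather than byte arrays, and the compose call would need to address atomicity, lease handling, and erasure-coded blocks, which is where the FS interface discussion comes in.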

For Ozone folks, Ritesh opened a jira: HDDS-7297
 but I figured a larger
conversation should happen so that we can take into the consideration of
other FS implementations.

Thanks,
Weichiu


[DISCUSS] Hadoop 3.3.5 release planning

2022-10-07 Thread Wei-Chiu Chuang
Bumping this up. Adding the [DISCUSS] text to make this message stand out
of your inbox.

I certainly missed this message and only realized 3.3.5 has more than just
security updates.

What was the issue with the ARM64 build? I was able to publish an ARM64 build
for the 3.3.1 release without problems.


On Tue, Sep 27, 2022 at 9:35 AM Steve Loughran 
wrote:

> Mukund has just created the new Hadoop release JIRA,
> https://issues.apache.org/jira/browse/HADOOP-18470, and is doing the first
> build/test before pushing it up. This is off branch-3.3, so it will have
> more significant changes than the 3.3.3 and 3.3.4 releases, which were just
> CVE/integration fixes.
>
> The new branch, branch-3.3.5 has its maven/hadoop.version set to
> 3.3.5-SNAPSHOT.
>
> All JIRA issues fixed/blocked for 3.3.9 now reference 3.3.5. The next step
> of the release is to actually move those wontfix issues back to being
> against 3.3.9
>
> There is still a 3.3.9 version; branch-3.3's maven build still refers to
> it. Issues found/fixed in branch-3.3 *but not the new branch-3.3.5 branch-
> should still refer to this version. Anything targeting the 3.3.5 release
> must be committed to the new branch, and have its JIRA version tagged
> appropriately.
>
> All changes to be cherrypicked into 3.3.5, except for those ones related to
> the release process itself, MUST be in branch-3.3 first, and SHOULD be in
> trunk unless there is some fundamental reason they can't apply there
> (reload4j etc).
>
> Let's try and stabilise this release, especially bringing up to date all
> the JAR dependencies which we can safely update.
>
> Anyone planning to give talks at ApacheCon about forthcoming features
> already in 3.3 SHOULD
>
>1. reference Hadoop 3.3.5 as the version
>2. make sure their stuff works.
>
> Mukund will be at the conf; find him and offer any help you can in getting
> this release out.
>
> I'd like to get that Arm64 build working... does anyone else want to get
> involved?
>
> -steve
>


Re: Hadoop BoF at ApacheCon NA'22

2022-10-02 Thread Wei-Chiu Chuang
Hi Junping, yes it's all settled. We'll be meeting Monday 5:50pm Central
time, which will be 6:50am Tuesday for you. Sorry, the complete conference
schedule is only available to participants at this time. It was taken down
from the website.

Hey, so it's really early for you. I'd suggest moving you to the back of
the session if that's okay with you.

Zoom link:
https://cloudera.zoom.us/j/94221207158

Dial in: +1 877-853-5257

The zoom link is available for any one to join remotely.

俊平堵 wrote on Sunday, October 2, 2022 at 7:23 AM:

> Hi Uma and Wei-Chiu,
>  Does this schedule update have settled down (from Tues. to Mon.)? If
> so, would you provide a meeting link for me so that I can join remotely?
> Thanks!
>
> Best,
>
> Junping
>
> Uma Maheswara Rao Gangumalla wrote on Tuesday, September 27, 2022 at 02:00:
>
> > Guys, there is a schedule change for Hadoop BoaF due to the conflicts
> with
> > Lightning talks on Wednesday. It has been moved to Monday.
> > Plan accordingly.
> >
> > Ozone BoAF will happen in Rhythms-I on Monday. Depending on the number of
> > people, we could combine as well.
> >
> > Regards,
> > Uma
> >
> > On Sat, Sep 24, 2022 at 8:52 PM 俊平堵  wrote:
> >
> > > Yes. It should be a short talk and I can only present remotely for this
> > > time. That would be great if you help to coordinate. :)
> > >
> > > Thanks,
> > >
> > > Junping
> > >
> > > Wei-Chiu Chuang wrote on Saturday, September 24, 2022 at 01:21:
> > >
> > >> That would be great! Will you be presenting in person or remote?
> > >> If it's a short talk (15-20 minutes) we'll have enough time for 2-3 of
> > >> these. If folks want to present remotely I can help coordinate that.
> > >>
> > >> Here I put up a short agenda for the BoF:
> > >>
> > >>
> >
> https://docs.google.com/document/d/1_ha1BFeEyIkAJtl5tJ8Z4Vf5Md_0rN23RCrXkyNCY1c/edit?usp=sharing
> > >> Please add more details here.
> > >>
> > >>
> > >> On Thu, Sep 22, 2022 at 10:27 PM 俊平堵  wrote:
> > >>
> > >> > Thanks Wei-Chiu. I am happy to share the status of Hadoop Meetups in
> > >> China
> > >> > (2019-2022) if that is a suitable topic. :)
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Junping
> > >> >
> > >> > Wei-Chiu Chuang wrote on Wednesday, September 21, 2022 at 02:31:
> > >> >
> > >> > > We've not had a physical event for a long long time and we're way
> > >> overdue
> > >> > > for one.
> > >> > >
> > >> > > I'm excited to announce we've reserved a room at the upcoming
> > >> ApacheCon
> > >> > for
> > >> > > Birds-of-Feather on October 4th from 17:50-18:30 CDT in Rhythms
> I. I
> > >> was
> > >> > > also told that participants can stay after that until the hotel
> > >> > personnel
> > >> > > throw us out.
> > >> > >
> > >> > > Feel free to pass along this information.
> > >> > >
> > >> > > On top of that, I was told that this year's ApacheCon is very
> > popular
> > >> > and a
> > >> > > lot of good talk proposals were not selected. If folks are
> > interested
> > >> I'm
> > >> > > happy to invite you to share online. A physical meetup in the Bay
> > Area
> > >> > > would also be a great idea, if we can find a sponsor.
> > >> > >
> > >> > > Thanks,
> > >> > > Wei-Chiu
> > >> > >
> > >> >
> > >>
> > >
> >
>


Re: Hadoop BoF at ApacheCon NA'22

2022-09-23 Thread Wei-Chiu Chuang
That would be great! Will you be presenting in person or remote?
If it's a short talk (15-20 minutes) we'll have enough time for 2-3 of
these. If folks want to present remotely I can help coordinate that.

Here I put up a short agenda for the BoF:
https://docs.google.com/document/d/1_ha1BFeEyIkAJtl5tJ8Z4Vf5Md_0rN23RCrXkyNCY1c/edit?usp=sharing
Please add more details here.


On Thu, Sep 22, 2022 at 10:27 PM 俊平堵  wrote:

> Thanks Wei-Chiu. I am happy to share the status of Hadoop Meetups in China
> (2019-2022) if that is a suitable topic. :)
>
> Thanks,
>
> Junping
>
> Wei-Chiu Chuang wrote on Wednesday, September 21, 2022 at 02:31:
>
> > We've not had a physical event for a long long time and we're way overdue
> > for one.
> >
> > I'm excited to announce we've reserved a room at the upcoming ApacheCon
> for
> > Birds-of-Feather on October 4th from 17:50-18:30 CDT in Rhythms I. I was
> > also told that participants can stay after that until the hotel
> personnel
> > throw us out.
> >
> > Feel free to pass along this information.
> >
> > On top of that, I was told that this year's ApacheCon is very popular
> and a
> > lot of good talk proposals were not selected. If folks are interested I'm
> > happy to invite you to share online. A physical meetup in the Bay Area
> > would also be a great idea, if we can find a sponsor.
> >
> > Thanks,
> > Wei-Chiu
> >
>


Hadoop BoF at ApacheCon NA'22

2022-09-20 Thread Wei-Chiu Chuang
We've not had a physical event for a long long time and we're way overdue
for one.

I'm excited to announce we've reserved a room at the upcoming ApacheCon for
Birds-of-Feather on October 4th from 17:50-18:30 CDT in Rhythms I. I was
also told that participants can stay after that until the hotel personnel
throw us out.

Feel free to pass along this information.

On top of that, I was told that this year's ApacheCon is very popular and a
lot of good talk proposals were not selected. If folks are interested I'm
happy to invite you to share online. A physical meetup in the Bay Area
would also be a great idea, if we can find a sponsor.

Thanks,
Wei-Chiu


[jira] [Resolved] (HDFS-4043) Namenode Kerberos Login does not use proper hostname for host qualified hdfs principal name.

2022-08-22 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-4043.
---
Resolution: Fixed

> Namenode Kerberos Login does not use proper hostname for host qualified hdfs 
> principal name.
> 
>
> Key: HDFS-4043
> URL: https://issues.apache.org/jira/browse/HDFS-4043
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.0.0-alpha, 2.0.1-alpha, 2.0.2-alpha, 2.0.3-alpha, 
> 3.4.0, 3.3.9
> Environment: CDH4U1 on Ubuntu 12.04
>Reporter: Ahad Rana
>Assignee: Steve Vaughan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>   Original Estimate: 24h
>  Time Spent: 50m
>  Remaining Estimate: 23h 10m
>
> The Namenode uses the loginAsNameNodeUser method in NameNode.java to login 
> using the hdfs principal. This method in turn invokes SecurityUtil.login with 
> a hostname (last parameter) obtained via a call to InetAddress.getHostName. 
> This call does not always return the fully qualified host name, and thus 
> causes the namenode login to fail due to Kerberos's inability to find a 
> matching hdfs principal in the hdfs.keytab file. Instead it should use 
> InetAddress.getCanonicalHostName. This is consistent with what is used 
> internally by SecurityUtil.java to login in other services, such as the 
> DataNode. 
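The distinction above can be sketched in plain JDK Java. This is a hedged illustration only: `expandPrincipal` and the class name are hypothetical helpers for demonstration, not Hadoop's actual `SecurityUtil` API.

```java
import java.net.InetAddress;

// Sketch of the issue: a service principal like "hdfs/_HOST@EXAMPLE.COM"
// must have the _HOST token replaced with the fully qualified host name so
// that it matches the entry stored in hdfs.keytab.
public class PrincipalExample {
    // Substitute the _HOST token with the supplied host name (lower-cased,
    // since the host part of a Kerberos principal is conventionally lower case).
    public static String expandPrincipal(String principal, String host) {
        return principal.replace("_HOST", host.toLowerCase());
    }

    public static void main(String[] args) throws Exception {
        InetAddress addr = InetAddress.getLocalHost();
        // getCanonicalHostName() performs a reverse lookup and returns the FQDN;
        // getHostName() may return only a short name, in which case the expanded
        // principal would not match any keytab entry and the login fails.
        System.out.println(expandPrincipal("hdfs/_HOST@EXAMPLE.COM",
            addr.getCanonicalHostName()));
    }
}
```

With a short host name such as `nn1`, the expanded principal would be `hdfs/nn1@EXAMPLE.COM`, which is absent from a keytab keyed by FQDNs.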



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-16730) Update the doc that append to EC files is supported

2022-08-16 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDFS-16730:
--

 Summary: Update the doc that append to EC files is supported
 Key: HDFS-16730
 URL: https://issues.apache.org/jira/browse/HDFS-16730
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Wei-Chiu Chuang


Our doc has a statement regarding EC limitations:
https://hadoop.apache.org/docs/r3.3.0/hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html#Limitations

{noformat}
append() and truncate() on an erasure coded file will throw IOException.

{noformat}

In fact, HDFS-7663 added this support in Hadoop 3.3.0. The caveat is that it 
only supports "Append to a closed striped file, with NEW_BLOCK flag enabled".



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-16727) Consider reading chunk files using MappedByteBuffer

2022-08-10 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDFS-16727:
--

 Summary: Consider reading chunk files using MappedByteBuffer
 Key: HDFS-16727
 URL: https://issues.apache.org/jira/browse/HDFS-16727
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Wei-Chiu Chuang
 Attachments: Screen Shot 2022-08-04 at 7.12.55 AM.png, 
ozone_dn-rhel03.ozone.cisco.local.html

While running Impala TPC-DS, which stresses the Ozone DN read path,
BufferUtils#assignByteBuffers stands out as one of the offenders.

We can experiment with MappedByteBuffer and see if it improves performance.
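A minimal JDK-only sketch of the idea: map only the requested region of a chunk file and read from the mapping instead of copying through intermediate byte buffers. The `readChunk` helper and class name are illustrative, not Ozone's actual read path.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedReadExample {
    // Memory-map the [offset, offset+len) region of a file and copy it out.
    public static byte[] readChunk(Path file, long offset, int len) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Map only the requested region; the OS pages data in lazily.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, offset, len);
            byte[] out = new byte[len];
            buf.get(out);
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("chunk", ".bin");
        Files.write(tmp, "hello-chunk-data".getBytes());
        byte[] got = readChunk(tmp, 6, 5);
        System.out.println(new String(got)); // prints "chunk"
        Files.delete(tmp);
    }
}
```

Whether mapping beats buffered reads depends on chunk size and access pattern, which is exactly what the experiment above would measure.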




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16619) Fix HttpHeaders.Values And HttpHeaders.Names Deprecated Import.

2022-07-27 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16619.

Resolution: Fixed

> Fix HttpHeaders.Values And HttpHeaders.Names Deprecated Import.
> ---
>
> Key: HDFS-16619
> URL: https://issues.apache.org/jira/browse/HDFS-16619
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: Fix HttpHeaders.Values And HttpHeaders.Names 
> Deprecated.png
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> HttpHeaders.Values and HttpHeaders.Names are deprecated; use 
> HttpHeaderValues and HttpHeaderNames instead.
> HttpHeaders.Names
> Deprecated. 
> Use HttpHeaderNames instead. Standard HTTP header names.
> {code:java}
> /** @deprecated */
> @Deprecated
> public static final class Names {
>   public static final String ACCEPT = "Accept";
>   public static final String ACCEPT_CHARSET = "Accept-Charset";
>   public static final String ACCEPT_ENCODING = "Accept-Encoding";
>   public static final String ACCEPT_LANGUAGE = "Accept-Language";
>   public static final String ACCEPT_RANGES = "Accept-Ranges";
>   public static final String ACCEPT_PATCH = "Accept-Patch";
>   public static final String ACCESS_CONTROL_ALLOW_CREDENTIALS = 
> "Access-Control-Allow-Credentials";
>   public static final String ACCESS_CONTROL_ALLOW_HEADERS = 
> "Access-Control-Allow-Headers"; {code}
> HttpHeaders.Values
> Deprecated. 
> Use HttpHeaderValues instead. Standard HTTP header values.
> {code:java}
> /** @deprecated */
> @Deprecated
> public static final class Values {
>   public static final String APPLICATION_JSON = "application/json";
>   public static final String APPLICATION_X_WWW_FORM_URLENCODED = 
> "application/x-www-form-urlencoded";
>   public static final String BASE64 = "base64";
>   public static final String BINARY = "binary";
>   public static final String BOUNDARY = "boundary";
>   public static final String BYTES = "bytes";
>   public static final String CHARSET = "charset";
>   public static final String CHUNKED = "chunked";
>   public static final String CLOSE = "close"; {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16595) Slow peer metrics - add median, mad and upper latency limits

2022-06-03 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16595.

Fix Version/s: 3.4.0
   Resolution: Fixed

> Slow peer metrics - add median, mad and upper latency limits
> 
>
> Key: HDFS-16595
> URL: https://issues.apache.org/jira/browse/HDFS-16595
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Slow datanode metrics include the slow node and its reporting node details. With 
> HDFS-16582, we added the aggregate latency as perceived by the reporting 
> nodes.
> In order to get more insights into how the outlier slownode's latencies 
> differ from the rest of the nodes, we should also expose median, median 
> absolute deviation and the calculated upper latency limit details.
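The proposed statistics can be sketched in plain Java as below. The `3 * MAD` multiplier for the upper limit is an assumption for illustration, not necessarily the constant the Hadoop patch uses, and the class and method names are hypothetical.

```java
import java.util.Arrays;

public class SlowPeerStats {
    // Median of a sample (average of the two middle values for even sizes).
    public static double median(double[] v) {
        double[] s = v.clone();
        Arrays.sort(s);
        int n = s.length;
        return (n % 2 == 1) ? s[n / 2] : (s[n / 2 - 1] + s[n / 2]) / 2.0;
    }

    // Median absolute deviation: median of |x_i - median(x)|.
    public static double mad(double[] v) {
        double med = median(v);
        double[] dev = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            dev[i] = Math.abs(v[i] - med);
        }
        return median(dev);
    }

    // Latencies above this limit would mark a node as an outlier
    // (illustrative rule: median + 3 * MAD).
    public static double upperLimit(double[] v) {
        return median(v) + 3 * mad(v);
    }

    public static void main(String[] args) {
        double[] latencies = {2.0, 3.0, 3.0, 4.0, 30.0};
        System.out.println(median(latencies));     // 3.0
        System.out.println(mad(latencies));        // 1.0
        System.out.println(upperLimit(latencies)); // 6.0 -> 30.0 is an outlier
    }
}
```

MAD-based limits are robust to the very outliers being hunted: a single 30 ms latency barely moves the limit, unlike a mean/stddev rule.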



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-31 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16583.

Resolution: Fixed

> DatanodeAdminDefaultMonitor can get stuck in an infinite loop
> -
>
> Key: HDFS-16583
> URL: https://issues.apache.org/jira/browse/HDFS-16583
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.4
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We encountered a case where the decommission monitor in the namenode got 
> stuck for about 6 hours. The logs give:
> {code}
> 2022-05-15 01:09:25,490 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.132:50010
> 2022-05-15 01:10:20,918 INFO org.apache.hadoop.http.HttpServer2: Process 
> Thread Dump: jsp requested
> 
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753665_3428271426
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753659_3428271420
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753662_3428271423
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753663_3428271424
> 2022-05-15 06:00:57,281 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.34:50010
> 2022-05-15 06:00:58,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem write lock 
> held for 17492614 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:263)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:220)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1601)
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:496)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>   Number of suppressed write-lock reports: 0
>   Longest write-lock held interval: 17492614
> {code}
> We only have the one thread dump triggered by the FC:
> {code}
> Thread 80 (DatanodeAdminMonitor-0):
>   State: RUNNABLE
>   Blocked count: 16
>   Waited count: 453693
>   Stack:
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:538)
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:494)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> {code}
> This was the line of code:
> {code}
> private void check() {
>   final Iterator<Map.Entry<DatanodeDescriptor, AbstractList<BlockInfo>>>
>   it = new CyclicIteration<>(outOfServiceNodeBlocks,
>   iterkey).iterator();
>   final LinkedList<DatanodeDescriptor> toRemove = new LinkedList<>();
>   while (it.hasNext() && !exceededNumBlocksPerCheck() &&

[jira] [Resolved] (HDFS-16603) Improve DatanodeHttpServer With Netty recommended method

2022-05-31 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16603.

Resolution: Fixed

> Improve DatanodeHttpServer With Netty recommended method
> 
>
> Key: HDFS-16603
> URL: https://issues.apache.org/jira/browse/HDFS-16603
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When reading the code, I found that some usage methods are outdated due to 
> the upgrade of Netty components.
> {color:#172b4d}*1.DatanodeHttpServer#Constructor*{color}
> {code:java}
> @Deprecated
> public static final ChannelOption<Integer> WRITE_BUFFER_HIGH_WATER_MARK = 
> valueOf("WRITE_BUFFER_HIGH_WATER_MARK"); 
> Deprecated. Use WRITE_BUFFER_WATER_MARK
> @Deprecated
> public static final ChannelOption<Integer> WRITE_BUFFER_LOW_WATER_MARK = 
> valueOf("WRITE_BUFFER_LOW_WATER_MARK");
> Deprecated. Use WRITE_BUFFER_WATER_MARK
> -
> this.httpServer.childOption(
>           ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK,
>           conf.getInt(
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_HIGH_WATERMARK,
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_HIGH_WATERMARK_DEFAULT));
> this.httpServer.childOption(
>           ChannelOption.WRITE_BUFFER_LOW_WATER_MARK,
>           conf.getInt(
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_LOW_WATERMARK,
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_LOW_WATERMARK_DEFAULT));
> {code}
> *2.Duplicate code* 
> {code:java}
> ChannelFuture f = httpServer.bind(infoAddr);
> try {
>  f.syncUninterruptibly();
> } catch (Throwable e) {
>   if (e instanceof BindException) {
>    throw NetUtils.wrapException(null, 0, infoAddr.getHostName(),
>    infoAddr.getPort(), (SocketException) e);
>  } else {
>    throw e;
>  }
> }
> httpAddress = (InetSocketAddress) f.channel().localAddress();
> LOG.info("Listening HTTP traffic on " + httpAddress);{code}
> *3.io.netty.bootstrap.ChannelFactory Deprecated*
> *use io.netty.channel.ChannelFactory instead.*
> {code:java}
> /** @deprecated */
> @Deprecated
> public interface ChannelFactory<T extends Channel> {
>     T newChannel();
> }{code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16585) Add @VisibleForTesting in Dispatcher.java after HDFS-16268

2022-05-26 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16585.

Fix Version/s: 3.4.0
   3.2.4
   3.3.4
   Resolution: Fixed

> Add @VisibleForTesting in Dispatcher.java after HDFS-16268
> --
>
> Key: HDFS-16585
> URL: https://issues.apache.org/jira/browse/HDFS-16585
> Project: Hadoop HDFS
>  Issue Type: Improvement
>    Reporter: Wei-Chiu Chuang
>Assignee: groot
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.4
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The scope of a few methods was opened up by HDFS-16268 to facilitate unit 
> testing. We should annotate them with {{@VisibleForTesting}} so that they 
> don't get used by production code.
> The affected methods include:
> PendingMove
> markMovedIfGoodBlock
> isGoodBlockCandidate



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-26 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16583.

Resolution: Fixed

> DatanodeAdminDefaultMonitor can get stuck in an infinite loop
> -
>
> Key: HDFS-16583
> URL: https://issues.apache.org/jira/browse/HDFS-16583
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We encountered a case where the decommission monitor in the namenode got 
> stuck for about 6 hours. The logs give:
> {code}
> 2022-05-15 01:09:25,490 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.132:50010
> 2022-05-15 01:10:20,918 INFO org.apache.hadoop.http.HttpServer2: Process 
> Thread Dump: jsp requested
> 
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753665_3428271426
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753659_3428271420
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753662_3428271423
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753663_3428271424
> 2022-05-15 06:00:57,281 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.34:50010
> 2022-05-15 06:00:58,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem write lock 
> held for 17492614 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:263)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:220)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1601)
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:496)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>   Number of suppressed write-lock reports: 0
>   Longest write-lock held interval: 17492614
> {code}
> We only have the one thread dump triggered by the FC:
> {code}
> Thread 80 (DatanodeAdminMonitor-0):
>   State: RUNNABLE
>   Blocked count: 16
>   Waited count: 453693
>   Stack:
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:538)
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:494)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> {code}
> This was the line of code:
> {code}
> private void check() {
>   final Iterator<Map.Entry<DatanodeDescriptor, AbstractList<BlockInfo>>>
>   it = new CyclicIteration<>(outOfServiceNodeBlocks,
>   iterkey).iterator();
>   final LinkedList<DatanodeDescriptor> toRemove = new LinkedList<>();
>   while (it.hasNext() && !exceededNumBlocksPerCheck() &&

[jira] [Created] (HDFS-16585) Add @VisibleForTesting in Dispatcher.java after HDFS-16268

2022-05-20 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDFS-16585:
--

 Summary: Add @VisibleForTesting in Dispatcher.java after HDFS-16268
 Key: HDFS-16585
 URL: https://issues.apache.org/jira/browse/HDFS-16585
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Wei-Chiu Chuang


The scope of a few methods was opened up by HDFS-16268 to facilitate unit 
testing. We should annotate them with {{@VisibleForTesting}} so that they don't 
get used by production code.

The affected methods include:
PendingMove
markMovedIfGoodBlock
isGoodBlockCandidate




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16520) Improve EC pread: avoid potential reading whole block

2022-05-06 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16520.

Resolution: Fixed

Merged the PR and cherry-picked into branch-3.3.

Thanks!

> Improve EC pread: avoid potential reading whole block
> -
>
> Key: HDFS-16520
> URL: https://issues.apache.org/jira/browse/HDFS-16520
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, ec, erasure-coding
>Affects Versions: 3.3.1, 3.3.2
>Reporter: daimin
>Assignee: daimin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> HDFS client 'pread' stands for 'positioned read'; this kind of read needs only 
> a range of data instead of the whole file/block. By using 
> BlockReaderFactory#setLength, the client tells the datanode the block length to 
> be read from disk and sent to the client.
> For EC files, the block length to read is not set correctly; by default 
> 'block.getBlockSize() - offsetInBlock' is used for both pread and sread. Thus 
> the datanode reads much more data than needed and sends it to the client, 
> aborting only when the client closes the connection. This wastes a lot of 
> resources.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Enabling all platform builds in CI for all Hadoop PRs

2022-05-06 Thread Wei-Chiu Chuang
Running builds for all platforms for each and every PR seems excessive.

How about doing all platform builds in the nightly jobs?

On Fri, May 6, 2022 at 8:02 AM Steve Loughran 
wrote:

> I'm not enthusiastic here as it not only makes the builds slower, it
> reduces the number of builds we can push through a day.
>
> One thing I am wondering is: could we remove Java 8 support on some branches?
>
> Make branch 3.3.2.x (i.e. the 3.3.3 release) the last Java 8 build, and this
> summer's branch-3.3 release (which I'd rebadge 3.4) would ship as Java 11
> only.
> That would cut build and test time for those trunk PRs in half... after which
> the prospect of building on more than one platform becomes more viable.
>
> On Thu, 5 May 2022 at 15:34, Gautham Banasandra 
> wrote:
>
> > Hi Hadoop devs,
> >
> > Last week, there was a Hadoop build failure on Debian 10 caused by
> > https://github.com/apache/hadoop/pull/3988. In dev-support/jenkins.sh,
> > there's the capability to build and test Hadoop across the supported
> > platforms. Currently, we're limiting this only for those PRs having only
> > C/C++ changes[1], since C/C++ changes are more likely to cause
> > cross-platform build issues and bypassing the full platform build for non
> > C/C++ PRs would save a great deal of CI time. However, the build failure
> > caused by PR #3988 motivates me to enable the capability to build and
> > test Hadoop for all the supported platforms for ALL the PRs.
> >
> > While this may cause longer CI run duration for each PR, it would
> > immensely minimize the risk of breaking Hadoop across platforms and
> > saves us a lot of debugging time. Kindly post your opinion regarding this
> > and I'll move to enable this capability for all PRs if the response is
> > sufficiently positive.
> >
> > [1] =
> >
> >
> https://github.com/apache/hadoop/blob/bccf2f3ef4c8f09f010656f9061a4e323daf132b/dev-support/jenkins.sh#L97-L103
> >
> >
> > Thanks,
> > --Gautham
> >
>


[jira] [Resolved] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-05 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16521.

Resolution: Fixed

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Providing a DFS API to retrieve slow nodes would help add an additional option 
> to "dfsadmin -report" that lists slow datanode info for operators to take a 
> look at, a specifically useful filter for larger clusters.
> The other purpose of such an API is for HDFS downstreamers without direct access 
> to the namenode http port (only the rpc port accessible) to retrieve slownodes.
> Moreover, 
> [FanOutOneBlockAsyncDFSOutput|https://github.com/apache/hbase/blob/master/hbase-asyncfs/src/main/java/org/apache/hadoop/hbase/io/asyncfs/FanOutOneBlockAsyncDFSOutput.java]
>  in HBase currently has to rely on its own way of marking and excluding slow 
> nodes while 1) creating pipelines and 2) handling acks, based on factors like 
> the data length of the packet, processing time since the last ack timestamp, 
> whether flush to replicas is finished, etc. If it can utilize a slownode API 
> from HDFS to exclude nodes appropriately while writing blocks, a lot of its 
> own post-ack computation of slow nodes can be _saved_ or _improved_, or, based 
> on further experiments, we could find a _better solution_ to manage slow node 
> detection logic in both HDFS and HBase. However, in order to collect more 
> data points and run more POCs around this area, HDFS should provide an API for 
> downstreamers to efficiently utilize slownode info for such a critical 
> low-latency use-case (like writing WALs).



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Hadoop on Windows

2022-04-28 Thread Wei-Chiu Chuang
Great!

Sorry I missed the earlier discussion thread. Is there a target version for
this support? I assume the milestone is still in a dev branch?

On Thu, Apr 28, 2022 at 8:26 AM Gautham Banasandra 
wrote:

> Hi Hadoop devs,
>
> I would like to announce that we recently reached a new milestone: we
> finished all the tasks in item 3 under Phase 1. This implies that
> all the HDFS native client tools[1] have become cross platform now. We're
> inching closer towards making Hadoop cross platform. Watch this space for
> more updates.
>
> [1] =
>
> https://github.com/apache/hadoop/tree/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/tools
>
> Thanks,
> --Gautham
>
> On Mon, 21 Feb 2022 at 00:12, Gautham Banasandra 
> wrote:
>
> > Hi all,
> >
> > I've been working on getting Hadoop to build on Windows for quite some
> > time now. We're now at a stage where we can parallelize the effort and
> > complete this sooner. I've outlined the parts that are remaining. Please
> > get in touch with me if anyone wishes to join hands in realizing this
> goal.
> >
> > *Why do we need Hadoop to run on Windows?*
> > Windows has a very large user base. The modern alternatives to
> > Hadoop (like Kubernetes) are cross platform by design. We have to
> > acknowledge the fact that it isn't easy to get Hadoop running on Windows. The
> > reason we haven't seen much adoption of Hadoop on Windows is probably
> > because of issues like compilation requiring work-arounds every step of
> > the way, etc. If we were to nail these issues, I believe it would
> > tremendously expand the usage of Hadoop.
> >
> > I plan to complete this in 4 phases.
> >
> > *Phase 1 : Building Hadoop on Windows*
> > 1. [HADOOP-17193] Compile Hadoop on Windows natively - ASF JIRA
> > (apache.org) 
> > > The Hadoop build on Windows is currently broken because of the POSIX
> > > API calls made in the HDFS native client (libhdfspp). MinGW and Cygwin
> > > provide a POSIX implementation on Windows. While it's possible to use
> > > these C++ compilers, it won't be the same as compiling Hadoop with
> > > Visual C++. The Visual C++ runtime is the native C++ runtime on Windows
> > > and provides many more capabilities (like core dumps etc.) than its
> > > alternatives. Thus, it's essential to get Hadoop to compile with Visual
> > > Studio on Windows.
> > > We'll be using Visual Studio 2019.
> >
> > 2. [HDFS-15843] [libhdfs++] Make write cross platform - ASF JIRA
> > (apache.org) 
> > > Until recently, Hadoop was being built with C++11. I upgraded the
> > > compiler version to a level where it supports C++17 so that we have
> > > access to std::filesystem and a few other modern C++ APIs. However,
> > > there are some cases where the C++17 APIs don't suffice. Thus, I wrote
> > > the XPlatform library
> > > <https://github.com/apache/hadoop/tree/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/x-platform>,
> > > which is a collection of system call APIs implemented in a
> > > cross-platform friendly manner. The CMake build system will choose the
> > > appropriate platform implementation at build time, so that we can do
> > > away with all the platform-based #ifdefs in the code. In summary, if
> > > you ever come across a need to use system calls, please put them into
> > > the XPlatform library and use its APIs.
> >
> > > 3. [HDFS-16474] Make HDFS tail tool cross platform - ASF JIRA (apache.org)
> > > [HDFS-16473] Make HDFS stat tool cross platform - ASF JIRA (apache.org)
> > > [HDFS-16472] Make HDFS setrep tool cross platform - ASF JIRA (apache.org)
> > > [HDFS-16471] Make HDFS ls tool cross platform - ASF JIRA (apache.org)
> > > [HDFS-16470] Make HDFS find tool cross platform - ASF JIRA (apache.org)
> > > The HDFS native client tools use the getopt API to parse command line
> > > arguments. getopt isn't available on Windows. One can follow this PR to
> > > make the above tools cross platform compatible - HDFS-16285. Make HDFS
> > > ownership tools cross platform by GauthamBanasandra · Pull Request
> > > #3588 · apache/hadoop (github.com).
> >
> > > 4. [HDFS-16463] Make dirent.h cross platform compatible - ASF JIRA (apache.org)
> > > [HDFS-16465] Make usage of strings.h cross platform compatible - ASF
> > > JIRA (apache.org)
> > > For these JIRAs, the header files aren't available on Windows. Thus, we
> > > need to inspect the APIs that have been used from these headers and
> > > implement them.
> >
> > 5. 

[jira] [Resolved] (HDFS-16551) Backport HADOOP-17588 to 3.3 and other active old branches.

2022-04-24 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16551.

Fix Version/s: 2.10.2
   3.2.4
   Resolution: Fixed

Done. Thanks!

> Backport HADOOP-17588 to 3.3 and other active old branches.
> ---
>
> Key: HDFS-16551
> URL: https://issues.apache.org/jira/browse/HDFS-16551
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.10.2, 3.2.4, 3.3.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> This intermittent issue has been fixed in trunk; the same fix needs to be 
> backported to the active branches.
> org.apache.hadoop.crypto.CryptoInputStream.close() - when two threads try to 
> close the stream, the second thread fails with an error.
> This operation should be synchronized to prevent multiple threads from 
> performing the close concurrently.
> [~Hemanth Boyina] 
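The race described in the quoted report is a classic check-then-act bug. Below is a minimal sketch of the synchronized-close idea; the class is illustrative only, not Hadoop's actual CryptoInputStream, and the released resources are stand-ins.

```java
// Minimal sketch of the fix idea (illustrative class, NOT Hadoop's actual
// CryptoInputStream): two threads racing through close() can both pass the
// "already closed?" check and release resources twice unless the
// check-then-act sequence is made atomic.
public class SafeCloseSketch {

    private boolean closed = false;

    // Synchronizing close() makes the check-then-act atomic: a second
    // concurrent caller blocks until the first finishes, then sees
    // closed == true and returns without releasing anything twice.
    public synchronized void close() {
        if (closed) {
            return;
        }
        closed = true;
        // ... release decryptor, buffers, underlying stream here ...
    }

    public synchronized boolean isClosed() {
        return closed;
    }
}
```

With this shape, calling close() any number of times, from any thread, is safe; the java.io.Closeable contract already asks close() to be idempotent, and the lock extends that to concurrent callers.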



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16389) Improve NNThroughputBenchmark test mkdirs

2022-04-17 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16389.

Fix Version/s: 3.4.0
   Resolution: Fixed

> Improve NNThroughputBenchmark test mkdirs
> -
>
> Key: HDFS-16389
> URL: https://issues.apache.org/jira/browse/HDFS-16389
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: benchmarks, namenode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> When using the NNThroughputBenchmark test to create a large number of 
> directories, some abnormal messages are printed.
> Here is the command:
> ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs 
> hdfs:// -op mkdirs -threads 30 -dirs 500
> There are some exceptions here, such as:
> 21/12/20 10:25:00 INFO namenode.NNThroughputBenchmark: Starting benchmark: 
> mkdirs
> 21/12/20 10:25:01 INFO namenode.NNThroughputBenchmark: Generate 500 
> inputs for mkdirs
> 21/12/20 10:25:08 ERROR namenode.NNThroughputBenchmark: 
> java.lang.ArrayIndexOutOfBoundsException: 20
>   at 
> org.apache.hadoop.hdfs.server.namenode.FileNameGenerator.getNextDirName(FileNameGenerator.java:65)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FileNameGenerator.getNextFileName(FileNameGenerator.java:73)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$MkdirsStats.generateInputs(NNThroughputBenchmark.java:668)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550)
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 20
>   at 
> org.apache.hadoop.hdfs.server.namenode.FileNameGenerator.getNextDirName(FileNameGenerator.java:65)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FileNameGenerator.getNextFileName(FileNameGenerator.java:73)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$MkdirsStats.generateInputs(NNThroughputBenchmark.java:668)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550)
> These messages appear because some parameters, such as dirsPerDir or 
> filesPerDir, are set incorrectly.
> Seeing this log leaves users with questions.
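The ArrayIndexOutOfBoundsException above is a symptom of asking the name generator for more entries than the configured tree shape can hold. The sketch below is a simplified, hypothetical illustration of that capacity arithmetic, not the real FileNameGenerator:

```java
// Simplified illustration (hypothetical, NOT the real FileNameGenerator):
// a balanced name tree with a fixed fan-out and depth can only yield
// fanOut^depth distinct names; requesting more than that walks an index
// past the end of the per-level counters, surfacing as an
// ArrayIndexOutOfBoundsException in the benchmark.
public class NameCapacitySketch {

    // Maximum number of leaf names a balanced tree of this shape can hold.
    public static long capacity(int fanOut, int depth) {
        long total = 1;
        for (int i = 0; i < depth; i++) {
            total *= fanOut;
        }
        return total;
    }

    // Fail fast with an explanatory message instead of letting the
    // generator run off the end of its counters.
    public static void checkRequest(long requested, int fanOut, int depth) {
        long max = capacity(fanOut, depth);
        if (requested > max) {
            throw new IllegalArgumentException("Requested " + requested
                + " names, but fanOut=" + fanOut + " and depth=" + depth
                + " only allow " + max);
        }
    }
}
```

A check of this kind turns the opaque index error into an actionable message telling the operator which parameter to raise.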






[jira] [Resolved] (HDFS-16535) SlotReleaser should reuse the domain socket based on socket paths

2022-04-17 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16535.

Fix Version/s: 3.3.3
   3.4.0
   Resolution: Fixed

Merged. Thanks [~stigahuang] and [~leosun08]!

> SlotReleaser should reuse the domain socket based on socket paths
> -
>
> Key: HDFS-16535
> URL: https://issues.apache.org/jira/browse/HDFS-16535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.3.1, 3.4.0
>Reporter: Quanlong Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> HDFS-13639 improves the performance of short-circuit shm slot releasing by 
> reusing the domain socket that the client previously used to send release 
> requests to the DataNode.
> This works when only one DataNode is co-located with the client (true in 
> most production environments). However, if we launch multiple DataNodes on a 
> machine (usually for testing, e.g. Impala's end-to-end tests), the request 
> could be sent to the wrong DataNode. See an example in IMPALA-11234.
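The underlying idea of the fix can be sketched as keying the cached connection by socket path, so a slot-release request is only ever reused against the DataNode that owns that path. All names below are illustrative, not Hadoop's actual classes:

```java
// Hedged sketch of the fix's idea (illustrative names, NOT Hadoop's actual
// classes): cache one reusable connection per domain-socket path instead of
// a single cached connection, so a slot-release request is only ever reused
// against the DataNode that owns that socket path.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SocketCacheSketch<T> {

    // Stand-in for "open a domain socket at this filesystem path".
    public interface Factory<T> {
        T connect(String path);
    }

    private final Map<String, T> byPath = new ConcurrentHashMap<>();
    private final Factory<T> factory;

    public SocketCacheSketch(Factory<T> factory) {
        this.factory = factory;
    }

    // Reuse the connection previously opened to this exact path; dial a
    // new one only the first time a path is seen.
    public T get(String path) {
        return byPath.computeIfAbsent(path, factory::connect);
    }
}
```

computeIfAbsent keeps the lookup-or-dial step atomic, so two threads releasing slots to the same DataNode still share one connection.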






Re: Institutions running kerberized Hadoop clusters

2022-04-12 Thread Wei-Chiu Chuang
Last time I checked (~2 years ago), there were thousands of Kerberized
clusters among Cloudera customers.
The largest ones had a few thousand nodes.

What are you looking for?

On Wed, Apr 13, 2022 at 7:22 AM Santosh Marella  wrote:

> Hey folks,
>
>   Just curious if we have a list of institutions that are running
> *kerberized* Hadoop clusters? I noticed we have a PoweredBy Hadoop
>  page that
> lists all the institutions running Hadoop, but couldn't find something
> similar for kerberized Hadoop clusters. Appreciate any pointers on this.
>
> Thanks,
> Santosh
>


Re: [VOTE] Release Apache Hadoop 3.2.3 - RC0

2022-03-17 Thread Wei-Chiu Chuang
aarch64 support was only introduced in/after 3.3.0

On Thu, Mar 17, 2022 at 2:27 PM Emil Ejbyfeldt
 wrote:

> Hi,
>
>
> There is no aarch64 artifact in the release candidate. Is this something
> that is intended?
>
> Best,
> Emil Ejbyfeldt
>
> On 14/03/2022 08:14, Masatake Iwasaki wrote:
> > Hi all,
> >
> > Here's Hadoop 3.2.3 release candidate #0:
> >
> > The RC is available at:
> >https://home.apache.org/~iwasakims/hadoop-3.2.3-RC0/
> >
> > The RC tag is at:
> >https://github.com/apache/hadoop/releases/tag/release-3.2.3-RC0
> >
> > The Maven artifacts are staged at:
> >
> https://repository.apache.org/content/repositories/orgapachehadoop-1339
> >
> > You can find my public key at:
> >https://downloads.apache.org/hadoop/common/KEYS
> >
> > Please evaluate the RC and vote.
> >
> > Thanks,
> > Masatake Iwasaki
> >
> > -
> > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >
>
>


[jira] [Resolved] (HDFS-16502) Reconfigure Block Invalidate limit

2022-03-15 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16502.

Fix Version/s: 3.4.0
   3.3.3
   Resolution: Fixed

> Reconfigure Block Invalidate limit
> --
>
> Key: HDFS-16502
> URL: https://issues.apache.org/jira/browse/HDFS-16502
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Based on the cluster load, it would be helpful to be able to tune the block 
> invalidate limit (dfs.block.invalidate.limit). The only way to do this today 
> without restarting the Namenode is by reconfiguring the heartbeat interval:
> {code:java}
> Math.max(heartbeatInt*20, blockInvalidateLimit){code}
> This logic is not straightforward, and operators are usually not aware of it 
> (lack of documentation); also, updating the heartbeat interval is not 
> desirable in all cases.
> We should provide the ability to alter the block invalidate limit on a live 
> cluster, without affecting the heartbeat interval, to adjust load at the 
> Datanode level.
> We should also take this opportunity to move the (heartbeatInterval * 20) 
> computation logic into a common method.
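The coupling quoted above can be written out as a tiny sketch. The constant 20 and the max() come from the Jira text; the class and method names are hypothetical:

```java
// Tiny sketch of the coupling described in the Jira text (the constant 20
// and Math.max are from the report; class and method names are
// hypothetical): the effective invalidate limit silently tracks the
// heartbeat interval unless the configured limit is larger.
public class InvalidateLimitSketch {

    static final int INVALIDATE_LIMIT_PER_HEARTBEAT_SEC = 20;

    // Effective limit = max(heartbeatInterval * 20, configured limit).
    static int effectiveLimit(int heartbeatIntervalSec, int configuredLimit) {
        return Math.max(
            INVALIDATE_LIMIT_PER_HEARTBEAT_SEC * heartbeatIntervalSec,
            configuredLimit);
    }
}
```

The surprise for operators is the first branch: raising the heartbeat interval past configuredLimit / 20 silently raises the invalidate limit too, which is why an independent reconfiguration knob is preferable.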






[ANNOUNCE] New Hadoop PMC Sun Chao

2022-03-08 Thread Wei-Chiu Chuang
On behalf of the Apache Hadoop PMC, I am pleased to announce that Sun
Chao (sunchao) has accepted the PMC's invitation to become a PMC member on
the project. We appreciate all of Sun's generous contributions thus far
and look forward to his continued involvement.

Congratulations and welcome, Sun!


Re: [ANNOUNCE] Apache Hadoop 3.3.2 release

2022-03-03 Thread Wei-Chiu Chuang
Thanks a lot for the tremendous work!

On Fri, Mar 4, 2022 at 9:30 AM Chao Sun  wrote:

> Hi All,
>
> It gives me great pleasure to announce that the Apache Hadoop community has
> voted to release Apache Hadoop 3.3.2.
>
> This is the second stable release of the Apache Hadoop 3.3 line. It contains
> 284 bug fixes, improvements and enhancements since 3.3.1.
>
> Users are encouraged to read the overview of major changes [1] since 3.3.1.
> For details of 284 bug fixes, improvements, and other enhancements since
> the previous 3.3.1 release, please check release notes [2] and changelog
> [3].
>
> [1]: https://hadoop.apache.org/docs/r3.3.2/index.html
> [2]:
>
> http://hadoop.apache.org/docs/r3.3.2/hadoop-project-dist/hadoop-common/release/3.3.2/RELEASENOTES.3.3.2.html
> [3]:
>
> http://hadoop.apache.org/docs/r3.3.2/hadoop-project-dist/hadoop-common/release/3.3.2/CHANGELOG.3.3.2.html
>
> Many thanks to everyone who contributed to the release, and everyone in the
> Apache Hadoop community! This release is a direct result of your great
> contributions.
>
> Many thanks to everyone who helped in this release process!
>
> Many thanks to Viraj Jasani, Michael Stack, Masatake Iwasaki, Xiaoqiao He,
> Mukund Madhav Thakur, Wei-Chiu Chuang, Steve Loughran, Akira Ajisaka and
> other folks who helped for this release process.
>
> Best Regards,
> Chao
>


[jira] [Resolved] (HDFS-16422) Fix thread safety of EC decoding during concurrent preads

2022-02-10 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16422.

Fix Version/s: 3.4.0
   3.2.3
   3.3.3
   Resolution: Fixed

Thanks [~cndaimin] for the great finding!

> Fix thread safety of EC decoding during concurrent preads
> -
>
> Key: HDFS-16422
> URL: https://issues.apache.org/jira/browse/HDFS-16422
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient, ec, erasure-coding
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.3
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Reading data from an erasure-coded file with missing replicas (internal 
> blocks of a block group) causes online reconstruction: the dataUnits part of 
> the data is read and decoded into the missing target data. Each 
> DFSStripedInputStream object has a RawErasureDecoder object, and when we do 
> preads concurrently, RawErasureDecoder.decode is invoked concurrently too. 
> RawErasureDecoder.decode is not thread safe; as a result, we occasionally 
> get wrong data from pread.
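One straightforward remedy for a shared, non-thread-safe decoder is to serialize decode calls on a per-stream lock. The sketch below only illustrates that idea with an invented interface; it is not Hadoop's RawErasureDecoder API, and the committed fix may differ:

```java
// Hedged sketch of one possible remedy (illustrative interface, NOT
// Hadoop's RawErasureDecoder API, and not necessarily the committed fix):
// if the shared decoder is not thread safe, serialize decode calls on a
// per-stream lock so concurrent preads cannot corrupt each other's output.
public class StripedReadSketch {

    public interface Decoder {
        void decode(byte[][] inputs, byte[][] outputs);
    }

    private final Decoder decoder;
    private final Object decodeLock = new Object();

    public StripedReadSketch(Decoder decoder) {
        this.decoder = decoder;
    }

    // Concurrent preads may all reach reconstruction at once; the lock
    // guarantees only one decode runs against the shared decoder at a time.
    public void reconstruct(byte[][] inputs, byte[][] outputs) {
        synchronized (decodeLock) {
            decoder.decode(inputs, outputs);
        }
    }
}
```

An alternative with better parallelism would be one decoder per calling thread (e.g. via ThreadLocal), trading memory for lock-free decoding.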






[jira] [Resolved] (HDFS-16437) ReverseXML processor doesn't accept XML files without the SnapshotDiffSection.

2022-02-05 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16437.

Fix Version/s: 3.4.0
   Resolution: Fixed

Merged. Thanks [~it_singer] for the contribution!

> ReverseXML processor doesn't accept XML files without the SnapshotDiffSection.
> --
>
> Key: HDFS-16437
> URL: https://issues.apache.org/jira/browse/HDFS-16437
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.1, 3.3.0
>Reporter: yanbin.zhang
>Assignee: yanbin.zhang
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> In a cluster without snapshots, converting the generated XML back to an 
> fsimage reports an error.
> {code:java}
> // code placeholder
> [test@test001 ~]$ hdfs oiv -p ReverseXML -i fsimage_0257220.xml 
> -o fsimage_0257220
> OfflineImageReconstructor failed: FSImage XML ended prematurely, without 
> including section(s) SnapshotDiffSection
> java.io.IOException: FSImage XML ended prematurely, without including 
> section(s) SnapshotDiffSection
>         at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.processXml(OfflineImageReconstructor.java:1765)
>         at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.run(OfflineImageReconstructor.java:1842)
>         at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewerPB.run(OfflineImageViewerPB.java:211)
>         at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewerPB.main(OfflineImageViewerPB.java:149)
> 22/01/25 15:56:52 INFO util.ExitUtil: Exiting with status 1: ExitException 
> {code}
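The idea behind a fix can be sketched as making snapshot-related sections optional in the end-of-input validation. Names below are hypothetical, not the real OfflineImageReconstructor:

```java
// Hedged sketch of the validation idea (hypothetical names, NOT the real
// OfflineImageReconstructor): required sections must all appear, while
// snapshot-related sections are optional, because images from clusters that
// never used snapshots legitimately omit them.
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SectionCheckSketch {

    static final Set<String> OPTIONAL_SECTIONS =
        new HashSet<>(Arrays.asList("SnapshotSection", "SnapshotDiffSection"));

    // Throws only when a non-optional section is missing from the XML.
    static void verify(Set<String> expected, Set<String> seen) {
        Set<String> missing = new HashSet<>(expected);
        missing.removeAll(seen);
        missing.removeAll(OPTIONAL_SECTIONS); // absent snapshot sections are fine
        if (!missing.isEmpty()) {
            throw new IllegalStateException(
                "FSImage XML ended prematurely, without section(s) " + missing);
        }
    }
}
```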






[jira] [Resolved] (HDFS-16423) balancer should not get blocks on stale storages

2022-01-25 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16423.

Fix Version/s: 3.3.3
   Resolution: Fixed

> balancer should not get blocks on stale storages
> 
>
> Key: HDFS-16423
> URL: https://issues.apache.org/jira/browse/HDFS-16423
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: qinyuren
>Assignee: qinyuren
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
> Attachments: image-2022-01-13-17-18-32-409.png
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> We have met a problem as described in HDFS-16420.
> We found that the balancer copied a block multiple times without deleting the 
> source block if the block was placed on a stale storage, resulting in a block 
> with many copies; these redundant copies are not deleted until the storage is 
> no longer stale.
>  
> !image-2022-01-13-17-18-32-409.png|width=657,height=275!






[jira] [Resolved] (HDFS-16403) Improve FUSE IO performance by supporting FUSE parameter max_background

2022-01-24 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16403.

Resolution: Fixed

Thank you [~cndaimin] for the great work and the excellent performance test!
Thanks [~pifta] for the code review.

> Improve FUSE IO performance by supporting FUSE parameter max_background
> ---
>
> Key: HDFS-16403
> URL: https://issues.apache.org/jira/browse/HDFS-16403
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fuse-dfs
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.3
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> When we examined the FUSE IO performance on HDFS, we found that the number of 
> simultaneous IO requests is limited to a fixed number, like 12. This 
> limitation makes IO performance on the FUSE client quite unacceptable. We did 
> some research on this, and as the article [Performance and Resource 
> Utilization of FUSE User-Space File 
> Systems|https://dl.acm.org/doi/fullHtml/10.1145/3310148] makes clear, the FUSE 
> parameter '{{{}max_background{}}}' determines the number of simultaneous IO 
> requests, which is 12 by default.
> We add 'max_background' to the fuse_dfs mount options; the FUSE kernel takes 
> it into account when an option value is given.






[jira] [Reopened] (HDFS-16423) balancer should not get blocks on stale storages

2022-01-24 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reopened HDFS-16423:


Reopen to backport this to lower branches.

> balancer should not get blocks on stale storages
> 
>
> Key: HDFS-16423
> URL: https://issues.apache.org/jira/browse/HDFS-16423
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: qinyuren
>Assignee: qinyuren
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: image-2022-01-13-17-18-32-409.png
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> We have met a problem as described in HDFS-16420.
> We found that the balancer copied a block multiple times without deleting the 
> source block if the block was placed on a stale storage, resulting in a block 
> with many copies; these redundant copies are not deleted until the storage is 
> no longer stale.
>  
> !image-2022-01-13-17-18-32-409.png|width=657,height=275!






Re: [VOTE] Release Apache Hadoop 3.3.2 - RC2

2022-01-20 Thread Wei-Chiu Chuang
I'll find time to check out the RC bits.
I just feel bad that the tarball is now more than 600MB in size.

On Fri, Jan 21, 2022 at 2:23 AM Steve Loughran 
wrote:

> *+1 binding.*
>
> reviewed binaries, source, artifacts in the staging maven repository in
> downstream builds. all good.
>
> *## test run*
>
> checked out the asf github repo at commit 6da346a358c into a location
> already set up with aws and azure test credentials
>
> ran the hadoop-aws tests with -Dparallel-tests -DtestsThreadCount=6
>  -Dmarkers=delete -Dscale
> and hadoop-azure against azure cardiff with -Dparallel-tests=abfs
> -DtestsThreadCount=6
>
> all happy
>
>
>
> *## binary*
> downloaded KEYS and imported, so adding your key to my list (also signed
> this and updated the key servers)
>
> downloaded rc tar and verified
> ```
> > gpg2 --verify hadoop-3.3.2.tar.gz.asc hadoop-3.3.2.tar.gz
> gpg: Signature made Sat Jan 15 23:41:10 2022 GMT
> gpg:using RSA key DE7FA241EB298D027C97B2A1D8F1A97BE51ECA98
> gpg: Good signature from "Chao Sun (CODE SIGNING KEY)  >"
> [full]
>
>
> > cat hadoop-3.3.2.tar.gz.sha512
> SHA512 (hadoop-3.3.2.tar.gz) =
>
> cdd3d9298ba7d6e63ed63f93c159729ea14d2b7d5e3a0640b1761c86c7714a721f88bdfa8cb1d8d3da316f616e4f0ceaace4f32845ee4441e6aaa7a12b8c647d
>
> > shasum -a 512 hadoop-3.3.2.tar.gz
>
> cdd3d9298ba7d6e63ed63f93c159729ea14d2b7d5e3a0640b1761c86c7714a721f88bdfa8cb1d8d3da316f616e4f0ceaace4f32845ee4441e6aaa7a12b8c647d
>  hadoop-3.3.2.tar.gz
> ```
>
>
> *# cloudstore against staged artifacts*
> ```
> cd ~/.m2/repository/org/apache/hadoop
> find . -name \*3.3.2\* -print | xargs rm -r
> ```
> ensures no local builds have tainted the repo.
>
> in cloudstore mvn build without tests
> ```
> mci -Pextra -Phadoop-3.3.2 -Psnapshots-and-staging
> ```
> this fetches all from asf staging
>
> ```
> Downloading from ASF Staging:
>
> https://repository.apache.org/content/groups/staging/org/apache/hadoop/hadoop-client/3.3.2/hadoop-client-3.3.2.pom
> Downloaded from ASF Staging:
>
> https://repository.apache.org/content/groups/staging/org/apache/hadoop/hadoop-client/3.3.2/hadoop-client-3.3.2.pom
> (11 kB at 20 kB/s)
> ```
> there's no tests there, but it did audit the download process. FWIW, that
> project has switched to logback, so I now have all hadoop imports excluding
> slf4j and log4j. it takes too much effort right now.
>
> build works.
>
> tested abfs and s3a storediags, all happy
>
>
>
>
> *### google GCS against staged artifacts*
>
> gcs is now java 11 only, so I had to switch JVMs here.
>
> had to add a snapshots and staging profile, after which I could build and
> test.
>
> ```
>  -Dhadoop.three.version=3.3.2 -Psnapshots-and-staging
> ```
> two test failures were related to auth failures where the tests were trying
> to raise exceptions but things failed differently
> ```
> [ERROR] Failures:
> [ERROR]
>
> GoogleHadoopFileSystemTest.eagerInitialization_fails_withInvalidCredentialsConfiguration:122
> unexpected exception type thrown; expected:
> but was:
> [ERROR]
>
> GoogleHadoopFileSystemTest.lazyInitialization_deleteCall_fails_withInvalidCredentialsConfiguration:100
> value of: throwable.getMessage()
> expected: Failed to create GCS FS
> but was : A JSON key file may not be specified at the same time as
> credentials via configuration.
>
> ```
>
> I'm not worried here.
>
> ran cloudstore's diagnostics against gcs.
>
> Nice to see they are now collecting IOStatistics on their input streams. we
> really need to get this collected through the parquet/orc libs and then
> through the query engines.
>
> ```
> > bin/hadoop jar $CLOUDSTORE storediag gs://stevel-london/
>
> ...
> 2022-01-20 17:52:47,447 [main] INFO  diag.StoreDiag
> (StoreDurationInfo.java:(56)) - Starting: Reading a file
> gs://stevel-london/dir-9cbfc774-76ff-49c0-b216-d7800369c3e1/file
> input stream summary: org.apache.hadoop.fs.FSDataInputStream@6cfd9a54:
> com.google.cloud.hadoop.fs.gcs.GoogleHadoopFSInputStream@78c1372d
> {counters=((stream_read_close_operations=1)
> (stream_read_seek_backward_operations=0) (stream_read_total_bytes=7)
> (stream_read_bytes=7) (stream_read_exceptions=0)
> (stream_read_seek_operations=0) (stream_read_seek_bytes_skipped=0)
> (stream_read_operations=3) (stream_read_bytes_backwards_on_seek=0)
> (stream_read_seek_forward_operations=0)
> (stream_read_operations_incomplete=1));
> gauges=();
> minimums=();
> maximums=();
> means=();
> }
> ...
> ```
>
> *### source*
>
> once I'd done builds and tests which fetched from staging, I did a local
> build and test
>
> repeated download/validate of source tarball, unzip/untar
>
> build with java11.
>
> I've not done the test run there, because that directory tree doesn't have
> the credentials, and this mornings run was good.
>
> altogether then: very happy. tests good, downstream libraries building and
> linking.
>
> On Wed, 19 Jan 2022 at 17:50, Chao Sun  wrote:
>
> > Hi all,
> >
> > I've put together Hadoop 3.3.2 RC2 below:
> >
> > The RC is available at:
> > 

Re: [DISCUSS] Migrate hadoop from log4j1 to log4j2

2022-01-20 Thread Wei-Chiu Chuang
+1 I think it makes sense to use reload4j in maint releases.
I have a draft PR doing this (https://github.com/apache/hadoop/pull/3906)

log4j2 in Hadoop 3.4.0 makes sense to me. There could be incompatibilities
introduced by log4j2, but I feel we should at least make 3.4.0 a
"preview" release and try to address the incompatibilities in later versions
(e.g. 3.4.1).

On Fri, Jan 21, 2022 at 8:42 AM Duo Zhang  wrote:

> For maintenance release line I also support we switch to reload4j to
> address the security issues first. We could file an issue for it.
>
> Andrew Purtell 于2022年1月21日 周五01:15写道:
>
> > Just to clarify: I think you want to upgrade to Log4J2 (or switch to
> > LogBack) as a strategy for new releases, but you have the option in
> > maintenance releases to use Reload4J to maintain Appender API and
> > operational compatibility, and users who want to minimize risks in
> > production while mitigating the security issues will prefer that.
> >
> > > On Jan 20, 2022, at 8:59 AM, Andrew Purtell 
> > wrote:
> > >
> > > Reload4J has fixed all of those CVEs without requiring an upgrade.
> > >
> > >> On Jan 20, 2022, at 5:56 AM, Duo Zhang  wrote:
> > >>
> > >> There are 3 new CVEs for log4j1 reported recently[1][2][3]. So I
> think
> > it
> > >> is time to speed up the migration to log4j2 work[4] now.
> > >>
> > >> You can see the discussion on the jira issue[4], our goal is to fully
> > >> migrate to log4j2 and the current most blocking issue is lack of the
> > >> "log4j.rootLogger=INFO,Console" grammar support for log4j2. I've
> already
> > >> started a discussion thread on the log4j dev mailing list[5] and the
> > result
> > >> is optimistic and I've filed an issue for log4j2[6], but I do not
> think
> > it
> > >> could be addressed and released soon. If we want to fully migrate to
> > >> log4j2, then either we introduce new environment variables or split
> the
> > old
> > >> HADOOP_ROOT_LOGGER variable in the startup scripts. And considering
> the
> > >> complexity of our current startup scripts, the work is not easy and it
> > will
> > >> also break lots of other hadoop deployment systems if they do not use
> > our
> > >> startup scripts...
> > >>
> > >> So after reconsidering the current situation, I prefer we use the
> > log4j1.2
> > >> bridge to remove the log4j1 dependency first, and once LOG4J2-3341 is
> > >> addressed and released, we start to fully migrate to log4j2. Of course
> > we
> > >> have other problems for log4j1.2 bridge too, as we have
> TaskLogAppender,
> > >> ContainerLogAppender and ContainerRollingLogAppender which inherit
> > >> FileAppender and RollingFileAppender in log4j1, which are not part of
> > the
> > >> log4j1.2 bridge. But anyway, at least we could just copy the source
> > code to
> > >> hadoop as we have WriteAppender in log4j1.2 bridge, and these two
> > classes
> > >> do not have related CVEs.
> > >>
> > >> Thoughts? For me I would like us to make a new 3.4.x release line to
> > remove
> > >> the log4j1 dependencies ASAP.
> > >>
> > >> Thanks.
> > >>
> > >> 1. https://nvd.nist.gov/vuln/detail/CVE-2022-23302
> > >> 2. https://nvd.nist.gov/vuln/detail/CVE-2022-23305
> > >> 3. https://nvd.nist.gov/vuln/detail/CVE-2022-23307
> > >> 4. https://issues.apache.org/jira/browse/HADOOP-16206
> > >> 5. https://lists.apache.org/thread/gvfb3jkg6t11cyds4jmpo7lrswmx28w3
> > >> 6. https://issues.apache.org/jira/browse/LOG4J2-3341
> >
>


Re: Hadoop-3.2.3 Release Update

2022-01-11 Thread Wei-Chiu Chuang
Is this still making progress?

On Tue, Oct 5, 2021 at 8:45 PM Brahma Reddy Battula 
wrote:

> Hi Akira,
>
> Thanks for your email!!
>
> I am evaluating the CVEs which need to go into this release.
>
> Will update soon!!
>
>
> On Tue, 5 Oct 2021 at 1:46 PM, Akira Ajisaka  wrote:
>
> > Hi Brahma,
> >
> > What is the release process going on? Is there any blocker for the RC?
> >
> > -Akira
> >
> > On Wed, Sep 22, 2021 at 7:37 PM Xiaoqiao He  wrote:
> >
> > > Hi Brahma,
> > >
> > > The feature 'BPServiceActor processes commands from NameNode
> > > asynchronously' is ready for both branch-3.2 and branch-3.2.3. While
> > > cherry-picking there was only a minor conflict, so I checked it in
> > > directly. BTW, I ran some unit tests and built a pseudo cluster to
> > > verify; it seems to work fine.
> > > FYI.
> > >
> > > Regards,
> > > - He Xiaoqiao
> > >
> > > On Thu, Sep 16, 2021 at 10:52 PM Brahma Reddy Battula <
> bra...@apache.org
> > >
> > > wrote:
> > >
> > >> Please go ahead. Let me know any help required on review.
> > >>
> > >> On Tue, Sep 14, 2021 at 6:57 PM Xiaoqiao He 
> > wrote:
> > >>
> > >>> Hi Brahma,
> > >>>
> > >>> I plan to involve HDFS-14997 and related JIRAs if possible. I have
> > >>> resolved the conflict and verified them locally.
> > >>> It will include: HDFS-14997 HDFS-15075 HDFS-15651 HDFS-15113.
> > >>> I would like to hear some more response that if we have enough time
> to
> > >>> wait for it to be ready.
> > >>> Thanks.
> > >>>
> > >>> Best Regards,
> > >>> - He Xiaoqiao
> > >>>
> > >>> On Tue, Sep 14, 2021 at 3:39 PM Xiaoqiao He 
> > wrote:
> > >>>
> >  Hi Brahma, HDFS-15160 has checked in branch-3.2 & branch-3.2.3. FYI.
> > 
> >  On Tue, Sep 14, 2021 at 3:52 AM Brahma Reddy Battula <
> > bra...@apache.org>
> >  wrote:
> > 
> > > Hi All,
> > >
> > > Waiting for the following jira to commit to hadoop-3.2.3 , mostly
> > this
> > > can
> > > be done by this week,then I will try to create the RC next if there
> > is
> > > no
> > > objection.
> > >
> > > https://issues.apache.org/jira/browse/HDFS-15160
> > >
> > >
> > >
> > > On Mon, Aug 16, 2021 at 2:22 PM Brahma Reddy Battula <
> > > bra...@apache.org>
> > > wrote:
> > >
> > > > @Akira Ajisaka   and @Masatake Iwasaki
> > > > 
> > > > Looks all are build related issues when you try with bigtop. We
> can
> > > > discuss and prioritize this.. Will connect with you guys.
> > > >
> > > > On Mon, Aug 16, 2021 at 1:43 PM Masatake Iwasaki <
> > > > iwasak...@oss.nttdata.co.jp> wrote:
> > > >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch2-exclude-spotbugs-annotations.diff
> > > >> >
> > > >> > This is for building hadoop-3.2.2 against zookeeper-3.4.14.
> > > >> > we do not see the issue usually since branch-3.2 uses
> > > zooekeper-3.4.13,
> > > >> > while it would be harmless to add the exclusion even for
> > > >> zooekeeper-3.4.13.
> > > >>
> > > >> I filed HADOOP-17849 for this.
> > > >>
> > > >> On 2021/08/16 12:02, Masatake Iwasaki wrote:
> > > >> > Thanks for bringing this up, Akira. Let me explain some
> > > background.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch2-exclude-spotbugs-annotations.diff
> > > >> >
> > > >> > This is for building hadoop-3.2.2 against zookeeper-3.4.14.
> > > >> > we do not see the issue usually since branch-3.2 uses
> > > zooekeper-3.4.13,
> > > >> > while it would be harmless to add the exclusion even for
> > > >> zooekeeper-3.4.13.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch3-fix-broken-dir-detection.diff
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch5-fix-kms-shellprofile.diff
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch6-fix-httpfs-sh.diff
> > > >> >
> > > >> > These are relevant to directory structure used by Bigtop
> > package.
> > > >> > If the fix does not break the tarball dist,
> > > >> > it would be nice to have these on Hadoop too.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch7-remove-phantomjs-in-yarn-ui.diff
> > > >> >
> > > >> > This is for aarch64 and ppe64le lacking required phantomjs.
> > > >> > It is only acceptable for Bigtop not running tests of YARN-UI2
> > on
> > > >> packaging.
> > > >> > Hadoop needs the phantomjs for testing YARN-UI2.
> > > >> >
> > 

Re: Next Mandarin Hadoop Online Meetup Jan 6th.

2022-01-09 Thread Wei-Chiu Chuang
Hello

Thanks for joining this event.

The presentation slides (in English) are available at
https://drive.google.com/file/d/1PiZYhzxANqtoyO_nSLt_-v7aP3j17Sbg/view

The recording (in Mandarin) is available at
https://cloudera.zoom.us/rec/share/JaNm70lZQGCZdlFzh9ZbsfrR7MJ7Nazb2g6NCtYPqsRLWtyEhLfgwXOppzMR3csp.HqRJNGXUGSaPu1qw
Access Passcode: 4g1ZF&%f


On Mon, Jan 3, 2022 at 5:39 PM Wei-Chiu Chuang  wrote:

> Hello community,
>
> This week we're going to have Tao Li (tomscut) speaking about the
> experience of operating HDFS at BIGO. See you on Thursday!
>
> Title: "HDFS Practice at BIGO"
> Abstract: As the underlying storage service for big data, HDFS has played a very important role in BIGO's growth. With business development and the explosive growth of data, the bottlenecks of a single HDFS cluster have become increasingly apparent. With the help of Router, we consolidated multiple HDFS clusters into a single Namespace to improve cluster scalability; we modified Router to support Alluxio and custom policies, and enabled HDFS
> EC to implement tiered storage of hot, warm, and cold data. We also improved HDFS read/write performance by handling slow nodes and slow disks. This talk covers BIGO's practical experience with Router and with handling slow nodes and slow disks.
> Keywords: Router, Slow Node, Slow Disk
> Speaker: Tao Li (Apache id: tomscut)
>
> Date/Time: Jan 6 2PM Beijing Time.
>
> Zoom link: https://cloudera.zoom.us/j/97264903288
>
> One tap mobile
>
> +16465588656,,880548968# US (New York)
>
> +17207072699,,880548968# US
>
> Download Center <https://cloudera.zoom.us/j/880548968>
>
> Dial by your location
>
> +1 646 558 8656 US (New York)
>
> +1 720 707 2699 US
>
> 877 853 5257 US Toll-free
>
> 888 475 4499 US Toll-free
>
> Meeting ID: 972 6490 3288
> Find your local number: https://zoom.us/u/acaGRDfMVl
>


Re: Next Mandarin Hadoop Online Meetup Jan 6th.

2022-01-05 Thread Wei-Chiu Chuang
Just a gentle reminder this is happening now.

On Mon, Jan 3, 2022 at 5:39 PM Wei-Chiu Chuang  wrote:

> Hello community,
>
> This week we're going to have Tao Li (tomscut) speaking about the
> experience of operating HDFS at BIGO. See you on Thursday!
>
> Title: "HDFS Practice at BIGO"
> Abstract: As the underlying storage service for big data, HDFS has played a very important role in BIGO's growth. With business development and the explosive growth of data, the bottlenecks of a single HDFS cluster have become increasingly apparent. With the help of Router, we consolidated multiple HDFS clusters into a single Namespace to improve cluster scalability; we modified Router to support Alluxio and custom policies, and enabled HDFS
> EC to implement tiered storage of hot, warm, and cold data. We also improved HDFS read/write performance by handling slow nodes and slow disks. This talk covers BIGO's practical experience with Router and with handling slow nodes and slow disks.
> Keywords: Router, Slow Node, Slow Disk
> Speaker: Tao Li (Apache id: tomscut)
>
> Date/Time: Jan 6 2PM Beijing Time.
>
> Zoom link: https://cloudera.zoom.us/j/97264903288
>
> One tap mobile
>
> +16465588656,,880548968# US (New York)
>
> +17207072699,,880548968# US
>
> Download Center <https://cloudera.zoom.us/j/880548968>
>
> Dial by your location
>
> +1 646 558 8656 US (New York)
>
> +1 720 707 2699 US
>
> 877 853 5257 US Toll-free
>
> 888 475 4499 US Toll-free
>
> Meeting ID: 972 6490 3288
> Find your local number: https://zoom.us/u/acaGRDfMVl
>


Next Mandarin Hadoop Online Meetup Jan 6th.

2022-01-03 Thread Wei-Chiu Chuang
Hello community,

This week we're going to have Tao Li (tomscut) speaking about the
experience of operating HDFS at BIGO. See you on Thursday!

Title: "HDFS Practice at BIGO"
Abstract: As the underlying storage service for big data, HDFS has played a very important role in BIGO's growth. With business development and the explosive growth of data, the bottlenecks of a single HDFS cluster have become increasingly apparent. With the help of Router, we consolidated multiple HDFS clusters into a single Namespace to improve cluster scalability; we modified Router to support Alluxio and custom policies, and enabled HDFS
EC to implement tiered storage of hot, warm, and cold data. We also improved HDFS read/write performance by handling slow nodes and slow disks. This talk covers BIGO's practical experience with Router and with handling slow nodes and slow disks.
Keywords: Router, Slow Node, Slow Disk
Speaker: Tao Li (Apache id: tomscut)

Date/Time: Jan 6 2PM Beijing Time.

Zoom link: https://cloudera.zoom.us/j/97264903288

One tap mobile

+16465588656,,880548968# US (New York)

+17207072699,,880548968# US

Download Center 

Dial by your location

+1 646 558 8656 US (New York)

+1 720 707 2699 US

877 853 5257 US Toll-free

888 475 4499 US Toll-free

Meeting ID: 972 6490 3288
Find your local number: https://zoom.us/u/acaGRDfMVl


[jira] [Resolved] (HDFS-16317) Backport HDFS-14729 for branch-3.2

2021-12-21 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16317.

Fix Version/s: 3.2.3
   Resolution: Fixed

Merged the commit into branch-3.2 and branch-3.2.3.

> Backport HDFS-14729 for branch-3.2
> --
>
> Key: HDFS-16317
> URL: https://issues.apache.org/jira/browse/HDFS-16317
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.2.2
>Reporter: Ananya Singh
>Assignee: Ananya Singh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.3
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Our security tool raised the following security flaws on Hadoop 3.2.2:
> CVE-2015-9251: https://nvd.nist.gov/vuln/detail/CVE-2015-9251
> CVE-2019-11358: https://nvd.nist.gov/vuln/detail/CVE-2019-11358
> CVE-2020-11022: https://nvd.nist.gov/vuln/detail/CVE-2020-11022
> CVE-2020-11023: https://nvd.nist.gov/vuln/detail/CVE-2020-11023



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Apache Hadoop and CVE-2021-44228 Log4JShell vulnerability

2021-12-19 Thread Wei-Chiu Chuang
Hi,
Given the widespread attention to the recent log4j vulnerability
(CVE-2021-44228), I'd like to share an update from the Hadoop developer
community regarding the incident.

As you probably know, Apache Hadoop depends on the log4j library to keep
log files. The highlighted vulnerability CVE-2021-44228 affects log4j2
2.0-beta9 through 2.15.0. Hadoop has been using log4j 1.2.x in the last 10
years and therefore no release is affected by it.

That said, another CVE, CVE-2021-4104, states that the JMSAppender in log4j
1.2.x, which is used by Apache Hadoop, is vulnerable to the same attack.
Fortunately, it is not configured by default, and Hadoop does not enable it
by default.

For more information and mitigation, please check out Hadoop's CVE list
page.
https://hadoop.apache.org/cve_list.html

Wei-Chiu


[jira] [Reopened] (HDFS-16384) Upgrade Netty to 4.1.72.Final

2021-12-16 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reopened HDFS-16384:


> Upgrade Netty to 4.1.72.Final
> -
>
> Key: HDFS-16384
> URL: https://issues.apache.org/jira/browse/HDFS-16384
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.3.1
>Reporter: Tamas Penzes
>Assignee: Tamas Penzes
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> New fixes for Netty; nothing else changed, just the Netty version bumped and two
> more exclusions in hdfs-client because of the new Netty.
> No new tests added, as none are needed.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)




Trunk broken by HDFS-16384

2021-12-16 Thread Wei-Chiu Chuang
My bad. There was a transitive dependency problem in the PR causing trunk
to fail the build.

The commit has since been reverted.

Sorry for the inconvenience.


Re: [VOTE] Release Apache Hadoop 3.3.2 - RC0

2021-12-13 Thread Wei-Chiu Chuang
Thanks a lot for pushing it forward!

A few things I noticed that we should incorporate:

1. the overview page of the doc is for the Hadoop 3.0 release. It would be
best to base the doc on top of Hadoop 3.3.0 overview page.
(it's a miss on my part... The overview page of 3.3.1 wasn't updated)

2. ARM binaries are not included.
For the 3.3.1 release, I had to run the create release script on an ARM
machine separately to create the binary tarball.

3. the jdiff version
https://github.com/apache/hadoop/blob/branch-3.3.2/hadoop-project-dist/pom.xml#L137

I am not sure exactly what this is used for but I think it should be
updated to 3.3.2 (or 3.3.1)
(it was updated in trunk but i forgot to update the branch-3.3)

The 3.3.1 binary tarball is 577mb. The 3.3.2 RC0 is 608mb. I'm curious what
was added.



On Fri, Dec 10, 2021 at 10:09 AM Chao Sun  wrote:

> Hi all,
>
> Sorry for the long delay. I've prepared RC0 for Hadoop 3.3.2 below:
>
> The RC is available at:
> http://people.apache.org/~sunchao/hadoop-3.3.2-RC0/
> The RC tag is at:
> https://github.com/apache/hadoop/releases/tag/release-3.3.2-RC0
> The Maven artifacts are staged at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1330/
>
> You can find my public key at: https://people.apache.org/~sunchao/KEYS
>
> Please evaluate the RC and vote.
>
> Thanks,
> Chao
>


Re: Next Hadoop community online meetup (Mandarin)

2021-12-09 Thread Wei-Chiu Chuang
Thanks for participating in today's session. We have nearly a hundred
participants so I am quite impressed!

Yiyang graciously agreed to share the presentation slides, which can be
downloaded from here:
https://drive.google.com/file/d/1aAplPKU2frkKZoLeeHxbAzxLeJcL3bsf/view?usp=sharing
(it's in English so the wider audience can read)

Please find today's recording here:

https://cloudera.zoom.us/rec/share/MOwRDPP50eL7gRfNazfjqAXdl_7HwAaPH460sONmmlx_tLQrftVjjjCBkVvE-9sf.GRb9Kp_16xOqgzNu
Passcode:
L^n30&%d

On Thu, Dec 9, 2021 at 2:10 PM Wei-Chiu Chuang  wrote:

> We are having a little technical difficulties.
> For those who are late, we are using this link instead:
> https://cloudera.zoom.us/j/93901387554
>
> On Thu, Dec 9, 2021 at 12:55 PM Takanobu Asanuma <
> takanobu.asan...@gmail.com> wrote:
>
>> Thanks for the information!
>>
>> On Thu, Dec 9, 2021 at 1:52 PM Wei-Chiu Chuang wrote:
>>
>>> Hello, this talk will start in about an hour.
>>>
>>> (I'm sorry, I forgot to add that this is 2PM Beijing time)
>>>
>>> On Thu, Dec 9, 2021 at 11:00 AM Takanobu Asanuma 
>>> wrote:
>>>
>>>> What is the timezone?
>>>>
>>>> On Mon, Nov 29, 2021 at 2:50 PM Wei-Chiu Chuang wrote:
>>>>
>>>>> Hello HDFS devs
>>>>>
>>>>> After a long hiatus, I'm happy to share that we'll have Yiyang from
>>>>> Shopee
>>>>> who is going to talk about their experience of using HDFS at Shopee.
>>>>>
>>>>> This is a Mandarin talk.
>>>>>
>>>>> Title: "The Evolution of HDFS at Shopee"
>>>>>
>>>>> Abstract: As the lowest-level storage architecture, HDFS has played an important supporting role in Shopee's business growth. As Shopee's business has expanded, the HDFS storage service has faced more and more challenges. This talk introduces how we addressed these challenges by drawing on community work, combined with internal requirements, to solve performance and stability problems.
>>>>> Keywords: Observer, RBF, slownode
>>>>> Speaker: Yiyang Zhou (Apache id: Symious)
>>>>>
>>>>> Thursday, December 9, 2:00 PM
>>>>>
>>>>> Zoom Meeting
>>>>>
>>>>> https://cloudera.zoom.us/j/880548968
>>>>>
>>>>> One tap mobile
>>>>>
>>>>> +16465588656,,880548968# US (New York)
>>>>>
>>>>> +17207072699,,880548968# US
>>>>>
>>>>> Download Center <https://cloudera.zoom.us/j/880548968>
>>>>>
>>>>> Dial by your location
>>>>>
>>>>> +1 646 558 8656 US (New York)
>>>>>
>>>>> +1 720 707 2699 US
>>>>>
>>>>> 877 853 5257 US Toll-free
>>>>>
>>>>> 888 475 4499 US Toll-free
>>>>>
>>>>> Meeting ID: 880 548 968
>>>>>
>>>>> Find your local number: https://zoom.us/u/acaGRDfMVl
>>>>>
>>>>


Re: Next Hadoop community online meetup (Mandarin)

2021-12-08 Thread Wei-Chiu Chuang
We are having a little technical difficulties.
For those who are late, we are using this link instead:
https://cloudera.zoom.us/j/93901387554

On Thu, Dec 9, 2021 at 12:55 PM Takanobu Asanuma 
wrote:

> Thanks for the information!
>
> On Thu, Dec 9, 2021 at 1:52 PM Wei-Chiu Chuang wrote:
>
>> Hello, this talk will start in about an hour.
>>
>> (I'm sorry, I forgot to add that this is 2PM Beijing time)
>>
>> On Thu, Dec 9, 2021 at 11:00 AM Takanobu Asanuma 
>> wrote:
>>
>>> What is the timezone?
>>>
>>> On Mon, Nov 29, 2021 at 2:50 PM Wei-Chiu Chuang wrote:
>>>
>>>> Hello HDFS devs
>>>>
>>>> After a long hiatus, I'm happy to share that we'll have Yiyang from
>>>> Shopee
>>>> who is going to talk about their experience of using HDFS at Shopee.
>>>>
>>>> This is a Mandarin talk.
>>>>
>>>> Title: "The Evolution of HDFS at Shopee"
>>>>
>>>> Abstract: As the lowest-level storage architecture, HDFS has played an important supporting role in Shopee's business growth. As Shopee's business has expanded, the HDFS storage service has faced more and more challenges. This talk introduces how we addressed these challenges by drawing on community work, combined with internal requirements, to solve performance and stability problems.
>>>> Keywords: Observer, RBF, slownode
>>>> Speaker: Yiyang Zhou (Apache id: Symious)
>>>>
>>>> Thursday, December 9, 2:00 PM
>>>>
>>>> Zoom Meeting
>>>>
>>>> https://cloudera.zoom.us/j/880548968
>>>>
>>>> One tap mobile
>>>>
>>>> +16465588656,,880548968# US (New York)
>>>>
>>>> +17207072699,,880548968# US
>>>>
>>>> Download Center <https://cloudera.zoom.us/j/880548968>
>>>>
>>>> Dial by your location
>>>>
>>>> +1 646 558 8656 US (New York)
>>>>
>>>> +1 720 707 2699 US
>>>>
>>>> 877 853 5257 US Toll-free
>>>>
>>>> 888 475 4499 US Toll-free
>>>>
>>>> Meeting ID: 880 548 968
>>>>
>>>> Find your local number: https://zoom.us/u/acaGRDfMVl
>>>>
>>>


Re: Next Hadoop community online meetup (Mandarin)

2021-12-08 Thread Wei-Chiu Chuang
Hello, this talk will start in about an hour.

(I'm sorry, I forgot to add that this is 2PM Beijing time)

On Thu, Dec 9, 2021 at 11:00 AM Takanobu Asanuma 
wrote:

> What is the timezone?
>
> On Mon, Nov 29, 2021 at 2:50 PM Wei-Chiu Chuang wrote:
>
>> Hello HDFS devs
>>
>> After a long hiatus, I'm happy to share that we'll have Yiyang from Shopee
>> who is going to talk about their experience of using HDFS at Shopee.
>>
>> This is a Mandarin talk.
>>
>> Title: "The Evolution of HDFS at Shopee"
>>
>> Abstract: As the lowest-level storage architecture, HDFS has played an important supporting role in Shopee's business growth. As Shopee's business has expanded, the HDFS storage service has faced more and more challenges. This talk introduces how we addressed these challenges by drawing on community work, combined with internal requirements, to solve performance and stability problems.
>> Keywords: Observer, RBF, slownode
>> Speaker: Yiyang Zhou (Apache id: Symious)
>>
>> Thursday, December 9, 2:00 PM
>>
>> Zoom Meeting
>>
>> https://cloudera.zoom.us/j/880548968
>>
>> One tap mobile
>>
>> +16465588656,,880548968# US (New York)
>>
>> +17207072699,,880548968# US
>>
>> Download Center <https://cloudera.zoom.us/j/880548968>
>>
>> Dial by your location
>>
>> +1 646 558 8656 US (New York)
>>
>> +1 720 707 2699 US
>>
>> 877 853 5257 US Toll-free
>>
>> 888 475 4499 US Toll-free
>>
>> Meeting ID: 880 548 968
>>
>> Find your local number: https://zoom.us/u/acaGRDfMVl
>>
>


Next Hadoop community online meetup (Mandarin)

2021-11-28 Thread Wei-Chiu Chuang
Hello HDFS devs

After a long hiatus, I'm happy to share that we'll have Yiyang from Shopee
who is going to talk about their experience of using HDFS at Shopee.

This is a Mandarin talk.

Title: "The Evolution of HDFS at Shopee"
Abstract: As the lowest-level storage architecture, HDFS has played an important supporting role in Shopee's business growth. As Shopee's business has expanded, the HDFS storage service has faced more and more challenges. This talk introduces how we addressed these challenges by drawing on community work, combined with internal requirements, to solve performance and stability problems.
Keywords: Observer, RBF, slownode
Speaker: Yiyang Zhou (Apache id: Symious)

Thursday, December 9, 2:00 PM

Zoom Meeting

https://cloudera.zoom.us/j/880548968

One tap mobile

+16465588656,,880548968# US (New York)

+17207072699,,880548968# US

Download Center 

Dial by your location

+1 646 558 8656 US (New York)

+1 720 707 2699 US

877 853 5257 US Toll-free

888 475 4499 US Toll-free

Meeting ID: 880 548 968

Find your local number: https://zoom.us/u/acaGRDfMVl


[jira] [Resolved] (HDFS-16337) Show start time of Datanode on Web

2021-11-22 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16337.

Fix Version/s: 3.4.0
   Resolution: Fixed

> Show start time of Datanode on Web
> --
>
> Key: HDFS-16337
> URL: https://issues.apache.org/jira/browse/HDFS-16337
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: image-2021-11-19-08-55-58-343.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Show _start time_ of Datanode on Web.
> !image-2021-11-19-08-55-58-343.png|width=540,height=155!
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)




[jira] [Resolved] (HDFS-16241) Standby close reconstruction thread

2021-10-11 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16241.

Fix Version/s: 3.3.2
   3.2.3
   3.4.0
   Resolution: Fixed

Thanks. Merged.

> Standby close reconstruction thread
> ---
>
> Key: HDFS-16241
> URL: https://issues.apache.org/jira/browse/HDFS-16241
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: zhanghuazong
>Assignee: zhanghuazong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2
>
> Attachments: HDFS-16241
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When the "Reconstruction Queue Initializer" thread of the active namenode has 
> not stopped, switch to standby namenode. The "Reconstruction Queue 
> Initializer" thread should be closed



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




ApacheCon@Home Big Data tracks recordings!

2021-10-11 Thread Wei-Chiu Chuang
For those who missed the live Apache@Home Big Data tracks, the video
recordings are being uploaded to the official ASF channel!

Big Data:
https://www.youtube.com/playlist?list=PLU2OcwpQkYCzXcumE9UxNirLF1IYLmARj
Big Data Ozone:
https://www.youtube.com/playlist?list=PLU2OcwpQkYCxtPdZ0nSowYLQMgkmoczMl
Big Data SQL/NoSQL:
https://www.youtube.com/playlist?list=PLU2OcwpQkYCwu-bpf3K-OIfAjHpf4kr4L
Big Data Streaming:
https://www.youtube.com/playlist?list=PLU2OcwpQkYCwf7Cl6xsCgHuIa8_NWX2JG

You can find other topics as well:
https://www.youtube.com/c/TheApacheFoundation/playlists

Thanks to all who presented. I saw multiple talks related to Hadoop:

* YARN Resource Management and Dynamic Max by Fang Liu, Fengguang Tian,
  Prashant Golash, Hanxiong Zhang, Shuyi Zhang
* Uber HDFS Unit Storage Cost 10x Deduction by Jeffrey Zhong, Jing Zhao,
Leon
  Gao
* Scaling the Namenode - Lessons learnt by Dinesh Chitlangia
* How Uber achieved millions of savings by managing disk IO across HDFS
  cluster by Leon Gao, Ekanth Sethuramalingam
* Containing an Elephant: How we moved Hadoop/HBase into Kubernetes and
Public
  Cloud by Dhiraj Hegde


You can also find the recordings for ApacheCon Asia (August 2021); some
of our community members who presented include:

* Bigtop 3.0: Rerising community driven Hadoop distribution by Kengo Seki,
  Masatake Iwasaki.
* Technical tips for secure Apache Hadoop cluster by Akira Ajisaka, Kei
KORI.
* Data Lake accelerator on Hadoop-COS in Tencent Cloud by Li Cheng.

I may have missed a few great talks as I glanced through the list, so
please let me know if you find other relevant talks in other tracks.

Cheers,
Wei-Chiu


Re: Hadoop-3.2.3 Release Update

2021-10-06 Thread Wei-Chiu Chuang
Hi, to raise awareness:
it looks like reverting FoldedTreeSet (HDFS-13671) breaks TestBlockManager
in branch-3.2. Branch-3.3 is good.

Tracking jira: HDFS-16258

On Tue, Oct 5, 2021 at 8:45 PM Brahma Reddy Battula 
wrote:

> Hi Akira,
>
> Thanks for your email!!
>
> I am evaluating the CVEs which need to go into this release..
>
> Will update soon!!
>
>
> On Tue, 5 Oct 2021 at 1:46 PM, Akira Ajisaka  wrote:
>
> > Hi Brahma,
> >
> > How is the release process going? Is there any blocker for the RC?
> >
> > -Akira
> >
> > On Wed, Sep 22, 2021 at 7:37 PM Xiaoqiao He  wrote:
> >
> > > Hi Brahma,
> > >
> > > The feature 'BPServiceActor processes commands from NameNode
> > > asynchronously' has been ready for both branch-3.2 and branch-3.2.3.
> > While
> > > cherry-picking there is only minor conflict, So I checked in directly.
> > BTW,
> > > run some unit tests and build pseudo cluster to verify, it seems to
> work
> > > fine.
> > > FYI.
> > >
> > > Regards,
> > > - He Xiaoqiao
> > >
> > > On Thu, Sep 16, 2021 at 10:52 PM Brahma Reddy Battula <
> bra...@apache.org
> > >
> > > wrote:
> > >
> > >> Please go ahead. Let me know any help required on review.
> > >>
> > >> On Tue, Sep 14, 2021 at 6:57 PM Xiaoqiao He 
> > wrote:
> > >>
> > >>> Hi Brahma,
> > >>>
> > >>> I plan to involve HDFS-14997 and related JIRAs if possible. I have
> > >>> resolved the conflict and verified them locally.
> > >>> It will include: HDFS-14997 HDFS-15075 HDFS-15651 HDFS-15113.
> > >>> I would like to hear some more response that if we have enough time
> to
> > >>> wait for it to be ready.
> > >>> Thanks.
> > >>>
> > >>> Best Regards,
> > >>> - He Xiaoqiao
> > >>>
> > >>> On Tue, Sep 14, 2021 at 3:39 PM Xiaoqiao He 
> > wrote:
> > >>>
> >  Hi Brahma, HDFS-15160 has checked in branch-3.2 & branch-3.2.3. FYI.
> > 
> >  On Tue, Sep 14, 2021 at 3:52 AM Brahma Reddy Battula <
> > bra...@apache.org>
> >  wrote:
> > 
> > > Hi All,
> > >
> > > Waiting for the following jira to commit to hadoop-3.2.3 , mostly
> > this
> > > can
> > > be done by this week,then I will try to create the RC next if there
> > is
> > > no
> > > objection.
> > >
> > > https://issues.apache.org/jira/browse/HDFS-15160
> > >
> > >
> > >
> > > On Mon, Aug 16, 2021 at 2:22 PM Brahma Reddy Battula <
> > > bra...@apache.org>
> > > wrote:
> > >
> > > > @Akira Ajisaka   and @Masatake Iwasaki
> > > > 
> > > > Looks all are build related issues when you try with bigtop. We
> can
> > > > discuss and prioritize this.. Will connect with you guys.
> > > >
> > > > On Mon, Aug 16, 2021 at 1:43 PM Masatake Iwasaki <
> > > > iwasak...@oss.nttdata.co.jp> wrote:
> > > >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch2-exclude-spotbugs-annotations.diff
> > > >> >
> > > >> > This is for building hadoop-3.2.2 against zookeeper-3.4.14.
> > > >> > we do not see the issue usually since branch-3.2 uses
> > > zooekeper-3.4.13,
> > > >> > while it would be harmless to add the exclusion even for
> > > >> zooekeeper-3.4.13.
> > > >>
> > > >> I filed HADOOP-17849 for this.
> > > >>
> > > >> On 2021/08/16 12:02, Masatake Iwasaki wrote:
> > > >> > Thanks for bringing this up, Akira. Let me explain some
> > > background.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch2-exclude-spotbugs-annotations.diff
> > > >> >
> > > >> > This is for building hadoop-3.2.2 against zookeeper-3.4.14.
> > > >> > we do not see the issue usually since branch-3.2 uses
> > > zooekeper-3.4.13,
> > > >> > while it would be harmless to add the exclusion even for
> > > >> zooekeeper-3.4.13.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch3-fix-broken-dir-detection.diff
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch5-fix-kms-shellprofile.diff
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch6-fix-httpfs-sh.diff
> > > >> >
> > > >> > These are relevant to directory structure used by Bigtop
> > package.
> > > >> > If the fix does not break the tarball dist,
> > > >> > it would be nice to have these on Hadoop too.
> > > >> >
> > > >> >
> > > >> >> -
> > > >>
> > >
> >
> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/patch7-remove-phantomjs-in-yarn-ui.diff
> > > >> >
> > > >> > This is for 

[jira] [Created] (HDFS-16258) HDFS-13671 breaks TestBlockManager in branch-3.2

2021-10-06 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDFS-16258:
--

 Summary: HDFS-13671 breaks TestBlockManager in branch-3.2
 Key: HDFS-16258
 URL: https://issues.apache.org/jira/browse/HDFS-16258
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.2.3
Reporter: Wei-Chiu Chuang


TestBlockManager in branch-3.2 has two failed tests: 
* testDeleteCorruptReplicaWithStatleStorages
* testBlockManagerMachinesArray

Looks like it was broken by HDFS-13671. CC: [~brahmareddy]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Resolved] (HDFS-16238) Improve comments related to EncryptionZoneManager

2021-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16238.

Fix Version/s: 3.4.0
   Resolution: Fixed

Thanks [~vjasani] [~hexiaoqiao]for the review!

> Improve comments related to EncryptionZoneManager
> -
>
> Key: HDFS-16238
> URL: https://issues.apache.org/jira/browse/HDFS-16238
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, encryption, namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> In EncryptionZoneManager, some comments are missing descriptions of their
> parameters. The purpose of this jira is to complete them.
> E.g:
>/**
> * Re-encrypts the given encryption zone path. If the given path is not the
> * root of an encryption zone, an exception is thrown.
> * @param zoneIIP
> * @param keyVersionName
> * @throws IOException
> */
>List reencryptEncryptionZone(final INodesInPath zoneIIP,
>final String keyVersionName) throws IOException {
> ..
> }
> The description of zoneIIP and keyVersionName is missing here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Resolved] (HDFS-16232) Fix java doc for BlockReaderRemote#newBlockReader

2021-09-23 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16232.

Fix Version/s: 3.4.0
   Resolution: Fixed

> Fix java doc for BlockReaderRemote#newBlockReader
> -
>
> Key: HDFS-16232
> URL: https://issues.apache.org/jira/browse/HDFS-16232
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Fix java doc for BlockReaderRemote#newBlockReader.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Created] (HDFS-16233) Do not use exception handler to implement copy-on-write for EnumCounters

2021-09-22 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDFS-16233:
--

 Summary: Do not use exception handler to implement copy-on-write 
for EnumCounters
 Key: HDFS-16233
 URL: https://issues.apache.org/jira/browse/HDFS-16233
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Wei-Chiu Chuang
 Attachments: Screen Shot 2021-09-22 at 1.59.59 PM.png

HDFS-14547 saves the NameNode heap space occupied by EnumCounters by 
essentially implementing a copy-on-write strategy.

At the beginning, all EnumCounters instances refer to the same ConstEnumCounters to save
heap space. When one is modified, an exception is thrown, and the exception
handler converts the ConstEnumCounters into an EnumCounters object and updates it.

Using an exception handler for anything more than occasional work is bad for
performance.

Proposal: use the instanceof keyword to detect the type of the object and do COW
accordingly.
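The proposed strategy can be sketched as follows. This is a simplified illustration, not the actual Hadoop EnumCounters/ConstEnumCounters classes: a shared read-only counter is promoted to a private mutable copy on the first write, detected with instanceof rather than by catching an exception on the hot path.

```java
// Simplified sketch of copy-on-write via instanceof (hypothetical classes,
// not the real org.apache.hadoop.hdfs.util.EnumCounters).
public class CowCounterDemo {

    /** Read-only counter that many owners may share to save heap. */
    static class ConstCounter {
        final long value;
        ConstCounter(long value) { this.value = value; }
        long get() { return value; }
    }

    /** Private mutable copy, created on the first write. */
    static class MutableCounter extends ConstCounter {
        long mutable;
        MutableCounter(long value) { super(value); this.mutable = value; }
        @Override long get() { return mutable; }
        void add(long delta) { mutable += delta; }
    }

    // instanceof replaces the exception handler: if the counter is still the
    // shared read-only instance, copy it before mutating.
    static ConstCounter addCopyOnWrite(ConstCounter c, long delta) {
        MutableCounter m = (c instanceof MutableCounter)
            ? (MutableCounter) c
            : new MutableCounter(c.get());   // first write: make a private copy
        m.add(delta);
        return m;
    }

    public static void main(String[] args) {
        ConstCounter shared = new ConstCounter(10);  // shared, saves heap
        ConstCounter mine = addCopyOnWrite(shared, 5);
        System.out.println(shared.get()); // 10 -- shared instance untouched
        System.out.println(mine.get());   // 15 -- private mutable copy
    }
}
```

The instanceof check costs a constant-time type test per update, whereas raising and catching an exception on every first write builds a stack trace, which is exactly the overhead the proposal avoids.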



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Resolved] (HDFS-16192) ViewDistributedFileSystem#rename wrongly using src in the place of dst.

2021-08-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16192.

Fix Version/s: 3.3.2
   3.4.0
   Resolution: Fixed

Thanks [~umamaheswararao]!

> ViewDistributedFileSystem#rename wrongly using src in the place of dst.
> ---
>
> Key: HDFS-16192
> URL: https://issues.apache.org/jira/browse/HDFS-16192
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In ViewDistributedFileSystem, we mistakenly used the src path in place of the 
> dst path when finding the mount path info.






[jira] [Resolved] (HDFS-16173) Improve CopyCommands#Put#executor queue configurability

2021-08-26 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16173.

Resolution: Fixed

> Improve CopyCommands#Put#executor queue configurability
> ---
>
> Key: HDFS-16173
> URL: https://issues.apache.org/jira/browse/HDFS-16173
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fs
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> In CopyCommands#Put, the executor queue size is a fixed value, 1024.
> We should make it configurable, because usage environments differ.
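A minimal sketch of making the queue size configurable, assuming a hypothetical key name and default (the actual key and plumbing in the committed patch may differ):

```java
import java.util.Properties;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PutExecutorFactory {
    // Hypothetical configuration key and default queue capacity.
    static final String QUEUE_SIZE_KEY = "fs.put.queue.size";
    static final int QUEUE_SIZE_DEFAULT = 1024;

    static ThreadPoolExecutor create(int numThreads, Properties conf) {
        // Read the capacity from configuration, falling back to the old
        // fixed value of 1024 when the key is unset.
        int queueSize = Integer.parseInt(conf.getProperty(
            QUEUE_SIZE_KEY, String.valueOf(QUEUE_SIZE_DEFAULT)));
        return new ThreadPoolExecutor(numThreads, numThreads,
            1, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(queueSize));
    }
}
```

A bounded queue sized per environment lets small clients cap memory while large batch jobs can queue more pending uploads.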






[jira] [Resolved] (HDFS-16175) Improve the configurable value of Server #PURGE_INTERVAL_NANOS

2021-08-25 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16175.

Fix Version/s: 3.2.4
   3.3.2
   3.4.0
   Resolution: Fixed

> Improve the configurable value of Server #PURGE_INTERVAL_NANOS
> --
>
> Key: HDFS-16175
> URL: https://issues.apache.org/jira/browse/HDFS-16175
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ipc
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> In Server, PURGE_INTERVAL_NANOS is fixed at 15 minutes.
> Making Server#PURGE_INTERVAL_NANOS configurable would make RPC call purging 
> more flexible.
> private final static long PURGE_INTERVAL_NANOS = TimeUnit.NANOSECONDS.convert(
>   15, TimeUnit.MINUTES);
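The proposed change can be sketched as reading the interval from configuration at startup instead of hard-coding it; the key name below is hypothetical (the committed patch may use a different key and Hadoop's Configuration class rather than Properties):

```java
import java.util.Properties;
import java.util.concurrent.TimeUnit;

class PurgeIntervalConfig {
    // Hypothetical configuration key; default keeps the old 15-minute value.
    static final String PURGE_INTERVAL_KEY = "ipc.server.purge.interval.minutes";
    static final long PURGE_INTERVAL_DEFAULT_MINUTES = 15;

    // Replaces the hard-coded PURGE_INTERVAL_NANOS constant with a value
    // read once from configuration.
    static long purgeIntervalNanos(Properties conf) {
        long minutes = Long.parseLong(conf.getProperty(
            PURGE_INTERVAL_KEY, String.valueOf(PURGE_INTERVAL_DEFAULT_MINUTES)));
        return TimeUnit.NANOSECONDS.convert(minutes, TimeUnit.MINUTES);
    }
}
```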






[jira] [Resolved] (HDFS-16180) FsVolumeImpl.nextBlock should consider that the block meta file has been deleted.

2021-08-23 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16180.

Fix Version/s: 3.4.0
   Resolution: Fixed

> FsVolumeImpl.nextBlock should consider that the block meta file has been 
> deleted.
> -
>
> Key: HDFS-16180
> URL: https://issues.apache.org/jira/browse/HDFS-16180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0, 3.4.0
>Reporter: Max  Xie
>Assignee: Max  Xie
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In my cluster, we found that when the VolumeScanner runs, the DataNode 
> sometimes logs errors like the ones below:
> ```
>  
> 2021-08-19 08:00:11,549 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Deleted BP-1020175758-nnip-1597745872895 blk_1142977964_69237147 URI 
> file:/disk1/dfs/data/current/BP-1020175758- 
> nnip-1597745872895/current/finalized/subdir0/subdir21/blk_1142977964
> 2021-08-19 08:00:48,368 ERROR 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl: 
> nextBlock(DS-060c8e4c-1ef6-49f5-91ef-91957356891a, BP-1020175758- 
> nnip-1597745872895): I/O error
> java.io.IOException: Meta file not found, 
> blockFile=/disk1/dfs/data/current/BP-1020175758- 
> nnip-1597745872895/current/finalized/subdir0/subdir21/blk_1142977964
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetUtil.findMetaFile(FsDatasetUtil.java:101)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl$BlockIteratorImpl.nextBlock(FsVolumeImpl.java:809)
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:528)
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:628)
> 2021-08-19 08:00:48,368 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: 
> VolumeScanner(/disk1/dfs/data, DS-060c8e4c-1ef6-49f5-91ef-91957356891a): 
> nextBlock error on 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl$BlockIteratorImpl@7febc6b4
> ```
> When the VolumeScanner scanned block blk_1142977964, the block had already 
> been deleted by the DataNode, so the scanner could not find its meta file and 
> logged these errors.
>
> We should handle FileNotFoundException during nextBlock to reduce the error 
> logging and nextBlock retries.
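A simplified sketch of that handling, using hypothetical types rather than the actual FsVolumeImpl/VolumeScanner API: a block whose meta file vanished between listing and scanning is skipped instead of surfacing an I/O error and triggering retries.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Iterator;

class BlockSweep {
    // Stand-in for FsDatasetUtil.findMetaFile: resolves a block's meta file,
    // throwing FileNotFoundException when the block was deleted concurrently.
    interface MetaLookup {
        String findMetaFile(String block) throws IOException;
    }

    static int scan(Iterator<String> blocks, MetaLookup lookup) {
        int scanned = 0;
        while (blocks.hasNext()) {
            String block = blocks.next();
            try {
                lookup.findMetaFile(block);
                scanned++;
            } catch (FileNotFoundException e) {
                // Block deleted between listing and scanning: skip quietly
                // rather than logging an error and retrying.
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
        return scanned;
    }
}
```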





