[
https://issues.apache.org/jira/browse/HDFS-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stephen O'Donnell resolved HDFS-16272.
--------------------------------------
Resolution: Fixed
Committed down to the active branches. Thanks for the contribution, [~cndaimin].
> Int overflow in computing safe length during EC block recovery
> --------------------------------------------------------------
>
> Key: HDFS-16272
> URL: https://issues.apache.org/jira/browse/HDFS-16272
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: 3.1.1
> Affects Versions: 3.3.0, 3.3.1
> Environment: Cluster settings: EC RS-8-2-256k, Block Size 1GiB.
> Reporter: daimin
> Assignee: daimin
> Priority: Critical
> Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> There is an int overflow problem in StripedBlockUtil#getSafeLength, which can
> produce a negative or zero safe length (see the sketch below):
> 1. With a negative length, it fails the later >= 0 check, which crashes the
> BlockRecoveryWorker thread and leaves the lease recovery operation unable to
> finish.
> 2. With a zero length, it passes the check and the block is truncated to zero,
> leading to data loss.
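> A minimal sketch of the overflow (illustrative only, not the actual
> StripedBlockUtil source; the method and parameter names here are assumptions):
> {code:java}
> // Buggy form: both operands are int, so the multiply is evaluated in 32 bits
> // and only then widened to long, so the wrapped value is what reaches the
> // >= 0 check and the truncate.
> static long safeLengthBuggy(int dataBlkNum, int blockSize) {
>   return dataBlkNum * blockSize;
> }
>
> // Fixed form: cast one operand to long so the multiply is done in 64 bits.
> static long safeLengthFixed(int dataBlkNum, int blockSize) {
>   return (long) dataBlkNum * blockSize;
> }
> {code}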
> If you are using any of the default EC policies (3-2, 6-3 or 10-4) with the
> default HDFS block size of 128MB, you are not impacted by this issue. To be
> impacted, the EC dataNumber * blockSize product has to be larger than the Java
> max int of 2,147,483,647.
> For example, 10-4 with 128MB blocks gives 10 * 134,217,728 = 1,342,177,280,
> which is fine. However, 10-4 with 256MB blocks gives 10 * 268,435,456 =
> 2,684,354,560, which overflows an int and triggers the problem.
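> The arithmetic above can be verified directly (plain Java, just checking the
> numbers in this description):
> {code:java}
> System.out.println(10 * (128 << 20));   // 1342177280, fits in an int
> System.out.println(10 * (256 << 20));   // -1610612736, wrapped by int overflow
> System.out.println(10L * (256 << 20));  // 2684354560, correct once done in long
> {code}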