[
https://issues.apache.org/jira/browse/HDFS-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18022591#comment-18022591
]
ASF GitHub Bot commented on HDFS-17542:
---------------------------------------
github-actions[bot] closed pull request #6915: HDFS-17542. EC: Optimize the EC
block reconstruction.
URL: https://github.com/apache/hadoop/pull/6915
> EC: Optimize the EC block reconstruction.
> -----------------------------------------
>
> Key: HDFS-17542
> URL: https://issues.apache.org/jira/browse/HDFS-17542
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Chenyu Zheng
> Assignee: Chenyu Zheng
> Priority: Major
> Labels: pull-request-available
>
> The current reconstruction process for EC blocks reuses the one built for
> ordinary contiguous blocks. It is driven mainly by the work constructed in
> computeReconstructionWorkForBlocks, which can be roughly divided into three
> steps:
> * scheduleReconstruction
> * chooseTargets
> * validateReconstructionWork
> For ordinary contiguous blocks:
> * (1) scheduleReconstruction
> Select srcNodes as the source of the copy block according to the status of
> each replica of the block.
> * (2) chooseTargets
> Select the target of the copy.
> * (3) validateReconstructionWork
> Add the copy command to srcNode. srcNode receives the command through a
> heartbeat and executes the block copy from src to target.
> For EC blocks:
> (1) and (2) are nearly the same. However, whether to perform a simple block
> copy or a block reconstruction for EC blocks is determined in (3). When some
> storage is busy, (3) may produce no work at all, which leads to the problem
> described in HDFS-17516. Even if no block copy or block reconstruction is
> generated, pendingReconstruction and neededReconstruction are still updated,
> and remain so until the block times out, which wastes the scheduling
> opportunity.
> Because the decision of whether to perform block copy or block reconstruction
> is made in (3), the otherwise unnecessary liveBusyBlockIndices and
> excludeReconstructedIndices had to be introduced. Many known bugs are related
> to them. These should be avoided.
> Improvements:
> * Move the work of deciding whether to copy or reconstruct blocks from (3)
> to (1).
> This change also makes it easier to implement the explicit specification of
> the reconstruction block index mentioned in HDFS-16874.
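The proposed move of the copy-or-reconstruct decision into step (1) can be illustrated with a minimal sketch. This is not the BlockManager code or the patch in the pull request. The class EcScheduleSketch, the enum WorkType, and the method decide are hypothetical names, and the model is deliberately simplified: an internal block is copied if it is still readable on a non-busy storage, reconstructed if at least dataUnits other non-busy internal blocks are readable, and otherwise not scheduled at all, so that no pending entry is created for work that cannot run.

```java
import java.util.BitSet;

// Hypothetical, simplified model of deciding the work type at scheduling time.
class EcScheduleSketch {

  enum WorkType { COPY, RECONSTRUCT, NONE }

  /**
   * Decide what to schedule for one internal block of an EC block group.
   *
   * @param dataUnits number of data units in the EC policy (6 for RS-6-3)
   * @param live      indices of internal blocks that are currently readable
   * @param busy      indices whose storage is too busy to act as a source
   * @param target    index of the internal block that needs a new location
   */
  static WorkType decide(int dataUnits, BitSet live, BitSet busy, int target) {
    // Sources we can actually read from: live and not busy.
    BitSet usable = (BitSet) live.clone();
    usable.andNot(busy);

    // The block itself is still readable (for example on a node that is
    // being decommissioned), so a plain copy is enough.
    if (usable.get(target)) {
      return WorkType.COPY;
    }

    // Otherwise decoding needs at least dataUnits readable internal blocks.
    // The target index is not set in usable here, so it is not counted.
    if (usable.cardinality() >= dataUnits) {
      return WorkType.RECONSTRUCT;
    }

    // Not enough sources right now: schedule nothing and leave the block
    // in the needed queue instead of recording pending work.
    return WorkType.NONE;
  }

  public static void main(String[] args) {
    BitSet live = new BitSet();
    live.set(0, 9);              // RS-6-3: internal block indices 0..8
    BitSet busy = new BitSet();

    System.out.println(decide(6, live, busy, 2));

    live.clear(2);               // internal block 2 is now missing
    busy.set(0);
    System.out.println(decide(6, live, busy, 2));

    busy.set(1);
    busy.set(3);                 // only five usable sources remain
    System.out.println(decide(6, live, busy, 2));
  }
}
```

With the decision made up front, a caller would only touch its pending bookkeeping when the result is COPY or RECONSTRUCT, which is the scheduling waste the description wants to avoid.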
--
This message was sent by Atlassian Jira
(v8.20.10#820010)