[
https://issues.apache.org/jira/browse/HDFS-12072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Wang updated HDFS-12072:
-------------------------------
Resolution: Fixed
Fix Version/s: 3.0.0-beta1
Status: Resolved (was: Patch Available)
Thanks for the contribution Eddy, committed this to trunk!
> Provide fairness between EC and non-EC recovery tasks.
> ------------------------------------------------------
>
> Key: HDFS-12072
> URL: https://issues.apache.org/jira/browse/HDFS-12072
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: erasure-coding
> Affects Versions: 3.0.0-alpha3
> Reporter: Lei (Eddy) Xu
> Assignee: Lei (Eddy) Xu
> Labels: hdfs-ec-3.0-nice-to-have
> Fix For: 3.0.0-beta1
>
> Attachments: HDFS-12072.00.patch, HDFS-12072.01.patch
>
>
> In {{DatanodeManager#handleHeartbeat}}, up to {{maxTransfers}} non-EC
> reconstruction tasks are taken first; only if that quota is not fully used
> are the remaining slots filled from the EC reconstruction queue.
> {code}
>   List<BlockTargetPair> pendingList = nodeinfo.getReplicationCommand(
>       maxTransfers);
>   if (pendingList != null) {
>     cmds.add(new BlockCommand(DatanodeProtocol.DNA_TRANSFER, blockPoolId,
>         pendingList));
>     maxTransfers -= pendingList.size();
>   }
>   // check pending erasure coding tasks
>   List<BlockECReconstructionInfo> pendingECList = nodeinfo
>       .getErasureCodeCommand(maxTransfers);
>   if (pendingECList != null) {
>     cmds.add(new BlockECReconstructionCommand(
>         DNA_ERASURE_CODING_RECONSTRUCTION, pendingECList));
>   }
> {code}
> So on a large cluster with a steady stream of non-EC reconstruction
> tasks, EC reconstruction tasks may never get a chance to run.
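> One way to provide fairness (a sketch of the general idea only, not
> necessarily the committed patch; {{splitTransfers}}, {{replPending}} and
> {{ecPending}} are hypothetical names) is to split {{maxTransfers}}
> between the two queues in proportion to their pending sizes, so that
> neither queue can starve the other:
> {code}
> // Hypothetical helper: divide maxTransfers between the replication and
> // EC queues in proportion to their pending task counts. Rounding the EC
> // share up guarantees EC gets at least one slot whenever it has work.
> static int[] splitTransfers(int maxTransfers, int replPending,
>     int ecPending) {
>   int total = replPending + ecPending;
>   if (total == 0) {
>     return new int[] {0, 0};
>   }
>   int ecShare = Math.min(ecPending,
>       (int) Math.ceil((double) ecPending * maxTransfers / total));
>   int replShare = Math.min(replPending, maxTransfers - ecShare);
>   return new int[] {replShare, ecShare};
> }
> {code}
> With such a split, even a DataNode flooded with replication work hands
> out some EC slots on every heartbeat, e.g. 1000 pending replications
> and 1 pending EC task with {{maxTransfers}} = 4 yields 3 replication
> slots and 1 EC slot rather than 4 and 0.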
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)