[ https://issues.apache.org/jira/browse/HDFS-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547551#comment-14547551 ]
Yi Liu commented on HDFS-8418:
------------------------------
This can cause unexpected behavior. For example, unnecessary striped block
reconstruction work gets scheduled, and I can see the following exception in the log:
{code}
2015-05-18 13:25:57,955 WARN datanode.DataNode (ErasureCodingWorker.java:processErasureCodingTasks(158)) - Failed to recover striped block blk_-9223372036854775792_1001
java.lang.IllegalArgumentException: No enough live striped blocks.
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.<init>(ErasureCodingWorker.java:278)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker.processErasureCodingTasks(ErasureCodingWorker.java:156)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:730)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:617)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:854)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:671)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:820)
    at java.lang.Thread.run(Thread.java:744)
{code}
> Fix the isNeededReplication calculation for Striped block in NN
> ---------------------------------------------------------------
>
> Key: HDFS-8418
> URL: https://issues.apache.org/jira/browse/HDFS-8418
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Yi Liu
> Assignee: Yi Liu
> Priority: Critical
>
> Currently, when calculating {{isNeededReplication}} for a striped block, we use
> {{BlockCollection#getPreferredBlockReplication}} to get the expected replica
> number. For example:
> {code}
> public void checkReplication(BlockCollection bc) {
>   final short expected = bc.getPreferredBlockReplication();
>   for (BlockInfo block : bc.getBlocks()) {
>     final NumberReplicas n = countNodes(block);
>     if (isNeededReplication(block, expected, n.liveReplicas())) {
>       neededReplications.add(block, n.liveReplicas(),
>           n.decommissionedAndDecommissioning(), expected);
>     } else if (n.liveReplicas() > expected) {
>       processOverReplicatedBlock(block, expected, null, null);
>     }
>   }
> }
> {code}
> This is not correct for striped blocks. For example, if the length of a striped
> file is less than one cell, only one data block is actually written, so the
> expected replica number of the block should be {{1 + parityBlkNum}} instead of
> {{dataBlkNum + parityBlkNum}}.
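
The arithmetic in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the class and method names are hypothetical, the 6+3 layout and 64 KB cell size are example values, and treating a zero-length block group as needing one data block is an assumption as well.

```java
// Hypothetical helper, not part of the Hadoop API: estimates how many
// internal blocks a striped block group is expected to have.
final class StripedExpectedReplicas {

    private StripedExpectedReplicas() {
    }

    /**
     * @param numBytes     length of the striped block group in bytes
     * @param cellSize     striping cell size in bytes
     * @param dataBlkNum   number of data blocks in the schema
     * @param parityBlkNum number of parity blocks in the schema
     * @return expected number of internal blocks (data blocks actually used plus parity)
     */
    static int expectedInternalBlocks(long numBytes, int cellSize,
                                      int dataBlkNum, int parityBlkNum) {
        if (numBytes < 0 || cellSize <= 0 || dataBlkNum <= 0 || parityBlkNum < 0) {
            throw new IllegalArgumentException("invalid striping parameters");
        }
        // Number of cells the data occupies, rounded up.
        // Assumption: an empty block group still counts as one cell.
        long cells = (numBytes == 0) ? 1 : (numBytes - 1) / cellSize + 1;
        // Each cell goes to a different data block until all data blocks are in use.
        int usedDataBlocks = (int) Math.min((long) dataBlkNum, cells);
        return usedDataBlocks + parityBlkNum;
    }

    public static void main(String[] args) {
        int cellSize = 64 * 1024; // example value
        // File shorter than one cell: 1 + parityBlkNum
        System.out.println(expectedInternalBlocks(1000, cellSize, 6, 3));
        // File spanning at least dataBlkNum cells: dataBlkNum + parityBlkNum
        System.out.println(expectedInternalBlocks(10L * cellSize, cellSize, 6, 3));
    }
}
```

With a value like this in place of {{getPreferredBlockReplication}}, a small striped file would no longer look under-replicated just because fewer than {{dataBlkNum}} data blocks exist.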
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)