[ 
https://issues.apache.org/jira/browse/HDFS-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898236#comment-15898236
 ] 

Jing Zhao commented on HDFS-11445:
----------------------------------

Thanks for working on this, [~brahmareddy]!

I think you have just found a scenario in which the following inconsistency happens:
{code:title=BlockManager#createLocatedBlock}
    NumberReplicas numReplicas = countNodes(blk);
    final int numCorruptNodes = numReplicas.corruptReplicas();
    final int numCorruptReplicas = corruptReplicas.numCorruptReplicas(blk);
    if (numCorruptNodes != numCorruptReplicas) {
      LOG.warn("Inconsistent number of corrupt replicas for "
          + blk + " blockMap has " + numCorruptNodes
          + " but corrupt replicas map has " + numCorruptReplicas);
    }
{code}

I also did some debugging using your unit test. The root cause of this 
inconsistency appears to be that 
{{BlockInfo#setGenerationStampAndVerifyReplicas}} may remove a datanode storage 
from the block's storage list while still leaving the storage in the 
CorruptReplicasMap.

This inconsistency can later be fixed automatically, e.g., by a full block 
report. But maybe we should consider using 
{{BlockManager#removeStoredBlock(BlockInfo, DatanodeDescriptor)}} to remove all 
the records related to the block-dn pair.
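
To illustrate the idea, here is a minimal toy model in plain Java. It is not HDFS code: the class {{CorruptReplicaSketch}} and all of its methods are hypothetical stand-ins, with one set playing the role of the block's storage list and another playing the role of the CorruptReplicasMap entry for the same block. It only shows why removing a datanode from one structure but not the other makes the two corrupt-replica counts disagree, and why removing all records for the block-dn pair keeps them in agreement.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model only; every name here is hypothetical, not an HDFS API. */
final class CorruptReplicaSketch {
    // Stand-in for the block's storage list (the blocksMap side).
    private final Set<String> storages = new HashSet<>();
    // Stand-in for the CorruptReplicasMap entry of the same block.
    private final Set<String> corruptNodes = new HashSet<>();

    void addReplica(String dn, boolean corrupt) {
        storages.add(dn);
        if (corrupt) {
            corruptNodes.add(dn);
        }
    }

    /** Removes the storage only, leaving any corrupt record behind. */
    void removeStorageOnly(String dn) {
        storages.remove(dn);
    }

    /** Removes every record kept for the block-dn pair. */
    void removeAllRecords(String dn) {
        storages.remove(dn);
        corruptNodes.remove(dn);
    }

    /** Corrupt replicas counted by walking the storage list. */
    int corruptFromStorages() {
        int n = 0;
        for (String dn : storages) {
            if (corruptNodes.contains(dn)) {
                n++;
            }
        }
        return n;
    }

    /** Corrupt replicas counted directly from the corrupt map. */
    int corruptFromMap() {
        return corruptNodes.size();
    }

    boolean consistent() {
        return corruptFromStorages() == corruptFromMap();
    }

    public static void main(String[] args) {
        CorruptReplicaSketch partial = new CorruptReplicaSketch();
        partial.addReplica("dn1", false);
        partial.addReplica("dn2", true);
        partial.removeStorageOnly("dn2");
        System.out.println("storage-only removal consistent: " + partial.consistent());

        CorruptReplicaSketch full = new CorruptReplicaSketch();
        full.addReplica("dn1", false);
        full.addReplica("dn2", true);
        full.removeAllRecords("dn2");
        System.out.println("full removal consistent: " + full.consistent());
    }
}
```

In this toy model the storage-only removal leaves a stale corrupt entry (counts 0 vs. 1), while the combined removal keeps both counts at 0. Whether {{removeStoredBlock}} is safe to call at that point in the real code path would still need to be checked against the actual BlockManager logic.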

> FSCK shows overall health status as corrupt even one replica is corrupt
> -----------------------------------------------------------------------
>
>                 Key: HDFS-11445
>                 URL: https://issues.apache.org/jira/browse/HDFS-11445
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Brahma Reddy Battula
>            Assignee: Brahma Reddy Battula
>         Attachments: HDFS-11445.patch
>
>
> In the following scenario, FSCK shows the overall health status as corrupt 
> even though the file has one good replica.
> 1. Create file with 2 RF.
> 2. Shutdown one DN
> 3. Append to file again. 
> 4. Restart the DN
> 5. After block report, check Fsck


