[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169673#comment-14169673 ]

Colin Patrick McCabe commented on HDFS-7235:
--------------------------------------------

Thanks for looking at this, Yongjun.

I don't understand why we need a new function named 
{{FsDatasetSpi#isInvalidBlockDueToNonexistentBlockFile}}.  The JavaDoc for 
{{FsDatasetSpi#isValid}} says that it checks if the block "exist\[s\] and has 
the given state" and it's clear from the code that this is what it actually 
implements.

We start by calling {{isValidBlock}} in {{DataNode#transferBlock}}...
{code}
  private void transferBlock(ExtendedBlock block, DatanodeInfo[] xferTargets,
      StorageType[] xferTargetStorageTypes) throws IOException {
    BPOfferService bpos = getBPOSForBlock(block);
    DatanodeRegistration bpReg = getDNRegistrationForBP(block.getBlockPoolId());

    if (!data.isValidBlock(block)) {
      // block does not exist or is under-construction
      String errStr = "Can't send invalid block " + block;
      LOG.info(errStr);

      bpos.trySendErrorReport(DatanodeProtocol.INVALID_BLOCK, errStr);
      return;
    }
    ...
{code}

{{isValid}} checks whether the block file exists...
{code}
  /** Does the block exist and have the given state? */
  private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
    final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(),
        b.getLocalBlock());
    return replicaInfo != null
        && replicaInfo.getState() == state
        && replicaInfo.getBlockFile().exists();
  }
{code}
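
And the {{isValidBlock}} call in {{transferBlock}} is, if I'm reading the code right, just a thin wrapper that delegates to {{isValid}} with the FINALIZED state (quoting from memory, so the exact lines may differ slightly):
{code}
  @Override // FsDatasetSpi
  public boolean isValidBlock(ExtendedBlock b) {
    return isValid(b, ReplicaState.FINALIZED);
  }
{code}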

So there's no need for a new function.  {{isValid}} already does what you want.

bq. The key issue we found here is that, after the DN detects an invalid block for 
the above reason, it doesn't report the invalid block back to the NN, so the NN 
doesn't know that the block is corrupted and keeps sending the data transfer 
request to the same to-be-decommissioned DN, again and again. This causes an 
infinite loop, so the decommission process hangs.

Is this a problem with {{BPOfferService#trySendErrorReport}}?  If so, it seems 
like we should fix it there.

I can see that {{BPServiceActor#trySendErrorReport}} calls 
{{NameNodeRpc#errorReport}}, whereas your patch calls 
{{NameNodeRpc#reportBadBlocks}}.  What's the reason for this change, and does 
it fix the bug described above?
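
For context: {{errorReport}} with {{INVALID_BLOCK}} is, as far as I can tell, just logged on the NN side, whereas {{reportBadBlocks}} makes the NN mark the replica as corrupt and schedule re-replication from a different source.  If that's the intent, something along these lines in {{transferBlock}} would be one way to express it.  This is only a sketch of the idea, not the actual patch, and it assumes the existing {{DataNode#reportBadBlocks(ExtendedBlock)}} helper, which (if I remember right) wraps the block in a {{LocatedBlock}} and sends {{DatanodeProtocol#reportBadBlocks}} to the NN:
{code}
    if (!data.isValidBlock(block)) {
      // block does not exist or is under-construction
      String errStr = "Can't send invalid block " + block;
      LOG.info(errStr);
      // Sketch only: besides the free-form error report, tell the NN that
      // this replica is bad, so it stops picking this DN as the source.
      reportBadBlocks(block);
      bpos.trySendErrorReport(DatanodeProtocol.INVALID_BLOCK, errStr);
      return;
    }
{code}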

> Can not decommission DN which has invalid block due to bad disk
> ---------------------------------------------------------------
>
>                 Key: HDFS-7235
>                 URL: https://issues.apache.org/jira/browse/HDFS-7235
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-7235.001.patch
>
>
> When decommissioning a DN, the process hangs. 
> What happens is, when the NN chooses a replica as a source to replicate the data 
> on the to-be-decommissioned DN to other DNs, it favors choosing the 
> to-be-decommissioned DN itself as the source of the transfer (see BlockManager.java).  
> However, because of the bad disk, the DN detects the source block to be 
> transferred as an invalid block, with the following logic in FsDatasetImpl.java:
> {code}
> /** Does the block exist and have the given state? */
>   private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
>     final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(), 
>         b.getLocalBlock());
>     return replicaInfo != null
>         && replicaInfo.getState() == state
>         && replicaInfo.getBlockFile().exists();
>   }
> {code}
> The reason this method returns false (detecting an invalid block) is that the 
> block file doesn't exist, due to the bad disk in this case. 
> The key issue we found here is that, after the DN detects an invalid block for the 
> above reason, it doesn't report the invalid block back to the NN, so the NN doesn't 
> know that the block is corrupted and keeps sending the data transfer request 
> to the same to-be-decommissioned DN, again and again. This causes an infinite 
> loop, so the decommission process hangs.
> Thanks [~qwertymaniac] for reporting the issue and initial analysis.


