[ 
https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169714#comment-14169714
 ] 

Yongjun Zhang commented on HDFS-7235:
-------------------------------------

Hi Colin,

Thanks a lot for the review.

The key issue behind the original symptom was: when a block is detected 
as invalid by the existing isValid() method, we call SendErrorReport(), which 
just logs a message, and the NameNode does nothing beyond logging for this 
call, so the NameNode never learns that the block is bad.

What I did was separate the reasons for isValid() returning false into two cases:
-  if it's false because getBlockFile().exists() returns false, call 
reportBadBlocks(), so the NameNode will record the bad block for future reference.
-  if it's false because either replicaInfo == null or replicaInfo.getState() 
!= state, it still calls SendErrorReport() as before. In this case the expected 
state has to be FINALIZED; we don't want to report a bad block for a state 
such as RBW.
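To illustrate the split described above, here is a minimal sketch. The class, the checkReplica() helper, and the string return values are illustrative stand-ins, not the actual FsDatasetImpl/DataNode APIs; ReplicaState and ReplicaInfo are simplified mock-ups of the real HDFS types:

{code}
// Illustrative sketch only: these are simplified stand-ins for HDFS types.
import java.io.File;

enum ReplicaState { FINALIZED, RBW }

class ReplicaInfo {
    private final ReplicaState state;
    private final File blockFile;
    ReplicaInfo(ReplicaState state, File blockFile) {
        this.state = state;
        this.blockFile = blockFile;
    }
    ReplicaState getState() { return state; }
    File getBlockFile() { return blockFile; }
}

public class InvalidBlockHandling {
    // Split the single isValid() check into the two failure cases above.
    static String checkReplica(ReplicaInfo replicaInfo, ReplicaState expected) {
        if (replicaInfo == null || replicaInfo.getState() != expected) {
            // Missing replica or wrong state (e.g. RBW): just send an
            // error report, as before; don't mark the block corrupt.
            return "sendErrorReport";
        }
        if (!replicaInfo.getBlockFile().exists()) {
            // Block file missing (bad disk): report the bad block so the
            // NameNode stops picking this DN as a replication source.
            return "reportBadBlocks";
        }
        return "valid";
    }

    public static void main(String[] args) {
        ReplicaInfo missingFile =
            new ReplicaInfo(ReplicaState.FINALIZED, new File("/nonexistent/blk_1"));
        System.out.println(checkReplica(missingFile, ReplicaState.FINALIZED));
        System.out.println(checkReplica(null, ReplicaState.FINALIZED));
    }
}
{code}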

If we made the change inside SendErrorReport instead, we would have to change 
the behavior of that method to also call reportBadBlocks conditionally, which 
doesn't seem clean to me, because SendErrorReport is supposed to just send an 
error report.

I wonder if this explanation makes sense to you?

Thanks.

> Can not decommission DN which has invalid block due to bad disk
> ---------------------------------------------------------------
>
>                 Key: HDFS-7235
>                 URL: https://issues.apache.org/jira/browse/HDFS-7235
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-7235.001.patch
>
>
> When decommissioning a DN, the process hangs. 
> What happens is, when NN chooses a replica as a source to replicate data on 
> the to-be-decommissioned DN to other DNs, it favors choosing this DN 
> to-be-decommissioned as the source of transfer (see BlockManager.java).  
> However, because of the bad disk, the DN would detect the source block to be 
> transferred as an invalid block with the following logic in FsDatasetImpl.java:
> {code}
> /** Does the block exist and have the given state? */
>   private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
>     final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(), 
>         b.getLocalBlock());
>     return replicaInfo != null
>         && replicaInfo.getState() == state
>         && replicaInfo.getBlockFile().exists();
>   }
> {code}
> In this case, the method returns false (detecting an invalid block) because 
> the block file doesn't exist due to the bad disk. 
> The key issue we found here is that after the DN detects an invalid block 
> for the above reason, it doesn't report the invalid block back to the NN, so 
> the NN doesn't know that the block is corrupted and keeps sending the data 
> transfer request to the same to-be-decommissioned DN, again and again. This 
> causes an infinite loop, so the decommission process hangs.
> Thanks [~qwertymaniac] for reporting the issue and initial analysis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
