[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171051#comment-14171051 ]

Yongjun Zhang commented on HDFS-7235:
-------------------------------------

Hi [~cmccabe],

Thanks for the review and discussion yesterday. I was in a rush to leave when I 
posted the previous comment with the patch, so here is some more information:

* You pointed out that external users might derive from the FsDatasetSpi 
interface, so any change to this interface raises compatibility concerns. This 
is a very good point, so it would indeed be nice if we could avoid changing 
FsDatasetSpi.
* If we use {{FsDatasetSpi#getLength}} to check file existence, there is no 
guarantee that the replica state is FINALIZED, so it's not sufficient for the 
fix here.
* Without changing FsDatasetSpi, we need to add logic similar to what I did in 
rev 001 to DataNode.java. To check the replica state in DataNode.java, I had to 
use the deprecated getReplica() method.
* Having this logic in DataNode.java concerns me a bit: DataNode is supposed to 
use only the FsDatasetSpi interface, yet we would be incorporating logic 
specific to FsDatasetImpl into DataNode.java. If a user derives from 
FsDatasetSpi and writes their own implementation, its behavior may differ from 
FsDatasetImpl's, which could cause problems. This is the point I was trying to 
make in my last comment.
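To make the validity check under discussion concrete, here is a standalone sketch (hypothetical names; {{ReplicaStub}} is a simplified stand-in for FsDatasetImpl's ReplicaInfo, not the real class): a replica is accepted as a transfer source only if it is present, FINALIZED, and its block file still exists on disk. On a bad disk the file-existence test fails even though the volume map still records the replica as FINALIZED.

```java
import java.io.File;

/**
 * Standalone sketch (hypothetical, simplified) of the source-validity check
 * discussed above. ReplicaStub stands in for FsDatasetImpl's ReplicaInfo.
 */
public class SourceCheckSketch {
    enum ReplicaState { FINALIZED, RBW }

    static class ReplicaStub {
        final ReplicaState state;
        final File blockFile;
        ReplicaStub(ReplicaState state, File blockFile) {
            this.state = state;
            this.blockFile = blockFile;
        }
    }

    /** Mirrors the shape of FsDatasetImpl#isValid: state plus on-disk existence. */
    static boolean isValidSource(ReplicaStub r) {
        return r != null
            && r.state == ReplicaState.FINALIZED
            && r.blockFile.exists();
    }

    public static void main(String[] args) {
        // Bad-disk case: still FINALIZED in the volume map, but the file is gone,
        // so the replica must be rejected as a transfer source.
        ReplicaStub gone = new ReplicaStub(ReplicaState.FINALIZED,
                new File("/nonexistent/blk_1073741825"));
        System.out.println(isValidSource(gone)); // prints false
    }
}
```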

Would you please comment again?

Thanks.
 


> Can not decommission DN which has invalid block due to bad disk
> ---------------------------------------------------------------
>
>                 Key: HDFS-7235
>                 URL: https://issues.apache.org/jira/browse/HDFS-7235
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch
>
>
> When decommissioning a DN, the process hangs. 
> What happens is, when the NN chooses a replica as a source to replicate data 
> on the to-be-decommissioned DN to other DNs, it favors choosing the 
> to-be-decommissioned DN itself as the source of the transfer (see 
> BlockManager.java). However, because of the bad disk, the DN detects the 
> source block to be transferred as an invalid block, via the following logic 
> in FsDatasetImpl.java:
> {code}
>   /** Does the block exist and have the given state? */
>   private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
>     final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(), 
>         b.getLocalBlock());
>     return replicaInfo != null
>         && replicaInfo.getState() == state
>         && replicaInfo.getBlockFile().exists();
>   }
> {code}
> This method returns false (detecting an invalid block) because the block file 
> does not exist, due to the bad disk in this case. 
> The key issue we found is that after the DN detects an invalid block for the 
> above reason, it does not report the invalid block back to the NN. The NN 
> therefore does not know that the block is corrupted and keeps sending the 
> data transfer request to the same to-be-decommissioned DN, again and again. 
> This causes an infinite loop, so the decommission process hangs.
> Thanks [~qwertymaniac] for reporting the issue and initial analysis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
