[ https://issues.apache.org/jira/browse/HDFS-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14179124#comment-14179124 ]

Zhe Zhang commented on HDFS-7225:
---------------------------------

Sure, we should clean up when a DN is removed. {{wipeDatanode}} looks like 
the right place to me, since it is the only method calling 
{{datanodeMap.remove()}}. I think we should also keep the cleanup at lookup 
time, in case elements are removed from {{datanodeMap}} unexpectedly.
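
For concreteness, here is a rough sketch of both cleanup paths. The names 
below follow the discussion ({{wipeDatanode}}, {{datanodeMap}}), but the 
bodies use simplified stand-in types and are not the actual patch:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Sketch only -- simplified stand-ins for the NN data structures, not the
// committed HDFS-7225 change.
public class CleanupSketch {
  // Stand-in for datanodeMap: datanodeUuid -> descriptor.
  private final Map<String, Object> datanodeMap = new HashMap<>();
  // Stand-in for InvalidateBlocks' per-node pending invalidations.
  private final Map<String, Object> invalidateBlocks = new HashMap<>();

  // Cleanup at removal time: since wipeDatanode is the only caller of
  // datanodeMap.remove(), purging here covers every normal removal path.
  void wipeDatanode(String datanodeUuid) {
    datanodeMap.remove(datanodeUuid);
    invalidateBlocks.remove(datanodeUuid);
  }

  // Cleanup kept at lookup time as well, in case an entry ever leaves
  // datanodeMap without going through wipeDatanode.
  int invalidateWorkForOneNode(String datanodeUuid) {
    Object dn = datanodeMap.get(datanodeUuid);
    if (dn == null) {
      invalidateBlocks.remove(datanodeUuid); // never pass a null key on
      return 0;
    }
    return 0; // ... proceed with the real invalidation work for dn
  }
}
{code}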

Thinking a little more about our previous assumption below:

bq. If the old volume is brought back, the old blocks will be in the block 
report and the NN will re-populate InvalidateBlocks as needed when it processes 
the report.

What about file deletions? With our current approach, would there be orphan 
blocks when the volume comes back?

> Failed DataNode lookup can crash NameNode with NullPointerException
> -------------------------------------------------------------------
>
>                 Key: HDFS-7225
>                 URL: https://issues.apache.org/jira/browse/HDFS-7225
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7225-v1.patch, HDFS-7225-v2.patch
>
>
> {{BlockManager#invalidateWorkForOneNode}} looks up a DataNode by its 
> {{datanodeUuid}} and passes the resultant {{DatanodeDescriptor}} to 
> {{InvalidateBlocks#invalidateWork}}. However, if a wrong or outdated 
> {{datanodeUuid}} is used, a null pointer is passed to {{invalidateWork}}, 
> which uses it as a lookup key in a {{TreeMap}}. Since the key type is 
> {{DatanodeDescriptor}}, key comparison is based on the IP address. A null 
> key therefore crashes the NameNode with an NPE.
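
The failure mode described above is easy to reproduce outside HDFS. The 
following stand-alone snippet is not HDFS code -- {{Node}} is a made-up 
stand-in for {{DatanodeDescriptor}}, ordered by IP address as in the 
description -- but it shows how a {{TreeMap}} whose comparator dereferences 
its keys throws an NPE on a null lookup key:

{code:java}
import java.util.Comparator;
import java.util.TreeMap;

// Stand-alone reproduction of the failure mode; Node is a made-up
// stand-in for DatanodeDescriptor, compared by IP address.
public class NullKeyNpeDemo {
  static class Node {
    final String ipAddr;
    Node(String ipAddr) { this.ipAddr = ipAddr; }
  }

  public static void main(String[] args) {
    TreeMap<Node, String> map = new TreeMap<>(new Comparator<Node>() {
      @Override
      public int compare(Node a, Node b) {
        return a.ipAddr.compareTo(b.ipAddr); // dereferences both keys
      }
    });
    map.put(new Node("10.0.0.1"), "pending invalidations");

    // Equivalent to looking up work for a descriptor that a failed
    // DataNode lookup returned as null:
    map.get(null); // NullPointerException inside the comparator
  }
}
{code}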



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
