[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304702#comment-15304702
 ] 

Colin Patrick McCabe commented on HDFS-10301:
---------------------------------------------

[~redvine], you are having trouble telling stale storages apart from zombie 
storages because your patch uses a separate mechanism to detect which storages 
exist on the DN.  The existing code doesn't have this problem because the full 
block report itself acts as the record of which storages exist.  This is one 
negative side effect of the more complex approach.  Another negative side 
effect is that you are transmitting the same information about which storages 
are present multiple times.

Despite these negatives, I'm still willing to review a patch that uses the more 
complicated method as long as you don't introduce extra RPCs.  I agree that we 
should remove a stale storage if it doesn't appear in the full listing that 
gets sent.  Just to be clear, I am -1 on a patch which adds extra RPCs.  
Perhaps you can send this listing in an optional field in the first RPC.
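
A minimal sketch of that idea follows. It is an illustration only, not code 
from any attached patch: the class and method names are hypothetical, and 
treating an unset optional field as "remove nothing" is an assumption about 
how older DNs would be handled.

```java
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; these names do not exist in HDFS.
class StorageListingSketch {
    /**
     * Removes every known storage that is absent from the full listing
     * carried in the first RPC of a block report.
     *
     * @param knownStorages storage IDs the NN currently tracks (mutated)
     * @param listingOrNull full listing from the optional field, or null
     *                      if the field was not set (assumption: older DN)
     * @return the storage IDs that were removed
     */
    static Set<String> pruneAbsent(Set<String> knownStorages,
                                   Set<String> listingOrNull) {
        Set<String> removed = new TreeSet<>();
        if (listingOrNull == null) {
            // Optional field missing: no listing to compare against.
            return removed;
        }
        for (Iterator<String> it = knownStorages.iterator(); it.hasNext();) {
            String id = it.next();
            if (!listingOrNull.contains(id)) {
                it.remove();
                removed.add(id);
            }
        }
        return removed;
    }
}
```

Because the listing rides along with the first RPC, the decision does not 
depend on the order in which the remaining per-storage RPCs are processed, 
and no additional RPC is needed.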

[~daryn], I don't like the idea of "band-aiding" this issue rather than fixing 
it at the root.  Throwing an exception on interleaved storage reports, or 
forbidding combined storage reports, seem like very brittle work-arounds that 
could easily be undone by someone making follow-on changes.  I came up with 
patch 005 and the earlier patches as a very simple fix that could easily be 
backported.  If you are interested in something simple, then please check it 
out... or at least give a reason for not checking it out.

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10301
>                 URL: https://issues.apache.org/jira/browse/HDFS-10301
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.1
>            Reporter: Konstantin Shvachko
>            Assignee: Colin Patrick McCabe
>            Priority: Critical
>         Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.01.patch, 
> HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out while sending a block 
> report and then send the block report again. While processing these two 
> reports at the same time, the NameNode can interleave the processing of 
> storages from different reports. This screws up the blockReportId field, 
> which makes the NameNode think that some storages are zombies. Replicas from 
> zombie storages are immediately removed, causing missing blocks.
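
The race in the description above can be reproduced with a toy model. This is 
a sketch under stated assumptions, not the NameNode code: every name is 
hypothetical, each storage is assumed to arrive in its own RPC, and the zombie 
check is assumed to run when the RPC marked as the last one of a report is 
processed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only. All names here are hypothetical, not HDFS classes.
class ZombieStorageSketch {
    /** One per-storage RPC that is part of a full block report. */
    static final class Rpc {
        final long reportId;
        final String storageId;
        final boolean lastRpc;

        Rpc(long reportId, String storageId, boolean lastRpc) {
            this.reportId = reportId;
            this.storageId = storageId;
            this.lastRpc = lastRpc;
        }
    }

    /** Processes RPCs in arrival order; returns storages declared zombie. */
    static List<String> process(List<Rpc> arrivalOrder) {
        // Last block report ID seen for each storage of one DN.
        Map<String, Long> lastReportId = new LinkedHashMap<>();
        lastReportId.put("s1", 0L);
        lastReportId.put("s2", 0L);
        List<String> zombies = new ArrayList<>();
        for (Rpc rpc : arrivalOrder) {
            lastReportId.put(rpc.storageId, rpc.reportId);
            if (rpc.lastRpc) {
                // Any storage not stamped with this report's ID looks like
                // it was missing from the report.
                for (Map.Entry<String, Long> e : lastReportId.entrySet()) {
                    if (e.getValue().longValue() != rpc.reportId) {
                        zombies.add(e.getKey());
                    }
                }
            }
        }
        return zombies;
    }

    /** Report 1 is fully processed before its retransmission, report 2. */
    static List<Rpc> inOrder() {
        return Arrays.asList(
            new Rpc(1, "s1", false), new Rpc(1, "s2", true),
            new Rpc(2, "s1", false), new Rpc(2, "s2", true));
    }

    /** The two reports are processed concurrently and interleave. */
    static List<Rpc> interleaved() {
        return Arrays.asList(
            new Rpc(1, "s1", false), new Rpc(2, "s1", false),
            new Rpc(2, "s2", true), new Rpc(1, "s2", true));
    }

    public static void main(String[] args) {
        System.out.println("in order:    " + process(inOrder()));
        System.out.println("interleaved: " + process(interleaved()));
    }
}
```

In the interleaved order, s1 is stamped with report 2 before the last RPC of 
report 1 is processed, so the check for report 1 flags s1 as a zombie even 
though both reports contained it.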


