[
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271737#comment-15271737
]
Konstantin Shvachko commented on HDFS-10301:
--------------------------------------------
??In the short term, however, I would prefer the current patch, since it
involves no RPC changes, and doesn't require all the DataNodes to be upgraded
before it can work.??
* I don't think my approach requires an RPC change, since the block-report RPC
message already has all the required structures in place. It should require
only a change to the processing logic.
* DataNodes will indeed need to be upgraded, but only if they split their
block reports into multiple RPCs, because a full report already lists all
storages. Even in the multi-RPC case, it only means that zombie storages will
not be removed until those DataNodes are upgraded.
* Colin, it would have been good to have an interim solution, but it does not
seem reasonable to commit a patch that fixes one bug while introducing
another.
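A minimal sketch of the processing-logic change argued for in the first two
bullets, under loud assumptions: the class and method names are hypothetical
and are not HDFS code, and using totalRpcs == 1 as the test for "full report"
is only one reading of the comment. The point it illustrates is that zombie
detection based on the list of storages in a full report does not depend on
the order in which reports are processed.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model, not the NameNode implementation.
class StorageListSketch {
    /**
     * Returns the storages to treat as zombies after one block-report RPC.
     *
     * @param known     storage IDs the NameNode currently tracks for the DataNode
     * @param reported  storage IDs listed in this block-report RPC
     * @param totalRpcs number of RPCs the DataNode split this report into
     */
    static Set<String> zombieStorages(Set<String> known,
                                      Set<String> reported,
                                      int totalRpcs) {
        Set<String> zombies = new TreeSet<>();
        if (totalRpcs != 1) {
            // Assumption: a split report lists only a subset of storages,
            // so nothing can be concluded and nothing is removed.
            return zombies;
        }
        // A full report lists every live storage; anything else is a zombie.
        zombies.addAll(known);
        zombies.removeAll(reported);
        return zombies;
    }

    static Set<String> setOf(String... ids) {
        return new TreeSet<>(Arrays.asList(ids));
    }
}
```

Because the result is computed from a single message, a retransmitted copy of
the same report yields the same answer regardless of how the two copies are
interleaved.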
I traced back a series of jiras related to this problem. It looks like
multiple storages were not thoroughly thought through in the beginning, and
for a while people were trying to solve problems as they appeared. It feels
like it is time for the right fix.
> BlockReport retransmissions may lead to storages falsely being declared
> zombie if storage report processing happens out of order
> --------------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.6.1
> Reporter: Konstantin Shvachko
> Assignee: Colin Patrick McCabe
> Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch,
> HDFS-10301.01.patch, HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report.
> Then it sends the block report again. The NameNode, while processing these
> two reports at the same time, can then interleave processing of storages
> from different reports. This screws up the blockReportId field, which makes
> the NameNode think that some storages are zombies. Replicas from zombie
> storages are immediately removed, causing missing blocks.
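
The interleaving in the description above can be replayed in a simplified,
single-threaded model. This is a sketch under assumptions, not NameNode code:
the class, its methods, and the storage IDs s1..s3 are hypothetical, and the
only thing modeled is a per-storage "last block report ID" that is compared
against the ID of the report that just finished.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the blockReportId bookkeeping, not HDFS code.
class ZombieRaceSketch {
    // storage ID -> ID of the last block report that touched that storage
    private final Map<String, Long> lastReportId = new LinkedHashMap<>();

    void processStorage(String storageId, long reportId) {
        lastReportId.put(storageId, reportId);
    }

    // After a report finishes, any storage stamped with another ID is
    // considered a zombie.
    List<String> findZombies(long finishedReportId) {
        List<String> zombies = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastReportId.entrySet()) {
            if (e.getValue() != finishedReportId) {
                zombies.add(e.getKey());
            }
        }
        return zombies;
    }

    // The original report (ID 1) and its retransmission (ID 2) interleave.
    static List<String> interleavedRun() {
        ZombieRaceSketch nn = new ZombieRaceSketch();
        nn.processStorage("s1", 1L);
        nn.processStorage("s2", 1L);
        nn.processStorage("s1", 2L); // retransmitted report starts here
        nn.processStorage("s3", 1L);
        return nn.findZombies(1L);   // s1 now carries ID 2, not 1
    }

    // The same two reports processed one after the other.
    static List<String> sequentialRun() {
        ZombieRaceSketch nn = new ZombieRaceSketch();
        List<String> zombies = new ArrayList<>();
        for (long reportId = 1L; reportId <= 2L; reportId++) {
            for (String s : new String[] {"s1", "s2", "s3"}) {
                nn.processStorage(s, reportId);
            }
            zombies.addAll(nn.findZombies(reportId));
        }
        return zombies;
    }
}
```

In this model interleavedRun() flags the healthy storage s1 as a zombie,
while sequentialRun() flags nothing.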