Kihwal Lee created HDFS-5438:
--------------------------------
Summary: Flaws in block report processing can cause data loss
Key: HDFS-5438
URL: https://issues.apache.org/jira/browse/HDFS-5438
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0, 0.23.9
Reporter: Kihwal Lee
Priority: Critical
Incremental block reports from data nodes and block commits are
asynchronous. This becomes troublesome when the generation stamp (gen stamp)
of a block is changed during write pipeline recovery.
* If a node's incremental block report is delayed but the NN already had
enough replicas, a report with the old gen stamp may arrive after block
completion. That replica will be correctly marked corrupt. But if the node
participated in the pipeline recovery, a new (delayed) report with the correct
gen stamp will arrive soon afterward. However, this report has no effect on
the corrupt state of the replica.
* If block reports are received while the block is still under construction
(i.e. the NN has not yet received the client's request to commit the block),
they are blindly accepted regardless of the gen stamp. If a failed node
reports in with the old gen stamp while pipeline recovery is ongoing, the
replica will be accepted and counted as valid during commit of the block.
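The second flaw can be sketched as follows. This is a minimal illustrative model, not actual NameNode code; the class and method names are hypothetical, and the gen stamp values are made up for the example:

```java
// Hypothetical sketch of the flawed acceptance rule described above:
// while a block is under construction, a reported replica is accepted
// regardless of its gen stamp, so a stale replica from a failed pipeline
// node slips through.
public class GenStampCheckSketch {
    // Assumed values for illustration: the gen stamp after pipeline
    // recovery, and the stale gen stamp a failed node still holds.
    static final long RECOVERED_GEN_STAMP = 1002L;
    static final long STALE_GEN_STAMP = 1001L;

    // Flawed rule: under-construction blocks skip the gen stamp check.
    static boolean acceptReplica(boolean underConstruction, long reportedGenStamp) {
        if (underConstruction) {
            return true; // flaw: stale gen stamp is accepted blindly
        }
        return reportedGenStamp == RECOVERED_GEN_STAMP;
    }

    public static void main(String[] args) {
        // Failed node reports the old gen stamp during pipeline recovery:
        System.out.println(acceptReplica(true, STALE_GEN_STAMP));  // true: accepted (the bug)
        System.out.println(acceptReplica(false, STALE_GEN_STAMP)); // false: rejected once committed
    }
}
```

A fix would have to validate the reported gen stamp against the block's recovery gen stamp even while the block is under construction.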
Due to the above two problems, correct replicas can be marked corrupt and
corrupt replicas can be accepted during commit. So far we have observed two
cases in production.
* The client hangs forever trying to close a file. All replicas are marked
corrupt.
* After the successful close of a file, reads fail. Corrupt replicas were
accepted during commit and valid replicas were marked corrupt afterward.
--
This message was sent by Atlassian JIRA
(v6.1#6144)