Kihwal Lee created HDFS-8395:
--------------------------------
Summary: Verify on-disk data after transferring block data
Key: HDFS-8395
URL: https://issues.apache.org/jira/browse/HDFS-8395
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Kihwal Lee
Priority: Critical
Currently, the integrity of on-disk data is not checked during pipeline recovery
or replication. The target of a pipeline-recovery transfer can detect
corruption, but sometimes only long after the corruption has occurred (e.g.
HDFS-4660). If multiple pipeline failures occur, this delayed detection can
cause data loss.
During replication involving multiple destinations, a middle node that corrupts
the data can cause the healthy source to be marked corrupt. Because replication
lacks a full ack mechanism, the corrupt replica continues to be written and is
finalized. That replica then becomes the source for further replication,
because the original source is marked corrupt. All subsequent replications
fail, and the result is a missing block.
Adding on-disk corruption detection at the appropriate places, such as right
after block data has been transferred, would improve the situation.
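
To illustrate the kind of check proposed here, the following is a minimal
sketch and not HDFS code. It assumes a block's bytes sit in a single file and
that a list of per-chunk checksums is available from the sender. The names
compute_checksums, verify_on_disk, demo and CHUNK_SIZE are hypothetical, the
512-byte chunk size is chosen only for the example, and zlib.crc32 stands in
for whatever checksum type the cluster actually uses. The point is that the
receiver re-reads what it wrote to disk rather than trusting the bytes it held
in memory.

```python
import os
import tempfile
import zlib
from typing import List, Tuple

CHUNK_SIZE = 512  # hypothetical bytes-per-checksum, chosen for this sketch


def compute_checksums(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Return one CRC32 per chunk of the in-memory data (the sender's view)."""
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def verify_on_disk(path: str, expected: List[int],
                   chunk_size: int = CHUNK_SIZE) -> int:
    """Re-read the file from disk and compare each chunk's CRC32.

    Returns the index of the first mismatching chunk, or -1 if the on-disk
    data matches the expected checksums exactly.
    """
    index = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Extra data on disk, or a checksum mismatch, is corruption.
            if index >= len(expected) or zlib.crc32(chunk) != expected[index]:
                return index
            index += 1
    # Fewer chunks on disk than expected means the file was truncated.
    return -1 if index == len(expected) else index


def demo() -> Tuple[int, int]:
    """Write a 'block', verify it, flip one byte on disk, verify again."""
    data = bytes(range(256)) * 8  # 2048 bytes, i.e. 4 chunks of 512
    expected = compute_checksums(data)
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        clean = verify_on_disk(path, expected)
        # Simulate on-disk corruption inside chunk 1 (byte offset 600).
        with open(path, "r+b") as f:
            f.seek(600)
            f.write(bytes([data[600] ^ 0xFF]))
        corrupted = verify_on_disk(path, expected)
    finally:
        os.remove(path)
    return clean, corrupted
```

Calling demo() returns (-1, 1): the freshly written file verifies cleanly, and
after the byte flip the first bad chunk is reported as chunk 1. In the
multi-destination scenario above, a check of this kind on the middle node
after the transfer would point at the node that actually corrupted the data,
rather than at the healthy source.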
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)