[ https://issues.apache.org/jira/browse/HDFS-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018177#comment-15018177 ]
Mikhail Dutikov commented on HDFS-6937:
---------------------------------------
Good day, is there any update on this? I seem to be running into a similar issue
with the HBase WAL on HDFS 2.6 chd 2.4.8. The write pipeline is being rebuilt
with many candidate datanodes, but none of the substitutes seems to receive the
block replica correctly:
Checksum error in block .... from /IP:port
org.apache.hadoop.fs.ChecksumException: Checksum error:
DFSClient_NONMAPREDUCE_1267344484_1 at 2048 exp: -652491368 got: -585724081
[note: checksums are the same on all new candidate nodes]
java.io.IOException: Terminating due to a checksum error.java.io.IOException: Unexpected checksum mismatch while writing .... from /IP:port
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:562)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:780)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:783)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:243)
    at java.lang.Thread.run(Thread.java:745)
This repeats until no datanode is left to try, which leads to data loss (?) and
to Region Servers terminating because they cannot write to the WAL. (The
cluster has 3 racks and 21 nodes.)
> Another issue in handling checksum errors in write pipeline
> -----------------------------------------------------------
>
> Key: HDFS-6937
> URL: https://issues.apache.org/jira/browse/HDFS-6937
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, hdfs-client
> Affects Versions: 2.5.0
> Reporter: Yongjun Zhang
> Assignee: Colin Patrick McCabe
>
> Given a write pipeline:
> DN1 -> DN2 -> DN3
> DN3 detects a checksum error and terminates; DN2 truncates its replica to the
> ACKed size. Then a new pipeline is attempted as
> DN1 -> DN2 -> DN4
> DN4 detects a checksum error again. Later, when DN4 is replaced with DN5 (and
> so on), the write fails for the same reason. This led to the observation that
> DN2's data is corrupted.
> Found that the software currently truncates DN2's replica to the ACKed size
> after DN3 terminates, but it doesn't check the correctness of the data
> already written to disk.
> So intuitively, a solution would be: when a downstream DN (DN3 here) finds a
> checksum error, it propagates this info back to the upstream DN (DN2 here);
> DN2 then checks the correctness of the data already written to disk and
> truncates the replica to MIN(correctDataSize, ACKedSize).
> Found this issue is similar to what was reported by HDFS-3875, and the
> truncation at DN2 was actually introduced as part of the HDFS-3875 solution.
> Filing this jira for the issue reported here. HDFS-3875 was filed by
> [~tlipcon], who proposed something similar there:
> {quote}
> if the tail node in the pipeline detects a checksum error, then it returns a
> special error code back up the pipeline indicating this (rather than just
> disconnecting)
> if a non-tail node receives this error code, then it immediately scans its
> own block on disk (from the beginning up through the last acked length). If
> it detects a corruption on its local copy, then it should assume that it is
> the faulty one, rather than the downstream neighbor. If it detects no
> corruption, then the faulty node is either the downstream mirror or the
> network link between the two, and the current behavior is reasonable.
> {quote}
> Thanks.
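
The check proposed in the description above can be sketched roughly as follows.
This is only an illustration, not HDFS code: the class and method names are
hypothetical, java.util.zip.CRC32 with a 512-byte chunk stands in for the
datanode's real checksum handling, and the replica is an in-memory byte array
rather than a block file. The scan covers the whole replica and the result is
then capped by the ACKed size, following the MIN(correctDataSize, ACKedSize)
rule.

```java
import java.util.zip.CRC32;

// Hypothetical sketch: none of these names exist in HDFS.
class ReplicaCheckSketch {
    // Assumed chunk size for illustration (one checksum per 512 bytes).
    static final int BYTES_PER_CHECKSUM = 512;

    /** CRC32 of data[off, off + len). */
    static long crc(byte[] data, int off, int len) {
        CRC32 c = new CRC32();
        c.update(data, off, len);
        return c.getValue();
    }

    /** One checksum per chunk, as the writer would have stored next to the data. */
    static long[] checksums(byte[] data) {
        int chunks = (data.length + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        long[] sums = new long[chunks];
        for (int i = 0; i < chunks; i++) {
            int off = i * BYTES_PER_CHECKSUM;
            sums[i] = crc(data, off, Math.min(BYTES_PER_CHECKSUM, data.length - off));
        }
        return sums;
    }

    /**
     * Scans the local replica from the beginning and returns the offset of the
     * first chunk whose checksum does not match, or the replica length if all
     * chunks verify. Assumes stored has one entry per chunk of the replica.
     */
    static long correctDataSize(byte[] replica, long[] stored) {
        for (int i = 0; i < stored.length; i++) {
            int off = i * BYTES_PER_CHECKSUM;
            int len = Math.min(BYTES_PER_CHECKSUM, replica.length - off);
            if (crc(replica, off, len) != stored[i]) {
                return off;
            }
        }
        return replica.length;
    }

    /** Length to truncate to: MIN(correctDataSize, ACKedSize). */
    static long truncateTarget(long correctDataSize, long ackedSize) {
        return Math.min(correctDataSize, ackedSize);
    }

    /**
     * If corruption lies inside the ACKed range, the local node should assume
     * it is the faulty one; otherwise the downstream node or the link is suspect.
     */
    static boolean localReplicaIsFaulty(long correctDataSize, long ackedSize) {
        return correctDataSize < ackedSize;
    }

    public static void main(String[] args) {
        byte[] replica = new byte[2048];
        for (int i = 0; i < replica.length; i++) {
            replica[i] = (byte) (i * 31);
        }
        long[] stored = checksums(replica);

        replica[1500] ^= 0x01;   // simulate on-disk corruption at DN2
        long acked = 1536;

        long good = correctDataSize(replica, stored);
        System.out.println("correctDataSize = " + good);
        System.out.println("truncate to     = " + truncateTarget(good, acked));
        System.out.println("local faulty    = " + localReplicaIsFaulty(good, acked));
    }
}
```

In this example the damaged byte sits inside the ACKed range, so the scan stops
at the start of the chunk containing it. The replica would be cut back further
than the ACKed size alone, and DN2 rather than each new downstream node would be
treated as the faulty one.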