[
https://issues.apache.org/jira/browse/HDFS-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618199#comment-14618199
]
Xinwei Qin commented on HDFS-8732:
-----------------------------------
Hi, [~hitliuyi], I noticed that HDFS-8602 resolved a similar problem, but its
fix does not cover the issue in this jira.
Thanks to [~walter.k.su] for clarifying.
The error log in HDFS-8602:
{code}
2015-07-08 16:19:04,742 ERROR datanode.DataNode
(BlockSender.java:sendPacket(615)) - BlockSender.sendChunks() exception:
java.io.EOFException: EOF Reached. file size is 10 and 65526 more bytes left to
be transfered.
at
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:228)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:765)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:556)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:256)
at java.lang.Thread.run(Thread.java:722)
{code}
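For context, the EOFException in HDFS-8602 comes from the transfer loop in
SocketOutputStream#transferToFully: the sender is asked to stream more bytes
than the replica file actually contains, so the zero-copy transfer makes no
progress once the position passes end-of-file and the loop bails out. A
simplified sketch of that logic (illustrative only, not the exact Hadoop
code):
{code}
import java.io.EOFException;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

public class TransferToFullySketch {
  // Simplified version of SocketOutputStream#transferToFully: keep calling
  // transferTo() until the requested byte count is drained, and fail fast
  // if the file ends before the requested range does.
  static void transferToFully(FileChannel fileCh, long position, long count,
                              WritableByteChannel target) throws IOException {
    while (count > 0) {
      long transferred = fileCh.transferTo(position, count, target);
      if (transferred == 0 && position >= fileCh.size()) {
        // The replica on disk is shorter than the range we were asked to
        // send -- the same condition reported in the log above.
        throw new EOFException("EOF Reached. file size is " + fileCh.size()
            + " and " + count + " more bytes left to be transfered.");
      }
      position += transferred;
      count -= transferred;
    }
  }
}
{code}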
The error/warning logs from this jira, by contrast:
{code}
2015-07-08 15:05:13,455 WARN hdfs.DFSClient
(DFSInputStream.java:actualGetFromOneDataNode(1203)) - fetchBlockByteRange().
Got a checksum exception for /partially_corrupted_1_0 at
BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from
DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,457 WARN hdfs.DFSClient
(StripedBlockUtil.java:getNextCompletedStripedRead(215)) - ExecutionException
java.util.concurrent.ExecutionException: java.io.IOException:
fetchBlockByteRange(). Got a checksum exception for /partially_corrupted_1_0 at
BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from
DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,560 INFO hdfs.StateChange
(FSNamesystem.java:reportBadBlocks(5783)) - *DIR* reportBadBlocks
2015-07-08 15:05:13,561 INFO BlockStateChange
(CorruptReplicasMap.java:addToCorruptReplicasMap(76)) - BLOCK
NameSystem.addToCorruptReplicasMap: blk_-9223372036854775792 added as corrupt
on 127.0.0.1:36871 by /127.0.0.1 because client machine reported it
2015-07-08 15:05:13,690 WARN hdfs.DFSClient
(DFSInputStream.java:actualGetFromOneDataNode(1203)) - fetchBlockByteRange().
Got a checksum exception for /partially_corrupted_1_0 at
BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from
DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,693 WARN hdfs.DFSClient
(StripedBlockUtil.java:getNextCompletedStripedRead(215)) - ExecutionException
java.util.concurrent.ExecutionException: java.io.IOException:
fetchBlockByteRange(). Got a checksum exception for /partially_corrupted_1_0 at
BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from
DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,705 INFO hdfs.StateChange
(FSNamesystem.java:reportBadBlocks(5783)) - *DIR* reportBadBlocks
2015-07-08 15:05:13,706 INFO BlockStateChange
(CorruptReplicasMap.java:addToCorruptReplicasMap(81)) - BLOCK
NameSystem.addToCorruptReplicasMap: duplicate requested for
blk_-9223372036854775792 to add as corrupt on 127.0.0.1:36871 by /127.0.0.1
because client machine reported it
2015-07-08 15:05:14,033 INFO FSNamesystem.audit
(FSNamesystem.java:logAuditMessage(7816)) - allowed=true ugi=root
(auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/partially_corrupted_1_0
dst=null perm=null proto=rpc
2015-07-08 15:05:14,049 INFO hdfs.MiniDFSCluster
(MiniDFSCluster.java:shutdown(1728)) - Shutting down the Mini HDFS Cluster
{code}
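The logs show the expected first half of the story: the client hits a
checksum exception, reports the bad replica
(NameSystem.addToCorruptReplicasMap), retries, and reports it again as a
duplicate. What should happen next is a decode from the remaining data and
parity blocks. A minimal reproduction sketch (illustrative only; locating
the replica file under the MiniDFSCluster data directories is left out, and
the class/method names here are stand-ins, not the real test code):
{code}
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CorruptedStripedReadSketch {
  // Flip a few bytes in the middle of a replica file so the stored checksum
  // no longer matches the data -- this is what makes the client throw the
  // ChecksumException seen in the logs above.
  static void corruptReplica(File blockFile) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(blockFile, "rw")) {
      raf.seek(blockFile.length() / 2);
      raf.write(new byte[]{0x55, 0x55, 0x55, 0x55});
    }
  }

  // With one striped block corrupted, a full read should still succeed by
  // decoding from parity; in this jira it fails with the NPE quoted below.
  static void readFully(FileSystem fs, Path file) throws IOException {
    byte[] buf = new byte[4096];
    try (FSDataInputStream in = fs.open(file)) {
      while (in.read(buf) != -1) {
        // drain the stream; a ChecksumException on one block should be
        // survivable for an EC file
      }
    }
  }
}
{code}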
> Erasure Coding: Fail to read a file with corrupted blocks
> ---------------------------------------------------------
>
> Key: HDFS-8732
> URL: https://issues.apache.org/jira/browse/HDFS-8732
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Xinwei Qin
> Assignee: Walter Su
>
> In the system test of reading EC files (HDFS-8259), the
> {{testReadCorruptedData*()}} methods failed to read an EC file with
> corrupted blocks (some data in several blocks is overwritten, which makes
> the client get a checksum exception).
> Exception logs:
> {code}
> java.lang.NullPointerException
> at
> org.apache.hadoop.hdfs.DFSStripedInputStream$StatefulStripeReader.readChunk(DFSStripedInputStream.java:771)
> at
> org.apache.hadoop.hdfs.DFSStripedInputStream$StripeReader.readStripe(DFSStripedInputStream.java:623)
> at
> org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:335)
> at
> org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:465)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:946)
> at java.io.DataInputStream.read(DataInputStream.java:149)
> at
> org.apache.hadoop.hdfs.StripedFileTestUtil.verifyStatefulRead(StripedFileTestUtil.java:98)
> at
> org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding.verifyRead(TestReadStripedFileWithDecoding.java:196)
> at
> org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding.testOneFileWithBlockCorrupted(TestReadStripedFileWithDecoding.java:246)
> at
> org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding.testReadCorruptedData11(TestReadStripedFileWithDecoding.java:114)
> {code}
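The NullPointerException above is the real bug: when the per-chunk block
reader has been torn down after the checksum failure, readChunk() still
dereferences it instead of treating the chunk as missing and letting the
decoder rebuild it from parity. A self-contained sketch of that idea
(illustrative only -- the type and field names below are stand-ins, not the
actual DFSStripedInputStream internals or the actual patch):
{code}
public class NullReaderGuardSketch {
  enum ChunkState { REQUESTED, FETCHED, MISSING }

  static class Chunk { ChunkState state = ChunkState.REQUESTED; }

  interface BlockReader { byte[] readChunkData() throws java.io.IOException; }

  // One reader per chunk; a slot becomes null when its replica fails a
  // checksum and the reader is closed (the situation in the logs above).
  static boolean readChunk(BlockReader[] readers, Chunk[] chunks, int i)
      throws java.io.IOException {
    if (readers[i] == null) {
      // Reader already closed after the ChecksumException: mark the chunk
      // MISSING so the caller falls back to decoding from parity, rather
      // than hitting the NPE in the stack trace above.
      chunks[i].state = ChunkState.MISSING;
      return false;
    }
    readers[i].readChunkData();
    chunks[i].state = ChunkState.FETCHED;
    return true;
  }
}
{code}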