[ https://issues.apache.org/jira/browse/HDFS-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618199#comment-14618199 ]

Xinwei Qin  commented on HDFS-8732:
-----------------------------------

Hi [~hitliuyi], I noticed that HDFS-8602 resolved a similar problem, but its fix
does not cover the issue in this JIRA.
Thanks [~walter.k.su] for clarifying.

The error log in HDFS-8602:
{code}
2015-07-08 16:19:04,742 ERROR datanode.DataNode (BlockSender.java:sendPacket(615)) - BlockSender.sendChunks() exception: java.io.EOFException: EOF Reached. file size is 10 and 65526 more bytes left to be transfered.
    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:228)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:765)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:556)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:256)
    at java.lang.Thread.run(Thread.java:722)
{code}
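
That EOFException means the DataNode was asked to stream more bytes than the on-disk replica actually holds. A minimal standalone sketch of that condition (plain Java, not the HDFS code; the class and variable names here are made up):
{code}
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Standalone sketch of the condition behind the log above: the reader
// believes the replica holds more bytes than the on-disk file, so the
// zero-copy transfer hits end-of-file before the requested count is done.
public class ShortReplicaEofSketch {
  public static void main(String[] args) throws IOException {
    File replica = File.createTempFile("replica", ".blk");
    replica.deleteOnExit();
    try (FileOutputStream out = new FileOutputStream(replica)) {
      out.write(new byte[10]);            // on-disk file is only 10 bytes
    }
    long wanted = 10L + 65526L;           // bytes the sender was asked for
    long sent = 0;
    try (FileChannel src = new FileInputStream(replica).getChannel();
         WritableByteChannel sink =
             Channels.newChannel(OutputStream.nullOutputStream())) {
      while (sent < wanted) {
        long n = src.transferTo(sent, wanted - sent, sink);
        if (n <= 0) {
          // HDFS's SocketOutputStream.transferToFully makes the same check
          // and throws, rather than spinning forever on a short file.
          throw new EOFException("EOF Reached. file size is " + replica.length()
              + " and " + (wanted - sent) + " more bytes left to be transfered.");
        }
        sent += n;
      }
    }
  }
}
{code}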

and the error/warning log from this JIRA:
{code}
2015-07-08 15:05:13,455 WARN hdfs.DFSClient (DFSInputStream.java:actualGetFromOneDataNode(1203)) - fetchBlockByteRange(). Got a checksum exception for /partially_corrupted_1_0 at BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,457 WARN hdfs.DFSClient (StripedBlockUtil.java:getNextCompletedStripedRead(215)) - ExecutionException java.util.concurrent.ExecutionException: java.io.IOException: fetchBlockByteRange(). Got a checksum exception for /partially_corrupted_1_0 at BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,560 INFO hdfs.StateChange (FSNamesystem.java:reportBadBlocks(5783)) - *DIR* reportBadBlocks
2015-07-08 15:05:13,561 INFO BlockStateChange (CorruptReplicasMap.java:addToCorruptReplicasMap(76)) - BLOCK NameSystem.addToCorruptReplicasMap: blk_-9223372036854775792 added as corrupt on 127.0.0.1:36871 by /127.0.0.1 because client machine reported it
2015-07-08 15:05:13,690 WARN hdfs.DFSClient (DFSInputStream.java:actualGetFromOneDataNode(1203)) - fetchBlockByteRange(). Got a checksum exception for /partially_corrupted_1_0 at BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,693 WARN hdfs.DFSClient (StripedBlockUtil.java:getNextCompletedStripedRead(215)) - ExecutionException java.util.concurrent.ExecutionException: java.io.IOException: fetchBlockByteRange(). Got a checksum exception for /partially_corrupted_1_0 at BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,705 INFO hdfs.StateChange (FSNamesystem.java:reportBadBlocks(5783)) - *DIR* reportBadBlocks
2015-07-08 15:05:13,706 INFO BlockStateChange (CorruptReplicasMap.java:addToCorruptReplicasMap(81)) - BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_-9223372036854775792 to add as corrupt on 127.0.0.1:36871 by /127.0.0.1 because client machine reported it
2015-07-08 15:05:14,033 INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(7816)) - allowed=true	ugi=root (auth:SIMPLE)	ip=/127.0.0.1	cmd=open	src=/partially_corrupted_1_0	dst=null	perm=null	proto=rpc
2015-07-08 15:05:14,049 INFO hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1728)) - Shutting down the Mini HDFS Cluster
{code}
{code}
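
The checksum exceptions here are the expected first-order symptom: HDFS records a CRC per fixed-size chunk of block data (512 bytes by default) in the block's .meta file at write time, so any byte overwritten afterwards fails verification at read time. A rough self-contained illustration, using java.util.zip.CRC32 as a stand-in for Hadoop's checksum classes:
{code}
import java.util.zip.CRC32;

// Rough illustration of why overwriting block data surfaces as a checksum
// exception on the client (CRC32 here stands in for Hadoop's checksums).
public class ChecksumMismatchSketch {
  public static void main(String[] args) {
    byte[] chunk = new byte[512];        // one checksummed chunk of block data
    CRC32 crc = new CRC32();
    crc.update(chunk, 0, chunk.length);
    long stored = crc.getValue();        // recorded at block-write time

    chunk[13] ^= 0x01;                   // later corruption: flip one byte

    crc.reset();
    crc.update(chunk, 0, chunk.length);
    long recomputed = crc.getValue();    // what the verifier sees at read time

    // On mismatch the client throws ChecksumException and reports the
    // replica as bad, which is the reportBadBlocks traffic in the log above.
    System.out.println("corrupt chunk detected: " + (stored != recomputed));
  }
}
{code}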

> Erasure Coding: Fail to read a file with corrupted blocks
> ---------------------------------------------------------
>
>                 Key: HDFS-8732
>                 URL: https://issues.apache.org/jira/browse/HDFS-8732
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Xinwei Qin 
>            Assignee: Walter Su
>
> In the system test for reading EC files (HDFS-8259), the
> {{testReadCorruptedData*()}} methods failed to read an EC file with corrupted
> blocks (some data in several blocks is overwritten, which makes the client
> hit a checksum exception).
> Exception logs:
> {code}
> java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.DFSStripedInputStream$StatefulStripeReader.readChunk(DFSStripedInputStream.java:771)
>     at org.apache.hadoop.hdfs.DFSStripedInputStream$StripeReader.readStripe(DFSStripedInputStream.java:623)
>     at org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:335)
>     at org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:465)
>     at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:946)
>     at java.io.DataInputStream.read(DataInputStream.java:149)
>     at org.apache.hadoop.hdfs.StripedFileTestUtil.verifyStatefulRead(StripedFileTestUtil.java:98)
>     at org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding.verifyRead(TestReadStripedFileWithDecoding.java:196)
>     at org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding.testOneFileWithBlockCorrupted(TestReadStripedFileWithDecoding.java:246)
>     at org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding.testReadCorruptedData11(TestReadStripedFileWithDecoding.java:114)
> {code}
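
That the read ends in a NullPointerException, rather than a clean decode from parity, suggests the stripe reader dereferences a per-source reader slot that was nulled after the checksum failure. A self-contained analog of that pattern (the names below are hypothetical, not taken from the DFSStripedInputStream source):
{code}
// Self-contained analog of the failure pattern the trace suggests:
// one reader slot per data block in the stripe, a slot nulled after
// its source failed a checksum, and a retry that dereferences the slot
// instead of decoding the missing chunk from parity.
public class StripeReaderNpeSketch {
  interface ChunkReader {
    byte[] readChunk();
  }

  public static void main(String[] args) {
    ChunkReader[] slots = new ChunkReader[3];   // one slot per data block
    for (int i = 0; i < slots.length; i++) {
      slots[i] = () -> new byte[512];           // healthy sources
    }
    slots[1] = null;   // source 1 reported a ChecksumException earlier

    for (int i = 0; i < slots.length; i++) {
      // Missing null check: throws NullPointerException at i == 1,
      // matching the trace at StatefulStripeReader.readChunk above.
      byte[] chunk = slots[i].readChunk();
      System.out.println("read " + chunk.length + " bytes from source " + i);
    }
  }
}
{code}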



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
