[jira] [Commented] (HDFS-8732) Erasure Coding: Fail to read a file with corrupted blocks
[ https://issues.apache.org/jira/browse/HDFS-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621060#comment-14621060 ]

Jing Zhao commented on HDFS-8732:

I think HDFS-8669 can fix this.

Erasure Coding: Fail to read a file with corrupted blocks

Key: HDFS-8732
URL: https://issues.apache.org/jira/browse/HDFS-8732
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Xinwei Qin
Assignee: Walter Su
Attachments: testReadCorruptedData.patch

In the system test for reading EC files (HDFS-8259), the methods {{testReadCorruptedData*()}} failed to read an EC file with corrupted blocks (the test overwrites some data in several blocks, which makes the client get a checksum exception). Exception logs:
{code}
java.lang.NullPointerException
	at org.apache.hadoop.hdfs.DFSStripedInputStream$StatefulStripeReader.readChunk(DFSStripedInputStream.java:771)
	at org.apache.hadoop.hdfs.DFSStripedInputStream$StripeReader.readStripe(DFSStripedInputStream.java:623)
	at org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:335)
	at org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:465)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:946)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at org.apache.hadoop.hdfs.StripedFileTestUtil.verifyStatefulRead(StripedFileTestUtil.java:98)
	at org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding.verifyRead(TestReadStripedFileWithDecoding.java:196)
	at org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding.testOneFileWithBlockCorrupted(TestReadStripedFileWithDecoding.java:246)
	at org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding.testReadCorruptedData11(TestReadStripedFileWithDecoding.java:114)
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HDFS-8732) Erasure Coding: Fail to read a file with corrupted blocks
[ https://issues.apache.org/jira/browse/HDFS-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621593#comment-14621593 ]

Xinwei Qin commented on HDFS-8732:

Yes, [~jingzhao], the test passed with the patch in HDFS-8669. I think this jira can be closed now.
[jira] [Commented] (HDFS-8732) Erasure Coding: Fail to read a file with corrupted blocks
[ https://issues.apache.org/jira/browse/HDFS-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618194#comment-14618194 ]

Walter Su commented on HDFS-8732:

This is different from HDFS-8602. HDFS-8602 corrupts a block by changing the block size; this jira overwrites some bytes of the block to cause a checksum exception.
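To make the distinction between the two kinds of corruption concrete, here is a minimal, self-contained sketch. It does not use any Hadoop code: the class name, the method names, and the 512-byte chunk size are assumptions made for illustration, and `java.util.zip.CRC32` merely stands in for whatever checksum the cluster is configured with. It shows that overwriting bytes leaves the block length unchanged but breaks a per-chunk checksum, whereas truncation shows up as a length mismatch before any checksum is compared.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

public class CorruptionSketch {
    // Assumed chunk size for this sketch only.
    static final int BYTES_PER_CHECKSUM = 512;

    /** Computes one CRC32 value per fixed-size chunk of the block. */
    static long[] checksums(byte[] block) {
        int chunks = (block.length + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        long[] sums = new long[chunks];
        for (int i = 0; i < chunks; i++) {
            int off = i * BYTES_PER_CHECKSUM;
            int len = Math.min(BYTES_PER_CHECKSUM, block.length - off);
            CRC32 crc = new CRC32();
            crc.update(block, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    /** Returns the index of the first chunk whose checksum differs, or -1 if none does. */
    static int firstBadChunk(byte[] block, long[] expected) {
        long[] actual = checksums(block);
        for (int i = 0; i < Math.min(actual.length, expected.length); i++) {
            if (actual[i] != expected[i]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] block = new byte[2048];
        Arrays.fill(block, (byte) 7);
        long[] stored = checksums(block);

        // Overwrite style (this jira): same length, a few bytes changed.
        byte[] overwritten = block.clone();
        Arrays.fill(overwritten, 600, 604, (byte) 0);
        System.out.println("overwritten: same length = "
            + (overwritten.length == block.length)
            + ", first bad chunk = " + firstBadChunk(overwritten, stored));

        // Size-change style (HDFS-8602): the data is shorter than expected.
        byte[] truncated = Arrays.copyOf(block, 10);
        System.out.println("truncated: length = " + truncated.length
            + ", expected = " + block.length);
    }
}
```

In this sketch the overwritten bytes fall in the second 512-byte chunk, so only that chunk fails verification; the truncated copy never reaches the checksum comparison because its length already disagrees with what the reader expects.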
[jira] [Commented] (HDFS-8732) Erasure Coding: Fail to read a file with corrupted blocks
[ https://issues.apache.org/jira/browse/HDFS-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618199#comment-14618199 ]

Xinwei Qin commented on HDFS-8732:

Hi, [~hitliuyi], I noticed that HDFS-8602 resolved a similar problem, but it does not fix the issue in this jira. Thanks [~walter.k.su] for clarifying.

The error log in HDFS-8602:
{code}
2015-07-08 16:19:04,742 ERROR datanode.DataNode (BlockSender.java:sendPacket(615)) - BlockSender.sendChunks() exception:
java.io.EOFException: EOF Reached. file size is 10 and 65526 more bytes left to be transfered.
	at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:228)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:765)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:556)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:256)
	at java.lang.Thread.run(Thread.java:722)
{code}

and the error/warning log for this jira:
{code}
2015-07-08 15:05:13,455 WARN hdfs.DFSClient (DFSInputStream.java:actualGetFromOneDataNode(1203)) - fetchBlockByteRange(). Got a checksum exception for /partially_corrupted_1_0 at BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,457 WARN hdfs.DFSClient (StripedBlockUtil.java:getNextCompletedStripedRead(215)) - ExecutionException
java.util.concurrent.ExecutionException: java.io.IOException: fetchBlockByteRange(). Got a checksum exception for /partially_corrupted_1_0 at BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,560 INFO hdfs.StateChange (FSNamesystem.java:reportBadBlocks(5783)) - *DIR* reportBadBlocks
2015-07-08 15:05:13,561 INFO BlockStateChange (CorruptReplicasMap.java:addToCorruptReplicasMap(76)) - BLOCK NameSystem.addToCorruptReplicasMap: blk_-9223372036854775792 added as corrupt on 127.0.0.1:36871 by /127.0.0.1 because client machine reported it
2015-07-08 15:05:13,690 WARN hdfs.DFSClient (DFSInputStream.java:actualGetFromOneDataNode(1203)) - fetchBlockByteRange(). Got a checksum exception for /partially_corrupted_1_0 at BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,693 WARN hdfs.DFSClient (StripedBlockUtil.java:getNextCompletedStripedRead(215)) - ExecutionException
java.util.concurrent.ExecutionException: java.io.IOException: fetchBlockByteRange(). Got a checksum exception for /partially_corrupted_1_0 at BP-1928182115-9.96.1.31-1436339108502:blk_-9223372036854775792_1001:13824 from DatanodeInfoWithStorage[127.0.0.1:36871,DS-cfab070a-8983-4c61-8647-eb0526df31c9,DISK]
2015-07-08 15:05:13,705 INFO hdfs.StateChange (FSNamesystem.java:reportBadBlocks(5783)) - *DIR* reportBadBlocks
2015-07-08 15:05:13,706 INFO BlockStateChange (CorruptReplicasMap.java:addToCorruptReplicasMap(81)) - BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_-9223372036854775792 to add as corrupt on 127.0.0.1:36871 by /127.0.0.1 because client machine reported it
2015-07-08 15:05:14,033 INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(7816)) - allowed=true ugi=root (auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/partially_corrupted_1_0 dst=null perm=null proto=rpc
2015-07-08 15:05:14,049 INFO hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1728)) - Shutting down the Mini HDFS Cluster
{code}
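The client log for this jira shows the checksum failure arriving wrapped in a `java.util.concurrent.ExecutionException`, which is how a failure inside a task surfaces when its `Future` is queried. The following sketch illustrates only that general pattern with the standard `CompletionService` API. It is not Hadoop's implementation: the class name, the `readChunks` method, the simulated corrupt chunk, and the pool size are all hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class StripedReadSketch {

    /**
     * Submits one simulated chunk read per index in [0, numChunks) and
     * returns the sorted indices of the reads that failed.
     */
    static List<Integer> readChunks(int numChunks, Set<Integer> corruptChunks)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CompletionService<Void> service = new ExecutorCompletionService<>(pool);
        // Remember which chunk each future belongs to.
        Map<Future<Void>, Integer> futures = new HashMap<>();
        try {
            for (int i = 0; i < numChunks; i++) {
                final int chunk = i;
                futures.put(service.submit(() -> {
                    if (corruptChunks.contains(chunk)) {
                        // Stand-in for a checksum failure during the read.
                        throw new IOException("checksum exception in chunk " + chunk);
                    }
                    return null;
                }), chunk);
            }
            List<Integer> failed = new ArrayList<>();
            for (int i = 0; i < numChunks; i++) {
                Future<Void> done = service.take();
                try {
                    done.get();
                } catch (ExecutionException e) {
                    // e.getCause() is the IOException thrown by the read task.
                    failed.add(futures.get(done));
                }
            }
            Collections.sort(failed);
            return failed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("failed chunks: " + readChunks(6, Collections.singleton(2)));
    }
}
```

The point of the sketch is that the caller has to record a failed chunk explicitly and handle it later, for example by decoding from the remaining chunks. If that bookkeeping leaves a per-chunk structure unset, a later step can fail with an unrelated-looking error such as the NullPointerException reported in this issue, although the thread does not confirm that this is the actual cause.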
[jira] [Commented] (HDFS-8732) Erasure Coding: Fail to read a file with corrupted blocks
[ https://issues.apache.org/jira/browse/HDFS-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618246#comment-14618246 ]

Xinwei Qin commented on HDFS-8732:

a simple patch
[jira] [Commented] (HDFS-8732) Erasure Coding: Fail to read a file with corrupted blocks
[ https://issues.apache.org/jira/browse/HDFS-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618180#comment-14618180 ]

Yi Liu commented on HDFS-8732:

fixed in HDFS-8602?