[jira] [Updated] (HDFS-10624) VolumeScanner to report why a block is found bad

     [ https://issues.apache.org/jira/browse/HDFS-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yongjun Zhang updated HDFS-10624:
---------------------------------
    Description:     (was: Seeing the following on DN log.
{code}
2016-04-07 20:27:45,416 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 received exception java.io.EOFException: Premature EOF: no length prefix available
2016-04-07 20:27:45,416 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: rn2-lampp-lapp1115.rno.apple.com:1110:DataXceiver error processing WRITE_BLOCK operation src: /10.204.64.137:45112 dst: /10.204.64.151:1110
java.io.EOFException: Premature EOF: no length prefix available
	at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2241)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:738)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
	at java.lang.Thread.run(Thread.java:745)
2016-04-07 20:27:46,116 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-1800173197-10.204.68.5-125156296:blk_1170125248_96458336 on /ngs8/app/lampp/dfs/dn
2016-04-07 20:27:46,117 ERROR org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) exiting because of exception
java.lang.NullPointerException
	at org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621)
2016-04-07 20:27:46,118 INFO org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) exiting.
2016-04-07 20:27:46,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.204.64.151, datanodeUuid=6064994a-6769-4192-9377-83f78bd3d7a6, infoPort=0, infoSecurePort=1175, ipcPort=1120, storageInfo=lv=-56;cid=cluster6;nsid=1112595121;c=0):Failed to transfer BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 to 10.204.64.10:1110 got
java.net.SocketException: Original Exception : java.io.IOException: Connection reset by peer
	at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
	at sun.nio.ch.IOUtil.write(IOUtil.java:65)
	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
	at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
	at org.apache.hadoop.security.SaslOutputStream.write(SaslOutputStream.java:190)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:758)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:705)
	at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2154)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.transferReplicaForPipelineRecovery(DataNode.java:2884)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.transferBlock(DataXceiver.java:862)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opTransferBlock(Receiver.java:200)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:118)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Connection
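The NullPointerException in the VolumeScanner stack trace above is fatal to the scanner thread: an unexpected exception thrown out of the bad-block report in the scan-result handler propagates up through runLoop and the scanner exits. A minimal sketch of the containment this suggests, using simplified stand-in names (ScanResultHandlerSketch, BadBlockReporter, handleBadBlock are illustrative, not the actual Hadoop classes):

```java
// Simplified stand-in for a scan-result handler; the class, interface,
// and method names are hypothetical, not the real Hadoop code.
// Shows containing an unexpected exception from the bad-block report
// so the scanner thread can keep running instead of exiting.
public class ScanResultHandlerSketch {
    interface BadBlockReporter {
        void reportBadBlock(String blockId) throws Exception;
    }

    private final BadBlockReporter reporter;

    public ScanResultHandlerSketch(BadBlockReporter reporter) {
        this.reporter = reporter;
    }

    /** Returns true if the report succeeded, false if it failed and was logged. */
    public boolean handleBadBlock(String blockId) {
        try {
            reporter.reportBadBlock(blockId);
            return true;
        } catch (Exception e) {  // e.g. an NPE from datanode-side state being gone
            // Log *why* the block was found bad and why reporting failed,
            // rather than letting the exception kill the scan loop.
            System.err.println("Failed to report bad block " + blockId + ": " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        ScanResultHandlerSketch h = new ScanResultHandlerSketch(id -> {
            throw new NullPointerException("block pool not registered");
        });
        System.out.println(h.handleBadBlock("blk_1170125248_96458336")); // prints "false"
    }
}
```

With this shape, a failed report degrades to a logged warning and a boolean result the scan loop can act on, instead of terminating the whole VolumeScanner.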