[jira] [Updated] (HDFS-10624) VolumeScanner to report why a block is found bad

2016-07-13 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-10624:
-
Description: (was: Seeing the following on DN log. 

{code}
2016-04-07 20:27:45,416 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opWriteBlock BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 
received exception java.io.EOFException: Premature EOF: no length prefix 
available
2016-04-07 20:27:45,416 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
rn2-lampp-lapp1115.rno.apple.com:1110:DataXceiver error processing WRITE_BLOCK 
operation  src: /10.204.64.137:45112 dst: /10.204.64.151:1110
java.io.EOFException: Premature EOF: no length prefix available
at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2241)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:738)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
at java.lang.Thread.run(Thread.java:745)
2016-04-07 20:27:46,116 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
BP-1800173197-10.204.68.5-125156296:blk_1170125248_96458336 on 
/ngs8/app/lampp/dfs/dn
2016-04-07 20:27:46,117 ERROR 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: 
VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) 
exiting because of exception
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621)
2016-04-07 20:27:46,118 INFO 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: 
VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) 
exiting.
2016-04-07 20:27:46,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(10.204.64.151, 
datanodeUuid=6064994a-6769-4192-9377-83f78bd3d7a6, infoPort=0, 
infoSecurePort=1175, ipcPort=1120, 
storageInfo=lv=-56;cid=cluster6;nsid=1112595121;c=0):Failed to transfer 
BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 to 
10.204.64.10:1110 got
java.net.SocketException: Original Exception : java.io.IOException: Connection 
reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at 
org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at 
org.apache.hadoop.security.SaslOutputStream.write(SaslOutputStream.java:190)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:758)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:705)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2154)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.transferReplicaForPipelineRecovery(DataNode.java:2884)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.transferBlock(DataXceiver.java:862)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opTransferBlock(Receiver.java:200)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:118)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Connection 

[jira] [Updated] (HDFS-10624) VolumeScanner to report why a block is found bad

2016-07-13 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-10624:
-
Description: 
Seeing the following on DN log. 

{code}
2016-04-07 20:27:45,416 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opWriteBlock BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 
received exception java.io.EOFException: Premature EOF: no length prefix 
available
2016-04-07 20:27:45,416 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
rn2-lampp-lapp1115.rno.apple.com:1110:DataXceiver error processing WRITE_BLOCK 
operation  src: /10.204.64.137:45112 dst: /10.204.64.151:1110
java.io.EOFException: Premature EOF: no length prefix available
at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2241)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:738)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
at java.lang.Thread.run(Thread.java:745)
2016-04-07 20:27:46,116 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
BP-1800173197-10.204.68.5-125156296:blk_1170125248_96458336 on 
/ngs8/app/lampp/dfs/dn
2016-04-07 20:27:46,117 ERROR 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: 
VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) 
exiting because of exception
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621)
2016-04-07 20:27:46,118 INFO 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: 
VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) 
exiting.
2016-04-07 20:27:46,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(10.204.64.151, 
datanodeUuid=6064994a-6769-4192-9377-83f78bd3d7a6, infoPort=0, 
infoSecurePort=1175, ipcPort=1120, 
storageInfo=lv=-56;cid=cluster6;nsid=1112595121;c=0):Failed to transfer 
BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 to 
10.204.64.10:1110 got
java.net.SocketException: Original Exception : java.io.IOException: Connection 
reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at 
org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at 
org.apache.hadoop.security.SaslOutputStream.write(SaslOutputStream.java:190)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:758)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:705)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2154)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.transferReplicaForPipelineRecovery(DataNode.java:2884)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.transferBlock(DataXceiver.java:862)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opTransferBlock(Receiver.java:200)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:118)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Connection reset by