[ https://issues.apache.org/jira/browse/HDFS-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445350#comment-13445350 ]

Todd Lipcon commented on HDFS-3874:
-----------------------------------

The bug seems to be that the datanode doesn't report the correct remote DN
when it detects a checksum error while receiving a block: the corrupt-block
report names "datanode :0" instead of a real address. Here are the DN-side
logs:

{code}
2012-08-27 16:34:30,396 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Checksum error in block BP-1507505631-172.29.97.196-1337120439433:blk_8285012733733669474_140475196 from /172.29.97.219:52544
org.apache.hadoop.fs.ChecksumException: Checksum error: DFSClient_NONMAPREDUCE_334070927_1 at 44032 exp: -983390667 got: 557443094
        at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:335)
        at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:266)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.verifyChunks(BlockReceiver.java:377)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:496)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:635)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:506)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:219)
        at java.lang.Thread.run(Thread.java:662)
2012-08-27 16:34:30,396 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: report corrupt block BP-1507505631-172.29.97.196-1337120439433:blk_8285012733733669474_140475196 from datanode :0 to namenode
{code}
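
To make the failure mode concrete, here is a minimal sketch of why a report
naming "datanode :0" would be rejected. It is not Hadoop code: the class
CorruptReportSketch and its methods are hypothetical, and it assumes that
":0" is what an unset address (empty host, port 0) looks like when printed
as host:port. The error text only mirrors the namenode message for this
issue.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; none of these names are Hadoop APIs.
class CorruptReportSketch {
    // "host:port" addresses of the datanodes this toy namenode knows about.
    private final Set<String> registered = new HashSet<>();

    void register(String host, int port) {
        registered.add(formatAddr(host, port));
    }

    // Formats an address as host:port, the shape seen in the logs.
    static String formatAddr(String host, int port) {
        return host + ":" + port;
    }

    // Rejects a corrupt-block report whose source is not a registered datanode.
    void markCorrupt(String block, String reportedAddr) throws IOException {
        if (!registered.contains(reportedAddr)) {
            throw new IOException("Cannot mark " + block
                + " as corrupt because datanode " + reportedAddr
                + " does not exist");
        }
    }

    public static void main(String[] args) {
        CorruptReportSketch nn = new CorruptReportSketch();
        nn.register("172.29.97.219", 50010);

        // Assumption: the reporter never filled in the remote address,
        // so it formats as an empty host plus port 0.
        String unset = formatAddr("", 0);
        try {
            nn.markCorrupt("blk_8285012733733669474_140475196", unset);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under that assumption, the fix belongs on the reporting side: the report
should carry the address of the node that actually sent the bad data, so
that the lookup has something valid to match.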
                
> Exception when client reports bad checksum to NN
> ------------------------------------------------
>
>                 Key: HDFS-3874
>                 URL: https://issues.apache.org/jira/browse/HDFS-3874
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client, name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>
> We see the following exception in our logs on a cluster:
> {code}
> 2012-08-27 16:34:30,400 INFO org.apache.hadoop.hdfs.StateChange: *DIR* NameNode.reportBadBlocks
> 2012-08-27 16:34:30,400 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: Cannot mark blk_8285012733733669474_140475196{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.29.97.219:50010|RBW]]}(same as stored) as corrupt because datanode :0 does not exist
> 2012-08-27 16:34:30,400 INFO org.apache.hadoop.ipc.Server: IPC Server handler 46 on 8020, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.reportBadBlocks from 172.29.97.219:43805: error: java.io.IOException: Cannot mark blk_8285012733733669474_140475196{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.29.97.219:50010|RBW]]}(same as stored) as corrupt because datanode :0 does not exist
> java.io.IOException: Cannot mark blk_8285012733733669474_140475196{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.29.97.219:50010|RBW]]}(same as stored) as corrupt because datanode :0 does not exist
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.markBlockAsCorrupt(BlockManager.java:1001)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.findAndMarkBlockAsCorrupt(BlockManager.java:994)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.reportBadBlocks(FSNamesystem.java:4736)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.reportBadBlocks(NameNodeRpcServer.java:537)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.reportBadBlocks(DatanodeProtocolServerSideTranslatorPB.java:242)
>         at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:20032)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
> {code}
