[
https://issues.apache.org/jira/browse/HDFS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brahma Reddy Battula updated HDFS-6804:
---------------------------------------
Attachment: HDFS-6804-branch-2.8.patch
Uploading a testcase for branch-2.8. The testcase is not applicable to {{trunk}}
and {{branch-2}}, since the transfer will fail after HDFS-10958 and HDFS-11337.
Please check the traces below for the same.
After reverting HDFS-11060, the testcase will fail.
*Trunk*
{noformat}
getBlockURI() =
file:/D:/OSCode/hadoop-trunk/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current/BP-1559881164-10.18.246.125-1496746386088/current/finalized/subdir0/subdir0/blk_1073741825
reopen failed. Unable to move meta file
D:\OSCode\hadoop-trunk\hadoop\hadoop-hdfs-project\hadoop-hdfs\target\test\data\dfs\data\data1\current\BP-1559881164-10.18.246.125-1496746386088\current\finalized\subdir0\subdir0\blk_1073741825_1001.meta
to rbw dir
D:\OSCode\hadoop-trunk\hadoop\hadoop-hdfs-project\hadoop-hdfs\target\test\data\dfs\data\data1\current\BP-1559881164-10.18.246.125-1496746386088\current\rbw\blk_1073741825_1002.meta
at
org.apache.hadoop.hdfs.server.datanode.LocalReplicaInPipeline.moveReplicaFrom(LocalReplicaInPipeline.java:388)
at
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.append(FsVolumeImpl.java:1194)
at
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.append(FsDatasetImpl.java:1176)
{noformat}
*branch-2*
{noformat}
2017-06-06 20:01:40,314 WARN datanode.DataNode (DataNode.java:run(2488)) -
DatanodeRegistration(127.0.0.1:52070,
datanodeUuid=4fd53a59-936b-4e14-836d-83c30c530c1c, infoPort=52106,
infoSecurePort=0, ipcPort=52107,
storageInfo=lv=-57;cid=testClusterID;nsid=2117479843;c=1496750487571):Failed to
transfer BP-974715696-10.18.246.125-1496750487571:blk_1073741825_1001 to
127.0.0.1:52121 got
java.io.IOException: Block
BP-974715696-10.18.246.125-1496750487571:blk_1073741825_1001 is not valid.
Expected block file at null does not exist.
at
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockInputStream(FsDatasetImpl.java:810)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:417)
at
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2442)
{noformat}
[~jojochuang] Kindly review the testcase. Sorry for the delayed response; I
missed this.
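For context, the interleaving described in the original report below can be sketched outside Hadoop. This is not Hadoop code; the directory names mirror the finalized/rbw layout from the traces, and the block file name is illustrative. It only shows the hazard: a transfer resolves the block file path while the replica is finalized, an append moves the replica into the rbw directory, and the transfer then fails on the stale path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Minimal sketch of the append/transfer interleaving (not Hadoop code).
public class AppendTransferRace {
    public static void main(String[] args) throws IOException {
        Path finalized = Files.createTempDirectory("finalized");
        Path rbw = Files.createTempDirectory("rbw");
        Path blk = Files.createFile(finalized.resolve("blk_1073741825"));

        // Step 1: the transfer resolves the block file while it is finalized.
        Path resolved = blk;

        // Step 2: an append wins the race and moves the replica into rbw,
        // as FsVolumeImpl.append -> moveReplicaFrom does with the real files.
        Files.move(resolved, rbw.resolve("blk_1073741825"));

        // Step 3: the transfer opens the stale path and fails, analogous to
        // "Expected block file at null does not exist."
        try {
            Files.newInputStream(resolved).close();
            System.out.println("opened");
        } catch (NoSuchFileException e) {
            System.out.println("transfer failed: stale block path");
        }
    }
}
```

Run sequentially the outcome is deterministic; in the DataNode the same two steps race on separate threads.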
> race condition between transferring block and appending block causes
> "Unexpected checksum mismatch exception"
> --------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-6804
> URL: https://issues.apache.org/jira/browse/HDFS-6804
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.2.0
> Reporter: Gordon Wang
> Assignee: Brahma Reddy Battula
> Attachments: HDFS-6804-branch-2.8.patch,
> Testcase_append_transfer_block.patch
>
>
> We found some error logs on the datanode, like this:
> {noformat}
> 2014-07-22 01:49:51,338 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
> Exception for BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248
> java.io.IOException: Terminating due to a checksum error.java.io.IOException:
> Unexpected checksum mismatch while writing
> BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 from
> /192.168.2.101:39495
> at
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:536)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:703)
> at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:575)
> at
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:115)
> at
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
> at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
> at java.lang.Thread.run(Thread.java:744)
> {noformat}
> Meanwhile, on the source datanode, the log says the block was transmitted:
> {noformat}
> 2014-07-22 01:49:50,805 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
> DataTransfer: Transmitted
> BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248
> (numBytes=16188152) to /192.168.2.103:50010
> {noformat}
> When the destination datanode gets the checksum mismatch, it reports a bad
> block to the NameNode, and the NameNode marks the replica on the source
> datanode as corrupt. But the replica on the source datanode is actually
> valid, because it passes checksum verification.
> In short, the replica on the source datanode is wrongly marked as corrupt.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)