[
https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kihwal Lee updated HDFS-4660:
-----------------------------
Attachment: HDFS-4660.patch
We saw this kind of corruption happen when the copied partial block data does
not end on a packet boundary. The un-acked packets are resent by the client,
and if the end of the on-disk data is not packet-aligned, corruption occurs.
This is very difficult to reproduce in a unit test without being too invasive.
However, the data corruption can be reproduced on a 10-node cluster. Here is
how we reproduced it and verified the patch (credit goes to [~nroberts]):
- Modify teragen to hflush() every 10000 records (see the sketch after this
list)
- Change the datanode WRITE_TIMEOUT_EXTENSION from 5000 ms to 1 ms, so that
the socket write timeout config has full control over the write timeout
- Set dfs.datanode.socket.write.timeout to 2000 ms
- Set dfs.client.block.write.replace-datanode-on-failure.policy to ALWAYS, so
that write pipelines are always immediately reconstructed when a failure
occurs
- Run teragen with 100 maps, each outputting 10000000000
- The success criterion is that no "Checksum verification failed" messages
appear in any datanode log. These come from the checksum verification added
to recoverRbw(); that patch will be provided in HDFS-8395.
- The write timeout is so aggressive that the teragen job will probably fail:
multiple, repeated failures eventually cause task attempts to fail. This is
expected.
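
For reference, here is a minimal sketch of the writer loop used in the
reproduction, with the client-side configs from the steps above. The class
name, output path, record size, and record count are illustrative; the actual
change was made inside teragen's record writer.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HflushRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Same settings as in the cluster repro above.
    conf.set("dfs.datanode.socket.write.timeout", "2000");
    conf.set("dfs.client.block.write.replace-datanode-on-failure.policy",
        "ALWAYS");

    FileSystem fs = FileSystem.get(conf);
    byte[] record = new byte[100]; // teragen writes 100-byte records
    FSDataOutputStream out = fs.create(new Path("/tmp/hflush-repro"));
    try {
      for (long i = 1; i <= 10000000L; i++) {
        out.write(record);
        if (i % 10000 == 0) {
          out.hflush(); // force a pipeline flush every 10000 records
        }
      }
    } finally {
      out.close();
    }
  }
}
{code}

With the 2000 ms socket write timeout and the ALWAYS replacement policy, any
slow ack triggers a pipeline reconstruction, which is what makes the
unaligned-resend path easy to hit.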
> Duplicated checksum on DN in a recovered pipeline
> -------------------------------------------------
>
> Key: HDFS-4660
> URL: https://issues.apache.org/jira/browse/HDFS-4660
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 3.0.0, 2.0.3-alpha
> Reporter: Peng Zhang
> Assignee: Kihwal Lee
> Priority: Critical
> Attachments: HDFS-4660.patch, HDFS-4660.patch
>
>
> Pipeline: DN1 DN2 DN3
> Stop DN2.
> The pipeline adds node DN4 at the 2nd position:
> DN1 DN4 DN3
> RBW recovery then runs.
> DN4 after RBW recovery:
> 2013-04-01 21:02:31,570 INFO
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover
> RBW replica
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,570 INFO
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl:
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
> getNumBytes() = 134144
> getBytesOnDisk() = 134144
> getVisibleLength()= 134144
> (the replica ends exactly on a chunk boundary: 134144/512 = 262 chunks)
> DN3 after RBW recovery:
> 2013-04-01 21:02:31,575 INFO
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover
> RBW replica
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,575 INFO
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl:
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
> getNumBytes() = 134028
> getBytesOnDisk() = 134028
> getVisibleLength()= 134028
> The client sends a packet after pipeline recovery:
> offset=133632 len=1008
> DN4 after flush
> 2013-04-01 21:02:31,779 DEBUG
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file
> offset:134640; meta offset:1063
> // The meta end position should be ceil(134640/512)*4 + 7 == 1059, but it is
> // 1063.
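> (A worked check of the expected meta length, assuming the CRC32 meta file
> layout of a 7-byte header followed by one 4-byte checksum per 512-byte
> chunk, with the partial trailing chunk also checksummed:)
> {code:java}
> long blockLen = 134640;       // file offset after the flush
> int bytesPerChecksum = 512;
> int checksumSize = 4;
> int headerSize = 7;
> // Ceiling division: the 496-byte partial chunk also has a checksum.
> long numChunks = (blockLen + bytesPerChecksum - 1) / bytesPerChecksum; // 263
> long expectedMetaLen = headerSize + numChunks * checksumSize;          // 1059
> // DN4 reports 1063 = 1059 + 4: exactly one extra (duplicated) checksum.
> {code}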
> DN3 after flush
> 2013-04-01 21:02:31,782 DEBUG
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005,
> type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219,
> lastPacketInBlock=false, offsetInBlock=134640,
> ackEnqueueNanoTime=8817026136871545)
> 2013-04-01 21:02:31,782 DEBUG
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing
> meta file offset of block
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from
> 1055 to 1051
> 2013-04-01 21:02:31,782 DEBUG
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file
> offset:134640; meta offset:1059
> After checking the meta file on DN4, I found that the checksum of chunk 262
> is duplicated, but the data is not.
> Later, after the block was finalized, DN4's block scanner detected the bad
> block and reported it to the NN. The NN sent a command to delete this block
> and to re-replicate it from another DN in the pipeline to satisfy the
> replication factor.
> I think this is because BlockReceiver skips the data bytes that were already
> written, but does not skip the checksum bytes that were already written. The
> function adjustCrcFilePosition() is only used for the last non-complete
> chunk, not for this situation (see the sketch below).
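> (A sketch of the alignment arithmetic I believe is missing; the variable
> names are hypothetical, not the actual BlockReceiver code. When a resent
> packet overlaps data already on disk, the checksums covering the overlap
> must be skipped along with the data:)
> {code:java}
> long onDiskLen = 134144;      // DN4's bytesOnDisk after recovery
> long offsetInBlock = 133632;  // offset of the resent packet (len = 1008)
> int bytesPerChecksum = 512;
> int checksumSize = 4;
>
> // Data bytes of the packet that are already on disk and get skipped:
> long skippedDataBytes = onDiskLen - offsetInBlock;                  // 512
> // The matching checksum bytes must be skipped too; otherwise the
> // checksum of chunk 262 is written twice while its data is written once:
> long skippedChecksumBytes =
>     (skippedDataBytes / bytesPerChecksum) * checksumSize;           // 4
> {code}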
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)