[ 
https://issues.apache.org/jira/browse/HDFS-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HDFS-1231.
------------------------------------

    Resolution: Won't Fix

Append was overhauled in 2.x. Closing.

> Generation Stamp mismatches, leading to failed append
> -----------------------------------------------------
>
>                 Key: HDFS-1231
>                 URL: https://issues.apache.org/jira/browse/HDFS-1231
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 0.20-append
>            Reporter: Thanh Do
>
> - Summary: recoverBlock is not atomic, so the client's retry fails if a
> failure interrupts it.
>  
> - Setup:
> + # available datanodes = 3
> + # disks / datanode = 1
> + # failures = 2
> + failure type = crash
> + When/where failure happens = (see below)
>  
> - Details:
> Suppose there are 3 datanodes in the pipeline: dn3, dn2, and dn1. dn1 is the
> primary.
> When appending, the client first calls dn1.recoverBlock() to make all the
> datanodes in the pipeline agree on the new Generation Stamp (GS1) and the
> length of the block.
> The client then sends a data packet to dn3. dn3 forwards this packet to the
> downstream datanodes (dn2 and dn1) and starts writing to its own disk, then
> crashes AFTER writing to the block file but BEFORE writing to the meta file.
> The client notices the crash and calls dn1.recoverBlock().
> dn1.recoverBlock() first creates a syncList (by calling getMetadataInfo on
> both dn2 and dn1).
> Then dn1 calls NameNode.getNextGS() to get a new Generation Stamp (GS2).
> Then it calls dn2.updateBlock(), which returns successfully.
> Next, dn1 calls its own updateBlock() and crashes after renaming
> blk_X_GS1.meta to blk_X_GS1.meta_tmpGS2.
> Therefore, from the client's point of view, dn1.recoverBlock() fails,
> but the GS for the corresponding block has already been incremented in the
> NameNode (GS2).
> The client retries by calling dn2.recoverBlock() with the old GS (GS1), which
> does not match the new GS at the NameNode (GS2). This raises an exception,
> and the append fails.
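
The sequence above can be replayed with a small toy model. This is only a sketch, not HDFS code: the classes, the integer stamps (1 for GS1, 2 for GS2), the `crash_in_update` flag, and the place where the stale-stamp check fires are all assumptions made for illustration. It tracks the latest stamp the NameNode has handed out separately from the stamp it has committed for the block.

```python
class Crash(Exception):
    """Simulated datanode crash (hypothetical, for illustration only)."""


class NameNode:
    def __init__(self, gs):
        self.issued_gs = gs      # latest stamp handed out
        self.committed_gs = gs   # stamp recorded for block X

    def get_next_gs(self):
        self.issued_gs += 1
        return self.issued_gs

    def commit_synchronization(self, gs):
        self.committed_gs = gs


class DataNode:
    def __init__(self, name, gs, crash_in_update=False):
        self.name = name
        self.gs = gs
        self.meta = f"blk_X_GS{gs}.meta"
        self.crash_in_update = crash_in_update

    def update_block(self, new_gs):
        # Step 1: rename the meta file to a temporary name.
        self.meta = f"blk_X_GS{self.gs}.meta_tmpGS{new_gs}"
        if self.crash_in_update:
            raise Crash(self.name)  # crash between the two renames
        # Step 2: rename the temporary file to its final name.
        self.gs = new_gs
        self.meta = f"blk_X_GS{new_gs}.meta"


def recover_block(primary, others, namenode, client_gs):
    # Assumption: a caller holding a stale stamp is rejected up front.
    if client_gs != namenode.issued_gs:
        raise ValueError(f"stale generation stamp GS{client_gs}")
    new_gs = namenode.get_next_gs()      # side effect that is never undone
    for dn in others:
        dn.update_block(new_gs)
    primary.update_block(new_gs)         # may crash half way
    namenode.commit_synchronization(new_gs)
    return new_gs


def run_scenario():
    nn = NameNode(gs=1)
    dn1 = DataNode("dn1", gs=1, crash_in_update=True)
    dn2 = DataNode("dn2", gs=1)
    client_gs = 1
    try:
        recover_block(dn1, [dn2], nn, client_gs)
    except Crash:
        pass  # the client sees a failed recovery and still holds GS1
    try:
        recover_block(dn2, [], nn, client_gs)
        retry_ok = True
    except ValueError:
        retry_ok = False
    return nn, dn1, dn2, retry_ok
```

In this model the first recovery leaves three side effects behind (a new issued stamp, an updated dn2, a half-renamed meta file on dn1) with no commit, which is why the retry with GS1 cannot succeed.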
>  
> Now, after all, we have
> - in dn3 (which is crashed)
> tmp/blk_X
> tmp/blk_X_GS1.meta
> - in dn2
> current/blk_X
> current/blk_X_GS2.meta
> - in dn1:
> current/blk_X
> current/blk_X_GS1.meta_tmpGS2
> - in NameNode, the block X has generation stamp GS1 (because dn1 has not
> called commitSynchronization yet).
>  
> Therefore, when the crashed datanodes restart, the block at dn1 is invalid
> because there is no meta file. At dn3, the block file and meta file are
> finalized, but the block is corrupt because of a CRC mismatch. At dn2, the GS
> of the block is GS2, which does not equal the generation stamp that the
> NameNode maintains for the block.
> Hence, the block blk_X is inaccessible.
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (than...@cs.wisc.edu) and 
> Haryadi Gunawi (hary...@eecs.berkeley.edu)
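
The restart analysis in the report can be summarized as three independent checks on each replica. The sketch below is an illustration under stated assumptions, not the datanode's actual validation logic: the `Replica` type, the `replica_status` and `statuses` helpers, and the single whole-file `zlib.crc32` (HDFS keeps per-chunk checksums in the meta file) are all hypothetical simplifications.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    data: bytes
    meta_gs: Optional[int]    # None when no valid meta file exists
    meta_crc: Optional[int]   # checksum stored in the meta file


def replica_status(replica, namenode_gs):
    """Classify one replica against the stamp the NameNode holds."""
    if replica.meta_gs is None:
        return "invalid: no meta file"
    if zlib.crc32(replica.data) != replica.meta_crc:
        return "corrupt: CRC mismatch"
    if replica.meta_gs != namenode_gs:
        return "unusable: generation stamp mismatch"
    return "ok"


def statuses():
    old = b"old-bytes"
    new = old + b"+packet"    # every node wrote the new packet to blk_X
    namenode_gs = 1           # NameNode still records GS1 for block X
    replicas = {
        # dn1 crashed mid-rename: only blk_X_GS1.meta_tmpGS2 exists.
        "dn1": Replica(new, None, None),
        # dn2 completed updateBlock: data and meta agree, stamp is GS2.
        "dn2": Replica(new, 2, zlib.crc32(new)),
        # dn3 wrote the block file but not the meta file before crashing.
        "dn3": Replica(new, 1, zlib.crc32(old)),
    }
    return {name: replica_status(r, namenode_gs)
            for name, r in replicas.items()}
```

Each replica fails a different check, so no single copy of blk_X passes all three, which matches the report's conclusion that the block is inaccessible.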



--
This message was sent by Atlassian JIRA
(v6.2#6252)
