[ https://issues.apache.org/jira/browse/HDFS-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer resolved HDFS-1231.
------------------------------------
    Resolution: Won't Fix

append got overhauled in 2.x. closing.

> Generation Stamp mismatches, leading to failed append
> -----------------------------------------------------
>
>                 Key: HDFS-1231
>                 URL: https://issues.apache.org/jira/browse/HDFS-1231
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 0.20-append
>            Reporter: Thanh Do
>
> - Summary: recoverBlock is not atomic, so a retry fails once a failure
> has interrupted it.
>
> - Setup:
> + # available datanodes = 3
> + # disks / datanode = 1
> + # failures = 2
> + failure type = crash
> + When/where failure happens = (see below)
>
> - Details:
> Suppose there are 3 datanodes in the pipeline: dn3, dn2, and dn1. Dn1 is
> the primary.
> When appending, the client first calls dn1.recoverBlock to make all the
> datanodes in the pipeline agree on the new Generation Stamp (GS1) and the
> length of the block.
> The client then sends a data packet to dn3. dn3 in turn forwards this
> packet to the downstream datanodes (dn2 and dn1) and starts writing to its
> own disk, then crashes AFTER writing to the block file but BEFORE writing
> to the meta file. The client notices the crash and calls dn1.recoverBlock().
> dn1.recoverBlock() first creates a syncList (by calling getMetadataInfo at
> both dn2 and dn1).
> Then dn1 calls NameNode.getNextGS() to get a new Generation Stamp (GS2).
> Then it calls dn2.updateBlock(), which returns successfully.
> Now it starts calling its own updateBlock and crashes after renaming
> blk_X_GS1.meta to blk_X_GS1.meta_tmpGS2.
> Therefore, from the client's point of view, dn1.recoverBlock() fails,
> but the GS for the corresponding block has already been incremented at the
> NameNode (GS2).
> The client retries by calling dn2.recoverBlock with the old GS (GS1), which
> does not match the new GS at the NameNode (GS2) --> exception, so the
> append fails.
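The sequence in the Details section can be replayed with a small toy model. This is only a sketch under stated assumptions: the classes, the function names, and the place where the stale stamp is rejected are all hypothetical and do not correspond to the real 0.20-append code. In particular, the sketch rejects the retry by comparing the client's stamp with the stamp already on the new primary's replica, which is a simplification of wherever the real mismatch check lives.

```python
class DataNodeCrash(Exception):
    """Raised when a simulated datanode dies in the middle of a call."""


class GenerationStampMismatch(Exception):
    """Raised when a recovery request carries an outdated generation stamp."""


class NameNode:
    """Toy NameNode: only hands out generation stamps (hypothetical model)."""

    def __init__(self, generation_stamp):
        self.generation_stamp = generation_stamp

    def get_next_gs(self):
        # The stamp is bumped as soon as it is handed out; nothing rolls it back.
        self.generation_stamp += 1
        return self.generation_stamp


class DataNode:
    """Toy datanode holding one replica, identified by its generation stamp."""

    def __init__(self, name, gs, crash_in_update=False):
        self.name = name
        self.gs = gs
        self.crash_in_update = crash_in_update

    def update_block(self, new_gs):
        if self.crash_in_update:
            # Models the crash in the middle of the meta-file rename.
            raise DataNodeCrash(self.name)
        self.gs = new_gs


def recover_block(primary, others, namenode, client_gs):
    """Non-atomic recovery: each step takes effect even if a later one fails."""
    # Assumption: a stale client stamp is rejected by the primary up front.
    if primary.gs != client_gs:
        raise GenerationStampMismatch(
            f"client has GS {client_gs}, {primary.name} has GS {primary.gs}")
    new_gs = namenode.get_next_gs()
    # The other replicas are updated first, the primary's own replica last.
    for dn in others + [primary]:
        dn.update_block(new_gs)
    return new_gs


def run_scenario():
    gs1 = 1
    namenode = NameNode(gs1)
    dn2 = DataNode("dn2", gs1)
    dn1 = DataNode("dn1", gs1, crash_in_update=True)
    client_gs = gs1
    outcome = []
    try:
        client_gs = recover_block(dn1, [dn2], namenode, client_gs)
    except DataNodeCrash:
        outcome.append("first attempt: primary crashed")
    try:
        # The client never learned GS2, so it retries with GS1.
        client_gs = recover_block(dn2, [], namenode, client_gs)
    except GenerationStampMismatch:
        outcome.append("retry: generation stamp mismatch")
    return outcome, dn2.gs, namenode.generation_stamp
```

Calling `run_scenario()` shows the half-applied recovery: dn2 and the toy NameNode have both moved on to the new stamp, while the client is still holding the old one and cannot retry.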
>
> Now, in the end, we have:
> - in dn3 (which has crashed):
> tmp/blk_X
> tmp/blk_X_GS1.meta
> - in dn2:
> current/blk_X
> current/blk_X_GS2
> - in dn1:
> current/blk_X
> current/blk_X_GS1.meta_tmpGS2
> - in the NameNode, block X has generation stamp GS1 (because dn1 has not
> called commitSyncronization yet).
>
> Therefore, when the crashed datanodes restart, the block at dn1 is invalid
> because there is no meta file. In dn3, the block file and meta file are
> finalized; however, the block is corrupted because of a CRC mismatch. In
> dn2, the GS of the block is GS2, which does not equal the generation stamp
> of the block maintained in the NameNode.
> Hence, the block blk_X is inaccessible.
>
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (than...@cs.wisc.edu) and
> Haryadi Gunawi (hary...@eecs.berkeley.edu)
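The end state listed in the report can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Replica` record and a hypothetical order of checks (meta file, then CRC, then generation stamp); it is not how the datanode or NameNode actually validates replicas, it only restates the three failure reasons given above.

```python
from dataclasses import dataclass
from typing import Optional

GS1, GS2 = 1, 2  # stand-ins for the two generation stamps in the report


@dataclass
class Replica:
    datanode: str
    meta_gs: Optional[int]  # GS of a usable meta file, None if there is none
    crc_ok: bool            # whether the block data matches its checksums


def replica_problem(replica, namenode_gs):
    """Return the reason a replica is unusable, or None if it is usable."""
    if replica.meta_gs is None:
        return "no meta file"
    if not replica.crc_ok:
        return "CRC mismatch"
    if replica.meta_gs != namenode_gs:
        return "generation stamp mismatch"
    return None


def usable_replicas(replicas, namenode_gs):
    return [r.datanode for r in replicas
            if replica_problem(r, namenode_gs) is None]


# State after both crashes, as described in the report.
REPLICAS = [
    Replica("dn3", meta_gs=GS1, crc_ok=False),   # block written, meta not
    Replica("dn2", meta_gs=GS2, crc_ok=True),    # updated to GS2
    Replica("dn1", meta_gs=None, crc_ok=True),   # only a _tmp meta file left
]
```

With the NameNode still recording GS1 for the block, `usable_replicas(REPLICAS, GS1)` returns an empty list, which matches the report's conclusion that blk_X is inaccessible.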