[
https://issues.apache.org/jira/browse/HDFS-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Allen Wittenauer resolved HDFS-1231.
------------------------------------
Resolution: Won't Fix
append got overhauled in 2.x. closing.
> Generation Stamp mismatches, leading to failed append
> -----------------------------------------------------
>
> Key: HDFS-1231
> URL: https://issues.apache.org/jira/browse/HDFS-1231
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 0.20-append
> Reporter: Thanh Do
>
> - Summary: recoverBlock is not atomic, so a retry fails if a failure occurs
> partway through recovery.
>
> - Setup:
> + # available datanodes = 3
> + # disks / datanode = 1
> + # failures = 2
> + failure type = crash
> + When/where failure happens = (see below)
>
> - Details:
> Suppose there are 3 datanodes in the pipeline: dn3, dn2, and dn1, with dn1 as
> the primary.
> When appending, the client first calls dn1.recoverBlock() so that all
> datanodes in the pipeline agree on the new generation stamp (GS1) and the
> length of the block.
> The client then sends a data packet to dn3. dn3 forwards the packet to the
> downstream datanodes (dn2 and dn1) and starts writing to its own disk, then
> crashes AFTER writing to the block file but BEFORE writing to the meta file.
> The client notices the crash and calls dn1.recoverBlock().
> dn1.recoverBlock() first creates a syncList (by calling getMetadataInfo on
> dn2 and dn1).
> dn1 then calls NameNode.getNextGS() to get a new generation stamp (GS2).
> It then calls dn2.updateBlock(), which returns successfully.
> Next, dn1 calls its own updateBlock() and crashes after renaming
> blk_X_GS1.meta to blk_X_GS1.meta_tmpGS2.
> Therefore, from the client's point of view, dn1.recoverBlock() fails, but the
> GS for the corresponding block has already been incremented at the NameNode
> (GS2).
> The client retries by calling dn2.recoverBlock() with the old GS (GS1), which
> does not match the new GS at the NameNode (GS2). This raises an exception,
> and the append fails.
>
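The sequence described above can be replayed in a small toy model. This is only a sketch under stated assumptions, not the 0.20-append code: the class and method names (`NameNode.get_next_gs`, `DataNode.update_block`, `recover_block`) are hypothetical stand-ins for the calls named in the report, generation stamps are plain integers (GS1 = 1, GS2 = 2), and the model assumes the NameNode rejects a caller whose stamp is older than the latest one it issued.

```python
class Crash(Exception):
    """Simulated datanode crash (hypothetical, for illustration only)."""


class StaleGenerationStamp(Exception):
    """The caller's stamp is older than the latest stamp issued."""


class NameNode:
    def __init__(self, gs):
        self.issued_gs = gs     # latest stamp handed out for the block
        self.committed_gs = gs  # stamp recorded once recovery commits

    def get_next_gs(self, caller_gs):
        # Assumption: a stale caller is rejected here.
        if caller_gs != self.issued_gs:
            raise StaleGenerationStamp(
                f"caller has {caller_gs}, latest issued is {self.issued_gs}")
        self.issued_gs += 1
        return self.issued_gs

    def commit(self, gs):
        self.committed_gs = gs


class DataNode:
    def __init__(self, name, gs, crash_in_update=False):
        self.name = name
        self.meta = f"blk_X_{gs}.meta"
        self.crash_in_update = crash_in_update

    def update_block(self, old_gs, new_gs):
        # Step 1: rename the meta file to a temporary name.
        self.meta = f"blk_X_{old_gs}.meta_tmp{new_gs}"
        if self.crash_in_update:
            raise Crash(self.name)  # crash between the two renames
        # Step 2: rename the temporary file to its final name.
        self.meta = f"blk_X_{new_gs}.meta"


def recover_block(namenode, client_gs, replicas):
    # Not atomic: the stamp is bumped before any replica is updated,
    # and the commit only happens if every update succeeds.
    new_gs = namenode.get_next_gs(client_gs)
    for dn in replicas:
        dn.update_block(client_gs, new_gs)
    namenode.commit(new_gs)
    return new_gs


GS1 = 1
nn = NameNode(GS1)
dn2 = DataNode("dn2", GS1)
dn1 = DataNode("dn1", GS1, crash_in_update=True)

try:
    recover_block(nn, GS1, [dn2, dn1])
    first_attempt = "ok"
except Crash:
    first_attempt = "crash"

try:
    recover_block(nn, GS1, [dn2])  # client retries with the old stamp
    retry = "ok"
except StaleGenerationStamp:
    retry = "stale"

print(first_attempt, retry, dn2.meta, dn1.meta, nn.issued_gs, nn.committed_gs)
```

In this model the first attempt leaves three parties in three different states, and nothing rolls the issued stamp back, which is why the retry with the old stamp cannot succeed.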
> After all of this, we have:
> - in dn3 (which is crashed)
> tmp/blk_X
> tmp/blk_X_GS1.meta
> - in dn2
> current/blk_X
> current/blk_X_GS2.meta
> - in dn1:
> current/blk_X
> current/blk_X_GS1.meta_tmpGS2
> - in NameNode, block X has generation stamp GS1 (because dn1 has not called
> commitSynchronization yet).
>
> Therefore, when the crashed datanodes restart, the block at dn1 is invalid
> because there is no meta file. At dn3, the block file and meta file are
> finalized, but the block is corrupt because of a CRC mismatch. At dn2, the GS
> of the block is GS2, which does not equal the generation stamp that the
> NameNode maintains for the block.
> Hence, the block blk_X is inaccessible.
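The three reasons given above for each replica being unusable can be expressed as one check. This is a rough sketch, not HDFS code: `classify` is a hypothetical helper, stamps are integers (GS1 = 1, GS2 = 2), and a single whole-file `zlib.crc32` stands in for the checksums kept in a real meta file.

```python
import re
import zlib

# Matches only a finished meta file name such as blk_X_1.meta;
# a leftover temporary name like blk_X_1.meta_tmp2 does not match.
META_RE = re.compile(r"^blk_X_(\d+)\.meta$")


def classify(files, block_data, meta_crc, namenode_gs):
    """Return why a replica is unusable, or "ok" (illustrative only)."""
    stamps = [int(m.group(1)) for f in files if (m := META_RE.match(f))]
    if not stamps:
        return "invalid: no meta file"
    if zlib.crc32(block_data) != meta_crc:
        return "corrupt: CRC mismatch"
    if stamps[0] != namenode_gs:
        return "stale: generation stamp mismatch"
    return "ok"


old = b"existing block contents"
packet = b"appended packet"
namenode_gs = 1  # the NameNode still records GS1 for the block

results = {
    # dn3 wrote the packet to the block file but never updated the meta file.
    "dn3": classify(["blk_X", "blk_X_1.meta"], old + packet,
                    zlib.crc32(old), namenode_gs),
    # dn2 completed updateBlock, so its meta file carries GS2.
    "dn2": classify(["blk_X", "blk_X_2.meta"], old,
                    zlib.crc32(old), namenode_gs),
    # dn1 crashed between the two renames, leaving only the temporary name.
    "dn1": classify(["blk_X", "blk_X_1.meta_tmp2"], old,
                    zlib.crc32(old), namenode_gs),
}
print(results)
```

No replica passes all three checks, which matches the conclusion that blk_X is inaccessible.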
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do ([email protected]) and
> Haryadi Gunawi ([email protected])
--
This message was sent by Atlassian JIRA
(v6.2#6252)