[
https://issues.apache.org/jira/browse/HDFS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038288#comment-13038288
]
Todd Lipcon commented on HDFS-1984:
-----------------------------------
Currently this test scenario fails after a few seconds with an exception like:
{noformat}
11/05/23 15:25:46 WARN mortbay.log: /getimage: java.io.IOException: GetImage failed. java.io.IOException: Namenode has an edit log corresponding to txid 1240 but new checkpoint was created using editlog ending at txid 1238. Checkpoint Aborted.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.validateCheckpointUpload(FSImage.java:894)
        at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:107)
        at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:80)
{noformat}
But this validation is bogus: so long as no two checkpointers try to upload a
checkpoint at the same txid, it's OK if they upload "old" fsimages.
To fix this, I think we need to do the following:
a) Repurpose the "checkpointTxId" field of FSImage. This currently tracks the
last txid at which the NN has either saved an image or accepted an uploaded
checkpoint. We use it to advertise which image file a checkpointer should
download, but we also use it to validate the checkpoint upload. Instead, it
should be renamed to "mostRecentImageTxId" and used only to advertise the
image (see the sketch below).
b) Remove the "imageDigest" field. Its validation function is now served by
the ".md5" file stored next to each image. When the checkpointer downloads an
image, the image transfer servlet can just read the .md5 file and include the
hash as an HTTP header. The checkpointer can then verify that the transfer
succeeded by comparing the image it downloaded against that md5 hash (also
sketched below). When uploading the new checkpoint back to the NN, the same
process is used in reverse.
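For (a) and (b), here's a rough sketch in Java of what I have in mind. All
class, method, and header names below are illustrative stand-ins, not actual
branch code. First, the repurposed field from (a):

{code}
// Illustrative sketch only, not the actual HDFS-1073 patch.
class ImageState {
  /** Txid of the newest fsimage on disk; used only to advertise
   *  which image a checkpointer should download. */
  private long mostRecentImageTxId = -1;

  /** Called when the NN saves its own image or accepts an upload. */
  synchronized void recordImage(long txid) {
    // Never move backwards: accepting an "old" checkpoint upload
    // must not hide a newer image we already have.
    mostRecentImageTxId = Math.max(mostRecentImageTxId, txid);
  }

  synchronized long getMostRecentImageTxId() {
    return mostRecentImageTxId;
  }
}
{code}

And the download-side half of (b), where the "X-Image-MD5" header name is
made up for the example:

{code}
// Sketch of the md5 verification from (b); the header name and this
// class are hypothetical, not the real GetImageServlet API.
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class TransferVerifier {
  static final String MD5_HEADER = "X-Image-MD5"; // hypothetical name

  /** Hex-encoded MD5 of the file at the given path. */
  static String md5Hex(String path)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("MD5");
    InputStream in = new FileInputStream(path);
    try {
      byte[] buf = new byte[64 * 1024];
      int n;
      while ((n = in.read(buf)) != -1) {
        md.update(buf, 0, n);
      }
    } finally {
      in.close();
    }
    StringBuilder sb = new StringBuilder();
    for (byte b : md.digest()) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  /** Abort the transfer if the received bytes don't hash to the value
   *  advertised in the HTTP header. */
  static void verify(String md5FromHeader, String downloadedImage)
      throws IOException, NoSuchAlgorithmException {
    String actual = md5Hex(downloadedImage);
    if (!actual.equalsIgnoreCase(md5FromHeader)) {
      throw new IOException("Image transfer corrupted: header said "
          + md5FromHeader + " but received file hashes to " + actual);
    }
  }
}
{code}

One nice property of serving the precomputed .md5 file rather than rehashing
on every request: the servlet never has to re-read a large image just to
compute its digest.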
The new validation rules for accepting a checkpoint upload should be (see the
sketch after this list):
- the namespace/clusterid/etc match up (same as today)
- the transaction ID of the uploaded image is less than the current
  transaction ID of the namespace (sanity check)
- the hash of the file received matches the hash that the 2NN communicates in
  a header
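Concretely, the acceptance check could look something like the following
sketch (hypothetical names throughout; rule 1 is elided since it's unchanged
from today):

{code}
// Sketch of the relaxed upload validation; all names are hypothetical.
import java.io.IOException;

class UploadValidator {
  static void validateCheckpointUpload(long imageTxId,
      long curNamespaceTxId, String md5FromHeader,
      String md5OfReceivedFile) throws IOException {
    // Rule 1 (namespace/clusterid/etc match) is unchanged and elided.

    // Rule 2: sanity check -- the uploaded image can't describe
    // transactions the namespace itself hasn't reached yet.
    if (imageTxId >= curNamespaceTxId) {
      throw new IOException("Uploaded image is at txid " + imageTxId
          + " but the namespace is only at txid " + curNamespaceTxId);
    }

    // Rule 3: the bytes received must hash to the value the 2NN sent.
    if (!md5OfReceivedFile.equalsIgnoreCase(md5FromHeader)) {
      throw new IOException("Checkpoint upload corrupted in transit");
    }

    // Deliberately absent: any check that imageTxId equals one expected
    // txid -- that was the bogus validation. Multiple checkpointers can
    // now upload concurrently, so long as no two use the same txid.
  }
}
{code}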
> HDFS-1073: Enable multiple checkpointers to run simultaneously
> --------------------------------------------------------------
>
> Key: HDFS-1984
> URL: https://issues.apache.org/jira/browse/HDFS-1984
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: name-node
> Affects Versions: Edit log branch (HDFS-1073)
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Fix For: Edit log branch (HDFS-1073)
>
>
> One of the motivations of HDFS-1073 is that it decouples the checkpoint
> process so that multiple checkpoints could be taken at the same time and not
> interfere with each other.
> Currently on the 1073 branch this doesn't quite work right, since we have
> some state and validation in FSImage that's tied to a single fsimage_N --
> thus if two 2NNs perform a checkpoint at different transaction IDs, only one
> will succeed.
> As a stress test, we can run two 2NNs, each configured with
> fs.checkpoint.interval set to "0", which causes them to checkpoint
> continuously, as fast as they can.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira