[
https://issues.apache.org/jira/browse/HDFS-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052908#comment-13052908
]
Todd Lipcon commented on HDFS-2026:
-----------------------------------
bq. Can we remove the commented-out Checkpointer#uploadCheckpoint?
Added TODO -- the CN/BN will be addressed separately
bq. testReformatNNBetweenCheckpoints method comment is missing a period
Fixed.
bq. The new call to sd.read in SecondaryNameNode#recoverCreate could use a
comment
Added:
{code}
case NORMAL:
  // Read the VERSION file. This verifies that:
  // (a) the VERSION file for each of the directories is the same,
  // and (b) when we connect to a NN, we can verify that the remote
  // node matches the same namespace that we ran on previously.
  sd.read();
  break;
{code}
bq. As an aside, readVersionFile would be a better name for that method
I agree, but we should do that separately -- this function is used throughout
all of HDFS (e.g., also on the DN side)
bq. Not your change, but it would be good to add a comment to
uploadImageFromStorage indicating that it doesn't actually post an image;
rather, the 2NN posts to the NN asking it to get an image
Added the following javadoc:
{code}
/**
 * Requests that the NameNode download an image from this node.
 *
 * @param fsName the http address for the remote NN
 * @param imageListenAddress the host/port where the local node is running an
 *                           HTTPServer hosting GetImageServlet
 * @param storage the storage directory to transfer the image from
 * @param txid the transaction ID of the image to be uploaded
 */
{code}
and this comment:
{code}
// this doesn't directly upload an image, but rather asks the NN
// to connect back to the 2NN to download the specified image.
TransferFsImage.getFileClient(fsName, fileid, null, false);
...
{code}
> 1073: 2NN needs to handle case of reformatted NN better
> -------------------------------------------------------
>
> Key: HDFS-2026
> URL: https://issues.apache.org/jira/browse/HDFS-2026
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: name-node
> Affects Versions: Edit log branch (HDFS-1073)
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Critical
> Fix For: Edit log branch (HDFS-1073)
>
> Attachments: hdfs-2026.txt
>
>
> Currently in the 1073 branch, the following steps end up with a very
> confused 2NN:
> - format NN, run NN
> - start 2NN, perform some checkpoints
> - reformat NN, start NN on new namespace
> - restart same 2NN
> The 2NN currently saves the new VERSION info into its local storage directory
> but doesn't clear out the old checkpoint or edits files. This is obviously
> wrong and might lead to a corrupt checkpoint getting uploaded.
> If the 2NN has storage directories with VERSION info, and connects to an NN
> with different VERSION info, there are two alternatives:
> a) refuse to perform any checkpoints until the operator issues a
> "secondarynamenode -format" command (this is similar to how the
> backupnode/checkpointnode works)
> b) clear the current contents of the storage directory and save the new NN's
> VERSION info.
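
The two alternatives in the description above can be sketched roughly as follows. This is only an illustration, not code from the patch: the class, the Policy enum, the StorageDir model, the reconcile method, and the file names are all hypothetical, and a single namespace ID stands in for the full VERSION info.

```java
import java.util.ArrayList;
import java.util.List;

class VersionMismatchSketch {
    /** Hypothetical policy names for alternatives (a) and (b). */
    enum Policy { REFUSE_UNTIL_FORMAT, CLEAR_AND_ADOPT }

    /** Hypothetical in-memory stand-in for one 2NN storage directory. */
    static final class StorageDir {
        Integer namespaceId;                      // null: no VERSION info stored yet
        final List<String> files = new ArrayList<>();

        StorageDir(Integer namespaceId, List<String> files) {
            this.namespaceId = namespaceId;
            this.files.addAll(files);
        }
    }

    /**
     * Returns true if checkpointing may proceed against the NN identified
     * by remoteNamespaceId, false if the 2NN should refuse.
     */
    static boolean reconcile(StorageDir dir, int remoteNamespaceId, Policy policy) {
        // Fresh directory or matching namespace: nothing to reconcile.
        if (dir.namespaceId == null || dir.namespaceId == remoteNamespaceId) {
            dir.namespaceId = remoteNamespaceId;
            return true;
        }
        // Alternative (a): leave everything untouched and wait for the operator.
        if (policy == Policy.REFUSE_UNTIL_FORMAT) {
            return false;
        }
        // Alternative (b): drop the stale checkpoint/edits files, then adopt
        // the new NN's identity.
        dir.files.clear();
        dir.namespaceId = remoteNamespaceId;
        return true;
    }
}
```

Either way, the old checkpoint and edits files are never left sitting next to the new VERSION info, which is the state that makes a corrupt upload possible.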