[ 
https://issues.apache.org/jira/browse/HADOOP-4995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662411#action_12662411
 ] 

Brian Bockelman commented on HADOOP-4995:
-----------------------------------------

Hey Konstantin, Dhruba,

I think we're approximately on the same page.  The best approach would be to 
have the namenode itself verify image correctness.  I believe Dhruba expressed 
this most succinctly: the "offline fsImage verification" could simply be a 
"-checkimage" flag that makes the namenode load the fsImage / edits, then exit 
0 if nothing bad happened and exit 1 if there was some error.
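To make the automation angle concrete, here is a hypothetical sketch of a backup-testing wrapper.  Note the "-checkimage" flag is only proposed in this issue and does not exist yet; the exact invocation, the verify_fsimage name, and the CHECK_CMD override are assumptions for illustration, not real Hadoop interfaces.

```shell
#!/bin/sh
# Hypothetical backup-testing wrapper around the PROPOSED "-checkimage"
# flag. The flag and invocation below are assumptions, not a shipped API.
verify_fsimage() {
    image="$1"
    # CHECK_CMD can be overridden (e.g. for testing); by default, run the
    # proposed namenode self-check, which would exit 0 if the image copy
    # loads cleanly and 1 if there was some error.
    if ${CHECK_CMD:-hadoop namenode -checkimage} "$image"; then
        echo "fsImage OK: $image"
        return 0
    else
        echo "fsImage CORRUPT: $image" >&2
        return 1
    fi
}
```

A nightly cron job could run this against the latest checkpoint copy and alert on a non-zero exit, which is exactly the kind of automated backup testing the issue asks for.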

I wasn't proposing a completely separate tool to verify an image, for the 
reasons Konstantin pointed out - the only sane way to verify that the image is 
usable by the namenode is to use the namenode itself; it would be impossible 
to try and keep two separate implementations in sync.

> Offline Namenode fsImage verification
> -------------------------------------
>
>                 Key: HADOOP-4995
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4995
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Brian Bockelman
>
> Currently, there is no way to verify that a copy of the fsImage is not 
> corrupt.  I propose that we should have an offline tool that loads the 
> fsImage into memory to see if it is usable.  This will allow us to automate 
> backup testing to some extent.
> One can start a namenode process on the fsImage to see if it can be loaded, 
> but this is not easy to automate.
> To use HDFS in production, it is greatly desired to both have checkpoints - 
> and have some idea that the checkpoints are valid!  No one wants to see the 
> day where they reload from backup only to find that the fsImage in the backup 
> wasn't usable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.