[ 
https://issues.apache.org/jira/browse/HADOOP-4995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662221#action_12662221
 ] 

Konstantin Shvachko commented on HADOOP-4995:
---------------------------------------------

Are you going to use name-node methods to verify the fsimage, or are you planning to 
implement a completely independent tool to do that?
If the former, then you will probably need the same amount of memory as the 
name-node uses, and therefore you might just use the real name-node or the 
secondary one and do the "try and pray".
If the latter, then it will be hard to keep it in sync with the changing image 
layout and the name-node code. Suppose the tool has a bug (which might be just 
that the real image layout was not reflected in the tool code) and it reports 
that the image is good or bad. How do you trust it? Who is going to verify the 
tool's correctness?
I am saying that there is no better tool for verifying image correctness than 
the name-node itself. And maybe "try and pray" is the only approach you 
can really trust in the end. You can do it offline rather than online if 
required.
Am I missing your point?
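
A minimal sketch of how the offline "try" step could be automated, in Python. 
It is only an illustration: loader_cmd is a hypothetical placeholder for 
whatever command makes the real name-node code load an image directory and 
exit with status 0 on success. No such command is named in this issue, so the 
caller has to supply one. The check runs against a scratch copy so the backup 
itself is never touched.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def image_loads(image_dir, loader_cmd, timeout_s=3600):
    """Return True if loader_cmd exits with status 0 on a scratch copy of image_dir.

    loader_cmd is a list of arguments (hypothetical; supplied by the caller).
    The path of the scratch copy is appended as the last argument.
    """
    with tempfile.TemporaryDirectory() as scratch:
        # Work on a copy so a misbehaving loader cannot damage the backup.
        copy = Path(scratch) / "name"
        shutil.copytree(image_dir, copy)
        try:
            result = subprocess.run(
                list(loader_cmd) + [str(copy)],
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            # A loader that hangs is treated the same as one that fails.
            return False
        return result.returncode == 0
```

Because the loader is the name-node code itself, the memory requirement is 
the same as for a real name-node; the sketch only removes the manual step.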


> Offline Namenode fsImage verification
> -------------------------------------
>
>                 Key: HADOOP-4995
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4995
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Brian Bockelman
>
> Currently, there is no way to verify that a copy of the fsImage is not 
> corrupt.  I propose that we should have an offline tool that loads the 
> fsImage into memory to see if it is usable.  This will allow us to automate 
> backup testing to some extent.
> One can start a namenode process on the fsImage to see if it can be loaded, 
> but this is not easy to automate.
> To use HDFS in production, it is highly desirable to have both checkpoints 
> and some assurance that the checkpoints are valid!  No one wants to see the 
> day when they reload from backup only to find that the fsImage in the backup 
> wasn't usable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
