[ https://issues.apache.org/jira/browse/HDFS-13031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576052#comment-16576052 ]
Adam Antal commented on HDFS-13031:
-----------------------------------
Thanks [~smeng], [~yzhangal].
I created a follow-up Jira (HDFS-13818) for the OIV-based approach.
I absolutely agree with what you've said: a truly complete check of the FSImage
can only be done by actually starting a NN with that image. For the HDFS-9406
issue, however, I think the OIV is better suited to the problem you described in
this Jira.
If you agree to resolve the issue along the lines of the OIV tool, please
continue the discussion on that Jira. Otherwise I'll leave this one unresolved
and we can discuss a pure NN solution here.
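To make the OIV direction concrete, here is a minimal sketch of how a freshly written image could be passed through the Offline Image Viewer and judged by its exit status. It uses the existing `hdfs oiv` options `-p`, `-i` and `-o`; the function names and the `runner` parameter are made up for illustration, and it assumes that a non-zero exit status means the OIV could not process the image. A clean OIV run is not proof that the image is uncorrupted.

```python
import subprocess
from pathlib import Path
from typing import List


def build_oiv_command(fsimage: Path, output: Path) -> List[str]:
    """Build an `hdfs oiv` invocation that dumps the image as XML."""
    return ["hdfs", "oiv", "-p", "XML", "-i", str(fsimage), "-o", str(output)]


def image_is_readable(fsimage: Path, output: Path, runner=subprocess.run) -> bool:
    """Return True if the OIV exited with status 0 for this image.

    `runner` is a hypothetical hook so the check can be exercised without
    a Hadoop installation; by default the real command is executed.
    """
    result = runner(build_oiv_command(fsimage, output),
                    capture_output=True, text=True)
    return result.returncode == 0
```

A caller would run `image_is_readable(path_to_new_fsimage, scratch_output)` right after a checkpoint and treat `False` as a reason to keep the previous image around.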
> To detect fsimage corruption on the spot
> ----------------------------------------
>
> Key: HDFS-13031
> URL: https://issues.apache.org/jira/browse/HDFS-13031
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Environment:
> Reporter: Yongjun Zhang
> Assignee: Adam Antal
> Priority: Major
>
> Since we fixed HDFS-9406, new cases of similar fsimage corruption have been
> reported from the field. We need a good fsimage + editlogs to replay in order
> to reproduce the corruption. However, by the time the corruption is detected
> (at a later NN restart), the good fsimage has usually already been deleted.
> We need a way to detect fsimage corruption on the spot. Currently,
> what I think we could do is:
> # After the SNN creates a new fsimage, it spawns a modified NN process (NN
> with some new command-line args) that just loads the fsimage and does nothing
> else.
> # If that process fails, the currently running SNN will either a) back up
> the fsimage + editlogs or b) stop checkpointing. It also needs to
> somehow raise a flag to the user that the fsimage is corrupt.
> In step 2, if we do a), we need to introduce a new NN->JN API to back up
> editlogs; if we do b), it changes the SNN's behavior and is somewhat incompatible.
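The two steps in the description can be sketched roughly as follows. This is only an illustration of the control flow, not Hadoop code: `CheckpointPolicy`, `load_check`, `files_to_keep` and the other names are hypothetical, the load check is an arbitrary callable standing in for the "modified NN process", and a local file copy stands in for whatever backup mechanism (such as a new NN->JN API) would really be needed.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable, List


class CheckpointPolicy:
    """Hypothetical stand-in for the SNN's reaction to a bad fsimage."""

    def __init__(self, load_check: Callable[[Path], bool], backup_dir: Path,
                 stop_on_failure: bool = False) -> None:
        self.load_check = load_check            # step 1: try to load the image
        self.backup_dir = backup_dir
        self.stop_on_failure = stop_on_failure  # False = option a, True = option b
        self.checkpointing_enabled = True
        self.flagged: List[Path] = []           # images reported as corrupt

    def on_new_fsimage(self, fsimage: Path, files_to_keep: Iterable[Path]) -> bool:
        """Run the load check; on failure, apply option a or option b."""
        if self.load_check(fsimage):
            return True
        self.flagged.append(fsimage)            # "raise a flag to the user"
        if self.stop_on_failure:
            self.checkpointing_enabled = False  # option b: stop checkpointing
        else:
            # option a: preserve the files needed to reproduce the corruption
            self.backup_dir.mkdir(parents=True, exist_ok=True)
            for f in files_to_keep:
                shutil.copy2(f, self.backup_dir / f.name)
        return False
```

Which files go into `files_to_keep` (for example the last good image plus the edit logs) is left to the caller, since that is exactly the part the description leaves open.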