[
https://issues.apache.org/jira/browse/HDFS-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aaron T. Myers updated HDFS-2010:
---------------------------------
Status: Patch Available (was: Open)
> Clean up and test behavior under failed edit streams
> ----------------------------------------------------
>
> Key: HDFS-2010
> URL: https://issues.apache.org/jira/browse/HDFS-2010
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: name-node
> Affects Versions: Edit log branch (HDFS-1073)
> Reporter: Todd Lipcon
> Assignee: Aaron T. Myers
> Fix For: Edit log branch (HDFS-1073)
>
> Attachments: hdfs-2010.0.patch
>
>
> Right now there is very little test coverage of situations where one or more
> of the edits directories fails. In trunk, the behavior when all of the edits
> directories are dead is that the NN prints a fatal level log message and
> calls Runtime.exit(-1).
> I don't think this is really the behavior we want. Needs a bit of thought,
> but I think something like the following would make more sense:
> - any calls currently waiting on logSync should end up throwing an exception
> - NN should probably enter safe mode
> - ops can restore edits directories and then ask the NN to restore storage,
> at which point it could exit safemode
> - alternatively, ops could ask the NN to do saveNamespace and then shut
> it down
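
As a rough illustration of the behavior proposed in the list above, here is a minimal sketch in plain Java. It is not taken from the attached patch or from the HDFS code base: EditStream, EditLogSketch, restoreStorage(List) and the method signatures are all hypothetical names made up for this example. The only point is the control flow: failed streams are dropped, and once none are left logSync throws and the log enters safe mode rather than exiting the JVM.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one edits directory; not an HDFS class.
interface EditStream {
    void write(String op) throws IOException;
}

// Hypothetical sketch of the proposed policy; all names are illustrative.
class EditLogSketch {
    private final List<EditStream> active = new ArrayList<>();
    private boolean safeMode = false;

    EditLogSketch(List<EditStream> streams) {
        active.addAll(streams);
    }

    synchronized boolean isInSafeMode() {
        return safeMode;
    }

    synchronized int activeStreamCount() {
        return active.size();
    }

    // Writes op to every active stream and drops the streams that fail.
    // If no stream survives, enter safe mode and throw to the caller
    // instead of terminating the process.
    synchronized void logSync(String op) throws IOException {
        if (safeMode) {
            throw new IOException("in safe mode: no usable edit streams");
        }
        active.removeIf(stream -> {
            try {
                stream.write(op);
                return false; // write succeeded, keep the stream
            } catch (IOException e) {
                return true;  // write failed, drop the stream
            }
        });
        if (active.isEmpty()) {
            safeMode = true;
            throw new IOException("all edit streams failed");
        }
    }

    // Operator action: re-add repaired streams, then leave safe mode.
    synchronized void restoreStorage(List<EditStream> repaired) {
        active.addAll(repaired);
        if (!active.isEmpty()) {
            safeMode = false;
        }
    }
}
```

In this sketch a caller sees an IOException from logSync as soon as the last stream fails, and every later call fails fast until restoreStorage is invoked with at least one working stream. The saveNamespace-then-shutdown alternative is not modeled here.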