[ 
https://issues.apache.org/jira/browse/HDFS-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203217#comment-13203217
 ] 

Todd Lipcon commented on HDFS-2912:
-----------------------------------

I think the issue is this -- previously the abort logic was to call 
Runtime.exit(1) only when a _sync_ fails. We figured this was sufficient since 
it guards against data loss. But, as you've pointed out in the JIRAs today, 
there are other cases where we should abort to avoid getting into an 
inconsistent state.

The old code (which is verified by the tests Aaron mentioned above -- look for 
mock(Runtime.class) ) does the abort by catching the IOException thrown by 
mapJournalsAndReportErrors and aborting at that point. The particular call site 
is logSync() in FSEditLog. So we either need to do as you did (and abort from 
mapJournalsAndReportErrors itself) or change _all_ of the call sites to do the 
abort in case an exception is thrown.
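
To make the two options concrete, here is a rough sketch, not the actual 
FSEditLog code. Exiter, Journal, EditLogSketch, otherCallSite and the 
abortInsideMap flag are all hypothetical names made up for illustration, and 
the real mapJournalsAndReportErrors has a different signature. The sketch also 
assumes the method throws only once every journal has failed. Exiter stands in 
for the mocked Runtime so the abort can be observed without killing the JVM.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class AbortSketch {
    /** Hypothetical stand-in for Runtime, so a test can observe the abort. */
    interface Exiter { void exit(int status); }

    /** Hypothetical journal; sync() throws when its storage is unreachable. */
    interface Journal { void sync() throws IOException; }

    static class EditLogSketch {
        private final List<Journal> journals;
        private final Exiter exiter;
        private final boolean abortInsideMap; // true = abort centrally

        EditLogSketch(List<Journal> journals, Exiter exiter, boolean abortInsideMap) {
            this.journals = new ArrayList<>(journals);
            this.exiter = exiter;
            this.abortInsideMap = abortInsideMap;
        }

        /** Simplified: syncs every journal and drops the ones that fail. */
        void mapJournalsAndReportErrors() throws IOException {
            List<Journal> failed = new ArrayList<>();
            for (Journal j : journals) {
                try {
                    j.sync();
                } catch (IOException e) {
                    failed.add(j);
                }
            }
            journals.removeAll(failed);
            if (journals.isEmpty()) {
                if (abortInsideMap) {
                    // Option 1: abort here, so every caller is covered.
                    exiter.exit(1);
                    return;
                }
                throw new IOException("no usable journals remain");
            }
        }

        /** Old behavior: this call site catches the exception and aborts. */
        void logSync() {
            try {
                mapJournalsAndReportErrors();
            } catch (IOException e) {
                exiter.exit(1);
            }
        }

        /** Any other call site: without option 1 the failure just propagates. */
        void otherCallSite() throws IOException {
            mapJournalsAndReportErrors();
        }
    }
}
```

With abortInsideMap set to false, only logSync() reaches exit(1); 
otherCallSite() lets the IOException escape without aborting, which is the gap 
described above. Setting it to true covers both paths without touching the 
callers.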
                
> HA: Namenode not shutting down when shared edits dir is inaccessible
> --------------------------------------------------------------------
>
>                 Key: HDFS-2912
>                 URL: https://issues.apache.org/jira/browse/HDFS-2912
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>         Attachments: HDFS-2909.HDFS-1623.patch
>
>
> When there is an error in the shared edits dir, the current policy requires 
> the active name node to abort and shut down.
> Currently there is no way to shut down the name node, so this does not 
> happen even after all journals have been aborted on error. In fact, the name 
> node stays Active and is not in safe mode either. Ideally it should shut 
> down, or at least go into safe mode or standby mode.
