[ 
https://issues.apache.org/jira/browse/HDFS-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203171#comment-13203171
 ] 

Todd Lipcon commented on HDFS-2910:
-----------------------------------

To let the NN ride over a hiccup, it seems the solution is to add a more 
resilient JournalSet implementation -- i.e., either one that operates over a 
quorum of shared dirs, or one that has a more stubborn retry policy. Given 
that NFS itself already has built-in retries and can be configured with 
arbitrary timeouts, it doesn't seem like we should worry about short hiccups 
-- any outage that outlasts the configured NFS retries/timeouts is likely 
worth causing a failover, IMO.
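
To make the two options above concrete, here is a minimal sketch of a 
"stubborn" per-directory retry combined with a majority (quorum) check. It is 
an illustration only, not the actual JournalSet code: the class 
QuorumJournalSketch, the EditSink interface, and the maxAttempts parameter are 
all hypothetical names invented for this example.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical sketch: none of these names are Hadoop APIs.
final class QuorumJournalSketch {

    /** Hypothetical stand-in for one shared edits directory. */
    interface EditSink {
        void write(byte[] edit) throws IOException;
    }

    /** Tries one sink up to maxAttempts times; returns true on the first success. */
    static boolean writeWithRetry(EditSink sink, byte[] edit, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sink.write(edit);
                return true;
            } catch (IOException e) {
                // Swallowed so the loop can retry; real code would log and back off.
            }
        }
        return false;
    }

    /** Returns true if a strict majority of sinks accepted the edit. */
    static boolean writeToQuorum(List<EditSink> sinks, byte[] edit, int maxAttempts) {
        int acks = 0;
        for (EditSink sink : sinks) {
            if (writeWithRetry(sink, edit, maxAttempts)) {
                acks++;
            }
        }
        return acks > sinks.size() / 2;
    }
}
```

With a single NFS-backed shared dir, the retry half of this sketch largely 
duplicates what the NFS client's own retry/timeout settings already provide, 
which is the point made above; only the quorum half adds resilience that NFS 
configuration alone cannot.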
                
> HA: Active NN reports Bad state: BETWEEN_LOG_SEGMENTS when shared edits dir 
> is inaccessible during log roll
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-2910
>                 URL: https://issues.apache.org/jira/browse/HDFS-2910
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>


