[ 
https://issues.apache.org/jira/browse/HDFS-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205723#comment-13205723
 ] 

Hari Mankude commented on HDFS-2914:
------------------------------------

bq.   The patch file still needs to have the ".patch" extension.
done
bq.   Rather than sleep for 10 seconds, let's increase the frequency which the 
NNResourceChecker threads runs to every 0 or 1 seconds, and then sleep for 2 
seconds.
I would rather leave this as is, since I could easily reproduce the problem 
with the 10s sleep.

bq.   Our coding conventions require the use of curly braces ("{}") even for 
single-line if statements.
done

bq.    What do you think the behavior should be for an NN which is active, 
experiences low resources, then becomes standby? I think the current behavior 
seems fine (i.e. require the admin to make the now-standby NN leave SM) but I'm 
wondering if you've considered this case. You might want to write a test case 
which asserts the desired behavior.

I am not sure that I completely understand your concern. When the active NN has 
low resources, it goes into safemode. If the shared edits dir goes away, the 
active NN dies. If you are talking about doing a switchover (active to standby) 
while the active is in safemode, I thought I saw a test in testHAsafemode for 
this condition. If not, I can add a test in a separate jira.

bq.    Note that Jitendra's suggestion also said "When it transitions to 
active, that's when a check for available resources to write logs should be 
performed." I agree with this (much as the NN currently checks for available 
resources on startup) but your patch doesn't implement this.

This is already handled by checkAvailableResources(), which is called during 
startupCommonServices(). Also, the resource checker thread is always running 
and will catch the issue within 5s.
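
To make the behavior being discussed concrete, here is a rough sketch of a 
periodic resource check that only puts the NN into safemode while it is active. 
This is not the code in the patch: ResourceMonitorSketch, HAState, checkOnce() 
and the other names are hypothetical, and the 5s period is just the figure 
mentioned above.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch only; not the actual NameNode resource monitor. */
public class ResourceMonitorSketch {
    public enum HAState { ACTIVE, STANDBY }

    private volatile HAState state;
    private volatile boolean resourcesAvailable = true;
    private volatile boolean inSafeMode = false;

    public ResourceMonitorSketch(HAState initialState) {
        this.state = initialState;
    }

    /** Stand-in for a real disk/edits-dir check (assumption). */
    public void setResourcesAvailable(boolean available) {
        this.resourcesAvailable = available;
    }

    /** One pass of the checker: only an active NN enters safemode. */
    public void checkOnce() {
        if (!resourcesAvailable && state == HAState.ACTIVE) {
            inSafeMode = true;
        }
        // Safemode is never left automatically here; an admin has to do it.
    }

    /** After this call, the next checker pass sees the NN as active. */
    public void transitionToActive() {
        this.state = HAState.ACTIVE;
    }

    public boolean isInSafeMode() {
        return inSafeMode;
    }

    /** Runs checkOnce() repeatedly, e.g. every 5 seconds. */
    public ScheduledFuture<?> start(ScheduledExecutorService executor,
                                    long periodSeconds) {
        return executor.scheduleWithFixedDelay(
            this::checkOnce, 0, periodSeconds, TimeUnit.SECONDS);
    }
}
```

With this shape, a standby that loses its shared edits dir stays out of 
safemode, and low resources are picked up on the first checker pass after it 
becomes active.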
                
> HA: Standby should not enter safemode when resources are low
> ------------------------------------------------------------
>
>                 Key: HDFS-2914
>                 URL: https://issues.apache.org/jira/browse/HDFS-2914
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Hari Mankude
>            Assignee: Hari Mankude
>         Attachments: HDFS-2914-HDFS-1623, HDFS-2914-HDFS-1623, 
> HDFS-2914-HDFS-1623.patch, hdfs-2914
>
>
> When shared edits dir is bounced, standby NN is put into safemode by the 
> NameNodeResourceMonitor(). However, there is no path for it to exit out of 
> safe mode when shared edits dir reappears.


        
