[jira] [Commented] (HBASE-24292) A "stuck" master should not idle as active without taking action

Anoop Sam John (Jira) Sun, 10 May 2020 00:54:08 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-24292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103685#comment-17103685
 ]


Anoop Sam John commented on HBASE-24292:
----------------------------------------

Yes, it seems the sleep time can grow quite large, even to 10+ minutes. (I was 
just trying to test this on a 2.0.x cluster.)
Also, the while loop runs on a !isStopped() condition, and nothing ever stops it.
I believe in 1.x there was a timeout, about 24 minutes by default, after which 
the active HM would be stopped. I have seen that in a cluster where the problem 
was getting the NS table online. After the wait for the META table, there is a 
further wait for the META region as well. I did not check the 1.x code to see 
what was different there.
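
To make the point about the unbounded loop concrete, here is a rough sketch of a wait loop that has both a cap on the backoff sleep and an overall deadline. This is not HBase code: the class BoundedWait, its Result enum, and the parameters (timeoutMs, initialSleepMs, maxSleepMs) are all hypothetical names chosen for illustration, and the doubling backoff is an assumption, not a description of what the master actually does.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a bounded wait; not taken from the HBase code base. */
public final class BoundedWait {

  /** How the wait ended. */
  public enum Result { READY, STOPPED, TIMED_OUT }

  private BoundedWait() {}

  /** Doubles the sleep on each attempt, but never beyond maxSleepMs. */
  public static long nextSleepMs(long currentMs, long maxSleepMs) {
    return Math.min(maxSleepMs, Math.max(1, currentMs) * 2);
  }

  /**
   * Polls ready until it returns true, stopped returns true, or timeoutMs
   * elapses. An interrupt is treated like a stop request.
   */
  public static Result waitFor(BooleanSupplier ready, BooleanSupplier stopped,
      long timeoutMs, long initialSleepMs, long maxSleepMs) {
    final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    long sleepMs = Math.max(1, initialSleepMs);
    while (!stopped.getAsBoolean()) {
      if (ready.getAsBoolean()) {
        return Result.READY;
      }
      long remainingNs = deadline - System.nanoTime();
      if (remainingNs <= 0) {
        // Unlike a bare !isStopped() loop, the caller gets a chance to act.
        return Result.TIMED_OUT;
      }
      long remainingMs = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNs));
      try {
        // Never sleep past the deadline, even if the backoff has grown large.
        Thread.sleep(Math.min(sleepMs, remainingMs));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt flag
        return Result.STOPPED;
      }
      sleepMs = nextSleepMs(sleepMs, maxSleepMs);
    }
    return Result.STOPPED;
  }
}
```

With something of this shape, the caller could abort the active master (or retry from scratch) when the result is TIMED_OUT, rather than sitting in the loop indefinitely.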

> A "stuck" master should not idle as active without taking action
> ----------------------------------------------------------------
>
>                 Key: HBASE-24292
>                 URL: https://issues.apache.org/jira/browse/HBASE-24292
>             Project: HBase
>          Issue Type: Bug
>          Components: master, Region Assignment
>    Affects Versions: 2.3.0
>            Reporter: Nick Dimiduk
>            Priority: Critical
>
> The master schedules a SCP for the region server hosting meta. However, due 
> to a misconfiguration, the cluster cannot make progress. After fixing the 
> configuration issue and restarting, the cluster still cannot make progress. 
> After the configured period (15 minutes), the master enters a "holding 
> pattern" where it retains Active master status, but isn't taking any action.
> This "brown-out" state is toxic. It should either keep trying to make 
> progress, or it should abort. Staying up and not doing anything is the wrong 
> thing to do.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
