[ 
https://issues.apache.org/jira/browse/HBASE-26897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

May updated HBASE-26897:
------------------------
    Description: 
We have a cluster of 2 master nodes (hm1, hm2) and 3 slave nodes (rs1, rs2, rs3)

 
 # hm1 becomes active
 # rs3 becomes the meta RS
 # rs3 crashes
 # hm1 crashes. At this point, the meta-region-server znode in ZooKeeper still 
points to rs3
 # hm2 becomes active, but gets stuck logging "master.HMaster: 
hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, 
ts=1648007404659, server=rs3,16020,1648007272771}; ServerCrashProcedures=true. 
Master startup cannot progress, in holding-pattern until region onlined."

  was:
We have a cluster of 2 master nodes (hm1, hm2) and 3 slave nodes (rs1, rs2, rs3)

 
 # hm1 becomes active
 # rs3 becomes the meta RS
 # rs3 crashes
 # rs2 crashes
 # hm1 crashes while handling the crash of rs3. At this point, the 
meta-region-server znode in ZooKeeper still points to rs3
 # hm2 becomes active, but gets stuck logging "master.HMaster: 
hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, 
ts=1648007404659, server=rs3,16020,1648007272771}; ServerCrashProcedures=true. 
Master startup cannot progress, in holding-pattern until region onlined."


> Crash Meta-RS and active HMaster successively make cluster out of service
> -------------------------------------------------------------------------
>
>                 Key: HBASE-26897
>                 URL: https://issues.apache.org/jira/browse/HBASE-26897
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.4.8
>            Reporter: May
>            Priority: Major
>
> We have a cluster of 2 master nodes (hm1, hm2) and 3 slave nodes (rs1, rs2, 
> rs3)
>  
>  # hm1 becomes active
>  # rs3 becomes the meta RS
>  # rs3 crashes
>  # hm1 crashes. At this point, the meta-region-server znode in ZooKeeper 
> still points to rs3
>  # hm2 becomes active, but gets stuck logging "master.HMaster: 
> hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, 
> ts=1648007404659, server=rs3,16020,1648007272771}; 
> ServerCrashProcedures=true. Master startup cannot progress, in 
> holding-pattern until region onlined."
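
The holding-pattern message quoted in the report names the server that the new master still believes is hosting hbase:meta. As a rough diagnostic aid (a sketch only, not part of the original report and not an HBase API), the message can be parsed and the recorded host compared against the RegionServers known to be alive. The function names `parse_meta_state` and `meta_host_is_dead`, and the set of live hosts passed in, are hypothetical; the regular expression assumes the exact message layout shown in this issue.

```python
import re

# The holding-pattern message from the report, joined onto one line.
LOG_LINE = (
    "master.HMaster: hbase:meta,,1.1588230740 is NOT online; "
    "state={1588230740 state=OPEN, ts=1648007404659, "
    "server=rs3,16020,1648007272771}; ServerCrashProcedures=true. "
    "Master startup cannot progress, in holding-pattern until region onlined."
)

# Assumes the "state={...}" layout seen in this issue; other HBase versions
# may format the message differently.
PATTERN = re.compile(
    r"state=\{(?P<encoded>\w+) state=(?P<state>\w+), "
    r"ts=(?P<ts>\d+), "
    r"server=(?P<host>[^,]+),(?P<port>\d+),(?P<startcode>\d+)\}"
)


def parse_meta_state(line):
    """Return the region state fields from the message, or None if absent."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so callers can compare them directly.
    for key in ("ts", "port", "startcode"):
        fields[key] = int(fields[key])
    return fields


def meta_host_is_dead(line, live_hosts):
    """True if meta is recorded as OPEN on a host that is not in live_hosts."""
    info = parse_meta_state(line)
    if info is None:
        return False
    return info["state"] == "OPEN" and info["host"] not in live_hosts


if __name__ == "__main__":
    # Hypothetical live set for the scenario above: rs3 has crashed.
    print(parse_meta_state(LOG_LINE))
    print(meta_host_is_dead(LOG_LINE, {"rs1", "rs2"}))
```

In the scenario described here, the recorded host is rs3 while only rs1 and rs2 remain, which matches the symptom: meta is marked OPEN on a server that no longer exists, so the master keeps waiting.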



--
This message was sent by Atlassian Jira
(v8.20.1#820001)