[jira] [Updated] (HBASE-19144) [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all servers in the group are unavailable

Andrew Purtell (JIRA) Tue, 31 Oct 2017 15:29:26 -0700

     [ https://issues.apache.org/jira/browse/HBASE-19144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Andrew Purtell updated HBASE-19144:
-----------------------------------
    Description: 
Once all servers in an RSGroup are down, the regions assigned to that group 
cannot be opened anywhere and transition rapidly into FAILED_OPEN state.
 
2017-10-31 21:06:25,449 INFO [ProcedureExecutor-13] master.RegionStates: Transition {c6c8150c9f4b8df25ba32073f25a5143 state=OFFLINE, ts=1509483985448, server=node-5.cluster,16020,1509482700768} to {c6c8150c9f4b8df25ba32073f25a5143 state=FAILED_OPEN, ts=1509483985449, server=node-5.cluster,16020,1509482700768}
2017-10-31 21:06:25,449 WARN [ProcedureExecutor-13] master.RegionStates: Failed to open/close d4e2f173e31ffad6aac140f4bd7b02bc on node-5.cluster,16020,1509482700768, set to FAILED_OPEN
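
To find which regions need attention, an operator could scan the master log 
for the WARN line above. The following is a minimal sketch, assuming the log 
has one entry per line and that the message format matches the sample shown 
here; the pattern is inferred from that sample only, and the names 
FAILED_OPEN_RE and failed_open_regions are hypothetical helpers, not part of 
HBase.

```python
import re

# Pattern inferred from the sample WARN line in this report (an assumption,
# not taken from HBase source). Encoded region names are 32 hex characters.
FAILED_OPEN_RE = re.compile(
    r"Failed to open/close (?P<region>[0-9a-f]{32}) "
    r"on (?P<server>\S+), set to FAILED_OPEN"
)


def failed_open_regions(log_lines):
    """Return sorted, de-duplicated (encoded_region, server) pairs."""
    found = set()
    for line in log_lines:
        match = FAILED_OPEN_RE.search(line)
        if match:
            found.add((match.group("region"), match.group("server")))
    return sorted(found)


# The WARN line from this report, joined onto a single line.
SAMPLE = [
    "2017-10-31 21:06:25,449 WARN [ProcedureExecutor-13] "
    "master.RegionStates: Failed to open/close "
    "d4e2f173e31ffad6aac140f4bd7b02bc on "
    "node-5.cluster,16020,1509482700768, set to FAILED_OPEN",
]

if __name__ == "__main__":
    for region, server in failed_open_regions(SAMPLE):
        print(region, server)
```

The resulting encoded region names are what would then be passed to a manual 
reassignment.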
 
Any region in FAILED_OPEN state has to be reassigned manually. Alternatively, 
restarting the master also causes assignment to be reattempted for any regions 
in FAILED_OPEN state. This is not unexpected, but it is an operational 
headache. It would be better if the RSGroupInfoManager could automatically kick 
off reassignment of regions in FAILED_OPEN state when servers rejoin the 
cluster. 
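
The requested behavior can be illustrated with a toy model. This is only a 
sketch of the idea under stated assumptions, not the RSGroupInfoManager 
implementation or any HBase API: the class GroupReassigner and its methods are 
hypothetical, servers and regions are plain strings, and a retried assignment 
is assumed to succeed on the server that just rejoined.

```python
from collections import defaultdict


class GroupReassigner:
    """Toy model: retry FAILED_OPEN regions when a group's server rejoins."""

    def __init__(self, server_to_group, region_to_group):
        self.server_to_group = dict(server_to_group)
        self.region_to_group = dict(region_to_group)
        # group name -> regions currently stuck in FAILED_OPEN
        self.failed_open = defaultdict(set)
        # region -> server it is currently assigned to
        self.assignments = {}

    def mark_failed_open(self, region):
        """Record that opening this region failed on every group server."""
        self.assignments.pop(region, None)
        self.failed_open[self.region_to_group[region]].add(region)

    def server_joined(self, server):
        """Retry the FAILED_OPEN regions of the group this server belongs to.

        Returns the regions that were retried, in sorted order.
        """
        group = self.server_to_group.get(server)
        if group is None:
            return []
        retried = sorted(self.failed_open.pop(group, set()))
        for region in retried:
            # Assumption: the retry succeeds on the rejoining server.
            self.assignments[region] = server
        return retried


if __name__ == "__main__":
    model = GroupReassigner(
        server_to_group={"node-5": "group_a"},
        region_to_group={"region_1": "group_a", "region_2": "group_a"},
    )
    model.mark_failed_open("region_1")
    model.mark_failed_open("region_2")
    print(model.server_joined("node-5"))
```

The point of the sketch is that the retry is triggered by the server-join 
event and scoped to that server's group, so no operator action or master 
restart is needed.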

> [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all 
> servers in the group are unavailable
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-19144
>                 URL: https://issues.apache.org/jira/browse/HBASE-19144
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>            Reporter: Andrew Purtell



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
