[
https://issues.apache.org/jira/browse/HBASE-25815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Caroline Zhou reassigned HBASE-25815:
-------------------------------------
Assignee: Caroline Zhou
> RSGroupBasedLoadBalancer online status never updates after being set to true
> for the first time
> -----------------------------------------------------------------------------------------------
>
> Key: HBASE-25815
> URL: https://issues.apache.org/jira/browse/HBASE-25815
> Project: HBase
> Issue Type: Bug
> Reporter: Caroline Zhou
> Assignee: Caroline Zhou
> Priority: Minor
>
> Once the RSGroupBasedLoadBalancer is “online” (it has found the hbase:meta
> and hbase:rsgroup tables), it never updates that status again. As a result,
> if hbase:meta or hbase:rsgroup later goes offline, the balancer is not marked
> “offline,” and some code keeps taking the “online” path even though the
> catalog tables cannot be read from or written to (in particular, anything
> that calls RSGroupInfoManagerImpl#flushConfig).
> Also, in the RSGroupInfoManagerImpl#flushConfig code path, the write to
> hbase:rsgroup comes before the update to the in-memory rsGroupMap and
> tableMap (see the order of [these lines of
> code|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java#L664-L670]).
> If hbase:rsgroup goes offline after the RSGroupBasedLoadBalancer is
> already marked “online,” the exception thrown by the write to the offline
> hbase:rsgroup table prevents the in-memory rsGroupMap and tableMap from
> being updated. Given that ordering, the in-memory state should be updated
> first.
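
The failure chain described above can be illustrated with a minimal, self-contained sketch. This is not the HBase code: RsGroupFlushSketch, GroupTable, and FakeTable are hypothetical stand-ins, the maps are simplified to Map<String, String>, and the behavior of the offline path (skipping the write) is an assumption made only for illustration. It shows a latched online flag sending flushConfig down the write path after the table has gone offline, so the exception leaves the in-memory map stale.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; names and types are illustrative, not the HBase API.
class RsGroupFlushSketch {
    /** Stand-in for the hbase:rsgroup table. */
    interface GroupTable {
        boolean isAvailable();
        void write(Map<String, String> groups) throws IOException;
    }

    /** In-memory fake whose availability can be toggled. */
    static final class FakeTable implements GroupTable {
        final AtomicBoolean up = new AtomicBoolean(true);
        public boolean isAvailable() { return up.get(); }
        public void write(Map<String, String> groups) throws IOException {
            if (!up.get()) {
                throw new IOException("table is offline");
            }
        }
    }

    private final GroupTable table;
    private boolean online;  // latched: set to true once, never reset
    private Map<String, String> rsGroupMap = new HashMap<>();

    RsGroupFlushSketch(GroupTable table) { this.table = table; }

    /** Mirrors the reported behavior: the status is never re-evaluated. */
    boolean isOnline() {
        if (!online && table.isAvailable()) {
            online = true;
        }
        return online;
    }

    Map<String, String> rsGroupMap() { return rsGroupMap; }

    /** Persist first, then update memory (the order described in the report). */
    void flushConfig(Map<String, String> newGroups) throws IOException {
        if (isOnline()) {
            table.write(newGroups);  // throws if the table has gone offline
        }
        // Assumption: the offline path skips the write and only updates memory.
        rsGroupMap = new HashMap<>(newGroups);  // not reached when write throws
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable();
        RsGroupFlushSketch manager = new RsGroupFlushSketch(table);
        manager.isOnline();   // table is up, so the flag latches to true
        table.up.set(false);  // table goes offline afterwards

        Map<String, String> update = new HashMap<>();
        update.put("table1", "groupA");
        try {
            manager.flushConfig(update);
        } catch (IOException e) {
            System.out.println("flush failed: " + e.getMessage());
        }
        System.out.println("online=" + manager.isOnline());       // online=true
        System.out.println("rsGroupMap=" + manager.rsGroupMap()); // rsGroupMap={}
    }
}
```

In this sketch, moving the rsGroupMap assignment above the write would let the in-memory state advance even when the write fails, at the cost of memory running ahead of the persisted table until a later flush succeeds. Whether that trade-off is acceptable for the real RSGroupInfoManagerImpl is not something the sketch can settle.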