[ 
https://issues.apache.org/jira/browse/HBASE-23369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Stack resolved HBASE-23369.
-----------------------------------
    Fix Version/s: 2.3.0
                   3.0.0
     Hadoop Flags: Incompatible change
     Release Note: If a RegionServer reports a Region as OPEN in disagreement 
with Master's status on the Region, the Master now tells the RegionServer to 
silently close the Region.
         Assignee: Michael Stack
       Resolution: Fixed

Merged to branch-2 and master branch.

I think this belongs in branch-2.2 too. Shout and I'll pull it back.
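
For readers skimming the release note above, here is a rough sketch of the decision it describes. This is not the actual patch: the class, enum, and method names (RegionReportSketch, Action, onReportedOpen) are hypothetical, and the Master's bookkeeping is reduced to a simple map from Region name to the server expected to host it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the Master-side check.
// None of these names are real HBase classes or methods.
public class RegionReportSketch {

    public enum Action { NONE, CLOSE_SILENTLY }

    // The Master's view: Region name -> server the Master expects to host it.
    private final Map<String, String> expectedLocation = new HashMap<>();

    // Record that the Master believes 'region' is open on 'server'.
    public void assign(String region, String server) {
        expectedLocation.put(region, server);
    }

    // Decide what to do when 'server' reports 'region' as OPEN.
    public Action onReportedOpen(String region, String server) {
        String expected = expectedLocation.get(region);
        if (expected == null) {
            // The Master has no record of this Region at all.
            return Action.CLOSE_SILENTLY;
        }
        if (!expected.equals(server)) {
            // The Master knows the Region but expects it on another server.
            return Action.CLOSE_SILENTLY;
        }
        // The report agrees with the Master's view; nothing to do.
        return Action.NONE;
    }

    public static void main(String[] args) {
        RegionReportSketch master = new RegionReportSketch();
        master.assign("regionA", "rs1");
        System.out.println(master.onReportedOpen("regionA", "rs1"));
        System.out.println(master.onReportedOpen("regionA", "rs2"));
        System.out.println(master.onReportedOpen("regionB", "rs1"));
    }
}
```

Both disagreement cases from the issue description map to the same outcome here: the reporting RegionServer is asked to close the Region rather than being killed.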

> Auto-close 'unknown' Regions reported as OPEN on RegionServers
> --------------------------------------------------------------
>
>                 Key: HBASE-23369
>                 URL: https://issues.apache.org/jira/browse/HBASE-23369
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0
>
>
> In the old days, if a RegionServer's report didn't agree with the 
> Master's view of the cluster, we'd kill the RegionServer.
> Lately, in tests that overrun a cluster, after sustained high load, the 
> Master can start failing its updates against Meta (CallQueueTooBigException 
> <= more on this later). It can then lose proper accounting of all Region 
> members. In one variant, a RegionServer reports its list of open Regions to 
> the Master and the Master either doesn't 'know' of a particular Region or 
> knows the Region but expects it to be open on another RegionServer.
> Here is an example of how it looks each time the RS reports:
> {code}
>  2019-12-03 07:07:00,757 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: No 
> t1,08f5c285,1573094375485.ee78a0c951c1c902d8f3f3912394a0e5. RegionStateNode 
> but reported ONLINE at server.example.org,16020,1575354666245 
> (inServerRegionList=false).
>  2019-12-03 07:07:03,793 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: No 
> t1,08f5c285,1573094375485.ee78a0c951c1c902d8f3f3912394a0e5. RegionStateNode 
> but reported ONLINE at server.example.org,16020,1575354666245 
> (inServerRegionList=false).
> {code}
> It also shows as an 'inconsistency' in the 'HBCK' tab on the Master UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
