[ 
https://issues.apache.org/jira/browse/HBASE-23369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007687#comment-17007687
 ] 

Michael Stack commented on HBASE-23369:
---------------------------------------

I've been running cluster tests with this patch in place for the last few weeks. 
Good for tamping down the mayhem when a cluster goes haywire after being overdriven 
by sustained load that causes the Master to lose accounting.

> Auto-close 'unknown' Regions reported as OPEN on RegionServers
> --------------------------------------------------------------
>
>                 Key: HBASE-23369
>                 URL: https://issues.apache.org/jira/browse/HBASE-23369
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0
>
>
> In the old days, if a RegionServer reported state that didn't agree w/ the 
> Master's view of the cluster, we'd kill the RegionServer.
> Lately, in tests that overrun a cluster, after sustained high load, the Master 
> can start failing its updates against Meta (CallQueueTooBigException <= More 
> on this later). It can then lose proper accounting of all Region members. In 
> one variant, a RegionServer reports its list of open Regions to the Master 
> and the Master either doesn't 'know' of a particular Region or knows the 
> Region but expects it to be open on another RegionServer.
> Here is an example of how it looks each time the RS reports:
> {code}
>  2019-12-03 07:07:00,757 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: No 
> t1,08f5c285,1573094375485.ee78a0c951c1c902d8f3f3912394a0e5. RegionStateNode 
> but reported ONLINE at server.example.org,16020,1575354666245 
> (inServerRegionList=false).
>  2019-12-03 07:07:03,793 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: No 
> t1,08f5c285,1573094375485.ee78a0c951c1c902d8f3f3912394a0e5. RegionStateNode 
> but reported ONLINE at server.example.org,16020,1575354666245 
> (inServerRegionList=false).
> {code}
> This will also show as an 'inconsistency' in the 'HBCK' tab on the Master UI.
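
The two mismatch cases in the description (the Master does not know the Region at all, or it knows the Region but expects it on a different RegionServer) can be sketched as a small classification step. This is only an illustration of the reasoning, not the AssignmentManager code or the patch itself: the class, enum, and method names are hypothetical, the Master's view is reduced to a plain map from encoded Region name to expected server, and the second server name and Region name are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from HBase.
class RegionReportSketch {

    // Possible outcomes when checking one Region from a RegionServer report.
    enum Verdict { CONSISTENT, UNKNOWN_TO_MASTER, EXPECTED_ELSEWHERE }

    // masterView: encoded Region name -> server the Master expects to host it
    // (an assumed simplification of the Master's in-memory state).
    static Verdict classify(Map<String, String> masterView,
                            String reportingServer,
                            String encodedRegionName) {
        String expectedServer = masterView.get(encodedRegionName);
        if (expectedServer == null) {
            // The Master has no record of this Region at all.
            return Verdict.UNKNOWN_TO_MASTER;
        }
        if (!expectedServer.equals(reportingServer)) {
            // The Master knows the Region but thinks it is open somewhere else.
            return Verdict.EXPECTED_ELSEWHERE;
        }
        return Verdict.CONSISTENT;
    }

    public static void main(String[] args) {
        String reporter = "server.example.org,16020,1575354666245";

        Map<String, String> masterView = new HashMap<>();
        // Made-up entry: a Region the Master expects on a different, made-up server.
        masterView.put("0123456789abcdef0123456789abcdef",
                       "other.example.org,16020,1575354666000");

        // Region from the log excerpt above: absent from the Master's view.
        System.out.println(classify(masterView, reporter,
                "ee78a0c951c1c902d8f3f3912394a0e5"));
        // Region the Master expects elsewhere.
        System.out.println(classify(masterView, reporter,
                "0123456789abcdef0123456789abcdef"));
    }
}
```

What to do with each non-consistent verdict (warn only, close the Region, or something else) is a policy decision that this sketch deliberately leaves out.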



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
