[ https://issues.apache.org/jira/browse/HBASE-23372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989179#comment-16989179 ]

Andrew Kyle Purtell edited comment on HBASE-23372 at 12/5/19 9:03 PM:
----------------------------------------------------------------------

I would say #2 is definitely needed. It should delete assignment znodes with an 
mtime older than anything that could possibly be reasonable, or belonging to 
tables that no longer exist.

It could also be reasonable to have the assignment manager take care of this 
itself, but in any case hbck should have a mode that cleans up this state, 
because then it is done with operator consent, in a separate process, and out 
of band with other master functions.

One core change that would not be controversial is to have the master look for 
and clean up all znode state for tables that have just been dropped. 
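
To make that concrete, the core of such a cleanup pass (whether it lives in an 
hbck mode or in the master) could look roughly like the sketch below. This is 
only an illustration: the quorum string, the age threshold, and the 
tableExistsFor() helper are placeholders, not the real hbck or 
AssignmentManager wiring.

{code:java}
import java.util.List;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class StaleRitZnodeCleaner {
  private static final String RIT_PATH = "/hbase/region-in-transition";
  // Treat anything older than this mtime as impossibly old (example: 7 days).
  private static final long MAX_REASONABLE_AGE_MS = 7L * 24 * 60 * 60 * 1000;

  public static void main(String[] args) throws Exception {
    ZooKeeper zk = new ZooKeeper("zkhost:2181", 30000, event -> { });
    long now = System.currentTimeMillis();
    List<String> children = zk.getChildren(RIT_PATH, false);
    for (String child : children) {
      String path = RIT_PATH + "/" + child;
      Stat stat = zk.exists(path, false);
      if (stat == null) {
        continue; // already gone
      }
      boolean tooOld = (now - stat.getMtime()) > MAX_REASONABLE_AGE_MS;
      boolean tableGone = !tableExistsFor(child);
      if (tooOld || tableGone) {
        // Version -1 means "any version"; only do this with operator consent.
        zk.delete(path, -1);
        System.out.println("Deleted stale RIT znode " + path);
      }
    }
    zk.close();
  }

  // Placeholder: a real tool would resolve the encoded region name to its table
  // (e.g. via hbase:meta) and check via the Admin API whether that table still
  // exists. Conservatively assume it does, so only the mtime check fires here.
  private static boolean tableExistsFor(String encodedRegionName) {
    return true;
  }
}
{code}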



> ZooKeeper Assignment can result in stale znodes in region-in-transition after 
> table is dropped and hbck run
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-23372
>                 URL: https://issues.apache.org/jira/browse/HBASE-23372
>             Project: HBase
>          Issue Type: Bug
>          Components: hbck, master, Region Assignment, Zookeeper
>    Affects Versions: 1.3.2
>            Reporter: David Manning
>            Priority: Major
>
> It is possible for znodes under /hbase/region-in-transition to remain long 
> after a table is deleted. There does not appear to be any cleanup logic for 
> these.
> The details are a little fuzzy, but it seems to be fallout from HBASE-22617. 
> Incidents related to that bug involved regions stuck in transition, and the 
> use of hbck to fix clusters. A temporary table was created and deleted once 
> per day, but somehow this led to 
> {{FSLimitException$MaxDirectoryItemsExceededException}} and regions stuck in 
> transition. Even weeks after fixing the bug and upgrading the cluster, the 
> znodes remain under /hbase/region-in-transition. In the most impacted 
> cluster, {{hbase zkcli ls /hbase/region-in-transition | wc -w}} returns 
> almost 100,000 entries. This causes very slow region transition times (often 
> 80 seconds), likely due to enumerating all these entries when the zk watch on 
> this node is triggered.
> Log lines for slow region transitions:
> {code:java}
> 2019-12-05 07:02:14,714 DEBUG [K.Worker-pool3-t7344] master.AssignmentManager 
> - Handling RS_ZK_REGION_CLOSED, server=<<SERVERNAME>>, 
> region=<<REGION_HASH>>, which is more than 15 seconds late, 
> current_state={<<REGION_HASH>> state=PENDING_CLOSE, ts=1575529254635, 
> server=<<SERVERNAME>>}
> {code}
> Even during HMaster failover, these entries are not cleaned up, but the 
> following log lines can be seen:
> {code:java}
> 2019-11-27 00:26:27,044 WARN  [.activeMasterManager] master.AssignmentManager 
> - Couldn't find the region in recovering region=<<DELETED_TABLE_REGION>>, 
> state=RS_ZK_REGION_FAILED_OPEN, servername=<<SERVERNAME>>, 
> createTime=1565603905404, payload.length=0
> {code}
> Possible solutions:
>  # Logic that parses the znodes under /hbase/region-in-transition during 
> master failover and checks whether each table still exists, cleaning up 
> entries for nonexistent tables (a rough sketch follows after this list).
>  # A new hbck mode that cleans up znodes for nonexistent regions under 
> /hbase/region-in-transition.
>  # Others?
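> A rough illustration of the check behind option #1 is below. The class and 
> method names are placeholders for wherever this would live in the master, not 
> the actual AssignmentManager code; only Admin.tableExists() is existing HBase 
> client API.
> {code:java}
> import java.io.IOException;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.Admin;
>
> public final class RitTableCheck {
>   private RitTableCheck() { }
>
>   // Returns true if the RIT entry refers to a table that no longer exists,
>   // meaning its znode under /hbase/region-in-transition is safe to delete.
>   public static boolean isStale(Admin admin, TableName table) throws IOException {
>     return !admin.tableExists(table);
>   }
> }
> {code}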


