Viraj Jasani created HBASE-26371:
------------------------------------
Summary: Prioritize meta region move in region_mover
Key: HBASE-26371
URL: https://issues.apache.org/jira/browse/HBASE-26371
Project: HBase
Issue Type: Task
Affects Versions: 1.6.0
Reporter: Viraj Jasani
Assignee: Viraj Jasani
Fix For: 2.5.0, 3.0.0-alpha-2, 1.7.2, 2.4.8, 2.3.8
We have seen a few issues in production where moving the meta region from one
server to another took some time and, in the meantime, regions of other system
tables hosted on the same server were moved simultaneously. When those
non-meta system regions came online on other servers, the new servers could
not write the info:sn update to the meta table for the new location of the
system regions (e.g. the namespace region). At the same time, the active master
was also bounced. The new active master that comes online usually reads the
namespace region's location from the meta table and considers it final. Hence,
even if the namespace region is already online (but on a different host), the
inconsistent info:sn value prevents the master from initializing, because it
keeps waiting for the namespace region to become available on the old
regionserver. In this case, we need to make a special arrangement to bring the
namespace region online on the old server only.
{code:java}
2021-10-12 20:00:00,630 INFO [1f507eff84ef336f1250] regionserver.HRegionServer
- Post open deploy tasks for
hbase:namespace,1626899414773.52693312958f1f507eff84ef336f1250.
2021-10-12 20:04:18,622 INFO [1f507eff84ef336f1250] hbase.MetaTableAccessor -
Updated row hbase:namespace,1626899414773.52693312958f1f507eff84ef336f1250.
with server=server-0,60020,1633467603387
2021-10-12 20:04:18,622 INFO [1f507eff84ef336f1250] client.AsyncProcess - #27,
waiting for some tasks to finish. Expected max=0, tasksInProgress=4
hasError=false, tableName=hbase:meta
2021-10-12 20:04:18,622 INFO [1f507eff84ef336f1250] client.AsyncProcess - Left
over 4 task(s) are processed on server(s): []
2021-10-12 20:04:18,622 DEBUG [1f507eff84ef336f1250] regionserver.HRegionServer
- Finished post open deploy task for
hbase:namespace,1626899414773.52693312958f1f507eff84ef336f1250.
{code}
Similar to namespace, other user or system table regions hosted on the same
server as meta have also encountered inconsistent state updates, specifically
when the meta region moves around and the active master is restarted around
the same time. Once the active master comes online, we have to fix such
inconsistencies with hbck.
On the other hand, there have been some enhancements around not requiring the
meta region's colocation with the active master as part of ZK-less region
assignment, e.g. HBASE-11610.
We have not yet enabled ZK-less region assignment entirely; only the migration
config, hbase.assignment.usezk.migrating, is enabled. With this, we expect the
active master to perform an additional write to the meta table for the updated
region state (in addition to updating the RIT map in the memory of
RegionStates). We have seen some hanging state here as well: if the meta region
is going through a transition (not available) while other non-meta regions are
also being moved by the region mover, the active master cannot complete the
meta update, which further delays the ZK watcher updates that are based on
intermediate state transitions.
{code:java}
client.AsyncProcess - #3, waiting for 1 actions to finish on table: hbase:meta
{code}
If we take a step back and think about these issues, all of them are
associated with graceful start/stop of regionservers. Region mover tries to
move all regions of the given server in parallel using a user-configurable
thread pool, and hence it gives no preference to meta.
On the other hand, after trying to reproduce this inconsistent region state
behaviour with non-graceful start/stop, I have realized that we don't face such
issues there, because ServerCrashProcedure (SCP) always prioritizes the meta
region's availability over any other region if the server being processed by
the SCP was hosting the meta region. This is exactly what region_mover should
also provide. Given that every non-meta region's location is stored in the meta
table, the meta region must always be moved first, and only after it comes
online can other regions be moved in parallel using the configured thread
pool.
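To illustrate the proposed ordering, here is a minimal sketch in plain Java. It
is not the actual region_mover code: the class MetaFirstMoveSketch, the methods
isMeta and moveAll, and the mover callback are hypothetical names, and the
sketch assumes that meta region names start with "hbase:meta," and that the
callback blocks until the moved region is online.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical sketch of "meta first, then everything else in parallel".
class MetaFirstMoveSketch {

  // Assumption: a region belongs to the meta table if its name starts with "hbase:meta,".
  static boolean isMeta(String regionName) {
    return regionName.startsWith("hbase:meta,");
  }

  // Assumption: mover.accept(region) returns only after the region is online
  // on its destination server.
  static void moveAll(List<String> regions, int maxThreads, Consumer<String> mover)
      throws Exception {
    List<String> others = new ArrayList<>();
    for (String region : regions) {
      if (isMeta(region)) {
        // Move meta synchronously on the calling thread, before anything else.
        mover.accept(region);
      } else {
        others.add(region);
      }
    }

    // Only now are the remaining regions handed to the configured thread pool.
    ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
    try {
      List<Future<?>> pending = new ArrayList<>();
      for (String region : others) {
        pending.add(pool.submit(() -> mover.accept(region)));
      }
      for (Future<?> f : pending) {
        f.get(); // wait for each move; rethrows a failure as ExecutionException
      }
    } finally {
      pool.shutdown();
    }
  }
}
```

Calling moveAll with a list that contains the meta region anywhere in it would
invoke the callback for meta first and wait for it to return before any other
region is submitted to the pool, which mirrors the preference SCP already gives
to meta.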