[
https://issues.apache.org/jira/browse/HBASE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752862#comment-16752862
]
Sergey Shelukhin edited comment on HBASE-21787 at 1/26/19 12:44 AM:
--------------------------------------------------------------------
It's a fresh cluster.
Offline regions might be from manual intervention, although I'm not 100%
sure; I didn't check. Anyway, that would explain HBASE-21786.
However, the issue here is more general: we load 2 RITs and take the lock for
the 1st, but then replace it with the 2nd on the region. That doesn't depend on
meta state as far as I can see.
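
To make the sequence concrete, here is a toy model of it. This is not HBase code: RitRestoreSketch, RegionNode, restoreNaively and isConsistent are hypothetical names invented for illustration, and the sketch simply assumes the lock goes to the first procedure loaded while each later attach overwrites the earlier one. The pids are the ones from the log in the issue description.

```java
/** Toy model of the restore sequence; all names here are hypothetical, not HBase APIs. */
public class RitRestoreSketch {

    static final long NONE = -1L;

    /** Stand-in for the per-region state kept by the master. */
    static final class RegionNode {
        long lockOwnerPid = NONE; // procedure holding the exclusive region lock
        long attachedPid = NONE;  // procedure the region currently treats as its RIT
    }

    /**
     * Assumed restore order: the lock is taken for the older procedure,
     * then both procedures are attached in load order, so the last attach wins.
     */
    static RegionNode restoreNaively(long olderPid, long newerPid) {
        RegionNode node = new RegionNode();
        node.lockOwnerPid = olderPid; // "Took xlock for pid=..."
        node.attachedPid = olderPid;  // first "to restore RIT"
        node.attachedPid = newerPid;  // second "to restore RIT" replaces the first
        return node;
    }

    /** Invariant one would expect: if the lock is held, its owner is the attached RIT. */
    static boolean isConsistent(RegionNode node) {
        return node.lockOwnerPid == NONE || node.lockOwnerPid == node.attachedPid;
    }

    public static void main(String[] args) {
        RegionNode node = restoreNaively(1738L, 4351L);
        // prints: lockOwner=1738 attached=4351 consistent=false
        System.out.println("lockOwner=" + node.lockOwnerPid
            + " attached=" + node.attachedPid
            + " consistent=" + isConsistent(node));
    }
}
```

The mismatch comes only from the order of the lock and attach steps in this model; nothing in it reads meta state.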
> proc WAL replaces a RIT that holds a lock with a RIT that doesn't
> -----------------------------------------------------------------
>
> Key: HBASE-21787
> URL: https://issues.apache.org/jira/browse/HBASE-21787
> Project: HBase
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Sergey Shelukhin
> Priority: Critical
>
> This is not the same as HBASE-21786, but it is related: after a master
> restart, 2 RITs are both in the proc WAL. According to the comment where the
> RIT is restored, this is expected.
> However, what actually happens is that the master takes the lock for the
> older RIT and then replaces the older RIT with the newer RIT on the region.
> You can see two "to restore RIT" log lines.
> Both RITs are still active in the procedures view (and stuck due to yet
> another bug that I will file later). However, it seems wrong that the lock is
> held by one RIT while the region points to the other RIT as the correct one.
> {noformat}
> 2019-01-25 11:26:54,616 INFO [master/master:17000:becomeActiveMaster]
> procedure.MasterProcedureScheduler: Took xlock for pid=1738, ppid=3,
> state=WAITING:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=false;
> TransitRegionStateProcedure table=table,
> region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN
> 2019-01-25 11:26:54,834 INFO [master/master:17000:becomeActiveMaster]
> assignment.AssignmentManager: Attach pid=1738, ppid=3,
> state=WAITING:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=false;
> TransitRegionStateProcedure table=table,
> region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN to rit=OFFLINE,
> location=null, table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e to
> restore RIT
> 2019-01-25 11:26:54,853 INFO [master/master:17000:becomeActiveMaster]
> assignment.AssignmentManager: Attach pid=4351,
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false;
> TransitRegionStateProcedure table=table,
> region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN to rit=OFFLINE,
> location=null, table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e to
> restore RIT
> 2019-01-25 11:27:02,460 INFO [master/master:17000:becomeActiveMaster]
> assignment.RegionStateStore: Load hbase:meta entry
> region=27f7ab2a05d9d730b2ab2339d1531b8e, regionState=OPENING,
> lastHost=server1,17020,1548290445704,
> regionLocation=server2,17020,1548442571056, openSeqNum=120108
> 2019-01-25 11:27:10,184 INFO [PEWorker-11]
> procedure.MasterProcedureScheduler: Waiting on xlock for pid=4351,
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false;
> TransitRegionStateProcedure table=table,
> region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN held by pid=1738
> {noformat}