[
https://issues.apache.org/jira/browse/HBASE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752860#comment-16752860
]
Duo Zhang commented on HBASE-21787:
-----------------------------------
There must be other bugs involved, though maybe not here. In the 2.0+ code
base, IIRC, no region can be in the OFFLINE state. The only use of OFFLINE is
on a new cluster that does not yet have a meta table, where it indicates that
condition. All other regions can only be in the CLOSED state.
Is your cluster a fresh one, or was it upgraded from a previous version?
> proc WAL replaces a RIT that holds a lock with a RIT that doesn't
> -----------------------------------------------------------------
>
> Key: HBASE-21787
> URL: https://issues.apache.org/jira/browse/HBASE-21787
> Project: HBase
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Sergey Shelukhin
> Priority: Critical
>
> This is not the same as HBASE-21786, but it is related: after a master
> restart, two RITs are both in the proc WAL. According to the comment where
> the RIT is restored, this is expected.
> However, what happens is that the master takes the lock for the older RIT and
> then replaces the older RIT with the newer RIT on the region.
> You can see two "to restore RIT" log lines.
> Both RITs are still active in the procedures view (and stuck due to yet
> another bug that I will file later). However, it seems wrong that the lock is
> held by one RIT while the region points to the other RIT as the correct one.
> {noformat}
> 2019-01-25 11:26:54,616 INFO [master/master:17000:becomeActiveMaster]
> procedure.MasterProcedureScheduler: Took xlock for pid=1738, ppid=3,
> state=WAITING:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=false;
> TransitRegionStateProcedure table=table,
> region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN
> 2019-01-25 11:26:54,834 INFO [master/master:17000:becomeActiveMaster]
> assignment.AssignmentManager: Attach pid=1738, ppid=3,
> state=WAITING:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=false;
> TransitRegionStateProcedure table=table,
> region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN to rit=OFFLINE,
> location=null, table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e to
> restore RIT
> 2019-01-25 11:26:54,853 INFO [master/master:17000:becomeActiveMaster]
> assignment.AssignmentManager: Attach pid=4351,
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false;
> TransitRegionStateProcedure table=table,
> region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN to rit=OFFLINE,
> location=null, table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e to
> restore RIT
> 2019-01-25 11:27:02,460 INFO [master/master:17000:becomeActiveMaster]
> assignment.RegionStateStore: Load hbase:meta entry
> region=27f7ab2a05d9d730b2ab2339d1531b8e, regionState=OPENING,
> lastHost=server1,17020,1548290445704,
> regionLocation=server2,17020,1548442571056, openSeqNum=120108
> 2019-01-25 11:27:10,184 INFO [PEWorker-11]
> procedure.MasterProcedureScheduler: Waiting on xlock for pid=4351,
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false;
> TransitRegionStateProcedure table=table,
> region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN held by pid=1738
> {noformat}
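
For readers following the log above, here is a minimal toy model of the
inconsistency being described. It is NOT HBase code: the class
RitAttachSketch, the two maps, and the methods tryXlock, attach, and
consistent are hypothetical stand-ins invented for illustration. The only
assumption it encodes is the one stated in the description: the lock is taken
for the older procedure (pid=1738), and a later attach for the newer procedure
(pid=4351) overwrites the region's RIT pointer unconditionally.

```java
import java.util.HashMap;
import java.util.Map;

public class RitAttachSketch {
    // Hypothetical stand-in for the scheduler's exclusive region lock: region -> owning pid.
    static final Map<String, Long> xlockOwner = new HashMap<>();
    // Hypothetical stand-in for the region's "current RIT procedure" pointer: region -> pid.
    static final Map<String, Long> attachedRit = new HashMap<>();

    // Grants the lock if it is free or already held by the same pid.
    static boolean tryXlock(String region, long pid) {
        Long owner = xlockOwner.putIfAbsent(region, pid);
        return owner == null || owner == pid;
    }

    // Assumed behavior from the description: attach overwrites without checking the lock owner.
    static void attach(String region, long pid) {
        attachedRit.put(region, pid);
    }

    // True when the lock holder (if any) is also the procedure the region points to.
    static boolean consistent(String region) {
        Long owner = xlockOwner.get(region);
        return owner == null || owner.equals(attachedRit.get(region));
    }

    public static void main(String[] args) {
        String region = "27f7ab2a05d9d730b2ab2339d1531b8e";
        tryXlock(region, 1738L); // "Took xlock for pid=1738"
        attach(region, 1738L);   // first "to restore RIT" line
        attach(region, 4351L);   // second "to restore RIT" line replaces the first
        System.out.println("xlock owner:  " + xlockOwner.get(region));
        System.out.println("attached RIT: " + attachedRit.get(region));
        System.out.println("consistent:   " + consistent(region));
        // Mirrors "Waiting on xlock for pid=4351 ... held by pid=1738".
        System.out.println("pid=4351 can lock: " + tryXlock(region, 4351L));
    }
}
```

In this model the lock owner ends up as 1738 while the region points to 4351,
so the consistency check fails and 4351 cannot acquire the lock, which matches
the shape of the last log line. Whether the real fix belongs in the attach
path or in how the proc WAL is replayed is not something this sketch can say.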