[ 
https://issues.apache.org/jira/browse/HBASE-20949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559091#comment-16559091
 ] 

Duo Zhang commented on HBASE-20949:
-----------------------------------

The problem here is that, we seem to finish the procedure twice.

{noformat}
2018-07-26 22:35:44,762 INFO  [PEWorker-4] procedure2.ProcedureExecutor(1551): 
Initialized subprocedures=[{pid=96, ppid=95, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=d0b0215a67002d7c19486ee75f610d94}, {pid=97, ppid=95, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=9ae1ea9ba045c13d80b255e0de39eafb}, {pid=98, ppid=95, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=46cd0014b889f5540cb5e356c5fb7d6e}, {pid=99, ppid=95, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=201c3520780fedb059b26f6d757c250f}, {pid=100, ppid=95, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=c5e9ddc19cbd286cf4a016f889e79732}, {pid=101, ppid=95, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251}]
2018-07-26 22:35:44,819 DEBUG [PEWorker-8] procedure2.LockAndQueue(123): 
pid=101, ppid=95, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
AssignProcedure table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251 acquire shared lock 3284bcdf: 
exclusiveLockOwner=NONE, sharedLockCount=1, waitingProcCount=0 succeeded
2018-07-26 22:35:44,819 DEBUG [PEWorker-8] procedure2.LockAndQueue(123): 
pid=101, ppid=95, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; 
AssignProcedure table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251 acquire shared lock 7f6eec32: 
exclusiveLockOwner=95, sharedLockCount=0, waitingProcCount=0 succeeded
2018-07-26 22:35:44,819 INFO  [PEWorker-8] 
procedure.MasterProcedureScheduler(689): pid=101, ppid=95, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251 checking lock on 
e1d6b5e83f6ed6175d95b0a628627251
2018-07-26 22:35:44,871 INFO  [PEWorker-8] assignment.AssignProcedure(218): 
Starting pid=101, ppid=95, state=RUNNABLE:REGION_TRANSITION_QUEUE, 
hasLock=true; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251; rit=OFFLINE, 
location=asf911.gq1.ygridcore.net,41392,1532644489849; forceNewPlan=false, 
retain=true
2018-07-26 22:35:45,022 INFO  [PEWorker-5] assignment.RegionStateStore(199): 
pid=101 updating hbase:meta row=e1d6b5e83f6ed6175d95b0a628627251, 
regionState=OPENING, regionLocation=asf911.gq1.ygridcore.net,41392,1532644489849
2018-07-26 22:35:45,026 INFO  [PEWorker-5] 
assignment.RegionTransitionProcedure(241): Dispatch pid=101, ppid=95, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251; rit=OPENING, 
location=asf911.gq1.ygridcore.net,41392,1532644489849
2018-07-26 22:35:45,198 DEBUG 
[RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=57564] 
assignment.RegionTransitionProcedure(264): Received report OPENED seqId=5, 
pid=101, ppid=95, state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; 
AssignProcedure table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251; rit=OPENING, 
location=asf911.gq1.ygridcore.net,41392,1532644489849
2018-07-26 22:35:45,198 DEBUG [PEWorker-15] 
assignment.RegionTransitionProcedure(354): Finishing pid=101, ppid=95, 
state=RUNNABLE:REGION_TRANSITION_FINISH, hasLock=true; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251; rit=OPENING, 
location=asf911.gq1.ygridcore.net,41392,1532644489849
2018-07-26 22:35:45,199 INFO  [PEWorker-15] assignment.RegionStateStore(199): 
pid=101 updating hbase:meta row=e1d6b5e83f6ed6175d95b0a628627251, 
regionState=OPEN, openSeqNum=5, 
regionLocation=asf911.gq1.ygridcore.net,41392,1532644489849
2018-07-26 22:35:45,276 DEBUG [PEWorker-5] procedure2.LockAndQueue(132): 
pid=101, ppid=95, state=SUCCESS, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251 release shared lock 7f6eec32: 
exclusiveLockOwner=NONE, sharedLockCount=6, waitingProcCount=0
2018-07-26 22:35:45,276 DEBUG [PEWorker-5] procedure2.LockAndQueue(132): 
pid=101, ppid=95, state=SUCCESS, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251 release shared lock 3284bcdf: 
exclusiveLockOwner=NONE, sharedLockCount=6, waitingProcCount=0
2018-07-26 22:35:45,277 INFO  [PEWorker-5] procedure2.ProcedureExecutor(1307): 
Finished pid=101, ppid=95, state=SUCCESS, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251 in 439msec
2018-07-26 22:35:45,543 DEBUG [PEWorker-15] procedure2.LockAndQueue(132): 
pid=101, ppid=95, state=SUCCESS, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251 release shared lock 7f6eec32: 
exclusiveLockOwner=NONE, sharedLockCount=4, waitingProcCount=0
2018-07-26 22:35:45,544 DEBUG [PEWorker-15] procedure2.LockAndQueue(132): 
pid=101, ppid=95, state=SUCCESS, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251 release shared lock 3284bcdf: 
exclusiveLockOwner=NONE, sharedLockCount=4, waitingProcCount=0
2018-07-26 22:35:45,544 INFO  [PEWorker-15] procedure2.ProcedureExecutor(1307): 
Finished pid=101, ppid=95, state=SUCCESS, hasLock=false; AssignProcedure 
table=testRestoreSnapshotAfterSplittingRegions-1532644538441, 
region=e1d6b5e83f6ed6175d95b0a628627251 in 439msec
{noformat}

The PEWorker-5 and PEWORKER-15 both finish the procedure causes the double 
release.

Anyway let me revert the debug commit first as it  prints too many logs.

> Split/Merge table can be executed concurrently with DisableTableProcedure
> -------------------------------------------------------------------------
>
>                 Key: HBASE-20949
>                 URL: https://issues.apache.org/jira/browse/HBASE-20949
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Duo Zhang
>            Priority: Major
>         Attachments: HBASE-20949-debug.patch
>
>
> The top flaky tests on the dashboard are all because of this.
> TestRestoreSnapshotFromClient
> TestSimpleRegionNormalizerOnCluster
> Theoretically this should not happen, need to dig more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to