stack created HBASE-21213:
-----------------------------

             Summary: [hbck2] More cleanup needed on bypass; old Procedure 
left in RegionStateNodes
                 Key: HBASE-21213
                 URL: https://issues.apache.org/jira/browse/HBASE-21213
             Project: HBase
          Issue Type: Bug
          Components: amv2, hbck2
            Reporter: stack
            Assignee: stack
             Fix For: 2.1.1


This is a follow-on from HBASE-21083, which added the 'bypass' functionality. On 
bypass, there is more state to be cleared if we are to allow new Procedures to 
be scheduled.

For example, here is a bypass:

{code}
2018-09-20 05:45:43,722 INFO org.apache.hadoop.hbase.procedure2.Procedure: 
pid=100449, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true, 
bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, 
region=37cc206fe9c4bc1c0a46a34c5f523d16, 
server=ve1233.halxg.cloudera.com,22101,1537397961664 bypassed, returning null 
to finish it
2018-09-20 05:45:44,022 INFO 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=100449, 
state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, 
region=37cc206fe9c4bc1c0a46a34c5f523d16, 
server=ve1233.halxg.cloudera.com,22101,1537397961664 in 2mins, 7.618sec
{code}

... but then when I try to assign the bypassed region later, I get this:

{code}
2018-09-20 05:46:31,435 WARN 
org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: There is 
already another procedure running on this region this=pid=100450, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure 
table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 
owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure 
table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, 
server=ve1233.halxg.cloudera.com,22101,1537397961664 pid=100450, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure 
table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16; rit=OPENING, 
location=ve1233.halxg.cloudera.com,22101,1537397961664
2018-09-20 05:46:31,510 INFO 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Rolled back pid=100450, 
state=ROLLEDBACK, 
exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
AssignProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: 
There is already another procedure running on this region this=pid=100450, 
state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure 
table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 
owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure 
table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, 
server=ve1233.halxg.cloudera.com,22101,1537397961664; AssignProcedure 
table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 exec-time=473msec
{code}

... which is a long-winded way of saying the bypassed UnassignProcedure 
(pid=100449) still exists in RegionStateNodes as the owner of the region, even 
though it finished with state=SUCCESS.
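
To make the failure mode concrete, here is a toy model of it. This is only a 
sketch under assumptions: BypassSketch, RegionNode, tryAttach and the two 
bypass methods are hypothetical names invented for illustration, not the HBase 
classes or the actual fix. It shows a finished-by-bypass procedure that is 
still recorded as a region's owner blocking the next procedure, and how 
clearing that reference on bypass would unblock it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the problem; NOT the HBase implementation.
class BypassSketch {
    /** Stand-in for a region state node: remembers which procedure owns the region. */
    static final class RegionNode {
        Long ownerPid; // null when no procedure is attached
    }

    private final Map<String, RegionNode> nodes = new HashMap<>();

    private RegionNode node(String region) {
        return nodes.computeIfAbsent(region, r -> new RegionNode());
    }

    /** Attach a procedure to a region; refuse if another pid is still recorded as owner. */
    boolean tryAttach(String region, long pid) {
        RegionNode n = node(region);
        if (n.ownerPid != null && n.ownerPid != pid) {
            return false; // "There is already another procedure running on this region"
        }
        n.ownerPid = pid;
        return true;
    }

    /** Bypass as observed in the logs: the procedure finishes, the node is left untouched. */
    void bypassWithoutCleanup(String region, long pid) {
        // Intentionally empty: the stale owner reference stays behind.
    }

    /** One possible cleanup (an assumption): drop the owner reference when bypassing. */
    void bypassWithCleanup(String region, long pid) {
        RegionNode n = node(region);
        if (n.ownerPid != null && n.ownerPid == pid) {
            n.ownerPid = null;
        }
    }

    public static void main(String[] args) {
        String region = "37cc206fe9c4bc1c0a46a34c5f523d16";
        BypassSketch s = new BypassSketch();
        s.tryAttach(region, 100449L);                       // the unassign takes the region
        s.bypassWithoutCleanup(region, 100449L);            // bypassed, owner left in place
        System.out.println(s.tryAttach(region, 100450L));   // prints "false"
        s.bypassWithCleanup(region, 100449L);               // bypass that clears the owner
        System.out.println(s.tryAttach(region, 100450L));   // prints "true"
    }
}
```

The pids and region name are taken from the logs above; everything else in 
the sketch is illustrative.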



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
