[ 
https://issues.apache.org/jira/browse/HBASE-22626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16872941#comment-16872941
 ] 

stack commented on HBASE-22626:
-------------------------------

Is this hbase-2.0.0, as the affected version says? If so, a bunch of fixes went 
into branch-2.0 after the 2.0.0 release. Was the data migrated from branch-1 to 
2.0.0? Thanks

> Master assigns the region successfully but fails to update the region state, 
> leaving the region stuck in OPENING in zookeeper. If the master is restarted, 
> those OPENING regions are never assigned.
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-22626
>                 URL: https://issues.apache.org/jira/browse/HBASE-22626
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 2.0.0
>            Reporter: chenwandong
>            Priority: Critical
>
> Problem Description
>  (1) Restarting one of the region servers causes all the regions assigned to 
> it to be migrated.
> (2) The master checks these regions and assigns them to other region servers. 
> The assignment succeeds, but the state update fails.
> 2019-06-22 16:44:20,065 INFO [PEWorker-8] procedure2.ProcedureExecutor: 
> Finished pid=5038, ppid=4488, state=SUCCESS; AssignProcedure 
> table=test061910, region=379c730490b4848f3db065fb25b87452 in 10.5680sec
>  ... ...
>  2019-06-22 16:44:38,725 INFO [PEWorker-4] 
> procedure.MasterProcedureScheduler: pid=5038, ppid=4488, state=SUCCESS; 
> AssignProcedure table=test061910, region=379c730490b4848f3db065fb25b87452 
> checking lock on 379c730490b4848f3db065fb25b87452
>  2019-06-22 16:44:38,725 ERROR [PEWorker-4] procedure2.ProcedureExecutor: 
> CODE-BUG: Uncaught runtime exception for pid=5038, ppid=4488, state=SUCCESS; 
> AssignProcedure table=test061910, region=379c730490b4848f3db065fb25b87452
>  java.lang.UnsupportedOperationException: Unhandled state 
> REGION_TRANSITION_FINISH; there is no rollback for assignment unless we 
> cancel the operation by dropping/disabling the table
>  at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.rollback(RegionTransitionProcedure.java:412)
>  at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.rollback(RegionTransitionProcedure.java:95)
>  at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1373)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1329)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1198)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1761)
> (3) The master is restarted and loads the old region state. Regions in the 
> OPENING state are never reassigned, and the following log is printed.
> 2019-06-22 16:46:26,210 INFO [master/hdp0:16000] assignment.RegionStateStore: 
> Load hbase:meta entry region=379c730490b4848f3db065fb25b87452, 
> regionState=OPENING, lastHost=hdp2,16020,1561190163476, 
> regionLocation=hdp0,16020,1561190163887, openSeqNum=69448
>  ... ...
>  2019-06-22 16:51:28,514 WARN [ProcExecTimeout] assignment.AssignmentManager: 
> STUCK Region-In-Transition rit=OPENING, location=hdp0,16020,1561190163887, 
> table=test061910, region=379c730490b4848f3db065fb25b87452
>  ... ....
>  2019-06-22 16:49:28,483 WARN [ProcExecTimeout] assignment.AssignmentManager: 
> STUCK Region-In-Transition rit=OPENING, location=hdp0,16020,1561190163887, 
> table=test061910, region=379c730490b4848f3db065fb25b87452
> (4) The state in zookeeper
>  
> test061910,00000000000000000031457280,1560943902323.379c730490b4848f3db065fb25b87452.
>  column=info:regioninfo, timestamp=1561192609882, value=\{ENCODED => 
> 379c730490b4848f3db065fb25b87452, NAME => 
> 'test061910,00000000000000000031457280,1560943902323.379c730490b4848f3db065fb25b87452.',
>  STARTKEY => '00000000000000000031457280', ENDKEY => 
> '00000000000000000041943040'}
>  
> test061910,00000000000000000031457280,1560943902323.379c730490b4848f3db065fb25b87452.
>  column=info:seqnumDuringOpen, timestamp=1561190288613, 
> value=\x00\x00\x00\x00\x00\x01\x0FH
>  
> test061910,00000000000000000031457280,1560943902323.379c730490b4848f3db065fb25b87452.
>  column=info:server, timestamp=1561190288613, value=hdp2:16020
>  
> test061910,00000000000000000031457280,1560943902323.379c730490b4848f3db065fb25b87452.
>  column=info:serverstartcode, timestamp=1561190288613, value=1561190163476
>  
> test061910,00000000000000000031457280,1560943902323.379c730490b4848f3db065fb25b87452.
>  column=info:sn, timestamp=1561192609882, value=hdp0,16020,1561190163887
>  
> test061910,00000000000000000031457280,1560943902323.379c730490b4848f3db065fb25b87452.
>  column=info:state, timestamp=1561192609882, value=OPENING
>  
> Steps to Reproduce
>  (1) Modify the state of a normal region from OPEN to OPENING.
>  (2) Restart the master and region servers, then check the master log and the 
> HBase web UI.
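
As a cross-check on the dump under (4): the info:seqnumDuringOpen value is 
printed in the shell's escaped form, where non-printable bytes appear as \xNN 
and printable ones as plain characters. Assuming it is an 8-byte big-endian 
long, it can be decoded and compared with the openSeqNum in the "Load 
hbase:meta entry" log line. The sketch below is only an illustration; the 
helper names decode_string_binary and seqnum_from_meta are made up for this 
note and are not part of HBase.

```python
def decode_string_binary(escaped: str) -> bytes:
    """Turn shell-escaped text (\\xNN for non-printable bytes) back into raw bytes."""
    out = bytearray()
    i = 0
    while i < len(escaped):
        # A literal backslash-x followed by two hex digits encodes one byte.
        if escaped.startswith("\\x", i) and i + 4 <= len(escaped):
            out.append(int(escaped[i + 2:i + 4], 16))
            i += 4
        else:
            # Any other character stands for its own byte value.
            out.append(ord(escaped[i]))
            i += 1
    return bytes(out)


def seqnum_from_meta(escaped: str) -> int:
    """Interpret the decoded bytes as an 8-byte big-endian long (assumed layout)."""
    raw = decode_string_binary(escaped)
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes, got {len(raw)}")
    return int.from_bytes(raw, "big")


if __name__ == "__main__":
    # Value copied from the info:seqnumDuringOpen column in the report.
    value = r"\x00\x00\x00\x00\x00\x01\x0FH"
    print(seqnum_from_meta(value))  # 69448
```

The result, 69448, equals the openSeqNum logged by RegionStateStore, so the 
info:seqnumDuringOpen cell (timestamp 1561190288613) still reflects the 
earlier open on hdp2 rather than the later assignment to hdp0.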



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
