[ https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reopened HBASE-20893:
---------------------------

Reopening to look at these logs, which I see when running this patch on a
cluster. It's great that it detected recovered.edits, but it looks like the
patch causes us to hit CODE-BUG. We seem to be OK, though at a minimum it will
freak out an operator:

{code}

2018-07-25 06:46:56,692 ERROR [PEWorker-3] assignment.SplitTableRegionProcedure: Error trying to split region 2cb977a87bc6bdf90ef7fc71320d7b50 in the table IntegrationTestBigLinkedList (in state=SPLIT_TABLE_REGIONS_CHECK_CLOSED_REGIONS)
java.io.IOException: Recovered.edits are found in Region: {ENCODED => 2cb977a87bc6bdf90ef7fc71320d7b50, NAME => 'IntegrationTestBigLinkedList,z\xAA;\xC7M\x1Bf8\x85\xB5\x07\xD5\x9B#\xCD\xCC,1531911202047.2cb977a87bc6bdf90ef7fc71320d7b50.', STARTKEY => 'z\xAA;\xC7M\x1Bf8\x85\xB5\x07\xD5\x9B#\xCD\xCC', ENDKEY => '{\x8D\xF2?'}, abort split to prevent data loss
  at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.checkClosedRegion(SplitTableRegionProcedure.java:151)
  at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:259)
  at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:92)
  at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:184)
  at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1472)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1240)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1760)
2018-07-25 06:46:56,934 INFO  [PEWorker-3] procedure.MasterProcedureScheduler: pid=4106, ppid=4105, state=SUCCESS; UnassignProcedure table=IntegrationTestBigLinkedList, region=2cb977a87bc6bdf90ef7fc71320d7b50, server=ve0540.halxg.cloudera.com,16020,1532501580658 checking lock on 2cb977a87bc6bdf90ef7fc71320d7b50
2018-07-25 06:46:56,934 ERROR [PEWorker-3] procedure2.ProcedureExecutor: CODE-BUG: Uncaught runtime exception for pid=4106, ppid=4105, state=SUCCESS; UnassignProcedure table=IntegrationTestBigLinkedList, region=2cb977a87bc6bdf90ef7fc71320d7b50, server=ve0540.halxg.cloudera.com,16020,1532501580658
java.lang.UnsupportedOperationException: Unhandled state REGION_TRANSITION_FINISH; there is no rollback for assignment unless we cancel the operation by dropping/disabling the table
  at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.rollback(RegionTransitionProcedure.java:412)
  at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.rollback(RegionTransitionProcedure.java:95)
  at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1372)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1328)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1197)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1760)
2018-07-25 06:46:57,088 ERROR [PEWorker-3] procedure2.ProcedureExecutor: CODE-BUG: Uncaught runtime exception for pid=4106, ppid=4105, state=SUCCESS; UnassignProcedure table=IntegrationTestBigLinkedList, region=2cb977a87bc6bdf90ef7fc71320d7b50, server=ve0540.halxg.cloudera.com,16020,1532501580658
java.lang.UnsupportedOperationException: Unhandled state REGION_TRANSITION_FINISH; there is no rollback for assignment unless we cancel the operation by dropping/disabling the table
  at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.rollback(RegionTransitionProcedure.java:412)
  at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.rollback(RegionTransitionProcedure.java:95)
  at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1372)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1328)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1197)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1760)
2018-07-25 06:46:57,196 INFO  [PEWorker-9] procedure.MasterProcedureScheduler: pid=4107, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=IntegrationTestBigLinkedList, region=2cb977a87bc6bdf90ef7fc71320d7b50, target=ve0540.halxg.cloudera.com,16020,1532501580658 checking lock on 2cb977a87bc6bdf90ef7fc71320d7b50
2018-07-25 06:46:57,760 INFO  [PEWorker-3] procedure2.ProcedureExecutor: Rolled back pid=4105, state=ROLLEDBACK, exception=java.io.IOException via master-split-regions:java.io.IOException: Recovered.edits are found in Region: {ENCODED => 2cb977a87bc6bdf90ef7fc71320d7b50, NAME => 'IntegrationTestBigLinkedList,z\xAA;\xC7M\x1Bf8\x85\xB5\x07\xD5\x9B#\xCD\xCC,1531911202047.2cb977a87bc6bdf90ef7fc71320d7b50.', STARTKEY => 'z\xAA;\xC7M\x1Bf8\x85\xB5\x07\xD5\x9B#\xCD\xCC', ENDKEY => '{\x8D\xF2?'}, abort split to prevent data loss; SplitTableRegionProcedure table=IntegrationTestBigLinkedList, parent=2cb977a87bc6bdf90ef7fc71320d7b50, daughterA=8b6804c043fe3707493f052e18aca74f, daughterB=f64f248effb5b9ef66210778d9a87fd3 exec-time=1.8490sec
{code}



> Data loss if splitting region while ServerCrashProcedure executing
> ------------------------------------------------------------------
>
>                 Key: HBASE-20893
>                 URL: https://issues.apache.org/jira/browse/HBASE-20893
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>             Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
>         Attachments: HBASE-20893.branch-2.0.001.patch, 
> HBASE-20893.branch-2.0.002.patch, HBASE-20893.branch-2.0.003.patch, 
> HBASE-20893.branch-2.0.004.patch, HBASE-20893.branch-2.0.005.patch
>
>
> Similar case as HBASE-20878.



