[ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559392#comment-16559392
 ] 

Allan Yang edited comment on HBASE-20893 at 7/27/18 7:50 AM:
-------------------------------------------------------------

{quote}Split does not support rollback at the 
SPLIT_TABLE_REGIONS_CHECK_CLOSED_REGIONS stage
{quote}
[~stack], Split does support rollback at the 
SPLIT_TABLE_REGIONS_CHECK_CLOSED_REGIONS stage, as you can see from the code:
{code:java}
case SPLIT_TABLE_REGIONS_CHECK_CLOSED_REGIONS:
          // Doing nothing, in SPLIT_TABLE_REGION_CLOSE_PARENT_REGION,
          // we will bring parent region online
          break;
{code}
Actually, it does nothing here: when a state machine procedure is rolling 
back, it rolls back every state in reverse order, so when the rollback 
process reaches the SPLIT_TABLE_REGION_CLOSE_PARENT_REGION state, it will open 
the parent region.
But some procedures do not support rollback, e.g. the Assign/Unassign 
procedures. They throw an UnsupportedOperationException, which is totally OK. 
(We do not expect the region to close or open again after rollback.)
Maybe we shouldn't throw an exception if we do not support rollback, and just 
log a message there instead. Otherwise, the ProcedureExecutor will print a 
"CODE-BUG" message, which is pretty scary.
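
To illustrate the "log instead of throw" idea, here is a minimal standalone sketch. It is not HBase code: the Step interface, ThrowingStep, LoggingStep and rollsBackCleanly are hypothetical names made up for this example, and the catch block only stands in for whatever the executor does with an unexpected runtime exception.
{code:java}
import java.util.logging.Logger;

public class NoRollbackSketch {
  private static final Logger LOG = Logger.getLogger(NoRollbackSketch.class.getName());

  /** Hypothetical stand-in for a procedure; not the HBase Procedure API. */
  interface Step {
    void rollback();
  }

  /** Behaviour described above: refuse rollback by throwing. */
  static final class ThrowingStep implements Step {
    @Override
    public void rollback() {
      throw new UnsupportedOperationException("rollback not supported");
    }
  }

  /** Suggested behaviour: log that there is nothing to undo and return normally. */
  static final class LoggingStep implements Step {
    @Override
    public void rollback() {
      LOG.info("Rollback is not supported for this step; nothing to undo");
    }
  }

  /** Returns true when the rollback call returns without throwing. */
  static boolean rollsBackCleanly(Step step) {
    try {
      step.rollback();
      return true;
    } catch (RuntimeException e) {
      // Assumption: a real executor would report this as an unexpected failure.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(rollsBackCleanly(new ThrowingStep())); // false
    System.out.println(rollsBackCleanly(new LoggingStep()));  // true
  }
}
{code}
With the logging variant the caller never sees an exception, so there would be nothing for it to flag as a code bug.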

But there is truly a bug here:
{code:java}
  @Override
  protected void rollback(final TEnvironment env)
      throws IOException, InterruptedException {
    if (isEofState()) stateCount--;
    try {
      updateTimestamp();
      rollbackState(env, getCurrentState());
      stateCount--;
    } finally {
      updateTimestamp();
    }
  }
{code}
We need to decrease stateCount when rolling back so that we can roll back 
the previous state correctly. But since an exception is thrown, the decrement 
of stateCount never happens. So the ProcedureExecutor will keep rolling back 
only one state (the one that throws the exception) until it reaches the end of 
the execution stack.
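
The effect can be reproduced with a small standalone simulation. This is only a sketch under stated assumptions, not the HBase implementation: Machine, rollbackAll and the decrementInFinally flag are hypothetical, the isEofState() and timestamp handling are left out, and the loop assumes the executor calls rollback() once per executed step and carries on after an exception, as described above. Moving the decrement into the finally block is shown as one possible remedy, not necessarily the one in the attached patches.
{code:java}
import java.util.ArrayList;
import java.util.List;

public class RollbackCounterSketch {
  /** Hypothetical stand-in for a state machine procedure; not HBase code. */
  static final class Machine {
    final int[] states = {0, 1, 2};      // states in the order they were executed
    int stateCount = states.length;      // number of states executed so far
    final int failingState;              // state whose rollback throws
    final boolean decrementInFinally;    // false: original layout, true: possible fix
    final List<Integer> attempts = new ArrayList<>();

    Machine(int failingState, boolean decrementInFinally) {
      this.failingState = failingState;
      this.decrementInFinally = decrementInFinally;
    }

    int getCurrentState() {
      return states[stateCount - 1];
    }

    void rollbackState(int state) {
      attempts.add(state);
      if (state == failingState) {
        throw new UnsupportedOperationException("unhandled state=" + state);
      }
    }

    void rollback() {
      if (decrementInFinally) {
        try {
          rollbackState(getCurrentState());
        } finally {
          stateCount--;                  // runs even when rollbackState throws
        }
      } else {
        rollbackState(getCurrentState());
        stateCount--;                    // skipped when rollbackState throws
      }
    }
  }

  /** Calls rollback() once per executed step and returns the states that were attempted. */
  static List<Integer> rollbackAll(int failingState, boolean decrementInFinally) {
    Machine m = new Machine(failingState, decrementInFinally);
    int steps = m.stateCount;
    for (int i = 0; i < steps; i++) {
      try {
        m.rollback();
      } catch (UnsupportedOperationException e) {
        // Assumption: the executor logs the failure and moves to the next step.
      }
    }
    return m.attempts;
  }

  public static void main(String[] args) {
    System.out.println("original: " + rollbackAll(2, false)); // original: [2, 2, 2]
    System.out.println("finally:  " + rollbackAll(2, true));  // finally:  [2, 1, 0]
  }
}
{code}
In the original layout the failing state is retried on every step and the earlier states are never rolled back; with the decrement in the finally block each state is visited once, in reverse order.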


> Data loss if splitting region while ServerCrashProcedure executing
> ------------------------------------------------------------------
>
>                 Key: HBASE-20893
>                 URL: https://issues.apache.org/jira/browse/HBASE-20893
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>             Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
>         Attachments: HBASE-20893-branch-2.0.addendum.patch, 
> HBASE-20893.branch-2.0.001.patch, HBASE-20893.branch-2.0.002.patch, 
> HBASE-20893.branch-2.0.003.patch, HBASE-20893.branch-2.0.004.patch, 
> HBASE-20893.branch-2.0.005.patch
>
>
> Similar case as HBASE-20878.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)