Sergey Soldatov created HBASE-20657:
---------------------------------------

             Summary: Retrying RPC call for ModifyTableProcedure may stuck
                 Key: HBASE-20657
                 URL: https://issues.apache.org/jira/browse/HBASE-20657
             Project: HBase
          Issue Type: Bug
          Components: Client, proc-v2
    Affects Versions: 2.0.0
            Reporter: Sergey Soldatov


Env: 2 masters, 1 RS. 
Steps to reproduce: Kill the active master while a ModifyTableProcedure is 
executing. 
If the table has enough regions, some of them may still be closed when the 
secondary master becomes active. When the client then retries the call against 
the new active master, a new ModifyTableProcedure is created and gets stuck 
while handling the MODIFY_TABLE_REOPEN_ALL_REGIONS state. That happens because:
1. When retrying from the client side, we call modifyTableAsync, which creates 
a procedure with a new nonce key:
{noformat}
         ModifyTableRequest request = RequestConverter.buildModifyTableRequest(
            td.getTableName(), td, ng.getNonceGroup(), ng.newNonce());
{noformat}
 So on the server side it is treated as a new procedure and starts executing 
immediately.
2. When processing MODIFY_TABLE_REOPEN_ALL_REGIONS, we create a 
MoveRegionProcedure for each region. Each one checks whether its region is 
online (and it's not), so it fails immediately, forcing the procedure to restart.
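
To illustrate point 1, here is a minimal toy model of nonce-based deduplication. It is only a sketch: ToyMaster, submitModify and procedureCount are hypothetical names made up for this example, not HBase classes or methods, and it does not model master failover or how nonce state is recovered by the new active master. It only shows why a retry that carries a fresh nonce looks like a brand-new procedure, while a retry that reuses the original nonce key could be matched to the existing one.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these names exist in HBase.
class NonceRetrySketch {

    /** Hypothetical stand-in for a master that deduplicates by (nonceGroup, nonce). */
    static class ToyMaster {
        private final Map<String, Long> procIdByNonceKey = new HashMap<>();
        private long nextProcId = 1;

        /** Returns the existing procedure id for a known nonce key, else registers a new one. */
        long submitModify(long nonceGroup, long nonce) {
            String key = nonceGroup + ":" + nonce;
            Long existing = procIdByNonceKey.get(key);
            if (existing != null) {
                return existing; // same nonce key: treated as a retry of the same procedure
            }
            long procId = nextProcId++;
            procIdByNonceKey.put(key, procId); // unseen nonce key: a new procedure is created
            return procId;
        }

        int procedureCount() {
            return procIdByNonceKey.size();
        }
    }

    public static void main(String[] args) {
        ToyMaster master = new ToyMaster();
        long nonceGroup = 42L;

        long first = master.submitModify(nonceGroup, 1001L);
        // Retry with a fresh nonce, analogous to calling ng.newNonce() again.
        long retryNewNonce = master.submitModify(nonceGroup, 1002L);
        // Retry that reuses the original nonce.
        long retrySameNonce = master.submitModify(nonceGroup, 1001L);

        System.out.println("first=" + first
            + " retryNewNonce=" + retryNewNonce
            + " retrySameNonce=" + retrySameNonce);
        System.out.println("procedures=" + master.procedureCount());
    }
}
```

In this toy model the retry with a new nonce gets its own procedure id, so two modify procedures exist for the same request, which is the shape of the situation described above.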

[[email protected]] saw a similar case in which two concurrent ModifyTable 
procedures were running and got stuck in a similar way. 





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
