[ 
https://issues.apache.org/jira/browse/HBASE-20657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Soldatov updated HBASE-20657:
------------------------------------
    Summary: Retrying RPC call for ModifyTableProcedure may get stuck  (was: 
Retrying RPC call for ModifyTableProcedure may stuck)

> Retrying RPC call for ModifyTableProcedure may get stuck
> --------------------------------------------------------
>
>                 Key: HBASE-20657
>                 URL: https://issues.apache.org/jira/browse/HBASE-20657
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, proc-v2
>    Affects Versions: 2.0.0
>            Reporter: Sergey Soldatov
>            Priority: Major
>
> Env: 2 masters, 1 RS. 
> Steps to reproduce: Kill the active master while a ModifyTableProcedure is 
> executing. 
> If the table has enough regions, some of them may still be closed when the 
> secondary master becomes active. Once the client retries the call against 
> the new active master, a new ModifyTableProcedure is created and gets stuck 
> while handling the MODIFY_TABLE_REOPEN_ALL_REGIONS state. That happens because:
> 1. When retrying from the client side, we call modifyTableAsync, which 
> creates a procedure with a new nonce key:
> {noformat}
> ModifyTableRequest request = RequestConverter.buildModifyTableRequest(
>     td.getTableName(), td, ng.getNonceGroup(), ng.newNonce());
> {noformat}
>  So on the server side, the retry is treated as a new procedure and starts 
> executing immediately.
> 2. While processing MODIFY_TABLE_REOPEN_ALL_REGIONS, we create a 
> MoveRegionProcedure for each region. Each one checks whether its region is 
> online; since it is not, it fails immediately, forcing the procedure to 
> restart.
> [~an...@apache.org] saw a similar case in which two concurrent ModifyTable 
> procedures were running and got stuck in a similar way. 
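
To illustrate point 1, here is a minimal sketch of why a fresh nonce on retry defeats deduplication. It assumes the server keys submitted procedures by the (nonce group, nonce) pair. NonceRegistrySketch, NonceKey, submit and procedureCount are hypothetical names made up for this example and are not HBase classes or methods.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for server-side nonce tracking; not HBase's real implementation.
final class NonceRegistrySketch {

    // (nonce group, nonce) pair identifying one logical client operation.
    static final class NonceKey {
        final long group;
        final long nonce;

        NonceKey(long group, long nonce) {
            this.group = group;
            this.nonce = nonce;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof NonceKey)) {
                return false;
            }
            NonceKey other = (NonceKey) o;
            return group == other.group && nonce == other.nonce;
        }

        @Override
        public int hashCode() {
            return Objects.hash(group, nonce);
        }
    }

    private final Map<NonceKey, Long> procIdByNonce = new HashMap<>();
    private long nextProcId = 1;

    // Returns the existing procedure id when the nonce key is already known,
    // otherwise registers a new procedure and returns its id.
    long submit(long nonceGroup, long nonce) {
        return procIdByNonce.computeIfAbsent(
            new NonceKey(nonceGroup, nonce), k -> nextProcId++);
    }

    // Number of distinct procedures registered so far.
    int procedureCount() {
        return procIdByNonce.size();
    }
}
```

In this sketch, a retry that reuses the original nonce gets back the id of the procedure that is already registered, while a retry that calls the equivalent of ng.newNonce() registers a second procedure for the same logical request, which is the situation described above.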



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
