[ 
https://issues.apache.org/jira/browse/HBASE-11659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085050#comment-14085050
 ] 

Virag Kothari commented on HBASE-11659:
---------------------------------------

I don't see any exception on the master when the first OPEN call timed out. There 
was only a socket timeout on the regionserver.
I saw a similar issue during CLOSE, where the master had already moved the region 
state to OFFLINE but the call had timed out on the region server. On the next retry, 
the master complained that the region is not pending close, but it didn't lead to 
any inconsistency on the region server as the region was already closed.
bq. We need to make sure the region is OPEN on the right server with the right 
open seq number to make sure it is a retry.

Do we need to query meta to fetch the seq number?
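
For illustration only, the retry check quoted above could be sketched roughly as follows. This is a minimal, hypothetical model and not the actual AssignmentManager code: the class RegionStateSketch, its methods, and the idea of keeping the open seq number in the master's memory are all assumptions made for the example. Whether the real master has the seq number available without reading meta is exactly the open question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these names are illustrative and are not HBase's real classes.
public class RegionStateSketch {
    enum State { PENDING_OPEN, OPEN, FAILED_OPEN }

    // What the master remembers about one region in this sketch.
    static final class Entry {
        State state;
        final String server;
        long openSeqNum = -1L; // assumption: kept in memory once the region is OPEN

        Entry(State state, String server) {
            this.state = state;
            this.server = server;
        }
    }

    private final Map<String, Entry> regions = new HashMap<>();

    // Master asks a server to open the region.
    void assign(String region, String server) {
        regions.put(region, new Entry(State.PENDING_OPEN, server));
    }

    // Returns true when the OPEN report is accepted: either the first delivery,
    // or a retry from the same server carrying the same open seq number.
    boolean reportOpened(String region, String server, long openSeqNum) {
        Entry e = regions.get(region);
        if (e == null) {
            return false;
        }
        if (e.state == State.PENDING_OPEN && e.server.equals(server)) {
            e.state = State.OPEN;
            e.openSeqNum = openSeqNum;
            return true;
        }
        // Retry case: state is left untouched, the call is simply acknowledged.
        return e.state == State.OPEN
            && e.server.equals(server)
            && e.openSeqNum == openSeqNum;
    }

    State stateOf(String region) {
        Entry e = regions.get(region);
        return e == null ? null : e.state;
    }
}
```

With a check like this, a duplicate OPEN from the same server is acknowledged instead of rejected, while an OPEN from a different server or with a different seq number is still refused.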






> Region state RPC call is not idempotent
> ---------------------------------------
>
>                 Key: HBASE-11659
>                 URL: https://issues.apache.org/jira/browse/HBASE-11659
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Virag Kothari
>            Assignee: Virag Kothari
>         Attachments: HBASE-11659.patch
>
>
> Here is the scenario on 0.98 with zk-less assignment:
> The master gets an OPEN RPC call from the region server.
> So, it moves the region state from PENDING_OPEN to OPEN.
> But the call times out on the region server, and the region server retries 
> sending the OPEN call. However, the master now throws an exception saying the 
> region is not PENDING_OPEN. So, the region server aborts the region on 
> receiving that exception and sends FAILED_OPEN to the master. But the master 
> cannot change its state from OPEN to FAILED_OPEN, so eventually the master 
> keeps the state as OPEN while the actual region is no longer open on the 
> region server.
> The master should not throw an exception on receiving OPEN RPC calls multiple 
> times.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
