[ 
https://issues.apache.org/jira/browse/HBASE-20642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497023#comment-16497023
 ] 

Ankit Singhal commented on HBASE-20642:
---------------------------------------

{quote} This is a general problem with the synchronous calls; they are not 
built to migrate across server failure. Currently there is no connection 
between running procedure and client invocation other than the stalled call. 
Perhaps we could build some sort of tether but the thinking was that we'd move 
off these old-style (deprecated) synchronous calls to instead use async where 
we do have a connection between the invocation and the running procedure via 
the returned future.
{quote}
The current implementation of synchronous calls in HBase simulates the way the 
client will handle the async calls, i.e. waiting for the future to return the 
results. So, except for the Modify and Truncate table procedures, the current 
mechanism is good: we submit the procedure and then check periodically for its 
completion in separate calls, which can handle the migration of the master as 
well.
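
To illustrate the submit-then-poll pattern described above, here is a minimal 
sketch. MasterClient, submit, getResult and submitAndWait are hypothetical 
names made up for this example, not the actual HBase client API; the sketch 
only assumes that the procedure id stays valid across a master failover.

```java
import java.io.IOException;
import java.util.Optional;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ProcedurePoller {
    /** Hypothetical stand-in for the master RPC stub; not an HBase interface. */
    interface MasterClient {
        /** Submits the request and returns the procedure id. */
        long submit(String request) throws IOException;

        /** Returns the result if the procedure has finished, empty otherwise. */
        Optional<String> getResult(long procId) throws IOException;
    }

    /** Submits once, then polls for completion in separate calls. */
    static String submitAndWait(MasterClient master, String request,
                                long timeoutMs, long pollMs)
            throws IOException, InterruptedException, TimeoutException {
        // The procedure is submitted exactly once; only the polling is retried.
        long procId = master.submit(request);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (System.nanoTime() - deadline < 0) {
            try {
                Optional<String> result = master.getResult(procId);
                if (result.isPresent()) {
                    return result.get();
                }
            } catch (IOException e) {
                // Assumption: a failing master surfaces as an IOException.
                // Each poll is an independent call, so the next attempt can
                // be answered by the new active master.
            }
            Thread.sleep(pollMs);
        }
        throw new TimeoutException(
            "procedure " + procId + " did not complete within " + timeoutMs + " ms");
    }
}
```

Because the wait is a loop of short, independent calls rather than one 
stalled call, a poll that fails during failover does not lose track of the 
procedure.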
{quote} Not to complete. The latch covers setup of the procedure only (A quote 
from HBASE-19953 suggests doc to make it clear that "....the latch is just-for 
the Procedure preparation – that we are not blocking for the whole procedure 
run...")
{quote}
Yes, but only in the case of the Modify and Truncate table procedures, the 
latch is released at the end of the procedure. Raised HBASE-20658 for that.
{quote}Yeah, this is a problem (why have nonce's if client is doing this...). 
Does this break your suggested solution here? Or rather, it needs client 
changes too?
{quote}
We may not need client change if we fix
{quote}Re-reading the description, how would ensuring nonce-respect help? We'll 
not resubmit the procedure but neither will we recognize its successful 
completion since it happens on the new master, not the old.
{quote}
In the case of synchronous calls as well, we check for procedure completion by 
periodically requesting the procedure results from the server, so the call will 
get to know if a new master has completed the procedure.
The procedure is getting resubmitted in the case of the Modify and Truncate 
table procedures because of HBASE-20658.

Just to summarize: 
if we fix HBASE-20658 by releasing the latch after some pre-checks for Modify 
and Truncate table, then we probably do not need the nonce check, as the retry 
mechanism will not kick in if the procedure is submitted successfully.
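
A rough sketch of the idea, releasing the latch once the pre-checks are done 
instead of at the end of the procedure. The class and field names here 
(PrepareLatchSketch, Procedure, allowFinish) are hypothetical and only for 
illustration; this is not the HBase procedure code or the attached patch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class PrepareLatchSketch {
    /** Hypothetical procedure: cheap pre-checks followed by long-running work. */
    static final class Procedure implements Runnable {
        final CountDownLatch prepareLatch = new CountDownLatch(1);
        // Stands in for the long-running part of the procedure.
        final CountDownLatch allowFinish = new CountDownLatch(1);
        volatile RuntimeException prepareError;
        volatile boolean done;
        private final boolean tableExists;

        Procedure(boolean tableExists) {
            this.tableExists = tableExists;
        }

        @Override
        public void run() {
            try {
                // Pre-check: the only part the submitting call blocks on.
                if (!tableExists) {
                    prepareError = new IllegalStateException("table not found");
                    return;
                }
            } finally {
                // Release the latch after the pre-checks, not at the end.
                prepareLatch.countDown();
            }
            try {
                allowFinish.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            done = true;
        }
    }

    /** Returns once the pre-checks pass; the procedure keeps running. */
    static Procedure submit(boolean tableExists)
            throws InterruptedException, TimeoutException {
        Procedure proc = new Procedure(tableExists);
        Thread worker = new Thread(proc, "procedure-worker");
        worker.setDaemon(true);
        worker.start();
        if (!proc.prepareLatch.await(5, TimeUnit.SECONDS)) {
            throw new TimeoutException("pre-checks did not finish in time");
        }
        if (proc.prepareError != null) {
            throw proc.prepareError;
        }
        return proc;
    }
}
```

With this shape, the submitting call returns as soon as the procedure is 
accepted, so a master failure during the long-running part no longer fails 
the submit call itself and the client has no reason to resubmit.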

Thanks [~stack]. What do you say about HBASE-20658?

> IntegrationTestDDLMasterFailover throws 'InvalidFamilyOperationException 
> -------------------------------------------------------------------------
>
>                 Key: HBASE-20642
>                 URL: https://issues.apache.org/jira/browse/HBASE-20642
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>            Priority: Major
>         Attachments: HBASE-20642.patch
>
>
> [~romil.choksi] reported that IntegrationTestDDLMasterFailover is failing 
> when adding a column family while the master is restarting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
