[ 
https://issues.apache.org/jira/browse/HBASE-19216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251013#comment-16251013
 ] 

Duo Zhang commented on HBASE-19216:
-----------------------------------

A peer change is, I think, idempotent, so we can retry forever if an RS 
fails to report in. We could add a nonce to avoid useless refreshes, but an extra 
refresh will not affect correctness. So I would say that this procedure needs 
no rollback.

As for ServerCrashProcedure, we can just skip the refresh on that node, since it 
will load the new peer config when it restarts. And for 
ServerNotRunningYetException, maybe we can use the same logic as for unassigning 
a region?
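
To make the idea concrete, here is a rough sketch of the retry-without-rollback approach. It is not HBase code: SketchRegionServer, PeerRefreshSketch, refreshPeer and refreshAll are hypothetical names invented for illustration, and the RS is simulated in memory. It only shows that retrying until an RS acknowledges is safe when the refresh is idempotent, that a nonce can suppress a redundant reload, and that crashed servers can be skipped.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory stand-in for a region server; not an HBase class. */
class SketchRegionServer {
    final String name;
    private final Set<Long> seenNonces = new HashSet<>();
    private int failuresLeft;   // simulated reports that never reach the master
    int reloadCount = 0;        // how many times the peer config was really reloaded
    String peerConfig = "old";

    SketchRegionServer(String name, int transientFailures) {
        this.name = name;
        this.failuresLeft = transientFailures;
    }

    /** Returns true once the refresh is acknowledged; safe to call repeatedly. */
    boolean refreshPeer(long nonce, String newConfig) {
        if (failuresLeft > 0) {
            failuresLeft--;
            return false;       // no ack, so the caller will retry
        }
        if (seenNonces.add(nonce)) {
            // First time this nonce is seen: actually reload the config.
            peerConfig = newConfig;
            reloadCount++;
        }
        // A repeated nonce skips the reload but is still acknowledged.
        return true;
    }
}

class PeerRefreshSketch {
    /**
     * Retries every live server until it acknowledges. There is no rollback
     * path. Returns the total number of refresh calls that were made.
     */
    static int refreshAll(List<SketchRegionServer> servers, Set<String> crashed,
                          long nonce, String newConfig) {
        int calls = 0;
        for (SketchRegionServer rs : servers) {
            if (crashed.contains(rs.name)) {
                continue;       // it will load the new config when it restarts
            }
            boolean acked = false;
            while (!acked) {    // retry until acknowledged
                acked = rs.refreshPeer(nonce, newConfig);
                calls++;
            }
        }
        return calls;
    }
}
```

Once refreshAll returns, every live RS has acknowledged the change, which is the "exact time" that the watcher-based approach cannot give. Calling it again with the same nonce changes nothing, so a retry after a master restart would be harmless in this model.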

{quote}
When you need this by?
{quote}
Synchronous replication needs this for state transitions. The current 
'eventually done' semantics for changing the peer config are not enough. So, the 
earlier the better :)
Anyway, I can do it myself, but I need to confirm that my approach is 
correct.

Thanks.

> Use procedure to execute replication peer related operations
> ------------------------------------------------------------
>
>                 Key: HBASE-19216
>                 URL: https://issues.apache.org/jira/browse/HBASE-19216
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
>
> When building the basic framework for HBASE-19064, I found that 
> enabling/disabling a peer is built upon a zk watcher.
> The problem with using a watcher is that you do not know the exact time when 
> all RSes in the cluster have applied the change; it is only 'eventually done'. 
> For synchronous replication, when changing the state of a replication 
> peer, we need to know the exact time, as we can only enable read/write after 
> that time. So I think we'd better use a procedure to do this: change the flag 
> on zk, and then execute a procedure on all RSes to reload the flag from zk.
> Another benefit is that, after the change, zk will be mainly used as 
> storage, so it will be easy to implement another replication peer storage to 
> replace zk, reducing the dependency on zk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
