[ 
https://issues.apache.org/jira/browse/HBASE-19216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253478#comment-16253478
 ] 

Duo Zhang commented on HBASE-19216:
-----------------------------------

I plan to add a reportProcedureDone method in RegionServerStatusService. For 
the failing path I think the current framework works well. We can keep 
retrying while the remote procedure call cannot be sent; eventually a 
remoteCallFailed will be triggered and we can give up retrying.

But for the normal path, I have the overall picture but some details are still 
hazy. I plan to add a procedureId to the request, and the RS will 
report back the procedureId when done. We can look up the procedure with this 
procedureId, but then I'm a little confused. How can I wake up a suspended 
procedure? There seems to be a ProcedureEvent, but how is it generated, and 
how can I get it when I only have a procedureId? Do I need to create one myself 
when suspending the procedure and store it in the procedure, so that I can get 
it through the procedureId?
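
To make the question concrete, here is a minimal sketch of the pattern I have in mind, written in plain Java rather than against the procedure v2 API. Everything in it is hypothetical: SuspendedProcedureRegistry is an invented class, a CompletableFuture stands in for whatever event object the framework really uses, and reportProcedureDone here is only a local stand-in for the planned RPC. The idea is that the procedure creates its own event when it suspends and registers it under its procedureId, so the report from the RS can find it and wake it up.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these types are HBase classes.
final class SuspendedProcedureRegistry {
    // procedureId -> the event that the suspended procedure is waiting on.
    private final Map<Long, CompletableFuture<Void>> events = new ConcurrentHashMap<>();

    /** Called by a procedure right before it suspends; returns the event it waits on. */
    CompletableFuture<Void> suspend(long procId) {
        return events.computeIfAbsent(procId, id -> new CompletableFuture<>());
    }

    /**
     * Stand-in for the RS reporting back a finished procedureId.
     * Returns false if no procedure was waiting under that id.
     */
    boolean reportProcedureDone(long procId) {
        CompletableFuture<Void> event = events.remove(procId);
        if (event == null) {
            return false;
        }
        event.complete(null); // wakes up whatever was chained on the event
        return true;
    }

    /** Number of procedures that are still suspended. */
    int pending() {
        return events.size();
    }
}

final class ProcedureWakeupSketch {
    public static void main(String[] args) {
        SuspendedProcedureRegistry registry = new SuspendedProcedureRegistry();
        // The "procedure" with id 42 suspends and registers a continuation.
        CompletableFuture<Void> event = registry.suspend(42L);
        event.thenRun(() -> System.out.println("procedure 42 woken up"));
        // Later, the RS reports that procedure 42 is done.
        registry.reportProcedureDone(42L);
    }
}
```

Whether the real framework wants the event stored in the procedure itself or in a side table like this one is exactly the part I am unsure about.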

Any help is appreciated... I'm still a beginner with the procedure v2 
framework... Thanks sir [~stack].

> Use procedure to execute replication peer related operations
> ------------------------------------------------------------
>
>                 Key: HBASE-19216
>                 URL: https://issues.apache.org/jira/browse/HBASE-19216
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
>
> When building the basic framework for HBASE-19064, I found that 
> enabling/disabling a peer is built upon zk watchers.
> The problem with using a watcher is that you do not know the exact time when 
> all RSes in the cluster have applied the change; it is only 'eventually done'.
> And for synchronous replication, when changing the state of a replication 
> peer, we need to know the exact time, as we can only enable read/write after 
> that time. So I think we'd better use a procedure to do this: change the flag 
> on zk, and then execute a procedure on all RSes to reload the flag from zk.
> Another benefit is that, after the change, zk will be used mainly as 
> storage, so it will be easy to implement another replication peer storage to 
> replace zk and reduce the dependency on zk.



