[ https://issues.apache.org/jira/browse/HBASE-19216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253478#comment-16253478 ]
Duo Zhang commented on HBASE-19216:
-----------------------------------

I plan to add a reportProcedureDone method to RegionServerStatusService.

For the failure path, I think the current framework already works well. We can keep retrying for as long as the remote procedure call cannot be sent; eventually a remoteCallFailed will be triggered and we can give up retrying.

For the normal path, I have the overall picture, but some details are still hazy. I plan to add a procedureId to the request, and the RS will report the procedureId back when it is done. We can look up the procedure with this procedureId, but this is where I get a little confused. How can I wake up a suspended procedure? There seems to be a ProcedureEvent, but how is it created, and how can I get hold of it when I only have a procedureId? Do I need to create one myself when suspending the procedure and store it in the procedure, so that I can reach it through the procedureId?

Any help is appreciated. I am still a beginner with the procedure v2 framework.

Thanks sir [~stack].

> Use procedure to execute replication peer related operations
> ------------------------------------------------------------
>
>                 Key: HBASE-19216
>                 URL: https://issues.apache.org/jira/browse/HBASE-19216
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
>
> When building the basic framework for HBASE-19064, I found that
> enabling/disabling a peer is built upon zk watchers.
> The problem with using a watcher is that you do not know the exact time when all
> RSes in the cluster have applied the change; it is only 'eventually done'.
> For synchronous replication, when changing the state of a replication
> peer, we need to know the exact time, as we can only enable read/write after
> that time. So I think we'd better use a procedure to do this: change the flag
> on zk, and then execute a procedure on all RSes to reload the flag from zk.
> Another benefit is that, after the change, zk will be used mainly as
> storage, so it will be easy to implement another replication peer storage to
> replace zk and thereby reduce the dependency on zk.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
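
To make the normal path in the comment concrete, here is a rough plain-Java sketch of the idea of keeping a wake-up handle keyed by procedureId. It is not the procedure v2 API and does not use ProcedureEvent: the class PendingProcedureRegistry, the suspend method, and the Runnable callback are hypothetical names made up for illustration, and only reportProcedureDone takes its name from the comment.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in, NOT the HBase procedure v2 API.
final class PendingProcedureRegistry {
    // procedureId -> callback that re-schedules the suspended procedure.
    private final Map<Long, Runnable> wakeUps = new ConcurrentHashMap<>();

    /** Register the wake-up callback at the moment the procedure suspends. */
    void suspend(long procedureId, Runnable wakeUp) {
        if (wakeUps.putIfAbsent(procedureId, wakeUp) != null) {
            throw new IllegalStateException("already suspended: " + procedureId);
        }
    }

    /**
     * Called when an RS reports the procedureId back.
     * Returns false for unknown or already-reported ids,
     * so a duplicate report does not wake the procedure twice.
     */
    boolean reportProcedureDone(long procedureId) {
        Runnable wakeUp = wakeUps.remove(procedureId);
        if (wakeUp == null) {
            return false;
        }
        wakeUp.run();
        return true;
    }

    /** Number of procedures still waiting for a report. */
    int pendingCount() {
        return wakeUps.size();
    }
}
```

In this sketch the handle is created by the caller at suspend time and stored under the procedureId, which is one possible answer to the question above. Whether the real framework expects the event to live inside the procedure itself is left open here.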