[ 
https://issues.apache.org/jira/browse/CURATOR-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245512#comment-17245512
 ] 

Cam McKenzie commented on CURATOR-584:
--------------------------------------

Sounds good to me, [~jmslocum16]. If you would like to raise a PR, I would be 
happy to review it.

> Curator Client Fault Tolerance Extensions
> -----------------------------------------
>
>                 Key: CURATOR-584
>                 URL: https://issues.apache.org/jira/browse/CURATOR-584
>             Project: Apache Curator
>          Issue Type: Improvement
>            Reporter: Josh Slocum
>            Assignee: Jordan Zimmerman
>            Priority: Minor
>
> TL;DR: My team at Indeed has developed ZooKeeper functionality to handle 
> stateful retrying of ConnectionLoss for write operations, and we wanted to 
> reach out to discuss whether this is something the Curator team may be 
> interested in incorporating.
> We initially reached out to the ZooKeeper team 
> (https://issues.apache.org/jira/browse/ZOOKEEPER-3927) but were redirected to 
> Curator as the better place to contribute the changes. They could be added 
> relatively easily as additional parameters and/or extensions of the 
> existing retry behavior in Curator's write operations.
>  
> Hi Curator Devs,
> My team uses ZooKeeper extensively as part of a distributed key-value store 
> we've built at Indeed (think HBase replacement). Because our deployment 
> co-locates our database daemons with our large Hadoop cluster, and many of 
> our compute jobs are network-intensive, we were experiencing a large number 
> of transient ConnectionLoss issues. This was especially problematic for 
> important write operations, such as creating or deleting distributed 
> locks/leases or updating distributed state in the cluster. 
> We saw that some existing ZooKeeper client wrappers handled retrying in the 
> presence of ConnectionLoss, but none of the ones we looked at (including 
> Curator) allowed for retrying writes with all of the proper state. 
> Consider the case of retrying a create. If the initial create had succeeded 
> on the server, but the client got ConnectionLoss, the client would get a 
> NodeExists exception on the retried request, even though the znode was 
> created by its own first attempt. This resulted in many issues. In the 
> distributed lock/lease example, to other nodes it looked like the calling 
> node had successfully acquired the "lock", while to the calling node it 
> appeared that it had failed to acquire the "lock", which results in a 
> deadlock.
> Curator has parameters that can modify the behavior upon retry, but those 
> were not sufficient. For example, create() has orSetData(), and delete() has 
> guaranteed().
> To solve this, we implemented a set of "connection-loss tolerant primitives" 
> for the main types of write operations. They handle a connection loss by 
> retrying the operation in a loop and, when a retry hits an error, inspect 
> the current state to see whether it matches the case where a previous 
> attempt that got ConnectionLoss actually succeeded.
> * createRetriable(String path, byte[] data)
> * setDataRetriable(String path, byte[] newData, int currentVersion)
> * deleteRetriable(String path, int currentVersion)
> * compareAndDeleteRetriable(String path, byte[] currentData, int 
> currentVersion)
> For example, createRetriable retries the create on connection loss. If the 
> retried call gets a NodeExists exception, it checks whether 
> (getData(path) == data and dataVersion == 0). If so, it assumes the 
> first create succeeded and returns success; otherwise it propagates the 
> NodeExists exception.
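
The createRetriable behavior described above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
Indeed implementation and not the ZooKeeper or Curator API: ZnodeStore,
Znode, FlakyStore, RetriableWrites, the maxAttempts parameter, and the two
local exception classes are all hypothetical names invented for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for ZooKeeper's ConnectionLoss and NodeExists errors.
class ConnectionLossException extends Exception {}
class NodeExistsException extends Exception {}

// Hypothetical snapshot of a znode: its data plus its data version.
final class Znode {
    final byte[] data;
    final int version;
    Znode(byte[] data, int version) { this.data = data; this.version = version; }
}

// Hypothetical minimal store interface; NOT the real ZooKeeper or Curator API.
interface ZnodeStore {
    void create(String path, byte[] data) throws ConnectionLossException, NodeExistsException;
    Znode read(String path) throws ConnectionLossException; // null if the path is absent
}

final class RetriableWrites {
    // Retries create on connection loss; on NodeExists after a lost attempt,
    // checks whether the existing znode looks like our own earlier create.
    static void createRetriable(ZnodeStore store, String path, byte[] data, int maxAttempts)
            throws ConnectionLossException, NodeExistsException {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        ConnectionLossException lastLoss = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                store.create(path, data);
                return; // acknowledged success
            } catch (ConnectionLossException e) {
                lastLoss = e; // outcome unknown: the create may or may not have been applied
            } catch (NodeExistsException e) {
                if (lastLoss == null) throw e; // no earlier attempt of ours could have created it
                try {
                    Znode current = store.read(path);
                    boolean looksLikeOurs = current != null
                            && current.version == 0
                            && Arrays.equals(current.data, data);
                    if (looksLikeOurs) return; // assume the lost attempt succeeded
                    throw e;                   // someone else's node: propagate NodeExists
                } catch (ConnectionLossException readLoss) {
                    lastLoss = readLoss; // could not verify; try again
                }
            }
        }
        throw lastLoss; // attempts exhausted without a definite answer
    }
}

// In-memory fake that can apply a create and then "lose" the reply.
final class FlakyStore implements ZnodeStore {
    private final Map<String, Znode> nodes = new HashMap<>();
    boolean loseNextCreateReply = false;

    public void create(String path, byte[] data) throws ConnectionLossException, NodeExistsException {
        boolean lose = loseNextCreateReply;
        loseNextCreateReply = false;
        boolean existed = nodes.containsKey(path);
        if (!existed) nodes.put(path, new Znode(data.clone(), 0));
        if (lose) throw new ConnectionLossException(); // server acted, reply never arrived
        if (existed) throw new NodeExistsException();
    }

    public Znode read(String path) { return nodes.get(path); }
}
```

With FlakyStore, setting loseNextCreateReply = true before calling
RetriableWrites.createRetriable(store, "/lock", data, 3) reproduces the
scenario from the description: the first attempt is applied but reported as
a connection loss, the retry sees NodeExists, and the data/version check
turns that into success. Note that this check cannot tell apart a znode that
another client created with identical data, which may be one instance of the
"slight difference in semantics" mentioned in the description.
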
> These primitives have allowed us to program our ZooKeeper layer as if 
> ConnectionLoss isn't a transient state we have to worry about, since they 
> have essentially the same guarantees as the non-retriable functions in the 
> ZooKeeper API do (with a slight difference in semantics).
> Because these behaviors could be relatively easily added to Curator as 
> additional parameters to the existing mechanisms, and (to my knowledge) 
> aren't implemented anywhere else, we think it could be a useful contribution 
> to the Curator project. If this isn't something that Curator is interested in 
> incorporating, Indeed may also consider open sourcing it as a standalone 
> library.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
