Thanks to Jordan for the helpful comments and suggestions on the blog post!
I'll wait another day in case anyone else has comments; otherwise, I'll
move forward with getting the post published.

Thanks All,
Josh

On Wed, Jul 28, 2021 at 12:29 PM Josh Slocum <[email protected]> wrote:

> Hi All,
>
> Per my discussion with Enrico, I wrote up a Google Doc with a draft of a
> blog post on idempotent writes in Curator 5.2.0:
>
> https://docs.google.com/document/d/1867mkraRzat3fueYm5BR2ihzma4f_zgywqDH0msQLu0/edit?usp=sharing
>
> If anyone has comments or suggestions, feel free to leave them on the
> Google Doc.
>
> Thanks,
> Josh
>
> On Tue, Jul 27, 2021 at 9:58 AM Enrico Olivelli (Jira) <[email protected]>
> wrote:
>
>>
>>     [
>> https://issues.apache.org/jira/browse/CURATOR-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388123#comment-17388123
>> ]
>>
>> Enrico Olivelli commented on CURATOR-584:
>> -----------------------------------------
>>
>> Great!
>>
>> I usually use Medium [https://medium.com/@eolivelli] to post my blogs.
>> If you want to write up a Google Doc I will be happy to review it and
>> help you get it published (or co-authored)
>>
>> You can reach me directly at my Apache address [email protected],
>> or better, we can interact on [email protected]
>> (https://curator.apache.org/mailing-lists.html).
>>
>> > Curator Client Fault Tolerance Extensions
>> > -----------------------------------------
>> >
>> >                 Key: CURATOR-584
>> >                 URL: https://issues.apache.org/jira/browse/CURATOR-584
>> >             Project: Apache Curator
>> >          Issue Type: Improvement
>> >            Reporter: Josh Slocum
>> >            Assignee: Enrico Olivelli
>> >            Priority: Minor
>> >             Fix For: 5.2.0
>> >
>> >          Time Spent: 7h 40m
>> >  Remaining Estimate: 0h
>> >
>> > TL;DR: My team at Indeed has developed ZooKeeper functionality to handle
>> stateful retrying of ConnectionLoss for write operations, and we wanted to
>> reach out to discuss if this is something the Curator team may be
>> interested in incorporating.
>> > We initially reached out to the ZooKeeper team (
>> https://issues.apache.org/jira/browse/ZOOKEEPER-3927) but were
>> redirected to Curator as the better place to contribute them. The changes
>> could be relatively easily added as additional parameters and/or extensions
>> of the existing retry behavior in Curator's write operations.
>> >
>> > Hi Curator Devs,
>> > My team uses ZooKeeper extensively as part of a distributed key-value
>> store we've built at Indeed (think HBase replacement). Due to our
>> deployment setup co-locating our database daemons with our large Hadoop
>> cluster, and the network-intensive nature of a lot of our compute jobs, we
>> were experiencing a large number of transient ConnectionLoss issues. This
>> was especially problematic on important write operations, such as the
>> creation and deletion of distributed locks/leases or updating distributed
>> state in the cluster.
>> > We saw that some existing ZooKeeper client wrappers handled retrying in
>> the presence of ConnectionLoss, but all of the ones we looked at (including
>> Curator) didn't allow for retrying writes with all of the proper state.
>> Consider the case of retrying a create. If the initial create had succeeded
>> on the server, but the client got ConnectionLoss, the client would get a
>> NodeExists exception on the retried request, even though the znode was
>> created by its own first attempt. This resulted in many issues. For the
>> distributed lock/lease example, to other nodes, it looked like the calling
>> node had been successful acquiring the "lock", and to the calling node, it
>> appeared that it was not able to acquire the "lock", which results in a
>> deadlock.
>> > Curator has parameters that can modify the behavior upon retry, but
>> those were not sufficient. For example, create() has orSetData(), and
>> delete() has guaranteed().
>> > To solve this, we implemented a set of "connection-loss tolerant
>> primitives" for the main types of write operations. They handle a
>> connection loss by retrying the operation in a loop, but upon error cases
>> in the retry, they inspect the current state to see if it matches the case
>> where a previous round that got ConnectionLoss actually succeeded.
>> > * createRetriable(String path, byte[] data)
>> > * setDataRetriable(String path, byte[] newData, int currentVersion)
>> > * deleteRetriable(String path, int currentVersion)
>> > * compareAndDeleteRetriable(String path, byte[] currentData, int
>> currentVersion)
>> > For example, createRetriable retries the create on connection loss. If
>> the retried call gets a NodeExists exception, it checks whether
>> getData(path) equals data and dataVersion == 0. If so, it assumes the
>> first create succeeded and returns success; otherwise it propagates the
>> NodeExists exception.
>> > These primitives have allowed us to program our ZooKeeper layer as if
>> ConnectionLoss isn't a transient state we have to worry about, since they
>> have essentially the same guarantees as the non-retriable functions in the
>> ZooKeeper API do (with a slight difference in semantics).
>> > Because these behaviors could be relatively easily added to Curator as
>> additional parameters to the existing mechanisms, and (to my knowledge)
>> aren't implemented anywhere else, we think it could be a useful
>> contribution to the Curator project. If this isn't something that Curator
>> is interested in incorporating, Indeed may also consider open sourcing it
>> as a standalone library.
>>
>>
>>
>> --
>> This message was sent by Atlassian Jira
>> (v8.3.4#803005)
>>
>
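
For anyone following the thread, the createRetriable check described in the
quoted issue can be sketched in Java roughly as follows. This is only an
illustration under stated assumptions, not the Indeed or Curator
implementation: the Store interface, the Node holder, the FlakyStore test
double, the two exception classes, and the maxAttempts parameter are all
hypothetical stand-ins for a real ZooKeeper client.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class CreateRetriableSketch {
    // Hypothetical stand-ins for the client's ConnectionLoss / NodeExists errors.
    static class ConnectionLossException extends Exception {}
    static class NodeExistsException extends Exception {}

    /** Data and data version of a node, as a read would report them. */
    static final class Node {
        final byte[] data;
        final int version;
        Node(byte[] data, int version) { this.data = data; this.version = version; }
    }

    /** Hypothetical minimal client interface (assumption, not a real API). */
    interface Store {
        void create(String path, byte[] data)
                throws ConnectionLossException, NodeExistsException;
        Node read(String path) throws ConnectionLossException;
    }

    static void createRetriable(Store store, String path, byte[] data, int maxAttempts)
            throws ConnectionLossException, NodeExistsException {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        boolean sawConnectionLoss = false;
        ConnectionLossException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                store.create(path, data);
                return; // create acknowledged by the server
            } catch (ConnectionLossException e) {
                sawConnectionLoss = true; // outcome unknown: retry
                last = e;
            } catch (NodeExistsException e) {
                // With no earlier connection loss this is a genuine conflict.
                if (!sawConnectionLoss) throw e;
                // Otherwise check whether the node looks like our own earlier create.
                // A connection loss during this read simply propagates in this sketch.
                Node existing = store.read(path);
                if (existing != null && existing.version == 0
                        && Arrays.equals(existing.data, data)) {
                    return; // assume the earlier attempt succeeded
                }
                throw e;
            }
        }
        throw last; // every attempt ended in connection loss
    }

    /** In-memory test double: applies the create, then reports connection loss. */
    static class FlakyStore implements Store {
        private final Map<String, Node> nodes = new HashMap<>();
        private int lossesRemaining;
        FlakyStore(int losses) { this.lossesRemaining = losses; }

        @Override
        public void create(String path, byte[] data)
                throws ConnectionLossException, NodeExistsException {
            if (nodes.containsKey(path)) throw new NodeExistsException();
            nodes.put(path, new Node(data.clone(), 0));
            if (lossesRemaining > 0) {
                lossesRemaining--;
                throw new ConnectionLossException(); // write applied, reply "lost"
            }
        }

        @Override
        public Node read(String path) { return nodes.get(path); }
    }
}
```

With FlakyStore(1), the first create is applied but reported as a connection
loss, the retry gets NodeExists, and the data/version check turns that into
success. Note that the check cannot tell this client's create apart from
another client's create with identical bytes, which is presumably part of the
"slight difference in semantics" mentioned in the issue.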
