If the client isn't sure that the delete has gone through, just do it again once reconnected (to server 2 in the scenario described). Whatever response you get for the delete should determine what you need to do.
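
A minimal sketch of that retry, assuming a client whose delete raises a "no node" error when the path is already gone (the C client reports this as ZNONODE). `NoNodeError`, `FakeClient`, and `ensure_deleted` are hypothetical names used only for illustration; `FakeClient` is an in-memory stand-in, not a real ZooKeeper client.

```python
class NoNodeError(Exception):
    """Hypothetical stand-in for a 'node does not exist' error (ZNONODE in the C client)."""


class FakeClient:
    """In-memory stand-in for a reconnected session; NOT a real ZooKeeper client."""

    def __init__(self, paths):
        self.paths = set(paths)

    def delete(self, path):
        # Mirror the two outcomes that matter here: success, or "no node".
        if path not in self.paths:
            raise NoNodeError(path)
        self.paths.remove(path)


def ensure_deleted(client, path):
    """Re-issue a delete whose outcome is unknown and report what the response says."""
    try:
        client.delete(path)
    except NoNodeError:
        # The znode is already gone: the earlier delete (or some other client's) took effect.
        return "already_gone"
    # The delete succeeded just now, so the znode still existed when the retry was applied.
    return "deleted_now"


if __name__ == "__main__":
    print(ensure_deleted(FakeClient({"/path"}), "/path"))
    print(ensure_deleted(FakeClient(set()), "/path"))
```

Either way the znode is gone once the call returns. Note that "already_gone" alone cannot tell you whether it was your own earlier request or another client that removed the path.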
-Flavio

> On 04 Aug 2015, at 22:11, Alexander Shraer <[email protected]> wrote:
>
> maybe 1 or 2 synctime is enough given what you said about syncs - after 1
> synctime we know that either server1 disconnected (and will have to
> bootstrap its state from the leader if it ever reconnects) or the request
> got to the leader. But since synctime may not be measured exactly from our
> request submission it may be that 2 synctime are needed. Would need to look
> deeper into pings and synctime to tell for sure.
>
> On Tue, Aug 4, 2015 at 2:05 PM, Camille Fournier <[email protected]> wrote:
>
>> That's true. I spent some time trying to think about when and how that
>> would be possible, and didn't get very far. We have guarantees about how
>> far out of sync a quorum member can be before it's booted, so I would think
>> that there's some way to timebound this potentially to prevent it, a la
>> your suggestion about 3X synctime.
>>
>> C
>>
>> On Tue, Aug 4, 2015 at 4:58 PM, Alexander Shraer <[email protected]> wrote:
>>
>>> Yes, I checked and you're right. It gets queued at the leader until all
>>> previously proposed requests at the leader are committed. But still, if
>>> the request is only on its way between server 1 and the leader, sync
>>> won't immediately help, right?
>>>
>>> On Tue, Aug 4, 2015 at 11:39 AM, Camille Fournier <[email protected]> wrote:
>>>
>>>> I thought that sync forced a flush of the queued events on a quorum
>>>> member before completing/got it in the path of events from the leader,
>>>> so that it won't return until all of the pending leader events before it
>>>> have been seen by this quorum member. Is that not correct?
>>>>
>>>> On Tue, Aug 4, 2015 at 2:20 PM, Alexander Shraer <[email protected]> wrote:
>>>>
>>>>> It seems that since the delete may be in-flight (between server 1 and
>>>>> leader, or still being proposed by the leader) when the client connects
>>>>> to server 2, doing a sync right away may not help since the operation
>>>>> hasn't been committed yet. Perhaps the client should wait some multiple
>>>>> of synclimit time (3x?) before invoking the sync to allow the delete to
>>>>> commit or disappear for sure. This is all related to
>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-22, which is still open
>>>>> unfortunately...
>>>>>
>>>>> On Tue, Aug 4, 2015 at 10:15 AM, Camille Fournier <[email protected]> wrote:
>>>>>
>>>>>> True, I'm not sure when the xid increments. If that is the case, you
>>>>>> can force a sync before the read of the path, to prevent reading stale
>>>>>> data. So that would be the solve for that edge case although it's an
>>>>>> expensive solve.
>>>>>>
>>>>>> C
>>>>>>
>>>>>> On Tue, Aug 4, 2015 at 12:52 PM, Alexander Shraer <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Camille,
>>>>>>>
>>>>>>> if the client received a response for the delete then sure it
>>>>>>> shouldn't be able to connect to servers that didn't see it. But if it
>>>>>>> disconnected before seeing the response the example seems possible to
>>>>>>> me. I haven't checked the code to see when exactly the transaction
>>>>>>> number is incremented at the client, so I may be wrong, but suppose
>>>>>>> for example that zkserver-1 crashes before sending the delete request
>>>>>>> to the leader. Then, the request is gone forever. If you don't let
>>>>>>> the client connect to another server that hasn't seen the delete, the
>>>>>>> client will never be able to connect.
>>>>>>> So it seems quite possible that it connects, then the request is
>>>>>>> executed (if zkserver-1 hasn't crashed after all) and the znode
>>>>>>> disappears.
>>>>>>>
>>>>>>> Alex
>>>>>>>
>>>>>>> On Tue, Aug 4, 2015 at 8:33 AM, Camille Fournier <[email protected]> wrote:
>>>>>>>
>>>>>>>> ZooKeeper provides a session-coherent single system image guarantee.
>>>>>>>> Any request from the same session will see the results of all of its
>>>>>>>> writes, regardless of which server it connects to. See:
>>>>>>>>
>>>>>>>> http://zookeeper.apache.org/doc/r3.4.6/zookeeperProgrammers.html#ch_zkGuarantees
>>>>>>>>
>>>>>>>> So, if your session deletes, and the delete is successfully processed
>>>>>>>> by the quorum, you will not see the path that you have deleted no
>>>>>>>> matter what server your session connects to. I believe in practice
>>>>>>>> that this means that the ZK servers that might be behind your session
>>>>>>>> (say server 2 is lagging behind a few commits) will refuse to allow
>>>>>>>> your session to connect to it, so that you will not see stale data.
>>>>>>>>
>>>>>>>> This means that the example Lokesh gave:
>>>>>>>>
>>>>>>>> "1. Quorum leader has forwarded request to zkserver-2 for "delete
>>>>>>>> /path".
>>>>>>>> 2. If your client connects to "zkserver-2" after step 1 is executed
>>>>>>>> (get /path). Then your "/path" will not be available.
>>>>>>>> 3. If your client connects to "zkserver-2" before step1 is executed
>>>>>>>> (get /path) then your "/path" would be available and after some time
>>>>>>>> your path would not be available (after zkserver-2 is synched with
>>>>>>>> the leader)"
>>>>>>>>
>>>>>>>> Cannot happen, so long as you are in the same session.
>>>>>>>>
>>>>>>>> C
>>>>>>>>
>>>>>>>> On Tue, Aug 4, 2015 at 6:49 AM, Lokesh Shrivastava <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I think it depends on whether your request reaches zkserver-1 and
>>>>>>>>> whether it is able to send the request to quorum leader. Considering
>>>>>>>>> that "delete /path" request has reached the quorum leader then
>>>>>>>>> following may happen
>>>>>>>>>
>>>>>>>>> 1. Quorum leader has forwarded request to zkserver-2 for "delete
>>>>>>>>> /path".
>>>>>>>>> 2. If your client connects to "zkserver-2" after step 1 is executed
>>>>>>>>> (get /path). Then your "/path" will not be available.
>>>>>>>>> 3. If your client connects to "zkserver-2" before step1 is executed
>>>>>>>>> (get /path) then your "/path" would be available and after some time
>>>>>>>>> your path would not be available (after zkserver-2 is synched with
>>>>>>>>> the leader)
>>>>>>>>>
>>>>>>>>> Others can correct me if this is not how it works.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>> Lokesh
>>>>>>>>>
>>>>>>>>> On 4 August 2015 at 12:09, [email protected] <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> I'm thinking about a program design with libzookeeper, here are my
>>>>>>>>>> doubts:
>>>>>>>>>>
>>>>>>>>>> 1) first, I connect to zkserver-1, and there exists the path
>>>>>>>>>> "/path".
>>>>>>>>>> 2) I send "delete /path"; the request reaches zkserver-1 (or may
>>>>>>>>>> not, I don't know about that), I don't know whether it took effect,
>>>>>>>>>> and then the connection is lost before the response returns.
>>>>>>>>>> 3) I reconnect the same session to zkserver-2, and I send "get
>>>>>>>>>> /path".
>>>>>>>>>>
>>>>>>>>>> which one will the "get /path" possibly return:
>>>>>>>>>> 1, "not exists"
>>>>>>>>>> 2, "exists" and "always exists"
>>>>>>>>>> 3, "exists" and "not exists" afterwards
>>>>>>>>>>
>>>>>>>>>> my biggest problem is whether 3) will occur or not, thanks!
>>>>>>>>>>
>>>>>>>>>> [email protected]
