I disagree. ZooKeeper itself doesn't actually rely on timing for safety:
it won't get into an inconsistent state even if all timing assumptions fail
(except for the sync operation, which is then not guaranteed to return the
latest value, but that's a known issue that needs to be fixed).
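
To make the sequence-number (lock generation) idea from the quoted messages below concrete, here is a minimal Python sketch of a store that rejects writes from a stale lock holder. It is only an illustration under stated assumptions: `FencedStore` and its `write` method are hypothetical names, not the API of ZooKeeper, HBase, or any real data store, and the token is assumed to be some number that grows with each lock acquisition (for example, the sequence number of the lock's sequential znode).

```python
class FencedStore:
    """Hypothetical storage that remembers the highest lock generation seen."""

    def __init__(self):
        self.highest_token = -1
        self.data = {}

    def write(self, token, key, value):
        # Reject writes from a client whose lock generation is stale,
        # regardless of what that client believes about holding the lock.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.data[key] = value
        return True


store = FencedStore()
# Client A acquired the lock first (generation 1) and then stalled.
# Client B acquired it later (generation 2) and reaches the store first.
b_ok = store.write(2, "row", "from B")
a_ok = store.write(1, "row", "from A")  # A still thinks it is leader
print(b_ok, a_ok, store.data["row"])
```

The point of the sketch is that safety is enforced at the storage, by comparing generations, rather than by the clients' own timing-based belief about who holds the lock.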




On Wed, Jul 15, 2015 at 2:13 PM, Jordan Zimmerman <
jor...@jordanzimmerman.com> wrote:

> This property may hold if you make a lot of timing/synchrony assumptions
>
> These assumptions and timing are intrinsic to using ZooKeeper. So, of
> course I’m making these assumptions.
>
> -Jordan
>
>
>
> On July 15, 2015 at 3:57:12 PM, Alexander Shraer (shra...@gmail.com)
> wrote:
>
> This property may hold if you make a lot of timing/synchrony assumptions.
> Agreeing on who holds the lock in an asynchronous distributed system
> with failures is impossible; this is the FLP impossibility result.
>
> But even if it holds, this property is not very useful if the ZK client
> itself doesn't have the application data. So one has to consider whether it
> is possible that the application sees messages from two clients that both
> think they are the leader, in an order that contradicts the lock acquisition
> order.
>
> On Wed, Jul 15, 2015 at 1:26 PM, Jordan Zimmerman <
> jor...@jordanzimmerman.com> wrote:
>
>> I think we may be talking past each other here. My contention (and the
>> ZK docs agree BTW) is that, properly written and configured, "at any
>> snapshot in time no two clients think they hold the same lock". How your
>> application acts on that fact is another thing. You might need sequence
>> numbers, you might not.
>>
>> -Jordan
>>
>>
>> On July 15, 2015 at 3:15:16 PM, Alexander Shraer (shra...@gmail.com)
>> wrote:
>>
>> Jordan, as Camille suggested, please read Sec 2.4 in the Chubby paper:
>> http://static.googleusercontent.com/media/research.google.com/en//archive/chubby-osdi06.pdf
>>
>> It suggests two ways in which the storage can support lock generations and
>> proposes an alternative for the case where the storage can't be made aware
>> of lock generations.
>>
>> On Wed, Jul 15, 2015 at 1:08 PM, Jordan Zimmerman <
>> jor...@jordanzimmerman.com> wrote:
>>
>> > Ivan, I just read the blog and I still don’t see how this can happen.
>> > Sorry if I’m being dense. I’d appreciate a discussion on this. In your
>> > blog you state: "when ZooKeeper tells you that you are leader, there’s no
>> > guarantee that there isn’t another node that 'thinks' its the leader."
>> > However, given a long enough session timeout (I usually recommend 30–60
>> > seconds), I don’t see how this can happen. The client itself determines
>> > that there is a network partition when there is no heartbeat success. The
>> > heartbeat interval is a fraction of the session timeout. Once the
>> > heartbeat fails, the client must assume it no longer has the lock. Another
>> > client cannot take over the lock until, at minimum, the session timeout
>> > has elapsed. So how, then, can there be two leaders?
>> >
>> > -Jordan
>> >
>> > On July 15, 2015 at 2:23:12 PM, Ivan Kelly (iv...@apache.org) wrote:
>> >
>> > I blogged about this exact problem a couple of weeks ago [1]. I give an
>> > example of how split brain can happen in a resource under a zk lock
>> > (Hbase in this case). As Camille says, sequence numbers ftw. I'll add that
>> > the data store has to support them though, which not all do (in fact I've
>> > yet to see one in the wild that does). I've implemented a prototype that
>> > works with hbase[2] if you want to see what it looks like.
>> >
>> > -Ivan
>> >
>> > [1] https://medium.com/@ivankelly/reliable-table-writer-locks-for-hbase-731024295215
>> > [2] https://github.com/ivankelly/hbase-exclusive-writer
>> >
>> > On Wed, Jul 15, 2015 at 9:16 PM Vikas Mehta <vikasme...@gmail.com> wrote:
>> >
>> > > Jordan, I mean the client gives up the lock and stops working on the
>> > > shared resource. So when zookeeper is unavailable, no one is working on
>> > > any shared resource (because they cannot distinguish network partition
>> > > from zookeeper DEAD scenario).
>> > >
>> >
>>
>>
>
