How are you going to ensure atomicity? I mean, if your processor dies in the middle of the operation, how do you know whether it is done or not?

--
Best regards,
Vitalii Tymchyshyn
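[For concreteness, a minimal sketch of one way to get the atomicity asked about here, using the "select for update" / database-locking idea Eric mentions below. The table and method names are hypothetical, and it assumes a PENDING row per run date is created ahead of time so there is something to lock. Note that the e-mail send itself happens outside the transaction, so this narrows the duplicate-send window rather than closing it completely.]

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.time.LocalDate;

    public class IdempotentReportJob {
        // Claim and work happen in one transaction: if the process dies
        // mid-operation, the transaction rolls back and the row stays
        // PENDING, so the next leader can tell the work was never recorded
        // as done and can safely retry.
        public void runOnce(Connection conn, LocalDate day) throws Exception {
            conn.setAutoCommit(false);
            try {
                String status = null;
                try (PreparedStatement lock = conn.prepareStatement(
                        "SELECT status FROM report_runs WHERE run_date = ? FOR UPDATE")) {
                    lock.setObject(1, day);
                    try (ResultSet rs = lock.executeQuery()) {
                        if (rs.next()) {
                            status = rs.getString("status");
                        }
                    }
                }
                if ("SENT".equals(status)) {
                    conn.rollback();           // someone already finished; do nothing
                    return;
                }
                sendReportEmails(day);         // the real, side-effecting work
                try (PreparedStatement mark = conn.prepareStatement(
                        "UPDATE report_runs SET status = 'SENT' WHERE run_date = ?")) {
                    mark.setObject(1, day);
                    mark.executeUpdate();
                }
                conn.commit();                 // the claim commits together with the work
            } catch (Exception e) {
                conn.rollback();               // crash/failure leaves no claim behind
                throw e;
            }
        }

        private void sendReportEmails(LocalDate day) { /* ... */ }
    }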
On Dec 10, 2012, at 00:11, "Eric Pederson" <[email protected]> wrote:

> Also sometimes the app leadership (via LeaderLatch) will get lost - I will
> follow up about this on the Curator list:
> https://gist.github.com/4247226
>
> So back to my previous question, what is the best way to implement the
> "fence"?
>
> -- Eric
>
> On Sun, Dec 9, 2012 at 4:42 PM, Eric Pederson <[email protected]> wrote:
>
>> The irony is that I am using leader election to convert non-idempotent
>> operations into idempotent operations :)  For example, once a night a
>> report is emailed out to a set of addresses. We don't want the report to
>> go to the same person more than once.
>>
>> Prior to using leader election, one of the cluster members was designated
>> as the scheduled-task "leader" using a system property, but if that
>> cluster member crashed, a manual operation was required to fail over the
>> "leader" responsibility to another cluster member. I considered using
>> app-specific techniques to make the scheduled tasks idempotent (for
>> example, "select for update" / database locking), but I wanted a general
>> solution, and I needed clustering support for other reasons (cluster
>> membership, etc.).
>>
>> Anyway, here is the code that I'm using.
>>
>> Application startup (using Curator LeaderLatch):
>> https://gist.github.com/3936162
>> https://gist.github.com/3935895
>> https://gist.github.com/3935889
>>
>> ClusterStatus:
>> https://gist.github.com/3943149
>> https://gist.github.com/3935861
>>
>> Scheduled task:
>> https://gist.github.com/4246388
>>
>> In the last gist, the "distribute" scheduled task runs every 30 seconds.
>> It checks clusterStatus.isLeader to see whether the current cluster
>> member is the leader before running the real method (which sends email).
>> clusterStatus() calls methods on LeaderLatch.
>>
>> Here is the output I see if I kill the ZK quorum leader and the app
>> cluster member that was the leader loses its LeaderLatch leadership to
>> another cluster member:
>> https://gist.github.com/4247058
>>
>> -- Eric
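[For reference, a minimal sketch of the guard Eric describes above: a scheduled task that checks LeaderLatch leadership immediately before doing the real work. It assumes a current Apache Curator; the connect string, latch path, and distribute() are illustrative names, not taken from his gists.]

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.leader.LeaderLatch;
    import org.apache.curator.retry.ExponentialBackoffRetry;

    public class LeaderGuardedTask {
        public static void main(String[] args) throws Exception {
            CuratorFramework client = CuratorFrameworkFactory.newClient(
                    "zk1:2181,zk2:2181,zk3:2181",
                    new ExponentialBackoffRetry(1000, 3));
            client.start();

            LeaderLatch latch = new LeaderLatch(client, "/app/scheduled-task-leader");
            latch.start();

            ScheduledExecutorService scheduler =
                    Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(() -> {
                // Check leadership right before the work. As discussed below,
                // this narrows, but cannot fully close, the window in which
                // a deposed leader still believes it is the leader.
                if (latch.hasLeadership()) {
                    distribute();   // the real method (sends the email)
                }
            }, 0, 30, TimeUnit.SECONDS);
        }

        private static void distribute() { /* send the report */ }
    }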
>>
>> On Sun, Dec 9, 2012 at 12:30 AM, Henry Robinson <[email protected]> wrote:
>>
>>> On 8 December 2012 21:18, Jordan Zimmerman <[email protected]> wrote:
>>>
>>>> If your ConnectionStateListener gets SUSPENDED or LOST you've lost the
>>>> connection to ZooKeeper. Therefore you cannot use that same ZooKeeper
>>>> connection to manage a node that denotes whether the process is running
>>>> or not. Only 1 VM at a time will be running the process. That process
>>>> can watch for SUSPENDED/LOST and wind down the task.
>>>
>>> My point is that by the time that VM sees SUSPENDED/LOST, another VM may
>>> have been elected leader and have started running another process.
>>>
>>> It's a classic problem - you need some mechanism to fence a node that
>>> thinks it's the leader, but isn't and hasn't got the memo yet. The way
>>> around the problem is either to ensure that no work is done once you are
>>> no longer the leader (perhaps by checking every time you want to do
>>> work), or that the work you do does not affect the system (e.g. by using
>>> idempotent work units).
>>>
>>> ZK itself solves this internally by checking that it has a quorum for
>>> every operation, which forces an ordering between the disconnection
>>> event and trying to do something that relies upon being the leader.
>>> Other systems forcibly terminate old leaders before allowing a new
>>> leader to take the throne.
>>>
>>> Henry
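[Henry's "fence" can be made concrete with a monotonically increasing fencing token. One common choice with ZooKeeper - not something from the code in this thread - is the creation zxid (czxid) of the leader's latch znode, since ZooKeeper assigns zxids in increasing order, so a later leader always holds a larger token. The downstream resource must cooperate by rejecting stale tokens; its API below is hypothetical.]

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.zookeeper.data.Stat;

    public class FencedWorker {
        private final CuratorFramework client;
        private final FencedResource resource;   // hypothetical downstream system

        public FencedWorker(CuratorFramework client, FencedResource resource) {
            this.client = client;
            this.resource = resource;
        }

        public void doLeaderWork(String myLatchNodePath) throws Exception {
            Stat stat = client.checkExists().forPath(myLatchNodePath);
            if (stat == null) {
                return;                           // our latch node is gone: not leader
            }
            long fencingToken = stat.getCzxid();  // creation zxid: higher == newer leader
            // The resource remembers the highest token it has seen and rejects
            // anything lower, so a deposed leader's late writes fail even if
            // it has not yet observed SUSPENDED/LOST.
            resource.write("nightly-report", fencingToken);
        }

        public interface FencedResource {
            void write(String key, long fencingToken) throws Exception;
        }
    }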
>>>>> You can't assume that the notification is received locally before
>>>>> another leader election finishes elsewhere
>>>>
>>>> Which notification? The ConnectionStateListener is an abstraction on
>>>> ZooKeeper's watcher mechanism. It's only significant for the VM that is
>>>> the leader. Non-leaders don't need to be concerned.
>>>>
>>>> -JZ
>>>>
>>>> On Dec 8, 2012, at 9:12 PM, Henry Robinson <[email protected]> wrote:
>>>>
>>>>> You can't assume that the notification is received locally before
>>>>> another leader election finishes elsewhere (particularly if you are
>>>>> running slowly for some reason!), so it's not sufficient to guarantee
>>>>> that the process that is running locally has finished before someone
>>>>> else starts another.
>>>>>
>>>>> It's usually best - if possible - to restructure the system so that
>>>>> processes are idempotent to work around these kinds of problems, in
>>>>> conjunction with using the kind of primitives that Curator provides.
>>>>>
>>>>> Henry
>>>>>
>>>>> On 8 December 2012 21:04, Jordan Zimmerman <[email protected]> wrote:
>>>>>
>>>>>> This is why you need a ConnectionStateListener. You'll get a notice
>>>>>> that the connection has been suspended and you should assume all
>>>>>> locks/leaders are invalid.
>>>>>>
>>>>>> -JZ
>>>>>>
>>>>>> On Dec 8, 2012, at 9:02 PM, Henry Robinson <[email protected]> wrote:
>>>>>>
>>>>>>> What about a network disconnection? Presumably leadership is revoked
>>>>>>> when the leader appears to have failed, which can happen for more
>>>>>>> reasons than a VM crash (VM running slow, network event, GC pause,
>>>>>>> etc.).
>>>>>>>
>>>>>>> Henry
>>>>>>>
>>>>>>> On 8 December 2012 21:00, Jordan Zimmerman <[email protected]> wrote:
>>>>>>>
>>>>>>>> The leader latch lock is the equivalent of "task in progress". I
>>>>>>>> assume the task is running in the same VM as the leader lock. The
>>>>>>>> only reason the VM would lose leadership is if it crashes, in which
>>>>>>>> case the process would die anyway.
>>>>>>>>
>>>>>>>> -JZ
>>>>>>>>
>>>>>>>> On Dec 8, 2012, at 8:56 PM, Eric Pederson <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> If I recall correctly, it was Henry Robinson who gave me the advice
>>>>>>>>> to have a "task in progress" check.
>>>>>>>>>
>>>>>>>>> -- Eric
>>>>>>>>>
>>>>>>>>> On Sat, Dec 8, 2012 at 11:54 PM, Eric Pederson <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I am using Curator LeaderLatch :)
>>>>>>>>>>
>>>>>>>>>> -- Eric
>>>>>>>>>>
>>>>>>>>>> On Sat, Dec 8, 2012 at 11:52 PM, Jordan Zimmerman
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> You might check your leader implementation. Writing a correct
>>>>>>>>>>> leader recipe is actually quite challenging due to edge cases.
>>>>>>>>>>> Have a look at Curator (disclosure: I wrote it) for an example.
>>>>>>>>>>>
>>>>>>>>>>> -JZ
>>>>>>>>>>>
>>>>>>>>>>> On Dec 8, 2012, at 8:49 PM, Eric Pederson <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Actually, I had the same thought and didn't consider having to
>>>>>>>>>>>> do this until I talked about my project at a ZooKeeper User
>>>>>>>>>>>> Group a month or so ago and was given this advice.
>>>>>>>>>>>>
>>>>>>>>>>>> I know that I do see leadership being lost/transferred when one
>>>>>>>>>>>> of the ZK servers is restarted (not the whole ensemble). And it
>>>>>>>>>>>> seems like I've seen it happen even when the ensemble stays
>>>>>>>>>>>> totally stable (though I am not 100% sure, as it's been a while
>>>>>>>>>>>> since I worked on this particular application).
>>>>>>>>>>>>
>>>>>>>>>>>> -- Eric
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Dec 8, 2012 at 11:25 PM, Jordan Zimmerman
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Why would it lose leadership? The only reason I can think of
>>>>>>>>>>>>> is if the ZK cluster goes down. In normal use, the ZK cluster
>>>>>>>>>>>>> won't go down (I assume you're running 3 or 5 instances).
>>>>>>>>>>>>>
>>>>>>>>>>>>> -JZ
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Dec 8, 2012, at 8:17 PM, Eric Pederson <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> During the time the task is running, a cluster member could
>>>>>>>>>>>>>> lose its leadership.
>>>
>>> --
>>> Henry Robinson
>>> Software Engineer
>>> Cloudera
>>> 415-994-6679
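[Finally, a minimal sketch of the ConnectionStateListener wind-down Jordan describes in the quoted messages above: on SUSPENDED or LOST, assume all locks/leaders are invalid and interrupt the in-flight task. The task-handle wiring is an assumption; only the listener interface and its registration are Curator API. Per Henry's point, even this leaves a window between losing the session and observing the event.]

    import java.util.concurrent.Future;
    import java.util.concurrent.atomic.AtomicReference;

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.state.ConnectionState;
    import org.apache.curator.framework.state.ConnectionStateListener;

    public class WindDownOnSuspend implements ConnectionStateListener {
        // Handle to whatever task the current leader is running, if any.
        private final AtomicReference<Future<?>> runningTask = new AtomicReference<>();

        public void taskStarted(Future<?> task) {
            runningTask.set(task);
        }

        @Override
        public void stateChanged(CuratorFramework client, ConnectionState newState) {
            if (newState == ConnectionState.SUSPENDED
                    || newState == ConnectionState.LOST) {
                Future<?> task = runningTask.get();
                if (task != null) {
                    task.cancel(true);   // best effort: stop acting as leader now
                }
            }
        }

        // Registered once at startup:
        //   client.getConnectionStateListenable().addListener(new WindDownOnSuspend());
    }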
