Thanks Vitalii! I will think about this and ask if I have any questions.
-- Eric

On Tue, Dec 11, 2012 at 3:09 PM, Vitalii Tymchyshyn <[email protected]> wrote:

I am asking because you have an "at most once" vs. "at least once" problem. I don't think you can have "exactly once" unless your jobs are transactional and you can synchronize your transaction commits to ZooKeeper (preferably with two-phase commit). So you need to decide.

What I'd recommend is a queue-like architecture rather than a lock-based one. This way you can:
a) Process tasks in parallel.
b) Increase timeouts to be larger than the maximum task time, e.g. set the timeout to one hour. This would mean that the task restarts in an hour if its client fails.

But this would mean moving task status/queueing from the database to ZooKeeper. In my view that would be good, since the database is a SPOF for you.

Best regards,
Vitalii Tymchyshyn

2012/12/10 Eric Pederson <[email protected]>:

It depends on the scheduled task. Some have status fields in the database that indicate new/in-progress/done, but others do not.

-- Eric

On Mon, Dec 10, 2012 at 1:49 AM, Vitalii Tymchyshyn <[email protected]> wrote:

How are you going to ensure atomicity? I mean, if your processor dies in the middle of the operation, how do you know whether it is done or not?

--
Best regards,
Vitalii Tymchyshyn

On Dec 10, 2012 at 00:11, "Eric Pederson" <[email protected]> wrote:

Also, sometimes the app leadership (via LeaderLatch) will get lost - I will follow up about this on the Curator list:
https://gist.github.com/4247226

So, back to my previous question: what is the best way to implement the "fence"?
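[Editor's note: Vitalii's "queue with a timeout larger than the maximum task time" suggestion can be sketched in plain Java. This is a minimal, hypothetical in-memory model (the real design would keep the queue in ZooKeeper); the class and method names are illustrative. It shows the at-least-once behavior he describes: a claimed task whose lease expires becomes claimable again.]

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of a lease-based task queue: a worker claims a task
// for a fixed lease period; if the worker dies and never calls complete(),
// the task becomes visible again after the lease expires. This gives
// at-least-once (not exactly-once) semantics, as discussed in the thread.
class LeaseQueue {
    private final Queue<String> pending = new ArrayDeque<>();
    private final Map<String, Long> leased = new HashMap<>(); // task -> lease expiry
    private final long leaseMillis;

    LeaseQueue(long leaseMillis) { this.leaseMillis = leaseMillis; }

    synchronized void add(String task) { pending.add(task); }

    // Claim the next task, or re-claim one whose lease has expired.
    synchronized String claim(long now) {
        for (Map.Entry<String, Long> e : leased.entrySet()) {
            if (e.getValue() <= now) {           // lease expired: redeliver
                e.setValue(now + leaseMillis);
                return e.getKey();
            }
        }
        String task = pending.poll();
        if (task != null) leased.put(task, now + leaseMillis);
        return task;
    }

    // Worker finished: remove the task for good.
    synchronized void complete(String task) { leased.remove(task); }
}
```

With a one-hour lease (as in the example above), a task whose worker crashes is simply handed to another worker an hour later, with no manual failover step.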
-- Eric

On Sun, Dec 9, 2012 at 4:42 PM, Eric Pederson <[email protected]> wrote:

The irony is that I am using leader election to convert non-idempotent operations into idempotent operations :) For example, once a night a report is emailed out to a set of addresses. We don't want the report to go to the same person more than once.

Prior to using leader election, one of the cluster members was designated as the scheduled-task "leader" using a system property. But if that cluster member crashed, it required a manual operation to fail over the "leader" responsibility to another cluster member. I considered using app-specific techniques to make the scheduled tasks idempotent (for example, "select for update" / database locking), but I wanted a general solution, and I needed clustering support for other reasons (cluster membership, etc.).

Anyway, here is the code that I'm using.

Application startup (using Curator LeaderLatch):
https://gist.github.com/3936162
https://gist.github.com/3935895
https://gist.github.com/3935889

ClusterStatus:
https://gist.github.com/3943149
https://gist.github.com/3935861

Scheduled task:
https://gist.github.com/4246388

In the last gist the "distribute" scheduled task runs every 30 seconds. It checks clusterStatus.isLeader to see if the current cluster member is the leader before running the real method (which sends email). clusterStatus() calls methods on LeaderLatch.
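[Editor's note: the guard pattern Eric describes - a scheduled method that checks an isLeader flag immediately before doing the real work - can be sketched as below. This is a hypothetical stand-in, not code from Eric's gists; in the real application the flag would be fed by Curator's LeaderLatch and the method scheduled every 30 seconds.]

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the periodic task runs its real work only when this
// member currently believes it is the leader, and silently skips otherwise.
class LeaderGuardedTask {
    private final AtomicBoolean isLeader = new AtomicBoolean(false);
    final AtomicInteger runs = new AtomicInteger(0); // stands in for "emails sent"

    // In the real app this would be driven by LeaderLatch callbacks.
    void setLeader(boolean leader) { isLeader.set(leader); }

    // Invoked on a schedule (every 30 seconds in the thread's example).
    void distribute() {
        if (!isLeader.get()) {
            return;                 // not the leader: do nothing this cycle
        }
        runs.incrementAndGet();     // stand-in for "send the report"
    }
}
```

Note that, as Henry points out below, the check and the work are not atomic: leadership can be lost between the isLeader check and the send, which is exactly the fencing problem under discussion.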
Here is the output that I am seeing if I kill the ZK quorum leader and the app cluster member that was the leader loses its LeaderLatch leadership to another cluster member:
https://gist.github.com/4247058

-- Eric

On Sun, Dec 9, 2012 at 12:30 AM, Henry Robinson <[email protected]> wrote:

On 8 December 2012 21:18, Jordan Zimmerman <[email protected]> wrote:

> If your ConnectionStateListener gets SUSPENDED or LOST you've lost connection to ZooKeeper. Therefore you cannot use that same ZooKeeper connection to manage a node that denotes whether the process is running. Only one VM at a time will be running the process. That process can watch for SUSPENDED/LOST and wind down the task.

My point is that by the time that VM sees SUSPENDED/LOST, another VM may have been elected leader and have started running another process.

It's a classic problem - you need some mechanism to fence a node that thinks it's the leader, but isn't and hasn't got the memo yet. The way around the problem is either to ensure that no work is done once you are no longer the leader (perhaps by checking every time you want to do work), or that the work you do does not affect the system (e.g. via idempotent work units).

ZK itself solves this internally by checking that it has a quorum for every operation, which forces an ordering between the disconnection event and trying to do something that relies upon being the leader. Other systems forcibly terminate old leaders before allowing a new leader to take the throne.

Henry

On 8 December 2012 21:18, Jordan Zimmerman <[email protected]> wrote:

> You can't assume that the notification is received locally before another leader election finishes elsewhere

Which notification? The ConnectionStateListener is an abstraction on ZooKeeper's watcher mechanism. It's only significant for the VM that is the leader. Non-leaders don't need to be concerned.

-JZ

On Dec 8, 2012, at 9:12 PM, Henry Robinson <[email protected]> wrote:

You can't assume that the notification is received locally before another leader election finishes elsewhere (particularly if you are running slowly for some reason!), so it's not sufficient to guarantee that the process that is running locally has finished before someone else starts another.

It's usually best - if possible - to restructure the system so that processes are idempotent, in conjunction with using the kind of primitives that Curator provides, to work around these kinds of problems.

Henry

On 8 December 2012 21:04, Jordan Zimmerman <[email protected]> wrote:

This is why you need a ConnectionStateListener. You'll get a notice that the connection has been suspended, and you should assume all locks/leaders are invalid.
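[Editor's note: the fencing mechanism Henry describes - "other systems forcibly terminate old leaders" or reject their work - is commonly implemented with a fencing token: each newly elected leader is issued a strictly increasing epoch (a ZooKeeper znode version or sequence number can serve), and the shared resource rejects requests carrying a stale epoch. A minimal, hypothetical sketch:]

```java
// Hypothetical fenced resource: it remembers the highest leader epoch it
// has seen and rejects writes from any older epoch. A deposed leader that
// has not yet observed SUSPENDED/LOST is thereby fenced out even though
// it still believes it is the leader.
class FencedResource {
    private long highestEpoch = -1;

    // Returns true if the write is accepted, false if the caller is fenced.
    synchronized boolean write(long epoch, String data) {
        if (epoch < highestEpoch) {
            return false;           // stale leader: reject the late write
        }
        highestEpoch = epoch;
        // ... apply `data` to the real resource here ...
        return true;
    }
}
```

This forces the ordering Henry mentions: once the new leader's first write arrives, any delayed write from the old leader fails rather than sending, say, a duplicate report.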
-JZ

On Dec 8, 2012, at 9:02 PM, Henry Robinson <[email protected]> wrote:

What about a network disconnection? Presumably leadership is revoked when the leader appears to have failed, which can happen for more reasons than a VM crash (VM running slowly, network event, GC pause, etc.).

Henry

On 8 December 2012 21:00, Jordan Zimmerman <[email protected]> wrote:

The leader latch lock is the equivalent of "task in progress". I assume the task is running in the same VM as the leader lock. The only reason the VM would lose leadership is if it crashes, in which case the process would die anyway.

-JZ

On Dec 8, 2012, at 8:56 PM, Eric Pederson <[email protected]> wrote:

If I recall correctly, it was Henry Robinson who gave me the advice to have a "task in progress" check.
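[Editor's note: Jordan's advice - on SUSPENDED or LOST, assume all locks and leadership are invalid and wind the task down - can be sketched without a live ZooKeeper as below. The enum mirrors Curator's ConnectionState values; in the real application the listener would be registered on the CuratorFramework instance, and regaining a connection would still require re-checking the LeaderLatch before doing leader-only work.]

```java
// Hypothetical connection-state listener: leader-only work is permitted
// while connected, and is immediately forbidden on SUSPENDED or LOST.
class WindDownListener {
    enum ConnectionState { CONNECTED, RECONNECTED, SUSPENDED, LOST }

    private volatile boolean mayDoLeaderWork = false;

    void stateChanged(ConnectionState newState) {
        switch (newState) {
            case SUSPENDED:
            case LOST:
                mayDoLeaderWork = false; // assume all locks/leadership invalid
                break;
            case CONNECTED:
            case RECONNECTED:
                // Connected again - but leadership itself must still be
                // re-verified against the latch before acting on this.
                mayDoLeaderWork = true;
                break;
        }
    }

    boolean mayDoLeaderWork() { return mayDoLeaderWork; }
}
```

As Henry's objection above makes clear, this flag alone is not a fence - the notification may arrive after a new leader has already started - so it should be combined with idempotent tasks or a fencing token, not used instead of them.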
-- Eric

On Sat, Dec 8, 2012 at 11:54 PM, Eric Pederson <[email protected]> wrote:

I am using Curator LeaderLatch :)

-- Eric

On Sat, Dec 8, 2012 at 11:52 PM, Jordan Zimmerman <[email protected]> wrote:

You might check your leader implementation. Writing a correct leader recipe is actually quite challenging due to edge cases. Have a look at Curator (disclosure: I wrote it) for an example.

-JZ

On Dec 8, 2012, at 8:49 PM, Eric Pederson <[email protected]> wrote:

Actually, I had the same thought and didn't consider having to do this until I talked about my project at a Zookeeper User Group a month or so ago, where I was given this advice.

I know that I do see leadership being lost/transferred when one of the ZK servers is restarted (not the whole ensemble).
And it seems like I've seen it happen even when the ensemble stays totally stable (though I am not 100% sure, as it's been a while since I have worked on this particular application).

-- Eric

On Sat, Dec 8, 2012 at 11:25 PM, Jordan Zimmerman <[email protected]> wrote:

Why would it lose leadership? The only reason I can think of is if the ZK cluster goes down. In normal use, the ZK cluster won't go down (I assume you're running 3 or 5 instances).

-JZ

On Dec 8, 2012, at 8:17 PM, Eric Pederson <[email protected]> wrote:

During the time the task is running, a cluster member could lose its leadership.
--
Henry Robinson
Software Engineer
Cloudera
415-994-6679

--
Best regards,
Vitalii Tymchyshyn
