On Tue, May 25, 2010 at 10:02:06AM -0700, Alan Jones wrote:
> This seems like a good design for services that cannot tolerate restart.

Well, SAM gives you the option of restarting the process or using a watchdog (among others).

> However, Pacemaker is designed to restart - so registering a watchdog
> for it doesn't make sense. We clearly need a watchdog on the corosync
> daemon and may need one on whatever is restarting Pacemaker (corosync
> also?). It is also interesting to ask where in corosync the watchdog is
> petted.

At the moment I have put this in a timer, which isn't too bad as it is driven off the poll loop.

> Petting the watchdog should indicate that corosync is live in some higher
> sense and not blocked on socket calls, for example.

I suspect this might make corosync hit too many false positives. There is a totempg_callback_token_create() call that will run your function when a token is sent or received (depending on the option you pass to it). We could hook this up to pet the watchdog. But what happens on a lossy network? How do you set the tolerance?

I'll have a think about it though; perhaps Steve has some suggestions.

-Angus

> Alan
>
> On Mon, May 24, 2010 at 5:29 PM, Angus Salkeld <[email protected]> wrote:
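
One way to frame the tolerance question, as a rough sketch only: let the token callback record when a token was last seen, and have the poll-loop timer pet the watchdog only while that timestamp is recent enough. All names here (wd_gate and its functions) are hypothetical and this is not corosync code; the wiring to totempg_callback_token_create() and to a real watchdog device is deliberately left out, and timestamps are assumed to come from a monotonic clock in milliseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of corosync: gates watchdog petting
 * on how recently a totem token was observed. */
struct wd_gate {
    uint64_t last_token_ms;  /* monotonic time the last token was seen */
    uint64_t tolerance_ms;   /* longest acceptable gap between tokens */
};

/* Start the gate as if a token had just been seen at now_ms. */
void wd_gate_init(struct wd_gate *g, uint64_t now_ms, uint64_t tolerance_ms)
{
    assert(g != NULL);
    g->last_token_ms = now_ms;
    g->tolerance_ms = tolerance_ms;
}

/* Intended to be called from the token sent/received callback. */
void wd_gate_token_seen(struct wd_gate *g, uint64_t now_ms)
{
    assert(g != NULL);
    g->last_token_ms = now_ms;
}

/* Intended to be called from the poll-loop timer. Returns true if the
 * watchdog should be petted, i.e. the gap since the last token is
 * within tolerance. Assumes now_ms never runs behind last_token_ms. */
bool wd_gate_should_pet(const struct wd_gate *g, uint64_t now_ms)
{
    assert(g != NULL);
    return (now_ms - g->last_token_ms) <= g->tolerance_ms;
}
```

With this split the timer still drives the petting, but a stalled token stops it once the gap exceeds tolerance_ms. On a lossy network the tolerance would presumably have to sit well above the token timeout to avoid false positives; picking that value is the open question.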
