Re: [Openais] kernel watchdog timer for corosync

Angus Salkeld Tue, 25 May 2010 15:26:17 -0700

On Tue, May 25, 2010 at 10:02:06AM -0700, Alan Jones wrote:
> This seems like a good design for services that cannot tolerate restart.
Well, sam gives you the option of restarting the process or using the
watchdog (others too).

> However, Pacemaker is designed to restart - so registering a watchdog
> for it doesn't make sense.  We clearly need a watchdog on the corosync
> daemon and may need one on whatever is restarting Pacemaker (corosync
> also?).  It is also interesting to ask where in corosync the watchdog is
> petted.
At the moment I have put this in a timer, which isn't too bad as it is
driven off of the poll loop.
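
A minimal sketch of a pet timer driven off a poll loop, for illustration
only. It is not the corosync code: struct pet_timer and the pet_timer_*
functions are hypothetical names, the clock is passed in as a plain
millisecond value, and the counter stands in for the actual write to the
watchdog device.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state for a periodic "pet" timer (not a corosync type). */
struct pet_timer {
    uint64_t interval_ms;  /* pet period; keep it well below the watchdog timeout */
    uint64_t next_due_ms;  /* absolute time at which the next pet is due */
    unsigned pets;         /* pets issued so far; stands in for the device write */
};

void pet_timer_init(struct pet_timer *t, uint64_t now_ms, uint64_t interval_ms)
{
    assert(interval_ms > 0);
    t->interval_ms = interval_ms;
    t->next_due_ms = now_ms + interval_ms;
    t->pets = 0;
}

/* Timeout to hand to poll(): ms until the timer is due, 0 if already due. */
int pet_timer_poll_timeout(const struct pet_timer *t, uint64_t now_ms)
{
    if (now_ms >= t->next_due_ms)
        return 0;
    return (int)(t->next_due_ms - now_ms);
}

/* Call after every return from poll(). Pets and rearms if due.
 * Returns 1 if the watchdog was petted, 0 otherwise. */
int pet_timer_run(struct pet_timer *t, uint64_t now_ms)
{
    if (now_ms < t->next_due_ms)
        return 0;
    t->pets++;
    t->next_due_ms = now_ms + t->interval_ms;
    return 1;
}
```

Because the pet only happens when the loop comes back around, a main
loop that is stuck never reaches pet_timer_run() and the watchdog
eventually fires. A loop that is spinning but doing nothing useful still
pets, which is the limitation Alan raises next.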

> Petting the watchdog should indicate that corosync is live in some higher
> sense and not blocked on socket calls, for example.

I suspect this might make corosync hit too many false positives.

There is a totempg_callback_token_create() call that will run your function
when a token is sent or received (depending on the option you pass to it).
We could hook this up to pet the watchdog. But what happens on a lossy
network? How do you set the tolerance?
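
One possible shape for the tolerance question, as a sketch under stated
assumptions: the token callback only records a timestamp, and the
periodic pet is skipped once no token has been seen for longer than a
configured tolerance. The token_liveness names are hypothetical, the
callback registration itself is not shown, and choosing tolerance_ms for
a lossy network is exactly the open problem.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical liveness tracker (not a corosync type). */
struct token_liveness {
    uint64_t last_token_ms;  /* time the last token was seen */
    uint64_t tolerance_ms;   /* longest token gap still treated as "live" */
};

void token_liveness_init(struct token_liveness *l, uint64_t now_ms,
                         uint64_t tolerance_ms)
{
    assert(tolerance_ms > 0);
    l->last_token_ms = now_ms;  /* start-up counts as a fresh token */
    l->tolerance_ms = tolerance_ms;
}

/* Would be called from the token sent/received callback. */
void token_liveness_on_token(struct token_liveness *l, uint64_t now_ms)
{
    l->last_token_ms = now_ms;
}

/* Returns 1 if the watchdog should still be petted at now_ms, else 0. */
int token_liveness_should_pet(const struct token_liveness *l, uint64_t now_ms)
{
    if (now_ms < l->last_token_ms)
        return 1;  /* clock went backwards; do not count it as a stall */
    return (now_ms - l->last_token_ms) <= l->tolerance_ms;
}
```

A tight tolerance reboots healthy nodes during ordinary token loss,
while a loose one delays detection of a genuinely stuck daemon, so the
false-positive concern above still applies.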

I'll have a think about it though; perhaps Steve has some suggestions.

-Angus

> Alan
> 
> On Mon, May 24, 2010 at 5:29 PM, Angus Salkeld <[email protected]> wrote:
> 

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais