On 17-1-2012 4:41, William Yodlowsky wrote:
> On 16 January 2012 at 10:02, Camiel Dobbelaar <[email protected]> wrote:
>
>> On 14-1-2012 7:48, William Yodlowsky wrote:
>>> On 13 January 2012 at 16:35, Stuart Henderson <[email protected]> wrote:
>>>> i suspect the semantics you're actually looking for for relayd might
>>>> be "if relayd is not running, carpdemote should be raised".
>>>
>>> For my small corner of the net, having some other daemon or the kernel
>>> provide carp "auto-demotion" for when a service fails would be a great
>>> thing. Like a userland-accessible carp watchdog.
>>
>> ifstated can do this. (maybe it should be renamed to hoststated ;-) )
>>
>> daemon_up = '( "/etc/rc.d/relayd check" every 10 )'
>>
>> state main {
>>         if !$daemon_up {
>>                 run "ifconfig -g carp carpdemote"
>>                 set-state demoted
>>         }
>> }
>>
>> state demoted {
>>         if $daemon_up {
>>                 run "ifconfig -g carp -carpdemote"
>>                 set-state main
>>         }
>> }
>
> Yes. The thing is that I have a few 3rd-party daemons that aren't so
> predictable. Even though the daemon is running, it will sometimes act
> strangely over the network. So the idea was to healthcheck a daemon on
> localhost with a TCP connection... which is what relayd does... and have
> the machine demote itself if its service acts weird and doesn't restart
> properly. (Don't say use nc(1) with ifstated!)

relayd can already do this: it has plenty of checkers, and it can also
demote on a per-table/per-check basis.

> I don't know enough to know whether this is ridiculous or not, but what
> if you could have carp watch for a certain event and demote itself after
> a configurable amount of time if that event is not seen?
>
> # (a 'pat' every 30s or less keeps it from demoting)
> ifconfig -g carp watchdog pat timeout 30 autodemote 128
> ifconfig -g carp watchdog pat
> ifconfig -g carp -watchdog
>
> If the interfaces in the group don't see a "pat" to their watchdog in
> 30s, then auto-demote by 128. If they do, stay at the current level,
> whatever it is. If the pats come back, reduce by 128.
>
> ifconfig -g carp watchdog relayd timeout 10 autodemote 1
> # (relayd must issue the equivalent 'watchdog relayd' event within 10s)
>
> If relayd (or whatever daemon supports this) crashes/isn't around to
> pat the watchdog, then carp auto-demotes and the machine gracefully
> bows out, yielding to the next carp machine. In my naive view of
> things there could be multiple watchdogs with different names (like
> pat and relayd...). Anyway, thanks for reading :)

How is this better than having ifstated check that the relayd parent is
around? Maybe relayd is a little more sane if it can pat a watchdog?
It's a lot more complex, though, and it shifts part of the complexity
into the kernel.
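
For reference, the per-table demotion in relayd mentioned above might look
roughly like the relayd.conf fragment below. This is an untested sketch, not
a known-good configuration: the table name <localsvc>, the redirect name,
the listen address 192.0.2.10 and port 8080 are all made-up placeholders,
and forwarding to a loopback address may need adjusting for a real setup.
The idea is only that the "demote" table option raises the demotion counter
of the "carp" interface group once every host in the table fails its check.

```
# Hypothetical values throughout: adjust the table, address and port.
interval 10                         # run the health checks every 10 seconds

table <localsvc> { 127.0.0.1 }      # the local 3rd-party daemon to watch

redirect "localsvc" {
        listen on 192.0.2.10 port 8080
        # TCP connect check; demote the carp group when the table is down
        forward to <localsvc> port 8080 check tcp demote carp
}
```

If a plain TCP connect is not enough to catch a daemon that is up but
misbehaving, one of the other checkers (for example an HTTP or
send/expect check) could be swapped in for "check tcp".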
