On Mon, May 31, 2010 at 04:47:43PM +0200, Lars Marowsky-Bree wrote:
> On 2010-05-31T11:45:37, Lars Ellenberg <[email protected]> wrote:
>
> > Use the "anything" resource agent, and define a monitor action script
> > of your choice? or put a loop in your script, and restart whatever is
> > necessary from there, then let lrm monitor your script, and restart
> > that script if necessary?
>
> That is pretty costly in overhead though, and requires something to
> actually periodically monitor (ie, it is asynchronous with latency).
>
> > There are so many ways to do it that I really fail to see the challenge?
>
> The suggestion was to register the pids to be monitored with a daemon,
> that would then waitpid() for them (or use signal(0) to check them more
> frequently), and create event-driven failure notifications with low
> overhead.

There are several flavors of overhead.
One that is often underestimated is programming and code maintenance overhead ;-)

Why would we "register pid with lrm" and duplicate code from heartbeat
proper into lrmd and whatnot, or even rewrite the respawn from scratch
again, if the same thing can be achieved with a few lines of shell code?

I suggested putting the loop in the script and monitoring that script,
sort of what the MySQL guys did with their "mysqld_safe" wrapper:

    while :; do
        echo "(re)starting $whatever... "
        $your_daemon_here
        ex=$?
        if [ $ex = $cleanly_and_expectedly_stopped ] ||
           not_supposed_to_respawn_this_time; then
            break
        fi
        sleep 1
    done

There is not much in those few lines that can fail,
so monitoring it "with some latency" is fine.
If $your_daemon_here exits unexpectedly,
it will be respawned right away (after the one-second sleep).

Of course you can get fancy: background the daemon, then wait,
have a trap on SIGTERM, put in some ulimits, ...

Write the "respawn_everything" RA.
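
For what it's worth, here is a minimal sketch of the "fancy" variant
(background the daemon, wait for it, trap SIGTERM), assuming a POSIX
shell. The names respawn, _respawn_on_term and RESPAWN_DELAY are made up
for illustration and are not part of any existing resource agent; ulimits
and logging are left out.

```shell
# Hypothetical sketch; respawn, _respawn_on_term and RESPAWN_DELAY are
# illustrative names, not part of any existing resource agent.
# Usage: respawn CLEAN_EXIT_CODE COMMAND [ARGS...]

_respawn_on_term() {
    # Remember that a stop was requested and pass SIGTERM on to the daemon.
    stop_requested=yes
    if [ -n "$child" ]; then
        kill -TERM "$child" 2>/dev/null || :
    fi
}

respawn() {
    clean_exit=$1
    shift
    child=
    stop_requested=no
    ex=0
    trap _respawn_on_term TERM

    while [ "$stop_requested" = no ]; do
        echo "(re)starting $1 ..."
        "$@" &
        child=$!

        # wait returns early when a trapped signal arrives, so keep
        # waiting until the child is really gone.
        ex=0
        wait "$child" || ex=$?
        while kill -0 "$child" 2>/dev/null; do
            ex=0
            wait "$child" || ex=$?
        done
        child=

        # Stop respawning on a requested stop or a clean, expected exit.
        if [ "$stop_requested" = yes ] || [ "$ex" -eq "$clean_exit" ]; then
            break
        fi
        sleep "${RESPAWN_DELAY:-1}"
    done

    trap - TERM
    return "$ex"
}
```

Something like "respawn 0 /usr/sbin/your_daemon --foreground" would then
be the long-running script that lrm monitors; the daemon path and flag
here are placeholders. Sending SIGTERM to the wrapper stops the daemon
and ends the loop instead of respawning.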
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.