On Wed, Dec 08, 2004 at 01:06:31AM -0500, Alan Altmark wrote:
> Do the open source solutions deal with STOMITH?  SAfLi will perform FORCE
> or will use LPAR management APIs to kill off the offending machine.

STO[MN]ITH is important in clustering techniques where all the machines
actively listen for traffic and independently decide who shall respond.
DHCP is an example: clustered DHCP servers will all respond to a request
for a lease, and the client just picks the first (or otherwise best) reply
it gets.  (The servers have delaying techniques that ensure a
lightly-patronised server will get in ahead of a heavily-patronised one,
but I digress...)
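
To make that digression a bit more concrete, here is a rough sketch of the
idea of a load-based reply delay.  It is not taken from any real DHCP server;
the function names, the "leases"/"capacity" fields and the one-second ceiling
are all made up for illustration.

```python
# Illustrative only: a busier server waits longer before replying, so the
# client tends to hear from the lightly loaded server first.
# All names and the max_delay default are hypothetical.

def response_delay(active_leases, capacity, max_delay=1.0):
    """Return seconds to wait before answering; busier servers wait longer."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    # Clamp the load fraction to the range 0.0 .. 1.0.
    load = min(max(active_leases / capacity, 0.0), 1.0)
    return load * max_delay


def first_responder(servers):
    """Return the name of the server whose reply would go out first."""
    quickest = min(
        servers,
        key=lambda s: response_delay(s["leases"], s["capacity"]),
    )
    return quickest["name"]


if __name__ == "__main__":
    cluster = [
        {"name": "busy", "leases": 90, "capacity": 100},
        {"name": "idle", "leases": 10, "capacity": 100},
    ]
    print(first_responder(cluster))
```

Every server still answers; the delay only biases which answer the client is
likely to see first.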

In clustering techniques where only one server can receive requests (like
what you'd have just using Keepalived, for instance), STO[MN]ITH might not be
so important -- no traffic is sent to the bogus host, which can be just as
good as it being shot.  You're obviously at the mercy of what method you use
to determine that a host needs to be shot -- but that's true of any failover
system.
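
As an example of "what method you use", the simplest detector is a heartbeat
timeout: a peer that has not been heard from within some interval is declared
dead.  The sketch below is a minimal illustration under that assumption; the
class name, the three-second timeout and the injectable clock are my own
inventions, not how Keepalived or any other package does it.

```python
import time


class HeartbeatMonitor:
    """Hypothetical detector: a host is 'dead' if its last heartbeat is too old."""

    def __init__(self, timeout=3.0, clock=time.monotonic):
        self.timeout = timeout      # seconds of silence tolerated (assumed value)
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}         # host name -> time of last heartbeat

    def beat(self, host):
        """Record that a heartbeat from 'host' has just arrived."""
        self.last_seen[host] = self.clock()

    def dead_hosts(self):
        """Return the sorted names of hosts silent for longer than the timeout."""
        now = self.clock()
        return sorted(
            host for host, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

Whatever dead_hosts() returns is only a candidate list.  A slow network or a
busy peer looks exactly like a dead one to this detector, which is the "at
the mercy of" part.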

Note I say "can be as good" -- obviously there might be other good reasons
for wanting the lame duck out of the pond.  DoS of the environment through
excess CPU consumption or network jabbering are examples.

Just trying to illustrate that sometimes you don't need *all* the bells and
whistles.

Back to the original question: yes, a quick search of the Linux-VS site does
show some information about STO[MN]ITH capabilities.  There is also a lot of
kernel heartbeat support that in PC-land interfaces with hardware (and
software) monitors to trigger a reset.  I think I even recall seeing something
in Keepalived.  I don't imagine it being too difficult to interface that stuff
to the z/VM SMAPI, or even the new feature that came in with z/VM 5.1 that
does the very thing.

Cheers,
Vic Cross
  <at home, speaking for myself only>

PS:
STOMITH: Shoot The Other Machine In The Head
STONITH: Shoot The Other   Node  In The Head
Yes, I know: you say tomato, I say tomato... :)
Just in case anyone saw STOMITH and wondered if that was anything like
STONITH -- they're the same.  Well, except that "N" might be a little more
appropriate in a z/VM scenario, as we probably don't want to shoot the whole
machine... :)
