On Wed, Dec 08, 2004 at 01:06:31AM -0500, Alan Altmark wrote:
> Do the open source solutions deal with STOMITH?  SAfLi will perform FORCE
> or will use LPAR management APIs to kill off the offending machine.

STO[MN]ITH is important in the clustering techniques that involve all the
machines actively listening for traffic and independently making decisions
about who shall respond.  DHCP is an example: clusters of DHCP servers will
all respond to a request for a lease, and the client just picks the first
(or otherwise best) reply it gets.  (The servers have delaying techniques
that ensure a lightly-patronised server will get in ahead of a
heavily-patronised one, but I digress...)

In clustering techniques where only one server can receive requests (like
what you'd have just using Keepalived, for instance), STO[MN]ITH might not
be so important -- no traffic is sent to the bogus host, which can be just
as good as it being shot.  You're obviously at the mercy of whatever method
you use to determine that a host needs to be shot -- but that's true of any
failover system.

Note I say "can be as good" -- obviously there might be other good reasons
for wanting the lame duck out of the pond.  DoS of the environment through
excess CPU consumption or network jabbering are examples.  I'm just trying
to illustrate that sometimes you don't need *all* the bells and whistles.

Back to the original question: yes, a quick search of the Linux-VS site
does show some information about STO[MN]ITH capabilities.  There is also a
lot of kernel heartbeat support that in PC-land interfaces with hardware
(and software) monitors to trigger a reset.  I think I even recall seeing
something in Keepalived.  I don't imagine it would be too difficult to
interface that stuff to the z/VM SMAPI, or even to the new feature that
came in with z/VM 5.1 that does that very thing.

Cheers,
Vic Cross
<at home, speaking for myself only>

PS: STOMITH: Shoot The Other Machine In The Head
    STONITH: Shoot The Other Node In The Head

Yes, I know: you say tomato, I say tomato... :)  Just in case anyone saw
STOMITH and wondered if that was anything like STONITH -- they're the same.
Well, except that "N" might be a little more appropriate in a z/VM
scenario, as we probably don't want to shoot the whole machine... :)
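
A rough sketch of the "decide that a host needs to be shot" step mentioned above, for anyone who wants something concrete.  It assumes a plain heartbeat timeout as the detection method, which is only one of many possibilities and is not taken from Keepalived, Linux-VS or any z/VM interface.  The class name, the `fence` callback and the guest names are all hypothetical; on z/VM the callback is where a FORCE or a SMAPI request would be issued, and none of that is shown here.

```python
import time
from typing import Callable, Dict, List, Set


class HeartbeatMonitor:
    """Hypothetical monitor: fence any node whose heartbeat goes quiet."""

    def __init__(
        self,
        timeout: float,
        fence: Callable[[str], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.timeout = timeout      # seconds of silence tolerated
        self.fence = fence          # assumed hook that actually shoots the node
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen: Dict[str, float] = {}
        self.fenced: Set[str] = set()

    def beat(self, node: str) -> None:
        """Record a heartbeat; a node that was already shot stays shot."""
        if node not in self.fenced:
            self.last_seen[node] = self.clock()

    def check(self) -> List[str]:
        """Fence every node silent for longer than the timeout."""
        now = self.clock()
        shot: List[str] = []
        for node, seen in list(self.last_seen.items()):
            if now - seen > self.timeout:
                self.fence(node)
                self.fenced.add(node)
                del self.last_seen[node]
                shot.append(node)
        return shot


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout=3.0, fence=lambda n: print("fencing", n))
    monitor.beat("linux01")
    monitor.beat("linux02")
    print(monitor.check())  # nothing has timed out yet
```

Heartbeats that arrive from a node after it has been fenced are ignored, so a lame duck that is still jabbering cannot talk its way back into the pond; something outside this sketch would have to readmit it.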
