Dejan Muhamedagic wrote:
> Hi,
>
> On Thu, Aug 20, 2009 at 03:58:19PM +0200, Andrew Beekhof wrote:
>> On Thu, Aug 20, 2009 at 3:56 PM, Terry L.
>> Inzauro<[email protected]> wrote:
>>
>>> Ok. I am indeed using 'external/ssh' as the stonith device. I figure it
>>> was better than nothing as I do not have access to
>>> a hardware stonith device. In you opinion, is using the 'external/ssh'
>>> plugin 'better' than NOT using a stonith plugin at all?
>> personally, i think so.
>> but there are plenty that disagree.
>
> Ah, that would include me :)
>
> If the stonith device fails to fence the failing node then there
> is no failover and you get zero availability. The probability
> that that happens is much higher when using a device such as
> external/ssh since it depends on both the network availability
> and the OS health. I'll leave it to you to figure out in how many
> ways these two dependencies can hinder a fencing operation.
>
> Thanks,
>
> Dejan
>
Ahem. How many ways to hinder, let me count the ways. Glad I got that
out of my system. Now on to the business at hand.

--------------------

There may be many different failures, but I guess I would have to split
them into two groups: probable and improbable.

Probable list:
1. Physical network link failure
2. Ethernet switch fabric failure
3. Administrator error (accidentally breaking the network configuration,
   including sshd breakage)

Improbable list:
1. IP stack failure
2. Unexpected OS errors (Linux is pretty stable these days)
3. Ethernet adapter failure (I can't remember the last time I saw an
   Ethernet card fail)

Having said all that, one could assume that 99% of the failures affecting
the 'external/ssh' stonith device are network related.

So, my last question is: can the 'external/ssh' stonith plugin be
configured to be "network fault tolerant"?

For instance:

  <nvpair id="stonithclone-attr-1" name="hostlist" value="node1 node1-c node2 node2-c"/>

where:
  node1   = communications over eth0 and switch0
  node1-c = communications over eth1 via xover
  node2   = communications over eth0 and switch0
  node2-c = communications over eth1 via xover

The desired logic is this:

  if node1 communication to node2 fails
  then
      use node1-c communications to node2-c
  else
      stonith thy self

I would say the probability of both links failing at the same time is
slim. This setup would then cover the "probable" list, right?

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
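P.S. The fallback ordering described above could be sketched roughly like
this in Python. It only illustrates the ordering; it is not how the
'external/ssh' plugin works, and fence_with_fallback, try_fence and
fake_try_fence are made-up names. It also reads the final "stonith thy
self" branch as "every path to the peer failed", which is an assumption.

```python
from typing import Callable, Iterable, Optional


def fence_with_fallback(
    paths: Iterable[str],
    try_fence: Callable[[str], bool],
) -> Optional[str]:
    """Try each path to the peer in order; return the first that fenced it.

    Returns None when every path failed (the "stonith thy self" case).
    """
    for host in paths:
        try:
            if try_fence(host):
                return host
        except OSError:
            # Treat a network-level error as "this path is down" and move on.
            continue
    return None


# Hypothetical stand-in for the real fencing call: pretend switch0 is dead,
# so only the crossover address ("-c" suffix) answers.
def fake_try_fence(host: str) -> bool:
    return host.endswith("-c")


used = fence_with_fallback(["node2", "node2-c"], fake_try_fence)
print(used or "no path left: stonith thy self")
```

Passing the fencing call in as a parameter keeps the ordering logic
separate from whatever actually reaches the peer, so it can be tried out
without touching a real node.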
