Dejan Muhamedagic wrote:
> On Thu, Aug 20, 2009 at 03:16:56PM +0200, Andrew Beekhof wrote:
>> On Thu, Aug 20, 2009 at 2:55 PM, Terry L. Inzauro<[email protected]> wrote:
>>> Dejan Muhamedagic wrote:
>>>> Hi,
>>>>
>>>> On Thu, Aug 20, 2009 at 11:01:43AM +0200, Thomas Glanzmann wrote:
>>>>> Hello Terry,
>>>>>
>>>>>> What would cause the stonith 'start' operation to fail after it
>>>>>> initially had succeeded?
>>>>> If my understanding is correct (I wrote a stonith agent for vsphere
>>>>> yesterday), then it runs the status command of the stonith agent and
>>>>> looks at the exit status, like this:
>>>>>
>>>>> (ha-01) [~] VI_SERVER=esx-03.glanzmann.de VI_USERNAME=root
>>>>> /usr/lib/stonith/plugins/external/vsphere status; echo $?
>>>>> Enter password:
>>>>> 0
>>>> Right. The start operation includes a status. If the status
>>>> operation fails, the start obviously fails too.
>>>>
>>>> Thanks,
>>>>
>>>> Dejan
>>>>
>>>>> Thomas
>>>
>>> Ok, I understand that, but why would it intermittently fail if it initially
>>> succeeds? These machines are not heavily loaded
>>> and are by no means slow.
>> Could be a timing issue. Some boxes only allow 1 simultaneous connection.
>
> Oh, right, of course, forgot about that, though in this case
> (external/ssh) the device allows multiple simultaneous
> connections. But it could also be due to timeouts.
>
> Thanks,
>
> Dejan
>
>>> I guess the better question would be: How do I track down the culprit?
>>> Obviously, if a monitor or start command fails on the
>>> stonith agent, then it will cause a "stonith reboot" operation of one or
>>> both of the nodes, which shouldn't happen unless
>>> there's a definite reason to do so.

Ok. I am indeed using 'external/ssh' as the stonith device. I figured it was
better than nothing, as I do not have access to a hardware stonith device.

In your opinion, is using the 'external/ssh' plugin 'better' than NOT using a
stonith plugin at all?

In the meantime, I'll bump up the timeouts from 20s to 40s and see how it goes.

Thanks for the help.

_Terry

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
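
Since the thread establishes that a start fails whenever the plugin's status check returns a non-zero exit status, one way to chase an intermittent failure might be to run that status check repeatedly by hand and log only the runs that fail, along with how long each took. The sketch below is an illustration, not part of the stonith tooling: `probe_status` is a hypothetical helper name, and the command passed to it is whatever status invocation applies to your plugin (for example, the `vsphere status` call quoted above). It assumes a `date` that supports `+%s`, which is common but not guaranteed everywhere.

```shell
#!/bin/sh
# probe_status (hypothetical helper): run a command repeatedly and report
# every run that exits non-zero, with the elapsed wall-clock seconds.
# Usage: probe_status COUNT DELAY_SECONDS COMMAND [ARGS...]
probe_status() {
    count=$1
    delay=$2
    shift 2
    failures=0
    i=1
    while [ "$i" -le "$count" ]; do
        start=$(date +%s)          # assumes date supports +%s
        "$@" >/dev/null 2>&1       # run the status check, discard its output
        rc=$?
        end=$(date +%s)
        if [ "$rc" -ne 0 ]; then
            failures=$((failures + 1))
            echo "run $i: exit status $rc after $((end - start))s"
        fi
        i=$((i + 1))
        # Pause between runs, but not after the last one.
        if [ "$i" -le "$count" ]; then
            sleep "$delay"
        fi
    done
    echo "$failures of $count runs failed"
    # Return success only if every run succeeded.
    [ "$failures" -eq 0 ]
}
```

Called as `probe_status 100 5 <status command>`, this would run the check 100 times, 5 seconds apart. If the failing runs cluster near the configured timeout (20s here), that would point toward the timeout explanation rather than a connection limit.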
