Dejan Muhamedagic wrote:
> Hi,
>
> On Thu, Aug 20, 2009 at 11:01:43AM +0200, Thomas Glanzmann wrote:
>> Hello Terry,
>>
>>> What would cause the stonith 'start' operation to fail after it
>>> initially had succeeded?
>>
>> If my understanding is correct (I wrote a stonith agent for vsphere
>> yesterday), then it runs the status command of the stonith agent and
>> looks at the exit status, like this:
>>
>> (ha-01) [~] VI_SERVER=esx-03.glanzmann.de VI_USERNAME=root
>> /usr/lib/stonith/plugins/external/vsphere status; echo $?
>> Enter password:
>> 0
>
> Right. The start operation includes a status. If the status
> operation fails, the start obviously fails too.
>
> Thanks,
>
> Dejan
>
>> Thomas

Ok, I understand that, but why would it fail intermittently if it initially succeeds? These machines are not heavily loaded and are by no means slow.

I guess the better question would be: how do I track down the culprit? Obviously, if a monitor or start operation fails on the stonith agent, it will cause a "stonith reboot" of one or both of the nodes, which shouldn't happen unless there's a definite reason for it.

_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
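
One way to chase an intermittent failure like this might be to run the agent's status command by hand in a loop and record every non-zero exit status with a timestamp, so the failures can be lined up against the cluster logs. The following is only a rough sketch: the function name poll_status, its arguments, and the log path are made up for illustration, and it assumes the plugin can run without prompting for a password (the vsphere example quoted above does prompt, so it would need its credentials supplied some other way).

```shell
#!/bin/sh
# poll_status COUNT INTERVAL LOGFILE COMMAND [ARGS...]
# (hypothetical helper, not part of any stonith package)
# Runs COMMAND up to COUNT times, sleeping INTERVAL seconds between runs.
# Each run that exits non-zero is appended to LOGFILE with a timestamp,
# the run number, the exit status, and the captured output.
# Prints the number of failed runs; returns 1 if any run failed, else 0.
poll_status() {
    count=$1
    interval=$2
    logfile=$3
    shift 3

    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        i=$((i + 1))
        # Capture stdout and stderr so the reason for a failure is kept.
        output=$("$@" 2>&1)
        rc=$?
        if [ "$rc" -ne 0 ]; then
            failures=$((failures + 1))
            printf '%s run=%s rc=%s output=%s\n' \
                "$(date '+%Y-%m-%d %H:%M:%S')" "$i" "$rc" "$output" >> "$logfile"
        fi
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done

    echo "$failures"
    if [ "$failures" -gt 0 ]; then
        return 1
    fi
    return 0
}

# Example (assumed values: 1000 runs, 10 seconds apart, log in /tmp):
#   poll_status 1000 10 /tmp/stonith-status.log \
#       env VI_SERVER=esx-03.glanzmann.de VI_USERNAME=root \
#       /usr/lib/stonith/plugins/external/vsphere status
```

If the log stays empty while the cluster still reports start or monitor failures, that would suggest the problem is in how the cluster invokes the agent (environment, timeouts) rather than in the status check itself.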
