Dejan Muhamedagic wrote:
> On Thu, Aug 20, 2009 at 03:16:56PM +0200, Andrew Beekhof wrote:
>> On Thu, Aug 20, 2009 at 2:55 PM, Terry L. Inzauro<[email protected]> wrote:
>>> Dejan Muhamedagic wrote:
>>>> Hi,
>>>>
>>>> On Thu, Aug 20, 2009 at 11:01:43AM +0200, Thomas Glanzmann wrote:
>>>>> Hello Terry,
>>>>>
>>>>>> What would cause the stonith 'start' operation to fail after it
>>>>>> initially had succeeded?
>>>>> If my understanding is correct (I wrote a stonith agent for vsphere
>>>>> yesterday), then it runs the status command of the stonith agent and
>>>>> looks at the exit status, like this:
>>>>>
>>>>> (ha-01) [~] VI_SERVER=esx-03.glanzmann.de VI_USERNAME=root /usr/lib/stonith/plugins/external/vsphere status; echo $?
>>>>> Enter password:
>>>>> 0
>>>> Right. The start operation includes a status. If the status
>>>> operation fails, the start obviously fails too.
>>>>
>>>> Thanks,
>>>>
>>>> Dejan
>>>>
>>>>>         Thomas
>>>>> _______________________________________________
>>>
>>> Ok, I understand that, but why would it fail intermittently if it initially
>>> succeeds?  These machines are not heavily loaded and are by no means slow.
>> Could be a timing issue. Some boxes only allow 1 simultaneous connection.
> 
> Oh, right, of course, I forgot about that, though in this case
> (external/ssh) the device allows multiple simultaneous
> connections. But it could also be due to timeouts.
> 
> Thanks,
> 
> Dejan
> 
>>> I guess the better question would be: how do I track down the culprit?
>>> Obviously, if a monitor or start command fails on the stonith agent, it
>>> will cause a "stonith reboot" operation of one or both of the nodes, which
>>> shouldn't happen unless there's a definite reason to do so.

Ok. I am indeed using 'external/ssh' as the stonith device. I figured it was
better than nothing, as I do not have access to a hardware stonith device.
In your opinion, is using the 'external/ssh' plugin 'better' than NOT using
a stonith plugin at all?


In the meantime, I'll bump up the timeouts from 20s to 40s and see how it goes.
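
To check whether the status operation itself is what fails or runs long, one option would be to run it by hand in a loop, outside the cluster, and record each exit status and duration. A rough sketch only: probe_status is a hypothetical helper name, not part of any stonith tooling, and it assumes a `date` that supports `+%s`. The plugin path in the usage comment mirrors the vsphere one quoted above; the `hostlist` variable and the node names are assumptions about how external/ssh is configured here.

```shell
#!/bin/sh
# probe_status COUNT CMD [ARGS...]   (hypothetical helper)
# Runs CMD COUNT times, printing the exit status and elapsed whole
# seconds of each run, then a summary line. Returns non-zero if any
# run failed.
probe_status() {
    count=$1
    shift
    i=1
    failures=0
    while [ "$i" -le "$count" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1
        rc=$?
        end=$(date +%s)
        echo "run=$i rc=$rc elapsed=$((end - start))s"
        if [ "$rc" -ne 0 ]; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "failures=$failures/$count"
    [ "$failures" -eq 0 ]
}

# Example use (assumed parameter name and placeholder node names):
#   export hostlist="node1 node2"
#   probe_status 50 /usr/lib/stonith/plugins/external/ssh status
```

Runs that fail, or whose elapsed time gets close to the configured timeout, would point at the status check rather than at the cluster.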



Thanks for the help.


_Terry


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
