Andrew Beekhof wrote:
> On Fri, Jun 26, 2009 at 3:07 PM, Jan Kalcic<[email protected]> wrote:
>   
>> Andrew Beekhof wrote:
>>     
>>> On Fri, Jun 26, 2009 at 10:55 AM, Jan<[email protected]> wrote:
>>>
>>>       
>>>> Hi,
>>>>
>>>> a very annoying issue with stonith using the external/riloe plugin (which
>>>> I have never used before). Whenever I try to simulate a split-brain
>>>> condition (using iptables) in order to test stonith, both nodes kill each
>>>> other. Not exactly what I expected.
>>>>
>>>>         
>>> Sure it is
>>>
>>> [snip]
>>>
>>>
>>>       
>>>>        <nvpair id="nvpair-56c027e0-80c8-49a7-9cf1-1af593a9391f"
>>>> name="no-quorum-policy"
>>>> value="ignore"/>
>>>>
>>>>         
>>> With that option, this is exactly what I'd expect.
>>>
>>> Have a read of:
>>>    http://ourobengr.com/ha
>>>
>>>       
>> From what I understood, perhaps wrongly, that should be the right option
>> for a two-node cluster, where a single node on its own can never have
>> quorum; that's why it should be "ignore". Is this wrong?
>>
>> I had already taken a quick look at that document (I love that picture
>> btw) but not as deeply as now. I am going to review my timeouts for sure.
>> Anyway, I don't find any hint there about the quorum setting. Should it be
>> something other than "ignore"?
>>     
>
> No, that's the right value for a two-node cluster.
> But that value can also lead to the behavior you described.
>
> Though normally one side shoots the other before it can shoot back.
>   
This does not happen. The reason could be that, using iLO, the node is not
actually shot but shut down gracefully. Because of this, the shot node has
plenty of time to shoot the other side back. Does that make sense?

In this case I would need to stonith the other side not gracefully but
with a hard power-off, like pulling the power cable, but it seems this is
not available with the riloe plugin, is it?
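
To make clear what I mean by the shot node having time to shoot back, here
is a rough toy model in Python. It is only a sketch of my reasoning, not how
stonithd or the riloe plugin actually work: the function name, the detection
times and the power-off delays are all made up for illustration. Each node
fires as soon as it detects the split, and the target stays alive for
power_off_delay seconds after being shot.

```python
def fencing_outcome(detect, power_off_delay):
    """Toy model of a two-node fencing race (illustrative assumption only).

    detect: dict mapping node name -> time (s) at which it detects the split
            and issues its fence request.
    power_off_delay: seconds the target keeps running after being shot.
    Returns the set of nodes that end up powered off.
    """
    # The node that detects the split first always gets its shot off.
    first, second = sorted(detect, key=detect.get)
    dead = {second}
    # The second node goes down at detect[first] + power_off_delay.
    # If it detects the split before that moment, it shoots back.
    if detect[second] < detect[first] + power_off_delay:
        dead.add(first)
    return dead


# Hard power-off: the target is gone almost immediately, one node survives.
print(fencing_outcome({"node1": 0.0, "node2": 0.5}, power_off_delay=0.1))

# Graceful shutdown: the target lives long enough to shoot back, both die.
print(fencing_outcome({"node1": 0.0, "node2": 0.5}, power_off_delay=30.0))
```

With the short delay only node2 goes down; with the long delay both nodes
end up powered off, which is what I see here.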

Thanks,
Jan
>> My issue isn't exactly the deathmatch described there, first of all
>> because the openais daemon is disabled at boot and secondly because the
>> stonith policy is poweroff. Rather, it is a strange situation where both
>> nodes kill themselves and both shut down.
>>     
>
> They'd both be killing each other.
>
>   
>> I wonder if it is a timeout issue. My timeout here for the stonith
>> resource is 15s. Does it mean that when the first node sends a stonith
>> to the second one and that node can't shut itself down within 15s, it
>> stoniths the first node?
>>     
>
> No.  This is unrelated
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
>   
