Re: [ClusterLabs] Wtrlt: Antw: Re: Antw: Re: how important would you consider to have two independent fencing device for each node ?

2017-04-21 Thread Klaus Wenninger
On 04/21/2017 10:10 AM, Kristoffer Grönlund wrote:
> Ken Gaillot  writes:
>
>>>> I think it works differently: One task periodically reads its mailbox slot
>>>> for commands, and once a command was read, it's executed immediately. Only
>>>> if the read task hangs for a long time does the watchdog itself trigger a
>>>> reset (as SBD seems dead). So the delay is actually the sum of "write
>>>> delay", "read delay" and "command execution".
>> I think you're right when sbd uses shared storage, but there is a
>> watchdog-only configuration that I believe Digimer was referring to.
>>
>> With watchdog-only, the cluster will wait for the value of the
>> stonith-watchdog-timeout property before considering the fencing successful.
> I think there are some important distinctions to make, to clarify what
> SBD is and how it works:
>
> * The original SBD model uses shared storage as its fencing mechanism
>   (thus the name Shared-storage based death) - when talking about
>   watchdog-only SBD, a new mode only introduced in a fork of the SBD
>   project, it would probably help avoid confusion to be explicit about
>   that.
>
> * Watchdog-only SBD relies on quorum to avoid split-brain or fence
>   loops, and thus requires at least three nodes or an additional qdevice
>   node. This is my understanding, correct me if I am wrong. Also, this
>   disqualifies watchdog-sbd from any of Digimer's setups since they are
>   2-node only, so that's probably something to be aware of in this
>   discussion. ;)

There is no way around the '2 physical devices are not enough
for an HA-cluster' paradigm ;-)
And watchdog-only SBD isn't a way around it either, although it helps you
make three nodes work as a reliable cluster without any additional
physical device (stonith device, shared disk, ...).
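
For illustration, a rough sketch of what such a diskless (watchdog-only)
setup could look like on three nodes -- the variable names and values below
are indicative only and may differ a bit between distributions:

    # corosync.conf: three full votes, so plain quorum is sufficient
    quorum {
        provider: corosync_votequorum
    }

    # /etc/sysconfig/sbd: leaving SBD_DEVICE unset makes sbd run watchdog-only
    SBD_WATCHDOG_DEV=/dev/watchdog
    SBD_WATCHDOG_TIMEOUT=5

    # sbd must be enabled so the cluster stack pulls it in at start
    systemctl enable sbd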

>
> * The watchdog fencing in SBD is not the primary fence mechanism when
>   shared storage is available. In fact, it is an optional although
>   strongly recommended component. [1]

As strongly recommended as fencing in general I would say ;-)
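
Just as a pointer, checking what actually backs /dev/watchdog is cheap; the
module names below are only examples (hpwdt on HP/iLO boxes, iTCO_wdt on many
Intel boards, softdog merely as a software fallback for testing):

    modprobe hpwdt            # or e.g. iTCO_wdt; softdog only for testing
    wdctl /dev/watchdog       # show which driver provides the watchdog
    # then point sbd at it in /etc/sysconfig/sbd:
    #   SBD_WATCHDOG_DEV=/dev/watchdog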

Regards,
Klaus

> [1]: We (as in SUSE) require use of a watchdog for supported
> configurations, but technically it is optional.
>


-- 
Klaus Wenninger

Senior Software Engineer, EMEA ENG Openstack Infrastructure

Red Hat

kwenn...@redhat.com   




Re: [ClusterLabs] Wtrlt: Antw: Re: Antw: Re: how important would you consider to have two independent fencing device for each node ?

2017-04-21 Thread Kristoffer Grönlund
Ken Gaillot  writes:

>>> I think it works differently: One task periodically reads its mailbox slot
>>> for commands, and once a command was read, it's executed immediately. Only
>>> if the read task hangs for a long time does the watchdog itself trigger a
>>> reset (as SBD seems dead). So the delay is actually the sum of "write
>>> delay", "read delay" and "command execution".
>
> I think you're right when sbd uses shared storage, but there is a
> watchdog-only configuration that I believe Digimer was referring to.
>
> With watchdog-only, the cluster will wait for the value of the
> stonith-watchdog-timeout property before considering the fencing successful.

I think there are some important distinctions to make, to clarify what
SBD is and how it works:

* The original SBD model uses shared storage as its fencing mechanism
  (thus the name Shared-storage based death) - when talking about
  watchdog-only SBD, a new mode only introduced in a fork of the SBD
  project, it would probably help avoid confusion to be explicit about
  that.

* Watchdog-only SBD relies on quorum to avoid split-brain or fence
  loops, and thus requires at least three nodes or an additional qdevice
  node. This is my understanding, correct me if I am wrong. Also, this
  disqualifies watchdog-sbd from any of Digimer's setups since they are
  2-node only, so that's probably something to be aware of in this
  discussion. ;)

* The watchdog fencing in SBD is not the primary fence mechanism when
  shared storage is available. In fact, it is an optional although
  strongly recommended component. [1]

[1]: We (as in SUSE) require use of a watchdog for supported
configurations, but technically it is optional.
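
For the record, a minimal shared-storage SBD setup (the original model from
the first point above) might look roughly like this -- the device path and
timeouts are placeholders, and the crm shell syntax is just one way to do it:

    # initialize the SBD "mailbox" header on a small shared LUN or partition
    # (-1 = watchdog timeout, -4 = msgwait timeout, usually ~2x the watchdog)
    sbd -d /dev/disk/by-id/EXAMPLE-DISK -1 15 -4 30 create
    sbd -d /dev/disk/by-id/EXAMPLE-DISK dump      # verify header and timeouts

    # /etc/sysconfig/sbd
    #   SBD_DEVICE=/dev/disk/by-id/EXAMPLE-DISK
    #   SBD_WATCHDOG_DEV=/dev/watchdog

    # register the fencing resource in Pacemaker
    crm configure primitive stonith-sbd stonith:external/sbd
    crm configure property stonith-enabled=true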

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



Re: [ClusterLabs] Wtrlt: Antw: Re: Antw: Re: how important would you consider to have two independent fencing device for each node ?

2017-04-20 Thread Ken Gaillot
On 04/20/2017 01:43 AM, Ulrich Windl wrote:
> Should have gone to the list...
> 
> Digimer  wrote on 19.04.2017 at 17:20 in message
>> <600637f1-fef8-0a3d-821c-7aecfa398...@alteeve.ca>:
>>> On 19/04/17 02:38 AM, Ulrich Windl wrote:
>>> Digimer  wrote on 18.04.2017 at 19:08 in message
>>> <26e49390-b384-b46e-4965-eba5bfe59...@alteeve.ca>:
> On 18/04/17 11:07 AM, Lentes, Bernd wrote:
>> Hi,
>>
>> I'm currently establishing a two-node cluster. Each node is an HP server
>> with an iLO card.
>> I can fence both of them, it's working fine.
>> But what if the iLO does not work correctly? Then fencing is not possible.
>
> Correct. If you only have iLO fencing, then the cluster would hang
> (failed fencing is *not* an indication of node death).
>
>> I also have a switched PDU from APC. Each server has two power supplies.
>> Currently one is connected to the normal power equipment, the other to the
>> UPS.
>> As a sort of redundancy, if the UPS does not work properly.
>
> That's a fine setup.
>
>> If I want to use the switched PDU as a fencing device, I will lose the
>> redundancy of two independent power sources, because then I have to connect
>> both power supplies to the UPS.
>> I wouldn't like to do that.
>
> Not if you have two switched PDUs. This is what we do in our Anvil!
> systems... One PDU feeds the first PSU in each node and the second PDU
> feeds the second PSUs. Ideally both PDUs are fed by UPSes, but that's
> not as important. One PDU on a UPS and one PDU directly from mains will
> work.
>
>> How important would you consider it to have two independent fencing devices
>> for each node? I can't buy another PDU; currently we are very poor.
>
> Depends entirely on your tolerance for interruption. *I* answer that
> with "extremely important". However, most clusters out there have only
> IPMI-based fencing, so they would obviously say "not so important".
>
>> Is there another way to create a second fencing device, independent from
>> the iLO card?
>>
>> Thanks.
>
> Sure, SBD would work. I've never seen IPMI not have a watchdog timer
> (and iLO is IPMI++), as one example. It's slow, and needs shared
> storage, but a small box somewhere running a small tgtd or iscsid should
> do the trick (note that I have never used SBD myself...).

>>> Slow is relative: If it takes 3 seconds from issuing the reset command
>>> until the node is dead, it's fast enough for most cases. Even a switched
>>> PDU has some delays: The command has to be processed, the relay may "stick"
>>> a short moment, the power supply's capacitors have to discharge (if you
>>> have two power supplies, both need to)... And iLOs don't really like to be
>>> powered off.
>>>
>>> Ulrich
>>>
>>> The way I understand SBD, and correct me if I am wrong, recovery won't
>>> begin until sometime after the watchdog timer kicks. If the watchdog
>>> timer is 60 seconds, then your cluster will hang for >60 seconds (plus
>>> fence delays, etc).
>>
>> I think it works differently: One task periodically reads its mailbox slot
>> for commands, and once a command was read, it's executed immediately. Only
>> if the read task hangs for a long time does the watchdog itself trigger a
>> reset (as SBD seems dead). So the delay is actually the sum of "write
>> delay", "read delay" and "command execution".

I think you're right when sbd uses shared storage, but there is a
watchdog-only configuration that I believe Digimer was referring to.

With watchdog-only, the cluster will wait for the value of the
stonith-watchdog-timeout property before considering the fencing successful.
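
To put rough numbers on it: that wait is governed by the
stonith-watchdog-timeout cluster property, which is usually recommended to be
at least twice the sbd watchdog timeout. A sketch (values illustrative, pcs
syntax; crm works analogously):

    # /etc/sysconfig/sbd (no SBD_DEVICE configured -> watchdog-only mode)
    #   SBD_WATCHDOG_TIMEOUT=5

    # Pacemaker waits this long before treating the lost node as self-fenced
    pcs property set stonith-watchdog-timeout=10s
    pcs property set stonith-enabled=true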

>> The manual page (SLES 11 SP4) states: "Set watchdog timeout to N seconds.
>> This depends mostly on your storage latency; the majority of devices must
>> be successfully read within this time, or else the node will self-fence."
>> and "If a watchdog is used together with the "sbd" as is strongly
>> recommended, the watchdog is activated at initial start of the sbd daemon.
>> The watchdog is refreshed every time the majority of SBD devices has been
>> successfully read. Using a watchdog provides additional protection against
>> "sbd" crashing."
>>
>> Final remark: I think the developers of sbd were under drugs (or never saw
>> a UNIX program before) when designing the options. For example: "-W  Enable
>> or disable use of the system watchdog to protect against the sbd processes
>> failing and the node being left in an undefined state. Specify this once to
>> enable, twice to disable." (MHO)
>>
>> Regards,
>> Ulrich
>>
>>>
>>> IPMI and PDUs can confirm fencing of the peer in ~5 seconds (plus fence
>>> delays).
>>>
>>> -- 
>>> Digimer
>>> Papers and