Hi!

IMHO: The correct time to wait lies in an interval bounded by these two values:
1: The longest I/O delay that may occur during normal operation and must never 
trigger fencing
2: The maximum time you are willing to wait for fencing to occur

Many people think making 1 close to zero and 2 as small as possible is the best 
solution.

But imagine one of your SBD disks has some read problem, and the operation has 
to be retried a few times. Or think about upgrading your disk firmware 
"online", etc.: Usually I/Os are stopped for a short time (typically less than 
one minute).
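
To make the interval idea concrete, here is a minimal sketch in shell. The 
function name and the numbers are made up for illustration. It assumes the 
watchdog timeout is the value that has to stay above bound 1, and that msgwait 
is twice the watchdog timeout (a commonly quoted rule of thumb, not something 
from this thread) and has to stay below bound 2.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# $1 = longest I/O stall in seconds that must never trigger fencing (bound 1)
# $2 = longest time in seconds you accept to wait for fencing       (bound 2)
# Assumption: msgwait = 2 * watchdog timeout (rule of thumb).
suggest_sbd_timeouts() {
    io_stall=$1
    max_wait=$2
    watchdog=$((io_stall + 1))   # smallest whole second strictly above bound 1
    msgwait=$((watchdog * 2))
    if [ "$msgwait" -gt "$max_wait" ]; then
        # The interval is empty: the two requirements contradict each other.
        echo "no valid timeout: msgwait=$msgwait exceeds limit $max_wait" >&2
        return 1
    fi
    echo "watchdog=$watchdog msgwait=$msgwait"
}

# I/O may stall up to 60 s; fencing may take up to 180 s.
suggest_sbd_timeouts 60 180
```

If the function fails, the two bounds cannot both be met, and you have to 
decide which of them to relax.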


So once you have determined the timeout value for your environment, you can 
configure SBD.  We have a rather long timeout, so SBD fencing can take some 
time. That means fencing usually takes place within a few seconds, but the 
cluster waits the longer time to make sure the node must have processed the SBD 
fencing command. (Fencing is not confirmed at the SBD level: You send the 
fencing command on SBD, then you expect that every node reads the command after 
some delay and thus performs it.)

Unfortunately the SBD syntax is a real mess, and there is no manual page 
(AFAIK) for SBD.
You can change the SBD parameters (on disk) online, but for them to take 
effect, the SBD daemon has to be restarted.
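
As a rough sketch of what setting the on-disk timeouts looks like: the helper 
below only prints the commands (dry run) and does not touch the disk. The 
device path is the one from the quoted message, the timeout values are purely 
illustrative, and the function name is made up. As far as I know, "-1" sets the 
watchdog timeout and "-4" the msgwait timeout (both in seconds), and "create" 
rewrites the SBD header, so check with "dump" before and after.

```shell
#!/bin/sh
# Dry run: print the commands instead of executing them.
# Hypothetical helper; the values passed below are examples, not a recommendation.
print_sbd_commands() {
    dev=$1
    watchdog=$2
    msgwait=$3
    # -1 = watchdog timeout, -4 = msgwait timeout (seconds)
    echo "sbd -d $dev -1 $watchdog -4 $msgwait create"
    # Show the header that is on disk, to verify the values
    echo "sbd -d $dev dump"
}

print_sbd_commands /dev/mapper/mpathe 61 122
```

How the SBD daemon is restarted afterwards depends on your distribution and 
cluster stack, so that step is not shown here.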

I hope this helps.

Regards,
Ulrich

>>> Muhammad Sharfuddin <m.sharfud...@nds.com.pk> wrote on 15.01.2015 at 
>>> 16:33 in
message <54b7ddd2.3000...@nds.com.pk>:
> I have to put this 2 node active/passive cluster in production very soon 
> and I have tested the resource migration
> works perfectly in case of the node running the resource goes 
> down(abruptly/forcefully).
> 
> I have always read and heard to increase msgwait and watchdog timeout 
> when sbd is a multipath disk, but in my case
> I have just created the disk via
>      sbd -d /dev/mapper/mpathe create
> 
> and I have following resource for sbd
>      primitive sbd_stonith stonith:external/sbd \
>              op monitor interval="3000" timeout="120" start-delay="21" \
>              op start interval="0" timeout="120" \
>              op stop interval="0" timeout="120" \
>              params sbd_device="/dev/mapper/mpathe"
> 
> as of now I am quite satisfied, but should I increase the msgwait and 
> watchdog timeouts ?
> 
> also I am using the start-delay=21 for "op monitor interval" should I 
> also use the start-delay=11 for "op start interval"
> 
> Please recommend
> 
> -- 
> Regards,
> 
> Muhammad Sharfuddin
> Cell: +92-3332144823 | UAN: +92(21) 111-111-142 ext: 113 | NDS.COM.PK 
> <http://www.nds.com.pk>
> 
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org 
> http://lists.linux-ha.org/mailman/listinfo/linux-ha 
> See also: http://linux-ha.org/ReportingProblems 




