> ... right. I hadn't noticed that before.
>
> So what's happening is that, in pacemaker's fencing/remote.c, the
> stonith-timeout specified is divided up in 10% for _querying_ the list
> of nodes a given stonith device can retrieve, and 90% for then
> performing an actual operation. (Compare initiate_remote_stonith_op()
> and call_remote_stonith())
>
> I think this is counter-intuitive, to say the least.
>
> In your initial case, it just so happens that 100s * 90% obviously
> exactly matches your sbd msgwait, so an increase of +10s just wasn't
> enough.
>
> Regards,
> Lars

Thanks Lars. Interesting.

It would seem more intuitive for remote.c to add 10% to the specified
value so that its querying overhead is accounted for.

Now that I know about the "query tax", I will verify that stonith-timeout
is set to a value > (sbd msgwait * 110%).

Hopefully that little tidbit will make it into the sbd wiki at some point.

Take care,
Craig
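
For anyone who wants to check their own numbers, here is a rough sketch of the arithmetic behind the 10%/90% split described above. It is not pacemaker code: the function names and the `margin_s` parameter are made up for illustration, and it assumes the split is exactly 10/90 with timeouts in whole seconds. Note that for the 90% share to exceed msgwait, stonith-timeout has to be above msgwait / 0.9 (roughly 111%), which is slightly more than 110%.

```python
# Illustrative only: models the 10% query / 90% operation split
# described in this thread. Names here are hypothetical, not pacemaker API.

def operation_budget(stonith_timeout_s):
    """Seconds left for the actual fencing operation (assumed 90% share)."""
    return stonith_timeout_s * 9 / 10


def min_stonith_timeout(msgwait_s, margin_s=0):
    """Smallest whole-second stonith-timeout whose 90% share is strictly
    greater than msgwait plus an optional safety margin (an assumption;
    choose the margin to suit your own setup)."""
    needed = msgwait_s + margin_s
    # Smallest integer T with T * 9 > needed * 10
    return needed * 10 // 9 + 1


if __name__ == "__main__":
    # The case from the quoted message: 100s * 90% equals a 90s msgwait,
    # leaving no headroom at all.
    print(operation_budget(100))
    print(min_stonith_timeout(90))
    print(min_stonith_timeout(90, margin_s=10))
```

The result is only an arithmetic lower bound; in practice it is probably wise to leave some headroom on top of it via `margin_s`.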