Hi,

We're having some issues with a particular hypervisor that is oversubscribed (CPU-wise), where we run SLES 11 SP4 guests. I had to increase many timeouts on the cluster to cope with this:
- Corosync's token timeout (from the default of 5 seconds to 30 seconds)
- SBD's watchdog and msgwait timeouts (from 15/30 to 30/60 respectively)
- Pacemaker's resource-monitoring timeouts

I know the consequence of all this will be *slow reaction times*, but it's all I can do in the meantime. However, when the hypervisor is at 100% CPU utilization I still get these messages:

sbd: :WARN: Latency: 4 exceeded threshold 3 on disk /dev/mapper/clustersbd
logd: WARN: G_CH_prepare_int: working on IPC channel took 220 ms (> 100 ms)
sbd: WARN: Pacemaker state outdated (age: 4)
sbd: info: Pacemaker health check: OK
sbd: WARN: Latency: 4 exceeded threshold 3 on disk /dev/mapper/clustersbd
logd: WARN: G_CH_check_int: working on IPC channel took 150 ms (> 100 ms)
sbd: WARN: Latency: 4 exceeded threshold 3 on disk /dev/mapper/clustersbd
sbd: WARN: Servant for /dev/mapper/clustersbd outdated (age: 5)
sbd: WARN: Majority of devices lost - surviving on pacemaker

Is this latency threshold configurable? The messages keep mentioning "threshold 3". Is that 3 seconds? How does it relate to the following parameters?

==Dumping header on disk /dev/mapper/clustersbd
Header version     : 2.1
UUID               : 54597871-2392-475f-ba2d-71bdf92c36b5
Number of slots    : 255
Sector size        : 512
Timeout (watchdog) : 30
Timeout (allocate) : 2
Timeout (loop)     : 1
Timeout (msgwait)  : 60
==Header on disk /dev/mapper/clustersbd is dumped

I'm using the -P option with sbd, so I know it will not fence the system as long as the node's health is OK (as reported by Pacemaker). I'd still like to find out whether the latency threshold is configurable, or whether these warnings are safe to ignore.

Thanks!

Regards,
Jorge