Hello, and thank you for your answer!

So should I just drop the "monitor" operation altogether? In my case that
would mean deleting the whole "op" row:
"op monitor interval=20 timeout=60 on-fail=restart"
Am I correct?

On Mon, Oct 7, 2019 at 2:36 PM Ulrich Windl <[email protected]> wrote:

> Hi!
>
> I can't remember the exact reason, but it was probably exactly this that
> made us remove any monitor operation from IPaddr2 (back in 2011). So far no
> problems doing so ;-)
>
> Regards,
> Ulrich
> P.S.: Of course it would be nice if the real issue could be found and
> fixed.
>
> >>> Donat Zenichev <[email protected]> wrote on 20.09.2019 at 14:43
> in message
> <canlwqcmvjcatzhkcjsnoxljghtyflvbp3fd_d4nxrnqpm_j...@mail.gmail.com>:
> > Hi there!
> >
> > I've got a tricky case where my IPaddr2 resource fails to start for
> > literally no reason:
> > "IPSHARED_monitor_20000 on my-master-1 'not running' (7): call=11,
> > status=complete, exitreason='',
> > last-rc-change='Wed Sep 4 06:08:07 2019', queued=0ms, exec=0ms"
> >
> > The IPaddr2 resource managed to fix itself and continued to work
> > properly after that.
> >
> > What I did afterwards was set 'failure-timeout=900' (seconds) on my
> > IPaddr2 resource, to keep the resource from running on a node where it
> > fails. I also set 'migration-threshold=2', so IPaddr2 can fail only
> > 2 times and then moves to the Slave side. Meanwhile the Master gets
> > banned for 900 seconds.
> >
> > After 900 seconds the cluster tries to start IPaddr2 on the Master
> > again; if that works, the fail counter gets cleared.
> > That's how I avoid the error I mentioned above.
> >
> > I tried hard to understand why this can happen, but I still have no
> > idea. Any clue how to find the reason?
> > And another question: can snapshotting of the VMs have any impact on
> > this?
> >
> > And my configuration:
> > -------------------------------
> > node 000001: my-master-1
> > node 000002: my-master-2
> >
> > primitive IPSHARED IPaddr2 \
> >     params ip=10.10.10.5 nic=eth0 cidr_netmask=24 \
> >     meta migration-threshold=2 failure-timeout=900 target-role=Started \
> >     op monitor interval=20 timeout=60 on-fail=restart
> >
> > location PREFER_MASTER IPSHARED 100: my-master-1
> >
> > property cib-bootstrap-options: \
> >     have-watchdog=false \
> >     dc-version=1.1.18-2b07d5c5a9 \
> >     cluster-infrastructure=corosync \
> >     cluster-name=wall \
> >     cluster-recheck-interval=5s \
> >     start-failure-is-fatal=false \
> >     stonith-enabled=false \
> >     no-quorum-policy=ignore \
> >     last-lrm-refresh=1554982967
> > -------------------------------
> >
> > Thanks in advance!
> >
> > --
> > --
> > BR, Donat Zenichev
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>

--
Best regards,
Donat Zenichev
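
P.S.: To make sure I understand the suggestion, here is a sketch of what my
primitive would look like with the "op" row removed. Everything else is
copied unchanged from the configuration quoted above, and this is only my
reading of the advice, not a verified setup:

-------------------------------
primitive IPSHARED IPaddr2 \
    params ip=10.10.10.5 nic=eth0 cidr_netmask=24 \
    meta migration-threshold=2 failure-timeout=900 target-role=Started
-------------------------------

As far as I understand, without a monitor operation the cluster would no
longer notice by itself if the address disappears from eth0, so
migration-threshold and failure-timeout would then only matter for start
and stop failures.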
