Re: [ClusterLabs] How does failure-timeout works, will the resource not be scheduled when setting too short?
On Sun, 2018-05-20 at 10:19 +0800, lkxjtu wrote:
> I have two pacemaker resources. We call them A and B. Because of
> environmental reasons, their start methods and monitor methods always
> return failure (OCF_ERR_GENERIC). The following are their
> configurations (the cluster property start-failure-is-fatal is false):
>
> primitive A A \
>     op monitor interval=20 timeout=120 \
>     op stop interval=0 timeout=120 on-fail=restart \
>     op start interval=0 timeout=240 on-fail=restart \
>     meta failure-timeout=60s
> primitive B B \
>     op monitor interval=20 timeout=120 \
>     op stop interval=0 timeout=120 on-fail=restart \
>     op start interval=0 timeout=240 on-fail=restart \
>     meta failure-timeout=60s
> clone A_cl A
> clone B_cl B
>
> Their methods take different amounts of time:
> A: start = 60s, monitor < 1s, stop = 80s
> B: start < 1s, monitor < 1s, stop < 1s
>
> Resource A is scheduled normally: it keeps starting and stopping. But
> for resource B, only the recurring monitor fails, over and over, with
> no start or stop. And no fail-count is shown for B in "crm status -f".
>
> Either of two changes solves the problem of B not being scheduled:
> 1. Set failure-timeout of B from 60s to 600s
> 2. Modify the OCF agent of A so that its stop method returns as soon
>    as possible
>
> I tested it several times, and the results were the same. Why is the
> resource not scheduled when failure-timeout is set too short? And what
> does it have to do with the time-consuming stop of another resource?
> Is this a bug?
>
> My pacemaker version is 1.1.16. Any suggestion is welcome. Thank you!
>
> James
> 2018-05-20

That behavior is unexpected. Can you share logs?

--
Ken Gaillot
___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

[ClusterLabs] How does failure-timeout works, will the resource not be scheduled when setting too short?
I have two pacemaker resources. We call them A and B. Because of
environmental reasons, their start methods and monitor methods always
return failure (OCF_ERR_GENERIC). The following are their
configurations (the cluster property start-failure-is-fatal is false):

primitive A A \
    op monitor interval=20 timeout=120 \
    op stop interval=0 timeout=120 on-fail=restart \
    op start interval=0 timeout=240 on-fail=restart \
    meta failure-timeout=60s
primitive B B \
    op monitor interval=20 timeout=120 \
    op stop interval=0 timeout=120 on-fail=restart \
    op start interval=0 timeout=240 on-fail=restart \
    meta failure-timeout=60s
clone A_cl A
clone B_cl B

Their methods take different amounts of time:
A: start = 60s, monitor < 1s, stop = 80s
B: start < 1s, monitor < 1s, stop < 1s

Resource A is scheduled normally: it keeps starting and stopping. But
for resource B, only the recurring monitor fails, over and over, with
no start or stop. And no fail-count is shown for B in "crm status -f".

Either of two changes solves the problem of B not being scheduled:
1. Set failure-timeout of B from 60s to 600s
2. Modify the OCF agent of A so that its stop method returns as soon
   as possible

I tested it several times, and the results were the same. Why is the
resource not scheduled when failure-timeout is set too short? And what
does it have to do with the time-consuming stop of another resource?
Is this a bug?

My pacemaker version is 1.1.16. Any suggestion is welcome. Thank you!

James
2018-05-20
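
For reference, the timings in the report can be laid side by side. The small Python sketch below only restates the numbers from the post and checks, for the reported setup and for each of the two workarounds, whether B's failure-timeout is shorter than the duration of A's stop. It is not a diagnosis and says nothing confirmed about how Pacemaker 1.1.16 behaves internally. The 1-second value for a "fast" stop of A is an assumption (the post only says "as soon as possible"), and the function and variable names are made up for illustration.

```python
# Timings taken from the report, in seconds.
A_STOP_SECONDS = 80
# Assumption: the post only says A's stop should return "as soon as possible".
FAST_STOP_SECONDS = 1

# The reported setup plus the two workarounds described in the post.
scenarios = {
    "as reported": {"failure_timeout": 60, "a_stop": A_STOP_SECONDS},
    "workaround 1 (failure-timeout=600s)": {"failure_timeout": 600, "a_stop": A_STOP_SECONDS},
    "workaround 2 (fast stop for A)": {"failure_timeout": 60, "a_stop": FAST_STOP_SECONDS},
}


def timeout_shorter_than_stop(failure_timeout, a_stop):
    """Return True if B's failure-timeout is shorter than A's stop duration."""
    return failure_timeout < a_stop


results = {
    name: timeout_shorter_than_stop(**values)
    for name, values in scenarios.items()
}

for name, shorter in results.items():
    print(f"{name}: failure-timeout shorter than A's stop = {shorter}")
```

In this comparison the condition holds only for the setup in which B was reported as not being scheduled, which matches the observation that either workaround makes the problem go away. Whether that relationship is the actual cause would need the logs to confirm.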