Reviewed:  https://review.openstack.org/419204
Committed: https://git.openstack.org/cgit/openstack/charm-hacluster/commit/?id=fda5176bd53f17a69f3e22b6b363bff96ff565c0
Submitter: Jenkins
Branch:    master

commit fda5176bd53f17a69f3e22b6b363bff96ff565c0
Author: David Ames <[email protected]>
Date:   Wed Jan 11 16:00:39 2017 -0800

    Fix pacemaker down crm infinite loop

    On corosync restart, corosync may take longer than a minute to come
    up. The systemd start script times out too soon. Then pacemaker,
    which is dependent on corosync, is immediately started and fails as
    corosync is still in the process of starting. Subsequently the charm
    would run crm node list to validate pacemaker. This would become an
    infinite loop.

    This change adds longer timeout values for systemd scripts and adds
    better error handling and communication to the end user.

    Change-Id: I7c3d018a03fddfb1f6bfd91fd7aeed4b13879e45
    Partial-Bug: #1654403

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to corosync in Ubuntu.
https://bugs.launchpad.net/bugs/1654403

Title:
  Race condition in hacluster charm that leaves pacemaker down

Status in corosync package in Ubuntu:
  New
Status in hacluster package in Juju Charms Collection:
  Triaged

Bug description:
  Symptom: one or more hacluster nodes are left in an executing state.

  Observing the process list on the affected nodes, the command 'crm
  node list' is in an infinite loop and pacemaker is not started. On
  nodes that complete the crm node list and other crm commands,
  pacemaker is started.

  See the artefacts from this run:
  https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline/openstack/charm-percona-cluster/417131/1/1873/index.html

  Hypothesis: There is a race that leads to crm node list being executed
  before pacemaker is started. It is also possible that something causes
  pacemaker to fail to start.

  Suggest a check for pacemaker health before any crm commands are run.
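
The suggested health check could be sketched as a bounded retry around 'crm node list', so that a pacemaker that never comes up produces an error instead of an endless loop. This is a minimal illustration, not the code from the commit above: the names wait_for_pacemaker, crm_node_list_ok and PacemakerNotReady, and the retry and delay defaults, are hypothetical, and it assumes that a non-zero exit status from 'crm node list' means pacemaker is not ready yet.

```python
import subprocess
import time


class PacemakerNotReady(Exception):
    """Raised when pacemaker does not respond within the retry budget."""


def crm_node_list_ok():
    """Return True if `crm node list` exits with status 0.

    Assumption: a non-zero exit status (or a missing `crm` binary)
    means pacemaker is not ready yet.
    """
    try:
        status = subprocess.call(
            ["crm", "node", "list"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # `crm` is not installed or cannot be executed.
        return False
    return status == 0


def wait_for_pacemaker(probe=crm_node_list_ok, retries=12, delay=5.0,
                       sleep=time.sleep):
    """Poll `probe` at most `retries` times, `delay` seconds apart.

    Returns the number of the attempt that succeeded. Raises
    PacemakerNotReady rather than looping forever if every attempt fails.
    """
    for attempt in range(1, retries + 1):
        if probe():
            return attempt
        if attempt < retries:
            sleep(delay)
    raise PacemakerNotReady(
        "pacemaker did not respond after {} attempts".format(retries))
```

A caller would run wait_for_pacemaker() before issuing any other crm command and, on PacemakerNotReady, report the failure to the user instead of retrying indefinitely. The probe and sleep parameters are injectable only so the retry logic can be exercised without a running cluster.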

