Based on my last comment, I have created one PPA for users to test and give us feedback for this specific case:
https://launchpad.net/~inaddy/+archive/ubuntu/lp1368737

Instructions on how to use:

# add-apt-repository ppa:inaddy/lp1368737
# apt-get update
# apt-get install pacemaker

This PPA contains both fixes:

Pacemaker: haproxy monitor NG
Pacemaker: Pacemaker's lrmd process crashed.

With the following changelog:

pacemaker (1.1.10+git20130802-1ubuntu4~lp1368737~1) trusty; urgency=medium

  * Fix: services: Prevent use-of-NULL when executing service actions - 1/2
    (LP: #1368737)
  * Fix: services: Fix the executing of synchronous actions - 2/2
    (LP: #1368737)

 -- Rafael David Tinoco <[email protected]>  Fri, 12 Sep 2014 15:52:14 -0300

pacemaker (1.1.10+git20130802-1ubuntu3) trusty; urgency=medium

  * Fix: services: Do not allow duplicate recurring op entries - 1/3
    (LP: #1353473)
  * High: lrmd: Merge duplicate recurring monitor operations - 2/3
    (LP: #1353473)
  * Fix: lrmd: Cancel recurring operations before stop action is executed - 3/3
    (LP: #1353473)

 -- Rafael David Tinoco <[email protected]>  Wed, 06 Aug 2014 09:24:13 -0300

The PPA carries both fixes because I was waiting for the Stable Release Update for pacemaker on Trusty, but it had not been released by the date of this fix.

If this fix solves the issue, I'll push both SRUs (for the 2 cases above) to our sponsor team to upload them for Trusty.

Waiting on community feedback to request the Release Update.

Thank you in advance.

-- 
You received this bug notification because you are a member of Ubuntu High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1368737

Title:
  Pacemaker can seg fault on crm node online/standby

Status in “pacemaker” package in Ubuntu:
  New

Bug description:

The following situation was brought to my attention:

"""
[Issue]

lrmd process crashed when repeating "crm node standby" and "crm node online"

----------------
# grep pacemakerd ha-log.k1pm101 | grep core
Aug 27 17:47:06 k1pm101 pacemakerd[49271]: error: child_waitpid: Managed process 49275 (lrmd) dumped core
Aug 27 17:47:06 k1pm101 pacemakerd[49271]: notice: pcmk_child_exit: Child process lrmd terminated with signal 11 (pid=49275, core=1)
Aug 27 18:27:14 k1pm101 pacemakerd[49271]: error: child_waitpid: Managed process 1471 (lrmd) dumped core
Aug 27 18:27:14 k1pm101 pacemakerd[49271]: notice: pcmk_child_exit: Child process lrmd terminated with signal 11 (pid=1471, core=1)
Aug 27 18:56:41 k1pm101 pacemakerd[49271]: error: child_waitpid: Managed process 35771 (lrmd) dumped core
Aug 27 18:56:41 k1pm101 pacemakerd[49271]: notice: pcmk_child_exit: Child process lrmd terminated with signal 11 (pid=35771, core=1)
Aug 27 19:44:09 k1pm101 pacemakerd[49271]: error: child_waitpid: Managed process 60709 (lrmd) dumped core
Aug 27 19:44:09 k1pm101 pacemakerd[49271]: notice: pcmk_child_exit: Child process lrmd terminated with signal 11 (pid=60709, core=1)
Aug 27 20:00:53 k1pm101 pacemakerd[49271]: error: child_waitpid: Managed process 35838 (lrmd) dumped core
Aug 27 20:00:53 k1pm101 pacemakerd[49271]: notice: pcmk_child_exit: Child process lrmd terminated with signal 11 (pid=35838, core=1)
Aug 27 21:33:52 k1pm101 pacemakerd[49271]: error: child_waitpid: Managed process 49249 (lrmd) dumped core
Aug 27 21:33:52 k1pm101 pacemakerd[49271]: notice: pcmk_child_exit: Child process lrmd terminated with signal 11 (pid=49249, core=1)
Aug 27 22:01:16 k1pm101 pacemakerd[49271]: error: child_waitpid: Managed process 65358 (lrmd) dumped core
Aug 27 22:01:16 k1pm101 pacemakerd[49271]: notice: pcmk_child_exit: Child process lrmd terminated with signal 11 (pid=65358, core=1)
Aug 27 22:28:02 k1pm101 pacemakerd[49271]: error: child_waitpid: Managed process 22693 (lrmd) dumped core
Aug 27 22:28:02 k1pm101 pacemakerd[49271]: notice: pcmk_child_exit: Child process lrmd terminated with signal 11 (pid=22693, core=1)
----------------

----------------
# grep pacemakerd ha-log.k1pm102 | grep core
Aug 27 15:32:48 k1pm102 pacemakerd[5808]: error: child_waitpid: Managed process 5812 (lrmd) dumped core
Aug 27 15:32:48 k1pm102 pacemakerd[5808]: notice: pcmk_child_exit: Child process lrmd terminated with signal 11 (pid=5812, core=1)
Aug 27 15:52:52 k1pm102 pacemakerd[5808]: error: child_waitpid: Managed process 35781 (lrmd) dumped core
Aug 27 15:52:52 k1pm102 pacemakerd[5808]: notice: pcmk_child_exit: Child process lrmd terminated with signal 11 (pid=35781, core=1)
Aug 27 16:02:54 k1pm102 pacemakerd[5808]: error: child_waitpid: Managed process 51984 (lrmd) dumped core
Aug 27 16:02:54 k1pm102 pacemakerd[5808]: notice: pcmk_child_exit: Child process lrmd terminated with signal 11 (pid=51984, core=1)
"""

Analyzing the core file with dbgsyms, I could see that the following frame is responsible for the core:

#0  0x00007f7184a45983 in services_action_sync (op=0x7f7185b605d0) at services.c:434
434         crm_trace(" > stdout: %s", op->stdout_data);

I've checked the upstream code, and there are 2 important commits that could be cherry-picked to fix this behavior:

commit f2a637cc553cb7aec59bdcf05c5e1d077173419f
Author: Andrew Beekhof <[email protected]>
Date:   Fri Sep 20 12:20:36 2013 +1000

    Fix: services: Prevent use-of-NULL when executing service actions

commit 11473a5a8c88eb17d5e8d6cd1d99dc497e817aac
Author: Gao,Yan <[email protected]>
Date:   Sun Sep 29 12:40:18 2013 +0800

    Fix: services: Fix the executing of synchronous actions

The core can be caused by things such as this missing code:

    if (op == NULL) {
        crm_trace("No operation to execute");
        return FALSE;
    }

at the beginning of the "services_action_sync(svc_action_t * op)" function.
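
The kind of guard quoted above can be illustrated in isolation. This is only a minimal sketch and not the upstream patch: demo_action_t and demo_action_sync are hypothetical stand-ins for pacemaker's svc_action_t and services_action_sync, and fprintf stands in for crm_trace.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for pacemaker's svc_action_t; only the field
 * that appears in the backtrace is modeled. */
typedef struct {
    const char *stdout_data;
} demo_action_t;

/* Sketch of a guarded synchronous action: return early, before any
 * dereference, when no operation was supplied. */
static bool demo_action_sync(const demo_action_t *op)
{
    if (op == NULL) {
        fprintf(stderr, "No operation to execute\n");
        return false;
    }

    /* Passing NULL to %s is undefined in standard C, so use a placeholder. */
    fprintf(stderr, " > stdout: %s\n",
            op->stdout_data != NULL ? op->stdout_data : "(none)");
    return true;
}
```

With this shape, a caller that passes NULL gets false back instead of a signal 11, and a valid action with no captured output is still traced safely.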
This behavior is further improved by commit 11473a5.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1368737/+subscriptions

_______________________________________________
Mailing list: https://launchpad.net/~ubuntu-ha
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp

