[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Uploaded for Xenial and Artful; it is now waiting in the upload queue for SRU verification team approval.

--
You received this bug notification because you are a member of STS Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1740892

Title:
  corosync upgrade on 2018-01-02 caused pacemaker to fail

Status in OpenStack hacluster charm: Invalid
Status in corosync package in Ubuntu: Fix Released
Status in pacemaker package in Ubuntu: Fix Released
Status in corosync source package in Trusty: Won't Fix
Status in pacemaker source package in Trusty: Won't Fix
Status in corosync source package in Xenial: In Progress
Status in pacemaker source package in Xenial: In Progress
Status in corosync source package in Artful: In Progress
Status in pacemaker source package in Artful: In Progress
Status in corosync source package in Bionic: Fix Released
Status in corosync package in Debian: New

Bug description:
  [Impact]

  When corosync and pacemaker are both installed, a corosync upgrade causes pacemaker to fail. pacemaker then needs to be restarted manually to work again; it won't recover by itself.

  [Test Case]

  1) Have corosync (< 2.3.5-3ubuntu2) and pacemaker (< 1.1.14-2ubuntu1.3) installed
  2) Make sure corosync & pacemaker are running via the systemctl status cmd.
  3) Upgrade corosync
  4) Check corosync and pacemaker via the systemctl status cmd again.

  You will notice pacemaker is dead (inactive) and doesn't recover, unless a systemctl start pacemaker is done manually.

  [Regression Potential]

  Regression potential is low; it doesn't change corosync/pacemaker core functionality. This patch makes sure things go more smoothly at the packaging level during a corosync upgrade where pacemaker is installed/involved.

  This can also be useful in particular in situations where the system has "unattended-upgrades" enabled (software upgrades without supervision), and no sysadmin is available to start pacemaker manually because this isn't a scheduled maintenance.

  [Other Info]

  XENIAL Merge-proposal:
  https://code.launchpad.net/~nacc/ubuntu/+source/corosync/+git/corosync/+merge/336338
  https://code.launchpad.net/~nacc/ubuntu/+source/pacemaker/+git/pacemaker/+merge/336339

  [Original Description]

  During upgrades on 2018-01-02, corosync and its libs were upgraded:

  (from a trusty/mitaka cloud)

  Upgrade: libcmap4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), corosync:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcfg6:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcpg4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libquorum5:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcorosync-common4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libsam4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libvotequorum6:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libtotem-pg5:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4)

  During this process, it appears that the pacemaker service is restarted and it errors:

  syslog:Jan 2 16:09:33 juju-machine-0-lxc-4 pacemakerd[1994]: notice: crm_update_peer_state: pcmk_quorum_notification: Node juju-machine-1-lxc-3[1001] - state is now lost (was member)
  syslog:Jan 2 16:09:34 juju-machine-0-lxc-4 pacemakerd[1994]: notice: crm_update_peer_state: pcmk_quorum_notification: Node juju-machine-1-lxc-3[1001] - state is now member (was lost)
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: error: cfg_connection_destroy: Connection destroyed
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: notice: pcmk_shutdown_worker: Shuting down Pacemaker
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: notice: stop_child: Stopping crmd: Sent -15 to process 2050
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: error: mcp_cpg_destroy: Connection destroyed

  Also affected xenial/ocata

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-hacluster/+bug/1740892/+subscriptions

--
Mailing list: https://launchpad.net/~sts-sponsors
Post to     : sts-sponsors@lists.launchpad.net
Unsubscribe : https://launchpad.net/~sts-sponsors
More help   : https://help.launchpad.net/ListHelp
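The [Test Case] steps in the bug description above could be scripted roughly as follows. This is only a sketch, not part of the SRU: the function names (status_is_running, report_unit, run_test_case) are made up for illustration, and it assumes a disposable test node with root access and a corosync upgrade candidate already available from the configured archives.

```shell
#!/bin/sh
# Sketch of the [Test Case] steps; function names are hypothetical.
# Run as root on a disposable test node only.

# Succeeds when the given `systemctl status` text reports a running unit.
status_is_running() {
    printf '%s\n' "$1" | grep -q 'Active: active (running)'
}

# Prints "<unit>: running" or "<unit>: not running" for one unit.
report_unit() {
    if status_is_running "$(systemctl status "$1" 2>/dev/null)"; then
        echo "$1: running"
    else
        echo "$1: not running"
    fi
}

# Steps 2-4: check both services, upgrade corosync, check again.
run_test_case() {
    report_unit corosync
    report_unit pacemaker
    apt-get install --only-upgrade -y corosync
    report_unit corosync
    report_unit pacemaker
}
```

Calling run_test_case on an affected system would be expected to report pacemaker as not running after the upgrade, and as running once the fixed packages are in place.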
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
[Artful (pre-sru)]

# dpkg -l | egrep "corosync|pacemaker"
ii  corosync                   2.4.2-3build1    amd64  cluster engine daemon and utilities
ii  crmsh                      2.3.2-1          amd64  CRM shell for the pacemaker cluster manager
ii  libcorosync-common4:amd64  2.4.2-3build1    amd64  cluster engine common library
ii  pacemaker                  1.1.16-1ubuntu1  amd64  cluster resource manager
ii  pacemaker-cli-utils        1.1.16-1ubuntu1  amd64  cluster resource manager command line utilities
ii  pacemaker-common           1.1.16-1ubuntu1  all    cluster resource manager common files
ii  pacemaker-resource-agents  1.1.16-1ubuntu1  all    cluster resource manager general resource agents

# systemctl status corosync | egrep -i "Active:|pid"
   Active: active (running) since Mon 2018-02-26 15:23:37 UTC; 15min ago
 Main PID: 8943 (corosync)

# systemctl status pacemaker | egrep -i "Active:|pid"
   Active: active (running) since Mon 2018-02-26 15:23:39 UTC; 15min ago
 Main PID: 9033 (pacemakerd)

# apt-get install corosync -y
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following package was automatically installed and is no longer required:
  libfreetype6
Use 'apt autoremove' to remove it.
The following additional packages will be installed:
  pacemaker
Suggested packages:
  fence-agents
The following packages will be upgraded:
  corosync pacemaker
2 upgraded, 0 newly installed, 0 to remove and 41 not upgraded.
Need to get 768 kB of archives.
After this operation, 11.3 kB of additional disk space will be used.
Get:1 http://ppa.launchpad.net/slashd/lp1740892/ubuntu artful/main amd64 pacemaker amd64 1.1.16-1ubuntu2 [389 kB]
Get:2 http://ppa.launchpad.net/slashd/lp1740892/ubuntu artful/main amd64 corosync amd64 2.4.2-3ubuntu0.17.10.1 [379 kB]
Fetched 768 kB in 1s (420 kB/s)
(Reading database ... 29268 files and directories currently installed.)
Preparing to unpack .../pacemaker_1.1.16-1ubuntu2_amd64.deb ...
Unpacking pacemaker (1.1.16-1ubuntu2) over (1.1.16-1ubuntu1) ...
Preparing to unpack .../corosync_2.4.2-3ubuntu0.17.10.1_amd64.deb ...
Unpacking corosync (2.4.2-3ubuntu0.17.10.1) over (2.4.2-3build1) ...
Processing triggers for ureadahead (0.100.0-20) ...
Processing triggers for systemd (234-2ubuntu12.1) ...
Setting up corosync (2.4.2-3ubuntu0.17.10.1) ...
Processing triggers for man-db (2.7.6.1-2) ...
Setting up pacemaker (1.1.16-1ubuntu2) ...

# dpkg -l | egrep "corosync|pacemaker"
ii  corosync                   2.4.2-3ubuntu0.17.10.1  amd64  cluster engine daemon and utilities
ii  crmsh                      2.3.2-1                 amd64  CRM shell for the pacemaker cluster manager
ii  libcorosync-common4:amd64  2.4.2-3build1           amd64  cluster engine common library
ii  pacemaker                  1.1.16-1ubuntu2         amd64  cluster resource manager
ii  pacemaker-cli-utils        1.1.16-1ubuntu1         amd64  cluster resource manager command line utilities
ii  pacemaker-common           1.1.16-1ubuntu1         all    cluster resource manager common files
ii  pacemaker-resource-agents  1.1.16-1ubuntu1         all    cluster resource manager general resource agents

# systemctl status corosync | egrep -i "Active:|pid"
   Active: active (running) since Mon 2018-02-26 15:40:04 UTC; 13s ago
 Main PID: 9814 (corosync)

# systemctl status pacemaker | egrep -i "Active:|pid"
   Active: active (running) since Mon 2018-02-26 15:40:05 UTC; 14s ago
 Main PID: 9996 (pacemakerd)
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
I'll resume the SRU and hopefully upload everything next Monday.
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Was able to build fine using (optional).

Standard symbol tags

optional
  A symbol marked as optional can disappear from the library at any time and that will never cause dpkg-gensymbols to fail. However, disappeared optional symbols will continuously appear as MISSING in the diff in each new package revision. This behaviour serves as a reminder for the maintainer that such a symbol needs to be removed from the symbol file or readded to the library. When the optional symbol, which was previously declared as MISSING, suddenly reappears in the next revision, it will be upgraded back to the “existing” status with its minimum version unchanged.

  This tag is useful for symbols which are private where their disappearance do not cause ABI breakage. For example, most of C++ template instantiations fall into this category. Like any other tag, this one may also have an arbitrary value: it could be used to indicate why the symbol is considered optional.
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Another quick update, based on a discussion between nacc/slangasek and myself:

...
slangasek: fair, above patch results in https://paste.ubuntu.com/p/hb68G8rpMw/
nacc: those are pretty clearly internal symbols which are not part of the ABI and you should just mark them (optional) instead of doing an architecture-based exclusion list
and when I say mark them (optional), I mean mark them '(optional)'
nacc: dropping the symbols, or marking them optional, both valid. (optional) would make the same source package more cleanly backportable to older toolchains
but a symbol that starts with a __ and isn't listed in the public headers for the library, and especially that doesn't originate in the source of this library, can be assumed safe to drop from .symbols
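As an illustration of the advice in the discussion about marking internal symbols '(optional)', the sketch below tags every symbol whose name starts with "__" in a dpkg symbols file. The function name mark_internal_optional and the file path in the usage note are assumptions for illustration only; this is not the change that was actually uploaded, and it does not check whether a symbol appears in the library's public headers.

```shell
#!/bin/sh
# Sketch only: mark_internal_optional is a hypothetical helper.
# Reads a dpkg symbols file on stdin and prefixes the (optional) tag to
# symbol lines whose name starts with "__". Library header lines, lines
# that already carry a tag, and all other symbols pass through unchanged.
mark_internal_optional() {
    sed -E 's/^ (__[^ ]*@)/ (optional)\1/'
}
```

It could be applied with something like `mark_internal_optional < debian/libfoo.symbols > debian/libfoo.symbols.new`, where the file name is a placeholder, and the result reviewed by hand before committing.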
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Quick update:

The corosync/pacemaker SRU is on hold for now until the FTBFS situation is fixed for Artful.

As mentioned above in comment #58, based on the build log error I got and Debian bug #869986, it is related to some libqb header issues.

The Server team will have a look at this, and I'll resume the SRU once that is completed.
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Forgot to mention that it still fails to build, but it now fails differently than before applying Debian commit "a7476dd96e79197f65acf0f049f75ce8e8f9e801".
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
[XENIAL (pre-SRU)]

== BEFORE UPGRADE ==

# dpkg -l | egrep "corosync|pacemaker"
ii  corosync                   2.3.5-3ubuntu2     amd64  cluster engine daemon and utilities
ii  crmsh                      2.2.0-1            amd64  CRM shell for the pacemaker cluster manager
ii  libcorosync-common4:amd64  2.3.5-3ubuntu2     amd64  cluster engine common library
ii  pacemaker                  1.1.14-2ubuntu1.3  amd64  cluster resource manager
ii  pacemaker-cli-utils        1.1.14-2ubuntu1.3  amd64  cluster resource manager command line utilities
ii  pacemaker-common           1.1.14-2ubuntu1.3  all    cluster resource manager common files
ii  pacemaker-resource-agents  1.1.14-2ubuntu1.3  all    cluster resource manager general resource agents

# systemctl status corosync | egrep "Active:|Main PID"
   Active: active (running) since Mon 2018-02-19 15:14:44 UTC; 16min ago
 Main PID: 3228 (corosync)

# systemctl status pacemaker | egrep "Active:|Main PID"
   Active: active (running) since Mon 2018-02-19 15:14:44 UTC; 16min ago
 Main PID: 3321 (pacemakerd)

== UPGRADE ==

# apt-cache policy corosync
corosync:
  Installed: 2.3.5-3ubuntu2
  Candidate: 2.3.5-3ubuntu2.1
  Version table:
     2.3.5-3ubuntu2.1 500
        500 http://ppa.launchpad.net/slashd/test/ubuntu xenial/main amd64 Packages
 *** 2.3.5-3ubuntu2 500
        500 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        100 /var/lib/dpkg/status

# apt-get install corosync
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following package was automatically installed and is no longer required:
  libfreetype6
Use 'apt autoremove' to remove it.
The following additional packages will be installed:
  pacemaker
Suggested packages:
  fence-agents
The following packages will be upgraded:
  corosync pacemaker
2 upgraded, 0 newly installed, 0 to remove and 55 not upgraded.
Need to get 766 kB of archives.
After this operation, 2048 B of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://ppa.launchpad.net/slashd/test/ubuntu xenial/main amd64 pacemaker amd64 1.1.14-2ubuntu1.4 [404 kB]
Get:2 http://ppa.launchpad.net/slashd/test/ubuntu xenial/main amd64 corosync amd64 2.3.5-3ubuntu2.1 [361 kB]
Fetched 766 kB in 1s (507 kB/s)
(Reading database ... 28089 files and directories currently installed.)
Preparing to unpack .../pacemaker_1.1.14-2ubuntu1.4_amd64.deb ...
Unpacking pacemaker (1.1.14-2ubuntu1.4) over (1.1.14-2ubuntu1.3) ...
Preparing to unpack .../corosync_2.3.5-3ubuntu2.1_amd64.deb ...
Unpacking corosync (2.3.5-3ubuntu2.1) over (2.3.5-3ubuntu2) ...
Processing triggers for systemd (229-4ubuntu21) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up corosync (2.3.5-3ubuntu2.1) ...
Setting up pacemaker (1.1.14-2ubuntu1.4) ...

== AFTER UPGRADE ==

# dpkg -l | egrep "corosync|pacemaker"
ii  corosync                   2.3.5-3ubuntu2.1   amd64  cluster engine daemon and utilities
ii  crmsh                      2.2.0-1            amd64  CRM shell for the pacemaker cluster manager
ii  libcorosync-common4:amd64  2.3.5-3ubuntu2     amd64  cluster engine common library
ii  pacemaker                  1.1.14-2ubuntu1.4  amd64  cluster resource manager
ii  pacemaker-cli-utils        1.1.14-2ubuntu1.3  amd64  cluster resource manager command line utilities
ii  pacemaker-common           1.1.14-2ubuntu1.3  all    cluster resource manager common files
ii  pacemaker-resource-agents  1.1.14-2ubuntu1.3  all    cluster resource manager general resource agents

# systemctl status corosync | egrep "Active:|Main PID"
   Active: active (running) since Mon 2018-02-19 15:33:25 UTC; 30s ago
 Main PID: 4769 (corosync)

# systemctl status pacemaker | egrep "Active:|Main PID"
   Active: active (running) since Mon 2018-02-19 15:33:25 UTC; 35s ago
 Main PID: 4844 (pacemakerd)

---

* The packages also install successfully, as they should, on a fresh install (no package upgrade involved).
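The before/after checks in the test above can be scripted. The following is a minimal sketch, not part of the SRU itself: `unit_is_running` is a hypothetical helper name, and it assumes the `Active:` line format shown in the `systemctl status` output above.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): reads "systemctl status"
# output on stdin and succeeds only if the Active: line reports
# "active (running)".
unit_is_running() {
    grep -Eq '^[[:space:]]*Active: active \(running\)'
}

# Possible use on a systemd host after the corosync upgrade
# (unit names as used in the test above):
#
#   for unit in corosync pacemaker; do
#       systemctl status "$unit" | unit_is_running \
#           || echo "$unit is not running after the upgrade"
#   done
```

On hosts where only the exit status matters, `systemctl is-active --quiet <unit>` gives the same yes/no answer without parsing text; the stdin form is used here so the check can also be run against saved output.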
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
** Also affects: pacemaker (Ubuntu) Importance: Undecided Status: New
** Changed in: pacemaker (Ubuntu) Status: New => In Progress
** Changed in: pacemaker (Ubuntu) Status: In Progress => Fix Released
** Changed in: pacemaker (Ubuntu) Assignee: (unassigned) => Nish Aravamudan (nacc)
** Changed in: pacemaker (Ubuntu) Importance: Undecided => Medium
** No longer affects: corosync (Ubuntu Artful)
** No longer affects: corosync (Ubuntu Xenial)
** No longer affects: corosync (Ubuntu Trusty)
** Also affects: pacemaker (Ubuntu Trusty) Importance: Undecided Status: New
** Also affects: corosync (Ubuntu Trusty) Importance: Undecided Status: New
** Also affects: pacemaker (Ubuntu Artful) Importance: Undecided Status: New
** Also affects: corosync (Ubuntu Artful) Importance: Undecided Status: New
** Also affects: pacemaker (Ubuntu Xenial) Importance: Undecided Status: New
** Also affects: corosync (Ubuntu Xenial) Importance: Undecided Status: New
** Changed in: corosync (Ubuntu Trusty) Importance: Undecided => Medium
** Changed in: corosync (Ubuntu Trusty) Status: New => Won't Fix
** Changed in: corosync (Ubuntu Trusty) Assignee: (unassigned) => Eric Desrochers (slashd)
** Changed in: corosync (Ubuntu Trusty) Assignee: Eric Desrochers (slashd) => Nish Aravamudan (nacc)
** Changed in: pacemaker (Ubuntu Trusty) Importance: Undecided => Medium
** Changed in: pacemaker (Ubuntu Trusty) Status: New => Won't Fix
** Changed in: pacemaker (Ubuntu Trusty) Assignee: (unassigned) => Nish Aravamudan (nacc)
** Changed in: corosync (Ubuntu Xenial) Importance: Undecided => High
** Changed in: corosync (Ubuntu Xenial) Status: New => In Progress
** Changed in: corosync (Ubuntu Xenial) Assignee: (unassigned) => Eric Desrochers (slashd)
** Changed in: corosync (Ubuntu Artful) Importance: Undecided => High
** Changed in: corosync (Ubuntu Artful) Status: New => In Progress
** Changed in: corosync (Ubuntu Artful) Assignee: (unassigned) => Eric Desrochers (slashd)
** Changed in: pacemaker (Ubuntu Xenial) Importance: Undecided => High
** Changed in: pacemaker (Ubuntu Xenial) Status: New => In Progress
** Changed in: pacemaker (Ubuntu Xenial) Assignee: (unassigned) => Eric Desrochers (slashd)
** Changed in: pacemaker (Ubuntu Artful) Importance: Undecided => High
** Changed in: pacemaker (Ubuntu Artful) Status: New => In Progress
** Changed in: pacemaker (Ubuntu Artful) Assignee: (unassigned) => Eric Desrochers (slashd)
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
** Tags added: id-5a53cc961fb7361dbac726f8

-- 
You received this bug notification because you are a member of STS Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1740892

Title:
  corosync upgrade on 2018-01-02 caused pacemaker to fail

Status in OpenStack hacluster charm: Invalid
Status in corosync package in Ubuntu: Fix Released
Status in corosync source package in Trusty: Won't Fix
Status in corosync source package in Xenial: In Progress
Status in corosync source package in Artful: In Progress
Status in corosync source package in Bionic: Fix Released
Status in corosync package in Debian: New

Bug description:
  During upgrades on 2018-01-02, corosync and its libs were upgraded:

  (from a trusty/mitaka cloud)

  Upgrade:
    libcmap4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
    corosync:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
    libcfg6:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
    libcpg4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
    libquorum5:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
    libcorosync-common4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
    libsam4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
    libvotequorum6:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
    libtotem-pg5:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4)

  During this process, it appears that the pacemaker service is restarted and it errors:

  syslog:Jan 2 16:09:33 juju-machine-0-lxc-4 pacemakerd[1994]: notice: crm_update_peer_state: pcmk_quorum_notification: Node juju-machine-1-lxc-3[1001] - state is now lost (was member)
  syslog:Jan 2 16:09:34 juju-machine-0-lxc-4 pacemakerd[1994]: notice: crm_update_peer_state: pcmk_quorum_notification: Node juju-machine-1-lxc-3[1001] - state is now member (was lost)
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: error: cfg_connection_destroy: Connection destroyed
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: notice: pcmk_shutdown_worker: Shuting down Pacemaker
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: notice: stop_child: Stopping crmd: Sent -15 to process 2050
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: error: mcp_cpg_destroy: Connection destroyed

  Also affected xenial/ocata
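The failure in the bug description shows up as `error:` entries logged by `pacemakerd`. As a rough sketch for spotting the same signature elsewhere, the filter below keeps only those entries. The function name `pacemakerd_errors` is made up for illustration, and the pattern assumes the log layout quoted above (with or without a space before `error:`).

```shell
#!/bin/sh
# Hypothetical filter (not from the bug report): reads syslog lines on
# stdin and prints only the pacemakerd entries logged at "error" level.
pacemakerd_errors() {
    grep -E 'pacemakerd\[[0-9]+\]:[[:space:]]*error:'
}

# Possible use (the log path is an assumption and varies by system):
#
#   pacemakerd_errors < /var/log/syslog
```

A match on `cfg_connection_destroy`, `pcmk_cpg_dispatch` or `mcp_cpg_destroy` around the time of a corosync upgrade would be consistent with the scenario reported here, but these messages alone do not prove the same cause.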
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
I'll proceed with the SRU next week.

As per my discussion with the server team, we will fix Xenial and Artful but we won't fix Trusty.

** Changed in: corosync (Ubuntu Artful) Assignee: Nish Aravamudan (nacc) => Eric Desrochers (slashd)
** Changed in: corosync (Ubuntu Xenial) Assignee: Nish Aravamudan (nacc) => Eric Desrochers (slashd)
** Changed in: corosync (Ubuntu Trusty) Status: Confirmed => Won't Fix
** Changed in: corosync (Ubuntu Artful) Status: Confirmed => In Progress
** Changed in: corosync (Ubuntu Xenial) Status: Confirmed => In Progress
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Sorry for the long delay on my end!

I have pushed up MPs for the correct resolution (I think) for Bionic (incl. appropriate comments on what can be dropped after B+1 opens). I am building them in my PPA (pacemaker 1.1.18~rc4-1ubuntu1~ppa1 and corosync 2.4.2-3ubuntu1~ppa1) now and will test the following scenarios:

(to level-set)
X -> B        [should be broken]
X -> B + PPA  [should work]
A -> B        [may or may not be broken, because there is not a corosync version change]
A -> B + PPA  [should work]

as well as the prior cases of fresh install in B and reinstall in B.

Additionally, we should be able to test that starting/stopping/restarting corosync in B successfully applies the same state change to pacemaker.

Presuming these tests pass and the Canonical Server Team reviews and approves them, I will upload them this week.

Eric & co., at that point, I'm wondering if perhaps your team can pick up the SRUs to X, A and T? I think X and A will take the same changes. As we discussed, we would do the minimum required for the older releases, as in my MPs already up. The only thing currently missing is a debconf note prompt, I think, that says pacemaker will have been stopped by the corosync upgrade and will need to be manually restarted.

-- 
You received this bug notification because you are a member of STS Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1740892

Title:
  corosync upgrade on 2018-01-02 caused pacemaker to fail

Status in OpenStack hacluster charm: Invalid
Status in corosync package in Ubuntu: In Progress
Status in corosync source package in Trusty: Confirmed
Status in corosync source package in Xenial: Confirmed
Status in corosync source package in Artful: Confirmed
Status in corosync source package in Bionic: In Progress
Status in corosync package in Debian: New
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
** Merge proposal linked: https://code.launchpad.net/~nacc/ubuntu/+source/pacemaker/+git/pacemaker/+merge/336063 ** Merge proposal linked: https://code.launchpad.net/~nacc/ubuntu/+source/pacemaker/+git/pacemaker/+merge/336879 ** Merge proposal linked: https://code.launchpad.net/~nacc/ubuntu/+source/corosync/+git/corosync/+merge/336880
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
** Merge proposal linked: https://code.launchpad.net/~nacc/ubuntu/+source/pacemaker/+git/pacemaker/+merge/336579
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
** Merge proposal linked: https://code.launchpad.net/~nacc/ubuntu/+source/corosync/+git/corosync/+merge/336508 ** Merge proposal linked: https://code.launchpad.net/~nacc/ubuntu/+source/pacemaker/+git/pacemaker/+merge/336509
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
[VERIFICATION for XENIAL]

- [UPGRADE SCENARIO]

# dpkg -l | egrep "corosync|pacemaker"
ii  corosync                   2.3.5-3ubuntu2     amd64  cluster engine daemon and utilities
ii  crmsh                      2.2.0-1            amd64  CRM shell for the pacemaker cluster manager
ii  libcorosync-common4:amd64  2.3.5-3ubuntu2     amd64  cluster engine common library
ii  pacemaker                  1.1.14-2ubuntu1.3  amd64  cluster resource manager
ii  pacemaker-cli-utils        1.1.14-2ubuntu1.3  amd64  cluster resource manager command line utilities
ii  pacemaker-common           1.1.14-2ubuntu1.3  all    cluster resource manager common files
ii  pacemaker-resource-agents  1.1.14-2ubuntu1.3  all    cluster resource manager general resource agents

# pidof pacemakerd
3647
# pidof corosync
1283

# sudo add-apt-repository ppa:nacc/lp1740892
# sudo apt-get update
# apt-get install corosync -y
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following package was automatically installed and is no longer required:
  libfreetype6
Use 'apt autoremove' to remove it.
The following additional packages will be installed:
  pacemaker
Suggested packages:
  fence-agents
The following packages will be upgraded:
  corosync pacemaker
2 upgraded, 0 newly installed, 0 to remove and 43 not upgraded.
Need to get 765 kB of archives.
After this operation, 1024 B of additional disk space will be used.
Get:1 http://ppa.launchpad.net/nacc/lp1740892/ubuntu xenial/main amd64 pacemaker amd64 1.1.14-2ubuntu1.4~ppa3 [403 kB]
Get:2 http://ppa.launchpad.net/nacc/lp1740892/ubuntu xenial/main amd64 corosync amd64 2.3.5-3ubuntu2.1~ppa3 [361 kB]
Fetched 765 kB in 1s (488 kB/s)
(Reading database ... 28089 files and directories currently installed.)
Preparing to unpack .../pacemaker_1.1.14-2ubuntu1.4~ppa3_amd64.deb ...
Unpacking pacemaker (1.1.14-2ubuntu1.4~ppa3) over (1.1.14-2ubuntu1.3) ...
Preparing to unpack .../corosync_2.3.5-3ubuntu2.1~ppa3_amd64.deb ...
Unpacking corosync (2.3.5-3ubuntu2.1~ppa3) over (2.3.5-3ubuntu2) ...
Processing triggers for systemd (229-4ubuntu21) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up corosync (2.3.5-3ubuntu2.1~ppa3) ...
Setting up pacemaker (1.1.14-2ubuntu1.4~ppa3) ...

# dpkg -l | egrep "corosync|pacemaker"
ii  corosync                   2.3.5-3ubuntu2.1~ppa3   amd64  cluster engine daemon and utilities
ii  crmsh                      2.2.0-1                 amd64  CRM shell for the pacemaker cluster manager
ii  libcorosync-common4:amd64  2.3.5-3ubuntu2          amd64  cluster engine common library
ii  pacemaker                  1.1.14-2ubuntu1.4~ppa3  amd64  cluster resource manager
ii  pacemaker-cli-utils        1.1.14-2ubuntu1.3       amd64  cluster resource manager command line utilities
ii  pacemaker-common           1.1.14-2ubuntu1.3       all    cluster resource manager common files
ii  pacemaker-resource-agents  1.1.14-2ubuntu1.3       all    cluster resource manager general resource agents

# pidof corosync
4876
# pidof pacemakerd
4951

*** Result: The upgrade scenario makes sure pacemaker is restarted after the upgrade.

- [NEW INSTALL SCENARIO] lgtm +1

# sudo add-apt-repository ppa:nacc/lp1740892
# sudo apt-get update
# sudo apt-get install corosync pacemaker -y

# dpkg -l | egrep "corosync|pacemaker"
ii  corosync                   2.3.5-3ubuntu2.1~ppa3   amd64  cluster engine daemon and utilities
ii  crmsh                      2.2.0-1                 amd64  CRM shell for the pacemaker cluster manager
ii  libcorosync-common4:amd64  2.3.5-3ubuntu2.1~ppa3   amd64  cluster engine common library
ii  pacemaker                  1.1.14-2ubuntu1.4~ppa3  amd64  cluster resource manager
ii  pacemaker-cli-utils        1.1.14-2ubuntu1.4~ppa3  amd64  cluster resource manager command line utilities
ii  pacemaker-common           1.1.14-2ubuntu1.4~ppa3  all    cluster resource manager common files
ii  pacemaker-resource-agents  1.1.14-2ubuntu1.4~ppa3  all    cluster resource manager general resource agents

*** Result: No problem observed when both packages are installed for the first time in Xenial.
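The verification above infers a restart from the daemon PIDs changing across the upgrade (pacemakerd 3647 -> 4951, corosync 1283 -> 4876). A small sketch of that comparison follows; `service_restarted` is a hypothetical name, and the sketch assumes a single PID per daemon, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report).
# service_restarted OLD_PID NEW_PID
# Succeeds when both PIDs are non-empty and differ, i.e. the daemon was
# running before, is running now, and is a new process.
service_restarted() {
    [ -n "$1" ] && [ -n "$2" ] && [ "$1" != "$2" ]
}

# Possible use around an upgrade:
#
#   before=$(pidof pacemakerd)
#   apt-get install corosync -y
#   after=$(pidof pacemakerd)
#   service_restarted "$before" "$after" \
#       || echo "pacemakerd was not restarted (or is not running)"
```

An empty new PID is treated as a failure on purpose: that is the state described in this bug, where pacemaker stays stopped after corosync is upgraded.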
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
After hitting some corner cases with my Trusty packages (and Xenial), I uploaded new versions:

corosync - 2.3.3-1ubuntu4.1~ppa7
pacemaker - 1.1.10+git20130802-1ubuntu2.5~ppa4

And ran the following tests:

1) Install corosync and pacemaker, but do not enable either.
   Upgrade to the PPA packages.
   No errors, both are still disabled.

2) Install corosync and pacemaker, enable corosync only.
   Upgrade to the PPA packages.
   No errors, corosync is restarted.

3) Install corosync and pacemaker, enable both.
   Upgrade to the PPA packages.
   No errors, corosync and pacemaker restarted.

I'm moving on to update the xenial branches now.
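The three upgrade tests in the message above amount to a small lookup from "which services were enabled" to "what should happen after the upgrade". The sketch below only restates the reported outcomes; the function name and the yes/no arguments are illustrative assumptions, and how "enabled" is determined on a given release is deliberately left out.

```shell
#!/bin/sh
# Hypothetical lookup (not from the bug report).
# expected_after_upgrade COROSYNC_ENABLED PACEMAKER_ENABLED
# Each argument is "yes" or "no". Prints the outcome reported for that
# combination in the tests above; fails for combinations not tested.
expected_after_upgrade() {
    case "$1/$2" in
        no/no)   echo "both still disabled" ;;
        yes/no)  echo "corosync restarted" ;;
        yes/yes) echo "corosync and pacemaker restarted" ;;
        *)       echo "not covered by the tests above"; return 1 ;;
    esac
}

# Example:
#   expected_after_upgrade yes yes
```

The pacemaker-only case is reported as "not covered" rather than guessed, since the message lists no test for it.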
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Xenial MPs updated and packages uploaded to PPA:

corosync - 2.3.5-3ubuntu2.1~ppa2
pacemaker - 1.1.14-2ubuntu1.4~ppa2

Bionic will have to wait til tomorrow.
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
I tested the versions mentioned in the last comment (with one syntax fix) and the upgrade path worked successfully!

I need to test more corner cases and would appreciate help with that (e.g., corosync only installed, installing pacemaker separately, etc.).

The MPs have been updated and I'm building packages that are source-identical to the MPs as:

corosync - 2.3.3-1ubuntu4.1~ppa6
pacemaker - 1.1.10+git20130802-1ubuntu2.5~ppa3

The reason for the repeated version bumps is that these packages now have inter-related versioning.

I will try to do a similar set of MPs for xenial and bionic first thing tomorrow, if not later today.

** Merge proposal unlinked:
   https://code.launchpad.net/~nacc/ubuntu/+source/corosync/+git/corosync/+merge/336185
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
** Merge proposal linked:
   https://code.launchpad.net/~nacc/ubuntu/+source/corosync/+git/corosync/+merge/336336

** Merge proposal linked:
   https://code.launchpad.net/~nacc/ubuntu/+source/pacemaker/+git/pacemaker/+merge/336337
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Testing on Trusty:

  # apt-get install corosync pacemaker
  # Make corosync start at boot
  # sed -i 's/no/yes/' /etc/default/corosync
  # Make pacemaker start at boot
  # update-rc.d pacemaker defaults
  # reboot
  # service corosync status; service pacemaker status
   * corosync is running
  pacemakerd (pid 1927) is running...

Add PPA and upgrade corosync:

  # add-apt-repository ppa:nacc/lp1740892
  # apt-get update; apt-get install corosync
  # service corosync status; service pacemaker status
   * corosync is running
  pacemakerd is stopped

So what is in my PPA is not yet a fix, and I think I see why:

  Preparing to unpack .../corosync_2.3.3-1ubuntu4.1~ppa3_amd64.deb ...
   * Stopping corosync daemon corosync
  ...
  Setting up corosync (2.3.3-1ubuntu4.1~ppa3) ...
  Installing new version of config file /etc/init.d/corosync ...
   * Restarting corosync daemon corosync
  warning [MAIN ] Could not lock memory of service to avoid page faults: Cannot allocate memory (12)

So the postinst change is correct and we now restart corosync instead of starting it. However, because the old package's prerm is run, corosync is stopped first, which in turn causes pacemaker to exit. By the time our updated init script runs, pacemaker is no longer running, so the script does not detect it and does not restart it.
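The ordering problem described in the Trusty test (the old prerm stops corosync before the new init script gets a chance to look at pacemaker) can be illustrated with a small simulation. This is only a sketch: service state is held in shell variables, nothing is really started or stopped, and the function names (stop_corosync, restart_corosync, simulate_upgrade, simulate_plain_restart) are made up for illustration and are not taken from the packages.

```shell
#!/bin/sh
# Simulation only: state lives in variables, no real service is touched.
corosync=running
pacemaker=running

stop_corosync() {
    corosync=stopped
    # pacemaker exits by itself once its connection to corosync is gone
    pacemaker=stopped
}

start_corosync() {
    corosync=running
}

# Restart as in the updated init script: pacemaker is started again only
# if it is seen running at the moment restart is called.
restart_corosync() {
    was_running=$pacemaker
    stop_corosync
    start_corosync
    if [ "$was_running" = running ]; then
        pacemaker=running
    fi
}

# Package upgrade: the OLD package's prerm stops corosync first,
# then the NEW package's postinst calls restart.
simulate_upgrade() {
    corosync=running
    pacemaker=running
    stop_corosync
    restart_corosync
    echo "corosync=$corosync pacemaker=$pacemaker"
}

# Plain restart with both services up, no prerm involved.
simulate_plain_restart() {
    corosync=running
    pacemaker=running
    restart_corosync
    echo "corosync=$corosync pacemaker=$pacemaker"
}

simulate_upgrade
simulate_plain_restart
```

In the simulated upgrade, pacemaker stays stopped because restart_corosync only ever sees it after the prerm has already taken corosync down; in the plain restart it comes back.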
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Ok, I've put up MPs for Trusty (just corosync) and Xenial (corosync and pacemaker). I think that's correct, and the underlying bug here (package upgrade of corosync does not lead to pacemaker restarting) should be resolved in both cases [1) in c#33]. Additionally, case 2) in c#33 is resolved in both cases, but requires different changes in each release.

I will propose MPs for bionic similar to those for xenial.

The corresponding packages are currently building in the PPA.
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Sorry for the long-winded and delayed update to the bug! Here's my TL;DR:

1) We want the postinst of corosync to be created with dh_installinit --restart-after-upgrade, which is the default in compat levels >= 10.

2) We want the init scripts (of whatever type) to restart pacemaker, if they restart corosync and if pacemaker was running before corosync was restarted.
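Point 2) of the TL;DR (restart pacemaker only if corosync is being restarted and pacemaker was running beforehand) might look roughly like the following. This is a sketch under assumptions, not the change in the MPs: it is written as a wrapper around the `service` command rather than as part of corosync's own init script, and the SERVICE variable and both function names are hypothetical, introduced so the logic can be dry-run with a stub.

```shell
#!/bin/sh
# SERVICE is a hypothetical hook: it defaults to the real `service`
# command but can point at a stub for a dry run.
: "${SERVICE:=service}"

# Exit status 0 means pacemaker reports itself as running.
pacemaker_is_running() {
    "$SERVICE" pacemaker status >/dev/null 2>&1
}

restart_corosync_and_pacemaker() {
    # Record pacemaker's state BEFORE corosync goes down, because
    # pacemaker exits on its own once corosync is stopped.
    restart_pacemaker=no
    if pacemaker_is_running; then
        restart_pacemaker=yes
    fi

    "$SERVICE" corosync restart || return 1

    # Only bring pacemaker back if it was running before; a stopped
    # pacemaker is treated as a local configuration choice.
    if [ "$restart_pacemaker" = yes ]; then
        "$SERVICE" pacemaker start
    fi
}
```

The point of the design is the ordering: the check has to happen before corosync is stopped, otherwise pacemaker has already exited and there is nothing left to detect.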
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
My findings from today:

1) This situation has always existed on Trusty, afaict. Removing the regression-related tag.

2) There are 24 possible combinations to consider (some are by definition green already, but I'm including them for completeness; and some are not achievable) for each release: `service {corosync,pacemaker} {start,stop,restart}` where each of corosync and pacemaker can begin in one of {started,stopped}; 3 * 2 * 2 * 2 = 24.

3) For now, I'm ignoring the case of pacemaker configured to use heartbeat, as that is not the default in the current Ubuntu release.

4) On Trusty, 6 of those combinations are not possible by default (corosync stopped but pacemaker running).

5) On Trusty, the only failing situation I can provoke is `service corosync restart` when corosync and pacemaker are running already. In all other 17 cases, the expected result is obtained with existing packages.

6) I have submitted an MP to the Ubuntu Server Git repository for general review and submitted a build to a PPA at https://launchpad.net/~nacc/+archive/ubuntu/lp1740892/, which adds a manual SysV start of pacemaker in corosync's SysV restart logic, if pacemaker was running before corosync was restarted. I think this is the path least likely to affect any existing configurations. In particular, this does not affect the corosync start path, which may or may not have previously started pacemaker (that is a local configuration decision, afaict).

7) In my investigation (this relates to xnox's and others' comments), there is no SysV link between pacemaker and corosync. Instead, pacemaker itself quits due to not finding corosync if it's not already started. This is why the SysV do_stop routine for corosync ends up resulting in pacemaker stopping.
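The counts in findings 2), 4) and 5) can be checked by enumerating the combinations directly. This is just a counting sketch; the function name count_combinations is made up for illustration and nothing here touches real services.

```shell
#!/bin/sh
# Enumerate action x service x corosync state x pacemaker state and
# count the combinations that cannot occur by default on Trusty
# (corosync stopped while pacemaker is running).
count_combinations() {
    total=0
    unreachable=0
    for action in start stop restart; do
        for service in corosync pacemaker; do
            for corosync_state in started stopped; do
                for pacemaker_state in started stopped; do
                    total=$((total + 1))
                    if [ "$corosync_state" = stopped ] &&
                       [ "$pacemaker_state" = started ]; then
                        unreachable=$((unreachable + 1))
                    fi
                done
            done
        done
    done
    reachable=$((total - unreachable))
    echo "total=$total unreachable=$unreachable reachable=$reachable"
}

count_combinations
# prints "total=24 unreachable=6 reachable=18"
```

With 18 reachable combinations and one failing (`service corosync restart` with both services running), 17 behave as expected, which matches the numbers in the findings.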
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
** Merge proposal linked:
   https://code.launchpad.net/~nacc/ubuntu/+source/corosync/+git/corosync/+merge/336185
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
** No longer affects: corosync (Ubuntu Zesty)
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
** Changed in: corosync (Ubuntu Bionic)
     Assignee: Eric Desrochers (slashd) => Nish Aravamudan (nacc)
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: corosync (Ubuntu Zesty)
   Status: New => Confirmed

--
You received this bug notification because you are a member of STS Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1740892

Title:
  corosync upgrade on 2018-01-02 caused pacemaker to fail

Status in OpenStack hacluster charm: Invalid
Status in corosync package in Ubuntu: In Progress
Status in corosync source package in Trusty: Confirmed
Status in corosync source package in Xenial: Confirmed
Status in corosync source package in Zesty: Confirmed
Status in corosync source package in Artful: Confirmed
Status in corosync source package in Bionic: In Progress

Bug description:
  During upgrades on 2018-01-02, corosync and its libs were upgraded (from a trusty/mitaka cloud):

  Upgrade: libcmap4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), corosync:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcfg6:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcpg4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libquorum5:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcorosync-common4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libsam4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libvotequorum6:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libtotem-pg5:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4)

  During this process, it appears that the pacemaker service is restarted and it errors:

  syslog:Jan 2 16:09:33 juju-machine-0-lxc-4 pacemakerd[1994]: notice: crm_update_peer_state: pcmk_quorum_notification: Node juju-machine-1-lxc-3[1001] - state is now lost (was member)
  syslog:Jan 2 16:09:34 juju-machine-0-lxc-4 pacemakerd[1994]: notice: crm_update_peer_state: pcmk_quorum_notification: Node juju-machine-1-lxc-3[1001] - state is now member (was lost)
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]:error: cfg_connection_destroy: Connection destroyed
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: notice: pcmk_shutdown_worker: Shuting down Pacemaker
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: notice: stop_child: Stopping crmd: Sent -15 to process 2050
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]:error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
  syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]:error: mcp_cpg_destroy: Connection destroyed

  Also affected xenial/ocata

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-hacluster/+bug/1740892/+subscriptions

--
Mailing list: https://launchpad.net/~sts-sponsors
Post to     : sts-sponsors@lists.launchpad.net
Unsubscribe : https://launchpad.net/~sts-sponsors
More help   : https://help.launchpad.net/ListHelp
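For anyone checking whether a node hit the same failure, here is a minimal sketch that counts the "Connection destroyed" errors from pacemakerd shown in the syslog excerpt of the bug description. The function name count_corosync_disconnects is hypothetical, and the pattern is derived only from the two error lines quoted in this report, so other pacemaker versions may log differently.

```shell
#!/bin/sh
# Hypothetical helper: count pacemakerd errors that indicate the corosync
# connection was torn down. The pattern is based only on the log lines
# quoted in this bug report (cfg_connection_destroy / mcp_cpg_destroy).
count_corosync_disconnects() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status clean.
    grep -cE 'pacemakerd\[[0-9]+\]: *error: *(cfg_connection_destroy|mcp_cpg_destroy): Connection destroyed' "$1" || true
}
```

A typical call would be `count_corosync_disconnects /var/log/syslog`; a non-zero count around the time of a corosync upgrade suggests pacemaker lost its connection and may need a manual start.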
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: corosync (Ubuntu Trusty)
   Status: New => Confirmed
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
This issue has been identified as Field Critical. The Medium importance assigned to this defect seems counterintuitive given the SLA level assigned.

@Eric, can you provide some insight into a time frame for a fix? Typically, under a Critical SLA we expect a dedicated engineer and a fix within a week.

Status in OpenStack hacluster charm: Invalid
Status in corosync package in Ubuntu: In Progress
Status in corosync source package in Trusty: New
Status in corosync source package in Xenial: New
Status in corosync source package in Zesty: New
Status in corosync source package in Artful: New
Status in corosync source package in Bionic: In Progress
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
I was wrong regarding iii) "when corosync is stopped, do not stop pacemaker": Pacemaker can use other applications[1] (e.g. heartbeat) instead of corosync, so this is a property we want to keep.
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
In my opinion, from the list of desired properties, only the second one holds:

i) Corosync can be used on its own, regardless of whether pacemaker is installed. Starting both of them would force users to mask pacemaker's unit file in some scenarios.

iii) IIRC, pacemaker requires corosync to run, so this property can't be satisfied (in fact, pacemaker SIGTERMs its components when corosync is not available).

I like the idea stated at point 3) (restart on upgrade instead of stop+start). It would solve the issue without having to change the unit files.

Regarding Trusty, both corosync and pacemaker currently use sysV scripts. I ran a short test switching to upstart using the scripts in source [1] and it seems to work fine (thanks to the 'respawn' directive for pacemaker).

[1] master/mcp/pacemaker.upstart.in
    master/init/corosync.conf.in
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
(Maybe actually even use /bin/systemctl try-restart pacemaker.service, i.e. restart it only if it's running.)
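To illustrate the difference suggested here: "systemctl restart" starts the unit even if it was stopped, while "systemctl try-restart" only restarts a unit that is already running. Below is a minimal sketch of a wrapper a maintainer script could call. The function name and the SYSTEMCTL override variable are hypothetical and exist only so the call can be dry-run; they are not part of any package.

```shell
#!/bin/sh
# SYSTEMCTL is a hypothetical override hook (e.g. SYSTEMCTL=echo for a dry
# run); by default the real systemctl binary is used.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Restart pacemaker only if it is currently running. With try-restart, a
# pacemaker that the admin stopped on purpose stays stopped.
restart_pacemaker_if_running() {
    $SYSTEMCTL try-restart pacemaker.service
}
```

Running it with SYSTEMCTL=echo prints the command instead of executing it, which is a cheap way to check what a script would do before touching a live cluster node.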
[Sts-sponsors] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Currently, in bionic:

$ systemctl cat pacemaker.service
# /lib/systemd/system/pacemaker.service
After=corosync.service
Requires=corosync.service

$ systemctl cat corosync.service

Desired properties:

i) when corosync is started, attempt to start pacemaker
ii) when corosync is restarted, attempt to restart pacemaker too
iii) when corosync is stopped, do not stop pacemaker

1) Property i) can be satisfied with [Install] WantedBy=corosync.service in pacemaker.service.

2) Requires=corosync.service is too strong, as it means that pacemaker cannot operate without corosync. Is this true?

3) Currently, on upgrade, the corosync prerm script does "stop corosync" and later the postinst does "start corosync". My understanding is that it would be better, on upgrades, to simply "restart" corosync instead of doing stop and then start. Please consider switching the corosync package to use dh_systemd and the restart-after-upgrade dh_installinit/systemd option.

4) Properties ii) and iii) cannot currently be satisfied simultaneously with simple stanzas. If pacemaker requires corosync at all times, then pacemaker.service should declare PartOf=corosync.service. Then a stop/restart of corosync will stop and restart pacemaker. That satisfies condition ii) but violates condition iii).

However, we can instead introduce a helper unit to achieve both ii) and iii) simultaneously, e.g.:

pacemaker-restart.service

[Unit]
PartOf=corosync.service

[Service]
ExecStart=/bin/true
ExecStop=/bin/systemctl restart pacemaker.service

[Install]
WantedBy=corosync.service

This means that whenever corosync is stopped or restarted, pacemaker.service will be restarted too. This extra unit will satisfy conditions ii) and iii) as stated.
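For experimenting with the helper-unit idea, here is a minimal sketch that writes the proposed pacemaker-restart.service into a directory of your choice. The function name write_pacemaker_restart_unit and the Description= line are hypothetical. The Type=oneshot and RemainAfterExit=yes lines are also an assumption that is not in the comment above: without them the unit would go inactive as soon as /bin/true exits, so ExecStop would likely not fire at the intended time. This has not been tested against the actual packages.

```shell
#!/bin/sh
# Hypothetical helper: write the proposed helper unit into the given directory.
write_pacemaker_restart_unit() {
    unit_dir="$1"
    mkdir -p "$unit_dir"
    cat > "$unit_dir/pacemaker-restart.service" <<'EOF'
[Unit]
Description=Restart pacemaker whenever corosync is stopped or restarted
# PartOf propagates stop/restart of corosync.service to this unit.
PartOf=corosync.service

[Service]
# Assumption (not in the original proposal): keep the unit "active" after
# /bin/true exits so that ExecStop runs when corosync stops or restarts.
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
ExecStop=/bin/systemctl restart pacemaker.service

[Install]
WantedBy=corosync.service
EOF
}
```

To try it on a test machine, one could write the file to /etc/systemd/system, then run "systemctl daemon-reload" and "systemctl enable pacemaker-restart.service". Writing to a scratch directory first makes it easy to review the unit before installing it.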