[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-12 Thread Trent Lloyd
** Changed in: charm-hacluster
Status: New => Confirmed
** Changed in: pacemaker (Ubuntu)
Status: Confirmed => Invalid
** Summary changed:
- upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters
+ pacemaker left stopped after unattended-upgrade of pacemaker

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-12 Thread Trent Lloyd
For the fix to Bug #1654403, charm-hacluster sets TimeoutStartSec and TimeoutStopSec for both corosync and pacemaker to the same value. system-wide default (xenial, bionic): TimeoutStopSec=90s, TimeoutStartSec=90s; corosync package default: system-wide default (no changes); pacemaker package

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-12 Thread Ante Karamatić
** Also affects: charm-hacluster
Importance: Undecided
Status: New
-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1903745 Title: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-12 Thread Trent Lloyd
I misread; the systemd unit is native and already sets the following settings: SendSIGKILL=no, TimeoutStopSec=30min, TimeoutStartSec=60s. The problem is that most of these failures have been experienced on juju hacluster charm installations, which are overriding these values $ cat

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-11 Thread Trent Lloyd
Analysed the logs for an occurrence of this; the problem appears to be that pacemaker doesn't stop after 1 minute, so systemd gives up and just starts a new instance anyway, noting that all of the existing processes are left behind. I am awaiting the extra rotated logs to confirm but from what I

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-11 Thread Felipe Reyes
Just some history: in the past we attempted to disable unattended-upgrades (as a config option) in the hacluster charm, but it was decided that it wasn't the right place to get this addressed. Bug https://bugs.launchpad.net/charm-hacluster/+bug/1826898 ** Tags added: sts

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-11 Thread Ante Karamatić
Well, yes. Therefore, if unattended upgrades wants to update stuff randomly and mimic an operator, it should do it as an operator and not do it partially. I guess what I'm saying is that this is a bug in unattended upgrades, rather than pacemaker/corosync. It should have a way of defining an

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-11 Thread James Troup
Er, but unattended upgrades are on by default?

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-10 Thread Ante Karamatić
A big problem with the corosync and pacemaker packages is that they don't put resources into an unmanaged state before an upgrade. Hitting an issue like this is also tightly related to the pacemaker/corosync configuration. Upgrading corosync without stopping pacemaker (and therefore lrmd) has high

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-10 Thread Trent Lloyd
I reviewed the sosreports and provide some general analysis below. [sosreport-juju-machine-2-lxc-1-2020-11-10-tayyude] I don't see any sign in this log of package upgrades or VIP stop/starts; I suspect this host may be unrelated. [sosreport-juju-caae6f-19-lxd-6-20201110230352.tar.xz] This is

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-10 Thread Zachary Zehring
I have uploaded sosreports from 2 lxds that would have been affected by the issue: https://private-fileshare.canonical.com/~zzehring/sosreport-xenial-2020-11-10.tar.xz https://private-fileshare.canonical.com/~zzehring/sosreport-bionic-20201110230352.tar.xz Let me know if you need other logs.

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-10 Thread Zachary Zehring
There was a mix of bionic and xenial clouds that experienced this issue. Doesn't look to be tied to distro, so not all affected systems were running systemd-networkd.

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-10 Thread Jamie Strandboge
Can someone confirm that the affected systems are running systemd-networkd? That would more strongly suggest that https://bugs.launchpad.net/netplan/+bug/1815101 is related. Based on the description and the other bug, the security update doesn't seem to have regressed pacemaker; instead, it's a

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-10 Thread Marc Deslauriers
Does anyone have any pertinent log files that were generated during the upgrade process?

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-10 Thread Dan Streetman
possibly related to bug 1815101

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-10 Thread Launchpad Bug Tracker
Status changed to 'Confirmed' because the bug affects multiple users.
** Changed in: pacemaker (Ubuntu)
Status: New => Confirmed

[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters

2020-11-10 Thread Dan Ackerson
We've been working all day on fixing pacemaker/corosync issues on several of our clouds due to this issue.