On Mon, Jan 8, 2018 at 9:51 AM, Nish Aravamudan
<nish.aravamu...@canonical.com> wrote:
> On Mon, Jan 8, 2018 at 8:48 AM, Victor Tapia <victor.ta...@canonical.com>
> wrote:
>> As mentioned by Mario @ #10, stopping corosync while pacemaker runs
>> throws the same error as the upgrade. Syslog from Xenial +
>> corosync=2.3.5-3ubuntu1:
>>
>> Jan 8 16:24:37 xenial-corosync systemd[1]: Stopping Pacemaker High Availability Cluster Manager...
>> Jan 8 16:24:37 xenial-corosync pacemakerd[28747]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_SHUTDOWN cause=C_SHUTDOWN origin=crm_shutdown ]
>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Delaying fencing operations until there are resources to manage
>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Scheduling Node xenial-corosync for shutdown
>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-52.bz2
>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: Transition 1 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-52.bz2): Complete
>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: Disconnecting from Corosync
>> Jan 8 16:24:37 xenial-corosync cib[28748]: warning: new_event_notification (28748-28753-12): Broken pipe (32)
>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync attrd[28751]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync lrmd[28750]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync stonith-ng[28749]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync cib[28748]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync cib[28748]: notice: Disconnecting from Corosync
>> Jan 8 16:24:37 xenial-corosync cib[28748]: notice: Disconnecting from Corosync
>> Jan 8 16:24:37 xenial-corosync systemd[1]: Stopped Pacemaker High Availability Cluster Manager.
>>
>>
>> Pacemakerd shuts down sending SIGTERM to its components, but after the
>> install, corosync does not start pacemaker. BTW, "systemctl restart
>> corosync" restarts both services perfectly
>>
>> I think that the option A from James Page (#11) is the way to go
>
> I took a quick look at a LXD container after seeing Felipe and
> Victor's posts. It seems like this is a bug in the xenial (at least)
> systemd unit files:
>
> # grep pacemaker /lib/systemd/system/corosync.service
> # pacemaker.service, and if you want to exert the watchdog when a
>
> # grep corosync /lib/systemd/system/pacemaker.service
> After=corosync.service
> Requires=corosync.service
> # ExecStopPost=/bin/sh -c 'pidof crmd || killall -TERM corosync'
>
> So, what I see is that corosync.service has no dependency on
> pacemaker.service (in the file).
>
> pacemaker.service will start after corosync.service. And when
> pacemaker.service is shutdown it will be before corosync.service.
> Additionally, if pacemaker.service is started, then corosync.service
> is started as well.
>
> Note, nothing specifies what Felipe said -- there is no guarantee that
> pacemaker is started, restarted, etc. when corosync is.
>
> I think the next step is to look at Bionic's systemd services
> (probably newer) or upstream's and see if there is a difference, or
> new dependencies added there.

Or perhaps ask upstream what they think is providing this assurance in
their systemd files, because I'm not seeing it. If we have a hard
dependency between pacemaker and corosync, then I think we might need a
PartOf directive, in order to ensure they are always following the
state transitions together.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1740892

Title:
  corosync upgrade on 2018-01-02 caused pacemaker to fail
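
For reference, the PartOf idea mentioned in the reply could be tried as a
local drop-in rather than by editing the packaged unit. This is only a
sketch, not a tested fix: the drop-in file name is made up for
illustration, and whether PartOf= is the right directive here is still
the open question from the thread.

```ini
# Hypothetical drop-in (file name is an assumption):
#   /etc/systemd/system/pacemaker.service.d/partof-corosync.conf
# Run "systemctl daemon-reload" after creating it.

[Unit]
# Propagate stop and restart of corosync.service to pacemaker.service.
# The existing After=/Requires= lines in pacemaker.service are kept.
PartOf=corosync.service
```

One caveat worth checking before relying on this: according to the
systemd.unit documentation, PartOf= only propagates stop and restart
from the listed unit. A plain "systemctl start corosync" would still not
start pacemaker, so a stop followed by a separate start (as a package
upgrade may do) might need something more than this directive.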