Bug#1025618: cloud-init and firewalld systemd unit files have ordering cycles
On Fri, Dec 16, 2022 at 03:48:00PM -0800, Ross Vandegrift wrote:
> At a high level the issue is: firewalld.service forces network-pre.target
> after sysinit.target, but cloud-init.service forces the other way around.
> In detail, using < to represent Before, the imposed orderings look like:
>
> - from firewalld:
>   sysinit.target < dbus.service < firewalld.service < network-pre.target
> - from cloud-init:
>   cloud-init-local.service < network-pre.target <
>   systemd-networkd-wait-online.service < cloud-init.service < sysinit.target
>
> There's a few approaches to resolving this. As far as I can tell, the only
> immediately viable one (at the bottom) requires users to manually fix this
> and accept some trade-offs. Anyone have any better ideas?

We discussed this issue at the recent cloud-team meeting and had some revised options.

> Modify firewalld to run before sysinit.target
> - [snip]

This one still seems impossible.

> Modify cloud-init to run after sysinit.target
> - [snip]

The main downside of this one is that cloud-init will be running too late to configure block devices. But this feature didn't always work well. So maybe we'd affect a non-working feature.

I've confirmed that cloud-init's block device setup is working well on AWS at least. So I think this will break working cloud-init features. IMO, that means it is not viable.

> Locally override firewalld.service's order
> -- [snip]

This remains unattractive since unsuspecting users will be left with broken images and no clear path to fix the problem.

Modify dbus to run later
- We discussed a way to improve things by shuffling dbus later, but I didn't take good enough notes, and I can't reconstruct the details. Sorry for forgetting - Bastian, do you recall the details?

Add Breaks or Conflicts to prevent coinstallation
- None of the alternatives seem reasonable, and installing cloud-init and firewalld together cannot produce a working Debian image. So we should prevent this state.

We thought Conflicts might be required because once both are unpacked, the problematic cycle technically exists, though it may not cause harm unless both services are (re-)started simultaneously.

Ross
Bug#1025618: cloud-init and firewalld systemd unit files have ordering cycles
On Mon, Dec 12, 2022 at 05:41:46PM -0800, Noah Meyerhans wrote:
> On 12/12/2022 6:44 AM, Sam Hartman wrote:
> > >> From my quick read: Michael Biebl proposes dropping
> > >> network-pre.target
> > Ross> from cloud-init's After=, and replacing it with each of the
> > Ross> config backends that cloud-init supports. This sounds pretty
> > Ross> reasonable, but also like something that upstream should
> > Ross> address first.
> >
> > Why wait for upstream?
> > It's a bug affecting Debian users, our systemd maintainer has a solution
> > that you (and I) think is reasonable.
> > The symptom is quite serious.
> > We often make changes before upstream in situations like that,
> > especially when the alternative is:
> >
> > Ross> Should we consider adding "Conflicts: firewalld" to cloud-init
> > Ross> before the freeze? That's not optimal of course, but it'd
> > Ross> prevent a user from ending up in this situation for now.
> >
> > I'd much rather see Debian local changes than conflicts.
>
> We should simply move this discussion to an upstream pull request rather
> than wait passively for their response. I agree that diverging from upstream
> is preferable to unnecessary conflicts, but it shouldn't be done without
> first consulting with upstream on our proposed solution.

I played with the suggested solution and was unable to get it working: cloud-init.service doesn't have a /direct/ Before=network-pre.target to remove. The ordering is implicit in the combination of units. I think Michael probably knew that when he made the suggestion - but I had to play with it for a few hours first. :)

At a high level the issue is: firewalld.service forces network-pre.target after sysinit.target, but cloud-init.service forces the other way around.

In detail, using < to represent Before, the imposed orderings look like:

- from firewalld:
  sysinit.target < dbus.service < firewalld.service < network-pre.target
- from cloud-init:
  cloud-init-local.service < network-pre.target <
  systemd-networkd-wait-online.service < cloud-init.service < sysinit.target

There's a few approaches to resolving this. As far as I can tell, the only immediately viable one (at the bottom) requires users to manually fix this and accept some trade-offs. Anyone have any better ideas?

Modify firewalld to run before sysinit.target
- This would let cloud-init and firewalld agree to do network-pre.target before sysinit.target. This is probably not possible since firewalld requires dbus, which starts after sysinit.target. There's a thread at [1] about why moving firewalld to be an early boot service is difficult.

Modify cloud-init to run after sysinit.target
- This would let cloud-init and firewalld agree to do network-pre.target after sysinit.target. This might not be advisable (see comments in [1] about running network management services in late boot), but it looks like this is how RHEL does it [2].

From [3], I think cloud-init.service added Before=basic.target (which eventually became Before=sysinit.target) to ensure that block device mounts configured by cloud-init were ready early enough in the boot process. The network needs to be online for this, since some block device config can come from network sources. So changing this in the Debian package seems risky to me.

Locally override firewalld.service's order
-- If you need to use both together, create an override unit that removes Before=network-pre.target. This eliminates the cycle by allowing cloud-init's order to win. But the network will be up without firewalld for a period. Unfortunately, dependencies can't be removed in a drop-in - so I think you need to copy the unit to /etc/systemd/system and modify it.

Ross

[1] - https://lists.freedesktop.org/archives/systemd-devel/2022-March/047538.html
[2] - https://github.com/canonical/cloud-init/blob/main/systemd/cloud-init.service.tmpl#L4-L6
[3] - https://github.com/canonical/cloud-init/commit/80f5ec4be0f781b26eca51d90d51abfab396b3f6
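The two ordering chains listed in the message above can be checked mechanically. What follows is a minimal Python sketch and not part of the original thread: it models only the "<" (Before) orderings quoted there, it does not query systemd, and the helper names chain_edges and find_cycle are hypothetical. It also illustrates the last option: dropping firewalld's Before=network-pre.target edge from the model removes the cycle.

```python
# Ordering chains copied from the message; "a < b" means a is ordered Before b.
FIREWALLD_CHAIN = [
    "sysinit.target", "dbus.service", "firewalld.service", "network-pre.target",
]
CLOUD_INIT_CHAIN = [
    "cloud-init-local.service", "network-pre.target",
    "systemd-networkd-wait-online.service", "cloud-init.service", "sysinit.target",
]


def chain_edges(chain):
    """Turn [a, b, c] into the Before edges [(a, b), (b, c)]."""
    return list(zip(chain, chain[1:]))


def find_cycle(edges):
    """Return one cycle as a list of unit names (first == last), or None."""
    graph = {}
    for before, after in edges:
        graph.setdefault(before, []).append(after)

    in_progress, done = set(), set()
    path = []

    def visit(node):
        in_progress.add(node)
        path.append(node)
        for nxt in graph.get(node, []):
            if nxt in in_progress:
                # Back edge: the cycle is the path from nxt to here, closed.
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        in_progress.discard(node)
        done.add(node)
        return None

    for node in list(graph):
        if node not in done:
            found = visit(node)
            if found:
                return found
    return None


ALL_EDGES = chain_edges(FIREWALLD_CHAIN) + chain_edges(CLOUD_INIT_CHAIN)

# Model of the local override: firewalld no longer orders itself
# Before=network-pre.target (an assumption about the edited unit copy).
OVERRIDE_EDGES = [
    e for e in ALL_EDGES if e != ("firewalld.service", "network-pre.target")
]

if __name__ == "__main__":
    print("combined:", find_cycle(ALL_EDGES))
    print("with override:", find_cycle(OVERRIDE_EDGES))
```

In this model the combined graph contains a cycle that passes through both sysinit.target and network-pre.target, while the graph with the override applied has none. That matches the trade-off described above: cloud-init's order wins, and firewalld is no longer guaranteed to start before the network comes up.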
Bug#1025618: cloud-init and firewalld systemd unit files have ordering cycles
On 12/12/2022 6:44 AM, Sam Hartman wrote:
> >> From my quick read: Michael Biebl proposes dropping
> >> network-pre.target
> Ross> from cloud-init's After=, and replacing it with each of the
> Ross> config backends that cloud-init supports. This sounds pretty
> Ross> reasonable, but also like something that upstream should
> Ross> address first.
>
> Why wait for upstream?
> It's a bug affecting Debian users, our systemd maintainer has a solution
> that you (and I) think is reasonable.
> The symptom is quite serious.
> We often make changes before upstream in situations like that,
> especially when the alternative is:
>
> Ross> Should we consider adding "Conflicts: firewalld" to cloud-init
> Ross> before the freeze? That's not optimal of course, but it'd
> Ross> prevent a user from ending up in this situation for now.
>
> I'd much rather see Debian local changes than conflicts.

We should simply move this discussion to an upstream pull request rather than wait passively for their response. I agree that diverging from upstream is preferable to unnecessary conflicts, but it shouldn't be done without first consulting with upstream on our proposed solution.

noah
Bug#1025618: cloud-init and firewalld systemd unit files have ordering cycles
> "Ross" == Ross Vandegrift writes:

>> From my quick read: Michael Biebl proposes dropping
>> network-pre.target
Ross> from cloud-init's After=, and replacing it with each of the
Ross> config backends that cloud-init supports. This sounds pretty
Ross> reasonable, but also like something that upstream should
Ross> address first.

Why wait for upstream?
It's a bug affecting Debian users, our systemd maintainer has a solution that you (and I) think is reasonable.
The symptom is quite serious.
We often make changes before upstream in situations like that, especially when the alternative is:

Ross> Should we consider adding "Conflicts: firewalld" to cloud-init
Ross> before the freeze? That's not optimal of course, but it'd
Ross> prevent a user from ending up in this situation for now.

I'd much rather see Debian local changes than conflicts.
Bug#1025618: cloud-init and firewalld systemd unit files have ordering cycles
Hi Ross,

> Should we consider adding "Conflicts: firewalld" to cloud-init before
> the freeze? That's not optimal of course, but it'd prevent a user from
> ending up in this situation for now.

Is there a way to bypass "Conflicts" and install such packages anyway, in case the user finds a way to customize the configuration in a way that fits their needs?

For example, on one of my systems I kept cloud-init installed for now, but I disabled and masked cloud-init.service.

Cheers!
Guillaume
Bug#1025618: cloud-init and firewalld systemd unit files have ordering cycles
Control: forwarded -1 https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1956629

Hi Guillaume,

On Tue, Dec 06, 2022 at 06:26:26PM +0100, Guillaume Knispel wrote:
> firewalld and cloud-init have ordering cycles between their systemd unit
> files, leading to more or less broken boot results when both are installed
> and active, because at each boot systemd decides to skip a
> non-deterministically chosen service (not necessarily cloud-init or
> firewalld) to break the cycle.

Thanks for bringing this to our attention. There's a few useful discussions:

https://github.com/firewalld/firewalld/issues/414
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1956629

From my quick read: Michael Biebl proposes dropping network-pre.target from cloud-init's After=, and replacing it with each of the config backends that cloud-init supports. This sounds pretty reasonable, but also like something that upstream should address first.

Should we consider adding "Conflicts: firewalld" to cloud-init before the freeze? That's not optimal of course, but it'd prevent a user from ending up in this situation for now.

Thanks,
Ross
Bug#1025618: cloud-init and firewalld systemd unit files have ordering cycles
Package: cloud-init
Version: 20.4.1-2+deb11u1
Severity: important
X-Debbugs-Cc: xi...@australdx.fr

Dear Maintainer,

firewalld and cloud-init have ordering cycles between their systemd unit files, leading to more or less broken boot results when both are installed and active, because at each boot systemd decides to skip a non-deterministically chosen service (not necessarily cloud-init or firewalld) to break the cycle.

I'm not sure whether firewalld or cloud-init is more at fault (maybe in not respecting some systemd rules?), so I'm also opening a duplicate of this bug for the other package.

This can have various and potentially serious consequences, depending on what should be, but is not, started.

Examples of boot traces of this issue happening:

* example 1:
  sysinit.target: Found ordering cycle on cloud-init.service/start
  sysinit.target: Found dependency on networking.service/start
  sysinit.target: Found dependency on network-pre.target/start
  sysinit.target: Found dependency on firewalld.service/start
  sysinit.target: Found dependency on basic.target/start
  sysinit.target: Found dependency on sockets.target/start
  sysinit.target: Found dependency on uuidd.socket/start
  sysinit.target: Found dependency on sysinit.target/start
  sysinit.target: Job cloud-init.service/start deleted to break ordering cycle starting with sysinit.target/start

* example 2:
  sysinit.target: Found ordering cycle on cloud-init.service/start
  sysinit.target: Found dependency on networking.service/start
  sysinit.target: Found dependency on network-pre.target/start
  sysinit.target: Found dependency on firewalld.service/start
  sysinit.target: Found dependency on dbus.service/start
  sysinit.target: Found dependency on sysinit.target/start
  sysinit.target: Job cloud-init.service/start deleted to break ordering cycle starting with sysinit.target/start

* example 3:
  firewalld.service: Found ordering cycle on dbus.socket/start
  firewalld.service: Found dependency on sysinit.target/start
  firewalld.service: Found dependency on cloud-init.service/start
  firewalld.service: Found dependency on networking.service/start
  firewalld.service: Found dependency on network-pre.target/start
  firewalld.service: Found dependency on firewalld.service/start
  firewalld.service: Job dbus.socket/start deleted to break ordering cycle starting with firewalld.service/start

* example 4:
  firewalld.service: Found ordering cycle on dbus.service/start
  firewalld.service: Found dependency on sysinit.target/start
  firewalld.service: Found dependency on cloud-init.service/start
  firewalld.service: Found dependency on networking.service/start
  firewalld.service: Found dependency on network-pre.target/start
  firewalld.service: Found dependency on firewalld.service/start
  firewalld.service: Job dbus.service/start deleted to break ordering cycle starting with firewalld.service/start
  basic.target: Found ordering cycle on sysinit.target/start
  basic.target: Found dependency on cloud-init.service/start
  basic.target: Found dependency on networking.service/start
  basic.target: Found dependency on network-pre.target/start
  basic.target: Found dependency on firewalld.service/start
  basic.target: Found dependency on basic.target/start
  basic.target: Job cloud-init.service/start deleted to break ordering cycle starting with basic.target/start

* example 5:
  basic.target: Found ordering cycle on sockets.target/start
  basic.target: Found dependency on uuidd.socket/start
  basic.target: Found dependency on sysinit.target/start
  basic.target: Found dependency on cloud-init.service/start
  basic.target: Found dependency on networking.service/start
  basic.target: Found dependency on network-pre.target/start
  basic.target: Found dependency on firewalld.service/start
  basic.target: Found dependency on dbus.service/start
  basic.target: Found dependency on basic.target/start
  basic.target: Job sockets.target/start deleted to break ordering cycle starting with basic.target/start
  firewalld.service: Found ordering cycle on dbus.socket/start
  firewalld.service: Found dependency on sysinit.target/start
  firewalld.service: Found dependency on cloud-init.service/start
  firewalld.service: Found dependency on networking.service/start
  firewalld.service: Found dependency on network-pre.target/start
  firewalld.service: Found dependency on firewalld.service/start
  firewalld.service: Job dbus.socket/start deleted to break ordering cycle starting with firewalld.service/start

* example 6:
  networking.service: Found ordering cycle on network-pre.target/start
  networking.service: Found dependency on firewalld.service/start
  networking.service: Found dependency on dbus.service/start
  networking.service: Found dependency on basic.target/start
  networking.service: Found dependency on sockets.target/start
  networking.service: Found dependency on uuidd.socket/start
  networking.service: Found dependency on sysinit.target/start
  networking.service: Found dependency on cloud-init.service/start
  networking.service: Found dependency on networking.service/star