Not quite.

On boot, there are multiple ways that ifup is called, and effectively it
races with itself.

In my case I have two VLANs on top of a bond of two NICs. By the time
networking.service is called, the two NICs are present.
networking.service is essentially `ifup -a`: it looks at the eni file
and realises that it should bring up bond0. It looks for bond0 in its
internal state and creates it.

This is where the race starts.

ifupdown ships /lib/udev/rules.d/80-ifupdown.rules, which calls
/lib/udev/ifupdown-hotplug, which effectively does
$ exec systemctl --no-block start $(systemd-escape --template [email protected] $INTERFACE)
(very strange to do this on systemd systems, because one could have just used
SYSTEMD_WANTS, but anyway)

At this point bond0 is being brought up by both the networking.service
unit (ifup -a) and [email protected] (ifup bond0). Sometimes one can see an
"already configured" message from either of the two units in the logs.

But also, at this point in time, [email protected] and
[email protected] may have been started as well.

In my case my machine manages to hit this race quite often. I am
attaching a journal log of what is happening.

The log is produced using:
journalctl -u ifup@*.service -u networking -o verbose | grep -e UTC -e UNIT -e MESSAGE

You can see messages that things are waiting on bond0 to be up, and that
one or the other vlan is waiting on the bond0 lock. To beat the locks and
to prevent [email protected] interfering with [email protected], or executing
in parallel and creating deadlocks, I had to encode the dependencies
between these units in systemd's brain by doing this:

# cat /etc/systemd/system/[email protected]/order.conf 
[Unit]
[email protected]
[email protected]
# cat /etc/systemd/system/[email protected]/order.conf 
[Unit]
[email protected]
[email protected]

This way the ordering is enforced for the [email protected] hotplug. IMHO
ifupdown should ship a generator that would create these dependencies
and orderings between interfaces. And possibly ifup -a should be reduced
to starting ifup@%I.service for every interface it is meant to start for
a given command.
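
To make the generator idea a bit more concrete, here is a rough Python
sketch of what such a thing could do. It is not anything ifupdown ships.
It assumes the only dependency hint it needs is the vlan-raw-device
option, that the instance units are named ifup@<iface>.service, and that
interface names need no systemd-escape treatment. The helper names
(parse_raw_devices, write_dropins), the order.conf file name and the
choice of After=/Wants= are mine, for illustration only.

```python
import os


def parse_raw_devices(eni_text):
    """Map each iface stanza to the device named by its vlan-raw-device option.

    Simplified parser (assumption): it only understands 'iface' stanzas and
    the 'vlan-raw-device' option, and ignores everything else.
    """
    deps = {}
    current = None
    for line in eni_text.splitlines():
        words = line.split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] == "iface" and len(words) > 1:
            current = words[1]
        elif words[0] in ("auto", "mapping", "source") or words[0].startswith("allow-"):
            # These keywords end the current iface stanza.
            current = None
        elif words[0] == "vlan-raw-device" and current and len(words) > 1:
            deps[current] = words[1]
    return deps


def write_dropins(deps, out_dir):
    """Write one ordering drop-in per dependent interface under out_dir.

    Assumption: After= plus Wants= is enough to express "bring the raw
    device up first"; a real generator might pick different directives.
    """
    for iface, raw in deps.items():
        dropin_dir = os.path.join(out_dir, "ifup@%s.service.d" % iface)
        os.makedirs(dropin_dir, exist_ok=True)
        with open(os.path.join(dropin_dir, "order.conf"), "w") as f:
            f.write("[Unit]\n")
            f.write("After=ifup@%s.service\n" % raw)
            f.write("Wants=ifup@%s.service\n" % raw)
```

A real generator would be handed its output directory by systemd as a
command line argument and would read /etc/network/interfaces itself;
here both are left as parameters so the logic can be tried on any
directory.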

I'm not sure if we can cheat and state that [email protected] should be
Wants=networking.service After=networking.service, because I think we
may then get ourselves into a situation where ifupdown fails to resolve
cycles in the eni when the eni is specified out of order.
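
As an illustration of the ordering and cycle problem, independent of how
ifupdown actually resolves it, the dependencies can be treated as a
graph and sorted topologically. This sketch uses Python's standard
graphlib module (Python 3.9+); the bring_up_order helper and the
interface names in the usage note are made up for the example.

```python
from graphlib import CycleError, TopologicalSorter


def bring_up_order(deps):
    """Return interfaces ordered so that every dependency comes first.

    deps maps an interface name to the set of interfaces that must be up
    before it. The order in which entries were declared does not matter.
    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    return list(TopologicalSorter(deps).static_order())
```

For example, bring_up_order({"bond0.100": {"bond0"}, "bond0": {"eth0",
"eth1"}}) puts eth0 and eth1 before bond0 and bond0 before bond0.100,
even though the vlan was listed first, while a mutual dependency such as
{"a": {"b"}, "b": {"a"}} raises CycleError instead of deadlocking.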

For cloud-init, this is more complicated, as on boot the generators will
fire before eni is populated. Therefore cloud-init should probably
re-run this magical ifupdown generator (just like it does for netplan),
or cloud-init should create these symlinks directly, and reload systemd
before moving on to networking.service.

Does the above make sense at all?

** Attachment added: "ifupdown-race-itself.txt"
   
https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1636708/+attachment/4874021/+files/ifupdown-race-itself.txt

https://bugs.launchpad.net/bugs/1636708

Title:
  ifup -a does not start dependants last, causes deadlocks with
  vlans/bonding
