Package: bridge-utils
Version: 1.5-11
Severity: serious

TL;DR: If you're using bridges, bonds and VLANs together, set 
       BRIDGE_HOTPLUG=no in /etc/default/bridge-utils.

Dear Maintainer,

There are some rather serious race conditions arising from the fact that
bridge-utils handles udev events triggered by ifupdown actions and
messes with the state of various interfaces while ifupdown is still
running. To illustrate why this happens, take the following
/etc/network/interfaces (e/n/i) configuration as an example:

auto bond0
iface bond0 inet manual
  bond-slaves eth0 eth1
  bond-mode   active-backup
  bond-miimon 100
  up ip link set $IFACE mtu 9000

auto dmz
iface dmz inet manual
  bridge_ports  bond0.200
  bridge_fd       0
  bridge_stp      off
  bridge_maxwait  0
  up ip link set $IFACE up

This straightforward configuration worked fine in Jessie, but since
Stretch it produces unexpected results on boot. Among others, these
include:

 - setting the bond mode to round-robin instead of active-backup
 - creating bond0.200 with MTU 1500 instead of 9000

We have been hit by the above issues on production systems dist-upgraded 
to Stretch, and it all comes down to the races introduced by the 
bridge-utils hotplug support (which is now enabled by default).
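
For anyone wanting to apply the workaround from the TL;DR in a scripted
way, here is a minimal sketch. The function name disable_bridge_hotplug
is made up for illustration and is not part of bridge-utils; it assumes
GNU sed (for `-i`) and that the defaults file is a plain list of
VAR=value shell assignments.

```shell
# Hypothetical helper (not shipped by bridge-utils): make sure the
# defaults file contains BRIDGE_HOTPLUG=no, rewriting an existing
# uncommented assignment or appending one if there is none.
disable_bridge_hotplug() {
    conf=${1:-/etc/default/bridge-utils}
    if grep -q '^BRIDGE_HOTPLUG=' "$conf" 2>/dev/null; then
        # Replace whatever value is currently assigned.
        sed -i 's/^BRIDGE_HOTPLUG=.*/BRIDGE_HOTPLUG=no/' "$conf"
    else
        # No uncommented assignment yet: append one.
        echo 'BRIDGE_HOTPLUG=no' >> "$conf"
    fi
}
```

Running `disable_bridge_hotplug` as root with no arguments edits
/etc/default/bridge-utils; passing a path makes it operate on a copy
instead.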

What actually happens is the following:

 1. On boot, networking.service calls `ifup --allow=auto -a`. This 
    starts off by creating bond0. As soon as the ifenslave hooks create 
    the interface, a udev "add" event for bond0 is triggered, *while 
    ifup is still configuring bond0*.

 2. /lib/udev/bridge-network-interface is called, with $INTERFACE set to 
    bond0. The script will run `ifquery --list --allow auto` and will 
    look for any interface containing bond0 or bond0.* in its 
    bridge_ports, matching "dmz" in our case. It will then go on to:
     a) create_vlan_port: this will run `ip link set bond0 up` and then 
        create the vlan sub-interface on bond0
     b) call `ifup dmz` once the vlan port has been created

    All of the above - for all we know - happen while `ifup -a` is 
    *still* configuring bond0 on its own.
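
The matching described in step 2 can be approximated with the sketch
below. This is not the code from /lib/udev/bridge-network-interface; the
function name port_matches and its argument layout are assumptions made
purely to show why an "add" event for bond0 ends up selecting a bridge
whose only port is bond0.200.

```shell
# Hypothetical helper, not the code shipped in bridge-utils.
# $1 is the hotplugged device; the remaining arguments are the entries
# of one bridge's bridge_ports. Succeeds if the device is listed either
# directly or as the parent of a VLAN port (e.g. bond0 -> bond0.200).
port_matches() {
    dev=$1
    shift
    for port in "$@"; do
        case $port in
            "$dev"|"$dev".*) return 0 ;;
        esac
    done
    return 1
}

# bond0 is the parent of bond0.200, so the event for bond0 selects dmz.
port_matches bond0 bond0.200 && echo "bond0 matches dmz"
# eth0 is only a bond slave and does not appear in bridge_ports.
port_matches eth0 bond0.200 || echo "eth0 does not match dmz"
```

With the example configuration, only the event for bond0 matches, which
is exactly the interface that `ifup -a` is busy configuring at that
point.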

Step 2 is especially troublesome for the following reasons:

 i) create_vlan_port messes with the interface state while ifup is still 
    configuring it. Bonding interfaces - for instance - need to be down 
    to have their mode configured, and create_vlan_port explicitly sets 
    the bond interface up. This causes the bond interface to potentially 
    come up with the default mode (round-robin), making the system 
    unreachable in case e.g. 802.3ad was requested.

 ii) create_vlan_port creates the VLAN sub-interface while the 
     underlying device is still being configured. This means that the 
     VLAN interface may inherit the wrong MTU value, if ifup has not 
     yet set the parent interface's MTU to the desired value at the time 
     the VLAN interface is created.

 iii) dmz is brought up whenever bond0 is brought up, although this has 
      not necessarily been requested.

 iv) dmz is configured twice (once because of `ifup -a` and once because 
     of bridge-utils setting it up).

Note that high-cpu-count SMP systems seem more prone to the races i) and 
ii).
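
To check after boot whether a system lost races i) and ii), something
along these lines can be used. It is only a sketch: the function name
check_bond_state and the optional sysfs root argument are made up for
illustration. It relies on the bonding driver's sysfs file
bonding/mode, which reads as e.g. "active-backup 1" (or "balance-rr 0"
for round-robin), and on the per-interface mtu file.

```shell
# Hypothetical helper:
#   check_bond_state BOND VLAN_IF WANT_MODE WANT_MTU [SYSFS_ROOT]
# Compares the bond's mode and the VLAN interface's MTU against the
# expected values; prints a line per mismatch and returns non-zero if
# anything differs (or cannot be read).
check_bond_state() {
    bond=$1
    vlan=$2
    want_mode=$3
    want_mtu=$4
    root=${5:-/sys/class/net}
    rc=0

    # bonding/mode holds "<name> <number>"; keep only the name.
    mode=$(cut -d' ' -f1 "$root/$bond/bonding/mode" 2>/dev/null)
    mtu=$(cat "$root/$vlan/mtu" 2>/dev/null)

    if [ "$mode" != "$want_mode" ]; then
        echo "$bond: mode is '$mode', expected '$want_mode'"
        rc=1
    fi
    if [ "$mtu" != "$want_mtu" ]; then
        echo "$vlan: MTU is '$mtu', expected '$want_mtu'"
        rc=1
    fi
    return $rc
}
```

For the configuration above, the call would be
`check_bond_state bond0 bond0.200 active-backup 9000`.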

To be completely honest, I don't know what the hotplugging code is 
trying to achieve here, especially when it comes to short-circuiting 
ifupdown's internals. At a bare minimum, it should neither bring up 
"auto" interfaces that happen to be down, nor touch any interface while 
ifup might still be configuring it.

Regards,
Apollon

-- System Information:
Debian Release: buster/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'testing-debug'), (500, 
'testing'), (500, 'stable'), (90, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.14.0-3-amd64 (SMP w/4 CPU cores)
Locale: LANG=el_GR.UTF-8, LC_CTYPE=el_GR.UTF-8 (charmap=UTF-8), 
LANGUAGE=en_US:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages bridge-utils depends on:
ii  libc6  2.25-5

bridge-utils recommends no packages.

Versions of packages bridge-utils suggests:
ii  ifupdown  0.8.29

-- no debconf information
