[Touch-packages] [Bug 1768235] [NEW] Ifenslave failing when bonding vlans
Public bug reported:

There is a problem when running a bond on top of vlans. Running ifup with verbose enabled shows run-parts being executed in (what seems like) alphabetical order, but to enslave a vlan interface, run-part /etc/network/if-pre-up.d/vlan should be executed before run-part /etc/network/if-pre-up.d/ifenslave.

Our workaround has been to add "pre-up export IFACE= IF_VLAN_RAW_DEVICE=; /etc/network/if-pre-up.d/vlan" to all vlan slaves, but it would be better to fix this in the ifup scripts themselves by reordering the run-parts.

Tested with version 1.9-3ubuntu10.4

** Affects: ifupdown (Ubuntu)
   Importance: Undecided
   Status: New

** Description changed:

+ Tested with version 1.9-3ubuntu10.4

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to ifupdown in Ubuntu.
https://bugs.launchpad.net/bugs/1768235

Title: Ifenslave failing when bonding vlans

Status in ifupdown package in Ubuntu: New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1768235/+subscriptions

--
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
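The ordering problem described in this report can be illustrated with a small sketch. It assumes that run-parts processes hook scripts in plain lexical order in the C locale, as its manual page describes; the function name hook_order is hypothetical and is not part of ifupdown.

```shell
#!/bin/sh
# Hypothetical helper: list the files in a hook directory in C-locale
# lexical order, which is assumed here to match the order run-parts uses.
hook_order() {
    dir=$1
    for path in "$dir"/*; do
        # Skip anything that is not a regular file (including an unmatched glob).
        [ -f "$path" ] && basename "$path"
    done | LC_ALL=C sort
}
```

Calling hook_order /etc/network/if-pre-up.d on an affected system would be expected to list ifenslave before vlan, which is the order the report says breaks enslaving a vlan interface.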
[Touch-packages] [Bug 1701023] Re: (on trusty) version 1.9-3ubuntu10.4 regression blocking boot completion
Neither issue is fixed by the downgrade. As said, neither seems to have to do with vlan but with ifupdown.

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to ifupdown in Ubuntu.
https://bugs.launchpad.net/bugs/1701023

Title: (on trusty) version 1.9-3ubuntu10.4 regression blocking boot completion

Status in ifupdown package in Ubuntu: In Progress
Status in vlan package in Ubuntu: In Progress
Status in ifupdown source package in Trusty: In Progress
Status in vlan source package in Trusty: In Progress
Status in ifupdown source package in Xenial: In Progress
Status in vlan source package in Xenial: In Progress
Status in ifupdown source package in Artful: In Progress
Status in vlan source package in Artful: In Progress
Status in ifupdown source package in Bionic: In Progress
Status in vlan source package in Bionic: In Progress
Status in ifupdown package in Debian: Fix Released
Status in vlan package in Debian: New

Bug description:

When upgrading from version 1.9-3ubuntu10.1, a previously working machine can't successfully reboot completely. ifup is hanging indefinitely, with this process structure (from "pstree -a 1299"):

ifup,1299 -a
  └─run-parts,1501 /etc/network/if-pre-up.d
      └─bridge,1502 /etc/network/if-pre-up.d/bridge
          └─bridge,1508 /etc/network/if-pre-up.d/bridge
              └─vlan,1511 /etc/network/if-pre-up.d/vlan
                  └─ifup,1532 eth0

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
    address 192.168.10.65
    netmask 255.255.255.192
    gateway 192.168.10.66

auto eth0.11
    address 192.168.11.1
    netmask 255.255.255.0

auto br1134
iface br1134 inet manual
    bridge_ports eth0.1134
    bridge_stp off
    bridge_fd 0

The underlying interface eth0.1134 is not explicitly defined, but was previously auto-created during "ifup -a" execution. This apparently fails now. Reverting back to the 10.1 version re-establishes old behavior.
[Touch-packages] [Bug 1701023] Re: (on trusty) version 1.9-3ubuntu10.4 regression blocking boot completion
Hi @ddstreet. Completely understand your need to limit the scope of this bug. Just shared our findings, but feel free to ignore the stuff in #28 under item 2.

We did a lot of extensive testing over the weekend with the latest version of your PPA package and here are our main findings:

1) We migrated from separate files in /etc/network/interfaces.d to just declaring everything in the single /etc/network/interfaces file. This overcomes a lot of issues with regard to bringing interfaces up in the proper order, and "ifup -a" now works perfectly again. Some lessons learned for future reference: (a) to have bonds come up correctly you absolutely have to define slaves before the bond master, and the primary slave before secondary slaves, in the configuration file, and (b) to have a vlan come up correctly, define its raw device before the vlan device.

2) Even though "ifup -a" now works again, bringing bonds up correctly at boot does not. Pretty sure this has to do with the raw interfaces being detected by the kernel and brought up by systemd in a different order at boot. As said under (1), the order really matters. Bringing up a secondary slave before the primary slave seems to break the bond (it looks like this is due to using the wrong MAC address), and this appears to be what sometimes happens at boot. Our workaround mentioned in #28 under (3) mitigates this, but it's not very elegant at all.

3) There is a problem when running a bond on top of vlans. Running ifup with verbose enabled shows run-parts being executed in (what seems like) alphabetical order, but to enslave a vlan interface, run-part /etc/network/if-pre-up.d/vlan should be executed before run-part /etc/network/if-pre-up.d/ifenslave. For now we added "pre-up export IFACE= IF_VLAN_RAW_DEVICE=; /etc/network/if-pre-up.d/vlan" to all vlan slaves as a workaround, but it would be better to fix this in the ifupdown package itself.
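Lesson (b) above, defining a vlan's raw device before the vlan itself, can be checked mechanically. The following is a minimal sketch, not part of ifupdown: the function name check_vlan_order is hypothetical, and it only looks at "iface" stanzas and "vlan-raw-device" options in a single file.

```shell
#!/bin/sh
# Hypothetical check: report every vlan whose vlan-raw-device has no
# "iface" stanza earlier in the same interfaces file.
# Exit status is 0 when the order looks fine, 1 otherwise.
check_vlan_order() {
    awk '
        # Remember each interface as soon as its stanza starts.
        $1 == "iface" { current = $2; seen[current] = 1; next }
        # A raw device that has not been seen yet is defined too late (or not at all).
        $1 == "vlan-raw-device" && !($2 in seen) {
            printf "%s: raw device %s is not defined earlier\n", current, $2
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

Running check_vlan_order /etc/network/interfaces before a reboot would flag a vlan stanza that precedes its raw device. A similar pass over bond-master options could cover lesson (a), but that is left out of this sketch.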
[Touch-packages] [Bug 1701023] Re: (on trusty) version 1.9-3ubuntu10.4 regression blocking boot completion
We've been doing a lot more testing and debugging and I'd like to share our findings:

1) Unfortunately it turns out this change does not fix the issue of interfaces not coming up correctly for a bond with a (static) network configuration. The race condition seems to be removed, so at least there are no more hangs between bonds and their vlan children. All the interfaces also say they are UP, both when running ifup and after reboot. However:

- Running "ifup " does bring up the bond (and its vlans) in a working state.
- Running "ifup -a" or rebooting doesn't actually work, causing "network not available" errors and "Destination Host Unreachable" when pinging other machines. Executing "ifdown -a; ifup -a" shows that ifupdown tries to bring up the bond BEFORE the slaves instead of the other way around. Even though the bond and its slaves say they are UP after the 60s timeout, they don't actually function.
- We're not seeing any issues with bonds that do not have a network configuration of their own.

2) The networking script stack / concept seems fundamentally flawed in three areas:

2.A) Bonds rely on slaves having "bond-master" and are started by bringing up the slaves, but there is no support for the master having "bond-slaves" and for starting a bond by just bringing up the bond directly.

2.B) Bringing a specific interface up automatically brings up its child vlans. This does not make a lot of sense. The other way around does (in order to bring up a vlan we need to bring up its raw device), but why would the ifupdown scripts assume that I want to bring up all of its vlans when I bring up an interface that (also) serves as a raw device? In that case I would probably run "ifup -a"!

2.C) A vlan running on top of a bond cannot be brought up directly due to /sys/class/net// not existing. This results in the following:

> # ifup bo-adm.2
> Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
> cat: /sys/class/net/bo-adm/mtu: No such file or directory
> Device "bo-adm" does not exist.
> bo-adm does not exist, unable to create bo-adm.2
> run-parts: /etc/network/if-pre-up.d/vlan exited with return code 1
> Failed to bring up bo-adm.2.

3) Our new workaround for boot has become this very intrusive systemd service:

> [Unit]
> Wants=network-online.target
> After=network-online.target
>
> [Install]
> WantedBy=multi-user.target
>
> [Service]
> Type=oneshot
> ExecStartPre=/sbin/ifdown bo-adm
> ExecStart=/sbin/ifup enp0s3
> ExecStart=/sbin/ifup enp0s10
> ExecStop=/sbin/ifdown bo-adm
> RemainAfterExit=yes
> TimeoutStartSec=5min
[Touch-packages] [Bug 1701023] Re: (on trusty) version 1.9-3ubuntu10.4 regression blocking boot completion
@ddstreet I tested with the latest version of ifupdown and vlan from your PPA with my 4 test scenarios and can confirm it works as expected. Interfaces come up correctly both when doing an "ifup -a" and during boot.

One small thing I've noticed is a variation in the number of "Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config" messages in the ifup output depending on the scenario, even though the number of vlans is the same in all 4 tests. With 2 vlans, the 2 scenarios with bonding generate 1 message, the ones without bonding generate 4. It doesn't hurt, just something I noticed.

Please check the attached log for more details on my 4 tests.

** Attachment added: "Test output for 4 different scenario's"
   https://bugs.launchpad.net/ubuntu/+source/vlan/+bug/1701023/+attachment/5124886/+files/vlan-tests.txt
[Touch-packages] [Bug 1759573] Re: vlan on top of untagged network won't start
@ddstreet here's how I reproduced: I created a VirtualBox VM with Xenial and 3 interfaces (enp0s3 and enp0s8 on an internal network, enp0s9 on a bridge to my LAN). Then I applied each of the 4 configurations below and ran "ifup -a". Try it and you'll see the same behavior.

You are correct: the proper way to bring up the bond is to bring up its slaves. Running ifup on the bond just hangs, as you have already established. This is an entirely different bug I guess, but not my main concern right now. I've had to use "bond-master" for the enp0sX interfaces and set "bond-slaves none" for the bond to get it to work. I guess we need support for setting bond-master and bond-slaves at the same time to be able to bring up the bond both by bringing up a slave and by bringing up the bond itself.

Just to summarize: it is a duplicate of the other bug and it is fixed by your patch!

==

auto lo
iface lo inet loopback

auto enp0s9
iface enp0s9 inet static
    mtu 1500
    address 192.168.1.9
    gateway 192.168.1.1
    netmask 255.255.255.0
    dns-nameservers 1.1.1.1

auto enp0s3
iface enp0s3 inet manual
    mtu 1500
    bond-master bo-adm
    bond-primary enp0s3

auto enp0s8
iface enp0s8 inet manual
    mtu 1500
    bond-master bo-adm

auto bo-adm
iface bo-adm inet static
    mtu 1500
    address 10.10.10.3
    netmask 255.255.0.0
    bond-miimon 100
    bond-mode active-backup
    bond-slaves none
    bond-downdelay 200
    bond-updelay 200

auto bo-adm.2
iface bo-adm.2 inet static
    mtu 1500
    address 10.11.10.3
    netmask 255.255.0.0
    vlan-raw-device bo-adm

==

auto lo
iface lo inet loopback

auto enp0s9
iface enp0s9 inet static
    mtu 1500
    address 192.168.1.9
    gateway 192.168.1.1
    netmask 255.255.255.0
    dns-nameservers 1.1.1.1

auto enp0s3
iface enp0s3 inet manual
    mtu 1500
    bond-master bo-adm
    bond-primary enp0s3

auto enp0s8
iface enp0s8 inet manual
    mtu 1500
    bond-master bo-adm

auto bo-adm
iface bo-adm inet manual
    mtu 1500
    bond-miimon 100
    bond-mode active-backup
    bond-slaves none
    bond-downdelay 200
    bond-updelay 200

auto bo-adm.2
iface bo-adm.2 inet static
    mtu 1500
    address 10.11.10.3
    netmask 255.255.0.0
    vlan-raw-device bo-adm

==

auto lo
iface lo inet loopback

auto enp0s9
iface enp0s9 inet static
    mtu 1500
    address 192.168.1.9
    gateway 192.168.1.1
    netmask 255.255.255.0
    dns-nameservers 1.1.1.1

auto enp0s3
iface enp0s3 inet manual
    mtu 1500

auto enp0s3.2
iface enp0s3.2 inet static
    mtu 1500
    address 10.11.10.3
    netmask 255.255.0.0
    vlan-raw-device enp0s3

==

auto lo
iface lo inet loopback

auto enp0s9
iface enp0s9 inet static
    mtu 1500
    address 192.168.1.9
    gateway 192.168.1.1
    netmask 255.255.255.0
    dns-nameservers 1.1.1.1

auto enp0s3
iface enp0s3 inet static
    mtu 1500
    address 10.10.10.3
    netmask 255.255.0.0
    bond-miimon 100
    bond-mode active-backup
    bond-slaves none
    bond-downdelay 200
    bond-updelay 200

auto enp0s3.2
iface enp0s3.2 inet static
    mtu 1500
    address 10.11.10.3
    netmask 255.255.0.0
    vlan-raw-device enp0s3
[Touch-packages] [Bug 1759573] Re: vlan on top of untagged network won't start
*** This bug is a duplicate of bug 1701023 ***
    https://bugs.launchpad.net/bugs/1701023

Totally support marking this as a duplicate. As long as we get this fix pushed a.s.a.p. :)

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to ifupdown in Ubuntu.
https://bugs.launchpad.net/bugs/1759573

Title: vlan on top of untagged network won't start

Status in ifupdown package in Ubuntu: New
Status in vlan package in Ubuntu: New

Bug description:

Due to an upgrade (probably of the ifupdown or vlan package), this specific network configuration no longer comes up automatically:

1) Two or more network interfaces bonded
2) An untagged network configured on that bond
3) A vlan on top of that untagged network

What does come up automatically:

1) A single (i.e. unbonded) network interface with an untagged network configured and a vlan on top of that network
2) Two or more network interfaces bonded with a vlan on top of that untagged bond

An exact example of the configuration that doesn't work is provided below. It fails to come up correctly, both during boot and manually. The problem seems to be a blocking dependency loop between the bond and the vlan.

As recommended in https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1636708/comments/13 we added dependency ordering using ifup@.service systemd units for all 4 interfaces, but this did not affect the behaviour in any way.

Perhaps related to LP bug 1573272 or bug 1636708?

== Interface configuration ==

auto eno1
iface eno1 inet manual
    mtu 1500
    bond-master bond1
    bond-primary eno1

auto eno2
iface eno2 inet manual
    mtu 1500
    bond-master bond1

auto bond1
iface bond1 inet static
    mtu 1500
    address 10.10.10.3
    bond-miimon 100
    bond-mode active-backup
    bond-slaves none
    bond-downdelay 0
    bond-updelay 0
    dns-nameservers 10.10.10.1
    gateway 10.10.10.1
    netmask 255.255.0.0

auto bond1.2
iface bond1.2 inet static
    mtu 1500
    address 10.11.10.3
    netmask 255.255.0.0
    vlan-raw-device bond1

== When bringing up the bond ==

# ifup bond1 &
Waiting for a slave to join bond1 (will timeout after 60s)

# ps afx
(...)
ifup bond1
\_ /bin/sh -c /bin/run-parts --exit-on-error /etc/network/if-pre-up.d
\_ /bin/run-parts --exit-on-error /etc/network/if-pre-up.d
\_ /bin/sh /etc/network/if-pre-up.d/ifenslave
(...)
/lib/systemd/systemd-udevd
\_ /lib/systemd/systemd-udevd
\_ /bin/sh /lib/udev/vlan-network-interface
\_ /bin/sh /etc/network/if-pre-up.d/vlan
\_ ifup bond1
(...)

==> After waiting 60 seconds:

# ip link | grep -E 'eno[1|2]|bond1*'
eno1:mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
eno2: mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
bond1: mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
bond1.2@bond1: mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000

== When bringing up a slave ==

# ifup eno1
Waiting for bond master bond1 to be ready

# ps afx
(...)
/lib/systemd/systemd-udevd
\_ /lib/systemd/systemd-udevd
\_ /bin/sh /lib/udev/vlan-network-interface
\_ /bin/sh /etc/network/if-pre-up.d/vlan
\_ ifup bond1
\_ /bin/sh -c /bin/run-parts --exit-on-error /etc/network/if-pre-up.d
\_ /bin/run-parts --exit-on-error /etc/network/if-pre-up.d
\_ /bin/sh /etc/network/if-pre-up.d/ifenslave
\_ /bin/sh /lib/udev/vlan-network-interface
\_ /bin/sh /etc/network/if-pre-up.d/vlan
\_ ifup bond1
(...)

# ip link | grep -E 'eno[1|2]|bond1*'
eno1: mtu 1500 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
eno2: mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
bond1: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000

== Only workaround that works ==

# ifup eno1
Waiting for bond master bond1 to be ready

# kill $(ps -ef | grep 'ifup bond1' | sed
[Touch-packages] [Bug 1759573] Re: vlan on top of untagged network won't start
Got right on it! Tested your PPA version on the same test cases and it works perfectly. It is indeed the same issue and the same type of resolution. Your ifquery is more elegant though, so definitely go with that!

Would love to see this released and pushed a.s.a.p. because this is breaking our production systems!
[Touch-packages] [Bug 1759573] Re: vlan on top of untagged network won't start
Been doing some troubleshooting and we think we've found the fix for this issue.

The script /etc/network/if-pre-up.d/vlan contains the following section of code starting at line 62:

    if [ ! -e "/sys/class/net/$IFACE" ]; then
        # Try ifup for the raw device, if it fails then bring it up directly
        # this is required e.g. there is no configuration for the raw device
        ifup $IF_VLAN_RAW_DEVICE || ip link set up dev $IF_VLAN_RAW_DEVICE
        vconfig add $IF_VLAN_RAW_DEVICE $VLANID
    fi

In this case it's trying to bring up a raw device that has already been brought up, causing it to wait forever for the lock on the raw interface to be released. What it lacks is a check on the status of the raw interface: it shouldn't have to bring the raw interface up if it is already up. The problem goes away when we put an if-statement around that part of the code:

    if [ ! -e "/sys/class/net/$IFACE" ]; then
        if ! `cat /sys/class/net/$IF_VLAN_RAW_DEVICE/operstate 2> /dev/null | grep -q "up"`; then
            # Try ifup for the raw device, if it fails then bring it up directly
            # this is required e.g. there is no configuration for the raw device
            ifup $IF_VLAN_RAW_DEVICE || ip link set up dev $IF_VLAN_RAW_DEVICE
        fi
        vconfig add $IF_VLAN_RAW_DEVICE $VLANID
    fi

It seems to work perfectly, tested on these cases:

1) a vlan on top of a single enp0sX interface without untagged network configuration
2) a vlan on top of a single enp0sX interface with its own untagged network configuration
3) a vlan on top of a bond of two enp0sX interfaces, without the bond having its own untagged network configuration
4) a vlan on top of a bond of two enp0sX interfaces, with the bond having its own untagged network configuration
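The operstate check proposed in this comment can be isolated into a small function, which makes it easy to try against a fake directory tree. This is only a sketch of the idea, not the code shipped in the vlan package: the function name raw_device_is_up and the SYSFS_ROOT override are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given device reports
# operstate "up". SYSFS_ROOT is an assumed override for testing;
# it defaults to the real /sys/class/net.
raw_device_is_up() {
    state_file="${SYSFS_ROOT:-/sys/class/net}/$1/operstate"
    # A missing or unreadable file counts as "not up".
    [ -r "$state_file" ] && [ "$(cat "$state_file")" = "up" ]
}
```

In the hook script this could guard the ifup call in the same way as the if-statement above, for example: raw_device_is_up "$IF_VLAN_RAW_DEVICE" || ifup "$IF_VLAN_RAW_DEVICE". Devices that report "unknown" while carrying traffic would still be treated as not up by this sketch.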
[Touch-packages] [Bug 1636708] Re: ifup -a does not start dependants last, causes deadlocks with vlans/bonding
@ddstreet I opened a separate ticket as requested a while ago. Some eyes on it would be very welcome. It's bug 1759573.

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to ifupdown in Ubuntu.
https://bugs.launchpad.net/bugs/1636708

Title:
  ifup -a does not start dependants last, causes deadlocks with vlans/bonding

Status in ifupdown package in Ubuntu:
  Confirmed
Status in ifupdown source package in Xenial:
  Confirmed

Bug description:
  This is a problem I've been struggling with since moving to 16.04.1 from 14.04 (fresh install).

  I don't believe this problem affected 14.04. I have used an almost identical interfaces file on 14.04 without problems. On 16.04.1, however, 9/10 boots hang during network configuration and leave the network incorrectly configured.

  When calling "ifup -a", all candidate interfaces appear to be started in parallel, leading to collisions with locks. This causes hanging (until timeout) during boot and leaves the network interfaces incorrectly configured.

  Imagine this /etc/network/interfaces:

  auto eno1 bond0 bond0.1

  iface eno1 inet manual
    bond-master bond0

  iface bond0 inet manual
    bond-slaves eno1
    bond-mode 4
    bond-lacp-rate 1
    bond-miimon 100
    bond-updelay 200
    bond-downdelay 200

  iface bond0.5 inet dhcp
    vlan-raw-device bond0

  eno1 -> bond0 -> bond0.5 -> dhcp

  When calling "ifup -a" at boot time, all three interfaces are started at the same time. bond0 and bond0.5 both attempt to take the same lock file: /run/network/ifstate.bond0

  If bond0 wins the race, the system will start correctly (1/10):

  * bond0 starts and creates the bond0 device and the ifenslave.bond0 file to indicate the bond is ready
  * eno1 polls for the ifenslave.bond0 file; when it appears, it attaches eno1 to bond0
  * bond0 finishes and releases the lock
  * bond0.5 now acquires the lock
  * bond0.5 starts dhclient, which can talk to the network and configure the interface

  If, however, bond0.5 wins the lock race, the system will hang at boot (5 mins) and fail to set up the network:

  * bond0.5 is awarded the ifstate.bond0 lock file
  * bond0.5 starts dhclient, waiting to hear from the network
  * bond0 is blocked, so neither the bond0 device nor the ifenslave.bond0 file is created
  * eno1 polls but never finds the ifenslave.bond0 file, so it never attaches to bond0
  * bond0.5's dhclient is trying to talk to a disconnected network and never receives an answer

  ! bond0.5 is stuck running dhclient
  ! bond0 is stuck waiting for bond0.5 to finish
  ! eno1 is stuck waiting for bond0 to create the ifenslave.bond0 file

  I believe ifup should start interfaces (that share lock files) in dependency order. The most basic interface must be awarded the lock ahead of its dependants. In this case:

  1 eno1
  2 bond0
  3 bond0.5

  but never:

  1 eno1
  2 bond0.5
  3 bond0

  As a workaround, in /etc/network/interfaces:

  -auto eno1 bond0 bond0.1
  +auto eno1 bond0
  +allow-bond bond0.1

  And also in /lib/systemd/system/networking.service:

   ExecStart=/sbin/ifup -a --read-environment
  +ExecStart=/sbin/ifup -a --allow=bond --read-environment
   ExecStop=/sbin/ifdown -a --read-environment

  Then run: systemctl daemon-reload

  This causes all "auto" interfaces to start first and then, when they have completed, all allow-bond interfaces to start.
ProblemType: Bug DistroRelease: Ubuntu 16.04 Package: ifupdown 0.8.10ubuntu1.1 [modified: lib/systemd/system/networking.service] ProcVersionSignature: Ubuntu 4.4.0-45.66-generic 4.4.21 Uname: Linux 4.4.0-45-generic x86_64 ApportVersion: 2.20.1-0ubuntu2.1 Architecture: amd64 Date: Wed Oct 26 06:32:57 2016 InstallationDate: Installed on 2016-10-24 (1 days ago) InstallationMedia: Ubuntu-Server 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719) SourcePackage: ifupdown UpgradeStatus: No upgrade log present (probably fresh install) modified.conffile..etc.init.networking.conf: [modified] mtime.conffile..etc.init.networking.conf: 2016-10-26T04:52:05.750927 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1636708/+subscriptions -- Mailing list: https://launchpad.net/~touch-packages Post to : touch-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~touch-packages More help : https://help.launchpad.net/ListHelp
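The start order requested in the description of bug 1636708 (eno1, then bond0, then bond0.5) amounts to a topological sort of the interface dependencies. A minimal sketch of that idea follows, assuming interfaces(5)-style stanzas on stdin or in files. The helper names iface_edges and start_order are hypothetical and not part of ifupdown, and the script only prints an order; it does not bring anything up or touch any lock files.

```shell
#!/bin/sh
# Sketch only: iface_edges and start_order are hypothetical helpers,
# not part of ifupdown.

# Emit one "first second" pair per dependency found in interfaces(5)-style
# input, meaning "first" should be started before "second".
iface_edges() {
    awk '
        $1 == "iface"           { cur = $2 }
        # A vlan needs its raw device first.
        $1 == "vlan-raw-device" { print $2, cur }
        # The report lists the slave (eno1) ahead of its bond (bond0).
        $1 == "bond-master"     { print cur, $2 }
    ' "$@"
}

# Topologically sort the pairs with tsort(1).
start_order() {
    iface_edges "$@" | tsort
}

# Example with the stanzas from the report; prints eno1, bond0, bond0.5
# on separate lines.
start_order <<'EOF'
iface eno1 inet manual
    bond-master bond0
iface bond0 inet manual
    bond-slaves eno1
iface bond0.5 inet dhcp
    vlan-raw-device bond0
EOF
```

Running start_order against /etc/network/interfaces (plus any files it sources) would give a candidate order to compare with what "ifup -a" actually does. Treating a slave as coming before its bond simply follows the order given in the report; it is an assumption here, not documented ifupdown behaviour.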
[Touch-packages] [Bug 1759573] [NEW] vlan on top of untagged network won't start
Public bug reported:

Due to an upgrade (probably of the ifupdown or vlan package), this specific network configuration no longer comes up automatically:

1) Two or more network interfaces bonded
2) An untagged network configured on that bond
3) A vlan on top of that untagged network

What does come up automatically:

1) A single (i.e. unbonded) network interface with an untagged network configured and a vlan on top of that network
2) Two or more network interfaces bonded with a vlan on top of that untagged bond

An exact example of the configuration that doesn't work is provided below. It fails to come up correctly, both during boot and manually. The problem seems to be a blocking dependency loop between the bond and the vlan.

As recommended in https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1636708/comments/13 we added dependency ordering using ifup@.service systemd units for all 4 interfaces, but this did not affect the behaviour in any way.

Perhaps related to LP bug 1573272 or bug 1636708?

== Interface configuration ==

auto eno1
iface eno1 inet manual
    mtu 1500
    bond-master bond1
    bond-primary eno1

auto eno2
iface eno2 inet manual
    mtu 1500
    bond-master bond1

auto bond1
iface bond1 inet static
    mtu 1500
    address 10.10.10.3
    bond-miimon 100
    bond-mode active-backup
    bond-slaves none
    bond-downdelay 0
    bond-updelay 0
    dns-nameservers 10.10.10.1
    gateway 10.10.10.1
    netmask 255.255.0.0

auto bond1.2
iface bond1.2 inet static
    mtu 1500
    address 10.11.10.3
    netmask 255.255.0.0
    vlan-raw-device bond1

== When bringing up the bond ==

# ifup bond1 &
Waiting for a slave to join bond1 (will timeout after 60s)

# ps afx
(...)
ifup bond1 \_ /bin/sh -c /bin/run-parts --exit-on-error /etc/network/if-pre-up.d \_ /bin/run-parts --exit-on-error /etc/network/if-pre-up.d \_ /bin/sh /etc/network/if-pre-up.d/ifenslave
(...)
/lib/systemd/systemd-udevd \_ /lib/systemd/systemd-udevd \_ /bin/sh /lib/udev/vlan-network-interface \_ /bin/sh /etc/network/if-pre-up.d/vlan \_ ifup bond1
(...)

==> After waiting 60 seconds:

# ip link | grep -E 'eno[1|2]|bond1*'
eno1: mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
eno2: mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
bond1: mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
bond1.2@bond1: mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000

== When bringing up a slave ==

# ifup eno1
Waiting for bond master bond1 to be ready

# ps afx
(...)
/lib/systemd/systemd-udevd \_ /lib/systemd/systemd-udevd \_ /bin/sh /lib/udev/vlan-network-interface \_ /bin/sh /etc/network/if-pre-up.d/vlan \_ ifup bond1 \_ /bin/sh -c /bin/run-parts --exit-on-error /etc/network/if-pre-up.d \_ /bin/run-parts --exit-on-error /etc/network/if-pre-up.d \_ /bin/sh /etc/network/if-pre-up.d/ifenslave \_ /bin/sh /lib/udev/vlan-network-interface \_ /bin/sh /etc/network/if-pre-up.d/vlan \_ ifup bond1
(...)

# ip link | grep -E 'eno[1|2]|bond1*'
eno1: mtu 1500 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
eno2: mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
bond1: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000

== Only workaround that works ==

# ifup eno1
Waiting for bond master bond1 to be ready
# kill $(ps -ef | grep 'ifup bond1' | sed -n 2p | awk '{ print $2}')
# ifup eno2

# ip link | grep -E 'eno[1|2]|bond1*'
eno1: mtu 1500 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
eno2: mtu 1500 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
bond1: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
bond1.2@bond1: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000

** Affects: ifupdown (Ubuntu) Importance: Undecided Status: New
** Affects: vlan (Ubuntu) Importance: Undecided
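The kill command in the workaround above picks the second line of a grep over ps output, which depends on process ordering and can also match the grep itself or "ifup bond1.2". A slightly more careful way to find the candidate PIDs might look like the sketch below. It assumes the stuck process shows up with the exact command line "ifup bond1", as in the ps listings in this report; filter_ifup_pids and ifup_pids are hypothetical helper names, not existing tools.

```shell
#!/bin/sh
# Sketch only: filter_ifup_pids and ifup_pids are hypothetical helpers.

# Read "PID COMMAND..." lines on stdin and print the PIDs whose command
# line is exactly "ifup <iface>", so that "ifup bond1.2" and the grep
# process itself do not match when <iface> is bond1.
filter_ifup_pids() {
    awk -v want="ifup $1" '
        {
            pid = $1
            $1 = ""            # drop the PID column
            sub(/^ +/, "")     # trim the separator left behind
            if ($0 == want) print pid
        }'
}

# List matching PIDs from the live process table.
ifup_pids() {
    ps -e -o pid= -o args= | filter_ifup_pids "$1"
}
```

Calling ifup_pids bond1 only lists PIDs. Which of them to kill (the report kills the second "ifup bond1" it finds) is still a manual decision, and killing it remains a workaround rather than a fix.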
[Touch-packages] [Bug 1636708] Re: ifup -a does not start dependants last, causes deadlocks with vlans/bonding
We upgraded to the vlan 1.9-3.2ubuntu1.16.04.3 package and our networking broke horribly in a very similar way.

Let me start with our networking configuration. Two slaves, a bond and a vlan on top of that bond:

auto eno1
iface eno1 inet manual
    mtu 1500
    bond-master bond1
    bond-primary eno1

auto eno2
iface eno2 inet manual
    mtu 1500
    bond-master bond1

auto bond1
iface bond1 inet static
    mtu 1500
    address 10.10.10.3
    bond-miimon 100
    bond-mode active-backup
    bond-slaves none
    bond-downdelay 200
    bond-updelay 200
    dns-nameservers 10.10.0.1
    netmask 255.255.0.0

auto bond1.2
iface bond1.2 inet static
    mtu 1500
    address 10.11.10.3
    netmask 255.255.0.0
    vlan-raw-device bond1

This fails to come up correctly, both during boot and manually. Bringing up any of eno1, eno2, bond1 or bond1.2 results in the same problem: "ifup: waiting for lock on /run/network/ifstate.bond1".

The problem seems to be that ifup tries to bring up the base bond1 interface *again*, even if it is already up. It then gets stuck waiting for the bond1 interface to be unlocked so it can bring it up, but bond1 is already up and thus locked, so that will never happen.

We also tried bringing all interfaces down and just running "ifup bond1.2", but that results in the same behavior.

The only workaround that seemed to work for us was to:

1) temporarily remove the bond1.2.cfg from /etc/network/interfaces.d
2) bring up eno1, eno2 and bond1
3) put the bond1.2.cfg back in its place
4) run "ifup bond1.2"
5) using another terminal, list the running ifup processes using "ps -ef | grep ifup"
6) kill the "ifup bond1" process

The "ps -ef | grep ifup" during step 5 outputs two ifup processes: one for bond1 and one for bond1.2. As soon as we kill the "ifup bond1" process, the "ifup bond1.2" process completes immediately and correctly configures the vlan 2 subinterface.

This is clearly linked to vlan, because our infiniband interfaces work just fine. Also, it worked just fine before upgrading the package.
So my best guess would be that something broke in the code that detects if the vlan-raw-device is up. Perhaps related to LP #1573272?