We've been doing a lot more testing and debugging and I'd like to share
our findings:

1) Unfortunately, it turns out this change does not fix the issue of interfaces
not coming up correctly for a bond that has its own (static) network
configuration. The race condition seems to be gone, so at least there are no
more hangs between bonds and their vlan children. All the interfaces also say
they are UP, both after running ifup and after a reboot. However:
- Running "ifup <slavename>" does bring up the bond (and its vlans) in a 
working state.
- Running "ifup -a" or rebooting does not actually work: we get "network not
available" errors and "Destination Host Unreachable" when pinging other
machines. Executing "ifdown -a; ifup -a" shows that ifupdown tries to bring up
the bond BEFORE the slaves instead of the other way around. Even though the
bond and its slaves say they are UP after the 60s timeout, they don't actually
function.
- We're not seeing any issues with bonds that do not have a network
configuration of their own.
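
Since "UP" evidently does not mean "working" here, one extra thing that could
be checked after "ifup -a" is whether the bond actually has any interfaces
enslaved. A minimal sketch, assuming the standard Linux bonding sysfs file
/sys/class/net/<bond>/bonding/slaves; the function name bond_slave_count and
the SYSFS_NET override are hypothetical and exist only for this illustration:

```shell
#!/bin/sh
# Hypothetical helper: count the interfaces currently enslaved to a bond.
# SYSFS_NET can be overridden for testing; it defaults to the real sysfs path.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

bond_slave_count() {
    slaves_file="$SYSFS_NET/$1/bonding/slaves"
    # A missing file means the bond device does not exist (yet).
    [ -r "$slaves_file" ] || { echo 0; return 1; }
    # The file holds a space-separated list of slave names on one line;
    # word-splitting it into the positional parameters lets us count them.
    set -- $(cat "$slaves_file")
    echo "$#"
}
```

Calling "bond_slave_count bo-adm" right after "ifup -a" and again after
"ifup <slavename>" would show whether the two paths end up with the same
set of slaves attached. It does not prove the bond passes traffic.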

2) The networking script stack / concept seems fundamentally flawed in
three areas:

2.A) Bonds rely on their slaves having "bond-master" and are started by
bringing up the slaves. There is no support for the master having
"bond-slaves", which would allow starting a bond by just bringing up the
bond directly.
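
For readers less familiar with this layout, the slave-driven configuration
described in 2.A looks roughly like the following in /etc/network/interfaces.
This is only a sketch: the interface names are the ones used elsewhere in
this comment, while the address, netmask and bond-mode values are placeholder
assumptions and not taken from a real configuration.

```
# Slaves point at the bond via bond-master.
auto enp0s3
iface enp0s3 inet manual
    bond-master bo-adm

auto enp0s10
iface enp0s10 inet manual
    bond-master bo-adm

# The bond carries the (static) network configuration.
# Address, netmask and bond-mode below are assumed example values.
auto bo-adm
iface bo-adm inet static
    address 192.0.2.10
    netmask 255.255.255.0
    bond-mode active-backup
    bond-slaves none
```

With this layout the bond only becomes usable once a slave stanza has been
brought up, which is why the order in which "ifup -a" processes the stanzas
matters.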

2.B) Bringing up a specific interface automatically brings up its child
vlans. This does not make a lot of sense. The other way around does -
e.g. in order to bring up a vlan we need to bring up its raw device -
but why would the ifupdown scripts assume that I want to bring up all of
its vlans when I bring up an interface that (also) serves as a raw
device? In that case I would probably run "ifup -a"!

2.C) A vlan running on top of a bond cannot be brought up directly because
/sys/class/net/<bondname>/ does not exist. This results in the following:
>  # ifup bo-adm.2
>  Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
>  cat: /sys/class/net/bo-adm/mtu: No such file or directory
>  Device "bo-adm" does not exist.
>  bo-adm does not exist, unable to create bo-adm.2
>  run-parts: /etc/network/if-pre-up.d/vlan exited with return code 1
>  Failed to bring up bo-adm.2.
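
The failing precondition can be checked up front. A minimal sketch, assuming
vlan interfaces are named "<rawdevice>.<vid>" as in the output above; the
function names and the SYSFS_NET override are hypothetical and not part of
ifupdown or the vlan package:

```shell
#!/bin/sh
# Hypothetical pre-check: does the raw device of a vlan exist in sysfs?
# SYSFS_NET can be overridden for testing; it defaults to the real sysfs path.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Derive the raw device from a "<raw>.<vid>" name, e.g. bo-adm.2 -> bo-adm.
vlan_raw_device() {
    echo "${1%.*}"
}

# Succeeds only if the raw device is already present in sysfs.
raw_device_exists() {
    [ -d "$SYSFS_NET/$(vlan_raw_device "$1")" ]
}
```

For example, "raw_device_exists bo-adm.2 || ifup enp0s3" would bring up a
slave (and with it the bond) before attempting "ifup bo-adm.2". This only
works around the symptom; it does not change how the if-pre-up.d/vlan script
behaves.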

3) Our new workaround for boot has become this very intrusive systemd service:
> [Unit]
> Wants=network-online.target
> After=network-online.target
> 
> [Install]
> WantedBy=multi-user.target
> 
> [Service]
> Type=oneshot
> ExecStartPre=/sbin/ifdown bo-adm
> ExecStart=/sbin/ifup enp0s3
> ExecStart=/sbin/ifup enp0s10
> ExecStop=/sbin/ifdown bo-adm
> RemainAfterExit=yes
> TimeoutStartSec=5min

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1701023

Title:
  (on trusty) version 1.9-3ubuntu10.4 regression blocking boot
  completion

