** Description changed:

  [ Impact ]

- A logic bug in libvirt causes specific network interface configurations to set
- the default `qdisc` on macvtap devices (`fq_codel`) instead of `noqueue`,
- causing severe performance degradation due to lock contention in the kernel.
+ A logic bug in libvirt causes specific network interface QoS configurations to
+ set the root `qdisc` on macvtap devices to `fq_codel` (the default) instead of
+ `noqueue`. This can cause severe performance degradation due to lock
+ contention in the kernel.

- This happens only when an interface has only an ingress bandwidth limit
- configured:
+ This affects interfaces of types 'ethernet', 'network', 'bridge' and
+ 'direct'.
+
+ For 'direct' and some 'ethernet' interfaces, this happens only when the
+ interface has only an inbound bandwidth limit configured:

      <interface type='direct'>
        <bandwidth>
          <inbound average='3125000' peak='3125000'/>
        </bandwidth>
        ...
      </interface>

- This affects interfaces of types 'ethernet', 'network', 'bridge' and
- 'direct'.
+ Other interface types are only affected with only an outbound bandwidth
+ limit:

- The affected user measured the outgoing bandwidth of the interface with both
- qdiscs:
+     <interface type='network'>
+       <bandwidth>
+         <outbound average='3125000' peak='3125000'/>
+       </bandwidth>
+       ...
+     </interface>
+
+ The affected user measured the outgoing bandwidth of their macvtap
+ interface with both qdiscs:

  `fq_codel`: 9.64 Gbits/sec
  `noqueue`: 16.0 Gbits/sec

  See "Other information" below for a more detailed breakdown of the problem.

  Reported upstream at [1] and fixed with [2] and [3].

  [1] https://gitlab.com/libvirt/libvirt/-/work_items/875
  [2] https://gitlab.com/libvirt/libvirt/-/commit/0d906d8a141c7b6024ff9416a2b814acbac592c1
  [3] https://gitlab.com/libvirt/libvirt/-/commit/124c53169eae20655f04eb9ce05be8f6cac0eb08

  [ Test plan ]

  In an LXD VM with two nics (one for connectivity and the other to pass
  through to a guest):

  ```sh
  sudo apt install libvirt-daemon-system virtinst

  cat > user-data <<EOF
  #cloud-config
  password: password
  chpasswd:
    expire: False
  EOF
  touch meta-data network-config

  wget https://cloud-images.ubuntu.com/daily/server/noble/current/noble-server-cloudimg-amd64.img
  sudo mv noble-server-cloudimg-amd64.img /var/lib/libvirt/images/

  virt-install \
    --name n0 \
    --os-variant=ubuntu24.04 \
    --ram=1024 --vcpus=2 \
    --disk pool=default,size=4,backing_store=/var/lib/libvirt/images/noble-server-cloudimg-amd64.img,bus=virtio,cache=writethrough \
    --graphics none \
    --network type=direct,source=enp6s0 \
    --cloud-init user-data="user-data,meta-data=meta-data,network-config=network-config"
  ```

  Confirm that the interface's qdisc is `noqueue`:

  ```
  $ sudo tc qdisc show dev macvtap0
  qdisc noqueue 8001: root refcnt 2
  ```

  ```sh
  virsh shutdown n0
  virsh edit n0
  ```

  Add an ingress bandwidth limit to the interface:

  ```xml
  <bandwidth>
    <inbound average='3125000' peak='3125000'/>
  </bandwidth>
  ```

  ```sh
  virsh start n0
  ```

  Expected behavior: The interface's `root` qdisc is `noqueue`:

  ```
  $ sudo tc qdisc show dev macvtap0
  qdisc noqueue 8001: root refcnt 2
  qdisc ingress ffff: parent ffff:fff1 ----------------
  ```

  Actual behavior: The interface's `root` qdisc is `fq_codel`:

  ```
  $ sudo tc qdisc show dev macvtap1
  qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
  qdisc ingress ffff: parent ffff:fff1 ----------------
  ```

  [ Where problems could occur ]

  The patches modify a codepath that is only executed when a domain or
  network is configured with <bandwidth> elements (see docs links below);
  regressions would only affect users utilizing libvirt's QoS configuration
  features.

  The change should be a no-op for all interface types except the four for
  which libvirt sets the qdisc to noqueue (see
  `qemuDomainInterfaceSetDefaultQDisc` in `src/qemu/qemu_domain.c`):

  - VIR_DOMAIN_NET_TYPE_ETHERNET: <interface type='ethernet'>
  - VIR_DOMAIN_NET_TYPE_NETWORK: <interface type='network'>
  - VIR_DOMAIN_NET_TYPE_BRIDGE: <interface type='bridge'>
  - VIR_DOMAIN_NET_TYPE_DIRECT: <interface type='direct'>

  QoS users utilizing these interface types with only <bandwidth> <inbound>
  set should expect to see `noqueue` on their macvtap devices after a domain
  restart.

  [ Other information ]

  libvirt supports configuring QoS rules for network interfaces [1][2][3].

- By default in Ubuntu, ifaces will be assigned the `fq_codel` qdisc; for some
- iface types (including direct/macvtap), libvirt will set the qdisc of the
- iface to `noqueue` as it is assumed that the guest is also applying a qdisc to
- its outgoing traffic. This is done to avoid lock contention in `fq_codel` from
- limiting the outgoing bandwidth of the iface.
+ By default in Ubuntu, ifaces will be assigned the `fq_codel` qdisc; for iface types 'ethernet', 'network', 'bridge' and 'direct', libvirt will set the qdisc
+ of the iface to `noqueue` as it is assumed that the guest is also applying a
+ qdisc to its outgoing traffic. This is done to avoid lock contention in
+ `fq_codel` from limiting the outgoing bandwidth of the iface.

  The qdisc of an iface can be configured independently for `root` (egress)
  and `ingress` directions; e.g. an interface can use `noqueue` for the
  `root` qdisc but `htb` for `ingress`.

- A logic bug in libvirt causes interface configurations which only define an
- `inbound` bandwidth limit (affecting the `ingress` direction) to reset the
+ A logic bug in libvirt causes interface configurations which only define a bandwidth limit affecting the `ingress` direction to reset the
  `root` (egress) qdisc to the system default.
+
+ This is complicated somewhat because not all interface types share the
+ same "host view" (ingress/egress directions swapped), see
+ virDomainNetTypeSharesHostView in src/conf/domain_conf.c. For macvtap
+ devices, defining an ingress-only QoS rule causes the bug; other
+ interface types are affected by egress-only rules.

  For example, without bandwidth limits, the `root` qdisc of macvtap0 will
  be `noqueue`:

      <interface type='direct'>
        <mac address='52:54:00:f5:7e:97'/>
        <source dev='ens2f0np0' mode='passthrough'/>
        <target dev='macvtap0'/>
        <model type='virtio'/>
        <alias name='net1'/>
        <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </interface>

      $ sudo tc qdisc show dev macvtap0
      qdisc noqueue 8001: root refcnt 2
      qdisc ingress ffff: parent ffff:fff1 ----------------

  With an `inbound` bandwidth limit, we'd expect to see qdisc `noqueue`;
  instead it's set to the system default, `fq_codel`:

      <interface type='direct'>
        <mac address='52:54:00:05:8f:18'/>
        <source dev='ens2f0np0' mode='passthrough'/>
        <bandwidth>
          <inbound average='3125000' peak='3125000'/>
        </bandwidth>
        <target dev='macvtap5'/>
        <model type='virtio'/>
        <alias name='net0'/>
        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </interface>

      $ sudo tc qdisc show dev macvtap5
      qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      qdisc ingress ffff: parent ffff:fff1 ----------------

  [1] https://libvirt.org/formatdomain.html#quality-of-service
  [2] https://libvirt.org/formatnetwork.html#quality-of-service
  [3] https://tldp.org/HOWTO/Traffic-Control-HOWTO/components.html

** Description changed:

  [ Impact ]

- A logic bug in libvirt causes specific network interface QoS configurations to
+ A logic bug in libvirt causes specific network interface QoS configurations to
  set the root `qdisc` on macvtap devices to `fq_codel` (the default) instead of
- `noqueue`. This can cause severe performance degradation due to lock
+ `noqueue`. This can cause severe performance degradation due to lock
  contention in the kernel.

  This affects interfaces of types 'ethernet', 'network', 'bridge' and
  'direct'.

  For 'direct' and some 'ethernet' interfaces, this happens only when the
  interface has only an inbound bandwidth limit configured:

      <interface type='direct'>
        <bandwidth>
          <inbound average='3125000' peak='3125000'/>
        </bandwidth>
        ...
      </interface>

  Other interface types are only affected with only an outbound bandwidth
  limit:

      <interface type='network'>
        <bandwidth>
          <outbound average='3125000' peak='3125000'/>
        </bandwidth>
        ...
      </interface>

  The affected user measured the outgoing bandwidth of their macvtap
  interface with both qdiscs:

  `fq_codel`: 9.64 Gbits/sec
  `noqueue`: 16.0 Gbits/sec

  See "Other information" below for a more detailed breakdown of the problem.

  Reported upstream at [1] and fixed with [2] and [3].

  [1] https://gitlab.com/libvirt/libvirt/-/work_items/875
  [2] https://gitlab.com/libvirt/libvirt/-/commit/0d906d8a141c7b6024ff9416a2b814acbac592c1
  [3] https://gitlab.com/libvirt/libvirt/-/commit/124c53169eae20655f04eb9ce05be8f6cac0eb08

  [ Test plan ]

  In an LXD VM with two nics (one for connectivity and the other to pass
  through to a guest):

  ```sh
  sudo apt install libvirt-daemon-system virtinst

  cat > user-data <<EOF
  #cloud-config
  password: password
  chpasswd:
    expire: False
  EOF
  touch meta-data network-config

  wget https://cloud-images.ubuntu.com/daily/server/noble/current/noble-server-cloudimg-amd64.img
  sudo mv noble-server-cloudimg-amd64.img /var/lib/libvirt/images/

  virt-install \
    --name n0 \
    --os-variant=ubuntu24.04 \
    --ram=1024 --vcpus=2 \
    --disk pool=default,size=4,backing_store=/var/lib/libvirt/images/noble-server-cloudimg-amd64.img,bus=virtio,cache=writethrough \
    --graphics none \
    --network type=direct,source=enp6s0 \
    --cloud-init user-data="user-data,meta-data=meta-data,network-config=network-config"
  ```

  Confirm that the interface's qdisc is `noqueue`:

  ```
  $ sudo tc qdisc show dev macvtap0
  qdisc noqueue 8001: root refcnt 2
  ```

  ```sh
  virsh shutdown n0
  virsh edit n0
  ```

  Add an ingress bandwidth limit to the interface:

  ```xml
  <bandwidth>
    <inbound average='3125000' peak='3125000'/>
  </bandwidth>
  ```

  ```sh
  virsh start n0
  ```

  Expected behavior: The interface's `root` qdisc is `noqueue`:

  ```
  $ sudo tc qdisc show dev macvtap0
  qdisc noqueue 8001: root refcnt 2
  qdisc ingress ffff: parent ffff:fff1 ----------------
  ```

  Actual behavior: The interface's `root` qdisc is `fq_codel`:

  ```
  $ sudo tc qdisc show dev macvtap1
  qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
  qdisc ingress ffff: parent ffff:fff1 ----------------
  ```

+ Repeat the above steps, using `--network network=default` instead of
+ the macvtap device and setting the QoS rule to use
+ `<outbound average='3125000' peak='3125000'/>` instead of `inbound`.
+
  [ Where problems could occur ]

  The patches modify a codepath that is only executed when a domain or
  network is configured with <bandwidth> elements (see docs links below);
  regressions would only affect users utilizing libvirt's QoS configuration
  features.

  The change should be a no-op for all interface types except the four for
  which libvirt sets the qdisc to noqueue (see
  `qemuDomainInterfaceSetDefaultQDisc` in `src/qemu/qemu_domain.c`):

  - VIR_DOMAIN_NET_TYPE_ETHERNET: <interface type='ethernet'>
  - VIR_DOMAIN_NET_TYPE_NETWORK: <interface type='network'>
  - VIR_DOMAIN_NET_TYPE_BRIDGE: <interface type='bridge'>
  - VIR_DOMAIN_NET_TYPE_DIRECT: <interface type='direct'>

  QoS users utilizing these interface types with only <bandwidth> <inbound>
  set should expect to see `noqueue` on their macvtap devices after a domain
  restart.

  [ Other information ]

  libvirt supports configuring QoS rules for network interfaces [1][2][3].

  By default in Ubuntu, ifaces will be assigned the `fq_codel` qdisc; for
  iface types 'ethernet', 'network', 'bridge' and 'direct', libvirt will set
  the qdisc of the iface to `noqueue` as it is assumed that the guest is
  also applying a qdisc to its outgoing traffic. This is done to avoid lock
  contention in `fq_codel` from limiting the outgoing bandwidth of the iface.

  The qdisc of an iface can be configured independently for `root` (egress)
  and `ingress` directions; e.g. an interface can use `noqueue` for the
  `root` qdisc but `htb` for `ingress`.

  A logic bug in libvirt causes interface configurations which only define
  a bandwidth limit affecting the `ingress` direction to reset the `root`
  (egress) qdisc to the system default.

  This is complicated somewhat because not all interface types share the
  same "host view" (ingress/egress directions swapped), see
  virDomainNetTypeSharesHostView in src/conf/domain_conf.c. For macvtap
  devices, defining an ingress-only QoS rule causes the bug; other
  interface types are affected by egress-only rules.

  For example, without bandwidth limits, the `root` qdisc of macvtap0 will
  be `noqueue`:

      <interface type='direct'>
        <mac address='52:54:00:f5:7e:97'/>
        <source dev='ens2f0np0' mode='passthrough'/>
        <target dev='macvtap0'/>
        <model type='virtio'/>
        <alias name='net1'/>
        <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </interface>

      $ sudo tc qdisc show dev macvtap0
      qdisc noqueue 8001: root refcnt 2
      qdisc ingress ffff: parent ffff:fff1 ----------------

  With an `inbound` bandwidth limit, we'd expect to see qdisc `noqueue`;
  instead it's set to the system default, `fq_codel`:

      <interface type='direct'>
        <mac address='52:54:00:05:8f:18'/>
        <source dev='ens2f0np0' mode='passthrough'/>
        <bandwidth>
          <inbound average='3125000' peak='3125000'/>
        </bandwidth>
        <target dev='macvtap5'/>
        <model type='virtio'/>
        <alias name='net0'/>
        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </interface>

      $ sudo tc qdisc show dev macvtap5
      qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
      qdisc ingress ffff: parent ffff:fff1 ----------------

  [1] https://libvirt.org/formatdomain.html#quality-of-service
  [2] https://libvirt.org/formatnetwork.html#quality-of-service
  [3] https://tldp.org/HOWTO/Traffic-Control-HOWTO/components.html

--
https://bugs.launchpad.net/bugs/2155755

Title:
  macvtap device qdisc reset to system default when ingress-only
  bandwidth limit applied
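
The `tc qdisc show` checks in the test plan can be scripted when several devices need verifying. The following is a minimal sketch, not part of the upstream fix: `root_qdisc_kind` is a hypothetical helper name, and it assumes the `tc` output layout shown in this report (`qdisc <kind> <handle> root ...`). The sample input is the output quoted above for the affected macvtap device.

```sh
#!/bin/sh
# Print the kind of the root qdisc (e.g. noqueue, fq_codel) from
# `tc qdisc show dev <iface>` output read on stdin.
# root_qdisc_kind is a hypothetical helper, not part of libvirt or iproute2.
root_qdisc_kind() {
    awk '$1 == "qdisc" {
        # Layout assumed: "qdisc" <kind> <handle>, then "root" or "parent <id>".
        for (i = 3; i <= NF; i++)
            if ($i == "root") { print $2; exit }
    }'
}

# Sample output copied from the affected macvtap device in this report.
sample='qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc ingress ffff: parent ffff:fff1 ----------------'

kind=$(printf '%s\n' "$sample" | root_qdisc_kind)
if [ "$kind" = "noqueue" ]; then
    echo "OK: root qdisc is noqueue"
else
    echo "AFFECTED: root qdisc is $kind"
fi
```

On a real host the sample would be replaced by live output, for example `sudo tc qdisc show dev macvtap0 | root_qdisc_kind`. The `ingress` line is ignored on purpose, since only the `root` (egress) qdisc is reset by the bug.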
