** Description changed:

  [ Impact ]
  
  A logic bug in libvirt causes specific network interface configurations to set
- the default `qdisc` on macvtap devices instead of `noqueue`, causing severe
- performance degredation due to lock contention.
+ the default `qdisc` on macvtap devices (`fq_codel`) instead of `noqueue`, causing severe performance degradation due to lock contention in the kernel.
+ 
+ This happens when an interface has only an ingress bandwidth limit
+ configured:
+ 
+     <interface type='direct'>
+       <bandwidth>
+         <inbound average='3125000' peak='3125000'/>
+       </bandwidth>
+       ...
+     </interface>
+ 
+ This affects interfaces of types 'ethernet', 'network', 'bridge' and
+ 'direct'.
  
  The affected user measured the outgoing bandwidth of the interface with both
  qdiscs:
  
  `fq_codel`: 9.64 Gbits/sec
  `noqueue`: 16.0 Gbits/sec
  
  See "Other information" below for a more detailed breakdown of the
  problem.
  
  Reported upstream at [1] and fixed with [2] and [3].
  
  [1] https://gitlab.com/libvirt/libvirt/-/work_items/875
  [2] https://gitlab.com/libvirt/libvirt/-/commit/0d906d8a141c7b6024ff9416a2b814acbac592c1
  [3] https://gitlab.com/libvirt/libvirt/-/commit/124c53169eae20655f04eb9ce05be8f6cac0eb08
  
  [ Test plan ]
  
  In an LXD VM with two NICs (one for connectivity and the other to pass through
  to a guest):
  
  ```sh
  sudo apt install libvirt-daemon-system virtinst
  cat > user-data <<EOF
  #cloud-config
  password: password
  chpasswd:
-   expire: False
+   expire: False
  EOF
  touch meta-data network-config
  wget https://cloud-images.ubuntu.com/daily/server/noble/current/noble-server-cloudimg-amd64.img
  sudo mv noble-server-cloudimg-amd64.img /var/lib/libvirt/images/
  virt-install \
-   --name n0 \
-   --os-variant=ubuntu24.04 \
-   --ram=1024 --vcpus=2 \
-   --disk pool=default,size=4,backing_store=/var/lib/libvirt/images/noble-server-cloudimg-amd64.img,bus=virtio,cache=writethrough \
-   --graphics none \
-   --network type=direct,source=enp6s0 \
-   --cloud-init user-data="user-data,meta-data=meta-data,network-config=network-config"
+   --name n0 \
+   --os-variant=ubuntu24.04 \
+   --ram=1024 --vcpus=2 \
+   --disk pool=default,size=4,backing_store=/var/lib/libvirt/images/noble-server-cloudimg-amd64.img,bus=virtio,cache=writethrough \
+   --graphics none \
+   --network type=direct,source=enp6s0 \
+   --cloud-init user-data="user-data,meta-data=meta-data,network-config=network-config"
  ```
  
  Confirm that the interface's qdisc is `noqueue`:
  
  ```
  $ sudo tc qdisc show dev macvtap0
  qdisc noqueue 8001: root refcnt 2
  ```
  
  ```sh
  virsh shutdown n0
  virsh edit n0
  ```
  
  Add an ingress bandwidth limit to the interface:
  ```xml
-       <bandwidth>
-         <inbound average='3125000' peak='3125000'/>
-       </bandwidth>
+       <bandwidth>
+         <inbound average='3125000' peak='3125000'/>
+       </bandwidth>
  ```
  
  ```sh
  virsh start n0
  ```
  
  Expected behavior:
  
  The interface's `root` qdisc is `noqueue`:
  
  ```
  $ sudo tc qdisc show dev macvtap0
  qdisc noqueue 8001: root refcnt 2
  qdisc ingress ffff: parent ffff:fff1 ----------------
  ```
  
  Actual behavior:
  
  The interface's `root` qdisc is `fq_codel`:
  
  ```
  $ sudo tc qdisc show dev macvtap1
  qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
  qdisc ingress ffff: parent ffff:fff1 ----------------
  ```
  
  [ Where problems could occur ]
  
  The patches modify a codepath that is only executed when a domain or network
  is configured with <bandwidth> elements (see docs links below); regressions
  would only affect users utilizing libvirt's QoS configuration features.
  
  The change should be a no-op for all interface types except the four for which
  libvirt sets the qdisc to noqueue (see `qemuDomainInterfaceSetDefaultQDisc`
  in `src/qemu/qemu_domain.c`):
  
  - VIR_DOMAIN_NET_TYPE_ETHERNET: <interface type='ethernet'>
  - VIR_DOMAIN_NET_TYPE_NETWORK: <interface type='network'>
  - VIR_DOMAIN_NET_TYPE_BRIDGE: <interface type='bridge'>
  - VIR_DOMAIN_NET_TYPE_DIRECT: <interface type='direct'>
  
  QoS users utilizing these interface types with only <bandwidth> <inbound> set
  should expect to see `noqueue` on their macvtap devices after a domain restart.
  
  [ Other information ]
  
  libvirt supports configuring QoS rules for network interfaces [1][2][3].
  
  By default in Ubuntu, ifaces will be assigned the `fq_codel` qdisc; for some
  iface types (including direct/macvtap), libvirt will set the qdisc of the
  iface to `noqueue` as it is assumed that the guest is also applying a qdisc to
  its outgoing traffic. This is done to avoid lock contention in `fq_codel` from
  limiting the outgoing bandwidth of the iface.
  
  The qdisc of an iface can be configured independently for `root` (egress) and
  `ingress` directions; e.g. an interface can use `noqueue` for the `root` qdisc
  but `htb` for `ingress`.
  
  A logic bug in libvirt causes interface configurations which only define an
  `inbound` bandwidth limit (affecting the `ingress` direction) to reset the
  `root` (egress) qdisc to the system default.
  
  For example, without bandwidth limits, the `root` qdisc of macvtap0 will be
  `noqueue`:
  
-     <interface type='direct'>
-       <mac address='52:54:00:f5:7e:97'/>
-       <source dev='ens2f0np0' mode='passthrough'/>
-       <target dev='macvtap0'/>
-       <model type='virtio'/>
-       <alias name='net1'/>
-       <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
-     </interface>
+     <interface type='direct'>
+       <mac address='52:54:00:f5:7e:97'/>
+       <source dev='ens2f0np0' mode='passthrough'/>
+       <target dev='macvtap0'/>
+       <model type='virtio'/>
+       <alias name='net1'/>
+       <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
+     </interface>
  
  $ sudo tc qdisc show dev macvtap0
  qdisc noqueue 8001: root refcnt 2
  qdisc ingress ffff: parent ffff:fff1 ----------------
  
  With an `inbound` bandwidth limit, we'd expect to see qdisc `noqueue`; instead
  it's set to the system default, `fq_codel`:
  
-     <interface type='direct'>
-       <mac address='52:54:00:05:8f:18'/>
-       <source dev='ens2f0np0' mode='passthrough'/>
-       <bandwidth>
-         <inbound average='3125000' peak='3125000'/>
-       </bandwidth>
-       <target dev='macvtap5'/>
-       <model type='virtio'/>
-       <alias name='net0'/>
-       <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
-     </interface>
+     <interface type='direct'>
+       <mac address='52:54:00:05:8f:18'/>
+       <source dev='ens2f0np0' mode='passthrough'/>
+       <bandwidth>
+         <inbound average='3125000' peak='3125000'/>
+       </bandwidth>
+       <target dev='macvtap5'/>
+       <model type='virtio'/>
+       <alias name='net0'/>
+       <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
+     </interface>
  
  $ sudo tc qdisc show dev macvtap5
  qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
  qdisc ingress ffff: parent ffff:fff1 ----------------
  
  [1] https://libvirt.org/formatdomain.html#quality-of-service
  [2] https://libvirt.org/formatnetwork.html#quality-of-service
  [3] https://tldp.org/HOWTO/Traffic-Control-HOWTO/components.html

** Description changed:

  [ Impact ]
  
  A logic bug in libvirt causes specific network interface configurations to set
- the default `qdisc` on macvtap devices (`fq_codel`) instead of `noqueue`, causing severe performance degradation due to lock contention in the kernel.
+ the default `qdisc` on macvtap devices (`fq_codel`) instead of `noqueue`, 
+ causing severe performance degradation due to lock contention in the kernel.
  
  This happens when an interface has only an ingress bandwidth limit
  configured:
  
-     <interface type='direct'>
+     <interface type='direct'>
        <bandwidth>
          <inbound average='3125000' peak='3125000'/>
        </bandwidth>
        ...
      </interface>
  
  This affects interfaces of types 'ethernet', 'network', 'bridge' and
  'direct'.
  
  The affected user measured the outgoing bandwidth of the interface with both
  qdiscs:
  
  `fq_codel`: 9.64 Gbits/sec
  `noqueue`: 16.0 Gbits/sec
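
  To put the two measurements in proportion, the relative difference can be
  computed directly from the figures above; it comes to a drop of roughly 40%.
  This is a minimal sketch: the function name `throughput_delta` is made up
  for illustration and is not part of libvirt or any tool referenced here.

  ```sh
  # Compare two throughput figures given in the same unit (here Gbits/sec).
  # $1 = slower measurement (fq_codel), $2 = faster measurement (noqueue).
  throughput_delta() {
    awk -v slow="$1" -v fast="$2" 'BEGIN {
      # loss is relative to the faster figure; speedup is fast divided by slow
      printf "loss: %.2f%%  speedup: %.2fx\n", (fast - slow) / fast * 100, fast / slow
    }'
  }

  throughput_delta 9.64 16.0
  ```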
  
  See "Other information" below for a more detailed breakdown of the
  problem.
  
  Reported upstream at [1] and fixed with [2] and [3].
  
  [1] https://gitlab.com/libvirt/libvirt/-/work_items/875
  [2] https://gitlab.com/libvirt/libvirt/-/commit/0d906d8a141c7b6024ff9416a2b814acbac592c1
  [3] https://gitlab.com/libvirt/libvirt/-/commit/124c53169eae20655f04eb9ce05be8f6cac0eb08
  
  [ Test plan ]
  
  In an LXD VM with two NICs (one for connectivity and the other to pass through
  to a guest):
  
  ```sh
  sudo apt install libvirt-daemon-system virtinst
  cat > user-data <<EOF
  #cloud-config
  password: password
  chpasswd:
    expire: False
  EOF
  touch meta-data network-config
  wget https://cloud-images.ubuntu.com/daily/server/noble/current/noble-server-cloudimg-amd64.img
  sudo mv noble-server-cloudimg-amd64.img /var/lib/libvirt/images/
  virt-install \
    --name n0 \
    --os-variant=ubuntu24.04 \
    --ram=1024 --vcpus=2 \
    --disk pool=default,size=4,backing_store=/var/lib/libvirt/images/noble-server-cloudimg-amd64.img,bus=virtio,cache=writethrough \
    --graphics none \
    --network type=direct,source=enp6s0 \
    --cloud-init user-data="user-data,meta-data=meta-data,network-config=network-config"
  ```
  
  Confirm that the interface's qdisc is `noqueue`:
  
  ```
  $ sudo tc qdisc show dev macvtap0
  qdisc noqueue 8001: root refcnt 2
  ```
  
  ```sh
  virsh shutdown n0
  virsh edit n0
  ```
  
  Add an ingress bandwidth limit to the interface:
  ```xml
        <bandwidth>
          <inbound average='3125000' peak='3125000'/>
        </bandwidth>
  ```
  
  ```sh
  virsh start n0
  ```
  
  Expected behavior:
  
  The interface's `root` qdisc is `noqueue`:
  
  ```
  $ sudo tc qdisc show dev macvtap0
  qdisc noqueue 8001: root refcnt 2
  qdisc ingress ffff: parent ffff:fff1 ----------------
  ```
  
  Actual behavior:
  
  The interface's `root` qdisc is `fq_codel`:
  
  ```
  $ sudo tc qdisc show dev macvtap1
  qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
  qdisc ingress ffff: parent ffff:fff1 ----------------
  ```
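
  The manual comparison above can also be scripted. The following is a
  minimal sketch, assuming the `tc qdisc show` output format shown in this
  test plan; the helper names `root_qdisc_kind` and `check_noqueue` are
  hypothetical and not part of libvirt or iproute2.

  ```sh
  # Read `tc qdisc show dev <dev>` output on stdin and print the kind of the
  # root qdisc (second field of the line whose fourth field is "root").
  root_qdisc_kind() {
    awk '$1 == "qdisc" && $4 == "root" { print $2; exit }'
  }

  # Succeed only if the root qdisc of device $1 is noqueue.
  check_noqueue() {
    kind=$(sudo tc qdisc show dev "$1" | root_qdisc_kind)
    [ "$kind" = "noqueue" ]
  }
  ```

  For example, `check_noqueue macvtap0 && echo fixed || echo affected` would
  report whether the device still carries the system default qdisc.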
  
  [ Where problems could occur ]
  
  The patches modify a codepath that is only executed when a domain or network
  is configured with <bandwidth> elements (see docs links below); regressions
  would only affect users utilizing libvirt's QoS configuration features.
  
  The change should be a no-op for all interface types except the four for which
  libvirt sets the qdisc to noqueue (see `qemuDomainInterfaceSetDefaultQDisc`
  in `src/qemu/qemu_domain.c`):
  
  - VIR_DOMAIN_NET_TYPE_ETHERNET: <interface type='ethernet'>
  - VIR_DOMAIN_NET_TYPE_NETWORK: <interface type='network'>
  - VIR_DOMAIN_NET_TYPE_BRIDGE: <interface type='bridge'>
  - VIR_DOMAIN_NET_TYPE_DIRECT: <interface type='direct'>
  
  QoS users utilizing these interface types with only <bandwidth> <inbound> set
  should expect to see `noqueue` on their macvtap devices after a domain restart.
  
  [ Other information ]
  
  libvirt supports configuring QoS rules for network interfaces [1][2][3].
  
  By default in Ubuntu, ifaces will be assigned the `fq_codel` qdisc; for some
  iface types (including direct/macvtap), libvirt will set the qdisc of the
  iface to `noqueue` as it is assumed that the guest is also applying a qdisc to
  its outgoing traffic. This is done to avoid lock contention in `fq_codel` from
  limiting the outgoing bandwidth of the iface.
  
  The qdisc of an iface can be configured independently for `root` (egress) and
  `ingress` directions; e.g. an interface can use `noqueue` for the `root` qdisc
  but `htb` for `ingress`.
  
  A logic bug in libvirt causes interface configurations which only define an
  `inbound` bandwidth limit (affecting the `ingress` direction) to reset the
  `root` (egress) qdisc to the system default.
  
  For example, without bandwidth limits, the `root` qdisc of macvtap0 will be
  `noqueue`:
  
      <interface type='direct'>
        <mac address='52:54:00:f5:7e:97'/>
        <source dev='ens2f0np0' mode='passthrough'/>
        <target dev='macvtap0'/>
        <model type='virtio'/>
        <alias name='net1'/>
        <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </interface>
  
  $ sudo tc qdisc show dev macvtap0
  qdisc noqueue 8001: root refcnt 2
  qdisc ingress ffff: parent ffff:fff1 ----------------
  
  With an `inbound` bandwidth limit, we'd expect to see qdisc `noqueue`; instead
  it's set to the system default, `fq_codel`:
  
      <interface type='direct'>
        <mac address='52:54:00:05:8f:18'/>
        <source dev='ens2f0np0' mode='passthrough'/>
        <bandwidth>
          <inbound average='3125000' peak='3125000'/>
        </bandwidth>
        <target dev='macvtap5'/>
        <model type='virtio'/>
        <alias name='net0'/>
        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </interface>
  
  $ sudo tc qdisc show dev macvtap5
  qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
  qdisc ingress ffff: parent ffff:fff1 ----------------
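
  To find interfaces whose configuration matches the affected pattern
  (an `inbound` limit with no `outbound` limit), the domain XML can be
  scanned. This is a rough sketch, not a full XML parser: it assumes one
  element per line and single-quoted attributes, as in the XML shown above,
  and it does not filter by interface type. The function name
  `inbound_only_macs` is hypothetical.

  ```sh
  # Print the MAC address of every <interface> that has <inbound> but no
  # <outbound> inside it. Reads domain XML on stdin.
  inbound_only_macs() {
    awk -v q="'" '
      /<interface/    { mac = ""; inb = 0; outb = 0 }   # new interface: reset state
      /<mac /         { split($0, parts, q); mac = parts[2] }
      /<inbound/      { inb = 1 }
      /<outbound/     { outb = 1 }
      /<\/interface>/ { if (inb && !outb) print mac }
    '
  }
  ```

  A possible use is `virsh dumpxml n0 | inbound_only_macs`, which for the
  second example above would be expected to list `52:54:00:05:8f:18`.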
  
  [1] https://libvirt.org/formatdomain.html#quality-of-service
  [2] https://libvirt.org/formatnetwork.html#quality-of-service
  [3] https://tldp.org/HOWTO/Traffic-Control-HOWTO/components.html

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2155755

Title:
  macvtap device qdisc reset to system default when ingress-only
  bandwidth limit applied

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/2155755/+subscriptions

