Hi,

This is a follow-up to the first patch series [0], which uses iommufd
to propagate DMA mappings to the kernel for host devices assigned to
a qemu VM.

This series adds a new 'iommufd' attribute to the <driver> element of
hostdev devices, which associates the device with the iommufd object.

For instance, enabling iommufd for two hostdevs in a VM definition:

  <devices>
...
    <hostdev mode='subsystem' type='pci' managed='no'>
      <driver iommufd='yes'/>
      <source>
        <address domain='0x0009' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x15' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='no'>
      <driver iommufd='yes'/>
      <source>
        <address domain='0x0019' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x16' slot='0x00' function='0x0'/>
    </hostdev>
...
  </devices>

This is translated to a qemu command line with the arguments below.
Note that libvirt opens /dev/iommu and the VFIO cdev of each device, and
passes the associated fd numbers to qemu:

 -object '{"qom-type":"iommufd","id":"iommufd0","fd":"24"}' \
 -device '{"driver":"vfio-pci","host":"0009:01:00.0","id":"hostdev0","iommufd":"iommufd0","fd":"22","bus":"pci.21","addr":"0x0"}' \
 -device '{"driver":"vfio-pci","host":"0019:01:00.0","id":"hostdev1","iommufd":"iommufd0","fd":"25","bus":"pci.22","addr":"0x0"}' \
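
To illustrate the fd-passing idea, here is a minimal C sketch. It is not
the code from this series: the helper names open_inheritable() and
format_iommufd_object() are hypothetical, and only the /dev/iommu path and
the JSON shape are taken from the example above. The descriptor is opened
without O_CLOEXEC so that it would stay open across exec() into qemu, and
its number is then formatted into the -object argument.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: open a device node (e.g. "/dev/iommu") read-write.
 * O_CLOEXEC is deliberately not set, so a child created with fork()+exec()
 * inherits the descriptor. Returns the fd, or -1 with errno set. */
int open_inheritable(const char *path)
{
    return open(path, O_RDWR);
}

/* Hypothetical helper: build the JSON for "-object" from an object id and
 * an already-open descriptor number. Returns the string length on success,
 * or -1 if the buffer is too small or formatting failed. */
int format_iommufd_object(char *buf, size_t len, const char *id, int fd)
{
    int n = snprintf(buf, len,
                     "{\"qom-type\":\"iommufd\",\"id\":\"%s\",\"fd\":\"%d\"}",
                     id, fd);

    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

A caller would check the result of open_inheritable("/dev/iommu") for -1
before formatting, since the node only exists on kernels built with
iommufd support.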

Note that when libvirt launches the qemu process as an unprivileged user,
e.g. libvirt-qemu, the process is bound by an RLIMIT_MEMLOCK limit that is
calculated from the amount of RAM assigned to the VM. IOMMUFD's unified
memory accounting causes ENOMEM errors for large device BARs that exceed
the process memlock limit. VFIO without IOMMUFD does not check device
memory against the limit and therefore does not show this behavior.
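
The arithmetic behind the failure can be sketched in C as below. This is
only an illustration of the comparison, not the kernel's accounting code:
fits_memlock() and current_memlock_limit() are hypothetical helpers, and
the only real API used is getrlimit() with RLIMIT_MEMLOCK.

```c
#include <stdint.h>
#include <sys/resource.h>

/* Hypothetical helper: return 1 if pinning `bytes` more memory stays
 * within `limit`, given `already_locked` bytes already accounted,
 * and 0 otherwise. An unlimited rlimit always fits. */
int fits_memlock(rlim_t limit, uint64_t already_locked, uint64_t bytes)
{
    if (limit == RLIM_INFINITY)
        return 1;
    if (already_locked > (uint64_t)limit)
        return 0;
    /* Subtract first to avoid overflow in already_locked + bytes. */
    return bytes <= (uint64_t)limit - already_locked;
}

/* Hypothetical helper: read the soft RLIMIT_MEMLOCK of the calling
 * process. Returns 0 on success, -1 on failure (errno set). */
int current_memlock_limit(rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;
    return 0;
}
```

With made-up numbers: if the limit is 5 GiB and 4 GiB of guest RAM is
already pinned, a 512 MiB BAR still fits, while a 64 GiB BAR does not,
which is the case where the mapping would fail with ENOMEM.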

To work around the issue, we can set the CAP_IPC_LOCK=ep file capability
on the qemu binary to bypass the RLIMIT_MEMLOCK limit. I am working on a
long-term fix in the kernel that marks device memory regions so they do
not count against the RLIMIT_MEMLOCK limit on how much memory can be
locked/pinned, as I/O device BARs do not consume host RAM.

This series is on GitHub:
https://github.com/NathanChenNVIDIA/libvirt/tree/iommufd-11-25

Thanks,
Nathan

[0] https://lists.libvirt.org/archives/list/[email protected]/thread/MYPM52P4SIIGZS4NH3PRD5FOCZQW5T6T/

Signed-off-by: Nathan Chen <[email protected]>

Nathan Chen (5):
  qemu: Implement support for associating iommufd to hostdev
  qemu: open VFIO FDs from libvirt backend
  qemu: open iommufd FD from libvirt backend
  qemu: Update Cgroup, namespace, and seclabel for iommufd
  tests: qemuxmlconfdata: provide iommufd sample XML and CLI args

 docs/formatdomain.rst                         |   8 +
 src/conf/device_conf.c                        |  12 ++
 src/conf/device_conf.h                        |   1 +
 src/conf/domain_conf.h                        |   2 +
 src/conf/schemas/basictypes.rng               |   5 +
 src/libvirt_private.syms                      |   1 +
 src/qemu/qemu_cgroup.c                        |  26 +--
 src/qemu/qemu_command.c                       |  53 ++++-
 src/qemu/qemu_domain.c                        |  40 ++++
 src/qemu/qemu_domain.h                        |  19 ++
 src/qemu/qemu_namespace.c                     |  16 +-
 src/qemu/qemu_process.c                       | 186 ++++++++++++++++++
 src/security/security_apparmor.c              |  18 +-
 src/security/security_dac.c                   |  28 ++-
 src/security/security_selinux.c               |  28 ++-
 src/security/virt-aa-helper.c                 |  11 +-
 src/util/virpci.c                             |  69 +++++++
 src/util/virpci.h                             |   2 +
 .../iommufd-q35.x86_64-latest.args            |  41 ++++
 .../iommufd-q35.x86_64-latest.xml             |  60 ++++++
 tests/qemuxmlconfdata/iommufd-q35.xml         |  38 ++++
 .../iommufd-virt.aarch64-latest.args          |  33 ++++
 .../iommufd-virt.aarch64-latest.xml           |  34 ++++
 tests/qemuxmlconfdata/iommufd-virt.xml        |  22 +++
 .../iommufd.x86_64-latest.args                |  35 ++++
 .../qemuxmlconfdata/iommufd.x86_64-latest.xml |  38 ++++
 tests/qemuxmlconfdata/iommufd.xml             |  30 +++
 tests/qemuxmlconftest.c                       |  33 ++++
 28 files changed, 839 insertions(+), 50 deletions(-)
 create mode 100644 tests/qemuxmlconfdata/iommufd-q35.x86_64-latest.args
 create mode 100644 tests/qemuxmlconfdata/iommufd-q35.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/iommufd-q35.xml
 create mode 100644 tests/qemuxmlconfdata/iommufd-virt.aarch64-latest.args
 create mode 100644 tests/qemuxmlconfdata/iommufd-virt.aarch64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/iommufd-virt.xml
 create mode 100644 tests/qemuxmlconfdata/iommufd.x86_64-latest.args
 create mode 100644 tests/qemuxmlconfdata/iommufd.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/iommufd.xml

-- 
2.43.0
