Thanks, Alex - that was all very helpful! I am still a bit new to this, so the explanation was very informative. I am trying to assign multiple NVMe drives to a VM and had misunderstood the intremap setting - I just need to make sure that interrupt remapping (intremap) is enabled on the *host*, not in the guest XML. Is there any QEMU or libvirt configuration needed to get the best performance out of the VMs? Do Posted Interrupts get enabled by default?
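For reference, this is roughly how I have been checking interrupt remapping on the host - a rough sketch, assuming the usual Intel DMAR-IR dmesg strings, which may differ between kernel versions:

$ cat /proc/cmdline
... intel_iommu=on ...

$ dmesg | grep -i remapping
DMAR-IR: Enabled IRQ remapping in x2apic mode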
On Wed, Jul 4, 2018 at 2:26 AM, Alex Williamson <alex.william...@redhat.com> wrote:
> On Tue, 3 Jul 2018 18:42:04 +0530
> Prasun Ratn <prasun.r...@gmail.com> wrote:
>
>> Hi
>>
>> I am adding an IOMMU device using the following libvirt syntax (taken
>> from https://libvirt.org/formatdomain.html#elementsIommu)
>>
>> <iommu model='intel'>
>>   <driver intremap='on'/>
>> </iommu>
>>
>> When I try to start the VM, it fails. If I remove the above 3 lines it
>> starts fine.
>>
>> error: Failed to start domain rhel7.3-32T-nvme-ich9
>> error: internal error: qemu unexpectedly closed the monitor:
>> 2018-06-28T15:24:31.401831Z qemu-kvm: -device vfio-pci,host=82:00.0,id=hostdev1,bus=pcie.0,addr=0xa: VFIO_MAP_DMA: -12
>> 2018-06-28T15:24:31.401854Z qemu-kvm: -device vfio-pci,host=82:00.0,id=hostdev1,bus=pcie.0,addr=0xa: vfio_dma_map(0x556478bc0820, 0xc0000, 0x7ff40000, 0x7fd94e4c0000) = -12 (Cannot allocate memory)
>> 2018-06-28T15:24:31.450793Z qemu-kvm: -device vfio-pci,host=82:00.0,id=hostdev1,bus=pcie.0,addr=0xa: VFIO_MAP_DMA: -12
>> 2018-06-28T15:24:31.450804Z qemu-kvm: -device vfio-pci,host=82:00.0,id=hostdev1,bus=pcie.0,addr=0xa: vfio_dma_map(0x556478bc0820, 0x100000000, 0x180000000, 0x7fd9ce400000) = -12 (Cannot allocate memory)
>> 2018-06-28T15:24:31.450878Z qemu-kvm: -device vfio-pci,host=82:00.0,id=hostdev1,bus=pcie.0,addr=0xa: vfio error: 0000:82:00.0: failed to setup container for group 37: memory listener initialization failed for container: Cannot allocate memory
>>
>> In dmesg I see this:
>>
>> [189435.289113] vfio_pin_pages_remote: RLIMIT_MEMLOCK (9663676416) exceeded
>> [189435.338165] vfio_pin_pages_remote: RLIMIT_MEMLOCK (9663676416) exceeded
>>
>> I have enough free memory (I think) and at the failing point enough
>> memory seems to be available.
>>
>> $ free -h
>>               total        used        free      shared  buff/cache   available
>> Mem:           125G        1.4G        123G         17M        1.1G        123G
>> Swap:          1.0G          0B        1.0G
>>
>> Here's the ulimit -l output (I changed limits.conf to set memlock to
>> unlimited for qemu user and qemu group)
>>
>> $ ulimit -l
>> unlimited
>>
>> $ sudo -u qemu sh -c "ulimit -l"
>> unlimited
>>
>> memlock limit using systemctl
>>
>> $ systemctl show libvirtd.service | grep LimitMEMLOCK
>> LimitMEMLOCK=18446744073709551615
>>
>> SELinux is disabled
>>
>> $ sestatus
>> SELinux status:                 disabled
>>
>> libvirt and kernel version
>>
>> $ virsh version
>> Compiled against library: libvirt 4.1.0
>> Using library: libvirt 4.1.0
>> Using API: QEMU 4.1.0
>> Running hypervisor: QEMU 2.9.0
>>
>> $ uname -r
>> 3.10.0-693.5.2.el7.x86_64
>>
>> $ cat /etc/redhat-release
>> Red Hat Enterprise Linux Server release 7.4 (Maipo)
>>
>> Any idea how to figure out why we are exceeding the memlock limit?
>
> I'm guessing you're assigning multiple devices to the same VM, which
> doesn't work well with a guest IOMMU currently. The trouble is that
> with a guest IOMMU, each assigned device has a separate address space
> that is initially configured to map the full address space of the VM
> and each vfio container for each device is accounted separately.
> libvirt will only set the locked memory limit to a value sufficient
> for locking the memory once, whereas in this configuration we're
> locking it once per assigned device. Without a guest IOMMU, all
> devices run in the same address space and therefore the same
> container, and we only account the memory once for any number of
> devices.
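(Interjecting inline so the numbers are in the archive - this is my own back-of-the-envelope check, so please correct me if I have the accounting wrong:

  RLIMIT_MEMLOCK from dmesg:  9663676416 bytes = 9 GiB
                              (my 8 GiB guest plus what I assume is ~1 GiB of headroom libvirt allows for a single device)
  per-device DMA mappings:    0x7ff40000 ~= 2 GiB  +  0x180000000 = 6 GiB  ~= 8 GiB locked per assigned device
  two assigned NVMe devices:  2 x ~8 GiB ~= 16 GiB needed  >  9 GiB limit

which would explain the "Cannot allocate memory" errors above.)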
>
> Regardless of all your attempts to prove that the locked memory limit
> is set to unlimited, I don't think that's actually the case for the
> running qemu instance. You should be able to use the hard_limit option
> in the VM xml to increase the locked memory limit:
>
> https://libvirt.org/formatdomain.html#elementsMemoryTuning
>
> As above, I'd suggest <# hostdevs> x <VM memory size>
>
> The next question would be, why are you trying to use a guest IOMMU in
> the first place? The typical "production" use case of this is to be
> able to make use of userspace drivers, like DPDK, in the guest
> userspace. Device assignment to nested guest is also possible, but
> beyond proof of concept or development work, I don't know a practical
> use for it. If your intent is to get isolation between devices in the
> guest drivers (ie. not using iommu=pt in the guest), expect horrendous
> performance. Thanks,
>
> Alex
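In case it helps anyone who finds this thread later: if I do end up keeping the guest IOMMU, my plan is to apply the <# hostdevs> x <VM memory size> suggestion via memtune in the domain XML, along these lines (the 18 GiB figure is my own rough number for two assigned devices and an 8 GiB guest plus some headroom, not something Alex specified):

  <memtune>
    <hard_limit unit='GiB'>18</hard_limit>
  </memtune>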