On 6/21/21 5:36 PM, Fam Zheng wrote:
>> On 21 Jun 2021, at 16:13, Philippe Mathieu-Daudé <phi...@redhat.com> wrote:
>> On 6/21/21 3:18 PM, Fam Zheng wrote:
>>>> On 21 Jun 2021, at 10:32, Philippe Mathieu-Daudé <phi...@redhat.com> wrote:
>>>>
>>>> When the NVMe block driver was introduced (see commit bdd6a90a9e5,
>>>> January 2018), Linux VFIO_IOMMU_MAP_DMA ioctl was only returning
>>>> -ENOMEM in case of error. The driver was correctly handling the
>>>> error path to recycle its volatile IOVA mappings.
>>>>
>>>> To fix CVE-2019-3882, Linux commit 492855939bdb ("vfio/type1: Limit
>>>> DMA mappings per container", April 2019) added the -ENOSPC error to
>>>> signal the user exhausted the DMA mappings available for a container.
>>>>
>>>> The block driver started to mis-behave:
>>>>
>>>>  qemu-system-x86_64: VFIO_MAP_DMA failed: No space left on device
>>>>  (qemu)
>>>>  (qemu) info status
>>>>  VM status: paused (io-error)
>>>>  (qemu) c
>>>>  VFIO_MAP_DMA failed: No space left on device
>>>>  qemu-system-x86_64: block/block-backend.c:1968: blk_get_aio_context:
>>>>  Assertion `ctx == blk->ctx' failed.
>>>
>>> Hi Phil,
>>>
>>>
>>> The diff looks good to me, but I’m not sure what exactly caused the
>>> assertion failure. There is `if (r) { goto fail; }` that handles -ENOSPC
>>> before, so it should be treated as a general case. What am I missing?
>>
>> Good catch, ENOSPC ends setting BLOCK_DEVICE_IO_STATUS_NOSPACE
>> -> BLOCK_ERROR_ACTION_STOP, so the VM is paused with DMA mapping
>> exhausted. I don't understand the full "VM resume" path, but this
>> is not what we want (IO_NOSPACE is to warn the operator to add
>> more storage and resume, which is pointless in our case, resuming
>> won't help until we flush the mappings).
>>
>> IIUC what we want is return ENOMEM to set BLOCK_DEVICE_IO_STATUS_FAILED.
>
> I agree with that. It just makes me feel there’s another bug in the resuming
> code path. Can you get a backtrace?

It seems the resuming code path bug has been fixed elsewhere:

  (qemu) info status
  info status
  VM status: paused (io-error)
  (qemu) c
  c
  2021-06-22T07:27:00.745466Z qemu-system-x86_64: VFIO_MAP_DMA failed: No space left on device
  (qemu) info status
  info status
  VM status: paused (io-error)
  (qemu) c
  c
  2021-06-22T07:27:12.458137Z qemu-system-x86_64: VFIO_MAP_DMA failed: No space left on device
  (qemu) c
  c
  2021-06-22T07:27:13.439167Z qemu-system-x86_64: VFIO_MAP_DMA failed: No space left on device
  (qemu) c
  c
  2021-06-22T07:27:14.272071Z qemu-system-x86_64: VFIO_MAP_DMA failed: No space left on device
  (qemu)
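
For reference, the errno translation discussed in the quoted text can be sketched roughly as below. This is only an illustration under assumptions, not the actual patch or QEMU's code: the IoStatus enum and both function names (io_status_from_errno, translate_dma_map_error) are hypothetical stand-ins, and the ENOSPC -> NOSPACE classification merely models the behaviour described in the thread.

```c
#include <errno.h>

/* Illustrative stand-ins for the I/O status values named in the thread;
 * this is not QEMU's real enum. */
typedef enum {
    IO_STATUS_OK,
    IO_STATUS_FAILED,
    IO_STATUS_NOSPACE,
} IoStatus;

/* Assumed classification: a positive errno of ENOSPC is reported as
 * "no space", any other non-zero error as a plain failure. */
static IoStatus io_status_from_errno(int err)
{
    if (err == 0) {
        return IO_STATUS_OK;
    }
    return err == ENOSPC ? IO_STATUS_NOSPACE : IO_STATUS_FAILED;
}

/* Hypothetical helper: rewrite the negative return value of a DMA map
 * attempt so that mapping exhaustion (-ENOSPC) is seen as -ENOMEM.
 * All other values, including success (0), pass through unchanged. */
static int translate_dma_map_error(int r)
{
    return r == -ENOSPC ? -ENOMEM : r;
}
```

With such a translation in place, exhausting the DMA mappings would be classified as a plain failure rather than as "no space", so the operator is not told to add storage and resume when resuming cannot help.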