Hi Leon,

On 06/05/2026 17:14, Leon Romanovsky wrote:

On Wed, May 06, 2026 at 04:55:27PM +0100, Matt Evans wrote:
Hi Leon,

On 06/05/2026 16:29, Leon Romanovsky wrote:

On Wed, May 06, 2026 at 02:53:31PM +0100, Matt Evans wrote:
Hi Alex,

On 01/05/2026 20:12, Alex Williamson wrote:

On Thu, 16 Apr 2026 06:17:44 -0700
Matt Evans <[email protected]> wrote:

vfio_pci_dma_buf_cleanup() assumed all VFIO device DMABUFs need to be
revoked.  However, if vfio_pci_dma_buf_move() revokes DMABUFs before
the fd/device closes, then vfio_pci_dma_buf_cleanup() would do a
second/underflowing kref_put() then wait_for_completion() on a
completion that never fires.  Fixed by predicating on revocation
status.

This could happen if PCI_COMMAND_MEMORY is cleared before closing the
device fd (but the scenario is more likely to hit when future commits
add more methods to revoke DMABUFs).

Fixes: 1a8a5227f2299 ("vfio: Wait for dma-buf invalidation to complete")
Signed-off-by: Matt Evans <[email protected]>
---

(Just a fix, but later "vfio/pci: Convert BAR mmap() to use a DMABUF"
and "vfio/pci: Permanently revoke a DMABUF on request" depend on this
context, so including in this series.)

We really need a fix for this split out from this series; it's already
been shown[1] that this is trivially reachable.  Carlos proposed[2] a
similar solution to the one below.  I was concurrently working on the
issue and suggested an alternative[3].  Let's pick a solution for
7.1-rc.  Thanks,

It looks like [3] is progressing, so I'll drop this one when I can rebase
onto it.

I noticed [3] removes the dma_resv_lock(priv->dmabuf->resv) around the
priv->vdev = NULL, and this series' vfio_pci_mmap_huge_fault() relies on
vdev only changing whilst resv is held to resolve a race between a fault and
cleanup (see patch 7 of this series).  The handler takes resv so that it can
stably test vdev in order to take memory_lock.

I think that you should rely on priv->revoked and not on priv->vdev.

Needs both unfortunately, as the fault handler ultimately needs to take
vdev->memory_lock.

One can argue that if priv->revoked == True, all accesses to device
should be denied and treated as priv->vdev == Null.

I agree, the handler will early-exit when a buffer is revoked. Though when it _isn't_ revoked, it still needs to go through a careful set of steps to keep vdev around long enough to take the lock (and ensure it still isn't revoked, etc.).

I think the sequence in patch 7 still works (with Alex's patch in [3]), since the invariants still hold:

- if not-revoked then vdev is still valid (IOW, vdev = NULL only happens after revoked = true)
- revoked is only changed when holding priv->dmabuf->resv

OK, [3] doesn't seem to break this series (just context/rebase). Sorry for thinking out loud; it would be good to hear if anyone sees a flaw in my reasoning, though.
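
To make the two invariants above concrete, here is a rough single-threaded userspace model of the ordering I have in mind. It is a sketch only, not the code from patch 7 or from [3]: every model_* name is hypothetical, the locks are plain flags that merely assert lock discipline, and how the real handler keeps vdev alive between dropping resv and taking memory_lock is deliberately left out. The re-check under memory_lock assumes the memory_lock -> resv nesting is acceptable, which is my assumption here.

```c
/* Userspace model only: hypothetical names, not the kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a lock; asserts catch double acquire/release. */
struct model_lock { bool held; };

static void model_lock_acquire(struct model_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void model_lock_release(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct model_vdev { struct model_lock memory_lock; };

struct model_priv {
	struct model_lock resv;		/* models priv->dmabuf->resv */
	bool revoked;
	struct model_vdev *vdev;
};

/* Invariant: revoked is only changed while holding resv. */
static void model_revoke(struct model_priv *p)
{
	model_lock_acquire(&p->resv);
	p->revoked = true;
	model_lock_release(&p->resv);
}

/*
 * Invariant: vdev = NULL only happens after revoked = true.
 * The clear itself is done without resv held, as described for [3].
 */
static void model_cleanup(struct model_priv *p)
{
	model_revoke(p);
	p->vdev = NULL;
}

enum model_fault_result { MODEL_FAULT_OK, MODEL_FAULT_SIGBUS };

static enum model_fault_result model_fault(struct model_priv *p)
{
	struct model_vdev *vdev;
	enum model_fault_result ret = MODEL_FAULT_OK;

	/* Step 1: under resv, revoked is stable; early-exit if set. */
	model_lock_acquire(&p->resv);
	if (p->revoked) {
		model_lock_release(&p->resv);
		return MODEL_FAULT_SIGBUS;
	}
	/* Not revoked, so by the first invariant vdev is still valid. */
	vdev = p->vdev;
	assert(vdev != NULL);
	/* (Real code must keep vdev alive past this point; elided.) */
	model_lock_release(&p->resv);

	/* Step 2: take memory_lock, then re-check revoked under resv. */
	model_lock_acquire(&vdev->memory_lock);
	model_lock_acquire(&p->resv);
	if (p->revoked)
		ret = MODEL_FAULT_SIGBUS;
	model_lock_release(&p->resv);
	/* (Mapping the page would happen here when ret is OK.) */
	model_lock_release(&vdev->memory_lock);

	return ret;
}
```

In this model a fault on a live buffer succeeds, and a fault after model_revoke() or model_cleanup() bails out before vdev is ever dereferenced, so it does not matter that vdev is cleared outside resv.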

[3] was https://lore.kernel.org/all/[email protected]/


Matt
