Thank you for your contribution! Sashiko AI review found 3 potential issue(s) 
to consider:
- [High] Unconditional validation of the flag-gated `ph` syscall parameter can 
falsely reject valid requests to clear TPH metadata.
- [Medium] Packing new `tph_*` bitfields into the same byte as the existing 
`revoked` bitfield introduces a data race.
- [Medium] Lockless plain read of `priv->vdev` can race with a concurrent write 
of `NULL` during device cleanup.
--

commit d529be6632b034e32bc07cc3b900f28862d27327
Author: Zhiping Zhang <[email protected]>

vfio/pci: implement get_pci_tph and DMA_BUF_TPH feature

This commit implements dma-buf get_pci_tph for vfio-pci exported dma-bufs.
It adds VFIO_DEVICE_FEATURE_DMA_BUF_TPH to allow userspace to publish TPH
metadata for a VFIO-owned device.

> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c 
> b/drivers/vfio/pci/vfio_pci_dmabuf.c
> index c16f460c01d68..d6f5dd3210009 100644
> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> @@ -19,7 +20,14 @@ struct vfio_pci_dma_buf {
>       u32 nr_ranges;
>       struct kref kref;
>       struct completion comp;
> -     u8 revoked : 1;
> +
> +     /* Protected by dmabuf->resv. */
> +     u16 tph_st_ext;
> +     u8 tph_st;
> +     u8 revoked:1;
> +     u8 tph_st_valid:1;
> +     u8 tph_st_ext_valid:1;
> +     u8 tph_ph:2;
>  };

[Severity: Medium]
Does this bitfield packing introduce a data race?

While the new tph_* fields are safely updated under the dmabuf->resv lock
in vfio_pci_core_feature_dma_buf_tph(), existing code in
vfio_pci_dma_buf_move() reads the revoked bit without holding the
dmabuf->resv lock:

vfio_pci_dma_buf_move()
    if (priv->revoked != revoked) {
        dma_resv_lock(priv->dmabuf->resv, NULL);
        ...

Because they share the same underlying byte, the locked read-modify-write of
the tph_* fields will concurrently access the byte while the lockless read of
the revoked flag occurs. This can violate the C memory model and trigger
KCSAN warnings. Could the revoked flag be declared as a standalone bool to
avoid sharing the byte with the dynamic tph_* bitfields?

> [ ... ]
> @@ -333,6 +362,72 @@ int vfio_pci_core_feature_dma_buf(struct 
> vfio_pci_core_device *vdev, u32 flags,
> [ ... ]
> +int vfio_pci_core_feature_dma_buf_tph(struct vfio_pci_core_device *vdev,
> +                                   u32 flags,
> +                                   struct vfio_device_feature_dma_buf_tph 
> __user *arg,
> +                                   size_t argsz)
> +{
> [ ... ]
> +     if (copy_from_user(&set_tph, arg, sizeof(set_tph)))
> +             return -EFAULT;
> +
> +     if (set_tph.flags & ~(VFIO_DMA_BUF_TPH_ST | VFIO_DMA_BUF_TPH_ST_EXT))
> +             return -EINVAL;
> +
> +     if (set_tph.ph & ~0x3)
> +             return -EINVAL;

[Severity: High]
Can this unconditional check falsely reject valid requests to clear TPH
metadata?

When userspace sets the flags to 0 to clear or invalidate published TPH
metadata, the ph parameter is unused and might contain uninitialized stack
garbage. Since syscall parameters that are only meaningful when certain flags
are set should not be validated when those flags are absent, could this
cause legitimate clearing requests to fail with -EINVAL?

> [ ... ]
> +     dmabuf = dma_buf_get(set_tph.dmabuf_fd);
> +     if (IS_ERR(dmabuf))
> +             return PTR_ERR(dmabuf);
> +
> +     if (dmabuf->ops != &vfio_pci_dmabuf_ops) {
> +             ret = -EINVAL;
> +             goto out_put;
> +     }
> +
> +     priv = dmabuf->priv;
> +     if (priv->vdev != vdev) {
> +             ret = -EINVAL;
> +             goto out_put;
> +     }

[Severity: Medium]
Could this plain read of priv->vdev race with a concurrent device cleanup?

Here, priv->vdev is accessed to check ownership without holding any locks
that would protect against cleanup. Concurrently, if the original exporting
device is being closed, vfio_pci_dma_buf_cleanup() sets priv->vdev = NULL
locklessly relative to this ioctl execution context:

vfio_pci_dma_buf_cleanup()
    list_del_init(&priv->dmabufs_elm);
    priv->vdev = NULL;
    vfio_device_put_registration(&vdev->vdev);

This plain read racing with a plain write could cause undefined behavior
under the kernel memory model and trigger KCSAN warnings. Should this be
annotated with READ_ONCE() and WRITE_ONCE() to prevent compiler reordering
or load tearing?

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=3

Reply via email to