On Thu, May 21, 2026 at 3:24 PM Alex Williamson <[email protected]> wrote: > > > > On Thu, 21 May 2026 16:04:12 -0600 > Alex Williamson <[email protected]> wrote: > > > On Tue, 19 May 2026 13:13:49 -0700 > > Zhiping Zhang <[email protected]> wrote: > > > > > Add a dma-buf get_tph callback for exporters to return TPH > > > (TLP Processing Hints) metadata, and add VFIO_DEVICE_FEATURE_DMA_BUF_TPH > > > so userspace can attach that metadata to a VFIO-exported dma-buf. > > > > This should be two patches, the first extending the dma-buf framework > > for the get_tph callback for explicit approval from dma-buf maintainers > > (who are not even copied here). The second the vfio-pci implementation > > of get_tph. > > > > > 8-bit ST and 16-bit Extended ST are distinct PCIe TPH namespaces; the > > > uAPI carries both with explicit validity flags so importers get the > > > value matching their requested width. SET is write-once per dma-buf; > > > the existing VFIO_DEVICE_FEATURE_DMA_BUF uAPI is unchanged. > > > > I didn't see what motivated this write-once change, I thought we > > understood that it was a userspace problem that the tph values need to > > be set before providing the dma-buf fd to the importer and that races > > relative to that are a userspace ordering problem. Write-once seems > > unnecessarily restrictive and there's no justification provided here. > > > > > Signed-off-by: Zhiping Zhang <[email protected]> > > > --- > > > drivers/vfio/pci/vfio_pci_core.c | 3 + > > > drivers/vfio/pci/vfio_pci_dmabuf.c | 134 +++++++++++++++++++++++++++-- > > > drivers/vfio/pci/vfio_pci_priv.h | 12 +++ > > > include/linux/dma-buf.h | 21 +++++ > > > include/uapi/linux/vfio.h | 35 ++++++++ > > > 5 files changed, 198 insertions(+), 7 deletions(-) > > > > > > diff --git a/drivers/vfio/pci/vfio_pci_core.c > > > b/drivers/vfio/pci/vfio_pci_core.c > > > index 3f8d093aacf8..94aa6dd95701 100644 > > > --- a/drivers/vfio/pci/vfio_pci_core.c > > > +++ b/drivers/vfio/pci/vfio_pci_core.c > > > @@ -1534,6 +1534,9 @@ int vfio_pci_core_ioctl_feature(struct vfio_device > > > *device, u32 flags, > > > return vfio_pci_core_feature_token(vdev, flags, arg, argsz); > > > case VFIO_DEVICE_FEATURE_DMA_BUF: > > > return vfio_pci_core_feature_dma_buf(vdev, flags, arg, argsz); > > > + case VFIO_DEVICE_FEATURE_DMA_BUF_TPH: > > > + return vfio_pci_core_feature_dma_buf_tph(vdev, flags, arg, > > > + argsz); > > > default: > > > return -ENOTTY; > > > } > > > diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c > > > b/drivers/vfio/pci/vfio_pci_dmabuf.c > > > index f87fd32e4a01..be1c65385670 100644 > > > --- a/drivers/vfio/pci/vfio_pci_dmabuf.c > > > +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c > > > @@ -19,7 +19,24 @@ struct vfio_pci_dma_buf { > > > u32 nr_ranges; > > > struct kref kref; > > > struct completion comp; > > > - u8 revoked : 1; > > > + /* > > > + * TPH metadata published by VFIO_DEVICE_FEATURE_DMA_BUF_TPH and > > > + * consumed by the @get_tph dma-buf callback. > > > + * > > > + * @tph_flags is the publish/consume gate: writers populate > > > + * @steering_tag, @steering_tag_ext and @ph first, then store > > > + * @tph_flags with smp_store_release(); readers do > > > + * smp_load_acquire(&tph_flags) before accessing the value fields. > > > + * @tph_flags == 0 means "TPH not set". Writers publish a non-zero > > > + * value only once per dma-buf and serialize via vdev->memory_lock; > > > + * readers stay lockless to avoid AB-BA against the dma_resv_lock held > > > + * by importers. > > > + */ > > > > Can you outline the ABBA hazard, I'm not seeing it. You're acquiring > > memory_lock in the feature SET and dma_resv_lock doesn't appear to be > > held when calling .get_tph(). There's a lot of lockless complication > > here balanced on this claim of avoiding a hazard that doesn't appear > > present. > > > > > + u32 tph_flags; > > > + u16 steering_tag_ext; > > > + u8 steering_tag; > > > + u8 ph; > > > + bool revoked; > > > > If we still used memory_lock for tph, these could be: > > > > u8 tph_st_valid:1; /* memory_lock */ > > u8 tph_st_ext_valid:1; /* memory_lock */ > > u8 tph_ph:2; /* memory_lock */ > > u8 tph_st; > > u16 tph_st_ext; > > u8 revoked:1; /* dma_resv_lock */ > > > > The existing change of @revoked from bitfield to bool has no rationale > > noted for it in the commit log. > > On second thought, what dependency does anything here have on > memory_lock? I think we're jumping through hoops to avoid a lock we > don't even need. If we just want to serialize SET vs get_tph we could > have a mutex on the dma-buf structure, or use RCU if we want to manage > it locklessly and make sure get_tph always sees a fully consistent set > of values. Thanks, > > Alex
Agreed, we don't need memory_lock in this path. For v5 I'll instead add a struct mutex lock to struct vfio_pci_dma_buf and take it in SET, get_tph, and around the priv->vdev = NULL store in cleanup. Thanks, Zhiping
