Hi Alex, On 07/05/2026 23:21, Alex Williamson wrote:
On Tue, 5 May 2026 10:38:31 -0700 Matt Evans <[email protected]> wrote:Since "vfio/pci: Set up barmap in vfio_pci_core_enable()", the resource request and iomap for the BARs was performed early, and vfio_pci_core_setup_barmap() just checks those actions succeeded. Move this logic to a new helper that checks success and returns the iomap address, replacing the various bare vdev->barmap[] lookups. This maintains the error behaviour of the previous on-demand vfio_pci_core_setup_barmap() scheme. Signed-off-by: Matt Evans <[email protected]> --- drivers/vfio/pci/nvgrace-gpu/main.c | 11 ++++------- drivers/vfio/pci/vfio_pci_core.c | 11 +++++------ drivers/vfio/pci/vfio_pci_dmabuf.c | 2 +- drivers/vfio/pci/vfio_pci_rdwr.c | 30 ++++++++--------------------- drivers/vfio/pci/virtio/legacy_io.c | 13 ++++++------- include/linux/vfio_pci_core.h | 20 ++++++++++++++++++- 6 files changed, 43 insertions(+), 44 deletions(-) diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c index fa056b69f899..e153002258ce 100644 --- a/drivers/vfio/pci/nvgrace-gpu/main.c +++ b/drivers/vfio/pci/nvgrace-gpu/main.c @@ -184,13 +184,10 @@ static int nvgrace_gpu_open_device(struct vfio_device *core_vdev)/** GPU readiness is checked by reading the BAR0 registers. - * - * ioremap BAR0 to ensure that the BAR0 mapping is present before - * register reads on first fault before establishing any GPU - * memory mapping. + * The BAR map was just set up by vfio_pci_core_enable(), so + * check that was successful and bail early if not: */ - ret = vfio_pci_core_setup_barmap(vdev, 0); - if (ret) + if (IS_ERR(vfio_pci_core_get_iomap(vdev, 0))) goto error_exit;Sashiko notes we're not setting ret here. The bots are also paranoid about the unreachable condition that the get_iomap below could return an ERR_PTR. Maybe head off both by adding an __iomem pointer to the nvgrace_gpu_pci_core_device struct and a temporary one here. Store the iomap in the temporary variable, use it to test for IS_ERR() and PTR_ERR(), then set the pointer in the structure after the last error condition here. Add one line in the close_device to set it NULL. Then just use nvdev->bar0_io below.
Right about ret. On the 2nd, the bots could benefit from a comment on the ...get_iomap() below saying that it "cannot fail" if this one passes, but hey. I can add a struct member to track it (bots can then worry that it might be NULL, if they don't notice that nvgrace_gpu_check_device_ready() can't happen if nvgrace_gpu_open_device() didn't succeed, etc. etc.).
if (nvdev->resmem.memlength) {@@ -275,7 +272,7 @@ nvgrace_gpu_check_device_ready(struct nvgrace_gpu_pci_core_device *nvdev) if (!__vfio_pci_memory_enabled(vdev)) return -EIO;- ret = nvgrace_gpu_wait_device_ready(vdev->barmap[0]);+ ret = nvgrace_gpu_wait_device_ready(vfio_pci_core_get_iomap(vdev, 0)); if (ret) return ret;diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.cindex 62931dc381d8..5c8bd13f10d0 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -1761,7 +1761,7 @@ int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma struct pci_dev *pdev = vdev->pdev; unsigned int index; u64 phys_len, req_len, pgoff, req_start; - int ret; + void __iomem *bar_io;index = vma->vm_pgoff >> (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT); @@ -1795,12 +1795,11 @@ int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vmareturn -EINVAL;/*- * Even though we don't make use of the barmap for the mmap, - * we need to request the region and the barmap tracks that. + * Ensure the BAR resource region is reserved for use. */ - ret = vfio_pci_core_setup_barmap(vdev, index); - if (ret) - return ret; + bar_io = vfio_pci_core_get_iomap(vdev, index); + if (IS_ERR(bar_io)) + return PTR_ERR(bar_io);vma->vm_private_data = vdev;vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c index 69a5c2d511e6..46cd44b22c9c 100644 --- a/drivers/vfio/pci/vfio_pci_dmabuf.c +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c @@ -248,7 +248,7 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, * else. Check that PCI resources have been claimed for it. */ if (get_dma_buf.region_index >= VFIO_PCI_ROM_REGION_INDEX || - vfio_pci_core_setup_barmap(vdev, get_dma_buf.region_index)) + IS_ERR(vfio_pci_core_get_iomap(vdev, get_dma_buf.region_index))) return -ENODEV;dma_ranges = memdup_array_user(&arg->dma_ranges, get_dma_buf.nr_ranges,diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c index 3bfbb879a005..7f14dd46de17 100644 --- a/drivers/vfio/pci/vfio_pci_rdwr.c +++ b/drivers/vfio/pci/vfio_pci_rdwr.c @@ -198,19 +198,6 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, } EXPORT_SYMBOL_GPL(vfio_pci_core_do_io_rw);-/*- * The barmap is set up in vfio_pci_core_enable(). Callers use this - * function to check that the BAR resources are requested or that the - * pci_iomap() was done. - */ -int vfio_pci_core_setup_barmap(struct vfio_pci_core_device *vdev, int bar) -{ - if (IS_ERR(vdev->barmap[bar])) - return PTR_ERR(vdev->barmap[bar]); - return 0; -} -EXPORT_SYMBOL_GPL(vfio_pci_core_setup_barmap); - ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf, size_t count, loff_t *ppos, bool iswrite) { @@ -262,13 +249,11 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf, */ max_width = VFIO_PCI_IO_WIDTH_4; } else { - int ret = vfio_pci_core_setup_barmap(vdev, bar); - if (ret) { - done = ret; + io = vfio_pci_core_get_iomap(vdev, bar); + if (IS_ERR(io)) { + done = PTR_ERR(io); goto out; } - - io = vdev->barmap[bar]; }if (bar == vdev->msix_bar) {@@ -423,6 +408,7 @@ int vfio_pci_ioeventfd(struct vfio_pci_core_device *vdev, loff_t offset, loff_t pos = offset & VFIO_PCI_OFFSET_MASK; int ret, bar = VFIO_PCI_OFFSET_TO_INDEX(offset); struct vfio_pci_ioeventfd *ioeventfd; + void __iomem *io;/* Only support ioeventfds into BARs */if (bar > VFIO_PCI_BAR5_REGION_INDEX) @@ -440,9 +426,9 @@ int vfio_pci_ioeventfd(struct vfio_pci_core_device *vdev, loff_t offset, if (count == 8) return -EINVAL;- ret = vfio_pci_core_setup_barmap(vdev, bar);- if (ret) - return ret; + io = vfio_pci_core_get_iomap(vdev, bar); + if (IS_ERR(io)) + return PTR_ERR(io);Sashiko seems to note a real existing error here that should also be pulled out to a separate fix. Given the right offset, this could generate a negative BAR value.
Yuck, loff_t signed, yep. Isn't the real root of this that it never makes sense for VFIO_PCI_OFFSET_TO_INDEX() to return a negative index here or anywhere else? I suggest instead, to also avoid this elsewhere in future, something like: #define VFIO_PCI_OFFSET_TO_INDEX(off) ((u64)(off) >> VFIO_PCI_OFFSET_SHIFT)
The test at the end of the previous chunk should should be expanded to `if (bar < 0 || bar > ...BAR5...)`.
Not necessary if VFIO_PCI_OFFSET_TO_INDEX() can't return < 0 (the magnitude would be 24b so can't overflow the `int bar` it's assigned into).
Do you want to pick that up in this series? I think it's the only case that lets that slip through. Thanks,
Sure, I'll post a fix. I don't think it needs to be part of this series though if it's just the macro, do you agree? Do you know why drivers/gpu/drm/i915/gvt/kvmgt.c has copied VFIO_PCI_OFFSET_TO_INDEX() and friends? Perhaps the shift was different (the reason drivers/vfio/pci/ism/main.c has its own versions). The same loff_t issue seems to exist in both of those places, unfortunately. Matt PS: with minor question: Relatedly, I'd made `bar` an int following existing convention in vfio_pci_core_get_iomap(struct vfio_pci_core_device *vdev, int bar) But I'll make this `unsigned int`, please flag if this violates taste and decency. IMO any BAR/index parameter should be unsigned; most are, some signed remain.

