Re: [PATCH v1 04/10] PCI/P2PDMA: Refactor to separate core P2P functionality from memory allocation

2025-08-06 Thread Jason Gunthorpe
On Wed, Aug 06, 2025 at 03:42:14PM -0600, Alex Williamson wrote: > > + p2p->mem.owner = &pdev->dev; > > + /* On all p2p platforms bus_offset is the same for all BARs */ > > + p2p->mem.bus_offset = > > + pci_bus_address(pdev, 0) - pci_resource_start(pdev, 0); > > But not all devices

Re: [PATCH v2 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-08-05 Thread Jason Gunthorpe
On Mon, Aug 04, 2025 at 11:51:38AM +1000, Alistair Popple wrote: > On Fri, Aug 01, 2025 at 01:57:49PM -0300, Jason Gunthorpe wrote: > > On Fri, Aug 01, 2025 at 06:50:18PM +0200, David Hildenbrand wrote: > > > On 01.08.25 18:40, Jason Gunthorpe wrote: > > > > On F

Re: [PATCH 00/10] convert the majority of file systems to mmap_prepare

2025-08-02 Thread Jason Gunthorpe
On Mon, Jun 16, 2025 at 08:33:19PM +0100, Lorenzo Stoakes wrote: > The intent is to gradually deprecate f_op->mmap, and in that vein this > series converts the majority of file systems to using f_op->mmap_prepare. I saw this on lwn and just wanted to give a little bit of thought on this topic.. I

Re: [PATCH v2 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-08-01 Thread Jason Gunthorpe
On Fri, Aug 01, 2025 at 06:50:18PM +0200, David Hildenbrand wrote: > On 01.08.25 18:40, Jason Gunthorpe wrote: > > On Fri, Jul 25, 2025 at 10:31:25AM +1000, Alistair Popple wrote: > > > > > The only issue would be if there were generic code paths that somehow > > >

Re: [PATCH v2 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-08-01 Thread Jason Gunthorpe
On Sun, Jul 20, 2025 at 11:59:10PM -0700, Christoph Hellwig wrote: > > + /* > > +* Don't fault in device private pages owned by the caller, > > +* just report the PFN. > > +*/ > > + if (pgmap->owner == range->dev_private_owner) { > > + *hmm_pfn = swp_offset_pfn(entry); > >

Re: [PATCH v2 4/5] RDMA/mlx5: Enable P2P DMA with fallback mechanism

2025-08-01 Thread Jason Gunthorpe
On Thu, Jul 24, 2025 at 12:30:34AM -0700, Christoph Hellwig wrote: > On Wed, Jul 23, 2025 at 12:55:22AM -0300, Jason Gunthorpe wrote: > > On Mon, Jul 21, 2025 at 12:03:41AM -0700, Christoph Hellwig wrote: > > > On Fri, Jul 18, 2025 at 02:51:11PM +0300, Yonatan Maman wrote: >

Re: [PATCH v2 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-08-01 Thread Jason Gunthorpe
On Fri, Jul 25, 2025 at 10:31:25AM +1000, Alistair Popple wrote: > The only issue would be if there were generic code paths that somehow have a > raw pfn obtained from neither a page-table walk or struct page. My assumption > (yet to be proven/tested) is that these paths don't exist. hmm does it,

Re: [PATCH 00/10] vfio/pci: Allow MMIO regions to be exported through dma-buf

2025-07-30 Thread Jason Gunthorpe
On Wed, Jul 30, 2025 at 01:58:46PM -0600, Alex Williamson wrote: > On Wed, 23 Jul 2025 16:00:01 +0300 > Leon Romanovsky wrote: > > > From: Leon Romanovsky > > > > --- > > Based on blk and DMA patches which will be sent duri

Re: [PATCH 10/10] vfio/pci: Add dma-buf export support for MMIO regions

2025-07-30 Thread Jason Gunthorpe
On Wed, Jul 30, 2025 at 03:49:45PM +0100, Robin Murphy wrote: > On 2025-07-29 9:13 pm, Jason Gunthorpe wrote: > > On Tue, Jul 29, 2025 at 08:44:21PM +0100, Robin Murphy wrote: > > > > > In this case with just one single > > > contiguous mapping, it is clearly ob

Re: [PATCH 05/10] PCI/P2PDMA: Export pci_p2pdma_map_type() function

2025-07-29 Thread Jason Gunthorpe
On Tue, Jul 29, 2025 at 02:54:13PM -0600, Logan Gunthorpe wrote: > > > On 2025-07-28 17:11, Jason Gunthorpe wrote: > >> If the dma mapping for P2P memory doesn't need to create an iommu > >> mapping then that's fine. But it should be the dma-iommu layer to de

Re: [PATCH 10/10] vfio/pci: Add dma-buf export support for MMIO regions

2025-07-29 Thread Jason Gunthorpe
On Tue, Jul 29, 2025 at 08:44:21PM +0100, Robin Murphy wrote: > In this case with just one single > contiguous mapping, it is clearly objectively worse to have to bounce in and > out of the IOMMU layer 3 separate times and store a dma_map_state, The non-contiguous mappings are coming back, it wa

Re: [PATCH 02/10] PCI/P2PDMA: Introduce p2pdma_provider structure for cleaner abstraction

2025-07-29 Thread Jason Gunthorpe
On Wed, Jul 23, 2025 at 04:00:03PM +0300, Leon Romanovsky wrote: > From: Leon Romanovsky > > Extract the core P2PDMA provider information (device owner and bus > offset) from the dev_pagemap into a dedicated p2pdma_provider structure. > This creates a cleaner separation between the memory managem

Re: [PATCH 05/10] PCI/P2PDMA: Export pci_p2pdma_map_type() function

2025-07-28 Thread Jason Gunthorpe
On Mon, Jul 28, 2025 at 11:07:34AM -0600, Logan Gunthorpe wrote: > > > On 2025-07-28 10:41, Leon Romanovsky wrote: > > On Mon, Jul 28, 2025 at 10:12:31AM -0600, Logan Gunthorpe wrote: > >> > >> > >> On 2025-07-27 13:05, Jason Gunthorpe wrote: > >>

Re: [PATCH 05/10] PCI/P2PDMA: Export pci_p2pdma_map_type() function

2025-07-27 Thread Jason Gunthorpe
On Fri, Jul 25, 2025 at 10:30:46AM -0600, Logan Gunthorpe wrote: > > > On 2025-07-24 02:13, Leon Romanovsky wrote: > > On Thu, Jul 24, 2025 at 10:03:13AM +0200, Christoph Hellwig wrote: > >> On Wed, Jul 23, 2025 at 04:00:06PM +0300, Leon Romanovsky wrote: > >>> From: Leon Romanovsky > >>> > >>>

Re: [PATCH 05/10] PCI/P2PDMA: Export pci_p2pdma_map_type() function

2025-07-27 Thread Jason Gunthorpe
On Thu, Jul 24, 2025 at 10:03:13AM +0200, Christoph Hellwig wrote: > On Wed, Jul 23, 2025 at 04:00:06PM +0300, Leon Romanovsky wrote: > > From: Leon Romanovsky > > > > Export the pci_p2pdma_map_type() function to allow external modules > > and subsystems to determine the appropriate mapping type

Re: [PATCH 02/10] PCI/P2PDMA: Introduce p2pdma_provider structure for cleaner abstraction

2025-07-27 Thread Jason Gunthorpe
On Thu, Jul 24, 2025 at 09:59:22AM +0200, Christoph Hellwig wrote: > On Thu, Jul 24, 2025 at 10:55:33AM +0300, Leon Romanovsky wrote: > > Please, see last patch in the series > > https://lore.kernel.org/all/aea452cc27ca9e5169f7279d7b524190c39e7260.1753274085.git.leo...@nvidia.com > > It gives me a

Re: [PATCH v2 0/5] *** GPU Direct RDMA (P2P DMA) for Device Private Pages ***

2025-07-22 Thread Jason Gunthorpe
On Mon, Jul 21, 2025 at 09:49:04AM +0300, Leon Romanovsky wrote: > > In fact, hmm_range_fault doesn't have information about the destination > > device that will perform the DMA mapping. > > So probably you need to teach HMM to perform page_faults on specific device. That isn't how the HMM side i

Re: [PATCH v2 4/5] RDMA/mlx5: Enable P2P DMA with fallback mechanism

2025-07-22 Thread Jason Gunthorpe
On Mon, Jul 21, 2025 at 12:03:41AM -0700, Christoph Hellwig wrote: > On Fri, Jul 18, 2025 at 02:51:11PM +0300, Yonatan Maman wrote: > > From: Yonatan Maman > > > > Add support for P2P for MLX5 NIC devices with automatic fallback to > > standard DMA when P2P mapping fails. > > That's now how the

Re: [PATCH v2 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-07-22 Thread Jason Gunthorpe
On Tue, Jul 22, 2025 at 10:49:10AM +1000, Alistair Popple wrote: > > So what is it? > > IMHO a hack, because obviously we shouldn't require real physical addresses > for > something the CPU can't actually address anyway and this causes real > problems IMHO what DEVICE PRIVATE really boils down t

Re: [PATCH v2 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-07-18 Thread Jason Gunthorpe
On Fri, Jul 18, 2025 at 03:17:00PM +0100, Matthew Wilcox wrote: > On Fri, Jul 18, 2025 at 02:51:08PM +0300, Yonatan Maman wrote: > > +++ b/include/linux/memremap.h > > @@ -89,6 +89,14 @@ struct dev_pagemap_ops { > > */ > > vm_fault_t (*migrate_to_ram)(struct vm_fault *vmf); > > > > + /

Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl

2025-06-30 Thread Jason Gunthorpe
On Mon, Jun 30, 2025 at 06:01:13PM -0400, Rodrigo Vivi wrote: > On Tue, May 06, 2025 at 03:13:53PM -0300, Jason Gunthorpe wrote: > > On Tue, Apr 29, 2025 at 09:39:56PM +0530, Badal Nilawar wrote: > > > > > diff --git a/drivers/gpu/drm/xe/xe_pcode_fwctl.c >

Re: [RFC PATCH 19/30] vfio/pci: Add TSM TDI bind/unbind IOCTLs for TEE-IO support

2025-06-06 Thread Jason Gunthorpe
On Fri, Jun 06, 2025 at 03:02:49PM +0530, Aneesh Kumar K.V wrote: > Jason Gunthorpe writes: > > > On Thu, Jun 05, 2025 at 09:47:01PM +0530, Aneesh Kumar K.V wrote: > >> Jason Gunthorpe writes: > >> > >> > On Thu, Jun 05, 2025 a

Re: [RFC PATCH 19/30] vfio/pci: Add TSM TDI bind/unbind IOCTLs for TEE-IO support

2025-06-05 Thread Jason Gunthorpe
On Thu, Jun 05, 2025 at 09:47:01PM +0530, Aneesh Kumar K.V wrote: > Jason Gunthorpe writes: > > > On Thu, Jun 05, 2025 at 05:33:52PM +0530, Aneesh Kumar K.V wrote: > > > >> > + > >> > +/* To ensure no host side MMIO

Re: [RFC PATCH 19/30] vfio/pci: Add TSM TDI bind/unbind IOCTLs for TEE-IO support

2025-06-05 Thread Jason Gunthorpe
On Thu, Jun 05, 2025 at 05:33:52PM +0530, Aneesh Kumar K.V wrote: > > + > > + /* To ensure no host side MMIO access is possible */ > > + ret = pci_request_regions_exclusive(pdev, "vfio-pci-tsm"); > > + if (ret) > > + goto out_unlock; > > + > > > > I am hitting failures here with s

Re: [RFC PATCH 19/30] vfio/pci: Add TSM TDI bind/unbind IOCTLs for TEE-IO support

2025-06-05 Thread Jason Gunthorpe
On Thu, Jun 05, 2025 at 05:41:17PM +0800, Xu Yilun wrote: > No, this is not device side TDISP requirement. It is host side > requirement to fix DMA silent drop issue. TDX enforces CPU S2 PT share > with IOMMU S2 PT (does ARM do the same?), so unmap CPU S2 PT in KVM equals > unmap IOMMU S2 PT. > >

Re: [PATCH 07/12] mm: Remove redundant pXd_devmap calls

2025-06-05 Thread Jason Gunthorpe
On Wed, Jun 04, 2025 at 07:35:24PM -0700, Dan Williams wrote: > If all dax pages are special, then vm_normal_page() should never find > them and gup should fail. > > ...oh, but vm_normal_page_p[mu]d() is not used in the gup path, and > 'special' is not set in the pte path. That seems really subo

Re: [bug report] drm/xe/svm: Implement prefetch support for SVM ranges

2025-06-04 Thread Jason Gunthorpe
On Wed, Jun 04, 2025 at 04:54:43PM +0200, Simona Vetter wrote: > On Tue, Jun 03, 2025 at 07:29:52PM -0300, Jason Gunthorpe wrote: > > On Mon, May 26, 2025 at 10:15:17PM +0530, Ghimiray, Himal Prasad wrote: > > > > > > > > > On 26-05-2025 20:36, Dan Carpent

Re: [RFC PATCH 17/30] iommufd/device: Add TSM Bind/Unbind for TIO support

2025-06-04 Thread Jason Gunthorpe
On Wed, Jun 04, 2025 at 02:10:43PM +0530, Aneesh Kumar K.V wrote: > Jason Gunthorpe writes: > > > On Tue, Jun 03, 2025 at 02:20:51PM +0800, Xu Yilun wrote: > >> > Wouldn’t it be simpler to skip the reference count increment altogether > >> > and just cal

Re: [bug report] drm/xe/svm: Implement prefetch support for SVM ranges

2025-06-03 Thread Jason Gunthorpe
On Mon, May 26, 2025 at 10:15:17PM +0530, Ghimiray, Himal Prasad wrote: > > > On 26-05-2025 20:36, Dan Carpenter wrote: > > Hello Himal Prasad Ghimiray, > > > > Commit 09ba0a8f06cd ("drm/xe/svm: Implement prefetch support for SVM > > ranges") from May 13, 2025 (linux-next), leads to the followin

Re: [PATCH 12/12] mm/memremap: Remove unused devmap_managed_key

2025-06-03 Thread Jason Gunthorpe
On Thu, May 29, 2025 at 04:32:13PM +1000, Alistair Popple wrote: > It's no longer used so remove it. > > Signed-off-by: Alistair Popple > --- > mm/memremap.c | 27 --- > 1 file changed, 27 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 11/12] mm: Remove callers of pfn_t functionality

2025-06-03 Thread Jason Gunthorpe
toph Hellwig Yay! Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 10/12] mm: Remove devmap related functions and page table bits

2025-06-03 Thread Jason Gunthorpe
le.h | 19 +-- > mm/Kconfig| 4 +- > mm/debug_vm_pgtable.c | 59 + > mm/hmm.c | 3 +- > mm/madvise.c | 8 +-- > 25 files changed, 17 insertions(+), 318 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 09/12] powerpc: Remove checks for devmap pages and PMDs/PUDs

2025-06-03 Thread Jason Gunthorpe
- > arch/powerpc/mm/book3s64/radix_pgtable.c | 5 ++--- > arch/powerpc/mm/pgtable.c| 2 +- > 6 files changed, 10 insertions(+), 14 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 08/12] mm/khugepaged: Remove redundant pmd_devmap() check

2025-06-03 Thread Jason Gunthorpe
> > Signed-off-by: Alistair Popple > --- > mm/khugepaged.c | 2 -- > 1 file changed, 2 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 07/12] mm: Remove redundant pXd_devmap calls

2025-06-03 Thread Jason Gunthorpe
2 +- > mm/mremap.c| 5 ++--- > mm/page_vma_mapped.c | 5 ++--- > mm/pagewalk.c | 8 +++- > mm/pgtable-generic.c | 7 +++ > mm/userfaultfd.c | 4 ++-- > mm/vmscan.c| 3 --- > 15 files changed, 40 insertions(+), 66 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 06/12] mm/gup: Remove pXX_devmap usage from get_user_pages()

2025-06-03 Thread Jason Gunthorpe
yway so can be removed. > > Signed-off-by: Alistair Popple > --- > include/linux/huge_mm.h | 3 +- > mm/gup.c| 162 + > mm/huge_memory.c| 40 +-- > 3 files changed, 5 insertions(+), 200 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 05/12] mm: Remove remaining uses of PFN_DEV

2025-06-03 Thread Jason Gunthorpe
| 2 +- > include/linux/pfn_t.h | 25 ++--- > mm/memory.c| 4 ++-- > 7 files changed, 11 insertions(+), 36 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 04/12] mm: Convert vmf_insert_mixed() from using pte_devmap to pte_special

2025-06-03 Thread Jason Gunthorpe
.c| 3 --- > mm/memory.c | 20 ++-- > mm/vmscan.c | 2 +- > 3 files changed, 3 insertions(+), 22 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 03/12] mm/pagewalk: Skip dax pages in pagewalk

2025-06-03 Thread Jason Gunthorpe
On Thu, May 29, 2025 at 04:32:04PM +1000, Alistair Popple wrote: > Previously dax pages were skipped by the pagewalk code as pud_special() or > vm_normal_page{_pmd}() would be false for DAX pages. Now that dax pages are > refcounted normally that is no longer the case, so add explicit checks to > s

Re: [PATCH 02/12] mm: Convert pXd_devmap checks to vma_is_dax

2025-06-03 Thread Jason Gunthorpe
2 +- > mm/userfaultfd.c | 2 +- > 3 files changed, 3 insertions(+), 3 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 01/12] mm: Remove PFN_MAP, PFN_SG_CHAIN and PFN_SG_LAST

2025-06-03 Thread Jason Gunthorpe
Christoph Hellwig > --- > include/linux/pfn_t.h | 31 +++ > mm/memory.c | 2 -- > tools/testing/nvdimm/test/iomap.c | 4 > 3 files changed, 3 insertions(+), 34 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [RFC PATCH 17/30] iommufd/device: Add TSM Bind/Unbind for TIO support

2025-06-03 Thread Jason Gunthorpe
On Tue, Jun 03, 2025 at 02:20:51PM +0800, Xu Yilun wrote: > > Wouldn’t it be simpler to skip the reference count increment altogether > > and just call tsm_unbind in the virtual device’s destroy callback? > > (iommufd_vdevice_destroy()) > > The vdevice refcount is the main concern, there is also a

Re: [RFC PATCH 00/30] Host side (KVM/VFIO/IOMMUFD) support for TDISP using TSM

2025-06-02 Thread Jason Gunthorpe
On Thu, May 29, 2025 at 01:34:43PM +0800, Xu Yilun wrote: > This series has 3 sections: I really think this is too big to try to progress, even in RFC form. > Patch 1 - 11 deal with the private MMIO mapping in KVM MMU via DMABUF. > Leverage Jason & Vivek's latest VFIO dmabuf series [3], see Pat

Re: [RFC PATCH 10/30] vfio/pci: Export vfio dma-buf specific info for importers

2025-06-02 Thread Jason Gunthorpe
On Thu, May 29, 2025 at 01:34:53PM +0800, Xu Yilun wrote: > Export vfio dma-buf specific info by attaching vfio_dma_buf_data in > struct dma_buf::priv. Provide a helper vfio_dma_buf_get_data() for > importers to fetch these data. Exporters identify VFIO dma-buf by > successfully getting these data.

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-29 Thread Jason Gunthorpe
On Thu, May 29, 2025 at 10:41:15PM +0800, Xu Yilun wrote: > > On AMD, the host can "revoke" at any time, at worst it'll see RMP > > events from IOMMU. Thanks, > > Is the RMP event firstly detected by host or guest? If by host, > host could fool guest by just suppress the event. Guest thought the

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-16 Thread Jason Gunthorpe
On Fri, May 16, 2025 at 02:19:45PM +0800, Xu Yilun wrote: > > I don't know why you'd disable a viommu while the VM is running, > > doesn't make sense. > > Here it means remove the CC setup for viommu, shared setup is still > kept. That might make sense for the vPCI function, but not the vIOMMU.

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-15 Thread Jason Gunthorpe
On Fri, May 16, 2025 at 02:02:29AM +0800, Xu Yilun wrote: > > IMHO, I think it might be helpful that you can picture out what are the > > minimum requirements (function/life cycle) to the current IOMMUFD TSM > > bind architecture: > > > > 1.host tsm_bind (preparation) is in IOMMUFD, triggered by Q

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-15 Thread Jason Gunthorpe
On Fri, May 16, 2025 at 12:04:04AM +0800, Xu Yilun wrote: > > arches this was mostly invisible to the hypervisor? > > Attest & Accept can be invisible to hypervisor, or host just help pass > data blobs between guest, firmware & device. > > Bind cannot be host agnostic, host should be aware not to

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-14 Thread Jason Gunthorpe
On Wed, May 14, 2025 at 03:02:53PM +0800, Xu Yilun wrote: > > We have an awkward fit for what CCA people are doing to the various > > Linux APIs. Looking somewhat maximally across all the arches a "bind" > > for a CC vPCI device creation operation does: > > > > - Setup the CPU page tables for the

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-12 Thread Jason Gunthorpe
On Mon, May 12, 2025 at 07:30:21PM +1000, Alexey Kardashevskiy wrote: > > > I'm surprised by this.. iommufd shouldn't be doing PCI stuff, it is > > > just about managing the translation control of the device. > > > > I have a little difficulty to understand. Is TSM bind PCI stuff? To me > > it is

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-09 Thread Jason Gunthorpe
On Sat, May 10, 2025 at 12:28:48AM +0800, Xu Yilun wrote: > On Fri, May 09, 2025 at 07:12:46PM +0800, Xu Yilun wrote: > > On Fri, May 09, 2025 at 01:04:58PM +1000, Alexey Kardashevskiy wrote: > > > Ping? > > > > Sorry for late reply from vacation. > > > > > Also, since there is pushback on 01/12

Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl

2025-05-07 Thread Jason Gunthorpe
On Wed, May 07, 2025 at 03:49:15PM -0400, Rodrigo Vivi wrote: > One last thing since I have your attention here. Was any time in the previous > fwctl discussions talked about the possibility of some extra usages for like > FW flashing or in-field-repair/tests where big data needs to filled bypassi

Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl

2025-05-06 Thread Jason Gunthorpe
On Tue, Apr 29, 2025 at 09:39:56PM +0530, Badal Nilawar wrote: > diff --git a/drivers/gpu/drm/xe/xe_pcode_fwctl.c > b/drivers/gpu/drm/xe/xe_pcode_fwctl.c > new file mode 100644 I really do prefer it if you can find a way to put the code in drivers/fwctl instead of in DRM subsystem. > +static int

Re: [PATCH v3 03/33] iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()

2025-04-29 Thread Jason Gunthorpe
On Tue, Apr 29, 2025 at 06:58:32AM -0700, Rob Clark wrote: > On Tue, Apr 29, 2025 at 5:28 AM Jason Gunthorpe wrote: > > > > On Mon, Apr 28, 2025 at 01:54:10PM -0700, Rob Clark wrote: > > > From: Rob Clark > > > > > > In situations where mapp

Re: [PATCH v3 03/33] iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()

2025-04-29 Thread Jason Gunthorpe
On Mon, Apr 28, 2025 at 01:54:10PM -0700, Rob Clark wrote: > From: Rob Clark > > In situations where mapping/unmapping sequence can be controlled by > userspace, attempting to map over a region that has not yet been > unmapped is an error. But not something that should spam dmesg. I think if you

Re: [PATCH 0/3] uio/dma-buf: Give UIO users access to DMA addresses.

2025-04-22 Thread Jason Gunthorpe
On Mon, Apr 14, 2025 at 09:21:25PM +0200, Thomas Petazzoni wrote: > > "UIO is a broken legacy mess, so let's add more broken things > > to it as broken + broken => still broken, so no harm done", am I > > getting that right? > > Who says UIO is a "broken legacy mess"? Only you says so. I don't se

Re: [PATCH v2 0/4] kbuild: resurrect generic header check facility

2025-04-08 Thread Jason Gunthorpe
On Tue, Apr 08, 2025 at 09:42:36PM +0300, Jani Nikula wrote: > On Tue, 08 Apr 2025, Jason Gunthorpe wrote: > > On Tue, Apr 08, 2025 at 11:27:58AM +0300, Jani Nikula wrote: > >> On Mon, 07 Apr 2025, Jason Gunthorpe wrote: > >> > On Mon, Apr 07, 2025 at 10:17

Re: [PATCH v2 0/4] kbuild: resurrect generic header check facility

2025-04-08 Thread Jason Gunthorpe
On Tue, Apr 08, 2025 at 11:27:58AM +0300, Jani Nikula wrote: > On Mon, 07 Apr 2025, Jason Gunthorpe wrote: > > On Mon, Apr 07, 2025 at 10:17:40AM +0300, Jani Nikula wrote: > > > >> Even with Jason's idea [1], you *still* have to start small and opt-in > >>

Re: [PATCH v2 0/4] kbuild: resurrect generic header check facility

2025-04-07 Thread Jason Gunthorpe
On Mon, Apr 07, 2025 at 10:17:40AM +0300, Jani Nikula wrote: > Even with Jason's idea [1], you *still* have to start small and opt-in > (i.e. the patch series at hand). You can't just start off by testing > every header in one go, because it's a flag day switch. You'd add something like 'make he

Re: [git pull] drm for 6.15-rc1

2025-04-04 Thread Jason Gunthorpe
On Mon, Mar 31, 2025 at 01:03:38PM +0200, Simona Vetter wrote: > Hi Linus, > > On Mon, Mar 31, 2025 at 01:17:28PM +0300, Jani Nikula wrote: > > On Fri, 28 Mar 2025, Linus Torvalds wrote: > > > If you want to do that hdrtest thing, do it as part of your *own* > > > checks. Don't make everybody els

Re: [git pull] drm for 6.15-rc1

2025-04-02 Thread Jason Gunthorpe
On Wed, Apr 02, 2025 at 04:41:44PM +0200, Simona Vetter wrote: > - Gradually roll this out, ideally with support in main Kbuild so it > doesn't have to be replicated. No one said flag day, you'd have to approach the same way everyone else has done when adding new compiler errors and warnings to

Re: [git pull] drm for 6.15-rc1

2025-04-02 Thread Jason Gunthorpe
On Wed, Apr 02, 2025 at 03:56:37PM +0300, Jani Nikula wrote: > On Tue, 01 Apr 2025, Jason Gunthorpe wrote: > > On Tue, Apr 01, 2025 at 10:42:35PM +0300, Jani Nikula wrote: > >> On Tue, 01 Apr 2025, Jason Gunthorpe wrote: > >> > So, I'd suggest a better way to

Re: [git pull] drm for 6.15-rc1

2025-04-01 Thread Jason Gunthorpe
On Tue, Apr 01, 2025 at 10:42:35PM +0300, Jani Nikula wrote: > On Tue, 01 Apr 2025, Jason Gunthorpe wrote: > > So, I'd suggest a better way to run this is first build the kernel, > > then mine the gcc -MD output (ie stored in the .XX.cmd files) to > > generate a list of

Re: [git pull] drm for 6.15-rc1

2025-04-01 Thread Jason Gunthorpe
On Wed, Apr 02, 2025 at 03:46:34AM +0900, Masahiro Yamada wrote: > However, it is annoying to make every header self-contained > "just because we are checking this". From my POV it is not "just because we are checking this", I have a very deliberate reason for wanting headers to be self contained:

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-21 Thread Jason Gunthorpe
On Fri, Mar 21, 2025 at 01:12:30PM +0100, Danilo Krummrich wrote: > Not all device resources are managed in the context of the subsystem, so > subsystem-level revokes do not apply. They could, you could say that these rust APIs are only safe to use for device drivers with C code providing a fence

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-21 Thread Jason Gunthorpe
On Fri, Mar 21, 2025 at 11:35:40AM +0100, Simona Vetter wrote: > On Wed, Mar 19, 2025 at 02:21:32PM -0300, Jason Gunthorpe wrote: > > On Thu, Mar 13, 2025 at 03:32:14PM +0100, Simona Vetter wrote: > > > > > So I think you can still achieve that building on top of revoca

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-19 Thread Jason Gunthorpe
On Thu, Mar 13, 2025 at 03:32:14PM +0100, Simona Vetter wrote: > So I think you can still achieve that building on top of revocable and a > few more abstractions that are internally unsafe. Or are you thinking of > different runtime checks? I'm thinking on the access side of the revocable you don

Re: [RFC PATCH 08/12] vfio/pci: Create host unaccessible dma-buf for private device

2025-03-17 Thread Jason Gunthorpe
On Tue, Mar 11, 2025 at 06:37:13PM -0700, Dan Williams wrote: > > There is a use case for using TDISP and getting devices up into an > > encrypted/attested state on pure bare metal without any KVM, VFIO > > should work in that use case too. > > Are you sure you are not confusing the use case for n

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-07 Thread Jason Gunthorpe
On Fri, Mar 07, 2025 at 02:09:12PM +0100, Simona Vetter wrote: > > A driver can do a health check immediately in remove() and make a > > decision if the device is alive or not to speed up removal in the > > hostile hot unplug case. > > Hm ... I guess when you get an all -1 read you check with a s

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-07 Thread Jason Gunthorpe
On Fri, Mar 07, 2025 at 04:19:30PM +0100, Greg KH wrote: > Just like other busses, if PCI can't handle this at the core hotplug > layer (i.e. by giving up new resources to new devices) then the bus core > for it should handle this type of locking scheme as really, that feels > wrong. A new device

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-07 Thread Jason Gunthorpe
On Fri, Mar 07, 2025 at 03:00:09PM +0100, Greg KH wrote: > On Fri, Mar 07, 2025 at 08:32:55AM -0400, Jason Gunthorpe wrote: > > On Fri, Mar 07, 2025 at 11:28:37AM +0100, Simona Vetter wrote: > > > > > > I wouldn't say it is wrong. It is still the correct thing t

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-07 Thread Jason Gunthorpe
On Fri, Mar 07, 2025 at 11:28:37AM +0100, Simona Vetter wrote: > > I wouldn't say it is wrong. It is still the correct thing to do, and > > following down the normal cleanup paths is a good way to ensure the > > special case doesn't have bugs. The primary difference is you want to > > understand t

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-06 Thread Jason Gunthorpe
On Thu, Mar 06, 2025 at 11:42:38AM +0100, Simona Vetter wrote: > > Further, I just remembered, (Danilo please notice!) there is another > > related issue here that DMA mappings *may not* outlive remove() > > either. netdev had a bug related to this recently and it was all > > agreed that it is not

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-05 Thread Jason Gunthorpe
On Wed, Mar 05, 2025 at 08:30:34AM +0100, Simona Vetter wrote: > - developers who want to quickly test new driver versions without full > reboot. They're often preferring convenience over correctness, like with > the removal of module refcounting that's strictly needed but means they > first

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-04 Thread Jason Gunthorpe
On Tue, Mar 04, 2025 at 05:10:45PM +0100, Simona Vetter wrote: > On Fri, Feb 28, 2025 at 02:40:13PM -0400, Jason Gunthorpe wrote: > > On Fri, Feb 28, 2025 at 11:52:57AM +0100, Simona Vetter wrote: > > > > > - Nuke the driver binding manually through sysfs with the unbind

Re: [PATCH 0/4] cover-letter: Allow MMIO regions to be exported through dmabuf

2025-03-04 Thread Jason Gunthorpe
On Tue, Mar 04, 2025 at 03:29:42PM +0100, Christian König wrote: > Am 26.02.25 um 14:38 schrieb Jason Gunthorpe: > > On Wed, Feb 26, 2025 at 07:55:07AM +, Kasireddy, Vivek wrote: > > > >>> Is there any update or ETA for the v3? Are there any ways we can help? >

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-03 Thread Jason Gunthorpe
On Mon, Mar 03, 2025 at 08:36:34PM +0100, Danilo Krummrich wrote: > > > And yes, for *device resources* it is unsound if we do not ensure that the > > > device resource is actually dropped at device unbind. > > > > Why not do a runtime validation then? > > > > It would be easy to have an atomic c

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-28 Thread Jason Gunthorpe
On Fri, Feb 28, 2025 at 11:52:57AM +0100, Simona Vetter wrote: > - Nuke the driver binding manually through sysfs with the unbind files. > - Nuke all userspace that might be holding files and other resources open. > - At this point the module refcount should be zero and you can unload it. > > Exce

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-28 Thread Jason Gunthorpe
On Thu, Feb 27, 2025 at 11:40:53PM +0100, Danilo Krummrich wrote: > On Thu, Feb 27, 2025 at 06:00:13PM -0400, Jason Gunthorpe wrote: > > On Thu, Feb 27, 2025 at 01:25:10PM -0800, Boqun Feng wrote: > > > > > > Most of the cases, it should be naturally achieved, because

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-28 Thread Jason Gunthorpe
, 2025 at 5:02 PM PST, Greg KH wrote: > > >> > On Wed, Feb 26, 2025 at 07:47:30PM -0400, Jason Gunthorpe wrote: > > ... > > > nova is just a drm driver, it's not a rewrite of the drm subsystem, > > > that sort of effort would entail a much larger commitment

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Thu, Feb 27, 2025 at 01:25:10PM -0800, Boqun Feng wrote: > > The design pattern says that 'share it with the rest of the world' is > > a bug. A driver following the pattern cannot do that, it must contain > > the driver objects within the driver scope and free them. In C we > > I cannot speak f

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Thu, Feb 27, 2025 at 06:32:15PM +0100, Danilo Krummrich wrote: > On Thu, Feb 27, 2025 at 08:55:09AM -0800, Boqun Feng wrote: > > On Thu, Feb 27, 2025 at 12:17:33PM -0400, Jason Gunthorpe wrote: > > > > > I still wonder why you couldn't also have these reliable ref

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Thu, Feb 27, 2025 at 07:18:02AM -0800, Boqun Feng wrote: > On Thu, Feb 27, 2025 at 10:46:18AM -0400, Jason Gunthorpe wrote: > > On Wed, Feb 26, 2025 at 04:41:08PM -0800, Boqun Feng wrote: > > > And if you don't store the HrTimerHandle anywhere, like you drop() it >

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Thu, Feb 27, 2025 at 12:32:45PM +0100, Danilo Krummrich wrote: > On Wed, Feb 26, 2025 at 07:47:30PM -0400, Jason Gunthorpe wrote: > > On Wed, Feb 26, 2025 at 10:31:10PM +0100, Danilo Krummrich wrote: > > > Let's take a step back and look again why we have Devres (an

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 04:41:08PM -0800, Boqun Feng wrote: > And if you don't store the HrTimerHandle anywhere, like you drop() it > right after start a hrtimer, it will immediately stop the timer. Does > this make sense? Oh, I understand that, but it is not sufficient in the kernel. You are mak

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 05:02:23PM -0800, Greg KH wrote: > On Wed, Feb 26, 2025 at 07:47:30PM -0400, Jason Gunthorpe wrote: > > The way misc device works you can't unload the module until all the > > FDs are closed and the misc code directly handles races with opening > >

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-26 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 10:31:10PM +0100, Danilo Krummrich wrote: > Let's take a step back and look again why we have Devres (and Revocable) for > e.g. pci::Bar. > > The device / driver model requires that device resources are only held by a > driver, as long as the driver is bound to the device.

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-26 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 02:16:58AM +0100, Danilo Krummrich wrote: > > DRM achieves this, in part, by using drm_dev_unplug(). > > No, DRM can have concurrent driver code running, which is why drm_dev_enter() > returns whether the device is unplugged already, such that subsequent > operations, (e.g.

Re: [PATCH 0/4] cover-letter: Allow MMIO regions to be exported through dmabuf

2025-02-26 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 07:55:07AM +, Kasireddy, Vivek wrote: > > Is there any update or ETA for the v3? Are there any ways we can help? > I believe Leon's series is very close to getting merged. Once it > lands, this series can be revived. The recent drama has made what happens next unclear

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-25 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 12:45:45AM +0100, Danilo Krummrich wrote: > On Tue, Feb 25, 2025 at 06:57:56PM -0400, Jason Gunthorpe wrote: > > The common driver shutdown process in the kernel, that is well tested > > and copied, makes the driver single threaded during the remove() > >

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-25 Thread Jason Gunthorpe
Another colleague told me RDMA also uses SRCU for a similar purpose as > > > well. > > > > See the reasoning against SRCU from Sima [1], what's the reasoning of RDMA? > > > > [1] https://lore.kernel.org/nouveau/Z7XVfnnrRKrtQbB6@phenom.ffwll.local/ > For RDMA, I

Re: On community influencing (was Re: [PATCH v8 2/2] rust: add dma coherent allocator abstraction.)

2025-02-20 Thread Jason Gunthorpe
On Thu, Feb 20, 2025 at 05:24:01PM +0100, Simona Vetter wrote: > Better analogy aside, I fundamentally disagree with understanding > maintainership as a gatekeeper role that exists to keep the chaos out. My > goal is to help build a community where people enjoy collaborating, and > then gtfo so I d

[PATCH rc] gpu: host1x: Do not assume that a NULL domain means no DMA IOMMU

2025-02-04 Thread Jason Gunthorpe
ssume that a NULL domain means no DMA IOMMU"). Fixes: c8cc2655cc6c ("iommu/tegra-smmu: Implement an IDENTITY domain") Reported-by: Diogo Ivo Closes: https://lore.kernel.org/all/c6a6f114-3acd-4d56-a13b-b88978e92...@tecnico.ulisboa.pt/ Tested-by: Diogo Ivo Signed-off-by: Jason Gu

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-02-04 Thread Jason Gunthorpe
On Tue, Feb 04, 2025 at 03:29:48PM +0100, Thomas Hellström wrote: > On Tue, 2025-02-04 at 09:26 -0400, Jason Gunthorpe wrote: > > On Tue, Feb 04, 2025 at 10:32:32AM +0100, Thomas Hellström wrote: > > > > > > > > 1) Existing users would never use the callb

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-02-04 Thread Jason Gunthorpe
On Tue, Feb 04, 2025 at 10:32:32AM +0100, Thomas Hellström wrote: > > I would not be happy to see this. Please improve pagemap directly if > > you think you need more things. > > These are mainly helpers to migrate and populate a range of cpu memory > space (struct mm_struct) with GPU device_priva

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-02-03 Thread Jason Gunthorpe
On Fri, Jan 31, 2025 at 05:59:26PM +0100, Simona Vetter wrote: > So one aspect where I don't like the pgmap->owner approach much is that > it's a big thing to get right, and it feels a bit to me that we don't yet > know the right questions. Well, I would say it isn't really complete yet. No drive

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-01-30 Thread Jason Gunthorpe
On Thu, Jan 30, 2025 at 05:09:44PM +0100, Simona Vetter wrote: > > You could also use an integer instead of a pointer to indicate the > > cluster of interconnect, I think there are many options.. > > Hm yeah I guess an integer allocater of the atomic_inc kind plus "surely > 32bit is enough" could

Re: [PATCH v1 08/12] mm/rmap: handle device-exclusive entries correctly in try_to_unmap_one()

2025-01-30 Thread Jason Gunthorpe
On Thu, Jan 30, 2025 at 02:06:12PM +0100, Simona Vetter wrote: > On Thu, Jan 30, 2025 at 12:08:42PM +0100, David Hildenbrand wrote: > > On 30.01.25 11:10, Simona Vetter wrote: > > > On Wed, Jan 29, 2025 at 12:54:06PM +0100, David Hildenbrand wrote: > > > > Ever since commit b756a3b5e7ea ("mm: devic

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-01-30 Thread Jason Gunthorpe
On Thu, Jan 30, 2025 at 11:50:27AM +0100, Simona Vetter wrote: > On Wed, Jan 29, 2025 at 09:47:57AM -0400, Jason Gunthorpe wrote: > > On Wed, Jan 29, 2025 at 02:38:58PM +0100, Simona Vetter wrote: > > > > > > The pgmap->owner doesn't *have* to fixed, certainl

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-01-29 Thread Jason Gunthorpe
On Wed, Jan 29, 2025 at 02:38:58PM +0100, Simona Vetter wrote: > > The pgmap->owner doesn't *have* to fixed, certainly during early boot before > > you hand out any page references it can be changed. I wouldn't be > > surprised if this is useful to some requirements to build up the > > private int
