Re: [PATCH v3 1/1] PCI/ATS: Check PRI supported on the PF device when SRIOV is enabled

2020-07-23 Thread Lu Baolu
On 7/24/20 6:37 AM, Ashok Raj wrote: PASID and PRI capabilities are only enumerated in PF devices. VF devices do not enumerate these capabilities. IOMMU drivers also need to enumerate them before enabling features in the IOMMU. Extending the same support as PASID feature discovery

Re: [PATCH v2] dma-contiguous: cleanup dma_alloc_contiguous

2020-07-23 Thread Nicolin Chen
Hi Christoph, On Thu, Jul 23, 2020 at 02:01:33PM +0200, Christoph Hellwig wrote: > Split out a cma_alloc_aligned helper to deal with the "interesting" > calling conventions for cma_alloc, which then allows the main > function to be written straightforwardly. This also takes advantage > of the

[PATCH 11/12] iommu/vt-d: Add page response ops support

2020-07-23 Thread Lu Baolu
After page requests are handled, software must respond to the device which raised the page request with the result. This is done through the iommu ops.page_response if the request was reported outside the vendor IOMMU driver through iommu_report_device_fault(). This adds the VT-d implementation

[PATCH 09/12] iommu/vt-d: Add a helper to get svm and sdev for pasid

2020-07-23 Thread Lu Baolu
There are several places in the code that need to get the pointers of svm and sdev according to a pasid and device. Add a helper to achieve this for code consolidation and readability. Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian --- drivers/iommu/intel/svm.c | 115

[PATCH 00/12] [PULL REQUEST] iommu/vt-d: patches for v5.9

2020-07-23 Thread Lu Baolu
Hi Joerg, Below patches have been piled up for v5.9. It includes: - Misc tweaks and fixes for vSVA - Report/response page request events - Cleanups Can you please consider them for iommu/next? Best regards, Lu Baolu Jacob Pan (4): iommu/vt-d: Remove global page support in devTLB flush

[PATCH 08/12] iommu/vt-d: Refactor device_to_iommu() helper

2020-07-23 Thread Lu Baolu
It is refactored in two ways: - Make it global so that it could be used in other files. - Make bus/devfn optional so that callers could ignore these two returned values when they only want to get the corresponding iommu pointer. Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian ---

[PATCH 03/12] iommu/vt-d: Fix PASID devTLB invalidation

2020-07-23 Thread Lu Baolu
From: Jacob Pan DevTLB flush can be used for DMA requests both with and without PASIDs. Requests without a PASID use PASID#0 (RID2PASID); requests with a PASID use a non-zero PASID for SVA usage. This patch adds a check on the PASID value so that devTLB flush with PASID is used for the SVA case. This is more efficient in that

[PATCH 06/12] iommu/vt-d: Warn on out-of-range invalidation address

2020-07-23 Thread Lu Baolu
From: Jacob Pan For guest-requested IOTLB invalidation, address and mask are provided as part of the invalidation data. VT-d HW silently ignores any address bits below the mask. SW shall also allow such a case but give a warning if the address does not align with the mask. This patch relaxes the fault
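
The alignment rule described in this snippet can be sketched roughly as follows. This is not the code from the patch: the helper names are hypothetical, and the sketch assumes 4 KiB pages and that the mask is an order, meaning the invalidation covers 2^mask pages, so it only holds under those assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Hypothetical helper: build a bitmask of the address bits that lie
 * below the invalidation range. Assumes mask_order <= 51 so the shift
 * stays within 64 bits.
 */
static uint64_t sketch_low_bits(unsigned int mask_order)
{
	return (UINT64_C(1) << (SKETCH_PAGE_SHIFT + mask_order)) - 1;
}

/* True if no address bits are set below the mask (no warning needed). */
static bool sketch_inv_addr_is_aligned(uint64_t addr, unsigned int mask_order)
{
	return (addr & sketch_low_bits(mask_order)) == 0;
}

/* The address once the bits below the mask are ignored. */
static uint64_t sketch_inv_addr_effective(uint64_t addr, unsigned int mask_order)
{
	return addr & ~sketch_low_bits(mask_order);
}
```

Under these assumptions a caller would still issue the invalidation with the effective address and only emit a warning when the alignment check fails, rather than rejecting the request.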

[PATCH 07/12] iommu/vt-d: Disable multiple GPASID-dev bind

2020-07-23 Thread Lu Baolu
From: Jacob Pan For the unlikely use case where multiple aux domains from the same pdev are attached to a single guest and then bound to a single process (thus same PASID) within that guest, we cannot easily support this case by refcounting the number of users. As there is only one SL page table

[PATCH 10/12] iommu/vt-d: Report page request faults for guest SVA

2020-07-23 Thread Lu Baolu
A pasid might be bound to a page table from a VM guest via the iommu ops.sva_bind_gpasid. In this case, when a DMA page fault is detected on the physical IOMMU, we need to inject the page fault request into the guest. After the guest completes handling the page fault, a page response needs to be

[PATCH 12/12] iommu/vt-d: Rename intel-pasid.h to pasid.h

2020-07-23 Thread Lu Baolu
As Intel VT-d files have been moved to their own subdirectory, the prefix makes no sense. No functional changes. Signed-off-by: Lu Baolu --- drivers/iommu/intel/debugfs.c | 2 +- drivers/iommu/intel/iommu.c| 2 +- drivers/iommu/intel/pasid.c

[PATCH 05/12] iommu/vt-d: Fix devTLB flush for vSVA

2020-07-23 Thread Lu Baolu
From: Liu Yi L For guest SVA usage, in order to reduce the number of VMEXITs, a guest IOTLB flush request also covers the device TLB. On the host side, the IOMMU driver performs IOTLB and implicit devTLB invalidation. When PASID-selective granularity is requested by the guest we need to derive the

[PATCH 02/12] iommu/vt-d: Remove global page support in devTLB flush

2020-07-23 Thread Lu Baolu
From: Jacob Pan Global pages support is removed from VT-d spec 3.0 for dev TLB invalidation. This patch removes the bits for vSVA. A similar change was already made for native SVA. See the link below. Link: https://lore.kernel.org/linux-iommu/20190830142919.ge11...@8bytes.org/T/

[PATCH 01/12] iommu/vt-d: Enforce PASID devTLB field mask

2020-07-23 Thread Lu Baolu
From: Liu Yi L Set proper masks to avoid invalid input spillover to reserved bits. Signed-off-by: Liu Yi L Signed-off-by: Jacob Pan Reviewed-by: Eric Auger Signed-off-by: Lu Baolu --- include/linux/intel-iommu.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git

[PATCH 04/12] iommu/vt-d: Handle non-page aligned address

2020-07-23 Thread Lu Baolu
From: Liu Yi L Address information for device TLB invalidation comes from userspace when a device is directly assigned to a guest with vIOMMU support. VT-d requires a page-aligned address. This patch checks and enforces that the address is page aligned, otherwise reserved bits can be set in the

Re: [PATCH v3 1/1] PCI/ATS: Check PRI supported on the PF device when SRIOV is enabled

2020-07-23 Thread Bjorn Helgaas
On Thu, Jul 23, 2020 at 03:37:29PM -0700, Ashok Raj wrote: > PASID and PRI capabilities are only enumerated in PF devices. VF devices > do not enumerate these capabilities. IOMMU drivers also need to enumerate > them before enabling features in the IOMMU. Extending the same support as > PASID

[PATCH v3 1/1] PCI/ATS: Check PRI supported on the PF device when SRIOV is enabled

2020-07-23 Thread Ashok Raj
PASID and PRI capabilities are only enumerated in PF devices. VF devices do not enumerate these capabilities. IOMMU drivers also need to enumerate them before enabling features in the IOMMU. Extending the same support as PASID feature discovery (pci_pasid_features) for PRI. Fixes: b16d0cb9e2fc

Re: [PATCH 18/21] iommu/mediatek: Add support for multi domain

2020-07-23 Thread Rob Herring
On Sat, Jul 11, 2020 at 02:48:43PM +0800, Yong Wu wrote: > Some HW IP (e.g. CCU) requires a special iova range. That means the > iova obtained from dma_alloc_attrs for those devices must be located in its > special range. In this patch, we allocate a special iova_range for > each special requirement and

Re: [PATCH] PCI/ATS: PASID and PRI are only enumerated in PF devices.

2020-07-23 Thread Bjorn Helgaas
On Thu, Jul 23, 2020 at 10:38:19AM -0700, Raj, Ashok wrote: > Hi Bjorn > > On Tue, Jul 21, 2020 at 09:54:01AM -0500, Bjorn Helgaas wrote: > > On Mon, Jul 20, 2020 at 09:43:00AM -0700, Ashok Raj wrote: > > > PASID and PRI capabilities are only enumerated in PF devices. VF devices > > > do not

Re: [PATCH] PCI/ATS: PASID and PRI are only enumerated in PF devices.

2020-07-23 Thread Raj, Ashok
Hi Bjorn On Tue, Jul 21, 2020 at 09:54:01AM -0500, Bjorn Helgaas wrote: > On Mon, Jul 20, 2020 at 09:43:00AM -0700, Ashok Raj wrote: > > PASID and PRI capabilities are only enumerated in PF devices. VF devices > > do not enumerate these capabilities. IOMMU drivers also need to enumerate > > them

[PATCH v6 1/6] docs: IOMMU user API

2020-07-23 Thread Jacob Pan
IOMMU UAPI is newly introduced to support communications between guest virtual IOMMU and host IOMMU. There have been many discussions on how it should work with VFIO UAPI and userspace in general. This document is intended to clarify the UAPI design and usage. The mechanics of how future

[PATCH v6 6/6] iommu/vt-d: Check UAPI data processed by IOMMU core

2020-07-23 Thread Jacob Pan
The IOMMU generic layer already sanity-checks UAPI data for version match and argsz range under generic information. Remove the redundant version check from the VT-d driver and check for vendor-specific data size. Signed-off-by: Jacob Pan --- drivers/iommu/intel/iommu.c | 3 +--

[PATCH v6 4/6] iommu/uapi: Rename uapi functions

2020-07-23 Thread Jacob Pan
User APIs such as iommu_sva_unbind_gpasid() may also be used by the kernel. Since we introduced user pointers to the UAPI functions, in-kernel callers cannot share the same APIs. In-kernel callers are also trusted, so there is no need to validate the data. This patch renames all UAPI functions with

[PATCH v6 5/6] iommu/uapi: Handle data and argsz filled by users

2020-07-23 Thread Jacob Pan
IOMMU user APIs are responsible for processing user data. This patch changes the interface such that user pointers can be passed into IOMMU code directly. Separate kernel APIs without user pointers are introduced for in-kernel users of the UAPI functionality. IOMMU UAPI data has a user filled

[PATCH v6 3/6] iommu/uapi: Use named union for user data

2020-07-23 Thread Jacob Pan
IOMMU UAPI data size is filled in by user space and must be validated by the kernel. To ensure backward compatibility, user data can only be extended by either repurposing padding bytes or extending the variable-sized union at the end. No size change is allowed before the union. Therefore, the

[PATCH v6 0/6] IOMMU user API enhancement

2020-07-23 Thread Jacob Pan
IOMMU user API header was introduced to support nested DMA translation and related fault handling. The current UAPI data structures consist of three areas that cover the interactions between host kernel and guest: - fault handling - cache invalidation - bind guest page tables, i.e. guest PASID

[PATCH v6 2/6] iommu/uapi: Add argsz for user filled data

2020-07-23 Thread Jacob Pan
As IOMMU UAPI gets extended, user data size may increase. To support backward compatibility, this patch introduces a size field to each UAPI data structure. It is *always* the responsibility of the user to fill in the correct size. Padding fields are adjusted to ensure 8 byte alignment. Specific
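
A minimal sketch of how a user-filled size field can be validated, assuming a made-up structure. The struct layout, the names, and the policy of reading at most the size the kernel knows about are illustrative assumptions, not the layout or logic from this series.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UAPI-style structure; NOT the real layout from the patch. */
struct sketch_uapi_data {
	uint32_t argsz;   /* filled in by user space: bytes the caller provides */
	uint32_t version; /* illustrative generic field */
	uint64_t payload; /* illustrative data; keeps the struct 8-byte aligned */
};

/*
 * Hypothetical check: return how many bytes may safely be read from the
 * user buffer, or 0 if the caller's argsz is too small to contain even
 * the generic header (argsz + version).
 */
static size_t sketch_usable_size(uint32_t argsz)
{
	const size_t min_size = offsetof(struct sketch_uapi_data, payload);
	const size_t max_size = sizeof(struct sketch_uapi_data);

	if (argsz < min_size)
		return 0;
	/* A newer user space may pass a larger struct; read only what we know. */
	return argsz < max_size ? argsz : max_size;
}
```

The design point this illustrates is that the size travels with the data, so an older kernel can still accept a larger structure from newer user space by ignoring the tail it does not understand.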

[PATCH v9 12/13] iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()

2020-07-23 Thread Jean-Philippe Brucker
The sva_bind() function allows devices to access process address spaces using a PASID (aka SSID). (1) bind() allocates or gets an existing MMU notifier tied to the (domain, mm) pair. Each mm gets one PASID. (2) Any change to the address space calls invalidate_range() which sends ATC

[PATCH v9 10/13] iommu/arm-smmu-v3: Check for SVA features

2020-07-23 Thread Jean-Philippe Brucker
Aggregate all sanity-checks for sharing CPU page tables with the SMMU under a single ARM_SMMU_FEAT_SVA bit. For PCIe SVA, users also need to check FEAT_ATS and FEAT_PRI. For platform SVA, they will have to check FEAT_STALLS. Introduce ARM_SMMU_FEAT_BTM (Broadcast TLB Maintenance), but don't

[PATCH v9 11/13] iommu/arm-smmu-v3: Add SVA device feature

2020-07-23 Thread Jean-Philippe Brucker
Implement the IOMMU device feature callbacks to support the SVA feature. At the moment dev_has_feat() returns false since I/O Page Faults aren't yet implemented. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.h | 26 +++ drivers/iommu/arm-smmu-v3-sva.c | 49

[PATCH v9 08/13] iommu/arm-smmu-v3: Share process page tables

2020-07-23 Thread Jean-Philippe Brucker
With Shared Virtual Addressing (SVA), we need to mirror CPU TTBR, TCR, MAIR and ASIDs in SMMU contexts. Each SMMU has a single ASID space split into two sets, shared and private. Shared ASIDs correspond to those obtained from the arch ASID allocator, and private ASIDs are used for "classic"

[PATCH v9 01/13] mm: Define pasid in mm

2020-07-23 Thread Jean-Philippe Brucker
From: Fenghua Yu PASID is shared by all threads in a process. So the logical place to keep track of it is in the "mm". Both ARM and X86 need to use the PASID in the "mm". Suggested-by: Christoph Hellwig Signed-off-by: Fenghua Yu Reviewed-by: Tony Luck ---

[PATCH v9 09/13] iommu/arm-smmu-v3: Seize private ASID

2020-07-23 Thread Jean-Philippe Brucker
The SMMU has a single ASID space, the union of shared and private ASID sets. This means that the SMMU driver competes with the arch allocator for ASIDs. Shared ASIDs are those of Linux processes, allocated by the arch, and participate in broadcast TLB maintenance. Private ASIDs are allocated by the

[PATCH v9 13/13] iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops

2020-07-23 Thread Jean-Philippe Brucker
The invalidate_range() notifier is called for any change to the address space. Perform the required ATC invalidations. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.h | 2 ++ drivers/iommu/arm-smmu-v3-sva.c | 16 +++- drivers/iommu/arm-smmu-v3.c | 18

[PATCH v9 00/13] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)

2020-07-23 Thread Jean-Philippe Brucker
Add support for sharing CPU page tables with the SMMUv3. Support for I/O page faults and additional features (DVM, VHE and HTTU) needed for SVA is available on my sva/current branch [2] and will be sent later. Since v8 [1]: * Moved the SVA code to arm-smmu-v3-sva.c under CONFIG_ARM_SMMU_V3_SVA.

[PATCH v9 04/13] arm64: mm: Pin down ASIDs for sharing mm with devices

2020-07-23 Thread Jean-Philippe Brucker
To enable address space sharing with the IOMMU, introduce arm64_mm_context_get() and arm64_mm_context_put(), that pin down a context and ensure that it will keep its ASID after a rollover. Export the symbols to let the modular SMMUv3 driver use them. Pinning is necessary because a device

[PATCH v9 02/13] iommu/ioasid: Add ioasid references

2020-07-23 Thread Jean-Philippe Brucker
Let IOASID users take references to existing ioasids with ioasid_get(). ioasid_put() drops a reference and only frees the ioasid when its reference number is zero. It returns true if the ioasid was freed. For drivers that don't call ioasid_get(), ioasid_put() is the same as ioasid_free().
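
The get/put semantics described above can be illustrated with a small standalone sketch. The struct and function names are hypothetical stand-ins rather than the kernel's ioasid implementation, and the sketch ignores locking entirely.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an allocated ID with a reference count. */
struct sketch_ioasid {
	unsigned int refs;
	bool allocated;
};

/* Allocation hands the caller the first reference. */
static void sketch_ioasid_alloc(struct sketch_ioasid *id)
{
	id->refs = 1;
	id->allocated = true;
}

/* Take an additional reference to an existing ID. */
static void sketch_ioasid_get(struct sketch_ioasid *id)
{
	id->refs++;
}

/*
 * Drop a reference. The ID is freed only when the count reaches zero;
 * returns true if this call freed it.
 */
static bool sketch_ioasid_put(struct sketch_ioasid *id)
{
	if (--id->refs > 0)
		return false;
	id->allocated = false;
	return true;
}
```

A caller that never takes an extra reference sees put behave like a plain free: the single reference from allocation is dropped and the ID is released immediately.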

[PATCH v9 07/13] iommu/arm-smmu-v3: Move definitions to a header

2020-07-23 Thread Jean-Philippe Brucker
Allow sharing structure definitions with the upcoming SVA support for Arm SMMUv3, by moving them to a separate header. We could surgically extract only what is needed but keeping all definitions in one place looks nicer. Signed-off-by: Jean-Philippe Brucker --- v9: new ---

[PATCH v9 05/13] iommu/io-pgtable-arm: Move some definitions to a header

2020-07-23 Thread Jean-Philippe Brucker
Extract some of the most generic TCR defines, so they can be reused by the page table sharing code. Acked-by: Will Deacon Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/io-pgtable-arm.h | 30 ++ drivers/iommu/io-pgtable-arm.c | 27 ++-

[PATCH v9 06/13] arm64: cpufeature: Export symbol read_sanitised_ftr_reg()

2020-07-23 Thread Jean-Philippe Brucker
The SMMUv3 driver would like to read the MMFR0 PARANGE field in order to share CPU page tables with devices. Allow the driver to be built as a module by exporting the read_sanitised_ftr_reg() cpufeature symbol. Acked-by: Suzuki K Poulose Signed-off-by: Jean-Philippe Brucker ---

[PATCH v9 03/13] iommu/sva: Add PASID helpers

2020-07-23 Thread Jean-Philippe Brucker
Let IOMMU drivers allocate a single PASID per mm. Store the mm in the IOASID set to allow refcounting and searching mm by PASID, when handling an I/O page fault. Reviewed-by: Lu Baolu Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/Kconfig | 5 +++ drivers/iommu/Makefile

Re: [PATCH v3 0/4] iommu aux-domain APIs extensions

2020-07-23 Thread Lu Baolu
Hi Joerg and Alex, Any comments for this series? Just checking to see whether we could make it for v5.9. The first aux-domain capable device driver has been posted [1]. [1] https://lore.kernel.org/linux-pci/159534667974.28840.2045034360240786644.st...@djiang5-desk3.ch.intel.com/ Best regards,

[PATCH v4 0/2] make dma_alloc_coherent NUMA-aware by per-NUMA CMA

2020-07-23 Thread Barry Song
Ganapatrao Kulkarni has put some effort into making arm-smmu-v3 use local memory to save command queues[1]. I also did a similar job in patch "iommu/arm-smmu-v3: allocate the memory of queues in local numa node" [2] while not realizing Ganapatrao has done that before. But it seems it is much better

[PATCH v4 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-07-23 Thread Barry Song
Right now, drivers like ARM SMMU are using dma_alloc_coherent() to get coherent DMA buffers to save their command queues and page tables. As there is only one default CMA in the whole system, SMMUs on nodes other than node0 will get remote memory. This leads to significant latency. This patch
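
The idea of preferring a node-local CMA area can be sketched as below. Everything here is an assumption for illustration: the names are hypothetical, and the fallback to the single default area when a node has no area of its own is a guess at a reasonable policy, not something the snippet states.

```c
#include <stddef.h>

#define SKETCH_MAX_NODES 4 /* illustrative node count */

/* Hypothetical stand-in for a reserved CMA area. */
struct sketch_cma {
	int id;
};

/*
 * Hypothetical selection: use the area reserved for the device's NUMA
 * node if one exists, otherwise fall back to the default area (the
 * fallback is an assumption of this sketch).
 */
static struct sketch_cma *sketch_pick_cma(struct sketch_cma *per_node[],
					  size_t nr_nodes, int node,
					  struct sketch_cma *default_area)
{
	if (node >= 0 && (size_t)node < nr_nodes && per_node[node])
		return per_node[node];
	return default_area;
}
```

With a table like this, a device on node 1 would allocate from the node 1 area, while a device on a node without a reservation, or with no node information at all, would keep using the default area as before.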

[PATCH v4 2/2] arm64: mm: reserve per-numa CMA to localize coherent dma buffers

2020-07-23 Thread Barry Song
Right now, smmu is using dma_alloc_coherent() to get memory to save queues and tables. Typically, on an ARM64 server, there is a default CMA located at node0, which could be far away from node2, node3 etc. With this patch, smmu will get memory from the local numa node to save command queues and page

RE: [PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-07-23 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Christoph Hellwig [mailto:h...@lst.de] > Sent: Friday, July 24, 2020 12:01 AM > To: Song Bao Hua (Barry Song) > Cc: Christoph Hellwig ; m.szyprow...@samsung.com; > robin.mur...@arm.com; w...@kernel.org; ganapatrao.kulka...@cavium.com; >

[PATCH v2] dma-contiguous: cleanup dma_alloc_contiguous

2020-07-23 Thread Christoph Hellwig
Split out a cma_alloc_aligned helper to deal with the "interesting" calling conventions for cma_alloc, which then allows the main function to be written straightforwardly. This also takes advantage of the fact that NULL dev arguments have been gone from the DMA API for a while. Signed-off-by:

Re: [PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-07-23 Thread Christoph Hellwig
On Wed, Jul 22, 2020 at 09:41:50PM +, Song Bao Hua (Barry Song) wrote: > I got a kernel robot warning which said dev should be checked before being > accessed > when I did a similar change in v1. Probably it was an invalid warning if dev > should > never be null. That usually shows up if a

Re: [PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-07-23 Thread Christoph Hellwig
On Wed, Jul 22, 2020 at 09:26:03PM +, Song Bao Hua (Barry Song) wrote: > I understand your concern. Anyway, The primary purpose of this patchset is > providing > a general way for users like IOMMU to get local coherent dma buffers to put > their > command queue and page tables in. The first

Re: [PATCH v2 2/2] iommu: Add gfp parameter to io_pgtable_ops->map()

2020-07-23 Thread Baolin Wang
On Tue, Jul 14, 2020 at 09:28:21AM +0100, Will Deacon wrote: > On Fri, Jun 12, 2020 at 11:39:55AM +0800, Baolin Wang wrote: > > Now the ARM page tables are always allocated by GFP_ATOMIC parameter, > > but the iommu_ops->map() function has been added a gfp_t parameter by > > commit 781ca2de89ba

RE: [PATCH v5 03/15] iommu/smmu: Report empty domain nesting info

2020-07-23 Thread Liu, Yi L
Hi Jean, > From: Jean-Philippe Brucker > Sent: Friday, July 17, 2020 5:09 PM > > On Thu, Jul 16, 2020 at 10:38:17PM +0200, Auger Eric wrote: > > Hi Jean, > > > > On 7/16/20 5:39 PM, Jean-Philippe Brucker wrote: > > > On Tue, Jul 14, 2020 at 10:12:49AM +, Liu, Yi L wrote: > > >>> Have you

RE: [PATCH v5 03/15] iommu/smmu: Report empty domain nesting info

2020-07-23 Thread Liu, Yi L
Hi Jean, > From: Jean-Philippe Brucker > Sent: Thursday, July 16, 2020 11:40 PM > > On Tue, Jul 14, 2020 at 10:12:49AM +, Liu, Yi L wrote: > > > Have you verified that this doesn't break the existing usage of > > > DOMAIN_ATTR_NESTING in drivers/vfio/vfio_iommu_type1.c? > > > > I didn't

Re: [PATCH] dma-contiguous: cleanup dma_alloc_contiguous

2020-07-23 Thread Christoph Hellwig
On Wed, Jul 22, 2020 at 11:00:48PM -0700, Nicolin Chen wrote: > On Wed, Jul 22, 2020 at 04:43:07PM +0200, Christoph Hellwig wrote: > > Split out a cma_alloc_aligned helper to deal with the "interesting" > > calling conventions for cma_alloc, which then allows the main > > function to be written

Re: [PATCH] dma-contiguous: cleanup dma_alloc_contiguous

2020-07-23 Thread Nicolin Chen
On Wed, Jul 22, 2020 at 04:43:07PM +0200, Christoph Hellwig wrote: > Split out a cma_alloc_aligned helper to deal with the "interesting" > calling conventions for cma_alloc, which then allows the main > function to be written straightforwardly. This also takes advantage > of the fact that NULL