[PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc, free}_contiguous()

2019-05-23 Thread Nicolin Chen
Both dma_alloc_from_contiguous() and dma_release_from_contiguous() are very simply implemented, but requiring callers to pass certain parameters like count and align, and taking a boolean parameter to check __GFP_NOWARN in the allocation flags. So every function call duplicates similar work: /*

[PATCH v3 0/2] Optimize dma_*_from_contiguous calls

2019-05-23 Thread Nicolin Chen
[ Per discussion at v1, we decide to add two new functions and start replacing callers one by one. For this series, it only touches the dma-direct part. And instead of merging two PATCHes, I still keep them separate so that we may easily revert PATCH-2 if anything bad happens as last time

[PATCH v3 2/2] dma-contiguous: Use fallback alloc_pages for single pages

2019-05-23 Thread Nicolin Chen
The addresses within a single page are always contiguous, so it's not so necessary to always allocate one single page from CMA area. Since the CMA area has a limited predefined size of space, it may run out of space in heavy use cases, where there might be quite a lot of CMA pages being allocated for

Re: [PATCH v2 1/2] dma-contiguous: Abstract dma_{alloc, free}_contiguous()

2019-05-23 Thread Nicolin Chen
On Thu, May 23, 2019 at 08:59:30PM -0600, dann frazier wrote: > > > diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c > > > index b2a87905846d..21f39a6cb04f 100644 > > > --- a/kernel/dma/contiguous.c > > > +++ b/kernel/dma/contiguous.c > > > @@ -214,6 +214,54 @@ bool

Re: [PATCH v2 1/2] dma-contiguous: Abstract dma_{alloc, free}_contiguous()

2019-05-23 Thread dann frazier
On Thu, May 23, 2019 at 7:52 PM dann frazier wrote: > > On Mon, May 6, 2019 at 4:35 PM Nicolin Chen wrote: > > > > Both dma_alloc_from_contiguous() and dma_release_from_contiguous() > > are very simply implemented, but requiring callers to pass certain > > parameters like count and align, and

Re: [PATCH v2 1/2] dma-contiguous: Abstract dma_{alloc, free}_contiguous()

2019-05-23 Thread dann frazier
On Mon, May 6, 2019 at 4:35 PM Nicolin Chen wrote: > > Both dma_alloc_from_contiguous() and dma_release_from_contiguous() > are very simply implemented, but requiring callers to pass certain > parameters like count and align, and taking a boolean parameter to > check __GFP_NOWARN in the

[RFC PATCH v4 15/21] watchdog/hardlockup/hpet: Only enable the HPET watchdog via a boot parameter

2019-05-23 Thread Ricardo Neri
Keep the HPET-based hardlockup detector disabled unless explicitly enabled via a command-line argument. If such parameter is not given, the initialization of the hpet-based hardlockup detector fails and the NMI watchdog will fall back to using the perf-based implementation. Given that

[RFC PATCH v4 17/21] x86/tsc: Switch to perf-based hardlockup detector if TSC become unstable

2019-05-23 Thread Ricardo Neri
The HPET-based hardlockup detector relies on the TSC to determine if an observed NMI interrupt originated from the HPET timer. Hence, this detector can no longer be used with an unstable TSC. In such case, permanently stop the HPET-based hardlockup detector and start the perf-based detector.

[RFC PATCH v4 19/21] iommu/vt-d: Rework prepare_irte() to support per-irq delivery mode

2019-05-23 Thread Ricardo Neri
A recent change introduced a new member to struct irq_cfg to specify the delivery mode of an interrupt. Supporting the configuration of the delivery mode would require adding a third argument to prepare_irte(). Instead, simply take a pointer to an irq_cfg data structure as the only argument.

[RFC PATCH v4 00/21] Implement an HPET-based hardlockup detector

2019-05-23 Thread Ricardo Neri
Hi, This is the third attempt to demonstrate the implementation of a hardlockup detector driven by the High-Precision Event Timer. This version provides a few, but important, updates with respect to the previous version (please refer to the Changes since v3 section). The three initial implementations

[RFC PATCH v4 16/21] x86/watchdog: Add a shim hardlockup detector

2019-05-23 Thread Ricardo Neri
The generic hardlockup detector is based on perf. It also provides a set of weak stubs that CPU architectures can override. Add a shim hardlockup detector for x86 that selects between perf and hpet implementations. Specifically, this shim implementation is needed for the HPET-based hardlockup

[RFC PATCH v4 11/21] x86/watchdog/hardlockup: Add an HPET-based hardlockup detector

2019-05-23 Thread Ricardo Neri
This is the initial implementation of a hardlockup detector driven by an HPET timer. This initial implementation includes functions to control the timer via its registers. It also requests such timer, installs an NMI interrupt handler and performs the initial configuration of the timer. The

[RFC PATCH v4 13/21] x86/watchdog/hardlockup/hpet: Determine if HPET timer caused NMI

2019-05-23 Thread Ricardo Neri
The only direct method to determine whether an HPET timer caused an interrupt is to read the Interrupt Status register. Unfortunately, reading HPET registers is slow and, therefore, it is not recommended to read them while in NMI context. Furthermore, status is not available if the interrupt is

[RFC PATCH v4 14/21] watchdog/hardlockup: Use parse_option_str() to handle "nmi_watchdog"

2019-05-23 Thread Ricardo Neri
Prepare hardlockup_panic_setup() to handle a comma-separated list of options. This is needed to pass options to specific implementations of the hardlockup detector. Cc: "H. Peter Anvin" Cc: Ashok Raj Cc: Andi Kleen Cc: Tony Luck Cc: Peter Zijlstra Cc: Clemens Ladisch Cc: Arnd Bergmann Cc:
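
As an illustration of what handling a comma-separated option list involves, here is a minimal user-space sketch in C. It is not the kernel's parse_option_str() and not code from this patch: option_in_list() is a hypothetical name, and the option strings mentioned in the note below are examples only. It shows the key point, which is that an option must match a whole list element rather than a substring.

```c
#include <stdbool.h>
#include <string.h>

/*
 * User-space model of matching one option inside a comma-separated list.
 * option_in_list() is a hypothetical name, not the kernel helper itself.
 */
bool option_in_list(const char *list, const char *option)
{
	size_t len = strlen(option);

	while (*list) {
		/* Match only if the option is followed by ',' or the end. */
		if (strncmp(list, option, len) == 0 &&
		    (list[len] == '\0' || list[len] == ','))
			return true;

		/* Skip to the start of the next list element. */
		while (*list && *list != ',')
			list++;
		if (*list == ',')
			list++;
	}
	return false;
}
```

With this sketch, a list such as "panic,hpet" matches both "panic" and "hpet", while "nopanic" does not match "panic", because the comparison always starts at an element boundary.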

[RFC PATCH v4 21/21] x86/watchdog/hardlockup/hpet: Support interrupt remapping

2019-05-23 Thread Ricardo Neri
When interrupt remapping is enabled in the system, the MSI interrupt message must follow a special format the IOMMU can understand. Hence, utilize the functionality provided by the IOMMU driver for such purpose. The first step is to determine whether interrupt remapping is enabled by looking for

[RFC PATCH v4 20/21] iommu/vt-d: hpet: Reserve an interrupt remampping table entry for watchdog

2019-05-23 Thread Ricardo Neri
When interrupt remapping is enabled, MSI interrupt messages must follow a special format that the IOMMU can understand. Hence, when the HPET hard lockup detector is used with interrupt remapping, it must also follow this special format. The IOMMU, given the information about a particular

[RFC PATCH v4 10/21] watchdog/hardlockup: Add function to enable NMI watchdog on all allowed CPUs at once

2019-05-23 Thread Ricardo Neri
When there is more than one implementation of the NMI watchdog, there may be situations in which switching from one to another is needed (e.g., if the time-stamp counter becomes unstable, the HPET-based NMI watchdog can no longer be used. The perf-based implementation of the hardlockup detector

[RFC PATCH v4 12/21] watchdog/hardlockup/hpet: Adjust timer expiration on the number of monitored CPUs

2019-05-23 Thread Ricardo Neri
Each CPU should be monitored for hardlockups every watchdog_thresh seconds. Since all the CPUs in the system are monitored by the same timer and the timer interrupt is rotated among the monitored CPUs, the timer must expire every watchdog_thresh/N seconds; where N is the number of monitored CPUs.
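
The watchdog_thresh/N relationship can be worked through with a small C sketch. This is an illustration under stated assumptions, not code from the patch: the function name and parameters are hypothetical, and the timer frequency is taken as an input in ticks per second.

```c
#include <stdint.h>

/*
 * Number of timer ticks between expirations when one timer is rotated
 * among all monitored CPUs. Each CPU must be checked once every
 * watchdog_thresh seconds, so the shared timer fires every
 * watchdog_thresh / monitored_cpus seconds.
 *
 * Hypothetical helper for illustration; returns 0 if no CPU is monitored.
 */
uint64_t hpet_wd_ticks_per_expiration(uint64_t ticks_per_second,
				      unsigned int watchdog_thresh,
				      unsigned int monitored_cpus)
{
	if (monitored_cpus == 0)
		return 0;

	/* Multiply before dividing to limit rounding loss. */
	return ticks_per_second * watchdog_thresh / monitored_cpus;
}
```

For example, with a 10 MHz timer, a 10-second threshold, and 4 monitored CPUs, the timer would expire every 2.5 seconds, i.e. every 25,000,000 ticks, and each CPU would still be visited once per 10 seconds.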

[RFC PATCH v4 08/21] watchdog/hardlockup: Decouple the hardlockup detector from perf

2019-05-23 Thread Ricardo Neri
The current default implementation of the hardlockup detector assumes that it is implemented using perf events. However, the hardlockup detector can be driven by other sources of non-maskable interrupts (e.g., a properly configured timer). Group and wrap in #ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF

[RFC PATCH v4 01/21] x86/msi: Add definition for NMI delivery mode

2019-05-23 Thread Ricardo Neri
Until now, the delivery mode of MSI interrupts is set to the default mode set in the APIC driver. However, there are no restrictions in hardware to configure each interrupt with a different delivery mode. Specifying the delivery mode per interrupt is useful when one is interested in changing the

[RFC PATCH v4 05/21] x86/hpet: Reserve timer for the HPET hardlockup detector

2019-05-23 Thread Ricardo Neri
HPET timer 2 will be used to drive the HPET-based hardlockup detector. Reserve such timer to ensure it cannot be used by user space programs or for clock events. When looking for MSI-capable timers for clock events, skip timer 2 if the HPET hardlockup detector is selected. Cc: "H. Peter Anvin"

[RFC PATCH v4 07/21] watchdog/hardlockup: Define a generic function to detect hardlockups

2019-05-23 Thread Ricardo Neri
The procedure to detect hardlockups is independent of the underlying mechanism that generates the non-maskable interrupt used to drive the detector. Thus, it can be put in a separate, generic function. In this manner, it can be invoked by various implementations of the NMI watchdog. For this

[RFC PATCH v4 04/21] x86/hpet: Add hpet_set_comparator() for periodic and one-shot modes

2019-05-23 Thread Ricardo Neri
Instead of setting the timer period directly in hpet_set_periodic(), add a new helper function hpet_set_comparator() that only sets the accumulator and comparator. hpet_set_periodic() will only prepare the timer for periodic mode and leave the expiration programming to hpet_set_comparator(). This

[RFC PATCH v4 03/21] x86/hpet: Calculate ticks-per-second in a separate function

2019-05-23 Thread Ricardo Neri
It is easier to compute the expiration times of an HPET timer by using its frequency (i.e., the number of times it ticks in a second) than its period, as given in the capabilities register. In addition to the HPET char driver, the HPET-based hardlockup detector will also need to know the timer's
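
The period-to-frequency conversion can be sketched in C as follows. The HPET capabilities register reports the counter period in femtoseconds, so the frequency is 10^15 divided by that period. The function name is hypothetical and the sketch uses plain truncating division; the patch itself may round differently.

```c
#include <stdint.h>

#define FEMTOSECONDS_PER_SECOND 1000000000000000ULL

/*
 * Convert an HPET counter period (in femtoseconds, as reported by the
 * capabilities register) into ticks per second.
 *
 * Hypothetical helper for illustration; returns 0 for an invalid
 * (zero) period instead of dividing by zero.
 */
uint64_t hpet_ticks_per_second(uint32_t period_fs)
{
	if (period_fs == 0)
		return 0;

	return FEMTOSECONDS_PER_SECOND / period_fs;
}
```

A period of 100,000,000 fs (100 ns) gives 10,000,000 ticks per second; expiration times in seconds can then be converted to ticks with a single multiplication.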

[RFC PATCH v4 18/21] x86/apic: Add a parameter for the APIC delivery mode

2019-05-23 Thread Ricardo Neri
Until now, the delivery mode of APIC interrupts is set to the default mode set in the APIC driver. However, there are no restrictions in hardware to configure each interrupt with a different delivery mode. Specifying the delivery mode per interrupt is useful when one is interested in changing the

[RFC PATCH v4 02/21] x86/hpet: Expose hpet_writel() in header

2019-05-23 Thread Ricardo Neri
In order to allow hpet_writel() to be used by other components (e.g., the HPET-based hardlockup detector) expose it in the HPET header file. No empty definition is needed if CONFIG_HPET is not selected as all existing callers select such config symbol. Cc: "H. Peter Anvin" Cc: Ashok Raj Cc:

[RFC PATCH v4 06/21] x86/hpet: Configure the timer used by the hardlockup detector

2019-05-23 Thread Ricardo Neri
Implement the initial configuration of the timer to be used by the hardlockup detector. Return a data structure with a description of the timer; this information is subsequently used by the hardlockup detector. Only provide the timer if it supports Front Side Bus interrupt delivery. This

[RFC PATCH v4 09/21] x86/nmi: Add a NMI_WATCHDOG NMI handler category

2019-05-23 Thread Ricardo Neri
Add NMI_WATCHDOG as a new category of NMI handler. This new category is to be used with the HPET-based hardlockup detector. This detector does not have a direct way of checking if the HPET timer is the source of the NMI. Instead, it indirectly estimates it using the time-stamp counter. Therefore,

[PATCH v5 00/10] Support using MSI interrupts in ntb_transport

2019-05-23 Thread Logan Gunthorpe
This is another resend as there has been no feedback since v4. Seems Jon has been MIA this past cycle so hopefully he appears on the list soon. I've addressed the feedback so far and rebased on the latest kernel and would like this to be considered for merging this cycle. The only outstanding

[PATCH v5 08/10] NTB: Add ntb_msi_test support to ntb_test

2019-05-23 Thread Logan Gunthorpe
When the ntb_msi_test module is available, the test code will trigger each of the interrupts and ensure the corresponding occurrences files gets incremented. Signed-off-by: Logan Gunthorpe Cc: Jon Mason Cc: Dave Jiang Cc: Allen Hubbe --- tools/testing/selftests/ntb/ntb_test.sh | 54

[PATCH v5 07/10] NTB: Introduce NTB MSI Test Client

2019-05-23 Thread Logan Gunthorpe
Introduce a tool to test NTB MSI interrupts similar to the other NTB test tools. This tool creates a debugfs directory for each NTB device with the following files: port irqX_occurrences peerX/port peerX/count peerX/trigger The 'port' file tells the user the local port number and the

[PATCH v5 09/10] NTB: Add MSI interrupt support to ntb_transport

2019-05-23 Thread Logan Gunthorpe
Introduce the module parameter 'use_msi' which, when set, uses MSI interrupts instead of doorbells for each queue pair (QP). The parameter is only available if NTB MSI support is configured into the kernel. We also require there to be more than one memory window (MW) so that an extra one is

[PATCH v5 06/10] NTB: Introduce MSI library

2019-05-23 Thread Logan Gunthorpe
The NTB MSI library allows passing MSI interrupts across a memory window. This offers similar functionality to doorbells or messages except will often have much better latency and the client can potentially use significantly more remote interrupts than typical hardware provides for doorbells.

[PATCH v5 03/10] NTB: Introduce helper functions to calculate logical port number

2019-05-23 Thread Logan Gunthorpe
This patch introduces the "Logical Port Number" which is similar to the "Port Number" in that it enumerates the ports in the system. The original (or Physical) "Port Number" can be any number used by the hardware to uniquely identify a port in the system. The "Logical Port Number" enumerates all

[PATCH v5 05/10] NTB: Rename ntb.c to support multiple source files in the module

2019-05-23 Thread Logan Gunthorpe
The kbuild system does not support having multiple source files in a module if one of those source files has the same name as the module. Therefore, we must rename ntb.c to core.c, while the module remains ntb.ko. This is similar to the way the nvme modules are structured. Signed-off-by: Logan

[PATCH v5 01/10] PCI/MSI: Support allocating virtual MSI interrupts

2019-05-23 Thread Logan Gunthorpe
For NTB devices, we want to be able to trigger MSI interrupts through a memory window. In these cases we may want to use more interrupts than the NTB PCI device has available in its MSI-X table. We allow for this by creating a new 'virtual' interrupt. These interrupts are allocated as usual but

[PATCH v5 02/10] PCI/switchtec: Add module parameter to request more interrupts

2019-05-23 Thread Logan Gunthorpe
Seeing that we want to use more interrupts in the NTB MSI code, we need to be able to allocate more (sometimes virtual) interrupts in the switchtec driver. Therefore add a module parameter to request to allocate additional interrupts. This puts virtually no limit on the number of MSI interrupts

[PATCH v5 10/10] NTB: Describe the ntb_msi_test client in the documentation.

2019-05-23 Thread Logan Gunthorpe
Add a blurb in Documentation/ntb.txt to describe the ntb_msi_test tool's debugfs interface. Similar to the (out of date) ntb_tool description. Signed-off-by: Logan Gunthorpe --- Documentation/ntb.txt | 27 +++ 1 file changed, 27 insertions(+) diff --git

[PATCH v5 04/10] NTB: Introduce functions to calculate multi-port resource index

2019-05-23 Thread Logan Gunthorpe
When using multi-ports each port uses resources (dbs, msgs, mws, etc) on every other port. Creating a mapping for these resources such that each port has a corresponding resource on every other port is a bit tricky. Introduce the ntb_peer_resource_idx() function for this purpose. It returns the

Re: [PATCH v2 03/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2

2019-05-23 Thread Jordan Crouse
On Tue, May 21, 2019 at 07:18:32PM +0100, Robin Murphy wrote: > On 21/05/2019 17:13, Jordan Crouse wrote: > >Add support for a split pagetable (TTBR0/TTBR1) scheme for arm-smmu-v2. > >If split pagetables are enabled, create a pagetable for TTBR1 and set > >up the sign extension bit so that all

Re: [PATCH 3/4] iommu: Introduce device fault report API

2019-05-23 Thread Robin Murphy
On 23/05/2019 19:06, Jean-Philippe Brucker wrote: From: Jacob Pan Traditionally, device specific faults are detected and handled within their own device drivers. When IOMMU is enabled, faults such as DMA related transactions are detected by IOMMU. There is no generic reporting mechanism to

Re: [PATCH 2/4] iommu: Introduce device fault data

2019-05-23 Thread Robin Murphy
On 23/05/2019 19:06, Jean-Philippe Brucker wrote: From: Jacob Pan Device faults detected by IOMMU can be reported outside the IOMMU subsystem for further processing. This patch introduces a generic device fault data structure. The fault can be either an unrecoverable fault or a page request,

[PATCH 2/4] iommu: Introduce device fault data

2019-05-23 Thread Jean-Philippe Brucker
From: Jacob Pan Device faults detected by IOMMU can be reported outside the IOMMU subsystem for further processing. This patch introduces a generic device fault data structure. The fault can be either an unrecoverable fault or a page request, also referred to as a recoverable fault. We only

[PATCH 4/4] iommu: Add recoverable fault reporting

2019-05-23 Thread Jean-Philippe Brucker
Some IOMMU hardware features, for example PCI PRI and Arm SMMU Stall, enable recoverable I/O page faults. Allow IOMMU drivers to report PRI Page Requests and Stall events through the new fault reporting API. The consumer of the fault can be either an I/O page fault handler in the host, or a guest

[PATCH 3/4] iommu: Introduce device fault report API

2019-05-23 Thread Jean-Philippe Brucker
From: Jacob Pan Traditionally, device specific faults are detected and handled within their own device drivers. When IOMMU is enabled, faults such as DMA related transactions are detected by IOMMU. There is no generic reporting mechanism to report faults back to the in-kernel device driver or

[PATCH 0/4] iommu: Add device fault reporting API

2019-05-23 Thread Jean-Philippe Brucker
Allow device drivers and VFIO to get notifications on IOMMU translation fault, and to handle recoverable faults (PCI PRI). These four patches are relatively mature since they are required by three different series, and have been under discussion for a while: * Nested translation support for

[PATCH 1/4] driver core: Add per device iommu param

2019-05-23 Thread Jean-Philippe Brucker
From: Jacob Pan DMA faults can be detected by IOMMU at device level. Adding a pointer to struct device allows IOMMU subsystem to report relevant faults back to the device driver for further handling. For direct assigned device (or user space drivers), guest OS holds responsibility to handle and

Re: [PATCH] swiotlb: sync buffer when mapping FROM_DEVICE

2019-05-23 Thread Robin Murphy
On 23/05/2019 17:43, Christoph Hellwig wrote: On Thu, May 23, 2019 at 07:35:07AM +0200, Marek Szyprowski wrote: Don't we have DMA_BIDIRECTIONAL for such case? Not sure if it was intended for that case, but it definitively should do the right thing for swiotlb, and it should also do the right

Re: [PATCH] swiotlb: sync buffer when mapping FROM_DEVICE

2019-05-23 Thread Horia Geanta
On 5/23/2019 7:43 PM, Christoph Hellwig wrote: > On Thu, May 23, 2019 at 07:35:07AM +0200, Marek Szyprowski wrote: >> Don't we have DMA_BIDIRECTIONAL for such case? > > Not sure if it was intended for that case, but it definitively should > do the right thing for swiotlb, and it should also do

Re: [PATCH] swiotlb: sync buffer when mapping FROM_DEVICE

2019-05-23 Thread Christoph Hellwig
On Thu, May 23, 2019 at 07:35:07AM +0200, Marek Szyprowski wrote: > Don't we have DMA_BIDIRECTIONAL for such case? Not sure if it was intended for that case, but it definitively should do the right thing for swiotlb, and it should also do the right thing in terms of cache maintenance. > Maybe

Re: [PATCH] swiotlb: sync buffer when mapping FROM_DEVICE

2019-05-23 Thread Horia Geanta
On 5/23/2019 8:35 AM, Marek Szyprowski wrote: > Hi Robin, > > On 2019-05-22 15:55, Robin Murphy wrote: >> On 22/05/2019 14:34, Christoph Hellwig wrote: >>> On Wed, May 22, 2019 at 02:25:38PM +0100, Robin Murphy wrote: Sure, but that should be irrelevant since the effective problem here

Re: [PATCH v3 2/3] vfio: zpci: defining the VFIO headers

2019-05-23 Thread Cornelia Huck
On Thu, 23 May 2019 14:25:25 +0200 Pierre Morel wrote: > We define a new device region in vfio.h to be able to > get the ZPCI CLP information by reading this region from > userland. > > We create a new file, vfio_zdev.h to define the structure > of the new region we defined in vfio.h > >

Re: [PATCH v3 04/16] ioasid: Add custom IOASID allocator

2019-05-23 Thread Jacob Pan
On Thu, 23 May 2019 09:14:07 +0200 Auger Eric wrote: > Hi Jacob, > > On 5/22/19 9:42 PM, Jacob Pan wrote: > > On Tue, 21 May 2019 11:55:55 +0200 > > Auger Eric wrote: > > > >> Hi Jacob, > >> > >> On 5/4/19 12:32 AM, Jacob Pan wrote: > >>> Sometimes, IOASID allocation must be handled by

Re: [RFC v3 0/3] vfio_pci: wrap pci device as a mediated device

2019-05-23 Thread Alex Williamson
On Thu, 23 May 2019 08:44:57 + "Liu, Yi L" wrote: > Hi Alex, > > Sorry to disturb you. Do you want to review on this version or review a > rebased version? :-) If rebase version is better, I can try to do it asap. Hi Yi, Perhaps you missed my comments on 1/3:

Re: [PATCH v5 1/1] iommu/io-pgtable-arm: Add support to use system cache

2019-05-23 Thread Vivek Gautam
On Thu, May 23, 2019 at 4:11 PM Robin Murphy wrote: > > On 2019-05-16 10:30 am, Vivek Gautam wrote: > > Few Qualcomm platforms such as, sdm845 have an additional outer > > cache called as System cache, aka. Last level cache (LLC) that > > allows non-coherent devices to upgrade to using caching. >

Re: implement generic dma_map_ops for IOMMUs v6

2019-05-23 Thread Robin Murphy
On 23/05/2019 08:00, Christoph Hellwig wrote: Hi Robin and Joerg, I think we are finally ready for the generic dma-iommu series. I have various DMA API changes pending, and Tom has patches ready to convert the AMD and Intel iommu drivers over to it. I'd love to have this in a stable branch

[PATCH v3 1/3] s390: pci: Exporting access to CLP PCI function and PCI group

2019-05-23 Thread Pierre Morel
For the generic implementation of VFIO PCI we need to retrieve the hardware configuration for the PCI functions and the PCI function groups. We modify the internal function using CLP Query PCI function and CLP query PCI function group so that they can be called from outside the S390 architecture

[PATCH v3 0/3] Retrieving zPCI specific info with VFIO

2019-05-23 Thread Pierre Morel
We define a new configuration entry for VFIO/PCI, VFIO_PCI_ZDEV to configure access to a zPCI region dedicated for retrieving zPCI features. When the VFIO_PCI_ZDEV feature is configured we initialize a new device region, VFIO_REGION_SUBTYPE_ZDEV_CLP, to hold the information from the ZPCI device

[PATCH v3 2/3] vfio: zpci: defining the VFIO headers

2019-05-23 Thread Pierre Morel
We define a new device region in vfio.h to be able to get the ZPCI CLP information by reading this region from userland. We create a new file, vfio_zdev.h to define the structure of the new region we defined in vfio.h Signed-off-by: Pierre Morel --- include/uapi/linux/vfio.h | 4

[PATCH v3 3/3] vfio: pci: Using a device region to retrieve zPCI information

2019-05-23 Thread Pierre Morel
We define a new configuration entry for VFIO/PCI, VFIO_PCI_ZDEV When the VFIO_PCI_ZDEV feature is configured we initialize a new device region, VFIO_REGION_SUBTYPE_ZDEV_CLP, to hold the information from the ZPCI device the userland needs to give to a guest driving the zPCI function.

Re: [PATCH v5 1/1] iommu/io-pgtable-arm: Add support to use system cache

2019-05-23 Thread Robin Murphy
On 2019-05-16 10:30 am, Vivek Gautam wrote: Few Qualcomm platforms such as, sdm845 have an additional outer cache called as System cache, aka. Last level cache (LLC) that allows non-coherent devices to upgrade to using caching. This cache sits right before the DDR, and is tightly coupled with

RE: [RFC v3 0/3] vfio_pci: wrap pci device as a mediated device

2019-05-23 Thread Liu, Yi L
Hi Alex, Sorry to disturb you. Do you want to review on this version or review a rebased version? :-) If rebase version is better, I can try to do it asap. Thanks, Yi Liu > -Original Message- > From: Liu, Yi L > Sent: Tuesday, April 23, 2019 8:15 PM > To: alex.william...@redhat.com;

Re: [PATCH v5 1/1] iommu/io-pgtable-arm: Add support to use system cache

2019-05-23 Thread Vivek Gautam
Hi Robin, On Thu, May 16, 2019 at 3:00 PM Vivek Gautam wrote: > > Few Qualcomm platforms such as, sdm845 have an additional outer > cache called as System cache, aka. Last level cache (LLC) that > allows non-coherent devices to upgrade to using caching. > This cache sits right before the DDR,

Re: [PATCH v3 04/16] ioasid: Add custom IOASID allocator

2019-05-23 Thread Auger Eric
Hi Jacob, On 5/22/19 9:42 PM, Jacob Pan wrote: > On Tue, 21 May 2019 11:55:55 +0200 > Auger Eric wrote: > >> Hi Jacob, >> >> On 5/4/19 12:32 AM, Jacob Pan wrote: >>> Sometimes, IOASID allocation must be handled by platform specific >>> code. The use cases are guest vIOMMU and pvIOMMU where

[PATCH 06/23] iommu/dma: Move domain lookup into __iommu_dma_{map, unmap}

2019-05-23 Thread Christoph Hellwig
From: Robin Murphy Most of the callers don't care, and the couple that do already have the domain to hand for other reasons are in slow paths where the (trivial) overhead of a repeated lookup will be utterly immaterial. Signed-off-by: Robin Murphy Signed-off-by: Christoph Hellwig ---

[PATCH 20/23] iommu/dma: Don't depend on CONFIG_DMA_DIRECT_REMAP

2019-05-23 Thread Christoph Hellwig
For entirely dma coherent architectures there is no requirement to ever remap dma coherent allocation. Move all the remap and pool code under IS_ENABLED() checks and drop the Kconfig dependency. Signed-off-by: Christoph Hellwig Reviewed-by: Robin Murphy --- drivers/iommu/Kconfig | 1 -

[PATCH 21/23] iommu/dma: Switch copyright boilerplace to SPDX

2019-05-23 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig Acked-by: Robin Murphy --- drivers/iommu/dma-iommu.c | 13 + include/linux/dma-iommu.h | 13 + 2 files changed, 2 insertions(+), 24 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index

[PATCH 19/23] iommu/dma: Refactor iommu_dma_mmap

2019-05-23 Thread Christoph Hellwig
Inline __iommu_dma_mmap and __iommu_dma_mmap_pfn into the main function, and use the fact that __iommu_dma_get_pages returns NULL for remapped contiguous allocations to simplify the code flow a bit. Signed-off-by: Christoph Hellwig Reviewed-by: Robin Murphy --- drivers/iommu/dma-iommu.c | 60

[PATCH 22/23] arm64: switch copyright boilerplace to SPDX in dma-mapping.c

2019-05-23 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig Acked-by: Robin Murphy Reviewed-by: Mukesh Ojha Acked-by: Catalin Marinas --- arch/arm64/mm/dma-mapping.c | 15 +-- 1 file changed, 1 insertion(+), 14 deletions(-) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index

[PATCH 08/23] iommu/dma: Factor out remapped pages lookup

2019-05-23 Thread Christoph Hellwig
From: Robin Murphy Since we duplicate the find_vm_area() logic a few times in places where we only care about the pages, factor out a helper to abstract it. Signed-off-by: Robin Murphy [hch: don't warn when not finding a region, as we'll rely on that later] Signed-off-by: Christoph Hellwig

[PATCH 16/23] iommu/dma: Cleanup variable naming in iommu_dma_alloc

2019-05-23 Thread Christoph Hellwig
From: Robin Murphy Most importantly clear up the size / iosize confusion. Also rename addr to cpu_addr to match the surrounding code and make the intention a little more clear. Signed-off-by: Robin Murphy [hch: split from a larger patch] Signed-off-by: Christoph Hellwig ---

[PATCH 15/23] iommu/dma: Split iommu_dma_free

2019-05-23 Thread Christoph Hellwig
From: Robin Murphy Most of it can double up to serve the failure cleanup path for iommu_dma_alloc(). Signed-off-by: Robin Murphy --- drivers/iommu/dma-iommu.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c

[PATCH 23/23] arm64: trim includes in dma-mapping.c

2019-05-23 Thread Christoph Hellwig
With most of the previous functionality now elsewhere a lot of the headers included in this file are not needed. Signed-off-by: Christoph Hellwig Acked-by: Catalin Marinas --- arch/arm64/mm/dma-mapping.c | 10 -- 1 file changed, 10 deletions(-) diff --git a/arch/arm64/mm/dma-mapping.c

[PATCH 18/23] iommu/dma: Refactor iommu_dma_get_sgtable

2019-05-23 Thread Christoph Hellwig
Inline __iommu_dma_get_sgtable_page into the main function, and use the fact that __iommu_dma_get_pages returns NULL for remapped contiguous allocations to simplify the code flow a bit. Signed-off-by: Christoph Hellwig Reviewed-by: Robin Murphy --- drivers/iommu/dma-iommu.c | 45

[PATCH 09/23] iommu/dma: Refactor the page array remapping allocator

2019-05-23 Thread Christoph Hellwig
Move the call to dma_common_pages_remap into __iommu_dma_alloc and rename it to iommu_dma_alloc_remap. This creates a self-contained helper for remapped pages allocation and mapping. Signed-off-by: Christoph Hellwig Reviewed-by: Robin Murphy --- drivers/iommu/dma-iommu.c | 54

[PATCH 10/23] iommu/dma: Remove __iommu_dma_free

2019-05-23 Thread Christoph Hellwig
We only have a single caller of this function left, so open code it there. Signed-off-by: Christoph Hellwig Reviewed-by: Robin Murphy --- drivers/iommu/dma-iommu.c | 21 ++--- 1 file changed, 2 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/dma-iommu.c

[PATCH 12/23] iommu/dma: Refactor iommu_dma_alloc

2019-05-23 Thread Christoph Hellwig
From: Robin Murphy Shuffle around the self-contained atomic and non-contiguous cases to return early and get out of the way of the CMA case that we're about to work on next. Signed-off-by: Robin Murphy [hch: slight changes to the code flow] Signed-off-by: Christoph Hellwig ---

[PATCH 11/23] iommu/dma: Refactor iommu_dma_free

2019-05-23 Thread Christoph Hellwig
From: Robin Murphy The freeing logic was made particularly horrible by part of it being opaque to the arch wrapper, which led to a lot of convoluted repetition to ensure each path did everything in the right order. Now that it's all private, we can pick apart and consolidate the

[PATCH 17/23] iommu/dma: Refactor iommu_dma_alloc, part 2

2019-05-23 Thread Christoph Hellwig
All the logic in iommu_dma_alloc that deals with page allocation from the CMA or page allocators can be split into a self-contained helper, and we can then map the result of that or the atomic pool allocation with the iommu later. This also allows reusing __iommu_dma_free to tear down the

[PATCH 13/23] iommu/dma: Don't remap CMA unnecessarily

2019-05-23 Thread Christoph Hellwig
From: Robin Murphy Always remapping CMA allocations was largely a bodge to keep the freeing logic manageable when it was split between here and an arch wrapper. Now that it's all together and streamlined, we can relax that limitation. Signed-off-by: Robin Murphy Signed-off-by: Christoph

[PATCH 14/23] iommu/dma: Merge the CMA and alloc_pages allocation paths

2019-05-23 Thread Christoph Hellwig
Instead of having a separate code path for the non-blocking alloc_pages and CMA allocations paths merge them into one. There is a slight behavior change here in that we try the page allocator if CMA fails. This matches what dma-direct and other iommu drivers do and will be needed to use the

[PATCH 07/23] iommu/dma: Squash __iommu_dma_{map,unmap}_page helpers

2019-05-23 Thread Christoph Hellwig
From: Robin Murphy The remaining internal callsites don't care about having prototypes compatible with the relevant dma_map_ops callbacks, so the extra level of indirection just wastes space and complicates things. Signed-off-by: Robin Murphy Signed-off-by: Christoph Hellwig ---

implement generic dma_map_ops for IOMMUs v6

2019-05-23 Thread Christoph Hellwig
Hi Robin and Joerg, I think we are finally ready for the generic dma-iommu series. I have various DMA API changes pending, and Tom has patches ready to convert the AMD and Intel iommu drivers over to it. I'd love to have this in a stable branch shared between the dma-mapping and iommu trees

[PATCH 02/23] iommu/dma: Remove the flush_page callback

2019-05-23 Thread Christoph Hellwig
We now have a arch_dma_prep_coherent architecture hook that is used for the generic DMA remap allocator, and we should use the same interface for the dma-iommu code. Signed-off-by: Christoph Hellwig Reviewed-by: Robin Murphy Acked-by: Catalin Marinas --- arch/arm64/mm/dma-mapping.c | 8

[PATCH 05/23] iommu/dma: Move __iommu_dma_map

2019-05-23 Thread Christoph Hellwig
Moving this function up to its unmap counterpart helps to keep related code together for the following changes. Signed-off-by: Christoph Hellwig Reviewed-by: Robin Murphy --- drivers/iommu/dma-iommu.c | 46 +++ 1 file changed, 23 insertions(+), 23

[PATCH 03/23] iommu/dma: Use for_each_sg in iommu_dma_alloc

2019-05-23 Thread Christoph Hellwig
arch_dma_prep_coherent can handle physically contiguous ranges larger than PAGE_SIZE just fine, which means we don't need a page-based iterator. Signed-off-by: Christoph Hellwig Reviewed-by: Robin Murphy --- drivers/iommu/dma-iommu.c | 14 +- 1 file changed, 5 insertions(+), 9

[PATCH 04/23] iommu/dma: move the arm64 wrappers to common code

2019-05-23 Thread Christoph Hellwig
There is nothing really arm64 specific in the iommu_dma_ops implementation, so move it to dma-iommu.c and keep a lot of symbols self-contained. Note the implementation does depend on the DMA_DIRECT_REMAP infrastructure for now, so we'll have to make the DMA_IOMMU support depend on it, but this

[PATCH 01/23] iommu/dma: Cleanup dma-iommu.h

2019-05-23 Thread Christoph Hellwig
No need for a __KERNEL__ guard outside uapi and add a missing comment describing the #else cpp statement. Last but not least include instead of the asm version, which is frowned upon. Signed-off-by: Christoph Hellwig Reviewed-by: Robin Murphy --- include/linux/dma-iommu.h | 6 ++ 1 file