Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator

2018-10-22 Thread Lu Baolu

Hi,

On 10/22/18 6:22 PM, Raj, Ashok wrote:

On Mon, Oct 22, 2018 at 12:49:47PM +0800, Lu Baolu wrote:

Hi,

On 10/20/18 2:11 AM, Jean-Philippe Brucker wrote:

Some devices might support multiple DMA address spaces, in particular
those that have the PCI PASID feature. PASID (Process Address Space ID)
allows sharing process address spaces with devices (SVA), partitioning a
device into VM-assignable entities (VFIO mdev), or simply providing
multiple DMA address spaces to kernel drivers. Add a global PASID
allocator usable by different drivers at the same time. Name it I/O ASID
to avoid confusion with ASIDs allocated by arch code, which are usually
a separate ID space.

The IOASID space is global. Each device can have its own PASID space,
but by convention the IOMMU ended up having a global PASID space, so
that with SVA, each mm_struct is associated with a single PASID.

The allocator doesn't really belong in drivers/iommu because some
drivers would like to allocate PASIDs for devices that aren't managed by
an IOMMU, using the same ID space as the IOMMU. It doesn't really belong
in drivers/pci either since platform devices also support PASID. Add the
allocator in drivers/base.
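
A minimal sketch of what such a global allocator could look like, built
on the kernel's IDA; the ioasid_alloc()/ioasid_free() names echo the
patch subject, but the bodies below are illustrative rather than the
posted code:

#include <linux/idr.h>

static DEFINE_IDA(ioasid_ida);

/* One system-wide namespace, so that a single PASID can represent one
 * mm_struct across all devices (SVA). */
int ioasid_alloc(unsigned int min, unsigned int max)
{
	/* Returns an ID in [min, max], or a negative errno. */
	return ida_alloc_range(&ioasid_ida, min, max, GFP_KERNEL);
}

void ioasid_free(unsigned int ioasid)
{
	ida_free(&ioasid_ida, ioasid);
}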


One concern with moving the PASID allocator here is paravirtual
allocation of PASIDs.

Since there is only a single set of PASID tables, which is controlled by


Minor correction: Single system-wide PASID namespace, but PASID tables
would ideally be created per-BDF for isolation purposes.

I'm sure you meant namespace, but didn't want that to be misinterpreted.


Yes, confirmed.

Best regards,
Lu Baolu

the host, the PASID is a system-wide resource. When a driver running in
a guest VM wants to consume a PASID, the request must be intercepted by
the simulation software, which routes the allocation to the host via
VFIO. Some IOMMU architectures provide mechanisms to aid this, for
example the virtual command interfaces defined in VT-d 3.0. Any PASID
used in a guest VM should go through the virtual command interfaces.



Cheers,
Ashok




Re: [PATCH 1/1] iommu/arm-smmu: Add support to use Last level cache

2018-10-22 Thread Tomasz Figa
Hi Vivek,

On Fri, Jun 15, 2018 at 7:53 PM Vivek Gautam wrote:
>
> Qualcomm SoCs have an additional level of cache called the
> System cache, or Last level cache[1]. This cache sits right
> before the DDR and is tightly coupled with the memory
> controller.
> The cache is available to all the clients present in the
> SoC system. The clients request their slices from this system
> cache, make them active, and can then start using them. For
> those clients behind an SMMU to start using the system cache for
> dma buffers and related page tables [2], a few of the memory
> attributes need to be set accordingly.
> This change makes the related memory Outer-Shareable, and
> updates the MAIR with the necessary protection.
>
> The MAIR attribute requirements are:
> Inner Cacheability = 0
> Outer Cacheability = 1, Write-Back Write-Allocate
> Outer Shareability = 1
>
> This change is a realisation of the following changes
> from downstream msm-4.9:
> iommu: io-pgtable-arm: Support DOMAIN_ATTRIBUTE_USE_UPSTREAM_HINT
> iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT

Would you be able to provide links to those 2 downstream changes?

>
> [1] https://patchwork.kernel.org/patch/10422531/
> [2] https://patchwork.kernel.org/patch/10302791/
>
> Signed-off-by: Vivek Gautam 
> ---
>  drivers/iommu/arm-smmu.c   | 14 ++
>  drivers/iommu/io-pgtable-arm.c | 24 +++-
>  drivers/iommu/io-pgtable.h |  4 
>  include/linux/iommu.h  |  4 
>  4 files changed, 41 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index f7a96bcf94a6..8058e7205034 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -249,6 +249,7 @@ struct arm_smmu_domain {
> struct mutex init_mutex; /* Protects smmu pointer */
> spinlock_t cb_lock; /* Serialises ATS1* ops and TLB syncs */
> struct iommu_domain domain;
> +   bool has_sys_cache;
>  };
>
>  struct arm_smmu_option_prop {
> @@ -862,6 +863,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>
> if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
> pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
> +   if (smmu_domain->has_sys_cache)
> +   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
>
> smmu_domain->smmu = smmu;
> pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> @@ -1477,6 +1480,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> case DOMAIN_ATTR_NESTING:
> *(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
> return 0;
> +   case DOMAIN_ATTR_USE_SYS_CACHE:
> +   *((int *)data) = smmu_domain->has_sys_cache;
> +   return 0;
> default:
> return -ENODEV;
> }
> @@ -1506,6 +1512,14 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
> smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>
> break;
> +   case DOMAIN_ATTR_USE_SYS_CACHE:
> +   if (smmu_domain->smmu) {
> +   ret = -EPERM;
> +   goto out_unlock;
> +   }
> +   if (*((int *)data))
> +   smmu_domain->has_sys_cache = true;
> +   break;
> default:
> ret = -ENODEV;
> }
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 010a254305dd..b2aee1828524 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -169,9 +169,11 @@
>  #define ARM_LPAE_MAIR_ATTR_DEVICE  0x04
>  #define ARM_LPAE_MAIR_ATTR_NC  0x44
>  #define ARM_LPAE_MAIR_ATTR_WBRWA   0xff
> +#define ARM_LPAE_MAIR_ATTR_SYS_CACHE   0xf4
>  #define ARM_LPAE_MAIR_ATTR_IDX_NC  0
>  #define ARM_LPAE_MAIR_ATTR_IDX_CACHE   1
>  #define ARM_LPAE_MAIR_ATTR_IDX_DEV 2
> +#define ARM_LPAE_MAIR_ATTR_IDX_SYS_CACHE   3
>
>  /* IOPTE accessors */
>  #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
> @@ -442,6 +444,10 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
> else if (prot & IOMMU_CACHE)
> pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
> << ARM_LPAE_PTE_ATTRINDX_SHIFT);
> +   else if (prot & IOMMU_SYS_CACHE)
> +   pte |= (ARM_LPAE_MAIR_ATTR_IDX_SYS_CACHE
> +   << ARM_LPAE_PTE_ATTRINDX_SHIFT);
> +

Okay, so we favor the full caching (IC WBRWA, OC WBRWA, OS) first if
requested, or otherwise try to use the system cache (IC NC, OC WBWA?, OS)?
Sounds fine.
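
For reference, the 0xf4 byte added as ARM_LPAE_MAIR_ATTR_SYS_CACHE
decomposes per the ARMv8 MAIR_ELx attribute format, with outer
attributes in bits [7:4] and inner attributes in bits [3:0]; the macro
names below are made up for illustration:

/* Outer Write-Back, Read/Write-Allocate; Inner Non-Cacheable */
#define MAIR_OUTER_WBRWA	0xf0
#define MAIR_INNER_NC		0x04
/* 0xf0 | 0x04 == 0xf4, i.e. ARM_LPAE_MAIR_ATTR_SYS_CACHE above */
#define MAIR_ATTR_SYS_CACHE	(MAIR_OUTER_WBRWA | MAIR_INNER_NC)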

nit: Unnecessary blank line.

> } else {
> pte = ARM_LPAE_PTE_HAP_FAULT;
> if (prot & IOMMU_READ)
> @@ -771,7 +777,8 @@ arm_64_l

Re: [PATCH] kernel/dma: Fix panic caused by passing swiotlb to command line

2018-10-22 Thread Konrad Rzeszutek Wilk
On Sat, Sep 22, 2018 at 08:56:58PM +0800, He Zhe wrote:
> May I have your input?

Alternatively would it make more sense for it to assume some default
value?
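
Something along these lines, say; a sketch only, which treats a bare
"swiotlb" as a request for the default size rather than an error (the
real parser also handles the "force"/"noforce" suffixes, omitted here):

static int __init
setup_io_tlb_npages(char *str)
{
	if (!str)		/* "swiotlb" given with no value */
		return 0;	/* leave io_tlb_nslabs 0 => default size */

	if (isdigit(*str)) {
		io_tlb_nslabs = simple_strtoul(str, &str, 0);
		/* avoid tail segment of size < IO_TLB_SEGSIZE */
		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
	}
	return 0;
}
early_param("swiotlb", setup_io_tlb_npages);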
> 
> Thanks,
> Zhe
> 
> On 2018-09-17 11:27, zhe...@windriver.com wrote:
> > From: He Zhe 
> >
> > setup_io_tlb_npages does not check input argument before passing it
> > to isdigit. The argument would be a NULL pointer if "swiotlb", without
> > its value, is set in command line and thus causes the following panic.
> >
> > PANIC: early exception 0xe3 IP 10:bb9b8e9f error 0 cr2 0x0
> > [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc3-yocto-standard+ #9
> > [0.00] RIP: 0010:setup_io_tlb_npages+0xf/0x95
> > ...
> > [0.00] Call Trace:
> > [0.00]  do_early_param+0x57/0x8e
> > [0.00]  parse_args+0x208/0x320
> > [0.00]  ? rdinit_setup+0x30/0x30
> > [0.00]  parse_early_options+0x29/0x2d
> > [0.00]  ? rdinit_setup+0x30/0x30
> > [0.00]  parse_early_param+0x36/0x4d
> > [0.00]  setup_arch+0x336/0x99e
> > [0.00]  start_kernel+0x6f/0x4e6
> > [0.00]  x86_64_start_reservations+0x24/0x26
> > [0.00]  x86_64_start_kernel+0x6f/0x72
> > [0.00]  secondary_startup_64+0xa4/0xb0
> >
> > This patch adds a check to prevent the panic.
> >
> > Signed-off-by: He Zhe 
> > Cc: sta...@vger.kernel.org
> > Cc: konrad.w...@oracle.com
> > Cc: h...@lst.de
> > Cc: m.szyprow...@samsung.com
> > Cc: robin.mur...@arm.com
> > ---
> >  kernel/dma/swiotlb.c | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > index 4f8a6db..46fc34e 100644
> > --- a/kernel/dma/swiotlb.c
> > +++ b/kernel/dma/swiotlb.c
> > @@ -109,6 +109,11 @@ static int late_alloc;
> >  static int __init
> >  setup_io_tlb_npages(char *str)
> >  {
> > +   if (!str) {
> > +   pr_err("Config string not provided\n");
> > +   return -EINVAL;
> > +   }
> > +
> > if (isdigit(*str)) {
> > io_tlb_nslabs = simple_strtoul(str, &str, 0);
> > /* avoid tail segment of size < IO_TLB_SEGSIZE */
> 

Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Jean-Philippe Brucker
On 22/10/2018 12:50, Robin Murphy wrote:
> To me, that sounds like a very good argument for having separate "attach
> as primary domain" and "attach as aux domain" APIs. Say a driver wants
> to use multiple aux domains itself to isolate logically-separate
> transaction streams, but still also needs to control the main domain for
> transactions without PASID (interrupts?)

Yes, MSIs don't have a PASID (well, shouldn't). So in Arm systems, where
the address of the IRQ chip is translated by the SMMU, they are mapped
in the main domain.

Thanks,
Jean


Re: [PATCH 10/10] arm64: use the generic swiotlb_dma_ops

2018-10-22 Thread Robin Murphy

On 08/10/2018 09:02, Christoph Hellwig wrote:

Now that the generic swiotlb code supports non-coherent DMA we can switch
to it for arm64.  For that we need to refactor the existing
alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
and implement the standard arch_sync_dma_for_{device,cpu} hooks for
cache maintenance in the streaming dma hooks, which also implies
using the generic dma_coherent flag in struct device.

Note that we need to keep the old is_device_dma_coherent function around
for now, so that the shared arm/arm64 Xen code keeps working.
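
In outline, the two standard hooks end up as thin wrappers around the
existing arm64 cache maintenance routines; a simplified sketch of what
the patch implements:

void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
			      size_t size, enum dma_data_direction dir)
{
	/* clean (or clean+invalidate) the CPU cache before device DMA */
	__dma_map_area(phys_to_virt(paddr), size, dir);
}

void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
			   size_t size, enum dma_data_direction dir)
{
	/* invalidate stale lines before the CPU reads DMA'd data */
	__dma_unmap_area(phys_to_virt(paddr), size, dir);
}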

Signed-off-by: Christoph Hellwig 
---
  arch/arm64/Kconfig   |   4 +
  arch/arm64/include/asm/device.h  |   1 -
  arch/arm64/include/asm/dma-mapping.h |   7 +-
  arch/arm64/mm/dma-mapping.c  | 257 +--
  4 files changed, 56 insertions(+), 213 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1b1a0e95c751..c4db5131d837 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -11,6 +11,8 @@ config ARM64
select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_DEBUG_VIRTUAL
select ARCH_HAS_DEVMEM_IS_ALLOWED
+   select ARCH_HAS_DMA_COHERENT_TO_PFN
+   select ARCH_HAS_DMA_MMAP_PGPROT
select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_FAST_MULTIPLIER
@@ -24,6 +26,8 @@ config ARM64
select ARCH_HAS_SG_CHAIN
select ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_HAS_STRICT_MODULE_RWX
+   select ARCH_HAS_SYNC_DMA_FOR_DEVICE
+   select ARCH_HAS_SYNC_DMA_FOR_CPU
select ARCH_HAS_SYSCALL_WRAPPER
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index 5a5fa47a6b18..3dd3d664c5c5 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -23,7 +23,6 @@ struct dev_archdata {
  #ifdef CONFIG_XEN
const struct dma_map_ops *dev_dma_ops;
  #endif
-   bool dma_coherent;
  };
  
  struct pdev_archdata {

diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index b7847eb8a7bb..c41f3fb1446c 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -44,10 +44,13 @@ void arch_teardown_dma_ops(struct device *dev);
  #define arch_teardown_dma_ops arch_teardown_dma_ops
  #endif
  
-/* do not use this function in a driver */
+/*
+ * Do not use this function in a driver, it is only provided for
+ * arch/arm/mm/xen.c, which is used by arm64 as well.
+ */
  static inline bool is_device_dma_coherent(struct device *dev)
  {
-   return dev->archdata.dma_coherent;
+   return dev->dma_coherent;
  }
  
  #endif	/* __KERNEL__ */

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index eee6cfcfde9e..3c75d69b54e7 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -25,6 +25,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -32,16 +33,6 @@
  
  #include 
  
-static int swiotlb __ro_after_init;
-
-static pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot,
-bool coherent)
-{
-   if (!coherent || (attrs & DMA_ATTR_WRITE_COMBINE))
-   return pgprot_writecombine(prot);
-   return prot;
-}
-
  static struct gen_pool *atomic_pool __ro_after_init;
  
  #define DEFAULT_DMA_COHERENT_POOL_SIZE  SZ_256K

@@ -91,18 +82,16 @@ static int __free_from_pool(void *start, size_t size)
return 1;
  }
  
-static void *__dma_alloc(struct device *dev, size_t size,
-			 dma_addr_t *dma_handle, gfp_t flags,
-			 unsigned long attrs)
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
+   gfp_t flags, unsigned long attrs)
  {
struct page *page;
void *ptr, *coherent_ptr;
-   bool coherent = is_device_dma_coherent(dev);
-   pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, false);
+   pgprot_t prot = pgprot_writecombine(PAGE_KERNEL);
  
  	size = PAGE_ALIGN(size);
  
-	if (!coherent && !gfpflags_allow_blocking(flags)) {
+   if (!gfpflags_allow_blocking(flags)) {
struct page *page = NULL;
void *addr = __alloc_from_pool(size, &page, flags);
  
@@ -116,10 +105,6 @@ static void *__dma_alloc(struct device *dev, size_t size,

if (!ptr)
goto no_mem;
  
-	/* no need for non-cacheable mapping if coherent */
-   if (coherent)
-   return ptr;
-
/* remove any dirty cache lines on the kernel alias */
__dma_flush_area(ptr, size);
  
@@ -138,125 +123,50 @@ static void *__dma_alloc(struct device *dev, size_t size,
return NULL;
  }
  
-static void __dma_free(struct device *dev, size_t size,
-  void

Re: [PATCH 09/10] swiotlb: add support for non-coherent DMA

2018-10-22 Thread Robin Murphy

On 08/10/2018 09:02, Christoph Hellwig wrote:

Handle architectures that are not cache coherent directly in the main
swiotlb code by calling arch_sync_dma_for_{device,cpu} in all the right
places from the various dma_map/unmap/sync methods when the device is
non-coherent.

Because swiotlb now uses dma_direct_alloc for the coherent allocation
that side is already taken care of by the dma-direct code calling into
arch_dma_{alloc,free} for devices that are non-coherent.

Signed-off-by: Christoph Hellwig 
---
  kernel/dma/swiotlb.c | 23 +--
  1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 475a41eff3dc..52885b274368 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -21,6 +21,7 @@
  
  #include 

  #include 
+#include 
  #include 
  #include 
  #include 
@@ -677,6 +678,10 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
dma_addr = swiotlb_bounce_page(dev, &phys, size, dir, attrs);
}
  
+	if (!dev_is_dma_coherent(dev) &&
+	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)

Nit: other parts of the file are already using the "!(...)" style rather 
than "(...) == 0".

+   arch_sync_dma_for_device(dev, phys, size, dir);
+
return dma_addr;
  }
  
@@ -696,6 +701,10 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
  
  	BUG_ON(dir == DMA_NONE);
  
+	if (!dev_is_dma_coherent(hwdev) &&
+	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
+   arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
+
if (is_swiotlb_buffer(paddr)) {
swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
return;
@@ -732,15 +741,17 @@ swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
  
  	BUG_ON(dir == DMA_NONE);
  
-	if (is_swiotlb_buffer(paddr)) {
+   if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_CPU)
+   arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
+
+   if (is_swiotlb_buffer(paddr))
swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
-   return;
-   }
  
-	if (dir != DMA_FROM_DEVICE)
-   return;
+   if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_DEVICE)
+   arch_sync_dma_for_device(hwdev, paddr, size, dir);
  
-	dma_mark_clean(phys_to_virt(paddr), size);
+   if (!is_swiotlb_buffer(paddr) && dir == DMA_FROM_DEVICE)
+   dma_mark_clean(phys_to_virt(paddr), size);
  }


All these "if"s end up pretty hard to follow at first glance :(

I had a quick play at moving the cache maintenance here out into the 
callers, which comes out arguably looking perhaps a little cleaner (only 
+1 source line overall, and actually reduces text size by 32 bytes for 
my build), but sadly I can't really see any way of doing the equivalent 
for map/unmap short of duplicating the whole 3-line arch_sync_*() block, 
which just makes for a different readability problem. As you mentioned 
on patch #7, I guess this really is just one of those things which has 
no nice solution, so cosmetics aside,


Reviewed-by: Robin Murphy 

FWIW, below is my "cleanup" attempt (diff on top of the 
swiotlb-noncoherent.3 branch).


Robin.

->8-
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 52885b274368..43ee29969fdd 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -741,23 +741,24 @@ swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,

BUG_ON(dir == DMA_NONE);

-   if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_CPU)
-   arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
-
-   if (is_swiotlb_buffer(paddr))
+   if (is_swiotlb_buffer(paddr)) {
swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
+   return;
+   }

-   if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_DEVICE)
-   arch_sync_dma_for_device(hwdev, paddr, size, dir);
+   if (dir != DMA_FROM_DEVICE)
+   return;

-   if (!is_swiotlb_buffer(paddr) && dir == DMA_FROM_DEVICE)
-   dma_mark_clean(phys_to_virt(paddr), size);
+   dma_mark_clean(phys_to_virt(paddr), size);
 }

 void
 swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr,
size_t size, enum dma_data_direction dir)
 {
+   if (!dev_is_dma_coherent(hwdev))
+   arch_sync_dma_for_cpu(hwdev, dma_to_phys(hwdev, dev_addr),
+ size, dir);
swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU);
 }

@@ -766,6 +767,9 @@ swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
   size_t size, enum dma_data_direction dir)
 {
swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_DEVICE);
+   if (!dev_is_dma_coherent(hwdev))
+   arch_sync_dma_for_device(hwdev, dma_t

Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Jordan Crouse
On Fri, Oct 19, 2018 at 07:11:52PM +0100, Jean-Philippe Brucker wrote:

> (2) Allocate a domain and attach it to the device.
> 
>   dom = iommu_domain_alloc()
>   iommu_attach_device(dom, dev)
> 
> I still have concerns about this part, which are highlighted by the
> messy changes of patch 1. I think it would make more sense to
> introduce new attach/detach_dev_aux() functions instead of reusing
> attach/detach_dev()
> 
> Can we reconsider this and avoid unnecessary complications in IOMMU
> core and drivers? Does the VFIO code greatly benefit from using the
> same attach() function? It could as well use a different one for
> devices in AUXD mode, which the mediating driver could tell by
> adding a flag in mdev_set_iommu_device(), for example.
> 
> And I don't think other users of AUXD would benefit from using the
> same attach() function, since they will know whether they want to be
> using main or auxiliary domain when doing attach().

I second this. For the arm-smmu-v2 multiple-pagetable model we would either need
new API functions or do something along the lines of 

dom = iommu_domain_alloc()
iommu_set_attr(dom, DOMAIN_ATTR_I_WANT_TO_BE_AUX)
iommu_attach_device(dom, dev)

Either that or do some dummy device magic that I haven't started to work
out in my head. Having aux-specific functions just for this step would make
the arm-smmu-v2 version work out much cleaner.
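
For illustration, the aux-specific shape being asked for might be as
simple as the following; hypothetical signatures, no such API exists at
this point:

/* Hypothetical counterparts to iommu_attach_device()/iommu_detach_device(),
 * making the auxiliary intent explicit at the call site. */
int iommu_aux_attach_device(struct iommu_domain *domain, struct device *dev);
void iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev);

/* The sequence above would then become: */
dom = iommu_domain_alloc(dev->bus);
ret = iommu_aux_attach_device(dom, dev);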

Jordan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Jean-Philippe Brucker
On 22/10/2018 11:07, Raj, Ashok wrote:
>> For my own convenience I've been using the SVA infrastructure since
>> I already had the locking and IOMMU ops in place. The proposed
>> interface is also missing min_pasid and max_pasid parameters, which
>> could be needed by device drivers to enforce PASID limits.
> 
> Variable per-device PASID limits are hard to manage. I suppose we can play
> some games to get this right, but I suspect that won't be very useful in
> the long run.

Devices may have PASID limits that are not discoverable with the PCI
PASID capability (https://patchwork.kernel.org/patch/9989307/#21389571).
Even if we simply reject devices that don't support enough PASIDs, I
think it's still better to return an error on bind or init_device than
to return a valid PASID that they can't use.

> #1: Given that PASID is a system-wide resource, during boot the iommu driver
> needs to scan and set the max PASID to be no more than
> min(iommu_supported_max_pasid) of all available IOMMUs.
> 
> #2: In addition, if any device supports less than the chosen system max PASID
> we should simply disable PASID on that device.
> 
> The reason is that if a process were to bind to 2 or more accelerators, and
> the second device has a limit smaller than the one the application has
> already attached with, the second attempt to bind would fail. (Because
> we would use the same PASID for a single process.)

But that's not reason enough to completely disable PASID for the device;
it could be the only one bound to that process, or PASID could be used
only privately by the host device driver.

> In addition, for Intel IOMMUs PASID is a system-wide resource. We have
> a virtual capability for vIOMMUs to allocate PASIDs. At the time of
> allocation we don't know what device this PASID is even being allocated
> for.

Ah, I had missed that part; I thought the PV allocation had the device
ID as well. That's a significant change.

I was still hoping that we could go back to per-device PASID spaces in
the host, since I still haven't seen any convincing argument in favor of
system-wide PASID. Get rid of #1 in order to solve #2, and as a bonus
support more PASIDs in total. Until now PASID was a per-device resource
except for choices made when writing the IOMMU driver.

> Certainly we could add other intelligence to pass a hint for
> max_pasid in the virtual interface.
> 
> I suppose when this becomes real, most serious accelerators will 
> support the full width.

I don't know if that's obvious from the device perspective. If I had to
implement one, I would simply size my PASIDs to the number of contexts
supported by my device (which might be especially small in the embedded
space), given that technically nothing prevents software from supporting
that and the specification allows me to choose a width.

This was the intended model for PCI, but thankfully version 4.0 added an
implementation note (6.20.2.1 - PASID Field) advising against this
approach, and to instead support 20 bits for interoperability. Maybe it
will set a trend for non-PCI devices as well.

Thanks,
Jean


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Jordan Crouse
On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote:
> On 22/10/2018 07:53, Tian, Kevin wrote:
> >>From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
> >>Sent: Saturday, October 20, 2018 2:12 AM
> >>
> >>This is a first prototype adding auxiliary domain support to Arm SMMUv3,
> >>following Lu Baolu's latest proposal for IOMMU aware mediated devices
> >>[1]. It works, but the attach() API still doesn't feel right. See (2)
> >>below.
> >>
> >>Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
> >>Patches 2-4 rework the PASID allocator to make it usable for SVA and
> >>AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.
> >>
> >>
> >>When a device can have multiple address spaces, for instance with PCI
> >>PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
> >>address space. I distinguish auxiliary from "main" domain, which
> >>represents the non-PASID address space but also (at least for SMMUv3)
> >>the whole device context, PASID tables etc.
> >
> >I'd like to clarify a bit. :-)
> >
> >a domain can always represent one or more address spaces, where an
> >address space could be for IOVA or GPA and/or other address spaces for
> >SVA. Address spaces may be chained together in so-called nested mode.
> >
> >a domain can be associated with a full device (at BDF granularity), and/or
> >with a partition of a device (say at PASID granularity). In the former case,
> >the domain becomes the master (or primary) domain of the device. In
> >the latter case, the domain becomes the auxiliary domain of the device.
> >
> >vendor domain structure may include vendor-specific states which
> >are applied to device context at attach time, e.g. PASID table (SMMUv3)
> >if representing as master domain, or PASID value (VT-d scalable mode)
> >if representing as auxiliary domain.
> >
> >so the domain definition is never changed with the introduction of
> >AUXD. 'auxiliary' is a per-device attribute which takes effect at
> >domain attach time. It is perfectly sane for a domain to serve as a
> >master domain to deviceA and an auxiliary domain to deviceB at the
> >same time (say when deviceA and an mdev on deviceB are assigned to the
> >same VM), while supporting one or more address spaces.
> 
> To me, that sounds like a very good argument for having separate
> "attach as primary domain" and "attach as aux domain" APIs. Say a
> driver wants to use multiple aux domains itself to isolate
> logically-separate transaction streams, but still also needs to
> control the main domain for transactions without PASID (interrupts?)
> - having to juggle some invisible device state around multiple
> iommu_{attach,detach}_group() calls which look identical but are
> expected to behave differently at runtime sounds like a recipe for
> unmaintainable code.
> 
> I don't think it's necessarily safe to assume that
> one-struct-device-per-PASID will be the only usage model.

This feels exactly like the QCOM GPU model - multiple aux domains
for individual process pagetables but the main domain for TTBR1
(global) allocations.

Jordan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v5 2/2] iommu/arm-smmu-v3: Reunify arm_smmu_cmdq_issue_sync()

2018-10-22 Thread John Garry

On 17/10/2018 14:56, Robin Murphy wrote:

Now that both sync methods are more or less the same shape, we can save
some code and levels of indirection by rolling them up together again,
with just a couple of simple conditionals to discriminate the MSI and
queue-polling specifics.

Signed-off-by: Robin Murphy 


Tested-by: John Garry 

I seem to be getting some boost in the scenarios I tested:
Storage controller: 746K IOPS (with) vs 730K (without)
NVMe disk:  471K IOPS (with) vs 420K IOPS (without)

Note that this is with strict mode set and without the CMD_SYNC 
optimisation I punted. And this is on D05, so no MSI-based CMD_SYNC support.


Thanks,
John

PS: for anyone testing, use the v1 SMMU "Fix Endian" patch to get this
series to apply.



---
 drivers/iommu/arm-smmu-v3.c | 49 +
 1 file changed, 12 insertions(+), 37 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index da8a91d116bf..36db63e3afcf 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -933,7 +933,7 @@ static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu,
  * The difference between val and sync_idx is bounded by the maximum size of
  * a queue at 2^20 entries, so 32 bits is plenty for wrap-safe arithmetic.
  */
-static int __arm_smmu_sync_poll_msi(struct arm_smmu_device *smmu, u32 sync_idx)
+static int arm_smmu_sync_poll_msi(struct arm_smmu_device *smmu, u32 sync_idx)
 {
ktime_t timeout;
u32 val;
@@ -988,16 +988,17 @@ static int arm_smmu_sync_poll_cons(struct arm_smmu_device *smmu, u32 sync_idx,
return -ETIMEDOUT;
 }

-static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
+static void arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu)
 {
u64 cmd[CMDQ_ENT_DWORDS];
unsigned long flags;
-   struct arm_smmu_cmdq_ent ent = {
-   .opcode = CMDQ_OP_CMD_SYNC,
-   .sync   = {
-   .msiaddr = cpu_to_le32(virt_to_phys(&smmu->sync_count)),
-   },
-   };
+   bool msi = (smmu->features & ARM_SMMU_FEAT_MSI) &&
+  (smmu->features & ARM_SMMU_FEAT_COHERENCY);
+   struct arm_smmu_cmdq_ent ent = { .opcode = CMDQ_OP_CMD_SYNC };
+   int ret, sync_idx, sync_gen;
+
+   if (msi)
+   ent.sync.msiaddr = cpu_to_le32(virt_to_phys(&smmu->sync_count));

spin_lock_irqsave(&smmu->cmdq.lock, flags);

@@ -1009,39 +1010,13 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
arm_smmu_cmdq_build_cmd(cmd, &ent);
arm_smmu_cmdq_insert_cmd(smmu, cmd);
}
-
-   spin_unlock_irqrestore(&smmu->cmdq.lock, flags);
-
-   return __arm_smmu_sync_poll_msi(smmu, ent.sync.msidata);
-}
-
-static int __arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu)
-{
-   u64 cmd[CMDQ_ENT_DWORDS];
-   unsigned long flags;
-   struct arm_smmu_cmdq_ent ent = { .opcode = CMDQ_OP_CMD_SYNC };
-   int sync_idx, sync_gen;
-
-   arm_smmu_cmdq_build_cmd(cmd, &ent);
-
-   spin_lock_irqsave(&smmu->cmdq.lock, flags);
-   if (smmu->prev_cmd_opcode != CMDQ_OP_CMD_SYNC)
-   arm_smmu_cmdq_insert_cmd(smmu, cmd);
sync_idx = smmu->cmdq.q.prod;
sync_gen = READ_ONCE(smmu->cmdq_generation);
+
spin_unlock_irqrestore(&smmu->cmdq.lock, flags);

-   return arm_smmu_sync_poll_cons(smmu, sync_idx, sync_gen);
-}
-
-static void arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu)
-{
-   int ret;
-   bool msi = (smmu->features & ARM_SMMU_FEAT_MSI) &&
-  (smmu->features & ARM_SMMU_FEAT_COHERENCY);
-
-   ret = msi ? __arm_smmu_cmdq_issue_sync_msi(smmu)
- : __arm_smmu_cmdq_issue_sync(smmu);
+   ret = msi ? arm_smmu_sync_poll_msi(smmu, ent.sync.msidata)
+ : arm_smmu_sync_poll_cons(smmu, sync_idx, sync_gen);
if (ret)
dev_err_ratelimited(smmu->dev, "CMD_SYNC timeout\n");
 }






Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Robin Murphy

On 22/10/2018 07:53, Tian, Kevin wrote:

From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
Sent: Saturday, October 20, 2018 2:12 AM

This is a first prototype adding auxiliary domain support to Arm SMMUv3,
following Lu Baolu's latest proposal for IOMMU aware mediated devices
[1]. It works, but the attach() API still doesn't feel right. See (2)
below.

Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
Patches 2-4 rework the PASID allocator to make it usable for SVA and
AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.


When a device can have multiple address spaces, for instance with PCI
PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
address space. I distinguish auxiliary from "main" domain, which
represents the non-PASID address space but also (at least for SMMUv3)
the whole device context, PASID tables etc.


I'd like to clarify a bit. :-)

a domain can always represent one or more address spaces, where an
address space could be for IOVA or GPA and/or other address spaces for
SVA. Address spaces may be chained together in so-called nested mode.

a domain can be associated with a full device (at BDF granularity), and/or
with a partition of a device (say at PASID granularity). In the former case,
the domain becomes the master (or primary) domain of the device. In
the latter case, the domain becomes the auxiliary domain of the device.

vendor domain structure may include vendor-specific states which
are applied to device context at attach time, e.g. PASID table (SMMUv3)
if representing as master domain, or PASID value (VT-d scalable mode)
if representing as auxiliary domain.

so the domain definition is never changed with the introduction of
AUXD. 'auxiliary' is a per-device attribute which takes effect at
domain attach time. It is perfectly sane for a domain to serve as a
master domain to deviceA and an auxiliary domain to deviceB at the
same time (say when deviceA and an mdev on deviceB are assigned to the
same VM), while supporting one or more address spaces.


To me, that sounds like a very good argument for having separate "attach 
as primary domain" and "attach as aux domain" APIs. Say a driver wants 
to use multiple aux domains itself to isolate logically-separate 
transaction streams, but still also needs to control the main domain for 
transactions without PASID (interrupts?) - having to juggle some 
invisible device state around multiple iommu_{attach,detach}_group() 
calls which look identical but are expected to behave differently at 
runtime sounds like a recipe for unmaintainable code.


I don't think it's necessarily safe to assume that 
one-struct-device-per-PASID will be the only usage model.


Robin.


Re: [PATCH v3 3/7] PCI: OF: Allow endpoints to bypass the iommu

2018-10-22 Thread Jean-Philippe Brucker
On 17/10/2018 16:14, Michael S. Tsirkin wrote:
>>> Thinking about this comment, I would like to ask: can't the
>>> virtio device indicate the ranges in a portable way?
>>> This would minimize the dependency on dt bindings and ACPI,
>>> enabling support for systems that have neither but do
>>> have virtio e.g. through pci.
>>
>> I thought about adding a PROBE request for this in virtio-iommu, but it
>> wouldn't be usable by a Linux guest because of a bootstrapping problem.
> 
> Hmm. At some level it seems wrong to design hardware interfaces
> around how Linux happens to probe things. That can change at any time
> ...

I suspect that most other OSes will also solve this class of problem using
a standard such as DT or ACPI, because those also describe dependencies for
clocks, interrupts, power management, etc. We can add a self-contained
PROBE method if someone makes a case for it, but it's unlikely to get
used at all, and nearly impossible to implement in Linux. The host would
still need a method to tell the guest which device to probe first, for
example with kernel parameters.

>> Early on, Linux needs a description of device dependencies, to determine
>> in which order to probe them. If the device dependency was described by
>> virtio-iommu itself, the guest could for example initialize a NIC,
>> allocate buffers and start DMA on the physical address space (which aborts
>> if the IOMMU implementation disallows DMA by default), only to find out
>> once the virtio-iommu module is loaded that it needs to cancel all DMA and
>> reconfigure the NIC. With a static description such as iommu-map in DT or
>> ACPI remapping tables, the guest can defer probing of the NIC until the
>> IOMMU is initialized.
>>
>> Thanks,
>> Jean
> 
> Could you point me at the code you refer to here?

In drivers/base/dd.c, really_probe() calls dma_configure() before the
device driver's probe(). dma_configure() ends up calling either
of_dma_configure() or acpi_dma_configure(), which return -EPROBE_DEFER
if the device's IOMMU isn't yet available. In that case the device is
added to the deferred pending list.

After another device is successfully bound to a driver, all devices on
the pending list are retried (driver_deferred_probe_trigger()), and if
the dependency has been resolved, then dma_configure() succeeds.
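
In outline, heavily simplified from drivers/base/dd.c (not the verbatim
code):

static int really_probe(struct device *dev, struct device_driver *drv)
{
	int ret;

	/* Ends up in of_dma_configure()/acpi_dma_configure(); returns
	 * -EPROBE_DEFER while the device's IOMMU is unavailable, which
	 * puts the device on the deferred-probe pending list. */
	ret = dma_configure(dev);
	if (ret)
		return ret;

	/* Only once dma_configure() succeeds does the driver's own
	 * probe() callback run. */
	return drv->probe(dev);
}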

Another method (used by Intel and AMD IOMMU drivers) is to initialize
the IOMMU as early as possible, after discovering it in the ACPI tables
and before probing other devices. This can't work for virtio-iommu
because the driver might be a module, in which case early init isn't
possible. We have to defer probe of all dependent devices until the
virtio and virtio-iommu modules are loaded.

Thanks,
Jean


Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator

2018-10-22 Thread Raj, Ashok
On Mon, Oct 22, 2018 at 12:49:47PM +0800, Lu Baolu wrote:
> Hi,
> 
> On 10/20/18 2:11 AM, Jean-Philippe Brucker wrote:
> > Some devices might support multiple DMA address spaces, in particular
> > those that have the PCI PASID feature. PASID (Process Address Space ID)
> > allows sharing process address spaces with devices (SVA), partitioning a
> > device into VM-assignable entities (VFIO mdev), or simply providing
> > multiple DMA address spaces to kernel drivers. Add a global PASID
> > allocator usable by different drivers at the same time. Name it I/O ASID
> > to avoid confusion with ASIDs allocated by arch code, which are usually
> > a separate ID space.
> > 
> > The IOASID space is global. Each device can have its own PASID space,
> > but by convention the IOMMU ended up having a global PASID space, so
> > that with SVA, each mm_struct is associated with a single PASID.
> > 
> > The allocator doesn't really belong in drivers/iommu because some
> > drivers would like to allocate PASIDs for devices that aren't managed by
> > an IOMMU, using the same ID space as the IOMMU. It doesn't really belong
> > in drivers/pci either since platform devices also support PASID. Add the
> > allocator in drivers/base.
> 
> One concern with moving the PASID allocator here is paravirtual
> allocation of PASIDs.
> 
> Since there is only a single set of PASID tables, which is controlled by

Minor correction: Single system-wide PASID namespace, but PASID tables
would ideally be created per-BDF for isolation purposes.

I'm sure you meant namespace, but didn't want that to be misinterpreted.


> the host, the PASID is a system-wide resource. When a driver running in
> a guest VM wants to consume a PASID, the request must be intercepted by
> the simulation software, which routes the allocation to the host via
> VFIO. Some IOMMU architectures provide mechanisms to aid this, for
> example the virtual command interfaces defined in VT-d 3.0. Any PASID
> used in a guest VM should go through the virtual command interfaces.
> 

Cheers,
Ashok


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Raj, Ashok
On Fri, Oct 19, 2018 at 07:11:52PM +0100, Jean-Philippe Brucker wrote:
> This is a first prototype adding auxiliary domain support to Arm SMMUv3,
> following Lu Baolu's latest proposal for IOMMU aware mediated devices
> [1]. It works, but the attach() API still doesn't feel right. See (2)
> below.
> 
> Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
> Patches 2-4 rework the PASID allocator to make it usable for SVA and
> AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.
> 
> 
> When a device can have multiple address spaces, for instance with PCI
> PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
> address space. I distinguish auxiliary from "main" domain, which
> represents the non-PASID address space but also (at least for SMMUv3)
> the whole device context, PASID tables etc.
> 
> Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by any
> other device driver that wants to use PASID for private address spaces
> (as opposed to SVA [2]). The following API is available to device
> drivers:
> 
> (1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD
> flag on the IOMMU data associated to a device.
> 
> For my own convenience I've been using the SVA infrastructure since
> I already had the locking and IOMMU ops in place. The proposed
> interface is also missing min_pasid and max_pasid parameters, which
> could be needed by device drivers to enforce PASID limits.

Variable per-device PASID limits are hard to manage. I suppose we can play
some games to get this right, but I suspect that won't be very useful in
the long run.

#1: Given that PASID is a system-wide resource, during boot the iommu driver
needs to scan and set the max PASID to be no more than
min(iommu_supported_max_pasid) of all available IOMMUs.

#2: In addition, if any device supports less than the chosen system max PASID
we should simply disable PASID on that device.

The reason is that if a process were to bind to 2 or more accelerators, and
the second device has a limit smaller than the one the application has
already attached with, the second attempt to bind would fail. (Because
we would use the same PASID for a single process.)

In addition, for Intel IOMMUs PASID is a system-wide resource. We have
a virtual capability for vIOMMUs to allocate PASIDs. At the time of
allocation we don't know what device this PASID is even being allocated
for. Certainly we could add other intelligence to pass a hint for
max_pasid in the virtual interface.

I suppose when this becomes real, most serious accelerators will 
support the full width. Hence doing #1 and #2 above should be good
for most cases.
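
A sketch of #1 and #2 taken together; iommu_max_pasid() and
for_each_system_iommu() are placeholders here, while pci_max_pasids() is
the existing helper from drivers/pci/ats.c:

static int system_max_pasid;

static void set_system_max_pasid(void)
{
	struct iommu_unit *iommu;	/* placeholder type */

	system_max_pasid = INT_MAX;
	for_each_system_iommu(iommu)	/* #1: min() over all IOMMUs */
		system_max_pasid = min(system_max_pasid,
				       iommu_max_pasid(iommu));
}

static bool pasid_usable(struct pci_dev *pdev)
{
	/* #2: a device that cannot cover the system-wide range would
	 * fail a second bind for the same process, so don't enable
	 * PASID on it at all. */
	return pci_max_pasids(pdev) >= system_max_pasid;
}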

Cheers,
Ashok


Re: [PATCH] dma-direct: reject highmem pages from dma_alloc_from_contiguous

2018-10-22 Thread Christoph Hellwig
Any comments?

On Sun, Oct 14, 2018 at 08:14:47PM +0200, Christoph Hellwig wrote:
> dma_alloc_from_contiguous can return highmem pages depending on the
> setup, which a plain non-remapping DMA allocator can't handle.  Detect
> this case and try the normal page allocator instead.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  kernel/dma/direct.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 87a6bc2a96c0..46fbaa49125b 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -126,6 +126,18 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
>   if (gfpflags_allow_blocking(gfp)) {
>   page = dma_alloc_from_contiguous(dev, count, page_order,
>gfp & __GFP_NOWARN);
> + if (page && PageHighMem(page)) {
> + /*
> +  * Depending on the cma= arguments and per-arch setup
> +  * dma_alloc_from_contiguous could return highmem
> +  * pages.  Without remapping there is no way to return
> +  * them here, so log an error and fail.
> +  */
> + dev_info(dev, "Ignoring highmem page from CMA.\n");
> + dma_release_from_contiguous(dev, page, count);
> + page = NULL;
> + }
> +
>   if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
>   dma_release_from_contiguous(dev, page, count);
>   page = NULL;
> -- 
> 2.19.1
> 
---end quoted text---


Re: [PATCH] dma-mapping: move the remap helpers to a separate file

2018-10-22 Thread Christoph Hellwig
On Sun, Oct 21, 2018 at 12:41:42AM -0700, Laura Abbott wrote:
>> The dma remap code only really makes sense for not cache coherent
>> architectures, and currently is only used by arm, arm64 and xtensa.
>> Split it out into a separate file with a separate Kconfig symbol.
>>
>> [Laura: you wrote this code back then, do you have a sensible
>>   copyright statement to add, given that the mapping.c statement
>>   obviously does not match your code that was written much later]
>>
>
> Hm, that was written and committed via my codeaurora.org e-mail
> address. Probably best to use "Copyright (c) 2014, The Linux Foundation"
> since that was the standard. If you want,
>
> Acked-by: Laura Abbott 

Thanks!


[PATCH 10/10] x86/swiotlb: Enable swiotlb for > 4GiG RAM on 32-bit kernels

2018-10-22 Thread Shayan Doust
From: Christoph Hellwig 

We already build the swiotlb code for 32-bit kernels with PAE support,
but the code to actually use swiotlb has only been enabled for 64-bit
kernels for an unknown reason.

Before Linux v4.18 we papered over this fact because the networking code,
the SCSI layer and some random block drivers implemented their own
bounce buffering scheme.

[ mingo: Changelog fixes. ]

Fixes: 21e07dba9fb1 ("scsi: reduce use of block bounce buffers")
Fixes: ab74cfebafa3 ("net: remove the PCI_DMA_BUS_IS_PHYS check in illegal_highdma")
Reported-by: Matthew Whitehead 
Signed-off-by: Christoph Hellwig 
Signed-off-by: Thomas Gleixner 
Tested-by: Matthew Whitehead 
Cc: konrad.w...@oracle.com
Cc: iommu@lists.linux-foundation.org
Cc: sta...@vger.kernel.org
Link: https://lkml.kernel.org/r/20181014075208.2715-1-...@lst.de
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/pci-swiotlb.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 661583662430..71c0b01d93b1 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -42,10 +42,8 @@ IOMMU_INIT_FINISH(pci_swiotlb_detect_override,
 int __init pci_swiotlb_detect_4gb(void)
 {
/* don't initialize swiotlb if iommu=off (no_iommu=1) */
-#ifdef CONFIG_X86_64
if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
swiotlb = 1;
-#endif
 
/*
 * If SME is active then swiotlb will be set to 1 so that bounce
-- 
2.17.1



[GIT PULL] first round of dma mapping updates for 4.20

2018-10-22 Thread Christoph Hellwig
Hi Linus, below is the first round of dma-mapping updates for 4.20.

There will be a second PR as some big changes were only applied just
before the end of the merge window, and I want to give them a few
more days in linux-next.

Note that "dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL
declaration" was applied here and later cherry-picked into 4.19, so it
will show up as a duplicate commit.  That also causes an easy-to-solve
merge conflict in kernel/dma/Kconfig (just make sure there are no
duplicate symbols).

There also are some Kconfig conflicts in linux-next for the architecture
Kconfig files, but those are only due to non-related symbols close to
each other, and trivial to resolve.

The following changes since commit d7b686ebf704e3d91925a535a0905ba6be23757c:

  Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux (2018-09-07 17:30:40 -0700)

are available in the Git repository at:

  git://git.infradead.org/users/hch/dma-mapping.git tags/dma-mapping-4.20

for you to fetch changes up to b9fd04262a8abc366f40a9e97598e94591352c26:

  dma-direct: respect DMA_ATTR_NO_WARN (2018-10-09 15:08:46 +0200)


First batch of dma-mapping changes for 4.20:

 - mostly more consolidation of the direct mapping code, including
   converting over hexagon, and merging the coherent and non-coherent
   code into a single dma_map_ops instance (me)
 - cleanups for the dma_configure/dma_unconfigure callchains (me)
 - better handling of dma_masks in odd setups (me, Alexander Duyck)
 - better debugging of passing vmalloc address to the DMA API
   (Stephen Boyd)
 - CMA command line parsing fix (He Zhe)


Alexander Duyck (1):
  dma-direct: fix return value of dma_direct_supported

Christoph Hellwig (24):
  hexagon: remove the sync_single_for_cpu DMA operation
  hexagon: implement the sync_sg_for_device DMA operation
  hexagon: use generic dma_noncoherent_ops
  arm-nommu: don't define arch_teardown_dma_ops
  dma-mapping: remove dma_configure
  dma-mapping: remove dma_deconfigure
  dma-mapping: clear dev->dma_ops in arch_teardown_dma_ops
  dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration
  MIPS: don't select DMA_MAYBE_COHERENT from DMA_PERDEV_COHERENT
  dma-mapping: move the dma_coherent flag to struct device
  dma-mapping: merge direct and noncoherent ops
  dma-mapping: consolidate the dma mmap implementations
  dma-mapping: support non-coherent devices in dma_common_get_sgtable
  Revert "dma-mapping: clear dev->dma_ops in arch_teardown_dma_ops"
  unicore32: remove swiotlb support
  dma-mapping: make the get_required_mask method available unconditionally
  dma-direct: add an explicit dma_direct_get_required_mask
  dma-direct: refine dma_direct_alloc zone selection
  dma-direct: implement complete bus_dma_mask handling
  dma-direct: always allow dma mask <= physiscal memory size
  dma-mapping: move dma_default_get_required_mask under ifdef
  dma-direct: document the zone selection logic
  dma-mapping: translate __GFP_NOFAIL to DMA_ATTR_NO_WARN
  dma-direct: respect DMA_ATTR_NO_WARN

He Zhe (1):
  dma-mapping: fix panic caused by passing empty cma command line argument

Stephen Boyd (1):
  dma-debug: Check for drivers mapping invalid addresses in dma_map_single()

 arch/arc/Kconfig |   4 +-
 arch/arc/mm/dma.c|  41 ++
 arch/arm/include/asm/dma-mapping.h   |   2 +
 arch/arm/mm/dma-mapping-nommu.c  |  11 +-
 arch/c6x/Kconfig |   2 +-
 arch/hexagon/Kconfig |   2 +
 arch/hexagon/include/asm/Kbuild  |   1 +
 arch/hexagon/include/asm/dma-mapping.h   |  40 --
 arch/hexagon/kernel/dma.c| 143 ++--
 arch/ia64/include/asm/dma-mapping.h  |   2 -
 arch/ia64/include/asm/machvec.h  |   7 -
 arch/ia64/include/asm/machvec_init.h |   1 -
 arch/ia64/include/asm/machvec_sn2.h  |   2 -
 arch/ia64/pci/pci.c  |  26 
 arch/ia64/sn/pci/pci_dma.c   |   4 +-
 arch/m68k/Kconfig|   2 +-
 arch/microblaze/Kconfig  |   4 +-
 arch/microblaze/include/asm/pgtable.h|   2 -
 arch/microblaze/kernel/dma.c |  22 ---
 arch/microblaze/mm/consistent.c  |   3 +-
 arch/mips/Kconfig|   7 +-
 arch/mips/include/asm/Kbuild |   1 +
 arch/mips/include/asm/device.h   |  19 ---
 arch/mips/include/asm/dma-coherence.h|   6 +
 arch/mips/include/asm/dma-mapping.h  |   4 +-
 arch/mips/jazz/jazzdma.c |   7 +-
 arch/mips/kernel/setup.c |   2 +-
 arch/mips/mm/c-r4k.c |  17 ++-
 arch/mips/mm/dma-noncoherent.c   |  79