Re: [PATCH V7 01/11] iommu/of: Refactor of_iommu_configure() for error handling
On 27/01/17 18:00, Sricharan wrote:
> Hi Robin,
>
> [..]
>>>> +const struct iommu_ops *of_iommu_configure(struct device *dev,
>>>> +					   struct device_node *master_np)
>>>> +{
>>>> +	const struct iommu_ops *ops;
>>>> +
>>>> +	if (!master_np)
>>>> +		return NULL;
>>>> +
>>>> +	if (dev_is_pci(dev))
>>>> +		ops = of_pci_iommu_init(to_pci_dev(dev), master_np);
>>>
>>> I gave the whole patch set a try on ThunderX. really_probe() is failing
>>> on dma_configure()->of_pci_iommu_init() for each PCI device.
>>
>> When you say "failing", do you mean cleanly, or with a crash? I've
>> managed to hit __of_match_node() dereferencing NULL from
>> of_iommu_xlate() in a horribly complicated chain of events, which I'm
>> trying to figure out now, and I wonder if the two might be related.
>
> Sorry that there is a crash still. __of_match_node() seems to check for
> NULL arguments, so it feels like some invalid pointer was passed in.
> Is there any particular sequence to try for this?

Ah, I did figure it out - it wasn't actually a NULL dereference, but an
unmapped address. It turns out __iommu_of_table is in initdata, so any
driver probing after init, connected to an unprobed IOMMU (in this case
disabled in DT), trips over trying to match the now-freed table. I'm
working on the fix - technically the bug's in my patch (#2) anyway ;)

Robin.

> Regards,
> Sricharan

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
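Robin's diagnosis - a match table living in init memory being consulted after that memory has been freed - can be modelled in plain user-space C. This is an illustrative stand-in only: the names and the "clear the pointer" fix are invented for the sketch, not the kernel's actual `of_device_id`/`__iommu_of_table` machinery or the fix Robin ended up writing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for an of_device_id-style match table. In the kernel,
 * __iommu_of_table lives in .init.data, which is freed once boot-time
 * init completes. */
struct match_entry { const char *compatible; };

static struct match_entry *iommu_table;	/* points into "init" memory */

static void init_tables(void)
{
	iommu_table = malloc(2 * sizeof(*iommu_table));
	iommu_table[0].compatible = "vendor,iommu";
	iommu_table[1].compatible = NULL;	/* sentinel */
}

static void free_initmem(void)
{
	free(iommu_table);	/* after this, matching would be a use-after-free */
	iommu_table = NULL;	/* the sketch's "fix": drop the stale pointer */
}

static const struct match_entry *match_node(const char *compatible)
{
	if (!iommu_table)	/* late probe: table is already gone */
		return NULL;
	for (const struct match_entry *e = iommu_table; e->compatible; e++)
		if (strcmp(e->compatible, compatible) == 0)
			return e;
	return NULL;
}
```

A driver probing before `free_initmem()` matches fine; one probing afterwards must see the table as gone rather than chase a freed pointer - which is exactly the hazard the late-probing ThunderX devices hit.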
RE: [PATCH V7 01/11] iommu/of: Refactor of_iommu_configure() for error handling
Hi Robin,

[..]
>>> +const struct iommu_ops *of_iommu_configure(struct device *dev,
>>> +					   struct device_node *master_np)
>>> +{
>>> +	const struct iommu_ops *ops;
>>> +
>>> +	if (!master_np)
>>> +		return NULL;
>>> +
>>> +	if (dev_is_pci(dev))
>>> +		ops = of_pci_iommu_init(to_pci_dev(dev), master_np);
>>
>> I gave the whole patch set a try on ThunderX. really_probe() is failing
>> on dma_configure()->of_pci_iommu_init() for each PCI device.
>
> When you say "failing", do you mean cleanly, or with a crash? I've
> managed to hit __of_match_node() dereferencing NULL from
> of_iommu_xlate() in a horribly complicated chain of events, which I'm
> trying to figure out now, and I wonder if the two might be related.

Sorry that there is a crash still. __of_match_node() seems to check for
NULL arguments, so it feels like some invalid pointer was passed in.
Is there any particular sequence to try for this?

Regards,
Sricharan
Re: [PATCH v2 1/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS
Hi Geert,

On 27/01/17 15:34, Geert Uytterhoeven wrote:
> Add helpers for allocating physically contiguous DMA buffers to the
> generic IOMMU DMA code. This can be useful when two or more devices
> with different memory requirements are involved in buffer sharing.
>
> The iommu_dma_{alloc,free}_contiguous() functions complement the existing
> iommu_dma_{alloc,free}() functions, and allow architecture-specific code
> to implement support for the DMA_ATTR_FORCE_CONTIGUOUS attribute on
> systems with an IOMMU. As this uses the CMA allocator, setting this
> attribute has a runtime dependency on CONFIG_DMA_CMA.
>
> Note that unlike the existing iommu_dma_alloc() helper,
> iommu_dma_alloc_contiguous() has no callback to flush pages.
> Ensuring the returned region is made visible to a non-coherent device is
> the responsibility of the caller.
>
> Signed-off-by: Geert Uytterhoeven
> ---
> v2:
>   - Provide standalone iommu_dma_{alloc,free}_contiguous() functions, as
>     requested by Robin Murphy,
>   - Simplify operations by getting rid of the page array/scatterlist
>     dance, as the buffer is contiguous,
>   - Move CPU cache management into the caller, which is much simpler with
>     a single contiguous buffer.

Thanks for the rework, that's a lot easier to make sense of! Now, please
don't hate me, but...
> ---
>  drivers/iommu/dma-iommu.c | 72 +++
>  include/linux/dma-iommu.h |  4 +++
>  2 files changed, 76 insertions(+)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 2db0d641cf4505b5..8f8ed4426f9a3a12 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -30,6 +30,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  struct iommu_dma_msi_page {
>  	struct list_head	list;
> @@ -408,6 +409,77 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>  }
>
>  /**
> + * iommu_dma_free_contiguous - Free a buffer allocated by
> + *	iommu_dma_alloc_contiguous()
> + * @dev: Device which owns this buffer
> + * @page: Buffer page pointer as returned by iommu_dma_alloc_contiguous()
> + * @size: Size of buffer in bytes
> + * @handle: DMA address of buffer
> + *
> + * Frees the pages associated with the buffer.
> + */
> +void iommu_dma_free_contiguous(struct device *dev, struct page *page,
> +		size_t size, dma_addr_t *handle)
> +{
> +	__iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle);
> +	dma_release_from_contiguous(dev, page, PAGE_ALIGN(size) >> PAGE_SHIFT);
> +	*handle = DMA_ERROR_CODE;
> +}
> +
> +/**
> + * iommu_dma_alloc_contiguous - Allocate and map a buffer contiguous in IOVA
> + *	and physical space
> + * @dev: Device to allocate memory for. Must be a real device attached to an
> + *	iommu_dma_domain
> + * @size: Size of buffer in bytes
> + * @prot: IOMMU mapping flags
> + * @handle: Out argument for allocated DMA handle
> + *
> + * Return: Buffer page pointer, or NULL on failure.
> + *
> + * Note that unlike iommu_dma_alloc(), it's the caller's responsibility to
> + * ensure the returned region is made visible to the given non-coherent
> + * device.
> + */
> +struct page *iommu_dma_alloc_contiguous(struct device *dev, size_t size,
> +		int prot, dma_addr_t *handle)
> +{
> +	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> +	struct iova_domain *iovad = cookie_iovad(domain);
> +	dma_addr_t dma_addr;
> +	unsigned int count;
> +	struct page *page;
> +	struct iova *iova;
> +	int ret;
> +
> +	*handle = DMA_ERROR_CODE;
> +
> +	size = PAGE_ALIGN(size);
> +	count = size >> PAGE_SHIFT;
> +	page = dma_alloc_from_contiguous(dev, count, get_order(size));
> +	if (!page)
> +		return NULL;
> +
> +	iova = __alloc_iova(domain, size, dev->coherent_dma_mask);
> +	if (!iova)
> +		goto out_free_pages;
> +
> +	size = iova_align(iovad, size);
> +	dma_addr = iova_dma_addr(iovad, iova);
> +	ret = iommu_map(domain, dma_addr, page_to_phys(page), size, prot);
> +	if (ret < 0)
> +		goto out_free_iova;
> +
> +	*handle = dma_addr;
> +	return page;
> +
> +out_free_iova:
> +	__free_iova(iovad, iova);
> +out_free_pages:
> +	dma_release_from_contiguous(dev, page, count);
> +	return NULL;
> +}

...now that I can see it clearly, isn't this more or less just:

	page = dma_alloc_from_contiguous(dev, ...);
	if (page)
		dma_addr = iommu_dma_map_page(dev, page, ...);

? Would it not be even simpler to just make those two calls directly
from the arm64 code?

Robin.

> +
> +/**
>   * iommu_dma_mmap - Map a buffer into provided user VMA
>   * @pages: Array representing buffer from iommu_dma_alloc()
>   * @size: Size of buffer in bytes
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index
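The two-call shape Robin suggests - allocate from CMA, then map the single contiguous region - can be sketched in user-space C with stub allocators. All the `*_stub` functions and constants below are invented stand-ins for the kernel helpers under discussion (the real ones live in drivers/base/dma-contiguous.c and drivers/iommu/dma-iommu.c); this only models the control flow, not the kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;
#define DMA_ERROR_CODE ((dma_addr_t)-1)

struct page { uintptr_t pfn; };

static struct page cma_page = { 0x1234 };
static int cma_available = 1;	/* toggled by the caller to model failure */

static struct page *dma_alloc_from_contiguous_stub(size_t count)
{
	(void)count;
	return cma_available ? &cma_page : NULL;	/* CMA grant or failure */
}

static dma_addr_t iommu_dma_map_page_stub(struct page *page, size_t size)
{
	(void)page; (void)size;
	/* Pretend the IOMMU mapped the physically contiguous region at a
	 * single IOVA; a failure would return DMA_ERROR_CODE instead. */
	return 0x8000u;
}

/* The simplified path: allocate from CMA, then map - no page-array or
 * scatterlist dance in between, because the buffer is contiguous. */
static struct page *alloc_contiguous(size_t size, dma_addr_t *handle)
{
	struct page *page = dma_alloc_from_contiguous_stub(size >> 12);

	*handle = DMA_ERROR_CODE;
	if (!page)
		return NULL;
	*handle = iommu_dma_map_page_stub(page, size);
	return page;
}
```

The point of the restructuring is visible here: once the buffer is a single contiguous region, the whole helper collapses to an allocation followed by one mapping call, which is why Robin asks whether the arm64 caller could just make those two calls itself.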
Re: [PATCH/RFC] iommu/dma: Per-domain flag to control size-alignment
Hi Magnus,

On 27/01/17 06:24, Magnus Damm wrote:
> From: Magnus Damm
>
> Introduce the flag "no_size_align" to allow disabling size-alignment
> on a per-domain basis. This follows the suggestion by the comment
> in the code, however a per-device control may be preferred?
>
> Needed to make virtual space contiguous for certain devices.

That sounds very suspicious - a single allocation is contiguous with
itself by definition, and anyone relying on multiple allocations being
contiguous with one another is doing it wrong, because there's no way we
could ever guarantee that (with this allocator, at any rate). I'd be
very reticent to touch this without a specific example of what problem
it solves.

Since I understand all this stuff a lot more deeply now than back when I
first wrote that comment, I'd say that if there really is some real need
to implement this feature, it should be a dma_attr flag, i.e. not even
per-device, but per-allocation. We'd be breaking a behaviour currently
guaranteed by the DMA API, so we need to be really sure the caller is OK
with that - having it be their responsibility to explicitly ask is
definitely safest.

Robin.

> Signed-off-by: Magnus Damm
> ---
>
>  drivers/iommu/dma-iommu.c |    6 +-
>  include/linux/iommu.h     |    1 +
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> --- 0001/drivers/iommu/dma-iommu.c
> +++ work/drivers/iommu/dma-iommu.c	2017-01-27 15:17:50.280607110 +0900
> @@ -209,14 +209,18 @@ static struct iova *__alloc_iova(struct
>  	struct iova_domain *iovad = cookie_iovad(domain);
>  	unsigned long shift = iova_shift(iovad);
>  	unsigned long length = iova_align(iovad, size) >> shift;
> +	bool size_aligned = true;
>
>  	if (domain->geometry.force_aperture)
>  		dma_limit = min(dma_limit, domain->geometry.aperture_end);
> +
> +	if (domain->no_size_align)
> +		size_aligned = false;
>  	/*
>  	 * Enforce size-alignment to be safe - there could perhaps be an
>  	 * attribute to control this per-device, or at least per-domain...
>  	 */
> -	return alloc_iova(iovad, length, dma_limit >> shift, true);
> +	return alloc_iova(iovad, length, dma_limit >> shift, size_aligned);
>  }
>
>  /* The IOVA allocator knows what we mapped, so just unmap whatever that was */
> --- 0001/include/linux/iommu.h
> +++ work/include/linux/iommu.h	2017-01-27 15:16:37.630607110 +0900
> @@ -83,6 +83,7 @@ struct iommu_domain {
>  	iommu_fault_handler_t handler;
>  	void *handler_token;
>  	struct iommu_domain_geometry geometry;
> +	bool no_size_align;
>  	void *iova_cookie;
>  };
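Robin's counter-proposal - make opting out of size-alignment a per-allocation attribute rather than per-domain state - would look roughly like the sketch below. `DMA_ATTR_NO_SIZE_ALIGN` is an invented name for illustration, not an existing kernel attribute; the point is only that alignment stays the default and the caller must explicitly ask to lose the guarantee.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-allocation attribute bit (invented for this sketch). */
#define DMA_ATTR_NO_SIZE_ALIGN (1UL << 16)

static bool iova_size_aligned(unsigned long attrs)
{
	/* Size-alignment remains the DMA API default; only an allocation
	 * that explicitly carries the attribute opts out, so no caller
	 * loses the guarantee without asking. */
	return !(attrs & DMA_ATTR_NO_SIZE_ALIGN);
}
```

Compared with the per-domain `no_size_align` flag in the patch, this keeps the decision at the granularity where the guarantee actually matters: each individual allocation.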
[PATCH v2 3/3] mm: wire up GFP flag passing in dma_alloc_from_contiguous
The callers of the DMA alloc functions already provide the proper
context GFP flags. Make sure to pass them through to the CMA allocator,
to make the CMA compaction context aware.

Signed-off-by: Lucas Stach
Acked-by: Vlastimil Babka
Acked-by: Michal Hocko
---
 arch/arm/mm/dma-mapping.c      | 16 +---
 arch/arm64/mm/dma-mapping.c    |  4 ++--
 arch/mips/mm/dma-default.c     |  4 ++--
 arch/x86/kernel/pci-dma.c      |  3 ++-
 arch/xtensa/kernel/pci-dma.c   |  3 ++-
 drivers/base/dma-contiguous.c  |  5 +++--
 drivers/iommu/amd_iommu.c      |  2 +-
 drivers/iommu/intel-iommu.c    |  2 +-
 include/linux/dma-contiguous.h |  4 ++--
 9 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ab7710002ba6..4d6ec7d821c8 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -349,7 +349,7 @@ static void __dma_free_buffer(struct page *page, size_t size)
 static void *__alloc_from_contiguous(struct device *dev, size_t size,
 				     pgprot_t prot, struct page **ret_page,
 				     const void *caller, bool want_vaddr,
-				     int coherent_flag);
+				     int coherent_flag, gfp_t gfp);
 
 static void *__alloc_remap_buffer(struct device *dev, size_t size, gfp_t gfp,
 				  pgprot_t prot, struct page **ret_page,
@@ -420,7 +420,8 @@ static int __init atomic_pool_init(void)
 	 */
 	if (dev_get_cma_area(NULL))
 		ptr = __alloc_from_contiguous(NULL, atomic_pool_size, prot,
-					      &page, atomic_pool_init, true, NORMAL);
+					      &page, atomic_pool_init, true, NORMAL,
+					      GFP_KERNEL);
 	else
 		ptr = __alloc_remap_buffer(NULL, atomic_pool_size, gfp, prot,
 					   &page, atomic_pool_init, true);
@@ -594,14 +595,14 @@ static int __free_from_pool(void *start, size_t size)
 static void *__alloc_from_contiguous(struct device *dev, size_t size,
 				     pgprot_t prot, struct page **ret_page,
 				     const void *caller, bool want_vaddr,
-				     int coherent_flag)
+				     int coherent_flag, gfp_t gfp)
 {
 	unsigned long order = get_order(size);
 	size_t count = size >> PAGE_SHIFT;
 	struct page *page;
 	void *ptr = NULL;
 
-	page = dma_alloc_from_contiguous(dev, count, order);
+	page = dma_alloc_from_contiguous(dev, count, order, gfp);
 	if (!page)
 		return NULL;
 
@@ -655,7 +656,7 @@ static inline pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot)
 #define __get_dma_pgprot(attrs, prot)				__pgprot(0)
 #define __alloc_remap_buffer(dev, size, gfp, prot, ret, c, wv)	NULL
 #define __alloc_from_pool(size, ret_page)			NULL
-#define __alloc_from_contiguous(dev, size, prot, ret, c, wv, coherent_flag)	NULL
+#define __alloc_from_contiguous(dev, size, prot, ret, c, wv, coherent_flag, gfp)	NULL
 #define __free_from_pool(cpu_addr, size)			do { } while (0)
 #define __free_from_contiguous(dev, page, cpu_addr, size, wv)	do { } while (0)
 #define __dma_free_remap(cpu_addr, size)			do { } while (0)
@@ -697,7 +698,8 @@ static void *cma_allocator_alloc(struct arm_dma_alloc_args *args,
 {
 	return __alloc_from_contiguous(args->dev, args->size, args->prot,
 				       ret_page, args->caller,
-				       args->want_vaddr, args->coherent_flag);
+				       args->want_vaddr, args->coherent_flag,
+				       args->gfp);
 }
 
 static void cma_allocator_free(struct arm_dma_free_args *args)
@@ -1293,7 +1295,7 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
 	unsigned long order = get_order(size);
 	struct page *page;
 
-	page = dma_alloc_from_contiguous(dev, count, order);
+	page = dma_alloc_from_contiguous(dev, count, order, gfp);
 	if (!page)
 		goto error;
 
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 290a84f3351f..88e76e5ba29f 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -107,7 +107,7 @@ static void *__dma_alloc_coherent(struct device *dev, size_t size,
 	void *addr;
 
 	page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-						get_order(size));
+						get_order(size), flags);
 	if (!page)
[PATCH v2 2/3] mm: cma_alloc: allow to specify GFP mask
Most users of this interface just want to use it with the default
GFP_KERNEL flags, but for cases where DMA memory is allocated it may be
called from a different context. No functional change yet, just passing
through the flag to the underlying alloc_contig_range() function.

Signed-off-by: Lucas Stach
Acked-by: Vlastimil Babka
Acked-by: Michal Hocko
---
 arch/powerpc/kvm/book3s_hv_builtin.c | 3 ++-
 drivers/base/dma-contiguous.c        | 2 +-
 include/linux/cma.h                  | 3 ++-
 mm/cma.c                             | 5 +++--
 mm/cma_debug.c                       | 2 +-
 5 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 5bb24be0b346..56a62d97ab2d 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -56,7 +56,8 @@ struct page *kvm_alloc_hpt(unsigned long nr_pages)
 {
 	VM_BUG_ON(order_base_2(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT);
 
-	return cma_alloc(kvm_cma, nr_pages, order_base_2(HPT_ALIGN_PAGES));
+	return cma_alloc(kvm_cma, nr_pages, order_base_2(HPT_ALIGN_PAGES),
+			 GFP_KERNEL);
 }
 EXPORT_SYMBOL_GPL(kvm_alloc_hpt);
 
diff --git a/drivers/base/dma-contiguous.c b/drivers/base/dma-contiguous.c
index e167a1e1bccb..d1a9cbabc627 100644
--- a/drivers/base/dma-contiguous.c
+++ b/drivers/base/dma-contiguous.c
@@ -193,7 +193,7 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
 	if (align > CONFIG_CMA_ALIGNMENT)
 		align = CONFIG_CMA_ALIGNMENT;
 
-	return cma_alloc(dev_get_cma_area(dev), count, align);
+	return cma_alloc(dev_get_cma_area(dev), count, align, GFP_KERNEL);
 }
 
 /**
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 6f0a91b37f68..03f32d0bd1d8 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -29,6 +29,7 @@ extern int __init cma_declare_contiguous(phys_addr_t base,
 extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					unsigned int order_per_bit,
 					struct cma **res_cma);
-extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align);
+extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
+			      gfp_t gfp_mask);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned int count);
 #endif
diff --git a/mm/cma.c b/mm/cma.c
index fbd67d866f67..a33ddfde315d 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -362,7 +362,8 @@ int __init cma_declare_contiguous(phys_addr_t base,
  * This function allocates part of contiguous memory on specific
  * contiguous memory area.
  */
-struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align)
+struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
+		       gfp_t gfp_mask)
 {
 	unsigned long mask, offset;
 	unsigned long pfn = -1;
@@ -408,7 +409,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align)
 		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
 		mutex_lock(&cma_mutex);
 		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
-					 GFP_KERNEL);
+					 gfp_mask);
 		mutex_unlock(&cma_mutex);
 		if (ret == 0) {
 			page = pfn_to_page(pfn);
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index f8e4b60db167..ffc0c3d0ae64 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -138,7 +138,7 @@ static int cma_alloc_mem(struct cma *cma, int count)
 	if (!mem)
 		return -ENOMEM;
 
-	p = cma_alloc(cma, count, 0);
+	p = cma_alloc(cma, count, 0, GFP_KERNEL);
 	if (!p) {
 		kfree(mem);
 		return -ENOMEM;
-- 
2.11.0
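The pattern across this series is purely mechanical plumbing: the GFP mask the caller already chose is threaded through each layer instead of being replaced by a hard-coded `GFP_KERNEL` at the bottom. A toy user-space model of that layering (flag values and function names here are invented stand-ins, not the kernel's):

```c
#include <assert.h>

typedef unsigned int gfp_t;
#define GFP_KERNEL 0x3u	/* illustrative values only */
#define GFP_ATOMIC 0x1u

static gfp_t last_compaction_flags;	/* observed at the bottom layer */

static int alloc_contig_range_stub(gfp_t gfp_mask)
{
	last_compaction_flags = gfp_mask;	/* compaction now sees caller context */
	return 0;
}

static int cma_alloc_stub(gfp_t gfp_mask)
{
	/* Before the series, this layer would ignore its caller and pass
	 * GFP_KERNEL down unconditionally; now it passes through. */
	return alloc_contig_range_stub(gfp_mask);
}

static int dma_alloc_from_contiguous_stub(gfp_t gfp_mask)
{
	return cma_alloc_stub(gfp_mask);	/* pass through, no override */
}
```

The payoff is that a caller in, say, atomic or IO-restricted context has its constraints honoured all the way down to compaction, rather than silently widened to full `GFP_KERNEL` behaviour.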
[PATCH v2 1/3] mm: alloc_contig_range: allow to specify GFP mask
Currently alloc_contig_range() assumes that the compaction should be
done with the default GFP_KERNEL flags. This is probably right for all
current uses of this interface, but may change as CMA is used in more
use-cases (including being the default DMA memory allocator on some
platforms).

Change the function prototype to allow for passing through the GFP mask
set by upper layers. Also respect global restrictions by applying
memalloc_noio_flags to the passed-in flags.

Signed-off-by: Lucas Stach
Acked-by: Michal Hocko
---
v2: add memalloc_noio restriction
---
 include/linux/gfp.h | 2 +-
 mm/cma.c            | 3 ++-
 mm/hugetlb.c        | 3 ++-
 mm/page_alloc.c     | 5 +++--
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 4175dca4ac39..1efa221e0e1d 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -549,7 +549,7 @@ static inline bool pm_suspended_storage(void)
 #if (defined(CONFIG_MEMORY_ISOLATION) && defined(CONFIG_COMPACTION)) || defined(CONFIG_CMA)
 /* The below functions must be run on a range from a single zone. */
 extern int alloc_contig_range(unsigned long start, unsigned long end,
-			      unsigned migratetype);
+			      unsigned migratetype, gfp_t gfp_mask);
 extern void free_contig_range(unsigned long pfn, unsigned nr_pages);
 #endif
 
diff --git a/mm/cma.c b/mm/cma.c
index c960459eda7e..fbd67d866f67 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -407,7 +407,8 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align)
 		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
 		mutex_lock(&cma_mutex);
-		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA);
+		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
+					 GFP_KERNEL);
 		mutex_unlock(&cma_mutex);
 		if (ret == 0) {
 			page = pfn_to_page(pfn);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3edb759c5c7d..6ed8b160fc0d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1051,7 +1051,8 @@ static int __alloc_gigantic_page(unsigned long start_pfn,
 				unsigned long nr_pages)
 {
 	unsigned long end_pfn = start_pfn + nr_pages;
-	return alloc_contig_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+	return alloc_contig_range(start_pfn, end_pfn, MIGRATE_MOVABLE,
+				  GFP_KERNEL);
 }
 
 static bool pfn_range_valid_gigantic(struct zone *z,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index eced9fee582b..c5a745b521c0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7230,6 +7230,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
  *			#MIGRATE_MOVABLE or #MIGRATE_CMA). All pageblocks
  *			in range must have the same migratetype and it must
  *			be either of the two.
+ * @gfp_mask:	GFP mask to use during compaction
  *
  * The PFN range does not have to be pageblock or MAX_ORDER_NR_PAGES
  * aligned, however it's the caller's responsibility to guarantee that
@@ -7243,7 +7244,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
  * need to be freed with free_contig_range().
  */
 int alloc_contig_range(unsigned long start, unsigned long end,
-		       unsigned migratetype)
+		       unsigned migratetype, gfp_t gfp_mask)
 {
 	unsigned long outer_start, outer_end;
 	unsigned int order;
@@ -7255,7 +7256,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 		.zone = page_zone(pfn_to_page(start)),
 		.mode = MIGRATE_SYNC,
 		.ignore_skip_hint = true,
-		.gfp_mask = GFP_KERNEL,
+		.gfp_mask = memalloc_noio_flags(gfp_mask),
 	};
 	INIT_LIST_HEAD(&cc.migratepages);
 
-- 
2.11.0
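The v2 `memalloc_noio` restriction above works by masking, not passing through blindly: if the current task has marked itself as forbidden from doing IO during reclaim, the IO/FS reclaim bits are stripped from whatever mask the caller handed down. A minimal sketch of that behaviour, with stand-in flag values and a plain variable modelling `current->flags & PF_MEMALLOC_NOIO`:

```c
#include <assert.h>

typedef unsigned int gfp_t;
#define __GFP_IO	0x40u	/* illustrative values, not the kernel's */
#define __GFP_FS	0x80u
#define GFP_KERNEL	(0x10u | __GFP_IO | __GFP_FS)

static int pf_memalloc_noio;	/* models current->flags & PF_MEMALLOC_NOIO */

static gfp_t memalloc_noio_flags_stub(gfp_t flags)
{
	/* In a NOIO section, reclaim on behalf of this allocation must not
	 * recurse into the IO or filesystem layers, whatever the caller
	 * originally asked for. */
	if (pf_memalloc_noio)
		flags &= ~(__GFP_IO | __GFP_FS);
	return flags;
}
```

This is why applying the helper inside `alloc_contig_range()` matters: the caller's mask is respected, but never allowed to override a global restriction the task has already declared.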
RE: [PATCH] iommu: Better document the IOMMU_PRIV flag
Hi Robin,

>-----Original Message-----
>From: linux-arm-kernel [mailto:linux-arm-kernel-boun...@lists.infradead.org]
>On Behalf Of Robin Murphy
>Sent: Friday, January 27, 2017 5:53 PM
>To: will.dea...@arm.com
>Cc: iommu@lists.linux-foundation.org; sricha...@codeaurora.org;
>linux-arm-ker...@lists.infradead.org
>Subject: [PATCH] iommu: Better document the IOMMU_PRIV flag
>
>This is a fairly subtle thing - let's make sure it's described as
>clearly as possible to avoid potential misunderstandings.
>
>Signed-off-by: Robin Murphy
>---
>
>Having another look through the IOMMU_PRIV series, I wasn't convinced
>that the original comment was really all that helpful - I'm happy for
>this to be squashed in if you like.
>
>Robin.
>
> include/linux/iommu.h | 11 +++
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
>diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>index 69e2417a2965..3c830e153069 100644
>--- a/include/linux/iommu.h
>+++ b/include/linux/iommu.h
>@@ -32,10 +32,13 @@
> #define IOMMU_NOEXEC	(1 << 3)
> #define IOMMU_MMIO	(1 << 4) /* e.g. things like MSI doorbells */
> /*
>- * This is to make the IOMMU API setup privileged
>- * mapppings accessible by the master only at higher
>- * privileged execution level and inaccessible at
>- * less privileged levels.
>+ * Where the bus hardware includes a privilege level as part of its access type
>+ * markings, and certain devices are capable of issuing transactions marked as
>+ * either 'supervisor' or 'user', the IOMMU_PRIV flag requests that the other
>+ * given permission flags only apply to accesses at the higher privilege level,
>+ * and that unprivileged transactions should have as little access as possible.
>+ * This would usually imply the same permissions as kernel mappings on the CPU,
>+ * if the IOMMU page table format is equivalent.
> */

Agree, gives much more insight.

Regards,
Sricharan
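To make the documented semantics concrete: a driver wanting a mapping visible only to a device's privileged ('supervisor') transactions ORs `IOMMU_PRIV` into the prot it passes to `iommu_map()`. The flag values below follow include/linux/iommu.h as patched by this series (`IOMMU_PRIV` as bit 5 is taken from the series; check the tree you build against), and the pl330-microcode example is an illustration of the idea, not the driver's literal code.

```c
#include <assert.h>

/* IOMMU prot flags, per include/linux/iommu.h with the IOMMU_PRIV series. */
#define IOMMU_READ	(1 << 0)
#define IOMMU_WRITE	(1 << 1)
#define IOMMU_CACHE	(1 << 2)
#define IOMMU_NOEXEC	(1 << 3)
#define IOMMU_MMIO	(1 << 4)
#define IOMMU_PRIV	(1 << 5)

/* e.g. DMA-engine microcode: readable by privileged device accesses
 * only, with unprivileged transactions getting as little access as
 * possible, per the comment's semantics. */
static int prot_for_privileged_microcode(void)
{
	return IOMMU_READ | IOMMU_PRIV;
}
```

Note that `IOMMU_PRIV` is a qualifier, not a permission by itself: the other flags describe what the *privileged* accesses may do, while unprivileged accesses are left with minimal rights.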
[GIT PULL] iommu: KVM PCIe/MSI passthrough on ARM/ARM64 for 4.11
Hi Joerg,

Please pull Eric's guest-MSI series for 4.11. This has been through
considerable review and associated rework (including a session at LPC),
but it has stabilised at last and we all seem to be happy with it.
Eric's done a great job of respinning these and remaining patient while
we pulled him in a bunch of different directions.

With these patches applied, it's possible for us to pass PCI devices
through to KVM guests on arm64 using VFIO and have them signal
interrupts using MSIs targeting the ITS via the SMMU. Acronym soup,
sure, but it's much better than legacy wired irqs!

Cheers,

Will

--->8

The following changes since commit a121103c922847ba5010819a3f250f1f7fc84ab8:

  Linux 4.10-rc3 (2017-01-08 14:18:17 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git iommu/guest-msi

for you to fetch changes up to 5018c8d5ef0c172592eb98cf10e253d47b544ba8:

  iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore (2017-01-23 15:00:47 +0000)

----------------------------------------------------------------
Eric Auger (17):
      iommu: Rename iommu_dm_regions into iommu_resv_regions
      iommu: Add a new type field in iommu_resv_region
      iommu: iommu_alloc_resv_region
      iommu: Only map direct mapped regions
      iommu: iommu_get_group_resv_regions
      iommu: Implement reserved_regions iommu-group sysfs file
      iommu/vt-d: Implement reserved region get/put callbacks
      iommu/amd: Declare MSI and HT regions as reserved IOVA regions
      iommu/arm-smmu: Implement reserved region get/put callbacks
      iommu/arm-smmu-v3: Implement reserved region get/put callbacks
      irqdomain: Add irq domain MSI and MSI_REMAP flags
      genirq/msi: Set IRQ_DOMAIN_FLAG_MSI on MSI domain creation
      irqdomain: irq_domain_check_msi_remap
      irqchip/gicv3-its: Sets IRQ_DOMAIN_FLAG_MSI_REMAP
      vfio/type1: Allow transparent MSI IOVA allocation
      vfio/type1: Check MSI remapping at irq domain level
      iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore

Robin Murphy (1):
      iommu/dma: Allow MSI-only cookies

 .../ABI/testing/sysfs-kernel-iommu_groups |  12 ++
 drivers/iommu/amd_iommu.c                 |  54 ---
 drivers/iommu/arm-smmu-v3.c               |  30 +++-
 drivers/iommu/arm-smmu.c                  |  30 +++-
 drivers/iommu/dma-iommu.c                 | 119 +++---
 drivers/iommu/intel-iommu.c               |  92 ---
 drivers/iommu/iommu.c                     | 177 +++--
 drivers/irqchip/irq-gic-v3-its.c          |   1 +
 drivers/vfio/vfio_iommu_type1.c           |  37 -
 include/linux/dma-iommu.h                 |   6 +
 include/linux/iommu.h                     |  46 --
 include/linux/irqdomain.h                 |  36 +
 kernel/irq/irqdomain.c                    |  39 +
 kernel/irq/msi.c                          |   4 +-
 14 files changed, 590 insertions(+), 93 deletions(-)
[GIT PULL] iommu: IOMMU_PRIV support for 4.11
Hi Joerg,

Please pull the following IOMMU changes for 4.11. These patches from
Sricharan add support for "privileged" IOMMU mappings, which are useful
with master devices that support transactions at different privilege
levels and want to control the permissions independently.

Cheers,

Will

--->8

The following changes since commit a121103c922847ba5010819a3f250f1f7fc84ab8:

  Linux 4.10-rc3 (2017-01-08 14:18:17 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git iommu/iommu-priv

for you to fetch changes up to fa8c57db44068a77903d9688382b461a836eee68:

  iommu: Better document the IOMMU_PRIV flag (2017-01-27 13:49:35 +0000)

----------------------------------------------------------------
Jeremy Gebben (1):
      iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag

Mitchel Humpherys (4):
      iommu: add IOMMU_PRIV attribute
      common: DMA-mapping: add DMA_ATTR_PRIVILEGED attribute
      arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED
      dmaengine: pl330: Make sure microcode is privileged

Robin Murphy (3):
      iommu/io-pgtable-arm-v7s: Add support for the IOMMU_PRIV flag
      Revert "iommu/arm-smmu: Set PRIVCFG in stage 1 STEs"
      iommu: Better document the IOMMU_PRIV flag

Sricharan R (2):
      arm/dma-mapping: Implement DMA_ATTR_PRIVILEGED
      iommu/arm-smmu: Set privileged attribute to 'default' instead of 'unprivileged'

 Documentation/DMA-attributes.txt   | 10 +++
 arch/arm/mm/dma-mapping.c          | 60 +++---
 arch/arm64/mm/dma-mapping.c        |  6 ++--
 drivers/dma/pl330.c                |  5 ++--
 drivers/iommu/arm-smmu-v3.c        |  7 +
 drivers/iommu/arm-smmu.c           |  2 +-
 drivers/iommu/dma-iommu.c          | 12 ++--
 drivers/iommu/io-pgtable-arm-v7s.c |  6 +++-
 drivers/iommu/io-pgtable-arm.c     |  5 +++-
 include/linux/dma-iommu.h          |  3 +-
 include/linux/dma-mapping.h        |  7 +
 include/linux/iommu.h              | 10 +++
 12 files changed, 85 insertions(+), 48 deletions(-)
[GIT PULL] iommu/arm-smmu: Updates for 4.11
Hi Joerg,

Please pull these arm-smmu updates for 4.11. Not much this time around:
16-bit SID support on SMMUv2 and a stream table optimisation on SMMUv3.
There's also a trivial cleanup to of_iommu_{set/get}_ops() [they are
removed] which we promised to make after the IORT stuff went in last
time around.

Thanks,

Will

--->8

The following changes since commit a121103c922847ba5010819a3f250f1f7fc84ab8:

  Linux 4.10-rc3 (2017-01-08 14:18:17 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-joerg/arm-smmu/updates

for you to fetch changes up to 3677a649a751c8f061ba379a98c369473ccac980:

  iommu/arm-smmu: Fix for ThunderX erratum #27704 (2017-01-26 18:16:58 +0000)

----------------------------------------------------------------
Aleksey Makarov (1):
      iommu/arm-smmu: Support for Extended Stream ID (16 bit)

Lorenzo Pieralisi (1):
      iommu: Drop the of_iommu_{set/get}_ops() interface

Nate Watterson (2):
      iommu/arm-smmu-v3: Clear prior settings when updating STEs
      iommu/arm-smmu-v3: limit use of 2-level stream tables

Tomasz Nowicki (1):
      iommu/arm-smmu: Fix for ThunderX erratum #27704

 drivers/iommu/arm-smmu-v3.c  | 31 ---
 drivers/iommu/arm-smmu.c     | 72 +++-
 drivers/iommu/exynos-iommu.c |  2 +-
 drivers/iommu/msm_iommu.c    |  2 +-
 drivers/iommu/mtk_iommu.c    |  2 +-
 drivers/iommu/of_iommu.c     |  4 +--
 include/linux/of_iommu.h     | 11 ---
 7 files changed, 68 insertions(+), 56 deletions(-)
Re: [PATCH 3/3] mm: wire up GFP flag passing in dma_alloc_from_contiguous
On Thu 19-01-17 18:07:07, Lucas Stach wrote:
> The callers of the DMA alloc functions already provide the proper
> context GFP flags. Make sure to pass them through to the CMA
> allocator, to make the CMA compaction context aware.
>
> Signed-off-by: Lucas Stach

Looks good to me

Acked-by: Michal Hocko

> ---
>  arch/arm/mm/dma-mapping.c      | 16 +---
>  arch/arm64/mm/dma-mapping.c    |  4 ++--
>  arch/mips/mm/dma-default.c     |  4 ++--
>  arch/x86/kernel/pci-dma.c      |  3 ++-
>  arch/xtensa/kernel/pci-dma.c   |  3 ++-
>  drivers/base/dma-contiguous.c  |  5 +++--
>  drivers/iommu/amd_iommu.c      |  2 +-
>  drivers/iommu/intel-iommu.c    |  2 +-
>  include/linux/dma-contiguous.h |  4 ++--
>  9 files changed, 24 insertions(+), 19 deletions(-)
>
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index ab7710002ba6..4d6ec7d821c8 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -349,7 +349,7 @@ static void __dma_free_buffer(struct page *page, size_t size)
>  static void *__alloc_from_contiguous(struct device *dev, size_t size,
>  				     pgprot_t prot, struct page **ret_page,
>  				     const void *caller, bool want_vaddr,
> -				     int coherent_flag);
> +				     int coherent_flag, gfp_t gfp);
>
>  static void *__alloc_remap_buffer(struct device *dev, size_t size, gfp_t gfp,
>  				  pgprot_t prot, struct page **ret_page,
> @@ -420,7 +420,8 @@ static int __init atomic_pool_init(void)
>  	 */
>  	if (dev_get_cma_area(NULL))
>  		ptr = __alloc_from_contiguous(NULL, atomic_pool_size, prot,
> -					      &page, atomic_pool_init, true, NORMAL);
> +					      &page, atomic_pool_init, true, NORMAL,
> +					      GFP_KERNEL);
>  	else
>  		ptr = __alloc_remap_buffer(NULL, atomic_pool_size, gfp, prot,
>  					   &page, atomic_pool_init, true);
> @@ -594,14 +595,14 @@ static int __free_from_pool(void *start, size_t size)
>  static void *__alloc_from_contiguous(struct device *dev, size_t size,
>  				     pgprot_t prot, struct page **ret_page,
>  				     const void *caller, bool want_vaddr,
> -				     int coherent_flag)
> +				     int coherent_flag, gfp_t gfp)
>  {
>  	unsigned long order = get_order(size);
>  	size_t count = size >> PAGE_SHIFT;
>  	struct page *page;
>  	void *ptr = NULL;
>
> -	page = dma_alloc_from_contiguous(dev, count, order);
> +	page = dma_alloc_from_contiguous(dev, count, order, gfp);
>  	if (!page)
>  		return NULL;
>
> @@ -655,7 +656,7 @@ static inline pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot)
>  #define __get_dma_pgprot(attrs, prot)				__pgprot(0)
>  #define __alloc_remap_buffer(dev, size, gfp, prot, ret, c, wv)	NULL
>  #define __alloc_from_pool(size, ret_page)			NULL
> -#define __alloc_from_contiguous(dev, size, prot, ret, c, wv, coherent_flag)	NULL
> +#define __alloc_from_contiguous(dev, size, prot, ret, c, wv, coherent_flag, gfp)	NULL
>  #define __free_from_pool(cpu_addr, size)			do { } while (0)
>  #define __free_from_contiguous(dev, page, cpu_addr, size, wv)	do { } while (0)
>  #define __dma_free_remap(cpu_addr, size)			do { } while (0)
> @@ -697,7 +698,8 @@ static void *cma_allocator_alloc(struct arm_dma_alloc_args *args,
>  {
>  	return __alloc_from_contiguous(args->dev, args->size, args->prot,
>  				       ret_page, args->caller,
> -				       args->want_vaddr, args->coherent_flag);
> +				       args->want_vaddr, args->coherent_flag,
> +				       args->gfp);
>  }
>
>  static void cma_allocator_free(struct arm_dma_free_args *args)
> @@ -1293,7 +1295,7 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
>  	unsigned long order = get_order(size);
>  	struct page *page;
>
> -	page = dma_alloc_from_contiguous(dev, count, order);
> +	page = dma_alloc_from_contiguous(dev, count, order, gfp);
>  	if (!page)
>  		goto error;
>
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index e04082700bb1..b998f56e6a53 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -107,7 +107,7 @@ static void *__dma_alloc_coherent(struct device *dev, size_t size,
>  	void *addr;
>
>  	page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
> -
Re: [PATCH 1/3] mm: alloc_contig_range: allow to specify GFP mask
On Fri 20-01-17 13:35:40, Vlastimil Babka wrote:
> On 01/19/2017 06:07 PM, Lucas Stach wrote:
[...]
> > @@ -7255,7 +7256,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
> >  		.zone = page_zone(pfn_to_page(start)),
> >  		.mode = MIGRATE_SYNC,
> >  		.ignore_skip_hint = true,
> > -		.gfp_mask = GFP_KERNEL,
> > +		.gfp_mask = gfp_mask,
>
> I think you should apply memalloc_noio_flags() here (and Michal should
> then convert it to the new name in his scoped gfp_nofs series). Note
> that then it's technically a functional change, but it's needed.
> Otherwise looks good.

yes, with that added, feel free to add
Acked-by: Michal Hocko

> >  	};
> >  	INIT_LIST_HEAD();
> >
> >

--
Michal Hocko
SUSE Labs
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 2/2] iommu/vt-d: tylersburg isoch identity map check is done too late.
The check to set the identity map for Tylersburg is done too late. It
needs to be done before the identity-map domain check.

To: Joerg Roedel
To: David Woodhouse
Cc: iommu@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Cc: Ashok Raj
Signed-off-by: Ashok Raj
Reported-by: Yunhong Jiang
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8a18525..23eead3 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3325,13 +3325,14 @@ static int __init init_dmars(void)
 		iommu_identity_mapping |= IDENTMAP_GFX;
 #endif

+	check_tylersburg_isoch();
+
 	if (iommu_identity_mapping) {
 		ret = si_domain_init(hw_pass_through);
 		if (ret)
 			goto free_iommu;
 	}

-	check_tylersburg_isoch();

 	/*
 	 * If we copied translations from a previous kernel in the kdump
--
2.7.4
[PATCH 1/2] iommu/vt-d: Fix some macros that are incorrectly specified in intel-iommu
From: CQ Tang

Some of the macros use the wrong bit shifts, resulting in the wrong
invalidation granularity being selected. An incorrect source-ID in the
extended device-TLB invalidation descriptor also caused device-side
errors.

To: Joerg Roedel
To: David Woodhouse
Cc: iommu@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Cc: CQ Tang
Cc: Ashok Raj
Signed-off-by: CQ Tang
Signed-off-by: Ashok Raj
Tested-by: CQ Tang
---
 include/linux/intel-iommu.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index d49e26c..23e129e 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -153,8 +153,8 @@ static inline void dmar_writeq(void __iomem *addr, u64 val)
 #define DMA_TLB_GLOBAL_FLUSH (((u64)1) << 60)
 #define DMA_TLB_DSI_FLUSH (((u64)2) << 60)
 #define DMA_TLB_PSI_FLUSH (((u64)3) << 60)
-#define DMA_TLB_IIRG(type) ((type >> 60) & 7)
-#define DMA_TLB_IAIG(val) (((val) >> 57) & 7)
+#define DMA_TLB_IIRG(type) ((type >> 60) & 3)
+#define DMA_TLB_IAIG(val) (((val) >> 57) & 3)
 #define DMA_TLB_READ_DRAIN (((u64)1) << 49)
 #define DMA_TLB_WRITE_DRAIN (((u64)1) << 48)
 #define DMA_TLB_DID(id)	(((u64)((id) & 0x)) << 32)
@@ -164,9 +164,9 @@ static inline void dmar_writeq(void __iomem *addr, u64 val)
 /* INVALID_DESC */
 #define DMA_CCMD_INVL_GRANU_OFFSET 61
-#define DMA_ID_TLB_GLOBAL_FLUSH	(((u64)1) << 3)
-#define DMA_ID_TLB_DSI_FLUSH	(((u64)2) << 3)
-#define DMA_ID_TLB_PSI_FLUSH	(((u64)3) << 3)
+#define DMA_ID_TLB_GLOBAL_FLUSH	(((u64)1) << 4)
+#define DMA_ID_TLB_DSI_FLUSH	(((u64)2) << 4)
+#define DMA_ID_TLB_PSI_FLUSH	(((u64)3) << 4)
 #define DMA_ID_TLB_READ_DRAIN	(((u64)1) << 7)
 #define DMA_ID_TLB_WRITE_DRAIN	(((u64)1) << 6)
 #define DMA_ID_TLB_DID(id)	(((u64)((id & 0x) << 16)))
@@ -316,8 +316,8 @@ enum {
 #define QI_DEV_EIOTLB_SIZE	(((u64)1) << 11)
 #define QI_DEV_EIOTLB_GLOB(g)	((u64)g)
 #define QI_DEV_EIOTLB_PASID(p)	(((u64)p) << 32)
-#define QI_DEV_EIOTLB_SID(sid)	((u64)((sid) & 0x) << 32)
-#define QI_DEV_EIOTLB_QDEP(qd)	(((qd) & 0x1f) << 16)
+#define QI_DEV_EIOTLB_SID(sid)	((u64)((sid) & 0x) << 16)
+#define QI_DEV_EIOTLB_QDEP(qd)	((u64)((qd) & 0x1f) << 4)
 #define QI_DEV_EIOTLB_MAX_INVS	32
 #define QI_PGRP_IDX(idx)	(((u64)(idx)) << 55)
--
2.7.4
[PATCH v2 1/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS
Add helpers for allocating physically contiguous DMA buffers to the
generic IOMMU DMA code. This can be useful when two or more devices
with different memory requirements are involved in buffer sharing.

The iommu_dma_{alloc,free}_contiguous() functions complement the
existing iommu_dma_{alloc,free}() functions, and allow
architecture-specific code to implement support for the
DMA_ATTR_FORCE_CONTIGUOUS attribute on systems with an IOMMU. As this
uses the CMA allocator, setting this attribute has a runtime dependency
on CONFIG_DMA_CMA.

Note that unlike the existing iommu_dma_alloc() helper,
iommu_dma_alloc_contiguous() has no callback to flush pages. Ensuring
the returned region is made visible to a non-coherent device is the
responsibility of the caller.

Signed-off-by: Geert Uytterhoeven
---
v2:
  - Provide standalone iommu_dma_{alloc,free}_contiguous() functions, as
    requested by Robin Murphy,
  - Simplify operations by getting rid of the page array/scatterlist
    dance, as the buffer is contiguous,
  - Move CPU cache management into the caller, which is much simpler
    with a single contiguous buffer.
---
 drivers/iommu/dma-iommu.c | 72 +++
 include/linux/dma-iommu.h |  4 +++
 2 files changed, 76 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 2db0d641cf4505b5..8f8ed4426f9a3a12 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include

 struct iommu_dma_msi_page {
 	struct list_head	list;
@@ -408,6 +409,77 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 }

 /**
+ * iommu_dma_free_contiguous - Free a buffer allocated by
+ *	iommu_dma_alloc_contiguous()
+ * @dev: Device which owns this buffer
+ * @page: Buffer page pointer as returned by iommu_dma_alloc_contiguous()
+ * @size: Size of buffer in bytes
+ * @handle: DMA address of buffer
+ *
+ * Frees the pages associated with the buffer.
+ */
+void iommu_dma_free_contiguous(struct device *dev, struct page *page,
+		size_t size, dma_addr_t *handle)
+{
+	__iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle);
+	dma_release_from_contiguous(dev, page, PAGE_ALIGN(size) >> PAGE_SHIFT);
+	*handle = DMA_ERROR_CODE;
+}
+
+/**
+ * iommu_dma_alloc_contiguous - Allocate and map a buffer contiguous in IOVA
+ *	and physical space
+ * @dev: Device to allocate memory for. Must be a real device attached to an
+ *	iommu_dma_domain
+ * @size: Size of buffer in bytes
+ * @prot: IOMMU mapping flags
+ * @handle: Out argument for allocated DMA handle
+ *
+ * Return: Buffer page pointer, or NULL on failure.
+ *
+ * Note that unlike iommu_dma_alloc(), it's the caller's responsibility to
+ * ensure the returned region is made visible to the given non-coherent device.
+ */
+struct page *iommu_dma_alloc_contiguous(struct device *dev, size_t size,
+		int prot, dma_addr_t *handle)
+{
+	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+	struct iova_domain *iovad = cookie_iovad(domain);
+	dma_addr_t dma_addr;
+	unsigned int count;
+	struct page *page;
+	struct iova *iova;
+	int ret;
+
+	*handle = DMA_ERROR_CODE;
+
+	size = PAGE_ALIGN(size);
+	count = size >> PAGE_SHIFT;
+	page = dma_alloc_from_contiguous(dev, count, get_order(size));
+	if (!page)
+		return NULL;
+
+	iova = __alloc_iova(domain, size, dev->coherent_dma_mask);
+	if (!iova)
+		goto out_free_pages;
+
+	size = iova_align(iovad, size);
+	dma_addr = iova_dma_addr(iovad, iova);
+	ret = iommu_map(domain, dma_addr, page_to_phys(page), size, prot);
+	if (ret < 0)
+		goto out_free_iova;
+
+	*handle = dma_addr;
+	return page;
+
+out_free_iova:
+	__free_iova(iovad, iova);
+out_free_pages:
+	dma_release_from_contiguous(dev, page, count);
+	return NULL;
+}
+
+/**
  * iommu_dma_mmap - Map a buffer into provided user VMA
  * @pages: Array representing buffer from iommu_dma_alloc()
  * @size: Size of buffer in bytes
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 7f7e9a7e3839966c..7eee62c2b0e752f7 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -45,6 +45,10 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 		void (*flush_page)(struct device *, const void *, phys_addr_t));
 void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
 		dma_addr_t *handle);
+struct page *iommu_dma_alloc_contiguous(struct device *dev, size_t size,
+		int prot, dma_addr_t *handle);
+void iommu_dma_free_contiguous(struct device *dev, struct page *page,
+		size_t size, dma_addr_t *handle);
 int
[PATCH v2 2/2] arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU
Add support for allocating physically contiguous DMA buffers on arm64
systems with an IOMMU, by dispatching DMA buffer allocations with the
DMA_ATTR_FORCE_CONTIGUOUS attribute to the appropriate IOMMU DMA
helpers.

Note that as this uses the CMA allocator, setting this attribute has a
runtime dependency on CONFIG_DMA_CMA, just like on arm32.

For arm64 systems using swiotlb, no changes are needed to support the
allocation of physically contiguous DMA buffers:
  - swiotlb always uses physically contiguous buffers (up to
    IO_TLB_SEGSIZE = 128 pages),
  - arm64's __dma_alloc_coherent() already calls
    dma_alloc_from_contiguous() when CMA is available.

Signed-off-by: Geert Uytterhoeven
---
v2:
  - New, handle dispatching in the arch (arm64) code, as requested by
    Robin Murphy.
---
 arch/arm64/mm/dma-mapping.c | 51 -
 1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 1d7d5d2881db7c19..325803e0ba79ef26 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -577,20 +577,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 	 */
 	gfp |= __GFP_ZERO;

-	if (gfpflags_allow_blocking(gfp)) {
-		struct page **pages;
-		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
-
-		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
-					handle, flush_page);
-		if (!pages)
-			return NULL;
-
-		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
-					      __builtin_return_address(0));
-		if (!addr)
-			iommu_dma_free(dev, pages, iosize, handle);
-	} else {
+	if (!gfpflags_allow_blocking(gfp)) {
 		struct page *page;
 		/*
 		 * In atomic context we can't remap anything, so we'll only
@@ -614,6 +601,35 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 			__free_from_pool(addr, size);
 			addr = NULL;
 		}
+	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
+		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
+		struct page *page;
+
+		page = iommu_dma_alloc_contiguous(dev, iosize, ioprot, handle);
+		if (!page)
+			return NULL;
+
+		if (!coherent)
+			__dma_flush_area(page_to_virt(page), iosize);
+
+		addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
+						   prot,
+						   __builtin_return_address(0));
+		if (!addr)
+			iommu_dma_free_contiguous(dev, page, iosize, handle);
+	} else {
+		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
+		struct page **pages;
+
+		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
+					handle, flush_page);
+		if (!pages)
+			return NULL;
+
+		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
+					      __builtin_return_address(0));
+		if (!addr)
+			iommu_dma_free(dev, pages, iosize, handle);
 	}
 	return addr;
 }
@@ -626,6 +642,8 @@ static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 	size = PAGE_ALIGN(size);
 	/*
 	 * @cpu_addr will be one of 3 things depending on how it was allocated:
+	 * - A remapped array of pages from iommu_dma_alloc_contiguous()
+	 *   for contiguous allocations.
 	 * - A remapped array of pages from iommu_dma_alloc(), for all
 	 *   non-atomic allocations.
 	 * - A non-cacheable alias from the atomic pool, for atomic
@@ -637,6 +655,11 @@ static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 	if (__in_atomic_pool(cpu_addr, size)) {
 		iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
 		__free_from_pool(cpu_addr, size);
+	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
+		struct page *page = phys_to_page(dma_to_phys(dev, handle));
+
+		iommu_dma_free_contiguous(dev, page, iosize, );
+		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
 	} else if (is_vmalloc_addr(cpu_addr)) {
 		struct vm_struct *area = find_vm_area(cpu_addr);
--
1.9.1
[PATCH v2 0/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS
Hi all,

This patch series adds helpers for DMA_ATTR_FORCE_CONTIGUOUS to the
generic IOMMU DMA code, and support for allocating physically contiguous
DMA buffers on arm64 systems with an IOMMU. This can be useful when two
or more devices with different memory requirements are involved in
buffer sharing.

Changes compared to v1:
  - Provide standalone iommu_dma_{alloc,free}_contiguous() functions, as
    requested by Robin Murphy,
  - Handle dispatching in the arch (arm64) code, as requested by Robin
    Murphy,
  - Simplify operations by getting rid of the page array/scatterlist
    dance, as the buffer is contiguous,
  - Move CPU cache management into the caller, which is much simpler
    with a single contiguous buffer.

Thanks for your comments!

Geert Uytterhoeven (2):
  iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS
  arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU

 arch/arm64/mm/dma-mapping.c | 51 +++-
 drivers/iommu/dma-iommu.c   | 72 +
 include/linux/dma-iommu.h   |  4 +++
 3 files changed, 113 insertions(+), 14 deletions(-)

--
1.9.1

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
				-- Linus Torvalds
[PATCH] iommu: Better document the IOMMU_PRIV flag
This is a fairly subtle thing - let's make sure it's described as
clearly as possible to avoid potential misunderstandings.

Signed-off-by: Robin Murphy
---
Having another look through the IOMMU_PRIV series, I wasn't convinced
that the original comment was really all that helpful - I'm happy for
this to be squashed in if you like.

Robin.

 include/linux/iommu.h | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 69e2417a2965..3c830e153069 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -32,10 +32,13 @@
 #define IOMMU_NOEXEC	(1 << 3)
 #define IOMMU_MMIO	(1 << 4) /* e.g. things like MSI doorbells */
 /*
- * This is to make the IOMMU API setup privileged
- * mapppings accessible by the master only at higher
- * privileged execution level and inaccessible at
- * less privileged levels.
+ * Where the bus hardware includes a privilege level as part of its access type
+ * markings, and certain devices are capable of issuing transactions marked as
+ * either 'supervisor' or 'user', the IOMMU_PRIV flag requests that the other
+ * given permission flags only apply to accesses at the higher privilege level,
+ * and that unprivileged transactions should have as little access as possible.
+ * This would usually imply the same permissions as kernel mappings on the CPU,
+ * if the IOMMU page table format is equivalent.
  */
 #define IOMMU_PRIV	(1 << 5)
--
2.11.0.dirty
Re: [PATCH v2 2/3] iommu/ipmmu-vmsa: Increase maximum micro-TLBS to 48
On Mon, Jan 23, 2017 at 08:40:29PM +0900, Magnus Damm wrote:
> From: Magnus Damm
>
> Bump up the maximum numbers of micro-TLBS to 48.
>
> Each IPMMU device instance get micro-TLB assignment via
> the "iommus" property in DT. Older SoCs tend to use a
> maximum number of 32 micro-TLBs per IPMMU instance however
> newer SoCs such as r8a7796 make use of up to 48 micro-TLBs.
>
> At this point no SoC specific handling is done to validate
> the maximum number of micro-TLBs, and because of that the
> DT information is assumed to be within correct range for
> each particular SoC.
>
> If needed in the future SoC specific feature flags can be
> added to handle the maximum number of micro-TLBs without
> requiring DT changes, however at this point this does not
> seem necessary.
>
> Signed-off-by: Magnus Damm

I get a conflict when applying this to v4.10-rc5. What is this based
on, any patches I missed?


	Joerg