Re: [PATCH V2 2/6] iommu: iova: properly handle 0 as a valid IOVA address
Hi Robin,

On Mon, May 23, 2022 at 11:00 PM Robin Murphy wrote:
>
> On 2022-05-11 13:15, Ajay Kumar wrote:
> > From: Marek Szyprowski
> >
> > Zero is a valid DMA and IOVA address on many architectures, so adjust the
> > IOVA management code to properly handle it. A new value IOVA_BAD_ADDR
> > (~0UL) is introduced as a generic value for the error case. Adjust all
> > callers of the alloc_iova_fast() function for the new return value.
>
> And when does anything actually need this? In fact if you were to stop
> iommu-dma from reserving IOVA 0 - which you don't - it would only show
> how patch #3 is broken.

Right! Since IOVA allocation proceeds from higher addresses to lower ones,
hitting the IOVA == 0 case would mean the IOVA space is exhausted, which is
highly unlikely.

> Also note that it's really nothing to do with architectures either way;
> iommu-dma simply chooses to reserve IOVA 0 for its own convenience,
> mostly because it can. Much the same way that 0 is typically a valid CPU
> VA, but mapping something meaningful there is just asking for a world of
> pain debugging NULL-dereference bugs.
>
> Robin.

This makes sense, let me think about managing the PFN at the lowest address
in some other way.
Thanks,
Ajay Kumar
Re: [PATCH V2 0/6] IOMMU-DMA - support DMA_ATTR_LOW_ADDRESS attribute
Ping!

On Thu, May 12, 2022 at 9:09 AM Ajay Kumar wrote:
>
> This patchset is a rebase of the original patches from Marek Szyprowski:
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2321261.html
>
> The patches have been rebased on Joro's IOMMU tree "next" branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
>
> This patchset is needed to address the IOVA address dependency issue
> between firmware buffers and other buffers in the Samsung s5p-mfc driver.
>
> There have been a few discussions in the past on how to find a generic
> solution for this issue, ranging from adding an entirely new API to choose
> the IOVA window [1], to adding a DMA attribute, DMA_ATTR_LOW_ADDRESS, which
> handles buffer allocation from a lower address [2].
> This is a continuation of the initial work from Marek for approach [2].
> The patches have been tested with the latest version of the Samsung
> s5p-mfc driver.
>
> Changes since V1:
> [PATCH V2 1/6]
> - Rebase on latest tree.
>
> [PATCH V2 2/6]
> - Rebase on latest tree.
> - Added a missing check for iova_pfn in __iova_rcache_get().
> - Discarded changes to drivers/iommu/intel/iommu.c which are not necessary.
>
> [PATCH V2 3/6]
> - Rebase on latest tree.
>
> [PATCH V2 4/6]
> - Rebase on latest tree.
>
> [PATCH V2 5/6]
> - Rebase on latest tree.
>
> [PATCH V2 6/6]
> - Rebase on latest tree.
>
> Marek Szyprowski (6):
>   dma-mapping: add DMA_ATTR_LOW_ADDRESS attribute
>   iommu: iova: properly handle 0 as a valid IOVA address
>   iommu: iova: add support for 'first-fit' algorithm
>   iommu: dma-iommu: refactor iommu_dma_alloc_iova()
>   iommu: dma-iommu: add support for DMA_ATTR_LOW_ADDRESS
>   media: platform: s5p-mfc: use DMA_ATTR_LOW_ADDRESS
>
> References:
> [1] https://lore.kernel.org/linux-iommu/20200811054912.ga...@infradead.org/
> [2] https://lore.kernel.org/linux-mm/bff57cbe-2247-05e1-9059-d9c66d64c...@arm.com
>
>  drivers/iommu/dma-iommu.c                    | 77 +++-
>  drivers/iommu/iova.c                         | 91 ++-
>  .../media/platform/samsung/s5p-mfc/s5p_mfc.c |  8 +-
>  include/linux/dma-mapping.h                  |  6 ++
>  include/linux/iova.h                         |  3 +
>  5 files changed, 156 insertions(+), 29 deletions(-)
>
> base-commit: faf93cfaadfaaff2a5c35d6301b45aa2f6e4ddb2
> --
> 2.17.1
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH V2 6/6] media: platform: s5p-mfc: use DMA_ATTR_LOW_ADDRESS
From: Marek Szyprowski

The S5P-MFC driver relied on the way the ARM DMA-IOMMU glue code worked -
mainly on the fact that the allocator used a first-fit algorithm and that
the first allocated buffer was at DMA/IOVA address 0x0. This is not true
for the generic IOMMU-DMA glue code that will soon be used for the ARM
architecture, so limit the dma_mask to the size of the DMA window the
hardware can use and add the needed DMA attribute to force proper IOVA
allocation of the firmware buffer.

Signed-off-by: Marek Szyprowski
Signed-off-by: Ajay Kumar
---
 drivers/media/platform/samsung/s5p-mfc/s5p_mfc.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/samsung/s5p-mfc/s5p_mfc.c b/drivers/media/platform/samsung/s5p-mfc/s5p_mfc.c
index 761341934925..15c9c2273561 100644
--- a/drivers/media/platform/samsung/s5p-mfc/s5p_mfc.c
+++ b/drivers/media/platform/samsung/s5p-mfc/s5p_mfc.c
@@ -1196,8 +1196,12 @@ static int s5p_mfc_configure_common_memory(struct s5p_mfc_dev *mfc_dev)
 	if (!mfc_dev->mem_bitmap)
 		return -ENOMEM;
 
-	mfc_dev->mem_virt = dma_alloc_coherent(dev, mem_size,
-					       &mfc_dev->mem_base, GFP_KERNEL);
+	/* MFC v5 can access memory only via the 256M window */
+	if (exynos_is_iommu_available(dev) && !IS_MFCV6_PLUS(mfc_dev))
+		dma_set_mask_and_coherent(dev, SZ_256M - 1);
+
+	mfc_dev->mem_virt = dma_alloc_attrs(dev, mem_size, &mfc_dev->mem_base,
+					    GFP_KERNEL, DMA_ATTR_LOW_ADDRESS);
 	if (!mfc_dev->mem_virt) {
 		bitmap_free(mfc_dev->mem_bitmap);
 		dev_err(dev, "failed to preallocate %ld MiB for the firmware and context buffers\n",
--
2.17.1
[PATCH V2 5/6] iommu: dma-iommu: add support for DMA_ATTR_LOW_ADDRESS
From: Marek Szyprowski

Implement support for the DMA_ATTR_LOW_ADDRESS DMA attribute. If it has
been set, call alloc_iova_first_fit() instead of alloc_iova_fast() to
allocate the new IOVA from the beginning of the address space.

Signed-off-by: Marek Szyprowski
Signed-off-by: Ajay Kumar
---
 drivers/iommu/dma-iommu.c | 50 +-
 1 file changed, 38 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index cb235b40303c..553c5b863e19 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -601,6 +601,18 @@ static int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
 }
 
 #define DMA_ALLOC_IOVA_COHERENT		BIT(0)
+#define DMA_ALLOC_IOVA_FIRST_FIT	BIT(1)
+
+static unsigned int dma_attrs_to_alloc_flags(unsigned long attrs, bool coherent)
+{
+	unsigned int flags = 0;
+
+	if (coherent)
+		flags |= DMA_ALLOC_IOVA_COHERENT;
+	if (attrs & DMA_ATTR_LOW_ADDRESS)
+		flags |= DMA_ALLOC_IOVA_FIRST_FIT;
+	return flags;
+}
 
 static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
 		struct device *dev, size_t size, unsigned int flags)
@@ -625,13 +637,23 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
 	dma_limit = min(dma_limit, (u64)domain->geometry.aperture_end);
 
 	/* Try to get PCI devices a SAC address */
-	if (dma_limit > DMA_BIT_MASK(32) && !iommu_dma_forcedac && dev_is_pci(dev))
-		iova = alloc_iova_fast(iovad, iova_len,
-				       DMA_BIT_MASK(32) >> shift, false);
+	if (dma_limit > DMA_BIT_MASK(32) && !iommu_dma_forcedac && dev_is_pci(dev)) {
+		if (unlikely(flags & DMA_ALLOC_IOVA_FIRST_FIT))
+			iova = alloc_iova_first_fit(iovad, iova_len,
+						    DMA_BIT_MASK(32) >> shift);
+		else
+			iova = alloc_iova_fast(iovad, iova_len,
+					       DMA_BIT_MASK(32) >> shift, false);
+	}
 
-	if (iova == IOVA_BAD_ADDR)
-		iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift,
-				       true);
+	if (iova == IOVA_BAD_ADDR) {
+		if (unlikely(flags & DMA_ALLOC_IOVA_FIRST_FIT))
+			iova = alloc_iova_first_fit(iovad, iova_len,
+						    dma_limit >> shift);
+		else
+			iova = alloc_iova_fast(iovad, iova_len,
+					       dma_limit >> shift, true);
+	}
 
 	if (iova != IOVA_BAD_ADDR)
 		return (dma_addr_t)iova << shift;
@@ -779,6 +801,7 @@ static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
 	struct iova_domain *iovad = &cookie->iovad;
 	bool coherent = dev_is_dma_coherent(dev);
 	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+	unsigned int flags = dma_attrs_to_alloc_flags(attrs, true);
 	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
 	struct page **pages;
 	dma_addr_t iova;
@@ -804,7 +827,7 @@ static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
 		return NULL;
 
 	size = iova_align(iovad, size);
-	iova = iommu_dma_alloc_iova(domain, dev, size, DMA_ALLOC_IOVA_COHERENT);
+	iova = iommu_dma_alloc_iova(domain, dev, size, flags);
 	if (iova == DMA_MAPPING_ERROR)
 		goto out_free_pages;
 
@@ -964,6 +987,7 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 	phys_addr_t phys = page_to_phys(page) + offset;
 	bool coherent = dev_is_dma_coherent(dev);
 	int prot = dma_info_to_prot(dir, coherent, attrs);
+	unsigned int flags = dma_attrs_to_alloc_flags(attrs, false);
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
@@ -1005,7 +1029,7 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		arch_sync_dma_for_device(phys, size, dir);
 
-	iova = __iommu_dma_map(dev, phys, size, prot, 0);
+	iova = __iommu_dma_map(dev, phys, size, prot, flags);
 	if (iova == DMA_MAPPING_ERROR && is_swiotlb_buffer(dev, phys))
 		swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
 	return iova;
@@ -1152,6 +1176,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 	struct iova_domain *iovad = &cookie->iovad;
[PATCH V2 4/6] iommu: dma-iommu: refactor iommu_dma_alloc_iova()
From: Marek Szyprowski

Change the parameters passed to iommu_dma_alloc_iova(): the dma_limit can
be easily extracted from the parameters of the passed struct device, so
replace it with a flags parameter, which can later hold more information
about the way the IOVA allocator should do its job. While touching the
parameter list, move struct device to the second position to better match
the convention of the DMA-mapping related functions.

Signed-off-by: Marek Szyprowski
Signed-off-by: Ajay Kumar
---
 drivers/iommu/dma-iommu.c | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 16218d6a0703..cb235b40303c 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -600,12 +600,16 @@ static int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
 	}
 }
 
+#define DMA_ALLOC_IOVA_COHERENT		BIT(0)
+
 static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
-		size_t size, u64 dma_limit, struct device *dev)
+		struct device *dev, size_t size, unsigned int flags)
 {
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
 	unsigned long shift, iova_len, iova = IOVA_BAD_ADDR;
+	u64 dma_limit = (flags & DMA_ALLOC_IOVA_COHERENT) ?
+			dev->coherent_dma_mask : dma_get_mask(dev);
 
 	if (cookie->type == IOMMU_DMA_MSI_COOKIE) {
 		cookie->msi_iova += size;
@@ -675,7 +679,7 @@ static void __iommu_dma_unmap(struct device *dev, dma_addr_t dma_addr,
 }
 
 static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
-		size_t size, int prot, u64 dma_mask)
+		size_t size, int prot, unsigned int flags)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
@@ -689,7 +693,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 
 	size = iova_align(iovad, size + iova_off);
 
-	iova = iommu_dma_alloc_iova(domain, size, dma_mask, dev);
+	iova = iommu_dma_alloc_iova(domain, dev, size, flags);
 	if (iova == DMA_MAPPING_ERROR)
 		return DMA_MAPPING_ERROR;
 
@@ -800,7 +804,7 @@ static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
 		return NULL;
 
 	size = iova_align(iovad, size);
-	iova = iommu_dma_alloc_iova(domain, size, dev->coherent_dma_mask, dev);
+	iova = iommu_dma_alloc_iova(domain, dev, size, DMA_ALLOC_IOVA_COHERENT);
 	if (iova == DMA_MAPPING_ERROR)
 		goto out_free_pages;
 
@@ -963,7 +967,7 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
-	dma_addr_t iova, dma_mask = dma_get_mask(dev);
+	dma_addr_t iova;
 
 	/*
 	 * If both the physical buffer start address and size are
@@ -1001,7 +1005,7 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		arch_sync_dma_for_device(phys, size, dir);
 
-	iova = __iommu_dma_map(dev, phys, size, prot, dma_mask);
+	iova = __iommu_dma_map(dev, phys, size, prot, 0);
 	if (iova == DMA_MAPPING_ERROR && is_swiotlb_buffer(dev, phys))
 		swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
 	return iova;
@@ -1205,7 +1209,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		prev = s;
 	}
 
-	iova = iommu_dma_alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
+	iova = iommu_dma_alloc_iova(domain, dev, iova_len, 0);
 	if (iova == DMA_MAPPING_ERROR) {
 		ret = -ENOMEM;
 		goto out_restore_sg;
@@ -1264,8 +1268,7 @@ static dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
 	return __iommu_dma_map(dev, phys, size,
-			dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO,
-			dma_get_mask(dev));
+			dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO, 0);
 }
 
 static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
@@ -1375,7 +1378,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 		return NULL;
 
 	*handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot,
-			dev->coherent_dma_mask);
+			DMA_ALLOC_IOVA_COHERENT);
 	if (*handle == DMA_MAPPING_ERROR) {
 		__iommu_dma_free
[PATCH V2 3/6] iommu: iova: add support for 'first-fit' algorithm
From: Marek Szyprowski

Add support for the 'first-fit' allocation algorithm. It will be used for
the special case of implementing DMA_ATTR_LOW_ADDRESS, so this path doesn't
use the IOVA cache.

Signed-off-by: Marek Szyprowski
Signed-off-by: Ajay Kumar
---
 drivers/iommu/iova.c | 78
 include/linux/iova.h |  2 ++
 2 files changed, 80 insertions(+)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index ae0fe0a6714e..89f9338f83a3 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -231,6 +231,59 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	return -ENOMEM;
 }
 
+static unsigned long
+__iova_get_aligned_start(unsigned long start, unsigned long size)
+{
+	unsigned long mask = __roundup_pow_of_two(size) - 1;
+
+	return (start + mask) & ~mask;
+}
+
+static int __alloc_and_insert_iova_range_forward(struct iova_domain *iovad,
+		unsigned long size, unsigned long limit_pfn, struct iova *new)
+{
+	struct rb_node *curr;
+	unsigned long flags;
+	unsigned long start, limit;
+
+	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
+
+	curr = rb_first(&iovad->rbroot);
+	limit = limit_pfn;
+	start = __iova_get_aligned_start(iovad->start_pfn, size);
+
+	while (curr) {
+		struct iova *curr_iova = rb_entry(curr, struct iova, node);
+		struct rb_node *next = rb_next(curr);
+
+		start = __iova_get_aligned_start(curr_iova->pfn_hi + 1, size);
+		if (next) {
+			struct iova *next_iova = rb_entry(next, struct iova, node);
+			limit = next_iova->pfn_lo - 1;
+		} else {
+			limit = limit_pfn;
+		}
+
+		if ((start + size) <= limit)
+			break; /* found a free slot */
+		curr = next;
+	}
+
+	if (!curr && start + size > limit) {
+		spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
+		return -ENOMEM;
+	}
+
+	new->pfn_lo = start;
+	new->pfn_hi = new->pfn_lo + size - 1;
+	iova_insert_rbtree(&iovad->rbroot, new, curr);
+
+	spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
+
+	return 0;
+}
+
 static struct kmem_cache *iova_cache;
 static unsigned int iova_cache_users;
 static DEFINE_MUTEX(iova_cache_mutex);
@@ -420,6 +473,31 @@ free_iova(struct iova_domain *iovad, unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(free_iova);
 
+/**
+ * alloc_iova_first_fit - allocates an iova from the beginning of address space
+ * @iovad: - iova domain in question
+ * @size: - size of page frames to allocate
+ * @limit_pfn: - max limit address
+ * Returns a pfn the allocated iova starts at or IOVA_BAD_ADDR in the case
+ * of a failure.
+ */
+unsigned long
+alloc_iova_first_fit(struct iova_domain *iovad, unsigned long size,
+		     unsigned long limit_pfn)
+{
+	struct iova *new_iova = alloc_iova_mem();
+
+	if (!new_iova)
+		return IOVA_BAD_ADDR;
+
+	if (__alloc_and_insert_iova_range_forward(iovad, size, limit_pfn, new_iova)) {
+		free_iova_mem(new_iova);
+		return IOVA_BAD_ADDR;
+	}
+	return new_iova->pfn_lo;
+}
+EXPORT_SYMBOL_GPL(alloc_iova_first_fit);
+
 /**
  * alloc_iova_fast - allocates an iova from rcache
  * @iovad: - iova domain in question
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 46b5b10c532b..45ed6d41490a 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -89,6 +89,8 @@ void free_iova_fast(struct iova_domain *iovad, unsigned long pfn,
 		    unsigned long size);
 unsigned long alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 			      unsigned long limit_pfn, bool flush_rcache);
+unsigned long alloc_iova_first_fit(struct iova_domain *iovad, unsigned long size,
+				   unsigned long limit_pfn);
 struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
 			  unsigned long pfn_hi);
 void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
--
2.17.1
[PATCH V2 2/6] iommu: iova: properly handle 0 as a valid IOVA address
From: Marek Szyprowski

Zero is a valid DMA and IOVA address on many architectures, so adjust the
IOVA management code to properly handle it. A new value IOVA_BAD_ADDR
(~0UL) is introduced as a generic value for the error case. Adjust all
callers of the alloc_iova_fast() function for the new return value.

Signed-off-by: Marek Szyprowski
Signed-off-by: Ajay Kumar
---
 drivers/iommu/dma-iommu.c | 16 +---
 drivers/iommu/iova.c      | 13 +
 include/linux/iova.h      |  1 +
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 1ca85d37eeab..16218d6a0703 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -605,7 +605,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
 {
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
-	unsigned long shift, iova_len, iova = 0;
+	unsigned long shift, iova_len, iova = IOVA_BAD_ADDR;
 
 	if (cookie->type == IOMMU_DMA_MSI_COOKIE) {
 		cookie->msi_iova += size;
@@ -625,11 +625,13 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
 		iova = alloc_iova_fast(iovad, iova_len,
 				       DMA_BIT_MASK(32) >> shift, false);
 
-	if (!iova)
+	if (iova == IOVA_BAD_ADDR)
 		iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift,
 				       true);
 
-	return (dma_addr_t)iova << shift;
+	if (iova != IOVA_BAD_ADDR)
+		return (dma_addr_t)iova << shift;
+	return DMA_MAPPING_ERROR;
 }
 
 static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
@@ -688,7 +690,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 	size = iova_align(iovad, size + iova_off);
 
 	iova = iommu_dma_alloc_iova(domain, size, dma_mask, dev);
-	if (!iova)
+	if (iova == DMA_MAPPING_ERROR)
 		return DMA_MAPPING_ERROR;
 
 	if (iommu_map_atomic(domain, iova, phys - iova_off, size, prot)) {
@@ -799,7 +801,7 @@ static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
 	size = iova_align(iovad, size);
 	iova = iommu_dma_alloc_iova(domain, size, dev->coherent_dma_mask, dev);
-	if (!iova)
+	if (iova == DMA_MAPPING_ERROR)
 		goto out_free_pages;
 
 	if (sg_alloc_table_from_pages(sgt, pages, count, 0, size, GFP_KERNEL))
@@ -1204,7 +1206,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 	}
 
 	iova = iommu_dma_alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
-	if (!iova) {
+	if (iova == DMA_MAPPING_ERROR) {
 		ret = -ENOMEM;
 		goto out_restore_sg;
 	}
@@ -1516,7 +1518,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 		return NULL;
 
 	iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
-	if (!iova)
+	if (iova == DMA_MAPPING_ERROR)
 		goto out_free_page;
 
 	if (iommu_map(domain, iova, msi_addr, size, prot))
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index db77aa675145..ae0fe0a6714e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -429,6 +429,8 @@ EXPORT_SYMBOL_GPL(free_iova);
  * This function tries to satisfy an iova allocation from the rcache,
  * and falls back to regular allocation on failure. If regular allocation
  * fails too and the flush_rcache flag is set then the rcache will be flushed.
+ * Returns a pfn the allocated iova starts at or IOVA_BAD_ADDR in the case
+ * of a failure.
  */
 unsigned long
 alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
@@ -447,7 +449,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 	size = roundup_pow_of_two(size);
 
 	iova_pfn = iova_rcache_get(iovad, size, limit_pfn + 1);
-	if (iova_pfn)
+	if (iova_pfn != IOVA_BAD_ADDR)
 		return iova_pfn;
 
 retry:
@@ -456,7 +458,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 		unsigned int cpu;
 
 		if (!flush_rcache)
-			return 0;
+			return IOVA_BAD_ADDR;
 
 		/* Try replenishing IOVAs by flushing rcache. */
 		flush_rcache = false;
@@ -831,7 +833,7 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
 				       unsigned long limit_pfn)
 {
 	struct iova_cpu_rcache *cpu_rcache;
-	unsigned long iova_pfn = 0;
+	unsigned long iova_pfn = IOVA_BAD_ADDR;
 	bool has_pfn = false;
 	unsigned long flags;
 
@@ -858,6 +860,9 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
 	spin_unlock_irqrest
[PATCH V2 1/6] dma-mapping: add DMA_ATTR_LOW_ADDRESS attribute
From: Marek Szyprowski

Some devices require a special buffer (usually for the firmware) to be
allocated just at the beginning of the address space, to ensure that all
further allocations can be expressed as a positive offset from that special
buffer. When an IOMMU is used for managing the DMA address space, such a
requirement can be easily fulfilled, simply by enforcing the 'first-fit'
IOVA allocation algorithm. This patch adds a DMA attribute for such a case.

Signed-off-by: Marek Szyprowski
Signed-off-by: Ajay Kumar
---
 include/linux/dma-mapping.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index dca2b1355bb1..3cbdf857edd1 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -60,6 +60,12 @@
  * at least read-only at lesser-privileged levels).
  */
 #define DMA_ATTR_PRIVILEGED		(1UL << 9)
+/*
+ * DMA_ATTR_LOW_ADDRESS: used to indicate that the buffer should be allocated
+ * at the lowest possible DMA address, usually just at the beginning of the
+ * DMA/IOVA address space ('first-fit' allocation algorithm).
+ */
+#define DMA_ATTR_LOW_ADDRESS		(1UL << 10)
 
 /*
  * A dma_addr_t can hold any valid DMA or bus address for the platform. It can
--
2.17.1
[PATCH V2 0/6] IOMMU-DMA - support DMA_ATTR_LOW_ADDRESS attribute
This patchset is a rebase of the original patches from Marek Szyprowski:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2321261.html

The patches have been rebased on Joro's IOMMU tree "next" branch:
https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git

This patchset is needed to address the IOVA address dependency issue
between firmware buffers and other buffers in the Samsung s5p-mfc driver.

There have been a few discussions in the past on how to find a generic
solution for this issue, ranging from adding an entirely new API to choose
the IOVA window [1], to adding a DMA attribute, DMA_ATTR_LOW_ADDRESS, which
handles buffer allocation from a lower address [2].
This is a continuation of the initial work from Marek for approach [2].
The patches have been tested with the latest version of the Samsung s5p-mfc
driver.

Changes since V1:
[PATCH V2 1/6]
- Rebase on latest tree.

[PATCH V2 2/6]
- Rebase on latest tree.
- Added a missing check for iova_pfn in __iova_rcache_get().
- Discarded changes to drivers/iommu/intel/iommu.c which are not necessary.

[PATCH V2 3/6]
- Rebase on latest tree.

[PATCH V2 4/6]
- Rebase on latest tree.

[PATCH V2 5/6]
- Rebase on latest tree.

[PATCH V2 6/6]
- Rebase on latest tree.
Marek Szyprowski (6):
  dma-mapping: add DMA_ATTR_LOW_ADDRESS attribute
  iommu: iova: properly handle 0 as a valid IOVA address
  iommu: iova: add support for 'first-fit' algorithm
  iommu: dma-iommu: refactor iommu_dma_alloc_iova()
  iommu: dma-iommu: add support for DMA_ATTR_LOW_ADDRESS
  media: platform: s5p-mfc: use DMA_ATTR_LOW_ADDRESS

References:
[1] https://lore.kernel.org/linux-iommu/20200811054912.ga...@infradead.org/
[2] https://lore.kernel.org/linux-mm/bff57cbe-2247-05e1-9059-d9c66d64c...@arm.com

 drivers/iommu/dma-iommu.c                    | 77 +++-
 drivers/iommu/iova.c                         | 91 ++-
 .../media/platform/samsung/s5p-mfc/s5p_mfc.c |  8 +-
 include/linux/dma-mapping.h                  |  6 ++
 include/linux/iova.h                         |  3 +

 5 files changed, 156 insertions(+), 29 deletions(-)

base-commit: faf93cfaadfaaff2a5c35d6301b45aa2f6e4ddb2
--
2.17.1
Re: [PATCH] iommu/arm-smmu-v3: Handle duplicated Stream IDs from other masters
On Tue, Jan 12, 2021 at 12:57 AM Robin Murphy wrote:
>
> On 2021-01-07 13:03, Will Deacon wrote:
> > On Thu, Jan 07, 2021 at 03:03:40PM +0530, Ajay Kumar wrote:
> >> When PCI function drivers (e.g. pci-endpoint-test) are probed for an
> >> already initialized PCIe-RC (Root Complex), and the PCIe-RC is already
> >> bound to the SMMU, then we encounter a situation where the function
> >> driver tries to attach itself to the SMMU with the same stream ID as
> >> the PCIe-RC and re-initialize an already initialized STE. This causes
> >> an ste_live BUG_ON() in the driver.
>
> Note that this is actually expected behaviour, since Stream ID aliasing
> has remained officially not supported until a sufficiently compelling
> reason to do so appears. I always thought the most likely scenario would
> be a legacy PCI bridge with multiple devices behind it, but even that
> seems increasingly improbable for a modern SMMUv3-based system to ever see.

Thanks to Will and Robin for reviewing this. I am pretty new to PCI, sorry
about that. I assumed that support for stream-ID aliasing was already
handled as part of this patch:
https://www.spinics.net/lists/arm-kernel/msg626087.html
which prevents STE re-initialization. But what I do not understand is why
the path taken by the arm-smmu-v3 driver misses the aforementioned check
for my use case.

> > I don't understand why the endpoint is using the same stream ID as the
> > root complex in this case. Why is that? Is the grouping logic not
> > working properly?
>
> It's not so much that it isn't working properly, it's more that it needs
> to be implemented at all ;)

The pci_endpoint_test device picks up the same OF DMA config node as the
PCI RC because they sit on the same PCI bus [via pci_dma_configure()].
While in the arm-smmu-v3 driver, I can see that pci_device_group() hands
over the same iommu group as the Root Complex to the newly added master
device (pci_endpoint_test in our case) because they share the same stream
ID. Shouldn't they?
> >> There is an already existing check in the driver to manage duplicated
> >> IDs if duplicated IDs are added in the same master device, but there
> >> can be scenarios like the above where we need to extend the check to
> >> other masters using the same stream ID.
> >>
> >> Signed-off-by: Ajay Kumar
> >> ---
> >>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 33 +
> >>  1 file changed, 33 insertions(+)
> >
> > It doesn't feel like the driver is the right place to fix this, as the
> > same issue could surely occur for other IOMMUs too, right? In which
> > case, I think we should avoid getting into the situation where
> > different groups have overlapping stream IDs.
>
> Yes, this patch does not represent the correct thing to do either way.
> The main reason that Stream ID aliasing hasn't been supported so far is
> that the required Stream ID to group lookup is rather awkward, and
> adding all of that complexity just for the sake of a rather unlikely
> possibility seemed dubious. However, PRI support has always had a more
> pressing need to implement almost the same thing (Stream ID to device),
> so once that lands we can finally get round to adding the rest of proper
> group support relatively easily.

I hope the support will be added soon. Also, can you point me to a few
drivers which already handle this type of stream-ID aliasing?

Thanks,
Ajay Kumar
[PATCH] iommu/arm-smmu-v3: Handle duplicated Stream IDs from other masters
When PCI function drivers(ex:pci-endpoint-test) are probed for already initialized PCIe-RC(Root Complex), and PCIe-RC is already bound to SMMU, then we encounter a situation where the function driver tries to attach itself to the smmu with the same stream-id as PCIe-RC and re-initialize an already initialized STE. This causes ste_live BUG_ON() in the driver. There is an already existing check in the driver to manage duplicated ids if duplicated ids are added in same master device, but there can be scenarios like above where we need to extend the check for other masters using the same stream-id. Signed-off-by: Ajay Kumar --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 33 + 1 file changed, 33 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index e634bbe60573..a91c3c0e9ee8 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2022,10 +2022,26 @@ static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid) return step; } +static bool arm_smmu_check_duplicated_sid(struct arm_smmu_master *master, + int sid) +{ + int i; + + for (i = 0; i < master->num_sids; ++i) + if (master->sids[i] == sid) + return true; + + return false; +} + static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master) { + bool sid_in_other_masters; int i, j; struct arm_smmu_device *smmu = master->smmu; + unsigned long flags; + struct arm_smmu_domain *smmu_domain = master->domain; + struct arm_smmu_master *other_masters; for (i = 0; i < master->num_sids; ++i) { u32 sid = master->sids[i]; @@ -2038,6 +2054,23 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master) if (j < i) continue; + /* Check for stream-ID duplication in masters in given domain */ + sid_in_other_masters = false; + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_for_each_entry(other_masters, &smmu_domain->devices, + domain_head) { + 
sid_in_other_masters = + arm_smmu_check_duplicated_sid(other_masters, + sid); + if (sid_in_other_masters) + break; + } + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + + /* Skip STE re-init if stream-id found in other masters */ + if (sid_in_other_masters) + continue; + arm_smmu_write_strtab_ent(master, sid, step); } } -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
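The skip logic in the patch above can be modeled in a few lines of plain Python. This is a hedged illustration only: the dict-based "master" objects and the function name mirror the patch's intent, not the actual arm-smmu-v3 driver API. It shows a SID being skipped both when it repeats inside the same master (the existing `j < i` check) and when another master in the domain already owns it (the new check), which is what avoids the `ste_live` BUG_ON().

```python
# Toy model of arm_smmu_install_ste_for_dev() with the patch applied.
# A master is a dict with a "sids" list; domain_masters is the list of
# masters attached to the same domain. Returns the SIDs whose STEs
# would actually be (re)written.

def install_ste_for_dev(master, domain_masters):
    written = []
    for i, sid in enumerate(master["sids"]):
        # existing in-master duplicate check (mirrors the j < i loop)
        if sid in master["sids"][:i]:
            continue
        # new check: SID already claimed by another master in the domain
        others = [m for m in domain_masters if m is not master]
        if any(sid in m["sids"] for m in others):
            continue  # skip STE re-init to avoid the ste_live BUG_ON()
        written.append(sid)
    return written

rc = {"name": "pcie-rc", "sids": [0x10]}
ep = {"name": "pci-endpoint-test", "sids": [0x10, 0x11, 0x11]}
masters = [rc, ep]
# the endpoint skips 0x10 (owned by the RC) and the repeated 0x11
```

Note this stateless model doesn't capture attach ordering; in the driver, the first master to attach has already written the shared STE, which is exactly why the second attach must skip it.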
Re: IOVA allocation dependency between firmware buffer and remaining buffers
Hello all,

We pretty much tried to solve the same issue here with a new API in DMA-IOMMU:
https://lore.kernel.org/linux-iommu/20200811054912.ga...@infradead.org/T/

Christoph - the user part would be MFC devices on exynos platforms

Thanks,
Ajay

On Wed, Sep 23, 2020 at 12:28 PM Christoph Hellwig wrote:
>
> On Wed, Sep 23, 2020 at 08:48:26AM +0200, Marek Szyprowski wrote:
> > Hi Shaik,
> >
> > I've run into a similar problem while adapting the S5P-MFC and Exynos4-IS
> > drivers for the generic IOMMU-DMA framework. Here is my first solution:
> > https://lore.kernel.org/linux-samsung-soc/20200918144833.14618-1-m.szyprow...@samsung.com/T/
> >
> > It allows remapping a given buffer at a specific IOVA address, although
> > it doesn't guarantee that those specific addresses won't be later used
> > by the IOVA allocator. Probably it would make sense to add an API to the
> > generic IOMMU-DMA framework to mark the given IOVA range as
> > reserved/unused to protect them.
>
> If you want to use IOVA addresses in a device otherwise managed by
> dma-iommu we need to expose an API through the dma API. Can you please
> include the iommu list in the discussion of your series?
>
> I don't think using the raw IOMMU API is a very good idea in these drivers,
> we probably want a way to change the allocator algorithm or hint the
> next IOVA and keep using the normal DMA API. Maybe Robin has a better
> idea.
[RFC V2 PATCH] dma-iommu: allow devices to set IOVA range dynamically
Currently, the only way to change the lower limit of IOVA for a device is to call init_iova_domain(), but that function re-initializes the whole domain and doesn't keep track of the IOVAs allocated before the re-init.

There are cases where the device might not support a contiguous range of addresses, and can also have dependencies among buffers allocated for firmware, image manipulation, etc., with all of the address requests passing through the IOMMU. In such cases, we can allocate buffers stage by stage by setting an address limit at each stage, and also keep track of the same.

A bit of background can be found here:
IOVA allocation dependency between firmware buffer and remaining buffers
https://www.spinics.net/lists/iommu/msg43586.html

This patch allows devices to limit the IOVA space they want during allocation at any given point of time. We allow this only if the device owns the corresponding iommu_domain, that is, the device is the only master attached to the domain.

The lower limit of the IOVA space is marked by start_pfn, and the upper limit is marked by dma_mask; this patch honors both. Since changing dma_mask can extend the addressable region beyond the current cached node, we reset the current cached nodes as well.
User drivers can make call sequence like below: When they want to limit IOVA for allocated buffers in range 0x0 to 0x100: iommu_set_iova_range(dev, 0x0, 0x100 - 1); When they want to limit IOVA for allocated buffers in range 0x100 to 0xC000: iommu_set_iova_range(dev, 0x100, 0xC000 - 0x100); = Signed-off-by: Ajay Kumar --- drivers/iommu/dma-iommu.c | 73 +++ drivers/iommu/iommu.c | 16 + include/linux/iommu.h | 6 include/linux/iova.h | 6 4 files changed, 101 insertions(+) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 4959f5df21bd..2fe3f57ab648 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -167,6 +167,79 @@ void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list) } EXPORT_SYMBOL(iommu_dma_get_resv_regions); +/** + * iommu_set_iova_range - Limit the IOVA region for a specific device + * @dev: Device to set IOVA range for + * @base: Base address or the lower limit of the IOVA range + * @size: Size of address range from lower limit to upper limit + * + * Allow a master device to dynamically control the range of IOVA addresses + * which are allocated iff the master device is the only device attached to + * the corresponding iommu_domain. + * This function doesn't harm IOVA addresses outside of current range, + * which were allocated prior to calling this function. 
+ */ +int iommu_set_iova_range(struct device *dev, dma_addr_t base, u64 size) +{ + struct iommu_domain *domain; + struct iommu_dma_cookie *cookie; + struct iova_domain *iovad; + unsigned long shift, base_pfn; + u64 new_dma_mask; + + /* +* Check if the IOMMU master device is the sole entry in the group +* If the group has more than one master device using the same IOMMU +* we shouldn't be allowing that device to change the IOVA limit +*/ + if (iommu_group_device_count_from_dev(dev) != 1) + return -EINVAL; + + domain = iommu_get_domain_for_dev(dev); + if (!domain) + return -ENODEV; + + if (domain->type != IOMMU_DOMAIN_DMA) + return -EINVAL; + + cookie = domain->iova_cookie; + if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE) + return -ENODEV; + + iovad = &cookie->iovad; + + shift = iova_shift(iovad); + base_pfn = base >> shift; + + base_pfn = max_t(unsigned long, 1, base_pfn); + + /* base cannot be outside aperture */ + if (domain->geometry.force_aperture) { + if (base > domain->geometry.aperture_end || + base + size <= domain->geometry.aperture_start) { + pr_warn("specified DMA range outside IOMMU capability\n"); + return -EFAULT; + } + /* ...then finally give it a kicking to make sure it fits */ + base_pfn = max_t(unsigned long, base_pfn, + domain->geometry.aperture_start >> shift); + } + /* Set page aligned lower limit of IOVA range to start_pfn */ + iovad->start_pfn = base_pfn; + + /* Set upper limit of IOVA range to dma_mask */ + new_dma_mask = (u64)base + size; + dma_set_mask_and_coherent(dev, new_dma_mask); + + /* Reset cached nodes to start IOVA search from the anchor node */ + iovad->cached_node = &iovad->anchor.node; +
[RFC PATCH] dma-iommu: allow devices to set IOVA range dynamically
Currently, the only way to change the lower limit of IOVA for a device is to call init_iova_domain(), but that function re-initializes the whole domain and doesn't keep track of the IOVAs allocated before the re-init.

There are cases where the device might not support a contiguous range of addresses, and can also have dependencies among buffers allocated for firmware, image manipulation, etc., with all of the address requests passing through the IOMMU. In such cases, we can allocate buffers stage by stage by setting an address limit at each stage, and also keep track of the same.

A bit of background can be found here:
IOVA allocation dependency between firmware buffer and remaining buffers
https://www.spinics.net/lists/iommu/msg43586.html

This patch allows devices to limit the IOVA space they want during allocation at any given point of time. We allow this only if the device owns the corresponding iommu_domain, that is, the device is the only master attached to the domain.

The lower limit of the IOVA space is marked by start_pfn, and the upper limit is marked by dma_mask; this patch honors both. Since changing dma_mask can extend the addressable region beyond the current cached node, we reset the current cached nodes as well.
Signed-off-by: Ajay Kumar --- drivers/iommu/dma-iommu.c | 73 +++ drivers/iommu/iommu.c | 16 + include/linux/iommu.h | 6 include/linux/iova.h | 6 4 files changed, 101 insertions(+) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 4959f5df21bd..2fe3f57ab648 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -167,6 +167,79 @@ void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list) } EXPORT_SYMBOL(iommu_dma_get_resv_regions); +/** + * iommu_set_iova_range - Limit the IOVA region for a specific device + * @dev: Device to set IOVA range for + * @base: Base address or the lower limit of the IOVA range + * @size: Size of address range from lower limit to upper limit + * + * Allow a master device to dynamically control the range of IOVA addresses + * which are allocated iff the master device is the only device attached to + * the corresponding iommu_domain. + * This function doesn't harm IOVA addresses outside of current range, + * which were allocated prior to calling this function. 
+ */ +int iommu_set_iova_range(struct device *dev, dma_addr_t base, u64 size) +{ + struct iommu_domain *domain; + struct iommu_dma_cookie *cookie; + struct iova_domain *iovad; + unsigned long shift, base_pfn; + u64 new_dma_mask; + + /* +* Check if the IOMMU master device is the sole entry in the group +* If the group has more than one master device using the same IOMMU +* we shouldn't be allowing that device to change the IOVA limit +*/ + if (iommu_group_device_count_from_dev(dev) != 1) + return -EINVAL; + + domain = iommu_get_domain_for_dev(dev); + if (!domain) + return -ENODEV; + + if (domain->type != IOMMU_DOMAIN_DMA) + return -EINVAL; + + cookie = domain->iova_cookie; + if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE) + return -ENODEV; + + iovad = &cookie->iovad; + + shift = iova_shift(iovad); + base_pfn = base >> shift; + + base_pfn = max_t(unsigned long, 1, base_pfn); + + /* base cannot be outside aperture */ + if (domain->geometry.force_aperture) { + if (base > domain->geometry.aperture_end || + base + size <= domain->geometry.aperture_start) { + pr_warn("specified DMA range outside IOMMU capability\n"); + return -EFAULT; + } + /* ...then finally give it a kicking to make sure it fits */ + base_pfn = max_t(unsigned long, base_pfn, + domain->geometry.aperture_start >> shift); + } + /* Set page aligned lower limit of IOVA range to start_pfn */ + iovad->start_pfn = base_pfn; + + /* Set upper limit of IOVA range to dma_mask */ + new_dma_mask = (u64)base + size; + dma_set_mask_and_coherent(dev, new_dma_mask); + + /* Reset cached nodes to start IOVA search from the anchor node */ + iovad->cached_node = &iovad->anchor.node; + iovad->cached32_node = &iovad->anchor.node; + iovad->max32_alloc_size = iovad->dma_32bit_pfn; + + return 0; +} +EXPORT_SYMBOL(iommu_set_iova_range); + static int cookie_init_hw_msi_region(struct iommu_dma_cookie *cookie, phys_addr_t start, phys_addr_t end) { diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 
609bd25bf154..30b2d4e5487d 100644 --- a/drivers/iommu/iommu.c +++ b/drive
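The aperture-validation arithmetic in the patch's iommu_set_iova_range() can be checked in isolation. The sketch below is an assumption-laden Python transcription of just that math, not kernel code: the shift is hard-coded to 12 (a 4K granule) and the function returns `(base_pfn, upper_limit)` instead of mutating an iova_domain, with `None` standing in for -EFAULT.

```python
# Toy transcription of the range validation in iommu_set_iova_range():
# reject a [base, base + size) window that lies entirely outside the
# aperture, then clamp the start pfn to 1 and to the aperture start.

IOVA_SHIFT = 12  # assumed 4K granule

def set_iova_range(base, size, aperture_start, aperture_end):
    # window completely outside the aperture -> error (-EFAULT)
    if base > aperture_end or base + size <= aperture_start:
        return None
    base_pfn = max(1, base >> IOVA_SHIFT)
    # "...then finally give it a kicking to make sure it fits"
    base_pfn = max(base_pfn, aperture_start >> IOVA_SHIFT)
    upper_limit = base + size  # becomes the new dma_mask bound
    return base_pfn, upper_limit
```

Note the clamp to pfn 1 mirrors the patch's `max_t(unsigned long, 1, base_pfn)`, i.e. IOVA 0 stays reserved even when the caller passes base 0.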
Re: [PATCH] iommu/iova: Retry from last rb tree node if iova search fails
On 5/7/20, Robin Murphy wrote:
> On 2020-05-06 9:01 pm, vji...@codeaurora.org wrote:
>> From: Vijayanand Jitta
>>
>> Whenever a new iova alloc request comes, the iova is always searched
>> from the cached node and the nodes which are previous to the cached
>> node. So, even if there is free iova space available in the nodes
>> which are next to the cached node, iova allocation can still fail
>> because of this approach.
>>
>> Consider the following sequence of iova allocs and frees on
>> 1GB of iova space:
>>
>> 1) alloc - 500MB
>> 2) alloc - 12MB
>> 3) alloc - 499MB
>> 4) free - 12MB which was allocated in step 2
>> 5) alloc - 13MB
>>
>> After the above sequence we will have 12MB of free iova space and
>> the cached node will be pointing to the iova pfn of the last alloc of
>> 13MB, which will be the lowest iova pfn of that iova space. Now if we
>> get an alloc request of 2MB we just search from the cached node and
>> then look for lower iova pfns for free iova, and as there aren't any,
>> the iova alloc fails though there is 12MB of free iova space.
>
> Yup, this could definitely do with improving. Unfortunately I think this
> particular implementation is slightly flawed...
> >> To avoid such iova search failures do a retry from the last rb tree node >> when iova search fails, this will search the entire tree and get an iova >> if its available >> >> Signed-off-by: Vijayanand Jitta >> --- >> drivers/iommu/iova.c | 11 +++ >> 1 file changed, 11 insertions(+) >> >> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c >> index 0e6a953..2985222 100644 >> --- a/drivers/iommu/iova.c >> +++ b/drivers/iommu/iova.c >> @@ -186,6 +186,7 @@ static int __alloc_and_insert_iova_range(struct >> iova_domain *iovad, >> unsigned long flags; >> unsigned long new_pfn; >> unsigned long align_mask = ~0UL; >> +bool retry = false; >> >> if (size_aligned) >> align_mask <<= fls_long(size - 1); >> @@ -198,6 +199,8 @@ static int __alloc_and_insert_iova_range(struct >> iova_domain *iovad, >> >> curr = __get_cached_rbnode(iovad, limit_pfn); >> curr_iova = rb_entry(curr, struct iova, node); >> + >> +retry_search: >> do { >> limit_pfn = min(limit_pfn, curr_iova->pfn_lo); >> new_pfn = (limit_pfn - size) & align_mask; >> @@ -207,6 +210,14 @@ static int __alloc_and_insert_iova_range(struct >> iova_domain *iovad, >> } while (curr && new_pfn <= curr_iova->pfn_hi); >> >> if (limit_pfn < size || new_pfn < iovad->start_pfn) { >> +if (!retry) { >> +curr = rb_last(&iovad->rbroot); > > Why walk when there's an anchor node there already? However... +1 > >> +curr_iova = rb_entry(curr, struct iova, node); >> +limit_pfn = curr_iova->pfn_lo; > > ...this doesn't look right, as by now we've lost the original limit_pfn > supplied by the caller, so are highly likely to allocate beyond the > range our caller asked for. In fact AFAICS we'd start allocating from > directly directly below the anchor node, beyond the end of the entire > address space. +1 > > The logic I was imagining we want here was something like the rapidly > hacked up (and untested) diff below. > > Thanks, > Robin. 
> > ->8- > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c > index 0e6a9536eca6..3574c19272d6 100644 > --- a/drivers/iommu/iova.c > +++ b/drivers/iommu/iova.c > @@ -186,6 +186,7 @@ static int __alloc_and_insert_iova_range(struct > iova_domain *iovad, > unsigned long flags; > unsigned long new_pfn; > unsigned long align_mask = ~0UL; > + unsigned long alloc_hi, alloc_lo; > > if (size_aligned) > align_mask <<= fls_long(size - 1); > @@ -196,17 +197,27 @@ static int __alloc_and_insert_iova_range(struct > iova_domain *iovad, > size >= iovad->max32_alloc_size) > goto iova32_full; > > + alloc_hi = IOVA_ANCHOR; > + alloc_lo = iovad->start_pfn; > +retry: > curr = __get_cached_rbnode(iovad, limit_pfn); > curr_iova = rb_entry(curr, struct iova, node); > + if (alloc_hi < curr_iova->pfn_hi) { > + alloc_lo = curr_iova->pfn_hi; > + alloc_hi = limit_pfn; > + } > + > do { > - limit_pfn = min(limit_pfn, curr_iova->pfn_lo); > - new_pfn = (limit_pfn - size) & align_mask; > + alloc_hi = min(alloc_hi, curr_iova->pfn_lo); During retry case, the curr and curr_iova is not updated. Kindly check it. Ajay > + new_pfn = (alloc_hi - size) & align_mask; > prev = curr; > curr = rb_prev(curr); > curr_iova = rb_entry(curr, struct iova, node); > } while (curr && new_pfn <= curr_iova->pfn_hi); > > - if (limit_pfn < size || new_pfn < iovad->start_pfn) { > + if (limit_pfn < size || new_pfn < alloc_lo) { > + if (alloc_lo == iovad->start_p
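The failure scenario and the retry idea discussed above can be reproduced with a toy allocator. This is a hedged sketch, not the rbtree-based kernel implementation: sizes are in MB over a [0, 1024) space, allocation is top-down, and the "cached node" is just the lowest start seen so far. The fast path searches only below that hint; the retry pass searches the whole space again, which is how the 12MB hole left in the middle becomes reachable.

```python
# Toy model of top-down IOVA allocation with a cached "last allocation"
# hint, illustrating why a retry pass finds holes the fast path misses.

class ToyIovaDomain:
    def __init__(self, size):
        self.size = size
        self.allocs = []       # list of (lo, hi) intervals, kept sorted
        self.cached_lo = size  # lowest start seen so far (the "hint")

    def _fits(self, lo, hi):
        return lo >= 0 and all(hi <= a_lo or lo >= a_hi
                               for a_lo, a_hi in self.allocs)

    def _alloc_below(self, limit, size):
        # candidate top edges: the limit itself and each existing start
        points = sorted({limit} | {a_lo for a_lo, _ in self.allocs},
                        reverse=True)
        for p in points:
            lo, hi = p - size, p
            if hi <= limit and self._fits(lo, hi):
                self.allocs.append((lo, hi))
                self.allocs.sort()
                self.cached_lo = min(self.cached_lo, lo)
                return lo
        return None

    def alloc(self, size, retry=True):
        # fast path: only look below the cached hint
        res = self._alloc_below(self.cached_lo, size)
        if res is None and retry:
            # retry pass: search the whole space again from the top
            res = self._alloc_below(self.size, size)
        return res

    def free(self, lo):
        self.allocs = [a for a in self.allocs if a[0] != lo]
```

Running the exact sequence from the mail (500, 12, 499, free 12, 13), a 2MB request fails without the retry and succeeds with it, landing in the freed 12MB hole.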
Re: [RFC PATCH] drivers: iommu: reset cached node if dma_mask is changed
Hi Robin,

On 5/7/20, Robin Murphy wrote:
> On 2020-05-04 7:37 pm, Ajay Kumar wrote:
>> The current IOVA allocation code stores a cached copy of the
>> first allocated IOVA address node, and all the subsequent allocations
>> have no way to get past (higher than) the first allocated IOVA range.
>
> Strictly they do, after that first allocation gets freed, or if the
> first limit was <=32 bits and the subsequent limit >32 bits ;)

In my case, the first allocated buffer is the firmware buffer, and it's not released.

>> This causes issue when dma_mask for the master device is changed.
>> Though the DMA window is increased, the allocation code, unaware of
>> the change, goes ahead allocating IOVA addresses lower than the
>> first allocated IOVA address.
>>
>> This patch adds a check for dma_mask change in the IOVA allocation
>> function and resets the cached IOVA node to the anchor node every time
>> a dma_mask change is observed.
>
> This isn't the right approach, since limit_pfn is by design a transient
> per-allocation thing. Devices with different limits may well be
> allocating from the same IOVA domain concurrently, which is the whole
> reason for maintaining two cached nodes to serve the expected PCI case
> of mixing 32-bit and 64-bit limits. Trying to track a per-allocation
> property on a per-domain basis is just going to thrash and massively
> hurt such cases.

Agreed. But if two or more non-PCI devices share the same domain, even though one of the devices has a higher addressable limit, the current logic forces it to take the lower window.

> A somewhat more appropriate fix to the allocation loop itself has been
> proposed here:
>
> https://lore.kernel.org/linux-iommu/1588795317-20879-1-git-send-email-vji...@codeaurora.org/

This patch addresses my problem indirectly. I shall add my concerns there.
Ajay >> NOTE: >> This patch is needed to address the issue discussed in below thread: >> https://www.spinics.net/lists/iommu/msg43586.html >> >> Signed-off-by: Ajay Kumar >> Signed-off-by: Sathyam Panda >> --- >> drivers/iommu/iova.c | 17 - >> include/linux/iova.h | 1 + >> 2 files changed, 17 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c >> index 41c605b0058f..0e99975036ae 100644 >> --- a/drivers/iommu/iova.c >> +++ b/drivers/iommu/iova.c >> @@ -44,6 +44,7 @@ init_iova_domain(struct iova_domain *iovad, unsigned >> long granule, >> iovad->granule = granule; >> iovad->start_pfn = start_pfn; >> iovad->dma_32bit_pfn = 1UL << (32 - iova_shift(iovad)); >> +iovad->curr_limit_pfn = iovad->dma_32bit_pfn; >> iovad->max32_alloc_size = iovad->dma_32bit_pfn; >> iovad->flush_cb = NULL; >> iovad->fq = NULL; >> @@ -116,9 +117,20 @@ EXPORT_SYMBOL_GPL(init_iova_flush_queue); >> static struct rb_node * >> __get_cached_rbnode(struct iova_domain *iovad, unsigned long limit_pfn) >> { >> -if (limit_pfn <= iovad->dma_32bit_pfn) >> +if (limit_pfn <= iovad->dma_32bit_pfn) { >> +/* re-init cached node if DMA limit has changed */ >> +if (limit_pfn != iovad->curr_limit_pfn) { >> +iovad->cached32_node = &iovad->anchor.node; >> +iovad->curr_limit_pfn = limit_pfn; >> +} >> return iovad->cached32_node; >> +} >> >> +/* re-init cached node if DMA limit has changed */ >> +if (limit_pfn != iovad->curr_limit_pfn) { >> +iovad->cached_node = &iovad->anchor.node; >> +iovad->curr_limit_pfn = limit_pfn; >> +} >> return iovad->cached_node; >> } >> >> @@ -190,6 +202,9 @@ static int __alloc_and_insert_iova_range(struct >> iova_domain *iovad, >> if (size_aligned) >> align_mask <<= fls_long(size - 1); >> >> +if (limit_pfn != iovad->curr_limit_pfn) >> +iovad->max32_alloc_size = iovad->dma_32bit_pfn; >> + >> /* Walk the tree backwards */ >> spin_lock_irqsave(&iovad->iova_rbtree_lock, flags); >> if (limit_pfn <= iovad->dma_32bit_pfn && >> diff --git 
a/include/linux/iova.h b/include/linux/iova.h >> index a0637abffee8..be2220c096ef 100644 >> --- a/include/linux/iova.h >> +++ b/include/linux/iova.h >> @@ -73,6 +73,7 @@ struct iova_domain { >> unsigned long granule;/* pfn granularity for this domain */ >> unsigned long start_pfn; /* Lower limit for this domain */ >> unsigned long dma_32bit_pfn; >> +unsigned long curr_limit_pfn; /* Current max limit for this domain */ >> unsigned long max32_alloc_size; /* Size of last failed allocation */ >> struct iova_fq __percpu *fq;/* Flush Queue */ >> >> > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[RFC PATCH] drivers: iommu: reset cached node if dma_mask is changed
The current IOVA allocation code stores a cached copy of the first allocated IOVA address node, and all the subsequent allocations have no way to get past(higher than) the first allocated IOVA range. This causes issue when dma_mask for the master device is changed. Though the DMA window is increased, the allocation code unaware of the change, goes ahead allocating IOVA address lower than the first allocated IOVA address. This patch adds a check for dma_mask change in the IOVA allocation function and resets the cached IOVA node to anchor node everytime the dma_mask change is observed. NOTE: This patch is needed to address the issue discussed in below thread: https://www.spinics.net/lists/iommu/msg43586.html Signed-off-by: Ajay Kumar Signed-off-by: Sathyam Panda --- drivers/iommu/iova.c | 17 - include/linux/iova.h | 1 + 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 41c605b0058f..0e99975036ae 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -44,6 +44,7 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule, iovad->granule = granule; iovad->start_pfn = start_pfn; iovad->dma_32bit_pfn = 1UL << (32 - iova_shift(iovad)); + iovad->curr_limit_pfn = iovad->dma_32bit_pfn; iovad->max32_alloc_size = iovad->dma_32bit_pfn; iovad->flush_cb = NULL; iovad->fq = NULL; @@ -116,9 +117,20 @@ EXPORT_SYMBOL_GPL(init_iova_flush_queue); static struct rb_node * __get_cached_rbnode(struct iova_domain *iovad, unsigned long limit_pfn) { - if (limit_pfn <= iovad->dma_32bit_pfn) + if (limit_pfn <= iovad->dma_32bit_pfn) { + /* re-init cached node if DMA limit has changed */ + if (limit_pfn != iovad->curr_limit_pfn) { + iovad->cached32_node = &iovad->anchor.node; + iovad->curr_limit_pfn = limit_pfn; + } return iovad->cached32_node; + } + /* re-init cached node if DMA limit has changed */ + if (limit_pfn != iovad->curr_limit_pfn) { + iovad->cached_node = &iovad->anchor.node; + iovad->curr_limit_pfn = 
limit_pfn; + } return iovad->cached_node; } @@ -190,6 +202,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad, if (size_aligned) align_mask <<= fls_long(size - 1); + if (limit_pfn != iovad->curr_limit_pfn) + iovad->max32_alloc_size = iovad->dma_32bit_pfn; + /* Walk the tree backwards */ spin_lock_irqsave(&iovad->iova_rbtree_lock, flags); if (limit_pfn <= iovad->dma_32bit_pfn && diff --git a/include/linux/iova.h b/include/linux/iova.h index a0637abffee8..be2220c096ef 100644 --- a/include/linux/iova.h +++ b/include/linux/iova.h @@ -73,6 +73,7 @@ struct iova_domain { unsigned long granule;/* pfn granularity for this domain */ unsigned long start_pfn; /* Lower limit for this domain */ unsigned long dma_32bit_pfn; + unsigned long curr_limit_pfn; /* Current max limit for this domain */ unsigned long max32_alloc_size; /* Size of last failed allocation */ struct iova_fq __percpu *fq;/* Flush Queue */ -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
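The effect this RFC is chasing can be shown with a deliberately minimal model. This is a hedged sketch under strong simplifications: a single "allocate below here" hint stands in for the cached rbtree node, there is no overlap tracking, and the numbers are illustrative. It demonstrates that once a small dma_mask pins the hint low, a later, larger mask still yields low addresses until the hint is reset to the top (the anchor), which is what the patch's cached-node reset achieves.

```python
# Toy model: one cached hint, top-down allocation, no collision checks.

class HintedAllocator:
    def __init__(self, top):
        self.top = top
        self.hint = top  # cached node: next allocation goes below this

    def alloc(self, size, limit):
        # mimic the cached-node fast path: start from min(hint, limit)
        start = min(self.hint, limit)
        lo = start - size
        if lo < 0:
            return None
        self.hint = lo
        return lo

    def reset_hint(self):
        # the RFC's idea: on a dma_mask change, go back to the anchor
        self.hint = self.top
```

With a 0x1000_0000 limit the firmware buffer lands just below it; raising the limit to 0xC000_0000 without a reset still allocates below the firmware buffer, while after `reset_hint()` the next allocation lands just below the new limit.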
Re: IOVA allocation dependency between firmware buffer and remaining buffers
Can someone check this?

On Mon, Apr 20, 2020 at 9:24 PM Ajay kumar wrote:
>
> Hi All,
>
> I have an IOMMU master which has limitations as mentioned below:
> 1) The IOMMU master internally executes a firmware, and the firmware memory
> is allocated by the same master driver.
> The firmware buffer address should be of the lowest range than other address
> allocated by the device, or in other words, all the remaining buffer addresses
> should always be in a higher range than the firmware address.
> 2) None of the buffer addresses should go beyond 0xC000_
>
> example:
> If firmware buffer address is buf_fw = 0x8000_5000;
> All other addresses given to the device should be greater than
> (0x8000_5000 + firmware size) and less than 0xC000_
>
> Currently, this is being handled with one of the below hacks:
> 1) By keeping dma_mask in lower range while allocating firmware buffer,
> and then increasing the dma_mask to higher range for other buffers.
> 2) By reserving IOVA for firmware at the lowest range and creating direct
> mappings for the same.
>
> I want to know if there is a better way this can be handled with current
> framework, or if anybody is facing similar problems with their devices,
> please share how it is taken care.
>
> I also think there should be some way the masters can specify the IOVA
> range they want to limit to for current allocation.
> Something like a new iommu_ops callback like below:
> limit_iova_alloc_range(dev, iova_start, iova_end)
>
> And, in my driver, the sequence will be:
> limit_iova_alloc_range(dev, 0x_, 0x1000_); /* via helpers */
> alloc( ) firmware buffer using DMA API
> limit_iova_alloc_range(dev, 0x1000_, 0xC000_); /* via helpers */
> alloc( ) other buffers using DMA API
>
> Thanks,
> Ajay Kumar
IOVA allocation dependency between firmware buffer and remaining buffers
Hi All,

I have an IOMMU master which has limitations as mentioned below:

1) The IOMMU master internally executes a firmware, and the firmware memory
is allocated by the same master driver.
The firmware buffer address should be in a lower range than the other addresses
allocated for the device, or in other words, all the remaining buffer addresses
should always be in a higher range than the firmware address.

2) None of the buffer addresses should go beyond 0xC000_

example:
If firmware buffer address is buf_fw = 0x8000_5000;
All other addresses given to the device should be greater than
(0x8000_5000 + firmware size) and less than 0xC000_

Currently, this is being handled with one of the below hacks:
1) By keeping dma_mask in a lower range while allocating the firmware buffer,
and then increasing the dma_mask to a higher range for other buffers.
2) By reserving IOVA for firmware at the lowest range and creating direct
mappings for the same.

I want to know if there is a better way this can be handled with the current
framework, or if anybody is facing similar problems with their devices,
please share how it is taken care of.

I also think there should be some way the masters can specify the IOVA
range they want to limit to for the current allocation.
Something like a new iommu_ops callback like below:
limit_iova_alloc_range(dev, iova_start, iova_end)

And, in my driver, the sequence will be:
limit_iova_alloc_range(dev, 0x_, 0x1000_); /* via helpers */
alloc( ) firmware buffer using DMA API
limit_iova_alloc_range(dev, 0x1000_, 0xC000_); /* via helpers */
alloc( ) other buffers using DMA API

Thanks,
Ajay Kumar
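The proposed staged-limit sequence can be sketched with a toy allocator. This is a hedged illustration: `limit_iova_alloc_range` here is the hypothetical helper named in the mail, modeled as narrowing the window of a simple bottom-up first-fit allocator, and the addresses are example values, not a real API or platform layout.

```python
# Toy model of staged allocation: narrow the window for the firmware
# buffer so it lands lowest, then raise the window for everything else.

class ToyDomain:
    def __init__(self):
        self.lo, self.hi = 0x0, 0xFFFF_FFFF
        self.allocs = []  # (start, end) intervals

    def limit_iova_alloc_range(self, start, end):
        # the mail's proposed helper: clamp future allocations
        self.lo, self.hi = start, end

    def alloc(self, size):
        # first-fit, bottom-up, inside the current [lo, hi) window
        start = self.lo
        for a_start, a_end in sorted(self.allocs):
            if a_end <= start:
                continue
            if a_start >= start + size:
                break
            start = a_end
        if start + size > self.hi:
            return None
        self.allocs.append((start, start + size))
        return start

dom = ToyDomain()
# stage 1: firmware buffer must land in the lowest range
dom.limit_iova_alloc_range(0x0000_0000, 0x1000_0000)
fw = dom.alloc(0x5000)
# stage 2: all other buffers above the firmware, below 0xC000_0000
dom.limit_iova_alloc_range(0x1000_0000, 0xC000_0000)
buf = dom.alloc(0x10_0000)
```

Every later allocation is then guaranteed to sit above the firmware buffer and below the device's upper bound, which is the constraint described above.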