Re: [PATCH 0/3] mmu_notifier: Allow to manage CPU external TLBs
On Thu, 24 Jul 2014 16:35:38 +0200 Joerg Roedel wrote:

> here is a patch-set to extend the mmu_notifiers in the Linux kernel to
> allow managing CPU external TLBs. Those TLBs may be implemented in
> IOMMUs or any other external device, e.g. ATS/PRI capable PCI devices.
>
> The problem with managing these TLBs is the semantics of the
> invalidate_range_start/end call-backs currently available. Currently
> the subsystem using mmu_notifiers has to guarantee that no new TLB
> entries are established between invalidate_range_start/end.
> Furthermore the invalidate_range_start() function is called when all
> pages are still mapped and invalidate_range_end() when the pages are
> unmapped and already freed.
>
> So both call-backs can't be used to safely flush any non-CPU TLB
> because _start() is called too early and _end() too late.
>
> In the AMD IOMMUv2 driver this is currently implemented by assigning
> an empty page-table to the external device between _start() and
> _end(). But as tests have shown this doesn't work, as external devices
> don't re-fault infinitely but enter a failure state after some time.
>
> Another problem with this solution is that it causes an interrupt
> storm of IO page faults to be handled when an empty page-table is
> assigned.
>
> To solve this situation I wrote a patch-set to introduce a new
> notifier call-back: mmu_notifier_invalidate_range(). This notifier
> lifts the strict requirement that no new references are taken in the
> range between _start() and _end(). When the subsystem can't guarantee
> that no new references are taken, it has to provide the
> invalidate_range() call-back to clear any new references in there.
>
> It is called between invalidate_range_start() and _end() every time
> the VMM has to wipe out any references to a couple of pages. These are
> usually the places where the CPU TLBs are flushed too and where it is
> important that this happens before invalidate_range_end() is called.
> > Any comments and review appreciated!

It looks pretty simple and harmless. I assume the AMD IOMMUv2 driver
actually uses this and it's all tested and good? What is the status of
that driver?
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 3/3] iommu/ipmmu-vmsa: Invalidate TLB after unmapping
The TLB must be invalidated after unmapping memory to remove stale TLB
entries. This was supposed to be performed already, but a bug in the
driver prevented the TLB invalidate function from being called. Fix it.

Signed-off-by: Laurent Pinchart
---
 drivers/iommu/ipmmu-vmsa.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index eb605e5..a010e47 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -784,7 +784,6 @@ static int ipmmu_clear_mapping(struct ipmmu_vmsa_domain *domain,
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
-	int ret = 0;
 
 	if (!pgd)
 		return -EINVAL;
@@ -846,8 +845,7 @@ static int ipmmu_clear_mapping(struct ipmmu_vmsa_domain *domain,
 done:
 	spin_unlock_irqrestore(&domain->lock, flags);
 
-	if (ret)
-		ipmmu_tlb_invalidate(domain);
+	ipmmu_tlb_invalidate(domain);
 
 	return 0;
 }
-- 
1.8.5.5
[PATCH 1/3] iommu/ipmmu-vmsa: Cleanup failures of ARM mapping creation or attachment
The ARM IOMMU mapping needs to be released when attaching the device
fails. Add arm_iommu_release_mapping() to the error code path. This is
safe to call with a NULL mapping, so no specific check is needed.

Cleanup is also missing when failing to create a mapping. Jump to the
error code path in that case instead of returning immediately.

Signed-off-by: Laurent Pinchart
---
 drivers/iommu/ipmmu-vmsa.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 53cde08..7dc77a6 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -1090,7 +1090,8 @@ static int ipmmu_add_device(struct device *dev)
 					   SZ_1G, SZ_2G);
 	if (IS_ERR(mapping)) {
 		dev_err(mmu->dev, "failed to create ARM IOMMU mapping\n");
-		return PTR_ERR(mapping);
+		ret = PTR_ERR(mapping);
+		goto error;
 	}
 
 	mmu->mapping = mapping;
@@ -1106,6 +1107,7 @@ static int ipmmu_add_device(struct device *dev)
 	return 0;
 
 error:
+	arm_iommu_release_mapping(mmu->mapping);
 	kfree(dev->archdata.iommu);
 	dev->archdata.iommu = NULL;
 	iommu_group_remove_device(dev);
-- 
1.8.5.5
[PATCH 2/3] iommu/ipmmu-vmsa: Flush P[UM]D entry before freeing the child page table
When clearing PUD or PMD entries the child page table (if any) is freed
and the PUD or PMD entry is then cleared. This results in a small race
condition window during which a freed page table could be accessed by
the IPMMU. Fix it by clearing and flushing the PUD or PMD entry before
freeing the child page table.

Signed-off-by: Laurent Pinchart
---
 drivers/iommu/ipmmu-vmsa.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 7dc77a6..eb605e5 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -678,30 +678,32 @@ done:
 static void ipmmu_clear_pud(struct ipmmu_vmsa_device *mmu, pud_t *pud)
 {
-	/* Free the page table. */
 	pgtable_t table = pud_pgtable(*pud);
-	__free_page(table);
 
 	/* Clear the PUD. */
 	*pud = __pud(0);
 	ipmmu_flush_pgtable(mmu, pud, sizeof(*pud));
+
+	/* Free the page table. */
+	__free_page(table);
 }
 
 static void ipmmu_clear_pmd(struct ipmmu_vmsa_device *mmu, pud_t *pud,
 			    pmd_t *pmd)
 {
+	pmd_t pmdval = *pmd;
 	unsigned int i;
 
-	/* Free the page table. */
-	if (pmd_table(*pmd)) {
-		pgtable_t table = pmd_pgtable(*pmd);
-		__free_page(table);
-	}
-
 	/* Clear the PMD. */
 	*pmd = __pmd(0);
 	ipmmu_flush_pgtable(mmu, pmd, sizeof(*pmd));
 
+	/* Free the page table. */
+	if (pmd_table(pmdval)) {
+		pgtable_t table = pmd_pgtable(pmdval);
+		__free_page(table);
+	}
+
 	/* Check whether the PUD is still needed. */
 	pmd = pmd_offset(pud, 0);
 	for (i = 0; i < IPMMU_PTRS_PER_PMD; ++i) {
-- 
1.8.5.5
[PATCH 0/3] Renesas IPMMU-VMSA fixes
Hello,

This series fixes small issues with the ipmmu-vmsa driver. Please see
individual patches for details.

If not too late, I'd like these fixes to be merged in v3.17 (after
proper review of course, or, in the worst case, after lack of review).

Laurent Pinchart (3):
  iommu/ipmmu-vmsa: Cleanup failures of ARM mapping creation or
    attachment
  iommu/ipmmu-vmsa: Flush P[UM]D entry before freeing the child page
    table
  iommu/ipmmu-vmsa: Invalidate TLB after unmapping

 drivers/iommu/ipmmu-vmsa.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

-- 
Regards,
Laurent Pinchart
Re: [PATCH 0/3] mmu_notifier: Allow to manage CPU external TLBs
On Thu, Jul 24, 2014 at 04:35:38PM +0200, Joerg Roedel wrote:
> To solve this situation I wrote a patch-set to introduce a new
> notifier call-back: mmu_notifier_invalidate_range(). This notifier
> lifts the strict requirement that no new references are taken in the
> range between _start() and _end(). When the subsystem can't guarantee
> that no new references are taken, it has to provide the
> invalidate_range() call-back to clear any new references in there.
>
> It is called between invalidate_range_start() and _end() every time
> the VMM has to wipe out any references to a couple of pages. These are
> usually the places where the CPU TLBs are flushed too and where it is
> important that this happens before invalidate_range_end() is called.
>
> Any comments and review appreciated!

Reviewed-by: Andrea Arcangeli
[PATCH 0/3] mmu_notifier: Allow to manage CPU external TLBs
Hi,

here is a patch-set to extend the mmu_notifiers in the Linux kernel to
allow managing CPU external TLBs. Those TLBs may be implemented in
IOMMUs or any other external device, e.g. ATS/PRI capable PCI devices.

The problem with managing these TLBs is the semantics of the
invalidate_range_start/end call-backs currently available. Currently
the subsystem using mmu_notifiers has to guarantee that no new TLB
entries are established between invalidate_range_start/end. Furthermore
the invalidate_range_start() function is called when all pages are
still mapped and invalidate_range_end() when the pages are unmapped and
already freed.

So both call-backs can't be used to safely flush any non-CPU TLB
because _start() is called too early and _end() too late.

In the AMD IOMMUv2 driver this is currently implemented by assigning an
empty page-table to the external device between _start() and _end().
But as tests have shown this doesn't work, as external devices don't
re-fault infinitely but enter a failure state after some time.

Another problem with this solution is that it causes an interrupt storm
of IO page faults to be handled when an empty page-table is assigned.

To solve this situation I wrote a patch-set to introduce a new notifier
call-back: mmu_notifier_invalidate_range(). This notifier lifts the
strict requirement that no new references are taken in the range
between _start() and _end(). When the subsystem can't guarantee that no
new references are taken, it has to provide the invalidate_range()
call-back to clear any new references in there.

It is called between invalidate_range_start() and _end() every time the
VMM has to wipe out any references to a couple of pages. These are
usually the places where the CPU TLBs are flushed too and where it is
important that this happens before invalidate_range_end() is called.

Any comments and review appreciated!
Thanks,

	Joerg

Joerg Roedel (3):
  mmu_notifier: Add mmu_notifier_invalidate_range()
  mmu_notifier: Call mmu_notifier_invalidate_range() from VMM
  mmu_notifier: Add the call-back for mmu_notifier_invalidate_range()

 include/linux/mmu_notifier.h | 66
 kernel/events/uprobes.c      |  2 +-
 mm/fremap.c                  |  2 +-
 mm/huge_memory.c             |  9 +++---
 mm/hugetlb.c                 |  7 -
 mm/ksm.c                     |  4 +--
 mm/memory.c                  |  3 +-
 mm/migrate.c                 |  3 +-
 mm/mmu_notifier.c            | 15 ++
 mm/rmap.c                    |  2 +-
 10 files changed, 95 insertions(+), 18 deletions(-)

-- 
1.9.1
[PATCH 2/3] mmu_notifier: Call mmu_notifier_invalidate_range() from VMM
From: Joerg Roedel

Add calls to the new mmu_notifier_invalidate_range() function to all
places of the VMM that need it.

Signed-off-by: Joerg Roedel
---
 include/linux/mmu_notifier.h | 28
 kernel/events/uprobes.c      |  2 +-
 mm/fremap.c                  |  2 +-
 mm/huge_memory.c             |  9 +
 mm/hugetlb.c                 |  7 ++-
 mm/ksm.c                     |  4 ++--
 mm/memory.c                  |  3 ++-
 mm/migrate.c                 |  3 ++-
 mm/rmap.c                    |  2 +-
 9 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index f333668..6959dc8 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -273,6 +273,32 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
 	__young;							\
 })
 
+#define	ptep_clear_flush_notify(__vma, __address, __ptep)		\
+({									\
+	unsigned long ___addr = __address & PAGE_MASK;			\
+	struct mm_struct *___mm = (__vma)->vm_mm;			\
+	pte_t ___pte;							\
+									\
+	___pte = ptep_clear_flush(__vma, __address, __ptep);		\
+	mmu_notifier_invalidate_range(___mm, ___addr,			\
+				      ___addr + PAGE_SIZE);		\
+									\
+	___pte;								\
+})
+
+#define pmdp_clear_flush_notify(__vma, __haddr, __pmd)			\
+({									\
+	unsigned long ___haddr = __haddr & HPAGE_PMD_MASK;		\
+	struct mm_struct *___mm = (__vma)->vm_mm;			\
+	pmd_t ___pmd;							\
+									\
+	___pmd = pmdp_clear_flush(__vma, __haddr, __pmd);		\
+	mmu_notifier_invalidate_range(___mm, ___haddr,			\
+				      ___haddr + HPAGE_PMD_SIZE);	\
+									\
+	___pmd;								\
+})
+
 /*
  * set_pte_at_notify() sets the pte _after_ running the notifier.
  * This is safe to start by updating the secondary MMUs, because the primary MMU
@@ -346,6 +372,8 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
 #define	ptep_clear_flush_young_notify ptep_clear_flush_young
 #define	pmdp_clear_flush_young_notify pmdp_clear_flush_young
+#define	ptep_clear_flush_notify ptep_clear_flush
+#define	pmdp_clear_flush_notify pmdp_clear_flush
 #define set_pte_at_notify set_pte_at
 
 #endif /* CONFIG_MMU_NOTIFIER */
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 6f3254e..642262d 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -186,7 +186,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	}
 
 	flush_cache_page(vma, addr, pte_pfn(*ptep));
-	ptep_clear_flush(vma, addr, ptep);
+	ptep_clear_flush_notify(vma, addr, ptep);
 	set_pte_at_notify(mm, addr, ptep, mk_pte(kpage, vma->vm_page_prot));
 
 	page_remove_rmap(page);
diff --git a/mm/fremap.c b/mm/fremap.c
index 72b8fa3..9129013 100644
--- a/mm/fremap.c
+++ b/mm/fremap.c
@@ -37,7 +37,7 @@ static void zap_pte(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	if (pte_present(pte)) {
 		flush_cache_page(vma, addr, pte_pfn(pte));
-		pte = ptep_clear_flush(vma, addr, ptep);
+		pte = ptep_clear_flush_notify(vma, addr, ptep);
 		page = vm_normal_page(vma, addr, pte);
 		if (page) {
 			if (pte_dirty(pte))
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 33514d8..b322c97 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1031,7 +1031,7 @@ static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
 		goto out_free_pages;
 	VM_BUG_ON_PAGE(!PageHead(page), page);
 
-	pmdp_clear_flush(vma, haddr, pmd);
+	pmdp_clear_flush_notify(vma, haddr, pmd);
 	/* leave pmd empty until pte is filled */
 
 	pgtable = pgtable_trans_huge_withdraw(mm, pmd);
@@ -1168,7 +1168,7 @@ alloc:
 		pmd_t entry;
 		entry = mk_huge_pmd(new_page, vma->vm_page_prot);
 		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
-		pmdp_clear_flush(vma, haddr, pmd);
+		pmdp_clear_flush_notify(vma, haddr, pmd);
 		page
[PATCH 1/3] mmu_notifier: Add mmu_notifier_invalidate_range()
From: Joerg Roedel

This notifier closes an important gap with the current
invalidate_range_start()/end() notifiers. The _start() part is called
when all pages are still mapped while the _end() notifier is called
when all pages are potentially unmapped and already freed.

This does not allow managing external (non-CPU) hardware TLBs with
MMU-notifiers because there is no way to prevent the hardware from
establishing new TLB entries between the calls of these two functions.
But this is a requirement on the subsystem that implements these
existing notifiers.

To allow managing external TLBs the MMU-notifiers need to catch the
moment when pages are unmapped but not yet freed. This new notifier
catches that moment and notifies the interested subsystem when pages
that were unmapped are about to be freed. The new notifier will only be
called between invalidate_range_start()/end().

Signed-off-by: Joerg Roedel
---
 include/linux/mmu_notifier.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index deca874..f333668 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -228,6 +228,11 @@ static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
 		__mmu_notifier_invalidate_range_start(mm, start, end);
 }
 
+static inline void mmu_notifier_invalidate_range(struct mm_struct *mm,
+				  unsigned long start, unsigned long end)
+{
+}
+
 static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm,
 				  unsigned long start, unsigned long end)
 {
@@ -321,6 +326,11 @@ static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
 {
 }
 
+static inline void mmu_notifier_invalidate_range(struct mm_struct *mm,
+				  unsigned long start, unsigned long end)
+{
+}
+
 static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm,
 				  unsigned long start, unsigned long end)
 {
-- 
1.9.1
[PATCH 3/3] mmu_notifier: Add the call-back for mmu_notifier_invalidate_range()
From: Joerg Roedel

Now that the mmu_notifier_invalidate_range() calls are in place, add
the call-back to allow subsystems to register against it.

Signed-off-by: Joerg Roedel
---
 include/linux/mmu_notifier.h | 28 ++--
 mm/mmu_notifier.c            | 15 +++
 2 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 6959dc8..50dc679 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -95,11 +95,11 @@ struct mmu_notifier_ops {
 	/*
 	 * invalidate_range_start() and invalidate_range_end() must be
 	 * paired and are called only when the mmap_sem and/or the
-	 * locks protecting the reverse maps are held. The subsystem
-	 * must guarantee that no additional references are taken to
-	 * the pages in the range established between the call to
-	 * invalidate_range_start() and the matching call to
-	 * invalidate_range_end().
+	 * locks protecting the reverse maps are held. If the subsystem
+	 * can't guarantee that no additional references are taken to
+	 * the pages in the range, it has to implement the
+	 * invalidate_range() notifier to remove any references taken
+	 * after invalidate_range_start().
 	 *
 	 * Invalidation of multiple concurrent ranges may be
 	 * optionally permitted by the driver. Either way the
@@ -110,9 +110,19 @@ struct mmu_notifier_ops {
 	 * invalidate_range_start() is called when all pages in the
 	 * range are still mapped and have at least a refcount of one.
 	 *
+	 * invalidate_range() is called between invalidate_range_start()
+	 * and invalidate_range_end() when the memory management code
+	 * removed mappings to pages in the range and is about to free
+	 * them. This captures the point when pages are unmapped but
+	 * not yet freed.
+	 * Note that invalidate_range() might be called only on a
+	 * sub-range of the range passed to the corresponding
+	 * invalidate_range_start() call.
+	 *
 	 * invalidate_range_end() is called when all pages in the
 	 * range have been unmapped and the pages have been freed by
-	 * the VM.
+	 * the VM. It might be called under the ptl spin-lock, so this
+	 * notifier is not allowed to preempt.
 	 *
 	 * The VM will remove the page table entries and potentially
 	 * the page between invalidate_range_start() and
@@ -138,6 +148,8 @@ struct mmu_notifier_ops {
 	void (*invalidate_range_start)(struct mmu_notifier *mn,
 				       struct mm_struct *mm,
 				       unsigned long start, unsigned long end);
+	void (*invalidate_range)(struct mmu_notifier *mn, struct mm_struct *mm,
+				 unsigned long start, unsigned long end);
 	void (*invalidate_range_end)(struct mmu_notifier *mn,
 				     struct mm_struct *mm,
 				     unsigned long start, unsigned long end);
@@ -182,6 +194,8 @@ extern void __mmu_notifier_invalidate_page(struct mm_struct *mm,
 					  unsigned long address);
 extern void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
 				  unsigned long start, unsigned long end);
+extern void __mmu_notifier_invalidate_range(struct mm_struct *mm,
+				  unsigned long start, unsigned long end);
 extern void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
 				  unsigned long start, unsigned long end);
 
@@ -231,6 +245,8 @@ static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
 static inline void mmu_notifier_invalidate_range(struct mm_struct *mm,
 				  unsigned long start, unsigned long end)
 {
+	if (mm_has_notifiers(mm))
+		__mmu_notifier_invalidate_range(mm, start, end);
 }
 
 static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm,
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 41cefdf..d1bdea0 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -165,6 +165,21 @@ void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
 }
 EXPORT_SYMBOL_GPL(__mmu_notifier_invalidate_range_start);
 
+void __mmu_notifier_invalidate_range(struct mm_struct *mm,
+				  unsigned long start, unsigned long end)
+{
+	struct mmu_notifier *mn;
+	int id;
+
+	id = srcu_read_lock(&srcu);
+	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
+		if (mn->ops->invalidate_range)
+			mn->ops->invalidate_range(mn, mm, start, end);
+	}
+	srcu_read_unlock(&srcu, id);
+}
+EXPORT_SYMBOL_GPL(__mmu_notifier_invalidate_range);
+
 void __mmu_not
Re: [PATCH v2 1/1] iommu-api: Add map_range/unmap_range functions
On Wed, Jul 23, 2014 at 10:49:55AM -0700, Olav Haugan wrote:
> On 7/22/2014 12:45 AM, Thierry Reding wrote:
> > On Mon, Jul 21, 2014 at 05:59:22PM -0700, Olav Haugan wrote:
> >> On 7/17/2014 1:21 AM, Thierry Reding wrote:
> >>> On Wed, Jul 16, 2014 at 06:01:57PM -0700, Olav Haugan wrote:
> > [...]
> Additionally, the mapping operation would be faster in general since
> clients do not have to keep calling the map API over and over again
> for each physically contiguous chunk of memory that needs to be
> mapped to a virtually contiguous region.
>
> Signed-off-by: Olav Haugan
> ---
>  drivers/iommu/iommu.c | 48
>  include/linux/iommu.h | 25 +
>  2 files changed, 73 insertions(+)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 1698360..a0eebb7 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1089,6 +1089,54 @@ size_t iommu_unmap(struct iommu_domain *domain,
> unsigned long iova, size_t size)
>  EXPORT_SYMBOL_GPL(iommu_unmap);
>
> +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
> >>>
> >>> Maybe iova should be dma_addr_t? Or at least unsigned long? And
> >>> perhaps iommu_map_sg() would be more consistent with the
> >>> equivalent function in struct dma_ops?
> >>>
> +struct scatterlist *sg, unsigned int len, int opt)
> >>>
> >>> The length argument seems to be the size of the mapping. Again,
> >>> the struct dma_ops function uses this argument to denote the
> >>> number of entries in the scatterlist.
> >>>
> >>> opt is somewhat opaque. Perhaps this should be turned into
> >>> unsigned long flags? Although given that there aren't any users
> >>> yet it's difficult to say what's best here. Perhaps the addition
> >>> of this argument should be postponed until there are actual users?
> >>
> >> I am thinking something like this:
> >>
> >> int iommu_map_sg(struct iommu_domain *domain, struct scatterlist *sg,
> >> unsigned int nents, int prot, unsigned long flags);
> >> int iommu_unmap_sg(struct iommu_domain *domain, struct scatterlist *sg,
> >> unsigned int nents, unsigned long flags);
> >
> > Looks good.
> >
> >> The iova is contained within sg so we don't need that argument really
> >
> > I'm not sure. I think a common use-case for this function is for some
> > driver to map an imported DMA-BUF. In that case you'll get a struct
> > sg_table, but I think it won't have sg.dma_address set to anything
> > useful. So if we don't have iova as a parameter to this function, the
> > driver would have to make it a two-step process, like this:
> >
> >	sg_dma_address(sg) = iova;
> >
> >	err = iommu_map_sg(...);
> >
> > And drivers that use the IOMMU API directly need to manage IOVA space
> > themselves anyway, so I think passing around the IOVA within the SG
> > won't be a very common case. It will almost always be the driver that
> > calls this function which allocates the IOVA range.
>
> Yes, I see your point. Rob is not a fan either...
> So what about this:
>
> int iommu_map_sg(struct iommu_domain *domain, unsigned long iova, struct
> scatterlist *sg, unsigned int nents, int prot, unsigned long flags);
> int iommu_unmap_sg(struct iommu_domain *domain, unsigned long iova,
> size_t size, unsigned long flags);
>
> No need for sg in the unmap call then. Keeping iova an unsigned long to
> match the existing iommu_map/iommu_unmap calls.

Looks good to me. I think we may want to eventually convert iova to
dma_addr_t since that's a more appropriate type for these addresses,
but we can do that in a separate patch later on.

[...]
> > For .map_sg() I think pretty much every driver will want to
> > implement it, so requiring developers to explicitly set it for
> > their driver seems like a good idea. If there's no advantage in
> > implementing it then they can always get the same functionality by
> > explicitly using the fallback.
>
> I feel that requiring drivers to set the default callback defeats the
> purpose of having a fallback in the first place. The reason to provide
> the default fallback is to catch any driver that does not implement
> this themselves.

Certainly, but the exact same can be achieved by making all drivers
point to the generic implementation. In my opinion that makes it much
more explicit (and therefore obvious) what's really going on. Hiding
fallbacks in the core API obfuscates it. And this is all in-tree code,
so we have full control over what we change, so the patch that
introduces this new API could simply make the necessary adjustments to
existing drivers.

Thierry
Re: [PATCH v2 1/1] iommu-api: Add map_range/unmap_range functions
On Thu, Jul 24, 2014 at 11:34:27AM +0200, Joerg Roedel wrote:
> On Wed, Jul 23, 2014 at 10:49:55AM -0700, Olav Haugan wrote:
> > Joerg, can you comment on what you envisioned when you suggested
> > that we add the fallback?
>
> The problem is that we already have tons of IOMMU drivers in the tree
> which don't provide these call-backs. So adding this API extension
> without a fall-back that works for these drivers too would fragment
> the functionality between different IOMMU drivers in an unacceptable
> way and undermine the purpose of a generic API.

But we only care about in-tree drivers anyway, so we can equally well
just point all drivers to the generic implementation in the same patch
that adds this new function. The end result will be the same, but it
will keep the core function simpler (and more consistent with the other
core functions).

Thierry
Re: [PATCH v2 1/1] iommu-api: Add map_range/unmap_range functions
On Wed, Jul 23, 2014 at 10:49:55AM -0700, Olav Haugan wrote:
> Joerg, can you comment on what you envisioned when you suggested that
> we add the fallback?

The problem is that we already have tons of IOMMU drivers in the tree
which don't provide these call-backs. So adding this API extension
without a fall-back that works for these drivers too would fragment the
functionality between different IOMMU drivers in an unacceptable way
and undermine the purpose of a generic API.

	Joerg