Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
> /**
>  * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
>  * @dev: the device
> @@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  			  unsigned long flags, void *drvdata)
>  {
> +	int i, ret;
> +	struct io_mm *io_mm = NULL;
>  	struct iommu_domain *domain;
> +	struct iommu_bond *bond = NULL, *tmp;
>  	struct iommu_param *dev_param = dev->iommu_param;
>
>  	domain = iommu_get_domain_for_dev(dev);
> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>  		return -EINVAL;
>
> -	return -ENOSYS; /* TODO */
> +	/* If an io_mm already exists, use it */
> +	spin_lock(&iommu_sva_lock);
> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
> +			continue;
> +
> +		/* Is it already bound to this device? */
> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
> +			if (tmp->dev != dev)
> +				continue;
> +
> +			bond = tmp;
> +			refcount_inc(&bond->refs);
> +			io_mm_put_locked(io_mm);
> +			break;
> +		}
> +		break;
> +	}
> +	spin_unlock(&iommu_sva_lock);
> +
> +	if (bond)

Please return the PASID when you find an io_mm that is already bound.
Something like *pasid = io_mm->pasid should do the work here when bond
is true.

> +		return 0;

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
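Sinan's suggested fix can be sketched as a small user-space model. The kernel's IDR, locking and refcounting are replaced here by a plain linked list, and `bind_existing`, the `io_mm` fields and the return values are illustrative names, not the patch's actual API:

```c
#include <assert.h>
#include <stddef.h>

struct io_mm {
	int pasid;
	void *mm;
	struct io_mm *next;
};

/* Simplified model of the reuse path in iommu_sva_bind_device():
 * when an io_mm for this mm is already bound, the caller must still
 * learn its PASID, so write it out before returning success. */
static int bind_existing(struct io_mm *list, void *mm, int *pasid)
{
	for (struct io_mm *cur = list; cur; cur = cur->next) {
		if (cur->mm == mm) {
			*pasid = cur->pasid;	/* the fix Sinan asks for */
			return 0;
		}
	}
	return -1;	/* no existing io_mm; caller allocates a new one */
}
```

Without the `*pasid` write, the early-return path would report success while leaving the caller's PASID uninitialized, which is exactly the bug being pointed out.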
Re: [PATCH 11/12] swiotlb: move the SWIOTLB config symbol to lib/Kconfig
On Mon, Apr 23, 2018 at 07:04:18PM +0200, Christoph Hellwig wrote:
> This way we have one central definition of it, and user can select it as
> needed. Note that we also add a second ARCH_HAS_SWIOTLB symbol to
> indicate the architecture supports swiotlb at all, so that we can still
> make the usage optional for a few architectures that want this feature
> to be user selectable.
>
> Signed-off-by: Christoph Hellwig

Hmm, this looks like we end up with NEED_SG_DMA_LENGTH=y on ARM by
default, which probably isn't a good idea - ARM pre-dates the dma_length
parameter in scatterlists, and I don't think all code is guaranteed to
do the right thing if this is enabled. For example,
arch/arm/mach-rpc/dma.c doesn't use the dma_length member of struct
scatterlist.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
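Russell's concern can be illustrated with a simplified user-space model of the kernel's `sg_dma_len()` behaviour (the names below are ours, not the kernel's actual macros). With the config symbol on, the mapped length lives in a separate field, so code that only ever fills in the CPU-side length silently reports a zero DMA length:

```c
#include <assert.h>

/* Model of a scatterlist entry: with NEED_SG_DMA_LENGTH the DMA length
 * is a distinct field set by dma_map_sg(); without it, sg_dma_len()
 * aliases the CPU-side length. Old drivers that never touch dma_length
 * break the moment the config symbol is turned on. */
struct scatterlist_model {
	unsigned int length;		/* CPU-side length, set at sg init */
	unsigned int dma_length;	/* DMA length, set by dma_map_sg() */
};

#define NEED_SG_DMA_LENGTH 1	/* what the patch would enable on ARM */

static unsigned int sg_dma_len_model(const struct scatterlist_model *sg)
{
#if NEED_SG_DMA_LENGTH
	return sg->dma_length;
#else
	return sg->length;
#endif
}
```

An entry whose mapping path forgot to set `dma_length` reports length 0 under this configuration, which is the class of breakage Russell worries about in arch/arm/mach-rpc/dma.c.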
Re: [PATCH v4 05/22] iommu: introduce iommu invalidate API function
On Fri, 20 Apr 2018 19:19:54 +0100
Jean-Philippe Brucker wrote:

> Hi Jacob,
>
> On Mon, Apr 16, 2018 at 10:48:54PM +0100, Jacob Pan wrote:
> [...]
> > +/**
> > + * enum iommu_inv_granularity - Generic invalidation granularity
> > + *
> > + * When an invalidation request is sent to IOMMU to flush translation
> > + * caches, it may carry different granularity. These granularity levels
> > + * are not specific to a type of translation cache. For an example,
> > + * PASID selective granularity is only applicable to PASID cache
> > + * invalidation.
>
> I'm still confused by this, I think we should add more definitions
> because architectures tend to use different names. What you call
> "Translations caches" encompasses all caches that can be invalidated
> with this request, right? So all of:
>
yes correct.

> * "TLB" and "DTLB" that cache IOVA->GPA and GPA->PA (TLB is in the
>   IOMMU, DTLB is an ATC in an endpoint),
> * "PASID cache" that cache PASID->Translation Table,
> * "Context cache" that cache RID->PASID table
>
> Does this match the model you're using?
>
yes. PASID cache and context caches are in the IOMMU.

> The last name is a bit unfortunate. Since the Arm architecture uses
> the name "context" for what a PASID points to, "Device cache" would
> suit us better but it's not important.
>
or call it device context cache. actually so far context cache is here
only for completeness purpose. the expected use case is that QEMU traps
guest device context cache flush and call bind_pasid_table.

> I don't understand what you mean by "PASID selective granularity is
> only applicable to PASID cache invalidation", it seems to contradict
> the preceding sentence.

You are right. That was a mistake. I meant to say "These granularity
levels are specific to a type of"

> What if user sends an invalidation with IOMMU_INV_TYPE_TLB and
> IOMMU_INV_GRANU_ALL_PASID? Doesn't this remove from the TLBs all
> entries with the given PASID?
>
No, this meant to invalidate all PASID of a given domain ID. I need to
correct the description. The dilemma here is to map model specific
fields into generic list. not all combinations are legal.

> > + * This enum is a collection of granularities for all types of
> > + * translation caches. The idea is to make it easy for IOMMU model
> > + * specific driver do conversion from generic to model specific value.
> > + */
> > +enum iommu_inv_granularity {
>
> In patch 9, inv_type_granu_map has some valid fields with granularity
> == 0. Does it mean "invalidate all caches"?
>
> I don't think user should ever be allowed to invalidate caches entries
> of devices and domains it doesn't own.
>
Agreed, I removed global granu to avoid device invalidation beyond
device itself. But I missed some of the fields in inv_type_granu_map{}.

> > +	IOMMU_INV_GRANU_DOMAIN = 1,	/* all TLBs associated with a domain */
> > +	IOMMU_INV_GRANU_DEVICE,		/* caching structure associated with a
> > +					 * device ID
> > +					 */
> > +	IOMMU_INV_GRANU_DOMAIN_PAGE,	/* address range with a domain */
> > +	IOMMU_INV_GRANU_ALL_PASID,	/* cache of a given PASID */
>
> If this corresponds to QI_GRAN_ALL_ALL in patch 9, the comment should
> be "Cache of all PASIDs"? Or maybe "all entries for all PASIDs"? Is it
> different from GRANU_DOMAIN then?

QI_GRAN_ALL_ALL maps to VT-d spec 6.5.2.4, which invalidates all ext
TLB cache within a domain. It could reuse GRANU_DOMAIN but I was also
trying to match the naming convention in the spec.

> > +	IOMMU_INV_GRANU_PASID_SEL,	/* only invalidate specified PASID */
> > +
> > +	IOMMU_INV_GRANU_NG_ALL_PASID,	/* non-global within all PASIDs */
> > +	IOMMU_INV_GRANU_NG_PASID,	/* non-global within a PASIDs */
>
> Are the "NG" variant needed since there is a
> IOMMU_INVALIDATE_GLOBAL_PAGE below? We should drop either flag or
> granule.
>
> FWIW I'm starting to think more granule options is actually better
> than flags, because it flattens the combinations and keeps them to two
> dimensions, that we can understand and explain with a table.
>
> > +	IOMMU_INV_GRANU_PAGE_PASID,	/* page-selective within a PASID */
>
> Maybe this should be called "NG_PAGE_PASID",

Sure. I was thinking page range already implies non-global pages.

> and "DOMAIN_PAGE" should instead be "PAGE_PASID". If I understood
> their meaning correctly, it would be more consistent with the rest.
>
I am trying not to mix granu between request w/ PASID and w/o.
DOMAIN_PAGE meant to be for request w/o PASID.

> > +	IOMMU_INV_NR_GRANU,
> > +};
> > +
> > +/** enum iommu_inv_type - Generic translation cache types for invalidation
> > + *
> > + * Invalidation requests sent to IOMMU may indicate which translation cache
> > + * to be operated on.
> > + * Combined with enum iommu_inv_granularity, model specific driver
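Jean-Philippe's "two dimensions, that we can understand and explain with a table" remark can be made concrete with a small sketch. The enum and table names below loosely follow the draft enums under discussion but are illustrative only; the point is that a (cache type × granularity) validity table lets the core reject illegal combinations before any model-specific conversion:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative cache types and granularities (a reduced set, not the
 * proposed UAPI). */
enum inv_type  { INV_TLB, INV_DEV_TLB, INV_PASID_CACHE, INV_NR_TYPE };
enum inv_granu { GR_DOMAIN, GR_PASID_SEL, GR_PAGE_PASID, INV_NR_GRANU };

/* Two-dimensional validity table: a row per cache type, a column per
 * granularity. Unlisted entries default to false (illegal). */
static const bool inv_valid[INV_NR_TYPE][INV_NR_GRANU] = {
	[INV_TLB]         = { [GR_DOMAIN] = true, [GR_PASID_SEL] = true,
	                      [GR_PAGE_PASID] = true },
	[INV_DEV_TLB]     = { [GR_PASID_SEL] = true, [GR_PAGE_PASID] = true },
	[INV_PASID_CACHE] = { [GR_PASID_SEL] = true },
};

/* Reject combinations that are not legal before converting the generic
 * request into a model-specific (e.g. VT-d QI) descriptor. */
static bool inv_request_ok(enum inv_type t, enum inv_granu g)
{
	return t < INV_NR_TYPE && g < INV_NR_GRANU && inv_valid[t][g];
}
```

This flattening is what replaces the flag/granule mix: every legal request is one cell in the table, so "not all combinations are legal" becomes a lookup rather than scattered special cases.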
Re: [PATCH 11/12] swiotlb: move the SWIOTLB config symbol to lib/Kconfig
On Mon, Apr 23, 2018 at 07:04:18PM +0200, Christoph Hellwig wrote:
> This way we have one central definition of it, and user can select it as
> needed. Note that we also add a second ARCH_HAS_SWIOTLB symbol to
> indicate the architecture supports swiotlb at all, so that we can still
> make the usage optional for a few architectures that want this feature
> to be user selectable.

If I follow this select business this will enable it on ARM and x86 by
default. As such:

Reviewed-by: Konrad Rzeszutek Wilk

Thank you!

> Signed-off-by: Christoph Hellwig
> ---
>  arch/arm/Kconfig                | 4 +---
>  arch/arm64/Kconfig              | 5 ++---
>  arch/ia64/Kconfig               | 9 +
>  arch/mips/Kconfig               | 3 +++
>  arch/mips/cavium-octeon/Kconfig | 5 -
>  arch/mips/loongson64/Kconfig    | 8 
>  arch/powerpc/Kconfig            | 9 -
>  arch/unicore32/mm/Kconfig       | 5 -
>  arch/x86/Kconfig                | 14 +++---
>  lib/Kconfig                     | 15 +++
>  10 files changed, 25 insertions(+), 52 deletions(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 90b81a3a28a7..f91f69174630 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -106,6 +106,7 @@ config ARM
>  	select REFCOUNT_FULL
>  	select RTC_LIB
>  	select SYS_SUPPORTS_APM_EMULATION
> +	select ARCH_HAS_SWIOTLB
>  	# Above selects are sorted alphabetically; please add new ones
>  	# according to that.  Thanks.
>  	help
> @@ -1773,9 +1774,6 @@ config SECCOMP
>  	  and the task is only allowed to execute a few safe syscalls
>  	  defined by each seccomp mode.
>
> -config SWIOTLB
> -	bool
> -
>  config PARAVIRT
>  	bool "Enable paravirtualization code"
>  	help
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 4d924eb32e7f..056bc7365adf 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -21,6 +21,7 @@ config ARM64
>  	select ARCH_HAS_SG_CHAIN
>  	select ARCH_HAS_STRICT_KERNEL_RWX
>  	select ARCH_HAS_STRICT_MODULE_RWX
> +	select ARCH_HAS_SWIOTLB
>  	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
>  	select ARCH_HAVE_NMI_SAFE_CMPXCHG
>  	select ARCH_INLINE_READ_LOCK if !PREEMPT
> @@ -144,6 +145,7 @@ config ARM64
>  	select POWER_SUPPLY
>  	select REFCOUNT_FULL
>  	select SPARSE_IRQ
> +	select SWIOTLB
>  	select SYSCTL_EXCEPTION_TRACE
>  	select THREAD_INFO_IN_TASK
>  	help
> @@ -239,9 +241,6 @@ config HAVE_GENERIC_GUP
>  config SMP
>  	def_bool y
>
> -config SWIOTLB
> -	def_bool y
> -
>  config KERNEL_MODE_NEON
>  	def_bool y
>
> diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
> index 685d557eea48..d396230913e6 100644
> --- a/arch/ia64/Kconfig
> +++ b/arch/ia64/Kconfig
> @@ -56,6 +56,7 @@ config IA64
>  	select HAVE_ARCH_AUDITSYSCALL
>  	select NEED_DMA_MAP_STATE
>  	select NEED_SG_DMA_LENGTH
> +	select ARCH_HAS_SWIOTLB
>  	default y
>  	help
>  	  The Itanium Processor Family is Intel's 64-bit successor to
> @@ -80,9 +81,6 @@ config MMU
>  	bool
>  	default y
>
> -config SWIOTLB
> -	bool
> -
>  config STACKTRACE_SUPPORT
>  	def_bool y
>
> @@ -139,7 +137,6 @@ config IA64_GENERIC
>  	bool "generic"
>  	select NUMA
>  	select ACPI_NUMA
> -	select DMA_DIRECT_OPS
>  	select SWIOTLB
>  	select PCI_MSI
>  	help
> @@ -160,7 +157,6 @@ config IA64_GENERIC
>
>  config IA64_DIG
>  	bool "DIG-compliant"
> -	select DMA_DIRECT_OPS
>  	select SWIOTLB
>
>  config IA64_DIG_VTD
> @@ -176,7 +172,6 @@ config IA64_HP_ZX1
>
>  config IA64_HP_ZX1_SWIOTLB
>  	bool "HP-zx1/sx1000 with software I/O TLB"
> -	select DMA_DIRECT_OPS
>  	select SWIOTLB
>  	help
>  	  Build a kernel that runs on HP zx1 and sx1000 systems even when they
> @@ -200,7 +195,6 @@ config IA64_SGI_UV
>  	bool "SGI-UV"
>  	select NUMA
>  	select ACPI_NUMA
> -	select DMA_DIRECT_OPS
>  	select SWIOTLB
>  	help
>  	  Selecting this option will optimize the kernel for use on UV based
> @@ -211,7 +205,6 @@ config IA64_SGI_UV
>
>  config IA64_HP_SIM
>  	bool "Ski-simulator"
> -	select DMA_DIRECT_OPS
>  	select SWIOTLB
>  	depends on !PM
>
> diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
> index e10cc5c7be69..b6b4c1e154f8 100644
> --- a/arch/mips/Kconfig
> +++ b/arch/mips/Kconfig
> @@ -912,6 +912,8 @@ config CAVIUM_OCTEON_SOC
>  	select MIPS_NR_CPU_NR_MAP_1024
>  	select BUILTIN_DTB
>  	select MTD_COMPLEX_MAPPINGS
> +	select ARCH_HAS_SWIOTLB
> +	select SWIOTLB
>  	select SYS_SUPPORTS_RELOCATABLE
>  	help
>  	  This option supports all of the Octeon reference boards from Cavium
> @@ -1367,6 +1369,7 @@ config CPU_LOONGSON3
>  	select MIPS_PGD_C0_CONTEXT
>  	select MIPS_L1_CACHE_SHIFT_6
>  	select
Re: [PATCH 10/12] arm: don't build swiotlb by default
On Mon, Apr 23, 2018 at 07:04:17PM +0200, Christoph Hellwig wrote:
> swiotlb is only used as a library of helper for xen-swiotlb if Xen support
> is enabled on arm, so don't build it by default.

CCing Stefano

> Signed-off-by: Christoph Hellwig
> ---
>  arch/arm/Kconfig | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index aa1c187d756d..90b81a3a28a7 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1774,7 +1774,7 @@ config SECCOMP
>  	  defined by each seccomp mode.
>
>  config SWIOTLB
> -	def_bool y
> +	bool
>
>  config PARAVIRT
>  	bool "Enable paravirtualization code"
> @@ -1807,6 +1807,7 @@ config XEN
>  	depends on MMU
>  	select ARCH_DMA_ADDR_T_64BIT
>  	select ARM_PSCI
> +	select SWIOTLB
>  	select SWIOTLB_XEN
>  	select PARAVIRT
>  	help
> -- 
> 2.17.0
Re: [PATCH] iommu/iova: Update cached node pointer when current node fails to get any free IOVA
On Mon, Apr 23, 2018 at 10:07 PM, Robin Murphy wrote:
> On 19/04/18 18:12, Ganapatrao Kulkarni wrote:
>>
>> The performance drop is observed with long hours iperf testing using
>> 40G cards. This is mainly due to long iterations in finding the free
>> iova range in 32bit address space.
>>
>> In the current implementation for 64bit PCI devices, there is always a
>> first attempt to allocate an iova from the 32bit (SAC preferred over
>> DAC) address range. Once we run out of the 32bit range, there is
>> allocation from the higher range; however, due to the cached32_node
>> optimization this is not supposed to be painful. cached32_node always
>> points to the most recently allocated 32-bit node. When the address
>> range is full, it will be pointing to the last allocated node (leaf
>> node), so walking the rbtree to find the available range is not an
>> expensive affair. However this optimization does not behave well when
>> one of the middle nodes is freed. In that case cached32_node is
>> updated to point to the next iova range. The next iova allocation will
>> consume that free range and again update cached32_node to itself. From
>> then on, walking over the 32-bit range is more expensive.
>>
>> This patch adds a fix to update the cached node to the leaf node when
>> there is no free iova range left, which avoids unnecessarily long
>> iterations.
>
> The only trouble with this is that "allocation failed" doesn't uniquely
> mean "space full". Say that after some time the 32-bit space ends up
> empty except for one page at 0x1000 and one at 0x8000, then
> somebody tries to allocate 2GB. If we move the cached node down to the
> leftmost entry when that fails, all subsequent allocation attempts are
> now going to fail despite the space being 99.% free!
>
> I can see a couple of ways to solve that general problem of free space
> above the cached node getting lost, but neither of them helps with the
> case where there is genuinely insufficient space (and if anything would
> make it even slower). In terms of the optimisation you want here, i.e.
> fail fast when an allocation cannot possibly succeed, the only reliable
> idea which comes to mind is free-PFN accounting. I might give that a go
> myself to see how ugly it looks.

i see 2 problems in current implementation,
1. We don't replenish the 32 bits range, until first attempt of second
   allocation(64 bit) fails.
2. Having per cpu cache might not yield good hit on platforms with more
   number of CPUs.

however irrespective of current issues, It makes sense to update the
cached node as done in this patch, when there is a failure to get an
iova range using the current cached pointer, which otherwise forces
unnecessary, time-consuming do-while iterations until a replenish
happens!

thanks
Ganapat

> Robin.
>
>> Signed-off-by: Ganapatrao Kulkarni
>> ---
>>   drivers/iommu/iova.c | 6 ++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index 83fe262..e6ee2ea 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -201,6 +201,12 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>>   	} while (curr && new_pfn <= curr_iova->pfn_hi);
>>
>>   	if (limit_pfn < size || new_pfn < iovad->start_pfn) {
>> +		/* No more cached node points to free hole, update to leaf node. */
>> +		struct iova *prev_iova;
>> +
>> +		prev_iova = rb_entry(prev, struct iova, node);
>> +		__cached_rbnode_insert_update(iovad, prev_iova);
>>   		spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
>>   		return -ENOMEM;
>>   	}
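The "free-PFN accounting" idea Robin floats can be sketched as follows. All names here are illustrative; the model only shows the fast-fail bookkeeping, with the real rbtree walk elided:

```c
#include <assert.h>
#include <stdbool.h>

/* Keep a running count of free PFNs in the 32-bit space so that an
 * allocation which cannot possibly succeed fails immediately, without
 * walking the rbtree at all. */
struct iova_space_model {
	unsigned long free_pfns;	/* updated on every alloc/free */
};

static bool alloc_fast_fail(struct iova_space_model *s, unsigned long size)
{
	if (size > s->free_pfns)
		return false;	/* guaranteed failure: bail out early */
	/* ...otherwise fall through to the real rbtree search; on
	 * success the accounting is updated: */
	s->free_pfns -= size;
	return true;
}

static void free_iova_model(struct iova_space_model *s, unsigned long size)
{
	s->free_pfns += size;
}
```

Note the asymmetry: `free_pfns >= size` does not guarantee success (the free space may be fragmented, as in Robin's two-stray-pages example), so the counter only short-circuits the guaranteed-failure case; the rbtree walk still decides the rest.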
[PATCH 11/12] swiotlb: move the SWIOTLB config symbol to lib/Kconfig
This way we have one central definition of it, and user can select it as
needed. Note that we also add a second ARCH_HAS_SWIOTLB symbol to
indicate the architecture supports swiotlb at all, so that we can still
make the usage optional for a few architectures that want this feature
to be user selectable.

Signed-off-by: Christoph Hellwig
---
 arch/arm/Kconfig                | 4 +---
 arch/arm64/Kconfig              | 5 ++---
 arch/ia64/Kconfig               | 9 +
 arch/mips/Kconfig               | 3 +++
 arch/mips/cavium-octeon/Kconfig | 5 -
 arch/mips/loongson64/Kconfig    | 8 
 arch/powerpc/Kconfig            | 9 -
 arch/unicore32/mm/Kconfig       | 5 -
 arch/x86/Kconfig                | 14 +++---
 lib/Kconfig                     | 15 +++
 10 files changed, 25 insertions(+), 52 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 90b81a3a28a7..f91f69174630 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -106,6 +106,7 @@ config ARM
 	select REFCOUNT_FULL
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
+	select ARCH_HAS_SWIOTLB
 	# Above selects are sorted alphabetically; please add new ones
 	# according to that.  Thanks.
 	help
@@ -1773,9 +1774,6 @@ config SECCOMP
 	  and the task is only allowed to execute a few safe syscalls
 	  defined by each seccomp mode.
 
-config SWIOTLB
-	bool
-
 config PARAVIRT
 	bool "Enable paravirtualization code"
 	help
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 4d924eb32e7f..056bc7365adf 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -21,6 +21,7 @@ config ARM64
 	select ARCH_HAS_SG_CHAIN
 	select ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_HAS_STRICT_MODULE_RWX
+	select ARCH_HAS_SWIOTLB
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select ARCH_INLINE_READ_LOCK if !PREEMPT
@@ -144,6 +145,7 @@ config ARM64
 	select POWER_SUPPLY
 	select REFCOUNT_FULL
 	select SPARSE_IRQ
+	select SWIOTLB
 	select SYSCTL_EXCEPTION_TRACE
 	select THREAD_INFO_IN_TASK
 	help
@@ -239,9 +241,6 @@ config HAVE_GENERIC_GUP
 config SMP
 	def_bool y
 
-config SWIOTLB
-	def_bool y
-
 config KERNEL_MODE_NEON
 	def_bool y
 
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 685d557eea48..d396230913e6 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -56,6 +56,7 @@ config IA64
 	select HAVE_ARCH_AUDITSYSCALL
 	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
+	select ARCH_HAS_SWIOTLB
 	default y
 	help
 	  The Itanium Processor Family is Intel's 64-bit successor to
@@ -80,9 +81,6 @@ config MMU
 	bool
 	default y
 
-config SWIOTLB
-	bool
-
 config STACKTRACE_SUPPORT
 	def_bool y
 
@@ -139,7 +137,6 @@ config IA64_GENERIC
 	bool "generic"
 	select NUMA
 	select ACPI_NUMA
-	select DMA_DIRECT_OPS
 	select SWIOTLB
 	select PCI_MSI
 	help
@@ -160,7 +157,6 @@ config IA64_GENERIC
 
 config IA64_DIG
 	bool "DIG-compliant"
-	select DMA_DIRECT_OPS
 	select SWIOTLB
 
 config IA64_DIG_VTD
@@ -176,7 +172,6 @@ config IA64_HP_ZX1
 
 config IA64_HP_ZX1_SWIOTLB
 	bool "HP-zx1/sx1000 with software I/O TLB"
-	select DMA_DIRECT_OPS
 	select SWIOTLB
 	help
 	  Build a kernel that runs on HP zx1 and sx1000 systems even when they
@@ -200,7 +195,6 @@ config IA64_SGI_UV
 	bool "SGI-UV"
 	select NUMA
 	select ACPI_NUMA
-	select DMA_DIRECT_OPS
 	select SWIOTLB
 	help
 	  Selecting this option will optimize the kernel for use on UV based
@@ -211,7 +205,6 @@ config IA64_SGI_UV
 
 config IA64_HP_SIM
 	bool "Ski-simulator"
-	select DMA_DIRECT_OPS
 	select SWIOTLB
 	depends on !PM
 
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index e10cc5c7be69..b6b4c1e154f8 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -912,6 +912,8 @@ config CAVIUM_OCTEON_SOC
 	select MIPS_NR_CPU_NR_MAP_1024
 	select BUILTIN_DTB
 	select MTD_COMPLEX_MAPPINGS
+	select ARCH_HAS_SWIOTLB
+	select SWIOTLB
 	select SYS_SUPPORTS_RELOCATABLE
 	help
 	  This option supports all of the Octeon reference boards from Cavium
@@ -1367,6 +1369,7 @@ config CPU_LOONGSON3
 	select MIPS_PGD_C0_CONTEXT
 	select MIPS_L1_CACHE_SHIFT_6
 	select GPIOLIB
+	select ARCH_HAS_SWIOTLB
 	help
 	  The Loongson 3 processor implements the MIPS64R2 instruction set
 	  with many extensions.
diff --git a/arch/mips/cavium-octeon/Kconfig b/arch/mips/cavium-octeon/Kconfig
index 5d73041547a7..4984e462be30 100644
--- a/arch/mips/cavium-octeon/Kconfig
+++ b/arch/mips/cavium-octeon/Kconfig
@@ -67,11 +67,6
[PATCH 12/12] swiotlb: remove the CONFIG_DMA_DIRECT_OPS ifdefs
swiotlb now selects the DMA_DIRECT_OPS config symbol, so this will
always be true.

Signed-off-by: Christoph Hellwig
---
 lib/swiotlb.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index fece57566d45..6954f7ad200a 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -692,7 +692,6 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
 	}
 }
 
-#ifdef CONFIG_DMA_DIRECT_OPS
 static inline bool dma_coherent_ok(struct device *dev, dma_addr_t addr,
 		size_t size)
 {
@@ -764,7 +763,6 @@ static bool swiotlb_free_buffer(struct device *dev, size_t size,
 				 DMA_ATTR_SKIP_CPU_SYNC);
 	return true;
 }
-#endif
 
 static void swiotlb_full(struct device *dev, size_t size,
 		enum dma_data_direction dir,
@@ -1045,7 +1043,6 @@ swiotlb_dma_supported(struct device *hwdev, u64 mask)
 	return __phys_to_dma(hwdev, io_tlb_end - 1) <= mask;
 }
 
-#ifdef CONFIG_DMA_DIRECT_OPS
 void *swiotlb_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		gfp_t gfp, unsigned long attrs)
 {
@@ -1089,4 +1086,3 @@ const struct dma_map_ops swiotlb_dma_ops = {
 	.unmap_page = swiotlb_unmap_page,
 	.dma_supported = dma_direct_supported,
 };
-#endif /* CONFIG_DMA_DIRECT_OPS */
-- 
2.17.0
[PATCH 09/12] PCI: remove CONFIG_PCI_BUS_ADDR_T_64BIT
This symbol is now always identical to CONFIG_ARCH_DMA_ADDR_T_64BIT, so
remove it.

Signed-off-by: Christoph Hellwig
Acked-by: Bjorn Helgaas
---
 drivers/pci/Kconfig | 4 
 drivers/pci/bus.c   | 4 ++--
 include/linux/pci.h | 2 +-
 3 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 34b56a8f8480..29a487f31dae 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -5,10 +5,6 @@
 
 source "drivers/pci/pcie/Kconfig"
 
-config PCI_BUS_ADDR_T_64BIT
-	def_bool y if (ARCH_DMA_ADDR_T_64BIT || 64BIT)
-	depends on PCI
-
 config PCI_MSI
 	bool "Message Signaled Interrupts (MSI and MSI-X)"
 	depends on PCI
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index bc2ded4c451f..35b7fc87eac5 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -120,7 +120,7 @@ int devm_request_pci_bus_resources(struct device *dev,
 EXPORT_SYMBOL_GPL(devm_request_pci_bus_resources);
 
 static struct pci_bus_region pci_32_bit = {0, 0xULL};
-#ifdef CONFIG_PCI_BUS_ADDR_T_64BIT
+#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
 static struct pci_bus_region pci_64_bit = {0, (pci_bus_addr_t) 0xULL};
 static struct pci_bus_region pci_high = {(pci_bus_addr_t) 0x1ULL,
 
@@ -230,7 +230,7 @@ int pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
 			 resource_size_t),
 		void *alignf_data)
 {
-#ifdef CONFIG_PCI_BUS_ADDR_T_64BIT
+#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
 	int rc;
 
 	if (res->flags & IORESOURCE_MEM_64) {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 73178a2fcee0..55371cb827ad 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -670,7 +670,7 @@ int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
 int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
 		  int reg, int len, u32 val);
 
-#ifdef CONFIG_PCI_BUS_ADDR_T_64BIT
+#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
 typedef u64 pci_bus_addr_t;
 #else
 typedef u32 pci_bus_addr_t;
-- 
2.17.0
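The effect on `pci_bus_addr_t` can be checked with a trivial user-space model (the `_model` names are ours; only the `#ifdef` logic mirrors the patch):

```c
#include <assert.h>
#include <stdint.h>

/* After the patch, the bus-address width follows ARCH_DMA_ADDR_T_64BIT
 * alone; flip this define to 0 to model a 32-bit-only configuration. */
#define CONFIG_ARCH_DMA_ADDR_T_64BIT 1

#if CONFIG_ARCH_DMA_ADDR_T_64BIT
typedef uint64_t pci_bus_addr_t_model;
#else
typedef uint32_t pci_bus_addr_t_model;
#endif
```

The removal is safe precisely because the old `PCI_BUS_ADDR_T_64BIT` condition had become equivalent to `ARCH_DMA_ADDR_T_64BIT` on every configuration that builds PCI, so the typedef's width never changes.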
[PATCH 07/12] arch: remove the ARCH_PHYS_ADDR_T_64BIT config symbol
Instead select the PHYS_ADDR_T_64BIT for 32-bit architectures that need
a 64-bit phys_addr_t type directly.

Signed-off-by: Christoph Hellwig
---
 arch/arc/Kconfig                       |  4 +---
 arch/arm/kernel/setup.c                |  2 +-
 arch/arm/mm/Kconfig                    |  4 +---
 arch/arm64/Kconfig                     |  3 ---
 arch/mips/Kconfig                      | 15 ++-
 arch/powerpc/Kconfig                   |  5 +
 arch/powerpc/platforms/Kconfig.cputype |  1 +
 arch/riscv/Kconfig                     |  6 ++
 arch/x86/Kconfig                       |  5 +
 mm/Kconfig                             |  2 +-
 10 files changed, 15 insertions(+), 32 deletions(-)

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index d76bf4a83740..f94c61da682a 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -453,13 +453,11 @@ config ARC_HAS_PAE40
 	default n
 	depends on ISA_ARCV2
 	select HIGHMEM
+	select PHYS_ADDR_T_64BIT
 	help
 	  Enable access to physical memory beyond 4G, only supported on
 	  ARC cores with 40 bit Physical Addressing support
 
-config ARCH_PHYS_ADDR_T_64BIT
-	def_bool ARC_HAS_PAE40
-
 config ARCH_DMA_ADDR_T_64BIT
 	bool
 
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index fc40a2b40595..35ca494c028c 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -754,7 +754,7 @@ int __init arm_add_memory(u64 start, u64 size)
 	else
 		size -= aligned_start - start;
 
-#ifndef CONFIG_ARCH_PHYS_ADDR_T_64BIT
+#ifndef CONFIG_PHYS_ADDR_T_64BIT
 	if (aligned_start > ULONG_MAX) {
 		pr_crit("Ignoring memory at 0x%08llx outside 32-bit physical address space\n",
 			(long long)start);
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 7f14acf67caf..2f77c6344ef1 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -661,6 +661,7 @@ config ARM_LPAE
 	bool "Support for the Large Physical Address Extension"
 	depends on MMU && CPU_32v7 && !CPU_32v6 && !CPU_32v5 && \
 		!CPU_32v4 && !CPU_32v3
+	select PHYS_ADDR_T_64BIT
 	help
 	  Say Y if you have an ARMv7 processor supporting the LPAE page
 	  table format and you would like to access memory beyond the
@@ -673,9 +674,6 @@ config ARM_PV_FIXUP
 	def_bool y
 	depends on ARM_LPAE && ARM_PATCH_PHYS_VIRT && ARCH_KEYSTONE
 
-config ARCH_PHYS_ADDR_T_64BIT
-	def_bool ARM_LPAE
-
 config ARCH_DMA_ADDR_T_64BIT
 	bool
 
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 940adfb9a2bc..b6aa33e642cc 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -152,9 +152,6 @@ config ARM64
 config 64BIT
 	def_bool y
 
-config ARCH_PHYS_ADDR_T_64BIT
-	def_bool y
-
 config MMU
 	def_bool y
 
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 47d72c64d687..985388078872 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -132,7 +132,7 @@ config MIPS_GENERIC
 
 config MIPS_ALCHEMY
 	bool "Alchemy processor based machines"
-	select ARCH_PHYS_ADDR_T_64BIT
+	select PHYS_ADDR_T_64BIT
 	select CEVT_R4K
 	select CSRC_R4K
 	select IRQ_MIPS_CPU
@@ -890,7 +890,7 @@ config CAVIUM_OCTEON_SOC
 	bool "Cavium Networks Octeon SoC based boards"
 	select CEVT_R4K
 	select ARCH_HAS_PHYS_TO_DMA
-	select ARCH_PHYS_ADDR_T_64BIT
+	select PHYS_ADDR_T_64BIT
 	select DMA_COHERENT
 	select SYS_SUPPORTS_64BIT_KERNEL
 	select SYS_SUPPORTS_BIG_ENDIAN
@@ -936,7 +936,7 @@ config NLM_XLR_BOARD
 	select SWAP_IO_SPACE
 	select SYS_SUPPORTS_32BIT_KERNEL
 	select SYS_SUPPORTS_64BIT_KERNEL
-	select ARCH_PHYS_ADDR_T_64BIT
+	select PHYS_ADDR_T_64BIT
 	select SYS_SUPPORTS_BIG_ENDIAN
 	select SYS_SUPPORTS_HIGHMEM
 	select DMA_COHERENT
@@ -962,7 +962,7 @@ config NLM_XLP_BOARD
 	select HW_HAS_PCI
 	select SYS_SUPPORTS_32BIT_KERNEL
 	select SYS_SUPPORTS_64BIT_KERNEL
-	select ARCH_PHYS_ADDR_T_64BIT
+	select PHYS_ADDR_T_64BIT
 	select GPIOLIB
 	select SYS_SUPPORTS_BIG_ENDIAN
 	select SYS_SUPPORTS_LITTLE_ENDIAN
@@ -1102,7 +1102,7 @@ config FW_CFE
 	bool
 
 config ARCH_DMA_ADDR_T_64BIT
-	def_bool (HIGHMEM && ARCH_PHYS_ADDR_T_64BIT) || 64BIT
+	def_bool (HIGHMEM && PHYS_ADDR_T_64BIT) || 64BIT
 
 config ARCH_SUPPORTS_UPROBES
 	bool
@@ -1767,7 +1767,7 @@ config CPU_MIPS32_R5_XPA
 	depends on SYS_SUPPORTS_HIGHMEM
 	select XPA
 	select HIGHMEM
-	select ARCH_PHYS_ADDR_T_64BIT
+	select PHYS_ADDR_T_64BIT
 	default n
 	help
 	  Choose this option if you want to enable the Extended Physical
@@ -2399,9 +2399,6 @@ config SB1_PASS_2_1_WORKAROUNDS
 	default y
 
-config ARCH_PHYS_ADDR_T_64BIT
-	bool
-
 choice
 	prompt "SmartMIPS or microMIPS ASE support"
 
diff
[PATCH 06/12] dma-mapping: move the NEED_DMA_MAP_STATE config symbol to lib/Kconfig
This way we have one central definition of it, and user can select it as
needed. Note that we now also always select it when CONFIG_DMA_API_DEBUG
is selected, which fixes some incorrect checks in a few network drivers.

Signed-off-by: Christoph Hellwig
Reviewed-by: Anshuman Khandual
---
 arch/alpha/Kconfig          | 4 +---
 arch/arm/Kconfig            | 4 +---
 arch/arm64/Kconfig          | 4 +---
 arch/ia64/Kconfig           | 4 +---
 arch/mips/Kconfig           | 3 ---
 arch/parisc/Kconfig         | 4 +---
 arch/s390/Kconfig           | 4 +---
 arch/sh/Kconfig             | 4 +---
 arch/sparc/Kconfig          | 4 +---
 arch/unicore32/Kconfig      | 4 +---
 arch/x86/Kconfig            | 6 ++
 drivers/iommu/Kconfig       | 1 +
 include/linux/dma-mapping.h | 2 +-
 lib/Kconfig                 | 3 +++
 lib/Kconfig.debug           | 1 +
 15 files changed, 17 insertions(+), 35 deletions(-)

diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index 8e6a67ecf069..1fd9645b0c67 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -10,6 +10,7 @@ config ALPHA
 	select HAVE_OPROFILE
 	select HAVE_PCSPKR_PLATFORM
 	select HAVE_PERF_EVENTS
+	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
 	select VIRT_TO_BUS
 	select GENERIC_IRQ_PROBE
@@ -68,9 +69,6 @@ config ZONE_DMA
 config ARCH_DMA_ADDR_T_64BIT
 	def_bool y
 
-config NEED_DMA_MAP_STATE
-	def_bool y
-
 config GENERIC_ISA_DMA
 	bool
 	default y
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 602c8320282f..aa1c187d756d 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -96,6 +96,7 @@ config ARM
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN
 	select IRQ_FORCED_THREADING
 	select MODULES_USE_ELF_REL
+	select NEED_DMA_MAP_STATE
 	select NO_BOOTMEM
 	select OF_EARLY_FLATTREE if OF
 	select OF_RESERVED_MEM if OF
@@ -221,9 +222,6 @@ config ARCH_MAY_HAVE_PC_FDC
 config ZONE_DMA
 	bool
 
-config NEED_DMA_MAP_STATE
-	def_bool y
-
 config ARCH_SUPPORTS_UPROBES
 	def_bool y
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3b441c5587f1..940adfb9a2bc 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -133,6 +133,7 @@ config ARM64
 	select IRQ_FORCED_THREADING
 	select MODULES_USE_ELF_RELA
 	select MULTI_IRQ_HANDLER
+	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
 	select NO_BOOTMEM
 	select OF
@@ -241,9 +242,6 @@ config HAVE_GENERIC_GUP
 config ARCH_DMA_ADDR_T_64BIT
 	def_bool y
 
-config NEED_DMA_MAP_STATE
-	def_bool y
-
 config SMP
 	def_bool y
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 333917676f7f..0e42731adaf1 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -54,6 +54,7 @@ config IA64
 	select MODULES_USE_ELF_RELA
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select HAVE_ARCH_AUDITSYSCALL
+	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
 	default y
 	help
@@ -82,9 +83,6 @@ config MMU
 config ARCH_DMA_ADDR_T_64BIT
 	def_bool y
 
-config NEED_DMA_MAP_STATE
-	def_bool y
-
 config SWIOTLB
 	bool
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 225c95da23ce..47d72c64d687 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1122,9 +1122,6 @@ config DMA_NONCOHERENT
 	bool
 	select NEED_DMA_MAP_STATE
 
-config NEED_DMA_MAP_STATE
-	bool
-
 config SYS_HAS_EARLY_PRINTK
 	bool
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 89caea87556e..4d8f64d48597 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -51,6 +51,7 @@ config PARISC
 	select GENERIC_CLOCKEVENTS
 	select ARCH_NO_COHERENT_DMA_MMAP
 	select CPU_NO_EFFICIENT_FFS
+	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
 	help
@@ -112,9 +113,6 @@ config PM
 config STACKTRACE_SUPPORT
 	def_bool y
 
-config NEED_DMA_MAP_STATE
-	def_bool y
-
 config ISA_DMA_API
 	bool
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index f80c6b983159..89a007672f70 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -711,6 +711,7 @@ menuconfig PCI
 	select PCI_MSI
 	select IOMMU_HELPER
 	select IOMMU_SUPPORT
+	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
 	help
@@ -736,9 +737,6 @@ config PCI_DOMAINS
 config HAS_IOMEM
 	def_bool PCI
 
-config NEED_DMA_MAP_STATE
-	def_bool PCI
-
 config CHSC_SCH
 	def_tristate m
 	prompt "Support for CHSC subchannels"
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index e127e0cbe30f..9417f70e008e 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -50,6 +50,7 @@ config SUPERH
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_FUTEX_CMPXCHG if FUTEX
 	select HAVE_NMI
+	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
 	help
@@
[PATCH 10/12] arm: don't build swiotlb by default
swiotlb is only used as a library of helpers for xen-swiotlb if Xen support
is enabled on arm, so don't build it by default.

Signed-off-by: Christoph Hellwig
---
 arch/arm/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index aa1c187d756d..90b81a3a28a7 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1774,7 +1774,7 @@ config SECCOMP
 	defined by each seccomp mode.

 config SWIOTLB
-	def_bool y
+	bool

 config PARAVIRT
 	bool "Enable paravirtualization code"
@@ -1807,6 +1807,7 @@ config XEN
 	depends on MMU
 	select ARCH_DMA_ADDR_T_64BIT
 	select ARM_PSCI
+	select SWIOTLB
 	select SWIOTLB_XEN
 	select PARAVIRT
 	help
-- 
2.17.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 04/12] iommu-helper: move the IOMMU_HELPER config symbol to lib/
This way we have one central definition of it, and users can select it as
needed.

Signed-off-by: Christoph Hellwig
Reviewed-by: Anshuman Khandual
---
 arch/powerpc/Kconfig | 4 +---
 arch/s390/Kconfig    | 5 ++---
 arch/sparc/Kconfig   | 5 +
 arch/x86/Kconfig     | 6 ++
 lib/Kconfig          | 3 +++
 5 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 43e3c8e4e7f4..7698cf89af9c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -223,6 +223,7 @@ config PPC
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_VIRT_CPU_ACCOUNTING
 	select HAVE_IRQ_TIME_ACCOUNTING
+	select IOMMU_HELPER if PPC64
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
 	select MODULES_USE_ELF_RELA
@@ -478,9 +479,6 @@ config MPROFILE_KERNEL
 	depends on PPC64 && CPU_LITTLE_ENDIAN
 	def_bool !DISABLE_MPROFILE_KERNEL

-config IOMMU_HELPER
-	def_bool PPC64
-
 config SWIOTLB
 	bool "SWIOTLB support"
 	default n

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 199ac3e4da1d..60c4ab854182 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -709,7 +709,9 @@ config QDIO
 menuconfig PCI
 	bool "PCI support"
 	select PCI_MSI
+	select IOMMU_HELPER
 	select IOMMU_SUPPORT
+
 	help
 	  Enable PCI support.
@@ -733,9 +735,6 @@ config PCI_DOMAINS
 config HAS_IOMEM
 	def_bool PCI

-config IOMMU_HELPER
-	def_bool PCI
-
 config NEED_SG_DMA_LENGTH
 	def_bool PCI

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 8767e45f1b2b..44e0f3cd7988 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -67,6 +67,7 @@ config SPARC64
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_CONTEXT_TRACKING
 	select HAVE_DEBUG_KMEMLEAK
+	select IOMMU_HELPER
 	select SPARSE_IRQ
 	select RTC_DRV_CMOS
 	select RTC_DRV_BQ4802
@@ -106,10 +107,6 @@ config ARCH_DMA_ADDR_T_64BIT
 	bool
 	default y if ARCH_ATU

-config IOMMU_HELPER
-	bool
-	default y if SPARC64
-
 config STACKTRACE_SUPPORT
 	bool
 	default y if SPARC64

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cb2c7ecc1fea..fe9713539166 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -871,6 +871,7 @@ config DMI
 config GART_IOMMU
 	bool "Old AMD GART IOMMU support"
+	select IOMMU_HELPER
 	select SWIOTLB
 	depends on X86_64 && PCI && AMD_NB
 	---help---
@@ -892,6 +893,7 @@ config GART_IOMMU
 config CALGARY_IOMMU
 	bool "IBM Calgary IOMMU support"
+	select IOMMU_HELPER
 	select SWIOTLB
 	depends on X86_64 && PCI
 	---help---
@@ -929,10 +931,6 @@ config SWIOTLB
 	  with more than 3 GB of memory.
 	  If unsure, say Y.

-config IOMMU_HELPER
-	def_bool y
-	depends on CALGARY_IOMMU || GART_IOMMU
-
 config MAXSMP
 	bool "Enable Maximum number of SMP Processors and NUMA Nodes"
 	depends on X86_64 && SMP && DEBUG_KERNEL

diff --git a/lib/Kconfig b/lib/Kconfig
index 5fe577673b98..2f6908577534 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -429,6 +429,9 @@ config SGL_ALLOC
 	bool
 	default n

+config IOMMU_HELPER
+	bool
+
 config DMA_DIRECT_OPS
 	bool
 	depends on HAS_DMA && (!64BIT || ARCH_DMA_ADDR_T_64BIT)
-- 
2.17.0
[PATCH 03/12] iommu-helper: mark iommu_is_span_boundary as inline
This avoids selecting IOMMU_HELPER just for this function. And we only use
it once or twice in normal builds, so this is often even a size reduction.

Signed-off-by: Christoph Hellwig
---
 arch/alpha/Kconfig              | 3 ---
 arch/arm/Kconfig                | 3 ---
 arch/arm64/Kconfig              | 3 ---
 arch/ia64/Kconfig               | 3 ---
 arch/mips/cavium-octeon/Kconfig | 4
 arch/mips/loongson64/Kconfig    | 4
 arch/mips/netlogic/Kconfig      | 3 ---
 arch/powerpc/Kconfig            | 1 -
 arch/unicore32/mm/Kconfig       | 3 ---
 arch/x86/Kconfig                | 2 +-
 drivers/parisc/Kconfig          | 5 -
 include/linux/iommu-helper.h    | 13 ++---
 lib/iommu-helper.c              | 12 +---
 13 files changed, 12 insertions(+), 47 deletions(-)

diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index b2022885ced8..3ff735a722af 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -345,9 +345,6 @@ config PCI_DOMAINS
 config PCI_SYSCALL
 	def_bool PCI

-config IOMMU_HELPER
-	def_bool PCI
-
 config ALPHA_NONAME
 	bool
 	depends on ALPHA_BOOK1 || ALPHA_NONAME_CH

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index a7f8e7f4b88f..2f79222c5c02 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1781,9 +1781,6 @@ config SECCOMP
 config SWIOTLB
 	def_bool y

-config IOMMU_HELPER
-	def_bool SWIOTLB
-
 config PARAVIRT
 	bool "Enable paravirtualization code"
 	help

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index eb2cf4938f6d..fbef5d3de83f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -252,9 +252,6 @@ config SMP
 config SWIOTLB
 	def_bool y

-config IOMMU_HELPER
-	def_bool SWIOTLB
-
 config KERNEL_MODE_NEON
 	def_bool y

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index bbe12a038d21..862c5160c09d 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -613,6 +613,3 @@ source "security/Kconfig"
 source "crypto/Kconfig"

 source "lib/Kconfig"
-
-config IOMMU_HELPER
-	def_bool (IA64_HP_ZX1 || IA64_HP_ZX1_SWIOTLB || IA64_GENERIC || SWIOTLB)

diff --git a/arch/mips/cavium-octeon/Kconfig b/arch/mips/cavium-octeon/Kconfig
index b5eee1a57d6c..647ed158ac98 100644
--- a/arch/mips/cavium-octeon/Kconfig
+++ b/arch/mips/cavium-octeon/Kconfig
@@ -67,16 +67,12 @@ config CAVIUM_OCTEON_LOCK_L2_MEMCPY
 	help
 	  Lock the kernel's implementation of memcpy() into L2.

-config IOMMU_HELPER
-	bool
-
 config NEED_SG_DMA_LENGTH
 	bool

 config SWIOTLB
 	def_bool y
 	select DMA_DIRECT_OPS
-	select IOMMU_HELPER
 	select NEED_SG_DMA_LENGTH

 config OCTEON_ILM

diff --git a/arch/mips/loongson64/Kconfig b/arch/mips/loongson64/Kconfig
index 72af0c183969..5efb2e63878e 100644
--- a/arch/mips/loongson64/Kconfig
+++ b/arch/mips/loongson64/Kconfig
@@ -130,9 +130,6 @@ config LOONGSON_UART_BASE
 	default y
 	depends on EARLY_PRINTK || SERIAL_8250

-config IOMMU_HELPER
-	bool
-
 config NEED_SG_DMA_LENGTH
 	bool

@@ -141,7 +138,6 @@ config SWIOTLB
 	default y
 	depends on CPU_LOONGSON3
 	select DMA_DIRECT_OPS
-	select IOMMU_HELPER
 	select NEED_SG_DMA_LENGTH
 	select NEED_DMA_MAP_STATE

diff --git a/arch/mips/netlogic/Kconfig b/arch/mips/netlogic/Kconfig
index 7fcfc7fe9f14..5c5ee0e05a17 100644
--- a/arch/mips/netlogic/Kconfig
+++ b/arch/mips/netlogic/Kconfig
@@ -83,9 +83,6 @@ endif
 config NLM_COMMON
 	bool

-config IOMMU_HELPER
-	bool
-
 config NEED_SG_DMA_LENGTH
 	bool

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index c32a181a7cbb..43e3c8e4e7f4 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -484,7 +484,6 @@ config IOMMU_HELPER
 config SWIOTLB
 	bool "SWIOTLB support"
 	default n
-	select IOMMU_HELPER
 	---help---
 	  Support for IO bounce buffering for systems without an IOMMU.
 	  This allows us to DMA to the full physical address space on

diff --git a/arch/unicore32/mm/Kconfig b/arch/unicore32/mm/Kconfig
index e9154a59d561..3f105e00c432 100644
--- a/arch/unicore32/mm/Kconfig
+++ b/arch/unicore32/mm/Kconfig
@@ -44,9 +44,6 @@ config SWIOTLB
 	def_bool y
 	select DMA_DIRECT_OPS

-config IOMMU_HELPER
-	def_bool SWIOTLB
-
 config NEED_SG_DMA_LENGTH
 	def_bool SWIOTLB

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81f2c56..cb2c7ecc1fea 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -931,7 +931,7 @@ config SWIOTLB
 config IOMMU_HELPER
 	def_bool y
-	depends on CALGARY_IOMMU || GART_IOMMU || SWIOTLB || AMD_IOMMU
+	depends on CALGARY_IOMMU || GART_IOMMU

 config MAXSMP
 	bool "Enable Maximum number of SMP Processors and NUMA Nodes"
 	depends on X86_64 && SMP && DEBUG_KERNEL

diff --git a/drivers/parisc/Kconfig b/drivers/parisc/Kconfig
index 3a102a84d637..5a48b5606110 100644
--- a/drivers/parisc/Kconfig
+++ b/drivers/parisc/Kconfig
@@ -103,11
[PATCH 08/12] arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/Kconfig
Define this symbol if the architecture either uses 64-bit pointers or
PHYS_ADDR_T_64BIT is set. This covers 95% of the old arch magic. We only
need an additional select for Xen on ARM (why anyway?), and we now always
set ARCH_DMA_ADDR_T_64BIT on mips boards with 64-bit physical addressing
instead of only doing it when highmem is set.

Signed-off-by: Christoph Hellwig
---
 arch/alpha/Kconfig             | 3 ---
 arch/arc/Kconfig               | 3 ---
 arch/arm/mach-axxia/Kconfig    | 1 -
 arch/arm/mach-bcm/Kconfig      | 1 -
 arch/arm/mach-exynos/Kconfig   | 1 -
 arch/arm/mach-highbank/Kconfig | 1 -
 arch/arm/mach-rockchip/Kconfig | 1 -
 arch/arm/mach-shmobile/Kconfig | 1 -
 arch/arm/mach-tegra/Kconfig    | 1 -
 arch/arm/mm/Kconfig            | 3 ---
 arch/arm64/Kconfig             | 3 ---
 arch/ia64/Kconfig              | 3 ---
 arch/mips/Kconfig              | 3 ---
 arch/powerpc/Kconfig           | 3 ---
 arch/riscv/Kconfig             | 3 ---
 arch/s390/Kconfig              | 3 ---
 arch/sparc/Kconfig             | 4
 arch/x86/Kconfig               | 4
 lib/Kconfig                    | 3 +++
 19 files changed, 3 insertions(+), 42 deletions(-)

diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index 1fd9645b0c67..aa7df1a36fd0 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -66,9 +66,6 @@ config ZONE_DMA
 	bool
 	default y

-config ARCH_DMA_ADDR_T_64BIT
-	def_bool y
-
 config GENERIC_ISA_DMA
 	bool
 	default y

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index f94c61da682a..7498aca4b887 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -458,9 +458,6 @@ config ARC_HAS_PAE40
 	  Enable access to physical memory beyond 4G, only supported on
 	  ARC cores with 40 bit Physical Addressing support

-config ARCH_DMA_ADDR_T_64BIT
-	bool
-
 config ARC_KVADDR_SIZE
 	int "Kernel Virtual Address Space size (MB)"
 	range 0 512

diff --git a/arch/arm/mach-axxia/Kconfig b/arch/arm/mach-axxia/Kconfig
index bb2ce1c63fd9..d3eae6037913 100644
--- a/arch/arm/mach-axxia/Kconfig
+++ b/arch/arm/mach-axxia/Kconfig
@@ -2,7 +2,6 @@
 config ARCH_AXXIA
 	bool "LSI Axxia platforms"
 	depends on ARCH_MULTI_V7 && ARM_LPAE
-	select ARCH_DMA_ADDR_T_64BIT
 	select ARM_AMBA
 	select ARM_GIC
 	select ARM_TIMER_SP804

diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig
index c2f3b0d216a4..c46a728df44e 100644
--- a/arch/arm/mach-bcm/Kconfig
+++ b/arch/arm/mach-bcm/Kconfig
@@ -211,7 +211,6 @@ config ARCH_BRCMSTB
 	select BRCMSTB_L2_IRQ
 	select BCM7120_L2_IRQ
 	select ARCH_HAS_HOLES_MEMORYMODEL
-	select ARCH_DMA_ADDR_T_64BIT if ARM_LPAE
 	select ZONE_DMA if ARM_LPAE
 	select SOC_BRCMSTB
 	select SOC_BUS

diff --git a/arch/arm/mach-exynos/Kconfig b/arch/arm/mach-exynos/Kconfig
index 647c319f9f5f..2ca405816846 100644
--- a/arch/arm/mach-exynos/Kconfig
+++ b/arch/arm/mach-exynos/Kconfig
@@ -112,7 +112,6 @@ config SOC_EXYNOS5440
 	bool "SAMSUNG EXYNOS5440"
 	default y
 	depends on ARCH_EXYNOS5
-	select ARCH_DMA_ADDR_T_64BIT if ARM_LPAE
 	select HAVE_ARM_ARCH_TIMER
 	select AUTO_ZRELADDR
 	select PINCTRL_EXYNOS5440

diff --git a/arch/arm/mach-highbank/Kconfig b/arch/arm/mach-highbank/Kconfig
index 81110ec34226..5552968f07f8 100644
--- a/arch/arm/mach-highbank/Kconfig
+++ b/arch/arm/mach-highbank/Kconfig
@@ -1,7 +1,6 @@
 config ARCH_HIGHBANK
 	bool "Calxeda ECX-1000/2000 (Highbank/Midway)"
 	depends on ARCH_MULTI_V7
-	select ARCH_DMA_ADDR_T_64BIT if ARM_LPAE
 	select ARCH_HAS_HOLES_MEMORYMODEL
 	select ARCH_SUPPORTS_BIG_ENDIAN
 	select ARM_AMBA

diff --git a/arch/arm/mach-rockchip/Kconfig b/arch/arm/mach-rockchip/Kconfig
index a4065966881a..fafd3d7f9f8c 100644
--- a/arch/arm/mach-rockchip/Kconfig
+++ b/arch/arm/mach-rockchip/Kconfig
@@ -3,7 +3,6 @@ config ARCH_ROCKCHIP
 	depends on ARCH_MULTI_V7
 	select PINCTRL
 	select PINCTRL_ROCKCHIP
-	select ARCH_DMA_ADDR_T_64BIT if ARM_LPAE
 	select ARCH_HAS_RESET_CONTROLLER
 	select ARM_AMBA
 	select ARM_GIC

diff --git a/arch/arm/mach-shmobile/Kconfig b/arch/arm/mach-shmobile/Kconfig
index 280e7312a9e1..fe60cd09a5ca 100644
--- a/arch/arm/mach-shmobile/Kconfig
+++ b/arch/arm/mach-shmobile/Kconfig
@@ -29,7 +29,6 @@ config ARCH_RMOBILE
 menuconfig ARCH_RENESAS
 	bool "Renesas ARM SoCs"
 	depends on ARCH_MULTI_V7 && MMU
-	select ARCH_DMA_ADDR_T_64BIT if ARM_LPAE
 	select ARCH_SHMOBILE
 	select ARM_GIC
 	select GPIOLIB

diff --git a/arch/arm/mach-tegra/Kconfig b/arch/arm/mach-tegra/Kconfig
index 1e0aeb47bac6..7f3b83e0d324 100644
--- a/arch/arm/mach-tegra/Kconfig
+++ b/arch/arm/mach-tegra/Kconfig
@@ -15,6 +15,5 @@ menuconfig ARCH_TEGRA
 	select RESET_CONTROLLER
 	select SOC_BUS
 	select ZONE_DMA if ARM_LPAE
-	select ARCH_DMA_ADDR_T_64BIT
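[Editor's note: the central lib/Kconfig hunk is trimmed from the quoted diff above. Going by the commit message, the new shared definition would look roughly like the following sketch, which is a reconstruction rather than the verbatim hunk:]

```
config ARCH_DMA_ADDR_T_64BIT
	def_bool 64BIT || PHYS_ADDR_T_64BIT
```

This matches the description "define this symbol if the architecture either uses 64-bit pointers or PHYS_ADDR_T_64BIT is set", leaving Xen on ARM as the only remaining explicit select.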
[PATCH 05/12] scatterlist: move the NEED_SG_DMA_LENGTH config symbol to lib/Kconfig
This way we have one central definition of it, and users can select it as
needed.

Signed-off-by: Christoph Hellwig
Reviewed-by: Anshuman Khandual
---
 arch/alpha/Kconfig              | 4 +---
 arch/arm/Kconfig                | 3 ---
 arch/arm64/Kconfig              | 4 +---
 arch/hexagon/Kconfig            | 4 +---
 arch/ia64/Kconfig               | 4 +---
 arch/mips/cavium-octeon/Kconfig | 3 ---
 arch/mips/loongson64/Kconfig    | 3 ---
 arch/mips/netlogic/Kconfig      | 3 ---
 arch/parisc/Kconfig             | 4 +---
 arch/powerpc/Kconfig            | 4 +---
 arch/s390/Kconfig               | 4 +---
 arch/sh/Kconfig                 | 5 ++---
 arch/sparc/Kconfig              | 4 +---
 arch/unicore32/mm/Kconfig       | 5 +
 arch/x86/Kconfig                | 4 +---
 lib/Kconfig                     | 3 +++
 16 files changed, 15 insertions(+), 46 deletions(-)

diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index 3ff735a722af..8e6a67ecf069 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -10,6 +10,7 @@ config ALPHA
 	select HAVE_OPROFILE
 	select HAVE_PCSPKR_PLATFORM
 	select HAVE_PERF_EVENTS
+	select NEED_SG_DMA_LENGTH
 	select VIRT_TO_BUS
 	select GENERIC_IRQ_PROBE
 	select AUTO_IRQ_AFFINITY if SMP
@@ -70,9 +71,6 @@ config ARCH_DMA_ADDR_T_64BIT
 config NEED_DMA_MAP_STATE
 	def_bool y

-config NEED_SG_DMA_LENGTH
-	def_bool y
-
 config GENERIC_ISA_DMA
 	bool
 	default y

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 2f79222c5c02..602c8320282f 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -119,9 +119,6 @@ config ARM_HAS_SG_CHAIN
 	select ARCH_HAS_SG_CHAIN
 	bool

-config NEED_SG_DMA_LENGTH
-	bool
-
 config ARM_DMA_USE_IOMMU
 	bool
 	select ARM_HAS_SG_CHAIN

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fbef5d3de83f..3b441c5587f1 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -133,6 +133,7 @@ config ARM64
 	select IRQ_FORCED_THREADING
 	select MODULES_USE_ELF_RELA
 	select MULTI_IRQ_HANDLER
+	select NEED_SG_DMA_LENGTH
 	select NO_BOOTMEM
 	select OF
 	select OF_EARLY_FLATTREE
@@ -243,9 +244,6 @@ config ARCH_DMA_ADDR_T_64BIT
 config NEED_DMA_MAP_STATE
 	def_bool y

-config NEED_SG_DMA_LENGTH
-	def_bool y
-
 config SMP
 	def_bool y

diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index 76d2f20d525e..37adb2003033 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -19,6 +19,7 @@ config HEXAGON
 	select GENERIC_IRQ_SHOW
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_TRACEHOOK
+	select NEED_SG_DMA_LENGTH
 	select NO_IOPORT_MAP
 	select GENERIC_IOMAP
 	select GENERIC_SMP_IDLE_THREAD
@@ -63,9 +64,6 @@ config GENERIC_CSUM
 config GENERIC_IRQ_PROBE
 	def_bool y

-config NEED_SG_DMA_LENGTH
-	def_bool y
-
 config RWSEM_GENERIC_SPINLOCK
 	def_bool n

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 862c5160c09d..333917676f7f 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -54,6 +54,7 @@ config IA64
 	select MODULES_USE_ELF_RELA
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select HAVE_ARCH_AUDITSYSCALL
+	select NEED_SG_DMA_LENGTH
 	default y
 	help
 	  The Itanium Processor Family is Intel's 64-bit successor to
@@ -84,9 +85,6 @@ config ARCH_DMA_ADDR_T_64BIT
 config NEED_DMA_MAP_STATE
 	def_bool y

-config NEED_SG_DMA_LENGTH
-	def_bool y
-
 config SWIOTLB
 	bool

diff --git a/arch/mips/cavium-octeon/Kconfig b/arch/mips/cavium-octeon/Kconfig
index 647ed158ac98..5d73041547a7 100644
--- a/arch/mips/cavium-octeon/Kconfig
+++ b/arch/mips/cavium-octeon/Kconfig
@@ -67,9 +67,6 @@ config CAVIUM_OCTEON_LOCK_L2_MEMCPY
 	help
 	  Lock the kernel's implementation of memcpy() into L2.

-config NEED_SG_DMA_LENGTH
-	bool
-
 config SWIOTLB
 	def_bool y
 	select DMA_DIRECT_OPS

diff --git a/arch/mips/loongson64/Kconfig b/arch/mips/loongson64/Kconfig
index 5efb2e63878e..641a1477031e 100644
--- a/arch/mips/loongson64/Kconfig
+++ b/arch/mips/loongson64/Kconfig
@@ -130,9 +130,6 @@ config LOONGSON_UART_BASE
 	default y
 	depends on EARLY_PRINTK || SERIAL_8250

-config NEED_SG_DMA_LENGTH
-	bool
-
 config SWIOTLB
 	bool "Soft IOMMU Support for All-Memory DMA"
 	default y

diff --git a/arch/mips/netlogic/Kconfig b/arch/mips/netlogic/Kconfig
index 5c5ee0e05a17..412351c5acc6 100644
--- a/arch/mips/netlogic/Kconfig
+++ b/arch/mips/netlogic/Kconfig
@@ -83,7 +83,4 @@ endif
 config NLM_COMMON
 	bool

-config NEED_SG_DMA_LENGTH
-	bool
-
 endif

diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index fc5a574c3482..89caea87556e 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -51,6 +51,7 @@ config PARISC
 	select GENERIC_CLOCKEVENTS
 	select
[PATCH 02/12] iommu-helper: unexport iommu_area_alloc
This function is only used by built-in code.

Signed-off-by: Christoph Hellwig
Reviewed-by: Anshuman Khandual
---
 lib/iommu-helper.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
index 23633c0fda4a..ded1703e7e64 100644
--- a/lib/iommu-helper.c
+++ b/lib/iommu-helper.c
@@ -3,7 +3,6 @@
  * IOMMU helper functions for the free area management
  */
-#include
 #include
 #include
@@ -38,4 +37,3 @@ unsigned long iommu_area_alloc(unsigned long *map, unsigned long size,
 	}
 	return -1;
 }
-EXPORT_SYMBOL(iommu_area_alloc);
-- 
2.17.0
[PATCH 01/12] iommu-common: move to arch/sparc
This code is only used by sparc, and all new iommu drivers should use the
drivers/iommu/ framework. Also remove the unused exports.

Signed-off-by: Christoph Hellwig
Reviewed-by: Anshuman Khandual
---
 {include/linux => arch/sparc/include/asm}/iommu-common.h | 0
 arch/sparc/include/asm/iommu_64.h                        | 2 +-
 arch/sparc/kernel/Makefile                               | 2 +-
 {lib => arch/sparc/kernel}/iommu-common.c                | 5 +
 arch/sparc/kernel/iommu.c                                | 2 +-
 arch/sparc/kernel/ldc.c                                  | 2 +-
 arch/sparc/kernel/pci_sun4v.c                            | 2 +-
 lib/Makefile                                             | 2 +-
 8 files changed, 7 insertions(+), 10 deletions(-)
 rename {include/linux => arch/sparc/include/asm}/iommu-common.h (100%)
 rename {lib => arch/sparc/kernel}/iommu-common.c (98%)

diff --git a/include/linux/iommu-common.h b/arch/sparc/include/asm/iommu-common.h
similarity index 100%
rename from include/linux/iommu-common.h
rename to arch/sparc/include/asm/iommu-common.h

diff --git a/arch/sparc/include/asm/iommu_64.h b/arch/sparc/include/asm/iommu_64.h
index 9ed6b54caa4b..0ef6dedf747e 100644
--- a/arch/sparc/include/asm/iommu_64.h
+++ b/arch/sparc/include/asm/iommu_64.h
@@ -17,7 +17,7 @@
 #define IOPTE_WRITE 0x0002UL

 #define IOMMU_NUM_CTXS 4096
-#include
+#include

 struct iommu_arena {
 	unsigned long *map;

diff --git a/arch/sparc/kernel/Makefile b/arch/sparc/kernel/Makefile
index 76cb57750dda..a284662b0e4c 100644
--- a/arch/sparc/kernel/Makefile
+++ b/arch/sparc/kernel/Makefile
@@ -59,7 +59,7 @@ obj-$(CONFIG_SPARC32) += leon_pmc.o
 obj-$(CONFIG_SPARC64) += reboot.o
 obj-$(CONFIG_SPARC64) += sysfs.o
-obj-$(CONFIG_SPARC64) += iommu.o
+obj-$(CONFIG_SPARC64) += iommu.o iommu-common.o
 obj-$(CONFIG_SPARC64) += central.o
 obj-$(CONFIG_SPARC64) += starfire.o
 obj-$(CONFIG_SPARC64) += power.o

diff --git a/lib/iommu-common.c b/arch/sparc/kernel/iommu-common.c
similarity index 98%
rename from lib/iommu-common.c
rename to arch/sparc/kernel/iommu-common.c
index 55b00de106b5..59cb16691322 100644
--- a/lib/iommu-common.c
+++ b/arch/sparc/kernel/iommu-common.c
@@ -8,9 +8,9 @@
 #include
 #include
 #include
-#include
 #include
 #include
+#include

 static unsigned long iommu_large_alloc = 15;
@@ -93,7 +93,6 @@ void iommu_tbl_pool_init(struct iommu_map_table *iommu,
 	p->hint = p->start;
 	p->end = num_entries;
 }
-EXPORT_SYMBOL(iommu_tbl_pool_init);

 unsigned long iommu_tbl_range_alloc(struct device *dev,
 				struct iommu_map_table *iommu,
@@ -224,7 +223,6 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
 	return n;
 }
-EXPORT_SYMBOL(iommu_tbl_range_alloc);

 static struct iommu_pool *get_pool(struct iommu_map_table *tbl,
 				unsigned long entry)
@@ -264,4 +262,3 @@ void iommu_tbl_range_free(struct iommu_map_table *iommu, u64 dma_addr,
 	bitmap_clear(iommu->map, entry, npages);
 	spin_unlock_irqrestore(&(pool->lock), flags);
 }
-EXPORT_SYMBOL(iommu_tbl_range_free);

diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index b08dc3416f06..40d008b0bd3e 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -14,7 +14,7 @@
 #include
 #include
 #include
-#include
+#include

 #ifdef CONFIG_PCI
 #include

diff --git a/arch/sparc/kernel/ldc.c b/arch/sparc/kernel/ldc.c
index 86b625f9d8dc..c0fa3ef6cf01 100644
--- a/arch/sparc/kernel/ldc.c
+++ b/arch/sparc/kernel/ldc.c
@@ -16,7 +16,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include

diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index 249367228c33..565d9ac883d0 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -16,7 +16,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include

diff --git a/lib/Makefile b/lib/Makefile
index ce20696d5a92..94203b5eecd4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -147,7 +147,7 @@ obj-$(CONFIG_AUDIT_GENERIC) += audit.o
 obj-$(CONFIG_AUDIT_COMPAT_GENERIC) += compat_audit.o

 obj-$(CONFIG_SWIOTLB) += swiotlb.o
-obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o iommu-common.o
+obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o
 obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o
 obj-$(CONFIG_NOTIFIER_ERROR_INJECTION) += notifier-error-inject.o
 obj-$(CONFIG_PM_NOTIFIER_ERROR_INJECT) += pm-notifier-error-inject.o
-- 
2.17.0
centralize SWIOTLB config symbol and misc other cleanups V2
Hi all,

this series aims for a single definition of the Kconfig symbol. To get there,
various cleanups, mostly about config symbols, are included as well.

Changes since V2 are a fixed s/Reviewed/Signed-Off/ for me, and a few
reviewed-by tags.

I'd like to start merging this into the dma-mapping tree rather sooner than
later, given that quite a bit of material for this series depends on it.
Re: [PATCH] iommu/iova: Update cached node pointer when current node fails to get any free IOVA
On 19/04/18 18:12, Ganapatrao Kulkarni wrote:
> The performance drop is observed with long hours of iperf testing using 40G
> cards. This is mainly due to long iterations in finding the free iova range
> in the 32-bit address space.
>
> In the current implementation for 64-bit PCI devices, there is always a
> first attempt to allocate an iova from the 32-bit (SAC preferred over DAC)
> address range. Once we run out of the 32-bit range, there is allocation from
> the higher range; however, due to the cached32_node optimization it is not
> supposed to be painful. cached32_node always points to the most recently
> allocated 32-bit node. When the address range is full, it will point to the
> last allocated node (leaf node), so walking the rbtree to find an available
> range is not an expensive affair.
>
> However, this optimization does not behave well when one of the middle
> nodes is freed. In that case cached32_node is updated to point to the next
> iova range. The next iova allocation will consume the free range and again
> update cached32_node to itself. From then on, walking over the 32-bit range
> is more expensive.
>
> This patch adds a fix to update the cached node to the leaf node when there
> is no free iova range left, which avoids unnecessarily long iterations.

The only trouble with this is that "allocation failed" doesn't uniquely mean
"space full". Say that after some time the 32-bit space ends up empty except
for one page at 0x1000 and one at 0x8000, then somebody tries to allocate
2GB. If we move the cached node down to the leftmost entry when that fails,
all subsequent allocation attempts are now going to fail despite the space
being 99.% free!

I can see a couple of ways to solve that general problem of free space above
the cached node getting lost, but neither of them helps with the case where
there is genuinely insufficient space (and if anything would make it even
slower). In terms of the optimisation you want here, i.e. fail fast when an
allocation cannot possibly succeed, the only reliable idea which comes to
mind is free-PFN accounting. I might give that a go myself to see how ugly
it looks.

Robin.

> Signed-off-by: Ganapatrao Kulkarni
> ---
>  drivers/iommu/iova.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 83fe262..e6ee2ea 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -201,6 +201,12 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>  	} while (curr && new_pfn <= curr_iova->pfn_hi);
>  
>  	if (limit_pfn < size || new_pfn < iovad->start_pfn) {
> +		/* No more cached node points to free hole, update to leaf node.
> +		 */
> +		struct iova *prev_iova;
> +
> +		prev_iova = rb_entry(prev, struct iova, node);
> +		__cached_rbnode_insert_update(iovad, prev_iova);
>  		spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
>  		return -ENOMEM;
>  	}
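[Editor's note: the free-PFN accounting idea mentioned above is not spelled out in the thread. The following userspace toy model sketches one possible shape of it; all names and fields are illustrative stand-ins, not the real struct iova_domain.]

```c
#include <stdbool.h>

/*
 * Toy model of free-PFN accounting: the domain keeps a running count of
 * free PFNs below the 32-bit limit, so an allocation larger than the
 * total free space can fail immediately instead of walking the whole
 * rbtree first.
 */
struct toy_iova_domain {
	unsigned long free_pfn_count;	/* free PFNs in the 32-bit space */
};

/* Fast check: returns false, with no tree walk, when success is impossible. */
static bool toy_alloc_possible(const struct toy_iova_domain *d,
			       unsigned long size)
{
	return size <= d->free_pfn_count;
}

static bool toy_alloc(struct toy_iova_domain *d, unsigned long size)
{
	if (!toy_alloc_possible(d, size))
		return false;	/* fail fast, genuinely insufficient space */
	/*
	 * Real code would still walk the tree here and could fail on
	 * fragmentation even though enough total space exists; the
	 * counter only rules out the impossible cases cheaply.
	 */
	d->free_pfn_count -= size;
	return true;
}

static void toy_free(struct toy_iova_domain *d, unsigned long size)
{
	d->free_pfn_count += size;
}
```

Note that, as the comment says, this only gives the fast-fail property; it deliberately does not address fragmentation, which is exactly why it avoids the false-negative problem described above.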
Re: [RFC 2/2] iommu/arm-smmu-v3: Support software retention for pm_resume
On 23/04/18 12:45, Yisheng Xie wrote:
> When the system suspends, HiSilicon's SMMU does power gating for the SMMU;
> at this time the SMMU's registers are reset to their default values because
> there is no hardware retention, which means software needs to do the
> retention instead.
>
> This patch uses arm_smmu_device_reset() to restore the registers of the
> SMMU. However, it needs to save the MSI settings at probe time if the SMMU
> does not support hardware retention.
>
> Signed-off-by: Yisheng Xie
> ---
>  drivers/iommu/arm-smmu-v3.c | 69 +++--
>  1 file changed, 66 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 044df6e..6cb56d8 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -534,6 +534,11 @@ struct arm_smmu_strtab_cfg {
>  	u32 strtab_base_cfg;
>  };
>  
> +struct arm_smmu_msi_val {
> +	u64 doorbell;
> +	u32 data;
> +};

What does this do that struct msi_msg doesn't already (apart from take up
more space in an array)?

> +
>  /* An SMMUv3 instance */
>  struct arm_smmu_device {
>  	struct device *dev;
> @@ -558,6 +563,7 @@ struct arm_smmu_device {
>  #define ARM_SMMU_OPT_SKIP_PREFETCH	(1 << 0)
>  #define ARM_SMMU_OPT_PAGE0_REGS_ONLY	(1 << 1)
> +#define ARM_SMMU_OPT_SW_RETENTION	(1 << 2)
>  	u32 options;
>  
>  	struct arm_smmu_cmdq cmdq;
> @@ -587,6 +593,8 @@ struct arm_smmu_device {
>  	u32 sync_count;
>  
> +	struct arm_smmu_msi_val *msi;
> +	bool probed;

This looks really hacky. I'm sure there's probably enough driver model
information to be able to identify the probe state from just the struct
device, but that's still not the right way to go. If you need to know this,
then it can only mean we've got one-time software state initialisation mixed
in with the actual hardware reset which programs the software state into the
device. Thus there should be some refactoring to properly separate those
concerns.

>  	bool bypass;
>  
>  	/* IOMMU core code handle */
> @@ -630,6 +638,7 @@ struct arm_smmu_option_prop {
>  static struct arm_smmu_option_prop arm_smmu_options[] = {
>  	{ ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" },
>  	{ ARM_SMMU_OPT_PAGE0_REGS_ONLY, "cavium,cn9900-broken-page1-regspace"},
> +	{ ARM_SMMU_OPT_SW_RETENTION, "hisilicon,broken-hardware-retention" },

That seems a bit over-specific - there are going to be any number of SMMU
implementations/integrations which may or may not implement hardware
retention states. More crucially, it's also backwards. Making the driver
assume that *every* SMMU implements hardware retention unless this new DT
property is present is quite obviously completely wrong, especially for
ACPI...

The sensible thing to do is to implement suspend/resume support which works
in general, *then* consider optimising it for cases where explicitly
restoring the hardware state may be skipped (if indeed it makes a
significant difference). Are there not already generic DT/ACPI properties
for describing the retention levels of different power states, which could
be made use of here?

>  	{ 0, NULL},
>  };
> @@ -2228,7 +2237,8 @@ static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
>  	phys_addr_t doorbell;
>  	struct device *dev = msi_desc_to_dev(desc);
>  	struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> -	phys_addr_t *cfg = arm_smmu_msi_cfg[desc->platform.msi_index];
> +	int msi_index = desc->platform.msi_index;
> +	phys_addr_t *cfg = arm_smmu_msi_cfg[msi_index];
>  
>  	doorbell = (((u64)msg->address_hi) << 32) | msg->address_lo;
>  	doorbell &= MSI_CFG0_ADDR_MASK;
> @@ -2236,6 +2246,28 @@ static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
>  	writeq_relaxed(doorbell, smmu->base + cfg[0]);
>  	writel_relaxed(msg->data, smmu->base + cfg[1]);
>  	writel_relaxed(ARM_SMMU_MEMATTR_DEVICE_nGnRE, smmu->base + cfg[2]);
> +
> +	if (smmu->options & ARM_SMMU_OPT_SW_RETENTION) {

The overhead of writing an extra 12 bytes per MSI to memory is entirely
negligible; saving the message data just doesn't warrant the complexity of
being conditional. In fact, given the need to untangle the IRQ requests from
the hardware reset, I'd rather expect to end up *only* saving the message
here, and writing the IRQ_CFG registers later along with everything else.

> +		smmu->msi[msi_index].doorbell = doorbell;
> +		smmu->msi[msi_index].data = msg->data;
> +	}
> +}
> +
> +static void arm_smmu_restore_msis(struct arm_smmu_device *smmu)
> +{
> +	int nevc = ARM_SMMU_MAX_MSIS - 1;
> +
> +	if (!(smmu->features & ARM_SMMU_FEAT_PRI))
> +		nevc--;
> +
> +	for (; nevc >= 0; nevc--) {
> +		phys_addr_t *cfg =
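[Editor's note: the unconditional save-then-restore scheme Robin suggests can be modeled in userspace as below. The struct names, the register "file", and the index count are all illustrative stand-ins for the real SMMU MMIO space, not the driver's actual code.]

```c
#include <stdint.h>

/* Simplified stand-in for struct msi_msg. */
struct toy_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

#define TOY_NUM_MSIS 3

struct toy_smmu {
	struct toy_msi_msg saved[TOY_NUM_MSIS];	/* software copy, always kept */
	uint64_t reg_doorbell[TOY_NUM_MSIS];	/* models the IRQ_CFG0 regs */
	uint32_t reg_data[TOY_NUM_MSIS];	/* models the IRQ_CFG1 regs */
};

/* When the MSI layer composes a message: only save it, touch no hardware. */
static void toy_write_msi_msg(struct toy_smmu *smmu, int index,
			      const struct toy_msi_msg *msg)
{
	smmu->saved[index] = *msg;
}

/* From reset/resume: program every saved message into the "hardware". */
static void toy_restore_msis(struct toy_smmu *smmu)
{
	for (int i = 0; i < TOY_NUM_MSIS; i++) {
		uint64_t doorbell =
			((uint64_t)smmu->saved[i].address_hi << 32) |
			smmu->saved[i].address_lo;
		smmu->reg_doorbell[i] = doorbell;
		smmu->reg_data[i] = smmu->saved[i].data;
	}
}
```

The point of the split is that composing a message becomes a pure software operation, and the single restore path serves both initial probe and resume, which is what lets the conditional `ARM_SMMU_OPT_SW_RETENTION` save go away.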
Re: [RFC 1/2] iommu/arm-smmu-v3: Remove bypass in arm_smmu_reset_device
On 23/04/18 12:45, Yisheng Xie wrote:
> Add a bypass parameter in arm_smmu_device to keep whether smmu device
> should bypass or not, so parameter bypass in arm_smmu_reset_device can
> be removed.

Given that the GBPA configuration implied by the bypass argument here is
only there to avoid initialising a full stream table when the firmware
is terminally broken, I wonder if it would make sense to simply skip
allocating a stream table at all in that case. Then we could just base
the subsequent SMMUEN/GBPA decision on whether strtab_cfg.strtab is
valid or not.

Robin.

> This should not have any functional change, but prepare for later patch.
>
> Signed-off-by: Yisheng Xie
> ---
>  drivers/iommu/arm-smmu-v3.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 1d64710..044df6e 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -587,6 +587,8 @@ struct arm_smmu_device {
>  	u32				sync_count;
>
> +	bool				bypass;
> +
>  	/* IOMMU core code handle */
>  	struct iommu_device		iommu;
>  };
>
> @@ -2384,7 +2386,7 @@ static int arm_smmu_device_disable(struct arm_smmu_device *smmu)
>  	return ret;
>  }
>
> -static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
> +static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
>  {
>  	int ret;
>  	u32 reg, enables;
>
> @@ -2487,7 +2489,7 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>
>  	/* Enable the SMMU interface, or ensure bypass */
> -	if (!bypass || disable_bypass) {
> +	if (!smmu->bypass || disable_bypass) {
>  		enables |= CR0_SMMUEN;
>  	} else {
>  		ret = arm_smmu_update_gbpa(smmu, 0, GBPA_ABORT);
>
> @@ -2778,7 +2780,6 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>  	resource_size_t ioaddr;
>  	struct arm_smmu_device *smmu;
>  	struct device *dev = &pdev->dev;
> -	bool bypass;
>
>  	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
>  	if (!smmu) {
>
> @@ -2796,7 +2797,7 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>  	}
>
>  	/* Set bypass mode according to firmware probing result */
> -	bypass = !!ret;
> +	smmu->bypass = !!ret;
>
>  	/* Base address */
>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>
> @@ -2842,7 +2843,7 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>  	platform_set_drvdata(pdev, smmu);
>
>  	/* Reset the device */
> -	ret = arm_smmu_device_reset(smmu, bypass);
> +	ret = arm_smmu_device_reset(smmu);
>  	if (ret)
>  		return ret;
___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
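[Editor's note] Robin's suggestion above — skip allocating the stream table for broken firmware and key the SMMUEN/GBPA decision off the pointer rather than a separate flag — can be modelled with a minimal userspace sketch. All names here (`smmu_model`, `smmu_model_enable_translation`) are illustrative, not the actual driver API:

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal model: instead of carrying a separate "bypass" flag, skip
 * stream-table allocation when firmware probing failed, then derive
 * the enable-translation vs. global-bypass decision from whether the
 * table exists. */
struct smmu_model {
	unsigned long *strtab;	/* NULL when firmware probing failed */
};

static int smmu_model_init(struct smmu_model *smmu, int fw_ok)
{
	smmu->strtab = fw_ok ? calloc(64, sizeof(*smmu->strtab)) : NULL;
	return (fw_ok && !smmu->strtab) ? -1 : 0;
}

/* Returns 1 when translation (SMMUEN) should be enabled,
 * 0 when global bypass (GBPA) is the only option. */
static int smmu_model_enable_translation(const struct smmu_model *smmu)
{
	return smmu->strtab != NULL;
}
```

The point of the design is that there is then a single source of truth — the pointer — and the `bypass` member in `struct arm_smmu_device` becomes unnecessary.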
Re: [PATCH v4 13/22] iommu: introduce page response function
On 23/04/18 13:16, Jacob Pan wrote:
> I shall drop these, only put in here to match your patch. i am looking
> into converting vt-d svm prq to your queued fault patch. I think it will
> give both functional and performance benefit.

Thanks, I just rebased my patches onto this series and am hoping to
re-send the IOMMU part early next month.

Jean
Re: [PATCH v4 14/22] iommu: handle page response timeout
On Mon, Apr 16, 2018 at 10:49:03PM +0100, Jacob Pan wrote:
> When IO page faults are reported outside IOMMU subsystem, the page
> request handler may fail for various reasons. E.g. a guest received
> page requests but did not have a chance to run for a long time. The
> irresponsive behavior could hold off limited resources on the pending
> device.
> There can be hardware or credit based software solutions as suggested
> in the PCI ATS Ch-4. To provide a basic safety net this patch
> introduces a per device deferrable timer which monitors the longest
> pending page fault that requires a response. Proper action such as
> sending failure response code could be taken when timer expires but not
> included in this patch. We need to consider the life cycle of page
> group ID to prevent confusion with reused group ID by a device.
> For now, a warning message provides clue of such failure.
>
> Signed-off-by: Jacob Pan
> Signed-off-by: Ashok Raj
> ---
>  drivers/iommu/iommu.c | 60 +--
>  include/linux/iommu.h |  4
>  2 files changed, 62 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 628346c..f6512692 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -799,6 +799,39 @@ int iommu_group_unregister_notifier(struct iommu_group *group,
>  }
>  EXPORT_SYMBOL_GPL(iommu_group_unregister_notifier);
>
> +/* Max time to wait for a pending page request */
> +#define IOMMU_PAGE_RESPONSE_MAXTIME (HZ * 10)
> +static void iommu_dev_fault_timer_fn(struct timer_list *t)
> +{
> +	struct iommu_fault_param *fparam = from_timer(fparam, t, timer);
> +	struct iommu_fault_event *evt, *iter;
> +
> +	u64 now;
> +
> +	now = get_jiffies_64();
> +
> +	/* The goal is to ensure driver or guest page fault handler(via vfio)
> +	 * send page response on time. Otherwise, limited queue resources
> +	 * may be occupied by some irresponsive guests or drivers.
By "limited queue resources", do you mean the PRI fault queue in the
pIOMMU device, or something else?

I'm still uneasy about this timeout. We don't really know if the guest
doesn't respond because it is suspended, because it doesn't support PRI
or because it's attempting to kill the host. In the first case, then
receiving and responding to page requests later than 10s should be fine,
right? Or maybe the guest is doing something weird like fetching pages
from network storage and it occasionally hits a latency oddity. This
wouldn't interrupt the fault queues, because other page requests for the
same device can be serviced in parallel, but if you implement a PRG
timeout it would still unfairly disable PRI.

In the other cases (unsupported PRI or rogue guest) then disabling PRI
using a FAILURE status might be the right thing to do. However, assuming
the device follows the PCI spec it will stop sending page requests once
there are as many PPRs in flight as the allocated credit. Even though
drivers set the PPR credit number arbitrarily (because finding an ideal
number is difficult or impossible), the device stops issuing faults at
some point if the guest is unresponsive, and it won't grab any more
shared resources, or use slots in shared queues. Resources for pending
faults can be cleaned when the device is reset and assigned to a
different guest.

That's for sane endpoints that follow the spec. If on the other hand, we
can't rely on the device implementation to respect our maximum credit
allocation, then we should do the accounting ourselves and reject
incoming faults with INVALID as fast as possible. Otherwise it's an easy
way for a guest to DoS the host and I don't think a timeout solves this
problem (the guest can wait 9 seconds before replying to faults and
meanwhile fill all the queues). In addition the timeout is done on PRGs
but not individual page faults, so a guest could overflow the queues by
triggering lots of page requests without setting the last bit.
If there isn't any possibility of memory leak or abusing resources, I
don't think it's our problem that the guest is excessively slow at
handling page requests. Setting an upper bound to page request latency
might do more harm than good. Ensuring that devices respect the number
of allocated in-flight PPRs is more important in my opinion.

> +	 * When per device pending fault list is not empty, we periodically checks
> +	 * if any anticipated page response time has expired.
> +	 *
> +	 * TODO:
> +	 * We could do the following if response time expires:
> +	 * 1. send page response code FAILURE to all pending PRQ
> +	 * 2. inform device driver or vfio
> +	 * 3. drain in-flight page requests and responses for this device
> +	 * 4. clear pending fault list such that driver can unregister fault
> +	 *    handler(otherwise blocked when pending faults are present).
> +	 */
> +
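[Editor's note] The credit-based accounting argued for in the reply above could look something like this hypothetical sketch (the structure and function names are illustrative, not kernel API): incoming faults beyond the advertised PPR credit are answered with INVALID immediately instead of being queued, so a non-responding guest cannot occupy shared queue slots:

```c
#include <assert.h>

enum page_resp { RESP_QUEUED, RESP_INVALID };

/* Hypothetical per-device accounting: enforce the PPR credit we
 * allocated ourselves, instead of trusting the endpoint to do it. */
struct ppr_account {
	unsigned int credit;	/* max in-flight page requests allowed */
	unsigned int inflight;	/* requests queued, awaiting a response */
};

static enum page_resp ppr_fault_incoming(struct ppr_account *acc)
{
	if (acc->inflight >= acc->credit)
		return RESP_INVALID;	/* fast-fail, keep queues free */
	acc->inflight++;
	return RESP_QUEUED;
}

static void ppr_response_sent(struct ppr_account *acc)
{
	if (acc->inflight)
		acc->inflight--;
}
```

With accounting like this a slow guest only throttles its own device, and no timeout on PRGs is needed to protect the host.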
Re: [PATCH v3 1/2] dma-mapping: move dma configuration to bus infrastructure
Can you resend your changes against Linux 4.17-rc2? There are a lot of conflicts as-is.
Re: [PATCH 1/2] dma-direct: Don't repeat allocation for no-op GFP_DMA
Thanks, applied to the dma-mapping tree for 4.17.
Re: [PATCH] dma-coherent: Clarify dma_mmap_from_dev_coherent documentation
On Mon, Apr 09, 2018 at 06:59:14PM +0100, Robin Murphy wrote:
> The use of "correctly mapped" here is misleading, since it can give the
> wrong expectation in the case that the memory *should* have been mapped
> from the per-device pool, but doing so failed for other reasons.

Thanks, applied to the dma-mapping tree for 4.17.
Re: [PATCH v2] base: dma-mapping: Postpone cpu addr translation on mmap()
Thanks, applied to the dma-mapping tree for 4.17.
Re: [PATCH v4 13/22] iommu: introduce page response function
On Mon, 23 Apr 2018 12:47:10 +0100
Jean-Philippe Brucker wrote:

> On Mon, Apr 16, 2018 at 10:49:02PM +0100, Jacob Pan wrote:
> [...]
> > +	/*
> > +	 * Check if we have a matching page request pending to respond,
> > +	 * otherwise return -EINVAL
> > +	 */
> > +	list_for_each_entry_safe(evt, iter, &param->fault_param->faults, list) {
>
> I don't think you need the "_safe" iterator if you're exiting the loop
> right after removing the event.
>
you are right, good catch!
> > +		if (evt->pasid == msg->pasid &&
> > +		    msg->page_req_group_id == evt->page_req_group_id) {
> > +			msg->private_data = evt->iommu_private;
>
> Ah sorry, I missed this bit in my review of 10/22. I thought
> private_data would be for evt->device_private. In this case I guess we
> can drop device_private, or do you plan to use it?
>
NP. vt-d still plan to use device_private for gfx device.
> > +			ret = domain->ops->page_response(dev, msg);
> > +			list_del(&evt->list);
> > +			kfree(evt);
> > +			break;
> > +		}
> > +	}
> > +
> > +done_unlock:
> > +	mutex_unlock(&param->fault_param->lock);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_page_response);
> > +
> >  static void __iommu_detach_device(struct iommu_domain *domain,
> >  				  struct device *dev)
> >  {
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 32435f9..058b552 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -163,6 +163,55 @@ struct iommu_resv_region {
> >  #ifdef CONFIG_IOMMU_API
> >
> >  /**
> > + * enum page_response_code - Return status of fault handlers, telling the IOMMU
> > + * driver how to proceed with the fault.
> > + *
> > + * @IOMMU_FAULT_STATUS_SUCCESS: Fault has been handled and the page tables
> > + *	populated, retry the access. This is "Success" in PCI PRI.
> > + * @IOMMU_FAULT_STATUS_FAILURE: General error. Drop all subsequent faults from
> > + *	this device if possible. This is "Response Failure" in PCI PRI.
> > + * @IOMMU_FAULT_STATUS_INVALID: Could not handle this fault, don't retry the
> > + *	access. This is "Invalid Request" in PCI PRI.
> > + */
> > +enum page_response_code {
> > +	IOMMU_PAGE_RESP_SUCCESS = 0,
> > +	IOMMU_PAGE_RESP_INVALID,
> > +	IOMMU_PAGE_RESP_FAILURE,
> > +};
>
> Field names aren't consistent with the comment. I'd go with
> IOMMU_PAGE_RESP_*
>
will do.
> > +
> > +/**
> > + * enum page_request_handle_t - Return page request/response handler status
> > + *
> > + * @IOMMU_FAULT_STATUS_HANDLED: Stop processing the fault, and do not send a
> > + *	reply to the device.
> > + * @IOMMU_FAULT_STATUS_CONTINUE: Fault was not handled. Call the next handler,
> > + *	or terminate.
> > + */
> > +enum page_request_handle_t {
> > +	IOMMU_PAGE_RESP_HANDLED = 0,
> > +	IOMMU_PAGE_RESP_CONTINUE,
>
> Same here regarding the comment. Here I'd prefer
> "iommu_fault_status_t" for the enum and IOMMU_FAULT_STATUS_* for the
> fields, because they can be used for unrecoverable faults as well.
>
> But since you're not using these values in your patches, I guess you
> can drop this enum? At the moment the return value of fault handler
> is 0 (as specified at iommu_register_device_fault_handler), meaning
> that the handler always takes ownership of the fault.
>
> It will be easy to extend once we introduce multiple fault handlers
> that can either take the fault or pass it to the next one. Existing
> implementations will still return 0 - HANDLED, and new ones will
> return either HANDLED or CONTINUE.
>
I shall drop these, only put in here to match your patch. i am looking
into converting vt-d svm prq to your queued fault patch. I think it will
give both functional and performance benefit.
> > +/**
> > + * Generic page response information based on PCI ATS and PASID spec.
> > + * @addr: servicing page address
> > + * @pasid: contains process address space ID
> > + * @resp_code: response code
> > + * @page_req_group_id: page request group index
> > + * @type: group or stream/single page response
>
> @type isn't in the structure
>
missed that, i move it to iommu private data since it is vtd only
> > + * @private_data: uniquely identify device-specific private data for an
> > + *	individual page response
>
> IOMMU-specific? If it is set by iommu.c, I think we should comment
> about it, something like "This field is written by the IOMMU core".
> Maybe also rename it to iommu_private to be consistent with
> iommu_fault_event
>
sounds good.
> > + */
> > +struct page_response_msg {
> > +	u64 addr;
> > +	u32 pasid;
> > +	enum page_response_code resp_code;
> > +	u32 pasid_present:1;
> > +	u32 page_req_group_id;
> > +	u64 private_data;
> > +};
> > +
> > +/**
> >  * struct iommu_ops - iommu ops and capabilities
> >  * @capable:
[RFC 2/2] iommu/arm-smmu-v3: Support software retention for pm_resume
When the system suspends, HiSilicon's SMMU is power gated. Because this
implementation lacks hardware retention, the SMMU registers are reset to
their default values across suspend, which means software has to do the
retention instead. This patch uses arm_smmu_device_reset() to restore
the SMMU registers. However, it also needs to save the MSI settings at
probe time when the SMMU does not support hardware retention.

Signed-off-by: Yisheng Xie
---
 drivers/iommu/arm-smmu-v3.c | 69 +++++--
 1 file changed, 66 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 044df6e..6cb56d8 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -534,6 +534,11 @@ struct arm_smmu_strtab_cfg {
 	u32				strtab_base_cfg;
 };
 
+struct arm_smmu_msi_val {
+	u64	doorbell;
+	u32	data;
+};
+
 /* An SMMUv3 instance */
 struct arm_smmu_device {
 	struct device			*dev;
@@ -558,6 +563,7 @@ struct arm_smmu_device {
 #define ARM_SMMU_OPT_SKIP_PREFETCH	(1 << 0)
 #define ARM_SMMU_OPT_PAGE0_REGS_ONLY	(1 << 1)
+#define ARM_SMMU_OPT_SW_RETENTION	(1 << 2)
 	u32				options;
 
 	struct arm_smmu_cmdq		cmdq;
@@ -587,6 +593,8 @@ struct arm_smmu_device {
 	u32				sync_count;
 
+	struct arm_smmu_msi_val		*msi;
+
 	bool				probed;
 	bool				bypass;
 
 	/* IOMMU core code handle */
@@ -630,6 +638,7 @@ struct arm_smmu_option_prop {
 static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" },
 	{ ARM_SMMU_OPT_PAGE0_REGS_ONLY, "cavium,cn9900-broken-page1-regspace"},
+	{ ARM_SMMU_OPT_SW_RETENTION, "hisilicon,broken-hardware-retention" },
 	{ 0, NULL},
 };
@@ -2228,7 +2237,8 @@ static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
 	phys_addr_t doorbell;
 	struct device *dev = msi_desc_to_dev(desc);
 	struct arm_smmu_device *smmu = dev_get_drvdata(dev);
-	phys_addr_t *cfg = arm_smmu_msi_cfg[desc->platform.msi_index];
+	int msi_index = desc->platform.msi_index;
+	phys_addr_t *cfg = arm_smmu_msi_cfg[msi_index];
 
 	doorbell = (((u64)msg->address_hi) << 32) | msg->address_lo;
 	doorbell &= MSI_CFG0_ADDR_MASK;
@@ -2236,6 +2246,28 @@ static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
 	writeq_relaxed(doorbell, smmu->base + cfg[0]);
 	writel_relaxed(msg->data, smmu->base + cfg[1]);
 	writel_relaxed(ARM_SMMU_MEMATTR_DEVICE_nGnRE, smmu->base + cfg[2]);
+
+	if (smmu->options & ARM_SMMU_OPT_SW_RETENTION) {
+		smmu->msi[msi_index].doorbell = doorbell;
+		smmu->msi[msi_index].data = msg->data;
+	}
+}
+
+static void arm_smmu_restore_msis(struct arm_smmu_device *smmu)
+{
+	int nevc = ARM_SMMU_MAX_MSIS - 1;
+
+	if (!(smmu->features & ARM_SMMU_FEAT_PRI))
+		nevc--;
+
+	for (; nevc >= 0; nevc--) {
+		phys_addr_t *cfg = arm_smmu_msi_cfg[nevc];
+		struct arm_smmu_msi_val msi_val = smmu->msi[nevc];
+
+		writeq_relaxed(msi_val.doorbell, smmu->base + cfg[0]);
+		writel_relaxed(msi_val.data, smmu->base + cfg[1]);
+		writel_relaxed(ARM_SMMU_MEMATTR_DEVICE_nGnRE, smmu->base + cfg[2]);
+	}
 }
 
 static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
@@ -2261,6 +2293,16 @@ static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
 		return;
 	}
 
+	if (smmu->probed) {
+		BUG_ON(!(smmu->options & ARM_SMMU_OPT_SW_RETENTION));
+		arm_smmu_restore_msis(smmu);
+		return;
+	} else if (smmu->options & ARM_SMMU_OPT_SW_RETENTION) {
+		smmu->msi = devm_kmalloc_array(dev, nvec,
+					       sizeof(*(smmu->msi)),
+					       GFP_KERNEL);
+	}
+
 	/* Allocate MSIs for evtq, gerror and priq. Ignore cmdq */
 	ret = platform_msi_domain_alloc_irqs(dev, nvec, arm_smmu_write_msi_msg);
 	if (ret) {
@@ -2294,6 +2336,9 @@ static void arm_smmu_setup_unique_irqs(struct arm_smmu_device *smmu)
 
 	arm_smmu_setup_msis(smmu);
 
+	if (smmu->probed)
+		return;
+
 	/* Request interrupt lines */
 	irq = smmu->evtq.q.irq;
 	if (irq) {
@@ -2348,7 +2393,7 @@ static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
 	}
 
 	irq = smmu->combined_irq;
-	if (irq) {
+	if (irq && !smmu->probed) {
 		/*
 		 * Cavium ThunderX2 implementation doesn't not support unique
 		 * irq lines. Use single irq
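[Editor's note] The save-and-replay scheme in the patch's arm_smmu_write_msi_msg()/arm_smmu_restore_msis() pair can be modelled in a few lines of userspace C. The register file and all names here are illustrative stand-ins, not the driver's API: each MSI write is mirrored into a shadow array, and resume replays the shadow into the (reset) registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_MSI 3

struct msi_val {
	uint64_t doorbell;
	uint32_t data;
};

/* Shadow copy captured at configuration time... */
static struct msi_val shadow[NUM_MSI];
/* ...and a stand-in for the SMMU's MSI registers, lost on power gating. */
static struct msi_val regs[NUM_MSI];

static void write_msi_msg(int idx, uint64_t doorbell, uint32_t data)
{
	regs[idx].doorbell = doorbell;	/* normal register programming */
	regs[idx].data = data;
	shadow[idx] = regs[idx];	/* mirror for software retention */
}

static void power_gate(void)
{
	memset(regs, 0, sizeof(regs));	/* registers reset to defaults */
}

static void restore_msis(void)
{
	int i;

	for (i = 0; i < NUM_MSI; i++)
		regs[i] = shadow[i];	/* replay saved values on resume */
}
```

Capturing at write time (rather than reading registers back at suspend) is what makes this work even when the power domain is already gated by firmware before the driver's resume hook runs.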
[RFC 0/2] iommu/arm-smmu-v3: Support software retention for pm_resume
- Background:
HiSilicon's implementation of SMMUv3 does not support hardware retention
when the system power gates it during suspend; however, for embedded
systems we do need to power gate in the trust zone for lower power
consumption. So software retention is needed.

- Implementation:
During SMMU probe, the driver reads the SMMU features from the IDR
registers and keeps the status in struct arm_smmu_device;
arm_smmu_device_reset() then initialises (sets) most of the registers
according to it. So we can reuse arm_smmu_device_reset() to
re-initialise the SMMUv3 device and make it work again after resume.

To achieve this, patch 1 moves the bypass parameter from
arm_smmu_device_reset() to struct arm_smmu_device so that
arm_smmu_device_reset() becomes re-usable. Patch 2 introduces a probed
parameter in struct arm_smmu_device, so once the SMMUv3 is probed we can
avoid requesting irqs again. It also introduces a struct
arm_smmu_msi_val to keep the values of the SMMUv3's MSI registers, which
can be restored when the device is reset after probe.

Yisheng Xie (2):
  iommu/arm-smmu-v3: Remove bypass in arm_smmu_reset_device
  iommu/arm-smmu-v3: Support software retention for pm_resume

 drivers/iommu/arm-smmu-v3.c | 80 -
 1 file changed, 72 insertions(+), 8 deletions(-)

-- 
1.7.12.4
Re: [PATCH v4 10/22] iommu: introduce device fault data
On Mon, 23 Apr 2018 11:11:41 +0100
Jean-Philippe Brucker wrote:

> On Mon, Apr 16, 2018 at 10:48:59PM +0100, Jacob Pan wrote:
> > +/**
> > + * struct iommu_fault_event - Generic per device fault data
> > + *
> > + * - PCI and non-PCI devices
> > + * - Recoverable faults (e.g. page request), information based on PCI ATS
> > + *   and PASID spec.
> > + * - Un-recoverable faults of device interest
> > + * - DMA remapping and IRQ remapping faults
> > +
> > + * @type contains fault type.
> > + * @reason fault reasons if relevant outside IOMMU driver, IOMMU driver internal
> > + *	faults are not reported
> > + * @addr: tells the offending page address
> > + * @pasid: contains process address space ID, used in shared virtual memory(SVM)
> > + * @rid: requestor ID
>
> You can remove @rid from the comment
>
thanks, will do.
> > + * @page_req_group_id: page request group index
> > + * @last_req: last request in a page request group
> > + * @pasid_valid: indicates if the PRQ has a valid PASID
> > + * @prot: page access protection flag, e.g. IOMMU_FAULT_READ, IOMMU_FAULT_WRITE
> > + * @device_private: if present, uniquely identify device-specific
> > + *	private data for an individual page request.
> > + * @iommu_private: used by the IOMMU driver for storing fault-specific
> > + *	data. Users should not modify this field before
> > + *	sending the fault response.
>
> In my opinion you can remove @iommu_private entirely. I proposed this
> field so that the IOMMU driver can store fault metadata when reporting
> them, and read them back when completing the fault. I'm not using it
> in SMMUv3 anymore (instead re-fetching the metadata) and it can't be
> used anyway, because the value isn't copied into page_response_msg.
>
In vt-d use, I use private data for preserving vt-d specific fault data
across request and response. e.g. vt-d has streaming response type in
addition to group response (standard). This way, generic code does not
need to know about it.

At device level, since we have to sanitize page response based on
pending page requests, I am doing the private data copy in iommu.c, in
the pending event list.
> Thanks,
> Jean
>
> > + */
> > +struct iommu_fault_event {
> > +	enum iommu_fault_type type;
> > +	enum iommu_fault_reason reason;
> > +	u64 addr;
> > +	u32 pasid;
> > +	u32 page_req_group_id;
> > +	u32 last_req : 1;
> > +	u32 pasid_valid : 1;
> > +	u32 prot;
> > +	u64 device_private;
> > +	u64 iommu_private;
> > +};

[Jacob Pan]
Re: [PATCH v4 13/22] iommu: introduce page response function
On Mon, Apr 16, 2018 at 10:49:02PM +0100, Jacob Pan wrote:
[...]
> +	/*
> +	 * Check if we have a matching page request pending to respond,
> +	 * otherwise return -EINVAL
> +	 */
> +	list_for_each_entry_safe(evt, iter, &param->fault_param->faults, list) {

I don't think you need the "_safe" iterator if you're exiting the loop
right after removing the event.

> +		if (evt->pasid == msg->pasid &&
> +		    msg->page_req_group_id == evt->page_req_group_id) {
> +			msg->private_data = evt->iommu_private;

Ah sorry, I missed this bit in my review of 10/22. I thought
private_data would be for evt->device_private. In this case I guess we
can drop device_private, or do you plan to use it?

> +			ret = domain->ops->page_response(dev, msg);
> +			list_del(&evt->list);
> +			kfree(evt);
> +			break;
> +		}
> +	}
> +
> +done_unlock:
> +	mutex_unlock(&param->fault_param->lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_page_response);
> +
>  static void __iommu_detach_device(struct iommu_domain *domain,
>  				  struct device *dev)
>  {
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 32435f9..058b552 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -163,6 +163,55 @@ struct iommu_resv_region {
>  #ifdef CONFIG_IOMMU_API
>
>  /**
> + * enum page_response_code - Return status of fault handlers, telling the IOMMU
> + * driver how to proceed with the fault.
> + *
> + * @IOMMU_FAULT_STATUS_SUCCESS: Fault has been handled and the page tables
> + *	populated, retry the access. This is "Success" in PCI PRI.
> + * @IOMMU_FAULT_STATUS_FAILURE: General error. Drop all subsequent faults from
> + *	this device if possible. This is "Response Failure" in PCI PRI.
> + * @IOMMU_FAULT_STATUS_INVALID: Could not handle this fault, don't retry the
> + *	access. This is "Invalid Request" in PCI PRI.
> + */
> +enum page_response_code {
> +	IOMMU_PAGE_RESP_SUCCESS = 0,
> +	IOMMU_PAGE_RESP_INVALID,
> +	IOMMU_PAGE_RESP_FAILURE,
> +};

Field names aren't consistent with the comment. I'd go with
IOMMU_PAGE_RESP_*

> +
> +/**
> + * enum page_request_handle_t - Return page request/response handler status
> + *
> + * @IOMMU_FAULT_STATUS_HANDLED: Stop processing the fault, and do not send a
> + *	reply to the device.
> + * @IOMMU_FAULT_STATUS_CONTINUE: Fault was not handled. Call the next handler,
> + *	or terminate.
> + */
> +enum page_request_handle_t {
> +	IOMMU_PAGE_RESP_HANDLED = 0,
> +	IOMMU_PAGE_RESP_CONTINUE,

Same here regarding the comment. Here I'd prefer "iommu_fault_status_t"
for the enum and IOMMU_FAULT_STATUS_* for the fields, because they can
be used for unrecoverable faults as well.

But since you're not using these values in your patches, I guess you can
drop this enum? At the moment the return value of fault handler is 0 (as
specified at iommu_register_device_fault_handler), meaning that the
handler always takes ownership of the fault.

It will be easy to extend once we introduce multiple fault handlers that
can either take the fault or pass it to the next one. Existing
implementations will still return 0 - HANDLED, and new ones will return
either HANDLED or CONTINUE.

> +/**
> + * Generic page response information based on PCI ATS and PASID spec.
> + * @addr: servicing page address
> + * @pasid: contains process address space ID
> + * @resp_code: response code
> + * @page_req_group_id: page request group index
> + * @type: group or stream/single page response

@type isn't in the structure

> + * @private_data: uniquely identify device-specific private data for an
> + *	individual page response

IOMMU-specific? If it is set by iommu.c, I think we should comment about
it, something like "This field is written by the IOMMU core". Maybe also
rename it to iommu_private to be consistent with iommu_fault_event

> + */
> +struct page_response_msg {
> +	u64 addr;
> +	u32 pasid;
> +	enum page_response_code resp_code;
> +	u32 pasid_present:1;
> +	u32 page_req_group_id;
> +	u64 private_data;
> +};
> +
> +/**
>  * struct iommu_ops - iommu ops and capabilities
>  * @capable: check capability
>  * @domain_alloc: allocate iommu domain
> @@ -195,6 +244,7 @@ struct iommu_resv_region {
>  * @bind_pasid_table: bind pasid table pointer for guest SVM
>  * @unbind_pasid_table: unbind pasid table pointer and restore defaults
>  * @sva_invalidate: invalidate translation caches of shared virtual address
> + * @page_response: handle page request response
>  */
>  struct iommu_ops {
>  	bool (*capable)(enum iommu_cap);
> @@ -250,6 +300,7 @@ struct iommu_ops {
>  				 struct device *dev);
>  	int (*sva_invalidate)(struct iommu_domain *domain,
>  		struct device *dev, struct tlb_invalidate_info *inv_info);
> +	int
Re: [PATCH v4 12/22] iommu: introduce device fault report API
On Mon, Apr 16, 2018 at 10:49:01PM +0100, Jacob Pan wrote:
[...]
> +int iommu_register_device_fault_handler(struct device *dev,
> +					iommu_dev_fault_handler_t handler,
> +					void *data)
> +{
> +	struct iommu_param *param = dev->iommu_param;
> +
> +	/*
> +	 * Device iommu_param should have been allocated when device is
> +	 * added to its iommu_group.
> +	 */
> +	if (!param)
> +		return -EINVAL;
> +
> +	/* Only allow one fault handler registered for each device */
> +	if (param->fault_param)
> +		return -EBUSY;
> +
> +	mutex_lock(&param->lock);
> +	get_device(dev);
> +	param->fault_param =
> +		kzalloc(sizeof(struct iommu_fault_param), GFP_ATOMIC);

This can be GFP_KERNEL

[...]
> +int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt)
> +{
> +	int ret = 0;
> +	struct iommu_fault_event *evt_pending;
> +	struct iommu_fault_param *fparam;
> +
> +	/* iommu_param is allocated when device is added to group */
> +	if (!dev->iommu_param | !evt)
> +		return -EINVAL;
> +	/* we only report device fault if there is a handler registered */
> +	mutex_lock(&dev->iommu_param->lock);
> +	if (!dev->iommu_param->fault_param ||
> +	    !dev->iommu_param->fault_param->handler) {
> +		ret = -EINVAL;
> +		goto done_unlock;
> +	}
> +	fparam = dev->iommu_param->fault_param;
> +	if (evt->type == IOMMU_FAULT_PAGE_REQ && evt->last_req) {
> +		evt_pending = kzalloc(sizeof(*evt_pending), GFP_ATOMIC);

We're expecting caller to be a thread at the moment, so this could be
GFP_KERNEL too. You could also use kmemdup to remove the memcpy below

[...]
> +static inline int iommu_register_device_fault_handler(struct device *dev,
> +						      iommu_dev_fault_handler_t handler,
> +						      void *data)
> +{
> +	return 0;

Should return -ENODEV

Thanks,
Jean
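[Editor's note] The kmemdup suggestion above collapses the allocate-then-copy pattern into a single call. A userspace analogue — with a hypothetical `mem_dup()` standing in for the kernel's kmemdup(), and an illustrative event struct — looks like this:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for struct iommu_fault_event. */
struct fault_event {
	int pasid;
	int group_id;
};

/* Userspace stand-in for the kernel's kmemdup(): one call replaces
 * the kzalloc() + memcpy() pair the review points at. */
static void *mem_dup(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```

Besides brevity, kmemdup() avoids the window where the new allocation holds zeroes rather than a copy, and the allocation and copy lengths cannot drift apart.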
Re: [PATCH v4 11/22] driver core: add per device iommu param
On Mon, Apr 16, 2018 at 02:49:00PM -0700, Jacob Pan wrote:
> DMA faults can be detected by IOMMU at device level. Adding a pointer
> to struct device allows IOMMU subsystem to report relevant faults
> back to the device driver for further handling.
> For direct assigned device (or user space drivers), guest OS holds
> responsibility to handle and respond per device IOMMU fault.
> Therefore we need fault reporting mechanism to propagate faults beyond
> IOMMU subsystem.
>
> There are two other IOMMU data pointers under struct device today, here
> we introduce iommu_param as a parent pointer such that all device IOMMU
> data can be consolidated here. The idea was suggested here by Greg KH
> and Joerg. The name iommu_param is chosen here since iommu_data has been used.
>
> Suggested-by: Greg Kroah-Hartman
> Signed-off-by: Jacob Pan
> Signed-off-by: Jean-Philippe Brucker
> Link: https://lkml.org/lkml/2017/10/6/81

Reviewed-by: Greg Kroah-Hartman
Re: [PATCH v4 10/22] iommu: introduce device fault data
On Mon, Apr 16, 2018 at 10:48:59PM +0100, Jacob Pan wrote:
> +/**
> + * struct iommu_fault_event - Generic per device fault data
> + *
> + * - PCI and non-PCI devices
> + * - Recoverable faults (e.g. page request), information based on PCI ATS
> + *   and PASID spec.
> + * - Un-recoverable faults of device interest
> + * - DMA remapping and IRQ remapping faults
> +
> + * @type contains fault type.
> + * @reason fault reasons if relevant outside IOMMU driver, IOMMU driver internal
> + *	faults are not reported
> + * @addr: tells the offending page address
> + * @pasid: contains process address space ID, used in shared virtual memory(SVM)
> + * @rid: requestor ID

You can remove @rid from the comment

> + * @page_req_group_id: page request group index
> + * @last_req: last request in a page request group
> + * @pasid_valid: indicates if the PRQ has a valid PASID
> + * @prot: page access protection flag, e.g. IOMMU_FAULT_READ, IOMMU_FAULT_WRITE
> + * @device_private: if present, uniquely identify device-specific
> + *	private data for an individual page request.
> + * @iommu_private: used by the IOMMU driver for storing fault-specific
> + *	data. Users should not modify this field before
> + *	sending the fault response.

In my opinion you can remove @iommu_private entirely. I proposed this
field so that the IOMMU driver can store fault metadata when reporting
them, and read them back when completing the fault. I'm not using it in
SMMUv3 anymore (instead re-fetching the metadata) and it can't be used
anyway, because the value isn't copied into page_response_msg.

Thanks,
Jean

> + */
> +struct iommu_fault_event {
> +	enum iommu_fault_type type;
> +	enum iommu_fault_reason reason;
> +	u64 addr;
> +	u32 pasid;
> +	u32 page_req_group_id;
> +	u32 last_req : 1;
> +	u32 pasid_valid : 1;
> +	u32 prot;
> +	u64 device_private;
> +	u64 iommu_private;
> +};