Re: IOVA allocation dependency between firmware buffer and remaining buffers
Hi Robin, On 24.09.2020 13:06, Robin Murphy wrote: > On 2020-09-24 11:47, Marek Szyprowski wrote: >> On 24.09.2020 12:40, Robin Murphy wrote: >>> On 2020-09-24 11:16, Thierry Reding wrote: On Thu, Sep 24, 2020 at 10:46:46AM +0200, Marek Szyprowski wrote: > On 24.09.2020 10:28, Joerg Roedel wrote: >> On Wed, Sep 23, 2020 at 08:48:26AM +0200, Marek Szyprowski wrote: >>> It allows to remap given buffer at the specific IOVA address, >>> although >>> it doesn't guarantee that those specific addresses won't be later >>> used >>> by the IOVA allocator. Probably it would make sense to add an >>> API for >>> generic IOMMU-DMA framework to mark the given IOVA range as >>> reserved/unused to protect them. >> There is an API for that, the IOMMU driver can return IOVA reserved >> regions per device and the IOMMU core code will take care of mapping >> these regions and reserving them in the IOVA allocator, so that >> DMA-IOMMU code will not use it for allocations. >> >> Have a look at the iommu_ops->get_resv_regions() and >> iommu_ops->put_resv_regions(). > > I know about the reserved regions IOMMU API, but the main problem > here, > in case of Exynos, is that those reserved regions won't be created by > the IOMMU driver but by the IOMMU client device. It is just a result > how > the media drivers manages their IOVA space. They simply have to load > firmware at the IOVA address lower than the any address of the used > buffers. I've been working on adding a way to automatically add direct mappings using reserved-memory regions parsed from device tree, see: https://lore.kernel.org/lkml/2020090413.691933-1-thierry.red...@gmail.com/ Perhaps this can be of use? With that you should be able to add a reserved-memory region somewhere in the lower range that you need for firmware images and have that automatically added as a direct mapping so that it won't be reused later on for dynamic allocations. 
>>>
>>> It can't easily be a *direct* mapping though - if the driver has to
>>> use the DMA masks to ensure that everything stays within the
>>> addressable range, then (as far as I'm aware) there's no physical
>>> memory that low down to equal the DMA addresses.
>>>
>>> TBH I'm not convinced that this is a sufficiently common concern to
>>> justify new APIs, or even to try to make overly generic. I think just
>>> implementing a new DMA attribute to say "please allocate/map this
>>> particular request at the lowest DMA address possible" would be good
>>> enough. Such a thing could also serve PCI drivers that actually care
>>> about SAC/DAC to give us more of a chance of removing the "try a
>>> 32-bit mask first" trick from everyone's hotpath...
>>
>> Hmm, I like the idea of such DMA attribute! It should make things really
>> simple, especially in the drivers. Thanks for the great idea! I will try
>> to implement it then instead of the workarounds I've proposed in
>> s5p-mfc/exynos4-is drivers.
>
> Right, I think it's fair to draw a line and say that anyone who wants
> a *specific* address needs to manage their own IOMMU domain.
>
> In the backend I suspect it's going to be cleanest to implement a
> dedicated iova_alloc_low() (or similar) function in the IOVA API that
> sidesteps all of the existing allocation paths and goes straight to
> the rbtree.

Just for the record - I've implemented this approach here: https://lore.kernel.org/lkml/20200925141218.13550-1-m.szyprow...@samsung.com/

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/arm-smmu-v3: Add rmb after reading event queue prod_reg
In arm_smmu_evtq_thread(), reads of the event queue start from the consumer pointer, which has no address dependency on the producer pointer; prod_reg (MMIO) and the event queue memory (Normal memory) can therefore be reordered, so the loads from the event queue can be issued before the load of prod_reg and a stale event entry value may be observed. Add an rmb() to make sure the correct event queue entry value is read.

Signed-off-by: Zhou Wang
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c192544..93c9077 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -819,6 +819,9 @@ static int queue_sync_prod_in(struct arm_smmu_queue *q)
 	int ret = 0;
 	u32 prod = readl_relaxed(q->prod_reg);
 
+	/* Ensure that reading event queue is after reading prod_reg */
+	rmb();
+
 	if (Q_OVF(prod) != Q_OVF(q->llq.prod))
 		ret = -EOVERFLOW;
--
2.8.1
Re: [PATCHv5 5/6] iommu: arm-smmu-impl: Use table to list QCOM implementations
On 2020-09-23 20:54, Robin Murphy wrote:

On 2020-09-22 07:18, Sai Prakash Ranjan wrote:

Use a table and of_match_node() to match the qcom implementation instead of multiple of_device_is_compatible() calls for each QCOM SMMU implementation.

Signed-off-by: Sai Prakash Ranjan
---
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index d199b4bff15d..ce78295cfa78 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -9,6 +9,13 @@
 
 #include "arm-smmu.h"
 
+static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
+	{ .compatible = "qcom,sc7180-smmu-500" },
+	{ .compatible = "qcom,sdm845-smmu-500" },
+	{ .compatible = "qcom,sm8150-smmu-500" },
+	{ .compatible = "qcom,sm8250-smmu-500" },
+	{ }
+};

Can you push the table itself into arm-smmu-qcom? That way you'll be free to add new SoCs willy-nilly without any possibility of conflicting with anything else. Bonus points if you can fold in the Adreno variant and keep everything together ;)

Sure I can get bonus points :)

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH 05/18] ARM/dma-mapping: Switch to iommu_dma_ops
Hi Robin,

On 20.08.2020 17:08, Robin Murphy wrote:
> With the IOMMU ops now looking much the same shape as iommu_dma_ops,
> switch them out in favour of the iommu-dma library, currently enhanced
> with temporary workarounds that allow it to also sit underneath the
> arch-specific API. With that in place, we can now start converting the
> remaining IOMMU drivers and consumers to work with IOMMU API default
> domains instead.
>
> Signed-off-by: Robin Murphy

I've played a bit longer with this and found that reading the kernel virtual address of buffers allocated via dma_alloc_attrs() from the dma-iommu ops returns garbage from time to time. It took me a while to debug this...

Your conversion misses adding arch_dma_prep_coherent() for arch/arm, so the buffers are cleared by the mm allocator, but the caches are NOT flushed for the newly allocated buffers. This fixes the issue:

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index fec3e59215b8..8b60bcc5b14f 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -2,6 +2,7 @@
 config ARM
 	bool
 	default y
+	select ARCH_HAS_DMA_PREP_COHERENT
 	select ARCH_32BIT_OFF_T
 	select ARCH_HAS_BINFMT_FLAT
 	select ARCH_HAS_DEBUG_VIRTUAL if MMU
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ff6c4962161a..6954681b73da 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -266,6 +266,20 @@ static void __dma_clear_buffer(struct page *page, size_t size, int coherent_flag
 	}
 }
 
+void arch_dma_prep_coherent(struct page *page, size_t size)
+{
+	if (PageHighMem(page)) {
+		phys_addr_t base = __pfn_to_phys(page_to_pfn(page));
+		phys_addr_t end = base + size;
+		outer_flush_range(base, end);
+	} else {
+		void *ptr = page_address(page);
+		dmac_flush_range(ptr, ptr + size);
+		outer_flush_range(__pa(ptr), __pa(ptr) + size);
+	}
+}
+
 /*
  * Allocate a DMA buffer for 'dev' of size 'size' using the
  * specified gfp mask. Note that 'size' must be page aligned.
I also wonder if it would be better to use the per-arch __dma_clear_buffer() instead of setting __GFP_ZERO unconditionally in dma-iommu.c. This should be faster on ARM with highmem...

> ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland
[PATCH v3 1/2] iommu/iova: Retry from last rb tree node if iova search fails
From: Vijayanand Jitta

Whenever a new iova alloc request comes in, the iova is always searched from the cached node and the nodes previous to the cached node. So even if there is free iova space available in the nodes next to the cached node, iova allocation can still fail because of this approach.

Consider the following sequence of iova allocs and frees on a 1GB iova space:

1) alloc - 500MB
2) alloc - 12MB
3) alloc - 499MB
4) free - 12MB which was allocated in step 2
5) alloc - 13MB

After the above sequence we will have 12MB of free iova space, and the cached node will be pointing to the iova pfn of the last alloc of 13MB, which will be the lowest iova pfn of that iova space. Now if we get an alloc request of 2MB, we just search from the cached node and then look for lower iova pfns for free iova, and as there aren't any, the iova alloc fails even though there is 12MB of free iova space.

To avoid such iova search failures, retry from the last rb tree node when the iova search fails: this will search the entire tree and get an iova if it is available.
Signed-off-by: Vijayanand Jitta
---
 drivers/iommu/iova.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 30d969a..acc8bee 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -184,8 +184,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	struct rb_node *curr, *prev;
 	struct iova *curr_iova;
 	unsigned long flags;
-	unsigned long new_pfn;
+	unsigned long new_pfn, low_pfn_new;
 	unsigned long align_mask = ~0UL;
+	unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
 
 	if (size_aligned)
 		align_mask <<= fls_long(size - 1);
@@ -198,15 +199,25 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	curr = __get_cached_rbnode(iovad, limit_pfn);
 	curr_iova = rb_entry(curr, struct iova, node);
+	low_pfn_new = curr_iova->pfn_hi + 1;
+
+retry:
 	do {
-		limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
-		new_pfn = (limit_pfn - size) & align_mask;
+		high_pfn = min(high_pfn, curr_iova->pfn_lo);
+		new_pfn = (high_pfn - size) & align_mask;
 		prev = curr;
 		curr = rb_prev(curr);
 		curr_iova = rb_entry(curr, struct iova, node);
-	} while (curr && new_pfn <= curr_iova->pfn_hi);
-
-	if (limit_pfn < size || new_pfn < iovad->start_pfn) {
+	} while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
+
+	if (high_pfn < size || new_pfn < low_pfn) {
+		if (low_pfn == iovad->start_pfn && low_pfn_new < limit_pfn) {
+			high_pfn = limit_pfn;
+			low_pfn = low_pfn_new;
+			curr = &iovad->anchor.node;
+			curr_iova = rb_entry(curr, struct iova, node);
+			goto retry;
+		}
 		iovad->max32_alloc_size = size;
 		goto iova32_full;
 	}
--
2.7.4

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
[PATCH v3 2/2] iommu/iova: Free global iova rcache on iova alloc failure
From: Vijayanand Jitta

Whenever an iova alloc request fails, we free the iova ranges present in the percpu iova rcaches and then retry, but the global iova rcache is not freed. As a result we could still see iova alloc failures even after the retry, as the global rcache is holding on to iovas, which can cause fragmentation. So, free the global iova rcache as well before going for the retry.

Signed-off-by: Vijayanand Jitta
---
 drivers/iommu/iova.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index acc8bee..b4f04fd 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -442,6 +442,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 		flush_rcache = false;
 		for_each_online_cpu(cpu)
 			free_cpu_cached_iovas(cpu, iovad);
+		free_global_cached_iovas(iovad);
 		goto retry;
 	}
 
@@ -1057,5 +1058,26 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
 	}
 }
 
+/*
+ * free all the IOVA ranges of global cache
+ */
+static void free_global_cached_iovas(struct iova_domain *iovad)
+{
+	struct iova_rcache *rcache;
+	unsigned long flags;
+	int i, j;
+
+	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+		rcache = &iovad->rcaches[i];
+		spin_lock_irqsave(&rcache->lock, flags);
+		for (j = 0; j < rcache->depot_size; ++j) {
+			iova_magazine_free_pfns(rcache->depot[j], iovad);
+			iova_magazine_free(rcache->depot[j]);
+			rcache->depot[j] = NULL;
+		}
+		rcache->depot_size = 0;
+		spin_unlock_irqrestore(&rcache->lock, flags);
+	}
+}
+
 MODULE_AUTHOR("Anil S Keshavamurthy ");
 MODULE_LICENSE("GPL");
--
2.7.4

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCHv5 4/6] drm/msm/a6xx: Add support for using system cache(LLC)
Hi Jordan, On 2020-09-23 20:33, Jordan Crouse wrote: On Tue, Sep 22, 2020 at 11:48:17AM +0530, Sai Prakash Ranjan wrote: From: Sharat Masetty The last level system cache can be partitioned to 32 different slices of which GPU has two slices preallocated. One slice is used for caching GPU buffers and the other slice is used for caching the GPU SMMU pagetables. This talks to the core system cache driver to acquire the slice handles, configure the SCID's to those slices and activates and deactivates the slices upon GPU power collapse and restore. Some support from the IOMMU driver is also needed to make use of the system cache to set the right TCR attributes. GPU then has the ability to override a few cacheability parameters which it does to override write-allocate to write-no-allocate as the GPU hardware does not benefit much from it. DOMAIN_ATTR_SYS_CACHE is another domain level attribute used by the IOMMU driver to set the right attributes to cache the hardware pagetables into the system cache. 
Signed-off-by: Sharat Masetty [saiprakash.ranjan: fix to set attr before device attach to iommu and rebase] Signed-off-by: Sai Prakash Ranjan --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 83 + drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c | 17 + 3 files changed, 104 insertions(+) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 8915882e..151190ff62f7 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -8,7 +8,9 @@ #include "a6xx_gpu.h" #include "a6xx_gmu.xml.h" +#include #include +#include #define GPU_PAS_ID 13 @@ -1022,6 +1024,79 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu) return IRQ_HANDLED; } +static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask, u32 or) +{ + return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or); +} + +static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 value) +{ + return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2)); +} + +static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu) +{ + llcc_slice_deactivate(a6xx_gpu->llc_slice); + llcc_slice_deactivate(a6xx_gpu->htw_llc_slice); +} + +static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu) +{ + u32 cntl1_regval = 0; + + if (IS_ERR(a6xx_gpu->llc_mmio)) + return; + + if (!llcc_slice_activate(a6xx_gpu->llc_slice)) { + u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice); + + gpu_scid &= 0x1f; + cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << 10) | + (gpu_scid << 15) | (gpu_scid << 20); + } + + if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) { + u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice); + + gpuhtw_scid &= 0x1f; + cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid); + } + + if (cntl1_regval) { + /* +* Program the slice IDs for the various GPU blocks and GPU MMU +* pagetables +*/ + a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1, cntl1_regval); + + /* +* Program 
cacheability overrides to not allocate cache lines on +* a write miss +*/ + a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 0xF, 0x03); + } +} This code has been around long enough that it pre-dates a650. On a650 and other MMU-500 targets the htw_llc is configured by the firmware and the llc_slice is configured in a different register. I don't think we need to pause everything and add support for the MMU-500 path, but we do need a way to disallow LLCC on affected targets until such time that we can get it fixed up. Thanks for taking a close look, does something like below look ok or something else is needed here? + /* Till the time we get in LLCC support for A650 */ + if (!(info && info->revn == 650)) + a6xx_llc_slices_init(pdev, a6xx_gpu); Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v4 0/7] Convert the intel iommu driver to the dma-iommu api
On 27/09/2020 07:34, Lu Baolu wrote:

Hi,

The previous post of this series could be found here.

https://lore.kernel.org/linux-iommu/20200912032200.11489-1-baolu...@linux.intel.com/

This version introduce a new patch [4/7] to fix an issue reported here.

https://lore.kernel.org/linux-iommu/51a1baec-48d1-c0ac-181b-1fba92aa4...@linux.intel.com/

There aren't any other changes. Please help to test and review.

Best regards,
baolu

Lu Baolu (3):
  iommu: Add quirk for Intel graphic devices in map_sg

Since I do have patches to fix i915 to handle this, do we want to co-ordinate the two and avoid having to add this quirk and then later remove it? Or you want to go the staged approach?

Regards,
Tvrtko

  iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev
  iommu/vt-d: Cleanup after converting to dma-iommu ops

Tom Murphy (4):
  iommu: Handle freelists when using deferred flushing in iommu drivers
  iommu: Add iommu_dma_free_cpu_cached_iovas()
  iommu: Allow the dma-iommu api to use bounce buffers
  iommu/vt-d: Convert intel iommu driver to the iommu ops

 .../admin-guide/kernel-parameters.txt |   5 -
 drivers/iommu/dma-iommu.c             | 228 -
 drivers/iommu/intel/Kconfig           |   1 +
 drivers/iommu/intel/iommu.c           | 901 +++---
 include/linux/dma-iommu.h             |   8 +
 include/linux/iommu.h                 |   1 +
 6 files changed, 336 insertions(+), 808 deletions(-)
Re: [patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
On Sat, Sep 26 2020 at 14:38, Vasily Gorbik wrote:
> On Fri, Sep 25, 2020 at 09:54:52AM -0400, Qian Cai wrote:
> Yes, as well as on mips and sparc which also don't FORCE_PCI.
> This seems to work for s390:
>
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index b0b7acf07eb8..41136fbe909b 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -192,3 +192,3 @@ config S390
>  	select PCI_MSI if PCI
> -	select PCI_MSI_ARCH_FALLBACKS
> +	select PCI_MSI_ARCH_FALLBACKS if PCI
>  	select SET_FS

lemme fix that for all of them ...
Re: [PATCH v2 2/2] iommu/iova: Free global iova rcache on iova alloc failure
On 9/18/2020 8:11 PM, Robin Murphy wrote: > On 2020-08-20 13:49, vji...@codeaurora.org wrote: >> From: Vijayanand Jitta >> >> When ever an iova alloc request fails we free the iova >> ranges present in the percpu iova rcaches and then retry >> but the global iova rcache is not freed as a result we could >> still see iova alloc failure even after retry as global >> rcache is holding the iova's which can cause fragmentation. >> So, free the global iova rcache as well and then go for the >> retry. >> >> Signed-off-by: Vijayanand Jitta >> --- >> drivers/iommu/iova.c | 23 +++ >> include/linux/iova.h | 6 ++ >> 2 files changed, 29 insertions(+) >> >> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c >> index 4e77116..5836c87 100644 >> --- a/drivers/iommu/iova.c >> +++ b/drivers/iommu/iova.c >> @@ -442,6 +442,7 @@ struct iova *find_iova(struct iova_domain *iovad, >> unsigned long pfn) >> flush_rcache = false; >> for_each_online_cpu(cpu) >> free_cpu_cached_iovas(cpu, iovad); >> + free_global_cached_iovas(iovad); >> goto retry; >> } >> @@ -1055,5 +1056,27 @@ void free_cpu_cached_iovas(unsigned int cpu, >> struct iova_domain *iovad) >> } >> } >> +/* >> + * free all the IOVA ranges of global cache >> + */ >> +void free_global_cached_iovas(struct iova_domain *iovad) > > As John pointed out last time, this should be static and the header > changes dropped. > > (TBH we should probably register our own hotplug notifier instance for a > flush queue, so that external code has no need to poke at the per-CPU > caches either) > > Robin. > Right, I have made it static and dropped header changes in v3. can you please review that. 
Thanks, Vijay >> +{ >> + struct iova_rcache *rcache; >> + unsigned long flags; >> + int i, j; >> + >> + for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) { >> + rcache = &iovad->rcaches[i]; >> + spin_lock_irqsave(&rcache->lock, flags); >> + for (j = 0; j < rcache->depot_size; ++j) { >> + iova_magazine_free_pfns(rcache->depot[j], iovad); >> + iova_magazine_free(rcache->depot[j]); >> + rcache->depot[j] = NULL; >> + } >> + rcache->depot_size = 0; >> + spin_unlock_irqrestore(&rcache->lock, flags); >> + } >> +} >> + >> MODULE_AUTHOR("Anil S Keshavamurthy "); >> MODULE_LICENSE("GPL"); >> diff --git a/include/linux/iova.h b/include/linux/iova.h >> index a0637ab..a905726 100644 >> --- a/include/linux/iova.h >> +++ b/include/linux/iova.h >> @@ -163,6 +163,7 @@ int init_iova_flush_queue(struct iova_domain *iovad, >> struct iova *split_and_remove_iova(struct iova_domain *iovad, >> struct iova *iova, unsigned long pfn_lo, unsigned long pfn_hi); >> void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain >> *iovad); >> +void free_global_cached_iovas(struct iova_domain *iovad); >> #else >> static inline int iova_cache_get(void) >> { >> @@ -270,6 +271,11 @@ static inline void free_cpu_cached_iovas(unsigned >> int cpu, >> struct iova_domain *iovad) >> { >> } >> + >> +static inline void free_global_cached_iovas(struct iova_domain *iovad) >> +{ >> +} >> + >> #endif >> #endif >> -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCHv5 4/6] drm/msm/a6xx: Add support for using system cache(LLC)
On Mon, Sep 28, 2020 at 05:56:55PM +0530, Sai Prakash Ranjan wrote: > Hi Jordan, > > On 2020-09-23 20:33, Jordan Crouse wrote: > >On Tue, Sep 22, 2020 at 11:48:17AM +0530, Sai Prakash Ranjan wrote: > >>From: Sharat Masetty > >> > >>The last level system cache can be partitioned to 32 different > >>slices of which GPU has two slices preallocated. One slice is > >>used for caching GPU buffers and the other slice is used for > >>caching the GPU SMMU pagetables. This talks to the core system > >>cache driver to acquire the slice handles, configure the SCID's > >>to those slices and activates and deactivates the slices upon > >>GPU power collapse and restore. > >> > >>Some support from the IOMMU driver is also needed to make use > >>of the system cache to set the right TCR attributes. GPU then > >>has the ability to override a few cacheability parameters which > >>it does to override write-allocate to write-no-allocate as the > >>GPU hardware does not benefit much from it. > >> > >>DOMAIN_ATTR_SYS_CACHE is another domain level attribute used by the > >>IOMMU driver to set the right attributes to cache the hardware > >>pagetables into the system cache. 
> >> > >>Signed-off-by: Sharat Masetty > >>[saiprakash.ranjan: fix to set attr before device attach to iommu and > >>rebase] > >>Signed-off-by: Sai Prakash Ranjan > >>--- > >> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 83 + > >> drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 4 ++ > >> drivers/gpu/drm/msm/adreno/adreno_gpu.c | 17 + > >> 3 files changed, 104 insertions(+) > >> > >>diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > >>b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > >>index 8915882e..151190ff62f7 100644 > >>--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > >>+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > >>@@ -8,7 +8,9 @@ > >> #include "a6xx_gpu.h" > >> #include "a6xx_gmu.xml.h" > >> > >>+#include > >> #include > >>+#include > >> > >> #define GPU_PAS_ID 13 > >> > >>@@ -1022,6 +1024,79 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu) > >>return IRQ_HANDLED; > >> } > >> > >>+static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask, > >>u32 or) > >>+{ > >>+ return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or); > >>+} > >>+ > >>+static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 > >>value) > >>+{ > >>+ return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2)); > >>+} > >>+ > >>+static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu) > >>+{ > >>+ llcc_slice_deactivate(a6xx_gpu->llc_slice); > >>+ llcc_slice_deactivate(a6xx_gpu->htw_llc_slice); > >>+} > >>+ > >>+static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu) > >>+{ > >>+ u32 cntl1_regval = 0; > >>+ > >>+ if (IS_ERR(a6xx_gpu->llc_mmio)) > >>+ return; > >>+ > >>+ if (!llcc_slice_activate(a6xx_gpu->llc_slice)) { > >>+ u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice); > >>+ > >>+ gpu_scid &= 0x1f; > >>+ cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << > >>10) | > >>+ (gpu_scid << 15) | (gpu_scid << 20); > >>+ } > >>+ > >>+ if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) { > >>+ u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice); > >>+ > >>+ 
gpuhtw_scid &= 0x1f; > >>+ cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid); > >>+ } > >>+ > >>+ if (cntl1_regval) { > >>+ /* > >>+* Program the slice IDs for the various GPU blocks and GPU MMU > >>+* pagetables > >>+*/ > >>+ a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1, > >>cntl1_regval); > >>+ > >>+ /* > >>+* Program cacheability overrides to not allocate cache lines on > >>+* a write miss > >>+*/ > >>+ a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, > >>0xF, > >>0x03); > >>+ } > >>+} > > > >This code has been around long enough that it pre-dates a650. On a650 and > >other > >MMU-500 targets the htw_llc is configured by the firmware and the > >llc_slice is > >configured in a different register. > > > >I don't think we need to pause everything and add support for the MMU-500 > >path, > >but we do need a way to disallow LLCC on affected targets until such time > >that > >we can get it fixed up. > > > > Thanks for taking a close look, does something like below look ok or > something > else is needed here? > > + /* Till the time we get in LLCC support for A650 */ > + if (!(info && info->revn == 650)) > + a6xx_llc_slices_init(pdev, a6xx_gpu); It doesn't look like Rob picked this up for 5.10, so we have some time to do it right. Would you like me to give you an add-on patch for mmu-500 targets? Jordan > Thanks, > Sai > > -- > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member > of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v2 1/3] iommu/io-pgtable-arm: Support coherency for Mali LPAE
On Tue, Sep 22, 2020 at 03:16:48PM +0100, Robin Murphy wrote: > Midgard GPUs have ACE-Lite master interfaces which allows systems to > integrate them in an I/O-coherent manner. It seems that from the GPU's > viewpoint, the rest of the system is its outer shareable domain, and so > even when snoop signals are wired up, they are only emitted for outer > shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does > indeed get coherent pagetable walks working nicely for the coherent > T620 in the Arm Juno SoC. > > Reviewed-by: Steven Price > Tested-by: Neil Armstrong > Signed-off-by: Robin Murphy > --- > drivers/iommu/io-pgtable-arm.c | 11 ++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c > index dc7bcf858b6d..b4072a18e45d 100644 > --- a/drivers/iommu/io-pgtable-arm.c > +++ b/drivers/iommu/io-pgtable-arm.c > @@ -440,7 +440,13 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct > arm_lpae_io_pgtable *data, > << ARM_LPAE_PTE_ATTRINDX_SHIFT); > } > > - if (prot & IOMMU_CACHE) > + /* > + * Also Mali has its own notions of shareability wherein its Inner > + * domain covers the cores within the GPU, and its Outer domain is > + * "outside the GPU" (i.e. either the Inner or System domain in CPU > + * terms, depending on coherency). > + */ > + if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE) > pte |= ARM_LPAE_PTE_SH_IS; > else > pte |= ARM_LPAE_PTE_SH_OS; > @@ -1049,6 +1055,9 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, > void *cookie) > cfg->arm_mali_lpae_cfg.transtab = virt_to_phys(data->pgd) | > ARM_MALI_LPAE_TTBR_READ_INNER | > ARM_MALI_LPAE_TTBR_ADRMODE_TABLE; > + if (cfg->coherent_walk) > + cfg->arm_mali_lpae_cfg.transtab |= > ARM_MALI_LPAE_TTBR_SHARE_OUTER; > + Acked-by: Will Deacon I'm assuming I'm not the right person to merge this, and it needs to go alongside the other patches in this series. 
Will
Re: [PATCHv5 4/6] drm/msm/a6xx: Add support for using system cache(LLC)
On 2020-09-28 21:41, Jordan Crouse wrote: On Mon, Sep 28, 2020 at 05:56:55PM +0530, Sai Prakash Ranjan wrote: Hi Jordan, On 2020-09-23 20:33, Jordan Crouse wrote: >On Tue, Sep 22, 2020 at 11:48:17AM +0530, Sai Prakash Ranjan wrote: >>From: Sharat Masetty >> >>The last level system cache can be partitioned to 32 different >>slices of which GPU has two slices preallocated. One slice is >>used for caching GPU buffers and the other slice is used for >>caching the GPU SMMU pagetables. This talks to the core system >>cache driver to acquire the slice handles, configure the SCID's >>to those slices and activates and deactivates the slices upon >>GPU power collapse and restore. >> >>Some support from the IOMMU driver is also needed to make use >>of the system cache to set the right TCR attributes. GPU then >>has the ability to override a few cacheability parameters which >>it does to override write-allocate to write-no-allocate as the >>GPU hardware does not benefit much from it. >> >>DOMAIN_ATTR_SYS_CACHE is another domain level attribute used by the >>IOMMU driver to set the right attributes to cache the hardware >>pagetables into the system cache. 
>> >>Signed-off-by: Sharat Masetty >>[saiprakash.ranjan: fix to set attr before device attach to iommu and >>rebase] >>Signed-off-by: Sai Prakash Ranjan >>--- >> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 83 + >> drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 4 ++ >> drivers/gpu/drm/msm/adreno/adreno_gpu.c | 17 + >> 3 files changed, 104 insertions(+) >> >>diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c >>b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c >>index 8915882e..151190ff62f7 100644 >>--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c >>+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c >>@@ -8,7 +8,9 @@ >> #include "a6xx_gpu.h" >> #include "a6xx_gmu.xml.h" >> >>+#include >> #include >>+#include >> >> #define GPU_PAS_ID 13 >> >>@@ -1022,6 +1024,79 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu) >>return IRQ_HANDLED; >> } >> >>+static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask, >>u32 or) >>+{ >>+ return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or); >>+} >>+ >>+static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 >>value) >>+{ >>+ return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2)); >>+} >>+ >>+static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu) >>+{ >>+ llcc_slice_deactivate(a6xx_gpu->llc_slice); >>+ llcc_slice_deactivate(a6xx_gpu->htw_llc_slice); >>+} >>+ >>+static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu) >>+{ >>+ u32 cntl1_regval = 0; >>+ >>+ if (IS_ERR(a6xx_gpu->llc_mmio)) >>+ return; >>+ >>+ if (!llcc_slice_activate(a6xx_gpu->llc_slice)) { >>+ u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice); >>+ >>+ gpu_scid &= 0x1f; >>+ cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << 10) | >>+ (gpu_scid << 15) | (gpu_scid << 20); >>+ } >>+ >>+ if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) { >>+ u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice); >>+ >>+ gpuhtw_scid &= 0x1f; >>+ cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid); >>+ } >>+ >>+ if (cntl1_regval) { >>+ /* >>+* Program the slice 
IDs for the various GPU blocks and GPU MMU >>+* pagetables >>+*/ >>+ a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1, >>cntl1_regval); >>+ >>+ /* >>+* Program cacheability overrides to not allocate cache lines on >>+* a write miss >>+*/ >>+ a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 0xF, >>0x03); >>+ } >>+} > >This code has been around long enough that it pre-dates a650. On a650 and >other >MMU-500 targets the htw_llc is configured by the firmware and the >llc_slice is >configured in a different register. > >I don't think we need to pause everything and add support for the MMU-500 >path, >but we do need a way to disallow LLCC on affected targets until such time >that >we can get it fixed up. > Thanks for taking a close look. Does something like the below look OK, or is something else needed here? + /* Until we get LLCC support in for A650 */ + if (!(info && info->revn == 650)) + a6xx_llc_slices_init(pdev, a6xx_gpu); It doesn't look like Rob picked this up for 5.10, so we have some time to do it right. Would you like me to give you an add-on patch for MMU-500 targets? Yes, that will be great. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list
Re: [PATCH 1/5] iommu/tegra-smmu: Unwrap tegra_smmu_group_get
On Sat, Sep 26, 2020 at 01:07:15AM -0700, Nicolin Chen wrote: > The tegra_smmu_group_get was added to group devices in different > SWGROUPs and it'd return a NULL group pointer upon a mismatch at > tegra_smmu_find_group(), so for most clients/devices, it very > likely would mismatch and need a fallback generic_device_group(). > > But now tegra_smmu_group_get handles devices in the same SWGROUP too, > which means that it would allocate a group for every new SWGROUP > or would directly return an existing one upon matching a SWGROUP, > i.e. any device will go through this function. > > So the possibility of having a NULL group pointer in device_group() > is upon failure of either devm_kzalloc() or iommu_group_alloc(). > In either case, calling generic_device_group() no longer makes > sense. Especially in the devm_kzalloc() failure case, it'd cause a > problem if it fails at devm_kzalloc() yet succeeds at a fallback > generic_device_group(), because it does not create a group->list > for other devices to match. > > This patch simply unwraps the function to clean it up. > > Signed-off-by: Nicolin Chen > --- > drivers/iommu/tegra-smmu.c | 19 --- > 1 file changed, 4 insertions(+), 15 deletions(-) > > diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c > index 0becdbfea306..6335285dc373 100644 > --- a/drivers/iommu/tegra-smmu.c > +++ b/drivers/iommu/tegra-smmu.c > @@ -903,11 +903,13 @@ static void tegra_smmu_group_release(void *iommu_data) > mutex_unlock(&smmu->lock); > } > > -static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu, > - unsigned int swgroup) > +static struct iommu_group *tegra_smmu_device_group(struct device *dev) > { > + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); > + struct tegra_smmu *smmu = dev_iommu_priv_get(dev); > const struct tegra_smmu_group_soc *soc; > struct tegra_smmu_group *group; > + int swgroup = fwspec->ids[0]; This should be unsigned int to match the type of swgroup elsewhere. 
Also, it might not be worth having an extra local variable for this since it's only used once. Thierry
Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()
On Sat, Sep 26, 2020 at 05:48:17PM +0300, Dmitry Osipenko wrote: > On 26.09.2020 11:07, Nicolin Chen wrote: > ... > > + /* NULL smmu pointer means that SMMU driver is not probed yet */ > > + if (unlikely(!smmu)) > > + return ERR_PTR(-EPROBE_DEFER); > > Hello, Nicolin! > > Please don't pollute code with likely/unlikely. This is not > performance-critical code. > > ... > > -static struct platform_driver tegra_mc_driver = { > > +struct platform_driver tegra_mc_driver = { > > .driver = { > > .name = "tegra-mc", > > .of_match_table = tegra_mc_of_match, > > diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h > > index 1238e35653d1..49a4cf64c4b9 100644 > > --- a/include/soc/tegra/mc.h > > +++ b/include/soc/tegra/mc.h > > @@ -184,4 +184,6 @@ struct tegra_mc { > > int tegra_mc_write_emem_configuration(struct tegra_mc *mc, unsigned long > > rate); > > unsigned int tegra_mc_get_emem_device_count(struct tegra_mc *mc); > > > > +extern struct platform_driver tegra_mc_driver; > > No global variables, please. See this for an example: > > https://elixir.bootlin.com/linux/v5.9-rc6/source/drivers/devfreq/tegra20-devfreq.c#L100 > > tegra_get_memory_controller() is now needed by multiple Tegra > drivers; I think it would be good to have it added into the MC driver > and then made globally available to all drivers by making use of > of_find_matching_node_and_match(). 
> > diff --git a/drivers/memory/tegra/mc.c > index e1db209fd2ea..ed1bd6d00aaf 100644 > --- a/drivers/memory/tegra/mc.c > +++ b/drivers/memory/tegra/mc.c > @@ -43,6 +43,29 @@ static const struct of_device_id tegra_mc_of_match[] = { > }; > MODULE_DEVICE_TABLE(of, tegra_mc_of_match); > > +struct tegra_mc *tegra_get_memory_controller(void) > +{ > + struct platform_device *pdev; > + struct device_node *np; > + struct tegra_mc *mc; > + > + np = of_find_matching_node_and_match(NULL, tegra_mc_of_match, NULL); > + if (!np) > + return ERR_PTR(-ENOENT); > + > + pdev = of_find_device_by_node(np); > + of_node_put(np); > + if (!pdev) > + return ERR_PTR(-ENODEV); > + > + mc = platform_get_drvdata(pdev); > + if (!mc) > + return ERR_PTR(-EPROBE_DEFER); > + > + return mc; > +} > +EXPORT_SYMBOL_GPL(tegra_get_memory_controller); We already have tegra_smmu_find(), which should be enough for this particular use-case. Thierry
Re: [PATCH 4/5] iommu/tegra-smmu: Add PCI support
On Sat, Sep 26, 2020 at 01:07:18AM -0700, Nicolin Chen wrote: > This patch simply adds support for PCI devices. > > Signed-off-by: Nicolin Chen > --- > drivers/iommu/tegra-smmu.c | 17 - > 1 file changed, 16 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c > index 97a7185b4578..9dbc5d7183cc 100644 > --- a/drivers/iommu/tegra-smmu.c > +++ b/drivers/iommu/tegra-smmu.c > @@ -15,6 +15,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -935,6 +936,7 @@ static struct iommu_group *tegra_smmu_device_group(struct > device *dev) > const struct tegra_smmu_group_soc *soc; > struct tegra_smmu_group *group; > int swgroup = fwspec->ids[0]; > + bool pci = dev_is_pci(dev); > struct iommu_group *grp; > > /* Find group_soc associating with swgroup */ > @@ -961,7 +963,7 @@ static struct iommu_group *tegra_smmu_device_group(struct > device *dev) > group->smmu = smmu; > group->soc = soc; > > - group->group = iommu_group_alloc(); > + group->group = pci ? pci_device_group(dev) : iommu_group_alloc(); > if (IS_ERR(group->group)) { > devm_kfree(smmu->dev, group); > mutex_unlock(&smmu->lock); > @@ -1180,6 +1182,19 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev, > return ERR_PTR(err); > } > > +#ifdef CONFIG_PCI > + if (!iommu_present(&pci_bus_type)) { > + pci_request_acs(); > + err = bus_set_iommu(&pci_bus_type, &tegra_smmu_ops); > + if (err < 0) { > + bus_set_iommu(&pci_bus_type, NULL); > + iommu_device_unregister(&smmu->iommu); > + iommu_device_sysfs_remove(&smmu->iommu); > + return ERR_PTR(err); It might be worth factoring out the cleanup code now that there are multiple failures from which we may need to clean up. Also, it'd be great if somehow we could do this without the #ifdef, but I guess since we're using the pci_bus_type global variable directly, there isn't much we can do here? 
Thierry
Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()
On Sat, Sep 26, 2020 at 01:07:17AM -0700, Nicolin Chen wrote: > The tegra_smmu_probe_device() function searches in DT for the iommu > phandle to get the "smmu" pointer. This works for most SMMU clients > that exist in the DTB. But a PCI device will not be added to iommu, > since it doesn't have a DT node. > > Fortunately, for a client with a DT node, tegra_smmu_probe_device() > calls tegra_smmu_of_xlate() via tegra_smmu_configure(), while for a > PCI device, of_pci_iommu_init() in the IOMMU core calls .of_xlate() > as well, even before running tegra_smmu_probe_device(). And in both > cases, tegra_smmu_of_xlate() prepares a valid iommu_fwspec pointer > that allows us to get the mc->smmu pointer via dev_get_drvdata() by > calling driver_find_device_by_fwnode(). > > So this patch uses iommu_fwspec in .probe_device() and related code > for a client that does not exist in the DTB, especially a PCI one. > > Signed-off-by: Nicolin Chen > --- > drivers/iommu/tegra-smmu.c | 89 +++--- > drivers/memory/tegra/mc.c | 2 +- > include/soc/tegra/mc.h | 2 + > 3 files changed, 56 insertions(+), 37 deletions(-) > > diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c > index b10e02073610..97a7185b4578 100644 > --- a/drivers/iommu/tegra-smmu.c > +++ b/drivers/iommu/tegra-smmu.c > @@ -13,6 +13,7 @@ > #include > #include > #include > +#include Why is this needed? I don't see any of the symbols declared in that file used here. 
> #include > > #include > @@ -61,6 +62,8 @@ struct tegra_smmu_as { > u32 attr; > }; > > +static const struct iommu_ops tegra_smmu_ops; > + > static struct tegra_smmu_as *to_smmu_as(struct iommu_domain *dom) > { > return container_of(dom, struct tegra_smmu_as, domain); > @@ -484,60 +487,49 @@ static void tegra_smmu_as_unprepare(struct tegra_smmu > *smmu, > static int tegra_smmu_attach_dev(struct iommu_domain *domain, >struct device *dev) > { > + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); > struct tegra_smmu *smmu = dev_iommu_priv_get(dev); > struct tegra_smmu_as *as = to_smmu_as(domain); > - struct device_node *np = dev->of_node; > - struct of_phandle_args args; > - unsigned int index = 0; > - int err = 0; > - > - while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index, > -)) { > - unsigned int swgroup = args.args[0]; > - > - if (args.np != smmu->dev->of_node) { > - of_node_put(args.np); > - continue; > - } > + int index, err = 0; > > - of_node_put(args.np); > + if (!fwspec || fwspec->ops != _smmu_ops) > + return -ENOENT; > > + for (index = 0; index < fwspec->num_ids; index++) { > err = tegra_smmu_as_prepare(smmu, as); > - if (err < 0) > - return err; > + if (err) > + goto err_disable; I'd personally drop the err_ prefix here because it's pretty obvious that we're going to do this as a result of an error happening. > > - tegra_smmu_enable(smmu, swgroup, as->id); > - index++; > + tegra_smmu_enable(smmu, fwspec->ids[index], as->id); > } > > if (index == 0) > return -ENODEV; > > return 0; > + > +err_disable: > + for (index--; index >= 0; index--) { > + tegra_smmu_disable(smmu, fwspec->ids[index], as->id); > + tegra_smmu_as_unprepare(smmu, as); > + } I think a more idiomatic version of doing this would be: while (index--) { ... 
} > + > + return err; > } > > static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device > *dev) > { > + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); > struct tegra_smmu_as *as = to_smmu_as(domain); > - struct device_node *np = dev->of_node; > struct tegra_smmu *smmu = as->smmu; > - struct of_phandle_args args; > unsigned int index = 0; > > - while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index, > -)) { > - unsigned int swgroup = args.args[0]; > - > - if (args.np != smmu->dev->of_node) { > - of_node_put(args.np); > - continue; > - } > - > - of_node_put(args.np); > + if (!fwspec || fwspec->ops != _smmu_ops) > + return; > > - tegra_smmu_disable(smmu, swgroup, as->id); > + for (index = 0; index < fwspec->num_ids; index++) { > + tegra_smmu_disable(smmu, fwspec->ids[index], as->id); > tegra_smmu_as_unprepare(smmu, as); > - index++; > } > } > > @@ -845,10 +837,25 @@ static int tegra_smmu_configure(struct tegra_smmu > *smmu, struct
Re: [PATCH 1/5] iommu/tegra-smmu: Unwrap tegra_smmu_group_get
Hi Thierry, Thanks for the review. On Mon, Sep 28, 2020 at 09:13:56AM +0200, Thierry Reding wrote: > > -static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu, > > - unsigned int swgroup) > > +static struct iommu_group *tegra_smmu_device_group(struct device *dev) > > { > > + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); > > + struct tegra_smmu *smmu = dev_iommu_priv_get(dev); > > const struct tegra_smmu_group_soc *soc; > > struct tegra_smmu_group *group; > > + int swgroup = fwspec->ids[0]; > > This should be unsigned int to match the type of swgroup elsewhere. > Also, it might not be worth having an extra local variable for this > since it's only used once. Will change to unsigned int. There are actually a few places using it in this function, previously tegra_smmu_group_get().
Re: [PATCH v10 00/13] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)
Hi Will, On Fri, Sep 18, 2020 at 12:18:40PM +0200, Jean-Philippe Brucker wrote: > This is version 10 of the page table sharing support for Arm SMMUv3. > Patch 1 still needs an Ack from mm maintainers. However patches 4-11 do > not depend on it, and could get merged for v5.10 regardless. Are you OK with taking patches 4-11 for v5.10? The rest depends on patch 1 which hasn't been acked yet. It's uncontroversial and I'm sure it will eventually make it. In case it doesn't, we'll keep track of mm->pasid within the IOMMU subsystem instead. Thanks, Jean
Re: [PATCH v12 0/6] IOMMU user API enhancement
Hi Joerg, Just wondering if you will be able to take this for v5.10? There hasn't been any material changes since we last discussed in LPC. We have VFIO and other vSVA patches depending on it. Thanks! Jacob On Fri, 25 Sep 2020 09:32:41 -0700, Jacob Pan wrote: > IOMMU user API header was introduced to support nested DMA translation and > related fault handling. The current UAPI data structures consist of three > areas that cover the interactions between host kernel and guest: > - fault handling > - cache invalidation > - bind guest page tables, i.e. guest PASID > > Future extensions are likely to support more architectures and vIOMMU > features. > > In the previous discussion, using user-filled data size and feature flags > is made a preferred approach over a unified version number. > https://lkml.org/lkml/2020/1/29/45 > > In addition to introduce argsz field to data structures, this patchset is > also trying to document the UAPI design, usage, and extension rules. VT-d > driver changes to utilize the new argsz field is included, VFIO usage is > to follow. > > This set is available at: > https://github.com/jacobpan/linux.git vsva_v5.9_uapi_v12 > > Thanks, > > Jacob > > > Changelog: > v12 > - Removed a redundant check in cache invalidate API > v11 > - Use #define instead of enum in PASID data format, squashed > change into "iommu/uapi: Handle data and argsz filled by users" > - Remove alloc/free from documentation per Yi's comment. IOMMU > UAPI does not perform IOASID alloc/free. > v10 > - Documentation grammar fixes based on Randy's review > v9 > - Directly pass PASID value to iommu_sva_unbind_gpasid() without > the superfluous data in struct iommu_gpasid_bind_data. > v8 > - Rebased to v5.9-rc2 > - Addressed review comments from Eric Auger > 1. added a check for the unused vendor flags > 2. commit message improvements > v7 > - Added PASID data format enum for range checking > - Tidy up based on reviews from Alex W. 
> - Removed doc section for vIOMMU fault handling > v6 > - Renamed all UAPI functions with iommu_uapi_ prefix > - Replaced argsz maxsz checking with flag specific size checks > - Documentation improvements based on suggestions by Eric Auger > Replaced example code with a pointer to the actual code > - Added more checks for illegal flags combinations > - Added doc file to MAINTAINERS > v5 > - Adjusted paddings in UAPI data to be 8 byte aligned > - Do not clobber argsz in IOMMU core before passing on to vendor > driver > - Removed pr_warn_ for invalid UAPI data check, just return > -EINVAL > - Clarified VFIO responsibility in UAPI data handling > - Use iommu_uapi prefix to differentiate APIs that have in-kernel callers > - Added comment for unchecked flags of invalidation granularity > - Added example in doc to show vendor data checking > > v4 > - Added checks of UAPI data for reserved fields, version, and > flags. > - Removed version check from vendor driver (vt-d) > - Relaxed argsz check to match the UAPI struct size instead of > variable union size > - Updated documentation > > v3: > - Rewrote backward compatibility rule to support existing code > re-compiled with newer kernel UAPI header that runs on older > kernel. Based on review comment from Alex W. > https://lore.kernel.org/linux-iommu/20200611094741.6d118...@w520.home/ > - Take user pointer directly in UAPI functions. Perform argsz > check and copy_from_user() in IOMMU driver. Eliminate the need for > VFIO or other upper layer to parse IOMMU data. 
> - Create wrapper function for in-kernel users of UAPI functions > v2: > - Removed unified API version and helper > - Introduced argsz for each UAPI data > - Introduced UAPI doc > > > Jacob Pan (6): > docs: IOMMU user API > iommu/uapi: Add argsz for user filled data > iommu/uapi: Use named union for user data > iommu/uapi: Rename uapi functions > iommu/uapi: Handle data and argsz filled by users > iommu/vt-d: Check UAPI data processed by IOMMU core > > Documentation/userspace-api/iommu.rst | 209 ++ > MAINTAINERS | 1 + > drivers/iommu/intel/iommu.c | 25 ++-- > drivers/iommu/intel/svm.c | 13 ++- > drivers/iommu/iommu.c | 196 +-- > include/linux/iommu.h | 35 -- > include/uapi/linux/iommu.h | 18 ++- > 7 files changed, 456 insertions(+), 41 deletions(-) > create mode 100644 Documentation/userspace-api/iommu.rst > Thanks, Jacob
Re: [PATCH v10 00/13] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)
Hi Jean-Philippe, On Mon, Sep 28, 2020 at 06:47:31PM +0200, Jean-Philippe Brucker wrote: > On Fri, Sep 18, 2020 at 12:18:40PM +0200, Jean-Philippe Brucker wrote: > > This is version 10 of the page table sharing support for Arm SMMUv3. > > Patch 1 still needs an Ack from mm maintainers. However patches 4-11 do > > not depend on it, and could get merged for v5.10 regardless. > > Are you OK with taking patches 4-11 for v5.10? > > The rest depends on patch 1 which hasn't been acked yet. It's > uncontroversial and I'm sure it will eventually make it. In case it > doesn't, we'll keep track of mm->pasid within the IOMMU subsystem instead. I was off most of last week, but I plan to see how much of this I can queue tonight. Stay tuned... Will
[PATCH v3 10/14] iommu/ioasid: Introduce notification APIs
Relations among IOASID users largely follow a publisher-subscriber pattern. E.g. to support guest SVA on Intel Scalable I/O Virtualization (SIOV) enabled platforms, VFIO, IOMMU, device drivers, KVM are all users of IOASIDs. When a state change occurs, VFIO publishes the change event that needs to be processed by other users/subscribers. This patch introduces two types of notifications: global and per ioasid_set. The latter is intended for users who only need to handle events related to the IOASIDs of a given set. For more information, refer to the kernel documentation at Documentation/ioasid.rst. Signed-off-by: Liu Yi L Signed-off-by: Wu Hao Signed-off-by: Jacob Pan --- drivers/iommu/ioasid.c | 141 + include/linux/ioasid.h | 57 +++- 2 files changed, 197 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c index 378fef8f23d9..894b17c06ead 100644 --- a/drivers/iommu/ioasid.c +++ b/drivers/iommu/ioasid.c @@ -10,12 +10,35 @@ #include #include +/* + * An IOASID can have multiple consumers where each consumer may have + * hardware contexts associated with the IOASID. + * When a status change occurs, like on IOASID deallocation, notifier chains + * are used to keep the consumers in sync. + * This is a publisher-subscriber pattern where the publisher can change the + * state of each IOASID, e.g. alloc/free, bind IOASID to a device and mm. + * On the other hand, subscribers get notified of the state change and + * keep local states in sync. 
+ */ +static ATOMIC_NOTIFIER_HEAD(ioasid_notifier); +/* List to hold pending notification block registrations */ +static LIST_HEAD(ioasid_nb_pending_list); +static DEFINE_SPINLOCK(ioasid_nb_lock); + /* Default to PCIe standard 20 bit PASID */ #define PCI_PASID_MAX 0x100000 static ioasid_t ioasid_capacity = PCI_PASID_MAX; static ioasid_t ioasid_capacity_avail = PCI_PASID_MAX; static DEFINE_XARRAY_ALLOC(ioasid_sets); +struct ioasid_set_nb { + struct list_head list; + struct notifier_block *nb; + void *token; + struct ioasid_set *set; + bool active; +}; + enum ioasid_state { IOASID_STATE_INACTIVE, IOASID_STATE_ACTIVE, @@ -365,6 +388,42 @@ void ioasid_detach_data(ioasid_t ioasid) } EXPORT_SYMBOL_GPL(ioasid_detach_data); +/** + * ioasid_notify - Send notification on a given IOASID for status change. + * + * @data: The IOASID data to which the notification will be sent + * @cmd: Notification event sent by IOASID external users, can be + * IOASID_BIND or IOASID_UNBIND. + * + * @flags: Special instructions, e.g. 
notify within a set or global by + * IOASID_NOTIFY_FLAG_SET or IOASID_NOTIFY_FLAG_ALL flags + * Caller must hold ioasid_allocator_lock and reference to the IOASID + */ +static int ioasid_notify(struct ioasid_data *data, +enum ioasid_notify_val cmd, unsigned int flags) +{ + struct ioasid_nb_args args = { 0 }; + int ret = 0; + + /* IOASID_FREE/ALLOC are internal events emitted by IOASID core only */ + if (cmd <= IOASID_NOTIFY_FREE) + return -EINVAL; + + if (flags & ~(IOASID_NOTIFY_FLAG_ALL | IOASID_NOTIFY_FLAG_SET)) + return -EINVAL; + + args.id = data->id; + args.set = data->set; + args.pdata = data->private; + args.spid = data->spid; + if (flags & IOASID_NOTIFY_FLAG_ALL) + ret = atomic_notifier_call_chain(&ioasid_notifier, cmd, &args); + if (flags & IOASID_NOTIFY_FLAG_SET) + ret = atomic_notifier_call_chain(&data->set->nh, cmd, &args); + + return ret; +} + static ioasid_t ioasid_find_by_spid_locked(struct ioasid_set *set, ioasid_t spid) { ioasid_t ioasid = INVALID_IOASID; @@ -417,6 +476,7 @@ int ioasid_attach_spid(ioasid_t ioasid, ioasid_t spid) goto done_unlock; } data->spid = spid; + ioasid_notify(data, IOASID_NOTIFY_BIND, IOASID_NOTIFY_FLAG_SET); done_unlock: spin_unlock(&ioasid_allocator_lock); @@ -436,6 +496,7 @@ void ioasid_detach_spid(ioasid_t ioasid) goto done_unlock; } data->spid = INVALID_IOASID; + ioasid_notify(data, IOASID_NOTIFY_UNBIND, IOASID_NOTIFY_FLAG_SET); done_unlock: spin_unlock(&ioasid_allocator_lock); @@ -469,6 +530,28 @@ static inline bool ioasid_set_is_valid(struct ioasid_set *set) return xa_load(&ioasid_sets, set->id) == set; } +static void ioasid_add_pending_nb(struct ioasid_set *set) +{ + struct ioasid_set_nb *curr; + + if (set->type != IOASID_SET_TYPE_MM) + return; + + /* +* Check if there are any pending nb requests for the given token, if so +* add them to the notifier chain. +*/ + spin_lock(&ioasid_nb_lock); + list_for_each_entry(curr, &ioasid_nb_pending_list, list) { + if (curr->token == set->token
[PATCH v3 01/14] docs: Document IO Address Space ID (IOASID) APIs
IOASID is used to identify address spaces that can be targeted by device DMA. It is a system-wide resource that is essential to its many users. This document is an attempt to help developers from all vendors navigate the APIs. At this time, ARM SMMU and Intel’s Scalable IO Virtualization (SIOV) enabled platforms are the primary users of IOASID. Examples of how SIOV components interact with IOASID APIs are provided, as many of the APIs are driven by SIOV requirements. Cc: Jonathan Corbet Cc: linux-...@vger.kernel.org Cc: Randy Dunlap Signed-off-by: Liu Yi L Signed-off-by: Wu Hao Signed-off-by: Jacob Pan --- Documentation/driver-api/ioasid.rst | 648 1 file changed, 648 insertions(+) create mode 100644 Documentation/driver-api/ioasid.rst diff --git a/Documentation/driver-api/ioasid.rst b/Documentation/driver-api/ioasid.rst new file mode 100644 index ..7f8e702997ab --- /dev/null +++ b/Documentation/driver-api/ioasid.rst @@ -0,0 +1,648 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. ioasid: + +=== +IO Address Space ID +=== + +IOASID is a generic name for PCIe Process Address ID (PASID) or ARM +SMMU SubstreamID. An IOASID identifies an address space that DMA +requests can target. + +The primary use cases for IOASID are Shared Virtual Address (SVA) and +multiple IOVA spaces per device. However, the requirements for IOASID +management can vary among hardware architectures. + +For baremetal IOVA, IOASID #0 is used for DMA requests without +PASID, even though some architectures such as VT-d also offer +the flexibility of using any PASID for DMA requests without PASID. +PASID #0 is reserved and not allocated from any ioasid_set. + +Multiple IOVA spaces per device are mapped to auxiliary domains which +can be used for mediated device assignment with and without a virtual +IOMMU (vIOMMU). An IOASID is allocated for each auxiliary domain as the default +PASID. Without vIOMMU, the default IOASID is used for DMA map/unmap +APIs. 
With vIOMMU, default IOASID is used for guest IOVA where DMA +request with PASID is required for the device. The reason is that +there is only one PASID #0 per device, e.g. VT-d, RID_PASID is per PCI +device. + +This document covers the generic features supported by IOASID +APIs. Vendor-specific use cases are also illustrated with Intel's VT-d +based platforms as the first example. + +.. contents:: :local: + +Glossary + +PASID - Process Address Space ID + +IOASID - IO Address Space ID (generic term for PCIe PASID and +SubstreamID in SMMU) + +SVA/SVM - Shared Virtual Addressing/Memory + +ENQCMD - Intel X86 ISA for efficient workqueue submission [1] +!!!TODO: Link to Spec at the bottom + +DSA - Intel Data Streaming Accelerator [2] + +VDCM - Virtual Device Composition Module [3] + +SIOV - Intel Scalable IO Virtualization + + +Key Concepts + + +IOASID Set +--- +An IOASID set is a group of IOASIDs allocated from the system-wide +IOASID pool. Refer to IOASID set APIs for more details. + +IOASID set is particularly useful for guest SVA where each guest could +have its own IOASID set for security and efficiency reasons. + +IOASID Set Private ID (SPID) + +Each IOASID set has a private namespace of SPIDs. An SPID maps to a +single system-wide IOASID. Conversely, each IOASID may be associated +with an alias ID, local to the IOASID set, named SPID. +SPIDs can be used as guest IOASIDs where each guest could do +IOASID allocation from its own pool and map them to host physical +IOASIDs. SPIDs are particularly useful for supporting live migration +where decoupling guest and host physical resources are necessary. + +For example, two VMs can both allocate guest PASID/SPID #101 but map to +different host PASIDs #201 and #202 respectively as shown in the +diagram below. +:: + + .--..--. + | VM 1 || VM 2 | + | || | + |--||--| + | GPASID/SPID 101 || GPASID/SPID 101 | + '--'---' Guest + __|__| + | | Host + v v + .--..--. 
+ | Host IOASID 201 || Host IOASID 202 | + '--''--' + | IOASID set 1 || IOASID set 2 | + '--''--' + +Guest PASID is treated as IOASID set private ID (SPID) within an +IOASID set, mappings between guest and host IOASIDs are stored in the +set for inquiry. + +IOASID APIs +=== +To get the IOASID APIs, users must #include . These APIs +serve the following functionalities: + + - IOASID allocation/Free + - Group management in the form of ioasid_set + - Private data storage and lookup + - Reference counting + - Event notification
[PATCH v3 06/14] iommu/ioasid: Introduce API to adjust the quota of an ioasid_set
Each ioasid_set is given a quota during allocation. As system administrators balance resources among VMs, we shall support the adjustment of quota at runtime. The new quota cannot be less than the number of outstanding IOASIDs already allocated within the set. The extra quota will be returned to the system-wide IOASID pool if the new quota is smaller than the existing one. Signed-off-by: Jacob Pan --- drivers/iommu/ioasid.c | 47 +++ include/linux/ioasid.h | 6 ++ 2 files changed, 53 insertions(+) diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c index 61e25c2375ab..cf8c7d34e2de 100644 --- a/drivers/iommu/ioasid.c +++ b/drivers/iommu/ioasid.c @@ -654,6 +654,53 @@ void ioasid_set_put(struct ioasid_set *set) EXPORT_SYMBOL_GPL(ioasid_set_put); /** + * ioasid_adjust_set - Adjust the quota of an IOASID set + * @set: IOASID set to be assigned + * @quota: Quota allowed in this set + * + * Return 0 on success. If the new quota is smaller than the number of + * IOASIDs already allocated, -EINVAL will be returned. No change will be + * made to the existing quota. + */ +int ioasid_adjust_set(struct ioasid_set *set, int quota) +{ + int ret = 0; + + if (quota <= 0) + return -EINVAL; + + spin_lock(&ioasid_allocator_lock); + if (set->nr_ioasids > quota) { + pr_err("New quota %d is smaller than outstanding IOASIDs %d\n", + quota, set->nr_ioasids); + ret = -EINVAL; + goto done_unlock; + } + + if ((quota > set->quota) && + (quota - set->quota > ioasid_capacity_avail)) { + ret = -ENOSPC; + goto done_unlock; + } + + /* Return the delta back to system pool */ + ioasid_capacity_avail += set->quota - quota; + + /* +* May have a policy to prevent giving all available IOASIDs +* to one set. But we don't enforce here, it should be in the +* upper layers. 
+*/ + set->quota = quota; + +done_unlock: + spin_unlock(&ioasid_allocator_lock); + + return ret; +} +EXPORT_SYMBOL_GPL(ioasid_adjust_set); + +/** * ioasid_find - Find IOASID data * @set: the IOASID set * @ioasid: the IOASID to find diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h index 1ae213b660f0..0a5e82148eb9 100644 --- a/include/linux/ioasid.h +++ b/include/linux/ioasid.h @@ -62,6 +62,7 @@ struct ioasid_allocator_ops { void ioasid_install_capacity(ioasid_t total); ioasid_t ioasid_get_capacity(void); struct ioasid_set *ioasid_set_alloc(void *token, ioasid_t quota, int type); +int ioasid_adjust_set(struct ioasid_set *set, int quota); void ioasid_set_get(struct ioasid_set *set); void ioasid_set_put(struct ioasid_set *set); @@ -99,6 +100,11 @@ static inline struct ioasid_set *ioasid_set_alloc(void *token, ioasid_t quota, i return ERR_PTR(-ENOTSUPP); } +static inline int ioasid_adjust_set(struct ioasid_set *set, int quota) +{ + return -ENOTSUPP; +} + static inline void ioasid_set_put(struct ioasid_set *set) { } -- 2.7.4
[PATCH v3 09/14] iommu/ioasid: Introduce ioasid_set private ID
When an IOASID set is used for guest SVA, each VM will acquire its ioasid_set for IOASID allocations. IOASIDs within the VM must have a host/physical IOASID backing; the mapping between guest and host IOASIDs can be non-identical. IOASID set private ID (SPID) is introduced in this patch to be used as guest IOASID. However, the concept of an ioasid_set-specific namespace is generic, thus named SPID. As the SPID namespace is within the IOASID set, the IOASID core can provide lookup services in both directions. An SPID may not be available when its IOASID is allocated; the mapping between SPID and IOASID is usually established when a guest page table is bound to a host PASID. Signed-off-by: Jacob Pan --- drivers/iommu/ioasid.c | 102 + include/linux/ioasid.h | 19 + 2 files changed, 121 insertions(+) diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c index 828cc44b1b1c..378fef8f23d9 100644 --- a/drivers/iommu/ioasid.c +++ b/drivers/iommu/ioasid.c @@ -26,6 +26,7 @@ enum ioasid_state { * struct ioasid_data - Meta data about ioasid * * @id:Unique ID + * @spid: Private ID unique within a set * @users: Number of active users * @state: Track state of the IOASID * @set: ioasid_set of the IOASID belongs to @@ -34,6 +35,7 @@ enum ioasid_state { */ struct ioasid_data { ioasid_t id; + ioasid_t spid; refcount_t users; enum ioasid_state state; struct ioasid_set *set; @@ -363,6 +365,105 @@ void ioasid_detach_data(ioasid_t ioasid) } EXPORT_SYMBOL_GPL(ioasid_detach_data); +static ioasid_t ioasid_find_by_spid_locked(struct ioasid_set *set, ioasid_t spid) +{ + ioasid_t ioasid = INVALID_IOASID; + struct ioasid_data *entry; + unsigned long index; + + if (!xa_load(&ioasid_sets, set->id)) { + pr_warn("Invalid set\n"); + goto done; + } + + xa_for_each(&set->xa, index, entry) { + if (spid == entry->spid) { + refcount_inc(&entry->users); + ioasid = index; + } + } +done: + return ioasid; +} + +/** + * ioasid_attach_spid - Attach ioasid_set private ID to an IOASID + * + * @ioasid: the system-wide IOASID to attach + * 
@spid: the ioasid_set private ID of @ioasid + * + * After attching SPID, future lookup can be done via ioasid_find_by_spid(). + */ +int ioasid_attach_spid(ioasid_t ioasid, ioasid_t spid) +{ + struct ioasid_data *data; + int ret = 0; + + if (spid == INVALID_IOASID) + return -EINVAL; + + spin_lock(_allocator_lock); + data = xa_load(_allocator->xa, ioasid); + + if (!data) { + pr_err("No IOASID entry %d to attach SPID %d\n", + ioasid, spid); + ret = -ENOENT; + goto done_unlock; + } + /* Check if SPID is unique within the set */ + if (ioasid_find_by_spid_locked(data->set, spid) != INVALID_IOASID) { + ret = -EINVAL; + goto done_unlock; + } + data->spid = spid; + +done_unlock: + spin_unlock(_allocator_lock); + return ret; +} +EXPORT_SYMBOL_GPL(ioasid_attach_spid); + +void ioasid_detach_spid(ioasid_t ioasid) +{ + struct ioasid_data *data; + + spin_lock(_allocator_lock); + data = xa_load(_allocator->xa, ioasid); + + if (!data || data->spid == INVALID_IOASID) { + pr_err("Invalid IOASID entry %d to detach\n", ioasid); + goto done_unlock; + } + data->spid = INVALID_IOASID; + +done_unlock: + spin_unlock(_allocator_lock); +} +EXPORT_SYMBOL_GPL(ioasid_detach_spid); + +/** + * ioasid_find_by_spid - Find the system-wide IOASID by a set private ID and + * its set. + * + * @set: the ioasid_set to search within + * @spid: the set private ID + * + * Given a set private ID and its IOASID set, find the system-wide IOASID. Take + * a reference upon finding the matching IOASID. Return INVALID_IOASID if the + * IOASID is not found in the set or the set is not valid. 
+ */ +ioasid_t ioasid_find_by_spid(struct ioasid_set *set, ioasid_t spid) +{ + ioasid_t ioasid; + + spin_lock(_allocator_lock); + ioasid = ioasid_find_by_spid_locked(set, spid); + spin_unlock(_allocator_lock); + return ioasid; +} +EXPORT_SYMBOL_GPL(ioasid_find_by_spid); + static inline bool ioasid_set_is_valid(struct ioasid_set *set) { return xa_load(_sets, set->id) == set; @@ -529,6 +630,7 @@ ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max, goto exit_free; } data->id = id; + data->spid = INVALID_IOASID; data->state = IOASID_STATE_ACTIVE; refcount_set(>users, 1); diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h index 16d421357173..2dfe85e6cb7e 100644 --- a/include/linux/ioasid.h +++ b/include/linux/ioasid.h @@ -79,6
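The SPID idea — a set-local namespace layered over system-wide IDs, with lookup in both directions — can be sketched in plain userspace C. Everything below (the `toy_*` names, the fixed-size table) is invented for illustration; the kernel version walks a per-set xarray under a lock and takes a reference on a match:

```c
#include <assert.h>

/* Toy model of a set with a private-ID (SPID) namespace. */
#define TOY_INVALID ((unsigned)-1)
#define TOY_MAX 8

struct toy_entry {
	unsigned id;   /* system-wide ID, e.g. a host PASID */
	unsigned spid; /* set-private ID, e.g. a guest PASID */
	int used;
};

struct toy_set {
	struct toy_entry entries[TOY_MAX];
};

/* Record @id in the set; the SPID is attached later, at "bind" time. */
static void toy_alloc(struct toy_set *s, int slot, unsigned id)
{
	s->entries[slot].used = 1;
	s->entries[slot].id = id;
	s->entries[slot].spid = TOY_INVALID;
}

/* Attach @spid to @id; SPIDs must stay unique within the set. */
static int toy_attach_spid(struct toy_set *s, unsigned id, unsigned spid)
{
	int i, slot = -1;

	for (i = 0; i < TOY_MAX; i++) {
		if (s->entries[i].used && s->entries[i].spid == spid)
			return -1; /* duplicate SPID in this set */
		if (s->entries[i].used && s->entries[i].id == id)
			slot = i;
	}
	if (slot < 0)
		return -1; /* no such ID in the set */
	s->entries[slot].spid = spid;
	return 0;
}

/* Reverse lookup: set-private ID back to the system-wide ID. */
static unsigned toy_find_by_spid(struct toy_set *s, unsigned spid)
{
	int i;

	for (i = 0; i < TOY_MAX; i++)
		if (s->entries[i].used && s->entries[i].spid == spid)
			return s->entries[i].id;
	return TOY_INVALID;
}
```

Because the SPID namespace is scoped to one set, two VMs can both use guest PASID 5 without colliding, each mapping it to a different host PASID.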
[PATCH v3 12/14] iommu/vt-d: Remove mm reference for guest SVA
Now that the IOASID core keeps track of the IOASID-to-mm_struct ownership in the form of an ioasid_set with the IOASID_SET_TYPE_MM token type, there is no need to keep the same mapping in the VT-d driver specific data. Native SVM usage is not affected by the change.

Signed-off-by: Jacob Pan
---
 drivers/iommu/intel/svm.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 2e764e283469..39a09a93300e 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -338,12 +338,6 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
 			ret = -ENOMEM;
 			goto out;
 		}
-		/* REVISIT: upper layer/VFIO can track host process that bind
-		 * the PASID. ioasid_set = mm might be sufficient for vfio to
-		 * check pasid VMM ownership. We can drop the following line
-		 * once VFIO and IOASID set check is in place.
-		 */
-		svm->mm = get_task_mm(current);
 		svm->pasid = data->hpasid;
 		if (data->flags & IOMMU_SVA_GPASID_VAL) {
 			svm->gpasid = data->gpasid;
@@ -351,7 +345,6 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
 		}
 		ioasid_attach_data(data->hpasid, svm);
 		INIT_LIST_HEAD_RCU(&svm->devs);
-		mmput(svm->mm);
 	}
 	sdev = kzalloc(sizeof(*sdev), GFP_KERNEL);
 	if (!sdev) {
--
2.7.4
[PATCH v3 02/14] iommu/ioasid: Rename ioasid_set_data()
Rename ioasid_set_data() to ioasid_attach_data() to avoid confusion with struct ioasid_set. ioasid_set is a group of IOASIDs that share a common token.

Reviewed-by: Jean-Philippe Brucker
Signed-off-by: Jacob Pan
---
 drivers/iommu/intel/svm.c | 6 +++---
 drivers/iommu/ioasid.c    | 6 +++---
 include/linux/ioasid.h    | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 0cb9a15f1112..2c5645f0737a 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -346,7 +346,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
 			svm->gpasid = data->gpasid;
 			svm->flags |= SVM_FLAG_GUEST_PASID;
 		}
-		ioasid_set_data(data->hpasid, svm);
+		ioasid_attach_data(data->hpasid, svm);
 		INIT_LIST_HEAD_RCU(&svm->devs);
 		mmput(svm->mm);
 	}
@@ -398,7 +398,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
 	list_add_rcu(&sdev->list, &svm->devs);
 out:
 	if (!IS_ERR_OR_NULL(svm) && list_empty(&svm->devs)) {
-		ioasid_set_data(data->hpasid, NULL);
+		ioasid_attach_data(data->hpasid, NULL);
 		kfree(svm);
 	}
@@ -441,7 +441,7 @@ int intel_svm_unbind_gpasid(struct device *dev, int pasid)
 			 * the unbind, IOMMU driver will get notified
 			 * and perform cleanup.
 			 */
-			ioasid_set_data(pasid, NULL);
+			ioasid_attach_data(pasid, NULL);
 			kfree(svm);
 		}
 	}
diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index 0f8dd377aada..5f63af07acd5 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -258,14 +258,14 @@ void ioasid_unregister_allocator(struct ioasid_allocator_ops *ops)
 EXPORT_SYMBOL_GPL(ioasid_unregister_allocator);

 /**
- * ioasid_set_data - Set private data for an allocated ioasid
+ * ioasid_attach_data - Set private data for an allocated ioasid
  * @ioasid: the ID to set data
  * @data:   the private data
  *
  * For IOASID that is already allocated, private data can be set
  * via this API. Future lookup can be done via ioasid_find.
  */
-int ioasid_set_data(ioasid_t ioasid, void *data)
+int ioasid_attach_data(ioasid_t ioasid, void *data)
 {
 	struct ioasid_data *ioasid_data;
 	int ret = 0;
@@ -287,7 +287,7 @@ int ioasid_set_data(ioasid_t ioasid, void *data)

 	return ret;
 }
-EXPORT_SYMBOL_GPL(ioasid_set_data);
+EXPORT_SYMBOL_GPL(ioasid_attach_data);

 /**
  * ioasid_alloc - Allocate an IOASID
diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index 6f000d7a0ddc..9c44947a68c8 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -39,7 +39,7 @@ void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
 		  bool (*getter)(void *));
 int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
 void ioasid_unregister_allocator(struct ioasid_allocator_ops *allocator);
-int ioasid_set_data(ioasid_t ioasid, void *data);
+int ioasid_attach_data(ioasid_t ioasid, void *data);
 #else /* !CONFIG_IOASID */
 static inline ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min,
@@ -67,7 +67,7 @@ static inline void ioasid_unregister_allocator(struct ioasid_allocator_ops *allo
 {
 }
-static inline int ioasid_set_data(ioasid_t ioasid, void *data)
+static inline int ioasid_attach_data(ioasid_t ioasid, void *data)
 {
 	return -ENOTSUPP;
 }
--
2.7.4
[PATCH v3 03/14] iommu/ioasid: Add a separate function for detach data
IOASID private data can be cleared by ioasid_attach_data() with a NULL data pointer. A common use case is for a caller to free the data afterwards. ioasid_attach_data() calls synchronize_rcu() before returning so that the data can be freed once there are no outstanding readers. However, since synchronize_rcu() may sleep, ioasid_attach_data() cannot be used under spinlocks.

This patch adds ioasid_detach_data() as a separate API, where synchronize_rcu() is called only on detach. ioasid_attach_data() can then be used under spinlocks. In addition, this change makes the API symmetrical.

Signed-off-by: Jacob Pan
---
 drivers/iommu/intel/svm.c |  4 ++--
 drivers/iommu/ioasid.c    | 54 ++-
 include/linux/ioasid.h    |  5 -
 3 files changed, 50 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 2c5645f0737a..06a16bee7b65 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -398,7 +398,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
 	list_add_rcu(&sdev->list, &svm->devs);
 out:
 	if (!IS_ERR_OR_NULL(svm) && list_empty(&svm->devs)) {
-		ioasid_attach_data(data->hpasid, NULL);
+		ioasid_detach_data(data->hpasid);
 		kfree(svm);
 	}
@@ -441,7 +441,7 @@ int intel_svm_unbind_gpasid(struct device *dev, int pasid)
 			 * the unbind, IOMMU driver will get notified
 			 * and perform cleanup.
 			 */
-			ioasid_attach_data(pasid, NULL);
+			ioasid_detach_data(pasid);
 			kfree(svm);
 		}
 	}
diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index 5f63af07acd5..6cfbdfb492e0 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -272,24 +272,58 @@ int ioasid_attach_data(ioasid_t ioasid, void *data)

 	spin_lock(&ioasid_allocator_lock);
 	ioasid_data = xa_load(&active_allocator->xa, ioasid);
-	if (ioasid_data)
-		rcu_assign_pointer(ioasid_data->private, data);
-	else
+
+	if (!ioasid_data) {
 		ret = -ENOENT;
-	spin_unlock(&ioasid_allocator_lock);
+		goto done_unlock;
+	}

-	/*
-	 * Wait for readers to stop accessing the old private data, so the
-	 * caller can free it.
-	 */
-	if (!ret)
-		synchronize_rcu();
+	if (ioasid_data->private) {
+		ret = -EBUSY;
+		goto done_unlock;
+	}
+	rcu_assign_pointer(ioasid_data->private, data);
+
+done_unlock:
+	spin_unlock(&ioasid_allocator_lock);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(ioasid_attach_data);

 /**
+ * ioasid_detach_data - Clear the private data of an ioasid
+ *
+ * @ioasid: the IOASID to clear private data
+ */
+void ioasid_detach_data(ioasid_t ioasid)
+{
+	struct ioasid_data *ioasid_data;
+
+	spin_lock(&ioasid_allocator_lock);
+	ioasid_data = xa_load(&active_allocator->xa, ioasid);
+
+	if (!ioasid_data) {
+		pr_warn("IOASID %u not found to detach data from\n", ioasid);
+		goto done_unlock;
+	}
+
+	if (ioasid_data->private) {
+		rcu_assign_pointer(ioasid_data->private, NULL);
+		goto done_unlock;
+	}
+
+done_unlock:
+	spin_unlock(&ioasid_allocator_lock);
+	/*
+	 * Wait for readers to stop accessing the old private data,
+	 * so the caller can free it.
+	 */
+	synchronize_rcu();
+}
+EXPORT_SYMBOL_GPL(ioasid_detach_data);
+
+/**
  * ioasid_alloc - Allocate an IOASID
  * @set: the IOASID set
  * @min: the minimum ID (inclusive)
diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index 9c44947a68c8..c7f649fa970a 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -40,7 +40,7 @@ void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
 int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
 void ioasid_unregister_allocator(struct ioasid_allocator_ops *allocator);
 int ioasid_attach_data(ioasid_t ioasid, void *data);
-
+void ioasid_detach_data(ioasid_t ioasid);
 #else /* !CONFIG_IOASID */
 static inline ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min,
 				    ioasid_t max, void *private)
@@ -72,5 +72,8 @@ static inline int ioasid_attach_data(ioasid_t ioasid, void *data)
 	return -ENOTSUPP;
 }

+static inline void ioasid_detach_data(ioasid_t ioasid)
+{
+}
 #endif /* CONFIG_IOASID */
 #endif /* __LINUX_IOASID_H */
--
2.7.4
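The attach/detach contract this patch establishes can be modeled in a few lines of userspace C. All `model_*` names are invented for illustration; the kernel version protects the pointer with RCU, which is exactly why the (sleeping) synchronize_rcu() has to live only on the detach side:

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for the per-IOASID private data slot. */
struct model_entry {
	void *private;
};

/* Attach: never sleeps, so it is safe under a spinlock in the kernel. */
static int model_attach_data(struct model_entry *e, void *data)
{
	if (!e)
		return -ENOENT;
	if (e->private)
		return -EBUSY;	/* already set: caller must detach first */
	e->private = data;	/* kernel would use rcu_assign_pointer() */
	return 0;
}

/* Detach: the kernel calls synchronize_rcu() here to let readers drain
 * before the caller frees the data -- hence "may sleep", hence the
 * separate API. */
static void model_detach_data(struct model_entry *e)
{
	if (!e)
		return;
	e->private = NULL;
}
```

With the old single-API design, clearing via attach(NULL) forced every caller, even one merely setting data, onto a path that might sleep; splitting the APIs keeps the sleep only where it is needed.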
[PATCH v3 05/14] iommu/ioasid: Redefine IOASID set and allocation APIs
ioasid_set was introduced as an arbitrary token that is shared by a group of IOASIDs. For example, two IOASIDs allocated via the same ioasid_set pointer belong to the same set.

For guest SVA usage, system-wide IOASID resources need to be partitioned such that each VM can have its own quota and be managed separately. ioasid_set is the perfect candidate for meeting such requirements. This patch redefines and extends ioasid_set with the following new fields:
- Quota
- Reference count
- Storage of its namespace
- The token is now stored in the ioasid_set with types

Basic ioasid_set level APIs are introduced that wire up these new fields. Existing users of the IOASID APIs are converted, and a host IOASID set is allocated for bare-metal usage.

Signed-off-by: Liu Yi L
Signed-off-by: Jacob Pan
---
 drivers/iommu/intel/iommu.c |  26 +++--
 drivers/iommu/intel/pasid.h |   1 +
 drivers/iommu/intel/svm.c   |  25 ++--
 drivers/iommu/ioasid.c      | 277 
 include/linux/ioasid.h      |  53 +++--
 5 files changed, 333 insertions(+), 49 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index e7bcb299e51e..872391890323 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -104,6 +104,9 @@
  */
 #define INTEL_IOMMU_PGSIZES	(~0xFFFUL)

+/* PASIDs used by host SVM */
+struct ioasid_set *host_pasid_set;
+
 static inline int agaw_to_level(int agaw)
 {
 	return agaw + 2;
@@ -3147,8 +3150,8 @@ static void intel_vcmd_ioasid_free(ioasid_t ioasid, void *data)
 	 * Sanity check the ioasid owner is done at upper layer, e.g. VFIO
 	 * We can only free the PASID when all the devices are unbound.
 	 */
-	if (ioasid_find(NULL, ioasid, NULL)) {
-		pr_alert("Cannot free active IOASID %d\n", ioasid);
+	if (IS_ERR(ioasid_find(host_pasid_set, ioasid, NULL))) {
+		pr_err("IOASID %d to be freed but not in system set\n", ioasid);
 		return;
 	}
 	vcmd_free_pasid(iommu, ioasid);
@@ -3333,8 +3336,17 @@ static int __init init_dmars(void)
 		goto free_iommu;

 	/* PASID is needed for scalable mode irrespective to SVM */
-	if (intel_iommu_sm)
+	if (intel_iommu_sm) {
 		ioasid_install_capacity(intel_pasid_max_id);
+		/* We should not run out of IOASIDs at boot */
+		host_pasid_set = ioasid_set_alloc(NULL, PID_MAX_DEFAULT,
+						  IOASID_SET_TYPE_NULL);
+		if (IS_ERR_OR_NULL(host_pasid_set)) {
+			pr_err("Failed to allocate host PASID set %lu\n",
+				PTR_ERR(host_pasid_set));
+			intel_iommu_sm = 0;
+		}
+	}

 	/*
 	 * for each drhd
@@ -3381,7 +3393,7 @@ static int __init init_dmars(void)
 		disable_dmar_iommu(iommu);
 		free_dmar_iommu(iommu);
 	}
-
+	ioasid_set_put(host_pasid_set);
 	kfree(g_iommus);

 error:
@@ -5163,7 +5175,7 @@ static void auxiliary_unlink_device(struct dmar_domain *domain,
 	domain->auxd_refcnt--;

 	if (!domain->auxd_refcnt && domain->default_pasid > 0)
-		ioasid_free(domain->default_pasid);
+		ioasid_free(host_pasid_set, domain->default_pasid);
 }

 static int aux_domain_add_dev(struct dmar_domain *domain,
@@ -5181,7 +5193,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
 		int pasid;

 		/* No private data needed for the default pasid */
-		pasid = ioasid_alloc(NULL, PASID_MIN,
+		pasid = ioasid_alloc(host_pasid_set, PASID_MIN,
 				     pci_max_pasids(to_pci_dev(dev)) - 1,
 				     NULL);
 		if (pasid == INVALID_IOASID) {
@@ -5224,7 +5236,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
 	spin_unlock(&iommu->lock);
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 	if (!domain->auxd_refcnt && domain->default_pasid > 0)
-		ioasid_free(domain->default_pasid);
+		ioasid_free(host_pasid_set, domain->default_pasid);

 	return ret;
 }
diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index c9850766c3a9..ccdc23446015 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -99,6 +99,7 @@ static inline bool pasid_pte_is_present(struct pasid_entry *pte)
 }

 extern u32 intel_pasid_max_id;
+extern struct ioasid_set *host_pasid_set;
 int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp);
 void intel_pasid_free_id(int pasid);
 void *intel_pasid_lookup_id(int pasid);
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 06a16bee7b65..2e764e283469 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -228,7 +228,9 @@ static LIST_HEAD(global_svm_list);
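The partitioning idea — each set reserves a quota out of a system-wide capacity, and allocations are charged against the owning set — can be illustrated with a minimal userspace sketch. The `sketch_*` names and the sequential toy allocator are invented here; the kernel tracks capacity and per-set counts with its own structures:

```c
/* Toy model of quota-partitioned ID allocation. */
#define SKETCH_INVALID (-1)

struct sketch_set {
	int quota;   /* this set's share of the capacity */
	int nr_ids;  /* IDs currently allocated from the set */
};

static int sketch_capacity; /* remaining unreserved system capacity */

/* Create a set, carving its quota out of the system-wide capacity. */
static int sketch_set_alloc(struct sketch_set *set, int quota)
{
	if (quota > sketch_capacity)
		return -1;		/* would exceed system capacity */
	sketch_capacity -= quota;	/* reserve this set's share */
	set->quota = quota;
	set->nr_ids = 0;
	return 0;
}

/* Allocate an ID, charged against the owning set's quota. */
static int sketch_id_alloc(struct sketch_set *set)
{
	if (set->nr_ids >= set->quota)
		return SKETCH_INVALID;	/* per-set quota exhausted */
	return set->nr_ids++;		/* toy: sequential IDs */
}
```

The point of the structure is that one greedy VM exhausts only its own quota: other sets keep allocating out of their reserved shares.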
[PATCH v3 07/14] iommu/ioasid: Add an iterator API for ioasid_set
Users of an ioasid_set may not keep track of all the IOASIDs allocated under the set. When a collective action is needed on each IOASID, it is useful to iterate over all the IOASIDs within the set. For example, when the ioasid_set is freed, the user might perform the same cleanup operation on each IOASID. This patch adds an API to iterate over all the IOASIDs within the set.

Signed-off-by: Jacob Pan
---
 drivers/iommu/ioasid.c | 17 +
 include/linux/ioasid.h |  9 +
 2 files changed, 26 insertions(+)

diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index cf8c7d34e2de..9628e78b2ab4 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -701,6 +701,23 @@ int ioasid_adjust_set(struct ioasid_set *set, int quota)
 EXPORT_SYMBOL_GPL(ioasid_adjust_set);

 /**
+ * ioasid_set_for_each_ioasid - Iterate over all the IOASIDs within the set
+ *
+ * Caller must hold a reference of the set and handle its own locking.
+ */
+void ioasid_set_for_each_ioasid(struct ioasid_set *set,
+				void (*fn)(ioasid_t id, void *data),
+				void *data)
+{
+	struct ioasid_data *entry;
+	unsigned long index;
+
+	xa_for_each(&set->xa, index, entry)
+		fn(index, data);
+}
+EXPORT_SYMBOL_GPL(ioasid_set_for_each_ioasid);
+
+/**
  * ioasid_find - Find IOASID data
  * @set:    the IOASID set
  * @ioasid: the IOASID to find
diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index 0a5e82148eb9..aab58bc26714 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -75,6 +75,9 @@ int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
 void ioasid_unregister_allocator(struct ioasid_allocator_ops *allocator);
 int ioasid_attach_data(ioasid_t ioasid, void *data);
 void ioasid_detach_data(ioasid_t ioasid);
+void ioasid_set_for_each_ioasid(struct ioasid_set *sdata,
+				void (*fn)(ioasid_t id, void *data),
+				void *data);
 #else /* !CONFIG_IOASID */
 static inline void ioasid_install_capacity(ioasid_t total)
 {
@@ -131,5 +134,11 @@ static inline int ioasid_attach_data(ioasid_t ioasid, void *data)
 static inline void ioasid_detach_data(ioasid_t ioasid)
 {
 }
+
+static inline void ioasid_set_for_each_ioasid(struct ioasid_set *sdata,
+					      void (*fn)(ioasid_t id, void *data),
+					      void *data)
+{
+}
 #endif /* CONFIG_IOASID */
 #endif /* __LINUX_IOASID_H */
--
2.7.4
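The iterator's shape — walk every live ID in a set and hand each one to a caller-supplied function — is the familiar callback pattern. A userspace sketch with invented `iter_*` names (the kernel version walks an xarray under the caller's lock):

```c
/* Toy set: a bitmap of live IDs stands in for the kernel's xarray. */
#define ITER_MAX 8

struct iter_set {
	int used[ITER_MAX];
};

/* Invoke fn(id, data) on every live ID, e.g. a common cleanup step. */
static void iter_for_each(struct iter_set *s,
			  void (*fn)(int id, void *data), void *data)
{
	int id;

	for (id = 0; id < ITER_MAX; id++)
		if (s->used[id])
			fn(id, data);
}

/* Example callback: count the live IDs via the opaque data pointer. */
static void iter_count_cb(int id, void *data)
{
	(void)id;
	(*(int *)data)++;
}
```

The opaque `void *data` cookie is what lets one iterator serve unrelated callers, each threading its own state through the walk.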
[PATCH v3 14/14] iommu/vt-d: Store guest PASID during bind
The IOASID core maintains the guest-host mapping in the form of SPID and IOASID. This patch assigns the guest PASID (if valid) as the SPID while binding a guest page table to a host PASID. This mapping will be used for lookup and notifications.

Signed-off-by: Jacob Pan
---
 drivers/iommu/intel/svm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 8f886718df83..e18f8b5af9ba 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -98,6 +98,7 @@ static inline bool intel_svm_capable(struct intel_iommu *iommu)
 static inline void intel_svm_drop_pasid(ioasid_t pasid)
 {
 	ioasid_detach_data(pasid);
+	ioasid_detach_spid(pasid);
 	ioasid_put(NULL, pasid);
 }
@@ -425,6 +426,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
 	if (data->flags & IOMMU_SVA_GPASID_VAL) {
 		svm->gpasid = data->gpasid;
 		svm->flags |= SVM_FLAG_GUEST_PASID;
+		ioasid_attach_spid(data->hpasid, data->gpasid);
 	}
 	ioasid_attach_data(data->hpasid, svm);
 	ioasid_get(NULL, svm->pasid);
--
2.7.4
[PATCH v3 00/14] IOASID extensions for guest SVA
IOASID was introduced in v5.5 as a generic kernel allocator service for both PCIe Process Address Space ID (PASID) and ARM SMMU's Sub Stream ID. In addition to basic ID allocation, ioasid_set was defined as a token that is shared by a group of IOASIDs. This set token can be used for permission checking, but it lacks some features needed by guest Shared Virtual Address (SVA):
- Manage IOASIDs by group: group ownership, quota, etc.
- State synchronization among IOASID users
- Non-identity guest-host IOASID mapping
- Lifecycle management across many users

This patchset introduces the following extensions as solutions to the problems above:
- Redefine and extend the IOASID set such that IOASIDs can be managed by groups
- Add notifications for IOASID state synchronization
- Add reference counting for life cycle alignment among users
- Support ioasid_set private IDs, which can be used as guest IOASIDs

Please refer to Documentation/ioasid.rst in enclosed patch 1/14 for more details.

This patchset only includes the VT-d driver as a user of some of the new APIs. VFIO and KVM patches are coming up to fully utilize the APIs introduced here.

You can find this series at:
https://github.com/jacobpan/linux.git ioasid_v3
(VFIO and KVM patches will be available at this branch when published.)

This work is a result of collaboration with many people:
Liu, Yi L
Wu Hao
Ashok Raj
Kevin Tian

Thanks,
Jacob

Changelog:

V3:
- Use consistent ioasid_set_ prefix for ioasid_set level APIs
- Make SPID and private data detach/attach APIs symmetric
- Use the same ioasid_put semantics as Jean-Philippe's IOASID reference patch
- Take away the public ioasid_notify() function; notifications are now emitted by the IOASID core as a result of certain IOASID APIs
- Partition into finer incremental patches
- Miscellaneous cleanup, locking, and exception handling fixes based on v2 reviews

V2:
- Redesigned ioasid_set APIs, removed set ID
- Added set private ID (SPID) for guest PASID usage
- Added per-ioasid_set notification and priority support
- Back to using spinlocks and atomic notifications
- Added async work in the VT-d driver to perform teardown outside atomic context

Jacob Pan (14):
  docs: Document IO Address Space ID (IOASID) APIs
  iommu/ioasid: Rename ioasid_set_data()
  iommu/ioasid: Add a separate function for detach data
  iommu/ioasid: Support setting system-wide capacity
  iommu/ioasid: Redefine IOASID set and allocation APIs
  iommu/ioasid: Introduce API to adjust the quota of an ioasid_set
  iommu/ioasid: Add an iterator API for ioasid_set
  iommu/ioasid: Add reference counting functions
  iommu/ioasid: Introduce ioasid_set private ID
  iommu/ioasid: Introduce notification APIs
  iommu/ioasid: Support mm type ioasid_set notifications
  iommu/vt-d: Remove mm reference for guest SVA
  iommu/vt-d: Listen to IOASID notifications
  iommu/vt-d: Store guest PASID during bind

 Documentation/driver-api/ioasid.rst | 648 ++
 drivers/iommu/intel/iommu.c         |  29 +-
 drivers/iommu/intel/pasid.h         |   1 +
 drivers/iommu/intel/svm.c           | 132 +-
 drivers/iommu/ioasid.c              | 890 ++--
 include/linux/intel-iommu.h         |   2 +
 include/linux/ioasid.h              | 197 +++-
 7 files changed, 1830 insertions(+), 69 deletions(-)
 create mode 100644 Documentation/driver-api/ioasid.rst
--
2.7.4
[PATCH v3 11/14] iommu/ioasid: Support mm type ioasid_set notifications
As a system-wide resource, IOASID is often shared by multiple kernel subsystems that are independent of each other. However, at the ioasid_set level, these kernel subsystems must communicate with each other for ownership checking, event notifications, etc. For example, on Intel Scalable IO Virtualization (SIOV) enabled platforms, KVM and VFIO instances under the same process/guest must be aware of a shared IOASID set.

The IOASID_SET_TYPE_MM token type was introduced to explicitly mark an IOASID set that belongs to a process, using its mm_struct pointer as the token. Users within the same process can then identify one another based on this token.

This patch introduces MM token specific event registration APIs. Event subscribers such as KVM instances can register an IOASID event handler without knowledge of the ioasid_set; handlers are registered with the mm_struct pointer as the token. In case a subscriber registers a handler *prior* to the creation of the ioasid_set, the handler's notification block is stored in a pending list within the IOASID core. Once the ioasid_set for the MM token is created, the notification block is registered by the IOASID core.

Signed-off-by: Liu Yi L
Signed-off-by: Wu Hao
Signed-off-by: Jacob Pan
---
 drivers/iommu/ioasid.c | 117 +
 include/linux/ioasid.h |  15 +++
 2 files changed, 132 insertions(+)

diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index 894b17c06ead..d5faeb559a43 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -889,6 +889,29 @@ void ioasid_set_put(struct ioasid_set *set)
 }
 EXPORT_SYMBOL_GPL(ioasid_set_put);

+/*
+ * ioasid_find_mm_set - Retrieve IOASID set with mm token
+ * Take a reference of the set if found.
+ */
+static struct ioasid_set *ioasid_find_mm_set(struct mm_struct *token)
+{
+	struct ioasid_set *set;
+	unsigned long index;
+
+	spin_lock(&ioasid_allocator_lock);
+
+	xa_for_each(&ioasid_sets, index, set) {
+		if (set->type == IOASID_SET_TYPE_MM && set->token == token) {
+			refcount_inc(&set->ref);
+			goto exit_unlock;
+		}
+	}
+	set = NULL;
+exit_unlock:
+	spin_unlock(&ioasid_allocator_lock);
+	return set;
+}
+
 /**
  * ioasid_adjust_set - Adjust the quota of an IOASID set
  * @set: IOASID set to be assigned
@@ -1121,6 +1144,100 @@ void ioasid_unregister_notifier(struct ioasid_set *set,
 }
 EXPORT_SYMBOL_GPL(ioasid_unregister_notifier);

+/**
+ * ioasid_register_notifier_mm - Register a notifier block on the IOASID set
+ *                               created by the mm_struct pointer as the token
+ *
+ * @mm: the mm_struct token of the ioasid_set
+ * @nb: notifier block to be registered on the ioasid_set
+ *
+ * This is a variant of ioasid_register_notifier() where the caller intends to
+ * listen to IOASID events belonging to the ioasid_set created under the same
+ * process. The caller is not aware of the ioasid_set and has no need to hold
+ * a reference to it.
+ */
+int ioasid_register_notifier_mm(struct mm_struct *mm, struct notifier_block *nb)
+{
+	struct ioasid_set_nb *curr;
+	struct ioasid_set *set;
+	int ret = 0;
+
+	if (!mm)
+		return -EINVAL;
+
+	spin_lock(&ioasid_nb_lock);
+
+	/* Check for duplicates, nb is unique per set */
+	list_for_each_entry(curr, &ioasid_nb_pending_list, list) {
+		if (curr->token == mm && curr->nb == nb) {
+			ret = -EBUSY;
+			goto exit_unlock;
+		}
+	}
+
+	curr = kzalloc(sizeof(*curr), GFP_ATOMIC);
+	if (!curr) {
+		ret = -ENOMEM;
+		goto exit_unlock;
+	}
+	curr->token = mm;
+	curr->nb = nb;
+
+	/* Check if the token has an existing set */
+	set = ioasid_find_mm_set(mm);
+	if (!set) {
+		/* Add to the rsvd list as inactive */
+		curr->active = false;
+	} else {
+		/* REVISIT: Only register empty set for now. Can add an option
+		 * in the future to playback existing PASIDs.
+		 */
+		if (set->nr_ioasids) {
+			pr_warn("IOASID set %d not empty\n", set->id);
+			ret = -EBUSY;
+			kfree(curr);
+			ioasid_set_put(set);
+			goto exit_unlock;
+		}
+		curr->active = true;
+		curr->set = set;
+
+		/* Set already created, add to the notifier chain */
+		atomic_notifier_chain_register(&set->nh, nb);
+		/*
+		 * Do not hold a reference, if the set gets destroyed, the nb
+		 * entry will be marked inactive.
+		 */
+		ioasid_set_put(set);
+	}
+
+	list_add(&curr->list, &ioasid_nb_pending_list);
+
+exit_unlock:
+	spin_unlock(&ioasid_nb_lock);
+	return ret;
+}
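The "register before the set exists" flow above reduces to a small state machine: a subscriber keyed by a token lands on a pending list, and is flipped to active when a set with that token appears. A userspace toy with invented `nb_*` names (the kernel keys on the mm_struct pointer and registers a real notifier block on activation):

```c
#include <string.h>

/* Toy pending-subscriber table keyed by an opaque token. */
#define NB_MAX 4

struct nb_entry {
	const void *token; /* stands in for the mm_struct pointer */
	int active;
	int used;
};

static struct nb_entry nb_pending[NB_MAX];

/* Register a subscriber; active immediately only if its set exists. */
static int nb_register(const void *token, int set_exists)
{
	int i;

	for (i = 0; i < NB_MAX; i++) {
		if (!nb_pending[i].used) {
			nb_pending[i].used = 1;
			nb_pending[i].token = token;
			nb_pending[i].active = set_exists;
			return 0;
		}
	}
	return -1; /* table full */
}

/* A set with @token was just created: activate matching subscribers. */
static void nb_set_created(const void *token)
{
	int i;

	for (i = 0; i < NB_MAX; i++)
		if (nb_pending[i].used && nb_pending[i].token == token)
			nb_pending[i].active = 1;
}
```

Keeping pending entries in one core-owned list is what frees subscribers like KVM from ever holding, or even seeing, the ioasid_set itself.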
Re: [RFC 0/3] iommu: Reserved regions for IOVAs beyond dma_mask and iommu aperture
On Mon, 28 Sep 2020 21:50:34 +0200, Eric Auger wrote:

> VFIO currently exposes the usable IOVA regions through the
> VFIO_IOMMU_GET_INFO ioctl / VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE
> capability. However it fails to take into account the dma_mask
> of the devices within the container. The top limit currently is
> defined by the iommu aperture.

I think that dma_mask is traditionally a DMA API interface for a device driver to indicate to the DMA layer which mappings are accessible to the device. On the other hand, vfio makes use of the IOMMU API where the driver is in userspace. That userspace driver has full control of the IOVA range of the device, therefore dma_mask is mostly irrelevant to vfio.

I think the issue you're trying to tackle is that the IORT code is making use of the dma_mask to try to describe a DMA address limitation imposed by the PCI root bus, living between the endpoint device and the IOMMU. Therefore, if the IORT code is exposing a topology or system imposed device limitation, this seems much more akin to something like an MSI reserved range, where it's not necessarily the device or the IOMMU with the limitation, but something that sits between them.

> So, for instance, if the IOMMU supports up to 48bits, it may give
> the impression the max IOVA is 48b while a device may have a
> dma_mask of 42b. So this API cannot really be used to compute
> the max usable IOVA.
>
> This patch removes the IOVA region beyond the dma_mask's.

Rather, it adds a reserved region accounting for the range above the device's dma_mask. I don't think the IOMMU API should be consuming dma_mask like this though. For example, what happens in pci_dma_configure() when there are no OF or ACPI DMA restrictions? It appears to me that the dma_mask from whatever previous driver had the device carries over to the new driver. That's generally ok for the DMA API because a driver is required to set the device's DMA mask.

It doesn't make sense however to blindly consume that dma_mask and export it via an IOMMU API. For example, I would expect to see different results depending on whether a host driver has been bound to a device. It seems the correct IOMMU API approach would be for the IORT code to specifically register reserved ranges for the device.

> As we start to expose this reserved region in the sysfs file
> /sys/kernel/iommu_groups//reserved_regions, we also need to
> handle the IOVA range beyond the IOMMU aperture to handle the case
> where the dma_mask would have a higher number of bits than the iommu
> max input address.

Why? The IOMMU geometry already describes this, and vfio combines both the IOMMU geometry and the device reserved regions when generating the IOVA ranges. Who is going to consume this information? Additionally, it appears that reserved regions will report different information depending on whether a device is attached to a domain.

> This is a change to the ABI as this reserved region was not yet
> exposed in sysfs /sys/kernel/iommu_groups//reserved_regions or
> through the VFIO ioctl. At VFIO level we increment the version of
> the VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE capability to advertise
> that change.

Is this really an ABI change? The original entry for reserved regions includes:

  Not necessarily all reserved regions are listed. This is typically
  used to output direct-mapped, MSI, non mappable regions.

I imagine the intention here was non-mappable relative to the IOMMU, but non-mappable to the device is essentially what we're including here.

I'm also concerned about bumping the vfio interface version for the IOVA range. We're not changing the interface, we're modifying the result, and even then only for a fraction of users. How many users are potentially broken by that change? Are we going to bump the version for everyone any time the result changes on any platform?

Thanks,
Alex
Re: [PATCH] iommu/arm-smmu-v3: Add rmb after reading event queue prod_reg
On Mon, 28 Sep 2020 16:32:02 +0800, Zhou Wang wrote:

> In arm_smmu_evtq_thread(), the event queue is read via the consumer
> pointer, which has no address dependency on the producer pointer. The
> reads of prod_reg (MMIO) and of the event queue memory (Normal memory)
> can therefore be reordered, so the load from the event queue can be
> done before the load of prod_reg and a stale event entry value may be
> observed.
>
> Add an rmb to make sure the correct event queue entry value is read.

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-v3: Add rmb after reading event queue prod_reg
      https://git.kernel.org/will/c/a76a3f2c

(please note that I changed the patch to use readl() instead of an rmb() in conjunction with the _relaxed() accessor, and then adjusted the cons side to match in terms of DMB vs DSB).

Cheers,
--
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
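The bug and the fix follow the classic producer/consumer pattern: the consumer's load of the producer index needs acquire-like ordering so the subsequent queue-entry reads cannot be hoisted above it (which is what the rmb() — or the ordered readl(), as Will applied it — provides relative to the relaxed accessor). A hedged C11 sketch of the shape; it is single-threaded, so it demonstrates only the values, not the race itself:

```c
#include <stdatomic.h>

/* Toy single-producer queue; prod_idx stands in for the MMIO prod register. */
#define Q_SIZE 4

static int queue[Q_SIZE];
static atomic_int prod_idx;

static void producer_push(int val)
{
	int p = atomic_load_explicit(&prod_idx, memory_order_relaxed);

	queue[p % Q_SIZE] = val;
	/* release: entry write becomes visible before the index update */
	atomic_store_explicit(&prod_idx, p + 1, memory_order_release);
}

static int consumer_pop(int cons)
{
	/* acquire: entry reads below cannot be reordered before this load;
	 * a relaxed load here would be the bug the patch fixes */
	int p = atomic_load_explicit(&prod_idx, memory_order_acquire);

	if (cons >= p)
		return -1; /* queue empty */
	return queue[cons % Q_SIZE];
}
```

With a relaxed index load and no barrier, the consumer may read `queue[cons]` before it observes the index update that covers that slot, returning a stale entry — the exact failure described in the commit message.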
Re: [PATCH v10 01/13] mm: Define pasid in mm
Hi Will and Jean,

On Mon, Sep 28, 2020 at 11:22:51PM +0100, Will Deacon wrote:
> On Fri, Sep 18, 2020 at 12:18:41PM +0200, Jean-Philippe Brucker wrote:
> > From: Fenghua Yu
> >
> > PASID is shared by all threads in a process. So the logical place to keep
> > track of it is in the "mm". Both ARM and X86 need to use the PASID in the
> > "mm".
> >
> > Suggested-by: Christoph Hellwig
> > Signed-off-by: Fenghua Yu
> > Reviewed-by: Tony Luck
> > ---
> > https://lore.kernel.org/linux-iommu/1600187413-163670-8-git-send-email-fenghua...@intel.com/
> > ---
> >  include/linux/mm_types.h | 4
> >  1 file changed, 4 insertions(+)
>
> Acked-by: Will Deacon

FYI, this patch is now in the x86 maintainers' tree (tip:x86/pasid) as part of the x86 PASID MSR series.

Thanks.

-Fenghua
Re: [PATCH] iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()
On Mon, Sep 21, 2020 at 09:45:57PM +0100, Will Deacon wrote:
> On Tue, Sep 22, 2020 at 03:13:53AM +0800, kernel test robot wrote:
> > Thank you for the patch! Perhaps something to improve:
> >
> > [auto build test WARNING on iommu/next]
> > [also build test WARNING on linus/master v5.9-rc6 next-20200921]
> > [cannot apply to robclark/msm-next]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch]
> >
> > url: https://github.com/0day-ci/linux/commits/Yu-Kuai/iommu-qcom-add-missing-put_device-call-in-qcom_iommu_of_xlate/20200918-091341
> > base: https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
> > config: arm64-randconfig-r023-20200920 (attached as .config)
> > compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 4e8c028158b56d9c2142a62464e8e0686bde3584)
> > reproduce (this is a W=1 build):
> >         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> >         chmod +x ~/bin/make.cross
> >         # install arm64 cross compiling tool for clang build
> >         # apt-get install binutils-aarch64-linux-gnu
> >         # save the attached .config to linux build tree
> >         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot
> >
> > All warnings (new ones prefixed by >>):
> >
> > >> drivers/iommu/arm/arm-smmu/qcom_iommu.c:601:4: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation]
> >                 return -EINVAL;
> >                 ^
> >    drivers/iommu/arm/arm-smmu/qcom_iommu.c:599:3: note: previous statement is here
> >            if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev)))
>
> Oh, this looks like a nasty bug. Seems we're missing some braces.

Yu Kuai: please could you send a v2 of this?
Will
[RFC 3/3] vfio/type1: Increase the version of VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE
Now that the IOVA regions beyond the dma_mask and the vfio aperture are removed from the usable IOVA ranges, the API becomes reliable for computing the max IOVA. Let's advertise this by using a new version for the capability.

Signed-off-by: Eric Auger
---
 drivers/vfio/vfio_iommu_type1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5fbf0c1f7433..af4596123954 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2541,7 +2541,7 @@ static int vfio_iommu_iova_add_cap(struct vfio_info_cap *caps,
 	struct vfio_iommu_type1_info_cap_iova_range *iova_cap;

 	header = vfio_info_cap_add(caps, size,
-				   VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE, 1);
+				   VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE, 2);
 	if (IS_ERR(header))
 		return PTR_ERR(header);
--
2.21.3
[RFC 0/3] iommu: Reserved regions for IOVAs beyond dma_mask and iommu aperture
VFIO currently exposes the usable IOVA regions through the VFIO_IOMMU_GET_INFO ioctl / VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE capability. However it fails to take into account the dma_mask of the devices within the container. The top limit currently is defined by the iommu aperture. So, for instance, if the IOMMU supports up to 48 bits, it may give the impression the max IOVA is 48b while a device may have a dma_mask of 42b. So this API cannot really be used to compute the max usable IOVA.

This series removes the IOVA region beyond the dma_mask's. As we start to expose this reserved region in the sysfs file /sys/kernel/iommu_groups//reserved_regions, we also need to handle the IOVA range beyond the IOMMU aperture, to handle the case where the dma_mask would have a higher number of bits than the iommu max input address.

This is a change to the ABI, as this reserved region was not yet exposed in sysfs /sys/kernel/iommu_groups//reserved_regions or through the VFIO ioctl. At VFIO level we increment the version of the VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE capability to advertise that change.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/dma_mask_rfc

Eric Auger (3):
  iommu: Fix merging in iommu_insert_resv_region
  iommu: Account for dma_mask and iommu aperture in IOVA reserved regions
  vfio/type1: Increase the version of VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE

 .../ABI/testing/sysfs-kernel-iommu_groups |  7
 drivers/iommu/iommu.c                     | 41 ++-
 drivers/vfio/vfio_iommu_type1.c           |  2 +-
 3 files changed, 48 insertions(+), 2 deletions(-)

--
2.21.3
[RFC 2/3] iommu: Account for dma_mask and iommu aperture in IOVA reserved regions
VFIO currently exposes the usable IOVA regions through the VFIO_IOMMU_GET_INFO ioctl. However it fails to take into account the dma_mask of the devices within the container. The top limit currently is defined by the iommu aperture. So, for instance, if the IOMMU supports up to 48 bits, it may give the impression the max IOVA is 48b while a device may have a dma_mask of 42b. So this API cannot really be used to compute the max usable IOVA.

This patch removes the IOVA region beyond the dma_mask's. As we start to expose this reserved region in the sysfs file /sys/kernel/iommu_groups//reserved_regions, we also need to expose the IOVA range beyond the IOMMU aperture, to handle the case where the dma_mask would have a higher number of bits than the iommu max input address. Those out-of-reach regions get the IOMMU_RESV_RESERVED type.

This is a change to the ABI, as this reserved region was not yet exposed in sysfs /sys/kernel/iommu_groups//reserved_regions or through the VFIO ioctl. Document that change.

Signed-off-by: Eric Auger
---
 .../ABI/testing/sysfs-kernel-iommu_groups |  7
 drivers/iommu/iommu.c                     | 39 +++
 2 files changed, 46 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 017f5bc3920c..2f316686c88b 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -33,3 +33,10 @@ Description:	In case an RMRR is used only by graphics or USB devices
 		it is now exposed as "direct-relaxable" instead of "direct".
 		In device assignment use case, for instance, those RMRR are
 		considered to be relaxable and safe.
+
+What:		/sys/kernel/iommu_groups/reserved_regions
+Date:		Sept 2020
+KernelVersion:	v5.11
+Contact:	Eric Auger
+Description:	Regions beyond the device dma_mask and the iommu aperture
+		now are exposed as IOMMU_RESV_RESERVED reserved regions.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index dd8cda340e62..d797f07b3625 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2511,9 +2511,48 @@ EXPORT_SYMBOL_GPL(iommu_domain_set_attr);
 void iommu_get_resv_regions(struct device *dev, struct list_head *list)
 {
 	const struct iommu_ops *ops = dev->bus->iommu_ops;
+	struct iommu_resv_region *region;
+	struct iommu_domain *domain;
+
+	domain = iommu_get_domain_for_dev(dev);
+
+	if (domain) {
+		struct iommu_domain_geometry geo;
+
+		if (iommu_domain_get_attr(domain, DOMAIN_ATTR_GEOMETRY, &geo))
+			return;
+
+		if (geo.aperture_end < ULLONG_MAX && geo.aperture_end != geo.aperture_start) {
+			region = iommu_alloc_resv_region(geo.aperture_end + 1,
+							 ULLONG_MAX - geo.aperture_end,
+							 0, IOMMU_RESV_RESERVED);
+			if (!region)
+				return;
+			list_add_tail(&region->list, list);
+		}
+
+		if (geo.aperture_start > 0) {
+			region = iommu_alloc_resv_region(0, geo.aperture_start,
+							 0, IOMMU_RESV_RESERVED);
+			if (!region)
+				return;
+			list_add_tail(&region->list, list);
+		}
+	}

 	if (ops && ops->get_resv_regions)
 		ops->get_resv_regions(dev, list);
+
+	if (!dev->dma_mask || *dev->dma_mask == ULLONG_MAX)
+		return;
+
+	region = iommu_alloc_resv_region(*dev->dma_mask + 1,
+					 ULLONG_MAX - *dev->dma_mask,
+					 0, IOMMU_RESV_RESERVED);
+	if (!region)
+		return;
+
+	list_add_tail(&region->list, list);
 }

 void iommu_put_resv_regions(struct device *dev, struct list_head *list)
--
2.21.3
Re: [PATCH v2] iommu/arm: Add module parameter to set msi iova address
On Wed, Sep 23, 2020 at 08:32:43AM +0200, Auger Eric wrote:
> On 9/21/20 10:45 PM, Will Deacon wrote:
> > On Mon, Sep 14, 2020 at 11:13:07AM -0700, Vennila Megavannan wrote:
> > > From: Srinath Mannam
> > >
> > > Add provision to change the default value of the MSI IOVA base to a
> > > platform-suitable IOVA using a module parameter. The presently
> > > hardcoded MSI IOVA base may not be within the accessible IOVA ranges
> > > of a platform.
> > >
> > > If any platform has a limitation accessing the default MSI IOVA, it
> > > can be changed using the "arm-smmu.msi_iova_base=0xa000" command line
> > > argument.
> > >
> > > Signed-off-by: Srinath Mannam
> > > Co-developed-by: Vennila Megavannan
> > > Signed-off-by: Vennila Megavannan
> > > ---
> > >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 5 -
> > >  drivers/iommu/arm/arm-smmu/arm-smmu.c       | 5 -
> > >  2 files changed, 8 insertions(+), 2 deletions(-)
> >
> > This feels pretty fragile. Wouldn't it be better to realise that there's
> > a region conflict with iommu_dma_get_resv_regions() and move the MSI window
> > accordingly at runtime?
>
> Since cd2c9fcf5c66 ("iommu/dma: Move PCI window region reservation back
> into dma specific path"), the PCI host bridge windows are not exposed by
> iommu_dma_get_resv_regions() anymore. If I understood correctly, what is
> attempted here is to avoid the collision between such a PCI host bridge
> window and the MSI IOVA range.

Either way, I think the kernel should figure this out at runtime and not rely on a cmdline option to tell it where to place the region.

Will
Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()
On Mon, Sep 28, 2020 at 09:52:12AM +0200, Thierry Reding wrote:
> On Sat, Sep 26, 2020 at 01:07:17AM -0700, Nicolin Chen wrote:
> > @@ -13,6 +13,7 @@
> >  #include <…>
> >  #include <…>
> >  #include <…>
> > +#include <…>
>
> Why is this needed? I don't see any of the symbols declared in that file
> used here.

Hmm... I think I mixed in some other change that needs this header file. Removing it.

> > @@ -61,6 +62,8 @@ struct tegra_smmu_as {
> >  	u32 attr;
> >  };
> >
> > +static const struct iommu_ops tegra_smmu_ops;
> > +
> >  static struct tegra_smmu_as *to_smmu_as(struct iommu_domain *dom)
> >  {
> >  	return container_of(dom, struct tegra_smmu_as, domain);
> > @@ -484,60 +487,49 @@ static void tegra_smmu_as_unprepare(struct tegra_smmu *smmu,
> >  static int tegra_smmu_attach_dev(struct iommu_domain *domain,
> >  				 struct device *dev)
> >  {
> > +	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> >  	struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
> >  	struct tegra_smmu_as *as = to_smmu_as(domain);
> > -	struct device_node *np = dev->of_node;
> > -	struct of_phandle_args args;
> > -	unsigned int index = 0;
> > -	int err = 0;
> > -
> > -	while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> > -					   &args)) {
> > -		unsigned int swgroup = args.args[0];
> > -
> > -		if (args.np != smmu->dev->of_node) {
> > -			of_node_put(args.np);
> > -			continue;
> > -		}
> > +	int index, err = 0;
> >
> > -		of_node_put(args.np);
> > +	if (!fwspec || fwspec->ops != &tegra_smmu_ops)
> > +		return -ENOENT;
> >
> > +	for (index = 0; index < fwspec->num_ids; index++) {
> >  		err = tegra_smmu_as_prepare(smmu, as);
> > -		if (err < 0)
> > -			return err;
> > +		if (err)
> > +			goto err_disable;
>
> I'd personally drop the err_ prefix here because it's pretty obvious
> that we're going to do this as a result of an error happening.

Changing to "goto disable".
> > @@ -845,10 +837,25 @@ static int tegra_smmu_configure(struct tegra_smmu *smmu, struct device *dev,
> >  	return 0;
> >  }
> >
> > +static struct tegra_smmu *tegra_smmu_get_by_fwnode(struct fwnode_handle *fwnode)
> > +{
> > +	struct device *dev = driver_find_device_by_fwnode(&tegra_mc_driver.driver, fwnode);
> > +	struct tegra_mc *mc;
> > +
> > +	if (!dev)
> > +		return NULL;
> > +
> > +	put_device(dev);
> > +	mc = dev_get_drvdata(dev);
> > +
> > +	return mc->smmu;
> > +}
> > +
>
> As I mentioned in another reply, I think tegra_smmu_find() should be all
> you need in this case.

This function is used by .probe_device() where its dev pointer is an SMMU client. IIUC, tegra_smmu_find() needs the np pointer of "mc". For a PCI device that doesn't have a DT node with an iommus property, I'm not sure how we can use tegra_smmu_find().
Re: [PATCH v10 01/13] mm: Define pasid in mm
On Fri, Sep 18, 2020 at 12:18:41PM +0200, Jean-Philippe Brucker wrote:
> From: Fenghua Yu
>
> PASID is shared by all threads in a process. So the logical place to keep
> track of it is in the "mm". Both ARM and X86 need to use the PASID in the
> "mm".
>
> Suggested-by: Christoph Hellwig
> Signed-off-by: Fenghua Yu
> Reviewed-by: Tony Luck
> ---
> https://lore.kernel.org/linux-iommu/1600187413-163670-8-git-send-email-fenghua...@intel.com/
> ---
>  include/linux/mm_types.h | 4
>  1 file changed, 4 insertions(+)

Acked-by: Will Deacon

Will
Re: [PATCH 4/5] iommu/tegra-smmu: Add PCI support
On Sun, Sep 27, 2020 at 01:18:15AM +0300, Dmitry Osipenko wrote:
> On 26.09.2020 11:07, Nicolin Chen wrote:
> ...
> > +#ifdef CONFIG_PCI
> > +	if (!iommu_present(&pci_bus_type)) {
>
> Is this iommu_present() check really needed?
>
> > +		pci_request_acs();
>
> Shouldn't pci_request_acs() be invoked *after* bus_set_iommu() succeeds?

Will move it after that. Thanks!
Re: [PATCH 4/5] iommu/tegra-smmu: Add PCI support
On Mon, Sep 28, 2020 at 09:55:45AM +0200, Thierry Reding wrote:
> On Sat, Sep 26, 2020 at 01:07:18AM -0700, Nicolin Chen wrote:
> > +#ifdef CONFIG_PCI
> > +	if (!iommu_present(&pci_bus_type)) {
> > +		pci_request_acs();
> > +		err = bus_set_iommu(&pci_bus_type, &tegra_smmu_ops);
> > +		if (err < 0) {
> > +			bus_set_iommu(&pci_bus_type, NULL);
> > +			iommu_device_unregister(&smmu->iommu);
> > +			iommu_device_sysfs_remove(&smmu->iommu);
> > +			return ERR_PTR(err);
>
> It might be worth factoring out the cleanup code now that there are
> multiple failures from which we may need to clean up.

Will do.

> Also, it'd be great if somehow we could do this without the #ifdef,
> but I guess since we're using the pci_bus_type global variable directly,
> there isn't much we can do here?

Probably not -- both pci_bus_type and pci_request_acs seem to depend on it.
Re: [PATCH v10 00/13] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)
On Mon, Sep 28, 2020 at 06:23:15PM +0100, Will Deacon wrote:
> On Mon, Sep 28, 2020 at 06:47:31PM +0200, Jean-Philippe Brucker wrote:
> > On Fri, Sep 18, 2020 at 12:18:40PM +0200, Jean-Philippe Brucker wrote:
> > > This is version 10 of the page table sharing support for Arm SMMUv3.
> > > Patch 1 still needs an Ack from mm maintainers. However patches 4-11 do
> > > not depend on it, and could get merged for v5.10 regardless.
> >
> > Are you OK with taking patches 4-11 for v5.10?
> >
> > The rest depends on patch 1 which hasn't been acked yet. It's
> > uncontroversial and I'm sure it will eventually make it. In case it
> > doesn't, we'll keep track of mm->pasid within the IOMMU subsystem instead.
>
> I was off most of last week, but I plan to see how much of this I can queue
> tonight. Stay tuned...

I've queued 4-11 locally, but I've put 4 and 6 on a shared branch with arm64 (for-next/svm), so I'd like that to hit next before I push out the merge into the branch for Joerg.

Will
[PATCH v2 2/2] iommu/tegra-smmu: Expend mutex protection range
This is used to protect the potential race condition at use_count, since probes of client drivers, calling attach_dev(), may run concurrently.

Signed-off-by: Nicolin Chen
---
Changelog v1->v2:
 * N/A

 drivers/iommu/tegra-smmu.c | 34 +-
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index ec4c9dafff95..acda5902e095 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -256,26 +256,19 @@ static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
 {
 	unsigned long id;

-	mutex_lock(&smmu->lock);
-
 	id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
-	if (id >= smmu->soc->num_asids) {
-		mutex_unlock(&smmu->lock);
+	if (id >= smmu->soc->num_asids)
 		return -ENOSPC;
-	}

 	set_bit(id, smmu->asids);
 	*idp = id;

-	mutex_unlock(&smmu->lock);
 	return 0;
 }

 static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
 {
-	mutex_lock(&smmu->lock);
 	clear_bit(id, smmu->asids);
-	mutex_unlock(&smmu->lock);
 }

 static bool tegra_smmu_capable(enum iommu_cap cap)
@@ -420,17 +413,21 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
 				 struct tegra_smmu_as *as)
 {
 	u32 value;
-	int err;
+	int err = 0;
+
+	mutex_lock(&smmu->lock);

 	if (as->use_count > 0) {
 		as->use_count++;
-		return 0;
+		goto err_unlock;
 	}

 	as->pd_dma = dma_map_page(smmu->dev, as->pd, 0, SMMU_SIZE_PD,
 				  DMA_TO_DEVICE);
-	if (dma_mapping_error(smmu->dev, as->pd_dma))
-		return -ENOMEM;
+	if (dma_mapping_error(smmu->dev, as->pd_dma)) {
+		err = -ENOMEM;
+		goto err_unlock;
+	}

 	/* We can't handle 64-bit DMA addresses */
 	if (!smmu_dma_addr_valid(smmu, as->pd_dma)) {
@@ -453,24 +450,35 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
 	as->smmu = smmu;
 	as->use_count++;

+	mutex_unlock(&smmu->lock);
+
 	return 0;

 err_unmap:
 	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);
+err_unlock:
+	mutex_unlock(&smmu->lock);
+
 	return err;
 }

 static void tegra_smmu_as_unprepare(struct tegra_smmu *smmu,
 				    struct tegra_smmu_as *as)
 {
-	if (--as->use_count > 0)
+	mutex_lock(&smmu->lock);
+
+	if (--as->use_count > 0) {
+		mutex_unlock(&smmu->lock);
 		return;
+	}

 	tegra_smmu_free_asid(smmu, as->id);

 	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);

 	as->smmu = NULL;
+
+	mutex_unlock(&smmu->lock);
 }

 static int tegra_smmu_attach_dev(struct iommu_domain *domain,
--
2.17.1
[PATCH v2 0/2] iommu/tegra-smmu: Two followup changes
Two followup patches for tegra-smmu:

PATCH-1 is a clean-up patch for the recently applied SWGROUP change.
PATCH-2 fixes a potential race condition.

Changelog v1->v2:
 * Separated first two changes of V1 so they may get applied first, since the other three changes need some extra time to finalize.

Nicolin Chen (2):
  iommu/tegra-smmu: Unwrap tegra_smmu_group_get
  iommu/tegra-smmu: Expend mutex protection range

 drivers/iommu/tegra-smmu.c | 53 ++
 1 file changed, 25 insertions(+), 28 deletions(-)

--
2.17.1
[PATCH v2 1/2] iommu/tegra-smmu: Unwrap tegra_smmu_group_get
The tegra_smmu_group_get was added to group devices in different SWGROUPs, and it'd return a NULL group pointer upon a mismatch at tegra_smmu_find_group(), so for most clients/devices it would very likely mismatch and need a fallback to generic_device_group(). But now tegra_smmu_group_get handles devices in the same SWGROUP too, which means that it would allocate a group for every new SWGROUP or directly return an existing one upon matching a SWGROUP, i.e. any device will go through this function. So the possibility of having a NULL group pointer in device_group() is upon failure of either devm_kzalloc() or iommu_group_alloc(). In either case, calling generic_device_group() no longer makes sense. Especially for the devm_kzalloc() failing case, it'd cause a problem if it fails at devm_kzalloc() yet succeeds at a fallback generic_device_group(), because it does not create a group->list for other devices to match. This patch simply unwraps the function to clean it up.

Signed-off-by: Nicolin Chen
---
Changelog v1->v2:
 * Changed type of swgroup to "unsigned int", following Thierry's comments.
 drivers/iommu/tegra-smmu.c | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 0becdbfea306..ec4c9dafff95 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -903,10 +903,12 @@ static void tegra_smmu_group_release(void *iommu_data)
 	mutex_unlock(&smmu->lock);
 }

-static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu,
-						unsigned int swgroup)
+static struct iommu_group *tegra_smmu_device_group(struct device *dev)
 {
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
 	const struct tegra_smmu_group_soc *soc;
+	unsigned int swgroup = fwspec->ids[0];
 	struct tegra_smmu_group *group;
 	struct iommu_group *grp;

@@ -950,19 +952,6 @@ static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu,
 	return group->group;
 }

-static struct iommu_group *tegra_smmu_device_group(struct device *dev)
-{
-	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-	struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
-	struct iommu_group *group;
-
-	group = tegra_smmu_group_get(smmu, fwspec->ids[0]);
-	if (!group)
-		group = generic_device_group(dev);
-
-	return group;
-}
-
 static int tegra_smmu_of_xlate(struct device *dev,
 			       struct of_phandle_args *args)
 {
--
2.17.1
[PATCH V2] iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()
If of_find_device_by_node() succeeds, qcom_iommu_of_xlate() doesn't have a corresponding put_device() on its error paths. Thus add put_device() to fix the exception handling for this function implementation.

Fixes: 0ae349a0f33fb ("iommu/qcom: Add qcom_iommu")
Signed-off-by: Yu Kuai
---
Changes in V2:
 - Fix wrong 'Fixes'
 - add missing '{}' after if

 drivers/iommu/arm/arm-smmu/qcom_iommu.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 9535a6af7553..b30d6c966e2c 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -584,8 +584,10 @@ static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
 	 * index into qcom_iommu->ctxs:
 	 */
 	if (WARN_ON(asid < 1) ||
-	    WARN_ON(asid > qcom_iommu->num_ctxs))
+	    WARN_ON(asid > qcom_iommu->num_ctxs)) {
+		put_device(&iommu_pdev->dev);
 		return -EINVAL;
+	}

 	if (!dev_iommu_priv_get(dev)) {
 		dev_iommu_priv_set(dev, qcom_iommu);
@@ -594,8 +596,10 @@ static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
 		 * multiple different iommu devices.  Multiple context
 		 * banks are ok, but multiple devices are not:
 		 */
-		if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev)))
+		if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev))) {
+			put_device(&iommu_pdev->dev);
 			return -EINVAL;
+		}
 	}

 	return iommu_fwspec_add_ids(dev, &asid, 1);
--
2.25.4
Re: [PATCH v4 0/7] Convert the intel iommu driver to the dma-iommu api
Hi Tvrtko,

On 9/28/20 5:44 PM, Tvrtko Ursulin wrote:
> On 27/09/2020 07:34, Lu Baolu wrote:
> > Hi,
> >
> > The previous post of this series could be found here.
> > https://lore.kernel.org/linux-iommu/20200912032200.11489-1-baolu...@linux.intel.com/
> >
> > This version introduces a new patch [4/7] to fix an issue reported here.
> > https://lore.kernel.org/linux-iommu/51a1baec-48d1-c0ac-181b-1fba92aa4...@linux.intel.com/
> >
> > There aren't any other changes. Please help to test and review.
> >
> > Best regards,
> > baolu
> >
> > Lu Baolu (3):
> >   iommu: Add quirk for Intel graphic devices in map_sg
>
> Since I do have patches to fix i915 to handle this, do we want to
> co-ordinate the two and avoid having to add this quirk and then later
> remove it? Or do you want to go the staged approach?

I have no preference. It depends on which patch goes first. Let the maintainers help here.

Best regards,
baolu

> Regards,
> Tvrtko
>
> >   iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev
> >   iommu/vt-d: Cleanup after converting to dma-iommu ops
> >
> > Tom Murphy (4):
> >   iommu: Handle freelists when using deferred flushing in iommu drivers
> >   iommu: Add iommu_dma_free_cpu_cached_iovas()
> >   iommu: Allow the dma-iommu api to use bounce buffers
> >   iommu/vt-d: Convert intel iommu driver to the iommu ops
> >
> >  .../admin-guide/kernel-parameters.txt |   5 -
> >  drivers/iommu/dma-iommu.c             | 228 -
> >  drivers/iommu/intel/Kconfig           |   1 +
> >  drivers/iommu/intel/iommu.c           | 901 +++---
> >  include/linux/dma-iommu.h             |   8 +
> >  include/linux/iommu.h                 |   1 +
> >  6 files changed, 336 insertions(+), 808 deletions(-)
Re: [PATCH v2 2/2] iommu/tegra-smmu: Expend mutex protection range
...

> static bool tegra_smmu_capable(enum iommu_cap cap)
> @@ -420,17 +413,21 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
> 				 struct tegra_smmu_as *as)
> {
> 	u32 value;
> -	int err;
> +	int err = 0;
> +
> +	mutex_lock(&smmu->lock);
>
> 	if (as->use_count > 0) {
> 		as->use_count++;
> -		return 0;
> +		goto err_unlock;

This looks a bit odd because it's not an error condition. Perhaps it would be better to "goto bump_usecount"? Or make it similar to tegra_smmu_as_unprepare()?

> 	}
>
> 	as->pd_dma = dma_map_page(smmu->dev, as->pd, 0, SMMU_SIZE_PD,
> 				  DMA_TO_DEVICE);
> -	if (dma_mapping_error(smmu->dev, as->pd_dma))
> -		return -ENOMEM;
> +	if (dma_mapping_error(smmu->dev, as->pd_dma)) {
> +		err = -ENOMEM;
> +		goto err_unlock;
> +	}
>
> 	/* We can't handle 64-bit DMA addresses */
> 	if (!smmu_dma_addr_valid(smmu, as->pd_dma)) {
> @@ -453,24 +450,35 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
> 	as->smmu = smmu;

bump_usecount:

> 	as->use_count++;
>
> +	mutex_unlock(&smmu->lock);
> +
> 	return 0;
>
> err_unmap:
> 	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);
> +err_unlock:
> +	mutex_unlock(&smmu->lock);
> +
> 	return err;
> }
>
> static void tegra_smmu_as_unprepare(struct tegra_smmu *smmu,
> 				    struct tegra_smmu_as *as)
> {
> -	if (--as->use_count > 0)
> +	mutex_lock(&smmu->lock);
> +
> +	if (--as->use_count > 0) {
> +		mutex_unlock(&smmu->lock);
> 		return;
> +	}
>
> 	tegra_smmu_free_asid(smmu, as->id);
>
> 	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);
>
> 	as->smmu = NULL;
> +
> +	mutex_unlock(&smmu->lock);
> }
>
> static int tegra_smmu_attach_dev(struct iommu_domain *domain,
Re: [PATCH v2 2/2] iommu/tegra-smmu: Expend mutex protection range
On Tue, Sep 29, 2020 at 03:17:58AM +0300, Dmitry Osipenko wrote:
> ...
> > static bool tegra_smmu_capable(enum iommu_cap cap)
> > @@ -420,17 +413,21 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
> > 				 struct tegra_smmu_as *as)
> > {
> > 	u32 value;
> > -	int err;
> > +	int err = 0;
> > +
> > +	mutex_lock(&smmu->lock);
> >
> > 	if (as->use_count > 0) {
> > 		as->use_count++;
> > -		return 0;
> > +		goto err_unlock;
>
> This looks a bit odd because it's not an error condition. Perhaps it would
> be better to "goto bump_usecount"?
>
> Or make it similar to tegra_smmu_as_unprepare()?

Hmm... I think it's simpler to just make it "goto unlock" then.
Re: [PATCH] iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()
On 2020/09/29 7:08, Will Deacon wrote:
> On Mon, Sep 21, 2020 at 09:45:57PM +0100, Will Deacon wrote:
> > On Tue, Sep 22, 2020 at 03:13:53AM +0800, kernel test robot wrote:
> > > Thank you for the patch! Perhaps something to improve:
> > >
> > > [auto build test WARNING on iommu/next]
> > > [also build test WARNING on linus/master v5.9-rc6 next-20200921]
> > > [cannot apply to robclark/msm-next]
> > > ...
> > > All warnings (new ones prefixed by >>):
> > >
> > > >> drivers/iommu/arm/arm-smmu/qcom_iommu.c:601:4: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation]
> > >                 return -EINVAL;
> > >                 ^
> > >    drivers/iommu/arm/arm-smmu/qcom_iommu.c:599:3: note: previous statement is here
> > >            if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev)))
> >
> > Oh, this looks like a nasty bug. Seems we're missing some braces.
>
> Yu Kuai: please could you send a v2 of this?

Hi, Will

Thanks for your notice, I will send a V2 soon.
Yu Kuai
Re: [PATCH V7 0/3] iommu: Add support to change default domain of an iommu group
Hi Joerg,

On Fri, Sep 25, 2020 at 09:34:23AM +0200, Joerg Roedel wrote:
> Hi Ashok,
>
> On Thu, Sep 24, 2020 at 10:21:48AM -0700, Raj, Ashok wrote:
> > Just trying to follow up on this series.
> >
> > Sai has moved out of Intel, hence I'm trying to follow up on his behalf.
> >
> > Let me know if you have queued this for the next release.
>
> Not yet, but I think this is mostly ready. Can you please send a new
> version in a new mail thread so that I can pick it up with b4?

I reposted Sai's patches again here.

https://lore.kernel.org/linux-iommu/20200925190620.18732-2-ashok@intel.com/

Can you let me know if they are in a format you can pick up via b4? With the recent issues we have had sending mail out of Intel, I wanted to make sure everything made it out in one piece.
Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()
...

> > As I mentioned in another reply, I think tegra_smmu_find() should be all
> > you need in this case.
>
> This function is used by .probe_device() where its dev pointer is
> an SMMU client. IIUC, tegra_smmu_find() needs the np pointer of "mc".
> For a PCI device that doesn't have a DT node with an iommus property,
> I'm not sure how we can use tegra_smmu_find().

Perhaps you could get np from struct pci_dev.bus?
[PATCH v3 2/2] iommu/tegra-smmu: Expend mutex protection range
This is used to protect the potential race condition at use_count, since
probes of client drivers, calling attach_dev(), may run concurrently.

Signed-off-by: Nicolin Chen
---
Changelog
v2->v3:
 * Renamed label "err_unlock" to "unlock"
v1->v2:
 * N/A

 drivers/iommu/tegra-smmu.c | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index ec4c9dafff95..6a3ecc334481 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -256,26 +256,19 @@ static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
 {
 	unsigned long id;
 
-	mutex_lock(&smmu->lock);
-
 	id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
-	if (id >= smmu->soc->num_asids) {
-		mutex_unlock(&smmu->lock);
+	if (id >= smmu->soc->num_asids)
 		return -ENOSPC;
-	}
 
 	set_bit(id, smmu->asids);
 	*idp = id;
 
-	mutex_unlock(&smmu->lock);
 	return 0;
 }
 
 static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
 {
-	mutex_lock(&smmu->lock);
 	clear_bit(id, smmu->asids);
-	mutex_unlock(&smmu->lock);
 }
 
 static bool tegra_smmu_capable(enum iommu_cap cap)
@@ -420,17 +413,21 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
 				 struct tegra_smmu_as *as)
 {
 	u32 value;
-	int err;
+	int err = 0;
+
+	mutex_lock(&smmu->lock);
 
 	if (as->use_count > 0) {
 		as->use_count++;
-		return 0;
+		goto unlock;
 	}
 
 	as->pd_dma = dma_map_page(smmu->dev, as->pd, 0, SMMU_SIZE_PD,
 				  DMA_TO_DEVICE);
-	if (dma_mapping_error(smmu->dev, as->pd_dma))
-		return -ENOMEM;
+	if (dma_mapping_error(smmu->dev, as->pd_dma)) {
+		err = -ENOMEM;
+		goto unlock;
+	}
 
 	/* We can't handle 64-bit DMA addresses */
 	if (!smmu_dma_addr_valid(smmu, as->pd_dma)) {
@@ -453,24 +450,35 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
 	as->smmu = smmu;
 	as->use_count++;
 
+	mutex_unlock(&smmu->lock);
+
 	return 0;
 
 err_unmap:
 	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);
+unlock:
+	mutex_unlock(&smmu->lock);
+
 	return err;
 }
 
 static void tegra_smmu_as_unprepare(struct tegra_smmu *smmu,
 				    struct tegra_smmu_as *as)
 {
-	if (--as->use_count > 0)
+	mutex_lock(&smmu->lock);
+
+	if (--as->use_count > 0) {
+		mutex_unlock(&smmu->lock);
 		return;
+	}
 
 	tegra_smmu_free_asid(smmu, as->id);
 
 	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);
 
 	as->smmu = NULL;
+
+	mutex_unlock(&smmu->lock);
 }
 
 static int tegra_smmu_attach_dev(struct iommu_domain *domain,
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
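The locking change in the patch above can be sketched as a small user-space model (hypothetical struct and function names, with a pthread mutex standing in for the kernel `smmu->lock`): the point is that the check-then-increment of `use_count` and the one-time setup must all happen under a single critical section, otherwise two concurrent probes can both observe `use_count == 0` and both run the setup path.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical miniature of tegra_smmu_as_prepare()/_unprepare().
 * setup_runs counts how many times the one-time setup executed;
 * correct locking keeps it at exactly 1 while the AS is in use. */
struct as_state {
	pthread_mutex_t lock;
	int use_count;
	int setup_runs;
};

static int as_prepare(struct as_state *as)
{
	pthread_mutex_lock(&as->lock);

	if (as->use_count > 0) {
		/* Already set up: just take another reference. */
		as->use_count++;
		pthread_mutex_unlock(&as->lock);
		return 0;
	}

	as->setup_runs++;	/* stands in for dma_map_page() etc. */
	as->use_count++;

	pthread_mutex_unlock(&as->lock);
	return 0;
}

static void as_unprepare(struct as_state *as)
{
	pthread_mutex_lock(&as->lock);

	if (--as->use_count > 0) {
		pthread_mutex_unlock(&as->lock);
		return;
	}

	as->setup_runs--;	/* stands in for the teardown path */

	pthread_mutex_unlock(&as->lock);
}
```

With the lock taken only inside the helpers (as before the patch), a second caller could slip in between the `use_count` check and the setup; widening the critical section to cover both removes that window.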
[PATCH v3 0/2] iommu/tegra-smmu: Two followup changes
Two followup patches for tegra-smmu:
PATCH-1 is a clean-up patch for the recently applied SWGROUP change.
PATCH-2 fixes a potential race condition.

Changelog
v2->v3:
 * PATCH-2: renamed "err_unlock" to "unlock"
v1->v2:
 * Separated the first two changes of V1 so they may get applied first,
   since the other three changes need some extra time to finalize.

Nicolin Chen (2):
  iommu/tegra-smmu: Unwrap tegra_smmu_group_get
  iommu/tegra-smmu: Expand mutex protection range

 drivers/iommu/tegra-smmu.c | 53 +++++++++++++++++++++++----------------------------
 1 file changed, 25 insertions(+), 28 deletions(-)

-- 
2.17.1
[PATCH v3 1/2] iommu/tegra-smmu: Unwrap tegra_smmu_group_get
tegra_smmu_group_get() was added to group devices from different SWGROUPs,
and it would return a NULL group pointer upon a mismatch at
tegra_smmu_find_group(), so for most clients/devices it was likely to
mismatch and need the generic_device_group() fallback.

But now tegra_smmu_group_get() handles devices in the same SWGROUP too,
which means it either allocates a group for every new SWGROUP or directly
returns an existing one upon matching a SWGROUP, i.e. every device goes
through this function.

So the only way to get a NULL group pointer in device_group() is a failure
of either devm_kzalloc() or iommu_group_alloc(). In either case, calling
generic_device_group() no longer makes sense. Especially in the
devm_kzalloc() failure case, it would cause a problem if the allocation
fails yet a fallback generic_device_group() succeeds, because then no
group->list entry is created for other devices to match.

This patch simply unwraps the function to clean it up.

Signed-off-by: Nicolin Chen
---
Changelog
v2->v3:
 * N/A
v1->v2:
 * Changed type of swgroup to "unsigned int", following Thierry's
   comments.
 drivers/iommu/tegra-smmu.c | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 0becdbfea306..ec4c9dafff95 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -903,10 +903,12 @@ static void tegra_smmu_group_release(void *iommu_data)
 	mutex_unlock(&smmu->lock);
 }
 
-static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu,
-						unsigned int swgroup)
+static struct iommu_group *tegra_smmu_device_group(struct device *dev)
 {
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
 	const struct tegra_smmu_group_soc *soc;
+	unsigned int swgroup = fwspec->ids[0];
 	struct tegra_smmu_group *group;
 	struct iommu_group *grp;
 
@@ -950,19 +952,6 @@ static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu,
 	return group->group;
 }
 
-static struct iommu_group *tegra_smmu_device_group(struct device *dev)
-{
-	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-	struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
-	struct iommu_group *group;
-
-	group = tegra_smmu_group_get(smmu, fwspec->ids[0]);
-	if (!group)
-		group = generic_device_group(dev);
-
-	return group;
-}
-
 static int tegra_smmu_of_xlate(struct device *dev,
 			       struct of_phandle_args *args)
 {
-- 
2.17.1
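The lookup-or-create behavior the commit message describes can be sketched in isolation (hypothetical names, a plain linked list standing in for the driver's group list): every caller goes through the same path, matching an existing entry by SWGROUP or allocating a new one, so a NULL return can only mean an allocation failure, with no "mismatch" case left that would justify a generic fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch of the pattern in tegra_smmu_device_group(). */
struct group_entry {
	unsigned int swgroup;
	struct group_entry *next;
};

static struct group_entry *group_list;

static struct group_entry *group_get(unsigned int swgroup)
{
	struct group_entry *g;

	/* Match an existing group for this SWGROUP first. */
	for (g = group_list; g; g = g->next)
		if (g->swgroup == swgroup)
			return g;

	/* None found: allocate and link a new one. */
	g = calloc(1, sizeof(*g));
	if (!g)
		return NULL;	/* only failure path, nothing to fall back to */

	g->swgroup = swgroup;
	g->next = group_list;
	group_list = g;
	return g;
}
```

Two devices with the same SWGROUP get the same entry back; a fallback that allocated a group without linking it into the list would break exactly that sharing, which is the problem the commit message points out.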
Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()
On Tue, Sep 29, 2020 at 07:06:37AM +0300, Dmitry Osipenko wrote:
> ...
> >> As I mentioned in another reply, I think tegra_smmu_find() should be
> >> all you need in this case.
> >
> > This function is used by .probe_device() where its dev pointer is
> > an SMMU client. IIUC, tegra_smmu_find() needs the np pointer of "mc".
> > For a PCI device that doesn't have a DT node with an iommus property,
> > not sure how we can use tegra_smmu_find().
>
> Perhaps you could get np from struct pci_dev.bus?

Possibly, yes... but I was hoping the solution could potentially be
reused by other devices that don't have DT nodes, USB for example,
though I haven't tested with the current version.