Re: [PATCH 04/12] dma-mapping: fix DMA_OPS dependencies
On Thu, Sep 10, 2020 at 01:55:37PM +0100, Robin Murphy wrote: > AFAICS all three of these bus drivers are only proxying a struct > dma_map_ops * pointer around, so if they used the set_dma_ops() helper they > shouldn't even need these selects at all. Only INTEL_MIC_HOST appears to > have a logical dependency on DMA_OPS for actual functionality. > > However, I have a vague feeling you might not be fond of those dma_ops > helpers, and I have no great objection to this one-liner as-is, so (modulo > the couple of commit message typos), The problem with these inherÑ–tances is that they don't actually work for the general case. You'd also need to inherity things like the dma ranges, the bus limits, etc, etc. So we need to kill them instead. That whole mic/vop case is even worse than that with it's weird set of chained dma ops that seems to implement some kind of device side iommu that isn't in scope for the DMA API at all. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 04/12] dma-mapping: fix DMA_OPS dependencies
On Tue, Sep 08, 2020 at 09:04:17PM +0300, Sergei Shtylyov wrote: > Hello! > > On 9/8/20 7:47 PM, Christoph Hellwig wrote: > > > Driver that select DMA_OPS need to depend on HAS_DMA support to > > work. The vop driver was missing that dependency, so add it, and also > > add a nother depends in DMA_OPS itself. That won't fix the issue due > >Another? :-) Yes, fixed. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 09/12] dma-direct: remove __dma_to_phys
On Thu, Sep 10, 2020 at 02:26:03PM +0100, Robin Murphy wrote: > On 2020-09-08 17:47, Christoph Hellwig wrote: >> There is no harm in just always clearing the SME encryption bit, while >> significantly simplifying the interface. > > After a 10-minute diversion into "but hang on, force_dma_unencrypted() is > meaningful on PPC and S390 too..." before realising that it all does just > come back to __sme_clr(), which is indeed a no-op for everyone other than > AMD, any simplification of this mess is indeed welcome :) > > Unless I've massively misunderstood how SME is supposed to work, Exactly. This weird encryption bit in AMD SME causes all kinds of harm, and I'm glad no one picked it up. I've also been wondering if we should change the interface to explicit set/clear the bit, but I'll leave that for another pass as fixing up the SME interfaces would turn into a massive disgression. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 11/12] dma-mapping: move dma_common_{mmap,get_sgtable} out of mapping.c
On Thu, Sep 10, 2020 at 02:34:18PM +0100, Robin Murphy wrote: > On 2020-09-08 17:47, Christoph Hellwig wrote: >> Add a new file that contains helpera for misc DMA ops, which is only > > The Latin plural of the singular "helperum", I guess? :P I've switched it to helpers, that might be easier to understand :) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 12/12] dma-mapping: move the dma_declare_coherent_memory documentation
On Thu, Sep 10, 2020 at 02:51:47PM +0100, Robin Murphy wrote: > On 2020-09-08 17:47, Christoph Hellwig wrote: >> dma_declare_coherent_memory should not be in a DMA API guide aimed >> at driver writers (that is consumers of the API). Move it to a comment >> near the function instead. > > I still think there might be an occasional valid use for device-local > memory outside the scope of platform code without the driver having to go > full ZONE_DEVICE/HMM/TTM, e.g. with stuff like PCIe-based FPGA prototyping > cards, but the kind of driver I'm imagining for that case would never be > upstream anyway (if it were even written, rather than just using hard-coded > hacks), so meh. And I'm not sure this would be the right interface for it. E.g. NVMe has the concept of a Controller Memory buffer (and a similar persistent variant not supported by Linux), where the device can do this local DMA (in a completely broken way that relies on correlating addresses seen by the device and those by the host, but that's another disgression). But that memory obviously can also be addresses by other devices using PCIe P2P transactions which would also be useful for any HMM-ish devices, so we'd need to expose it as P2P memory anyay.. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 3/3] iommu/tegra-smmu: Allow to group clients in same swgroup
There can be clients using the same swgroup in DT, for example i2c0 and i2c1. The current driver will add them to separate IOMMU groups, though it has implemented device_group() callback which is to group devices using different swgroups like DC and DCB. All clients having the same swgroup should be also added to the same IOMMU group so as to share an asid. Otherwise, the asid register may get overwritten every time a new device is attached. Signed-off-by: Nicolin Chen --- drivers/iommu/tegra-smmu.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c index 50b962b0647e..84fdee473873 100644 --- a/drivers/iommu/tegra-smmu.c +++ b/drivers/iommu/tegra-smmu.c @@ -23,6 +23,7 @@ struct tegra_smmu_group { struct tegra_smmu *smmu; const struct tegra_smmu_group_soc *soc; struct iommu_group *group; + unsigned int swgroup; }; struct tegra_smmu { @@ -909,14 +910,14 @@ static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu, struct tegra_smmu_group *group; struct iommu_group *grp; + /* Find group_soc associating with swgroup */ soc = tegra_smmu_find_group(smmu, swgroup); - if (!soc) - return NULL; mutex_lock(&smmu->lock); + /* Find existing iommu_group associating with swgroup or group_soc */ list_for_each_entry(group, &smmu->groups, list) - if (group->soc == soc) { + if ((group->swgroup == swgroup) || (soc && group->soc == soc)) { grp = iommu_group_ref_get(group->group); mutex_unlock(&smmu->lock); return grp; @@ -929,6 +930,7 @@ static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu, } INIT_LIST_HEAD(&group->list); + group->swgroup = swgroup; group->smmu = smmu; group->soc = soc; @@ -940,7 +942,8 @@ static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu, } iommu_group_set_iommudata(group->group, group, tegra_smmu_group_release); - iommu_group_set_name(group->group, soc->name); + if (soc) + iommu_group_set_name(group->group, soc->name); list_add_tail(&group->list, &smmu->groups); mutex_unlock(&smmu->lock); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 2/3] iommu/tegra-smmu: Fix iova->phys translation
IOVA might not be always 4KB aligned. So tegra_smmu_iova_to_phys function needs to add on the lower 12-bit offset from input iova. Signed-off-by: Nicolin Chen --- drivers/iommu/tegra-smmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c index 789d21c01b77..50b962b0647e 100644 --- a/drivers/iommu/tegra-smmu.c +++ b/drivers/iommu/tegra-smmu.c @@ -795,7 +795,7 @@ static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain, pfn = *pte & as->smmu->pfn_mask; - return SMMU_PFN_PHYS(pfn); + return SMMU_PFN_PHYS(pfn) + SMMU_OFFSET_IN_PAGE(iova); } static struct tegra_smmu *tegra_smmu_find(struct device_node *np) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 0/3] iommu/tegra-smmu: Some small fixes
These are a series of small fixes for tegra-smmu driver. They might not be critial bugs as current mainline does not enable a lot of clients, but be nicer to have since we are going to enable the DMA domain for other clients in the near future. Nicolin Chen (3): iommu/tegra-smmu: Do not use PAGE_SHIFT and PAGE_MASK iommu/tegra-smmu: Fix iova->phys translation iommu/tegra-smmu: Allow to group clients in same swgroup drivers/iommu/tegra-smmu.c | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 1/3] iommu/tegra-smmu: Do not use PAGE_SHIFT and PAGE_MASK
PAGE_SHIFT and PAGE_MASK are defined corresponding to the page size for CPU virtual addresses, which means PAGE_SHIFT could be a number other than 12, but tegra-smmu maintains fixed 4KB IOVA pages and has fixed [21:12] bit range for PTE entries. So this patch replaces all PAGE_SHIFT/PAGE_MASK references with the macros defined with SMMU_PTE_SHIFT. Signed-off-by: Nicolin Chen --- drivers/iommu/tegra-smmu.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c index 046add7acb61..789d21c01b77 100644 --- a/drivers/iommu/tegra-smmu.c +++ b/drivers/iommu/tegra-smmu.c @@ -130,6 +130,11 @@ static inline u32 smmu_readl(struct tegra_smmu *smmu, unsigned long offset) #define SMMU_PDE_SHIFT 22 #define SMMU_PTE_SHIFT 12 +#define SMMU_PAGE_MASK (~(SMMU_SIZE_PT-1)) +#define SMMU_OFFSET_IN_PAGE(x) ((unsigned long)(x) & ~SMMU_PAGE_MASK) +#define SMMU_PFN_PHYS(x) ((phys_addr_t)(x) << SMMU_PTE_SHIFT) +#define SMMU_PHYS_PFN(x) ((unsigned long)((x) >> SMMU_PTE_SHIFT)) + #define SMMU_PD_READABLE (1 << 31) #define SMMU_PD_WRITABLE (1 << 30) #define SMMU_PD_NONSECURE (1 << 29) @@ -644,7 +649,7 @@ static void tegra_smmu_set_pte(struct tegra_smmu_as *as, unsigned long iova, u32 *pte, dma_addr_t pte_dma, u32 val) { struct tegra_smmu *smmu = as->smmu; - unsigned long offset = offset_in_page(pte); + unsigned long offset = SMMU_OFFSET_IN_PAGE(pte); *pte = val; @@ -726,7 +731,7 @@ __tegra_smmu_map(struct iommu_domain *domain, unsigned long iova, pte_attrs |= SMMU_PTE_WRITABLE; tegra_smmu_set_pte(as, iova, pte, pte_dma, - __phys_to_pfn(paddr) | pte_attrs); + SMMU_PHYS_PFN(paddr) | pte_attrs); return 0; } @@ -790,7 +795,7 @@ static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain, pfn = *pte & as->smmu->pfn_mask; - return PFN_PHYS(pfn); + return SMMU_PFN_PHYS(pfn); } static struct tegra_smmu *tegra_smmu_find(struct device_node *np) @@ -1108,7 +1113,8 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev, smmu->dev = dev; smmu->mc = mc; - smmu->pfn_mask = BIT_MASK(mc->soc->num_address_bits - PAGE_SHIFT) - 1; + smmu->pfn_mask = + BIT_MASK(mc->soc->num_address_bits - SMMU_PTE_SHIFT) - 1; dev_dbg(dev, "address bits: %u, PFN mask: %#lx\n", mc->soc->num_address_bits, smmu->pfn_mask); smmu->tlb_mask = (smmu->soc->num_tlb_lines << 1) - 1; -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 0/8] iommu/arm-smmu: Support maintaining bootloader mappings
Hi Bjorn, On 2020-09-04 21:25, Bjorn Andersson wrote: Based on previous attempts and discussions this is the latest attempt at inheriting stream mappings set up by the bootloader, for e.g. boot splash or efifb. Per Will's request this builds on the work by Jordan and Rob for the Adreno SMMU support. It applies cleanly ontop of v16 of their series, which can be found at https://lore.kernel.org/linux-arm-msm/20200901164707.2645413-1-robdcl...@gmail.com/ Thanks for working on this, I have tested this on qcom platforms where firmware does these shenanigans(most android) and this series works well and where firmware doesn't do all this (chrome) and no regressions there. Review and test tags given on individual patches. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 1/8] iommu/arm-smmu: Refactor context bank allocation
On 2020-09-04 21:25, Bjorn Andersson wrote: Extract the conditional invocation of the platform defined alloc_context_bank() to a separate function to keep arm_smmu_init_domain_context() cleaner. Instead pass a reference to the arm_smmu_device as parameter to the call. Also remove the count parameter, as this can be read from the newly passed object. This allows us to not assign smmu_domain->smmu before attempting to allocate the context bank and as such we don't need to roll back this assignment on failure. Signed-off-by: Bjorn Andersson --- Reviewed-by: Sai Prakash Ranjan Tested-by: Sai Prakash Ranjan -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 2/8] iommu/arm-smmu: Delay modifying domain during init
On 2020-09-04 21:25, Bjorn Andersson wrote: Delay modifications to the domain during arm_smmu_init_domain_context() until we've allocated a context bank. This will allow us to postpone the special handling of identity domains until the platform specific context bank allocator has been executed, in a later patch. Signed-off-by: Bjorn Andersson --- Reviewed-by: Sai Prakash Ranjan Tested-by: Sai Prakash Ranjan -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 3/8] iommu/arm-smmu: Consult context bank allocator for identify domains
On 2020-09-04 21:25, Bjorn Andersson wrote: For implementations of the ARM SMMU where stream mappings of bypass type are prohibited identity domains can be implemented by using context banks with translation disabled. Postpone the decision to skip allocating a context bank until the implementation specific context bank allocator has been consulted and if it decides to use a context bank for the identity map, don't enable translation (i.e. omit ARM_SMMU_SCTLR_M). Signed-off-by: Bjorn Andersson --- Minor nit in the subject: identify -> identity Reviewed-by: Sai Prakash Ranjan Tested-by: Sai Prakash Ranjan -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 3/8] iommu/arm-smmu: Consult context bank allocator for identify domains
On 2020-09-04 21:25, Bjorn Andersson wrote: For implementations of the ARM SMMU where stream mappings of bypass type are prohibited identity domains can be implemented by using context banks with translation disabled. Postpone the decision to skip allocating a context bank until the implementation specific context bank allocator has been consulted and if it decides to use a context bank for the identity map, don't enable translation (i.e. omit ARM_SMMU_SCTLR_M). Signed-off-by: Bjorn Andersson --- ... diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h index ddf2ca4c923d..235d9a3a6ab6 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h @@ -243,6 +243,8 @@ enum arm_smmu_cbar_type { #define TLB_LOOP_TIMEOUT 100 /* 1s! */ #define TLB_SPIN_COUNT 10 +#define ARM_SMMU_CBNDX_BYPASS 0x + /* Shared driver definitions */ enum arm_smmu_arch_version { ARM_SMMU_V1, @@ -346,6 +348,7 @@ struct arm_smmu_cfg { u32 sctlr_clr;/* bits to mask in SCTLR */ enum arm_smmu_cbar_type cbar; enum arm_smmu_context_fmt fmt; + boolm; Can we use mmu_enable instead of m here to be more descriptive? Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 4/8] iommu/arm-smmu-qcom: Emulate bypass by using context banks
On 2020-09-04 21:25, Bjorn Andersson wrote: Some firmware found on various Qualcomm platforms traps writes to S2CR of type BYPASS and writes FAULT into the register. In particular, this prevents us from marking the streams for the display controller as BYPASS to allow continued scanout of the screen through the initialization of the ARM SMMU. This adds a Qualcomm specific cfg_probe function, which probes for the broken behavior of the S2CR registers and implements a custom alloc_context_bank() that when necessary allocates a context bank (without translation) for these domains as well. Signed-off-by: Bjorn Andersson --- Changes since v2: - Move quirk from arm_smmudevice to qcom_smmu, as we localize the quirk handling to the Qualcomm specific implemntation. drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 52 ++ 1 file changed, 52 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c index 229fc8ff8cea..284761a1cd8e 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c @@ -11,8 +11,14 @@ struct qcom_smmu { struct arm_smmu_device smmu; + bool bypass_broken; }; +static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) +{ + return container_of(smmu, struct qcom_smmu, smmu); +} + #define QCOM_ADRENO_SMMU_GPU_SID 0 static bool qcom_adreno_smmu_is_gpu_device(struct device *dev) @@ -162,6 +168,50 @@ static const struct of_device_id qcom_smmu_client_of_match[] __maybe_unused = { { } }; +static int qcom_smmu_alloc_context_bank(struct arm_smmu_domain *smmu_domain, + struct arm_smmu_device *smmu, + struct device *dev, int start) +{ + struct iommu_domain *domain = &smmu_domain->domain; + struct qcom_smmu *qsmmu = to_qcom_smmu(smmu); + + /* Keep identity domains as bypass, unless bypass is broken */ + if (domain->type == IOMMU_DOMAIN_IDENTITY && !qsmmu->bypass_broken) + return ARM_SMMU_CBNDX_BYPASS; + + /* +* The identity domain to emulate bypass is the only domain without a +* dev, use the last context bank for this to avoid collisions with +* active contexts during initialization. +*/ + if (!dev) + start = smmu->num_context_banks - 1; + + return __arm_smmu_alloc_bitmap(smmu->context_map, start, smmu->num_context_banks); +} + +static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu) +{ + unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1); + struct qcom_smmu *qsmmu = to_qcom_smmu(smmu); + u32 reg; + + /* +* With some firmware writes to S2CR of type FAULT are ignored, and + * writing BYPASS will end up as FAULT in the register. Perform a write +* to S2CR to detect if this is the case with the current firmware. +*/ + reg = FIELD_PREP(ARM_SMMU_S2CR_TYPE, S2CR_TYPE_BYPASS) | + FIELD_PREP(ARM_SMMU_S2CR_CBNDX, 0xff) | + FIELD_PREP(ARM_SMMU_S2CR_PRIVCFG, S2CR_PRIVCFG_DEFAULT); + arm_smmu_gr0_write(smmu, last_s2cr, reg); + reg = arm_smmu_gr0_read(smmu, last_s2cr); + if (FIELD_GET(ARM_SMMU_S2CR_TYPE, reg) != S2CR_TYPE_BYPASS) + qsmmu->bypass_broken = true; + Clever :) Reviewed-by: Sai Prakash Ranjan Tested-by: Sai Prakash Ranjan -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 5/8] iommu/arm-smmu-qcom: Consistently initialize stream mappings
On 2020-09-04 21:25, Bjorn Andersson wrote: Firmware that traps writes to S2CR to translate BYPASS into FAULT also ignores writes of type FAULT. As such booting with "disable_bypass" set will result in all S2CR registers left as configured by the bootloader. This has been seen to result in indeterministic results, as these mappings might linger and reference context banks that Linux is reconfiguring. Use the fact that BYPASS writes result in FAULT type to force all stream mappings to FAULT. Signed-off-by: Bjorn Andersson --- Reviewed-by: Sai Prakash Ranjan Tested-by: Sai Prakash Ranjan -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 6/8] iommu/arm-smmu: Add impl hook for inherit boot mappings
On 2020-09-04 21:25, Bjorn Andersson wrote: Add a new operation to allow platform implementations to inherit any stream mappings from the boot loader. Signed-off-by: Bjorn Andersson --- Reviewed-by: Sai Prakash Ranjan Tested-by: Sai Prakash Ranjan -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 7/8] iommu/arm-smmu: Provide helper for allocating identity domain
On 2020-09-04 21:25, Bjorn Andersson wrote: Some platform implementations needs to be able to allocate a domain for emulating identity mappings using a context bank without translation. Provide a helper function to allocate such a domain. Signed-off-by: Bjorn Andersson --- Reviewed-by: Sai Prakash Ranjan Tested-by: Sai Prakash Ranjan -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 8/8] iommu/arm-smmu-qcom: Setup identity domain for boot mappings
On 2020-09-04 21:25, Bjorn Andersson wrote: With many Qualcomm platforms not having functional S2CR BYPASS a temporary IOMMU domain, without translation, needs to be allocated in order to allow these memory transactions. Unfortunately the boot loader uses the first few context banks, so rather than overwriting a active bank the last context bank is used and streams are diverted here during initialization. This also performs the readback of SMR registers for the Qualcomm platform, to trigger the mechanism. This is based on prior work by Thierry Reding and Laurentiu Tudor. Signed-off-by: Bjorn Andersson --- Reviewed-by: Sai Prakash Ranjan Tested-by: Sai Prakash Ranjan -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu: Kconfig: Update help description for IPMMU_VMSA config
ipmmu-vmsa driver is also used on Renesas RZ/G{1,2} Soc's, update the same to reflect the help description for IPMMU_VMSA config. Signed-off-by: Lad Prabhakar Reviewed-by: Chris Paterson --- drivers/iommu/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index bef5d75e306b..d8f71bf31786 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -232,7 +232,7 @@ config IPMMU_VMSA select ARM_DMA_USE_IOMMU help Support for the Renesas VMSA-compatible IPMMU found in the R-Mobile - APE6, R-Car Gen2, and R-Car Gen3 SoCs. + APE6, R-Car Gen{2,3} and RZ/G{1,2} SoCs. If unsure, say N. -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2] iommu/dma: Fix IOVA reserve dma ranges
Fix IOVA reserve failure in the case when address of first memory region listed in dma-ranges is equal to 0x0. Fixes: aadad097cd46f ("iommu/dma: Reserve IOVA for PCIe inaccessible DMA address") Signed-off-by: Srinath Mannam --- Changes from v1: Removed unnecessary changes based on Robin's review comments. drivers/iommu/dma-iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 5141d49a046b..682068a9aae7 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -217,7 +217,7 @@ static int iova_reserve_pci_windows(struct pci_dev *dev, lo = iova_pfn(iovad, start); hi = iova_pfn(iovad, end); reserve_iova(iovad, lo, hi); - } else { + } else if (end < start) { /* dma_ranges list should be sorted */ dev_err(&dev->dev, "Failed to reserve IOVA\n"); return -EINVAL; -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/3] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE
On 2020-09-10 06:40, Christoph Hellwig wrote: The DMA offset notifier can only be used if PHYS_OFFSET is at least KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit phys_addr_t. Currently the code compiles fine despite that, a pending change to the DMA offset handling would create a compiler warning for this case. Add an ifdef to not compile the code except for LPAE configs. Seems reasonable - once again I wonder whether this notifier is really needed any more since "dma-ranges" should now be handled properly for PCI devices as well, but that's not something to worry about in this series. Reviewed-by: Robin Murphy Signed-off-by: Christoph Hellwig --- arch/arm/mach-keystone/keystone.c | 4 1 file changed, 4 insertions(+) diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c index 638808c4e12247..dcd031ba84c2e0 100644 --- a/arch/arm/mach-keystone/keystone.c +++ b/arch/arm/mach-keystone/keystone.c @@ -24,6 +24,7 @@ #include "keystone.h" +#ifdef CONFIG_ARM_LPAE static unsigned long keystone_dma_pfn_offset __read_mostly; static int keystone_platform_notifier(struct notifier_block *nb, @@ -48,14 +49,17 @@ static int keystone_platform_notifier(struct notifier_block *nb, static struct notifier_block platform_nb = { .notifier_call = keystone_platform_notifier, }; +#endif /* CONFIG_ARM_LPAE */ static void __init keystone_init(void) { +#ifdef CONFIG_ARM_LPAE if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) { keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START); bus_register_notifier(&platform_bus_type, &platform_nb); } +#endif keystone_pm_runtime_init(); } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/3] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE
On 2020-09-11 12:15, Russell King - ARM Linux admin wrote: On Thu, Sep 10, 2020 at 07:40:37AM +0200, Christoph Hellwig wrote: The DMA offset notifier can only be used if PHYS_OFFSET is at least KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit phys_addr_t. Currently the code compiles fine despite that, a pending change to the DMA offset handling would create a compiler warning for this case. Add an ifdef to not compile the code except for LPAE configs. However, to have use of the high physical offset, LPAE needs to be enabled, which ensures that phys_addr_t is 64-bit. I believe that DMA is non-coherent on this platform unless the high physical address is used. Or something like that. Yeah, it's probably not a configuration that anyone would actually want to use in anger on Keystone itself, but as long as folks might have ARCH_KEYSTONE selected in their non-LPAE multiplatform config we should avoid build regressions. I did wonder if Keystone should simply have a hard dependency on LPAE, but there does appear to be some explicit support for running directly from the non-coherent low 2G alias :/ Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/3] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE
On Thu, Sep 10, 2020 at 07:40:37AM +0200, Christoph Hellwig wrote: > The DMA offset notifier can only be used if PHYS_OFFSET is at least > KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit > phys_addr_t. Currently the code compiles fine despite that, a pending > change to the DMA offset handling would create a compiler warning for > this case. Add an ifdef to not compile the code except for LPAE > configs. However, to have use of the high physical offset, LPAE needs to be enabled, which ensures that phys_addr_t is 64-bit. I believe that DMA is non-coherent on this platform unless the high physical address is used. Or something like that. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu: fsl_pamu: Replace use of kzfree with kfree_sensitive
kzfree() is effectively deprecated as of commit 453431a54934 ("mm, treewide: rename kzfree() to kfree_sensitive()") and is now simply an alias for kfree_sensitive(). So just replace it with kfree_sensitive(). Signed-off-by: Alex Dewar --- drivers/iommu/fsl_pamu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/fsl_pamu.c b/drivers/iommu/fsl_pamu.c index 099a11a35fb9..b9a974d97831 100644 --- a/drivers/iommu/fsl_pamu.c +++ b/drivers/iommu/fsl_pamu.c @@ -1174,7 +1174,7 @@ static int fsl_pamu_probe(struct platform_device *pdev) if (irq != NO_IRQ) free_irq(irq, data); - kzfree(data); + kfree_sensitive(data); if (pamu_regs) iounmap(pamu_regs); -- 2.28.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCHv4 0/6] System Cache support for GPU and required SMMU support
Some hardware variants contain a system cache or the last level cache(llc). This cache is typically a large block which is shared by multiple clients on the SOC. GPU uses the system cache to cache both the GPU data buffers(like textures) as well the SMMU pagetables. This helps with improved render performance as well as lower power consumption by reducing the bus traffic to the system memory. The system cache architecture allows the cache to be split into slices which then be used by multiple SOC clients. This patch series is an effort to enable and use two of those slices perallocated for the GPU, one for the GPU data buffers and another for the GPU SMMU hardware pagetables. Patch 1 - Patch 4 adds system cache support in SMMU and GPU driver. Patch 5 and 6 are minor cleanups for arm-smmu impl. The series is based on top of https://gitlab.freedesktop.org/drm/msm/-/tree/msm-next-pgtables Changes in v4: * Drop IOMMU_SYS_CACHE prot flag * Rebase on top of https://gitlab.freedesktop.org/drm/msm/-/tree/msm-next-pgtables Changes in v3: * Fix domain attribute setting to before iommu_attach_device() * Fix few code style and checkpatch warnings * Rebase on top of Jordan's latest split pagetables and per-instance pagetables support Changes in v2: * Addressed review comments and rebased on top of Jordan's split pagetables series Sai Prakash Ranjan (4): iommu/io-pgtable-arm: Add support to use system cache iommu/arm-smmu: Add domain attribute for system cache iommu: arm-smmu-impl: Use table to list QCOM implementations iommu: arm-smmu-impl: Remove unwanted extra blank lines Sharat Masetty (2): drm/msm: rearrange the gpu_rmw() function drm/msm/a6xx: Add support for using system cache(LLC) drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 83 ++ drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c| 21 +- drivers/gpu/drm/msm/msm_drv.c | 8 +++ drivers/gpu/drm/msm/msm_drv.h | 1 + drivers/gpu/drm/msm/msm_gpu.h | 5 +- drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 17 ++--- drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 + drivers/iommu/arm/arm-smmu/arm-smmu.h | 1 + drivers/iommu/io-pgtable-arm.c | 7 +- include/linux/io-pgtable.h | 4 ++ include/linux/iommu.h | 1 + 12 files changed, 155 insertions(+), 14 deletions(-) base-commit: 11e579ab6a3c2003efa2cfd1f0b3b4395f041618 -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCHv4 1/6] iommu/io-pgtable-arm: Add support to use system cache
Add a quirk IO_PGTABLE_QUIRK_SYS_CACHE to override the attributes set in TCR for the page table walker when using system cache. Signed-off-by: Sai Prakash Ranjan --- drivers/iommu/io-pgtable-arm.c | 7 ++- include/linux/io-pgtable.h | 4 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index dc7bcf858b6d..828426c16fa9 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -789,7 +789,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie) if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NON_STRICT | - IO_PGTABLE_QUIRK_ARM_TTBR1)) + IO_PGTABLE_QUIRK_ARM_TTBR1 | + IO_PGTABLE_QUIRK_SYS_CACHE)) return NULL; data = arm_lpae_alloc_pgtable(cfg); @@ -801,6 +802,10 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie) tcr->sh = ARM_LPAE_TCR_SH_IS; tcr->irgn = ARM_LPAE_TCR_RGN_WBWA; tcr->orgn = ARM_LPAE_TCR_RGN_WBWA; + } else if (cfg->quirks & IO_PGTABLE_QUIRK_SYS_CACHE) { + tcr->sh = ARM_LPAE_TCR_SH_OS; + tcr->irgn = ARM_LPAE_TCR_RGN_NC; + tcr->orgn = ARM_LPAE_TCR_RGN_WBWA; } else { tcr->sh = ARM_LPAE_TCR_SH_OS; tcr->irgn = ARM_LPAE_TCR_RGN_NC; diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index 23285ba645db..ecc9d2248b84 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -86,6 +86,9 @@ struct io_pgtable_cfg { * * IO_PGTABLE_QUIRK_ARM_TTBR1: (ARM LPAE format) Configure the table * for use in the upper half of a split address space. +* +* IO_PGTABLE_QUIRK_SYS_CACHE: Override the attributes set in TCR for +* the page table walker when using system cache. */ #define IO_PGTABLE_QUIRK_ARM_NS BIT(0) #define IO_PGTABLE_QUIRK_NO_PERMS BIT(1) @@ -93,6 +96,7 @@ struct io_pgtable_cfg { #define IO_PGTABLE_QUIRK_ARM_MTK_EXTBIT(3) #define IO_PGTABLE_QUIRK_NON_STRICT BIT(4) #define IO_PGTABLE_QUIRK_ARM_TTBR1 BIT(5) + #define IO_PGTABLE_QUIRK_SYS_CACHE BIT(6) unsigned long quirks; unsigned long pgsize_bitmap; unsigned intias; -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCHv4 2/6] iommu/arm-smmu: Add domain attribute for system cache
Add iommu domain attribute for using system cache aka last level cache by client drivers like GPU to set right attributes for caching the hardware pagetables into the system cache. Signed-off-by: Sai Prakash Ranjan --- drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 + drivers/iommu/arm/arm-smmu/arm-smmu.h | 1 + include/linux/iommu.h | 1 + 3 files changed, 19 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 1f06ab219819..d449c895ba16 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -789,6 +789,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, if (smmu_domain->non_strict) pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT; + if (smmu_domain->sys_cache) + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE; + pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain); if (!pgtbl_ops) { ret = -ENOMEM; @@ -1513,6 +1516,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain, case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: *(int *)data = smmu_domain->non_strict; return 0; + case DOMAIN_ATTR_SYS_CACHE: + *((int *)data) = smmu_domain->sys_cache; + return 0; default: return -ENODEV; } @@ -1544,6 +1550,17 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain, else smmu_domain->stage = ARM_SMMU_DOMAIN_S1; break; + case DOMAIN_ATTR_SYS_CACHE: + if (smmu_domain->smmu) { + ret = -EPERM; + goto out_unlock; + } + + if (*((int *)data)) + smmu_domain->sys_cache = true; + else + smmu_domain->sys_cache = false; + break; default: ret = -ENODEV; } diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h index ddf2ca4c923d..93593e164e44 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h @@ -373,6 +373,7 @@ struct arm_smmu_domain { struct mutexinit_mutex; /* Protects smmu pointer */ spinlock_t cb_lock; /* Serialises ATS1* ops and TLB syncs */ struct iommu_domain domain; + boolsys_cache; }; struct arm_smmu_master_cfg { diff --git a/include/linux/iommu.h b/include/linux/iommu.h index fee209efb756..a580dfe9c68d 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -118,6 +118,7 @@ enum iommu_attr { DOMAIN_ATTR_FSL_PAMUV1, DOMAIN_ATTR_NESTING,/* two stages of translation */ DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE, + DOMAIN_ATTR_SYS_CACHE, DOMAIN_ATTR_MAX, }; -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCHv4 3/6] drm/msm: rearrange the gpu_rmw() function
From: Sharat Masetty The register read-modify-write construct is generic enough that it can be used by other subsystems as needed, create a more generic rmw() function and have the gpu_rmw() use this new function. Signed-off-by: Sharat Masetty Reviewed-by: Jordan Crouse Signed-off-by: Sai Prakash Ranjan --- drivers/gpu/drm/msm/msm_drv.c | 8 drivers/gpu/drm/msm/msm_drv.h | 1 + drivers/gpu/drm/msm/msm_gpu.h | 5 + 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index abf5799d9a22..03caafa7c7b2 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -180,6 +180,14 @@ u32 msm_readl(const void __iomem *addr) return val; } +void msm_rmw(void __iomem *addr, u32 mask, u32 or) +{ + u32 val = msm_readl(addr); + + val &= ~mask; + msm_writel(val | or, addr); +} + struct msm_vblank_work { struct work_struct work; int crtc_id; diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h index 2ca9c3c03845..aa07900d9ac4 100644 --- a/drivers/gpu/drm/msm/msm_drv.h +++ b/drivers/gpu/drm/msm/msm_drv.h @@ -419,6 +419,7 @@ void __iomem *msm_ioremap_quiet(struct platform_device *pdev, const char *name, const char *dbgname); void msm_writel(u32 data, void __iomem *addr); u32 msm_readl(const void __iomem *addr); +void msm_rmw(void __iomem *addr, u32 mask, u32 or); struct msm_gpu_submitqueue; int msm_submitqueue_init(struct drm_device *drm, struct msm_file_private *ctx); diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 5ee358b480e6..1d446b2e5746 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -245,10 +245,7 @@ static inline u32 gpu_read(struct msm_gpu *gpu, u32 reg) static inline void gpu_rmw(struct msm_gpu *gpu, u32 reg, u32 mask, u32 or) { - uint32_t val = gpu_read(gpu, reg); - - val &= ~mask; - gpu_write(gpu, reg, val | or); + msm_rmw(gpu->mmio + (reg << 2), mask, or); } static inline u64 gpu_read64(struct msm_gpu *gpu, u32 lo, u32 hi) -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCHv4 5/6] iommu: arm-smmu-impl: Use table to list QCOM implementations
Use table and of_match_node() to match qcom implementation instead of multiple of_device_compatible() calls for each QCOM SMMU implementation. Signed-off-by: Sai Prakash Ranjan --- drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c index d199b4bff15d..ce78295cfa78 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c @@ -9,6 +9,13 @@ #include "arm-smmu.h" +static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = { + { .compatible = "qcom,sc7180-smmu-500" }, + { .compatible = "qcom,sdm845-smmu-500" }, + { .compatible = "qcom,sm8150-smmu-500" }, + { .compatible = "qcom,sm8250-smmu-500" }, + { } +}; static int arm_smmu_gr0_ns(int offset) { @@ -217,10 +224,7 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu) if (of_device_is_compatible(np, "nvidia,tegra194-smmu")) return nvidia_smmu_impl_init(smmu); - if (of_device_is_compatible(np, "qcom,sdm845-smmu-500") || - of_device_is_compatible(np, "qcom,sc7180-smmu-500") || - of_device_is_compatible(np, "qcom,sm8150-smmu-500") || - of_device_is_compatible(np, "qcom,sm8250-smmu-500")) + if (of_match_node(qcom_smmu_impl_of_match, np)) return qcom_smmu_impl_init(smmu); if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu")) -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines
There are few places in arm-smmu-impl where there are extra blank lines, remove them and while at it fix the checkpatch warning for space required before the open parenthesis. Signed-off-by: Sai Prakash Ranjan --- drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c index ce78295cfa78..f5b5218cbe5b 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c @@ -19,7 +19,7 @@ static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = { static int arm_smmu_gr0_ns(int offset) { - switch(offset) { + switch (offset) { case ARM_SMMU_GR0_sCR0: case ARM_SMMU_GR0_sACR: case ARM_SMMU_GR0_sGFSR: @@ -54,7 +54,6 @@ static const struct arm_smmu_impl calxeda_impl = { .write_reg = arm_smmu_write_ns, }; - struct cavium_smmu { struct arm_smmu_device smmu; u32 id_base; @@ -110,7 +109,6 @@ static struct arm_smmu_device *cavium_smmu_impl_init(struct arm_smmu_device *smm return &cs->smmu; } - #define ARM_MMU500_ACTLR_CPRE (1 << 1) #define ARM_MMU500_ACR_CACHE_LOCK (1 << 26) @@ -197,7 +195,6 @@ static const struct arm_smmu_impl mrvl_mmu500_impl = { .reset = arm_mmu500_reset, }; - struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu) { const struct device_node *np = smmu->dev->of_node; -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCHv4 4/6] drm/msm/a6xx: Add support for using system cache(LLC)
From: Sharat Masetty The last level system cache can be partitioned to 32 different slices of which GPU has two slices preallocated. One slice is used for caching GPU buffers and the other slice is used for caching the GPU SMMU pagetables. This talks to the core system cache driver to acquire the slice handles, configure the SCID's to those slices and activates and deactivates the slices upon GPU power collapse and restore. Some support from the IOMMU driver is also needed to make use of the system cache to set the right TCR attributes. GPU then has the ability to override a few cacheability parameters which it does to override write-allocate to write-no-allocate as the GPU hardware does not benefit much from it. DOMAIN_ATTR_SYS_CACHE is another domain level attribute used by the IOMMU driver to set the right attributes to cache the hardware pagetables into the system cache. Signed-off-by: Sharat Masetty (saiprakash.ranjan: fix to set attr before device attach to iommu and rebase) Signed-off-by: Sai Prakash Ranjan --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 83 + drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c | 21 ++- 3 files changed, 107 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index a3a8d6fd06bb..6241b93ed375 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -9,6 +9,8 @@ #include "a6xx_gmu.xml.h" #include +#include +#include #define GPU_PAS_ID 13 @@ -969,6 +971,79 @@ static const u32 a6xx_register_offsets[REG_ADRENO_REGISTER_MAX] = { REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_CNTL, REG_A6XX_CP_RB_CNTL), }; +static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask, u32 or) +{ + return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or); +} + +static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 value) +{ + return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2)); +} + +static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu) +{ + llcc_slice_deactivate(a6xx_gpu->llc_slice); + llcc_slice_deactivate(a6xx_gpu->htw_llc_slice); +} + +static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu) +{ + u32 cntl1_regval = 0; + + if (IS_ERR(a6xx_gpu->llc_mmio)) + return; + + if (!llcc_slice_activate(a6xx_gpu->llc_slice)) { + u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice); + + gpu_scid &= 0x1f; + cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << 10) | + (gpu_scid << 15) | (gpu_scid << 20); + } + + if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) { + u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice); + + gpuhtw_scid &= 0x1f; + cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid); + } + + if (cntl1_regval) { + /* +* Program the slice IDs for the various GPU blocks and GPU MMU +* pagetables +*/ + a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1, cntl1_regval); + + /* +* Program cacheability overrides to not allocate cache lines on +* a write miss +*/ + a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 0xF, 0x03); + } +} + +static void a6xx_llc_slices_destroy(struct a6xx_gpu *a6xx_gpu) +{ + llcc_slice_putd(a6xx_gpu->llc_slice); + llcc_slice_putd(a6xx_gpu->htw_llc_slice); +} + +static void a6xx_llc_slices_init(struct platform_device *pdev, + struct a6xx_gpu *a6xx_gpu) +{ + a6xx_gpu->llc_mmio = msm_ioremap(pdev, "cx_mem", "gpu_cx"); + if (IS_ERR(a6xx_gpu->llc_mmio)) + return; + + a6xx_gpu->llc_slice = llcc_slice_getd(LLCC_GPU); + a6xx_gpu->htw_llc_slice = llcc_slice_getd(LLCC_GPUHTW); + + if (IS_ERR(a6xx_gpu->llc_slice) && IS_ERR(a6xx_gpu->htw_llc_slice)) + a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL); +} + static int a6xx_pm_resume(struct msm_gpu *gpu) { struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); @@ -985,6 +1060,8 @@ static int a6xx_pm_resume(struct msm_gpu *gpu) msm_gpu_resume_devfreq(gpu); + a6xx_llc_activate(a6xx_gpu); + return 0; } @@ -995,6 +1072,8 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu) trace_msm_gpu_suspend(0); + a6xx_llc_deactivate(a6xx_gpu); + devfreq_suspend_device(gpu->devfreq.devfreq); return a6xx_gmu_stop(a6xx_gpu); @@ -1033,6 +1112,8 @@ static void a6xx_destroy(struct msm_gpu *gpu) drm_gem_object_put(a6xx_gpu->sqe_bo); } + a6xx_llc_slices_destroy(a6xx_gpu); + a6xx_gmu_remove(a6xx_gpu); adreno_gpu_cleanup(adreno_gpu); @@ -1132,6 +1213,8 @
Re: [PATCH v2] iommu/dma: Fix IOVA reserve dma ranges
On Fri, Sep 11, 2020 at 03:55:34PM +0530, Srinath Mannam wrote: > Fix IOVA reserve failure in the case when address of first memory region > listed in dma-ranges is equal to 0x0. > > Fixes: aadad097cd46f ("iommu/dma: Reserve IOVA for PCIe inaccessible DMA > address") > Signed-off-by: Srinath Mannam > --- > Changes from v1: >Removed unnecessary changes based on Robin's review comments. > > drivers/iommu/dma-iommu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > index 5141d49a046b..682068a9aae7 100644 > --- a/drivers/iommu/dma-iommu.c > +++ b/drivers/iommu/dma-iommu.c > @@ -217,7 +217,7 @@ static int iova_reserve_pci_windows(struct pci_dev *dev, > lo = iova_pfn(iovad, start); > hi = iova_pfn(iovad, end); > reserve_iova(iovad, lo, hi); > - } else { > + } else if (end < start) { > /* dma_ranges list should be sorted */ > dev_err(&dev->dev, "Failed to reserve IOVA\n"); You didn't actually change the error message, but the message would be way more useful if it included the IOVA address range, e.g., the format used in pci_register_host_bridge(): bus address [%#010llx-%#010llx] Incidentally, the pr_err() in copy_reserved_iova() looks bogus; it prints iova->pfn_low twice, when it should probably print the base and size or (my preference) something like the above: pr_err("Reserve iova range %lx@%lx failed\n", iova->pfn_lo, iova->pfn_lo); > return -EINVAL; > -- > 2.17.1 > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines
On 2020-09-11 15:28, Sai Prakash Ranjan wrote: There are few places in arm-smmu-impl where there are extra blank lines, remove them FWIW those were deliberate - sometimes I like a bit of subtle space to visually delineate distinct groups of definitions. I suppose it won't be to everyone's taste :/ and while at it fix the checkpatch warning for space required before the open parenthesis. That one, however, was not ;) BTW am I supposed to have received 3 copies of everything? Because I did... Robin. Signed-off-by: Sai Prakash Ranjan --- drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c index ce78295cfa78..f5b5218cbe5b 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c @@ -19,7 +19,7 @@ static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = { static int arm_smmu_gr0_ns(int offset) { - switch(offset) { + switch (offset) { case ARM_SMMU_GR0_sCR0: case ARM_SMMU_GR0_sACR: case ARM_SMMU_GR0_sGFSR: @@ -54,7 +54,6 @@ static const struct arm_smmu_impl calxeda_impl = { .write_reg = arm_smmu_write_ns, }; - struct cavium_smmu { struct arm_smmu_device smmu; u32 id_base; @@ -110,7 +109,6 @@ static struct arm_smmu_device *cavium_smmu_impl_init(struct arm_smmu_device *smm return &cs->smmu; } - #define ARM_MMU500_ACTLR_CPRE (1 << 1) #define ARM_MMU500_ACR_CACHE_LOCK (1 << 26) @@ -197,7 +195,6 @@ static const struct arm_smmu_impl mrvl_mmu500_impl = { .reset = arm_mmu500_reset, }; - struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu) { const struct device_node *np = smmu->dev->of_node; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines
On Fri, Sep 11, 2020 at 05:03:06PM +0100, Robin Murphy wrote: > BTW am I supposed to have received 3 copies of everything? Because I did... Yeah, this seems to be happening for all of Sai's emails :/ Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 0/8] iommu/arm-smmu: Support maintaining bootloader mappings
On Fri, 4 Sep 2020 at 21:25, Bjorn Andersson wrote: > > Based on previous attempts and discussions this is the latest attempt at > inheriting stream mappings set up by the bootloader, for e.g. boot splash or > efifb. > > Per Will's request this builds on the work by Jordan and Rob for the Adreno > SMMU support. It applies cleanly ontop of v16 of their series, which can be > found at > https://lore.kernel.org/linux-arm-msm/20200901164707.2645413-1-robdcl...@gmail.com/ > Boot tested the series on Xiaomi Poco F1 phone (sdm845) Tested-by: Amit Pundir > Bjorn Andersson (8): > iommu/arm-smmu: Refactor context bank allocation > iommu/arm-smmu: Delay modifying domain during init > iommu/arm-smmu: Consult context bank allocator for identify domains > iommu/arm-smmu-qcom: Emulate bypass by using context banks > iommu/arm-smmu-qcom: Consistently initialize stream mappings > iommu/arm-smmu: Add impl hook for inherit boot mappings > iommu/arm-smmu: Provide helper for allocating identity domain > iommu/arm-smmu-qcom: Setup identity domain for boot mappings > > drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 111 ++- > drivers/iommu/arm/arm-smmu/arm-smmu.c | 122 ++--- > drivers/iommu/arm/arm-smmu/arm-smmu.h | 14 ++- > 3 files changed, 205 insertions(+), 42 deletions(-) > > -- > 2.28.0 > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 3/3] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset
(apologies to Jim - I did look through one of the previous versions since I last commented and thought it looked OK, but never actually replied as such) On 2020-09-10 06:40, Christoph Hellwig wrote: From: Jim Quinlan The new field 'dma_range_map' in struct device is used to facilitate the use of single or multiple offsets between mapping regions of cpu addrs and dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was only capable of holding a single uniform offset and had no region bounds checking. The function of_dma_get_range() has been modified so that it takes a single argument -- the device node -- and returns a map, NULL, or an error code. The map is an array that holds the information regarding the DMA regions. Each range entry contains the address offset, the cpu_start address, the dma_start address, and the size of the region. of_dma_configure() is the typical manner to set range offsets but there are a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel driver code. These cases now invoke the function dma_attach_offset_range(dev, cpu_addr, dma_addr, size). This is now called dma_direct_set_offset(), right? Signed-off-by: Jim Quinlan [hch: various interface cleanups] Signed-off-by: Christoph Hellwig Tested-by: Nathan Chancellor --- arch/arm/include/asm/dma-direct.h | 9 +-- arch/arm/mach-keystone/keystone.c | 17 ++-- arch/sh/drivers/pci/pcie-sh7786.c | 9 ++- arch/x86/pci/sta2x11-fixup.c | 6 +- drivers/acpi/arm64/iort.c | 6 +- drivers/base/core.c | 2 + drivers/gpu/drm/sun4i/sun4i_backend.c | 8 +- drivers/iommu/io-pgtable-arm.c| 2 +- .../platform/sunxi/sun4i-csi/sun4i_csi.c | 9 ++- .../platform/sunxi/sun6i-csi/sun6i_csi.c | 11 ++- drivers/of/address.c | 73 - drivers/of/device.c | 44 ++ drivers/of/of_private.h | 11 +-- drivers/of/unittest.c | 34 +--- drivers/remoteproc/remoteproc_core.c | 4 +- .../staging/media/sunxi/cedrus/cedrus_hw.c| 10 ++- drivers/usb/core/message.c| 5 +- drivers/usb/core/usb.c| 3 +- include/linux/device.h| 4 +- include/linux/dma-direct.h| 52 ++-- include/linux/dma-mapping.h | 19 - kernel/dma/coherent.c | 7 +- kernel/dma/direct.c | 81 ++- 23 files changed, 303 insertions(+), 123 deletions(-) diff --git a/arch/arm/include/asm/dma-direct.h b/arch/arm/include/asm/dma-direct.h index de0f4ff9279615..a443d5257a21ed 100644 --- a/arch/arm/include/asm/dma-direct.h +++ b/arch/arm/include/asm/dma-direct.h @@ -16,8 +16,8 @@ #ifndef __arch_pfn_to_dma static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn) { - if (dev) - pfn -= dev->dma_pfn_offset; + if (dev && dev->dma_range_map) + pfn = PFN_DOWN(translate_phys_to_dma(dev, PFN_PHYS(pfn))); return (dma_addr_t)__pfn_to_bus(pfn); } @@ -25,9 +25,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr) { unsigned long pfn = __bus_to_pfn(addr); - if (dev) - pfn += dev->dma_pfn_offset; - + if (dev && dev->dma_range_map) + pfn = PFN_DOWN(translate_dma_to_phys(dev, PFN_PHYS(pfn))); return pfn; } diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c index dcd031ba84c2e0..09a65c2dfd7327 100644 --- a/arch/arm/mach-keystone/keystone.c +++ b/arch/arm/mach-keystone/keystone.c @@ -8,6 +8,7 @@ */ #include #include +#include #include #include #include @@ -25,8 +26,6 @@ #include "keystone.h" #ifdef CONFIG_ARM_LPAE -static unsigned long keystone_dma_pfn_offset __read_mostly; - static int keystone_platform_notifier(struct notifier_block *nb, unsigned long event, void *data) { @@ -39,9 +38,12 @@ static int keystone_platform_notifier(struct notifier_block *nb, return NOTIFY_BAD; if (!dev->of_node) { - dev->dma_pfn_offset = keystone_dma_pfn_offset; - dev_err(dev, "set dma_pfn_offset%08lx\n", - dev->dma_pfn_offset); + int ret = dma_direct_set_offset(dev, KEYSTONE_HIGH_PHYS_START, + KEYSTONE_LOW_PHYS_START, + KEYSTONE_HIGH_PHYS_SIZE); + dev_err(dev, "set dma_offset%08llx%s\n", + KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START, + ret ? " failed" : ""); FWIW I've already been thinking of some optimisations which would have
Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines
On 2020-09-11 21:33, Robin Murphy wrote: On 2020-09-11 15:28, Sai Prakash Ranjan wrote: There are few places in arm-smmu-impl where there are extra blank lines, remove them FWIW those were deliberate - sometimes I like a bit of subtle space to visually delineate distinct groups of definitions. I suppose it won't be to everyone's taste :/ Ah ok, I thought it was not intentional, I can drop it. and while at it fix the checkpatch warning for space required before the open parenthesis. That one, however, was not ;) I'll keep this one. BTW am I supposed to have received 3 copies of everything? Because I did... Ugh no, I just sent it once but something seems to have gone wrong. Apologies again if you receive this message also multiple times. I'll check further what's going wrong with my setup. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines
On 2020-09-11 21:37, Will Deacon wrote: On Fri, Sep 11, 2020 at 05:03:06PM +0100, Robin Murphy wrote: BTW am I supposed to have received 3 copies of everything? Because I did... Yeah, this seems to be happening for all of Sai's emails :/ Sorry, I am not sure what went wrong as I only sent this once and there are no recent changes to any of my configs, I'll check it further. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines
On 2020-09-11 17:21, Sai Prakash Ranjan wrote: On 2020-09-11 21:37, Will Deacon wrote: On Fri, Sep 11, 2020 at 05:03:06PM +0100, Robin Murphy wrote: BTW am I supposed to have received 3 copies of everything? Because I did... Yeah, this seems to be happening for all of Sai's emails :/ Sorry, I am not sure what went wrong as I only sent this once and there are no recent changes to any of my configs, I'll check it further. Actually on closer inspection it appears to be "correct" behaviour. I'm still subscribed to LAKML and the IOMMU list on this account, but normally Office 365 deduplicates so aggressively that I have rules set up to copy list mails that I'm cc'ed on back to my inbox, in case they arrive first and cause the direct copy to get eaten - apparently there's something unique about your email setup that manages to defeat the deduplicator and make it deliver all 3 copies intact... :/ Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines
On 2020-09-11 22:04, Robin Murphy wrote: On 2020-09-11 17:21, Sai Prakash Ranjan wrote: On 2020-09-11 21:37, Will Deacon wrote: On Fri, Sep 11, 2020 at 05:03:06PM +0100, Robin Murphy wrote: BTW am I supposed to have received 3 copies of everything? Because I did... Yeah, this seems to be happening for all of Sai's emails :/ Sorry, I am not sure what went wrong as I only sent this once and there are no recent changes to any of my configs, I'll check it further. Actually on closer inspection it appears to be "correct" behaviour. I'm still subscribed to LAKML and the IOMMU list on this account, but normally Office 365 deduplicates so aggressively that I have rules set up to copy list mails that I'm cc'ed on back to my inbox, in case they arrive first and cause the direct copy to get eaten - apparently there's something unique about your email setup that manages to defeat the deduplicator and make it deliver all 3 copies intact... :/ No changes in my local setup atleast, but in the past we have had cases with codeaurora mail acting weird or it could be my vpn, will have to check. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 6/8] iommu/arm-smmu: Add impl hook for inherit boot mappings
On 2020-09-04 16:55, Bjorn Andersson wrote: Add a new operation to allow platform implementations to inherit any stream mappings from the boot loader. Is there a reason we need an explicit step for this? The aim of the cfg_probe hook is that the SMMU software state should all be set up by then, and you can mess about with it however you like before arm_smmu_reset() actually commits anything to hardware. I would have thought you could permanently steal a context bank, configure it as your bypass hole, read out the previous SME configuration and tweak smmu->smrs and smmu->s2crs appropriately all together "invisibly" at that point. If that can't work, I'm very curious as to what I've overlooked. Robin. Signed-off-by: Bjorn Andersson --- Changes since v2: - New patch/interface drivers/iommu/arm/arm-smmu/arm-smmu.c | 11 ++- drivers/iommu/arm/arm-smmu/arm-smmu.h | 6 ++ 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index eb5c6ca5c138..4c4d302cd747 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -85,11 +85,6 @@ static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu) pm_runtime_put_autosuspend(smmu->dev); } -static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) -{ - return container_of(dom, struct arm_smmu_domain, domain); -} - static struct platform_driver arm_smmu_driver; static struct iommu_ops arm_smmu_ops; @@ -2188,6 +2183,12 @@ static int arm_smmu_device_probe(struct platform_device *pdev) if (err) return err; + if (smmu->impl->inherit_mappings) { + err = smmu->impl->inherit_mappings(smmu); + if (err) + return err; + } + if (smmu->version == ARM_SMMU_V2) { if (smmu->num_context_banks > smmu->num_context_irqs) { dev_err(dev, diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h index 235d9a3a6ab6..f58164976e74 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h @@ -378,6 +378,11 @@ struct arm_smmu_domain { struct iommu_domain domain; }; +static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) +{ + return container_of(dom, struct arm_smmu_domain, domain); +} + struct arm_smmu_master_cfg { struct arm_smmu_device *smmu; s16 smendx[]; @@ -442,6 +447,7 @@ struct arm_smmu_impl { int (*alloc_context_bank)(struct arm_smmu_domain *smmu_domain, struct arm_smmu_device *smmu, struct device *dev, int start); + int (*inherit_mappings)(struct arm_smmu_device *smmu); }; #define INVALID_SMENDX -1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 8/8] iommu/arm-smmu-qcom: Setup identity domain for boot mappings
On 2020-09-04 16:55, Bjorn Andersson wrote: With many Qualcomm platforms not having functional S2CR BYPASS a temporary IOMMU domain, without translation, needs to be allocated in order to allow these memory transactions. Unfortunately the boot loader uses the first few context banks, so rather than overwriting a active bank the last context bank is used and streams are diverted here during initialization. This also performs the readback of SMR registers for the Qualcomm platform, to trigger the mechanism. This is based on prior work by Thierry Reding and Laurentiu Tudor. Signed-off-by: Bjorn Andersson --- Changes since v2: - Combined from pieces spread between the Qualcomm impl and generic code in v2. - Moved to use the newly introduced inherit_mapping op. drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 33 ++ 1 file changed, 33 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c index 70a1eaa52e14..a54302190932 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c @@ -12,6 +12,7 @@ struct qcom_smmu { struct arm_smmu_device smmu; bool bypass_broken; + struct iommu_domain *identity; }; static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) @@ -228,6 +229,37 @@ static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu) return 0; } +static int qcom_smmu_inherit_mappings(struct arm_smmu_device *smmu) +{ + struct qcom_smmu *qsmmu = to_qcom_smmu(smmu); + int cbndx; + u32 smr; + int i; + + qsmmu->identity = arm_smmu_alloc_identity_domain(smmu); + if (IS_ERR(qsmmu->identity)) + return PTR_ERR(qsmmu->identity); + + cbndx = to_smmu_domain(qsmmu->identity)->cfg.cbndx; I don't really get the point of going through the dance of allocating a whole iommu_domain() just to get a context. If you don't want to simply statically reserve a context at probe time, then just allocate from smmu->context_map here (where AFAICS "here" should be in cfg_probe anyway). This is entirely driver-internal, so there shouldn't be any need for IOMMU-API-level stuff to be involved. Robin. + + for (i = 0; i < smmu->num_mapping_groups; i++) { + smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i)); + + if (FIELD_GET(ARM_SMMU_SMR_VALID, smr)) { + smmu->smrs[i].id = FIELD_GET(ARM_SMMU_SMR_ID, smr); + smmu->smrs[i].mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr); + smmu->smrs[i].valid = true; + + smmu->s2crs[i].type = S2CR_TYPE_TRANS; + smmu->s2crs[i].privcfg = S2CR_PRIVCFG_DEFAULT; + smmu->s2crs[i].cbndx = cbndx; + smmu->s2crs[i].count++; + } + } + + return 0; +} + static int qcom_smmu_def_domain_type(struct device *dev) { const struct of_device_id *match = @@ -270,6 +302,7 @@ static const struct arm_smmu_impl qcom_smmu_impl = { .cfg_probe = qcom_smmu_cfg_probe, .def_domain_type = qcom_smmu_def_domain_type, .reset = qcom_smmu500_reset, + .inherit_mappings = qcom_smmu_inherit_mappings, }; static const struct arm_smmu_impl qcom_adreno_smmu_impl = { ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/3] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE
On 9/11/20 4:15 AM, Russell King - ARM Linux admin wrote: On Thu, Sep 10, 2020 at 07:40:37AM +0200, Christoph Hellwig wrote: The DMA offset notifier can only be used if PHYS_OFFSET is at least KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit phys_addr_t. Currently the code compiles fine despite that, a pending change to the DMA offset handling would create a compiler warning for this case. Add an ifdef to not compile the code except for LPAE configs. However, to have use of the high physical offset, LPAE needs to be enabled, which ensures that phys_addr_t is 64-bit. I believe that DMA is non-coherent on this platform unless the high physical address is used. Or something like that. Exactly. Higher address ranges needs to be used for DMA coherency. Regards, Santosh ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 01/16] iommu: Report domain nesting info
On Thu, 10 Sep 2020 03:45:18 -0700 Liu Yi L wrote: > IOMMUs that support nesting translation needs report the capability info > to userspace. It gives information about requirements the userspace needs > to implement plus other features characterizing the physical implementation. > > This patch introduces a new IOMMU UAPI struct that gives information about > the nesting capabilities and features. This struct is supposed to be returned > by iommu_domain_get_attr() with DOMAIN_ATTR_NESTING attribute parameter, with > one domain whose type has been set to DOMAIN_ATTR_NESTING. > > Cc: Kevin Tian > CC: Jacob Pan > Cc: Alex Williamson > Cc: Eric Auger > Cc: Jean-Philippe Brucker > Cc: Joerg Roedel > Cc: Lu Baolu > Signed-off-by: Liu Yi L > Signed-off-by: Jacob Pan > --- > v6 -> v7: > *) rephrase the commit message, replace the @data[] field in struct >iommu_nesting_info with union per comments from Eric Auger. > > v5 -> v6: > *) rephrase the feature notes per comments from Eric Auger. > *) rename @size of struct iommu_nesting_info to @argsz. > > v4 -> v5: > *) address comments from Eric Auger. > > v3 -> v4: > *) split the SMMU driver changes to be a separate patch > *) move the @addr_width and @pasid_bits from vendor specific >part to generic part. > *) tweak the description for the @features field of struct >iommu_nesting_info. > *) add description on the @data[] field of struct iommu_nesting_info > > v2 -> v3: > *) remvoe cap/ecap_mask in iommu_nesting_info. > *) reuse DOMAIN_ATTR_NESTING to get nesting info. > *) return an empty iommu_nesting_info for SMMU drivers per Jean' >suggestion. > --- > include/uapi/linux/iommu.h | 76 > ++ > 1 file changed, 76 insertions(+) > > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h > index 1ebc23d..ff987e4 100644 > --- a/include/uapi/linux/iommu.h > +++ b/include/uapi/linux/iommu.h > @@ -341,4 +341,80 @@ struct iommu_gpasid_bind_data { > } vendor; > }; > > +/* > + * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info. > + * > + * @flags: VT-d specific flags. Currently reserved for future > + * extension. must be set to 0. > + * @cap_reg: Describe basic capabilities as defined in VT-d capability > + * register. > + * @ecap_reg:Describe the extended capabilities as defined in VT-d > + * extended capability register. > + */ > +struct iommu_nesting_info_vtd { > + __u32 flags; > + __u64 cap_reg; > + __u64 ecap_reg; > +}; The vendor union has 8-byte alignment, so flags here will be 8-byte aligned, followed by a compiler dependent gap before the 8-byte fields. We should fill that gap with padding to make it deterministic for userspace. Thanks, Alex > + > +/* > + * struct iommu_nesting_info - Information for nesting-capable IOMMU. > + * userspace should check it before using > + * nesting capability. > + * > + * @argsz: size of the whole structure. > + * @flags: currently reserved for future extension. must set to 0. > + * @format: PASID table entry format, the same definition as struct > + * iommu_gpasid_bind_data @format. > + * @features:supported nesting features. > + * @addr_width: The output addr width of first level/stage translation > + * @pasid_bits: Maximum supported PASID bits, 0 represents no PASID > + * support. > + * @vendor: vendor specific data, structure type can be deduced from > + * @format field. > + * > + * +===+==+ > + * | feature | Notes | > + * +===+==+ > + * | SYSWIDE_PASID | IOMMU vendor driver sets it to mandate userspace| > + * | | to allocate PASID from kernel. All PASID allocation | > + * | | free must be mediated through the IOMMU UAPI. | > + * +---+--+ > + * | BIND_PGTBL| IOMMU vendor driver sets it to mandate userspace to | > + * | | bind the first level/stage page table to associated | > + * | | PASID (either the one specified in bind request or | > + * | | the default PASID of iommu domain), through IOMMU | > + * | | UAPI. | > + * +---+--+ > + * | CACHE_INVLD | IOMMU vendor driver sets it to mandate userspace to | > + * | | explicitly invalidate the IOMMU cache through IOMMU | > + * | | UAPI according to vendor-specific requirement when | > + * | | changing the 1st level/stage page table.| > + * +---+
Re: [PATCH v7 03/16] vfio/type1: Report iommu nesting info to userspace
On Thu, 10 Sep 2020 03:45:20 -0700 Liu Yi L wrote: > This patch exports iommu nesting capability info to user space through > VFIO. Userspace is expected to check this info for supported uAPIs (e.g. > PASID alloc/free, bind page table, and cache invalidation) and the vendor > specific format information for first level/stage page table that will be > bound to. > > The nesting info is available only after container set to be NESTED type. > Current implementation imposes one limitation - one nesting container > should include at most one iommu group. The philosophy of vfio container > is having all groups/devices within the container share the same IOMMU > context. When vSVA is enabled, one IOMMU context could include one 2nd- > level address space and multiple 1st-level address spaces. While the > 2nd-level address space is reasonably sharable by multiple groups, blindly > sharing 1st-level address spaces across all groups within the container > might instead break the guest expectation. In the future sub/super container > concept might be introduced to allow partial address space sharing within > an IOMMU context. But for now let's go with this restriction by requiring > singleton container for using nesting iommu features. Below link has the > related discussion about this decision. > > https://lore.kernel.org/kvm/20200515115924.37e69...@w520.home/ > > This patch also changes the NESTING type container behaviour. Something > that would have succeeded before will now fail: Before this series, if > user asked for a VFIO_IOMMU_TYPE1_NESTING, it would have succeeded even > if the SMMU didn't support stage-2, as the driver would have silently > fallen back on stage-1 mappings (which work exactly the same as stage-2 > only since there was no nesting supported). After the series, we do check > for DOMAIN_ATTR_NESTING so if user asks for VFIO_IOMMU_TYPE1_NESTING and > the SMMU doesn't support stage-2, the ioctl fails. But it should be a good > fix and completely harmless. Detail can be found in below link as well. > > https://lore.kernel.org/kvm/20200717090900.GC4850@myrica/ > > Cc: Kevin Tian > CC: Jacob Pan > Cc: Alex Williamson > Cc: Eric Auger > Cc: Jean-Philippe Brucker > Cc: Joerg Roedel > Cc: Lu Baolu > Signed-off-by: Liu Yi L > --- > v6 -> v7: > *) using vfio_info_add_capability() for adding nesting cap per suggestion >from Eric. > > v5 -> v6: > *) address comments against v5 from Eric Auger. > *) don't report nesting cap to userspace if the nesting_info->format is >invalid. > > v4 -> v5: > *) address comments from Eric Auger. > *) return struct iommu_nesting_info for VFIO_IOMMU_TYPE1_INFO_CAP_NESTING as >cap is much "cheap", if needs extension in future, just define another cap. >https://lore.kernel.org/kvm/20200708132947.5b7ee...@x1.home/ > > v3 -> v4: > *) address comments against v3. > > v1 -> v2: > *) added in v2 > --- > drivers/vfio/vfio_iommu_type1.c | 92 > +++-- > include/uapi/linux/vfio.h | 19 + > 2 files changed, 99 insertions(+), 12 deletions(-) > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > index c992973..3c0048b 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -62,18 +62,20 @@ MODULE_PARM_DESC(dma_entry_limit, >"Maximum number of user DMA mappings per container (65535)."); > > struct vfio_iommu { > - struct list_headdomain_list; > - struct list_headiova_list; > - struct vfio_domain *external_domain; /* domain for external user */ > - struct mutexlock; > - struct rb_root dma_list; > - struct blocking_notifier_head notifier; > - unsigned intdma_avail; > - uint64_tpgsize_bitmap; > - boolv2; > - boolnesting; > - booldirty_page_tracking; > - boolpinned_page_dirty_scope; > + struct list_headdomain_list; > + struct list_headiova_list; > + /* domain for external user */ > + struct vfio_domain *external_domain; > + struct mutexlock; > + struct rb_root dma_list; > + struct blocking_notifier_head notifier; > + unsigned intdma_avail; > + uint64_tpgsize_bitmap; > + boolv2; > + boolnesting; > + booldirty_page_tracking; > + boolpinned_page_dirty_scope; > + struct iommu_nesting_info *nesting_info; Nit, not as important as the previous alignment, but might as well move this up with the uint64_t pgsize_bitmap with the bools at the end of the structure to avoid adding new gaps. > }; > > struct vfio_domain { > @@ -130,6 +132,9 @@
Re: [PATCH v7 04/16] vfio: Add PASID allocation/free support
On Thu, 10 Sep 2020 03:45:21 -0700 Liu Yi L wrote: > Shared Virtual Addressing (a.k.a Shared Virtual Memory) allows sharing > multiple process virtual address spaces with the device for simplified > programming model. PASID is used to tag an virtual address space in DMA > requests and to identify the related translation structure in IOMMU. When > a PASID-capable device is assigned to a VM, we want the same capability > of using PASID to tag guest process virtual address spaces to achieve > virtual SVA (vSVA). > > PASID management for guest is vendor specific. Some vendors (e.g. Intel > VT-d) requires system-wide managed PASIDs across all devices, regardless > of whether a device is used by host or assigned to guest. Other vendors > (e.g. ARM SMMU) may allow PASIDs managed per-device thus could be fully > delegated to the guest for assigned devices. > > For system-wide managed PASIDs, this patch introduces a vfio module to > handle explicit PASID alloc/free requests from guest. Allocated PASIDs > are associated to a process (or, mm_struct) in IOASID core. A vfio_mm > object is introduced to track mm_struct. Multiple VFIO containers within > a process share the same vfio_mm object. > > A quota mechanism is provided to prevent malicious user from exhausting > available PASIDs. Currently the quota is a global parameter applied to > all VFIO devices. In the future per-device quota might be supported too. > > Cc: Kevin Tian > CC: Jacob Pan > Cc: Eric Auger > Cc: Jean-Philippe Brucker > Cc: Joerg Roedel > Cc: Lu Baolu > Suggested-by: Alex Williamson > Signed-off-by: Liu Yi L > Reviewed-by: Eric Auger > --- > v6 -> v7: > *) remove "#include " and add r-b from Eric Auger. > > v5 -> v6: > *) address comments from Eric. Add vfio_unlink_pasid() to be consistent >with vfio_unlink_dma(). Add a comment in vfio_pasid_exit(). > > v4 -> v5: > *) address comments from Eric Auger. > *) address the comments from Alex on the pasid free range support. Added >per vfio_mm pasid r-b tree. >https://lore.kernel.org/kvm/20200709082751.32074...@x1.home/ > > v3 -> v4: > *) fix lock leam in vfio_mm_get_from_task() > *) drop pasid_quota field in struct vfio_mm > *) vfio_mm_get_from_task() returns ERR_PTR(-ENOTTY) when !CONFIG_VFIO_PASID > > v1 -> v2: > *) added in v2, split from the pasid alloc/free support of v1 > --- > drivers/vfio/Kconfig | 5 + > drivers/vfio/Makefile | 1 + > drivers/vfio/vfio_pasid.c | 247 > ++ > include/linux/vfio.h | 28 ++ > 4 files changed, 281 insertions(+) > create mode 100644 drivers/vfio/vfio_pasid.c > > diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig > index fd17db9..3d8a108 100644 > --- a/drivers/vfio/Kconfig > +++ b/drivers/vfio/Kconfig > @@ -19,6 +19,11 @@ config VFIO_VIRQFD > depends on VFIO && EVENTFD > default n > > +config VFIO_PASID > + tristate > + depends on IOASID && VFIO > + default n > + > menuconfig VFIO > tristate "VFIO Non-Privileged userspace driver framework" > depends on IOMMU_API > diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile > index de67c47..bb836a3 100644 > --- a/drivers/vfio/Makefile > +++ b/drivers/vfio/Makefile > @@ -3,6 +3,7 @@ vfio_virqfd-y := virqfd.o > > obj-$(CONFIG_VFIO) += vfio.o > obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o > +obj-$(CONFIG_VFIO_PASID) += vfio_pasid.o > obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o > obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o > obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o > diff --git a/drivers/vfio/vfio_pasid.c b/drivers/vfio/vfio_pasid.c > new file mode 100644 > index 000..44ecdd5 > --- /dev/null > +++ b/drivers/vfio/vfio_pasid.c > @@ -0,0 +1,247 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Copyright (C) 2020 Intel Corporation. > + * Author: Liu Yi L > + * > + */ > + > +#include > +#include > +#include > +#include > +#include > + > +#define DRIVER_VERSION "0.1" > +#define DRIVER_AUTHOR "Liu Yi L " > +#define DRIVER_DESC "PASID management for VFIO bus drivers" > + > +#define VFIO_DEFAULT_PASID_QUOTA 1000 I'm not sure we really need a macro to define this since it's only used once, but a comment discussing the basis for this default value would be useful. Also, since Matthew Rosato is finding it necessary to expose the available DMA mapping counter to userspace, is this also a limitation that userspace might be interested in knowing such that we should plumb it through an IOMMU info capability? > +static int pasid_quota = VFIO_DEFAULT_PASID_QUOTA; > +module_param_named(pasid_quota, pasid_quota, uint, 0444); > +MODULE_PARM_DESC(pasid_quota, > + "Set the quota for max number of PASIDs that an application is > allowed to request (default 1000)"); > + > +struct vfio_mm_token { > + unsigned long long val; > +}; > + > +struct vfio_mm { > + struct kref kref; > + struct ioasi
Re: [PATCH v7 07/16] vfio/type1: Add VFIO_IOMMU_PASID_REQUEST (alloc/free)
On Thu, 10 Sep 2020 03:45:24 -0700 Liu Yi L wrote: > This patch allows userspace to request PASID allocation/free, e.g. when > serving the request from the guest. > > PASIDs that are not freed by userspace are automatically freed when the > IOASID set is destroyed when process exits. > > Cc: Kevin Tian > CC: Jacob Pan > Cc: Alex Williamson > Cc: Eric Auger > Cc: Jean-Philippe Brucker > Cc: Joerg Roedel > Cc: Lu Baolu > Signed-off-by: Liu Yi L > Signed-off-by: Yi Sun > Signed-off-by: Jacob Pan > --- > v6 -> v7: > *) current VFIO returns allocated pasid via signed int, thus VFIO UAPI >can only support 31 bits pasid. If user space gives min,max which is >wider than 31 bits, should fail the allocation or free request. > > v5 -> v6: > *) address comments from Eric against v5. remove the alloc/free helper. > > v4 -> v5: > *) address comments from Eric Auger. > *) the comments for the PASID_FREE request is addressed in patch 5/15 of >this series. > > v3 -> v4: > *) address comments from v3, except the below comment against the range >of PASID_FREE request. needs more help on it. > "> +if (req.range.min > req.range.max) > > Is it exploitable that a user can spin the kernel for a long time in > the case of a free by calling this with [0, MAX_UINT] regardless of > their actual allocations?" > https://lore.kernel.org/linux-iommu/20200702151832.048b4...@x1.home/ > > v1 -> v2: > *) move the vfio_mm related code to be a seprate module > *) use a single structure for alloc/free, could support a range of PASIDs > *) fetch vfio_mm at group_attach time instead of at iommu driver open time > --- > drivers/vfio/Kconfig| 1 + > drivers/vfio/vfio_iommu_type1.c | 76 > + > drivers/vfio/vfio_pasid.c | 10 ++ > include/linux/vfio.h| 6 > include/uapi/linux/vfio.h | 43 +++ > 5 files changed, 136 insertions(+) > > diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig > index 3d8a108..95d90c6 100644 > --- a/drivers/vfio/Kconfig > +++ b/drivers/vfio/Kconfig > @@ -2,6 +2,7 @@ > config VFIO_IOMMU_TYPE1 > tristate > depends on VFIO > + select VFIO_PASID if (X86) > default n > > config VFIO_IOMMU_SPAPR_TCE > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > index 3c0048b..bd4b668 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -76,6 +76,7 @@ struct vfio_iommu { > booldirty_page_tracking; > boolpinned_page_dirty_scope; > struct iommu_nesting_info *nesting_info; > + struct vfio_mm *vmm; > }; > > struct vfio_domain { > @@ -2000,6 +2001,11 @@ static void vfio_iommu_iova_insert_copy(struct > vfio_iommu *iommu, > > static void vfio_iommu_release_nesting_info(struct vfio_iommu *iommu) > { > + if (iommu->vmm) { > + vfio_mm_put(iommu->vmm); > + iommu->vmm = NULL; > + } > + > kfree(iommu->nesting_info); > iommu->nesting_info = NULL; > } > @@ -2127,6 +2133,26 @@ static int vfio_iommu_type1_attach_group(void > *iommu_data, > iommu->nesting_info); > if (ret) > goto out_detach; > + > + if (iommu->nesting_info->features & > + IOMMU_NESTING_FEAT_SYSWIDE_PASID) { > + struct vfio_mm *vmm; > + struct ioasid_set *set; > + > + vmm = vfio_mm_get_from_task(current); > + if (IS_ERR(vmm)) { > + ret = PTR_ERR(vmm); > + goto out_detach; > + } > + iommu->vmm = vmm; > + > + set = vfio_mm_ioasid_set(vmm); > + ret = iommu_domain_set_attr(domain->domain, > + DOMAIN_ATTR_IOASID_SET, > + set); > + if (ret) > + goto out_detach; > + } > } > > /* Get aperture info */ > @@ -2908,6 +2934,54 @@ static int vfio_iommu_type1_dirty_pages(struct > vfio_iommu *iommu, > return -EINVAL; > } > > +static int vfio_iommu_type1_pasid_request(struct vfio_iommu *iommu, > + unsigned long arg) > +{ > + struct vfio_iommu_type1_pasid_request req; > + unsigned long minsz; > + int ret; > + > + minsz = offsetofend(struct vfio_iommu_type1_pasid_request, range); > + > + if (copy_from_user(&req, (void __user *)arg, minsz)) > + return -EFAULT; > + > + if (req.argsz < minsz || (req.flags & ~VFIO_PASID_REQUEST_MASK)) > + return -EINVAL; > + > + /* > + * Current VFIO_IOMMU_PASID
[PATCH v9 1/7] docs: IOMMU user API
IOMMU UAPI is newly introduced to support communications between guest virtual IOMMU and host IOMMU. There has been lots of discussions on how it should work with VFIO UAPI and userspace in general. This document is intended to clarify the UAPI design and usage. The mechanics of how future extensions should be achieved are also covered in this documentation. Reviewed-by: Eric Auger Signed-off-by: Liu Yi L Signed-off-by: Jacob Pan --- Documentation/userspace-api/iommu.rst | 211 ++ MAINTAINERS | 1 + 2 files changed, 212 insertions(+) create mode 100644 Documentation/userspace-api/iommu.rst diff --git a/Documentation/userspace-api/iommu.rst b/Documentation/userspace-api/iommu.rst new file mode 100644 index ..1e68e8f05bb3 --- /dev/null +++ b/Documentation/userspace-api/iommu.rst @@ -0,0 +1,211 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. iommu: + += +IOMMU Userspace API += + +IOMMU UAPI is used for virtualization cases where communications are +needed between physical and virtual IOMMU drivers. For baremetal +usage, the IOMMU is a system device which does not need to communicate +with user space directly. + +The primary use cases are guest Shared Virtual Address (SVA) and +guest IO virtual address (IOVA), wherin the vIOMMU implementation +relies on the physical IOMMU and for this reason requires interactions +with the host driver. + +.. contents:: :local: + +Functionalities +=== +Communications of user and kernel involve both directions. The +supported user-kernel APIs are as follows: + +1. Alloc/Free PASID +2. Bind/Unbind guest PASID (e.g. Intel VT-d) +3. Bind/Unbind guest PASID table (e.g. ARM SMMU) +4. Invalidate IOMMU caches upon guest requests +5. Report errors to the guest and serve page requests + +Requirements + +The IOMMU UAPIs are generic and extensible to meet the following +requirements: + +1. Emulated and para-virtualised vIOMMUs +2. Multiple vendors (Intel VT-d, ARM SMMU, etc.) +3. Extensions to the UAPI shall not break existing user space + +Interfaces +== +Although the data structures defined in IOMMU UAPI are self-contained, +there is no user API functions introduced. Instead, IOMMU UAPI is +designed to work with existing user driver frameworks such as VFIO. + +Extension Rules & Precautions +- +When IOMMU UAPI gets extended, the data structures can *only* be +modified in two ways: + +1. Adding new fields by re-purposing the padding[] field. No size change. +2. Adding new union members at the end. May increase the structure sizes. + +No new fields can be added *after* the variable sized union in that it +will break backward compatibility when offset moves. A new flag must +be introduced whenever a change affects the structure using either +method. The IOMMU driver processes the data based on flags which +ensures backward compatibility. + +Version field is only reserved for the unlikely event of UAPI upgrade +at its entirety. + +It's *always* the caller's responsibility to indicate the size of the +structure passed by setting argsz appropriately. +Though at the same time, argsz is user provided data which is not +trusted. The argsz field allows the user app to indicate how much data +it is providing, it's still the kernel's responsibility to validate +whether it's correct and sufficient for the requested operation. + +Compatibility Checking +-- +When IOMMU UAPI extension results in some structure size increase, +IOMMU UAPI code shall handle the following cases: + +1. User and kernel has exact size match +2. An older user with older kernel header (smaller UAPI size) running on a + newer kernel (larger UAPI size) +3. A newer user with newer kernel header (larger UAPI size) running + on an older kernel. +4. A malicious/misbehaving user pass illegal/invalid size but within + range. The data may contain garbage. + +Feature Checking + +While launching a guest with vIOMMU, it is strongly advised to check +the compatibility upfront, as some subsequent errors happening during +vIOMMU operation, such as cache invalidation failures cannot be nicely +escaladated to the guest due to IOMMU specifications. This can lead to +catastrophic failures for the users. + +User applications such as QEMU are expected to import kernel UAPI +headers. Backward compatibility is supported per feature flags. +For example, an older QEMU (with older kernel header) can run on newer +kernel. Newer QEMU (with new kernel header) may refuse to initialize +on an older kernel if new feature flags are not supported by older +kernel. Simply recompiling existing code with newer kernel header should +not be an issue in that only existing flags are used. + +IOMMU vendor driver should report the below features to IOMMU UAPI +consumers (e.g. via VFIO). + +1. IOMMU_NESTING_FEAT_SYSWIDE_PASID
[PATCH v9 4/7] iommu/uapi: Use named union for user data
IOMMU UAPI data size is filled by the user space which must be validated by the kernel. To ensure backward compatibility, user data can only be extended by either re-purpose padding bytes or extend the variable sized union at the end. No size change is allowed before the union. Therefore, the minimum size is the offset of the union. To use offsetof() on the union, we must make it named. Link: https://lore.kernel.org/linux-iommu/20200611145518.0c281...@x1.home/ Signed-off-by: Jacob Pan Reviewed-by: Lu Baolu Reviewed-by: Eric Auger --- drivers/iommu/intel/iommu.c | 22 +++--- drivers/iommu/intel/svm.c | 2 +- include/uapi/linux/iommu.h | 4 ++-- 3 files changed, 14 insertions(+), 14 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 87b17bac04c2..461f3a6864d4 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -5434,8 +5434,8 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, /* Size is only valid in address selective invalidation */ if (inv_info->granularity == IOMMU_INV_GRANU_ADDR) - size = to_vtd_size(inv_info->addr_info.granule_size, - inv_info->addr_info.nb_granules); + size = to_vtd_size(inv_info->granu.addr_info.granule_size, + inv_info->granu.addr_info.nb_granules); for_each_set_bit(cache_type, (unsigned long *)&inv_info->cache, @@ -5456,20 +5456,20 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, * granularity. */ if (inv_info->granularity == IOMMU_INV_GRANU_PASID && - (inv_info->pasid_info.flags & IOMMU_INV_PASID_FLAGS_PASID)) - pasid = inv_info->pasid_info.pasid; + (inv_info->granu.pasid_info.flags & IOMMU_INV_PASID_FLAGS_PASID)) + pasid = inv_info->granu.pasid_info.pasid; else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR && -(inv_info->addr_info.flags & IOMMU_INV_ADDR_FLAGS_PASID)) - pasid = inv_info->addr_info.pasid; +(inv_info->granu.addr_info.flags & IOMMU_INV_ADDR_FLAGS_PASID)) + pasid = inv_info->granu.addr_info.pasid; switch (BIT(cache_type)) { case IOMMU_CACHE_INV_TYPE_IOTLB: /* HW will ignore LSB bits based on address mask */ if (inv_info->granularity == IOMMU_INV_GRANU_ADDR && size && - (inv_info->addr_info.addr & ((BIT(VTD_PAGE_SHIFT + size)) - 1))) { + (inv_info->granu.addr_info.addr & ((BIT(VTD_PAGE_SHIFT + size)) - 1))) { pr_err_ratelimited("User address not aligned, 0x%llx, size order %llu\n", - inv_info->addr_info.addr, size); + inv_info->granu.addr_info.addr, size); } /* @@ -5477,9 +5477,9 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, * We use npages = -1 to indicate that. */ qi_flush_piotlb(iommu, did, pasid, - mm_to_dma_pfn(inv_info->addr_info.addr), + mm_to_dma_pfn(inv_info->granu.addr_info.addr), (granu == QI_GRAN_NONG_PASID) ? -1 : 1 << size, - inv_info->addr_info.flags & IOMMU_INV_ADDR_FLAGS_LEAF); + inv_info->granu.addr_info.flags & IOMMU_INV_ADDR_FLAGS_LEAF); if (!info->ats_enabled) break; @@ -5502,7 +5502,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, size = 64 - VTD_PAGE_SHIFT; addr = 0; } else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR) { - addr = inv_info->addr_info.addr; + addr = inv_info->granu.addr_info.addr; } if (info->ats_enabled) diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 95c3164a2302..99353d6468fa 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -370,7 +370,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev, spin_lock(&iommu->lock); ret = intel_pasid_setup_nested(iommu, dev, (pgd_t *)(uintptr_t)data->gpgd, - data->hpasid,
[PATCH v9 5/7] iommu/uapi: Rename uapi functions
User APIs such as iommu_sva_unbind_gpasid() may also be used by the kernel. Since we introduced user pointer to the UAPI functions, in-kernel callers cannot share the same APIs. In-kernel callers are also trusted, there is no need to validate the data. We plan to have two flavors of the same API functions, one called through ioctls, carrying a user pointer and one called directly with valid IOMMU UAPI structs. To differentiate both, let's rename existing functions with an iommu_uapi_ prefix. Suggested-by: Alex Williamson Reviewed-by: Eric Auger Signed-off-by: Jacob Pan --- drivers/iommu/iommu.c | 18 +- include/linux/iommu.h | 31 --- 2 files changed, 25 insertions(+), 24 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 609bd25bf154..4ae02291ccc2 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1961,35 +1961,35 @@ int iommu_attach_device(struct iommu_domain *domain, struct device *dev) } EXPORT_SYMBOL_GPL(iommu_attach_device); -int iommu_cache_invalidate(struct iommu_domain *domain, struct device *dev, - struct iommu_cache_invalidate_info *inv_info) +int iommu_uapi_cache_invalidate(struct iommu_domain *domain, struct device *dev, + struct iommu_cache_invalidate_info *inv_info) { if (unlikely(!domain->ops->cache_invalidate)) return -ENODEV; return domain->ops->cache_invalidate(domain, dev, inv_info); } -EXPORT_SYMBOL_GPL(iommu_cache_invalidate); +EXPORT_SYMBOL_GPL(iommu_uapi_cache_invalidate); -int iommu_sva_bind_gpasid(struct iommu_domain *domain, - struct device *dev, struct iommu_gpasid_bind_data *data) +int iommu_uapi_sva_bind_gpasid(struct iommu_domain *domain, + struct device *dev, struct iommu_gpasid_bind_data *data) { if (unlikely(!domain->ops->sva_bind_gpasid)) return -ENODEV; return domain->ops->sva_bind_gpasid(domain, dev, data); } -EXPORT_SYMBOL_GPL(iommu_sva_bind_gpasid); +EXPORT_SYMBOL_GPL(iommu_uapi_sva_bind_gpasid); -int iommu_sva_unbind_gpasid(struct iommu_domain *domain, struct device *dev, -ioasid_t pasid) +int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain, struct device *dev, +ioasid_t pasid) { if (unlikely(!domain->ops->sva_unbind_gpasid)) return -ENODEV; return domain->ops->sva_unbind_gpasid(dev, pasid); } -EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid); +EXPORT_SYMBOL_GPL(iommu_uapi_sva_unbind_gpasid); static void __iommu_detach_device(struct iommu_domain *domain, struct device *dev) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index fee209efb756..710d5d2691eb 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -424,13 +424,13 @@ extern int iommu_attach_device(struct iommu_domain *domain, struct device *dev); extern void iommu_detach_device(struct iommu_domain *domain, struct device *dev); -extern int iommu_cache_invalidate(struct iommu_domain *domain, - struct device *dev, - struct iommu_cache_invalidate_info *inv_info); -extern int iommu_sva_bind_gpasid(struct iommu_domain *domain, - struct device *dev, struct iommu_gpasid_bind_data *data); -extern int iommu_sva_unbind_gpasid(struct iommu_domain *domain, - struct device *dev, ioasid_t pasid); +extern int iommu_uapi_cache_invalidate(struct iommu_domain *domain, + struct device *dev, + struct iommu_cache_invalidate_info *inv_info); +extern int iommu_uapi_sva_bind_gpasid(struct iommu_domain *domain, + struct device *dev, struct iommu_gpasid_bind_data *data); +extern int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain, + struct device *dev, ioasid_t pasid); extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev); extern struct iommu_domain *iommu_get_dma_domain(struct device *dev); extern int iommu_map(struct iommu_domain *domain, unsigned long iova, @@ -1032,21 +1032,22 @@ static inline int iommu_sva_get_pasid(struct iommu_sva *handle) return IOMMU_PASID_INVALID; } -static inline int -iommu_cache_invalidate(struct iommu_domain *domain, - struct device *dev, - struct iommu_cache_invalidate_info *inv_info) +static inline int iommu_uapi_cache_invalidate(struct iommu_domain *domain, + struct device *dev, + struct iommu_cache_invalidate_info *inv_info) { return -ENODEV; } -static inline int iom
[PATCH v9 3/7] iommu/uapi: Introduce enum type for PASID data format
There can be multiple vendor-specific PASID data formats used in UAPI structures. This patch adds enum type with a last entry which makes range checking much easier. Suggested-by: Alex Williamson Reviewed-by: Eric Auger Signed-off-by: Jacob Pan --- include/uapi/linux/iommu.h | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h index b42acc8fe007..7cc6ee6c41f7 100644 --- a/include/uapi/linux/iommu.h +++ b/include/uapi/linux/iommu.h @@ -298,11 +298,16 @@ struct iommu_gpasid_bind_data_vtd { IOMMU_SVA_VTD_GPASID_PCD | \ IOMMU_SVA_VTD_GPASID_PWT) +enum iommu_pasid_data_format { + IOMMU_PASID_FORMAT_INTEL_VTD = 1, + IOMMU_PASID_FORMAT_LAST, +}; + /** * struct iommu_gpasid_bind_data - Information about device and guest PASID binding * @argsz: User filled size of this data * @version: Version of this data structure - * @format:PASID table entry format + * @format:PASID table entry format of enum iommu_pasid_data_format type * @flags: Additional information on guest bind request * @gpgd: Guest page directory base of the guest mm to bind * @hpasid:Process address space ID used for the guest mm in host IOMMU @@ -321,7 +326,6 @@ struct iommu_gpasid_bind_data { __u32 argsz; #define IOMMU_GPASID_BIND_VERSION_11 __u32 version; -#define IOMMU_PASID_FORMAT_INTEL_VTD 1 __u32 format; __u32 addr_width; #define IOMMU_SVA_GPASID_VAL (1 << 0) /* guest PASID valid */ -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 6/7] iommu/uapi: Handle data and argsz filled by users
IOMMU user APIs are responsible for processing user data. This patch changes the interface such that user pointers can be passed into IOMMU code directly. Separate kernel APIs without user pointers are introduced for in-kernel users of the UAPI functionality. IOMMU UAPI data has a user filled argsz field which indicates the data length of the structure. User data is not trusted, argsz must be validated based on the current kernel data size, mandatory data size, and feature flags. User data may also be extended, resulting in possible argsz increase. Backward compatibility is ensured based on size and flags (or the functional equivalent fields) checking. This patch adds sanity checks in the IOMMU layer. In addition to argsz, reserved/unused fields in padding, flags, and version are also checked. Details are documented in Documentation/userspace-api/iommu.rst Signed-off-by: Liu Yi L Signed-off-by: Jacob Pan --- drivers/iommu/iommu.c | 199 -- include/linux/iommu.h | 28 --- 2 files changed, 211 insertions(+), 16 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 4ae02291ccc2..5c1b7ae48aae 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1961,34 +1961,219 @@ int iommu_attach_device(struct iommu_domain *domain, struct device *dev) } EXPORT_SYMBOL_GPL(iommu_attach_device); +/* + * Check flags and other user provided data for valid combinations. We also + * make sure no reserved fields or unused flags are set. This is to ensure + * not breaking userspace in the future when these fields or flags are used. + */ +static int iommu_check_cache_invl_data(struct iommu_cache_invalidate_info *info) +{ + u32 mask; + int i; + + if (info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1) + return -EINVAL; + + mask = (1 << IOMMU_CACHE_INV_TYPE_NR) - 1; + if (info->cache & ~mask) + return -EINVAL; + + if (info->granularity >= IOMMU_INV_GRANU_NR) + return -EINVAL; + + switch (info->granularity) { + case IOMMU_INV_GRANU_ADDR: + if (info->cache & IOMMU_CACHE_INV_TYPE_PASID) + return -EINVAL; + + mask = IOMMU_INV_ADDR_FLAGS_PASID | + IOMMU_INV_ADDR_FLAGS_ARCHID | + IOMMU_INV_ADDR_FLAGS_LEAF; + + if (info->granu.addr_info.flags & ~mask) + return -EINVAL; + break; + case IOMMU_INV_GRANU_PASID: + mask = IOMMU_INV_PASID_FLAGS_PASID | + IOMMU_INV_PASID_FLAGS_ARCHID; + if (info->granu.pasid_info.flags & ~mask) + return -EINVAL; + + break; + case IOMMU_INV_GRANU_DOMAIN: + if (info->cache & IOMMU_CACHE_INV_TYPE_DEV_IOTLB) + return -EINVAL; + break; + default: + return -EINVAL; + } + + /* Check reserved padding fields */ + for (i = 0; i < sizeof(info->padding); i++) { + if (info->padding[i]) + return -EINVAL; + } + + return 0; +} + int iommu_uapi_cache_invalidate(struct iommu_domain *domain, struct device *dev, - struct iommu_cache_invalidate_info *inv_info) + void __user *uinfo) { + struct iommu_cache_invalidate_info inv_info = { 0 }; + u32 minsz; + int ret = 0; + if (unlikely(!domain->ops->cache_invalidate)) return -ENODEV; - return domain->ops->cache_invalidate(domain, dev, inv_info); + /* +* No new spaces can be added before the variable sized union, the +* minimum size is the offset to the union. +*/ + minsz = offsetof(struct iommu_cache_invalidate_info, granu); + + /* Copy minsz from user to get flags and argsz */ + if (copy_from_user(&inv_info, uinfo, minsz)) + return -EFAULT; + + /* Fields before variable size union is mandatory */ + if (inv_info.argsz < minsz) + return -EINVAL; + + /* PASID and address granu require additional info beyond minsz */ + if (inv_info.argsz == minsz && + ((inv_info.granularity == IOMMU_INV_GRANU_PASID) || + (inv_info.granularity == IOMMU_INV_GRANU_ADDR))) + return -EINVAL; + + if (inv_info.granularity == IOMMU_INV_GRANU_PASID && + inv_info.argsz < offsetofend(struct iommu_cache_invalidate_info, granu.pasid_info)) + return -EINVAL; + + if (inv_info.granularity == IOMMU_INV_GRANU_ADDR && + inv_info.argsz < offsetofend(struct iommu_cache_invalidate_info, granu.addr_info)) + return -EINVAL; + + /* +* User might be using a newer UAPI header which has a larger data +* size, we shall support the existing flag
[PATCH v9 0/7] IOMMU user API enhancement
IOMMU user API header was introduced to support nested DMA translation and related fault handling. The current UAPI data structures consist of three areas that cover the interactions between host kernel and guest: - fault handling - cache invalidation - bind guest page tables, i.e. guest PASID Future extensions are likely to support more architectures and vIOMMU features. In the previous discussion, using user-filled data size and feature flags is made a preferred approach over a unified version number. https://lkml.org/lkml/2020/1/29/45 In addition to introduce argsz field to data structures, this patchset is also trying to document the UAPI design, usage, and extension rules. VT-d driver changes to utilize the new argsz field is included, VFIO usage is to follow. This set is available at: https://github.com/jacobpan/linux.git vsva_v5.9_uapi_v9 Thanks, Jacob Changelog: v9 - Directly pass PASID value to iommu_sva_unbind_gpasid() without the superfluous data in struct iommu_gpasid_bind_data. v8 - Rebased to v5.9-rc2 - Addressed review comments from Eric Auger 1. added a check for the unused vendor flags 2. commit message improvements v7 - Added PASID data format enum for range checking - Tidy up based on reviews from Alex W. - Removed doc section for vIOMMU fault handling v6 - Renamed all UAPI functions with iommu_uapi_ prefix - Replaced argsz maxsz checking with flag specific size checks - Documentation improvements based on suggestions by Eric Auger Replaced example code with a pointer to the actual code - Added more checks for illegal flags combinations - Added doc file to MAINTAINERS v5 - Addjusted paddings in UAPI data to be 8 byte aligned - Do not clobber argsz in IOMMU core before passing on to vendor driver - Removed pr_warn_ for invalid UAPI data check, just return -EINVAL - Clarified VFIO responsibility in UAPI data handling - Use iommu_uapi prefix to differentiate APIs has in-kernel caller - Added comment for unchecked flags of invalidation granularity - Added example in doc to show vendor data checking v4 - Added checks of UAPI data for reserved fields, version, and flags. - Removed version check from vendor driver (vt-d) - Relaxed argsz check to match the UAPI struct size instead of variable union size - Updated documentation v3: - Rewrote backward compatibility rule to support existing code re-compiled with newer kernel UAPI header that runs on older kernel. Based on review comment from Alex W. https://lore.kernel.org/linux-iommu/20200611094741.6d118...@w520.home/ - Take user pointer directly in UAPI functions. Perform argsz check and copy_from_user() in IOMMU driver. Eliminate the need for VFIO or other upper layer to parse IOMMU data. - Create wrapper function for in-kernel users of UAPI functions v2: - Removed unified API version and helper - Introduced argsz for each UAPI data - Introduced UAPI doc Jacob Pan (7): docs: IOMMU user API iommu/uapi: Add argsz for user filled data iommu/uapi: Introduce enum type for PASID data format iommu/uapi: Use named union for user data iommu/uapi: Rename uapi functions iommu/uapi: Handle data and argsz filled by users iommu/vt-d: Check UAPI data processed by IOMMU core Documentation/userspace-api/iommu.rst | 211 ++ MAINTAINERS | 1 + drivers/iommu/intel/iommu.c | 25 ++-- drivers/iommu/intel/svm.c | 13 ++- drivers/iommu/iommu.c | 201 ++-- include/linux/iommu.h | 35 -- include/uapi/linux/iommu.h| 25 ++-- 7 files changed, 468 insertions(+), 43 deletions(-) create mode 100644 Documentation/userspace-api/iommu.rst -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 7/7] iommu/vt-d: Check UAPI data processed by IOMMU core
IOMMU generic layer already does sanity checks on UAPI data for version match and argsz range based on generic information. This patch adjusts the following data checking responsibilities: - removes the redundant version check from VT-d driver - removes the check for vendor specific data size - adds check for the use of reserved/undefined flags Signed-off-by: Jacob Pan --- drivers/iommu/intel/iommu.c | 3 +-- drivers/iommu/intel/svm.c | 11 +-- include/uapi/linux/iommu.h | 1 + 3 files changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 461f3a6864d4..18ed3b3c70d7 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -5408,8 +5408,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, int ret = 0; u64 size = 0; - if (!inv_info || !dmar_domain || - inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1) + if (!inv_info || !dmar_domain) return -EINVAL; if (!dev || !dev_is_pci(dev)) diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 99353d6468fa..0cb9a15f1112 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -284,8 +284,15 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev, if (WARN_ON(!iommu) || !data) return -EINVAL; - if (data->version != IOMMU_GPASID_BIND_VERSION_1 || - data->format != IOMMU_PASID_FORMAT_INTEL_VTD) + if (data->format != IOMMU_PASID_FORMAT_INTEL_VTD) + return -EINVAL; + + /* IOMMU core ensures argsz is more than the start of the union */ + if (data->argsz < offsetofend(struct iommu_gpasid_bind_data, vendor.vtd)) + return -EINVAL; + + /* Make sure no undefined flags are used in vendor data */ + if (data->vendor.vtd.flags & ~(IOMMU_SVA_VTD_GPASID_LAST - 1)) return -EINVAL; if (!dev_is_pci(dev)) diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h index c64bca5af419..1ebc23df4fbc 100644 --- a/include/uapi/linux/iommu.h +++ b/include/uapi/linux/iommu.h @@ -288,6 +288,7 @@ struct iommu_gpasid_bind_data_vtd { #define IOMMU_SVA_VTD_GPASID_PWT (1 << 3) /* page-level write through */ #define IOMMU_SVA_VTD_GPASID_EMTE (1 << 4) /* extended mem type enable */ #define IOMMU_SVA_VTD_GPASID_CD(1 << 5) /* PASID-level cache disable */ +#define IOMMU_SVA_VTD_GPASID_LAST (1 << 6) __u64 flags; __u32 pat; __u32 emt; -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 2/7] iommu/uapi: Add argsz for user filled data
As IOMMU UAPI gets extended, user data size may increase. To support backward compatibiliy, this patch introduces a size field to each UAPI data structures. It is *always* the responsibility for the user to fill in the correct size. Padding fields are adjusted to ensure 8 byte alignment. Specific scenarios for user data handling are documented in: Documentation/userspace-api/iommu.rst As there is no current users of the API, struct version is not incremented. Reviewed-by: Eric Auger Signed-off-by: Liu Yi L Signed-off-by: Jacob Pan --- include/uapi/linux/iommu.h | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h index c2b2caf9ed41..b42acc8fe007 100644 --- a/include/uapi/linux/iommu.h +++ b/include/uapi/linux/iommu.h @@ -139,6 +139,7 @@ enum iommu_page_response_code { /** * struct iommu_page_response - Generic page response information + * @argsz: User filled size of this data * @version: API version of this structure * @flags: encodes whether the corresponding fields are valid * (IOMMU_FAULT_PAGE_RESPONSE_* values) @@ -147,6 +148,7 @@ enum iommu_page_response_code { * @code: response code from &enum iommu_page_response_code */ struct iommu_page_response { + __u32 argsz; #define IOMMU_PAGE_RESP_VERSION_1 1 __u32 version; #define IOMMU_PAGE_RESP_PASID_VALID(1 << 0) @@ -222,6 +224,7 @@ struct iommu_inv_pasid_info { /** * struct iommu_cache_invalidate_info - First level/stage invalidation * information + * @argsz: User filled size of this data * @version: API version of this structure * @cache: bitfield that allows to select which caches to invalidate * @granularity: defines the lowest granularity used for the invalidation: @@ -250,6 +253,7 @@ struct iommu_inv_pasid_info { * must support the used granularity. */ struct iommu_cache_invalidate_info { + __u32 argsz; #define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1 __u32 version; /* IOMMU paging structure cache */ @@ -259,7 +263,7 @@ struct iommu_cache_invalidate_info { #define IOMMU_CACHE_INV_TYPE_NR(3) __u8cache; __u8granularity; - __u8padding[2]; + __u8padding[6]; union { struct iommu_inv_pasid_info pasid_info; struct iommu_inv_addr_info addr_info; @@ -296,6 +300,7 @@ struct iommu_gpasid_bind_data_vtd { /** * struct iommu_gpasid_bind_data - Information about device and guest PASID binding + * @argsz: User filled size of this data * @version: Version of this data structure * @format:PASID table entry format * @flags: Additional information on guest bind request @@ -313,17 +318,18 @@ struct iommu_gpasid_bind_data_vtd { * PASID to host PASID based on this bind data. */ struct iommu_gpasid_bind_data { + __u32 argsz; #define IOMMU_GPASID_BIND_VERSION_11 __u32 version; #define IOMMU_PASID_FORMAT_INTEL_VTD 1 __u32 format; + __u32 addr_width; #define IOMMU_SVA_GPASID_VAL (1 << 0) /* guest PASID valid */ __u64 flags; __u64 gpgd; __u64 hpasid; __u64 gpasid; - __u32 addr_width; - __u8 padding[12]; + __u8 padding[8]; /* Vendor specific data */ union { struct iommu_gpasid_bind_data_vtd vtd; -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 10/16] vfio/type1: Support binding guest page tables to PASID
On Thu, 10 Sep 2020 03:45:27 -0700 Liu Yi L wrote: > Nesting translation allows two-levels/stages page tables, with 1st level > for guest translations (e.g. GVA->GPA), 2nd level for host translations > (e.g. GPA->HPA). This patch adds interface for binding guest page tables > to a PASID. This PASID must have been allocated by the userspace before > the binding request. > > Cc: Kevin Tian > CC: Jacob Pan > Cc: Alex Williamson > Cc: Eric Auger > Cc: Jean-Philippe Brucker > Cc: Joerg Roedel > Cc: Lu Baolu > Signed-off-by: Jean-Philippe Brucker > Signed-off-by: Liu Yi L > Signed-off-by: Jacob Pan > --- > v6 -> v7: > *) introduced @user in struct domain_capsule to simplify the code per Eric's >suggestion. > *) introduced VFIO_IOMMU_NESTING_OP_NUM for sanitizing op from userspace. > *) corrected the @argsz value of unbind_data in vfio_group_unbind_gpasid_fn(). > > v5 -> v6: > *) dropped vfio_find_nesting_group() and add > vfio_get_nesting_domain_capsule(). >per comment from Eric. > *) use iommu_uapi_sva_bind/unbind_gpasid() and iommu_sva_unbind_gpasid() in >linux/iommu.h for userspace operation and in-kernel operation. > > v3 -> v4: > *) address comments from Alex on v3 > > v2 -> v3: > *) use __iommu_sva_unbind_gpasid() for unbind call issued by VFIO > > https://lore.kernel.org/linux-iommu/1592931837-58223-6-git-send-email-jacob.jun@linux.intel.com/ > > v1 -> v2: > *) rename subject from "vfio/type1: Bind guest page tables to host" > *) remove VFIO_IOMMU_BIND, introduce VFIO_IOMMU_NESTING_OP to support bind/ >unbind guet page table > *) replaced vfio_iommu_for_each_dev() with a group level loop since this >series enforces one group per container w/ nesting type as start. > *) rename vfio_bind/unbind_gpasid_fn() to vfio_dev_bind/unbind_gpasid_fn() > *) vfio_dev_unbind_gpasid() always successful > *) use vfio_mm->pasid_lock to avoid race between PASID free and page table >bind/unbind > --- > drivers/vfio/vfio_iommu_type1.c | 163 > > drivers/vfio/vfio_pasid.c | 26 +++ > include/linux/vfio.h| 20 + > include/uapi/linux/vfio.h | 36 + > 4 files changed, 245 insertions(+) > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > index bd4b668..11f1156 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -149,6 +149,39 @@ struct vfio_regions { > #define DIRTY_BITMAP_PAGES_MAX((u64)INT_MAX) > #define DIRTY_BITMAP_SIZE_MAX > DIRTY_BITMAP_BYTES(DIRTY_BITMAP_PAGES_MAX) > > +struct domain_capsule { > + struct vfio_group *group; > + struct iommu_domain *domain; > + /* set if @data contains a user pointer*/ > + booluser; > + void*data; > +}; Put the hole in the structure at the end, but I suspect we might lose the user field when the internal api drops the unnecessary structure for unbind anyway. > + > +/* iommu->lock must be held */ > +static int vfio_prepare_nesting_domain_capsule(struct vfio_iommu *iommu, > +struct domain_capsule *dc) > +{ > + struct vfio_domain *domain = NULL; > + struct vfio_group *group = NULL; Unnecessary initialization. > + > + if (!iommu->nesting_info) > + return -EINVAL; > + > + /* > + * Only support singleton container with nesting type. If > + * nesting_info is non-NULL, the container is non-empty. > + * Also domain is non-empty. > + */ > + domain = list_first_entry(&iommu->domain_list, > + struct vfio_domain, next); > + group = list_first_entry(&domain->group_list, > + struct vfio_group, next); > + dc->group = group; > + dc->domain = domain->domain; > + dc->user = true; > + return 0; > +} > + > static int put_pfn(unsigned long pfn, int prot); > > static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu > *iommu, > @@ -2405,6 +2438,49 @@ static int vfio_iommu_resv_refresh(struct vfio_iommu > *iommu, > return ret; > } > > +static int vfio_dev_bind_gpasid_fn(struct device *dev, void *data) > +{ > + struct domain_capsule *dc = (struct domain_capsule *)data; > + unsigned long arg = *(unsigned long *)dc->data; > + > + return iommu_uapi_sva_bind_gpasid(dc->domain, dev, > + (void __user *)arg); > +} > + > +static int vfio_dev_unbind_gpasid_fn(struct device *dev, void *data) > +{ > + struct domain_capsule *dc = (struct domain_capsule *)data; > + > + if (dc->user) { > + unsigned long arg = *(unsigned long *)dc->data; > + > + iommu_uapi_sva_unbind_gpasid(dc->domain, > + dev, (void __user *)arg); > + } else { > + struct iommu_gpasid_bind_data *unbind_data = > +
Re: [PATCH v7 13/16] vfio/pci: Expose PCIe PASID capability to guest
On Thu, 10 Sep 2020 03:45:30 -0700 Liu Yi L wrote: > This patch exposes PCIe PASID capability to guest for assigned devices. > Existing vfio_pci driver hides it from guest by setting the capability > length as 0 in pci_ext_cap_length[]. This exposes the PASID capability, but it's still read-only, so this largely just helps userspace know where to emulate the capability, right? Thanks, Alex > And this patch only exposes PASID capability for devices which has PCIe > PASID extended struture in its configuration space. VFs will not expose > the PASID capability as they do not implement the PASID extended structure > in their config space. It is a TODO in future. Related discussion can be > found in below link: > > https://lore.kernel.org/kvm/20200407095801.648b1...@w520.home/ > > Cc: Kevin Tian > CC: Jacob Pan > Cc: Alex Williamson > Cc: Eric Auger > Cc: Jean-Philippe Brucker > Cc: Joerg Roedel > Cc: Lu Baolu > Signed-off-by: Liu Yi L > Reviewed-by: Eric Auger > --- > v5 -> v6: > *) add review-by from Eric Auger. > > v1 -> v2: > *) added in v2, but it was sent in a separate patchseries before > --- > drivers/vfio/pci/vfio_pci_config.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/vfio/pci/vfio_pci_config.c > b/drivers/vfio/pci/vfio_pci_config.c > index d98843f..07ff2e6 100644 > --- a/drivers/vfio/pci/vfio_pci_config.c > +++ b/drivers/vfio/pci/vfio_pci_config.c > @@ -95,7 +95,7 @@ static const u16 pci_ext_cap_length[PCI_EXT_CAP_ID_MAX + 1] > = { > [PCI_EXT_CAP_ID_LTR]= PCI_EXT_CAP_LTR_SIZEOF, > [PCI_EXT_CAP_ID_SECPCI] = 0, /* not yet */ > [PCI_EXT_CAP_ID_PMUX] = 0, /* not yet */ > - [PCI_EXT_CAP_ID_PASID] = 0, /* not yet */ > + [PCI_EXT_CAP_ID_PASID] = PCI_EXT_CAP_PASID_SIZEOF, > }; > > /* ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH V2 2/5] iommu: Add iommu_dma_free_cpu_cached_iovas function
On 2020/9/9 15:05, Christoph Hellwig wrote: +static inline void iommu_dma_free_cpu_cached_iovas(unsigned int cpu, + struct iommu_domain *domain) This adds a crazy long line. Which is rather pointless as other bits of code in the patch use the more compact two tab indentations for the prototype continuation lines anyway. Okay. I will use two tabs instead. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH V2 5/5] DO NOT MERGE: iommu: disable list appending in dma-iommu
On 2020/9/9 15:06, Christoph Hellwig wrote: On Wed, Sep 09, 2020 at 09:43:09AM +0800, Lu Baolu wrote: + /* +* The Intel graphic device driver is used to assume that the returned +* sg list is not combound. This blocks the efforts of converting the This adds pointless overly long lines. +* Intel IOMMU driver to dma-iommu api's. Add this quirk to make the +* device driver work and should be removed once it's fixed in i915 +* driver. +*/ + if (dev_is_pci(dev) && + to_pci_dev(dev)->vendor == PCI_VENDOR_ID_INTEL && + (to_pci_dev(dev)->class >> 16) == PCI_BASE_CLASS_DISPLAY) { + for_each_sg(sg, s, nents, i) { + unsigned int s_iova_off = sg_dma_address(s); + unsigned int s_length = sg_dma_len(s); + unsigned int s_iova_len = s->length; + + s->offset += s_iova_off; + s->length = s_length; + sg_dma_address(s) = dma_addr + s_iova_off; + sg_dma_len(s) = s_length; + dma_addr += s_iova_len; + } + + return nents; + } This wants an IS_ENABLED() check. And probably a pr_once reminding of the workaround. Will fix in the next version. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 5/6] iommu/vt-d: Convert intel iommu driver to the iommu ops
From: Tom Murphy Convert the intel iommu driver to the dma-iommu api. Remove the iova handling and reserve region code from the intel iommu driver. Signed-off-by: Tom Murphy Signed-off-by: Lu Baolu --- drivers/iommu/intel/Kconfig | 1 + drivers/iommu/intel/iommu.c | 742 ++-- 2 files changed, 43 insertions(+), 700 deletions(-) diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig index 5337ee1584b0..28a3d1596c76 100644 --- a/drivers/iommu/intel/Kconfig +++ b/drivers/iommu/intel/Kconfig @@ -13,6 +13,7 @@ config INTEL_IOMMU select DMAR_TABLE select SWIOTLB select IOASID + select IOMMU_DMA help DMA remapping (DMAR) devices support enables independent address translations for Direct Memory Access (DMA) from devices. diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 63ee30c689a7..adc231790e0a 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -31,6 +31,7 @@ #include #include #include +#include #include #include #include @@ -41,7 +42,6 @@ #include #include #include -#include #include #include #include @@ -382,9 +382,6 @@ struct device_domain_info *get_domain_info(struct device *dev) DEFINE_SPINLOCK(device_domain_lock); static LIST_HEAD(device_domain_list); -#define device_needs_bounce(d) (!intel_no_bounce && dev_is_pci(d) && \ - to_pci_dev(d)->untrusted) - /* * Iterate over elements in device_domain_list and call the specified * callback @fn against each element. @@ -1242,13 +1239,6 @@ static void dma_free_pagelist(struct page *freelist) } } -static void iova_entry_free(unsigned long data) -{ - struct page *freelist = (struct page *)data; - - dma_free_pagelist(freelist); -} - /* iommu handling */ static int iommu_alloc_root_entry(struct intel_iommu *iommu) { @@ -1613,19 +1603,17 @@ static inline void __mapping_notify_one(struct intel_iommu *iommu, iommu_flush_write_buffer(iommu); } -static void iommu_flush_iova(struct iova_domain *iovad) +static void intel_flush_iotlb_all(struct iommu_domain *domain) { - struct dmar_domain *domain; + struct dmar_domain *dmar_domain = to_dmar_domain(domain); int idx; - domain = container_of(iovad, struct dmar_domain, iovad); - - for_each_domain_iommu(idx, domain) { + for_each_domain_iommu(idx, dmar_domain) { struct intel_iommu *iommu = g_iommus[idx]; - u16 did = domain->iommu_did[iommu->seq_id]; + u16 did = dmar_domain->iommu_did[iommu->seq_id]; - if (domain_use_first_level(domain)) - domain_flush_piotlb(iommu, domain, 0, -1, 0); + if (domain_use_first_level(dmar_domain)) + domain_flush_piotlb(iommu, dmar_domain, 0, -1, 0); else iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH); @@ -1907,48 +1895,6 @@ static int domain_detach_iommu(struct dmar_domain *domain, return count; } -static struct iova_domain reserved_iova_list; -static struct lock_class_key reserved_rbtree_key; - -static int dmar_init_reserved_ranges(void) -{ - struct pci_dev *pdev = NULL; - struct iova *iova; - int i; - - init_iova_domain(&reserved_iova_list, VTD_PAGE_SIZE, IOVA_START_PFN); - - lockdep_set_class(&reserved_iova_list.iova_rbtree_lock, - &reserved_rbtree_key); - - /* IOAPIC ranges shouldn't be accessed by DMA */ - iova = reserve_iova(&reserved_iova_list, IOVA_PFN(IOAPIC_RANGE_START), - IOVA_PFN(IOAPIC_RANGE_END)); - if (!iova) { - pr_err("Reserve IOAPIC range failed\n"); - return -ENODEV; - } - - /* Reserve all PCI MMIO to avoid peer-to-peer access */ - for_each_pci_dev(pdev) { - struct resource *r; - - for (i = 0; i < PCI_NUM_RESOURCES; i++) { - r = &pdev->resource[i]; - if (!r->flags || !(r->flags & IORESOURCE_MEM)) - continue; - iova = reserve_iova(&reserved_iova_list, - IOVA_PFN(r->start), - IOVA_PFN(r->end)); - if (!iova) { - pci_err(pdev, "Reserve iova for %pR failed\n", r); - return -ENODEV; - } - } - } - return 0; -} - static inline int guestwidth_to_adjustwidth(int gaw) { int agaw; @@ -1971,7 +1917,7 @@ static void domain_exit(struct dmar_domain *domain) /* destroy iovas */ if (domain->domain.type == IOMMU_DOMAIN_DMA) - put_iova_domain(&domain->iovad); +
[PATCH v3 2/6] iommu: Add iommu_dma_free_cpu_cached_iovas()
From: Tom Murphy Add a iommu_dma_free_cpu_cached_iovas function to allow drivers which use the dma-iommu ops to free cached cpu iovas. Signed-off-by: Tom Murphy Signed-off-by: Lu Baolu --- drivers/iommu/dma-iommu.c | 9 + include/linux/dma-iommu.h | 8 2 files changed, 17 insertions(+) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 82c071b2d5c8..d06411bd5e08 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -50,6 +50,15 @@ struct iommu_dma_cookie { struct iommu_domain *fq_domain; }; +void iommu_dma_free_cpu_cached_iovas(unsigned int cpu, + struct iommu_domain *domain) +{ + struct iommu_dma_cookie *cookie = domain->iova_cookie; + struct iova_domain *iovad = &cookie->iovad; + + free_cpu_cached_iovas(cpu, iovad); +} + static void iommu_dma_entry_dtor(unsigned long data) { struct page *freelist = (struct page *)data; diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h index 2112f21f73d8..706b68d1359b 100644 --- a/include/linux/dma-iommu.h +++ b/include/linux/dma-iommu.h @@ -37,6 +37,9 @@ void iommu_dma_compose_msi_msg(struct msi_desc *desc, void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list); +void iommu_dma_free_cpu_cached_iovas(unsigned int cpu, + struct iommu_domain *domain); + #else /* CONFIG_IOMMU_DMA */ struct iommu_domain; @@ -78,5 +81,10 @@ static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_he { } +static inline void iommu_dma_free_cpu_cached_iovas(unsigned int cpu, + struct iommu_domain *domain) +{ +} + #endif /* CONFIG_IOMMU_DMA */ #endif /* __DMA_IOMMU_H */ -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 6/6] iommu/vt-d: Cleanup after converting to dma-iommu ops
Some cleanups after converting the driver to use dma-iommu ops. - Remove nobounce option; - Cleanup and simplify the path in domain mapping. Signed-off-by: Lu Baolu --- .../admin-guide/kernel-parameters.txt | 5 -- drivers/iommu/intel/iommu.c | 90 ++- 2 files changed, 28 insertions(+), 67 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index a1068742a6df..0d11ef43d314 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1866,11 +1866,6 @@ Note that using this option lowers the security provided by tboot because it makes the system vulnerable to DMA attacks. - nobounce [Default off] - Disable bounce buffer for untrusted devices such as - the Thunderbolt devices. This will treat the untrusted - devices as the trusted ones, hence might expose security - risks of DMA attacks. intel_idle.max_cstate= [KNL,HW,ACPI,X86] 0 disables intel_idle and fall back on acpi_idle. diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index adc231790e0a..fe2544c95013 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -355,7 +355,6 @@ static int dmar_forcedac; static int intel_iommu_strict; static int intel_iommu_superpage = 1; static int iommu_identity_mapping; -static int intel_no_bounce; static int iommu_skip_te_disable; #define IDENTMAP_GFX 2 @@ -457,9 +456,6 @@ static int __init intel_iommu_setup(char *str) } else if (!strncmp(str, "tboot_noforce", 13)) { pr_info("Intel-IOMMU: not forcing on after tboot. This could expose security risk for tboot\n"); intel_iommu_tboot_noforce = 1; - } else if (!strncmp(str, "nobounce", 8)) { - pr_info("Intel-IOMMU: No bounce buffer. This could expose security risks of DMA attacks\n"); - intel_no_bounce = 1; } str += strcspn(str, ","); @@ -2230,15 +2226,14 @@ static inline int hardware_largepage_caps(struct dmar_domain *domain, return level; } -static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn, - struct scatterlist *sg, unsigned long phys_pfn, - unsigned long nr_pages, int prot) +static int +__domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn, +unsigned long phys_pfn, unsigned long nr_pages, int prot) { struct dma_pte *first_pte = NULL, *pte = NULL; - phys_addr_t pteval; - unsigned long sg_res = 0; unsigned int largepage_lvl = 0; unsigned long lvl_pages = 0; + phys_addr_t pteval; u64 attr; BUG_ON(!domain_pfn_supported(domain, iov_pfn + nr_pages - 1)); @@ -2250,26 +2245,14 @@ static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn, if (domain_use_first_level(domain)) attr |= DMA_FL_PTE_PRESENT | DMA_FL_PTE_XD | DMA_FL_PTE_US; - if (!sg) { - sg_res = nr_pages; - pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | attr; - } + pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | attr; while (nr_pages > 0) { uint64_t tmp; - if (!sg_res) { - unsigned int pgoff = sg->offset & ~PAGE_MASK; - - sg_res = aligned_nrpages(sg->offset, sg->length); - sg->dma_address = ((dma_addr_t)iov_pfn << VTD_PAGE_SHIFT) + pgoff; - sg->dma_length = sg->length; - pteval = (sg_phys(sg) - pgoff) | attr; - phys_pfn = pteval >> VTD_PAGE_SHIFT; - } - if (!pte) { - largepage_lvl = hardware_largepage_caps(domain, iov_pfn, phys_pfn, sg_res); + largepage_lvl = hardware_largepage_caps(domain, iov_pfn, + phys_pfn, nr_pages); first_pte = pte = pfn_to_dma_pte(domain, iov_pfn, &largepage_lvl); if (!pte) @@ -2281,7 +2264,7 @@ static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn, pteval |= DMA_PTE_LARGE_PAGE; lvl_pages = lvl_to_nr_pages(largepage_lvl); - nr_superpages = sg_res / lvl_pages; + nr_superpages = nr_pages / lvl_pages; end_pfn = iov_pfn + nr_superpages * lvl_pages - 1; /* @@ -2315,48 +2298,4
[PATCH v3 3/6] iommu: Allow the dma-iommu api to use bounce buffers
From: Tom Murphy Allow the dma-iommu api to use bounce buffers for untrusted devices. This is a copy of the intel bounce buffer code. Signed-off-by: Tom Murphy Co-developed-by: Lu Baolu Signed-off-by: Lu Baolu --- drivers/iommu/dma-iommu.c | 163 +++--- 1 file changed, 150 insertions(+), 13 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index d06411bd5e08..1a1da22e5a5e 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -21,9 +21,11 @@ #include #include #include +#include #include #include #include +#include struct iommu_dma_msi_page { struct list_headlist; @@ -498,6 +500,31 @@ static void __iommu_dma_unmap(struct device *dev, dma_addr_t dma_addr, iommu_dma_free_iova(cookie, dma_addr, size, iotlb_gather.freelist); } +static void __iommu_dma_unmap_swiotlb(struct device *dev, dma_addr_t dma_addr, + size_t size, enum dma_data_direction dir, + unsigned long attrs) +{ + struct iommu_domain *domain = iommu_get_dma_domain(dev); + struct iommu_dma_cookie *cookie = domain->iova_cookie; + struct iova_domain *iovad = &cookie->iovad; + phys_addr_t phys; + + phys = iommu_iova_to_phys(domain, dma_addr); + if (WARN_ON(!phys)) + return; + + __iommu_dma_unmap(dev, dma_addr, size); + + if (unlikely(is_swiotlb_buffer(phys))) + swiotlb_tbl_unmap_single(dev, phys, size, + iova_align(iovad, size), dir, attrs); +} + +static bool dev_is_untrusted(struct device *dev) +{ + return dev_is_pci(dev) && to_pci_dev(dev)->untrusted; +} + static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, size_t size, int prot, u64 dma_mask) { @@ -523,6 +550,55 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, return iova + iova_off; } +static dma_addr_t __iommu_dma_map_swiotlb(struct device *dev, phys_addr_t phys, + size_t org_size, dma_addr_t dma_mask, bool coherent, + enum dma_data_direction dir, unsigned long attrs) +{ + int prot = dma_info_to_prot(dir, coherent, attrs); + struct iommu_domain *domain = iommu_get_dma_domain(dev); + struct iommu_dma_cookie *cookie = domain->iova_cookie; + struct iova_domain *iovad = &cookie->iovad; + size_t aligned_size = org_size; + void *padding_start; + size_t padding_size; + dma_addr_t iova; + + /* +* If both the physical buffer start address and size are +* page aligned, we don't need to use a bounce page. +*/ + if (IS_ENABLED(CONFIG_SWIOTLB) && dev_is_untrusted(dev) && + iova_offset(iovad, phys | org_size)) { + aligned_size = iova_align(iovad, org_size); + phys = swiotlb_tbl_map_single(dev, + __phys_to_dma(dev, io_tlb_start), + phys, org_size, aligned_size, dir, attrs); + + if (phys == DMA_MAPPING_ERROR) + return DMA_MAPPING_ERROR; + + /* Cleanup the padding area. */ + padding_start = phys_to_virt(phys); + padding_size = aligned_size; + + if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) && + (dir == DMA_TO_DEVICE || +dir == DMA_BIDIRECTIONAL)) { + padding_start += org_size; + padding_size -= org_size; + } + + memset(padding_start, 0, padding_size); + } + + iova = __iommu_dma_map(dev, phys, aligned_size, prot, dma_mask); + if ((iova == DMA_MAPPING_ERROR) && is_swiotlb_buffer(phys)) + swiotlb_tbl_unmap_single(dev, phys, org_size, + aligned_size, dir, attrs); + + return iova; +} + static void __iommu_dma_free_pages(struct page **pages, int count) { while (count--) @@ -698,11 +774,15 @@ static void iommu_dma_sync_single_for_cpu(struct device *dev, { phys_addr_t phys; - if (dev_is_dma_coherent(dev)) + if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev)) return; phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle); - arch_sync_dma_for_cpu(phys, size, dir); + if (!dev_is_dma_coherent(dev)) + arch_sync_dma_for_cpu(phys, size, dir); + + if (is_swiotlb_buffer(phys)) + swiotlb_tbl_sync_single(dev, phys, size, dir, SYNC_FOR_CPU); } static void iommu_dma_sync_single_for_device(struct device *dev, @@ -710,11 +790,15 @@ static void iommu_dma_sync_single_for_device(struct device *dev, { phys_addr_t phys; - if (dev_is_dma_coherent(dev)) + if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev)) return; phys = iommu_iova_to_phys(iommu_get_dma_domain(dev
[PATCH v3 0/6] Convert the intel iommu driver to the dma-iommu api
Tom Murphy has almost done all the work. His latest patch series was posted here. https://lore.kernel.org/linux-iommu/20200903201839.7327-1-murph...@tcd.ie/ Thanks a lot! This series is a follow-up with below changes: 1. Add a quirk for the i915 driver issue described in Tom's cover letter. 2. Fix several bugs in patch "iommu: Allow the dma-iommu api to use bounce buffers" to make the bounce buffer work for untrusted devices. 3. Several cleanups in iommu/vt-d driver after the conversion. Please review and test. Best regards, baolu Lu Baolu (2): iommu: Add quirk for Intel graphic devices in map_sg iommu/vt-d: Cleanup after converting to dma-iommu ops Tom Murphy (4): iommu: Handle freelists when using deferred flushing in iommu drivers iommu: Add iommu_dma_free_cpu_cached_iovas() iommu: Allow the dma-iommu api to use bounce buffers iommu/vt-d: Convert intel iommu driver to the iommu ops .../admin-guide/kernel-parameters.txt | 5 - drivers/iommu/dma-iommu.c | 229 - drivers/iommu/intel/Kconfig | 1 + drivers/iommu/intel/iommu.c | 885 +++--- include/linux/dma-iommu.h | 8 + include/linux/iommu.h | 1 + 6 files changed, 323 insertions(+), 806 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 1/6] iommu: Handle freelists when using deferred flushing in iommu drivers
From: Tom Murphy Allow the iommu_unmap_fast to return newly freed page table pages and pass the freelist to queue_iova in the dma-iommu ops path. This is useful for iommu drivers (in this case the intel iommu driver) which need to wait for the ioTLB to be flushed before newly free/unmapped page table pages can be freed. This way we can still batch ioTLB free operations and handle the freelists. Signed-off-by: Tom Murphy Signed-off-by: Lu Baolu --- drivers/iommu/dma-iommu.c | 30 ++-- drivers/iommu/intel/iommu.c | 55 - include/linux/iommu.h | 1 + 3 files changed, 59 insertions(+), 27 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 5141d49a046b..82c071b2d5c8 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -50,6 +50,18 @@ struct iommu_dma_cookie { struct iommu_domain *fq_domain; }; +static void iommu_dma_entry_dtor(unsigned long data) +{ + struct page *freelist = (struct page *)data; + + while (freelist) { + unsigned long p = (unsigned long)page_address(freelist); + + freelist = freelist->freelist; + free_page(p); + } +} + static inline size_t cookie_msi_granule(struct iommu_dma_cookie *cookie) { if (cookie->type == IOMMU_DMA_IOVA_COOKIE) @@ -344,7 +356,8 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, if (!cookie->fq_domain && !iommu_domain_get_attr(domain, DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE, &attr) && attr) { cookie->fq_domain = domain; - init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all, NULL); + init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all, + iommu_dma_entry_dtor); } if (!dev) @@ -438,7 +451,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain, } static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie, - dma_addr_t iova, size_t size) + dma_addr_t iova, size_t size, struct page *freelist) { struct iova_domain *iovad = &cookie->iovad; @@ -447,7 +460,8 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie, cookie->msi_iova -= size; else if (cookie->fq_domain) /* non-strict mode */ queue_iova(iovad, iova_pfn(iovad, iova), - size >> iova_shift(iovad), 0); + size >> iova_shift(iovad), + (unsigned long)freelist); else free_iova_fast(iovad, iova_pfn(iovad, iova), size >> iova_shift(iovad)); @@ -472,7 +486,7 @@ static void __iommu_dma_unmap(struct device *dev, dma_addr_t dma_addr, if (!cookie->fq_domain) iommu_tlb_sync(domain, &iotlb_gather); - iommu_dma_free_iova(cookie, dma_addr, size); + iommu_dma_free_iova(cookie, dma_addr, size, iotlb_gather.freelist); } static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, @@ -494,7 +508,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, return DMA_MAPPING_ERROR; if (iommu_map_atomic(domain, iova, phys - iova_off, size, prot)) { - iommu_dma_free_iova(cookie, iova, size); + iommu_dma_free_iova(cookie, iova, size, NULL); return DMA_MAPPING_ERROR; } return iova + iova_off; @@ -649,7 +663,7 @@ static void *iommu_dma_alloc_remap(struct device *dev, size_t size, out_free_sg: sg_free_table(&sgt); out_free_iova: - iommu_dma_free_iova(cookie, iova, size); + iommu_dma_free_iova(cookie, iova, size, NULL); out_free_pages: __iommu_dma_free_pages(pages, count); return NULL; @@ -900,7 +914,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, return __finalise_sg(dev, sg, nents, iova); out_free_iova: - iommu_dma_free_iova(cookie, iova, iova_len); + iommu_dma_free_iova(cookie, iova, iova_len, NULL); out_restore_sg: __invalidate_sg(sg, nents); return 0; @@ -1194,7 +1208,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev, return msi_page; out_free_iova: - iommu_dma_free_iova(cookie, iova, size); + iommu_dma_free_iova(cookie, iova, size, NULL); out_free_page: kfree(msi_page); return NULL; diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 87b17bac04c2..63ee30c689a7 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -1208,17 +1208,17 @@ static struct page *dma_pte_clear_level(struct dmar_domain *domain, int level, pages can only be freed after the IOTLB flush has been done. */ static struct page *domain_unmap(struct dmar_domain
[PATCH v3 4/6] iommu: Add quirk for Intel graphic devices in map_sg
Combining the sg segments exposes a bug in the Intel i915 driver which causes visual artifacts and the screen to freeze. This is most likely because of how the i915 handles the returned list. It probably doesn't respect the returned value specifying the number of elements in the list and instead depends on the previous behaviour of the Intel iommu driver which would return the same number of elements in the output list as in the input list. Signed-off-by: Tom Murphy Signed-off-by: Lu Baolu --- drivers/iommu/dma-iommu.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 1a1da22e5a5e..fc19f1fb9413 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -880,6 +880,33 @@ static int __finalise_sg(struct device *dev, struct scatterlist *sg, int nents, unsigned int cur_len = 0, max_len = dma_get_max_seg_size(dev); int i, count = 0; + /* +* The Intel graphic driver is used to assume that the returned +* sg list is not combound. This blocks the efforts of converting +* Intel IOMMU driver to dma-iommu api's. Add this quirk to make the +* device driver work and should be removed once it's fixed in i915 +* driver. +*/ + if (IS_ENABLED(CONFIG_DRM_I915) && dev_is_pci(dev) && + to_pci_dev(dev)->vendor == PCI_VENDOR_ID_INTEL && + (to_pci_dev(dev)->class >> 16) == PCI_BASE_CLASS_DISPLAY) { + for_each_sg(sg, s, nents, i) { + unsigned int s_iova_off = sg_dma_address(s); + unsigned int s_length = sg_dma_len(s); + unsigned int s_iova_len = s->length; + + s->offset += s_iova_off; + s->length = s_length; + sg_dma_address(s) = dma_addr + s_iova_off; + sg_dma_len(s) = s_length; + dma_addr += s_iova_len; + + pr_info_once("sg combining disabled due to i915 driver\n"); + } + + return nents; + } + for_each_sg(sg, s, nents, i) { /* Restore this segment's original unaligned fields first */ unsigned int s_iova_off = sg_dma_address(s); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v7 10/16] vfio/type1: Support binding guest page tables to PASID
Hi Alex, > From: Alex Williamson > Sent: Saturday, September 12, 2020 6:04 AM > > On Thu, 10 Sep 2020 03:45:27 -0700 > Liu Yi L wrote: > > > Nesting translation allows two-levels/stages page tables, with 1st > > level for guest translations (e.g. GVA->GPA), 2nd level for host > > translations (e.g. GPA->HPA). This patch adds interface for binding > > guest page tables to a PASID. This PASID must have been allocated by > > the userspace before the binding request. > > > > Cc: Kevin Tian > > CC: Jacob Pan > > Cc: Alex Williamson > > Cc: Eric Auger > > Cc: Jean-Philippe Brucker > > Cc: Joerg Roedel > > Cc: Lu Baolu > > Signed-off-by: Jean-Philippe Brucker > > Signed-off-by: Liu Yi L > > Signed-off-by: Jacob Pan > > --- > > v6 -> v7: > > *) introduced @user in struct domain_capsule to simplify the code per Eric's > >suggestion. > > *) introduced VFIO_IOMMU_NESTING_OP_NUM for sanitizing op from userspace. > > *) corrected the @argsz value of unbind_data in > > vfio_group_unbind_gpasid_fn(). > > > > v5 -> v6: > > *) dropped vfio_find_nesting_group() and add > > vfio_get_nesting_domain_capsule(). > >per comment from Eric. > > *) use iommu_uapi_sva_bind/unbind_gpasid() and iommu_sva_unbind_gpasid() in > >linux/iommu.h for userspace operation and in-kernel operation. > > > > v3 -> v4: > > *) address comments from Alex on v3 > > > > v2 -> v3: > > *) use __iommu_sva_unbind_gpasid() for unbind call issued by VFIO > > > > https://lore.kernel.org/linux-iommu/1592931837-58223-6-git-send-email- > > jacob.jun@linux.intel.com/ > > > > v1 -> v2: > > *) rename subject from "vfio/type1: Bind guest page tables to host" > > *) remove VFIO_IOMMU_BIND, introduce VFIO_IOMMU_NESTING_OP to support > bind/ > >unbind guet page table > > *) replaced vfio_iommu_for_each_dev() with a group level loop since this > >series enforces one group per container w/ nesting type as start. > > *) rename vfio_bind/unbind_gpasid_fn() to > > vfio_dev_bind/unbind_gpasid_fn() > > *) vfio_dev_unbind_gpasid() always successful > > *) use vfio_mm->pasid_lock to avoid race between PASID free and page table > >bind/unbind > > --- > > drivers/vfio/vfio_iommu_type1.c | 163 > > > drivers/vfio/vfio_pasid.c | 26 +++ > > include/linux/vfio.h| 20 + > > include/uapi/linux/vfio.h | 36 + > > 4 files changed, 245 insertions(+) > > > > diff --git a/drivers/vfio/vfio_iommu_type1.c > > b/drivers/vfio/vfio_iommu_type1.c index bd4b668..11f1156 100644 > > --- a/drivers/vfio/vfio_iommu_type1.c > > +++ b/drivers/vfio/vfio_iommu_type1.c > > @@ -149,6 +149,39 @@ struct vfio_regions { > > #define DIRTY_BITMAP_PAGES_MAX ((u64)INT_MAX) > > #define DIRTY_BITMAP_SIZE_MAX > DIRTY_BITMAP_BYTES(DIRTY_BITMAP_PAGES_MAX) > > > > +struct domain_capsule { > > + struct vfio_group *group; > > + struct iommu_domain *domain; > > + /* set if @data contains a user pointer*/ > > + booluser; > > + void*data; > > +}; > > Put the hole in the structure at the end, but I suspect we might lose the > user field > when the internal api drops the unnecessary structure for unbind anyway. I see. will move @user and its comment to the end of this struct. As it's used to imply the @data field user pointer or not, I guess it's still useful to keep it. The difference would be the @data is a pasid not a bind_data struct. > > + > > +/* iommu->lock must be held */ > > +static int vfio_prepare_nesting_domain_capsule(struct vfio_iommu *iommu, > > + struct domain_capsule *dc) { > > + struct vfio_domain *domain = NULL; > > + struct vfio_group *group = NULL; > > Unnecessary initialization. will remove them. :-) > > + > > + if (!iommu->nesting_info) > > + return -EINVAL; > > + > > + /* > > +* Only support singleton container with nesting type. If > > +* nesting_info is non-NULL, the container is non-empty. > > +* Also domain is non-empty. > > +*/ > > + domain = list_first_entry(&iommu->domain_list, > > + struct vfio_domain, next); > > + group = list_first_entry(&domain->group_list, > > +struct vfio_group, next); > > + dc->group = group; > > + dc->domain = domain->domain; > > + dc->user = true; > > + return 0; > > +} > > + > > static int put_pfn(unsigned long pfn, int prot); > > > > static struct vfio_group *vfio_iommu_find_iommu_group(struct > > vfio_iommu *iommu, @@ -2405,6 +2438,49 @@ static int > vfio_iommu_resv_refresh(struct vfio_iommu *iommu, > > return ret; > > } > > > > +static int vfio_dev_bind_gpasid_fn(struct device *dev, void *data) { > > + struct domain_capsule *dc = (struct domain_capsule *)data; > > + unsigned long arg = *(unsigned long *)dc->data; > > + > > + return iommu_uapi_sva_bind_gpasid(dc->domain, dev, > > +
RE: [PATCH v7 07/16] vfio/type1: Add VFIO_IOMMU_PASID_REQUEST (alloc/free)
Hi Alex, > From: Alex Williamson > Sent: Saturday, September 12, 2020 5:38 AM > > On Thu, 10 Sep 2020 03:45:24 -0700 > Liu Yi L wrote: > > > This patch allows userspace to request PASID allocation/free, e.g. when > > serving the request from the guest. > > > > PASIDs that are not freed by userspace are automatically freed when the > > IOASID set is destroyed when process exits. > > > > Cc: Kevin Tian > > CC: Jacob Pan > > Cc: Alex Williamson > > Cc: Eric Auger > > Cc: Jean-Philippe Brucker > > Cc: Joerg Roedel > > Cc: Lu Baolu > > Signed-off-by: Liu Yi L > > Signed-off-by: Yi Sun > > Signed-off-by: Jacob Pan > > --- > > v6 -> v7: > > *) current VFIO returns allocated pasid via signed int, thus VFIO UAPI > >can only support 31 bits pasid. If user space gives min,max which is > >wider than 31 bits, should fail the allocation or free request. > > > > v5 -> v6: > > *) address comments from Eric against v5. remove the alloc/free helper. > > > > v4 -> v5: > > *) address comments from Eric Auger. > > *) the comments for the PASID_FREE request is addressed in patch 5/15 of > >this series. > > > > v3 -> v4: > > *) address comments from v3, except the below comment against the range > >of PASID_FREE request. needs more help on it. > > "> +if (req.range.min > req.range.max) > > > > Is it exploitable that a user can spin the kernel for a long time in > > the case of a free by calling this with [0, MAX_UINT] regardless of > > their actual allocations?" > > https://lore.kernel.org/linux-iommu/20200702151832.048b4...@x1.home/ > > > > v1 -> v2: > > *) move the vfio_mm related code to be a seprate module > > *) use a single structure for alloc/free, could support a range of PASIDs > > *) fetch vfio_mm at group_attach time instead of at iommu driver open time > > --- > > drivers/vfio/Kconfig| 1 + > > drivers/vfio/vfio_iommu_type1.c | 76 > + > > drivers/vfio/vfio_pasid.c | 10 ++ > > include/linux/vfio.h| 6 > > include/uapi/linux/vfio.h | 43 +++ > > 5 files changed, 136 insertions(+) > > > > diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig > > index 3d8a108..95d90c6 100644 > > --- a/drivers/vfio/Kconfig > > +++ b/drivers/vfio/Kconfig > > @@ -2,6 +2,7 @@ > > config VFIO_IOMMU_TYPE1 > > tristate > > depends on VFIO > > + select VFIO_PASID if (X86) > > default n > > > > config VFIO_IOMMU_SPAPR_TCE > > diff --git a/drivers/vfio/vfio_iommu_type1.c > > b/drivers/vfio/vfio_iommu_type1.c > > index 3c0048b..bd4b668 100644 > > --- a/drivers/vfio/vfio_iommu_type1.c > > +++ b/drivers/vfio/vfio_iommu_type1.c > > @@ -76,6 +76,7 @@ struct vfio_iommu { > > booldirty_page_tracking; > > boolpinned_page_dirty_scope; > > struct iommu_nesting_info *nesting_info; > > + struct vfio_mm *vmm; > > }; > > > > struct vfio_domain { > > @@ -2000,6 +2001,11 @@ static void vfio_iommu_iova_insert_copy(struct > vfio_iommu *iommu, > > > > static void vfio_iommu_release_nesting_info(struct vfio_iommu *iommu) > > { > > + if (iommu->vmm) { > > + vfio_mm_put(iommu->vmm); > > + iommu->vmm = NULL; > > + } > > + > > kfree(iommu->nesting_info); > > iommu->nesting_info = NULL; > > } > > @@ -2127,6 +2133,26 @@ static int vfio_iommu_type1_attach_group(void > *iommu_data, > > iommu->nesting_info); > > if (ret) > > goto out_detach; > > + > > + if (iommu->nesting_info->features & > > + IOMMU_NESTING_FEAT_SYSWIDE_PASID) > { > > + struct vfio_mm *vmm; > > + struct ioasid_set *set; > > + > > + vmm = vfio_mm_get_from_task(current); > > + if (IS_ERR(vmm)) { > > + ret = PTR_ERR(vmm); > > + goto out_detach; > > + } > > + iommu->vmm = vmm; > > + > > + set = vfio_mm_ioasid_set(vmm); > > + ret = iommu_domain_set_attr(domain->domain, > > + DOMAIN_ATTR_IOASID_SET, > > + set); > > + if (ret) > > + goto out_detach; > > + } > > } > > > > /* Get aperture info */ > > @@ -2908,6 +2934,54 @@ static int vfio_iommu_type1_dirty_pages(struct > vfio_iommu *iommu, > > return -EINVAL; > > } > > > > +static int vfio_iommu_type1_pasid_request(struct vfio_iommu *iommu, > > + unsigned long arg) > > +{ > > + struct vfio_iommu_type1_pasid_request req; > > + unsigned long minsz; > > + int ret; > > + > > + minsz = offsetofend(struct vfio_iommu_type1_pasid_request, range); > > + > > +
Re: [PATCH 3/3] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset
On Fri, Sep 11, 2020 at 05:12:36PM +0100, Robin Murphy wrote: > (apologies to Jim - I did look through one of the previous versions since I > last commented and thought it looked OK, but never actually replied as > such) > > On 2020-09-10 06:40, Christoph Hellwig wrote: >> From: Jim Quinlan >> >> The new field 'dma_range_map' in struct device is used to facilitate the >> use of single or multiple offsets between mapping regions of cpu addrs and >> dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was only >> capable of holding a single uniform offset and had no region bounds >> checking. >> >> The function of_dma_get_range() has been modified so that it takes a single >> argument -- the device node -- and returns a map, NULL, or an error code. >> The map is an array that holds the information regarding the DMA regions. >> Each range entry contains the address offset, the cpu_start address, the >> dma_start address, and the size of the region. >> >> of_dma_configure() is the typical manner to set range offsets but there are >> a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel >> driver code. These cases now invoke the function >> dma_attach_offset_range(dev, cpu_addr, dma_addr, size). > > This is now called dma_direct_set_offset(), right? Yes. >> +int ret = dma_direct_set_offset(dev, KEYSTONE_HIGH_PHYS_START, >> +KEYSTONE_LOW_PHYS_START, >> +KEYSTONE_HIGH_PHYS_SIZE); >> +dev_err(dev, "set dma_offset%08llx%s\n", >> +KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START, >> +ret ? " failed" : ""); > > FWIW I've already been thinking of some optimisations which would have the > happy side-effect of removing many of these allocation failure scenarios, > but at this point I reckon it's more practical to just get the current > implementation landed and working. Given that no one deals or can easily deal with these failures we should probably take care of that. IMHO we could just allocate a single static range and point all the devices to it, what would you think of that? >> @@ -811,8 +812,13 @@ static int sun4i_backend_bind(struct device *dev, >> struct device *master, >> * because of an old DT, we need to set the DMA offset by hand >> * on our device since the RAM mapping is at 0 for the DMA bus, >> * unlike the CPU. >> + * >> + * XXX(hch): this has no business in a driver and needs to move >> + * to the device tree. > > As the context implies, this has actually grown a proper DT description of > the funky interconnect layout (see 564d6fd611f9 and the linked patch > series), and this is just an ugly fallback path to prevent regressions with > old DTBs that are already out there. So unless you can fire up the time > machine to fix those, this extra comment is really just beating a dead > horse :( Well, I need to beat the dead horse to avoid having to export the function, which will really just bread new users. So at the minimum we'd need to move the setup to platform code that is always built in. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu