[PATCH v3 7/7] iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled
The IOMMUv2 APIs (for supporting shared virtual memory with PASID)
configure the domain with the IOMMU v2 page table and set DTE[Mode]=0.
This configuration cannot be supported on an SNP-enabled system.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index f5695ccb7c81..4c9b96160a8b 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3448,7 +3448,12 @@ __setup("ivrs_acpihid", parse_ivrs_acpihid);
 
 bool amd_iommu_v2_supported(void)
 {
-	return amd_iommu_v2_present;
+	/*
+	 * Since DTE[Mode]=0 is prohibited on SNP-enabled system
+	 * (i.e. EFR[SNPSup]=1), IOMMUv2 page table cannot be used without
+	 * setting up IOMMUv1 page table.
+	 */
+	return amd_iommu_v2_present && !amd_iommu_snp_en;
 }
 EXPORT_SYMBOL(amd_iommu_v2_supported);
 
-- 
2.32.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 0/7] iommu/amd: Enforce IOMMU restrictions for SNP-enabled system
An SNP-enabled system requires the IOMMU v1 page table to be configured with
non-zero DTE[Mode] for DMA-capable devices. This affects a number of use
cases, such as IOMMU pass-through mode and the AMD IOMMUv2 APIs for
binding/unbinding PASID.

The series introduces a global variable to check the SNP-enabled state during
driver initialization, and uses it to enforce the SNP restrictions at
runtime.

Also, for non-DMA-capable devices such as IOAPIC, the recommendation is to
set DTE[TV] and DTE[Mode] to zero on an SNP-enabled system. Therefore,
additional checks are added before setting DTE[TV].

Testing:
  - Tested booting and verifying dmesg.
  - Tested booting with iommu=pt
  - Tested loading amd_iommu_v2 driver
  - Tested changing the iommu domain at runtime
  - Tested booting SEV/SNP-enabled guest
  - Tested when CONFIG_AMD_MEM_ENCRYPT is not set

Pre-requisite:
  - [PATCH v3 00/35] iommu/amd: Add multiple PCI segments support
    https://lore.kernel.org/linux-iommu/20220511072141.15485-29-vasant.he...@amd.com/T/

Changes from V2:
(https://lists.linuxfoundation.org/pipermail/iommu/2022-June/066392.html)
  - Patch 4:
     * Update pr_err message to report SNP not supported.
     * Remove export GPL.
     * Remove function stub when CONFIG_AMD_MEM_ENCRYPT is not set.
  - Patch 6: Change to WARN_ONCE.

Best Regards,
Suravee

Brijesh Singh (1):
  iommu/amd: Introduce function to check and enable SNP

Suravee Suthikulpanit (6):
  iommu/amd: Warn when found inconsistency EFR mask
  iommu/amd: Process all IVHDs before enabling IOMMU features
  iommu/amd: Introduce an iommu variable for tracking SNP support status
  iommu/amd: Set translation valid bit only when IO page tables are in use
  iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled
  iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled

 drivers/iommu/amd/amd_iommu_types.h |   5 ++
 drivers/iommu/amd/init.c            | 109 +++-
 drivers/iommu/amd/iommu.c           |  27 ++-
 include/linux/amd-iommu.h           |   4 +
 4 files changed, 123 insertions(+), 22 deletions(-)

-- 
2.32.0
[PATCH v3 6/7] iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled
Once SNP is enabled (by executing the SNP_INIT command), the IOMMU can no
longer support the passthrough domain (i.e. IOMMU_DOMAIN_IDENTITY).

The SNP_INIT command is called early in the boot process, and fails if
the kernel is configured to default to passthrough mode.

After the system has booted, users can still try to change the IOMMU domain
type of a particular IOMMU group. In this case, the IOMMU driver needs to
check the SNP-enabled status and return failure when the request is to
change the domain type to identity.

Therefore, return failure when trying to allocate an identity domain.

Reviewed-by: Robin Murphy
Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/iommu.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4f4571d3ff61..7093e26fec59 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2119,6 +2119,14 @@ static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
 {
 	struct protection_domain *domain;
 
+	/*
+	 * Since DTE[Mode]=0 is prohibited on SNP-enabled system,
+	 * default to use IOMMU_DOMAIN_DMA[_FQ].
+	 */
+	if (WARN_ONCE(amd_iommu_snp_en && (type == IOMMU_DOMAIN_IDENTITY),
+		      "Cannot allocate identity domain due to SNP\n"))
+		return NULL;
+
 	domain = protection_domain_alloc(type);
 	if (!domain)
 		return NULL;
-- 
2.32.0
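
For readers who want to try the behaviour outside the kernel, the refusal described in this patch can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions: demo_domain_type, demo_domain and demo_domain_alloc() are hypothetical names invented here, the snp_en parameter stands in for amd_iommu_snp_en, and a static flag crudely imitates WARN_ONCE.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for IOMMU_DOMAIN_IDENTITY and IOMMU_DOMAIN_DMA. */
enum demo_domain_type { DEMO_DOMAIN_IDENTITY, DEMO_DOMAIN_DMA };

struct demo_domain {
	enum demo_domain_type type;
};

/*
 * Allocate a domain, refusing identity (passthrough) domains once SNP is
 * enabled. snp_en plays the role of amd_iommu_snp_en. Returns NULL on
 * refusal or allocation failure; the caller releases the result with free().
 */
struct demo_domain *demo_domain_alloc(bool snp_en, enum demo_domain_type type)
{
	static bool warned;	/* crude analogue of WARN_ONCE */
	struct demo_domain *domain;

	if (snp_en && type == DEMO_DOMAIN_IDENTITY) {
		if (!warned) {
			fprintf(stderr, "Cannot allocate identity domain due to SNP\n");
			warned = true;
		}
		return NULL;
	}

	domain = calloc(1, sizeof(*domain));
	if (!domain)
		return NULL;

	domain->type = type;
	return domain;
}
```

With snp_en set, a request for DEMO_DOMAIN_IDENTITY yields NULL while a request for DEMO_DOMAIN_DMA still succeeds, which mirrors what a caller changing the domain type at runtime would observe.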
[PATCH v3 4/7] iommu/amd: Introduce function to check and enable SNP
From: Brijesh Singh

To support SNP, the IOMMU needs to be enabled, and IOMMU configurations
where DTE[Mode]=0 are prohibited. This means SNP cannot be supported with
the IOMMU passthrough domain (a.k.a. IOMMU_DOMAIN_IDENTITY), or when the
AMD IOMMU driver is configured not to use the IOMMU host (v1) page table.
Otherwise, RMP table initialization could cause the system to crash.

The request to enable SNP support in the IOMMU must be made before the PCI
initialization state of the IOMMU driver, because enabling SNP affects
how the IOMMU driver sets up IOMMU data structures (i.e. DTE).

Unlike other IOMMU features, the SNP feature does not have an enable bit in
the IOMMU control register. Instead, the IOMMU driver introduces an
amd_iommu_snp_en variable to track the enabling state of SNP.

Introduce amd_iommu_snp_enable() for other drivers to request enabling
SNP support in the IOMMU. It checks all prerequisites and determines
whether the feature can be safely enabled.

Please see the IOMMU spec section 2.12 for further details.

Reviewed-by: Robin Murphy
Co-developed-by: Suravee Suthikulpanit
Signed-off-by: Suravee Suthikulpanit
Signed-off-by: Brijesh Singh
---
 drivers/iommu/amd/amd_iommu_types.h |  5 
 drivers/iommu/amd/init.c            | 44 +++--
 drivers/iommu/amd/iommu.c           |  4 +--
 include/linux/amd-iommu.h           |  4 +++
 4 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 73b729be7410..ce4db2835b36 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -463,6 +463,9 @@ extern bool amd_iommu_irq_remap;
 /* kmem_cache to get tables with 128 byte alignement */
 extern struct kmem_cache *amd_iommu_irq_cache;
 
+/* SNP is enabled on the system? */
+extern bool amd_iommu_snp_en;
+
 #define PCI_SBDF_TO_SEGID(sbdf)		(((sbdf) >> 16) & 0xffff)
 #define PCI_SBDF_TO_DEVID(sbdf)		((sbdf) & 0xffff)
 #define PCI_SEG_DEVID_TO_SBDF(seg, devid)	((((u32)(seg) & 0xffff) << 16) | \
@@ -1013,4 +1016,6 @@ extern struct amd_irte_ops irte_32_ops;
 extern struct amd_irte_ops irte_128_ops;
 #endif
 
+extern struct iommu_ops amd_iommu_ops;
+
 #endif /* _ASM_X86_AMD_IOMMU_TYPES_H */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 013c55e3c2f2..c62fb4470519 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -95,8 +95,6 @@
  * out of it.
  */
 
-extern const struct iommu_ops amd_iommu_ops;
-
 /*
  * structure describing one IOMMU in the ACPI table. Typically followed by one
  * or more ivhd_entrys.
@@ -168,6 +166,9 @@ static int amd_iommu_target_ivhd_type;
 
 static bool amd_iommu_snp_sup;
 
+bool amd_iommu_snp_en;
+EXPORT_SYMBOL(amd_iommu_snp_en);
+
 LIST_HEAD(amd_iommu_pci_seg_list);	/* list of all PCI segments */
 LIST_HEAD(amd_iommu_list);		/* list of all AMD IOMMUs in the
 					   system */
@@ -3549,3 +3550,42 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn, u64
 
 	return iommu_pc_get_set_reg(iommu, bank, cntr, fxn, value, true);
 }
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+int amd_iommu_snp_enable(void)
+{
+	/*
+	 * The SNP support requires that IOMMU must be enabled, and is
+	 * not configured in the passthrough mode.
+	 */
+	if (no_iommu || iommu_default_passthrough()) {
+		pr_err("SNP: IOMMU is disabled or configured in passthrough mode, SNP cannot be supported");
+		return -EINVAL;
+	}
+
+	/*
+	 * Prevent enabling SNP after IOMMU_ENABLED state because this process
+	 * affect how IOMMU driver sets up data structures and configures
+	 * IOMMU hardware.
+	 */
+	if (init_state > IOMMU_ENABLED) {
+		pr_err("SNP: Too late to enable SNP for IOMMU.\n");
+		return -EINVAL;
+	}
+
+	amd_iommu_snp_en = amd_iommu_snp_sup;
+	if (!amd_iommu_snp_en)
+		return -EINVAL;
+
+	pr_info("SNP enabled\n");
+
+	/* Enforce IOMMU v1 pagetable when SNP is enabled. */
+	if (amd_iommu_pgtable != AMD_IOMMU_V1) {
+		pr_warn("Force to using AMD IOMMU v1 page table due to SNP\n");
+		amd_iommu_pgtable = AMD_IOMMU_V1;
+		amd_iommu_ops.pgsize_bitmap = AMD_IOMMU_PGSIZES;
+	}
+
+	return 0;
+}
+#endif
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 86045dc50a0f..0792cd618dba 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -71,7 +71,7 @@ LIST_HEAD(acpihid_map);
  * Domain for untranslated devices - only allocated
  * if iommu=pt passed on kernel cmd line.
  */
-const struct iommu_ops amd_iommu_ops;
+struct iommu_ops amd_iommu_ops;
 
 static ATOMIC_NOTIFIER_HEAD(ppr_notifier);
 int amd_iommu_max_glx_val = -1;
@@ -2412,7
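
The order of the prerequisite checks in amd_iommu_snp_enable() may be easier to follow in a small userspace model. The sketch below is not the driver code: struct demo_iommu_cfg, the demo_* enums and demo_snp_enable() are hypothetical names made up for illustration, and the kernel globals (no_iommu, init_state, amd_iommu_snp_sup, amd_iommu_pgtable) are modelled as plain struct fields.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the driver's init state machine (ordering matters). */
enum demo_init_state {
	DEMO_IOMMU_START,
	DEMO_IOMMU_ENABLED,
	DEMO_IOMMU_PCI_INIT,
};

enum demo_pgtable { DEMO_PGTABLE_V1, DEMO_PGTABLE_V2 };

struct demo_iommu_cfg {
	bool no_iommu;			/* IOMMU disabled on the command line */
	bool default_passthrough;	/* e.g. booted with iommu=pt */
	bool snp_supported;		/* EFR[SNPSup] set on all IOMMUs */
	bool snp_enabled;		/* models amd_iommu_snp_en */
	enum demo_init_state init_state;
	enum demo_pgtable pgtable;
};

/* Returns 0 when SNP can be enabled, -EINVAL otherwise. */
int demo_snp_enable(struct demo_iommu_cfg *cfg)
{
	/* The IOMMU must be on and must not default to passthrough. */
	if (cfg->no_iommu || cfg->default_passthrough)
		return -EINVAL;

	/* Too late once the driver has moved past the ENABLED state. */
	if (cfg->init_state > DEMO_IOMMU_ENABLED)
		return -EINVAL;

	/* Enabled state simply follows the hardware support flag. */
	cfg->snp_enabled = cfg->snp_supported;
	if (!cfg->snp_enabled)
		return -EINVAL;

	/* SNP requires the v1 (host) page table. */
	if (cfg->pgtable != DEMO_PGTABLE_V1)
		cfg->pgtable = DEMO_PGTABLE_V1;

	return 0;
}
```

Note that in this model, as in the patch, the page table format is only forced to v1 after every other check has passed, so a failed request leaves the configuration untouched apart from the enabled flag.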
[PATCH v3 5/7] iommu/amd: Set translation valid bit only when IO page tables are in use
On an AMD system with SNP enabled, the IOMMU hardware checks the host
translation valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use. This results in
an ILLEGAL_DEV_TABLE_ENTRY event for devices which do not have the host
page table root pointer set up.

Therefore, when SNP is enabled, do not set the TV bit when DMA remapping is
not in use, which is when the domain ID in the AMD IOMMU device table entry
(DTE) is zero.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c  |  3 ++-
 drivers/iommu/amd/iommu.c | 15 +--
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index c62fb4470519..f5695ccb7c81 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2544,7 +2544,8 @@ static void init_device_table_dma(struct amd_iommu_pci_seg *pci_seg)
 
 	for (devid = 0; devid <= pci_seg->last_bdf; ++devid) {
 		__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_VALID);
-		__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION);
+		if (!amd_iommu_snp_en)
+			__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION);
 	}
 }
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 0792cd618dba..4f4571d3ff61 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1563,7 +1563,14 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
 	    (domain->flags & PD_GIOV_MASK))
 		pte_root |= DTE_FLAG_GIOV;
 
-	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V;
+
+	/*
+	 * When SNP is enabled, Only set TV bit when IOMMU
+	 * page translation is in use.
+	 */
+	if (!amd_iommu_snp_en || (domain->id != 0))
+		pte_root |= DTE_FLAG_TV;
 
 	flags = dev_table[devid].data[1];
 
@@ -1625,7 +1632,11 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
 	struct dev_table_entry *dev_table = get_dev_table(iommu);
 
 	/* remove entry from the device table seen by the hardware */
-	dev_table[devid].data[0] = DTE_FLAG_V | DTE_FLAG_TV;
+	dev_table[devid].data[0] = DTE_FLAG_V;
+
+	if (!amd_iommu_snp_en)
+		dev_table[devid].data[0] |= DTE_FLAG_TV;
+
 	dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
 	amd_iommu_apply_erratum_63(iommu, devid);
-- 
2.32.0
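
The TV-bit decision made in set_dte_entry() and clear_dte_entry() can be checked in isolation with a small model such as the one below. It is a sketch only: the demo_* names are hypothetical, the bit positions chosen for the V and TV flags are illustrative assumptions and not taken from this patch, and the other DTE fields (IR, IW, page table root) are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values; the positions are an assumption of this sketch. */
#define DEMO_DTE_FLAG_V		(1ULL << 0)	/* entry valid */
#define DEMO_DTE_FLAG_TV	(1ULL << 1)	/* translation info valid */

/*
 * Flags for a device being attached to a domain. With SNP enabled, TV is
 * set only when the domain ID is non-zero, i.e. DMA remapping is in use.
 */
uint64_t demo_dte_attach_flags(bool snp_en, uint16_t domain_id)
{
	uint64_t flags = DEMO_DTE_FLAG_V;

	if (!snp_en || domain_id != 0)
		flags |= DEMO_DTE_FLAG_TV;

	return flags;
}

/* Flags for a cleared entry: TV is never set once SNP is enabled. */
uint64_t demo_dte_clear_flags(bool snp_en)
{
	uint64_t flags = DEMO_DTE_FLAG_V;

	if (!snp_en)
		flags |= DEMO_DTE_FLAG_TV;

	return flags;
}
```

Without SNP the model always returns V|TV, matching the pre-patch behaviour; with SNP it returns V alone for domain ID zero and for cleared entries.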
[PATCH v3 3/7] iommu/amd: Introduce an iommu variable for tracking SNP support status
EFR[SNPSup] needs to be checked early in the boot process, since it is
used to determine how IOMMU driver configures other IOMMU features
and data structures. This check can be done as soon as the IOMMU driver
finishes parsing IVHDs.

Introduce a variable for tracking the SNP support status, which is
initialized before enabling the rest of IOMMU features.

Also report IOMMU SNP support information for each IOMMU.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 5f86e357dbaa..013c55e3c2f2 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -166,6 +166,8 @@ static bool amd_iommu_disabled __initdata;
 static bool amd_iommu_force_enable __initdata;
 static int amd_iommu_target_ivhd_type;
 
+static bool amd_iommu_snp_sup;
+
 LIST_HEAD(amd_iommu_pci_seg_list);	/* list of all PCI segments */
 LIST_HEAD(amd_iommu_list);		/* list of all AMD IOMMUs in the
 					   system */
@@ -260,7 +262,6 @@ int amd_iommu_get_num_iommus(void)
 	return amd_iommus_present;
 }
 
-#ifdef CONFIG_IRQ_REMAP
 /*
  * Iterate through all the IOMMUs to verify if the specified
  * EFR bitmask of IOMMU feature are set.
@@ -285,7 +286,6 @@ static bool check_feature_on_all_iommus(u64 mask)
 	}
 	return ret;
 }
-#endif
 
 /*
  * For IVHD type 0x11/0x40, EFR is also available via IVHD.
@@ -368,7 +368,7 @@ static void iommu_set_cwwb_range(struct amd_iommu *iommu)
 	u64 start = iommu_virt_to_phys((void *)iommu->cmd_sem);
 	u64 entry = start & PM_ADDR_MASK;
 
-	if (!iommu_feature(iommu, FEATURE_SNP))
+	if (!amd_iommu_snp_sup)
 		return;
 
 	/* Note:
@@ -783,7 +783,7 @@ static void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
 	void *buf = (void *)__get_free_pages(gfp, order);
 
 	if (buf &&
-	    iommu_feature(iommu, FEATURE_SNP) &&
+	    amd_iommu_snp_sup &&
 	    set_memory_4k((unsigned long)buf, (1 << order))) {
 		free_pages((unsigned long)buf, order);
 		buf = NULL;
@@ -1882,6 +1882,7 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	WARN_ON(p != end);
 
 	/* Phase 2 : Early feature support check */
+	amd_iommu_snp_sup = check_feature_on_all_iommus(FEATURE_SNP);
 
 	/* Phase 3 : Enabling IOMMU features */
 	for_each_iommu(iommu) {
@@ -2118,6 +2119,9 @@ static void print_iommu_info(void)
 			if (iommu->features & FEATURE_GAM_VAPIC)
 				pr_cont(" GA_vAPIC");
 
+			if (iommu->features & FEATURE_SNP)
+				pr_cont(" SNP");
+
 			pr_cont("\n");
 		}
 	}
-- 
2.32.0
[PATCH v3 1/7] iommu/amd: Warn when found inconsistency EFR mask
The function check_feature_on_all_iommus() checks whether an IOMMU
feature support bit is set in the Extended Feature Register (EFR).
The current logic iterates through the IOMMUs and returns false as soon
as it finds the first unset bit.

To provide more thorough checking, modify the logic to keep iterating
through all IOMMUs even after finding that the bit is not set, and to
emit a FW_BUG warning if an inconsistency is found.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 3dd0f26039c7..b3e4551ce9dd 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -261,18 +261,29 @@ int amd_iommu_get_num_iommus(void)
 }
 
 #ifdef CONFIG_IRQ_REMAP
+/*
+ * Iterate through all the IOMMUs to verify if the specified
+ * EFR bitmask of IOMMU feature are set.
+ * Warn and return false if found inconsistency.
+ */
 static bool check_feature_on_all_iommus(u64 mask)
 {
 	bool ret = false;
 	struct amd_iommu *iommu;
 
 	for_each_iommu(iommu) {
-		ret = iommu_feature(iommu, mask);
-		if (!ret)
+		bool tmp = iommu_feature(iommu, mask);
+
+		if ((ret != tmp) &&
+		    !list_is_first(&iommu->list, &amd_iommu_list)) {
+			pr_err(FW_BUG "Found inconsistent EFR mask (%#llx) on iommu%d (%04x:%02x:%02x.%01x).\n",
+			       mask, iommu->index, iommu->pci_seg->id, PCI_BUS_NUM(iommu->devid),
+			       PCI_SLOT(iommu->devid), PCI_FUNC(iommu->devid));
 			return false;
+		}
+		ret = tmp;
 	}
-
-	return true;
+	return ret;
 }
 #endif
 
-- 
2.32.0
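
To experiment with the consistency rule outside the kernel, the loop can be modelled over a plain array of EFR values. This is a hedged sketch: demo_check_feature_on_all() and its inconsistent out-parameter are hypothetical, the array replaces the driver's IOMMU list, and the per-IOMMU test is assumed to be a simple "any bit of mask set" check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true only if the feature mask is set on every entry of efr[].
 * If entries disagree, report it through *inconsistent (the analogue of
 * the FW_BUG message) and return false.
 */
bool demo_check_feature_on_all(const uint64_t *efr, size_t n, uint64_t mask,
			       bool *inconsistent)
{
	bool ret = false;
	size_t i;

	*inconsistent = false;

	for (i = 0; i < n; i++) {
		bool tmp = (efr[i] & mask) != 0;

		/* The first entry only seeds ret; later ones must agree. */
		if (i != 0 && ret != tmp) {
			*inconsistent = true;
			return false;
		}
		ret = tmp;
	}

	return ret;
}
```

The design point carried over from the patch is that a mismatch is reported regardless of whether the first entry had the bit set or clear, whereas the old loop could only notice a clear bit.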
[PATCH v3 2/7] iommu/amd: Process all IVHDs before enabling IOMMU features
The ACPI IVRS table can contain multiple IVHD blocks. Each block contains
information used to initialize one IOMMU instance.

Currently, init_iommu_all processes the IVHD blocks sequentially and
initializes the IOMMU instances one by one. However, certain features
require all IOMMUs to be configured in the same way system-wide. If some
IVHD blocks contain inconsistent information (most likely due to FW bugs),
the driver needs to go back and try to revert settings on IOMMUs that have
already been configured.

A solution is to split IOMMU initialization into 3 phases:

Phase1 : Process information in the IVRS table for all IOMMU instances.
This allows all IVHDs to be processed prior to enabling features.

Phase2 : Early feature support check on all IOMMUs (using information in
IVHD blocks).

Phase3 : Iterate through all IOMMU instances and enable features.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index b3e4551ce9dd..5f86e357dbaa 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1692,7 +1692,6 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 				 struct acpi_table_header *ivrs_base)
 {
 	struct amd_iommu_pci_seg *pci_seg;
-	int ret;
 
 	pci_seg = get_pci_segment(h->pci_seg, ivrs_base);
 	if (pci_seg == NULL)
@@ -1773,6 +1772,13 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 	if (!iommu->mmio_base)
 		return -ENOMEM;
 
+	return init_iommu_from_acpi(iommu, h);
+}
+
+static int __init init_iommu_one_late(struct amd_iommu *iommu)
+{
+	int ret;
+
 	if (alloc_cwwb_sem(iommu))
 		return -ENOMEM;
 
@@ -1794,10 +1800,6 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 	if (amd_iommu_pre_enabled)
 		amd_iommu_pre_enabled = translation_pre_enabled(iommu);
 
-	ret = init_iommu_from_acpi(iommu, h);
-	if (ret)
-		return ret;
-
 	if (amd_iommu_irq_remap) {
 		ret = amd_iommu_create_irq_domain(iommu);
 		if (ret)
@@ -1808,7 +1810,7 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 	 * Make sure IOMMU is not considered to translate itself. The IVRS
 	 * table tells us so, but this is a lie!
 	 */
-	pci_seg->rlookup_table[iommu->devid] = NULL;
+	iommu->pci_seg->rlookup_table[iommu->devid] = NULL;
 
 	return 0;
 }
@@ -1853,6 +1855,7 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	end += table->length;
 	p += IVRS_HEADER_LENGTH;
 
+	/* Phase 1: Process all IVHD blocks */
 	while (p < end) {
 		h = (struct ivhd_header *)p;
 		if (*p == amd_iommu_target_ivhd_type) {
@@ -1878,6 +1881,15 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	}
 	WARN_ON(p != end);
 
+	/* Phase 2 : Early feature support check */
+
+	/* Phase 3 : Enabling IOMMU features */
+	for_each_iommu(iommu) {
+		ret = init_iommu_one_late(iommu);
+		if (ret)
+			return ret;
+	}
+
 	return 0;
 }
 
-- 
2.32.0
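
The three-phase split can be illustrated with a toy model in which "parsing an IVHD" is just copying an EFR value and "enabling a feature" is setting a flag. Everything named demo_* below is hypothetical and exists only for this sketch; the real phases do considerably more work.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_iommu {
	uint64_t efr;		/* feature bits recorded in phase 1 */
	bool feature_on;	/* set in phase 3 */
};

/*
 * Initialize n IOMMUs from per-IVHD EFR values. Returns whether the
 * requested feature ended up enabled system-wide.
 */
bool demo_init_all(struct demo_iommu *iommus, const uint64_t *ivhd_efr,
		   size_t n, uint64_t feature)
{
	bool sup = n > 0;
	size_t i;

	/* Phase 1: record per-instance information for every IOMMU. */
	for (i = 0; i < n; i++) {
		iommus[i].efr = ivhd_efr[i];
		iommus[i].feature_on = false;
	}

	/* Phase 2: system-wide check, before anything is enabled. */
	for (i = 0; i < n; i++)
		if (!(iommus[i].efr & feature))
			sup = false;

	/* Phase 3: enable features using the system-wide decision. */
	for (i = 0; i < n; i++)
		iommus[i].feature_on = sup;

	return sup;
}
```

Because the decision in phase 2 is made before phase 3 touches any instance, there is never a partially enabled state to revert, which is the motivation given in the commit message.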
[PATCH v2 4/7] iommu/amd: Introduce function to check and enable SNP
From: Brijesh Singh

To support SNP, the IOMMU needs to be enabled, and IOMMU configurations
where DTE[Mode]=0 are prohibited. This means SNP cannot be supported with
the IOMMU passthrough domain (a.k.a. IOMMU_DOMAIN_IDENTITY), or when the
AMD IOMMU driver is configured not to use the IOMMU host (v1) page table.
Otherwise, RMP table initialization could cause the system to crash.

The request to enable SNP support in the IOMMU must be made before the PCI
initialization state of the IOMMU driver, because enabling SNP affects
how the IOMMU driver sets up IOMMU data structures (i.e. DTE).

Unlike other IOMMU features, the SNP feature does not have an enable bit in
the IOMMU control register. Instead, the IOMMU driver introduces an
amd_iommu_snp_en variable to track the enabling state of SNP.

Introduce amd_iommu_snp_enable() for other drivers to request enabling
SNP support in the IOMMU. It checks all prerequisites and determines
whether the feature can be safely enabled.

Please see the IOMMU spec section 2.12 for further details.

Co-developed-by: Suravee Suthikulpanit
Signed-off-by: Suravee Suthikulpanit
Signed-off-by: Brijesh Singh
---
 drivers/iommu/amd/amd_iommu_types.h |  5 
 drivers/iommu/amd/init.c            | 45 +++--
 drivers/iommu/amd/iommu.c           |  4 +--
 include/linux/amd-iommu.h           |  6 
 4 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 73b729be7410..ce4db2835b36 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -463,6 +463,9 @@ extern bool amd_iommu_irq_remap;
 /* kmem_cache to get tables with 128 byte alignement */
 extern struct kmem_cache *amd_iommu_irq_cache;
 
+/* SNP is enabled on the system? */
+extern bool amd_iommu_snp_en;
+
 #define PCI_SBDF_TO_SEGID(sbdf)		(((sbdf) >> 16) & 0xffff)
 #define PCI_SBDF_TO_DEVID(sbdf)		((sbdf) & 0xffff)
 #define PCI_SEG_DEVID_TO_SBDF(seg, devid)	((((u32)(seg) & 0xffff) << 16) | \
@@ -1013,4 +1016,6 @@ extern struct amd_irte_ops irte_32_ops;
 extern struct amd_irte_ops irte_128_ops;
 #endif
 
+extern struct iommu_ops amd_iommu_ops;
+
 #endif /* _ASM_X86_AMD_IOMMU_TYPES_H */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 013c55e3c2f2..b5d3de327a5f 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -95,8 +95,6 @@
  * out of it.
  */
 
-extern const struct iommu_ops amd_iommu_ops;
-
 /*
  * structure describing one IOMMU in the ACPI table. Typically followed by one
  * or more ivhd_entrys.
@@ -168,6 +166,9 @@ static int amd_iommu_target_ivhd_type;
 
 static bool amd_iommu_snp_sup;
 
+bool amd_iommu_snp_en;
+EXPORT_SYMBOL(amd_iommu_snp_en);
+
 LIST_HEAD(amd_iommu_pci_seg_list);	/* list of all PCI segments */
 LIST_HEAD(amd_iommu_list);		/* list of all AMD IOMMUs in the
 					   system */
@@ -3549,3 +3550,43 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn, u64
 
 	return iommu_pc_get_set_reg(iommu, bank, cntr, fxn, value, true);
 }
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+int amd_iommu_snp_enable(void)
+{
+	/*
+	 * The SNP support requires that IOMMU must be enabled, and is
+	 * not configured in the passthrough mode.
+	 */
+	if (no_iommu || iommu_default_passthrough()) {
+		pr_err("SNP: IOMMU is either disabled or configured in passthrough mode.\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Prevent enabling SNP after IOMMU_ENABLED state because this process
+	 * affect how IOMMU driver sets up data structures and configures
+	 * IOMMU hardware.
+	 */
+	if (init_state > IOMMU_ENABLED) {
+		pr_err("SNP: Too late to enable SNP for IOMMU.\n");
+		return -EINVAL;
+	}
+
+	amd_iommu_snp_en = amd_iommu_snp_sup;
+	if (!amd_iommu_snp_en)
+		return -EINVAL;
+
+	pr_info("SNP enabled\n");
+
+	/* Enforce IOMMU v1 pagetable when SNP is enabled. */
+	if (amd_iommu_pgtable != AMD_IOMMU_V1) {
+		pr_warn("Force to using AMD IOMMU v1 page table due to SNP\n");
+		amd_iommu_pgtable = AMD_IOMMU_V1;
+		amd_iommu_ops.pgsize_bitmap = AMD_IOMMU_PGSIZES;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(amd_iommu_snp_enable);
+#endif
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 86045dc50a0f..0792cd618dba 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -71,7 +71,7 @@ LIST_HEAD(acpihid_map);
  * Domain for untranslated devices - only allocated
  * if iommu=pt passed on kernel cmd line.
  */
-const struct iommu_ops amd_iommu_ops;
+struct iommu_ops amd_iommu_ops;
 
 static ATOMIC_NOTIFIER_HEAD(ppr_notifier);
 int amd_iommu_max_glx_val = -1;
@@ -2412,7
[PATCH v2 7/7] iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled
The IOMMUv2 APIs (for supporting shared virtual memory with PASID)
configure the domain with the IOMMU v2 page table and set DTE[Mode]=0.
This configuration cannot be supported on an SNP-enabled system.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index bc008a82c12c..780d6977a331 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3448,7 +3448,12 @@ __setup("ivrs_acpihid", parse_ivrs_acpihid);
 
 bool amd_iommu_v2_supported(void)
 {
-	return amd_iommu_v2_present;
+	/*
+	 * Since DTE[Mode]=0 is prohibited on SNP-enabled system
+	 * (i.e. EFR[SNPSup]=1), IOMMUv2 page table cannot be used without
+	 * setting up IOMMUv1 page table.
+	 */
+	return amd_iommu_v2_present && !amd_iommu_snp_en;
 }
 EXPORT_SYMBOL(amd_iommu_v2_supported);
 
-- 
2.32.0
[PATCH v2 6/7] iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled
Once SNP is enabled (by executing the SNP_INIT command), the IOMMU can no
longer support the passthrough domain (i.e. IOMMU_DOMAIN_IDENTITY).

The SNP_INIT command is called early in the boot process, and fails if
the kernel is configured to default to passthrough mode.

After the system has booted, users can still try to change the IOMMU domain
type of a particular IOMMU group. In this case, the IOMMU driver needs to
check the SNP-enabled status and return failure when the request is to
change the domain type to identity.

Therefore, return failure when trying to allocate an identity domain.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/iommu.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4f4571d3ff61..d8a6df423b90 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2119,6 +2119,15 @@ static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
 {
 	struct protection_domain *domain;
 
+	/*
+	 * Since DTE[Mode]=0 is prohibited on SNP-enabled system,
+	 * default to use IOMMU_DOMAIN_DMA[_FQ].
+	 */
+	if (amd_iommu_snp_en && (type == IOMMU_DOMAIN_IDENTITY)) {
+		pr_warn("Cannot allocate identity domain due to SNP\n");
+		return NULL;
+	}
+
 	domain = protection_domain_alloc(type);
 	if (!domain)
 		return NULL;
-- 
2.32.0
[PATCH v2 5/7] iommu/amd: Set translation valid bit only when IO page tables are in use
On AMD systems with SNP enabled, the IOMMU hardware checks the host
translation valid (TV) and guest translation valid (GV) bits in the
device table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use. This results in an
ILLEGAL_DEV_TABLE_ENTRY event for devices that do not have the host page
table root pointer set up.

Therefore, when SNP is enabled, only set the TV bit when DMA remapping
is in use, which is when the domain ID in the AMD IOMMU device table
entry (DTE) is non-zero.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c  |  3 ++-
 drivers/iommu/amd/iommu.c | 15 +++++++++++++--
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index b5d3de327a5f..bc008a82c12c 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2544,7 +2544,8 @@ static void init_device_table_dma(struct amd_iommu_pci_seg *pci_seg)
 
 	for (devid = 0; devid <= pci_seg->last_bdf; ++devid) {
 		__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_VALID);
-		__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION);
+		if (!amd_iommu_snp_en)
+			__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION);
 	}
 }
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 0792cd618dba..4f4571d3ff61 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1563,7 +1563,14 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
 	    (domain->flags & PD_GIOV_MASK))
 		pte_root |= DTE_FLAG_GIOV;
 
-	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V;
+
+	/*
+	 * When SNP is enabled, Only set TV bit when IOMMU
+	 * page translation is in use.
+	 */
+	if (!amd_iommu_snp_en || (domain->id != 0))
+		pte_root |= DTE_FLAG_TV;
 
 	flags = dev_table[devid].data[1];
 
@@ -1625,7 +1632,11 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
 	struct dev_table_entry *dev_table = get_dev_table(iommu);
 
 	/* remove entry from the device table seen by the hardware */
-	dev_table[devid].data[0] = DTE_FLAG_V | DTE_FLAG_TV;
+	dev_table[devid].data[0] = DTE_FLAG_V;
+
+	if (!amd_iommu_snp_en)
+		dev_table[devid].data[0] |= DTE_FLAG_TV;
+
 	dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
 	amd_iommu_apply_erratum_63(iommu, devid);
--
2.32.0
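The TV-bit condition in the set_dte_entry() hunk above can be modelled in
user space as a rough sketch. The bit positions and the helper name
sketch_dte_flags() below are illustrative assumptions, not the driver's
definitions; only the condition itself is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; assumptions, not taken from the driver headers. */
#define SKETCH_DTE_FLAG_V  (1ULL << 0)	/* entry valid */
#define SKETCH_DTE_FLAG_TV (1ULL << 1)	/* translation information valid */

/*
 * Hypothetical helper mirroring the condition in the patch:
 * without SNP, TV is always set; with SNP, TV is set only when
 * the domain ID is non-zero.
 */
uint64_t sketch_dte_flags(bool snp_en, uint16_t domain_id)
{
	uint64_t flags = SKETCH_DTE_FLAG_V;

	if (!snp_en || domain_id != 0)
		flags |= SKETCH_DTE_FLAG_TV;

	return flags;
}
```

With snp_en false the result always carries both bits; with snp_en true and
domain_id 0 only the valid bit remains.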
[PATCH v2 0/7] iommu/amd: Enforce IOMMU restrictions for SNP-enabled system
SNP-enabled systems require the IOMMU v1 page table to be configured with
a non-zero DTE[Mode] for DMA-capable devices. This affects a number of
use cases, such as IOMMU pass-through mode and the AMD IOMMUv2 APIs for
binding/unbinding PASIDs.

The series introduces a global variable to check the SNP-enabled state
during driver initialization, and uses it to enforce the SNP restrictions
at runtime.

Also, for non-DMA-capable devices such as the IOAPIC, the recommendation
is to set DTE[TV] and DTE[Mode] to zero on SNP-enabled systems.
Therefore, additional checks are added before setting DTE[TV].

Testing:
  - Tested booting and verified dmesg.
  - Tested booting with iommu=pt
  - Tested loading amd_iommu_v2 driver
  - Tested changing the iommu domain at runtime
  - Tested booting SEV/SNP-enabled guest
  - Tested when CONFIG_AMD_MEM_ENCRYPT is not set

Pre-requisite:
  - [PATCH v3 00/35] iommu/amd: Add multiple PCI segments support
    https://lore.kernel.org/linux-iommu/20220511072141.15485-29-vasant.he...@amd.com/T/

Changes from V1:
(https://lore.kernel.org/linux-iommu/20220613012502.109918-1-suravee.suthikulpa...@amd.com/T/#t )
  - Remove the newly introduced domain_type_supported() callback.
  - Patch 1: Modify existing check_feature_on_all_iommus() instead of
    introducing another helper function to do a similar check.
  - Patch 3: Modify to use check_feature_on_all_iommus().
  - Patch 4: Add IOMMU init_state check before enabling SNP.
    Also move the function declaration to include/linux/amd-iommu.h
  - Patch 6: Modify amd_iommu_domain_alloc() to fail when allocating an
    identity domain and SNP is enabled.

Best Regards,
Suravee

Brijesh Singh (1):
  iommu/amd: Introduce function to check and enable SNP

Suravee Suthikulpanit (6):
  iommu/amd: Warn when found inconsistency EFR mask
  iommu/amd: Process all IVHDs before enabling IOMMU features
  iommu/amd: Introduce an iommu variable for tracking SNP support status
  iommu/amd: Set translation valid bit only when IO page tables are in use
  iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled
  iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled

 drivers/iommu/amd/amd_iommu_types.h |   5 ++
 drivers/iommu/amd/init.c            | 110 +++-
 drivers/iommu/amd/iommu.c           |  28 ++-
 include/linux/amd-iommu.h           |   6 ++
 4 files changed, 127 insertions(+), 22 deletions(-)

--
2.32.0
[PATCH v2 1/7] iommu/amd: Warn when found inconsistency EFR mask
The function check_feature_on_all_iommus() checks whether an IOMMU
feature support bit is set in the Extended Feature Register (EFR).
The current logic iterates through all IOMMUs and returns false at the
first unset bit.

To provide more thorough checking, modify the logic to keep iterating
through all IOMMUs even after finding that the bit is not set, and to
emit a FW_BUG warning if an inconsistency is found.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 3dd0f26039c7..b3e4551ce9dd 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -261,18 +261,29 @@ int amd_iommu_get_num_iommus(void)
 }
 
 #ifdef CONFIG_IRQ_REMAP
+/*
+ * Iterate through all the IOMMUs to verify if the specified
+ * EFR bitmask of IOMMU feature are set.
+ * Warn and return false if found inconsistency.
+ */
 static bool check_feature_on_all_iommus(u64 mask)
 {
 	bool ret = false;
 	struct amd_iommu *iommu;
 
 	for_each_iommu(iommu) {
-		ret = iommu_feature(iommu, mask);
-		if (!ret)
+		bool tmp = iommu_feature(iommu, mask);
+
+		if ((ret != tmp) &&
+		    !list_is_first(&iommu->list, &amd_iommu_list)) {
+			pr_err(FW_BUG "Found inconsistent EFR mask (%#llx) on iommu%d (%04x:%02x:%02x.%01x).\n",
+			       mask, iommu->index, iommu->pci_seg->id, PCI_BUS_NUM(iommu->devid),
+			       PCI_SLOT(iommu->devid), PCI_FUNC(iommu->devid));
 			return false;
+		}
+		ret = tmp;
 	}
-
-	return true;
+	return ret;
 }
 #endif
 
--
2.32.0
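As a rough user-space illustration of the consistency check described in
this patch, the sketch below walks an array of EFR values instead of the
driver's IOMMU list. The function name, the array-based interface and the
separate "inconsistent" output are assumptions made for the example; the
driver reports the mismatch through pr_err(FW_BUG ...) instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for check_feature_on_all_iommus(): efr[] holds one
 * Extended Feature Register value per IOMMU. Returns true only when every
 * instance reports the feature. *inconsistent is set when instances
 * disagree, which is where the driver would print its FW_BUG message.
 */
bool sketch_feature_on_all(const uint64_t *efr, size_t n, uint64_t mask,
			   bool *inconsistent)
{
	bool ret = false;
	size_t i;

	*inconsistent = false;
	for (i = 0; i < n; i++) {
		bool tmp = (efr[i] & mask) != 0;

		/* The first instance has nothing to be compared against. */
		if (i != 0 && tmp != ret) {
			*inconsistent = true;
			return false;
		}
		ret = tmp;
	}
	return ret;
}
```

The return value alone cannot distinguish "feature absent everywhere" from
"firmware reports mixed values"; the sketch exposes that difference through
the extra flag.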
[PATCH 6/7] iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY when SNP is enabled
Since DTE[Mode]=0 is prohibited on systems that enable SNP, the
passthrough domain (IOMMU_DOMAIN_IDENTITY) is not supported. Instead,
only support IOMMU_DOMAIN_DMA[_FQ] domains.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/iommu.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index ca4647f04382..ecde9e08102d 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2379,6 +2379,17 @@ static int amd_iommu_def_domain_type(struct device *dev)
 	return 0;
 }
 
+static bool amd_iommu_domain_type_supported(struct device *dev, int type)
+{
+	/*
+	 * Since DTE[Mode]=0 is prohibited on SNP-enabled system,
+	 * default to use IOMMU_DOMAIN_DMA[_FQ].
+	 */
+	if (amd_iommu_snp_en && (type == IOMMU_DOMAIN_IDENTITY))
+		return false;
+	return true;
+}
+
 struct iommu_ops amd_iommu_ops = {
 	.capable = amd_iommu_capable,
 	.domain_alloc = amd_iommu_domain_alloc,
@@ -2391,6 +2402,7 @@ struct iommu_ops amd_iommu_ops = {
 	.is_attach_deferred = amd_iommu_is_attach_deferred,
 	.pgsize_bitmap	= AMD_IOMMU_PGSIZES,
 	.def_domain_type = amd_iommu_def_domain_type,
+	.domain_type_supported = amd_iommu_domain_type_supported,
 	.default_domain_ops = &(const struct iommu_domain_ops) {
 		.attach_dev	= amd_iommu_attach_device,
 		.detach_dev	= amd_iommu_detach_device,
--
2.32.0
[PATCH 3/7] iommu/amd: Introduce function to check SEV-SNP support
From: Brijesh Singh

SEV-SNP support requires that the IOMMU be enabled. It also prohibits
IOMMU configurations where DTE[Mode]=0, which means the SEV-SNP feature
is not supported with the IOMMU passthrough domain (a.k.a.
IOMMU_DOMAIN_IDENTITY), or when the AMD IOMMU driver is configured to
not use the IOMMU host (v1) page table. Otherwise, the SNP_INIT command
(used for initializing firmware) will fail.

Unlike other IOMMU features, the SNP feature does not have an enable bit
in the IOMMU control register. Instead, the feature is considered
enabled when the SNP_INIT command is executed, which is done by a
separate driver.

Introduce iommu_sev_snp_supported() for checking if the IOMMU supports
the SEV-SNP feature, and an amd_iommu_snp_en global variable to keep
track of the SNP enable status.

Please see the IOMMU spec section 2.12 for further details.

Tested-by: Ashish Kalra
Co-developed-by: Suravee Suthikulpanit
Signed-off-by: Suravee Suthikulpanit
Signed-off-by: Brijesh Singh
---
 drivers/iommu/amd/amd_iommu_types.h | 11 +++++++++++
 drivers/iommu/amd/init.c            | 39 ++++++++++++++++++++++++++++++---------
 drivers/iommu/amd/iommu.c           |  4 ++--
 include/linux/iommu.h               |  9 +++++++++
 4 files changed, 52 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 328572cf6fa5..6552c0da8f32 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -450,6 +450,9 @@ extern bool amd_iommu_irq_remap;
 /* kmem_cache to get tables with 128 byte alignement */
 extern struct kmem_cache *amd_iommu_irq_cache;
 
+/* SNP is enabled on the system? */
+extern bool amd_iommu_snp_en;
+
 #define PCI_SBDF_TO_SEGID(sbdf)		(((sbdf) >> 16) & 0xffff)
 #define PCI_SBDF_TO_DEVID(sbdf)		((sbdf) & 0xffff)
 #define PCI_SEG_DEVID_TO_SBDF(seg, devid)	((((u32)(seg) & 0xffff) << 16) | \
@@ -999,4 +1002,12 @@ extern struct amd_irte_ops irte_32_ops;
 extern struct amd_irte_ops irte_128_ops;
 #endif
 
+/*
+ * ACPI table definitions
+ *
+ * These data structures are laid over the table to parse the important values
+ * out of it.
+ */
+extern struct iommu_ops amd_iommu_ops;
+
 #endif /* _ASM_X86_AMD_IOMMU_TYPES_H */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 3965bd3f4f67..da32e7bdd1fa 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -88,15 +88,6 @@
 #define IVRS_GET_SBDF_ID(seg, bus, dev, fd)	(((seg & 0xffff) << 16) | ((bus & 0xff) << 8) \
 						 | ((dev & 0x1f) << 3) | (fn & 0x7))
 
-/*
- * ACPI table definitions
- *
- * These data structures are laid over the table to parse the important values
- * out of it.
- */
-
-extern const struct iommu_ops amd_iommu_ops;
-
 /*
  * structure describing one IOMMU in the ACPI table. Typically followed by one
  * or more ivhd_entrys.
@@ -166,6 +157,9 @@ static int amd_iommu_target_ivhd_type;
 
 static bool amd_iommu_snp_sup;
 
+bool amd_iommu_snp_en;
+EXPORT_SYMBOL(amd_iommu_snp_en);
+
 LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */
 LIST_HEAD(amd_iommu_list);		/* list of all AMD IOMMUs in the
 					   system */
@@ -3543,3 +3537,30 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn, u64
 
 	return iommu_pc_get_set_reg(iommu, bank, cntr, fxn, value, true);
 }
+
+bool iommu_sev_snp_supported(void)
+{
+	/*
+	 * The SEV-SNP support requires that IOMMU must be enabled, and is
+	 * not configured in the passthrough mode.
+	 */
+	if (no_iommu || iommu_default_passthrough()) {
+		pr_err("SEV-SNP: IOMMU is either disabled or configured in passthrough mode.\n");
+		return false;
+	}
+
+	amd_iommu_snp_en = amd_iommu_snp_sup;
+	if (amd_iommu_snp_en)
+		pr_info("SNP enabled\n");
+
+	/* Enforce IOMMU v1 pagetable when SNP is enabled. */
+	if ((amd_iommu_pgtable != AMD_IOMMU_V1) &&
+	    amd_iommu_snp_en) {
+		pr_info("Force to using AMD IOMMU v1 page table due to SNP\n");
+		amd_iommu_pgtable = AMD_IOMMU_V1;
+		amd_iommu_ops.pgsize_bitmap = AMD_IOMMU_PGSIZES;
+	}
+
+	return amd_iommu_snp_en;
+}
+EXPORT_SYMBOL_GPL(iommu_sev_snp_supported);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3e1f0fa42ec3..b9dc0d4b6d77 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -70,7 +70,7 @@ LIST_HEAD(acpihid_map);
  * Domain for untranslated devices - only allocated
  * if iommu=pt passed on kernel cmd line.
  */
-const struct iommu_ops amd_iommu_ops;
+struct iommu_ops amd_iommu_ops;
 
 static ATOMIC_NOTIFIER_HEAD(ppr_notifier);
 int amd_iommu_max_glx_val = -1;
@@ -2368,7 +2368,7 @@ static int amd_iommu_
[PATCH 4/7] iommu/amd: Set translation valid bit only when IO page tables are in use
On AMD systems with SNP enabled, the IOMMU hardware checks the host
translation valid (TV) and guest translation valid (GV) bits in the
device table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use. This results in an
ILLEGAL_DEV_TABLE_ENTRY event for devices that do not have the host page
table root pointer set up.

Therefore, when SNP is enabled, only set the TV bit when DMA remapping
is in use, which is when the domain ID in the AMD IOMMU device table
entry (DTE) is non-zero.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c  |  3 ++-
 drivers/iommu/amd/iommu.c | 15 +++++++++++++--
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index da32e7bdd1fa..a9152d3f33bf 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2546,7 +2546,8 @@ static void init_device_table_dma(struct amd_iommu_pci_seg *pci_seg)
 
 	for (devid = 0; devid <= pci_seg->last_bdf; ++devid) {
 		__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_VALID);
-		__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION);
+		if (!amd_iommu_snp_en)
+			__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION);
 	}
 }
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b9dc0d4b6d77..ca4647f04382 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1552,7 +1552,14 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
 	pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
 		    << DEV_ENTRY_MODE_SHIFT;
 
-	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V;
+
+	/*
+	 * When SNP is enabled, Only set TV bit when IOMMU
+	 * page translation is in use.
+	 */
+	if (!amd_iommu_snp_en || (domain->id != 0))
+		pte_root |= DTE_FLAG_TV;
 
 	flags = dev_table[devid].data[1];
 
@@ -1612,7 +1619,11 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
 	struct dev_table_entry *dev_table = get_dev_table(iommu);
 
 	/* remove entry from the device table seen by the hardware */
-	dev_table[devid].data[0] = DTE_FLAG_V | DTE_FLAG_TV;
+	dev_table[devid].data[0] = DTE_FLAG_V;
+
+	if (!amd_iommu_snp_en)
+		dev_table[devid].data[0] |= DTE_FLAG_TV;
+
 	dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
 	amd_iommu_apply_erratum_63(iommu, devid);
--
2.32.0
[PATCH 7/7] iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled
The IOMMUv2 APIs (for supporting shared virtual memory with PASID)
configure the domain with the IOMMU v2 page table, and set DTE[Mode]=0.
This configuration cannot be supported on SNP-enabled systems.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index a9152d3f33bf..1565f0fb955a 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3435,7 +3435,12 @@ __setup("ivrs_acpihid", parse_ivrs_acpihid);
 
 bool amd_iommu_v2_supported(void)
 {
-	return amd_iommu_v2_present;
+	/*
+	 * Since DTE[Mode]=0 is prohibited on SNP-enabled system
+	 * (i.e. EFR[SNPSup]=1), IOMMUv2 page table cannot be used without
+	 * setting up IOMMUv1 page table.
+	 */
+	return amd_iommu_v2_present && !amd_iommu_snp_en;
 }
 EXPORT_SYMBOL(amd_iommu_v2_supported);
 
--
2.32.0
[PATCH 1/7] iommu/amd: Process all IVHDs before enabling IOMMU features
The ACPI IVRS table can contain multiple IVHD blocks. Each block contains
information used to initialize one IOMMU instance.

Currently, init_iommu_all() processes the IVHD blocks sequentially and
initializes the IOMMU instances one by one. However, certain features
require all IOMMUs to be configured in the same way system-wide. If
certain IVHD blocks contain inconsistent information (most likely FW
bugs), the driver needs to go back and try to revert settings on IOMMUs
that have already been configured.

A solution is to split IOMMU initialization into 2 phases:

Phase 1 processes the information in the IVRS table for all IOMMU
instances. This allows all IVHDs to be processed prior to enabling
features.

Phase 2 iterates through all IOMMU instances and enables each feature.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 8877d2a20398..6a4a019f1e1d 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1687,7 +1687,6 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 				 struct acpi_table_header *ivrs_base)
 {
 	struct amd_iommu_pci_seg *pci_seg;
-	int ret;
 
 	pci_seg = get_pci_segment(h->pci_seg, ivrs_base);
 	if (pci_seg == NULL)
@@ -1768,6 +1767,13 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 	if (!iommu->mmio_base)
 		return -ENOMEM;
 
+	return init_iommu_from_acpi(iommu, h);
+}
+
+static int __init init_iommu_one_late(struct amd_iommu *iommu)
+{
+	int ret;
+
 	if (alloc_cwwb_sem(iommu))
 		return -ENOMEM;
 
@@ -1789,10 +1795,6 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 	if (amd_iommu_pre_enabled)
 		amd_iommu_pre_enabled = translation_pre_enabled(iommu);
 
-	ret = init_iommu_from_acpi(iommu, h);
-	if (ret)
-		return ret;
-
 	if (amd_iommu_irq_remap) {
 		ret = amd_iommu_create_irq_domain(iommu);
 		if (ret)
@@ -1803,7 +1805,7 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 	 * Make sure IOMMU is not considered to translate itself. The IVRS
 	 * table tells us so, but this is a lie!
 	 */
-	pci_seg->rlookup_table[iommu->devid] = NULL;
+	iommu->pci_seg->rlookup_table[iommu->devid] = NULL;
 
 	return 0;
 }
@@ -1873,6 +1875,12 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	}
 	WARN_ON(p != end);
 
+	for_each_iommu(iommu) {
+		ret = init_iommu_one_late(iommu);
+		if (ret)
+			return ret;
+	}
+
 	return 0;
 }
 
--
2.32.0
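To illustrate why the split helps, here is a small user-space model of the
two phases. Everything in it (struct sketch_iommu, sketch_init_all(), the
boolean "SNP supported" input) is a hypothetical simplification rather than
driver code; it only shows that a system-wide decision can be taken after
all instances are parsed and before any of them is enabled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one IOMMU instance. */
struct sketch_iommu {
	bool parsed;		/* phase 1 done: IVHD data recorded */
	bool snp_sup;		/* feature bit reported for this instance */
	bool enabled;		/* phase 2 done: features enabled */
	bool snp_active;	/* system-wide decision applied here */
};

/*
 * Phase 1 only records per-instance data. The system-wide decision is made
 * once, and only then does phase 2 enable features, so no instance has to
 * be reverted when a later one reports different capabilities.
 */
bool sketch_init_all(struct sketch_iommu *iommus, size_t n,
		     const bool *ivhd_snp)
{
	bool snp_everywhere = n > 0;
	size_t i;

	for (i = 0; i < n; i++) {		/* phase 1: parse only */
		iommus[i].snp_sup = ivhd_snp[i];
		iommus[i].parsed = true;
		iommus[i].enabled = false;
		if (!iommus[i].snp_sup)
			snp_everywhere = false;
	}

	for (i = 0; i < n; i++) {		/* phase 2: enable features */
		iommus[i].snp_active = snp_everywhere;
		iommus[i].enabled = true;
	}

	return snp_everywhere;
}
```

With a single loop that parsed and enabled each instance in turn, the first
instance could already be running with the feature active by the time the
second one reported that it lacks it.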
[PATCH 2/7] iommu/amd: Introduce a global variable for tracking SNP enable status
IOMMU support for the SNP feature is detected via the EFR[SNPSup] bit.
It is also required that EFR[SNPSup] be consistent across all IOMMU
instances.

This information is needed early in the boot process (i.e. as soon as
the IOMMU driver finishes parsing the IVHDs), since it is used to
determine how the IOMMU driver configures several other IOMMU features
and data structures.

Introduce a global variable for tracking the SNP support status, which is
initialized before enabling the rest of the IOMMU features. Also emit a
warning if EFR[SNPSup] is found to be inconsistent among IOMMU instances.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 42 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 6a4a019f1e1d..3965bd3f4f67 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -164,6 +164,8 @@ static bool amd_iommu_disabled __initdata;
 static bool amd_iommu_force_enable __initdata;
 static int amd_iommu_target_ivhd_type;
 
+static bool amd_iommu_snp_sup;
+
 LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */
 LIST_HEAD(amd_iommu_list);		/* list of all AMD IOMMUs in the
 					   system */
@@ -355,7 +357,7 @@ static void iommu_set_cwwb_range(struct amd_iommu *iommu)
 	u64 start = iommu_virt_to_phys((void *)iommu->cmd_sem);
 	u64 entry = start & PM_ADDR_MASK;
 
-	if (!iommu_feature(iommu, FEATURE_SNP))
+	if (!amd_iommu_snp_sup)
 		return;
 
 	/* Note:
@@ -770,7 +772,7 @@ static void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
 	void *buf = (void *)__get_free_pages(gfp, order);
 
 	if (buf &&
-	    iommu_feature(iommu, FEATURE_SNP) &&
+	    amd_iommu_snp_sup &&
 	    set_memory_4k((unsigned long)buf, (1 << order))) {
 		free_pages((unsigned long)buf, order);
 		buf = NULL;
@@ -1836,6 +1838,37 @@ static u8 get_highest_supported_ivhd_type(struct acpi_table_header *ivrs)
 	return last_type;
 }
 
+/*
+ * SNP is enabled system-wide. So, iterate through all the IOMMUs to
+ * verify all EFR[SNPSup] bits are set, and use global variable to track
+ * whether the feature is supported.
+ */
+static void __init init_snp_global(void)
+{
+	struct amd_iommu *iommu;
+
+	for_each_iommu(iommu) {
+		if (iommu_feature(iommu, FEATURE_SNP)) {
+			amd_iommu_snp_sup = true;
+			continue;
+		}
+
+		/*
+		 * Warn and mark SNP as not supported if there is inconsistency
+		 * in any of the IOMMU.
+		 */
+		if (amd_iommu_snp_sup && !list_is_first(&iommu->list, &amd_iommu_list)) {
+			pr_err(FW_BUG "iommu%d (%04x:%02x:%02x.%01x): Found inconsistent EFR[SNPSup].\n",
+			       iommu->index, iommu->pci_seg->id, PCI_BUS_NUM(iommu->devid),
+			       PCI_SLOT(iommu->devid), PCI_FUNC(iommu->devid));
+			pr_err(FW_BUG "Disable SNP support\n");
+			amd_iommu_snp_sup = false;
+		}
+		return;
+	}
+	amd_iommu_snp_sup = true;
+}
+
 /*
  * Iterates over all IOMMU entries in the ACPI table, allocates the
  * IOMMU structure and initializes it with init_iommu_one()
@@ -1875,6 +1908,8 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	}
 	WARN_ON(p != end);
 
+	init_snp_global();
+
 	for_each_iommu(iommu) {
 		ret = init_iommu_one_late(iommu);
 		if (ret)
@@ -2095,6 +2130,9 @@ static void print_iommu_info(void)
 			if (iommu->features & FEATURE_GAM_VAPIC)
 				pr_cont(" GA_vAPIC");
 
+			if (iommu->features & FEATURE_SNP)
+				pr_cont(" SNP");
+
 			pr_cont("\n");
 		}
 	}
--
2.32.0
[PATCH 5/7] iommu: Add domain_type_supported() callback in iommu_ops
When a user requests to change the IOMMU domain to a new type, the IOMMU
generic layer checks the requested type against the default domain type
returned by the vendor-specific IOMMU driver.

However, there is only one default domain type, and the current
mechanism rejects the request if the requested type does not match the
default type.

Introduce a domain_type_supported() callback in iommu_ops, which allows
the IOMMU generic layer to check with the vendor-specific IOMMU driver
whether the requested type is supported. This allows the user to request
types other than the default type.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/iommu.c | 13 ++++++++++++-
 include/linux/iommu.h |  2 ++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f2c45b85b9fc..4afb956ce083 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1521,6 +1521,16 @@ struct iommu_group *fsl_mc_device_group(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(fsl_mc_device_group);
 
+static bool iommu_domain_type_supported(struct device *dev, int type)
+{
+	const struct iommu_ops *ops = dev_iommu_ops(dev);
+
+	if (ops->domain_type_supported)
+		return ops->domain_type_supported(dev, type);
+
+	return true;
+}
+
 static int iommu_get_def_domain_type(struct device *dev)
 {
 	const struct iommu_ops *ops = dev_iommu_ops(dev);
@@ -2937,7 +2947,8 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
 		 * domain the device was booted with
 		 */
 		type = dev_def_dom ? : iommu_def_domain_type;
-	} else if (dev_def_dom && type != dev_def_dom) {
+	} else if (!iommu_domain_type_supported(dev, type) ||
+		   (dev_def_dom && type != dev_def_dom)) {
 		dev_err_ratelimited(prev_dev, "Device cannot be in %s domain\n",
 				    iommu_domain_type_str(type));
 		ret = -EINVAL;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fecb72e1b11b..40c47ab15005 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -214,6 +214,7 @@ struct iommu_iotlb_gather {
  *		- IOMMU_DOMAIN_IDENTITY: must use an identity domain
  *		- IOMMU_DOMAIN_DMA: must use a dma domain
  *		- 0: use the default setting
+ * @domain_type_supported: check if the specified domain type is supported
  * @default_domain_ops: the default ops for domains
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  * @owner: Driver module providing these ops
@@ -252,6 +253,7 @@ struct iommu_ops {
 			     struct iommu_page_response *msg);
 
 	int (*def_domain_type)(struct device *dev);
+	bool (*domain_type_supported)(struct device *dev, int type);
 	const struct iommu_domain_ops *default_domain_ops;
 	unsigned long pgsize_bitmap;
 
--
2.32.0
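The optional-callback pattern used by iommu_domain_type_supported() can be
sketched outside the kernel as follows. The ops structure, the enum values
and the example callback are made up for the illustration; only the "call
the hook if the driver provides one, otherwise accept" logic comes from the
patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative domain type values; the kernel's real constants differ. */
enum sketch_domain_type { SKETCH_DOMAIN_IDENTITY = 1, SKETCH_DOMAIN_DMA = 2 };

/* Hypothetical, cut-down ops table with one optional callback. */
struct sketch_iommu_ops {
	bool (*domain_type_supported)(int type);
};

/* Generic layer: a driver that provides no callback accepts every type. */
bool sketch_domain_type_supported(const struct sketch_iommu_ops *ops, int type)
{
	if (ops->domain_type_supported)
		return ops->domain_type_supported(type);
	return true;
}

/* Example driver callback: refuse identity (pass-through) domains. */
bool sketch_no_identity(int type)
{
	return type != SKETCH_DOMAIN_IDENTITY;
}
```

Defaulting to true keeps drivers that never set the hook behaving exactly
as before, which is presumably why the generic helper is written that way.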
[PATCH 0/7] iommu/amd: Enforce IOMMU restrictions for SNP-enabled system
SNP-enabled systems require the IOMMU v1 page table to be configured with
a non-zero DTE[Mode] for DMA-capable devices. This affects a number of
use cases, such as IOMMU pass-through mode and the AMD IOMMUv2 APIs for
binding/unbinding PASIDs.

The series introduces a global variable to check the SNP-enabled state
during driver initialization, and uses it to enforce the SNP restrictions
at runtime.

Also, for non-DMA-capable devices such as the IOAPIC, the recommendation
is to set DTE[TV] and DTE[Mode] to zero on SNP-enabled systems.
Therefore, additional checks are added before setting DTE[TV].

Testing:
  - Tested booting and verified dmesg.
  - Tested booting with iommu=pt
  - Tested loading amd_iommu_v2 driver
  - Tested changing the iommu domain at runtime
  - Tested booting SEV/SNP-enabled guest

Pre-requisite:
  - [PATCH v3 00/35] iommu/amd: Add multiple PCI segments support
    https://lore.kernel.org/linux-iommu/20220511072141.15485-29-vasant.he...@amd.com/T/

Note:
  - Previously discussed here:
    [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used
    https://www.spinics.net/lists/kernel/msg4351005.html

Best Regards,
Suravee

Brijesh Singh (1):
  iommu/amd: Introduce function to check SEV-SNP support

Suravee Suthikulpanit (6):
  iommu/amd: Process all IVHDs before enabling IOMMU features
  iommu/amd: Introduce a global variable for tracking SNP enable status
  iommu/amd: Set translation valid bit only when IO page tables are in use
  iommu: Add domain_type_supported() callback in iommu_ops
  iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY when SNP is enabled
  iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled

 drivers/iommu/amd/amd_iommu_types.h |  11 +++
 drivers/iommu/amd/init.c            | 111 +++-
 drivers/iommu/amd/iommu.c           |  31 +++-
 drivers/iommu/iommu.c               |  13 +++-
 include/linux/iommu.h               |  11 +++
 5 files changed, 153 insertions(+), 24 deletions(-)

--
2.32.0
Re: [PATCH RFC 10/19] iommu/amd: Add unmap_read_dirty() support
On 4/29/22 4:09 AM, Joao Martins wrote:
> AMD implementation of unmap_read_dirty() is pretty simple as
> mostly reuses unmap code with the extra addition of marshalling
> the dirty bit into the bitmap as it walks the to-be-unmapped
> IOPTE.
>
> Extra care is taken though, to switch over to cmpxchg as opposed
> to a non-serialized store to the PTE and testing the dirty bit
> only set until cmpxchg succeeds to set to 0.
>
> Signed-off-by: Joao Martins
> ---
>  drivers/iommu/amd/io_pgtable.c | 44 +-
>  drivers/iommu/amd/iommu.c      | 22 +
>  2 files changed, 60 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
> index 8325ef193093..1868c3b58e6d 100644
> --- a/drivers/iommu/amd/io_pgtable.c
> +++ b/drivers/iommu/amd/io_pgtable.c
> @@ -355,6 +355,16 @@ static void free_clear_pte(u64 *pte, u64 pteval, struct list_head *freelist)
>  	free_sub_pt(pt, mode, freelist);
>  }
>
> +static bool free_pte_dirty(u64 *pte, u64 pteval)

Nitpick: Since we are freeing and clearing the dirty bit, should we
change the function name to free_clear_pte_dirty()?

> +{
> +	bool dirty = false;
> +
> +	while (IOMMU_PTE_DIRTY(cmpxchg64(pte, pteval, 0)))

We should use 0ULL instead of 0.

> +		dirty = true;
> +
> +	return dirty;
> +}
> +

Actually, what do you think if we enhance the current free_clear_pte()
to also handle the dirty check?

>  /*
>   * Generic mapping functions. It maps a physical address into a DMA
>   * address space. It allocates the page table pages if necessary.
> @@ -428,10 +438,11 @@ static int iommu_v1_map_page(struct io_pgtable_ops *ops, unsigned long iova,
>  	return ret;
>  }
>
> -static unsigned long iommu_v1_unmap_page(struct io_pgtable_ops *ops,
> -					 unsigned long iova,
> -					 size_t size,
> -					 struct iommu_iotlb_gather *gather)
> +static unsigned long __iommu_v1_unmap_page(struct io_pgtable_ops *ops,
> +					   unsigned long iova,
> +					   size_t size,
> +					   struct iommu_iotlb_gather *gather,
> +					   struct iommu_dirty_bitmap *dirty)
>  {
>  	struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
>  	unsigned long long unmapped;
> @@ -445,11 +456,15 @@ static unsigned long iommu_v1_unmap_page(struct io_pgtable_ops *ops,
>  	while (unmapped < size) {
>  		pte = fetch_pte(pgtable, iova, &unmap_size);
>  		if (pte) {
> -			int i, count;
> +			unsigned long i, count;
> +			bool pte_dirty = false;
>
>  			count = PAGE_SIZE_PTE_COUNT(unmap_size);
>  			for (i = 0; i < count; i++)
> -				pte[i] = 0ULL;
> +				pte_dirty |= free_pte_dirty(&pte[i], pte[i]);
> +

Actually, what if we change the existing free_clear_pte() to
free_and_clear_dirty_pte(), and incorporate the logic for

...

> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 0a86392b2367..a8fcb6e9a684 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2144,6 +2144,27 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
>  	return r;
>  }
>
> +static size_t amd_iommu_unmap_read_dirty(struct iommu_domain *dom,
> +					 unsigned long iova, size_t page_size,
> +					 struct iommu_iotlb_gather *gather,
> +					 struct iommu_dirty_bitmap *dirty)
> +{
> +	struct protection_domain *domain = to_pdomain(dom);
> +	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
> +	size_t r;
> +
> +	if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
> +	    (domain->iop.mode == PAGE_MODE_NONE))
> +		return 0;
> +
> +	r = (ops->unmap_read_dirty) ?
> +		ops->unmap_read_dirty(ops, iova, page_size, gather, dirty) : 0;
> +
> +	amd_iommu_iotlb_gather_add_page(dom, gather, iova, page_size);
> +
> +	return r;
> +}
> +

Instead of creating a new function, what if we enhance the current
amd_iommu_unmap() to also handle the read-dirty part (e.g.
__amd_iommu_unmap_read_dirty()), and then both amd_iommu_unmap() and
amd_iommu_unmap_read_dirty() can call __amd_iommu_unmap_read_dirty()?

Best Regards,
Suravee
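For reference, the "clear the PTE atomically and report whether it was
dirty" idea discussed above can be written as a conventional
compare-and-exchange retry loop. The sketch below is user-space C11 and
uses <stdatomic.h> as a stand-in for the kernel's cmpxchg64(); the dirty-bit
position and the function name are assumptions for illustration, and this
is not the code from the RFC.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative dirty-bit position; an assumption, not the driver's macro. */
#define SKETCH_PTE_DIRTY (1ULL << 6)

/*
 * Clear a PTE with compare-and-exchange and report whether the value that
 * was actually replaced had the dirty bit set. On failure, C11 reloads
 * 'old' with the current contents, so a dirty bit set concurrently between
 * the load and the exchange is not lost.
 */
bool sketch_clear_pte_read_dirty(_Atomic uint64_t *pte)
{
	uint64_t old = atomic_load(pte);

	while (!atomic_compare_exchange_weak(pte, &old, 0ULL))
		;	/* 'old' now holds the latest value; retry */

	return (old & SKETCH_PTE_DIRTY) != 0;
}
```

The dirty test is applied to the value that the successful exchange
replaced, rather than to a value read before the exchange.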
Re: [PATCH RFC 09/19] iommu/amd: Access/Dirty bit support in IOPTEs
Joao,

On 4/29/22 4:09 AM, Joao Martins wrote:
> ...
> +static int amd_iommu_set_dirty_tracking(struct iommu_domain *domain,
> +					bool enable)
> +{
> +	struct protection_domain *pdomain = to_pdomain(domain);
> +	struct iommu_dev_data *dev_data;
> +	bool dom_flush = false;
> +
> +	if (!amd_iommu_had_support)
> +		return -EOPNOTSUPP;
> +
> +	list_for_each_entry(dev_data, &pdomain->dev_list, list) {

Since we iterate through the device list for the domain, we would need to
call spin_lock_irqsave(&pdomain->lock, flags) here.

> +		struct amd_iommu *iommu;
> +		u64 pte_root;
> +
> +		iommu = amd_iommu_rlookup_table[dev_data->devid];
> +		pte_root = amd_iommu_dev_table[dev_data->devid].data[0];
> +
> +		/* No change? */
> +		if (!(enable ^ !!(pte_root & DTE_FLAG_HAD)))
> +			continue;
> +
> +		pte_root = (enable ?
> +			pte_root | DTE_FLAG_HAD : pte_root & ~DTE_FLAG_HAD);
> +
> +		/* Flush device DTE */
> +		amd_iommu_dev_table[dev_data->devid].data[0] = pte_root;
> +		device_flush_dte(dev_data);
> +		dom_flush = true;
> +	}
> +
> +	/* Flush IOTLB to mark IOPTE dirty on the next translation(s) */
> +	if (dom_flush) {
> +		unsigned long flags;
> +
> +		spin_lock_irqsave(&pdomain->lock, flags);
> +		amd_iommu_domain_flush_tlb_pde(pdomain);
> +		amd_iommu_domain_flush_complete(pdomain);
> +		spin_unlock_irqrestore(&pdomain->lock, flags);
> +	}

And call spin_unlock_irqrestore(&pdomain->lock, flags) here.

> +
> +	return 0;
> +}
> +
> +static bool amd_iommu_get_dirty_tracking(struct iommu_domain *domain)
> +{
> +	struct protection_domain *pdomain = to_pdomain(domain);
> +	struct iommu_dev_data *dev_data;
> +	u64 dte;
> +

Also call spin_lock_irqsave(&pdomain->lock, flags) here.

> +	list_for_each_entry(dev_data, &pdomain->dev_list, list) {
> +		dte = amd_iommu_dev_table[dev_data->devid].data[0];
> +		if (!(dte & DTE_FLAG_HAD))
> +			return false;
> +	}
> +

And call spin_unlock_irqrestore(&pdomain->lock, flags) here. Note that the
early "return false" inside the loop would also need to release the lock.

> +	return true;
> +}
> +
> +static int amd_iommu_read_and_clear_dirty(struct iommu_domain *domain,
> +					  unsigned long iova, size_t size,
> +					  struct iommu_dirty_bitmap *dirty)
> +{
> +	struct protection_domain *pdomain = to_pdomain(domain);
> +	struct io_pgtable_ops *ops = &pdomain->iop.iop.ops;
> +
> +	if (!amd_iommu_get_dirty_tracking(domain))
> +		return -EOPNOTSUPP;
> +
> +	if (!ops || !ops->read_and_clear_dirty)
> +		return -ENODEV;

We should move this check before the call to amd_iommu_get_dirty_tracking().

Best Regards,
Suravee

> +
> +	return ops->read_and_clear_dirty(ops, iova, size, dirty);
> +}
> +
> +
>  static void amd_iommu_get_resv_regions(struct device *dev,
>  				       struct list_head *head)
>  {
> @@ -2293,6 +2368,8 @@ const struct iommu_ops amd_iommu_ops = {
>  		.flush_iotlb_all = amd_iommu_flush_iotlb_all,
>  		.iotlb_sync = amd_iommu_iotlb_sync,
>  		.free = amd_iommu_domain_free,
> +		.set_dirty_tracking = amd_iommu_set_dirty_tracking,
> +		.read_and_clear_dirty = amd_iommu_read_and_clear_dirty,
>  	}
>  };
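The "No change?" test in the quoted hunk can be looked at in isolation. The following is only a userspace sketch of that toggle logic, not the driver code: the name sketch_set_had() is hypothetical, and the bit position used for the HAD field is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the HAD field; illustrative only, not taken from the thread. */
#define SKETCH_DTE_FLAG_HAD	(3ULL << 7)

/*
 * Update the HAD field of one DTE word. Returns true only when the word
 * actually changed, which is the case where the caller would have to
 * flush the DTE and, later, the IOTLB (the dom_flush flag in the patch).
 */
static bool sketch_set_had(uint64_t *dte, bool enable)
{
	bool enabled = (*dte & SKETCH_DTE_FLAG_HAD) != 0;

	/* Same meaning as !(enable ^ !!(pte_root & DTE_FLAG_HAD)) */
	if (enabled == enable)
		return false;

	if (enable)
		*dte |= SKETCH_DTE_FLAG_HAD;
	else
		*dte &= ~SKETCH_DTE_FLAG_HAD;

	return true;
}
```

Writing the comparison as enabled == enable expresses the same condition as the XOR form without the double negation.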
Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used
Joerg,

On 5/20/22 3:09 PM, Joerg Roedel wrote:
> Hi Suravee,
>
> On Mon, May 16, 2022 at 07:27:51PM +0700, Suravee Suthikulpanit wrote:
>> - Also, it seems that the current iommu v2 page table use case,
>>   where GVA->GPA=SPA will no longer be supported on system w/
>>   SNPSup=1. Any thoughts?
>
> Support for that is not upstream yet, it should be easy to disallow this
> configuration and just use the v1 page-tables when SNP is active. This
> can be handled entirely inside the AMD IOMMU driver.

Actually, I am referring to the case where a user uses the IOMMU v2 page
table for shared virtual addressing in the current iommu_v2 driver (e.g.
amd_iommu_init_device(), amd_iommu_bind_pasid()).

Best Regards,
Suravee
Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used
Joerg,

On 5/13/22 8:07 PM, Joerg Roedel wrote:
> On Mon, May 09, 2022 at 02:48:15AM -0500, Suravee Suthikulpanit wrote:
>> On AMD system with SNP enabled, IOMMU hardware checks the host translation
>> valid (TV) and guest translation valid (GV) bits in the device
>> table entry (DTE) before accessing the corresponding page tables.
>>
>> However, the current IOMMU driver sets the TV bit for all devices
>> regardless of whether the host page table is in use.
>> This results in ILLEGAL_DEV_TABLE_ENTRY events for devices that
>> do not have the host page table root pointer set up.
>
> Hmm, this sounds weird. In the early AMD IOMMUs it was recommended to set
> TV=1 and V=1 and the rest to 0 to block all DMA from a device.
>
> I wonder how this triggers ILLEGAL_DEV_TABLE_ENTRY errors now. It is
> (was?) legal to set V=1 TV=1, mode=0 and leave the page-table empty.

Due to the new restriction (please see the IOMMU spec Rev 3.06-PUB - Apr 2021
https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf), where the use of
DTE[Mode]=0 is not supported on systems that are SNP-enabled (i.e.
EFR[SNPSup]=1), the IOMMU HW looks at the DTE[TV] bit to determine whether it
needs to handle the v1 page table. When the HW encounters a DTE with TV=1,
V=1, Mode=0, it generates an ILLEGAL_DEV_TABLE_ENTRY event.

Note: I am following up with the HW folks on an updated document for this
specific detail.

Therefore, we need to modify the IOMMU driver as follows:

- For non-DMA devices (e.g. the IOAPIC devices), we need to modify the
  IOMMU driver to default to DTE[TV]=0. For Linux, this is equivalent to a
  DTE with domain ID 0.

- I am still trying to see what is the best way to force Linux to not allow
  Mode=0 (i.e. iommu=pt mode). Any thoughts?

- Also, it seems that the current iommu v2 page table use case, where
  GVA->GPA=SPA, will no longer be supported on systems w/ SNPSup=1. Any
  thoughts?

> When IW=0 and IR=0, DMA is blocked. From what I remember this is a
> valid setting in a DTE.

Correct.

> Do you have an example DTE which triggers this error message?

This is specifically from the device representing an IOAPIC.

[ +0.000108] iommu ivhd0: AMD-Vi: Event logged [ILLEGAL_DEV_TABLE_ENTRY device=c0:00.1 pasid=0x0 address=0xfffdf814 flags=0x0008]
[ +0.11] AMD-Vi: DTE[0]: 0003
[ +0.03] AMD-Vi: DTE[1]:
[ +0.02] AMD-Vi: DTE[2]: 2008000100258013
[ +0.01] AMD-Vi: DTE[3]:

Best Regards,
Suravee
[PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used
On AMD systems with SNP enabled, the IOMMU hardware checks the host
translation valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use. This results in
ILLEGAL_DEV_TABLE_ENTRY events for devices that do not have the host page
table root pointer set up.

Therefore, only set the TV bit when DMA remapping is in use, which is when
the domain ID in the AMD IOMMU device table entry (DTE) is non-zero.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c  | 4 +---
 drivers/iommu/amd/iommu.c | 8 ++++++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 648d6b94ba8c..6a2dadf2b2dc 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2336,10 +2336,8 @@ static void init_device_table_dma(void)
 {
 	u32 devid;

-	for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
+	for (devid = 0; devid <= amd_iommu_last_bdf; ++devid)
 		set_dev_entry_bit(devid, DEV_ENTRY_VALID);
-		set_dev_entry_bit(devid, DEV_ENTRY_TRANSLATION);
-	}
 }

 static void __init uninit_device_table_dma(void)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a1ada7bff44e..cea254968f06 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1473,7 +1473,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain,
 	pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
 		    << DEV_ENTRY_MODE_SHIFT;

-	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V;

 	flags = amd_iommu_dev_table[devid].data[1];

@@ -1513,6 +1513,10 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain,
 		flags    |= tmp;
 	}

+	/* Only set TV bit when IOMMU page translation is in use */
+	if (domain->id != 0)
+		pte_root |= DTE_FLAG_TV;
+
 	flags &= ~DEV_DOMID_MASK;
 	flags |= domain->id;

@@ -1535,7 +1539,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain,
 static void clear_dte_entry(u16 devid)
 {
 	/* remove entry from the device table seen by the hardware */
-	amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
+	amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V;
 	amd_iommu_dev_table[devid].data[1] &= DTE_FLAG_MASK;

 	amd_iommu_apply_erratum_63(devid);
--
2.25.1
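The effect of the v2 change on the low DTE flags can be illustrated with a small standalone sketch. This is not the driver's set_dte_entry(); the function name is hypothetical, and the bit positions (V in bit 0, TV in bit 1) are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SKETCH_DTE_FLAG_V	(1ULL << 0)
#define SKETCH_DTE_FLAG_TV	(1ULL << 1)

/*
 * Build the V/TV part of DTE word 0 following the rule in the patch:
 * V is always set, TV only when the device is attached to a domain
 * with a non-zero ID (i.e. IOMMU page translation is in use).
 */
static uint64_t sketch_dte_low_flags(uint16_t domain_id)
{
	uint64_t dte = SKETCH_DTE_FLAG_V;

	if (domain_id != 0)
		dte |= SKETCH_DTE_FLAG_TV;

	return dte;
}
```

Under these assumptions a device left in domain 0, such as the IOAPIC in the reported event, ends up with V=1 and TV=0.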
Re: [PATCH] iommu/amd: Set translation valid bit only when IO page tables are in used
On 4/20/22 6:29 PM, Suravee Suthikulpanit wrote:
> On AMD systems with SNP enabled, the IOMMU hardware checks the host
> translation valid (TV) and guest translation valid (GV) bits in the device
> table entry (DTE) before accessing the corresponding page tables.
>
> However, the current IOMMU driver sets the TV bit for all devices
> regardless of whether the host page table is in use. This results in
> ILLEGAL_DEV_TABLE_ENTRY events for devices that do not have the host page
> table root pointer set up.
>
> Therefore, only set the TV bit when host or guest page tables are in use.
>
> Signed-off-by: Suravee Suthikulpanit

I found a bug in this patch. I will send out v2 with the fix.

Regards,
Suravee
[PATCH] iommu/amd: Set translation valid bit only when IO page tables are in used
On AMD systems with SNP enabled, the IOMMU hardware checks the host
translation valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use. This results in
ILLEGAL_DEV_TABLE_ENTRY events for devices that do not have the host page
table root pointer set up.

Therefore, only set the TV bit when host or guest page tables are in use.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c  |  4 +---
 drivers/iommu/amd/iommu.c | 13 +++++++++++--
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index b4a798c7b347..4f483f22e58c 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2337,10 +2337,8 @@ static void init_device_table_dma(void)
 {
 	u32 devid;

-	for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
+	for (devid = 0; devid <= amd_iommu_last_bdf; ++devid)
 		set_dev_entry_bit(devid, DEV_ENTRY_VALID);
-		set_dev_entry_bit(devid, DEV_ENTRY_TRANSLATION);
-	}
 }

 static void __init uninit_device_table_dma(void)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a1ada7bff44e..6dd35998e53c 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1473,7 +1473,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain,
 	pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
 		    << DEV_ENTRY_MODE_SHIFT;

-	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V;

 	flags = amd_iommu_dev_table[devid].data[1];

@@ -1513,6 +1513,15 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain,
 		flags    |= tmp;
 	}

+	/*
+	 * Only set TV bit when:
+	 *   - IOMMUv1 table is in use.
+	 *   - IOMMUv2 table is in use.
+	 */
+	if ((domain->iop.mode != PAGE_MODE_NONE) ||
+	    (domain->flags & PD_IOMMUV2_MASK))
+		pte_root |= DTE_FLAG_TV;
+
 	flags &= ~DEV_DOMID_MASK;
 	flags |= domain->id;

@@ -1535,7 +1544,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain,
 static void clear_dte_entry(u16 devid)
 {
 	/* remove entry from the device table seen by the hardware */
-	amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
+	amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V;
 	amd_iommu_dev_table[devid].data[1] &= DTE_FLAG_MASK;

 	amd_iommu_apply_erratum_63(devid);
--
2.25.1
[PATCH] iommu/amd: Do not call sleep while holding spinlock
The Smatch static checker warns:

	drivers/iommu/amd/iommu_v2.c:133 free_device_state()
	warn: sleeping in atomic context

Fix this by moving each struct device_state onto a temporary list while
holding the spinlock, and freeing the memory only after the spinlock has
been released.

Reported-by: Dan Carpenter
Fixes: dc6a709e5123 ("iommu/amd: Improve amd_iommu_v2_exit()")
Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/iommu_v2.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index 490da41c3c71..5a6e4f87d875 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -947,6 +947,7 @@ static void __exit amd_iommu_v2_exit(void)
 {
 	struct device_state *dev_state, *next;
 	unsigned long flags;
+	LIST_HEAD(freelist);

 	if (!amd_iommu_v2_supported())
 		return;
@@ -966,11 +967,20 @@ static void __exit amd_iommu_v2_exit(void)

 		put_device_state(dev_state);
 		list_del(&dev_state->list);
-		free_device_state(dev_state);
+		list_add_tail(&dev_state->list, &freelist);
 	}

 	spin_unlock_irqrestore(&state_lock, flags);

+	/*
+	 * Since free_device_state waits on the count to be zero,
+	 * we need to free dev_state outside the spinlock.
+	 */
+	list_for_each_entry_safe(dev_state, next, &freelist, list) {
+		list_del(&dev_state->list);
+		free_device_state(dev_state);
+	}
+
 	destroy_workqueue(iommu_wq);
 }

--
2.17.1
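The pattern used by this fix (detach entries under the lock, release them after unlocking) can be sketched in userspace. This is only an illustration under stated assumptions: a pthread mutex stands in for the kernel spinlock, a hand-rolled singly linked list stands in for list_head, and every sketch_* name is hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

struct sketch_node {
	struct sketch_node *next;
	int id;
};

static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;
static struct sketch_node *sketch_live;	/* protected by sketch_lock */

/* Register one entry on the shared list. Returns 0 on success, -1 on OOM. */
static int sketch_add(int id)
{
	struct sketch_node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->id = id;

	pthread_mutex_lock(&sketch_lock);
	n->next = sketch_live;
	sketch_live = n;
	pthread_mutex_unlock(&sketch_lock);
	return 0;
}

/* Stand-in for free_device_state(): may block, so never call it locked. */
static void sketch_release(struct sketch_node *n)
{
	free(n);
}

/* Tear everything down; returns the number of entries released. */
static int sketch_teardown(void)
{
	struct sketch_node *freelist, *n;
	int freed = 0;

	/* Step 1: detach all entries onto a private list while locked. */
	pthread_mutex_lock(&sketch_lock);
	freelist = sketch_live;
	sketch_live = NULL;
	pthread_mutex_unlock(&sketch_lock);

	/* Step 2: release them with the lock dropped. */
	while (freelist) {
		n = freelist;
		freelist = n->next;
		sketch_release(n);
		freed++;
	}
	return freed;
}
```

The private list is only reachable by the caller once the lock is dropped, so the potentially blocking release step needs no further locking.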
[PATCH] iommu/amd: Fix I/O page table memory leak
The current logic updates the I/O page table mode for the domain
before calling the logic to free the memory used for the page table.
This results in an IOMMU page table memory leak, which can be observed
when launching a VM with pass-through devices.

Fix this by freeing the memory used for the page table before updating
the mode.

Cc: Joerg Roedel
Reported-by: Daniel Jordan
Tested-by: Daniel Jordan
Signed-off-by: Suravee Suthikulpanit
Fixes: e42ba0633064 ("iommu/amd: Restructure code for freeing page table")
Link: https://lore.kernel.org/all/20220118194720.urjgi73b7c3tq...@oracle.com/
---
 drivers/iommu/amd/io_pgtable.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index b1bf4125b0f7..6608d1717574 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -492,18 +492,18 @@ static void v1_free_pgtable(struct io_pgtable *iop)

 	dom = container_of(pgtable, struct protection_domain, iop);

-	/* Update data structure */
-	amd_iommu_domain_clr_pt_root(dom);
-
-	/* Make changes visible to IOMMUs */
-	amd_iommu_domain_update(dom);
-
 	/* Page-table is not visible to IOMMU anymore, so free it */
 	BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
 	       pgtable->mode > PAGE_MODE_6_LEVEL);

 	free_sub_pt(pgtable->root, pgtable->mode, &freelist);

+	/* Update data structure */
+	amd_iommu_domain_clr_pt_root(dom);
+
+	/* Make changes visible to IOMMUs */
+	amd_iommu_domain_update(dom);
+
 	put_pages_list(&freelist);
 }

--
2.17.1
[PATCH 3/3] iommu/amd: Remove iommu_init_ga()
Since iommu_init_ga() has been simplified and now only calls
iommu_init_ga_log(), remove it and call iommu_init_ga_log() directly
instead.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index ea3330ed545d..5ec683675ff0 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -827,9 +827,9 @@ static int iommu_ga_log_enable(struct amd_iommu *iommu)
 	return 0;
 }

-#ifdef CONFIG_IRQ_REMAP
 static int iommu_init_ga_log(struct amd_iommu *iommu)
 {
+#ifdef CONFIG_IRQ_REMAP
 	u64 entry;

 	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
@@ -859,18 +859,9 @@ static int iommu_init_ga_log(struct amd_iommu *iommu)
 err_out:
 	free_ga_log(iommu);
 	return -EINVAL;
-}
-#endif /* CONFIG_IRQ_REMAP */
-
-static int iommu_init_ga(struct amd_iommu *iommu)
-{
-	int ret = 0;
-
-#ifdef CONFIG_IRQ_REMAP
-	ret = iommu_init_ga_log(iommu);
+#else
+	return 0;
 #endif /* CONFIG_IRQ_REMAP */
-
-	return ret;
 }

 static int __init alloc_cwwb_sem(struct amd_iommu *iommu)
@@ -1852,7 +1843,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
 	if (iommu_feature(iommu, FEATURE_PPR) && alloc_ppr_log(iommu))
 		return -ENOMEM;

-	ret = iommu_init_ga(iommu);
+	ret = iommu_init_ga_log(iommu);
 	if (ret)
 		return ret;

--
2.17.1
[PATCH 1/3] iommu/amd: Introduce helper function to check feature bit on all IOMMUs
The IOMMU advertises its features via the Extended Features Register (EFR).
The helper function checks whether the specified feature bit is set on
every IOMMU in the system.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 46280e6e1535..c97961451ac5 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -298,6 +298,19 @@ int amd_iommu_get_num_iommus(void)
 	return amd_iommus_present;
 }

+static bool check_feature_on_all_iommus(u64 mask)
+{
+	bool ret = false;
+	struct amd_iommu *iommu;
+
+	for_each_iommu(iommu) {
+		ret = iommu_feature(iommu, mask);
+		if (!ret)
+			return false;
+	}
+
+	return true;
+}
 /*
  * For IVHD type 0x11/0x40, EFR is also available via IVHD.
  * Default to IVHD EFR since it is available sooner
--
2.17.1
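The helper's behavior can be mirrored in a standalone sketch, assuming each IOMMU's EFR value is available as a plain 64-bit word in an array. The function name and the array representation are hypothetical; the real helper walks the driver's IOMMU list with for_each_iommu().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true only if every EFR word in efr[0..n) has at least one bit
 * of mask set. As with the helper in the patch, an empty set of IOMMUs
 * yields true, so callers should not treat the result as proof that any
 * IOMMU exists.
 */
static bool sketch_feature_on_all(const uint64_t *efr, size_t n, uint64_t mask)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!(efr[i] & mask))
			return false;	/* one IOMMU lacks the feature */
	}

	return true;
}
```

A global decision such as whether to advertise a capability has to hold for all units, which is why a single IOMMU without the bit is enough to return false.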
[PATCH 2/3] iommu/amd: Relocate GAMSup check to early_enable_iommus
From: Wei Huang

Currently, iommu_init_ga() checks and disables IOMMU VAPIC support
(i.e. AMD AVIC support in IOMMU) when the GAMSup feature bit is not set.
However, it forgets to clear IRQ_POSTING_CAP from the previously set
amd_iommu_irq_ops.capability.

This triggers an invalid page fault bug during guest VM warm reboot
if AVIC is enabled, since irq_remapping_cap(IRQ_POSTING_CAP) is
incorrectly set, and crashes the system with the following kernel trace.

    BUG: unable to handle page fault for address: 00400dd8
    RIP: 0010:amd_iommu_deactivate_guest_mode+0x19/0xbc
    Call Trace:
     svm_set_pi_irte_mode+0x8a/0xc0 [kvm_amd]
     ? kvm_make_all_cpus_request_except+0x50/0x70 [kvm]
     kvm_request_apicv_update+0x10c/0x150 [kvm]
     svm_toggle_avic_for_irq_window+0x52/0x90 [kvm_amd]
     svm_enable_irq_window+0x26/0xa0 [kvm_amd]
     vcpu_enter_guest+0xbbe/0x1560 [kvm]
     ? avic_vcpu_load+0xd5/0x120 [kvm_amd]
     ? kvm_arch_vcpu_load+0x76/0x240 [kvm]
     ? svm_get_segment_base+0xa/0x10 [kvm_amd]
     kvm_arch_vcpu_ioctl_run+0x103/0x590 [kvm]
     kvm_vcpu_ioctl+0x22a/0x5d0 [kvm]
     __x64_sys_ioctl+0x84/0xc0
     do_syscall_64+0x33/0x40
     entry_SYSCALL_64_after_hwframe+0x44/0xae

Fix this by moving the initialization of the AMD IOMMU interrupt remapping
mode (amd_iommu_guest_ir) earlier, before amd_iommu_irq_ops.capability is
set up with the appropriate IRQ_POSTING_CAP flag.

Signed-off-by: Wei Huang
Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index c97961451ac5..ea3330ed545d 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -867,13 +867,6 @@ static int iommu_init_ga(struct amd_iommu *iommu)
 	int ret = 0;

 #ifdef CONFIG_IRQ_REMAP
-	/* Note: We have already checked GASup from IVRS table.
-	 *       Now, we need to make sure that GAMSup is set.
-	 */
-	if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) &&
-	    !iommu_feature(iommu, FEATURE_GAM_VAPIC))
-		amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY_GA;
-
 	ret = iommu_init_ga_log(iommu);
 #endif /* CONFIG_IRQ_REMAP */

@@ -2490,6 +2483,14 @@ static void early_enable_iommus(void)
 	}

 #ifdef CONFIG_IRQ_REMAP
+	/*
+	 * Note: We have already checked GASup from IVRS table.
+	 *       Now, we need to make sure that GAMSup is set.
+	 */
+	if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) &&
+	    !check_feature_on_all_iommus(FEATURE_GAM_VAPIC))
+		amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY_GA;
+
 	if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
 		amd_iommu_irq_ops.capability |= (1 << IRQ_POSTING_CAP);
 #endif
--
2.17.1
[PATCH 0/3] iommu/amd: Fix unable to handle page fault due to AVIC
This bug is triggered when rebooting a VM on a system where SVM AVIC is
enabled but IOMMU AVIC is disabled in the BIOS.

The series reworks the interrupt remapping initialization to check for
IOMMU AVIC support (GAMSup) at an earlier stage, using the EFR provided by
the IVRS table instead of the PCI MMIO register, which only becomes
available after PCI support for the IOMMU is initialized. This avoids having
to disable and clean up already initialized interrupt-remapping-related
parameters.

Thanks,
Suravee

Suravee Suthikulpanit (2):
  iommu/amd: Introduce helper function to check feature bit on all IOMMUs
  iommu/amd: Remove iommu_init_ga()

Wei Huang (1):
  iommu/amd: Relocate GAMSup check to early_enable_iommus

 drivers/iommu/amd/init.c | 45 +++++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 20 deletions(-)

--
2.17.1
[PATCH] MAINTAINERS: Add Suravee Suthikulpanit as Reviewer for AMD IOMMU (AMD-Vi)
To help review changes related to AMD IOMMU.

Signed-off-by: Suravee Suthikulpanit
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index b80e6f7..8022dbd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -933,6 +933,7 @@ F:	drivers/video/fbdev/geode/

 AMD IOMMU (AMD-VI)
 M:	Joerg Roedel
+R:	Suravee Suthikulpanit
 L:	iommu@lists.linux-foundation.org
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
--
1.8.3.1
[PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating
On certain AMD platforms, when the IOMMU performance counter source
(csource) field is zero, power-gating for the counter is enabled, which
prevents write access and returns zero for read access.

This can cause invalid perf results, especially when event multiplexing
is needed (i.e. there are more events than available counters), since
the current logic keeps track of the previously read counter value
and subsequently re-programs the counter to continue counting the event.
With power-gating enabled, we cannot guarantee successful re-programming
of the counter.

Work around this issue by:

1. Modifying the ordering of setting/reading counters and enabling/
   disabling csources so that the counter is only accessed when the
   csource is set to non-zero.

2. Since the AMD IOMMU PMU does not support interrupt mode, simplifying
   the logic to always start counting from zero and to accumulate the
   counter value when stopping, without the need to keep track of and
   reprogram the counter with the previously read counter value.

This has been tested on systems with and without power-gating.

Fixes: 994d6608efe4 ("iommu/amd: Remove performance counter pre-initialization test")
Suggested-by: Alexander Monakov
Cc: David Coe
Signed-off-by: Suravee Suthikulpanit
---
 arch/x86/events/amd/iommu.c | 47 ++++++++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
index 1c1a7e45dc64..913745f1419b 100644
--- a/arch/x86/events/amd/iommu.c
+++ b/arch/x86/events/amd/iommu.c
@@ -19,8 +19,6 @@
 #include "../perf_event.h"
 #include "iommu.h"

-#define COUNTER_SHIFT		16
-
 /* iommu pmu conf masks */
 #define GET_CSOURCE(x)     ((x)->conf & 0xFFULL)
 #define GET_DEVID(x)       (((x)->conf >> 8)  & 0xFFFFULL)
@@ -286,22 +284,31 @@ static void perf_iommu_start(struct perf_event *event, int flags)
 	WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
 	hwc->state = 0;

+	/*
+	 * To account for power-gating, which prevents write to
+	 * the counter, we need to enable the counter
+	 * before setting up counter register.
+	 */
+	perf_iommu_enable_event(event);
+
 	if (flags & PERF_EF_RELOAD) {
-		u64 prev_raw_count = local64_read(&hwc->prev_count);
+		u64 count = 0;
 		struct amd_iommu *iommu = perf_event_2_iommu(event);

+		/*
+		 * Since the IOMMU PMU only support counting mode,
+		 * the counter always start with value zero.
+		 */
 		amd_iommu_pc_set_reg(iommu, hwc->iommu_bank, hwc->iommu_cntr,
-				     IOMMU_PC_COUNTER_REG, &prev_raw_count);
+				     IOMMU_PC_COUNTER_REG, &count);
 	}

-	perf_iommu_enable_event(event);
 	perf_event_update_userpage(event);
-
 }

 static void perf_iommu_read(struct perf_event *event)
 {
-	u64 count, prev, delta;
+	u64 count;
 	struct hw_perf_event *hwc = &event->hw;
 	struct amd_iommu *iommu = perf_event_2_iommu(event);

@@ -312,14 +319,11 @@ static void perf_iommu_read(struct perf_event *event)
 	/* IOMMU pc counter register is only 48 bits */
 	count &= GENMASK_ULL(47, 0);

-	prev = local64_read(&hwc->prev_count);
-	if (local64_cmpxchg(&hwc->prev_count, prev, count) != prev)
-		return;
-
-	/* Handle 48-bit counter overflow */
-	delta = (count << COUNTER_SHIFT) - (prev << COUNTER_SHIFT);
-	delta >>= COUNTER_SHIFT;
-	local64_add(delta, &event->count);
+	/*
+	 * Since the counter always start with value zero,
+	 * simply just accumulate the count for the event.
+	 */
+	local64_add(count, &event->count);
 }

 static void perf_iommu_stop(struct perf_event *event, int flags)
@@ -329,15 +333,16 @@ static void perf_iommu_stop(struct perf_event *event, int flags)
 	if (hwc->state & PERF_HES_UPTODATE)
 		return;

+	/*
+	 * To account for power-gating, in which reading the counter would
+	 * return zero, we need to read the register before disabling.
+	 */
+	perf_iommu_read(event);
+	hwc->state |= PERF_HES_UPTODATE;
+
 	perf_iommu_disable_event(event);
 	WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
 	hwc->state |= PERF_HES_STOPPED;
-
-	if (hwc->state & PERF_HES_UPTODATE)
-		return;
-
-	perf_iommu_read(event);
-	hwc->state |= PERF_HES_UPTODATE;
 }

 static int perf_iommu_add(struct perf_event *event, int flags)
--
2.17.1
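The arithmetic behind the two read paths can be compared in a standalone sketch. The first function reproduces the shift-based 48-bit wraparound delta that the patch removes; the second shows the simpler accumulate-from-zero approach it introduces. The function names are hypothetical, and plain integers stand in for the perf core's local64_t state.

```c
#include <stdint.h>

#define SKETCH_COUNTER_SHIFT	16	/* 64 - 48 bits of counter width */

/*
 * Old approach: difference of two 48-bit readings. Shifting both values
 * into the top 48 bits lets the 64-bit subtraction wrap exactly where
 * the hardware counter wraps; shifting back yields the 48-bit delta.
 */
static uint64_t sketch_delta48(uint64_t prev, uint64_t count)
{
	uint64_t delta = (count << SKETCH_COUNTER_SHIFT) -
			 (prev << SKETCH_COUNTER_SHIFT);

	return delta >> SKETCH_COUNTER_SHIFT;
}

/*
 * New approach: the counter is reset to zero at start, so the masked raw
 * value read at stop time is itself the amount to add to the running total.
 */
static uint64_t sketch_accumulate(uint64_t total, uint64_t raw)
{
	return total + (raw & ((1ULL << 48) - 1));
}
```

The accumulate form never needs to write a previous value back into the hardware counter, which is the operation that power-gating can silently drop.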
[PATCH 1/2] Revert "iommu/amd: Fix performance counter initialization"
From: Paul Menzel

This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.

The original commit tried to address an issue where PMC power-gating
causes the IOMMU PMC pre-init test to fail on certain desktop/mobile
platforms where power-gating is normally enabled.

There have been several reports that the workaround is still not
guaranteed to work, and that it can add up to 100 ms (in the worst case)
to the boot process on certain platforms such as the MSI B350M MORTAR
with AMD Ryzen 3 2200G.

Therefore, revert this commit as a prelude to removing the pre-init
test.

Link: https://lore.kernel.org/linux-iommu/alpine.lnx.3.20.13.2006030935570.3...@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux)
Cc: Shuah Khan
Cc: Alexander Monakov
Cc: David Coe
Signed-off-by: Paul Menzel
Signed-off-by: Suravee Suthikulpanit
---
Note: I have revised the commit message to add more detail
      and remove unnecessary information.

 drivers/iommu/amd/init.c | 45 +++++++++++----------------------------------
 1 file changed, 11 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 321f5906e6ed..648cdfd03074 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -12,7 +12,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -257,8 +256,6 @@ static enum iommu_init_state init_state = IOMMU_START_STATE;
 static int amd_iommu_enable_interrupts(void);
 static int __init iommu_go_to_state(enum iommu_init_state state);
 static void init_device_table_dma(void);
-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
-				u8 fxn, u64 *value, bool is_write);

 static bool amd_iommu_pre_enabled = true;

@@ -1717,11 +1714,13 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	return 0;
 }

-static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
+static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
+				u8 fxn, u64 *value, bool is_write);
+
+static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
-	int retry;
 	struct pci_dev *pdev = iommu->dev;
-	u64 val = 0xabcd, val2 = 0, save_reg, save_src;
+	u64 val = 0xabcd, val2 = 0, save_reg = 0;

 	if (!iommu_feature(iommu, FEATURE_PC))
 		return;
@@ -1729,39 +1728,17 @@ static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
 	amd_iommu_pc_present = true;

 	/* save the value to restore, if writable */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) ||
-	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false))
-		goto pc_false;
-
-	/*
-	 * Disable power gating by programing the performance counter
-	 * source to 20 (i.e. counts the reads and writes from/to IOMMU
-	 * Reserved Register [MMIO Offset 1FF8h] that are ignored.),
-	 * which never get incremented during this init phase.
-	 * (Note: The event is also deprecated.)
-	 */
-	val = 20;
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true))
+	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
 		goto pc_false;

 	/* Check if the performance counters can be written to */
-	val = 0xabcd;
-	for (retry = 5; retry; retry--) {
-		if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) ||
-		    iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) ||
-		    val2)
-			break;
-
-		/* Wait about 20 msec for power gating to disable and retry. */
-		msleep(20);
-	}
-
-	/* restore */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) ||
-	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true))
+	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
+	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
+	    (val != val2))
 		goto pc_false;

-	if (val != val2)
+	/* restore */
+	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
 		goto pc_false;

 	pci_info(pdev, "IOMMU performance counters supported\n");
--
2.17.1
[PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
In early AMD desktop/mobile platforms (during 2013), when the IOMMU
Performance Counter (PMC) support was first introduced in
commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
resource management"), there was a HW bug where the counters could not
be accessed. As a result, reading a counter always returned zero.

At the time, the suggested workaround was to add test logic prior to
initializing the PMC feature, to check whether the counters can be
programmed and read back with the same value. This worked fine until more
recent desktop/mobile platforms started enabling power-gating for the PMC,
which prevents access to the counters. This results in the PMC support
being disabled unnecessarily.

Unfortunately, there is no documentation of which hardware generation
first fixed the original PMC HW bug, although it was fixed soon after the
PMC was first introduced. Based on this, we assume that the buggy
platforms are unlikely to still be in use, and that it should be
relatively safe to remove this legacy logic.

Link: https://lore.kernel.org/linux-iommu/alpine.lnx.3.20.13.2006030935570.3...@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux)
Cc: Shuah Khan
Cc: Alexander Monakov
Cc: David Coe
Cc: Paul Menzel
Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 24 +-----------------------
 1 file changed, 1 insertion(+), 23 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 648cdfd03074..247cdda5d683 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1714,33 +1714,16 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	return 0;
 }

-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
-				u8 fxn, u64 *value, bool is_write);
-
 static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
+	u64 val;
 	struct pci_dev *pdev = iommu->dev;
-	u64 val = 0xabcd, val2 = 0, save_reg = 0;

 	if (!iommu_feature(iommu, FEATURE_PC))
 		return;

 	amd_iommu_pc_present = true;

-	/* save the value to restore, if writable */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
-		goto pc_false;
-
-	/* Check if the performance counters can be written to */
-	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
-	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
-	    (val != val2))
-		goto pc_false;
-
-	/* restore */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
-		goto pc_false;
-
 	pci_info(pdev, "IOMMU performance counters supported\n");

 	val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
@@ -1748,11 +1731,6 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 	iommu->max_counters = (u8) ((val >> 7) & 0xf);

 	return;
-
-pc_false:
-	pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n");
-	amd_iommu_pc_present = false;
-	return;
 }

 static ssize_t amd_iommu_show_cap(struct device *dev,
--
2.17.1
[PATCH 0/2] iommu/amd: Revert and remove failing PMC test
This has prevented the PMC from working on more recent desktop/mobile platforms, where PMC power-gating is normally enabled. After consulting with the HW designers and the IOMMU maintainer, we have decided to remove the legacy test altogether to avoid future PMC enabling issues. Thanks to the community for helping to test, investigate, provide data, and report issues on several platforms in the field. Regards, Suravee Paul Menzel (1): Revert "iommu/amd: Fix performance counter initialization" Suravee Suthikulpanit (1): iommu/amd: Remove performance counter pre-initialization test drivers/iommu/amd/init.c | 49 ++-- 1 file changed, 2 insertions(+), 47 deletions(-) -- 2.17.1
Re: [RFC PATCH 5/7] iommu/amd: Add support for Guest IO protection
Joerg, On 3/18/21 10:31 PM, Joerg Roedel wrote: On Fri, Mar 12, 2021 at 03:04:09AM -0600, Suravee Suthikulpanit wrote: @@ -519,6 +521,7 @@ struct protection_domain { spinlock_t lock;/* mostly used to lock the page table*/ u16 id; /* the domain id written to the device table */ int glx;/* Number of levels for GCR3 table */ + bool giov; /* guest IO protection domain */ Could this be turned into a flag? Good point. I'll convert to use the protection_domain.flags. Thanks, Suravee ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 6/7] iommu/amd: Introduce amd_iommu_pgtable command-line option
Joerg, On 3/18/21 10:33 PM, Joerg Roedel wrote: On Fri, Mar 12, 2021 at 03:04:10AM -0600, Suravee Suthikulpanit wrote: To allow specification whether to use v1 or v2 IOMMU pagetable for DMA remapping when calling kernel DMA-API. Signed-off-by: Suravee Suthikulpanit --- Documentation/admin-guide/kernel-parameters.txt | 6 ++ drivers/iommu/amd/init.c| 15 +++ 2 files changed, 21 insertions(+) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 04545725f187..466e807369ea 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -319,6 +319,12 @@ This mode requires kvm-amd.avic=1. (Default when IOMMU HW support is present.) + amd_iommu_pgtable= [HW,X86-64] + Specifies one of the following AMD IOMMU page table to + be used for DMA remapping for DMA-API: + v1 - Use v1 page table (Default) + v2 - Use v2 page table Any reason v2 can not be the default when it is supported by the IOMMU? Eventually, we should be able to default to v2. However, we will need to make sure that the v2 implementation will have comparable performance as currently used v1. FYI: I'm also looking into adding support for SVA as well. Thanks, Suravee ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[RFC PATCH 7/7] iommu/amd: Add support for using AMD IOMMU v2 page table for DMA-API
Introduce init function for setting up DMA domain for DMA-API with the IOMMU v2 page table. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/iommu.c | 21 + 1 file changed, 21 insertions(+) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index e29ece6e1e68..bd26de8764bd 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1937,6 +1937,24 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode) return 0; } +static int protection_domain_init_v2(struct protection_domain *domain) +{ + spin_lock_init(&domain->lock); + domain->id = domain_id_alloc(); + if (!domain->id) + return -ENOMEM; + INIT_LIST_HEAD(&domain->dev_list); + + domain->giov = true; + + if (amd_iommu_pgtable == AMD_IOMMU_V2 && + domain_enable_v2(domain, 1, false)) { + return -ENOMEM; + } + + return 0; +} + static struct protection_domain *protection_domain_alloc(unsigned int type) { struct io_pgtable_ops *pgtbl_ops; @@ -1964,6 +1982,9 @@ static struct protection_domain *protection_domain_alloc(unsigned int type) case AMD_IOMMU_V1: ret = protection_domain_init_v1(domain, mode); break; + case AMD_IOMMU_V2: + ret = protection_domain_init_v2(domain); + break; default: ret = -EINVAL; } -- 2.17.1
[RFC PATCH 6/7] iommu/amd: Introduce amd_iommu_pgtable command-line option
To allow specification whether to use v1 or v2 IOMMU pagetable for DMA remapping when calling kernel DMA-API. Signed-off-by: Suravee Suthikulpanit --- Documentation/admin-guide/kernel-parameters.txt | 6 ++ drivers/iommu/amd/init.c| 15 +++ 2 files changed, 21 insertions(+) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 04545725f187..466e807369ea 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -319,6 +319,12 @@ This mode requires kvm-amd.avic=1. (Default when IOMMU HW support is present.) + amd_iommu_pgtable= [HW,X86-64] + Specifies one of the following AMD IOMMU page table to + be used for DMA remapping for DMA-API: + v1 - Use v1 page table (Default) + v2 - Use v2 page table + amijoy.map= [HW,JOY] Amiga joystick support Map of devices attached to JOY0DAT and JOY1DAT Format: , diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 9265c1bf1d84..6d5163bfb87e 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -3123,6 +3123,20 @@ static int __init parse_amd_iommu_dump(char *str) return 1; } +static int __init parse_amd_iommu_pgtable(char *str) +{ + for (; *str; ++str) { + if (strncmp(str, "v1", 2) == 0) { + amd_iommu_pgtable = AMD_IOMMU_V1; + break; + } else if (strncmp(str, "v2", 2) == 0) { + amd_iommu_pgtable = AMD_IOMMU_V2; + break; + } + } + return 1; +} + static int __init parse_amd_iommu_intr(char *str) { for (; *str; ++str) { @@ -3246,6 +3260,7 @@ static int __init parse_ivrs_acpihid(char *str) __setup("amd_iommu_dump", parse_amd_iommu_dump); __setup("amd_iommu=", parse_amd_iommu_options); +__setup("amd_iommu_pgtable=", parse_amd_iommu_pgtable); __setup("amd_iommu_intr=", parse_amd_iommu_intr); __setup("ivrs_ioapic", parse_ivrs_ioapic); __setup("ivrs_hpet", parse_ivrs_hpet); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org 
https://lists.linuxfoundation.org/mailman/listinfo/iommu
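The option parsing in the patch above can be tried outside the kernel with a small sketch like the following. It mirrors the scan loop of parse_amd_iommu_pgtable(), but enum pgtable_mode and parse_pgtable() are hypothetical names, and returning the mode instead of setting a global is a simplification for testing.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for AMD_IOMMU_V1 / AMD_IOMMU_V2. */
enum pgtable_mode { PGTABLE_V1, PGTABLE_V2 };

/*
 * Scan the option string for "v1" or "v2", as the patch does.
 * Unrecognised input leaves the caller-supplied default in place.
 */
static enum pgtable_mode parse_pgtable(const char *str, enum pgtable_mode def)
{
	for (; *str; ++str) {
		if (strncmp(str, "v1", 2) == 0)
			return PGTABLE_V1;
		if (strncmp(str, "v2", 2) == 0)
			return PGTABLE_V2;
	}
	return def;
}
```

One property worth noting: because the loop advances one character at a time, the match is a substring match, so an argument such as "xv2" would also select v2. A stricter parser might compare the whole string instead.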
[RFC PATCH 3/7] iommu/amd: Decouple the logic to enable PPR and GT
Currently, the function to enable iommu v2 (GT) assumes PPR log must also be enabled. This is no longer the case since the IOMMU v2 page table can be enabled without PPR support (for DMA-API use case). Therefore, separate the enabling logic for PPR and GT. There is no functional change. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/init.c | 19 +-- 1 file changed, 5 insertions(+), 14 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 9126efcbaf2c..5def566de6f6 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -898,14 +898,6 @@ static void iommu_enable_xt(struct amd_iommu *iommu) #endif /* CONFIG_IRQ_REMAP */ } -static void iommu_enable_gt(struct amd_iommu *iommu) -{ - if (!iommu_feature(iommu, FEATURE_GT)) - return; - - iommu_feature_enable(iommu, CONTROL_GT_EN); -} - /* sets a specific bit in the device table entry. */ static void set_dev_entry_bit(u16 devid, u8 bit) { @@ -1882,6 +1874,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu) amd_iommu_max_glx_val = glxval; else amd_iommu_max_glx_val = min(amd_iommu_max_glx_val, glxval); + iommu_feature_enable(iommu, CONTROL_GT_EN); } if (iommu_feature(iommu, FEATURE_GT) && @@ -2530,21 +2523,19 @@ static void early_enable_iommus(void) #endif } -static void enable_iommus_v2(void) +static void enable_iommus_ppr(void) { struct amd_iommu *iommu; - for_each_iommu(iommu) { + for_each_iommu(iommu) iommu_enable_ppr_log(iommu); - iommu_enable_gt(iommu); - } } static void enable_iommus(void) { early_enable_iommus(); - enable_iommus_v2(); + enable_iommus_ppr(); } static void disable_iommus(void) @@ -2935,7 +2926,7 @@ static int __init state_next(void) register_syscore_ops(&amd_iommu_syscore_ops); ret = amd_iommu_init_pci(); init_state = ret ? IOMMU_INIT_ERROR : IOMMU_PCI_INIT; - enable_iommus_v2(); + enable_iommus_ppr(); break; case IOMMU_PCI_INIT: ret = amd_iommu_enable_interrupts(); -- 2.17.1
[RFC PATCH 4/7] iommu/amd: Initial support for AMD IOMMU v2 page table
Introduce IO page table framework support for AMD IOMMU v2 page table. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/Makefile | 2 +- drivers/iommu/amd/amd_iommu_types.h | 2 + drivers/iommu/amd/io_pgtable_v2.c | 239 drivers/iommu/io-pgtable.c | 1 + include/linux/io-pgtable.h | 2 + 5 files changed, 245 insertions(+), 1 deletion(-) create mode 100644 drivers/iommu/amd/io_pgtable_v2.c diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile index a935f8f4b974..773d8aa00283 100644 --- a/drivers/iommu/amd/Makefile +++ b/drivers/iommu/amd/Makefile @@ -1,4 +1,4 @@ # SPDX-License-Identifier: GPL-2.0-only -obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o +obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o io_pgtable_v2.o obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o obj-$(CONFIG_AMD_IOMMU_V2) += iommu_v2.o diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 6937e3674a16..25062eb86c8b 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -265,6 +265,7 @@ * 512GB Pages are not supported due to a hardware bug */ #define AMD_IOMMU_PGSIZES ((~0xFFFUL) & ~(2ULL << 38)) +#define AMD_IOMMU_PGSIZES_V2 (PAGE_SIZE | (1ULL << 12) | (1ULL << 30)) /* Bit value definition for dte irq remapping fields*/ #define DTE_IRQ_PHYS_ADDR_MASK (((1ULL << 45)-1) << 6) @@ -503,6 +504,7 @@ struct amd_io_pgtable { int mode; u64 *root; atomic64_t pt_root;/* pgtable root and pgtable mode */ + struct mm_structv2_mm; }; /* diff --git a/drivers/iommu/amd/io_pgtable_v2.c b/drivers/iommu/amd/io_pgtable_v2.c new file mode 100644 index ..b0b6ba2d8d35 --- /dev/null +++ b/drivers/iommu/amd/io_pgtable_v2.c @@ -0,0 +1,239 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * CPU-agnostic AMD IO page table v2 allocator. + * + * Copyright (C) 2020 Advanced Micro Devices, Inc. 
+ * Author: Suravee Suthikulpanit + */ + +#define pr_fmt(fmt) "AMD-Vi: " fmt +#define dev_fmt(fmt)pr_fmt(fmt) + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include "amd_iommu_types.h" +#include "amd_iommu.h" + +static pte_t *fetch_pte(struct amd_io_pgtable *pgtable, + unsigned long iova, + unsigned long *page_size) +{ + int level; + pte_t *ptep; + + ptep = lookup_address_in_mm(>v2_mm, iova, ); + if (!ptep || pte_none(*ptep) || (level == PG_LEVEL_NONE)) + return NULL; + + *page_size = PTE_LEVEL_PAGE_SIZE(level-1); + return ptep; +} + +static pte_t *v2_pte_alloc_map(struct mm_struct *mm, unsigned long vaddr) +{ + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + + pgd = pgd_offset(mm, vaddr); + p4d = p4d_alloc(mm, pgd, vaddr); + if (!p4d) + return NULL; + pud = pud_alloc(mm, p4d, vaddr); + if (!pud) + return NULL; + pmd = pmd_alloc(mm, pud, vaddr); + if (!pmd) + return NULL; + pte = pte_alloc_map(mm, pmd, vaddr); + return pte; +} + +static int iommu_v2_map_page(struct io_pgtable_ops *ops, unsigned long iova, + phys_addr_t paddr, size_t size, int prot, gfp_t gfp) +{ + struct protection_domain *dom = io_pgtable_ops_to_domain(ops); + struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops); + pte_t *pte; + int ret, i, count; + bool updated = false; + unsigned long o_iova = iova; + unsigned long pte_pgsize; + + BUG_ON(!IS_ALIGNED(iova, size) || !IS_ALIGNED(paddr, size)); + + ret = -EINVAL; + if (!(prot & IOMMU_PROT_MASK)) + goto out; + + count = PAGE_SIZE_PTE_COUNT(size); + + for (i = 0; i < count; ++i, iova += PAGE_SIZE, paddr += PAGE_SIZE) { + pte = fetch_pte(pgtable, iova, _pgsize); + if (!pte || pte_none(*pte)) { + pte = v2_pte_alloc_map(>iop.v2_mm, iova); + if (!pte) + goto out; + } else { + updated = true; + } + set_pte(pte, __pte((paddr & PAGE_MASK)|_PAGE_PRESENT|_PAGE_USER)); + if (prot & IOMMU_PROT_IW) + *pte = pte_mkwrite(*pte); + } + + if (updated) { + if (count > 
1) + amd_iommu_flush_tlb(>domain, 0); + else + amd_iommu_flush_page(>domain, 0, o_iova); + } + + ret = 0; +out: + return ret; +} + +static unsigned long iommu_v2_u
[RFC PATCH 1/7] iommu/amd: Refactor amd_iommu_domain_enable_v2
The current function to enable IOMMU v2 also locks the domain. In order to reuse the same code in a different code path, in which the domain has already been locked, refactor the function to separate the locking from the enabling logic. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/iommu.c | 42 +-- 1 file changed, 27 insertions(+), 15 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index a69a8b573e40..6f3e42495709 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -88,6 +88,7 @@ struct iommu_cmd { struct kmem_cache *amd_iommu_irq_cache; static void detach_device(struct device *dev); +static int domain_enable_v2(struct protection_domain *domain, int pasids, bool has_ppr); / * @@ -2304,10 +2305,9 @@ void amd_iommu_domain_direct_map(struct iommu_domain *dom) } EXPORT_SYMBOL(amd_iommu_domain_direct_map); -int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids) +/* Note: This function expects iommu_domain->lock to be held prior calling the function. */ +static int domain_enable_v2(struct protection_domain *domain, int pasids, bool has_ppr) { - struct protection_domain *domain = to_pdomain(dom); - unsigned long flags; int levels, ret; if (pasids <= 0 || pasids > (PASID_MASK + 1)) @@ -2320,17 +2320,6 @@ int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids) if (levels > amd_iommu_max_glx_val) return -EINVAL; - spin_lock_irqsave(&domain->lock, flags); - - /* -* Save us all sanity checks whether devices already in the -* domain support IOMMUv2. Just force that the domain has no -* devices attached when it is switched into IOMMUv2 mode. -*/ - ret = -EBUSY; - if (domain->dev_cnt > 0 || domain->flags & PD_IOMMUV2_MASK) - goto out; - ret = -ENOMEM; domain->gcr3_tbl = (void *)get_zeroed_page(GFP_ATOMIC); if (domain->gcr3_tbl == NULL) @@ -2344,8 +2333,31 @@ int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids) ret = 0; out: - spin_unlock_irqrestore(&domain->lock, flags); + return ret; +} +int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids) +{ + int ret; + unsigned long flags; + struct protection_domain *pdom = to_pdomain(dom); + + spin_lock_irqsave(&pdom->lock, flags); + + /* +* Save us all sanity checks whether devices already in the +* domain support IOMMUv2. Just force that the domain has no +* devices attached when it is switched into IOMMUv2 mode. +*/ + ret = -EBUSY; + if (pdom->dev_cnt > 0 || pdom->flags & PD_IOMMUV2_MASK) + goto out; + + if (pdom->dev_cnt == 0 && !(pdom->gcr3_tbl)) + ret = domain_enable_v2(pdom, pasids, true); + +out: + spin_unlock_irqrestore(&pdom->lock, flags); return ret; } EXPORT_SYMBOL(amd_iommu_domain_enable_v2); -- 2.17.1
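The shape of this refactoring (a helper that assumes the lock is held, plus a public wrapper that takes the lock and does the busy check) can be sketched in plain C as below. This is only an illustration of the pattern: struct toy_domain, toy_lock(), toy_unlock(), enable_v2_locked() and enable_v2() are hypothetical names, and the boolean lock_held flag is a single-threaded stand-in for the kernel spinlock, not a real lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_domain {
	bool lock_held;    /* single-threaded stand-in for spinlock_t */
	unsigned dev_cnt;  /* devices attached to the domain */
	bool v2_enabled;   /* stand-in for the PD_IOMMUV2_MASK flag */
};

static void toy_lock(struct toy_domain *d)
{
	assert(!d->lock_held);
	d->lock_held = true;
}

static void toy_unlock(struct toy_domain *d)
{
	assert(d->lock_held);
	d->lock_held = false;
}

/* Enabling logic only. The caller must already hold the lock. */
static int enable_v2_locked(struct toy_domain *d, int pasids)
{
	assert(d->lock_held);
	if (pasids <= 0)
		return -EINVAL;
	d->v2_enabled = true;
	return 0;
}

/* Public entry point: takes the lock, checks for busy, then delegates. */
static int enable_v2(struct toy_domain *d, int pasids)
{
	int ret = -EBUSY;

	toy_lock(d);
	if (d->dev_cnt == 0 && !d->v2_enabled)
		ret = enable_v2_locked(d, pasids);
	toy_unlock(d);
	return ret;
}
```

A second caller that already holds the lock (for example a domain init path) can then call the locked helper directly without deadlocking on a non-recursive lock.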
[RFC PATCH 5/7] iommu/amd: Add support for Guest IO protection
AMD IOMMU introduces support for Guest I/O protection where the request from the I/O device without a PASID are treated as if they have PASID 0. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu_types.h | 3 +++ drivers/iommu/amd/init.c| 8 drivers/iommu/amd/iommu.c | 4 3 files changed, 15 insertions(+) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 25062eb86c8b..876ba1adf73e 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -93,6 +93,7 @@ #define FEATURE_HE (1ULL<<8) #define FEATURE_PC (1ULL<<9) #define FEATURE_GAM_VAPIC (1ULL<<21) +#define FEATURE_GIOSUP (1ULL<<48) #define FEATURE_EPHSUP (1ULL<<50) #define FEATURE_SNP(1ULL<<63) @@ -366,6 +367,7 @@ #define DTE_FLAG_IW (1ULL << 62) #define DTE_FLAG_IOTLB (1ULL << 32) +#define DTE_FLAG_GIOV (1ULL << 54) #define DTE_FLAG_GV(1ULL << 55) #define DTE_FLAG_MASK (0x3ffULL << 32) #define DTE_GLX_SHIFT (56) @@ -519,6 +521,7 @@ struct protection_domain { spinlock_t lock;/* mostly used to lock the page table*/ u16 id; /* the domain id written to the device table */ int glx;/* Number of levels for GCR3 table */ + bool giov; /* guest IO protection domain */ u64 *gcr3_tbl; /* Guest CR3 table */ unsigned long flags;/* flags to find out type of domain */ unsigned dev_cnt; /* devices assigned to this domain */ diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 5def566de6f6..9265c1bf1d84 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1895,6 +1895,12 @@ static int __init iommu_init_pci(struct amd_iommu *iommu) init_iommu_perf_ctr(iommu); + if (amd_iommu_pgtable == AMD_IOMMU_V2 && + !iommu_feature(iommu, FEATURE_GIOSUP)) { + pr_warn("Cannot enable v2 page table for DMA-API. 
Fallback to v1.\n"); + amd_iommu_pgtable = AMD_IOMMU_V1; + } + if (is_rd890_iommu(iommu->dev)) { int i, j; @@ -1969,6 +1975,8 @@ static void print_iommu_info(void) if (amd_iommu_xt_mode == IRQ_REMAP_X2APIC_MODE) pr_info("X2APIC enabled\n"); } + if (amd_iommu_pgtable == AMD_IOMMU_V2) + pr_info("GIOV enabled\n"); } static int __init amd_iommu_init_pci(void) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index f3800efdbb29..e29ece6e1e68 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1405,6 +1405,10 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK) << DEV_ENTRY_MODE_SHIFT; + + if (domain->giov && (domain->flags & PD_IOMMUV2_MASK)) + pte_root |= DTE_FLAG_GIOV; + pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV; flags = amd_iommu_dev_table[devid].data[1]; -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
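As a small illustration of the set_dte_entry() hunk above, the following sketch shows the condition under which the GIOV bit ends up in the DTE word. The DTE_FLAG_GIOV and DTE_FLAG_GV values are taken from the patch; PD_V2_FLAG is a hypothetical stand-in for PD_IOMMUV2_MASK (its value here is an assumption), and apply_giov() is an illustrative helper, not a driver function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DTE_FLAG_GIOV (1ULL << 54)  /* value from the patch */
#define DTE_FLAG_GV   (1ULL << 55)  /* value from the patch context */
#define PD_V2_FLAG    (1UL << 3)    /* assumed stand-in for PD_IOMMUV2_MASK */

/*
 * Set GIOV only when the domain asked for it AND the domain is in
 * v2 mode, matching the condition added to set_dte_entry().
 */
static uint64_t apply_giov(uint64_t dte, bool giov, unsigned long domain_flags)
{
	if (giov && (domain_flags & PD_V2_FLAG))
		dte |= DTE_FLAG_GIOV;
	return dte;
}
```

With GIOV set, requests that arrive without a PASID are handled as PASID 0, which is what lets the v2 table serve ordinary DMA-API traffic.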
[RFC PATCH 2/7] iommu/amd: Update sanity check when enable PRI/ATS
Currently, PPR/ATS can be enabled only if the domain is type identity mapping. However, when we allow the IOMMU v2 page table to be used for DMA-API, the sanity check needs to be updated to only apply for the case when using AMD_IOMMU_V1 page table mode. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/iommu.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 6f3e42495709..f3800efdbb29 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1549,7 +1549,7 @@ static int pri_reset_while_enabled(struct pci_dev *pdev) return 0; } -static int pdev_iommuv2_enable(struct pci_dev *pdev) +static int pdev_pri_ats_enable(struct pci_dev *pdev) { bool reset_enable; int reqs, ret; @@ -1624,11 +1624,19 @@ static int attach_device(struct device *dev, struct iommu_domain *def_domain = iommu_get_dma_domain(dev); ret = -EINVAL; - if (def_domain->type != IOMMU_DOMAIN_IDENTITY) + + /* +* In case of using AMD_IOMMU_V1 page table mode, and the device +* is enabling for PPR/ATS support (using v2 table), +* we need to make sure that the domain type is identity map. +*/ + if ((amd_iommu_pgtable == AMD_IOMMU_V1) && + def_domain->type != IOMMU_DOMAIN_IDENTITY) { goto out; + } if (dev_data->iommu_v2) { - if (pdev_iommuv2_enable(pdev) != 0) + if (pdev_pri_ats_enable(pdev) != 0) goto out; dev_data->ats.enabled = true; -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[RFC PATCH 0/7] iommu/amd: Add Generic IO Page Table Framework Support for v2 Page Table
This series introduces a new usage model for the v2 page table, where it can be used to implement support for DMA-API by adopting the generic IO page table framework. One of the target usecases is to support nested IO page tables where the guest uses the guest IO page table (v2) for translating GVA to GPA, and the hypervisor uses the host I/O page table (v1) for translating GPA to SPA. This is a pre-requisite for supporting the new HW-assisted vIOMMU presented at the KVM Forum 2020. https://static.sched.com/hosted_files/kvmforum2020/26/vIOMMU%20KVM%20Forum%202020.pdf The following components are introduced in this series: - Part 1 (patch 1-4 and 7) Refactor the current IOMMU page table v2 code to adopt the generic IO page table framework, and add AMD IOMMU Guest (v2) page table management code. - Part 2 (patch 5) Add support for the AMD IOMMU Guest IO Protection feature (GIOV) where requests from the I/O device without a PASID are treated as if they have PASID of 0. - Part 3 (patch 6) Introduce new amd_iommu_pgtable command-line to allow users to select the mode of operation (v1 or v2). See AMD I/O Virtualization Technology Specification for more detail. 
http://www.amd.com/system/files/TechDocs/48882_IOMMU_3.05_PUB.pdf Thanks, Suravee Suravee Suthikulpanit (7): iommu/amd: Refactor amd_iommu_domain_enable_v2 iommu/amd: Update sanity check when enable PRI/ATS iommu/amd: Decouple the logic to enable PPR and GT iommu/amd: Initial support for AMD IOMMU v2 page table iommu/amd: Add support for Guest IO protection iommu/amd: Introduce amd_iommu_pgtable command-line option iommu/amd: Add support for using AMD IOMMU v2 page table for DMA-API .../admin-guide/kernel-parameters.txt | 6 + drivers/iommu/amd/Makefile| 2 +- drivers/iommu/amd/amd_iommu_types.h | 5 + drivers/iommu/amd/init.c | 42 ++- drivers/iommu/amd/io_pgtable_v2.c | 239 ++ drivers/iommu/amd/iommu.c | 81 -- drivers/iommu/io-pgtable.c| 1 + include/linux/io-pgtable.h| 2 + 8 files changed, 345 insertions(+), 33 deletions(-) create mode 100644 drivers/iommu/amd/io_pgtable_v2.c -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
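To make the nested translation described in the cover letter concrete, here is a toy two-stage lookup: a guest (v2) table maps GVA to GPA, and a host (v1) table maps GPA to SPA. Everything in it is hypothetical (single-level tables of four 4K pages, the toy_* names, TOY_INVALID as the fault marker); real AMD IOMMU tables are multi-level and carry permission bits.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGES      4
#define TOY_INVALID    UINT64_MAX

/* One-level toy table: index = source page number, entry = target page number. */
struct toy_table {
	uint64_t pfn[TOY_PAGES];
};

/* Translate one address through one table, preserving the page offset. */
static uint64_t toy_walk(const struct toy_table *t, uint64_t addr)
{
	uint64_t page = addr >> TOY_PAGE_SHIFT;
	uint64_t off = addr & ((1ULL << TOY_PAGE_SHIFT) - 1);

	if (page >= TOY_PAGES || t->pfn[page] == TOY_INVALID)
		return TOY_INVALID;
	return (t->pfn[page] << TOY_PAGE_SHIFT) | off;
}

/* GVA -> GPA through the guest (v2) table, then GPA -> SPA through the host (v1) table. */
static uint64_t toy_nested(const struct toy_table *guest_v2,
			   const struct toy_table *host_v1, uint64_t gva)
{
	uint64_t gpa = toy_walk(guest_v2, gva);

	if (gpa == TOY_INVALID)
		return TOY_INVALID;
	return toy_walk(host_v1, gpa);
}
```

A miss at either stage faults the whole translation, which is why the guest can manage its own table without being able to reach memory the host table does not map.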
Re: [PATCH] Revert "iommu/amd: Fix performance counter initialization"
Paul, On 3/3/21 7:11 PM, Paul Menzel wrote: This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b. The commit adds up to 100 ms to the boot process, which is not mentioned in the commit message, and is making up more than 20 % on current systems, where the Linux kernel takes 500 ms. The 100 msec (5 * 20ms) is only for the worst-case scenario. For most cases, the delay is not applicable. In addition, this patch has been shown to fix the issue for some users in the field. [0.00] Linux version 5.11.0-10281-g19b4f3edd5c9 (root@a2ab663d937e) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1) #138 SMP Wed Feb 24 11:28:17 UTC 2021 […] [0.106422] smpboot: CPU0: AMD Ryzen 3 2200G with Radeon Vega Graphics (family: 0x17, model: 0x11, stepping: 0x0) […] [0.291257] pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter. […] Also, it does not fix the problem on an MSI B350M MORTAR with AMD Ryzen 3 2200G (even with ten retries, resulting in 200 ms time-out). We are still investigating the root cause of the long delay for the IOMMU performance counter unit to disable power-gating and allow access to the performance counters. If your concern is the number of retries, we can try to reduce it. [0.401152] pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter. Additionally, alternative proposed solutions [1] were not considered or discussed. [1]:https://lore.kernel.org/linux-iommu/alpine.lnx.2.20.13.2006030935570.3...@monopod.intra.ispras.ru/ This check was introduced early on to detect a HW issue on certain past platforms, where the performance counters are not accessible and would result in a silent failure when trying to use the counters. This is considered legacy code, and can be removed if we decide to no longer provide a sanity check for such a case. Regards, Suravee
Re: [PATCH] iommu/amd: Fix event counter availability check
This fix was recently accepted upstream. https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=x86/amd Could you please give this a try? Thanks, Suravee On 2/21/21 8:49 PM, Paul Menzel wrote: Dear Suravee, On 17.09.20 at 19:55, Alexander Monakov wrote: On Tue, 16 Jun 2020, Suravee Suthikulpanit wrote: Instead of blindly moving the code around to a spot that would just work, I am trying to understand what might be required here. In this case, the init_device_table_dma() should not be needed. I suspect it's the IOMMU invalidate all command that's also needed here. I'm also checking with the HW and BIOS team. Meanwhile, could you please give the following change a try: Hello. Can you give any update please? […] Sorry for the late reply. I have a reproducer and am working with the HW team to understand the issue. I should be able to provide an update with a solution by the end of this week. Hello, hope you are doing well. Has this investigation found anything? I am wondering the same. It’d be great to have this fixed in the upstream Linux kernel. Kind regards, Paul
[PATCH] iommu/amd: Fix performance counter initialization
Certain AMD platforms enable the power gating feature for the IOMMU PMC, which prevents the IOMMU driver from updating the counter while trying to validate the PMC functionality in init_iommu_perf_ctr(). This results in disabling PMC support and the following error message: "AMD-Vi: Unable to read/write to IOMMU perf counter" To work around this issue, disable power gating temporarily by programming the counter source to a non-zero value while validating the counter, and restore the prior state afterward. Tested-by: Tj (Elloe Linux) Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753 Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/init.c | 45 ++-- 1 file changed, 34 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 83d8ab2aed9f..01da76dc1caa 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include @@ -254,6 +255,8 @@ static enum iommu_init_state init_state = IOMMU_START_STATE; static int amd_iommu_enable_interrupts(void); static int __init iommu_go_to_state(enum iommu_init_state state); static void init_device_table_dma(void); +static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, + u8 fxn, u64 *value, bool is_write); static bool amd_iommu_pre_enabled = true; @@ -1712,13 +1715,11 @@ static int __init init_iommu_all(struct acpi_table_header *table) return 0; } -static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, - u8 fxn, u64 *value, bool is_write); - -static void init_iommu_perf_ctr(struct amd_iommu *iommu) +static void __init init_iommu_perf_ctr(struct amd_iommu *iommu) { + int retry; struct pci_dev *pdev = iommu->dev; - u64 val = 0xabcd, val2 = 0, save_reg = 0; + u64 val = 0xabcd, val2 = 0, save_reg, save_src; if (!iommu_feature(iommu, FEATURE_PC)) return; @@ -1726,17 +1727,39 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu) amd_iommu_pc_present = true; /* save the value to restore, if writable */ - if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false)) + if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) || + iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false)) goto pc_false; - /* Check if the performance counters can be written to */ - if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) || - (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) || - (val != val2)) + /* +* Disable power gating by programing the performance counter +* source to 20 (i.e. counts the reads and writes from/to IOMMU +* Reserved Register [MMIO Offset 1FF8h] that are ignored.), +* which never get incremented during this init phase. +* (Note: The event is also deprecated.) +*/ + val = 20; + if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true)) goto pc_false; + /* Check if the performance counters can be written to */ + val = 0xabcd; + for (retry = 5; retry; retry--) { + if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) || + iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) || + val2) + break; + + /* Wait about 20 msec for power gating to disable and retry. */ + msleep(20); + } + /* restore */ - if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true)) + if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) || + iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true)) + goto pc_false; + + if (val != val2) goto pc_false; pci_info(pdev, "IOMMU performance counters supported\n"); -- 2.17.1
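The retry loop in this patch can be exercised with a rough user-space model like the one below. It is a sketch under stated assumptions, not a description of the hardware: struct sim_pmc and the pmc_* helpers are hypothetical, the model assumes that selecting a non-zero counter source starts un-gating, and that the counter becomes accessible only after a number of sleep intervals (ticks_left).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sim_pmc {
	uint64_t counter;
	uint64_t source;  /* event source select; non-zero un-gates (assumed) */
	int ticks_left;   /* sleep intervals until the counter is accessible */
};

static bool pmc_ready(const struct sim_pmc *p)
{
	return p->source != 0 && p->ticks_left == 0;
}

/* Stands in for msleep(20): un-gating only progresses once a source is set. */
static void pmc_sleep(struct sim_pmc *p)
{
	if (p->source != 0 && p->ticks_left > 0)
		p->ticks_left--;
}

static void pmc_write(struct sim_pmc *p, uint64_t v)
{
	if (pmc_ready(p))
		p->counter = v;
}

static uint64_t pmc_read(const struct sim_pmc *p)
{
	return pmc_ready(p) ? p->counter : 0;
}

/* Select a non-zero source, then retry the write/read-back a few times. */
static bool probe_with_retry(struct sim_pmc *p, int retries)
{
	uint64_t save_reg = p->counter, save_src = p->source;
	uint64_t val = 0xabcd, val2 = 0;

	p->source = 20;  /* the non-zero source value used in the patch */
	for (; retries; retries--) {
		pmc_write(p, val);
		val2 = pmc_read(p);
		if (val2)
			break;
		pmc_sleep(p);
	}

	/* Restore the prior state before reporting the result. */
	p->counter = save_reg;
	p->source = save_src;
	return val == val2;
}
```

The model also shows the limitation raised later in the thread: if un-gating takes longer than the retry budget, the probe still fails even though the counter works.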
Re: AMD-Vi: Unable to read/write to IOMMU perf counter
TJ, Thanks for testing. I will submit this change upstream w/ you as Tested-by. On 2/8/21 12:18 AM, Tj (Elloe Linux) wrote: On 06/02/2021 04:02, Suravee Suthikulpanit wrote: Would this be in any way related to the following from the same device: kernel: pci :00:00.2: can't derive routing for PCI INT A kernel: pci :00:00.2: PCI INT A: not connected This is not related, but should not cause issues. Thanks, Suravee ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: AMD-Vi: Unable to read/write to IOMMU perf counter
Tj, I have posted RFCv3 in the BZ https://bugzilla.kernel.org/show_bug.cgi?id=201753. The RFCv3 patch adds logic to wait 20 msec before each retry, since I have found that certain platforms take about 10 msec for the power gating to disable. Please give this a try to see if this works better on your platform. Thanks, Suravee On 2/4/21 1:25 PM, Tj (Elloe Linux) wrote: On 02/02/2021 05:54, Suravee Suthikulpanit wrote: Could you please try the attached patch to see if the problem still persist. Tested on top of commit 61556703b610 doesn't appear to have solved the issue. Linux version 5.11.0-rc6+ (tj@elloe000) (gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubunt> Command line: BOOT_IMAGE=/vmlinuz-5.11.0-rc6+ root=/dev/mapper/ELLOE000-rootfs ro acpi_osi=! "acpi_osi=Windows 20> ... DMI: LENOVO 20NECTO1WW/20NECTO1WW, BIOS R11ET32W (1.12 ) 12/23/2019 ... AMD-Vi: ivrs, add hid:PNPD0040, uid:, rdevid:152 ... smpboot: CPU0: AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx (family: 0x17, model: 0x18, stepping: 0x1) ... pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter. pci :00:00.2: can't derive routing for PCI INT A pci :00:00.2: PCI INT A: not connected pci :00:01.0: Adding to iommu group 0 pci :00:01.1: Adding to iommu group 1 ... pci :00:00.2: AMD-Vi: Found IOMMU cap 0x40 pci :00:00.2: AMD-Vi: Extended features (0x4f77ef22294ada): PPR NX GT IA GA PC GA_vAPIC AMD-Vi: Interrupt remapping enabled AMD-Vi: Virtual APIC enabled AMD-Vi: Lazy IO/TLB flushing enabled amd_uncore: 4 amd_df counters detected
Re: AMD-Vi: Unable to read/write to IOMMU perf counter
Could you please try the attached patch to see if the problem still persists. Thanks, Suravee On 1/25/21 4:24 PM, Tj (Elloe Linux) wrote: Lenovo E495 reports: pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter. pci :00:00.2: can't derive routing for PCI INT A pci :00:00.2: PCI INT A: not connected I found an existing identical bug report that doesn't seem to have gained any attention: https://bugzilla.kernel.org/show_bug.cgi?id=201753 Linux version 5.11.0-rc4+ (tj@elloe000) (gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #12 SMP PREEMPT Sun Jan 24 11:28:01 GMT 2021 Command line: BOOT_IMAGE=/vmlinuz-5.11.0-rc4+ root=/dev/mapper/ELLOE000-rootfs ro acpi_osi=! "acpi_osi=Windows 2016" systemd.unified_cgroup_hierarchy=1 nosplash ... DMI: LENOVO 20NECTO1WW/20NECTO1WW, BIOS R11ET32W (1.12 ) 12/23/2019 ... AMD-Vi: ivrs, add hid:PNPD0040, uid:, rdevid:152 ... smpboot: CPU0: AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx (family: 0x17, model: 0x18, stepping: 0x1) ... pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
> pci :00:00.2: can't derive routing for PCI INT A
> pci :00:00.2: PCI INT A: not connected
> pci :00:01.0: Adding to iommu group 0
> pci :00:01.1: Adding to iommu group 1
> pci :00:01.2: Adding to iommu group 2
> pci :00:01.3: Adding to iommu group 3
> pci :00:01.6: Adding to iommu group 4
> pci :00:08.0: Adding to iommu group 5
> pci :00:08.1: Adding to iommu group 6
> pci :00:14.0: Adding to iommu group 7
> pci :00:14.3: Adding to iommu group 7
> pci :00:18.0: Adding to iommu group 8
> pci :00:18.1: Adding to iommu group 8
> pci :00:18.2: Adding to iommu group 8
> pci :00:18.3: Adding to iommu group 8
> pci :00:18.4: Adding to iommu group 8
> pci :00:18.5: Adding to iommu group 8
> pci :00:18.6: Adding to iommu group 8
> pci :00:18.7: Adding to iommu group 8
> pci :01:00.0: Adding to iommu group 9
> pci :02:00.0: Adding to iommu group 10
> pci :03:00.0: Adding to iommu group 11
> pci :04:00.0: Adding to iommu group 12
> pci :05:00.0: Adding to iommu group 13
> pci :05:00.1: Adding to iommu group 14
> pci :05:00.2: Adding to iommu group 14
> pci :05:00.3: Adding to iommu group 14
> pci :05:00.4: Adding to iommu group 14
> pci :05:00.5: Adding to iommu group 14
> pci :05:00.6: Adding to iommu group 14
> pci :00:00.2: AMD-Vi: Found IOMMU cap 0x40
> pci :00:00.2: AMD-Vi: Extended features (0x4f77ef22294ada): PPR NX GT IA GA PC GA_vAPIC
> AMD-Vi: Interrupt remapping enabled
> AMD-Vi: Virtual APIC enabled
> AMD-Vi: Lazy IO/TLB flushing enabled
> amd_uncore: 4 amd_df counters detected

From c103d631285cf376420e7f7869837302f2ac38c0 Mon Sep 17 00:00:00 2001
From:
Suravee Suthikulpanit Date: Mon, 1 Feb 2021 18:38:26 -0600 Subject: [RFC PATCH] iommu/amd: Fix performance counter initialization Certain AMD platforms enable power gating feature for IOMMU PMC, which prevents the IOMMU driver from updating the counter while trying to validate the PMC functionality in the init_iommu_perf_ctr(). This results in disabling PMC support and the following error message: "AMD-Vi: Unable to write to IOMMU perf counter" To workaround this issue, disable power gating temporarily by programming the counter source to non-zero value while validating the counter, and restore the prior state afterward. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753 Signed-off-by: Suravee Suthikulpanit --- NOTE: I have tested this patch only on certain platforms. It might need more testing coverage on other mobile and desktop platforms. Thank you, Suravee drivers/iommu/amd/init.c | 33 - 1 file changed, 24 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 83d8ab2aed9f..edb885625e47 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -254,6 +254,8 @@ static enum iommu_init_state init_state = IOMMU_START_STATE; static int amd_iommu_enable_interrupts(void); static int __init io
Re: [PATCH v4 00/13] iommu/amd: Add Generic IO Page Table Framework Support
On 1/27/21 7:06 PM, Joerg Roedel wrote: Hi Suravee, On Tue, Dec 15, 2020 at 01:36:52AM -0600, Suravee Suthikulpanit wrote: Suravee Suthikulpanit (13): iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline iommu/amd: Prepare for generic IO page table framework iommu/amd: Move pt_root to struct amd_io_pgtable iommu/amd: Convert to using amd_io_pgtable iommu/amd: Declare functions as extern iommu/amd: Move IO page table related functions iommu/amd: Restructure code for freeing page table iommu/amd: Remove amd_iommu_domain_get_pgtable iommu/amd: Rename variables to be consistent with struct io_pgtable_ops iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable iommu/amd: Introduce iommu_v1_iova_to_phys iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page iommu/amd: Adopt IO page table framework for AMD IOMMU v1 page table Applied this series, thanks for the work! Given testing goes well you can consider this queued for 5.12. Thanks, Joerg Thanks Joerg and Will, and welcome back!!! Suravee ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2] iommu/amd: Use IVHD EFR for early initialization of IOMMU features
IOMMU Extended Feature Register (EFR) is used to communicate the supported features for each IOMMU to the IOMMU driver. This is normally read from the PCI MMIO register offset 0x30, and used by the iommu_feature() helper function.

However, there are certain scenarios where the information is needed prior to PCI initialization, and the iommu_feature() function is used prematurely without warning. This has caused incorrect initialization of the IOMMU. This is the case for commit 6d39bdee238f ("iommu/amd: Enforce 4k mapping for certain IOMMU data structures").

Since the EFR is also available in the IVHD header, and is available to the driver prior to PCI initialization, default to using the IVHD EFR instead.

Fixes: 6d39bdee238f ("iommu/amd: Enforce 4k mapping for certain IOMMU data structures")
Reviewed-by: Robert Richter
Tested-by: Brijesh Singh
Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h       |  7 ++--
 drivers/iommu/amd/amd_iommu_types.h |  4 +++
 drivers/iommu/amd/init.c            | 56 +++--
 3 files changed, 60 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 6b8cbdf71714..b4adab698563 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -84,12 +84,9 @@ static inline bool is_rd890_iommu(struct pci_dev *pdev)
 	       (pdev->device == PCI_DEVICE_ID_RD890_IOMMU);
 }
 
-static inline bool iommu_feature(struct amd_iommu *iommu, u64 f)
+static inline bool iommu_feature(struct amd_iommu *iommu, u64 mask)
 {
-	if (!(iommu->cap & (1 << IOMMU_CAP_EFR)))
-		return false;
-
-	return !!(iommu->features & f);
+	return !!(iommu->features & mask);
 }
 
 static inline u64 iommu_virt_to_phys(void *vaddr)
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 553587827771..1a0495dd5fcb 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -387,6 +387,10 @@
 #define IOMMU_CAP_NPCACHE 26
 #define IOMMU_CAP_EFR     27
 
+/* IOMMU IVINFO */
+#define IOMMU_IVINFO_OFFSET     36
+#define IOMMU_IVINFO_EFRSUP     BIT(0)
+
 /* IOMMU Feature Reporting Field (for IVHD type 10h */
 #define IOMMU_FEAT_GASUP_SHIFT	6
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 6a1f7048dacc..83d8ab2aed9f 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -257,6 +257,8 @@ static void init_device_table_dma(void);
 
 static bool amd_iommu_pre_enabled = true;
 
+static u32 amd_iommu_ivinfo __initdata;
+
 bool translation_pre_enabled(struct amd_iommu *iommu)
 {
 	return (iommu->flags & AMD_IOMMU_FLAG_TRANS_PRE_ENABLED);
@@ -296,6 +298,18 @@ int amd_iommu_get_num_iommus(void)
 	return amd_iommus_present;
 }
 
+/*
+ * For IVHD type 0x11/0x40, EFR is also available via IVHD.
+ * Default to IVHD EFR since it is available sooner
+ * (i.e. before PCI init).
+ */
+static void __init early_iommu_features_init(struct amd_iommu *iommu,
+					      struct ivhd_header *h)
+{
+	if (amd_iommu_ivinfo & IOMMU_IVINFO_EFRSUP)
+		iommu->features = h->efr_reg;
+}
+
 /* Access to l1 and l2 indexed register spaces */
 
 static u32 iommu_read_l1(struct amd_iommu *iommu, u16 l1, u8 address)
@@ -1577,6 +1591,9 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
 
 		if (h->efr_reg & BIT(IOMMU_EFR_XTSUP_SHIFT))
 			amd_iommu_xt_mode = IRQ_REMAP_X2APIC_MODE;
+
+		early_iommu_features_init(iommu, h);
+
 		break;
 	default:
 		return -EINVAL;
@@ -1770,6 +1787,35 @@ static const struct attribute_group *amd_iommu_groups[] = {
 	NULL,
 };
 
+/*
+ * Note: IVHD 0x11 and 0x40 also contains exact copy
+ * of the IOMMU Extended Feature Register [MMIO Offset 0030h].
+ * Default to EFR in IVHD since it is available sooner (i.e. before PCI init).
+ */
+static void __init late_iommu_features_init(struct amd_iommu *iommu)
+{
+	u64 features;
+
+	if (!(iommu->cap & (1 << IOMMU_CAP_EFR)))
+		return;
+
+	/* read extended feature bits */
+	features = readq(iommu->mmio_base + MMIO_EXT_FEATURES);
+
+	if (!iommu->features) {
+		iommu->features = features;
+		return;
+	}
+
+	/*
+	 * Sanity check and warn if EFR values from
+	 * IVHD and MMIO conflict.
+	 */
+	if (features != iommu->features)
+		pr_warn(FW_WARN "EFR mismatch. Use IVHD EFR (%#llx : %#llx).\n",
+			features, iommu->features);
+}
+
 static int __init iommu_init_pci(struct amd_iommu *iommu)
 {
 	int cap_ptr = iommu->cap_ptr;
@@ -1789,8 +1835,7 @@ static int __init iommu_init_pci(struc
Re: [PATCH] iommu/amd: Make use of EFR from IVHD when available
I will send out v2 of this patch. Please ignore this v1. Thanks, Suravee On 1/18/21 12:19 PM, Suravee Suthikulpanit wrote: IOMMU Extended Feature Register (EFR) is used to communicate the supported features for each IOMMU to the IOMMU driver. This is normally read from the PCI MMIO register offset 0x30, and used by the iommu_feature() helper function. However, there are certain scenarios where the information is needed prior to PCI initialization, and the iommu_feature() function is used prematurely w/o warning. This has caused incorrect initialization of IOMMU. The EFR is also available in the IVHD header, and is available to the driver prior to PCI initialization. Therefore, default to using the IVHD EFR instead. Fixes: 6d39bdee238f ("iommu/amd: Enforce 4k mapping for certain IOMMU data structures") Tested-by: Brijesh Singh Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu.h | 3 ++- drivers/iommu/amd/amd_iommu_types.h | 4 +++ drivers/iommu/amd/init.c| 39 +++-- 3 files changed, 43 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index 6b8cbdf71714..0a89e9c4f7b3 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -86,7 +86,8 @@ static inline bool is_rd890_iommu(struct pci_dev *pdev) static inline bool iommu_feature(struct amd_iommu *iommu, u64 f) { - if (!(iommu->cap & (1 << IOMMU_CAP_EFR))) + /* features == 0 means EFR is not supported */ + if (!iommu->features) return false; return !!(iommu->features & f); diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 553587827771..35331e458dd1 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -387,6 +387,10 @@ #define IOMMU_CAP_NPCACHE 26 #define IOMMU_CAP_EFR 27 +/* IOMMU IVINFO */ +#define IOMMU_IVINFO_OFFSET 36 +#define IOMMU_IVINFO_EFRSUP_SHIFT0 + /* IOMMU Feature Reporting Field (for IVHD type 10h */ #define IOMMU_FEAT_GASUP_SHIFT6 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 6a1f7048dacc..28b1d2feec96 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -257,6 +257,8 @@ static void init_device_table_dma(void); static bool amd_iommu_pre_enabled = true; +static u32 amd_iommu_ivinfo; + bool translation_pre_enabled(struct amd_iommu *iommu) { return (iommu->flags & AMD_IOMMU_FLAG_TRANS_PRE_ENABLED); @@ -1577,6 +1579,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) if (h->efr_reg & BIT(IOMMU_EFR_XTSUP_SHIFT)) amd_iommu_xt_mode = IRQ_REMAP_X2APIC_MODE; + + /* +* For IVHD type 0x11/0x40, EFR is also available via IVHD. +* Default to IVHD EFR since it is available sooner +* (i.e. before PCI init). +*/ + if (amd_iommu_ivinfo & (1 << IOMMU_IVINFO_EFRSUP_SHIFT)) + iommu->features = h->efr_reg; break; default: return -EINVAL; @@ -1770,6 +1780,29 @@ static const struct attribute_group *amd_iommu_groups[] = { NULL, }; +/* + * Note: IVHD 0x11 and 0x40 also contains exact copy + * of the IOMMU Extended Feature Register [MMIO Offset 0030h]. + * Default to EFR in IVHD since it is available sooner (i.e. before PCI init). + * However, sanity check and warn if they conflict. + */ +static void __init iommu_init_features(struct amd_iommu *iommu) +{ + u64 features; + + if (!(iommu->cap & (1 << IOMMU_CAP_EFR))) + return; + + /* read extended feature bits */ + features = readq(iommu->mmio_base + MMIO_EXT_FEATURES); + + if (iommu->features && (features != iommu->features)) + pr_err(FW_BUG "EFR mismatch. 
Use IVHD EFR (%#llx : %#llx\n).", + features, iommu->features); + else + iommu->features = features; +} + static int __init iommu_init_pci(struct amd_iommu *iommu) { int cap_ptr = iommu->cap_ptr; @@ -1789,8 +1822,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu) if (!(iommu->cap & (1 << IOMMU_CAP_IOTLB))) amd_iommu_iotlb_sup = false; - /* read extended feature bits */ - iommu->features = readq(iommu->mmio_base + MMIO_EXT_FEATURES); + iommu_init_features(iommu); if (iommu_feature(iommu, FEATURE_GT)) { int glxval; @@ -2661,6 +2693,9 @@ static int __init early_amd_iommu_init(void) if (ret) goto out; + /* Store IVRS IVinfo field. */ + amd_iommu_ivinfo = *((u32 *)((u8 *)ivrs_base + IOMMU_IVINFO_OFFS
[PATCH] iommu/amd: Make use of EFR from IVHD when available
IOMMU Extended Feature Register (EFR) is used to communicate the supported features for each IOMMU to the IOMMU driver. This is normally read from the PCI MMIO register offset 0x30, and used by the iommu_feature() helper function. However, there are certain scenarios where the information is needed prior to PCI initialization, and the iommu_feature() function is used prematurely w/o warning. This has caused incorrect initialization of IOMMU. The EFR is also available in the IVHD header, and is available to the driver prior to PCI initialization. Therefore, default to using the IVHD EFR instead. Fixes: 6d39bdee238f ("iommu/amd: Enforce 4k mapping for certain IOMMU data structures") Tested-by: Brijesh Singh Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu.h | 3 ++- drivers/iommu/amd/amd_iommu_types.h | 4 +++ drivers/iommu/amd/init.c| 39 +++-- 3 files changed, 43 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index 6b8cbdf71714..0a89e9c4f7b3 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -86,7 +86,8 @@ static inline bool is_rd890_iommu(struct pci_dev *pdev) static inline bool iommu_feature(struct amd_iommu *iommu, u64 f) { - if (!(iommu->cap & (1 << IOMMU_CAP_EFR))) + /* features == 0 means EFR is not supported */ + if (!iommu->features) return false; return !!(iommu->features & f); diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 553587827771..35331e458dd1 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -387,6 +387,10 @@ #define IOMMU_CAP_NPCACHE 26 #define IOMMU_CAP_EFR 27 +/* IOMMU IVINFO */ +#define IOMMU_IVINFO_OFFSET 36 +#define IOMMU_IVINFO_EFRSUP_SHIFT0 + /* IOMMU Feature Reporting Field (for IVHD type 10h */ #define IOMMU_FEAT_GASUP_SHIFT 6 diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 6a1f7048dacc..28b1d2feec96 100644 --- 
a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -257,6 +257,8 @@ static void init_device_table_dma(void); static bool amd_iommu_pre_enabled = true; +static u32 amd_iommu_ivinfo; + bool translation_pre_enabled(struct amd_iommu *iommu) { return (iommu->flags & AMD_IOMMU_FLAG_TRANS_PRE_ENABLED); @@ -1577,6 +1579,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) if (h->efr_reg & BIT(IOMMU_EFR_XTSUP_SHIFT)) amd_iommu_xt_mode = IRQ_REMAP_X2APIC_MODE; + + /* +* For IVHD type 0x11/0x40, EFR is also available via IVHD. +* Default to IVHD EFR since it is available sooner +* (i.e. before PCI init). +*/ + if (amd_iommu_ivinfo & (1 << IOMMU_IVINFO_EFRSUP_SHIFT)) + iommu->features = h->efr_reg; break; default: return -EINVAL; @@ -1770,6 +1780,29 @@ static const struct attribute_group *amd_iommu_groups[] = { NULL, }; +/* + * Note: IVHD 0x11 and 0x40 also contains exact copy + * of the IOMMU Extended Feature Register [MMIO Offset 0030h]. + * Default to EFR in IVHD since it is available sooner (i.e. before PCI init). + * However, sanity check and warn if they conflict. + */ +static void __init iommu_init_features(struct amd_iommu *iommu) +{ + u64 features; + + if (!(iommu->cap & (1 << IOMMU_CAP_EFR))) + return; + + /* read extended feature bits */ + features = readq(iommu->mmio_base + MMIO_EXT_FEATURES); + + if (iommu->features && (features != iommu->features)) + pr_err(FW_BUG "EFR mismatch. 
Use IVHD EFR (%#llx : %#llx\n).", + features, iommu->features); + else + iommu->features = features; +} + static int __init iommu_init_pci(struct amd_iommu *iommu) { int cap_ptr = iommu->cap_ptr; @@ -1789,8 +1822,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu) if (!(iommu->cap & (1 << IOMMU_CAP_IOTLB))) amd_iommu_iotlb_sup = false; - /* read extended feature bits */ - iommu->features = readq(iommu->mmio_base + MMIO_EXT_FEATURES); + iommu_init_features(iommu); if (iommu_feature(iommu, FEATURE_GT)) { int glxval; @@ -2661,6 +2693,9 @@ static int __init early_amd_iommu_init(void) if (ret) goto out; + /* Store IVRS IVinfo field. */ + amd_iommu_ivinfo = *((u32 *)((u8 *)ivrs_base + IOMMU_IVINFO_OFFSET)); + amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base); DUMP_printk("Using IVHD type %#x\n", a
Re: [PATCH v4 00/13] iommu/amd: Add Generic IO Page Table Framework Support
Hi Joerg / Will,

Happy New Year!! Just want to follow up on this series.

Thanks,
Suravee

On 12/15/20 2:36 PM, Suravee Suthikulpanit wrote:
> The framework allows callable implementation of IO page table. This allows AMD IOMMU driver to switch between different types of AMD IOMMU page tables (e.g. v1 vs. v2).
>
> This series refactors the current implementation of AMD IOMMU v1 page table to adopt the framework. There should be no functional change. Subsequent series will introduce support for the AMD IOMMU v2 page table.
>
> Thanks,
> Suravee
>
> Change from V3 (https://lore.kernel.org/linux-iommu/20201004014549.16065-1-suravee.suthikulpa...@amd.com/)
> - Rebase to v5.10
> - Patch 2: Add struct iommu_flush_ops (previously in patch 13 of v3)
> - Patch 7: Consolidate logic into v1_free_pgtable() instead of amd_iommu_free_pgtable()
> - Patch 12: Check ops->[map|unmap] before calling.
> - Patch 13: Setup page table when allocating domain (instead of when attaching device).
>
> Change from V2 (https://lore.kernel.org/lkml/835c0d46-ed96-9fbe-856a-777dcffac...@amd.com/T/#t)
> - Patch 2: Introduce helper function io_pgtable_cfg_to_data.
> - Patch 13: Put back the struct iommu_flush_ops since patch v2 would run into NULL pointer bug when calling free_io_pgtable_ops if not defined.
>
> Change from V1 (https://lkml.org/lkml/2020/9/23/251)
> - Do not specify struct io_pgtable_cfg.coherent_walk, since it is not currently used. (per Robin)
> - Remove unused struct iommu_flush_ops. (patch 2/13)
> - Move amd_iommu_setup_io_pgtable_ops to iommu.c instead of io_pgtable.c (patch 13/13)
>
> Suravee Suthikulpanit (13):
>   iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
>   iommu/amd: Prepare for generic IO page table framework
>   iommu/amd: Move pt_root to struct amd_io_pgtable
>   iommu/amd: Convert to using amd_io_pgtable
>   iommu/amd: Declare functions as extern
>   iommu/amd: Move IO page table related functions
>   iommu/amd: Restructure code for freeing page table
>   iommu/amd: Remove amd_iommu_domain_get_pgtable
>   iommu/amd: Rename variables to be consistent with struct io_pgtable_ops
>   iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
>   iommu/amd: Introduce iommu_v1_iova_to_phys
>   iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
>   iommu/amd: Adopt IO page table framework for AMD IOMMU v1 page table
>
>  drivers/iommu/amd/Kconfig           |   1 +
>  drivers/iommu/amd/Makefile          |   2 +-
>  drivers/iommu/amd/amd_iommu.h       |  22 +
>  drivers/iommu/amd/amd_iommu_types.h |  43 +-
>  drivers/iommu/amd/init.c            |   2 +
>  drivers/iommu/amd/io_pgtable.c      | 564 +++
>  drivers/iommu/amd/iommu.c           | 672
>  drivers/iommu/io-pgtable.c          |   3 +
>  include/linux/io-pgtable.h          |   2 +
>  9 files changed, 707 insertions(+), 604 deletions(-)
>  create mode 100644 drivers/iommu/amd/io_pgtable.c
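The "callable implementation" idea in the cover letter above, together with the note about checking ops->[map|unmap] before calling, can be illustrated with a small function-pointer table. The following is a user-space sketch only, under loud assumptions: `struct toy_pgtable_ops`, `struct toy_v1_pgtable`, and every `toy_*` function are hypothetical, hold a single translation, and are far simpler than the kernel's struct io_pgtable_ops.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, much-reduced ops table; NOT the kernel's struct io_pgtable_ops. */
struct toy_pgtable_ops {
	int (*map)(struct toy_pgtable_ops *ops, unsigned long iova,
		   unsigned long paddr);
	unsigned long (*iova_to_phys)(struct toy_pgtable_ops *ops,
				      unsigned long iova);
};

/* Toy "v1" format: remembers a single iova -> paddr translation. */
struct toy_v1_pgtable {
	struct toy_pgtable_ops ops;	/* embedded so ops can be mapped back */
	unsigned long iova;
	unsigned long paddr;
	int valid;
};

/* Recover the containing page table from its embedded ops member. */
static struct toy_v1_pgtable *ops_to_v1(struct toy_pgtable_ops *ops)
{
	return (struct toy_v1_pgtable *)((char *)ops -
					 offsetof(struct toy_v1_pgtable, ops));
}

static int toy_v1_map(struct toy_pgtable_ops *ops, unsigned long iova,
		      unsigned long paddr)
{
	struct toy_v1_pgtable *pt = ops_to_v1(ops);

	pt->iova = iova;
	pt->paddr = paddr;
	pt->valid = 1;
	return 0;
}

static unsigned long toy_v1_iova_to_phys(struct toy_pgtable_ops *ops,
					 unsigned long iova)
{
	struct toy_v1_pgtable *pt = ops_to_v1(ops);

	return (pt->valid && pt->iova == iova) ? pt->paddr : 0;
}

static void toy_v1_init(struct toy_v1_pgtable *pt)
{
	assert(pt != NULL);
	pt->ops.map = toy_v1_map;
	pt->ops.iova_to_phys = toy_v1_iova_to_phys;
	pt->valid = 0;
}

/* Caller side: verify the callback exists before dispatching to it. */
static int toy_domain_map(struct toy_pgtable_ops *ops, unsigned long iova,
			  unsigned long paddr)
{
	if (!ops || !ops->map)
		return -EINVAL;
	return ops->map(ops, iova, paddr);
}
```

A second page table format would supply its own callbacks in the same table, so the caller in `toy_domain_map()` would not need to change.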
[PATCH v4 10/13] iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
Refactor to simplify the fetch_pte function. There is no functional change.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h  |  2 +-
 drivers/iommu/amd/io_pgtable.c | 13 +++--
 drivers/iommu/amd/iommu.c      |  4 +++-
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 76276d9e463c..83ca822c5349 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -143,7 +143,7 @@ extern int iommu_map_page(struct protection_domain *dom,
 extern unsigned long iommu_unmap_page(struct protection_domain *dom,
 				      unsigned long bus_addr,
 				      unsigned long page_size);
-extern u64 *fetch_pte(struct protection_domain *domain,
+extern u64 *fetch_pte(struct amd_io_pgtable *pgtable,
 		      unsigned long address,
 		      unsigned long *page_size);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 35dd9153e6b7..87184b6cee0f 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -317,7 +317,7 @@ static u64 *alloc_pte(struct protection_domain *domain,
  * This function checks if there is a PTE for a given dma address. If
  * there is one, it returns the pointer to it.
  */
-u64 *fetch_pte(struct protection_domain *domain,
+u64 *fetch_pte(struct amd_io_pgtable *pgtable,
 	       unsigned long address,
 	       unsigned long *page_size)
 {
@@ -326,11 +326,11 @@ u64 *fetch_pte(struct protection_domain *domain,
 
 	*page_size = 0;
 
-	if (address > PM_LEVEL_SIZE(domain->iop.mode))
+	if (address > PM_LEVEL_SIZE(pgtable->mode))
 		return NULL;
 
-	level	   =  domain->iop.mode - 1;
-	pte	   = &domain->iop.root[PM_LEVEL_INDEX(level, address)];
+	level	   =  pgtable->mode - 1;
+	pte	   = &pgtable->root[PM_LEVEL_INDEX(level, address)];
 	*page_size =  PTE_LEVEL_PAGE_SIZE(level);
 
 	while (level > 0) {
@@ -465,6 +465,8 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
 			       unsigned long iova,
 			       unsigned long size)
 {
+	struct io_pgtable_ops *ops = &dom->iop.iop.ops;
+	struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
 	unsigned long long unmapped;
 	unsigned long unmap_size;
 	u64 *pte;
@@ -474,8 +476,7 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
 	unmapped = 0;
 
 	while (unmapped < size) {
-		pte = fetch_pte(dom, iova, &unmap_size);
-
+		pte = fetch_pte(pgtable, iova, &unmap_size);
 		if (pte) {
 			int i, count;
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2963a37b7c16..76f61dd6b89f 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2100,13 +2100,15 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 					  dma_addr_t iova)
 {
 	struct protection_domain *domain = to_pdomain(dom);
+	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
+	struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
 	unsigned long offset_mask, pte_pgsize;
 	u64 *pte, __pte;
 
 	if (domain->iop.mode == PAGE_MODE_NONE)
 		return iova;
 
-	pte = fetch_pte(domain, iova, &pte_pgsize);
+	pte = fetch_pte(pgtable, iova, &pte_pgsize);
 	if (!pte || !IOMMU_PTE_PRESENT(*pte))
 		return 0;
 
-- 
2.17.1
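As a reading aid for the patch above, the shape of a walk driven only by a page table's mode (number of levels) and root pointer can be sketched in user-space C. This is a toy under stated assumptions, not the driver's fetch_pte: `struct toy_pgtable`, the `TOY_*` macros, and the entry encoding (next-table pointer with bit 0 as a present flag) are hypothetical. It also omits the address range check, page size reporting, and large-page handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PRESENT		1ULL
/* Assumed layout: 4 KiB pages and 9 index bits (512 entries) per level. */
#define TOY_LEVEL_SHIFT(l)	(12 + 9 * (l))
#define TOY_LEVEL_INDEX(l, a)	(((a) >> TOY_LEVEL_SHIFT(l)) & 0x1ffULL)

/* Hypothetical stand-in holding just what the walk needs. */
struct toy_pgtable {
	int mode;		/* number of levels */
	uint64_t *root;		/* top-level table, 512 entries */
};

/*
 * Walk from the top level down to level 0 and return a pointer to the
 * final entry, or NULL if an intermediate entry is not present.
 */
static uint64_t *toy_fetch_pte(struct toy_pgtable *pgtable, uint64_t address)
{
	uint64_t *pte, *table;
	int level;

	assert(pgtable != NULL && pgtable->root != NULL && pgtable->mode > 0);

	level = pgtable->mode - 1;
	pte = &pgtable->root[TOY_LEVEL_INDEX(level, address)];

	while (level > 0) {
		if (!(*pte & TOY_PRESENT))
			return NULL;

		level--;
		/* Toy encoding: the entry stores the next table's address. */
		table = (uint64_t *)(uintptr_t)(*pte & ~TOY_PRESENT);
		pte = &table[TOY_LEVEL_INDEX(level, address)];
	}

	return pte;
}
```

Because the walk needs only the mode and root, passing a page table structure instead of the whole domain, as the patch does, keeps the function independent of the surrounding domain state.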
[PATCH v4 12/13] iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
These implement map and unmap for AMD IOMMU v1 pagetable, which will be used by the IO pagetable framework.

Also clean up unused extern function declarations.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h  | 13 -
 drivers/iommu/amd/io_pgtable.c | 25 -
 drivers/iommu/amd/iommu.c      | 13 -
 3 files changed, 20 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 83ca822c5349..3770b1a4d51c 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -133,19 +133,6 @@ void amd_iommu_apply_ivrs_quirks(void);
 static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif
 
-/* TODO: These are temporary and will be removed once fully transition */
-extern int iommu_map_page(struct protection_domain *dom,
-			  unsigned long bus_addr,
-			  unsigned long phys_addr,
-			  unsigned long page_size,
-			  int prot,
-			  gfp_t gfp);
-extern unsigned long iommu_unmap_page(struct protection_domain *dom,
-				      unsigned long bus_addr,
-				      unsigned long page_size);
-extern u64 *fetch_pte(struct amd_io_pgtable *pgtable,
-		      unsigned long address,
-		      unsigned long *page_size);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 					 u64 *root, int mode);
 #endif
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index a293b69b38b9..d91964e98d58 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -317,9 +317,9 @@ static u64 *alloc_pte(struct protection_domain *domain,
  * This function checks if there is a PTE for a given dma address. If
  * there is one, it returns the pointer to it.
  */
-u64 *fetch_pte(struct amd_io_pgtable *pgtable,
-	       unsigned long address,
-	       unsigned long *page_size)
+static u64 *fetch_pte(struct amd_io_pgtable *pgtable,
+		      unsigned long address,
+		      unsigned long *page_size)
 {
 	int level;
 	u64 *pte;
@@ -392,13 +392,10 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
  * supporting all features of AMD IOMMU page tables like level skipping
  * and full 64 bit address spaces.
  */
-int iommu_map_page(struct protection_domain *dom,
-		   unsigned long iova,
-		   unsigned long paddr,
-		   unsigned long size,
-		   int prot,
-		   gfp_t gfp)
+static int iommu_v1_map_page(struct io_pgtable_ops *ops, unsigned long iova,
+			     phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
 {
+	struct protection_domain *dom = io_pgtable_ops_to_domain(ops);
 	struct page *freelist = NULL;
 	bool updated = false;
 	u64 __pte, *pte;
@@ -461,11 +458,11 @@ int iommu_map_page(struct protection_domain *dom,
 	return ret;
 }
 
-unsigned long iommu_unmap_page(struct protection_domain *dom,
-			       unsigned long iova,
-			       unsigned long size)
+static unsigned long iommu_v1_unmap_page(struct io_pgtable_ops *ops,
+					 unsigned long iova,
+					 size_t size,
+					 struct iommu_iotlb_gather *gather)
 {
-	struct io_pgtable_ops *ops = &dom->iop.iop.ops;
 	struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
 	unsigned long long unmapped;
 	unsigned long unmap_size;
@@ -554,6 +551,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
 	cfg->oas            = IOMMU_OUT_ADDR_BIT_SIZE,
 	cfg->tlb            = &v1_flush_ops;
 
+	pgtable->iop.ops.map          = iommu_v1_map_page;
+	pgtable->iop.ops.unmap        = iommu_v1_unmap_page;
 	pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
 
 	return &pgtable->iop;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 29b7fefc8485..1f04b251f0c6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2066,8 +2066,9 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
 			 gfp_t gfp)
 {
 	struct protection_domain *domain = to_pdomain(dom);
+	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
 	int prot = 0;
-	int ret;
+	int ret = -EINVAL;
 
 	if (domain->iop.mode == PAGE_MODE_NONE)
 		return -EINVAL;
@@ -2077,9 +2078,10 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
 	if (iommu_prot & IOMMU_WRITE)
 		prot |= IOMMU_PROT_IW;
 
-	ret = iommu_map_page(domain, io
[PATCH v4 09/13] iommu/amd: Rename variables to be consistent with struct io_pgtable_ops
There is no functional change.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/io_pgtable.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index d4d131e43dcd..35dd9153e6b7 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -393,9 +393,9 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
  * and full 64 bit address spaces.
  */
 int iommu_map_page(struct protection_domain *dom,
-		   unsigned long bus_addr,
-		   unsigned long phys_addr,
-		   unsigned long page_size,
+		   unsigned long iova,
+		   unsigned long paddr,
+		   unsigned long size,
 		   int prot,
 		   gfp_t gfp)
 {
@@ -404,15 +404,15 @@ int iommu_map_page(struct protection_domain *dom,
 	u64 __pte, *pte;
 	int ret, i, count;
 
-	BUG_ON(!IS_ALIGNED(bus_addr, page_size));
-	BUG_ON(!IS_ALIGNED(phys_addr, page_size));
+	BUG_ON(!IS_ALIGNED(iova, size));
+	BUG_ON(!IS_ALIGNED(paddr, size));
 
 	ret = -EINVAL;
 	if (!(prot & IOMMU_PROT_MASK))
 		goto out;
 
-	count = PAGE_SIZE_PTE_COUNT(page_size);
-	pte   = alloc_pte(dom, bus_addr, page_size, NULL, gfp, &updated);
+	count = PAGE_SIZE_PTE_COUNT(size);
+	pte   = alloc_pte(dom, iova, size, NULL, gfp, &updated);
 
 	ret = -ENOMEM;
 	if (!pte)
@@ -425,10 +425,10 @@ int iommu_map_page(struct protection_domain *dom,
 		updated = true;
 
 	if (count > 1) {
-		__pte = PAGE_SIZE_PTE(__sme_set(phys_addr), page_size);
+		__pte = PAGE_SIZE_PTE(__sme_set(paddr), size);
 		__pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_PR | IOMMU_PTE_FC;
 	} else
-		__pte = __sme_set(phys_addr) | IOMMU_PTE_PR | IOMMU_PTE_FC;
+		__pte = __sme_set(paddr) | IOMMU_PTE_PR | IOMMU_PTE_FC;
 
 	if (prot & IOMMU_PROT_IR)
 		__pte |= IOMMU_PTE_IR;
@@ -462,20 +462,19 @@ int iommu_map_page(struct protection_domain *dom,
 }
 
 unsigned long iommu_unmap_page(struct protection_domain *dom,
-			       unsigned long bus_addr,
-			       unsigned long page_size)
+			       unsigned long iova,
+			       unsigned long size)
 {
 	unsigned long long unmapped;
 	unsigned long unmap_size;
 	u64 *pte;
 
-	BUG_ON(!is_power_of_2(page_size));
+	BUG_ON(!is_power_of_2(size));
 
 	unmapped = 0;
 
-	while (unmapped < page_size) {
-
-		pte = fetch_pte(dom, bus_addr, &unmap_size);
+	while (unmapped < size) {
+		pte = fetch_pte(dom, iova, &unmap_size);
 
 		if (pte) {
 			int i, count;
@@ -485,7 +484,7 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
 				pte[i] = 0ULL;
 		}
 
-		bus_addr  = (bus_addr & ~(unmap_size - 1)) + unmap_size;
+		iova = (iova & ~(unmap_size - 1)) + unmap_size;
 		unmapped += unmap_size;
 	}
 
-- 
2.17.1
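The address-advance step at the end of the unmap loop above, `iova = (iova & ~(unmap_size - 1)) + unmap_size`, rounds the address down to the start of the page that was just handled and then steps to the start of the next one. A minimal stand-alone illustration follows, assuming power-of-two page sizes. The helper name `next_page_start()` is hypothetical and does not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round iova down to the start of its unmap_size-sized page, then
 * advance by one page. unmap_size must be a nonzero power of two.
 */
static uint64_t next_page_start(uint64_t iova, uint64_t unmap_size)
{
	assert(unmap_size != 0 && (unmap_size & (unmap_size - 1)) == 0);

	return (iova & ~(unmap_size - 1)) + unmap_size;
}
```

For example, with 4 KiB pages an address of 0x1234 advances to 0x2000, and an already aligned 0x2000 advances to 0x3000, so the loop always makes progress by at least one byte and at most one page.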
[PATCH v4 08/13] iommu/amd: Remove amd_iommu_domain_get_pgtable
Since the IO page table root and mode parameters have been moved into
struct amd_io_pgtable, the function is no longer needed. Therefore,
remove it along with the struct domain_pgtable.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h       |  4 ++--
 drivers/iommu/amd/amd_iommu_types.h |  6 -
 drivers/iommu/amd/io_pgtable.c      | 36 ++---
 drivers/iommu/amd/iommu.c           | 34 ---
 4 files changed, 19 insertions(+), 61 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 91d098003f12..76276d9e463c 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -110,6 +110,8 @@ static inline
 void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
 {
 	atomic64_set(&domain->iop.pt_root, root);
+	domain->iop.root = (u64 *)(root & PAGE_MASK);
+	domain->iop.mode = root & 7; /* lowest 3 bits encode pgtable mode */
 }
 
 static inline
@@ -144,8 +146,6 @@ extern unsigned long iommu_unmap_page(struct protection_domain *dom,
 extern u64 *fetch_pte(struct protection_domain *domain,
 		      unsigned long address,
 		      unsigned long *page_size);
-extern void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
-					 struct domain_pgtable *pgtable);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 					 u64 *root, int mode);
 #endif
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 7c971c76d685..6897567d307e 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -518,12 +518,6 @@ struct protection_domain {
 	unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
 };
 
-/* For decocded pt_root */
-struct domain_pgtable {
-	int mode;
-	u64 *root;
-};
-
 /*
  * Structure where we save information about one hardware AMD IOMMU in the
  * system.
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index dc674e79ddf0..d4d131e43dcd 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -184,30 +184,27 @@ static bool increase_address_space(struct protection_domain *domain,
 				   unsigned long address,
 				   gfp_t gfp)
 {
-	struct domain_pgtable pgtable;
 	unsigned long flags;
 	bool ret = true;
 	u64 *pte;
 
 	spin_lock_irqsave(&domain->lock, flags);
 
-	amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-	if (address <= PM_LEVEL_SIZE(pgtable.mode))
+	if (address <= PM_LEVEL_SIZE(domain->iop.mode))
 		goto out;
 
 	ret = false;
-	if (WARN_ON_ONCE(pgtable.mode == PAGE_MODE_6_LEVEL))
+	if (WARN_ON_ONCE(domain->iop.mode == PAGE_MODE_6_LEVEL))
 		goto out;
 
 	pte = (void *)get_zeroed_page(gfp);
 	if (!pte)
 		goto out;
 
-	*pte = PM_LEVEL_PDE(pgtable.mode, iommu_virt_to_phys(pgtable.root));
+	*pte = PM_LEVEL_PDE(domain->iop.mode, iommu_virt_to_phys(domain->iop.root));
 
-	pgtable.root = pte;
-	pgtable.mode += 1;
+	domain->iop.root = pte;
+	domain->iop.mode += 1;
 	amd_iommu_update_and_flush_device_table(domain);
 	amd_iommu_domain_flush_complete(domain);
 
@@ -215,7 +212,7 @@ static bool increase_address_space(struct protection_domain *domain,
 	 * Device Table needs to be updated and flushed before the new root can
 	 * be published.
 	 */
-	amd_iommu_domain_set_pgtable(domain, pte, pgtable.mode);
+	amd_iommu_domain_set_pgtable(domain, pte, domain->iop.mode);
 
 	ret = true;
 
@@ -232,29 +229,23 @@ static u64 *alloc_pte(struct protection_domain *domain,
 		      gfp_t gfp,
 		      bool *updated)
 {
-	struct domain_pgtable pgtable;
 	int level, end_lvl;
 	u64 *pte, *page;
 
 	BUG_ON(!is_power_of_2(page_size));
 
-	amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-	while (address > PM_LEVEL_SIZE(pgtable.mode)) {
+	while (address > PM_LEVEL_SIZE(domain->iop.mode)) {
 		/*
 		 * Return an error if there is no memory to update the
 		 * page-table.
 		 */
 		if (!increase_address_space(domain, address, gfp))
 			return NULL;
-
-		/* Read new values to check if update was successful */
-		amd_iommu_domain_get_pgtable(domain, &pgtable);
 	}
 
-	level = pgtable.mode - 1;
-	pte = &pgtable.root[PM_LEVEL_INDEX(level, address)];
+	level = domain->iop.mode - 1;
+	pte = &domain->iop.root[PM_LEVEL_INDEX(level, address)];
 	address = PAGE_SIZE_ALIGN(address, page_size);
 	end
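For readers unfamiliar with the pt_root layout used above, the following stand-alone sketch shows how a page-aligned root address and a 3-bit mode can share one 64-bit value. The sketch_* names and the 4 KiB page mask are assumptions made for illustration; only the decode rule (root = value & PAGE_MASK, mode = value & 7) is taken from the patch.

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages, so the low 12 bits are free. */
#define SKETCH_PAGE_MASK (~0xFFFULL)
#define SKETCH_MODE_MASK 7ULL	/* lowest 3 bits encode pgtable mode */

/* Pack a page-aligned root address and a paging mode into one word. */
static inline uint64_t sketch_encode_pt_root(uint64_t root_addr, unsigned int mode)
{
	return (root_addr & SKETCH_PAGE_MASK) | ((uint64_t)mode & SKETCH_MODE_MASK);
}

/* Recover the root address, as the patch does with (root & PAGE_MASK). */
static inline uint64_t sketch_pt_root_addr(uint64_t pt_root)
{
	return pt_root & SKETCH_PAGE_MASK;
}

/* Recover the mode, as the patch does with (root & 7). */
static inline unsigned int sketch_pt_root_mode(uint64_t pt_root)
{
	return (unsigned int)(pt_root & SKETCH_MODE_MASK);
}
```

Packing both fields into one word is what allows the pair to be published with a single atomic64_set(); the patch additionally caches the decoded fields in domain->iop.root and domain->iop.mode.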
[PATCH v4 13/13] iommu/amd: Adopt IO page table framework for AMD IOMMU v1 page table
Switch to using the IO page table framework for the AMD IOMMU v1 page table.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/init.c      |  2 ++
 drivers/iommu/amd/iommu.c     | 48 ++-
 3 files changed, 39 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 3770b1a4d51c..91452e0ff072 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -36,6 +36,7 @@ extern void amd_iommu_disable(void);
 extern int amd_iommu_reenable(int);
 extern int amd_iommu_enable_faulting(void);
 extern int amd_iommu_guest_ir;
+extern enum io_pgtable_fmt amd_iommu_pgtable;
 
 /* IOMMUv2 specific functions */
 struct iommu_domain;
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 23a790f8f550..5fb4bea14cc4 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -147,6 +147,8 @@ struct ivmd_header {
 bool amd_iommu_dump;
 bool amd_iommu_irq_remap __read_mostly;
 
+enum io_pgtable_fmt amd_iommu_pgtable = AMD_IOMMU_V1;
+
 int amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_VAPIC;
 static int amd_iommu_xt_mode = IRQ_REMAP_XAPIC_MODE;
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 1f04b251f0c6..571e8806e4a1 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -31,6 +31,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1901,7 +1902,7 @@ static void protection_domain_free(struct protection_domain *domain)
 	kfree(domain);
 }
 
-static int protection_domain_init(struct protection_domain *domain, int mode)
+static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 {
 	u64 *pt_root = NULL;
 
@@ -1924,34 +1925,55 @@ static int protection_domain_init(struct protection_domain *domain, int mode)
 	return 0;
 }
 
-static struct protection_domain *protection_domain_alloc(int mode)
+static struct protection_domain *protection_domain_alloc(unsigned int type)
 {
+	struct io_pgtable_ops *pgtbl_ops;
 	struct protection_domain *domain;
+	int pgtable = amd_iommu_pgtable;
+	int mode = DEFAULT_PGTABLE_LEVEL;
+	int ret;
 
 	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
 	if (!domain)
 		return NULL;
 
-	if (protection_domain_init(domain, mode))
+	/*
+	 * Force IOMMU v1 page table when iommu=pt and
+	 * when allocating domain for pass-through devices.
+	 */
+	if (type == IOMMU_DOMAIN_IDENTITY) {
+		pgtable = AMD_IOMMU_V1;
+		mode = PAGE_MODE_NONE;
+	} else if (type == IOMMU_DOMAIN_UNMANAGED) {
+		pgtable = AMD_IOMMU_V1;
+	}
+
+	switch (pgtable) {
+	case AMD_IOMMU_V1:
+		ret = protection_domain_init_v1(domain, mode);
+		break;
+	default:
+		ret = -EINVAL;
+	}
+
+	if (ret)
 		goto out_err;
 
-	return domain;
+	pgtbl_ops = alloc_io_pgtable_ops(pgtable, &domain->iop.pgtbl_cfg, domain);
+	if (!pgtbl_ops)
+		goto out_err;
 
+	return domain;
 out_err:
 	kfree(domain);
-
 	return NULL;
 }
 
 static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
 {
 	struct protection_domain *domain;
-	int mode = DEFAULT_PGTABLE_LEVEL;
-
-	if (type == IOMMU_DOMAIN_IDENTITY)
-		mode = PAGE_MODE_NONE;
 
-	domain = protection_domain_alloc(mode);
+	domain = protection_domain_alloc(type);
 	if (!domain)
 		return NULL;
 
@@ -2070,7 +2092,8 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
 	int prot = 0;
 	int ret = -EINVAL;
 
-	if (domain->iop.mode == PAGE_MODE_NONE)
+	if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+	    (domain->iop.mode == PAGE_MODE_NONE))
 		return -EINVAL;
 
 	if (iommu_prot & IOMMU_READ)
@@ -2093,7 +2116,8 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
 	struct protection_domain *domain = to_pdomain(dom);
 	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
 
-	if (domain->iop.mode == PAGE_MODE_NONE)
+	if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+	    (domain->iop.mode == PAGE_MODE_NONE))
 		return 0;
 
 	return (ops->unmap) ? ops->unmap(ops, iova, page_size, gather) : 0;
-- 
2.17.1
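The format and mode selection in protection_domain_alloc() above can be read as a small pure function. The sketch below restates that decision outside the kernel; every sketch_* and SKETCH_* name is a hypothetical stand-in, and the numeric values chosen for the "no translation" mode and the default level are placeholders rather than the driver's definitions.

```c
/* Hypothetical stand-ins for the kernel's domain types and table formats. */
enum sketch_domain_type {
	SKETCH_DOMAIN_IDENTITY,
	SKETCH_DOMAIN_UNMANAGED,
	SKETCH_DOMAIN_DMA,
};

enum sketch_pgtable_fmt {
	SKETCH_FMT_V1,
	SKETCH_FMT_V2,
};

#define SKETCH_PAGE_MODE_NONE	0	/* placeholder value */
#define SKETCH_DEFAULT_LEVEL	3	/* placeholder value */

struct sketch_choice {
	enum sketch_pgtable_fmt fmt;
	int mode;
};

/*
 * Mirror of the decision in the patch: identity and unmanaged domains are
 * forced onto the v1 table, and identity domains additionally get the
 * "no translation" mode. Other domain types use the global default format.
 */
static struct sketch_choice sketch_pick_pgtable(enum sketch_domain_type type,
						enum sketch_pgtable_fmt global_fmt)
{
	struct sketch_choice c = { global_fmt, SKETCH_DEFAULT_LEVEL };

	if (type == SKETCH_DOMAIN_IDENTITY) {
		c.fmt = SKETCH_FMT_V1;
		c.mode = SKETCH_PAGE_MODE_NONE;
	} else if (type == SKETCH_DOMAIN_UNMANAGED) {
		c.fmt = SKETCH_FMT_V1;
	}

	return c;
}
```

In this series the global default is itself AMD_IOMMU_V1, so the override only becomes observable once a second format is introduced.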
[PATCH v4 01/13] iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
Move the function to the header file so that it can be used from other files.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h | 13 +
 drivers/iommu/amd/iommu.c     | 10 --
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 6b8cbdf71714..0817bc732d1a 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -102,6 +102,19 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
 	return phys_to_virt(__sme_clr(paddr));
 }
 
+static inline
+void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
+{
+	atomic64_set(&domain->pt_root, root);
+}
+
+static inline
+void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
+{
+	amd_iommu_domain_set_pt_root(domain, 0);
+}
+
+
 extern bool translation_pre_enabled(struct amd_iommu *iommu);
 extern bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
 					 struct device *dev);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b9cf59443843..7f6b0f60b958 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -161,16 +161,6 @@ static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
 	pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
 }
 
-static void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
-{
-	atomic64_set(&domain->pt_root, root);
-}
-
-static void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
-{
-	amd_iommu_domain_set_pt_root(domain, 0);
-}
-
 static void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 					 u64 *root, int mode)
 {
-- 
2.17.1
[PATCH v4 11/13] iommu/amd: Introduce iommu_v1_iova_to_phys
This implements iova_to_phys for the AMD IOMMU v1 page table, which will be
used by the IO page table framework.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/io_pgtable.c | 22 ++
 drivers/iommu/amd/iommu.c      | 16 +---
 2 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 87184b6cee0f..a293b69b38b9 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -494,6 +494,26 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
 	return unmapped;
 }
 
+static phys_addr_t iommu_v1_iova_to_phys(struct io_pgtable_ops *ops, unsigned long iova)
+{
+	struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
+	unsigned long offset_mask, pte_pgsize;
+	u64 *pte, __pte;
+
+	if (pgtable->mode == PAGE_MODE_NONE)
+		return iova;
+
+	pte = fetch_pte(pgtable, iova, &pte_pgsize);
+
+	if (!pte || !IOMMU_PTE_PRESENT(*pte))
+		return 0;
+
+	offset_mask = pte_pgsize - 1;
+	__pte	    = __sme_clr(*pte & PM_ADDR_MASK);
+
+	return (__pte & ~offset_mask) | (iova & offset_mask);
+}
+
 /*
  *
  */
@@ -534,6 +554,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
 	cfg->oas            = IOMMU_OUT_ADDR_BIT_SIZE,
 	cfg->tlb            = &v1_flush_ops;
 
+	pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
+
 	return &pgtable->iop;
 }
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 76f61dd6b89f..29b7fefc8485 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2101,22 +2101,8 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 {
 	struct protection_domain *domain = to_pdomain(dom);
 	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
-	struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
-	unsigned long offset_mask, pte_pgsize;
-	u64 *pte, __pte;
 
-	if (domain->iop.mode == PAGE_MODE_NONE)
-		return iova;
-
-	pte = fetch_pte(pgtable, iova, &pte_pgsize);
-
-	if (!pte || !IOMMU_PTE_PRESENT(*pte))
-		return 0;
-
-	offset_mask = pte_pgsize - 1;
-	__pte	    = __sme_clr(*pte & PM_ADDR_MASK);
-
-	return (__pte & ~offset_mask) | (iova & offset_mask);
+	return ops->iova_to_phys(ops, iova);
 }
 
 static bool amd_iommu_capable(enum iommu_cap cap)
-- 
2.17.1
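The last step of the translation above, combining the page frame from the PTE with the in-page offset from the IOVA, can be tried out on its own. This is a minimal sketch: sketch_iova_to_phys() is a hypothetical name, and it assumes the caller has already extracted the address bits from the PTE and knows the power-of-two page size of the mapping.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the driver function: pte_addr holds the
 * physical address bits taken from the PTE, page_size is the size of
 * the mapping (a power of two).
 */
static inline uint64_t sketch_iova_to_phys(uint64_t pte_addr, uint64_t iova,
					   uint64_t page_size)
{
	uint64_t offset_mask = page_size - 1;

	/* Frame bits come from the PTE, offset bits from the IOVA. */
	return (pte_addr & ~offset_mask) | (iova & offset_mask);
}
```

Because the mask is derived from the mapping's own page size, the same expression covers 4 KiB pages and larger mappings without special cases.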
[PATCH v4 06/13] iommu/amd: Move IO page table related functions
Preparing to migrate to use IO page table framework. There is no functional change. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu.h | 18 ++ drivers/iommu/amd/io_pgtable.c | 473 drivers/iommu/amd/iommu.c | 476 + 3 files changed, 493 insertions(+), 474 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index bf29ab8c99f0..1bad42a3c73c 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -131,4 +131,22 @@ void amd_iommu_apply_ivrs_quirks(void); static inline void amd_iommu_apply_ivrs_quirks(void) { } #endif +/* TODO: These are temporary and will be removed once fully transition */ +extern void free_pagetable(struct domain_pgtable *pgtable); +extern int iommu_map_page(struct protection_domain *dom, + unsigned long bus_addr, + unsigned long phys_addr, + unsigned long page_size, + int prot, + gfp_t gfp); +extern unsigned long iommu_unmap_page(struct protection_domain *dom, + unsigned long bus_addr, + unsigned long page_size); +extern u64 *fetch_pte(struct protection_domain *domain, + unsigned long address, + unsigned long *page_size); +extern void amd_iommu_domain_get_pgtable(struct protection_domain *domain, +struct domain_pgtable *pgtable); +extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain, +u64 *root, int mode); #endif diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c index aedf2c932c40..345e9bc81fde 100644 --- a/drivers/iommu/amd/io_pgtable.c +++ b/drivers/iommu/amd/io_pgtable.c @@ -50,6 +50,479 @@ static const struct iommu_flush_ops v1_flush_ops = { .tlb_add_page = v1_tlb_add_page, }; +/* + * Helper function to get the first pte of a large mapping + */ +static u64 *first_pte_l7(u64 *pte, unsigned long *page_size, +unsigned long *count) +{ + unsigned long pte_mask, pg_size, cnt; + u64 *fpte; + + pg_size = PTE_PAGE_SIZE(*pte); + cnt = PAGE_SIZE_PTE_COUNT(pg_size); + pte_mask = ~((cnt << 3) - 1); + fpte = (u64 *)(((unsigned 
long)pte) & pte_mask); + + if (page_size) + *page_size = pg_size; + + if (count) + *count = cnt; + + return fpte; +} + +/ + * + * The functions below are used the create the page table mappings for + * unity mapped regions. + * + / + +static void free_page_list(struct page *freelist) +{ + while (freelist != NULL) { + unsigned long p = (unsigned long)page_address(freelist); + + freelist = freelist->freelist; + free_page(p); + } +} + +static struct page *free_pt_page(unsigned long pt, struct page *freelist) +{ + struct page *p = virt_to_page((void *)pt); + + p->freelist = freelist; + + return p; +} + +#define DEFINE_FREE_PT_FN(LVL, FN) \ +static struct page *free_pt_##LVL (unsigned long __pt, struct page *freelist) \ +{ \ + unsigned long p; \ + u64 *pt; \ + int i; \ + \ + pt = (u64 *)__pt; \ + \ + for (i = 0; i < 512; ++i) { \ + /* PTE present? */ \ + if (!IOMMU_PTE_PRESENT(pt[i])) \ + continue; \ + \ + /* Large PTE? */ \ + if (PM_PTE_LEVEL(pt[i]) == 0 || \ + PM_PTE_LEVEL(pt[i]) == 7) \ + continue; \ + \ + p = (unsigned long)
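The pointer masking in first_pte_l7() above relies on each PTE being 8 bytes wide, so a large mapping replicated across cnt consecutive PTEs occupies cnt << 3 bytes. The following sketch shows that arithmetic on plain integers; sketch_first_pte_addr() is a hypothetical name, and it assumes cnt is a power of two and that the group of PTEs is naturally aligned.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the address of any PTE inside a group of
 * 'count' replicated 8-byte PTEs, return the address of the first PTE
 * in that group. Mirrors pte_mask = ~((cnt << 3) - 1) from the patch.
 */
static inline uintptr_t sketch_first_pte_addr(uintptr_t pte_addr,
					      unsigned long count)
{
	uintptr_t pte_mask = ~(((uintptr_t)count << 3) - 1);

	return pte_addr & pte_mask;
}
```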
[PATCH v4 05/13] iommu/amd: Declare functions as extern
And move declaration to header file so that they can be included across multiple files. There is no functional change. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu.h | 3 +++ drivers/iommu/amd/iommu.c | 39 +-- 2 files changed, 22 insertions(+), 20 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index bf9723b35e77..bf29ab8c99f0 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -57,6 +57,9 @@ extern int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids); extern int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid, u64 address); extern void amd_iommu_update_and_flush_device_table(struct protection_domain *domain); +extern void amd_iommu_domain_update(struct protection_domain *domain); +extern void amd_iommu_domain_flush_complete(struct protection_domain *domain); +extern void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain); extern int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid); extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid, unsigned long cr3); diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index fdb6030b505d..1b10710c91cf 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -87,7 +87,6 @@ struct iommu_cmd { struct kmem_cache *amd_iommu_irq_cache; -static void update_domain(struct protection_domain *domain); static void detach_device(struct device *dev); / @@ -1314,12 +1313,12 @@ static void domain_flush_pages(struct protection_domain *domain, } /* Flush the whole IO/TLB for a given protection domain - including PDE */ -static void domain_flush_tlb_pde(struct protection_domain *domain) +void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain) { __domain_flush_pages(domain, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS, 1); } -static void domain_flush_complete(struct protection_domain *domain) +void amd_iommu_domain_flush_complete(struct protection_domain 
*domain) { int i; @@ -1344,7 +1343,7 @@ static void domain_flush_np_cache(struct protection_domain *domain, spin_lock_irqsave(>lock, flags); domain_flush_pages(domain, iova, size); - domain_flush_complete(domain); + amd_iommu_domain_flush_complete(domain); spin_unlock_irqrestore(>lock, flags); } } @@ -1501,7 +1500,7 @@ static bool increase_address_space(struct protection_domain *domain, pgtable.root = pte; pgtable.mode += 1; amd_iommu_update_and_flush_device_table(domain); - domain_flush_complete(domain); + amd_iommu_domain_flush_complete(domain); /* * Device Table needs to be updated and flushed before the new root can @@ -1754,8 +1753,8 @@ static int iommu_map_page(struct protection_domain *dom, * Updates and flushing already happened in * increase_address_space(). */ - domain_flush_tlb_pde(dom); - domain_flush_complete(dom); + amd_iommu_domain_flush_tlb_pde(dom); + amd_iommu_domain_flush_complete(dom); spin_unlock_irqrestore(>lock, flags); } @@ -1998,10 +1997,10 @@ static void do_detach(struct iommu_dev_data *dev_data) device_flush_dte(dev_data); /* Flush IOTLB */ - domain_flush_tlb_pde(domain); + amd_iommu_domain_flush_tlb_pde(domain); /* Wait for the flushes to finish */ - domain_flush_complete(domain); + amd_iommu_domain_flush_complete(domain); /* decrease reference counters - needs to happen after the flushes */ domain->dev_iommu[iommu->index] -= 1; @@ -2134,9 +2133,9 @@ static int attach_device(struct device *dev, * left the caches in the IOMMU dirty. So we have to flush * here to evict all dirty stuff. 
*/ - domain_flush_tlb_pde(domain); + amd_iommu_domain_flush_tlb_pde(domain); - domain_flush_complete(domain); + amd_iommu_domain_flush_complete(domain); out: spin_unlock(_data->lock); @@ -2298,7 +2297,7 @@ void amd_iommu_update_and_flush_device_table(struct protection_domain *domain) domain_flush_devices(domain); } -static void update_domain(struct protection_domain *domain) +void amd_iommu_domain_update(struct protection_domain *domain) { struct domain_pgtable pgtable; @@ -2307,8 +2306,8 @@ static void update_domain(struct protection_domain *domain) amd_iommu_update_and_flush_device_table(domain); /* Flush domain TLB(s) and wait for completion */ - domain_flush_tlb_pde(domain); - domain_flush_complete(domain); + amd_iommu_do
[PATCH v4 07/13] iommu/amd: Restructure code for freeing page table
Consolidate the logic into the v1_free_pgtable helper function, which is
called from the IO page table framework.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h  |  1 -
 drivers/iommu/amd/io_pgtable.c | 41 --
 drivers/iommu/amd/iommu.c      | 21 -
 3 files changed, 28 insertions(+), 35 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 1bad42a3c73c..91d098003f12 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -132,7 +132,6 @@ static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif
 
 /* TODO: These are temporary and will be removed once fully transition */
-extern void free_pagetable(struct domain_pgtable *pgtable);
 extern int iommu_map_page(struct protection_domain *dom,
 			  unsigned long bus_addr,
 			  unsigned long phys_addr,
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 345e9bc81fde..dc674e79ddf0 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -163,23 +163,6 @@ static struct page *free_sub_pt(unsigned long root, int mode,
 	return freelist;
 }
 
-void free_pagetable(struct domain_pgtable *pgtable)
-{
-	struct page *freelist = NULL;
-	unsigned long root;
-
-	if (pgtable->mode == PAGE_MODE_NONE)
-		return;
-
-	BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
-	       pgtable->mode > PAGE_MODE_6_LEVEL);
-
-	root = (unsigned long)pgtable->root;
-	freelist = free_sub_pt(root, pgtable->mode, freelist);
-
-	free_page_list(freelist);
-}
-
 void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 				  u64 *root, int mode)
 {
@@ -528,6 +511,30 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
  */
 static void v1_free_pgtable(struct io_pgtable *iop)
 {
+	struct amd_io_pgtable *pgtable = container_of(iop, struct amd_io_pgtable, iop);
+	struct protection_domain *dom;
+	struct page *freelist = NULL;
+	unsigned long root;
+
+	if (pgtable->mode == PAGE_MODE_NONE)
+		return;
+
+	dom = container_of(pgtable, struct protection_domain, iop);
+
+	/* Update data structure */
+	amd_iommu_domain_clr_pt_root(dom);
+
+	/* Make changes visible to IOMMUs */
+	amd_iommu_domain_update(dom);
+
+	/* Page-table is not visible to IOMMU anymore, so free it */
+	BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
+	       pgtable->mode > PAGE_MODE_6_LEVEL);
+
+	root = (unsigned long)pgtable->root;
+	freelist = free_sub_pt(root, pgtable->mode, freelist);
+
+	free_page_list(freelist);
 }
 
 static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index e823a457..37ecedce2c14 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1903,17 +1903,14 @@ static void cleanup_domain(struct protection_domain *domain)
 
 static void protection_domain_free(struct protection_domain *domain)
 {
-	struct domain_pgtable pgtable;
-
 	if (!domain)
 		return;
 
 	if (domain->id)
 		domain_id_free(domain->id);
 
-	amd_iommu_domain_get_pgtable(domain, &pgtable);
-	amd_iommu_domain_clr_pt_root(domain);
-	free_pagetable(&pgtable);
+	if (domain->iop.pgtbl_cfg.tlb)
+		free_io_pgtable_ops(&domain->iop.iop.ops);
 
 	kfree(domain);
 }
@@ -2302,22 +2299,12 @@ EXPORT_SYMBOL(amd_iommu_unregister_ppr_notifier);
 void amd_iommu_domain_direct_map(struct iommu_domain *dom)
 {
 	struct protection_domain *domain = to_pdomain(dom);
-	struct domain_pgtable pgtable;
 	unsigned long flags;
 
 	spin_lock_irqsave(&domain->lock, flags);
 
-	/* First save pgtable configuration*/
-	amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-	/* Remove page-table from domain */
-	amd_iommu_domain_clr_pt_root(domain);
-
-	/* Make changes visible to IOMMUs */
-	amd_iommu_domain_update(domain);
-
-	/* Page-table is not visible to IOMMU anymore, so free it */
-	free_pagetable(&pgtable);
+	if (domain->iop.pgtbl_cfg.tlb)
+		free_io_pgtable_ops(&domain->iop.iop.ops);
 
 	spin_unlock_irqrestore(&domain->lock, flags);
 }
-- 
2.17.1
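v1_free_pgtable() above walks from the framework's struct io_pgtable back to the owning protection domain with two container_of() steps. The user-space sketch below reproduces that pattern with simplified stand-in structures; the sketch_* types and the simplified macro are assumptions for illustration and leave out the type checking done by the kernel's container_of().

```c
#include <stddef.h>

/* Simplified container_of; the kernel version adds type checking. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-ins for struct io_pgtable, amd_io_pgtable and protection_domain. */
struct sketch_io_pgtable {
	int fmt;
};

struct sketch_amd_io_pgtable {
	int mode;
	struct sketch_io_pgtable iop;	/* embedded, as in the patch */
};

struct sketch_domain {
	int id;
	struct sketch_amd_io_pgtable iop;	/* embedded, as in the patch */
};

/* Two hops: io_pgtable -> amd_io_pgtable -> protection domain. */
static struct sketch_domain *sketch_iop_to_domain(struct sketch_io_pgtable *iop)
{
	struct sketch_amd_io_pgtable *pgtable =
		sketch_container_of(iop, struct sketch_amd_io_pgtable, iop);

	return sketch_container_of(pgtable, struct sketch_domain, iop);
}
```

The pattern only works because both structures are embedded by value rather than referenced through pointers, which is how the series lays out struct protection_domain.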
[PATCH v4 04/13] iommu/amd: Convert to using amd_io_pgtable
Make use of the new struct amd_io_pgtable in preparation for removing
struct domain_pgtable.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/iommu.c     | 25 ++---
 2 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index b8dae3941f0f..bf9723b35e77 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -56,6 +56,7 @@ extern void amd_iommu_domain_direct_map(struct iommu_domain *dom);
 extern int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
 extern int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid,
 				u64 address);
+extern void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
 extern int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid);
 extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
 				     unsigned long cr3);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 5b93536d6877..fdb6030b505d 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -89,8 +89,6 @@ struct kmem_cache *amd_iommu_irq_cache;
 
 static void update_domain(struct protection_domain *domain);
 static void detach_device(struct device *dev);
-static void update_and_flush_device_table(struct protection_domain *domain,
-					  struct domain_pgtable *pgtable);
 
 /****************************************************************************
  *
@@ -1502,7 +1500,7 @@ static bool increase_address_space(struct protection_domain *domain,
 
 	pgtable.root  = pte;
 	pgtable.mode += 1;
-	update_and_flush_device_table(domain, &pgtable);
+	amd_iommu_update_and_flush_device_table(domain);
 	domain_flush_complete(domain);
 
 	/*
@@ -1877,17 +1875,16 @@ static void free_gcr3_table(struct protection_domain *domain)
 }
 
 static void set_dte_entry(u16 devid, struct protection_domain *domain,
-			  struct domain_pgtable *pgtable,
 			  bool ats, bool ppr)
 {
 	u64 pte_root = 0;
 	u64 flags = 0;
 	u32 old_domid;
 
-	if (pgtable->mode != PAGE_MODE_NONE)
-		pte_root = iommu_virt_to_phys(pgtable->root);
+	if (domain->iop.mode != PAGE_MODE_NONE)
+		pte_root = iommu_virt_to_phys(domain->iop.root);
 
-	pte_root |= (pgtable->mode & DEV_ENTRY_MODE_MASK)
+	pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
 		    << DEV_ENTRY_MODE_SHIFT;
 	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
 
@@ -1977,7 +1974,7 @@ static void do_attach(struct iommu_dev_data *dev_data,
 
 	/* Update device table */
 	amd_iommu_domain_get_pgtable(domain, &pgtable);
-	set_dte_entry(dev_data->devid, domain, &pgtable,
+	set_dte_entry(dev_data->devid, domain,
 		      ats, dev_data->iommu_v2);
 	clone_aliases(dev_data->pdev);
 
@@ -2284,22 +2281,20 @@ static int amd_iommu_domain_get_attr(struct iommu_domain *domain,
  *
  *****************************************************************************/
 
-static void update_device_table(struct protection_domain *domain,
-				struct domain_pgtable *pgtable)
+static void update_device_table(struct protection_domain *domain)
 {
 	struct iommu_dev_data *dev_data;
 
 	list_for_each_entry(dev_data, &domain->dev_list, list) {
-		set_dte_entry(dev_data->devid, domain, pgtable,
+		set_dte_entry(dev_data->devid, domain,
 			      dev_data->ats.enabled, dev_data->iommu_v2);
 		clone_aliases(dev_data->pdev);
 	}
 }
 
-static void update_and_flush_device_table(struct protection_domain *domain,
-					  struct domain_pgtable *pgtable)
+void amd_iommu_update_and_flush_device_table(struct protection_domain *domain)
 {
-	update_device_table(domain, pgtable);
+	update_device_table(domain);
 	domain_flush_devices(domain);
 }
 
@@ -2309,7 +2304,7 @@ static void update_domain(struct protection_domain *domain)
 
 	/* Update device table */
 	amd_iommu_domain_get_pgtable(domain, &pgtable);
-	update_and_flush_device_table(domain, &pgtable);
+	amd_iommu_update_and_flush_device_table(domain);
 
 	/* Flush domain TLB(s) and wait for completion */
 	domain_flush_tlb_pde(domain);
-- 
2.17.1
[PATCH v4 03/13] iommu/amd: Move pt_root to struct amd_io_pgtable
Move pt_root into struct amd_io_pgtable to better organize the data
structures, since it holds IO page table related information.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h       | 2 +-
 drivers/iommu/amd/amd_iommu_types.h | 2 +-
 drivers/iommu/amd/iommu.c           | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 0817bc732d1a..b8dae3941f0f 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -105,7 +105,7 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
 static inline
 void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
 {
-	atomic64_set(&domain->pt_root, root);
+	atomic64_set(&domain->iop.pt_root, root);
 }
 
 static inline
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 5d77f34e0fda..7c971c76d685 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -497,6 +497,7 @@ struct amd_io_pgtable {
 	struct io_pgtable	iop;
 	int			mode;
 	u64			*root;
+	atomic64_t		pt_root;	/* pgtable root and pgtable mode */
 };
 
 /*
@@ -510,7 +511,6 @@ struct protection_domain {
 	struct amd_io_pgtable iop;
 	spinlock_t lock;	/* mostly used to lock the page table*/
 	u16 id;			/* the domain id written to the device table */
-	atomic64_t pt_root;	/* pgtable root and pgtable mode */
 	int glx;		/* Number of levels for GCR3 table */
 	u64 *gcr3_tbl;		/* Guest CR3 table */
 	unsigned long flags;	/* flags to find out type of domain */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 45d3977d6c00..5b93536d6877 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -145,7 +145,7 @@ static struct protection_domain *to_pdomain(struct iommu_domain *dom)
 static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
 					 struct domain_pgtable *pgtable)
 {
-	u64 pt_root = atomic64_read(&domain->pt_root);
+	u64 pt_root = atomic64_read(&domain->iop.pt_root);
 
 	pgtable->root = (u64 *)(pt_root & PAGE_MASK);
 	pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
-- 
2.17.1
[PATCH v4 02/13] iommu/amd: Prepare for generic IO page table framework
Add initial hook up code to implement generic IO page table framework. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/Kconfig | 1 + drivers/iommu/amd/Makefile | 2 +- drivers/iommu/amd/amd_iommu_types.h | 35 ++ drivers/iommu/amd/io_pgtable.c | 75 + drivers/iommu/amd/iommu.c | 10 drivers/iommu/io-pgtable.c | 3 ++ include/linux/io-pgtable.h | 2 + 7 files changed, 117 insertions(+), 11 deletions(-) create mode 100644 drivers/iommu/amd/io_pgtable.c diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig index 626b97d0dd21..a3cbafb603f5 100644 --- a/drivers/iommu/amd/Kconfig +++ b/drivers/iommu/amd/Kconfig @@ -10,6 +10,7 @@ config AMD_IOMMU select IOMMU_API select IOMMU_IOVA select IOMMU_DMA + select IOMMU_IO_PGTABLE depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE help With this option you can enable support for AMD IOMMU hardware in diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile index dc5a2fa4fd37..a935f8f4b974 100644 --- a/drivers/iommu/amd/Makefile +++ b/drivers/iommu/amd/Makefile @@ -1,4 +1,4 @@ # SPDX-License-Identifier: GPL-2.0-only -obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o +obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o obj-$(CONFIG_AMD_IOMMU_V2) += iommu_v2.o diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 494b42a31b7a..5d77f34e0fda 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -15,6 +15,7 @@ #include #include #include +#include /* * Maximum number of IOMMUs supported @@ -252,6 +253,19 @@ #define GA_GUEST_NR0x1 +#define IOMMU_IN_ADDR_BIT_SIZE 52 +#define IOMMU_OUT_ADDR_BIT_SIZE 52 + +/* + * This bitmap is used to advertise the page sizes our hardware support + * to the IOMMU core, which will then use this information to split + * physically contiguous memory regions it is mapping into page sizes + * that we support. 
+ * + * 512GB Pages are not supported due to a hardware bug + */ +#define AMD_IOMMU_PGSIZES ((~0xFFFUL) & ~(2ULL << 38)) + /* Bit value definition for dte irq remapping fields*/ #define DTE_IRQ_PHYS_ADDR_MASK (((1ULL << 45)-1) << 6) #define DTE_IRQ_REMAP_INTCTL_MASK (0x3ULL << 60) @@ -465,6 +479,26 @@ struct amd_irte_ops; #define AMD_IOMMU_FLAG_TRANS_PRE_ENABLED (1 << 0) +#define io_pgtable_to_data(x) \ + container_of((x), struct amd_io_pgtable, iop) + +#define io_pgtable_ops_to_data(x) \ + io_pgtable_to_data(io_pgtable_ops_to_pgtable(x)) + +#define io_pgtable_ops_to_domain(x) \ + container_of(io_pgtable_ops_to_data(x), \ +struct protection_domain, iop) + +#define io_pgtable_cfg_to_data(x) \ + container_of((x), struct amd_io_pgtable, pgtbl_cfg) + +struct amd_io_pgtable { + struct io_pgtable_cfg pgtbl_cfg; + struct io_pgtable iop; + int mode; + u64 *root; +}; + /* * This structure contains generic data for IOMMU protection domains * independent of their use. @@ -473,6 +507,7 @@ struct protection_domain { struct list_head dev_list; /* List of all devices in this domain */ struct iommu_domain domain; /* generic domain handle used by iommu core code */ + struct amd_io_pgtable iop; spinlock_t lock;/* mostly used to lock the page table*/ u16 id; /* the domain id written to the device table */ atomic64_t pt_root; /* pgtable root and pgtable mode */ diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c new file mode 100644 index ..aedf2c932c40 --- /dev/null +++ b/drivers/iommu/amd/io_pgtable.c @@ -0,0 +1,75 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * CPU-agnostic AMD IO page table allocator. + * + * Copyright (C) 2020 Advanced Micro Devices, Inc. 
+ * Author: Suravee Suthikulpanit + */ + +#define pr_fmt(fmt) "AMD-Vi: " fmt +#define dev_fmt(fmt)pr_fmt(fmt) + +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "amd_iommu_types.h" +#include "amd_iommu.h" + +static void v1_tlb_flush_all(void *cookie) +{ +} + +static void v1_tlb_flush_walk(unsigned long iova, size_t size, + size_t granule, void *cookie) +{ +} + +static void v1_tlb_flush_leaf(unsigned long iova, size_t size, + size_t granule, void *cookie) +{ +} + +static void v1_tlb_add_page(struct iommu_iotlb_gather *gather, +unsigned long iova, size_t granule, +
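The AMD_IOMMU_PGSIZES definition above is a bitmap in which bit n set means that a page size of 2^n bytes is advertised. The sketch below shows how such a bitmap can be queried. SKETCH_PGSIZES copies the expression from the patch but uses ULL throughout so that it also behaves on 32-bit builds (an assumption of this sketch), and sketch_pgsize_supported() is a hypothetical helper, not an IOMMU core function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Same expression as AMD_IOMMU_PGSIZES in the patch: every power of two
 * from 4 KiB upwards, except 2ULL << 38 (that is 1ULL << 39, 512 GiB).
 */
#define SKETCH_PGSIZES ((~0xFFFULL) & ~(2ULL << 38))

/* Hypothetical query helper: is 'size' an advertised page size? */
static inline bool sketch_pgsize_supported(uint64_t size)
{
	bool pow2 = size != 0 && (size & (size - 1)) == 0;

	return pow2 && (SKETCH_PGSIZES & size) != 0;
}
```

Note that 2ULL << 38 equals 1ULL << 39, which is why the comment in the patch refers to 512GB pages.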
[PATCH v4 00/13] iommu/amd: Add Generic IO Page Table Framework Support
The framework allows callable implementation of IO page table.
This allows the AMD IOMMU driver to switch between different types
of AMD IOMMU page tables (e.g. v1 vs. v2).

This series refactors the current implementation of the AMD IOMMU v1
page table to adopt the framework. There should be no functional change.
Subsequent series will introduce support for the AMD IOMMU v2 page table.

Thanks,
Suravee

Change from V3 (https://lore.kernel.org/linux-iommu/20201004014549.16065-1-suravee.suthikulpa...@amd.com/)
  - Rebase to v5.10
  - Patch 2: Add struct iommu_flush_ops (previously in patch 13 of v3)
  - Patch 7: Consolidate logic into v1_free_pgtable() instead of
    amd_iommu_free_pgtable()
  - Patch 12: Check ops->[map|unmap] before calling.
  - Patch 13: Setup page table when allocating domain (instead of when
    attaching device).

Change from V2 (https://lore.kernel.org/lkml/835c0d46-ed96-9fbe-856a-777dcffac...@amd.com/T/#t)
  - Patch 2: Introduce helper function io_pgtable_cfg_to_data.
  - Patch 13: Put back the struct iommu_flush_ops since patch v2 would
    run into NULL pointer bug when calling free_io_pgtable_ops if not
    defined.

Change from V1 (https://lkml.org/lkml/2020/9/23/251)
  - Do not specify struct io_pgtable_cfg.coherent_walk, since it is
    not currently used. (per Robin)
  - Remove unused struct iommu_flush_ops. (patch 2/13)
  - Move amd_iommu_setup_io_pgtable_ops to iommu.c instead of
    io_pgtable.c (patch 13/13)

Suravee Suthikulpanit (13):
  iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
  iommu/amd: Prepare for generic IO page table framework
  iommu/amd: Move pt_root to struct amd_io_pgtable
  iommu/amd: Convert to using amd_io_pgtable
  iommu/amd: Declare functions as extern
  iommu/amd: Move IO page table related functions
  iommu/amd: Restructure code for freeing page table
  iommu/amd: Remove amd_iommu_domain_get_pgtable
  iommu/amd: Rename variables to be consistent with struct io_pgtable_ops
  iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
  iommu/amd: Introduce iommu_v1_iova_to_phys
  iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
  iommu/amd: Adopt IO page table framework for AMD IOMMU v1 page table

 drivers/iommu/amd/Kconfig           |   1 +
 drivers/iommu/amd/Makefile          |   2 +-
 drivers/iommu/amd/amd_iommu.h       |  22 +
 drivers/iommu/amd/amd_iommu_types.h |  43 +-
 drivers/iommu/amd/init.c            |   2 +
 drivers/iommu/amd/io_pgtable.c      | 564 +++
 drivers/iommu/amd/iommu.c           | 672
 drivers/iommu/io-pgtable.c          |   3 +
 include/linux/io-pgtable.h          |   2 +
 9 files changed, 707 insertions(+), 604 deletions(-)
 create mode 100644 drivers/iommu/amd/io_pgtable.c

-- 
2.17.1
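For readers unfamiliar with the pattern this series relies on, here is a rough userspace sketch of how an ops table embedded in a page table structure can be mapped back to its owning domain with container_of(). The struct layouts are heavily simplified stand-ins, and the demo_* names and the identity-translation callback are hypothetical; only the nesting (ops inside io_pgtable inside amd_io_pgtable inside protection_domain) follows the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins; the real kernel structs carry many more fields. */
struct io_pgtable_ops {
	uint64_t (*iova_to_phys)(struct io_pgtable_ops *ops, uint64_t iova);
};

struct io_pgtable {
	struct io_pgtable_ops ops;
};

struct amd_io_pgtable {
	struct io_pgtable iop;
	int mode;
	uint64_t *root;
};

struct protection_domain {
	uint16_t id;
	struct amd_io_pgtable iop;
};

/* ops -> io_pgtable -> amd_io_pgtable, in the spirit of io_pgtable_ops_to_data(). */
static struct amd_io_pgtable *ops_to_data(struct io_pgtable_ops *ops)
{
	struct io_pgtable *iop = container_of(ops, struct io_pgtable, ops);

	return container_of(iop, struct amd_io_pgtable, iop);
}

/* amd_io_pgtable -> protection_domain, in the spirit of io_pgtable_ops_to_domain(). */
static struct protection_domain *ops_to_domain(struct io_pgtable_ops *ops)
{
	return container_of(ops_to_data(ops), struct protection_domain, iop);
}

/* Hypothetical "v1" callback: identity translation when mode is 0, else 0. */
static uint64_t demo_v1_iova_to_phys(struct io_pgtable_ops *ops, uint64_t iova)
{
	struct amd_io_pgtable *pgtable = ops_to_data(ops);

	assert(pgtable != NULL);
	return pgtable->mode == 0 ? iova : 0;
}

/* Hypothetical setup helper: installs the v1 callback in the ops table. */
static struct io_pgtable_ops *demo_setup_v1(struct protection_domain *dom)
{
	dom->iop.mode = 0;
	dom->iop.root = NULL;
	dom->iop.iop.ops.iova_to_phys = demo_v1_iova_to_phys;
	return &dom->iop.iop.ops;
}
```

A caller would hold only the ops pointer returned by demo_setup_v1() and invoke ops->iova_to_phys(ops, iova). Switching to another page table format would then mean installing a different set of callbacks, without touching the caller.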
[PATCH] iommu/amd: Add sanity check for interrupt remapping table length macros
Currently, macros related to the interrupt remapping table length are
defined separately. This has resulted in an oversight in which one of
the macros was missed when changing the length. To prevent this,
redefine the macros to add a built-in sanity check.

Also, rename the macros to use the name of the DTE[IntTabLen] field as
specified in the AMD IOMMU specification. There is no functional change.

Suggested-by: Linus Torvalds
Reviewed-by: Tom Lendacky
Signed-off-by: Suravee Suthikulpanit
Cc: Will Deacon
Cc: Jerry Snitselaar
Cc: Joerg Roedel
---
 drivers/iommu/amd/amd_iommu_types.h | 19 ++-
 drivers/iommu/amd/init.c            |  6 +++---
 drivers/iommu/amd/iommu.c           |  2 +-
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 494b42a31b7a..899ce62df3f0 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -255,11 +255,19 @@
 /* Bit value definition for dte irq remapping fields*/
 #define DTE_IRQ_PHYS_ADDR_MASK	(((1ULL << 45)-1) << 6)
 #define DTE_IRQ_REMAP_INTCTL_MASK	(0x3ULL << 60)
-#define DTE_IRQ_TABLE_LEN_MASK	(0xfULL << 1)
 #define DTE_IRQ_REMAP_INTCTL	(2ULL << 60)
-#define DTE_IRQ_TABLE_LEN	(9ULL << 1)
 #define DTE_IRQ_REMAP_ENABLE	1ULL
 
+/*
+ * AMD IOMMU hardware only support 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define DTE_INTTAB_ALIGNMENT	128
+#define DTE_INTTABLEN_VALUE	9ULL
+#define DTE_INTTABLEN		(DTE_INTTABLEN_VALUE << 1)
+#define DTE_INTTABLEN_MASK	(0xfULL << 1)
+#define MAX_IRQS_PER_TABLE	(1 << DTE_INTTABLEN_VALUE)
+
 #define PAGE_MODE_NONE	0x00
 #define PAGE_MODE_1_LEVEL	0x01
 #define PAGE_MODE_2_LEVEL	0x02
@@ -409,13 +417,6 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
-/*
- * AMD IOMMU hardware only support 512 IRTEs despite
- * the architectural limitation of 2048 entries.
- */
-#define MAX_IRQS_PER_TABLE	512
-#define IRQ_TABLE_ALIGNMENT	128
-
 struct irq_remap_table {
 	raw_spinlock_t lock;
 	unsigned min_index;
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 23a790f8f550..6bec8913d064 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -989,10 +989,10 @@ static bool copy_device_table(void)
 
 		irq_v = old_devtb[devid].data[2] & DTE_IRQ_REMAP_ENABLE;
 		int_ctl = old_devtb[devid].data[2] & DTE_IRQ_REMAP_INTCTL_MASK;
-		int_tab_len = old_devtb[devid].data[2] & DTE_IRQ_TABLE_LEN_MASK;
+		int_tab_len = old_devtb[devid].data[2] & DTE_INTTABLEN_MASK;
 		if (irq_v && (int_ctl || int_tab_len)) {
 			if ((int_ctl != DTE_IRQ_REMAP_INTCTL) ||
-			    (int_tab_len != DTE_IRQ_TABLE_LEN)) {
+			    (int_tab_len != DTE_INTTABLEN)) {
 				pr_err("Wrong old irq remapping flag: %#x\n", devid);
 				return false;
 			}
@@ -2674,7 +2674,7 @@ static int __init early_amd_iommu_init(void)
 		remap_cache_sz = MAX_IRQS_PER_TABLE * (sizeof(u64) * 2);
 	amd_iommu_irq_cache = kmem_cache_create("irq_remap_cache",
 						remap_cache_sz,
-						IRQ_TABLE_ALIGNMENT,
+						DTE_INTTAB_ALIGNMENT,
 						0, NULL);
 	if (!amd_iommu_irq_cache)
 		goto out;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b9cf59443843..f7abf16d1e3a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3191,7 +3191,7 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
 	dte &= ~DTE_IRQ_PHYS_ADDR_MASK;
 	dte |= iommu_virt_to_phys(table->table);
 	dte |= DTE_IRQ_REMAP_INTCTL;
-	dte |= DTE_IRQ_TABLE_LEN;
+	dte |= DTE_INTTABLEN;
 	dte |= DTE_IRQ_REMAP_ENABLE;
 
 	amd_iommu_dev_table[devid].data[2] = dte;
-- 
2.17.1
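To illustrate the "built-in sanity check" idea from the patch above, here is a small standalone sketch. The four macros are copied from the patch; the static_assert lines and the dte_irte_count() helper are illustrative additions and are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the patch above. */
#define DTE_INTTABLEN_VALUE	9ULL
#define DTE_INTTABLEN		(DTE_INTTABLEN_VALUE << 1)
#define DTE_INTTABLEN_MASK	(0xfULL << 1)
#define MAX_IRQS_PER_TABLE	(1 << DTE_INTTABLEN_VALUE)

/* Compile-time checks: the entry count and the DTE field share one source. */
static_assert(MAX_IRQS_PER_TABLE == 512, "IntTabLen=9 encodes 512 IRTEs");
static_assert((DTE_INTTABLEN & ~DTE_INTTABLEN_MASK) == 0,
	      "encoded length must fit in the 4-bit field");

/* Hypothetical helper: decode the number of IRTEs from DTE data[2]. */
static uint64_t dte_irte_count(uint64_t dte_data2)
{
	return 1ULL << ((dte_data2 & DTE_INTTABLEN_MASK) >> 1);
}
```

Because MAX_IRQS_PER_TABLE is now derived from DTE_INTTABLEN_VALUE, changing the exponent updates both the table size and the value programmed into the DTE, which is the kind of mismatch the commit message describes.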
[PATCH] iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs
According to the AMD IOMMU spec, commit 73db2fc595f3
("iommu/amd: Increase interrupt remapping table limit to 512 entries")
also requires the interrupt table length (IntTabLen) to be set to 9
(the length is encoded as a power of 2, so 2^9 = 512) in the device
table entry (DTE).

Fixes: 73db2fc595f3 ("iommu/amd: Increase interrupt remapping table limit to 512 entries")
Reported-by: Jerry Snitselaar
Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 89647700bab2..494b42a31b7a 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -257,7 +257,7 @@
 #define DTE_IRQ_REMAP_INTCTL_MASK	(0x3ULL << 60)
 #define DTE_IRQ_TABLE_LEN_MASK	(0xfULL << 1)
 #define DTE_IRQ_REMAP_INTCTL	(2ULL << 60)
-#define DTE_IRQ_TABLE_LEN	(8ULL << 1)
+#define DTE_IRQ_TABLE_LEN	(9ULL << 1)
 #define DTE_IRQ_REMAP_ENABLE	1ULL
 
 #define PAGE_MODE_NONE	0x00
-- 
2.17.1
Re: [PATCH v2] iommu/amd: Enforce 4k mapping for certain IOMMU data structures
Will,

To answer your questions from the v1 thread.

On 11/18/20 5:57 AM, Will Deacon wrote:
> On 11/5/20 9:58 PM, Suravee Suthikulpanit wrote:
>> AMD IOMMU requires 4k-aligned pages for the event log, the PPR log,
>> and the completion wait write-back regions. However, when allocating
>> the pages, they could be part of large mapping (e.g. 2M) page.
>> This causes #PF due to the SNP RMP hardware enforces the check based
>> on the page level for these data structures.
>
> Please could you include an example backtrace here?

Unfortunately, we don't actually have the backtrace available here.
This information is based on the SEV-SNP specification.

>> So, fix by calling set_memory_4k() on the allocated pages.
>
> I think I'm missing something here. set_memory_4k() will break the kernel
> linear mapping up into page granular mappings, but the IOMMU isn't using
> that mapping, right?

That's correct. This does not affect the IOMMU, but it affects the PSP FW.

> It's just using the physical address returned by iommu_virt_to_phys(), so why does it matter?
>
> Just be nice to capture some of this rationale in the log, especially as
> I'm not familiar with this device.

According to the AMD SEV-SNP white paper
(https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf),
the Reverse Map Table (RMP) contains one entry for every 4K page of DRAM that
may be used by the VM. In this case, the pages allocated by the IOMMU driver
are added as 4K entries in the RMP table by the SEV-SNP FW.

During the page table walk, the RMP checks if the page is owned by the
hypervisor. Without calling set_memory_4k() to break the mapping up into 4K
pages, pages could end up being part of a large mapping (e.g. a 2M page), in
which case the page access would be denied and result in a #PF.

>> Fixes: commit c69d89aff393 ("iommu/amd: Use 4K page for completion wait write-back semaphore")
>
> I couldn't figure out how that commit could cause this problem. Please can
> you explain that to me?

Hope this helps clarify. If so, I'll update the commit log and send out V3.

Thanks,
Suravee
Re: [PATCH] iommu/amd: Enforce 4k mapping for certain IOMMU data structures
Will,

I have already submitted v2 of this patch. Let me move the discussion there instead ...
(https://lore.kernel.org/linux-iommu/20201105145832.3065-1-suravee.suthikulpa...@amd.com/)

Suravee

On 11/18/20 5:57 AM, Will Deacon wrote:
> On Wed, Oct 28, 2020 at 11:18:24PM +, Suravee Suthikulpanit wrote:
>> AMD IOMMU requires 4k-aligned pages for the event log, the PPR log,
>> and the completion wait write-back regions. However, when allocating
>> the pages, they could be part of large mapping (e.g. 2M) page.
>> This causes #PF due to the SNP RMP hardware enforces the check based
>> on the page level for these data structures.
>
> Please could you include an example backtrace here?
>
>> So, fix by calling set_memory_4k() on the allocated pages.
>
> I think I'm missing something here. set_memory_4k() will break the kernel
> linear mapping up into page granular mappings, but the IOMMU isn't using
> that mapping, right? It's just using the physical address returned by
> iommu_virt_to_phys(), so why does it matter?
>
> Just be nice to capture some of this rationale in the log, especially as
> I'm not familiar with this device.
>
>> Fixes: commit c69d89aff393 ("iommu/amd: Use 4K page for completion wait write-back semaphore")
>
> I couldn't figure out how that commit could cause this problem. Please can
> you explain that to me?
>
> Cheers,
>
> Will
Re: [PATCH v3 00/14] iommu/amd: Add Generic IO Page Table Framework Support
Joerg,

Please disregard V3. I am working on V4 and will resubmit.

Thank you,
Suravee

On 11/11/20 10:10 AM, Suravee Suthikulpanit wrote:
> Hi Joerg,
>
> Do you have any update on this series?
>
> Thanks,
> Suravee
>
> On 11/2/20 10:16 AM, Suravee Suthikulpanit wrote:
>> Joerg,
>>
>> You mentioned to remind you to pull this in to linux-next.
>>
>> Thanks,
>> Suravee
>>
>> On 10/4/20 8:45 AM, Suravee Suthikulpanit wrote:
>>> The framework allows callable implementation of IO page table.
>>> This allows AMD IOMMU driver to switch between different types
>>> of AMD IOMMU page tables (e.g. v1 vs. v2).
>>>
>>> This series refactors the current implementation of AMD IOMMU v1 page table
>>> to adopt the framework. There should be no functional change.
>>> Subsequent series will introduce support for the AMD IOMMU v2 page table.
>>>
>>> Thanks,
>>> Suravee
>>>
>>> Change from V2 (https://lore.kernel.org/lkml/835c0d46-ed96-9fbe-856a-777dcffac...@amd.com/T/#t)
>>>   - Patch 2/14: Introduce helper function io_pgtable_cfg_to_data.
>>>   - Patch 13/14: Put back the struct iommu_flush_ops since patch v2 would
>>>     run into NULL pointer bug when calling free_io_pgtable_ops if not
>>>     defined.
>>>
>>> Change from V1 (https://lkml.org/lkml/2020/9/23/251)
>>>   - Do not specify struct io_pgtable_cfg.coherent_walk, since it is
>>>     not currently used. (per Robin)
>>>   - Remove unused struct iommu_flush_ops. (patch 2/13)
>>>   - Move amd_iommu_setup_io_pgtable_ops to iommu.c instead of
>>>     io_pgtable.c (patch 13/13)
>>>
>>> Suravee Suthikulpanit (14):
>>>   iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
>>>   iommu/amd: Prepare for generic IO page table framework
>>>   iommu/amd: Move pt_root to to struct amd_io_pgtable
>>>   iommu/amd: Convert to using amd_io_pgtable
>>>   iommu/amd: Declare functions as extern
>>>   iommu/amd: Move IO page table related functions
>>>   iommu/amd: Restructure code for freeing page table
>>>   iommu/amd: Remove amd_iommu_domain_get_pgtable
>>>   iommu/amd: Rename variables to be consistent with struct io_pgtable_ops
>>>   iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
>>>   iommu/amd: Introduce iommu_v1_iova_to_phys
>>>   iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
>>>   iommu/amd: Introduce IOMMU flush callbacks
>>>   iommu/amd: Adopt IO page table framework
>>>
>>>  drivers/iommu/amd/Kconfig           |   1 +
>>>  drivers/iommu/amd/Makefile          |   2 +-
>>>  drivers/iommu/amd/amd_iommu.h       |  22 +
>>>  drivers/iommu/amd/amd_iommu_types.h |  43 +-
>>>  drivers/iommu/amd/io_pgtable.c      | 564
>>>  drivers/iommu/amd/iommu.c           | 646 +++
>>>  drivers/iommu/io-pgtable.c          |   3 +
>>>  include/linux/io-pgtable.h          |   2 +
>>>  8 files changed, 691 insertions(+), 592 deletions(-)
>>>  create mode 100644 drivers/iommu/amd/io_pgtable.c
Re: [PATCH v3 00/14] iommu/amd: Add Generic IO Page Table Framework Support
Hi Joerg,

Do you have any update on this series?

Thanks,
Suravee

On 11/2/20 10:16 AM, Suravee Suthikulpanit wrote:
> Joerg,
>
> You mentioned to remind you to pull this in to linux-next.
>
> Thanks,
> Suravee
[PATCH v2] iommu/amd: Enforce 4k mapping for certain IOMMU data structures
AMD IOMMU requires 4k-aligned pages for the event log, the PPR log,
and the completion wait write-back regions. However, when allocating
the pages, they could be part of a large mapping (e.g. a 2M page).
This causes a #PF because the SNP RMP hardware enforces the check based
on the page level for these data structures.

So, fix by calling set_memory_4k() on the allocated pages.

Fixes: commit c69d89aff393 ("iommu/amd: Use 4K page for completion wait write-back semaphore")
Cc: Brijesh Singh
Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 82e4af8f09bb..23a790f8f550 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -29,6 +29,7 @@
 #include
 #include
 #include
+#include
 #include
@@ -672,11 +673,27 @@ static void __init free_command_buffer(struct amd_iommu *iommu)
 	free_pages((unsigned long)iommu->cmd_buf, get_order(CMD_BUFFER_SIZE));
 }
 
+static void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
+					  gfp_t gfp, size_t size)
+{
+	int order = get_order(size);
+	void *buf = (void *)__get_free_pages(gfp, order);
+
+	if (buf &&
+	    iommu_feature(iommu, FEATURE_SNP) &&
+	    set_memory_4k((unsigned long)buf, (1 << order))) {
+		free_pages((unsigned long)buf, order);
+		buf = NULL;
+	}
+
+	return buf;
+}
+
 /* allocates the memory where the IOMMU will log its events to */
 static int __init alloc_event_buffer(struct amd_iommu *iommu)
 {
-	iommu->evt_buf = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-						  get_order(EVT_BUFFER_SIZE));
+	iommu->evt_buf = iommu_alloc_4k_pages(iommu, GFP_KERNEL | __GFP_ZERO,
+					      EVT_BUFFER_SIZE);
 
 	return iommu->evt_buf ? 0 : -ENOMEM;
 }
@@ -715,8 +732,8 @@ static void __init free_event_buffer(struct amd_iommu *iommu)
 /* allocates the memory where the IOMMU will log its events to */
 static int __init alloc_ppr_log(struct amd_iommu *iommu)
 {
-	iommu->ppr_log = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-						  get_order(PPR_LOG_SIZE));
+	iommu->ppr_log = iommu_alloc_4k_pages(iommu, GFP_KERNEL | __GFP_ZERO,
+					      PPR_LOG_SIZE);
 
 	return iommu->ppr_log ? 0 : -ENOMEM;
 }
@@ -838,7 +855,7 @@ static int iommu_init_ga(struct amd_iommu *iommu)
 
 static int __init alloc_cwwb_sem(struct amd_iommu *iommu)
 {
-	iommu->cmd_sem = (void *)get_zeroed_page(GFP_KERNEL);
+	iommu->cmd_sem = iommu_alloc_4k_pages(iommu, GFP_KERNEL | __GFP_ZERO, 1);
 
 	return iommu->cmd_sem ? 0 : -ENOMEM;
 }
-- 
2.17.1
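As a side note on the helper in this patch: it passes (1 << order) as the page count to set_memory_4k(), so the range that gets split always covers the whole power-of-two allocation, even when the caller asks for as little as 1 byte (as alloc_cwwb_sem() does). The sketch below approximates that arithmetic in userspace. demo_get_order() is a stand-in written for illustration, not the kernel's get_order(), and a 4K base page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SHIFT	12
#define DEMO_PAGE_SIZE	(1UL << DEMO_PAGE_SHIFT)

/*
 * Userspace approximation of get_order(): the smallest order such that
 * (DEMO_PAGE_SIZE << order) >= size.
 */
static int demo_get_order(size_t size)
{
	int order = 0;

	assert(size > 0);
	while ((DEMO_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Number of 4K pages covered by an allocation of the given size. */
static unsigned long demo_numpages(size_t size)
{
	return 1UL << demo_get_order(size);
}
```

For example, a size of 1 gives order 0 and a single page, while a hypothetical 512KB buffer would give order 7 and 128 pages.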
Re: [PATCH v3 00/14] iommu/amd: Add Generic IO Page Table Framework Support
Joerg,

You mentioned to remind you to pull this in to linux-next.

Thanks,
Suravee
[PATCH] iommu/amd: Enforce 4k mapping for certain IOMMU data structures
AMD IOMMU requires 4k-aligned pages for the event log, the PPR log,
and the completion wait write-back regions. However, when allocating
the pages, they could be part of a large mapping (e.g. a 2M page).
This causes a #PF because the SNP RMP hardware enforces the check based
on the page level for these data structures.

So, fix by calling set_memory_4k() on the allocated pages.

Fixes: commit c69d89aff393 ("iommu/amd: Use 4K page for completion wait write-back semaphore")
Cc: Brijesh Singh
Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/init.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 82e4af8f09bb..75dc30226a7c 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -29,6 +29,7 @@
 #include
 #include
 #include
+#include
 #include
@@ -672,11 +673,22 @@ static void __init free_command_buffer(struct amd_iommu *iommu)
 	free_pages((unsigned long)iommu->cmd_buf, get_order(CMD_BUFFER_SIZE));
 }
 
+static void *__init iommu_alloc_4k_pages(gfp_t gfp, size_t size)
+{
+	void *buf;
+	int order = get_order(size);
+
+	buf = (void *)__get_free_pages(gfp, order);
+	if (!buf)
+		return buf;
+	return set_memory_4k((unsigned long)buf, (1 << order)) ? NULL : buf;
+}
+
 /* allocates the memory where the IOMMU will log its events to */
 static int __init alloc_event_buffer(struct amd_iommu *iommu)
 {
-	iommu->evt_buf = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-						  get_order(EVT_BUFFER_SIZE));
+	iommu->evt_buf = iommu_alloc_4k_pages(GFP_KERNEL | __GFP_ZERO,
+					      EVT_BUFFER_SIZE);
 
 	return iommu->evt_buf ? 0 : -ENOMEM;
 }
@@ -715,8 +727,8 @@ static void __init free_event_buffer(struct amd_iommu *iommu)
 /* allocates the memory where the IOMMU will log its events to */
 static int __init alloc_ppr_log(struct amd_iommu *iommu)
 {
-	iommu->ppr_log = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-						  get_order(PPR_LOG_SIZE));
+	iommu->ppr_log = iommu_alloc_4k_pages(GFP_KERNEL | __GFP_ZERO,
+					      PPR_LOG_SIZE);
 
 	return iommu->ppr_log ? 0 : -ENOMEM;
 }
@@ -838,7 +850,7 @@ static int iommu_init_ga(struct amd_iommu *iommu)
 
 static int __init alloc_cwwb_sem(struct amd_iommu *iommu)
 {
-	iommu->cmd_sem = (void *)get_zeroed_page(GFP_KERNEL);
+	iommu->cmd_sem = iommu_alloc_4k_pages(GFP_KERNEL | __GFP_ZERO, 1);
 
 	return iommu->cmd_sem ? 0 : -ENOMEM;
 }
-- 
2.17.1
Re: [PATCH] iommu/amd: Increase interrupt remapping table limit to 512 entries
Hi Joerg,

Do you have any concerns regarding this patch?

Thanks,
Suravee

On 10/15/20 9:50 AM, Suravee Suthikulpanit wrote:
> Certain device drivers allocate IO queues on a per-cpu basis.
> On AMD EPYC platform, which can support up-to 256 cpu threads,
> this can exceed the current MAX_IRQ_PER_TABLE limit of 256,
> and result in the error message:
>
>     AMD-Vi: Failed to allocate IRTE
>
> This has been observed with certain NVME devices.
>
> AMD IOMMU hardware can actually support upto 512 interrupt
> remapping table entries. Therefore, update the driver to
> match the hardware limit.
>
> Please note that this also increases the size of interrupt remapping
> table to 8KB per device when using the 128-bit IRTE format.
>
> Signed-off-by: Suravee Suthikulpanit
> ---
>  drivers/iommu/amd/amd_iommu_types.h | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
> index 30a5d412255a..427484c45589 100644
> --- a/drivers/iommu/amd/amd_iommu_types.h
> +++ b/drivers/iommu/amd/amd_iommu_types.h
> @@ -406,7 +406,11 @@ extern bool amd_iommu_np_cache;
>  /* Only true if all IOMMUs support device IOTLBs */
>  extern bool amd_iommu_iotlb_sup;
>
> -#define MAX_IRQS_PER_TABLE	256
> +/*
> + * AMD IOMMU hardware only support 512 IRTEs despite
> + * the architectural limitation of 2048 entries.
> + */
> +#define MAX_IRQS_PER_TABLE	512
>  #define IRQ_TABLE_ALIGNMENT	128
>
>  struct irq_remap_table {
[PATCH] iommu/amd: Increase interrupt remapping table limit to 512 entries
Certain device drivers allocate IO queues on a per-cpu basis.
On the AMD EPYC platform, which can support up to 256 cpu threads,
this can exceed the current MAX_IRQS_PER_TABLE limit of 256,
and result in the error message:

    AMD-Vi: Failed to allocate IRTE

This has been observed with certain NVMe devices.

AMD IOMMU hardware can actually support up to 512 interrupt
remapping table entries. Therefore, update the driver to
match the hardware limit.

Please note that this also increases the size of the interrupt remapping
table to 8KB per device when using the 128-bit IRTE format.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu_types.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 30a5d412255a..427484c45589 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -406,7 +406,11 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
-#define MAX_IRQS_PER_TABLE	256
+/*
+ * AMD IOMMU hardware only support 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define MAX_IRQS_PER_TABLE	512
 #define IRQ_TABLE_ALIGNMENT	128
 
 struct irq_remap_table {
-- 
2.17.1
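The 8KB figure in the commit message follows directly from the entry count and the entry size. As a quick sanity check, here is a tiny sketch of the arithmetic; irq_table_bytes() is a made-up helper name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IRQS_PER_TABLE	512

/* Size in bytes of one remapping table whose entries are irte_bits wide. */
static size_t irq_table_bytes(unsigned int entries, unsigned int irte_bits)
{
	assert(irte_bits % 8 == 0);
	return (size_t)entries * (irte_bits / 8);
}
```

With 512 entries of 128 bits (16 bytes) each, the table comes to 8192 bytes, compared with 4096 bytes under the previous limit of 256 entries.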
[PATCH v3 07/14] iommu/amd: Restructure code for freeing page table
Introduce the amd_iommu_free_pgtable helper function, which consolidates
the logic for freeing the page table.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h  |  2 +-
 drivers/iommu/amd/io_pgtable.c | 12 +++-
 drivers/iommu/amd/iommu.c      | 19 ++-
 3 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index ee7ff4d827e1..8dff7d85be79 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -123,7 +123,6 @@ static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif
 
 /* TODO: These are temporary and will be removed once fully transition */
-extern void free_pagetable(struct domain_pgtable *pgtable);
 extern int iommu_map_page(struct protection_domain *dom,
 			  unsigned long bus_addr,
 			  unsigned long phys_addr,
@@ -140,4 +139,5 @@ extern void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
 					 struct domain_pgtable *pgtable);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 					 u64 *root, int mode);
+extern void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable);
 #endif
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index c11355afe624..23e82da2dea8 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -136,14 +136,24 @@ static struct page *free_sub_pt(unsigned long root, int mode,
 	return freelist;
 }
 
-void free_pagetable(struct domain_pgtable *pgtable)
+void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable)
 {
+	struct protection_domain *dom;
 	struct page *freelist = NULL;
 	unsigned long root;
 
 	if (pgtable->mode == PAGE_MODE_NONE)
 		return;
 
+	dom = container_of(pgtable, struct protection_domain, iop);
+
+	/* Update data structure */
+	amd_iommu_domain_clr_pt_root(dom);
+
+	/* Make changes visible to IOMMUs */
+	amd_iommu_domain_update(dom);
+
+	/* Page-table is not visible to IOMMU anymore, so free it */
 	BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
 	       pgtable->mode > PAGE_MODE_6_LEVEL);
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4d65f64236b6..cbbea7b952fb 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1882,17 +1882,13 @@ static void cleanup_domain(struct protection_domain *domain)
 
 static void protection_domain_free(struct protection_domain *domain)
 {
-	struct domain_pgtable pgtable;
-
 	if (!domain)
 		return;
 
 	if (domain->id)
 		domain_id_free(domain->id);
 
-	amd_iommu_domain_get_pgtable(domain, &pgtable);
-	amd_iommu_domain_clr_pt_root(domain);
-	free_pagetable(&pgtable);
+	amd_iommu_free_pgtable(&domain->iop);
 
 	kfree(domain);
 }
@@ -2281,22 +2277,11 @@ EXPORT_SYMBOL(amd_iommu_unregister_ppr_notifier);
 void amd_iommu_domain_direct_map(struct iommu_domain *dom)
 {
 	struct protection_domain *domain = to_pdomain(dom);
-	struct domain_pgtable pgtable;
 	unsigned long flags;
 
 	spin_lock_irqsave(&domain->lock, flags);
 
-	/* First save pgtable configuration*/
-	amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-	/* Remove page-table from domain */
-	amd_iommu_domain_clr_pt_root(domain);
-
-	/* Make changes visible to IOMMUs */
-	amd_iommu_domain_update(domain);
-
-	/* Page-table is not visible to IOMMU anymore, so free it */
-	free_pagetable(&pgtable);
+	amd_iommu_free_pgtable(&domain->iop);
 
 	spin_unlock_irqrestore(&domain->lock, flags);
 }
-- 
2.17.1
[PATCH v3 11/14] iommu/amd: Introduce iommu_v1_iova_to_phys
This implements iova_to_phys for AMD IOMMU v1 pagetable, which will be
used by the IO page table framework.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/io_pgtable.c | 22 ++
 drivers/iommu/amd/iommu.c      | 16 +---
 2 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 93ff8cb452ed..7841e5e1e563 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -494,6 +494,26 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
 	return unmapped;
 }

+static phys_addr_t iommu_v1_iova_to_phys(struct io_pgtable_ops *ops, unsigned long iova)
+{
+	struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
+	unsigned long offset_mask, pte_pgsize;
+	u64 *pte, __pte;
+
+	if (pgtable->mode == PAGE_MODE_NONE)
+		return iova;
+
+	pte = fetch_pte(pgtable, iova, &pte_pgsize);
+
+	if (!pte || !IOMMU_PTE_PRESENT(*pte))
+		return 0;
+
+	offset_mask = pte_pgsize - 1;
+	__pte = __sme_clr(*pte & PM_ADDR_MASK);
+
+	return (__pte & ~offset_mask) | (iova & offset_mask);
+}
+
 /*
  *
  */
@@ -505,6 +525,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
 {
 	struct amd_io_pgtable *pgtable = io_pgtable_cfg_to_data(cfg);

+	pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
+
 	return &pgtable->iop;
 }

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 87cea1cde414..9a1a16031e00 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2079,22 +2079,8 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 {
 	struct protection_domain *domain = to_pdomain(dom);
 	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
-	struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
-	unsigned long offset_mask, pte_pgsize;
-	u64 *pte, __pte;

-	if (domain->iop.mode == PAGE_MODE_NONE)
-		return iova;
-
-	pte = fetch_pte(pgtable, iova, &pte_pgsize);
-
-	if (!pte || !IOMMU_PTE_PRESENT(*pte))
-		return 0;
-
-	offset_mask = pte_pgsize - 1;
-	__pte = __sme_clr(*pte & PM_ADDR_MASK);
-
-	return (__pte & ~offset_mask) | (iova & offset_mask);
+	return ops->iova_to_phys(ops, iova);
 }

 static bool amd_iommu_capable(enum iommu_cap cap)
--
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
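A note on the address arithmetic in iommu_v1_iova_to_phys: the result is
built from the page-aligned address bits of the leaf PTE plus the in-page
offset of the IOVA. Below is a minimal userspace sketch of just that
computation. The name sketch_iova_to_phys and the SKETCH_ADDR_MASK value
are assumptions made for illustration (a stand-in for PM_ADDR_MASK), and
the page table walk done by fetch_pte is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for the kernel's PM_ADDR_MASK: address bits 51:12. */
#define SKETCH_ADDR_MASK 0x000ffffffffff000ULL

/*
 * Combine the address bits of a leaf PTE with the offset of the IOVA
 * inside the mapping. pte_pgsize is the size of the mapping that the
 * PTE covers and must be a power of two.
 */
static uint64_t sketch_iova_to_phys(uint64_t pte, uint64_t iova,
				    uint64_t pte_pgsize)
{
	uint64_t offset_mask, addr;

	assert(pte_pgsize && (pte_pgsize & (pte_pgsize - 1)) == 0);

	offset_mask = pte_pgsize - 1;
	addr = pte & SKETCH_ADDR_MASK;

	/* Drop PTE bits below the mapping size, keep the IOVA offset. */
	return (addr & ~offset_mask) | (iova & offset_mask);
}
```

For a 4K mapping only the low 12 bits come from the IOVA; for a 2M
mapping the low 21 bits do, which is why the mask is derived from the
page size reported for the PTE rather than from a fixed PAGE_SIZE.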
[PATCH v3 12/14] iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
These implement map and unmap for AMD IOMMU v1 pagetable, which will be
used by the IO pagetable framework.

Also clean up unused extern function declarations.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h  | 13 -
 drivers/iommu/amd/io_pgtable.c | 25 -
 drivers/iommu/amd/iommu.c      |  7 ---
 3 files changed, 16 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 69996e57fae2..2e8dc2a1ec0f 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -124,19 +124,6 @@ void amd_iommu_apply_ivrs_quirks(void);
 static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif

-/* TODO: These are temporary and will be removed once fully transition */
-extern int iommu_map_page(struct protection_domain *dom,
-			  unsigned long bus_addr,
-			  unsigned long phys_addr,
-			  unsigned long page_size,
-			  int prot,
-			  gfp_t gfp);
-extern unsigned long iommu_unmap_page(struct protection_domain *dom,
-				      unsigned long bus_addr,
-				      unsigned long page_size);
-extern u64 *fetch_pte(struct amd_io_pgtable *pgtable,
-		      unsigned long address,
-		      unsigned long *page_size);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 					 u64 *root, int mode);
 extern void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable);
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 7841e5e1e563..d8b329aa0bb2 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -317,9 +317,9 @@ static u64 *alloc_pte(struct protection_domain *domain,
  * This function checks if there is a PTE for a given dma address. If
  * there is one, it returns the pointer to it.
  */
-u64 *fetch_pte(struct amd_io_pgtable *pgtable,
-	       unsigned long address,
-	       unsigned long *page_size)
+static u64 *fetch_pte(struct amd_io_pgtable *pgtable,
+		      unsigned long address,
+		      unsigned long *page_size)
 {
 	int level;
 	u64 *pte;
@@ -392,13 +392,10 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
  * supporting all features of AMD IOMMU page tables like level skipping
  * and full 64 bit address spaces.
  */
-int iommu_map_page(struct protection_domain *dom,
-		   unsigned long iova,
-		   unsigned long paddr,
-		   unsigned long size,
-		   int prot,
-		   gfp_t gfp)
+static int iommu_v1_map_page(struct io_pgtable_ops *ops, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
 {
+	struct protection_domain *dom = io_pgtable_ops_to_domain(ops);
 	struct page *freelist = NULL;
 	bool updated = false;
 	u64 __pte, *pte;
@@ -461,11 +458,11 @@ int iommu_map_page(struct protection_domain *dom,
 	return ret;
 }

-unsigned long iommu_unmap_page(struct protection_domain *dom,
-			       unsigned long iova,
-			       unsigned long size)
+static unsigned long iommu_v1_unmap_page(struct io_pgtable_ops *ops,
+				      unsigned long iova,
+				      size_t size,
+				      struct iommu_iotlb_gather *gather)
 {
-	struct io_pgtable_ops *ops = &dom->iop.iop.ops;
 	struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
 	unsigned long long unmapped;
 	unsigned long unmap_size;
@@ -525,6 +522,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
 {
 	struct amd_io_pgtable *pgtable = io_pgtable_cfg_to_data(cfg);

+	pgtable->iop.ops.map          = iommu_v1_map_page;
+	pgtable->iop.ops.unmap        = iommu_v1_unmap_page;
 	pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;

 	return &pgtable->iop;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 9a1a16031e00..77f44b927ae7 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2044,6 +2044,7 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
 			 gfp_t gfp)
 {
 	struct protection_domain *domain = to_pdomain(dom);
+	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
 	int prot = 0;
 	int ret;

@@ -2055,8 +2056,7 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
 	if (iommu_prot & IOMMU_WRITE)
 		prot |= IOMMU_PROT_IW;

-	ret = iommu_map_page(domain, iova, paddr, page_size, prot, gfp);
-
+	ret = ops->map(ops, iova, pa
[PATCH v3 14/14] iommu/amd: Adopt IO page table framework
Switch to using IO page table framework for AMD IOMMU v1 page table.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/iommu.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 77f44b927ae7..6f8316206fb8 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -32,6 +32,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1573,6 +1574,22 @@ static int pdev_iommuv2_enable(struct pci_dev *pdev)
 	return ret;
 }

+struct io_pgtable_ops *
+amd_iommu_setup_io_pgtable_ops(struct iommu_dev_data *dev_data,
+			       struct protection_domain *domain)
+{
+	struct amd_iommu *iommu = amd_iommu_rlookup_table[dev_data->devid];
+
+	domain->iop.pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap	= AMD_IOMMU_PGSIZES,
+		.ias		= IOMMU_IN_ADDR_BIT_SIZE,
+		.oas		= IOMMU_OUT_ADDR_BIT_SIZE,
+		.iommu_dev	= &iommu->dev->dev,
+	};
+
+	return alloc_io_pgtable_ops(AMD_IOMMU_V1, &domain->iop.pgtbl_cfg, domain);
+}
+
 /*
  * If a device is not yet associated with a domain, this function makes the
  * device visible in the domain
@@ -1580,6 +1597,7 @@ static int pdev_iommuv2_enable(struct pci_dev *pdev)
 static int attach_device(struct device *dev,
 			 struct protection_domain *domain)
 {
+	struct io_pgtable_ops *pgtbl_ops;
 	struct iommu_dev_data *dev_data;
 	struct pci_dev *pdev;
 	unsigned long flags;
@@ -1623,6 +1641,12 @@ static int attach_device(struct device *dev,
 skip_ats_check:
 	ret = 0;

+	pgtbl_ops = amd_iommu_setup_io_pgtable_ops(dev_data, domain);
+	if (!pgtbl_ops) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
 	do_attach(dev_data, domain);

 	/*
@@ -1958,6 +1982,8 @@ static void amd_iommu_domain_free(struct iommu_domain *dom)
 	if (domain->dev_cnt > 0)
 		cleanup_domain(domain);

+	free_io_pgtable_ops(&domain->iop.iop.ops);
+
 	BUG_ON(domain->dev_cnt != 0);

 	if (!dom)
--
2.17.1
[PATCH v3 13/14] iommu/amd: Introduce IOMMU flush callbacks
Add TLB flush callback functions, which are used by the IO page table
framework.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/io_pgtable.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index d8b329aa0bb2..3c2faa47ea5d 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -514,6 +514,33 @@ static phys_addr_t iommu_v1_iova_to_phys(struct io_pgtable_ops *ops, unsigned lo
 /*
  *
  */
+static void v1_tlb_flush_all(void *cookie)
+{
+}
+
+static void v1_tlb_flush_walk(unsigned long iova, size_t size,
+			      size_t granule, void *cookie)
+{
+}
+
+static void v1_tlb_flush_leaf(unsigned long iova, size_t size,
+			      size_t granule, void *cookie)
+{
+}
+
+static void v1_tlb_add_page(struct iommu_iotlb_gather *gather,
+			    unsigned long iova, size_t granule,
+			    void *cookie)
+{
+}
+
+const struct iommu_flush_ops v1_flush_ops = {
+	.tlb_flush_all	= v1_tlb_flush_all,
+	.tlb_flush_walk = v1_tlb_flush_walk,
+	.tlb_flush_leaf = v1_tlb_flush_leaf,
+	.tlb_add_page	= v1_tlb_add_page,
+};
+
 static void v1_free_pgtable(struct io_pgtable *iop)
 {
 }
@@ -526,6 +553,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
 	pgtable->iop.ops.unmap        = iommu_v1_unmap_page;
 	pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;

+	cfg->tlb = &v1_flush_ops;
+
 	return &pgtable->iop;
 }

--
2.17.1
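The callbacks in this patch are intentionally empty. The cover letter
for this series says they were put back because free_io_pgtable_ops
would otherwise hit a NULL pointer. A minimal userspace sketch of that
failure mode follows. All sketch_* names are hypothetical, and the
explicit NULL check exists only so the sketch can report the problem
instead of crashing; it is not a claim about what the generic code does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct iommu_flush_ops. */
struct sketch_flush_ops {
	void (*tlb_flush_all)(void *cookie);
};

/* Hypothetical, reduced stand-in for struct io_pgtable_cfg. */
struct sketch_cfg {
	const struct sketch_flush_ops *tlb;
};

static int sketch_flush_calls;

/* No-op flush, counted only so the sketch can observe the call. */
static void sketch_noop_flush_all(void *cookie)
{
	(void)cookie;
	sketch_flush_calls++;
}

static const struct sketch_flush_ops sketch_noop_flush_ops = {
	.tlb_flush_all = sketch_noop_flush_all,
};

/*
 * Models a generic free path that flushes through cfg->tlb.
 * Returns -1 where an unchecked call would dereference NULL.
 */
static int sketch_free_pgtable(const struct sketch_cfg *cfg, void *cookie)
{
	assert(cfg != NULL);

	if (!cfg->tlb || !cfg->tlb->tlb_flush_all)
		return -1;

	cfg->tlb->tlb_flush_all(cookie);
	return 0;
}
```

Installing a table of no-op callbacks keeps the generic free path safe
without changing behaviour, which matches the "no functional change"
goal stated for the series.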
[PATCH v3 00/14] iommu/amd: Add Generic IO Page Table Framework Support
The generic IO page table framework lets a driver supply its IO page
table implementation as a set of callbacks. This allows the AMD IOMMU
driver to switch between different types of AMD IOMMU page tables
(e.g. v1 vs. v2).

This series refactors the current implementation of the AMD IOMMU v1 page
table to adopt the framework. There should be no functional change.
Subsequent series will introduce support for the AMD IOMMU v2 page table.

Thanks,
Suravee

Changes from V2
(https://lore.kernel.org/lkml/835c0d46-ed96-9fbe-856a-777dcffac...@amd.com/T/#t)
 - Patch 2/14: Introduce helper function io_pgtable_cfg_to_data.
 - Patch 13/14: Put back the struct iommu_flush_ops, since the v2 patch
   would run into a NULL pointer bug when calling free_io_pgtable_ops
   if it is not defined.

Changes from V1 (https://lkml.org/lkml/2020/9/23/251)
 - Do not specify struct io_pgtable_cfg.coherent_walk, since it is
   not currently used. (per Robin)
 - Remove unused struct iommu_flush_ops. (patch 2/13)
 - Move amd_iommu_setup_io_pgtable_ops to iommu.c instead of io_pgtable.c
   (patch 13/13)

Suravee Suthikulpanit (14):
  iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
  iommu/amd: Prepare for generic IO page table framework
  iommu/amd: Move pt_root to struct amd_io_pgtable
  iommu/amd: Convert to using amd_io_pgtable
  iommu/amd: Declare functions as extern
  iommu/amd: Move IO page table related functions
  iommu/amd: Restructure code for freeing page table
  iommu/amd: Remove amd_iommu_domain_get_pgtable
  iommu/amd: Rename variables to be consistent with struct
    io_pgtable_ops
  iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
  iommu/amd: Introduce iommu_v1_iova_to_phys
  iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
  iommu/amd: Introduce IOMMU flush callbacks
  iommu/amd: Adopt IO page table framework

 drivers/iommu/amd/Kconfig           |   1 +
 drivers/iommu/amd/Makefile          |   2 +-
 drivers/iommu/amd/amd_iommu.h       |  22 +
 drivers/iommu/amd/amd_iommu_types.h |  43 +-
 drivers/iommu/amd/io_pgtable.c      | 564
 drivers/iommu/amd/iommu.c           | 646 +++-
 drivers/iommu/io-pgtable.c          |   3 +
 include/linux/io-pgtable.h          |   2 +
 8 files changed, 691 insertions(+), 592 deletions(-)
 create mode 100644 drivers/iommu/amd/io_pgtable.c

--
2.17.1
[PATCH v3 09/14] iommu/amd: Rename variables to be consistent with struct io_pgtable_ops
There is no functional change.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/io_pgtable.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 6c063d2c8bf0..989db64a89a7 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -393,9 +393,9 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
  * and full 64 bit address spaces.
  */
 int iommu_map_page(struct protection_domain *dom,
-		   unsigned long bus_addr,
-		   unsigned long phys_addr,
-		   unsigned long page_size,
+		   unsigned long iova,
+		   unsigned long paddr,
+		   unsigned long size,
 		   int prot,
 		   gfp_t gfp)
 {
@@ -404,15 +404,15 @@ int iommu_map_page(struct protection_domain *dom,
 	u64 __pte, *pte;
 	int ret, i, count;

-	BUG_ON(!IS_ALIGNED(bus_addr, page_size));
-	BUG_ON(!IS_ALIGNED(phys_addr, page_size));
+	BUG_ON(!IS_ALIGNED(iova, size));
+	BUG_ON(!IS_ALIGNED(paddr, size));

 	ret = -EINVAL;
 	if (!(prot & IOMMU_PROT_MASK))
 		goto out;

-	count = PAGE_SIZE_PTE_COUNT(page_size);
-	pte   = alloc_pte(dom, bus_addr, page_size, NULL, gfp, &updated);
+	count = PAGE_SIZE_PTE_COUNT(size);
+	pte   = alloc_pte(dom, iova, size, NULL, gfp, &updated);

 	ret = -ENOMEM;
 	if (!pte)
@@ -425,10 +425,10 @@ int iommu_map_page(struct protection_domain *dom,
 	updated = true;

 	if (count > 1) {
-		__pte = PAGE_SIZE_PTE(__sme_set(phys_addr), page_size);
+		__pte = PAGE_SIZE_PTE(__sme_set(paddr), size);
 		__pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_PR | IOMMU_PTE_FC;
 	} else
-		__pte = __sme_set(phys_addr) | IOMMU_PTE_PR | IOMMU_PTE_FC;
+		__pte = __sme_set(paddr) | IOMMU_PTE_PR | IOMMU_PTE_FC;

 	if (prot & IOMMU_PROT_IR)
 		__pte |= IOMMU_PTE_IR;
@@ -462,20 +462,19 @@ int iommu_map_page(struct protection_domain *dom,
 }

 unsigned long iommu_unmap_page(struct protection_domain *dom,
-			       unsigned long bus_addr,
-			       unsigned long page_size)
+			       unsigned long iova,
+			       unsigned long size)
 {
 	unsigned long long unmapped;
 	unsigned long unmap_size;
 	u64 *pte;

-	BUG_ON(!is_power_of_2(page_size));
+	BUG_ON(!is_power_of_2(size));

 	unmapped = 0;

-	while (unmapped < page_size) {
-
-		pte = fetch_pte(dom, bus_addr, &unmap_size);
+	while (unmapped < size) {
+		pte = fetch_pte(dom, iova, &unmap_size);

 		if (pte) {
 			int i, count;
@@ -485,7 +484,7 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
 				pte[i] = 0ULL;
 		}

-		bus_addr  = (bus_addr & ~(unmap_size - 1)) + unmap_size;
+		iova = (iova & ~(unmap_size - 1)) + unmap_size;
 		unmapped += unmap_size;
 	}

--
2.17.1
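The loop in iommu_unmap_page advances iova by rounding it down to the
start of the mapping that covers it and then stepping past that
mapping. A small userspace sketch of just that stepping logic is below.
The sketch_* names and the lookup callback are hypothetical: the
callback stands in for fetch_pte and only returns the size of the
mapping at a given address, so no PTEs are actually cleared.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for fetch_pte(): size of the mapping at iova. */
typedef uint64_t (*sketch_lookup_fn)(uint64_t iova);

/* Walk the range in steps of the mapping size found at each address. */
static uint64_t sketch_unmap_walk(uint64_t iova, uint64_t size,
				  sketch_lookup_fn lookup)
{
	uint64_t unmapped = 0;

	assert(size && (size & (size - 1)) == 0);

	while (unmapped < size) {
		uint64_t unmap_size = lookup(iova);

		assert(unmap_size != 0);

		/* Round down to the mapping start, then step past it. */
		iova = (iova & ~(unmap_size - 1)) + unmap_size;
		unmapped += unmap_size;
	}

	return unmapped;
}

/* Every address is covered by a 4K mapping. */
static uint64_t sketch_lookup_4k(uint64_t iova)
{
	(void)iova;
	return 0x1000;
}

/* Every address is covered by a 2M mapping. */
static uint64_t sketch_lookup_2m(uint64_t iova)
{
	(void)iova;
	return 0x200000;
}
```

One consequence visible in the sketch: when the covering mapping is
larger than the requested size, the returned byte count is the size of
that mapping, not the size that was asked for.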
[PATCH v3 08/14] iommu/amd: Remove amd_iommu_domain_get_pgtable
Since the IO page table root and mode parameters have been moved into
the struct amd_io_pgtable, the function is no longer needed. Therefore,
remove it along with the struct domain_pgtable.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h       |  4 ++--
 drivers/iommu/amd/amd_iommu_types.h |  6 -
 drivers/iommu/amd/io_pgtable.c      | 36 ++---
 drivers/iommu/amd/iommu.c           | 34 ---
 4 files changed, 19 insertions(+), 61 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 8dff7d85be79..2059e64fdc53 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -101,6 +101,8 @@ static inline
 void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
 {
 	atomic64_set(&domain->iop.pt_root, root);
+	domain->iop.root = (u64 *)(root & PAGE_MASK);
+	domain->iop.mode = root & 7; /* lowest 3 bits encode pgtable mode */
 }

 static inline
@@ -135,8 +137,6 @@ extern unsigned long iommu_unmap_page(struct protection_domain *dom,
 extern u64 *fetch_pte(struct protection_domain *domain,
 		      unsigned long address,
 		      unsigned long *page_size);
-extern void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
-					 struct domain_pgtable *pgtable);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 					 u64 *root, int mode);
 extern void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 80b5c34357ed..de3fe9433080 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -514,12 +514,6 @@ struct protection_domain {
 	unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
 };

-/* For decocded pt_root */
-struct domain_pgtable {
-	int mode;
-	u64 *root;
-};
-
 /*
  * Structure where we save information about one hardware AMD IOMMU in the
  * system.
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 23e82da2dea8..6c063d2c8bf0 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -184,30 +184,27 @@ static bool increase_address_space(struct protection_domain *domain,
 				   unsigned long address,
 				   gfp_t gfp)
 {
-	struct domain_pgtable pgtable;
 	unsigned long flags;
 	bool ret = true;
 	u64 *pte;

 	spin_lock_irqsave(&domain->lock, flags);

-	amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-	if (address <= PM_LEVEL_SIZE(pgtable.mode))
+	if (address <= PM_LEVEL_SIZE(domain->iop.mode))
 		goto out;

 	ret = false;
-	if (WARN_ON_ONCE(pgtable.mode == PAGE_MODE_6_LEVEL))
+	if (WARN_ON_ONCE(domain->iop.mode == PAGE_MODE_6_LEVEL))
 		goto out;

 	pte = (void *)get_zeroed_page(gfp);
 	if (!pte)
 		goto out;

-	*pte = PM_LEVEL_PDE(pgtable.mode, iommu_virt_to_phys(pgtable.root));
+	*pte = PM_LEVEL_PDE(domain->iop.mode, iommu_virt_to_phys(domain->iop.root));

-	pgtable.root  = pte;
-	pgtable.mode += 1;
+	domain->iop.root  = pte;
+	domain->iop.mode += 1;
 	amd_iommu_update_and_flush_device_table(domain);
 	amd_iommu_domain_flush_complete(domain);

@@ -215,7 +212,7 @@ static bool increase_address_space(struct protection_domain *domain,
 	 * Device Table needs to be updated and flushed before the new root can
 	 * be published.
 	 */
-	amd_iommu_domain_set_pgtable(domain, pte, pgtable.mode);
+	amd_iommu_domain_set_pgtable(domain, pte, domain->iop.mode);

 	ret = true;

@@ -232,29 +229,23 @@ static u64 *alloc_pte(struct protection_domain *domain,
 		      gfp_t gfp,
 		      bool *updated)
 {
-	struct domain_pgtable pgtable;
 	int level, end_lvl;
 	u64 *pte, *page;

 	BUG_ON(!is_power_of_2(page_size));

-	amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-	while (address > PM_LEVEL_SIZE(pgtable.mode)) {
+	while (address > PM_LEVEL_SIZE(domain->iop.mode)) {
 		/*
 		 * Return an error if there is no memory to update the
 		 * page-table.
 		 */
 		if (!increase_address_space(domain, address, gfp))
 			return NULL;
-
-		/* Read new values to check if update was successful */
-		amd_iommu_domain_get_pgtable(domain, &pgtable);
 	}

-	level   = pgtable.mode - 1;
-	pte     = &pgtable.root[PM_LEVEL_INDEX(level, address)];
+	level   = domain->iop.mode - 1;
+	pte     = &domain->iop.root[PM_LEVEL_INDEX(level, address)];
[PATCH v3 02/14] iommu/amd: Prepare for generic IO page table framework
Add initial hook up code to implement generic IO page table framework.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/Kconfig           |  1 +
 drivers/iommu/amd/Makefile          |  2 +-
 drivers/iommu/amd/amd_iommu_types.h | 35 +++
 drivers/iommu/amd/io_pgtable.c      | 43 +
 drivers/iommu/amd/iommu.c           | 10 ---
 drivers/iommu/io-pgtable.c          |  3 ++
 include/linux/io-pgtable.h          |  2 ++
 7 files changed, 85 insertions(+), 11 deletions(-)
 create mode 100644 drivers/iommu/amd/io_pgtable.c

diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig
index 626b97d0dd21..a3cbafb603f5 100644
--- a/drivers/iommu/amd/Kconfig
+++ b/drivers/iommu/amd/Kconfig
@@ -10,6 +10,7 @@ config AMD_IOMMU
 	select IOMMU_API
 	select IOMMU_IOVA
 	select IOMMU_DMA
+	select IOMMU_IO_PGTABLE
 	depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE
 	help
 	  With this option you can enable support for AMD IOMMU hardware in
diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile
index dc5a2fa4fd37..a935f8f4b974 100644
--- a/drivers/iommu/amd/Makefile
+++ b/drivers/iommu/amd/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
-obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o
+obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o
 obj-$(CONFIG_AMD_IOMMU_V2) += iommu_v2.o
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index f696ac7c5f89..e3ac3e57e507 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include

 /*
  * Maximum number of IOMMUs supported
@@ -252,6 +253,19 @@
 #define GA_GUEST_NR		0x1

+#define IOMMU_IN_ADDR_BIT_SIZE  52
+#define IOMMU_OUT_ADDR_BIT_SIZE 52
+
+/*
+ * This bitmap is used to advertise the page sizes our hardware support
+ * to the IOMMU core, which will then use this information to split
+ * physically contiguous memory regions it is mapping into page sizes
+ * that we support.
+ *
+ * 512GB Pages are not supported due to a hardware bug
+ */
+#define AMD_IOMMU_PGSIZES	((~0xFFFUL) & ~(2ULL << 38))
+
 /* Bit value definition for dte irq remapping fields*/
 #define DTE_IRQ_PHYS_ADDR_MASK	(((1ULL << 45)-1) << 6)
 #define DTE_IRQ_REMAP_INTCTL_MASK	(0x3ULL << 60)
@@ -461,6 +475,26 @@ struct amd_irte_ops;

 #define AMD_IOMMU_FLAG_TRANS_PRE_ENABLED      (1 << 0)

+#define io_pgtable_to_data(x) \
+	container_of((x), struct amd_io_pgtable, iop)
+
+#define io_pgtable_ops_to_data(x) \
+	io_pgtable_to_data(io_pgtable_ops_to_pgtable(x))
+
+#define io_pgtable_ops_to_domain(x) \
+	container_of(io_pgtable_ops_to_data(x), \
+		     struct protection_domain, iop)
+
+#define io_pgtable_cfg_to_data(x) \
+	container_of((x), struct amd_io_pgtable, pgtbl_cfg)
+
+struct amd_io_pgtable {
+	struct io_pgtable_cfg	pgtbl_cfg;
+	struct io_pgtable	iop;
+	int			mode;
+	u64			*root;
+};
+
 /*
  * This structure contains generic data for IOMMU protection domains
  * independent of their use.
@@ -469,6 +503,7 @@ struct protection_domain {
 	struct list_head dev_list; /* List of all devices in this domain */
 	struct iommu_domain domain; /* generic domain handle used by
 				       iommu core code */
+	struct amd_io_pgtable iop;
 	spinlock_t lock;	/* mostly used to lock the page table*/
 	u16 id;			/* the domain id written to the device table */
 	atomic64_t pt_root;	/* pgtable root and pgtable mode */
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
new file mode 100644
index ..6b2de9e467d9
--- /dev/null
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * CPU-agnostic AMD IO page table allocator.
+ *
+ * Copyright (C) 2020 Advanced Micro Devices, Inc.
+ * Author: Suravee Suthikulpanit
+ */
+
+#define pr_fmt(fmt)     "AMD-Vi: " fmt
+#define dev_fmt(fmt)    pr_fmt(fmt)
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+#include "amd_iommu_types.h"
+#include "amd_iommu.h"
+
+/*
+ *
+ */
+static void v1_free_pgtable(struct io_pgtable *iop)
+{
+}
+
+static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
+{
+	struct amd_io_pgtable *pgtable = io_pgtable_cfg_to_data(cfg);
+
+	return &pgtable->iop;
+}
+
+struct io_pgtable_init_fns io_pgtable_amd_iommu_v1_init_fns = {
+	.alloc	= v1_alloc_pgtable,
+	.free	= v1_free_pgtable,
+};
diff --git a/dr
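The io_pgtable_ops_to_domain helper in this patch works because the ops
structure is embedded in struct io_pgtable, which is embedded in struct
amd_io_pgtable, which is in turn embedded in struct protection_domain.
Here is a minimal userspace sketch of that container_of chain. All
sketch_* types are reduced, hypothetical stand-ins for the kernel
structures, and sketch_container_of is a plain offsetof-based version
of the kernel macro without its type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to its parent. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-ins; field lists are illustrative only. */
struct sketch_ops { int unused; };
struct sketch_cfg { unsigned long pgsize_bitmap; };
struct sketch_io_pgtable { int fmt; struct sketch_ops ops; };

struct sketch_amd_io_pgtable {
	struct sketch_cfg pgtbl_cfg;
	struct sketch_io_pgtable iop;
	int mode;
};

struct sketch_domain {
	int id;
	struct sketch_amd_io_pgtable iop;
};

/* ops -> io_pgtable -> amd_io_pgtable -> domain, one hop per embedding. */
static struct sketch_domain *sketch_ops_to_domain(struct sketch_ops *ops)
{
	struct sketch_io_pgtable *iop;
	struct sketch_amd_io_pgtable *data;

	assert(ops != NULL);

	iop = sketch_container_of(ops, struct sketch_io_pgtable, ops);
	data = sketch_container_of(iop, struct sketch_amd_io_pgtable, iop);

	return sketch_container_of(data, struct sketch_domain, iop);
}
```

Because every hop is pointer arithmetic on a compile-time offset, the
callbacks can recover the owning domain from the ops pointer alone and
need no extra back-pointer field.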
[PATCH v3 04/14] iommu/amd: Convert to using amd_io_pgtable
Make use of the new struct amd_io_pgtable in preparation to remove
the struct domain_pgtable.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/iommu.c     | 25 ++---
 2 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index da6e09657e00..22ecacb71675 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -47,6 +47,7 @@ extern void amd_iommu_domain_direct_map(struct iommu_domain *dom);
 extern int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
 extern int amd_iommu_flush_page(struct iommu_domain *dom, int pasid,
 				u64 address);
+extern void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
 extern int amd_iommu_flush_tlb(struct iommu_domain *dom, int pasid);
 extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid,
 				     unsigned long cr3);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index c8b8619cc744..09da37c4c9c4 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -90,8 +90,6 @@ struct kmem_cache *amd_iommu_irq_cache;

 static void update_domain(struct protection_domain *domain);
 static void detach_device(struct device *dev);
-static void update_and_flush_device_table(struct protection_domain *domain,
-					  struct domain_pgtable *pgtable);

 /
  *
@@ -1482,7 +1480,7 @@ static bool increase_address_space(struct protection_domain *domain,

 	pgtable.root  = pte;
 	pgtable.mode += 1;
-	update_and_flush_device_table(domain, &pgtable);
+	amd_iommu_update_and_flush_device_table(domain);
 	domain_flush_complete(domain);

 	/*
@@ -1857,17 +1855,16 @@ static void free_gcr3_table(struct protection_domain *domain)
 }

 static void set_dte_entry(u16 devid, struct protection_domain *domain,
-			  struct domain_pgtable *pgtable,
 			  bool ats, bool ppr)
 {
 	u64 pte_root = 0;
 	u64 flags = 0;
 	u32 old_domid;

-	if (pgtable->mode != PAGE_MODE_NONE)
-		pte_root = iommu_virt_to_phys(pgtable->root);
+	if (domain->iop.mode != PAGE_MODE_NONE)
+		pte_root = iommu_virt_to_phys(domain->iop.root);

-	pte_root |= (pgtable->mode & DEV_ENTRY_MODE_MASK)
+	pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
 		    << DEV_ENTRY_MODE_SHIFT;
 	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;

@@ -1957,7 +1954,7 @@ static void do_attach(struct iommu_dev_data *dev_data,

 	/* Update device table */
 	amd_iommu_domain_get_pgtable(domain, &pgtable);
-	set_dte_entry(dev_data->devid, domain, &pgtable,
+	set_dte_entry(dev_data->devid, domain,
 		      ats, dev_data->iommu_v2);
 	clone_aliases(dev_data->pdev);

@@ -2263,22 +2260,20 @@ static int amd_iommu_domain_get_attr(struct iommu_domain *domain,
  *
  */

-static void update_device_table(struct protection_domain *domain,
-				struct domain_pgtable *pgtable)
+static void update_device_table(struct protection_domain *domain)
 {
 	struct iommu_dev_data *dev_data;

 	list_for_each_entry(dev_data, &domain->dev_list, list) {
-		set_dte_entry(dev_data->devid, domain, pgtable,
+		set_dte_entry(dev_data->devid, domain,
 			      dev_data->ats.enabled, dev_data->iommu_v2);
 		clone_aliases(dev_data->pdev);
 	}
 }

-static void update_and_flush_device_table(struct protection_domain *domain,
-					  struct domain_pgtable *pgtable)
+void amd_iommu_update_and_flush_device_table(struct protection_domain *domain)
 {
-	update_device_table(domain, pgtable);
+	update_device_table(domain);

 	domain_flush_devices(domain);
 }
@@ -2288,7 +2283,7 @@ static void update_domain(struct protection_domain *domain)
 	/* Update device table */
 	amd_iommu_domain_get_pgtable(domain, &pgtable);

-	update_and_flush_device_table(domain, &pgtable);
+	amd_iommu_update_and_flush_device_table(domain);

 	/* Flush domain TLB(s) and wait for completion */
 	domain_flush_tlb_pde(domain);
--
2.17.1
[PATCH v3 01/14] iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
Move the function to header file to allow inclusion in other files.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h | 13 +
 drivers/iommu/amd/iommu.c     | 10 --
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 57309716fd18..97cdb235ce69 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -93,6 +93,19 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
 	return phys_to_virt(__sme_clr(paddr));
 }

+static inline
+void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
+{
+	atomic64_set(&domain->pt_root, root);
+}
+
+static inline
+void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
+{
+	amd_iommu_domain_set_pt_root(domain, 0);
+}
+
+
 extern bool translation_pre_enabled(struct amd_iommu *iommu);
 extern bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
 					 struct device *dev);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index db4fb840c59c..e92b3f744292 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -162,16 +162,6 @@ static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
 	pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
 }

-static void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
-{
-	atomic64_set(&domain->pt_root, root);
-}
-
-static void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
-{
-	amd_iommu_domain_set_pt_root(domain, 0);
-}
-
 static void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 					 u64 *root, int mode)
 {
--
2.17.1
[PATCH v3 03/14] iommu/amd: Move pt_root to struct amd_io_pgtable
Move pt_root into struct amd_io_pgtable to better organize the data
structures, since it holds IO page table related information.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu.h       | 2 +-
 drivers/iommu/amd/amd_iommu_types.h | 2 +-
 drivers/iommu/amd/iommu.c           | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 97cdb235ce69..da6e09657e00 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -96,7 +96,7 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
 static inline
 void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
 {
-	atomic64_set(&domain->pt_root, root);
+	atomic64_set(&domain->iop.pt_root, root);
 }

 static inline
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index e3ac3e57e507..80b5c34357ed 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -493,6 +493,7 @@ struct amd_io_pgtable {
 	struct io_pgtable	iop;
 	int			mode;
 	u64			*root;
+	atomic64_t		pt_root;    /* pgtable root and pgtable mode */
 };

 /*
@@ -506,7 +507,6 @@ struct protection_domain {
 	struct amd_io_pgtable iop;
 	spinlock_t lock;	/* mostly used to lock the page table*/
 	u16 id;			/* the domain id written to the device table */
-	atomic64_t pt_root;	/* pgtable root and pgtable mode */
 	int glx;		/* Number of levels for GCR3 table */
 	u64 *gcr3_tbl;		/* Guest CR3 table */
 	unsigned long flags;	/* flags to find out type of domain */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2b7eb51dcbb8..c8b8619cc744 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -146,7 +146,7 @@ static struct protection_domain *to_pdomain(struct iommu_domain *dom)
 static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
 					 struct domain_pgtable *pgtable)
 {
-	u64 pt_root = atomic64_read(&domain->iop.pt_root);
+	u64 pt_root = atomic64_read(&domain->pt_root);
+	u64 pt_root = atomic64_read(&domain->iop.pt_root);

 	pgtable->root = (u64 *)(pt_root & PAGE_MASK);
 	pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
--
2.17.1
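The pt_root value moved by this patch packs two things into one 64-bit
word so that both can be updated atomically: the page-aligned address
of the root table and, in the lowest 3 bits, the paging mode. A minimal
userspace sketch of that packing follows. The sketch_* names and the
4K page size are assumptions for illustration; the kernel operates on
a u64 pointer and PAGE_MASK rather than on a plain integer.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4K page size for the sketch. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Pack a page-aligned root address and a 3-bit mode into one word. */
static uint64_t sketch_encode_pt_root(uint64_t root, int mode)
{
	/* The root must be page aligned so the low bits are free. */
	assert((root & ~SKETCH_PAGE_MASK) == 0);

	return root | (uint64_t)(mode & 7);
}

/* Recover the root address: everything above the page offset. */
static uint64_t sketch_pt_root_addr(uint64_t pt_root)
{
	return pt_root & SKETCH_PAGE_MASK;
}

/* Recover the mode: the lowest 3 bits. */
static int sketch_pt_root_mode(uint64_t pt_root)
{
	return (int)(pt_root & 7);
}
```

The packing relies on the root table being page aligned, which leaves
the low 12 bits of the address zero and therefore free to carry the
mode.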