[PATCH v3 7/7] iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled

2022-06-22 Thread Suravee Suthikulpanit via iommu
The IOMMUv2 APIs (for supporting shared virtual memory with PASID)
configure the domain with an IOMMU v2 page table and set DTE[Mode]=0.
This configuration cannot be supported on an SNP-enabled system.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index f5695ccb7c81..4c9b96160a8b 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3448,7 +3448,12 @@ __setup("ivrs_acpihid",  parse_ivrs_acpihid);
 
 bool amd_iommu_v2_supported(void)
 {
-   return amd_iommu_v2_present;
+   /*
+* Since DTE[Mode]=0 is prohibited on SNP-enabled system
+* (i.e. EFR[SNPSup]=1), IOMMUv2 page table cannot be used without
+* setting up IOMMUv1 page table.
+*/
+   return amd_iommu_v2_present && !amd_iommu_snp_en;
 }
 EXPORT_SYMBOL(amd_iommu_v2_supported);
 
-- 
2.32.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 0/7] iommu/amd: Enforce IOMMU restrictions for SNP-enabled system

2022-06-22 Thread Suravee Suthikulpanit via iommu
An SNP-enabled system requires the IOMMU v1 page table to be configured
with a non-zero DTE[Mode] for DMA-capable devices. This affects a number
of use cases, such as IOMMU passthrough mode and the AMD IOMMUv2 APIs for
binding/unbinding a PASID.

This series introduces a global variable to check the SNP-enabled state
during driver initialization, and uses it to enforce the SNP restrictions
at runtime.

Also, for non-DMA-capable devices such as the IOAPIC, the recommendation
is to set DTE[TV] and DTE[Mode] to zero on an SNP-enabled system.
Therefore, an additional check is added before setting DTE[TV].

Testing:
  - Tested booting and verified dmesg.
  - Tested booting with iommu=pt
  - Tested loading amd_iommu_v2 driver
  - Tested changing the iommu domain at runtime
  - Tested booting SEV/SNP-enabled guest
  - Tested when CONFIG_AMD_MEM_ENCRYPT is not set

Pre-requisite:
  - [PATCH v3 00/35] iommu/amd: Add multiple PCI segments support

https://lore.kernel.org/linux-iommu/20220511072141.15485-29-vasant.he...@amd.com/T/

Changes from V2:
(https://lists.linuxfoundation.org/pipermail/iommu/2022-June/066392.html)
  - Patch 4:
 * Update pr_err message to report SNP not supported.
 * Remove export GPL.
 * Remove function stub when CONFIG_AMD_MEM_ENCRYPT is not set.
  - Patch 6: Change to WARN_ONCE.

Best Regards,
Suravee

Brijesh Singh (1):
  iommu/amd: Introduce function to check and enable SNP

Suravee Suthikulpanit (6):
  iommu/amd: Warn when an inconsistent EFR mask is found
  iommu/amd: Process all IVHDs before enabling IOMMU features
  iommu/amd: Introduce an iommu variable for tracking SNP support status
  iommu/amd: Set translation valid bit only when IO page tables are in
use
  iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled
  iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled

 drivers/iommu/amd/amd_iommu_types.h |   5 ++
 drivers/iommu/amd/init.c| 109 +++-
 drivers/iommu/amd/iommu.c   |  27 ++-
 include/linux/amd-iommu.h   |   4 +
 4 files changed, 123 insertions(+), 22 deletions(-)

-- 
2.32.0



[PATCH v3 6/7] iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled

2022-06-22 Thread Suravee Suthikulpanit via iommu
Once SNP is enabled (by executing the SNP_INIT command), the IOMMU can no
longer support the passthrough domain (i.e. IOMMU_DOMAIN_IDENTITY).

The SNP_INIT command is issued early in the boot process, and would fail
if the kernel is configured to default to passthrough mode.

After the system has booted, users can try to change the IOMMU domain
type of a particular IOMMU group. In this case, the IOMMU driver needs to
check the SNP-enabled status and fail any request to change the domain
type to identity.

Therefore, return failure when trying to allocate an identity domain.

Reviewed-by: Robin Murphy 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4f4571d3ff61..7093e26fec59 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2119,6 +2119,14 @@ static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
 {
struct protection_domain *domain;
 
+   /*
+* Since DTE[Mode]=0 is prohibited on SNP-enabled system,
+* default to use IOMMU_DOMAIN_DMA[_FQ].
+*/
+   if (WARN_ONCE(amd_iommu_snp_en && (type == IOMMU_DOMAIN_IDENTITY),
+ "Cannot allocate identity domain due to SNP\n"))
+   return NULL;
+
domain = protection_domain_alloc(type);
if (!domain)
return NULL;
-- 
2.32.0



[PATCH v3 4/7] iommu/amd: Introduce function to check and enable SNP

2022-06-22 Thread Suravee Suthikulpanit via iommu
From: Brijesh Singh 

To support SNP, the IOMMU must be enabled, and IOMMU configurations where
DTE[Mode]=0 are prohibited. This means SNP cannot be supported with the
IOMMU passthrough domain (a.k.a. IOMMU_DOMAIN_IDENTITY), or when the AMD
IOMMU driver is configured not to use the IOMMU host (v1) page table.
Otherwise, RMP table initialization could cause the system to crash.

The request to enable SNP support in the IOMMU must be made before the
IOMMU driver reaches its PCI initialization state, because enabling SNP
affects how the driver sets up IOMMU data structures (i.e. the DTE).

Unlike other IOMMU features, the SNP feature does not have an enable bit
in the IOMMU control register. Instead, the IOMMU driver introduces an
amd_iommu_snp_en variable to track the SNP enablement state.

Introduce amd_iommu_snp_enable(), which other drivers call to request
enabling SNP support in the IOMMU; it checks all prerequisites and
determines whether the feature can be safely enabled.

Please see the IOMMU spec section 2.12 for further details.

Reviewed-by: Robin Murphy 
Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Brijesh Singh 
---
 drivers/iommu/amd/amd_iommu_types.h |  5 
 drivers/iommu/amd/init.c| 44 +++--
 drivers/iommu/amd/iommu.c   |  4 +--
 include/linux/amd-iommu.h   |  4 +++
 4 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 73b729be7410..ce4db2835b36 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -463,6 +463,9 @@ extern bool amd_iommu_irq_remap;
 /* kmem_cache to get tables with 128 byte alignement */
 extern struct kmem_cache *amd_iommu_irq_cache;
 
+/* SNP is enabled on the system? */
+extern bool amd_iommu_snp_en;
+
 #define PCI_SBDF_TO_SEGID(sbdf)   (((sbdf) >> 16) & 0xffff)
 #define PCI_SBDF_TO_DEVID(sbdf)   ((sbdf) & 0xffff)
 #define PCI_SEG_DEVID_TO_SBDF(seg, devid)  ((((u32)(seg) & 0xffff) << 16) | \
 ((devid) & 0xffff))
@@ -1013,4 +1016,6 @@ extern struct amd_irte_ops irte_32_ops;
 extern struct amd_irte_ops irte_128_ops;
 #endif
 
+extern struct iommu_ops amd_iommu_ops;
+
 #endif /* _ASM_X86_AMD_IOMMU_TYPES_H */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 013c55e3c2f2..c62fb4470519 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -95,8 +95,6 @@
  * out of it.
  */
 
-extern const struct iommu_ops amd_iommu_ops;
-
 /*
  * structure describing one IOMMU in the ACPI table. Typically followed by one
  * or more ivhd_entrys.
@@ -168,6 +166,9 @@ static int amd_iommu_target_ivhd_type;
 
 static bool amd_iommu_snp_sup;
 
+bool amd_iommu_snp_en;
+EXPORT_SYMBOL(amd_iommu_snp_en);
+
 LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */
 LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the
   system */
@@ -3549,3 +3550,42 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn, u64
 
return iommu_pc_get_set_reg(iommu, bank, cntr, fxn, value, true);
 }
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+int amd_iommu_snp_enable(void)
+{
+   /*
+* The SNP support requires that IOMMU must be enabled, and is
+* not configured in the passthrough mode.
+*/
+   if (no_iommu || iommu_default_passthrough()) {
+   pr_err("SNP: IOMMU is disabled or configured in passthrough mode, SNP cannot be supported");
+   return -EINVAL;
+   }
+
+   /*
+* Prevent enabling SNP after IOMMU_ENABLED state because this process
+* affect how IOMMU driver sets up data structures and configures
+* IOMMU hardware.
+*/
+   if (init_state > IOMMU_ENABLED) {
+   pr_err("SNP: Too late to enable SNP for IOMMU.\n");
+   return -EINVAL;
+   }
+
+   amd_iommu_snp_en = amd_iommu_snp_sup;
+   if (!amd_iommu_snp_en)
+   return -EINVAL;
+
+   pr_info("SNP enabled\n");
+
+   /* Enforce IOMMU v1 pagetable when SNP is enabled. */
+   if (amd_iommu_pgtable != AMD_IOMMU_V1) {
+   pr_warn("Force to using AMD IOMMU v1 page table due to SNP\n");
+   amd_iommu_pgtable = AMD_IOMMU_V1;
+   amd_iommu_ops.pgsize_bitmap = AMD_IOMMU_PGSIZES;
+   }
+
+   return 0;
+}
+#endif
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 86045dc50a0f..0792cd618dba 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -71,7 +71,7 @@ LIST_HEAD(acpihid_map);
  * Domain for untranslated devices - only allocated
  * if iommu=pt passed on kernel cmd line.
  */
-const struct iommu_ops amd_iommu_ops;
+struct iommu_ops amd_iommu_ops;
 
 static ATOMIC_NOTIFIER_HEAD(ppr_notifier);
 int amd_iommu_max_glx_val = -1;
@@ -2412,7

[PATCH v3 5/7] iommu/amd: Set translation valid bit only when IO page tables are in use

2022-06-22 Thread Suravee Suthikulpanit via iommu
On AMD systems with SNP enabled, the IOMMU hardware checks the host
translation valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices regardless
of whether the host page table is in use. This results in an
ILLEGAL_DEV_TABLE_ENTRY event for devices that do not have the host page
table root pointer set up.

Therefore, when SNP is enabled, only set the TV bit when DMA remapping is
in use, i.e. when the domain ID in the AMD IOMMU device table entry (DTE)
is non-zero.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c  |  3 ++-
 drivers/iommu/amd/iommu.c | 15 +--
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index c62fb4470519..f5695ccb7c81 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2544,7 +2544,8 @@ static void init_device_table_dma(struct amd_iommu_pci_seg *pci_seg)
 
for (devid = 0; devid <= pci_seg->last_bdf; ++devid) {
__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_VALID);
-   __set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION);
+   if (!amd_iommu_snp_en)
+   __set_dev_entry_bit(dev_table, devid, 
DEV_ENTRY_TRANSLATION);
}
 }
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 0792cd618dba..4f4571d3ff61 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1563,7 +1563,14 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
(domain->flags & PD_GIOV_MASK))
pte_root |= DTE_FLAG_GIOV;
 
-   pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+   pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V;
+
+   /*
+* When SNP is enabled, Only set TV bit when IOMMU
+* page translation is in use.
+*/
+   if (!amd_iommu_snp_en || (domain->id != 0))
+   pte_root |= DTE_FLAG_TV;
 
flags = dev_table[devid].data[1];
 
@@ -1625,7 +1632,11 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
struct dev_table_entry *dev_table = get_dev_table(iommu);
 
/* remove entry from the device table seen by the hardware */
-   dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
+   dev_table[devid].data[0]  = DTE_FLAG_V;
+
+   if (!amd_iommu_snp_en)
+   dev_table[devid].data[0] |= DTE_FLAG_TV;
+
dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
amd_iommu_apply_erratum_63(iommu, devid);
-- 
2.32.0



[PATCH v3 3/7] iommu/amd: Introduce an iommu variable for tracking SNP support status

2022-06-22 Thread Suravee Suthikulpanit via iommu
EFR[SNPSup] needs to be checked early in the boot process, since it is
used to determine how the IOMMU driver configures other IOMMU features
and data structures. This check can be done as soon as the IOMMU driver
finishes parsing IVHDs.

Introduce a variable for tracking the SNP support status, which is
initialized before enabling the rest of the IOMMU features.

Also, report the IOMMU SNP support information for each IOMMU.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 5f86e357dbaa..013c55e3c2f2 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -166,6 +166,8 @@ static bool amd_iommu_disabled __initdata;
 static bool amd_iommu_force_enable __initdata;
 static int amd_iommu_target_ivhd_type;
 
+static bool amd_iommu_snp_sup;
+
 LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */
 LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the
   system */
@@ -260,7 +262,6 @@ int amd_iommu_get_num_iommus(void)
return amd_iommus_present;
 }
 
-#ifdef CONFIG_IRQ_REMAP
 /*
  * Iterate through all the IOMMUs to verify if the specified
  * EFR bitmask of IOMMU feature are set.
@@ -285,7 +286,6 @@ static bool check_feature_on_all_iommus(u64 mask)
}
return ret;
 }
-#endif
 
 /*
  * For IVHD type 0x11/0x40, EFR is also available via IVHD.
@@ -368,7 +368,7 @@ static void iommu_set_cwwb_range(struct amd_iommu *iommu)
u64 start = iommu_virt_to_phys((void *)iommu->cmd_sem);
u64 entry = start & PM_ADDR_MASK;
 
-   if (!iommu_feature(iommu, FEATURE_SNP))
+   if (!amd_iommu_snp_sup)
return;
 
/* Note:
@@ -783,7 +783,7 @@ static void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
void *buf = (void *)__get_free_pages(gfp, order);
 
if (buf &&
-   iommu_feature(iommu, FEATURE_SNP) &&
+   amd_iommu_snp_sup &&
set_memory_4k((unsigned long)buf, (1 << order))) {
free_pages((unsigned long)buf, order);
buf = NULL;
@@ -1882,6 +1882,7 @@ static int __init init_iommu_all(struct acpi_table_header *table)
WARN_ON(p != end);
 
/* Phase 2 : Early feature support check */
+   amd_iommu_snp_sup = check_feature_on_all_iommus(FEATURE_SNP);
 
/* Phase 3 : Enabling IOMMU features */
for_each_iommu(iommu) {
@@ -2118,6 +2119,9 @@ static void print_iommu_info(void)
if (iommu->features & FEATURE_GAM_VAPIC)
pr_cont(" GA_vAPIC");
 
+   if (iommu->features & FEATURE_SNP)
+   pr_cont(" SNP");
+
pr_cont("\n");
}
}
-- 
2.32.0



[PATCH v3 1/7] iommu/amd: Warn when an inconsistent EFR mask is found

2022-06-22 Thread Suravee Suthikulpanit via iommu
The function check_feature_on_all_iommus() checks whether an IOMMU
feature support bit is set in the Extended Feature Register (EFR).
The current logic iterates through all IOMMUs and returns false as soon
as it finds the first unset bit.

To provide more thorough checking, modify the logic to keep iterating
through all IOMMUs even after finding an unset bit, and throw an FW_BUG
warning if an inconsistency is found.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 3dd0f26039c7..b3e4551ce9dd 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -261,18 +261,29 @@ int amd_iommu_get_num_iommus(void)
 }
 
 #ifdef CONFIG_IRQ_REMAP
+/*
+ * Iterate through all the IOMMUs to verify if the specified
+ * EFR bitmask of IOMMU feature are set.
+ * Warn and return false if found inconsistency.
+ */
 static bool check_feature_on_all_iommus(u64 mask)
 {
bool ret = false;
struct amd_iommu *iommu;
 
for_each_iommu(iommu) {
-   ret = iommu_feature(iommu, mask);
-   if (!ret)
+   bool tmp = iommu_feature(iommu, mask);
+
+   if ((ret != tmp) &&
+   !list_is_first(&iommu->list, &amd_iommu_list)) {
+   pr_err(FW_BUG "Found inconsistent EFR mask (%#llx) on iommu%d (%04x:%02x:%02x.%01x).\n",
+  mask, iommu->index, iommu->pci_seg->id, PCI_BUS_NUM(iommu->devid),
+  PCI_SLOT(iommu->devid), PCI_FUNC(iommu->devid));
return false;
+   }
+   ret = tmp;
}
-
-   return true;
+   return ret;
 }
 #endif
 
-- 
2.32.0



[PATCH v3 2/7] iommu/amd: Process all IVHDs before enabling IOMMU features

2022-06-22 Thread Suravee Suthikulpanit via iommu
The ACPI IVRS table can contain multiple IVHD blocks. Each block contains
information used to initialize each IOMMU instance.

Currently, init_iommu_all processes each IVHD block sequentially and
initializes the IOMMU instances one by one. However, certain features
require all IOMMUs to be configured in the same way system-wide. If
certain IVHD blocks contain inconsistent information (most likely FW
bugs), the driver needs to go through and try to revert settings on
IOMMUs that have already been configured.

A solution is to split IOMMU initialization into three phases:

Phase 1: Process the information in the IVRS table for all IOMMU
instances. This allows all IVHDs to be processed prior to enabling
features.

Phase 2: Perform an early feature support check on all IOMMUs, using the
information in the IVHD blocks.

Phase 3: Iterate through all IOMMU instances and enable features.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index b3e4551ce9dd..5f86e357dbaa 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1692,7 +1692,6 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 struct acpi_table_header *ivrs_base)
 {
struct amd_iommu_pci_seg *pci_seg;
-   int ret;
 
pci_seg = get_pci_segment(h->pci_seg, ivrs_base);
if (pci_seg == NULL)
@@ -1773,6 +1772,13 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
if (!iommu->mmio_base)
return -ENOMEM;
 
+   return init_iommu_from_acpi(iommu, h);
+}
+
+static int __init init_iommu_one_late(struct amd_iommu *iommu)
+{
+   int ret;
+
if (alloc_cwwb_sem(iommu))
return -ENOMEM;
 
@@ -1794,10 +1800,6 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
if (amd_iommu_pre_enabled)
amd_iommu_pre_enabled = translation_pre_enabled(iommu);
 
-   ret = init_iommu_from_acpi(iommu, h);
-   if (ret)
-   return ret;
-
if (amd_iommu_irq_remap) {
ret = amd_iommu_create_irq_domain(iommu);
if (ret)
@@ -1808,7 +1810,7 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 * Make sure IOMMU is not considered to translate itself. The IVRS
 * table tells us so, but this is a lie!
 */
-   pci_seg->rlookup_table[iommu->devid] = NULL;
+   iommu->pci_seg->rlookup_table[iommu->devid] = NULL;
 
return 0;
 }
@@ -1853,6 +1855,7 @@ static int __init init_iommu_all(struct acpi_table_header *table)
end += table->length;
p += IVRS_HEADER_LENGTH;
 
+   /* Phase 1: Process all IVHD blocks */
while (p < end) {
h = (struct ivhd_header *)p;
if (*p == amd_iommu_target_ivhd_type) {
@@ -1878,6 +1881,15 @@ static int __init init_iommu_all(struct acpi_table_header *table)
}
WARN_ON(p != end);
 
+   /* Phase 2 : Early feature support check */
+
+   /* Phase 3 : Enabling IOMMU features */
+   for_each_iommu(iommu) {
+   ret = init_iommu_one_late(iommu);
+   if (ret)
+   return ret;
+   }
+
return 0;
 }
 
-- 
2.32.0



[PATCH v2 4/7] iommu/amd: Introduce function to check and enable SNP

2022-06-15 Thread Suravee Suthikulpanit via iommu
From: Brijesh Singh 

To support SNP, the IOMMU must be enabled, and IOMMU configurations where
DTE[Mode]=0 are prohibited. This means SNP cannot be supported with the
IOMMU passthrough domain (a.k.a. IOMMU_DOMAIN_IDENTITY), or when the AMD
IOMMU driver is configured not to use the IOMMU host (v1) page table.
Otherwise, RMP table initialization could cause the system to crash.

The request to enable SNP support in the IOMMU must be made before the
IOMMU driver reaches its PCI initialization state, because enabling SNP
affects how the driver sets up IOMMU data structures (i.e. the DTE).

Unlike other IOMMU features, the SNP feature does not have an enable bit
in the IOMMU control register. Instead, the IOMMU driver introduces an
amd_iommu_snp_en variable to track the SNP enablement state.

Introduce amd_iommu_snp_enable(), which other drivers call to request
enabling SNP support in the IOMMU; it checks all prerequisites and
determines whether the feature can be safely enabled.

Please see the IOMMU spec section 2.12 for further details.

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Brijesh Singh 
---
 drivers/iommu/amd/amd_iommu_types.h |  5 
 drivers/iommu/amd/init.c| 45 +++--
 drivers/iommu/amd/iommu.c   |  4 +--
 include/linux/amd-iommu.h   |  6 
 4 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 73b729be7410..ce4db2835b36 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -463,6 +463,9 @@ extern bool amd_iommu_irq_remap;
 /* kmem_cache to get tables with 128 byte alignement */
 extern struct kmem_cache *amd_iommu_irq_cache;
 
+/* SNP is enabled on the system? */
+extern bool amd_iommu_snp_en;
+
 #define PCI_SBDF_TO_SEGID(sbdf)   (((sbdf) >> 16) & 0xffff)
 #define PCI_SBDF_TO_DEVID(sbdf)   ((sbdf) & 0xffff)
 #define PCI_SEG_DEVID_TO_SBDF(seg, devid)  ((((u32)(seg) & 0xffff) << 16) | \
 ((devid) & 0xffff))
@@ -1013,4 +1016,6 @@ extern struct amd_irte_ops irte_32_ops;
 extern struct amd_irte_ops irte_128_ops;
 #endif
 
+extern struct iommu_ops amd_iommu_ops;
+
 #endif /* _ASM_X86_AMD_IOMMU_TYPES_H */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 013c55e3c2f2..b5d3de327a5f 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -95,8 +95,6 @@
  * out of it.
  */
 
-extern const struct iommu_ops amd_iommu_ops;
-
 /*
  * structure describing one IOMMU in the ACPI table. Typically followed by one
  * or more ivhd_entrys.
@@ -168,6 +166,9 @@ static int amd_iommu_target_ivhd_type;
 
 static bool amd_iommu_snp_sup;
 
+bool amd_iommu_snp_en;
+EXPORT_SYMBOL(amd_iommu_snp_en);
+
 LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */
 LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the
   system */
@@ -3549,3 +3550,43 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn, u64
 
return iommu_pc_get_set_reg(iommu, bank, cntr, fxn, value, true);
 }
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+int amd_iommu_snp_enable(void)
+{
+   /*
+* The SNP support requires that IOMMU must be enabled, and is
+* not configured in the passthrough mode.
+*/
+   if (no_iommu || iommu_default_passthrough()) {
+   pr_err("SNP: IOMMU is either disabled or configured in passthrough mode.\n");
+   return -EINVAL;
+   }
+
+   /*
+* Prevent enabling SNP after IOMMU_ENABLED state because this process
+* affect how IOMMU driver sets up data structures and configures
+* IOMMU hardware.
+*/
+   if (init_state > IOMMU_ENABLED) {
+   pr_err("SNP: Too late to enable SNP for IOMMU.\n");
+   return -EINVAL;
+   }
+
+   amd_iommu_snp_en = amd_iommu_snp_sup;
+   if (!amd_iommu_snp_en)
+   return -EINVAL;
+
+   pr_info("SNP enabled\n");
+
+   /* Enforce IOMMU v1 pagetable when SNP is enabled. */
+   if (amd_iommu_pgtable != AMD_IOMMU_V1) {
+   pr_warn("Force to using AMD IOMMU v1 page table due to SNP\n");
+   amd_iommu_pgtable = AMD_IOMMU_V1;
+   amd_iommu_ops.pgsize_bitmap = AMD_IOMMU_PGSIZES;
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(amd_iommu_snp_enable);
+#endif
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 86045dc50a0f..0792cd618dba 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -71,7 +71,7 @@ LIST_HEAD(acpihid_map);
  * Domain for untranslated devices - only allocated
  * if iommu=pt passed on kernel cmd line.
  */
-const struct iommu_ops amd_iommu_ops;
+struct iommu_ops amd_iommu_ops;
 
 static ATOMIC_NOTIFIER_HEAD(ppr_notifier);
 int amd_iommu_max_glx_val = -1;
@@ -2412,7

[PATCH v2 2/7] iommu/amd: Process all IVHDs before enabling IOMMU features

2022-06-15 Thread Suravee Suthikulpanit via iommu
The ACPI IVRS table can contain multiple IVHD blocks. Each block contains
information used to initialize each IOMMU instance.

Currently, init_iommu_all processes each IVHD block sequentially and
initializes the IOMMU instances one by one. However, certain features
require all IOMMUs to be configured in the same way system-wide. If
certain IVHD blocks contain inconsistent information (most likely FW
bugs), the driver needs to go through and try to revert settings on
IOMMUs that have already been configured.

A solution is to split IOMMU initialization into three phases:

Phase 1: Process the information in the IVRS table for all IOMMU
instances. This allows all IVHDs to be processed prior to enabling
features.

Phase 2: Perform an early feature support check on all IOMMUs, using the
information in the IVHD blocks.

Phase 3: Iterate through all IOMMU instances and enable features.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index b3e4551ce9dd..5f86e357dbaa 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1692,7 +1692,6 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 struct acpi_table_header *ivrs_base)
 {
struct amd_iommu_pci_seg *pci_seg;
-   int ret;
 
pci_seg = get_pci_segment(h->pci_seg, ivrs_base);
if (pci_seg == NULL)
@@ -1773,6 +1772,13 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
if (!iommu->mmio_base)
return -ENOMEM;
 
+   return init_iommu_from_acpi(iommu, h);
+}
+
+static int __init init_iommu_one_late(struct amd_iommu *iommu)
+{
+   int ret;
+
if (alloc_cwwb_sem(iommu))
return -ENOMEM;
 
@@ -1794,10 +1800,6 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
if (amd_iommu_pre_enabled)
amd_iommu_pre_enabled = translation_pre_enabled(iommu);
 
-   ret = init_iommu_from_acpi(iommu, h);
-   if (ret)
-   return ret;
-
if (amd_iommu_irq_remap) {
ret = amd_iommu_create_irq_domain(iommu);
if (ret)
@@ -1808,7 +1810,7 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h,
 * Make sure IOMMU is not considered to translate itself. The IVRS
 * table tells us so, but this is a lie!
 */
-   pci_seg->rlookup_table[iommu->devid] = NULL;
+   iommu->pci_seg->rlookup_table[iommu->devid] = NULL;
 
return 0;
 }
@@ -1853,6 +1855,7 @@ static int __init init_iommu_all(struct acpi_table_header *table)
end += table->length;
p += IVRS_HEADER_LENGTH;
 
+   /* Phase 1: Process all IVHD blocks */
while (p < end) {
h = (struct ivhd_header *)p;
if (*p == amd_iommu_target_ivhd_type) {
@@ -1878,6 +1881,15 @@ static int __init init_iommu_all(struct acpi_table_header *table)
}
WARN_ON(p != end);
 
+   /* Phase 2 : Early feature support check */
+
+   /* Phase 3 : Enabling IOMMU features */
+   for_each_iommu(iommu) {
+   ret = init_iommu_one_late(iommu);
+   if (ret)
+   return ret;
+   }
+
return 0;
 }
 
-- 
2.32.0



[PATCH v2 7/7] iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled

2022-06-15 Thread Suravee Suthikulpanit via iommu
The IOMMUv2 APIs (for supporting shared virtual memory with PASID)
configure the domain with an IOMMU v2 page table and set DTE[Mode]=0.
This configuration cannot be supported on an SNP-enabled system.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index bc008a82c12c..780d6977a331 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3448,7 +3448,12 @@ __setup("ivrs_acpihid",  parse_ivrs_acpihid);
 
 bool amd_iommu_v2_supported(void)
 {
-   return amd_iommu_v2_present;
+   /*
+* Since DTE[Mode]=0 is prohibited on SNP-enabled system
+* (i.e. EFR[SNPSup]=1), IOMMUv2 page table cannot be used without
+* setting up IOMMUv1 page table.
+*/
+   return amd_iommu_v2_present && !amd_iommu_snp_en;
 }
 EXPORT_SYMBOL(amd_iommu_v2_supported);
 
-- 
2.32.0



[PATCH v2 6/7] iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled

2022-06-15 Thread Suravee Suthikulpanit via iommu
Once SNP is enabled (by executing the SNP_INIT command), the IOMMU can no
longer support the passthrough domain (i.e. IOMMU_DOMAIN_IDENTITY).

The SNP_INIT command is issued early in the boot process, and would fail
if the kernel is configured to default to passthrough mode.

After the system has booted, users can try to change the IOMMU domain
type of a particular IOMMU group. In this case, the IOMMU driver needs to
check the SNP-enabled status and fail any request to change the domain
type to identity.

Therefore, return failure when trying to allocate an identity domain.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4f4571d3ff61..d8a6df423b90 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2119,6 +2119,15 @@ static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
 {
struct protection_domain *domain;
 
+   /*
+* Since DTE[Mode]=0 is prohibited on SNP-enabled system,
+* default to use IOMMU_DOMAIN_DMA[_FQ].
+*/
+   if (amd_iommu_snp_en && (type == IOMMU_DOMAIN_IDENTITY)) {
+   pr_warn("Cannot allocate identity domain due to SNP\n");
+   return NULL;
+   }
+
domain = protection_domain_alloc(type);
if (!domain)
return NULL;
-- 
2.32.0



[PATCH v2 3/7] iommu/amd: Introduce an iommu variable for tracking SNP support status

2022-06-15 Thread Suravee Suthikulpanit via iommu
EFR[SNPSup] needs to be checked early in the boot process, since it is
used to determine how the IOMMU driver configures other IOMMU features
and data structures. This check can be done as soon as the IOMMU driver
finishes parsing IVHDs.

Introduce a variable for tracking the SNP support status, which is
initialized before enabling the rest of the IOMMU features.

Also, report the IOMMU SNP support information for each IOMMU.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 5f86e357dbaa..013c55e3c2f2 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -166,6 +166,8 @@ static bool amd_iommu_disabled __initdata;
 static bool amd_iommu_force_enable __initdata;
 static int amd_iommu_target_ivhd_type;
 
+static bool amd_iommu_snp_sup;
+
 LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */
 LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the
   system */
@@ -260,7 +262,6 @@ int amd_iommu_get_num_iommus(void)
return amd_iommus_present;
 }
 
-#ifdef CONFIG_IRQ_REMAP
 /*
  * Iterate through all the IOMMUs to verify if the specified
  * EFR bitmask of IOMMU feature are set.
@@ -285,7 +286,6 @@ static bool check_feature_on_all_iommus(u64 mask)
}
return ret;
 }
-#endif
 
 /*
  * For IVHD type 0x11/0x40, EFR is also available via IVHD.
@@ -368,7 +368,7 @@ static void iommu_set_cwwb_range(struct amd_iommu *iommu)
u64 start = iommu_virt_to_phys((void *)iommu->cmd_sem);
u64 entry = start & PM_ADDR_MASK;
 
-   if (!iommu_feature(iommu, FEATURE_SNP))
+   if (!amd_iommu_snp_sup)
return;
 
/* Note:
@@ -783,7 +783,7 @@ static void *__init iommu_alloc_4k_pages(struct amd_iommu 
*iommu,
void *buf = (void *)__get_free_pages(gfp, order);
 
if (buf &&
-   iommu_feature(iommu, FEATURE_SNP) &&
+   amd_iommu_snp_sup &&
set_memory_4k((unsigned long)buf, (1 << order))) {
free_pages((unsigned long)buf, order);
buf = NULL;
@@ -1882,6 +1882,7 @@ static int __init init_iommu_all(struct acpi_table_header 
*table)
WARN_ON(p != end);
 
/* Phase 2 : Early feature support check */
+   amd_iommu_snp_sup = check_feature_on_all_iommus(FEATURE_SNP);
 
/* Phase 3 : Enabling IOMMU features */
for_each_iommu(iommu) {
@@ -2118,6 +2119,9 @@ static void print_iommu_info(void)
if (iommu->features & FEATURE_GAM_VAPIC)
pr_cont(" GA_vAPIC");
 
+   if (iommu->features & FEATURE_SNP)
+   pr_cont(" SNP");
+
pr_cont("\n");
}
}
-- 
2.32.0



[PATCH v2 5/7] iommu/amd: Set translation valid bit only when IO page tables are in use

2022-06-15 Thread Suravee Suthikulpanit via iommu
On an AMD system with SNP enabled, the IOMMU hardware checks the host
translation valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use. This results in an
ILLEGAL_DEV_TABLE_ENTRY event for devices that do not have the host page
table root pointer set up.

Therefore, when SNP is enabled, only set the TV bit when IOMMU page
translation is in use, i.e. when the domain ID in the AMD IOMMU device
table entry (DTE) is non-zero.
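The rule above can be sketched in self-contained C (illustrative only; the
flag values and the helper below are simplified stand-ins for the kernel's
DTE handling, not the driver's actual bit layout):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DTE_FLAG_V  (1ULL << 0)   /* device table entry valid */
#define DTE_FLAG_TV (1ULL << 1)   /* host translation information valid */

/*
 * Mirror of the patch's rule: with SNP enabled, DTE[TV] may only be
 * set when IO page tables are in use (non-zero domain ID).
 */
static uint64_t make_dte_data0(bool snp_en, uint16_t domain_id)
{
	uint64_t data0 = DTE_FLAG_V;

	if (!snp_en || domain_id != 0)
		data0 |= DTE_FLAG_TV;

	return data0;
}
```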

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c  |  3 ++-
 drivers/iommu/amd/iommu.c | 15 +--
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index b5d3de327a5f..bc008a82c12c 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2544,7 +2544,8 @@ static void init_device_table_dma(struct 
amd_iommu_pci_seg *pci_seg)
 
for (devid = 0; devid <= pci_seg->last_bdf; ++devid) {
__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_VALID);
-   __set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION);
+   if (!amd_iommu_snp_en)
+   __set_dev_entry_bit(dev_table, devid, 
DEV_ENTRY_TRANSLATION);
}
 }
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 0792cd618dba..4f4571d3ff61 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1563,7 +1563,14 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 
devid,
(domain->flags & PD_GIOV_MASK))
pte_root |= DTE_FLAG_GIOV;
 
-   pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+   pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V;
+
+   /*
+    * When SNP is enabled, only set the TV bit when IOMMU
+    * page translation is in use.
+    */
+   if (!amd_iommu_snp_en || (domain->id != 0))
+   pte_root |= DTE_FLAG_TV;
 
flags = dev_table[devid].data[1];
 
@@ -1625,7 +1632,11 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 
devid)
struct dev_table_entry *dev_table = get_dev_table(iommu);
 
/* remove entry from the device table seen by the hardware */
-   dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
+   dev_table[devid].data[0]  = DTE_FLAG_V;
+
+   if (!amd_iommu_snp_en)
+   dev_table[devid].data[0] |= DTE_FLAG_TV;
+
dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
amd_iommu_apply_erratum_63(iommu, devid);
-- 
2.32.0



[PATCH v2 0/7] iommu/amd: Enforce IOMMU restrictions for SNP-enabled system

2022-06-15 Thread Suravee Suthikulpanit via iommu
An SNP-enabled system requires the IOMMU v1 page table to be configured
with a non-zero DTE[Mode] for DMA-capable devices. This affects a number
of use cases such as IOMMU pass-through mode and the AMD IOMMUv2 APIs for
binding/unbinding a PASID.

The series introduces a global variable to check the SNP-enabled state
during driver initialization, and uses it to enforce the SNP restrictions
during runtime.

Also, for non-DMA-capable devices such as the IOAPIC, the recommendation
is to set DTE[TV] and DTE[Mode] to zero on an SNP-enabled system.
Therefore, additional checks are added before setting DTE[TV].

Testing:
  - Tested booting and verified dmesg.
  - Tested booting with iommu=pt
  - Tested loading amd_iommu_v2 driver
  - Tested changing the iommu domain at runtime
  - Tested booting SEV/SNP-enabled guest
  - Tested when CONFIG_AMD_MEM_ENCRYPT is not set

Pre-requisite:
  - [PATCH v3 00/35] iommu/amd: Add multiple PCI segments support

https://lore.kernel.org/linux-iommu/20220511072141.15485-29-vasant.he...@amd.com/T/

Changes from v1:
(https://lore.kernel.org/linux-iommu/20220613012502.109918-1-suravee.suthikulpa...@amd.com/T/#t
 )
  - Remove the newly introduced domain_type_supported() callback.
  - Patch 1: Modify existing check_feature_on_all_iommus() instead of
 introducing another helper function to do a similar check.
  - Patch 3: Modify to use check_feature_on_all_iommus().
  - Patch 4: Add IOMMU init_state check before enabling SNP.
 Also move the function declaration to include/linux/amd-iommu.h 
  - Patch 6: Modify amd_iommu_domain_alloc() to fail when allocating identity
 domain and SNP is enabled.

Best Regards,
Suravee

Brijesh Singh (1):
  iommu/amd: Introduce function to check and enable SNP

Suravee Suthikulpanit (6):
  iommu/amd: Warn when found inconsistency EFR mask
  iommu/amd: Process all IVHDs before enabling IOMMU features
  iommu/amd: Introduce an iommu variable for tracking SNP support status
  iommu/amd: Set translation valid bit only when IO page tables are in
use
  iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled
  iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled

 drivers/iommu/amd/amd_iommu_types.h |   5 ++
 drivers/iommu/amd/init.c| 110 +++-
 drivers/iommu/amd/iommu.c   |  28 ++-
 include/linux/amd-iommu.h   |   6 ++
 4 files changed, 127 insertions(+), 22 deletions(-)

-- 
2.32.0



[PATCH v2 1/7] iommu/amd: Warn when found inconsistency EFR mask

2022-06-15 Thread Suravee Suthikulpanit via iommu
The function check_feature_on_all_iommus() checks whether an IOMMU
feature support bit is set in the Extended Feature Register (EFR).
The current logic iterates through all IOMMUs and returns false as soon
as it finds the first unset bit.

To provide more thorough checking, modify the logic to iterate through
all IOMMUs even after an unset bit is found, and also throw a FW_BUG
warning if an inconsistency is found.
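The revised check can be modeled in plain C as follows (a sketch only; the
efr array and instance index stand in for iommu_feature() and the driver's
IOMMU list):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Simplified model of the revised check_feature_on_all_iommus() logic:
 * walk every instance, warn on the first mismatch with earlier
 * instances, and report support only if all instances agree that the
 * feature bits in 'mask' are set.
 */
static bool check_feature_on_all(const unsigned long long *efr, size_t n,
				 unsigned long long mask)
{
	bool ret = false;
	size_t i;

	for (i = 0; i < n; i++) {
		bool tmp = (efr[i] & mask) == mask;

		if (i > 0 && ret != tmp) {
			fprintf(stderr, "FW_BUG: inconsistent EFR mask on instance %zu\n", i);
			return false;
		}
		ret = tmp;
	}
	return ret;
}
```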

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 3dd0f26039c7..b3e4551ce9dd 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -261,18 +261,29 @@ int amd_iommu_get_num_iommus(void)
 }
 
 #ifdef CONFIG_IRQ_REMAP
+/*
+ * Iterate through all the IOMMUs to verify if the specified
+ * EFR bitmask of IOMMU feature are set.
+ * Warn and return false if found inconsistency.
+ */
 static bool check_feature_on_all_iommus(u64 mask)
 {
bool ret = false;
struct amd_iommu *iommu;
 
for_each_iommu(iommu) {
-   ret = iommu_feature(iommu, mask);
-   if (!ret)
+   bool tmp = iommu_feature(iommu, mask);
+
+   if ((ret != tmp) &&
+   !list_is_first(&iommu->list, &amd_iommu_list)) {
+   pr_err(FW_BUG "Found inconsistent EFR mask (%#llx) on iommu%d (%04x:%02x:%02x.%01x).\n",
+  mask, iommu->index, iommu->pci_seg->id, PCI_BUS_NUM(iommu->devid),
+  PCI_SLOT(iommu->devid), PCI_FUNC(iommu->devid));
return false;
+   }
+   ret = tmp;
}
-
-   return true;
+   return ret;
 }
 #endif
 
-- 
2.32.0



[PATCH 6/7] iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY when SNP is enabled

2022-06-12 Thread Suravee Suthikulpanit via iommu
Since DTE[Mode]=0 is prohibited on a system which enables SNP,
the passthrough domain (IOMMU_DOMAIN_IDENTITY) is not supported.
Instead, only IOMMU_DOMAIN_DMA[_FQ] domains are supported.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index ca4647f04382..ecde9e08102d 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2379,6 +2379,17 @@ static int amd_iommu_def_domain_type(struct device *dev)
return 0;
 }
 
+static bool amd_iommu_domain_type_supported(struct device *dev, int type)
+{
+   /*
+* Since DTE[Mode]=0 is prohibited on SNP-enabled system,
+* default to use IOMMU_DOMAIN_DMA[_FQ].
+*/
+   if (amd_iommu_snp_en && (type == IOMMU_DOMAIN_IDENTITY))
+   return false;
+   return true;
+}
+
 struct iommu_ops amd_iommu_ops = {
.capable = amd_iommu_capable,
.domain_alloc = amd_iommu_domain_alloc,
@@ -2391,6 +2402,7 @@ struct iommu_ops amd_iommu_ops = {
.is_attach_deferred = amd_iommu_is_attach_deferred,
.pgsize_bitmap  = AMD_IOMMU_PGSIZES,
.def_domain_type = amd_iommu_def_domain_type,
+   .domain_type_supported = amd_iommu_domain_type_supported,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = amd_iommu_attach_device,
.detach_dev = amd_iommu_detach_device,
-- 
2.32.0



[PATCH 3/7] iommu/amd: Introduce function to check SEV-SNP support

2022-06-12 Thread Suravee Suthikulpanit via iommu
From: Brijesh Singh 

SEV-SNP support requires that the IOMMU be enabled. It also prohibits
IOMMU configurations where DTE[Mode]=0, which means the SEV-SNP feature
is not supported with the IOMMU passthrough domain (a.k.a.
IOMMU_DOMAIN_IDENTITY), or when the AMD IOMMU driver is configured to
not use the IOMMU host (v1) page table.

Otherwise, the SNP_INIT command (used for initializing firmware) will fail.

Unlike other IOMMU features, the SNP feature does not have an enable bit
in the IOMMU control register. Instead, the feature is considered enabled
when the SNP_INIT command is executed, which is done by a separate driver.

Introduce iommu_sev_snp_supported() for checking if IOMMU supports
the SEV-SNP feature, and an amd_iommu_snp_en global variable to keep track
of SNP enable status.

Please see the IOMMU spec section 2.12 for further details.
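The checks described above can be sketched as a small userspace model
(illustrative only; the booleans and the enum below are stand-ins for
no_iommu, iommu_default_passthrough(), EFR[SNPSup] detection, and
amd_iommu_pgtable):

```c
#include <assert.h>
#include <stdbool.h>

enum pgtable { AMD_IOMMU_V1, AMD_IOMMU_V2 };

/*
 * Model of iommu_sev_snp_supported(): SNP can only be enabled when the
 * IOMMU is on, not in passthrough mode, and the hardware reports
 * EFR[SNPSup]; enabling it forces the v1 page table.
 */
static bool snp_supported(bool no_iommu, bool passthrough, bool snp_sup,
			  enum pgtable *pgtable)
{
	if (no_iommu || passthrough)
		return false;	/* SNP_INIT would fail in these configs */
	if (!snp_sup)
		return false;	/* not all IOMMUs report EFR[SNPSup] */

	/* Enforce the IOMMU v1 page table when SNP is enabled. */
	*pgtable = AMD_IOMMU_V1;
	return true;
}
```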

Tested-by: Ashish Kalra 
Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Brijesh Singh 
---
 drivers/iommu/amd/amd_iommu_types.h | 11 
 drivers/iommu/amd/init.c| 39 ++---
 drivers/iommu/amd/iommu.c   |  4 +--
 include/linux/iommu.h   |  9 +++
 4 files changed, 52 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 328572cf6fa5..6552c0da8f32 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -450,6 +450,9 @@ extern bool amd_iommu_irq_remap;
 /* kmem_cache to get tables with 128 byte alignement */
 extern struct kmem_cache *amd_iommu_irq_cache;
 
+/* SNP is enabled on the system? */
+extern bool amd_iommu_snp_en;
+
#define PCI_SBDF_TO_SEGID(sbdf)	(((sbdf) >> 16) & 0xffff)
#define PCI_SBDF_TO_DEVID(sbdf)	((sbdf) & 0xffff)
#define PCI_SEG_DEVID_TO_SBDF(seg, devid)	((((u32)(seg) & 0xffff) << 16) | \
						 ((devid) & 0xffff))
@@ -999,4 +1002,12 @@ extern struct amd_irte_ops irte_32_ops;
 extern struct amd_irte_ops irte_128_ops;
 #endif
 
+/*
+ * ACPI table definitions
+ *
+ * These data structures are laid over the table to parse the important values
+ * out of it.
+ */
+extern struct iommu_ops amd_iommu_ops;
+
 #endif /* _ASM_X86_AMD_IOMMU_TYPES_H */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 3965bd3f4f67..da32e7bdd1fa 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -88,15 +88,6 @@
 #define IVRS_GET_SBDF_ID(seg, bus, dev, fn)	(((seg & 0xffff) << 16) | ((bus & 0xff) << 8) \
						 | ((dev & 0x1f) << 3) | (fn & 0x7))
 
-/*
- * ACPI table definitions
- *
- * These data structures are laid over the table to parse the important values
- * out of it.
- */
-
-extern const struct iommu_ops amd_iommu_ops;
-
 /*
  * structure describing one IOMMU in the ACPI table. Typically followed by one
  * or more ivhd_entrys.
@@ -166,6 +157,9 @@ static int amd_iommu_target_ivhd_type;
 
 static bool amd_iommu_snp_sup;
 
+bool amd_iommu_snp_en;
+EXPORT_SYMBOL(amd_iommu_snp_en);
+
 LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */
 LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the
   system */
@@ -3543,3 +3537,30 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 
bank, u8 cntr, u8 fxn, u64
 
return iommu_pc_get_set_reg(iommu, bank, cntr, fxn, value, true);
 }
+
+bool iommu_sev_snp_supported(void)
+{
+   /*
+    * SEV-SNP support requires that the IOMMU be enabled, and not
+    * configured in passthrough mode.
+    */
+   if (no_iommu || iommu_default_passthrough()) {
+   pr_err("SEV-SNP: IOMMU is either disabled or configured in passthrough mode.\n");
+   return false;
+   }
+
+   amd_iommu_snp_en = amd_iommu_snp_sup;
+   if (amd_iommu_snp_en)
+   pr_info("SNP enabled\n");
+
+   /* Enforce IOMMU v1 pagetable when SNP is enabled. */
+   if ((amd_iommu_pgtable != AMD_IOMMU_V1) &&
+amd_iommu_snp_en) {
+   pr_info("Force to using AMD IOMMU v1 page table due to SNP\n");
+   amd_iommu_pgtable = AMD_IOMMU_V1;
+   amd_iommu_ops.pgsize_bitmap = AMD_IOMMU_PGSIZES;
+   }
+
+   return amd_iommu_snp_en;
+}
+EXPORT_SYMBOL_GPL(iommu_sev_snp_supported);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3e1f0fa42ec3..b9dc0d4b6d77 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -70,7 +70,7 @@ LIST_HEAD(acpihid_map);
  * Domain for untranslated devices - only allocated
  * if iommu=pt passed on kernel cmd line.
  */
-const struct iommu_ops amd_iommu_ops;
+struct iommu_ops amd_iommu_ops;
 
 static ATOMIC_NOTIFIER_HEAD(ppr_notifier);
 int amd_iommu_max_glx_val = -1;
@@ -2368,7 +2368,7 @@ static int amd_iommu_

[PATCH 4/7] iommu/amd: Set translation valid bit only when IO page tables are in use

2022-06-12 Thread Suravee Suthikulpanit via iommu
On an AMD system with SNP enabled, the IOMMU hardware checks the host
translation valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use. This results in an
ILLEGAL_DEV_TABLE_ENTRY event for devices that do not have the host page
table root pointer set up.

Therefore, when SNP is enabled, only set the TV bit when IOMMU page
translation is in use, i.e. when the domain ID in the AMD IOMMU device
table entry (DTE) is non-zero.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c  |  3 ++-
 drivers/iommu/amd/iommu.c | 15 +--
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index da32e7bdd1fa..a9152d3f33bf 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2546,7 +2546,8 @@ static void init_device_table_dma(struct 
amd_iommu_pci_seg *pci_seg)
 
for (devid = 0; devid <= pci_seg->last_bdf; ++devid) {
__set_dev_entry_bit(dev_table, devid, DEV_ENTRY_VALID);
-   __set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION);
+   if (!amd_iommu_snp_en)
+   __set_dev_entry_bit(dev_table, devid, 
DEV_ENTRY_TRANSLATION);
}
 }
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b9dc0d4b6d77..ca4647f04382 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1552,7 +1552,14 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 
devid,
 
pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
<< DEV_ENTRY_MODE_SHIFT;
-   pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+   pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V;
+
+   /*
+    * When SNP is enabled, only set the TV bit when IOMMU
+    * page translation is in use.
+    */
+   if (!amd_iommu_snp_en || (domain->id != 0))
+   pte_root |= DTE_FLAG_TV;
 
flags = dev_table[devid].data[1];
 
@@ -1612,7 +1619,11 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 
devid)
struct dev_table_entry *dev_table = get_dev_table(iommu);
 
/* remove entry from the device table seen by the hardware */
-   dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
+   dev_table[devid].data[0]  = DTE_FLAG_V;
+
+   if (!amd_iommu_snp_en)
+   dev_table[devid].data[0] |= DTE_FLAG_TV;
+
dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
amd_iommu_apply_erratum_63(iommu, devid);
-- 
2.32.0



[PATCH 7/7] iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled

2022-06-12 Thread Suravee Suthikulpanit via iommu
The IOMMUv2 APIs (for supporting shared virtual memory with PASID)
configures the domain with IOMMU v2 page table, and sets DTE[Mode]=0.
This configuration cannot be supported on SNP-enabled system.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index a9152d3f33bf..1565f0fb955a 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3435,7 +3435,12 @@ __setup("ivrs_acpihid",  parse_ivrs_acpihid);
 
 bool amd_iommu_v2_supported(void)
 {
-   return amd_iommu_v2_present;
+   /*
+* Since DTE[Mode]=0 is prohibited on SNP-enabled system
+* (i.e. EFR[SNPSup]=1), IOMMUv2 page table cannot be used without
+* setting up IOMMUv1 page table.
+*/
+   return amd_iommu_v2_present && !amd_iommu_snp_en;
 }
 EXPORT_SYMBOL(amd_iommu_v2_supported);
 
-- 
2.32.0



[PATCH 1/7] iommu/amd: Process all IVHDs before enabling IOMMU features

2022-06-12 Thread Suravee Suthikulpanit via iommu
The ACPI IVRS table can contain multiple IVHD blocks. Each block contains
information used to initialize each IOMMU instance.

Currently, init_iommu_all() sequentially processes each IVHD block and
initializes the IOMMU instances one by one. However, certain features
require all IOMMUs to be configured in the same way system-wide. In case
certain IVHD blocks contain inconsistent information (most likely due to
firmware bugs), the driver needs to go through and try to revert settings
on IOMMUs that have already been configured.

A solution is to split IOMMU initialization into two phases:

Phase 1 processes the information in the IVRS table for all IOMMU
instances. This allows all IVHDs to be processed prior to enabling features.

Phase 2 iterates through all IOMMU instances and enables the features on
each of them.
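The two-phase flow can be sketched with a toy model (illustrative only;
struct toy_iommu is a hypothetical stand-in for struct amd_iommu, and the
two loops correspond to init_iommu_one() and init_iommu_one_late()):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of the split initialization: phase 1 parses per-instance
 * data for every IOMMU before phase 2 enables anything, so that
 * system-wide inconsistencies can be detected before any instance has
 * been configured.
 */
struct toy_iommu {
	int parsed;
	int enabled;
};

static int init_all(struct toy_iommu *iommus, size_t n)
{
	size_t i;

	/* Phase 1: parse ACPI/IVHD information for every instance. */
	for (i = 0; i < n; i++)
		iommus[i].parsed = 1;

	/* Phase 2: enable features only after all parsing succeeded. */
	for (i = 0; i < n; i++) {
		if (!iommus[i].parsed)
			return -1;
		iommus[i].enabled = 1;
	}
	return 0;
}
```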

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 8877d2a20398..6a4a019f1e1d 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1687,7 +1687,6 @@ static int __init init_iommu_one(struct amd_iommu *iommu, 
struct ivhd_header *h,
 struct acpi_table_header *ivrs_base)
 {
struct amd_iommu_pci_seg *pci_seg;
-   int ret;
 
pci_seg = get_pci_segment(h->pci_seg, ivrs_base);
if (pci_seg == NULL)
@@ -1768,6 +1767,13 @@ static int __init init_iommu_one(struct amd_iommu 
*iommu, struct ivhd_header *h,
if (!iommu->mmio_base)
return -ENOMEM;
 
+   return init_iommu_from_acpi(iommu, h);
+}
+
+static int __init init_iommu_one_late(struct amd_iommu *iommu)
+{
+   int ret;
+
if (alloc_cwwb_sem(iommu))
return -ENOMEM;
 
@@ -1789,10 +1795,6 @@ static int __init init_iommu_one(struct amd_iommu 
*iommu, struct ivhd_header *h,
if (amd_iommu_pre_enabled)
amd_iommu_pre_enabled = translation_pre_enabled(iommu);
 
-   ret = init_iommu_from_acpi(iommu, h);
-   if (ret)
-   return ret;
-
if (amd_iommu_irq_remap) {
ret = amd_iommu_create_irq_domain(iommu);
if (ret)
@@ -1803,7 +1805,7 @@ static int __init init_iommu_one(struct amd_iommu *iommu, 
struct ivhd_header *h,
 * Make sure IOMMU is not considered to translate itself. The IVRS
 * table tells us so, but this is a lie!
 */
-   pci_seg->rlookup_table[iommu->devid] = NULL;
+   iommu->pci_seg->rlookup_table[iommu->devid] = NULL;
 
return 0;
 }
@@ -1873,6 +1875,12 @@ static int __init init_iommu_all(struct 
acpi_table_header *table)
}
WARN_ON(p != end);
 
+   for_each_iommu(iommu) {
+   ret = init_iommu_one_late(iommu);
+   if (ret)
+   return ret;
+   }
+
return 0;
 }
 
-- 
2.32.0



[PATCH 2/7] iommu/amd: Introduce a global variable for tracking SNP enable status

2022-06-12 Thread Suravee Suthikulpanit via iommu
IOMMU support for the SNP feature is detected via the EFR[SNPSup] bit.
It is also required that EFR[SNPSup] be consistent across all IOMMU
instances.

This information is needed early in the boot process, since it is used
to determine how the IOMMU driver configures several other IOMMU features
and data structures (i.e. as soon as the IOMMU driver finishes parsing
the IVHDs).

Introduce a global variable for tracking the SNP support status, which is
initialized before enabling the rest of the IOMMU features.

Also throw a warning if inconsistent EFR[SNPSup] values are found among
IOMMU instances.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 42 ++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 6a4a019f1e1d..3965bd3f4f67 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -164,6 +164,8 @@ static bool amd_iommu_disabled __initdata;
 static bool amd_iommu_force_enable __initdata;
 static int amd_iommu_target_ivhd_type;
 
+static bool amd_iommu_snp_sup;
+
 LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */
 LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the
   system */
@@ -355,7 +357,7 @@ static void iommu_set_cwwb_range(struct amd_iommu *iommu)
u64 start = iommu_virt_to_phys((void *)iommu->cmd_sem);
u64 entry = start & PM_ADDR_MASK;
 
-   if (!iommu_feature(iommu, FEATURE_SNP))
+   if (!amd_iommu_snp_sup)
return;
 
/* Note:
@@ -770,7 +772,7 @@ static void *__init iommu_alloc_4k_pages(struct amd_iommu 
*iommu,
void *buf = (void *)__get_free_pages(gfp, order);
 
if (buf &&
-   iommu_feature(iommu, FEATURE_SNP) &&
+   amd_iommu_snp_sup &&
set_memory_4k((unsigned long)buf, (1 << order))) {
free_pages((unsigned long)buf, order);
buf = NULL;
@@ -1836,6 +1838,37 @@ static u8 get_highest_supported_ivhd_type(struct 
acpi_table_header *ivrs)
return last_type;
 }
 
+/*
+ * SNP is enabled system-wide. So, iterate through all the IOMMUs to
+ * verify all EFR[SNPSup] bits are set, and use global variable to track
+ * whether the feature is supported.
+ */
+static void __init init_snp_global(void)
+{
+   struct amd_iommu *iommu;
+
+   for_each_iommu(iommu) {
+   if (iommu_feature(iommu, FEATURE_SNP)) {
+   amd_iommu_snp_sup = true;
+   continue;
+   }
+
+   /*
+* Warn and mark SNP as not supported if there is inconsistency
+* in any of the IOMMU.
+*/
+   if (amd_iommu_snp_sup && !list_is_first(&iommu->list, &amd_iommu_list)) {
+   pr_err(FW_BUG "iommu%d (%04x:%02x:%02x.%01x): Found inconsistent EFR[SNPSup].\n",
+  iommu->index, iommu->pci_seg->id, PCI_BUS_NUM(iommu->devid),
+  PCI_SLOT(iommu->devid), PCI_FUNC(iommu->devid));
+   pr_err(FW_BUG "Disable SNP support\n");
+   amd_iommu_snp_sup = false;
+   }
+   return;
+   }
+   amd_iommu_snp_sup = true;
+}
+
 /*
  * Iterates over all IOMMU entries in the ACPI table, allocates the
  * IOMMU structure and initializes it with init_iommu_one()
@@ -1875,6 +1908,8 @@ static int __init init_iommu_all(struct acpi_table_header 
*table)
}
WARN_ON(p != end);
 
+   init_snp_global();
+
for_each_iommu(iommu) {
ret = init_iommu_one_late(iommu);
if (ret)
@@ -2095,6 +2130,9 @@ static void print_iommu_info(void)
if (iommu->features & FEATURE_GAM_VAPIC)
pr_cont(" GA_vAPIC");
 
+   if (iommu->features & FEATURE_SNP)
+   pr_cont(" SNP");
+
pr_cont("\n");
}
}
-- 
2.32.0



[PATCH 5/7] iommu: Add domain_type_supported() callback in iommu_ops

2022-06-12 Thread Suravee Suthikulpanit via iommu
When a user requests to change an IOMMU domain to a new type, the IOMMU
generic layer checks the requested type against the default domain type
returned by the vendor-specific IOMMU driver.

However, there is only one default domain type, and the current mechanism
rejects the request if the requested type does not match the default type.

Introduce a domain_type_supported() callback in iommu_ops, which allows
the IOMMU generic layer to check with the vendor-specific IOMMU driver
whether the requested type is supported. This allows the user to request
types other than the default type.
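The optional-callback pattern can be modeled in a few lines of C (a sketch
only; the toy ops struct and the domain-type constants are hypothetical
stand-ins for iommu_ops and the IOMMU_DOMAIN_* values):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { DOMAIN_IDENTITY = 1, DOMAIN_DMA = 2 };

/* Optional callback in the ops table; NULL means "no restriction". */
struct toy_iommu_ops {
	bool (*domain_type_supported)(int type);
};

/* Generic-layer helper: defer to the driver callback if present,
 * otherwise default to "supported" to preserve existing behavior. */
static bool domain_type_supported(const struct toy_iommu_ops *ops, int type)
{
	if (ops->domain_type_supported)
		return ops->domain_type_supported(type);
	return true;
}

/* An AMD-like driver that rejects identity domains (as when SNP is on). */
static bool amd_like_check(int type)
{
	return type != DOMAIN_IDENTITY;
}
```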

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/iommu.c | 13 -
 include/linux/iommu.h |  2 ++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f2c45b85b9fc..4afb956ce083 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1521,6 +1521,16 @@ struct iommu_group *fsl_mc_device_group(struct device 
*dev)
 }
 EXPORT_SYMBOL_GPL(fsl_mc_device_group);
 
+static bool iommu_domain_type_supported(struct device *dev, int type)
+{
+   const struct iommu_ops *ops = dev_iommu_ops(dev);
+
+   if (ops->domain_type_supported)
+   return ops->domain_type_supported(dev, type);
+
+   return true;
+}
+
 static int iommu_get_def_domain_type(struct device *dev)
 {
const struct iommu_ops *ops = dev_iommu_ops(dev);
@@ -2937,7 +2947,8 @@ static int iommu_change_dev_def_domain(struct iommu_group 
*group,
 * domain the device was booted with
 */
type = dev_def_dom ? : iommu_def_domain_type;
-   } else if (dev_def_dom && type != dev_def_dom) {
+   } else if (!iommu_domain_type_supported(dev, type) ||
+  (dev_def_dom && type != dev_def_dom)) {
dev_err_ratelimited(prev_dev, "Device cannot be in %s domain\n",
iommu_domain_type_str(type));
ret = -EINVAL;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fecb72e1b11b..40c47ab15005 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -214,6 +214,7 @@ struct iommu_iotlb_gather {
  * - IOMMU_DOMAIN_IDENTITY: must use an identity domain
  * - IOMMU_DOMAIN_DMA: must use a dma domain
  * - 0: use the default setting
+ * @domain_type_supported: check if the specified domain type is supported
  * @default_domain_ops: the default ops for domains
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  * @owner: Driver module providing these ops
@@ -252,6 +253,7 @@ struct iommu_ops {
 struct iommu_page_response *msg);
 
int (*def_domain_type)(struct device *dev);
+   bool (*domain_type_supported)(struct device *dev, int type);
 
const struct iommu_domain_ops *default_domain_ops;
unsigned long pgsize_bitmap;
-- 
2.32.0



[PATCH 0/7] iommu/amd: Enforce IOMMU restrictions for SNP-enabled system

2022-06-12 Thread Suravee Suthikulpanit via iommu
An SNP-enabled system requires the IOMMU v1 page table to be configured
with a non-zero DTE[Mode] for DMA-capable devices. This affects a number
of use cases such as IOMMU pass-through mode and the AMD IOMMUv2 APIs for
binding/unbinding a PASID.

The series introduces a global variable to check the SNP-enabled state
during driver initialization, and uses it to enforce the SNP restrictions
during runtime.

Also, for non-DMA-capable devices such as the IOAPIC, the recommendation
is to set DTE[TV] and DTE[Mode] to zero on an SNP-enabled system.
Therefore, additional checks are added before setting DTE[TV].

Testing:
  - Tested booting and verified dmesg.
  - Tested booting with iommu=pt
  - Tested loading amd_iommu_v2 driver
  - Tested changing the iommu domain at runtime
  - Tested booting SEV/SNP-enabled guest

Pre-requisite:
  - [PATCH v3 00/35] iommu/amd: Add multiple PCI segments support

https://lore.kernel.org/linux-iommu/20220511072141.15485-29-vasant.he...@amd.com/T/

Note:
  - Previously discussed on here:
[PATCH v2] iommu/amd: Set translation valid bit only when IO page tables 
are in used
https://www.spinics.net/lists/kernel/msg4351005.html

Best Regards,
Suravee

Brijesh Singh (1):
  iommu/amd: Introduce function to check SEV-SNP support

Suravee Suthikulpanit (6):
  iommu/amd: Process all IVHDs before enabling IOMMU features
  iommu/amd: Introduce a global variable for tracking SNP enable status
  iommu/amd: Set translation valid bit only when IO page tables are in
use
  iommu: Add domain_type_supported() callback in iommu_ops
  iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY when SNP is enabled
  iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled

 drivers/iommu/amd/amd_iommu_types.h |  11 +++
 drivers/iommu/amd/init.c| 111 +++-
 drivers/iommu/amd/iommu.c   |  31 +++-
 drivers/iommu/iommu.c   |  13 +++-
 include/linux/iommu.h   |  11 +++
 5 files changed, 153 insertions(+), 24 deletions(-)

-- 
2.32.0



Re: [PATCH RFC 10/19] iommu/amd: Add unmap_read_dirty() support

2022-05-31 Thread Suravee Suthikulpanit via iommu




On 4/29/22 4:09 AM, Joao Martins wrote:

The AMD implementation of unmap_read_dirty() is pretty simple, as it
mostly reuses the unmap code with the extra addition of marshalling the
dirty bit into the bitmap as it walks the to-be-unmapped IOPTE.

Extra care is taken, though, to switch over to cmpxchg as opposed to a
non-serialized store to the PTE, and to trust the dirty-bit reading only
once cmpxchg succeeds in setting the PTE to 0.
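The cmpxchg pattern described here can be modeled in userspace with C11
atomics (a sketch only; PTE_DIRTY's bit position is a hypothetical stand-in
for the real IOPTE dirty bit, and the compare-exchange loop corresponds to
the kernel's cmpxchg64 retry):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_DIRTY (1ULL << 6)	/* hypothetical dirty-bit position */

/*
 * Atomically clear a PTE and report whether it was dirty. The loop
 * retries until the value we observed is the value we replaced, so the
 * dirty bit is read from the PTE contents that were actually cleared.
 */
static bool clear_pte_read_dirty(_Atomic uint64_t *pte)
{
	uint64_t old = atomic_load(pte);

	/* On failure, 'old' is refreshed with the current PTE value. */
	while (!atomic_compare_exchange_weak(pte, &old, 0))
		;

	return (old & PTE_DIRTY) != 0;
}
```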

Signed-off-by: Joao Martins 
---
  drivers/iommu/amd/io_pgtable.c | 44 +-
  drivers/iommu/amd/iommu.c  | 22 +
  2 files changed, 60 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 8325ef193093..1868c3b58e6d 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -355,6 +355,16 @@ static void free_clear_pte(u64 *pte, u64 pteval, struct 
list_head *freelist)
free_sub_pt(pt, mode, freelist);
  }
  
+static bool free_pte_dirty(u64 *pte, u64 pteval)


Nitpick: Since we free and clear the dirty bit, should we change
the function name to free_clear_pte_dirty()?


+{
+   bool dirty = false;
+
+   while (IOMMU_PTE_DIRTY(cmpxchg64(pte, pteval, 0)))


We should use 0ULL instead of 0.


+   dirty = true;
+
+   return dirty;
+}
+


Actually, what do you think about enhancing the current free_clear_pte()
to also handle the dirty check as well?


  /*
   * Generic mapping functions. It maps a physical address into a DMA
   * address space. It allocates the page table pages if necessary.
@@ -428,10 +438,11 @@ static int iommu_v1_map_page(struct io_pgtable_ops *ops, 
unsigned long iova,
return ret;
  }
  
-static unsigned long iommu_v1_unmap_page(struct io_pgtable_ops *ops,

- unsigned long iova,
- size_t size,
- struct iommu_iotlb_gather *gather)
+static unsigned long __iommu_v1_unmap_page(struct io_pgtable_ops *ops,
+  unsigned long iova,
+  size_t size,
+  struct iommu_iotlb_gather *gather,
+  struct iommu_dirty_bitmap *dirty)
  {
struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
unsigned long long unmapped;
@@ -445,11 +456,15 @@ static unsigned long iommu_v1_unmap_page(struct 
io_pgtable_ops *ops,
while (unmapped < size) {
		pte = fetch_pte(pgtable, iova, &unmap_size);
if (pte) {
-   int i, count;
+   unsigned long i, count;
+   bool pte_dirty = false;
  
  			count = PAGE_SIZE_PTE_COUNT(unmap_size);

for (i = 0; i < count; i++)
-   pte[i] = 0ULL;
+   pte_dirty |= free_pte_dirty(&pte[i], pte[i]);
+


Actually, what if we change the existing free_clear_pte() to 
free_and_clear_dirty_pte(),
and incorporate the logic for


...
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 0a86392b2367..a8fcb6e9a684 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2144,6 +2144,27 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, 
unsigned long iova,
return r;
  }
  
+static size_t amd_iommu_unmap_read_dirty(struct iommu_domain *dom,

+unsigned long iova, size_t page_size,
+struct iommu_iotlb_gather *gather,
+struct iommu_dirty_bitmap *dirty)
+{
+   struct protection_domain *domain = to_pdomain(dom);
+   struct io_pgtable_ops *ops = &domain->iop.iop.ops;
+   size_t r;
+
+   if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+   (domain->iop.mode == PAGE_MODE_NONE))
+   return 0;
+
+   r = (ops->unmap_read_dirty) ?
+   ops->unmap_read_dirty(ops, iova, page_size, gather, dirty) : 0;
+
+   amd_iommu_iotlb_gather_add_page(dom, gather, iova, page_size);
+
+   return r;
+}
+


Instead of creating a new function, what if we enhance the current
amd_iommu_unmap() to also handle the read-dirty part (e.g. in a common
__amd_iommu_unmap_read_dirty()), so that both amd_iommu_unmap() and
amd_iommu_unmap_read_dirty() can call it?
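As a toy model of that refactor, both entry points reduce to thin wrappers around one shared worker. All names and the flat PTE-array layout below are illustrative, not the driver's real data structures:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_DIRTY_BIT (1ULL << 6)  /* assumed dirty-bit position, illustrative */

/* Shared worker: clears n PTEs and, when a bitmap is supplied, records
 * which of them were dirty before being cleared. */
static size_t __unmap_read_dirty(uint64_t *ptes, size_t n, uint64_t *dirty)
{
	size_t unmapped = 0;

	for (size_t i = 0; i < n; i++) {
		if (dirty && (ptes[i] & PTE_DIRTY_BIT))
			*dirty |= 1ULL << i;   /* marshal dirty bit into the bitmap */
		ptes[i] = 0;
		unmapped++;
	}
	return unmapped;
}

/* Plain unmap: same walk, no dirty bitmap. */
static size_t unmap(uint64_t *ptes, size_t n)
{
	return __unmap_read_dirty(ptes, n, NULL);
}

static size_t unmap_read_dirty(uint64_t *ptes, size_t n, uint64_t *dirty)
{
	return __unmap_read_dirty(ptes, n, dirty);
}
```

The NULL-bitmap convention keeps the fast path of the plain unmap unchanged while avoiding two nearly identical walk loops.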

Best Regards,
Suravee


Re: [PATCH RFC 09/19] iommu/amd: Access/Dirty bit support in IOPTEs

2022-05-31 Thread Suravee Suthikulpanit via iommu

Joao,

On 4/29/22 4:09 AM, Joao Martins wrote:

.
+static int amd_iommu_set_dirty_tracking(struct iommu_domain *domain,
+   bool enable)
+{
+   struct protection_domain *pdomain = to_pdomain(domain);
+   struct iommu_dev_data *dev_data;
+   bool dom_flush = false;
+
+   if (!amd_iommu_had_support)
+   return -EOPNOTSUPP;
+
+   list_for_each_entry(dev_data, &pdomain->dev_list, list) {


Since we iterate through the device list for the domain, we would need to
call spin_lock_irqsave(&pdomain->lock, flags) here.


+   struct amd_iommu *iommu;
+   u64 pte_root;
+
+   iommu = amd_iommu_rlookup_table[dev_data->devid];
+   pte_root = amd_iommu_dev_table[dev_data->devid].data[0];
+
+   /* No change? */
+   if (!(enable ^ !!(pte_root & DTE_FLAG_HAD)))
+   continue;
+
+   pte_root = (enable ?
+   pte_root | DTE_FLAG_HAD : pte_root & ~DTE_FLAG_HAD);
+
+   /* Flush device DTE */
+   amd_iommu_dev_table[dev_data->devid].data[0] = pte_root;
+   device_flush_dte(dev_data);
+   dom_flush = true;
+   }
+
+   /* Flush IOTLB to mark IOPTE dirty on the next translation(s) */
+   if (dom_flush) {
+   unsigned long flags;
+
+   spin_lock_irqsave(&pdomain->lock, flags);
+   amd_iommu_domain_flush_tlb_pde(pdomain);
+   amd_iommu_domain_flush_complete(pdomain);
+   spin_unlock_irqrestore(&pdomain->lock, flags);
+   }


And call spin_unlock_irqrestore(&pdomain->lock, flags); here.

+
+   return 0;
+}
+
+static bool amd_iommu_get_dirty_tracking(struct iommu_domain *domain)
+{
+   struct protection_domain *pdomain = to_pdomain(domain);
+   struct iommu_dev_data *dev_data;
+   u64 dte;
+


Also call spin_lock_irqsave(&pdomain->lock, flags) here


+   list_for_each_entry(dev_data, >dev_list, list) {
+   dte = amd_iommu_dev_table[dev_data->devid].data[0];
+   if (!(dte & DTE_FLAG_HAD))
+   return false;
+   }
+


And call spin_unlock_irqrestore(&pdomain->lock, flags) here


+   return true;
+}
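In user-space terms, the locking being requested above for amd_iommu_get_dirty_tracking() amounts to the pattern below. An atomic_flag spinlock stands in for pdomain->lock and a fixed-size DTE array stands in for the dev_list; all names and the DTE_FLAG_HAD bit position are illustrative:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DTE_FLAG_HAD (1ULL << 7)  /* assumed bit position, illustrative */

struct pdomain {
	atomic_flag lock;   /* stands in for the pdomain->lock spinlock */
	uint64_t dte[4];    /* stands in for the per-device DTEs on dev_list */
	int ndev;
};

static void pd_lock(struct pdomain *d)
{
	while (atomic_flag_test_and_set(&d->lock))
		;  /* spin until the lock is ours */
}

static void pd_unlock(struct pdomain *d)
{
	atomic_flag_clear(&d->lock);
}

/* Walk the device entries with the domain lock held, so a concurrent
 * attach/detach cannot mutate the list mid-walk; note the early exit
 * still releases the lock before returning. */
static bool get_dirty_tracking(struct pdomain *d)
{
	bool ret = true;

	pd_lock(d);
	for (int i = 0; i < d->ndev; i++) {
		if (!(d->dte[i] & DTE_FLAG_HAD)) {
			ret = false;
			break;
		}
	}
	pd_unlock(d);

	return ret;
}
```

Funneling every exit path through a single unlock at the bottom is what makes the early break safe, which is the shape the review is asking for.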
+
+static int amd_iommu_read_and_clear_dirty(struct iommu_domain *domain,
+ unsigned long iova, size_t size,
+ struct iommu_dirty_bitmap *dirty)
+{
+   struct protection_domain *pdomain = to_pdomain(domain);
+   struct io_pgtable_ops *ops = &pdomain->iop.iop.ops;
+
+   if (!amd_iommu_get_dirty_tracking(domain))
+   return -EOPNOTSUPP;
+
+   if (!ops || !ops->read_and_clear_dirty)
+   return -ENODEV;


We should move this check before the amd_iommu_get_dirty_tracking() call.

Best Regards,
Suravee


+
+   return ops->read_and_clear_dirty(ops, iova, size, dirty);
+}
+
+
  static void amd_iommu_get_resv_regions(struct device *dev,
   struct list_head *head)
  {
@@ -2293,6 +2368,8 @@ const struct iommu_ops amd_iommu_ops = {
.flush_iotlb_all = amd_iommu_flush_iotlb_all,
.iotlb_sync = amd_iommu_iotlb_sync,
.free   = amd_iommu_domain_free,
+   .set_dirty_tracking = amd_iommu_set_dirty_tracking,
+   .read_and_clear_dirty = amd_iommu_read_and_clear_dirty,
}
  };
  



Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in use

2022-05-25 Thread Suravee Suthikulpanit via iommu

Joerg,

On 5/20/22 3:09 PM, Joerg Roedel wrote:

Hi Suravee,

On Mon, May 16, 2022 at 07:27:51PM +0700, Suravee Suthikulpanit wrote:


- Also, it seems that the current iommu v2 page table use case, where 
GVA->GPA=SPA
will no longer be supported on system w/ SNPSup=1. Any thoughts?


Support for that is not upstream yet, it should be easy to disallow this
configuration and just use the v1 page-tables when SNP is active. This
can be handled entirely inside the AMD IOMMU driver.



Actually, I am referring to when the user uses the IOMMU v2 table for shared
virtual addressing
in the current iommu_v2 driver (e.g. amd_iommu_init_device(), amd_iommu_bind_pasid()).

Best Regards,
Suravee


Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in use

2022-05-16 Thread Suravee Suthikulpanit via iommu

Joerg,

On 5/13/22 8:07 PM, Joerg Roedel wrote:

On Mon, May 09, 2022 at 02:48:15AM -0500, Suravee Suthikulpanit wrote:

On an AMD system with SNP enabled, the IOMMU hardware checks the host translation
valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use.
This results in an ILLEGAL_DEV_TABLE_ENTRY event for devices which
do not have the host page table root pointer set up.


Hmm, this sound weird. In the early AMD IOMMUs it was recommended to set
TV=1 and V=1 and the rest to 0 to block all DMA from a device.

I wonder how this triggers ILLEGAL_DEV_TABLE_ENTRY errors now. It is
(was?) legal to set V=1 TV=1, mode=0 and leave the page-table empty.


Due to the new restriction (please see the IOMMU spec Rev 3.06-PUB - Apr 2021
https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf) where the use of
DTE[Mode]=0 is not supported on systems that are SNP-enabled (i.e.
EFR[SNPSup]=1),
the IOMMU HW looks at the DTE[TV] bit to determine if it needs to handle the v1
page table.
When the HW encounters a DTE with TV=1, V=1, Mode=0, it generates an
ILLEGAL_DEV_TABLE_ENTRY event.

Note: I am following up with HW folks for the updated document for this
specific detail.

Therefore, we need to modify IOMMU driver as following:

- For non-DMA devices (e.g. the IOAPIC devices), we need to
modify IOMMU driver to default to DTE[TV]=0. For Linux, this is equivalent
to DTE with domain ID 0.

- I am still trying to see what is the best way to force Linux to not allow
Mode=0 (i.e. iommu=pt mode). Any thoughts?

- Also, it seems that the current iommu v2 page table use case, where 
GVA->GPA=SPA
will no longer be supported on system w/ SNPSup=1. Any thoughts?


When IW=0 and IR=0, DMA is blocked. From what I remember this is a
valid setting in a DTE.


Correct.


Do you have an example DTE which triggers this error message?


This is specifically from the device representing an IOAPIC.

[  +0.000108] iommu ivhd0: AMD-Vi: Event logged [ILLEGAL_DEV_TABLE_ENTRY device=c0:00.1 pasid=0x0 
address=0xfffdf814 flags=0x0008]

[  +0.11] AMD-Vi: DTE[0]: 0003
[  +0.03] AMD-Vi: DTE[1]: 
[  +0.02] AMD-Vi: DTE[2]: 2008000100258013
[  +0.01] AMD-Vi: DTE[3]: 

Best Regards,
Suravee


[PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in use

2022-05-09 Thread Suravee Suthikulpanit via iommu
On an AMD system with SNP enabled, the IOMMU hardware checks the host translation
valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use.
This results in an ILLEGAL_DEV_TABLE_ENTRY event for devices which
do not have the host page table root pointer set up.

Therefore, only set the TV bit when DMA remapping is in use, i.e. when
the domain ID in the AMD IOMMU device table entry (DTE) is non-zero.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c  | 4 +---
 drivers/iommu/amd/iommu.c | 8 ++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 648d6b94ba8c..6a2dadf2b2dc 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2336,10 +2336,8 @@ static void init_device_table_dma(void)
 {
u32 devid;
 
-   for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
+   for (devid = 0; devid <= amd_iommu_last_bdf; ++devid)
set_dev_entry_bit(devid, DEV_ENTRY_VALID);
-   set_dev_entry_bit(devid, DEV_ENTRY_TRANSLATION);
-   }
 }
 
 static void __init uninit_device_table_dma(void)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a1ada7bff44e..cea254968f06 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1473,7 +1473,7 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain,
 
pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
<< DEV_ENTRY_MODE_SHIFT;
-   pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+   pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V;
 
flags = amd_iommu_dev_table[devid].data[1];
 
@@ -1513,6 +1513,10 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain,
		flags |= tmp;
}
 
+   /* Only set TV bit when IOMMU page translation is in use */
+   if (domain->id != 0)
+   pte_root |= DTE_FLAG_TV;
+
flags &= ~DEV_DOMID_MASK;
flags |= domain->id;
 
@@ -1535,7 +1539,7 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain,
 static void clear_dte_entry(u16 devid)
 {
/* remove entry from the device table seen by the hardware */
-   amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
+   amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V;
amd_iommu_dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
amd_iommu_apply_erratum_63(devid);
-- 
2.25.1



Re: [PATCH] iommu/amd: Set translation valid bit only when IO page tables are in use

2022-04-28 Thread Suravee Suthikulpanit via iommu




On 4/20/22 6:29 PM, Suravee Suthikulpanit wrote:

On an AMD system with SNP enabled, the IOMMU hardware checks the host translation
valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use.
This results in an ILLEGAL_DEV_TABLE_ENTRY event for devices which
do not have the host page table root pointer set up.

Therefore, only set the TV bit when host or guest page tables are in use.

Signed-off-by: Suravee Suthikulpanit 



I found a bug in this patch. I will send out v2 with the fix.

Regards,
Suravee


[PATCH] iommu/amd: Set translation valid bit only when IO page tables are in use

2022-04-20 Thread Suravee Suthikulpanit via iommu
On an AMD system with SNP enabled, the IOMMU hardware checks the host translation
valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use.
This results in an ILLEGAL_DEV_TABLE_ENTRY event for devices which
do not have the host page table root pointer set up.

Therefore, only set the TV bit when host or guest page tables are in use.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c  |  4 +---
 drivers/iommu/amd/iommu.c | 13 +++--
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index b4a798c7b347..4f483f22e58c 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2337,10 +2337,8 @@ static void init_device_table_dma(void)
 {
u32 devid;
 
-   for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
+   for (devid = 0; devid <= amd_iommu_last_bdf; ++devid)
set_dev_entry_bit(devid, DEV_ENTRY_VALID);
-   set_dev_entry_bit(devid, DEV_ENTRY_TRANSLATION);
-   }
 }
 
 static void __init uninit_device_table_dma(void)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a1ada7bff44e..6dd35998e53c 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1473,7 +1473,7 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain,
 
pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
<< DEV_ENTRY_MODE_SHIFT;
-   pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+   pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V;
 
flags = amd_iommu_dev_table[devid].data[1];
 
@@ -1513,6 +1513,15 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain,
		flags |= tmp;
}
 
+   /*
+* Only set TV bit when:
+*   - IOMMUv1 table is in use.
+*   - IOMMUv2 table is in use.
+*/
+   if ((domain->iop.mode != PAGE_MODE_NONE) ||
+   (domain->flags & PD_IOMMUV2_MASK))
+   pte_root |= DTE_FLAG_TV;
+
flags &= ~DEV_DOMID_MASK;
flags |= domain->id;
 
@@ -1535,7 +1544,7 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain,
 static void clear_dte_entry(u16 devid)
 {
/* remove entry from the device table seen by the hardware */
-   amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
+   amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V;
amd_iommu_dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
amd_iommu_apply_erratum_63(devid);
-- 
2.25.1



[PATCH] iommu/amd: Do not call sleep while holding spinlock

2022-03-13 Thread Suravee Suthikulpanit via iommu
Smatch static checker warns:
drivers/iommu/amd/iommu_v2.c:133 free_device_state()
warn: sleeping in atomic context

Fix this by moving the struct device_state entries onto a temporary
list, and then freeing the memory after releasing the spinlock.

Reported-by: Dan Carpenter 
Fixes: dc6a709e5123 ("iommu/amd: Improve amd_iommu_v2_exit()")
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu_v2.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index 490da41c3c71..5a6e4f87d875 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -947,6 +947,7 @@ static void __exit amd_iommu_v2_exit(void)
 {
struct device_state *dev_state, *next;
unsigned long flags;
+   LIST_HEAD(freelist);
 
if (!amd_iommu_v2_supported())
return;
@@ -966,11 +967,20 @@ static void __exit amd_iommu_v2_exit(void)
 
put_device_state(dev_state);
		list_del(&dev_state->list);
-   free_device_state(dev_state);
+   list_add_tail(&dev_state->list, &freelist);
}
 
spin_unlock_irqrestore(_lock, flags);
 
+   /*
+* Since free_device_state waits on the count to be zero,
+* we need to free dev_state outside the spinlock.
+*/
+   list_for_each_entry_safe(dev_state, next, &freelist, list) {
+   list_del(&dev_state->list);
+   free_device_state(dev_state);
+   }
+
destroy_workqueue(iommu_wq);
 }
 
-- 
2.17.1
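The general shape of this fix — unlink entries onto a private list under the lock, then free them after the lock is dropped — can be sketched in user space as follows. The node type, the critical-section comments, and the freed counter are all illustrative:

```c
#include <assert.h>
#include <stdlib.h>

struct node {
	struct node *next;
};

static int freed;   /* counts frees, for demonstration only */

/* In the real driver, free_device_state() may sleep, so it must not
 * run while a spinlock is held. */
static void free_node(struct node *n)
{
	free(n);
	freed++;
}

/* Move every node from *list onto a private freelist "under the lock",
 * then free them after the lock would have been dropped. */
static void drain(struct node **list)
{
	struct node *freelist = NULL;

	/* --- critical section: only pointer surgery, nothing that sleeps --- */
	while (*list) {
		struct node *n = *list;

		*list = n->next;
		n->next = freelist;
		freelist = n;
	}
	/* --- lock released here --- */

	while (freelist) {
		struct node *n = freelist;

		freelist = n->next;
		free_node(n);   /* safe: no spinlock held anymore */
	}
}
```

The critical section shrinks to cheap list splicing, and the potentially sleeping cleanup moves entirely outside it, which is exactly what the Smatch warning was asking for.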



[PATCH] iommu/amd: Fix I/O page table memory leak

2022-02-10 Thread Suravee Suthikulpanit via iommu
The current logic updates the I/O page table mode for the domain
before calling the logic to free memory used for the page table.
This results in an IOMMU page table memory leak, which can be observed
when launching a VM with pass-through devices.

Fix by freeing the memory used for the page table before updating the mode.

Cc: Joerg Roedel 
Reported-by: Daniel Jordan 
Tested-by: Daniel Jordan 
Signed-off-by: Suravee Suthikulpanit 
Fixes: e42ba0633064 ("iommu/amd: Restructure code for freeing page table")
Link: https://lore.kernel.org/all/20220118194720.urjgi73b7c3tq...@oracle.com/
---
 drivers/iommu/amd/io_pgtable.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index b1bf4125b0f7..6608d1717574 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -492,18 +492,18 @@ static void v1_free_pgtable(struct io_pgtable *iop)
 
dom = container_of(pgtable, struct protection_domain, iop);
 
-   /* Update data structure */
-   amd_iommu_domain_clr_pt_root(dom);
-
-   /* Make changes visible to IOMMUs */
-   amd_iommu_domain_update(dom);
-
/* Page-table is not visible to IOMMU anymore, so free it */
BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
   pgtable->mode > PAGE_MODE_6_LEVEL);
 
	free_sub_pt(pgtable->root, pgtable->mode, &freelist);
 
+   /* Update data structure */
+   amd_iommu_domain_clr_pt_root(dom);
+
+   /* Make changes visible to IOMMUs */
+   amd_iommu_domain_update(dom);
+
	put_pages_list(&freelist);
 }
 
-- 
2.17.1



[PATCH 3/3] iommu/amd: Remove iommu_init_ga()

2021-08-20 Thread Suravee Suthikulpanit via iommu
Since the function has been simplified to only call iommu_init_ga_log(),
remove it and call iommu_init_ga_log() directly instead.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 17 -
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index ea3330ed545d..5ec683675ff0 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -827,9 +827,9 @@ static int iommu_ga_log_enable(struct amd_iommu *iommu)
return 0;
 }
 
-#ifdef CONFIG_IRQ_REMAP
 static int iommu_init_ga_log(struct amd_iommu *iommu)
 {
+#ifdef CONFIG_IRQ_REMAP
u64 entry;
 
if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
@@ -859,18 +859,9 @@ static int iommu_init_ga_log(struct amd_iommu *iommu)
 err_out:
free_ga_log(iommu);
return -EINVAL;
-}
-#endif /* CONFIG_IRQ_REMAP */
-
-static int iommu_init_ga(struct amd_iommu *iommu)
-{
-   int ret = 0;
-
-#ifdef CONFIG_IRQ_REMAP
-   ret = iommu_init_ga_log(iommu);
+#else
+   return 0;
 #endif /* CONFIG_IRQ_REMAP */
-
-   return ret;
 }
 
 static int __init alloc_cwwb_sem(struct amd_iommu *iommu)
@@ -1852,7 +1843,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
if (iommu_feature(iommu, FEATURE_PPR) && alloc_ppr_log(iommu))
return -ENOMEM;
 
-   ret = iommu_init_ga(iommu);
+   ret = iommu_init_ga_log(iommu);
if (ret)
return ret;
 
-- 
2.17.1



[PATCH 1/3] iommu/amd: Introduce helper function to check feature bit on all IOMMUs

2021-08-20 Thread Suravee Suthikulpanit via iommu
The IOMMU advertises features via the Extended Feature Register (EFR).
The helper function checks if the specified feature bit is set
across all IOMMUs.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 46280e6e1535..c97961451ac5 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -298,6 +298,19 @@ int amd_iommu_get_num_iommus(void)
return amd_iommus_present;
 }
 
+static bool check_feature_on_all_iommus(u64 mask)
+{
+   bool ret = false;
+   struct amd_iommu *iommu;
+
+   for_each_iommu(iommu) {
+   ret = iommu_feature(iommu, mask);
+   if (!ret)
+   return false;
+   }
+
+   return true;
+}
 /*
  * For IVHD type 0x11/0x40, EFR is also available via IVHD.
  * Default to IVHD EFR since it is available sooner
-- 
2.17.1



[PATCH 2/3] iommu/amd: Relocate GAMSup check to early_enable_iommus

2021-08-20 Thread Suravee Suthikulpanit via iommu
From: Wei Huang 

Currently, iommu_init_ga() checks and disables IOMMU VAPIC support
(i.e. AMD AVIC support in IOMMU) when the GAMSup feature bit is not set.
However, it forgets to clear IRQ_POSTING_CAP from the previously set
amd_iommu_irq_ops.capability.

This triggers an invalid page fault bug during guest VM warm reboot
if AVIC is enabled, since irq_remapping_cap(IRQ_POSTING_CAP) is
incorrectly set, and crashes the system with the following kernel trace.

BUG: unable to handle page fault for address: 00400dd8
RIP: 0010:amd_iommu_deactivate_guest_mode+0x19/0xbc
Call Trace:
 svm_set_pi_irte_mode+0x8a/0xc0 [kvm_amd]
 ? kvm_make_all_cpus_request_except+0x50/0x70 [kvm]
 kvm_request_apicv_update+0x10c/0x150 [kvm]
 svm_toggle_avic_for_irq_window+0x52/0x90 [kvm_amd]
 svm_enable_irq_window+0x26/0xa0 [kvm_amd]
 vcpu_enter_guest+0xbbe/0x1560 [kvm]
 ? avic_vcpu_load+0xd5/0x120 [kvm_amd]
 ? kvm_arch_vcpu_load+0x76/0x240 [kvm]
 ? svm_get_segment_base+0xa/0x10 [kvm_amd]
 kvm_arch_vcpu_ioctl_run+0x103/0x590 [kvm]
 kvm_vcpu_ioctl+0x22a/0x5d0 [kvm]
 __x64_sys_ioctl+0x84/0xc0
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Fix this by moving the initialization of the AMD IOMMU interrupt remapping
mode (amd_iommu_guest_ir) earlier, before setting up
amd_iommu_irq_ops.capability with the appropriate IRQ_POSTING_CAP flag.

Signed-off-by: Wei Huang 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index c97961451ac5..ea3330ed545d 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -867,13 +867,6 @@ static int iommu_init_ga(struct amd_iommu *iommu)
int ret = 0;
 
 #ifdef CONFIG_IRQ_REMAP
-   /* Note: We have already checked GASup from IVRS table.
-*   Now, we need to make sure that GAMSup is set.
-*/
-   if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) &&
-   !iommu_feature(iommu, FEATURE_GAM_VAPIC))
-   amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY_GA;
-
ret = iommu_init_ga_log(iommu);
 #endif /* CONFIG_IRQ_REMAP */
 
@@ -2490,6 +2483,14 @@ static void early_enable_iommus(void)
}
 
 #ifdef CONFIG_IRQ_REMAP
+   /*
+* Note: We have already checked GASup from IVRS table.
+*   Now, we need to make sure that GAMSup is set.
+*/
+   if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) &&
+   !check_feature_on_all_iommus(FEATURE_GAM_VAPIC))
+   amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY_GA;
+
if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
amd_iommu_irq_ops.capability |= (1 << IRQ_POSTING_CAP);
 #endif
-- 
2.17.1



[PATCH 0/3] iommu/amd: Fix unable to handle page fault due to AVIC

2021-08-20 Thread Suravee Suthikulpanit via iommu
This bug is triggered when rebooting a VM on a system where
SVM AVIC is enabled but IOMMU AVIC is disabled in the BIOS.

The series reworks interrupt remapping initialization to
check for IOMMU AVIC support (GAMSup) at an earlier stage using the
EFR provided by the IVRS table instead of the PCI MMIO register,
which is only available after PCI support for the IOMMU is initialized.
This helps avoid having to disable and clean up the already
initialized interrupt-remapping-related parameters.

Thanks,
Suravee

Suravee Suthikulpanit (2):
  iommu/amd: Introduce helper function to check feature bit on all
IOMMUs
  iommu/amd: Remove iommu_init_ga()

Wei Huang (1):
  iommu/amd: Relocate GAMSup check to early_enable_iommus

 drivers/iommu/amd/init.c | 45 ++--
 1 file changed, 25 insertions(+), 20 deletions(-)

-- 
2.17.1



[PATCH] MAINTAINERS: Add Suravee Suthikulpanit as Reviewer for AMD IOMMU (AMD-Vi)

2021-07-14 Thread Suravee Suthikulpanit via iommu
To help review changes related to AMD IOMMU.

Signed-off-by: Suravee Suthikulpanit 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index b80e6f7..8022dbd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -933,6 +933,7 @@ F:  drivers/video/fbdev/geode/
 
 AMD IOMMU (AMD-VI)
 M: Joerg Roedel 
+R: Suravee Suthikulpanit 
 L: iommu@lists.linux-foundation.org
 S: Maintained
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
-- 
1.8.3.1



[PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

2021-05-04 Thread Suravee Suthikulpanit
On certain AMD platforms, when the IOMMU performance counter source
(csource) field is zero, power-gating for the counter is enabled, which
prevents write access and returns zero for read access.

This can cause invalid perf results, especially when event multiplexing
is needed (i.e. more events than available counters), since
the current logic keeps track of the previously read counter value,
and subsequently re-programs the counter to continue counting the event.
With power-gating enabled, we cannot guarantee successful re-programming
of the counter.

Work around this issue by:

1. Modifying the ordering of setting/reading counters and enabling/
   disabling csources to only access the counter when the csource
   is set to non-zero.

2. Since the AMD IOMMU PMU does not support interrupt mode, the logic
   can be simplified to always start counting from zero,
   and accumulate the counter value when stopping, without the need
   to keep track of and reprogram the counter with the previously read
   counter value.

This has been tested on systems with and without power-gating.
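The start-from-zero accounting described in item 2 can be modeled in user space as below. The struct, the reset-on-read simplification, and the 48-bit mask are illustrative stand-ins for the driver's local64 bookkeeping:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: the hardware counter is programmed to zero on start, and
 * its value is accumulated into the event total on read -- no
 * prev_count tracking or overflow delta math is needed. */
struct event {
	uint64_t hw_counter;   /* stands in for the 48-bit IOMMU PMC */
	uint64_t total;        /* stands in for the local64 event count */
};

static void start(struct event *e)
{
	e->hw_counter = 0;     /* counter always starts counting from zero */
}

static void read_event(struct event *e)
{
	/* The IOMMU PC counter register is only 48 bits wide. */
	e->total += e->hw_counter & ((1ULL << 48) - 1);
	e->hw_counter = 0;     /* simplification: consume the count on read */
}
```

Because each start/read cycle contributes its full count, the event total survives multiplexing without ever re-programming the counter with a previously read value, which is the property the patch relies on.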

Fixes: 994d6608efe4 ("iommu/amd: Remove performance counter pre-initialization 
test")
Suggested-by: Alexander Monakov 
Cc: David Coe 
Signed-off-by: Suravee Suthikulpanit 
---
 arch/x86/events/amd/iommu.c | 47 -
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
index 1c1a7e45dc64..913745f1419b 100644
--- a/arch/x86/events/amd/iommu.c
+++ b/arch/x86/events/amd/iommu.c
@@ -19,8 +19,6 @@
 #include "../perf_event.h"
 #include "iommu.h"
 
-#define COUNTER_SHIFT  16
-
 /* iommu pmu conf masks */
 #define GET_CSOURCE(x) ((x)->conf & 0xFFULL)
 #define GET_DEVID(x)   (((x)->conf >> 8)  & 0xULL)
@@ -286,22 +284,31 @@ static void perf_iommu_start(struct perf_event *event, 
int flags)
WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
hwc->state = 0;
 
+   /*
+* To account for power-gating, which prevents writes to
+* the counter, we need to enable the counter
+* before setting up the counter register.
+*/
+   perf_iommu_enable_event(event);
+
if (flags & PERF_EF_RELOAD) {
-   u64 prev_raw_count = local64_read(&hwc->prev_count);
+   u64 count = 0;
struct amd_iommu *iommu = perf_event_2_iommu(event);
 
+   /*
+* Since the IOMMU PMU only supports counting mode,
+* the counter always starts with value zero.
+*/
amd_iommu_pc_set_reg(iommu, hwc->iommu_bank, hwc->iommu_cntr,
-IOMMU_PC_COUNTER_REG, &prev_raw_count);
+IOMMU_PC_COUNTER_REG, &count);
}
 
-   perf_iommu_enable_event(event);
perf_event_update_userpage(event);
-
 }
 
 static void perf_iommu_read(struct perf_event *event)
 {
-   u64 count, prev, delta;
+   u64 count;
	struct hw_perf_event *hwc = &event->hw;
struct amd_iommu *iommu = perf_event_2_iommu(event);
 
@@ -312,14 +319,11 @@ static void perf_iommu_read(struct perf_event *event)
/* IOMMU pc counter register is only 48 bits */
count &= GENMASK_ULL(47, 0);
 
-   prev = local64_read(&hwc->prev_count);
-   if (local64_cmpxchg(&hwc->prev_count, prev, count) != prev)
-   return;
-
-   /* Handle 48-bit counter overflow */
-   delta = (count << COUNTER_SHIFT) - (prev << COUNTER_SHIFT);
-   delta >>= COUNTER_SHIFT;
-   local64_add(delta, &hwc->count);
+   /*
+* Since the counter always starts with value zero,
+* simply accumulate the count for the event.
+*/
+   local64_add(count, &hwc->count);
 }
 
 static void perf_iommu_stop(struct perf_event *event, int flags)
@@ -329,15 +333,16 @@ static void perf_iommu_stop(struct perf_event *event, int 
flags)
if (hwc->state & PERF_HES_UPTODATE)
return;
 
+   /*
+* To account for power-gating, in which reading the counter would
+* return zero, we need to read the register before disabling.
+*/
+   perf_iommu_read(event);
+   hwc->state |= PERF_HES_UPTODATE;
+
perf_iommu_disable_event(event);
WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
hwc->state |= PERF_HES_STOPPED;
-
-   if (hwc->state & PERF_HES_UPTODATE)
-   return;
-
-   perf_iommu_read(event);
-   hwc->state |= PERF_HES_UPTODATE;
 }
 
 static int perf_iommu_add(struct perf_event *event, int flags)
-- 
2.17.1



[PATCH 1/2] Revert "iommu/amd: Fix performance counter initialization"

2021-04-09 Thread Suravee Suthikulpanit
From: Paul Menzel 

This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.

The original commit tries to address an issue where PMC power-gating
causes the IOMMU PMC pre-init test to fail on certain desktop/mobile
platforms where power-gating is normally enabled.

There have been several reports that the workaround still is not
guaranteed to work, and can add up to 100 ms (in the worst case)
to the boot process on certain platforms such as the MSI B350M MORTAR
with AMD Ryzen 3 2200G.

Therefore, revert this commit as a prelude to removing the pre-init
test.

Link: 
https://lore.kernel.org/linux-iommu/alpine.lnx.3.20.13.2006030935570.3...@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux) 
Cc: Shuah Khan 
Cc: Alexander Monakov 
Cc: David Coe 
Signed-off-by: Paul Menzel 
Signed-off-by: Suravee Suthikulpanit 
---
Note: I have revised the commit message to add more detail
  and remove unnecessary information.

 drivers/iommu/amd/init.c | 45 ++--
 1 file changed, 11 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 321f5906e6ed..648cdfd03074 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -12,7 +12,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -257,8 +256,6 @@ static enum iommu_init_state init_state = IOMMU_START_STATE;
 static int amd_iommu_enable_interrupts(void);
 static int __init iommu_go_to_state(enum iommu_init_state state);
 static void init_device_table_dma(void);
-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
-   u8 fxn, u64 *value, bool is_write);
 
 static bool amd_iommu_pre_enabled = true;
 
@@ -1717,11 +1714,13 @@ static int __init init_iommu_all(struct 
acpi_table_header *table)
return 0;
 }
 
-static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
+static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
+   u8 fxn, u64 *value, bool is_write);
+
+static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
-   int retry;
struct pci_dev *pdev = iommu->dev;
-   u64 val = 0xabcd, val2 = 0, save_reg, save_src;
+   u64 val = 0xabcd, val2 = 0, save_reg = 0;
 
if (!iommu_feature(iommu, FEATURE_PC))
return;
@@ -1729,39 +1728,17 @@ static void __init init_iommu_perf_ctr(struct amd_iommu 
*iommu)
amd_iommu_pc_present = true;
 
/* save the value to restore, if writable */
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) ||
-   iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false))
-   goto pc_false;
-
-   /*
-* Disable power gating by programing the performance counter
-* source to 20 (i.e. counts the reads and writes from/to IOMMU
-* Reserved Register [MMIO Offset 1FF8h] that are ignored.),
-* which never get incremented during this init phase.
-* (Note: The event is also deprecated.)
-*/
-   val = 20;
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true))
+   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, _reg, false))
goto pc_false;
 
/* Check if the performance counters can be written to */
-   val = 0xabcd;
-   for (retry = 5; retry; retry--) {
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) ||
-   iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) ||
-   val2)
-   break;
-
-   /* Wait about 20 msec for power gating to disable and retry. */
-   msleep(20);
-   }
-
-   /* restore */
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) ||
-   iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true))
+   if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
+   (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
+   (val != val2))
goto pc_false;
 
-   if (val != val2)
+   /* restore */
+   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
goto pc_false;
 
pci_info(pdev, "IOMMU performance counters supported\n");
-- 
2.17.1



[PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test

2021-04-09 Thread Suravee Suthikulpanit
In early AMD desktop/mobile platforms (during 2013), when the IOMMU
Performance Counter (PMC) support was first introduced in
commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
resource management"), there was a HW bug where the counters could not
be accessed. As a result, reading the counters always returned zero.

At the time, the suggested workaround was to add test logic prior
to initializing the PMC feature to check if the counters can be programmed
and read back the same value. This worked fine until more recent
desktop/mobile platforms started enabling power gating for the PMC,
which prevents access to the counters. This results in the PMC support
being disabled unnecessarily.

Unfortunately, there is no documentation of which hardware generation
fixed the original PMC HW bug, although it was fixed soon after the
PMC was first introduced. Based on this, we assume that the buggy
platforms are unlikely to still be in use, and it should be relatively
safe to remove this legacy logic.

Link: https://lore.kernel.org/linux-iommu/alpine.lnx.3.20.13.2006030935570.3...@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux) 
Cc: Shuah Khan 
Cc: Alexander Monakov 
Cc: David Coe 
Cc: Paul Menzel 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 24 +---
 1 file changed, 1 insertion(+), 23 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 648cdfd03074..247cdda5d683 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1714,33 +1714,16 @@ static int __init init_iommu_all(struct 
acpi_table_header *table)
return 0;
 }
 
-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
-   u8 fxn, u64 *value, bool is_write);
-
 static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
+   u64 val;
struct pci_dev *pdev = iommu->dev;
-   u64 val = 0xabcd, val2 = 0, save_reg = 0;
 
if (!iommu_feature(iommu, FEATURE_PC))
return;
 
amd_iommu_pc_present = true;
 
-   /* save the value to restore, if writable */
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
-   goto pc_false;
-
-   /* Check if the performance counters can be written to */
-   if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
-   (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
-   (val != val2))
-   goto pc_false;
-
-   /* restore */
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
-   goto pc_false;
-
pci_info(pdev, "IOMMU performance counters supported\n");
 
val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
@@ -1748,11 +1731,6 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu)
iommu->max_counters = (u8) ((val >> 7) & 0xf);
 
return;
-
-pc_false:
-   pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n");
-   amd_iommu_pc_present = false;
-   return;
 }
 
 static ssize_t amd_iommu_show_cap(struct device *dev,
-- 
2.17.1



[PATCH 0/2] iommu/amd: Revert and remove failing PMC test

2021-04-09 Thread Suravee Suthikulpanit
This has prevented the PMC from working on more recent desktop/mobile
platforms, where PMC power-gating is normally enabled. After consulting
with HW designers and the IOMMU maintainer, we have decided to remove
the legacy test altogether to avoid future PMC enabling issues.

Thanks to the community for helping to test, investigate, provide data,
and report issues on several platforms in the field.

Regards,
Suravee 

Paul Menzel (1):
  Revert "iommu/amd: Fix performance counter initialization"

Suravee Suthikulpanit (1):
  iommu/amd: Remove performance counter pre-initialization test

 drivers/iommu/amd/init.c | 49 ++--
 1 file changed, 2 insertions(+), 47 deletions(-)

-- 
2.17.1



Re: [RFC PATCH 5/7] iommu/amd: Add support for Guest IO protection

2021-03-25 Thread Suravee Suthikulpanit

Joerg,

On 3/18/21 10:31 PM, Joerg Roedel wrote:

On Fri, Mar 12, 2021 at 03:04:09AM -0600, Suravee Suthikulpanit wrote:

@@ -519,6 +521,7 @@ struct protection_domain {
spinlock_t lock;/* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
int glx;/* Number of levels for GCR3 table */
+   bool giov;  /* guest IO protection domain */


Could this be turned into a flag?



Good point. I'll convert to use the protection_domain.flags.

Thanks,
Suravee


Re: [RFC PATCH 6/7] iommu/amd: Introduce amd_iommu_pgtable command-line option

2021-03-21 Thread Suravee Suthikulpanit

Joerg,

On 3/18/21 10:33 PM, Joerg Roedel wrote:

On Fri, Mar 12, 2021 at 03:04:10AM -0600, Suravee Suthikulpanit wrote:

Allow specifying whether to use the v1 or v2 IOMMU page table for
DMA remapping when calling the kernel DMA-API.

Signed-off-by: Suravee Suthikulpanit 
---
  Documentation/admin-guide/kernel-parameters.txt |  6 ++
  drivers/iommu/amd/init.c| 15 +++
  2 files changed, 21 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 04545725f187..466e807369ea 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -319,6 +319,12 @@
 This mode requires kvm-amd.avic=1.
 (Default when IOMMU HW support is present.)
  
+	amd_iommu_pgtable= [HW,X86-64]

+   Specifies one of the following AMD IOMMU page table to
+   be used for DMA remapping for DMA-API:
+   v1 - Use v1 page table (Default)
+   v2 - Use v2 page table


Any reason v2 can not be the default when it is supported by the IOMMU?



Eventually, we should be able to default to v2. However, we will need
to make sure that the v2 implementation has performance comparable to
the currently used v1.

FYI: I'm also looking into adding support for SVA as well.

Thanks,
Suravee


[RFC PATCH 7/7] iommu/amd: Add support for using AMD IOMMU v2 page table for DMA-API

2021-03-12 Thread Suravee Suthikulpanit
Introduce init function for setting up DMA domain for DMA-API with
the IOMMU v2 page table.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index e29ece6e1e68..bd26de8764bd 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1937,6 +1937,24 @@ static int protection_domain_init_v1(struct 
protection_domain *domain, int mode)
return 0;
 }
 
+static int protection_domain_init_v2(struct protection_domain *domain)
+{
+   spin_lock_init(&domain->lock);
+   domain->id = domain_id_alloc();
+   if (!domain->id)
+   return -ENOMEM;
+   INIT_LIST_HEAD(&domain->dev_list);
+
+   domain->giov = true;
+
+   if (amd_iommu_pgtable == AMD_IOMMU_V2 &&
+   domain_enable_v2(domain, 1, false)) {
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+
 static struct protection_domain *protection_domain_alloc(unsigned int type)
 {
struct io_pgtable_ops *pgtbl_ops;
@@ -1964,6 +1982,9 @@ static struct protection_domain 
*protection_domain_alloc(unsigned int type)
case AMD_IOMMU_V1:
ret = protection_domain_init_v1(domain, mode);
break;
+   case AMD_IOMMU_V2:
+   ret = protection_domain_init_v2(domain);
+   break;
default:
ret = -EINVAL;
}
-- 
2.17.1



[RFC PATCH 6/7] iommu/amd: Introduce amd_iommu_pgtable command-line option

2021-03-12 Thread Suravee Suthikulpanit
Allow specifying whether to use the v1 or v2 IOMMU page table for
DMA remapping when calling the kernel DMA-API.

Signed-off-by: Suravee Suthikulpanit 
---
 Documentation/admin-guide/kernel-parameters.txt |  6 ++
 drivers/iommu/amd/init.c| 15 +++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 04545725f187..466e807369ea 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -319,6 +319,12 @@
 This mode requires kvm-amd.avic=1.
 (Default when IOMMU HW support is present.)
 
+   amd_iommu_pgtable= [HW,X86-64]
+   Specifies one of the following AMD IOMMU page table to
+   be used for DMA remapping for DMA-API:
+   v1 - Use v1 page table (Default)
+   v2 - Use v2 page table
+
amijoy.map= [HW,JOY] Amiga joystick support
Map of devices attached to JOY0DAT and JOY1DAT
Format: <a>,<b>
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 9265c1bf1d84..6d5163bfb87e 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3123,6 +3123,20 @@ static int __init parse_amd_iommu_dump(char *str)
return 1;
 }
 
+static int __init parse_amd_iommu_pgtable(char *str)
+{
+   for (; *str; ++str) {
+   if (strncmp(str, "v1", 2) == 0) {
+   amd_iommu_pgtable = AMD_IOMMU_V1;
+   break;
+   } else if (strncmp(str, "v2", 2) == 0) {
+   amd_iommu_pgtable = AMD_IOMMU_V2;
+   break;
+   }
+   }
+   return 1;
+}
+
 static int __init parse_amd_iommu_intr(char *str)
 {
for (; *str; ++str) {
@@ -3246,6 +3260,7 @@ static int __init parse_ivrs_acpihid(char *str)
 
 __setup("amd_iommu_dump",  parse_amd_iommu_dump);
 __setup("amd_iommu=",  parse_amd_iommu_options);
+__setup("amd_iommu_pgtable=",  parse_amd_iommu_pgtable);
 __setup("amd_iommu_intr=", parse_amd_iommu_intr);
 __setup("ivrs_ioapic", parse_ivrs_ioapic);
 __setup("ivrs_hpet",   parse_ivrs_hpet);
-- 
2.17.1



[RFC PATCH 3/7] iommu/amd: Decouple the logic to enable PPR and GT

2021-03-12 Thread Suravee Suthikulpanit
Currently, the function to enable IOMMU v2 (GT) assumes the PPR log
must also be enabled. This is no longer the case, since the IOMMU
v2 page table can be enabled without PPR support (for the DMA-API
use case).

Therefore, separate the enabling logic for PPR and GT.
There is no functional change.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 9126efcbaf2c..5def566de6f6 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -898,14 +898,6 @@ static void iommu_enable_xt(struct amd_iommu *iommu)
 #endif /* CONFIG_IRQ_REMAP */
 }
 
-static void iommu_enable_gt(struct amd_iommu *iommu)
-{
-   if (!iommu_feature(iommu, FEATURE_GT))
-   return;
-
-   iommu_feature_enable(iommu, CONTROL_GT_EN);
-}
-
 /* sets a specific bit in the device table entry. */
 static void set_dev_entry_bit(u16 devid, u8 bit)
 {
@@ -1882,6 +1874,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
amd_iommu_max_glx_val = glxval;
else
amd_iommu_max_glx_val = min(amd_iommu_max_glx_val, 
glxval);
+   iommu_feature_enable(iommu, CONTROL_GT_EN);
}
 
if (iommu_feature(iommu, FEATURE_GT) &&
@@ -2530,21 +2523,19 @@ static void early_enable_iommus(void)
 #endif
 }
 
-static void enable_iommus_v2(void)
+static void enable_iommus_ppr(void)
 {
struct amd_iommu *iommu;
 
-   for_each_iommu(iommu) {
+   for_each_iommu(iommu)
iommu_enable_ppr_log(iommu);
-   iommu_enable_gt(iommu);
-   }
 }
 
 static void enable_iommus(void)
 {
early_enable_iommus();
 
-   enable_iommus_v2();
+   enable_iommus_ppr();
 }
 
 static void disable_iommus(void)
@@ -2935,7 +2926,7 @@ static int __init state_next(void)
register_syscore_ops(_iommu_syscore_ops);
ret = amd_iommu_init_pci();
init_state = ret ? IOMMU_INIT_ERROR : IOMMU_PCI_INIT;
-   enable_iommus_v2();
+   enable_iommus_ppr();
break;
case IOMMU_PCI_INIT:
ret = amd_iommu_enable_interrupts();
-- 
2.17.1



[RFC PATCH 4/7] iommu/amd: Initial support for AMD IOMMU v2 page table

2021-03-12 Thread Suravee Suthikulpanit
Introduce IO page table framework support for AMD IOMMU v2 page table.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/Makefile  |   2 +-
 drivers/iommu/amd/amd_iommu_types.h |   2 +
 drivers/iommu/amd/io_pgtable_v2.c   | 239 
 drivers/iommu/io-pgtable.c  |   1 +
 include/linux/io-pgtable.h  |   2 +
 5 files changed, 245 insertions(+), 1 deletion(-)
 create mode 100644 drivers/iommu/amd/io_pgtable_v2.c

diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile
index a935f8f4b974..773d8aa00283 100644
--- a/drivers/iommu/amd/Makefile
+++ b/drivers/iommu/amd/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
-obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o
+obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o io_pgtable_v2.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o
 obj-$(CONFIG_AMD_IOMMU_V2) += iommu_v2.o
diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 6937e3674a16..25062eb86c8b 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -265,6 +265,7 @@
  * 512GB Pages are not supported due to a hardware bug
  */
 #define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
+#define AMD_IOMMU_PGSIZES_V2   (PAGE_SIZE | (1ULL << 12) | (1ULL << 30))
 
 /* Bit value definition for dte irq remapping fields*/
 #define DTE_IRQ_PHYS_ADDR_MASK (((1ULL << 45)-1) << 6)
@@ -503,6 +504,7 @@ struct amd_io_pgtable {
int mode;
u64 *root;
atomic64_t  pt_root;/* pgtable root and pgtable mode */
+   struct mm_structv2_mm;
 };
 
 /*
diff --git a/drivers/iommu/amd/io_pgtable_v2.c 
b/drivers/iommu/amd/io_pgtable_v2.c
new file mode 100644
index ..b0b6ba2d8d35
--- /dev/null
+++ b/drivers/iommu/amd/io_pgtable_v2.c
@@ -0,0 +1,239 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * CPU-agnostic AMD IO page table v2 allocator.
+ *
+ * Copyright (C) 2020 Advanced Micro Devices, Inc.
+ * Author: Suravee Suthikulpanit 
+ */
+
+#define pr_fmt(fmt) "AMD-Vi: " fmt
+#define dev_fmt(fmt)pr_fmt(fmt)
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include "amd_iommu_types.h"
+#include "amd_iommu.h"
+
+static pte_t *fetch_pte(struct amd_io_pgtable *pgtable,
+ unsigned long iova,
+ unsigned long *page_size)
+{
+   int level;
+   pte_t *ptep;
+
+   ptep = lookup_address_in_mm(&pgtable->v2_mm, iova, &level);
+   if (!ptep || pte_none(*ptep) || (level == PG_LEVEL_NONE))
+   return NULL;
+
+   *page_size = PTE_LEVEL_PAGE_SIZE(level-1);
+   return ptep;
+}
+
+static pte_t *v2_pte_alloc_map(struct mm_struct *mm, unsigned long vaddr)
+{
+   pgd_t *pgd;
+   p4d_t *p4d;
+   pud_t *pud;
+   pmd_t *pmd;
+   pte_t *pte;
+
+   pgd = pgd_offset(mm, vaddr);
+   p4d = p4d_alloc(mm, pgd, vaddr);
+   if (!p4d)
+   return NULL;
+   pud = pud_alloc(mm, p4d, vaddr);
+   if (!pud)
+   return NULL;
+   pmd = pmd_alloc(mm, pud, vaddr);
+   if (!pmd)
+   return NULL;
+   pte = pte_alloc_map(mm, pmd, vaddr);
+   return pte;
+}
+
+static int iommu_v2_map_page(struct io_pgtable_ops *ops, unsigned long iova,
+ phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+{
+   struct protection_domain *dom = io_pgtable_ops_to_domain(ops);
+   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
+   pte_t *pte;
+   int ret, i, count;
+   bool updated = false;
+   unsigned long o_iova = iova;
+   unsigned long pte_pgsize;
+
+   BUG_ON(!IS_ALIGNED(iova, size) || !IS_ALIGNED(paddr, size));
+
+   ret = -EINVAL;
+   if (!(prot & IOMMU_PROT_MASK))
+   goto out;
+
+   count = PAGE_SIZE_PTE_COUNT(size);
+
+   for (i = 0; i < count; ++i, iova += PAGE_SIZE, paddr += PAGE_SIZE) {
+   pte = fetch_pte(pgtable, iova, &pte_pgsize);
+   if (!pte || pte_none(*pte)) {
+   pte = v2_pte_alloc_map(&dom->iop.v2_mm, iova);
+   if (!pte)
+   goto out;
+   } else {
+   updated = true;
+   }
+   set_pte(pte, __pte((paddr & PAGE_MASK)|_PAGE_PRESENT|_PAGE_USER));
+   if (prot & IOMMU_PROT_IW)
+   *pte = pte_mkwrite(*pte);
+   }
+
+   if (updated) {
+   if (count > 1)
+   amd_iommu_flush_tlb(&dom->domain, 0);
+   else
+   amd_iommu_flush_page(&dom->domain, 0, o_iova);
+   }
+
+   ret = 0;
+out:
+   return ret;
+}
+
+static unsigned long iommu_v2_u

[RFC PATCH 1/7] iommu/amd: Refactor amd_iommu_domain_enable_v2

2021-03-12 Thread Suravee Suthikulpanit
The current function to enable IOMMU v2 also locks the domain.
In order to reuse the same code in a different code path, where
the domain has already been locked, refactor the function to separate
the locking from the enabling logic.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 42 +--
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a69a8b573e40..6f3e42495709 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -88,6 +88,7 @@ struct iommu_cmd {
 struct kmem_cache *amd_iommu_irq_cache;
 
 static void detach_device(struct device *dev);
+static int domain_enable_v2(struct protection_domain *domain, int pasids, bool 
has_ppr);
 
 /
  *
@@ -2304,10 +2305,9 @@ void amd_iommu_domain_direct_map(struct iommu_domain 
*dom)
 }
 EXPORT_SYMBOL(amd_iommu_domain_direct_map);
 
-int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids)
+/* Note: This function expects iommu_domain->lock to be held prior to calling. */
+static int domain_enable_v2(struct protection_domain *domain, int pasids, bool 
has_ppr)
 {
-   struct protection_domain *domain = to_pdomain(dom);
-   unsigned long flags;
int levels, ret;
 
if (pasids <= 0 || pasids > (PASID_MASK + 1))
@@ -2320,17 +2320,6 @@ int amd_iommu_domain_enable_v2(struct iommu_domain *dom, 
int pasids)
if (levels > amd_iommu_max_glx_val)
return -EINVAL;
 
-   spin_lock_irqsave(&domain->lock, flags);
-
-   /*
-* Save us all sanity checks whether devices already in the
-* domain support IOMMUv2. Just force that the domain has no
-* devices attached when it is switched into IOMMUv2 mode.
-*/
-   ret = -EBUSY;
-   if (domain->dev_cnt > 0 || domain->flags & PD_IOMMUV2_MASK)
-   goto out;
-
ret = -ENOMEM;
domain->gcr3_tbl = (void *)get_zeroed_page(GFP_ATOMIC);
if (domain->gcr3_tbl == NULL)
@@ -2344,8 +2333,31 @@ int amd_iommu_domain_enable_v2(struct iommu_domain *dom, 
int pasids)
ret = 0;
 
 out:
-   spin_unlock_irqrestore(&domain->lock, flags);
+   return ret;
+}
 
+int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids)
+{
+   int ret;
+   unsigned long flags;
+   struct protection_domain *pdom = to_pdomain(dom);
+
+   spin_lock_irqsave(&pdom->lock, flags);
+
+   /*
+* Save us all sanity checks whether devices already in the
+* domain support IOMMUv2. Just force that the domain has no
+* devices attached when it is switched into IOMMUv2 mode.
+*/
+   ret = -EBUSY;
+   if (pdom->dev_cnt > 0 || pdom->flags & PD_IOMMUV2_MASK)
+   goto out;
+
+   if (pdom->dev_cnt == 0 && !(pdom->gcr3_tbl))
+   ret = domain_enable_v2(pdom, pasids, true);
+
+out:
+   spin_unlock_irqrestore(&pdom->lock, flags);
return ret;
 }
 EXPORT_SYMBOL(amd_iommu_domain_enable_v2);
-- 
2.17.1



[RFC PATCH 5/7] iommu/amd: Add support for Guest IO protection

2021-03-12 Thread Suravee Suthikulpanit
AMD IOMMU introduces support for Guest I/O protection, where requests
from an I/O device without a PASID are treated as if they had PASID 0.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu_types.h | 3 +++
 drivers/iommu/amd/init.c| 8 
 drivers/iommu/amd/iommu.c   | 4 
 3 files changed, 15 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 25062eb86c8b..876ba1adf73e 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -93,6 +93,7 @@
 #define FEATURE_HE (1ULL<<8)
 #define FEATURE_PC (1ULL<<9)
 #define FEATURE_GAM_VAPIC  (1ULL<<21)
+#define FEATURE_GIOSUP (1ULL<<48)
 #define FEATURE_EPHSUP (1ULL<<50)
 #define FEATURE_SNP(1ULL<<63)
 
@@ -366,6 +367,7 @@
 #define DTE_FLAG_IW (1ULL << 62)
 
 #define DTE_FLAG_IOTLB (1ULL << 32)
+#define DTE_FLAG_GIOV  (1ULL << 54)
 #define DTE_FLAG_GV(1ULL << 55)
 #define DTE_FLAG_MASK  (0x3ffULL << 32)
 #define DTE_GLX_SHIFT  (56)
@@ -519,6 +521,7 @@ struct protection_domain {
spinlock_t lock;/* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
int glx;/* Number of levels for GCR3 table */
+   bool giov;  /* guest IO protection domain */
u64 *gcr3_tbl;  /* Guest CR3 table */
unsigned long flags;/* flags to find out type of domain */
unsigned dev_cnt;   /* devices assigned to this domain */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 5def566de6f6..9265c1bf1d84 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1895,6 +1895,12 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
 
init_iommu_perf_ctr(iommu);
 
+   if (amd_iommu_pgtable == AMD_IOMMU_V2 &&
+   !iommu_feature(iommu, FEATURE_GIOSUP)) {
+   pr_warn("Cannot enable v2 page table for DMA-API. Fallback to 
v1.\n");
+   amd_iommu_pgtable = AMD_IOMMU_V1;
+   }
+
if (is_rd890_iommu(iommu->dev)) {
int i, j;
 
@@ -1969,6 +1975,8 @@ static void print_iommu_info(void)
if (amd_iommu_xt_mode == IRQ_REMAP_X2APIC_MODE)
pr_info("X2APIC enabled\n");
}
+   if (amd_iommu_pgtable == AMD_IOMMU_V2)
+   pr_info("GIOV enabled\n");
 }
 
 static int __init amd_iommu_init_pci(void)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index f3800efdbb29..e29ece6e1e68 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1405,6 +1405,10 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain,
 
pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
<< DEV_ENTRY_MODE_SHIFT;
+
+   if (domain->giov && (domain->flags & PD_IOMMUV2_MASK))
+   pte_root |= DTE_FLAG_GIOV;
+
pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
 
flags = amd_iommu_dev_table[devid].data[1];
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC PATCH 2/7] iommu/amd: Update sanity check when enable PRI/ATS

2021-03-12 Thread Suravee Suthikulpanit
Currently, PPR/ATS can be enabled only if the domain type is
identity mapping. However, when we allow the IOMMU v2 page table
to be used for the DMA-API, the sanity check needs to be updated to
apply only when using the AMD_IOMMU_V1 page table mode.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 6f3e42495709..f3800efdbb29 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1549,7 +1549,7 @@ static int pri_reset_while_enabled(struct pci_dev *pdev)
return 0;
 }
 
-static int pdev_iommuv2_enable(struct pci_dev *pdev)
+static int pdev_pri_ats_enable(struct pci_dev *pdev)
 {
bool reset_enable;
int reqs, ret;
@@ -1624,11 +1624,19 @@ static int attach_device(struct device *dev,
struct iommu_domain *def_domain = iommu_get_dma_domain(dev);
 
ret = -EINVAL;
-   if (def_domain->type != IOMMU_DOMAIN_IDENTITY)
+
+   /*
+* In case of using AMD_IOMMU_V1 page table mode, and the device
+* is enabling for PPR/ATS support (using v2 table),
+* we need to make sure that the domain type is identity map.
+*/
+   if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+   def_domain->type != IOMMU_DOMAIN_IDENTITY) {
goto out;
+   }
 
if (dev_data->iommu_v2) {
-   if (pdev_iommuv2_enable(pdev) != 0)
+   if (pdev_pri_ats_enable(pdev) != 0)
goto out;
 
dev_data->ats.enabled = true;
-- 
2.17.1



[RFC PATCH 0/7] iommu/amd: Add Generic IO Page Table Framework Support for v2 Page Table

2021-03-12 Thread Suravee Suthikulpanit
This series introduces a new usage model for the v2 page table, where it
can be used to implement support for DMA-API by adopting the generic
IO page table framework.

One of the target usecases is to support nested IO page tables
where the guest uses the guest IO page table (v2) for translating
GVA to GPA, and the hypervisor uses the host I/O page table (v1) for
translating GPA to SPA. This is a prerequisite for supporting the new
HW-assisted vIOMMU presented at KVM Forum 2020.

  
https://static.sched.com/hosted_files/kvmforum2020/26/vIOMMU%20KVM%20Forum%202020.pdf

The following components are introduced in this series:

- Part 1 (patch 1-4 and 7)
  Refactor the current IOMMU page table v2 code
  to adopt the generic IO page table framework, and add
  AMD IOMMU Guest (v2) page table management code.

- Part 2 (patch 5)
  Add support for the AMD IOMMU Guest IO Protection feature (GIOV)
  where requests from the I/O device without a PASID are treated as
  if they have PASID of 0.

- Part 3 (patch 6)
  Introduce new amd_iommu_pgtable command-line to allow users
  to select the mode of operation (v1 or v2).

See AMD I/O Virtualization Technology Specification for more detail.

  http://www.amd.com/system/files/TechDocs/48882_IOMMU_3.05_PUB.pdf

Thanks,
Suravee

Suravee Suthikulpanit (7):
  iommu/amd: Refactor amd_iommu_domain_enable_v2
  iommu/amd: Update sanity check when enable PRI/ATS
  iommu/amd: Decouple the logic to enable PPR and GT
  iommu/amd: Initial support for AMD IOMMU v2 page table
  iommu/amd: Add support for Guest IO protection
  iommu/amd: Introduce amd_iommu_pgtable command-line option
  iommu/amd: Add support for using AMD IOMMU v2 page table for DMA-API

 .../admin-guide/kernel-parameters.txt |   6 +
 drivers/iommu/amd/Makefile|   2 +-
 drivers/iommu/amd/amd_iommu_types.h   |   5 +
 drivers/iommu/amd/init.c  |  42 ++-
 drivers/iommu/amd/io_pgtable_v2.c | 239 ++
 drivers/iommu/amd/iommu.c |  81 --
 drivers/iommu/io-pgtable.c|   1 +
 include/linux/io-pgtable.h|   2 +
 8 files changed, 345 insertions(+), 33 deletions(-)
 create mode 100644 drivers/iommu/amd/io_pgtable_v2.c

-- 
2.17.1



Re: [PATCH] Revert "iommu/amd: Fix performance counter initialization"

2021-03-03 Thread Suravee Suthikulpanit

Paul,

On 3/3/21 7:11 PM, Paul Menzel wrote:

This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.

The commit adds up to 100 ms to the boot process, which is not mentioned
in the commit message, and is making up more than 20 % on current
systems, where the Linux kernel takes 500 ms.


The 100 ms (5 * 20 ms) is only for the worst-case scenario. In most
cases, the delay does not apply. In addition, this patch has been shown
to fix the issue for some users in the field.



 [0.00] Linux version 5.11.0-10281-g19b4f3edd5c9 
(root@a2ab663d937e) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU 
Binutils for Debian) 2.35.1) #138 SMP Wed Feb 24 11:28:17 UTC 2021
 […]
 [0.106422] smpboot: CPU0: AMD Ryzen 3 2200G with Radeon Vega Graphics 
(family: 0x17, model: 0x11, stepping: 0x0)
 […]
 [0.291257] pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU 
perf counter.
 […]

Also, it does not fix the problem on an MSI B350M MORTAR with AMD Ryzen
3 2200G (even with ten retries, resulting in 200 ms time-out).


We are still investigating the root cause of the long delay for the
IOMMU performance counter unit to disable power-gating and allow
access to the performance counters. If your concern is the number of
retries, we can try to reduce it.



 [0.401152] pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU 
perf counter.

Additionally, alternative proposed solutions [1] were not considered or
discussed.

[1]:https://lore.kernel.org/linux-iommu/alpine.lnx.2.20.13.2006030935570.3...@monopod.intra.ispras.ru/


This check was introduced early on to detect a HW issue on certain
platforms in the past, where the performance counters were not accessible
and would fail silently when trying to use the counters. This is
considered legacy code, and it can be removed if we decide to no longer
provide a sanity check for such cases.

Regards,
Suravee

Re: [PATCH] iommu/amd: Fix event counter availability check

2021-02-22 Thread Suravee Suthikulpanit

This fix has been accepted in the upstream recently.

https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=x86/amd

Could you please give this a try?

Thanks,
Suravee

On 2/21/21 8:49 PM, Paul Menzel wrote:

Dear Suravee,


Am 17.09.20 um 19:55 schrieb Alexander Monakov:

On Tue, 16 Jun 2020, Suravee Suthikulpanit wrote:


Instead of blindly moving the code around to a spot that would just work,
I am trying to understand what might be required here. In this case,
the init_device_table_dma() should not be needed. I suspect it's the IOMMU
invalidate all command that's also needed here.

I'm also checking with the HW and BIOS team. Meanwhile, could you please
give
the following change a try:

Hello. Can you give any update please?


[…]


Sorry for the late reply. I have a reproducer and am working with the
HW team to understand the issue. I should be able to provide an update
with a solution by the end of this week.


Hello, hope you are doing well. Has this investigation found anything?


I am wondering the same. It’d be great to have this fixed in the upstream Linux 
kernel.


Kind regards,

Paul


[PATCH] iommu/amd: Fix performance counter initialization

2021-02-08 Thread Suravee Suthikulpanit
Certain AMD platforms enable power gating feature for IOMMU PMC,
which prevents the IOMMU driver from updating the counter while
trying to validate the PMC functionality in the init_iommu_perf_ctr().
This results in disabling PMC support and the following error message:

"AMD-Vi: Unable to read/write to IOMMU perf counter"

To workaround this issue, disable power gating temporarily by programming
the counter source to non-zero value while validating the counter,
and restore the prior state afterward.

Tested-by: Tj (Elloe Linux) 
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 45 ++--
 1 file changed, 34 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 83d8ab2aed9f..01da76dc1caa 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -254,6 +255,8 @@ static enum iommu_init_state init_state = IOMMU_START_STATE;
 static int amd_iommu_enable_interrupts(void);
 static int __init iommu_go_to_state(enum iommu_init_state state);
 static void init_device_table_dma(void);
+static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
+   u8 fxn, u64 *value, bool is_write);
 
 static bool amd_iommu_pre_enabled = true;
 
@@ -1712,13 +1715,11 @@ static int __init init_iommu_all(struct 
acpi_table_header *table)
return 0;
 }
 
-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
-   u8 fxn, u64 *value, bool is_write);
-
-static void init_iommu_perf_ctr(struct amd_iommu *iommu)
+static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
+   int retry;
struct pci_dev *pdev = iommu->dev;
-   u64 val = 0xabcd, val2 = 0, save_reg = 0;
+   u64 val = 0xabcd, val2 = 0, save_reg, save_src;
 
if (!iommu_feature(iommu, FEATURE_PC))
return;
@@ -1726,17 +1727,39 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu)
amd_iommu_pc_present = true;
 
/* save the value to restore, if writable */
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
+   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) ||
+       iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false))
goto pc_false;
 
-   /* Check if the performance counters can be written to */
-   if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
-   (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
-   (val != val2))
+   /*
+* Disable power gating by programming the performance counter
+* source to 20 (i.e. counts the reads and writes from/to IOMMU
+* Reserved Register [MMIO Offset 1FF8h] that are ignored.),
+* which never get incremented during this init phase.
+* (Note: The event is also deprecated.)
+*/
+   val = 20;
+   if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true))
goto pc_false;
 
+   /* Check if the performance counters can be written to */
+   val = 0xabcd;
+   for (retry = 5; retry; retry--) {
+   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) ||
+   iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) ||
+   val2)
+   break;
+
+   /* Wait about 20 msec for power gating to disable and retry. */
+   msleep(20);
+   }
+
/* restore */
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
+   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) ||
+   iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true))
+   goto pc_false;
+
+   if (val != val2)
goto pc_false;
 
pci_info(pdev, "IOMMU performance counters supported\n");
-- 
2.17.1



Re: AMD-Vi: Unable to read/write to IOMMU perf counter

2021-02-08 Thread Suravee Suthikulpanit

TJ,

Thanks for testing. I will submit this change upstream w/ you as Tested-by.

On 2/8/21 12:18 AM, Tj (Elloe Linux) wrote:

On 06/02/2021 04:02, Suravee Suthikulpanit wrote:
Would this be in any way related to the following from the same device:

kernel: pci :00:00.2: can't derive routing for PCI INT A
kernel: pci :00:00.2: PCI INT A: not connected


This is not related, but should not cause issues.

Thanks,
Suravee


Re: AMD-Vi: Unable to read/write to IOMMU perf counter

2021-02-05 Thread Suravee Suthikulpanit

Tj,

I have posted RFCv3 in the BZ 
https://bugzilla.kernel.org/show_bug.cgi?id=201753.

The RFCv3 patch adds logic to retry the check after a 20 msec wait for each retry
loop, since I have found that certain platforms take about 10 msec for the power
gating to disable.
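The retry scheme described here can be modeled outside the kernel as follows; all names are illustrative, and `usleep()` stands in for the kernel's `msleep()`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

static int attempts;              /* models time passing per attempt */
static int gate_lifts_after = 1;  /* attempts before power gating is off */

/* Write 'val' then read it back; matches only once power gating lifted. */
static bool pmc_write_readback(uint64_t val)
{
	(void)val;
	return ++attempts > gate_lifts_after;
}

/* Retry up to 5 times, waiting ~20 ms between attempts (as in RFCv3). */
static bool pmc_probe_with_retry(void)
{
	for (int retry = 5; retry; retry--) {
		if (pmc_write_readback(0xabcd))
			return true;
		usleep(20 * 1000);  /* stand-in for msleep(20) in the driver */
	}
	return false;
}
```

On a platform where gating lifts within the first wait, the probe succeeds on the second attempt instead of reporting the counters as broken.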


Please give this a try to see if this works better on your platform.

Thanks,
Suravee

On 2/4/21 1:25 PM, Tj (Elloe Linux) wrote:

On 02/02/2021 05:54, Suravee Suthikulpanit wrote:

Could you please try the attached patch to see if the problem still
persist.


Tested on top of commit 61556703b610 doesn't appear to have solved the
issue.



Linux version 5.11.0-rc6+ (tj@elloe000) (gcc (Ubuntu
9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubunt>
Command line: BOOT_IMAGE=/vmlinuz-5.11.0-rc6+
root=/dev/mapper/ELLOE000-rootfs ro acpi_osi=! "acpi_osi=Windows 20>
...
DMI: LENOVO 20NECTO1WW/20NECTO1WW, BIOS R11ET32W (1.12 ) 12/23/2019
...
AMD-Vi: ivrs, add hid:PNPD0040, uid:, rdevid:152
...
smpboot: CPU0: AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx (family:
0x17, model: 0x18, stepping: 0x1)
...
pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
pci :00:00.2: can't derive routing for PCI INT A
pci :00:00.2: PCI INT A: not connected
pci :00:01.0: Adding to iommu group 0
pci :00:01.1: Adding to iommu group 1
...
pci :00:00.2: AMD-Vi: Found IOMMU cap 0x40
pci :00:00.2: AMD-Vi: Extended features (0x4f77ef22294ada):
PPR NX GT IA GA PC GA_vAPIC
AMD-Vi: Interrupt remapping enabled
AMD-Vi: Virtual APIC enabled
AMD-Vi: Lazy IO/TLB flushing enabled
amd_uncore: 4  amd_df counters detected




Re: AMD-Vi: Unable to read/write to IOMMU perf counter

2021-02-01 Thread Suravee Suthikulpanit

Could you please try the attached patch to see if the problem still persist.

Thanks,
Suravee

On 1/25/21 4:24 PM, Tj (Elloe Linux) wrote:

Lenovo E495 reports:

pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
pci :00:00.2: can't derive routing for PCI INT A
pci :00:00.2: PCI INT A: not connected

I found an existing identical bug report that doesn't seem to have
gained any attention:

https://bugzilla.kernel.org/show_bug.cgi?id=201753

Linux version 5.11.0-rc4+ (tj@elloe000) (gcc (Ubuntu
9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #12
SMP PREEMPT Sun Jan 24 11:28:01 GMT 2021
Command line: BOOT_IMAGE=/vmlinuz-5.11.0-rc4+
root=/dev/mapper/ELLOE000-rootfs ro acpi_osi=! "acpi_osi=Windows 2016"
systemd.unified_cgroup_hierarchy=1 nosplash
...
DMI: LENOVO 20NECTO1WW/20NECTO1WW, BIOS R11ET32W (1.12 ) 12/23/2019
...
AMD-Vi: ivrs, add hid:PNPD0040, uid:, rdevid:152
...
smpboot: CPU0: AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx (family:
0x17, model: 0x18, stepping: 0x1)
...
pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
pci :00:00.2: can't derive routing for PCI INT A
pci :00:00.2: PCI INT A: not connected
pci :00:01.0: Adding to iommu group 0
pci :00:01.1: Adding to iommu group 1
pci :00:01.2: Adding to iommu group 2
pci :00:01.3: Adding to iommu group 3
pci :00:01.6: Adding to iommu group 4
pci :00:08.0: Adding to iommu group 5
pci :00:08.1: Adding to iommu group 6
pci :00:14.0: Adding to iommu group 7
pci :00:14.3: Adding to iommu group 7
pci :00:18.0: Adding to iommu group 8
pci :00:18.1: Adding to iommu group 8
pci :00:18.2: Adding to iommu group 8
pci :00:18.3: Adding to iommu group 8
pci :00:18.4: Adding to iommu group 8
pci :00:18.5: Adding to iommu group 8
pci :00:18.6: Adding to iommu group 8
pci :00:18.7: Adding to iommu group 8
pci :01:00.0: Adding to iommu group 9
pci :02:00.0: Adding to iommu group 10
pci :03:00.0: Adding to iommu group 11
pci :04:00.0: Adding to iommu group 12
pci :05:00.0: Adding to iommu group 13
pci :05:00.1: Adding to iommu group 14
pci :05:00.2: Adding to iommu group 14
pci :05:00.3: Adding to iommu group 14
pci :05:00.4: Adding to iommu group 14
pci :05:00.5: Adding to iommu group 14
pci :05:00.6: Adding to iommu group 14
pci :00:00.2: AMD-Vi: Found IOMMU cap 0x40
pci :00:00.2: AMD-Vi: Extended features (0x4f77ef22294ada):
  PPR NX GT IA GA PC GA_vAPIC
AMD-Vi: Interrupt remapping enabled
AMD-Vi: Virtual APIC enabled
AMD-Vi: Lazy IO/TLB flushing enabled
amd_uncore: 4  amd_df counters detected

From c103d631285cf376420e7f7869837302f2ac38c0 Mon Sep 17 00:00:00 2001
From: Suravee Suthikulpanit 
Date: Mon, 1 Feb 2021 18:38:26 -0600
Subject: [RFC PATCH] iommu/amd: Fix performance counter initialization

Certain AMD platforms enable power gating feature for IOMMU PMC,
which prevents the IOMMU driver from updating the counter while
trying to validate the PMC functionality in the init_iommu_perf_ctr().
This results in disabling PMC support and the following error message:

"AMD-Vi: Unable to write to IOMMU perf counter"

To workaround this issue, disable power gating temporarily by programming
the counter source to non-zero value while validating the counter,
and restore the prior state afterward.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Signed-off-by: Suravee Suthikulpanit 
---
NOTE:
I have tested this patch only on certain platforms. It might need more testing
coverage on other mobile and desktop platforms.

Thank you,
Suravee

 drivers/iommu/amd/init.c | 33 -
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 83d8ab2aed9f..edb885625e47 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -254,6 +254,8 @@ static enum iommu_init_state init_state = IOMMU_START_STATE;
 static int amd_iommu_enable_interrupts(void);
static int __init iommu_go_to_state(enum iommu_init_state state);

Re: [PATCH v4 00/13] iommu/amd: Add Generic IO Page Table Framework Support

2021-01-27 Thread Suravee Suthikulpanit




On 1/27/21 7:06 PM, Joerg Roedel wrote:

Hi Suravee,

On Tue, Dec 15, 2020 at 01:36:52AM -0600, Suravee Suthikulpanit wrote:
  

Suravee Suthikulpanit (13):
   iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
   iommu/amd: Prepare for generic IO page table framework
   iommu/amd: Move pt_root to struct amd_io_pgtable
   iommu/amd: Convert to using amd_io_pgtable
   iommu/amd: Declare functions as extern
   iommu/amd: Move IO page table related functions
   iommu/amd: Restructure code for freeing page table
   iommu/amd: Remove amd_iommu_domain_get_pgtable
   iommu/amd: Rename variables to be consistent with struct
 io_pgtable_ops
   iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
   iommu/amd: Introduce iommu_v1_iova_to_phys
   iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
   iommu/amd: Adopt IO page table framework for AMD IOMMU v1 page table


Applied this series, thanks for the work! Given testing goes well you
can consider this queued for 5.12.

Thanks,

Joerg



Thanks Joerg and Will, and welcome back!!!

Suravee


[PATCH v2] iommu/amd: Use IVHD EFR for early initialization of IOMMU features

2021-01-20 Thread Suravee Suthikulpanit
IOMMU Extended Feature Register (EFR) is used to communicate
the supported features for each IOMMU to the IOMMU driver.
This is normally read from the PCI MMIO register offset 0x30,
and used by the iommu_feature() helper function.

However, there are certain scenarios where the information is needed
prior to PCI initialization, and the iommu_feature() function is used
prematurely w/o warning. This has caused incorrect initialization of IOMMU.
This is the case for the commit 6d39bdee238f ("iommu/amd: Enforce 4k
mapping for certain IOMMU data structures")

Since, the EFR is also available in the IVHD header, and is available to
the driver prior to PCI initialization. Therefore, default to using
the IVHD EFR instead.

Fixes: 6d39bdee238f ("iommu/amd: Enforce 4k mapping for certain IOMMU data 
structures")
Reviewed-by: Robert Richter 
Tested-by: Brijesh Singh 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   |  7 ++--
 drivers/iommu/amd/amd_iommu_types.h |  4 +++
 drivers/iommu/amd/init.c| 56 +++--
 3 files changed, 60 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 6b8cbdf71714..b4adab698563 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -84,12 +84,9 @@ static inline bool is_rd890_iommu(struct pci_dev *pdev)
   (pdev->device == PCI_DEVICE_ID_RD890_IOMMU);
 }
 
-static inline bool iommu_feature(struct amd_iommu *iommu, u64 f)
+static inline bool iommu_feature(struct amd_iommu *iommu, u64 mask)
 {
-   if (!(iommu->cap & (1 << IOMMU_CAP_EFR)))
-   return false;
-
-   return !!(iommu->features & f);
+   return !!(iommu->features & mask);
 }
 
 static inline u64 iommu_virt_to_phys(void *vaddr)
diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 553587827771..1a0495dd5fcb 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -387,6 +387,10 @@
 #define IOMMU_CAP_NPCACHE 26
 #define IOMMU_CAP_EFR 27
 
+/* IOMMU IVINFO */
+#define IOMMU_IVINFO_OFFSET 36
+#define IOMMU_IVINFO_EFRSUP BIT(0)
+
 /* IOMMU Feature Reporting Field (for IVHD type 10h */
 #define IOMMU_FEAT_GASUP_SHIFT 6
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 6a1f7048dacc..83d8ab2aed9f 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -257,6 +257,8 @@ static void init_device_table_dma(void);
 
 static bool amd_iommu_pre_enabled = true;
 
+static u32 amd_iommu_ivinfo __initdata;
+
 bool translation_pre_enabled(struct amd_iommu *iommu)
 {
return (iommu->flags & AMD_IOMMU_FLAG_TRANS_PRE_ENABLED);
@@ -296,6 +298,18 @@ int amd_iommu_get_num_iommus(void)
return amd_iommus_present;
 }
 
+/*
+ * For IVHD type 0x11/0x40, EFR is also available via IVHD.
+ * Default to IVHD EFR since it is available sooner
+ * (i.e. before PCI init).
+ */
+static void __init early_iommu_features_init(struct amd_iommu *iommu,
+struct ivhd_header *h)
+{
+   if (amd_iommu_ivinfo & IOMMU_IVINFO_EFRSUP)
+   iommu->features = h->efr_reg;
+}
+
 /* Access to l1 and l2 indexed register spaces */
 
 static u32 iommu_read_l1(struct amd_iommu *iommu, u16 l1, u8 address)
@@ -1577,6 +1591,9 @@ static int __init init_iommu_one(struct amd_iommu *iommu, 
struct ivhd_header *h)
 
if (h->efr_reg & BIT(IOMMU_EFR_XTSUP_SHIFT))
amd_iommu_xt_mode = IRQ_REMAP_X2APIC_MODE;
+
+   early_iommu_features_init(iommu, h);
+
break;
default:
return -EINVAL;
@@ -1770,6 +1787,35 @@ static const struct attribute_group *amd_iommu_groups[] 
= {
NULL,
 };
 
+/*
+ * Note: IVHD 0x11 and 0x40 also contains exact copy
+ * of the IOMMU Extended Feature Register [MMIO Offset 0030h].
+ * Default to EFR in IVHD since it is available sooner (i.e. before PCI init).
+ */
+static void __init late_iommu_features_init(struct amd_iommu *iommu)
+{
+   u64 features;
+
+   if (!(iommu->cap & (1 << IOMMU_CAP_EFR)))
+   return;
+
+   /* read extended feature bits */
+   features = readq(iommu->mmio_base + MMIO_EXT_FEATURES);
+
+   if (!iommu->features) {
+   iommu->features = features;
+   return;
+   }
+
+   /*
+* Sanity check and warn if EFR values from
+* IVHD and MMIO conflict.
+*/
+   if (features != iommu->features)
+   pr_warn(FW_WARN "EFR mismatch. Use IVHD EFR (%#llx : %#llx).\n",
+   features, iommu->features);
+}
+
 static int __init iommu_init_pci(struct amd_iommu *iommu)
 {
int cap_ptr = iommu->cap_ptr;
@@ -1789,8 +1835,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)

Re: [PATCH] iommu/amd: Make use of EFR from IVHD when available

2021-01-20 Thread Suravee Suthikulpanit

I will send out v2 of this patch. Please ignore this v1.

Thanks,
Suravee

On 1/18/21 12:19 PM, Suravee Suthikulpanit wrote:

IOMMU Extended Feature Register (EFR) is used to communicate
the supported features for each IOMMU to the IOMMU driver.
This is normally read from the PCI MMIO register offset 0x30,
and used by the iommu_feature() helper function.

However, there are certain scenarios where the information is needed
prior to PCI initialization, and the iommu_feature() function is used
prematurely w/o warning. This has caused incorrect initialization of IOMMU.

The EFR is also available in the IVHD header, and is available to
the driver prior to PCI initialization. Therefore, default to using
the IVHD EFR instead.

Fixes: 6d39bdee238f ("iommu/amd: Enforce 4k mapping for certain IOMMU data 
structures")
Tested-by: Brijesh Singh 
Signed-off-by: Suravee Suthikulpanit 
---
  drivers/iommu/amd/amd_iommu.h   |  3 ++-
  drivers/iommu/amd/amd_iommu_types.h |  4 +++
  drivers/iommu/amd/init.c| 39 +++--
  3 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 6b8cbdf71714..0a89e9c4f7b3 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -86,7 +86,8 @@ static inline bool is_rd890_iommu(struct pci_dev *pdev)
  
  static inline bool iommu_feature(struct amd_iommu *iommu, u64 f)

  {
-   if (!(iommu->cap & (1 << IOMMU_CAP_EFR)))
+   /* features == 0 means EFR is not supported */
+   if (!iommu->features)
return false;
  
  	return !!(iommu->features & f);

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 553587827771..35331e458dd1 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -387,6 +387,10 @@
  #define IOMMU_CAP_NPCACHE 26
  #define IOMMU_CAP_EFR 27
  
+/* IOMMU IVINFO */

+#define IOMMU_IVINFO_OFFSET  36
+#define IOMMU_IVINFO_EFRSUP_SHIFT    0
+
  /* IOMMU Feature Reporting Field (for IVHD type 10h */
  #define IOMMU_FEAT_GASUP_SHIFT    6
  
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c

index 6a1f7048dacc..28b1d2feec96 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -257,6 +257,8 @@ static void init_device_table_dma(void);
  
  static bool amd_iommu_pre_enabled = true;
  
+static u32 amd_iommu_ivinfo;

+
  bool translation_pre_enabled(struct amd_iommu *iommu)
  {
return (iommu->flags & AMD_IOMMU_FLAG_TRANS_PRE_ENABLED);
@@ -1577,6 +1579,14 @@ static int __init init_iommu_one(struct amd_iommu 
*iommu, struct ivhd_header *h)
  
  		if (h->efr_reg & BIT(IOMMU_EFR_XTSUP_SHIFT))

amd_iommu_xt_mode = IRQ_REMAP_X2APIC_MODE;
+
+   /*
+* For IVHD type 0x11/0x40, EFR is also available via IVHD.
+* Default to IVHD EFR since it is available sooner
+* (i.e. before PCI init).
+*/
+   if (amd_iommu_ivinfo & (1 << IOMMU_IVINFO_EFRSUP_SHIFT))
+   iommu->features = h->efr_reg;
break;
default:
return -EINVAL;
@@ -1770,6 +1780,29 @@ static const struct attribute_group *amd_iommu_groups[] 
= {
NULL,
  };
  
+/*

+ * Note: IVHD 0x11 and 0x40 also contains exact copy
+ * of the IOMMU Extended Feature Register [MMIO Offset 0030h].
+ * Default to EFR in IVHD since it is available sooner (i.e. before PCI init).
+ * However, sanity check and warn if they conflict.
+ */
+static void __init iommu_init_features(struct amd_iommu *iommu)
+{
+   u64 features;
+
+   if (!(iommu->cap & (1 << IOMMU_CAP_EFR)))
+   return;
+
+   /* read extended feature bits */
+   features = readq(iommu->mmio_base + MMIO_EXT_FEATURES);
+
+   if (iommu->features && (features != iommu->features))
+   pr_err(FW_BUG "EFR mismatch. Use IVHD EFR (%#llx : %#llx).\n",
+  features, iommu->features);
+   else
+   iommu->features = features;
+}
+
  static int __init iommu_init_pci(struct amd_iommu *iommu)
  {
int cap_ptr = iommu->cap_ptr;
@@ -1789,8 +1822,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
if (!(iommu->cap & (1 << IOMMU_CAP_IOTLB)))
amd_iommu_iotlb_sup = false;
  
-	/* read extended feature bits */

-   iommu->features = readq(iommu->mmio_base + MMIO_EXT_FEATURES);
+   iommu_init_features(iommu);
  
  	if (iommu_feature(iommu, FEATURE_GT)) {

int glxval;
@@ -2661,6 +2693,9 @@ static int __init early_amd_iommu_init(void)
if (ret)
goto out;
  
+	/* Store IVRS IVinfo field. */

+   amd_iommu_ivinfo = *((u32 *)((u8 *)ivrs_base + IOMMU_IVINFO_OFFSET));

[PATCH] iommu/amd: Make use of EFR from IVHD when available

2021-01-17 Thread Suravee Suthikulpanit
IOMMU Extended Feature Register (EFR) is used to communicate
the supported features for each IOMMU to the IOMMU driver.
This is normally read from the PCI MMIO register offset 0x30,
and used by the iommu_feature() helper function.

However, there are certain scenarios where the information is needed
prior to PCI initialization, and the iommu_feature() function is used
prematurely w/o warning. This has caused incorrect initialization of IOMMU.

The EFR is also available in the IVHD header, and is available to
the driver prior to PCI initialization. Therefore, default to using
the IVHD EFR instead.

Fixes: 6d39bdee238f ("iommu/amd: Enforce 4k mapping for certain IOMMU data 
structures")
Tested-by: Brijesh Singh 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   |  3 ++-
 drivers/iommu/amd/amd_iommu_types.h |  4 +++
 drivers/iommu/amd/init.c| 39 +++--
 3 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 6b8cbdf71714..0a89e9c4f7b3 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -86,7 +86,8 @@ static inline bool is_rd890_iommu(struct pci_dev *pdev)
 
 static inline bool iommu_feature(struct amd_iommu *iommu, u64 f)
 {
-   if (!(iommu->cap & (1 << IOMMU_CAP_EFR)))
+   /* features == 0 means EFR is not supported */
+   if (!iommu->features)
return false;
 
return !!(iommu->features & f);
diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 553587827771..35331e458dd1 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -387,6 +387,10 @@
 #define IOMMU_CAP_NPCACHE 26
 #define IOMMU_CAP_EFR 27
 
+/* IOMMU IVINFO */
+#define IOMMU_IVINFO_OFFSET  36
+#define IOMMU_IVINFO_EFRSUP_SHIFT    0
+
 /* IOMMU Feature Reporting Field (for IVHD type 10h */
 #define IOMMU_FEAT_GASUP_SHIFT 6
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 6a1f7048dacc..28b1d2feec96 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -257,6 +257,8 @@ static void init_device_table_dma(void);
 
 static bool amd_iommu_pre_enabled = true;
 
+static u32 amd_iommu_ivinfo;
+
 bool translation_pre_enabled(struct amd_iommu *iommu)
 {
return (iommu->flags & AMD_IOMMU_FLAG_TRANS_PRE_ENABLED);
@@ -1577,6 +1579,14 @@ static int __init init_iommu_one(struct amd_iommu 
*iommu, struct ivhd_header *h)
 
if (h->efr_reg & BIT(IOMMU_EFR_XTSUP_SHIFT))
amd_iommu_xt_mode = IRQ_REMAP_X2APIC_MODE;
+
+   /*
+* For IVHD type 0x11/0x40, EFR is also available via IVHD.
+* Default to IVHD EFR since it is available sooner
+* (i.e. before PCI init).
+*/
+   if (amd_iommu_ivinfo & (1 << IOMMU_IVINFO_EFRSUP_SHIFT))
+   iommu->features = h->efr_reg;
break;
default:
return -EINVAL;
@@ -1770,6 +1780,29 @@ static const struct attribute_group *amd_iommu_groups[] 
= {
NULL,
 };
 
+/*
+ * Note: IVHD 0x11 and 0x40 also contains exact copy
+ * of the IOMMU Extended Feature Register [MMIO Offset 0030h].
+ * Default to EFR in IVHD since it is available sooner (i.e. before PCI init).
+ * However, sanity check and warn if they conflict.
+ */
+static void __init iommu_init_features(struct amd_iommu *iommu)
+{
+   u64 features;
+
+   if (!(iommu->cap & (1 << IOMMU_CAP_EFR)))
+   return;
+
+   /* read extended feature bits */
+   features = readq(iommu->mmio_base + MMIO_EXT_FEATURES);
+
+   if (iommu->features && (features != iommu->features))
+   pr_err(FW_BUG "EFR mismatch. Use IVHD EFR (%#llx : %#llx).\n",
+  features, iommu->features);
+   else
+   iommu->features = features;
+}
+
 static int __init iommu_init_pci(struct amd_iommu *iommu)
 {
int cap_ptr = iommu->cap_ptr;
@@ -1789,8 +1822,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
if (!(iommu->cap & (1 << IOMMU_CAP_IOTLB)))
amd_iommu_iotlb_sup = false;
 
-   /* read extended feature bits */
-   iommu->features = readq(iommu->mmio_base + MMIO_EXT_FEATURES);
+   iommu_init_features(iommu);
 
if (iommu_feature(iommu, FEATURE_GT)) {
int glxval;
@@ -2661,6 +2693,9 @@ static int __init early_amd_iommu_init(void)
if (ret)
goto out;
 
+   /* Store IVRS IVinfo field. */
+   amd_iommu_ivinfo = *((u32 *)((u8 *)ivrs_base + IOMMU_IVINFO_OFFSET));
+
amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);

Re: [PATCH v4 00/13] iommu/amd: Add Generic IO Page Table Framework Support

2021-01-04 Thread Suravee Suthikulpanit

Hi Joerg / Will,

Happy New Year!! Just want to follow up on this series.

Thanks,
Suravee

On 12/15/20 2:36 PM, Suravee Suthikulpanit wrote:

The framework allows callable implementation of IO page table.
This allows AMD IOMMU driver to switch between different types
of AMD IOMMU page tables (e.g. v1 vs. v2).

This series refactors the current implementation of AMD IOMMU v1 page table
to adopt the framework. There should be no functional change.
Subsequent series will introduce support for the AMD IOMMU v2 page table.
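The "callable implementation" idea — every page-table format exposing the same set of operations behind a pointer, so the driver can switch between v1 and v2 — can be illustrated with a minimal standalone sketch. This mirrors only the shape of the framework (an ops vtable selected at domain setup), not the kernel's actual `io_pgtable_ops` API; all names here are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Each page-table format provides the same set of operations. */
struct pgtable_ops {
	int (*map)(uint64_t iova, uint64_t paddr);
	uint64_t (*iova_to_phys)(uint64_t iova);
};

/* A trivial "v1" format: a single mapping stored as a fixed offset. */
static uint64_t v1_offset;

static int v1_map(uint64_t iova, uint64_t paddr)
{
	v1_offset = paddr - iova;  /* remember the one mapping */
	return 0;
}

static uint64_t v1_iova_to_phys(uint64_t iova)
{
	return iova + v1_offset;
}

static const struct pgtable_ops v1_ops = {
	.map          = v1_map,
	.iova_to_phys = v1_iova_to_phys,
};

/* Callers go through the ops pointer, so formats can be swapped. */
static const struct pgtable_ops *select_format(int version)
{
	return version == 1 ? &v1_ops : NULL;
}
```

A "v2" format would simply supply a second ops table; the code that maps and resolves addresses never changes, which is the point of the refactoring series.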

Thanks,
Suravee

Change from V3 
(https://lore.kernel.org/linux-iommu/20201004014549.16065-1-suravee.suthikulpa...@amd.com/)
   - Rebase to v5.10
   - Patch  2: Add struct iommu_flush_ops (previously in patch 13 of v3)
   - Patch  7: Consolidate logic into v1_free_pgtable() instead of 
amd_iommu_free_pgtable()
   - Patch 12: Check ops->[map|unmap] before calling.
   - Patch 13: Setup page table when allocating domain (instead of when 
attaching device).

Change from V2 
(https://lore.kernel.org/lkml/835c0d46-ed96-9fbe-856a-777dcffac...@amd.com/T/#t)
   - Patch  2: Introduce helper function io_pgtable_cfg_to_data.
   - Patch 13: Put back the struct iommu_flush_ops since patch v2 would run into
 NULL pointer bug when calling free_io_pgtable_ops if not defined.

Change from V1 (https://lkml.org/lkml/2020/9/23/251)
   - Do not specify struct io_pgtable_cfg.coherent_walk, since it is
 not currently used. (per Robin)
   - Remove unused struct iommu_flush_ops.  (patch 2/13)
   - Move amd_iommu_setup_io_pgtable_ops to iommu.c instead of io_pgtable.c
 patch 13/13)

Suravee Suthikulpanit (13):
   iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
   iommu/amd: Prepare for generic IO page table framework
   iommu/amd: Move pt_root to struct amd_io_pgtable
   iommu/amd: Convert to using amd_io_pgtable
   iommu/amd: Declare functions as extern
   iommu/amd: Move IO page table related functions
   iommu/amd: Restructure code for freeing page table
   iommu/amd: Remove amd_iommu_domain_get_pgtable
   iommu/amd: Rename variables to be consistent with struct
 io_pgtable_ops
   iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
   iommu/amd: Introduce iommu_v1_iova_to_phys
   iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
   iommu/amd: Adopt IO page table framework for AMD IOMMU v1 page table

  drivers/iommu/amd/Kconfig   |   1 +
  drivers/iommu/amd/Makefile  |   2 +-
  drivers/iommu/amd/amd_iommu.h   |  22 +
  drivers/iommu/amd/amd_iommu_types.h |  43 +-
  drivers/iommu/amd/init.c|   2 +
  drivers/iommu/amd/io_pgtable.c  | 564 +++
  drivers/iommu/amd/iommu.c   | 672 
  drivers/iommu/io-pgtable.c  |   3 +
  include/linux/io-pgtable.h  |   2 +
  9 files changed, 707 insertions(+), 604 deletions(-)
  create mode 100644 drivers/iommu/amd/io_pgtable.c




[PATCH v4 10/13] iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable

2020-12-15 Thread Suravee Suthikulpanit
Simplify the fetch_pte function. There is no functional change.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h  |  2 +-
 drivers/iommu/amd/io_pgtable.c | 13 +++--
 drivers/iommu/amd/iommu.c  |  4 +++-
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 76276d9e463c..83ca822c5349 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -143,7 +143,7 @@ extern int iommu_map_page(struct protection_domain *dom,
 extern unsigned long iommu_unmap_page(struct protection_domain *dom,
  unsigned long bus_addr,
  unsigned long page_size);
-extern u64 *fetch_pte(struct protection_domain *domain,
+extern u64 *fetch_pte(struct amd_io_pgtable *pgtable,
  unsigned long address,
  unsigned long *page_size);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 35dd9153e6b7..87184b6cee0f 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -317,7 +317,7 @@ static u64 *alloc_pte(struct protection_domain *domain,
  * This function checks if there is a PTE for a given dma address. If
  * there is one, it returns the pointer to it.
  */
-u64 *fetch_pte(struct protection_domain *domain,
+u64 *fetch_pte(struct amd_io_pgtable *pgtable,
   unsigned long address,
   unsigned long *page_size)
 {
@@ -326,11 +326,11 @@ u64 *fetch_pte(struct protection_domain *domain,
 
*page_size = 0;
 
-   if (address > PM_LEVEL_SIZE(domain->iop.mode))
+   if (address > PM_LEVEL_SIZE(pgtable->mode))
return NULL;
 
-   level  =  domain->iop.mode - 1;
-   pte= >iop.root[PM_LEVEL_INDEX(level, address)];
+   level  =  pgtable->mode - 1;
+   pte= >root[PM_LEVEL_INDEX(level, address)];
*page_size =  PTE_LEVEL_PAGE_SIZE(level);
 
while (level > 0) {
@@ -465,6 +465,8 @@ unsigned long iommu_unmap_page(struct protection_domain 
*dom,
   unsigned long iova,
   unsigned long size)
 {
+   struct io_pgtable_ops *ops = &dom->iop.iop.ops;
+   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
unsigned long long unmapped;
unsigned long unmap_size;
u64 *pte;
@@ -474,8 +476,7 @@ unsigned long iommu_unmap_page(struct protection_domain 
*dom,
unmapped = 0;
 
while (unmapped < size) {
-   pte = fetch_pte(dom, iova, &unmap_size);
-
+   pte = fetch_pte(pgtable, iova, &unmap_size);
if (pte) {
int i, count;
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2963a37b7c16..76f61dd6b89f 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2100,13 +2100,15 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
  dma_addr_t iova)
 {
struct protection_domain *domain = to_pdomain(dom);
+   struct io_pgtable_ops *ops = &domain->iop.iop.ops;
+   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
unsigned long offset_mask, pte_pgsize;
u64 *pte, __pte;
 
if (domain->iop.mode == PAGE_MODE_NONE)
return iova;
 
-   pte = fetch_pte(domain, iova, &pte_pgsize);
+   pte = fetch_pte(pgtable, iova, &pte_pgsize);
 
if (!pte || !IOMMU_PTE_PRESENT(*pte))
return 0;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v4 12/13] iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page

2020-12-15 Thread Suravee Suthikulpanit
These implement map and unmap for the AMD IOMMU v1 page table, which
will be used by the IO page table framework.

Also clean up unused extern function declarations.
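The indirection introduced here — callers invoke a function pointer on an ops struct that is embedded in a larger object, and the callback recovers the enclosing object via `offsetof()` arithmetic, as `io_pgtable_ops_to_domain()` does — can be sketched outside the kernel. All names below (`toy_*`) are invented for illustration; only the pattern matches the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (hypothetical names). */
struct toy_pgtable_ops {
	int (*map)(struct toy_pgtable_ops *ops, unsigned long iova,
		   unsigned long paddr, unsigned long size);
};

struct toy_domain {
	unsigned long last_iova;    /* records the last mapped iova */
	struct toy_pgtable_ops ops; /* embedded ops, as in struct amd_io_pgtable */
};

/* Recover the enclosing domain from the ops pointer, container_of()-style. */
#define toy_ops_to_domain(p) \
	((struct toy_domain *)((char *)(p) - offsetof(struct toy_domain, ops)))

static int toy_map(struct toy_pgtable_ops *ops, unsigned long iova,
		   unsigned long paddr, unsigned long size)
{
	struct toy_domain *d = toy_ops_to_domain(ops);

	(void)paddr; (void)size;
	d->last_iova = iova;	/* state lives on the recovered domain */
	return 0;
}

static struct toy_domain dom = { .ops = { .map = toy_map } };

int toy_do_map(unsigned long iova)
{
	/* Callers only hold the ops pointer; the callback finds its domain. */
	return dom.ops.map(&dom.ops, iova, 0x1000, 0x1000);
}
```

This is why the new `iommu_v1_map_page()` takes `struct io_pgtable_ops *` as its first argument instead of the protection domain.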

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h  | 13 -
 drivers/iommu/amd/io_pgtable.c | 25 -
 drivers/iommu/amd/iommu.c  | 13 -
 3 files changed, 20 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 83ca822c5349..3770b1a4d51c 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -133,19 +133,6 @@ void amd_iommu_apply_ivrs_quirks(void);
 static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif
 
-/* TODO: These are temporary and will be removed once fully transition */
-extern int iommu_map_page(struct protection_domain *dom,
- unsigned long bus_addr,
- unsigned long phys_addr,
- unsigned long page_size,
- int prot,
- gfp_t gfp);
-extern unsigned long iommu_unmap_page(struct protection_domain *dom,
- unsigned long bus_addr,
- unsigned long page_size);
-extern u64 *fetch_pte(struct amd_io_pgtable *pgtable,
- unsigned long address,
- unsigned long *page_size);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode);
 #endif
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index a293b69b38b9..d91964e98d58 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -317,9 +317,9 @@ static u64 *alloc_pte(struct protection_domain *domain,
  * This function checks if there is a PTE for a given dma address. If
  * there is one, it returns the pointer to it.
  */
-u64 *fetch_pte(struct amd_io_pgtable *pgtable,
-  unsigned long address,
-  unsigned long *page_size)
+static u64 *fetch_pte(struct amd_io_pgtable *pgtable,
+ unsigned long address,
+ unsigned long *page_size)
 {
int level;
u64 *pte;
@@ -392,13 +392,10 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
  * supporting all features of AMD IOMMU page tables like level skipping
  * and full 64 bit address spaces.
  */
-int iommu_map_page(struct protection_domain *dom,
-  unsigned long iova,
-  unsigned long paddr,
-  unsigned long size,
-  int prot,
-  gfp_t gfp)
+static int iommu_v1_map_page(struct io_pgtable_ops *ops, unsigned long iova,
+ phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
 {
+   struct protection_domain *dom = io_pgtable_ops_to_domain(ops);
struct page *freelist = NULL;
bool updated = false;
u64 __pte, *pte;
@@ -461,11 +458,11 @@ int iommu_map_page(struct protection_domain *dom,
return ret;
 }
 
-unsigned long iommu_unmap_page(struct protection_domain *dom,
-  unsigned long iova,
-  unsigned long size)
+static unsigned long iommu_v1_unmap_page(struct io_pgtable_ops *ops,
+ unsigned long iova,
+ size_t size,
+ struct iommu_iotlb_gather *gather)
 {
-   struct io_pgtable_ops *ops = &dom->iop.iop.ops;
struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
unsigned long long unmapped;
unsigned long unmap_size;
@@ -554,6 +551,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
cfg->oas= IOMMU_OUT_ADDR_BIT_SIZE,
	cfg->tlb= &v1_flush_ops;
 
+   pgtable->iop.ops.map  = iommu_v1_map_page;
+   pgtable->iop.ops.unmap= iommu_v1_unmap_page;
pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
 
	return &pgtable->iop;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 29b7fefc8485..1f04b251f0c6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2066,8 +2066,9 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
 gfp_t gfp)
 {
struct protection_domain *domain = to_pdomain(dom);
+   struct io_pgtable_ops *ops = &domain->iop.iop.ops;
int prot = 0;
-   int ret;
+   int ret = -EINVAL;
 
if (domain->iop.mode == PAGE_MODE_NONE)
return -EINVAL;
@@ -2077,9 +2078,10 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
if (iommu_prot & IOMMU_WRITE)
prot |= IOMMU_PROT_IW;
 
-   ret = iommu_map_page(domain, io

[PATCH v4 09/13] iommu/amd: Rename variables to be consistent with struct io_pgtable_ops

2020-12-15 Thread Suravee Suthikulpanit
There is no functional change.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/io_pgtable.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index d4d131e43dcd..35dd9153e6b7 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -393,9 +393,9 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
  * and full 64 bit address spaces.
  */
 int iommu_map_page(struct protection_domain *dom,
-  unsigned long bus_addr,
-  unsigned long phys_addr,
-  unsigned long page_size,
+  unsigned long iova,
+  unsigned long paddr,
+  unsigned long size,
   int prot,
   gfp_t gfp)
 {
@@ -404,15 +404,15 @@ int iommu_map_page(struct protection_domain *dom,
u64 __pte, *pte;
int ret, i, count;
 
-   BUG_ON(!IS_ALIGNED(bus_addr, page_size));
-   BUG_ON(!IS_ALIGNED(phys_addr, page_size));
+   BUG_ON(!IS_ALIGNED(iova, size));
+   BUG_ON(!IS_ALIGNED(paddr, size));
 
ret = -EINVAL;
if (!(prot & IOMMU_PROT_MASK))
goto out;
 
-   count = PAGE_SIZE_PTE_COUNT(page_size);
-   pte   = alloc_pte(dom, bus_addr, page_size, NULL, gfp, );
+   count = PAGE_SIZE_PTE_COUNT(size);
+   pte   = alloc_pte(dom, iova, size, NULL, gfp, );
 
ret = -ENOMEM;
if (!pte)
@@ -425,10 +425,10 @@ int iommu_map_page(struct protection_domain *dom,
updated = true;
 
if (count > 1) {
-   __pte = PAGE_SIZE_PTE(__sme_set(phys_addr), page_size);
+   __pte = PAGE_SIZE_PTE(__sme_set(paddr), size);
__pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_PR | IOMMU_PTE_FC;
} else
-   __pte = __sme_set(phys_addr) | IOMMU_PTE_PR | IOMMU_PTE_FC;
+   __pte = __sme_set(paddr) | IOMMU_PTE_PR | IOMMU_PTE_FC;
 
if (prot & IOMMU_PROT_IR)
__pte |= IOMMU_PTE_IR;
@@ -462,20 +462,19 @@ int iommu_map_page(struct protection_domain *dom,
 }
 
 unsigned long iommu_unmap_page(struct protection_domain *dom,
-  unsigned long bus_addr,
-  unsigned long page_size)
+  unsigned long iova,
+  unsigned long size)
 {
unsigned long long unmapped;
unsigned long unmap_size;
u64 *pte;
 
-   BUG_ON(!is_power_of_2(page_size));
+   BUG_ON(!is_power_of_2(size));
 
unmapped = 0;
 
-   while (unmapped < page_size) {
-
-   pte = fetch_pte(dom, bus_addr, &unmap_size);
+   while (unmapped < size) {
+   pte = fetch_pte(dom, iova, &unmap_size);
 
if (pte) {
int i, count;
@@ -485,7 +484,7 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
pte[i] = 0ULL;
}
 
-   bus_addr  = (bus_addr & ~(unmap_size - 1)) + unmap_size;
+   iova = (iova & ~(unmap_size - 1)) + unmap_size;
unmapped += unmap_size;
}
 
-- 
2.17.1



[PATCH v4 08/13] iommu/amd: Remove amd_iommu_domain_get_pgtable

2020-12-14 Thread Suravee Suthikulpanit
Since the IO page table root and mode parameters have been moved into
struct amd_io_pgtable, the function is no longer needed. Therefore,
remove it along with the struct domain_pgtable.
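The getter can go away because `pt_root` packs the root pointer and the 3-bit paging mode into a single 64-bit value, which `amd_iommu_domain_set_pt_root()` now decodes in place. A userspace sketch of that encoding — the `pt_root_*` names are invented and 4K pages are assumed; only the bit layout (root in the page-aligned bits, mode in the lowest 3 bits) follows the patch:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_MASK (~0xfffULL)	/* 4K pages, like the kernel's PAGE_MASK */

/* Pack a page-aligned root pointer and a 3-bit mode into one u64. */
static uint64_t pt_root_encode(uint64_t root, int mode)
{
	return (root & TOY_PAGE_MASK) | ((uint64_t)mode & 7);
}

/* Decode the root: mask off the mode bits. */
static uint64_t pt_root_to_root(uint64_t pt_root)
{
	return pt_root & TOY_PAGE_MASK;
}

/* Decode the mode: lowest 3 bits encode the pgtable mode. */
static int pt_root_to_mode(uint64_t pt_root)
{
	return (int)(pt_root & 7);
}
```

Because the decode is two mask operations, caching the decoded `root`/`mode` alongside the atomic `pt_root` (as the new `amd_iommu_domain_set_pt_root()` does) costs nothing on the store side.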

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   |  4 ++--
 drivers/iommu/amd/amd_iommu_types.h |  6 -
 drivers/iommu/amd/io_pgtable.c  | 36 ++---
 drivers/iommu/amd/iommu.c   | 34 ---
 4 files changed, 19 insertions(+), 61 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 91d098003f12..76276d9e463c 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -110,6 +110,8 @@ static inline
 void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
 {
	atomic64_set(&domain->iop.pt_root, root);
+   domain->iop.root = (u64 *)(root & PAGE_MASK);
+   domain->iop.mode = root & 7; /* lowest 3 bits encode pgtable mode */
 }
 
 static inline
@@ -144,8 +146,6 @@ extern unsigned long iommu_unmap_page(struct protection_domain *dom,
 extern u64 *fetch_pte(struct protection_domain *domain,
  unsigned long address,
  unsigned long *page_size);
-extern void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
-struct domain_pgtable *pgtable);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode);
 #endif
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 7c971c76d685..6897567d307e 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -518,12 +518,6 @@ struct protection_domain {
unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
 };
 
-/* For decocded pt_root */
-struct domain_pgtable {
-   int mode;
-   u64 *root;
-};
-
 /*
  * Structure where we save information about one hardware AMD IOMMU in the
  * system.
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index dc674e79ddf0..d4d131e43dcd 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -184,30 +184,27 @@ static bool increase_address_space(struct protection_domain *domain,
   unsigned long address,
   gfp_t gfp)
 {
-   struct domain_pgtable pgtable;
unsigned long flags;
bool ret = true;
u64 *pte;
 
	spin_lock_irqsave(&domain->lock, flags);
 
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-   if (address <= PM_LEVEL_SIZE(pgtable.mode))
+   if (address <= PM_LEVEL_SIZE(domain->iop.mode))
goto out;
 
ret = false;
-   if (WARN_ON_ONCE(pgtable.mode == PAGE_MODE_6_LEVEL))
+   if (WARN_ON_ONCE(domain->iop.mode == PAGE_MODE_6_LEVEL))
goto out;
 
pte = (void *)get_zeroed_page(gfp);
if (!pte)
goto out;
 
-   *pte = PM_LEVEL_PDE(pgtable.mode, iommu_virt_to_phys(pgtable.root));
+   *pte = PM_LEVEL_PDE(domain->iop.mode, iommu_virt_to_phys(domain->iop.root));
 
-   pgtable.root  = pte;
-   pgtable.mode += 1;
+   domain->iop.root  = pte;
+   domain->iop.mode += 1;
amd_iommu_update_and_flush_device_table(domain);
amd_iommu_domain_flush_complete(domain);
 
@@ -215,7 +212,7 @@ static bool increase_address_space(struct protection_domain *domain,
 * Device Table needs to be updated and flushed before the new root can
 * be published.
 */
-   amd_iommu_domain_set_pgtable(domain, pte, pgtable.mode);
+   amd_iommu_domain_set_pgtable(domain, pte, domain->iop.mode);
 
ret = true;
 
@@ -232,29 +229,23 @@ static u64 *alloc_pte(struct protection_domain *domain,
  gfp_t gfp,
  bool *updated)
 {
-   struct domain_pgtable pgtable;
int level, end_lvl;
u64 *pte, *page;
 
BUG_ON(!is_power_of_2(page_size));
 
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-   while (address > PM_LEVEL_SIZE(pgtable.mode)) {
+   while (address > PM_LEVEL_SIZE(domain->iop.mode)) {
/*
 * Return an error if there is no memory to update the
 * page-table.
 */
if (!increase_address_space(domain, address, gfp))
return NULL;
-
-   /* Read new values to check if update was successful */
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
}
 
 
-   level   = pgtable.mode - 1;
-   pte = &pgtable.root[PM_LEVEL_INDEX(level, address)];
+   level   = domain->iop.mode - 1;
+   pte = &domain->iop.root[PM_LEVEL_INDEX(level, address)];
address = PAGE_SIZE_ALIGN(address, page_size);
end

[PATCH v4 13/13] iommu/amd: Adopt IO page table framework for AMD IOMMU v1 page table

2020-12-14 Thread Suravee Suthikulpanit
Switch to using the IO page table framework for the AMD IOMMU v1 page table.
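The allocation path now picks a page-table implementation by format and fails cleanly for formats the driver does not implement. A minimal sketch of that dispatch shape — the `toy_*` enum and function names are invented, not the kernel's; only the switch-with-`-EINVAL`-default structure mirrors `protection_domain_alloc()`:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical format ids standing in for enum io_pgtable_fmt values. */
enum toy_fmt { TOY_V1, TOY_V2 };

/*
 * Dispatch on the configured format; unsupported formats fall through
 * to the default arm and return -EINVAL, so callers can bail out
 * before any page-table state is allocated.
 */
static int toy_domain_init(enum toy_fmt fmt)
{
	switch (fmt) {
	case TOY_V1:
		return 0;		/* v1 table set up */
	default:
		return -EINVAL;		/* unsupported page table format */
	}
}
```

The same default arm is what lets later patches add a v2 case without touching the error path.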

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/init.c  |  2 ++
 drivers/iommu/amd/iommu.c | 48 ++-
 3 files changed, 39 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 3770b1a4d51c..91452e0ff072 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -36,6 +36,7 @@ extern void amd_iommu_disable(void);
 extern int amd_iommu_reenable(int);
 extern int amd_iommu_enable_faulting(void);
 extern int amd_iommu_guest_ir;
+extern enum io_pgtable_fmt amd_iommu_pgtable;
 
 /* IOMMUv2 specific functions */
 struct iommu_domain;
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 23a790f8f550..5fb4bea14cc4 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -147,6 +147,8 @@ struct ivmd_header {
 bool amd_iommu_dump;
 bool amd_iommu_irq_remap __read_mostly;
 
+enum io_pgtable_fmt amd_iommu_pgtable = AMD_IOMMU_V1;
+
 int amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_VAPIC;
 static int amd_iommu_xt_mode = IRQ_REMAP_XAPIC_MODE;
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 1f04b251f0c6..571e8806e4a1 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1901,7 +1902,7 @@ static void protection_domain_free(struct protection_domain *domain)
kfree(domain);
 }
 
-static int protection_domain_init(struct protection_domain *domain, int mode)
+static int protection_domain_init_v1(struct protection_domain *domain, int 
mode)
 {
u64 *pt_root = NULL;
 
@@ -1924,34 +1925,55 @@ static int protection_domain_init(struct protection_domain *domain, int mode)
return 0;
 }
 
-static struct protection_domain *protection_domain_alloc(int mode)
+static struct protection_domain *protection_domain_alloc(unsigned int type)
 {
+   struct io_pgtable_ops *pgtbl_ops;
struct protection_domain *domain;
+   int pgtable = amd_iommu_pgtable;
+   int mode = DEFAULT_PGTABLE_LEVEL;
+   int ret;
 
domain = kzalloc(sizeof(*domain), GFP_KERNEL);
if (!domain)
return NULL;
 
-   if (protection_domain_init(domain, mode))
+   /*
+* Force IOMMU v1 page table when iommu=pt and
+* when allocating domain for pass-through devices.
+*/
+   if (type == IOMMU_DOMAIN_IDENTITY) {
+   pgtable = AMD_IOMMU_V1;
+   mode = PAGE_MODE_NONE;
+   } else if (type == IOMMU_DOMAIN_UNMANAGED) {
+   pgtable = AMD_IOMMU_V1;
+   }
+
+   switch (pgtable) {
+   case AMD_IOMMU_V1:
+   ret = protection_domain_init_v1(domain, mode);
+   break;
+   default:
+   ret = -EINVAL;
+   }
+
+   if (ret)
goto out_err;
 
-   return domain;
+   pgtbl_ops = alloc_io_pgtable_ops(pgtable, &domain->iop.pgtbl_cfg, domain);
+   if (!pgtbl_ops)
+   goto out_err;
 
+   return domain;
 out_err:
kfree(domain);
-
return NULL;
 }
 
 static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
 {
struct protection_domain *domain;
-   int mode = DEFAULT_PGTABLE_LEVEL;
-
-   if (type == IOMMU_DOMAIN_IDENTITY)
-   mode = PAGE_MODE_NONE;
 
-   domain = protection_domain_alloc(mode);
+   domain = protection_domain_alloc(type);
if (!domain)
return NULL;
 
@@ -2070,7 +2092,8 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
int prot = 0;
int ret = -EINVAL;
 
-   if (domain->iop.mode == PAGE_MODE_NONE)
+   if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+   (domain->iop.mode == PAGE_MODE_NONE))
return -EINVAL;
 
if (iommu_prot & IOMMU_READ)
@@ -2093,7 +2116,8 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
struct protection_domain *domain = to_pdomain(dom);
	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
 
-   if (domain->iop.mode == PAGE_MODE_NONE)
+   if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+   (domain->iop.mode == PAGE_MODE_NONE))
return 0;
 
return (ops->unmap) ? ops->unmap(ops, iova, page_size, gather) : 0;
-- 
2.17.1



[PATCH v4 01/13] iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline

2020-12-14 Thread Suravee Suthikulpanit
Move the function to header file to allow inclusion in other files.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h | 13 +
 drivers/iommu/amd/iommu.c | 10 --
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 6b8cbdf71714..0817bc732d1a 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -102,6 +102,19 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
return phys_to_virt(__sme_clr(paddr));
 }
 
+static inline
+void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
+{
+   atomic64_set(&domain->pt_root, root);
+}
+
+static inline
+void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
+{
+   amd_iommu_domain_set_pt_root(domain, 0);
+}
+
+
 extern bool translation_pre_enabled(struct amd_iommu *iommu);
 extern bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
 struct device *dev);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b9cf59443843..7f6b0f60b958 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -161,16 +161,6 @@ static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
 }
 
-static void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
-{
-   atomic64_set(&domain->pt_root, root);
-}
-
-static void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
-{
-   amd_iommu_domain_set_pt_root(domain, 0);
-}
-
 static void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode)
 {
-- 
2.17.1



[PATCH v4 11/13] iommu/amd: Introduce iommu_v1_iova_to_phys

2020-12-14 Thread Suravee Suthikulpanit
This implements iova_to_phys for the AMD IOMMU v1 page table,
which will be used by the IO page table framework.
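The translation itself is plain mask arithmetic: the PTE supplies the physical base and the page size it covers, and the IOVA supplies the offset within that page. A standalone sketch of the same computation — the `toy_` prefix marks invented names; the mask logic is exactly what `iommu_v1_iova_to_phys()` does after `fetch_pte()` returns:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Translate an IOVA given the physical base stored in a PTE and the
 * page size that PTE covers. pte_pgsize must be a power of two, so
 * (pte_pgsize - 1) is the in-page offset mask.
 */
static uint64_t toy_iova_to_phys(uint64_t pte_phys_base, uint64_t pte_pgsize,
				 uint64_t iova)
{
	uint64_t offset_mask = pte_pgsize - 1;

	/* High bits from the PTE, low (offset) bits from the IOVA. */
	return (pte_phys_base & ~offset_mask) | (iova & offset_mask);
}
```

Because the page size comes back from the table walk, the same expression handles 4K leaves and large (2M/1G) mappings alike.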

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/io_pgtable.c | 22 ++
 drivers/iommu/amd/iommu.c  | 16 +---
 2 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 87184b6cee0f..a293b69b38b9 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -494,6 +494,26 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
return unmapped;
 }
 
+static phys_addr_t iommu_v1_iova_to_phys(struct io_pgtable_ops *ops, unsigned long iova)
+{
+   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
+   unsigned long offset_mask, pte_pgsize;
+   u64 *pte, __pte;
+
+   if (pgtable->mode == PAGE_MODE_NONE)
+   return iova;
+
+   pte = fetch_pte(pgtable, iova, &pte_pgsize);
+
+   if (!pte || !IOMMU_PTE_PRESENT(*pte))
+   return 0;
+
+   offset_mask = pte_pgsize - 1;
+   __pte   = __sme_clr(*pte & PM_ADDR_MASK);
+
+   return (__pte & ~offset_mask) | (iova & offset_mask);
+}
+
 /*
  * 
  */
@@ -534,6 +554,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
cfg->oas= IOMMU_OUT_ADDR_BIT_SIZE,
	cfg->tlb= &v1_flush_ops;
 
+   pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
+
	return &pgtable->iop;
 }
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 76f61dd6b89f..29b7fefc8485 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2101,22 +2101,8 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 {
struct protection_domain *domain = to_pdomain(dom);
	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
-   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
-   unsigned long offset_mask, pte_pgsize;
-   u64 *pte, __pte;
 
-   if (domain->iop.mode == PAGE_MODE_NONE)
-   return iova;
-
-   pte = fetch_pte(pgtable, iova, &pte_pgsize);
-
-   if (!pte || !IOMMU_PTE_PRESENT(*pte))
-   return 0;
-
-   offset_mask = pte_pgsize - 1;
-   __pte   = __sme_clr(*pte & PM_ADDR_MASK);
-
-   return (__pte & ~offset_mask) | (iova & offset_mask);
+   return ops->iova_to_phys(ops, iova);
 }
 
 static bool amd_iommu_capable(enum iommu_cap cap)
-- 
2.17.1



[PATCH v4 06/13] iommu/amd: Move IO page table related functions

2020-12-14 Thread Suravee Suthikulpanit
Prepare to migrate to the IO page table framework.
There is no functional change.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h  |  18 ++
 drivers/iommu/amd/io_pgtable.c | 473 
 drivers/iommu/amd/iommu.c  | 476 +
 3 files changed, 493 insertions(+), 474 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index bf29ab8c99f0..1bad42a3c73c 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -131,4 +131,22 @@ void amd_iommu_apply_ivrs_quirks(void);
 static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif
 
+/* TODO: These are temporary and will be removed once fully transition */
+extern void free_pagetable(struct domain_pgtable *pgtable);
+extern int iommu_map_page(struct protection_domain *dom,
+ unsigned long bus_addr,
+ unsigned long phys_addr,
+ unsigned long page_size,
+ int prot,
+ gfp_t gfp);
+extern unsigned long iommu_unmap_page(struct protection_domain *dom,
+ unsigned long bus_addr,
+ unsigned long page_size);
+extern u64 *fetch_pte(struct protection_domain *domain,
+ unsigned long address,
+ unsigned long *page_size);
+extern void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
+struct domain_pgtable *pgtable);
+extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
+u64 *root, int mode);
 #endif
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index aedf2c932c40..345e9bc81fde 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -50,6 +50,479 @@ static const struct iommu_flush_ops v1_flush_ops = {
.tlb_add_page   = v1_tlb_add_page,
 };
 
+/*
+ * Helper function to get the first pte of a large mapping
+ */
+static u64 *first_pte_l7(u64 *pte, unsigned long *page_size,
+unsigned long *count)
+{
+   unsigned long pte_mask, pg_size, cnt;
+   u64 *fpte;
+
+   pg_size  = PTE_PAGE_SIZE(*pte);
+   cnt  = PAGE_SIZE_PTE_COUNT(pg_size);
+   pte_mask = ~((cnt << 3) - 1);
+   fpte = (u64 *)(((unsigned long)pte) & pte_mask);
+
+   if (page_size)
+   *page_size = pg_size;
+
+   if (count)
+   *count = cnt;
+
+   return fpte;
+}
+
+/****************************************************************************
+ *
+ * The functions below are used the create the page table mappings for
+ * unity mapped regions.
+ *
+ ****************************************************************************/
+
+static void free_page_list(struct page *freelist)
+{
+   while (freelist != NULL) {
+   unsigned long p = (unsigned long)page_address(freelist);
+
+   freelist = freelist->freelist;
+   free_page(p);
+   }
+}
+
+static struct page *free_pt_page(unsigned long pt, struct page *freelist)
+{
+   struct page *p = virt_to_page((void *)pt);
+
+   p->freelist = freelist;
+
+   return p;
+}
+
+#define DEFINE_FREE_PT_FN(LVL, FN)                                             \
+static struct page *free_pt_##LVL (unsigned long __pt, struct page *freelist)  \
+{                                                                              \
+   unsigned long p;                                                            \
+   u64 *pt;                                                                    \
+   int i;                                                                      \
+                                                                               \
+   pt = (u64 *)__pt;                                                           \
+                                                                               \
+   for (i = 0; i < 512; ++i) {                                                 \
+       /* PTE present? */                                                      \
+       if (!IOMMU_PTE_PRESENT(pt[i]))                                          \
+           continue;                                                           \
+                                                                               \
+       /* Large PTE? */                                                        \
+       if (PM_PTE_LEVEL(pt[i]) == 0 ||                                         \
+           PM_PTE_LEVEL(pt[i]) == 7)                                           \
+           continue;                                                           \
+                                                                               \
+   p = (unsigned long)

[PATCH v4 05/13] iommu/amd: Declare functions as extern

2020-12-14 Thread Suravee Suthikulpanit
Move the declarations to a header file so that they can be included
across multiple files. There is no functional change.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h |  3 +++
 drivers/iommu/amd/iommu.c | 39 +--
 2 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index bf9723b35e77..bf29ab8c99f0 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -57,6 +57,9 @@ extern int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
 extern int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid,
u64 address);
 extern void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
+extern void amd_iommu_domain_update(struct protection_domain *domain);
+extern void amd_iommu_domain_flush_complete(struct protection_domain *domain);
+extern void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain);
 extern int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid);
 extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
 unsigned long cr3);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index fdb6030b505d..1b10710c91cf 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -87,7 +87,6 @@ struct iommu_cmd {
 
 struct kmem_cache *amd_iommu_irq_cache;
 
-static void update_domain(struct protection_domain *domain);
 static void detach_device(struct device *dev);
 
 /
@@ -1314,12 +1313,12 @@ static void domain_flush_pages(struct protection_domain *domain,
 }
 
 /* Flush the whole IO/TLB for a given protection domain - including PDE */
-static void domain_flush_tlb_pde(struct protection_domain *domain)
+void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain)
 {
__domain_flush_pages(domain, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS, 1);
 }
 
-static void domain_flush_complete(struct protection_domain *domain)
+void amd_iommu_domain_flush_complete(struct protection_domain *domain)
 {
int i;
 
@@ -1344,7 +1343,7 @@ static void domain_flush_np_cache(struct protection_domain *domain,
 
	spin_lock_irqsave(&domain->lock, flags);
domain_flush_pages(domain, iova, size);
-   domain_flush_complete(domain);
+   amd_iommu_domain_flush_complete(domain);
spin_unlock_irqrestore(>lock, flags);
}
 }
@@ -1501,7 +1500,7 @@ static bool increase_address_space(struct protection_domain *domain,
pgtable.root  = pte;
pgtable.mode += 1;
amd_iommu_update_and_flush_device_table(domain);
-   domain_flush_complete(domain);
+   amd_iommu_domain_flush_complete(domain);
 
/*
 * Device Table needs to be updated and flushed before the new root can
@@ -1754,8 +1753,8 @@ static int iommu_map_page(struct protection_domain *dom,
 * Updates and flushing already happened in
 * increase_address_space().
 */
-   domain_flush_tlb_pde(dom);
-   domain_flush_complete(dom);
+   amd_iommu_domain_flush_tlb_pde(dom);
+   amd_iommu_domain_flush_complete(dom);
	spin_unlock_irqrestore(&dom->lock, flags);
}
 
@@ -1998,10 +1997,10 @@ static void do_detach(struct iommu_dev_data *dev_data)
device_flush_dte(dev_data);
 
/* Flush IOTLB */
-   domain_flush_tlb_pde(domain);
+   amd_iommu_domain_flush_tlb_pde(domain);
 
/* Wait for the flushes to finish */
-   domain_flush_complete(domain);
+   amd_iommu_domain_flush_complete(domain);
 
/* decrease reference counters - needs to happen after the flushes */
domain->dev_iommu[iommu->index] -= 1;
@@ -2134,9 +2133,9 @@ static int attach_device(struct device *dev,
 * left the caches in the IOMMU dirty. So we have to flush
 * here to evict all dirty stuff.
 */
-   domain_flush_tlb_pde(domain);
+   amd_iommu_domain_flush_tlb_pde(domain);
 
-   domain_flush_complete(domain);
+   amd_iommu_domain_flush_complete(domain);
 
 out:
	spin_unlock(&dev_data->lock);
@@ -2298,7 +2297,7 @@ void amd_iommu_update_and_flush_device_table(struct protection_domain *domain)
domain_flush_devices(domain);
 }
 
-static void update_domain(struct protection_domain *domain)
+void amd_iommu_domain_update(struct protection_domain *domain)
 {
struct domain_pgtable pgtable;
 
@@ -2307,8 +2306,8 @@ static void update_domain(struct protection_domain *domain)
amd_iommu_update_and_flush_device_table(domain);
 
/* Flush domain TLB(s) and wait for completion */
-   domain_flush_tlb_pde(domain);
-   domain_flush_complete(domain);
+   amd_iommu_do

[PATCH v4 07/13] iommu/amd: Restructure code for freeing page table

2020-12-14 Thread Suravee Suthikulpanit
Consolidate the logic into the v1_free_pgtable() helper function,
which is called from the IO page table framework.
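The free path being consolidated chains sub-page-tables onto a freelist while the table is torn down, then releases the whole chain in one pass (the `free_page_list()` pattern). A minimal userspace model — the `toy_*` names are invented; the kernel chains `struct page` objects through `page->freelist` instead of a dedicated `next` field:

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for struct page; only the freelist link matters here. */
struct toy_page {
	struct toy_page *next;
};

/* Push a page onto the freelist, like p->freelist = freelist in the kernel. */
static struct toy_page *toy_chain(struct toy_page *p, struct toy_page *freelist)
{
	p->next = freelist;
	return p;
}

/* Release the whole chain in one pass, free_page_list()-style. */
static int toy_free_list(struct toy_page *freelist)
{
	int freed = 0;

	while (freelist) {
		struct toy_page *p = freelist;

		freelist = freelist->next;
		free(p);
		freed++;
	}
	return freed;
}

/* Model of teardown: collect pages while walking, then free them together. */
static int toy_teardown(int npages)
{
	struct toy_page *freelist = NULL;
	int i;

	for (i = 0; i < npages; i++) {
		struct toy_page *p = malloc(sizeof(*p));

		if (p)
			freelist = toy_chain(p, freelist);
	}
	return toy_free_list(freelist);
}
```

Deferring the frees to one pass keeps the table walk simple and lets the caller decide when (e.g. after TLB invalidation) the memory is actually safe to release.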

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h  |  1 -
 drivers/iommu/amd/io_pgtable.c | 41 --
 drivers/iommu/amd/iommu.c  | 21 -
 3 files changed, 28 insertions(+), 35 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 1bad42a3c73c..91d098003f12 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -132,7 +132,6 @@ static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif
 
 /* TODO: These are temporary and will be removed once fully transition */
-extern void free_pagetable(struct domain_pgtable *pgtable);
 extern int iommu_map_page(struct protection_domain *dom,
  unsigned long bus_addr,
  unsigned long phys_addr,
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 345e9bc81fde..dc674e79ddf0 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -163,23 +163,6 @@ static struct page *free_sub_pt(unsigned long root, int mode,
return freelist;
 }
 
-void free_pagetable(struct domain_pgtable *pgtable)
-{
-   struct page *freelist = NULL;
-   unsigned long root;
-
-   if (pgtable->mode == PAGE_MODE_NONE)
-   return;
-
-   BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
-  pgtable->mode > PAGE_MODE_6_LEVEL);
-
-   root = (unsigned long)pgtable->root;
-   freelist = free_sub_pt(root, pgtable->mode, freelist);
-
-   free_page_list(freelist);
-}
-
 void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
  u64 *root, int mode)
 {
@@ -528,6 +511,30 @@ unsigned long iommu_unmap_page(struct protection_domain 
*dom,
  */
 static void v1_free_pgtable(struct io_pgtable *iop)
 {
+   struct amd_io_pgtable *pgtable = container_of(iop, struct amd_io_pgtable, iop);
+   struct protection_domain *dom;
+   struct page *freelist = NULL;
+   unsigned long root;
+
+   if (pgtable->mode == PAGE_MODE_NONE)
+   return;
+
+   dom = container_of(pgtable, struct protection_domain, iop);
+
+   /* Update data structure */
+   amd_iommu_domain_clr_pt_root(dom);
+
+   /* Make changes visible to IOMMUs */
+   amd_iommu_domain_update(dom);
+
+   /* Page-table is not visible to IOMMU anymore, so free it */
+   BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
+  pgtable->mode > PAGE_MODE_6_LEVEL);
+
+   root = (unsigned long)pgtable->root;
+   freelist = free_sub_pt(root, pgtable->mode, freelist);
+
+   free_page_list(freelist);
 }
 
 static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void 
*cookie)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index e823a457..37ecedce2c14 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1903,17 +1903,14 @@ static void cleanup_domain(struct protection_domain *domain)
 
 static void protection_domain_free(struct protection_domain *domain)
 {
-   struct domain_pgtable pgtable;
-
if (!domain)
return;
 
if (domain->id)
domain_id_free(domain->id);
 
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-   amd_iommu_domain_clr_pt_root(domain);
-   free_pagetable(&pgtable);
+   if (domain->iop.pgtbl_cfg.tlb)
+   free_io_pgtable_ops(&domain->iop.iop.ops);
 
kfree(domain);
 }
@@ -2302,22 +2299,12 @@ EXPORT_SYMBOL(amd_iommu_unregister_ppr_notifier);
 void amd_iommu_domain_direct_map(struct iommu_domain *dom)
 {
struct protection_domain *domain = to_pdomain(dom);
-   struct domain_pgtable pgtable;
unsigned long flags;
 
spin_lock_irqsave(>lock, flags);
 
-   /* First save pgtable configuration*/
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-   /* Remove page-table from domain */
-   amd_iommu_domain_clr_pt_root(domain);
-
-   /* Make changes visible to IOMMUs */
-   amd_iommu_domain_update(domain);
-
-   /* Page-table is not visible to IOMMU anymore, so free it */
-   free_pagetable(&pgtable);
+   if (domain->iop.pgtbl_cfg.tlb)
+   free_io_pgtable_ops(&domain->iop.iop.ops);
 
	spin_unlock_irqrestore(&domain->lock, flags);
 }
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v4 04/13] iommu/amd: Convert to using amd_io_pgtable

2020-12-14 Thread Suravee Suthikulpanit
Make use of the new struct amd_io_pgtable in preparation for removing
the struct domain_pgtable.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/iommu.c | 25 ++---
 2 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index b8dae3941f0f..bf9723b35e77 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -56,6 +56,7 @@ extern void amd_iommu_domain_direct_map(struct iommu_domain *dom);
 extern int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
 extern int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid,
u64 address);
+extern void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
 extern int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid);
 extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
 unsigned long cr3);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 5b93536d6877..fdb6030b505d 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -89,8 +89,6 @@ struct kmem_cache *amd_iommu_irq_cache;
 
 static void update_domain(struct protection_domain *domain);
 static void detach_device(struct device *dev);
-static void update_and_flush_device_table(struct protection_domain *domain,
- struct domain_pgtable *pgtable);
 
 /
  *
@@ -1502,7 +1500,7 @@ static bool increase_address_space(struct protection_domain *domain,
 
pgtable.root  = pte;
pgtable.mode += 1;
-   update_and_flush_device_table(domain, &pgtable);
+   amd_iommu_update_and_flush_device_table(domain);
domain_flush_complete(domain);
 
/*
@@ -1877,17 +1875,16 @@ static void free_gcr3_table(struct protection_domain *domain)
 }
 
 static void set_dte_entry(u16 devid, struct protection_domain *domain,
- struct domain_pgtable *pgtable,
  bool ats, bool ppr)
 {
u64 pte_root = 0;
u64 flags = 0;
u32 old_domid;
 
-   if (pgtable->mode != PAGE_MODE_NONE)
-   pte_root = iommu_virt_to_phys(pgtable->root);
+   if (domain->iop.mode != PAGE_MODE_NONE)
+   pte_root = iommu_virt_to_phys(domain->iop.root);
 
-   pte_root |= (pgtable->mode & DEV_ENTRY_MODE_MASK)
+   pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
<< DEV_ENTRY_MODE_SHIFT;
pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
 
@@ -1977,7 +1974,7 @@ static void do_attach(struct iommu_dev_data *dev_data,
 
/* Update device table */
	amd_iommu_domain_get_pgtable(domain, &pgtable);
-   set_dte_entry(dev_data->devid, domain, &pgtable,
+   set_dte_entry(dev_data->devid, domain,
  ats, dev_data->iommu_v2);
clone_aliases(dev_data->pdev);
 
@@ -2284,22 +2281,20 @@ static int amd_iommu_domain_get_attr(struct iommu_domain *domain,
  *
  */
 
-static void update_device_table(struct protection_domain *domain,
-   struct domain_pgtable *pgtable)
+static void update_device_table(struct protection_domain *domain)
 {
struct iommu_dev_data *dev_data;
 
list_for_each_entry(dev_data, >dev_list, list) {
-   set_dte_entry(dev_data->devid, domain, pgtable,
+   set_dte_entry(dev_data->devid, domain,
  dev_data->ats.enabled, dev_data->iommu_v2);
clone_aliases(dev_data->pdev);
}
 }
 
-static void update_and_flush_device_table(struct protection_domain *domain,
- struct domain_pgtable *pgtable)
+void amd_iommu_update_and_flush_device_table(struct protection_domain *domain)
 {
-   update_device_table(domain, pgtable);
+   update_device_table(domain);
domain_flush_devices(domain);
 }
 
@@ -2309,7 +2304,7 @@ static void update_domain(struct protection_domain *domain)
 
/* Update device table */
	amd_iommu_domain_get_pgtable(domain, &pgtable);
-   update_and_flush_device_table(domain, );
+   amd_iommu_update_and_flush_device_table(domain);
 
/* Flush domain TLB(s) and wait for completion */
domain_flush_tlb_pde(domain);
-- 
2.17.1



[PATCH v4 03/13] iommu/amd: Move pt_root to struct amd_io_pgtable

2020-12-14 Thread Suravee Suthikulpanit
Move pt_root into struct amd_io_pgtable to better organize the data
structure, since the field contains IO page table related information.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   | 2 +-
 drivers/iommu/amd/amd_iommu_types.h | 2 +-
 drivers/iommu/amd/iommu.c   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 0817bc732d1a..b8dae3941f0f 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -105,7 +105,7 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
 static inline
 void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
 {
-   atomic64_set(&domain->pt_root, root);
+   atomic64_set(&domain->iop.pt_root, root);
 }
 
 static inline
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 5d77f34e0fda..7c971c76d685 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -497,6 +497,7 @@ struct amd_io_pgtable {
struct io_pgtable   iop;
int mode;
u64 *root;
+   atomic64_t  pt_root;/* pgtable root and pgtable mode */
 };
 
 /*
@@ -510,7 +511,6 @@ struct protection_domain {
struct amd_io_pgtable iop;
spinlock_t lock;/* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
-   atomic64_t pt_root; /* pgtable root and pgtable mode */
int glx;/* Number of levels for GCR3 table */
u64 *gcr3_tbl;  /* Guest CR3 table */
unsigned long flags;/* flags to find out type of domain */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 45d3977d6c00..5b93536d6877 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -145,7 +145,7 @@ static struct protection_domain *to_pdomain(struct iommu_domain *dom)
 static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
 struct domain_pgtable *pgtable)
 {
-   u64 pt_root = atomic64_read(&domain->pt_root);
+   u64 pt_root = atomic64_read(&domain->iop.pt_root);
 
pgtable->root = (u64 *)(pt_root & PAGE_MASK);
pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
-- 
2.17.1



[PATCH v4 02/13] iommu/amd: Prepare for generic IO page table framework

2020-12-14 Thread Suravee Suthikulpanit
Add initial hook-up code to implement the generic IO page table framework.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/Kconfig   |  1 +
 drivers/iommu/amd/Makefile  |  2 +-
 drivers/iommu/amd/amd_iommu_types.h | 35 ++
 drivers/iommu/amd/io_pgtable.c  | 75 +
 drivers/iommu/amd/iommu.c   | 10 
 drivers/iommu/io-pgtable.c  |  3 ++
 include/linux/io-pgtable.h  |  2 +
 7 files changed, 117 insertions(+), 11 deletions(-)
 create mode 100644 drivers/iommu/amd/io_pgtable.c

diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig
index 626b97d0dd21..a3cbafb603f5 100644
--- a/drivers/iommu/amd/Kconfig
+++ b/drivers/iommu/amd/Kconfig
@@ -10,6 +10,7 @@ config AMD_IOMMU
select IOMMU_API
select IOMMU_IOVA
select IOMMU_DMA
+   select IOMMU_IO_PGTABLE
depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE
help
  With this option you can enable support for AMD IOMMU hardware in
diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile
index dc5a2fa4fd37..a935f8f4b974 100644
--- a/drivers/iommu/amd/Makefile
+++ b/drivers/iommu/amd/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
-obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o
+obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o
 obj-$(CONFIG_AMD_IOMMU_V2) += iommu_v2.o
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 494b42a31b7a..5d77f34e0fda 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Maximum number of IOMMUs supported
@@ -252,6 +253,19 @@
 
 #define GA_GUEST_NR0x1
 
+#define IOMMU_IN_ADDR_BIT_SIZE  52
+#define IOMMU_OUT_ADDR_BIT_SIZE 52
+
+/*
+ * This bitmap is used to advertise the page sizes our hardware support
+ * to the IOMMU core, which will then use this information to split
+ * physically contiguous memory regions it is mapping into page sizes
+ * that we support.
+ *
+ * 512GB Pages are not supported due to a hardware bug
+ */
+#define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
+
 /* Bit value definition for dte irq remapping fields*/
 #define DTE_IRQ_PHYS_ADDR_MASK (((1ULL << 45)-1) << 6)
 #define DTE_IRQ_REMAP_INTCTL_MASK  (0x3ULL << 60)
@@ -465,6 +479,26 @@ struct amd_irte_ops;
 
 #define AMD_IOMMU_FLAG_TRANS_PRE_ENABLED  (1 << 0)
 
+#define io_pgtable_to_data(x) \
+   container_of((x), struct amd_io_pgtable, iop)
+
+#define io_pgtable_ops_to_data(x) \
+   io_pgtable_to_data(io_pgtable_ops_to_pgtable(x))
+
+#define io_pgtable_ops_to_domain(x) \
+   container_of(io_pgtable_ops_to_data(x), \
+struct protection_domain, iop)
+
+#define io_pgtable_cfg_to_data(x) \
+   container_of((x), struct amd_io_pgtable, pgtbl_cfg)
+
+struct amd_io_pgtable {
+   struct io_pgtable_cfg   pgtbl_cfg;
+   struct io_pgtable   iop;
+   int mode;
+   u64 *root;
+};
+
 /*
  * This structure contains generic data for  IOMMU protection domains
  * independent of their use.
@@ -473,6 +507,7 @@ struct protection_domain {
struct list_head dev_list; /* List of all devices in this domain */
struct iommu_domain domain; /* generic domain handle used by
   iommu core code */
+   struct amd_io_pgtable iop;
spinlock_t lock;/* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
atomic64_t pt_root; /* pgtable root and pgtable mode */
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
new file mode 100644
index ..aedf2c932c40
--- /dev/null
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -0,0 +1,75 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * CPU-agnostic AMD IO page table allocator.
+ *
+ * Copyright (C) 2020 Advanced Micro Devices, Inc.
+ * Author: Suravee Suthikulpanit 
+ */
+
+#define pr_fmt(fmt) "AMD-Vi: " fmt
+#define dev_fmt(fmt)pr_fmt(fmt)
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "amd_iommu_types.h"
+#include "amd_iommu.h"
+
+static void v1_tlb_flush_all(void *cookie)
+{
+}
+
+static void v1_tlb_flush_walk(unsigned long iova, size_t size,
+ size_t granule, void *cookie)
+{
+}
+
+static void v1_tlb_flush_leaf(unsigned long iova, size_t size,
+ size_t granule, void *cookie)
+{
+}
+
+static void v1_tlb_add_page(struct iommu_iotlb_gather *gather,
+unsigned long iova, size_t granule,
+   

[PATCH v4 00/13] iommu/amd: Add Generic IO Page Table Framework Support

2020-12-14 Thread Suravee Suthikulpanit
The framework allows a callable implementation of the IO page table.
This allows the AMD IOMMU driver to switch between different types
of AMD IOMMU page tables (e.g. v1 vs. v2).

This series refactors the current implementation of AMD IOMMU v1 page table
to adopt the framework. There should be no functional change.
Subsequent series will introduce support for the AMD IOMMU v2 page table.

Thanks,
Suravee

Change from V3 
(https://lore.kernel.org/linux-iommu/20201004014549.16065-1-suravee.suthikulpa...@amd.com/)
  - Rebase to v5.10
  - Patch  2: Add struct iommu_flush_ops (previously in patch 13 of v3)
  - Patch  7: Consolidate logic into v1_free_pgtable() instead of 
amd_iommu_free_pgtable()
  - Patch 12: Check ops->[map|unmap] before calling.
  - Patch 13: Setup page table when allocating domain (instead of when 
attaching device).

Change from V2 
(https://lore.kernel.org/lkml/835c0d46-ed96-9fbe-856a-777dcffac...@amd.com/T/#t)
  - Patch  2: Introduce helper function io_pgtable_cfg_to_data.
  - Patch 13: Put back the struct iommu_flush_ops since patch v2 would run into
NULL pointer bug when calling free_io_pgtable_ops if not defined.

Change from V1 (https://lkml.org/lkml/2020/9/23/251)
  - Do not specify struct io_pgtable_cfg.coherent_walk, since it is
not currently used. (per Robin)
  - Remove unused struct iommu_flush_ops.  (patch 2/13)
  - Move amd_iommu_setup_io_pgtable_ops to iommu.c instead of io_pgtable.c
(patch 13/13)

Suravee Suthikulpanit (13):
  iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
  iommu/amd: Prepare for generic IO page table framework
  iommu/amd: Move pt_root to struct amd_io_pgtable
  iommu/amd: Convert to using amd_io_pgtable
  iommu/amd: Declare functions as extern
  iommu/amd: Move IO page table related functions
  iommu/amd: Restructure code for freeing page table
  iommu/amd: Remove amd_iommu_domain_get_pgtable
  iommu/amd: Rename variables to be consistent with struct
io_pgtable_ops
  iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
  iommu/amd: Introduce iommu_v1_iova_to_phys
  iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
  iommu/amd: Adopt IO page table framework for AMD IOMMU v1 page table

 drivers/iommu/amd/Kconfig   |   1 +
 drivers/iommu/amd/Makefile  |   2 +-
 drivers/iommu/amd/amd_iommu.h   |  22 +
 drivers/iommu/amd/amd_iommu_types.h |  43 +-
 drivers/iommu/amd/init.c|   2 +
 drivers/iommu/amd/io_pgtable.c  | 564 +++
 drivers/iommu/amd/iommu.c   | 672 
 drivers/iommu/io-pgtable.c  |   3 +
 include/linux/io-pgtable.h  |   2 +
 9 files changed, 707 insertions(+), 604 deletions(-)
 create mode 100644 drivers/iommu/amd/io_pgtable.c

-- 
2.17.1



[PATCH] iommu/amd: Add sanity check for interrupt remapping table length macros

2020-12-10 Thread Suravee Suthikulpanit
Currently, macros related to the interrupt remapping table length are
defined separately. This has resulted in an oversight in which one of
the macros was missed when changing the length. To prevent this,
redefine the macros to add a built-in sanity check.

Also, rename macros to use the name of the DTE[IntTabLen] field as
specified in the AMD IOMMU specification. There is no functional change.

Suggested-by: Linus Torvalds 
Reviewed-by: Tom Lendacky 
Signed-off-by: Suravee Suthikulpanit 
Cc: Will Deacon 
Cc: Jerry Snitselaar 
Cc: Joerg Roedel 
---
 drivers/iommu/amd/amd_iommu_types.h | 19 ++-
 drivers/iommu/amd/init.c|  6 +++---
 drivers/iommu/amd/iommu.c   |  2 +-
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 494b42a31b7a..899ce62df3f0 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -255,11 +255,19 @@
 /* Bit value definition for dte irq remapping fields*/
 #define DTE_IRQ_PHYS_ADDR_MASK (((1ULL << 45)-1) << 6)
 #define DTE_IRQ_REMAP_INTCTL_MASK  (0x3ULL << 60)
-#define DTE_IRQ_TABLE_LEN_MASK (0xfULL << 1)
 #define DTE_IRQ_REMAP_INTCTL(2ULL << 60)
-#define DTE_IRQ_TABLE_LEN   (9ULL << 1)
 #define DTE_IRQ_REMAP_ENABLE1ULL
 
+/*
+ * AMD IOMMU hardware only support 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define DTE_INTTAB_ALIGNMENT128
+#define DTE_INTTABLEN_VALUE 9ULL
+#define DTE_INTTABLEN   (DTE_INTTABLEN_VALUE << 1)
+#define DTE_INTTABLEN_MASK  (0xfULL << 1)
+#define MAX_IRQS_PER_TABLE  (1 << DTE_INTTABLEN_VALUE)
+
 #define PAGE_MODE_NONE0x00
 #define PAGE_MODE_1_LEVEL 0x01
 #define PAGE_MODE_2_LEVEL 0x02
@@ -409,13 +417,6 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
-/*
- * AMD IOMMU hardware only support 512 IRTEs despite
- * the architectural limitation of 2048 entries.
- */
-#define MAX_IRQS_PER_TABLE 512
-#define IRQ_TABLE_ALIGNMENT128
-
 struct irq_remap_table {
raw_spinlock_t lock;
unsigned min_index;
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 23a790f8f550..6bec8913d064 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -989,10 +989,10 @@ static bool copy_device_table(void)
 
irq_v = old_devtb[devid].data[2] & DTE_IRQ_REMAP_ENABLE;
int_ctl = old_devtb[devid].data[2] & DTE_IRQ_REMAP_INTCTL_MASK;
-   int_tab_len = old_devtb[devid].data[2] & DTE_IRQ_TABLE_LEN_MASK;
+   int_tab_len = old_devtb[devid].data[2] & DTE_INTTABLEN_MASK;
if (irq_v && (int_ctl || int_tab_len)) {
if ((int_ctl != DTE_IRQ_REMAP_INTCTL) ||
-   (int_tab_len != DTE_IRQ_TABLE_LEN)) {
+   (int_tab_len != DTE_INTTABLEN)) {
			pr_err("Wrong old irq remapping flag: %#x\n", devid);
return false;
}
@@ -2674,7 +2674,7 @@ static int __init early_amd_iommu_init(void)
remap_cache_sz = MAX_IRQS_PER_TABLE * (sizeof(u64) * 2);
amd_iommu_irq_cache = kmem_cache_create("irq_remap_cache",
remap_cache_sz,
-   IRQ_TABLE_ALIGNMENT,
+   DTE_INTTAB_ALIGNMENT,
0, NULL);
if (!amd_iommu_irq_cache)
goto out;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b9cf59443843..f7abf16d1e3a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3191,7 +3191,7 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
dte &= ~DTE_IRQ_PHYS_ADDR_MASK;
dte |= iommu_virt_to_phys(table->table);
dte |= DTE_IRQ_REMAP_INTCTL;
-   dte |= DTE_IRQ_TABLE_LEN;
+   dte |= DTE_INTTABLEN;
dte |= DTE_IRQ_REMAP_ENABLE;
 
amd_iommu_dev_table[devid].data[2] = dte;
-- 
2.17.1



[PATCH] iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs

2020-12-07 Thread Suravee Suthikulpanit
According to the AMD IOMMU spec, the commit 73db2fc595f3
("iommu/amd: Increase interrupt remapping table limit to 512 entries")
also requires the interrupt table length (IntTabLen) to be set to 9
(the power-of-2 exponent, i.e. 512 entries) in the device table entry (DTE).

Fixes: 73db2fc595f3 ("iommu/amd: Increase interrupt remapping table limit to 
512 entries")
Reported-by: Jerry Snitselaar 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 89647700bab2..494b42a31b7a 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -257,7 +257,7 @@
 #define DTE_IRQ_REMAP_INTCTL_MASK  (0x3ULL << 60)
 #define DTE_IRQ_TABLE_LEN_MASK (0xfULL << 1)
 #define DTE_IRQ_REMAP_INTCTL(2ULL << 60)
-#define DTE_IRQ_TABLE_LEN   (8ULL << 1)
+#define DTE_IRQ_TABLE_LEN   (9ULL << 1)
 #define DTE_IRQ_REMAP_ENABLE1ULL
 
 #define PAGE_MODE_NONE0x00
-- 
2.17.1



Re: [PATCH v2] iommu/amd: Enforce 4k mapping for certain IOMMU data structures

2020-11-19 Thread Suravee Suthikulpanit

Will,

To answer your questions from v1 thread.

On 11/18/20 5:57 AM, Will Deacon wrote:
> On 11/5/20 9:58 PM, Suravee Suthikulpanit wrote:
>> AMD IOMMU requires 4k-aligned pages for the event log, the PPR log,
>> and the completion wait write-back regions. However, when allocating
>> the pages, they could be part of large mapping (e.g. 2M) page.
>> This causes #PF due to the SNP RMP hardware enforces the check based
>> on the page level for these data structures.
>
> Please could you include an example backtrace here?

Unfortunately, we don't actually have the backtrace available here.
This information is based on the SEV-SNP specification.

>> So, fix by calling set_memory_4k() on the allocated pages.
>
> I think I'm missing something here. set_memory_4k() will break the kernel
> linear mapping up into page granular mappings, but the IOMMU isn't using
> that mapping, right?

That's correct. This does not affect the IOMMU, but it affects the PSP FW.

> It's just using the physical address returned by iommu_virt_to_phys(), so why 
does it matter?
>
> Just be nice to capture some of this rationale in the log, especially as
> I'm not familiar with this device.

According to the AMD SEV-SNP white paper 
(https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf), 
the Reverse Map Table (RMP) contains one entry for every 4K page of DRAM that may be used by the VM. In this case, the 
pages allocated by the IOMMU driver are added as 4K entries in the RMP table by the SEV-SNP FW.


During the page table walk, the RMP checks if the page is owned by the
hypervisor. Without calling set_memory_4k() to break the mapping up into
4K pages, pages could end up being part of a large mapping (e.g. a 2M
page), in which case the page access would be denied and result in a #PF.


>> Fixes: commit c69d89aff393 ("iommu/amd: Use 4K page for completion wait 
write-back semaphore")
>
> I couldn't figure out how that commit could cause this problem. Please can
> you explain that to me?

Hope this helps clarify. If so, I'll update the commit log and send out V3.

Thanks,
Suravee


Re: [PATCH] iommu/amd: Enforce 4k mapping for certain IOMMU data structures

2020-11-19 Thread Suravee Suthikulpanit

Will,

I have already submitted v2 of this patch. Let me move the discussion there 
instead ...
(https://lore.kernel.org/linux-iommu/20201105145832.3065-1-suravee.suthikulpa...@amd.com/)

Suravee

On 11/18/20 5:57 AM, Will Deacon wrote:

On Wed, Oct 28, 2020 at 11:18:24PM +, Suravee Suthikulpanit wrote:

AMD IOMMU requires 4k-aligned pages for the event log, the PPR log,
and the completion wait write-back regions. However, when allocating
the pages, they could be part of large mapping (e.g. 2M) page.
This causes #PF due to the SNP RMP hardware enforces the check based
on the page level for these data structures.


Please could you include an example backtrace here?


So, fix by calling set_memory_4k() on the allocated pages.


I think I'm missing something here. set_memory_4k() will break the kernel
linear mapping up into page granular mappings, but the IOMMU isn't using
that mapping, right? It's just using the physical address returned by
iommu_virt_to_phys(), so why does it matter?

Just be nice to capture some of this rationale in the log, especially as
I'm not familiar with this device.


Fixes: commit c69d89aff393 ("iommu/amd: Use 4K page for completion wait write-back 
semaphore")


I couldn't figure out how that commit could cause this problem. Please can
you explain that to me?

Cheers,

Will




Re: [PATCH v3 00/14] iommu/amd: Add Generic IO Page Table Framework Support

2020-11-12 Thread Suravee Suthikulpanit

Joerg,

Please do not pull in V3. I am working on V4 to resubmit.

Thank you,
Suravee

On 11/11/20 10:10 AM, Suravee Suthikulpanit wrote:

Hi Joerg,

Do you have any update on this series?

Thanks,
Suravee

On 11/2/20 10:16 AM, Suravee Suthikulpanit wrote:

Joerg,

You mentioned to remind you to pull this in to linux-next.

Thanks,
Suravee

On 10/4/20 8:45 AM, Suravee Suthikulpanit wrote:

The framework allows callable implementation of IO page table.
This allows AMD IOMMU driver to switch between different types
of AMD IOMMU page tables (e.g. v1 vs. v2).

This series refactors the current implementation of AMD IOMMU v1 page table
to adopt the framework. There should be no functional change.
Subsequent series will introduce support for the AMD IOMMU v2 page table.

Thanks,
Suravee

Change from V2 
(https://lore.kernel.org/lkml/835c0d46-ed96-9fbe-856a-777dcffac...@amd.com/T/#t)
   - Patch 2/14: Introduce helper function io_pgtable_cfg_to_data.
   - Patch 13/14: Put back the struct iommu_flush_ops since patch v2 would run 
into
 NULL pointer bug when calling free_io_pgtable_ops if not defined.

Change from V1 (https://lkml.org/lkml/2020/9/23/251)
   - Do not specify struct io_pgtable_cfg.coherent_walk, since it is
 not currently used. (per Robin)
   - Remove unused struct iommu_flush_ops.  (patch 2/13)
   - Move amd_iommu_setup_io_pgtable_ops to iommu.c instead of io_pgtable.c
 patch 13/13)

Suravee Suthikulpanit (14):
   iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
   iommu/amd: Prepare for generic IO page table framework
   iommu/amd: Move pt_root to struct amd_io_pgtable
   iommu/amd: Convert to using amd_io_pgtable
   iommu/amd: Declare functions as extern
   iommu/amd: Move IO page table related functions
   iommu/amd: Restructure code for freeing page table
   iommu/amd: Remove amd_iommu_domain_get_pgtable
   iommu/amd: Rename variables to be consistent with struct
 io_pgtable_ops
   iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
   iommu/amd: Introduce iommu_v1_iova_to_phys
   iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
   iommu/amd: Introduce IOMMU flush callbacks
   iommu/amd: Adopt IO page table framework

  drivers/iommu/amd/Kconfig   |   1 +
  drivers/iommu/amd/Makefile  |   2 +-
  drivers/iommu/amd/amd_iommu.h   |  22 +
  drivers/iommu/amd/amd_iommu_types.h |  43 +-
  drivers/iommu/amd/io_pgtable.c  | 564 
  drivers/iommu/amd/iommu.c   | 646 +++-
  drivers/iommu/io-pgtable.c  |   3 +
  include/linux/io-pgtable.h  |   2 +
  8 files changed, 691 insertions(+), 592 deletions(-)
  create mode 100644 drivers/iommu/amd/io_pgtable.c



Re: [PATCH v3 00/14] iommu/amd: Add Generic IO Page Table Framework Support

2020-11-10 Thread Suravee Suthikulpanit

Hi Joerg,

Do you have any update on this series?

Thanks,
Suravee

On 11/2/20 10:16 AM, Suravee Suthikulpanit wrote:

Joerg,

You mentioned to remind you to pull this in to linux-next.

Thanks,
Suravee

On 10/4/20 8:45 AM, Suravee Suthikulpanit wrote:

The framework allows callable implementation of IO page table.
This allows AMD IOMMU driver to switch between different types
of AMD IOMMU page tables (e.g. v1 vs. v2).

This series refactors the current implementation of AMD IOMMU v1 page table
to adopt the framework. There should be no functional change.
Subsequent series will introduce support for the AMD IOMMU v2 page table.

Thanks,
Suravee

Change from V2 
(https://lore.kernel.org/lkml/835c0d46-ed96-9fbe-856a-777dcffac...@amd.com/T/#t)
   - Patch 2/14: Introduce helper function io_pgtable_cfg_to_data.
   - Patch 13/14: Put back the struct iommu_flush_ops since patch v2 would run 
into
 NULL pointer bug when calling free_io_pgtable_ops if not defined.

Change from V1 (https://lkml.org/lkml/2020/9/23/251)
   - Do not specify struct io_pgtable_cfg.coherent_walk, since it is
 not currently used. (per Robin)
   - Remove unused struct iommu_flush_ops.  (patch 2/13)
   - Move amd_iommu_setup_io_pgtable_ops to iommu.c instead of io_pgtable.c
 patch 13/13)

Suravee Suthikulpanit (14):
   iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
   iommu/amd: Prepare for generic IO page table framework
   iommu/amd: Move pt_root to struct amd_io_pgtable
   iommu/amd: Convert to using amd_io_pgtable
   iommu/amd: Declare functions as extern
   iommu/amd: Move IO page table related functions
   iommu/amd: Restructure code for freeing page table
   iommu/amd: Remove amd_iommu_domain_get_pgtable
   iommu/amd: Rename variables to be consistent with struct
 io_pgtable_ops
   iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
   iommu/amd: Introduce iommu_v1_iova_to_phys
   iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
   iommu/amd: Introduce IOMMU flush callbacks
   iommu/amd: Adopt IO page table framework

  drivers/iommu/amd/Kconfig   |   1 +
  drivers/iommu/amd/Makefile  |   2 +-
  drivers/iommu/amd/amd_iommu.h   |  22 +
  drivers/iommu/amd/amd_iommu_types.h |  43 +-
  drivers/iommu/amd/io_pgtable.c  | 564 
  drivers/iommu/amd/iommu.c   | 646 +++-
  drivers/iommu/io-pgtable.c  |   3 +
  include/linux/io-pgtable.h  |   2 +
  8 files changed, 691 insertions(+), 592 deletions(-)
  create mode 100644 drivers/iommu/amd/io_pgtable.c



[PATCH v2] iommu/amd: Enforce 4k mapping for certain IOMMU data structures

2020-11-05 Thread Suravee Suthikulpanit
AMD IOMMU requires 4k-aligned pages for the event log, the PPR log,
and the completion wait write-back regions. However, when allocating
the pages, they could be part of a large (e.g. 2M) mapping.
This causes a #PF because the SNP RMP hardware enforces the check
based on the page level for these data structures.

Fix this by calling set_memory_4k() on the allocated pages.

Fixes: commit c69d89aff393 ("iommu/amd: Use 4K page for completion wait 
write-back semaphore")
Cc: Brijesh Singh 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 82e4af8f09bb..23a790f8f550 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -672,11 +673,27 @@ static void __init free_command_buffer(struct amd_iommu *iommu)
free_pages((unsigned long)iommu->cmd_buf, get_order(CMD_BUFFER_SIZE));
 }
 
+static void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
+					 gfp_t gfp, size_t size)
+{
+   int order = get_order(size);
+   void *buf = (void *)__get_free_pages(gfp, order);
+
+   if (buf &&
+   iommu_feature(iommu, FEATURE_SNP) &&
+   set_memory_4k((unsigned long)buf, (1 << order))) {
+   free_pages((unsigned long)buf, order);
+   buf = NULL;
+   }
+
+   return buf;
+}
+
 /* allocates the memory where the IOMMU will log its events to */
 static int __init alloc_event_buffer(struct amd_iommu *iommu)
 {
-   iommu->evt_buf = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
- get_order(EVT_BUFFER_SIZE));
+   iommu->evt_buf = iommu_alloc_4k_pages(iommu, GFP_KERNEL | __GFP_ZERO,
+ EVT_BUFFER_SIZE);
 
return iommu->evt_buf ? 0 : -ENOMEM;
 }
@@ -715,8 +732,8 @@ static void __init free_event_buffer(struct amd_iommu 
*iommu)
 /* allocates the memory where the IOMMU will log its events to */
 static int __init alloc_ppr_log(struct amd_iommu *iommu)
 {
-   iommu->ppr_log = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
- get_order(PPR_LOG_SIZE));
+   iommu->ppr_log = iommu_alloc_4k_pages(iommu, GFP_KERNEL | __GFP_ZERO,
+ PPR_LOG_SIZE);
 
return iommu->ppr_log ? 0 : -ENOMEM;
 }
@@ -838,7 +855,7 @@ static int iommu_init_ga(struct amd_iommu *iommu)
 
 static int __init alloc_cwwb_sem(struct amd_iommu *iommu)
 {
-   iommu->cmd_sem = (void *)get_zeroed_page(GFP_KERNEL);
+   iommu->cmd_sem = iommu_alloc_4k_pages(iommu, GFP_KERNEL | __GFP_ZERO, 1);
 
return iommu->cmd_sem ? 0 : -ENOMEM;
 }
-- 
2.17.1



Re: [PATCH v3 00/14] iommu/amd: Add Generic IO Page Table Framework Support

2020-11-01 Thread Suravee Suthikulpanit

Joerg,

You asked me to remind you to pull this into linux-next.
Thanks,
Suravee

On 10/4/20 8:45 AM, Suravee Suthikulpanit wrote:

The framework allows callable implementation of IO page table.
This allows AMD IOMMU driver to switch between different types
of AMD IOMMU page tables (e.g. v1 vs. v2).

This series refactors the current implementation of AMD IOMMU v1 page table
to adopt the framework. There should be no functional change.
Subsequent series will introduce support for the AMD IOMMU v2 page table.

Thanks,
Suravee

Change from V2 
(https://lore.kernel.org/lkml/835c0d46-ed96-9fbe-856a-777dcffac...@amd.com/T/#t)
   - Patch 2/14: Introduce helper function io_pgtable_cfg_to_data.
   - Patch 13/14: Put back the struct iommu_flush_ops since patch v2 would run into
     a NULL pointer bug when calling free_io_pgtable_ops if not defined.

Change from V1 (https://lkml.org/lkml/2020/9/23/251)
   - Do not specify struct io_pgtable_cfg.coherent_walk, since it is
 not currently used. (per Robin)
   - Remove unused struct iommu_flush_ops.  (patch 2/13)
   - Move amd_iommu_setup_io_pgtable_ops to iommu.c instead of io_pgtable.c
  (patch 13/13)

Suravee Suthikulpanit (14):
   iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
   iommu/amd: Prepare for generic IO page table framework
   iommu/amd: Move pt_root to struct amd_io_pgtable
   iommu/amd: Convert to using amd_io_pgtable
   iommu/amd: Declare functions as extern
   iommu/amd: Move IO page table related functions
   iommu/amd: Restructure code for freeing page table
   iommu/amd: Remove amd_iommu_domain_get_pgtable
   iommu/amd: Rename variables to be consistent with struct
 io_pgtable_ops
   iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
   iommu/amd: Introduce iommu_v1_iova_to_phys
   iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
   iommu/amd: Introduce IOMMU flush callbacks
   iommu/amd: Adopt IO page table framework

  drivers/iommu/amd/Kconfig   |   1 +
  drivers/iommu/amd/Makefile  |   2 +-
  drivers/iommu/amd/amd_iommu.h   |  22 +
  drivers/iommu/amd/amd_iommu_types.h |  43 +-
  drivers/iommu/amd/io_pgtable.c  | 564 
  drivers/iommu/amd/iommu.c   | 646 +++-
  drivers/iommu/io-pgtable.c  |   3 +
  include/linux/io-pgtable.h  |   2 +
  8 files changed, 691 insertions(+), 592 deletions(-)
  create mode 100644 drivers/iommu/amd/io_pgtable.c




[PATCH] iommu/amd: Enforce 4k mapping for certain IOMMU data structures

2020-10-28 Thread Suravee Suthikulpanit
AMD IOMMU requires 4k-aligned pages for the event log, the PPR log,
and the completion wait write-back regions. However, when allocating
the pages, they could be part of a large (e.g. 2M) mapping.
This causes a #PF because the SNP RMP hardware enforces the check
based on the page level for these data structures.

So, fix by calling set_memory_4k() on the allocated pages.

Fixes: c69d89aff393 ("iommu/amd: Use 4K page for completion wait write-back semaphore")
Cc: Brijesh Singh 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 82e4af8f09bb..75dc30226a7c 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -672,11 +673,22 @@ static void __init free_command_buffer(struct amd_iommu 
*iommu)
free_pages((unsigned long)iommu->cmd_buf, get_order(CMD_BUFFER_SIZE));
 }
 
+static void *__init iommu_alloc_4k_pages(gfp_t gfp, size_t size)
+{
+   void *buf;
+   int order = get_order(size);
+
+   buf = (void *)__get_free_pages(gfp, order);
+   if (!buf)
+   return buf;
+   return set_memory_4k((unsigned long)buf, (1 << order)) ? NULL : buf;
+}
+
 /* allocates the memory where the IOMMU will log its events to */
 static int __init alloc_event_buffer(struct amd_iommu *iommu)
 {
-   iommu->evt_buf = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
- get_order(EVT_BUFFER_SIZE));
+   iommu->evt_buf = iommu_alloc_4k_pages(GFP_KERNEL | __GFP_ZERO,
+ EVT_BUFFER_SIZE);
 
return iommu->evt_buf ? 0 : -ENOMEM;
 }
@@ -715,8 +727,8 @@ static void __init free_event_buffer(struct amd_iommu 
*iommu)
 /* allocates the memory where the IOMMU will log its events to */
 static int __init alloc_ppr_log(struct amd_iommu *iommu)
 {
-   iommu->ppr_log = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
- get_order(PPR_LOG_SIZE));
+   iommu->ppr_log = iommu_alloc_4k_pages(GFP_KERNEL | __GFP_ZERO,
+ PPR_LOG_SIZE);
 
return iommu->ppr_log ? 0 : -ENOMEM;
 }
@@ -838,7 +850,7 @@ static int iommu_init_ga(struct amd_iommu *iommu)
 
 static int __init alloc_cwwb_sem(struct amd_iommu *iommu)
 {
-   iommu->cmd_sem = (void *)get_zeroed_page(GFP_KERNEL);
+   iommu->cmd_sem = iommu_alloc_4k_pages(GFP_KERNEL | __GFP_ZERO, 1);
 
return iommu->cmd_sem ? 0 : -ENOMEM;
 }
-- 
2.17.1



Re: [PATCH] iommu/amd: Increase interrupt remapping table limit to 512 entries

2020-10-25 Thread Suravee Suthikulpanit

Hi Joerg,

Do you have any concerns regarding this patch?

Thanks,
Suravee

On 10/15/20 9:50 AM, Suravee Suthikulpanit wrote:

Certain device drivers allocate IO queues on a per-CPU basis.
On the AMD EPYC platform, which can support up to 256 CPU threads,
this can exceed the current MAX_IRQS_PER_TABLE limit of 256,
and result in the error message:

 AMD-Vi: Failed to allocate IRTE

This has been observed with certain NVMe devices.

AMD IOMMU hardware can actually support up to 512 interrupt
remapping table entries. Therefore, update the driver to
match the hardware limit.

Please note that this also increases the size of interrupt remapping
table to 8KB per device when using the 128-bit IRTE format.

Signed-off-by: Suravee Suthikulpanit 
---
  drivers/iommu/amd/amd_iommu_types.h | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 30a5d412255a..427484c45589 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -406,7 +406,11 @@ extern bool amd_iommu_np_cache;
  /* Only true if all IOMMUs support device IOTLBs */
  extern bool amd_iommu_iotlb_sup;
  
-#define MAX_IRQS_PER_TABLE	256

+/*
+ * AMD IOMMU hardware only supports 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define MAX_IRQS_PER_TABLE 512
  #define IRQ_TABLE_ALIGNMENT   128
  
  struct irq_remap_table {





[PATCH] iommu/amd: Increase interrupt remapping table limit to 512 entries

2020-10-14 Thread Suravee Suthikulpanit
Certain device drivers allocate IO queues on a per-CPU basis.
On the AMD EPYC platform, which can support up to 256 CPU threads,
this can exceed the current MAX_IRQS_PER_TABLE limit of 256,
and result in the error message:

AMD-Vi: Failed to allocate IRTE

This has been observed with certain NVMe devices.

AMD IOMMU hardware can actually support up to 512 interrupt
remapping table entries. Therefore, update the driver to
match the hardware limit.

Please note that this also increases the size of interrupt remapping
table to 8KB per device when using the 128-bit IRTE format.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu_types.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 30a5d412255a..427484c45589 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -406,7 +406,11 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
-#define MAX_IRQS_PER_TABLE 256
+/*
+ * AMD IOMMU hardware only supports 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define MAX_IRQS_PER_TABLE 512
 #define IRQ_TABLE_ALIGNMENT    128
 
 struct irq_remap_table {
-- 
2.17.1



[PATCH v3 07/14] iommu/amd: Restructure code for freeing page table

2020-10-03 Thread Suravee Suthikulpanit
Introduce amd_iommu_free_pgtable helper function, which consolidates
logic for freeing page table.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h  |  2 +-
 drivers/iommu/amd/io_pgtable.c | 12 +++-
 drivers/iommu/amd/iommu.c  | 19 ++-
 3 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index ee7ff4d827e1..8dff7d85be79 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -123,7 +123,6 @@ static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif
 
 /* TODO: These are temporary and will be removed once fully transition */
-extern void free_pagetable(struct domain_pgtable *pgtable);
 extern int iommu_map_page(struct protection_domain *dom,
  unsigned long bus_addr,
  unsigned long phys_addr,
@@ -140,4 +139,5 @@ extern void amd_iommu_domain_get_pgtable(struct 
protection_domain *domain,
 struct domain_pgtable *pgtable);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode);
+extern void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable);
 #endif
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index c11355afe624..23e82da2dea8 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -136,14 +136,24 @@ static struct page *free_sub_pt(unsigned long root, int 
mode,
return freelist;
 }
 
-void free_pagetable(struct domain_pgtable *pgtable)
+void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable)
 {
+   struct protection_domain *dom;
struct page *freelist = NULL;
unsigned long root;
 
if (pgtable->mode == PAGE_MODE_NONE)
return;
 
+   dom = container_of(pgtable, struct protection_domain, iop);
+
+   /* Update data structure */
+   amd_iommu_domain_clr_pt_root(dom);
+
+   /* Make changes visible to IOMMUs */
+   amd_iommu_domain_update(dom);
+
+   /* Page-table is not visible to IOMMU anymore, so free it */
BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
   pgtable->mode > PAGE_MODE_6_LEVEL);
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4d65f64236b6..cbbea7b952fb 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1882,17 +1882,13 @@ static void cleanup_domain(struct protection_domain 
*domain)
 
 static void protection_domain_free(struct protection_domain *domain)
 {
-   struct domain_pgtable pgtable;
-
if (!domain)
return;
 
if (domain->id)
domain_id_free(domain->id);
 
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-   amd_iommu_domain_clr_pt_root(domain);
-   free_pagetable(&pgtable);
+   amd_iommu_free_pgtable(&domain->iop);
 
kfree(domain);
 }
@@ -2281,22 +2277,11 @@ EXPORT_SYMBOL(amd_iommu_unregister_ppr_notifier);
 void amd_iommu_domain_direct_map(struct iommu_domain *dom)
 {
struct protection_domain *domain = to_pdomain(dom);
-   struct domain_pgtable pgtable;
unsigned long flags;
 
	spin_lock_irqsave(&domain->lock, flags);
 
-   /* First save pgtable configuration*/
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-   /* Remove page-table from domain */
-   amd_iommu_domain_clr_pt_root(domain);
-
-   /* Make changes visible to IOMMUs */
-   amd_iommu_domain_update(domain);
-
-   /* Page-table is not visible to IOMMU anymore, so free it */
-   free_pagetable(&pgtable);
+   amd_iommu_free_pgtable(&domain->iop);
 
	spin_unlock_irqrestore(&domain->lock, flags);
 }
-- 
2.17.1



[PATCH v3 11/14] iommu/amd: Introduce iommu_v1_iova_to_phys

2020-10-03 Thread Suravee Suthikulpanit
This implements iova_to_phys for AMD IOMMU v1 pagetable,
which will be used by the IO page table framework.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/io_pgtable.c | 22 ++
 drivers/iommu/amd/iommu.c  | 16 +---
 2 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 93ff8cb452ed..7841e5e1e563 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -494,6 +494,26 @@ unsigned long iommu_unmap_page(struct protection_domain 
*dom,
return unmapped;
 }
 
+static phys_addr_t iommu_v1_iova_to_phys(struct io_pgtable_ops *ops, unsigned long iova)
+{
+   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
+   unsigned long offset_mask, pte_pgsize;
+   u64 *pte, __pte;
+
+   if (pgtable->mode == PAGE_MODE_NONE)
+   return iova;
+
+   pte = fetch_pte(pgtable, iova, &pte_pgsize);
+
+   if (!pte || !IOMMU_PTE_PRESENT(*pte))
+   return 0;
+
+   offset_mask = pte_pgsize - 1;
+   __pte   = __sme_clr(*pte & PM_ADDR_MASK);
+
+   return (__pte & ~offset_mask) | (iova & offset_mask);
+}
+
 /*
  * 
  */
@@ -505,6 +525,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct 
io_pgtable_cfg *cfg, void *coo
 {
struct amd_io_pgtable *pgtable = io_pgtable_cfg_to_data(cfg);
 
+   pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
+
	return &pgtable->iop;
 }
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 87cea1cde414..9a1a16031e00 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2079,22 +2079,8 @@ static phys_addr_t amd_iommu_iova_to_phys(struct 
iommu_domain *dom,
 {
struct protection_domain *domain = to_pdomain(dom);
	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
-   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
-   unsigned long offset_mask, pte_pgsize;
-   u64 *pte, __pte;
 
-   if (domain->iop.mode == PAGE_MODE_NONE)
-   return iova;
-
-   pte = fetch_pte(pgtable, iova, &pte_pgsize);
-
-   if (!pte || !IOMMU_PTE_PRESENT(*pte))
-   return 0;
-
-   offset_mask = pte_pgsize - 1;
-   __pte   = __sme_clr(*pte & PM_ADDR_MASK);
-
-   return (__pte & ~offset_mask) | (iova & offset_mask);
+   return ops->iova_to_phys(ops, iova);
 }
 
 static bool amd_iommu_capable(enum iommu_cap cap)
-- 
2.17.1



[PATCH v3 12/14] iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page

2020-10-03 Thread Suravee Suthikulpanit
These implement map and unmap for AMD IOMMU v1 pagetable, which
will be used by the IO pagetable framework.

Also clean up unused extern function declarations.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h  | 13 -
 drivers/iommu/amd/io_pgtable.c | 25 -
 drivers/iommu/amd/iommu.c  |  7 ---
 3 files changed, 16 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 69996e57fae2..2e8dc2a1ec0f 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -124,19 +124,6 @@ void amd_iommu_apply_ivrs_quirks(void);
 static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif
 
-/* TODO: These are temporary and will be removed once fully transition */
-extern int iommu_map_page(struct protection_domain *dom,
- unsigned long bus_addr,
- unsigned long phys_addr,
- unsigned long page_size,
- int prot,
- gfp_t gfp);
-extern unsigned long iommu_unmap_page(struct protection_domain *dom,
- unsigned long bus_addr,
- unsigned long page_size);
-extern u64 *fetch_pte(struct amd_io_pgtable *pgtable,
- unsigned long address,
- unsigned long *page_size);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode);
 extern void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable);
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 7841e5e1e563..d8b329aa0bb2 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -317,9 +317,9 @@ static u64 *alloc_pte(struct protection_domain *domain,
  * This function checks if there is a PTE for a given dma address. If
  * there is one, it returns the pointer to it.
  */
-u64 *fetch_pte(struct amd_io_pgtable *pgtable,
-  unsigned long address,
-  unsigned long *page_size)
+static u64 *fetch_pte(struct amd_io_pgtable *pgtable,
+ unsigned long address,
+ unsigned long *page_size)
 {
int level;
u64 *pte;
@@ -392,13 +392,10 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, 
struct page *freelist)
  * supporting all features of AMD IOMMU page tables like level skipping
  * and full 64 bit address spaces.
  */
-int iommu_map_page(struct protection_domain *dom,
-  unsigned long iova,
-  unsigned long paddr,
-  unsigned long size,
-  int prot,
-  gfp_t gfp)
+static int iommu_v1_map_page(struct io_pgtable_ops *ops, unsigned long iova,
+ phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
 {
+   struct protection_domain *dom = io_pgtable_ops_to_domain(ops);
struct page *freelist = NULL;
bool updated = false;
u64 __pte, *pte;
@@ -461,11 +458,11 @@ int iommu_map_page(struct protection_domain *dom,
return ret;
 }
 
-unsigned long iommu_unmap_page(struct protection_domain *dom,
-  unsigned long iova,
-  unsigned long size)
+static unsigned long iommu_v1_unmap_page(struct io_pgtable_ops *ops,
+ unsigned long iova,
+ size_t size,
+ struct iommu_iotlb_gather *gather)
 {
-   struct io_pgtable_ops *ops = &dom->iop.iop.ops;
struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
unsigned long long unmapped;
unsigned long unmap_size;
@@ -525,6 +522,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct 
io_pgtable_cfg *cfg, void *coo
 {
struct amd_io_pgtable *pgtable = io_pgtable_cfg_to_data(cfg);
 
+   pgtable->iop.ops.map  = iommu_v1_map_page;
+   pgtable->iop.ops.unmap= iommu_v1_unmap_page;
pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
 
	return &pgtable->iop;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 9a1a16031e00..77f44b927ae7 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2044,6 +2044,7 @@ static int amd_iommu_map(struct iommu_domain *dom, 
unsigned long iova,
 gfp_t gfp)
 {
struct protection_domain *domain = to_pdomain(dom);
+   struct io_pgtable_ops *ops = &domain->iop.iop.ops;
int prot = 0;
int ret;
 
@@ -2055,8 +2056,7 @@ static int amd_iommu_map(struct iommu_domain *dom, 
unsigned long iova,
if (iommu_prot & IOMMU_WRITE)
prot |= IOMMU_PROT_IW;
 
-   ret = iommu_map_page(domain, iova, paddr, page_size, prot, gfp);
-
+   ret = ops->map(ops, iova, pa

[PATCH v3 14/14] iommu/amd: Adopt IO page table framework

2020-10-03 Thread Suravee Suthikulpanit
Switch to using IO page table framework for AMD IOMMU v1 page table.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 77f44b927ae7..6f8316206fb8 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1573,6 +1574,22 @@ static int pdev_iommuv2_enable(struct pci_dev *pdev)
return ret;
 }
 
+struct io_pgtable_ops *
+amd_iommu_setup_io_pgtable_ops(struct iommu_dev_data *dev_data,
+  struct protection_domain *domain)
+{
+   struct amd_iommu *iommu = amd_iommu_rlookup_table[dev_data->devid];
+
+   domain->iop.pgtbl_cfg = (struct io_pgtable_cfg) {
+   .pgsize_bitmap  = AMD_IOMMU_PGSIZES,
+   .ias= IOMMU_IN_ADDR_BIT_SIZE,
+   .oas= IOMMU_OUT_ADDR_BIT_SIZE,
+   .iommu_dev  = &iommu->dev->dev,
+   };
+
+   return alloc_io_pgtable_ops(AMD_IOMMU_V1, &domain->iop.pgtbl_cfg, domain);
+}
+
 /*
  * If a device is not yet associated with a domain, this function makes the
  * device visible in the domain
@@ -1580,6 +1597,7 @@ static int pdev_iommuv2_enable(struct pci_dev *pdev)
 static int attach_device(struct device *dev,
 struct protection_domain *domain)
 {
+   struct io_pgtable_ops *pgtbl_ops;
struct iommu_dev_data *dev_data;
struct pci_dev *pdev;
unsigned long flags;
@@ -1623,6 +1641,12 @@ static int attach_device(struct device *dev,
 skip_ats_check:
ret = 0;
 
+   pgtbl_ops = amd_iommu_setup_io_pgtable_ops(dev_data, domain);
+   if (!pgtbl_ops) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
do_attach(dev_data, domain);
 
/*
@@ -1958,6 +1982,8 @@ static void amd_iommu_domain_free(struct iommu_domain 
*dom)
if (domain->dev_cnt > 0)
cleanup_domain(domain);
 
+   free_io_pgtable_ops(&domain->iop.iop.ops);
+
BUG_ON(domain->dev_cnt != 0);
 
if (!dom)
-- 
2.17.1



[PATCH v3 13/14] iommu/amd: Introduce IOMMU flush callbacks

2020-10-03 Thread Suravee Suthikulpanit
Add TLB flush callback functions, which are used by the IO
page table framework.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/io_pgtable.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index d8b329aa0bb2..3c2faa47ea5d 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -514,6 +514,33 @@ static phys_addr_t iommu_v1_iova_to_phys(struct 
io_pgtable_ops *ops, unsigned lo
 /*
  * 
  */
+static void v1_tlb_flush_all(void *cookie)
+{
+}
+
+static void v1_tlb_flush_walk(unsigned long iova, size_t size,
+ size_t granule, void *cookie)
+{
+}
+
+static void v1_tlb_flush_leaf(unsigned long iova, size_t size,
+ size_t granule, void *cookie)
+{
+}
+
+static void v1_tlb_add_page(struct iommu_iotlb_gather *gather,
+unsigned long iova, size_t granule,
+void *cookie)
+{
+}
+
+const struct iommu_flush_ops v1_flush_ops = {
+   .tlb_flush_all  = v1_tlb_flush_all,
+   .tlb_flush_walk = v1_tlb_flush_walk,
+   .tlb_flush_leaf = v1_tlb_flush_leaf,
+   .tlb_add_page   = v1_tlb_add_page,
+};
+
 static void v1_free_pgtable(struct io_pgtable *iop)
 {
 }
@@ -526,6 +553,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct 
io_pgtable_cfg *cfg, void *coo
pgtable->iop.ops.unmap= iommu_v1_unmap_page;
pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
 
+   cfg->tlb = &v1_flush_ops;
+
	return &pgtable->iop;
 }
 
-- 
2.17.1



[PATCH v3 00/14] iommu/amd: Add Generic IO Page Table Framework Support

2020-10-03 Thread Suravee Suthikulpanit
The framework allows callable implementation of IO page table.
This allows AMD IOMMU driver to switch between different types
of AMD IOMMU page tables (e.g. v1 vs. v2).

This series refactors the current implementation of AMD IOMMU v1 page table
to adopt the framework. There should be no functional change.
Subsequent series will introduce support for the AMD IOMMU v2 page table.

Thanks,
Suravee

Change from V2 
(https://lore.kernel.org/lkml/835c0d46-ed96-9fbe-856a-777dcffac...@amd.com/T/#t)
  - Patch 2/14: Introduce helper function io_pgtable_cfg_to_data.
  - Patch 13/14: Put back the struct iommu_flush_ops since patch v2 would run into
    a NULL pointer bug when calling free_io_pgtable_ops if not defined.

Change from V1 (https://lkml.org/lkml/2020/9/23/251)
  - Do not specify struct io_pgtable_cfg.coherent_walk, since it is
not currently used. (per Robin)
  - Remove unused struct iommu_flush_ops.  (patch 2/13)
  - Move amd_iommu_setup_io_pgtable_ops to iommu.c instead of io_pgtable.c
    (patch 13/13)

Suravee Suthikulpanit (14):
  iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
  iommu/amd: Prepare for generic IO page table framework
  iommu/amd: Move pt_root to struct amd_io_pgtable
  iommu/amd: Convert to using amd_io_pgtable
  iommu/amd: Declare functions as extern
  iommu/amd: Move IO page table related functions
  iommu/amd: Restructure code for freeing page table
  iommu/amd: Remove amd_iommu_domain_get_pgtable
  iommu/amd: Rename variables to be consistent with struct
io_pgtable_ops
  iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
  iommu/amd: Introduce iommu_v1_iova_to_phys
  iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
  iommu/amd: Introduce IOMMU flush callbacks
  iommu/amd: Adopt IO page table framework

 drivers/iommu/amd/Kconfig   |   1 +
 drivers/iommu/amd/Makefile  |   2 +-
 drivers/iommu/amd/amd_iommu.h   |  22 +
 drivers/iommu/amd/amd_iommu_types.h |  43 +-
 drivers/iommu/amd/io_pgtable.c  | 564 
 drivers/iommu/amd/iommu.c   | 646 +++-
 drivers/iommu/io-pgtable.c  |   3 +
 include/linux/io-pgtable.h  |   2 +
 8 files changed, 691 insertions(+), 592 deletions(-)
 create mode 100644 drivers/iommu/amd/io_pgtable.c

-- 
2.17.1



[PATCH v3 09/14] iommu/amd: Rename variables to be consistent with struct io_pgtable_ops

2020-10-03 Thread Suravee Suthikulpanit
There is no functional change.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/io_pgtable.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 6c063d2c8bf0..989db64a89a7 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -393,9 +393,9 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, 
struct page *freelist)
  * and full 64 bit address spaces.
  */
 int iommu_map_page(struct protection_domain *dom,
-  unsigned long bus_addr,
-  unsigned long phys_addr,
-  unsigned long page_size,
+  unsigned long iova,
+  unsigned long paddr,
+  unsigned long size,
   int prot,
   gfp_t gfp)
 {
@@ -404,15 +404,15 @@ int iommu_map_page(struct protection_domain *dom,
u64 __pte, *pte;
int ret, i, count;
 
-   BUG_ON(!IS_ALIGNED(bus_addr, page_size));
-   BUG_ON(!IS_ALIGNED(phys_addr, page_size));
+   BUG_ON(!IS_ALIGNED(iova, size));
+   BUG_ON(!IS_ALIGNED(paddr, size));
 
ret = -EINVAL;
if (!(prot & IOMMU_PROT_MASK))
goto out;
 
-   count = PAGE_SIZE_PTE_COUNT(page_size);
-   pte   = alloc_pte(dom, bus_addr, page_size, NULL, gfp, &updated);
+   count = PAGE_SIZE_PTE_COUNT(size);
+   pte   = alloc_pte(dom, iova, size, NULL, gfp, &updated);
 
ret = -ENOMEM;
if (!pte)
@@ -425,10 +425,10 @@ int iommu_map_page(struct protection_domain *dom,
updated = true;
 
if (count > 1) {
-   __pte = PAGE_SIZE_PTE(__sme_set(phys_addr), page_size);
+   __pte = PAGE_SIZE_PTE(__sme_set(paddr), size);
__pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_PR | IOMMU_PTE_FC;
} else
-   __pte = __sme_set(phys_addr) | IOMMU_PTE_PR | IOMMU_PTE_FC;
+   __pte = __sme_set(paddr) | IOMMU_PTE_PR | IOMMU_PTE_FC;
 
if (prot & IOMMU_PROT_IR)
__pte |= IOMMU_PTE_IR;
@@ -462,20 +462,19 @@ int iommu_map_page(struct protection_domain *dom,
 }
 
 unsigned long iommu_unmap_page(struct protection_domain *dom,
-  unsigned long bus_addr,
-  unsigned long page_size)
+  unsigned long iova,
+  unsigned long size)
 {
unsigned long long unmapped;
unsigned long unmap_size;
u64 *pte;
 
-   BUG_ON(!is_power_of_2(page_size));
+   BUG_ON(!is_power_of_2(size));
 
unmapped = 0;
 
-   while (unmapped < page_size) {
-
-   pte = fetch_pte(dom, bus_addr, &unmap_size);
+   while (unmapped < size) {
+   pte = fetch_pte(dom, iova, &unmap_size);
 
if (pte) {
int i, count;
@@ -485,7 +484,7 @@ unsigned long iommu_unmap_page(struct protection_domain 
*dom,
pte[i] = 0ULL;
}
 
-   bus_addr  = (bus_addr & ~(unmap_size - 1)) + unmap_size;
+   iova = (iova & ~(unmap_size - 1)) + unmap_size;
unmapped += unmap_size;
}
 
-- 
2.17.1



[PATCH v3 08/14] iommu/amd: Remove amd_iommu_domain_get_pgtable

2020-10-03 Thread Suravee Suthikulpanit
Since the IO page table root and mode parameters have been moved into
the struct amd_io_pgtable, the function is no longer needed. Therefore,
remove it along with the struct domain_pgtable.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   |  4 ++--
 drivers/iommu/amd/amd_iommu_types.h |  6 -
 drivers/iommu/amd/io_pgtable.c  | 36 ++---
 drivers/iommu/amd/iommu.c   | 34 ---
 4 files changed, 19 insertions(+), 61 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 8dff7d85be79..2059e64fdc53 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -101,6 +101,8 @@ static inline
 void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
 {
	atomic64_set(&domain->iop.pt_root, root);
+   domain->iop.root = (u64 *)(root & PAGE_MASK);
+   domain->iop.mode = root & 7; /* lowest 3 bits encode pgtable mode */
 }
 
 static inline
@@ -135,8 +137,6 @@ extern unsigned long iommu_unmap_page(struct 
protection_domain *dom,
 extern u64 *fetch_pte(struct protection_domain *domain,
  unsigned long address,
  unsigned long *page_size);
-extern void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
-struct domain_pgtable *pgtable);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode);
 extern void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable);
diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 80b5c34357ed..de3fe9433080 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -514,12 +514,6 @@ struct protection_domain {
unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
 };
 
-/* For decocded pt_root */
-struct domain_pgtable {
-   int mode;
-   u64 *root;
-};
-
 /*
  * Structure where we save information about one hardware AMD IOMMU in the
  * system.
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 23e82da2dea8..6c063d2c8bf0 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -184,30 +184,27 @@ static bool increase_address_space(struct 
protection_domain *domain,
   unsigned long address,
   gfp_t gfp)
 {
-   struct domain_pgtable pgtable;
unsigned long flags;
bool ret = true;
u64 *pte;
 
	spin_lock_irqsave(&domain->lock, flags);
 
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-   if (address <= PM_LEVEL_SIZE(pgtable.mode))
+   if (address <= PM_LEVEL_SIZE(domain->iop.mode))
goto out;
 
ret = false;
-   if (WARN_ON_ONCE(pgtable.mode == PAGE_MODE_6_LEVEL))
+   if (WARN_ON_ONCE(domain->iop.mode == PAGE_MODE_6_LEVEL))
goto out;
 
pte = (void *)get_zeroed_page(gfp);
if (!pte)
goto out;
 
-   *pte = PM_LEVEL_PDE(pgtable.mode, iommu_virt_to_phys(pgtable.root));
+   *pte = PM_LEVEL_PDE(domain->iop.mode, iommu_virt_to_phys(domain->iop.root));
 
-   pgtable.root  = pte;
-   pgtable.mode += 1;
+   domain->iop.root  = pte;
+   domain->iop.mode += 1;
amd_iommu_update_and_flush_device_table(domain);
amd_iommu_domain_flush_complete(domain);
 
@@ -215,7 +212,7 @@ static bool increase_address_space(struct protection_domain 
*domain,
 * Device Table needs to be updated and flushed before the new root can
 * be published.
 */
-   amd_iommu_domain_set_pgtable(domain, pte, pgtable.mode);
+   amd_iommu_domain_set_pgtable(domain, pte, domain->iop.mode);
 
ret = true;
 
@@ -232,29 +229,23 @@ static u64 *alloc_pte(struct protection_domain *domain,
  gfp_t gfp,
  bool *updated)
 {
-   struct domain_pgtable pgtable;
int level, end_lvl;
u64 *pte, *page;
 
BUG_ON(!is_power_of_2(page_size));
 
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-   while (address > PM_LEVEL_SIZE(pgtable.mode)) {
+   while (address > PM_LEVEL_SIZE(domain->iop.mode)) {
/*
 * Return an error if there is no memory to update the
 * page-table.
 */
if (!increase_address_space(domain, address, gfp))
return NULL;
-
-   /* Read new values to check if update was successful */
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
}
 
 
-   level   = pgtable.mode - 1;
-   pte = &pgtable.root[PM_LEVEL_INDEX(level, address)];
+   level   = domain->iop.mode - 1;
+   pte = &domain->iop.root[PM_LEVEL_INDEX(level, address)];

[PATCH v3 02/14] iommu/amd: Prepare for generic IO page table framework

2020-10-03 Thread Suravee Suthikulpanit
Add initial hook up code to implement generic IO page table framework.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/Kconfig   |  1 +
 drivers/iommu/amd/Makefile  |  2 +-
 drivers/iommu/amd/amd_iommu_types.h | 35 +++
 drivers/iommu/amd/io_pgtable.c  | 43 +
 drivers/iommu/amd/iommu.c   | 10 ---
 drivers/iommu/io-pgtable.c  |  3 ++
 include/linux/io-pgtable.h  |  2 ++
 7 files changed, 85 insertions(+), 11 deletions(-)
 create mode 100644 drivers/iommu/amd/io_pgtable.c

diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig
index 626b97d0dd21..a3cbafb603f5 100644
--- a/drivers/iommu/amd/Kconfig
+++ b/drivers/iommu/amd/Kconfig
@@ -10,6 +10,7 @@ config AMD_IOMMU
select IOMMU_API
select IOMMU_IOVA
select IOMMU_DMA
+   select IOMMU_IO_PGTABLE
depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE
help
  With this option you can enable support for AMD IOMMU hardware in
diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile
index dc5a2fa4fd37..a935f8f4b974 100644
--- a/drivers/iommu/amd/Makefile
+++ b/drivers/iommu/amd/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
-obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o
+obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o
 obj-$(CONFIG_AMD_IOMMU_V2) += iommu_v2.o
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index f696ac7c5f89..e3ac3e57e507 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Maximum number of IOMMUs supported
@@ -252,6 +253,19 @@
 
 #define GA_GUEST_NR0x1
 
+#define IOMMU_IN_ADDR_BIT_SIZE  52
+#define IOMMU_OUT_ADDR_BIT_SIZE 52
+
+/*
+ * This bitmap is used to advertise the page sizes our hardware support
+ * to the IOMMU core, which will then use this information to split
+ * physically contiguous memory regions it is mapping into page sizes
+ * that we support.
+ *
+ * 512GB Pages are not supported due to a hardware bug
+ */
+#define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
+
 /* Bit value definition for dte irq remapping fields*/
 #define DTE_IRQ_PHYS_ADDR_MASK (((1ULL << 45)-1) << 6)
 #define DTE_IRQ_REMAP_INTCTL_MASK  (0x3ULL << 60)
@@ -461,6 +475,26 @@ struct amd_irte_ops;
 
 #define AMD_IOMMU_FLAG_TRANS_PRE_ENABLED  (1 << 0)
 
+#define io_pgtable_to_data(x) \
+   container_of((x), struct amd_io_pgtable, iop)
+
+#define io_pgtable_ops_to_data(x) \
+   io_pgtable_to_data(io_pgtable_ops_to_pgtable(x))
+
+#define io_pgtable_ops_to_domain(x) \
+   container_of(io_pgtable_ops_to_data(x), \
+struct protection_domain, iop)
+
+#define io_pgtable_cfg_to_data(x) \
+   container_of((x), struct amd_io_pgtable, pgtbl_cfg)
+
+struct amd_io_pgtable {
+   struct io_pgtable_cfg   pgtbl_cfg;
+   struct io_pgtable   iop;
+   int mode;
+   u64 *root;
+};
+
 /*
  * This structure contains generic data for  IOMMU protection domains
  * independent of their use.
@@ -469,6 +503,7 @@ struct protection_domain {
struct list_head dev_list; /* List of all devices in this domain */
struct iommu_domain domain; /* generic domain handle used by
   iommu core code */
+   struct amd_io_pgtable iop;
spinlock_t lock;/* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
atomic64_t pt_root; /* pgtable root and pgtable mode */
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
new file mode 100644
index ..6b2de9e467d9
--- /dev/null
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * CPU-agnostic AMD IO page table allocator.
+ *
+ * Copyright (C) 2020 Advanced Micro Devices, Inc.
+ * Author: Suravee Suthikulpanit 
+ */
+
+#define pr_fmt(fmt) "AMD-Vi: " fmt
+#define dev_fmt(fmt)pr_fmt(fmt)
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "amd_iommu_types.h"
+#include "amd_iommu.h"
+
+/*
+ * 
+ */
+static void v1_free_pgtable(struct io_pgtable *iop)
+{
+}
+
+static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
+{
+   struct amd_io_pgtable *pgtable = io_pgtable_cfg_to_data(cfg);
+
+   return &pgtable->iop;
+}
+
+struct io_pgtable_init_fns io_pgtable_amd_iommu_v1_init_fns = {
+   .alloc  = v1_alloc_pgtable,
+   .free   = v1_free_pgtable,
+};
diff --git a/dr

[PATCH v3 04/14] iommu/amd: Convert to using amd_io_pgtable

2020-10-03 Thread Suravee Suthikulpanit
Make use of the new struct amd_io_pgtable in preparation to remove
the struct domain_pgtable.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/iommu.c | 25 ++---
 2 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index da6e09657e00..22ecacb71675 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -47,6 +47,7 @@ extern void amd_iommu_domain_direct_map(struct iommu_domain *dom);
 extern int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
 extern int amd_iommu_flush_page(struct iommu_domain *dom, int pasid,
u64 address);
+extern void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
 extern int amd_iommu_flush_tlb(struct iommu_domain *dom, int pasid);
 extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid,
 unsigned long cr3);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index c8b8619cc744..09da37c4c9c4 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -90,8 +90,6 @@ struct kmem_cache *amd_iommu_irq_cache;
 
 static void update_domain(struct protection_domain *domain);
 static void detach_device(struct device *dev);
-static void update_and_flush_device_table(struct protection_domain *domain,
- struct domain_pgtable *pgtable);
 
 /****************************************************************************
  *
@@ -1482,7 +1480,7 @@ static bool increase_address_space(struct protection_domain *domain,
 
pgtable.root  = pte;
pgtable.mode += 1;
-   update_and_flush_device_table(domain, &pgtable);
+   amd_iommu_update_and_flush_device_table(domain);
domain_flush_complete(domain);
 
/*
@@ -1857,17 +1855,16 @@ static void free_gcr3_table(struct protection_domain *domain)
 }
 
 static void set_dte_entry(u16 devid, struct protection_domain *domain,
- struct domain_pgtable *pgtable,
  bool ats, bool ppr)
 {
u64 pte_root = 0;
u64 flags = 0;
u32 old_domid;
 
-   if (pgtable->mode != PAGE_MODE_NONE)
-   pte_root = iommu_virt_to_phys(pgtable->root);
+   if (domain->iop.mode != PAGE_MODE_NONE)
+   pte_root = iommu_virt_to_phys(domain->iop.root);
 
-   pte_root |= (pgtable->mode & DEV_ENTRY_MODE_MASK)
+   pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
<< DEV_ENTRY_MODE_SHIFT;
pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
 
@@ -1957,7 +1954,7 @@ static void do_attach(struct iommu_dev_data *dev_data,
 
/* Update device table */
amd_iommu_domain_get_pgtable(domain, &pgtable);
-   set_dte_entry(dev_data->devid, domain, &pgtable,
+   set_dte_entry(dev_data->devid, domain,
  ats, dev_data->iommu_v2);
clone_aliases(dev_data->pdev);
 
@@ -2263,22 +2260,20 @@ static int amd_iommu_domain_get_attr(struct iommu_domain *domain,
  *
  */
 
-static void update_device_table(struct protection_domain *domain,
-   struct domain_pgtable *pgtable)
+static void update_device_table(struct protection_domain *domain)
 {
struct iommu_dev_data *dev_data;
 
list_for_each_entry(dev_data, &domain->dev_list, list) {
-   set_dte_entry(dev_data->devid, domain, pgtable,
+   set_dte_entry(dev_data->devid, domain,
  dev_data->ats.enabled, dev_data->iommu_v2);
clone_aliases(dev_data->pdev);
}
 }
 
-static void update_and_flush_device_table(struct protection_domain *domain,
- struct domain_pgtable *pgtable)
+void amd_iommu_update_and_flush_device_table(struct protection_domain *domain)
 {
-   update_device_table(domain, pgtable);
+   update_device_table(domain);
domain_flush_devices(domain);
 }
 
@@ -2288,7 +2283,7 @@ static void update_domain(struct protection_domain *domain)
 
/* Update device table */
amd_iommu_domain_get_pgtable(domain, &pgtable);
-   update_and_flush_device_table(domain, &pgtable);
+   amd_iommu_update_and_flush_device_table(domain);
 
/* Flush domain TLB(s) and wait for completion */
domain_flush_tlb_pde(domain);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 01/14] iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline

2020-10-03 Thread Suravee Suthikulpanit
Move the function to header file to allow inclusion in other files.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h | 13 +
 drivers/iommu/amd/iommu.c | 10 --
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 57309716fd18..97cdb235ce69 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -93,6 +93,19 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
return phys_to_virt(__sme_clr(paddr));
 }
 
+static inline
+void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
+{
+   atomic64_set(&domain->pt_root, root);
+}
+
+static inline
+void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
+{
+   amd_iommu_domain_set_pt_root(domain, 0);
+}
+
+
 extern bool translation_pre_enabled(struct amd_iommu *iommu);
 extern bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
 struct device *dev);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index db4fb840c59c..e92b3f744292 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -162,16 +162,6 @@ static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
 }
 
-static void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
-{
-   atomic64_set(&domain->pt_root, root);
-}
-
-static void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
-{
-   amd_iommu_domain_set_pt_root(domain, 0);
-}
-
 static void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode)
 {
-- 
2.17.1



[PATCH v3 03/14] iommu/amd: Move pt_root to struct amd_io_pgtable

2020-10-03 Thread Suravee Suthikulpanit
Move pt_root into struct amd_io_pgtable to better organize the data
structure, since it contains IO page table related information.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   | 2 +-
 drivers/iommu/amd/amd_iommu_types.h | 2 +-
 drivers/iommu/amd/iommu.c   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 97cdb235ce69..da6e09657e00 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -96,7 +96,7 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
 static inline
 void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
 {
-   atomic64_set(&domain->pt_root, root);
+   atomic64_set(&domain->iop.pt_root, root);
 }
 
 static inline
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index e3ac3e57e507..80b5c34357ed 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -493,6 +493,7 @@ struct amd_io_pgtable {
struct io_pgtable   iop;
int mode;
u64 *root;
+   atomic64_t pt_root; /* pgtable root and pgtable mode */
 };
 
 /*
@@ -506,7 +507,6 @@ struct protection_domain {
struct amd_io_pgtable iop;
spinlock_t lock;/* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
-   atomic64_t pt_root; /* pgtable root and pgtable mode */
int glx;/* Number of levels for GCR3 table */
u64 *gcr3_tbl;  /* Guest CR3 table */
unsigned long flags;/* flags to find out type of domain */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2b7eb51dcbb8..c8b8619cc744 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -146,7 +146,7 @@ static struct protection_domain *to_pdomain(struct iommu_domain *dom)
 static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
 struct domain_pgtable *pgtable)
 {
-   u64 pt_root = atomic64_read(&domain->pt_root);
+   u64 pt_root = atomic64_read(&domain->iop.pt_root);
 
pgtable->root = (u64 *)(pt_root & PAGE_MASK);
pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
-- 
2.17.1


