[PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V

2021-11-18 Thread Nicolin Chen via iommu
From: Nate Watterson 

NVIDIA's Grace SoC has CMDQ-Virtualization (CMDQV) hardware,
which extends the standard ARM SMMU v3 IP to support multiple
VCMDQs with virtualization capabilities. In the host kernel,
they are used to reduce contention on a single queue. In terms
of command queue, they are very similar to standard CMDQ/ECMDQs,
but only support CS_NONE in the CS field of the CMD_SYNC command.

This patch adds a new nvidia-grace-cmdqv file, inserts its
structure pointer into the existing arm_smmu_device, and then
adds the related function calls in the arm-smmu-v3 driver.

In the CMDQV driver itself, this patch only adds the minimal
parts for host kernel support. Upon probe(), VINTF0 is reserved
for in-kernel use, and some of the VCMDQs are assigned to VINTF0.
The driver then selects one of the VCMDQs in VINTF0, based on the
CPU currently executing, to issue commands (see the sketch below).

Note that for the current plan the CMDQV driver only supports
ACPI configuration.
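
A minimal sketch of the per-CPU selection described above (illustrative
only -- the helper name and the num_vcmdqs_per_vintf field are
assumptions, not necessarily what this patch uses):

/* Hypothetical helper: pick one of VINTF0's VCMDQs by executing CPU */
static struct arm_smmu_cmdq *
vintf0_pick_vcmdq(struct nvidia_grace_cmdqv *cmdqv)
{
	u16 qidx = smp_processor_id() % cmdqv->num_vcmdqs_per_vintf;

	return &cmdqv->vintf0.vcmdqs[qidx].cmdq;
}

Spreading issuers across queues this way avoids the lock contention hot
spot of funneling every CPU through the single smmu->cmdq.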

Signed-off-by: Nate Watterson 
Signed-off-by: Nicolin Chen 
---
Changelog:
v2->v3:
 * Replaced impl design with simpler "nvidia_grace_cmdqv" pointer
 * Aligned all the namings to "nvidia_grace_cmdqv" or "cmdqv"
 * Changed VINTF_ENABLED check in get_cmdq() to VINTF_STATUS
 * Dropped overrides at smmu->features and smmu->options
 * Inlined hw_probe() to acpi_probe() for simplification
 * Added a new CMDQV CONFIG depending on CONFIG_ACPI
 * Removed additional platform_device involvement
 * Switched krealloc to kzalloc for cmdqv pointer
 * Moved devm_request_irq() out of device_reset()
 * Dropped IRQF_SHARED flag at devm_request_irq()
 * Reused acpi_iort_node pointer from SMMU driver
 * Reused existing smmu functions to init vcmdqs
 * Changed writel_relaxed to writel to be safe
 * Removed pointless comments and prints
 * Updated Copyright lines

 MAINTAINERS   |   1 +
 drivers/iommu/Kconfig |  12 +
 drivers/iommu/arm/arm-smmu-v3/Makefile|   1 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  21 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  41 ++
 .../arm/arm-smmu-v3/nvidia-grace-cmdqv.c  | 418 ++
 6 files changed, 488 insertions(+), 6 deletions(-)
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c

diff --git a/MAINTAINERS b/MAINTAINERS
index f32c7d733255..0314ee1edf62 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18726,6 +18726,7 @@ M:  Thierry Reding 
 R: Krishna Reddy 
 L: linux-te...@vger.kernel.org
 S: Supported
+F: drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
 F: drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
 F: drivers/iommu/tegra*
 
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 3eb68fa1b8cc..290af9c7b2a5 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -388,6 +388,18 @@ config ARM_SMMU_V3_SVA
  Say Y here if your system supports SVA extensions such as PCIe PASID
  and PRI.
 
+config NVIDIA_GRACE_CMDQV
+   bool "NVIDIA Grace CMDQ-V extension support for ARM SMMUv3"
+   depends on ARM_SMMU_V3
+   depends on ACPI
+   help
+ Support for the NVIDIA Grace CMDQ-Virtualization extension for ARM SMMUv3.
+ The CMDQ-V extension is similar to the v3.3 ECMDQ in supporting multiple
+ command queues, but adds virtualization capabilities.
+
+ Say Y here if your system is NVIDIA Grace or has the same CMDQ-V
+ extension.
+
 config S390_IOMMU
def_bool y if S390 && PCI
depends on S390 && PCI
diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile b/drivers/iommu/arm/arm-smmu-v3/Makefile
index 54feb1ecccad..a083019de68a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/Makefile
+++ b/drivers/iommu/arm/arm-smmu-v3/Makefile
@@ -2,4 +2,5 @@
 obj-$(CONFIG_ARM_SMMU_V3) += arm_smmu_v3.o
 arm_smmu_v3-objs-y += arm-smmu-v3.o
 arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_SVA) += arm-smmu-v3-sva.o
+arm_smmu_v3-objs-$(CONFIG_NVIDIA_GRACE_CMDQV) += nvidia-grace-cmdqv.o
 arm_smmu_v3-objs := $(arm_smmu_v3-objs-y)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 188865ec9a33..b1182dd825fd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -339,6 +339,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 
 static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
 {
+   if (smmu->nvidia_grace_cmdqv)
+   return nvidia_grace_cmdqv_get_cmdq(smmu);
+
	return &smmu->cmdq;
 }
 
@@ -2874,12 +2877,10 @@ static struct iommu_ops arm_smmu_ops = {
 };
 
 /* Probing and initialisation functions */
-static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
-  struct arm_smmu_queue *q,
-  void __iomem *page,
-  unsigned long prod_off,
-  unsigned long 

[PATCH v3 2/5] iommu/arm-smmu-v3: Make arm_smmu_cmdq_init reusable

2021-11-18 Thread Nicolin Chen via iommu
The CMDQV extension in the NVIDIA Grace SoC reuses the arm_smmu_cmdq
structure, while the queue location isn't the same as smmu->cmdq. So
this patch adds a cmdq argument to the arm_smmu_cmdq_init() function
and shares its declaration in the header for the CMDQV driver to use.

Signed-off-by: Nicolin Chen 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 +++---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e6fee69dd79c..6be20e926f63 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2922,10 +2922,10 @@ static void arm_smmu_cmdq_free_bitmap(void *data)
bitmap_free(bitmap);
 }
 
-static int arm_smmu_cmdq_init(struct arm_smmu_device *smmu)
+int arm_smmu_cmdq_init(struct arm_smmu_device *smmu,
+  struct arm_smmu_cmdq *cmdq)
 {
int ret = 0;
-	struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
unsigned int nents = 1 << cmdq->q.llq.max_n_shift;
atomic_long_t *bitmap;
 
@@ -2955,7 +2955,7 @@ static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
if (ret)
return ret;
 
-   ret = arm_smmu_cmdq_init(smmu);
+	ret = arm_smmu_cmdq_init(smmu, &smmu->cmdq);
if (ret)
return ret;
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 7a6a6045700d..475f004ccbe4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -751,6 +751,8 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
 bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd);
 int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
unsigned long iova, size_t size);
+int arm_smmu_cmdq_init(struct arm_smmu_device *smmu,
+  struct arm_smmu_cmdq *cmdq);
 
 #ifdef CONFIG_ARM_SMMU_V3_SVA
 bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
-- 
2.17.1



[PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF

2021-11-18 Thread Nicolin Chen via iommu
When VCMDQs are assigned to a VINTF that is owned by a guest, not
the hypervisor (HYP_OWN bit is unset), only TLB invalidation commands
are supported. This requires the get_cmdq() function to scan the input
cmd before selecting a cmdq between smmu->cmdq and vintf->vcmdq, so
unsupported commands can still go through the emulated smmu->cmdq.

Also, the guest shouldn't have the HYP_OWN bit set regardless of
whether the guest kernel driver writes it or not, i.e. the user space
driver running in the host OS should wire this bit to zero when
trapping a write access to this VINTF_CONFIG register from a guest
kernel. So instead of using the existing regval, this patch reads back
the register value explicitly and caches it in vintf->cfg (see the
sketch below).
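
A sketch of the read-back described above (illustrative only -- the
helper name and the vintf->base field are assumptions about how the
register base is tracked):

/* Hypothetical helper: cache the VINTF_CONFIG value the HW holds */
static void
nvidia_cmdqv_cache_vintf_cfg(struct nvidia_grace_cmdqv_vintf *vintf)
{
	/*
	 * Read back rather than reuse the written regval: a hypervisor
	 * may wire HYP_OWN to zero regardless of what the guest wrote.
	 */
	vintf->cfg = readl(vintf->base + VINTF_CONFIG);
}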

Signed-off-by: Nicolin Chen 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  6 ++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  5 +--
 .../arm/arm-smmu-v3/nvidia-grace-cmdqv.c  | 32 +--
 3 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b1182dd825fd..73941ccc1a3e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -337,10 +337,10 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
return 0;
 }
 
-static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
+static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
 {
if (smmu->nvidia_grace_cmdqv)
-   return nvidia_grace_cmdqv_get_cmdq(smmu);
+   return nvidia_grace_cmdqv_get_cmdq(smmu, cmds, n);
 
	return &smmu->cmdq;
 }
@@ -747,7 +747,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
u32 prod;
unsigned long flags;
bool owner;
-   struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
+   struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu, cmds, n);
struct arm_smmu_ll_queue llq, head;
int ret = 0;
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 24f93444aeeb..085c775c2eea 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -832,7 +832,8 @@ struct nvidia_grace_cmdqv *
 nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
			      struct acpi_iort_node *node);
 int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
-struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
+struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
+ u64 *cmds, int n);
 #else /* CONFIG_NVIDIA_GRACE_CMDQV */
 static inline struct nvidia_grace_cmdqv *
 nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
@@ -847,7 +848,7 @@ static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
 }
 
 static inline struct arm_smmu_cmdq *
-nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
+nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
 {
return NULL;
 }
diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
index c0d7351f13e2..71f6bc684e64 100644
--- a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
@@ -166,7 +166,8 @@ static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
 }
 
-struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
+struct arm_smmu_cmdq *
+nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
 {
struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
	struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
@@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
	if (!FIELD_GET(VINTF_STATUS, vintf0->status))
		return &smmu->cmdq;
 
+	/* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
+	if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
+		u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
+
+   /* List all supported CMDs for vintf->cmdq pathway */
+   switch (opcode) {
+   case CMDQ_OP_TLBI_NH_ASID:
+   case CMDQ_OP_TLBI_NH_VA:
+   case CMDQ_OP_TLBI_S12_VMALL:
+   case CMDQ_OP_TLBI_S2_IPA:
+   case CMDQ_OP_ATC_INV:
+   break;
+   default:
+   /* Unsupported CMDs go for smmu->cmdq pathway */
+			return &smmu->cmdq;
+   }
+   }
+
/*
 * Select a vcmdq to use. Here we use a temporal solution to
 * 

[PATCH v3 3/5] iommu/arm-smmu-v3: Pass cmdq pointer in arm_smmu_cmdq_issue_cmdlist()

2021-11-18 Thread Nicolin Chen via iommu
The driver currently calls the arm_smmu_get_cmdq() helper internally
in different places, though they are all actually called from the same
source -- the arm_smmu_cmdq_issue_cmdlist() function.

This patch changes this to pass the cmdq pointer to these functions
instead of calling arm_smmu_get_cmdq() every time.

This also helps the CMDQV extension in the NVIDIA Grace SoC, whose
driver will maintain its own cmdq pointers and needs to redirect
smmu->cmdq to one of them upon seeing a supported command, by checking
its opcode.

Signed-off-by: Nicolin Chen 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6be20e926f63..188865ec9a33 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -586,11 +586,11 @@ static void arm_smmu_cmdq_poll_valid_map(struct arm_smmu_cmdq *cmdq,
 
 /* Wait for the command queue to become non-full */
 static int arm_smmu_cmdq_poll_until_not_full(struct arm_smmu_device *smmu,
+struct arm_smmu_cmdq *cmdq,
 struct arm_smmu_ll_queue *llq)
 {
unsigned long flags;
struct arm_smmu_queue_poll qp;
-   struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
int ret = 0;
 
/*
@@ -621,11 +621,11 @@ static int arm_smmu_cmdq_poll_until_not_full(struct arm_smmu_device *smmu,
  * Must be called with the cmdq lock held in some capacity.
  */
 static int __arm_smmu_cmdq_poll_until_msi(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq *cmdq,
  struct arm_smmu_ll_queue *llq)
 {
int ret = 0;
struct arm_smmu_queue_poll qp;
-   struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
	u32 *cmd = (u32 *)(Q_ENT(&cmdq->q, llq->prod));
 
	queue_poll_init(smmu, &qp);
@@ -645,10 +645,10 @@ static int __arm_smmu_cmdq_poll_until_msi(struct arm_smmu_device *smmu,
  * Must be called with the cmdq lock held in some capacity.
  */
 static int __arm_smmu_cmdq_poll_until_consumed(struct arm_smmu_device *smmu,
+  struct arm_smmu_cmdq *cmdq,
   struct arm_smmu_ll_queue *llq)
 {
struct arm_smmu_queue_poll qp;
-   struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
u32 prod = llq->prod;
int ret = 0;
 
@@ -695,12 +695,13 @@ static int __arm_smmu_cmdq_poll_until_consumed(struct arm_smmu_device *smmu,
 }
 
 static int arm_smmu_cmdq_poll_until_sync(struct arm_smmu_device *smmu,
+struct arm_smmu_cmdq *cmdq,
 struct arm_smmu_ll_queue *llq)
 {
if (smmu->options & ARM_SMMU_OPT_MSIPOLL)
-   return __arm_smmu_cmdq_poll_until_msi(smmu, llq);
+   return __arm_smmu_cmdq_poll_until_msi(smmu, cmdq, llq);
 
-   return __arm_smmu_cmdq_poll_until_consumed(smmu, llq);
+   return __arm_smmu_cmdq_poll_until_consumed(smmu, cmdq, llq);
 }
 
 static void arm_smmu_cmdq_write_entries(struct arm_smmu_cmdq *cmdq, u64 *cmds,
@@ -757,7 +758,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
 
 	while (!queue_has_space(&llq, n + sync)) {
 		local_irq_restore(flags);
-		if (arm_smmu_cmdq_poll_until_not_full(smmu, &llq))
+		if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
 			dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
local_irq_save(flags);
}
@@ -833,7 +834,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
 	/* 5. If we are inserting a CMD_SYNC, we must wait for it to complete */
 	if (sync) {
 		llq.prod = queue_inc_prod_n(&llq, n);
-		ret = arm_smmu_cmdq_poll_until_sync(smmu, &llq);
+		ret = arm_smmu_cmdq_poll_until_sync(smmu, cmdq, &llq);
 		if (ret) {
 			dev_err_ratelimited(smmu->dev,
					    "CMD_SYNC timeout at 0x%08x [hwprod 0x%08x, hwcons 0x%08x]\n",
-- 
2.17.1



[PATCH v3 1/5] iommu/arm-smmu-v3: Add CS_NONE quirk

2021-11-18 Thread Nicolin Chen via iommu
The CMDQV extension in the NVIDIA Grace SoC only supports CS_NONE in
the CS field of CMD_SYNC. So this patch adds a quirk flag to
accommodate that.

Signed-off-by: Nicolin Chen 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 7 ++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f5848b351b19..e6fee69dd79c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -319,7 +319,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
cmd[1] |= FIELD_PREP(CMDQ_RESUME_1_STAG, ent->resume.stag);
break;
case CMDQ_OP_CMD_SYNC:
-   if (ent->sync.msiaddr) {
+   if (ent->sync.cs_none) {
+			cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_NONE);
+		} else if (ent->sync.msiaddr) {
 			cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_IRQ);
cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
} else {
@@ -356,6 +358,9 @@ static void arm_smmu_cmdq_build_sync_cmd(u64 *cmd, struct arm_smmu_device *smmu,
   q->ent_dwords * 8;
}
 
+   if (q->quirks & CMDQ_QUIRK_SYNC_CS_NONE_ONLY)
+   ent.sync.cs_none = true;
+
	arm_smmu_cmdq_build_cmd(cmd, &ent);
 }
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 4cb136f07914..7a6a6045700d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -499,6 +499,7 @@ struct arm_smmu_cmdq_ent {
 		#define CMDQ_OP_CMD_SYNC	0x46
struct {
u64 msiaddr;
+			bool	cs_none;
} sync;
};
 };
@@ -531,6 +532,9 @@ struct arm_smmu_queue {
 
u32 __iomem *prod_reg;
u32 __iomem *cons_reg;
+
+#define CMDQ_QUIRK_SYNC_CS_NONE_ONLY	BIT(0)	/* CMD_SYNC CS field supports CS_NONE only */
+   u32 quirks;
 };
 
 struct arm_smmu_queue_poll {
-- 
2.17.1



[PATCH v3 0/5] iommu/arm-smmu-v3: Add NVIDIA Grace CMDQ-V Support

2021-11-18 Thread Nicolin Chen via iommu
From: Nicolin Chen 

NVIDIA's Grace SoC has CMDQ-Virtualization (CMDQV) hardware that
extends the standard ARM SMMUv3 to support multiple command queues
with virtualization capabilities. Though this is similar to the ECMDQ
in SMMUv3.3, CMDQV provides additional V-Interfaces that allow VMs to
have their own interfaces and command queues. Compared to the standard
SMMUv3 CMDQ, these queues are able to execute only a limited set of
commands, mainly TLB invalidation commands, when running in guest
mode.

This patch series extends the SMMUv3 driver to support NVIDIA CMDQV
and implements it first for in-kernel use. Upon kernel boot, some of
the VCMDQs will be set up for the kernel driver to use, selecting one
of the command queues based on the CPU currently executing to avoid
lock contention hot spots with a single queue.

Although the HW is able to securely expose the additional V-Interfaces
and command queues to guest VMs for fast TLB invalidations without
a hypervisor trap, due to the ongoing proposal of IOMMUFD [0], we
have to postpone the virtualization support that was available in
v2, as suggested by Alex and Jason [1]. We envision that it will
be added back via IOMMUFD in the months ahead.

Thank you!

[0] https://lore.kernel.org/lkml/20210919063848.1476776-1-yi.l@intel.com/
[1] 
https://lore.kernel.org/kvm/20210831101549.237151fa.alex.william...@redhat.com/T/#ma07dcfce69fa3f9d59e8b16579f694a0e10798d9

Changelog (details available in PATCH)
v2->v3:
 * Dropped VMID and mdev patches to redesign later based on IOMMUFD.
 * Separated HYP_OWN part for guest support into a new patch
 * Added new preparational changes
v1->v2:
 * Added mdev interface support for hypervisor and VMs.
 * Added preparational changes for mdev interface implementation.
 * PATCH-12 Changed ->issue_cmdlist() to ->get_cmdq() for a better
   integration with recently merged ECMDQ-related changes.

Nate Watterson (1):
  iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V

Nicolin Chen (4):
  iommu/arm-smmu-v3: Add CS_NONE quirk
  iommu/arm-smmu-v3: Make arm_smmu_cmdq_init reusable
  iommu/arm-smmu-v3: Pass cmdq pointer in arm_smmu_cmdq_issue_cmdlist()
  iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF

 MAINTAINERS   |   1 +
 drivers/iommu/Kconfig |  12 +
 drivers/iommu/arm/arm-smmu-v3/Makefile|   1 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  53 ++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  48 ++
 .../arm/arm-smmu-v3/nvidia-grace-cmdqv.c  | 446 ++
 6 files changed, 542 insertions(+), 19 deletions(-)
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c

-- 
2.17.1



RE: [PATCH 01/11] iommu: Add device dma ownership set/release interfaces

2021-11-18 Thread Tian, Kevin
> From: Jason Gunthorpe 
> Sent: Thursday, November 18, 2021 9:33 PM
> 
> > In concept a singleton group is different from a
> > multi-devices group which has only one device bound to driver...
> 
> Really? Why? I don't see it that way..
> 
> A singleton group is just a multi-device group that hasn't been
> hotplugged yet.
> 
> We don't seem to have the concept of a "true" singleton group which is
> permanently single due to HW features.
> 
> > This series aims to avoid conflict having both user and kernel drivers
> > mixed in a multi-devices group.

Well, the difference is just literal. I don't know the background
of why the existing iommu_attach_device() users want to do it this
way. But given the condition in iommu_attach_device(), it could
in theory imply some unknown hardware-level side effect which
may break the desired functionality once the group size grows
beyond singleton. Is it a real case? I don't know...

You are now redefining that condition from singleton group to
multi-device group with a single driver bound. As long as there is
no objection from existing driver users, I'm fine with it. But I
still want to raise awareness as it does change the existing
semantics (though it might be considered an imperfect way).

Thanks
Kevin



Re: [PATCH] iommu/vt-d: Fix unmap_pages support

2021-11-18 Thread Lu Baolu

Hi Jerry,

On 11/19/21 3:48 AM, Jerry Snitselaar wrote:
> On Fri, 2021-11-12 at 10:59 +0800, Lu Baolu wrote:
> > Hi Alex,
> > 
> > On 11/11/21 8:32 AM, Alex Williamson wrote:
> > [...]
> > 
> > Thank you for fixing this! I will queue it for v5.16.
> > 
> > Best regards,
> > baolu
> 
> Hi Baolu,
> 
> Do you have an estimate of when this will be submitted?


I will submit all fix patches in my queue to Joerg early the next week.

Best regards,
baolu

Re: [PATCH] iommu/vt-d: Fix unmap_pages support

2021-11-18 Thread Jerry Snitselaar
On Fri, 2021-11-12 at 10:59 +0800, Lu Baolu wrote:
> Hi Alex,
> 
> On 11/11/21 8:32 AM, Alex Williamson wrote:
> > When supporting only the .map and .unmap callbacks of iommu_ops,
> > the IOMMU driver can make assumptions about the size and alignment
> > used for mappings based on the driver provided pgsize_bitmap.  VT-d
> > previously used essentially PAGE_MASK for this bitmap as any power
> > of two mapping was acceptably filled by native page sizes.
> > 
> > However, with the .map_pages and .unmap_pages interface we're now
> > getting page-size and count arguments.  If we simply combine these
> > as (page-size * count) and make use of the previous map/unmap
> > functions internally, any size and alignment assumptions are very
> > different.
> > 
> > As an example, a given vfio device assignment VM will often create
> > a 4MB mapping at IOVA pfn [0x3fe00 - 0x401ff].  On a system that
> > does not support IOMMU super pages, the unmap_pages interface will
> > ask to unmap 1024 4KB pages at the base IOVA. 
> > dma_pte_clear_level()
> > will recurse down to level 2 of the page table where the first half
> > of the pfn range exactly matches the entire pte level.  We clear
> > the
> > pte, increment the pfn by the level size, but (oops) the next pte
> > is
> > on a new page, so we exit the loop and pop back up a level.  When we
> > then update the pfn based on that higher level, we seem to assume
> > that the previous pfn value was at the start of the level.  In this
> > case the level size is 256K pfns, which we add to the base pfn and
> > get a result of 0x7fe00, which is clearly greater than 0x401ff,
> > so we're done.  Meanwhile we never cleared the ptes for the
> > remainder
> > of the range.  When the VM remaps this range, we're overwriting
> > valid
> > ptes and the VT-d driver complains loudly, as reported by the user
> > report linked below.
> > 
> > The fix for this seems relatively simple, if each iteration of the
> > loop in dma_pte_clear_level() is assumed to clear to the end of the
> > level pte page, then our next pfn should be calculated from
> > level_pfn
> > rather than our working pfn.
> > 
> > Fixes: 3f34f1259776 ("iommu/vt-d: Implement map/unmap_pages()
> > iommu_ops callback")
> > Reported-by: Ajay Garg 
> > Link:
> > https://lore.kernel.org/all/20211002124012.18186-1-ajaygargn...@gmail.com/
> > Signed-off-by: Alex Williamson 
> 
> Thank you for fixing this! I will queue it for v5.16.
> 
> Best regards,
> baolu
> 

Hi Baolu,

Do you have an estimate of when this will be submitted?

Regards,
Jerry


> > ---
> >   drivers/iommu/intel/iommu.c |    2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index d75f59ae28e6..f6395f5425f0 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -1249,7 +1249,7 @@ static struct page *dma_pte_clear_level(struct dmar_domain *domain, int level,
> >    freelist);
> > }
> >   next:
> > -   pfn += level_size(level);
> > +   pfn = level_pfn + level_size(level);
> > } while (!first_pte_in_page(++pte) && pfn <= last_pfn);
> >   
> > if (first_pte)
> > 
> > 
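
To make the arithmetic concrete, a worked example with the numbers from
the commit message (illustrative constants, not driver code):

/* Range [0x3fe00, 0x401ff]; back at the level where level_size = 0x40000 */
unsigned long last_pfn  = 0x401ffUL;
unsigned long pfn       = 0x3fe00UL;               /* working pfn */
unsigned long level_pfn = pfn & ~(0x40000UL - 1);  /* 0x00000: level-aligned start */

unsigned long buggy = pfn + 0x40000UL;       /* 0x7fe00 > last_pfn: loop exits early */
unsigned long fixed = level_pfn + 0x40000UL; /* 0x40000 <= last_pfn: loop continues */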

Re: [PATCH 1/2] dt-bindings: Add Arm SMMUv3 PMCG binding

2021-11-18 Thread Robin Murphy

On 2021-11-17 23:19, Rob Herring wrote:
> On Tue, Nov 16, 2021 at 5:52 AM Jean-Philippe Brucker
>  wrote:
>>
>> Add binding for the Arm SMMUv3 PMU. Each node represents a PMCG, and is
>> placed as a sibling node of the SMMU. Although the PMCGs registers may
>> be within the SMMU MMIO region, they are separate devices, and there can
>> be multiple PMCG devices for each SMMU (for example one for the TCU and
>> one for each TBU).
>>
>> Signed-off-by: Jean-Philippe Brucker 
>> ---
>>   .../bindings/iommu/arm,smmu-v3-pmcg.yaml  | 67 +++
>>   1 file changed, 67 insertions(+)
>>   create mode 100644 Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.yaml
>>
>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.yaml
>> new file mode 100644
>> index ..a893e071fdb4
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.yaml
>> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/iommu/arm,smmu-v3-pmcg.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Arm SMMUv3 Performance Monitor Counter Group
>> +
>> +maintainers:
>> +  - Will Deacon 
>> +  - Robin Murphy 
>> +
>> +description: |+
> 
> Don't need '|+' if no formatting to preserve.
> 
>> +  An SMMUv3 may have several Performance Monitor Counter Group (PMCG).
>> +  They are standalone performance monitoring units that support both
>> +  architected and IMPLEMENTATION DEFINED event counters.
> 
> Humm, I don't know that I agree they are standalone. They could be I
> guess, but looking at the MMU-600 spec the PMCG looks like it's just a
> subset of registers in a larger block. This seems similar to MPAM
> (which I'm working on a binding for) where it's just a register map
> and interrupts, but every other possible resource is unspecified by
> the architecture.

They're "standalone" in the sense that they don't have to be part of an
SMMU, they could be part of a PCIe root complex or other SoC device that
couples to an SMMU (e.g. anything that can speak AMBA DTI, in the case
of our SMMU implementations).

In fact our SMMU TBUs are pretty much separate devices themselves, they
just *only* speak DTI, so access to their registers is proxied through
the TCU programming interface.

> The simplest change from this would be just specifying that the PMCG
> is child node(s) of whatever it is part of. The extreme would be this
> is all part of the SMMU binding (i.e. reg entry X is PMCG registers,
> interrupts entry Y is pmu irq).

Being a child of its associated device doesn't seem too bad
semantically, however how would we describe a PMCG as a child of a PCIe
node when its "reg" property still exists in the parent address space
and not PCI config/memory space like any of its siblings? Also in
practical terms, consuming that binding in Linux and getting the things
to probe when it may want to be independent of whether we even
understand the parent node at all could be... unpleasant.

Robin.

>> +
>> +properties:
>> +  $nodename:
>> +    pattern: "^pmu@[0-9a-f]*"
> 
> s/*/+/
> 
> Need at least 1 digit.
> 
>> +  compatible:
>> +    oneOf:
>> +      - items:
>> +          - enum:
>> +              - hisilicon,smmu-v3-pmcg-hip08
>> +          - const: arm,smmu-v3-pmcg
>> +      - const: arm,smmu-v3-pmcg
>> +
>> +  reg:
>> +    description: |
>> +      Base addresses of the PMCG registers. Either a single address for Page 0
>> +      or an additional address for Page 1, where some registers can be
>> +      relocated with SMMU_PMCG_CFGR.RELOC_CTRS.
>> +    minItems: 1
>> +    maxItems: 2
>> +
>> +  interrupts:
>> +    maxItems: 1
>> +
>> +  msi-parent: true
>> +
>> +required:
>> +  - compatible
>> +  - reg
>> +
>> +additionalProperties: false
>> +
>> +examples:
>> +  - |+
>> +    #include 
>> +    #include 
>> +
>> +    pmu@2b42 {
>> +        compatible = "arm,smmu-v3-pmcg";
>> +        reg = <0 0x2b42 0 0x1000>,
>> +              <0 0x2b43 0 0x1000>;
>> +        interrupts = ;
>> +        msi-parent = < 0xff>;
>> +    };
>> +
>> +    pmu@2b44 {
>> +        compatible = "arm,smmu-v3-pmcg";
>> +        reg = <0 0x2b44 0 0x1000>,
>> +              <0 0x2b45 0 0x1000>;
>> +        interrupts = ;
>> +        msi-parent = < 0xff>;
>> +    };
>> --
>> 2.33.1




Re: [PATCH 01/11] iommu: Add device dma ownership set/release interfaces

2021-11-18 Thread Jason Gunthorpe via iommu
On Thu, Nov 18, 2021 at 09:12:41AM +0800, Lu Baolu wrote:
> The existing iommu_attach_device() allows only for singleton group. As
> we have added group ownership attribute, we can enforce this interface
> only for kernel domain usage.

Below is what I came up with.
 - Replace the file * with a simple void *

 - Use owner_count == 0 <-> dma_owner == DMA_OWNER to simplify
the logic and remove levels of indent

 - Add a kernel state DMA_OWNER_PRIVATE_DOMAIN

 - Rename the user state to DMA_OWNER_PRIVATE_DOMAIN_USER

   It differs from the above because it does extra work to keep the
   group isolated that kernel users do no need to do.
 
 - Rename the kernel state to DMA_OWNER_DMA_API to better reflect
   its purpose. Inspired by Robin's point that alot of this is
   indirectly coupled to the domain pointer.

 - Have iommu_attach_device() atomically swap from DMA_OWNER_DMA_API
   to DMA_OWNER_PRIVATE_DOMAIN - replaces the group size check.

When we figure out tegra we can add an WARN_ON to iommu_attach_group()
that dma_owner != DMA_OWNER_NONE || DMA_OWNER_DMA_API

Then the whole thing makes some general sense..

Jason

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 064d0679906afd..4cafe074775e30 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -49,7 +49,7 @@ struct iommu_group {
struct list_head entry;
enum iommu_dma_owner dma_owner;
refcount_t owner_cnt;
-   struct file *owner_user_file;
+   void *owner_cookie;
 };
 
 struct group_device {
@@ -1937,12 +1937,18 @@ int iommu_attach_device(struct iommu_domain *domain, 
struct device *dev)
 * change while we are attaching
 */
	mutex_lock(&group->mutex);
-	ret = -EINVAL;
-	if (iommu_group_device_count(group) != 1)
+	if (group->dma_owner != DMA_OWNER_DMA_API ||
+	    refcount_read(&group->owner_cnt) != 1) {
+   ret = -EBUSY;
goto out_unlock;
+   }
 
ret = __iommu_attach_group(domain, group);
+   if (ret)
+   goto out_unlock;
 
+   group->dma_owner = DMA_OWNER_PRIVATE_DOMAIN;
+   group->owner_cookie = domain;
 out_unlock:
	mutex_unlock(&group->mutex);
iommu_group_put(group);
@@ -2193,14 +2199,11 @@ void iommu_detach_device(struct iommu_domain *domain, 
struct device *dev)
return;
 
	mutex_lock(&group->mutex);
-	if (iommu_group_device_count(group) != 1) {
-		WARN_ON(1);
-		goto out_unlock;
-	}
-
+	WARN_ON(group->dma_owner != DMA_OWNER_PRIVATE_DOMAIN ||
+		refcount_read(&group->owner_cnt) != 1 ||
+		group->owner_cookie != domain);
+	group->dma_owner = DMA_OWNER_DMA_API;
 	__iommu_detach_group(domain, group);
-
-out_unlock:
 	mutex_unlock(&group->mutex);
iommu_group_put(group);
 }
@@ -3292,44 +3295,33 @@ static ssize_t iommu_group_store_type(struct 
iommu_group *group,
 
 static int __iommu_group_set_dma_owner(struct iommu_group *group,
   enum iommu_dma_owner owner,
-  struct file *user_file)
+  void *owner_cookie)
 {
-   if (group->dma_owner != DMA_OWNER_NONE && group->dma_owner != owner)
-   return -EBUSY;
-
-   if (owner == DMA_OWNER_USER) {
-   if (!user_file)
-   return -EINVAL;
-
-		if (group->owner_user_file && group->owner_user_file != user_file)
-			return -EPERM;
+	if (refcount_inc_not_zero(&group->owner_cnt)) {
+		if (group->dma_owner != owner ||
+		    group->owner_cookie != owner_cookie) {
+			refcount_dec(&group->owner_cnt);
+   return -EBUSY;
+   }
+   return 0;
}
 
-	if (!refcount_inc_not_zero(&group->owner_cnt)) {
-		group->dma_owner = owner;
-		refcount_set(&group->owner_cnt, 1);
-
-   if (owner == DMA_OWNER_USER) {
-   /*
-* The UNMANAGED domain shouldn't be attached before
-* claiming the USER ownership for the first time.
-*/
-   if (group->domain) {
-   if (group->domain != group->default_domain) {
-   group->dma_owner = DMA_OWNER_NONE;
-				refcount_set(&group->owner_cnt, 0);
-
-   return -EBUSY;
-   }
-
-   __iommu_detach_group(group->domain, group);
-   }
-
-   get_file(user_file);
-   group->owner_user_file = user_file;
+   /*
+* We must ensure that any device DMAs issued after this call
+* are discarded. DMAs can only reach real memory once someone
+* has attached a real domain.
+  
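
As a toy model of the claim rules in the bullets above (illustrative
only, not the kernel code): the first claimant records the owner type
and cookie; any later claimant must match both.

enum dma_owner { OWNER_NONE, OWNER_DMA_API,
		 OWNER_PRIVATE_DOMAIN, OWNER_PRIVATE_DOMAIN_USER };

struct group_state {
	enum dma_owner owner;	/* mirrors group->dma_owner */
	void *cookie;		/* mirrors group->owner_cookie */
	unsigned int cnt;	/* mirrors group->owner_cnt */
};

static int claim_ownership(struct group_state *g,
			   enum dma_owner owner, void *cookie)
{
	if (g->cnt) {
		/* an existing claim must match both owner and cookie */
		if (g->owner != owner || g->cookie != cookie)
			return -EBUSY;
		g->cnt++;
		return 0;
	}
	g->owner = owner;
	g->cookie = cookie;
	g->cnt = 1;
	return 0;
}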

Re: [PATCH 01/11] iommu: Add device dma ownership set/release interfaces

2021-11-18 Thread Jason Gunthorpe via iommu
On Thu, Nov 18, 2021 at 02:39:45AM +, Tian, Kevin wrote:
> > From: Jason Gunthorpe 
> > Sent: Tuesday, November 16, 2021 9:46 PM
> > 
> > On Tue, Nov 16, 2021 at 09:57:30AM +0800, Lu Baolu wrote:
> > > Hi Christoph,
> > >
> > > On 11/15/21 9:14 PM, Christoph Hellwig wrote:
> > > > On Mon, Nov 15, 2021 at 10:05:42AM +0800, Lu Baolu wrote:
> > > > > +enum iommu_dma_owner {
> > > > > + DMA_OWNER_NONE,
> > > > > + DMA_OWNER_KERNEL,
> > > > > + DMA_OWNER_USER,
> > > > > +};
> > > > > +
> > > >
> > > > > + enum iommu_dma_owner dma_owner;
> > > > > + refcount_t owner_cnt;
> > > > > + struct file *owner_user_file;
> > > >
> > > > I'd just overload the ownership into owner_user_file,
> > > >
> > > >   NULL  -> no owner
> > > >   (struct file *)1UL)   -> kernel
> > > >   real pointer  -> user
> > > >
> > > > Which could simplify a lot of the code dealing with the owner.
> > > >
> > >
> > > Yeah! Sounds reasonable. I will make this in the next version.
> > 
> > It would be good to figure out how to make iommu_attach_device()
> > enforce no other driver binding as a kernel user without a file *, as
> > Robin pointed to, before optimizing this.
> > 
> > This fixes an existing bug where iommu_attach_device() only checks the
> > group size and is vunerable to a hot plug increasing the group size
> > after it returns. That check should be replaced by this series's logic
> > instead.
> > 
> 
> I think this existing bug in iommu_attach_device() is different from 
> what this series is attempting to solve. To avoid breaking singleton
> group assumption there the ideal band-aid is to fail device hotplug.
> Otherwise some IOVA ranges which are supposed to go upstream 
> to IOMMU may be considered as p2p and routed to the hotplugged
> device instead.

Yes, but the instability of the reserved regions during hotplug with
!ACS seems like an entirely different problem. It affects everything,
including VFIO, and multi-device groups. Certainly it is nothing to do
with this series.

> In concept a singleton group is different from a
> multi-devices group which has only one device bound to driver...

Really? Why? I don't see it that way..

A singleton group is just a multi-device group that hasn't been
hotplugged yet.

We don't seem to have the concept of a "true" singleton group which is
permanently single due to HW features.

> This series aims to avoid conflict having both user and kernel drivers
> mixed in a multi-devices group.

I see this series about bringing order to all the places that want to
use a non-default domain - in-kernel or user doesn't really matter.

ie why shouldn't iommu_attach_device() work in a group that has a PCI
bridge, just like VFIO does?

The only thing that is special about VFIO vs a kernel driver is we
want a little help to track userspace ownership and VFIO opens
userspace to do the P2P attack.

The way I see it the num device == 1 test in iommu_attach_device() is
an imperfect way of controlling driver binding, and we can do better
by using the mechanism in this series.

Jason


Re: [PATCH v2 2/6] hwtracing: Add trace function support for HiSilicon PCIe Tune and Trace device

2021-11-18 Thread Yicong Yang via iommu
Hi Robin,

On 2021/11/16 19:37, Yicong Yang wrote:
> On 2021/11/16 18:56, Robin Murphy wrote:
>> On 2021-11-16 09:06, Yicong Yang via iommu wrote:
>> [...]
>>> +/*
>>> + * Get RMR address if provided by the firmware.
>>> + * Return 0 if the IOMMU doesn't present or the policy of the
>>> + * IOMMU domain is passthrough or we get a usable RMR region.
>>> + * Otherwise a negative value is returned.
>>> + */
>>> +static int hisi_ptt_get_rmr(struct hisi_ptt *hisi_ptt)
>>> +{
>>> +    struct pci_dev *pdev = hisi_ptt->pdev;
>>> +    struct iommu_domain *iommu_domain;
>>> +    struct iommu_resv_region *region;
>>> +    LIST_HEAD(list);
>>> +
>>> +    /*
>>> + * Use direct DMA if IOMMU does not present or the policy of the
>>> + * IOMMU domain is passthrough.
>>> + */
>>> +    iommu_domain = iommu_get_domain_for_dev(&pdev->dev);
>>> +    if (!iommu_domain || iommu_domain->type == IOMMU_DOMAIN_IDENTITY)
>>> +    return 0;
>>> +
>>> +    iommu_get_resv_regions(&pdev->dev, &list);
>>> +    list_for_each_entry(region, &list, list)
>>> +    if (region->type == IOMMU_RESV_DIRECT &&
>>> +    region->length >= HISI_PTT_TRACE_BUFFER_SIZE) {
>>> +    hisi_ptt->trace_ctrl.has_rmr = true;
>>> +    hisi_ptt->trace_ctrl.rmr_addr = region->start;
>>> +    hisi_ptt->trace_ctrl.rmr_length = region->length;
>>> +    break;
>>> +    }
>>> +
>>> +    iommu_put_resv_regions(&pdev->dev, &list);
>>> +    return hisi_ptt->trace_ctrl.has_rmr ? 0 : -ENOMEM;
>>> +}
>>
>> No.
>>
>> The whole point of RMRs is for devices that are already configured to access 
>> the given address range in a manner beyond the kernel's control. If you can 
>> do this, it proves that you should not have an RMR in the first place.
>>
>> The notion of a kernel driver explicitly configuring its device to DMA into 
>> any random RMR that looks big enough is so egregiously wrong that I'm almost 
>> lost for words...
>>
> 
> our bios will reserve such a region and reported it through iort. the device 
> will write to the region and in the driver we need to access the region
> to get the traced data. the region is reserved exclusively and will not be 
> accessed by kernel or other devices.
> 
> is it ok to let bios configure the address to the device and from CPU side we 
> just read it?
> 

Any suggestions? Is this still a concern if we move the configuration
of the device address to the BIOS and just read it from the CPU side?

> 
> 
> .
> 