Re: [PATCH v12 01/13] vfio: VFIO_IOMMU_SET_PASID_TABLE

2021-04-07 Thread Zenghui Yu

Hi Eric,

On 2021/2/24 5:06, Eric Auger wrote:

+/*
+ * VFIO_IOMMU_SET_PASID_TABLE - _IOWR(VFIO_TYPE, VFIO_BASE + 18,
+ * struct vfio_iommu_type1_set_pasid_table)
+ *
+ * The SET operation passes a PASID table to the host while the
+ * UNSET operation detaches the one currently programmed. Setting
+ * a table while another is already programmed replaces the old table.


It looks to me that this description doesn't match the IOMMU part.

[v14,05/13] iommu/smmuv3: Implement attach/detach_pasid_table

|	case IOMMU_PASID_CONFIG_TRANSLATE:
|		/* we do not support S1 <-> S1 transitions */
|		if (smmu_domain->s1_cfg.set)
|			goto out;

Maybe I've misread something?
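
If the replace semantics described here are really intended, the driver
side would presumably have to detach the current table before installing
the new one. A rough sketch, reusing the detach helper from the same
series:

	case IOMMU_PASID_CONFIG_TRANSLATE:
		/* hypothetical: allow S1 -> S1 by detaching the old table first */
		if (smmu_domain->s1_cfg.set)
			arm_smmu_detach_pasid_table(domain);
		...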


Thanks,
Zenghui


[PATCH] iommu/arm-smmu-v3: Remove the unused fields for PREFETCH_CONFIG command

2021-04-07 Thread Zenghui Yu
Per the SMMUv3 spec, the PREFETCH_CONFIG command has no Size and Addr
fields, and they're not used by the driver. Remove them.

We can add them back if we're going to use PREFETCH_ADDR in the future.

Signed-off-by: Zenghui Yu 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 --
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 --
 2 files changed, 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8594b4a83043..610c9f4b7789 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -245,8 +245,6 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
break;
case CMDQ_OP_PREFETCH_CFG:
cmd[0] |= FIELD_PREP(CMDQ_PREFETCH_0_SID, ent->prefetch.sid);
-   cmd[1] |= FIELD_PREP(CMDQ_PREFETCH_1_SIZE, ent->prefetch.size);
-   cmd[1] |= ent->prefetch.addr & CMDQ_PREFETCH_1_ADDR_MASK;
break;
case CMDQ_OP_CFGI_CD:
cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SSID, ent->cfgi.ssid);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index f985817c967a..83726b6a1cba 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -410,8 +410,6 @@ struct arm_smmu_cmdq_ent {
#define CMDQ_OP_PREFETCH_CFG0x1
struct {
u32 sid;
-   u8  size;
-   u64 addr;
} prefetch;
 
#define CMDQ_OP_CFGI_STE0x3
-- 
2.19.1



Re: [PATCH v14 08/13] dma-iommu: Implement NESTED_MSI cookie

2021-04-07 Thread Zenghui Yu

Hi Eric,

On 2021/2/24 4:56, Eric Auger wrote:

Up to now, when the type was UNMANAGED, we used to
allocate IOVA pages within a reserved IOVA MSI range.

If both the host and the guest are exposed with SMMUs, each
would allocate an IOVA. The guest allocates an IOVA (gIOVA)
to map onto the guest MSI doorbell (gDB). The Host allocates
another IOVA (hIOVA) to map onto the physical doorbell (hDB).

So we end up with 2 unrelated mappings, at S1 and S2:
         S1             S2
gIOVA -> gDB
                hIOVA -> hDB

The PCI device would be programmed with hIOVA.
No stage 1 mapping would exist, causing the MSIs to fault.

iommu_dma_bind_guest_msi() allows the gIOVA/gDB binding to be
passed to the host, so that the host can reuse gIOVA instead
of allocating a new hIOVA.

         S1        S2
gIOVA -> gDB  ->  hDB

This time, the PCI device can be programmed with the gIOVA MSI
doorbell which is correctly mapped through both stages.

Nested mode is not compatible with HW MSI regions as in that
case gDB and hDB should have a 1-1 mapping. This check will
be done when attaching each device to the IOMMU domain.

Signed-off-by: Eric Auger 


[...]


diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f659395e7959..d25eb7cecaa7 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -19,6 +19,7 @@
  #include 
  #include 
  #include 
+#include 


Duplicated include.


  #include 
  #include 
  #include 
@@ -29,12 +30,15 @@
  struct iommu_dma_msi_page {
struct list_headlist;
dma_addr_t  iova;
+   dma_addr_t  gpa;
phys_addr_t phys;
+   size_t  s1_granule;
  };
  
  enum iommu_dma_cookie_type {

IOMMU_DMA_IOVA_COOKIE,
IOMMU_DMA_MSI_COOKIE,
+   IOMMU_DMA_NESTED_MSI_COOKIE,
  };
  
  struct iommu_dma_cookie {

@@ -46,6 +50,7 @@ struct iommu_dma_cookie {
dma_addr_t  msi_iova;


msi_iova is unused in the nested mode, but we still set it to the start
address of the RESV_SW_MSI region (in iommu_get_msi_cookie()), which
looks a bit strange to me.
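
A trivial way to avoid that (sketch) would be to seed msi_iova only for
the plain MSI cookie:

	if (cookie->type == IOMMU_DMA_MSI_COOKIE)
		cookie->msi_iova = base;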


};
struct list_headmsi_page_list;
+   spinlock_t  msi_lock;


Should msi_lock be grabbed everywhere msi_page_list is populated?
Especially in iommu_dma_get_msi_page(), which can be invoked from the
irqchip driver.
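
For instance, the lookup in iommu_dma_get_msi_page() would presumably
need something like this (only a sketch; the allocation/mapping path and
the re-take of the lock for the list insertion are elided):

static struct iommu_dma_msi_page *
iommu_dma_get_msi_page(struct device *dev, phys_addr_t msi_addr,
		       struct iommu_domain *domain)
{
	struct iommu_dma_cookie *cookie = domain->iova_cookie;
	struct iommu_dma_msi_page *msi_page;

	spin_lock(&cookie->msi_lock);
	list_for_each_entry(msi_page, &cookie->msi_page_list, list) {
		if (msi_page->phys == msi_addr) {
			spin_unlock(&cookie->msi_lock);
			return msi_page;
		}
	}
	spin_unlock(&cookie->msi_lock);

	/*
	 * ... allocate and map a new page, then retake msi_lock around
	 * list_add(&msi_page->list, &cookie->msi_page_list) ...
	 */
	return NULL;
}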

  
  	/* Domain for flush queue callback; NULL if flush queue not in use */

struct iommu_domain *fq_domain;
@@ -87,6 +92,7 @@ static struct iommu_dma_cookie *cookie_alloc(enum iommu_dma_cookie_type type)
  
  	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);

if (cookie) {
+		spin_lock_init(&cookie->msi_lock);
		INIT_LIST_HEAD(&cookie->msi_page_list);
cookie->type = type;
}
@@ -120,14 +126,17 @@ EXPORT_SYMBOL(iommu_get_dma_cookie);
   *
   * Users who manage their own IOVA allocation and do not want DMA API support,
   * but would still like to take advantage of automatic MSI remapping, can use
- * this to initialise their own domain appropriately. Users should reserve a
+ * this to initialise their own domain appropriately. Users may reserve a
   * contiguous IOVA region, starting at @base, large enough to accommodate the
   * number of PAGE_SIZE mappings necessary to cover every MSI doorbell address
- * used by the devices attached to @domain.
+ * used by the devices attached to @domain. The other way round is to provide
+ * usable iova pages through the iommu_dma_bind_doorbell API (nested stages


s/iommu_dma_bind_doorbell/iommu_dma_bind_guest_msi/ ?


+ * use case)
   */
  int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base)
  {
struct iommu_dma_cookie *cookie;
+   int nesting, ret;
  
  	if (domain->type != IOMMU_DOMAIN_UNMANAGED)

return -EINVAL;
@@ -135,7 +144,12 @@ int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base)
if (domain->iova_cookie)
return -EEXIST;
  
-	cookie = cookie_alloc(IOMMU_DMA_MSI_COOKIE);

+	ret =  iommu_domain_get_attr(domain, DOMAIN_ATTR_NESTING, &nesting);


Redundant space.


+   if (!ret && nesting)
+   cookie = cookie_alloc(IOMMU_DMA_NESTED_MSI_COOKIE);
+   else
+   cookie = cookie_alloc(IOMMU_DMA_MSI_COOKIE);
+
if (!cookie)
return -ENOMEM;
  
@@ -156,6 +170,7 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)

  {
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iommu_dma_msi_page *msi, *tmp;
+   bool s2_unmap = false;
  
  	if (!cookie)

return;
@@ -163,7 +178,15 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
if (cookie->type == IOMMU_DMA_IOVA_COOKIE && cookie->iovad.granule)
	put_iova_domain(&cookie->iovad);
  
+	if (cookie->type == IOMMU_DMA_NESTED_MSI_COOKIE)
+		s2_unmap = true;
+
 

Re: [PATCH v14 07/13] iommu/smmuv3: Implement cache_invalidate

2021-04-01 Thread Zenghui Yu

Hi Eric,

On 2021/2/24 4:56, Eric Auger wrote:

+static int
+arm_smmu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
+ struct iommu_cache_invalidate_info *inv_info)
+{
+   struct arm_smmu_cmdq_ent cmd = {.opcode = CMDQ_OP_TLBI_NSNH_ALL};
+   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+   if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
+   return -EINVAL;
+
+   if (!smmu)
+   return -EINVAL;
+
+   if (inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
+   return -EINVAL;
+
+   if (inv_info->cache & IOMMU_CACHE_INV_TYPE_PASID ||


I didn't find any code that emulates the CFGI_CD{_ALL} commands for the
guest and invalidates the stale CD entries on the physical side. Is the
PASID-cache invalidation type designed for that purpose? (sketch below)


+   inv_info->cache & IOMMU_CACHE_INV_TYPE_DEV_IOTLB) {
+   return -ENOENT;
+   }
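
Something along these lines is what I would have expected for the
PASID-cache case (hypothetical sketch; the per-master SID walk and the
devices_lock are elided):

	if (inv_info->cache & IOMMU_CACHE_INV_TYPE_PASID) {
		struct arm_smmu_cmdq_ent cd_cmd = {
			.opcode	= CMDQ_OP_CFGI_CD,
			.cfgi	= {
				.ssid	= inv_info->granu.pasid_info.pasid,
				.leaf	= true,
			},
		};

		cd_cmd.cfgi.sid = sid;	/* for each SID behind this domain */
		arm_smmu_cmdq_issue_cmd(smmu, &cd_cmd);
		arm_smmu_cmdq_issue_sync(smmu);
	}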
+
+   if (!(inv_info->cache & IOMMU_CACHE_INV_TYPE_IOTLB))
+   return -EINVAL;
+
+   /* IOTLB invalidation */
+
+   switch (inv_info->granularity) {
+   case IOMMU_INV_GRANU_PASID:
+   {
+		struct iommu_inv_pasid_info *info =
+			&inv_info->granu.pasid_info;
+
+   if (info->flags & IOMMU_INV_ADDR_FLAGS_PASID)
+   return -ENOENT;
+   if (!(info->flags & IOMMU_INV_PASID_FLAGS_ARCHID))
+   return -EINVAL;
+
+   __arm_smmu_tlb_inv_context(smmu_domain, info->archid);
+   return 0;
+   }
+   case IOMMU_INV_GRANU_ADDR:
+   {
+		struct iommu_inv_addr_info *info = &inv_info->granu.addr_info;
+   size_t size = info->nb_granules * info->granule_size;
+   bool leaf = info->flags & IOMMU_INV_ADDR_FLAGS_LEAF;
+
+   if (info->flags & IOMMU_INV_ADDR_FLAGS_PASID)
+   return -ENOENT;
+
+   if (!(info->flags & IOMMU_INV_ADDR_FLAGS_ARCHID))
+   break;
+
+   arm_smmu_tlb_inv_range_domain(info->addr, size,
+ info->granule_size, leaf,
+ info->archid, smmu_domain);
+
+   arm_smmu_cmdq_issue_sync(smmu);


There is no need to issue one more SYNC here:
arm_smmu_tlb_inv_range_domain() already ends with a CMD_SYNC when the
batched TLBI commands are submitted.


Re: [PATCH v14 13/13] iommu/smmuv3: Accept configs with more than one context descriptor

2021-03-30 Thread Zenghui Yu

Hi Eric,

On 2021/2/24 4:56, Eric Auger wrote:

In preparation for vSVA, let's accept userspace-provided configs
with more than one CD. We check the max number of CDs against the
host SMMU capability, and also the format (linear versus 2-level).

Signed-off-by: Eric Auger 
Signed-off-by: Shameer Kolothum 
---
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 13 -
  1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 332d31c0680f..ab74a0289893 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3038,14 +3038,17 @@ static int arm_smmu_attach_pasid_table(struct iommu_domain *domain,
if (smmu_domain->s1_cfg.set)
goto out;
  
-		/*
-		 * we currently support a single CD so s1fmt and s1dss
-		 * fields are also ignored
-		 */
-		if (cfg->pasid_bits)
+		list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+			if (cfg->pasid_bits > master->ssid_bits)
+				goto out;
+		}
+		if (cfg->vendor_data.smmuv3.s1fmt == STRTAB_STE_0_S1FMT_64K_L2 &&
+		    !(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB))
 			goto out;
  
 		smmu_domain->s1_cfg.cdcfg.cdtab_dma = cfg->base_ptr;
+		smmu_domain->s1_cfg.s1cdmax = cfg->pasid_bits;
+		smmu_domain->s1_cfg.s1fmt = cfg->vendor_data.smmuv3.s1fmt;


And what about the S1DSS field?
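
If it is meant to be forwarded as well, I'd expect something like the
line below (hypothetical: struct iommu_pasid_smmuv3 already carries an
s1dss field, but s1_cfg would need a new member to hold it):

	smmu_domain->s1_cfg.s1dss = cfg->vendor_data.smmuv3.s1dss;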


Re: [PATCH v14 06/13] iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs

2021-03-30 Thread Zenghui Yu

On 2021/2/24 4:56, Eric Auger wrote:

@@ -1936,7 +1950,12 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
},
};
  
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+	if (ext_asid >= 0) {  /* guest stage 1 invalidation */
+		cmd.opcode	= smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
+				  CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA;


If I understand it correctly, the true nested mode effectively gives us
a *NS-EL1* StreamWorld. We should therefore use CMDQ_OP_TLBI_NH_VA to
invalidate the stage-1 NS-EL1 entries, no matter whether E2H is selected.
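
In other words, something like (sketch):

	if (ext_asid >= 0) {	/* guest stage 1 invalidation */
		/*
		 * Guest S1 entries are tagged NS-EL1 once nesting is
		 * enabled, so never pick the EL2 variant here.
		 */
		cmd.opcode	= CMDQ_OP_TLBI_NH_VA;
		cmd.tlbi.asid	= ext_asid;
		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
	}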


[PATCH] iommu/vt-d: Fix status code for Allocate/Free PASID command

2021-02-26 Thread Zenghui Yu
As per Intel vt-d spec, Rev 3.0 (section 10.4.45 "Virtual Command Response
Register"), the status code of "No PASID available" error in response to
the Allocate PASID command is 2, not 1. The same for "Invalid PASID" error
in response to the Free PASID command.

We will otherwise see a confusing kernel log when the command fails on
the guest side. Fix it.

Fixes: 24f27d32ab6b ("iommu/vt-d: Enlightened PASID allocation")
Signed-off-by: Zenghui Yu 
---
 drivers/iommu/intel/pasid.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index 97dfcffbf495..444c0bec221a 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -30,8 +30,8 @@
 #define VCMD_VRSP_IP   0x1
 #define VCMD_VRSP_SC(e)(((e) >> 1) & 0x3)
 #define VCMD_VRSP_SC_SUCCESS   0
-#define VCMD_VRSP_SC_NO_PASID_AVAIL1
-#define VCMD_VRSP_SC_INVALID_PASID 1
+#define VCMD_VRSP_SC_NO_PASID_AVAIL2
+#define VCMD_VRSP_SC_INVALID_PASID 2
 #define VCMD_VRSP_RESULT_PASID(e)  (((e) >> 8) & 0xf)
 #define VCMD_CMD_OPERAND(e)((e) << 8)
 /*
-- 
2.19.1



[PATCH] dma-debug: Delete outdated comment of the hash function

2020-12-29 Thread Zenghui Yu
We actually use dev_addr[26:13] as the index into dma_entry_hash. Given
that the code itself is clear enough, let's drop the hardcoded comment so
that we won't need to revisit it every time HASH_FN_{SHIFT,MASK} gets
updated.
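
For reference, with the current constants (assuming HASH_SIZE is 16384
and HASH_FN_SHIFT is 13 in kernel/dma/debug.c), the index works out as:

	index = (dev_addr >> HASH_FN_SHIFT) & HASH_FN_MASK
	      = (dev_addr >> 13) & (16384 - 1)	/* dev_addr[26:13] */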

Signed-off-by: Zenghui Yu 
---
 kernel/dma/debug.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 14de1271463f..d89341162d35 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -235,10 +235,7 @@ static bool driver_filter(struct device *dev)
  */
 static int hash_fn(struct dma_debug_entry *entry)
 {
-   /*
-* Hash function is based on the dma address.
-* We use bits 20-27 here as the index into the hash
-*/
+   /* Hash function is based on the dma address. */
return (entry->dev_addr >> HASH_FN_SHIFT) & HASH_FN_MASK;
 }
 
-- 
2.19.1



Re: [PATCH v10 11/11] vfio: Document nested stage control

2020-09-24 Thread Zenghui Yu

Hi Eric,

On 2020/3/21 0:19, Eric Auger wrote:

The VFIO API was enhanced to support nested stage control: a bunch of
new ioctls, one DMA FAULT region and an associated specific IRQ.

Let's document the process to follow to set up nested mode.

Signed-off-by: Eric Auger 


[...]


+The userspace must be prepared to receive faults. The VFIO-PCI device
+exposes one dedicated DMA FAULT region: it contains a ring buffer and
+a header that allows managing the head/tail indices. The region is
+identified by the following index/subindex:
+- VFIO_REGION_TYPE_NESTED/VFIO_REGION_SUBTYPE_NESTED_DMA_FAULT
+
+The DMA FAULT region exposes a VFIO_REGION_INFO_CAP_PRODUCER_FAULT
+region capability that allows the userspace to retrieve the ABI version
+of the fault records filled by the host.


Nit: I don't see this capability in the code; patch 04/11 registers
VFIO_REGION_INFO_CAP_DMA_FAULT instead.


Thanks,
Zenghui


Re: [PATCH v10 05/11] vfio/pci: Register an iommu fault handler

2020-09-24 Thread Zenghui Yu

Hi Eric,

On 2020/3/21 0:19, Eric Auger wrote:

Register an IOMMU fault handler which records faults in
the DMA FAULT region ring buffer. In a subsequent patch, we
will add the signaling of a specific eventfd to allow the
userspace to be notified whenever a new fault has shown up.

Signed-off-by: Eric Auger 


[...]


diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 586b89debed5..69595c240baf 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -27,6 +27,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include "vfio_pci_private.h"
  
@@ -283,6 +284,38 @@ static const struct vfio_pci_regops vfio_pci_dma_fault_regops = {

.add_capability = vfio_pci_dma_fault_add_capability,
  };
  
+int vfio_pci_iommu_dev_fault_handler(struct iommu_fault *fault, void *data)
+{
+	struct vfio_pci_device *vdev = (struct vfio_pci_device *)data;
+	struct vfio_region_dma_fault *reg =
+		(struct vfio_region_dma_fault *)vdev->fault_pages;
+	struct iommu_fault *new =
+		(struct iommu_fault *)(vdev->fault_pages + reg->offset +
+		reg->head * reg->entry_size);


Shouldn't 'reg->head' be protected under the fault_queue_lock? Otherwise
things may change behind our backs...

We shouldn't make any assumptions about how the IOMMU driver reports
faults (serially or in parallel), I think.
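
I'd expect the slot pointer to be computed only once the lock is held,
e.g. (sketch, reusing the function's locals):

	mutex_lock(&vdev->fault_queue_lock);

	head = reg->head;
	tail = reg->tail;
	size = reg->nb_entries;

	if (CIRC_SPACE(head, tail, size) < 1) {
		ret = -ENOSPC;
		goto unlock;
	}

	new = (struct iommu_fault *)(vdev->fault_pages + reg->offset +
				     head * reg->entry_size);
	*new = *fault;
	reg->head = (head + 1) % size;
unlock:
	mutex_unlock(&vdev->fault_queue_lock);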


+   int head, tail, size;
+   int ret = 0;
+
+   if (fault->type != IOMMU_FAULT_DMA_UNRECOV)
+   return -ENOENT;
+
+	mutex_lock(&vdev->fault_queue_lock);
+
+   head = reg->head;
+   tail = reg->tail;
+   size = reg->nb_entries;
+
+   if (CIRC_SPACE(head, tail, size) < 1) {
+   ret = -ENOSPC;
+   goto unlock;
+   }
+
+   *new = *fault;
+   reg->head = (head + 1) % size;
+unlock:
+	mutex_unlock(&vdev->fault_queue_lock);
+   return ret;
+}
+
  #define DMA_FAULT_RING_LENGTH 512
  
  static int vfio_pci_init_dma_fault_region(struct vfio_pci_device *vdev)

@@ -317,6 +350,13 @@ static int vfio_pci_init_dma_fault_region(struct vfio_pci_device *vdev)
header->entry_size = sizeof(struct iommu_fault);
header->nb_entries = DMA_FAULT_RING_LENGTH;
header->offset = sizeof(struct vfio_region_dma_fault);
+
+	ret = iommu_register_device_fault_handler(&vdev->pdev->dev,
+					vfio_pci_iommu_dev_fault_handler,
+					vdev);
+   if (ret)
+   goto out;
+
return 0;
  out:
kfree(vdev->fault_pages);



Thanks,
Zenghui


Re: [PATCH v10 04/11] vfio/pci: Add VFIO_REGION_TYPE_NESTED region type

2020-09-24 Thread Zenghui Yu

Hi Eric,

On 2020/3/21 0:19, Eric Auger wrote:

Add a new specific DMA_FAULT region aiming to expose nested mode
translation faults.

The region has a ring buffer that contains the actual fault
records plus a header allowing it to be handled (tail/head indices,
max capacity, entry size). At the moment the region is dimensioned
for 512 fault records.

Signed-off-by: Eric Auger 


[...]


diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 379a02c36e37..586b89debed5 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -260,6 +260,69 @@ int vfio_pci_set_power_state(struct vfio_pci_device *vdev, pci_power_t state)
return ret;
  }
  
+static void vfio_pci_dma_fault_release(struct vfio_pci_device *vdev,
+				       struct vfio_pci_region *region)
+{
+}
+
+static int vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev,
+struct vfio_pci_region *region,
+struct vfio_info_cap *caps)
+{
+   struct vfio_region_info_cap_fault cap = {
+   .header.id = VFIO_REGION_INFO_CAP_DMA_FAULT,
+   .header.version = 1,
+   .version = 1,
+   };
+	return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
+}
+
+static const struct vfio_pci_regops vfio_pci_dma_fault_regops = {
+   .rw = vfio_pci_dma_fault_rw,
+   .release= vfio_pci_dma_fault_release,
+   .add_capability = vfio_pci_dma_fault_add_capability,
+};
+
+#define DMA_FAULT_RING_LENGTH 512
+
+static int vfio_pci_init_dma_fault_region(struct vfio_pci_device *vdev)
+{
+   struct vfio_region_dma_fault *header;
+   size_t size;
+   int ret;
+
+	mutex_init(&vdev->fault_queue_lock);
+
+   /*
+* We provision 1 page for the header and space for
+* DMA_FAULT_RING_LENGTH fault records in the ring buffer.
+*/
+	size = ALIGN(sizeof(struct iommu_fault) *
+		     DMA_FAULT_RING_LENGTH, PAGE_SIZE) + PAGE_SIZE;
+
+   vdev->fault_pages = kzalloc(size, GFP_KERNEL);
+   if (!vdev->fault_pages)
+   return -ENOMEM;
+
+   ret = vfio_pci_register_dev_region(vdev,
+   VFIO_REGION_TYPE_NESTED,
+   VFIO_REGION_SUBTYPE_NESTED_DMA_FAULT,
+		&vfio_pci_dma_fault_regops, size,
+   VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_WRITE,
+   vdev->fault_pages);
+   if (ret)
+   goto out;
+
+   header = (struct vfio_region_dma_fault *)vdev->fault_pages;
+   header->entry_size = sizeof(struct iommu_fault);
+   header->nb_entries = DMA_FAULT_RING_LENGTH;
+   header->offset = sizeof(struct vfio_region_dma_fault);
+   return 0;
+out:
+   kfree(vdev->fault_pages);
+   return ret;
+}
+
  static int vfio_pci_enable(struct vfio_pci_device *vdev)
  {
struct pci_dev *pdev = vdev->pdev;
@@ -358,6 +421,10 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev)
}
}
  
+	ret = vfio_pci_init_dma_fault_region(vdev);
+	if (ret)
+		goto disable_exit;
+
vfio_pci_probe_mmaps(vdev);
  
  	return 0;

@@ -1383,6 +1450,7 @@ static void vfio_pci_remove(struct pci_dev *pdev)
  
	vfio_iommu_group_put(pdev->dev.iommu_group, &pdev->dev);

	kfree(vdev->region);
+	kfree(vdev->fault_pages);
	mutex_destroy(&vdev->ioeventfds_lock);
  
  	if (!disable_idle_d3)

diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 8a2c7607d513..a392f50e3a99 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -119,6 +119,8 @@ struct vfio_pci_device {
int ioeventfds_nr;
struct eventfd_ctx  *err_trigger;
struct eventfd_ctx  *req_trigger;
+   u8  *fault_pages;


What's the reason to use 'u8'? It doesn't match the type of header, nor
the type of ring buffer.


+   struct mutexfault_queue_lock;
struct list_headdummy_resources_list;
struct mutexioeventfds_lock;
struct list_headioeventfds_list;
@@ -150,6 +152,14 @@ extern ssize_t vfio_pci_vga_rw(struct vfio_pci_device *vdev, char __user *buf,
  extern long vfio_pci_ioeventfd(struct vfio_pci_device *vdev, loff_t offset,
   uint64_t data, int count, int fd);
  
+struct vfio_pci_fault_abi {
+	u32 entry_size;
+};


This is not used by this patch (and the whole series).


+
+extern size_t vfio_pci_dma_fault_rw(struct vfio_pci_device *vdev,
+   char __user *buf, size_t count,
+   loff_t *ppos, bool iswrite);
+
  extern int vfio_pci_init_perm_bits(void);
  extern void vfio_pci_uninit_perm_bits(void);
  
diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c

index 

Re: [PATCH v10 01/11] vfio: VFIO_IOMMU_SET_PASID_TABLE

2020-09-23 Thread Zenghui Yu

Hi Eric,

On 2020/3/21 0:19, Eric Auger wrote:

From: "Liu, Yi L" 

This patch adds a VFIO_IOMMU_SET_PASID_TABLE ioctl
which aims to pass the virtual IOMMU guest configuration
to the host. The latter takes the form of the so-called
PASID table.

Signed-off-by: Jacob Pan 
Signed-off-by: Liu, Yi L 
Signed-off-by: Eric Auger 


[...]


diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index a177bf2c6683..bfacbd876ee1 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2172,6 +2172,43 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu,
return ret;
  }
  
+static void
+vfio_detach_pasid_table(struct vfio_iommu *iommu)
+{
+	struct vfio_domain *d;
+
+	mutex_lock(&iommu->lock);
+
+	list_for_each_entry(d, &iommu->domain_list, next) {
+		iommu_detach_pasid_table(d->domain);
+	}
+	mutex_unlock(&iommu->lock);
+}
+
+static int
+vfio_attach_pasid_table(struct vfio_iommu *iommu,
+			struct vfio_iommu_type1_set_pasid_table *ustruct)
+{
+	struct vfio_domain *d;
+	int ret = 0;
+
+	mutex_lock(&iommu->lock);
+
+	list_for_each_entry(d, &iommu->domain_list, next) {
+		ret = iommu_attach_pasid_table(d->domain, &ustruct->config);
+		if (ret)
+			goto unwind;
+	}
+	goto unlock;
+unwind:
+	list_for_each_entry_continue_reverse(d, &iommu->domain_list, next) {
+		iommu_detach_pasid_table(d->domain);
+	}
+unlock:
+	mutex_unlock(&iommu->lock);
+	return ret;
+}
+
  static long vfio_iommu_type1_ioctl(void *iommu_data,
   unsigned int cmd, unsigned long arg)
  {
@@ -2276,6 +2313,25 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
  
 		return copy_to_user((void __user *)arg, &unmap, minsz) ?
 			-EFAULT : 0;
+	} else if (cmd == VFIO_IOMMU_SET_PASID_TABLE) {
+		struct vfio_iommu_type1_set_pasid_table ustruct;
+
+		minsz = offsetofend(struct vfio_iommu_type1_set_pasid_table,
+				    config);
+
+		if (copy_from_user(&ustruct, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (ustruct.argsz < minsz)
+			return -EINVAL;
+
+		if (ustruct.flags & VFIO_PASID_TABLE_FLAG_SET)
+			return vfio_attach_pasid_table(iommu, &ustruct);
+		else if (ustruct.flags & VFIO_PASID_TABLE_FLAG_UNSET) {
+			vfio_detach_pasid_table(iommu);
+			return 0;
+		} else
+			return -EINVAL;


Nit:

What if userspace blindly sets both flags? Should we check that exactly
one flag is set at this stage, and return an error otherwise?
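
For example (sketch):

		u32 mode = ustruct.flags & (VFIO_PASID_TABLE_FLAG_SET |
					    VFIO_PASID_TABLE_FLAG_UNSET);

		/* require exactly one of SET/UNSET */
		if (!mode || mode == (VFIO_PASID_TABLE_FLAG_SET |
				      VFIO_PASID_TABLE_FLAG_UNSET))
			return -EINVAL;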

Besides, before going through the whole series [1][2], I'd like to know
if this is the latest version of your Nested-Stage-Setup work in case I
had missed something.

[1] https://lore.kernel.org/r/20200320161911.27494-1-eric.au...@redhat.com
[2] https://lore.kernel.org/r/20200414150607.28488-1-eric.au...@redhat.com


Thanks,
Zenghui


Re: [PATCH v7 18/24] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update

2020-08-28 Thread Zenghui Yu

On 2020/5/20 1:54, Jean-Philippe Brucker wrote:

@@ -4454,6 +4470,12 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
smmu->features |= ARM_SMMU_FEAT_E2H;
}
  
+	if (reg & (IDR0_HA | IDR0_HD)) {
+		smmu->features |= ARM_SMMU_FEAT_HA;
+		if (reg & IDR0_HD)
+			smmu->features |= ARM_SMMU_FEAT_HD;
+	}
+


nitpick:

As per the IORT spec (DEN0049D, 3.1.1.2 SMMUv3 node, Table 10), the
"HTTU Override" flag of the SMMUv3 node can override the value in
SMMU_IDR0.HTTU. You may want to check this bit before selecting the
{HA,HD} features and shout if there is a mismatch between firmware and
the SMMU implementation. Just like how ARM_SMMU_FEAT_COHERENCY is
selected.
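
Roughly like this (hypothetical sketch; fw_httu is assumed to be the
HA/HD feature mask parsed from the IORT node's "HTTU Override" field):

	if (fw_httu_override) {
		u32 hw_httu = smmu->features &
			      (ARM_SMMU_FEAT_HA | ARM_SMMU_FEAT_HD);

		if (fw_httu != hw_httu)
			dev_warn(smmu->dev,
				 "IDR0.HTTU (0x%x) overridden by firmware (0x%x)\n",
				 hw_httu, fw_httu);

		smmu->features &= ~(ARM_SMMU_FEAT_HA | ARM_SMMU_FEAT_HD);
		smmu->features |= fw_httu;
	}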


Thanks,
Zenghui


[PATCH] iommu/arm-smmu-v3: Fix l1 stream table size in the error message

2020-08-26 Thread Zenghui Yu
The actual size of the level-1 stream table is l1size. This looks like an
oversight in commit d2e88e7c081ef ("iommu/arm-smmu: Fix LOG2SIZE setting
for 2-level stream tables"), which forgot to update @size in the error
message as well.

As memory allocation failure is already bad enough, nothing worse would
happen. But let's be careful.

Signed-off-by: Zenghui Yu 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c192544e874b..bb458d0c7b73 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3280,7 +3280,7 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
if (!strtab) {
dev_err(smmu->dev,
"failed to allocate l1 stream table (%u bytes)\n",
-   size);
+   l1size);
return -ENOMEM;
}
cfg->strtab = strtab;
-- 
2.19.1
