Re: [PATCH v12 01/13] vfio: VFIO_IOMMU_SET_PASID_TABLE
Hi Eric, On 2021/2/24 5:06, Eric Auger wrote: +/* + * VFIO_IOMMU_SET_PASID_TABLE - _IOWR(VFIO_TYPE, VFIO_BASE + 18, + * struct vfio_iommu_type1_set_pasid_table) + * + * The SET operation passes a PASID table to the host while the + * UNSET operation detaches the one currently programmed. Setting + * a table while another is already programmed replaces the old table. It looks to me that this description doesn't match the IOMMU part. [v14,05/13] iommu/smmuv3: Implement attach/detach_pasid_table | case IOMMU_PASID_CONFIG_TRANSLATE: | /* we do not support S1 <-> S1 transitions */ | if (smmu_domain->s1_cfg.set) | goto out; Maybe I've misread something? Thanks, Zenghui ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/arm-smmu-v3: Remove the unused fields for PREFETCH_CONFIG command
Per SMMUv3 spec, there is no Size and Addr field in the PREFETCH_CONFIG command and they're not used by the driver. Remove them. We can add them back if we're going to use PREFETCH_ADDR in the future. Signed-off-by: Zenghui Yu --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 -- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 -- 2 files changed, 4 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 8594b4a83043..610c9f4b7789 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -245,8 +245,6 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) break; case CMDQ_OP_PREFETCH_CFG: cmd[0] |= FIELD_PREP(CMDQ_PREFETCH_0_SID, ent->prefetch.sid); - cmd[1] |= FIELD_PREP(CMDQ_PREFETCH_1_SIZE, ent->prefetch.size); - cmd[1] |= ent->prefetch.addr & CMDQ_PREFETCH_1_ADDR_MASK; break; case CMDQ_OP_CFGI_CD: cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SSID, ent->cfgi.ssid); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index f985817c967a..83726b6a1cba 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -410,8 +410,6 @@ struct arm_smmu_cmdq_ent { #define CMDQ_OP_PREFETCH_CFG0x1 struct { u32 sid; - u8 size; - u64 addr; } prefetch; #define CMDQ_OP_CFGI_STE0x3 -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v14 08/13] dma-iommu: Implement NESTED_MSI cookie
Hi Eric, On 2021/2/24 4:56, Eric Auger wrote: Up to now, when the type was UNMANAGED, we used to allocate IOVA pages within a reserved IOVA MSI range. If both the host and the guest are exposed with SMMUs, each would allocate an IOVA. The guest allocates an IOVA (gIOVA) to map onto the guest MSI doorbell (gDB). The Host allocates another IOVA (hIOVA) to map onto the physical doorbell (hDB). So we end up with 2 unrelated mappings, at S1 and S2: S1 S2 gIOVA-> gDB hIOVA->hDB The PCI device would be programmed with hIOVA. No stage 1 mapping would existing, causing the MSIs to fault. iommu_dma_bind_guest_msi() allows to pass gIOVA/gDB to the host so that gIOVA can be used by the host instead of re-allocating a new hIOVA. S1 S2 gIOVA->gDB->hDB this time, the PCI device can be programmed with the gIOVA MSI doorbell which is correctly mapped through both stages. Nested mode is not compatible with HW MSI regions as in that case gDB and hDB should have a 1-1 mapping. This check will be done when attaching each device to the IOMMU domain. Signed-off-by: Eric Auger [...] diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index f659395e7959..d25eb7cecaa7 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -19,6 +19,7 @@ #include #include #include +#include Duplicated include. #include #include #include @@ -29,12 +30,15 @@ struct iommu_dma_msi_page { struct list_headlist; dma_addr_t iova; + dma_addr_t gpa; phys_addr_t phys; + size_t s1_granule; }; enum iommu_dma_cookie_type { IOMMU_DMA_IOVA_COOKIE, IOMMU_DMA_MSI_COOKIE, + IOMMU_DMA_NESTED_MSI_COOKIE, }; struct iommu_dma_cookie { @@ -46,6 +50,7 @@ struct iommu_dma_cookie { dma_addr_t msi_iova; msi_iova is unused in the nested mode, but we still set it to the start address of the RESV_SW_MSI region (in iommu_get_msi_cookie()), which looks a bit strange to me. }; struct list_headmsi_page_list; + spinlock_t msi_lock; Should msi_lock be grabbed everywhere msi_page_list is populated? Especially in iommu_dma_get_msi_page(), which can be invoked from the irqchip driver. /* Domain for flush queue callback; NULL if flush queue not in use */ struct iommu_domain *fq_domain; @@ -87,6 +92,7 @@ static struct iommu_dma_cookie *cookie_alloc(enum iommu_dma_cookie_type type) cookie = kzalloc(sizeof(*cookie), GFP_KERNEL); if (cookie) { + spin_lock_init(>msi_lock); INIT_LIST_HEAD(>msi_page_list); cookie->type = type; } @@ -120,14 +126,17 @@ EXPORT_SYMBOL(iommu_get_dma_cookie); * * Users who manage their own IOVA allocation and do not want DMA API support, * but would still like to take advantage of automatic MSI remapping, can use - * this to initialise their own domain appropriately. Users should reserve a + * this to initialise their own domain appropriately. Users may reserve a * contiguous IOVA region, starting at @base, large enough to accommodate the * number of PAGE_SIZE mappings necessary to cover every MSI doorbell address - * used by the devices attached to @domain. + * used by the devices attached to @domain. The other way round is to provide + * usable iova pages through the iommu_dma_bind_doorbell API (nested stages s/iommu_dma_bind_doorbell/iommu_dma_bind_guest_msi/ ? + * use case) */ int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base) { struct iommu_dma_cookie *cookie; + int nesting, ret; if (domain->type != IOMMU_DOMAIN_UNMANAGED) return -EINVAL; @@ -135,7 +144,12 @@ int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base) if (domain->iova_cookie) return -EEXIST; - cookie = cookie_alloc(IOMMU_DMA_MSI_COOKIE); + ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_NESTING, ); Redundant space. + if (!ret && nesting) + cookie = cookie_alloc(IOMMU_DMA_NESTED_MSI_COOKIE); + else + cookie = cookie_alloc(IOMMU_DMA_MSI_COOKIE); + if (!cookie) return -ENOMEM; @@ -156,6 +170,7 @@ void iommu_put_dma_cookie(struct iommu_domain *domain) { struct iommu_dma_cookie *cookie = domain->iova_cookie; struct iommu_dma_msi_page *msi, *tmp; + bool s2_unmap = false; if (!cookie) return; @@ -163,7 +178,15 @@ void iommu_put_dma_cookie(struct iommu_domain *domain) if (cookie->type == IOMMU_DMA_IOVA_COOKIE && cookie->iovad.granule) put_iova_domain(>iovad); + if (cookie->type == IOMMU_DMA_NESTED_MSI_COOKIE) + s2_unmap = true; +
Re: [PATCH v14 07/13] iommu/smmuv3: Implement cache_invalidate
Hi Eric, On 2021/2/24 4:56, Eric Auger wrote: +static int +arm_smmu_cache_invalidate(struct iommu_domain *domain, struct device *dev, + struct iommu_cache_invalidate_info *inv_info) +{ + struct arm_smmu_cmdq_ent cmd = {.opcode = CMDQ_OP_TLBI_NSNH_ALL}; + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_device *smmu = smmu_domain->smmu; + + if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED) + return -EINVAL; + + if (!smmu) + return -EINVAL; + + if (inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1) + return -EINVAL; + + if (inv_info->cache & IOMMU_CACHE_INV_TYPE_PASID || I didn't find any code where we would emulate the CFGI_CD{_ALL} commands for guest and invalidate the stale CD entries on the physical side. Is PASID-cache type designed for that effect? + inv_info->cache & IOMMU_CACHE_INV_TYPE_DEV_IOTLB) { + return -ENOENT; + } + + if (!(inv_info->cache & IOMMU_CACHE_INV_TYPE_IOTLB)) + return -EINVAL; + + /* IOTLB invalidation */ + + switch (inv_info->granularity) { + case IOMMU_INV_GRANU_PASID: + { + struct iommu_inv_pasid_info *info = + _info->granu.pasid_info; + + if (info->flags & IOMMU_INV_ADDR_FLAGS_PASID) + return -ENOENT; + if (!(info->flags & IOMMU_INV_PASID_FLAGS_ARCHID)) + return -EINVAL; + + __arm_smmu_tlb_inv_context(smmu_domain, info->archid); + return 0; + } + case IOMMU_INV_GRANU_ADDR: + { + struct iommu_inv_addr_info *info = _info->granu.addr_info; + size_t size = info->nb_granules * info->granule_size; + bool leaf = info->flags & IOMMU_INV_ADDR_FLAGS_LEAF; + + if (info->flags & IOMMU_INV_ADDR_FLAGS_PASID) + return -ENOENT; + + if (!(info->flags & IOMMU_INV_ADDR_FLAGS_ARCHID)) + break; + + arm_smmu_tlb_inv_range_domain(info->addr, size, + info->granule_size, leaf, + info->archid, smmu_domain); + + arm_smmu_cmdq_issue_sync(smmu); There is no need to issue one more SYNC. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v14 13/13] iommu/smmuv3: Accept configs with more than one context descriptor
Hi Eric, On 2021/2/24 4:56, Eric Auger wrote: In preparation for vSVA, let's accept userspace provided configs with more than one CD. We check the max CD against the host iommu capability and also the format (linear versus 2 level). Signed-off-by: Eric Auger Signed-off-by: Shameer Kolothum --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 332d31c0680f..ab74a0289893 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -3038,14 +3038,17 @@ static int arm_smmu_attach_pasid_table(struct iommu_domain *domain, if (smmu_domain->s1_cfg.set) goto out; - /* -* we currently support a single CD so s1fmt and s1dss -* fields are also ignored -*/ - if (cfg->pasid_bits) + list_for_each_entry(master, _domain->devices, domain_head) { + if (cfg->pasid_bits > master->ssid_bits) + goto out; + } + if (cfg->vendor_data.smmuv3.s1fmt == STRTAB_STE_0_S1FMT_64K_L2 && + !(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB)) goto out; smmu_domain->s1_cfg.cdcfg.cdtab_dma = cfg->base_ptr; + smmu_domain->s1_cfg.s1cdmax = cfg->pasid_bits; + smmu_domain->s1_cfg.s1fmt = cfg->vendor_data.smmuv3.s1fmt; And what about the SIDSS field? ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v14 06/13] iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs
On 2021/2/24 4:56, Eric Auger wrote: @@ -1936,7 +1950,12 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size, }, }; - if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { + if (ext_asid >= 0) { /* guest stage 1 invalidation */ + cmd.opcode = smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ? + CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA; If I understand it correctly, the true nested mode effectively gives us a *NS-EL1* StreamWorld. We should therefore use CMDQ_OP_TLBI_NH_VA to invalidate the stage-1 NS-EL1 entries, no matter E2H is selected or not. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/vt-d: Fix status code for Allocate/Free PASID command
As per Intel vt-d spec, Rev 3.0 (section 10.4.45 "Virtual Command Response Register"), the status code of "No PASID available" error in response to the Allocate PASID command is 2, not 1. The same for "Invalid PASID" error in response to the Free PASID command. We will otherwise see confusing kernel log under the command failure from guest side. Fix it. Fixes: 24f27d32ab6b ("iommu/vt-d: Enlightened PASID allocation") Signed-off-by: Zenghui Yu --- drivers/iommu/intel/pasid.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h index 97dfcffbf495..444c0bec221a 100644 --- a/drivers/iommu/intel/pasid.h +++ b/drivers/iommu/intel/pasid.h @@ -30,8 +30,8 @@ #define VCMD_VRSP_IP 0x1 #define VCMD_VRSP_SC(e)(((e) >> 1) & 0x3) #define VCMD_VRSP_SC_SUCCESS 0 -#define VCMD_VRSP_SC_NO_PASID_AVAIL1 -#define VCMD_VRSP_SC_INVALID_PASID 1 +#define VCMD_VRSP_SC_NO_PASID_AVAIL2 +#define VCMD_VRSP_SC_INVALID_PASID 2 #define VCMD_VRSP_RESULT_PASID(e) (((e) >> 8) & 0xf) #define VCMD_CMD_OPERAND(e)((e) << 8) /* -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] dma-debug: Delete outdated comment of the hash function
We actually use dev_addr[26:13] as the index into dma_entry_hash. Given that the code itself is clear enough, let's drop the hardcoded comment so that we won't need to revisit it every time HASH_FN_{SHIFT,MASK} gets updated. Signed-off-by: Zenghui Yu --- kernel/dma/debug.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c index 14de1271463f..d89341162d35 100644 --- a/kernel/dma/debug.c +++ b/kernel/dma/debug.c @@ -235,10 +235,7 @@ static bool driver_filter(struct device *dev) */ static int hash_fn(struct dma_debug_entry *entry) { - /* -* Hash function is based on the dma address. -* We use bits 20-27 here as the index into the hash -*/ + /* Hash function is based on the dma address. */ return (entry->dev_addr >> HASH_FN_SHIFT) & HASH_FN_MASK; } -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v10 11/11] vfio: Document nested stage control
Hi Eric, On 2020/3/21 0:19, Eric Auger wrote: The VFIO API was enhanced to support nested stage control: a bunch of new iotcls, one DMA FAULT region and an associated specific IRQ. Let's document the process to follow to set up nested mode. Signed-off-by: Eric Auger [...] +The userspace must be prepared to receive faults. The VFIO-PCI device +exposes one dedicated DMA FAULT region: it contains a ring buffer and +its header that allows to manage the head/tail indices. The region is +identified by the following index/subindex: +- VFIO_REGION_TYPE_NESTED/VFIO_REGION_SUBTYPE_NESTED_DMA_FAULT + +The DMA FAULT region exposes a VFIO_REGION_INFO_CAP_PRODUCER_FAULT +region capability that allows the userspace to retrieve the ABI version +of the fault records filled by the host. Nit: I don't see this capability in the code. Thanks, Zenghui ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v10 05/11] vfio/pci: Register an iommu fault handler
Hi Eric, On 2020/3/21 0:19, Eric Auger wrote: Register an IOMMU fault handler which records faults in the DMA FAULT region ring buffer. In a subsequent patch, we will add the signaling of a specific eventfd to allow the userspace to be notified whenever a new fault as shown up. Signed-off-by: Eric Auger [...] diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index 586b89debed5..69595c240baf 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -27,6 +27,7 @@ #include #include #include +#include #include "vfio_pci_private.h" @@ -283,6 +284,38 @@ static const struct vfio_pci_regops vfio_pci_dma_fault_regops = { .add_capability = vfio_pci_dma_fault_add_capability, }; +int vfio_pci_iommu_dev_fault_handler(struct iommu_fault *fault, void *data) +{ + struct vfio_pci_device *vdev = (struct vfio_pci_device *)data; + struct vfio_region_dma_fault *reg = + (struct vfio_region_dma_fault *)vdev->fault_pages; + struct iommu_fault *new = + (struct iommu_fault *)(vdev->fault_pages + reg->offset + + reg->head * reg->entry_size); Shouldn't 'reg->head' be protected under the fault_queue_lock? Otherwise things may change behind our backs... We shouldn't take any assumption about how IOMMU driver would report the fault (serially or in parallel), I think. + int head, tail, size; + int ret = 0; + + if (fault->type != IOMMU_FAULT_DMA_UNRECOV) + return -ENOENT; + + mutex_lock(>fault_queue_lock); + + head = reg->head; + tail = reg->tail; + size = reg->nb_entries; + + if (CIRC_SPACE(head, tail, size) < 1) { + ret = -ENOSPC; + goto unlock; + } + + *new = *fault; + reg->head = (head + 1) % size; +unlock: + mutex_unlock(>fault_queue_lock); + return ret; +} + #define DMA_FAULT_RING_LENGTH 512 static int vfio_pci_init_dma_fault_region(struct vfio_pci_device *vdev) @@ -317,6 +350,13 @@ static int vfio_pci_init_dma_fault_region(struct vfio_pci_device *vdev) header->entry_size = sizeof(struct iommu_fault); header->nb_entries = DMA_FAULT_RING_LENGTH; header->offset = sizeof(struct vfio_region_dma_fault); + + ret = iommu_register_device_fault_handler(>pdev->dev, + vfio_pci_iommu_dev_fault_handler, + vdev); + if (ret) + goto out; + return 0; out: kfree(vdev->fault_pages); Thanks, Zenghui ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v10 04/11] vfio/pci: Add VFIO_REGION_TYPE_NESTED region type
Hi Eric, On 2020/3/21 0:19, Eric Auger wrote: Add a new specific DMA_FAULT region aiming to exposed nested mode translation faults. The region has a ring buffer that contains the actual fault records plus a header allowing to handle it (tail/head indices, max capacity, entry size). At the moment the region is dimensionned for 512 fault records. Signed-off-by: Eric Auger [...] diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index 379a02c36e37..586b89debed5 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -260,6 +260,69 @@ int vfio_pci_set_power_state(struct vfio_pci_device *vdev, pci_power_t state) return ret; } +static void vfio_pci_dma_fault_release(struct vfio_pci_device *vdev, + struct vfio_pci_region *region) +{ +} + +static int vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev, +struct vfio_pci_region *region, +struct vfio_info_cap *caps) +{ + struct vfio_region_info_cap_fault cap = { + .header.id = VFIO_REGION_INFO_CAP_DMA_FAULT, + .header.version = 1, + .version = 1, + }; + return vfio_info_add_capability(caps, , sizeof(cap)); +} + +static const struct vfio_pci_regops vfio_pci_dma_fault_regops = { + .rw = vfio_pci_dma_fault_rw, + .release= vfio_pci_dma_fault_release, + .add_capability = vfio_pci_dma_fault_add_capability, +}; + +#define DMA_FAULT_RING_LENGTH 512 + +static int vfio_pci_init_dma_fault_region(struct vfio_pci_device *vdev) +{ + struct vfio_region_dma_fault *header; + size_t size; + int ret; + + mutex_init(>fault_queue_lock); + + /* +* We provision 1 page for the header and space for +* DMA_FAULT_RING_LENGTH fault records in the ring buffer. +*/ + size = ALIGN(sizeof(struct iommu_fault) * +DMA_FAULT_RING_LENGTH, PAGE_SIZE) + PAGE_SIZE; + + vdev->fault_pages = kzalloc(size, GFP_KERNEL); + if (!vdev->fault_pages) + return -ENOMEM; + + ret = vfio_pci_register_dev_region(vdev, + VFIO_REGION_TYPE_NESTED, + VFIO_REGION_SUBTYPE_NESTED_DMA_FAULT, + _pci_dma_fault_regops, size, + VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_WRITE, + vdev->fault_pages); + if (ret) + goto out; + + header = (struct vfio_region_dma_fault *)vdev->fault_pages; + header->entry_size = sizeof(struct iommu_fault); + header->nb_entries = DMA_FAULT_RING_LENGTH; + header->offset = sizeof(struct vfio_region_dma_fault); + return 0; +out: + kfree(vdev->fault_pages); + return ret; +} + static int vfio_pci_enable(struct vfio_pci_device *vdev) { struct pci_dev *pdev = vdev->pdev; @@ -358,6 +421,10 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev) } } + ret = vfio_pci_init_dma_fault_region(vdev); + if (ret) + goto disable_exit; + vfio_pci_probe_mmaps(vdev); return 0; @@ -1383,6 +1450,7 @@ static void vfio_pci_remove(struct pci_dev *pdev) vfio_iommu_group_put(pdev->dev.iommu_group, >dev); kfree(vdev->region); + kfree(vdev->fault_pages); mutex_destroy(>ioeventfds_lock); if (!disable_idle_d3) diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h index 8a2c7607d513..a392f50e3a99 100644 --- a/drivers/vfio/pci/vfio_pci_private.h +++ b/drivers/vfio/pci/vfio_pci_private.h @@ -119,6 +119,8 @@ struct vfio_pci_device { int ioeventfds_nr; struct eventfd_ctx *err_trigger; struct eventfd_ctx *req_trigger; + u8 *fault_pages; What's the reason to use 'u8'? It doesn't match the type of header, nor the type of ring buffer. + struct mutexfault_queue_lock; struct list_headdummy_resources_list; struct mutexioeventfds_lock; struct list_headioeventfds_list; @@ -150,6 +152,14 @@ extern ssize_t vfio_pci_vga_rw(struct vfio_pci_device *vdev, char __user *buf, extern long vfio_pci_ioeventfd(struct vfio_pci_device *vdev, loff_t offset, uint64_t data, int count, int fd); +struct vfio_pci_fault_abi { + u32 entry_size; +}; This is not used by this patch (and the whole series). + +extern size_t vfio_pci_dma_fault_rw(struct vfio_pci_device *vdev, + char __user *buf, size_t count, + loff_t *ppos, bool iswrite); + extern int vfio_pci_init_perm_bits(void); extern void vfio_pci_uninit_perm_bits(void); diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c index
Re: [PATCH v10 01/11] vfio: VFIO_IOMMU_SET_PASID_TABLE
Hi Eric, On 2020/3/21 0:19, Eric Auger wrote: From: "Liu, Yi L" This patch adds an VFIO_IOMMU_SET_PASID_TABLE ioctl which aims to pass the virtual iommu guest configuration to the host. This latter takes the form of the so-called PASID table. Signed-off-by: Jacob Pan Signed-off-by: Liu, Yi L Signed-off-by: Eric Auger [...] diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index a177bf2c6683..bfacbd876ee1 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -2172,6 +2172,43 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu, return ret; } +static void +vfio_detach_pasid_table(struct vfio_iommu *iommu) +{ + struct vfio_domain *d; + + mutex_lock(>lock); + + list_for_each_entry(d, >domain_list, next) { + iommu_detach_pasid_table(d->domain); + } + mutex_unlock(>lock); +} + +static int +vfio_attach_pasid_table(struct vfio_iommu *iommu, + struct vfio_iommu_type1_set_pasid_table *ustruct) +{ + struct vfio_domain *d; + int ret = 0; + + mutex_lock(>lock); + + list_for_each_entry(d, >domain_list, next) { + ret = iommu_attach_pasid_table(d->domain, >config); + if (ret) + goto unwind; + } + goto unlock; +unwind: + list_for_each_entry_continue_reverse(d, >domain_list, next) { + iommu_detach_pasid_table(d->domain); + } +unlock: + mutex_unlock(>lock); + return ret; +} + static long vfio_iommu_type1_ioctl(void *iommu_data, unsigned int cmd, unsigned long arg) { @@ -2276,6 +2313,25 @@ static long vfio_iommu_type1_ioctl(void *iommu_data, return copy_to_user((void __user *)arg, , minsz) ? -EFAULT : 0; + } else if (cmd == VFIO_IOMMU_SET_PASID_TABLE) { + struct vfio_iommu_type1_set_pasid_table ustruct; + + minsz = offsetofend(struct vfio_iommu_type1_set_pasid_table, + config); + + if (copy_from_user(, (void __user *)arg, minsz)) + return -EFAULT; + + if (ustruct.argsz < minsz) + return -EINVAL; + + if (ustruct.flags & VFIO_PASID_TABLE_FLAG_SET) + return vfio_attach_pasid_table(iommu, ); + else if (ustruct.flags & VFIO_PASID_TABLE_FLAG_UNSET) { + vfio_detach_pasid_table(iommu); + return 0; + } else + return -EINVAL; Nit: What if user-space blindly set both flags? Should we check that only one flag is allowed to be set at this stage, and return error otherwise? Besides, before going through the whole series [1][2], I'd like to know if this is the latest version of your Nested-Stage-Setup work in case I had missed something. [1] https://lore.kernel.org/r/20200320161911.27494-1-eric.au...@redhat.com [2] https://lore.kernel.org/r/20200414150607.28488-1-eric.au...@redhat.com Thanks, Zenghui ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 18/24] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update
On 2020/5/20 1:54, Jean-Philippe Brucker wrote: @@ -4454,6 +4470,12 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) smmu->features |= ARM_SMMU_FEAT_E2H; } + if (reg & (IDR0_HA | IDR0_HD)) { + smmu->features |= ARM_SMMU_FEAT_HA; + if (reg & IDR0_HD) + smmu->features |= ARM_SMMU_FEAT_HD; + } + nitpick: As per the IORT spec (DEN0049D, 3.1.1.2 SMMUv3 node, Table 10), the "HTTU Override" flag of the SMMUv3 node can override the value in SMMU_IDR0.HTTU. You may want to check this bit before selecting the {HA,HD} features and shout if there is a mismatch between firmware and the SMMU implementation. Just like how ARM_SMMU_FEAT_COHERENCY is selected. Thanks, Zenghui ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/arm-smmu-v3: Fix l1 stream table size in the error message
The actual size of level-1 stream table is l1size. This looks like an oversight on commit d2e88e7c081ef ("iommu/arm-smmu: Fix LOG2SIZE setting for 2-level stream tables") which forgot to update the @size in error message as well. As memory allocation failure is already bad enough, nothing worse would happen. But let's be careful. Signed-off-by: Zenghui Yu --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index c192544e874b..bb458d0c7b73 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -3280,7 +3280,7 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu) if (!strtab) { dev_err(smmu->dev, "failed to allocate l1 stream table (%u bytes)\n", - size); + l1size); return -ENOMEM; } cfg->strtab = strtab; -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu