Re: [PATCH v5 07/14] drivers: acpi: iort: add support for ARM SMMU platform devices creation
On 2016-09-09 10:23, Lorenzo Pieralisi wrote:

In ARM ACPI systems, IOMMU components are specified through static
IORT table entries. In order to create platform devices for the
corresponding ARM SMMU components, IORT kernel code should be made
able to parse IORT table entries and create platform devices
dynamically.

This patch adds the generic IORT infrastructure required to create
platform devices for ARM SMMUs. ARM SMMU versions have different
resource requirements, therefore this patch also introduces an IORT
specific structure (ie iort_iommu_config) that contains hooks (to be
defined when the corresponding ARM SMMU driver support is added to
the kernel) used to define the platform device names, init the
IOMMUs, count their resources and finally initialize them.

Signed-off-by: Lorenzo Pieralisi
Cc: Hanjun Guo
Cc: Tomasz Nowicki
Cc: "Rafael J. Wysocki"
---
 drivers/acpi/arm64/iort.c | 131 ++
 1 file changed, 131 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index b89b3d3..e0a9b16 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include <linux/platform_device.h>
 #include

 struct iort_its_msi_chip {
@@ -424,6 +425,135 @@ struct irq_domain *iort_get_device_domain(struct device *dev, u32 req_id)
 	return irq_find_matching_fwnode(handle, DOMAIN_BUS_PCI_MSI);
 }

+struct iort_iommu_config {
+	const char *name;
+	int (*iommu_init)(struct acpi_iort_node *node);
+	bool (*iommu_is_coherent)(struct acpi_iort_node *node);
+	int (*iommu_count_resources)(struct acpi_iort_node *node);
+	void (*iommu_init_resources)(struct resource *res,
+				     struct acpi_iort_node *node);
+};
+
+static __init
+const struct iort_iommu_config *iort_get_iommu_cfg(struct acpi_iort_node *node)
+{
+	return NULL;
+}
+
+/**
+ * iort_add_smmu_platform_device() - Allocate a platform device for SMMU
+ * @fwnode: IORT node associated fwnode handle
+ * @node: Pointer to SMMU ACPI IORT node
+ *
+ * Returns: 0 on success, <0 failure
+ */
+static int __init iort_add_smmu_platform_device(struct fwnode_handle *fwnode,
+						struct acpi_iort_node *node)
+{
+	struct platform_device *pdev;
+	struct resource *r;
+	enum dev_dma_attr attr;
+	int ret, count;
+	const struct iort_iommu_config *ops = iort_get_iommu_cfg(node);
+
+	if (!ops)
+		return -ENODEV;
+
+	pdev = platform_device_alloc(ops->name, PLATFORM_DEVID_AUTO);
+	if (!pdev)
+		return -ENOMEM;
+
+	count = ops->iommu_count_resources(node);
+
+	r = kcalloc(count, sizeof(*r), GFP_KERNEL);
+	if (!r) {
+		ret = -ENOMEM;
+		goto dev_put;
+	}
+
+	ops->iommu_init_resources(r, node);
+
+	ret = platform_device_add_resources(pdev, r, count);
+	/*
+	 * Resources are duplicated in platform_device_add_resources,
+	 * free their allocated memory
+	 */
+	kfree(r);
+
+	if (ret)
+		goto dev_put;
+
+	/*
+	 * Add a copy of IORT node pointer to platform_data to
+	 * be used to retrieve IORT data information.
+	 */
+	ret = platform_device_add_data(pdev, &node, sizeof(node));
+	if (ret)
+		goto dev_put;
+
+	pdev->dev.dma_mask = kmalloc(sizeof(*pdev->dev.dma_mask), GFP_KERNEL);
+	if (!pdev->dev.dma_mask) {
+		ret = -ENOMEM;
+		goto dev_put;
+	}
+
+	pdev->dev.fwnode = fwnode;
+
+	/*
+	 * Set default dma mask value for the table walker,
+	 * to be overridden on probing with correct value.
+	 */
+	*pdev->dev.dma_mask = DMA_BIT_MASK(32);
+	pdev->dev.coherent_dma_mask = *pdev->dev.dma_mask;
+
+	attr = ops->iommu_is_coherent(node) ?
+			DEV_DMA_COHERENT : DEV_DMA_NON_COHERENT;
+
+	/* Configure DMA for the page table walker */
+	acpi_dma_configure(&pdev->dev, attr);
+
+	ret = platform_device_add(pdev);
+	if (ret)
+		goto dma_deconfigure;
+
+	return 0;
+
+dma_deconfigure:
+	acpi_dma_deconfigure(&pdev->dev);
+	kfree(pdev->dev.dma_mask);
+
+dev_put:
+	platform_device_put(pdev);
+
+	return ret;
+}
+
+static acpi_status __init iort_match_iommu_callback(struct acpi_iort_node *node,
+						    void *context)
+{
+	int ret;
+	struct fwnode_handle *fwnode;
+
+	fwnode = iort_get_fwnode(node);
+
+	if (!fwnode)
+		return AE_NOT_FOUND;
+
+	ret = iort_add_smmu_platform_device(fwnode, node);
+	if (ret) {
+		pr_err("Error in platform device creation\n");
+		return AE_ERROR;
+	}
Re: [RFC PATCH v3 13/13] drivers: acpi: iort: introduce iort_iommu_configure
On 2016-07-20 07:23, Lorenzo Pieralisi wrote:

DT based systems have a generic kernel API to configure IOMMUs for
devices (ie of_iommu_configure()).

On ARM based ACPI systems, the of_iommu_configure() equivalent can be
implemented atop the ACPI IORT kernel API, with the corresponding
functions to map device identifiers to IOMMUs and retrieve the
corresponding IOMMU operations necessary for DMA operations set-up.

By relying on the generic iommu_fwspec kernel infrastructure,
implement the IORT based IOMMU configuration for ARM ACPI systems and
hook it up in the ACPI kernel layer that implements DMA configuration
for a device.

Signed-off-by: Lorenzo Pieralisi
Cc: Hanjun Guo
Cc: Tomasz Nowicki
Cc: "Rafael J. Wysocki"
---
 drivers/acpi/iort.c  | 64
 drivers/acpi/scan.c  |  7 +-
 include/linux/iort.h |  4
 3 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/iort.c b/drivers/acpi/iort.c
index c116b68..a12a4ff 100644
--- a/drivers/acpi/iort.c
+++ b/drivers/acpi/iort.c
@@ -18,6 +18,7 @@
 #define pr_fmt(fmt)	"ACPI: IORT: " fmt

+#include
 #include
 #include
 #include
@@ -27,6 +28,8 @@
 #define IORT_TYPE_MASK(type)	(1 << (type))
 #define IORT_MSI_TYPE		(1 << ACPI_IORT_NODE_ITS_GROUP)
+#define IORT_IOMMU_TYPE		((1 << ACPI_IORT_NODE_SMMU) |	\
+				 (1 << ACPI_IORT_NODE_SMMU_V3))

 struct iort_its_msi_chip {
 	struct list_head	list;
@@ -458,6 +461,67 @@ iort_get_device_domain(struct device *dev, u32 req_id)
 	return irq_find_matching_fwnode(handle, DOMAIN_BUS_PCI_MSI);
 }

+static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
+{
+	u32 *rid = data;
+
+	*rid = alias;
+	return 0;
+}
+
+static int arm_smmu_iort_xlate(struct device *dev, u32 streamid,
+			       struct fwnode_handle *fwnode)
+{
+	int ret = iommu_fwspec_init(dev, fwnode);
+
+	if (!ret)
+		ret = iommu_fwspec_add_ids(dev, &streamid, 1);
+
+	return 0;

Are you intentionally returning 0 instead of ret? How about doing this
instead?

	return ret ? ret : iommu_fwspec_add_ids(dev, &streamid, 1);

+}
+
+/**
+ * iort_iommu_configure - Set-up IOMMU configuration for a device.
+ *
+ * @dev: device to configure
+ *
+ * Returns: iommu_ops pointer on configuration success
+ *          NULL on configuration failure
+ */
+const struct iommu_ops *iort_iommu_configure(struct device *dev)
+{
+	struct acpi_iort_node *node, *parent;
+	struct fwnode_handle *iort_fwnode;
+	u32 rid = 0, devid = 0;

Since this routine maps the RID space of a device to the StreamID space
of its parent SMMU, would you consider renaming the devid variable to
some form of sid or streamid?

+
+	if (dev_is_pci(dev)) {
+		struct pci_bus *bus = to_pci_dev(dev)->bus;
+
+		pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid,
+				       &rid);
+
+		node = iort_scan_node(ACPI_IORT_NODE_PCI_ROOT_COMPLEX,
+				      iort_match_node_callback, &bus->dev);
+	} else {
+		node = iort_scan_node(ACPI_IORT_NODE_NAMED_COMPONENT,
+				      iort_match_node_callback, dev);
+	}
+
+	if (!node)
+		return NULL;
+
+	parent = iort_node_map_rid(node, rid, &devid, IORT_IOMMU_TYPE);
+	if (parent) {
+		iort_fwnode = iort_get_fwnode(parent);
+		if (iort_fwnode) {
+			arm_smmu_iort_xlate(dev, devid, iort_fwnode);

What about named components with multiple stream IDs? Since establishing
the relationship between a named component and its parent SMMU already
depends on there being an appropriate mapping for RID 0, it stands to
reason that all of the stream IDs for a named component could be
enumerated by mapping increasing RID values until the output parent no
longer matches the one returned for RID 0.

+			return fwspec_iommu_get_ops(iort_fwnode);
+		}
+	}
+
+	return NULL;
+}

It should be noted that while trying out the approach described above, I
noticed that each of the SMMU-attached named components described in my
IORT was ending up with an extra stream ID. The culprit appears to be
that the range checking in iort_id_map() is overly permissive on the
upper bound. For example, mappings with input_base=N and id_count=1 were
matching both N and N+1. The following change fixed the issue:

@@ -296,7 +296,7 @@ iort_id_map(struct acpi_iort_id_mapping *map, u8 type, u32 rid_in, u32 *rid_out)
 	}

 	if (rid_in < map->input_base ||
-	    (rid_in > map->input_base + map->id_count))
+	    (rid_in >= map->input_base + map->id_count))
 		return -ENXIO;

 	*rid_out =
Re: [PATCH] iommu/dma: Don't put uninitialised IOVA domains
On 2016-07-27 12:00, Auger Eric wrote:

Hi,

On 27/07/2016 17:46, Robin Murphy wrote:

Due to the limitations of having to wait until we see a device's DMA
restrictions before we know how we want an IOVA domain initialised,
there is a window for error if a DMA ops domain is allocated but later
freed without ever being used. In that case, init_iova_domain() was
never called, so calling put_iova_domain() from iommu_put_dma_cookie()
ends up trying to take an uninitialised lock and crashing.

Make things robust by skipping the call unless the IOVA domain actually
has been initialised, as we probably should have done from the start.

Reported-by: Nate Watterson
Signed-off-by: Robin Murphy
---
I'm not sure this warrants a cc stable, as with the code currently in
mainline it's only at all likely if other things have already failed
elsewhere in a manner they should not be expected to.

 drivers/iommu/dma-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index ea5a9ebf0f78..97a23082e18a 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -68,7 +68,8 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
 	if (!iovad)
 		return;

-	put_iova_domain(iovad);
+	if (iovad->granule)
+		put_iova_domain(iovad);
 	kfree(iovad);
 	domain->iova_cookie = NULL;
 }

Reviewed-by: Eric Auger
Tested-by: Eric Auger

Thanks

Eric
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Reviewed-by: Nate Watterson
Tested-by: Nate Watterson

--
Qualcomm Datacenter Technologies, Inc. on behalf of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH] iommu/iova: validate iova_domain input to put_iova_domain
On 2016-07-14 07:21, Auger Eric wrote:

Hi Robin, Nate,

On 14/07/2016 12:36, Robin Murphy wrote:

On 14/07/16 09:34, Joerg Roedel wrote:

On Wed, Jul 13, 2016 at 02:49:32PM -0400, Nate Watterson wrote:

Passing a NULL or uninitialized iova_domain into put_iova_domain will
currently crash the kernel when the unconfigured iova_domain data
members are accessed. To prevent this from occurring, this patch adds
a check to make sure that the domain is non-NULL and that the domain
granule is non-zero. The granule can be used to check whether the
domain was properly initialized, because calling init_iova_domain with
a granule of zero would already have triggered a BUG statement crashing
the kernel.

Have you seen real crashes happening because of this?

In my case, it was calling iommu_request_dm_for_dev() which triggered
the "iommu_[get/put]_dma_cookie() without iommu_dma_init_domain()"
issue that Robin has documented below.

I also saw the crash happening with my PCIe passthrough series (not
upstreamed):
[PATCH v10 0/8] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3:
iommu changes
https://lkml.org/lkml/2016/6/7/676
Patch [PATCH v10 8/8] iommu/arm-smmu: get/put the msi cookie also uses
iommu_put_dma_cookie, and the uninitialised lock crash happens if the
group gets destroyed before iommu_dma_init_domain is called, which can
also happen for me.

It _can_ happen via the iommu-dma code if something goes wrong
initialising a group - the IOVA domain gets allocated at the same time
as the default IOMMU domain, but isn't initialised until later once the
device in question gets its DMA ops set up. If adding the device to the
group fails, everything gets torn down again and iommu_put_dma_cookie()
ends up trying to take an uninitialised lock.

Can't we allow the granule check also with the UNMANAGED type?
Thanks

Eric

However, I think the appropriate fix for that particular situation
would be more like this:

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index ea5a9ebf0f78..d00d22930a6b 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -65,10 +65,11 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
 {
 	struct iova_domain *iovad = domain->iova_cookie;

-	if (!iovad)
+	if (domain->type != IOMMU_DOMAIN_DMA || !iovad)
 		return;

-	put_iova_domain(iovad);
+	if (iovad->granule)
+		put_iova_domain(iovad);
 	kfree(iovad);
 	domain->iova_cookie = NULL;
 }

(It probably should have been that way from the start; mea culpa)

I originally put together a similar patch, but then thought that people
would complain it didn't fix the root of the problem. Yet another
instance where thinking was best avoided, I guess.

Robin.
Re: [PATCH v2] iommu/arm-smmu-v3: limit use of 2-level stream tables
On 2016-07-14 09:31, Will Deacon wrote:

On Tue, Jul 12, 2016 at 02:19:20PM -0400, Nate Watterson wrote:

In the current arm-smmu-v3 driver, all SMMUs that support 2-level
stream tables are being forced to use them. This is suboptimal for
SMMUs that support fewer stream ID bits than would fill in a single
second-level table. This patch limits the use of 2-level tables to
SMMUs that both support the feature and whose first-level table can
possibly contain more than a single entry.

Just to be clear, what exactly are you seeing as being suboptimal here?
Is it the memory wastage from overallocating the L2 table, or something
more?

Disregarding the config cache, fetching an STE when 2-level tables are
in use requires the hardware to perform more memory accesses than it
would with a linear table, since the L1 descriptor must also be
fetched. Presumably this is why the spec states, "ARM recommends that a
more efficient linear table is used instead of programming SPLIT >
LOG2SIZE". My understanding is that the only benefit of 2-level tables
is that they can save space when stream IDs are sparsely distributed.
Are there any other compelling reasons to use them?

If it's just the memory allocation, I'd sooner restrict the span field
in the L1 desc.

Although I am not especially concerned about the memory allocation,
even if the span were reduced, we would still be wasting a page for the
L1 table unless the L1 and L2 tables were allocated in a single
dmam_alloc_coherent() call.

Will

Nate