Re: [RFC 0/2] iommu: arm-smmu: Add support for early direct mappings
Hi Thierry,

I happened upon this thread while looking into another thread [1].

> From: Thierry Reding
>
> On some platforms, the firmware will set up hardware to read from a given
> region of memory. One such example is a display controller that is
> scanning out a splash screen from physical memory.
>
> During Linux's boot process, the ARM SMMU will configure all contexts to
> fault by default. This means that memory accesses that happen by an SMMU
> master before its driver has had a chance to properly set up the IOMMU
> will cause a fault. This is especially annoying for something like the
> display controller scanning out a splash screen, because the faults will
> result in the display controller getting bogus data (all-ones on Tegra),
> and since it repeatedly scans that framebuffer, it will keep triggering
> such faults and spam the boot log with them.

While I'm not an expert on IOMMUs, I have a decent high-level
understanding of the problem you are trying to solve.

> In order to work around such problems, scan the device tree for IOMMU
> masters and set up a special identity domain that will map 1:1 all of
> the reserved regions associated with them. This happens before the SMMU
> is enabled, so that the mappings are already set up before translations
> begin.

I'm not sure this RFC will solve the splash screen issue across SoCs
([1] seems to have a different issue and might not have memory-regions).

> One thing that was pointed out earlier, and which I don't have a good
> idea on how to solve, is that the early identity domain is not
> discarded. The assumption is that the standard direct-mappings code of
> the IOMMU framework will replace the early identity domain once devices
> are properly attached to domains, but we don't have a good point in time
> when it would be safe to remove the early identity domain.

You are in luck! I added sync_state() driver callbacks [2] exactly for
cases like this. Heck, I even listed IOMMUs as an example use case. :)

sync_state() works even with modules if one enables the of_devlink [3]
kernel parameter (which already supports the iommus DT bindings). I'd be
happy to answer any questions you have on sync_state() and of_devlink.

> One option that I can think of would be to create an early identity
> domain for each master and inherit it when that master is attached to
> the domain later on, but that seems rather complicated from a book-
> keeping point of view, and tricky because we need to be careful not to
> map regions twice, etc.
>
> Any good ideas on how to solve this? It'd also be interesting to see if
> there's a more generic way of doing this. I know that something like
> this isn't necessary on earlier Tegra SoCs with the custom Tegra SMMU
> because translations are only enabled when the devices are attached to a
> domain.

Good foresight. As [1] shows, identity mapping doesn't solve it in a
generic way. How about actually reading the current settings/mappings
and inheriting those, instead of always creating a 1:1 identity mapping?
Those "inherited" mappings can then be dropped when you get a
sync_state(). What's wrong with that option?

Cheers,
Saravana

[1] https://lore.kernel.org/linux-arm-msm/20200108091641.GA15147@willie-the-truck/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/driver-api/driver-model/driver.rst#n172
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt#n3239

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v11 0/4] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU. This
differs from ordinary data sharing between the CPU and an I/O device, which
shares data content rather than addresses. Because of the unified address
space, hardware and the user space of a process can use the same virtual
addresses in their communication.

Uacce is intended to be used with Jean-Philippe Brucker's SVA patchset [1],
which enables I/O-side page faults and PASID support. We keep verifying
against Jean's SVA patchset [2], and also against Eric's SMMUv3 nested-stage
patches [3].

This series and the related zip & qm driver:
https://github.com/Linaro/linux-kernel-warpdrive/tree/v5.5-rc1-uacce-v11

The library and user application:
https://github.com/Linaro/warpdrive/tree/wdprd-upstream-v11

References:
[1] http://jpbrucker.net/sva/
[2] http://jpbrucker.net/git/linux/log/?h=sva/zip-devel
[3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9

The series contains 4 patches. Patches 1 & 2 are for uacce itself;
patches 3 & 4 are an example using uacce (which happens to be crypto) and
can be merged later.

Change history:

v11: Add Reviewed-by tags and fix one mismatch with sysfs.

v10: Modify the include header to fix a kbuild test error on other
architectures.

v9: Suggested by Jonathan:
1. Remove the sysfs attributes numa_distance, node_id and id; also add an
   is_visible callback.
2. Split the API to close the potential race:
     struct uacce_device *uacce_alloc(struct device *parent,
                                      struct uacce_interface *interface)
     int uacce_register(struct uacce_device *uacce)
     void uacce_remove(struct uacce_device *uacce)
3. Split out cleanup patch 03.

v8: Address some comments from Jonathan. Merge Jean's patch, using
uacce_mm instead of pid for sva_exit.

v7: As suggested by Jean and Jerome: consider only the SVA case and remove
the unused DMA APIs from the first patch.
Also add mm_exit for SVA, vm_ops.close, etc.

v6: https://lkml.org/lkml/2019/10/16/231
Change sysfs qfrs_size to separate files, suggested by Jonathan.
Fix the crypto daily-build issue; based on the crypto code base and 5.4-rc1.

v5: https://lkml.org/lkml/2019/10/14/74
Add an example patch using the uacce interface, suggested by Greg:
0003-crypto-hisilicon-register-zip-engine-to-uacce.patch

v4: https://lkml.org/lkml/2019/9/17/116
Based on 5.4-rc1.
For other drivers integrating uacce: if uacce is not compiled,
uacce_register returns an error and uacce_unregister is empty.
Simplify the uacce flags to UACCE_DEV_SVA.
Address Greg's comments: fix the state machine, remove a potential syslog
flood triggered from user space, etc.

v3: https://lkml.org/lkml/2019/9/2/990
As recommended by Greg, use struct uacce_device instead of struct uacce,
and embed a struct cdev * in struct uacce_device; as a result, the cdev
can be released by itself when its refcount drops to 0. The two structures
are thus decoupled and self-maintained. Also add dev.release for
put_device.

v2: https://lkml.org/lkml/2019/8/28/565
Address comments from Greg and Jonathan.
Modify the uacce_register interface.
Drop no-IOMMU mode for now.

v1: https://lkml.org/lkml/2019/8/14/277
1. Rebase to 5.3-rc1.
2. Build on the IOMMU interface.
3. Verify with Jean's SVA and Eric's nested-mode IOMMU patches.
4. The user library has developed a lot: it supports zlib, OpenSSL, etc.
5. Move to drivers/misc first.

RFC3: https://lkml.org/lkml/2018/11/12/1951
RFC2: https://lwn.net/Articles/763990/

Background of why uacce:

A von Neumann processor is not good at general data manipulation: it is
designed for control-bound rather than data-bound applications. The latter
need less control-path machinery and more (and more specialized) ALUs. So
more and more heterogeneous processors, such as encryption/decryption
accelerators, TPUs, or EDGE (Explicit Data Graph Execution) processors,
are being introduced these days to gain better performance or power
efficiency for particular applications.
There are generally two ways to make use of these heterogeneous processors:

The first is to make them co-processors, just like the FPU. This works for
some applications, but it has its own cons:

1. It changes the ISA permanently.
2. All state elements must be saved when the process is switched out, but
   most data-bound processors have a huge set of state elements.
3. It makes the kernel scheduler more complex.

The second is the accelerator model. The accelerator is treated as an I/O
device from the CPU's point of view (though it need not be one physically).
The process running on the CPU holds a context on the accelerator and sends
instructions to it as if calling a function or running a thread that uses
the FPU. The context is bound to the accelerator itself, so the state
elements remain in the hardware context until the context is released. We
believe this is the core feature that distinguishes an "accelerator" from a
co-processor or other heterogeneous processors.

The intention of uacce is to provide the basic facilities to support this
model. Its first step is to make sure the accelerator and process can
[PATCH v11 3/4] crypto: hisilicon - Remove module_param uacce_mode
Remove the module_param uacce_mode, which is not used currently.

Reviewed-by: Jonathan Cameron
Signed-off-by: Zhangfei Gao
Signed-off-by: Zhou Wang
---
 drivers/crypto/hisilicon/zip/zip_main.c | 31 ++++++-------------------------
 1 file changed, 6 insertions(+), 25 deletions(-)

diff --git a/drivers/crypto/hisilicon/zip/zip_main.c b/drivers/crypto/hisilicon/zip/zip_main.c
index 31ae6a7..853b97e 100644
--- a/drivers/crypto/hisilicon/zip/zip_main.c
+++ b/drivers/crypto/hisilicon/zip/zip_main.c
@@ -298,9 +298,6 @@ static u32 pf_q_num = HZIP_PF_DEF_Q_NUM;
 module_param_cb(pf_q_num, &pf_q_num_ops, &pf_q_num, 0444);
 MODULE_PARM_DESC(pf_q_num, "Number of queues in PF(v1 1-4096, v2 1-1024)");
 
-static int uacce_mode;
-module_param(uacce_mode, int, 0);
-
 static u32 vfs_num;
 module_param(vfs_num, uint, 0444);
 MODULE_PARM_DESC(vfs_num, "Number of VFs to enable(1-63)");
@@ -796,6 +793,7 @@ static int hisi_zip_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	pci_set_drvdata(pdev, hisi_zip);
 
 	qm = &hisi_zip->qm;
+	qm->use_dma_api = true;
 	qm->pdev = pdev;
 	qm->ver = rev_id;
@@ -803,20 +801,6 @@ static int hisi_zip_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	qm->dev_name = hisi_zip_name;
 	qm->fun_type = (pdev->device == PCI_DEVICE_ID_ZIP_PF) ?
 			QM_HW_PF : QM_HW_VF;
-	switch (uacce_mode) {
-	case 0:
-		qm->use_dma_api = true;
-		break;
-	case 1:
-		qm->use_dma_api = false;
-		break;
-	case 2:
-		qm->use_dma_api = true;
-		break;
-	default:
-		return -EINVAL;
-	}
-
 	ret = hisi_qm_init(qm);
 	if (ret) {
 		dev_err(&pdev->dev, "Failed to init qm!\n");
@@ -1015,12 +999,10 @@ static int __init hisi_zip_init(void)
 		goto err_pci;
 	}
 
-	if (uacce_mode == 0 || uacce_mode == 2) {
-		ret = hisi_zip_register_to_crypto();
-		if (ret < 0) {
-			pr_err("Failed to register driver to crypto.\n");
-			goto err_crypto;
-		}
+	ret = hisi_zip_register_to_crypto();
+	if (ret < 0) {
+		pr_err("Failed to register driver to crypto.\n");
+		goto err_crypto;
 	}
 
 	return 0;
@@ -1035,8 +1017,7 @@ static int __init hisi_zip_init(void)
 static void __exit hisi_zip_exit(void)
 {
-	if (uacce_mode == 0 || uacce_mode == 2)
-		hisi_zip_unregister_from_crypto();
+	hisi_zip_unregister_from_crypto();
 	pci_unregister_driver(&hisi_zip_pci_driver);
 	hisi_zip_unregister_debugfs();
 }
-- 
2.7.4
[PATCH v11 2/4] uacce: add uacce driver
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU. This
differs from ordinary data sharing between the CPU and an I/O device, which
shares only data content rather than addresses. Thanks to the unified
address space, hardware and the user space of a process can use the same
virtual addresses in their communication.

Uacce creates a chrdev for every registered device, and a queue is
allocated to a process when the chrdev is opened. The process can then
access the hardware resources by interacting with the queue file. By
mmap()ing the queue file into its address space, the process can put
requests to the hardware directly, without syscalls into kernel space.

The IOMMU core only tracks mm<->device bonds at the moment, because it
only needs to handle IOTLB invalidation and PASID table entries. However,
uacce needs a finer granularity, since multiple queues from the same
device can be bound to one mm. When the mm exits, all bound queues must
be stopped so that the IOMMU can safely clear the PASID table entry and
reallocate the PASID. An intermediate struct uacce_mm links uacce devices
and queues. Note that an mm may be bound to multiple devices, but a
uacce_mm structure only ever belongs to a single device, because we don't
need anything more complex (if multiple devices are bound to one mm, we
create one uacce_mm for each bond).
 uacce_device --+-- uacce_mm --+-- uacce_queue
                |              '-- uacce_queue
                |
                '-- uacce_mm --+-- uacce_queue
                               +-- uacce_queue
                               '-- uacce_queue

Reviewed-by: Jonathan Cameron
Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Jean-Philippe Brucker
Signed-off-by: Zhangfei Gao
---
 Documentation/ABI/testing/sysfs-driver-uacce |  39 ++
 drivers/misc/Kconfig                         |   1 +
 drivers/misc/Makefile                        |   1 +
 drivers/misc/uacce/Kconfig                   |  13 +
 drivers/misc/uacce/Makefile                  |   2 +
 drivers/misc/uacce/uacce.c                   | 626 +++++++++++++++++++++++++++
 include/linux/uacce.h                        | 161 +++++++
 include/uapi/misc/uacce/uacce.h              |  38 ++
 8 files changed, 881 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce/uacce.h

diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce
new file mode 100644
index 000..ef4003a
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-uacce
@@ -0,0 +1,39 @@
+What:           /sys/class/uacce/<dev_name>/api
+Date:           Jan 2020
+KernelVersion:  5.6
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Api of the device
+                Can be any string and up to userspace to parse.
+                Application uses the api to match the correct driver
+
+What:           /sys/class/uacce/<dev_name>/flags
+Date:           Jan 2020
+KernelVersion:  5.6
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h
+
+What:           /sys/class/uacce/<dev_name>/available_instances
+Date:           Jan 2020
+KernelVersion:  5.6
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Available instances left of the device
+                Return -ENODEV if uacce_ops get_available_instances is not provided
+
+What:           /sys/class/uacce/<dev_name>/algorithms
+Date:           Jan 2020
+KernelVersion:  5.6
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Algorithms supported by this accelerator, separated by new line.
+                Can be any string and up to userspace to parse.
+
+What:           /sys/class/uacce/<dev_name>/region_mmio_size
+Date:           Jan 2020
+KernelVersion:  5.6
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Size (bytes) of mmio region queue file
+
+What:           /sys/class/uacce/<dev_name>/region_dus_size
+Date:           Jan 2020
+KernelVersion:  5.6
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Size (bytes) of dus region queue file
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 7f0d48f..99e1514 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -480,4 +480,5 @@ source "drivers/misc/cxl/Kconfig"
 source "drivers/misc/ocxl/Kconfig"
 source "drivers/misc/cardreader/Kconfig"
 source "drivers/misc/habanalabs/Kconfig"
+source "drivers/misc/uacce/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index c1860d3..9abf292 100644
---
[PATCH v11 1/4] uacce: Add documents for uacce
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) is a
kernel module that aims to provide Shared Virtual Addressing (SVA)
between the accelerator and process. This patch adds a document
explaining how it works.

Reviewed-by: Jonathan Cameron
Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/misc-devices/uacce.rst | 176 +++++++++++++++++++++++++++++++
 1 file changed, 176 insertions(+)
 create mode 100644 Documentation/misc-devices/uacce.rst

diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst
new file mode 100644
index 000..1db412e
--- /dev/null
+++ b/Documentation/misc-devices/uacce.rst
@@ -0,0 +1,176 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Introduction of Uacce
+---------------------
+
+Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
+provide Shared Virtual Addressing (SVA) between accelerators and processes.
+So accelerator can access any data structure of the main cpu.
+This differs from the data sharing between cpu and io device, which share
+only data content rather than address.
+Because of the unified address, hardware and user space of process can
+share the same virtual address in the communication.
+Uacce takes the hardware accelerator as a heterogeneous processor, while
+IOMMU share the same CPU page tables and as a result the same translation
+from va to pa.
+
+::
+
+         __________________________       __________________________
+        |                          |     |                          |
+        | User application (CPU)   |     |   Hardware Accelerator   |
+        |__________________________|     |__________________________|
+
+                     |                                |
+                     | va                             | va
+                     V                                V
+         __________________________       __________________________
+        |                          |     |                          |
+        |           MMU            |     |          IOMMU           |
+        |__________________________|     |__________________________|
+                     |                                |
+                     |                                |
+                     V pa                             V pa
+         _______________________________________________________________
+        |                                                               |
+        |                            Memory                             |
+        |_______________________________________________________________|
+
+
+
+Architecture
+------------
+
+Uacce is the kernel module, taking charge of iommu and address sharing.
+The user drivers and libraries are called WarpDrive.
+
+The uacce device, built around the IOMMU SVA API, can access multiple
+address spaces, including the one without PASID.
+
+A virtual concept, queue, is used for the communication. It provides a
+FIFO-like interface.
+And it maintains a unified address space between the
+application and all involved hardware.
+
+::
+
+             ___________________                  ________________
+            |                   |   user API     |                |
+            | WarpDrive library | -------------> |  user driver   |
+            |___________________|                |________________|
+                     |                                   |
+                     |                                   |
+                     | queue fd                          |
+                     |                                   |
+                     |                                   |
+                     v                                   |
+             ___________________      _________          |
+            |                   |    |         |         | mmap memory
+            | Other framework   |    |  uacce  |         | r/w interface
+            | crypto/nic/others |    |_________|         |
+            |___________________|                        |
+                     |                   |               |
+                     | register          | register      |
+                     |                   |               |
+                     |                   |               |
+              _________________     ___________          |
+             |                 |   |           |         |
+             | Device Driver   |   |   IOMMU   |         |
+             |_________________|   |___________|         |
+                     |
[PATCH v3 3/5] PCI: Introduce pci_direct_dma_alias()
The current DMA alias implementation requires that the aliased device be
on the same PCI bus as the requester ID. This introduces an arch-specific
mechanism to point to another PCI device when doing mapping and PCI DMA
alias search.

Signed-off-by: Jon Derrick
---
 arch/x86/pci/common.c |  7 +++++++
 drivers/pci/pci.c     | 19 ++++++++++++++++++-
 drivers/pci/search.c  |  7 +++++++
 include/linux/pci.h   |  1 +
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 1e59df0..83334a5 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -736,3 +736,10 @@ int pci_ext_cfg_avail(void)
 	else
 		return 0;
 }
+
+#if IS_ENABLED(CONFIG_VMD)
+struct pci_dev *pci_direct_dma_alias(struct pci_dev *dev)
+{
+	return to_pci_sysdata(dev->bus)->vmd_dev;
+}
+#endif
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ad746d9..1362694 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6034,7 +6034,9 @@ bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2)
 	return (dev1->dma_alias_mask &&
 		test_bit(dev2->devfn, dev1->dma_alias_mask)) ||
 	       (dev2->dma_alias_mask &&
-		test_bit(dev1->devfn, dev2->dma_alias_mask));
+		test_bit(dev1->devfn, dev2->dma_alias_mask)) ||
+	       (pci_direct_dma_alias(dev1) == dev2) ||
+	       (pci_direct_dma_alias(dev2) == dev1);
 }
 
 bool pci_device_is_present(struct pci_dev *pdev)
@@ -6058,6 +6060,21 @@ void pci_ignore_hotplug(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_ignore_hotplug);
 
+/**
+ * pci_direct_dma_alias - Get PCI DMA alias for PCI device
+ * @dev: the PCI device that may have a PCI DMA alias
+ *
+ * Permits the platform to provide architecture-specific functionality to
+ * devices needing to alias DMA to another PCI device on another PCI bus. If
+ * the PCI device is on the same bus, it is recommended to use
+ * pci_add_dma_alias(). This is the default implementation. Architecture
+ * implementations can override this.
+ */
+struct pci_dev __weak *pci_direct_dma_alias(struct pci_dev *dev)
+{
+	return NULL;
+}
+
 resource_size_t __weak pcibios_default_alignment(void)
 {
 	return 0;
diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index bade140..12811b3 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -32,6 +32,13 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
 	struct pci_bus *bus;
 	int ret;
 
+	/*
+	 * The device may have an explicit alias requester ID for DMA where the
+	 * requester is on another PCI bus.
+	 */
+	if (unlikely(pci_direct_dma_alias(pdev)))
+		pdev = pci_direct_dma_alias(pdev);
+
 	ret = fn(pdev, pci_dev_id(pdev), data);
 	if (ret)
 		return ret;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c393dff..cb6677b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1202,6 +1202,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 int pci_select_bars(struct pci_dev *dev, unsigned long flags);
 bool pci_device_is_present(struct pci_dev *pdev);
 void pci_ignore_hotplug(struct pci_dev *dev);
+struct pci_dev *pci_direct_dma_alias(struct pci_dev *dev);
 int __printf(6, 7) pci_request_irq(struct pci_dev *dev, unsigned int nr,
 		irq_handler_t handler, irq_handler_t thread_fn, void *dev_id,
-- 
1.8.3.1
[PATCH v3 1/5] x86/pci: Add a to_pci_sysdata helper
From: Christoph Hellwig

Various helpers need the pci_sysdata just to dereference a single field
in it. Add a little helper that returns the properly typed sysdata
pointer to require a little less boilerplate code.

Signed-off-by: Christoph Hellwig
[jonathan.derrick: added un-const cast]
Signed-off-by: Jon Derrick
---
 arch/x86/include/asm/pci.h | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 90d0731..cf680c5 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -35,12 +35,15 @@ struct pci_sysdata {
 
 #ifdef CONFIG_PCI
 
+static inline struct pci_sysdata *to_pci_sysdata(struct pci_bus *bus)
+{
+	return bus->sysdata;
+}
+
 #ifdef CONFIG_PCI_DOMAINS
 static inline int pci_domain_nr(struct pci_bus *bus)
 {
-	struct pci_sysdata *sd = bus->sysdata;
-
-	return sd->domain;
+	return to_pci_sysdata(bus)->domain;
 }
 
 static inline int pci_proc_domain(struct pci_bus *bus)
@@ -52,23 +55,20 @@ static inline int pci_proc_domain(struct pci_bus *bus)
 #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
 static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
 {
-	struct pci_sysdata *sd = bus->sysdata;
-
-	return sd->fwnode;
+	return to_pci_sysdata(bus)->fwnode;
 }
 
 #define pci_root_bus_fwnode	_pci_root_bus_fwnode
 #endif
 
+#if IS_ENABLED(CONFIG_VMD)
 static inline bool is_vmd(struct pci_bus *bus)
 {
-#if IS_ENABLED(CONFIG_VMD)
-	struct pci_sysdata *sd = bus->sysdata;
-
-	return sd->vmd_domain;
+	return to_pci_sysdata(bus)->vmd_domain;
+}
 #else
-	return false;
-#endif
+#define is_vmd(bus)		false
+#endif /* CONFIG_VMD */
 }
 
 /* Can be used to override the logic in pci_scan_bus for skipping
@@ -124,9 +124,7 @@ static inline void early_quirks(void) { }
 /* Returns the node based on pci bus */
 static inline int __pcibus_to_node(const struct pci_bus *bus)
 {
-	const struct pci_sysdata *sd = bus->sysdata;
-
-	return sd->node;
+	return to_pci_sysdata((struct pci_bus *) bus)->node;
 }
 
 static inline const struct cpumask *
-- 
1.8.3.1
[PATCH v3 4/5] PCI: vmd: Stop overriding dma_map_ops
Devices on the VMD domain use the VMD endpoint's requester ID and have
been relying on the VMD endpoint's DMA operations. The problem with this
was that VMD domain devices would use the VMD endpoint's attributes when
doing DMA and IOMMU mapping.

We can be smarter about this by only using the VMD endpoint when mapping
and providing the correct child device's attributes during DMA
operations.

This patch modifies Intel-IOMMU to check for a 'Direct DMA Alias' and
refer to it for mapping.

Signed-off-by: Jon Derrick
---
 drivers/iommu/intel-iommu.c    |  18 +++--
 drivers/pci/controller/Kconfig |   1 -
 drivers/pci/controller/vmd.c   | 150 +----------------------------------
 3 files changed, 13 insertions(+), 156 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 716347e2..7ca807a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -804,14 +804,14 @@ static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devf
 
 	if (dev_is_pci(dev)) {
 		struct pci_dev *pf_pdev;
+		struct pci_dev *dma_alias;
 
 		pdev = to_pci_dev(dev);
 
-#ifdef CONFIG_X86
-		/* VMD child devices currently cannot be handled individually */
-		if (is_vmd(pdev->bus))
-			return NULL;
-#endif
+		/* DMA aliased devices use the DMA alias's IOMMU */
+		dma_alias = pci_direct_dma_alias(pdev);
+		if (dma_alias)
+			pdev = dma_alias;
 
 		/* VFs aren't listed in scope tables; we need to look up
 		 * the PF instead to find the IOMMU.
		 */
@@ -2521,6 +2521,14 @@ struct dmar_domain *find_domain(struct device *dev)
 		     dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO))
 		return NULL;
 
+	if (dev_is_pci(dev)) {
+		struct pci_dev *pdev = to_pci_dev(dev);
+		struct pci_dev *dma_alias = pci_direct_dma_alias(pdev);
+
+		if (dma_alias)
+			dev = &dma_alias->dev;
+	}
+
 	/* No lock here, assumes no domain exit in normal case */
 	info = dev->archdata.iommu;
 	if (likely(info))
diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index c77069c..55671429 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -239,7 +239,6 @@ config PCIE_TANGO_SMP8759
 
 config VMD
 	depends on PCI_MSI && X86_64 && SRCU
-	select X86_DEV_DMA_OPS
 	tristate "Intel Volume Management Device Driver"
 	---help---
 	  Adds support for the Intel Volume Management Device (VMD). VMD is a
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index d67ad56..fe1acb0 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -98,9 +98,6 @@ struct vmd_dev {
 	struct irq_domain	*irq_domain;
 	struct pci_bus		*bus;
 	u8			busn_start;
-
-	struct dma_map_ops	dma_ops;
-	struct dma_domain	dma_domain;
 };
 
 static inline struct vmd_dev *vmd_from_bus(struct pci_bus *bus)
@@ -295,151 +292,6 @@ static void vmd_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
 	.chip		= &vmd_msi_controller,
 };
 
-/*
- * VMD replaces the requester ID with its own. DMA mappings for devices in a
- * VMD domain need to be mapped for the VMD, not the device requiring
- * the mapping.
- */
-static struct device *to_vmd_dev(struct device *dev)
-{
-	struct pci_dev *pdev = to_pci_dev(dev);
-	struct vmd_dev *vmd = vmd_from_bus(pdev->bus);
-
-	return &vmd->dev->dev;
-}
-
-static void *vmd_alloc(struct device *dev, size_t size, dma_addr_t *addr,
-		       gfp_t flag, unsigned long attrs)
-{
-	return dma_alloc_attrs(to_vmd_dev(dev), size, addr, flag, attrs);
-}
-
-static void vmd_free(struct device *dev, size_t size, void *vaddr,
-		     dma_addr_t addr, unsigned long attrs)
-{
-	return dma_free_attrs(to_vmd_dev(dev), size, vaddr, addr, attrs);
-}
-
-static int vmd_mmap(struct device *dev, struct vm_area_struct *vma,
-		    void *cpu_addr, dma_addr_t addr, size_t size,
-		    unsigned long attrs)
-{
-	return dma_mmap_attrs(to_vmd_dev(dev), vma, cpu_addr, addr, size,
-			      attrs);
-}
-
-static int vmd_get_sgtable(struct device *dev, struct sg_table *sgt,
-			   void *cpu_addr, dma_addr_t addr, size_t size,
-			   unsigned long attrs)
-{
-	return dma_get_sgtable_attrs(to_vmd_dev(dev), sgt, cpu_addr, addr,
-				     size, attrs);
-}
-
-static dma_addr_t vmd_map_page(struct device *dev, struct page *page,
-			       unsigned long offset, size_t size,
-			       enum dma_data_direction dir,
-			       unsigned long attrs)
-{
-
[PATCH v3 0/5] Clean up VMD DMA Map Ops
v2 Set: https://lore.kernel.org/linux-iommu/1578580256-3483-1-git-send-email-jonathan.derr...@intel.com/T/#t
v1 Set: https://lore.kernel.org/linux-iommu/20200107134125.gd30...@8bytes.org/T/#t

VMD currently works with VT-d enabled by pointing DMA and IOMMU actions
at the VMD endpoint. The problem with this approach is that the VMD
endpoint's device-specific attributes, such as the DMA mask bits, are
used instead of the child device's.

This set cleans up VMD by removing the override that redirects DMA map
operations to the VMD endpoint. Instead it introduces a new DMA alias
mechanism into the existing DMA alias infrastructure.

v1 added a pointer in struct pci_dev that pointed to the DMA alias'
struct pci_dev and did the necessary DMA alias and IOMMU modifications.

v2 introduced a new weak function to reference the 'Direct DMA Alias',
and removed the need to add a pointer in struct device or pci_dev. Weak
functions are generally frowned upon when it's a single-architecture
implementation, so I am open to alternatives.

v3 references the pci_dev rather than the struct device for the PCI
'Direct DMA Alias' (pci_direct_dma_alias()). This revision also allows
pci_for_each_dma_alias() to call any DMA aliases for the Direct DMA
Alias device, though I don't expect the VMD endpoint to need intra-bus
DMA aliases.

Changes from v2:
- Use struct pci_dev for PCI device 'Direct DMA aliasing'
  (pci_direct_dma_alias())
- Allow pci_for_each_dma_alias() to iterate over the alias mask of the
  'Direct DMA Alias'

Changes from v1:
- Removed 1/5 & 2/5 misc fix patches that were merged
- Use Christoph's staging/cleanup patches
- Introduce a weak function rather than adding a pointer to struct
  device or pci_dev

Based on Joerg's next:
https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/

Jon Derrick (5):
  x86/pci: Add a to_pci_sysdata helper
  x86/PCI: Expose VMD's PCI Device in pci_sysdata
  PCI: Introduce pci_direct_dma_alias()
  PCI: vmd: Stop overriding dma_map_ops
  x86/pci: Remove X86_DEV_DMA_OPS

 arch/x86/Kconfig               |   3 -
 arch/x86/include/asm/device.h  |  10 ---
 arch/x86/include/asm/pci.h     |  31 ++++----
 arch/x86/pci/common.c          |  45 ++++++-----
 drivers/iommu/intel-iommu.c    |  18 +++--
 drivers/pci/controller/Kconfig |   1 -
 drivers/pci/controller/vmd.c   | 152 +----------------------------------
 drivers/pci/pci.c              |  19 ++++-
 drivers/pci/search.c           |   7 ++
 include/linux/pci.h            |   1 +
 10 files changed, 61 insertions(+), 226 deletions(-)

-- 
1.8.3.1
[PATCH v3 2/5] x86/PCI: Expose VMD's PCI Device in pci_sysdata
To be used by Intel-IOMMU code to find the correct domain.

CC: Christoph Hellwig
Signed-off-by: Jon Derrick
---
 arch/x86/include/asm/pci.h   | 5 ++---
 drivers/pci/controller/vmd.c | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index cf680c5..b982d9c 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -25,7 +25,7 @@ struct pci_sysdata {
 	void		*fwnode;	/* IRQ domain for MSI assignment */
 #endif
 #if IS_ENABLED(CONFIG_VMD)
-	bool vmd_domain;		/* True if in Intel VMD domain */
+	struct pci_dev	*vmd_dev;	/* VMD Device if in Intel VMD domain */
 #endif
 };
 
@@ -64,12 +64,11 @@ static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
 #if IS_ENABLED(CONFIG_VMD)
 static inline bool is_vmd(struct pci_bus *bus)
 {
-	return to_pci_sysdata(bus)->vmd_domain;
+	return to_pci_sysdata(bus)->vmd_dev != NULL;
 }
 #else
 #define is_vmd(bus)		false
 #endif /* CONFIG_VMD */
-}
 
 /* Can be used to override the logic in pci_scan_bus for skipping
    already-configured bus numbers - to be used for buggy BIOSes
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 2128422..d67ad56 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -679,7 +679,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 		.parent = res,
 	};
 
-	sd->vmd_domain = true;
+	sd->vmd_dev = vmd->dev;
 	sd->domain = vmd_find_free_domain();
 	if (sd->domain < 0)
 		return sd->domain;
-- 
1.8.3.1
[PATCH v3 5/5] x86/pci: Remove X86_DEV_DMA_OPS
From: Christoph Hellwig There are no users of X86_DEV_DMA_OPS left, so remove the code. Reviewed-by: Jon Derrick Signed-off-by: Christoph Hellwig --- arch/x86/Kconfig | 3 --- arch/x86/include/asm/device.h | 10 -- arch/x86/pci/common.c | 38 -- 3 files changed, 51 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5e89499..77f9426 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2955,9 +2955,6 @@ config HAVE_ATOMIC_IOMAP def_bool y depends on X86_32 -config X86_DEV_DMA_OPS - bool - source "drivers/firmware/Kconfig" source "arch/x86/kvm/Kconfig" diff --git a/arch/x86/include/asm/device.h b/arch/x86/include/asm/device.h index 5e12c63..7e31f7f 100644 --- a/arch/x86/include/asm/device.h +++ b/arch/x86/include/asm/device.h @@ -8,16 +8,6 @@ struct dev_archdata { #endif }; -#if defined(CONFIG_X86_DEV_DMA_OPS) && defined(CONFIG_PCI_DOMAINS) -struct dma_domain { - struct list_head node; - const struct dma_map_ops *dma_ops; - int domain_nr; -}; -void add_dma_domain(struct dma_domain *domain); -void del_dma_domain(struct dma_domain *domain); -#endif - struct pdev_archdata { }; diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c index 83334a5..8f60d52 100644 --- a/arch/x86/pci/common.c +++ b/arch/x86/pci/common.c @@ -625,43 +625,6 @@ unsigned int pcibios_assign_all_busses(void) return (pci_probe & PCI_ASSIGN_ALL_BUSSES) ? 
		1 : 0;
 }
 
-#if defined(CONFIG_X86_DEV_DMA_OPS) && defined(CONFIG_PCI_DOMAINS)
-static LIST_HEAD(dma_domain_list);
-static DEFINE_SPINLOCK(dma_domain_list_lock);
-
-void add_dma_domain(struct dma_domain *domain)
-{
-	spin_lock(&dma_domain_list_lock);
-	list_add(&domain->node, &dma_domain_list);
-	spin_unlock(&dma_domain_list_lock);
-}
-EXPORT_SYMBOL_GPL(add_dma_domain);
-
-void del_dma_domain(struct dma_domain *domain)
-{
-	spin_lock(&dma_domain_list_lock);
-	list_del(&domain->node);
-	spin_unlock(&dma_domain_list_lock);
-}
-EXPORT_SYMBOL_GPL(del_dma_domain);
-
-static void set_dma_domain_ops(struct pci_dev *pdev)
-{
-	struct dma_domain *domain;
-
-	spin_lock(&dma_domain_list_lock);
-	list_for_each_entry(domain, &dma_domain_list, node) {
-		if (pci_domain_nr(pdev->bus) == domain->domain_nr) {
-			pdev->dev.dma_ops = domain->dma_ops;
-			break;
-		}
-	}
-	spin_unlock(&dma_domain_list_lock);
-}
-#else
-static void set_dma_domain_ops(struct pci_dev *pdev) {}
-#endif
-
 static void set_dev_domain_options(struct pci_dev *pdev)
 {
 	if (is_vmd(pdev->bus))
@@ -697,7 +660,6 @@ int pcibios_add_device(struct pci_dev *dev)
 		pa_data = data->next;
 		memunmap(data);
 	}
-	set_dma_domain_ops(dev);
 	set_dev_domain_options(dev);
 	return 0;
 }
-- 
1.8.3.1
Re: [PATCH v2 2/2] iommu/vt-d: skip invalid RMRR entries
Hi Barret, this looks good.

Thanks,
Yian

On 1/7/2020 11:16 AM, Barret Rhoden wrote:
The VT-d docs specify requirements for the RMRR entries base and end (called 'Limit' in the docs) addresses. This commit will cause the DMAR processing to skip any RMRR entries that do not meet these requirements and mark the firmware as tainted, since the firmware is giving us junk.

Signed-off-by: Barret Rhoden
---
 drivers/iommu/intel-iommu.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a8bb458845bc..32c3c6338a3d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4315,13 +4315,25 @@ static void __init init_iommu_pm_ops(void)
 static inline void init_iommu_pm_ops(void) {}
 #endif	/* CONFIG_PM */
 
+static int rmrr_validity_check(struct acpi_dmar_reserved_memory *rmrr)
+{
+	if ((rmrr->base_address & PAGE_MASK) ||
+	    (rmrr->end_address <= rmrr->base_address) ||
+	    ((rmrr->end_address - rmrr->base_address + 1) & PAGE_MASK)) {
+		pr_err(FW_BUG "Broken RMRR base: %#018Lx end: %#018Lx\n",
+		       rmrr->base_address, rmrr->end_address);
+		return -EINVAL;
+	}
+	return 0;
+}
+
 int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
 {
 	struct acpi_dmar_reserved_memory *rmrr;
 	struct dmar_rmrr_unit *rmrru;
 
 	rmrr = (struct acpi_dmar_reserved_memory *)header;
-	if (arch_rmrr_sanity_check(rmrr)) {
+	if (rmrr_validity_check(rmrr) || arch_rmrr_sanity_check(rmrr)) {
 		WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND,
 			   "Your BIOS is broken; bad RMRR [%#018Lx-%#018Lx]\n"
 			   "BIOS vendor: %s; Ver: %s; Product Version: %s\n"
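To make the three rejection conditions easy to exercise in isolation, here is a hedged, standalone C sketch of the intended checks, assuming a 4 KiB page. Note that `SKETCH_PAGE_MASK` and `rmrr_range_ok` are invented for illustration, and the mask here covers the in-page offset bits so each condition can be expressed directly as an alignment test:

```c
#include <stdint.h>

/* Hypothetical 4 KiB page; the kernel derives this from PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 0x1000ULL
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1)	/* low (offset) bits */

/*
 * Return 0 if an RMRR-style [base, end] range is acceptable:
 * base page-aligned, end ('Limit') strictly above base, and the
 * inclusive size a whole number of pages. Return -1 otherwise.
 */
static int rmrr_range_ok(uint64_t base, uint64_t end)
{
	if (base & SKETCH_PAGE_MASK)		/* base must be page-aligned */
		return -1;
	if (end <= base)			/* empty or inverted range */
		return -1;
	if ((end - base + 1) & SKETCH_PAGE_MASK) /* size not a page multiple */
		return -1;
	return 0;
}
```

A well-formed one-page region such as [0x1000, 0x1fff] passes, while a misaligned base, an inverted range, or a ragged size is rejected.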
Re: [PATCH v8 02/10] iommu/vt-d: Add nested translation helper function
On Fri, 10 Jan 2020 09:15:45 +0800 Lu Baolu wrote: > Hi Jacob, > > On 1/10/20 2:39 AM, Jacob Pan wrote: > > On Wed, 18 Dec 2019 10:41:53 +0800 > > Lu Baolu wrote: > > > >> Hi again, > >> > >> On 12/17/19 3:24 AM, Jacob Pan wrote: > >>> +/** > >>> + * intel_pasid_setup_nested() - Set up PASID entry for nested > >>> translation > >>> + * which is used for vSVA. The first level page tables are used > >>> for > >>> + * GVA-GPA or GIOVA-GPA translation in the guest, second level > >>> page tables > >>> + * are used for GPA-HPA translation. > >>> + * > >>> + * @iommu: Iommu which the device belong to > >>> + * @dev:Device to be set up for translation > >>> + * @gpgd: FLPTPTR: First Level Page translation pointer in > >>> GPA > >>> + * @pasid: PASID to be programmed in the device PASID table > >>> + * @pasid_data: Additional PASID info from the guest bind request > >>> + * @domain: Domain info for setting up second level page > >>> tables > >>> + * @addr_width: Address width of the first level (guest) > >>> + */ > >>> +int intel_pasid_setup_nested(struct intel_iommu *iommu, > >>> + struct device *dev, pgd_t *gpgd, > >>> + int pasid, struct > >>> iommu_gpasid_bind_data_vtd *pasid_data, > >>> + struct dmar_domain *domain, > >>> + int addr_width) > >>> +{ > >>> + struct pasid_entry *pte; > >>> + struct dma_pte *pgd; > >>> + u64 pgd_val; > >>> + int agaw; > >>> + u16 did; > >>> + > >>> + if (!ecap_nest(iommu->ecap)) { > >>> + pr_err("IOMMU: %s: No nested translation > >>> support\n", > >>> +iommu->name); > >>> + return -EINVAL; > >>> + } > >>> + > >>> + pte = intel_pasid_get_entry(dev, pasid); > >>> + if (WARN_ON(!pte)) > >>> + return -EINVAL; > >>> + > >>> + pasid_clear_entry(pte); > >> > >> In some cases, e.g. nested mode for GIOVA-HPA, the PASID entry > >> might have already been setup for second level translation. (This > >> could be checked with the Present bit.) Hence, it's safe to flush > >> caches here. 
> >>
> >> Or, maybe intel_pasid_tear_down_entry() is more suitable?
> >>
> > We don't allow binding the same device-PASID twice, so if the PASID
> > entry was used for GIOVA/RID2PASID, it should be unbound first, with the
> > teardown flush included, right?
> >
>
> Fair enough. Can you please add this as a comment to this function, so
> that the caller of this interface can know this? Or add a check in
> this function which returns an error if the PASID entry has already been
> bound.
>

Sounds good, I will add both the comment and the check, like this:

	/*
	 * Caller must ensure PASID entry is not in use, i.e. not bind the
	 * same PASID to the same device twice.
	 */
	if (pasid_pte_is_present(pte))
		return -EBUSY;

We already have the check in the current caller.

Thanks,

> Best regards,
> baolu

[Jacob Pan]
Re: [PATCH] dma-contiguous: CMA: give precedence to cmdline
Hi Nicolas, On 10/01/2020 17:19, Nicolas Saenz Julienne wrote: Although the device tree might contain a reserved-memory DT node dedicated as the default CMA pool, users might want to change CMA's parameters using the kernel command line for debugging purposes and whatnot. Honor this by bypassing the reserved memory CMA setup, which will ultimately end up freeing the memblock and allow the command line CMA configuration routine to run. Signed-off-by: Nicolas Saenz Julienne --- NOTE: Tested this on arm and arm64 with the Raspberry Pi 4. kernel/dma/contiguous.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c index daa4e6eefdde..8bc6f2d670f9 100644 --- a/kernel/dma/contiguous.c +++ b/kernel/dma/contiguous.c @@ -302,9 +302,16 @@ static int __init rmem_cma_setup(struct reserved_mem *rmem) phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order); phys_addr_t mask = align - 1; unsigned long node = rmem->fdt_node; + bool default_cma = of_get_flat_dt_prop(node, "linux,cma-default", NULL); struct cma *cma; int err; + if (size_cmdline != -1 && default_cma) { + pr_info("Reserved memory: bypass %s node, using cmdline CMA params instead\n", + rmem->name); + return -EBUSY; + } + if (!of_get_flat_dt_prop(node, "reusable", NULL) || of_get_flat_dt_prop(node, "no-map", NULL)) return -EINVAL; @@ -322,7 +329,7 @@ static int __init rmem_cma_setup(struct reserved_mem *rmem) /* Architecture specific contiguous memory fixup. */ dma_contiguous_early_fixup(rmem->base, rmem->size); - if (of_get_flat_dt_prop(node, "linux,cma-default", NULL)) + if (default_cma) dma_contiguous_set_default(cma); rmem->ops = _cma_ops; For what it's worth, Reviewed-by: Phil Elwell Phil ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] dma-contiguous: CMA: give precedence to cmdline
Although the device tree might contain a reserved-memory DT node dedicated as the default CMA pool, users might want to change CMA's parameters using the kernel command line for debugging purposes and whatnot. Honor this by bypassing the reserved memory CMA setup, which will ultimately end up freeing the memblock and allow the command line CMA configuration routine to run. Signed-off-by: Nicolas Saenz Julienne --- NOTE: Tested this on arm and arm64 with the Raspberry Pi 4. kernel/dma/contiguous.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c index daa4e6eefdde..8bc6f2d670f9 100644 --- a/kernel/dma/contiguous.c +++ b/kernel/dma/contiguous.c @@ -302,9 +302,16 @@ static int __init rmem_cma_setup(struct reserved_mem *rmem) phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order); phys_addr_t mask = align - 1; unsigned long node = rmem->fdt_node; + bool default_cma = of_get_flat_dt_prop(node, "linux,cma-default", NULL); struct cma *cma; int err; + if (size_cmdline != -1 && default_cma) { + pr_info("Reserved memory: bypass %s node, using cmdline CMA params instead\n", + rmem->name); + return -EBUSY; + } + if (!of_get_flat_dt_prop(node, "reusable", NULL) || of_get_flat_dt_prop(node, "no-map", NULL)) return -EINVAL; @@ -322,7 +329,7 @@ static int __init rmem_cma_setup(struct reserved_mem *rmem) /* Architecture specific contiguous memory fixup. */ dma_contiguous_early_fixup(rmem->base, rmem->size); - if (of_get_flat_dt_prop(node, "linux,cma-default", NULL)) + if (default_cma) dma_contiguous_set_default(cma); rmem->ops = _cma_ops; -- 2.24.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Regression iommu/vt-d bounce buffer
Hi Baolu, any ideas here?

On Mon, Jan 06, 2020 at 04:43:05PM +0100, Frederik Schwan wrote:
> Hello people,
> since the introduction of the bounce buffer, a regression with TB3 devices
> has been introduced. USB devices attached to TB3 refuse to work since then.
> Removing the commits that introduced the bounce buffer fixes the issue.
>
> Commits:
> 3b53034c268d550d9e8522e613a14ab53b8840d8
> c5a5dc4cbbf4540c1891cdb2b70cf469405ea61f
> cfb94a372f2d4ee226247447c863f8709863d170
> e5e04d051979dbd636a99099b7a595093c50a4bc
>
> An excerpt of the trace:
> > [ +0,05] WARNING: CPU: 10 PID: 0 at drivers/iommu/intel-iommu.c:3916 bounce_unmap_single+0x103/0x110
> > [...]
> > [ +0,01] Call Trace:
> > [ +0,02] 
> > [ +0,03] usb_hcd_unmap_urb_setup_for_dma+0x9f/0xe0
> > [ +0,01] usb_hcd_unmap_urb_for_dma+0x1c/0x170
> > [ +0,02] __usb_hcd_giveback_urb+0x36/0x120
> > [ +0,08] xhci_giveback_urb_in_irq.isra.0+0x72/0x100 [xhci_hcd]
> > [ +0,07] xhci_td_cleanup+0x101/0x140 [xhci_hcd]
> > [ +0,07] xhci_irq+0xbf0/0x1db0 [xhci_hcd]
> > [ +0,05] __handle_irq_event_percpu+0x44/0x1b0
> > [ +0,02] handle_irq_event_percpu+0x34/0x80
> > [ +0,02] handle_irq_event+0x37/0x54
> > [ +0,02] handle_edge_irq+0xae/0x1f0
> > [ +0,02] do_IRQ+0x84/0x140
> > [ +0,03] common_interrupt+0xf/0xf
> > [ +0,01] 
>
> The dmesg log and further information have been posted here:
> https://bugzilla.kernel.org/show_bug.cgi?id=205893
>
> Cheers,
> Frederik
Re: [PATCH] iommu/io-pgtable-arm: Improve attribute handling
On Fri, Jan 10, 2020 at 03:21:51PM +, Robin Murphy wrote: > By VMSA rules, using Normal Non-Cacheable type with a shareability > attribute of anything other than Outer Shareable is liable to lead into > unpredictable territory. Although the SMMU architectures seem to give > some slightly stronger guarantees of Non-Cacheable output types becoming > implicitly Outer Shareable in most cases, we may as well be explicit and > not take any chances. It's also weird that LPAE attribute handling is > currently split between prot_to_pte() and init_pte() given that it can > all be statically determined up-front. Thus, collect *all* the LPAE > attributes into prot_to_pte() in order to logically pick the > shareability based on the incoming IOMMU API prot value, and tweak the > short-descriptor code to stop setting TTBR0.NOS for Non-Cacheable walks. > > Signed-off-by: Robin Murphy > --- > drivers/iommu/io-pgtable-arm-v7s.c | 7 +++ > drivers/iommu/io-pgtable-arm.c | 17 +++-- > 2 files changed, 14 insertions(+), 10 deletions(-) > > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c > b/drivers/iommu/io-pgtable-arm-v7s.c > index 7c3bd2c3cdca..7d6a8622f2e6 100644 > --- a/drivers/iommu/io-pgtable-arm-v7s.c > +++ b/drivers/iommu/io-pgtable-arm-v7s.c > @@ -823,10 +823,9 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct > io_pgtable_cfg *cfg, > wmb(); > > /* TTBRs */ > - cfg->arm_v7s_cfg.ttbr[0] = virt_to_phys(data->pgd) | > -ARM_V7S_TTBR_S | ARM_V7S_TTBR_NOS | > -(cfg->coherent_walk ? > -(ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) | > + cfg->arm_v7s_cfg.ttbr[0] = virt_to_phys(data->pgd) | ARM_V7S_TTBR_S | > +(cfg->coherent_walk ? (ARM_V7S_TTBR_NOS | > + ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) | Ha, I just sent a broken version of this out myself, but with you as the author ;) I'll merge this hunk into that patch, if you don't mind? 
> + if (prot & IOMMU_CACHE) > + pte |= ARM_LPAE_PTE_SH_IS; > + else > + pte |= ARM_LPAE_PTE_SH_OS; > + > if (prot & IOMMU_NOEXEC) > pte |= ARM_LPAE_PTE_XN; > > + if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_NS) > + pte |= ARM_LPAE_PTE_NS; > + > + if (data->iop.fmt != ARM_MALI_LPAE) > + pte |= ARM_LPAE_PTE_AF; > + I left these last two where they were, just because they're not driven directly from the IOMMU prot encoding. However, I don't mind moving them, but let's do that as a separate patch. Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 7/8] iommu/io-pgtable-arm: Rationalise VTCR handling
Commit 05a648cd2dd7 ("iommu/io-pgtable-arm: Rationalise TCR handling") reworked the way in which the TCR register value is returned from the io-pgtable code when targetting the Arm long-descriptor format, in preparation for allowing page-tables to target TTBR1. As it turns out, the new interface is a lot nicer to use, so do the same conversion for the VTCR register even though there is only a single base register for stage-2 translation. Cc: Robin Murphy Signed-off-by: Will Deacon --- drivers/iommu/arm-smmu-v3.c| 17 +- drivers/iommu/arm-smmu.c | 2 +- drivers/iommu/arm-smmu.h | 21 + drivers/iommu/io-pgtable-arm.c | 57 +- include/linux/io-pgtable.h | 10 +- 5 files changed, 68 insertions(+), 39 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index d127974afdb7..4443e1890077 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -250,6 +250,13 @@ #define STRTAB_STE_2_S2VMIDGENMASK_ULL(15, 0) #define STRTAB_STE_2_VTCR GENMASK_ULL(50, 32) +#define STRTAB_STE_2_VTCR_S2T0SZ GENMASK_ULL(5, 0) +#define STRTAB_STE_2_VTCR_S2SL0GENMASK_ULL(7, 6) +#define STRTAB_STE_2_VTCR_S2IR0GENMASK_ULL(9, 8) +#define STRTAB_STE_2_VTCR_S2OR0GENMASK_ULL(11, 10) +#define STRTAB_STE_2_VTCR_S2SH0GENMASK_ULL(13, 12) +#define STRTAB_STE_2_VTCR_S2TG GENMASK_ULL(15, 14) +#define STRTAB_STE_2_VTCR_S2PS GENMASK_ULL(18, 16) #define STRTAB_STE_2_S2AA64(1UL << 51) #define STRTAB_STE_2_S2ENDI(1UL << 52) #define STRTAB_STE_2_S2PTW (1UL << 54) @@ -2159,14 +2166,22 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain, int vmid; struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_s2_cfg *cfg = _domain->s2_cfg; + typeof(_cfg->arm_lpae_s2_cfg.vtcr) vtcr; vmid = arm_smmu_bitmap_alloc(smmu->vmid_map, smmu->vmid_bits); if (vmid < 0) return vmid; + vtcr = _cfg->arm_lpae_s2_cfg.vtcr; cfg->vmid = (u16)vmid; cfg->vttbr = pgtbl_cfg->arm_lpae_s2_cfg.vttbr; - cfg->vtcr = pgtbl_cfg->arm_lpae_s2_cfg.vtcr; + cfg->vtcr = 
FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps); return 0; } diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 214be09f6ded..f067783ebd59 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -548,7 +548,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain, cb->tcr[0] |= ARM_SMMU_TCR_EAE; } } else { - cb->tcr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vtcr; + cb->tcr[0] = arm_smmu_lpae_vtcr(pgtbl_cfg); } /* TTBRs */ diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h index 6501f38a5966..8d1cd54d82a6 100644 --- a/drivers/iommu/arm-smmu.h +++ b/drivers/iommu/arm-smmu.h @@ -174,6 +174,15 @@ enum arm_smmu_cbar_type { #define ARM_SMMU_TCR_IRGN0 GENMASK(9, 8) #define ARM_SMMU_TCR_T0SZ GENMASK(5, 0) +#define ARM_SMMU_VTCR_RES1 BIT(31) +#define ARM_SMMU_VTCR_PS GENMASK(18, 16) +#define ARM_SMMU_VTCR_TG0 ARM_SMMU_TCR_TG0 +#define ARM_SMMU_VTCR_SH0 ARM_SMMU_TCR_SH0 +#define ARM_SMMU_VTCR_ORGN0ARM_SMMU_TCR_ORGN0 +#define ARM_SMMU_VTCR_IRGN0ARM_SMMU_TCR_IRGN0 +#define ARM_SMMU_VTCR_SL0 GENMASK(7, 6) +#define ARM_SMMU_VTCR_T0SZ ARM_SMMU_TCR_T0SZ + #define ARM_SMMU_CB_CONTEXTIDR 0x34 #define ARM_SMMU_CB_S1_MAIR0 0x38 #define ARM_SMMU_CB_S1_MAIR1 0x3c @@ -352,6 +361,18 @@ static inline u32 arm_smmu_lpae_tcr2(struct io_pgtable_cfg *cfg) FIELD_PREP(ARM_SMMU_TCR2_SEP, ARM_SMMU_TCR2_SEP_UPSTREAM); } +static inline u32 arm_smmu_lpae_vtcr(struct io_pgtable_cfg *cfg) +{ + return ARM_SMMU_VTCR_RES1 | + FIELD_PREP(ARM_SMMU_VTCR_PS, cfg->arm_lpae_s2_cfg.vtcr.ps) | + FIELD_PREP(ARM_SMMU_VTCR_TG0, cfg->arm_lpae_s2_cfg.vtcr.tg) | + FIELD_PREP(ARM_SMMU_VTCR_SH0, cfg->arm_lpae_s2_cfg.vtcr.sh) | + FIELD_PREP(ARM_SMMU_VTCR_ORGN0, 
cfg->arm_lpae_s2_cfg.vtcr.orgn) | + FIELD_PREP(ARM_SMMU_VTCR_IRGN0, cfg->arm_lpae_s2_cfg.vtcr.irgn) | +
[PATCH 4/8] iommu/io-pgtable-arm: Ensure ARM_64_LPAE_S2_TCR_RES1 is unsigned
ARM_64_LPAE_S2_TCR_RES1 is intended to map to bit 31 of the VTCR register, which is required to be set to 1 by the architecture. Unfortunately, we accidentally treat this as a signed quantity, which means we also set the upper 32 bits of the VTCR to one, and they are required to be zero.

Treat ARM_64_LPAE_S2_TCR_RES1 as unsigned to avoid the unwanted sign-extension up to 64 bits.

Cc: Robin Murphy
Signed-off-by: Will Deacon
---
 drivers/iommu/io-pgtable-arm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 4b437ead2300..89216bf37282 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -101,7 +101,7 @@
 /* Register bits */
 #define ARM_32_LPAE_TCR_EAE		(1 << 31)
-#define ARM_64_LPAE_S2_TCR_RES1		(1 << 31)
+#define ARM_64_LPAE_S2_TCR_RES1		(1U << 31)
 
 #define ARM_LPAE_TCR_EPD1		(1 << 23)
-- 
2.25.0.rc1.283.g88dfdc4193-goog
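The sign-extension the patch describes is easy to reproduce outside the kernel. This is a minimal standalone sketch (the function names and the bare bit-31 constant are illustrative, not the kernel's actual VTCR plumbing): widening a signed 32-bit value with bit 31 set to 64 bits sets all of the upper 32 bits, whereas the unsigned constant zero-extends.

```c
#include <stdint.h>

/* Mimics the buggy (1 << 31): a signed 32-bit value with bit 31 set. */
static uint64_t vtcr_from_signed(void)
{
	int32_t res1 = (int32_t)(1U << 31);	/* value is INT32_MIN */

	/* Sign-extends: upper 32 bits become all-ones. */
	return (uint64_t)res1;
}

/* Mimics the fixed (1U << 31): an unsigned 32-bit constant. */
static uint64_t vtcr_from_unsigned(void)
{
	uint32_t res1 = 1U << 31;

	/* Zero-extends: only bit 31 is set in the 64-bit result. */
	return (uint64_t)res1;
}
```

The signed version yields 0xffffffff80000000 while the unsigned version yields 0x0000000080000000, which is exactly the difference between corrupting the upper half of VTCR and leaving it zero as required.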
[PATCH 8/8] iommu/io-pgtable-arm: Prepare for TTBR1 usage
From: Robin Murphy Now that we can correctly extract top-level indices without relying on the remaining upper bits being zero, the only remaining impediments to using a given table for TTBR1 are the address validation on map/unmap and the awkward TCR translation granule format. Add a quirk so that we can do the right thing at those points. Tested-by: Jordan Crouse Signed-off-by: Robin Murphy Signed-off-by: Will Deacon --- drivers/iommu/io-pgtable-arm.c | 25 +++-- include/linux/io-pgtable.h | 4 2 files changed, 23 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 846963c87e0f..5e966ef0bacf 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -104,6 +104,10 @@ #define ARM_LPAE_TCR_TG0_64K 1 #define ARM_LPAE_TCR_TG0_16K 2 +#define ARM_LPAE_TCR_TG1_16K 1 +#define ARM_LPAE_TCR_TG1_4K2 +#define ARM_LPAE_TCR_TG1_64K 3 + #define ARM_LPAE_TCR_SH_NS 0 #define ARM_LPAE_TCR_SH_OS 2 #define ARM_LPAE_TCR_SH_IS 3 @@ -463,6 +467,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova, arm_lpae_iopte *ptep = data->pgd; int ret, lvl = data->start_level; arm_lpae_iopte prot; + long iaext = (long)iova >> cfg->ias; /* If no access, then nothing to do */ if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE))) @@ -471,7 +476,9 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova, if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size)) return -EINVAL; - if (WARN_ON(iova >> data->iop.cfg.ias || paddr >> data->iop.cfg.oas)) + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; + if (WARN_ON(iaext || paddr >> cfg->oas)) return -ERANGE; prot = arm_lpae_prot_to_pte(data, iommu_prot); @@ -637,11 +644,14 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova, struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); struct io_pgtable_cfg *cfg = >iop.cfg; arm_lpae_iopte *ptep = data->pgd; + long iaext = (long)iova >> cfg->ias; 
if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size)) return 0; - if (WARN_ON(iova >> data->iop.cfg.ias)) + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; + if (WARN_ON(iaext)) return 0; return __arm_lpae_unmap(data, gather, iova, size, data->start_level, ptep); @@ -777,9 +787,11 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie) u64 reg; struct arm_lpae_io_pgtable *data; typeof(>arm_lpae_s1_cfg.tcr) tcr = >arm_lpae_s1_cfg.tcr; + bool tg1; if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | - IO_PGTABLE_QUIRK_NON_STRICT)) + IO_PGTABLE_QUIRK_NON_STRICT | + IO_PGTABLE_QUIRK_ARM_TTBR1)) return NULL; data = arm_lpae_alloc_pgtable(cfg); @@ -797,15 +809,16 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie) tcr->orgn = ARM_LPAE_TCR_RGN_NC; } + tg1 = cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1; switch (ARM_LPAE_GRANULE(data)) { case SZ_4K: - tcr->tg = ARM_LPAE_TCR_TG0_4K; + tcr->tg = tg1 ? ARM_LPAE_TCR_TG1_4K : ARM_LPAE_TCR_TG0_4K; break; case SZ_16K: - tcr->tg = ARM_LPAE_TCR_TG0_16K; + tcr->tg = tg1 ? ARM_LPAE_TCR_TG1_16K : ARM_LPAE_TCR_TG0_16K; break; case SZ_64K: - tcr->tg = ARM_LPAE_TCR_TG0_64K; + tcr->tg = tg1 ? ARM_LPAE_TCR_TG1_64K : ARM_LPAE_TCR_TG0_64K; break; } diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index 40c1b7745fb6..53d53c6c2be9 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -83,12 +83,16 @@ struct io_pgtable_cfg { * IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs * on unmap, for DMA domains using the flush queue mechanism for * delayed invalidation. +* +* IO_PGTABLE_QUIRK_ARM_TTBR1: (ARM LPAE format) Configure the table +* for use in the upper half of a split address space. 
*/ #define IO_PGTABLE_QUIRK_ARM_NS BIT(0) #define IO_PGTABLE_QUIRK_NO_PERMS BIT(1) #define IO_PGTABLE_QUIRK_TLBI_ON_MAPBIT(2) #define IO_PGTABLE_QUIRK_ARM_MTK_EXTBIT(3) #define IO_PGTABLE_QUIRK_NON_STRICT BIT(4) + #define IO_PGTABLE_QUIRK_ARM_TTBR1 BIT(5) unsigned long quirks; unsigned long pgsize_bitmap;
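The address validation in patch 8/8 relies on a compact arithmetic-shift trick: shifting the IOVA right by the input address size (`ias`) as a signed quantity yields 0 when the upper bits are all zero (a valid TTBR0 address) and -1 when they are all one (a valid TTBR1 address), so the TTBR1 case is handled by complementing the result. A hedged standalone sketch, with `addr_in_range` an invented name and assuming the (universal in practice, though implementation-defined) arithmetic right shift of negative values:

```c
#include <stdint.h>

/*
 * Return 1 if 'iova' is valid for the selected table, 0 otherwise.
 * For TTBR0, bits [63:ias] must all be 0; for TTBR1, all 1.
 */
static int addr_in_range(uint64_t iova, unsigned int ias, int ttbr1)
{
	/* 0 if the upper bits are all-zero, -1 if they are all-one. */
	int64_t iaext = (int64_t)iova >> ias;

	if (ttbr1)
		iaext = ~iaext;	/* fold the all-ones case onto zero */

	return iaext == 0;	/* non-zero would trigger the WARN_ON() */
}
```

With a 48-bit input size, 0x0000000000001000 is valid only for TTBR0 and 0xffff000000001000 only for TTBR1; anything with mixed upper bits fails both.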
[PATCH 3/8] iommu/io-pgtable-arm: Ensure non-cacheable mappings are Outer Shareable
The Armv7 ARM states that Normal, Non-cacheable mappings must explicitly be marked as Outer Shareable in order to avoid UNPREDICTABLE behaviour:

 | Overlaying the shareability attribute (B3-1377, ARM DDI 0406C.c)
 |
 | A memory region with a resultant memory type attribute of Normal, and
 | a resultant cacheability attribute of Inner Non-cacheable, Outer
 | Non-cacheable, must have a resultant shareability attribute of Outer
 | Shareable, otherwise shareability is UNPREDICTABLE

Although this requirement doesn't appear to exist in the Armv8 docs, where the 'SH' field is simply ignored in this case, it's straightforward enough to set it here.

Cc: Robin Murphy
Signed-off-by: Will Deacon
---
 drivers/iommu/io-pgtable-arm.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index ab440b52a5f4..4b437ead2300 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -303,9 +303,8 @@ static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
 	if (data->iop.fmt != ARM_MALI_LPAE)
 		pte |= ARM_LPAE_PTE_AF;
-	pte |= ARM_LPAE_PTE_SH_IS;
-	pte |= paddr_to_iopte(paddr, data);
 
+	pte |= paddr_to_iopte(paddr, data);
 	__arm_lpae_set_pte(ptep, pte, &data->iop.cfg);
 }
 
@@ -463,6 +462,11 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
 	if (prot & IOMMU_NOEXEC)
 		pte |= ARM_LPAE_PTE_XN;
 
+	if (prot & IOMMU_CACHE)
+		pte |= ARM_LPAE_PTE_SH_IS;
+	else
+		pte |= ARM_LPAE_PTE_SH_OS;
+
 	return pte;
 }
-- 
2.25.0.rc1.283.g88dfdc4193-goog
[PATCH 2/8] iommu/io-pgtable-arm: Support non-coherent stage-2 page tables
Commit 9e6ea59f3ff3 ("iommu/io-pgtable: Support non-coherent page tables") added support for non-coherent page-table walks to the Arm IOMMU page-table backends. Unfortunately, it left the stage-2 allocator unchanged, so let's hook that up in the same way.

Cc: Bjorn Andersson
Cc: Robin Murphy
Signed-off-by: Will Deacon
---
 drivers/iommu/io-pgtable-arm.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 7b422b9fe05b..ab440b52a5f4 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -910,10 +910,16 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 	}
 
 	/* VTCR */
-	reg = ARM_64_LPAE_S2_TCR_RES1 |
-	     (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
-	     (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
-	     (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+	reg = ARM_64_LPAE_S2_TCR_RES1;
+	if (cfg->coherent_walk) {
+		reg |= (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+	} else {
+		reg |= (ARM_LPAE_TCR_SH_OS << ARM_LPAE_TCR_SH0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT);
+	}
 
 	sl = data->start_level;
-- 
2.25.0.rc1.283.g88dfdc4193-goog
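The coherent/non-coherent walk-attribute selection above can be sketched in isolation. The shift amounts and field encodings below follow the standard VMSA TCR layout (SH0 at bits [13:12], IRGN0 at [9:8], ORGN0 at [11:10]); treat them as illustrative stand-ins for the kernel's ARM_LPAE_TCR_* constants rather than a verified copy of them.

```c
#include <stdint.h>

#define SH0_SHIFT	12
#define ORGN0_SHIFT	10
#define IRGN0_SHIFT	8
#define SH_IS		3ULL	/* Inner Shareable */
#define SH_OS		2ULL	/* Outer Shareable */
#define RGN_WBWA	1ULL	/* Write-Back, Write-Allocate */
#define RGN_NC		0ULL	/* Non-Cacheable */

/*
 * Compose the stage-2 walk attributes: cacheable Inner Shareable walks
 * when the walker is coherent, otherwise Non-Cacheable walks which must
 * then be Outer Shareable (matching patch 3/8's reasoning).
 */
static uint64_t s2_walk_attrs(int coherent_walk)
{
	if (coherent_walk)
		return (SH_IS << SH0_SHIFT) |
		       (RGN_WBWA << IRGN0_SHIFT) |
		       (RGN_WBWA << ORGN0_SHIFT);

	return (SH_OS << SH0_SHIFT) |
	       (RGN_NC << IRGN0_SHIFT) |
	       (RGN_NC << ORGN0_SHIFT);
}
```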
[PATCH 6/8] iommu/arm-smmu: Rename public #defines under ARM_SMMU_ namespace
Now that we have arm-smmu.h defining various SMMU constants, ensure that they are namespaced with the ARM_SMMU_ prefix in order to avoid conflicts with the CPU, such as the one we're currently bodging around with the TCR. Cc: Robin Murphy Signed-off-by: Will Deacon --- drivers/iommu/arm-smmu-impl.c | 2 +- drivers/iommu/arm-smmu.c | 157 drivers/iommu/arm-smmu.h | 217 +- drivers/iommu/qcom_iommu.c| 16 +-- 4 files changed, 204 insertions(+), 188 deletions(-) diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c index b2fe72a8f019..74d97a886e93 100644 --- a/drivers/iommu/arm-smmu-impl.c +++ b/drivers/iommu/arm-smmu-impl.c @@ -119,7 +119,7 @@ int arm_mmu500_reset(struct arm_smmu_device *smmu) * Secure has also cleared SACR.CACHE_LOCK for this to take effect... */ reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID7); - major = FIELD_GET(ID7_MAJOR, reg); + major = FIELD_GET(ARM_SMMU_ID7_MAJOR, reg); reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sACR); if (major >= 2) reg &= ~ARM_MMU500_ACR_CACHE_LOCK; diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index e65eb60812ec..214be09f6ded 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -271,7 +271,7 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int page, for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) { for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) { reg = arm_smmu_readl(smmu, page, status); - if (!(reg & sTLBGSTATUS_GSACTIVE)) + if (!(reg & ARM_SMMU_sTLBGSTATUS_GSACTIVE)) return; cpu_relax(); } @@ -478,7 +478,7 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev) int idx = smmu_domain->cfg.cbndx; fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR); - if (!(fsr & FSR_FAULT)) + if (!(fsr & ARM_SMMU_FSR_FAULT)) return IRQ_NONE; fsynr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSYNR0); @@ -510,7 +510,7 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev) if (__ratelimit()) { if 
(IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) && - (gfsr & sGFSR_USF)) + (gfsr & ARM_SMMU_sGFSR_USF)) dev_err(smmu->dev, "Blocked unknown Stream ID 0x%hx; boot with \"arm-smmu.disable_bypass=0\" to allow, but this may have security implications\n", (u16)gfsynr1); @@ -543,9 +543,9 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain, cb->tcr[0] = arm_smmu_lpae_tcr(pgtbl_cfg); cb->tcr[1] = arm_smmu_lpae_tcr2(pgtbl_cfg); if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64) - cb->tcr[1] |= TCR2_AS; + cb->tcr[1] |= ARM_SMMU_TCR2_AS; else - cb->tcr[0] |= TCR_EAE; + cb->tcr[0] |= ARM_SMMU_TCR_EAE; } } else { cb->tcr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vtcr; @@ -558,8 +558,10 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain, cb->ttbr[1] = 0; } else { cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr; - cb->ttbr[0] |= FIELD_PREP(TTBRn_ASID, cfg->asid); - cb->ttbr[1] = FIELD_PREP(TTBRn_ASID, cfg->asid); + cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, + cfg->asid); + cb->ttbr[1] = FIELD_PREP(ARM_SMMU_TTBRn_ASID, +cfg->asid); } } else { cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr; @@ -595,31 +597,33 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx) /* CBA2R */ if (smmu->version > ARM_SMMU_V1) { if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64) - reg = CBA2R_VA64; + reg = ARM_SMMU_CBA2R_VA64; else reg = 0; /* 16-bit VMIDs live in CBA2R */ if (smmu->features & ARM_SMMU_FEAT_VMID16) - reg |= FIELD_PREP(CBA2R_VMID16, cfg->vmid); + reg |= FIELD_PREP(ARM_SMMU_CBA2R_VMID16, cfg->vmid); arm_smmu_gr1_write(smmu, ARM_SMMU_GR1_CBA2R(idx), reg); } /* CBAR */ - reg = FIELD_PREP(CBAR_TYPE, cfg->cbar); + reg = FIELD_PREP(ARM_SMMU_CBAR_TYPE, cfg->cbar);
[PATCH 0/8] Finish off the split page table prep work
Hi all,

Last merge window, I merged most of the split page table prep work from Robin [1], but there were a few patches left pending some rework. I think Robin was hoping to get that done for 5.5, but what with the holidays falling like they did and other commitments, I've ended up picking up the bits that were left over.

I'm pretty limited with regards to SMMU hardware on which I can test this lot, so I might have broken something.

Applies against for-joerg/arm-smmu/updates.

Will

[1] https://lore.kernel.org/lkml/20191104202012.GN24909@willie-the-truck

Cc: Robin Murphy
Cc: Jordan Crouse

--->8

Robin Murphy (3):
  iommu/io-pgtable-arm: Rationalise TTBRn handling
  iommu/io-pgtable-arm: Rationalise TCR handling
  iommu/io-pgtable-arm: Prepare for TTBR1 usage

Will Deacon (5):
  iommu/io-pgtable-arm: Support non-coherent stage-2 page tables
  iommu/io-pgtable-arm: Ensure non-cacheable mappings are Outer Shareable
  iommu/io-pgtable-arm: Ensure ARM_64_LPAE_S2_TCR_RES1 is unsigned
  iommu/arm-smmu: Rename public #defines under ARM_SMMU_ namespace
  iommu/io-pgtable-arm: Rationalise VTCR handling

 drivers/iommu/arm-smmu-impl.c      |   2 +-
 drivers/iommu/arm-smmu-v3.c        |  60
 drivers/iommu/arm-smmu.c           | 171 --
 drivers/iommu/arm-smmu.h           | 228 ++---
 drivers/iommu/io-pgtable-arm-v7s.c |  23 ++-
 drivers/iommu/io-pgtable-arm.c     | 155 +---
 drivers/iommu/io-pgtable.c         |   2 +-
 drivers/iommu/ipmmu-vmsa.c         |   2 +-
 drivers/iommu/msm_iommu.c          |   4 +-
 drivers/iommu/mtk_iommu.c          |   4 +-
 drivers/iommu/qcom_iommu.c         |  25 ++--
 include/linux/io-pgtable.h         |  27 +++-
 12 files changed, 381 insertions(+), 322 deletions(-)

-- 
2.25.0.rc1.283.g88dfdc4193-goog
[PATCH 5/8] iommu/io-pgtable-arm: Rationalise TCR handling
From: Robin Murphy

Although it's conceptually nice for the io_pgtable_cfg to provide a
standard VMSA TCR value, the reality is that no VMSA-compliant IOMMU
looks exactly like an Arm CPU, and they all have various other TCR
controls which io-pgtable can't be expected to understand. Thus since
there is an expectation that drivers will have to add to the given TCR
value anyway, let's strip it down to just the essentials that are
directly relevant to io-pgtable's inner workings - namely the various
sizes and the walk attributes.

Tested-by: Jordan Crouse
Signed-off-by: Robin Murphy
[will: Add missing include of bitfield.h]
Signed-off-by: Will Deacon
---
 drivers/iommu/arm-smmu-v3.c        | 41 +++--
 drivers/iommu/arm-smmu.c           |  7 ++-
 drivers/iommu/arm-smmu.h           | 28 +
 drivers/iommu/io-pgtable-arm-v7s.c |  6 +-
 drivers/iommu/io-pgtable-arm.c     | 98 --
 drivers/iommu/io-pgtable.c         |  2 +-
 drivers/iommu/qcom_iommu.c         |  8 +--
 include/linux/io-pgtable.h         |  9 ++-
 8 files changed, 95 insertions(+), 104 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index cf2ae065a6c2..d127974afdb7 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -260,27 +260,18 @@
 /* Context descriptor (stage-1 only) */
 #define CTXDESC_CD_DWORDS		8
 #define CTXDESC_CD_0_TCR_T0SZ		GENMASK_ULL(5, 0)
-#define ARM64_TCR_T0SZ			GENMASK_ULL(5, 0)
 #define CTXDESC_CD_0_TCR_TG0		GENMASK_ULL(7, 6)
-#define ARM64_TCR_TG0			GENMASK_ULL(15, 14)
 #define CTXDESC_CD_0_TCR_IRGN0		GENMASK_ULL(9, 8)
-#define ARM64_TCR_IRGN0			GENMASK_ULL(9, 8)
 #define CTXDESC_CD_0_TCR_ORGN0		GENMASK_ULL(11, 10)
-#define ARM64_TCR_ORGN0			GENMASK_ULL(11, 10)
 #define CTXDESC_CD_0_TCR_SH0		GENMASK_ULL(13, 12)
-#define ARM64_TCR_SH0			GENMASK_ULL(13, 12)
 #define CTXDESC_CD_0_TCR_EPD0		(1ULL << 14)
-#define ARM64_TCR_EPD0			(1ULL << 7)
 #define CTXDESC_CD_0_TCR_EPD1		(1ULL << 30)
-#define ARM64_TCR_EPD1			(1ULL << 23)

 #define CTXDESC_CD_0_ENDI		(1UL << 15)
 #define CTXDESC_CD_0_V			(1UL << 31)

 #define CTXDESC_CD_0_TCR_IPS		GENMASK_ULL(34, 32)
-#define ARM64_TCR_IPS			GENMASK_ULL(34, 32)
 #define CTXDESC_CD_0_TCR_TBI0		(1ULL << 38)
-#define ARM64_TCR_TBI0			(1ULL << 37)

 #define CTXDESC_CD_0_AA64		(1UL << 41)
 #define CTXDESC_CD_0_S			(1UL << 44)
@@ -291,10 +282,6 @@
 #define CTXDESC_CD_1_TTB0_MASK		GENMASK_ULL(51, 4)

-/* Convert between AArch64 (CPU) TCR format and SMMU CD format */
-#define ARM_SMMU_TCR2CD(tcr, fld)	FIELD_PREP(CTXDESC_CD_0_TCR_##fld, \
-					FIELD_GET(ARM64_TCR_##fld, tcr))
-
 /* Command queue */
 #define CMDQ_ENT_SZ_SHIFT		4
 #define CMDQ_ENT_DWORDS			((1 << CMDQ_ENT_SZ_SHIFT) >> 3)
@@ -1439,23 +1426,6 @@ static int arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu)
 }

 /* Context descriptor manipulation functions */
-static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr)
-{
-	u64 val = 0;
-
-	/* Repack the TCR. Just care about TTBR0 for now */
-	val |= ARM_SMMU_TCR2CD(tcr, T0SZ);
-	val |= ARM_SMMU_TCR2CD(tcr, TG0);
-	val |= ARM_SMMU_TCR2CD(tcr, IRGN0);
-	val |= ARM_SMMU_TCR2CD(tcr, ORGN0);
-	val |= ARM_SMMU_TCR2CD(tcr, SH0);
-	val |= ARM_SMMU_TCR2CD(tcr, EPD0);
-	val |= ARM_SMMU_TCR2CD(tcr, EPD1);
-	val |= ARM_SMMU_TCR2CD(tcr, IPS);
-
-	return val;
-}
-
 static void arm_smmu_write_ctx_desc(struct arm_smmu_device *smmu,
 				    struct arm_smmu_s1_cfg *cfg)
 {
@@ -1465,7 +1435,7 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_device *smmu,
 	 * We don't need to issue any invalidation here, as we'll invalidate
 	 * the STE when installing the new entry anyway.
 	 */
-	val = arm_smmu_cpu_tcr_to_cd(cfg->cd.tcr) |
+	val = cfg->cd.tcr |
 #ifdef __BIG_ENDIAN
 	      CTXDESC_CD_0_ENDI |
 #endif
@@ -2151,6 +2121,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 	int asid;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg;
+	typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr;

 	asid = arm_smmu_bitmap_alloc(smmu->asid_map, smmu->asid_bits);
 	if (asid < 0)
@@ -2167,7 +2138,13 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 	cfg->cd.asid	= (u16)asid;
 	cfg->cd.ttbr	= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
-	cfg->cd.tcr	= pgtbl_cfg->arm_lpae_s1_cfg.tcr;
+	cfg->cd.tcr	= FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) |
+
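The repacking this patch removes from arm_smmu_cpu_tcr_to_cd() and hands to the drivers can be seen in miniature below. This is only a sketch, not the kernel code: the struct, names and shift constants are simplified stand-ins (field positions follow the SMMUv3 CD layout quoted in the diff), and the kernel's FIELD_PREP() is replaced by plain shifts so the snippet is self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the CTXDESC_CD_0_TCR_* field positions. */
#define CD_TCR_T0SZ_SHIFT	0	/* GENMASK_ULL(5, 0)   */
#define CD_TCR_TG0_SHIFT	6	/* GENMASK_ULL(7, 6)   */
#define CD_TCR_IRGN0_SHIFT	8	/* GENMASK_ULL(9, 8)   */
#define CD_TCR_ORGN0_SHIFT	10	/* GENMASK_ULL(11, 10) */
#define CD_TCR_SH0_SHIFT	12	/* GENMASK_ULL(13, 12) */
#define CD_TCR_IPS_SHIFT	32	/* GENMASK_ULL(34, 32) */

/* What io-pgtable now hands back: raw sizes and walk attributes,
 * rather than a pre-packed CPU-format TCR. */
struct io_pgtable_tcr {
	uint32_t tsz, tg, irgn, orgn, sh, ips;
};

/* Each driver packs the raw fields into its own register layout. */
static uint64_t pack_cd_tcr(const struct io_pgtable_tcr *t)
{
	return ((uint64_t)t->tsz  << CD_TCR_T0SZ_SHIFT)  |
	       ((uint64_t)t->tg   << CD_TCR_TG0_SHIFT)   |
	       ((uint64_t)t->irgn << CD_TCR_IRGN0_SHIFT) |
	       ((uint64_t)t->orgn << CD_TCR_ORGN0_SHIFT) |
	       ((uint64_t)t->sh   << CD_TCR_SH0_SHIFT)   |
	       ((uint64_t)t->ips  << CD_TCR_IPS_SHIFT);
}
```

The win over the old ARM_SMMU_TCR2CD() round trip is that no driver needs to know the CPU TCR layout at all, only its own.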
[PATCH 1/8] iommu/io-pgtable-arm: Rationalise TTBRn handling
From: Robin Murphy

TTBR1 values have so far been redundant since no users implement any
support for split address spaces. Crucially, though, one of the main
reasons for wanting to do so is to be able to manage each half entirely
independently, e.g. context-switching one set of mappings without
disturbing the other. Thus it seems unlikely that tying two tables
together in a single io_pgtable_cfg would ever be particularly desirable
or useful.

Streamline the configs to just a single conceptual TTBR value
representing the allocated table. This paves the way for future users to
support split address spaces by simply allocating a table and dealing
with the detailed TTBRn logistics themselves.

Tested-by: Jordan Crouse
Signed-off-by: Robin Murphy
[will: Keep v7s non-coherent mappings explicitly outer-shareable]
Signed-off-by: Will Deacon
---
 drivers/iommu/arm-smmu-v3.c        |  2 +-
 drivers/iommu/arm-smmu.c           |  9 -
 drivers/iommu/io-pgtable-arm-v7s.c | 17 -
 drivers/iommu/io-pgtable-arm.c     |  5 ++---
 drivers/iommu/ipmmu-vmsa.c         |  2 +-
 drivers/iommu/msm_iommu.c          |  4 ++--
 drivers/iommu/mtk_iommu.c          |  4 ++--
 drivers/iommu/qcom_iommu.c         |  3 +--
 include/linux/io-pgtable.h         |  4 ++--
 9 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index aa7e53023585..cf2ae065a6c2 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2166,7 +2166,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 	}

 	cfg->cd.asid	= (u16)asid;
-	cfg->cd.ttbr	= pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+	cfg->cd.ttbr	= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
 	cfg->cd.tcr	= pgtbl_cfg->arm_lpae_s1_cfg.tcr;
 	cfg->cd.mair	= pgtbl_cfg->arm_lpae_s1_cfg.mair;
 	return 0;
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 46b87740d708..72640e045268 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -553,13 +553,12 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 	/* TTBRs */
 	if (stage1) {
 		if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
-			cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr[0];
-			cb->ttbr[1] = pgtbl_cfg->arm_v7s_cfg.ttbr[1];
+			cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr;
+			cb->ttbr[1] = 0;
 		} else {
-			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
 			cb->ttbr[0] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
-			cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
-			cb->ttbr[1] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
+			cb->ttbr[1] = FIELD_PREP(TTBRn_ASID, cfg->asid);
 		}
 	} else {
 		cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 7c3bd2c3cdca..714cb406b0a8 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -822,15 +822,14 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
 	/* Ensure the empty pgd is visible before any actual TTBR write */
 	wmb();

-	/* TTBRs */
-	cfg->arm_v7s_cfg.ttbr[0] = virt_to_phys(data->pgd) |
-				   ARM_V7S_TTBR_S | ARM_V7S_TTBR_NOS |
-				   (cfg->coherent_walk ?
-				   (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) |
-				    ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_WBWA)) :
-				   (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_NC) |
-				    ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_NC)));
-	cfg->arm_v7s_cfg.ttbr[1] = 0;
+	/* TTBR */
+	cfg->arm_v7s_cfg.ttbr = virt_to_phys(data->pgd) |
+				ARM_V7S_TTBR_S | ARM_V7S_TTBR_NOS |
+				(cfg->coherent_walk ?
+				 (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) |
+				  ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_WBWA)) :
+				 (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_NC) |
+				  ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_NC)));

 	return &data->iop;

 out_free_data:
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index bdf47f745268..7b422b9fe05b 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -872,9 +872,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 	/* Ensure the empty pgd is visible before any actual TTBR write */
 	wmb();

-	/* TTBRs */
[PATCH] iommu/arm-smmu: Improve SMR mask test
Make the SMR mask test more robust against SMR0 being live at probe
time, which might happen once we start supporting firmware reservations
for framebuffers and suchlike.

Signed-off-by: Robin Murphy
---
 drivers/iommu/arm-smmu.c | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 4f1a350d9529..df6490bc7700 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -946,23 +946,36 @@ static void arm_smmu_write_sme(struct arm_smmu_device *smmu, int idx)
 static void arm_smmu_test_smr_masks(struct arm_smmu_device *smmu)
 {
 	u32 smr;
+	int i;

 	if (!smmu->smrs)
 		return;
-
+	/*
+	 * If we've had to accommodate firmware memory regions, we may
+	 * have live SMRs by now; tread carefully...
+	 *
+	 * Somewhat perversely, not having a free SMR for this test implies we
+	 * can get away without it anyway, as we'll only be able to 'allocate'
+	 * these SMRs for the ID/mask values we're already trusting to be OK.
+	 */
+	for (i = 0; i < smmu->num_mapping_groups; i++)
+		if (!smmu->smrs[i].valid)
+			goto smr_ok;
+	return;
+smr_ok:
 	/*
 	 * SMR.ID bits may not be preserved if the corresponding MASK
 	 * bits are set, so check each one separately. We can reject
 	 * masters later if they try to claim IDs outside these masks.
 	 */
 	smr = FIELD_PREP(SMR_ID, smmu->streamid_mask);
-	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_SMR(0), smr);
-	smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(0));
+	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_SMR(i), smr);
+	smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
 	smmu->streamid_mask = FIELD_GET(SMR_ID, smr);

 	smr = FIELD_PREP(SMR_MASK, smmu->streamid_mask);
-	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_SMR(0), smr);
-	smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(0));
+	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_SMR(i), smr);
+	smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
 	smmu->smr_mask_mask = FIELD_GET(SMR_MASK, smr);
 }

-- 
2.23.0.dirty
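The probing trick this function is built on — write all-ones to a field, read it back, and trust only the bits that stick — can be spelled out with a toy model. The fake_smr register below is an invented stand-in for SMR(i); the pretend hardware implements only 10 of the 15 architectural Stream ID bits.

```c
#include <assert.h>
#include <stdint.h>

#define SMR_ID_MAX	0x7fffu		/* 15-bit architectural maximum */
#define HW_ID_BITS	0x03ffu		/* bits this pretend SMMU wires up */

static uint32_t fake_smr;		/* stand-in for SMR(i) */

/* The "hardware" silently drops writes to unimplemented bits. */
static void smr_write(uint32_t v)	{ fake_smr = v & HW_ID_BITS; }
static uint32_t smr_read(void)		{ return fake_smr; }

/* Write the maximum ID value; whatever survives the round trip is
 * exactly the set of bits the hardware implements, i.e. the usable
 * Stream ID mask. */
static uint32_t probe_streamid_mask(void)
{
	smr_write(SMR_ID_MAX);
	return smr_read();
}
```

This is also why the patch matters: the probe scribbles on the register it tests, so it has to pick an SMR that is not already carrying a live firmware mapping.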
[PATCH] iommu/io-pgtable-arm: Improve attribute handling
By VMSA rules, using Normal Non-Cacheable type with a shareability
attribute of anything other than Outer Shareable is liable to lead into
unpredictable territory. Although the SMMU architectures seem to give
some slightly stronger guarantees of Non-Cacheable output types becoming
implicitly Outer Shareable in most cases, we may as well be explicit and
not take any chances.

It's also weird that LPAE attribute handling is currently split between
prot_to_pte() and init_pte() given that it can all be statically
determined up-front. Thus, collect *all* the LPAE attributes into
prot_to_pte() in order to logically pick the shareability based on the
incoming IOMMU API prot value, and tweak the short-descriptor code to
stop setting TTBR0.NOS for Non-Cacheable walks.

Signed-off-by: Robin Murphy
---
 drivers/iommu/io-pgtable-arm-v7s.c |  7 +++
 drivers/iommu/io-pgtable-arm.c     | 17 +++--
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 7c3bd2c3cdca..7d6a8622f2e6 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -823,10 +823,9 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
 	wmb();

 	/* TTBRs */
-	cfg->arm_v7s_cfg.ttbr[0] = virt_to_phys(data->pgd) |
-				   ARM_V7S_TTBR_S | ARM_V7S_TTBR_NOS |
-				   (cfg->coherent_walk ?
-				   (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) |
+	cfg->arm_v7s_cfg.ttbr[0] = virt_to_phys(data->pgd) | ARM_V7S_TTBR_S |
+				   (cfg->coherent_walk ? (ARM_V7S_TTBR_NOS |
+				   ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) |
 				    ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_WBWA)) :
 				   (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_NC) |
 				    ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_NC)));
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index bdf47f745268..b5025d732f5e 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -293,17 +293,11 @@ static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
 {
 	arm_lpae_iopte pte = prot;

-	if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_NS)
-		pte |= ARM_LPAE_PTE_NS;
-
 	if (data->iop.fmt != ARM_MALI_LPAE && lvl == ARM_LPAE_MAX_LEVELS - 1)
 		pte |= ARM_LPAE_PTE_TYPE_PAGE;
 	else
 		pte |= ARM_LPAE_PTE_TYPE_BLOCK;

-	if (data->iop.fmt != ARM_MALI_LPAE)
-		pte |= ARM_LPAE_PTE_AF;
-	pte |= ARM_LPAE_PTE_SH_IS;
 	pte |= paddr_to_iopte(paddr, data);

 	__arm_lpae_set_pte(ptep, pte, &data->iop.cfg);
@@ -460,9 +454,20 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
 			<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
 	}

+	if (prot & IOMMU_CACHE)
+		pte |= ARM_LPAE_PTE_SH_IS;
+	else
+		pte |= ARM_LPAE_PTE_SH_OS;
+
 	if (prot & IOMMU_NOEXEC)
 		pte |= ARM_LPAE_PTE_XN;

+	if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_NS)
+		pte |= ARM_LPAE_PTE_NS;
+
+	if (data->iop.fmt != ARM_MALI_LPAE)
+		pte |= ARM_LPAE_PTE_AF;
+
 	return pte;
 }

-- 
2.23.0.dirty
Re: [PATCH v2 00/10] iommu/io-pgtable: Cleanup and prep for split tables
On Mon, Nov 04, 2019 at 08:20:12PM +0000, Will Deacon wrote:
> On Mon, Nov 04, 2019 at 07:22:28PM +0000, Will Deacon wrote:
> > On Fri, Oct 25, 2019 at 07:08:29PM +0100, Robin Murphy wrote:
> > > Since the flawed first attempt, I've reworked things with an abstracted
> > > TCR and an explicit TTBR1 quirk. I originally envisaged the need to pass
> > > the quirk all the way down to the TLBI calls, hence getting diverted
> > > into trying to make the parameter passing less cluttered in general, but
> > > in the end it turned out fairly neat to just fix the indexing such that
> > > we can always just pass around the original unmodified IOVA. Most of the
> > > new patches come from staring at that indexing code for long enough to
> > > see the subtle inefficiencies that were worth ironing out, plus a bit of
> > > random cleanup which doesn't feel worth posting separately.
> > >
> > > Note that these patches depend on the fixes already queued in -rc4,
> > > otherwise there will be conflicts in arm_mali_lpae_alloc_pgtable().
> > >
> > > Robin.
> > >
> > >
> > > Robin Murphy (10):
> > >   iommu/io-pgtable: Make selftest gubbins consistently __init
> > >   iommu/io-pgtable-arm: Rationalise size check
> > >   iommu/io-pgtable-arm: Simplify bounds checks
> > >   iommu/io-pgtable-arm: Simplify start level lookup
> > >   iommu/io-pgtable-arm: Simplify PGD size handling
> > >   iommu/io-pgtable-arm: Simplify level indexing
> > >   iommu/io-pgtable-arm: Rationalise MAIR handling
> > >   iommu/io-pgtable-arm: Rationalise TTBRn handling
> > >   iommu/io-pgtable-arm: Rationalise TCR handling
> > >   iommu/io-pgtable-arm: Prepare for TTBR1 usage
> >
> > Overall, this looks really good to me. There's a bit more work to do
> > (see my comments) and I'd like Jordan to have a look as well, but on the
> > whole it's a big improvement. Thanks.
>
> > Also, I've merged the first 7 patches to save you having to repost those:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=for-joerg/arm-smmu/updates

I've now picked up the remaining three patches, but I'll post them to the
list shortly because I've ended up trying to address my own review
comments as I'd like this stuff in before we go ahead with Jordan's
patches.

Will
Re: [PATCH v10 2/4] uacce: add uacce driver
On 2020/1/10 6:10 PM, Jonathan Cameron wrote:

On Fri, 10 Jan 2020 14:55:39 +0800 "zhangfei@foxmail.com" wrote:

On 2020/1/10 1:38 AM, Jonathan Cameron wrote:

On Mon, 16 Dec 2019 11:08:15 +0800 Zhangfei Gao wrote:

From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
provide Shared Virtual Addressing (SVA) between accelerators and processes.
So accelerator can access any data structure of the main cpu.
This differs from the data sharing between cpu and io device, which share
only data content rather than address.
Since unified address, hardware and user space of process can share the
same virtual address in the communication.

Uacce create a chrdev for every registration, the queue is allocated to
the process when the chrdev is opened. Then the process can access the
hardware resource by interact with the queue file. By mmap the queue
file space to user space, the process can directly put requests to the
hardware without syscall to the kernel space.

The IOMMU core only tracks mm<->device bonds at the moment, because it
only needs to handle IOTLB invalidation and PASID table entries. However
uacce needs a finer granularity since multiple queues from the same
device can be bound to an mm. When the mm exits, all bound queues must
be stopped so that the IOMMU can safely clear the PASID table entry and
reallocate the PASID. An intermediate struct uacce_mm links uacce
devices and queues. Note that an mm may be bound to multiple devices
but an uacce_mm structure only ever belongs to a single device, because
we don't need anything more complex (if multiple devices are bound to
one mm, then we'll create one uacce_mm for each bond).

uacce_device --+-- uacce_mm --+-- uacce_queue
               |              '-- uacce_queue
               |
               '-- uacce_mm --+-- uacce_queue
                              +-- uacce_queue
                              '-- uacce_queue

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Jean-Philippe Brucker
Signed-off-by: Zhangfei Gao

Hi,

Two small things I'd missed previously. Fix those and for what it's worth

Reviewed-by: Jonathan Cameron

Thanks Jonathan

---
 Documentation/ABI/testing/sysfs-driver-uacce |  37 ++
 drivers/misc/Kconfig                         |   1 +
 drivers/misc/Makefile                        |   1 +
 drivers/misc/uacce/Kconfig                   |  13 +
 drivers/misc/uacce/Makefile                  |   2 +
 drivers/misc/uacce/uacce.c                   | 628 +++
 include/linux/uacce.h                        | 161 +++
 include/uapi/misc/uacce/uacce.h              |  38 ++
 8 files changed, 881 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce/uacce.h

...

+
+What:           /sys/class/uacce/<dev_name>/available_instances
+Date:           Dec 2019
+KernelVersion:  5.6
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Available instances left of the device
+                Return -ENODEV if uacce_ops get_available_instances is not provided
+

See below. It doesn't "return" it prints it currently.

...

+static ssize_t available_instances_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct uacce_device *uacce = to_uacce_device(dev);
+	int val = -ENODEV;
+
+	if (uacce->ops->get_available_instances)
+		val = uacce->ops->get_available_instances(uacce);
+
+	return sprintf(buf, "%d\n", val);

It's unusual to pass an error value back as a string. I'd expect some logic like..

	if (val < 0)
		return val;

	return sprintf(buf, "%d\n", val);

Note this is the documented behavior "returns -ENODEV".
If return -ENODEV,

cat /sys/class/uacce/hisi_zip-0/available_instances
cat: /sys/class/uacce/hisi_zip-0/available_instances: No such device

I think print "unknown" maybe better, like cpufreq.c

	if (uacce->ops->get_available_instances)
		return sprintf(buf, "%d\n",
			       uacce->ops->get_available_instances(uacce));
	return sprintf(buf, "unknown\n");

From userspace code point a simple error code return is better than a
'magic' string in the file. You'll find people just try to read an
integer without checking for unknown and hence get a very odd result.
Much better to throw them an error code.

OK, understand, thanks
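The userspace argument above can be made concrete. Below is a hypothetical reader (the file name and helper are invented for this sketch): with an errno-based attribute it gets a clean error code back, whereas a magic "unknown" string only works for callers that already know the convention — everyone else just gets a failed integer parse.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Returns the instance count, or a negative errno: -ENOENT/-ENODEV if
 * the attribute is missing, -EINVAL if it holds something non-numeric
 * (such as a magic "unknown" string). */
static long read_available_instances(const char *path)
{
	FILE *f = fopen(path, "r");
	long val;

	if (!f)
		return -errno;		/* kernel-reported error, verbatim */
	if (fscanf(f, "%ld", &val) != 1)
		val = -EINVAL;		/* non-numeric content in the file */
	fclose(f);
	return val;
}
```

Either way the caller sees a negative errno; the difference is that with the string convention the -EINVAL is manufactured in userspace rather than reported by the kernel.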
Re: [PATCH 1/4] memory: tegra: Correct reset value of xusb_hostr
On Thu, Dec 19, 2019 at 04:29:11PM -0800, Nicolin Chen wrote:
> According to Tegra X1 (Tegra210) TRM, the reset value of xusb_hostr
> field (bit [7:0]) should be 0x7a. So this patch simply corrects it.
>
> Signed-off-by: Nicolin Chen
> ---
>  drivers/memory/tegra/tegra210.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied to for-5.6/memory, thanks.

Thierry
Re: [PATCH v10 0/4] Add uacce module for Accelerator
On 2020/1/10 6:08 PM, Jonathan Cameron wrote:

On Fri, 10 Jan 2020 15:03:25 +0800 zhangfei wrote:

On 2020/1/10 1:49 AM, Jonathan Cameron wrote:

On Mon, 16 Dec 2019 11:08:13 +0800 Zhangfei Gao wrote:

Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
provide Shared Virtual Addressing (SVA) between accelerators and processes.
So accelerator can access any data structure of the main cpu.
This differs from the data sharing between cpu and io device, which share
data content rather than address.
Because of unified address, hardware and user space of process can share
the same virtual address in the communication.

Uacce is intended to be used with Jean Philippe Brucker's SVA
patchset[1], which enables IO side page fault and PASID support.
We have keep verifying with Jean's sva patchset [2]
We also keep verifying with Eric's SMMUv3 Nested Stage patches [3]

Hi Zhangfei Gao,

Just to check my understanding...

This patch set is not dependent on either 2 or 3?

To use it on our hardware, we need 2, but the interfaces used are already
upstream, so this could move forwards in parallel.

Yes,
patch 1, 2 is for uacce.
patch 3, 4 is an example using uacce, which happen to be crypto.

Sorry, I wasn't clear enough. Question is whether we need Jean's sva
patch set [2] to merge this?

Since the sva api are already merged, the patchset can be built without
problem, so no dependency. Though in order to make sva work, we need
Jean's sva patches together, which are in review:
https://lore.kernel.org/linux-iommu/20191219163033.2608177-1-jean-phili...@linaro.org/
And with this patchset, we can also more easily verify sva.
So I think they can go independently.

Thanks
Re: [PATCH v10 2/4] uacce: add uacce driver
On Fri, 10 Jan 2020 14:55:39 +0800
"zhangfei@foxmail.com" wrote:

> On 2020/1/10 1:38 AM, Jonathan Cameron wrote:
> > On Mon, 16 Dec 2019 11:08:15 +0800
> > Zhangfei Gao wrote:
> >
> >> From: Kenneth Lee
> >>
> >> Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
> >> provide Shared Virtual Addressing (SVA) between accelerators and processes.
> >> So accelerator can access any data structure of the main cpu.
> >> This differs from the data sharing between cpu and io device, which share
> >> only data content rather than address.
> >> Since unified address, hardware and user space of process can share the
> >> same virtual address in the communication.
> >>
> >> Uacce create a chrdev for every registration, the queue is allocated to
> >> the process when the chrdev is opened. Then the process can access the
> >> hardware resource by interact with the queue file. By mmap the queue
> >> file space to user space, the process can directly put requests to the
> >> hardware without syscall to the kernel space.
> >>
> >> The IOMMU core only tracks mm<->device bonds at the moment, because it
> >> only needs to handle IOTLB invalidation and PASID table entries. However
> >> uacce needs a finer granularity since multiple queues from the same
> >> device can be bound to an mm. When the mm exits, all bound queues must
> >> be stopped so that the IOMMU can safely clear the PASID table entry and
> >> reallocate the PASID.
> >>
> >> An intermediate struct uacce_mm links uacce devices and queues.
> >> Note that an mm may be bound to multiple devices but an uacce_mm
> >> structure only ever belongs to a single device, because we don't need
> >> anything more complex (if multiple devices are bound to one mm, then
> >> we'll create one uacce_mm for each bond).
> >>
> >> uacce_device --+-- uacce_mm --+-- uacce_queue
> >>                |              '-- uacce_queue
> >>                |
> >>                '-- uacce_mm --+-- uacce_queue
> >>                               +-- uacce_queue
> >>                               '-- uacce_queue
> >>
> >> Signed-off-by: Kenneth Lee
> >> Signed-off-by: Zaibo Xu
> >> Signed-off-by: Zhou Wang
> >> Signed-off-by: Jean-Philippe Brucker
> >> Signed-off-by: Zhangfei Gao
> > Hi,
> >
> > Two small things I'd missed previously. Fix those and for
> > what it's worth
> >
> > Reviewed-by: Jonathan Cameron
> Thanks Jonathan
> >
> >> ---
> >>  Documentation/ABI/testing/sysfs-driver-uacce |  37 ++
> >>  drivers/misc/Kconfig                         |   1 +
> >>  drivers/misc/Makefile                        |   1 +
> >>  drivers/misc/uacce/Kconfig                   |  13 +
> >>  drivers/misc/uacce/Makefile                  |   2 +
> >>  drivers/misc/uacce/uacce.c                   | 628 +++
> >>  include/linux/uacce.h                        | 161 +++
> >>  include/uapi/misc/uacce/uacce.h              |  38 ++
> >>  8 files changed, 881 insertions(+)
> >>  create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
> >>  create mode 100644 drivers/misc/uacce/Kconfig
> >>  create mode 100644 drivers/misc/uacce/Makefile
> >>  create mode 100644 drivers/misc/uacce/uacce.c
> >>  create mode 100644 include/linux/uacce.h
> >>  create mode 100644 include/uapi/misc/uacce/uacce.h
> >>
> > ...
> >> +
> >> +What:           /sys/class/uacce/<dev_name>/available_instances
> >> +Date:           Dec 2019
> >> +KernelVersion:  5.6
> >> +Contact:        linux-accelerat...@lists.ozlabs.org
> >> +Description:    Available instances left of the device
> >> +                Return -ENODEV if uacce_ops get_available_instances is
> >> not provided
> >> +
> > See below. It doesn't "return" it prints it currently.
> Will update to
> 'unknown' if uacce_ops get_available_instances is not provided
>
> > ...
> >
> >> +static int uacce_fops_mmap(struct file *filep, struct vm_area_struct *vma)
> >> +{
> >> +	struct uacce_queue *q = filep->private_data;
> >> +	struct uacce_device *uacce = q->uacce;
> >> +	struct uacce_qfile_region *qfr;
> >> +	enum uacce_qfrt type = UACCE_MAX_REGION;
> >> +	int ret = 0;
> >> +
> >> +	if (vma->vm_pgoff < UACCE_MAX_REGION)
> >> +		type = vma->vm_pgoff;
> >> +	else
> >> +		return -EINVAL;
> >> +
> >> +	qfr = kzalloc(sizeof(*qfr), GFP_KERNEL);
> >> +	if (!qfr)
> >> +		return -ENOMEM;
> >> +
> >> +	vma->vm_flags |= VM_DONTCOPY | VM_DONTEXPAND | VM_WIPEONFORK;
> >> +	vma->vm_ops = &uacce_vm_ops;
> >> +	vma->vm_private_data = q;
> >> +	qfr->type = type;
> >> +
> >> +	mutex_lock(&uacce_mutex);
> >> +
> >> +	if (q->state != UACCE_Q_INIT && q->state != UACCE_Q_STARTED) {
> >> +		ret = -EINVAL;
> >> +		goto out_with_lock;
> >> +	}
> >> +
> >> +	if (q->qfrs[type]) {
> >> +		ret = -EEXIST;
> >> +		goto out_with_lock;
> >> +	}
> >> +
> >> +	switch (type) {
> >> +	case UACCE_QFRT_MMIO:
> >> +
Re: [PATCH v10 0/4] Add uacce module for Accelerator
On Fri, 10 Jan 2020 15:03:25 +0800
zhangfei wrote:

> On 2020/1/10 1:49 AM, Jonathan Cameron wrote:
> > On Mon, 16 Dec 2019 11:08:13 +0800
> > Zhangfei Gao wrote:
> >
> >> Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
> >> provide Shared Virtual Addressing (SVA) between accelerators and processes.
> >> So accelerator can access any data structure of the main cpu.
> >> This differs from the data sharing between cpu and io device, which share
> >> data content rather than address.
> >> Because of unified address, hardware and user space of process can share
> >> the same virtual address in the communication.
> >>
> >> Uacce is intended to be used with Jean Philippe Brucker's SVA
> >> patchset[1], which enables IO side page fault and PASID support.
> >> We have keep verifying with Jean's sva patchset [2]
> >> We also keep verifying with Eric's SMMUv3 Nested Stage patches [3]
> > Hi Zhangfei Gao,
> >
> > Just to check my understanding...
> >
> > This patch set is not dependent on either 2 or 3?
> >
> > To use it on our hardware, we need 2, but the interfaces used are already
> > upstream, so this could move forwards in parallel.
> >
> Yes,
> patch 1, 2 is for uacce.
> patch 3, 4 is an example using uacce, which happen to be crypto.

Sorry, I wasn't clear enough. Question is whether we need Jean's sva
patch set [2] to merge this?

>
> Thanks
[PATCH v3] iommu/vt-d: Don't reject Host Bridge due to scope mismatch
On a system with two host bridges (0000:00:00.0, 0000:80:00.0), iommu
initialization fails with:

  DMAR: Device scope type does not match for 0000:80:00.0

This is because the DMAR table reports this device as having scope 2
(ACPI_DMAR_SCOPE_TYPE_BRIDGE), but the device has a type 0 PCI header:

  80:00.0 Class 0600: Device 8086:2020 (rev 06)
  00: 86 80 20 20 47 05 10 00 06 00 00 06 10 00 00 00
  10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 00 00
  30: 00 00 00 00 90 00 00 00 00 00 00 00 00 01 00 00

VT-d works perfectly on this system, so there's no reason to bail out on
initialization due to this apparent scope mismatch. Add the class 0x06
("PCI_BASE_CLASS_BRIDGE") as a heuristic for allowing DMAR initialization
for non-bridge PCI devices listed with scope bridge.

Signed-off-by: jimyan
---
 drivers/iommu/dmar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index eecd6a421667..50c92eb23ee4 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -244,7 +244,7 @@ int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
 	     info->dev->hdr_type != PCI_HEADER_TYPE_NORMAL) ||
 	    (scope->entry_type == ACPI_DMAR_SCOPE_TYPE_BRIDGE &&
 	     (info->dev->hdr_type == PCI_HEADER_TYPE_NORMAL &&
-	      info->dev->class >> 8 != PCI_CLASS_BRIDGE_OTHER))) {
+	      info->dev->class >> 16 != PCI_BASE_CLASS_BRIDGE))) {
 		pr_warn("Device scope type does not match for %s\n",
 			pci_name(info->dev));
 		return -EINVAL;
-- 
2.11.0
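The one-line change hinges on how Linux packs the PCI class code: dev->class holds a 24-bit value with the base class in bits 23:16, the sub-class in bits 15:8, and the programming interface in bits 7:0. A standalone sketch of the two comparisons follows (constants as defined in pci_ids.h; the device value comes from the lspci dump above, class 0600 with prog-if 00):

```c
#include <assert.h>
#include <stdint.h>

#define PCI_BASE_CLASS_BRIDGE	0x06	/* base class only */
#define PCI_CLASS_BRIDGE_OTHER	0x0680	/* base class + sub-class */

/* Old check: compares base class + sub-class, so it only accepts
 * "other bridge" (0x0680xx) devices and rejects a plain host bridge
 * (0x0600xx) like the one in the log. */
static int old_check(uint32_t class)
{
	return (class >> 8) == PCI_CLASS_BRIDGE_OTHER;
}

/* New check: compares the base class alone, so any bridge-class device
 * listed with DMAR scope "bridge" is accepted. */
static int new_check(uint32_t class)
{
	return (class >> 16) == PCI_BASE_CLASS_BRIDGE;
}
```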