Re: [PATCH 01/11] iommu/vt-d: Add pasid private data helpers
Hi BaoLu, On Thu, 20 May 2021 11:15:21 +0800, Lu Baolu wrote: > We are about to use iommu_sva_alloc/free_pasid() helpers in iommu core. > That means the pasid life cycle will be managed by iommu core. Use a > local array to save the per pasid private data instead of attaching it > to the real pasid. > I feel a little awkward to have a separate xarray for storing per IOASID data. Seems duplicated. Jason suggested in another thread that we can make ioasid_data public and embedded in struct intel_svm, then we can get rid of the private data pointer. ioasid_find() will return the ioasid_data, then we can retrieve the private data with container_of(). Roughly: struct intel_svm { ... struct ioasid_data; }; struct ioasid_data { ioasid_t id; refcount_t refs; struct mm_struct *mm; }; This can be a separate patch/effort if it makes sense to you. > Signed-off-by: Lu Baolu > --- > drivers/iommu/intel/svm.c | 62 ++- > 1 file changed, 41 insertions(+), 21 deletions(-) > > diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c > index 5165cea90421..82b0627ad7e7 100644 > --- a/drivers/iommu/intel/svm.c > +++ b/drivers/iommu/intel/svm.c > @@ -17,6 +17,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -28,6 +29,23 @@ static void intel_svm_drain_prq(struct device *dev, > u32 pasid); > #define PRQ_ORDER 0 > > +static DEFINE_XARRAY_ALLOC(pasid_private_array); > +static int pasid_private_add(ioasid_t pasid, void *priv) > +{ > + return xa_alloc(&pasid_private_array, &pasid, priv, > + XA_LIMIT(pasid, pasid), GFP_ATOMIC); > +} > + > +static void pasid_private_remove(ioasid_t pasid) > +{ > + xa_erase(&pasid_private_array, pasid); > +} > + > +static void *pasid_private_find(ioasid_t pasid) > +{ > + return xa_load(&pasid_private_array, pasid); > +} > + > int intel_svm_enable_prq(struct intel_iommu *iommu) > { > struct page *pages; > @@ -224,7 +242,7 @@ static int pasid_to_svm_sdev(struct device *dev, > unsigned int pasid, if (pasid == INVALID_IOASID || pasid >= PASID_MAX) > 
return -EINVAL; > > - svm = ioasid_find(NULL, pasid, NULL); > + svm = pasid_private_find(pasid); > if (IS_ERR(svm)) > return PTR_ERR(svm); > > @@ -334,7 +352,7 @@ int intel_svm_bind_gpasid(struct iommu_domain > *domain, struct device *dev, svm->gpasid = data->gpasid; > svm->flags |= SVM_FLAG_GUEST_PASID; > } > - ioasid_set_data(data->hpasid, svm); > + pasid_private_add(data->hpasid, svm); > INIT_LIST_HEAD_RCU(&svm->devs); > mmput(svm->mm); > } > @@ -388,7 +406,7 @@ int intel_svm_bind_gpasid(struct iommu_domain > *domain, struct device *dev, list_add_rcu(&sdev->list, &svm->devs); > out: > if (!IS_ERR_OR_NULL(svm) && list_empty(&svm->devs)) { > - ioasid_set_data(data->hpasid, NULL); > + pasid_private_remove(data->hpasid); > kfree(svm); > } > > @@ -431,7 +449,7 @@ int intel_svm_unbind_gpasid(struct device *dev, u32 > pasid) >* the unbind, IOMMU driver will get > notified >* and perform cleanup. >*/ > - ioasid_set_data(pasid, NULL); > + pasid_private_remove(pasid); > kfree(svm); > } > } > @@ -547,8 +565,7 @@ intel_svm_bind_mm(struct device *dev, unsigned int > flags, svm = kzalloc(sizeof(*svm), GFP_KERNEL); > if (!svm) { > ret = -ENOMEM; > - kfree(sdev); > - goto out; > + goto sdev_err; > } > > if (pasid_max > intel_pasid_max_id) > @@ -556,13 +573,16 @@ intel_svm_bind_mm(struct device *dev, unsigned int > flags, > /* Do not use PASID 0, reserved for RID to PASID */ > svm->pasid = ioasid_alloc(NULL, PASID_MIN, > - pasid_max - 1, svm); > + pasid_max - 1, NULL); > if (svm->pasid == INVALID_IOASID) { > - kfree(svm); > - kfree(sdev); > ret = -ENOSPC; > - goto out; > + goto svm_err; > } > + > + ret = pasid_private_add(svm->pasid, svm); > + if (ret) > + goto pasid_err; > + > svm->notifier.ops = &intel_mmuops; > svm->mm = mm; > svm->flags = flags; > @@ -571,12 +591,8 @@ intel_svm_bind_mm(struct device *dev, unsigned int > flags, ret = -ENOMEM; > if (mm) { > ret = mmu_notifier_register(&svm->notifier, mm); > - if (ret) { > -
Re: [PATCH 2/2] firmware: QCOM_SCM: Allow qcom_scm driver to be loadable as a permanent module
On Tue, May 18, 2021 at 09:19:22PM +, John Stultz wrote: > Allow the qcom_scm driver to be loadable as a permanent module. > > This still uses the "depends on QCOM_SCM || !QCOM_SCM" bit to > ensure that drivers that call into the qcom_scm driver are > also built as modules. While not ideal in some cases, it's the > only safe way I can find to avoid build errors without having > those drivers select QCOM_SCM and have to force it on (as > QCOM_SCM=n can be valid for those drivers). > > Reviving this now that Saravana's fw_devlink defaults to on, > which should avoid loading troubles seen before. > > Cc: Catalin Marinas > Cc: Will Deacon > Cc: Andy Gross > Cc: Bjorn Andersson > Cc: Joerg Roedel > Cc: Thomas Gleixner > Cc: Jason Cooper > Cc: Marc Zyngier > Cc: Linus Walleij > Cc: Vinod Koul > Cc: Kalle Valo > Cc: Maulik Shah > Cc: Lina Iyer > Cc: Saravana Kannan > Cc: Todd Kjos > Cc: Greg Kroah-Hartman > Cc: linux-arm-...@vger.kernel.org > Cc: iommu@lists.linux-foundation.org > Cc: linux-g...@vger.kernel.org > Acked-by: Kalle Valo > Acked-by: Greg Kroah-Hartman > Reviewed-by: Bjorn Andersson > Signed-off-by: John Stultz Acked-by: Greg Kroah-Hartman ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] irqchip/qcom-pdc: Switch to IRQCHIP_PLATFORM_DRIVER and allow as a module
On Tue, May 18, 2021 at 09:19:21PM +, John Stultz wrote: > From: Saravana Kannan > > This patch revives changes from Saravana Kannan to switch the > qcom-pdc driver to use IRQCHIP_PLATFORM_DRIVER helper macros, > and allows the qcom-pdc driver to be loaded as a permanent module. > > Earlier attempts at this ran into trouble with loading > dependencies, but with Saravana's fw_devlink=on set by default > now we should avoid those. > > Cc: Andy Gross > Cc: Bjorn Andersson > Cc: Joerg Roedel > Cc: Thomas Gleixner > Cc: Jason Cooper > Cc: Marc Zyngier > Cc: Linus Walleij > Cc: Maulik Shah > Cc: Lina Iyer > Cc: Saravana Kannan > Cc: Todd Kjos > Cc: Greg Kroah-Hartman > Cc: linux-arm-...@vger.kernel.org > Cc: iommu@lists.linux-foundation.org > Cc: linux-g...@vger.kernel.org > Signed-off-by: Saravana Kannan > [jstultz: Folded in with my changes to allow the driver to be > loadable as a permanent module] > Signed-off-by: John Stultz Acked-by: Greg Kroah-Hartman
Re: [PATCH v4] iommu/of: Fix pci_request_acs() before enumerating PCI devices
On Thu, May 20, 2021 at 10:03 PM Wang Xingang wrote: > > From: Xingang Wang > > When booting with devicetree, pci_request_acs() is called after the > enumeration and initialization of PCI devices, thus ACS is not > enabled. ACS should be enabled when an IOMMU is detected for the > PCI host bridge, so add a check for the IOMMU before probing the PCI host and call > pci_request_acs() to make sure ACS will be enabled when enumerating PCI > devices. > > Fixes: 6bf6c24720d33 ("iommu/of: Request ACS from the PCI core when > configuring IOMMU linkage") > Signed-off-by: Xingang Wang > --- > drivers/iommu/of_iommu.c | 1 - > drivers/pci/of.c | 8 +++- > 2 files changed, 7 insertions(+), 2 deletions(-) Reviewed-by: Rob Herring > diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c > index a9d2df001149..54a14da242cc 100644 > --- a/drivers/iommu/of_iommu.c > +++ b/drivers/iommu/of_iommu.c > @@ -205,7 +205,6 @@ const struct iommu_ops *of_iommu_configure(struct device > *dev, > .np = master_np, > }; > > - pci_request_acs(); > err = pci_for_each_dma_alias(to_pci_dev(dev), > of_pci_iommu_init, &info); > } else { > diff --git a/drivers/pci/of.c b/drivers/pci/of.c > index da5b414d585a..2313c3f848b0 100644 > --- a/drivers/pci/of.c > +++ b/drivers/pci/of.c > @@ -581,9 +581,15 @@ static int pci_parse_request_of_pci_ranges(struct device > *dev, > > int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge > *bridge) > { > - if (!dev->of_node) > + struct device_node *node = dev->of_node; > + > + if (!node) > return 0; > > + /* Detect IOMMU and make sure ACS will be enabled */ > + if (of_property_read_bool(node, "iommu-map")) > + pci_request_acs(); > + > bridge->swizzle_irq = pci_common_swizzle; > bridge->map_irq = of_irq_parse_and_map_pci; > > -- > 2.19.1 >
Re: [PATCH v2 2/6] ACPI: Move IOMMU setup code out of IORT
On Fri, Apr 23, 2021 at 1:57 PM Jean-Philippe Brucker wrote: > > Some of the IOMMU setup code in IORT is fairly generic and can be reused > by VIOT. Extract it from IORT. Except that iort_iommu_configure_id() is not really generic AFAICS.
Re: [PATCH v2 3/6] ACPI: Add driver for the VIOT table
On Fri, Apr 23, 2021 at 1:57 PM Jean-Philippe Brucker wrote: > > The ACPI Virtual I/O Translation Table describes topology of > para-virtual platforms, similarly to vendor tables DMAR, IVRS and IORT. > For now it describes the relation between virtio-iommu and the endpoints > it manages. > > Three steps are needed to configure DMA of endpoints: > > (1) acpi_viot_init(): parse the VIOT table, find or create the fwnode > associated to each vIOMMU device. > > (2) When probing the vIOMMU device, the driver registers its IOMMU ops > within the IOMMU subsystem. This step doesn't require any > intervention from the VIOT driver. > > (3) viot_iommu_configure(): before binding the endpoint to a driver, > find the associated IOMMU ops. Register them, along with the > endpoint ID, into the device's iommu_fwspec. > > If step (3) happens before step (2), it is deferred until the IOMMU is > initialized, then retried. > > Signed-off-by: Jean-Philippe Brucker > --- > drivers/acpi/Kconfig | 3 + > drivers/iommu/Kconfig | 1 + > drivers/acpi/Makefile | 2 + > include/linux/acpi_viot.h | 19 +++ > drivers/acpi/bus.c| 2 + > drivers/acpi/scan.c | 3 + > drivers/acpi/viot.c | 350 ++ > MAINTAINERS | 8 + > 8 files changed, 388 insertions(+) > create mode 100644 include/linux/acpi_viot.h > create mode 100644 drivers/acpi/viot.c > > diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig > index eedec61e3476..3758c6940ed7 100644 > --- a/drivers/acpi/Kconfig > +++ b/drivers/acpi/Kconfig > @@ -526,6 +526,9 @@ endif > > source "drivers/acpi/pmic/Kconfig" > > +config ACPI_VIOT > + bool > + > endif # ACPI > > config X86_PM_TIMER > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig > index 1f111b399bca..aff8a4830dd1 100644 > --- a/drivers/iommu/Kconfig > +++ b/drivers/iommu/Kconfig > @@ -403,6 +403,7 @@ config VIRTIO_IOMMU > depends on ARM64 > select IOMMU_API > select INTERVAL_TREE > + select ACPI_VIOT if ACPI > help > Para-virtualised IOMMU driver with virtio. 
> > diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile > index 700b41adf2db..a6e644c48987 100644 > --- a/drivers/acpi/Makefile > +++ b/drivers/acpi/Makefile > @@ -118,3 +118,5 @@ video-objs += acpi_video.o video_detect.o > obj-y += dptf/ > > obj-$(CONFIG_ARM64)+= arm64/ > + > +obj-$(CONFIG_ACPI_VIOT)+= viot.o > diff --git a/include/linux/acpi_viot.h b/include/linux/acpi_viot.h > new file mode 100644 > index ..1eb8ee5b0e5f > --- /dev/null > +++ b/include/linux/acpi_viot.h > @@ -0,0 +1,19 @@ > +/* SPDX-License-Identifier: GPL-2.0-only */ > + > +#ifndef __ACPI_VIOT_H__ > +#define __ACPI_VIOT_H__ > + > +#include > + > +#ifdef CONFIG_ACPI_VIOT > +void __init acpi_viot_init(void); > +int viot_iommu_configure(struct device *dev); > +#else > +static inline void acpi_viot_init(void) {} > +static inline int viot_iommu_configure(struct device *dev) > +{ > + return -ENODEV; > +} > +#endif > + > +#endif /* __ACPI_VIOT_H__ */ > diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c > index be7da23fad76..b835ca702ff0 100644 > --- a/drivers/acpi/bus.c > +++ b/drivers/acpi/bus.c > @@ -27,6 +27,7 @@ > #include > #endif > #include > +#include > #include > #include > #include > @@ -1339,6 +1340,7 @@ static int __init acpi_init(void) > pci_mmcfg_late_init(); > acpi_iort_init(); > acpi_scan_init(); > + acpi_viot_init(); > acpi_ec_init(); > acpi_debugfs_init(); > acpi_sleep_proc_init(); > diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c > index 5924421075f6..4db43c822ee7 100644 > --- a/drivers/acpi/scan.c > +++ b/drivers/acpi/scan.c > @@ -9,6 +9,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -1554,6 +1555,8 @@ static const struct iommu_ops > *acpi_iommu_configure_id(struct device *dev, > return ops; > > err = iort_iommu_configure_id(dev, id_in); > + if (err && err != -EPROBE_DEFER) > + err = viot_iommu_configure(dev); > > /* > * If we have reason to believe the IOMMU driver missed the initial > diff --git a/drivers/acpi/viot.c 
b/drivers/acpi/viot.c > new file mode 100644 > index ..710e5a5eac70 > --- /dev/null > +++ b/drivers/acpi/viot.c > @@ -0,0 +1,350 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Virtual I/O topology In the first place, more information on what this is all about, please. What it does and how it is used.
Re: [PATCH v5 3/4] iommu: rockchip: Add internal ops to handle variants
On 2021-05-21 14:38, Benjamin Gaignard wrote: Le 21/05/2021 à 14:58, Robin Murphy a écrit : On 2021-05-21 09:36, Benjamin Gaignard wrote: Add internal ops to be able to handle the incoming variant v2. The goal is to keep the overall structure of the framework but to allow for the evolution of this hardware block. The ops are global for a SoC because iommu domains are not attached to specific devices if they are for a virtual device like drm. Using a global variable shouldn't be an issue since a SoC usually doesn't embed different versions of the iommu hardware block. If that happens one day a WARN_ON will be displayed at probe time. IMO it would be a grievous error if such a "virtual device" ever gets near the IOMMU API, so personally I wouldn't use that as a justification for anything :) FWIW you should be OK to handle things on a per-instance basis, it just means you have to defer some of the domain setup to .attach_dev time, like various other drivers do. That said, there's nothing wrong with the global if we do expect instances to be consistent across any given Rockchip SoC (and my gut feeling is that we probably should). I tried that solution first, but the drm device appears to be such a "virtual device", so I had to use the global. Hmm, the "rockchip,display-subsystem" node is not associated with an IOMMU, and shouldn't even be passed to the DMA API either, because it's not a real piece of DMA capable hardware. Whatever the DRM stack is doing above, it should only be the actual VOP devices that we see down here. If not, that's indicative of something being wrong elsewhere. Like I say though, I think it's fine to use global ops simply on the expectation that that's how the new SoCs are going to be. In fact this reminds me, I think I started writing a patch somewhere to clean up the virtual device mess for rockchip-drm (IIRC I could see no reason why we can't just allocate the DRM device from the VOP driver, similar to what exynos-drm does). 
Maybe I should dig that up again... I sent a v6 to fix your other remarks. I guess I'll wait for v7 now then, since I got sidetracked before sending my review of patch #4 (heck, I've just spent the last half hour doing something else in the middle of writing this!) ;) Cheers, Robin. Thanks for your advice. Benjamin Signed-off-by: Benjamin Gaignard --- version 5: - Use of_device_get_match_data() - Add internal ops inside the driver drivers/iommu/rockchip-iommu.c | 69 -- 1 file changed, 50 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 7a2932772fdf..e7b9bcf174b1 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -19,6 +19,7 @@ #include #include #include +#include This seems to be an unrelated and unnecessary change. #include #include #include @@ -96,6 +97,14 @@ static const char * const rk_iommu_clocks[] = { "aclk", "iface", }; +struct rk_iommu_ops { + phys_addr_t (*pt_address)(u32 dte); + u32 (*mk_dtentries)(dma_addr_t pt_dma); + u32 (*mk_ptentries)(phys_addr_t page, int prot); + phys_addr_t (*dte_addr_phys)(phys_addr_t addr); + u32 pt_address_mask; +}; + struct rk_iommu { struct device *dev; void __iomem **bases; @@ -116,6 +125,7 @@ struct rk_iommudata { }; static struct device *dma_dev; +static const struct rk_iommu_ops *rk_ops; static inline void rk_table_flush(struct rk_iommu_domain *dom, dma_addr_t dma, unsigned int count) @@ -215,11 +225,6 @@ static inline u32 rk_mk_dte(dma_addr_t pt_dma) #define RK_PTE_PAGE_READABLE BIT(1) #define RK_PTE_PAGE_VALID BIT(0) -static inline phys_addr_t rk_pte_page_address(u32 pte) -{ - return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK; -} - static inline bool rk_pte_is_page_valid(u32 pte) { return pte & RK_PTE_PAGE_VALID; @@ -451,7 +456,7 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY); dte_addr = rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR); - if 
(dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) { + if (dte_addr != (DTE_ADDR_DUMMY & rk_ops->pt_address_mask)) { Nit: might it make more sense to do something like: dte_addr = rk_ops->pt_address(... DTE_ADDR_DUMMY); rk_iommu_write(... dte_addr) if (rk_iommu_read(...) != dte_addr) so that you don't need to bother defining ->pt_address_mask for just this one sanity-check? dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n"); return -EFAULT; } @@ -470,6 +475,11 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) return 0; } +static inline phys_addr_t rk_dte_addr_phys(phys_addr_t addr) The argument type here should be u32, since it's a DTE,
[PATCH] iommu/rockchip: Remove redundant DMA syncs
When we allocate new pagetable pages, we don't modify them between the initial dma_map_single() call and the dma_sync_single_for_device() call via rk_iommu_flush(), so the latter is entirely pointless. Signed-off-by: Robin Murphy --- drivers/iommu/rockchip-iommu.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 7a2932772fdf..e5116d6a6260 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -682,7 +682,6 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain, dte = rk_mk_dte(pt_dma); *dte_addr = dte; - rk_table_flush(rk_domain, pt_dma, NUM_PT_ENTRIES); rk_table_flush(rk_domain, rk_domain->dt_dma + dte_index * sizeof(u32), 1); done: @@ -1004,8 +1003,6 @@ static struct iommu_domain *rk_iommu_domain_alloc(unsigned type) goto err_free_dt; } - rk_table_flush(rk_domain, rk_domain->dt_dma, NUM_DT_ENTRIES); - spin_lock_init(&rk_domain->iommus_lock); spin_lock_init(&rk_domain->dt_lock); INIT_LIST_HEAD(&rk_domain->iommus); -- 2.21.0.dirty
Re: [PATCH v5 4/4] iommu: rockchip: Add support for iommu v2
On 2021-05-21 09:36, Benjamin Gaignard wrote: This second version of the hardware block has a different bits mapping for page table entries. Add the ops matching to this new mapping. Define a new compatible to distinguish it from the first version. Signed-off-by: Benjamin Gaignard --- version 5: - Use internal ops to support v2 hardware block - Use GENMASK macro. - Keep rk_dte_pt_address() and rk_dte_pt_address_v2() separated because I believe that is more readable like this. - Do not duplicate code. drivers/iommu/rockchip-iommu.c | 78 ++ 1 file changed, 78 insertions(+) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index e7b9bcf174b1..23253a2f269e 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -189,6 +189,33 @@ static inline phys_addr_t rk_dte_pt_address(u32 dte) return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK; } +/* + * In v2: + * 31:12 - PT address bit 31:0 + * 11: 8 - PT address bit 35:32 + * 7: 4 - PT address bit 39:36 + * 3: 1 - Reserved + * 0 - 1 if PT @ PT address is valid + */ +#define RK_DTE_PT_ADDRESS_MASK_V2 GENMASK_ULL(31, 4) +#define DTE_HI_MASK1 GENMASK(11, 8) +#define DTE_HI_MASK2 GENMASK(7, 4) +#define DTE_HI_SHIFT1 24 /* shift bit 8 to bit 32 */ +#define DTE_HI_SHIFT2 32 /* shift bit 4 to bit 36 */ Nit: no harm in doing "#define DTE_HI_SHIFT1 (32 - 8)" etc. for maximum clarity if you want. 
+#define PAGE_DESC_HI_MASK1 GENMASK_ULL(39, 36) +#define PAGE_DESC_HI_MASK2 GENMASK_ULL(35, 32) + +static inline phys_addr_t rk_dte_pt_address_v2(u32 dte) +{ + u64 dte_v2 = dte; + + dte_v2 = ((dte_v2 & DTE_HI_MASK2) << DTE_HI_SHIFT2) | +((dte_v2 & DTE_HI_MASK1) << DTE_HI_SHIFT1) | +(dte_v2 & RK_DTE_PT_ADDRESS_MASK); + + return (phys_addr_t)dte_v2; +} + static inline bool rk_dte_is_pt_valid(u32 dte) { return dte & RK_DTE_PT_VALID; @@ -199,6 +226,15 @@ static inline u32 rk_mk_dte(dma_addr_t pt_dma) return (pt_dma & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID; } +static inline u32 rk_mk_dte_v2(dma_addr_t pt_dma) +{ + pt_dma = (pt_dma & RK_DTE_PT_ADDRESS_MASK) | +((pt_dma & PAGE_DESC_HI_MASK1) >> DTE_HI_SHIFT1) | +(pt_dma & PAGE_DESC_HI_MASK2) >> DTE_HI_SHIFT2; + + return (pt_dma & RK_DTE_PT_ADDRESS_MASK_V2) | RK_DTE_PT_VALID; +} + /* * Each PTE has a Page address, some flags and a valid bit: * +-+---+---+-+ @@ -240,6 +276,29 @@ static u32 rk_mk_pte(phys_addr_t page, int prot) return page | flags | RK_PTE_PAGE_VALID; } +/* + * In v2: + * 31:12 - Page address bit 31:0 + * 11:9 - Page address bit 34:32 + * 8:4 - Page address bit 39:35 + * 3 - Security + * 2 - Readable + * 1 - Writable + * 0 - 1 if Page @ Page address is valid + */ +#define RK_PTE_PAGE_READABLE_V2 BIT(2) +#define RK_PTE_PAGE_WRITABLE_V2 BIT(1) + +static u32 rk_mk_pte_v2(phys_addr_t page, int prot) +{ + u32 flags = 0; + + flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE_V2 : 0; + flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE_V2 : 0; + + return rk_mk_dte_v2(page) | flags ; +} + static u32 rk_mk_pte_invalid(u32 pte) { return pte & ~RK_PTE_PAGE_VALID; @@ -480,6 +539,14 @@ static inline phys_addr_t rk_dte_addr_phys(phys_addr_t addr) return addr; } +#define DT_HI_MASK GENMASK_ULL(39, 32) +#define DT_SHIFT 28 + +static inline phys_addr_t rk_dte_addr_phys_v2(phys_addr_t addr) +{ + return (addr & RK_DTE_PT_ADDRESS_MASK) | ((addr & DT_HI_MASK) << DT_SHIFT); +} Are we missing something overall? 
AFAICS the DT_HI_MASK bits of RK_MMU_DTE_ADDR will never actually be used, since rk_iommu_enable() just writes the value of dt_dma directly... + static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) { void __iomem *base = iommu->bases[index]; @@ -1305,10 +1372,21 @@ static struct rk_iommu_ops iommu_data_ops_v1 = { .pt_address_mask = RK_DTE_PT_ADDRESS_MASK, }; +static struct rk_iommu_ops iommu_data_ops_v2 = { + .pt_address = &rk_dte_pt_address_v2, + .mk_dtentries = &rk_mk_dte_v2, + .mk_ptentries = &rk_mk_pte_v2, + .dte_addr_phys = &rk_dte_addr_phys_v2, + .pt_address_mask = RK_DTE_PT_ADDRESS_MASK_V2, +}; + static const struct of_device_id rk_iommu_dt_ids[] = { { .compatible = "rockchip,iommu", .data = &iommu_data_ops_v1, }, + { .compatible = "rockchip,rk3568-iommu", + .data = &iommu_data_ops_v2, + }, { /* sentinel */ } }; MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids); ...and I'll bet the reason it appears to work is that we also never actually set the IOMMU device's DMA masks anywhere, so what happens in practice is that even if pagetable pages are allocated above 32 bits they'll just get bounced by the DMA
[PATCH v6 4/4] iommu: rockchip: Add support for iommu v2
This second version of the hardware block has a different bits mapping for page table entries. Add the ops matching to this new mapping. Define a new compatible to distinguish it from the first version. Signed-off-by: Benjamin Gaignard Reviewed-by: Heiko Stuebner --- version 5: - Use internal ops to support v2 hardware block - Use GENMASK macro. - Keep rk_dte_pt_address() and rk_dte_pt_address_v2() separated because I believe that is more readable like this. - Do not duplicate code. drivers/iommu/rockchip-iommu.c | 78 ++ 1 file changed, 78 insertions(+) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 7b537dd168ca..d943b9d8bdb2 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -187,6 +187,33 @@ static inline phys_addr_t rk_dte_pt_address(u32 dte) return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK; } +/* + * In v2: + * 31:12 - PT address bit 31:0 + * 11: 8 - PT address bit 35:32 + * 7: 4 - PT address bit 39:36 + * 3: 1 - Reserved + * 0 - 1 if PT @ PT address is valid + */ +#define RK_DTE_PT_ADDRESS_MASK_V2 GENMASK_ULL(31, 4) +#define DTE_HI_MASK1 GENMASK(11, 8) +#define DTE_HI_MASK2 GENMASK(7, 4) +#define DTE_HI_SHIFT1 24 /* shift bit 8 to bit 32 */ +#define DTE_HI_SHIFT2 32 /* shift bit 4 to bit 36 */ +#define PAGE_DESC_HI_MASK1 GENMASK_ULL(39, 36) +#define PAGE_DESC_HI_MASK2 GENMASK_ULL(35, 32) + +static inline phys_addr_t rk_dte_pt_address_v2(u32 dte) +{ + u64 dte_v2 = dte; + + dte_v2 = ((dte_v2 & DTE_HI_MASK2) << DTE_HI_SHIFT2) | +((dte_v2 & DTE_HI_MASK1) << DTE_HI_SHIFT1) | +(dte_v2 & RK_DTE_PT_ADDRESS_MASK); + + return (phys_addr_t)dte_v2; +} + static inline bool rk_dte_is_pt_valid(u32 dte) { return dte & RK_DTE_PT_VALID; @@ -197,6 +224,15 @@ static inline u32 rk_mk_dte(dma_addr_t pt_dma) return (pt_dma & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID; } +static inline u32 rk_mk_dte_v2(dma_addr_t pt_dma) +{ + pt_dma = (pt_dma & RK_DTE_PT_ADDRESS_MASK) | +((pt_dma & PAGE_DESC_HI_MASK1) >> 
DTE_HI_SHIFT1) | +(pt_dma & PAGE_DESC_HI_MASK2) >> DTE_HI_SHIFT2; + + return (pt_dma & RK_DTE_PT_ADDRESS_MASK_V2) | RK_DTE_PT_VALID; +} + /* * Each PTE has a Page address, some flags and a valid bit: * +-+---+---+-+ @@ -238,6 +274,29 @@ static u32 rk_mk_pte(phys_addr_t page, int prot) return page | flags | RK_PTE_PAGE_VALID; } +/* + * In v2: + * 31:12 - Page address bit 31:0 + * 11:9 - Page address bit 34:32 + * 8:4 - Page address bit 39:35 + * 3 - Security + * 2 - Readable + * 1 - Writable + * 0 - 1 if Page @ Page address is valid + */ +#define RK_PTE_PAGE_READABLE_V2 BIT(2) +#define RK_PTE_PAGE_WRITABLE_V2 BIT(1) + +static u32 rk_mk_pte_v2(phys_addr_t page, int prot) +{ + u32 flags = 0; + + flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE_V2 : 0; + flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE_V2 : 0; + + return rk_mk_dte_v2(page) | flags; +} + static u32 rk_mk_pte_invalid(u32 pte) { return pte & ~RK_PTE_PAGE_VALID; @@ -478,6 +537,15 @@ static inline phys_addr_t rk_dte_addr_phys(u32 addr) return (phys_addr_t)addr; } +#define DT_HI_MASK GENMASK_ULL(39, 32) +#define DT_SHIFT 28 + +static inline phys_addr_t rk_dte_addr_phys_v2(u32 addr) +{ + return (phys_addr_t)(addr & RK_DTE_PT_ADDRESS_MASK) | + ((addr & DT_HI_MASK) << DT_SHIFT); +} + static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) { void __iomem *base = iommu->bases[index]; @@ -1305,10 +1373,20 @@ static struct rk_iommu_ops iommu_data_ops_v1 = { .dte_addr_phys = &rk_dte_addr_phys, }; +static struct rk_iommu_ops iommu_data_ops_v2 = { + .pt_address = &rk_dte_pt_address_v2, + .mk_dtentries = &rk_mk_dte_v2, + .mk_ptentries = &rk_mk_pte_v2, + .dte_addr_phys = &rk_dte_addr_phys_v2, +}; + static const struct of_device_id rk_iommu_dt_ids[] = { { .compatible = "rockchip,iommu", .data = &iommu_data_ops_v1, }, + { .compatible = "rockchip,rk3568-iommu", + .data = &iommu_data_ops_v2, + }, { /* sentinel */ } }; -- 2.25.1
[PATCH v6 1/4] dt-bindings: iommu: rockchip: Convert IOMMU to DT schema
Convert Rockchip IOMMU to DT schema Signed-off-by: Benjamin Gaignard Reviewed-by: Rob Herring Reviewed-by: Heiko Stuebner --- .../bindings/iommu/rockchip,iommu.txt | 38 - .../bindings/iommu/rockchip,iommu.yaml| 80 +++ 2 files changed, 80 insertions(+), 38 deletions(-) delete mode 100644 Documentation/devicetree/bindings/iommu/rockchip,iommu.txt create mode 100644 Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml diff --git a/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt b/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt deleted file mode 100644 index 6ecefea1c6f9.. --- a/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt +++ /dev/null @@ -1,38 +0,0 @@ -Rockchip IOMMU -== - -A Rockchip DRM iommu translates io virtual addresses to physical addresses for -its master device. Each slave device is bound to a single master device, and -shares its clocks, power domain and irq. - -Required properties: -- compatible : Should be "rockchip,iommu" -- reg : Address space for the configuration registers -- interrupts : Interrupt specifier for the IOMMU instance -- interrupt-names : Interrupt name for the IOMMU instance -- #iommu-cells: Should be <0>. This indicates the iommu is a -"single-master" device, and needs no additional information -to associate with its master device. See: -Documentation/devicetree/bindings/iommu/iommu.txt -- clocks : A list of clocks required for the IOMMU to be accessible by -the host CPU. -- clock-names : Should contain the following: - "iface" - Main peripheral bus clock (PCLK/HCL) (required) - "aclk" - AXI bus clock (required) - -Optional properties: -- rockchip,disable-mmu-reset : Don't use the mmu reset operation. - Some mmu instances may produce unexpected results - when the reset operation is used. 
- -Example: - - vopl_mmu: iommu@ff940300 { - compatible = "rockchip,iommu"; - reg = <0xff940300 0x100>; - interrupts = ; - interrupt-names = "vopl_mmu"; - clocks = <&cru ACLK_VOP1>, <&cru HCLK_VOP1>; - clock-names = "aclk", "iface"; - #iommu-cells = <0>; - }; diff --git a/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml new file mode 100644 index ..099fc2578b54 --- /dev/null +++ b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml @@ -0,0 +1,80 @@ +# SPDX-License-Identifier: GPL-2.0-only +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/iommu/rockchip,iommu.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Rockchip IOMMU + +maintainers: + - Heiko Stuebner + +description: |+ + A Rockchip DRM iommu translates io virtual addresses to physical addresses for + its master device. Each slave device is bound to a single master device and + shares its clocks, power domain and irq. + + For information on assigning IOMMU controller to its peripheral devices, + see generic IOMMU bindings. + +properties: + compatible: +const: rockchip,iommu + + reg: +items: + - description: configuration registers for MMU instance 0 + - description: configuration registers for MMU instance 1 +minItems: 1 +maxItems: 2 + + interrupts: +items: + - description: interruption for MMU instance 0 + - description: interruption for MMU instance 1 +minItems: 1 +maxItems: 2 + + clocks: +items: + - description: Core clock + - description: Interface clock + + clock-names: +items: + - const: aclk + - const: iface + + "#iommu-cells": +const: 0 + + rockchip,disable-mmu-reset: +$ref: /schemas/types.yaml#/definitions/flag +description: | + Do not use the mmu reset operation. + Some mmu instances may produce unexpected results + when the reset operation is used. 
+ +required: + - compatible + - reg + - interrupts + - clocks + - clock-names + - "#iommu-cells" + +additionalProperties: false + +examples: + - | +#include +#include + +vopl_mmu: iommu@ff940300 { + compatible = "rockchip,iommu"; + reg = <0xff940300 0x100>; + interrupts = ; + clocks = <&cru ACLK_VOP1>, <&cru HCLK_VOP1>; + clock-names = "aclk", "iface"; + #iommu-cells = <0>; +}; -- 2.25.1
[PATCH v6 2/4] dt-bindings: iommu: rockchip: Add compatible for v2
Add compatible for the second version of IOMMU hardware block. RK356x IOMMU can also be linked to a power domain. Signed-off-by: Benjamin Gaignard Reviewed-by: Rob Herring Reviewed-by: Heiko Stuebner --- .../devicetree/bindings/iommu/rockchip,iommu.yaml | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml index 099fc2578b54..d2e28a9e3545 100644 --- a/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml +++ b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml @@ -19,7 +19,9 @@ description: |+ properties: compatible: -const: rockchip,iommu +enum: + - rockchip,iommu + - rockchip,rk3568-iommu reg: items: @@ -48,6 +50,9 @@ properties: "#iommu-cells": const: 0 + power-domains: +maxItems: 1 + rockchip,disable-mmu-reset: $ref: /schemas/types.yaml#/definitions/flag description: | -- 2.25.1
[PATCH v6 3/4] iommu: rockchip: Add internal ops to handle variants
Add internal ops to be able to handle the incoming variant v2. The goal is to keep the overall structure of the framework while allowing the evolution of this hardware block. The ops are global for a SoC because iommu domains are not attached to specific devices if they are for a virtual device like drm. Using a global variable shouldn't be a problem since a SoC usually doesn't embed different versions of the iommu hardware block. If that happens one day, a WARN_ON will be displayed at probe time. Signed-off-by: Benjamin Gaignard Reviewed-by: Heiko Stuebner --- version 6: - Remove #include - Remove pt_address_mask field - Only call of_device_get_match_data once - Return an error if ops don't match version 5: - Use of_device_get_match_data() - Add internal ops inside the driver drivers/iommu/rockchip-iommu.c | 72 -- 1 file changed, 51 insertions(+), 21 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 7a2932772fdf..7b537dd168ca 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -96,6 +96,13 @@ static const char * const rk_iommu_clocks[] = { "aclk", "iface", }; +struct rk_iommu_ops { + phys_addr_t (*pt_address)(u32 dte); + u32 (*mk_dtentries)(dma_addr_t pt_dma); + u32 (*mk_ptentries)(phys_addr_t page, int prot); + phys_addr_t (*dte_addr_phys)(u32 addr); +}; + struct rk_iommu { struct device *dev; void __iomem **bases; @@ -116,6 +123,7 @@ struct rk_iommudata { }; static struct device *dma_dev; +static const struct rk_iommu_ops *rk_ops; static inline void rk_table_flush(struct rk_iommu_domain *dom, dma_addr_t dma, unsigned int count) @@ -215,11 +223,6 @@ static inline u32 rk_mk_dte(dma_addr_t pt_dma) #define RK_PTE_PAGE_READABLE BIT(1) #define RK_PTE_PAGE_VALID BIT(0) -static inline phys_addr_t rk_pte_page_address(u32 pte) -{ - return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK; -} - static inline bool rk_pte_is_page_valid(u32 pte) { return pte & RK_PTE_PAGE_VALID; @@ -448,10 +451,10 @@ static int
rk_iommu_force_reset(struct rk_iommu *iommu) * and verifying that upper 5 nybbles are read back. */ for (i = 0; i < iommu->num_mmu; i++) { - rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY); + dte_addr = rk_ops->pt_address(DTE_ADDR_DUMMY); + rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, dte_addr); - dte_addr = rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR); - if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) { + if (dte_addr != rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR)) { dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n"); return -EFAULT; } @@ -470,6 +473,11 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) return 0; } +static inline phys_addr_t rk_dte_addr_phys(u32 addr) +{ + return (phys_addr_t)addr; +} + static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) { void __iomem *base = iommu->bases[index]; @@ -489,7 +497,7 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) page_offset = rk_iova_page_offset(iova); mmu_dte_addr = rk_iommu_read(base, RK_MMU_DTE_ADDR); - mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr; + mmu_dte_addr_phys = rk_ops->dte_addr_phys(mmu_dte_addr); dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index); dte_addr = phys_to_virt(dte_addr_phys); @@ -498,14 +506,14 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) if (!rk_dte_is_pt_valid(dte)) goto print_it; - pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4); + pte_addr_phys = rk_ops->pt_address(dte) + (pte_index * 4); pte_addr = phys_to_virt(pte_addr_phys); pte = *pte_addr; if (!rk_pte_is_page_valid(pte)) goto print_it; - page_addr_phys = rk_pte_page_address(pte) + page_offset; + page_addr_phys = rk_ops->pt_address(pte) + page_offset; page_flags = pte & RK_PTE_PAGE_FLAGS_MASK; print_it: @@ -601,13 +609,13 @@ static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain, if (!rk_dte_is_pt_valid(dte)) goto out; - pt_phys = rk_dte_pt_address(dte); 
+ pt_phys = rk_ops->pt_address(dte); page_table = (u32 *)phys_to_virt(pt_phys); pte = page_table[rk_iova_pte_index(iova)]; if (!rk_pte_is_page_valid(pte)) goto out; - phys = rk_pte_page_address(pte) + rk_iova_page_offset(iova); + phys = rk_ops->pt_address(pte) + rk_iova_page_offset(iova); out: spin_unlock_irqrestore(_domain->dt_lock, flags); @@ -679,14 +687,14 @@
[PATCH v6 0/4] Add IOMMU driver for rk356x
This series adds the IOMMU driver for rk356x SoC. Since a new compatible is needed to distinguish this second version of the IOMMU hardware block from the first one, it is an opportunity to convert the binding to DT schema. version 6: - Remove #include - Remove pt_address_mask field - Only call of_device_get_match_data once - Return an error if ops don't match version 5: - Add internal ops inside the driver to be able to add variants. - Add support of v2 variant. - Stop using 'version' field - Use GENMASK macro version 4: - Add description for reg items - Remove useless interrupt-names properties - Add description for interrupts items - Remove interrupt-names properties from DTS files version 3: - Rename compatible with soc prefix - Rebase on v5.12 tag version 2: - Fix iommu-cells typo in rk322x.dtsi - Change maintainer - Change reg maxItems - Add power-domains property
Benjamin Gaignard (4): dt-bindings: iommu: rockchip: Convert IOMMU to DT schema dt-bindings: iommu: rockchip: Add compatible for v2 iommu: rockchip: Add internal ops to handle variants iommu: rockchip: Add support for iommu v2 .../bindings/iommu/rockchip,iommu.txt | 38 - .../bindings/iommu/rockchip,iommu.yaml| 85 ++ drivers/iommu/rockchip-iommu.c| 150 +++--- 3 files changed, 214 insertions(+), 59 deletions(-) delete mode 100644 Documentation/devicetree/bindings/iommu/rockchip,iommu.txt create mode 100644 Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml -- 2.25.1
Re: [PATCH v5 3/4] iommu: rockchip: Add internal ops to handle variants
On 21/05/2021 14:58, Robin Murphy wrote: On 2021-05-21 09:36, Benjamin Gaignard wrote: Add internal ops to be able to handle the incoming variant v2. The goal is to keep the overall structure of the framework while allowing the evolution of this hardware block. The ops are global for a SoC because iommu domains are not attached to specific devices if they are for a virtual device like drm. Using a global variable shouldn't be a problem since a SoC usually doesn't embed different versions of the iommu hardware block. If that happens one day, a WARN_ON will be displayed at probe time. IMO it would be a grievous error if such a "virtual device" ever gets near the IOMMU API, so personally I wouldn't use that as a justification for anything :) FWIW you should be OK to handle things on a per-instance basis, it just means you have to defer some of the domain setup to .attach_dev time, like various other drivers do. That said, there's nothing wrong with the global if we do expect instances to be consistent across any given Rockchip SoC (and my gut feeling is that we probably should). I have tried that solution first, but drm devices appear to be such "virtual devices", so I had to use the global. I sent a v6 to fix your other remarks. Thanks for your advice. Benjamin Signed-off-by: Benjamin Gaignard --- version 5: - Use of_device_get_match_data() - Add internal ops inside the driver drivers/iommu/rockchip-iommu.c | 69 -- 1 file changed, 50 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 7a2932772fdf..e7b9bcf174b1 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -19,6 +19,7 @@ #include #include #include +#include This seems to be an unrelated and unnecessary change.
#include #include #include @@ -96,6 +97,14 @@ static const char * const rk_iommu_clocks[] = { "aclk", "iface", }; +struct rk_iommu_ops { + phys_addr_t (*pt_address)(u32 dte); + u32 (*mk_dtentries)(dma_addr_t pt_dma); + u32 (*mk_ptentries)(phys_addr_t page, int prot); + phys_addr_t (*dte_addr_phys)(phys_addr_t addr); + u32 pt_address_mask; +}; + struct rk_iommu { struct device *dev; void __iomem **bases; @@ -116,6 +125,7 @@ struct rk_iommudata { }; static struct device *dma_dev; +static const struct rk_iommu_ops *rk_ops; static inline void rk_table_flush(struct rk_iommu_domain *dom, dma_addr_t dma, unsigned int count) @@ -215,11 +225,6 @@ static inline u32 rk_mk_dte(dma_addr_t pt_dma) #define RK_PTE_PAGE_READABLE BIT(1) #define RK_PTE_PAGE_VALID BIT(0) -static inline phys_addr_t rk_pte_page_address(u32 pte) -{ - return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK; -} - static inline bool rk_pte_is_page_valid(u32 pte) { return pte & RK_PTE_PAGE_VALID; @@ -451,7 +456,7 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY); dte_addr = rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR); - if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) { + if (dte_addr != (DTE_ADDR_DUMMY & rk_ops->pt_address_mask)) { Nit: might it make more sense to do something like: dte_addr = rk_ops->pt_address(... DTE_ADDR_DUMMY); rk_iommu_write(... dte_addr) if (rk_iommu_read(...) != dte_addr) so that you don't need to bother defining ->pt_address_mask for just this one sanity-check? dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n"); return -EFAULT; } @@ -470,6 +475,11 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) return 0; } +static inline phys_addr_t rk_dte_addr_phys(phys_addr_t addr) The argument type here should be u32, since it's a DTE, not a physical address... 
+{ + return addr; +} + static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) { void __iomem *base = iommu->bases[index]; @@ -489,7 +499,7 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) page_offset = rk_iova_page_offset(iova); mmu_dte_addr = rk_iommu_read(base, RK_MMU_DTE_ADDR); - mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr; + mmu_dte_addr_phys = rk_ops->dte_addr_phys((phys_addr_t)mmu_dte_addr); ...and the cast here should not be here, since it *is* the conversion that the called function is supposed to be performing. dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index); dte_addr = phys_to_virt(dte_addr_phys); @@ -498,14 +508,14 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) if (!rk_dte_is_pt_valid(dte)) goto print_it; - pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4); + pte_addr_phys = rk_ops->pt_address(dte) + (pte_index * 4);
RE: [PATCH v4 0/8] ACPI/IORT: Support for IORT RMR node
> -Original Message- > From: Steven Price [mailto:steven.pr...@arm.com] > Sent: 21 May 2021 13:55 > To: Shameerali Kolothum Thodi ; > linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org; > iommu@lists.linux-foundation.org > Cc: Linuxarm ; lorenzo.pieral...@arm.com; > j...@8bytes.org; robin.mur...@arm.com; wanghuiqiang > ; Guohanjun (Hanjun Guo) > ; sami.muja...@arm.com; j...@solid-run.com; > eric.au...@redhat.com; yangyicong > Subject: Re: [PATCH v4 0/8] ACPI/IORT: Support for IORT RMR node > > On 13/05/2021 14:45, Shameer Kolothum wrote: > > Hi, > > > > v3 -->v4 > > -Included the SMMUv2 SMR bypass install changes suggested by > > Steve(patch #7) > > -As per Robin's comments, RMR reserve implementation is now > > more generic (patch #8) and dropped v3 patches 8 and 10. > > -Rebase to 5.13-rc1 > > > > The whole series is available here, > > https://github.com/hisilicon/kernel-dev/tree/private-v5.13-rc1-rmr-v4-ext > > > > RFC v2 --> v3 > > -Dropped RFC tag as the ACPICA header changes are now ready to be > > part of 5.13[0]. But this series still has a dependency on that patch. > > -Added IORT E.b related changes(node flags, _DSM function 5 checks for > > PCIe). > > -Changed RMR to stream id mapping from M:N to M:1 as per the spec and > > discussion here[1]. > > -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!) > > > > Sanity tested on a HiSilicon D06. Further testing and feedback is greatly > > appreciated. > > With the updated SMMUv2 support this works fine on my Juno with EFIFB > (and corresponding patches to the firmware to expose the ACPI tables). > Feel free to add > > Tested-by: Steven Price Thanks Steve. I am in the process of incorporating the comments from Joerg/Robin to reuse the struct iommu_resv_region. I will post a v5 soon with that and a couple of other minor fixes. 
Thanks, Shameer > Thanks, > > Steve > > > https://github.com/hisilicon/kernel-dev/tree/private-v5.12-rc8-rmr-v3 > > > > Thanks, > > Shameer > > > > [0] > https://lore.kernel.org/linux-acpi/20210406213028.718796-22-erik.kaneda@i > ntel.com/ > > [1] > https://op-lists.linaro.org/pipermail/linaro-open-discussions/2021-April/00015 > 0.html > > > > RFC v1 --> v2: > > - Added a generic interface for IOMMU drivers to retrieve all the > > RMR info associated with a given IOMMU. > > - SMMUv3 driver gets the RMR list during probe() and installs > > bypass STEs for all the SIDs in the RMR list. This is to keep > > the ongoing traffic alive(if any) during SMMUv3 reset. This is > >based on the suggestions received for v1 to take care of the > >EFI framebuffer use case. Only sanity tested for now. > > - During the probe/attach device, SMMUv3 driver reserves any > > RMR region associated with the device such that there is a unity > > mapping for them in SMMU. > > --- > > > > From RFC v1: > > - > > The series adds support to IORT RMR nodes specified in IORT > > Revision E -ARM DEN 0049E[0]. RMR nodes are used to describe memory > > ranges that are used by endpoints and require a unity mapping > > in SMMU. > > > > We have faced issues with 3408iMR RAID controller cards which > > fail to boot when SMMU is enabled. This is because these controllers > > make use of host memory for various caching related purposes and when > > SMMU is enabled the iMR firmware fails to access these memory regions > > as there is no mapping for them. IORT RMR provides a way for UEFI to > > describe and report these memory regions so that the kernel can make > > a unity mapping for these in SMMU. 
> > > > Tests: > > > > With a UEFI, that reports the RMR for the dev, > > > > [16F0h 5872 1] Type : 06 > > [16F1h 5873 2] Length : 007C > > [16F3h 5875 1] Revision : 00 > > [1038h 0056 2] Reserved : > > [1038h 0056 2] Identifier : > > [16F8h 5880 4]Mapping Count : 0001 > > [16FCh 5884 4] Mapping Offset : 0040 > > > > [1700h 5888 4]Number of RMR Descriptors : 0002 > > [1704h 5892 4]RMR Descriptor Offset : 0018 > > > > [1708h 5896 8] Base Address of RMR : E640 > > [1710h 5904 8]Length of RMR : 0010 > > [1718h 5912 4] Reserved : > > > > [171Ch 5916 8] Base Address of RMR : 27B0 > > [1724h 5924 8]Length of RMR : 00C0 > > [172Ch 5932 4] Reserved : > > > > [1730h 5936 4] Input base : > > [1734h 5940 4] ID Count : 0001 > > [1738h 5944 4] Output Base : 0003 > > [173Ch 5948 4] Output Reference : 0064 > > [1740h 5952 4]Flags (decoded below) : 0001 > >Single Mapping : 1 > > ... > > > >
Re: [PATCH v5 3/4] iommu: rockchip: Add internal ops to handle variants
On 2021-05-21 09:36, Benjamin Gaignard wrote: Add internal ops to be able to handle the incoming variant v2. The goal is to keep the overall structure of the framework while allowing the evolution of this hardware block. The ops are global for a SoC because iommu domains are not attached to specific devices if they are for a virtual device like drm. Using a global variable shouldn't be a problem since a SoC usually doesn't embed different versions of the iommu hardware block. If that happens one day, a WARN_ON will be displayed at probe time. IMO it would be a grievous error if such a "virtual device" ever gets near the IOMMU API, so personally I wouldn't use that as a justification for anything :) FWIW you should be OK to handle things on a per-instance basis, it just means you have to defer some of the domain setup to .attach_dev time, like various other drivers do. That said, there's nothing wrong with the global if we do expect instances to be consistent across any given Rockchip SoC (and my gut feeling is that we probably should). Signed-off-by: Benjamin Gaignard --- version 5: - Use of_device_get_match_data() - Add internal ops inside the driver drivers/iommu/rockchip-iommu.c | 69 -- 1 file changed, 50 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 7a2932772fdf..e7b9bcf174b1 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -19,6 +19,7 @@ #include #include #include +#include This seems to be an unrelated and unnecessary change.
#include #include #include @@ -96,6 +97,14 @@ static const char * const rk_iommu_clocks[] = { "aclk", "iface", }; +struct rk_iommu_ops { + phys_addr_t (*pt_address)(u32 dte); + u32 (*mk_dtentries)(dma_addr_t pt_dma); + u32 (*mk_ptentries)(phys_addr_t page, int prot); + phys_addr_t (*dte_addr_phys)(phys_addr_t addr); + u32 pt_address_mask; +}; + struct rk_iommu { struct device *dev; void __iomem **bases; @@ -116,6 +125,7 @@ struct rk_iommudata { }; static struct device *dma_dev; +static const struct rk_iommu_ops *rk_ops; static inline void rk_table_flush(struct rk_iommu_domain *dom, dma_addr_t dma, unsigned int count) @@ -215,11 +225,6 @@ static inline u32 rk_mk_dte(dma_addr_t pt_dma) #define RK_PTE_PAGE_READABLE BIT(1) #define RK_PTE_PAGE_VALID BIT(0) -static inline phys_addr_t rk_pte_page_address(u32 pte) -{ - return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK; -} - static inline bool rk_pte_is_page_valid(u32 pte) { return pte & RK_PTE_PAGE_VALID; @@ -451,7 +456,7 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY); dte_addr = rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR); - if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) { + if (dte_addr != (DTE_ADDR_DUMMY & rk_ops->pt_address_mask)) { Nit: might it make more sense to do something like: dte_addr = rk_ops->pt_address(... DTE_ADDR_DUMMY); rk_iommu_write(... dte_addr) if (rk_iommu_read(...) != dte_addr) so that you don't need to bother defining ->pt_address_mask for just this one sanity-check? dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n"); return -EFAULT; } @@ -470,6 +475,11 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) return 0; } +static inline phys_addr_t rk_dte_addr_phys(phys_addr_t addr) The argument type here should be u32, since it's a DTE, not a physical address... 
+{ + return addr; +} + static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) { void __iomem *base = iommu->bases[index]; @@ -489,7 +499,7 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) page_offset = rk_iova_page_offset(iova); mmu_dte_addr = rk_iommu_read(base, RK_MMU_DTE_ADDR); - mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr; + mmu_dte_addr_phys = rk_ops->dte_addr_phys((phys_addr_t)mmu_dte_addr); ...and the cast here should not be here, since it *is* the conversion that the called function is supposed to be performing. dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index); dte_addr = phys_to_virt(dte_addr_phys); @@ -498,14 +508,14 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) if (!rk_dte_is_pt_valid(dte)) goto print_it; - pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4); + pte_addr_phys = rk_ops->pt_address(dte) + (pte_index * 4); pte_addr = phys_to_virt(pte_addr_phys); pte = *pte_addr; if (!rk_pte_is_page_valid(pte))
Re: [PATCH v4 0/8] ACPI/IORT: Support for IORT RMR node
On 13/05/2021 14:45, Shameer Kolothum wrote: > Hi, > > v3 -->v4 > -Included the SMMUv2 SMR bypass install changes suggested by > Steve(patch #7) > -As per Robin's comments, RMR reserve implementation is now > more generic (patch #8) and dropped v3 patches 8 and 10. > -Rebase to 5.13-rc1 > > The whole series is available here, > https://github.com/hisilicon/kernel-dev/tree/private-v5.13-rc1-rmr-v4-ext > > RFC v2 --> v3 > -Dropped RFC tag as the ACPICA header changes are now ready to be > part of 5.13[0]. But this series still has a dependency on that patch. > -Added IORT E.b related changes(node flags, _DSM function 5 checks for > PCIe). > -Changed RMR to stream id mapping from M:N to M:1 as per the spec and > discussion here[1]. > -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!) > > Sanity tested on a HiSilicon D06. Further testing and feedback is greatly > appreciated. With the updated SMMUv2 support this works fine on my Juno with EFIFB (and corresponding patches to the firmware to expose the ACPI tables). Feel free to add Tested-by: Steven Price Thanks, Steve > https://github.com/hisilicon/kernel-dev/tree/private-v5.12-rc8-rmr-v3 > > Thanks, > Shameer > > [0] > https://lore.kernel.org/linux-acpi/20210406213028.718796-22-erik.kan...@intel.com/ > [1] > https://op-lists.linaro.org/pipermail/linaro-open-discussions/2021-April/000150.html > > RFC v1 --> v2: > - Added a generic interface for IOMMU drivers to retrieve all the > RMR info associated with a given IOMMU. > - SMMUv3 driver gets the RMR list during probe() and installs > bypass STEs for all the SIDs in the RMR list. This is to keep > the ongoing traffic alive(if any) during SMMUv3 reset. This is >based on the suggestions received for v1 to take care of the >EFI framebuffer use case. Only sanity tested for now. > - During the probe/attach device, SMMUv3 driver reserves any > RMR region associated with the device such that there is a unity > mapping for them in SMMU. 
> --- > > From RFC v1: > - > The series adds support to IORT RMR nodes specified in IORT > Revision E -ARM DEN 0049E[0]. RMR nodes are used to describe memory > ranges that are used by endpoints and require a unity mapping > in SMMU. > > We have faced issues with 3408iMR RAID controller cards which > fail to boot when SMMU is enabled. This is because these controllers > make use of host memory for various caching related purposes and when > SMMU is enabled the iMR firmware fails to access these memory regions > as there is no mapping for them. IORT RMR provides a way for UEFI to > describe and report these memory regions so that the kernel can make > a unity mapping for these in SMMU. > > Tests: > > With a UEFI, that reports the RMR for the dev, > > [16F0h 5872 1] Type : 06 > [16F1h 5873 2] Length : 007C > [16F3h 5875 1] Revision : 00 > [1038h 0056 2] Reserved : > [1038h 0056 2] Identifier : > [16F8h 5880 4]Mapping Count : 0001 > [16FCh 5884 4] Mapping Offset : 0040 > > [1700h 5888 4]Number of RMR Descriptors : 0002 > [1704h 5892 4]RMR Descriptor Offset : 0018 > > [1708h 5896 8] Base Address of RMR : E640 > [1710h 5904 8]Length of RMR : 0010 > [1718h 5912 4] Reserved : > > [171Ch 5916 8] Base Address of RMR : 27B0 > [1724h 5924 8]Length of RMR : 00C0 > [172Ch 5932 4] Reserved : > > [1730h 5936 4] Input base : > [1734h 5940 4] ID Count : 0001 > [1738h 5944 4] Output Base : 0003 > [173Ch 5948 4] Output Reference : 0064 > [1740h 5952 4]Flags (decoded below) : 0001 >Single Mapping : 1 > ... > > Without the series the RAID controller initialization fails as > below, > > ... 
> [ 12.631117] megaraid_sas :03:00.0: FW supports sync cache: Yes > > [ 12.637360] megaraid_sas :03:00.0: megasas_disable_intr_fusion is > called outbound_intr_mask:0x4009 > > [ 18.776377] megaraid_sas :03:00.0: Init cmd return status FAILED for > SCSI host 0 > > [ 23.019383] megaraid_sas :03:00.0: Waiting for FW to come to ready > state > [ 106.684281] megaraid_sas :03:00.0: FW in FAULT state, Fault > code:0x1 subcode:0x0 func:megasas_transition_to_ready > > [ 106.695186] megaraid_sas :03:00.0: System Register set: > > [ 106.889787] megaraid_sas :03:00.0: Failed to transition controller
Re: [PATCH] dma-iommu: Add a check to avoid dereference null pointer in function iommu_dma_map_sg()
On 2021/5/21 18:36, Robin Murphy wrote: On 2021-05-21 04:05, chenxiang wrote: From: Xiang Chen The issue is reported by tool TscanCode, and it is possible to dereference a null pointer when prev is NULL, which is the initial value. No it isn't. This is literally explained in the comment visible in the diff context below... Robin. ok, thanks Signed-off-by: Xiang Chen --- drivers/iommu/dma-iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 4cb63b2..88a4f34 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -1042,7 +1042,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, * iova_len == 0, thus we cannot dereference prev the first * time through here (i.e. before it has a meaningful value). */ -if (pad_len && pad_len < s_length - 1) { +if (prev && pad_len && pad_len < s_length - 1) { prev->length += pad_len; iova_len += pad_len; }
Re: [PATCH v5 2/4] dt-bindings: iommu: rockchip: Add compatible for v2
On 2021-05-21 09:36, Benjamin Gaignard wrote: Add compatible for the second version of IOMMU hardware block. RK356x IOMMU can also be link to a power domain. Signed-off-by: Benjamin Gaignard Reviewed-by: Rob Herring --- .../devicetree/bindings/iommu/rockchip,iommu.yaml | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml index 099fc2578b54..d2e28a9e3545 100644 --- a/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml +++ b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml @@ -19,7 +19,9 @@ description: |+ properties: compatible: -const: rockchip,iommu +enum: + - rockchip,iommu + - rockchip,rk3568-iommu reg: items: @@ -48,6 +50,9 @@ properties: "#iommu-cells": const: 0 + power-domains: +maxItems: 1 + Nit: power domains are already present on various IOMMU nodes since RK3288 - it feels like strictly this should be in patch #1 to fix the existing binding as part of the conversion, but on the other hand I can't really imagine anyone caring *that* much about dtscheck bisecting cleanly :) Robin. rockchip,disable-mmu-reset: $ref: /schemas/types.yaml#/definitions/flag description: | ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] dma-iommu: Add a check to avoid dereference null pointer in function iommu_dma_map_sg()
On 2021-05-21 04:05, chenxiang wrote: From: Xiang Chen The issue is reported by tool TscanCode, and it is possible to dereference a null pointer when prev is NULL, which is the initial value. No it isn't. This is literally explained in the comment visible in the diff context below... Robin. Signed-off-by: Xiang Chen --- drivers/iommu/dma-iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 4cb63b2..88a4f34 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -1042,7 +1042,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, * iova_len == 0, thus we cannot dereference prev the first * time through here (i.e. before it has a meaningful value). */ - if (pad_len && pad_len < s_length - 1) { + if (prev && pad_len && pad_len < s_length - 1) { prev->length += pad_len; iova_len += pad_len; }
Re: [PATCH v5 1/4] dt-bindings: iommu: rockchip: Convert IOMMU to DT schema
On Friday, 21 May 2021, 10:36:34 CEST, Benjamin Gaignard wrote: > Convert Rockchip IOMMU to DT schema > > Signed-off-by: Benjamin Gaignard > Reviewed-by: Rob Herring Reviewed-by: Heiko Stuebner
Re: [PATCH v5 2/4] dt-bindings: iommu: rockchip: Add compatible for v2
On Friday, 21 May 2021, 10:36:35 CEST, Benjamin Gaignard wrote: > Add compatible for the second version of IOMMU hardware block. > RK356x IOMMU can also be linked to a power domain. > > Signed-off-by: Benjamin Gaignard > Reviewed-by: Rob Herring Reviewed-by: Heiko Stuebner > --- > .../devicetree/bindings/iommu/rockchip,iommu.yaml | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml > b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml > index 099fc2578b54..d2e28a9e3545 100644 > --- a/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml > +++ b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml > @@ -19,7 +19,9 @@ description: |+ > > properties: >compatible: > -const: rockchip,iommu > +enum: > + - rockchip,iommu > + - rockchip,rk3568-iommu > >reg: > items: > @@ -48,6 +50,9 @@ properties: >"#iommu-cells": > const: 0 > > + power-domains: > +maxItems: 1 > + >rockchip,disable-mmu-reset: > $ref: /schemas/types.yaml#/definitions/flag > description: | >
Re: [PATCH v5 4/4] iommu: rockchip: Add support for iommu v2
On Friday, 21 May 2021, 10:36:37 CEST, Benjamin Gaignard wrote: > This second version of the hardware block has a different bit > mapping for page table entries. > Add the ops matching this new mapping. > Define a new compatible to distinguish it from the first version. > > Signed-off-by: Benjamin Gaignard Reviewed-by: Heiko Stuebner > --- > version 5: > - Use internal ops to support v2 hardware block > - Use GENMASK macro. > - Keep rk_dte_pt_address() and rk_dte_pt_address_v2() separated >because I believe that is more readable like this. > - Do not duplicate code. > > drivers/iommu/rockchip-iommu.c | 78 ++ > 1 file changed, 78 insertions(+) > > diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c > index e7b9bcf174b1..23253a2f269e 100644 > --- a/drivers/iommu/rockchip-iommu.c > +++ b/drivers/iommu/rockchip-iommu.c > @@ -189,6 +189,33 @@ static inline phys_addr_t rk_dte_pt_address(u32 dte) > return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK; > } > > +/* > + * In v2: > + * 31:12 - PT address bit 31:0 > + * 11: 8 - PT address bit 35:32 > + * 7: 4 - PT address bit 39:36 > + * 3: 1 - Reserved > + * 0 - 1 if PT @ PT address is valid > + */ > +#define RK_DTE_PT_ADDRESS_MASK_V2 GENMASK_ULL(31, 4) > +#define DTE_HI_MASK1 GENMASK(11, 8) > +#define DTE_HI_MASK2 GENMASK(7, 4) > +#define DTE_HI_SHIFT124 /* shift bit 8 to bit 32 */ > +#define DTE_HI_SHIFT232 /* shift bit 4 to bit 36 */ > +#define PAGE_DESC_HI_MASK1 GENMASK_ULL(39, 36) > +#define PAGE_DESC_HI_MASK2 GENMASK_ULL(35, 32) > + > +static inline phys_addr_t rk_dte_pt_address_v2(u32 dte) > +{ > + u64 dte_v2 = dte; > + > + dte_v2 = ((dte_v2 & DTE_HI_MASK2) << DTE_HI_SHIFT2) | > + ((dte_v2 & DTE_HI_MASK1) << DTE_HI_SHIFT1) | > + (dte_v2 & RK_DTE_PT_ADDRESS_MASK); > + > + return (phys_addr_t)dte_v2; > +} > + > static inline bool rk_dte_is_pt_valid(u32 dte) > { > return dte & RK_DTE_PT_VALID; > @@ -199,6 +226,15 @@ static inline u32 rk_mk_dte(dma_addr_t pt_dma) > return (pt_dma & 
RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID; > } > > +static inline u32 rk_mk_dte_v2(dma_addr_t pt_dma) > +{ > + pt_dma = (pt_dma & RK_DTE_PT_ADDRESS_MASK) | > + ((pt_dma & PAGE_DESC_HI_MASK1) >> DTE_HI_SHIFT1) | > + (pt_dma & PAGE_DESC_HI_MASK2) >> DTE_HI_SHIFT2; > + > + return (pt_dma & RK_DTE_PT_ADDRESS_MASK_V2) | RK_DTE_PT_VALID; > +} > + > /* > * Each PTE has a Page address, some flags and a valid bit: > * +-+---+---+-+ > @@ -240,6 +276,29 @@ static u32 rk_mk_pte(phys_addr_t page, int prot) > return page | flags | RK_PTE_PAGE_VALID; > } > > +/* > + * In v2: > + * 31:12 - Page address bit 31:0 > + * 11:9 - Page address bit 34:32 > + * 8:4 - Page address bit 39:35 > + * 3 - Security > + * 2 - Readable > + * 1 - Writable > + * 0 - 1 if Page @ Page address is valid > + */ > +#define RK_PTE_PAGE_READABLE_V2 BIT(2) > +#define RK_PTE_PAGE_WRITABLE_V2 BIT(1) > + > +static u32 rk_mk_pte_v2(phys_addr_t page, int prot) > +{ > + u32 flags = 0; > + > + flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE_V2 : 0; > + flags |= (prot & IOMMU_WRITE) ? 
RK_PTE_PAGE_WRITABLE_V2 : 0; > + > + return rk_mk_dte_v2(page) | flags ; > +} > + > static u32 rk_mk_pte_invalid(u32 pte) > { > return pte & ~RK_PTE_PAGE_VALID; > @@ -480,6 +539,14 @@ static inline phys_addr_t rk_dte_addr_phys(phys_addr_t > addr) > return addr; > } > > +#define DT_HI_MASK GENMASK_ULL(39, 32) > +#define DT_SHIFT 28 > + > +static inline phys_addr_t rk_dte_addr_phys_v2(phys_addr_t addr) > +{ > + return (addr & RK_DTE_PT_ADDRESS_MASK) | ((addr & DT_HI_MASK) << > DT_SHIFT); > +} > + > static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) > { > void __iomem *base = iommu->bases[index]; > @@ -1305,10 +1372,21 @@ static struct rk_iommu_ops iommu_data_ops_v1 = { > .pt_address_mask = RK_DTE_PT_ADDRESS_MASK, > }; > > +static struct rk_iommu_ops iommu_data_ops_v2 = { > + .pt_address = _dte_pt_address_v2, > + .mk_dtentries = _mk_dte_v2, > + .mk_ptentries = _mk_pte_v2, > + .dte_addr_phys = _dte_addr_phys_v2, > + .pt_address_mask = RK_DTE_PT_ADDRESS_MASK_V2, > +}; > + > static const struct of_device_id rk_iommu_dt_ids[] = { > { .compatible = "rockchip,iommu", > .data = _data_ops_v1, > }, > + { .compatible = "rockchip,rk3568-iommu", > + .data = _data_ops_v2, > + }, > { /* sentinel */ } > }; > MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids); > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 3/4] iommu: rockchip: Add internal ops to handle variants
On Friday, 21 May 2021 at 10:36:36 CEST, Benjamin Gaignard wrote: > Add internal ops to be able to handle incoming variant v2. > The goal is to keep the overall structure of the framework but > to allow to add the evolution of this hardware block. > > The ops are global for a SoC because iommu domains are not > attached to a specific device if they are for a virtual device like > drm. Using a global variable shouldn't be a problem since a SoC usually > doesn't embed different versions of the iommu hardware block. > If that happens one day a WARN_ON will be displayed at probe time. > > Signed-off-by: Benjamin Gaignard Reviewed-by: Heiko Stuebner
[PATCH v5 3/4] iommu: rockchip: Add internal ops to handle variants
Add internal ops to be able to handle incoming variant v2. The goal is to keep the overall structure of the framework but to allow to add the evolution of this hardware block. The ops are global for a SoC because iommu domains are not attached to a specific devices if they are for a virtuel device like drm. Use a global variable shouldn't be since SoC usually doesn't embedded different versions of the iommu hardware block. If that happen one day a WARN_ON will be displayed at probe time. Signed-off-by: Benjamin Gaignard --- version 5: - Use of_device_get_match_data() - Add internal ops inside the driver drivers/iommu/rockchip-iommu.c | 69 -- 1 file changed, 50 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 7a2932772fdf..e7b9bcf174b1 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include #include @@ -96,6 +97,14 @@ static const char * const rk_iommu_clocks[] = { "aclk", "iface", }; +struct rk_iommu_ops { + phys_addr_t (*pt_address)(u32 dte); + u32 (*mk_dtentries)(dma_addr_t pt_dma); + u32 (*mk_ptentries)(phys_addr_t page, int prot); + phys_addr_t (*dte_addr_phys)(phys_addr_t addr); + u32 pt_address_mask; +}; + struct rk_iommu { struct device *dev; void __iomem **bases; @@ -116,6 +125,7 @@ struct rk_iommudata { }; static struct device *dma_dev; +static const struct rk_iommu_ops *rk_ops; static inline void rk_table_flush(struct rk_iommu_domain *dom, dma_addr_t dma, unsigned int count) @@ -215,11 +225,6 @@ static inline u32 rk_mk_dte(dma_addr_t pt_dma) #define RK_PTE_PAGE_READABLE BIT(1) #define RK_PTE_PAGE_VALID BIT(0) -static inline phys_addr_t rk_pte_page_address(u32 pte) -{ - return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK; -} - static inline bool rk_pte_is_page_valid(u32 pte) { return pte & RK_PTE_PAGE_VALID; @@ -451,7 +456,7 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) 
rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY); dte_addr = rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR); - if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) { + if (dte_addr != (DTE_ADDR_DUMMY & rk_ops->pt_address_mask)) { dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n"); return -EFAULT; } @@ -470,6 +475,11 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) return 0; } +static inline phys_addr_t rk_dte_addr_phys(phys_addr_t addr) +{ + return addr; +} + static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) { void __iomem *base = iommu->bases[index]; @@ -489,7 +499,7 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) page_offset = rk_iova_page_offset(iova); mmu_dte_addr = rk_iommu_read(base, RK_MMU_DTE_ADDR); - mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr; + mmu_dte_addr_phys = rk_ops->dte_addr_phys((phys_addr_t)mmu_dte_addr); dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index); dte_addr = phys_to_virt(dte_addr_phys); @@ -498,14 +508,14 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) if (!rk_dte_is_pt_valid(dte)) goto print_it; - pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4); + pte_addr_phys = rk_ops->pt_address(dte) + (pte_index * 4); pte_addr = phys_to_virt(pte_addr_phys); pte = *pte_addr; if (!rk_pte_is_page_valid(pte)) goto print_it; - page_addr_phys = rk_pte_page_address(pte) + page_offset; + page_addr_phys = rk_ops->pt_address(pte) + page_offset; page_flags = pte & RK_PTE_PAGE_FLAGS_MASK; print_it: @@ -601,13 +611,13 @@ static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain, if (!rk_dte_is_pt_valid(dte)) goto out; - pt_phys = rk_dte_pt_address(dte); + pt_phys = rk_ops->pt_address(dte); page_table = (u32 *)phys_to_virt(pt_phys); pte = page_table[rk_iova_pte_index(iova)]; if (!rk_pte_is_page_valid(pte)) goto out; - phys = rk_pte_page_address(pte) + rk_iova_page_offset(iova); + phys = 
rk_ops->pt_address(pte) + rk_iova_page_offset(iova); out: spin_unlock_irqrestore(_domain->dt_lock, flags); @@ -679,14 +689,14 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain, return ERR_PTR(-ENOMEM); } - dte = rk_mk_dte(pt_dma); + dte = rk_ops->mk_dtentries(pt_dma); *dte_addr = dte; rk_table_flush(rk_domain, pt_dma, NUM_PT_ENTRIES);
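The core of this patch is the `rk_iommu_ops` indirection: a per-variant table of function pointers selected once at probe time via `of_device_get_match_data()`, so the hot paths call `rk_ops->pt_address()` instead of a hard-coded v1 helper. The shape of that mechanism can be sketched in miniature as a standalone user-space program (simplified, hypothetical types — not the driver's actual code):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the kernel's phys_addr_t. */
typedef uint64_t phys_addr_t;

/* Per-variant ops table, as introduced by the patch. */
struct rk_iommu_ops {
	phys_addr_t (*pt_address)(uint32_t dte);
	uint32_t pt_address_mask;
};

/* v1: the page-table address occupies DTE bits 31:12. */
static phys_addr_t pt_address_v1(uint32_t dte)
{
	return dte & 0xfffff000u;
}

static const struct rk_iommu_ops iommu_data_ops_v1 = {
	.pt_address = pt_address_v1,
	.pt_address_mask = 0xfffff000u,
};

/* Miniature of_device_id-style match table: the compatible string
 * selects the ops variant, mimicking of_device_get_match_data(). */
struct match_entry {
	const char *compatible;
	const struct rk_iommu_ops *data;
};

static const struct match_entry match_table[] = {
	{ "rockchip,iommu", &iommu_data_ops_v1 },
	{ NULL, NULL },
};

static const struct rk_iommu_ops *get_match_data(const char *compatible)
{
	for (const struct match_entry *m = match_table; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;
	return NULL;
}
```

As in the patch, a second compatible would simply add one more `match_entry` pointing at a v2 ops table, leaving the lookup and dispatch code untouched.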
[PATCH v5 4/4] iommu: rockchip: Add support for iommu v2
This second version of the hardware block has a different bits mapping for page table entries. Add the ops matching to this new mapping. Define a new compatible to distinguish it from the first version. Signed-off-by: Benjamin Gaignard --- version 5: - Use internal ops to support v2 hardware block - Use GENMASK macro. - Keep rk_dte_pt_address() and rk_dte_pt_address_v2() separated because I believe that is more readable like this. - Do not duplicate code. drivers/iommu/rockchip-iommu.c | 78 ++ 1 file changed, 78 insertions(+) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index e7b9bcf174b1..23253a2f269e 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -189,6 +189,33 @@ static inline phys_addr_t rk_dte_pt_address(u32 dte) return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK; } +/* + * In v2: + * 31:12 - PT address bit 31:0 + * 11: 8 - PT address bit 35:32 + * 7: 4 - PT address bit 39:36 + * 3: 1 - Reserved + * 0 - 1 if PT @ PT address is valid + */ +#define RK_DTE_PT_ADDRESS_MASK_V2 GENMASK_ULL(31, 4) +#define DTE_HI_MASK1 GENMASK(11, 8) +#define DTE_HI_MASK2 GENMASK(7, 4) +#define DTE_HI_SHIFT1 24 /* shift bit 8 to bit 32 */ +#define DTE_HI_SHIFT2 32 /* shift bit 4 to bit 36 */ +#define PAGE_DESC_HI_MASK1 GENMASK_ULL(39, 36) +#define PAGE_DESC_HI_MASK2 GENMASK_ULL(35, 32) + +static inline phys_addr_t rk_dte_pt_address_v2(u32 dte) +{ + u64 dte_v2 = dte; + + dte_v2 = ((dte_v2 & DTE_HI_MASK2) << DTE_HI_SHIFT2) | +((dte_v2 & DTE_HI_MASK1) << DTE_HI_SHIFT1) | +(dte_v2 & RK_DTE_PT_ADDRESS_MASK); + + return (phys_addr_t)dte_v2; +} + static inline bool rk_dte_is_pt_valid(u32 dte) { return dte & RK_DTE_PT_VALID; @@ -199,6 +226,15 @@ static inline u32 rk_mk_dte(dma_addr_t pt_dma) return (pt_dma & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID; } +static inline u32 rk_mk_dte_v2(dma_addr_t pt_dma) +{ + pt_dma = (pt_dma & RK_DTE_PT_ADDRESS_MASK) | +((pt_dma & PAGE_DESC_HI_MASK1) >> DTE_HI_SHIFT1) | +(pt_dma & 
PAGE_DESC_HI_MASK2) >> DTE_HI_SHIFT2; + + return (pt_dma & RK_DTE_PT_ADDRESS_MASK_V2) | RK_DTE_PT_VALID; +} + /* * Each PTE has a Page address, some flags and a valid bit: * +-+---+---+-+ @@ -240,6 +276,29 @@ static u32 rk_mk_pte(phys_addr_t page, int prot) return page | flags | RK_PTE_PAGE_VALID; } +/* + * In v2: + * 31:12 - Page address bit 31:0 + * 11:9 - Page address bit 34:32 + * 8:4 - Page address bit 39:35 + * 3 - Security + * 2 - Readable + * 1 - Writable + * 0 - 1 if Page @ Page address is valid + */ +#define RK_PTE_PAGE_READABLE_V2 BIT(2) +#define RK_PTE_PAGE_WRITABLE_V2 BIT(1) + +static u32 rk_mk_pte_v2(phys_addr_t page, int prot) +{ + u32 flags = 0; + + flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE_V2 : 0; + flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE_V2 : 0; + + return rk_mk_dte_v2(page) | flags ; +} + static u32 rk_mk_pte_invalid(u32 pte) { return pte & ~RK_PTE_PAGE_VALID; @@ -480,6 +539,14 @@ static inline phys_addr_t rk_dte_addr_phys(phys_addr_t addr) return addr; } +#define DT_HI_MASK GENMASK_ULL(39, 32) +#define DT_SHIFT 28 + +static inline phys_addr_t rk_dte_addr_phys_v2(phys_addr_t addr) +{ + return (addr & RK_DTE_PT_ADDRESS_MASK) | ((addr & DT_HI_MASK) << DT_SHIFT); +} + static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) { void __iomem *base = iommu->bases[index]; @@ -1305,10 +1372,21 @@ static struct rk_iommu_ops iommu_data_ops_v1 = { .pt_address_mask = RK_DTE_PT_ADDRESS_MASK, }; +static struct rk_iommu_ops iommu_data_ops_v2 = { + .pt_address = _dte_pt_address_v2, + .mk_dtentries = _mk_dte_v2, + .mk_ptentries = _mk_pte_v2, + .dte_addr_phys = _dte_addr_phys_v2, + .pt_address_mask = RK_DTE_PT_ADDRESS_MASK_V2, +}; + static const struct of_device_id rk_iommu_dt_ids[] = { { .compatible = "rockchip,iommu", .data = _data_ops_v1, }, + { .compatible = "rockchip,rk3568-iommu", + .data = _data_ops_v2, + }, { /* sentinel */ } }; MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids); -- 2.25.1 ___ iommu mailing list 
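The v2 DTE layout documented in the comment above (low address bits 31:12 in place, address bits 35:32 in DTE bits 11:8, address bits 39:36 in DTE bits 7:4) can be checked in isolation with a small user-space round-trip sketch. The helper names below are hypothetical and the packing follows the layout comment, not the driver source verbatim:

```c
#include <assert.h>
#include <stdint.h>

/* v2 DTE layout from the comment in the patch:
 *  31:12 - PT address bits 31:12
 *  11:8  - PT address bits 35:32
 *   7:4  - PT address bits 39:36
 *     0  - valid
 */
#define RK_DTE_PT_VALID 0x1u

/* Pack a 40-bit, 4 KiB-aligned physical address into a v2 DTE. */
static uint32_t mk_dte_v2(uint64_t pt_dma)
{
	uint32_t dte = (uint32_t)(pt_dma & 0xfffff000ull);	/* bits 31:12 stay put */

	dte |= (uint32_t)((pt_dma >> 32) & 0xf) << 8;	/* addr bits 35:32 -> DTE 11:8 */
	dte |= (uint32_t)((pt_dma >> 36) & 0xf) << 4;	/* addr bits 39:36 -> DTE 7:4 */
	return dte | RK_DTE_PT_VALID;
}

/* Recover the 40-bit physical address from a v2 DTE. */
static uint64_t dte_pt_address_v2(uint32_t dte)
{
	return (uint64_t)(dte & 0xfffff000u) |
	       ((uint64_t)((dte >> 8) & 0xf) << 32) |
	       ((uint64_t)((dte >> 4) & 0xf) << 36);
}
```

A round trip of any 4 KiB-aligned address below 1 TiB (40 bits) must come back unchanged; that property is what distinguishes a correct packing from a mixed-up shift pair.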
[PATCH v5 1/4] dt-bindings: iommu: rockchip: Convert IOMMU to DT schema
Convert Rockchip IOMMU to DT schema Signed-off-by: Benjamin Gaignard Reviewed-by: Rob Herring --- .../bindings/iommu/rockchip,iommu.txt | 38 - .../bindings/iommu/rockchip,iommu.yaml| 80 +++ 2 files changed, 80 insertions(+), 38 deletions(-) delete mode 100644 Documentation/devicetree/bindings/iommu/rockchip,iommu.txt create mode 100644 Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml diff --git a/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt b/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt deleted file mode 100644 index 6ecefea1c6f9.. --- a/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt +++ /dev/null @@ -1,38 +0,0 @@ -Rockchip IOMMU -== - -A Rockchip DRM iommu translates io virtual addresses to physical addresses for -its master device. Each slave device is bound to a single master device, and -shares its clocks, power domain and irq. - -Required properties: -- compatible : Should be "rockchip,iommu" -- reg : Address space for the configuration registers -- interrupts : Interrupt specifier for the IOMMU instance -- interrupt-names : Interrupt name for the IOMMU instance -- #iommu-cells: Should be <0>. This indicates the iommu is a -"single-master" device, and needs no additional information -to associate with its master device. See: -Documentation/devicetree/bindings/iommu/iommu.txt -- clocks : A list of clocks required for the IOMMU to be accessible by -the host CPU. -- clock-names : Should contain the following: - "iface" - Main peripheral bus clock (PCLK/HCL) (required) - "aclk" - AXI bus clock (required) - -Optional properties: -- rockchip,disable-mmu-reset : Don't use the mmu reset operation. - Some mmu instances may produce unexpected results - when the reset operation is used. 
- -Example: - - vopl_mmu: iommu@ff940300 { - compatible = "rockchip,iommu"; - reg = <0xff940300 0x100>; - interrupts = ; - interrupt-names = "vopl_mmu"; - clocks = < ACLK_VOP1>, < HCLK_VOP1>; - clock-names = "aclk", "iface"; - #iommu-cells = <0>; - }; diff --git a/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml new file mode 100644 index ..099fc2578b54 --- /dev/null +++ b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml @@ -0,0 +1,80 @@ +# SPDX-License-Identifier: GPL-2.0-only +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/iommu/rockchip,iommu.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Rockchip IOMMU + +maintainers: + - Heiko Stuebner + +description: |+ + A Rockchip DRM iommu translates io virtual addresses to physical addresses for + its master device. Each slave device is bound to a single master device and + shares its clocks, power domain and irq. + + For information on assigning IOMMU controller to its peripheral devices, + see generic IOMMU bindings. + +properties: + compatible: +const: rockchip,iommu + + reg: +items: + - description: configuration registers for MMU instance 0 + - description: configuration registers for MMU instance 1 +minItems: 1 +maxItems: 2 + + interrupts: +items: + - description: interruption for MMU instance 0 + - description: interruption for MMU instance 1 +minItems: 1 +maxItems: 2 + + clocks: +items: + - description: Core clock + - description: Interface clock + + clock-names: +items: + - const: aclk + - const: iface + + "#iommu-cells": +const: 0 + + rockchip,disable-mmu-reset: +$ref: /schemas/types.yaml#/definitions/flag +description: | + Do not use the mmu reset operation. + Some mmu instances may produce unexpected results + when the reset operation is used. 
+ +required: + - compatible + - reg + - interrupts + - clocks + - clock-names + - "#iommu-cells" + +additionalProperties: false + +examples: + - | +#include +#include + +vopl_mmu: iommu@ff940300 { + compatible = "rockchip,iommu"; + reg = <0xff940300 0x100>; + interrupts = ; + clocks = < ACLK_VOP1>, < HCLK_VOP1>; + clock-names = "aclk", "iface"; + #iommu-cells = <0>; +}; -- 2.25.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 2/4] dt-bindings: iommu: rockchip: Add compatible for v2
Add compatible for the second version of IOMMU hardware block. RK356x IOMMU can also be linked to a power domain. Signed-off-by: Benjamin Gaignard Reviewed-by: Rob Herring --- .../devicetree/bindings/iommu/rockchip,iommu.yaml | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml index 099fc2578b54..d2e28a9e3545 100644 --- a/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml +++ b/Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml @@ -19,7 +19,9 @@ description: |+ properties: compatible: -const: rockchip,iommu +enum: + - rockchip,iommu + - rockchip,rk3568-iommu reg: items: @@ -48,6 +50,9 @@ properties: "#iommu-cells": const: 0 + power-domains: +maxItems: 1 + rockchip,disable-mmu-reset: $ref: /schemas/types.yaml#/definitions/flag description: | -- 2.25.1
[PATCH v5 0/4] Add IOMMU driver for rk356x
This series adds the IOMMU driver for rk356x SoC. Since a new compatible is needed to distinguish this second version of IOMMU hardware block from the first one, it is an opportunity to convert the binding to DT schema. version 5: - Add internal ops inside the driver to be able to add variants. - Add support of v2 variant. - Stop using 'version' field - Use GENMASK macro version 4: - Add description for reg items - Remove useless interrupt-names properties - Add description for interrupts items - Remove interrupt-names properties from DTS files version 3: - Rename compatible with soc prefix - Rebase on v5.12 tag version 2: - Fix iommu-cells typo in rk322x.dtsi - Change maintainer - Change reg maxItems - Add power-domains property Benjamin Gaignard (4): dt-bindings: iommu: rockchip: Convert IOMMU to DT schema dt-bindings: iommu: rockchip: Add compatible for v2 iommu: rockchip: Add internal ops to handle variants iommu: rockchip: Add support for iommu v2 .../bindings/iommu/rockchip,iommu.txt | 38 - .../bindings/iommu/rockchip,iommu.yaml| 85 ++ drivers/iommu/rockchip-iommu.c| 147 +++--- 3 files changed, 213 insertions(+), 57 deletions(-) delete mode 100644 Documentation/devicetree/bindings/iommu/rockchip,iommu.txt create mode 100644 Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml -- 2.25.1
Re: [RFC PATCH v3 8/8] vfio: Add nested IOPF support
On 2021/5/19 2:58, Alex Williamson wrote: > On Fri, 9 Apr 2021 11:44:20 +0800 > Shenming Lu wrote: > >> To set up nested mode, drivers such as vfio_pci need to register a >> handler to receive stage/level 1 faults from the IOMMU, but since >> currently each device can only have one iommu dev fault handler, >> and if stage 2 IOPF is already enabled (VFIO_IOMMU_ENABLE_IOPF), >> we choose to update the registered handler (a consolidated one) via >> flags (set FAULT_REPORT_NESTED_L1), and further deliver the received >> stage 1 faults in the handler to the guest through a newly added >> vfio_device_ops callback. > > Are there proposed in-kernel drivers that would use any of these > symbols? I hope that such as Eric's SMMUv3 Nested Stage Setup series [1] can use these symbols to consolidate the two page fault handlers into one. [1] https://patchwork.kernel.org/project/kvm/cover/2021044659.15051-1-eric.au...@redhat.com/ > >> Signed-off-by: Shenming Lu >> --- >> drivers/vfio/vfio.c | 81 + >> drivers/vfio/vfio_iommu_type1.c | 49 +++- >> include/linux/vfio.h| 12 + >> 3 files changed, 141 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c >> index 44c8dfabf7de..4245f15914bf 100644 >> --- a/drivers/vfio/vfio.c >> +++ b/drivers/vfio/vfio.c >> @@ -2356,6 +2356,87 @@ struct iommu_domain *vfio_group_iommu_domain(struct >> vfio_group *group) >> } >> EXPORT_SYMBOL_GPL(vfio_group_iommu_domain); >> >> +/* >> + * Register/Update the VFIO IOPF handler to receive >> + * nested stage/level 1 faults. 
>> + */ >> +int vfio_iommu_dev_fault_handler_register_nested(struct device *dev) >> +{ >> +struct vfio_container *container; >> +struct vfio_group *group; >> +struct vfio_iommu_driver *driver; >> +int ret; >> + >> +if (!dev) >> +return -EINVAL; >> + >> +group = vfio_group_get_from_dev(dev); >> +if (!group) >> +return -ENODEV; >> + >> +ret = vfio_group_add_container_user(group); >> +if (ret) >> +goto out; >> + >> +container = group->container; >> +driver = container->iommu_driver; >> +if (likely(driver && driver->ops->register_handler)) >> +ret = driver->ops->register_handler(container->iommu_data, dev); >> +else >> +ret = -ENOTTY; >> + >> +vfio_group_try_dissolve_container(group); >> + >> +out: >> +vfio_group_put(group); >> +return ret; >> +} >> +EXPORT_SYMBOL_GPL(vfio_iommu_dev_fault_handler_register_nested); >> + >> +int vfio_iommu_dev_fault_handler_unregister_nested(struct device *dev) >> +{ >> +struct vfio_container *container; >> +struct vfio_group *group; >> +struct vfio_iommu_driver *driver; >> +int ret; >> + >> +if (!dev) >> +return -EINVAL; >> + >> +group = vfio_group_get_from_dev(dev); >> +if (!group) >> +return -ENODEV; >> + >> +ret = vfio_group_add_container_user(group); >> +if (ret) >> +goto out; >> + >> +container = group->container; >> +driver = container->iommu_driver; >> +if (likely(driver && driver->ops->unregister_handler)) >> +ret = driver->ops->unregister_handler(container->iommu_data, >> dev); >> +else >> +ret = -ENOTTY; >> + >> +vfio_group_try_dissolve_container(group); >> + >> +out: >> +vfio_group_put(group); >> +return ret; >> +} >> +EXPORT_SYMBOL_GPL(vfio_iommu_dev_fault_handler_unregister_nested); >> + >> +int vfio_transfer_iommu_fault(struct device *dev, struct iommu_fault *fault) >> +{ >> +struct vfio_device *device = dev_get_drvdata(dev); >> + >> +if (unlikely(!device->ops->transfer)) >> +return -EOPNOTSUPP; >> + >> +return device->ops->transfer(device->device_data, fault); >> +} >> +EXPORT_SYMBOL_GPL(vfio_transfer_iommu_fault); >> + 
>> /** >> * Module/class support >> */ >> diff --git a/drivers/vfio/vfio_iommu_type1.c >> b/drivers/vfio/vfio_iommu_type1.c >> index ba2b5a1cf6e9..9d1adeddb303 100644 >> --- a/drivers/vfio/vfio_iommu_type1.c >> +++ b/drivers/vfio/vfio_iommu_type1.c >> @@ -3821,13 +3821,32 @@ static int vfio_iommu_type1_dma_map_iopf(struct >> iommu_fault *fault, void *data) >> struct vfio_batch batch; >> struct vfio_range *range; >> dma_addr_t iova = ALIGN_DOWN(fault->prm.addr, PAGE_SIZE); >> -int access_flags = 0; >> +int access_flags = 0, nested; >> size_t premap_len, map_len, mapped_len = 0; >> unsigned long bit_offset, vaddr, pfn, i, npages; >> int ret; >> enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID; >> struct iommu_page_response resp = {0}; >> >> +if (vfio_dev_domian_nested(dev, )) >> +return -ENODEV; >> + >> +/* >> + * When configured in nested mode, further deliver the >> + * stage/level 1 faults to the guest. >> + */ >> +if (nested) { >> +bool l2; >> + >> +
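Both `vfio_iommu_dev_fault_handler_register_nested()` and its unregister counterpart follow the same shape: resolve the container's IOMMU driver and dispatch to an optional backend callback, returning -ENOTTY when the backend does not implement it. That optional-callback pattern can be sketched on its own in user space (hypothetical, simplified types — not the VFIO code itself):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct iommu_driver_ops {
	/* Optional: a backend may leave this NULL. */
	int (*register_handler)(void *iommu_data, void *dev);
};

struct iommu_driver {
	const struct iommu_driver_ops *ops;
};

static int demo_register_handler(void *iommu_data, void *dev)
{
	(void)iommu_data;
	(void)dev;
	return 0;	/* pretend registration succeeded */
}

static const struct iommu_driver_ops demo_ops = {
	.register_handler = demo_register_handler,
};
static struct iommu_driver demo_driver = { .ops = &demo_ops };

static const struct iommu_driver_ops empty_ops = { NULL };
static struct iommu_driver empty_driver = { .ops = &empty_ops };

/* Dispatch to the optional callback; -ENOTTY when unimplemented,
 * matching the fallback in the quoted VFIO paths. */
static int dispatch_register(struct iommu_driver *driver,
			     void *iommu_data, void *dev)
{
	if (driver && driver->ops && driver->ops->register_handler)
		return driver->ops->register_handler(iommu_data, dev);
	return -ENOTTY;
}
```

The -ENOTTY fallback is what lets callers distinguish "this container's backend doesn't support nested fault reporting" from a genuine registration failure.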
Re: [PATCH] iommu/vt-d: Fix sysfs leak in alloc_domain()
Hi Joerg, On 5/21/21 2:53 PM, Rolf Eike Beer wrote: On Thursday, 22 April 2021 at 07:34:17 CEST, Lu Baolu wrote: Hi Rolf, On 4/22/21 1:39 PM, Rolf Eike Beer wrote: iommu_device_sysfs_add() is called before, so it has to be cleaned up on subsequent errors. Fixes: 39ab9555c2411 ("iommu: Add sysfs bindings for struct iommu_device") Cc: sta...@vger.kernel.org # 4.11.x Signed-off-by: Rolf Eike Beer Acked-by: Lu Baolu Ping. Who is going to pick this up, please? I forgot to put this fix in my last pull request. Could you please pick it up? Or, I can send another pull request next week. Best regards, baolu
Re: [PATCH] iommu/vt-d: Fix sysfs leak in alloc_domain()
On Thursday, 22 April 2021 at 07:34:17 CEST, Lu Baolu wrote: > Hi Rolf, > > On 4/22/21 1:39 PM, Rolf Eike Beer wrote: > > iommu_device_sysfs_add() is called before, so it has to be cleaned up on > > subsequent errors. > > > > Fixes: 39ab9555c2411 ("iommu: Add sysfs bindings for struct iommu_device") > > Cc: sta...@vger.kernel.org # 4.11.x > > Signed-off-by: Rolf Eike Beer > > Acked-by: Lu Baolu Ping. Who is going to pick this up, please? Eike -- Rolf Eike Beer, emlix GmbH, http://www.emlix.com Fon +49 551 30664-0, Fax +49 551 30664-11 Gothaer Platz 3, 37083 Göttingen, Germany Registered office: Göttingen, Amtsgericht Göttingen HR B 3160 Management: Heike Jordan, Dr. Uwe Kracke – VAT ID: DE 205 198 055 emlix - smart embedded open source
Re: [RFC PATCH v1 0/2] iommu/arm-smmu-v3: Add some parameter check in __arm_smmu_tlb_inv_range()
Hi Robin, On 2021/5/19 18:01, Robin Murphy wrote: On 2021-05-19 10:43, Kunkun Jiang wrote: Hi all, This set of patches solves some errors when I tested the SMMU nested mode. Test scenario description: guest kernel: 4KB translation granule host kernel: 16KB translation granule errors: 1. encountered an endless loop in __arm_smmu_tlb_inv_range because num_pages is 0 2. encountered CERROR_ILL because the fields of the TLB invalidation command are as follows: TG = 2, NUM = 0, SCALE = 0, TTL = 0. The combination is exactly the kind of reserved combination pointed out in the SMMUv3 spec (pages 143-144, version D.a). In my opinion, it is more appropriate to add a parameter check in __arm_smmu_tlb_inv_range(), although these problems only appeared when I tested the SMMU nested mode. What do you think? FWIW I think it would be better to fix the caller to not issue broken commands in the first place. The kernel shouldn't do so for itself (and definitely needs fixing if it ever does), so it sounds like the nesting implementation needs to do a bit more validation of what it's passing through. Thanks for your reply. I will report these errors to Eric and discuss how to fix them. Thanks, Kunkun Jiang Robin. This series includes the patches below: Patch 1: - align the invalid range with leaf page size upwards when smmu supports RIL Patch 2: - add a check to standardize granule size when smmu supports RIL Kunkun Jiang (2): iommu/arm-smmu-v3: Align invalid range with leaf page size upwards when support RIL iommu/arm-smmu-v3: Standardize granule size when support RIL drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 + 1 file changed, 9 insertions(+)
Re: [PATCH v15 07/12] iommu/smmuv3: Implement cache_invalidate
Hi Eric, On 2021/4/11 19:12, Eric Auger wrote: Implement domain-selective, pasid selective and page-selective IOTLB invalidations. Signed-off-by: Eric Auger --- v4 -> v15: - remove the redundant arm_smmu_cmdq_issue_sync(smmu) in IOMMU_INV_GRANU_ADDR case (Zenghui) - if RIL is not supported by the host, make sure the granule_size that is passed by the userspace is supported or fix it (Chenxiang) v13 -> v14: - Add domain invalidation - do global inval when asid is not provided with addr granularity v7 -> v8: - ASID based invalidation using iommu_inv_pasid_info - check ARCHID/PASID flags in addr based invalidation - use __arm_smmu_tlb_inv_context and __arm_smmu_tlb_inv_range_nosync v6 -> v7 - check the uapi version v3 -> v4: - adapt to changes in the uapi - add support for leaf parameter - do not use arm_smmu_tlb_inv_range_nosync or arm_smmu_tlb_inv_context anymore v2 -> v3: - replace __arm_smmu_tlb_sync by arm_smmu_cmdq_issue_sync v1 -> v2: - properly pass the asid --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 89 + 1 file changed, 89 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 56a301fbe75a..bfc112cc0d38 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2961,6 +2961,94 @@ static void arm_smmu_detach_pasid_table(struct iommu_domain *domain) mutex_unlock(_domain->init_mutex); } +static int +arm_smmu_cache_invalidate(struct iommu_domain *domain, struct device *dev, + struct iommu_cache_invalidate_info *inv_info) +{ + struct arm_smmu_cmdq_ent cmd = {.opcode = CMDQ_OP_TLBI_NSNH_ALL}; + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_device *smmu = smmu_domain->smmu; + + if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED) + return -EINVAL; + + if (!smmu) + return -EINVAL; + + if (inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1) + return -EINVAL; + + if (inv_info->cache & 
IOMMU_CACHE_INV_TYPE_PASID || + inv_info->cache & IOMMU_CACHE_INV_TYPE_DEV_IOTLB) { + return -ENOENT; + } + + if (!(inv_info->cache & IOMMU_CACHE_INV_TYPE_IOTLB)) + return -EINVAL; + + /* IOTLB invalidation */ + + switch (inv_info->granularity) { + case IOMMU_INV_GRANU_PASID: + { + struct iommu_inv_pasid_info *info = + _info->granu.pasid_info; + + if (info->flags & IOMMU_INV_ADDR_FLAGS_PASID) + return -ENOENT; + if (!(info->flags & IOMMU_INV_PASID_FLAGS_ARCHID)) + return -EINVAL; + + __arm_smmu_tlb_inv_context(smmu_domain, info->archid); + return 0; + } + case IOMMU_INV_GRANU_ADDR: + { + struct iommu_inv_addr_info *info = _info->granu.addr_info; + size_t granule_size = info->granule_size; + size_t size = info->nb_granules * info->granule_size; + bool leaf = info->flags & IOMMU_INV_ADDR_FLAGS_LEAF; + int tg; + + if (info->flags & IOMMU_INV_ADDR_FLAGS_PASID) + return -ENOENT; + + if (!(info->flags & IOMMU_INV_ADDR_FLAGS_ARCHID)) + break; + + tg = __ffs(granule_size); + if (granule_size & ~(1 << tg)) + return -EINVAL; + /* +* When RIL is not supported, make sure the granule size that is +* passed is supported. In RIL mode, this is enforced in +* __arm_smmu_tlb_inv_range() +*/ + if (!(smmu->features & ARM_SMMU_FEAT_RANGE_INV) && + !(granule_size & smmu_domain->domain.pgsize_bitmap)) { + tg = __ffs(smmu_domain->domain.pgsize_bitmap); + granule_size = 1 << tg; + size = size >> tg; + } + + arm_smmu_tlb_inv_range_domain(info->addr, size, + granule_size, leaf, + info->archid, smmu_domain); I encountered some errors when I tested the SMMU nested mode. Test scenario description: guest kernel: 4KB translation granule host kernel: 16KB translation granule errors: 1. encountered an endless loop in __arm_smmu_tlb_inv_range because num_pages is 0 2. encountered CERROR_ILL because the fields of TLB invalidation command are as follow: TG = 2, NUM = 0, SCALE = 0, TTL = 0. 
The combination is exactly the kind of reserved combination pointed out in the SMMUv3 spec (pages 143-144, version D.a). According to my analysis, we should do a bit more validation on the 'size' and 'granule_size' when SMMU supports RIL: 1.
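The checks in the quoted hunk boil down to two pure-integer operations: verify that the user-supplied granule is a single power of two, and, when the SMMU lacks range invalidation (RIL), fall back to the smallest page size in the domain's `pgsize_bitmap` (e.g. a 4 KiB guest granule against a 16 KiB host). Those can be sketched outside the kernel like this (hypothetical helper names; a sketch of the logic, not the SMMUv3 driver code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if granule has exactly one bit set, i.e. is a power of two.
 * Mirrors the "granule_size & ~(1 << tg)" rejection in the quoted code. */
static bool granule_is_pow2(uint64_t granule)
{
	return granule && (granule & (granule - 1)) == 0;
}

/* If the requested invalidation granule is not among the domain's
 * supported page sizes, fall back to the smallest supported one,
 * as the quoted code does when ARM_SMMU_FEAT_RANGE_INV is absent. */
static uint64_t fixup_granule(uint64_t granule, uint64_t pgsize_bitmap)
{
	if (granule & pgsize_bitmap)
		return granule;
	return pgsize_bitmap & -pgsize_bitmap;	/* lowest set bit */
}
```

With a 16 KiB-granule host, `pgsize_bitmap` has bit 14 set, so a 4 KiB (bit 12) request gets widened to 16 KiB rather than producing a reserved TG/NUM/SCALE/TTL combination.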
Re: [RFC PATCH v3 7/8] vfio/type1: Add selective DMA faulting support
On 2021/5/19 2:58, Alex Williamson wrote: > On Fri, 9 Apr 2021 11:44:19 +0800 > Shenming Lu wrote: > >> Some devices only allow selective DMA faulting. Similar to the selective >> dirty page tracking, the vendor driver can call vfio_pin_pages() to >> indicate the non-faultable scope, we add a new struct vfio_range to >> record it, then when the IOPF handler receives any page request out >> of the scope, we can directly return with an invalid response. > > Seems like this highlights a deficiency in the design, that the user > can't specify mappings as iopf enabled or disabled. Also, if the > vendor driver has pinned pages within the range, shouldn't that prevent > them from faulting in the first place? Why do we need yet more > tracking structures? Pages pinned by the vendor driver need to count > against the user's locked memory limits regardless of iopf. Thanks, Currently we only have a vfio_pfn struct to track the external pinned pages (single page granularity), so I add a vfio_range struct for efficient lookup. Yeah, by this patch, for the non-pinned scope, we can directly return INVALID, but for the pinned (non-faultable) scope, tracking the pinned range doesn't seem to help more... Thanks, Shenming > > Alex
Re: [RFC PATCH v3 6/8] vfio/type1: No need to statically pin and map if IOPF enabled
On 2021/5/19 2:58, Alex Williamson wrote: > On Fri, 9 Apr 2021 11:44:18 +0800 > Shenming Lu wrote: > >> If IOPF enabled for the VFIO container, there is no need to statically >> pin and map the entire DMA range, we can do it on demand. And unmap >> according to the IOPF mapped bitmap when removing vfio_dma. >> >> Note that we still mark all pages dirty even if IOPF enabled, we may >> add IOPF-based fine grained dirty tracking support in the future. >> >> Signed-off-by: Shenming Lu >> --- >> drivers/vfio/vfio_iommu_type1.c | 38 +++-- >> 1 file changed, 32 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/vfio/vfio_iommu_type1.c >> b/drivers/vfio/vfio_iommu_type1.c >> index 7df5711e743a..dcc93c3b258c 100644 >> --- a/drivers/vfio/vfio_iommu_type1.c >> +++ b/drivers/vfio/vfio_iommu_type1.c >> @@ -175,6 +175,7 @@ struct vfio_iopf_group { >> #define IOPF_MAPPED_BITMAP_GET(dma, i) \ >>((dma->iopf_mapped_bitmap[(i) / BITS_PER_LONG] >> \ >> >> ((i) % BITS_PER_LONG)) & 0x1) >> +#define IOPF_MAPPED_BITMAP_BYTES(n) DIRTY_BITMAP_BYTES(n) >> >> #define WAITED 1 >> >> @@ -959,7 +960,8 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data, >> * already pinned and accounted. Accouting should be done if there is no >> * iommu capable domain in the container. >> */ >> -do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu); >> +do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) || >> +iommu->iopf_enabled; >> >> for (i = 0; i < npage; i++) { >> struct vfio_pfn *vpfn; >> @@ -1048,7 +1050,8 @@ static int vfio_iommu_type1_unpin_pages(void >> *iommu_data, >> >> mutex_lock(>lock); >> >> -do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu); >> +do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) || >> +iommu->iopf_enabled; > > pin/unpin are actually still pinning pages, why does iopf exempt them > from accounting? If iopf_enabled is true, do_accounting will be true too, we will account the external pinned pages? 
> > >> for (i = 0; i < npage; i++) { >> struct vfio_dma *dma; >> dma_addr_t iova; >> @@ -1169,7 +1172,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, >> struct vfio_dma *dma, >> if (!dma->size) >> return 0; >> >> -if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) >> +if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) || iommu->iopf_enabled) >> return 0; >> >> /* >> @@ -1306,11 +1309,20 @@ static void vfio_unmap_partial_iopf(struct >> vfio_iommu *iommu, >> } >> } >> >> +static void vfio_dma_clean_iopf(struct vfio_iommu *iommu, struct vfio_dma >> *dma) >> +{ >> +vfio_unmap_partial_iopf(iommu, dma, dma->iova, dma->iova + dma->size); >> + >> +kfree(dma->iopf_mapped_bitmap); >> +} >> + >> static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma) >> { >> WARN_ON(!RB_EMPTY_ROOT(>pfn_list)); >> vfio_unmap_unpin(iommu, dma, true); >> vfio_unlink_dma(iommu, dma); >> +if (iommu->iopf_enabled) >> +vfio_dma_clean_iopf(iommu, dma); >> put_task_struct(dma->task); >> vfio_dma_bitmap_free(dma); >> if (dma->vaddr_invalid) { >> @@ -1359,7 +1371,8 @@ static int update_user_bitmap(u64 __user *bitmap, >> struct vfio_iommu *iommu, >> * mark all pages dirty if any IOMMU capable device is not able >> * to report dirty pages and all pages are pinned and mapped. >> */ >> -if (iommu->num_non_pinned_groups && dma->iommu_mapped) >> +if (iommu->num_non_pinned_groups && >> +(dma->iommu_mapped || iommu->iopf_enabled)) >> bitmap_set(dma->bitmap, 0, nbits); > > This seems like really poor integration of iopf into dirty page > tracking. I'd expect dirty logging to flush the mapped pages and > write faults to mark pages dirty. Shouldn't the fault handler also > provide only the access faulted, so for example a read fault wouldn't > mark the page dirty? I just want to keep the behavior here as before, if IOPF enabled, we will still mark all pages dirty. 
We can distinguish between write and read faults in the fault handler, so there is a way to add IOPF-based fine grained dirty tracking support... But I am not sure whether there is a need to implement this, we can consider this in the future? > >> >> if (shift) { >> @@ -1772,6 +1785,16 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, >> goto out_unlock; >> } >> >> +if (iommu->iopf_enabled) { >> +dma->iopf_mapped_bitmap = kvzalloc(IOPF_MAPPED_BITMAP_BYTES( >> +size >> PAGE_SHIFT), >> GFP_KERNEL); >> +if (!dma->iopf_mapped_bitmap) { >> +ret = -ENOMEM; >> +kfree(dma); >> +goto out_unlock; >> +
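The bitmap sizing above (IOPF_MAPPED_BITMAP_BYTES reusing DIRTY_BITMAP_BYTES) boils down to rounding a page count up to whole unsigned longs, since the bitmap is accessed a long at a time. A user-space sketch of that arithmetic (bitmap_bytes is a hypothetical helper; the exact kernel macro may round differently):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

/* Bytes needed for a bitmap of nbits bits, rounded up to whole longs. */
static size_t bitmap_bytes(size_t nbits)
{
    return ((nbits + BITS_PER_LONG - 1) / BITS_PER_LONG)
           * sizeof(unsigned long);
}
```

For a vfio_dma of size bytes this would be called as bitmap_bytes(size >> PAGE_SHIFT), mirroring the allocation in vfio_dma_do_map().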
Re: [RFC PATCH v3 2/8] vfio/type1: Add a page fault handler
On 2021/5/19 2:58, Alex Williamson wrote: > On Fri, 9 Apr 2021 11:44:14 +0800 > Shenming Lu wrote: > >> VFIO manages the DMA mapping itself. To support IOPF (on-demand paging) >> for VFIO (IOMMU capable) devices, we add a VFIO page fault handler to >> serve the reported page faults from the IOMMU driver. >> >> Signed-off-by: Shenming Lu >> --- >> drivers/vfio/vfio_iommu_type1.c | 114 >> 1 file changed, 114 insertions(+) >> >> diff --git a/drivers/vfio/vfio_iommu_type1.c >> b/drivers/vfio/vfio_iommu_type1.c >> index 45cbfd4879a5..ab0ff60ee207 100644 >> --- a/drivers/vfio/vfio_iommu_type1.c >> +++ b/drivers/vfio/vfio_iommu_type1.c >> @@ -101,6 +101,7 @@ struct vfio_dma { >> struct task_struct *task; >> struct rb_root pfn_list; /* Ex-user pinned pfn list */ >> unsigned long *bitmap; >> +unsigned long *iopf_mapped_bitmap; >> }; >> >> struct vfio_batch { >> @@ -141,6 +142,16 @@ struct vfio_regions { >> size_t len; >> }; >> >> +/* A global IOPF enabled group list */ >> +static struct rb_root iopf_group_list = RB_ROOT; >> +static DEFINE_MUTEX(iopf_group_list_lock); >> + >> +struct vfio_iopf_group { >> +struct rb_node node; >> +struct iommu_group *iommu_group; >> +struct vfio_iommu *iommu; >> +}; >> + >> #define IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) \ >> (!list_empty(>domain_list)) >> >> @@ -157,6 +168,10 @@ struct vfio_regions { >> #define DIRTY_BITMAP_PAGES_MAX ((u64)INT_MAX) >> #define DIRTY_BITMAP_SIZE_MAX >> DIRTY_BITMAP_BYTES(DIRTY_BITMAP_PAGES_MAX) >> >> +#define IOPF_MAPPED_BITMAP_GET(dma, i) \ >> + ((dma->iopf_mapped_bitmap[(i) / BITS_PER_LONG] >> \ >> + >> ((i) % BITS_PER_LONG)) & 0x1) > > > Can't we just use test_bit()? Yeah, we can use it. 
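As a quick illustration of the test_bit() suggestion, a user-space sketch (my_test_bit is a plain-C stand-in for the kernel helper, since linux/bitmap.h is not available here) showing the open-coded macro and the helper agree:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

/* Open-coded form, shaped like the macro in the patch. */
#define IOPF_MAPPED_BITMAP_GET(map, i) \
    ((map[(i) / BITS_PER_LONG] >> ((i) % BITS_PER_LONG)) & 0x1)

/* test_bit()-style helper: user-space model of the kernel API. */
static bool my_test_bit(size_t nr, const unsigned long *addr)
{
    return (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}
```

Using the named helper also documents intent and keeps the bit arithmetic in one place instead of re-deriving it per call site.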
> > >> + >> #define WAITED 1 >> >> static int put_pfn(unsigned long pfn, int prot); >> @@ -416,6 +431,34 @@ static int vfio_iova_put_vfio_pfn(struct vfio_dma *dma, >> struct vfio_pfn *vpfn) >> return ret; >> } >> >> +/* >> + * Helper functions for iopf_group_list >> + */ >> +static struct vfio_iopf_group * >> +vfio_find_iopf_group(struct iommu_group *iommu_group) >> +{ >> +struct vfio_iopf_group *iopf_group; >> +struct rb_node *node; >> + >> +mutex_lock(_group_list_lock); >> + >> +node = iopf_group_list.rb_node; >> + >> +while (node) { >> +iopf_group = rb_entry(node, struct vfio_iopf_group, node); >> + >> +if (iommu_group < iopf_group->iommu_group) >> +node = node->rb_left; >> +else if (iommu_group > iopf_group->iommu_group) >> +node = node->rb_right; >> +else >> +break; >> +} >> + >> +mutex_unlock(_group_list_lock); >> +return node ? iopf_group : NULL; >> +} > > This looks like a pretty heavy weight operation per DMA fault. > > I'm also suspicious of this validity of this iopf_group after we've > dropped the locking, the ordering of patches makes this very confusing. My thought was to include the handling of DMA faults completely in the type1 backend by introducing the vfio_iopf_group struct. But it seems that introducing a struct with an unknown lifecycle causes more problems... I will use the path from vfio-core as in the v2 for simplicity and validity. Sorry for the confusing, I will reconstruct the series later. 
:-) > >> + >> static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async) >> { >> struct mm_struct *mm; >> @@ -3106,6 +3149,77 @@ static int vfio_iommu_type1_dirty_pages(struct >> vfio_iommu *iommu, >> return -EINVAL; >> } >> >> +/* VFIO I/O Page Fault handler */ >> +static int vfio_iommu_type1_dma_map_iopf(struct iommu_fault *fault, void >> *data) > >>From the comment, this seems like the IOMMU fault handler (the > construction of this series makes this difficult to follow) and > eventually it handles more than DMA mapping, for example transferring > faults to the device driver. "dma_map_iopf" seems like a poorly scoped > name. Maybe just call it dev_fault_handler? > >> +{ >> +struct device *dev = (struct device *)data; >> +struct iommu_group *iommu_group; >> +struct vfio_iopf_group *iopf_group; >> +struct vfio_iommu *iommu; >> +struct vfio_dma *dma; >> +dma_addr_t iova = ALIGN_DOWN(fault->prm.addr, PAGE_SIZE); >> +int access_flags = 0; >> +unsigned long bit_offset, vaddr, pfn; >> +int ret; >> +enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID; >> +struct iommu_page_response resp = {0}; >> + >> +if (fault->type != IOMMU_FAULT_PAGE_REQ) >> +return -EOPNOTSUPP; >> + >> +iommu_group = iommu_group_get(dev); >> +if (!iommu_group) >> +return -ENODEV; >> + >> +iopf_group = vfio_find_iopf_group(iommu_group);
Re: [RFC PATCH v3 5/8] vfio/type1: VFIO_IOMMU_ENABLE_IOPF
On 2021/5/19 2:58, Alex Williamson wrote: > On Fri, 9 Apr 2021 11:44:17 +0800 > Shenming Lu wrote: > >> Since enabling IOPF for devices may lead to a slow ramp up of performance, >> we add an ioctl VFIO_IOMMU_ENABLE_IOPF to make it configurable. And the >> IOPF enabling of a VFIO device includes setting IOMMU_DEV_FEAT_IOPF and >> registering the VFIO IOPF handler. >> >> Note that VFIO_IOMMU_DISABLE_IOPF is not supported since there may be >> inflight page faults when disabling. >> >> Signed-off-by: Shenming Lu >> --- >> drivers/vfio/vfio_iommu_type1.c | 223 +++- >> include/uapi/linux/vfio.h | 6 + >> 2 files changed, 226 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/vfio/vfio_iommu_type1.c >> b/drivers/vfio/vfio_iommu_type1.c >> index 01e296c6dc9e..7df5711e743a 100644 >> --- a/drivers/vfio/vfio_iommu_type1.c >> +++ b/drivers/vfio/vfio_iommu_type1.c >> @@ -71,6 +71,7 @@ struct vfio_iommu { >> struct rb_root dma_list; >> struct blocking_notifier_head notifier; >> struct mmu_notifier mn; >> +struct mm_struct*mm; > > We currently have no requirement that a single mm is used for all > container mappings. Does enabling IOPF impose such a requirement? > Shouldn't MAP/UNMAP enforce that? Did you mean that there is a possibility that each vfio_dma in a container has a different mm? If so, could we register one MMU notifier for each vfio_dma in the MAP ioctl? > >> unsigned intdma_avail; >> unsigned intvaddr_invalid_count; >> uint64_tpgsize_bitmap; >> @@ -81,6 +82,7 @@ struct vfio_iommu { >> booldirty_page_tracking; >> boolpinned_page_dirty_scope; >> boolcontainer_open; >> +booliopf_enabled; >> }; >> >> struct vfio_domain { >> @@ -461,6 +463,38 @@ vfio_find_iopf_group(struct iommu_group *iommu_group) >> return node ? 
iopf_group : NULL; >> } >> >> +static void vfio_link_iopf_group(struct vfio_iopf_group *new) >> +{ >> +struct rb_node **link, *parent = NULL; >> +struct vfio_iopf_group *iopf_group; >> + >> +mutex_lock(_group_list_lock); >> + >> +link = _group_list.rb_node; >> + >> +while (*link) { >> +parent = *link; >> +iopf_group = rb_entry(parent, struct vfio_iopf_group, node); >> + >> +if (new->iommu_group < iopf_group->iommu_group) >> +link = &(*link)->rb_left; >> +else >> +link = &(*link)->rb_right; >> +} >> + >> +rb_link_node(>node, parent, link); >> +rb_insert_color(>node, _group_list); >> + >> +mutex_unlock(_group_list_lock); >> +} >> + >> +static void vfio_unlink_iopf_group(struct vfio_iopf_group *old) >> +{ >> +mutex_lock(_group_list_lock); >> +rb_erase(>node, _group_list); >> +mutex_unlock(_group_list_lock); >> +} >> + >> static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async) >> { >> struct mm_struct *mm; >> @@ -2363,6 +2397,68 @@ static void vfio_iommu_iova_insert_copy(struct >> vfio_iommu *iommu, >> list_splice_tail(iova_copy, iova); >> } >> >> +static int vfio_dev_domian_nested(struct device *dev, int *nested) > > > s/domian/domain/ > > >> +{ >> +struct iommu_domain *domain; >> + >> +domain = iommu_get_domain_for_dev(dev); >> +if (!domain) >> +return -ENODEV; >> + >> +return iommu_domain_get_attr(domain, DOMAIN_ATTR_NESTING, nested); > > > Wouldn't this be easier to use if it returned bool, false on -errno? Make sense. 
> > >> +} >> + >> +static int vfio_iommu_type1_dma_map_iopf(struct iommu_fault *fault, void >> *data); >> + >> +static int dev_enable_iopf(struct device *dev, void *data) >> +{ >> +int *enabled_dev_cnt = data; >> +int nested; >> +u32 flags; >> +int ret; >> + >> +ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_IOPF); >> +if (ret) >> +return ret; >> + >> +ret = vfio_dev_domian_nested(dev, ); >> +if (ret) >> +goto out_disable; >> + >> +if (nested) >> +flags = FAULT_REPORT_NESTED_L2; >> +else >> +flags = FAULT_REPORT_FLAT; >> + >> +ret = iommu_register_device_fault_handler(dev, >> +vfio_iommu_type1_dma_map_iopf, flags, dev); >> +if (ret) >> +goto out_disable; >> + >> +(*enabled_dev_cnt)++; >> +return 0; >> + >> +out_disable: >> +iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_IOPF); >> +return ret; >> +} >> + >> +static int dev_disable_iopf(struct device *dev, void *data) >> +{ >> +int *enabled_dev_cnt = data; >> + >> +if (enabled_dev_cnt && *enabled_dev_cnt <= 0) >> +return -1; > > > Use an errno.> > >> + >> +WARN_ON(iommu_unregister_device_fault_handler(dev)); >> +WARN_ON(iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_IOPF)); >> + >> +if
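The dev_enable_iopf() path above follows the usual unwind idiom: each step that succeeds must be rolled back if a later step fails. A minimal user-space sketch of that shape (step1/step2/undo_step1 are placeholders, not real IOMMU APIs):

```c
#include <assert.h>

/* Placeholder steps standing in for iommu_dev_enable_feature() and
 * iommu_register_device_fault_handler(); each returns 0 on success. */
static int step1_ok, step2_ok;   /* knobs for exercising both paths */
static int undo1_calls;

static int step1(void)       { return step1_ok ? 0 : -1; }
static int step2(void)       { return step2_ok ? 0 : -1; }
static void undo_step1(void) { undo1_calls++; }

static int enable_with_unwind(int *enabled_cnt)
{
    int ret = step1();
    if (ret)
        return ret;          /* nothing to undo yet */

    ret = step2();
    if (ret)
        goto out_undo1;      /* later step failed: roll back step1 */

    (*enabled_cnt)++;        /* fully enabled: count it, as the patch does */
    return 0;

out_undo1:
    undo_step1();
    return ret;
}
```

The counter mirrors enabled_dev_cnt in the patch: it only advances once every step has succeeded, so the matching disable path knows exactly how many devices to unwind.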
Re: [RFC PATCH v3 4/8] vfio/type1: Pre-map more pages than requested in the IOPF handling
On 2021/5/19 2:58, Alex Williamson wrote: > On Fri, 9 Apr 2021 11:44:16 +0800 > Shenming Lu wrote: > >> To optimize for fewer page fault handlings, we can pre-map more pages >> than requested at once. >> >> Note that IOPF_PREMAP_LEN is just an arbitrary value for now, which we >> could try further tuning. > > I'd prefer that the series introduced full end-to-end functionality > before trying to improve performance. The pre-map value seems > arbitrary here and as noted in the previous patch, the IOMMU API does > not guarantee unmaps of ranges smaller than the original mapping. This > would need to map with single page granularity in order to guarantee > page granularity at the mmu notifier when the IOMMU supports > superpages. Thanks, Yeah, I will drop this patch in the current stage. Thanks, Shenming > > Alex > > .
Re: [RFC PATCH v3 3/8] vfio/type1: Add an MMU notifier to avoid pinning
On 2021/5/19 2:58, Alex Williamson wrote: > On Fri, 9 Apr 2021 11:44:15 +0800 > Shenming Lu wrote: > >> To avoid pinning pages when they are mapped in IOMMU page tables, we >> add an MMU notifier to tell the addresses which are no longer valid >> and try to unmap them. >> >> Signed-off-by: Shenming Lu >> --- >> drivers/vfio/vfio_iommu_type1.c | 112 +++- >> 1 file changed, 109 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/vfio/vfio_iommu_type1.c >> b/drivers/vfio/vfio_iommu_type1.c >> index ab0ff60ee207..1cb9d1f2717b 100644 >> --- a/drivers/vfio/vfio_iommu_type1.c >> +++ b/drivers/vfio/vfio_iommu_type1.c >> @@ -40,6 +40,7 @@ >> #include >> #include >> #include >> +#include >> >> #define DRIVER_VERSION "0.2" >> #define DRIVER_AUTHOR "Alex Williamson " >> @@ -69,6 +70,7 @@ struct vfio_iommu { >> struct mutexlock; >> struct rb_root dma_list; >> struct blocking_notifier_head notifier; >> +struct mmu_notifier mn; >> unsigned intdma_avail; >> unsigned intvaddr_invalid_count; >> uint64_tpgsize_bitmap; >> @@ -1204,6 +1206,72 @@ static long vfio_unmap_unpin(struct vfio_iommu >> *iommu, struct vfio_dma *dma, >> return unlocked; >> } >> >> +/* Unmap the IOPF mapped pages in the specified range. */ >> +static void vfio_unmap_partial_iopf(struct vfio_iommu *iommu, >> +struct vfio_dma *dma, >> +dma_addr_t start, dma_addr_t end) >> +{ >> +struct iommu_iotlb_gather *gathers; >> +struct vfio_domain *d; >> +int i, num_domains = 0; >> + >> +list_for_each_entry(d, >domain_list, next) >> +num_domains++; >> + >> +gathers = kzalloc(sizeof(*gathers) * num_domains, GFP_KERNEL); >> +if (gathers) { >> +for (i = 0; i < num_domains; i++) >> +iommu_iotlb_gather_init([i]); >> +} > > > If we're always serialized on iommu->lock, would it make sense to have > gathers pre-allocated on the vfio_iommu object? Yeah, we can do it. 
> >> + >> +while (start < end) { >> +unsigned long bit_offset; >> +size_t len; >> + >> +bit_offset = (start - dma->iova) >> PAGE_SHIFT; >> + >> +for (len = 0; start + len < end; len += PAGE_SIZE) { >> +if (!IOPF_MAPPED_BITMAP_GET(dma, >> +bit_offset + (len >> PAGE_SHIFT))) >> +break; > > > There are bitmap helpers for this, find_first_bit(), > find_next_zero_bit(). Thanks for the hint. :-) > > >> +} >> + >> +if (len) { >> +i = 0; >> +list_for_each_entry(d, >domain_list, next) { >> +size_t unmapped; >> + >> +if (gathers) >> +unmapped = iommu_unmap_fast(d->domain, >> +start, len, >> + >> [i++]); >> +else >> +unmapped = iommu_unmap(d->domain, >> + start, len); >> + >> +if (WARN_ON(unmapped != len)) > > The IOMMU API does not guarantee arbitrary unmap sizes, this will > trigger and this exit path is wrong. If we've already unmapped the > IOMMU, shouldn't we proceed with @unmapped rather than @len so the > device can re-fault the extra mappings? Otherwise the IOMMU state > doesn't match the iopf bitmap state. OK, I will correct it. And can we assume that the @unmapped values (returned from iommu_unmap) of all domains are the same (since all domains share the same iopf_mapped_bitmap)? 
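The per-page loop above can indeed use the bitmap helpers Alex points to. A user-space sketch (my_find_next_zero_bit is a simplified stand-in for the kernel's find_next_zero_bit) of computing the length of the contiguous mapped run starting at bit_offset in one call:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

/* Simplified stand-in for the kernel's find_next_zero_bit(). */
static size_t my_find_next_zero_bit(const unsigned long *map, size_t size,
                                    size_t offset)
{
    size_t i;

    for (i = offset; i < size; i++)
        if (!((map[i / BITS_PER_LONG] >> (i % BITS_PER_LONG)) & 1UL))
            return i;
    return size;
}

/* Number of consecutive mapped (set) bits starting at bit_offset,
 * i.e. the run length the unmap loop above derives one page at a time. */
static size_t mapped_run_len(const unsigned long *map, size_t nbits,
                             size_t bit_offset)
{
    return my_find_next_zero_bit(map, nbits, bit_offset) - bit_offset;
}
```

The byte length to unmap is then simply mapped_run_len(...) << PAGE_SHIFT, replacing the inner for-loop over IOPF_MAPPED_BITMAP_GET.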
> >> +goto out; >> +} >> + >> +bitmap_clear(dma->iopf_mapped_bitmap, >> + bit_offset, len >> PAGE_SHIFT); >> + >> +cond_resched(); >> +} >> + >> +start += (len + PAGE_SIZE); >> +} >> + >> +out: >> +if (gathers) { >> +i = 0; >> +list_for_each_entry(d, >domain_list, next) >> +iommu_iotlb_sync(d->domain, [i++]); >> + >> +kfree(gathers); >> +} >> +} >> + >> static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma) >> { >> WARN_ON(!RB_EMPTY_ROOT(>pfn_list)); >> @@ -3197,17 +3265,18 @@ static int vfio_iommu_type1_dma_map_iopf(struct >> iommu_fault *fault, void *data) >> >> vaddr = iova - dma->iova + dma->vaddr; >> >> -if (vfio_pin_page_external(dma, vaddr, , true)) >> +if (vfio_pin_page_external(dma, vaddr, , false)) >> goto out_invalid; >> >> if (vfio_iommu_map(iommu, iova, pfn, 1, dma->prot)) {
Re: [RFC PATCH v3 1/8] iommu: Evolve the device fault reporting framework
On 2021/5/19 2:58, Alex Williamson wrote: > On Fri, 9 Apr 2021 11:44:13 +0800 > Shenming Lu wrote: > >> This patch follows the discussion here: >> >> https://lore.kernel.org/linux-acpi/YAaxjmJW+ZMvrhac@myrica/ >> >> Besides SVA/vSVA, such as VFIO may also enable (2nd level) IOPF to remove >> pinning restriction. In order to better support more scenarios of using >> device faults, we extend iommu_register_fault_handler() with flags and >> introduce FAULT_REPORT_ to describe the device fault reporting capability >> under a specific configuration. >> >> Note that we don't further distinguish recoverable and unrecoverable faults >> by flags in the fault reporting cap, having PAGE_FAULT_REPORT_ + >> UNRECOV_FAULT_REPORT_ seems not a clean way. >> >> In addition, still take VFIO as an example, in nested mode, the 1st level >> and 2nd level fault reporting may be configured separately and currently >> each device can only register one iommu dev fault handler, so we add a >> handler update interface for this. > > > IIUC, you're introducing flags for the fault handler callout, which > essentially allows the IOMMU layer to filter which types of faults the > handler can receive. You then need an update function to modify those > flags. Why can't the handler itself perform this filtering? For > instance in your vfio example, the handler registered by the type1 > backend could itself return fault until the fault transfer path to the > device driver is established. Thanks, As discussed in [1]: In nested IOPF, we have to figure out whether a fault comes from L1 or L2. A SMMU stall event comes with this information, but a PRI page request doesn't. The IOMMU driver can walk the page tables to find out the level information. If the walk terminates at L1, further inject to the guest. Otherwise fix the fault at L2 in VFIO. It's not efficient compared to hardware-provided info. 
And in the pinned case, if VFIO can tell the IOMMU driver that no L2 fault is expected, there is no need to walk the page tables in the IOMMU driver? [1] https://patchwork.kernel.org/project/linux-arm-kernel/patch/20210108145217.2254447-4-jean-phili...@linaro.org/ Thanks, Shenming > > Alex > > .
Re: [RFC PATCH v3 0/8] Add IOPF support for VFIO passthrough
Hi Alex, On 2021/5/19 2:57, Alex Williamson wrote: > On Fri, 9 Apr 2021 11:44:12 +0800 > Shenming Lu wrote: > >> Hi, >> >> Requesting for your comments and suggestions. :-) >> >> The static pinning and mapping problem in VFIO and possible solutions >> have been discussed a lot [1, 2]. One of the solutions is to add I/O >> Page Fault support for VFIO devices. Different from those relatively >> complicated software approaches such as presenting a vIOMMU that provides >> the DMA buffer information (might include para-virtualized optimizations), >> IOPF mainly depends on the hardware faulting capability, such as the PCIe >> PRI extension or Arm SMMU stall model. What's more, the IOPF support in >> the IOMMU driver has already been implemented in SVA [3]. So we add IOPF >> support for VFIO passthrough based on the IOPF part of SVA in this series. > > The SVA proposals are being reworked to make use of a new IOASID > object, it's not clear to me that this shouldn't also make use of that > work as it does a significant expansion of the type1 IOMMU with fault > handlers that would duplicate new work using that new model. It seems that the IOASID extension for guest SVA would not affect this series, will we do any host-guest IOASID translation in the VFIO fault handler? > >> We have measured its performance with UADK [4] (passthrough an accelerator >> to a VM(1U16G)) on Hisilicon Kunpeng920 board (and compared with host SVA): >> >> Run hisi_sec_test... 
>> - with varying sending times and message lengths >> - with/without IOPF enabled (speed slowdown) >> >> when msg_len = 1MB (and PREMAP_LEN (in Patch 4) = 1): >> slowdown (num of faults) >> times VFIO IOPF host SVA >> 1 63.4% (518) 82.8% (512) >> 100 22.9% (1058) 47.9% (1024) >> 1000 2.6% (1071) 8.5% (1024) >> >> when msg_len = 10MB (and PREMAP_LEN = 512): >> slowdown (num of faults) >> times VFIO IOPF >> 1 32.6% (13) >> 100 3.5% (26) >> 1000 1.6% (26) > > It seems like this is only an example that you can make a benchmark > show anything you want. The best results would be to pre-map > everything, which is what we have without this series. What is an > acceptable overhead to incur to avoid page pinning? What if userspace > had more fine grained control over which mappings were available for > faulting and which were statically mapped? I don't really see what > sense the pre-mapping range makes. If I assume the user is QEMU in a > non-vIOMMU configuration, pre-mapping the beginning of each RAM section > doesn't make any logical sense relative to device DMA. As you said in Patch 4, we can introduce full end-to-end functionality before trying to improve performance, and I will drop the pre-mapping patch in the current stage... Is there a need that userspace wants more fine grained control over which mappings are available for faulting? If so, we may evolve the MAP ioctl to support specifying the faulting range. As for the overhead of IOPF, it is unavoidable if enabling on-demand paging (and page faults occur almost only when first accessing), and it seems that there is a large optimization space compared to CPU page faulting. Thanks, Shenming > > Comments per patch to follow. Thanks, > > Alex > > >> History: >> >> v2 -> v3 >> - Nit fixes. >> - No reason to disable reporting the unrecoverable faults. (baolu) >> - Maintain a global IOPF enabled group list. >> - Split the pre-mapping optimization to be a separate patch.
>> - Add selective faulting support (use vfio_pin_pages to indicate the >>non-faultable scope and add a new struct vfio_range to record it, >>untested). (Kevin) >> >> v1 -> v2 >> - Numerous improvements following the suggestions. Thanks a lot to all >>of you. >> >> Note that PRI is not supported at the moment since there is no hardware. >> >> Links: >> [1] Lesokhin I, et al. Page Fault Support for Network Controllers. In ASPLOS, >> 2016. >> [2] Tian K, et al. coIOMMU: A Virtual IOMMU with Cooperative DMA Buffer >> Tracking >> for Efficient Memory Management in Direct I/O. In USENIX ATC, 2020. >> [3] >> https://patchwork.kernel.org/project/linux-arm-kernel/cover/20210401154718.307519-1-jean-phili...@linaro.org/ >> [4] https://github.com/Linaro/uadk >> >> Thanks, >> Shenming >> >> >> Shenming Lu (8): >> iommu: Evolve the device fault reporting framework >> vfio/type1: Add a page fault handler >> vfio/type1: Add an MMU notifier to avoid pinning >> vfio/type1: Pre-map more pages than requested in the IOPF handling >> vfio/type1: VFIO_IOMMU_ENABLE_IOPF >> vfio/type1: No need to statically pin and map if IOPF enabled >> vfio/type1: Add selective DMA faulting support >> vfio: Add nested IOPF support >> >> .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c |3 +- >> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 18 +- >> drivers/iommu/iommu.c | 56 +- >>