Re: next/master bisection: baseline.login on jetson-tk1
Please see the bisection report below about a kernel panic. Reports aren't automatically sent to the public while we're trialing new bisection features on kernelci.org but this one looks valid. See the kernel Oops due to a NULL pointer followed by a panic: https://storage.kernelci.org/next/master/next-20200511/arm/tegra_defconfig/gcc-8/lab-collabora/baseline-tegra124-jetson-tk1.html#L573 Stack trace: <0>[2.953683] [] (__iommu_probe_device) from [] (iommu_probe_device+0x18/0x124) <0>[2.962810] [] (iommu_probe_device) from [] (of_iommu_configure+0x154/0x1b8) <0>[2.971853] [] (of_iommu_configure) from [] (of_dma_configure+0x144/0x2c8) <0>[2.980722] [] (of_dma_configure) from [] (host1x_attach_driver+0x148/0x2c4) <0>[2.989763] [] (host1x_attach_driver) from [] (host1x_driver_register_full+0x70/0xcc) <0>[2.999585] [] (host1x_driver_register_full) from [] (host1x_drm_init+0x14/0x50) <0>[3.008973] [] (host1x_drm_init) from [] (do_one_initcall+0x50/0x2b0) <0>[3.017405] [] (do_one_initcall) from [] (kernel_init_freeable+0x188/0x200) <0>[3.026361] [] (kernel_init_freeable) from [] (kernel_init+0x8/0x114) <0>[3.034794] [] (kernel_init) from [] (ret_from_fork+0x14/0x2c) Guillaume On 12/05/2020 02:24, kernelci.org bot wrote: > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * > * This automated bisection report was sent to you on the basis * > * that you may be involved with the breaking commit it has * > * found. No manual investigation has been done to verify it, * > * and the root cause of the problem may be somewhere else. * > * * > * If you do send a fix, please include this trailer:* > * Reported-by: "kernelci.org bot" * > * * > * Hope this helps! * > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * > > next/master bisection: baseline.login on jetson-tk1 > > Summary: > Start: 4b20e7462caa6 Add linux-next specific files for 20200511 > Plain log: > https://storage.kernelci.org/next/master/next-20200511/arm/tegra_defconfig/gcc-8/lab-collabora/baseline-tegra124-jetson-tk1.txt > HTML log: > https://storage.kernelci.org/next/master/next-20200511/arm/tegra_defconfig/gcc-8/lab-collabora/baseline-tegra124-jetson-tk1.html > Result: 3eeeb45c6d044 iommu: Remove add_device()/remove_device() > code-paths > > Checks: > revert: PASS > verify: PASS > > Parameters: > Tree: next > URL: > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git > Branch: master > Target: jetson-tk1 > CPU arch: arm > Lab:lab-collabora > Compiler: gcc-8 > Config: tegra_defconfig > Test case: baseline.login > > Breaking commit found: > > --- > commit 3eeeb45c6d0444b368cdeba9bdafa8bbcf5370d1 > Author: Joerg Roedel > Date: Wed Apr 29 15:37:10 2020 +0200 > > iommu: Remove add_device()/remove_device() code-paths > > All drivers are converted to use the probe/release_device() > call-backs, so the add_device/remove_device() pointers are unused and > the code using them can be removed. > > Signed-off-by: Joerg Roedel > Tested-by: Marek Szyprowski > Acked-by: Marek Szyprowski > Link: https://lore.kernel.org/r/20200429133712.31431-33-j...@8bytes.org > Signed-off-by: Joerg Roedel > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index 397fd4fd0c320..7f99e5ae432c6 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -220,12 +220,20 @@ static int __iommu_probe_device(struct device *dev, > struct list_head *group_list > return ret; > } > > -static int __iommu_probe_device_helper(struct device *dev) > +int iommu_probe_device(struct device *dev) > { > const struct iommu_ops *ops = dev->bus->iommu_ops; > struct iommu_group *group; > int ret; > > + if (!dev_iommu_get(dev)) > + return -ENOMEM; > + > + if (!try_module_get(ops->owner)) { > + ret = -EINVAL; > + goto err_out; > + } > + > ret = __iommu_probe_device(dev, NULL); > if (ret) > goto err_out; > @@ -259,75 +267,23 @@ static int __iommu_probe_device_helper(struct device > *dev) > > err_release: > iommu_release_device(dev); > + > err_out: > return ret; > > } &
Re: [PATCH] iommu/qcom: add optional clock for TLB invalidate
On Sat 09 May 06:08 PDT 2020, Shawn Guo wrote: > On some SoCs like MSM8939 with A405 adreno, there is a gfx_tbu clock > needs to be on while doing TLB invalidate. Otherwise, TLBSYNC status > will not be correctly reflected, causing the system to go into a bad > state. Add it as an optional clock, so that platforms that have this > clock can pass it over DT. > > Signed-off-by: Shawn Guo > --- > drivers/iommu/qcom_iommu.c | 16 > 1 file changed, 16 insertions(+) > > diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c > index 0e2a96467767..2f6c6da7d540 100644 > --- a/drivers/iommu/qcom_iommu.c > +++ b/drivers/iommu/qcom_iommu.c > @@ -45,6 +45,7 @@ struct qcom_iommu_dev { > struct device *dev; > struct clk *iface_clk; > struct clk *bus_clk; > + struct clk *tlb_clk; > void __iomem*local_base; > u32 sec_id; > u8 num_ctxs; > @@ -643,11 +644,20 @@ static int qcom_iommu_enable_clocks(struct > qcom_iommu_dev *qcom_iommu) > return ret; > } > > + ret = clk_prepare_enable(qcom_iommu->tlb_clk); > + if (ret) { > + dev_err(qcom_iommu->dev, "Couldn't enable tlb_clk\n"); > + clk_disable_unprepare(qcom_iommu->bus_clk); > + clk_disable_unprepare(qcom_iommu->iface_clk); > + return ret; > + } Seems this is an excellent opportunity to replace qcom_iommu_enable_clocks() to clk_bulk_prepare_enable() and disable, respectively. > + > return 0; > } > > static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu) > { > + clk_disable_unprepare(qcom_iommu->tlb_clk); > clk_disable_unprepare(qcom_iommu->bus_clk); > clk_disable_unprepare(qcom_iommu->iface_clk); > } > @@ -839,6 +849,12 @@ static int qcom_iommu_device_probe(struct > platform_device *pdev) > return PTR_ERR(qcom_iommu->bus_clk); > } > > + qcom_iommu->tlb_clk = devm_clk_get(dev, "tlb"); Wouldn't "tbu" be a better name for this clock? Given that seems the actually be the hardware block you're clocking. That said, I thought we used device links and pm_runtime to ensure that the TBUs are powered and clocked... Regards, Bjorn > + if (IS_ERR(qcom_iommu->tlb_clk)) { > + dev_dbg(dev, "failed to get tlb clock\n"); > + qcom_iommu->tlb_clk = NULL; > + } > + > if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id", >&qcom_iommu->sec_id)) { > dev_err(dev, "missing qcom,iommu-secure-id property\n"); > -- > 2.17.1 > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH -next] iommu/msm: Make msm_iommu_lock static
On Mon 11 May 19:17 PDT 2020, Samuel Zou wrote: > Fix the following sparse warning: > > drivers/iommu/msm_iommu.c:37:1: warning: symbol 'msm_iommu_lock' was not > declared. > > The msm_iommu_lock has only call site within msm_iommu.c > It should be static > Reviewed-by: Bjorn Andersson Regards, Bjorn > Fixes: 0720d1f052dc ("msm: Add MSM IOMMU support") > Reported-by: Hulk Robot > Signed-off-by: Samuel Zou > --- > drivers/iommu/msm_iommu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c > index 10cd4db..3d8a635 100644 > --- a/drivers/iommu/msm_iommu.c > +++ b/drivers/iommu/msm_iommu.c > @@ -34,7 +34,7 @@ __asm__ __volatile__ ( > \ > /* bitmap of the page sizes currently supported */ > #define MSM_IOMMU_PGSIZES(SZ_4K | SZ_64K | SZ_1M | SZ_16M) > > -DEFINE_SPINLOCK(msm_iommu_lock); > +static DEFINE_SPINLOCK(msm_iommu_lock); > static LIST_HEAD(qcom_iommu_devices); > static struct iommu_ops msm_iommu_ops; > > -- > 2.6.2 > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 1/2] iommu/of: Let pci_fixup_final access iommu_fwnode
Calling pci_fixup_final after of_pci_iommu_init, which alloc iommu_fwnode. Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. So calling pci_fixup_final after iommu_fwnode is allocated. Signed-off-by: Zhangfei Gao --- drivers/iommu/of_iommu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c index 20738aac..c1b58c4 100644 --- a/drivers/iommu/of_iommu.c +++ b/drivers/iommu/of_iommu.c @@ -188,6 +188,7 @@ const struct iommu_ops *of_iommu_configure(struct device *dev, pci_request_acs(); err = pci_for_each_dma_alias(to_pci_dev(dev), of_pci_iommu_init, &info); + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); } else if (dev_is_fsl_mc(dev)) { err = of_fsl_mc_iommu_init(to_fsl_mc_device(dev), master_np); } else { -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 2/2] ACPI/IORT: Let pci_fixup_final access iommu_fwnode
Calling pci_fixup_final after iommu_fwspec_init, which alloc iommu_fwnode. Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. So calling pci_fixup_final after iommu_fwnode is allocated. Signed-off-by: Zhangfei Gao --- drivers/acpi/arm64/iort.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index 7d04424..02e361d 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -1027,6 +1027,7 @@ const struct iommu_ops *iort_iommu_configure(struct device *dev) info.node = node; err = pci_for_each_dma_alias(to_pci_dev(dev), iort_pci_iommu_init, &info); + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); fwspec = dev_iommu_fwspec_get(dev); if (fwspec && iort_pci_rc_supports_ats(node)) -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 0/2] Let pci_fixup_final access iommu_fwnode
Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. So calling pci_fixup_final after iommu_fwnode is allocated. For example: Hisilicon platform device need fixup in drivers/pci/quirks.c +static void quirk_huawei_pcie_sva(struct pci_dev *pdev) +{ + struct iommu_fwspec *fwspec; + + pdev->eetlp_prefix_path = 1; + fwspec = dev_iommu_fwspec_get(&pdev->dev); + if (fwspec) + fwspec->can_stall = 1; +} + +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); Zhangfei Gao (2): iommu/of: Let pci_fixup_final access iommu_fwnode ACPI/IORT: Let pci_fixup_final access iommu_fwnode drivers/acpi/arm64/iort.c | 1 + drivers/iommu/of_iommu.c | 1 + 2 files changed, 2 insertions(+) -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 0/5] iommu/vt-d: Add page request draining support
When a PASID is stopped or terminated, there can be pending PRQs (requests that haven't received responses) in the software and remapping hardware. The pending page requests must be drained so that the pasid could be reused. The chapter 7.10 in the VT-d specification specifies the software steps to drain pending page requests and responses. This includes two parts: - PATCH 1/5 ~ 2/5: refactor the qi_submit_sync() to support multiple descriptors per submission which will be used in the following patch. - PATCH 3/5 ~ 5/5: add page request drain support after a pasid entry is torn down. This series is based on Jacob's vSVA support hence the guest pasid could also be drained. https://lkml.org/lkml/2020/4/21/1038 Best regards, baolu Change log: v4->v5: - Only set FPD bit in PASID entry when entry changed from present to non-present; - Check pri_support in drain helper; - Refine the wait/completion usage. v3->v4: - Remove prq drain in mm notifier; - Set PASID FPD bit when pasid is cleared in mm notifier and clear it in unbound(). v2->v3: - Address Kevin's review comments - Squash the first 2 patches together; - The prq thread is serialized, no need to consider reentrance; - Ensure no new-coming prq before drain prq in queue; - Handle page request overflow case. v1->v2: - Fix race between multiple prq handling threads. Lu Baolu (5): iommu/vt-d: Multiple descriptors per qi_submit_sync() iommu/vt-d: debugfs: Add support to show inv queue internals iommu/vt-d: Disable non-recoverable fault processing before unbind iommu/vt-d: Add page request draining support iommu/vt-d: Remove redundant IOTLB flush drivers/iommu/dmar.c| 63 -- drivers/iommu/intel-iommu-debugfs.c | 62 ++ drivers/iommu/intel-iommu.c | 4 +- drivers/iommu/intel-pasid.c | 30 +-- drivers/iommu/intel-pasid.h | 4 +- drivers/iommu/intel-svm.c | 128 drivers/iommu/intel_irq_remapping.c | 2 +- include/linux/intel-iommu.h | 13 ++- 8 files changed, 253 insertions(+), 53 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 1/5] iommu/vt-d: Multiple descriptors per qi_submit_sync()
Current qi_submit_sync() only supports single invalidation descriptor per submission and appends wait descriptor after each submission to poll the hardware completion. This extends the qi_submit_sync() helper to support multiple descriptors, and add an option so that the caller could specify the Page-request Drain (PD) bit in the wait descriptor. Signed-off-by: Jacob Pan Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian --- drivers/iommu/dmar.c| 63 + drivers/iommu/intel-pasid.c | 4 +- drivers/iommu/intel-svm.c | 6 +-- drivers/iommu/intel_irq_remapping.c | 2 +- include/linux/intel-iommu.h | 9 - 5 files changed, 52 insertions(+), 32 deletions(-) diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c index d9dc787feef7..61d049e91f84 100644 --- a/drivers/iommu/dmar.c +++ b/drivers/iommu/dmar.c @@ -1157,12 +1157,11 @@ static inline void reclaim_free_desc(struct q_inval *qi) } } -static int qi_check_fault(struct intel_iommu *iommu, int index) +static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index) { u32 fault; int head, tail; struct q_inval *qi = iommu->qi; - int wait_index = (index + 1) % QI_LENGTH; int shift = qi_shift(iommu); if (qi->desc_status[wait_index] == QI_ABORT) @@ -1225,17 +1224,21 @@ static int qi_check_fault(struct intel_iommu *iommu, int index) } /* - * Submit the queued invalidation descriptor to the remapping - * hardware unit and wait for its completion. + * Function to submit invalidation descriptors of all types to the queued + * invalidation interface(QI). Multiple descriptors can be submitted at a + * time, a wait descriptor will be appended to each submission to ensure + * hardware has completed the invalidation before return. Wait descriptors + * can be part of the submission but it will not be polled for completion. */ -int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu) +int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc, + unsigned int count, unsigned long options) { - int rc; struct q_inval *qi = iommu->qi; - int offset, shift, length; struct qi_desc wait_desc; int wait_index, index; unsigned long flags; + int offset, shift; + int rc, i; if (!qi) return 0; @@ -1244,32 +1247,41 @@ int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu) rc = 0; raw_spin_lock_irqsave(&qi->q_lock, flags); - while (qi->free_cnt < 3) { + /* +* Check if we have enough empty slots in the queue to submit, +* the calculation is based on: +* # of desc + 1 wait desc + 1 space between head and tail +*/ + while (qi->free_cnt < count + 2) { raw_spin_unlock_irqrestore(&qi->q_lock, flags); cpu_relax(); raw_spin_lock_irqsave(&qi->q_lock, flags); } index = qi->free_head; - wait_index = (index + 1) % QI_LENGTH; + wait_index = (index + count) % QI_LENGTH; shift = qi_shift(iommu); - length = 1 << shift; - qi->desc_status[index] = qi->desc_status[wait_index] = QI_IN_USE; + for (i = 0; i < count; i++) { + offset = ((index + i) % QI_LENGTH) << shift; + memcpy(qi->desc + offset, &desc[i], 1 << shift); + qi->desc_status[(index + i) % QI_LENGTH] = QI_IN_USE; + } + qi->desc_status[wait_index] = QI_IN_USE; - offset = index << shift; - memcpy(qi->desc + offset, desc, length); wait_desc.qw0 = QI_IWD_STATUS_DATA(QI_DONE) | QI_IWD_STATUS_WRITE | QI_IWD_TYPE; + if (options & QI_OPT_WAIT_DRAIN) + wait_desc.qw0 |= QI_IWD_PRQ_DRAIN; wait_desc.qw1 = virt_to_phys(&qi->desc_status[wait_index]); wait_desc.qw2 = 0; wait_desc.qw3 = 0; offset = wait_index << shift; - memcpy(qi->desc + offset, &wait_desc, length); + memcpy(qi->desc + offset, &wait_desc, 1 << shift); - qi->free_head = (qi->free_head + 2) % QI_LENGTH; - qi->free_cnt -= 2; + qi->free_head = (qi->free_head + count + 1) % QI_LENGTH; + qi->free_cnt -= count + 1; /* * update the HW tail register indicating the presence of @@ -1285,7 +1297,7 @@ int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu) * a deadlock where the interrupt context can wait indefinitely * for free slots in the queue. */ - rc = qi_check_fault(iommu, index); + rc = qi_check_fault(iommu, index, wait_index); if (rc) break; @@ -1294,7 +1306,8 @@ int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu) raw_spin_lock(&qi->q_lock); } - qi->desc_stat
[PATCH v5 2/5] iommu/vt-d: debugfs: Add support to show inv queue internals
Export invalidation queue internals of each iommu device through the debugfs. Example of such dump on a Skylake machine: $ sudo cat /sys/kernel/debug/iommu/intel/invalidation_queue Invalidation queue on IOMMU: dmar1 Base: 0x1672c9000 Head: 80Tail: 80 Index qw0 qw1 status 0 0004 1 000200250001672be804 2 0011 3 000200250001672be80c 4 00d2 5 000200250001672be814 6 0014 7 000200250001672be81c 8 0014 9 000200250001672be824 Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian --- drivers/iommu/intel-iommu-debugfs.c | 62 + 1 file changed, 62 insertions(+) diff --git a/drivers/iommu/intel-iommu-debugfs.c b/drivers/iommu/intel-iommu-debugfs.c index 3eb1fe240fb0..e3089865b8f3 100644 --- a/drivers/iommu/intel-iommu-debugfs.c +++ b/drivers/iommu/intel-iommu-debugfs.c @@ -372,6 +372,66 @@ static int domain_translation_struct_show(struct seq_file *m, void *unused) } DEFINE_SHOW_ATTRIBUTE(domain_translation_struct); +static void invalidation_queue_entry_show(struct seq_file *m, + struct intel_iommu *iommu) +{ + int index, shift = qi_shift(iommu); + struct qi_desc *desc; + int offset; + + if (ecap_smts(iommu->ecap)) + seq_puts(m, "Index\t\tqw0\t\t\tqw1\t\t\tqw2\t\t\tqw3\t\t\tstatus\n"); + else + seq_puts(m, "Index\t\tqw0\t\t\tqw1\t\t\tstatus\n"); + + for (index = 0; index < QI_LENGTH; index++) { + offset = index << shift; + desc = iommu->qi->desc + offset; + if (ecap_smts(iommu->ecap)) + seq_printf(m, "%5d\t%016llx\t%016llx\t%016llx\t%016llx\t%016x\n", + index, desc->qw0, desc->qw1, + desc->qw2, desc->qw3, + iommu->qi->desc_status[index]); + else + seq_printf(m, "%5d\t%016llx\t%016llx\t%016x\n", + index, desc->qw0, desc->qw1, + iommu->qi->desc_status[index]); + } +} + +static int invalidation_queue_show(struct seq_file *m, void *unused) +{ + struct dmar_drhd_unit *drhd; + struct intel_iommu *iommu; + unsigned long flags; + struct q_inval *qi; + int shift; + + rcu_read_lock(); + for_each_active_iommu(iommu, drhd) { + qi = iommu->qi; + shift = qi_shift(iommu); + + if (!qi || !ecap_qis(iommu->ecap)) + continue; + + seq_printf(m, "Invalidation queue on IOMMU: %s\n", iommu->name); + + raw_spin_lock_irqsave(&qi->q_lock, flags); + seq_printf(m, " Base: 0x%llx\tHead: %lld\tTail: %lld\n", + virt_to_phys(qi->desc), + dmar_readq(iommu->reg + DMAR_IQH_REG) >> shift, + dmar_readq(iommu->reg + DMAR_IQT_REG) >> shift); + invalidation_queue_entry_show(m, iommu); + raw_spin_unlock_irqrestore(&qi->q_lock, flags); + seq_putc(m, '\n'); + } + rcu_read_unlock(); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(invalidation_queue); + #ifdef CONFIG_IRQ_REMAP static void ir_tbl_remap_entry_show(struct seq_file *m, struct intel_iommu *iommu) @@ -490,6 +550,8 @@ void __init intel_iommu_debugfs_init(void) debugfs_create_file("domain_translation_struct", 0444, intel_iommu_debug, NULL, &domain_translation_struct_fops); + debugfs_create_file("invalidation_queue", 0444, intel_iommu_debug, + NULL, &invalidation_queue_fops); #ifdef CONFIG_IRQ_REMAP debugfs_create_file("ir_translation_struct", 0444, intel_iommu_debug, NULL, &ir_translation_struct_fops); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 4/5] iommu/vt-d: Add page request draining support
When a PASID is stopped or terminated, there can be pending PRQs (requests that haven't received responses) in remapping hardware. This adds the interface to drain page requests and call it when a PASID is terminated. Signed-off-by: Jacob Pan Signed-off-by: Liu Yi L Signed-off-by: Lu Baolu --- drivers/iommu/intel-svm.c | 107 ++-- include/linux/intel-iommu.h | 4 ++ 2 files changed, 106 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index 9561ba59a170..84cc263cec47 100644 --- a/drivers/iommu/intel-svm.c +++ b/drivers/iommu/intel-svm.c @@ -23,6 +23,7 @@ #include "intel-pasid.h" static irqreturn_t prq_event_thread(int irq, void *d); +static void intel_svm_drain_prq(struct device *dev, int pasid); #define PRQ_ORDER 0 @@ -66,6 +67,8 @@ int intel_svm_enable_prq(struct intel_iommu *iommu) dmar_writeq(iommu->reg + DMAR_PQT_REG, 0ULL); dmar_writeq(iommu->reg + DMAR_PQA_REG, virt_to_phys(iommu->prq) | PRQ_ORDER); + init_completion(&iommu->prq_complete); + return 0; } @@ -403,12 +406,8 @@ int intel_svm_unbind_gpasid(struct device *dev, int pasid) list_del_rcu(&sdev->list); intel_pasid_tear_down_entry(iommu, dev, svm->pasid, false); + intel_svm_drain_prq(dev, svm->pasid); intel_flush_svm_range_dev(svm, sdev, 0, -1, 0); - /* TODO: Drain in flight PRQ for the PASID since it -* may get reused soon, we don't want to -* confuse with its previous life. -* intel_svm_drain_prq(dev, pasid); -*/ kfree_rcu(sdev, rcu); if (list_empty(&svm->devs)) { @@ -647,6 +646,7 @@ int intel_svm_unbind_mm(struct device *dev, int pasid) * hard to be as defensive as we might like. */ intel_pasid_tear_down_entry(iommu, dev, svm->pasid, false); + intel_svm_drain_prq(dev, svm->pasid); intel_flush_svm_range_dev(svm, sdev, 0, -1, 0); kfree_rcu(sdev, rcu); @@ -725,6 +725,93 @@ static bool is_canonical_address(u64 addr) return (((saddr << shift) >> shift) == saddr); } +/** + * intel_svm_drain_prq - Drain page requests and responses for a pasid + * @dev: target device + * @pasid: pasid for draining + * + * Drain all pending page requests and responses related to @pasid in both + * software and hardware. This is supposed to be called after the device + * driver has stopped DMA, the pasid entry has been cleared, and both IOTLB + * and DevTLB have been invalidated. + * + * It waits until all pending page requests for @pasid in the page fault + * queue are completed by the prq handling thread. Then follow the steps + * described in VT-d spec CH7.10 to drain all page requests and page + * responses pending in the hardware. + */ +static void intel_svm_drain_prq(struct device *dev, int pasid) +{ + struct device_domain_info *info; + struct dmar_domain *domain; + struct intel_iommu *iommu; + struct qi_desc desc[3]; + struct pci_dev *pdev; + int head, tail; + u16 sid, did; + int qdep; + + info = get_domain_info(dev); + if (WARN_ON(!info || !dev_is_pci(dev))) + return; + + if (!info->pri_enabled) + return; + + iommu = info->iommu; + domain = info->domain; + pdev = to_pci_dev(dev); + sid = PCI_DEVID(info->bus, info->devfn); + did = domain->iommu_did[iommu->seq_id]; + qdep = pci_ats_queue_depth(pdev); + + /* +* Check and wait until all pending page requests in the queue are +* handled by the prq handling thread. +*/ +prq_retry: + reinit_completion(&iommu->prq_complete); + tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK; + head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK; + while (head != tail) { + struct page_req_dsc *req; + + req = &iommu->prq[head / sizeof(*req)]; + if (!req->pasid_present || req->pasid != pasid) { + head = (head + sizeof(*req)) & PRQ_RING_MASK; + continue; + } + + wait_for_completion(&iommu->prq_complete); + goto prq_retry; + } + + /* +* Perform steps described in VT-d spec CH7.10 to drain page +* requests and responses in hardware. +*/ + memset(desc, 0, sizeof(desc)); + desc[0].qw0 = QI_IWD_STATUS_DATA(QI_DONE) | + QI_IWD_FENCE | + QI_IWD_TYPE; + desc[1].qw0 = QI_EIOTLB_PASID(pasid) | +
[PATCH v5 5/5] iommu/vt-d: Remove redundant IOTLB flush
IOTLB flush already included in the PASID tear down and the page request drain process. There is no need to flush again. Signed-off-by: Jacob Pan Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian --- drivers/iommu/intel-svm.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index 84cc263cec47..d8e19486523f 100644 --- a/drivers/iommu/intel-svm.c +++ b/drivers/iommu/intel-svm.c @@ -209,11 +209,9 @@ static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) * *has* to handle gracefully without affecting other processes. */ rcu_read_lock(); - list_for_each_entry_rcu(sdev, &svm->devs, list) { + list_for_each_entry_rcu(sdev, &svm->devs, list) intel_pasid_tear_down_entry(svm->iommu, sdev->dev, svm->pasid, true); - intel_flush_svm_range_dev(svm, sdev, 0, -1, 0); - } rcu_read_unlock(); } @@ -407,7 +405,6 @@ int intel_svm_unbind_gpasid(struct device *dev, int pasid) intel_pasid_tear_down_entry(iommu, dev, svm->pasid, false); intel_svm_drain_prq(dev, svm->pasid); - intel_flush_svm_range_dev(svm, sdev, 0, -1, 0); kfree_rcu(sdev, rcu); if (list_empty(&svm->devs)) { @@ -647,7 +644,6 @@ int intel_svm_unbind_mm(struct device *dev, int pasid) intel_pasid_tear_down_entry(iommu, dev, svm->pasid, false); intel_svm_drain_prq(dev, svm->pasid); - intel_flush_svm_range_dev(svm, sdev, 0, -1, 0); kfree_rcu(sdev, rcu); if (list_empty(&svm->devs)) { -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 3/5] iommu/vt-d: Disable non-recoverable fault processing before unbind
When a PASID is used for SVA by the device, it's possible that the PASID entry is cleared before the device flushes all ongoing DMA requests. The IOMMU should tolerate and ignore the non-recoverable faults caused by the untranslated requests from this device. For example, when an exception happens, the process terminates before the device driver stops DMA and call IOMMU driver to unbind PASID. The flow of process exist is as follows: do_exit() { exit_mm() { mm_put(); exit_mmap() { intel_invalidate_range() //mmu notifier tlb_finish_mmu() mmu_notifier_release(mm) { intel_iommu_release() { [2] intel_iommu_teardown_pasid(); intel_iommu_flush_tlbs(); } } unmap_vmas(); free_pgtables(); }; } exit_files(tsk) { close_files() { dsa_close(); [1] dsa_stop_dma(); intel_svm_unbind_pasid(); } } } Care must be taken on VT-d to avoid unrecoverable faults between the time window of [1] and [2]. [Process exist flow was contributed by Jacob Pan.] Intel VT-d provides such function through the FPD bit of the PASID entry. This sets FPD bit when PASID entry is changing from present to nonpresent in the mm notifier and will clear it when the pasid is unbound. Signed-off-by: Lu Baolu Reviewed-by: Jacob Pan --- drivers/iommu/intel-iommu.c | 4 ++-- drivers/iommu/intel-pasid.c | 26 +- drivers/iommu/intel-pasid.h | 4 +++- drivers/iommu/intel-svm.c | 9 ++--- 4 files changed, 32 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index d1866c0905b1..7811422b5a68 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5352,7 +5352,7 @@ static void __dmar_remove_one_dev_info(struct device_domain_info *info) if (info->dev) { if (dev_is_pci(info->dev) && sm_supported(iommu)) intel_pasid_tear_down_entry(iommu, info->dev, - PASID_RID2PASID); + PASID_RID2PASID, false); iommu_disable_dev_iotlb(info); domain_context_clear(iommu, info->dev); @@ -5587,7 +5587,7 @@ static void aux_domain_remove_dev(struct dmar_domain *domain, auxiliary_unlink_device(domain, dev); spin_lock(&iommu->lock); - intel_pasid_tear_down_entry(iommu, dev, domain->default_pasid); + intel_pasid_tear_down_entry(iommu, dev, domain->default_pasid, false); domain_detach_iommu(domain, iommu); spin_unlock(&iommu->lock); diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c index 7969e3dac2ad..110b2c2c4cb7 100644 --- a/drivers/iommu/intel-pasid.c +++ b/drivers/iommu/intel-pasid.c @@ -292,7 +292,20 @@ static inline void pasid_clear_entry(struct pasid_entry *pe) WRITE_ONCE(pe->val[7], 0); } -static void intel_pasid_clear_entry(struct device *dev, int pasid) +static inline void pasid_clear_entry_with_fpd(struct pasid_entry *pe) +{ + WRITE_ONCE(pe->val[0], PASID_PTE_FPD); + WRITE_ONCE(pe->val[1], 0); + WRITE_ONCE(pe->val[2], 0); + WRITE_ONCE(pe->val[3], 0); + WRITE_ONCE(pe->val[4], 0); + WRITE_ONCE(pe->val[5], 0); + WRITE_ONCE(pe->val[6], 0); + WRITE_ONCE(pe->val[7], 0); +} + +static void +intel_pasid_clear_entry(struct device *dev, int pasid, bool fault_ignore) { struct pasid_entry *pe; @@ -300,7 +313,10 @@ static void intel_pasid_clear_entry(struct device *dev, int pasid) if (WARN_ON(!pe)) return; - pasid_clear_entry(pe); + if (fault_ignore && pasid_pte_is_present(pe)) + pasid_clear_entry_with_fpd(pe); + else + pasid_clear_entry(pe); } static inline void pasid_set_bits(u64 *ptr, u64 mask, u64 bits) @@ -533,8 +549,8 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu, qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0, 64 - VTD_PAGE_SHIFT); } -void intel_pasid_tear_down_entry(struct intel_iommu *iommu, -struct device *dev, int pasid) +void intel_pasid_tear_down_entry(struct intel_iommu *iommu, struct device *dev, +int pasid, bool fault_ignore) { struct pasid_entry *pte; u16 did; @@ -544,7 +560,7 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu, return; did = pasid_get_domain_id(pte); - intel_pasid_clear_entry(dev, pasid); + intel_pasid_clear_entry(dev, pasid, fault_ignore); if (!ecap_coherent(iommu->ecap)) clflush_cache_range(pte, sizeof(*pte));
Re: [PATCH v6] iommu/arm-smmu-qcom: Request direct mapping for modem device
Quoting Sibi Sankar (2020-05-11 10:55:32) > The modem remote processor has two access paths to DDR. One path is > directly connected to DDR and another path goes through an SMMU. The > SMMU path is configured to be a direct mapping because it's used by > various peripherals in the modem subsystem. Typically this direct > mapping is configured statically at EL2 by QHEE (Qualcomm's Hypervisor > Execution Environment) before the kernel is entered. > > In certain firmware configuration, especially when the kernel is already > in full control of the SMMU, defer programming the modem SIDs to the > kernel. Let's add compatibles here so that we can have the kernel > program the SIDs for the modem in these cases. > > Signed-off-by: Sibi Sankar > --- Reviewed-by: Stephen Boyd ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6] iommu/arm-smmu-qcom: Request direct mapping for modem device
On Mon 11 May 10:55 PDT 2020, Sibi Sankar wrote: > The modem remote processor has two access paths to DDR. One path is > directly connected to DDR and another path goes through an SMMU. The > SMMU path is configured to be a direct mapping because it's used by > various peripherals in the modem subsystem. Typically this direct > mapping is configured statically at EL2 by QHEE (Qualcomm's Hypervisor > Execution Environment) before the kernel is entered. > > In certain firmware configuration, especially when the kernel is already > in full control of the SMMU, defer programming the modem SIDs to the > kernel. Let's add compatibles here so that we can have the kernel > program the SIDs for the modem in these cases. > Reviewed-by: Bjorn Andersson Regards, Bjorn > Signed-off-by: Sibi Sankar > --- > > V6 > * Rebased on Will's for-joerg/arm-smmu/updates > * Reword commit message and add more details [Stephen] > > drivers/iommu/arm-smmu-qcom.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c > index 5bedf21587a56..cf01d0215a397 100644 > --- a/drivers/iommu/arm-smmu-qcom.c > +++ b/drivers/iommu/arm-smmu-qcom.c > @@ -17,7 +17,9 @@ static const struct of_device_id > qcom_smmu_client_of_match[] = { > { .compatible = "qcom,mdp4" }, > { .compatible = "qcom,mdss" }, > { .compatible = "qcom,sc7180-mdss" }, > + { .compatible = "qcom,sc7180-mss-pil" }, > { .compatible = "qcom,sdm845-mdss" }, > + { .compatible = "qcom,sdm845-mss-pil" }, > { } > }; > > -- > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu-v3: Don't reserve implementation defined register space
On 2020-05-06 6:46 pm, Jean-Philippe Brucker wrote: Some SMMUv3 implementation embed the Perf Monitor Group Registers (PMCG) inside the first 64kB region of the SMMU. Since PMCG are managed by a separate driver, this layout causes resource reservation conflicts during boot. To avoid this conflict, only reserve the MMIO region we actually use: the first 0xe0 bytes of page 0 and the first 0xd0 bytes of page 1. Although devm_ioremap() still works on full pages under the hood, this way we benefit from resource conflict checks. Signed-off-by: Jean-Philippe Brucker --- A nicer (and hopefully working) solution to the problem dicussed here: https://lore.kernel.org/linux-iommu/20200421155745.19815-1-jean-phili...@linaro.org/ --- drivers/iommu/arm-smmu-v3.c | 50 + 1 file changed, 45 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 82508730feb7a1..fc85cdd5b62cca 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -171,6 +171,9 @@ #define ARM_SMMU_PRIQ_IRQ_CFG10xd8 #define ARM_SMMU_PRIQ_IRQ_CFG20xdc +#define ARM_SMMU_PAGE0_REG_SZ 0xe0 +#define ARM_SMMU_PAGE1_REG_SZ 0xd0 I wonder if we shouldn't still claim all the way up to 0xdff for good measure, since the IMP-DEF areas only start appearing beyond that. + /* Common MSI config fields */ #define MSI_CFG0_ADDR_MASKGENMASK_ULL(51, 2) #define MSI_CFG2_SH GENMASK(5, 4) @@ -628,6 +631,7 @@ struct arm_smmu_strtab_cfg { struct arm_smmu_device { struct device *dev; void __iomem*base; + void __iomem*page1; #define ARM_SMMU_FEAT_2_LVL_STRTAB (1 << 0) #define ARM_SMMU_FEAT_2_LVL_CDTAB (1 << 1) @@ -733,11 +737,14 @@ static struct arm_smmu_option_prop arm_smmu_options[] = { static inline void __iomem *arm_smmu_page1_fixup(unsigned long offset, struct arm_smmu_device *smmu) { - if ((offset > SZ_64K) && - (smmu->options & ARM_SMMU_OPT_PAGE0_REGS_ONLY)) - offset -= SZ_64K; + void __iomem *base = smmu->base; - return smmu->base + offset; + if (offset > SZ_64K) { + offset -= SZ_64K; + if (smmu->page1) + base = smmu->page1; + } + return base + offset; } Why not just assign page1 = base in the Cavium case and let this simply be: if (offset > SZ_64K) return smmu->page1 + offset - SZ_64K; return smmu->base + offset; Then it's only one step further to get rid of the fixup and use page1 directly where relevant, but that could be a cleanup on top, since we probably want a minimal change here for the sake of backporting (I believe this deserves to go to stable, now that MMU-600 hardware is reaching the field and will go wonky otherwise). static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) @@ -4021,6 +4028,28 @@ err_reset_pci_ops: __maybe_unused; return err; } +static void __iomem *arm_smmu_ioremap(struct device *dev, + resource_size_t start, + resource_size_t size) +{ + void __iomem *dest_ptr; + struct resource *res; + + res = devm_request_mem_region(dev, start, size, dev_name(dev)); + if (!res) { + dev_err(dev, "can't request SMMU region %pa\n", &start); + return IOMEM_ERR_PTR(-EINVAL); + } + + dest_ptr = devm_ioremap(dev, start, size); + if (!dest_ptr) { + dev_err(dev, "ioremap failed for SMMU region %pR\n", res); + devm_release_mem_region(dev, start, size); + dest_ptr = IOMEM_ERR_PTR(-ENOMEM); + } + return dest_ptr; +} Would it be any less complicated to stick with devm_ioremap_resource() and fix up the resource itself for each call, rather than open-coding it? + static int arm_smmu_device_probe(struct platform_device *pdev) { int irq, ret; @@ -4056,10 +4085,21 @@ static int arm_smmu_device_probe(struct platform_device *pdev) } ioaddr = res->start; - smmu->base = devm_ioremap_resource(dev, res); + /* +* Only map what we need, because the IMPLEMENTATION DEFINED registers +* may be used for the PMCGs, which are reserved by the PMU driver. +*/ + smmu->base = arm_smmu_ioremap(dev, ioaddr, ARM_SMMU_PAGE0_REG_SZ); if (IS_ERR(smmu->base)) return PTR_ERR(smmu->base); + if (arm_smmu_resource_size(smmu) > SZ_64K) { + smmu->page1 = arm_smmu_ioremap(dev, ioaddr + SZ_64K, + ARM_SMMU_PAGE1_REG_SZ); + if (IS_ERR(smmu->page1)) + return PTR_ERR(smmu->page1); + } A
[PATCH v6] iommu/arm-smmu-qcom: Request direct mapping for modem device
The modem remote processor has two access paths to DDR. One path is directly connected to DDR and another path goes through an SMMU. The SMMU path is configured to be a direct mapping because it's used by various peripherals in the modem subsystem. Typically this direct mapping is configured statically at EL2 by QHEE (Qualcomm's Hypervisor Execution Environment) before the kernel is entered. In certain firmware configuration, especially when the kernel is already in full control of the SMMU, defer programming the modem SIDs to the kernel. Let's add compatibles here so that we can have the kernel program the SIDs for the modem in these cases. Signed-off-by: Sibi Sankar --- V6 * Rebased on Will's for-joerg/arm-smmu/updates * Reword commit message and add more details [Stephen] drivers/iommu/arm-smmu-qcom.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c index 5bedf21587a56..cf01d0215a397 100644 --- a/drivers/iommu/arm-smmu-qcom.c +++ b/drivers/iommu/arm-smmu-qcom.c @@ -17,7 +17,9 @@ static const struct of_device_id qcom_smmu_client_of_match[] = { { .compatible = "qcom,mdp4" }, { .compatible = "qcom,mdss" }, { .compatible = "qcom,sc7180-mdss" }, + { .compatible = "qcom,sc7180-mss-pil" }, { .compatible = "qcom,sdm845-mdss" }, + { .compatible = "qcom,sdm845-mss-pil" }, { } }; -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iomm/arm-smmu: Add stall implementation hook
On Fri, May 08, 2020 at 08:40:40AM -0700, Rob Clark wrote: > On Fri, May 8, 2020 at 8:32 AM Rob Clark wrote: > > > > On Thu, May 7, 2020 at 5:54 AM Will Deacon wrote: > > > > > > On Thu, May 07, 2020 at 11:55:54AM +0100, Robin Murphy wrote: > > > > On 2020-05-07 11:14 am, Sai Prakash Ranjan wrote: > > > > > On 2020-04-22 01:50, Sai Prakash Ranjan wrote: > > > > > > Add stall implementation hook to enable stalling > > > > > > faults on QCOM platforms which supports it without > > > > > > causing any kind of hardware mishaps. Without this > > > > > > on QCOM platforms, GPU faults can cause unrelated > > > > > > GPU memory accesses to return zeroes. This has the > > > > > > unfortunate result of command-stream reads from CP > > > > > > getting invalid data, causing a cascade of fail. > > > > > > > > I think this came up before, but something about this rationale doesn't > > > > add > > > > up - we're not *using* stalls at all, we're still terminating faulting > > > > transactions unconditionally; we're just using CFCFG to terminate them > > > > with > > > > a slight delay, rather than immediately. It's really not clear how or > > > > why > > > > that makes a difference. Is it a GPU bug? Or an SMMU bug? Is this > > > > reliable > > > > (or even a documented workaround for something), or might things start > > > > blowing up again if any other behaviour subtly changes? I'm not dead set > > > > against adding this, but I'd *really* like to have a lot more > > > > confidence in > > > > it. > > > > > > Rob mentioned something about the "bus returning zeroes" before, but I > > > agree > > > that we need more information so that we can reason about this and > > > maintain > > > the code as the driver continues to change. That needs to be a comment in > > > the driver, and I don't think "but android seems to work" is a good enough > > > justification. There was some interaction with HUPCF as well. > > > > The issue is that there are multiple parallel memory accesses > > happening at the same time, for example CP (the cmdstream processor) > > will be reading ahead and setting things up for the next draw or > > compute grid, in parallel with some memory accesses from the shader > > which could trigger a fault. (And with faults triggered by something > > in the shader, there are *many* shader threads running in parallel so > > those tend to generate a big number of faults at the same time.) > > > > We need either CFCFG or HUPCF, otherwise what I have observed is that > > while the fault happens, CP's memory access will start returning > > zero's instead of valid cmdstream data, which triggers a GPU hang. I > > can't say whether this is something unique to qcom's implementation of > > the smmu spec or not. > > > > *Often* a fault is the result of the usermode gl/vk/cl driver bug, > > although I don't think that is an argument against fixing this in the > > smmu driver.. I've been carrying around a local patch to set HUPCF for > > *years* because debugging usermode driver issues is so much harder > > without. But there are some APIs where faults can be caused by the > > user's app on top of the usermode driver. > > > > Also, I'll add to that, a big wish of mine is to have stall with the > ability to resume later from a wq context. That would enable me to > hook in the gpu crash dump handling for faults, which would make > debugging these sorts of issues much easier. I think I posted a > prototype of this quite some time back, which would schedule a worker > on the first fault (since there are cases where you see 1000's of > faults at once), which grabbed some information about the currently > executing submit and some gpu registers to indicate *where* in the > submit (a single submit could have 100's or 1000's of draws), and then > resumed the iommu cb. > > (This would ofc eventually be useful for svm type things.. I expect > we'll eventually care about that too.) Rob is right about HUPCF. Due to the parallel nature of the command processor there is always a very good chance that a CP access is somewhere in the bus so any pagefault is usually a death sentence. The GPU context bank would always want HUPCF set to 1. Downstream also uses CFCFG for stall-on-fault debug case. You wouldn't want this on all the time in production since bringing down the world for every user pagefault is less than desirable so it needs to be modified in run-time (or at the very least kernel command line selectable). Jordan PS: Interestingly, the GMU does not want HUPCF set to 1 because it wants to crash immediately on all invalid accesses so ideally these combination of bits would be configurable on a per-context basis. > > > > > > As a template, I'd suggest: > > > > > > > > > diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h > > > > > > index 8d1cd54d82a6..d5134e0d5cce 100644 > > > > > > --- a/drivers/iommu/arm-smmu.h > > > > > > +++ b/drivers/iommu/arm-smmu.h > > > > > > @@ -386,6 +386,7 @@ struct arm_smmu_im
[PATCH v2] iommu/amd: Fix get_acpihid_device_id()
acpi_dev_hid_uid_match() expects a null pointer for UID if it doesn't exist. The acpihid_map_entry contains a char buffer for holding the UID. If no UID was provided in the IVRS table, this buffer will be zeroed. If we pass in a null string, acpi_dev_hid_uid_match() will return false because it will try and match an empty string to the ACPI UID of the device. Fixes: ae5e6c6439c3 ("iommu/amd: Switch to use acpi_dev_hid_uid_match()") Suggested-by: Andy Shevchenko Signed-off-by: Raul E Rangel Reviewed-by: Andy Shevchenko --- Changes in v2: - Added Suggested by - Fixed commit description - Decided to keep `p->uid[0]` instead of `*p->uid` since the data member is an array instead of a pointer. - Used clang-format drivers/iommu/amd_iommu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 20cce366e951..06f603366cb1 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -125,7 +125,8 @@ static inline int get_acpihid_device_id(struct device *dev, return -ENODEV; list_for_each_entry(p, &acpihid_map, list) { - if (acpi_dev_hid_uid_match(adev, p->hid, p->uid)) { + if (acpi_dev_hid_uid_match(adev, p->hid, + p->uid[0] ? p->uid : NULL)) { if (entry) *entry = p; return p->devid; -- 2.26.2.645.ge9eca65c58-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu: Do not probe devices on IOMMU-less busses
From: Thierry Reding The host1x bus implemented on Tegra SoCs is primarily an abstraction to create logical device from multiple platform devices. Since the devices in such a setup are typically hierarchical, DMA setup still needs to be done so that DMA masks can be properly inherited, but we don't actually want to attach the host1x logical devices to any IOMMU. The platform devices that make up the logical device are responsible for memory bus transactions, so it is them that will need to be attached to the IOMMU. Add a check to __iommu_probe_device() that aborts IOMMU setup early for busses that don't have the IOMMU operations pointer set since they will cause a crash otherwise. Signed-off-by: Thierry Reding --- Note that this is probably also required for the BCMA bus implemented in drivers/bcma/main.c since no IOMMU operations are ever assigned to that either. drivers/iommu/iommu.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 9888a3c82b15..4050569188be 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -196,6 +196,9 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list struct iommu_group *group; int ret; + if (!ops) + return -ENODEV; + if (!dev_iommu_get(dev)) return -ENOMEM; -- 2.24.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 10/25] drm: panfrost: fix common struct sg_table related issues
On 05/05/2020 09:45, Marek Szyprowski wrote: The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the numer of the created entries in the DMA address space. However the subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be called with the original number of the entries passed to dma_map_sg. The sg_table->nents in turn holds the result of the dma_map_sg call as stated in include/linux/scatterlist.h. A common mistake was to ignore a result of the dma_map_sg function and don't use the sg_table->orig_nents at all. To avoid such issues, lets use common dma-mapping wrappers operating directly on the struct sg_table objects and adjust references to the nents and orig_nents respectively. Signed-off-by: Marek Szyprowski The change looks good to me: Reviewed-by: Steven Price Although I would have appreciated the commit message being modified to match the specifics of Panfrost - the return of dma_mpa_sg() wasn't being ignored, but the use of orig_nents/nents was indeed wrong. Steve --- For more information, see '[PATCH v3 00/25] DRM: fix struct sg_table nents vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/5/187 --- drivers/gpu/drm/panfrost/panfrost_gem.c | 4 ++-- drivers/gpu/drm/panfrost/panfrost_mmu.c | 5 ++--- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c index 17b654e..95d7e80 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem.c +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c @@ -41,8 +41,8 @@ static void panfrost_gem_free_object(struct drm_gem_object *obj) for (i = 0; i < n_sgt; i++) { if (bo->sgts[i].sgl) { - dma_unmap_sg(pfdev->dev, bo->sgts[i].sgl, -bo->sgts[i].nents, DMA_BIDIRECTIONAL); + dma_unmap_sgtable(pfdev->dev, &bo->sgts[i], + DMA_BIDIRECTIONAL); sg_free_table(&bo->sgts[i]); } } diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index ed28aeb..9926111 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -517,10 +517,9 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, if (ret) goto err_pages; - if (!dma_map_sg(pfdev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL)) { - ret = -EINVAL; + ret = dma_map_sgtable(pfdev->dev, sgt, DMA_BIDIRECTIONAL); + if (ret) goto err_map; - } mmu_map_sg(pfdev, bomapping->mmu, addr, IOMMU_WRITE | IOMMU_READ | IOMMU_NOEXEC, sgt); ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/amd: fix over-read of ACPI UID from IVRS table
IVRS parsing code always tries to read 255 bytes from memory when retrieving ACPI device path, and makes an assumption that firmware provides a zero-terminated string. Both of those are bugs: the entry is likely to be shorter than 255 bytes, and zero-termination is not guaranteed. With Acer SF314-42 firmware these issues manifest visibly in dmesg: AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR0\xf0\xa5, rdevid:160 AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR1\xf0\xa5, rdevid:160 AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR2\xf0\xa5, rdevid:160 AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR3>\x83e\x8d\x9a\xd1... The first three lines show how the code over-reads adjacent table entries into the UID, and in the last line it even reads garbage data beyond the end of the IVRS table itself. Since each entry has the length of the UID (uidl member of ivhd_entry struct), use that for memcpy, and manually add a zero terminator. Avoid zero-filling hid and uid arrays up front, and instead ensure the uid array is always zero-terminated. No change needed for the hid array, as it was already properly zero-terminated. Fixes: 2a0cb4e2d423c ("iommu/amd: Add new map for storing IVHD dev entry type HID") Signed-off-by: Alexander Monakov Cc: Joerg Roedel Cc: iommu@lists.linux-foundation.org --- drivers/iommu/amd_iommu_init.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c index 6be3853a5d97..faed3d35017a 100644 --- a/drivers/iommu/amd_iommu_init.c +++ b/drivers/iommu/amd_iommu_init.c @@ -1329,8 +1329,8 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, } case IVHD_DEV_ACPI_HID: { u16 devid; - u8 hid[ACPIHID_HID_LEN] = {0}; - u8 uid[ACPIHID_UID_LEN] = {0}; + u8 hid[ACPIHID_HID_LEN]; + u8 uid[ACPIHID_UID_LEN]; int ret; if (h->type != 0x40) { @@ -1347,6 +1347,7 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, break; } + uid[0] = '\0'; switch (e->uidf) { case UID_NOT_PRESENT: @@ -1361,8 +1362,8 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, break; case UID_IS_CHARACTER: - memcpy(uid, (u8 *)(&e->uid), ACPIHID_UID_LEN - 1); - uid[ACPIHID_UID_LEN - 1] = '\0'; + memcpy(uid, &e->uid, e->uidl); + uid[e->uidl] = '\0'; break; default: -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/iova: Retry from last rb tree node if iova search fails
On 5/9/2020 12:25 AM, Vijayanand Jitta wrote: > > > On 5/7/2020 6:54 PM, Robin Murphy wrote: >> On 2020-05-06 9:01 pm, vji...@codeaurora.org wrote: >>> From: Vijayanand Jitta >>> >>> When ever a new iova alloc request comes iova is always searched >>> from the cached node and the nodes which are previous to cached >>> node. So, even if there is free iova space available in the nodes >>> which are next to the cached node iova allocation can still fail >>> because of this approach. >>> >>> Consider the following sequence of iova alloc and frees on >>> 1GB of iova space >>> >>> 1) alloc - 500MB >>> 2) alloc - 12MB >>> 3) alloc - 499MB >>> 4) free - 12MB which was allocated in step 2 >>> 5) alloc - 13MB >>> >>> After the above sequence we will have 12MB of free iova space and >>> cached node will be pointing to the iova pfn of last alloc of 13MB >>> which will be the lowest iova pfn of that iova space. Now if we get an >>> alloc request of 2MB we just search from cached node and then look >>> for lower iova pfn's for free iova and as they aren't any, iova alloc >>> fails though there is 12MB of free iova space. >> >> Yup, this could definitely do with improving. Unfortunately I think this >> particular implementation is slightly flawed... >> >>> To avoid such iova search failures do a retry from the last rb tree node >>> when iova search fails, this will search the entire tree and get an iova >>> if its available >>> >>> Signed-off-by: Vijayanand Jitta >>> --- >>> drivers/iommu/iova.c | 11 +++ >>> 1 file changed, 11 insertions(+) >>> >>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c >>> index 0e6a953..2985222 100644 >>> --- a/drivers/iommu/iova.c >>> +++ b/drivers/iommu/iova.c >>> @@ -186,6 +186,7 @@ static int __alloc_and_insert_iova_range(struct >>> iova_domain *iovad, >>> unsigned long flags; >>> unsigned long new_pfn; >>> unsigned long align_mask = ~0UL; >>> + bool retry = false; >>> if (size_aligned) >>> align_mask <<= fls_long(size - 1); >>> @@ -198,6 +199,8 @@ static int __alloc_and_insert_iova_range(struct >>> iova_domain *iovad, >>> curr = __get_cached_rbnode(iovad, limit_pfn); >>> curr_iova = rb_entry(curr, struct iova, node); >>> + >>> +retry_search: >>> do { >>> limit_pfn = min(limit_pfn, curr_iova->pfn_lo); >>> new_pfn = (limit_pfn - size) & align_mask; >>> @@ -207,6 +210,14 @@ static int __alloc_and_insert_iova_range(struct >>> iova_domain *iovad, >>> } while (curr && new_pfn <= curr_iova->pfn_hi); >>> if (limit_pfn < size || new_pfn < iovad->start_pfn) { >>> + if (!retry) { >>> + curr = rb_last(&iovad->rbroot); >> >> Why walk when there's an anchor node there already? However... >> >>> + curr_iova = rb_entry(curr, struct iova, node); >>> + limit_pfn = curr_iova->pfn_lo; >> >> ...this doesn't look right, as by now we've lost the original limit_pfn >> supplied by the caller, so are highly likely to allocate beyond the >> range our caller asked for. In fact AFAICS we'd start allocating from >> directly directly below the anchor node, beyond the end of the entire >> address space. >> >> The logic I was imagining we want here was something like the rapidly >> hacked up (and untested) diff below. >> >> Thanks, >> Robin. >> > > Thanks for your comments ,I have gone through below logic and I see some > issue with retry check as there could be case where alloc_lo is set to > some pfn other than start_pfn in that case we don't retry and there can > still be iova available. I understand its a hacked up version, I can > work on this. > > But how about we just store limit_pfn and get the node using that and > retry for once from that node, it would be similar to my patch just > correcting the curr node and limit_pfn update in retry check. do you see > any issue with this approach ? > > > Thanks, > Vijay. I found one issue with my earlier approach, where we search twice from cached node to the start_pfn, this can be avoided if we store the pfn_hi of the cached node make this as alloc_lo when we retry. I see the below diff also does the same, I have posted v2 version of the patch after going through the comments and the below diff. can you please review that. Thanks, Vijay >> ->8- >> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c >> index 0e6a9536eca6..3574c19272d6 100644 >> --- a/drivers/iommu/iova.c >> +++ b/drivers/iommu/iova.c >> @@ -186,6 +186,7 @@ static int __alloc_and_insert_iova_range(struct >> iova_domain *iovad, >> unsigned long flags; >> unsigned long new_pfn; >> unsigned long align_mask = ~0UL; >> + unsigned long alloc_hi, alloc_lo; >> >> if (size_aligned) >> align_mask <<= fls_long(size - 1); >> @@ -196,17 +197,27 @@ static int __alloc_and_insert_iova_range(struct >> iova_domain *iovad, >> size >= iovad->max32_alloc_size) >>
[PATCH v2] iommu/iova: Retry from last rb tree node if iova search fails
From: Vijayanand Jitta When ever a new iova alloc request comes iova is always searched from the cached node and the nodes which are previous to cached node. So, even if there is free iova space available in the nodes which are next to the cached node iova allocation can still fail because of this approach. Consider the following sequence of iova alloc and frees on 1GB of iova space 1) alloc - 500MB 2) alloc - 12MB 3) alloc - 499MB 4) free - 12MB which was allocated in step 2 5) alloc - 13MB After the above sequence we will have 12MB of free iova space and cached node will be pointing to the iova pfn of last alloc of 13MB which will be the lowest iova pfn of that iova space. Now if we get an alloc request of 2MB we just search from cached node and then look for lower iova pfn's for free iova and as they aren't any, iova alloc fails though there is 12MB of free iova space. To avoid such iova search failures do a retry from the last rb tree node when iova search fails, this will search the entire tree and get an iova if its available Signed-off-by: Vijayanand Jitta --- drivers/iommu/iova.c | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 0e6a953..7d82afc 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -184,8 +184,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad, struct rb_node *curr, *prev; struct iova *curr_iova; unsigned long flags; - unsigned long new_pfn; + unsigned long new_pfn, alloc_lo_new; unsigned long align_mask = ~0UL; + unsigned long alloc_hi = limit_pfn, alloc_lo = iovad->start_pfn; if (size_aligned) align_mask <<= fls_long(size - 1); @@ -198,15 +199,25 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad, curr = __get_cached_rbnode(iovad, limit_pfn); curr_iova = rb_entry(curr, struct iova, node); + alloc_lo_new = curr_iova->pfn_hi; + +retry: do { - limit_pfn = min(limit_pfn, curr_iova->pfn_lo); - new_pfn = (limit_pfn - size) & align_mask; + alloc_hi = min(alloc_hi, curr_iova->pfn_lo); + new_pfn = (alloc_hi - size) & align_mask; prev = curr; curr = rb_prev(curr); curr_iova = rb_entry(curr, struct iova, node); } while (curr && new_pfn <= curr_iova->pfn_hi); - if (limit_pfn < size || new_pfn < iovad->start_pfn) { + if (alloc_hi < size || new_pfn < alloc_lo) { + if (alloc_lo == iovad->start_pfn && alloc_lo_new < limit_pfn) { + alloc_hi = limit_pfn; + alloc_lo = alloc_lo_new; + curr = &iovad->anchor.node; + curr_iova = rb_entry(curr, struct iova, node); + goto retry; + } iovad->max32_alloc_size = size; goto iova32_full; } -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation 1.9.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu