Re: [PATCH v6 29/29] x86/tsc: Switch to perf-based hardlockup detector if TSC become unstable
On Tue, May 10, 2022 at 10:14:00PM +1000, Nicholas Piggin wrote: > Excerpts from Ricardo Neri's message of May 6, 2022 10:00 am: > > The HPET-based hardlockup detector relies on the TSC to determine if an > > observed NMI interrupt was originated by HPET timer. Hence, this detector > > can no longer be used with an unstable TSC. > > > > In such case, permanently stop the HPET-based hardlockup detector and > > start the perf-based detector. > > > > Cc: Andi Kleen > > Cc: Stephane Eranian > > Cc: "Ravi V. Shankar" > > Cc: iommu@lists.linux-foundation.org > > Cc: linuxppc-...@lists.ozlabs.org > > Cc: x...@kernel.org > > Suggested-by: Thomas Gleixner > > Reviewed-by: Tony Luck > > Signed-off-by: Ricardo Neri > > --- > > Changes since v5: > > * Relocated the delcaration of hardlockup_detector_switch_to_perf() to > >x86/nmi.h It does not depend on HPET. > > * Removed function stub. The shim hardlockup detector is always for x86. > > > > Changes since v4: > > * Added a stub version of hardlockup_detector_switch_to_perf() for > >!CONFIG_HPET_TIMER. (lkp) > > * Reconfigure the whole lockup detector instead of unconditionally > >starting the perf-based hardlockup detector. > > > > Changes since v3: > > * None > > > > Changes since v2: > > * Introduced this patch. 
> > > > Changes since v1: > > * N/A > > --- > > arch/x86/include/asm/nmi.h | 6 ++ > > arch/x86/kernel/tsc.c | 2 ++ > > arch/x86/kernel/watchdog_hld.c | 6 ++ > > 3 files changed, 14 insertions(+) > > > > diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h > > index 4a0d5b562c91..47752ff67d8b 100644 > > --- a/arch/x86/include/asm/nmi.h > > +++ b/arch/x86/include/asm/nmi.h > > @@ -63,4 +63,10 @@ void stop_nmi(void); > > void restart_nmi(void); > > void local_touch_nmi(void); > > > > +#ifdef CONFIG_X86_HARDLOCKUP_DETECTOR > > +void hardlockup_detector_switch_to_perf(void); > > +#else > > +static inline void hardlockup_detector_switch_to_perf(void) { } > > +#endif > > + > > #endif /* _ASM_X86_NMI_H */ > > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c > > index cc1843044d88..74772ffc79d1 100644 > > --- a/arch/x86/kernel/tsc.c > > +++ b/arch/x86/kernel/tsc.c > > @@ -1176,6 +1176,8 @@ void mark_tsc_unstable(char *reason) > > > > clocksource_mark_unstable(_tsc_early); > > clocksource_mark_unstable(_tsc); > > + > > + hardlockup_detector_switch_to_perf(); > > } > > > > EXPORT_SYMBOL_GPL(mark_tsc_unstable); > > diff --git a/arch/x86/kernel/watchdog_hld.c b/arch/x86/kernel/watchdog_hld.c > > index ef11f0af4ef5..7940977c6312 100644 > > --- a/arch/x86/kernel/watchdog_hld.c > > +++ b/arch/x86/kernel/watchdog_hld.c > > @@ -83,3 +83,9 @@ void watchdog_nmi_start(void) > > if (detector_type == X86_HARDLOCKUP_DETECTOR_HPET) > > hardlockup_detector_hpet_start(); > > } > > + > > +void hardlockup_detector_switch_to_perf(void) > > +{ > > + detector_type = X86_HARDLOCKUP_DETECTOR_PERF; > > Another possible problem along the same lines here, > isn't your watchdog still running at this point? And > it uses detector_type in the switch. > > > + lockup_detector_reconfigure(); > > Actually the detector_type switch is used in some > functions called by lockup_detector_reconfigure() > e.g., watchdog_nmi_stop, so this seems buggy even > without concurrent watchdog. 
Yes, this is true. I missed this race.

> Is this switching a good idea in general? The admin
> has asked for non-standard option because they want
> more PMU counters available and now it eats a
> counter potentially causing a problem rather than
> detecting one.

Agreed. A very valid point.

> I would rather just disable with a warning if it were
> up to me. If you *really* wanted to be fancy then
> allow admin to re-enable via proc maybe.

I think that in either case, /proc/sys/kernel/nmi_watchdog needs to be updated to reflect that the NMI watchdog has been disabled. That would require exposing other interfaces of the watchdog.

Thanks and BR,
Ricardo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
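To make the ordering problem raised in this reply concrete, here is a small userspace model (not kernel code). The struct and the model_* helpers are invented for illustration; they only mimic the fact that the stop and start paths both branch on detector_type. It assumes a single thread and ignores the concurrent-watchdog race entirely.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the names only loosely mirror the patch. */
enum detector_type { DETECTOR_PERF, DETECTOR_HPET };

struct watchdog_model {
	enum detector_type type;
	bool hpet_running;
	bool perf_running;
};

/* Start whichever detector the current type selects. */
static void model_start(struct watchdog_model *w)
{
	if (w->type == DETECTOR_HPET)
		w->hpet_running = true;
	else
		w->perf_running = true;
}

/* Stop whichever detector the current type selects. */
static void model_stop(struct watchdog_model *w)
{
	if (w->type == DETECTOR_HPET)
		w->hpet_running = false;
	else
		w->perf_running = false;
}

/* Ordering as in the patch: flip the type, then reconfigure (stop + start). */
static void switch_type_then_reconfigure(struct watchdog_model *w)
{
	w->type = DETECTOR_PERF;
	model_stop(w);	/* acts on perf, so the HPET detector is never stopped */
	model_start(w);
}

/* One possible alternative: stop under the old type, then flip, then start. */
static void stop_then_switch(struct watchdog_model *w)
{
	model_stop(w);
	w->type = DETECTOR_PERF;
	model_start(w);
}
```

In this model the first ordering leaves both detectors marked as running, while the second leaves only perf running. This is only a sketch of the sequencing issue; the thread itself leans toward disabling the detector with a warning rather than switching.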
Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops
Hi Jason,

On 2022/5/16 21:57, Jason Gunthorpe wrote:
> On Mon, May 16, 2022 at 12:22:08PM +0100, Robin Murphy wrote:
>> On 2022-05-16 02:57, Lu Baolu wrote:
>>> Each IOMMU driver must provide a blocking domain ops. If the hardware
>>> supports detaching domain from device, setting blocking domain equals
>>> detaching the existing domain from the device. Otherwise, an UNMANAGED
>>> domain without any mapping will be used instead.
>>
>> Unfortunately that's backwards - most of the implementations of .detach_dev
>> are disabling translation entirely, meaning the device ends up effectively
>> in passthrough rather than blocked.
>
> Ideally we'd convert the detach_dev of every driver into either
> a blocking or identity domain. The trick is knowing which is which..

I am still a bit puzzled about how the blocking_domain should be used when it is extended to support ->set_dev_pasid.

If it's a blocking domain, the IOMMU driver knows that setting the blocking domain to device pasid means detaching the existing one.

But if it's an identity domain, how could the IOMMU driver choose between:

 - setting the input domain to the pasid on device; or,
 - detaching the existing domain.

I have thought about the solutions below:

- Checking the domain types and dispatching them to different operations.
- Using different blocking domains for different types of domains.

But both look rough.

> Guessing going down the list:
>
>  apple dart - blocking, detach_dev calls apple_dart_hw_disable_dma() same as
>      IOMMU_DOMAIN_BLOCKED
>      [I wonder if this driver is wrong in other ways though because
>       I dont see a remove_streams in attach_dev]
>  exynos - this seems to disable the 'sysmmu' so I'm guessing this is
>      identity
>  iommu-vmsa - Comment says 'disable mmu translation' so I'm guessing this is
>      identity
>  mkt_v1 - Code looks similar to mkt, which is probably identity.
>  rkt - No idea
>  sprd - No idea
>  sun50i - This driver confusingly treats identity the same as
>      unmanaged, seems wrong, no idea.
>  amd - Not sure, clear_dte_entry() seems to set translation on but points
>      the PTE to 0 ? Based on the spec table 8 I would have expected
>      TV to be clear which would be blocking. Maybe a bug??
>  arm smmu qcomm - not sure
>  intel - blocking
>
> These don't support default domains, so detach_dev should return
> back to DMA API ownership, which is either identity or something weird:
>  fsl_pamu - identity due to the PPC use of dma direct
>  msm
>  mkt
>  omap
>  s390 - platform DMA ops
>  tegra-gart - Usually something called a GART would be 0 length once
>      disabled, guessing blocking?
>  tegra-smmu
>
> So, the approach here should be to go driver by driver and convert
> detach_dev to either identity, blocking or just delete it entirely,
> excluding the above 7 that don't support default domains. And get acks
> from the driver owners.

Agreed. There seems to be a long way to go. I am wondering if we could decouple this refactoring from my new SVA API work? We can easily switch .detach_dev_pasid to using blocking domain later.

Best regards,
baolu
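As a rough illustration of the first option mentioned in this reply (checking the domain type and dispatching), here is a userspace sketch. All names (model_domain_type, model_pasid_action, model_set_dev_pasid_action) are hypothetical and do not correspond to kernel symbols. It assumes the blocking domain really is a distinct type, which is exactly what the driver-by-driver audit above would have to establish.

```c
/* Hypothetical model types; not the kernel's struct iommu_domain. */
enum model_domain_type {
	MODEL_DOMAIN_BLOCKED,
	MODEL_DOMAIN_IDENTITY,
	MODEL_DOMAIN_UNMANAGED,
};

enum model_pasid_action {
	MODEL_PASID_DETACH,	/* tear down whatever is installed for the PASID */
	MODEL_PASID_ATTACH,	/* install the given domain for the PASID */
};

/* Decide what a set_dev_pasid-style callback would do for a domain type. */
static enum model_pasid_action
model_set_dev_pasid_action(enum model_domain_type type)
{
	switch (type) {
	case MODEL_DOMAIN_BLOCKED:
		return MODEL_PASID_DETACH;
	case MODEL_DOMAIN_IDENTITY:
	case MODEL_DOMAIN_UNMANAGED:
	default:
		return MODEL_PASID_ATTACH;
	}
}
```

The dispatch is only unambiguous if "blocked" and "identity" are different objects. If a driver's detach path actually leaves the device in passthrough, the same domain cannot mean both "detach" and "attach identity", which seems to be the core of the concern.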
Re: [PATCH 6/7] x86/boot/tboot: Move tboot_force_iommu() to Intel IOMMU
Hi Jason,

On 2022/5/17 02:06, Jason Gunthorpe wrote:
>> +static __init int tboot_force_iommu(void)
>> +{
>> +	if (!tboot_enabled())
>> +		return 0;
>> +
>> +	if (no_iommu || dmar_disabled)
>> +		pr_warn("Forcing Intel-IOMMU to enabled\n");
>
> Unrelated, but when we are in the special secure IOMMU modes, do we
> force ATS off? Specifically does the IOMMU reject TLPs that are marked
> as translated?

Good question. From the IOMMU point of view, I don't see a reason to force ATS off, but trusted boot involves lots of other things that I am not familiar with. Could anybody else help to answer?

Best regards,
baolu
Re: [PATCH v8 5/8] perf tool: Add support for HiSilicon PCIe Tune and Trace device driver
On 2022/5/16 22:20, Jonathan Cameron wrote: On Mon, 16 May 2022 20:52:20 +0800 Yicong Yang wrote: From: Qi Liu HiSilicon PCIe tune and trace device (PTT) could dynamically tune the PCIe link's events, and trace the TLP headers). This patch add support for PTT device in perf tool, so users could use 'perf record' to get TLP headers trace data. Signed-off-by: Qi Liu Signed-off-by: Yicong Yang One query inline. diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c index 384c7cfda0fd..297fffedf45e 100644 --- a/tools/perf/arch/arm/util/auxtrace.c +++ b/tools/perf/arch/arm/util/auxtrace.c ... static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus, int pmu_nr, struct evsel *evsel) { @@ -71,17 +120,21 @@ struct auxtrace_record { struct perf_pmu *cs_etm_pmu = NULL; struct perf_pmu **arm_spe_pmus = NULL; + struct perf_pmu **hisi_ptt_pmus = NULL; struct evsel *evsel; struct perf_pmu *found_etm = NULL; struct perf_pmu *found_spe = NULL; + struct perf_pmu *found_ptt = NULL; int auxtrace_event_cnt = 0; int nr_spes = 0; + int nr_ptts = 0; if (!evlist) return NULL; cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME); arm_spe_pmus = find_all_arm_spe_pmus(_spes, err); + hisi_ptt_pmus = find_all_hisi_ptt_pmus(_ptts, err); evlist__for_each_entry(evlist, evsel) { if (cs_etm_pmu && !found_etm) @@ -89,9 +142,13 @@ struct auxtrace_record if (arm_spe_pmus && !found_spe) found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel); + + if (arm_spe_pmus && !found_spe) if (hisi_ptt_pmus && !found_ptt) ? Otherwise, I'm not sure what the purpose of the checking against spe is. yes...it's a typo here, thanks for the reminder! 
Qi + found_ptt = find_pmu_for_event(hisi_ptt_pmus, nr_ptts, evsel); } free(arm_spe_pmus); + free(hisi_ptt_pmus); if (found_etm) auxtrace_event_cnt++; @@ -99,6 +156,9 @@ struct auxtrace_record if (found_spe) auxtrace_event_cnt++; + if (found_ptt) + auxtrace_event_cnt++; + if (auxtrace_event_cnt > 1) { pr_err("Concurrent AUX trace operation not currently supported\n"); *err = -EOPNOTSUPP; @@ -111,6 +171,9 @@ struct auxtrace_record #if defined(__aarch64__) if (found_spe) return arm_spe_recording_init(err, found_spe); + + if (found_ptt) + return hisi_ptt_recording_init(err, found_ptt); #endif . ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
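For reference, a userspace sketch of the loop with the guard corrected as agreed in this exchange (checking the PTT state rather than the SPE state). The model_pmu and model_evsel structs and the model_* functions are simplified stand-ins invented here, not the perf tool's real types, and the ETM lookup is left out for brevity.

```c
#include <stddef.h>

/* Minimal stand-ins for perf's struct perf_pmu and struct evsel. */
struct model_pmu { unsigned int type; };
struct model_evsel { unsigned int type; };

/* Return the PMU in pmus[] whose type matches the event, or NULL. */
static struct model_pmu *model_find_pmu_for_event(struct model_pmu **pmus,
						  int pmu_nr,
						  const struct model_evsel *evsel)
{
	int i;

	if (!pmus)
		return NULL;

	for (i = 0; i < pmu_nr; i++) {
		if (evsel->type == pmus[i]->type)
			return pmus[i];
	}

	return NULL;
}

/* Count how many distinct AUX trace sources the event list refers to. */
static int model_count_aux_sources(const struct model_evsel *evsels,
				   int nr_evsels,
				   struct model_pmu **spe_pmus, int nr_spes,
				   struct model_pmu **ptt_pmus, int nr_ptts)
{
	struct model_pmu *found_spe = NULL;
	struct model_pmu *found_ptt = NULL;
	int i;

	for (i = 0; i < nr_evsels; i++) {
		if (spe_pmus && !found_spe)
			found_spe = model_find_pmu_for_event(spe_pmus, nr_spes,
							     &evsels[i]);

		/* Guard on the PTT state, not the SPE state. */
		if (ptt_pmus && !found_ptt)
			found_ptt = model_find_pmu_for_event(ptt_pmus, nr_ptts,
							     &evsels[i]);
	}

	return (found_spe != NULL) + (found_ptt != NULL);
}
```

With the guard as originally posted, a PTT event would never be matched on a system without SPE PMUs, because the SPE pointer would be NULL and the PTT lookup would be skipped.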
Re: [PATCH v8 4/8] perf arm: Refactor event list iteration in auxtrace_record__init()
On 2022/5/17 0:29, John Garry wrote: On 16/05/2022 13:52, Yicong Yang wrote: As requested before, please mention "perf tool" in the commit subject "perf arm" is used referenced to previous commit, ok, will mention "perf tool" in the commit subject next time. Thanks, Qi From: Qi Liu Use find_pmu_for_event() to simplify logic in auxtrace_record__init(). Signed-off-by: Qi Liu Signed-off-by: Yicong Yang --- tools/perf/arch/arm/util/auxtrace.c | 53 ++--- 1 file changed, 34 insertions(+), 19 deletions(-) diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c index 5fc6a2a3dbc5..384c7cfda0fd 100644 --- a/tools/perf/arch/arm/util/auxtrace.c +++ b/tools/perf/arch/arm/util/auxtrace.c @@ -50,16 +50,32 @@ static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err) return arm_spe_pmus; } +static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus, + int pmu_nr, struct evsel *evsel) +{ + int i; + + if (!pmus) + return NULL; + + for (i = 0; i < pmu_nr; i++) { + if (evsel->core.attr.type == pmus[i]->type) + return pmus[i]; + } + + return NULL; +} + struct auxtrace_record *auxtrace_record__init(struct evlist *evlist, int *err) { - struct perf_pmu *cs_etm_pmu; + struct perf_pmu *cs_etm_pmu = NULL; + struct perf_pmu **arm_spe_pmus = NULL; struct evsel *evsel; - bool found_etm = false; + struct perf_pmu *found_etm = NULL; struct perf_pmu *found_spe = NULL; - struct perf_pmu **arm_spe_pmus = NULL; + int auxtrace_event_cnt = 0; int nr_spes = 0; - int i = 0; if (!evlist) return NULL; @@ -68,24 +84,23 @@ struct auxtrace_record arm_spe_pmus = find_all_arm_spe_pmus(_spes, err); evlist__for_each_entry(evlist, evsel) { - if (cs_etm_pmu && - evsel->core.attr.type == cs_etm_pmu->type) - found_etm = true; - - if (!nr_spes || found_spe) - continue; - - for (i = 0; i < nr_spes; i++) { - if (evsel->core.attr.type == arm_spe_pmus[i]->type) { - found_spe = arm_spe_pmus[i]; - break; - } - } + if (cs_etm_pmu && !found_etm) + found_etm = 
find_pmu_for_event(_etm_pmu, 1, evsel); + + if (arm_spe_pmus && !found_spe) + found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel); } + free(arm_spe_pmus); - if (found_etm && found_spe) { - pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n"); + if (found_etm) + auxtrace_event_cnt++; + + if (found_spe) + auxtrace_event_cnt++; + + if (auxtrace_event_cnt > 1) { + pr_err("Concurrent AUX trace operation not currently supported\n"); *err = -EOPNOTSUPP; return NULL; } . ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 4/8] perf arm: Refactor event list iteration in auxtrace_record__init()
Hi Jonathan, On 2022/5/16 22:17, Jonathan Cameron wrote: On Mon, 16 May 2022 20:52:19 +0800 Yicong Yang wrote: From: Qi Liu Use find_pmu_for_event() to simplify logic in auxtrace_record__init(). Possibly reword as "Add find_pmu_for_event() and use to simplify logic in auxtrace_record_init(). find_pmu_for_event() will be reused in subsequent patches." thanks, I'll modify the commit message next version. Thanks, Qi Signed-off-by: Qi Liu Signed-off-by: Yicong Yang FWIW as this isn't an area I know much about. It seems like a good cleanup and functionally equivalent. Reviewed-by: Jonathan Cameron --- tools/perf/arch/arm/util/auxtrace.c | 53 ++--- 1 file changed, 34 insertions(+), 19 deletions(-) diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c index 5fc6a2a3dbc5..384c7cfda0fd 100644 --- a/tools/perf/arch/arm/util/auxtrace.c +++ b/tools/perf/arch/arm/util/auxtrace.c @@ -50,16 +50,32 @@ static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err) return arm_spe_pmus; } +static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus, + int pmu_nr, struct evsel *evsel) +{ + int i; + + if (!pmus) + return NULL; + + for (i = 0; i < pmu_nr; i++) { + if (evsel->core.attr.type == pmus[i]->type) + return pmus[i]; + } + + return NULL; +} + struct auxtrace_record *auxtrace_record__init(struct evlist *evlist, int *err) { - struct perf_pmu *cs_etm_pmu; + struct perf_pmu *cs_etm_pmu = NULL; + struct perf_pmu **arm_spe_pmus = NULL; struct evsel *evsel; - bool found_etm = false; + struct perf_pmu *found_etm = NULL; struct perf_pmu *found_spe = NULL; - struct perf_pmu **arm_spe_pmus = NULL; + int auxtrace_event_cnt = 0; int nr_spes = 0; - int i = 0; if (!evlist) return NULL; @@ -68,24 +84,23 @@ struct auxtrace_record arm_spe_pmus = find_all_arm_spe_pmus(_spes, err); evlist__for_each_entry(evlist, evsel) { - if (cs_etm_pmu && - evsel->core.attr.type == cs_etm_pmu->type) - found_etm = true; - - if (!nr_spes || found_spe) - continue; - - 
for (i = 0; i < nr_spes; i++) { - if (evsel->core.attr.type == arm_spe_pmus[i]->type) { - found_spe = arm_spe_pmus[i]; - break; - } - } + if (cs_etm_pmu && !found_etm) + found_etm = find_pmu_for_event(_etm_pmu, 1, evsel); + + if (arm_spe_pmus && !found_spe) + found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel); } + free(arm_spe_pmus); - if (found_etm && found_spe) { - pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n"); + if (found_etm) + auxtrace_event_cnt++; + + if (found_spe) + auxtrace_event_cnt++; + + if (auxtrace_event_cnt > 1) { + pr_err("Concurrent AUX trace operation not currently supported\n"); *err = -EOPNOTSUPP; return NULL; } . ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6 00/21] Userspace P2PDMA with O_DIRECT NVMe devices
On 2022-05-16 16:31, Chaitanya Kulkarni wrote:
> Do you have any plans to re-spin this?

I didn't get any feedback this cycle, so there haven't been any changes. I'll probably do a rebase and resend after the merge window.

Logan
Re: [PATCH 6/7] x86/boot/tboot: Move tboot_force_iommu() to Intel IOMMU
Hi Jason,

On Mon, 16 May 2022 15:06:28 -0300, Jason Gunthorpe wrote:
> Unrelated, but when we are in the special secure IOMMU modes, do we
> force ATS off? Specifically does the IOMMU reject TLPs that are marked
> as translated?

Yes. The VT-d context entry has a Device TLB Enable bit; if it is 0, it means "Translation Requests (with or without PASID) and Translated Requests received and processed through this scalable-mode context-entry are blocked."

Thanks,
Jacob
Re: [PATCH v6 00/21] Userspace P2PDMA with O_DIRECT NVMe devices
On 4/7/22 08:46, Logan Gunthorpe wrote: > Hi, > > This patchset continues my work to add userspace P2PDMA access using > O_DIRECT NVMe devices. This posting contains some minor fixes and a > rebase onto v5.18-rc1 which contains cleanup from Christoph around > free_zone_device_page() that helps to enable this patchset. The > previous posting was here[1]. > > The patchset enables userspace P2PDMA by allowing userspace to mmap() > allocated chunks of the CMB. The resulting VMA can be passed only > to O_DIRECT IO on NVMe backed files or block devices. A flag is added > to GUP() in Patch <>, then Patches <> through <> wire this flag up based > on whether the block queue indicates P2PDMA support. Patches <> > through <> enable the CMB to be mapped into userspace by mmaping > the nvme char device. > > This is relatively straightforward, however the one significant > problem is that, presently, pci_p2pdma_map_sg() requires a homogeneous > SGL with all P2PDMA pages or all regular pages. Enhancing GUP to > support enforcing this rule would require a huge hack that I don't > expect would be all that pallatable. So the first 13 patches add > support for P2PDMA pages to dma_map_sg[table]() to the dma-direct > and dma-iommu implementations. Thus systems without an IOMMU plus > Intel and AMD IOMMUs are supported. (Other IOMMU implementations would > then be unsupported, notably ARM and PowerPC but support would be added > when they convert to dma-iommu). > > dma_map_sgtable() is preferred when dealing with P2PDMA memory as it > will return -EREMOTEIO when the DMA device cannot map specific P2PDMA > pages based on the existing rules in calc_map_type_and_dist(). > > The other issue is dma_unmap_sg() needs a flag to determine whether a > given dma_addr_t was mapped regularly or as a PCI bus address. To allow > this, a third flag is added to the page_link field in struct > scatterlist. This effectively means support for P2PDMA will now depend > on CONFIG_64BIT. > > Feedback welcome. 
> > This series is based on v5.18-rc1. A git branch is available here: > >https://github.com/sbates130272/linux-p2pmem/ p2pdma_user_cmb_v6 > > Thanks, > > Logan > > [1] lkml.kernel.org/r/20220128002614.6136-1-log...@deltatee.com > > -- Do you have any plans to re-spin this ? -ck ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 5/5] iommu/tegra-smmu: Support managed domains
On 5/12/22 22:00, Thierry Reding wrote:
> -277,7 +278,9 @@ static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
>  {
>  	struct tegra_smmu_as *as;
>
> -	if (type != IOMMU_DOMAIN_UNMANAGED)
> +	if (type != IOMMU_DOMAIN_UNMANAGED &&
> +	    type != IOMMU_DOMAIN_DMA &&
> +	    type != IOMMU_DOMAIN_IDENTITY)
>  		return NULL;

Shouldn't at least pre-210 SoCs be guarded from IOMMU_DOMAIN_DMA? I don't think that the DRM and VDE drivers will work as-is today.

--
Best regards,
Dmitry
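One way the guard asked about here could look, as a userspace sketch only. The allow_dma_domain flag and the model_* names are hypothetical; nothing in the quoted patch defines such a per-SoC field, and whether pre-210 SoCs need it is the open question in this reply.

```c
#include <stdbool.h>

/* Hypothetical model of the domain types named in the quoted hunk. */
enum model_domain_type {
	MODEL_UNMANAGED,
	MODEL_DMA,
	MODEL_IDENTITY,
	MODEL_BLOCKED,
};

/* Hypothetical per-SoC data; allow_dma_domain is an assumed flag. */
struct model_soc {
	bool allow_dma_domain;
};

/* Return true if a domain of this type may be allocated on this SoC. */
static bool model_domain_type_supported(const struct model_soc *soc,
					enum model_domain_type type)
{
	switch (type) {
	case MODEL_UNMANAGED:
	case MODEL_IDENTITY:
		return true;
	case MODEL_DMA:
		/* Only SoCs whose client drivers cope with DMA API domains. */
		return soc->allow_dma_domain;
	default:
		return false;
	}
}
```

A real driver would return NULL from its domain_alloc callback when a check like this fails; the sketch only models the decision.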
Re: [PATCH 7/7] iommu/vt-d: Move include/linux/intel_iommu.h under iommu
On Sat, May 14, 2022 at 09:43:22AM +0800, Lu Baolu wrote:
> This header file is private to the Intel IOMMU driver. Move it to the
> driver folder.
>
> Signed-off-by: Lu Baolu
> ---
>  include/linux/intel-iommu.h => drivers/iommu/intel/iommu.h | 0
>  drivers/iommu/intel/trace.h                                | 3 ++-
>  drivers/iommu/intel/cap_audit.c                            | 2 +-
>  drivers/iommu/intel/debugfs.c                              | 2 +-
>  drivers/iommu/intel/dmar.c                                 | 2 +-
>  drivers/iommu/intel/iommu.c                                | 2 +-
>  drivers/iommu/intel/irq_remapping.c                        | 2 +-
>  drivers/iommu/intel/pasid.c                                | 2 +-
>  drivers/iommu/intel/perf.c                                 | 2 +-
>  drivers/iommu/intel/svm.c                                  | 2 +-
>  MAINTAINERS                                                | 1 -
>  11 files changed, 10 insertions(+), 10 deletions(-)
>  rename include/linux/intel-iommu.h => drivers/iommu/intel/iommu.h (100%)

Reviewed-by: Jason Gunthorpe

Jason
Re: [PATCH 6/7] x86/boot/tboot: Move tboot_force_iommu() to Intel IOMMU
On Sat, May 14, 2022 at 09:43:21AM +0800, Lu Baolu wrote:
> tboot_force_iommu() is only called by the Intel IOMMU driver. Move the
> helper into that driver. No functional change intended.
>
> Signed-off-by: Lu Baolu
> ---
>  include/linux/tboot.h       |  2 --
>  arch/x86/kernel/tboot.c     | 15 ---
>  drivers/iommu/intel/iommu.c | 14 ++
>  3 files changed, 14 insertions(+), 17 deletions(-)

Reviewed-by: Jason Gunthorpe

> +static __init int tboot_force_iommu(void)
> +{
> +	if (!tboot_enabled())
> +		return 0;
> +
> +	if (no_iommu || dmar_disabled)
> +		pr_warn("Forcing Intel-IOMMU to enabled\n");

Unrelated, but when we are in the special secure IOMMU modes, do we force ATS off? Specifically does the IOMMU reject TLPs that are marked as translated?

Jason
Re: [PATCH 5/7] KVM: x86: Remove unnecessary include
On Sat, May 14, 2022 at 09:43:20AM +0800, Lu Baolu wrote:
> intel-iommu.h is not needed in kvm/x86 anymore. Remove its include.
>
> Signed-off-by: Lu Baolu
> ---
>  arch/x86/kvm/x86.c | 1 -
>  1 file changed, 1 deletion(-)

Reviewed-by: Jason Gunthorpe

Jason
Re: [PATCH 4/7] drm/i915: Remove unnecessary include
On Sat, May 14, 2022 at 09:43:19AM +0800, Lu Baolu wrote:
> intel-iommu.h is not needed in drm/i915 anymore. Remove its include.
>
> Signed-off-by: Lu Baolu
> ---
>  drivers/gpu/drm/i915/i915_drv.h                | 1 -
>  drivers/gpu/drm/i915/display/intel_display.c   | 1 -
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 1 -
>  3 files changed, 3 deletions(-)

Reviewed-by: Jason Gunthorpe

Jason
Re: [PATCH 3/7] iommu/vt-d: Remove unnecessary exported symbol
On Sat, May 14, 2022 at 09:43:18AM +0800, Lu Baolu wrote:
> The exported symbol intel_iommu_gfx_mapped is not used anywhere in the
> tree. Remove it to avoid dead code.
>
> Signed-off-by: Lu Baolu
> ---
>  include/linux/intel-iommu.h | 1 -
>  drivers/iommu/intel/iommu.c | 6 --
>  2 files changed, 7 deletions(-)

Reviewed-by: Jason Gunthorpe

Maybe this could be squashed into the prior patch.

Jason
Re: [PATCH 2/7] agp/intel: Use per device iommu check
On Sat, May 14, 2022 at 09:43:17AM +0800, Lu Baolu wrote:
> The IOMMU subsystem has already provided an interface to query whether
> the IOMMU hardware is enabled for a specific device. This changes the
> check from Intel specific intel_iommu_gfx_mapped (globally exported by
> the Intel IOMMU driver) to probing the presence of IOMMU on a specific
> device using the generic device_iommu_mapped().
>
> This follows commit cca084692394a ("drm/i915: Use per device iommu check")
> which converted drm/i915 driver to use device_iommu_mapped().
>
> Signed-off-by: Lu Baolu
> ---
>  drivers/char/agp/intel-gtt.c | 17 +++--
>  1 file changed, 7 insertions(+), 10 deletions(-)

Reviewed-by: Jason Gunthorpe

Jason
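To illustrate the difference between a global flag and the per-device query described in the commit message, here is a small userspace model. As far as I know, the kernel's device_iommu_mapped() amounts to checking whether the device has an IOMMU group; the model_device struct and model_device_iommu_mapped() below are simplified stand-ins written on that assumption, not the kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; only the field we need. */
struct model_device {
	void *iommu_group;	/* non-NULL when the device sits behind an IOMMU */
};

/* Per-device query, modelled on the assumed behaviour of device_iommu_mapped(). */
static bool model_device_iommu_mapped(const struct model_device *dev)
{
	return dev->iommu_group != NULL;
}
```

Unlike a single exported boolean, the answer here can differ between two devices in the same system, which is the point of moving the intel-gtt check to the per-device helper.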
Re: [PATCH 1/7] iommu/vt-d: Move trace/events/intel_iommu.h under iommu
On Sat, May 14, 2022 at 09:43:16AM +0800, Lu Baolu wrote:
> This header file is private to the Intel IOMMU driver. Move it to the
> driver folder.
>
> Signed-off-by: Lu Baolu
> ---
>  .../trace/events/intel_iommu.h => drivers/iommu/intel/trace.h | 4
>  drivers/iommu/intel/dmar.c                                    | 2 +-
>  drivers/iommu/intel/svm.c                                     | 2 +-
>  drivers/iommu/intel/trace.c                                   | 2 +-
>  4 files changed, 7 insertions(+), 3 deletions(-)
>  rename include/trace/events/intel_iommu.h => drivers/iommu/intel/trace.h (94%)

Reviewed-by: Jason Gunthorpe

Jason
Re: [PATCH v8 4/8] perf arm: Refactor event list iteration in auxtrace_record__init()
On 16/05/2022 13:52, Yicong Yang wrote: As requested before, please mention "perf tool" in the commit subject From: Qi Liu Use find_pmu_for_event() to simplify logic in auxtrace_record__init(). Signed-off-by: Qi Liu Signed-off-by: Yicong Yang --- tools/perf/arch/arm/util/auxtrace.c | 53 ++--- 1 file changed, 34 insertions(+), 19 deletions(-) diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c index 5fc6a2a3dbc5..384c7cfda0fd 100644 --- a/tools/perf/arch/arm/util/auxtrace.c +++ b/tools/perf/arch/arm/util/auxtrace.c @@ -50,16 +50,32 @@ static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err) return arm_spe_pmus; } +static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus, + int pmu_nr, struct evsel *evsel) +{ + int i; + + if (!pmus) + return NULL; + + for (i = 0; i < pmu_nr; i++) { + if (evsel->core.attr.type == pmus[i]->type) + return pmus[i]; + } + + return NULL; +} + struct auxtrace_record *auxtrace_record__init(struct evlist *evlist, int *err) { - struct perf_pmu *cs_etm_pmu; + struct perf_pmu *cs_etm_pmu = NULL; + struct perf_pmu **arm_spe_pmus = NULL; struct evsel *evsel; - bool found_etm = false; + struct perf_pmu *found_etm = NULL; struct perf_pmu *found_spe = NULL; - struct perf_pmu **arm_spe_pmus = NULL; + int auxtrace_event_cnt = 0; int nr_spes = 0; - int i = 0; if (!evlist) return NULL; @@ -68,24 +84,23 @@ struct auxtrace_record arm_spe_pmus = find_all_arm_spe_pmus(_spes, err); evlist__for_each_entry(evlist, evsel) { - if (cs_etm_pmu && - evsel->core.attr.type == cs_etm_pmu->type) - found_etm = true; - - if (!nr_spes || found_spe) - continue; - - for (i = 0; i < nr_spes; i++) { - if (evsel->core.attr.type == arm_spe_pmus[i]->type) { - found_spe = arm_spe_pmus[i]; - break; - } - } + if (cs_etm_pmu && !found_etm) + found_etm = find_pmu_for_event(_etm_pmu, 1, evsel); + + if (arm_spe_pmus && !found_spe) + found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel); } + free(arm_spe_pmus); - if 
(found_etm && found_spe) { - pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n"); + if (found_etm) + auxtrace_event_cnt++; + + if (found_spe) + auxtrace_event_cnt++; + + if (auxtrace_event_cnt > 1) { + pr_err("Concurrent AUX trace operation not currently supported\n"); *err = -EOPNOTSUPP; return NULL; } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 1/9] dt-bindings: host1x: Add iommu-map property
On Mon, 16 May 2022 11:52:50 +0300, cyn...@kapsi.fi wrote:
> From: Mikko Perttunen
>
> Add schema information for specifying context stream IDs. This uses
> the standard iommu-map property.
>
> Signed-off-by: Mikko Perttunen
> Reviewed-by: Robin Murphy
> ---
> v3:
> * New patch
> v4:
> * Remove memory-contexts subnode.
> ---
>  .../bindings/display/tegra/nvidia,tegra20-host1x.yaml | 5 +
>  1 file changed, 5 insertions(+)
>

Acked-by: Rob Herring
Re: [PATCH v8 3/8] hwtracing: hisi_ptt: Add tune function support for HiSilicon PCIe Tune and Trace device
On 16/05/2022 13:52, Yicong Yang wrote: Add tune function for the HiSilicon Tune and Trace device. The interface of tune is exposed through sysfs attributes of PTT PMU device. Signed-off-by: Yicong Yang Reviewed-by: Jonathan Cameron Apart from a comment on preferential style: Reviewed-by: John Garry --- drivers/hwtracing/ptt/hisi_ptt.c | 157 +++ drivers/hwtracing/ptt/hisi_ptt.h | 23 + 2 files changed, 180 insertions(+) diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c index ef25ce98f664..c3fdb9bfb1b4 100644 --- a/drivers/hwtracing/ptt/hisi_ptt.c +++ b/drivers/hwtracing/ptt/hisi_ptt.c @@ -25,6 +25,161 @@ /* Dynamic CPU hotplug state used by PTT */ static enum cpuhp_state hisi_ptt_pmu_online; +static bool hisi_ptt_wait_tuning_finish(struct hisi_ptt *hisi_ptt) +{ + u32 val; + + return !readl_poll_timeout(hisi_ptt->iobase + HISI_PTT_TUNING_INT_STAT, + val, !(val & HISI_PTT_TUNING_INT_STAT_MASK), + HISI_PTT_WAIT_POLL_INTERVAL_US, + HISI_PTT_WAIT_TUNE_TIMEOUT_US); +} + +static int hisi_ptt_tune_data_get(struct hisi_ptt *hisi_ptt, + u32 event, u16 *data) this only has 1x caller so may inline it +{ + u32 reg; + + reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL); + reg &= ~(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB); + reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB, + event); + writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL); + + /* Write all 1 to indicates it's the read process */ + writel(~0U, hisi_ptt->iobase + HISI_PTT_TUNING_DATA); + + if (!hisi_ptt_wait_tuning_finish(hisi_ptt)) + return -ETIMEDOUT; + + reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_DATA); + reg &= HISI_PTT_TUNING_DATA_VAL_MASK; + *data = FIELD_GET(HISI_PTT_TUNING_DATA_VAL_MASK, reg); + + return 0; +} + +static int hisi_ptt_tune_data_set(struct hisi_ptt *hisi_ptt, + u32 event, u16 data) again only 1x caller +{ + u32 reg; + + reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL); + reg &= ~(HISI_PTT_TUNING_CTRL_CODE | 
HISI_PTT_TUNING_CTRL_SUB); + reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB, + event); + writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL); + + writel(FIELD_PREP(HISI_PTT_TUNING_DATA_VAL_MASK, data), + hisi_ptt->iobase + HISI_PTT_TUNING_DATA); + + if (!hisi_ptt_wait_tuning_finish(hisi_ptt)) + return -ETIMEDOUT; + + return 0; +} + +static ssize_t hisi_ptt_tune_attr_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev)); + struct dev_ext_attribute *ext_attr; + struct hisi_ptt_tune_desc *desc; + int ret; + u16 val; + + ext_attr = container_of(attr, struct dev_ext_attribute, attr); + desc = ext_attr->var; + + mutex_lock(_ptt->tune_lock); + ret = hisi_ptt_tune_data_get(hisi_ptt, desc->event_code, ); + mutex_unlock(_ptt->tune_lock); + + if (ret) + return ret; + + return sysfs_emit(buf, "%u\n", val); +} + +static ssize_t hisi_ptt_tune_attr_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev)); + struct dev_ext_attribute *ext_attr; + struct hisi_ptt_tune_desc *desc; + int ret; + u16 val; + + ext_attr = container_of(attr, struct dev_ext_attribute, attr); + desc = ext_attr->var; + + if (kstrtou16(buf, 10, )) + return -EINVAL; + + mutex_lock(_ptt->tune_lock); + ret = hisi_ptt_tune_data_set(hisi_ptt, desc->event_code, val); + mutex_unlock(_ptt->tune_lock); + + if (ret) + return ret; + + return count; +} + +#define HISI_PTT_TUNE_ATTR(_name, _val, _show, _store) \ + static struct hisi_ptt_tune_desc _name##_desc = { \ + .name = #_name, \ + .event_code = _val, \ + }; \ + static struct dev_ext_attribute hisi_ptt_##_name##_attr = { \ + .attr = __ATTR(_name, 0600, _show, _store), \ + .var= &_name##_desc,\ + } + +#define HISI_PTT_TUNE_ATTR_COMMON(_name, _val) \ + HISI_PTT_TUNE_ATTR(_name,
Re: [PATCH v8 2/8] hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device
On 16/05/2022 13:52, Yicong Yang wrote: HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex integrated Endpoint(RCiEP) device, providing the capability to dynamically monitor and tune the PCIe traffic and trace the TLP headers. Add the driver for the device to enable the trace function. Register PMU device of PTT trace, then users can use trace through perf command. The driver makes use of perf AUX trace function and support the following events to configure the trace: - filter: select Root port or Endpoint to trace - type: select the type of traced TLP headers - direction: select the direction of traced TLP headers - format: select the data format of the traced TLP headers This patch initially add a basic driver of PTT trace. Initially add basic trace support. Signed-off-by: Yicong Yang Generally this looks ok, apart from nitpicking below, so, FWIW: Reviewed-by: John Garry --- drivers/Makefile | 1 + drivers/hwtracing/Kconfig| 2 + drivers/hwtracing/ptt/Kconfig| 12 + drivers/hwtracing/ptt/Makefile | 2 + drivers/hwtracing/ptt/hisi_ptt.c | 964 +++ drivers/hwtracing/ptt/hisi_ptt.h | 178 ++ 6 files changed, 1159 insertions(+) create mode 100644 drivers/hwtracing/ptt/Kconfig create mode 100644 drivers/hwtracing/ptt/Makefile create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h diff --git a/drivers/Makefile b/drivers/Makefile index 020780b6b4d2..662d50599467 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -175,6 +175,7 @@ obj-$(CONFIG_USB4) += thunderbolt/ obj-$(CONFIG_CORESIGHT) += hwtracing/coresight/ obj-y += hwtracing/intel_th/ obj-$(CONFIG_STM) += hwtracing/stm/ +obj-$(CONFIG_HISI_PTT) += hwtracing/ptt/ obj-$(CONFIG_ANDROID) += android/ obj-$(CONFIG_NVMEM) += nvmem/ obj-$(CONFIG_FPGA)+= fpga/ diff --git a/drivers/hwtracing/Kconfig b/drivers/hwtracing/Kconfig index 13085835a636..911ee977103c 100644 --- a/drivers/hwtracing/Kconfig +++ b/drivers/hwtracing/Kconfig @@ -5,4 +5,6 @@ source 
"drivers/hwtracing/stm/Kconfig" source "drivers/hwtracing/intel_th/Kconfig" +source "drivers/hwtracing/ptt/Kconfig" + endmenu diff --git a/drivers/hwtracing/ptt/Kconfig b/drivers/hwtracing/ptt/Kconfig new file mode 100644 index ..6d46a09ffeb9 --- /dev/null +++ b/drivers/hwtracing/ptt/Kconfig @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: GPL-2.0-only +config HISI_PTT + tristate "HiSilicon PCIe Tune and Trace Device" + depends on ARM64 || (COMPILE_TEST && 64BIT) + depends on PCI && HAS_DMA && HAS_IOMEM && PERF_EVENTS + help + HiSilicon PCIe Tune and Trace device exists as a PCIe RCiEP + device, and it provides support for PCIe traffic tuning and + tracing TLP headers to the memory. + + This driver can also be built as a module. If so, the module + will be called hisi_ptt. diff --git a/drivers/hwtracing/ptt/Makefile b/drivers/hwtracing/ptt/Makefile new file mode 100644 index ..908c09a98161 --- /dev/null +++ b/drivers/hwtracing/ptt/Makefile @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_HISI_PTT) += hisi_ptt.o diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c new file mode 100644 index ..ef25ce98f664 --- /dev/null +++ b/drivers/hwtracing/ptt/hisi_ptt.c @@ -0,0 +1,964 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Driver for HiSilicon PCIe tune and trace device + * + * Copyright (c) 2022 HiSilicon Technologies Co., Ltd. 
+ * Author: Yicong Yang + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "hisi_ptt.h" + +/* Dynamic CPU hotplug state used by PTT */ +static enum cpuhp_state hisi_ptt_pmu_online; + +static u16 hisi_ptt_get_filter_val(u16 devid, bool is_port) +{ + if (is_port) + return BIT(HISI_PCIE_CORE_PORT_ID(devid & 0xff)); + + return devid; +} + +static bool hisi_ptt_wait_trace_hw_idle(struct hisi_ptt *hisi_ptt) +{ + u32 val; + + return !readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_STS, + val, val & HISI_PTT_TRACE_IDLE, + HISI_PTT_WAIT_POLL_INTERVAL_US, + HISI_PTT_WAIT_TRACE_TIMEOUT_US); +} + +static void hisi_ptt_wait_dma_reset_done(struct hisi_ptt *hisi_ptt) +{ + u32 val; + + readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_WR_STS, + val, !val, HISI_PTT_RESET_POLL_INTERVAL_US, + HISI_PTT_RESET_TIMEOUT_US); +} + +static void hisi_ptt_trace_end(struct hisi_ptt *hisi_ptt) +{ + writel(0, hisi_ptt->iobase +
Re: [PATCH 1/2] dt-bindings: mediatek: Add bindings for MT6795 M4U
On Fri, 13 May 2022 17:14:10 +0200, AngeloGioacchino Del Regno wrote:
> Add bindings for the MediaTek Helio X10 (MT6795) IOMMU/M4U.
>
> Signed-off-by: AngeloGioacchino Del Regno
> ---
>  .../bindings/iommu/mediatek,iommu.yaml        |  3 +
>  include/dt-bindings/memory/mt6795-larb-port.h | 96 +++
>  2 files changed, 99 insertions(+)
>  create mode 100644 include/dt-bindings/memory/mt6795-larb-port.h
>

Acked-by: Rob Herring
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 6/8] perf tool: Add support for parsing HiSilicon PCIe Trace packet
On Mon, 16 May 2022 20:52:21 +0800 Yicong Yang wrote:
> From: Qi Liu
>
> Add support for using 'perf report --dump-raw-trace' to parse PTT packet.
>
> Example usage:
>
> Output will contain raw PTT data and its textual representation, such
> as:
>
> 0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x40 offset: 0
> ref: 0xa5d50c725 idx: 0 tid: -1 cpu: 0
> .
> . ... HISI PTT data: size 4194304 bytes
> . : 00 00 00 00 Prefix
> . 0004: 08 20 00 60 Header DW0
> . 0008: ff 02 00 01 Header DW1
> . 000c: 20 08 00 00 Header DW2
> . 0010: 10 e7 44 ab Header DW3
> . 0014: 2a a8 1e 01 Time
> . 0020: 00 00 00 00 Prefix
> . 0024: 01 00 00 60 Header DW0
> . 0028: 0f 1e 00 01 Header DW1
> . 002c: 04 00 00 00 Header DW2
> . 0030: 40 00 81 02 Header DW3
> . 0034: ee 02 00 00 Time
>
>
> Signed-off-by: Qi Liu
> Signed-off-by: Yicong Yang

From the point of view of a reviewer who doesn't know this code well,
this all looks sensible. One trivial comment inline.

Thanks,

Jonathan

> diff --git a/tools/perf/util/hisi-ptt.c b/tools/perf/util/hisi-ptt.c
> new file mode 100644
> index ..2afc1a663c2a
> --- /dev/null
> +
> +static void hisi_ptt_free(struct perf_session *session)
> +{
> +	struct hisi_ptt *ptt = container_of(session->auxtrace, struct hisi_ptt,
> +					    auxtrace);
> +
> +	session->auxtrace = NULL;
> +	free(ptt);
> +}
> +
> +static bool hisi_ptt_evsel_is_auxtrace(struct perf_session *session,
> +				       struct evsel *evsel)
> +{
> +	struct hisi_ptt *ptt = container_of(session->auxtrace, struct hisi_ptt, auxtrace);

Check for consistent wrapping of lines like this. This doesn't match the
one just above.
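The dump quoted in this message suggests a fixed record layout: records start 0x20 bytes apart, and six 32-bit words are labelled (Prefix, Header DW0 to DW3, Time) at offsets 0x00 to 0x14. A minimal C sketch of mapping a byte offset to its label follows. The 0x20 stride is an assumption read off the dump, the words at 0x18 and 0x1c are left unlabelled because the dump does not show them, and `ptt_word_index()` / `ptt_field_name()` are hypothetical names rather than functions from the perf tool.

```c
#include <stddef.h>

/* Assumption: records are 0x20 bytes apart, as the offsets in the dump suggest. */
#define PTT_RECORD_STRIDE	0x20
#define PTT_WORD_SIZE		4

/* Labels taken from the quoted dump; words at 0x18 and 0x1c are not shown there. */
static const char * const ptt_labels[] = {
	"Prefix", "Header DW0", "Header DW1", "Header DW2", "Header DW3", "Time",
};

/* Hypothetical helper: index of the labelled word at this offset, or -1. */
static int ptt_word_index(size_t offset)
{
	size_t word;

	if (offset % PTT_WORD_SIZE)
		return -1;	/* not on a 32-bit word boundary */

	word = (offset % PTT_RECORD_STRIDE) / PTT_WORD_SIZE;
	if (word >= sizeof(ptt_labels) / sizeof(ptt_labels[0]))
		return -1;	/* a word the dump does not label */

	return (int)word;
}

/* Hypothetical helper: label for a byte offset, or NULL if there is none. */
static const char *ptt_field_name(size_t offset)
{
	int idx = ptt_word_index(offset);

	return idx < 0 ? NULL : ptt_labels[idx];
}
```

Called with 0x24, the helper lands on the second word of the second record, which the dump labels "Header DW0".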
[PATCH v5 1/2] iommu/io-pgtable-arm-v7s: Add a quirk to allow pgtable PA up to 35bit
From: Yunfei Wang The calling to kmem_cache_alloc for level 2 pgtable allocation may run in atomic context, and it fails sometimes when DMA32 zone runs out of memory. Since Mediatek IOMMU hardware support at most 35bit PA in pgtable, so add a quirk to allow the PA of pgtables support up to bit35. Signed-off-by: Ning Li Signed-off-by: Yunfei Wang --- drivers/iommu/io-pgtable-arm-v7s.c | 56 ++ include/linux/io-pgtable.h | 15 +--- 2 files changed, 52 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c index be066c1503d3..668500798fb9 100644 --- a/drivers/iommu/io-pgtable-arm-v7s.c +++ b/drivers/iommu/io-pgtable-arm-v7s.c @@ -149,6 +149,10 @@ #define ARM_V7S_TTBR_IRGN_ATTR(attr) \ attr) & 0x1) << 6) | (((attr) & 0x2) >> 1)) +/* Mediatek extend ttbr bits[2:0] for PA bits[34:32] */ +#define ARM_V7S_TTBR_35BIT_PA(ttbr, pa) \ + ((ttbr & ((u32)(~0U << 3))) | ((pa & GENMASK_ULL(34, 32)) >> 32)) + #ifdef CONFIG_ZONE_DMA32 #define ARM_V7S_TABLE_GFP_DMA GFP_DMA32 #define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA32 @@ -182,14 +186,8 @@ static bool arm_v7s_is_mtk_enabled(struct io_pgtable_cfg *cfg) (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT); } -static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl, - struct io_pgtable_cfg *cfg) +static arm_v7s_iopte to_iopte_mtk(phys_addr_t paddr, arm_v7s_iopte pte) { - arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl); - - if (!arm_v7s_is_mtk_enabled(cfg)) - return pte; - if (paddr & BIT_ULL(32)) pte |= ARM_V7S_ATTR_MTK_PA_BIT32; if (paddr & BIT_ULL(33)) @@ -199,6 +197,17 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl, return pte; } +static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl, + struct io_pgtable_cfg *cfg) +{ + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl); + + if (!arm_v7s_is_mtk_enabled(cfg)) + return pte; + + return to_iopte_mtk(paddr, pte); +} + static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl, struct io_pgtable_cfg 
*cfg) { @@ -234,6 +243,7 @@ static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl, static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, struct arm_v7s_io_pgtable *data) { + gfp_t gfp_l1 = __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA; struct io_pgtable_cfg *cfg = >iop.cfg; struct device *dev = cfg->iommu_dev; phys_addr_t phys; @@ -241,9 +251,11 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, size_t size = ARM_V7S_TABLE_SIZE(lvl, cfg); void *table = NULL; + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT) + gfp_l1 = __GFP_ZERO; + if (lvl == 1) - table = (void *)__get_free_pages( - __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size)); + table = (void *)__get_free_pages(gfp_l1, get_order(size)); else if (lvl == 2) table = kmem_cache_zalloc(data->l2_tables, gfp); @@ -251,7 +263,8 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, return NULL; phys = virt_to_phys(table); - if (phys != (arm_v7s_iopte)phys) { + if (phys != (arm_v7s_iopte)phys && + !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)) { /* Doesn't fit in PTE */ dev_err(dev, "Page table does not fit in PTE: %pa", ); goto out_free; @@ -457,9 +470,14 @@ static arm_v7s_iopte arm_v7s_install_table(arm_v7s_iopte *table, arm_v7s_iopte curr, struct io_pgtable_cfg *cfg) { + phys_addr_t phys = virt_to_phys(table); arm_v7s_iopte old, new; - new = virt_to_phys(table) | ARM_V7S_PTE_TYPE_TABLE; + new = phys | ARM_V7S_PTE_TYPE_TABLE; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT) + new = to_iopte_mtk(phys, new); + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS) new |= ARM_V7S_ATTR_NS_TABLE; @@ -778,7 +796,9 @@ static phys_addr_t arm_v7s_iova_to_phys(struct io_pgtable_ops *ops, static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie) { + slab_flags_t slab_flag = ARM_V7S_TABLE_SLAB_FLAGS; struct arm_v7s_io_pgtable *data; + phys_addr_t paddr; if (cfg->ias > (arm_v7s_is_mtk_enabled(cfg) ? 
34 : ARM_V7S_ADDR_BITS)) return NULL; @@ -788,7 +808,8 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg, if (cfg->quirks
[PATCH v5 2/2] iommu/mediatek: Allow page table PA up to 35bit
From: Yunfei Wang

Add support for the IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT quirk, so that
the page table PA can be up to 35 bits wide and is no longer restricted
to ZONE_DMA32.

Signed-off-by: Ning Li
Signed-off-by: Yunfei Wang
---
 drivers/iommu/mtk_iommu.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 6fd75a60abd6..1b9a876ef271 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -33,6 +33,7 @@
 #define REG_MMU_PT_BASE_ADDR			0x000
 #define MMU_PT_ADDR_MASK			GENMASK(31, 7)
+#define MMU_PT_ADDR_2_0_MASK			GENMASK(2, 0)
 
 #define REG_MMU_INVALIDATE			0x020
 #define F_ALL_INVLD				0x2
@@ -118,6 +119,7 @@
 #define WR_THROT_EN			BIT(6)
 #define HAS_LEGACY_IVRP_PADDR		BIT(7)
 #define IOVA_34_EN			BIT(8)
+#define PGTABLE_PA_35_EN		BIT(9)
 
 #define MTK_IOMMU_HAS_FLAG(pdata, _x) \
 		((((pdata)->flags) & (_x)) == (_x))
@@ -401,6 +403,9 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom,
 		.iommu_dev = data->dev,
 	};
 
+	if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN))
+		dom->cfg.quirks |= IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT;
+
 	if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE))
 		dom->cfg.oas = data->enable_4GB ? 33 : 32;
 	else
@@ -450,6 +455,7 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
 	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
 	struct device *m4udev = data->dev;
 	int ret, domid;
+	u32 regval;
 
 	domid = mtk_iommu_get_domain_id(dev, data->plat_data);
 	if (domid < 0)
@@ -472,8 +478,14 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
 			return ret;
 		}
 		data->m4u_dom = dom;
-		writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
-		       data->base + REG_MMU_PT_BASE_ADDR);
+
+		/* Bits[6:3] are invalid for mediatek platform */
+		if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN))
+			regval = (dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK) |
+				 (dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_2_0_MASK);
+		else
+			regval = dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK;
+		writel(regval, data->base + REG_MMU_PT_BASE_ADDR);
 
 		pm_runtime_put(m4udev);
 	}
@@ -987,6 +999,7 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
 	struct mtk_iommu_suspend_reg *reg = &data->reg;
 	struct mtk_iommu_domain *m4u_dom = data->m4u_dom;
 	void __iomem *base = data->base;
+	u32 regval;
 	int ret;
 
 	ret = clk_prepare_enable(data->bclk);
@@ -1010,7 +1023,14 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
 	writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL);
 	writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
 	writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG);
-	writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK, base + REG_MMU_PT_BASE_ADDR);
+
+	/* Bits[6:3] are invalid for mediatek platform */
+	if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN))
+		regval = (m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK) |
+			 (m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_2_0_MASK);
+	else
+		regval = m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK;
+	writel(regval, base + REG_MMU_PT_BASE_ADDR);
 
 	/*
 	 * Users may allocate dma buffer before they call pm_runtime_get,
@@ -1038,7 +1058,8 @@ static const struct mtk_iommu_plat_data mt2712_data = {
 static const struct mtk_iommu_plat_data mt6779_data = {
 	.m4u_plat = M4U_MT6779,
-	.flags = HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN,
+	.flags = HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN |
+		 PGTABLE_PA_35_EN,
 	.inv_sel_reg = REG_MMU_INV_SEL_GEN2,
 	.iova_region = single_domain,
 	.iova_region_nr = ARRAY_SIZE(single_domain),
-- 
2.18.0
Re: [PATCH v8 5/8] perf tool: Add support for HiSilicon PCIe Tune and Trace device driver
On Mon, 16 May 2022 20:52:20 +0800 Yicong Yang wrote:
> From: Qi Liu
>
> HiSilicon PCIe tune and trace device (PTT) could dynamically tune
> the PCIe link's events, and trace the TLP headers.
>
> This patch adds support for PTT device in perf tool, so users could
> use 'perf record' to get TLP headers trace data.
>
> Signed-off-by: Qi Liu
> Signed-off-by: Yicong Yang

One query inline.

> diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
> index 384c7cfda0fd..297fffedf45e 100644
> --- a/tools/perf/arch/arm/util/auxtrace.c
> +++ b/tools/perf/arch/arm/util/auxtrace.c

...

>  static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
>  					   int pmu_nr, struct evsel *evsel)
>  {
> @@ -71,17 +120,21 @@ struct auxtrace_record
>  {
>  	struct perf_pmu *cs_etm_pmu = NULL;
>  	struct perf_pmu **arm_spe_pmus = NULL;
> +	struct perf_pmu **hisi_ptt_pmus = NULL;
>  	struct evsel *evsel;
>  	struct perf_pmu *found_etm = NULL;
>  	struct perf_pmu *found_spe = NULL;
> +	struct perf_pmu *found_ptt = NULL;
>  	int auxtrace_event_cnt = 0;
>  	int nr_spes = 0;
> +	int nr_ptts = 0;
>
>  	if (!evlist)
>  		return NULL;
>
>  	cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
>  	arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
> +	hisi_ptt_pmus = find_all_hisi_ptt_pmus(&nr_ptts, err);
>
>  	evlist__for_each_entry(evlist, evsel) {
>  		if (cs_etm_pmu && !found_etm)
> @@ -89,9 +142,13 @@ struct auxtrace_record
>
>  		if (arm_spe_pmus && !found_spe)
>  			found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel);
> +
> +		if (arm_spe_pmus && !found_spe)

if (hisi_ptt_pmus && !found_ptt) ?

Otherwise, I'm not sure what the purpose of the checking against spe is.

> +			found_ptt = find_pmu_for_event(hisi_ptt_pmus, nr_ptts, evsel);
>  	}
>
>  	free(arm_spe_pmus);
> +	free(hisi_ptt_pmus);
>
>  	if (found_etm)
>  		auxtrace_event_cnt++;
> @@ -99,6 +156,9 @@ struct auxtrace_record
>  	if (found_spe)
>  		auxtrace_event_cnt++;
>
> +	if (found_ptt)
> +		auxtrace_event_cnt++;
> +
>  	if (auxtrace_event_cnt > 1) {
>  		pr_err("Concurrent AUX trace operation not currently supported\n");
>  		*err = -EOPNOTSUPP;
> @@ -111,6 +171,9 @@ struct auxtrace_record
>  #if defined(__aarch64__)
>  	if (found_spe)
>  		return arm_spe_recording_init(err, found_spe);
> +
> +	if (found_ptt)
> +		return hisi_ptt_recording_init(err, found_ptt);
>  #endif
>
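To make the query in this review concrete, here is a small userspace sketch of the two guard conditions. It only models the boolean test that decides whether the PTT lookup runs for an event; `struct pmu` and both function names are hypothetical stand-ins, not perf tool code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for struct perf_pmu; only used to give the pointers a type. */
struct pmu {
	unsigned int type;
};

/* Guard as posted in the quoted hunk: the PTT lookup is gated on SPE state. */
static bool ptt_lookup_as_posted(struct pmu **arm_spe_pmus,
				 const struct pmu *found_spe)
{
	return arm_spe_pmus && !found_spe;
}

/* Guard as suggested in the review: the PTT lookup is gated on PTT state. */
static bool ptt_lookup_as_suggested(struct pmu **hisi_ptt_pmus,
				    const struct pmu *found_ptt)
{
	return hisi_ptt_pmus && !found_ptt;
}
```

On a system with PTT PMUs but no SPE PMU (`arm_spe_pmus == NULL`), the posted guard never lets the PTT lookup run, while the suggested guard runs it until a match is found.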
Re: [PATCH v8 4/8] perf arm: Refactor event list iteration in auxtrace_record__init()
On Mon, 16 May 2022 20:52:19 +0800 Yicong Yang wrote:
> From: Qi Liu
>
> Use find_pmu_for_event() to simplify logic in auxtrace_record__init().

Possibly reword as
"Add find_pmu_for_event() and use it to simplify logic in
auxtrace_record__init(). find_pmu_for_event() will be reused in
subsequent patches."

>
> Signed-off-by: Qi Liu
> Signed-off-by: Yicong Yang

FWIW, as this isn't an area I know much about, it seems like a good
cleanup and functionally equivalent.

Reviewed-by: Jonathan Cameron

> ---
>  tools/perf/arch/arm/util/auxtrace.c | 53 ++---
>  1 file changed, 34 insertions(+), 19 deletions(-)
>
> diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
> index 5fc6a2a3dbc5..384c7cfda0fd 100644
> --- a/tools/perf/arch/arm/util/auxtrace.c
> +++ b/tools/perf/arch/arm/util/auxtrace.c
> @@ -50,16 +50,32 @@ static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
>  	return arm_spe_pmus;
>  }
>
> +static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
> +					   int pmu_nr, struct evsel *evsel)
> +{
> +	int i;
> +
> +	if (!pmus)
> +		return NULL;
> +
> +	for (i = 0; i < pmu_nr; i++) {
> +		if (evsel->core.attr.type == pmus[i]->type)
> +			return pmus[i];
> +	}
> +
> +	return NULL;
> +}
> +
>  struct auxtrace_record
>  *auxtrace_record__init(struct evlist *evlist, int *err)
>  {
> -	struct perf_pmu *cs_etm_pmu;
> +	struct perf_pmu *cs_etm_pmu = NULL;
> +	struct perf_pmu **arm_spe_pmus = NULL;
>  	struct evsel *evsel;
> -	bool found_etm = false;
> +	struct perf_pmu *found_etm = NULL;
>  	struct perf_pmu *found_spe = NULL;
> -	struct perf_pmu **arm_spe_pmus = NULL;
> +	int auxtrace_event_cnt = 0;
>  	int nr_spes = 0;
> -	int i = 0;
>
>  	if (!evlist)
>  		return NULL;
> @@ -68,24 +84,23 @@ struct auxtrace_record
>  	arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
>
>  	evlist__for_each_entry(evlist, evsel) {
> -		if (cs_etm_pmu &&
> -		    evsel->core.attr.type == cs_etm_pmu->type)
> -			found_etm = true;
> -
> -		if (!nr_spes || found_spe)
> -			continue;
> -
> -		for (i = 0; i < nr_spes; i++) {
> -			if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
> -				found_spe = arm_spe_pmus[i];
> -				break;
> -			}
> -		}
> +		if (cs_etm_pmu && !found_etm)
> +			found_etm = find_pmu_for_event(&cs_etm_pmu, 1, evsel);
> +
> +		if (arm_spe_pmus && !found_spe)
> +			found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel);
>  	}
> +
>  	free(arm_spe_pmus);
>
> -	if (found_etm && found_spe) {
> -		pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n");
> +	if (found_etm)
> +		auxtrace_event_cnt++;
> +
> +	if (found_spe)
> +		auxtrace_event_cnt++;
> +
> +	if (auxtrace_event_cnt > 1) {
> +		pr_err("Concurrent AUX trace operation not currently supported\n");
>  		*err = -EOPNOTSUPP;
>  		return NULL;
>  	}
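The refactored lookup is easy to exercise outside perf. The sketch below keeps the shape of `find_pmu_for_event()` from the quoted diff but swaps `struct perf_pmu` and `struct evsel` for two minimal stand-in structs, and adds a hypothetical `count_aux_sources()` helper that mirrors the `auxtrace_event_cnt` bookkeeping. Treat it as an illustration of the logic under those assumptions, not as the perf implementation.

```c
#include <stddef.h>

/* Stand-ins for struct perf_pmu and struct evsel; only the type IDs matter here. */
struct pmu {
	unsigned int type;
};

struct event {
	unsigned int attr_type;
};

/* Same shape as the helper in the diff: first PMU whose type matches the event. */
static struct pmu *find_pmu_for_event(struct pmu **pmus, int pmu_nr,
				      const struct event *ev)
{
	int i;

	if (!pmus)
		return NULL;

	for (i = 0; i < pmu_nr; i++) {
		if (ev->attr_type == pmus[i]->type)
			return pmus[i];
	}

	return NULL;
}

/* Hypothetical helper: number of AUX trace sources that matched an event. */
static int count_aux_sources(const struct pmu *found_etm,
			     const struct pmu *found_spe)
{
	return (found_etm != NULL) + (found_spe != NULL);
}
```

Passing a single PMU as a one-element array, as the diff does with `&cs_etm_pmu`, lets one helper serve both the ETM and the SPE case; a count above one corresponds to the "Concurrent AUX trace operation" error path.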
Re: [PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus
On Mon, May 16, 2022 at 02:20:18PM +0300, Mikko Perttunen wrote: > On 5/16/22 13:44, Robin Murphy wrote: > > On 2022-05-16 11:13, Mikko Perttunen wrote: > > > On 5/16/22 13:07, Will Deacon wrote: > > > > On Mon, May 16, 2022 at 11:52:54AM +0300, cyn...@kapsi.fi wrote: > > > > > From: Mikko Perttunen > > > > > > > > > > Set itself as the IOMMU for the host1x context device bus, containing > > > > > "dummy" devices used for Host1x context isolation. > > > > > > > > > > Signed-off-by: Mikko Perttunen > > > > > --- > > > > > drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 + > > > > > 1 file changed, 13 insertions(+) > > > > > > > > > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c > > > > > b/drivers/iommu/arm/arm-smmu/arm-smmu.c > > > > > index 568cce590ccc..9ff54eaecf81 100644 > > > > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c > > > > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c > > > > > @@ -39,6 +39,7 @@ > > > > > #include > > > > > #include > > > > > +#include > > > > > #include "arm-smmu.h" > > > > > @@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct > > > > > iommu_ops *ops) > > > > > goto err_reset_pci_ops; > > > > > } > > > > > #endif > > > > > +#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS > > > > > + if (!iommu_present(_context_device_bus_type)) { > > > > > + err = bus_set_iommu(_context_device_bus_type, ops); > > > > > + if (err) > > > > > + goto err_reset_fsl_mc_ops; > > > > > + } > > > > > +#endif > > > > > + > > > > > return 0; > > > > > +err_reset_fsl_mc_ops: __maybe_unused; > > > > > +#ifdef CONFIG_FSL_MC_BUS > > > > > + bus_set_iommu(_mc_bus_type, NULL); > > > > > +#endif > > > > > > > > bus_set_iommu() is going away: > > > > > > > > https://lore.kernel.org/r/cover.1650890638.git.robin.mur...@arm.com > > > > > > > > Will > > > > > > Thanks for the heads-up. Robin had pointed out that this work was > > > ongoing but I hadn't seen the patches yet. I'll look into it. 
> >
> > Although that *is* currently blocked on the mystery intel-iommu problem
> > that I can't reproduce... If this series is ready to land right now for
> > 5.19 then in principle that might be the easiest option overall.
> > Hopefully at least patch #2 could sneak in so that the compile-time
> > dependencies are ready for me to roll up host1x into the next rebase of
> > "iommu: Always register bus notifiers".
> >
> > Cheers,
> > Robin.
>
> My guess is that the series as a whole is not ready to land in the 5.19
> timeframe, but #2 could be possible.
>
> Thierry, any opinion?

Dave and Daniel typically want new material to be in by -rc6 and I've
already sent the PR for this cycle. I can ask them if they'd take another
one, though, if it makes things simpler for the next cycle.

Thierry
Re: [PATCH v8 2/8] hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device
On Mon, 16 May 2022 20:52:17 +0800 Yicong Yang wrote: > HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex integrated > Endpoint(RCiEP) device, providing the capability to dynamically monitor and > tune the PCIe traffic and trace the TLP headers. > > Add the driver for the device to enable the trace function. Register PMU > device of PTT trace, then users can use trace through perf command. The > driver makes use of perf AUX trace function and support the following > events to configure the trace: > > - filter: select Root port or Endpoint to trace > - type: select the type of traced TLP headers > - direction: select the direction of traced TLP headers > - format: select the data format of the traced TLP headers > > This patch initially add a basic driver of PTT trace. > > Signed-off-by: Yicong Yang Hi Yicong, It's been a while since I looked at this driver, so I'll admit I can't remember if any of the things I've raised below were previously discussed. All minor stuff (biggest is question of failing cleanly in unlikely case of failing the allocation in the filter addition vs carrying on anyway), so feel free to add Reviewed-by: Jonathan Cameron > diff --git a/drivers/hwtracing/ptt/Makefile b/drivers/hwtracing/ptt/Makefile > new file mode 100644 > index ..908c09a98161 > --- /dev/null > +++ b/drivers/hwtracing/ptt/Makefile > @@ -0,0 +1,2 @@ > +# SPDX-License-Identifier: GPL-2.0 > +obj-$(CONFIG_HISI_PTT) += hisi_ptt.o > diff --git a/drivers/hwtracing/ptt/hisi_ptt.c > b/drivers/hwtracing/ptt/hisi_ptt.c > new file mode 100644 > index ..ef25ce98f664 > --- /dev/null > +++ b/drivers/hwtracing/ptt/hisi_ptt.c ... 
> + > +static int hisi_ptt_init_filters(struct pci_dev *pdev, void *data) > +{ > + struct hisi_ptt_filter_desc *filter; > + struct hisi_ptt *hisi_ptt = data; > + > + filter = kzalloc(sizeof(*filter), GFP_KERNEL); > + if (!filter) { > + pci_err(hisi_ptt->pdev, "failed to add filter %s\n", > pci_name(pdev)); If this fails we carry on anyway (no error checking on the bus_walk). I think we should error out in that case (would need to use a flag placed somewhere in hisi_ptt to tell we had an error). That would complicate the unwind though. Easiest way to do that unwind is probably to register a separate devm_add_action_or_reset() callback for each filter. If you prefer to carry on even with this allocation error, then maybe add a comment here somewhere to make it clear that will happen. > + return -ENOMEM; > + } > + > + filter->devid = PCI_DEVID(pdev->bus->number, pdev->devfn); > + > + if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT) { > + filter->is_port = true; > + list_add_tail(>list, _ptt->port_filters); > + > + /* Update the available port mask */ > + hisi_ptt->port_mask |= hisi_ptt_get_filter_val(filter->devid, > true); > + } else { > + list_add_tail(>list, _ptt->req_filters); > + } > + > + return 0; > +} > + > +static void hisi_ptt_release_filters(void *data) > +{ > + struct hisi_ptt_filter_desc *filter, *tmp; > + struct hisi_ptt *hisi_ptt = data; > + > + list_for_each_entry_safe(filter, tmp, _ptt->req_filters, list) { > + list_del(>list); > + kfree(filter); I think with separate release per entry above, this bit become simpler as we walk all the elements in the devm_ callback list rather than two lists here. > + } > + > + list_for_each_entry_safe(filter, tmp, _ptt->port_filters, list) { > + list_del(>list); > + kfree(filter); > + } > +} > + ... 
> + > +static int hisi_ptt_init_ctrls(struct hisi_ptt *hisi_ptt) > +{ > + struct pci_dev *pdev = hisi_ptt->pdev; > + struct pci_bus *bus; > + int ret; > + u32 reg; > + > + INIT_LIST_HEAD(_ptt->port_filters); > + INIT_LIST_HEAD(_ptt->req_filters); > + > + ret = hisi_ptt_config_trace_buf(hisi_ptt); > + if (ret) > + return ret; > + > + /* > + * The device range register provides the information about the > + * root ports which the RCiEP can control and trace. The RCiEP > + * and the root ports it support are on the same PCIe core, with > + * same domain number but maybe different bus number. The device > + * range register will tell us which root ports we can support, > + * Bit[31:16] indicates the upper BDF numbers of the root port, > + * while Bit[15:0] indicates the lower. > + */ > + reg = readl(hisi_ptt->iobase + HISI_PTT_DEVICE_RANGE); > + hisi_ptt->upper_bdf = FIELD_GET(HISI_PTT_DEVICE_RANGE_UPPER, reg); > + hisi_ptt->lower_bdf = FIELD_GET(HISI_PTT_DEVICE_RANGE_LOWER, reg); > + > + bus = pci_find_bus(pci_domain_nr(pdev->bus), > PCI_BUS_NUM(hisi_ptt->upper_bdf)); > + if (bus) > + pci_walk_bus(bus, hisi_ptt_init_filters, hisi_ptt); > + > + ret = devm_add_action_or_reset(>dev,
Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops
On Mon, May 16, 2022 at 12:22:08PM +0100, Robin Murphy wrote:
> On 2022-05-16 02:57, Lu Baolu wrote:
> > Each IOMMU driver must provide a blocking domain ops. If the hardware
> > supports detaching domain from device, setting blocking domain equals
> > detaching the existing domain from the device. Otherwise, an UNMANAGED
> > domain without any mapping will be used instead.
>
> Unfortunately that's backwards - most of the implementations of .detach_dev
> are disabling translation entirely, meaning the device ends up effectively
> in passthrough rather than blocked.

Ideally we'd convert the detach_dev of every driver into either a
blocking or identity domain. The trick is knowing which is which..

Guessing going down the list:
 apple dart - blocking, detach_dev calls apple_dart_hw_disable_dma(), same as
	IOMMU_DOMAIN_BLOCKED
	[I wonder if this driver is wrong in other ways though because
	I don't see a remove_streams in attach_dev]
 exynos - this seems to disable the 'sysmmu' so I'm guessing this is
	identity
 iommu-vmsa - Comment says 'disable mmu translation' so I'm guessing
	this is identity
 mtk_v1 - Code looks similar to mtk, which is probably identity.
 rkt - No idea
 sprd - No idea
 sun50i - This driver confusingly treats identity the same as
	unmanaged, seems wrong, no idea.
 amd - Not sure, clear_dte_entry() seems to set translation on but points
	the PTE to 0? Based on the spec table 8 I would have expected
	TV to be clear, which would be blocking. Maybe a bug??
 arm smmu qcomm - not sure
 intel - blocking

These don't support default domains, so detach_dev should return
back to DMA API ownership, which is either identity or something weird:
 fsl_pamu - identity due to the PPC use of dma direct
 msm
 mtk
 omap
 s390 - platform DMA ops
 tegra-gart - Usually something called a GART would be 0 length once
	disabled, guessing blocking?
 tegra-smmu

So, the approach here should be to go driver by driver and convert
detach_dev to either identity, blocking or just delete it entirely,
excluding the above 7 that don't support default domains. And get acks
from the driver owners.

> Conversely, at least arm-smmu and arm-smmu-v3 could implement
> IOMMU_DOMAIN_BLOCKED properly with fault-type S2CRs and STEs
> respectively, it just needs a bit of wiring up.

Given that vfio now uses them it seems worthwhile to do..

Jason
Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
On Fri, May 13, 2022 at 12:26 PM Saravana Kannan wrote: > > On Fri, May 13, 2022 at 6:58 AM Rob Herring wrote: > > > > On Fri, Apr 29, 2022 at 5:09 PM Saravana Kannan > > wrote: > > > > > > The deferred probe timer that's used for this currently starts at > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The > > > assumption being that all available drivers would be loaded and > > > registered before the timer expires. This means, the > > > driver_deferred_probe_timeout has to be pretty large for it to cover the > > > worst case. But if we set the default value for it to cover the worst > > > case, it would significantly slow down the average case. For this > > > reason, the default value is set to 0. > > > > > > Also, with CONFIG_MODULES=y and the current default values of > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing > > > drivers will cause their consumer devices to always defer their probes. > > > This is because device links created by fw_devlink defer the probe even > > > before the consumer driver's probe() is called. > > > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe > > > timer on every successful driver registration, with the expectation more > > > modules would be loaded in the near future, then the default value of > > > driver_deferred_probe_timeout only needs to be as long as the worst case > > > time difference between two consecutive module loads. > > > > > > So let's implement that and set the default value to 10 seconds when > > > CONFIG_MODULES=y. > > > > We had to revert a non-zero timeout before (issue with NFS root IIRC). > > Does fw_devlink=on somehow fix that? > > If it's the one where ip autoconfig was timing out, then John Stultz > fixed it by fixing wait_for_device_probe(). > https://lore.kernel.org/all/20200422203245.83244-4-john.stu...@linaro.org/ Yeah, that was it. 
Acked-by: Rob Herring
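For readers following the thread, the "extend an unexpired timer on every successful driver registration" idea from the quoted patch description can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the driver core code: `struct probe_timer` and the three helpers are hypothetical names, and time is a plain count of seconds rather than jiffies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the deferred probe timer; times are in seconds. */
struct probe_timer {
	long timeout;	/* models driver_deferred_probe_timeout */
	long deadline;	/* absolute time at which the timer fires */
};

/* Arm the timer once, e.g. at late_initcall time. */
static void probe_timer_start(struct probe_timer *t, long now, long timeout)
{
	assert(timeout >= 0);
	t->timeout = timeout;
	t->deadline = now + timeout;
}

static bool probe_timer_expired(const struct probe_timer *t, long now)
{
	return now >= t->deadline;
}

/*
 * Called on every successful driver registration: push the deadline out,
 * but only if the timer has not fired yet. Returns true if it was extended.
 */
static bool probe_timer_on_driver_register(struct probe_timer *t, long now)
{
	if (probe_timer_expired(t, now))
		return false;

	t->deadline = now + t->timeout;
	return true;
}
```

In this model the timeout only has to cover the gap between two consecutive registrations: a driver registering at t=8 with a 10 second timeout moves the deadline from 10 to 18, while a registration after the deadline changes nothing.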
Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops
Hi Robin,

On 2022/5/16 19:22, Robin Murphy wrote:
> On 2022-05-16 02:57, Lu Baolu wrote:
>> Each IOMMU driver must provide a blocking domain ops. If the hardware
>> supports detaching domain from device, setting blocking domain equals
>> detaching the existing domain from the device. Otherwise, an UNMANAGED
>> domain without any mapping will be used instead.
>
> Unfortunately that's backwards - most of the implementations of
> .detach_dev are disabling translation entirely, meaning the device ends
> up effectively in passthrough rather than blocked. Conversely, at least
> arm-smmu and arm-smmu-v3 could implement IOMMU_DOMAIN_BLOCKED properly
> with fault-type S2CRs and STEs respectively, it just needs a bit of
> wiring up.

Thank you for letting me know this. This means that we need to add an
additional UNMANAGED domain for each iommu group, although it is not used
most of the time. If most IOMMU drivers could implement real dumb blocking
domains, this burden may be reduced.

Best regards,
baolu
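The distinction discussed above (detach ending up as passthrough versus a real blocking domain, with an empty UNMANAGED domain as the fallback) can be illustrated with a toy translation model. Everything here is hypothetical and exists only for illustration: the `toy_*` names are not kernel APIs, and real IOMMU drivers express these states in hardware structures such as S2CRs or STEs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy domain kinds; not the kernel's IOMMU_DOMAIN_* values. */
enum toy_domain_type { TOY_BLOCKED, TOY_IDENTITY, TOY_UNMANAGED };

struct toy_mapping {
	uint64_t iova;
	uint64_t phys;
	uint64_t size;
};

struct toy_domain {
	enum toy_domain_type type;
	const struct toy_mapping *maps;	/* only used by TOY_UNMANAGED */
	size_t nr_maps;
};

/* Returns true and fills *phys if the device access would be allowed. */
static bool toy_translate(const struct toy_domain *d, uint64_t iova,
			  uint64_t *phys)
{
	switch (d->type) {
	case TOY_IDENTITY:
		/* Translation disabled: every address passes through. */
		*phys = iova;
		return true;
	case TOY_UNMANAGED:
		for (size_t i = 0; i < d->nr_maps; i++) {
			const struct toy_mapping *m = &d->maps[i];

			if (iova >= m->iova && iova - m->iova < m->size) {
				*phys = m->phys + (iova - m->iova);
				return true;
			}
		}
		return false;	/* no mapping: the access faults */
	case TOY_BLOCKED:
	default:
		return false;	/* all DMA is blocked */
	}
}
```

In this model an UNMANAGED domain with zero mappings rejects every access, exactly like the blocked domain, whereas a "detach" that merely disables translation behaves like the identity case.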
[RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
For streaming DMA mappings involving an IOMMU and whose IOVA len regularly
exceeds the IOVA rcache upper limit (meaning that they are not cached),
performance can be reduced.

Add the IOMMU callback for DMA mapping API dma_max_mapping_size(), which
allows the drivers to know the mapping limit and thus limit the requested
IOVA lengths.

This resolves the performance issue originally reported in [0] for a SCSI
HBA driver which was regularly mapping SGLs which required IOVAs in
excess of the IOVA caching limit. In this case the block layer limits the
max sectors per request - as configured in __scsi_init_queue() - which
will limit the total SGL length the driver tries to map and in turn limits
IOVA lengths requested.

[0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leiz...@huawei.com/

Signed-off-by: John Garry
---
Sending as an RFC as iommu_dma_max_mapping_size() is a soft limit, and not
a hard limit which I expect is the semantics of dma_map_ops.max_mapping_size

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 09f6e1c0f9c0..e2d5205cde37 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1442,6 +1442,21 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
 	return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
 }
 
+static size_t iommu_dma_max_mapping_size(struct device *dev)
+{
+	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+	struct iommu_dma_cookie *cookie;
+
+	if (!domain)
+		return 0;
+
+	cookie = domain->iova_cookie;
+	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
+		return 0;
+
+	return iova_rcache_range();
+}
+
 static const struct dma_map_ops iommu_dma_ops = {
 	.alloc			= iommu_dma_alloc,
 	.free			= iommu_dma_free,
@@ -1462,6 +1477,7 @@ static const struct dma_map_ops iommu_dma_ops = {
 	.map_resource		= iommu_dma_map_resource,
 	.unmap_resource		= iommu_dma_unmap_resource,
 	.get_merge_boundary	= iommu_dma_get_merge_boundary,
+	.max_mapping_size	= iommu_dma_max_mapping_size,
 };
 
 /*
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index db77aa675145..9f00b58d546e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
 static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
 static void free_iova_rcaches(struct iova_domain *iovad);
 
+unsigned long iova_rcache_range(void)
+{
+	return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
+}
+
 static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
 {
 	struct iova_domain *iovad;
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 320a70e40233..ae3e18d77e6c 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
 int iova_cache_get(void);
 void iova_cache_put(void);
 
+unsigned long iova_rcache_range(void);
+
 void free_iova(struct iova_domain *iovad, unsigned long pfn);
 void __free_iova(struct iova_domain *iovad, struct iova *iova);
 struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
@@ -105,6 +107,11 @@ static inline void iova_cache_put(void)
 {
 }
 
+static inline unsigned long iova_rcache_range(void)
+{
+	return 0;
+}
+
 static inline void free_iova(struct iova_domain *iovad, unsigned long pfn)
 {
 }
-- 
2.26.2
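As a worked example of the limit computed by iova_rcache_range() in the patch above, the following userspace sketch evaluates the same shift. The constants are assumptions made for illustration (4 KiB pages and IOVA_RANGE_CACHE_MAX_SIZE equal to 6); the `example_*` names, including the clamp helper showing how a caller might apply the limit, are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values: 4 KiB pages and 6 cached size classes (orders 0..5). */
#define EXAMPLE_PAGE_SIZE			4096UL
#define EXAMPLE_IOVA_RANGE_CACHE_MAX_SIZE	6

/* Same formula as the patch: largest IOVA length the rcache can hold. */
static unsigned long example_iova_rcache_range(void)
{
	return EXAMPLE_PAGE_SIZE << (EXAMPLE_IOVA_RANGE_CACHE_MAX_SIZE - 1);
}

/* Hypothetical caller-side helper: clamp a requested length to a limit. */
static size_t example_clamp_mapping_len(size_t requested, size_t limit)
{
	return requested < limit ? requested : limit;
}
```

Under these assumptions the limit is 4096 << 5 = 131072 bytes (128 KiB), so a 1 MiB request would be clamped to 128 KiB while a single-page request is left unchanged.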
Re: [RFC PATCH V2 1/2] swiotlb: Add Child IO TLB mem support
On 5/16/2022 3:34 PM, Christoph Hellwig wrote:
> I don't really understand how 'childs' fit in here. The code also
> doesn't seem to be usable without patch 2 and a caller of the
> new functions added in patch 2, so it is rather impossible to review.

Hi Christoph:
OK. I will merge the two patches and add a caller patch. The motivation
is to avoid the global spin lock taken when devices use the swiotlb
bounce buffer, which introduces overhead in high throughput cases. In my
test environment, the current code achieves about 24Gb/s network
throughput with SWIOTLB force enabled and about 40Gb/s without SWIOTLB
force. Storage has the same issue.

Per-device IO TLB mem may resolve the global spin lock contention among
devices, but a device may still have multiple queues, and those queues
would still share one spin lock. This is why child IO TLB mem (or IO TLB
areas) was introduced in the previous patches. Each device queue will
have a separate child IO TLB mem and its own spin lock to manage its IO
TLB buffers.

Otherwise, the global spin lock still costs CPU time during high
throughput, regardless of whether throughput itself regresses, because
each device queue has to spin on a different CPU to acquire the global
lock. Child IO TLB mem may also resolve this CPU usage issue.

> Also:
>
> 1) why is SEV/TDX so different from other cases that need bounce
>    buffering to treat it different and we can't work on a general
>    scalability improvement

Other cases also have the global spin lock issue, but whether it matters
depends on whether they hit the bottleneck. The CPU usage issue may be
ignored there.

> 2) per previous discussions at how swiotlb itself works, it is
>    clear that another option is to just make pages we DMA to
>    shared with the hypervisor. Why don't we try that at least
>    for larger I/O?

For confidential VMs (both TDX and SEV), we need to use a bounce buffer
to copy between private memory, which the hypervisor can't access
directly, and shared memory. For security reasons, a confidential VM
should not share the IO stack's DMA pages with the hypervisor directly,
to avoid attacks from the hypervisor while the IO stack handles the DMA
data.
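The per-queue locking idea described in this reply can be sketched in userspace C as follows. This is a simplified illustration under stated assumptions, not the swiotlb implementation: the `example_*` names, the fixed area count, the queue-to-area mapping by modulo, and the C11 `atomic_flag` spin lock standing in for a kernel spinlock are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define EXAMPLE_NR_AREAS	4
#define EXAMPLE_SLOTS_PER_AREA	8

/* Hypothetical child area: its own lock and its own slot accounting. */
struct example_area {
	atomic_flag lock;
	unsigned int used;
};

struct example_pool {
	struct example_area areas[EXAMPLE_NR_AREAS];
};

static void example_pool_init(struct example_pool *pool)
{
	for (size_t i = 0; i < EXAMPLE_NR_AREAS; i++) {
		atomic_flag_clear(&pool->areas[i].lock);
		pool->areas[i].used = 0;
	}
}

/* Map a queue to an area so that different queues rarely share a lock. */
static size_t example_area_index(unsigned int queue_id)
{
	return queue_id % EXAMPLE_NR_AREAS;
}

/*
 * Take one slot from the queue's own area. Only that area's lock is held,
 * so queues mapped to other areas are never blocked by this call.
 * Returns the slot index within the area, or -1 if the area is full.
 */
static int example_alloc_slot(struct example_pool *pool, unsigned int queue_id)
{
	struct example_area *area = &pool->areas[example_area_index(queue_id)];
	int slot = -1;

	while (atomic_flag_test_and_set_explicit(&area->lock,
						 memory_order_acquire))
		;	/* spin until this area's lock is free */

	if (area->used < EXAMPLE_SLOTS_PER_AREA)
		slot = (int)area->used++;

	atomic_flag_clear_explicit(&area->lock, memory_order_release);
	return slot;
}
```

The design point is that contention is limited to queues that map to the same area; exhausting one area's slots does not affect allocations from another.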
Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops
On Mon, May 16, 2022 at 12:27:41AM -0700, Christoph Hellwig wrote:
> On Mon, May 16, 2022 at 09:57:56AM +0800, Lu Baolu wrote:
> > Each IOMMU driver must provide a blocking domain ops. If the hardware
> > supports detaching domain from device, setting blocking domain equals
> > detaching the existing domain from the device. Otherwise, an UNMANAGED
> > domain without any mapping will be used instead.
>
> blocking in this case means not allowing any access? The naming
> sounds a bit odd to me as blocking in the kernel has a specific
> meaning. Maybe something like noaccess ops might be a better name?

It is because of this:

include/linux/iommu.h: *	IOMMU_DOMAIN_BLOCKED	- All DMA is blocked, can be used to isolate
include/linux/iommu.h:#define IOMMU_DOMAIN_BLOCKED	(0U)

noaccess might be clearer

Jason
[PATCH v8 7/8] docs: trace: Add HiSilicon PTT device driver documentation
Document the introduction and usage of HiSilicon PTT device driver. Signed-off-by: Yicong Yang Reviewed-by: Jonathan Cameron --- Documentation/trace/hisi-ptt.rst | 307 +++ Documentation/trace/index.rst| 1 + 2 files changed, 308 insertions(+) create mode 100644 Documentation/trace/hisi-ptt.rst diff --git a/Documentation/trace/hisi-ptt.rst b/Documentation/trace/hisi-ptt.rst new file mode 100644 index ..0a3112244d40 --- /dev/null +++ b/Documentation/trace/hisi-ptt.rst @@ -0,0 +1,307 @@ +.. SPDX-License-Identifier: GPL-2.0 + +== +HiSilicon PCIe Tune and Trace device +== + +Introduction + + +HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex +integrated Endpoint (RCiEP) device, providing the capability +to dynamically monitor and tune the PCIe link's events (tune), +and trace the TLP headers (trace). The two functions are independent, +but is recommended to use them together to analyze and enhance the +PCIe link's performance. + +On Kunpeng 930 SoC, the PCIe Root Complex is composed of several +PCIe cores. Each PCIe core includes several Root Ports and a PTT +RCiEP, like below. The PTT device is capable of tuning and +tracing the links of the PCIe core. +:: + + +--Core 0---+ + | | [ PTT ] | + | | [Root Port]---[Endpoint] + | | [Root Port]---[Endpoint] + | | [Root Port]---[Endpoint] +Root Complex |--Core 1---+ + | | [ PTT ] | + | | [Root Port]---[ Switch ]---[Endpoint] + | | [Root Port]---[Endpoint] `-[Endpoint] + | | [Root Port]---[Endpoint] + +---+ + +The PTT device driver registers one PMU device for each PTT device. +The name of each PTT device is composed of 'hisi_ptt' prefix with +the id of the SICL and the Core where it locates. The Kunpeng 930 +SoC encapsulates multiple CPU dies (SCCL, Super CPU Cluster) and +IO dies (SICL, Super I/O Cluster), where there's one PCIe Root +Complex for each SICL. +:: + +/sys/devices/hisi_ptt_ + +Tune + + +PTT tune is designed for monitoring and adjusting PCIe link parameters (events). 
+Currently we support events in 4 classes. The scope of the events +covers the PCIe core to which the PTT device belongs. + +Each event is presented as a file under $(PTT PMU dir)/tune, and +a simple open/read/write/close cycle will be used to tune the event. +:: + +$ cd /sys/devices/hisi_ptt_/tune +$ ls +qos_tx_cplqos_tx_npqos_tx_p +tx_path_rx_req_alloc_buf_level +tx_path_tx_req_alloc_buf_level +$ cat qos_tx_dp +1 +$ echo 2 > qos_tx_dp +$ cat qos_tx_dp +2 + +Current value (numerical value) of the event can be simply read +from the file, and the desired value written to the file to tune. + +1. Tx path QoS control + + +The following files are provided to tune the QoS of the tx path of +the PCIe core. + +- qos_tx_cpl: weight of Tx completion TLPs +- qos_tx_np: weight of Tx non-posted TLPs +- qos_tx_p: weight of Tx posted TLPs + +The weight influences the proportion of certain packets on the PCIe link. +For example, for the storage scenario, increase the proportion +of the completion packets on the link to enhance the performance as +more completions are consumed. + +The available tune data of these events is [0, 1, 2]. +Writing a negative value will return an error, and out of range +values will be converted to 2. Note that the event value just +indicates a probable level, but is not precise. + +2. Tx path buffer control +- + +Following files are provided to tune the buffer of tx path of the PCIe core. + +- tx_path_rx_req_alloc_buf_level: watermark of Rx requested +- tx_path_tx_req_alloc_buf_level: watermark of Tx requested + +These events influence the watermark of the buffer allocated for each +type. Rx means the inbound while Tx means outbound. The packets will +be stored in the buffer first and then transmitted either when the +watermark reached or when timed out. For a busy direction, you should +increase the related buffer watermark to avoid frequently posting and +thus enhance the performance. In most cases just keep the default value. 
+ +The available tune data of above events is [0, 1, 2]. +Writing a negative value will return an error, and out of range +values will be converted to 2. Note that the event value just +indicates a probable level, but is not precise. + +Trace += + +PTT trace is designed for dumping the TLP headers to the memory, which +can be used to analyze the transactions and usage condition of the PCIe +Link. You can choose to filter the traced headers by either requester ID, +or those downstream of a set of Root Ports on the same core of the PTT +device. It's also
[PATCH v8 6/8] perf tool: Add support for parsing HiSilicon PCIe Trace packet
From: Qi Liu Add support for using 'perf report --dump-raw-trace' to parse PTT packet. Example usage: Output will contain raw PTT data and its textual representation, such as: 0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x40 offset: 0 ref: 0xa5d50c725 idx: 0 tid: -1 cpu: 0 . . ... HISI PTT data: size 4194304 bytes . : 00 00 00 00 Prefix . 0004: 08 20 00 60 Header DW0 . 0008: ff 02 00 01 Header DW1 . 000c: 20 08 00 00 Header DW2 . 0010: 10 e7 44 ab Header DW3 . 0014: 2a a8 1e 01 Time . 0020: 00 00 00 00 Prefix . 0024: 01 00 00 60 Header DW0 . 0028: 0f 1e 00 01 Header DW1 . 002c: 04 00 00 00 Header DW2 . 0030: 40 00 81 02 Header DW3 . 0034: ee 02 00 00 Time Signed-off-by: Qi Liu Signed-off-by: Yicong Yang --- tools/perf/util/Build | 2 + tools/perf/util/auxtrace.c| 3 + tools/perf/util/hisi-ptt-decoder/Build| 1 + .../hisi-ptt-decoder/hisi-ptt-pkt-decoder.c | 167 +++ .../hisi-ptt-decoder/hisi-ptt-pkt-decoder.h | 31 +++ tools/perf/util/hisi-ptt.c| 193 ++ 6 files changed, 397 insertions(+) create mode 100644 tools/perf/util/hisi-ptt-decoder/Build create mode 100644 tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c create mode 100644 tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.h create mode 100644 tools/perf/util/hisi-ptt.c diff --git a/tools/perf/util/Build b/tools/perf/util/Build index 9a7209a99e16..2d5cc4dc2732 100644 --- a/tools/perf/util/Build +++ b/tools/perf/util/Build @@ -116,6 +116,8 @@ perf-$(CONFIG_AUXTRACE) += intel-pt.o perf-$(CONFIG_AUXTRACE) += intel-bts.o perf-$(CONFIG_AUXTRACE) += arm-spe.o perf-$(CONFIG_AUXTRACE) += arm-spe-decoder/ +perf-$(CONFIG_AUXTRACE) += hisi-ptt.o +perf-$(CONFIG_AUXTRACE) += hisi-ptt-decoder/ perf-$(CONFIG_AUXTRACE) += s390-cpumsf.o ifdef CONFIG_LIBOPENCSD diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index a24cad3ce24e..84433c34903e 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -51,6 +51,7 @@ #include "intel-pt.h" #include "intel-bts.h" #include "arm-spe.h" 
+#include "hisi-ptt.h" #include "s390-cpumsf.h" #include "util/mmap.h" @@ -1282,6 +1283,8 @@ int perf_event__process_auxtrace_info(struct perf_session *session, err = s390_cpumsf_process_auxtrace_info(event, session); break; case PERF_AUXTRACE_HISI_PTT: + err = hisi_ptt_process_auxtrace_info(event, session); + break; case PERF_AUXTRACE_UNKNOWN: default: return -EINVAL; diff --git a/tools/perf/util/hisi-ptt-decoder/Build b/tools/perf/util/hisi-ptt-decoder/Build new file mode 100644 index ..db3db8b75033 --- /dev/null +++ b/tools/perf/util/hisi-ptt-decoder/Build @@ -0,0 +1 @@ +perf-$(CONFIG_AUXTRACE) += hisi-ptt-pkt-decoder.o diff --git a/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c b/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c new file mode 100644 index ..64f67169ec37 --- /dev/null +++ b/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c @@ -0,0 +1,167 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * HiSilicon PCIe Trace and Tuning (PTT) support + * Copyright (c) 2022 HiSilicon Technologies Co., Ltd. + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "../color.h" +#include "hisi-ptt-pkt-decoder.h" + +/* + * For 8DW format, the bit[31:11] of DW0 is always 0x1f, which can be + * used to distinguish the data format. + * 8DW format is like: + * bits [ 31:11 ][ 10:0 ] + *|---|---| + *DW0 [0x1f ][ Reserved (0x7ff) ] + *DW1 [ Prefix ] + *DW2 [ Header DW0] + *DW3 [ Header DW1] + *DW4 [ Header DW2] + *DW5 [ Header DW3] + *DW6 [ Reserved (0x0) ] + *DW7 [Time ] + * + * 4DW format is like: + * bits [31:30] [ 29:25 ][24][23][22][21][20:11 ][10:0] + *|-|-|---|---|---|---|-|-| + *DW0 [ Fmt ][ Type
[PATCH v8 2/8] hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device
HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex integrated Endpoint(RCiEP) device, providing the capability to dynamically monitor and tune the PCIe traffic and trace the TLP headers. Add the driver for the device to enable the trace function. Register PMU device of PTT trace, then users can use trace through perf command. The driver makes use of perf AUX trace function and support the following events to configure the trace: - filter: select Root port or Endpoint to trace - type: select the type of traced TLP headers - direction: select the direction of traced TLP headers - format: select the data format of the traced TLP headers This patch initially add a basic driver of PTT trace. Signed-off-by: Yicong Yang --- drivers/Makefile | 1 + drivers/hwtracing/Kconfig| 2 + drivers/hwtracing/ptt/Kconfig| 12 + drivers/hwtracing/ptt/Makefile | 2 + drivers/hwtracing/ptt/hisi_ptt.c | 964 +++ drivers/hwtracing/ptt/hisi_ptt.h | 178 ++ 6 files changed, 1159 insertions(+) create mode 100644 drivers/hwtracing/ptt/Kconfig create mode 100644 drivers/hwtracing/ptt/Makefile create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h diff --git a/drivers/Makefile b/drivers/Makefile index 020780b6b4d2..662d50599467 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -175,6 +175,7 @@ obj-$(CONFIG_USB4) += thunderbolt/ obj-$(CONFIG_CORESIGHT)+= hwtracing/coresight/ obj-y += hwtracing/intel_th/ obj-$(CONFIG_STM) += hwtracing/stm/ +obj-$(CONFIG_HISI_PTT) += hwtracing/ptt/ obj-$(CONFIG_ANDROID) += android/ obj-$(CONFIG_NVMEM)+= nvmem/ obj-$(CONFIG_FPGA) += fpga/ diff --git a/drivers/hwtracing/Kconfig b/drivers/hwtracing/Kconfig index 13085835a636..911ee977103c 100644 --- a/drivers/hwtracing/Kconfig +++ b/drivers/hwtracing/Kconfig @@ -5,4 +5,6 @@ source "drivers/hwtracing/stm/Kconfig" source "drivers/hwtracing/intel_th/Kconfig" +source "drivers/hwtracing/ptt/Kconfig" + endmenu diff --git a/drivers/hwtracing/ptt/Kconfig 
b/drivers/hwtracing/ptt/Kconfig new file mode 100644 index ..6d46a09ffeb9 --- /dev/null +++ b/drivers/hwtracing/ptt/Kconfig @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: GPL-2.0-only +config HISI_PTT + tristate "HiSilicon PCIe Tune and Trace Device" + depends on ARM64 || (COMPILE_TEST && 64BIT) + depends on PCI && HAS_DMA && HAS_IOMEM && PERF_EVENTS + help + HiSilicon PCIe Tune and Trace device exists as a PCIe RCiEP + device, and it provides support for PCIe traffic tuning and + tracing TLP headers to the memory. + + This driver can also be built as a module. If so, the module + will be called hisi_ptt. diff --git a/drivers/hwtracing/ptt/Makefile b/drivers/hwtracing/ptt/Makefile new file mode 100644 index ..908c09a98161 --- /dev/null +++ b/drivers/hwtracing/ptt/Makefile @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_HISI_PTT) += hisi_ptt.o diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c new file mode 100644 index ..ef25ce98f664 --- /dev/null +++ b/drivers/hwtracing/ptt/hisi_ptt.c @@ -0,0 +1,964 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Driver for HiSilicon PCIe tune and trace device + * + * Copyright (c) 2022 HiSilicon Technologies Co., Ltd. 
+ * Author: Yicong Yang + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "hisi_ptt.h" + +/* Dynamic CPU hotplug state used by PTT */ +static enum cpuhp_state hisi_ptt_pmu_online; + +static u16 hisi_ptt_get_filter_val(u16 devid, bool is_port) +{ + if (is_port) + return BIT(HISI_PCIE_CORE_PORT_ID(devid & 0xff)); + + return devid; +} + +static bool hisi_ptt_wait_trace_hw_idle(struct hisi_ptt *hisi_ptt) +{ + u32 val; + + return !readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_STS, + val, val & HISI_PTT_TRACE_IDLE, + HISI_PTT_WAIT_POLL_INTERVAL_US, + HISI_PTT_WAIT_TRACE_TIMEOUT_US); +} + +static void hisi_ptt_wait_dma_reset_done(struct hisi_ptt *hisi_ptt) +{ + u32 val; + + readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_WR_STS, + val, !val, HISI_PTT_RESET_POLL_INTERVAL_US, + HISI_PTT_RESET_TIMEOUT_US); +} + +static void hisi_ptt_trace_end(struct hisi_ptt *hisi_ptt) +{ + writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); + hisi_ptt->trace_ctrl.started = false; +} + +static int hisi_ptt_trace_start(struct hisi_ptt *hisi_ptt) +{ + struct hisi_ptt_trace_ctrl *ctrl = _ptt->trace_ctrl; +
[PATCH v8 4/8] perf arm: Refactor event list iteration in auxtrace_record__init()
From: Qi Liu

Use find_pmu_for_event() to simplify logic in auxtrace_record__init().

Signed-off-by: Qi Liu
Signed-off-by: Yicong Yang
---
 tools/perf/arch/arm/util/auxtrace.c | 53 ++---
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index 5fc6a2a3dbc5..384c7cfda0fd 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -50,16 +50,32 @@ static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
 	return arm_spe_pmus;
 }
 
+static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
+					   int pmu_nr, struct evsel *evsel)
+{
+	int i;
+
+	if (!pmus)
+		return NULL;
+
+	for (i = 0; i < pmu_nr; i++) {
+		if (evsel->core.attr.type == pmus[i]->type)
+			return pmus[i];
+	}
+
+	return NULL;
+}
+
 struct auxtrace_record
 *auxtrace_record__init(struct evlist *evlist, int *err)
 {
-	struct perf_pmu *cs_etm_pmu;
+	struct perf_pmu *cs_etm_pmu = NULL;
+	struct perf_pmu **arm_spe_pmus = NULL;
 	struct evsel *evsel;
-	bool found_etm = false;
+	struct perf_pmu *found_etm = NULL;
 	struct perf_pmu *found_spe = NULL;
-	struct perf_pmu **arm_spe_pmus = NULL;
+	int auxtrace_event_cnt = 0;
 	int nr_spes = 0;
-	int i = 0;
 
 	if (!evlist)
 		return NULL;
@@ -68,24 +84,23 @@ struct auxtrace_record
 	arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
 
 	evlist__for_each_entry(evlist, evsel) {
-		if (cs_etm_pmu &&
-		    evsel->core.attr.type == cs_etm_pmu->type)
-			found_etm = true;
-
-		if (!nr_spes || found_spe)
-			continue;
-
-		for (i = 0; i < nr_spes; i++) {
-			if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
-				found_spe = arm_spe_pmus[i];
-				break;
-			}
-		}
+		if (cs_etm_pmu && !found_etm)
+			found_etm = find_pmu_for_event(&cs_etm_pmu, 1, evsel);
+
+		if (arm_spe_pmus && !found_spe)
+			found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel);
 	}
+
 	free(arm_spe_pmus);
 
-	if (found_etm && found_spe) {
-		pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n");
+	if (found_etm)
+		auxtrace_event_cnt++;
+
+	if (found_spe)
+		auxtrace_event_cnt++;
+
+	if (auxtrace_event_cnt > 1) {
+		pr_err("Concurrent AUX trace operation not currently supported\n");
 		*err = -EOPNOTSUPP;
 		return NULL;
 	}
-- 
2.24.0
[PATCH v8 3/8] hwtracing: hisi_ptt: Add tune function support for HiSilicon PCIe Tune and Trace device
Add tune function for the HiSilicon Tune and Trace device. The interface
of tune is exposed through sysfs attributes of PTT PMU device.

Signed-off-by: Yicong Yang
Reviewed-by: Jonathan Cameron
---
 drivers/hwtracing/ptt/hisi_ptt.c | 157 +++
 drivers/hwtracing/ptt/hisi_ptt.h |  23 +
 2 files changed, 180 insertions(+)

diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
index ef25ce98f664..c3fdb9bfb1b4 100644
--- a/drivers/hwtracing/ptt/hisi_ptt.c
+++ b/drivers/hwtracing/ptt/hisi_ptt.c
@@ -25,6 +25,161 @@
 /* Dynamic CPU hotplug state used by PTT */
 static enum cpuhp_state hisi_ptt_pmu_online;
 
+static bool hisi_ptt_wait_tuning_finish(struct hisi_ptt *hisi_ptt)
+{
+	u32 val;
+
+	return !readl_poll_timeout(hisi_ptt->iobase + HISI_PTT_TUNING_INT_STAT,
+				   val, !(val & HISI_PTT_TUNING_INT_STAT_MASK),
+				   HISI_PTT_WAIT_POLL_INTERVAL_US,
+				   HISI_PTT_WAIT_TUNE_TIMEOUT_US);
+}
+
+static int hisi_ptt_tune_data_get(struct hisi_ptt *hisi_ptt,
+				  u32 event, u16 *data)
+{
+	u32 reg;
+
+	reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+	reg &= ~(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB);
+	reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB,
+			  event);
+	writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+
+	/* Write all 1 to indicates it's the read process */
+	writel(~0U, hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+
+	if (!hisi_ptt_wait_tuning_finish(hisi_ptt))
+		return -ETIMEDOUT;
+
+	reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+	reg &= HISI_PTT_TUNING_DATA_VAL_MASK;
+	*data = FIELD_GET(HISI_PTT_TUNING_DATA_VAL_MASK, reg);
+
+	return 0;
+}
+
+static int hisi_ptt_tune_data_set(struct hisi_ptt *hisi_ptt,
+				  u32 event, u16 data)
+{
+	u32 reg;
+
+	reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+	reg &= ~(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB);
+	reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB,
+			  event);
+	writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+
+	writel(FIELD_PREP(HISI_PTT_TUNING_DATA_VAL_MASK, data),
+	       hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+
+	if (!hisi_ptt_wait_tuning_finish(hisi_ptt))
+		return -ETIMEDOUT;
+
+	return 0;
+}
+
+static ssize_t hisi_ptt_tune_attr_show(struct device *dev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev));
+	struct dev_ext_attribute *ext_attr;
+	struct hisi_ptt_tune_desc *desc;
+	int ret;
+	u16 val;
+
+	ext_attr = container_of(attr, struct dev_ext_attribute, attr);
+	desc = ext_attr->var;
+
+	mutex_lock(&hisi_ptt->tune_lock);
+	ret = hisi_ptt_tune_data_get(hisi_ptt, desc->event_code, &val);
+	mutex_unlock(&hisi_ptt->tune_lock);
+
+	if (ret)
+		return ret;
+
+	return sysfs_emit(buf, "%u\n", val);
+}
+
+static ssize_t hisi_ptt_tune_attr_store(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev));
+	struct dev_ext_attribute *ext_attr;
+	struct hisi_ptt_tune_desc *desc;
+	int ret;
+	u16 val;
+
+	ext_attr = container_of(attr, struct dev_ext_attribute, attr);
+	desc = ext_attr->var;
+
+	if (kstrtou16(buf, 10, &val))
+		return -EINVAL;
+
+	mutex_lock(&hisi_ptt->tune_lock);
+	ret = hisi_ptt_tune_data_set(hisi_ptt, desc->event_code, val);
+	mutex_unlock(&hisi_ptt->tune_lock);
+
+	if (ret)
+		return ret;
+
+	return count;
+}
+
+#define HISI_PTT_TUNE_ATTR(_name, _val, _show, _store)			\
+	static struct hisi_ptt_tune_desc _name##_desc = {		\
+		.name = #_name,						\
+		.event_code = _val,					\
+	};								\
+	static struct dev_ext_attribute hisi_ptt_##_name##_attr = {	\
+		.attr	= __ATTR(_name, 0600, _show, _store),		\
+		.var	= &_name##_desc,				\
+	}
+
+#define HISI_PTT_TUNE_ATTR_COMMON(_name, _val)		\
+	HISI_PTT_TUNE_ATTR(_name, _val,			\
+			   hisi_ptt_tune_attr_show,	\
+			   hisi_ptt_tune_attr_store)
+
+/*
+ * The value of the tuning event are composed
[PATCH v8 5/8] perf tool: Add support for HiSilicon PCIe Tune and Trace device driver
From: Qi Liu

HiSilicon PCIe tune and trace device (PTT) can dynamically tune
the PCIe link's events and trace the TLP headers.

This patch adds support for the PTT device in the perf tool, so users
can use 'perf record' to get TLP headers trace data.

Signed-off-by: Qi Liu
Signed-off-by: Yicong Yang
---
 tools/perf/arch/arm/util/auxtrace.c   |  63 +
 tools/perf/arch/arm/util/pmu.c        |   3 +
 tools/perf/arch/arm64/util/Build      |   2 +-
 tools/perf/arch/arm64/util/hisi-ptt.c | 187 ++
 tools/perf/util/auxtrace.c            |   1 +
 tools/perf/util/auxtrace.h            |   1 +
 tools/perf/util/hisi-ptt.h            |  19 +++
 7 files changed, 275 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/arch/arm64/util/hisi-ptt.c
 create mode 100644 tools/perf/util/hisi-ptt.h

diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index 384c7cfda0fd..297fffedf45e 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -4,9 +4,11 @@
  * Author: Mathieu Poirier
  */
 
+#include
 #include
 #include
 #include
+#include
 
 #include "../../../util/auxtrace.h"
 #include "../../../util/debug.h"
@@ -14,6 +16,7 @@
 #include "../../../util/pmu.h"
 #include "cs-etm.h"
 #include "arm-spe.h"
+#include "hisi-ptt.h"
 
 static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
 {
@@ -50,6 +53,52 @@ static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
 	return arm_spe_pmus;
 }
 
+static struct perf_pmu **find_all_hisi_ptt_pmus(int *nr_ptts, int *err)
+{
+	const char *sysfs = sysfs__mountpoint();
+	struct perf_pmu **hisi_ptt_pmus = NULL;
+	struct dirent *dent;
+	char path[PATH_MAX];
+	DIR *dir = NULL;
+	int idx = 0;
+
+	snprintf(path, PATH_MAX, "%s" EVENT_SOURCE_DEVICE_PATH, sysfs);
+	dir = opendir(path);
+	if (!dir) {
+		pr_err("can't read directory '%s'\n", EVENT_SOURCE_DEVICE_PATH);
+		*err = -EINVAL;
+		goto out;
+	}
+
+	while ((dent = readdir(dir))) {
+		if (strstr(dent->d_name, HISI_PTT_PMU_NAME))
+			(*nr_ptts)++;
+	}
+
+	if (!(*nr_ptts))
+		goto out;
+
+	hisi_ptt_pmus = zalloc(sizeof(struct perf_pmu *) * (*nr_ptts));
+	if (!hisi_ptt_pmus) {
+		pr_err("hisi_ptt alloc failed\n");
+		*err = -ENOMEM;
+		goto out;
+	}
+
+	rewinddir(dir);
+	while ((dent = readdir(dir))) {
+		if (strstr(dent->d_name, HISI_PTT_PMU_NAME) && idx < (*nr_ptts)) {
+			hisi_ptt_pmus[idx] = perf_pmu__find(dent->d_name);
+			if (hisi_ptt_pmus[idx])
+				idx++;
+		}
+	}
+
+out:
+	closedir(dir);
+	return hisi_ptt_pmus;
+}
+
 static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
 					   int pmu_nr, struct evsel *evsel)
 {
@@ -71,17 +120,21 @@ struct auxtrace_record
 {
 	struct perf_pmu *cs_etm_pmu = NULL;
 	struct perf_pmu **arm_spe_pmus = NULL;
+	struct perf_pmu **hisi_ptt_pmus = NULL;
 	struct evsel *evsel;
 	struct perf_pmu *found_etm = NULL;
 	struct perf_pmu *found_spe = NULL;
+	struct perf_pmu *found_ptt = NULL;
 	int auxtrace_event_cnt = 0;
 	int nr_spes = 0;
+	int nr_ptts = 0;
 
 	if (!evlist)
 		return NULL;
 
 	cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
 	arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
+	hisi_ptt_pmus = find_all_hisi_ptt_pmus(&nr_ptts, err);
 
 	evlist__for_each_entry(evlist, evsel) {
 		if (cs_etm_pmu && !found_etm)
@@ -89,9 +142,13 @@ struct auxtrace_record
 
 		if (arm_spe_pmus && !found_spe)
 			found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel);
+
+		if (hisi_ptt_pmus && !found_ptt)
+			found_ptt = find_pmu_for_event(hisi_ptt_pmus, nr_ptts, evsel);
 	}
 
 	free(arm_spe_pmus);
+	free(hisi_ptt_pmus);
 
 	if (found_etm)
 		auxtrace_event_cnt++;
@@ -99,6 +156,9 @@ struct auxtrace_record
 	if (found_spe)
 		auxtrace_event_cnt++;
 
+	if (found_ptt)
+		auxtrace_event_cnt++;
+
 	if (auxtrace_event_cnt > 1) {
 		pr_err("Concurrent AUX trace operation not currently supported\n");
 		*err = -EOPNOTSUPP;
@@ -111,6 +171,9 @@ struct auxtrace_record
 #if defined(__aarch64__)
 	if (found_spe)
 		return arm_spe_recording_init(err, found_spe);
+
+	if (found_ptt)
+		return hisi_ptt_recording_init(err, found_ptt);
 #endif
 
 	/*
diff --git a/tools/perf/arch/arm/util/pmu.c b/tools/perf/arch/arm/util/pmu.c
index b8b23b9dc598..887c8addc491 100644
---
[PATCH v8 8/8] MAINTAINERS: Add maintainer for HiSilicon PTT driver
Add maintainer for driver and documentation of HiSilicon PTT device.

Signed-off-by: Yicong Yang
Reviewed-by: Jonathan Cameron
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fd768d43e048..d30a1698251c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8858,6 +8858,13 @@ F:	Documentation/admin-guide/perf/hisi-pcie-pmu.rst
 F:	Documentation/admin-guide/perf/hisi-pmu.rst
 F:	drivers/perf/hisilicon
 
+HISILICON PTT DRIVER
+M:	Yicong Yang
+L:	linux-ker...@vger.kernel.org
+S:	Maintained
+F:	Documentation/trace/hisi-ptt.rst
+F:	drivers/hwtracing/ptt/
+
 HISILICON QM AND ZIP Controller DRIVER
 M:	Zhou Wang
 L:	linux-cry...@vger.kernel.org
-- 
2.24.0
[PATCH v8 0/8] Add support for HiSilicon PCIe Tune and Trace device
HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
integrated Endpoint (RCiEP) device, providing the capability to
dynamically monitor and tune the PCIe traffic (tune), and to trace
the TLP headers (trace).

PTT tune is designed for monitoring and adjusting PCIe link parameters.
We provide several parameters of the PCIe link. Through the driver,
users can adjust the value of certain parameters to affect the PCIe link
and improve performance in certain situations.

PTT trace is designed for dumping the TLP headers to memory, which
can be used to analyze the transactions and usage condition of the PCIe
link. Users can choose filters to trace headers, either by requester
ID or by a set of Root Ports on the same core as the PTT device (tracing
everything downstream of them). Tracing headers of a certain type and
of a certain direction is also supported.

The driver registers a PMU device for each PTT device. The trace can
be used through `perf record` and the traced headers can be decoded
by `perf report`. The perf command support for the device is also
added in this patchset. The tune can be used through the sysfs
attributes of the related PMU device. See the documentation for the
detailed usage.

Change since v7:
- Configure the DMA in probe rather than at runtime. Also use devres to
  manage the PMU device as we have no ordering problem now
- Refactor the config validation function per John and Leo
- Use a spinlock hisi_ptt::pmu_lock instead of a mutex to serialize the
  perf process in pmu::start as it's in atomic context
- Only commit the traced data when stopping, per Leo and James
- Drop the filter dynamically updating patch from this series to simplify
  the review of the driver. That patch will be sent separately.
- Add a cpumask sysfs attribute and handle the cpu hotplug events,
  following the uncore PMU convention
- Other cleanups and fixes, both in the driver and the perf tool
Link: https://lore.kernel.org/lkml/20220407125841.3678-1-yangyic...@hisilicon.com/

Change since v6:
- Fix W=1 errors reported by lkp test, thanks

Change since v5:
- Squash the PMU patch into PATCH 2 as suggested by John
- Refine the commit message of PATCH 1 and some comments
Link: https://lore.kernel.org/lkml/20220308084930.5142-1-yangyic...@hisilicon.com/

Change since v4:
Address the comments from Jonathan, John and Ma Ca, thanks.
- Use devm* also for allocating the DMA buffers
- Remove the IRQ handler stub in Patch 2
- Make functions waiting for hardware state return boolean
- Manually remove the PMU device as it should be removed first
- Modify the order in probe and removal so that they match
- Make the available {directions,type,format} arrays const and non-global
- Use the right filter list in filters show and properly protect the
  list with a mutex
- Record the trace status with a boolean @started rather than an enum
- Optimize the process of finding the PTT devices of the perf-tool
Link: https://lore.kernel.org/linux-pci/20220221084307.33712-1-yangyic...@hisilicon.com/

Change since v3:
Address the comments from Jonathan and John, thanks.
- Drop members in the common struct which can be obtained on the fly
- Reduce the buffer struct and organize the buffers with an array
  instead of a list
- Reduce the DMA reset wait time to avoid a long busy loop
- Split the available_filters sysfs attribute into two files, for root
  port and requester respectively. Update the documentation accordingly
- Make the IOMMU mapping check earlier in probe to avoid a race
  condition. Also place the IOMMU quirk patch before the driver in the
  series
- Cleanups and typo fixes from John and Jonathan
Link: https://lore.kernel.org/linux-pci/20220124131118.17887-1-yangyic...@hisilicon.com/

Change since v2:
- Address the comments from Mathieu, thanks.
  - rename the directory to ptt to match the function of the device
  - spin off the declarations to a separate header
  - split the trace function into several patches
  - some other comments.
- Make the default smmu domain type of the PTT device identity.
  Drop the RMR as it's not recommended and use an iommu_def_domain_type
  quirk to pass through the device DMA as suggested by Robin.
Link: https://lore.kernel.org/linux-pci/2026090625.53702-1-yangyic...@hisilicon.com/

Change since v1:
- Switch the user interface of trace to perf from debugfs
- Switch the user interface of tune to sysfs from debugfs
- Add perf tool support to start trace and decode the trace data
- Address the comments on the documentation from Bjorn
- Add RMR[1] support of the device as trace works in RMR mode or
  direct DMA mode. RMR support is achieved by common APIs rather
  than the APIs implemented in [1].
Link: https://lore.kernel.org/lkml/1618654631-42454-1-git-send-email-yangyic...@hisilicon.com/

[1] https://lore.kernel.org/linux-acpi/20210805080724.480-1-shameerali.kolothum.th...@huawei.com/

Qi Liu (3):
  perf arm: Refactor event list iteration in auxtrace_record__init()
  perf tool: Add support for HiSilicon PCIe Tune and Trace device driver
  perf
[PATCH v8 1/8] iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity
The DMA operations of HiSilicon PTT device can only work properly with
identical mappings. So add a quirk for the device to force the domain
as passthrough.

Acked-by: Will Deacon
Signed-off-by: Yicong Yang
Reviewed-by: John Garry
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 627a3ed5ee8f..7f51823ab63b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2839,6 +2839,26 @@ static int arm_smmu_dev_disable_feature(struct device *dev,
 	}
 }
 
+/*
+ * HiSilicon PCIe tune and trace device can be used to trace TLP headers on the
+ * PCIe link and save the data to memory by DMA. The hardware is restricted to
+ * use identity mapping only.
+ */
+#define IS_HISI_PTT_DEVICE(pdev)	((pdev)->vendor == PCI_VENDOR_ID_HUAWEI && \
+					 (pdev)->device == 0xa12e)
+
+static int arm_smmu_def_domain_type(struct device *dev)
+{
+	if (dev_is_pci(dev)) {
+		struct pci_dev *pdev = to_pci_dev(dev);
+
+		if (IS_HISI_PTT_DEVICE(pdev))
+			return IOMMU_DOMAIN_IDENTITY;
+	}
+
+	return 0;
+}
+
 static struct iommu_ops arm_smmu_ops = {
 	.capable		= arm_smmu_capable,
 	.domain_alloc		= arm_smmu_domain_alloc,
@@ -2856,6 +2876,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.sva_unbind		= arm_smmu_sva_unbind,
 	.sva_get_pasid		= arm_smmu_sva_get_pasid,
 	.page_response		= arm_smmu_page_response,
+	.def_domain_type	= arm_smmu_def_domain_type,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 	.owner			= THIS_MODULE,
 	.default_domain_ops = &(const struct iommu_domain_ops) {
-- 
2.24.0
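For readers who want to try the quirk's decision logic outside the kernel, here is a minimal userspace sketch. It only mirrors the vendor/device comparison from the patch above. The struct, the function names and the two SKETCH_DOMAIN_* return codes are hypothetical stand-ins and are not the kernel's definitions; 0x19e5 is the Huawei PCI vendor ID and 0xa12e is the device ID quoted in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Huawei PCI vendor ID; the kernel takes it from pci_ids.h. */
#define PCI_VENDOR_ID_HUAWEI	0x19e5

/* Placeholder return codes for this sketch, not the kernel's values. */
#define SKETCH_DOMAIN_DEFAULT	0
#define SKETCH_DOMAIN_IDENTITY	1

/* Hypothetical stand-in for the two struct pci_dev fields the quirk reads. */
struct sketch_pci_dev {
	uint16_t vendor;
	uint16_t device;
};

/* Same comparison as IS_HISI_PTT_DEVICE() in the patch. */
static int sketch_is_hisi_ptt(const struct sketch_pci_dev *pdev)
{
	return pdev->vendor == PCI_VENDOR_ID_HUAWEI && pdev->device == 0xa12e;
}

/*
 * Mirrors the shape of arm_smmu_def_domain_type(): request identity mapping
 * for the PTT device and return 0 ("no preference") for everything else.
 */
static int sketch_def_domain_type(const struct sketch_pci_dev *pdev)
{
	assert(pdev != NULL);

	if (sketch_is_hisi_ptt(pdev))
		return SKETCH_DOMAIN_IDENTITY;

	return SKETCH_DOMAIN_DEFAULT;
}
```

Calling sketch_def_domain_type() with { 0x19e5, 0xa12e } yields the identity code, while any other vendor/device pair yields 0, which is how the real callback leaves the default domain choice to the IOMMU core.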
Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used
Joerg,

On 5/13/22 8:07 PM, Joerg Roedel wrote:
> On Mon, May 09, 2022 at 02:48:15AM -0500, Suravee Suthikulpanit wrote:
>> On AMD system with SNP enabled, IOMMU hardware checks the host translation
>> valid (TV) and guest translation valid (GV) bits in the device
>> table entry (DTE) before accessing the corresponding page tables.
>>
>> However, current IOMMU driver sets the TV bit for all devices
>> regardless of whether the host page table is in use. This results in
>> ILLEGAL_DEV_TABLE_ENTRY event for devices, which do not have the host
>> page table root pointer set up.
>
> Hmm, this sounds weird. In the early AMD IOMMUs it was recommended to set
> TV=1 and V=1 and the rest to 0 to block all DMA from a device.
>
> I wonder how this triggers ILLEGAL_DEV_TABLE_ENTRY errors now. It is
> (was?) legal to set V=1 TV=1, mode=0 and leave the page-table empty.

Due to the new restriction (please see the IOMMU spec Rev 3.06-PUB - Apr 2021
https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf) where the use of
DTE[Mode]=0 is not supported on systems that are SNP-enabled (i.e.
EFR[SNPSup]=1), the IOMMU HW looks at the DTE[TV] bit to determine if it
needs to handle the v1 page table. When the HW encounters a DTE entry with
TV=1, V=1, Mode=0, it generates an ILLEGAL_DEV_TABLE_ENTRY event.

Note: I am following up with HW folks for the updated document for this
specific detail.

Therefore, we need to modify the IOMMU driver as follows:

- For non-DMA devices (e.g. the IOAPIC devices), we need to modify the IOMMU
  driver to default to DTE[TV]=0. For Linux, this is equivalent to a DTE with
  domain ID 0.

- I am still trying to see what is the best way to force Linux to not allow
  Mode=0 (i.e. iommu=pt mode). Any thoughts?

- Also, it seems that the current iommu v2 page table use case, where
  GVA->GPA=SPA, will no longer be supported on systems w/ SNPSup=1. Any
  thoughts?

> When then IW=0 and IR=0, DMA is blocked. From what I remember this is a
> valid setting in a DTE.

Correct.

> Do you have an example DTE which triggers this error message?

This is specifically from the device representing an IOAPIC.

[ +0.000108] iommu ivhd0: AMD-Vi: Event logged [ILLEGAL_DEV_TABLE_ENTRY device=c0:00.1 pasid=0x0 address=0xfffdf814 flags=0x0008]
[ +0.11] AMD-Vi: DTE[0]: 0003
[ +0.03] AMD-Vi: DTE[1]:
[ +0.02] AMD-Vi: DTE[2]: 2008000100258013
[ +0.01] AMD-Vi: DTE[3]:

Best Regards,
Suravee
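The condition described in this reply (V=1, TV=1, Mode=0 on an SNP-enabled system raising ILLEGAL_DEV_TABLE_ENTRY) can be written down as a small predicate. The following is only a sketch: the bit positions (V at bit 0, TV at bit 1, Mode at bits 11:9 of the first DTE quadword) are assumptions taken from the AMD IOMMU DTE layout as I understand it, and the function and macro names are hypothetical, not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed DTE[0] layout: V = bit 0, TV = bit 1, Mode = bits 11:9. */
#define SKETCH_DTE_V		(1ULL << 0)
#define SKETCH_DTE_TV		(1ULL << 1)
#define SKETCH_DTE_MODE_SHIFT	9
#define SKETCH_DTE_MODE_MASK	0x7ULL

/*
 * Returns true when the first DTE quadword has the combination that the
 * reply says is rejected on SNP-enabled systems: V=1, TV=1 and Mode=0.
 * Without SNP the same entry is treated as legal ("block all DMA").
 */
static bool sketch_dte_illegal_under_snp(uint64_t dte0, bool snp_enabled)
{
	uint64_t mode = (dte0 >> SKETCH_DTE_MODE_SHIFT) & SKETCH_DTE_MODE_MASK;

	if (!snp_enabled)
		return false;

	return (dte0 & SKETCH_DTE_V) && (dte0 & SKETCH_DTE_TV) && mode == 0;
}
```

Under these assumptions an entry whose low bits are 0x3 is flagged when SNP is enabled, while clearing TV (0x1) or setting a non-zero Mode avoids the event. That matches the proposal above to default non-DMA devices to DTE[TV]=0.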
[PATCH v4 2/2] iommu/mediatek: Allow page table PA up to 35bit
From: Yunfei Wang Add the quirk IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT support, so that allows page table PA up to 35bit, not only in ZONE_DMA32. Signed-off-by: Ning Li Signed-off-by: Yunfei Wang --- drivers/iommu/mtk_iommu.c | 29 + 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 6fd75a60abd6..1b9a876ef271 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -33,6 +33,7 @@ #define REG_MMU_PT_BASE_ADDR 0x000 #define MMU_PT_ADDR_MASK GENMASK(31, 7) +#define MMU_PT_ADDR_2_0_MASK GENMASK(2, 0) #define REG_MMU_INVALIDATE 0x020 #define F_ALL_INVLD0x2 @@ -118,6 +119,7 @@ #define WR_THROT_ENBIT(6) #define HAS_LEGACY_IVRP_PADDR BIT(7) #define IOVA_34_EN BIT(8) +#define PGTABLE_PA_35_EN BIT(9) #define MTK_IOMMU_HAS_FLAG(pdata, _x) \ pdata)->flags) & (_x)) == (_x)) @@ -401,6 +403,9 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom, .iommu_dev = data->dev, }; + if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN)) + dom->cfg.quirks |= IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT; + if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) dom->cfg.oas = data->enable_4GB ? 
33 : 32; else @@ -450,6 +455,7 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain, struct mtk_iommu_domain *dom = to_mtk_domain(domain); struct device *m4udev = data->dev; int ret, domid; + u32 regval; domid = mtk_iommu_get_domain_id(dev, data->plat_data); if (domid < 0) @@ -472,8 +478,14 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain, return ret; } data->m4u_dom = dom; - writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK, - data->base + REG_MMU_PT_BASE_ADDR); + + /* Bits[6:3] are invalid for mediatek platform */ + if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN)) + regval = (dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK) | +(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_2_0_MASK); + else + regval = dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK; + writel(regval, data->base + REG_MMU_PT_BASE_ADDR); pm_runtime_put(m4udev); } @@ -987,6 +999,7 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) struct mtk_iommu_suspend_reg *reg = >reg; struct mtk_iommu_domain *m4u_dom = data->m4u_dom; void __iomem *base = data->base; + u32 regval; int ret; ret = clk_prepare_enable(data->bclk); @@ -1010,7 +1023,14 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL); writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR); writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG); - writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK, base + REG_MMU_PT_BASE_ADDR); + + /* Bits[6:3] are invalid for mediatek platform */ + if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN)) + regval = (m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK) | +(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_2_0_MASK); + else + regval = m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK; + writel(regval, base + REG_MMU_PT_BASE_ADDR); /* * Users may allocate dma buffer before they call pm_runtime_get, @@ -1038,7 +1058,8 @@ static const struct 
mtk_iommu_plat_data mt2712_data = { static const struct mtk_iommu_plat_data mt6779_data = { .m4u_plat = M4U_MT6779, - .flags = HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN, + .flags = HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN | +PGTABLE_PA_35_EN, .inv_sel_reg = REG_MMU_INV_SEL_GEN2, .iova_region = single_domain, .iova_region_nr = ARRAY_SIZE(single_domain), -- 2.18.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
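The register arithmetic in this patch is easier to check with concrete numbers. The sketch below reimplements, in plain userspace C, the two masks used when writing REG_MMU_PT_BASE_ADDR and the ARM_V7S_TTBR_35BIT_PA() packing from the companion io-pgtable-arm-v7s patch (PA bits [34:32] stored in TTBR bits [2:0]). The function names are hypothetical, and the GENMASK values are expanded by hand, so treat it as an illustration rather than the driver code.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PT_ADDR_MASK	0xffffff80u	/* GENMASK(31, 7) */
#define SKETCH_PT_ADDR_2_0_MASK	0x00000007u	/* GENMASK(2, 0) */

/* Pack PA bits [34:32] into TTBR bits [2:0], keeping TTBR bits [31:3]. */
static uint32_t sketch_ttbr_35bit_pa(uint32_t ttbr, uint64_t pa)
{
	uint32_t pa_hi = (uint32_t)((pa & (0x7ULL << 32)) >> 32);

	return (ttbr & (~0u << 3)) | pa_hi;
}

/*
 * Value written to the page table base register: bits [31:7] always,
 * bits [2:0] only when the platform has PGTABLE_PA_35_EN. Bits [6:3]
 * are dropped in both cases, as the comment in the patch says.
 */
static uint32_t sketch_pt_base_regval(uint32_t ttbr, bool pa_35_en)
{
	uint32_t regval = ttbr & SKETCH_PT_ADDR_MASK;

	if (pa_35_en)
		regval |= ttbr & SKETCH_PT_ADDR_2_0_MASK;

	return regval;
}
```

For example, a page table at physical address 0x480000000 (bit 34 set) with a TTBR low word of 0x80000048 packs to 0x8000004c, and the register value becomes 0x80000004 with the flag set or 0x80000000 without it, so platforms without the flag keep the old behaviour.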
[PATCH v4 1/2] iommu/io-pgtable-arm-v7s: Add a quirk to allow pgtable PA up to 35bit
From: Yunfei Wang The calling to kmem_cache_alloc for level 2 pgtable allocation may run in atomic context, and it fails sometimes when DMA32 zone runs out of memory. Since Mediatek IOMMU hardware support at most 35bit PA in pgtable, so add a quirk to allow the PA of pgtables support up to bit35. Signed-off-by: Ning Li Signed-off-by: Yunfei Wang --- drivers/iommu/io-pgtable-arm-v7s.c | 56 ++ include/linux/io-pgtable.h | 15 +--- 2 files changed, 52 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c index be066c1503d3..57455ae052ac 100644 --- a/drivers/iommu/io-pgtable-arm-v7s.c +++ b/drivers/iommu/io-pgtable-arm-v7s.c @@ -149,6 +149,10 @@ #define ARM_V7S_TTBR_IRGN_ATTR(attr) \ attr) & 0x1) << 6) | (((attr) & 0x2) >> 1)) +/* Mediatek extend ttbr bits[2:0] for PA bits[34:32] */ +#define ARM_V7S_TTBR_35BIT_PA(ttbr, pa) \ + ((ttbr & ((u32)(~0U << 3))) | ((pa & GENMASK_ULL(34, 32)) >> 32)) + #ifdef CONFIG_ZONE_DMA32 #define ARM_V7S_TABLE_GFP_DMA GFP_DMA32 #define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA32 @@ -182,14 +186,8 @@ static bool arm_v7s_is_mtk_enabled(struct io_pgtable_cfg *cfg) (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT); } -static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl, - struct io_pgtable_cfg *cfg) +static arm_v7s_iopte to_iopte_mtk(phys_addr_t paddr, arm_v7s_iopte pte) { - arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl); - - if (!arm_v7s_is_mtk_enabled(cfg)) - return pte; - if (paddr & BIT_ULL(32)) pte |= ARM_V7S_ATTR_MTK_PA_BIT32; if (paddr & BIT_ULL(33)) @@ -199,6 +197,17 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl, return pte; } +static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl, + struct io_pgtable_cfg *cfg) +{ + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl); + + if (!arm_v7s_is_mtk_enabled(cfg)) + return pte; + + return to_iopte_mtk(paddr, pte); +} + static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl, struct io_pgtable_cfg 
*cfg) { @@ -234,6 +243,7 @@ static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl, static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, struct arm_v7s_io_pgtable *data) { + gfp_t gfp_l1 = __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA; struct io_pgtable_cfg *cfg = >iop.cfg; struct device *dev = cfg->iommu_dev; phys_addr_t phys; @@ -241,9 +251,11 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, size_t size = ARM_V7S_TABLE_SIZE(lvl, cfg); void *table = NULL; + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT) + gfp_l1 = __GFP_ZERO; + if (lvl == 1) - table = (void *)__get_free_pages( - __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size)); + table = (void *)__get_free_pages(gfp_l1, get_order(size)); else if (lvl == 2) table = kmem_cache_zalloc(data->l2_tables, gfp); @@ -251,7 +263,8 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, return NULL; phys = virt_to_phys(table); - if (phys != (arm_v7s_iopte)phys) { + if (phys != (arm_v7s_iopte)phys && + !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)) { /* Doesn't fit in PTE */ dev_err(dev, "Page table does not fit in PTE: %pa", ); goto out_free; @@ -457,9 +470,14 @@ static arm_v7s_iopte arm_v7s_install_table(arm_v7s_iopte *table, arm_v7s_iopte curr, struct io_pgtable_cfg *cfg) { + phys_addr_t phys = virt_to_phys(table); arm_v7s_iopte old, new; - new = virt_to_phys(table) | ARM_V7S_PTE_TYPE_TABLE; + new = phys | ARM_V7S_PTE_TYPE_TABLE; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT) + new = to_iopte_mtk(phys, new); + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS) new |= ARM_V7S_ATTR_NS_TABLE; @@ -778,7 +796,9 @@ static phys_addr_t arm_v7s_iova_to_phys(struct io_pgtable_ops *ops, static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie) { + slab_flags_t slab_flag = ARM_V7S_TABLE_SLAB_FLAGS; struct arm_v7s_io_pgtable *data; + phys_addr_t paddr; if (cfg->ias > (arm_v7s_is_mtk_enabled(cfg) ? 
34 : ARM_V7S_ADDR_BITS)) return NULL; @@ -788,7 +808,8 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg, if (cfg->quirks
Re: [PATCH 4/7] dt-bindings: renesas,rcar-dmac: R-Car V3U is R-Car Gen4
On 02-05-22, 15:34, Geert Uytterhoeven wrote:
> Despite the name, R-Car V3U is the first member of the R-Car Gen4
> family. Hence move its compatible value to the R-Car Gen4 section.

Applied, thanks

-- 
~Vinod
Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops
On 2022-05-16 02:57, Lu Baolu wrote: Each IOMMU driver must provide a blocking domain ops. If the hardware supports detaching domain from device, setting blocking domain equals detaching the existing domain from the deivce. Otherwise, an UNMANAGED domain without any mapping will be used instead. Unfortunately that's backwards - most of the implementations of .detach_dev are disabling translation entirely, meaning the device ends up effectively in passthrough rather than blocked. Conversely, at least arm-smmu and arm-smmu-v3 could implement IOMMU_DOMAIN_BLOCKED properly with fault-type S2CRs and STEs respectively, it just needs a bit of wiring up. Thanks, Robin. Signed-off-by: Lu Baolu --- include/linux/iommu.h | 7 +++ drivers/iommu/amd/iommu.c | 12 drivers/iommu/apple-dart.c | 12 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++ drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +++ drivers/iommu/arm/arm-smmu/qcom_iommu.c | 12 drivers/iommu/exynos-iommu.c| 12 drivers/iommu/fsl_pamu_domain.c | 12 drivers/iommu/intel/iommu.c | 12 drivers/iommu/ipmmu-vmsa.c | 12 drivers/iommu/msm_iommu.c | 12 drivers/iommu/mtk_iommu.c | 12 drivers/iommu/mtk_iommu_v1.c| 12 drivers/iommu/omap-iommu.c | 12 drivers/iommu/rockchip-iommu.c | 12 drivers/iommu/s390-iommu.c | 12 drivers/iommu/sprd-iommu.c | 11 +++ drivers/iommu/sun50i-iommu.c| 12 drivers/iommu/tegra-gart.c | 12 drivers/iommu/tegra-smmu.c | 12 drivers/iommu/virtio-iommu.c| 3 +++ 21 files changed, 219 insertions(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 572399ac1d83..5e228aad0ef6 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -216,6 +216,7 @@ struct iommu_iotlb_gather { *- IOMMU_DOMAIN_DMA: must use a dma domain *- 0: use the default setting * @default_domain_ops: the default ops for domains + * @blocking_domain_ops: the blocking ops for domains * @pgsize_bitmap: bitmap of all possible supported page sizes * @owner: Driver module providing these ops */ @@ -255,6 +256,7 @@ struct iommu_ops { 
int (*def_domain_type)(struct device *dev); const struct iommu_domain_ops *default_domain_ops; + const struct iommu_domain_ops *blocking_domain_ops; unsigned long pgsize_bitmap; struct module *owner; }; @@ -279,6 +281,9 @@ struct iommu_ops { * @enable_nesting: Enable nesting * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*) * @free: Release the domain after use. + * @blocking_domain_detach: iommu hardware support detaching a domain from + * a device, hence setting blocking domain to a device equals to + * detach the existing domain from it. */ struct iommu_domain_ops { int (*set_dev)(struct iommu_domain *domain, struct device *dev); @@ -310,6 +315,8 @@ struct iommu_domain_ops { unsigned long quirks); void (*free)(struct iommu_domain *domain); + + unsigned int blocking_domain_detach:1; }; /** diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 01b8668ef46d..c66713439824 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -2272,6 +2272,14 @@ static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain) return true; } +static int amd_blocking_domain_set_dev(struct iommu_domain *domain, + struct device *dev) +{ + amd_iommu_detach_device(domain, dev); + + return 0; +} + const struct iommu_ops amd_iommu_ops = { .capable = amd_iommu_capable, .domain_alloc = amd_iommu_domain_alloc, @@ -2295,6 +2303,10 @@ const struct iommu_ops amd_iommu_ops = { .iotlb_sync = amd_iommu_iotlb_sync, .free = amd_iommu_domain_free, .enforce_cache_coherency = amd_iommu_enforce_cache_coherency, + }, + .blocking_domain_ops = &(const struct iommu_domain_ops) { + .set_dev= amd_blocking_domain_set_dev, + .blocking_domain_detach = true, } }; diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c index a0b7281f1989..3c37762e01ec 100644 --- a/drivers/iommu/apple-dart.c +++ b/drivers/iommu/apple-dart.c @@ -763,6 +763,14 @@ static void apple_dart_get_resv_regions(struct device *dev, iommu_dma_get_resv_regions(dev, 
head); }
Re: [PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus
On 5/16/22 13:44, Robin Murphy wrote: On 2022-05-16 11:13, Mikko Perttunen wrote: On 5/16/22 13:07, Will Deacon wrote: On Mon, May 16, 2022 at 11:52:54AM +0300, cyn...@kapsi.fi wrote: From: Mikko Perttunen Set itself as the IOMMU for the host1x context device bus, containing "dummy" devices used for Host1x context isolation. Signed-off-by: Mikko Perttunen --- drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 568cce590ccc..9ff54eaecf81 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -39,6 +39,7 @@ #include #include +#include #include "arm-smmu.h" @@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct iommu_ops *ops) goto err_reset_pci_ops; } #endif +#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS + if (!iommu_present(_context_device_bus_type)) { + err = bus_set_iommu(_context_device_bus_type, ops); + if (err) + goto err_reset_fsl_mc_ops; + } +#endif + return 0; +err_reset_fsl_mc_ops: __maybe_unused; +#ifdef CONFIG_FSL_MC_BUS + bus_set_iommu(_mc_bus_type, NULL); +#endif bus_set_iommu() is going away: https://lore.kernel.org/r/cover.1650890638.git.robin.mur...@arm.com Will Thanks for the heads-up. Robin had pointed out that this work was ongoing but I hadn't seen the patches yet. I'll look into it. Although that *is* currently blocked on the mystery intel-iommu problem that I can't reproduce... If this series is ready to land right now for 5.19 then in principle that might be the easiest option overall. Hopefully at least patch #2 could sneak in so that the compile-time dependencies are ready for me to roll up host1x into the next rebase of "iommu: Always register bus notifiers". Cheers, Robin. My guess is that the series as a whole is not ready to land in the 5.19 timeframe, but #2 could be possible. Thierry, any opinion? 
Thanks,
Mikko
Re: [PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus
On 2022-05-16 11:13, Mikko Perttunen wrote:
> On 5/16/22 13:07, Will Deacon wrote:
>> On Mon, May 16, 2022 at 11:52:54AM +0300, cyn...@kapsi.fi wrote:
>>> From: Mikko Perttunen
>>>
>>> Set itself as the IOMMU for the host1x context device bus, containing
>>> "dummy" devices used for Host1x context isolation.
>>>
>>> Signed-off-by: Mikko Perttunen
>>> ---
>>>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +
>>>  1 file changed, 13 insertions(+)
>>>
>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>>> index 568cce590ccc..9ff54eaecf81 100644
>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>>> @@ -39,6 +39,7 @@
>>>  #include
>>>  #include
>>> +#include
>>>  #include "arm-smmu.h"
>>> @@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct iommu_ops *ops)
>>>  			goto err_reset_pci_ops;
>>>  	}
>>>  #endif
>>> +#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS
>>> +	if (!iommu_present(_context_device_bus_type)) {
>>> +		err = bus_set_iommu(_context_device_bus_type, ops);
>>> +		if (err)
>>> +			goto err_reset_fsl_mc_ops;
>>> +	}
>>> +#endif
>>> +
>>>  	return 0;
>>> +err_reset_fsl_mc_ops: __maybe_unused;
>>> +#ifdef CONFIG_FSL_MC_BUS
>>> +	bus_set_iommu(_mc_bus_type, NULL);
>>> +#endif
>>
>> bus_set_iommu() is going away:
>>
>> https://lore.kernel.org/r/cover.1650890638.git.robin.mur...@arm.com
>>
>> Will
>
> Thanks for the heads-up. Robin had pointed out that this work was
> ongoing but I hadn't seen the patches yet. I'll look into it.

Although that *is* currently blocked on the mystery intel-iommu problem
that I can't reproduce... If this series is ready to land right now for
5.19 then in principle that might be the easiest option overall.
Hopefully at least patch #2 could sneak in so that the compile-time
dependencies are ready for me to roll up host1x into the next rebase of
"iommu: Always register bus notifiers".

Cheers,
Robin.
Re: [PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus
On 5/16/22 13:07, Will Deacon wrote:
> On Mon, May 16, 2022 at 11:52:54AM +0300, cyn...@kapsi.fi wrote:
>> From: Mikko Perttunen
>>
>> Set itself as the IOMMU for the host1x context device bus, containing
>> "dummy" devices used for Host1x context isolation.
>>
>> Signed-off-by: Mikko Perttunen
>> ---
>>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> index 568cce590ccc..9ff54eaecf81 100644
>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> @@ -39,6 +39,7 @@
>>  #include
>>  #include
>> +#include
>>  #include "arm-smmu.h"
>> @@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct iommu_ops *ops)
>>  			goto err_reset_pci_ops;
>>  	}
>>  #endif
>> +#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS
>> +	if (!iommu_present(_context_device_bus_type)) {
>> +		err = bus_set_iommu(_context_device_bus_type, ops);
>> +		if (err)
>> +			goto err_reset_fsl_mc_ops;
>> +	}
>> +#endif
>> +
>>  	return 0;
>> +err_reset_fsl_mc_ops: __maybe_unused;
>> +#ifdef CONFIG_FSL_MC_BUS
>> +	bus_set_iommu(_mc_bus_type, NULL);
>> +#endif
>
> bus_set_iommu() is going away:
>
> https://lore.kernel.org/r/cover.1650890638.git.robin.mur...@arm.com
>
> Will

Thanks for the heads-up. Robin had pointed out that this work was
ongoing but I hadn't seen the patches yet. I'll look into it.

Mikko
Re: [PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus
On Mon, May 16, 2022 at 11:52:54AM +0300, cyn...@kapsi.fi wrote:
> From: Mikko Perttunen
>
> Set itself as the IOMMU for the host1x context device bus, containing
> "dummy" devices used for Host1x context isolation.
>
> Signed-off-by: Mikko Perttunen
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 568cce590ccc..9ff54eaecf81 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -39,6 +39,7 @@
>
>  #include
>  #include
> +#include
>
>  #include "arm-smmu.h"
>
> @@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct iommu_ops *ops)
>  			goto err_reset_pci_ops;
>  	}
>  #endif
> +#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS
> +	if (!iommu_present(_context_device_bus_type)) {
> +		err = bus_set_iommu(_context_device_bus_type, ops);
> +		if (err)
> +			goto err_reset_fsl_mc_ops;
> +	}
> +#endif
> +
>  	return 0;
>
> +err_reset_fsl_mc_ops: __maybe_unused;
> +#ifdef CONFIG_FSL_MC_BUS
> +	bus_set_iommu(_mc_bus_type, NULL);
> +#endif

bus_set_iommu() is going away:

https://lore.kernel.org/r/cover.1650890638.git.robin.mur...@arm.com

Will
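The quoted hunk extends a goto-based unwind chain: each bus that was already given the IOMMU ops is reset in reverse order when a later registration fails. A minimal userspace sketch of that pattern follows. The struct, the `fail` flag and every function name are hypothetical; the real arm_smmu_bus_init() also checks iommu_present() and wraps each step in #ifdef, which is left out here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a bus: remembers which ops were installed. */
struct sketch_bus {
	const void *iommu_ops;
	int fail;	/* non-zero: installing ops on this bus fails */
};

/* Install ops on a bus, or clear them when ops is NULL. */
static int sketch_bus_set_iommu(struct sketch_bus *bus, const void *ops)
{
	if (ops && bus->fail)
		return -EBUSY;

	bus->iommu_ops = ops;
	return 0;
}

/* Register on three buses in order; undo earlier ones if a later one fails. */
static int sketch_bus_init(struct sketch_bus *pci, struct sketch_bus *fsl_mc,
			   struct sketch_bus *host1x, const void *ops)
{
	int err;

	err = sketch_bus_set_iommu(pci, ops);
	if (err)
		goto err_out;

	err = sketch_bus_set_iommu(fsl_mc, ops);
	if (err)
		goto err_reset_pci_ops;

	err = sketch_bus_set_iommu(host1x, ops);
	if (err)
		goto err_reset_fsl_mc_ops;

	return 0;

err_reset_fsl_mc_ops:
	sketch_bus_set_iommu(fsl_mc, NULL);
err_reset_pci_ops:
	sketch_bus_set_iommu(pci, NULL);
err_out:
	return err;
}
```

If the last bus fails, the labels fall through in reverse registration order, so the first two buses end up with no ops again. This is the reason the patch adds err_reset_fsl_mc_ops ahead of the existing PCI reset label.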
[PATCH v5 3/9] gpu: host1x: Add context device management code
From: Mikko Perttunen Add code to register context devices from device tree, allocate them out and manage their refcounts. Signed-off-by: Mikko Perttunen --- v2: * Directly set DMA mask instead of inheriting from Host1x. * Use iommu-map instead of custom DT property. v4: * Use u64 instead of dma_addr_t for DMA mask * Use unsigned ints for indexes and adjust error handling flow * Parse iommu-map property at top level host1x DT node * Use separate DMA mask per device * Export symbols as GPL v5: * Rename host1x_context to host1x_memory_context --- drivers/gpu/host1x/Makefile | 1 + drivers/gpu/host1x/context.c | 160 +++ drivers/gpu/host1x/context.h | 27 ++ drivers/gpu/host1x/dev.c | 12 ++- drivers/gpu/host1x/dev.h | 2 + include/linux/host1x.h | 18 6 files changed, 219 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/host1x/context.c create mode 100644 drivers/gpu/host1x/context.h diff --git a/drivers/gpu/host1x/Makefile b/drivers/gpu/host1x/Makefile index c891a3e33844..8a65e13d113a 100644 --- a/drivers/gpu/host1x/Makefile +++ b/drivers/gpu/host1x/Makefile @@ -10,6 +10,7 @@ host1x-y = \ debug.o \ mipi.o \ fence.o \ + context.o \ hw/host1x01.o \ hw/host1x02.o \ hw/host1x04.o \ diff --git a/drivers/gpu/host1x/context.c b/drivers/gpu/host1x/context.c new file mode 100644 index ..d7d95b69a72a --- /dev/null +++ b/drivers/gpu/host1x/context.c @@ -0,0 +1,160 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2021, NVIDIA Corporation. 
+ */ + +#include +#include +#include +#include +#include +#include + +#include "context.h" +#include "dev.h" + +int host1x_memory_context_list_init(struct host1x *host1x) +{ + struct host1x_memory_context_list *cdl = >context_list; + struct device_node *node = host1x->dev->of_node; + struct host1x_memory_context *ctx; + unsigned int i; + int err; + + cdl->devs = NULL; + cdl->len = 0; + mutex_init(>lock); + + err = of_property_count_u32_elems(node, "iommu-map"); + if (err < 0) + return 0; + + cdl->devs = kcalloc(err, sizeof(*cdl->devs), GFP_KERNEL); + if (!cdl->devs) + return -ENOMEM; + cdl->len = err / 4; + + for (i = 0; i < cdl->len; i++) { + struct iommu_fwspec *fwspec; + + ctx = >devs[i]; + + ctx->host = host1x; + + device_initialize(>dev); + + /* +* Due to an issue with T194 NVENC, only 38 bits can be used. +* Anyway, 256GiB of IOVA ought to be enough for anyone. +*/ + ctx->dma_mask = DMA_BIT_MASK(38); + ctx->dev.dma_mask = >dma_mask; + ctx->dev.coherent_dma_mask = ctx->dma_mask; + dev_set_name(>dev, "host1x-ctx.%d", i); + ctx->dev.bus = _context_device_bus_type; + ctx->dev.parent = host1x->dev; + + dma_set_max_seg_size(>dev, UINT_MAX); + + err = device_add(>dev); + if (err) { + dev_err(host1x->dev, "could not add context device %d: %d\n", i, err); + goto del_devices; + } + + err = of_dma_configure_id(>dev, node, true, ); + if (err) { + dev_err(host1x->dev, "IOMMU configuration failed for context device %d: %d\n", + i, err); + device_del(>dev); + goto del_devices; + } + + fwspec = dev_iommu_fwspec_get(>dev); + if (!fwspec) { + dev_err(host1x->dev, "Context device %d has no IOMMU!\n", i); + device_del(>dev); + goto del_devices; + } + + ctx->stream_id = fwspec->ids[0] & 0x; + } + + return 0; + +del_devices: + while (i--) + device_del(>devs[i].dev); + + kfree(cdl->devs); + cdl->len = 0; + + return err; +} + +void host1x_memory_context_list_free(struct host1x_memory_context_list *cdl) +{ + unsigned int i; + + for (i = 0; i < cdl->len; i++) + 
device_del(>devs[i].dev); + + kfree(cdl->devs); + cdl->len = 0; +} + +struct host1x_memory_context *host1x_memory_context_alloc(struct host1x *host1x, + struct pid *pid) +{ + struct host1x_memory_context_list *cdl = >context_list; + struct host1x_memory_context *free = NULL; + int i; + + if (!cdl->len) + return ERR_PTR(-EOPNOTSUPP); + + mutex_lock(>lock); + + for (i = 0; i < cdl->len; i++) { + struct host1x_memory_context *cd = >devs[i]; + + if (cd->owner == pid) { +
[PATCH v5 0/9] Host1x context isolation support
From: Mikko Perttunen

*** New in v5:
Rebased
Renamed host1x_context to host1x_memory_context
Small change in DRM side client driver ops to reduce churn with some upcoming changes
Add NVDEC support
***

*** New in v4:
Addressed review comments. See individual patches.
***

*** New in v3:
Added device tree bindings for new property.
***

*** New in v2:
Added support for Tegra194
Use standard iommu-map property instead of custom mechanism
***

This series adds support for Host1x 'context isolation'. When programming engines through Host1x, userspace can program in any addresses it wants, so we need some way to isolate the engines' memory spaces. Traditionally this has been done either imperfectly, with a single shared IOMMU domain, or by copying and verifying the programming command stream at submit time (Host1x firewall).

Since Tegra186 there is a privileged (only usable by kernel) Host1x opcode that allows setting the stream ID sent by the engine to the SMMU. So, by allocating a number of context banks and stream IDs for this purpose, and using this opcode at the beginning of each job, we can implement isolation. Due to the limited number of context banks, each process gets its own context, rather than each channel.

This feature also allows sharing engines among multiple VMs when used with Host1x's hardware virtualization support: up to 8 VMs can be configured with a subset of allowed stream IDs, enforced at hardware level.

To implement this, this series adds a new host1x context bus, which will contain the 'struct device's corresponding to each context bank / stream ID, changes to device tree and SMMU code to allow registering the devices and using the bus, as well as the Host1x stream ID programming code and support in TegraDRM.

- Merging notes -

The changes to DT bindings should be applied on top of Thierry's patch 'dt-bindings: display: tegra: Convert to json-schema'.
Thanks,
Mikko

Mikko Perttunen (9):
  dt-bindings: host1x: Add iommu-map property
  gpu: host1x: Add context bus
  gpu: host1x: Add context device management code
  gpu: host1x: Program context stream ID on submission
  iommu/arm-smmu: Attach to host1x context device bus
  arm64: tegra: Add Host1x context stream IDs on Tegra186+
  drm/tegra: falcon: Set DMACTX field on DMA transactions
  drm/tegra: Support context isolation
  drm/tegra: Implement stream ID related callbacks on engines

 .../display/tegra/nvidia,tegra20-host1x.yaml |   5 +
 arch/arm64/boot/dts/nvidia/tegra186.dtsi     |  11 ++
 arch/arm64/boot/dts/nvidia/tegra194.dtsi     |  11 ++
 drivers/gpu/Makefile                         |   3 +-
 drivers/gpu/drm/tegra/drm.h                  |  11 ++
 drivers/gpu/drm/tegra/falcon.c               |   8 +
 drivers/gpu/drm/tegra/falcon.h               |   1 +
 drivers/gpu/drm/tegra/nvdec.c                |   9 +
 drivers/gpu/drm/tegra/submit.c               |  48 +-
 drivers/gpu/drm/tegra/uapi.c                 |  43 -
 drivers/gpu/drm/tegra/vic.c                  |  67 +++-
 drivers/gpu/host1x/Kconfig                   |   5 +
 drivers/gpu/host1x/Makefile                  |   2 +
 drivers/gpu/host1x/context.c                 | 160 ++
 drivers/gpu/host1x/context.h                 |  27 +++
 drivers/gpu/host1x/context_bus.c             |  31
 drivers/gpu/host1x/dev.c                     |  12 +-
 drivers/gpu/host1x/dev.h                     |   2 +
 drivers/gpu/host1x/hw/channel_hw.c           |  52 +-
 drivers/gpu/host1x/hw/host1x06_hardware.h    |  10 ++
 drivers/gpu/host1x/hw/host1x07_hardware.h    |  10 ++
 drivers/iommu/arm/arm-smmu/arm-smmu.c        |  13 ++
 include/linux/host1x.h                       |  26 +++
 include/linux/host1x_context_bus.h           |  15 ++
 24 files changed, 564 insertions(+), 18 deletions(-)
 create mode 100644 drivers/gpu/host1x/context.c
 create mode 100644 drivers/gpu/host1x/context.h
 create mode 100644 drivers/gpu/host1x/context_bus.c
 create mode 100644 include/linux/host1x_context_bus.h

-- 
2.36.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
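
For readers following the cover letter, the "one context per process" policy can be modelled in a few lines of plain C. This is only a userspace sketch under stated assumptions, not the code from patch 3/9: the names mem_context, context_get and context_put are hypothetical, an owner value of 0 is assumed to mean "unused", and locking is left out.

```c
#include <stddef.h>

/* Hypothetical model of one context bank / stream ID slot. */
struct mem_context {
	int owner;              /* owning process ID; 0 means unused (assumption) */
	unsigned int refs;      /* number of channels currently using the slot */
	unsigned int stream_id; /* stream ID the engine would be switched to */
};

/*
 * Return the context already owned by pid, taking another reference,
 * or claim the first free slot. Returns NULL when every slot belongs
 * to some other process.
 */
struct mem_context *context_get(struct mem_context *pool, size_t len, int pid)
{
	struct mem_context *free_slot = NULL;
	size_t i;

	for (i = 0; i < len; i++) {
		if (pool[i].owner == pid) {
			pool[i].refs++;
			return &pool[i];
		}
		if (pool[i].owner == 0 && !free_slot)
			free_slot = &pool[i];
	}

	if (!free_slot)
		return NULL;

	free_slot->owner = pid;
	free_slot->refs = 1;
	return free_slot;
}

/* Drop one reference; the slot becomes free again when the last user leaves. */
void context_put(struct mem_context *ctx)
{
	if (ctx->refs > 0 && --ctx->refs == 0)
		ctx->owner = 0;
}
```

Two channels opened by the same process share one slot, which is why the number of isolated processes, and not the number of channels, is bounded by the number of context banks.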
[PATCH v5 2/9] gpu: host1x: Add context bus
From: Mikko Perttunen The context bus is a "dummy" bus that contains struct devices that correspond to IOMMU contexts assigned through Host1x to processes. Even when host1x itself is built as a module, the bus is registered in built-in code so that the built-in ARM SMMU driver is able to reference it. Signed-off-by: Mikko Perttunen --- v4: * Export bus as GPL --- drivers/gpu/Makefile | 3 +-- drivers/gpu/host1x/Kconfig | 5 + drivers/gpu/host1x/Makefile| 1 + drivers/gpu/host1x/context_bus.c | 31 ++ include/linux/host1x_context_bus.h | 15 +++ 5 files changed, 53 insertions(+), 2 deletions(-) create mode 100644 drivers/gpu/host1x/context_bus.c create mode 100644 include/linux/host1x_context_bus.h diff --git a/drivers/gpu/Makefile b/drivers/gpu/Makefile index 835c88318cec..8997f0096545 100644 --- a/drivers/gpu/Makefile +++ b/drivers/gpu/Makefile @@ -2,7 +2,6 @@ # drm/tegra depends on host1x, so if both drivers are built-in care must be # taken to initialize them in the correct order. Link order is the only way # to ensure this currently. -obj-$(CONFIG_TEGRA_HOST1X) += host1x/ -obj-y += drm/ vga/ +obj-y += host1x/ drm/ vga/ obj-$(CONFIG_IMX_IPUV3_CORE) += ipu-v3/ obj-$(CONFIG_TRACE_GPU_MEM)+= trace/ diff --git a/drivers/gpu/host1x/Kconfig b/drivers/gpu/host1x/Kconfig index 6815b4db17c1..1861a8180d3f 100644 --- a/drivers/gpu/host1x/Kconfig +++ b/drivers/gpu/host1x/Kconfig @@ -1,8 +1,13 @@ # SPDX-License-Identifier: GPL-2.0-only + +config TEGRA_HOST1X_CONTEXT_BUS + bool + config TEGRA_HOST1X tristate "NVIDIA Tegra host1x driver" depends on ARCH_TEGRA || (ARM && COMPILE_TEST) select DMA_SHARED_BUFFER + select TEGRA_HOST1X_CONTEXT_BUS select IOMMU_IOVA help Driver for the NVIDIA Tegra host1x hardware. 
diff --git a/drivers/gpu/host1x/Makefile b/drivers/gpu/host1x/Makefile index d2b6f7de0498..c891a3e33844 100644 --- a/drivers/gpu/host1x/Makefile +++ b/drivers/gpu/host1x/Makefile @@ -18,3 +18,4 @@ host1x-y = \ hw/host1x07.o obj-$(CONFIG_TEGRA_HOST1X) += host1x.o +obj-$(CONFIG_TEGRA_HOST1X_CONTEXT_BUS) += context_bus.o diff --git a/drivers/gpu/host1x/context_bus.c b/drivers/gpu/host1x/context_bus.c new file mode 100644 index ..b0d35b2bbe89 --- /dev/null +++ b/drivers/gpu/host1x/context_bus.c @@ -0,0 +1,31 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2021, NVIDIA Corporation. + */ + +#include +#include + +struct bus_type host1x_context_device_bus_type = { + .name = "host1x-context", +}; +EXPORT_SYMBOL_GPL(host1x_context_device_bus_type); + +static int __init host1x_context_device_bus_init(void) +{ + int err; + + if (!of_machine_is_compatible("nvidia,tegra186") && + !of_machine_is_compatible("nvidia,tegra194") && + !of_machine_is_compatible("nvidia,tegra234")) + return 0; + + err = bus_register(_context_device_bus_type); + if (err < 0) { + pr_err("bus type registration failed: %d\n", err); + return err; + } + + return 0; +} +postcore_initcall(host1x_context_device_bus_init); diff --git a/include/linux/host1x_context_bus.h b/include/linux/host1x_context_bus.h new file mode 100644 index ..72462737a6db --- /dev/null +++ b/include/linux/host1x_context_bus.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Copyright (c) 2021, NVIDIA Corporation. All rights reserved. + */ + +#ifndef __LINUX_HOST1X_CONTEXT_BUS_H +#define __LINUX_HOST1X_CONTEXT_BUS_H + +#include + +#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS +extern struct bus_type host1x_context_device_bus_type; +#endif + +#endif -- 2.36.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 8/9] drm/tegra: Support context isolation
From: Mikko Perttunen For engines that support context isolation, allocate a context when opening a channel, and set up stream ID offset and context fields when submitting a job. As of this commit, the stream ID offset and fallback stream ID are not used when context isolation is disabled. However, with upcoming patches that enable a full featured job opcode sequence, these will be necessary. Signed-off-by: Mikko Perttunen --- v5: * On supporting engines, always program stream ID offset and new fallback stream ID. * Rename host1x_context to host1x_memory_context v4: * Separate error and output values in get_streamid_offset API * Improve error handling * Rename job->context to job->memory_context for clarity --- drivers/gpu/drm/tegra/drm.h| 3 +++ drivers/gpu/drm/tegra/submit.c | 48 +- drivers/gpu/drm/tegra/uapi.c | 43 -- 3 files changed, 91 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index fc0a19554eac..2acc8f2948ad 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -80,6 +80,7 @@ struct tegra_drm_context { /* Only used by new UAPI. 
*/ struct xarray mappings; + struct host1x_memory_context *memory_context; }; struct tegra_drm_client_ops { @@ -91,6 +92,8 @@ struct tegra_drm_client_ops { int (*submit)(struct tegra_drm_context *context, struct drm_tegra_submit *args, struct drm_device *drm, struct drm_file *file); + int (*get_streamid_offset)(struct tegra_drm_client *client, u32 *offset); + int (*can_use_memory_ctx)(struct tegra_drm_client *client, bool *supported); }; int tegra_drm_submit(struct tegra_drm_context *context, diff --git a/drivers/gpu/drm/tegra/submit.c b/drivers/gpu/drm/tegra/submit.c index 6d6dd8c35475..b24738bdf3df 100644 --- a/drivers/gpu/drm/tegra/submit.c +++ b/drivers/gpu/drm/tegra/submit.c @@ -498,6 +498,9 @@ static void release_job(struct host1x_job *job) struct tegra_drm_submit_data *job_data = job->user_data; u32 i; + if (job->memory_context) + host1x_memory_context_put(job->memory_context); + for (i = 0; i < job_data->num_used_mappings; i++) tegra_drm_mapping_put(job_data->used_mappings[i].mapping); @@ -588,11 +591,51 @@ int tegra_drm_ioctl_channel_submit(struct drm_device *drm, void *data, goto put_job; } + if (context->client->ops->get_streamid_offset) { + err = context->client->ops->get_streamid_offset( + context->client, >engine_streamid_offset); + if (err) { + SUBMIT_ERR(context, "failed to get streamid offset: %d", err); + goto unpin_job; + } + } + + if (context->memory_context && context->client->ops->can_use_memory_ctx) { + bool supported; + + err = context->client->ops->can_use_memory_ctx(context->client, ); + if (err) { + SUBMIT_ERR(context, "failed to detect if engine can use memory context: %d", err); + goto unpin_job; + } + + if (supported) { + job->memory_context = context->memory_context; + host1x_memory_context_get(job->memory_context); + } + } else if (context->client->ops->get_streamid_offset) { +#ifdef CONFIG_IOMMU_API + struct iommu_fwspec *spec; + + /* +* Job submission will need to temporarily change stream ID, +* so need to tell it what to change 
it back to. +*/ + spec = dev_iommu_fwspec_get(context->client->base.dev); + if (spec && spec->num_ids > 0) + job->engine_fallback_streamid = spec->ids[0] & 0x; + else + job->engine_fallback_streamid = 0x7f; +#else + job->engine_fallback_streamid = 0x7f; +#endif + } + /* Boot engine. */ err = pm_runtime_resume_and_get(context->client->base.dev); if (err < 0) { SUBMIT_ERR(context, "could not power up engine: %d", err); - goto unpin_job; + goto put_memory_context; } job->user_data = job_data; @@ -627,6 +670,9 @@ int tegra_drm_ioctl_channel_submit(struct drm_device *drm, void *data, goto put_job; +put_memory_context: + if (job->memory_context) + host1x_memory_context_put(job->memory_context); unpin_job: host1x_job_unpin(job); put_job: diff --git a/drivers/gpu/drm/tegra/uapi.c b/drivers/gpu/drm/tegra/uapi.c index 9ab9179d2026..a98239cb0e29 100644 --- a/drivers/gpu/drm/tegra/uapi.c +++ b/drivers/gpu/drm/tegra/uapi.c @@ -33,6 +33,9 @@ static void
[PATCH v5 4/9] gpu: host1x: Program context stream ID on submission
From: Mikko Perttunen Add code to do stream ID switching at the beginning of a job. The stream ID is switched to the stream ID specified by the context passed in the job structure. Before switching the stream ID, an OP_DONE wait is done on the channel's engine to ensure that there is no residual ongoing work that might do DMA using the new stream ID. Signed-off-by: Mikko Perttunen --- v5: * Add fallback stream ID. Not used yet, will be needed for full featured opcode sequence. * Rename host1x_context to host1x_memory_context v4: * Rename job->context to job->memory_context for clarity --- drivers/gpu/host1x/hw/channel_hw.c| 52 +-- drivers/gpu/host1x/hw/host1x06_hardware.h | 10 + drivers/gpu/host1x/hw/host1x07_hardware.h | 10 + include/linux/host1x.h| 8 4 files changed, 76 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c index 6b40e9af1e88..f84caf06621a 100644 --- a/drivers/gpu/host1x/hw/channel_hw.c +++ b/drivers/gpu/host1x/hw/channel_hw.c @@ -180,6 +180,45 @@ static void host1x_enable_gather_filter(struct host1x_channel *ch) #endif } +static void host1x_channel_program_engine_streamid(struct host1x_job *job) +{ +#if HOST1X_HW >= 6 + u32 fence; + + if (!job->memory_context) + return; + + fence = host1x_syncpt_incr_max(job->syncpt, 1); + + /* First, increment a syncpoint on OP_DONE condition.. */ + + host1x_cdma_push(>channel->cdma, + host1x_opcode_nonincr(HOST1X_UCLASS_INCR_SYNCPT, 1), + HOST1X_UCLASS_INCR_SYNCPT_INDX_F(job->syncpt->id) | + HOST1X_UCLASS_INCR_SYNCPT_COND_F(1)); + + /* Wait for syncpoint to increment */ + + host1x_cdma_push(>channel->cdma, + host1x_opcode_setclass(HOST1X_CLASS_HOST1X, + host1x_uclass_wait_syncpt_r(), 1), + host1x_class_host_wait_syncpt(job->syncpt->id, fence)); + + /* +* Now that we know the engine is idle, return to class and +* change stream ID. 
+*/ + + host1x_cdma_push(>channel->cdma, + host1x_opcode_setclass(job->class, 0, 0), + HOST1X_OPCODE_NOP); + + host1x_cdma_push(>channel->cdma, + host1x_opcode_setpayload(job->memory_context->stream_id), + host1x_opcode_setstreamid(job->engine_streamid_offset / 4)); +#endif +} + static int channel_submit(struct host1x_job *job) { struct host1x_channel *ch = job->channel; @@ -236,18 +275,23 @@ static int channel_submit(struct host1x_job *job) if (sp->base) synchronize_syncpt_base(job); - syncval = host1x_syncpt_incr_max(sp, user_syncpt_incrs); - host1x_hw_syncpt_assign_to_channel(host, sp, ch); - job->syncpt_end = syncval; - /* add a setclass for modules that require it */ if (job->class) host1x_cdma_push(>cdma, host1x_opcode_setclass(job->class, 0, 0), HOST1X_OPCODE_NOP); + /* +* Ensure engine DMA is idle and set new stream ID. May increment +* syncpt max. +*/ + host1x_channel_program_engine_streamid(job); + + syncval = host1x_syncpt_incr_max(sp, user_syncpt_incrs); + job->syncpt_end = syncval; + submit_gathers(job, syncval - user_syncpt_incrs); /* end CDMA submit & stash pinned hMems into sync queue */ diff --git a/drivers/gpu/host1x/hw/host1x06_hardware.h b/drivers/gpu/host1x/hw/host1x06_hardware.h index 01a142a09800..5d515745eee7 100644 --- a/drivers/gpu/host1x/hw/host1x06_hardware.h +++ b/drivers/gpu/host1x/hw/host1x06_hardware.h @@ -127,6 +127,16 @@ static inline u32 host1x_opcode_gather_incr(unsigned offset, unsigned count) return (6 << 28) | (offset << 16) | BIT(15) | BIT(14) | count; } +static inline u32 host1x_opcode_setstreamid(unsigned streamid) +{ + return (7 << 28) | streamid; +} + +static inline u32 host1x_opcode_setpayload(unsigned payload) +{ + return (9 << 28) | payload; +} + static inline u32 host1x_opcode_gather_wide(unsigned count) { return (12 << 28) | count; diff --git a/drivers/gpu/host1x/hw/host1x07_hardware.h b/drivers/gpu/host1x/hw/host1x07_hardware.h index e6582172ebfd..82c0cc9bb0b5 100644 --- a/drivers/gpu/host1x/hw/host1x07_hardware.h 
+++ b/drivers/gpu/host1x/hw/host1x07_hardware.h @@ -127,6 +127,16 @@ static inline u32 host1x_opcode_gather_incr(unsigned offset, unsigned count) return (6 << 28) | (offset << 16) | BIT(15) | BIT(14) | count; } +static inline u32 host1x_opcode_setstreamid(unsigned streamid) +{ + return (7 << 28) | streamid; +} + +static inline u32 host1x_opcode_setpayload(unsigned payload) +{ + return (9 << 28) | payload; +} + static inline u32 host1x_opcode_gather_wide(unsigned count) { return (12 << 28)
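
As a rough illustration of the last push in host1x_channel_program_engine_streamid(), the two opcode words can be computed in standalone C. This is a sketch, not the driver code: the helper names below are hypothetical, the opcode numbers 7 and 9 are taken from the hunks above, and unsigned constants are used so that the shift by 28 is well defined outside the kernel.

```c
#include <stdint.h>

/* SETSTREAMID: opcode 7 in the top nibble, register offset in words below it. */
uint32_t opcode_setstreamid(uint32_t offset_words)
{
	return (UINT32_C(7) << 28) | offset_words;
}

/* SETPAYLOAD: opcode 9 in the top nibble, the stream ID to program below it. */
uint32_t opcode_setpayload(uint32_t payload)
{
	return (UINT32_C(9) << 28) | payload;
}

/*
 * Fill the two command words in the order the patch pushes them: payload
 * first, then the SETSTREAMID opcode. The byte offset of the engine's
 * stream ID register is divided by 4, as in the patch.
 */
void encode_streamid_switch(uint32_t stream_id, uint32_t reg_offset_bytes,
			    uint32_t words[2])
{
	words[0] = opcode_setpayload(stream_id);
	words[1] = opcode_setstreamid(reg_offset_bytes / 4);
}
```

With the 0x30 offset that patch 9/9 reports for THI-based engines, the second word carries the value 0x30 / 4 = 12 in its low bits.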
[PATCH v5 7/9] drm/tegra: falcon: Set DMACTX field on DMA transactions
From: Mikko Perttunen

The DMACTX field determines which context, as specified in the TRANSCFG register, is used. While during boot it doesn't matter which is used, later on it matters and this value is reused by the firmware.

Signed-off-by: Mikko Perttunen
---
 drivers/gpu/drm/tegra/falcon.c | 8
 drivers/gpu/drm/tegra/falcon.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/tegra/falcon.c b/drivers/gpu/drm/tegra/falcon.c
index 3762d87759d9..c0d85463eb1a 100644
--- a/drivers/gpu/drm/tegra/falcon.c
+++ b/drivers/gpu/drm/tegra/falcon.c
@@ -48,6 +48,14 @@ static int falcon_copy_chunk(struct falcon *falcon,
 	if (target == FALCON_MEMORY_IMEM)
 		cmd |= FALCON_DMATRFCMD_IMEM;

+	/*
+	 * Use second DMA context (i.e. the one for firmware). Strictly
+	 * speaking, at this point both DMA contexts point to the firmware
+	 * stream ID, but this register's value will be reused by the firmware
+	 * for later DMA transactions, so we need to use the correct value.
+	 */
+	cmd |= FALCON_DMATRFCMD_DMACTX(1);
+
 	falcon_writel(falcon, offset, FALCON_DMATRFMOFFS);
 	falcon_writel(falcon, base, FALCON_DMATRFFBOFFS);
 	falcon_writel(falcon, cmd, FALCON_DMATRFCMD);
diff --git a/drivers/gpu/drm/tegra/falcon.h b/drivers/gpu/drm/tegra/falcon.h
index c56ee32d92ee..1955cf11a8a6 100644
--- a/drivers/gpu/drm/tegra/falcon.h
+++ b/drivers/gpu/drm/tegra/falcon.h
@@ -50,6 +50,7 @@
 #define FALCON_DMATRFCMD_IDLE			(1 << 1)
 #define FALCON_DMATRFCMD_IMEM			(1 << 4)
 #define FALCON_DMATRFCMD_SIZE_256B		(6 << 8)
+#define FALCON_DMATRFCMD_DMACTX(v)		(((v) & 0x7) << 12)

 #define FALCON_DMATRFFBOFFS			0x111c
-- 
2.36.1
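
To make the bit layout concrete, here is a small standalone sketch of how the command word might be assembled with the new field. The macro values are copied from the falcon.h hunk; the helper falcon_dma_cmd() is hypothetical, and starting from FALCON_DMATRFCMD_SIZE_256B is an assumption, since the hunk does not show how cmd is initialised.

```c
#include <stdint.h>

/* Field definitions as they appear in the falcon.h hunk. */
#define FALCON_DMATRFCMD_IMEM       (1 << 4)
#define FALCON_DMATRFCMD_SIZE_256B  (6 << 8)
#define FALCON_DMATRFCMD_DMACTX(v)  (((v) & 0x7) << 12)

/*
 * Build the DMATRFCMD value for one chunk. The transfer size is an
 * assumption of this sketch; DMA context 1 is the firmware context
 * selected by the patch.
 */
uint32_t falcon_dma_cmd(int target_is_imem)
{
	uint32_t cmd = FALCON_DMATRFCMD_SIZE_256B;

	if (target_is_imem)
		cmd |= FALCON_DMATRFCMD_IMEM;

	cmd |= FALCON_DMATRFCMD_DMACTX(1);

	return cmd;
}
```

The field is three bits wide starting at bit 12, so any value passed to the macro is masked to the range 0..7.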
[PATCH v5 6/9] arm64: tegra: Add Host1x context stream IDs on Tegra186+
From: Mikko Perttunen Add Host1x context stream IDs on systems that support Host1x context isolation. Host1x and attached engines can use these stream IDs to allow isolation between memory used by different processes. The specified stream IDs must match those configured by the hypervisor, if one is present. Signed-off-by: Mikko Perttunen --- v2: * Added context devices on T194. * Use iommu-map instead of custom property. v4: * Remove memory-contexts subnode. --- arch/arm64/boot/dts/nvidia/tegra186.dtsi | 11 +++ arch/arm64/boot/dts/nvidia/tegra194.dtsi | 11 +++ 2 files changed, 22 insertions(+) diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi index 0e9afc3e2f26..5f560f13ed93 100644 --- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi @@ -1461,6 +1461,17 @@ host1x@13e0 { iommus = < TEGRA186_SID_HOST1X>; + /* Context isolation domains */ + iommu-map = < + 0 TEGRA186_SID_HOST1X_CTX0 1 + 1 TEGRA186_SID_HOST1X_CTX1 1 + 2 TEGRA186_SID_HOST1X_CTX2 1 + 3 TEGRA186_SID_HOST1X_CTX3 1 + 4 TEGRA186_SID_HOST1X_CTX4 1 + 5 TEGRA186_SID_HOST1X_CTX5 1 + 6 TEGRA186_SID_HOST1X_CTX6 1 + 7 TEGRA186_SID_HOST1X_CTX7 1>; + dpaux1: dpaux@1504 { compatible = "nvidia,tegra186-dpaux"; reg = <0x1504 0x1>; diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi index d1f8248c00f4..613fd71dec25 100644 --- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi @@ -1769,6 +1769,17 @@ host1x@13e0 { interconnect-names = "dma-mem"; iommus = < TEGRA194_SID_HOST1X>; + /* Context isolation domains */ + iommu-map = < + 0 TEGRA194_SID_HOST1X_CTX0 1 + 1 TEGRA194_SID_HOST1X_CTX1 1 + 2 TEGRA194_SID_HOST1X_CTX2 1 + 3 TEGRA194_SID_HOST1X_CTX3 1 + 4 TEGRA194_SID_HOST1X_CTX4 1 + 5 TEGRA194_SID_HOST1X_CTX5 1 + 6 TEGRA194_SID_HOST1X_CTX6 1 + 7 TEGRA194_SID_HOST1X_CTX7 1>; + nvdec@1514 { compatible = "nvidia,tegra194-nvdec"; reg = <0x1514 0x0004>; -- 
2.36.1
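
The iommu-map entries above can be read as a lookup table from context index to stream ID. The sketch below assumes the standard four-cell iommu-map layout (ID base, IOMMU phandle, output base, length); the phandle cell is not visible in the flattened patch text, but patch 3/9 divides the cell count by 4. The function name and the stream ID values used with it are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One iommu-map entry is four cells: <id-base iommu-phandle out-base length>. */
#define IOMMU_MAP_CELLS 4

/*
 * Translate a context index into a stream ID using a flattened iommu-map
 * cell array. Returns 0 on success, -1 if no entry covers the index.
 * The IOMMU phandle cell (cells[i + 1]) is ignored in this sketch.
 */
int iommu_map_lookup(const uint32_t *cells, size_t ncells,
		     uint32_t id, uint32_t *stream_id)
{
	size_t i;

	for (i = 0; i + IOMMU_MAP_CELLS <= ncells; i += IOMMU_MAP_CELLS) {
		uint32_t id_base = cells[i];
		uint32_t out_base = cells[i + 2];
		uint32_t length = cells[i + 3];

		if (id >= id_base && id - id_base < length) {
			*stream_id = out_base + (id - id_base);
			return 0;
		}
	}

	return -1;
}
```

With a length of 1 per entry, as in the DT hunks, each context index maps to exactly one stream ID, and the number of usable contexts is the cell count divided by four.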
[PATCH v5 1/9] dt-bindings: host1x: Add iommu-map property
From: Mikko Perttunen

Add schema information for specifying context stream IDs. This uses the standard iommu-map property.

Signed-off-by: Mikko Perttunen
Reviewed-by: Robin Murphy
---
v3:
* New patch
v4:
* Remove memory-contexts subnode.
---
 .../bindings/display/tegra/nvidia,tegra20-host1x.yaml | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml
index 4fd513efb0f7..0adeb03b9e3a 100644
--- a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml
+++ b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml
@@ -144,6 +144,11 @@ allOf:
         reset-names:
           maxItems: 1

+        iommu-map:
+          description: Specification of stream IDs available for memory context device
+            use. Should be a mapping of IDs 0..n to IOMMU entries corresponding to
+            usable stream IDs.
+
       required:
         - reg-names

-- 
2.36.1
[PATCH v5 9/9] drm/tegra: Implement stream ID related callbacks on engines
From: Mikko Perttunen Implement the get_streamid_offset and can_use_memory_ctx callbacks required for supporting context isolation. Since old firmware on VIC cannot support context isolation without hacks that we don't want to implement, check the firmware binary to see if context isolation should be enabled. Signed-off-by: Mikko Perttunen --- v5: * Split into two callbacks * Add NVDEC support v4: * Add locking in vic_load_firmware * Return -EOPNOTSUPP if context isolation is not available * Update for changed get_streamid_offset declaration * Add comment noting that vic_load_firmware is safe to call without the hardware being powered on Implement context isolation related callbacks in VIC, NVDEC --- drivers/gpu/drm/tegra/drm.h | 8 + drivers/gpu/drm/tegra/nvdec.c | 9 + drivers/gpu/drm/tegra/vic.c | 67 ++- 3 files changed, 76 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index 2acc8f2948ad..845e60f144c7 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -100,6 +100,14 @@ int tegra_drm_submit(struct tegra_drm_context *context, struct drm_tegra_submit *args, struct drm_device *drm, struct drm_file *file); +static inline int +tegra_drm_get_streamid_offset_thi(struct tegra_drm_client *client, u32 *offset) +{ + *offset = 0x30; + + return 0; +} + struct tegra_drm_client { struct host1x_client base; struct list_head list; diff --git a/drivers/gpu/drm/tegra/nvdec.c b/drivers/gpu/drm/tegra/nvdec.c index 79e1e88203cf..f1210cfb3708 100644 --- a/drivers/gpu/drm/tegra/nvdec.c +++ b/drivers/gpu/drm/tegra/nvdec.c @@ -304,10 +304,19 @@ static void nvdec_close_channel(struct tegra_drm_context *context) host1x_channel_put(context->channel); } +static int nvdec_can_use_memory_ctx(struct tegra_drm_client *client, bool *supported) +{ + *supported = true; + + return 0; +} + static const struct tegra_drm_client_ops nvdec_ops = { .open_channel = nvdec_open_channel, .close_channel = nvdec_close_channel, 
.submit = tegra_drm_submit, + .get_streamid_offset = tegra_drm_get_streamid_offset_thi, + .can_use_memory_ctx = nvdec_can_use_memory_ctx, }; #define NVIDIA_TEGRA_210_NVDEC_FIRMWARE "nvidia/tegra210/nvdec.bin" diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c index 1e342fa3d27b..2c0d554bd13c 100644 --- a/drivers/gpu/drm/tegra/vic.c +++ b/drivers/gpu/drm/tegra/vic.c @@ -38,6 +38,8 @@ struct vic { struct clk *clk; struct reset_control *rst; + bool can_use_context; + /* Platform configuration */ const struct vic_config *config; }; @@ -229,28 +231,38 @@ static int vic_load_firmware(struct vic *vic) { struct host1x_client *client = >client.base; struct tegra_drm *tegra = vic->client.drm; + static DEFINE_MUTEX(lock); + u32 fce_bin_data_offset; dma_addr_t iova; size_t size; void *virt; int err; - if (vic->falcon.firmware.virt) - return 0; + mutex_lock(); + + if (vic->falcon.firmware.virt) { + err = 0; + goto unlock; + } err = falcon_read_firmware(>falcon, vic->config->firmware); if (err < 0) - return err; + goto unlock; size = vic->falcon.firmware.size; if (!client->group) { virt = dma_alloc_coherent(vic->dev, size, , GFP_KERNEL); - if (!virt) - return -ENOMEM; + if (!virt) { + err = -ENOMEM; + goto unlock; + } } else { virt = tegra_drm_alloc(tegra, size, ); - if (IS_ERR(virt)) - return PTR_ERR(virt); + if (IS_ERR(virt)) { + err = PTR_ERR(virt); + goto unlock; + } } vic->falcon.firmware.virt = virt; @@ -277,7 +289,28 @@ static int vic_load_firmware(struct vic *vic) vic->falcon.firmware.phys = phys; } - return 0; + /* +* Check if firmware is new enough to not require mapping firmware +* to data buffer domains. +*/ + fce_bin_data_offset = *(u32 *)(virt + VIC_UCODE_FCE_DATA_OFFSET); + + if (!vic->config->supports_sid) { + vic->can_use_context = false; + } else if (fce_bin_data_offset != 0x0 && fce_bin_data_offset != 0xa5a5a5a5) { + /* +* Firmware will access FCE through STREAMID0, so context +* isolation cannot be used. 
+*/ + vic->can_use_context = false; + dev_warn_once(vic->dev, "context isolation disabled due to old firmware\n"); + } else { +
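
The firmware check in vic_load_firmware() reduces to a small predicate, sketched here as standalone C. The function name is hypothetical, and the final return of true is an assumption: the else branch is cut off in the text above, and enabling context isolation there is the natural reading of the visible branches.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether VIC may use an isolated memory context.
 * supports_sid mirrors vic->config->supports_sid; fce_bin_data_offset is
 * the word read from the firmware image at VIC_UCODE_FCE_DATA_OFFSET.
 */
bool vic_firmware_allows_context(bool supports_sid, uint32_t fce_bin_data_offset)
{
	/* Hardware without stream ID support can never use a context. */
	if (!supports_sid)
		return false;

	/*
	 * A real FCE data offset (neither 0 nor the 0xa5a5a5a5 filler) means
	 * old firmware that accesses FCE through STREAMID0.
	 */
	if (fce_bin_data_offset != 0x0 && fce_bin_data_offset != 0xa5a5a5a5)
		return false;

	/* Assumption: the truncated else branch enables context isolation. */
	return true;
}
```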
[PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus
From: Mikko Perttunen Set itself as the IOMMU for the host1x context device bus, containing "dummy" devices used for Host1x context isolation. Signed-off-by: Mikko Perttunen --- drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 568cce590ccc..9ff54eaecf81 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -39,6 +39,7 @@ #include #include +#include #include "arm-smmu.h" @@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct iommu_ops *ops) goto err_reset_pci_ops; } #endif +#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS + if (!iommu_present(_context_device_bus_type)) { + err = bus_set_iommu(_context_device_bus_type, ops); + if (err) + goto err_reset_fsl_mc_ops; + } +#endif + return 0; +err_reset_fsl_mc_ops: __maybe_unused; +#ifdef CONFIG_FSL_MC_BUS + bus_set_iommu(_mc_bus_type, NULL); +#endif err_reset_pci_ops: __maybe_unused; #ifdef CONFIG_PCI bus_set_iommu(_bus_type, NULL); -- 2.36.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH V2 1/2] swiotlb: Add Child IO TLB mem support
I don't really understand how 'childs' fit in here. The code also doesn't seem to be usable without patch 2 and a caller of the new functions added in patch 2, so it is rather impossible to review.

Also:

 1) why is SEV/TDX so different from other cases that need bounce buffering that it has to be treated differently, and why can't we work on a general scalability improvement?
 2) per previous discussions of how swiotlb itself works, it is clear that another option is to just make the pages we DMA to shared with the hypervisor. Why don't we try that at least for larger I/O?
Re: [PATCH 4/7] drm/i915: Remove unnecessary include
On Sat, 14 May 2022, Lu Baolu wrote:
> intel-iommu.h is not needed in drm/i915 anymore. Remove its include.

Thanks for the cleanups.

Do you want to keep the patches together or want us to pick this up via drm-intel? If you want to keep the patches together,

Acked-by: Jani Nikula

for merging via whichever tree suits you best. Just let us know.

BR,
Jani.

>
> Signed-off-by: Lu Baolu
> ---
>  drivers/gpu/drm/i915/i915_drv.h                | 1 -
>  drivers/gpu/drm/i915/display/intel_display.c   | 1 -
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 1 -
>  3 files changed, 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index fa14da84362e..f2a6982c3bef 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -36,7 +36,6 @@
>
>  #include
>  #include
> -#include
>  #include
>
>  #include
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 7dfeb458aa65..686ddbeebadc 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -27,7 +27,6 @@
>  #include
>  #include
>  #include
> -#include
>  #include
>  #include
>  #include
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index d42f437149c9..c9823528ea94 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -4,7 +4,6 @@
>   * Copyright © 2008,2010 Intel Corporation
>   */
>
> -#include
>  #include
>  #include
>  #include

-- 
Jani Nikula, Intel Open Source Graphics Center
Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops
On Mon, May 16, 2022 at 09:57:56AM +0800, Lu Baolu wrote:
> Each IOMMU driver must provide a blocking domain ops. If the hardware
> supports detaching domain from device, setting blocking domain equals
> detaching the existing domain from the device. Otherwise, an UNMANAGED
> domain without any mapping will be used instead.

blocking in this case means not allowing any access? The naming sounds a bit odd to me as blocking in the kernel has a specific meaning. Maybe something like noaccess ops might be a better name?
Re: [PATCH] vfio: Remove VFIO_TYPE1_NESTING_IOMMU
Looks good,

Reviewed-by: Christoph Hellwig

we really should not keep dead code like this around.