Re: [PATCH v6 29/29] x86/tsc: Switch to perf-based hardlockup detector if TSC become unstable

2022-05-16 Thread Ricardo Neri
On Tue, May 10, 2022 at 10:14:00PM +1000, Nicholas Piggin wrote:
> Excerpts from Ricardo Neri's message of May 6, 2022 10:00 am:
> > The HPET-based hardlockup detector relies on the TSC to determine if an
> > observed NMI interrupt was originated by HPET timer. Hence, this detector
> > can no longer be used with an unstable TSC.
> > 
> > In such case, permanently stop the HPET-based hardlockup detector and
> > start the perf-based detector.
> > 
> > Cc: Andi Kleen 
> > Cc: Stephane Eranian 
> > Cc: "Ravi V. Shankar" 
> > Cc: iommu@lists.linux-foundation.org
> > Cc: linuxppc-...@lists.ozlabs.org
> > Cc: x...@kernel.org
> > Suggested-by: Thomas Gleixner 
> > Reviewed-by: Tony Luck 
> > Signed-off-by: Ricardo Neri 
> > ---
> > Changes since v5:
> >  * Relocated the declaration of hardlockup_detector_switch_to_perf() to
> >    x86/nmi.h. It does not depend on HPET.
> >  * Removed function stub. The shim hardlockup detector is always for x86.
> > 
> > Changes since v4:
> >  * Added a stub version of hardlockup_detector_switch_to_perf() for
> >    !CONFIG_HPET_TIMER. (lkp)
> >  * Reconfigure the whole lockup detector instead of unconditionally
> >    starting the perf-based hardlockup detector.
> > 
> > Changes since v3:
> >  * None
> > 
> > Changes since v2:
> >  * Introduced this patch.
> > 
> > Changes since v1:
> >  * N/A
> > ---
> >  arch/x86/include/asm/nmi.h | 6 ++
> >  arch/x86/kernel/tsc.c  | 2 ++
> >  arch/x86/kernel/watchdog_hld.c | 6 ++
> >  3 files changed, 14 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
> > index 4a0d5b562c91..47752ff67d8b 100644
> > --- a/arch/x86/include/asm/nmi.h
> > +++ b/arch/x86/include/asm/nmi.h
> > @@ -63,4 +63,10 @@ void stop_nmi(void);
> >  void restart_nmi(void);
> >  void local_touch_nmi(void);
> >  
> > +#ifdef CONFIG_X86_HARDLOCKUP_DETECTOR
> > +void hardlockup_detector_switch_to_perf(void);
> > +#else
> > +static inline void hardlockup_detector_switch_to_perf(void) { }
> > +#endif
> > +
> >  #endif /* _ASM_X86_NMI_H */
> > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > index cc1843044d88..74772ffc79d1 100644
> > --- a/arch/x86/kernel/tsc.c
> > +++ b/arch/x86/kernel/tsc.c
> > @@ -1176,6 +1176,8 @@ void mark_tsc_unstable(char *reason)
> >  
> > clocksource_mark_unstable(&clocksource_tsc_early);
> > clocksource_mark_unstable(&clocksource_tsc);
> > +
> > +   hardlockup_detector_switch_to_perf();
> >  }
> >  
> >  EXPORT_SYMBOL_GPL(mark_tsc_unstable);
> > diff --git a/arch/x86/kernel/watchdog_hld.c b/arch/x86/kernel/watchdog_hld.c
> > index ef11f0af4ef5..7940977c6312 100644
> > --- a/arch/x86/kernel/watchdog_hld.c
> > +++ b/arch/x86/kernel/watchdog_hld.c
> > @@ -83,3 +83,9 @@ void watchdog_nmi_start(void)
> > if (detector_type == X86_HARDLOCKUP_DETECTOR_HPET)
> > hardlockup_detector_hpet_start();
> >  }
> > +
> > +void hardlockup_detector_switch_to_perf(void)
> > +{
> > +   detector_type = X86_HARDLOCKUP_DETECTOR_PERF;
> 
> Another possible problem along the same lines here,
> isn't your watchdog still running at this point? And
> it uses detector_type in the switch.
> 
> > +   lockup_detector_reconfigure();
> 
> Actually the detector_type switch is used in some
> functions called by lockup_detector_reconfigure()
> e.g., watchdog_nmi_stop, so this seems buggy even
> without concurrent watchdog.

Yes, this is true. I missed this race.

> 
> Is this switching a good idea in general? The admin
> has asked for non-standard option because they want
> more PMU counters available and now it eats a
> counter potentially causing a problem rather than
> detecting one.

Agreed. A very valid point.
> 
> I would rather just disable with a warning if it were
> up to me. If you *really* wanted to be fancy then
> allow admin to re-enable via proc maybe.

I think that in either case, /proc/sys/kernel/nmi_watchdog
needs to be updated to reflect that the NMI watchdog has
been disabled. That would require exposing other interfaces
of the watchdog.

Thanks and BR,
Ricardo
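
The ordering problem raised above (the stop path selects the detector by reading detector_type, so changing the type first stops the wrong one) can be modelled outside the kernel. This is a minimal sketch, not the kernel implementation: struct model_watchdog and the model_* helpers are hypothetical stand-ins, and the only assumption taken from the thread is that both the stop and start paths dispatch on detector_type. It does not model the concurrent-NMI race, only the sequential ordering.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; names echo the thread but this is not kernel code. */
enum model_detector { MODEL_HPET, MODEL_PERF };

struct model_watchdog {
	enum model_detector detector_type;
	bool running[2];	/* indexed by enum model_detector */
};

/* Stop whichever detector detector_type currently names. */
static void model_nmi_stop(struct model_watchdog *w)
{
	assert(w);
	w->running[w->detector_type] = false;
}

/* Start whichever detector detector_type currently names. */
static void model_nmi_start(struct model_watchdog *w)
{
	assert(w);
	w->running[w->detector_type] = true;
}

/* Order used in the quoted hunk: the type changes before the stop path runs. */
static void model_switch_type_first(struct model_watchdog *w)
{
	w->detector_type = MODEL_PERF;
	model_nmi_stop(w);	/* "stops" perf, which was never running */
	model_nmi_start(w);	/* starts perf; HPET is left running */
}

/* Alternative order: stop under the old type, then change it. */
static void model_switch_stop_first(struct model_watchdog *w)
{
	model_nmi_stop(w);	/* stops HPET */
	w->detector_type = MODEL_PERF;
	model_nmi_start(w);	/* starts perf */
}
```

Starting from a state where only the HPET detector runs, model_switch_type_first() ends with both detectors marked running, while model_switch_stop_first() ends with only the perf detector running.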
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops

2022-05-16 Thread Baolu Lu

Hi Jason,

On 2022/5/16 21:57, Jason Gunthorpe wrote:

On Mon, May 16, 2022 at 12:22:08PM +0100, Robin Murphy wrote:

On 2022-05-16 02:57, Lu Baolu wrote:

Each IOMMU driver must provide blocking domain ops. If the hardware
supports detaching a domain from a device, setting the blocking domain
equals detaching the existing domain from the device. Otherwise, an
UNMANAGED domain without any mapping will be used instead.

Unfortunately that's backwards - most of the implementations of .detach_dev
are disabling translation entirely, meaning the device ends up effectively
in passthrough rather than blocked.

Ideally we'd convert the detach_dev of every driver into either
a blocking or identity domain. The trick is knowing which is which..


I am still a bit puzzled about how the blocking_domain should be used 
when it is extended to support ->set_dev_pasid.


If it's a blocking domain, the IOMMU driver knows that setting the
blocking domain to device pasid means detaching the existing one.

But if it's an identity domain, how could the IOMMU driver choose
between:

 - setting the input domain to the pasid on device; or,
 - detaching the existing domain.

I have thought about the following solutions:

- Checking the domain types and dispatching them to different
  operations.
- Using different blocking domains for different types of domains.

But both look rough.



Guessing going down the list:
  apple dart - blocking, detach_dev calls apple_dart_hw_disable_dma() same as
   IOMMU_DOMAIN_BLOCKED
  [I wonder if this driver is wrong in other ways though because
I don't see a remove_streams in attach_dev]
  exynos - this seems to disable the 'sysmmu' so I'm guessing this is
   identity
  ipmmu-vmsa - Comment says 'disable mmu translation' so I'm guessing
   this is identity
  mtk_v1 - Code looks similar to mtk, which is probably identity.
  rkt - No idea
  sprd - No idea
  sun50i - This driver confusingly treats identity the same as
   unmanaged, seems wrong, no idea.
  amd - Not sure, clear_dte_entry() seems to set translation on but points
the PTE to 0 ? Based on the spec table 8 I would have expected
TV to be clear which would be blocking. Maybe a bug??
  arm smmu qcomm - not sure
  intel - blocking

These don't support default domains, so detach_dev should return
back to DMA API ownership, which is either identity or something weird:
  fsl_pamu - identity due to the PPC use of dma direct
  msm
  mtk
  omap
  s390 - platform DMA ops
  tegra-gart - Usually something called a GART would be 0 length once
   disabled, guessing blocking?
  tegra-smmu

So, the approach here should be to go driver by driver and convert
detach_dev to either identity, blocking or just delete it entirely,
excluding the above 7 that don't support default domains. And get acks
from the driver owners.



Agreed. There seems to be a long way to go. I am wondering if we could
decouple this refactoring from my new SVA API work? We can easily switch
.detach_dev_pasid to using blocking domain later.

Best regards,
baolu


Re: [PATCH 6/7] x86/boot/tboot: Move tboot_force_iommu() to Intel IOMMU

2022-05-16 Thread Baolu Lu

Hi Jason,

On 2022/5/17 02:06, Jason Gunthorpe wrote:

+static __init int tboot_force_iommu(void)
+{
+   if (!tboot_enabled())
+   return 0;
+
+   if (no_iommu || dmar_disabled)
+   pr_warn("Forcing Intel-IOMMU to enabled\n");

Unrelated, but when we are in the special secure IOMMU modes, do we
force ATS off? Specifically does the IOMMU reject TLPs that are marked
as translated?


Good question. From the IOMMU point of view, I don't see a reason to
force ATS off, but trusted boot involves lots of other things that I am
not familiar with. Could anybody else help to answer?

Best regards,
baolu


Re: [PATCH v8 5/8] perf tool: Add support for HiSilicon PCIe Tune and Trace device driver

2022-05-16 Thread liuqi (BA) via iommu




On 2022/5/16 22:20, Jonathan Cameron wrote:

On Mon, 16 May 2022 20:52:20 +0800
Yicong Yang  wrote:


From: Qi Liu 

HiSilicon PCIe tune and trace device (PTT) could dynamically tune
the PCIe link's events, and trace the TLP headers.

This patch adds support for PTT device in perf tool, so users could
use 'perf record' to get TLP headers trace data.

Signed-off-by: Qi Liu 
Signed-off-by: Yicong Yang 


One query inline.



diff --git a/tools/perf/arch/arm/util/auxtrace.c 
b/tools/perf/arch/arm/util/auxtrace.c
index 384c7cfda0fd..297fffedf45e 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c


...


  static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
   int pmu_nr, struct evsel *evsel)
  {
@@ -71,17 +120,21 @@ struct auxtrace_record
  {
struct perf_pmu *cs_etm_pmu = NULL;
struct perf_pmu **arm_spe_pmus = NULL;
+   struct perf_pmu **hisi_ptt_pmus = NULL;
struct evsel *evsel;
struct perf_pmu *found_etm = NULL;
struct perf_pmu *found_spe = NULL;
+   struct perf_pmu *found_ptt = NULL;
int auxtrace_event_cnt = 0;
int nr_spes = 0;
+   int nr_ptts = 0;
  
  	if (!evlist)

return NULL;
  
  	cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);

arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
+   hisi_ptt_pmus = find_all_hisi_ptt_pmus(&nr_ptts, err);
  
  	evlist__for_each_entry(evlist, evsel) {

if (cs_etm_pmu && !found_etm)
@@ -89,9 +142,13 @@ struct auxtrace_record
  
  		if (arm_spe_pmus && !found_spe)

found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, 
evsel);
+
+   if (arm_spe_pmus && !found_spe)


if (hisi_ptt_pmus && !found_ptt) ?

Otherwise, I'm not sure what the purpose of the checking against spe is.



yes...it's a typo here, thanks for the reminder!

Qi

+   found_ptt = find_pmu_for_event(hisi_ptt_pmus, nr_ptts, 
evsel);
}
  
  	free(arm_spe_pmus);

+   free(hisi_ptt_pmus);
  
  	if (found_etm)

auxtrace_event_cnt++;
@@ -99,6 +156,9 @@ struct auxtrace_record
if (found_spe)
auxtrace_event_cnt++;
  
+	if (found_ptt)

+   auxtrace_event_cnt++;
+
if (auxtrace_event_cnt > 1) {
pr_err("Concurrent AUX trace operation not currently 
supported\n");
*err = -EOPNOTSUPP;
@@ -111,6 +171,9 @@ struct auxtrace_record
  #if defined(__aarch64__)
if (found_spe)
return arm_spe_recording_init(err, found_spe);
+
+   if (found_ptt)
+   return hisi_ptt_recording_init(err, found_ptt);
  #endif
  

.
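
The guard questioned in the review above can be checked with a small standalone model. This is only a sketch under stated assumptions: struct toy_pmu, struct toy_evsel and the toy_* functions are hypothetical simplifications of perf's types, and toy_scan_for_ptt() reproduces just the PTT part of the loop. With guard_on_spe set it uses the condition from the posted hunk (gated on the SPE state); otherwise it uses the suggested condition (gated on the PTT state).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for perf's struct perf_pmu and struct evsel. */
struct toy_pmu { unsigned int type; };
struct toy_evsel { unsigned int type; };

/* Same shape as find_pmu_for_event() in the quoted patch. */
static struct toy_pmu *toy_find_pmu_for_event(struct toy_pmu **pmus,
					      int pmu_nr,
					      const struct toy_evsel *evsel)
{
	int i;

	if (!pmus)
		return NULL;

	for (i = 0; i < pmu_nr; i++) {
		if (evsel->type == pmus[i]->type)
			return pmus[i];
	}

	return NULL;
}

/*
 * Walk the events looking for a PTT PMU.
 * guard_on_spe != 0: lookup gated on the SPE state (the posted hunk).
 * guard_on_spe == 0: lookup gated on the PTT state (the suggested fix).
 */
static struct toy_pmu *toy_scan_for_ptt(struct toy_pmu **spe_pmus,
					struct toy_pmu **ptt_pmus, int nr_ptts,
					const struct toy_evsel *evsels,
					int nr_evsels, int guard_on_spe)
{
	struct toy_pmu *found_spe = NULL;	/* SPE lookup omitted here */
	struct toy_pmu *found_ptt = NULL;
	int i;

	assert(evsels || nr_evsels == 0);

	for (i = 0; i < nr_evsels; i++) {
		int guard = guard_on_spe ? (spe_pmus && !found_spe)
					 : (ptt_pmus && !found_ptt);

		if (guard)
			found_ptt = toy_find_pmu_for_event(ptt_pmus, nr_ptts,
							   &evsels[i]);
	}

	return found_ptt;
}
```

In this model the SPE-gated form misses the PTT PMU in two ways: when no SPE PMUs exist the lookup never runs, and when they do exist a later non-matching event overwrites an earlier match with NULL. The PTT-gated form stops looking once a match is found.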




Re: [PATCH v8 4/8] perf arm: Refactor event list iteration in auxtrace_record__init()

2022-05-16 Thread liuqi (BA) via iommu



On 2022/5/17 0:29, John Garry wrote:

On 16/05/2022 13:52, Yicong Yang wrote:

As requested before, please mention "perf tool" in the commit subject

"perf arm" was used to follow the previous commit. OK, I will mention
"perf tool" in the commit subject next time.


Thanks,
Qi



From: Qi Liu 

Use find_pmu_for_event() to simplify logic in auxtrace_record__init().

Signed-off-by: Qi Liu 
Signed-off-by: Yicong Yang 
---
  tools/perf/arch/arm/util/auxtrace.c | 53 ++---
  1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/tools/perf/arch/arm/util/auxtrace.c 
b/tools/perf/arch/arm/util/auxtrace.c

index 5fc6a2a3dbc5..384c7cfda0fd 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -50,16 +50,32 @@ static struct perf_pmu **find_all_arm_spe_pmus(int 
*nr_spes, int *err)

  return arm_spe_pmus;
  }
+static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
+   int pmu_nr, struct evsel *evsel)
+{
+    int i;
+
+    if (!pmus)
+    return NULL;
+
+    for (i = 0; i < pmu_nr; i++) {
+    if (evsel->core.attr.type == pmus[i]->type)
+    return pmus[i];
+    }
+
+    return NULL;
+}
+
  struct auxtrace_record
  *auxtrace_record__init(struct evlist *evlist, int *err)
  {
-    struct perf_pmu    *cs_etm_pmu;
+    struct perf_pmu    *cs_etm_pmu = NULL;
+    struct perf_pmu **arm_spe_pmus = NULL;
  struct evsel *evsel;
-    bool found_etm = false;
+    struct perf_pmu *found_etm = NULL;
  struct perf_pmu *found_spe = NULL;
-    struct perf_pmu **arm_spe_pmus = NULL;
+    int auxtrace_event_cnt = 0;
  int nr_spes = 0;
-    int i = 0;
  if (!evlist)
  return NULL;
@@ -68,24 +84,23 @@ struct auxtrace_record
  arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
  evlist__for_each_entry(evlist, evsel) {
-    if (cs_etm_pmu &&
-    evsel->core.attr.type == cs_etm_pmu->type)
-    found_etm = true;
-
-    if (!nr_spes || found_spe)
-    continue;
-
-    for (i = 0; i < nr_spes; i++) {
-    if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
-    found_spe = arm_spe_pmus[i];
-    break;
-    }
-    }
+    if (cs_etm_pmu && !found_etm)
+    found_etm = find_pmu_for_event(&cs_etm_pmu, 1, evsel);
+
+    if (arm_spe_pmus && !found_spe)
+    found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, 
evsel);

  }
+
  free(arm_spe_pmus);
-    if (found_etm && found_spe) {
-    pr_err("Concurrent ARM Coresight ETM and SPE operation not 
currently supported\n");

+    if (found_etm)
+    auxtrace_event_cnt++;
+
+    if (found_spe)
+    auxtrace_event_cnt++;
+
+    if (auxtrace_event_cnt > 1) {
+    pr_err("Concurrent AUX trace operation not currently 
supported\n");

  *err = -EOPNOTSUPP;
  return NULL;
  }


.


Re: [PATCH v8 4/8] perf arm: Refactor event list iteration in auxtrace_record__init()

2022-05-16 Thread liuqi (BA) via iommu



Hi Jonathan,
On 2022/5/16 22:17, Jonathan Cameron wrote:

On Mon, 16 May 2022 20:52:19 +0800
Yicong Yang  wrote:


From: Qi Liu 

Use find_pmu_for_event() to simplify logic in auxtrace_record__init().

Possibly reword as

"Add find_pmu_for_event() and use to simplify logic in
auxtrace_record_init(). find_pmu_for_event() will be
reused in subsequent patches."


Thanks, I'll modify the commit message in the next version.

Thanks,
Qi


Signed-off-by: Qi Liu 
Signed-off-by: Yicong Yang 

FWIW as this isn't an area I know much about. It seems
like a good cleanup and functionally equivalent.

Reviewed-by: Jonathan Cameron 

---
  tools/perf/arch/arm/util/auxtrace.c | 53 ++---
  1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/tools/perf/arch/arm/util/auxtrace.c 
b/tools/perf/arch/arm/util/auxtrace.c
index 5fc6a2a3dbc5..384c7cfda0fd 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -50,16 +50,32 @@ static struct perf_pmu **find_all_arm_spe_pmus(int 
*nr_spes, int *err)
return arm_spe_pmus;
  }
  
+static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,

+  int pmu_nr, struct evsel *evsel)
+{
+   int i;
+
+   if (!pmus)
+   return NULL;
+
+   for (i = 0; i < pmu_nr; i++) {
+   if (evsel->core.attr.type == pmus[i]->type)
+   return pmus[i];
+   }
+
+   return NULL;
+}
+
  struct auxtrace_record
  *auxtrace_record__init(struct evlist *evlist, int *err)
  {
-   struct perf_pmu *cs_etm_pmu;
+   struct perf_pmu *cs_etm_pmu = NULL;
+   struct perf_pmu **arm_spe_pmus = NULL;
struct evsel *evsel;
-   bool found_etm = false;
+   struct perf_pmu *found_etm = NULL;
struct perf_pmu *found_spe = NULL;
-   struct perf_pmu **arm_spe_pmus = NULL;
+   int auxtrace_event_cnt = 0;
int nr_spes = 0;
-   int i = 0;
  
  	if (!evlist)

return NULL;
@@ -68,24 +84,23 @@ struct auxtrace_record
arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
  
  	evlist__for_each_entry(evlist, evsel) {

-   if (cs_etm_pmu &&
-   evsel->core.attr.type == cs_etm_pmu->type)
-   found_etm = true;
-
-   if (!nr_spes || found_spe)
-   continue;
-
-   for (i = 0; i < nr_spes; i++) {
-   if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
-   found_spe = arm_spe_pmus[i];
-   break;
-   }
-   }
+   if (cs_etm_pmu && !found_etm)
+   found_etm = find_pmu_for_event(&cs_etm_pmu, 1, evsel);
+
+   if (arm_spe_pmus && !found_spe)
+   found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, 
evsel);
}
+
free(arm_spe_pmus);
  
-	if (found_etm && found_spe) {

-   pr_err("Concurrent ARM Coresight ETM and SPE operation not currently 
supported\n");
+   if (found_etm)
+   auxtrace_event_cnt++;
+
+   if (found_spe)
+   auxtrace_event_cnt++;
+
+   if (auxtrace_event_cnt > 1) {
+   pr_err("Concurrent AUX trace operation not currently 
supported\n");
*err = -EOPNOTSUPP;
return NULL;
}


.




Re: [PATCH v6 00/21] Userspace P2PDMA with O_DIRECT NVMe devices

2022-05-16 Thread Logan Gunthorpe



On 2022-05-16 16:31, Chaitanya Kulkarni wrote:
> Do you have any plans to re-spin this ?

I didn't get any feedback this cycle, so there haven't been any changes.
I'll probably do a rebase and resend after the merge window.

Logan


Re: [PATCH 6/7] x86/boot/tboot: Move tboot_force_iommu() to Intel IOMMU

2022-05-16 Thread Jacob Pan
Hi Jason,

On Mon, 16 May 2022 15:06:28 -0300, Jason Gunthorpe  wrote:

> Unrelated, but when we are in the special secure IOMMU modes, do we
> force ATS off? Specifically does the IOMMU reject TLPs that are marked
> as translated?
Yes, the VT-d context entry has a Device TLB Enable bit. If it is 0, it means
"Translation Requests (with or without PASID) and Translated Requests
received and processed through this scalable-mode context-entry are
blocked."

Thanks,

Jacob


Re: [PATCH v6 00/21] Userspace P2PDMA with O_DIRECT NVMe devices

2022-05-16 Thread Chaitanya Kulkarni via iommu
On 4/7/22 08:46, Logan Gunthorpe wrote:
> Hi,
> 
> This patchset continues my work to add userspace P2PDMA access using
> O_DIRECT NVMe devices. This posting contains some minor fixes and a
> rebase onto v5.18-rc1 which contains cleanup from Christoph around
> free_zone_device_page() that helps to enable this patchset. The
> previous posting was here[1].
> 
> The patchset enables userspace P2PDMA by allowing userspace to mmap()
> allocated chunks of the CMB. The resulting VMA can be passed only
> to O_DIRECT IO on NVMe backed files or block devices. A flag is added
> to GUP() in Patch <>, then Patches <> through <> wire this flag up based
> on whether the block queue indicates P2PDMA support. Patches <>
> through <> enable the CMB to be mapped into userspace by mmaping
> the nvme char device.
> 
> This is relatively straightforward, however the one significant
> problem is that, presently, pci_p2pdma_map_sg() requires a homogeneous
> SGL with all P2PDMA pages or all regular pages. Enhancing GUP to
> support enforcing this rule would require a huge hack that I don't
> expect would be all that palatable. So the first 13 patches add
> support for P2PDMA pages to dma_map_sg[table]() to the dma-direct
> and dma-iommu implementations. Thus systems without an IOMMU plus
> Intel and AMD IOMMUs are supported. (Other IOMMU implementations would
> then be unsupported, notably ARM and PowerPC but support would be added
> when they convert to dma-iommu).
> 
> dma_map_sgtable() is preferred when dealing with P2PDMA memory as it
> will return -EREMOTEIO when the DMA device cannot map specific P2PDMA
> pages based on the existing rules in calc_map_type_and_dist().
> 
> The other issue is dma_unmap_sg() needs a flag to determine whether a
> given dma_addr_t was mapped regularly or as a PCI bus address. To allow
> this, a third flag is added to the page_link field in struct
> scatterlist. This effectively means support for P2PDMA will now depend
> on CONFIG_64BIT.
> 
> Feedback welcome.
> 
> This series is based on v5.18-rc1. A git branch is available here:
> 
>https://github.com/sbates130272/linux-p2pmem/  p2pdma_user_cmb_v6
> 
> Thanks,
> 
> Logan
> 
> [1] lkml.kernel.org/r/20220128002614.6136-1-log...@deltatee.com
> 
> --


Do you have any plans to re-spin this ?

-ck
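
The quoted cover letter says a third flag is added to the page_link field of struct scatterlist and that P2PDMA support therefore depends on CONFIG_64BIT. The sketch below illustrates the general low-bit tagging idea only. It is not the kernel's code: the TOY_* names, struct toy_page and the helpers are hypothetical, and the link to CONFIG_64BIT is my reading, assuming that three spare low bits require the pointed-to object to be at least 8-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag names; only the low-bit idea comes from the thread. */
#define TOY_SG_CHAIN	((uintptr_t)0x1)
#define TOY_SG_END	((uintptr_t)0x2)
#define TOY_SG_P2PDMA	((uintptr_t)0x4)	/* the "third flag" */
#define TOY_SG_FLAGS	(TOY_SG_CHAIN | TOY_SG_END | TOY_SG_P2PDMA)

/* Stand-in for struct page, forced to 8-byte alignment. */
struct toy_page {
	_Alignas(8) unsigned char payload[8];
};

/* Pack a page pointer and flag bits into one word. */
static uintptr_t toy_link_pack(const struct toy_page *page, uintptr_t flags)
{
	uintptr_t addr = (uintptr_t)page;

	/* Three flag bits need the low three address bits to be zero. */
	assert((addr & TOY_SG_FLAGS) == 0);
	assert((flags & ~TOY_SG_FLAGS) == 0);

	return addr | flags;
}

/* Recover the page pointer by masking the flag bits off. */
static struct toy_page *toy_link_page(uintptr_t link)
{
	return (struct toy_page *)(link & ~TOY_SG_FLAGS);
}

/* Test the third flag without disturbing the pointer bits. */
static int toy_link_is_p2pdma(uintptr_t link)
{
	return (link & TOY_SG_P2PDMA) != 0;
}
```

An unmap path in this model would call toy_link_is_p2pdma() on the stored word to decide how the address was mapped, which is the role the cover letter gives to the new flag for dma_unmap_sg().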




Re: [PATCH v5 5/5] iommu/tegra-smmu: Support managed domains

2022-05-16 Thread Dmitry Osipenko
On 5/12/22 22:00, Thierry Reding wrote:
> @@ -277,7 +278,9 @@ static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
>  {
>   struct tegra_smmu_as *as;
>  
> - if (type != IOMMU_DOMAIN_UNMANAGED)
> + if (type != IOMMU_DOMAIN_UNMANAGED &&
> + type != IOMMU_DOMAIN_DMA &&
> + type != IOMMU_DOMAIN_IDENTITY)
>   return NULL;

Shouldn't at least pre-210 SoCs be guarded from IOMMU_DOMAIN_DMA? I
don't think that DRM and VDE drivers will work as-is today.

-- 
Best regards,
Dmitry


Re: [PATCH 7/7] iommu/vt-d: Move include/linux/intel_iommu.h under iommu

2022-05-16 Thread Jason Gunthorpe via iommu
On Sat, May 14, 2022 at 09:43:22AM +0800, Lu Baolu wrote:
> This header file is private to the Intel IOMMU driver. Move it to the
> driver folder.
> 
> Signed-off-by: Lu Baolu 
> ---
>  include/linux/intel-iommu.h => drivers/iommu/intel/iommu.h | 0
>  drivers/iommu/intel/trace.h| 3 ++-
>  drivers/iommu/intel/cap_audit.c| 2 +-
>  drivers/iommu/intel/debugfs.c  | 2 +-
>  drivers/iommu/intel/dmar.c | 2 +-
>  drivers/iommu/intel/iommu.c| 2 +-
>  drivers/iommu/intel/irq_remapping.c| 2 +-
>  drivers/iommu/intel/pasid.c| 2 +-
>  drivers/iommu/intel/perf.c | 2 +-
>  drivers/iommu/intel/svm.c  | 2 +-
>  MAINTAINERS| 1 -
>  11 files changed, 10 insertions(+), 10 deletions(-)
>  rename include/linux/intel-iommu.h => drivers/iommu/intel/iommu.h (100%)

Reviewed-by: Jason Gunthorpe 

Jason


Re: [PATCH 6/7] x86/boot/tboot: Move tboot_force_iommu() to Intel IOMMU

2022-05-16 Thread Jason Gunthorpe via iommu
On Sat, May 14, 2022 at 09:43:21AM +0800, Lu Baolu wrote:
> tboot_force_iommu() is only called by the Intel IOMMU driver. Move the
> helper into that driver. No functional change intended.
> 
> Signed-off-by: Lu Baolu 
> ---
>  include/linux/tboot.h   |  2 --
>  arch/x86/kernel/tboot.c | 15 ---
>  drivers/iommu/intel/iommu.c | 14 ++
>  3 files changed, 14 insertions(+), 17 deletions(-)

Reviewed-by: Jason Gunthorpe 

> +static __init int tboot_force_iommu(void)
> +{
> + if (!tboot_enabled())
> + return 0;
> +
> + if (no_iommu || dmar_disabled)
> + pr_warn("Forcing Intel-IOMMU to enabled\n");

Unrelated, but when we are in the special secure IOMMU modes, do we
force ATS off? Specifically does the IOMMU reject TLPs that are marked
as translated?

Jason


Re: [PATCH 5/7] KVM: x86: Remove unnecessary include

2022-05-16 Thread Jason Gunthorpe via iommu
On Sat, May 14, 2022 at 09:43:20AM +0800, Lu Baolu wrote:
> intel-iommu.h is not needed in kvm/x86 anymore. Remove its include.
> 
> Signed-off-by: Lu Baolu 
> ---
>  arch/x86/kvm/x86.c | 1 -
>  1 file changed, 1 deletion(-)

Reviewed-by: Jason Gunthorpe 

Jason


Re: [PATCH 4/7] drm/i915: Remove unnecessary include

2022-05-16 Thread Jason Gunthorpe via iommu
On Sat, May 14, 2022 at 09:43:19AM +0800, Lu Baolu wrote:
> intel-iommu.h is not needed in drm/i915 anymore. Remove its include.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/gpu/drm/i915/i915_drv.h| 1 -
>  drivers/gpu/drm/i915/display/intel_display.c   | 1 -
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 1 -
>  3 files changed, 3 deletions(-)

Reviewed-by: Jason Gunthorpe 

Jason


Re: [PATCH 3/7] iommu/vt-d: Remove unnecessary exported symbol

2022-05-16 Thread Jason Gunthorpe via iommu
On Sat, May 14, 2022 at 09:43:18AM +0800, Lu Baolu wrote:
> The exported symbol intel_iommu_gfx_mapped is not used anywhere in the
> tree. Remove it to avoid dead code.
> 
> Signed-off-by: Lu Baolu 
> ---
>  include/linux/intel-iommu.h | 1 -
>  drivers/iommu/intel/iommu.c | 6 --
>  2 files changed, 7 deletions(-)

Reviewed-by: Jason Gunthorpe 

Maybe could squash to the prior patch

Jason


Re: [PATCH 2/7] agp/intel: Use per device iommu check

2022-05-16 Thread Jason Gunthorpe via iommu
On Sat, May 14, 2022 at 09:43:17AM +0800, Lu Baolu wrote:
> The IOMMU subsystem has already provided an interface to query whether
> the IOMMU hardware is enabled for a specific device. This changes the
> check from Intel specific intel_iommu_gfx_mapped (globally exported by
> the Intel IOMMU driver) to probing the presence of IOMMU on a specific
> device using the generic device_iommu_mapped().
> 
> This follows commit cca084692394a ("drm/i915: Use per device iommu check")
> which converted drm/i915 driver to use device_iommu_mapped().
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/char/agp/intel-gtt.c | 17 +++--
>  1 file changed, 7 insertions(+), 10 deletions(-)

Reviewed-by: Jason Gunthorpe 

Jason


Re: [PATCH 1/7] iommu/vt-d: Move trace/events/intel_iommu.h under iommu

2022-05-16 Thread Jason Gunthorpe via iommu
On Sat, May 14, 2022 at 09:43:16AM +0800, Lu Baolu wrote:
> This header file is private to the Intel IOMMU driver. Move it to the
> driver folder.
> 
> Signed-off-by: Lu Baolu 
> ---
>  .../trace/events/intel_iommu.h => drivers/iommu/intel/trace.h | 4 
>  drivers/iommu/intel/dmar.c| 2 +-
>  drivers/iommu/intel/svm.c | 2 +-
>  drivers/iommu/intel/trace.c   | 2 +-
>  4 files changed, 7 insertions(+), 3 deletions(-)
>  rename include/trace/events/intel_iommu.h => drivers/iommu/intel/trace.h 
> (94%)

Reviewed-by: Jason Gunthorpe 

Jason


Re: [PATCH v8 4/8] perf arm: Refactor event list iteration in auxtrace_record__init()

2022-05-16 Thread John Garry via iommu

On 16/05/2022 13:52, Yicong Yang wrote:

As requested before, please mention "perf tool" in the commit subject


From: Qi Liu 

Use find_pmu_for_event() to simplify logic in auxtrace_record__init().

Signed-off-by: Qi Liu 
Signed-off-by: Yicong Yang 
---
  tools/perf/arch/arm/util/auxtrace.c | 53 ++---
  1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/tools/perf/arch/arm/util/auxtrace.c 
b/tools/perf/arch/arm/util/auxtrace.c
index 5fc6a2a3dbc5..384c7cfda0fd 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -50,16 +50,32 @@ static struct perf_pmu **find_all_arm_spe_pmus(int 
*nr_spes, int *err)
return arm_spe_pmus;
  }
  
+static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,

+  int pmu_nr, struct evsel *evsel)
+{
+   int i;
+
+   if (!pmus)
+   return NULL;
+
+   for (i = 0; i < pmu_nr; i++) {
+   if (evsel->core.attr.type == pmus[i]->type)
+   return pmus[i];
+   }
+
+   return NULL;
+}
+
  struct auxtrace_record
  *auxtrace_record__init(struct evlist *evlist, int *err)
  {
-   struct perf_pmu *cs_etm_pmu;
+   struct perf_pmu *cs_etm_pmu = NULL;
+   struct perf_pmu **arm_spe_pmus = NULL;
struct evsel *evsel;
-   bool found_etm = false;
+   struct perf_pmu *found_etm = NULL;
struct perf_pmu *found_spe = NULL;
-   struct perf_pmu **arm_spe_pmus = NULL;
+   int auxtrace_event_cnt = 0;
int nr_spes = 0;
-   int i = 0;
  
  	if (!evlist)

return NULL;
@@ -68,24 +84,23 @@ struct auxtrace_record
arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
  
  	evlist__for_each_entry(evlist, evsel) {

-   if (cs_etm_pmu &&
-   evsel->core.attr.type == cs_etm_pmu->type)
-   found_etm = true;
-
-   if (!nr_spes || found_spe)
-   continue;
-
-   for (i = 0; i < nr_spes; i++) {
-   if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
-   found_spe = arm_spe_pmus[i];
-   break;
-   }
-   }
+   if (cs_etm_pmu && !found_etm)
+   found_etm = find_pmu_for_event(&cs_etm_pmu, 1, evsel);
+
+   if (arm_spe_pmus && !found_spe)
+   found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, 
evsel);
}
+
free(arm_spe_pmus);
  
-	if (found_etm && found_spe) {

-   pr_err("Concurrent ARM Coresight ETM and SPE operation not currently 
supported\n");
+   if (found_etm)
+   auxtrace_event_cnt++;
+
+   if (found_spe)
+   auxtrace_event_cnt++;
+
+   if (auxtrace_event_cnt > 1) {
+   pr_err("Concurrent AUX trace operation not currently 
supported\n");
*err = -EOPNOTSUPP;
return NULL;
}




Re: [PATCH v5 1/9] dt-bindings: host1x: Add iommu-map property

2022-05-16 Thread Rob Herring
On Mon, 16 May 2022 11:52:50 +0300, cyn...@kapsi.fi wrote:
> From: Mikko Perttunen 
> 
> Add schema information for specifying context stream IDs. This uses
> the standard iommu-map property.
> 
> Signed-off-by: Mikko Perttunen 
> Reviewed-by: Robin Murphy 
> ---
> v3:
> * New patch
> v4:
> * Remove memory-contexts subnode.
> ---
>  .../bindings/display/tegra/nvidia,tegra20-host1x.yaml| 5 +
>  1 file changed, 5 insertions(+)
> 

Acked-by: Rob Herring 


Re: [PATCH v8 3/8] hwtracing: hisi_ptt: Add tune function support for HiSilicon PCIe Tune and Trace device

2022-05-16 Thread John Garry via iommu

On 16/05/2022 13:52, Yicong Yang wrote:

Add tune function for the HiSilicon Tune and Trace device. The interface
of tune is exposed through sysfs attributes of PTT PMU device.

Signed-off-by: Yicong Yang 
Reviewed-by: Jonathan Cameron 


Apart from a comment on preferential style:

Reviewed-by: John Garry 


---
  drivers/hwtracing/ptt/hisi_ptt.c | 157 +++
  drivers/hwtracing/ptt/hisi_ptt.h |  23 +
  2 files changed, 180 insertions(+)

diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
index ef25ce98f664..c3fdb9bfb1b4 100644
--- a/drivers/hwtracing/ptt/hisi_ptt.c
+++ b/drivers/hwtracing/ptt/hisi_ptt.c
@@ -25,6 +25,161 @@
  /* Dynamic CPU hotplug state used by PTT */
  static enum cpuhp_state hisi_ptt_pmu_online;
  
+static bool hisi_ptt_wait_tuning_finish(struct hisi_ptt *hisi_ptt)

+{
+   u32 val;
+
+   return !readl_poll_timeout(hisi_ptt->iobase + HISI_PTT_TUNING_INT_STAT,
+ val, !(val & HISI_PTT_TUNING_INT_STAT_MASK),
+ HISI_PTT_WAIT_POLL_INTERVAL_US,
+ HISI_PTT_WAIT_TUNE_TIMEOUT_US);
+}
+
+static int hisi_ptt_tune_data_get(struct hisi_ptt *hisi_ptt,
+ u32 event, u16 *data)


this only has 1x caller so may inline it


+{
+   u32 reg;
+
+   reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+   reg &= ~(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB);
+   reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB,
+ event);
+   writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+
+   /* Write all 1 to indicates it's the read process */
+   writel(~0U, hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+
+   if (!hisi_ptt_wait_tuning_finish(hisi_ptt))
+   return -ETIMEDOUT;
+
+   reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+   reg &= HISI_PTT_TUNING_DATA_VAL_MASK;
+   *data = FIELD_GET(HISI_PTT_TUNING_DATA_VAL_MASK, reg);
+
+   return 0;
+}
+
+static int hisi_ptt_tune_data_set(struct hisi_ptt *hisi_ptt,
+ u32 event, u16 data)


again only 1x caller


+{
+   u32 reg;
+
+   reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+   reg &= ~(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB);
+   reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB,
+ event);
+   writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+
+   writel(FIELD_PREP(HISI_PTT_TUNING_DATA_VAL_MASK, data),
+  hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+
+   if (!hisi_ptt_wait_tuning_finish(hisi_ptt))
+   return -ETIMEDOUT;
+
+   return 0;
+}
+
+static ssize_t hisi_ptt_tune_attr_show(struct device *dev,
+  struct device_attribute *attr,
+  char *buf)
+{
+   struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev));
+   struct dev_ext_attribute *ext_attr;
+   struct hisi_ptt_tune_desc *desc;
+   int ret;
+   u16 val;
+
+   ext_attr = container_of(attr, struct dev_ext_attribute, attr);
+   desc = ext_attr->var;
+
+   mutex_lock(&hisi_ptt->tune_lock);
+   ret = hisi_ptt_tune_data_get(hisi_ptt, desc->event_code, &val);
+   mutex_unlock(&hisi_ptt->tune_lock);
+
+   if (ret)
+   return ret;
+
+   return sysfs_emit(buf, "%u\n", val);
+}
+
+static ssize_t hisi_ptt_tune_attr_store(struct device *dev,
+   struct device_attribute *attr,
+   const char *buf, size_t count)
+{
+   struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev));
+   struct dev_ext_attribute *ext_attr;
+   struct hisi_ptt_tune_desc *desc;
+   int ret;
+   u16 val;
+
+   ext_attr = container_of(attr, struct dev_ext_attribute, attr);
+   desc = ext_attr->var;
+
+   if (kstrtou16(buf, 10, &val))
+   return -EINVAL;
+
+   mutex_lock(&hisi_ptt->tune_lock);
+   ret = hisi_ptt_tune_data_set(hisi_ptt, desc->event_code, val);
+   mutex_unlock(&hisi_ptt->tune_lock);
+
+   if (ret)
+   return ret;
+
+   return count;
+}
+
+#define HISI_PTT_TUNE_ATTR(_name, _val, _show, _store) \
+   static struct hisi_ptt_tune_desc _name##_desc = {   \
+   .name = #_name, \
+   .event_code = _val, \
+   };  \
+   static struct dev_ext_attribute hisi_ptt_##_name##_attr = { \
+   .attr   = __ATTR(_name, 0600, _show, _store),   \
+   .var= &_name##_desc,\
+   }
+
+#define HISI_PTT_TUNE_ATTR_COMMON(_name, _val) \
+   HISI_PTT_TUNE_ATTR(_name, 

Re: [PATCH v8 2/8] hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device

2022-05-16 Thread John Garry via iommu

On 16/05/2022 13:52, Yicong Yang wrote:

HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex integrated
Endpoint(RCiEP) device, providing the capability to dynamically monitor and
tune the PCIe traffic and trace the TLP headers.

Add the driver for the device to enable the trace function. Register PMU
device of PTT trace, then users can use trace through perf command. The
driver makes use of perf AUX trace function and support the following
events to configure the trace:

- filter: select Root port or Endpoint to trace
- type: select the type of traced TLP headers
- direction: select the direction of traced TLP headers
- format: select the data format of the traced TLP headers

This patch initially add a basic driver of PTT trace.


Initially add basic trace support.



Signed-off-by: Yicong Yang 


Generally this looks ok, apart from nitpicking below, so, FWIW:
Reviewed-by: John Garry 


---
  drivers/Makefile |   1 +
  drivers/hwtracing/Kconfig|   2 +
  drivers/hwtracing/ptt/Kconfig|  12 +
  drivers/hwtracing/ptt/Makefile   |   2 +
  drivers/hwtracing/ptt/hisi_ptt.c | 964 +++
  drivers/hwtracing/ptt/hisi_ptt.h | 178 ++
  6 files changed, 1159 insertions(+)
  create mode 100644 drivers/hwtracing/ptt/Kconfig
  create mode 100644 drivers/hwtracing/ptt/Makefile
  create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c
  create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h

diff --git a/drivers/Makefile b/drivers/Makefile
index 020780b6b4d2..662d50599467 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -175,6 +175,7 @@ obj-$(CONFIG_USB4)  += thunderbolt/
  obj-$(CONFIG_CORESIGHT)   += hwtracing/coresight/
  obj-y += hwtracing/intel_th/
  obj-$(CONFIG_STM) += hwtracing/stm/
+obj-$(CONFIG_HISI_PTT) += hwtracing/ptt/
  obj-$(CONFIG_ANDROID) += android/
  obj-$(CONFIG_NVMEM)   += nvmem/
  obj-$(CONFIG_FPGA)+= fpga/
diff --git a/drivers/hwtracing/Kconfig b/drivers/hwtracing/Kconfig
index 13085835a636..911ee977103c 100644
--- a/drivers/hwtracing/Kconfig
+++ b/drivers/hwtracing/Kconfig
@@ -5,4 +5,6 @@ source "drivers/hwtracing/stm/Kconfig"
  
  source "drivers/hwtracing/intel_th/Kconfig"
  
+source "drivers/hwtracing/ptt/Kconfig"

+
  endmenu
diff --git a/drivers/hwtracing/ptt/Kconfig b/drivers/hwtracing/ptt/Kconfig
new file mode 100644
index ..6d46a09ffeb9
--- /dev/null
+++ b/drivers/hwtracing/ptt/Kconfig
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config HISI_PTT
+   tristate "HiSilicon PCIe Tune and Trace Device"
+   depends on ARM64 || (COMPILE_TEST && 64BIT)
+   depends on PCI && HAS_DMA && HAS_IOMEM && PERF_EVENTS
+   help
+ HiSilicon PCIe Tune and Trace device exists as a PCIe RCiEP
+ device, and it provides support for PCIe traffic tuning and
+ tracing TLP headers to the memory.
+
+ This driver can also be built as a module. If so, the module
+ will be called hisi_ptt.
diff --git a/drivers/hwtracing/ptt/Makefile b/drivers/hwtracing/ptt/Makefile
new file mode 100644
index ..908c09a98161
--- /dev/null
+++ b/drivers/hwtracing/ptt/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_HISI_PTT) += hisi_ptt.o
diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
new file mode 100644
index ..ef25ce98f664
--- /dev/null
+++ b/drivers/hwtracing/ptt/hisi_ptt.c
@@ -0,0 +1,964 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for HiSilicon PCIe tune and trace device
+ *
+ * Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
+ * Author: Yicong Yang 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "hisi_ptt.h"
+
+/* Dynamic CPU hotplug state used by PTT */
+static enum cpuhp_state hisi_ptt_pmu_online;
+
+static u16 hisi_ptt_get_filter_val(u16 devid, bool is_port)
+{
+   if (is_port)
+   return BIT(HISI_PCIE_CORE_PORT_ID(devid & 0xff));
+
+   return devid;
+}
+
+static bool hisi_ptt_wait_trace_hw_idle(struct hisi_ptt *hisi_ptt)
+{
+   u32 val;
+
+   return !readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_STS,
+ val, val & HISI_PTT_TRACE_IDLE,
+ HISI_PTT_WAIT_POLL_INTERVAL_US,
+ HISI_PTT_WAIT_TRACE_TIMEOUT_US);
+}
+
+static void hisi_ptt_wait_dma_reset_done(struct hisi_ptt *hisi_ptt)
+{
+   u32 val;
+
+   readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_WR_STS,
+ val, !val, HISI_PTT_RESET_POLL_INTERVAL_US,
+ HISI_PTT_RESET_TIMEOUT_US);
+}
+
+static void hisi_ptt_trace_end(struct hisi_ptt *hisi_ptt)
+{
+   writel(0, hisi_ptt->iobase + 

Re: [PATCH 1/2] dt-bindings: mediatek: Add bindings for MT6795 M4U

2022-05-16 Thread Rob Herring
On Fri, 13 May 2022 17:14:10 +0200, AngeloGioacchino Del Regno wrote:
> Add bindings for the MediaTek Helio X10 (MT6795) IOMMU/M4U.
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  .../bindings/iommu/mediatek,iommu.yaml|  3 +
>  include/dt-bindings/memory/mt6795-larb-port.h | 96 +++
>  2 files changed, 99 insertions(+)
>  create mode 100644 include/dt-bindings/memory/mt6795-larb-port.h
> 

Acked-by: Rob Herring 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 6/8] perf tool: Add support for parsing HiSilicon PCIe Trace packet

2022-05-16 Thread Jonathan Cameron via iommu
On Mon, 16 May 2022 20:52:21 +0800
Yicong Yang  wrote:

> From: Qi Liu 
> 
> Add support for using 'perf report --dump-raw-trace' to parse PTT packet.
> 
> Example usage:
> 
> Output will contain raw PTT data and its textual representation, such
> as:
> 
> 0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x40  offset: 0
> ref: 0xa5d50c725  idx: 0  tid: -1  cpu: 0
> .
> . ... HISI PTT data: size 4194304 bytes
> .  : 00 00 00 00 Prefix
> .  0004: 08 20 00 60 Header DW0
> .  0008: ff 02 00 01 Header DW1
> .  000c: 20 08 00 00 Header DW2
> .  0010: 10 e7 44 ab Header DW3
> .  0014: 2a a8 1e 01 Time
> .  0020: 00 00 00 00 Prefix
> .  0024: 01 00 00 60 Header DW0
> .  0028: 0f 1e 00 01 Header DW1
> .  002c: 04 00 00 00 Header DW2
> .  0030: 40 00 81 02 Header DW3
> .  0034: ee 02 00 00 Time
> 
> 
> Signed-off-by: Qi Liu 
> Signed-off-by: Yicong Yang 

From the point of view of a reviewer who doesn't know this code well, this
all looks sensible.  One trivial comment inline.

Thanks,

Jonathan

> diff --git a/tools/perf/util/hisi-ptt.c b/tools/perf/util/hisi-ptt.c
> new file mode 100644
> index ..2afc1a663c2a
> --- /dev/null
> +
> +static void hisi_ptt_free(struct perf_session *session)
> +{
> + struct hisi_ptt *ptt = container_of(session->auxtrace, struct hisi_ptt,
> + auxtrace);
> +
> + session->auxtrace = NULL;
> + free(ptt);
> +}
> +
> +static bool hisi_ptt_evsel_is_auxtrace(struct perf_session *session,
> +struct evsel *evsel)
> +{
> + struct hisi_ptt *ptt = container_of(session->auxtrace, struct hisi_ptt, 
> auxtrace);

Check for consistent wrapping of lines like this. This doesn't match the one 
just above.





[PATCH v5 1/2] iommu/io-pgtable-arm-v7s: Add a quirk to allow pgtable PA up to 35bit

2022-05-16 Thread yf.wang--- via iommu
From: Yunfei Wang 

The kmem_cache_alloc() call for level 2 pgtable allocation may run
in atomic context, and it sometimes fails when the DMA32 zone runs out
of memory.

Since MediaTek IOMMU hardware supports at most 35-bit PAs in the pgtable,
add a quirk that allows pgtable PAs of up to 35 bits.
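
As a rough illustration of the packing this quirk relies on, the sketch
below mirrors the ARM_V7S_TTBR_35BIT_PA expression from the diff in
userspace C. The GENMASK_ULL stand-in and the function name
ttbr_35bit_pa() are assumptions made for this example only; they are not
part of the patch.

```c
#include <stdint.h>

/* Assumed userspace stand-in for the kernel's GENMASK_ULL(h, l). */
#define GENMASK_ULL(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/*
 * Same expression as ARM_V7S_TTBR_35BIT_PA in the patch, written as a
 * function (hypothetical name): clear ttbr bits[2:0], then place
 * PA bits[34:32] there.
 */
static uint32_t ttbr_35bit_pa(uint32_t ttbr, uint64_t pa)
{
	return (ttbr & (uint32_t)(~0U << 3)) |
	       (uint32_t)((pa & GENMASK_ULL(34, 32)) >> 32);
}
```

For a table at PA 0x512344000 (bits 34 and 32 set), the low three bits
of the resulting value would be 0b101.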

Signed-off-by: Ning Li 
Signed-off-by: Yunfei Wang 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 56 ++
 include/linux/io-pgtable.h | 15 +---
 2 files changed, 52 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index be066c1503d3..668500798fb9 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -149,6 +149,10 @@
 #define ARM_V7S_TTBR_IRGN_ATTR(attr)   \
((((attr) & 0x1) << 6) | (((attr) & 0x2) >> 1))
 
+/* Mediatek extend ttbr bits[2:0] for PA bits[34:32] */
+#define ARM_V7S_TTBR_35BIT_PA(ttbr, pa)                                \
+   ((ttbr & ((u32)(~0U << 3))) | ((pa & GENMASK_ULL(34, 32)) >> 32))
+
 #ifdef CONFIG_ZONE_DMA32
 #define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
 #define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA32
@@ -182,14 +186,8 @@ static bool arm_v7s_is_mtk_enabled(struct io_pgtable_cfg 
*cfg)
(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT);
 }
 
-static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
-   struct io_pgtable_cfg *cfg)
+static arm_v7s_iopte to_iopte_mtk(phys_addr_t paddr, arm_v7s_iopte pte)
 {
-   arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
-
-   if (!arm_v7s_is_mtk_enabled(cfg))
-   return pte;
-
if (paddr & BIT_ULL(32))
pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
if (paddr & BIT_ULL(33))
@@ -199,6 +197,17 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int 
lvl,
return pte;
 }
 
+static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
+   struct io_pgtable_cfg *cfg)
+{
+   arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
+
+   if (!arm_v7s_is_mtk_enabled(cfg))
+   return pte;
+
+   return to_iopte_mtk(paddr, pte);
+}
+
 static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
  struct io_pgtable_cfg *cfg)
 {
@@ -234,6 +243,7 @@ static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int 
lvl,
 static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
   struct arm_v7s_io_pgtable *data)
 {
+   gfp_t gfp_l1 = __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA;
struct io_pgtable_cfg *cfg = &data->iop.cfg;
struct device *dev = cfg->iommu_dev;
phys_addr_t phys;
@@ -241,9 +251,11 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
size_t size = ARM_V7S_TABLE_SIZE(lvl, cfg);
void *table = NULL;
 
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)
+   gfp_l1 = __GFP_ZERO;
+
if (lvl == 1)
-   table = (void *)__get_free_pages(
-   __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
+   table = (void *)__get_free_pages(gfp_l1, get_order(size));
else if (lvl == 2)
table = kmem_cache_zalloc(data->l2_tables, gfp);
 
@@ -251,7 +263,8 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
return NULL;
 
phys = virt_to_phys(table);
-   if (phys != (arm_v7s_iopte)phys) {
+   if (phys != (arm_v7s_iopte)phys &&
+   !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)) {
/* Doesn't fit in PTE */
dev_err(dev, "Page table does not fit in PTE: %pa", &phys);
goto out_free;
@@ -457,9 +470,14 @@ static arm_v7s_iopte arm_v7s_install_table(arm_v7s_iopte 
*table,
   arm_v7s_iopte curr,
   struct io_pgtable_cfg *cfg)
 {
+   phys_addr_t phys = virt_to_phys(table);
arm_v7s_iopte old, new;
 
-   new = virt_to_phys(table) | ARM_V7S_PTE_TYPE_TABLE;
+   new = phys | ARM_V7S_PTE_TYPE_TABLE;
+
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)
+   new = to_iopte_mtk(phys, new);
+
if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
new |= ARM_V7S_ATTR_NS_TABLE;
 
@@ -778,7 +796,9 @@ static phys_addr_t arm_v7s_iova_to_phys(struct 
io_pgtable_ops *ops,
 static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
void *cookie)
 {
+   slab_flags_t slab_flag = ARM_V7S_TABLE_SLAB_FLAGS;
struct arm_v7s_io_pgtable *data;
+   phys_addr_t paddr;
 
if (cfg->ias > (arm_v7s_is_mtk_enabled(cfg) ? 34 : ARM_V7S_ADDR_BITS))
return NULL;
@@ -788,7 +808,8 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
 
if (cfg->quirks 

[PATCH v5 2/2] iommu/mediatek: Allow page table PA up to 35bit

2022-05-16 Thread yf.wang--- via iommu
From: Yunfei Wang 

Add support for the IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT quirk, which allows
page table PAs of up to 35 bits rather than restricting them to ZONE_DMA32.
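
The register value computed in the diff can be modelled in plain C as
follows. This is only a sketch: the GENMASK stand-in, the helper name
pt_base_regval() and its bool parameter are assumptions for
illustration, while the two masks are copied from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 32-bit userspace stand-in for the kernel's GENMASK(h, l). */
#define GENMASK(h, l) (((~0U) << (l)) & (~0U >> (31 - (h))))

#define MMU_PT_ADDR_MASK	GENMASK(31, 7)
#define MMU_PT_ADDR_2_0_MASK	GENMASK(2, 0)

/*
 * Hypothetical helper: build the value written to REG_MMU_PT_BASE_ADDR.
 * Bits[6:3] are always dropped; bits[2:0] (PA bits[34:32]) are kept
 * only when the 35-bit PA flag is set.
 */
static uint32_t pt_base_regval(uint32_t ttbr, bool pa_35_en)
{
	uint32_t regval = ttbr & MMU_PT_ADDR_MASK;

	if (pa_35_en)
		regval |= ttbr & MMU_PT_ADDR_2_0_MASK;

	return regval;
}
```

Folding the branch into one helper like this is just one way to avoid
repeating the computation in both the attach and the resume paths.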

Signed-off-by: Ning Li 
Signed-off-by: Yunfei Wang 
---
 drivers/iommu/mtk_iommu.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 6fd75a60abd6..1b9a876ef271 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -33,6 +33,7 @@
 
 #define REG_MMU_PT_BASE_ADDR   0x000
 #define MMU_PT_ADDR_MASK   GENMASK(31, 7)
+#define MMU_PT_ADDR_2_0_MASK   GENMASK(2, 0)
 
 #define REG_MMU_INVALIDATE 0x020
 #define F_ALL_INVLD0x2
@@ -118,6 +119,7 @@
 #define WR_THROT_ENBIT(6)
 #define HAS_LEGACY_IVRP_PADDR  BIT(7)
 #define IOVA_34_EN BIT(8)
+#define PGTABLE_PA_35_EN   BIT(9)
 
 #define MTK_IOMMU_HAS_FLAG(pdata, _x) \
((((pdata)->flags) & (_x)) == (_x))
@@ -401,6 +403,9 @@ static int mtk_iommu_domain_finalise(struct 
mtk_iommu_domain *dom,
.iommu_dev = data->dev,
};
 
+   if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN))
+   dom->cfg.quirks |= IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT;
+
if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE))
dom->cfg.oas = data->enable_4GB ? 33 : 32;
else
@@ -450,6 +455,7 @@ static int mtk_iommu_attach_device(struct iommu_domain 
*domain,
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
struct device *m4udev = data->dev;
int ret, domid;
+   u32 regval;
 
domid = mtk_iommu_get_domain_id(dev, data->plat_data);
if (domid < 0)
@@ -472,8 +478,14 @@ static int mtk_iommu_attach_device(struct iommu_domain 
*domain,
return ret;
}
data->m4u_dom = dom;
-   writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
-  data->base + REG_MMU_PT_BASE_ADDR);
+
+   /* Bits[6:3] are invalid for mediatek platform */
+   if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN))
+   regval = (dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK) 
|
+(dom->cfg.arm_v7s_cfg.ttbr & 
MMU_PT_ADDR_2_0_MASK);
+   else
+   regval = dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK;
+   writel(regval, data->base + REG_MMU_PT_BASE_ADDR);
 
pm_runtime_put(m4udev);
}
@@ -987,6 +999,7 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct 
device *dev)
struct mtk_iommu_suspend_reg *reg = &data->reg;
struct mtk_iommu_domain *m4u_dom = data->m4u_dom;
void __iomem *base = data->base;
+   u32 regval;
int ret;
 
ret = clk_prepare_enable(data->bclk);
@@ -1010,7 +1023,14 @@ static int __maybe_unused 
mtk_iommu_runtime_resume(struct device *dev)
writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL);
writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG);
-   writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK, base + 
REG_MMU_PT_BASE_ADDR);
+
+   /* Bits[6:3] are invalid for mediatek platform */
+   if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN))
+   regval = (m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK) |
+(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_2_0_MASK);
+   else
+   regval = m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK;
+   writel(regval, base + REG_MMU_PT_BASE_ADDR);
 
/*
 * Users may allocate dma buffer before they call pm_runtime_get,
@@ -1038,7 +1058,8 @@ static const struct mtk_iommu_plat_data mt2712_data = {
 
 static const struct mtk_iommu_plat_data mt6779_data = {
.m4u_plat  = M4U_MT6779,
-   .flags = HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN,
+   .flags = HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN |
+PGTABLE_PA_35_EN,
.inv_sel_reg   = REG_MMU_INV_SEL_GEN2,
.iova_region   = single_domain,
.iova_region_nr = ARRAY_SIZE(single_domain),
-- 
2.18.0



Re: [PATCH v8 5/8] perf tool: Add support for HiSilicon PCIe Tune and Trace device driver

2022-05-16 Thread Jonathan Cameron via iommu
On Mon, 16 May 2022 20:52:20 +0800
Yicong Yang  wrote:

> From: Qi Liu 
> 
> HiSilicon PCIe tune and trace device (PTT) can dynamically tune
> the PCIe link's events and trace the TLP headers.
> 
> This patch adds support for the PTT device to the perf tool, so users can
> use 'perf record' to get TLP header trace data.
> 
> Signed-off-by: Qi Liu 
> Signed-off-by: Yicong Yang 

One query inline.


> diff --git a/tools/perf/arch/arm/util/auxtrace.c 
> b/tools/perf/arch/arm/util/auxtrace.c
> index 384c7cfda0fd..297fffedf45e 100644
> --- a/tools/perf/arch/arm/util/auxtrace.c
> +++ b/tools/perf/arch/arm/util/auxtrace.c

...

>  static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
>  int pmu_nr, struct evsel *evsel)
>  {
> @@ -71,17 +120,21 @@ struct auxtrace_record
>  {
>   struct perf_pmu *cs_etm_pmu = NULL;
>   struct perf_pmu **arm_spe_pmus = NULL;
> + struct perf_pmu **hisi_ptt_pmus = NULL;
>   struct evsel *evsel;
>   struct perf_pmu *found_etm = NULL;
>   struct perf_pmu *found_spe = NULL;
> + struct perf_pmu *found_ptt = NULL;
>   int auxtrace_event_cnt = 0;
>   int nr_spes = 0;
> + int nr_ptts = 0;
>  
>   if (!evlist)
>   return NULL;
>  
>   cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
>   arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
> + hisi_ptt_pmus = find_all_hisi_ptt_pmus(&nr_ptts, err);
>  
>   evlist__for_each_entry(evlist, evsel) {
>   if (cs_etm_pmu && !found_etm)
> @@ -89,9 +142,13 @@ struct auxtrace_record
>  
>   if (arm_spe_pmus && !found_spe)
>   found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, 
> evsel);
> +
> + if (arm_spe_pmus && !found_spe)

if (hisi_ptt_pmus && !found_ptt) ?

Otherwise, I'm not sure what the purpose of the checking against spe is.
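
A minimal userspace sketch of the loop with the guard suggested above.
The struct pmu and struct event types and the count_aux_pmus() wrapper
are hypothetical stand-ins for perf's struct perf_pmu, struct evsel and
the body of auxtrace_record__init(); only the guard conditions are the
point here.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for perf_pmu and evsel. */
struct pmu { unsigned int type; };
struct event { unsigned int type; };

/* Same shape as find_pmu_for_event() in the quoted patch. */
static struct pmu *find_pmu_for_event(struct pmu **pmus, int pmu_nr,
				      const struct event *ev)
{
	int i;

	if (!pmus)
		return NULL;

	for (i = 0; i < pmu_nr; i++) {
		if (ev->type == pmus[i]->type)
			return pmus[i];
	}

	return NULL;
}

/* Returns how many AUX trace PMU kinds (SPE, PTT) matched any event. */
static int count_aux_pmus(struct pmu **spe_pmus, int nr_spes,
			  struct pmu **ptt_pmus, int nr_ptts,
			  const struct event *events, int nr_events)
{
	struct pmu *found_spe = NULL;
	struct pmu *found_ptt = NULL;
	int i;

	for (i = 0; i < nr_events; i++) {
		if (spe_pmus && !found_spe)
			found_spe = find_pmu_for_event(spe_pmus, nr_spes,
						       &events[i]);

		/* Guard the PTT lookup with PTT state, not SPE state. */
		if (ptt_pmus && !found_ptt)
			found_ptt = find_pmu_for_event(ptt_pmus, nr_ptts,
						       &events[i]);
	}

	return (found_spe != NULL) + (found_ptt != NULL);
}
```

With the guard as posted (arm_spe_pmus && !found_spe), a system without
any SPE PMU would never look up the PTT PMU at all, which is the case
the second guard in this sketch is meant to cover.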

> + found_ptt = find_pmu_for_event(hisi_ptt_pmus, nr_ptts, 
> evsel);
>   }
>  
>   free(arm_spe_pmus);
> + free(hisi_ptt_pmus);
>  
>   if (found_etm)
>   auxtrace_event_cnt++;
> @@ -99,6 +156,9 @@ struct auxtrace_record
>   if (found_spe)
>   auxtrace_event_cnt++;
>  
> + if (found_ptt)
> + auxtrace_event_cnt++;
> +
>   if (auxtrace_event_cnt > 1) {
>   pr_err("Concurrent AUX trace operation not currently 
> supported\n");
>   *err = -EOPNOTSUPP;
> @@ -111,6 +171,9 @@ struct auxtrace_record
>  #if defined(__aarch64__)
>   if (found_spe)
>   return arm_spe_recording_init(err, found_spe);
> +
> + if (found_ptt)
> + return hisi_ptt_recording_init(err, found_ptt);
>  #endif
>  


Re: [PATCH v8 4/8] perf arm: Refactor event list iteration in auxtrace_record__init()

2022-05-16 Thread Jonathan Cameron via iommu
On Mon, 16 May 2022 20:52:19 +0800
Yicong Yang  wrote:

> From: Qi Liu 
> 
> Use find_pmu_for_event() to simplify logic in auxtrace_record__init().
Possibly reword as 

"Add find_pmu_for_event() and use it to simplify the logic in
auxtrace_record__init(). find_pmu_for_event() will be
reused in subsequent patches."

> 
> Signed-off-by: Qi Liu 
> Signed-off-by: Yicong Yang 
FWIW, this isn't an area I know much about, but it seems
like a good cleanup and functionally equivalent.

Reviewed-by: Jonathan Cameron 
> ---
>  tools/perf/arch/arm/util/auxtrace.c | 53 ++---
>  1 file changed, 34 insertions(+), 19 deletions(-)
> 
> diff --git a/tools/perf/arch/arm/util/auxtrace.c 
> b/tools/perf/arch/arm/util/auxtrace.c
> index 5fc6a2a3dbc5..384c7cfda0fd 100644
> --- a/tools/perf/arch/arm/util/auxtrace.c
> +++ b/tools/perf/arch/arm/util/auxtrace.c
> @@ -50,16 +50,32 @@ static struct perf_pmu **find_all_arm_spe_pmus(int 
> *nr_spes, int *err)
>   return arm_spe_pmus;
>  }
>  
> +static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
> +int pmu_nr, struct evsel *evsel)
> +{
> + int i;
> +
> + if (!pmus)
> + return NULL;
> +
> + for (i = 0; i < pmu_nr; i++) {
> + if (evsel->core.attr.type == pmus[i]->type)
> + return pmus[i];
> + }
> +
> + return NULL;
> +}
> +
>  struct auxtrace_record
>  *auxtrace_record__init(struct evlist *evlist, int *err)
>  {
> - struct perf_pmu *cs_etm_pmu;
> + struct perf_pmu *cs_etm_pmu = NULL;
> + struct perf_pmu **arm_spe_pmus = NULL;
>   struct evsel *evsel;
> - bool found_etm = false;
> + struct perf_pmu *found_etm = NULL;
>   struct perf_pmu *found_spe = NULL;
> - struct perf_pmu **arm_spe_pmus = NULL;
> + int auxtrace_event_cnt = 0;
>   int nr_spes = 0;
> - int i = 0;
>  
>   if (!evlist)
>   return NULL;
> @@ -68,24 +84,23 @@ struct auxtrace_record
>   arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
>  
>   evlist__for_each_entry(evlist, evsel) {
> - if (cs_etm_pmu &&
> - evsel->core.attr.type == cs_etm_pmu->type)
> - found_etm = true;
> -
> - if (!nr_spes || found_spe)
> - continue;
> -
> - for (i = 0; i < nr_spes; i++) {
> - if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
> - found_spe = arm_spe_pmus[i];
> - break;
> - }
> - }
> + if (cs_etm_pmu && !found_etm)
> + found_etm = find_pmu_for_event(&cs_etm_pmu, 1, evsel);
> +
> + if (arm_spe_pmus && !found_spe)
> + found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, 
> evsel);
>   }
> +
>   free(arm_spe_pmus);
>  
> - if (found_etm && found_spe) {
> - pr_err("Concurrent ARM Coresight ETM and SPE operation not 
> currently supported\n");
> + if (found_etm)
> + auxtrace_event_cnt++;
> +
> + if (found_spe)
> + auxtrace_event_cnt++;
> +
> + if (auxtrace_event_cnt > 1) {
> + pr_err("Concurrent AUX trace operation not currently 
> supported\n");
>   *err = -EOPNOTSUPP;
>   return NULL;
>   }



Re: [PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus

2022-05-16 Thread Thierry Reding
On Mon, May 16, 2022 at 02:20:18PM +0300, Mikko Perttunen wrote:
> On 5/16/22 13:44, Robin Murphy wrote:
> > On 2022-05-16 11:13, Mikko Perttunen wrote:
> > > On 5/16/22 13:07, Will Deacon wrote:
> > > > On Mon, May 16, 2022 at 11:52:54AM +0300, cyn...@kapsi.fi wrote:
> > > > > From: Mikko Perttunen 
> > > > > 
> > > > > Set itself as the IOMMU for the host1x context device bus, containing
> > > > > "dummy" devices used for Host1x context isolation.
> > > > > 
> > > > > Signed-off-by: Mikko Perttunen 
> > > > > ---
> > > > >   drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +
> > > > >   1 file changed, 13 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > > > b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > > > index 568cce590ccc..9ff54eaecf81 100644
> > > > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > > > @@ -39,6 +39,7 @@
> > > > >   #include 
> > > > >   #include 
> > > > > +#include 
> > > > >   #include "arm-smmu.h"
> > > > > @@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct
> > > > > iommu_ops *ops)
> > > > >   goto err_reset_pci_ops;
> > > > >   }
> > > > >   #endif
> > > > > +#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS
> > > > > +    if (!iommu_present(_context_device_bus_type)) {
> > > > > +    err = bus_set_iommu(_context_device_bus_type, ops);
> > > > > +    if (err)
> > > > > +    goto err_reset_fsl_mc_ops;
> > > > > +    }
> > > > > +#endif
> > > > > +
> > > > >   return 0;
> > > > > +err_reset_fsl_mc_ops: __maybe_unused;
> > > > > +#ifdef CONFIG_FSL_MC_BUS
> > > > > +    bus_set_iommu(_mc_bus_type, NULL);
> > > > > +#endif
> > > > 
> > > > bus_set_iommu() is going away:
> > > > 
> > > > https://lore.kernel.org/r/cover.1650890638.git.robin.mur...@arm.com
> > > > 
> > > > Will
> > > 
> > > Thanks for the heads-up. Robin had pointed out that this work was
> > > ongoing but I hadn't seen the patches yet. I'll look into it.
> > 
> > Although that *is* currently blocked on the mystery intel-iommu problem
> > that I can't reproduce... If this series is ready to land right now for
> > 5.19 then in principle that might be the easiest option overall.
> > Hopefully at least patch #2 could sneak in so that the compile-time
> > dependencies are ready for me to roll up host1x into the next rebase of
> > "iommu: Always register bus notifiers".
> > 
> > Cheers,
> > Robin.
> 
> My guess is that the series as a whole is not ready to land in the 5.19
> timeframe, but #2 could be possible.
> 
> Thierry, any opinion?

Dave and Daniel typically want new material to be in by -rc6 and I've
already sent the PR for this cycle. I can ask them if they'd take
another one, though, if it makes things simpler for the next cycle.

Thierry



Re: [PATCH v8 2/8] hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device

2022-05-16 Thread Jonathan Cameron via iommu
On Mon, 16 May 2022 20:52:17 +0800
Yicong Yang  wrote:

> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex integrated
> Endpoint(RCiEP) device, providing the capability to dynamically monitor and
> tune the PCIe traffic and trace the TLP headers.
> 
> Add the driver for the device to enable the trace function. Register PMU
> device of PTT trace, then users can use trace through perf command. The
> driver makes use of perf AUX trace function and support the following
> events to configure the trace:
> 
> - filter: select Root port or Endpoint to trace
> - type: select the type of traced TLP headers
> - direction: select the direction of traced TLP headers
> - format: select the data format of the traced TLP headers
> 
> This patch initially add a basic driver of PTT trace.
> 
> Signed-off-by: Yicong Yang 

Hi Yicong,

It's been a while since I looked at this driver, so I'll admit
I can't remember if any of the things I've raised below were
previously discussed. 

All minor stuff (biggest is question of failing cleanly in unlikely
case of failing the allocation in the filter addition vs carrying
on anyway), so feel free to add

Reviewed-by: Jonathan Cameron 

> diff --git a/drivers/hwtracing/ptt/Makefile b/drivers/hwtracing/ptt/Makefile
> new file mode 100644
> index ..908c09a98161
> --- /dev/null
> +++ b/drivers/hwtracing/ptt/Makefile
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: GPL-2.0
> +obj-$(CONFIG_HISI_PTT) += hisi_ptt.o
> diff --git a/drivers/hwtracing/ptt/hisi_ptt.c 
> b/drivers/hwtracing/ptt/hisi_ptt.c
> new file mode 100644
> index ..ef25ce98f664
> --- /dev/null
> +++ b/drivers/hwtracing/ptt/hisi_ptt.c


...


> +
> +static int hisi_ptt_init_filters(struct pci_dev *pdev, void *data)
> +{
> + struct hisi_ptt_filter_desc *filter;
> + struct hisi_ptt *hisi_ptt = data;
> +
> + filter = kzalloc(sizeof(*filter), GFP_KERNEL);
> + if (!filter) {
> + pci_err(hisi_ptt->pdev, "failed to add filter %s\n", 
> pci_name(pdev));

If this fails we carry on anyway (no error checking on the bus_walk).
I think we should error out in that case (would need to use a flag placed
somewhere in hisi_ptt to tell we had an error).

That would complicate the unwind though.
Easiest way to do that unwind is probably to register a separate
devm_add_action_or_reset() callback for each filter.

If you prefer to carry on even with this allocation error, then maybe add a
comment here somewhere to make it clear that will happen.

> + return -ENOMEM;
> + }
> +
> + filter->devid = PCI_DEVID(pdev->bus->number, pdev->devfn);
> +
> + if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT) {
> + filter->is_port = true;
> + list_add_tail(&filter->list, &hisi_ptt->port_filters);
> +
> + /* Update the available port mask */
> + hisi_ptt->port_mask |= hisi_ptt_get_filter_val(filter->devid, 
> true);
> + } else {
> + list_add_tail(&filter->list, &hisi_ptt->req_filters);
> + }
> +
> + return 0;
> +}
> +
> +static void hisi_ptt_release_filters(void *data)
> +{
> + struct hisi_ptt_filter_desc *filter, *tmp;
> + struct hisi_ptt *hisi_ptt = data;
> +
> + list_for_each_entry_safe(filter, tmp, &hisi_ptt->req_filters, list) {
> + list_del(&filter->list);
> + kfree(filter);

I think with a separate release per entry above, this bit becomes simpler as
we walk all the elements in the devm_ callback list rather than two lists here.

> + }
> +
> + list_for_each_entry_safe(filter, tmp, &hisi_ptt->port_filters, list) {
> + list_del(&filter->list);
> + kfree(filter);
> + }
> +}
> +

...

> +
> +static int hisi_ptt_init_ctrls(struct hisi_ptt *hisi_ptt)
> +{
> + struct pci_dev *pdev = hisi_ptt->pdev;
> + struct pci_bus *bus;
> + int ret;
> + u32 reg;
> +
> + INIT_LIST_HEAD(&hisi_ptt->port_filters);
> + INIT_LIST_HEAD(&hisi_ptt->req_filters);
> +
> + ret = hisi_ptt_config_trace_buf(hisi_ptt);
> + if (ret)
> + return ret;
> +
> + /*
> +  * The device range register provides the information about the
> +  * root ports which the RCiEP can control and trace. The RCiEP
> +  * and the root ports it support are on the same PCIe core, with
> +  * same domain number but maybe different bus number. The device
> +  * range register will tell us which root ports we can support,
> +  * Bit[31:16] indicates the upper BDF numbers of the root port,
> +  * while Bit[15:0] indicates the lower.
> +  */
> + reg = readl(hisi_ptt->iobase + HISI_PTT_DEVICE_RANGE);
> + hisi_ptt->upper_bdf = FIELD_GET(HISI_PTT_DEVICE_RANGE_UPPER, reg);
> + hisi_ptt->lower_bdf = FIELD_GET(HISI_PTT_DEVICE_RANGE_LOWER, reg);
> +
> + bus = pci_find_bus(pci_domain_nr(pdev->bus), 
> PCI_BUS_NUM(hisi_ptt->upper_bdf));
> + if (bus)
> + pci_walk_bus(bus, hisi_ptt_init_filters, hisi_ptt);
> +
> + ret = devm_add_action_or_reset(&pdev->dev, 

Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops

2022-05-16 Thread Jason Gunthorpe via iommu
On Mon, May 16, 2022 at 12:22:08PM +0100, Robin Murphy wrote:
> On 2022-05-16 02:57, Lu Baolu wrote:
> > Each IOMMU driver must provide a blocking domain ops. If the hardware
> > supports detaching domain from device, setting blocking domain equals
> > detaching the existing domain from the deivce. Otherwise, an UNMANAGED
> > domain without any mapping will be used instead.
> 
> Unfortunately that's backwards - most of the implementations of .detach_dev
> are disabling translation entirely, meaning the device ends up effectively
> in passthrough rather than blocked.

Ideally we'd convert the detach_dev of every driver into either
a blocking or identity domain. The trick is knowing which is which..

Guessing going down the list:
 apple dart - blocking, detach_dev calls apple_dart_hw_disable_dma() same as
  IOMMU_DOMAIN_BLOCKED
  [I wonder if this driver is wrong in other ways though because
   I don't see a remove_streams in attach_dev]
 exynos - this seems to disable the 'sysmmu' so I'm guessing this is
  identity
 iommu-vmsa - Comment says 'disable mmu translaction' so I'm guessing
  this is idenity
 mkt_v1 - Code looks similar to mkt, which is probably identity.
 rkt - No idea
 sprd - No idea
 sun50i - This driver confusingly treats identity the same as
  unmanaged, seems wrong, no idea.
 amd - Not sure, clear_dte_entry() seems to set translation on but points
   the PTE to 0 ? Based on the spec table 8 I would have expected
   TV to be clear which would be blocking. Maybe a bug??
 arm smmu qcomm - not sure
 intel - blocking

These doesn't support default domains, so detach_dev should return 
back to DMA API ownership, which is either identity or something weird:
 fsl_pamu - identity due to the PPC use of dma direct
 msm
 mkt
 omap
 s390 - platform DMA ops
 terga-gart - Usually something called a GART would be 0 length once
  disabled, guessing blocking?
 tegra-smmu

So, the approach here should be to go driver by driver and convert
detach_dev to either identity, blocking or just delete it entirely,
excluding the above 7 that don't support default domains. And get acks
from the driver owners.

> Conversely, at least arm-smmu and arm-smmu-v3 could implement
> IOMMU_DOMAIN_BLOCKED properly with fault-type S2CRs and STEs
> respectively, it just needs a bit of wiring up.

Given that vfio now uses them it seems worthwhile to do..

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration

2022-05-16 Thread Rob Herring
On Fri, May 13, 2022 at 12:26 PM Saravana Kannan  wrote:
>
> On Fri, May 13, 2022 at 6:58 AM Rob Herring  wrote:
> >
> > On Fri, Apr 29, 2022 at 5:09 PM Saravana Kannan  
> > wrote:
> > >
> > > The deferred probe timer that's used for this currently starts at
> > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > assumption being that all available drivers would be loaded and
> > > registered before the timer expires. This means, the
> > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > worst case. But if we set the default value for it to cover the worst
> > > case, it would significantly slow down the average case. For this
> > > reason, the default value is set to 0.
> > >
> > > Also, with CONFIG_MODULES=y and the current default values of
> > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > drivers will cause their consumer devices to always defer their probes.
> > > This is because device links created by fw_devlink defer the probe even
> > > before the consumer driver's probe() is called.
> > >
> > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > timer on every successful driver registration, with the expectation more
> > > modules would be loaded in the near future, then the default value of
> > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > time difference between two consecutive module loads.
> > >
> > > So let's implement that and set the default value to 10 seconds when
> > > CONFIG_MODULES=y.
> >
> > We had to revert a non-zero timeout before (issue with NFS root IIRC).
> > Does fw_devlink=on somehow fix that?
>
> If it's the one where ip autoconfig was timing out, then John Stultz
> fixed it by fixing wait_for_device_probe().
> https://lore.kernel.org/all/20200422203245.83244-4-john.stu...@linaro.org/

Yeah, that was it.
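
For reference, the timer-extension idea from the quoted commit message can be sketched roughly as follows. This is a user-space model under stated assumptions, not the driver core code: the struct, the function names and the use of plain integer seconds are all hypothetical. It only shows the rule "extend an unexpired timer on each driver registration, never revive an expired one".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the deferred probe timer. */
struct probe_timer {
	long expires;	/* absolute deadline, in seconds */
	long timeout;	/* window length, in seconds */
	bool expired;	/* latched once the deadline has passed */
};

static void probe_timer_start(struct probe_timer *t, long now, long timeout)
{
	t->timeout = timeout;
	t->expires = now + timeout;
	t->expired = false;
}

/* Latch and report expiry; an expired timer stays expired. */
static bool probe_timer_expired(struct probe_timer *t, long now)
{
	if (!t->expired && now >= t->expires)
		t->expired = true;
	return t->expired;
}

/* Called on each driver registration: push the deadline out only if still running. */
static void probe_timer_on_driver_register(struct probe_timer *t, long now)
{
	if (!probe_timer_expired(t, now))
		t->expires = now + t->timeout;
}
```

With a 10 second window started at t=0, a registration at t=8 moves the deadline to t=18, while a registration at t=20 changes nothing because the timer has already fired.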

Acked-by: Rob Herring 

Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops

2022-05-16 Thread Baolu Lu

Hi Robin,

On 2022/5/16 19:22, Robin Murphy wrote:
> On 2022-05-16 02:57, Lu Baolu wrote:
> > Each IOMMU driver must provide a blocking domain ops. If the hardware
> > supports detaching domain from device, setting blocking domain equals
> > detaching the existing domain from the device. Otherwise, an UNMANAGED
> > domain without any mapping will be used instead.
>
> Unfortunately that's backwards - most of the implementations of
> .detach_dev are disabling translation entirely, meaning the device ends
> up effectively in passthrough rather than blocked. Conversely, at least
> arm-smmu and arm-smmu-v3 could implement IOMMU_DOMAIN_BLOCKED properly
> with fault-type S2CRs and STEs respectively, it just needs a bit of
> wiring up.


Thank you for letting me know this.

This means that we need to add an additional UNMANAGED domain for each
iommu group, although it is not used most of the time. If most IOMMU
drivers could implement real dumb blocking domains, this burden may be
reduced.

Best regards,
baolu

[RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()

2022-05-16 Thread John Garry via iommu
For streaming DMA mappings involving an IOMMU and whose IOVA len regularly
exceeds the IOVA rcache upper limit (meaning that they are not cached),
performance can be reduced.

Add the IOMMU callback for DMA mapping API dma_max_mapping_size(), which
allows the drivers to know the mapping limit and thus limit the requested 
IOVA lengths.

This resolves the performance issue originally reported in [0] for a SCSI
HBA driver which was regularly mapping SGLs which required IOVAs in
excess of the IOVA caching limit. In this case the block layer limits the
max sectors per request - as configured in __scsi_init_queue() - which
will limit the total SGL length the driver tries to map and in turn limits
IOVA lengths requested.

[0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leiz...@huawei.com/

Signed-off-by: John Garry 
---
Sending as an RFC as iommu_dma_max_mapping_size() is a soft limit, and not
a hard limit which I expect is the semantics of dma_map_ops.max_mapping_size
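
A minimal sketch of the arithmetic behind the limit, assuming IOVA_RANGE_CACHE_MAX_SIZE is 6 (the value believed to be in mainline at the time; treat it as an assumption) and taking the page size as a parameter. The names below are hypothetical and only mirror the shape of iova_rcache_range() in the diff that follows; the clamp helper illustrates how a caller might cap a requested length, it is not code from this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stands in for IOVA_RANGE_CACHE_MAX_SIZE (log2 of cached range count). */
#define SKETCH_RCACHE_MAX_SIZE 6

/* Largest IOVA length that can still be served from the rcaches. */
static size_t sketch_rcache_range(size_t page_size)
{
	return page_size << (SKETCH_RCACHE_MAX_SIZE - 1);
}

/* Hypothetical caller-side cap: never request more than the reported limit. */
static size_t sketch_clamp_len(size_t want, size_t limit)
{
	return want > limit ? limit : want;
}
```

Under these assumptions a 4 KiB page size gives a 128 KiB limit, and a 64 KiB page size gives 2 MiB.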

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 09f6e1c0f9c0..e2d5205cde37 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1442,6 +1442,21 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
 }
 
+static size_t iommu_dma_max_mapping_size(struct device *dev)
+{
+   struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+   struct iommu_dma_cookie *cookie;
+
+   if (!domain)
+   return 0;
+
+   cookie = domain->iova_cookie;
+   if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
+   return 0;
+
+   return iova_rcache_range();
+}
+
 static const struct dma_map_ops iommu_dma_ops = {
.alloc  = iommu_dma_alloc,
.free   = iommu_dma_free,
@@ -1462,6 +1477,7 @@ static const struct dma_map_ops iommu_dma_ops = {
.map_resource   = iommu_dma_map_resource,
.unmap_resource = iommu_dma_unmap_resource,
.get_merge_boundary = iommu_dma_get_merge_boundary,
+   .max_mapping_size   = iommu_dma_max_mapping_size,
 };
 
 /*
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index db77aa675145..9f00b58d546e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
 static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
 static void free_iova_rcaches(struct iova_domain *iovad);
 
+unsigned long iova_rcache_range(void)
+{
+   return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
+}
+
 static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
 {
struct iova_domain *iovad;
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 320a70e40233..ae3e18d77e6c 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
 int iova_cache_get(void);
 void iova_cache_put(void);
 
+unsigned long iova_rcache_range(void);
+
 void free_iova(struct iova_domain *iovad, unsigned long pfn);
 void __free_iova(struct iova_domain *iovad, struct iova *iova);
 struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
@@ -105,6 +107,11 @@ static inline void iova_cache_put(void)
 {
 }
 
+static inline unsigned long iova_rcache_range(void)
+{
+   return 0;
+}
+
 static inline void free_iova(struct iova_domain *iovad, unsigned long pfn)
 {
 }
-- 
2.26.2


Re: [RFC PATCH V2 1/2] swiotlb: Add Child IO TLB mem support

2022-05-16 Thread Tianyu Lan

On 5/16/2022 3:34 PM, Christoph Hellwig wrote:
> I don't really understand how 'childs' fit in here.  The code also
> doesn't seem to be usable without patch 2 and a caller of the
> new functions added in patch 2, so it is rather impossible to review.


Hi Christoph:
 OK. I will merge the two patches and add a caller patch. The motivation
is to avoid the global spin lock when devices use the swiotlb bounce buffer,
since it introduces overhead in high-throughput cases. In my test
environment, the current code achieves about 24Gb/s network throughput
with SWIOTLB force enabled and about 40Gb/s without
SWIOTLB force. Storage has the same issue.
 Per-device IO TLB mem may resolve the global spin lock contention among
devices, but a device may still have multiple queues, and those queues still
need to share one spin lock. This is why the previous patches introduced
child IO TLB mem (IO TLB areas). Each device queue will have a separate
child IO TLB mem and a single spin lock to manage its IO TLB buffers.
 Otherwise, the global spin lock still costs CPU time during high
throughput even when there is performance regression. Each device queue
needs to spin on a different CPU to acquire the global lock. Child IO
TLB mem may also resolve the CPU usage issue.
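
A rough sketch of the partitioning idea described above, not the code from the patches: each queue maps to its own area with its own lock, so queues only contend when they share an area. All names and sizes are hypothetical, and a C11 atomic_flag stands in for the kernel spin lock.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_NR_AREAS   4	/* hypothetical number of child areas */
#define SKETCH_AREA_SLOTS 8	/* hypothetical slots per area */

struct sketch_area {
	atomic_flag lock;	/* per-area lock instead of one global lock */
	unsigned int used;	/* slots already handed out from this area */
};

struct sketch_io_tlb_mem {
	struct sketch_area areas[SKETCH_NR_AREAS];
};

static void sketch_mem_init(struct sketch_io_tlb_mem *mem)
{
	for (int i = 0; i < SKETCH_NR_AREAS; i++) {
		atomic_flag_clear(&mem->areas[i].lock);
		mem->areas[i].used = 0;
	}
}

/* Returns the slot index inside the queue's area, or -1 if that area is full. */
static int sketch_alloc_slot(struct sketch_io_tlb_mem *mem, unsigned int queue)
{
	struct sketch_area *area = &mem->areas[queue % SKETCH_NR_AREAS];
	int slot = -1;

	while (atomic_flag_test_and_set(&area->lock))
		;	/* spin: contends only with users of the same area */

	if (area->used < SKETCH_AREA_SLOTS)
		slot = (int)area->used++;

	atomic_flag_clear(&area->lock);
	return slot;
}
```

In this model queues 0 and 1 never touch the same lock, while queues 0 and 4 share area 0 and therefore still serialize against each other.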



> Also:
>
>   1) why is SEV/TDX so different from other cases that need bounce
>      buffering to treat it different and we can't work on a general
>      scalability improvement

Other cases also have the global spin lock issue, but whether it matters
depends on whether they hit the bottleneck. The CPU usage issue may be ignored.

>   2) per previous discussions at how swiotlb itself works, it is
>      clear that another option is to just make pages we DMA to
>      shared with the hypervisor.  Why don't we try that at least
>      for larger I/O?

For a confidential VM (both TDX and SEV), we need to use a bounce
buffer to copy between private memory, which the hypervisor can't
access directly, and shared memory. For security reasons, a
confidential VM should not share IO stack DMA pages with the
hypervisor directly, to avoid attacks from the hypervisor while the IO
stack handles the DMA data.


Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops

2022-05-16 Thread Jason Gunthorpe via iommu
On Mon, May 16, 2022 at 12:27:41AM -0700, Christoph Hellwig wrote:
> On Mon, May 16, 2022 at 09:57:56AM +0800, Lu Baolu wrote:
> > Each IOMMU driver must provide a blocking domain ops. If the hardware
> > supports detaching domain from device, setting blocking domain equals
> > detaching the existing domain from the deivce. Otherwise, an UNMANAGED
> > domain without any mapping will be used instead.
> 
> blocking in this case means not allowing any access?  The naming
> sounds a bit odd to me as blocking in the kernel has a specific
> meaning.  Maybe something like noaccess ops might be a better name?

It is because of this:

include/linux/iommu.h: *  IOMMU_DOMAIN_BLOCKED   - All DMA is blocked, can be used to isolate
include/linux/iommu.h:#define IOMMU_DOMAIN_BLOCKED  (0U)

noaccess might be clearer

Jason

[PATCH v8 7/8] docs: trace: Add HiSilicon PTT device driver documentation

2022-05-16 Thread Yicong Yang via iommu
Document the introduction and usage of HiSilicon PTT device driver.

Signed-off-by: Yicong Yang 
Reviewed-by: Jonathan Cameron 
---
 Documentation/trace/hisi-ptt.rst | 307 +++
 Documentation/trace/index.rst|   1 +
 2 files changed, 308 insertions(+)
 create mode 100644 Documentation/trace/hisi-ptt.rst

diff --git a/Documentation/trace/hisi-ptt.rst b/Documentation/trace/hisi-ptt.rst
new file mode 100644
index ..0a3112244d40
--- /dev/null
+++ b/Documentation/trace/hisi-ptt.rst
@@ -0,0 +1,307 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================
+HiSilicon PCIe Tune and Trace device
+====================================
+
+Introduction
+============
+
+HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
+integrated Endpoint (RCiEP) device, providing the capability
+to dynamically monitor and tune the PCIe link's events (tune),
+and trace the TLP headers (trace). The two functions are independent,
+but it is recommended to use them together to analyze and enhance the
+PCIe link's performance.
+
+On Kunpeng 930 SoC, the PCIe Root Complex is composed of several
+PCIe cores. Each PCIe core includes several Root Ports and a PTT
+RCiEP, like below. The PTT device is capable of tuning and
+tracing the links of the PCIe core.
+::
+
+          +--------------Core 0-------+
+          |       |       [   PTT   ] |
+          |       |       [Root Port]---[Endpoint]
+          |       |       [Root Port]---[Endpoint]
+          |       |       [Root Port]---[Endpoint]
+    Root Complex  |------Core 1-------+
+          |       |       [   PTT   ] |
+          |       |       [Root Port]---[ Switch ]---[Endpoint]
+          |       |       [Root Port]---[Endpoint] `-[Endpoint]
+          |       |       [Root Port]---[Endpoint]
+          +---------------------------+
+
+The PTT device driver registers one PMU device for each PTT device.
+The name of each PTT device is composed of the 'hisi_ptt' prefix with
+the id of the SICL and the Core where it is located. The Kunpeng 930
+SoC encapsulates multiple CPU dies (SCCL, Super CPU Cluster) and
+IO dies (SICL, Super I/O Cluster), where there's one PCIe Root
+Complex for each SICL.
+::
+
+/sys/devices/hisi_ptt_
+
+Tune
+====
+
+PTT tune is designed for monitoring and adjusting PCIe link parameters (events).
+Currently we support events in 4 classes. The scope of the events
+covers the PCIe core to which the PTT device belongs.
+
+Each event is presented as a file under $(PTT PMU dir)/tune, and
+a simple open/read/write/close cycle will be used to tune the event.
+::
+
+    $ cd /sys/devices/hisi_ptt_/tune
+    $ ls
+    qos_tx_cpl        qos_tx_np        qos_tx_p
+    tx_path_rx_req_alloc_buf_level
+    tx_path_tx_req_alloc_buf_level
+    $ cat qos_tx_dp
+    1
+    $ echo 2 > qos_tx_dp
+    $ cat qos_tx_dp
+    2
+
+Current value (numerical value) of the event can be simply read
+from the file, and the desired value written to the file to tune.
+
+1. Tx path QoS control
+----------------------
+
+The following files are provided to tune the QoS of the tx path of
+the PCIe core.
+
+- qos_tx_cpl: weight of Tx completion TLPs
+- qos_tx_np: weight of Tx non-posted TLPs
+- qos_tx_p: weight of Tx posted TLPs
+
+The weight influences the proportion of certain packets on the PCIe link.
+For example, for the storage scenario, increase the proportion
+of the completion packets on the link to enhance the performance as
+more completions are consumed.
+
+The available tune data of these events is [0, 1, 2].
+Writing a negative value will return an error, and out of range
+values will be converted to 2. Note that the event value just
+indicates a probable level, but is not precise.
+
+2. Tx path buffer control
+-------------------------
+
+The following files are provided to tune the buffer of the tx path of the PCIe core.
+
+- tx_path_rx_req_alloc_buf_level: watermark of Rx requested
+- tx_path_tx_req_alloc_buf_level: watermark of Tx requested
+
+These events influence the watermark of the buffer allocated for each
+type. Rx means the inbound while Tx means outbound. The packets will
+be stored in the buffer first and then transmitted either when the
+watermark is reached or when a timeout occurs. For a busy direction, you should
+increase the related buffer watermark to avoid frequent posting and
+thus enhance the performance. In most cases just keep the default value.
+
+The available tune data of above events is [0, 1, 2].
+Writing a negative value will return an error, and out of range
+values will be converted to 2. Note that the event value just
+indicates a probable level, but is not precise.
+
+Trace
+=====
+
+PTT trace is designed for dumping the TLP headers to the memory, which
+can be used to analyze the transactions and usage condition of the PCIe
+Link. You can choose to filter the traced headers by either requester ID,
+or those downstream of a set of Root Ports on the same core of the PTT
+device. It's also 
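
The value rules documented in the Tune section above (valid data is [0, 1, 2], a negative value is an error, out of range values become 2) can be modelled in user space as below. This is a hypothetical illustration of the documented behaviour only; it is not the driver's store path, and the function name and the use of strtol() are assumptions made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define SKETCH_TUNE_MAX 2	/* highest documented tune level */

/*
 * Parse a decimal tune value the way the documentation describes it:
 * negative or non-numeric input is rejected, larger values are clamped.
 */
static int sketch_parse_tune_value(const char *buf, unsigned int *out)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno || end == buf || val < 0)
		return -EINVAL;

	*out = val > SKETCH_TUNE_MAX ? SKETCH_TUNE_MAX : (unsigned int)val;
	return 0;
}
```

So writing "1" would keep 1, writing "7" would end up as 2, and writing "-1" would fail with -EINVAL in this model.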

[PATCH v8 6/8] perf tool: Add support for parsing HiSilicon PCIe Trace packet

2022-05-16 Thread Yicong Yang via iommu
From: Qi Liu 

Add support for using 'perf report --dump-raw-trace' to parse PTT packet.

Example usage:

Output will contain raw PTT data and its textual representation, such
as:

0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x40  offset: 0
ref: 0xa5d50c725  idx: 0  tid: -1  cpu: 0
.
. ... HISI PTT data: size 4194304 bytes
.  : 00 00 00 00 Prefix
.  0004: 08 20 00 60 Header DW0
.  0008: ff 02 00 01 Header DW1
.  000c: 20 08 00 00 Header DW2
.  0010: 10 e7 44 ab Header DW3
.  0014: 2a a8 1e 01 Time
.  0020: 00 00 00 00 Prefix
.  0024: 01 00 00 60 Header DW0
.  0028: 0f 1e 00 01 Header DW1
.  002c: 04 00 00 00 Header DW2
.  0030: 40 00 81 02 Header DW3
.  0034: ee 02 00 00 Time


Signed-off-by: Qi Liu 
Signed-off-by: Yicong Yang 
---
 tools/perf/util/Build |   2 +
 tools/perf/util/auxtrace.c|   3 +
 tools/perf/util/hisi-ptt-decoder/Build|   1 +
 .../hisi-ptt-decoder/hisi-ptt-pkt-decoder.c   | 167 +++
 .../hisi-ptt-decoder/hisi-ptt-pkt-decoder.h   |  31 +++
 tools/perf/util/hisi-ptt.c| 193 ++
 6 files changed, 397 insertions(+)
 create mode 100644 tools/perf/util/hisi-ptt-decoder/Build
 create mode 100644 tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c
 create mode 100644 tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.h
 create mode 100644 tools/perf/util/hisi-ptt.c

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 9a7209a99e16..2d5cc4dc2732 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -116,6 +116,8 @@ perf-$(CONFIG_AUXTRACE) += intel-pt.o
 perf-$(CONFIG_AUXTRACE) += intel-bts.o
 perf-$(CONFIG_AUXTRACE) += arm-spe.o
 perf-$(CONFIG_AUXTRACE) += arm-spe-decoder/
+perf-$(CONFIG_AUXTRACE) += hisi-ptt.o
+perf-$(CONFIG_AUXTRACE) += hisi-ptt-decoder/
 perf-$(CONFIG_AUXTRACE) += s390-cpumsf.o
 
 ifdef CONFIG_LIBOPENCSD
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index a24cad3ce24e..84433c34903e 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -51,6 +51,7 @@
 #include "intel-pt.h"
 #include "intel-bts.h"
 #include "arm-spe.h"
+#include "hisi-ptt.h"
 #include "s390-cpumsf.h"
 #include "util/mmap.h"
 
@@ -1282,6 +1283,8 @@ int perf_event__process_auxtrace_info(struct perf_session *session,
err = s390_cpumsf_process_auxtrace_info(event, session);
break;
case PERF_AUXTRACE_HISI_PTT:
+   err = hisi_ptt_process_auxtrace_info(event, session);
+   break;
case PERF_AUXTRACE_UNKNOWN:
default:
return -EINVAL;
diff --git a/tools/perf/util/hisi-ptt-decoder/Build b/tools/perf/util/hisi-ptt-decoder/Build
new file mode 100644
index ..db3db8b75033
--- /dev/null
+++ b/tools/perf/util/hisi-ptt-decoder/Build
@@ -0,0 +1 @@
+perf-$(CONFIG_AUXTRACE) += hisi-ptt-pkt-decoder.o
diff --git a/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c b/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c
new file mode 100644
index ..64f67169ec37
--- /dev/null
+++ b/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c
@@ -0,0 +1,167 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * HiSilicon PCIe Trace and Tuning (PTT) support
+ * Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../color.h"
+#include "hisi-ptt-pkt-decoder.h"
+
+/*
+ * For 8DW format, the bit[31:11] of DW0 is always 0x1f, which can be
+ * used to distinguish the data format.
+ * 8DW format is like:
+ *   bits [ 31:11 ][   10:0   ]
+ *|---|---|
+ *DW0 [0x1f   ][ Reserved (0x7ff) ]
+ *DW1 [   Prefix  ]
+ *DW2 [ Header DW0]
+ *DW3 [ Header DW1]
+ *DW4 [ Header DW2]
+ *DW5 [ Header DW3]
+ *DW6 [   Reserved (0x0)  ]
+ *DW7 [Time   ]
+ *
+ * 4DW format is like:
+ *   bits [31:30] [ 29:25 ][24][23][22][21][20:11   ][10:0]
+ *|-|-|---|---|---|---|-|-|
+ *DW0 [ Fmt ][  Type  

[PATCH v8 2/8] hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device

2022-05-16 Thread Yicong Yang via iommu
HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex integrated
Endpoint(RCiEP) device, providing the capability to dynamically monitor and
tune the PCIe traffic and trace the TLP headers.

Add the driver for the device to enable the trace function. Register a PMU
device for PTT trace so that users can use trace through the perf command. The
driver makes use of the perf AUX trace function and supports the following
events to configure the trace:

- filter: select Root port or Endpoint to trace
- type: select the type of traced TLP headers
- direction: select the direction of traced TLP headers
- format: select the data format of the traced TLP headers

This patch initially adds a basic driver for PTT trace.

Signed-off-by: Yicong Yang 
---
 drivers/Makefile |   1 +
 drivers/hwtracing/Kconfig|   2 +
 drivers/hwtracing/ptt/Kconfig|  12 +
 drivers/hwtracing/ptt/Makefile   |   2 +
 drivers/hwtracing/ptt/hisi_ptt.c | 964 +++
 drivers/hwtracing/ptt/hisi_ptt.h | 178 ++
 6 files changed, 1159 insertions(+)
 create mode 100644 drivers/hwtracing/ptt/Kconfig
 create mode 100644 drivers/hwtracing/ptt/Makefile
 create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c
 create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h

diff --git a/drivers/Makefile b/drivers/Makefile
index 020780b6b4d2..662d50599467 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -175,6 +175,7 @@ obj-$(CONFIG_USB4)  += thunderbolt/
 obj-$(CONFIG_CORESIGHT)+= hwtracing/coresight/
 obj-y  += hwtracing/intel_th/
 obj-$(CONFIG_STM)  += hwtracing/stm/
+obj-$(CONFIG_HISI_PTT) += hwtracing/ptt/
 obj-$(CONFIG_ANDROID)  += android/
 obj-$(CONFIG_NVMEM)+= nvmem/
 obj-$(CONFIG_FPGA) += fpga/
diff --git a/drivers/hwtracing/Kconfig b/drivers/hwtracing/Kconfig
index 13085835a636..911ee977103c 100644
--- a/drivers/hwtracing/Kconfig
+++ b/drivers/hwtracing/Kconfig
@@ -5,4 +5,6 @@ source "drivers/hwtracing/stm/Kconfig"
 
 source "drivers/hwtracing/intel_th/Kconfig"
 
+source "drivers/hwtracing/ptt/Kconfig"
+
 endmenu
diff --git a/drivers/hwtracing/ptt/Kconfig b/drivers/hwtracing/ptt/Kconfig
new file mode 100644
index ..6d46a09ffeb9
--- /dev/null
+++ b/drivers/hwtracing/ptt/Kconfig
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config HISI_PTT
+   tristate "HiSilicon PCIe Tune and Trace Device"
+   depends on ARM64 || (COMPILE_TEST && 64BIT)
+   depends on PCI && HAS_DMA && HAS_IOMEM && PERF_EVENTS
+   help
+ HiSilicon PCIe Tune and Trace device exists as a PCIe RCiEP
+ device, and it provides support for PCIe traffic tuning and
+ tracing TLP headers to the memory.
+
+ This driver can also be built as a module. If so, the module
+ will be called hisi_ptt.
diff --git a/drivers/hwtracing/ptt/Makefile b/drivers/hwtracing/ptt/Makefile
new file mode 100644
index ..908c09a98161
--- /dev/null
+++ b/drivers/hwtracing/ptt/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_HISI_PTT) += hisi_ptt.o
diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
new file mode 100644
index ..ef25ce98f664
--- /dev/null
+++ b/drivers/hwtracing/ptt/hisi_ptt.c
@@ -0,0 +1,964 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for HiSilicon PCIe tune and trace device
+ *
+ * Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
+ * Author: Yicong Yang 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "hisi_ptt.h"
+
+/* Dynamic CPU hotplug state used by PTT */
+static enum cpuhp_state hisi_ptt_pmu_online;
+
+static u16 hisi_ptt_get_filter_val(u16 devid, bool is_port)
+{
+   if (is_port)
+   return BIT(HISI_PCIE_CORE_PORT_ID(devid & 0xff));
+
+   return devid;
+}
+
+static bool hisi_ptt_wait_trace_hw_idle(struct hisi_ptt *hisi_ptt)
+{
+   u32 val;
+
+   return !readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_STS,
+ val, val & HISI_PTT_TRACE_IDLE,
+ HISI_PTT_WAIT_POLL_INTERVAL_US,
+ HISI_PTT_WAIT_TRACE_TIMEOUT_US);
+}
+
+static void hisi_ptt_wait_dma_reset_done(struct hisi_ptt *hisi_ptt)
+{
+   u32 val;
+
+   readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_WR_STS,
+ val, !val, HISI_PTT_RESET_POLL_INTERVAL_US,
+ HISI_PTT_RESET_TIMEOUT_US);
+}
+
+static void hisi_ptt_trace_end(struct hisi_ptt *hisi_ptt)
+{
+   writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
+   hisi_ptt->trace_ctrl.started = false;
+}
+
+static int hisi_ptt_trace_start(struct hisi_ptt *hisi_ptt)
+{
+   struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl;
+  

[PATCH v8 4/8] perf arm: Refactor event list iteration in auxtrace_record__init()

2022-05-16 Thread Yicong Yang via iommu
From: Qi Liu 

Use find_pmu_for_event() to simplify logic in auxtrace_record__init().

Signed-off-by: Qi Liu 
Signed-off-by: Yicong Yang 
---
 tools/perf/arch/arm/util/auxtrace.c | 53 ++---
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index 5fc6a2a3dbc5..384c7cfda0fd 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -50,16 +50,32 @@ static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
return arm_spe_pmus;
 }
 
+static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
+  int pmu_nr, struct evsel *evsel)
+{
+   int i;
+
+   if (!pmus)
+   return NULL;
+
+   for (i = 0; i < pmu_nr; i++) {
+   if (evsel->core.attr.type == pmus[i]->type)
+   return pmus[i];
+   }
+
+   return NULL;
+}
+
 struct auxtrace_record
 *auxtrace_record__init(struct evlist *evlist, int *err)
 {
-   struct perf_pmu *cs_etm_pmu;
+   struct perf_pmu *cs_etm_pmu = NULL;
+   struct perf_pmu **arm_spe_pmus = NULL;
struct evsel *evsel;
-   bool found_etm = false;
+   struct perf_pmu *found_etm = NULL;
struct perf_pmu *found_spe = NULL;
-   struct perf_pmu **arm_spe_pmus = NULL;
+   int auxtrace_event_cnt = 0;
int nr_spes = 0;
-   int i = 0;
 
if (!evlist)
return NULL;
@@ -68,24 +84,23 @@ struct auxtrace_record
arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
 
evlist__for_each_entry(evlist, evsel) {
-   if (cs_etm_pmu &&
-   evsel->core.attr.type == cs_etm_pmu->type)
-   found_etm = true;
-
-   if (!nr_spes || found_spe)
-   continue;
-
-   for (i = 0; i < nr_spes; i++) {
-   if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
-   found_spe = arm_spe_pmus[i];
-   break;
-   }
-   }
+   if (cs_etm_pmu && !found_etm)
+   found_etm = find_pmu_for_event(&cs_etm_pmu, 1, evsel);
+
+   if (arm_spe_pmus && !found_spe)
+   found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel);
}
+
free(arm_spe_pmus);
 
-   if (found_etm && found_spe) {
-   pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n");
+   if (found_etm)
+   auxtrace_event_cnt++;
+
+   if (found_spe)
+   auxtrace_event_cnt++;
+
+   if (auxtrace_event_cnt > 1) {
+   pr_err("Concurrent AUX trace operation not currently supported\n");
*err = -EOPNOTSUPP;
return NULL;
}
-- 
2.24.0


[PATCH v8 3/8] hwtracing: hisi_ptt: Add tune function support for HiSilicon PCIe Tune and Trace device

2022-05-16 Thread Yicong Yang via iommu
Add tune function for the HiSilicon Tune and Trace device. The interface
of tune is exposed through sysfs attributes of PTT PMU device.

Signed-off-by: Yicong Yang 
Reviewed-by: Jonathan Cameron 
---
 drivers/hwtracing/ptt/hisi_ptt.c | 157 +++
 drivers/hwtracing/ptt/hisi_ptt.h |  23 +
 2 files changed, 180 insertions(+)

diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
index ef25ce98f664..c3fdb9bfb1b4 100644
--- a/drivers/hwtracing/ptt/hisi_ptt.c
+++ b/drivers/hwtracing/ptt/hisi_ptt.c
@@ -25,6 +25,161 @@
 /* Dynamic CPU hotplug state used by PTT */
 static enum cpuhp_state hisi_ptt_pmu_online;
 
+static bool hisi_ptt_wait_tuning_finish(struct hisi_ptt *hisi_ptt)
+{
+   u32 val;
+
+   return !readl_poll_timeout(hisi_ptt->iobase + HISI_PTT_TUNING_INT_STAT,
+ val, !(val & HISI_PTT_TUNING_INT_STAT_MASK),
+ HISI_PTT_WAIT_POLL_INTERVAL_US,
+ HISI_PTT_WAIT_TUNE_TIMEOUT_US);
+}
+
+static int hisi_ptt_tune_data_get(struct hisi_ptt *hisi_ptt,
+ u32 event, u16 *data)
+{
+   u32 reg;
+
+   reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+   reg &= ~(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB);
+   reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB,
+ event);
+   writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+
+   /* Write all 1s to indicate that this is a read operation */
+   writel(~0U, hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+
+   if (!hisi_ptt_wait_tuning_finish(hisi_ptt))
+   return -ETIMEDOUT;
+
+   reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+   reg &= HISI_PTT_TUNING_DATA_VAL_MASK;
+   *data = FIELD_GET(HISI_PTT_TUNING_DATA_VAL_MASK, reg);
+
+   return 0;
+}
+
+static int hisi_ptt_tune_data_set(struct hisi_ptt *hisi_ptt,
+ u32 event, u16 data)
+{
+   u32 reg;
+
+   reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+   reg &= ~(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB);
+   reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB,
+ event);
+   writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+
+   writel(FIELD_PREP(HISI_PTT_TUNING_DATA_VAL_MASK, data),
+  hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+
+   if (!hisi_ptt_wait_tuning_finish(hisi_ptt))
+   return -ETIMEDOUT;
+
+   return 0;
+}
+
+static ssize_t hisi_ptt_tune_attr_show(struct device *dev,
+  struct device_attribute *attr,
+  char *buf)
+{
+   struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev));
+   struct dev_ext_attribute *ext_attr;
+   struct hisi_ptt_tune_desc *desc;
+   int ret;
+   u16 val;
+
+   ext_attr = container_of(attr, struct dev_ext_attribute, attr);
+   desc = ext_attr->var;
+
+   mutex_lock(&hisi_ptt->tune_lock);
+   ret = hisi_ptt_tune_data_get(hisi_ptt, desc->event_code, &val);
+   mutex_unlock(&hisi_ptt->tune_lock);
+
+   if (ret)
+   return ret;
+
+   return sysfs_emit(buf, "%u\n", val);
+}
+
+static ssize_t hisi_ptt_tune_attr_store(struct device *dev,
+   struct device_attribute *attr,
+   const char *buf, size_t count)
+{
+   struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev));
+   struct dev_ext_attribute *ext_attr;
+   struct hisi_ptt_tune_desc *desc;
+   int ret;
+   u16 val;
+
+   ext_attr = container_of(attr, struct dev_ext_attribute, attr);
+   desc = ext_attr->var;
+
+   if (kstrtou16(buf, 10, &val))
+   return -EINVAL;
+
+   mutex_lock(&hisi_ptt->tune_lock);
+   ret = hisi_ptt_tune_data_set(hisi_ptt, desc->event_code, val);
+   mutex_unlock(&hisi_ptt->tune_lock);
+
+   if (ret)
+   return ret;
+
+   return count;
+}
+
+#define HISI_PTT_TUNE_ATTR(_name, _val, _show, _store) \
+   static struct hisi_ptt_tune_desc _name##_desc = {   \
+   .name = #_name, \
+   .event_code = _val, \
+   };  \
+   static struct dev_ext_attribute hisi_ptt_##_name##_attr = { \
+   .attr   = __ATTR(_name, 0600, _show, _store),   \
+   .var= &_name##_desc,\
+   }
+
+#define HISI_PTT_TUNE_ATTR_COMMON(_name, _val) \
+   HISI_PTT_TUNE_ATTR(_name, _val, \
+  hisi_ptt_tune_attr_show, \
+  hisi_ptt_tune_attr_store)
+
+/*
+ * The value of the tuning event are composed 

[PATCH v8 5/8] perf tool: Add support for HiSilicon PCIe Tune and Trace device driver

2022-05-16 Thread Yicong Yang via iommu
From: Qi Liu 

HiSilicon PCIe tune and trace device (PTT) can dynamically tune
the PCIe link's events and trace the TLP headers.

This patch adds support for the PTT device in the perf tool, so users can
use 'perf record' to get TLP headers trace data.
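
As a rough illustration of the discovery step in the diff below
(find_all_hisi_ptt_pmus()): the tool scans a directory of PMU devices
twice, first counting the entries whose name contains the PTT PMU name,
then rewinding and collecting them. The following is a standalone
userspace sketch of that two-pass pattern, not the perf code itself.
collect_matching() is a hypothetical helper, and the caller supplies the
directory and the name fragment (in the patch these come from
EVENT_SOURCE_DEVICE_PATH and HISI_PTT_PMU_NAME). Unlike the patch, the
sketch returns before closedir() when opendir() fails, so it never passes
a NULL stream to closedir().

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd array of strdup'd entry names
 * found in @path that contain @needle, and store the number of names
 * in *@count. Returns NULL (with *@count == 0) on error or no match.
 */
static char **collect_matching(const char *path, const char *needle, int *count)
{
	struct dirent *dent;
	char **names = NULL;
	int matches = 0;
	int idx = 0;
	DIR *dir;

	*count = 0;

	dir = opendir(path);
	if (!dir)
		return NULL;	/* nothing was opened, so nothing to close */

	/* Pass 1: count the matching entries. */
	while ((dent = readdir(dir)))
		if (strstr(dent->d_name, needle))
			matches++;

	if (!matches)
		goto out;

	names = calloc(matches, sizeof(*names));
	if (!names)
		goto out;

	/* Pass 2: rewind and collect, never writing past the array. */
	rewinddir(dir);
	while (idx < matches && (dent = readdir(dir))) {
		if (!strstr(dent->d_name, needle))
			continue;
		names[idx] = strdup(dent->d_name);
		if (names[idx])
			idx++;
	}
	*count = idx;

out:
	closedir(dir);
	return names;
}
```

A caller would pass the event source device directory and the PMU name
fragment, then free each returned string and the array itself.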

Signed-off-by: Qi Liu 
Signed-off-by: Yicong Yang 
---
 tools/perf/arch/arm/util/auxtrace.c   |  63 +
 tools/perf/arch/arm/util/pmu.c|   3 +
 tools/perf/arch/arm64/util/Build  |   2 +-
 tools/perf/arch/arm64/util/hisi-ptt.c | 187 ++
 tools/perf/util/auxtrace.c|   1 +
 tools/perf/util/auxtrace.h|   1 +
 tools/perf/util/hisi-ptt.h|  19 +++
 7 files changed, 275 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/arch/arm64/util/hisi-ptt.c
 create mode 100644 tools/perf/util/hisi-ptt.h

diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index 384c7cfda0fd..297fffedf45e 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -4,9 +4,11 @@
  * Author: Mathieu Poirier 
  */
 
+#include 
 #include 
 #include 
 #include 
+#include 
 
 #include "../../../util/auxtrace.h"
 #include "../../../util/debug.h"
@@ -14,6 +16,7 @@
 #include "../../../util/pmu.h"
 #include "cs-etm.h"
 #include "arm-spe.h"
+#include "hisi-ptt.h"
 
 static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
 {
@@ -50,6 +53,52 @@ static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
return arm_spe_pmus;
 }
 
+static struct perf_pmu **find_all_hisi_ptt_pmus(int *nr_ptts, int *err)
+{
+   const char *sysfs = sysfs__mountpoint();
+   struct perf_pmu **hisi_ptt_pmus = NULL;
+   struct dirent *dent;
+   char path[PATH_MAX];
+   DIR *dir = NULL;
+   int idx = 0;
+
+   snprintf(path, PATH_MAX, "%s" EVENT_SOURCE_DEVICE_PATH, sysfs);
+   dir = opendir(path);
+   if (!dir) {
+   pr_err("can't read directory '%s'\n", EVENT_SOURCE_DEVICE_PATH);
+   *err = -EINVAL;
+   goto out;
+   }
+
+   while ((dent = readdir(dir))) {
+   if (strstr(dent->d_name, HISI_PTT_PMU_NAME))
+   (*nr_ptts)++;
+   }
+
+   if (!(*nr_ptts))
+   goto out;
+
+   hisi_ptt_pmus = zalloc(sizeof(struct perf_pmu *) * (*nr_ptts));
+   if (!hisi_ptt_pmus) {
+   pr_err("hisi_ptt alloc failed\n");
+   *err = -ENOMEM;
+   goto out;
+   }
+
+   rewinddir(dir);
+   while ((dent = readdir(dir))) {
+   if (strstr(dent->d_name, HISI_PTT_PMU_NAME) && idx < (*nr_ptts)) {
+   hisi_ptt_pmus[idx] = perf_pmu__find(dent->d_name);
+   if (hisi_ptt_pmus[idx])
+   idx++;
+   }
+   }
+
+out:
+   closedir(dir);
+   return hisi_ptt_pmus;
+}
+
 static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
   int pmu_nr, struct evsel *evsel)
 {
@@ -71,17 +120,21 @@ struct auxtrace_record
 {
struct perf_pmu *cs_etm_pmu = NULL;
struct perf_pmu **arm_spe_pmus = NULL;
+   struct perf_pmu **hisi_ptt_pmus = NULL;
struct evsel *evsel;
struct perf_pmu *found_etm = NULL;
struct perf_pmu *found_spe = NULL;
+   struct perf_pmu *found_ptt = NULL;
int auxtrace_event_cnt = 0;
int nr_spes = 0;
+   int nr_ptts = 0;
 
if (!evlist)
return NULL;
 
cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
+   hisi_ptt_pmus = find_all_hisi_ptt_pmus(&nr_ptts, err);
 
evlist__for_each_entry(evlist, evsel) {
if (cs_etm_pmu && !found_etm)
@@ -89,9 +142,13 @@ struct auxtrace_record
 
if (arm_spe_pmus && !found_spe)
found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel);
+
+   if (hisi_ptt_pmus && !found_ptt)
+   found_ptt = find_pmu_for_event(hisi_ptt_pmus, nr_ptts, evsel);
}
 
free(arm_spe_pmus);
+   free(hisi_ptt_pmus);
 
if (found_etm)
auxtrace_event_cnt++;
@@ -99,6 +156,9 @@ struct auxtrace_record
if (found_spe)
auxtrace_event_cnt++;
 
+   if (found_ptt)
+   auxtrace_event_cnt++;
+
if (auxtrace_event_cnt > 1) {
pr_err("Concurrent AUX trace operation not currently supported\n");
*err = -EOPNOTSUPP;
@@ -111,6 +171,9 @@ struct auxtrace_record
 #if defined(__aarch64__)
if (found_spe)
return arm_spe_recording_init(err, found_spe);
+
+   if (found_ptt)
+   return hisi_ptt_recording_init(err, found_ptt);
 #endif
 
/*
diff --git a/tools/perf/arch/arm/util/pmu.c b/tools/perf/arch/arm/util/pmu.c
index b8b23b9dc598..887c8addc491 100644
--- 

[PATCH v8 8/8] MAINTAINERS: Add maintainer for HiSilicon PTT driver

2022-05-16 Thread Yicong Yang via iommu
Add maintainer for driver and documentation of HiSilicon PTT device.

Signed-off-by: Yicong Yang 
Reviewed-by: Jonathan Cameron 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fd768d43e048..d30a1698251c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8858,6 +8858,13 @@ F:   Documentation/admin-guide/perf/hisi-pcie-pmu.rst
 F: Documentation/admin-guide/perf/hisi-pmu.rst
 F: drivers/perf/hisilicon
 
+HISILICON PTT DRIVER
+M: Yicong Yang 
+L: linux-ker...@vger.kernel.org
+S: Maintained
+F: Documentation/trace/hisi-ptt.rst
+F: drivers/hwtracing/ptt/
+
 HISILICON QM AND ZIP Controller DRIVER
 M: Zhou Wang 
 L: linux-cry...@vger.kernel.org
-- 
2.24.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v8 0/8] Add support for HiSilicon PCIe Tune and Trace device

2022-05-16 Thread Yicong Yang via iommu
HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
integrated Endpoint (RCiEP) device, providing the capability
to dynamically monitor and tune the PCIe traffic (tune),
and trace the TLP headers (trace).

PTT tune is designed for monitoring and adjusting PCIe link parameters.
The device exposes several parameters of the PCIe link. Through the driver,
users can adjust the value of a given parameter to affect the PCIe link,
with the aim of improving performance in certain situations.

PTT trace is designed for dumping the TLP headers to memory, which
can be used to analyze the transactions and usage of the PCIe
link. Users can filter the traced headers either by requester
ID or by a set of Root Ports on the same core as the
PTT device. Tracing can also be limited to headers of a certain type and
direction.

The driver registers a PMU device for each PTT device. The trace can
be used through `perf record` and the traced headers can be decoded
by `perf report`. Perf tool support for the device is also
added in this patchset. The tune function is available through the sysfs
attributes of the related PMU device. See the documentation for
detailed usage.

Change since v7:
- Configure the DMA in probe rather than in runtime. Also use devres to manage
  PMU device as we have no order problem now
- Refactor the config validation function per John and Leo
- Use a spinlock hisi_ptt::pmu_lock instead of a mutex to serialize the perf process
  in pmu::start as it's in atomic context
- Only commit the traced data when stop, per Leo and James
- Drop the patch that dynamically updates the filters from this series to simplify
  the review of the driver. That patch will be sent separately.
- add a cpumask sysfs attribute and handle the cpu hotplug events, follow the
  uncore PMU convention
- Other cleanups and fixes, both in driver and perf tool
Link: https://lore.kernel.org/lkml/20220407125841.3678-1-yangyic...@hisilicon.com/

Change since v6:
- Fix W=1 errors reported by lkp test, thanks

Change since v5:
- Squash the PMU patch into PATCH 2 suggested by John
- refine the commit message of PATCH 1 and some comments
Link: https://lore.kernel.org/lkml/20220308084930.5142-1-yangyic...@hisilicon.com/

Change since v4:
Address the comments from Jonathan, John and Ma Ca, thanks.
- Use devm* also for allocating the DMA buffers
- Remove the IRQ handler stub in Patch 2
- Make functions waiting for hardware state return boolean
- Manually remove the PMU device as it should be removed first
- Modify the order in probe and removal so that they match
- Make available {directions,type,format} array const and non-global
- Using the right filter list in filters show and well protect the
  list with mutex
- Record the trace status with a boolean @started rather than enum
- Optimize the process of finding the PTT devices of the perf-tool
Link: https://lore.kernel.org/linux-pci/20220221084307.33712-1-yangyic...@hisilicon.com/

Change since v3:
Address the comments from Jonathan and John, thanks.
- drop members in the common struct which can be obtained on the fly
- reduce buffer struct and organize the buffers with array instead of list
- reduce the DMA reset wait time to avoid long time busy loop
- split the available_filters sysfs attribute into two files, for root port
  and requester respectively. Update the documentation accordingly
- make IOMMU mapping check earlier in probe to avoid race condition. Also
  make IOMMU quirk patch prior to driver in the series
- Cleanups and typos fixes from John and Jonathan
Link: https://lore.kernel.org/linux-pci/20220124131118.17887-1-yangyic...@hisilicon.com/

Change since v2:
- address the comments from Mathieu, thanks.
  - rename the directory to ptt to match the function of the device
  - spinoff the declarations to a separate header
  - split the trace function to several patches
  - some other comments.
- make default smmu domain type of PTT device to identity
  Drop the RMR as it's not recommended and use an iommu_def_domain_type
  quirk to passthrough the device DMA as suggested by Robin. 
Link: https://lore.kernel.org/linux-pci/2026090625.53702-1-yangyic...@hisilicon.com/

Change since v1:
- switch the user interface of trace to perf from debugfs
- switch the user interface of tune to sysfs from debugfs
- add perf tool support to start trace and decode the trace data
- address the comments of documentation from Bjorn
- add RMR[1] support of the device as trace works in RMR mode or
  direct DMA mode. RMR support is achieved by common APIs rather
  than the APIs implemented in [1].
Link: https://lore.kernel.org/lkml/1618654631-42454-1-git-send-email-yangyic...@hisilicon.com/
[1] https://lore.kernel.org/linux-acpi/20210805080724.480-1-shameerali.kolothum.th...@huawei.com/

Qi Liu (3):
  perf arm: Refactor event list iteration in auxtrace_record__init()
  perf tool: Add support for HiSilicon PCIe Tune and Trace device driver
  perf 

[PATCH v8 1/8] iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity

2022-05-16 Thread Yicong Yang via iommu
The DMA operations of the HiSilicon PTT device only work properly with
identity mappings. So add a quirk for the device to force the domain
to passthrough.

Acked-by: Will Deacon 
Signed-off-by: Yicong Yang 
Reviewed-by: John Garry 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 627a3ed5ee8f..7f51823ab63b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2839,6 +2839,26 @@ static int arm_smmu_dev_disable_feature(struct device *dev,
}
 }
 
+/*
+ * HiSilicon PCIe tune and trace device can be used to trace TLP headers on the
+ * PCIe link and save the data to memory by DMA. The hardware is restricted to
+ * use identity mapping only.
+ */
+#define IS_HISI_PTT_DEVICE(pdev)   ((pdev)->vendor == PCI_VENDOR_ID_HUAWEI && \
+(pdev)->device == 0xa12e)
+
+static int arm_smmu_def_domain_type(struct device *dev)
+{
+   if (dev_is_pci(dev)) {
+   struct pci_dev *pdev = to_pci_dev(dev);
+
+   if (IS_HISI_PTT_DEVICE(pdev))
+   return IOMMU_DOMAIN_IDENTITY;
+   }
+
+   return 0;
+}
+
 static struct iommu_ops arm_smmu_ops = {
.capable= arm_smmu_capable,
.domain_alloc   = arm_smmu_domain_alloc,
@@ -2856,6 +2876,7 @@ static struct iommu_ops arm_smmu_ops = {
.sva_unbind = arm_smmu_sva_unbind,
.sva_get_pasid  = arm_smmu_sva_get_pasid,
.page_response  = arm_smmu_page_response,
+   .def_domain_type= arm_smmu_def_domain_type,
.pgsize_bitmap  = -1UL, /* Restricted during device attach */
.owner  = THIS_MODULE,
.default_domain_ops = &(const struct iommu_domain_ops) {
-- 
2.24.0



Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used

2022-05-16 Thread Suravee Suthikulpanit via iommu

Joerg,

On 5/13/22 8:07 PM, Joerg Roedel wrote:

On Mon, May 09, 2022 at 02:48:15AM -0500, Suravee Suthikulpanit wrote:

On AMD systems with SNP enabled, the IOMMU hardware checks the host translation
valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use.
This results in an ILLEGAL_DEV_TABLE_ENTRY event for devices that
do not have the host page table root pointer set up.


Hmm, this sounds weird. In the early AMD IOMMUs it was recommended to set
TV=1 and V=1 and the rest to 0 to block all DMA from a device.

I wonder how this triggers ILLEGAL_DEV_TABLE_ENTRY errors now. It is
(was?) legal to set V=1 TV=1, mode=0 and leave the page-table empty.


Due to the new restriction (please see the IOMMU spec Rev 3.06-PUB - Apr 2021
https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf) where the use of
DTE[Mode]=0 is not supported on systems that are SNP-enabled (i.e. EFR[SNPSup]=1),
the IOMMU HW looks at the DTE[TV] bit to determine if it needs to handle the v1 page table.
When the HW encounters a DTE with TV=1, V=1, Mode=0, it generates an
ILLEGAL_DEV_TABLE_ENTRY event.
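
To make the condition concrete, here is a small standalone sketch (not
driver code) of the check described above. The bit positions are an
assumption based on the field layout the Linux AMD IOMMU driver uses for
the first DTE word (V in bit 0, TV in bit 1, Mode in bits 11:9), and
dte_rejected_with_snp() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed bit positions in the first 64-bit word of a DTE, following
 * the layout used by the Linux AMD IOMMU driver.
 */
#define DTE_V		(1ULL << 0)	/* entry valid */
#define DTE_TV		(1ULL << 1)	/* translation information valid */
#define DTE_MODE_SHIFT	9		/* paging mode, 3 bits wide */
#define DTE_MODE_MASK	0x7ULL

static unsigned int dte_mode(uint64_t dte0)
{
	return (unsigned int)((dte0 >> DTE_MODE_SHIFT) & DTE_MODE_MASK);
}

/*
 * Hypothetical predicate: would this DTE word raise
 * ILLEGAL_DEV_TABLE_ENTRY on a system reporting EFR[SNPSup]=1?
 */
static bool dte_rejected_with_snp(uint64_t dte0, bool snp_sup)
{
	if (!snp_sup)
		return false;	/* the restriction only applies with SNP */

	return (dte0 & DTE_V) && (dte0 & DTE_TV) && dte_mode(dte0) == 0;
}
```

With these assumptions, a word of 0x3 (V=1, TV=1, Mode=0) is rejected
only when SNP is supported, while clearing TV or using a non-zero Mode
avoids the event.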

Note: I am following up with HW folks for the updated document for this
specific detail.

Therefore, we need to modify the IOMMU driver as follows:

- For non-DMA devices (e.g. the IOAPIC devices), we need to
modify the IOMMU driver to default to DTE[TV]=0. For Linux, this is equivalent
to a DTE with domain ID 0.

- I am still trying to see what is the best way to force Linux to not allow
Mode=0 (i.e. iommu=pt mode). Any thoughts?

- Also, it seems that the current iommu v2 page table use case, where GVA->GPA=SPA,
will no longer be supported on systems w/ SNPSup=1. Any thoughts?


When IW=0 and IR=0, DMA is blocked. From what I remember this is a
valid setting in a DTE.


Correct.


Do you have an example DTE which triggers this error message?


This is specifically from the device representing an IOAPIC.

[  +0.000108] iommu ivhd0: AMD-Vi: Event logged [ILLEGAL_DEV_TABLE_ENTRY device=c0:00.1 pasid=0x0 address=0xfffdf814 flags=0x0008]

[  +0.11] AMD-Vi: DTE[0]: 0003
[  +0.03] AMD-Vi: DTE[1]: 
[  +0.02] AMD-Vi: DTE[2]: 2008000100258013
[  +0.01] AMD-Vi: DTE[3]: 

Best Regards,
Suravee


[PATCH v4 2/2] iomm/mediatek: Allow page table PA up to 35bit

2022-05-16 Thread yf.wang--- via iommu
From: Yunfei Wang 

Add support for the quirk IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT, which allows
the page table PA to be up to 35 bits wide rather than only in ZONE_DMA32.

Signed-off-by: Ning Li 
Signed-off-by: Yunfei Wang 
---
 drivers/iommu/mtk_iommu.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 6fd75a60abd6..1b9a876ef271 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -33,6 +33,7 @@
 
 #define REG_MMU_PT_BASE_ADDR   0x000
 #define MMU_PT_ADDR_MASK   GENMASK(31, 7)
+#define MMU_PT_ADDR_2_0_MASK   GENMASK(2, 0)
 
 #define REG_MMU_INVALIDATE 0x020
 #define F_ALL_INVLD 0x2
@@ -118,6 +119,7 @@
 #define WR_THROT_EN BIT(6)
 #define HAS_LEGACY_IVRP_PADDR  BIT(7)
 #define IOVA_34_EN BIT(8)
+#define PGTABLE_PA_35_EN   BIT(9)
 
 #define MTK_IOMMU_HAS_FLAG(pdata, _x) \
((((pdata)->flags) & (_x)) == (_x))
@@ -401,6 +403,9 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom,
.iommu_dev = data->dev,
};
 
+   if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN))
+   dom->cfg.quirks |= IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT;
+
if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE))
dom->cfg.oas = data->enable_4GB ? 33 : 32;
else
@@ -450,6 +455,7 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
struct device *m4udev = data->dev;
int ret, domid;
+   u32 regval;
 
domid = mtk_iommu_get_domain_id(dev, data->plat_data);
if (domid < 0)
@@ -472,8 +478,14 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
return ret;
}
data->m4u_dom = dom;
-   writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
-  data->base + REG_MMU_PT_BASE_ADDR);
+
+   /* Bits[6:3] are invalid for mediatek platform */
+   if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN))
+   regval = (dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK) |
+(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_2_0_MASK);
+   else
+   regval = dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK;
+   writel(regval, data->base + REG_MMU_PT_BASE_ADDR);
 
pm_runtime_put(m4udev);
}
@@ -987,6 +999,7 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
struct mtk_iommu_suspend_reg *reg = &data->reg;
struct mtk_iommu_domain *m4u_dom = data->m4u_dom;
void __iomem *base = data->base;
+   u32 regval;
int ret;
 
ret = clk_prepare_enable(data->bclk);
@@ -1010,7 +1023,14 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL);
writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG);
-   writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK, base + REG_MMU_PT_BASE_ADDR);
+
+   /* Bits[6:3] are invalid for mediatek platform */
+   if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN))
+   regval = (m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK) |
+(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_2_0_MASK);
+   else
+   regval = m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK;
+   writel(regval, base + REG_MMU_PT_BASE_ADDR);
 
/*
 * Users may allocate dma buffer before they call pm_runtime_get,
@@ -1038,7 +1058,8 @@ static const struct mtk_iommu_plat_data mt2712_data = {
 
 static const struct mtk_iommu_plat_data mt6779_data = {
.m4u_plat  = M4U_MT6779,
-   .flags = HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN,
+   .flags = HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN |
+PGTABLE_PA_35_EN,
.inv_sel_reg   = REG_MMU_INV_SEL_GEN2,
.iova_region   = single_domain,
.iova_region_nr = ARRAY_SIZE(single_domain),
-- 
2.18.0



[PATCH v4 1/2] iommu/io-pgtable-arm-v7s: Add a quirk to allow pgtable PA up to 35bit

2022-05-16 Thread yf.wang--- via iommu
From: Yunfei Wang 

The call to kmem_cache_alloc for level 2 pgtable allocation may run
in atomic context, and it sometimes fails when the DMA32 zone runs out of
memory.

Since the Mediatek IOMMU hardware supports at most a 35-bit PA in the pgtable,
add a quirk to allow pgtable PAs of up to 35 bits.
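
The encoding behind the quirk can be sketched in standalone C as below.
This is an illustration, not the kernel code: ttbr_35bit_pa() mirrors the
ARM_V7S_TTBR_35BIT_PA() macro added in the diff, pt_base_regval() mirrors
how the companion mtk_iommu patch in this series composes the register
value, and regval_to_pa() is a hypothetical inverse added here only to
check the round trip. The literal masks stand in for GENMASK(31, 7),
GENMASK(2, 0) and GENMASK_ULL(34, 32).

```c
#include <assert.h>
#include <stdint.h>

#define PA_BITS_34_32	(0x7ULL << 32)	/* GENMASK_ULL(34, 32) */
#define PT_ADDR_MASK	0xffffff80u	/* GENMASK(31, 7) */
#define PT_ADDR_2_0	0x00000007u	/* GENMASK(2, 0) */

/* Put PA bits [34:32] into TTBR bits [2:0], as the new macro does. */
static uint32_t ttbr_35bit_pa(uint32_t ttbr, uint64_t pa)
{
	return (ttbr & ~PT_ADDR_2_0) |
	       (uint32_t)((pa & PA_BITS_34_32) >> 32);
}

/* Keep only bits [31:7] and [2:0] for the page table base register. */
static uint32_t pt_base_regval(uint32_t ttbr)
{
	return (ttbr & PT_ADDR_MASK) | (ttbr & PT_ADDR_2_0);
}

/* Hypothetical inverse: rebuild the 35-bit table address. */
static uint64_t regval_to_pa(uint32_t regval)
{
	return ((uint64_t)(regval & PT_ADDR_2_0) << 32) |
	       (regval & PT_ADDR_MASK);
}
```

For example, a table at PA 0x512345000 with TTBR attribute bits 0x4b
yields a TTBR of 0x1234504d and a register value of 0x12345005, from
which the full address can be recovered.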

Signed-off-by: Ning Li 
Signed-off-by: Yunfei Wang 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 56 ++
 include/linux/io-pgtable.h | 15 +---
 2 files changed, 52 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index be066c1503d3..57455ae052ac 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -149,6 +149,10 @@
 #define ARM_V7S_TTBR_IRGN_ATTR(attr)   \
((((attr) & 0x1) << 6) | (((attr) & 0x2) >> 1))
 
+/* Mediatek extend ttbr bits[2:0] for PA bits[34:32] */
+#define ARM_V7S_TTBR_35BIT_PA(ttbr, pa) \
+   ((ttbr & ((u32)(~0U << 3))) | ((pa & GENMASK_ULL(34, 32)) >> 32))
+
 #ifdef CONFIG_ZONE_DMA32
 #define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
 #define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA32
@@ -182,14 +186,8 @@ static bool arm_v7s_is_mtk_enabled(struct io_pgtable_cfg *cfg)
(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT);
 }
 
-static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
-   struct io_pgtable_cfg *cfg)
+static arm_v7s_iopte to_iopte_mtk(phys_addr_t paddr, arm_v7s_iopte pte)
 {
-   arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
-
-   if (!arm_v7s_is_mtk_enabled(cfg))
-   return pte;
-
if (paddr & BIT_ULL(32))
pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
if (paddr & BIT_ULL(33))
@@ -199,6 +197,17 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
return pte;
 }
 
+static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
+   struct io_pgtable_cfg *cfg)
+{
+   arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
+
+   if (!arm_v7s_is_mtk_enabled(cfg))
+   return pte;
+
+   return to_iopte_mtk(paddr, pte);
+}
+
 static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
  struct io_pgtable_cfg *cfg)
 {
@@ -234,6 +243,7 @@ static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
 static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
   struct arm_v7s_io_pgtable *data)
 {
+   gfp_t gfp_l1 = __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA;
struct io_pgtable_cfg *cfg = &data->iop.cfg;
struct device *dev = cfg->iommu_dev;
phys_addr_t phys;
@@ -241,9 +251,11 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
size_t size = ARM_V7S_TABLE_SIZE(lvl, cfg);
void *table = NULL;
 
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)
+   gfp_l1 = __GFP_ZERO;
+
if (lvl == 1)
-   table = (void *)__get_free_pages(
-   __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
+   table = (void *)__get_free_pages(gfp_l1, get_order(size));
else if (lvl == 2)
table = kmem_cache_zalloc(data->l2_tables, gfp);
 
@@ -251,7 +263,8 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
return NULL;
 
phys = virt_to_phys(table);
-   if (phys != (arm_v7s_iopte)phys) {
+   if (phys != (arm_v7s_iopte)phys &&
+   !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)) {
/* Doesn't fit in PTE */
dev_err(dev, "Page table does not fit in PTE: %pa", &phys);
goto out_free;
@@ -457,9 +470,14 @@ static arm_v7s_iopte arm_v7s_install_table(arm_v7s_iopte *table,
   arm_v7s_iopte curr,
   struct io_pgtable_cfg *cfg)
 {
+   phys_addr_t phys = virt_to_phys(table);
arm_v7s_iopte old, new;
 
-   new = virt_to_phys(table) | ARM_V7S_PTE_TYPE_TABLE;
+   new = phys | ARM_V7S_PTE_TYPE_TABLE;
+
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)
+   new = to_iopte_mtk(phys, new);
+
if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
new |= ARM_V7S_ATTR_NS_TABLE;
 
@@ -778,7 +796,9 @@ static phys_addr_t arm_v7s_iova_to_phys(struct io_pgtable_ops *ops,
 static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
void *cookie)
 {
+   slab_flags_t slab_flag = ARM_V7S_TABLE_SLAB_FLAGS;
struct arm_v7s_io_pgtable *data;
+   phys_addr_t paddr;
 
if (cfg->ias > (arm_v7s_is_mtk_enabled(cfg) ? 34 : ARM_V7S_ADDR_BITS))
return NULL;
@@ -788,7 +808,8 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
 
if (cfg->quirks 

Re: [PATCH 4/7] dt-bindings: renesas,rcar-dmac: R-Car V3U is R-Car Gen4

2022-05-16 Thread Vinod Koul
On 02-05-22, 15:34, Geert Uytterhoeven wrote:
> Despite the name, R-Car V3U is the first member of the R-Car Gen4
> family.  Hence move its compatible value to the R-Car Gen4 section.

Applied, thanks

-- 
~Vinod


Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops

2022-05-16 Thread Robin Murphy

On 2022-05-16 02:57, Lu Baolu wrote:

Each IOMMU driver must provide blocking domain ops. If the hardware
supports detaching a domain from a device, setting the blocking domain equals
detaching the existing domain from the device. Otherwise, an UNMANAGED
domain without any mapping will be used instead.


Unfortunately that's backwards - most of the implementations of 
.detach_dev are disabling translation entirely, meaning the device ends 
up effectively in passthrough rather than blocked. Conversely, at least 
arm-smmu and arm-smmu-v3 could implement IOMMU_DOMAIN_BLOCKED properly 
with fault-type S2CRs and STEs respectively, it just needs a bit of 
wiring up.
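
To make the distinction concrete, here is a toy model in standalone C,
not driver code. The enum and dma_reaches_memory() are hypothetical names
invented for this sketch; they only model the three outcomes discussed
above (translate through a page table, bypass translation, or fault every
access).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of what the IOMMU does with a device's DMA. */
enum stream_mode {
	STREAM_TRANSLATE,	/* look the address up in a page table */
	STREAM_BYPASS,		/* translation disabled, DMA passes through */
	STREAM_FAULT,		/* every access is rejected */
};

/* Would a DMA access from the device reach memory in this mode? */
static bool dma_reaches_memory(enum stream_mode mode, bool has_mapping)
{
	switch (mode) {
	case STREAM_TRANSLATE:
		return has_mapping;	/* an empty domain blocks everything */
	case STREAM_BYPASS:
		return true;		/* effectively passthrough */
	case STREAM_FAULT:
		return false;		/* properly blocked */
	}
	return false;
}
```

In this model a detach that merely disables translation corresponds to
STREAM_BYPASS, which is the opposite of blocking, while both an empty
translating domain and a fault-type configuration stop the DMA.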


Thanks,
Robin.


Signed-off-by: Lu Baolu 
---
  include/linux/iommu.h   |  7 +++
  drivers/iommu/amd/iommu.c   | 12 
  drivers/iommu/apple-dart.c  | 12 
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  3 +++
  drivers/iommu/arm/arm-smmu/arm-smmu.c   |  3 +++
  drivers/iommu/arm/arm-smmu/qcom_iommu.c | 12 
  drivers/iommu/exynos-iommu.c| 12 
  drivers/iommu/fsl_pamu_domain.c | 12 
  drivers/iommu/intel/iommu.c | 12 
  drivers/iommu/ipmmu-vmsa.c  | 12 
  drivers/iommu/msm_iommu.c   | 12 
  drivers/iommu/mtk_iommu.c   | 12 
  drivers/iommu/mtk_iommu_v1.c| 12 
  drivers/iommu/omap-iommu.c  | 12 
  drivers/iommu/rockchip-iommu.c  | 12 
  drivers/iommu/s390-iommu.c  | 12 
  drivers/iommu/sprd-iommu.c  | 11 +++
  drivers/iommu/sun50i-iommu.c| 12 
  drivers/iommu/tegra-gart.c  | 12 
  drivers/iommu/tegra-smmu.c  | 12 
  drivers/iommu/virtio-iommu.c|  3 +++
  21 files changed, 219 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 572399ac1d83..5e228aad0ef6 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -216,6 +216,7 @@ struct iommu_iotlb_gather {
   *- IOMMU_DOMAIN_DMA: must use a dma domain
   *- 0: use the default setting
   * @default_domain_ops: the default ops for domains
+ * @blocking_domain_ops: the blocking ops for domains
   * @pgsize_bitmap: bitmap of all possible supported page sizes
   * @owner: Driver module providing these ops
   */
@@ -255,6 +256,7 @@ struct iommu_ops {
int (*def_domain_type)(struct device *dev);
  
  	const struct iommu_domain_ops *default_domain_ops;

+   const struct iommu_domain_ops *blocking_domain_ops;
unsigned long pgsize_bitmap;
struct module *owner;
  };
@@ -279,6 +281,9 @@ struct iommu_ops {
   * @enable_nesting: Enable nesting
   * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
   * @free: Release the domain after use.
+ * @blocking_domain_detach: iommu hardware support detaching a domain from
+ * a device, hence setting blocking domain to a device equals to
+ * detach the existing domain from it.
   */
  struct iommu_domain_ops {
int (*set_dev)(struct iommu_domain *domain, struct device *dev);
@@ -310,6 +315,8 @@ struct iommu_domain_ops {
  unsigned long quirks);
  
  	void (*free)(struct iommu_domain *domain);

+
+   unsigned int blocking_domain_detach:1;
  };
  
  /**

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 01b8668ef46d..c66713439824 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2272,6 +2272,14 @@ static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
return true;
  }
  
+static int amd_blocking_domain_set_dev(struct iommu_domain *domain,

+  struct device *dev)
+{
+   amd_iommu_detach_device(domain, dev);
+
+   return 0;
+}
+
  const struct iommu_ops amd_iommu_ops = {
.capable = amd_iommu_capable,
.domain_alloc = amd_iommu_domain_alloc,
@@ -2295,6 +2303,10 @@ const struct iommu_ops amd_iommu_ops = {
.iotlb_sync = amd_iommu_iotlb_sync,
.free   = amd_iommu_domain_free,
.enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
+   },
+   .blocking_domain_ops = &(const struct iommu_domain_ops) {
+   .set_dev= amd_blocking_domain_set_dev,
+   .blocking_domain_detach = true,
}
  };
  
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c

index a0b7281f1989..3c37762e01ec 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -763,6 +763,14 @@ static void apple_dart_get_resv_regions(struct device *dev,
iommu_dma_get_resv_regions(dev, head);
  }
  

Re: [PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus

2022-05-16 Thread Mikko Perttunen

On 5/16/22 13:44, Robin Murphy wrote:

On 2022-05-16 11:13, Mikko Perttunen wrote:

On 5/16/22 13:07, Will Deacon wrote:

On Mon, May 16, 2022 at 11:52:54AM +0300, cyn...@kapsi.fi wrote:

From: Mikko Perttunen 

Set itself as the IOMMU for the host1x context device bus, containing
"dummy" devices used for Host1x context isolation.

Signed-off-by: Mikko Perttunen 
---
  drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +
  1 file changed, 13 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c

index 568cce590ccc..9ff54eaecf81 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -39,6 +39,7 @@
  #include 
  #include 
+#include 
  #include "arm-smmu.h"
@@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct iommu_ops *ops)

  goto err_reset_pci_ops;
  }
  #endif
+#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS
+    if (!iommu_present(&host1x_context_device_bus_type)) {
+    err = bus_set_iommu(&host1x_context_device_bus_type, ops);
+    if (err)
+    goto err_reset_fsl_mc_ops;
+    }
+#endif
+
  return 0;
+err_reset_fsl_mc_ops: __maybe_unused;
+#ifdef CONFIG_FSL_MC_BUS
+    bus_set_iommu(&fsl_mc_bus_type, NULL);
+#endif


bus_set_iommu() is going away:

https://lore.kernel.org/r/cover.1650890638.git.robin.mur...@arm.com

Will


Thanks for the heads-up. Robin had pointed out that this work was 
ongoing but I hadn't seen the patches yet. I'll look into it.


Although that *is* currently blocked on the mystery intel-iommu problem 
that I can't reproduce... If this series is ready to land right now for 
5.19 then in principle that might be the easiest option overall. 
Hopefully at least patch #2 could sneak in so that the compile-time 
dependencies are ready for me to roll up host1x into the next rebase of 
"iommu: Always register bus notifiers".


Cheers,
Robin.


My guess is that the series as a whole is not ready to land in the 5.19 
timeframe, but #2 could be possible.


Thierry, any opinion?

Thanks,
Mikko

Re: [PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus

2022-05-16 Thread Robin Murphy

On 2022-05-16 11:13, Mikko Perttunen wrote:

On 5/16/22 13:07, Will Deacon wrote:

On Mon, May 16, 2022 at 11:52:54AM +0300, cyn...@kapsi.fi wrote:

From: Mikko Perttunen 

Set itself as the IOMMU for the host1x context device bus, containing
"dummy" devices used for Host1x context isolation.

Signed-off-by: Mikko Perttunen 
---
  drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +
  1 file changed, 13 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c

index 568cce590ccc..9ff54eaecf81 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -39,6 +39,7 @@
  #include 
  #include 
+#include 
  #include "arm-smmu.h"
@@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct iommu_ops *ops)

  goto err_reset_pci_ops;
  }
  #endif
+#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS
+    if (!iommu_present(&host1x_context_device_bus_type)) {
+    err = bus_set_iommu(&host1x_context_device_bus_type, ops);
+    if (err)
+    goto err_reset_fsl_mc_ops;
+    }
+#endif
+
  return 0;
+err_reset_fsl_mc_ops: __maybe_unused;
+#ifdef CONFIG_FSL_MC_BUS
+    bus_set_iommu(&fsl_mc_bus_type, NULL);
+#endif


bus_set_iommu() is going away:

https://lore.kernel.org/r/cover.1650890638.git.robin.mur...@arm.com

Will


Thanks for the heads-up. Robin had pointed out that this work was 
ongoing but I hadn't seen the patches yet. I'll look into it.


Although that *is* currently blocked on the mystery intel-iommu problem 
that I can't reproduce... If this series is ready to land right now for 
5.19 then in principle that might be the easiest option overall. 
Hopefully at least patch #2 could sneak in so that the compile-time 
dependencies are ready for me to roll up host1x into the next rebase of 
"iommu: Always register bus notifiers".


Cheers,
Robin.

Re: [PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus

2022-05-16 Thread Mikko Perttunen

On 5/16/22 13:07, Will Deacon wrote:

On Mon, May 16, 2022 at 11:52:54AM +0300, cyn...@kapsi.fi wrote:

From: Mikko Perttunen 

Set itself as the IOMMU for the host1x context device bus, containing
"dummy" devices used for Host1x context isolation.

Signed-off-by: Mikko Perttunen 
---
  drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +
  1 file changed, 13 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 568cce590ccc..9ff54eaecf81 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -39,6 +39,7 @@
  
  #include 

  #include 
+#include 
  
  #include "arm-smmu.h"
  
@@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct iommu_ops *ops)

goto err_reset_pci_ops;
}
  #endif
+#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS
+   if (!iommu_present(_context_device_bus_type)) {
+   err = bus_set_iommu(_context_device_bus_type, ops);
+   if (err)
+   goto err_reset_fsl_mc_ops;
+   }
+#endif
+
return 0;
  
+err_reset_fsl_mc_ops: __maybe_unused;

+#ifdef CONFIG_FSL_MC_BUS
+   bus_set_iommu(_mc_bus_type, NULL);
+#endif


bus_set_iommu() is going away:

https://lore.kernel.org/r/cover.1650890638.git.robin.mur...@arm.com

Will


Thanks for the heads-up. Robin had pointed out that this work was 
ongoing but I hadn't seen the patches yet. I'll look into it.


Mikko







Re: [PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus

2022-05-16 Thread Will Deacon
On Mon, May 16, 2022 at 11:52:54AM +0300, cyn...@kapsi.fi wrote:
> From: Mikko Perttunen 
> 
> Set itself as the IOMMU for the host1x context device bus, containing
> "dummy" devices used for Host1x context isolation.
> 
> Signed-off-by: Mikko Perttunen 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 568cce590ccc..9ff54eaecf81 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -39,6 +39,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #include "arm-smmu.h"
>  
> @@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct iommu_ops *ops)
>   goto err_reset_pci_ops;
>   }
>  #endif
> +#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS
> + if (!iommu_present(_context_device_bus_type)) {
> + err = bus_set_iommu(_context_device_bus_type, ops);
> + if (err)
> + goto err_reset_fsl_mc_ops;
> + }
> +#endif
> +
>   return 0;
>  
> +err_reset_fsl_mc_ops: __maybe_unused;
> +#ifdef CONFIG_FSL_MC_BUS
> + bus_set_iommu(_mc_bus_type, NULL);
> +#endif

bus_set_iommu() is going away:

https://lore.kernel.org/r/cover.1650890638.git.robin.mur...@arm.com

Will
> 


[PATCH v5 3/9] gpu: host1x: Add context device management code

2022-05-16 Thread cyndis
From: Mikko Perttunen 

Add code to register context devices from device tree, allocate them
out and manage their refcounts.

Signed-off-by: Mikko Perttunen 
---
v2:
* Directly set DMA mask instead of inheriting from Host1x.
* Use iommu-map instead of custom DT property.
v4:
* Use u64 instead of dma_addr_t for DMA mask
* Use unsigned ints for indexes and adjust error handling flow
* Parse iommu-map property at top level host1x DT node
* Use separate DMA mask per device
* Export symbols as GPL
v5:
* Rename host1x_context to host1x_memory_context
---
 drivers/gpu/host1x/Makefile  |   1 +
 drivers/gpu/host1x/context.c | 160 +++
 drivers/gpu/host1x/context.h |  27 ++
 drivers/gpu/host1x/dev.c |  12 ++-
 drivers/gpu/host1x/dev.h |   2 +
 include/linux/host1x.h   |  18 
 6 files changed, 219 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/host1x/context.c
 create mode 100644 drivers/gpu/host1x/context.h

diff --git a/drivers/gpu/host1x/Makefile b/drivers/gpu/host1x/Makefile
index c891a3e33844..8a65e13d113a 100644
--- a/drivers/gpu/host1x/Makefile
+++ b/drivers/gpu/host1x/Makefile
@@ -10,6 +10,7 @@ host1x-y = \
debug.o \
mipi.o \
fence.o \
+   context.o \
hw/host1x01.o \
hw/host1x02.o \
hw/host1x04.o \
diff --git a/drivers/gpu/host1x/context.c b/drivers/gpu/host1x/context.c
new file mode 100644
index ..d7d95b69a72a
--- /dev/null
+++ b/drivers/gpu/host1x/context.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2021, NVIDIA Corporation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "context.h"
+#include "dev.h"
+
+int host1x_memory_context_list_init(struct host1x *host1x)
+{
+   struct host1x_memory_context_list *cdl = >context_list;
+   struct device_node *node = host1x->dev->of_node;
+   struct host1x_memory_context *ctx;
+   unsigned int i;
+   int err;
+
+   cdl->devs = NULL;
+   cdl->len = 0;
+   mutex_init(>lock);
+
+   err = of_property_count_u32_elems(node, "iommu-map");
+   if (err < 0)
+   return 0;
+
+   cdl->devs = kcalloc(err, sizeof(*cdl->devs), GFP_KERNEL);
+   if (!cdl->devs)
+   return -ENOMEM;
+   cdl->len = err / 4;
+
+   for (i = 0; i < cdl->len; i++) {
+   struct iommu_fwspec *fwspec;
+
+   ctx = >devs[i];
+
+   ctx->host = host1x;
+
+   device_initialize(>dev);
+
+   /*
+* Due to an issue with T194 NVENC, only 38 bits can be used.
+* Anyway, 256GiB of IOVA ought to be enough for anyone.
+*/
+   ctx->dma_mask = DMA_BIT_MASK(38);
+   ctx->dev.dma_mask = >dma_mask;
+   ctx->dev.coherent_dma_mask = ctx->dma_mask;
+   dev_set_name(>dev, "host1x-ctx.%d", i);
+   ctx->dev.bus = _context_device_bus_type;
+   ctx->dev.parent = host1x->dev;
+
+   dma_set_max_seg_size(>dev, UINT_MAX);
+
+   err = device_add(>dev);
+   if (err) {
+   dev_err(host1x->dev, "could not add context device %d: %d\n", i, err);
+   goto del_devices;
+   }
+
+   err = of_dma_configure_id(>dev, node, true, );
+   if (err) {
+   dev_err(host1x->dev, "IOMMU configuration failed for context device %d: %d\n",
+   i, err);
+   device_del(>dev);
+   goto del_devices;
+   }
+
+   fwspec = dev_iommu_fwspec_get(>dev);
+   if (!fwspec) {
+   dev_err(host1x->dev, "Context device %d has no IOMMU!\n", i);
+   device_del(>dev);
+   goto del_devices;
+   }
+
+   ctx->stream_id = fwspec->ids[0] & 0x;
+   }
+
+   return 0;
+
+del_devices:
+   while (i--)
+   device_del(>devs[i].dev);
+
+   kfree(cdl->devs);
+   cdl->len = 0;
+
+   return err;
+}
+
+void host1x_memory_context_list_free(struct host1x_memory_context_list *cdl)
+{
+   unsigned int i;
+
+   for (i = 0; i < cdl->len; i++)
+   device_del(>devs[i].dev);
+
+   kfree(cdl->devs);
+   cdl->len = 0;
+}
+
+struct host1x_memory_context *host1x_memory_context_alloc(struct host1x *host1x,
+ struct pid *pid)
+{
+   struct host1x_memory_context_list *cdl = >context_list;
+   struct host1x_memory_context *free = NULL;
+   int i;
+
+   if (!cdl->len)
+   return ERR_PTR(-EOPNOTSUPP);
+
+   mutex_lock(>lock);
+
+   for (i = 0; i < cdl->len; i++) {
+   struct host1x_memory_context *cd = >devs[i];
+
+   if (cd->owner == pid) {
+   

[PATCH v5 0/9] Host1x context isolation support

2022-05-16 Thread cyndis
From: Mikko Perttunen 

***
New in v5:

Rebased
Renamed host1x_context to host1x_memory_context
Small change in DRM side client driver ops to reduce churn with some
  upcoming changes
Add NVDEC support

***

***
New in v4:

Addressed review comments. See individual patches.
***

***
New in v3:

Added device tree bindings for new property.
***

***
New in v2:

Added support for Tegra194
Use standard iommu-map property instead of custom mechanism
***

This series adds support for Host1x 'context isolation'. When
programming engines through Host1x, userspace can program in any
addresses it wants, so we need some way to isolate the engines'
memory spaces. Traditionally this has either been done imperfectly
with a single shared IOMMU domain, or by copying and verifying the
programming command stream at submit time (Host1x firewall).

Since Tegra186 there is a privileged (only usable by kernel)
Host1x opcode that allows setting the stream ID sent by the engine
to the SMMU. So, by allocating a number of context banks and stream
IDs for this purpose, and using this opcode at the beginning of
each job, we can implement isolation. Due to the limited number of
context banks, each process (rather than each channel) gets its own
context.
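
To illustrate the per-process allocation described above, here is a minimal userspace sketch. It is not the driver code: struct mem_ctx, mem_ctx_alloc() and mem_ctx_put() are hypothetical names, the owner is modelled as a plain nonzero integer instead of a struct pid pointer, and the locking that the driver does with a mutex is left out.

```c
#include <stddef.h>

/* Hypothetical, simplified model of one context bank; not the host1x types. */
struct mem_ctx {
	int owner;              /* owning process ID, 0 while the context is free */
	unsigned int refs;      /* number of current users of this context */
	unsigned int stream_id; /* stream ID backing this context */
};

/*
 * Return the context already owned by pid (which must be nonzero),
 * or claim the first free one. Returns NULL when every context is
 * owned by some other process.
 */
static struct mem_ctx *mem_ctx_alloc(struct mem_ctx *ctxs, size_t len, int pid)
{
	struct mem_ctx *free_ctx = NULL;
	size_t i;

	for (i = 0; i < len; i++) {
		if (ctxs[i].owner == pid) {
			/* Same process again: share its existing context. */
			ctxs[i].refs++;
			return &ctxs[i];
		}
		if (ctxs[i].owner == 0 && !free_ctx)
			free_ctx = &ctxs[i];
	}

	if (!free_ctx)
		return NULL;

	free_ctx->owner = pid;
	free_ctx->refs = 1;
	return free_ctx;
}

/* Drop one reference; the context becomes free once the last user is gone. */
static void mem_ctx_put(struct mem_ctx *ctx)
{
	if (--ctx->refs == 0)
		ctx->owner = 0;
}
```

With two contexts, two channels opened by the same process end up sharing one context, a second process gets the other, and a third process is refused until a context is released.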

This feature also allows sharing engines among multiple VMs when
used with Host1x's hardware virtualization support - up to 8 VMs
can be configured with a subset of allowed stream IDs, enforced
at hardware level.

To implement this, this series adds a new host1x context bus, which
will contain the 'struct device's corresponding to each context
bank / stream ID, changes to device tree and SMMU code to allow
registering the devices and using the bus, as well as the Host1x
stream ID programming code and support in TegraDRM.

-
Merging notes
-

The changes to DT bindings should be applied on top of Thierry's patch
'dt-bindings: display: tegra: Convert to json-schema'.

Thanks,
Mikko

Mikko Perttunen (9):
  dt-bindings: host1x: Add iommu-map property
  gpu: host1x: Add context bus
  gpu: host1x: Add context device management code
  gpu: host1x: Program context stream ID on submission
  iommu/arm-smmu: Attach to host1x context device bus
  arm64: tegra: Add Host1x context stream IDs on Tegra186+
  drm/tegra: falcon: Set DMACTX field on DMA transactions
  drm/tegra: Support context isolation
  drm/tegra: Implement stream ID related callbacks on engines

 .../display/tegra/nvidia,tegra20-host1x.yaml  |   5 +
 arch/arm64/boot/dts/nvidia/tegra186.dtsi  |  11 ++
 arch/arm64/boot/dts/nvidia/tegra194.dtsi  |  11 ++
 drivers/gpu/Makefile  |   3 +-
 drivers/gpu/drm/tegra/drm.h   |  11 ++
 drivers/gpu/drm/tegra/falcon.c|   8 +
 drivers/gpu/drm/tegra/falcon.h|   1 +
 drivers/gpu/drm/tegra/nvdec.c |   9 +
 drivers/gpu/drm/tegra/submit.c|  48 +-
 drivers/gpu/drm/tegra/uapi.c  |  43 -
 drivers/gpu/drm/tegra/vic.c   |  67 +++-
 drivers/gpu/host1x/Kconfig|   5 +
 drivers/gpu/host1x/Makefile   |   2 +
 drivers/gpu/host1x/context.c  | 160 ++
 drivers/gpu/host1x/context.h  |  27 +++
 drivers/gpu/host1x/context_bus.c  |  31 
 drivers/gpu/host1x/dev.c  |  12 +-
 drivers/gpu/host1x/dev.h  |   2 +
 drivers/gpu/host1x/hw/channel_hw.c|  52 +-
 drivers/gpu/host1x/hw/host1x06_hardware.h |  10 ++
 drivers/gpu/host1x/hw/host1x07_hardware.h |  10 ++
 drivers/iommu/arm/arm-smmu/arm-smmu.c |  13 ++
 include/linux/host1x.h|  26 +++
 include/linux/host1x_context_bus.h|  15 ++
 24 files changed, 564 insertions(+), 18 deletions(-)
 create mode 100644 drivers/gpu/host1x/context.c
 create mode 100644 drivers/gpu/host1x/context.h
 create mode 100644 drivers/gpu/host1x/context_bus.c
 create mode 100644 include/linux/host1x_context_bus.h

-- 
2.36.1



[PATCH v5 2/9] gpu: host1x: Add context bus

2022-05-16 Thread cyndis
From: Mikko Perttunen 

The context bus is a "dummy" bus that contains struct devices that
correspond to IOMMU contexts assigned through Host1x to processes.

Even when host1x itself is built as a module, the bus is registered
in built-in code so that the built-in ARM SMMU driver is able to
reference it.

Signed-off-by: Mikko Perttunen 
---
v4:
* Export bus as GPL
---
 drivers/gpu/Makefile   |  3 +--
 drivers/gpu/host1x/Kconfig |  5 +
 drivers/gpu/host1x/Makefile|  1 +
 drivers/gpu/host1x/context_bus.c   | 31 ++
 include/linux/host1x_context_bus.h | 15 +++
 5 files changed, 53 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/host1x/context_bus.c
 create mode 100644 include/linux/host1x_context_bus.h

diff --git a/drivers/gpu/Makefile b/drivers/gpu/Makefile
index 835c88318cec..8997f0096545 100644
--- a/drivers/gpu/Makefile
+++ b/drivers/gpu/Makefile
@@ -2,7 +2,6 @@
 # drm/tegra depends on host1x, so if both drivers are built-in care must be
 # taken to initialize them in the correct order. Link order is the only way
 # to ensure this currently.
-obj-$(CONFIG_TEGRA_HOST1X) += host1x/
-obj-y  += drm/ vga/
+obj-y  += host1x/ drm/ vga/
 obj-$(CONFIG_IMX_IPUV3_CORE)   += ipu-v3/
 obj-$(CONFIG_TRACE_GPU_MEM)+= trace/
diff --git a/drivers/gpu/host1x/Kconfig b/drivers/gpu/host1x/Kconfig
index 6815b4db17c1..1861a8180d3f 100644
--- a/drivers/gpu/host1x/Kconfig
+++ b/drivers/gpu/host1x/Kconfig
@@ -1,8 +1,13 @@
 # SPDX-License-Identifier: GPL-2.0-only
+
+config TEGRA_HOST1X_CONTEXT_BUS
+   bool
+
 config TEGRA_HOST1X
tristate "NVIDIA Tegra host1x driver"
depends on ARCH_TEGRA || (ARM && COMPILE_TEST)
select DMA_SHARED_BUFFER
+   select TEGRA_HOST1X_CONTEXT_BUS
select IOMMU_IOVA
help
  Driver for the NVIDIA Tegra host1x hardware.
diff --git a/drivers/gpu/host1x/Makefile b/drivers/gpu/host1x/Makefile
index d2b6f7de0498..c891a3e33844 100644
--- a/drivers/gpu/host1x/Makefile
+++ b/drivers/gpu/host1x/Makefile
@@ -18,3 +18,4 @@ host1x-y = \
hw/host1x07.o
 
 obj-$(CONFIG_TEGRA_HOST1X) += host1x.o
+obj-$(CONFIG_TEGRA_HOST1X_CONTEXT_BUS) += context_bus.o
diff --git a/drivers/gpu/host1x/context_bus.c b/drivers/gpu/host1x/context_bus.c
new file mode 100644
index ..b0d35b2bbe89
--- /dev/null
+++ b/drivers/gpu/host1x/context_bus.c
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2021, NVIDIA Corporation.
+ */
+
+#include 
+#include 
+
+struct bus_type host1x_context_device_bus_type = {
+   .name = "host1x-context",
+};
+EXPORT_SYMBOL_GPL(host1x_context_device_bus_type);
+
+static int __init host1x_context_device_bus_init(void)
+{
+   int err;
+
+   if (!of_machine_is_compatible("nvidia,tegra186") &&
+   !of_machine_is_compatible("nvidia,tegra194") &&
+   !of_machine_is_compatible("nvidia,tegra234"))
+   return 0;
+
+   err = bus_register(_context_device_bus_type);
+   if (err < 0) {
+   pr_err("bus type registration failed: %d\n", err);
+   return err;
+   }
+
+   return 0;
+}
+postcore_initcall(host1x_context_device_bus_init);
diff --git a/include/linux/host1x_context_bus.h b/include/linux/host1x_context_bus.h
new file mode 100644
index ..72462737a6db
--- /dev/null
+++ b/include/linux/host1x_context_bus.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (c) 2021, NVIDIA Corporation. All rights reserved.
+ */
+
+#ifndef __LINUX_HOST1X_CONTEXT_BUS_H
+#define __LINUX_HOST1X_CONTEXT_BUS_H
+
+#include 
+
+#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS
+extern struct bus_type host1x_context_device_bus_type;
+#endif
+
+#endif
-- 
2.36.1



[PATCH v5 8/9] drm/tegra: Support context isolation

2022-05-16 Thread cyndis
From: Mikko Perttunen 

For engines that support context isolation, allocate a context when
opening a channel, and set up stream ID offset and context fields
when submitting a job.

As of this commit, the stream ID offset and fallback stream ID
are not used when context isolation is disabled. However, with
upcoming patches that enable a full featured job opcode sequence,
these will be necessary.

Signed-off-by: Mikko Perttunen 
---
v5:
* On supporting engines, always program stream ID offset and
  new fallback stream ID.
* Rename host1x_context to host1x_memory_context
v4:
* Separate error and output values in get_streamid_offset API
* Improve error handling
* Rename job->context to job->memory_context for clarity
---
 drivers/gpu/drm/tegra/drm.h|  3 +++
 drivers/gpu/drm/tegra/submit.c | 48 +-
 drivers/gpu/drm/tegra/uapi.c   | 43 --
 3 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index fc0a19554eac..2acc8f2948ad 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -80,6 +80,7 @@ struct tegra_drm_context {
 
/* Only used by new UAPI. */
struct xarray mappings;
+   struct host1x_memory_context *memory_context;
 };
 
 struct tegra_drm_client_ops {
@@ -91,6 +92,8 @@ struct tegra_drm_client_ops {
int (*submit)(struct tegra_drm_context *context,
  struct drm_tegra_submit *args, struct drm_device *drm,
  struct drm_file *file);
+   int (*get_streamid_offset)(struct tegra_drm_client *client, u32 *offset);
+   int (*can_use_memory_ctx)(struct tegra_drm_client *client, bool *supported);
 };
 
 int tegra_drm_submit(struct tegra_drm_context *context,
diff --git a/drivers/gpu/drm/tegra/submit.c b/drivers/gpu/drm/tegra/submit.c
index 6d6dd8c35475..b24738bdf3df 100644
--- a/drivers/gpu/drm/tegra/submit.c
+++ b/drivers/gpu/drm/tegra/submit.c
@@ -498,6 +498,9 @@ static void release_job(struct host1x_job *job)
struct tegra_drm_submit_data *job_data = job->user_data;
u32 i;
 
+   if (job->memory_context)
+   host1x_memory_context_put(job->memory_context);
+
for (i = 0; i < job_data->num_used_mappings; i++)
tegra_drm_mapping_put(job_data->used_mappings[i].mapping);
 
@@ -588,11 +591,51 @@ int tegra_drm_ioctl_channel_submit(struct drm_device *drm, void *data,
goto put_job;
}
 
+   if (context->client->ops->get_streamid_offset) {
+   err = context->client->ops->get_streamid_offset(
+   context->client, >engine_streamid_offset);
+   if (err) {
+   SUBMIT_ERR(context, "failed to get streamid offset: %d", err);
+   goto unpin_job;
+   }
+   }
+
+   if (context->memory_context && context->client->ops->can_use_memory_ctx) {
+   bool supported;
+
+   err = context->client->ops->can_use_memory_ctx(context->client, 
);
+   if (err) {
+   SUBMIT_ERR(context, "failed to detect if engine can use memory context: %d", err);
+   goto unpin_job;
+   }
+
+   if (supported) {
+   job->memory_context = context->memory_context;
+   host1x_memory_context_get(job->memory_context);
+   }
+   } else if (context->client->ops->get_streamid_offset) {
+#ifdef CONFIG_IOMMU_API
+   struct iommu_fwspec *spec;
+
+   /*
+* Job submission will need to temporarily change stream ID,
+* so need to tell it what to change it back to.
+*/
+   spec = dev_iommu_fwspec_get(context->client->base.dev);
+   if (spec && spec->num_ids > 0)
+   job->engine_fallback_streamid = spec->ids[0] & 0x;
+   else
+   job->engine_fallback_streamid = 0x7f;
+#else
+   job->engine_fallback_streamid = 0x7f;
+#endif
+   }
+
/* Boot engine. */
err = pm_runtime_resume_and_get(context->client->base.dev);
if (err < 0) {
SUBMIT_ERR(context, "could not power up engine: %d", err);
-   goto unpin_job;
+   goto put_memory_context;
}
 
job->user_data = job_data;
@@ -627,6 +670,9 @@ int tegra_drm_ioctl_channel_submit(struct drm_device *drm, void *data,
 
goto put_job;
 
+put_memory_context:
+   if (job->memory_context)
+   host1x_memory_context_put(job->memory_context);
 unpin_job:
host1x_job_unpin(job);
 put_job:
diff --git a/drivers/gpu/drm/tegra/uapi.c b/drivers/gpu/drm/tegra/uapi.c
index 9ab9179d2026..a98239cb0e29 100644
--- a/drivers/gpu/drm/tegra/uapi.c
+++ b/drivers/gpu/drm/tegra/uapi.c
@@ -33,6 +33,9 @@ static void 

[PATCH v5 4/9] gpu: host1x: Program context stream ID on submission

2022-05-16 Thread cyndis
From: Mikko Perttunen 

Add code to do stream ID switching at the beginning of a job. The
stream ID is switched to the stream ID specified by the context
passed in the job structure.

Before switching the stream ID, an OP_DONE wait is done on the
channel's engine to ensure that there is no residual ongoing
work that might do DMA using the new stream ID.
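
As a rough illustration of the last step of that sequence, the sketch below builds the two command words that switch the stream ID. The opcode numbers 7 and 9 are taken from the helpers this patch adds to the hardware headers. build_streamid_switch() is a hypothetical name, and the sketch does not push anything to a channel or perform the OP_DONE wait.

```c
#include <stdint.h>

/* Opcode numbers follow the helpers added to host1x06_hardware.h. */
static inline uint32_t opcode_setstreamid(uint32_t offset_words)
{
	return (UINT32_C(7) << 28) | offset_words;
}

static inline uint32_t opcode_setpayload(uint32_t payload)
{
	return (UINT32_C(9) << 28) | payload;
}

/*
 * Hypothetical helper: fill the final pair of command words.
 * The payload word carries the context's stream ID; the SETSTREAMID
 * word names the engine register to write, as an offset in 32-bit
 * words (hence the division of the byte offset by 4).
 */
static void build_streamid_switch(uint32_t words[2], uint32_t stream_id,
				  uint32_t streamid_reg_offset_bytes)
{
	words[0] = opcode_setpayload(stream_id);
	words[1] = opcode_setstreamid(streamid_reg_offset_bytes / 4);
}
```

For example, with an arbitrary stream ID of 0x38 and a register offset of 0x30 bytes, the two words come out as 0x90000038 and 0x7000000c.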

Signed-off-by: Mikko Perttunen 
---
v5:
* Add fallback stream ID. Not used yet, will be needed for
  full featured opcode sequence.
* Rename host1x_context to host1x_memory_context
v4:
* Rename job->context to job->memory_context for clarity
---
 drivers/gpu/host1x/hw/channel_hw.c| 52 +--
 drivers/gpu/host1x/hw/host1x06_hardware.h | 10 +
 drivers/gpu/host1x/hw/host1x07_hardware.h | 10 +
 include/linux/host1x.h|  8 
 4 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c
index 6b40e9af1e88..f84caf06621a 100644
--- a/drivers/gpu/host1x/hw/channel_hw.c
+++ b/drivers/gpu/host1x/hw/channel_hw.c
@@ -180,6 +180,45 @@ static void host1x_enable_gather_filter(struct host1x_channel *ch)
 #endif
 }
 
+static void host1x_channel_program_engine_streamid(struct host1x_job *job)
+{
+#if HOST1X_HW >= 6
+   u32 fence;
+
+   if (!job->memory_context)
+   return;
+
+   fence = host1x_syncpt_incr_max(job->syncpt, 1);
+
+   /* First, increment a syncpoint on OP_DONE condition.. */
+
+   host1x_cdma_push(>channel->cdma,
+   host1x_opcode_nonincr(HOST1X_UCLASS_INCR_SYNCPT, 1),
+   HOST1X_UCLASS_INCR_SYNCPT_INDX_F(job->syncpt->id) |
+   HOST1X_UCLASS_INCR_SYNCPT_COND_F(1));
+
+   /* Wait for syncpoint to increment */
+
+   host1x_cdma_push(>channel->cdma,
+   host1x_opcode_setclass(HOST1X_CLASS_HOST1X,
+   host1x_uclass_wait_syncpt_r(), 1),
+   host1x_class_host_wait_syncpt(job->syncpt->id, fence));
+
+   /*
+* Now that we know the engine is idle, return to class and
+* change stream ID.
+*/
+
+   host1x_cdma_push(>channel->cdma,
+   host1x_opcode_setclass(job->class, 0, 0),
+   HOST1X_OPCODE_NOP);
+
+   host1x_cdma_push(>channel->cdma,
+   host1x_opcode_setpayload(job->memory_context->stream_id),
+   host1x_opcode_setstreamid(job->engine_streamid_offset / 4));
+#endif
+}
+
 static int channel_submit(struct host1x_job *job)
 {
struct host1x_channel *ch = job->channel;
@@ -236,18 +275,23 @@ static int channel_submit(struct host1x_job *job)
if (sp->base)
synchronize_syncpt_base(job);
 
-   syncval = host1x_syncpt_incr_max(sp, user_syncpt_incrs);
-
host1x_hw_syncpt_assign_to_channel(host, sp, ch);
 
-   job->syncpt_end = syncval;
-
/* add a setclass for modules that require it */
if (job->class)
host1x_cdma_push(>cdma,
 host1x_opcode_setclass(job->class, 0, 0),
 HOST1X_OPCODE_NOP);
 
+   /*
+* Ensure engine DMA is idle and set new stream ID. May increment
+* syncpt max.
+*/
+   host1x_channel_program_engine_streamid(job);
+
+   syncval = host1x_syncpt_incr_max(sp, user_syncpt_incrs);
+   job->syncpt_end = syncval;
+
submit_gathers(job, syncval - user_syncpt_incrs);
 
/* end CDMA submit & stash pinned hMems into sync queue */
diff --git a/drivers/gpu/host1x/hw/host1x06_hardware.h b/drivers/gpu/host1x/hw/host1x06_hardware.h
index 01a142a09800..5d515745eee7 100644
--- a/drivers/gpu/host1x/hw/host1x06_hardware.h
+++ b/drivers/gpu/host1x/hw/host1x06_hardware.h
@@ -127,6 +127,16 @@ static inline u32 host1x_opcode_gather_incr(unsigned offset, unsigned count)
return (6 << 28) | (offset << 16) | BIT(15) | BIT(14) | count;
 }
 
+static inline u32 host1x_opcode_setstreamid(unsigned streamid)
+{
+   return (7 << 28) | streamid;
+}
+
+static inline u32 host1x_opcode_setpayload(unsigned payload)
+{
+   return (9 << 28) | payload;
+}
+
 static inline u32 host1x_opcode_gather_wide(unsigned count)
 {
return (12 << 28) | count;
diff --git a/drivers/gpu/host1x/hw/host1x07_hardware.h b/drivers/gpu/host1x/hw/host1x07_hardware.h
index e6582172ebfd..82c0cc9bb0b5 100644
--- a/drivers/gpu/host1x/hw/host1x07_hardware.h
+++ b/drivers/gpu/host1x/hw/host1x07_hardware.h
@@ -127,6 +127,16 @@ static inline u32 host1x_opcode_gather_incr(unsigned offset, unsigned count)
return (6 << 28) | (offset << 16) | BIT(15) | BIT(14) | count;
 }
 
+static inline u32 host1x_opcode_setstreamid(unsigned streamid)
+{
+   return (7 << 28) | streamid;
+}
+
+static inline u32 host1x_opcode_setpayload(unsigned payload)
+{
+   return (9 << 28) | payload;
+}
+
 static inline u32 host1x_opcode_gather_wide(unsigned count)
 {
return (12 << 28) 

[PATCH v5 7/9] drm/tegra: falcon: Set DMACTX field on DMA transactions

2022-05-16 Thread cyndis
From: Mikko Perttunen 

The DMACTX field determines which context, as specified in the
TRANSCFG register, is used. While during boot it doesn't matter
which is used, later on it matters and this value is reused by
the firmware.
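
A small sketch of how the DMACTX field combines with the other command bits. The three macros are copied from falcon.h as shown in this patch. falcon_dma_cmd() is a hypothetical helper, and starting the command word from the 256-byte size value is an assumption about the surrounding function, which this patch does not show in full.

```c
#include <stdint.h>

/* Bit definitions as they appear in falcon.h in this patch. */
#define FALCON_DMATRFCMD_IMEM		(1 << 4)
#define FALCON_DMATRFCMD_SIZE_256B	(6 << 8)
#define FALCON_DMATRFCMD_DMACTX(v)	(((v) & 0x7) << 12)

/*
 * Hypothetical helper: compose the DMA transfer command word for one
 * 256-byte chunk. dmactx selects the DMA context; the macro masks it
 * to three bits and places it at bits 14:12.
 */
static uint32_t falcon_dma_cmd(int to_imem, unsigned int dmactx)
{
	uint32_t cmd = FALCON_DMATRFCMD_SIZE_256B;

	if (to_imem)
		cmd |= FALCON_DMATRFCMD_IMEM;

	cmd |= FALCON_DMATRFCMD_DMACTX(dmactx);

	return cmd;
}
```

Selecting context 1 for an IMEM transfer yields 0x1610: 0x1000 for DMACTX, 0x600 for the size and 0x10 for IMEM.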

Signed-off-by: Mikko Perttunen 
---
 drivers/gpu/drm/tegra/falcon.c | 8 
 drivers/gpu/drm/tegra/falcon.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/tegra/falcon.c b/drivers/gpu/drm/tegra/falcon.c
index 3762d87759d9..c0d85463eb1a 100644
--- a/drivers/gpu/drm/tegra/falcon.c
+++ b/drivers/gpu/drm/tegra/falcon.c
@@ -48,6 +48,14 @@ static int falcon_copy_chunk(struct falcon *falcon,
if (target == FALCON_MEMORY_IMEM)
cmd |= FALCON_DMATRFCMD_IMEM;
 
+   /*
+* Use second DMA context (i.e. the one for firmware). Strictly
+* speaking, at this point both DMA contexts point to the firmware
+* stream ID, but this register's value will be reused by the firmware
+* for later DMA transactions, so we need to use the correct value.
+*/
+   cmd |= FALCON_DMATRFCMD_DMACTX(1);
+
falcon_writel(falcon, offset, FALCON_DMATRFMOFFS);
falcon_writel(falcon, base, FALCON_DMATRFFBOFFS);
falcon_writel(falcon, cmd, FALCON_DMATRFCMD);
diff --git a/drivers/gpu/drm/tegra/falcon.h b/drivers/gpu/drm/tegra/falcon.h
index c56ee32d92ee..1955cf11a8a6 100644
--- a/drivers/gpu/drm/tegra/falcon.h
+++ b/drivers/gpu/drm/tegra/falcon.h
@@ -50,6 +50,7 @@
 #define FALCON_DMATRFCMD_IDLE  (1 << 1)
 #define FALCON_DMATRFCMD_IMEM  (1 << 4)
 #define FALCON_DMATRFCMD_SIZE_256B (6 << 8)
+#define FALCON_DMATRFCMD_DMACTX(v) (((v) & 0x7) << 12)
 
 #define FALCON_DMATRFFBOFFS0x111c
 
-- 
2.36.1



[PATCH v5 6/9] arm64: tegra: Add Host1x context stream IDs on Tegra186+

2022-05-16 Thread cyndis
From: Mikko Perttunen 

Add Host1x context stream IDs on systems that support Host1x context
isolation. Host1x and attached engines can use these stream IDs to
allow isolation between memory used by different processes.

The specified stream IDs must match those configured by the hypervisor,
if one is present.

Signed-off-by: Mikko Perttunen 
---
v2:
* Added context devices on T194.
* Use iommu-map instead of custom property.
v4:
* Remove memory-contexts subnode.
---
 arch/arm64/boot/dts/nvidia/tegra186.dtsi | 11 +++
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 11 +++
 2 files changed, 22 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index 0e9afc3e2f26..5f560f13ed93 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1461,6 +1461,17 @@ host1x@13e0 {
 
iommus = < TEGRA186_SID_HOST1X>;
 
+   /* Context isolation domains */
+   iommu-map = <
+   0  TEGRA186_SID_HOST1X_CTX0 1
+   1  TEGRA186_SID_HOST1X_CTX1 1
+   2  TEGRA186_SID_HOST1X_CTX2 1
+   3  TEGRA186_SID_HOST1X_CTX3 1
+   4  TEGRA186_SID_HOST1X_CTX4 1
+   5  TEGRA186_SID_HOST1X_CTX5 1
+   6  TEGRA186_SID_HOST1X_CTX6 1
+   7  TEGRA186_SID_HOST1X_CTX7 1>;
+
dpaux1: dpaux@1504 {
compatible = "nvidia,tegra186-dpaux";
reg = <0x1504 0x1>;
diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index d1f8248c00f4..613fd71dec25 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -1769,6 +1769,17 @@ host1x@13e0 {
interconnect-names = "dma-mem";
iommus = < TEGRA194_SID_HOST1X>;
 
+   /* Context isolation domains */
+   iommu-map = <
+   0  TEGRA194_SID_HOST1X_CTX0 1
+   1  TEGRA194_SID_HOST1X_CTX1 1
+   2  TEGRA194_SID_HOST1X_CTX2 1
+   3  TEGRA194_SID_HOST1X_CTX3 1
+   4  TEGRA194_SID_HOST1X_CTX4 1
+   5  TEGRA194_SID_HOST1X_CTX5 1
+   6  TEGRA194_SID_HOST1X_CTX6 1
+   7  TEGRA194_SID_HOST1X_CTX7 1>;
+
nvdec@1514 {
compatible = "nvidia,tegra194-nvdec";
reg = <0x1514 0x0004>;
-- 
2.36.1



[PATCH v5 1/9] dt-bindings: host1x: Add iommu-map property

2022-05-16 Thread cyndis
From: Mikko Perttunen 

Add schema information for specifying context stream IDs. This uses
the standard iommu-map property.
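
For readers unfamiliar with iommu-map, the sketch below models how such a property is interpreted, assuming the usual four-cell entry layout (input ID base, IOMMU phandle, output stream ID base, length). The struct-free cell array, the function names and the stream ID values in the usage note are illustrative only; this is not the kernel's device tree parser.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: each entry is <id-base iommu-phandle sid-base length>. */
#define IOMMU_MAP_CELLS 4

/* Number of complete entries in a property that holds num_cells u32 cells. */
static size_t iommu_map_num_entries(size_t num_cells)
{
	return num_cells / IOMMU_MAP_CELLS;
}

/*
 * Translate a context ID into a stream ID.
 * Returns 0 and stores the result in *sid on success, or -1 if no
 * entry covers the given ID.
 */
static int iommu_map_lookup(const uint32_t *cells, size_t num_cells,
			    uint32_t id, uint32_t *sid)
{
	size_t n = iommu_map_num_entries(num_cells);
	size_t i;

	for (i = 0; i < n; i++) {
		const uint32_t *e = &cells[i * IOMMU_MAP_CELLS];

		/* e[0] = ID base, e[2] = stream ID base, e[3] = length */
		if (id >= e[0] && id - e[0] < e[3]) {
			*sid = e[2] + (id - e[0]);
			return 0;
		}
	}

	return -1;
}
```

A property with eight single-ID entries, as in the Tegra186 and Tegra194 device trees in this series, occupies 32 cells and describes eight context devices.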

Signed-off-by: Mikko Perttunen 
Reviewed-by: Robin Murphy 
---
v3:
* New patch
v4:
* Remove memory-contexts subnode.
---
 .../bindings/display/tegra/nvidia,tegra20-host1x.yaml| 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml
index 4fd513efb0f7..0adeb03b9e3a 100644
--- a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml
+++ b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml
@@ -144,6 +144,11 @@ allOf:
 reset-names:
   maxItems: 1
 
+iommu-map:
+  description: Specification of stream IDs available for memory context device
+use. Should be a mapping of IDs 0..n to IOMMU entries corresponding to
+usable stream IDs.
+
   required:
 - reg-names
 
-- 
2.36.1



[PATCH v5 9/9] drm/tegra: Implement stream ID related callbacks on engines

2022-05-16 Thread cyndis
From: Mikko Perttunen 

Implement the get_streamid_offset and can_use_memory_ctx callbacks
required for supporting context isolation. Since old firmware on VIC
cannot support context isolation without hacks that we don't want to
implement, check the firmware binary to see if context isolation
should be enabled.
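
The firmware check can be summarised as a small predicate. This is a sketch only: vic_can_use_context() is a hypothetical function, and the final "return true" branch is an assumption, since it mirrors the two "false" cases visible in the patch rather than code quoted here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate mirroring the check in vic_load_firmware().
 * supports_sid stands for the per-SoC configuration flag, and
 * fce_bin_data_offset for the value read from the firmware image.
 */
static bool vic_can_use_context(bool supports_sid, uint32_t fce_bin_data_offset)
{
	if (!supports_sid)
		return false;

	/*
	 * Any offset other than 0x0 or 0xa5a5a5a5 marks old firmware that
	 * accesses FCE through STREAMID0, which rules out isolation.
	 */
	if (fce_bin_data_offset != 0x0 && fce_bin_data_offset != 0xa5a5a5a5)
		return false;

	/* Assumed outcome for firmware that passes both checks. */
	return true;
}
```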

Signed-off-by: Mikko Perttunen 
---
v5:
* Split into two callbacks
* Add NVDEC support
v4:
* Add locking in vic_load_firmware
* Return -EOPNOTSUPP if context isolation is not available
* Update for changed get_streamid_offset declaration
* Add comment noting that vic_load_firmware is safe to call
  without the hardware being powered on

Implement context isolation related callbacks in VIC, NVDEC
---
 drivers/gpu/drm/tegra/drm.h   |  8 +
 drivers/gpu/drm/tegra/nvdec.c |  9 +
 drivers/gpu/drm/tegra/vic.c   | 67 ++-
 3 files changed, 76 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index 2acc8f2948ad..845e60f144c7 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -100,6 +100,14 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 struct drm_tegra_submit *args, struct drm_device *drm,
 struct drm_file *file);
 
+static inline int
+tegra_drm_get_streamid_offset_thi(struct tegra_drm_client *client, u32 *offset)
+{
+   *offset = 0x30;
+
+   return 0;
+}
+
 struct tegra_drm_client {
struct host1x_client base;
struct list_head list;
diff --git a/drivers/gpu/drm/tegra/nvdec.c b/drivers/gpu/drm/tegra/nvdec.c
index 79e1e88203cf..f1210cfb3708 100644
--- a/drivers/gpu/drm/tegra/nvdec.c
+++ b/drivers/gpu/drm/tegra/nvdec.c
@@ -304,10 +304,19 @@ static void nvdec_close_channel(struct tegra_drm_context *context)
host1x_channel_put(context->channel);
 }
 
+static int nvdec_can_use_memory_ctx(struct tegra_drm_client *client, bool *supported)
+{
+   *supported = true;
+
+   return 0;
+}
+
 static const struct tegra_drm_client_ops nvdec_ops = {
.open_channel = nvdec_open_channel,
.close_channel = nvdec_close_channel,
.submit = tegra_drm_submit,
+   .get_streamid_offset = tegra_drm_get_streamid_offset_thi,
+   .can_use_memory_ctx = nvdec_can_use_memory_ctx,
 };
 
 #define NVIDIA_TEGRA_210_NVDEC_FIRMWARE "nvidia/tegra210/nvdec.bin"
diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
index 1e342fa3d27b..2c0d554bd13c 100644
--- a/drivers/gpu/drm/tegra/vic.c
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -38,6 +38,8 @@ struct vic {
struct clk *clk;
struct reset_control *rst;
 
+   bool can_use_context;
+
/* Platform configuration */
const struct vic_config *config;
 };
@@ -229,28 +231,38 @@ static int vic_load_firmware(struct vic *vic)
 {
struct host1x_client *client = >client.base;
struct tegra_drm *tegra = vic->client.drm;
+   static DEFINE_MUTEX(lock);
+   u32 fce_bin_data_offset;
dma_addr_t iova;
size_t size;
void *virt;
int err;
 
-   if (vic->falcon.firmware.virt)
-   return 0;
+   mutex_lock();
+
+   if (vic->falcon.firmware.virt) {
+   err = 0;
+   goto unlock;
+   }
 
err = falcon_read_firmware(&vic->falcon, vic->config->firmware);
if (err < 0)
-   return err;
+   goto unlock;
 
size = vic->falcon.firmware.size;
 
if (!client->group) {
virt = dma_alloc_coherent(vic->dev, size, &iova, GFP_KERNEL);
-   if (!virt)
-   return -ENOMEM;
+   if (!virt) {
+   err = -ENOMEM;
+   goto unlock;
+   }
} else {
virt = tegra_drm_alloc(tegra, size, &iova);
-   if (IS_ERR(virt))
-   return PTR_ERR(virt);
+   if (IS_ERR(virt)) {
+   err = PTR_ERR(virt);
+   goto unlock;
+   }
}
 
vic->falcon.firmware.virt = virt;
@@ -277,7 +289,28 @@ static int vic_load_firmware(struct vic *vic)
vic->falcon.firmware.phys = phys;
}
 
-   return 0;
+   /*
+* Check if firmware is new enough to not require mapping firmware
+* to data buffer domains.
+*/
+   fce_bin_data_offset = *(u32 *)(virt + VIC_UCODE_FCE_DATA_OFFSET);
+
+   if (!vic->config->supports_sid) {
+   vic->can_use_context = false;
+   } else if (fce_bin_data_offset != 0x0 && fce_bin_data_offset != 0xa5a5a5a5) {
+   /*
+* Firmware will access FCE through STREAMID0, so context
+* isolation cannot be used.
+*/
+   vic->can_use_context = false;
+   dev_warn_once(vic->dev, "context isolation disabled due to old firmware\n");
+   } else {
+   

[PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus

2022-05-16 Thread cyndis
From: Mikko Perttunen 

Set itself as the IOMMU for the host1x context device bus, containing
"dummy" devices used for Host1x context isolation.

Signed-off-by: Mikko Perttunen 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 568cce590ccc..9ff54eaecf81 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -39,6 +39,7 @@
 
 #include 
 #include 
+#include 
 
 #include "arm-smmu.h"
 
@@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct iommu_ops *ops)
goto err_reset_pci_ops;
}
 #endif
+#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS
+   if (!iommu_present(&host1x_context_device_bus_type)) {
+   err = bus_set_iommu(&host1x_context_device_bus_type, ops);
+   if (err)
+   goto err_reset_fsl_mc_ops;
+   }
+#endif
+
return 0;
 
+err_reset_fsl_mc_ops: __maybe_unused;
+#ifdef CONFIG_FSL_MC_BUS
+   bus_set_iommu(&fsl_mc_bus_type, NULL);
+#endif
 err_reset_pci_ops: __maybe_unused;
 #ifdef CONFIG_PCI
bus_set_iommu(&pci_bus_type, NULL);
-- 
2.36.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH V2 1/2] swiotlb: Add Child IO TLB mem support

2022-05-16 Thread Christoph Hellwig
I don't really understand how 'childs' fit in here.  The code also
doesn't seem to be usable without patch 2 and a caller of the
new functions added in patch 2, so it is rather impossible to review.

Also:

 1) why is SEV/TDX so different from other cases that need bounce
    buffering that it has to be treated differently, and why can't
    we work on a general scalability improvement instead?
 2) per previous discussions of how swiotlb itself works, it is
    clear that another option is to just make the pages we DMA to
    shared with the hypervisor.  Why don't we try that at least
    for larger I/O?


Re: [PATCH 4/7] drm/i915: Remove unnecessary include

2022-05-16 Thread Jani Nikula
On Sat, 14 May 2022, Lu Baolu wrote:
> intel-iommu.h is not needed in drm/i915 anymore. Remove its include.

Thanks for the cleanups. Do you want to keep the patches together or
want us to pick this up via drm-intel?

If you want to keep the patches together,

Acked-by: Jani Nikula 

for merging via whichever tree suits you best. Just let us know.

BR,
Jani.


>
> Signed-off-by: Lu Baolu 
> ---
>  drivers/gpu/drm/i915/i915_drv.h| 1 -
>  drivers/gpu/drm/i915/display/intel_display.c   | 1 -
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 1 -
>  3 files changed, 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index fa14da84362e..f2a6982c3bef 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -36,7 +36,6 @@
>  
>  #include 
>  #include 
> -#include 
>  #include 
>  
>  #include 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 7dfeb458aa65..686ddbeebadc 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -27,7 +27,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index d42f437149c9..c9823528ea94 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -4,7 +4,6 @@
>   * Copyright © 2008,2010 Intel Corporation
>   */
>  
> -#include 
>  #include 
>  #include 
>  #include 

-- 
Jani Nikula, Intel Open Source Graphics Center

Re: [PATCH 2/5] iommu: Add blocking_domain_ops field in iommu_ops

2022-05-16 Thread Christoph Hellwig
On Mon, May 16, 2022 at 09:57:56AM +0800, Lu Baolu wrote:
> Each IOMMU driver must provide a blocking domain ops. If the hardware
> supports detaching domain from device, setting blocking domain equals
> detaching the existing domain from the device. Otherwise, an UNMANAGED
> domain without any mapping will be used instead.

blocking in this case means not allowing any access?  The naming
sounds a bit odd to me as blocking in the kernel has a specific
meaning.  Maybe something like noaccess ops might be a better name?
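
To make the semantics under discussion concrete, here is a minimal
userspace sketch in plain C of the two strategies the quoted commit
message describes: detach the domain if the hardware allows it,
otherwise attach an UNMANAGED domain that has no mappings. Every name
in it (toy_domain, toy_device, toy_set_blocking, and so on) is
hypothetical and made up for illustration; none of it is the kernel's
IOMMU API or the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are kernel APIs. */
enum toy_domain_type { TOY_DOMAIN_NONE, TOY_DOMAIN_UNMANAGED };

struct toy_domain {
	enum toy_domain_type type;
	size_t nr_mappings;	/* number of address mappings installed */
};

struct toy_device {
	bool hw_can_detach;		/* hardware supports having no domain */
	struct toy_domain *domain;	/* currently attached domain, or NULL */
};

/* An UNMANAGED domain with no mappings: no DMA address is reachable. */
static struct toy_domain toy_empty_domain = {
	.type = TOY_DOMAIN_UNMANAGED,
	.nr_mappings = 0,
};

/* Put the device into the "no access" state from the quoted text. */
static void toy_set_blocking(struct toy_device *dev)
{
	assert(dev != NULL);

	if (dev->hw_can_detach)
		dev->domain = NULL;	/* detaching blocks all DMA */
	else
		dev->domain = &toy_empty_domain; /* nothing is mapped */
}

/* In this model DMA can work only through a domain that maps something. */
static bool toy_dma_possible(const struct toy_device *dev)
{
	return dev->domain != NULL && dev->domain->nr_mappings > 0;
}
```

In this model both branches end in the same observable state, namely
that the device cannot reach any memory, which is the sense in which
"blocking" is used in the quoted text and which a name like "noaccess"
would describe.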


Re: [PATCH] vfio: Remove VFIO_TYPE1_NESTING_IOMMU

2022-05-16 Thread Christoph Hellwig
Looks good,

Reviewed-by: Christoph Hellwig 

we really should not keep dead code like this around.