Re: [RFC PATCH v1 2/9] pinctrl: devicetree: Delete usage of driver_deferred_probe_check_state()

2022-06-06 Thread Andy Shevchenko
On Sat, Jun 04, 2022 at 08:41:01PM -0700, Saravana Kannan wrote:
> On Mon, May 30, 2022 at 2:22 AM Geert Uytterhoeven  
> wrote:
> > On Thu, May 26, 2022 at 10:16 AM Saravana Kannan  
> > wrote:
> > > Now that fw_devlink=on by default and fw_devlink supports
> > > "pinctrl-[0-8]" property, the execution will never get to the point
> >
> > 0-9?
> >
> > oh, it's really 0-8:
> >
> > drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl0, "pinctrl-0", NULL)
> > drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl1, "pinctrl-1", NULL)
> > drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl2, "pinctrl-2", NULL)
> > drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl3, "pinctrl-3", NULL)
> > drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl4, "pinctrl-4", NULL)
> > drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
> > drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
> > drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
> > drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
> >
> > Looks fragile, especially since we now have:
> >
> > arch/arm64/boot/dts/microchip/sparx5_pcb134_board.dtsi: pinctrl-9 = <_9>;
> > arch/arm64/boot/dts/microchip/sparx5_pcb134_board.dtsi: pinctrl-10 = <_10>;
> > arch/arm64/boot/dts/microchip/sparx5_pcb134_board.dtsi: pinctrl-11 = <_11>;
> > arch/arm64/boot/dts/microchip/sparx5_pcb134_board.dtsi: pinctrl-12 = <_pins_i>;
> 
> Checking for pinctrl-* and then verifying that * matches %d would be
> more complicated and probably more expensive than listing
> pinctrl-[0-8], especially because more than 50% of the pinctrl-*
> properties in DT files are NOT pinctrl-%d. So back when we merged
> this, Rob and I agreed [0-8] was good enough for now and that we could
> add more if needed. Also, when I checked back then, all the
> pinctrl-5+ properties ended up pointing to the same suppliers as the
> lower-numbered ones, so it didn't make a difference.
> 
> Ok, I just checked all the pinctrl-9+ instances in linux-next and it's
> still true that they all point to the same supplier pointed to by
> pinctrl-[0-8].
> 
> So yeah, it looks fragile, 

And it feels fragile, adds to the confusion, etc.
Please, consider redesigning this to be more robust.

>   but is not broken and it's more efficient
> than looking for pinctrl-%d or adding more pinctrl-xx entries. So,
> let's fix it if it actually breaks? Not going to oppose a patch if
> anyone wants to make it more complete.
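
For reference, accepting any "pinctrl-%d" property generically is not a lot
of code, though as noted above it still costs a string scan per property
that the fixed pinctrl-[0-8] list avoids. A minimal sketch (hypothetical
helper, not the current drivers/of/property.c implementation):

#include <linux/ctype.h>
#include <linux/string.h>
#include <linux/types.h>

/* Return true for "pinctrl-0", "pinctrl-42", ... but not "pinctrl-names". */
static bool is_pinctrl_index_prop(const char *prop_name)
{
	const char *s = prop_name;

	if (strncmp(s, "pinctrl-", strlen("pinctrl-")))
		return false;

	s += strlen("pinctrl-");
	if (!*s)
		return false;

	for (; *s; s++)
		if (!isdigit(*s))
			return false;

	return true;
}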
> 
> > > where driver_deferred_probe_check_state() is called before the supplier
> > > has probed successfully or before deferred probe timeout has expired.
> > >
> > > So, delete the call and replace it with -ENODEV.

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 5/6] scsi: remove stale BusLogic driver

2022-06-06 Thread Maciej W. Rozycki
On Mon, 6 Jun 2022, Arnd Bergmann wrote:

> This was in turn fixed in commit 56f396146af2 ("scsi: BusLogic: Fix
> 64-bit system enumeration error for Buslogic"), 8 years later.
> 
> The fact that this was found at all is an indication that there are
> users, and it seems that Maciej, Matt and Khalid all have access to
> this hardware, but if it took eight years to find the problem,
> it's likely that nobody actually relies on it.

 Umm, I use it with a 32-bit system, so it would be quite an issue for me 
to discover a problem with 64-bit configurations.  And I quite rely on 
this system for various stuff too!

> Remove it as part of the global virt_to_bus()/bus_to_virt() removal.
> If anyone is still interested in keeping this driver, the alternative
> is to stop it from using bus_to_virt(), possibly along the lines of
> how dpt_i2o gets around the same issue.

 Thanks for the pointer and for cc-ing me.  Please refrain from removing 
the driver at least for this release cycle and let me fix it.  It should 
be easy to mimic what I did for the defza driver: all bus addresses in the 
DMA API come associated with virtual addresses, so it is just a matter of 
recording those somewhere for later use rather than trying to mess up with 
bus addresses to figure out a reverse mapping.
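
The bookkeeping described here amounts to something like the sketch below
(names are illustrative, not taken from the BusLogic or defza sources):
remember the CPU address at map time and use it on completion, instead of
recovering it from the bus address with bus_to_virt().

#include <linux/device.h>
#include <linux/dma-mapping.h>

struct ccb_dma_info {
	void *vaddr;	/* CPU address used to build/parse the CCB */
	dma_addr_t dma;	/* bus address handed to the adapter */
	size_t len;
};

static int ccb_map(struct device *dev, struct ccb_dma_info *info,
		   void *buf, size_t len)
{
	info->dma = dma_map_single(dev, buf, len, DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev, info->dma))
		return -ENOMEM;

	info->vaddr = buf;
	info->len = len;
	return 0;
}

/* On completion, use info->vaddr directly instead of bus_to_virt(info->dma). */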

  Maciej


[PATCH v9 4/8] perf tool: arm: Refactor event list iteration in auxtrace_record__init()

2022-06-06 Thread Yicong Yang via iommu
From: Qi Liu 

Add find_pmu_for_event() and use it to simplify the logic in
auxtrace_record__init(). find_pmu_for_event() will be
reused in subsequent patches.

Reviewed-by: Jonathan Cameron 
Signed-off-by: Qi Liu 
Signed-off-by: Yicong Yang 
---
 tools/perf/arch/arm/util/auxtrace.c | 53 ++---
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/tools/perf/arch/arm/util/auxtrace.c 
b/tools/perf/arch/arm/util/auxtrace.c
index 5fc6a2a3dbc5..384c7cfda0fd 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -50,16 +50,32 @@ static struct perf_pmu **find_all_arm_spe_pmus(int 
*nr_spes, int *err)
return arm_spe_pmus;
 }
 
+static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
+  int pmu_nr, struct evsel *evsel)
+{
+   int i;
+
+   if (!pmus)
+   return NULL;
+
+   for (i = 0; i < pmu_nr; i++) {
+   if (evsel->core.attr.type == pmus[i]->type)
+   return pmus[i];
+   }
+
+   return NULL;
+}
+
 struct auxtrace_record
 *auxtrace_record__init(struct evlist *evlist, int *err)
 {
-   struct perf_pmu *cs_etm_pmu;
+   struct perf_pmu *cs_etm_pmu = NULL;
+   struct perf_pmu **arm_spe_pmus = NULL;
struct evsel *evsel;
-   bool found_etm = false;
+   struct perf_pmu *found_etm = NULL;
struct perf_pmu *found_spe = NULL;
-   struct perf_pmu **arm_spe_pmus = NULL;
+   int auxtrace_event_cnt = 0;
int nr_spes = 0;
-   int i = 0;
 
if (!evlist)
return NULL;
@@ -68,24 +84,23 @@ struct auxtrace_record
	arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
 
evlist__for_each_entry(evlist, evsel) {
-   if (cs_etm_pmu &&
-   evsel->core.attr.type == cs_etm_pmu->type)
-   found_etm = true;
-
-   if (!nr_spes || found_spe)
-   continue;
-
-   for (i = 0; i < nr_spes; i++) {
-   if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
-   found_spe = arm_spe_pmus[i];
-   break;
-   }
-   }
+   if (cs_etm_pmu && !found_etm)
+			found_etm = find_pmu_for_event(&cs_etm_pmu, 1, evsel);
+
+   if (arm_spe_pmus && !found_spe)
+			found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel);
}
+
free(arm_spe_pmus);
 
-   if (found_etm && found_spe) {
-		pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n");
+   if (found_etm)
+   auxtrace_event_cnt++;
+
+   if (found_spe)
+   auxtrace_event_cnt++;
+
+   if (auxtrace_event_cnt > 1) {
+		pr_err("Concurrent AUX trace operation not currently supported\n");
*err = -EOPNOTSUPP;
return NULL;
}
-- 
2.24.0



[PATCH v9 0/8] Add support for HiSilicon PCIe Tune and Trace device

2022-06-06 Thread Yicong Yang via iommu
HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
integrated Endpoint (RCiEP) device, providing the capability
to dynamically monitor and tune the PCIe traffic (tune),
and trace the TLP headers (trace).

PTT tune is designed for monitoring and adjusting PCIe link parameters.
We expose several parameters of the PCIe link. Through the driver, the
user can adjust the value of a certain parameter to affect the PCIe link
and enhance performance in certain situations.

PTT trace is designed for dumping the TLP headers to memory, which
can be used to analyze the transactions and usage of the PCIe
link. Users can filter the traced headers either by requester ID, or
by tracing everything downstream of a set of Root Ports on the same
core as the PTT device. Tracing only headers of a certain type or of a
certain direction is also supported.

The driver registers a PMU device for each PTT device. The trace function
can be used through `perf record` and the traced headers can be decoded
by `perf report`. The perf command support for the device is also
added in this patchset. The tune function can be used through the sysfs
attributes of the related PMU device. See the documentation for the
detailed usage.
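
As a rough illustration, the tune interface boils down to a plain
open/read/write/close cycle on the sysfs attribute. The PMU directory name
below is illustrative; the real one carries the SICL and core IDs of the
PTT device.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	const char *path = "/sys/devices/hisi_ptt_2_0/tune/qos_tx_cpl";
	char buf[16];
	ssize_t n;
	int fd;

	fd = open(path, O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	n = read(fd, buf, sizeof(buf) - 1);	/* current weight */
	if (n > 0) {
		buf[n] = '\0';
		printf("current: %s", buf);
	}

	lseek(fd, 0, SEEK_SET);
	if (write(fd, "2", 1) != 1)		/* valid levels are 0, 1, 2 */
		perror("write");

	close(fd);
	return 0;
}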

Change since v8:
- Cleanups and one minor fix from Jonathan and John, thanks
Link: 
https://lore.kernel.org/lkml/20220516125223.32012-1-yangyic...@hisilicon.com/

Change since v7:
- Configure the DMA in probe rather than in runtime. Also use devres to manage
  PMU device as we have no order problem now
- Refactor the config validation function per John and Leo
- Use a spinlock hisi_ptt::pmu_lock instead of mutex to serialize the perf 
process
  in pmu::start as it's in atomic context
- Only commit the traced data when stop, per Leo and James
- Drop the dynamic filter update patch from this series to simplify the review
  of the driver. That patch will be sent separately.
- add a cpumask sysfs attribute and handle the CPU hotplug events, following
  the uncore PMU convention
- Other cleanups and fixes, both in driver and perf tool
Link: 
https://lore.kernel.org/lkml/20220407125841.3678-1-yangyic...@hisilicon.com/

Change since v6:
- Fix W=1 errors reported by lkp test, thanks

Change since v5:
- Squash the PMU patch into PATCH 2 suggested by John
- refine the commit message of PATCH 1 and some comments
Link: 
https://lore.kernel.org/lkml/20220308084930.5142-1-yangyic...@hisilicon.com/

Change since v4:
Address the comments from Jonathan, John and Ma Ca, thanks.
- Use devm* also for allocating the DMA buffers
- Remove the IRQ handler stub in Patch 2
- Make functions waiting for hardware state return boolean
- Manually remove the PMU device as it should be removed first
- Adjust the order of operations in probe and removal so that they match
- Make the available {directions,type,format} arrays const and non-global
- Use the right filter list in the filters show and protect the
  list with a mutex
- Record the trace status with a boolean @started rather than an enum
- Optimize the process of finding the PTT devices in the perf tool
Link: 
https://lore.kernel.org/linux-pci/20220221084307.33712-1-yangyic...@hisilicon.com/

Change since v3:
Address the comments from Jonathan and John, thanks.
- drop members in the common struct which can be obtained on the fly
- reduce the buffer struct and organize the buffers with an array instead of a list
- reduce the DMA reset wait time to avoid a long busy loop
- split the available_filters sysfs attribute into two files, for root port
  and requester respectively. Update the documentation accordingly
- make IOMMU mapping check earlier in probe to avoid race condition. Also
  make IOMMU quirk patch prior to driver in the series
- Cleanups and typos fixes from John and Jonathan
Link: 
https://lore.kernel.org/linux-pci/20220124131118.17887-1-yangyic...@hisilicon.com/

Change since v2:
- address the comments from Mathieu, thanks.
  - rename the directory to ptt to match the function of the device
  - spinoff the declarations to a separate header
  - split the trace function to several patches
  - some other comments.
- make default smmu domain type of PTT device to identity
  Drop the RMR as it's not recommended and use an iommu_def_domain_type
  quirk to passthrough the device DMA as suggested by Robin. 
Link: 
https://lore.kernel.org/linux-pci/2026090625.53702-1-yangyic...@hisilicon.com/

Change since v1:
- switch the user interface of trace to perf from debugfs
- switch the user interface of tune to sysfs from debugfs
- add perf tool support to start trace and decode the trace data
- address the comments of documentation from Bjorn
- add RMR[1] support of the device as trace works in RMR mode or
  direct DMA mode. RMR support is achieved by common APIs rather
  than the APIs implemented in [1].
Link: 
https://lore.kernel.org/lkml/1618654631-42454-1-git-send-email-yangyic...@hisilicon.com/
[1] 
https://lore.kernel.org/linux-acpi/20210805080724.480-1-shameerali.kolothum.th...@huawei.com/

Qi 

[PATCH v9 8/8] MAINTAINERS: Add maintainer for HiSilicon PTT driver

2022-06-06 Thread Yicong Yang via iommu
Add maintainer for driver and documentation of HiSilicon PTT device.

Signed-off-by: Yicong Yang 
Reviewed-by: Jonathan Cameron 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index a6d3bd9d2a8d..5893fde0cc82 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8936,6 +8936,13 @@ F:   Documentation/admin-guide/perf/hisi-pcie-pmu.rst
 F: Documentation/admin-guide/perf/hisi-pmu.rst
 F: drivers/perf/hisilicon
 
+HISILICON PTT DRIVER
+M: Yicong Yang 
+L: linux-ker...@vger.kernel.org
+S: Maintained
+F: Documentation/trace/hisi-ptt.rst
+F: drivers/hwtracing/ptt/
+
 HISILICON QM AND ZIP Controller DRIVER
 M: Zhou Wang 
 L: linux-cry...@vger.kernel.org
-- 
2.24.0



[PATCH v9 1/8] iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity

2022-06-06 Thread Yicong Yang via iommu
The DMA operations of the HiSilicon PTT device can only work properly with
identity mappings. So add a quirk for the device to force the domain
to passthrough.

Acked-by: Will Deacon 
Signed-off-by: Yicong Yang 
Reviewed-by: John Garry 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 88817a3376ef..e119ff8396c9 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2839,6 +2839,26 @@ static int arm_smmu_dev_disable_feature(struct device 
*dev,
}
 }
 
+/*
+ * HiSilicon PCIe tune and trace device can be used to trace TLP headers on the
+ * PCIe link and save the data to memory by DMA. The hardware is restricted to
+ * use identity mapping only.
+ */
+#define IS_HISI_PTT_DEVICE(pdev)	((pdev)->vendor == PCI_VENDOR_ID_HUAWEI && \
+					 (pdev)->device == 0xa12e)
+
+static int arm_smmu_def_domain_type(struct device *dev)
+{
+   if (dev_is_pci(dev)) {
+   struct pci_dev *pdev = to_pci_dev(dev);
+
+   if (IS_HISI_PTT_DEVICE(pdev))
+   return IOMMU_DOMAIN_IDENTITY;
+   }
+
+   return 0;
+}
+
 static struct iommu_ops arm_smmu_ops = {
.capable= arm_smmu_capable,
.domain_alloc   = arm_smmu_domain_alloc,
@@ -2856,6 +2876,7 @@ static struct iommu_ops arm_smmu_ops = {
.sva_unbind = arm_smmu_sva_unbind,
.sva_get_pasid  = arm_smmu_sva_get_pasid,
.page_response  = arm_smmu_page_response,
+   .def_domain_type= arm_smmu_def_domain_type,
.pgsize_bitmap  = -1UL, /* Restricted during device attach */
.owner  = THIS_MODULE,
.default_domain_ops = &(const struct iommu_domain_ops) {
-- 
2.24.0



[PATCH v9 7/8] docs: trace: Add HiSilicon PTT device driver documentation

2022-06-06 Thread Yicong Yang via iommu
Document the introduction and usage of HiSilicon PTT device driver.

Signed-off-by: Yicong Yang 
Reviewed-by: Jonathan Cameron 
---
 Documentation/trace/hisi-ptt.rst | 307 +++
 Documentation/trace/index.rst|   1 +
 2 files changed, 308 insertions(+)
 create mode 100644 Documentation/trace/hisi-ptt.rst

diff --git a/Documentation/trace/hisi-ptt.rst b/Documentation/trace/hisi-ptt.rst
new file mode 100644
index ..0a3112244d40
--- /dev/null
+++ b/Documentation/trace/hisi-ptt.rst
@@ -0,0 +1,307 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================
+HiSilicon PCIe Tune and Trace device
+====================================
+
+Introduction
+============
+
+HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
+integrated Endpoint (RCiEP) device, providing the capability
+to dynamically monitor and tune the PCIe link's events (tune),
+and trace the TLP headers (trace). The two functions are independent,
+but it is recommended to use them together to analyze and enhance the
+PCIe link's performance.
+
+On Kunpeng 930 SoC, the PCIe Root Complex is composed of several
+PCIe cores. Each PCIe core includes several Root Ports and a PTT
+RCiEP, like below. The PTT device is capable of tuning and
+tracing the links of the PCIe core.
+::
+
+  +--Core 0---+
+  |   |   [   PTT   ] |
+  |   |   [Root Port]---[Endpoint]
+  |   |   [Root Port]---[Endpoint]
+  |   |   [Root Port]---[Endpoint]
+Root Complex  |--Core 1---+
+  |   |   [   PTT   ] |
+  |   |   [Root Port]---[ Switch ]---[Endpoint]
+  |   |   [Root Port]---[Endpoint] `-[Endpoint]
+  |   |   [Root Port]---[Endpoint]
+  +---+
+
+The PTT device driver registers one PMU device for each PTT device.
+The name of each PTT device is composed of the 'hisi_ptt' prefix and
+the IDs of the SICL and the core where it is located. The Kunpeng 930
+SoC encapsulates multiple CPU dies (SCCL, Super CPU Cluster) and
+IO dies (SICL, Super I/O Cluster), where there's one PCIe Root
+Complex for each SICL.
+::
+
+/sys/devices/hisi_ptt_
+
+Tune
+====
+
+PTT tune is designed for monitoring and adjusting PCIe link parameters (events).
+Currently we support events in 4 classes. The scope of the events
+covers the PCIe core to which the PTT device belongs.
+
+Each event is presented as a file under $(PTT PMU dir)/tune, and
+a simple open/read/write/close cycle will be used to tune the event.
+::
+
+$ cd /sys/devices/hisi_ptt_/tune
+$ ls
+qos_tx_cpl    qos_tx_np    qos_tx_p
+tx_path_rx_req_alloc_buf_level
+tx_path_tx_req_alloc_buf_level
+$ cat qos_tx_cpl
+1
+$ echo 2 > qos_tx_cpl
+$ cat qos_tx_cpl
+2
+
+The current (numerical) value of the event can simply be read from
+the file, and the desired value can be written to the file to tune it.
+
+1. Tx path QoS control
+----------------------
+
+The following files are provided to tune the QoS of the tx path of
+the PCIe core.
+
+- qos_tx_cpl: weight of Tx completion TLPs
+- qos_tx_np: weight of Tx non-posted TLPs
+- qos_tx_p: weight of Tx posted TLPs
+
+The weight influences the proportion of certain packets on the PCIe link.
+For example, in a storage scenario, increasing the proportion of
+completion packets on the link can enhance performance, as more
+completions are consumed.
+
+The available tune data of these events is [0, 1, 2].
+Writing a negative value will return an error, and out of range
+values will be converted to 2. Note that the event value just
+indicates a probable level, but is not precise.
+
+2. Tx path buffer control
+-------------------------
+
+The following files are provided to tune the buffers of the tx path of the PCIe core.
+
+- tx_path_rx_req_alloc_buf_level: watermark of Rx requested
+- tx_path_tx_req_alloc_buf_level: watermark of Tx requested
+
+These events influence the watermark of the buffer allocated for each
+type. Rx means inbound while Tx means outbound. The packets will
+be stored in the buffer first and then transmitted either when the
+watermark is reached or when the timeout expires. For a busy direction,
+you should increase the related buffer watermark to avoid frequent
+posting and thus enhance performance. In most cases just keep the
+default value.
+
+The available tune data of above events is [0, 1, 2].
+Writing a negative value will return an error, and out of range
+values will be converted to 2. Note that the event value just
+indicates a probable level, but is not precise.
+
+Trace
+=====
+
+PTT trace is designed for dumping the TLP headers to the memory, which
+can be used to analyze the transactions and usage condition of the PCIe
+Link. You can choose to filter the traced headers by either requester ID,
+or those downstream of a set of Root Ports on the same core of the PTT
+device. It's also 

[PATCH v9 6/8] perf tool: Add support for parsing HiSilicon PCIe Trace packet

2022-06-06 Thread Yicong Yang via iommu
From: Qi Liu 

Add support for using 'perf report --dump-raw-trace' to parse PTT packets.

Example usage:

Output will contain raw PTT data and its textual representation, such
as:

0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x40  offset: 0
ref: 0xa5d50c725  idx: 0  tid: -1  cpu: 0
.
. ... HISI PTT data: size 4194304 bytes
.  : 00 00 00 00 Prefix
.  0004: 08 20 00 60 Header DW0
.  0008: ff 02 00 01 Header DW1
.  000c: 20 08 00 00 Header DW2
.  0010: 10 e7 44 ab Header DW3
.  0014: 2a a8 1e 01 Time
.  0020: 00 00 00 00 Prefix
.  0024: 01 00 00 60 Header DW0
.  0028: 0f 1e 00 01 Header DW1
.  002c: 04 00 00 00 Header DW2
.  0030: 40 00 81 02 Header DW3
.  0034: ee 02 00 00 Time


Signed-off-by: Qi Liu 
Signed-off-by: Yicong Yang 
---
 tools/perf/util/Build |   2 +
 tools/perf/util/auxtrace.c|   3 +
 tools/perf/util/hisi-ptt-decoder/Build|   1 +
 .../hisi-ptt-decoder/hisi-ptt-pkt-decoder.c   | 164 +++
 .../hisi-ptt-decoder/hisi-ptt-pkt-decoder.h   |  31 +++
 tools/perf/util/hisi-ptt.c| 192 ++
 6 files changed, 393 insertions(+)
 create mode 100644 tools/perf/util/hisi-ptt-decoder/Build
 create mode 100644 tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c
 create mode 100644 tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.h
 create mode 100644 tools/perf/util/hisi-ptt.c

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index a51267d88ca9..e22df9e7fd10 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -116,6 +116,8 @@ perf-$(CONFIG_AUXTRACE) += intel-pt.o
 perf-$(CONFIG_AUXTRACE) += intel-bts.o
 perf-$(CONFIG_AUXTRACE) += arm-spe.o
 perf-$(CONFIG_AUXTRACE) += arm-spe-decoder/
+perf-$(CONFIG_AUXTRACE) += hisi-ptt.o
+perf-$(CONFIG_AUXTRACE) += hisi-ptt-decoder/
 perf-$(CONFIG_AUXTRACE) += s390-cpumsf.o
 
 ifdef CONFIG_LIBOPENCSD
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index c5ef322a30b8..3371a0feec68 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -51,6 +51,7 @@
 #include "intel-pt.h"
 #include "intel-bts.h"
 #include "arm-spe.h"
+#include "hisi-ptt.h"
 #include "s390-cpumsf.h"
 #include "util/mmap.h"
 
@@ -1305,6 +1306,8 @@ int perf_event__process_auxtrace_info(struct perf_session 
*session,
err = s390_cpumsf_process_auxtrace_info(event, session);
break;
case PERF_AUXTRACE_HISI_PTT:
+   err = hisi_ptt_process_auxtrace_info(event, session);
+   break;
case PERF_AUXTRACE_UNKNOWN:
default:
return -EINVAL;
diff --git a/tools/perf/util/hisi-ptt-decoder/Build 
b/tools/perf/util/hisi-ptt-decoder/Build
new file mode 100644
index ..db3db8b75033
--- /dev/null
+++ b/tools/perf/util/hisi-ptt-decoder/Build
@@ -0,0 +1 @@
+perf-$(CONFIG_AUXTRACE) += hisi-ptt-pkt-decoder.o
diff --git a/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c 
b/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c
new file mode 100644
index ..dc8f19914628
--- /dev/null
+++ b/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c
@@ -0,0 +1,164 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * HiSilicon PCIe Trace and Tuning (PTT) support
+ * Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../color.h"
+#include "hisi-ptt-pkt-decoder.h"
+
+/*
+ * For 8DW format, the bit[31:11] of DW0 is always 0x1f, which can be
+ * used to distinguish the data format.
+ * 8DW format is like:
+ *   bits [ 31:11 ][   10:0   ]
+ *|---|---|
+ *DW0 [0x1f   ][ Reserved (0x7ff) ]
+ *DW1 [   Prefix  ]
+ *DW2 [ Header DW0]
+ *DW3 [ Header DW1]
+ *DW4 [ Header DW2]
+ *DW5 [ Header DW3]
+ *DW6 [   Reserved (0x0)  ]
+ *DW7 [Time   ]
+ *
+ * 4DW format is like:
+ *   bits [31:30] [ 29:25 ][24][23][22][21][20:11   ][10:0]
+ *|-|-|---|---|---|---|-|-|
+ *DW0 [ Fmt ][  Type  

[PATCH v9 3/8] hwtracing: hisi_ptt: Add tune function support for HiSilicon PCIe Tune and Trace device

2022-06-06 Thread Yicong Yang via iommu
Add tune function for the HiSilicon Tune and Trace device. The interface
of tune is exposed through sysfs attributes of PTT PMU device.

Reviewed-by: Jonathan Cameron 
Reviewed-by: John Garry 
Signed-off-by: Yicong Yang 
---
 drivers/hwtracing/ptt/hisi_ptt.c | 136 +++
 drivers/hwtracing/ptt/hisi_ptt.h |  23 ++
 2 files changed, 159 insertions(+)

diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
index e317be2e1311..3ffc7cc80ad0 100644
--- a/drivers/hwtracing/ptt/hisi_ptt.c
+++ b/drivers/hwtracing/ptt/hisi_ptt.c
@@ -25,6 +25,140 @@
 /* Dynamic CPU hotplug state used by PTT */
 static enum cpuhp_state hisi_ptt_pmu_online;
 
+static bool hisi_ptt_wait_tuning_finish(struct hisi_ptt *hisi_ptt)
+{
+   u32 val;
+
+   return !readl_poll_timeout(hisi_ptt->iobase + HISI_PTT_TUNING_INT_STAT,
+ val, !(val & HISI_PTT_TUNING_INT_STAT_MASK),
+ HISI_PTT_WAIT_POLL_INTERVAL_US,
+ HISI_PTT_WAIT_TUNE_TIMEOUT_US);
+}
+
+static ssize_t hisi_ptt_tune_attr_show(struct device *dev,
+  struct device_attribute *attr,
+  char *buf)
+{
+   struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev));
+   struct dev_ext_attribute *ext_attr;
+   struct hisi_ptt_tune_desc *desc;
+   u32 reg;
+   u16 val;
+
+   ext_attr = container_of(attr, struct dev_ext_attribute, attr);
+   desc = ext_attr->var;
+
+	mutex_lock(&hisi_ptt->tune_lock);
+
+   reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+   reg &= ~(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB);
+   reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB,
+ desc->event_code);
+   writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+
+	/* Write all 1s to indicate this is the read process */
+   writel(~0U, hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+
+   if (!hisi_ptt_wait_tuning_finish(hisi_ptt)) {
+		mutex_unlock(&hisi_ptt->tune_lock);
+   return -ETIMEDOUT;
+   }
+
+   reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+   reg &= HISI_PTT_TUNING_DATA_VAL_MASK;
+   val = FIELD_GET(HISI_PTT_TUNING_DATA_VAL_MASK, reg);
+
+	mutex_unlock(&hisi_ptt->tune_lock);
+   return sysfs_emit(buf, "%u\n", val);
+}
+
+static ssize_t hisi_ptt_tune_attr_store(struct device *dev,
+   struct device_attribute *attr,
+   const char *buf, size_t count)
+{
+   struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev));
+   struct dev_ext_attribute *ext_attr;
+   struct hisi_ptt_tune_desc *desc;
+   u32 reg;
+   u16 val;
+
+   ext_attr = container_of(attr, struct dev_ext_attribute, attr);
+   desc = ext_attr->var;
+
+	if (kstrtou16(buf, 10, &val))
+   return -EINVAL;
+
+	mutex_lock(&hisi_ptt->tune_lock);
+
+   reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+   reg &= ~(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB);
+   reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB,
+ desc->event_code);
+   writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+   writel(FIELD_PREP(HISI_PTT_TUNING_DATA_VAL_MASK, val),
+  hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+
+   if (!hisi_ptt_wait_tuning_finish(hisi_ptt)) {
+		mutex_unlock(&hisi_ptt->tune_lock);
+   return -ETIMEDOUT;
+   }
+
+	mutex_unlock(&hisi_ptt->tune_lock);
+   return count;
+}
+
+#define HISI_PTT_TUNE_ATTR(_name, _val, _show, _store) \
+   static struct hisi_ptt_tune_desc _name##_desc = {   \
+   .name = #_name, \
+   .event_code = _val, \
+   };  \
+   static struct dev_ext_attribute hisi_ptt_##_name##_attr = { \
+   .attr   = __ATTR(_name, 0600, _show, _store),   \
+   .var= &_name##_desc,\
+   }
+
+#define HISI_PTT_TUNE_ATTR_COMMON(_name, _val) \
+   HISI_PTT_TUNE_ATTR(_name, _val, \
+  hisi_ptt_tune_attr_show, \
+  hisi_ptt_tune_attr_store)
+
+/*
+ * The value of a tuning event is composed of two parts: the main event code
+ * in bit[0,15] and the subevent code in bit[16,23]. For example, qos_tx_cpl
+ * is a subevent of 'Tx path QoS control', which tunes the weight of Tx
+ * completion TLPs. See the hisi_ptt.rst documentation for more information.
+ */
+#define HISI_PTT_TUNE_QOS_TX_CPL		(0x4 | (3 << 16))
+#define HISI_PTT_TUNE_QOS_TX_NP

[PATCH v9 5/8] perf tool: Add support for HiSilicon PCIe Tune and Trace device driver

2022-06-06 Thread Yicong Yang via iommu
From: Qi Liu 

HiSilicon PCIe tune and trace device (PTT) can dynamically tune
the PCIe link's events, and trace the TLP headers.

This patch adds support for the PTT device in the perf tool, so users can
use 'perf record' to get TLP header trace data.

Signed-off-by: Qi Liu 
Signed-off-by: Yicong Yang 
---
 tools/perf/arch/arm/util/auxtrace.c   |  63 +
 tools/perf/arch/arm/util/pmu.c|   3 +
 tools/perf/arch/arm64/util/Build  |   2 +-
 tools/perf/arch/arm64/util/hisi-ptt.c | 187 ++
 tools/perf/util/auxtrace.c|   1 +
 tools/perf/util/auxtrace.h|   1 +
 tools/perf/util/hisi-ptt.h|  19 +++
 7 files changed, 275 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/arch/arm64/util/hisi-ptt.c
 create mode 100644 tools/perf/util/hisi-ptt.h

diff --git a/tools/perf/arch/arm/util/auxtrace.c 
b/tools/perf/arch/arm/util/auxtrace.c
index 384c7cfda0fd..129ed72391a4 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -4,9 +4,11 @@
  * Author: Mathieu Poirier 
  */
 
+#include 
 #include 
 #include 
 #include 
+#include 
 
 #include "../../../util/auxtrace.h"
 #include "../../../util/debug.h"
@@ -14,6 +16,7 @@
 #include "../../../util/pmu.h"
 #include "cs-etm.h"
 #include "arm-spe.h"
+#include "hisi-ptt.h"
 
 static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
 {
@@ -50,6 +53,52 @@ static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, 
int *err)
return arm_spe_pmus;
 }
 
+static struct perf_pmu **find_all_hisi_ptt_pmus(int *nr_ptts, int *err)
+{
+   const char *sysfs = sysfs__mountpoint();
+   struct perf_pmu **hisi_ptt_pmus = NULL;
+   struct dirent *dent;
+   char path[PATH_MAX];
+   DIR *dir = NULL;
+   int idx = 0;
+
+   snprintf(path, PATH_MAX, "%s" EVENT_SOURCE_DEVICE_PATH, sysfs);
+   dir = opendir(path);
+   if (!dir) {
+   pr_err("can't read directory '%s'\n", EVENT_SOURCE_DEVICE_PATH);
+   *err = -EINVAL;
+   goto out;
+   }
+
+   while ((dent = readdir(dir))) {
+   if (strstr(dent->d_name, HISI_PTT_PMU_NAME))
+   (*nr_ptts)++;
+   }
+
+   if (!(*nr_ptts))
+   goto out;
+
+   hisi_ptt_pmus = zalloc(sizeof(struct perf_pmu *) * (*nr_ptts));
+   if (!hisi_ptt_pmus) {
+   pr_err("hisi_ptt alloc failed\n");
+   *err = -ENOMEM;
+   goto out;
+   }
+
+   rewinddir(dir);
+   while ((dent = readdir(dir))) {
+		if (strstr(dent->d_name, HISI_PTT_PMU_NAME) && idx < (*nr_ptts)) {
+   hisi_ptt_pmus[idx] = perf_pmu__find(dent->d_name);
+   if (hisi_ptt_pmus[idx])
+   idx++;
+   }
+   }
+
+out:
+   closedir(dir);
+   return hisi_ptt_pmus;
+}
+
 static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
   int pmu_nr, struct evsel *evsel)
 {
@@ -71,17 +120,21 @@ struct auxtrace_record
 {
struct perf_pmu *cs_etm_pmu = NULL;
struct perf_pmu **arm_spe_pmus = NULL;
+   struct perf_pmu **hisi_ptt_pmus = NULL;
struct evsel *evsel;
struct perf_pmu *found_etm = NULL;
struct perf_pmu *found_spe = NULL;
+   struct perf_pmu *found_ptt = NULL;
int auxtrace_event_cnt = 0;
int nr_spes = 0;
+   int nr_ptts = 0;
 
if (!evlist)
return NULL;
 
cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
	arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
+	hisi_ptt_pmus = find_all_hisi_ptt_pmus(&nr_ptts, err);
 
evlist__for_each_entry(evlist, evsel) {
if (cs_etm_pmu && !found_etm)
@@ -89,9 +142,13 @@ struct auxtrace_record
 
if (arm_spe_pmus && !found_spe)
			found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel);
+
+   if (hisi_ptt_pmus && !found_ptt)
+			found_ptt = find_pmu_for_event(hisi_ptt_pmus, nr_ptts, evsel);
}
 
free(arm_spe_pmus);
+   free(hisi_ptt_pmus);
 
if (found_etm)
auxtrace_event_cnt++;
@@ -99,6 +156,9 @@ struct auxtrace_record
if (found_spe)
auxtrace_event_cnt++;
 
+   if (found_ptt)
+   auxtrace_event_cnt++;
+
if (auxtrace_event_cnt > 1) {
		pr_err("Concurrent AUX trace operation not currently supported\n");
*err = -EOPNOTSUPP;
@@ -111,6 +171,9 @@ struct auxtrace_record
 #if defined(__aarch64__)
if (found_spe)
return arm_spe_recording_init(err, found_spe);
+
+   if (found_ptt)
+   return hisi_ptt_recording_init(err, found_ptt);
 #endif
 
/*
diff --git a/tools/perf/arch/arm/util/pmu.c b/tools/perf/arch/arm/util/pmu.c
index b8b23b9dc598..887c8addc491 100644
--- 

[PATCH v9 2/8] hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device

2022-06-06 Thread Yicong Yang via iommu
HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex integrated
Endpoint (RCiEP) device, providing the capability to dynamically monitor and
tune the PCIe traffic and trace the TLP headers.

Add the driver for the device to enable the trace function. Register a PMU
device for PTT trace, then users can use the trace function through the perf
command. The driver makes use of the perf AUX trace function and supports the
following events to configure the trace:

- filter: select Root port or Endpoint to trace
- type: select the type of traced TLP headers
- direction: select the direction of traced TLP headers
- format: select the data format of the traced TLP headers

This patch initially adds basic trace support for the PTT device.

Reviewed-by: Jonathan Cameron 
Reviewed-by: John Garry 
Signed-off-by: Yicong Yang 
---
 drivers/Makefile |   1 +
 drivers/hwtracing/Kconfig|   2 +
 drivers/hwtracing/ptt/Kconfig|  12 +
 drivers/hwtracing/ptt/Makefile   |   2 +
 drivers/hwtracing/ptt/hisi_ptt.c | 956 +++
 drivers/hwtracing/ptt/hisi_ptt.h | 177 ++
 6 files changed, 1150 insertions(+)
 create mode 100644 drivers/hwtracing/ptt/Kconfig
 create mode 100644 drivers/hwtracing/ptt/Makefile
 create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c
 create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h

diff --git a/drivers/Makefile b/drivers/Makefile
index 9a30842b22c5..bf67e0e23c18 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -176,6 +176,7 @@ obj-$(CONFIG_USB4)  += thunderbolt/
 obj-$(CONFIG_CORESIGHT)+= hwtracing/coresight/
 obj-y  += hwtracing/intel_th/
 obj-$(CONFIG_STM)  += hwtracing/stm/
+obj-$(CONFIG_HISI_PTT) += hwtracing/ptt/
 obj-$(CONFIG_ANDROID)  += android/
 obj-$(CONFIG_NVMEM)+= nvmem/
 obj-$(CONFIG_FPGA) += fpga/
diff --git a/drivers/hwtracing/Kconfig b/drivers/hwtracing/Kconfig
index 13085835a636..911ee977103c 100644
--- a/drivers/hwtracing/Kconfig
+++ b/drivers/hwtracing/Kconfig
@@ -5,4 +5,6 @@ source "drivers/hwtracing/stm/Kconfig"
 
 source "drivers/hwtracing/intel_th/Kconfig"
 
+source "drivers/hwtracing/ptt/Kconfig"
+
 endmenu
diff --git a/drivers/hwtracing/ptt/Kconfig b/drivers/hwtracing/ptt/Kconfig
new file mode 100644
index ..6d46a09ffeb9
--- /dev/null
+++ b/drivers/hwtracing/ptt/Kconfig
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config HISI_PTT
+   tristate "HiSilicon PCIe Tune and Trace Device"
+   depends on ARM64 || (COMPILE_TEST && 64BIT)
+   depends on PCI && HAS_DMA && HAS_IOMEM && PERF_EVENTS
+   help
+ HiSilicon PCIe Tune and Trace device exists as a PCIe RCiEP
+ device, and it provides support for PCIe traffic tuning and
+ tracing TLP headers to the memory.
+
+ This driver can also be built as a module. If so, the module
+ will be called hisi_ptt.
diff --git a/drivers/hwtracing/ptt/Makefile b/drivers/hwtracing/ptt/Makefile
new file mode 100644
index ..908c09a98161
--- /dev/null
+++ b/drivers/hwtracing/ptt/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_HISI_PTT) += hisi_ptt.o
diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
new file mode 100644
index ..e317be2e1311
--- /dev/null
+++ b/drivers/hwtracing/ptt/hisi_ptt.c
@@ -0,0 +1,956 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for HiSilicon PCIe tune and trace device
+ *
+ * Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
+ * Author: Yicong Yang 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "hisi_ptt.h"
+
+/* Dynamic CPU hotplug state used by PTT */
+static enum cpuhp_state hisi_ptt_pmu_online;
+
+static u16 hisi_ptt_get_filter_val(u16 devid, bool is_port)
+{
+   if (is_port)
+   return BIT(HISI_PCIE_CORE_PORT_ID(devid & 0xff));
+
+   return devid;
+}
+
+static bool hisi_ptt_wait_trace_hw_idle(struct hisi_ptt *hisi_ptt)
+{
+   u32 val;
+
+   return !readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_STS,
+ val, val & HISI_PTT_TRACE_IDLE,
+ HISI_PTT_WAIT_POLL_INTERVAL_US,
+ HISI_PTT_WAIT_TRACE_TIMEOUT_US);
+}
+
+static void hisi_ptt_wait_dma_reset_done(struct hisi_ptt *hisi_ptt)
+{
+   u32 val;
+
+   readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_WR_STS,
+ val, !val, HISI_PTT_RESET_POLL_INTERVAL_US,
+ HISI_PTT_RESET_TIMEOUT_US);
+}
+
+static void hisi_ptt_trace_end(struct hisi_ptt *hisi_ptt)
+{
+   writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
+   hisi_ptt->trace_ctrl.started = false;
+}
+
+static int hisi_ptt_trace_start(struct hisi_ptt *hisi_ptt)
+{
+ 

Re: [PATCH V2 2/6] iommu: iova: properly handle 0 as a valid IOVA address

2022-06-06 Thread Robin Murphy

On 2022-06-02 16:58, Marek Szyprowski wrote:

Hi Robin,

On 23.05.2022 19:30, Robin Murphy wrote:

On 2022-05-11 13:15, Ajay Kumar wrote:

From: Marek Szyprowski 

Zero is a valid DMA and IOVA address on many architectures, so adjust
the
IOVA management code to properly handle it. A new value IOVA_BAD_ADDR
(~0UL) is introduced as a generic value for the error case. Adjust all
callers of the alloc_iova_fast() function for the new return value.


And when does anything actually need this? In fact if you were to stop
iommu-dma from reserving IOVA 0 - which you don't - it would only show
how patch #3 is broken.


I don't get why you says that patch #3 is broken.


My mistake, in fact it's already just as broken either way - I forgot 
that that's done implicitly via iovad->start_pfn rather than an actual 
reserve_iova() entry. I see there is an initial calculation attempting 
to honour start_pfn in patch #3, but it's always immediately 
overwritten. More critically, the rb_first()/rb_next walk means the 
starting point can only creep inevitably upwards as older allocations 
are freed, so will end up pathologically stuck at or above limit_pfn and 
unable to recover. At best it's more "next-fit" than "first-fit", and 
TBH the fact that it could ever even allocate anything at all in an 
empty domain makes me want to change the definition of IOVA_ANCHOR ;)



The main issue with
address 0 being reserved appears when one uses a first-fit allocation
algorithm. In that case the first allocation will be at address 0,
which causes some issues. This patch simply makes the whole IOVA-related
code capable of handling that case. This mimics the similar code already
used on 32-bit ARM. There are drivers that rely on the way the IOVAs are
allocated. This is especially important for drivers for devices with
limited addressing capabilities, and such devices exist and can even
sit behind an IOMMU.

Let's focus on the s5p-mfc driver and the way one of the supported
devices does DMA. The hardware has a very limited DMA window (256M)
and addresses all memory buffers as an offset from the firmware base.
Currently it has been assumed that the first allocated buffer will be at
the beginning of the IOVA space. This worked fine on 32-bit ARM, and
supporting this case is a target of this patchset.


I still understand the use-case; I object to a solution where PFN 0 is 
always reserved yet sometimes allocated. Not to mention the slightly 
more fundamental thing of the whole lot not actually working the way 
it's supposed to.


I also remain opposed to baking in secret ABIs where allocators are 
expected to honour a particular undocumented behaviour. FWIW I've had 
plans for a while now to make the IOVA walk bidirectional to make the 
existing retry case more efficient, at which point it's then almost 
trivial to implement "bottom-up" allocation, which I'm thinking I might 
then force on by default for CONFIG_ARM to minimise any further 
surprises for v2 of the default domain conversion. However I'm 
increasingly less convinced that it's really sufficient for your case, 
and am leaning back towards the opinion that any driver with really 
specific constraints on how its DMA addresses are laid out should not be 
passively relying on a generic allocator to do exactly what it wants. A 
simple flag saying "try to allocate DMA addresses bottom-up" doesn't 
actually mean "allocate this thing on a 256MB-aligned address and 
everything subsequent within a 256MB window above it", so let's not 
build a fragile house of cards on top of pretending that it does.
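
To make the constraint under discussion concrete: what the s5p-mfc case
described above actually needs is that every buffer lands inside one 256M
window above the firmware base, a property the driver can check explicitly
rather than assume from allocator behaviour. A minimal illustration
(illustrative only, not code from this patch set):

#include <linux/sizes.h>
#include <linux/types.h>

/* Hypothetical helper: can the device address 'buf' relative to fw_base? */
static bool mfc_buffer_reachable(dma_addr_t fw_base, dma_addr_t buf)
{
	return buf >= fw_base && buf - fw_base < SZ_256M;
}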



Also note that it's really nothing to do with architecture either way;
iommu-dma simply chooses to reserve IOVA 0 for its own convenience,
mostly because it can. Much the same way that 0 is typically a valid
CPU VA, but mapping something meaningful there is just asking for a
world of pain debugging NULL-dereference bugs.


Right, it is not directly related to the architecture, but it is a
much more common case on architectures where DMA (IOVA) address =
physical address + some offset. The commit message can be rephrased,
though.

I see no reason to forbid 0 as a valid IOVA. The DMA-mapping code also
uses DMA_MAPPING_ERROR as ~0. It looks like, when the last-fit allocation
algorithm is used, no one has ever needed to consider this case so far.


Because it's the most convenient and efficient thing to do. Remember we 
tried to use 0 for DMA_MAPPING_ERROR too, but that turned out to be a 
usable physical RAM address on some systems. With a virtual address 
space, on the other hand, we're free to do whatever we want; that's kind 
of the point.


Thanks,
Robin.


Re: [GIT PULL] dma-mapping fixes for 5.19

2022-06-06 Thread pr-tracker-bot
The pull request you sent on Mon, 6 Jun 2022 08:38:45 +0200:

> git://git.infradead.org/users/hch/dma-mapping.git 
> tags/dma-mapping-5.19-2022-06-06

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/e71e60cd74df9386c3f684c54888f2367050b831

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


[PATCH v8 01/11] iommu: Add max_pasids field in struct iommu_device

2022-06-06 Thread Lu Baolu
Use this field to keep the number of PASIDs that an IOMMU
hardware is able to support. This is a generic attribute of an IOMMU
and lifting it into the per-IOMMU device structure makes it possible
to allocate a PASID for a device without calls into the IOMMU drivers.
Any IOMMU driver which supports PASID-related features should set this
field before enabling them on the devices.

Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
---
 include/linux/intel-iommu.h | 3 ++-
 include/linux/iommu.h   | 2 ++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
 drivers/iommu/intel/dmar.c  | 7 +++
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 4f29139bbfc3..e065cbe3c857 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -479,7 +479,6 @@ enum {
 #define VTD_FLAG_IRQ_REMAP_PRE_ENABLED (1 << 1)
 #define VTD_FLAG_SVM_CAPABLE   (1 << 2)
 
-extern int intel_iommu_sm;
 extern spinlock_t device_domain_lock;
 
 #define sm_supported(iommu)(intel_iommu_sm && ecap_smts((iommu)->ecap))
@@ -786,6 +785,7 @@ struct context_entry *iommu_context_addr(struct intel_iommu 
*iommu, u8 bus,
 extern const struct iommu_ops intel_iommu_ops;
 
 #ifdef CONFIG_INTEL_IOMMU
+extern int intel_iommu_sm;
 extern int iommu_calculate_agaw(struct intel_iommu *iommu);
 extern int iommu_calculate_max_sagaw(struct intel_iommu *iommu);
 extern int dmar_disabled;
@@ -802,6 +802,7 @@ static inline int iommu_calculate_max_sagaw(struct 
intel_iommu *iommu)
 }
 #define dmar_disabled  (1)
 #define intel_iommu_enabled (0)
+#define intel_iommu_sm (0)
 #endif
 
 static inline const char *decode_prq_descriptor(char *str, size_t size,
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 5e1afe169549..03fbb1b71536 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -318,12 +318,14 @@ struct iommu_domain_ops {
  * @list: Used by the iommu-core to keep a list of registered iommus
  * @ops: iommu-ops for talking to this iommu
  * @dev: struct device for sysfs handling
+ * @max_pasids: number of supported PASIDs
  */
 struct iommu_device {
struct list_head list;
const struct iommu_ops *ops;
struct fwnode_handle *fwnode;
struct device *dev;
+   u32 max_pasids;
 };
 
 /**
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 88817a3376ef..ae8ec8df47c1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3546,6 +3546,7 @@ static int arm_smmu_device_hw_probe(struct 
arm_smmu_device *smmu)
/* SID/SSID sizes */
smmu->ssid_bits = FIELD_GET(IDR1_SSIDSIZE, reg);
smmu->sid_bits = FIELD_GET(IDR1_SIDSIZE, reg);
+   smmu->iommu.max_pasids = 1UL << smmu->ssid_bits;
 
/*
 * If the SMMU supports fewer bits than would fill a single L2 stream
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 592c1e1a5d4b..6c33061a 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1123,6 +1123,13 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
 
	raw_spin_lock_init(&iommu->register_lock);
 
+   /*
+* A value of N in PSS field of eCap register indicates hardware
+* supports PASID field of N+1 bits.
+*/
+   if (pasid_supported(iommu))
+   iommu->iommu.max_pasids = 2UL << ecap_pss(iommu->ecap);
+
/*
 * This is only for hotplug; at boot time intel_iommu_enabled won't
 * be set yet. When intel_iommu_init() runs, it registers the units
-- 
2.25.1



[PATCH v8 00/11] iommu: SVA and IOPF refactoring

2022-06-06 Thread Lu Baolu
Hi folks,

The former part of this series refactors the IOMMU SVA code by assigning
an SVA type of iommu_domain to a shared virtual address and replacing
sva_bind/unbind iommu ops with set/block_dev_pasid domain ops.

The latter part changes the existing I/O page fault handling framework
from only serving SVA to a generic one. Any driver or component could
handle the I/O page faults for its domain in its own way by installing
an I/O page fault handler.

This series has been functionally tested on an x86 machine and compile
tested for all architectures.

This series is also available on github:
[2] https://github.com/LuBaolu/intel-iommu/commits/iommu-sva-refactoring-v8

Please review and suggest.

Best regards,
baolu

Change log:
v8:
 - Add support for calculating the max pasids that a device could
   consume.
 - Replace container_of_safe() with container_of.
 - Remove iommu_ops->sva_domain_ops and make sva support through the
   generic domain_alloc/free() interfaces.
 - [Robin] It would be logical to pass IOMMU_DOMAIN_SVA to the normal
   domain_alloc call, so that driver-internal stuff like context
   descriptors can be still be hung off the domain as usual (rather than
   all drivers having to implement some extra internal lookup mechanism 
   to handle all the SVA domain ops).
 - [Robin] I'd just stick the mm pointer in struct iommu_domain, in a
   union with the fault handler stuff those are mutually exclusive with
   SVA.
 - 
https://lore.kernel.org/linux-iommu/f3170016-4d7f-e78e-db48-68305f683...@arm.com/

v7:
 - 
https://lore.kernel.org/linux-iommu/20220519072047.2996983-1-baolu...@linux.intel.com/
 - Remove duplicate array for sva domain.
 - Rename detach_dev_pasid to block_dev_pasid.
 - Add raw device driver interfaces for iommufd.
 - Other misc refinements and patch reorganization.
 - Drop "dmaengine: idxd: Separate user and kernel pasid enabling" which
   has been picked for dmaengine tree.

v6:
 - 
https://lore.kernel.org/linux-iommu/20220510061738.2761430-1-baolu...@linux.intel.com/
 - Refine the SVA basic data structures.
   Link: https://lore.kernel.org/linux-iommu/YnFv0ps0Ad8v+7uH@myrica/
 - Refine arm smmuv3 sva domain allocation.
 - Fix a possible lock issue.
   Link: https://lore.kernel.org/linux-iommu/YnFydE8j8l7Q4m+b@myrica/

v5:
 - 
https://lore.kernel.org/linux-iommu/20220502014842.991097-1-baolu...@linux.intel.com/
 - Address review comments from Jean-Philippe Brucker. Very appreciated!
 - Remove redundant pci aliases check in
   device_group_immutable_singleton().
 - Treat all buses except PCI as static in immutable singleton check.
 - As the sva_bind/unbind() have already guaranteed sva domain free only
   after iopf_queue_flush_dev(), remove the unnecessary domain refcount.
 - Move domain get() out of the list iteration in iopf_handle_group().

v4:
 - 
https://lore.kernel.org/linux-iommu/20220421052121.3464100-1-baolu...@linux.intel.com/
 - Solve the overlap with another series and make this series
   self-contained.
 - No objection to the abstraction of data structure during v3 review.
   Hence remove the RFC subject prefix.
 - Refine the immutable singleton group code according to Kevin's
   comments.

v3:
 - 
https://lore.kernel.org/linux-iommu/20220410102443.294128-1-baolu...@linux.intel.com/
 - Rework iommu_group_singleton_lockdown() by adding a flag to the group
   that positively indicates the group can never have more than one
   member, even after hot plug.
 - Abstract the data structs used for iommu sva in a separated patches to
   make it easier for review.
 - I still keep the RFC prefix in this series as above two significant
   changes need at least another round review to be finalized.
 - Several misc refinements.

v2:
 - 
https://lore.kernel.org/linux-iommu/20220329053800.3049561-1-baolu...@linux.intel.com/
 - Add sva domain life cycle management to avoid race between unbind and
   page fault handling.
 - Use a single domain for each mm.
 - Return a single sva handler for the same binding.
 - Add a new helper to meet singleton group requirement.
 - Rework the SVA domain allocation for arm smmu v3 driver and move the
   pasid_bit initialization to device probe.
 - Drop the patch "iommu: Handle IO page faults directly".
 - Add mmget_not_zero(mm) in SVA page fault handler.

v1:
 - 
https://lore.kernel.org/linux-iommu/20220320064030.2936936-1-baolu...@linux.intel.com/
 - Initial post.

Lu Baolu (11):
  iommu: Add max_pasids field in struct iommu_device
  iommu: Add max_pasids field in struct dev_iommu
  iommu: Remove SVM_FLAG_SUPERVISOR_MODE support
  iommu: Add sva iommu_domain support
  iommu/vt-d: Add SVA domain support
  arm-smmu-v3/sva: Add SVA domain support
  iommu/sva: Refactoring iommu_sva_bind/unbind_device()
  iommu: Remove SVA related callbacks from iommu ops
  iommu: Prepare IOMMU domain for IOPF
  iommu: Per-domain I/O page fault handling
  iommu: Rename iommu-sva-lib.{c,h}

 include/linux/intel-iommu.h   |  12 +-
 

[PATCH v8 02/11] iommu: Add max_pasids field in struct dev_iommu

2022-06-06 Thread Lu Baolu
Use this field to save the number of PASIDs that a device is able to
consume. It is a generic attribute of a device and lifting it into the
per-device dev_iommu struct could help to avoid the boilerplate code
in various IOMMU drivers.

Signed-off-by: Lu Baolu 
---
 include/linux/iommu.h |  2 ++
 drivers/iommu/iommu.c | 26 ++
 2 files changed, 28 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 03fbb1b71536..d50afb2c9a09 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -364,6 +364,7 @@ struct iommu_fault_param {
  * @fwspec: IOMMU fwspec data
  * @iommu_dev:  IOMMU device this device is linked to
  * @priv:   IOMMU Driver private data
+ * @max_pasids:  number of PASIDs device can consume
  *
  * TODO: migrate other per device data pointers under iommu_dev_data, e.g.
  * struct iommu_group  *iommu_group;
@@ -375,6 +376,7 @@ struct dev_iommu {
struct iommu_fwspec *fwspec;
struct iommu_device *iommu_dev;
void*priv;
+   u32 max_pasids;
 };
 
 int iommu_device_register(struct iommu_device *iommu,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 847ad47a2dfd..adac85ccde73 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -218,6 +219,30 @@ static void dev_iommu_free(struct device *dev)
kfree(param);
 }
 
+static u32 dev_iommu_get_max_pasids(struct device *dev)
+{
+   u32 max_pasids = dev->iommu->iommu_dev->max_pasids;
+   u32 num_bits;
+   int ret;
+
+   if (!max_pasids)
+   return 0;
+
+   if (dev_is_pci(dev)) {
+   ret = pci_max_pasids(to_pci_dev(dev));
+   if (ret < 0)
+   return 0;
+
+   return min_t(u32, max_pasids, ret);
+   }
+
+	ret = device_property_read_u32(dev, "pasid-num-bits", &num_bits);
+   if (ret)
+   return 0;
+
+   return min_t(u32, max_pasids, 1UL << num_bits);
+}
+
 static int __iommu_probe_device(struct device *dev, struct list_head 
*group_list)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
@@ -243,6 +268,7 @@ static int __iommu_probe_device(struct device *dev, struct 
list_head *group_list
}
 
dev->iommu->iommu_dev = iommu_dev;
+   dev->iommu->max_pasids = dev_iommu_get_max_pasids(dev);
 
group = iommu_group_get_for_dev(dev);
if (IS_ERR(group)) {
-- 
2.25.1



[PATCH v8 03/11] iommu: Remove SVM_FLAG_SUPERVISOR_MODE support

2022-06-06 Thread Lu Baolu
The current kernel DMA with PASID support is based on SVA with a flag
SVM_FLAG_SUPERVISOR_MODE. The IOMMU driver binds the kernel memory address
space to a PASID of the device. The device driver programs the device with
kernel virtual address (KVA) for DMA access. There have been security and
functional issues with this approach:

- The lack of IOTLB synchronization upon kernel page table updates.
  (vmalloc, module/BPF loading, CONFIG_DEBUG_PAGEALLOC etc.)
- Other than slightly more protection, using kernel virtual addresses (KVA)
  has little advantage over physical address. There are also no use
  cases yet where DMA engines need kernel virtual addresses for in-kernel
  DMA.

This removes SVM_FLAG_SUPERVISOR_MODE support from the IOMMU interface.
The device drivers are suggested to handle kernel DMA with PASID through
the kernel DMA APIs.

The drvdata parameter in iommu_sva_bind_device() and all callbacks is no
longer needed. Clean it up as well.
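
For illustration, handling in-kernel DMA through the DMA API rather than a
supervisor PASID looks roughly like the sketch below (function name and
surrounding driver context are illustrative):

#include <linux/device.h>
#include <linux/dma-mapping.h>

static int submit_kernel_buffer(struct device *dev, void *buf, size_t len)
{
	dma_addr_t dma;

	dma = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, dma))
		return -ENOMEM;

	/* Program the device with 'dma' instead of a kernel virtual address. */

	/* ... and once the transfer has completed: */
	dma_unmap_single(dev, dma, len, DMA_TO_DEVICE);
	return 0;
}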

Link: https://lore.kernel.org/linux-iommu/20210511194726.gp1002...@nvidia.com/
Signed-off-by: Jacob Pan 
Signed-off-by: Lu Baolu 
Reviewed-by: Jason Gunthorpe 
Reviewed-by: Jean-Philippe Brucker 
Reviewed-by: Kevin Tian 
---
 include/linux/intel-iommu.h   |  3 +-
 include/linux/intel-svm.h | 13 -
 include/linux/iommu.h |  8 +--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  5 +-
 drivers/dma/idxd/cdev.c   |  2 +-
 drivers/dma/idxd/init.c   | 24 +---
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  3 +-
 drivers/iommu/intel/svm.c | 57 +--
 drivers/iommu/iommu.c |  5 +-
 drivers/misc/uacce/uacce.c|  2 +-
 10 files changed, 26 insertions(+), 96 deletions(-)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index e065cbe3c857..31e3edc0fc7e 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -738,8 +738,7 @@ struct intel_iommu *device_to_iommu(struct device *dev, u8 
*bus, u8 *devfn);
 extern void intel_svm_check(struct intel_iommu *iommu);
 extern int intel_svm_enable_prq(struct intel_iommu *iommu);
 extern int intel_svm_finish_prq(struct intel_iommu *iommu);
-struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm,
-void *drvdata);
+struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm);
 void intel_svm_unbind(struct iommu_sva *handle);
 u32 intel_svm_get_pasid(struct iommu_sva *handle);
 int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
index 207ef06ba3e1..f9a0d44f6fdb 100644
--- a/include/linux/intel-svm.h
+++ b/include/linux/intel-svm.h
@@ -13,17 +13,4 @@
 #define PRQ_RING_MASK  ((0x1000 << PRQ_ORDER) - 0x20)
 #define PRQ_DEPTH  ((0x1000 << PRQ_ORDER) >> 5)
 
-/*
- * The SVM_FLAG_SUPERVISOR_MODE flag requests a PASID which can be used only
- * for access to kernel addresses. No IOTLB flushes are automatically done
- * for kernel mappings; it is valid only for access to the kernel's static
- * 1:1 mapping of physical memory — not to vmalloc or even module mappings.
- * A future API addition may permit the use of such ranges, by means of an
- * explicit IOTLB flush call (akin to the DMA API's unmap method).
- *
- * It is unlikely that we will ever hook into flush_tlb_kernel_range() to
- * do such IOTLB flushes automatically.
- */
-#define SVM_FLAG_SUPERVISOR_MODE   BIT(0)
-
 #endif /* __INTEL_SVM_H__ */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d50afb2c9a09..3fbad42c0bf8 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -243,8 +243,7 @@ struct iommu_ops {
int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
 
-   struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
- void *drvdata);
+   struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm);
void (*sva_unbind)(struct iommu_sva *handle);
u32 (*sva_get_pasid)(struct iommu_sva *handle);
 
@@ -669,8 +668,7 @@ int iommu_dev_disable_feature(struct device *dev, enum 
iommu_dev_features f);
 bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features f);
 
 struct iommu_sva *iommu_sva_bind_device(struct device *dev,
-   struct mm_struct *mm,
-   void *drvdata);
+   struct mm_struct *mm);
 void iommu_sva_unbind_device(struct iommu_sva *handle);
 u32 iommu_sva_get_pasid(struct iommu_sva *handle);
 
@@ -1012,7 +1010,7 @@ iommu_dev_disable_feature(struct device *dev, enum 
iommu_dev_features feat)
 }
 
 static inline struct iommu_sva *

[PATCH v8 04/11] iommu: Add sva iommu_domain support

2022-06-06 Thread Lu Baolu
The sva iommu_domain represents a hardware pagetable that the IOMMU
hardware could use for SVA translation. This adds some infrastructure
to support SVA domains in the iommu common layer. It includes:

- Extend the iommu_domain to support a new IOMMU_DOMAIN_SVA domain
  type. The IOMMU drivers that support SVA should provide the sva
  domain specific iommu_domain_ops.
- Add a helper to allocate an SVA domain. The iommu_domain_free()
  is still used to free an SVA domain.
- Add helpers to attach an SVA domain to a device and the reverse
  operation.

Some buses, like PCI, route packets without considering the PASID value.
Thus a DMA target address with PASID might be treated as P2P if the
address falls into the MMIO BAR of other devices in the group. To make
things simple, the attach/detach interfaces only apply to devices
belonging to singleton groups, and the singleton is immutable in
fabric, i.e. not affected by hotplug.

The iommu_attach/detach_device_pasid() interfaces can be used for other
purposes, such as kernel DMA with PASID, mediated devices, etc.
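
A hedged sketch of the intended call flow, using only the helpers added by
this patch (error handling is illustrative and example_sva_attach() is not a
real function):

static int example_sva_attach(struct device *dev, struct mm_struct *mm,
			      ioasid_t pasid)
{
	struct iommu_domain *domain;
	int ret;

	domain = iommu_sva_domain_alloc(dev, mm);
	if (!domain)
		return -ENOMEM;

	ret = iommu_attach_device_pasid(domain, dev, pasid);
	if (ret) {
		iommu_domain_free(domain);
		return ret;
	}

	/* ... issue DMA tagged with @pasid ... */

	iommu_detach_device_pasid(domain, dev, pasid);
	iommu_domain_free(domain);
	return 0;
}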

Suggested-by: Jean-Philippe Brucker 
Suggested-by: Jason Gunthorpe 
Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
---
 include/linux/iommu.h | 45 -
 drivers/iommu/iommu.c | 93 +++
 2 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 3fbad42c0bf8..9173c5741447 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -64,6 +64,9 @@ struct iommu_domain_geometry {
 #define __IOMMU_DOMAIN_PT  (1U << 2)  /* Domain is identity mapped   */
 #define __IOMMU_DOMAIN_DMA_FQ  (1U << 3)  /* DMA-API uses flush queue*/
 
+#define __IOMMU_DOMAIN_SHARED  (1U << 4)  /* Page table shared from CPU  */
+#define __IOMMU_DOMAIN_HOST_VA (1U << 5)  /* Host CPU virtual address */
+
 /*
  * This are the possible domain-types
  *
@@ -86,15 +89,24 @@ struct iommu_domain_geometry {
 #define IOMMU_DOMAIN_DMA_FQ(__IOMMU_DOMAIN_PAGING |\
 __IOMMU_DOMAIN_DMA_API |   \
 __IOMMU_DOMAIN_DMA_FQ)
+#define IOMMU_DOMAIN_SVA   (__IOMMU_DOMAIN_SHARED |\
+__IOMMU_DOMAIN_HOST_VA)
 
 struct iommu_domain {
unsigned type;
const struct iommu_domain_ops *ops;
unsigned long pgsize_bitmap;/* Bitmap of page sizes in use */
-   iommu_fault_handler_t handler;
-   void *handler_token;
struct iommu_domain_geometry geometry;
struct iommu_dma_cookie *iova_cookie;
+   union {
+   struct {/* IOMMU_DOMAIN_DMA */
+   iommu_fault_handler_t handler;
+   void *handler_token;
+   };
+   struct {/* IOMMU_DOMAIN_SVA */
+   struct mm_struct *mm;
+   };
+   };
 };
 
 static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
@@ -262,6 +274,8 @@ struct iommu_ops {
  * struct iommu_domain_ops - domain specific operations
  * @attach_dev: attach an iommu domain to a device
  * @detach_dev: detach an iommu domain from a device
+ * @set_dev_pasid: set an iommu domain to a pasid of device
+ * @block_dev_pasid: block pasid of device from using iommu domain
  * @map: map a physically contiguous memory region to an iommu domain
  * @map_pages: map a physically contiguous set of pages of the same size to
  * an iommu domain.
@@ -282,6 +296,10 @@ struct iommu_ops {
 struct iommu_domain_ops {
int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
+   int (*set_dev_pasid)(struct iommu_domain *domain, struct device *dev,
+ioasid_t pasid);
+   void (*block_dev_pasid)(struct iommu_domain *domain, struct device *dev,
+   ioasid_t pasid);
 
int (*map)(struct iommu_domain *domain, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
@@ -679,6 +697,12 @@ int iommu_group_claim_dma_owner(struct iommu_group *group, 
void *owner);
 void iommu_group_release_dma_owner(struct iommu_group *group);
 bool iommu_group_dma_owner_claimed(struct iommu_group *group);
 
+struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
+   struct mm_struct *mm);
+int iommu_attach_device_pasid(struct iommu_domain *domain, struct device *dev,
+ ioasid_t pasid);
+void iommu_detach_device_pasid(struct iommu_domain *domain, struct device *dev,
+  ioasid_t pasid);
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -1052,6 +1076,23 @@ static inline bool iommu_group_dma_owner_claimed(struct 
iommu_group *group)
 {
return false;
 }
+
+static inline struct iommu_domain *

[PATCH v8 05/11] iommu/vt-d: Add SVA domain support

2022-06-06 Thread Lu Baolu
Add support for SVA domain allocation and provide an SVA-specific
iommu_domain_ops.

Signed-off-by: Lu Baolu 
---
 include/linux/intel-iommu.h |  5 
 drivers/iommu/intel/iommu.c |  2 ++
 drivers/iommu/intel/svm.c   | 49 +
 3 files changed, 56 insertions(+)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 31e3edc0fc7e..9007428a68f1 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -743,6 +743,7 @@ void intel_svm_unbind(struct iommu_sva *handle);
 u32 intel_svm_get_pasid(struct iommu_sva *handle);
 int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
struct iommu_page_response *msg);
+struct iommu_domain *intel_svm_domain_alloc(void);
 
 struct intel_svm_dev {
struct list_head list;
@@ -768,6 +769,10 @@ struct intel_svm {
 };
 #else
 static inline void intel_svm_check(struct intel_iommu *iommu) {}
+static inline struct iommu_domain *intel_svm_domain_alloc(void)
+{
+   return NULL;
+}
 #endif
 
 #ifdef CONFIG_INTEL_IOMMU_DEBUGFS
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 44016594831d..993a1ce509a8 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4298,6 +4298,8 @@ static struct iommu_domain 
*intel_iommu_domain_alloc(unsigned type)
return domain;
case IOMMU_DOMAIN_IDENTITY:
 return &si_domain->domain;
+   case IOMMU_DOMAIN_SVA:
+   return intel_svm_domain_alloc();
default:
return NULL;
}
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index d04880a291c3..6def2fdcc862 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -931,3 +931,52 @@ int intel_svm_page_response(struct device *dev,
 mutex_unlock(&pasid_mutex);
return ret;
 }
+
+static int intel_svm_attach_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid)
+{
+   struct device_domain_info *info = dev_iommu_priv_get(dev);
+   struct intel_iommu *iommu = info->iommu;
+   struct mm_struct *mm = domain->mm;
+   struct iommu_sva *sva;
+   int ret = 0;
+
+   mutex_lock(&pasid_mutex);
+   sva = intel_svm_bind_mm(iommu, dev, mm);
+   if (IS_ERR(sva))
+   ret = PTR_ERR(sva);
+   mutex_unlock(&pasid_mutex);
+
+   return ret;
+}
+
+static void intel_svm_detach_dev_pasid(struct iommu_domain *domain,
+  struct device *dev, ioasid_t pasid)
+{
+   mutex_lock(&pasid_mutex);
+   intel_svm_unbind_mm(dev, pasid);
+   mutex_unlock(&pasid_mutex);
+}
+
+static void intel_svm_domain_free(struct iommu_domain *domain)
+{
+   kfree(to_dmar_domain(domain));
+}
+
+static const struct iommu_domain_ops intel_svm_domain_ops = {
+   .set_dev_pasid  = intel_svm_attach_dev_pasid,
+   .block_dev_pasid= intel_svm_detach_dev_pasid,
+   .free   = intel_svm_domain_free,
+};
+
+struct iommu_domain *intel_svm_domain_alloc(void)
+{
+   struct dmar_domain *domain;
+
+   domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+   if (!domain)
+   return NULL;
+   domain->domain.ops = &intel_svm_domain_ops;
+
+   return &domain->domain;
+}
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group

2022-06-06 Thread Baolu Lu

On 2022/6/6 14:19, Nicolin Chen wrote:

+/**
+ * iommu_attach_group - Attach an IOMMU group to an IOMMU domain
+ * @domain: IOMMU domain to attach
+ * @dev: IOMMU group that will be attached


Nit: @group: ...


+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Specifically, -EMEDIUMTYPE is returned if the domain and the group are
+ * incompatible in some way. This indicates that a caller should try another
+ * existing IOMMU domain or allocate a new one.
+ */
  int iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group)
  {
int ret;


Best regards,
baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v8 08/11] iommu: Remove SVA related callbacks from iommu ops

2022-06-06 Thread Lu Baolu
These ops have been replaced with the set/block_dev_pasid domain ops.
There's no need for them anymore. Remove them to avoid dead code.

Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
Reviewed-by: Kevin Tian 
---
 include/linux/intel-iommu.h   |  3 --
 include/linux/iommu.h |  7 ---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   | 16 --
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 40 ---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  3 --
 drivers/iommu/intel/iommu.c   |  3 --
 drivers/iommu/intel/svm.c | 49 ---
 7 files changed, 121 deletions(-)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 9007428a68f1..5bd19c95a926 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -738,9 +738,6 @@ struct intel_iommu *device_to_iommu(struct device *dev, u8 
*bus, u8 *devfn);
 extern void intel_svm_check(struct intel_iommu *iommu);
 extern int intel_svm_enable_prq(struct intel_iommu *iommu);
 extern int intel_svm_finish_prq(struct intel_iommu *iommu);
-struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm);
-void intel_svm_unbind(struct iommu_sva *handle);
-u32 intel_svm_get_pasid(struct iommu_sva *handle);
 int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
struct iommu_page_response *msg);
 struct iommu_domain *intel_svm_domain_alloc(void);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 04d3da546eee..fcdde6dd28c9 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -227,9 +227,6 @@ struct iommu_iotlb_gather {
  * @dev_has/enable/disable_feat: per device entries to check/enable/disable
  *   iommu specific features.
  * @dev_feat_enabled: check enabled feature
- * @sva_bind: Bind process address space to device
- * @sva_unbind: Unbind process address space from device
- * @sva_get_pasid: Get PASID associated to a SVA handle
  * @page_response: handle page request response
  * @def_domain_type: device default domain type, return value:
  * - IOMMU_DOMAIN_IDENTITY: must use an identity domain
@@ -263,10 +260,6 @@ struct iommu_ops {
int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
 
-   struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm);
-   void (*sva_unbind)(struct iommu_sva *handle);
-   u32 (*sva_get_pasid)(struct iommu_sva *handle);
-
int (*page_response)(struct device *dev,
 struct iommu_fault_event *evt,
 struct iommu_page_response *msg);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 96399dd3a67a..15dd4c7e6d3a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -754,9 +754,6 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master 
*master);
 int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
 int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
 bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
-struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm);
-void arm_smmu_sva_unbind(struct iommu_sva *handle);
-u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
 void arm_smmu_sva_notifier_synchronize(void);
 struct iommu_domain *arm_smmu_sva_domain_alloc(void);
 #else /* CONFIG_ARM_SMMU_V3_SVA */
@@ -790,19 +787,6 @@ static inline bool arm_smmu_master_iopf_supported(struct 
arm_smmu_master *master
return false;
 }
 
-static inline struct iommu_sva *
-arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
-{
-   return ERR_PTR(-ENODEV);
-}
-
-static inline void arm_smmu_sva_unbind(struct iommu_sva *handle) {}
-
-static inline u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle)
-{
-   return IOMMU_PASID_INVALID;
-}
-
 static inline void arm_smmu_sva_notifier_synchronize(void) {}
 
 static inline struct iommu_domain *arm_smmu_sva_domain_alloc(void)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 0af86e31e674..05973ada63c2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -344,11 +344,6 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct 
*mm)
if (!bond)
return ERR_PTR(-ENOMEM);
 
-   /* Allocate a PASID for this mm if necessary */
-   ret = iommu_sva_alloc_pasid(mm, 1, (1U << master->ssid_bits) - 1);
-   if (ret)
-   goto err_free_bond;
-
bond->mm = mm;
bond->sva.dev = dev;
 refcount_set(&bond->refs, 1);
@@ -367,41 +362,6 @@ __arm_smmu_sva_bind(struct device *dev, 

[PATCH v8 09/11] iommu: Prepare IOMMU domain for IOPF

2022-06-06 Thread Lu Baolu
This adds some mechanisms around the iommu_domain so that the I/O page
fault handling framework could route a page fault to the domain and
call the fault handler from it.

Add pointers to the page fault handler and its private data in struct
iommu_domain. The fault handler will be called with the private data
as a parameter once a page fault is routed to the domain. Any kernel
component which owns an iommu domain could install handler and its
private parameter so that the page fault could be further routed and
handled.

This also prepares the SVA implementation to be the first consumer of
the per-domain page fault handling model. The I/O page fault handler
for SVA is copied to the SVA file with mmget_not_zero() added before
mmap_read_lock().
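
Illustrative sketch only: this is roughly how the owner of an iommu domain is
expected to use the two new fields. The SVA code in this series does the
equivalent internally, pointing iopf_handler at iommu_sva_handle_iopf() and
passing the domain itself as the fault data; example_install_iopf_handler()
is not a real function:

static void example_install_iopf_handler(struct iommu_domain *domain)
{
	domain->iopf_handler = iommu_sva_handle_iopf;
	domain->fault_data = domain;	/* handed back as @data on each fault */
}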

Suggested-by: Jean-Philippe Brucker 
Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
---
 include/linux/iommu.h |  3 ++
 drivers/iommu/iommu-sva-lib.h |  8 +
 drivers/iommu/io-pgfault.c|  7 
 drivers/iommu/iommu-sva-lib.c | 60 +++
 drivers/iommu/iommu.c |  4 +++
 5 files changed, 82 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fcdde6dd28c9..ee521f91ed21 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -105,6 +105,9 @@ struct iommu_domain {
unsigned long pgsize_bitmap;/* Bitmap of page sizes in use */
struct iommu_domain_geometry geometry;
struct iommu_dma_cookie *iova_cookie;
+   enum iommu_page_response_code (*iopf_handler)(struct iommu_fault *fault,
+ void *data);
+   void *fault_data;
union {
struct {/* IOMMU_DOMAIN_DMA */
iommu_fault_handler_t handler;
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
index 8909ea1094e3..1b3ace4b5863 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -26,6 +26,8 @@ int iopf_queue_flush_dev(struct device *dev);
 struct iopf_queue *iopf_queue_alloc(const char *name);
 void iopf_queue_free(struct iopf_queue *queue);
 int iopf_queue_discard_partial(struct iopf_queue *queue);
+enum iommu_page_response_code
+iommu_sva_handle_iopf(struct iommu_fault *fault, void *data);
 
 #else /* CONFIG_IOMMU_SVA */
 static inline int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
@@ -63,5 +65,11 @@ static inline int iopf_queue_discard_partial(struct 
iopf_queue *queue)
 {
return -ENODEV;
 }
+
+static inline enum iommu_page_response_code
+iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
+{
+   return IOMMU_PAGE_RESP_INVALID;
+}
 #endif /* CONFIG_IOMMU_SVA */
 #endif /* _IOMMU_SVA_LIB_H */
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index 1df8c1dcae77..aee9e033012f 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -181,6 +181,13 @@ static void iopf_handle_group(struct work_struct *work)
  * request completes, outstanding faults will have been dealt with by the time
  * the PASID is freed.
  *
+ * Any valid page fault will be eventually routed to an iommu domain and the
+ * page fault handler installed there will get called. The users of this
+ * handling framework should guarantee that the iommu domain could only be
+ * freed after the device has stopped generating page faults (or the iommu
+ * hardware has been set to block the page faults) and the pending page faults
+ * have been flushed.
+ *
  * Return: 0 on success and <0 on error.
  */
 int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
index 1e3e2b395b1e..dee8e2e42e06 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva-lib.c
@@ -167,3 +167,63 @@ u32 iommu_sva_get_pasid(struct iommu_sva *handle)
return domain->mm->pasid;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+/*
+ * I/O page fault handler for SVA
+ */
+enum iommu_page_response_code
+iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
+{
+   vm_fault_t ret;
+   struct mm_struct *mm;
+   struct vm_area_struct *vma;
+   unsigned int access_flags = 0;
+   struct iommu_domain *domain = data;
+   unsigned int fault_flags = FAULT_FLAG_REMOTE;
+   struct iommu_fault_page_request *prm = &fault->prm;
+   enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
+
+   if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
+   return status;
+
+   mm = domain->mm;
+   if (IS_ERR_OR_NULL(mm) || !mmget_not_zero(mm))
+   return status;
+
+   mmap_read_lock(mm);
+
+   vma = find_extend_vma(mm, prm->addr);
+   if (!vma)
+   /* Unmapped area */
+   goto out_put_mm;
+
+   if (prm->perm & IOMMU_FAULT_PERM_READ)
+   access_flags |= VM_READ;
+
+   if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
+   

[PATCH v8 07/11] iommu/sva: Refactoring iommu_sva_bind/unbind_device()

2022-06-06 Thread Lu Baolu
The existing iommu SVA interfaces are implemented by calling the SVA
specific iommu ops provided by the IOMMU drivers. There's no need for
any SVA specific ops in iommu_ops vector anymore as we can achieve
this through the generic attach/detach_dev_pasid domain ops.

This refactors the IOMMU SVA interface implementation to use the
set/block_dev_pasid domain ops and aligns it with the concept of the
SVA iommu domain. Put the new SVA code in the SVA-related file to
make it self-contained.
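
Rough shape of the refactored iommu_sva_bind_device(), simplified for
illustration: the iommu_sva_lock, the PASID allocation and most error paths
are omitted, and example_sva_bind() is not the literal implementation:

struct iommu_sva *example_sva_bind(struct device *dev, struct mm_struct *mm)
{
	struct iommu_domain *domain;
	int ret;

	/* Reuse an SVA domain already attached for this {dev, PASID}. */
	domain = iommu_get_domain_for_dev_pasid(dev, mm->pasid);
	if (domain) {
		refcount_inc(&domain->bond.users);
		return &domain->bond;
	}

	/* Otherwise allocate a new SVA domain and attach it with the PASID. */
	domain = iommu_sva_domain_alloc(dev, mm);
	if (!domain)
		return ERR_PTR(-ENOMEM);

	ret = iommu_attach_device_pasid(domain, dev, mm->pasid);
	if (ret) {
		iommu_domain_free(domain);
		return ERR_PTR(ret);
	}

	domain->bond.dev = dev;
	refcount_set(&domain->bond.users, 1);
	return &domain->bond;
}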

Signed-off-by: Lu Baolu 
---
 include/linux/iommu.h |  67 +++
 drivers/iommu/iommu-sva-lib.c |  98 
 drivers/iommu/iommu.c | 119 --
 3 files changed, 165 insertions(+), 119 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 9173c5741447..04d3da546eee 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -39,7 +39,6 @@ struct device;
 struct iommu_domain;
 struct iommu_domain_ops;
 struct notifier_block;
-struct iommu_sva;
 struct iommu_fault_event;
 struct iommu_dma_cookie;
 
@@ -57,6 +56,14 @@ struct iommu_domain_geometry {
bool force_aperture;   /* DMA only allowed in mappable range? */
 };
 
+/**
+ * struct iommu_sva - handle to a device-mm bond
+ */
+struct iommu_sva {
+   struct device   *dev;
+   refcount_t  users;
+};
+
 /* Domain feature flags */
 #define __IOMMU_DOMAIN_PAGING  (1U << 0)  /* Support for iommu_map/unmap */
 #define __IOMMU_DOMAIN_DMA_API (1U << 1)  /* Domain for use in DMA-API
@@ -105,6 +112,7 @@ struct iommu_domain {
};
struct {/* IOMMU_DOMAIN_SVA */
struct mm_struct *mm;
+   struct iommu_sva bond;
};
};
 };
@@ -638,13 +646,6 @@ struct iommu_fwspec {
 /* ATS is supported */
 #define IOMMU_FWSPEC_PCI_RC_ATS(1 << 0)
 
-/**
- * struct iommu_sva - handle to a device-mm bond
- */
-struct iommu_sva {
-   struct device   *dev;
-};
-
 int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
  const struct iommu_ops *ops);
 void iommu_fwspec_free(struct device *dev);
@@ -685,11 +686,6 @@ int iommu_dev_enable_feature(struct device *dev, enum 
iommu_dev_features f);
 int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features f);
 bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features f);
 
-struct iommu_sva *iommu_sva_bind_device(struct device *dev,
-   struct mm_struct *mm);
-void iommu_sva_unbind_device(struct iommu_sva *handle);
-u32 iommu_sva_get_pasid(struct iommu_sva *handle);
-
 int iommu_device_use_default_domain(struct device *dev);
 void iommu_device_unuse_default_domain(struct device *dev);
 
@@ -703,6 +699,8 @@ int iommu_attach_device_pasid(struct iommu_domain *domain, 
struct device *dev,
  ioasid_t pasid);
 void iommu_detach_device_pasid(struct iommu_domain *domain, struct device *dev,
   ioasid_t pasid);
+struct iommu_domain *
+iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid);
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -1033,21 +1031,6 @@ iommu_dev_disable_feature(struct device *dev, enum 
iommu_dev_features feat)
return -ENODEV;
 }
 
-static inline struct iommu_sva *
-iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
-{
-   return NULL;
-}
-
-static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
-{
-}
-
-static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
-{
-   return IOMMU_PASID_INVALID;
-}
-
 static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
 {
return NULL;
@@ -1093,6 +1076,12 @@ static inline void iommu_detach_device_pasid(struct 
iommu_domain *domain,
 struct device *dev, ioasid_t pasid)
 {
 }
+
+static inline struct iommu_domain *
+iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
+{
+   return NULL;
+}
 #endif /* CONFIG_IOMMU_API */
 
 /**
@@ -1118,4 +1107,26 @@ void iommu_debugfs_setup(void);
 static inline void iommu_debugfs_setup(void) {}
 #endif
 
+#ifdef CONFIG_IOMMU_SVA
+struct iommu_sva *iommu_sva_bind_device(struct device *dev,
+   struct mm_struct *mm);
+void iommu_sva_unbind_device(struct iommu_sva *handle);
+u32 iommu_sva_get_pasid(struct iommu_sva *handle);
+#else
+static inline struct iommu_sva *
+iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
+{
+   return NULL;
+}
+
+static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
+{
+}
+
+static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
+{
+   return IOMMU_PASID_INVALID;
+}
+#endif /* CONFIG_IOMMU_SVA */
+
 #endif /* __LINUX_IOMMU_H */
diff --git 

[PATCH v8 11/11] iommu: Rename iommu-sva-lib.{c,h}

2022-06-06 Thread Lu Baolu
Rename iommu-sva-lib.c[h] to iommu-sva.c[h] as it contains all the code
for the SVA implementation in the iommu core.

Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
---
 drivers/iommu/{iommu-sva-lib.h => iommu-sva.h}  | 6 +++---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +-
 drivers/iommu/intel/iommu.c | 2 +-
 drivers/iommu/intel/svm.c   | 2 +-
 drivers/iommu/io-pgfault.c  | 2 +-
 drivers/iommu/{iommu-sva-lib.c => iommu-sva.c}  | 2 +-
 drivers/iommu/iommu.c   | 2 +-
 drivers/iommu/Makefile  | 2 +-
 9 files changed, 11 insertions(+), 11 deletions(-)
 rename drivers/iommu/{iommu-sva-lib.h => iommu-sva.h} (95%)
 rename drivers/iommu/{iommu-sva-lib.c => iommu-sva.c} (99%)

diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva.h
similarity index 95%
rename from drivers/iommu/iommu-sva-lib.h
rename to drivers/iommu/iommu-sva.h
index 1b3ace4b5863..7215a761b962 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva.h
@@ -2,8 +2,8 @@
 /*
  * SVA library for IOMMU drivers
  */
-#ifndef _IOMMU_SVA_LIB_H
-#define _IOMMU_SVA_LIB_H
+#ifndef _IOMMU_SVA_H
+#define _IOMMU_SVA_H
 
 #include 
 #include 
@@ -72,4 +72,4 @@ iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
return IOMMU_PAGE_RESP_INVALID;
 }
 #endif /* CONFIG_IOMMU_SVA */
-#endif /* _IOMMU_SVA_LIB_H */
+#endif /* _IOMMU_SVA_H */
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 05973ada63c2..c64845a626b1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -10,7 +10,7 @@
 #include 
 
 #include "arm-smmu-v3.h"
-#include "../../iommu-sva-lib.h"
+#include "../../iommu-sva.h"
 #include "../../io-pgtable-arm.h"
 
 struct arm_smmu_mmu_notifier {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8b9b78c7a67d..79e8991e9181 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -31,7 +31,7 @@
 #include 
 
 #include "arm-smmu-v3.h"
-#include "../../iommu-sva-lib.h"
+#include "../../iommu-sva.h"
 
 static bool disable_bypass = true;
 module_param(disable_bypass, bool, 0444);
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 37d68eda1889..d16ab6d1cc05 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -27,7 +27,7 @@
 #include 
 
 #include "../irq_remapping.h"
-#include "../iommu-sva-lib.h"
+#include "../iommu-sva.h"
 #include "pasid.h"
 #include "cap_audit.h"
 
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 1fd019043e71..ef1e132c69f8 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -25,7 +25,7 @@
 
 #include "pasid.h"
 #include "perf.h"
-#include "../iommu-sva-lib.h"
+#include "../iommu-sva.h"
 
 static irqreturn_t prq_event_thread(int irq, void *d);
 static void intel_svm_drain_prq(struct device *dev, u32 pasid);
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index 4f24ec703479..91b1c6bd01d4 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 
-#include "iommu-sva-lib.h"
+#include "iommu-sva.h"
 
 /**
  * struct iopf_queue - IO Page Fault queue
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva.c
similarity index 99%
rename from drivers/iommu/iommu-sva-lib.c
rename to drivers/iommu/iommu-sva.c
index dee8e2e42e06..1a4897a5697b 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva.c
@@ -6,7 +6,7 @@
 #include 
 #include 
 
-#include "iommu-sva-lib.h"
+#include "iommu-sva.h"
 
 static DEFINE_MUTEX(iommu_sva_lock);
 static DECLARE_IOASID_SET(iommu_sva_pasid);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1f766006ed21..4f8c2ddde7ab 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -29,7 +29,7 @@
 #include 
 #include 
 
-#include "iommu-sva-lib.h"
+#include "iommu-sva.h"
 
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 44475a9b3eea..c1763476162b 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,6 +27,6 @@ obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
 obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
-obj-$(CONFIG_IOMMU_SVA) += iommu-sva-lib.o io-pgfault.o
+obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o io-pgfault.o
 obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
 obj-$(CONFIG_APPLE_DART) += apple-dart.o
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org

[PATCH v8 06/11] arm-smmu-v3/sva: Add SVA domain support

2022-06-06 Thread Lu Baolu
Add support for SVA domain allocation and provide an SVA-specific
iommu_domain_ops.

Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  6 ++
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 69 +++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  3 +
 3 files changed, 78 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index d2ba86470c42..96399dd3a67a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -758,6 +758,7 @@ struct iommu_sva *arm_smmu_sva_bind(struct device *dev, 
struct mm_struct *mm);
 void arm_smmu_sva_unbind(struct iommu_sva *handle);
 u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
 void arm_smmu_sva_notifier_synchronize(void);
+struct iommu_domain *arm_smmu_sva_domain_alloc(void);
 #else /* CONFIG_ARM_SMMU_V3_SVA */
 static inline bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
 {
@@ -803,5 +804,10 @@ static inline u32 arm_smmu_sva_get_pasid(struct iommu_sva 
*handle)
 }
 
 static inline void arm_smmu_sva_notifier_synchronize(void) {}
+
+static inline struct iommu_domain *arm_smmu_sva_domain_alloc(void)
+{
+   return NULL;
+}
 #endif /* CONFIG_ARM_SMMU_V3_SVA */
 #endif /* _ARM_SMMU_V3_H */
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index f155d406c5d5..0af86e31e674 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -549,3 +549,72 @@ void arm_smmu_sva_notifier_synchronize(void)
 */
mmu_notifier_synchronize();
 }
+
+static int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
+struct device *dev, ioasid_t id)
+{
+   int ret = 0;
+   struct mm_struct *mm;
+   struct iommu_sva *handle;
+
+   if (domain->type != IOMMU_DOMAIN_SVA)
+   return -EINVAL;
+
+   mm = domain->mm;
+   if (WARN_ON(!mm))
+   return -ENODEV;
+
+   mutex_lock(&sva_lock);
+   handle = __arm_smmu_sva_bind(dev, mm);
+   if (IS_ERR(handle))
+   ret = PTR_ERR(handle);
+   mutex_unlock(&sva_lock);
+
+   return ret;
+}
+
+static void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t id)
+{
+   struct mm_struct *mm = domain->mm;
+   struct arm_smmu_bond *bond = NULL, *t;
+   struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+   mutex_lock(&sva_lock);
+   list_for_each_entry(t, &master->bonds, list) {
+   if (t->mm == mm) {
+   bond = t;
+   break;
+   }
+   }
+
+   if (!WARN_ON(!bond) && refcount_dec_and_test(&bond->refs)) {
+   list_del(&bond->list);
+   arm_smmu_mmu_notifier_put(bond->smmu_mn);
+   kfree(bond);
+   }
+   mutex_unlock(&sva_lock);
+}
+
+static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
+{
+   kfree(domain);
+}
+
+static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
+   .set_dev_pasid  = arm_smmu_sva_attach_dev_pasid,
+   .block_dev_pasid= arm_smmu_sva_detach_dev_pasid,
+   .free   = arm_smmu_sva_domain_free,
+};
+
+struct iommu_domain *arm_smmu_sva_domain_alloc(void)
+{
+   struct iommu_domain *domain;
+
+   domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+   if (!domain)
+   return NULL;
+   domain->ops = &arm_smmu_sva_domain_ops;
+
+   return domain;
+}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index ae8ec8df47c1..a30b252e2f95 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1999,6 +1999,9 @@ static struct iommu_domain 
*arm_smmu_domain_alloc(unsigned type)
 {
struct arm_smmu_domain *smmu_domain;
 
+   if (type == IOMMU_DOMAIN_SVA)
+   return arm_smmu_sva_domain_alloc();
+
if (type != IOMMU_DOMAIN_UNMANAGED &&
type != IOMMU_DOMAIN_DMA &&
type != IOMMU_DOMAIN_DMA_FQ &&
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v8 10/11] iommu: Per-domain I/O page fault handling

2022-06-06 Thread Lu Baolu
Tweak the I/O page fault handling framework to route the page faults to
the domain and call the page fault handler retrieved from the domain.
This makes it possible for the I/O page fault handling framework to serve
more usage scenarios, as long as the user has an IOMMU domain and installs
a page fault handler in it. Some unused functions are also removed to avoid
dead code.

The iommu_get_domain_for_dev_pasid() helper, which retrieves the attached
domain for a {device, PASID} pair, is used by the page fault handling
framework, which knows the {device, PASID} reported by the iommu
driver. We have a guarantee that the SVA domain doesn't go away during
IOPF handling, because unbind() waits for pending faults with
iopf_queue_flush_dev() before freeing the domain. Hence, there's no need
to synchronize life cycle of the iommu domains between the unbind() and
the interrupt threads.

Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
---
 drivers/iommu/io-pgfault.c | 64 +-
 1 file changed, 7 insertions(+), 57 deletions(-)

diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index aee9e033012f..4f24ec703479 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -69,69 +69,18 @@ static int iopf_complete_group(struct device *dev, struct 
iopf_fault *iopf,
 return iommu_page_response(dev, &resp);
 }
 
-static enum iommu_page_response_code
-iopf_handle_single(struct iopf_fault *iopf)
-{
-   vm_fault_t ret;
-   struct mm_struct *mm;
-   struct vm_area_struct *vma;
-   unsigned int access_flags = 0;
-   unsigned int fault_flags = FAULT_FLAG_REMOTE;
-   struct iommu_fault_page_request *prm = &iopf->fault.prm;
-   enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
-
-   if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
-   return status;
-
-   mm = iommu_sva_find(prm->pasid);
-   if (IS_ERR_OR_NULL(mm))
-   return status;
-
-   mmap_read_lock(mm);
-
-   vma = find_extend_vma(mm, prm->addr);
-   if (!vma)
-   /* Unmapped area */
-   goto out_put_mm;
-
-   if (prm->perm & IOMMU_FAULT_PERM_READ)
-   access_flags |= VM_READ;
-
-   if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
-   access_flags |= VM_WRITE;
-   fault_flags |= FAULT_FLAG_WRITE;
-   }
-
-   if (prm->perm & IOMMU_FAULT_PERM_EXEC) {
-   access_flags |= VM_EXEC;
-   fault_flags |= FAULT_FLAG_INSTRUCTION;
-   }
-
-   if (!(prm->perm & IOMMU_FAULT_PERM_PRIV))
-   fault_flags |= FAULT_FLAG_USER;
-
-   if (access_flags & ~vma->vm_flags)
-   /* Access fault */
-   goto out_put_mm;
-
-   ret = handle_mm_fault(vma, prm->addr, fault_flags, NULL);
-   status = ret & VM_FAULT_ERROR ? IOMMU_PAGE_RESP_INVALID :
-   IOMMU_PAGE_RESP_SUCCESS;
-
-out_put_mm:
-   mmap_read_unlock(mm);
-   mmput(mm);
-
-   return status;
-}
-
 static void iopf_handle_group(struct work_struct *work)
 {
struct iopf_group *group;
+   struct iommu_domain *domain;
struct iopf_fault *iopf, *next;
enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS;
 
group = container_of(work, struct iopf_group, work);
+   domain = iommu_get_domain_for_dev_pasid(group->dev,
+   group->last_fault.fault.prm.pasid);
+   if (!domain || !domain->iopf_handler)
+   status = IOMMU_PAGE_RESP_INVALID;
 
 list_for_each_entry_safe(iopf, next, &group->faults, list) {
/*
@@ -139,7 +88,8 @@ static void iopf_handle_group(struct work_struct *work)
 * faults in the group if there is an error.
 */
if (status == IOMMU_PAGE_RESP_SUCCESS)
-   status = iopf_handle_single(iopf);
+   status = domain->iopf_handler(&iopf->fault,
+ domain->fault_data);
 
if (!(iopf->fault.prm.flags &
  IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE))
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group

2022-06-06 Thread Nicolin Chen via iommu
On Tue, Jun 07, 2022 at 11:23:27AM +0800, Baolu Lu wrote:
> External email: Use caution opening links or attachments
> 
> 
> On 2022/6/6 14:19, Nicolin Chen wrote:
> > +/**
> > + * iommu_attach_group - Attach an IOMMU group to an IOMMU domain
> > + * @domain: IOMMU domain to attach
> > + * @dev: IOMMU group that will be attached
> 
> Nit: @group: ...

Oh...Thanks!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 0/6] phase out CONFIG_VIRT_TO_BUS

2022-06-06 Thread Arnd Bergmann
From: Arnd Bergmann 

The virt_to_bus/bus_to_virt interface has been deprecated for
decades. After Jakub Kicinski put a lot of work into cleaning out the
network drivers using them, there are only a couple of other drivers
left, which can all be removed or otherwise cleaned up, to remove the
old interface for good.

Any out-of-tree drivers using virt_to_bus() should be converted to
using the dma-mapping interfaces, typically dma_alloc_coherent()
or dma_map_single().
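
As a hedged illustration (not taken from any of the drivers touched by this
series; "dev", "buf" and "len" are placeholders), a typical conversion of a
streaming buffer looks like this:

#include <linux/dma-mapping.h>

static dma_addr_t example_map_buffer(struct device *dev, void *buf, size_t len)
{
	dma_addr_t handle;

	/* was: handle = virt_to_bus(buf); */
	handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, handle))
		return DMA_MAPPING_ERROR;

	/* Program @handle into the device; dma_unmap_single() when done. */
	return handle;
}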

There are a few m68k and ppc32 specific drivers that keep using the
interfaces, but these are all guarded with architecture-specific
Kconfig dependencies, and are not actually broken.

There are still a number of drivers that are using virt_to_phys()
and phys_to_virt() in place of dma-mapping operations, and these
are often broken, but they are out of scope for this series.

  Arnd

Cc: Jakub Kicinski 
Cc: Christoph Hellwig  # dma-mapping
Cc: Marek Szyprowski  # dma-mapping
Cc: Robin Murphy  # dma-mapping
Cc: iommu@lists.linux-foundation.org
Cc: Khalid Aziz  # buslogic
Cc: linux-s...@vger.kernel.org
Cc: Manohar Vanga  # vme
Cc: Martyn Welch  # vme
Cc: Greg Kroah-Hartman  # vme
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-a...@vger.kernel.org
Cc: linux-al...@vger.kernel.org
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-par...@vger.kernel.org
Cc: Denis Efremov  # floppy

Arnd Bergmann (6):
  vme: remove ca91cx42 Universe-II support
  vme: move back to staging
  media: sta2x11: remove VIRT_TO_BUS dependency
  scsi: dpt_i2o: drop stale VIRT_TO_BUS dependency
  scsi: remove stale BusLogic driver
  arch/*/: remove CONFIG_VIRT_TO_BUS

 .../core-api/bus-virt-phys-mapping.rst|  220 -
 Documentation/core-api/dma-api-howto.rst  |   14 -
 Documentation/core-api/index.rst  |1 -
 Documentation/driver-api/vme.rst  |4 +-
 Documentation/scsi/BusLogic.rst   |  581 --
 Documentation/scsi/FlashPoint.rst |  176 -
 .../translations/zh_CN/core-api/index.rst |1 -
 MAINTAINERS   |   11 +-
 arch/alpha/Kconfig|1 -
 arch/alpha/include/asm/floppy.h   |2 +-
 arch/alpha/include/asm/io.h   |8 +-
 arch/ia64/Kconfig |1 -
 arch/ia64/include/asm/io.h|8 -
 arch/m68k/Kconfig |1 -
 arch/m68k/include/asm/virtconvert.h   |4 +-
 arch/microblaze/Kconfig   |1 -
 arch/microblaze/include/asm/io.h  |2 -
 arch/mips/Kconfig |1 -
 arch/mips/include/asm/io.h|9 -
 arch/parisc/Kconfig   |1 -
 arch/parisc/include/asm/floppy.h  |4 +-
 arch/parisc/include/asm/io.h  |2 -
 arch/powerpc/Kconfig  |1 -
 arch/powerpc/include/asm/io.h |2 -
 arch/riscv/include/asm/page.h |1 -
 arch/x86/Kconfig  |1 -
 arch/x86/include/asm/io.h |9 -
 arch/xtensa/Kconfig   |1 -
 arch/xtensa/include/asm/io.h  |3 -
 drivers/Kconfig   |2 -
 drivers/Makefile  |1 -
 drivers/media/pci/sta2x11/Kconfig |2 +-
 drivers/scsi/BusLogic.c   | 3727 
 drivers/scsi/BusLogic.h   | 1284 ---
 drivers/scsi/FlashPoint.c | 7560 -
 drivers/scsi/Kconfig  |   26 +-
 drivers/scsi/dpt_i2o.c|4 +-
 drivers/staging/vme_user/Kconfig  |   27 +
 drivers/staging/vme_user/Makefile |3 +
 drivers/{vme => staging/vme_user}/vme.c   |2 +-
 .../linux => drivers/staging/vme_user}/vme.h  |0
 .../{vme => staging/vme_user}/vme_bridge.h|2 +-
 .../bridges => staging/vme_user}/vme_fake.c   |4 +-
 .../bridges => staging/vme_user}/vme_tsi148.c |4 +-
 .../bridges => staging/vme_user}/vme_tsi148.h |0
 drivers/staging/vme_user/vme_user.c   |2 +-
 drivers/vme/Kconfig   |   18 -
 drivers/vme/Makefile  |8 -
 drivers/vme/boards/Kconfig|   10 -
 drivers/vme/boards/Makefile   |6 -
 drivers/vme/boards/vme_vmivme7805.c   |  106 -
 drivers/vme/boards/vme_vmivme7805.h   |   33 -
 drivers/vme/bridges/Kconfig   |   24 -
 drivers/vme/bridges/Makefile  |4 -
 drivers/vme/bridges/vme_ca91cx42.c| 1928 -
 drivers/vme/bridges/vme_ca91cx42.h|  579 --
 include/asm-generic/io.h  |   14 -
 mm/Kconfig|8 -
 58 files changed, 54 insertions(+), 16405 deletions(-)
 delete mode 100644 

[PATCH 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group

2022-06-06 Thread Nicolin Chen via iommu
Cases like VFIO wish to attach a device to an existing domain that was
not allocated specifically from the device. This raises a condition
where the IOMMU driver can fail the domain attach because the domain and
device are incompatible with each other.

This is a soft failure that can be resolved by using a different domain.

Provide a dedicated errno for the IOMMU driver to return during attach to
indicate that the attach failed because of domain incompatibility.
EMEDIUMTYPE is chosen because it is never used within the iommu subsystem
today and evokes a sense that the 'medium', i.e. the domain, is incompatible.

VFIO can use this to know attach is a soft failure and it should continue
searching. Otherwise the attach will be a hard failure and VFIO will
return the code to userspace.
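
A simplified sketch of the intended caller pattern (roughly what the VFIO
side does later in this series; the function name is illustrative):

static struct vfio_domain *example_find_compatible(struct vfio_iommu *iommu,
						   struct iommu_group *grp)
{
	struct vfio_domain *d;
	int ret;

	list_for_each_entry(d, &iommu->domain_list, next) {
		ret = iommu_attach_group(d->domain, grp);
		if (ret == -EMEDIUMTYPE)
			continue;		/* incompatible, try the next one */
		if (ret)
			return ERR_PTR(ret);	/* a real error such as -ENOMEM */
		return d;			/* reuse this domain */
	}

	return NULL;	/* no compatible domain, caller allocates a fresh one */
}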

Update all drivers to return EMEDIUMTYPE in their failure paths that are
related to domain incompatibility.

Add kdocs describing this behavior.

Suggested-by: Jason Gunthorpe 
Signed-off-by: Nicolin Chen 
---
 drivers/iommu/amd/iommu.c   |  2 +-
 drivers/iommu/apple-dart.c  |  4 ++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  6 +++---
 drivers/iommu/arm/arm-smmu/qcom_iommu.c |  2 +-
 drivers/iommu/intel/iommu.c |  4 ++--
 drivers/iommu/iommu.c   | 22 +
 drivers/iommu/ipmmu-vmsa.c  |  2 +-
 drivers/iommu/omap-iommu.c  |  2 +-
 drivers/iommu/virtio-iommu.c|  2 +-
 9 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 840831d5d2ad..ad499658a6b6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1662,7 +1662,7 @@ static int attach_device(struct device *dev,
if (domain->flags & PD_IOMMUV2_MASK) {
struct iommu_domain *def_domain = iommu_get_dma_domain(dev);
 
-   ret = -EINVAL;
+   ret = -EMEDIUMTYPE;
if (def_domain->type != IOMMU_DOMAIN_IDENTITY)
goto out;
 
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index 8af0242a90d9..e58dc310afd7 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -495,10 +495,10 @@ static int apple_dart_attach_dev(struct iommu_domain 
*domain,
 
if (cfg->stream_maps[0].dart->force_bypass &&
domain->type != IOMMU_DOMAIN_IDENTITY)
-   return -EINVAL;
+   return -EMEDIUMTYPE;
if (!cfg->stream_maps[0].dart->supports_bypass &&
domain->type == IOMMU_DOMAIN_IDENTITY)
-   return -EINVAL;
+   return -EMEDIUMTYPE;
 
ret = apple_dart_finalize_domain(domain, cfg);
if (ret)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 88817a3376ef..6c393cd84925 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2424,20 +2424,20 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
"cannot attach to SMMU %s (upstream of %s)\n",
dev_name(smmu_domain->smmu->dev),
dev_name(smmu->dev));
-   ret = -ENXIO;
+   ret = -EMEDIUMTYPE;
goto out_unlock;
} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
   master->ssid_bits != smmu_domain->s1_cfg.s1cdmax) {
dev_err(dev,
"cannot attach to incompatible domain (%u SSID bits != 
%u)\n",
smmu_domain->s1_cfg.s1cdmax, master->ssid_bits);
-   ret = -EINVAL;
+   ret = -EMEDIUMTYPE;
goto out_unlock;
} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
   smmu_domain->stall_enabled != master->stall_enabled) {
dev_err(dev, "cannot attach to stall-%s domain\n",
smmu_domain->stall_enabled ? "enabled" : "disabled");
-   ret = -EINVAL;
+   ret = -EMEDIUMTYPE;
goto out_unlock;
}
 
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c 
b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 4c077c38fbd6..a8b63b855ffb 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -386,7 +386,7 @@ static int qcom_iommu_attach_dev(struct iommu_domain 
*domain, struct device *dev
"attached to domain on IOMMU %s\n",
dev_name(qcom_domain->iommu->dev),
dev_name(qcom_iommu->dev));
-   return -EINVAL;
+   return -EMEDIUMTYPE;
}
 
return 0;
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 44016594831d..0813b119d680 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ 

[PATCH 0/5] Simplify vfio_iommu_type1 attach/detach routine

2022-06-06 Thread Nicolin Chen via iommu
This is a preparatory series for IOMMUFD v2 patches. It enforces error
code -EMEDIUMTYPE in iommu_attach_device() and iommu_attach_group() when
an IOMMU domain and a device/group are incompatible. It also moves the
domain->ops check into __iommu_attach_device(). These allow VFIO iommu
code to simplify its group attachment routine, by avoiding the extra
IOMMU domain allocations and attach/detach sequences of the old code.

It is worth mentioning that the exact match for enforce_cache_coherency is
removed with this series, since there is little value in doing that: KVM
won't be able to take advantage of it, and it just wastes domain memory.
Instead, we rely on the Intel IOMMU driver taking care of that internally.

This is on github: https://github.com/nicolinc/iommufd/commits/vfio_iommu_attach

Jason Gunthorpe (1):
  vfio/iommu_type1: Prefer to reuse domains vs match enforced cache
coherency

Nicolin Chen (4):
  iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  iommu: Ensure device has the same iommu_ops as the domain
  vfio/iommu_type1: Clean up update_dirty_scope in detach_group()
  vfio/iommu_type1: Simplify group attachment

 drivers/iommu/amd/iommu.c   |   3 +-
 drivers/iommu/apple-dart.c  |   5 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |   7 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c   |   1 +
 drivers/iommu/arm/arm-smmu/qcom_iommu.c |   3 +-
 drivers/iommu/exynos-iommu.c|   1 +
 drivers/iommu/fsl_pamu_domain.c |   1 +
 drivers/iommu/intel/iommu.c |   5 +-
 drivers/iommu/iommu.c   |  26 ++
 drivers/iommu/ipmmu-vmsa.c  |   3 +-
 drivers/iommu/msm_iommu.c   |   1 +
 drivers/iommu/mtk_iommu.c   |   1 +
 drivers/iommu/mtk_iommu_v1.c|   1 +
 drivers/iommu/omap-iommu.c  |   3 +-
 drivers/iommu/rockchip-iommu.c  |   1 +
 drivers/iommu/s390-iommu.c  |   1 +
 drivers/iommu/sprd-iommu.c  |   1 +
 drivers/iommu/sun50i-iommu.c|   1 +
 drivers/iommu/tegra-gart.c  |   1 +
 drivers/iommu/tegra-smmu.c  |   1 +
 drivers/iommu/virtio-iommu.c|   3 +-
 drivers/vfio/vfio_iommu_type1.c | 315 ++--
 include/linux/iommu.h   |   2 +
 23 files changed, 223 insertions(+), 164 deletions(-)

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/5] iommu: Ensure device has the same iommu_ops as the domain

2022-06-06 Thread Nicolin Chen via iommu
The core code should not call an iommu driver op with a struct device
parameter unless it knows that the dev_iommu_priv_get() for that struct
device was set up by the same driver. Otherwise, in a mixed driver system,
the iommu_priv could be cast to the wrong type.

Store the iommu_ops pointer in the iommu_domain and use it as a check to
validate that the struct device is correct before invoking any domain op
that accepts a struct device.

This allows removing the check of the domain op equality in VFIO.
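
A hedged sketch of the idea (the iommu.c hunk is not quoted in this excerpt;
the helper name, placement and error code below are assumptions for
illustration only):

static int example_check_ops_match(struct iommu_domain *domain,
				   struct device *dev)
{
	/* Reject devices that were probed by a different IOMMU driver. */
	if (domain->ops->iommu_ops != dev_iommu_ops(dev))
		return -EMEDIUMTYPE;

	return 0;
}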

Co-developed-by: Jason Gunthorpe 
Signed-off-by: Jason Gunthorpe 
Signed-off-by: Nicolin Chen 
---
 drivers/iommu/amd/iommu.c   | 1 +
 drivers/iommu/apple-dart.c  | 1 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
 drivers/iommu/arm/arm-smmu/arm-smmu.c   | 1 +
 drivers/iommu/arm/arm-smmu/qcom_iommu.c | 1 +
 drivers/iommu/exynos-iommu.c| 1 +
 drivers/iommu/fsl_pamu_domain.c | 1 +
 drivers/iommu/intel/iommu.c | 1 +
 drivers/iommu/iommu.c   | 4 
 drivers/iommu/ipmmu-vmsa.c  | 1 +
 drivers/iommu/msm_iommu.c   | 1 +
 drivers/iommu/mtk_iommu.c   | 1 +
 drivers/iommu/mtk_iommu_v1.c| 1 +
 drivers/iommu/omap-iommu.c  | 1 +
 drivers/iommu/rockchip-iommu.c  | 1 +
 drivers/iommu/s390-iommu.c  | 1 +
 drivers/iommu/sprd-iommu.c  | 1 +
 drivers/iommu/sun50i-iommu.c| 1 +
 drivers/iommu/tegra-gart.c  | 1 +
 drivers/iommu/tegra-smmu.c  | 1 +
 drivers/iommu/virtio-iommu.c| 1 +
 include/linux/iommu.h   | 2 ++
 22 files changed, 26 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index ad499658a6b6..679f7a265013 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2285,6 +2285,7 @@ const struct iommu_ops amd_iommu_ops = {
.pgsize_bitmap  = AMD_IOMMU_PGSIZES,
.def_domain_type = amd_iommu_def_domain_type,
.default_domain_ops = &(const struct iommu_domain_ops) {
+   .iommu_ops  = &amd_iommu_ops,
.attach_dev = amd_iommu_attach_device,
.detach_dev = amd_iommu_detach_device,
.map= amd_iommu_map,
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index e58dc310afd7..3d36d9a12aa7 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -775,6 +775,7 @@ static const struct iommu_ops apple_dart_iommu_ops = {
.pgsize_bitmap = -1UL, /* Restricted during dart probe */
.owner = THIS_MODULE,
.default_domain_ops = &(const struct iommu_domain_ops) {
+   .iommu_ops  = &apple_dart_iommu_ops,
.attach_dev = apple_dart_attach_dev,
.detach_dev = apple_dart_detach_dev,
.map_pages  = apple_dart_map_pages,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6c393cd84925..471ceb60427c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2859,6 +2859,7 @@ static struct iommu_ops arm_smmu_ops = {
.pgsize_bitmap  = -1UL, /* Restricted during device attach */
.owner  = THIS_MODULE,
.default_domain_ops = &(const struct iommu_domain_ops) {
+   .iommu_ops  = &arm_smmu_ops,
.attach_dev = arm_smmu_attach_dev,
.map_pages  = arm_smmu_map_pages,
.unmap_pages= arm_smmu_unmap_pages,
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 2ed3594f384e..52c2589a4deb 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1597,6 +1597,7 @@ static struct iommu_ops arm_smmu_ops = {
.pgsize_bitmap  = -1UL, /* Restricted during device attach */
.owner  = THIS_MODULE,
.default_domain_ops = &(const struct iommu_domain_ops) {
+   .iommu_ops  = &arm_smmu_ops,
.attach_dev = arm_smmu_attach_dev,
.map_pages  = arm_smmu_map_pages,
.unmap_pages= arm_smmu_unmap_pages,
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c 
b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index a8b63b855ffb..8806a621f81e 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -596,6 +596,7 @@ static const struct iommu_ops qcom_iommu_ops = {
.of_xlate   = qcom_iommu_of_xlate,
.pgsize_bitmap  = SZ_4K | SZ_64K | SZ_1M | SZ_16M,
.default_domain_ops = &(const struct iommu_domain_ops) {
+   .iommu_ops  = &qcom_iommu_ops,

[PATCH 3/5] vfio/iommu_type1: Prefer to reuse domains vs match enforced cache coherency

2022-06-06 Thread Nicolin Chen via iommu
From: Jason Gunthorpe 

The KVM mechanism for controlling wbinvd is only triggered during
kvm_vfio_group_add(), meaning it is a one-shot test done once the devices
are set up.

So, there is no value in trying to push a device that could do enforced
cache coherency to a dedicated domain vs re-using an existing domain since
KVM won't be able to take advantage of it. This just wastes domain memory.

Simplify this code and eliminate the test. This removes the only logic
that needed to have a dummy domain attached prior to searching for a
matching domain and simplifies the next patches.

If someday we want to optimize this further, the better approach is
to update the Intel driver so that enforce_cache_coherency() can work on a
domain that already has IOPTEs, and then call enforce_cache_coherency()
after detaching a device from a domain to upgrade the whole domain to
enforced cache coherency mode.

Signed-off-by: Jason Gunthorpe 
Signed-off-by: Nicolin Chen 
---
 drivers/vfio/vfio_iommu_type1.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index c13b9290e357..f4e3b423a453 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2285,9 +2285,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 * testing if they're on the same bus_type.
 */
 list_for_each_entry(d, &iommu->domain_list, next) {
-   if (d->domain->ops == domain->domain->ops &&
-   d->enforce_cache_coherency ==
-   domain->enforce_cache_coherency) {
+   if (d->domain->ops == domain->domain->ops) {
iommu_detach_group(domain->domain, group->iommu_group);
if (!iommu_attach_group(d->domain,
group->iommu_group)) {
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 4/5] vfio/iommu_type1: Clean up update_dirty_scope in detach_group()

2022-06-06 Thread Nicolin Chen via iommu
All devices in emulated_iommu_groups have pinned_page_dirty_scope
set, so the update_dirty_scope in the first list_for_each_entry
is always false. Clean it up, and move the "if update_dirty_scope"
part from the detach_group_done routine to the domain_list part.

Rename the "detach_group_done" goto label accordingly.

Suggested-by: Jason Gunthorpe 
Signed-off-by: Nicolin Chen 
---
 drivers/vfio/vfio_iommu_type1.c | 27 ---
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index f4e3b423a453..b45b1cc118ef 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2463,14 +2463,12 @@ static void vfio_iommu_type1_detach_group(void 
*iommu_data,
struct vfio_iommu *iommu = iommu_data;
struct vfio_domain *domain;
struct vfio_iommu_group *group;
-   bool update_dirty_scope = false;
LIST_HEAD(iova_copy);
 
 mutex_lock(&iommu->lock);
 list_for_each_entry(group, &iommu->emulated_iommu_groups, next) {
if (group->iommu_group != iommu_group)
continue;
-   update_dirty_scope = !group->pinned_page_dirty_scope;
 list_del(&group->next);
kfree(group);
 
@@ -2479,7 +2477,7 @@ static void vfio_iommu_type1_detach_group(void 
*iommu_data,
WARN_ON(iommu->notifier.head);
vfio_iommu_unmap_unpin_all(iommu);
}
-   goto detach_group_done;
+   goto out_unlock;
}
 
/*
@@ -2495,9 +2493,7 @@ static void vfio_iommu_type1_detach_group(void 
*iommu_data,
continue;
 
iommu_detach_group(domain->domain, group->iommu_group);
-   update_dirty_scope = !group->pinned_page_dirty_scope;
 list_del(&group->next);
-   kfree(group);
/*
 * Group ownership provides privilege, if the group list is
 * empty, the domain goes away. If it's the last domain with
@@ -2519,7 +2515,17 @@ static void vfio_iommu_type1_detach_group(void 
*iommu_data,
kfree(domain);
 vfio_iommu_aper_expand(iommu, &iova_copy);
vfio_update_pgsize_bitmap(iommu);
+   /*
+* Removal of a group without dirty tracking may allow
+* the iommu scope to be promoted.
+*/
+   if (!group->pinned_page_dirty_scope) {
+   iommu->num_non_pinned_groups--;
+   if (iommu->dirty_page_tracking)
+   vfio_iommu_populate_bitmap_full(iommu);
+   }
}
+   kfree(group);
break;
}
 
@@ -2528,16 +2534,7 @@ static void vfio_iommu_type1_detach_group(void 
*iommu_data,
else
 vfio_iommu_iova_free(&iova_copy);
 
-detach_group_done:
-   /*
-* Removal of a group without dirty tracking may allow the iommu scope
-* to be promoted.
-*/
-   if (update_dirty_scope) {
-   iommu->num_non_pinned_groups--;
-   if (iommu->dirty_page_tracking)
-   vfio_iommu_populate_bitmap_full(iommu);
-   }
+out_unlock:
 mutex_unlock(&iommu->lock);
 }
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 5/5] vfio/iommu_type1: Simplify group attachment

2022-06-06 Thread Nicolin Chen via iommu
Un-inline the domain specific logic from the attach/detach_group ops into
two paired functions vfio_iommu_alloc_attach_domain() and
vfio_iommu_detach_destroy_domain() that strictly deal with creating and
destroying struct vfio_domains.

Add the logic to check for EMEDIUMTYPE return code of iommu_attach_group()
and avoid the extra domain allocations and attach/detach sequences of the
old code. This allows properly detecting an actual attach error, like
-ENOMEM, vs treating all attach errors as an incompatible domain.

Remove the duplicated domain->ops comparison that is taken care of in the
IOMMU core.

Co-developed-by: Jason Gunthorpe 
Signed-off-by: Jason Gunthorpe 
Signed-off-by: Nicolin Chen 
---
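Not part of the patch, but for context: the convention relied on here is that
an IOMMU driver's attach handler returns -EMEDIUMTYPE when the device is
merely incompatible with the domain, as opposed to a real failure such as
-ENOMEM. A made-up driver-side sketch, names illustrative only:

	/* in a hypothetical driver's attach_dev callback */
	if (my_domain->iommu_instance != master->iommu_instance)
		return -EMEDIUMTYPE;	/* incompatible; caller may try another domain */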
 drivers/vfio/vfio_iommu_type1.c | 306 +---
 1 file changed, 161 insertions(+), 145 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index b45b1cc118ef..c6f937e1d71f 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -86,6 +86,7 @@ struct vfio_domain {
struct list_headgroup_list;
boolfgsp : 1;   /* Fine-grained super pages */
boolenforce_cache_coherency : 1;
+   boolmsi_cookie : 1;
 };
 
 struct vfio_dma {
@@ -2153,12 +2154,161 @@ static void vfio_iommu_iova_insert_copy(struct 
vfio_iommu *iommu,
list_splice_tail(iova_copy, iova);
 }
 
+static struct vfio_domain *
+vfio_iommu_alloc_attach_domain(struct bus_type *bus, struct vfio_iommu *iommu,
+  struct vfio_iommu_group *group)
+{
+   struct iommu_domain *new_domain;
+   struct vfio_domain *domain;
+   int ret = 0;
+
+   /* Try to match an existing compatible domain */
+	list_for_each_entry (domain, &iommu->domain_list, next) {
+   ret = iommu_attach_group(domain->domain, group->iommu_group);
+   if (ret == -EMEDIUMTYPE)
+   continue;
+   if (ret)
+   return ERR_PTR(ret);
+		list_add(&group->next, &domain->group_list);
+   return domain;
+   }
+
+   new_domain = iommu_domain_alloc(bus);
+   if (!new_domain)
+   return ERR_PTR(-EIO);
+
+   if (iommu->nesting) {
+   ret = iommu_enable_nesting(new_domain);
+   if (ret)
+   goto out_free_iommu_domain;
+   }
+
+   ret = iommu_attach_group(new_domain, group->iommu_group);
+   if (ret)
+   goto out_free_iommu_domain;
+
+   domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+   if (!domain) {
+   ret = -ENOMEM;
+   goto out_detach;
+   }
+
+   domain->domain = new_domain;
+   vfio_test_domain_fgsp(domain);
+
+   /*
+* If the IOMMU can block non-coherent operations (ie PCIe TLPs with
+* no-snoop set) then VFIO always turns this feature on because on Intel
+* platforms it optimizes KVM to disable wbinvd emulation.
+*/
+   if (new_domain->ops->enforce_cache_coherency)
+   domain->enforce_cache_coherency =
+   new_domain->ops->enforce_cache_coherency(new_domain);
+
+   /* replay mappings on new domains */
+   ret = vfio_iommu_replay(iommu, domain);
+   if (ret)
+   goto out_free_domain;
+
+   /*
+* An iommu backed group can dirty memory directly and therefore
+* demotes the iommu scope until it declares itself dirty tracking
+* capable via the page pinning interface.
+*/
+   iommu->num_non_pinned_groups++;
+
+	INIT_LIST_HEAD(&domain->group_list);
+	list_add(&group->next, &domain->group_list);
+	list_add(&domain->next, &iommu->domain_list);
+   vfio_update_pgsize_bitmap(iommu);
+   return domain;
+
+out_free_domain:
+   kfree(domain);
+out_detach:
+	iommu_detach_group(new_domain, group->iommu_group);
+out_free_iommu_domain:
+   iommu_domain_free(new_domain);
+   return ERR_PTR(ret);
+}
+
+static void vfio_iommu_unmap_unpin_all(struct vfio_iommu *iommu)
+{
+   struct rb_node *node;
+
+	while ((node = rb_first(&iommu->dma_list)))
+   vfio_remove_dma(iommu, rb_entry(node, struct vfio_dma, node));
+}
+
+static void vfio_iommu_unmap_unpin_reaccount(struct vfio_iommu *iommu)
+{
+   struct rb_node *n, *p;
+
+	n = rb_first(&iommu->dma_list);
+   for (; n; n = rb_next(n)) {
+   struct vfio_dma *dma;
+   long locked = 0, unlocked = 0;
+
+   dma = rb_entry(n, struct vfio_dma, node);
+   unlocked += vfio_unmap_unpin(iommu, dma, false);
+		p = rb_first(&dma->pfn_list);
+   for (; p; p = rb_next(p)) {
+   struct vfio_pfn *vpfn = rb_entry(p, struct vfio_pfn,
+node);
+
+   if (!is_invalid_reserved_pfn(vpfn->pfn))
+  

[GIT PULL] dma-mapping fixes for 5.19

2022-06-06 Thread Christoph Hellwig
[I really wanted to send these before -rc1, but the fact that today is
 a public holiday here really confused me and messed up my schedule]

The following changes since commit 4a37f3dd9a83186cb88d44808ab35b78375082c9:

  dma-direct: don't over-decrypt memory (2022-05-23 15:25:40 +0200)

are available in the Git repository at:

  git://git.infradead.org/users/hch/dma-mapping.git 
tags/dma-mapping-5.19-2022-06-06

for you to fetch changes up to e15db62bc5648ab459a570862f654e787c498faf:

  swiotlb: fix setting ->force_bounce (2022-06-02 07:17:59 +0200)


dma-mapping fixes for Linux 5.19

 - fix a regression in setting swiotlb ->force_bounce (me)
 - make dma-debug less chatty (Rob Clark)


Christoph Hellwig (1):
  swiotlb: fix setting ->force_bounce

Rob Clark (1):
  dma-debug: make things less spammy under memory pressure

 kernel/dma/debug.c   |  2 +-
 kernel/dma/swiotlb.c | 14 ++
 2 files changed, 7 insertions(+), 9 deletions(-)


[PATCH 1/6] vme: remove ca91cx42 Universe-II support

2022-06-06 Thread Arnd Bergmann
From: Arnd Bergmann 

This is one of four remaining drivers using the ancient
virt_to_bus() interface instead of the dma-mapping interface,
making it incompatible with most modern machines.

As nobody has cleaned this up, there is a high chance that this
driver has no actual users. The chip was introduced in 1997 and
only supports 32-bit legacy PCI. It was replaced by TSI148 in
2004, but that chip has since been discontinued, while a version
of the older Universe II remains in production after 25 years.

The vme_vmivme7805 board uses Universe-II, so this also gets
removed in the process, but PCI add-on cards based on TSI148
can still work in theory.

If there are users of the Universe-II driver after all, it is
of course possible to revert this patch and fix it to use the
dma-mapping interface like the tsi148 driver does.

Signed-off-by: Arnd Bergmann 
---
 drivers/vme/Kconfig |2 -
 drivers/vme/Makefile|1 -
 drivers/vme/boards/Kconfig  |   10 -
 drivers/vme/boards/Makefile |6 -
 drivers/vme/boards/vme_vmivme7805.c |  106 --
 drivers/vme/boards/vme_vmivme7805.h |   33 -
 drivers/vme/bridges/Kconfig |7 -
 drivers/vme/bridges/Makefile|1 -
 drivers/vme/bridges/vme_ca91cx42.c  | 1928 ---
 drivers/vme/bridges/vme_ca91cx42.h  |  579 
 10 files changed, 2673 deletions(-)
 delete mode 100644 drivers/vme/boards/Kconfig
 delete mode 100644 drivers/vme/boards/Makefile
 delete mode 100644 drivers/vme/boards/vme_vmivme7805.c
 delete mode 100644 drivers/vme/boards/vme_vmivme7805.h
 delete mode 100644 drivers/vme/bridges/vme_ca91cx42.c
 delete mode 100644 drivers/vme/bridges/vme_ca91cx42.h

diff --git a/drivers/vme/Kconfig b/drivers/vme/Kconfig
index c13dd9d2a604..26feabba19d2 100644
--- a/drivers/vme/Kconfig
+++ b/drivers/vme/Kconfig
@@ -13,6 +13,4 @@ if VME_BUS
 
 source "drivers/vme/bridges/Kconfig"
 
-source "drivers/vme/boards/Kconfig"
-
 endif # VME
diff --git a/drivers/vme/Makefile b/drivers/vme/Makefile
index 8bfe4b370c41..2dfb929a23de 100644
--- a/drivers/vme/Makefile
+++ b/drivers/vme/Makefile
@@ -5,4 +5,3 @@
 obj-$(CONFIG_VME_BUS)  += vme.o
 
 obj-y  += bridges/
-obj-y  += boards/
diff --git a/drivers/vme/boards/Kconfig b/drivers/vme/boards/Kconfig
deleted file mode 100644
index 7a255f72980b..
--- a/drivers/vme/boards/Kconfig
+++ /dev/null
@@ -1,10 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-comment "VME Board Drivers"
-
-config VMIVME_7805
-   tristate "VMIVME-7805"
-   help
- If you say Y here you get support for the VMIVME-7805 board.
- This board has an additional control interface to the Universe II
- chip. This driver has to be included if you want to access VME bus
- with VMIVME-7805 board.
diff --git a/drivers/vme/boards/Makefile b/drivers/vme/boards/Makefile
deleted file mode 100644
index 87122381452c..
--- a/drivers/vme/boards/Makefile
+++ /dev/null
@@ -1,6 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-#
-# Makefile for the VME board drivers.
-#
-
-obj-$(CONFIG_VMIVME_7805)  += vme_vmivme7805.o
diff --git a/drivers/vme/boards/vme_vmivme7805.c 
b/drivers/vme/boards/vme_vmivme7805.c
deleted file mode 100644
index 51e056bae943..
--- a/drivers/vme/boards/vme_vmivme7805.c
+++ /dev/null
@@ -1,106 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Support for the VMIVME-7805 board access to the Universe II bridge.
- *
- * Author: Arthur Benilov 
- * Copyright 2010 Ion Beam Application, Inc.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "vme_vmivme7805.h"
-
-static int vmic_probe(struct pci_dev *, const struct pci_device_id *);
-static void vmic_remove(struct pci_dev *);
-
-/** Base address to access FPGA register */
-static void __iomem *vmic_base;
-
-static const char driver_name[] = "vmivme_7805";
-
-static const struct pci_device_id vmic_ids[] = {
-   { PCI_DEVICE(PCI_VENDOR_ID_VMIC, PCI_DEVICE_ID_VTIMR) },
-   { },
-};
-
-static struct pci_driver vmic_driver = {
-   .name = driver_name,
-   .id_table = vmic_ids,
-   .probe = vmic_probe,
-   .remove = vmic_remove,
-};
-
-static int vmic_probe(struct pci_dev *pdev, const struct pci_device_id *id)
-{
-   int retval;
-   u32 data;
-
-   /* Enable the device */
-   retval = pci_enable_device(pdev);
-   if (retval) {
-   dev_err(>dev, "Unable to enable device\n");
-   goto err;
-   }
-
-   /* Map Registers */
-   retval = pci_request_regions(pdev, driver_name);
-   if (retval) {
-   dev_err(>dev, "Unable to reserve resources\n");
-   goto err_resource;
-   }
-
-   /* Map registers in BAR 0 */
-   vmic_base = ioremap(pci_resource_start(pdev, 0), 16);
-   if (!vmic_base) {
-   dev_err(>dev, "Unable to remap CRG 

[PATCH 2/6] vme: move back to staging

2022-06-06 Thread Arnd Bergmann
From: Arnd Bergmann 

The VME subsystem graduated from staging into a top-level subsystem in
2012, with commit db3b9e990e75 ("Staging: VME: move VME drivers out of
staging") stating:

The VME device drivers have not moved out yet due to some API
questions they are still working through, that should happen soon,
hopefully.

However, this never happened: maintenance of drivers/vme effectively
stopped in 2017, with all subsequent changes being treewide cleanups.
No hardware driver remains in staging, only the limited user-level
access, and I just removed one of the two bridge drivers and the only
remaining board.

drivers/staging/vme/devices/ was recently moved to
drivers/staging/vme_user/, but as the vme_user driver is the only one
remaining for this subsystem, it is easier to just move the remaining
three source files into this directory rather than keeping the original
hierarchy.

Signed-off-by: Arnd Bergmann 
---
 Documentation/driver-api/vme.rst  |  4 +--
 MAINTAINERS   |  4 +--
 drivers/Kconfig   |  2 --
 drivers/Makefile  |  1 -
 drivers/staging/vme_user/Kconfig  | 27 +++
 drivers/staging/vme_user/Makefile |  3 +++
 drivers/{vme => staging/vme_user}/vme.c   |  2 +-
 .../linux => drivers/staging/vme_user}/vme.h  |  0
 .../{vme => staging/vme_user}/vme_bridge.h|  2 +-
 .../bridges => staging/vme_user}/vme_fake.c   |  4 +--
 .../bridges => staging/vme_user}/vme_tsi148.c |  4 +--
 .../bridges => staging/vme_user}/vme_tsi148.h |  0
 drivers/staging/vme_user/vme_user.c   |  2 +-
 drivers/vme/Kconfig   | 16 ---
 drivers/vme/Makefile  |  7 -
 drivers/vme/bridges/Kconfig   | 17 
 drivers/vme/bridges/Makefile  |  3 ---
 17 files changed, 40 insertions(+), 58 deletions(-)
 rename drivers/{vme => staging/vme_user}/vme.c (99%)
 rename {include/linux => drivers/staging/vme_user}/vme.h (100%)
 rename drivers/{vme => staging/vme_user}/vme_bridge.h (99%)
 rename drivers/{vme/bridges => staging/vme_user}/vme_fake.c (99%)
 rename drivers/{vme/bridges => staging/vme_user}/vme_tsi148.c (99%)
 rename drivers/{vme/bridges => staging/vme_user}/vme_tsi148.h (100%)
 delete mode 100644 drivers/vme/Kconfig
 delete mode 100644 drivers/vme/Makefile
 delete mode 100644 drivers/vme/bridges/Kconfig
 delete mode 100644 drivers/vme/bridges/Makefile

diff --git a/Documentation/driver-api/vme.rst b/Documentation/driver-api/vme.rst
index def139c13410..c0b475369de0 100644
--- a/Documentation/driver-api/vme.rst
+++ b/Documentation/driver-api/vme.rst
@@ -290,8 +290,8 @@ The function :c:func:`vme_bus_num` returns the bus ID of 
the provided bridge.
 VME API
 ---
 
-.. kernel-doc:: include/linux/vme.h
+.. kernel-doc:: drivers/staging/vme_user/vme.h
:internal:
 
-.. kernel-doc:: drivers/vme/vme.c
+.. kernel-doc:: drivers/staging/vme_user/vme.c
:export:
diff --git a/MAINTAINERS b/MAINTAINERS
index a6d3bd9d2a8d..d8e2cdbb93e3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -21229,12 +21229,10 @@ M:Martyn Welch 
 M: Manohar Vanga 
 M: Greg Kroah-Hartman 
 L: linux-ker...@vger.kernel.org
-S: Maintained
+S: Odd fixes
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 F: Documentation/driver-api/vme.rst
 F: drivers/staging/vme_user/
-F: drivers/vme/
-F: include/linux/vme*
 
 VM SOCKETS (AF_VSOCK)
 M: Stefano Garzarella 
diff --git a/drivers/Kconfig b/drivers/Kconfig
index b6a172d32a7d..19ee995bd0ae 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -183,8 +183,6 @@ source "drivers/iio/Kconfig"
 
 source "drivers/ntb/Kconfig"
 
-source "drivers/vme/Kconfig"
-
 source "drivers/pwm/Kconfig"
 
 source "drivers/irqchip/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index 9a30842b22c5..dadf2678277f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -165,7 +165,6 @@ obj-$(CONFIG_PM_DEVFREQ)+= devfreq/
 obj-$(CONFIG_EXTCON)   += extcon/
 obj-$(CONFIG_MEMORY)   += memory/
 obj-$(CONFIG_IIO)  += iio/
-obj-$(CONFIG_VME_BUS)  += vme/
 obj-$(CONFIG_IPACK_BUS)+= ipack/
 obj-$(CONFIG_NTB)  += ntb/
 obj-$(CONFIG_POWERCAP) += powercap/
diff --git a/drivers/staging/vme_user/Kconfig b/drivers/staging/vme_user/Kconfig
index e8b4461bf27f..c8eabf8f40f1 100644
--- a/drivers/staging/vme_user/Kconfig
+++ b/drivers/staging/vme_user/Kconfig
@@ -1,4 +1,29 @@
 # SPDX-License-Identifier: GPL-2.0
+menuconfig VME_BUS
+   bool "VME bridge support"
+   depends on STAGING && PCI
+   help
+ If you say Y here you get support for the VME bridge Framework.
+
+if VME_BUS
+
+comment "VME Bridge Drivers"
+
+config VME_TSI148
+   tristate "Tempe"
+   depends on HAS_DMA
+   help
+If you say Y here you get support 

[PATCH 3/6] media: sta2x11: remove VIRT_TO_BUS dependency

2022-06-06 Thread Arnd Bergmann
From: Arnd Bergmann 

This driver does not use the virt_to_bus() function, though it
depends on x86 specific fixups in the swiotlb code, which was
last rewritten in commit e380a0394c36 ("x86/PCI: sta2x11: use
default DMA address translation").

It is possible that the driver still fails to build on some
architectures that are missing CONFIG_VIRT_TO_BUS, but it is
always set on x86 machines with the STA2X11 platform enabled.

More likely though is that it was never meant to depend on
CONFIG_VIRT_TO_BUS, and the Kconfig dependency was kept from
an out-of-tree version when the driver was originally merged.

Fixes: efeb98b4e2b2 ("[media] STA2X11 VIP: new V4L2 driver")
Signed-off-by: Arnd Bergmann 
---
 drivers/media/pci/sta2x11/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/pci/sta2x11/Kconfig 
b/drivers/media/pci/sta2x11/Kconfig
index a96e170ab04e..118b922c08c3 100644
--- a/drivers/media/pci/sta2x11/Kconfig
+++ b/drivers/media/pci/sta2x11/Kconfig
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config STA2X11_VIP
tristate "STA2X11 VIP Video For Linux"
-   depends on PCI && VIDEO_DEV && VIRT_TO_BUS && I2C
+   depends on PCI && VIDEO_DEV && I2C
depends on STA2X11 || COMPILE_TEST
select GPIOLIB if MEDIA_SUBDRV_AUTOSELECT
select VIDEO_ADV7180 if MEDIA_SUBDRV_AUTOSELECT
-- 
2.29.2



[PATCH 4/6] scsi: dpt_i2o: drop stale VIRT_TO_BUS dependency

2022-06-06 Thread Arnd Bergmann
From: Arnd Bergmann 

The dpt_i2o driver was fixed to stop using virt_to_bus() in 2008, but
it kept a stale reference in a broken error handling code path that
could never work.

Fix it up to build without VIRT_TO_BUS and remove the Kconfig dependency.

The alternative to this would be to just remove the driver, as it is
clearly obsolete. The i2o driver layer was removed in 2015 with commit
4a72a7af462d ("staging: remove i2o subsystem"), but the even older
dpt_i2o scsi driver stayed around.

The last non-cleanup patches I could find were from Miquel van
Smoorenburg and Mark Salyzyn back in 2008, they might know if
there is any chance of the hardware still being used anywhere.

Fixes: 67af2b060e02 ("[SCSI] dpt_i2o: move from virt_to_bus/bus_to_virt to 
dma_alloc_coherent")
Cc: Miquel van Smoorenburg 
Cc: Mark Salyzyn 
Signed-off-by: Arnd Bergmann 
---
 drivers/scsi/Kconfig   | 2 +-
 drivers/scsi/dpt_i2o.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index a9fe5152addd..cf75588a2587 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -460,7 +460,7 @@ config SCSI_MVUMI
 
 config SCSI_DPT_I2O
tristate "Adaptec I2O RAID support "
-   depends on SCSI && PCI && VIRT_TO_BUS
+   depends on SCSI && PCI
help
  This driver supports all of Adaptec's I2O based RAID controllers as 
  well as the DPT SmartRaid V cards.  This is an Adaptec maintained
diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
index 2e9155ba7408..55dfe7011912 100644
--- a/drivers/scsi/dpt_i2o.c
+++ b/drivers/scsi/dpt_i2o.c
@@ -52,11 +52,11 @@ MODULE_DESCRIPTION("Adaptec I2O RAID Driver");
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
 #include  /* for boot_cpu_data */
-#include /* for virt_to_bus, etc. */
 
 #include 
 #include 
@@ -2112,7 +2112,7 @@ static irqreturn_t adpt_isr(int irq, void *dev_id)
} else {
/* Ick, we should *never* be here */
printk(KERN_ERR "dpti: reply frame not from pool\n");
-   reply = (u8 *)bus_to_virt(m);
+   goto out;
}
 
if (readl(reply) & MSG_FAIL) {
-- 
2.29.2



[PATCH 6/6] arch/*/: remove CONFIG_VIRT_TO_BUS

2022-06-06 Thread Arnd Bergmann
From: Arnd Bergmann 

All architecture-independent users of virt_to_bus() and bus_to_virt()
have been fixed to use the dma mapping interfaces or have been
removed now.  This means the definitions on most architectures, and the
CONFIG_VIRT_TO_BUS symbol are now obsolete and can be removed.

The only exceptions to this are a few network and scsi drivers for m68k
Amiga and VME machines and ppc32 Macintosh. These drivers work correctly
with the old interfaces and are probably not worth changing.

On alpha and parisc, virt_to_bus() were still used in asm/floppy.h.
alpha can use isa_virt_to_bus() like x86 does, and parisc can just
open-code the virt_to_phys() here, as this is architecture specific
code.

I tried updating the bus-virt-phys-mapping.rst documentation, which
started as an email from Linus to explain some details of the Linux-2.0
driver interfaces. The bits about virt_to_bus() were declared obsolete
back in 2000, and the rest is not all that relevant any more, so in the
end I just decided to remove the file completely.

Signed-off-by: Arnd Bergmann 
---
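For out-of-tree drivers that still need converting, the usual pattern (shown
here with made-up names, not code from any driver touched by this series) is
roughly:

	/* before: only correct when bus addresses equal physical addresses */
	desc->addr = virt_to_bus(buf);

	/* after: ask the DMA API for a device-visible address */
	dma_addr_t handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

	if (dma_mapping_error(dev, handle))
		return -ENOMEM;
	desc->addr = handle;
	/* ... and dma_unmap_single(dev, handle, len, DMA_TO_DEVICE) when done */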
 .../core-api/bus-virt-phys-mapping.rst| 220 --
 Documentation/core-api/dma-api-howto.rst  |  14 --
 Documentation/core-api/index.rst  |   1 -
 .../translations/zh_CN/core-api/index.rst |   1 -
 arch/alpha/Kconfig|   1 -
 arch/alpha/include/asm/floppy.h   |   2 +-
 arch/alpha/include/asm/io.h   |   8 +-
 arch/ia64/Kconfig |   1 -
 arch/ia64/include/asm/io.h|   8 -
 arch/m68k/Kconfig |   1 -
 arch/m68k/include/asm/virtconvert.h   |   4 +-
 arch/microblaze/Kconfig   |   1 -
 arch/microblaze/include/asm/io.h  |   2 -
 arch/mips/Kconfig |   1 -
 arch/mips/include/asm/io.h|   9 -
 arch/parisc/Kconfig   |   1 -
 arch/parisc/include/asm/floppy.h  |   4 +-
 arch/parisc/include/asm/io.h  |   2 -
 arch/powerpc/Kconfig  |   1 -
 arch/powerpc/include/asm/io.h |   2 -
 arch/riscv/include/asm/page.h |   1 -
 arch/x86/Kconfig  |   1 -
 arch/x86/include/asm/io.h |   9 -
 arch/xtensa/Kconfig   |   1 -
 arch/xtensa/include/asm/io.h  |   3 -
 include/asm-generic/io.h  |  14 --
 mm/Kconfig|   8 -
 27 files changed, 10 insertions(+), 311 deletions(-)
 delete mode 100644 Documentation/core-api/bus-virt-phys-mapping.rst

diff --git a/Documentation/core-api/bus-virt-phys-mapping.rst 
b/Documentation/core-api/bus-virt-phys-mapping.rst
deleted file mode 100644
index c72b24a7d52c..
--- a/Documentation/core-api/bus-virt-phys-mapping.rst
+++ /dev/null
@@ -1,220 +0,0 @@
-==
-How to access I/O mapped memory from within device drivers
-==
-
-:Author: Linus
-
-.. warning::
-
-   The virt_to_bus() and bus_to_virt() functions have been
-   superseded by the functionality provided by the PCI DMA interface
-   (see Documentation/core-api/dma-api-howto.rst).  They continue
-   to be documented below for historical purposes, but new code
-   must not use them. --davidm 00/12/12
-
-::
-
-  [ This is a mail message in response to a query on IO mapping, thus the
-strange format for a "document" ]
-
-The AHA-1542 is a bus-master device, and your patch makes the driver give the
-controller the physical address of the buffers, which is correct on x86
-(because all bus master devices see the physical memory mappings directly). 
-
-However, on many setups, there are actually **three** different ways of looking
-at memory addresses, and in this case we actually want the third, the
-so-called "bus address". 
-
-Essentially, the three ways of addressing memory are (this is "real memory",
-that is, normal RAM--see later about other details): 
-
- - CPU untranslated.  This is the "physical" address.  Physical address 
-   0 is what the CPU sees when it drives zeroes on the memory bus.
-
- - CPU translated address. This is the "virtual" address, and is 
-   completely internal to the CPU itself with the CPU doing the appropriate
-   translations into "CPU untranslated". 
-
- - bus address. This is the address of memory as seen by OTHER devices, 
-   not the CPU. Now, in theory there could be many different bus 
-   addresses, with each device seeing memory in some device-specific way, but
-   happily most hardware designers aren't actually actively trying to make
-   things any more complex than necessary, so you can assume that all 
-   external hardware sees the memory the same way. 
-
-Now, on normal PCs the bus address is exactly the same as the 

Re: [PATCH 0/6] phase out CONFIG_VIRT_TO_BUS

2022-06-06 Thread Greg Kroah-Hartman
On Mon, Jun 06, 2022 at 10:41:03AM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> The virt_to_bus/bus_to_virt interface has been deprecated for
> decades. After Jakub Kicinski put a lot of work into cleaning out the
> network drivers using them, there are only a couple of other drivers
> left, which can all be removed or otherwise cleaned up, to remove the
> old interface for good.
> 
> Any out of tree drivers using virt_to_bus() should be converted to
> using the dma-mapping interfaces, typically dma_alloc_coherent()
> or dma_map_single()).
> 
> There are a few m68k and ppc32 specific drivers that keep using the
> interfaces, but these are all guarded with architecture-specific
> Kconfig dependencies, and are not actually broken.
> 
> There are still a number of drivers that are using virt_to_phys()
> and phys_to_virt() in place of dma-mapping operations, and these
> are often broken, but they are out of scope for this series.

I'll take patches 1 and 2 right now through my staging tree if that's
ok.

thanks,

greg k-h


[PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits

2022-06-06 Thread John Garry via iommu
Streaming DMA mappings may be considerably slower when mappings go through
an IOMMU and the total mapping length is somewhat long. This is because the
IOMMU IOVA code allocates and frees an IOVA for each mapping, which may
affect performance.

For performance reasons set the request_queue max_sectors from
dma_opt_mapping_size(), which knows this mapping limit.

In addition, shost->max_sectors is repeatedly set for each sdev in
__scsi_init_queue(). This is unnecessary, so set it once when adding the
host.

Signed-off-by: John Garry 
Reviewed-by: Damien Le Moal 
Reviewed-by: Martin K. Petersen 
---
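For illustration (not part of the patch), assuming the 128 KiB optimum that
the IOMMU DMA path reports on a 4K-page system and SECTOR_SHIFT of 9, the cap
works out as:

	dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT = 131072 >> 9 = 256

so shost->max_sectors ends up limited to 256 sectors (128 KiB) per request.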
 drivers/scsi/hosts.c| 5 +
 drivers/scsi/scsi_lib.c | 4 
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 8352f90d997d..ea1a207634d1 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -236,6 +236,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct 
device *dev,
 
shost->dma_dev = dma_dev;
 
+   if (dma_dev->dma_mask) {
+   shost->max_sectors = min_t(unsigned int, shost->max_sectors,
+   dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
+   }
+
error = scsi_mq_setup_tags(shost);
if (error)
goto fail;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6ffc9e4258a8..6ce8acea322a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1884,10 +1884,6 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct 
request_queue *q)
blk_queue_max_integrity_segments(q, shost->sg_prot_tablesize);
}
 
-   if (dev->dma_mask) {
-   shost->max_sectors = min_t(unsigned int, shost->max_sectors,
-   dma_max_mapping_size(dev) >> SECTOR_SHIFT);
-   }
blk_queue_max_hw_sectors(q, shost->max_sectors);
blk_queue_segment_boundary(q, shost->dma_boundary);
dma_set_seg_boundary(dev, shost->dma_boundary);
-- 
2.26.2



[PATCH v3 4/4] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors

2022-06-06 Thread John Garry via iommu
ATA devices (struct ata_device) have a max_sectors field which is
configured internally in libata. This is then used to (re)configure the
associated sdev request queue max_sectors value, overriding the value set
earlier in __scsi_init_queue(). In __scsi_init_queue() the max_sectors value
is set according to shost limits, which include host DMA mapping limits.

Cap the ata_device max_sectors according to shost->max_sectors to respect
this shost limit.

Signed-off-by: John Garry 
Acked-by: Damien Le Moal 
---
 drivers/ata/libata-scsi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 42cecf95a4e5..8b4b318f378d 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1060,6 +1060,7 @@ int ata_scsi_dev_config(struct scsi_device *sdev, struct 
ata_device *dev)
dev->flags |= ATA_DFLAG_NO_UNLOAD;
 
/* configure max sectors */
+   dev->max_sectors = min(dev->max_sectors, sdev->host->max_sectors);
blk_queue_max_hw_sectors(q, dev->max_sectors);
 
if (dev->class == ATA_DEV_ATAPI) {
-- 
2.26.2



[PATCH v3 1/4] dma-mapping: Add dma_opt_mapping_size()

2022-06-06 Thread John Garry via iommu
Streaming DMA mappings involving an IOMMU may be much slower for larger
total mapping sizes. This is because every IOMMU DMA mapping requires an
IOVA to be allocated and freed. IOVA sizes above a certain limit are not
cached, which can have a big impact on DMA mapping performance.

Provide an API for device drivers to know this "optimal" limit, such that
they may try to produce mappings which don't exceed it.

Signed-off-by: John Garry 
Reviewed-by: Damien Le Moal 
---
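Not part of the patch, but for illustration: a driver that wants to honour the
new limit might do something along these lines (the struct and field names
here are made up; dma_opt_mapping_size() comes from linux/dma-mapping.h):

	/* Cap the driver's own transfer limit to the optimum mapping size. */
	static void foo_set_xfer_limit(struct foo_hw *hw, struct device *dev)
	{
		size_t opt = dma_opt_mapping_size(dev);

		hw->max_xfer_bytes = min_t(size_t, hw->max_xfer_bytes, opt);
	}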
 Documentation/core-api/dma-api.rst |  9 +
 include/linux/dma-map-ops.h|  1 +
 include/linux/dma-mapping.h|  5 +
 kernel/dma/mapping.c   | 12 
 4 files changed, 27 insertions(+)

diff --git a/Documentation/core-api/dma-api.rst 
b/Documentation/core-api/dma-api.rst
index 6d6d0edd2d27..b3cd9763d28b 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -204,6 +204,15 @@ Returns the maximum size of a mapping for the device. The 
size parameter
 of the mapping functions like dma_map_single(), dma_map_page() and
 others should not be larger than the returned value.
 
+::
+
+   size_t
+   dma_opt_mapping_size(struct device *dev);
+
+Returns the maximum optimal size of a mapping for the device. Mapping large
+buffers may take longer so device drivers are advised to limit total DMA
+streaming mappings length to the returned value.
+
 ::
 
bool
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 0d5b06b3a4a6..98ceba6fa848 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -69,6 +69,7 @@ struct dma_map_ops {
int (*dma_supported)(struct device *dev, u64 mask);
u64 (*get_required_mask)(struct device *dev);
size_t (*max_mapping_size)(struct device *dev);
+   size_t (*opt_mapping_size)(void);
unsigned long (*get_merge_boundary)(struct device *dev);
 };
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index dca2b1355bb1..fe3849434b2a 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -144,6 +144,7 @@ int dma_set_mask(struct device *dev, u64 mask);
 int dma_set_coherent_mask(struct device *dev, u64 mask);
 u64 dma_get_required_mask(struct device *dev);
 size_t dma_max_mapping_size(struct device *dev);
+size_t dma_opt_mapping_size(struct device *dev);
 bool dma_need_sync(struct device *dev, dma_addr_t dma_addr);
 unsigned long dma_get_merge_boundary(struct device *dev);
 struct sg_table *dma_alloc_noncontiguous(struct device *dev, size_t size,
@@ -266,6 +267,10 @@ static inline size_t dma_max_mapping_size(struct device 
*dev)
 {
return 0;
 }
+static inline size_t dma_opt_mapping_size(struct device *dev)
+{
+   return 0;
+}
 static inline bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
 {
return false;
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index db7244291b74..1bfe11b1edb6 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -773,6 +773,18 @@ size_t dma_max_mapping_size(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dma_max_mapping_size);
 
+size_t dma_opt_mapping_size(struct device *dev)
+{
+   const struct dma_map_ops *ops = get_dma_ops(dev);
+   size_t size = SIZE_MAX;
+
+   if (ops && ops->opt_mapping_size)
+   size = ops->opt_mapping_size();
+
+   return min(dma_max_mapping_size(dev), size);
+}
+EXPORT_SYMBOL_GPL(dma_opt_mapping_size);
+
 bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
 {
const struct dma_map_ops *ops = get_dma_ops(dev);
-- 
2.26.2



[PATCH v3 0/4] DMA mapping changes for SCSI core

2022-06-06 Thread John Garry via iommu
As reported in [0], DMA mappings whose size exceeds the IOMMU IOVA caching
limit may see a big performance hit.

This series introduces a new DMA mapping API, dma_opt_mapping_size(), so
that drivers may know this limit when performance is a factor in the
mapping.

Robin didn't like using dma_max_mapping_size() for this [1].

The SCSI core code is modified to use this limit.

I also added a patch for libata-scsi as it does not currently honour the
shost max_sectors limit.

Note: Christoph has previously kindly offered to take this series via the
  dma-mapping tree, so I think that we just need an ack from the
  IOMMU guys now. 

[0] 
https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leiz...@huawei.com/
[1] 
https://lore.kernel.org/linux-iommu/f5b78c9c-312e-70ab-ecbb-f14623a4b...@arm.com/

Changes since v2:
- Rebase on v5.19-rc1
- Add Damien's tag to 2/4 (thanks)

Changes since v1:
- Relocate scsi_add_host_with_dma() dma_dev check (Reported by Dan)
- Add tags from Damien and Martin (thanks)
  - note: I only added Martin's tag to the SCSI patch

John Garry (4):
  dma-mapping: Add dma_opt_mapping_size()
  dma-iommu: Add iommu_dma_opt_mapping_size()
  scsi: core: Cap shost max_sectors according to DMA optimum mapping
limits
  libata-scsi: Cap ata_device->max_sectors according to
shost->max_sectors

 Documentation/core-api/dma-api.rst |  9 +
 drivers/ata/libata-scsi.c  |  1 +
 drivers/iommu/dma-iommu.c  |  6 ++
 drivers/iommu/iova.c   |  5 +
 drivers/scsi/hosts.c   |  5 +
 drivers/scsi/scsi_lib.c|  4 
 include/linux/dma-map-ops.h|  1 +
 include/linux/dma-mapping.h|  5 +
 include/linux/iova.h   |  2 ++
 kernel/dma/mapping.c   | 12 
 10 files changed, 46 insertions(+), 4 deletions(-)

-- 
2.26.2



[PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size()

2022-06-06 Thread John Garry via iommu
Add the IOMMU callback for DMA mapping API dma_opt_mapping_size(), which
allows the drivers to know the optimal mapping limit and thus limit the
requested IOVA lengths.

This value is based on the IOVA rcache range limit, as IOVAs allocated
above this limit must always be newly allocated, which may be quite slow.

Signed-off-by: John Garry 
Reviewed-by: Damien Le Moal 
---
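For a rough sense of the numbers (not part of the patch; this assumes the
current IOVA_RANGE_CACHE_MAX_SIZE of 6 and 4K pages):

	iova_rcache_range() = PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1)
	                    = 4096 << 5
	                    = 128 KiB

so dma_opt_mapping_size() would report 128 KiB on such a system.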
 drivers/iommu/dma-iommu.c | 6 ++
 drivers/iommu/iova.c  | 5 +
 include/linux/iova.h  | 2 ++
 3 files changed, 13 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f90251572a5d..9e1586447ee8 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1459,6 +1459,11 @@ static unsigned long iommu_dma_get_merge_boundary(struct 
device *dev)
return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
 }
 
+static size_t iommu_dma_opt_mapping_size(void)
+{
+   return iova_rcache_range();
+}
+
 static const struct dma_map_ops iommu_dma_ops = {
.alloc  = iommu_dma_alloc,
.free   = iommu_dma_free,
@@ -1479,6 +1484,7 @@ static const struct dma_map_ops iommu_dma_ops = {
.map_resource   = iommu_dma_map_resource,
.unmap_resource = iommu_dma_unmap_resource,
.get_merge_boundary = iommu_dma_get_merge_boundary,
+   .opt_mapping_size   = iommu_dma_opt_mapping_size,
 };
 
 /*
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index db77aa675145..9f00b58d546e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain 
*iovad,
 static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
 static void free_iova_rcaches(struct iova_domain *iovad);
 
+unsigned long iova_rcache_range(void)
+{
+   return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
+}
+
 static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
 {
struct iova_domain *iovad;
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 320a70e40233..c6ba6d95d79c 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain 
*iovad, dma_addr_t iova)
 int iova_cache_get(void);
 void iova_cache_put(void);
 
+unsigned long iova_rcache_range(void);
+
 void free_iova(struct iova_domain *iovad, unsigned long pfn);
 void __free_iova(struct iova_domain *iovad, struct iova *iova);
 struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
-- 
2.26.2



Re: [PATCH 5/6] scsi: remove stale BusLogic driver

2022-06-06 Thread Arnd Bergmann
On Mon, Jun 6, 2022 at 12:40 PM Maciej W. Rozycki  wrote:
>
> On Mon, 6 Jun 2022, Arnd Bergmann wrote:
>
> > This was in turn fixed in commit 56f396146af2 ("scsi: BusLogic: Fix
> > 64-bit system enumeration error for Buslogic"), 8 years later.
> >
> > The fact that this was found at all is an indication that there are
> > users, and it seems that Maciej, Matt and Khalid all have access to
> > this hardware, but if it took eight years to find the problem,
> > it's likely that nobody actually relies on it.
>
>  Umm, I use it with a 32-bit system, so it would be quite an issue for me
> to discover a problem with 64-bit configurations.  And I quite rely on
> this system for various stuff too!

Ok, good to know.

> > Remove it as part of the global virt_to_bus()/bus_to_virt() removal.
> > If anyone is still interested in keeping this driver, the alternative
> > is to stop it from using bus_to_virt(), possibly along the lines of
> > how dpt_i2o gets around the same issue.
>
>  Thanks for the pointer and for cc-ing me.  Please refrain from removing
> the driver at least for this release cycle and let me fix it.

Sure, no problem. It looks like I forgot to actually Cc you on the series
as I had meant to, glad it reached you anyway.

> It should be easy to mimic what I did for the defza driver: all bus addresses 
> in
> the DMA API come associated with virtual addresses, so it is just a matter of
> recording those somewhere for later use rather than trying to mess up with
> bus addresses to figure out a reverse mapping.

I looked at the defza driver and that one appears simpler to me, as you
can look up the virtual address from an index, but it's possible I missed
what you do here. I meant specifically the calculation added by commit
67af2b060e02 ("[SCSI] dpt_i2o: move from virt_to_bus/bus_to_virt
to dma_alloc_coherent") in the dpt_i2o driver that can be used to
compute the virtual address. If something simpler works as well, that
is of course better.
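
If I understand the suggestion correctly, the shape of it would be roughly
the following (names entirely made up, just to illustrate recording the
virtual address at allocation time instead of reversing the bus address
later):

	/* setup: remember both addresses for each CCB */
	ccb->virt = dma_alloc_coherent(dev, size, &ccb->dma_handle, GFP_KERNEL);

	/* completion: look the CCB up by its recorded bus address */
	for (i = 0; i < adapter->ccb_count; i++)
		if (adapter->ccbs[i].dma_handle == busaddr)
			return adapter->ccbs[i].virt;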

Arnd


Re: [PATCH 2/5] iommu: Ensure device has the same iommu_ops as the domain

2022-06-06 Thread Robin Murphy

On 2022-06-06 07:19, Nicolin Chen wrote:

The core code should not call an iommu driver op with a struct device
parameter unless it knows that the dev_iommu_priv_get() for that struct
device was setup by the same driver. Otherwise in a mixed driver system
the iommu_priv could be casted to the wrong type.


We don't have mixed-driver systems, and there are plenty more 
significant problems than this one to solve before we can (but thanks 
for pointing it out - I hadn't got as far as auditing the public 
interfaces yet). Once domains are allocated via a particular device's 
IOMMU instance in the first place, there will be ample opportunity for 
the core to stash suitable identifying information in the domain for 
itself. TBH even the current code could do it without needing the 
weirdly invasive changes here.



Store the iommu_ops pointer in the iommu_domain and use it as a check to
validate that the struct device is correct before invoking any domain op
that accepts a struct device.


In fact this even describes exactly that - "Store the iommu_ops pointer 
in the iommu_domain", vs. the "Store the iommu_ops pointer in the 
iommu_domain_ops" which the patch is actually doing :/


[...]

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 19cf28d40ebe..8a1f437a51f2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1963,6 +1963,10 @@ static int __iommu_attach_device(struct iommu_domain 
*domain,
  {
int ret;
  
+	/* Ensure the device was probe'd onto the same driver as the domain */

+   if (dev->bus->iommu_ops != domain->ops->iommu_ops)


Nope, dev_iommu_ops(dev) please. Furthermore I think the logical place 
to put this is in iommu_group_do_attach_device(), since that's the 
gateway for the public interfaces - we shouldn't need to second-guess 
ourselves for internal default-domain-related calls.


Thanks,
Robin.


+   return -EMEDIUMTYPE;
+
if (unlikely(domain->ops->attach_dev == NULL))
return -ENODEV;



Re: [PATCH 0/6] phase out CONFIG_VIRT_TO_BUS

2022-06-06 Thread Arnd Bergmann
On Mon, Jun 6, 2022 at 11:25 AM Greg Kroah-Hartman
 wrote:
>
> I'll take patches 1 and 2 right now through my staging tree if that's
> ok.

Yes, that's perfect, as there are no actual interdependencies with the
other drivers -- applying the last patch first would just hide the driver
I'm removing here.

Thanks,

  Arnd


Re: [PATCH 5/6] scsi: remove stale BusLogic driver

2022-06-06 Thread Khalid Aziz

On 6/6/22 02:41, Arnd Bergmann wrote:

From: Arnd Bergmann

The BusLogic driver is the last remaining driver that relies on the
deprecated bus_to_virt() function, which in turn only works on a few
architectures, and is incompatible with both swiotlb and iommu support.

Before commit 391e2f25601e ("[SCSI] BusLogic: Port driver to 64-bit."),
the driver had a dependency on x86-32, presumably because of this
problem. However, the change introduced another bug that made it still
impossible to use the driver on any 64-bit machine.

This was in turn fixed in commit 56f396146af2 ("scsi: BusLogic: Fix
64-bit system enumeration error for Buslogic"), 8 years later.

The fact that this was found at all is an indication that there are
users, and it seems that Maciej, Matt and Khalid all have access to
this hardware, but if it took eight years to find the problem,
it's likely that nobody actually relies on it.

Remove it as part of the global virt_to_bus()/bus_to_virt() removal.
If anyone is still interested in keeping this driver, the alternative
is to stop it from using bus_to_virt(), possibly along the lines of
how dpt_i2o gets around the same issue.

Cc: Maciej W. Rozycki
Cc: Matt Wang
Cc: Khalid Aziz
Signed-off-by: Arnd Bergmann
---
  Documentation/scsi/BusLogic.rst   |  581 ---
  Documentation/scsi/FlashPoint.rst |  176 -
  MAINTAINERS   |7 -
  drivers/scsi/BusLogic.c   | 3727 --
  drivers/scsi/BusLogic.h   | 1284 -
  drivers/scsi/FlashPoint.c | 7560 -
  drivers/scsi/Kconfig  |   24 -
  7 files changed, 13359 deletions(-)
  delete mode 100644 Documentation/scsi/BusLogic.rst
  delete mode 100644 Documentation/scsi/FlashPoint.rst
  delete mode 100644 drivers/scsi/BusLogic.c
  delete mode 100644 drivers/scsi/BusLogic.h
  delete mode 100644 drivers/scsi/FlashPoint.c


I would say no to removing BusLogic driver. Virtualbox is another 
consumer of this driver. This driver is very old but I would rather fix 
the issues than remove it until we do not have any users.


Thanks,
Khalid


Re: [PATCH 2/5] iommu: Ensure device has the same iommu_ops as the domain

2022-06-06 Thread Nicolin Chen via iommu
Hi Robin,

On Mon, Jun 06, 2022 at 03:33:42PM +0100, Robin Murphy wrote:
> On 2022-06-06 07:19, Nicolin Chen wrote:
> > The core code should not call an iommu driver op with a struct device
> > parameter unless it knows that the dev_iommu_priv_get() for that struct
> > device was setup by the same driver. Otherwise in a mixed driver system
> > the iommu_priv could be casted to the wrong type.
> 
> We don't have mixed-driver systems, and there are plenty more
> significant problems than this one to solve before we can (but thanks
> for pointing it out - I hadn't got as far as auditing the public
> interfaces yet). Once domains are allocated via a particular device's
> IOMMU instance in the first place, there will be ample opportunity for
> the core to stash suitable identifying information in the domain for
> itself. TBH even the current code could do it without needing the
> weirdly invasive changes here.

Do you have an alternative and less invasive solution in mind?

> > Store the iommu_ops pointer in the iommu_domain and use it as a check to
> > validate that the struct device is correct before invoking any domain op
> > that accepts a struct device.
> 
> In fact this even describes exactly that - "Store the iommu_ops pointer
> in the iommu_domain", vs. the "Store the iommu_ops pointer in the
> iommu_domain_ops" which the patch is actually doing :/

Will fix that.

> [...]
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index 19cf28d40ebe..8a1f437a51f2 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -1963,6 +1963,10 @@ static int __iommu_attach_device(struct iommu_domain 
> > *domain,
> >   {
> >   int ret;
> > 
> > + /* Ensure the device was probe'd onto the same driver as the domain */
> > + if (dev->bus->iommu_ops != domain->ops->iommu_ops)
> 
> Nope, dev_iommu_ops(dev) please. Furthermore I think the logical place
> to put this is in iommu_group_do_attach_device(), since that's the
> gateway for the public interfaces - we shouldn't need to second-guess
> ourselves for internal default-domain-related calls.

Will move to iommu_group_do_attach_device and change to dev_iommu_ops.

Thanks!
Nic


Re: [PATCH 2/5] iommu: Ensure device has the same iommu_ops as the domain

2022-06-06 Thread Robin Murphy

On 2022-06-06 17:51, Nicolin Chen wrote:

Hi Robin,

On Mon, Jun 06, 2022 at 03:33:42PM +0100, Robin Murphy wrote:

On 2022-06-06 07:19, Nicolin Chen wrote:

The core code should not call an iommu driver op with a struct device
parameter unless it knows that the dev_iommu_priv_get() for that struct
device was setup by the same driver. Otherwise in a mixed driver system
the iommu_priv could be casted to the wrong type.


We don't have mixed-driver systems, and there are plenty more
significant problems than this one to solve before we can (but thanks
for pointing it out - I hadn't got as far as auditing the public
interfaces yet). Once domains are allocated via a particular device's
IOMMU instance in the first place, there will be ample opportunity for
the core to stash suitable identifying information in the domain for
itself. TBH even the current code could do it without needing the
weirdly invasive changes here.


Do you have an alternative and less invasive solution in mind?


Store the iommu_ops pointer in the iommu_domain and use it as a check to
validate that the struct device is correct before invoking any domain op
that accepts a struct device.


In fact this even describes exactly that - "Store the iommu_ops pointer
in the iommu_domain", vs. the "Store the iommu_ops pointer in the
iommu_domain_ops" which the patch is actually doing :/


Will fix that.


Well, as before I'd prefer to make the code match the commit message - 
if I really need to spell it out, see below - since I can't imagine that 
we should ever have need to identify a set of iommu_domain_ops in 
isolation, therefore I think it's considerably clearer to use the 
iommu_domain itself. However, either way we really don't need this yet, 
so we may as well just go ahead and remove the redundant test from VFIO 
anyway, and I can add some form of this patch to my dev branch for now.


Thanks,
Robin.

->8-
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index cde2e1d6ab9b..72990edc9314 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1902,6 +1902,7 @@ static struct iommu_domain 
*__iommu_domain_alloc(struct device *dev,

domain->type = type;
/* Assume all sizes by default; the driver may override this later */
domain->pgsize_bitmap = ops->pgsize_bitmap;
+   domain->owner = ops;
if (!domain->ops)
domain->ops = ops->default_domain_ops;

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 6f64cbbc6721..79e557207f53 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -89,6 +89,7 @@ struct iommu_domain_geometry {

 struct iommu_domain {
unsigned type;
+   const struct iommu_ops *owner; /* Who allocated this domain */
const struct iommu_domain_ops *ops;
unsigned long pgsize_bitmap;/* Bitmap of page sizes in use */
iommu_fault_handler_t handler;


Re: [PATCH 2/5] iommu: Ensure device has the same iommu_ops as the domain

2022-06-06 Thread Nicolin Chen via iommu
On Mon, Jun 06, 2022 at 06:50:33PM +0100, Robin Murphy wrote:
> External email: Use caution opening links or attachments
> 
> 
> On 2022-06-06 17:51, Nicolin Chen wrote:
> > Hi Robin,
> > 
> > On Mon, Jun 06, 2022 at 03:33:42PM +0100, Robin Murphy wrote:
> > > On 2022-06-06 07:19, Nicolin Chen wrote:
> > > > The core code should not call an iommu driver op with a struct device
> > > > parameter unless it knows that the dev_iommu_priv_get() for that struct
> > > > device was setup by the same driver. Otherwise in a mixed driver system
> > > > the iommu_priv could be casted to the wrong type.
> > > 
> > > We don't have mixed-driver systems, and there are plenty more
> > > significant problems than this one to solve before we can (but thanks
> > > for pointing it out - I hadn't got as far as auditing the public
> > > interfaces yet). Once domains are allocated via a particular device's
> > > IOMMU instance in the first place, there will be ample opportunity for
> > > the core to stash suitable identifying information in the domain for
> > > itself. TBH even the current code could do it without needing the
> > > weirdly invasive changes here.
> > 
> > Do you have an alternative and less invasive solution in mind?
> > 
> > > > Store the iommu_ops pointer in the iommu_domain and use it as a check to
> > > > validate that the struct device is correct before invoking any domain op
> > > > that accepts a struct device.
> > > 
> > > In fact this even describes exactly that - "Store the iommu_ops pointer
> > > in the iommu_domain", vs. the "Store the iommu_ops pointer in the
> > > iommu_domain_ops" which the patch is actually doing :/
> > 
> > Will fix that.
> 
> Well, as before I'd prefer to make the code match the commit message -
> if I really need to spell it out, see below - since I can't imagine that
> we should ever have need to identify a set of iommu_domain_ops in
> isolation, therefore I think it's considerably clearer to use the
> iommu_domain itself. However, either way we really don't need this yet,
> so we may as well just go ahead and remove the redundant test from VFIO
> anyway, and I can add some form of this patch to my dev branch for now.

I see. The version below is much cleaner. Yet, it'd mean having a
pointer in every iommu_domain vs. one pointer per driver. Jason
pointed out to me earlier that doing so would add unnecessary memory
overhead on platforms that have considerable numbers of masters.

Since we know that it'd be safe to exclude this single change from
this series, I can drop it in next version, if you don't like the
change.

Thanks!
Nic

> ->8-
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index cde2e1d6ab9b..72990edc9314 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1902,6 +1902,7 @@ static struct iommu_domain
> *__iommu_domain_alloc(struct device *dev,
>domain->type = type;
>/* Assume all sizes by default; the driver may override this later */
>domain->pgsize_bitmap = ops->pgsize_bitmap;
> +   domain->owner = ops;
>if (!domain->ops)
>domain->ops = ops->default_domain_ops;
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 6f64cbbc6721..79e557207f53 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -89,6 +89,7 @@ struct iommu_domain_geometry {
> 
>  struct iommu_domain {
>unsigned type;
> +   const struct iommu_ops *owner; /* Who allocated this domain */
>const struct iommu_domain_ops *ops;
>unsigned long pgsize_bitmap;/* Bitmap of page sizes in use */
>iommu_fault_handler_t handler;


Re: [PATCH 2/5] iommu: Ensure device has the same iommu_ops as the domain

2022-06-06 Thread Jason Gunthorpe via iommu
On Mon, Jun 06, 2022 at 11:28:49AM -0700, Nicolin Chen wrote:

> > Well, as before I'd prefer to make the code match the commit message -
> > if I really need to spell it out, see below - since I can't imagine that
> > we should ever have need to identify a set of iommu_domain_ops in
> > isolation, therefore I think it's considerably clearer to use the
> > iommu_domain itself. However, either way we really don't need this yet,
> > so we may as well just go ahead and remove the redundant test from VFIO
> > anyway, and I can add some form of this patch to my dev branch for now.
> 
> I see. The version below is much cleaner. Yet, it'd become having a
> common pointer per iommu_domain vs. one pointer per driver. Jason
> pointed it out to me earlier that by doing so memory waste would be
> unnecessary on platforms that have considerable numbers of masters.

I had meant that using struct iommu_domain, when there is a simple
solution that keeps it in rodata, doesn't seem ideal.

I don't quite understand the reluctance to make small changes to
drivers, it was the same logic with the default_domain_ops thing too.

> Since we know that it'd be safe to exclude this single change from
> this series, I can drop it in next version, if you don't like the
> change.

But this is fine too, that really is half the point of this series..

Jason