Re: [PATCH 1/2] PCI: Save properties required to handle FLR for replay purposes.

2017-05-24 Thread Bjorn Helgaas
On Tue, May 23, 2017 at 03:33:22PM -0500, Bjorn Helgaas wrote:
> On Wed, May 10, 2017 at 11:39:02AM -0700, Ashok Raj wrote:
> > From: CQ Tang 
> > 
> > Requires: https://patchwork.kernel.org/patch/9593891
> 
> I'm not sure what the status of the patch above is.  I acked it, but it's
> part of a 30-patch IOMMU series, so I expect it to be merged via an IOMMU
> tree.
> 
> In any case, it's not in v4.12-rc1, so I can't apply *this* patch yet.

Ashok or CQ, would you mind reposting this when the patch it depends
on has been merged?  I'm going to drop it from patchwork for now since
I can't do anything with it, and that means it will completely
disappear from my to-do list.

Bjorn
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 6/7] iommu/arm-smmu-v3: Add support for PCI ATS

2017-05-24 Thread Jean-Philippe Brucker
PCIe devices can implement their own TLB, named Address Translation Cache
(ATC). Enable Address Translation Service (ATS) for devices that support
it and send them invalidation requests whenever we invalidate the IOTLBs.

  Range calculation
  -

The invalidation packet itself is a bit awkward: range must be naturally
aligned, which means that the start address is a multiple of the range
size. In addition, the size must be a power of two number of 4k pages. We
have a few options to enforce this constraint:

(1) Find the smallest naturally aligned region that covers the requested
range. This is simple to compute and only takes one ATC_INV, but it
will spill on lots of neighbouring ATC entries.

(2) Align the start address to the region size (rounded up to a power of
two), and send a second invalidation for the next range of the same
size. Still not great, but reduces spilling.

(3) Cover the range exactly with the smallest number of naturally aligned
regions. This would be interesting to implement but as for (2),
requires multiple ATC_INV.

As I suspect ATC invalidation packets will be a very scarce resource, I'll
go with option (1) for now, and only send one big invalidation. We can
move to (2), which is both easier to read and more gentle with the ATC,
once we've observed on real systems that we can send multiple smaller
Invalidation Requests for roughly the same price as a single big one.

Note that with io-pgtable, the unmap function is called for each page, so
this doesn't matter. The problem shows up when sharing page tables with
the MMU.

  Timeout
  ---

ATC invalidation is allowed to take up to 90 seconds, according to the
PCIe spec, so it is possible to hit the SMMU command queue timeout during
normal operations.

Some SMMU implementations will raise a CERROR_ATC_INV_SYNC when a CMD_SYNC
fails because of an ATC invalidation. Some will just abort the CMD_SYNC.
Others might let CMD_SYNC complete and have an asynchronous IMPDEF
mechanism to record the error. When we receive a CERROR_ATC_INV_SYNC, we
could retry sending all ATC_INV since last successful CMD_SYNC. When a
CMD_SYNC fails without CERROR_ATC_INV_SYNC, we could retry sending *all*
commands since last successful CMD_SYNC.

We cannot afford to wait 90 seconds in iommu_unmap, let alone MMU
notifiers. So we'd have to introduce a more clever system if this timeout
becomes a problem, like keeping hold of mappings and invalidating in the
background. Implementing safe delayed invalidations is a very complex
problem and deserves a series of its own. We'll assess whether more work
is needed to properly handle ATC invalidation timeouts once this code runs
on real hardware.

  Misc
  

I didn't put ATC and TLB invalidations in the same functions for three
reasons:

* TLB invalidation by range is batched and committed with a single sync.
  Batching ATC invalidation is inconvenient, endpoints limit the number of
  inflight invalidations. We'd have to count the number of invalidations
  queued and send a sync periodically. In addition, I suspect we always
  need a sync between TLB and ATC invalidation for the same page.

* Doing ATC invalidation outside tlb_inv_range also allows to send less
  requests, since TLB invalidations are done per page or block, while ATC
  invalidations target IOVA ranges.

* TLB invalidation by context is performed when freeing the domain, at
  which point there isn't any device attached anymore.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 232 ++--
 1 file changed, 225 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 524e1b051962..87ed6239b9a6 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -102,6 +103,7 @@
 #define IDR5_OAS_48_BIT(5 << IDR5_OAS_SHIFT)
 
 #define ARM_SMMU_CR0   0x20
+#define CR0_ATSCHK (1 << 4)
 #define CR0_CMDQEN (1 << 3)
 #define CR0_EVTQEN (1 << 2)
 #define CR0_PRIQEN (1 << 1)
@@ -343,6 +345,7 @@
 #define CMDQ_ERR_CERROR_NONE_IDX   0
 #define CMDQ_ERR_CERROR_ILL_IDX1
 #define CMDQ_ERR_CERROR_ABT_IDX2
+#define CMDQ_ERR_CERROR_ATC_INV_IDX3
 
 #define CMDQ_0_OP_SHIFT0
 #define CMDQ_0_OP_MASK 0xffUL
@@ -364,6 +367,15 @@
 #define CMDQ_TLBI_1_VA_MASK~0xfffUL
 #define CMDQ_TLBI_1_IPA_MASK   0xf000UL
 
+#define CMDQ_ATC_0_SSID_SHIFT  12
+#define CMDQ_ATC_0_SSID_MASK   0xfUL
+#define CMDQ_ATC_0_SID_SHIFT   32
+#define CMDQ_ATC_0_SID_MASK0xUL
+#define CMDQ_ATC_0_GLOBAL  (1UL << 9)
+#define CMDQ_ATC_1_SIZE_SHIFT   

[PATCH 7/7] iommu/arm-smmu-v3: Disable tagged pointers

2017-05-24 Thread Jean-Philippe Brucker
The ARM architecture has a "Top Byte Ignore" (TBI) option that makes the
MMU mask out bits [63:56] of an address, allowing a userspace application
to store data in its pointers. This option is incompatible with PCI ATS.

If TBI is enabled in the SMMU and userspace triggers DMA transactions on
tagged pointers, the endpoint might create ATC entries for addresses that
include a tag. Software would then have to send ATC invalidation packets
for each 255 possible alias of an address, or just wipe the whole address
space. This is not a viable option, so disable TBI.

The impact of this change is unclear, since there are very few users of
tagged pointers, much less SVM. But the requirement introduced by this
patch doesn't seem excessive: a userspace application using both tagged
pointers and SVM should now sanitize addresses (clear the tag) before
using them for device DMA.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 87ed6239b9a6..0b2674f8ba0f 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -997,7 +997,6 @@ static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr)
val |= ARM_SMMU_TCR2CD(tcr, EPD0);
val |= ARM_SMMU_TCR2CD(tcr, EPD1);
val |= ARM_SMMU_TCR2CD(tcr, IPS);
-   val |= ARM_SMMU_TCR2CD(tcr, TBI0);
 
return val;
 }
-- 
2.12.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 5/7] iommu/arm-smmu-v3: Link domains and devices

2017-05-24 Thread Jean-Philippe Brucker
When removing a mapping from a domain, we need to send an invalidation to
all devices that might have stored it in their Address Translation Cache
(ATC). In addition with SVM, we'll need to invalidate context descriptors
of all devices attached to a domain.

Maintain a list of devices in each domain, protected by a spinlock. It is
updated every time we attach or detach devices to and from domains.

It needs to be a spinlock because we'll invalidate ATC entries from
within hardirq-safe contexts, but it may be possible to relax the read
side with RCU later.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 380969aa60d5..524e1b051962 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -630,6 +630,9 @@ struct arm_smmu_device {
 struct arm_smmu_master_data {
struct arm_smmu_device  *smmu;
struct arm_smmu_strtab_ent  ste;
+
+   struct arm_smmu_domain  *domain;
+   struct list_headlist; /* domain->devices */
 };
 
 /* SMMU private data for an IOMMU domain */
@@ -654,6 +657,9 @@ struct arm_smmu_domain {
};
 
struct iommu_domain domain;
+
+   struct list_headdevices;
+   spinlock_t  devices_lock;
 };
 
 struct arm_smmu_option_prop {
@@ -1407,6 +1413,9 @@ static struct iommu_domain 
*arm_smmu_domain_alloc(unsigned type)
 
mutex_init(_domain->init_mutex);
spin_lock_init(_domain->pgtbl_lock);
+   INIT_LIST_HEAD(_domain->devices);
+   spin_lock_init(_domain->devices_lock);
+
return _domain->domain;
 }
 
@@ -1609,7 +1618,17 @@ static void arm_smmu_install_ste_for_dev(struct 
iommu_fwspec *fwspec)
 
 static void arm_smmu_detach_dev(struct device *dev)
 {
+   unsigned long flags;
struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv;
+   struct arm_smmu_domain *smmu_domain = master->domain;
+
+   if (smmu_domain) {
+   spin_lock_irqsave(_domain->devices_lock, flags);
+   list_del(>list);
+   spin_unlock_irqrestore(_domain->devices_lock, flags);
+
+   master->domain = NULL;
+   }
 
master->ste.assigned = false;
arm_smmu_install_ste_for_dev(dev->iommu_fwspec);
@@ -1618,6 +1637,7 @@ static void arm_smmu_detach_dev(struct device *dev)
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
int ret = 0;
+   unsigned long flags;
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_master_data *master;
@@ -1653,6 +1673,11 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
}
 
ste->assigned = true;
+   master->domain = smmu_domain;
+
+   spin_lock_irqsave(_domain->devices_lock, flags);
+   list_add(>list, _domain->devices);
+   spin_unlock_irqrestore(_domain->devices_lock, flags);
 
if (smmu_domain->stage == ARM_SMMU_DOMAIN_BYPASS) {
ste->s1_cfg = NULL;
-- 
2.12.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 4/7] ACPI/IORT: Check ATS capability in root complex nodes

2017-05-24 Thread Jean-Philippe Brucker
Root complex node in IORT has a bit telling whether it supports ATS or
not. Store this bit in the IOMMU fwspec when setting up a device, so it
can be accessed later by an IOMMU driver.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/acpi/arm64/iort.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index c5fecf97ee2f..e7d2d156d78f 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -720,6 +720,14 @@ void iort_set_dma_mask(struct device *dev)
dev->dma_mask = >coherent_dma_mask;
 }
 
+static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node)
+{
+   struct acpi_iort_root_complex *pci_rc;
+
+   pci_rc = (struct acpi_iort_root_complex *)node->node_data;
+   return pci_rc->ats_attribute & ACPI_IORT_ATS_SUPPORTED;
+}
+
 /**
  * iort_iommu_configure - Set-up IOMMU configuration for a device.
  *
@@ -752,6 +760,8 @@ const struct iommu_ops *iort_iommu_configure(struct device 
*dev)
 
ops = iort_iommu_xlate(dev, parent, streamid);
 
+   if (!IS_ERR_OR_NULL(ops) && iort_pci_rc_supports_ats(node))
+   dev->iommu_fwspec->flags |= 
IOMMU_FWSPEC_PCI_RC_SUPPORTS_ATS;
} else {
int i = 0;
 
-- 
2.12.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/7] dt-bindings: PCI: Describe ATS property for root complex nodes

2017-05-24 Thread Jean-Philippe Brucker
Address Translation Service (ATS) is an extension to PCIe allowing
endpoints to manage their own IOTLB, called Address Translation Cache
(ATC). Instead of having every memory transaction processed by the IOMMU,
the endpoint can first send an Address Translation Requests for an IOVA,
obtain the corresponding Physical Address from the IOMMU and store it in
its ATC. Subsequent transactions to this memory region can be performed on
the PA, in which case they are marked 'translated' and (partially) bypass
the IOMMU.

Since the extension uses fields that were previously reserved in the
PCIe Translation Layer Packet, it seems ill-advised to enabled it on a
system that doesn't fully support ATS.

To "old" root complexes that simply ignored the new AT field, an Address
Translation Request will look exactly like a Memory Read Request, so the
root bridge will forward a memory read to the IOMMU instead of a
translation request. If the access succeeds, the RC will send a Read
Completion, which looks like a Translation Completion, back to the
endpoint. As a result the endpoint might end up storing the content of
memory instead of a physical address in its ATC. In reality, it's more
likely that the size fields will be invalid and either end will detect the
error, but in any case, it is undesirable.

Add a way for firmware to tell the OS that ATS is supported by the PCI
root complex.

Signed-off-by: Jean-Philippe Brucker 
---
 Documentation/devicetree/bindings/pci/pci-iommu.txt | 8 
 1 file changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/pci/pci-iommu.txt 
b/Documentation/devicetree/bindings/pci/pci-iommu.txt
index 0def586fdcdf..f21a68ec471a 100644
--- a/Documentation/devicetree/bindings/pci/pci-iommu.txt
+++ b/Documentation/devicetree/bindings/pci/pci-iommu.txt
@@ -44,6 +44,14 @@ Optional properties
 - iommu-map-mask: A mask to be applied to each Requester ID prior to being
   mapped to an IOMMU specifier per the iommu-map property.
 
+- ats-supported: if present, the root complex supports the Address
+  Translation Service (ATS). It is able to interpret the AT field in PCIe
+  Transaction Layer Packets, and forward Translation Completions or
+  Invalidation Requests to endpoints.
+
+  Device drivers should not enable ATS in endpoints unless this property
+  is present.
+
 
 Example (1)
 ===
-- 
2.12.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 3/7] iommu/of: Check ATS capability in root complex nodes

2017-05-24 Thread Jean-Philippe Brucker
The PCI root complex node in DT has a property indicating whether it
supports ATS. Store this bit in the IOMMU fwspec when initializing a
device, so it can be accessed later by an IOMMU driver.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/of_iommu.c | 8 
 include/linux/iommu.h| 4 
 2 files changed, 12 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 9f44ee8ea1bc..3d8168e80634 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -147,6 +147,11 @@ static int __get_pci_rid(struct pci_dev *pdev, u16 alias, 
void *data)
return iommu_spec->np == pdev->bus->dev.of_node;
 }
 
+static bool of_pci_rc_supports_ats(struct device_node *rc_node)
+{
+   return of_property_read_bool(rc_node, "ats-supported");
+}
+
 static const struct iommu_ops
 *of_pci_iommu_init(struct pci_dev *pdev, struct device_node *bridge_np)
 {
@@ -175,6 +180,9 @@ static const struct iommu_ops
 
ops = of_iommu_xlate(>dev, _spec);
 
+   if (!IS_ERR_OR_NULL(ops) && of_pci_rc_supports_ats(bridge_np))
+   pdev->dev.iommu_fwspec->flags |= 
IOMMU_FWSPEC_PCI_RC_SUPPORTS_ATS;
+
of_node_put(iommu_spec.np);
return ops;
 }
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2cb54adc4a33..206821b9044c 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -360,6 +360,7 @@ extern struct iommu_group *generic_device_group(struct 
device *dev);
  * @ops: ops for this device's IOMMU
  * @iommu_fwnode: firmware handle for this device's IOMMU
  * @iommu_priv: IOMMU driver private data for this device
+ * @flags: miscellaneous properties for this device
  * @num_ids: number of associated device IDs
  * @ids: IDs which this device may present to the IOMMU
  */
@@ -367,10 +368,13 @@ struct iommu_fwspec {
const struct iommu_ops  *ops;
struct fwnode_handle*iommu_fwnode;
void*iommu_priv;
+   u32 flags;
unsigned intnum_ids;
u32 ids[1];
 };
 
+#define IOMMU_FWSPEC_PCI_RC_SUPPORTS_ATS   (1 << 0)
+
 int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
  const struct iommu_ops *ops);
 void iommu_fwspec_free(struct device *dev);
-- 
2.12.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/7] PCI: Move ATS declarations outside of CONFIG_PCI

2017-05-24 Thread Jean-Philippe Brucker
Currently ATS helpers like pci_enable_ats are only defined when CONFIG_PCI
is enabled. The ARM SMMU driver might get built with CONFIG_PCI disabled.
It would thus have to wrap any use of ATS helpers around #ifdef
CONFIG_PCI, which isn't ideal.

A nicer solution is to always define these helpers. Since CONFIG_PCI_ATS
is only enabled in association with CONFIG_PCI, move defines outside of
CONFIG_PCI to prevent build failure when PCI is disabled.

Signed-off-by: Jean-Philippe Brucker 
Acked-by: Bjorn Helgaas 
---
 include/linux/pci.h | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 33c2b0b77429..775d495ef8d7 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1415,19 +1415,6 @@ int  ht_create_irq(struct pci_dev *dev, int idx);
 void ht_destroy_irq(unsigned int irq);
 #endif /* CONFIG_HT_IRQ */
 
-#ifdef CONFIG_PCI_ATS
-/* Address Translation Service */
-void pci_ats_init(struct pci_dev *dev);
-int pci_enable_ats(struct pci_dev *dev, int ps);
-void pci_disable_ats(struct pci_dev *dev);
-int pci_ats_queue_depth(struct pci_dev *dev);
-#else
-static inline void pci_ats_init(struct pci_dev *d) { }
-static inline int pci_enable_ats(struct pci_dev *d, int ps) { return -ENODEV; }
-static inline void pci_disable_ats(struct pci_dev *d) { }
-static inline int pci_ats_queue_depth(struct pci_dev *d) { return -ENODEV; }
-#endif
-
 #ifdef CONFIG_PCIE_PTM
 int pci_enable_ptm(struct pci_dev *dev, u8 *granularity);
 #else
@@ -1613,6 +1600,19 @@ static inline int pci_get_new_domain_nr(void) { return 
-ENOSYS; }
 #define dev_is_pf(d) (false)
 #endif /* CONFIG_PCI */
 
+#ifdef CONFIG_PCI_ATS
+/* Address Translation Service */
+void pci_ats_init(struct pci_dev *dev);
+int pci_enable_ats(struct pci_dev *dev, int ps);
+void pci_disable_ats(struct pci_dev *dev);
+int pci_ats_queue_depth(struct pci_dev *dev);
+#else
+static inline void pci_ats_init(struct pci_dev *d) { }
+static inline int pci_enable_ats(struct pci_dev *d, int ps) { return -ENODEV; }
+static inline void pci_disable_ats(struct pci_dev *d) { }
+static inline int pci_ats_queue_depth(struct pci_dev *d) { return -ENODEV; }
+#endif
+
 /* Include architecture-dependent settings and functions */
 
 #include 
-- 
2.12.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 0/7] Add PCI ATS support to SMMUv3

2017-05-24 Thread Jean-Philippe Brucker
PCIe devices can implement their own TLB, named Address Translation Cache
(ATC). In order to support Address Translation Service (ATS), the
following changes are needed in software:

* Enable ATS on endpoints when the system supports it. Both PCI root
  complex and associated SMMU must implement the ATS protocol.

* When unmapping an IOVA, send an ATC invalidate request to the endpoint
  in addition to the usual SMMU IOTLB invalidations.

I previously sent this as part of a lengthy RFC [1] adding SVM (ATS +
PASID + PRI) support to SMMUv3. The next PASID/PRI version is almost
ready, but isn't likely to get merged because it needs hardware testing,
so I will send it later. PRI depends on ATS, but ATS should be useful on
its own.

Without PASID and PRI, ATS is used for accelerating transactions. Instead
of having all memory accesses go through SMMU translation, the endpoint
can translate IOVA->PA once, store the result in its ATC, then issue
subsequent transactions using the PA, partially bypassing the SMMU. So in
theory it should be faster while keeping the advantages of an IOMMU,
namely scatter-gather and access control.

The ATS patches can now be tested on some hardware, even though the lack
of compatible PCI endpoints makes it difficult to assess what performance
optimizations we need. That's why the ATS implementation is a bit rough at
the moment, and we will work on optimizing things like invalidation ranges
later.

Since the RFC [1]:
* added DT and ACPI patches,
* added invalidate-all on domain detach,
* removed smmu_group again,
* removed invalidation print from the fast path,
* disabled tagged pointers for good,
* some style changes.

These patches are based on Linux v4.12-rc2

[1] https://www.spinics.net/lists/linux-pci/msg58650.html

Jean-Philippe Brucker (7):
  PCI: Move ATS declarations outside of CONFIG_PCI
  dt-bindings: PCI: Describe ATS property for root complex nodes
  iommu/of: Check ATS capability in root complex nodes
  ACPI/IORT: Check ATS capability in root complex nodes
  iommu/arm-smmu-v3: Link domains and devices
  iommu/arm-smmu-v3: Add support for PCI ATS
  iommu/arm-smmu-v3: Disable tagged pointers

 .../devicetree/bindings/pci/pci-iommu.txt  |   8 +
 drivers/acpi/arm64/iort.c  |  10 +
 drivers/iommu/arm-smmu-v3.c| 258 -
 drivers/iommu/of_iommu.c   |   8 +
 include/linux/iommu.h  |   4 +
 include/linux/pci.h|  26 +--
 6 files changed, 293 insertions(+), 21 deletions(-)

-- 
2.12.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/amd: constify irq_domain_ops

2017-05-24 Thread Tobias Klauser
struct irq_domain_ops is not modified, so it can be made const.

Signed-off-by: Tobias Klauser 
---
 drivers/iommu/amd_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 63cacf5d6cf2..e8bd130d484a 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -4273,7 +4273,7 @@ static void irq_remapping_deactivate(struct irq_domain 
*domain,
irte_info->index);
 }
 
-static struct irq_domain_ops amd_ir_domain_ops = {
+static const struct irq_domain_ops amd_ir_domain_ops = {
.alloc = irq_remapping_alloc,
.free = irq_remapping_free,
.activate = irq_remapping_activate,
-- 
2.13.0


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/2] staging: fsl-dpaa2/eth: Map Tx buffers as bidirectional

2017-05-24 Thread Ioana Radulescu
WRIOP hardware may need to write to the hardware annotation
area of Tx buffers (e.g. frame status bits) and also to
the data area (e.g. L4 checksum in frame header).

Map these buffers as DMA_BIDIRECTIONAL, otherwise the
write transaction through SMMU will not be allowed.

Signed-off-by: Nipun Gupta 
Signed-off-by: Ioana Radulescu 
---
 drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c 
b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
index 3fee0d6f17e0..49c435bad706 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
@@ -355,7 +355,7 @@ static int build_sg_fd(struct dpaa2_eth_priv *priv,
 
sg_init_table(scl, nr_frags + 1);
num_sg = skb_to_sgvec(skb, scl, 0, skb->len);
-   num_dma_bufs = dma_map_sg(dev, scl, num_sg, DMA_TO_DEVICE);
+   num_dma_bufs = dma_map_sg(dev, scl, num_sg, DMA_BIDIRECTIONAL);
if (unlikely(!num_dma_bufs)) {
err = -ENOMEM;
goto dma_map_sg_failed;
@@ -406,7 +406,7 @@ static int build_sg_fd(struct dpaa2_eth_priv *priv,
swa->num_dma_bufs = num_dma_bufs;
 
/* Separately map the SGT buffer */
-   addr = dma_map_single(dev, sgt_buf, sgt_buf_size, DMA_TO_DEVICE);
+   addr = dma_map_single(dev, sgt_buf, sgt_buf_size, DMA_BIDIRECTIONAL);
if (unlikely(dma_mapping_error(dev, addr))) {
err = -ENOMEM;
goto dma_map_single_failed;
@@ -423,7 +423,7 @@ static int build_sg_fd(struct dpaa2_eth_priv *priv,
 dma_map_single_failed:
kfree(sgt_buf);
 sgt_buf_alloc_failed:
-   dma_unmap_sg(dev, scl, num_sg, DMA_TO_DEVICE);
+   dma_unmap_sg(dev, scl, num_sg, DMA_BIDIRECTIONAL);
 dma_map_sg_failed:
kfree(scl);
return err;
@@ -461,7 +461,7 @@ static int build_single_fd(struct dpaa2_eth_priv *priv,
 
addr = dma_map_single(dev, buffer_start,
  skb_tail_pointer(skb) - buffer_start,
- DMA_TO_DEVICE);
+ DMA_BIDIRECTIONAL);
if (unlikely(dma_mapping_error(dev, addr)))
return -ENOMEM;
 
@@ -510,7 +510,7 @@ static void free_tx_fd(const struct dpaa2_eth_priv *priv,
 */
dma_unmap_single(dev, fd_addr,
 skb_tail_pointer(skb) - buffer_start,
-DMA_TO_DEVICE);
+DMA_BIDIRECTIONAL);
} else if (fd_format == dpaa2_fd_sg) {
swa = (struct dpaa2_eth_swa *)skbh;
skb = swa->skb;
@@ -519,13 +519,13 @@ static void free_tx_fd(const struct dpaa2_eth_priv *priv,
num_dma_bufs = swa->num_dma_bufs;
 
/* Unmap the scatterlist */
-   dma_unmap_sg(dev, scl, num_sg, DMA_TO_DEVICE);
+   dma_unmap_sg(dev, scl, num_sg, DMA_BIDIRECTIONAL);
kfree(scl);
 
/* Unmap the SGT buffer */
unmap_size = priv->tx_data_offset +
   sizeof(struct dpaa2_sg_entry) * (1 + num_dma_bufs);
-   dma_unmap_single(dev, fd_addr, unmap_size, DMA_TO_DEVICE);
+   dma_unmap_single(dev, fd_addr, unmap_size, DMA_BIDIRECTIONAL);
} else {
/* Unsupported format, mark it as errored and give up */
if (status)
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/2] staging: fsl-dpaa2/eth: Fix address translations

2017-05-24 Thread Ioana Radulescu
Use the correct mechanisms for translating a DMA-mapped IOVA
address into a virtual one. Without this fix, once SMMU is
enabled on Layerscape platforms, the Ethernet driver throws
IOMMU translation faults.

Signed-off-by: Nipun Gupta 
Signed-off-by: Ioana Radulescu 
---
 drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 25 +++--
 drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h |  1 +
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c 
b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
index 6f9eed66c64d..3fee0d6f17e0 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "../../fsl-mc/include/mc.h"
 #include "../../fsl-mc/include/mc-sys.h"
@@ -54,6 +55,16 @@ MODULE_DESCRIPTION("Freescale DPAA2 Ethernet Driver");
 
 const char dpaa2_eth_drv_version[] = "0.1";
 
+static void *dpaa2_iova_to_virt(struct iommu_domain *domain,
+   dma_addr_t iova_addr)
+{
+   phys_addr_t phys_addr;
+
+   phys_addr = domain ? iommu_iova_to_phys(domain, iova_addr) : iova_addr;
+
+   return phys_to_virt(phys_addr);
+}
+
 static void validate_rx_csum(struct dpaa2_eth_priv *priv,
 u32 fd_status,
 struct sk_buff *skb)
@@ -98,12 +109,11 @@ static void free_rx_fd(struct dpaa2_eth_priv *priv,
sgt = vaddr + dpaa2_fd_get_offset(fd);
for (i = 0; i < DPAA2_ETH_MAX_SG_ENTRIES; i++) {
addr = dpaa2_sg_get_addr([i]);
+   sg_vaddr = dpaa2_iova_to_virt(priv->iommu_domain, addr);
dma_unmap_single(dev, addr, DPAA2_ETH_RX_BUF_SIZE,
 DMA_FROM_DEVICE);
 
-   sg_vaddr = phys_to_virt(addr);
skb_free_frag(sg_vaddr);
-
if (dpaa2_sg_is_final([i]))
break;
}
@@ -159,10 +169,10 @@ static struct sk_buff *build_frag_skb(struct 
dpaa2_eth_priv *priv,
 
/* Get the address and length from the S/G entry */
sg_addr = dpaa2_sg_get_addr(sge);
+   sg_vaddr = dpaa2_iova_to_virt(priv->iommu_domain, sg_addr);
dma_unmap_single(dev, sg_addr, DPAA2_ETH_RX_BUF_SIZE,
 DMA_FROM_DEVICE);
 
-   sg_vaddr = phys_to_virt(sg_addr);
sg_length = dpaa2_sg_get_len(sge);
 
if (i == 0) {
@@ -222,8 +232,8 @@ static void dpaa2_eth_rx(struct dpaa2_eth_priv *priv,
/* Tracing point */
trace_dpaa2_rx_fd(priv->net_dev, fd);
 
+   vaddr = dpaa2_iova_to_virt(priv->iommu_domain, addr);
dma_unmap_single(dev, addr, DPAA2_ETH_RX_BUF_SIZE, DMA_FROM_DEVICE);
-   vaddr = phys_to_virt(addr);
 
prefetch(vaddr + priv->buf_layout.private_data_size);
prefetch(vaddr + dpaa2_fd_get_offset(fd));
@@ -490,7 +500,7 @@ static void free_tx_fd(const struct dpaa2_eth_priv *priv,
struct dpaa2_fas *fas;
 
fd_addr = dpaa2_fd_get_addr(fd);
-   skbh = phys_to_virt(fd_addr);
+   skbh = dpaa2_iova_to_virt(priv->iommu_domain, fd_addr);
 
if (fd_format == dpaa2_fd_single) {
skb = *skbh;
@@ -802,10 +812,11 @@ static void drain_bufs(struct dpaa2_eth_priv *priv, int 
count)
}
for (i = 0; i < ret; i++) {
/* Same logic as on regular Rx path */
+   vaddr = dpaa2_iova_to_virt(priv->iommu_domain,
+  buf_array[i]);
dma_unmap_single(dev, buf_array[i],
 DPAA2_ETH_RX_BUF_SIZE,
 DMA_FROM_DEVICE);
-   vaddr = phys_to_virt(buf_array[i]);
skb_free_frag(vaddr);
}
} while (ret);
@@ -2358,6 +2369,8 @@ static int dpaa2_eth_probe(struct fsl_mc_device *dpni_dev)
priv = netdev_priv(net_dev);
priv->net_dev = net_dev;
 
+   priv->iommu_domain = iommu_get_domain_for_dev(dev);
+
/* Obtain a MC portal */
err = fsl_mc_portal_allocate(dpni_dev, FSL_MC_IO_ATOMIC_CONTEXT_PORTAL,
 >mc_io);
diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h 
b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
index c67cced55b72..55b47623008c 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
@@ -301,6 +301,7 @@ struct dpaa2_eth_priv {
 
struct fsl_mc_device *dpbp_dev;
struct dpbp_attr dpbp_attrs;
+   struct iommu_domain *iommu_domain;
 
u16 tx_qdid;
struct fsl_mc_io *mc_io;
-- 
2.11.0

___
iommu mailing list

Re: [PATCH 1/2] staging: fsl-dpaa2/eth: Fix address translations

2017-05-24 Thread Laurentiu Tudor
Hi Ioana,

Debatable nit inline.

On 05/24/2017 03:13 PM, Ioana Radulescu wrote:
> Use the correct mechanisms for translating a DMA-mapped IOVA
> address into a virtual one. Without this fix, once SMMU is
> enabled on Layerscape platforms, the Ethernet driver throws
> IOMMU translation faults.
>
> Signed-off-by: Nipun Gupta 
> Signed-off-by: Ioana Radulescu 
> ---
>   drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 25 
> +++--
>   drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h |  1 +
>   2 files changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c 
> b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
> index 6f9eed66c64d..3fee0d6f17e0 100644
> --- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
> +++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
> @@ -37,6 +37,7 @@
>   #include 
>   #include 
>   #include 
> +#include 
>
>   #include "../../fsl-mc/include/mc.h"
>   #include "../../fsl-mc/include/mc-sys.h"
> @@ -54,6 +55,16 @@ MODULE_DESCRIPTION("Freescale DPAA2 Ethernet Driver");
>
>   const char dpaa2_eth_drv_version[] = "0.1";
>
> +static void *dpaa2_iova_to_virt(struct iommu_domain *domain,

if you pass a "struct dpaa2_eth_priv *priv" instead of "iommu_domain" 
you can move the priv->iommu_domain reference in the function and 
slightly simplify the call sites.

---
Best Regards, Laurentiu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops

2017-05-24 Thread Sricharan R
Hi Laurent,

On 5/24/2017 4:56 PM, Laurent Pinchart wrote:
> Hello,
> 
> On Wednesday 24 May 2017 16:01:45 Sricharan R wrote:
>> On 5/24/2017 4:12 AM, Russell King - ARM Linux wrote:
>>> On Wed, May 24, 2017 at 12:46:51AM +0300, Laurent Pinchart wrote:
 On Tuesday 23 May 2017 18:53:19 Russell King - ARM Linux wrote:
> On Tue, May 23, 2017 at 05:55:57PM +0100, Robin Murphy wrote:
>> On 23/05/17 17:25, Russell King - ARM Linux wrote:
>>> So, I've come to apply this patch (since it's finally landed in the
>>> patch system), and I'm not convinced that the commit message is really
>>> up to scratch.
>>>
>>> The current commit message looks like this:
>>>
>>> "   ARM: 8674/1: dma-mapping: Reset the device's dma_ops
>>>
>>> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(),
>>> dma_ops should be cleared in the teardown path. Otherwise this
>>> causes problem when the probe of device is retried after being
>>> deferred. The device's iommu structures are cleared after
>>> EPROBEDEFER error, but on the next try dma_ops will still be set
>>> to old value, which is not right."
>>>
>>> It is obviously a fix, but a fix for which patch?  Looking at the
>>> history, we have "arm: dma-mapping: Don't override dma_ops in
>>> arch_setup_dma_ops()" which I would have guessed is the appropriate
>>> one, but this post-dates your patch (it's very recent, v4.12-rc
>>> recent.)
>>>
>>> So, I need more description about the problem you were seeing when
>>> you first proposed this patch.
>>>
>>> How does leaving the dma_ops in place prior to "arm: dma-mapping:
>>> Don't override dma_ops in arch_setup_dma_ops()" cause problems for
>>> deferred probing?
>>>
>>> What patch is your change trying to fix?  In other words, how far
>>> back does this patch need to be backported?
>>
>> In effect, it's fixing a latent inconsistency that's been present since
>> its introduction in 4bb25789ed28. However, that has clearly not proven
>> to be an issue in practice since then. With 09515ef5ddad we start
>> actually calling arch_teardown_dma_ops() in a manner that might leave
>> things partially initialised if anyone starts removing and reprobing
>> drivers, but so far that's still from code inspection[1] rather than
>> anyone hitting it.
>>
>> Given that the changes which tickle it are fresh in -rc1 I'd say
>> there's no need to backport this, but at the same time it shouldn't do
>> any damage if you really want to.
>
> Well, looking at this, I'm not convinced that much of it is correct.
>
> 1) set_dma_ops() is in arch_setup_dma_ops() but this patch adds
>the unsetting of the DMA ops inside arm_teardown_iommu_dma_ops()
>rather than arch_teardown_dma_ops().
>
>This doesn't strike me as being particularly symmetric.
>arm_teardown_iommu_dma_ops() is arm_setup_iommu_dma_ops()'s
>counterpart.
>
> 2) arch_setup_dma_ops(), the recent patch to add the existing dma_ops
>check, and Xen - Xen wants to override the DMA ops if in the
>initial domain.  It's not clear (at least to me) whether the recent
>patch adding the dma_ops check took account of this or not.
>
> 3) random places seem to fiddle with the dma_ops - notice that
>arm_iommu_detach_device() sets the dma_ops to NULL.
>
>In fact, I think moving __arm_iommu_detach_device() into
>arm_iommu_detach_device(), calling arm_iommu_detach_device(),
>and getting rid of the explicit set_dma_ops(, NULL) in this
>path would be a good first step.
>
> 4) I think arch_setup_dma_ops() is over-complex.
>
> So, in summary, this code is a mess today, and that means it's not
> obviously correct - which is bad.  This needs sorting.

 We've reached the same conclusion independently, but I'll refrain from
 commenting on whether that's a good or bad thing :-)

 I don't think this patch should be applied, as it could break Xen (and
 other platforms/drivers that set the DMA operations manually) by
 resetting DMA operations at device remove() time even if they have been
 set independently of arch_setup_dma_ops().
>>>
>>> That will only occur if the dma ops have been overriden once the DMA
>>> operations have been setup via arch_setup_dma_ops. What saves it from
>>> wholesale NULLing of the DMA operations is the check for a valid
>>> dma_iommu_mapping structure in arm_teardown_iommu_dma_ops().  This only
>>> exists when arm_setup_iommu_dma_ops() has attached a mapping to the
>>> device.
> 
> Unfortunately I don't think that's always the case. The dma_iommu_mapping is 
> also set by direct callers of arm_iommu_attach_device(), namely
> 
> - the Renesas R-Car IOMMU driver (ipmmu-vmsa)
> - the Mediatek IOMMU driver 

Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops

2017-05-24 Thread Laurent Pinchart
Hello,

On Wednesday 24 May 2017 16:01:45 Sricharan R wrote:
> On 5/24/2017 4:12 AM, Russell King - ARM Linux wrote:
> > On Wed, May 24, 2017 at 12:46:51AM +0300, Laurent Pinchart wrote:
> >> On Tuesday 23 May 2017 18:53:19 Russell King - ARM Linux wrote:
> >>> On Tue, May 23, 2017 at 05:55:57PM +0100, Robin Murphy wrote:
>  On 23/05/17 17:25, Russell King - ARM Linux wrote:
> > So, I've come to apply this patch (since it's finally landed in the
> > patch system), and I'm not convinced that the commit message is really
> > up to scratch.
> > 
> > The current commit message looks like this:
> > 
> > "   ARM: 8674/1: dma-mapping: Reset the device's dma_ops
> > 
> > arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(),
> > dma_ops should be cleared in the teardown path. Otherwise this
> > causes problem when the probe of device is retried after being
> > deferred. The device's iommu structures are cleared after
> > EPROBEDEFER error, but on the next try dma_ops will still be set
> > to old value, which is not right."
> > 
> > It is obviously a fix, but a fix for which patch?  Looking at the
> > history, we have "arm: dma-mapping: Don't override dma_ops in
> > arch_setup_dma_ops()" which I would have guessed is the appropriate
> > one, but this post-dates your patch (it's very recent, v4.12-rc
> > recent.)
> > 
> > So, I need more description about the problem you were seeing when
> > you first proposed this patch.
> > 
> > How does leaving the dma_ops in place prior to "arm: dma-mapping:
> > Don't override dma_ops in arch_setup_dma_ops()" cause problems for
> > deferred probing?
> > 
> > What patch is your change trying to fix?  In other words, how far
> > back does this patch need to be backported?
>  
>  In effect, it's fixing a latent inconsistency that's been present since
>  its introduction in 4bb25789ed28. However, that has clearly not proven
>  to be an issue in practice since then. With 09515ef5ddad we start
>  actually calling arch_teardown_dma_ops() in a manner that might leave
>  things partially initialised if anyone starts removing and reprobing
>  drivers, but so far that's still from code inspection[1] rather than
>  anyone hitting it.
>  
>  Given that the changes which tickle it are fresh in -rc1 I'd say
>  there's no need to backport this, but at the same time it shouldn't do
>  any damage if you really want to.
> >>> 
> >>> Well, looking at this, I'm not convinced that much of it is correct.
> >>> 
> >>> 1) set_dma_ops() is in arch_setup_dma_ops() but this patch adds
> >>>the unsetting of the DMA ops inside arm_teardown_iommu_dma_ops()
> >>>rather than arch_teardown_dma_ops().
> >>>
> >>>This doesn't strike me as being particularly symmetric.
> >>>arm_teardown_iommu_dma_ops() is arm_setup_iommu_dma_ops()'s
> >>>counterpart.
> >>> 
> >>> 2) arch_setup_dma_ops(), the recent patch to add the existing dma_ops
> >>>check, and Xen - Xen wants to override the DMA ops if in the
> >>>initial domain.  It's not clear (at least to me) whether the recent
> >>>patch adding the dma_ops check took account of this or not.
> >>> 
> >>> 3) random places seem to fiddle with the dma_ops - notice that
> >>>arm_iommu_detach_device() sets the dma_ops to NULL.
> >>>
> >>>In fact, I think moving __arm_iommu_detach_device() into
> >>>arm_iommu_detach_device(), calling arm_iommu_detach_device(),
> >>>and getting rid of the explicit set_dma_ops(, NULL) in this
> >>>path would be a good first step.
> >>> 
> >>> 4) I think arch_setup_dma_ops() is over-complex.
> >>> 
> >>> So, in summary, this code is a mess today, and that means it's not
> >>> obviously correct - which is bad.  This needs sorting.
> >> 
> >> We've reached the same conclusion independently, but I'll refrain from
> >> commenting on whether that's a good or bad thing :-)
> >> 
> >> I don't think this patch should be applied, as it could break Xen (and
> >> other platforms/drivers that set the DMA operations manually) by
> >> resetting DMA operations at device remove() time even if they have been
> >> set independently of arch_setup_dma_ops().
> > 
> > That will only occur if the dma ops have been overriden once the DMA
> > operations have been setup via arch_setup_dma_ops. What saves it from
> > wholesale NULLing of the DMA operations is the check for a valid
> > dma_iommu_mapping structure in arm_teardown_iommu_dma_ops().  This only
> > exists when arm_setup_iommu_dma_ops() has attached a mapping to the
> > device.

Unfortunately I don't think that's always the case. The dma_iommu_mapping is 
also set by direct callers of arm_iommu_attach_device(), namely

- the Renesas R-Car IOMMU driver (ipmmu-vmsa)
- the Mediatek IOMMU driver (mtk-iommu-v1)
- the Exynos DRM driver
- the OMAP3 ISP driver

Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops

2017-05-24 Thread Sricharan R
Hi Russell/Laurent/Robin,

On 5/24/2017 4:12 AM, Russell King - ARM Linux wrote:
> On Wed, May 24, 2017 at 12:46:51AM +0300, Laurent Pinchart wrote:
>> Hi Russell,
>>
>> On Tuesday 23 May 2017 18:53:19 Russell King - ARM Linux wrote:
>>> On Tue, May 23, 2017 at 05:55:57PM +0100, Robin Murphy wrote:
 On 23/05/17 17:25, Russell King - ARM Linux wrote:
> So, I've come to apply this patch (since it's finally landed in the
> patch system), and I'm not convinced that the commit message is really
> up to scratch.
>
> The current commit message looks like this:
>
> "   ARM: 8674/1: dma-mapping: Reset the device's dma_ops
> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(),
> dma_ops should be cleared in the teardown path. Otherwise this
> causes problem when the probe of device is retried after being
> deferred. The device's iommu structures are cleared after
> EPROBEDEFER error, but on the next try dma_ops will still be set to
> old value, which is not right."
>
> It is obviously a fix, but a fix for which patch?  Looking at the
> history, we have "arm: dma-mapping: Don't override dma_ops in
> arch_setup_dma_ops()" which I would have guessed is the appropriate
> one, but this post-dates your patch (it's very recent, v4.12-rc
> recent.)
>
> So, I need more description about the problem you were seeing when
> you first proposed this patch.
>
> How does leaving the dma_ops in place prior to "arm: dma-mapping:
> Don't override dma_ops in arch_setup_dma_ops()" cause problems for
> deferred probing?
>
> What patch is your change trying to fix?  In other words, how far
> back does this patch need to be backported?

 In effect, it's fixing a latent inconsistency that's been present since
 its introduction in 4bb25789ed28. However, that has clearly not proven
 to be an issue in practice since then. With 09515ef5ddad we start
 actually calling arch_teardown_dma_ops() in a manner that might leave
 things partially initialised if anyone starts removing and reprobing
 drivers, but so far that's still from code inspection[1] rather than
 anyone hitting it.

 Given that the changes which tickle it are fresh in -rc1 I'd say there's
 no need to backport this, but at the same time it shouldn't do any
 damage if you really want to.
>>>
>>> Well, looking at this, I'm not convinced that much of it is correct.
>>>
>>> 1) set_dma_ops() is in arch_setup_dma_ops() but this patch adds
>>>the unsetting of the DMA ops inside arm_teardown_iommu_dma_ops()
>>>rather than arch_teardown_dma_ops().
>>>
>>>This doesn't strike me as being particularly symmetric.
>>>arm_teardown_iommu_dma_ops() is arm_setup_iommu_dma_ops()'s
>>>counterpart.
>>>
>>> 2) arch_setup_dma_ops(), the recent patch to add the existing dma_ops
>>>check, and Xen - Xen wants to override the DMA ops if in the
>>>initial domain.  It's not clear (at least to me) whether the recent
>>>patch adding the dma_ops check took account of this or not.
>>>
>>> 3) random places seem to fiddle with the dma_ops - notice that
>>>arm_iommu_detach_device() sets the dma_ops to NULL.
>>>
>>>In fact, I think moving __arm_iommu_detach_device() into
>>>arm_iommu_detach_device(), calling arm_iommu_detach_device(),
>>>and getting rid of the explicit set_dma_ops(, NULL) in this
>>>path would be a good first step.
>>>
>>> 4) I think arch_setup_dma_ops() is over-complex.
>>>
>>> So, in summary, this code is a mess today, and that means it's not
>>> obviously correct - which is bad.  This needs sorting.
>>
>> We've reached the same conclusion independently, but I'll refrain from 
>> commenting on whether that's a good or bad thing :-)
>>
>> I don't think this patch should be applied, as it could break Xen (and other 
>> platforms/drivers that set the DMA operations manually) by resetting DMA 
>> operations at device remove() time even if they have been set independently 
>> of 
>> arch_setup_dma_ops().
> 
> That will only occur if the dma ops have been overriden once the DMA
> operations have been setup via arch_setup_dma_ops.  What saves it from
> wholesale NULLing of the DMA operations is the check for a valid
> dma_iommu_mapping structure in arm_teardown_iommu_dma_ops().  This only
> exists when arm_setup_iommu_dma_ops() has attached a mapping to the
> device.
> 

Right, only if the dma ops are set and no dma_iommu_mapping is created for
the device, then arch_teardown_iommu_dma_ops does nothing.

Firstly, this patch for resetting the dma_ops in teardown was required
only when arch_setup_dma_ops has created both the mapping and dma_ops
for the device. Because mappings that were created in arch_setup_dma_ops
are cleared in teardown, so dma ops should also be reset. But this can be 
done by calling arm_iommu_detach_device() from