Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-09 Thread Lu Baolu

Hi Joerg,

On 12/7/18 6:29 PM, 'j...@8bytes.org' wrote:

> Hi,
> 
> On Mon, Nov 26, 2018 at 07:29:45AM +, Tian, Kevin wrote:

>> btw Baolu just reminded me of one thing which is worth noting here:
>> the 'primary' vs. 'aux' concept makes sense only when we look from a
>> device p.o.v. That binding relationship is not (*should not be*)
>> carried and forwarded across devices. Every domain must be explicitly
>> attached to other devices (instead of implicitly attached as in the
>> above example), and the new primary/aux attribute on another device
>> will be decided at attach time.


> Okay, so after all the discussions we had I learned a few more things
> about the scalable mode feature and thought a bit longer about how to
> best support it in the IOMMU-API.


Thanks for thinking about this.



> The concept of sub-domains I initially proposed certainly makes no
> sense, but scalable-mode specific attach/detach functions do. So instead
> of a sub-domain mode, I'd like to propose device-feature sets.
> 
> The posted patch-set already includes this as device-attributes, but I
> don't like this naming as we are really talking about additional
> feature sets of a device. So how about we introduce this:
> 
> 	enum iommu_dev_features {
> 		/* ... */
> 		IOMMU_DEV_FEAT_AUX,
> 		IOMMU_DEV_FEAT_SVA,
> 		/* ... */
> 	};
> 
> 	/* Check if a device supports a given feature of the IOMMU-API */
> 	bool iommu_dev_has_feature(struct device *dev,
> 				   enum iommu_dev_features *feat);


Here we pass in a pointer to "enum iommu_dev_features"; do we want
to return anything through it?

Best regards,
Lu Baolu


RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-09 Thread Tian, Kevin
> From: 'j...@8bytes.org' [mailto:j...@8bytes.org]
> Sent: Friday, December 7, 2018 6:29 PM
> 
> Hi,
> 
> On Mon, Nov 26, 2018 at 07:29:45AM +, Tian, Kevin wrote:
> > btw Baolu just reminded me of one thing which is worth noting here:
> > the 'primary' vs. 'aux' concept makes sense only when we look from a
> > device p.o.v. That binding relationship is not (*should not be*)
> > carried and forwarded across devices. Every domain must be explicitly
> > attached to other devices (instead of implicitly attached as in the
> > above example), and the new primary/aux attribute on another device
> > will be decided at attach time.
> 
> Okay, so after all the discussions we had I learned a few more things
> about the scalable mode feature and thought a bit longer about how to
> best support it in the IOMMU-API.

Thanks for thinking through this.

> 
> The concept of sub-domains I initially proposed certainly makes no
> sense, but scalable-mode specific attach/detach functions do. So instead
> of a sub-domain mode, I'd like to propose device-feature sets.

Can I interpret the above as meaning that you agree with the aux domain
concept (i.e. one device can be linked to multiple domains) in general,
and that now we're just trying to address the remaining opens at the
API level?

> 
> The posted patch-set already includes this as device-attributes, but I
> don't like this naming as we are really talking about additional
> feature sets of a device. So how about we introduce this:
> 
> 	enum iommu_dev_features {
> 		/* ... */
> 		IOMMU_DEV_FEAT_AUX,
> 		IOMMU_DEV_FEAT_SVA,
> 		/* ... */
> 	};
> 

Does the above represent whether a device implements the aux/sva
features, or whether the device has been enabled by its driver to
support them?

> 	/* Check if a device supports a given feature of the IOMMU-API */
> 	bool iommu_dev_has_feature(struct device *dev,
> 				   enum iommu_dev_features *feat);

If the latter, we also need an iommu_dev_set_feature() so the driver
can set it based on its own configuration.

> 
> 	/*
> 	 * Only works if iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
> 	 * returns true.
> 	 *
> 	 * Also, as long as domains are attached to a device through
> 	 * this interface, any attempt to call iommu_attach_device()
> 	 * should fail (iommu_detach_device() can't fail, so we fail
> 	 * on the attempt to re-attach). This should make us safe
> 	 * against a device being attached to a guest as a whole while
> 	 * there are still pasid users on it (aux and sva).

yes, it makes sense.

> 	 */
> 	int iommu_aux_attach_device(struct iommu_domain *domain,
> 				    struct device *dev);
> 
> 	int iommu_aux_detach_device(struct iommu_domain *domain,
> 				    struct device *dev);
> 	/*
> 	 * I know we are targeting a system-wide pasid-space, so that
> 	 * the pasid would be the same for one domain on all devices,
> 	 * but let's keep the option open to have different
> 	 * pasid-spaces in one system. Also this way we can use it to
> 	 * check whether the domain is attached to this device at all.
> 	 *
> 	 * Returns the pasid, or <0 if the domain has no pasid on that
> 	 * device.
> 	 */
> 	int iommu_aux_get_pasid(struct iommu_domain *domain,
> 				struct device *dev);
> 
> 	/* So, do we need an iommu_aux_detach_all()? */

for what scenario?

> 
> This concept can also be easily extended for supporting SVA in parallel
> on the same device, with the same constraints regarding the behavior of
> iommu_attach_device()/iommu_detach_device().
> 
> So what do you think about that approach?
> 
> Regards,
> 
>   Joerg
> 
> 
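For concreteness, a minimal sketch of how a driver might consume the
proposed interface (hypothetical usage only; the signatures follow the
proposal above, except that the feature is passed by value rather than
by pointer, which the earlier question about the pointer parameter
suggests was the intent):

	/* Hypothetical driver-side usage of the proposed aux-domain API. */
	static int example_bind_adi(struct device *dev,
				    struct iommu_domain *domain)
	{
		int pasid, ret;

		if (!iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX))
			return -ENODEV;

		/* Attach the domain as an auxiliary domain of this device... */
		ret = iommu_aux_attach_device(domain, dev);
		if (ret)
			return ret;

		/* ...and look up the pasid that represents it on this device. */
		pasid = iommu_aux_get_pasid(domain, dev);
		if (pasid < 0) {
			iommu_aux_detach_device(domain, dev);
			return pasid;
		}

		/* Program the device to tag DMA from this ADI with the pasid. */
		return 0;
	}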


[PATCH v6 11/12] iommu/vt-d: Shared virtual address in scalable mode

2018-12-09 Thread Lu Baolu
This patch enables the current SVA (Shared Virtual Address)
implementation to work in scalable mode.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Sanjay Kumar 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
Reviewed-by: Ashok Raj 
---
 drivers/iommu/intel-iommu.c | 38 
 drivers/iommu/intel-pasid.c |  2 +-
 drivers/iommu/intel-pasid.h |  1 -
 drivers/iommu/intel-svm.c   | 58 -
 include/linux/intel-iommu.h |  9 +-
 5 files changed, 20 insertions(+), 88 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 21e5f1b9a2ca..4552166c553b 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5257,18 +5257,6 @@ static void intel_iommu_put_resv_regions(struct device *dev,
 }
 
 #ifdef CONFIG_INTEL_IOMMU_SVM
-static inline unsigned long intel_iommu_get_pts(struct device *dev)
-{
-   int pts, max_pasid;
-
-   max_pasid = intel_pasid_get_dev_max_id(dev);
-	pts = find_first_bit((unsigned long *)&max_pasid, MAX_NR_PASID_BITS);
-   if (pts < 5)
-   return 0;
-
-   return pts - 5;
-}
-
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev)
 {
struct device_domain_info *info;
@@ -5300,33 +5288,7 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
sdev->sid = PCI_DEVID(info->bus, info->devfn);
 
if (!(ctx_lo & CONTEXT_PASIDE)) {
-   if (iommu->pasid_state_table)
-			context[1].hi = (u64)virt_to_phys(iommu->pasid_state_table);
-   context[1].lo = (u64)virt_to_phys(info->pasid_table->table) |
-   intel_iommu_get_pts(sdev->dev);
-
-   wmb();
-   /* CONTEXT_TT_MULTI_LEVEL and CONTEXT_TT_DEV_IOTLB are both
-* extended to permit requests-with-PASID if the PASIDE bit
-* is set. which makes sense. For CONTEXT_TT_PASS_THROUGH,
-* however, the PASIDE bit is ignored and requests-with-PASID
-* are unconditionally blocked. Which makes less sense.
-* So convert from CONTEXT_TT_PASS_THROUGH to one of the new
-* "guest mode" translation types depending on whether ATS
-* is available or not. Annoyingly, we can't use the new
-* modes *unless* PASIDE is set. */
-		if ((ctx_lo & CONTEXT_TT_MASK) == (CONTEXT_TT_PASS_THROUGH << 2)) {
-   ctx_lo &= ~CONTEXT_TT_MASK;
-   if (info->ats_supported)
-   ctx_lo |= CONTEXT_TT_PT_PASID_DEV_IOTLB << 2;
-   else
-   ctx_lo |= CONTEXT_TT_PT_PASID << 2;
-   }
ctx_lo |= CONTEXT_PASIDE;
-   if (iommu->pasid_state_table)
-   ctx_lo |= CONTEXT_DINVE;
-   if (info->pri_supported)
-   ctx_lo |= CONTEXT_PRS;
context[0].lo = ctx_lo;
wmb();
iommu->flush.flush_context(iommu, sdev->did, sdev->sid,
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index c3dcf4dc2496..53fe5248d8f1 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -286,7 +286,7 @@ static inline void pasid_clear_entry(struct pasid_entry *pe)
WRITE_ONCE(pe->val[7], 0);
 }
 
-void intel_pasid_clear_entry(struct device *dev, int pasid)
+static void intel_pasid_clear_entry(struct device *dev, int pasid)
 {
struct pasid_entry *pe;
 
diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
index 512c63ec8a22..23537b3f34e3 100644
--- a/drivers/iommu/intel-pasid.h
+++ b/drivers/iommu/intel-pasid.h
@@ -58,7 +58,6 @@ void intel_pasid_free_table(struct device *dev);
 struct pasid_table *intel_pasid_get_table(struct device *dev);
 int intel_pasid_get_dev_max_id(struct device *dev);
 struct pasid_entry *intel_pasid_get_entry(struct device *dev, int pasid);
-void intel_pasid_clear_entry(struct device *dev, int pasid);
 int intel_pasid_setup_first_level(struct intel_iommu *iommu,
  struct device *dev, pgd_t *pgd,
  int pasid, u16 did, int flags);
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 9b6771a89207..935712ebda79 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -29,10 +29,6 @@
 
 #include "intel-pasid.h"
 
-#define PASID_ENTRY_P  BIT_ULL(0)
-#define PASID_ENTRY_FLPM_5LP   BIT_ULL(9)
-#define PASID_ENTRY_SREBIT_ULL(11)
-
 static irqreturn_t prq_event_thread(int irq, void *d);
 
 struct pasid_state_entry {
@@ -248,20 +244,6 @@ static void intel_invalidate_range(struct mmu_notifier *mn,
			(end - start + PAGE_SIZE - 1) >> VTD_PAGE_SHIFT, 0, 0);
 }
 
-
-static void 

[PATCH v6 09/12] iommu/vt-d: Setup context and enable RID2PASID support

2018-12-09 Thread Lu Baolu
This patch enables translation for requests without PASID in
scalable mode by setting up the root and context entries.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Sanjay Kumar 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
Reviewed-by: Ashok Raj 
---
 drivers/iommu/intel-iommu.c | 136 
 drivers/iommu/intel-pasid.h |   1 +
 include/linux/intel-iommu.h |   1 +
 3 files changed, 108 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 55c4ffda1246..21e5f1b9a2ca 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1197,6 +1197,8 @@ static void iommu_set_root_entry(struct intel_iommu *iommu)
unsigned long flag;
 
addr = virt_to_phys(iommu->root_entry);
+   if (sm_supported(iommu))
+   addr |= DMA_RTADDR_SMT;
 
	raw_spin_lock_irqsave(&iommu->register_lock, flag);
dmar_writeq(iommu->reg + DMAR_RTADDR_REG, addr);
@@ -1918,6 +1920,56 @@ static void domain_exit(struct dmar_domain *domain)
free_domain_mem(domain);
 }
 
+/*
+ * Get the PASID directory size for scalable mode context entry.
+ * Value of X in the PDTS field of a scalable mode context entry
+ * indicates PASID directory with 2^(X + 7) entries.
+ */
+static inline unsigned long context_get_sm_pds(struct pasid_table *table)
+{
+   int pds, max_pde;
+
+   max_pde = table->max_pasid >> PASID_PDE_SHIFT;
+	pds = find_first_bit((unsigned long *)&max_pde, MAX_NR_PASID_BITS);
+   if (pds < 7)
+   return 0;
+
+   return pds - 7;
+}
+
+/*
+ * Set the RID_PASID field of a scalable mode context entry. The
+ * IOMMU hardware will use the PASID value set in this field for
+ * DMA translations of DMA requests without PASID.
+ */
+static inline void
+context_set_sm_rid2pasid(struct context_entry *context, unsigned long pasid)
+{
+   context->hi |= pasid & ((1 << 20) - 1);
+   context->hi |= (1 << 20);
+}
+
+/*
+ * Set the DTE(Device-TLB Enable) field of a scalable mode context
+ * entry.
+ */
+static inline void context_set_sm_dte(struct context_entry *context)
+{
+   context->lo |= (1 << 2);
+}
+
+/*
+ * Set the PRE(Page Request Enable) field of a scalable mode context
+ * entry.
+ */
+static inline void context_set_sm_pre(struct context_entry *context)
+{
+   context->lo |= (1 << 4);
+}
+
+/* Convert value to context PASID directory size field coding. */
+#define context_pdts(pds)  (((pds) & 0x7) << 9)
+
 static int domain_context_mapping_one(struct dmar_domain *domain,
  struct intel_iommu *iommu,
  struct pasid_table *table,
@@ -1928,8 +1980,7 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
struct device_domain_info *info = NULL;
struct context_entry *context;
unsigned long flags;
-   struct dma_pte *pgd;
-   int ret, agaw;
+   int ret;
 
WARN_ON(did == 0);
 
@@ -1975,41 +2026,67 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
}
}
 
-   pgd = domain->pgd;
-
context_clear_entry(context);
-   context_set_domain_id(context, did);
 
-   /*
-* Skip top levels of page tables for iommu which has less agaw
-* than default.  Unnecessary for PT mode.
-*/
-   if (translation != CONTEXT_TT_PASS_THROUGH) {
-   for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
-   ret = -ENOMEM;
-   pgd = phys_to_virt(dma_pte_addr(pgd));
-   if (!dma_pte_present(pgd))
-   goto out_unlock;
-   }
+   if (sm_supported(iommu)) {
+   unsigned long pds;
 
-   info = iommu_support_dev_iotlb(domain, iommu, bus, devfn);
-   if (info && info->ats_supported)
-   translation = CONTEXT_TT_DEV_IOTLB;
-   else
-   translation = CONTEXT_TT_MULTI_LEVEL;
+   WARN_ON(!table);
+
+   /* Setup the PASID DIR pointer: */
+   pds = context_get_sm_pds(table);
+   context->lo = (u64)virt_to_phys(table->table) |
+   context_pdts(pds);
+
+   /* Setup the RID_PASID field: */
+   context_set_sm_rid2pasid(context, PASID_RID2PASID);
 
-   context_set_address_root(context, virt_to_phys(pgd));
-   context_set_address_width(context, agaw);
-   } else {
/*
-* In pass through mode, AW must be programmed to
-* indicate the largest AGAW value supported by
-* hardware. And ASR is ignored by hardware.
+* Setup the Device-TLB enable bit and Page request
+* Enable bit:
 */
-   context_set_address_width(context, 

[PATCH v6 12/12] iommu/vt-d: Remove deferred invalidation

2018-12-09 Thread Lu Baolu
Deferred invalidation is an ECS-specific feature. It will not be
supported when the IOMMU works in scalable mode. As we have
deprecated ECS support, remove deferred invalidation and clean up
the code.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Cc: Liu Yi L 
Cc: Sanjay Kumar 
Signed-off-by: Lu Baolu 
Reviewed-by: Ashok Raj 
---
 drivers/iommu/intel-iommu.c |  1 -
 drivers/iommu/intel-svm.c   | 45 -
 include/linux/intel-iommu.h |  8 ---
 3 files changed, 54 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 4552166c553b..eb5351e8cde5 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1700,7 +1700,6 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
if (pasid_supported(iommu)) {
if (ecap_prs(iommu->ecap))
intel_svm_finish_prq(iommu);
-   intel_svm_exit(iommu);
}
 #endif
 }
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 935712ebda79..a2a2aa4439aa 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -31,15 +31,8 @@
 
 static irqreturn_t prq_event_thread(int irq, void *d);
 
-struct pasid_state_entry {
-   u64 val;
-};
-
 int intel_svm_init(struct intel_iommu *iommu)
 {
-   struct page *pages;
-   int order;
-
if (cpu_feature_enabled(X86_FEATURE_GBPAGES) &&
!cap_fl1gp_support(iommu->cap))
return -EINVAL;
@@ -48,39 +41,6 @@ int intel_svm_init(struct intel_iommu *iommu)
!cap_5lp_support(iommu->cap))
return -EINVAL;
 
-   /* Start at 2 because it's defined as 2^(1+PSS) */
-   iommu->pasid_max = 2 << ecap_pss(iommu->ecap);
-
-   /* Eventually I'm promised we will get a multi-level PASID table
-* and it won't have to be physically contiguous. Until then,
-* limit the size because 8MiB contiguous allocations can be hard
-* to come by. The limit of 0x2, which is 1MiB for each of
-* the PASID and PASID-state tables, is somewhat arbitrary. */
-   if (iommu->pasid_max > 0x2)
-   iommu->pasid_max = 0x2;
-
-   order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
-   if (ecap_dis(iommu->ecap)) {
-   pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
-   if (pages)
-   iommu->pasid_state_table = page_address(pages);
-   else
-			pr_warn("IOMMU: %s: Failed to allocate PASID state table\n",
-				iommu->name);
-   }
-
-   return 0;
-}
-
-int intel_svm_exit(struct intel_iommu *iommu)
-{
-   int order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
-
-   if (iommu->pasid_state_table) {
-   free_pages((unsigned long)iommu->pasid_state_table, order);
-   iommu->pasid_state_table = NULL;
-   }
-
return 0;
 }
 
@@ -214,11 +174,6 @@ static void intel_flush_svm_range(struct intel_svm *svm, unsigned long address,
 {
struct intel_svm_dev *sdev;
 
-   /* Try deferred invalidate if available */
-	if (svm->iommu->pasid_state_table &&
-	    !cmpxchg64(&svm->iommu->pasid_state_table[svm->pasid].val, 0, 1ULL << 63))
-		return;
-
rcu_read_lock();
	list_for_each_entry_rcu(sdev, &svm->devs, list)
intel_flush_svm_range_dev(svm, sdev, address, pages, ih, gl);
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index cfcf9c1e1872..0605f3bf6e79 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -541,15 +541,8 @@ struct intel_iommu {
struct iommu_flush flush;
 #endif
 #ifdef CONFIG_INTEL_IOMMU_SVM
-   /* These are large and need to be contiguous, so we allocate just
-* one for now. We'll maybe want to rethink that if we truly give
-* devices away to userspace processes (e.g. for DPDK) and don't
-* want to trust that userspace will use *only* the PASID it was
-* told to. But while it's all driver-arbitrated, we're fine. */
-   struct pasid_state_entry *pasid_state_table;
struct page_req_dsc *prq;
unsigned char prq_name[16];/* Name for PRQ interrupt */
-   u32 pasid_max;
 #endif
struct q_inval  *qi;/* Queued invalidation info */
u32 *iommu_state; /* Store iommu states between suspend and resume.*/
@@ -663,7 +656,6 @@ void iommu_flush_write_buffer(struct intel_iommu *iommu);
 
 #ifdef CONFIG_INTEL_IOMMU_SVM
 int intel_svm_init(struct intel_iommu *iommu);
-int intel_svm_exit(struct intel_iommu *iommu);
 extern int intel_svm_enable_prq(struct intel_iommu *iommu);
 extern int intel_svm_finish_prq(struct intel_iommu *iommu);
 
-- 
2.17.1


[PATCH v6 10/12] iommu/vt-d: Add first level page table interface

2018-12-09 Thread Lu Baolu
This adds an interface to set up the PASID entries for first
level page table translation.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Sanjay Kumar 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
Reviewed-by: Ashok Raj 
---
 drivers/iommu/intel-pasid.c | 80 +
 drivers/iommu/intel-pasid.h | 11 +
 include/linux/intel-iommu.h |  1 +
 3 files changed, 92 insertions(+)

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 6d2b2e87e6fc..c3dcf4dc2496 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -10,6 +10,7 @@
 #define pr_fmt(fmt)"DMAR: " fmt
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -389,6 +390,26 @@ static inline void pasid_set_page_snoop(struct pasid_entry *pe, bool value)
	pasid_set_bits(&pe->val[1], 1 << 23, value);
 }
 
+/*
+ * Setup the First Level Page table Pointer field (Bit 140~191)
+ * of a scalable mode PASID entry.
+ */
+static inline void
+pasid_set_flptr(struct pasid_entry *pe, u64 value)
+{
+	pasid_set_bits(&pe->val[2], VTD_PAGE_MASK, value);
+}
+
+/*
+ * Setup the First Level Paging Mode field (Bit 130~131) of a
+ * scalable mode PASID entry.
+ */
+static inline void
+pasid_set_flpm(struct pasid_entry *pe, u64 value)
+{
+	pasid_set_bits(&pe->val[2], GENMASK_ULL(3, 2), value << 2);
+}
+
 static void
 pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu,
u16 did, int pasid)
@@ -459,6 +480,65 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
devtlb_invalidation_with_pasid(iommu, dev, pasid);
 }
 
+/*
+ * Set up the scalable mode pasid table entry for first only
+ * translation type.
+ */
+int intel_pasid_setup_first_level(struct intel_iommu *iommu,
+ struct device *dev, pgd_t *pgd,
+ int pasid, u16 did, int flags)
+{
+   struct pasid_entry *pte;
+
+   if (!ecap_flts(iommu->ecap)) {
+   pr_err("No first level translation support on %s\n",
+  iommu->name);
+   return -EINVAL;
+   }
+
+   pte = intel_pasid_get_entry(dev, pasid);
+   if (WARN_ON(!pte))
+   return -EINVAL;
+
+   pasid_clear_entry(pte);
+
+   /* Setup the first level page table pointer: */
+   pasid_set_flptr(pte, (u64)__pa(pgd));
+   if (flags & PASID_FLAG_SUPERVISOR_MODE) {
+   if (!ecap_srs(iommu->ecap)) {
+   pr_err("No supervisor request support on %s\n",
+  iommu->name);
+   return -EINVAL;
+   }
+   pasid_set_sre(pte);
+   }
+
+#ifdef CONFIG_X86
+   if (cpu_feature_enabled(X86_FEATURE_LA57))
+   pasid_set_flpm(pte, 1);
+#endif /* CONFIG_X86 */
+
+   pasid_set_domain_id(pte, did);
+   pasid_set_address_width(pte, iommu->agaw);
+   pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
+
+   /* Setup Present and PASID Granular Transfer Type: */
+   pasid_set_translation_type(pte, 1);
+   pasid_set_present(pte);
+
+   if (!ecap_coherent(iommu->ecap))
+   clflush_cache_range(pte, sizeof(*pte));
+
+   if (cap_caching_mode(iommu->cap)) {
+   pasid_cache_invalidation_with_pasid(iommu, did, pasid);
+   iotlb_invalidation_with_pasid(iommu, did, pasid);
+   } else {
+   iommu_flush_write_buffer(iommu);
+   }
+
+   return 0;
+}
+
 /*
  * Set up the scalable mode pasid entry for second only translation type.
  */
diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
index 55bb8715329d..512c63ec8a22 100644
--- a/drivers/iommu/intel-pasid.h
+++ b/drivers/iommu/intel-pasid.h
@@ -25,6 +25,14 @@
  */
 #define FLPT_DEFAULT_DID   1
 
+/*
+ * The SUPERVISOR_MODE flag indicates a first level translation which
+ * can be used for access to kernel addresses. It is valid only for
+ * access to the kernel's static 1:1 mapping of physical memory — not
+ * to vmalloc or even module mappings.
+ */
+#define PASID_FLAG_SUPERVISOR_MODE BIT(0)
+
 struct pasid_dir_entry {
u64 val;
 };
@@ -51,6 +59,9 @@ struct pasid_table *intel_pasid_get_table(struct device *dev);
 int intel_pasid_get_dev_max_id(struct device *dev);
 struct pasid_entry *intel_pasid_get_entry(struct device *dev, int pasid);
 void intel_pasid_clear_entry(struct device *dev, int pasid);
+int intel_pasid_setup_first_level(struct intel_iommu *iommu,
+ struct device *dev, pgd_t *pgd,
+ int pasid, u16 did, int flags);
 int intel_pasid_setup_second_level(struct intel_iommu *iommu,
   struct dmar_domain *domain,
   struct device *dev, int pasid);
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 5fdd33ed2cce..4ad62396e81e 100644
--- 

[PATCH v6 06/12] iommu/vt-d: Add second level page table interface

2018-12-09 Thread Lu Baolu
This adds the interfaces to set up or tear down the structures
for second level page table translations. This covers the second
level only translation type and the pass-through type.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Sanjay Kumar 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
Reviewed-by: Ashok Raj 
---
 drivers/iommu/intel-iommu.c |   2 +-
 drivers/iommu/intel-pasid.c | 280 
 drivers/iommu/intel-pasid.h |   8 ++
 include/linux/intel-iommu.h |   3 +
 4 files changed, 292 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 5e924bc8ebec..e741238e2326 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1210,7 +1210,7 @@ static void iommu_set_root_entry(struct intel_iommu *iommu)
	raw_spin_unlock_irqrestore(&iommu->register_lock, flag);
 }
 
-static void iommu_flush_write_buffer(struct intel_iommu *iommu)
+void iommu_flush_write_buffer(struct intel_iommu *iommu)
 {
u32 val;
unsigned long flag;
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index fd3ccc0753b0..6d2b2e87e6fc 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -9,6 +9,7 @@
 
 #define pr_fmt(fmt)"DMAR: " fmt
 
+#include 
 #include 
 #include 
 #include 
@@ -294,3 +295,282 @@ void intel_pasid_clear_entry(struct device *dev, int pasid)
 
pasid_clear_entry(pe);
 }
+
+static inline void pasid_set_bits(u64 *ptr, u64 mask, u64 bits)
+{
+   u64 old;
+
+   old = READ_ONCE(*ptr);
+   WRITE_ONCE(*ptr, (old & ~mask) | bits);
+}
+
+/*
+ * Setup the DID(Domain Identifier) field (Bit 64~79) of scalable mode
+ * PASID entry.
+ */
+static inline void
+pasid_set_domain_id(struct pasid_entry *pe, u64 value)
+{
+	pasid_set_bits(&pe->val[1], GENMASK_ULL(15, 0), value);
+}
+
+/*
+ * Get domain ID value of a scalable mode PASID entry.
+ */
+static inline u16
+pasid_get_domain_id(struct pasid_entry *pe)
+{
+   return (u16)(READ_ONCE(pe->val[1]) & GENMASK_ULL(15, 0));
+}
+
+/*
+ * Setup the SLPTPTR(Second Level Page Table Pointer) field (Bit 12~63)
+ * of a scalable mode PASID entry.
+ */
+static inline void
+pasid_set_slptr(struct pasid_entry *pe, u64 value)
+{
+	pasid_set_bits(&pe->val[0], VTD_PAGE_MASK, value);
+}
+
+/*
+ * Setup the AW(Address Width) field (Bit 2~4) of a scalable mode PASID
+ * entry.
+ */
+static inline void
+pasid_set_address_width(struct pasid_entry *pe, u64 value)
+{
+	pasid_set_bits(&pe->val[0], GENMASK_ULL(4, 2), value << 2);
+}
+
+/*
+ * Setup the PGTT(PASID Granular Translation Type) field (Bit 6~8)
+ * of a scalable mode PASID entry.
+ */
+static inline void
+pasid_set_translation_type(struct pasid_entry *pe, u64 value)
+{
+	pasid_set_bits(&pe->val[0], GENMASK_ULL(8, 6), value << 6);
+}
+
+/*
+ * Enable fault processing by clearing the FPD(Fault Processing
+ * Disable) field (Bit 1) of a scalable mode PASID entry.
+ */
+static inline void pasid_set_fault_enable(struct pasid_entry *pe)
+{
+	pasid_set_bits(&pe->val[0], 1 << 1, 0);
+}
+
+/*
+ * Setup the SRE(Supervisor Request Enable) field (Bit 128) of a
+ * scalable mode PASID entry.
+ */
+static inline void pasid_set_sre(struct pasid_entry *pe)
+{
+	pasid_set_bits(&pe->val[2], 1 << 0, 1);
+}
+
+/*
+ * Setup the P(Present) field (Bit 0) of a scalable mode PASID
+ * entry.
+ */
+static inline void pasid_set_present(struct pasid_entry *pe)
+{
+	pasid_set_bits(&pe->val[0], 1 << 0, 1);
+}
+
+/*
+ * Setup Page Walk Snoop bit (Bit 87) of a scalable mode PASID
+ * entry.
+ */
+static inline void pasid_set_page_snoop(struct pasid_entry *pe, bool value)
+{
+	pasid_set_bits(&pe->val[1], 1 << 23, value);
+}
+
+static void
+pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu,
+   u16 did, int pasid)
+{
+   struct qi_desc desc;
+
+   desc.qw0 = QI_PC_DID(did) | QI_PC_PASID_SEL | QI_PC_PASID(pasid);
+   desc.qw1 = 0;
+   desc.qw2 = 0;
+   desc.qw3 = 0;
+
+	qi_submit_sync(&desc, iommu);
+}
+
+static void
+iotlb_invalidation_with_pasid(struct intel_iommu *iommu, u16 did, u32 pasid)
+{
+   struct qi_desc desc;
+
+   desc.qw0 = QI_EIOTLB_PASID(pasid) | QI_EIOTLB_DID(did) |
+   QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) | QI_EIOTLB_TYPE;
+   desc.qw1 = 0;
+   desc.qw2 = 0;
+   desc.qw3 = 0;
+
+	qi_submit_sync(&desc, iommu);
+}
+
+static void
+devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
+  struct device *dev, int pasid)
+{
+   struct device_domain_info *info;
+   u16 sid, qdep, pfsid;
+
+   info = dev->archdata.iommu;
+   if (!info || !info->ats_enabled)
+   return;
+
+   sid = info->bus << 8 | info->devfn;
+   qdep = info->ats_qdep;
+   pfsid = info->pfsid;
+
+   qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0, 64 - VTD_PAGE_SHIFT);
+}
+
+void intel_pasid_tear_down_entry(struct 

[PATCH v6 08/12] iommu/vt-d: Pass pasid table to context mapping

2018-12-09 Thread Lu Baolu
So that the pasid-related info, such as the pasid table and the
maximum pasid value, can be used when setting up the scalable
mode context.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Cc: Liu Yi L 
Cc: Sanjay Kumar 
Signed-off-by: Lu Baolu 
Reviewed-by: Ashok Raj 
Reviewed-by: Kevin Tian 
---
 drivers/iommu/intel-iommu.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a077ff3f67a6..55c4ffda1246 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1920,6 +1920,7 @@ static void domain_exit(struct dmar_domain *domain)
 
 static int domain_context_mapping_one(struct dmar_domain *domain,
  struct intel_iommu *iommu,
+ struct pasid_table *table,
  u8 bus, u8 devfn)
 {
u16 did = domain->iommu_did[iommu->seq_id];
@@ -2042,6 +2043,7 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
 struct domain_context_mapping_data {
struct dmar_domain *domain;
struct intel_iommu *iommu;
+   struct pasid_table *table;
 };
 
 static int domain_context_mapping_cb(struct pci_dev *pdev,
@@ -2050,25 +2052,31 @@ static int domain_context_mapping_cb(struct pci_dev *pdev,
struct domain_context_mapping_data *data = opaque;
 
return domain_context_mapping_one(data->domain, data->iommu,
- PCI_BUS_NUM(alias), alias & 0xff);
+ data->table, PCI_BUS_NUM(alias),
+ alias & 0xff);
 }
 
 static int
 domain_context_mapping(struct dmar_domain *domain, struct device *dev)
 {
+   struct domain_context_mapping_data data;
+   struct pasid_table *table;
struct intel_iommu *iommu;
u8 bus, devfn;
-   struct domain_context_mapping_data data;
 
	iommu = device_to_iommu(dev, &bus, &devfn);
if (!iommu)
return -ENODEV;
 
+   table = intel_pasid_get_table(dev);
+
if (!dev_is_pci(dev))
-   return domain_context_mapping_one(domain, iommu, bus, devfn);
+   return domain_context_mapping_one(domain, iommu, table,
+ bus, devfn);
 
data.domain = domain;
data.iommu = iommu;
+   data.table = table;
 
	return pci_for_each_dma_alias(to_pci_dev(dev),
				      &domain_context_mapping_cb, &data);
-- 
2.17.1



[PATCH v6 03/12] iommu/vt-d: Move page table helpers into header

2018-12-09 Thread Lu Baolu
So that they could also be used in other source files.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Cc: Liu Yi L 
Cc: Sanjay Kumar 
Signed-off-by: Lu Baolu 
Reviewed-by: Ashok Raj 
Reviewed-by: Kevin Tian 
---
 drivers/iommu/intel-iommu.c | 43 -
 include/linux/intel-iommu.h | 43 +
 2 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 73d7c15bf737..52cdb0e077f2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -290,49 +290,6 @@ static inline void context_clear_entry(struct context_entry *context)
context->hi = 0;
 }
 
-/*
- * 0: readable
- * 1: writable
- * 2-6: reserved
- * 7: super page
- * 8-10: available
- * 11: snoop behavior
- * 12-63: Host physical address
- */
-struct dma_pte {
-   u64 val;
-};
-
-static inline void dma_clear_pte(struct dma_pte *pte)
-{
-   pte->val = 0;
-}
-
-static inline u64 dma_pte_addr(struct dma_pte *pte)
-{
-#ifdef CONFIG_64BIT
-   return pte->val & VTD_PAGE_MASK;
-#else
-   /* Must have a full atomic 64-bit read */
-	return  __cmpxchg64(&pte->val, 0ULL, 0ULL) & VTD_PAGE_MASK;
-#endif
-}
-
-static inline bool dma_pte_present(struct dma_pte *pte)
-{
-   return (pte->val & 3) != 0;
-}
-
-static inline bool dma_pte_superpage(struct dma_pte *pte)
-{
-   return (pte->val & DMA_PTE_LARGE_PAGE);
-}
-
-static inline int first_pte_in_page(struct dma_pte *pte)
-{
-   return !((unsigned long)pte & ~VTD_PAGE_MASK);
-}
-
 /*
  * This domain is a statically identity mapping domain.
  * 1. This domain creats a static 1:1 mapping to all usable memory.
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 8c9b6063d275..b4da61385ebf 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -590,6 +590,49 @@ static inline void __iommu_flush_cache(
clflush_cache_range(addr, size);
 }
 
+/*
+ * 0: readable
+ * 1: writable
+ * 2-6: reserved
+ * 7: super page
+ * 8-10: available
+ * 11: snoop behavior
+ * 12-63: Host physical address
+ */
+struct dma_pte {
+   u64 val;
+};
+
+static inline void dma_clear_pte(struct dma_pte *pte)
+{
+   pte->val = 0;
+}
+
+static inline u64 dma_pte_addr(struct dma_pte *pte)
+{
+#ifdef CONFIG_64BIT
+   return pte->val & VTD_PAGE_MASK;
+#else
+   /* Must have a full atomic 64-bit read */
+	return  __cmpxchg64(&pte->val, 0ULL, 0ULL) & VTD_PAGE_MASK;
+#endif
+}
+
+static inline bool dma_pte_present(struct dma_pte *pte)
+{
+   return (pte->val & 3) != 0;
+}
+
+static inline bool dma_pte_superpage(struct dma_pte *pte)
+{
+   return (pte->val & DMA_PTE_LARGE_PAGE);
+}
+
+static inline int first_pte_in_page(struct dma_pte *pte)
+{
+   return !((unsigned long)pte & ~VTD_PAGE_MASK);
+}
+
 extern struct dmar_drhd_unit * dmar_find_matched_drhd_unit(struct pci_dev *dev);
 extern int dmar_find_matched_atsr_unit(struct pci_dev *dev);
 
-- 
2.17.1



[PATCH v6 07/12] iommu/vt-d: Setup pasid entry for RID2PASID support

2018-12-09 Thread Lu Baolu
When scalable mode is enabled, there is no second level
page translation pointer in the context entry any more (for
DMA requests without PASID). Instead, a new RID2PASID field
is introduced in the context entry. Software can choose any
PASID value to set RID2PASID and then set up the translation
in the corresponding PASID entry. Upon receiving a DMA request
without PASID, hardware will first look at this RID2PASID
field and then treat the request as a request with the pasid
value specified in that field.

Though software is allowed to use any PASID for the RID2PASID,
we always use PASID 0, as a design decision.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Sanjay Kumar 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
Reviewed-by: Ashok Raj 
---
 drivers/iommu/intel-iommu.c | 20 
 drivers/iommu/intel-pasid.h |  1 +
 2 files changed, 21 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index e741238e2326..a077ff3f67a6 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2462,6 +2462,22 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
dmar_remove_one_dev_info(domain, dev);
return NULL;
}
+
+   /* Setup the PASID entry for requests without PASID: */
+		spin_lock(&iommu->lock);
+   if (hw_pass_through && domain_type_is_si(domain))
+   ret = intel_pasid_setup_pass_through(iommu, domain,
+   dev, PASID_RID2PASID);
+   else
+   ret = intel_pasid_setup_second_level(iommu, domain,
+   dev, PASID_RID2PASID);
+		spin_unlock(&iommu->lock);
+   if (ret) {
+   pr_err("Setup RID2PASID for %s failed\n",
+  dev_name(dev));
+   dmar_remove_one_dev_info(domain, dev);
+   return NULL;
+   }
}
 
if (dev && domain_context_mapping(domain, dev)) {
@@ -4825,6 +4841,10 @@ static void __dmar_remove_one_dev_info(struct device_domain_info *info)
iommu = info->iommu;
 
if (info->dev) {
+   if (dev_is_pci(info->dev) && sm_supported(iommu))
+   intel_pasid_tear_down_entry(iommu, info->dev,
+   PASID_RID2PASID);
+
iommu_disable_dev_iotlb(info);
domain_context_clear(iommu, info->dev);
intel_pasid_free_table(info->dev);
diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
index 3c70522091d3..d6f4fead4491 100644
--- a/drivers/iommu/intel-pasid.h
+++ b/drivers/iommu/intel-pasid.h
@@ -10,6 +10,7 @@
 #ifndef __INTEL_PASID_H
 #define __INTEL_PASID_H
 
+#define PASID_RID2PASID			0x0
 #define PASID_MIN			0x1
 #define PASID_MAX			0x100000
 #define PASID_PTE_MASK			0x3F
-- 
2.17.1



[PATCH v6 05/12] iommu/vt-d: Reserve a domain id for FL and PT modes

2018-12-09 Thread Lu Baolu
The VT-d spec rev3.0 (section 6.2.3.1) requires that each pasid
entry for first-level or pass-through translation should be
programmed with a domain id different from those used for
second-level or nested translation. It is recommended that
software use the same domain id for all first-level-only and
pass-through translations.

This reserves a domain id for first-level and pass-through
translations.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Cc: Liu Yi L 
Cc: Sanjay Kumar 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel-iommu.c | 10 ++
 drivers/iommu/intel-pasid.h |  6 ++
 2 files changed, 16 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 52cdb0e077f2..5e924bc8ebec 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1621,6 +1621,16 @@ static int iommu_init_domains(struct intel_iommu *iommu)
 */
set_bit(0, iommu->domain_ids);
 
+   /*
+* Vt-d spec rev3.0 (section 6.2.3.1) requires that each pasid
+* entry for first-level or pass-through translation modes should
+* be programmed with a domain id different from those used for
+* second-level or nested translation. We reserve a domain id for
+* this purpose.
+*/
+   if (sm_supported(iommu))
+   set_bit(FLPT_DEFAULT_DID, iommu->domain_ids);
+
return 0;
 }
 
diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
index 12f480c2bb8b..03c1612d173c 100644
--- a/drivers/iommu/intel-pasid.h
+++ b/drivers/iommu/intel-pasid.h
@@ -17,6 +17,12 @@
 #define PDE_PFN_MASK   PAGE_MASK
 #define PASID_PDE_SHIFT6
 
+/*
+ * Domain ID reserved for pasid entries programmed for first-level
+ * only and pass-through transfer modes.
+ */
+#define FLPT_DEFAULT_DID   1
+
 struct pasid_dir_entry {
u64 val;
 };
-- 
2.17.1



[PATCH v6 02/12] iommu/vt-d: Manage scalable mode PASID tables

2018-12-09 Thread Lu Baolu
In scalable mode, the pasid structure is a two-level table with
a pasid directory table and a pasid table. Any pasid entry can
be identified by a pasid value in the following way.

The upper bits (19:6) of a pasid value index the PASID directory
table, which the context entry points to; the selected directory
entry in turn points to a PASID table, and the lower bits (5:0)
of the pasid value index the PASID entry within that table.

This changes the pasid table APIs to support scalable mode
PASID directory and PASID table. It also adds a helper to
get the PASID table entry according to the pasid value.
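To illustrate the two-level lookup, here is a simplified sketch of
the helper (not the driver's exact code, which also allocates a
missing PASID table on demand; PASID_PDE_SHIFT and PDE_PFN_MASK come
from intel-pasid.h):

	/* Sketch: resolve a pasid value to its PASID table entry. */
	static struct pasid_entry *
	example_get_pasid_entry(struct pasid_dir_entry *dir, int pasid)
	{
		struct pasid_dir_entry *pde;
		struct pasid_entry *table;

		/* Bits 19:6 select the directory entry... */
		pde = &dir[pasid >> PASID_PDE_SHIFT];

		/* ...whose upper bits point to a PASID table. */
		table = phys_to_virt(READ_ONCE(pde->val) & PDE_PFN_MASK);

		/* Bits 5:0 select the PASID entry within that table. */
		return &table[pasid & ((1 << PASID_PDE_SHIFT) - 1)];
	}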

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Sanjay Kumar 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
Reviewed-by: Ashok Raj 
---
 drivers/iommu/intel-iommu.c | 23 ++
 drivers/iommu/intel-pasid.c | 87 ++---
 drivers/iommu/intel-pasid.h | 12 -
 drivers/iommu/intel-svm.c   |  6 +--
 4 files changed, 97 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index c0ffde6cb21a..73d7c15bf737 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -425,21 +425,24 @@ static LIST_HEAD(device_domain_list);
 
 /*
  * Iterate over elements in device_domain_list and call the specified
- * callback @fn against each element. This helper should only be used
- * in the context where the device_domain_lock has already been holden.
+ * callback @fn against each element.
  */
 int for_each_device_domain(int (*fn)(struct device_domain_info *info,
 void *data), void *data)
 {
int ret = 0;
+   unsigned long flags;
struct device_domain_info *info;
 
-	assert_spin_locked(&device_domain_lock);
+	spin_lock_irqsave(&device_domain_lock, flags);
 	list_for_each_entry(info, &device_domain_list, global) {
 		ret = fn(info, data);
-		if (ret)
+		if (ret) {
+			spin_unlock_irqrestore(&device_domain_lock, flags);
 			return ret;
+		}
 	}
+	spin_unlock_irqrestore(&device_domain_lock, flags);
 
return 0;
 }
@@ -2481,16 +2484,18 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	list_add(&info->global, &device_domain_list);
 	if (dev)
 		dev->archdata.iommu = info;
+	spin_unlock_irqrestore(&device_domain_lock, flags);
 
-   if (dev && dev_is_pci(dev) && info->pasid_supported) {
+   /* PASID table is mandatory for a PCI device in scalable mode. */
+   if (dev && dev_is_pci(dev) && sm_supported(iommu)) {
ret = intel_pasid_alloc_table(dev);
if (ret) {
-   pr_warn("No pasid table for %s, pasid disabled\n",
-   dev_name(dev));
-   info->pasid_supported = 0;
+   pr_err("PASID table allocation for %s failed\n",
+  dev_name(dev));
+   dmar_remove_one_dev_info(domain, dev);
+   return NULL;
}
}
-	spin_unlock_irqrestore(&device_domain_lock, flags);
 
if (dev && domain_context_mapping(domain, dev)) {
pr_err("Domain context map for %s failed\n", dev_name(dev));
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index fe95c9bd4d33..fd3ccc0753b0 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -123,12 +123,13 @@ int intel_pasid_alloc_table(struct device *dev)
struct pasid_table *pasid_table;
struct pasid_table_opaque data;
struct page *pages;
-   size_t size, count;
+   int max_pasid = 0;
int ret, order;
+   int size;
 
+   might_sleep();
info = dev->archdata.iommu;
-   if (WARN_ON(!info || !dev_is_pci(dev) ||
-   !info->pasid_supported || info->pasid_table))
+   if (WARN_ON(!info || !dev_is_pci(dev) || info->pasid_table))
return -EINVAL;
 
/* DMA alias device already has a pasid table, use it: */
@@ -138,23 +139,25 @@ int intel_pasid_alloc_table(struct device *dev)
if (ret)
goto attach_out;
 
-   pasid_table = kzalloc(sizeof(*pasid_table), GFP_ATOMIC);
+   pasid_table = kzalloc(sizeof(*pasid_table), 

[PATCH v6 00/12] iommu/vt-d: Add scalable mode support

2018-12-09 Thread Lu Baolu
Hi,

Intel vt-d rev3.0 [1] introduces a new translation mode called
'scalable mode', which enables PASID-granular translations for
first level, second level, nested and pass-through modes. The
vt-d scalable mode is the key ingredient to enable Scalable I/O
Virtualization (Scalable IOV) [2] [3], which allows sharing a
device in minimal possible granularity (ADI - Assignable Device
Interface). It also includes all the capabilities required to
enable Shared Virtual Addressing (SVA). As a result, the previous
Extended Context (ECS) mode is deprecated (no production hardware
ever implemented ECS).

Each scalable mode pasid table entry is 64 bytes in length, with
fields pointing to the first level page table and the second level
page table. The PGTT (Pasid Granular Translation Type) field is
used by hardware to determine the translation type.


In a scalable mode PASID entry (eight 64-bit quadwords, numbered
0 to 7), the FLPTR field in quadword 2 points to the first level
page table, the SLPTR field in quadword 0 points to the second
level page table, and the PGTT field (bits 8:6 of quadword 0)
encodes the PASID Granular Translation Type:

  001b: 1st level translation only
  101b: 2nd level translation only
  011b: Nested translation
  100b: Pass through

This patch series adds scalable mode support to the Intel
IOMMU driver and makes all the Intel IOMMU features work in
scalable mode. The changes are all constrained to the Intel
IOMMU driver, as this is purely an internal format change.

References:
[1] 
https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[2] 
https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf

Change log:
v5->v6:
  - [02/12] Move pasid table allocation out of lock range
and replace GFP_ATOMIC with GFP_KERNEL when
allocating the pasid memory.

v4->v5:
  - [04/12] Add a comment to explain why we only print
two QWORDs of an invalid descriptor.
  - [06/12] Fix domain agaw being less than iommu agaw.
  - [09/12] Move data type to local branch.
  - [11/12] Hold iommu lock when setup pasid entry for
SVM.
  - All rebased on top of vt-d branch of iommu tree.

v3->v4:
  - Rebase all patches to 4.20-rc1.
  - Use the right PASID directory table size.
  - Add a function for pass through translation setting up.
  - Refine the parameters passed to first level translation
mode interface.

v2->v3:
  - Rebase all patches on top of vt-d branch of iommu repo.
  - Set the pasid directory table size to 1 page for devices
which have no pasid support.
  - Fix various comments received during v2 review period.
All were code style related.

v1->v2:
  - Rebase all patches on top of v4.19-rc1;
  - Add 256-bit invalidation descriptor support;
  - Reserve a domain id for first level and pass-through
usage to make hardware cache entries more efficiently;
  - Various code refinements.

Lu Baolu (12):
  iommu/vt-d: Enumerate the scalable mode capability
  iommu/vt-d: Manage scalable mode PASID tables
  iommu/vt-d: Move page table helpers into header
  iommu/vt-d: Add 256-bit invalidation descriptor support
  iommu/vt-d: Reserve a domain id for FL and PT modes
  iommu/vt-d: Add second level page table interface
  iommu/vt-d: Setup pasid entry for RID2PASID support
  iommu/vt-d: Pass pasid table to context mapping
  iommu/vt-d: Setup context and enable RID2PASID support
  iommu/vt-d: Add first level page table interface
  iommu/vt-d: Shared virtual address in scalable mode
  iommu/vt-d: Remove deferred invalidation

 .../admin-guide/kernel-parameters.txt |  12 +-
 drivers/iommu/dmar.c  |  91 ++--
 drivers/iommu/intel-iommu.c   | 351 +++---
 drivers/iommu/intel-pasid.c   | 449 +-
 drivers/iommu/intel-pasid.h 

[PATCH v6 2/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

2018-12-09 Thread Nicolas Boichat
IOMMUs using ARMv7 short-descriptor format require page tables
(level 1 and 2) to be allocated within the first 4GB of RAM, even
on 64-bit systems.

For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32
is defined (e.g. on arm64 platforms).

For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32. Note
that we do not explicitly pass GFP_DMA[32] to kmem_cache_zalloc,
as this is not strictly necessary, and would cause a warning
in mm/sl*b.c, as we did not update GFP_SLAB_BUG_MASK.

Also, print an error when the physical address does not fit in
32 bits, to make debugging easier in the future.

Cc: sta...@vger.kernel.org
Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---

Changes since v2:
 - Commit message

(v3 used the page_frag approach)

Changes since v4:
 - Do not pass ARM_V7S_TABLE_GFP_DMA to kmem_cache_zalloc, as this
   is unnecessary, and would trigger a warning.

Changes since v5:
 - Rename ARM_V7S_TABLE_SLAB_CACHE to ARM_V7S_TABLE_SLAB_FLAGS.

 drivers/iommu/io-pgtable-arm-v7s.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 445c3bde04800c..d2fdb192f7610f 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -161,6 +161,14 @@
 
 #define ARM_V7S_TCR_PD1		BIT(5)
 
+#ifdef CONFIG_ZONE_DMA32
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
+#define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA32
+#else
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
+#define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA
+#endif
+
 typedef u32 arm_v7s_iopte;
 
 static bool selftest_running;
@@ -198,13 +206,16 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
void *table = NULL;
 
if (lvl == 1)
-   table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
+   table = (void *)__get_free_pages(
+   __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
else if (lvl == 2)
-   table = kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA);
+   table = kmem_cache_zalloc(data->l2_tables, gfp);
phys = virt_to_phys(table);
-   if (phys != (arm_v7s_iopte)phys)
+   if (phys != (arm_v7s_iopte)phys) {
/* Doesn't fit in PTE */
+		dev_err(dev, "Page table does not fit in PTE: %pa", &phys);
goto out_free;
+   }
if (table && !(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma))
@@ -737,7 +748,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
data->l2_tables = kmem_cache_create("io-pgtable_armv7s_l2",
ARM_V7S_TABLE_SIZE(2),
ARM_V7S_TABLE_SIZE(2),
-   SLAB_CACHE_DMA, NULL);
+   ARM_V7S_TABLE_SLAB_FLAGS, NULL);
if (!data->l2_tables)
goto out_free_data;
 
-- 
2.20.0.rc2.403.gdbc3b29805-goog



[PATCH v6 1/3] mm: Add support for kmem caches in DMA32 zone

2018-12-09 Thread Nicolas Boichat
IOMMUs using ARMv7 short-descriptor format require page tables
to be allocated within the first 4GB of RAM, even on 64-bit systems.
On arm64, this is done by passing GFP_DMA32 flag to memory allocation
functions.

For IOMMU L2 tables that only take 1KB, it would be a waste to allocate
a full page using get_free_pages, so we considered 3 approaches:
 1. This patch, adding support for GFP_DMA32 slab caches.
 2. genalloc, which requires pre-allocating the maximum number of L2
page tables (4096, so 4MB of memory).
 3. page_frag, which is not very memory-efficient as it is unable
to reuse freed fragments until the whole page is freed.

This change makes it possible to create a custom cache in DMA32 zone
using kmem_cache_create, then allocate memory using kmem_cache_alloc.

We do not create a DMA32 kmalloc cache array, as there are currently
no users of kmalloc(..., GFP_DMA32). These calls will continue to
trigger a warning, as we keep GFP_DMA32 in GFP_SLAB_BUG_MASK.

This implies that calls to kmem_cache_*alloc on a SLAB_CACHE_DMA32
kmem_cache must _not_ use GFP_DMA32 (it is anyway redundant and
unnecessary).
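As a usage sketch (illustrative names, not part of this patch):
create the cache with SLAB_CACHE_DMA32 and allocate from it with
plain GFP_KERNEL; the DMA32 zone is implied by the cache flag:

	static struct kmem_cache *example_l2_cache;

	static int example_init(void)
	{
		/* 1KB objects, 1KB aligned, backed by ZONE_DMA32 pages. */
		example_l2_cache = kmem_cache_create("example_l2", 1024, 1024,
						     SLAB_CACHE_DMA32, NULL);
		return example_l2_cache ? 0 : -ENOMEM;
	}

	static void *example_alloc_table(void)
	{
		/* GFP_KERNEL only; adding GFP_DMA32 here would trigger the
		 * GFP_SLAB_BUG_MASK warning mentioned above.
		 */
		return kmem_cache_zalloc(example_l2_cache, GFP_KERNEL);
	}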

Cc: sta...@vger.kernel.org
Signed-off-by: Nicolas Boichat 
Acked-by: Vlastimil Babka 
---

Changes since v2:
 - Clarified commit message
 - Add entry in sysfs-kernel-slab to document the new sysfs file

(v3 used the page_frag approach)

Changes since v4:
 - Added details to commit message
 - Dropped change that removed GFP_DMA32 from GFP_SLAB_BUG_MASK:
   instead we can just call kmem_cache_*alloc without GFP_DMA32
   parameter. This also means that we can drop PATCH 1/3, as we
   do not make any changes in GFP flag verification.
 - Dropped hunks that added cache_dma32 sysfs file, and moved
   the hunks to PATCH 3/3, so that maintainer can decide whether
   to pick the change independently.

(no change since v5)

 include/linux/slab.h | 2 ++
 mm/slab.c| 2 ++
 mm/slab.h| 3 ++-
 mm/slab_common.c | 2 +-
 mm/slub.c| 5 +
 5 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 11b45f7ae4057c..9449b19c5f107a 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -32,6 +32,8 @@
 #define SLAB_HWCACHE_ALIGN	((slab_flags_t __force)0x00002000U)
 /* Use GFP_DMA memory */
 #define SLAB_CACHE_DMA		((slab_flags_t __force)0x00004000U)
+/* Use GFP_DMA32 memory */
+#define SLAB_CACHE_DMA32	((slab_flags_t __force)0x00008000U)
 /* DEBUG: Store the last owner for bug hunting */
 #define SLAB_STORE_USER		((slab_flags_t __force)0x00010000U)
 /* Panic if kmem_cache_create() fails */
diff --git a/mm/slab.c b/mm/slab.c
index 73fe23e649c91a..124f8c556d27fb 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2109,6 +2109,8 @@ int __kmem_cache_create(struct kmem_cache *cachep, slab_flags_t flags)
cachep->allocflags = __GFP_COMP;
if (flags & SLAB_CACHE_DMA)
cachep->allocflags |= GFP_DMA;
+   if (flags & SLAB_CACHE_DMA32)
+   cachep->allocflags |= GFP_DMA32;
if (flags & SLAB_RECLAIM_ACCOUNT)
cachep->allocflags |= __GFP_RECLAIMABLE;
cachep->size = size;
diff --git a/mm/slab.h b/mm/slab.h
index 4190c24ef0e9df..fcf717e12f0a86 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -127,7 +127,8 @@ static inline slab_flags_t kmem_cache_flags(unsigned int object_size,
 
 
 /* Legal flag mask for kmem_cache_create(), for various configurations */
-#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
+#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
+SLAB_CACHE_DMA32 | SLAB_PANIC | \
 SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS )
 
 #if defined(CONFIG_DEBUG_SLAB)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 70b0cc85db67f8..18b7b809c8d064 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -53,7 +53,7 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
SLAB_FAILSLAB | SLAB_KASAN)
 
 #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
-SLAB_ACCOUNT)
+SLAB_CACHE_DMA32 | SLAB_ACCOUNT)
 
 /*
  * Merge control. If this is set then no merging of slab caches will occur.
diff --git a/mm/slub.c b/mm/slub.c
index c229a9b7dd5448..4caadb926838ef 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3583,6 +3583,9 @@ static int calculate_sizes(struct kmem_cache *s, int forced_order)
if (s->flags & SLAB_CACHE_DMA)
s->allocflags |= GFP_DMA;
 
+   if (s->flags & SLAB_CACHE_DMA32)
+   s->allocflags |= GFP_DMA32;
+
if (s->flags & SLAB_RECLAIM_ACCOUNT)
s->allocflags |= __GFP_RECLAIMABLE;
 
@@ -5671,6 +5674,8 @@ static char *create_unique_id(struct kmem_cache *s)
 */
if (s->flags & SLAB_CACHE_DMA)
*p++ = 'd';
+   if (s->flags & SLAB_CACHE_DMA32)
+   *p++ = 'D';
if 

[PATCH v6 3/3] mm: Add /sys/kernel/slab/cache/cache_dma32

2018-12-09 Thread Nicolas Boichat
A previous patch in this series adds support for SLAB_CACHE_DMA32
kmem caches. This adds the corresponding
/sys/kernel/slab/cache/cache_dma32 entry, and fixes the slabinfo
tool.

Cc: sta...@vger.kernel.org
Signed-off-by: Nicolas Boichat 
---

There were different opinions on whether this sysfs entry should
be added, so I'll leave it up to the mm/slub maintainers to decide
whether they want to pick this up, or drop it.

No change since v5.

 Documentation/ABI/testing/sysfs-kernel-slab |  9 +
 mm/slub.c   | 11 +++
 tools/vm/slabinfo.c |  7 ++-
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
index 29601d93a1c2ea..d742c6cfdffbe9 100644
--- a/Documentation/ABI/testing/sysfs-kernel-slab
+++ b/Documentation/ABI/testing/sysfs-kernel-slab
@@ -106,6 +106,15 @@ Description:
are from ZONE_DMA.
Available when CONFIG_ZONE_DMA is enabled.
 
+What:  /sys/kernel/slab/cache/cache_dma32
+Date:  December 2018
+KernelVersion: 4.21
+Contact:   Nicolas Boichat 
+Description:
+   The cache_dma32 file is read-only and specifies whether objects
+   are from ZONE_DMA32.
+   Available when CONFIG_ZONE_DMA32 is enabled.
+
 What:  /sys/kernel/slab/cache/cpu_slabs
 Date:  May 2007
 KernelVersion: 2.6.22
diff --git a/mm/slub.c b/mm/slub.c
index 4caadb926838ef..840f3719d9d543 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5104,6 +5104,14 @@ static ssize_t cache_dma_show(struct kmem_cache *s, char *buf)
 SLAB_ATTR_RO(cache_dma);
 #endif
 
+#ifdef CONFIG_ZONE_DMA32
+static ssize_t cache_dma32_show(struct kmem_cache *s, char *buf)
+{
+   return sprintf(buf, "%d\n", !!(s->flags & SLAB_CACHE_DMA32));
+}
+SLAB_ATTR_RO(cache_dma32);
+#endif
+
 static ssize_t usersize_show(struct kmem_cache *s, char *buf)
 {
return sprintf(buf, "%u\n", s->usersize);
@@ -5444,6 +5452,9 @@ static struct attribute *slab_attrs[] = {
 #ifdef CONFIG_ZONE_DMA
	&cache_dma_attr.attr,
 #endif
+#ifdef CONFIG_ZONE_DMA32
+	&cache_dma32_attr.attr,
+#endif
 #ifdef CONFIG_NUMA
	&remote_node_defrag_ratio_attr.attr,
 #endif
diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c
index 334b16db0ebbe9..4ee1bf6e498dfa 100644
--- a/tools/vm/slabinfo.c
+++ b/tools/vm/slabinfo.c
@@ -29,7 +29,7 @@ struct slabinfo {
char *name;
int alias;
int refs;
-   int aliases, align, cache_dma, cpu_slabs, destroy_by_rcu;
+   int aliases, align, cache_dma, cache_dma32, cpu_slabs, destroy_by_rcu;
unsigned int hwcache_align, object_size, objs_per_slab;
unsigned int sanity_checks, slab_size, store_user, trace;
int order, poison, reclaim_account, red_zone;
@@ -531,6 +531,8 @@ static void report(struct slabinfo *s)
printf("** Hardware cacheline aligned\n");
if (s->cache_dma)
printf("** Memory is allocated in a special DMA zone\n");
+   if (s->cache_dma32)
+   printf("** Memory is allocated in a special DMA32 zone\n");
if (s->destroy_by_rcu)
printf("** Slabs are destroyed via RCU\n");
if (s->reclaim_account)
@@ -599,6 +601,8 @@ static void slabcache(struct slabinfo *s)
*p++ = '*';
if (s->cache_dma)
*p++ = 'd';
+   if (s->cache_dma32)
+   *p++ = 'D';
if (s->hwcache_align)
*p++ = 'A';
if (s->poison)
@@ -1205,6 +1209,7 @@ static void read_slab_dir(void)
slab->aliases = get_obj("aliases");
slab->align = get_obj("align");
slab->cache_dma = get_obj("cache_dma");
+   slab->cache_dma32 = get_obj("cache_dma32");
slab->cpu_slabs = get_obj("cpu_slabs");
slab->destroy_by_rcu = get_obj("destroy_by_rcu");
slab->hwcache_align = get_obj("hwcache_align");
-- 
2.20.0.rc2.403.gdbc3b29805-goog


[PATCH v6 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-12-09 Thread Nicolas Boichat
This is a follow-up to the discussion in [1], [2].

IOMMUs using ARMv7 short-descriptor format require page tables
(level 1 and 2) to be allocated within the first 4GB of RAM, even
on 64-bit systems.

For L1 tables that are bigger than a page, we can just use __get_free_pages
with GFP_DMA32 (on arm64 systems only, arm would still use GFP_DMA).

For L2 tables that only take 1KB, it would be a waste to allocate a full
page, so we considered 3 approaches:
 1. This series, adding support for GFP_DMA32 slab caches.
 2. genalloc, which requires pre-allocating the maximum number of L2 page
tables (4096, so 4MB of memory).
 3. page_frag, which is not very memory-efficient as it is unable to reuse
freed fragments until the whole page is freed. [3]

This series is the most memory-efficient approach.
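
For illustration, the consumer side (patch 2 in this series) boils down
to something like the following sketch; names here are abbreviated and
hypothetical, and the actual patch is authoritative:

  #include <linux/slab.h>
  #include <linux/sizes.h>

  /* Sketch: allocate the 1KB L2 tables from a dedicated kmem cache that
   * carries the DMA32 constraint itself, so callers don't pass GFP_DMA32
   * (and don't burn a whole GFP_DMA32 page per 1KB table).
   */
  static struct kmem_cache *l2_tables;

  static int l2_cache_init(void)
  {
          l2_tables = kmem_cache_create("io-pgtable_armv7s_l2", SZ_1K,
                                        SZ_1K, SLAB_CACHE_DMA32, NULL);
          return l2_tables ? 0 : -ENOMEM;
  }

  static void *l2_table_alloc(void)
  {
          return kmem_cache_zalloc(l2_tables, GFP_KERNEL);
  }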

stable@ note:
  We confirmed that this is a regression, and IOMMU errors happen on 4.19
  and linux-next/master on MT8173 (elm, Acer Chromebook R13). The issue
  most likely starts from commit ad67f5a6545f ("arm64: replace ZONE_DMA
  with ZONE_DMA32"), i.e. 4.15, and presumably breaks a number of Mediatek
  platforms (and maybe others?).

[1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html
[2] https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031696.html
[3] https://patchwork.codeaurora.org/patch/671639/

Changes since v1:
 - Add support for SLAB_CACHE_DMA32 in slab and slub (patches 1/2)
 - iommu/io-pgtable-arm-v7s (patch 3):
   - Changed approach to use SLAB_CACHE_DMA32 added by the previous
 commit.
   - Use DMA or DMA32 depending on the architecture (DMA for arm,
 DMA32 for arm64).

Changes since v2:
 - Reworded and expanded commit messages
 - Added cache_dma32 documentation in PATCH 2/3.

v3 used the page_frag approach, see [3].

Changes since v4:
 - Dropped change that removed GFP_DMA32 from GFP_SLAB_BUG_MASK:
   instead we can just call kmem_cache_*alloc without GFP_DMA32
   parameter. This also means that we can drop PATCH v4 1/3, as we
   do not make any changes in GFP flag verification.
 - Dropped hunks that added cache_dma32 sysfs file, and moved
   the hunks to PATCH v5 3/3, so that maintainer can decide whether
   to pick the change independently.

Changes since v5:
 - Rename ARM_V7S_TABLE_SLAB_CACHE to ARM_V7S_TABLE_SLAB_FLAGS.
 - Add stable@ to cc.

Nicolas Boichat (3):
  mm: Add support for kmem caches in DMA32 zone
  iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging
  mm: Add /sys/kernel/slab/cache/cache_dma32

 Documentation/ABI/testing/sysfs-kernel-slab |  9 +
 drivers/iommu/io-pgtable-arm-v7s.c  | 19 +++
 include/linux/slab.h|  2 ++
 mm/slab.c   |  2 ++
 mm/slab.h   |  3 ++-
 mm/slab_common.c|  2 +-
 mm/slub.c   | 16 
 tools/vm/slabinfo.c |  7 ++-
 8 files changed, 53 insertions(+), 7 deletions(-)

-- 
2.20.0.rc2.403.gdbc3b29805-goog


[PATCH v2 2/2] swiotlb: checking whether swiotlb buffer is full with io_tlb_used

2018-12-09 Thread Dongli Zhang
This patch uses io_tlb_used to help check whether the swiotlb buffer is
full. io_tlb_used is no longer used only for debugfs; it also helps
optimize swiotlb_tbl_map_single().

Suggested-by: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 kernel/dma/swiotlb.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 3979c2c..9300341 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -76,12 +76,10 @@ static phys_addr_t io_tlb_start, io_tlb_end;
  */
 static unsigned long io_tlb_nslabs;
 
-#ifdef CONFIG_DEBUG_FS
 /*
  * The number of used IO TLB block
  */
 static unsigned long io_tlb_used;
-#endif
 
 /*
  * This is a free list describing the number of free entries available from
@@ -489,6 +487,10 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 * request and allocate a buffer from that IO TLB pool.
 */
spin_lock_irqsave(&io_tlb_lock, flags);
+
+   if (unlikely(nslots > io_tlb_nslabs - io_tlb_used))
+   goto not_found;
+
index = ALIGN(io_tlb_index, stride);
if (index >= io_tlb_nslabs)
index = 0;
@@ -538,9 +540,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", 
size);
return SWIOTLB_MAP_ERROR;
 found:
-#ifdef CONFIG_DEBUG_FS
io_tlb_used += nslots;
-#endif
spin_unlock_irqrestore(&io_tlb_lock, flags);
 
/*
@@ -602,9 +602,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, 
phys_addr_t tlb_addr,
for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != 
IO_TLB_SEGSIZE -1) && io_tlb_list[i]; i--)
io_tlb_list[i] = ++count;
 
-#ifdef CONFIG_DEBUG_FS
io_tlb_used -= nslots;
-#endif
}
spin_unlock_irqrestore(&io_tlb_lock, flags);
 }
-- 
2.7.4


[PATCH v2 1/2] swiotlb: add debugfs to track swiotlb buffer usage

2018-12-09 Thread Dongli Zhang
The device driver will not be able to do DMA operations once the swiotlb
buffer is full, either because the driver has too many IO TLB blocks in
flight, or because there is a memory leak in the device driver. Exporting
the swiotlb buffer usage via debugfs helps the user estimate the size of
the swiotlb buffer to pre-allocate, or analyze a device driver memory
leak.

Signed-off-by: Dongli Zhang 
---
Changed since v1:
  * init debugfs with late_initcall (suggested by Robin Murphy)
  * create debugfs entries with debugfs_create_ulong (suggested by Robin Murphy)
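
For reference, a trivial userspace reader of the two new entries could
look like this (illustrative only; assumes debugfs is mounted at
/sys/kernel/debug and the program runs with sufficient privileges):

  #include <stdio.h>

  static unsigned long read_counter(const char *path)
  {
          unsigned long val = 0;
          FILE *f = fopen(path, "r");

          if (f) {
                  if (fscanf(f, "%lu", &val) != 1)
                          val = 0;
                  fclose(f);
          }
          return val;
  }

  int main(void)
  {
          unsigned long nslabs =
                  read_counter("/sys/kernel/debug/swiotlb/io_tlb_nslabs");
          unsigned long used =
                  read_counter("/sys/kernel/debug/swiotlb/io_tlb_used");

          if (!nslabs)
                  return 1;
          printf("swiotlb: %lu/%lu slabs used (%.1f%%)\n",
                 used, nslabs, 100.0 * used / nslabs);
          return 0;
  }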

 kernel/dma/swiotlb.c | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 045930e..3979c2c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -35,6 +35,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_DEBUG_FS
+#include <linux/debugfs.h>
+#endif
 
 #include 
 #include 
@@ -73,6 +76,13 @@ static phys_addr_t io_tlb_start, io_tlb_end;
  */
 static unsigned long io_tlb_nslabs;
 
+#ifdef CONFIG_DEBUG_FS
+/*
+ * The number of used IO TLB block
+ */
+static unsigned long io_tlb_used;
+#endif
+
 /*
  * This is a free list describing the number of free entries available from
  * each index
@@ -528,6 +538,9 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", 
size);
return SWIOTLB_MAP_ERROR;
 found:
+#ifdef CONFIG_DEBUG_FS
+   io_tlb_used += nslots;
+#endif
spin_unlock_irqrestore(&io_tlb_lock, flags);
 
/*
@@ -588,6 +601,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, 
phys_addr_t tlb_addr,
 */
for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != 
IO_TLB_SEGSIZE -1) && io_tlb_list[i]; i--)
io_tlb_list[i] = ++count;
+
+#ifdef CONFIG_DEBUG_FS
+   io_tlb_used -= nslots;
+#endif
}
spin_unlock_irqrestore(&io_tlb_lock, flags);
 }
@@ -883,3 +900,36 @@ const struct dma_map_ops swiotlb_dma_ops = {
.dma_supported  = dma_direct_supported,
 };
 EXPORT_SYMBOL(swiotlb_dma_ops);
+
+#ifdef CONFIG_DEBUG_FS
+
+static int __init swiotlb_create_debugfs(void)
+{
+   static struct dentry *d_swiotlb_usage;
+   struct dentry *ent;
+
+   d_swiotlb_usage = debugfs_create_dir("swiotlb", NULL);
+
+   if (!d_swiotlb_usage)
+   return -ENOMEM;
+
+   ent = debugfs_create_ulong("io_tlb_nslabs", 0400,
+  d_swiotlb_usage, &io_tlb_nslabs);
+   if (!ent)
+   goto fail;
+
+   ent = debugfs_create_ulong("io_tlb_used", 0400,
+   d_swiotlb_usage, &io_tlb_used);
+   if (!ent)
+   goto fail;
+
+   return 0;
+
+fail:
+   debugfs_remove_recursive(d_swiotlb_usage);
+   return -ENOMEM;
+}
+
+late_initcall(swiotlb_create_debugfs);
+
+#endif
-- 
2.7.4


[PATCH v6 18/21] iommu/tegra: gart: Prepend error/debug messages with "gart:"

2018-12-09 Thread Dmitry Osipenko
GART became a part of the Memory Controller, hence the driver's device
is now the Memory Controller and not GART. As a result all printed
messages are prepended with "tegra-mc 7000f000.memory-controller:", so
let's prepend GART's messages with "gart:" in order to differentiate
them from the MC's.

Signed-off-by: Dmitry Osipenko 
---
 drivers/iommu/tegra-gart.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 99afdbf8860c..9d2df43ce50b 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -19,6 +19,8 @@
  * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
+#define dev_fmt(fmt)   "gart: " fmt
+
 #include 
 #include 
 #include 
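
For illustration, with dev_fmt() defined this way, a hypothetical call
such as

  dev_err(gart->dev, "reservation failed\n");

would be printed as

  tegra-mc 7000f000.memory-controller: gart: reservation failed

instead of carrying only the Memory Controller prefix. (The message text
is made up for this example.)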
-- 
2.19.1


[PATCH v6 21/21] iommu/tegra: gart: Perform code refactoring

2018-12-09 Thread Dmitry Osipenko
Remove redundant safety checks in the code and some debug code that
isn't actually very useful for debugging, like the enormous pagetable
dump on each fault. The majority of the changes are code reshuffling,
variable/whitespace clean-ups and removal of debug messages that
duplicate messages of the IOMMU core.

Signed-off-by: Dmitry Osipenko 
---
 drivers/iommu/tegra-gart.c | 244 +++--
 1 file changed, 96 insertions(+), 148 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 71de54aa845c..c732c6a2a165 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -1,5 +1,5 @@
 /*
- * IOMMU API for GART in Tegra20
+ * IOMMU API for Graphics Address Relocation Table on Tegra20
  *
  * Copyright (c) 2010-2012, NVIDIA CORPORATION.  All rights reserved.
  *
@@ -31,70 +31,63 @@
 
 #include 
 
-/* bitmap of the page sizes currently supported */
-#define GART_IOMMU_PGSIZES (SZ_4K)
-
 #define GART_REG_BASE  0x24
 #define GART_CONFIG(0x24 - GART_REG_BASE)
 #define GART_ENTRY_ADDR(0x28 - GART_REG_BASE)
 #define GART_ENTRY_DATA(0x2c - GART_REG_BASE)
-#define GART_ENTRY_PHYS_ADDR_VALID (1 << 31)
+
+#define GART_ENTRY_PHYS_ADDR_VALID BIT(31)
 
 #define GART_PAGE_SHIFT12
 #define GART_PAGE_SIZE (1 << GART_PAGE_SHIFT)
-#define GART_PAGE_MASK \
-   (~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
+#define GART_PAGE_MASK GENMASK(30, GART_PAGE_SHIFT)
+
+/* bitmap of the page sizes currently supported */
+#define GART_IOMMU_PGSIZES (GART_PAGE_SIZE)
 
 struct gart_device {
void __iomem*regs;
u32 *savedata;
-   u32 page_count; /* total remappable size */
-   dma_addr_t  iovmm_base; /* offset to vmm_area */
+   unsigned long   iovmm_base; /* offset to vmm_area start */
+   unsigned long   iovmm_end;  /* offset to vmm_area end */
spinlock_t  pte_lock;   /* for pagetable */
spinlock_t  dom_lock;   /* for active domain */
unsigned intactive_devices; /* number of active devices */
struct iommu_domain *active_domain; /* current active domain */
-   struct device   *dev;
-
struct iommu_device iommu;  /* IOMMU Core handle */
+   struct device   *dev;
 };
 
 static struct gart_device *gart_handle; /* unique for a system */
 
 static bool gart_debug;
 
-#define GART_PTE(_pfn) \
-   (GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
-
 /*
  * Any interaction between any block on PPSB and a block on APB or AHB
  * must have these read-back to ensure the APB/AHB bus transaction is
  * complete before initiating activity on the PPSB block.
  */
-#define FLUSH_GART_REGS(gart)  ((void)readl((gart)->regs + GART_CONFIG))
+#define FLUSH_GART_REGS(gart)  readl_relaxed((gart)->regs + GART_CONFIG)
 
 #define for_each_gart_pte(gart, iova)  \
for (iova = gart->iovmm_base;   \
-iova < gart->iovmm_base + GART_PAGE_SIZE * gart->page_count; \
+iova < gart->iovmm_end;\
 iova += GART_PAGE_SIZE)
 
 static inline void gart_set_pte(struct gart_device *gart,
-   unsigned long offs, u32 pte)
+   unsigned long iova, unsigned long pte)
 {
-   writel(offs, gart->regs + GART_ENTRY_ADDR);
-   writel(pte, gart->regs + GART_ENTRY_DATA);
-
-   dev_dbg(gart->dev, "%s %08lx:%08x\n",
-pte ? "map" : "unmap", offs, pte & GART_PAGE_MASK);
+   writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
+   writel_relaxed(pte, gart->regs + GART_ENTRY_DATA);
 }
 
 static inline unsigned long gart_read_pte(struct gart_device *gart,
- unsigned long offs)
+ unsigned long iova)
 {
unsigned long pte;
 
-   writel(offs, gart->regs + GART_ENTRY_ADDR);
-   pte = readl(gart->regs + GART_ENTRY_DATA);
+   writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
+   pte = readl_relaxed(gart->regs + GART_ENTRY_DATA);
 
return pte;
 }
@@ -106,49 +99,20 @@ static void do_gart_setup(struct gart_device *gart, const 
u32 *data)
for_each_gart_pte(gart, iova)
gart_set_pte(gart, iova, data ? *(data++) : 0);
 
-   writel(1, gart->regs + GART_CONFIG);
+   writel_relaxed(1, gart->regs + GART_CONFIG);
FLUSH_GART_REGS(gart);
 }
 
-#ifdef DEBUG
-static void gart_dump_table(struct gart_device *gart)
-{
-   unsigned long iova;
-   unsigned long flags;
-
-   spin_lock_irqsave(&gart->pte_lock, flags);
-   

[PATCH v6 20/21] iommu/tegra: gart: Simplify clients-tracking code

2018-12-09 Thread Dmitry Osipenko
GART is a simple IOMMU provider that has a single address space. There is
no need to set up a global clients list and manage it for tracking of the
active domain, hence lots of code can be safely removed and replaced with
a simpler alternative.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 155 ++---
 1 file changed, 40 insertions(+), 115 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index f2dc7e72be80..71de54aa845c 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -23,7 +23,6 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -46,30 +45,20 @@
 #define GART_PAGE_MASK \
(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
 
-struct gart_client {
-   struct device   *dev;
-   struct list_headlist;
-};
-
 struct gart_device {
void __iomem*regs;
u32 *savedata;
u32 page_count; /* total remappable size */
dma_addr_t  iovmm_base; /* offset to vmm_area */
spinlock_t  pte_lock;   /* for pagetable */
-   struct list_headclient;
-   spinlock_t  client_lock;/* for client list */
+   spinlock_t  dom_lock;   /* for active domain */
+   unsigned intactive_devices; /* number of active devices */
struct iommu_domain *active_domain; /* current active domain */
struct device   *dev;
 
struct iommu_device iommu;  /* IOMMU Core handle */
 };
 
-struct gart_domain {
-   struct iommu_domain domain; /* generic domain handle */
-   struct gart_device *gart;   /* link to gart device   */
-};
-
 static struct gart_device *gart_handle; /* unique for a system */
 
 static bool gart_debug;
@@ -77,11 +66,6 @@ static bool gart_debug;
 #define GART_PTE(_pfn) \
(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
 
-static struct gart_domain *to_gart_domain(struct iommu_domain *dom)
-{
-   return container_of(dom, struct gart_domain, domain);
-}
-
 /*
  * Any interaction between any block on PPSB and a block on APB or AHB
  * must have these read-back to ensure the APB/AHB bus transaction is
@@ -170,125 +154,70 @@ static inline bool gart_iova_range_valid(struct 
gart_device *gart,
 static int gart_iommu_attach_dev(struct iommu_domain *domain,
 struct device *dev)
 {
-   struct gart_domain *gart_domain = to_gart_domain(domain);
struct gart_device *gart = gart_handle;
-   struct gart_client *client, *c;
-   int err = 0;
-
-   client = kzalloc(sizeof(*c), GFP_KERNEL);
-   if (!client)
-   return -ENOMEM;
-   client->dev = dev;
-
-   spin_lock(&gart->client_lock);
-   list_for_each_entry(c, &gart->client, list) {
-   if (c->dev == dev) {
-   dev_err(gart->dev,
-   "%s is already attached\n", dev_name(dev));
-   err = -EINVAL;
-   goto fail;
-   }
-   }
-   if (gart->active_domain && gart->active_domain != domain) {
-   dev_err(gart->dev, "Only one domain can be active at a time\n");
-   err = -EINVAL;
-   goto fail;
-   }
-   gart->active_domain = domain;
-   gart_domain->gart = gart;
-   list_add(&client->list, &gart->client);
-   spin_unlock(&gart->client_lock);
-   dev_dbg(gart->dev, "Attached %s\n", dev_name(dev));
-   return 0;
+   int ret = 0;
 
-fail:
-   kfree(client);
-   spin_unlock(&gart->client_lock);
-   return err;
-}
+   spin_lock(&gart->dom_lock);
 
-static void __gart_iommu_detach_dev(struct iommu_domain *domain,
-   struct device *dev)
-{
-   struct gart_domain *gart_domain = to_gart_domain(domain);
-   struct gart_device *gart = gart_domain->gart;
-   struct gart_client *c;
-
-   list_for_each_entry(c, &gart->client, list) {
-   if (c->dev == dev) {
-   list_del(&c->list);
-   kfree(c);
-   if (list_empty(&gart->client)) {
-   gart->active_domain = NULL;
-   gart_domain->gart = NULL;
-   }
-   dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
-   return;
-   }
+   if (gart->active_domain && gart->active_domain != domain) {
+   ret = -EBUSY;
+   } else if (dev->archdata.iommu != domain) {
+   dev->archdata.iommu = domain;
+   gart->active_domain = domain;
+   gart->active_devices++;
}
 
-   dev_err(gart->dev, "Couldn't find %s to 

[PATCH v6 17/21] iommu/tegra: gart: Don't use managed resources

2018-12-09 Thread Dmitry Osipenko
GART is a part of the Memory Controller driver, which is always built-in,
hence there is no benefit from the use of managed resources.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 1be595d15e7e..99afdbf8860c 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -173,7 +173,7 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
struct gart_client *client, *c;
int err = 0;
 
-   client = devm_kzalloc(gart->dev, sizeof(*c), GFP_KERNEL);
+   client = kzalloc(sizeof(*c), GFP_KERNEL);
if (!client)
return -ENOMEM;
client->dev = dev;
@@ -199,7 +199,7 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
return 0;
 
 fail:
-   devm_kfree(gart->dev, client);
+   kfree(client);
spin_unlock(&gart->client_lock);
return err;
 }
@@ -214,7 +214,7 @@ static void __gart_iommu_detach_dev(struct iommu_domain 
*domain,
list_for_each_entry(c, &gart->client, list) {
if (c->dev == dev) {
list_del(&c->list);
-   devm_kfree(gart->dev, c);
+   kfree(c);
if (list_empty(&gart->client))
gart->active_domain = NULL;
dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
@@ -461,7 +461,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
return ERR_PTR(-ENXIO);
}
 
-   gart = devm_kzalloc(dev, sizeof(*gart), GFP_KERNEL);
+   gart = kzalloc(sizeof(*gart), GFP_KERNEL);
if (!gart) {
dev_err(dev, "failed to allocate gart_device\n");
return ERR_PTR(-ENOMEM);
@@ -470,7 +470,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
ret = iommu_device_sysfs_add(>iommu, dev, NULL, "gart");
if (ret) {
dev_err(dev, "Failed to register IOMMU in sysfs\n");
-   return ERR_PTR(ret);
+   goto free_gart;
}
 
iommu_device_set_ops(>iommu, _iommu_ops);
@@ -508,6 +508,8 @@ struct gart_device *tegra_gart_probe(struct device *dev,
iommu_device_unregister(&gart->iommu);
 remove_sysfs:
iommu_device_sysfs_remove(&gart->iommu);
+free_gart:
+   kfree(gart);
 
return ERR_PTR(ret);
 }
-- 
2.19.1


[PATCH v6 19/21] iommu/tegra: gart: Don't detach devices from inactive domains

2018-12-09 Thread Dmitry Osipenko
There can be an unlimited number of allocated domains, but only one
domain can be active at a time. Hence devices must be detached only from
the active domain.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 9d2df43ce50b..f2dc7e72be80 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -171,7 +171,7 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
 struct device *dev)
 {
struct gart_domain *gart_domain = to_gart_domain(domain);
-   struct gart_device *gart = gart_domain->gart;
+   struct gart_device *gart = gart_handle;
struct gart_client *client, *c;
int err = 0;
 
@@ -195,6 +195,7 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
goto fail;
}
gart->active_domain = domain;
+   gart_domain->gart = gart;
list_add(&client->list, &gart->client);
spin_unlock(&gart->client_lock);
dev_dbg(gart->dev, "Attached %s\n", dev_name(dev));
@@ -217,8 +218,10 @@ static void __gart_iommu_detach_dev(struct iommu_domain 
*domain,
if (c->dev == dev) {
list_del(&c->list);
kfree(c);
-   if (list_empty(&gart->client))
+   if (list_empty(&gart->client)) {
gart->active_domain = NULL;
+   gart_domain->gart = NULL;
+   }
dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
return;
}
@@ -254,7 +257,6 @@ static struct iommu_domain 
*gart_iommu_domain_alloc(unsigned type)
if (!gart_domain)
return NULL;
 
-   gart_domain->gart = gart;
gart_domain->domain.geometry.aperture_start = gart->iovmm_base;
gart_domain->domain.geometry.aperture_end = gart->iovmm_base +
gart->page_count * GART_PAGE_SIZE - 1;
-- 
2.19.1


[PATCH v6 15/21] iommu/tegra: gart: Fix NULL pointer dereference

2018-12-09 Thread Dmitry Osipenko
Fix a NULL pointer dereference on IOMMU domain destruction that happens
because the clients list is iterated unsafely while its elements are
being deleted during the iteration.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index d4a642c636f1..bb5303b06845 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -260,9 +260,9 @@ static void gart_iommu_domain_free(struct iommu_domain 
*domain)
if (gart) {
spin_lock(&gart->client_lock);
if (!list_empty(&gart->client)) {
-   struct gart_client *c;
+   struct gart_client *c, *tmp;
 
-   list_for_each_entry(c, &gart->client, list)
+   list_for_each_entry_safe(c, tmp, &gart->client, list)
__gart_iommu_detach_dev(domain, c->dev);
}
spin_unlock(&gart->client_lock);
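
For context, the plain iterator computes the pointer to the next element
from the current one after the loop body has run, so freeing the current
entry inside the loop is a use-after-free. A generic sketch of the safe
pattern (reusing the driver's struct gart_client; not part of the patch):

  #include <linux/list.h>
  #include <linux/slab.h>

  static void gart_free_all_clients(struct list_head *head)
  {
          struct gart_client *c, *tmp;

          /* 'tmp' caches the next entry up front, so deleting and
           * freeing 'c' inside the body is safe.
           */
          list_for_each_entry_safe(c, tmp, head, list) {
                  list_del(&c->list);
                  kfree(c);
          }
  }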
-- 
2.19.1


[PATCH v6 13/21] iommu/tegra: gart: Integrate with Memory Controller driver

2018-12-09 Thread Dmitry Osipenko
The device-tree binding has been changed. There is no separate GART device
anymore; it is squashed into the Memory Controller. Integrate the GART
module with the MC the way it is done for the SMMU on Tegra30+.

Signed-off-by: Dmitry Osipenko 
---
 drivers/iommu/Kconfig  |  1 +
 drivers/iommu/tegra-gart.c | 77 --
 drivers/memory/tegra/mc.c  | 41 
 include/soc/tegra/mc.h | 27 +
 4 files changed, 93 insertions(+), 53 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d9a25715650e..83c099bb7288 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -282,6 +282,7 @@ config ROCKCHIP_IOMMU
 config TEGRA_IOMMU_GART
bool "Tegra GART IOMMU Support"
depends on ARCH_TEGRA_2x_SOC
+   depends on TEGRA_MC
select IOMMU_API
help
  Enables support for remapping discontiguous physical memory
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 835fea461c59..0a72b6afa842 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -19,16 +19,17 @@
  * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
-#include 
 #include 
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
 
+#include <soc/tegra/mc.h>
+
 /* bitmap of the page sizes currently supported */
 #define GART_IOMMU_PGSIZES (SZ_4K)
 
@@ -397,9 +398,8 @@ static const struct iommu_ops gart_iommu_ops = {
.iotlb_sync = gart_iommu_sync,
 };
 
-static int tegra_gart_suspend(struct device *dev)
+int tegra_gart_suspend(struct gart_device *gart)
 {
-   struct gart_device *gart = dev_get_drvdata(dev);
unsigned long iova;
u32 *data = gart->savedata;
unsigned long flags;
@@ -411,9 +411,8 @@ static int tegra_gart_suspend(struct device *dev)
return 0;
 }
 
-static int tegra_gart_resume(struct device *dev)
+int tegra_gart_resume(struct gart_device *gart)
 {
-   struct gart_device *gart = dev_get_drvdata(dev);
unsigned long flags;
 
spin_lock_irqsave(&gart->pte_lock, flags);
@@ -422,41 +421,39 @@ static int tegra_gart_resume(struct device *dev)
return 0;
 }
 
-static int tegra_gart_probe(struct platform_device *pdev)
+struct gart_device *tegra_gart_probe(struct device *dev,
+const struct tegra_smmu_soc *soc,
+struct tegra_mc *mc)
 {
struct gart_device *gart;
-   struct resource *res, *res_remap;
+   struct resource *res_remap;
void __iomem *gart_regs;
-   struct device *dev = >dev;
int ret;
 
BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);
 
+   /* Tegra30+ has an SMMU and no GART */
+   if (soc)
+   return NULL;
+
/* the GART memory aperture is required */
-   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-   res_remap = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-   if (!res || !res_remap) {
+   res_remap = platform_get_resource(to_platform_device(dev),
+ IORESOURCE_MEM, 1);
+   if (!res_remap) {
dev_err(dev, "GART memory aperture expected\n");
-   return -ENXIO;
+   return ERR_PTR(-ENXIO);
}
 
gart = devm_kzalloc(dev, sizeof(*gart), GFP_KERNEL);
if (!gart) {
dev_err(dev, "failed to allocate gart_device\n");
-   return -ENOMEM;
+   return ERR_PTR(-ENOMEM);
}
 
-   gart_regs = devm_ioremap(dev, res->start, resource_size(res));
-   if (!gart_regs) {
-   dev_err(dev, "failed to remap GART registers\n");
-   return -ENXIO;
-   }
-
-   ret = iommu_device_sysfs_add(&gart->iommu, &pdev->dev, NULL,
-dev_name(&pdev->dev));
+   ret = iommu_device_sysfs_add(&gart->iommu, dev, NULL, "gart");
if (ret) {
dev_err(dev, "Failed to register IOMMU in sysfs\n");
-   return ret;
+   return ERR_PTR(ret);
}
 
iommu_device_set_ops(&gart->iommu, &gart_iommu_ops);
@@ -468,7 +465,8 @@ static int tegra_gart_probe(struct platform_device *pdev)
goto remove_sysfs;
}
 
-   gart->dev = &pdev->dev;
+   gart->dev = dev;
+   gart_regs = mc->regs + GART_REG_BASE;
spin_lock_init(&gart->pte_lock);
spin_lock_init(&gart->client_lock);
INIT_LIST_HEAD(&gart->client);
@@ -483,46 +481,19 @@ static int tegra_gart_probe(struct platform_device *pdev)
goto unregister_iommu;
}
 
-   platform_set_drvdata(pdev, gart);
do_gart_setup(gart, NULL);
 
gart_handle = gart;
 
-   return 0;
+   return gart;
 
 unregister_iommu:
iommu_device_unregister(&gart->iommu);
 remove_sysfs:
iommu_device_sysfs_remove(&gart->iommu);
 
-   return ret;
-}
-
-static const struct dev_pm_ops tegra_gart_pm_ops = {
-   .suspend= 

[PATCH v6 16/21] iommu/tegra: gart: Allow only one active domain at a time

2018-12-09 Thread Dmitry Osipenko
GART has a single address space that is shared by all devices, hence only
one domain can be active at a time.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index bb5303b06845..1be595d15e7e 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -57,6 +57,7 @@ struct gart_device {
spinlock_t  pte_lock;   /* for pagetable */
struct list_headclient;
spinlock_t  client_lock;/* for client list */
+   struct iommu_domain *active_domain; /* current active domain */
struct device   *dev;
 
struct iommu_device iommu;  /* IOMMU Core handle */
@@ -186,6 +187,12 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
goto fail;
}
}
+   if (gart->active_domain && gart->active_domain != domain) {
+   dev_err(gart->dev, "Only one domain can be active at a time\n");
+   err = -EINVAL;
+   goto fail;
+   }
+   gart->active_domain = domain;
list_add(&client->list, &gart->client);
spin_unlock(&gart->client_lock);
dev_dbg(gart->dev, "Attached %s\n", dev_name(dev));
@@ -208,6 +215,8 @@ static void __gart_iommu_detach_dev(struct iommu_domain 
*domain,
if (c->dev == dev) {
list_del(&c->list);
devm_kfree(gart->dev, c);
+   if (list_empty(&gart->client))
+   gart->active_domain = NULL;
dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
return;
}
-- 
2.19.1


[PATCH v6 12/21] memory: tegra: Use relaxed versions of readl/writel

2018-12-09 Thread Dmitry Osipenko
There is no need to insert memory barriers when accessing registers of
the Memory Controller, hence use the relaxed versions of the accessors.

Signed-off-by: Dmitry Osipenko 
---
 drivers/memory/tegra/mc.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
index 9856f085e487..887a3b07334f 100644
--- a/drivers/memory/tegra/mc.h
+++ b/drivers/memory/tegra/mc.h
@@ -26,13 +26,13 @@
 
 static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
 {
-   return readl(mc->regs + offset);
+   return readl_relaxed(mc->regs + offset);
 }
 
 static inline void mc_writel(struct tegra_mc *mc, u32 value,
 unsigned long offset)
 {
-   writel(value, mc->regs + offset);
+   writel_relaxed(value, mc->regs + offset);
 }
 
 extern const struct tegra_mc_reset_ops terga_mc_reset_ops_common;
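
For reference, the trade-off is roughly the following (a sketch, not
part of the patch): readl()/writel() imply barriers that order MMIO
against normal memory accesses, which matters when, for example, a
register write kicks off DMA from a buffer just filled by the CPU. The
MC register accessors above have no such buffer handoff, so the _relaxed
variants, which only keep the MMIO accesses themselves ordered, are
sufficient and cheaper:

  #include <linux/io.h>

  /* Sketch only: both accessors return the same register value; the
   * relaxed one merely skips the barrier that orders MMIO against
   * DMA-visible memory.
   */
  static u32 reg_read(void __iomem *regs, unsigned long offset)
  {
          return readl_relaxed(regs + offset);
  }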
-- 
2.19.1


[PATCH v6 14/21] iommu/tegra: gart: Fix spinlock recursion

2018-12-09 Thread Dmitry Osipenko
Fix spinlock recursion bug that happens on IOMMU domain destruction if
any of the allocated domains have devices attached to them.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 0a72b6afa842..d4a642c636f1 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -197,25 +197,33 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
return err;
 }
 
-static void gart_iommu_detach_dev(struct iommu_domain *domain,
- struct device *dev)
+static void __gart_iommu_detach_dev(struct iommu_domain *domain,
+   struct device *dev)
 {
struct gart_domain *gart_domain = to_gart_domain(domain);
struct gart_device *gart = gart_domain->gart;
struct gart_client *c;
 
-   spin_lock(&gart->client_lock);
-
list_for_each_entry(c, &gart->client, list) {
if (c->dev == dev) {
list_del(&c->list);
devm_kfree(gart->dev, c);
dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
-   goto out;
+   return;
}
}
-   dev_err(gart->dev, "Couldn't find\n");
-out:
+
+   dev_err(gart->dev, "Couldn't find %s to detach\n", dev_name(dev));
+}
+
+static void gart_iommu_detach_dev(struct iommu_domain *domain,
+ struct device *dev)
+{
+   struct gart_domain *gart_domain = to_gart_domain(domain);
+   struct gart_device *gart = gart_domain->gart;
+
+   spin_lock(&gart->client_lock);
+   __gart_iommu_detach_dev(domain, dev);
spin_unlock(&gart->client_lock);
 }
 
@@ -255,7 +263,7 @@ static void gart_iommu_domain_free(struct iommu_domain 
*domain)
struct gart_client *c;
 
list_for_each_entry(c, &gart->client, list)
-   gart_iommu_detach_dev(domain, c->dev);
+   __gart_iommu_detach_dev(domain, c->dev);
}
spin_unlock(&gart->client_lock);
}
-- 
2.19.1


[PATCH v6 10/21] memory: tegra: Read client ID on GART page fault

2018-12-09 Thread Dmitry Osipenko
With the device tree binding changes, the Memory Controller now has
access to the GART registers. Hence it is now possible to read the client
ID on a GART page fault to find out which memory client caused the fault.
Signed-off-by: Dmitry Osipenko 
---
 drivers/memory/tegra/mc.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index 59db13287b47..ce8cf81b55d7 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -38,6 +38,7 @@
 
 #define MC_ERR_ADR 0x0c
 
+#define MC_GART_ERROR_REQ  0x30
 #define MC_DECERR_EMEM_OTHERS_STATUS   0x58
 #define MC_SECURITY_VIOLATION_STATUS   0x74
 
@@ -575,8 +576,15 @@ static __maybe_unused irqreturn_t tegra20_mc_irq(int irq, 
void *data)
break;
 
case MC_INT_INVALID_GART_PAGE:
-   dev_err_ratelimited(mc->dev, "%s\n", error);
-   continue;
+   reg = MC_GART_ERROR_REQ;
+   value = mc_readl(mc, reg);
+
+   id = (value >> 1) & mc->soc->client_id_mask;
+   desc = error_names[2];
+
+   if (value & BIT(0))
+   direction = "write";
+   break;
 
case MC_INT_SECURITY_VIOLATION:
reg = MC_SECURITY_VIOLATION_STATUS;
-- 
2.19.1


[PATCH v6 11/21] memory: tegra: Use of_device_get_match_data()

2018-12-09 Thread Dmitry Osipenko
There is no need to match the device with the DT node since it was
already matched; use the of_device_get_match_data() helper to get the
match data.

Signed-off-by: Dmitry Osipenko 
---
 drivers/memory/tegra/mc.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index ce8cf81b55d7..55ecfb2d8cfd 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include <linux/of_device.h>
 #include 
 #include 
 #include 
@@ -619,23 +620,18 @@ static __maybe_unused irqreturn_t tegra20_mc_irq(int irq, 
void *data)
 
 static int tegra_mc_probe(struct platform_device *pdev)
 {
-   const struct of_device_id *match;
struct resource *res;
struct tegra_mc *mc;
void *isr;
int err;
 
-   match = of_match_node(tegra_mc_of_match, pdev->dev.of_node);
-   if (!match)
-   return -ENODEV;
-
mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
if (!mc)
return -ENOMEM;
 
platform_set_drvdata(pdev, mc);
spin_lock_init(&mc->lock);
-   mc->soc = match->data;
+   mc->soc = of_device_get_match_data(&pdev->dev);
mc->dev = &pdev->dev;
 
/* length of MC tick in nanoseconds */
-- 
2.19.1


[PATCH v6 06/21] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc

2018-12-09 Thread Dmitry Osipenko
Splitting GART and the Memory Controller wasn't a good decision back in
the day. Given that the GART driver hasn't ever been used by anything in
the kernel, we decided that it is better to correct the mistakes of the
past and merge the two bindings into a single one. As a result there is
a DT ABI change for the Memory Controller that avoids breaking newer
kernels using an older DT and older kernels using a newer DT; that is
done by changing the 'compatible' of the node to 'tegra20-mc-gart' and
adding a new required clock property. The new clock property also puts
the tegra20-mc binding in line with the bindings of the later Tegra
generations.

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Rob Herring 
---
 .../bindings/iommu/nvidia,tegra20-gart.txt| 14 --
 .../memory-controllers/nvidia,tegra20-mc.txt  | 27 +--
 2 files changed, 19 insertions(+), 22 deletions(-)
 delete mode 100644 
Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt

diff --git a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt 
b/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
deleted file mode 100644
index 099d9362ebc1..
--- a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
+++ /dev/null
@@ -1,14 +0,0 @@
-NVIDIA Tegra 20 GART
-
-Required properties:
-- compatible: "nvidia,tegra20-gart"
-- reg: Two pairs of cells specifying the physical address and size of
-  the memory controller registers and the GART aperture respectively.
-
-Example:
-
-   gart {
-   compatible = "nvidia,tegra20-gart";
-   reg = <0x7000f024 0x0018/* controller registers */
-  0x5800 0x0200>;  /* GART aperture */
-   };
diff --git 
a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt 
b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
index 7d60a50a4fa1..e55328237df4 100644
--- a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
+++ b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
@@ -1,26 +1,37 @@
 NVIDIA Tegra20 MC(Memory Controller)
 
 Required properties:
-- compatible : "nvidia,tegra20-mc"
-- reg : Should contain 2 register ranges(address and length); see the
-  example below. Note that the MC registers are interleaved with the
-  GART registers, and hence must be represented as multiple ranges.
+- compatible : "nvidia,tegra20-mc-gart"
+- reg : Should contain 2 register ranges: physical base address and length of
+  the controller's registers and the GART aperture respectively.
+- clocks: Must contain an entry for each entry in clock-names.
+  See ../clocks/clock-bindings.txt for details.
+- clock-names: Must include the following entries:
+  - mc: the module's clock input
 - interrupts : Should contain MC General interrupt.
 - #reset-cells : Should be 1. This cell represents memory client module ID.
   The assignments may be found in header file 
   or in the TRM documentation.
+- #iommu-cells: Should be 0. This cell represents the number of cells in an
+  IOMMU specifier needed to encode an address. GART supports only a single
+  address space that is shared by all devices, therefore no additional
+  information needed for the address encoding.
 
 Example:
mc: memory-controller@7000f000 {
-   compatible = "nvidia,tegra20-mc";
-   reg = <0x7000f000 0x024
-  0x7000f03c 0x3c4>;
-   interrupts = <0 77 0x04>;
+   compatible = "nvidia,tegra20-mc-gart";
+   reg = <0x7000f000 0x400 /* controller registers */
+  0x5800 0x0200>;  /* GART aperture */
+   clocks = <&tegra_car TEGRA20_CLK_MC>;
+   clock-names = "mc";
+   interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>;
#reset-cells = <1>;
+   #iommu-cells = <0>;
};
 
video-codec@6001a000 {
compatible = "nvidia,tegra20-vde";
...
resets = <&mc TEGRA20_MC_RESET_VDE>;
+   iommus = <&mc>;
};
-- 
2.19.1


[PATCH v6 05/21] iommu/tegra: gart: Optimize mapping / unmapping performance

2018-12-09 Thread Dmitry Osipenko
Currently GART writes one page entry at a time. It is more optimal to
aggregate the writes and flush the bus buffer at the end; this gives a
10-40% map/unmap performance boost (depending on the size of the mapping)
in comparison to flushing after each page entry update.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 37a76388ff7e..835fea461c59 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -290,7 +290,6 @@ static int gart_iommu_map(struct iommu_domain *domain, 
unsigned long iova,
}
}
gart_set_pte(gart, iova, GART_PTE(pfn));
-   FLUSH_GART_REGS(gart);
spin_unlock_irqrestore(&gart->pte_lock, flags);
return 0;
 }
@@ -307,7 +306,6 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, 
unsigned long iova,
 
spin_lock_irqsave(&gart->pte_lock, flags);
gart_set_pte(gart, iova, 0);
-   FLUSH_GART_REGS(gart);
spin_unlock_irqrestore(&gart->pte_lock, flags);
return bytes;
 }
@@ -373,6 +371,14 @@ static int gart_iommu_of_xlate(struct device *dev,
return 0;
 }
 
+static void gart_iommu_sync(struct iommu_domain *domain)
+{
+   struct gart_domain *gart_domain = to_gart_domain(domain);
+   struct gart_device *gart = gart_domain->gart;
+
+   FLUSH_GART_REGS(gart);
+}
+
 static const struct iommu_ops gart_iommu_ops = {
.capable= gart_iommu_capable,
.domain_alloc   = gart_iommu_domain_alloc,
@@ -387,6 +393,8 @@ static const struct iommu_ops gart_iommu_ops = {
.iova_to_phys   = gart_iommu_iova_to_phys,
.pgsize_bitmap  = GART_IOMMU_PGSIZES,
.of_xlate   = gart_iommu_of_xlate,
+   .iotlb_sync_map = gart_iommu_sync,
+   .iotlb_sync = gart_iommu_sync,
 };
 
 static int tegra_gart_suspend(struct device *dev)
-- 
2.19.1


[PATCH v6 08/21] memory: tegra: Don't invoke Tegra30+ specific memory timing setup on Tegra20

2018-12-09 Thread Dmitry Osipenko
This fixes irrelevant "tegra-mc 7000f000.memory-controller: no memory
timings for RAM code 0 registered" warning message during of kernels
boot-up on Tegra20.

Fixes: a8d502fd3348 ("memory: tegra: Squash tegra20-mc into common tegra-mc 
driver")
Signed-off-by: Dmitry Osipenko 
Acked-by: Jon Hunter 
---
 drivers/memory/tegra/mc.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index 24afc36833bf..b99f3c620f6c 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -664,12 +664,13 @@ static int tegra_mc_probe(struct platform_device *pdev)
}
 
isr = tegra_mc_irq;
-   }
 
-   err = tegra_mc_setup_timings(mc);
-   if (err < 0) {
-   dev_err(>dev, "failed to setup timings: %d\n", err);
-   return err;
+   err = tegra_mc_setup_timings(mc);
+   if (err < 0) {
+   dev_err(>dev, "failed to setup timings: %d\n",
+   err);
+   return err;
+   }
}
 
mc->irq = platform_get_irq(pdev, 0);
-- 
2.19.1


[PATCH v6 09/21] memory: tegra: Adapt to Tegra20 device-tree binding changes

2018-12-09 Thread Dmitry Osipenko
The tegra20-mc device-tree binding has been changed: GART has been
squashed into the Memory Controller, the clock property is now mandatory
for Tegra20, and the DT compatible has been changed as well. Adapt the
driver to the DT changes.

Signed-off-by: Dmitry Osipenko 
---
 drivers/memory/tegra/mc.c | 21 -
 drivers/memory/tegra/mc.h |  6 --
 include/soc/tegra/mc.h|  2 +-
 3 files changed, 9 insertions(+), 20 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index b99f3c620f6c..59db13287b47 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -51,7 +51,7 @@
 
 static const struct of_device_id tegra_mc_of_match[] = {
 #ifdef CONFIG_ARCH_TEGRA_2x_SOC
-   { .compatible = "nvidia,tegra20-mc", .data = _mc_soc },
+   { .compatible = "nvidia,tegra20-mc-gart", .data = _mc_soc },
 #endif
 #ifdef CONFIG_ARCH_TEGRA_3x_SOC
{ .compatible = "nvidia,tegra30-mc", .data = _mc_soc },
@@ -638,24 +638,19 @@ static int tegra_mc_probe(struct platform_device *pdev)
if (IS_ERR(mc->regs))
return PTR_ERR(mc->regs);
 
+   mc->clk = devm_clk_get(>dev, "mc");
+   if (IS_ERR(mc->clk)) {
+   dev_err(>dev, "failed to get MC clock: %ld\n",
+   PTR_ERR(mc->clk));
+   return PTR_ERR(mc->clk);
+   }
+
 #ifdef CONFIG_ARCH_TEGRA_2x_SOC
if (mc->soc == &tegra20_mc_soc) {
-   res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-   mc->regs2 = devm_ioremap_resource(&pdev->dev, res);
-   if (IS_ERR(mc->regs2))
-   return PTR_ERR(mc->regs2);
-
isr = tegra20_mc_irq;
} else
 #endif
{
-   mc->clk = devm_clk_get(>dev, "mc");
-   if (IS_ERR(mc->clk)) {
-   dev_err(>dev, "failed to get MC clock: %ld\n",
-   PTR_ERR(mc->clk));
-   return PTR_ERR(mc->clk);
-   }
-
err = tegra_mc_setup_latency_allowance(mc);
if (err < 0) {
dev_err(>dev, "failed to setup latency allowance: 
%d\n",
diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
index 01065f12ebeb..9856f085e487 100644
--- a/drivers/memory/tegra/mc.h
+++ b/drivers/memory/tegra/mc.h
@@ -26,18 +26,12 @@
 
 static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
 {
-   if (mc->regs2 && offset >= 0x24)
-   return readl(mc->regs2 + offset - 0x3c);
-
return readl(mc->regs + offset);
 }
 
 static inline void mc_writel(struct tegra_mc *mc, u32 value,
 unsigned long offset)
 {
-   if (mc->regs2 && offset >= 0x24)
-   return writel(value, mc->regs2 + offset - 0x3c);
-
writel(value, mc->regs + offset);
 }
 
diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
index b43f37fea096..db5bfdf589b4 100644
--- a/include/soc/tegra/mc.h
+++ b/include/soc/tegra/mc.h
@@ -144,7 +144,7 @@ struct tegra_mc_soc {
 struct tegra_mc {
struct device *dev;
struct tegra_smmu *smmu;
-   void __iomem *regs, *regs2;
+   void __iomem *regs;
struct clk *clk;
int irq;
 
-- 
2.19.1


[PATCH v6 07/21] ARM: dts: tegra20: Update Memory Controller node to the new binding

2018-12-09 Thread Dmitry Osipenko
The device tree binding of the Memory Controller has been changed: GART
has been squashed into the MC, there are new mandatory clock and
#iommu-cells properties, and the compatible has been changed to
'tegra20-mc-gart'.

Signed-off-by: Dmitry Osipenko 
---
 arch/arm/boot/dts/tegra20.dtsi | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index dcad6d6128cf..8c942e60703e 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -616,17 +616,14 @@
};
 
mc: memory-controller@7000f000 {
-   compatible = "nvidia,tegra20-mc";
-   reg = <0x7000f000 0x024
-  0x7000f03c 0x3c4>;
+   compatible = "nvidia,tegra20-mc-gart";
+   reg = <0x7000f000 0x400 /* controller registers */
+  0x5800 0x0200>;  /* GART aperture */
+   clocks = <&tegra_car TEGRA20_CLK_MC>;
+   clock-names = "mc";
interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>;
#reset-cells = <1>;
-   };
-
-   iommu@7000f024 {
-   compatible = "nvidia,tegra20-gart";
-   reg = <0x7000f024 0x0018/* controller registers */
-  0x5800 0x0200>;  /* GART aperture */
+   #iommu-cells = <0>;
};
 
memory-controller@7000f400 {
-- 
2.19.1


[PATCH v6 01/21] iommu/tegra: gart: Remove pr_fmt and clean up includes

2018-12-09 Thread Dmitry Osipenko
Remove unneeded header inclusions and sort the headers in alphabetical
order. Remove the pr_fmt macro since there are no pr_*() calls in the
code and it doesn't affect the dev_*() functions.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 17 +
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index da6a4e357b2b..ff75cf60117b 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -19,22 +19,15 @@
  * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
-#define pr_fmt(fmt)"%s(): " fmt, __func__
-
 #include 
+#include 
+#include 
+#include 
 #include 
-#include 
-#include 
+#include 
 #include 
+#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
 
 /* bitmap of the page sizes currently supported */
 #define GART_IOMMU_PGSIZES (SZ_4K)
-- 
2.19.1


[PATCH v6 02/21] iommu/tegra: gart: Clean up driver probe errors handling

2018-12-09 Thread Dmitry Osipenko
Properly clean up allocated resources on the driver's probe failure and
remove unneeded checks.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index ff75cf60117b..1cd470b2beea 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -411,9 +411,6 @@ static int tegra_gart_probe(struct platform_device *pdev)
struct device *dev = >dev;
int ret;
 
-   if (gart_handle)
-   return -EIO;
-
BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);
 
/* the GART memory aperture is required */
@@ -448,8 +445,7 @@ static int tegra_gart_probe(struct platform_device *pdev)
ret = iommu_device_register(&gart->iommu);
if (ret) {
dev_err(dev, "Failed to register IOMMU\n");
-   iommu_device_sysfs_remove(&gart->iommu);
-   return ret;
+   goto remove_sysfs;
}
 
gart->dev = &pdev->dev;
@@ -463,7 +459,8 @@ static int tegra_gart_probe(struct platform_device *pdev)
gart->savedata = vmalloc(array_size(sizeof(u32), gart->page_count));
if (!gart->savedata) {
dev_err(dev, "failed to allocate context save area\n");
-   return -ENOMEM;
+   ret = -ENOMEM;
+   goto unregister_iommu;
}
 
platform_set_drvdata(pdev, gart);
@@ -472,6 +469,13 @@ static int tegra_gart_probe(struct platform_device *pdev)
gart_handle = gart;
 
return 0;
+
+unregister_iommu:
+   iommu_device_unregister(&gart->iommu);
+remove_sysfs:
+   iommu_device_sysfs_remove(&gart->iommu);
+
+   return ret;
 }
 
 static const struct dev_pm_ops tegra_gart_pm_ops = {
-- 
2.19.1


[PATCH v6 04/21] iommu: Introduce iotlb_sync_map callback

2018-12-09 Thread Dmitry Osipenko
Introduce an iotlb_sync_map() callback that is invoked at the end of
iommu_map(). This new callback allows IOMMU drivers to avoid syncing
after the mapping of each contiguous chunk and to sync only once the
whole mapping is completed, optimizing performance of the mapping
operation.

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Robin Murphy 
Reviewed-by: Thierry Reding 
---
 drivers/iommu/iommu.c | 8 ++--
 include/linux/iommu.h | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index cc25ec6d4c06..79e7c49ed16a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1582,13 +1582,14 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
 int iommu_map(struct iommu_domain *domain, unsigned long iova,
  phys_addr_t paddr, size_t size, int prot)
 {
+   const struct iommu_ops *ops = domain->ops;
unsigned long orig_iova = iova;
unsigned int min_pagesz;
size_t orig_size = size;
phys_addr_t orig_paddr = paddr;
int ret = 0;
 
-   if (unlikely(domain->ops->map == NULL ||
+   if (unlikely(ops->map == NULL ||
 domain->pgsize_bitmap == 0UL))
return -ENODEV;
 
@@ -1617,7 +1618,7 @@ int iommu_map(struct iommu_domain *domain, unsigned long 
iova,
pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx\n",
iova, &paddr, pgsize);
 
-   ret = domain->ops->map(domain, iova, paddr, pgsize, prot);
+   ret = ops->map(domain, iova, paddr, pgsize, prot);
if (ret)
break;
 
@@ -1626,6 +1627,9 @@ int iommu_map(struct iommu_domain *domain, unsigned long 
iova,
size -= pgsize;
}
 
+   if (ops->iotlb_sync_map)
+   ops->iotlb_sync_map(domain);
+
/* unroll mapping in case something went wrong */
if (ret)
iommu_unmap(domain, orig_iova, orig_size - size);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 11db18b9ffe8..0e90c5cc72db 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -201,6 +201,7 @@ struct iommu_ops {
void (*flush_iotlb_all)(struct iommu_domain *domain);
void (*iotlb_range_add)(struct iommu_domain *domain,
unsigned long iova, size_t size);
+   void (*iotlb_sync_map)(struct iommu_domain *domain);
void (*iotlb_sync)(struct iommu_domain *domain);
phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t 
iova);
int (*add_device)(struct device *dev);
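
For illustration, a driver that batches its hardware flushes would hook
the new callback roughly as follows (a minimal sketch with hypothetical
my_* helpers; the tegra-gart patch in this series is the real user):

  #include <linux/iommu.h>

  /* hypothetical driver internals */
  void my_set_pte(struct iommu_domain *d, unsigned long iova, phys_addr_t pa);
  void my_flush_hw(struct iommu_domain *d);

  static int my_iommu_map(struct iommu_domain *domain, unsigned long iova,
                          phys_addr_t paddr, size_t size, int prot)
  {
          my_set_pte(domain, iova, paddr);  /* update page table only */
          return 0;
  }

  static void my_iommu_iotlb_sync_map(struct iommu_domain *domain)
  {
          my_flush_hw(domain);  /* single flush per iommu_map() call */
  }

  static const struct iommu_ops my_iommu_ops = {
          .map            = my_iommu_map,
          .iotlb_sync_map = my_iommu_iotlb_sync_map,
          /* ... */
  };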
-- 
2.19.1


[PATCH v6 00/21] IOMMU: Tegra GART driver clean up and optimization

2018-12-09 Thread Dmitry Osipenko
Hello,

This patch series integrates the GART (IOMMU) driver with the Memory
Controller driver, which allows reporting the name of the faulty memory
client on a GART page fault. A major code clean-up and performance
optimization is performed in this series as well.

Changelog:

v6: v5 that is re-based on the recent linux-next.

v5: Addressed review comments from Thierry Reding to v4. Added WARN_ON() to
make sure that the active domain isn't getting released, kept include headers
where necessary, etc. All changes are quite minor.

Added new patch "memory: tegra: Use relaxed versions of readl/writel".

v4: In the v3 Rob Herring requested to make device-tree binding changes
backwards-compatible with the older kernels, that is achieved by
changing the 'compatible' value of the DT node.

The code-refactoring patches got some more (minor) polish.

Added new patch "memory: tegra: Use of_device_get_match_data()".

v3: The Memory Controller integration part has been reworked and now GART's
device-tree binding is changed. Adding Rob Herring to review the device-tree
changes.

GART now disallows more than one active domain at a time.

Fixed "spinlock recursion", "NULL pointer dereference" and "detaching
of all devices from inactive domains".

New code-refactoring patches.

The previously standalone patch "memory: tegra: Don't invoke Tegra30+
specific memory timing setup on Tegra20" is now included into this
series because there is a dependency on that patch and it wasn't applied
yet.

v2: Addressed review comments from Robin Murphy to v1 by moving devices
iommu_fwspec check to gart_iommu_add_device().

Dropped the "Provide single domain and group for all devices" patch from
the series for now because after some more considering it became not
exactly apparent whether that is what we need, that was also suggested
by Robin Murphy in the review comment. Maybe something like a runtime
IOMMU usage for devices would be a better solution, allowing to implement
transparent context switching of virtual IOMMU domains.

Some very minor code cleanups, reworded commit messages.

Dmitry Osipenko (21):
  iommu/tegra: gart: Remove pr_fmt and clean up includes
  iommu/tegra: gart: Clean up driver probe errors handling
  iommu/tegra: gart: Ignore devices without IOMMU phandle in DT
  iommu: Introduce iotlb_sync_map callback
  iommu/tegra: gart: Optimize mapping / unmapping performance
  dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc
  ARM: dts: tegra20: Update Memory Controller node to the new binding
  memory: tegra: Don't invoke Tegra30+ specific memory timing setup on
Tegra20
  memory: tegra: Adapt to Tegra20 device-tree binding changes
  memory: tegra: Read client ID on GART page fault
  memory: tegra: Use of_device_get_match_data()
  memory: tegra: Use relaxed versions of readl/writel
  iommu/tegra: gart: Integrate with Memory Controller driver
  iommu/tegra: gart: Fix spinlock recursion
  iommu/tegra: gart: Fix NULL pointer dereference
  iommu/tegra: gart: Allow only one active domain at a time
  iommu/tegra: gart: Don't use managed resources
  iommu/tegra: gart: Prepend error/debug messages with "gart:"
  iommu/tegra: gart: Don't detach devices from inactive domains
  iommu/tegra: gart: Simplify clients-tracking code
  iommu/tegra: gart: Perform code refactoring

 .../bindings/iommu/nvidia,tegra20-gart.txt|  14 -
 .../memory-controllers/nvidia,tegra20-mc.txt  |  27 +-
 arch/arm/boot/dts/tegra20.dtsi|  15 +-
 drivers/iommu/Kconfig |   1 +
 drivers/iommu/iommu.c |   8 +-
 drivers/iommu/tegra-gart.c| 470 +++---
 drivers/memory/tegra/mc.c |  93 +++-
 drivers/memory/tegra/mc.h |  10 +-
 include/linux/iommu.h |   1 +
 include/soc/tegra/mc.h|  29 +-
 10 files changed, 307 insertions(+), 361 deletions(-)
 delete mode 100644 
Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt

-- 
2.19.1


[PATCH v6 03/21] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT

2018-12-09 Thread Dmitry Osipenko
GART can't handle all devices, hence ignore devices that aren't related
to GART. An IOMMU phandle must be explicitly assigned to devices in the
device tree.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 1cd470b2beea..37a76388ff7e 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -345,8 +345,12 @@ static bool gart_iommu_capable(enum iommu_cap cap)
 
 static int gart_iommu_add_device(struct device *dev)
 {
-   struct iommu_group *group = iommu_group_get_for_dev(dev);
+   struct iommu_group *group;
 
+   if (!dev->iommu_fwspec)
+   return -ENODEV;
+
+   group = iommu_group_get_for_dev(dev);
if (IS_ERR(group))
return PTR_ERR(group);
 
@@ -363,6 +367,12 @@ static void gart_iommu_remove_device(struct device *dev)
iommu_device_unlink(&gart_handle->iommu, dev);
 }
 
+static int gart_iommu_of_xlate(struct device *dev,
+  struct of_phandle_args *args)
+{
+   return 0;
+}
+
 static const struct iommu_ops gart_iommu_ops = {
.capable= gart_iommu_capable,
.domain_alloc   = gart_iommu_domain_alloc,
@@ -376,6 +386,7 @@ static const struct iommu_ops gart_iommu_ops = {
.unmap  = gart_iommu_unmap,
.iova_to_phys   = gart_iommu_iova_to_phys,
.pgsize_bitmap  = GART_IOMMU_PGSIZES,
+   .of_xlate   = gart_iommu_of_xlate,
 };
 
 static int tegra_gart_suspend(struct device *dev)
@@ -441,6 +452,7 @@ static int tegra_gart_probe(struct platform_device *pdev)
}
 
iommu_device_set_ops(&gart->iommu, &gart_iommu_ops);
+   iommu_device_set_fwnode(&gart->iommu, dev->fwnode);
 
ret = iommu_device_register(&gart->iommu);
if (ret) {
-- 
2.19.1
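
For context on why the dev->iommu_fwspec test in this patch is sufficient:
the OF core only creates an iommu_fwspec for a device whose device-tree node
carries an "iommus" phandle pointing at a registered IOMMU, and that is also
what ties the trivial of_xlate callback and the iommu_device_set_fwnode()
call above together. The sketch below is a hypothetical, heavily condensed
illustration of that core flow; the function name is made up, and the real
code in drivers/iommu/of_iommu.c additionally handles PCI, "iommu-map"
translation, and deferred probing.

#include <linux/iommu.h>
#include <linux/of.h>

/*
 * Hypothetical, condensed illustration of how dev->iommu_fwspec gets
 * populated -- NOT the real of_iommu_configure() implementation.
 */
static int example_of_iommu_configure(struct device *dev,
				      struct device_node *np)
{
	struct of_phandle_args iommu_spec;
	const struct iommu_ops *ops;
	int err;

	/* No "iommus" phandle in the device's DT node: no fwspec is ever
	 * created, so gart_iommu_add_device() sees NULL and returns
	 * -ENODEV for this device. */
	if (of_parse_phandle_with_args(np, "iommus", "#iommu-cells",
				       0, &iommu_spec))
		return -ENODEV;

	/* Find the IOMMU instance that registered this fwnode via
	 * iommu_device_set_fwnode() in its probe path. */
	ops = iommu_ops_from_fwnode(&iommu_spec.np->fwnode);
	if (!ops) {
		of_node_put(iommu_spec.np);
		return -ENODEV;
	}

	/* Allocate dev->iommu_fwspec ... */
	err = iommu_fwspec_init(dev, &iommu_spec.np->fwnode, ops);
	if (err) {
		of_node_put(iommu_spec.np);
		return err;
	}

	/* ... and let the driver validate the specifier; the GART's
	 * gart_iommu_of_xlate() accepts everything and returns 0. */
	err = ops->of_xlate(dev, &iommu_spec);
	of_node_put(iommu_spec.np);
	return err;
}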

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: use generic DMA mapping code in powerpc V4

2018-12-09 Thread Christian Zigotzky
Next step: c1bfcad4b0cf38ce5b00f7ad880d3a13484c123a (dma-mapping, 
powerpc: simplify the arch dma_set_mask override)


Result: No problems with the PASEMI onboard ethernet and with booting 
the X5000 (P5020 board).


-- Christian


On 09 December 2018 at 3:20PM, Christian Zigotzky wrote:
Next step: 602307b034734ce77a05da4b99333a2eaf6b6482 (powerpc/fsl_pci: 
simplify fsl_pci_dma_set_mask)


git checkout 602307b034734ce77a05da4b99333a2eaf6b6482

The PASEMI onboard ethernet works and the X5000 boots.

-- Christian


On 08 December 2018 at 2:47PM, Christian Zigotzky wrote:
Next step: e15cd8173ef85e9cc3e2a9c7cc2982f5c1355615 (powerpc/dma: fix 
an off-by-one in dma_capable)


git checkout e15cd8173ef85e9cc3e2a9c7cc2982f5c1355615

The PASEMI onboard ethernet also works with this commit and the X5000 
boots without any problems.


-- Christian


On 08 December 2018 at 11:29AM, Christian Zigotzky wrote:
Next step: 7ebc44c535f6bd726d553756d38b137acc718443 (powerpc/dma: 
remove max_direct_dma_addr)


git checkout 7ebc44c535f6bd726d553756d38b137acc718443

OK, the PASEMI onboard ethernet works and the P5020 board boots.

-- Christian


On 07 December 2018 at 7:33PM, Christian Zigotzky wrote:
Next step: 13c1fdec5682b6e13257277fa16aa31f342d167d (powerpc/dma: 
move pci_dma_dev_setup_swiotlb to fsl_pci.c)


git checkout 13c1fdec5682b6e13257277fa16aa31f342d167d

Result: The PASEMI onboard ethernet works and the P5020 board boots.

— Christian

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 12/34] powerpc/cell: move dma direct window setup out of dma_configure

2018-12-09 Thread Michael Ellerman
Christoph Hellwig  writes:

> Configure the dma settings at device setup time, and stop playing games
> with get_pci_dma_ops.  This prepares for using the common dma_configure
> code later on.
>
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/powerpc/platforms/cell/iommu.c | 20 +++-
>  1 file changed, 11 insertions(+), 9 deletions(-)

This one's crashing, haven't dug into why yet:

  [1.347085] Unable to handle kernel paging request for data at address 0x0040
  [1.391505] Faulting instruction address: 0xc06b6e6c
  cpu 0x0: Vector: 380 (Data SLB Access) at [c007fc9032d0]
  pc: c06b6e6c: .of_n_addr_cells+0x34/0xc0
  lr: c0070b30: .cell_iommu_get_fixed_address+0x58/0x2b0
  sp: c007fc903560
  msr: 90009032
  dar: 40
  current = 0xc007fc8d
  paca= 0xc0f6   irqmask: 0x03   irq_happened: 0x01
  pid   = 1, comm = swapper/0
  Linux version 4.20.0-rc2-gcc7x-g1e32f48 (kerkins@p82) (gcc version 7.4.1 
20181208 (Custom eb377405ab2d1900)) #1 SMP Sun Dec 9 12:16:48 AEDT 2018
  enter ? for help
  [c007fc9035f0] c0070b30 .cell_iommu_get_fixed_address+0x58/0x2b0
  [c007fc9036c0] c00711ac .cell_dma_dev_setup.part.1+0x24/0x118
  [c007fc903740] c0071374 .cell_of_bus_notify+0x6c/0xbc
  [c007fc9037c0] c00e7ef0 .notifier_call_chain+0x90/0xf8
  [c007fc903860] c00e8c2c .blocking_notifier_call_chain+0x84/0xb8
  [c007fc9038f0] c0597544 .device_add+0x584/0x7b8
  [c007fc9039c0] c05a0308 .platform_device_add+0x148/0x2f0
  [c007fc903a60] c05a1508 .platform_device_register_full+0x148/0x168
  [c007fc903ae0] c0a9a8a0 
.__machine_initcall_cell_cell_publish_devices+0x1bc/0x210
  [c007fc903be0] c000eca4 .do_one_initcall+0x64/0x2d8
  [c007fc903cc0] c0a844ec .kernel_init_freeable+0x3dc/0x4e4
  [c007fc903da0] c000f06c .kernel_init+0x24/0x150
  [c007fc903e20] c000a9c0 .ret_from_kernel_thread+0x58/0x78

cheers
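
One plausible reading of the trace above: the bus notifier path
(cell_of_bus_notify -> cell_dma_dev_setup) now reaches
cell_iommu_get_fixed_address() for platform devices created by
cell_publish_devices() that have no OF node attached, so
of_n_addr_cells() is handed a NULL device_node and dereferences it.
That is a guess from the backtrace rather than a confirmed root cause,
but assuming it holds, a minimal guard would look something like the
sketch below (the elided body is the existing "dma-ranges" walk,
unchanged):

static u64 cell_iommu_get_fixed_address(struct device *dev)
{
	u64 dev_addr = OF_BAD_ADDR;
	struct device_node *np;

	/* Sketch only: bare platform devices may have no OF node; bail
	 * out before of_n_addr_cells() ever sees a NULL device_node.
	 * Whether this is the right place to handle such devices is a
	 * separate question. */
	if (!dev->of_node)
		return OF_BAD_ADDR;

	np = of_node_get(dev->of_node);

	/* ... existing "dma-ranges" walk, unchanged ... */

	of_node_put(np);
	return dev_addr;
}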

> diff --git a/arch/powerpc/platforms/cell/iommu.c 
> b/arch/powerpc/platforms/cell/iommu.c
> index 12352a58072a..cce5bf9515e5 100644
> --- a/arch/powerpc/platforms/cell/iommu.c
> +++ b/arch/powerpc/platforms/cell/iommu.c
> @@ -657,14 +657,21 @@ static const struct dma_map_ops dma_iommu_fixed_ops = {
>   .mapping_error  = dma_iommu_mapping_error,
>  };
>  
> +static u64 cell_iommu_get_fixed_address(struct device *dev);
> +
>  static void cell_dma_dev_setup(struct device *dev)
>  {
> - if (get_pci_dma_ops() == &dma_iommu_ops)
> + if (get_pci_dma_ops() == &dma_iommu_ops) {
> + u64 addr = cell_iommu_get_fixed_address(dev);
> +
> + if (addr != OF_BAD_ADDR)
> + set_dma_offset(dev, addr + dma_iommu_fixed_base);
>   set_iommu_table_base(dev, cell_get_iommu_table(dev));
> - else if (get_pci_dma_ops() == &dma_nommu_ops)
> + } else if (get_pci_dma_ops() == &dma_nommu_ops) {
>   set_dma_offset(dev, cell_dma_nommu_offset);
> - else
> + } else {
>   BUG();
> + }
>  }
>  
>  static void cell_pci_dma_dev_setup(struct pci_dev *dev)
> @@ -950,19 +957,14 @@ static int dma_suported_and_switch(struct device *dev, 
> u64 dma_mask)
>  {
>   if (dma_mask == DMA_BIT_MASK(64) &&
>   cell_iommu_get_fixed_address(dev) != OF_BAD_ADDR) {
> - u64 addr = cell_iommu_get_fixed_address(dev) +
> - dma_iommu_fixed_base;
>   dev_dbg(dev, "iommu: 64-bit OK, using fixed ops\n");
> - dev_dbg(dev, "iommu: fixed addr = %llx\n", addr);
> set_dma_ops(dev, &dma_iommu_fixed_ops);
> - set_dma_offset(dev, addr);
>   return 1;
>   }
>  
>   if (dma_iommu_dma_supported(dev, dma_mask)) {
>   dev_dbg(dev, "iommu: not 64-bit, using default ops\n");
> - set_dma_ops(dev, get_pci_dma_ops());
> - cell_dma_dev_setup(dev);
> + set_dma_ops(dev, &dma_iommu_ops);
>   return 1;
>   }
>  
> -- 
> 2.19.1
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu