Re: [PATCH v16 2/5] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-10-01 Thread Vivek Gautam
On Tue, Oct 2, 2018 at 9:44 AM Vivek Gautam  wrote:
>
> Hi Will,
>
> On Mon, Oct 1, 2018 at 6:29 PM Will Deacon  wrote:
> >
> > Hi Vivek,
> >
> > On Thu, Aug 30, 2018 at 08:15:38PM +0530, Vivek Gautam wrote:
> > > From: Sricharan R 
> > >
> > > The SMMU device probe/remove and master device add/remove callbacks
> > > get called when the SMMU is not linked to its master, that is, without
> > > the context of the master device. So invoke the runtime PM APIs in
> > > those places separately.
> > > Global locks are also initialized before enabling runtime pm as the
> > > runtime_resume() calls device_reset() which does tlb_sync_global()
> > > that ultimately requires locks to be initialized.
> > >
> > > Signed-off-by: Sricharan R 
> > > [vivek: Cleanup pm runtime calls]
> > > Signed-off-by: Vivek Gautam 
> > > Reviewed-by: Tomasz Figa 
> > > Tested-by: Srinivas Kandagatla 
> > > ---
> > >  drivers/iommu/arm-smmu.c | 89 +++-
> > >  1 file changed, 81 insertions(+), 8 deletions(-)
> >
> > This doesn't apply on my tree[1], possibly because I've got Robin's non-strict
> > invalidation queued there. However, that got me thinking -- how does this
> > work in conjunction with the timer-based TLB invalidation? Do we need to
> > rpm_{get,put} around flush_iotlb_all()? If so, do we still need the calls
> > in map/unmap when non-strict mode is in use?

For map/unmap(), I think there would be no harm in taking an additional
power.usage_count reference even in non-strict mode.
So I will just add rpm_{get,put} in arm_smmu_flush_iotlb_all() and
arm_smmu_iotlb_sync().
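As a rough illustration of the pattern proposed here, every callback that touches SMMU registers bracketed by a runtime PM get/put, below is a stand-alone sketch with the pm_runtime core mocked by a simple usage counter. All mock_* names are illustrative stand-ins, not the driver's or the PM core's real symbols.

```c
#include <assert.h>

/* Minimal mock of runtime PM usage counting; illustrative only. */
struct mock_device {
	int usage_count;	/* stands in for power.usage_count */
	int hw_powered;		/* 1 while SMMU clocks/power are on */
};

static void mock_rpm_get(struct mock_device *dev)
{
	if (dev->usage_count++ == 0)
		dev->hw_powered = 1;	/* runtime_resume() would run here */
}

static void mock_rpm_put(struct mock_device *dev)
{
	if (--dev->usage_count == 0)
		dev->hw_powered = 0;	/* runtime_suspend() would run here */
}

/*
 * The point of the thread: .flush_iotlb_all and .iotlb_sync can run
 * without a prior map/unmap holding a reference (e.g. from the flush
 * queue timer), so they must take their own reference before touching
 * the TLB sync registers.  Returns 1 if the hardware was powered for
 * the register access.
 */
static int mock_flush_iotlb_all(struct mock_device *smmu)
{
	int powered;

	mock_rpm_get(smmu);
	powered = smmu->hw_powered;	/* TLB invalidate + sync go here */
	mock_rpm_put(smmu);
	return powered;
}
```

Taking the extra reference is harmless when one is already held (the counter just nests), which is why doing it unconditionally even in non-strict mode costs little.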

Regards
Vivek

>
> I haven't tested things with flush queues, but from the looks of it
> both the .flush_iotlb_all and .iotlb_sync callbacks need rpm_get/put().
> I will respin the patches.
>
> Thanks
> Vivek
> >
> > Will
> >
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=for-joerg/arm-smmu/updates
>
>
>
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v16 2/5] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-10-01 Thread Vivek Gautam
Hi Will,

On Mon, Oct 1, 2018 at 6:29 PM Will Deacon  wrote:
>
> Hi Vivek,
>
> On Thu, Aug 30, 2018 at 08:15:38PM +0530, Vivek Gautam wrote:
> > From: Sricharan R 
> >
> > The SMMU device probe/remove and master device add/remove callbacks
> > get called when the SMMU is not linked to its master, that is, without
> > the context of the master device. So invoke the runtime PM APIs in
> > those places separately.
> > Global locks are also initialized before enabling runtime pm as the
> > runtime_resume() calls device_reset() which does tlb_sync_global()
> > that ultimately requires locks to be initialized.
> >
> > Signed-off-by: Sricharan R 
> > [vivek: Cleanup pm runtime calls]
> > Signed-off-by: Vivek Gautam 
> > Reviewed-by: Tomasz Figa 
> > Tested-by: Srinivas Kandagatla 
> > ---
> >  drivers/iommu/arm-smmu.c | 89 +++-
> >  1 file changed, 81 insertions(+), 8 deletions(-)
>
> This doesn't apply on my tree[1], possibly because I've got Robin's non-strict
> invalidation queued there. However, that got me thinking -- how does this
> work in conjunction with the timer-based TLB invalidation? Do we need to
> rpm_{get,put} around flush_iotlb_all()? If so, do we still need the calls
> in map/unmap when non-strict mode is in use?

I haven't tested things with flush queues, but from the looks of it
both the .flush_iotlb_all and .iotlb_sync callbacks need rpm_get/put().
I will respin the patches.

Thanks
Vivek
>
> Will
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=for-joerg/arm-smmu/updates





add_device vs attach_device

2018-10-01 Thread Yang Yang
Hi all,
I have a very basic question: what's the difference between
arm_smmu_attach_dev() and arm_smmu_add_device() in the ARM SMMUv3
driver? When should I use which?

Thanks,
- y

Re: [PATCH] dma-direct: document the zone selection logic

2018-10-01 Thread Randy Dunlap
On 10/1/18 1:10 PM, Christoph Hellwig wrote:
> What we are doing here isn't quite obvious, so add a comment explaining
> it.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  kernel/dma/direct.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index ba6f5956a291..14b966e2349a 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -84,7 +84,14 @@ static gfp_t __dma_direct_optimal_gfp_mask(struct device *dev, u64 dma_mask,
>   else
>   *phys_mask = dma_to_phys(dev, dma_mask);
>  
> - /* GFP_DMA32 and GFP_DMA are no ops without the corresponding zones: */
> + /*
> +  * Optimistically try the zone that the physicall address mask falls

physical

> +  * into first.  If that returns memory that isn't actually addressable
> +  * we will fallback to the next lower zone and try again.
> +  *
> +  * Note that GFP_DMA32 and GFP_DMA are no ops without the corresponding
> +  * zones.
> +  */
>   if (*phys_mask <= DMA_BIT_MASK(ARCH_ZONE_DMA_BITS))
>   return GFP_DMA;
>   if (*phys_mask <= DMA_BIT_MASK(32))
> 

thanks for the documentation.

-- 
~Randy


fix up nowarn confusion in the dma mapping layer

2018-10-01 Thread Christoph Hellwig
Hi all,

this series sorts out how we deal with the nowarn flags in the dma
mapping code.  We still support __GFP_NOWARN for the legacy APIs that
don't support passing the dma specific flags, but we generally want
every implementation to actually check DMA_ATTR_NO_WARN instead.
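The translation being described, legacy __GFP_NOWARN callers mapped onto DMA_ATTR_NO_WARN at the wrapper layer, can be sketched in isolation like this. The MOCK_* constants and mock_* helpers are arbitrary stand-ins, not the kernel's real flag values or function names; only the direction of each mapping matters.

```c
#include <assert.h>

/* Stand-ins for __GFP_NOWARN and DMA_ATTR_NO_WARN. */
#define MOCK_GFP_NOWARN		(1u << 0)
#define MOCK_DMA_ATTR_NO_WARN	(1ul << 0)

/* What the legacy wrappers (dma_alloc_coherent etc.) would do:
 * honor __GFP_NOWARN by translating it into the dma attr. */
static unsigned long mock_gfp_to_attrs(unsigned int gfp)
{
	return (gfp & MOCK_GFP_NOWARN) ? MOCK_DMA_ATTR_NO_WARN : 0;
}

/* What an implementation that checks only DMA_ATTR_NO_WARN would do
 * when it has to call back into the page allocator. */
static unsigned int mock_attrs_to_gfp(unsigned long attrs, unsigned int gfp)
{
	if (attrs & MOCK_DMA_ATTR_NO_WARN)
		gfp |= MOCK_GFP_NOWARN;
	return gfp;
}
```

With both directions in place, implementations never need to look at the gfp flag directly, which is the point of the series.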


[PATCH, RFC] swiotlb: don't dip into swiotlb pool for coherent allocations

2018-10-01 Thread Christoph Hellwig
All architectures that support swiotlb also have a zone that backs up
these less-than-full addressing allocations (usually ZONE_DMA32).

Because of that it is rather pointless to fall back to the global swiotlb
buffer if the normal dma-direct allocation failed - the only thing this
will do is eat up bounce buffers that would be more useful for serving
streaming mappings.

Signed-off-by: Christoph Hellwig 
---

Note: this is against the dma-mapping tree as it requires previous patches
there.

 arch/arm64/mm/dma-mapping.c |   6 +--
 include/linux/swiotlb.h |   5 --
 kernel/dma/swiotlb.c| 105 +---
 3 files changed, 5 insertions(+), 111 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 072c51fb07d7..b457e65951fd 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -112,7 +112,7 @@ static void *__dma_alloc(struct device *dev, size_t size,
return addr;
}
 
-   ptr = swiotlb_alloc(dev, size, dma_handle, flags, attrs);
+   ptr = dma_direct_alloc_pages(dev, size, dma_handle, flags, attrs);
if (!ptr)
goto no_mem;
 
@@ -133,7 +133,7 @@ static void *__dma_alloc(struct device *dev, size_t size,
return coherent_ptr;
 
 no_map:
-   swiotlb_free(dev, size, ptr, *dma_handle, attrs);
+   dma_direct_free_pages(dev, size, ptr, *dma_handle, attrs);
 no_mem:
return NULL;
 }
@@ -151,7 +151,7 @@ static void __dma_free(struct device *dev, size_t size,
return;
vunmap(vaddr);
}
-   swiotlb_free(dev, size, swiotlb_addr, dma_handle, attrs);
+   dma_direct_free_pages(dev, size, swiotlb_addr, dma_handle, attrs);
 }
 
 static dma_addr_t __swiotlb_map_page(struct device *dev, struct page *page,
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 965be92c33b5..39d18f87743c 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -67,11 +67,6 @@ extern void swiotlb_tbl_sync_single(struct device *hwdev,
 
 /* Accessory functions. */
 
-void *swiotlb_alloc(struct device *hwdev, size_t size, dma_addr_t *dma_handle,
-   gfp_t flags, unsigned long attrs);
-void swiotlb_free(struct device *dev, size_t size, void *vaddr,
-   dma_addr_t dma_addr, unsigned long attrs);
-
 extern dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
   unsigned long offset, size_t size,
   enum dma_data_direction dir,
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 4f8a6dbf0b60..b83963dc2afe 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -689,78 +689,6 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
}
 }
 
-static inline bool dma_coherent_ok(struct device *dev, dma_addr_t addr,
-   size_t size)
-{
-   u64 mask = DMA_BIT_MASK(32);
-
-   if (dev && dev->coherent_dma_mask)
-   mask = dev->coherent_dma_mask;
-   return addr + size - 1 <= mask;
-}
-
-static void *
-swiotlb_alloc_buffer(struct device *dev, size_t size, dma_addr_t *dma_handle,
-   unsigned long attrs)
-{
-   phys_addr_t phys_addr;
-
-   if (swiotlb_force == SWIOTLB_NO_FORCE)
-   goto out_warn;
-
-   phys_addr = swiotlb_tbl_map_single(dev,
-   __phys_to_dma(dev, io_tlb_start),
-   0, size, DMA_FROM_DEVICE, attrs);
-   if (phys_addr == SWIOTLB_MAP_ERROR)
-   goto out_warn;
-
-   *dma_handle = __phys_to_dma(dev, phys_addr);
-   if (!dma_coherent_ok(dev, *dma_handle, size))
-   goto out_unmap;
-
-   memset(phys_to_virt(phys_addr), 0, size);
-   return phys_to_virt(phys_addr);
-
-out_unmap:
-   dev_warn(dev, "hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n",
-   (unsigned long long)dev->coherent_dma_mask,
-   (unsigned long long)*dma_handle);
-
-   /*
-* DMA_TO_DEVICE to avoid memcpy in unmap_single.
-* DMA_ATTR_SKIP_CPU_SYNC is optional.
-*/
-   swiotlb_tbl_unmap_single(dev, phys_addr, size, DMA_TO_DEVICE,
-   DMA_ATTR_SKIP_CPU_SYNC);
-out_warn:
-   if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit()) {
-   dev_warn(dev,
-   "swiotlb: coherent allocation failed, size=%zu\n",
-   size);
-   dump_stack();
-   }
-   return NULL;
-}
-
-static bool swiotlb_free_buffer(struct device *dev, size_t size,
-   dma_addr_t dma_addr)
-{
-   phys_addr_t phys_addr = dma_to_phys(dev, dma_addr);
-
-   WARN_ON_ONCE(irqs_disabled());
-
-   if (!is_swiotlb_buffer(phys_addr))
-   return false;
-
-   /*
-* DMA_TO_DEVICE to avoid memcpy in swiotlb_tbl_unmap_single.
-* DMA_ATTR_SKIP_CPU_SYNC is optional.
-*/
-   

[PATCH 1/2] dma-mapping: translate __GFP_NOWARN to DMA_ATTR_NO_WARN

2018-10-01 Thread Christoph Hellwig
This allows all dma_map_ops instances to entirely rely on
DMA_ATTR_NO_WARN going forward.

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-mapping.h | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 562af6b45f23..02ecd4679e47 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -557,9 +557,11 @@ static inline void *dma_free_attrs(struct device *dev, size_t size,
 }
 
 static inline void *dma_alloc_coherent(struct device *dev, size_t size,
-   dma_addr_t *dma_handle, gfp_t flag)
+   dma_addr_t *dma_handle, gfp_t gfp)
 {
-   return dma_alloc_attrs(dev, size, dma_handle, flag, 0);
+
+   return dma_alloc_attrs(dev, size, dma_handle, gfp,
+   (gfp & __GFP_NOWARN) ? DMA_ATTR_NO_WARN : 0);
 }
 
 static inline void dma_free_coherent(struct device *dev, size_t size,
@@ -793,8 +795,12 @@ static inline void dmam_release_declared_memory(struct device *dev)
 static inline void *dma_alloc_wc(struct device *dev, size_t size,
 dma_addr_t *dma_addr, gfp_t gfp)
 {
-   return dma_alloc_attrs(dev, size, dma_addr, gfp,
-  DMA_ATTR_WRITE_COMBINE);
+   unsigned long attrs = DMA_ATTR_WRITE_COMBINE;
+
+   if (gfp & __GFP_NOWARN)
+   attrs |= DMA_ATTR_NO_WARN;
+
+   return dma_alloc_attrs(dev, size, dma_addr, gfp, attrs);
 }
 #ifndef dma_alloc_writecombine
 #define dma_alloc_writecombine dma_alloc_wc
-- 
2.19.0



[PATCH 2/2] dma-direct: respect DMA_ATTR_NO_WARN

2018-10-01 Thread Christoph Hellwig
Respect the DMA_ATTR_NO_WARN flags for allocations in dma-direct.

Signed-off-by: Christoph Hellwig 
---
 kernel/dma/direct.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 170bd322a94a..ba6f5956a291 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -107,6 +107,9 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
u64 phys_mask;
void *ret;
 
+   if (attrs & DMA_ATTR_NO_WARN)
+   gfp |= __GFP_NOWARN;
+
/* we always manually zero the memory once we are done: */
gfp &= ~__GFP_ZERO;
gfp |= __dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
-- 
2.19.0



[PATCH] dma-direct: document the zone selection logic

2018-10-01 Thread Christoph Hellwig
What we are doing here isn't quite obvious, so add a comment explaining
it.

Signed-off-by: Christoph Hellwig 
---
 kernel/dma/direct.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index ba6f5956a291..14b966e2349a 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -84,7 +84,14 @@ static gfp_t __dma_direct_optimal_gfp_mask(struct device *dev, u64 dma_mask,
else
*phys_mask = dma_to_phys(dev, dma_mask);
 
-   /* GFP_DMA32 and GFP_DMA are no ops without the corresponding zones: */
+   /*
+* Optimistically try the zone that the physicall address mask falls
+* into first.  If that returns memory that isn't actually addressable
+* we will fallback to the next lower zone and try again.
+*
+* Note that GFP_DMA32 and GFP_DMA are no ops without the corresponding
+* zones.
+*/
if (*phys_mask <= DMA_BIT_MASK(ARCH_ZONE_DMA_BITS))
return GFP_DMA;
if (*phys_mask <= DMA_BIT_MASK(32))
-- 
2.19.0
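The optimistic-with-fallback selection documented above can be mimicked in user space. This sketch covers only the first, optimistic step: picking the smallest zone whose span covers the physical address mask. The MOCK_* constants are stand-ins, and 24 for the ZONE_DMA bits is an assumption (ARCH_ZONE_DMA_BITS varies by architecture).

```c
#include <assert.h>
#include <stdint.h>

/* Mocked stand-ins for the kernel's zone constants. */
#define MOCK_DMA_BIT_MASK(n)	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))
#define MOCK_ZONE_DMA_BITS	24	/* assumption; arch-dependent */
#define MOCK_GFP_DMA		1u
#define MOCK_GFP_DMA32		2u

/* Pick the smallest zone that covers the physical address mask.
 * Callers would then retry with the next lower zone if the returned
 * memory turns out not to be addressable. */
static unsigned int mock_optimal_gfp(uint64_t phys_mask)
{
	if (phys_mask <= MOCK_DMA_BIT_MASK(MOCK_ZONE_DMA_BITS))
		return MOCK_GFP_DMA;
	if (phys_mask <= MOCK_DMA_BIT_MASK(32))
		return MOCK_GFP_DMA32;
	return 0;	/* no zone modifier needed: any memory fits */
}
```

As the patch comment notes, the zone modifiers are no-ops when the corresponding zone is not configured, which is what makes the fallback loop in the caller necessary.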



document dma-direct zone selection

2018-10-01 Thread Christoph Hellwig
Hi all,

this patch documents the dma-direct zone selection, as it doesn't
seem quite obvious enough.


Re: dma mask related fixups (including full bus_dma_mask support) v2

2018-10-01 Thread Benjamin Herrenschmidt
On Mon, 2018-10-01 at 16:32 +0200, Christoph Hellwig wrote:
> FYI, I've pulled this into the dma-mapping tree to make forward
> progress.  All but oatch 4 had formal ACKs, and for that one Robin
> was fine even without and explicit ack.  I'll also send a patch to
> better document the zone selection as it confuses even well versed
> people like Ben.

Thanks. I'll try to dig out some older systems to test.

Cheers,
Ben.



[PATCH v2] dma-debug: Check for drivers mapping invalid addresses in dma_map_single()

2018-10-01 Thread Stephen Boyd
I recently debugged a DMA mapping oops where a driver was trying to map
a buffer returned from request_firmware() with dma_map_single(). Memory
returned from request_firmware() is mapped into the vmalloc region and
this isn't a valid region to map with dma_map_single() per the DMA
documentation's "What memory is DMA'able?" section.

Unfortunately, we don't really check that in the DMA debugging code, so
enabling DMA debugging doesn't help catch this problem. Let's add a new
DMA debug function to check for a vmalloc address or an invalid virtual
address and print a warning if this happens. This makes it a little
easier to debug these sorts of problems, instead of seeing odd behavior
or crashes when drivers attempt to map the vmalloc space for DMA.

Cc: Marek Szyprowski 
Cc: Robin Murphy 
Signed-off-by: Stephen Boyd 
---

Changes from v1:
 * Update code to check for invalid virtual address too
 * Rename function to debug_dma_map_single()

 include/linux/dma-debug.h   |  8 
 include/linux/dma-mapping.h |  1 +
 kernel/dma/debug.c  | 16 
 3 files changed, 25 insertions(+)

diff --git a/include/linux/dma-debug.h b/include/linux/dma-debug.h
index a785f2507159..30213adbb6b9 100644
--- a/include/linux/dma-debug.h
+++ b/include/linux/dma-debug.h
@@ -32,6 +32,9 @@ extern void dma_debug_add_bus(struct bus_type *bus);
 
 extern int dma_debug_resize_entries(u32 num_entries);
 
+extern void debug_dma_map_single(struct device *dev, const void *addr,
+unsigned long len);
+
 extern void debug_dma_map_page(struct device *dev, struct page *page,
   size_t offset, size_t size,
   int direction, dma_addr_t dma_addr,
@@ -103,6 +106,11 @@ static inline int dma_debug_resize_entries(u32 num_entries)
return 0;
 }
 
+static inline void debug_dma_map_single(struct device *dev, const void *addr,
+   unsigned long len)
+{
+}
+
 static inline void debug_dma_map_page(struct device *dev, struct page *page,
  size_t offset, size_t size,
  int direction, dma_addr_t dma_addr,
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index d23fc45c8208..99ccba66c06a 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -231,6 +231,7 @@ static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
dma_addr_t addr;
 
BUG_ON(!valid_dma_direction(dir));
+   debug_dma_map_single(dev, ptr, size);
addr = ops->map_page(dev, virt_to_page(ptr),
 offset_in_page(ptr), size,
 dir, attrs);
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index c007d25bee09..0f34bce82a62 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -1312,6 +1312,22 @@ static void check_sg_segment(struct device *dev, struct scatterlist *sg)
 #endif
 }
 
+void debug_dma_map_single(struct device *dev, const void *addr,
+   unsigned long len)
+{
+   if (unlikely(dma_debug_disabled()))
+   return;
+
+   if (!virt_addr_valid(addr))
+   err_printk(dev, NULL, "DMA-API: device driver maps memory from invalid area [addr=%p] [len=%lu]\n",
+  addr, len);
+
+   if (is_vmalloc_addr(addr))
+   err_printk(dev, NULL, "DMA-API: device driver maps memory from vmalloc area [addr=%p] [len=%lu]\n",
+  addr, len);
+}
+EXPORT_SYMBOL(debug_dma_map_single);
+
 void debug_dma_map_page(struct device *dev, struct page *page, size_t offset,
size_t size, int direction, dma_addr_t dma_addr,
bool map_single)
-- 
Sent by a computer through tubes
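The intent of the check can be demonstrated with a user-space mock. The address ranges and mock_* helpers below are invented for illustration; in the kernel the real tests are virt_addr_valid() and is_vmalloc_addr(). The ranges overlap deliberately, to model configurations where virt_addr_valid() alone would not catch a vmalloc address.

```c
#include <assert.h>
#include <stdint.h>

/* Invented address ranges standing in for the kernel's linear map and
 * vmalloc area; only the shape of the check matters. */
#define MOCK_LINEAR_START	0x1000u
#define MOCK_LINEAR_END		0xf000u
#define MOCK_VMALLOC_START	0x8000u
#define MOCK_VMALLOC_END	0xf000u

static int mock_virt_addr_valid(uintptr_t addr)
{
	return addr >= MOCK_LINEAR_START && addr < MOCK_LINEAR_END;
}

static int mock_is_vmalloc_addr(uintptr_t addr)
{
	return addr >= MOCK_VMALLOC_START && addr < MOCK_VMALLOC_END;
}

/* Mirrors the shape of the patch's check: returns nonzero when the
 * address would draw a DMA-API warning, either because it is not a
 * valid linear-map address or because it is in the vmalloc area. */
static int mock_debug_dma_map_single(uintptr_t addr)
{
	if (!mock_virt_addr_valid(addr))
		return 1;
	if (mock_is_vmalloc_addr(addr))
		return 1;
	return 0;
}
```

This is exactly the failure mode described in the cover text: a request_firmware() buffer lives in the (mocked) vmalloc range, so passing it to dma_map_single() now produces a warning instead of silent misbehavior.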
