Re: [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for system cache

2020-11-14 Thread Sai Prakash Ranjan

On 2020-11-12 15:05, Will Deacon wrote:

On Wed, Nov 11, 2020 at 12:10:50PM +0530, Sai Prakash Ranjan wrote:

On 2020-11-10 17:48, Will Deacon wrote:
> On Fri, Oct 30, 2020 at 02:53:09PM +0530, Sai Prakash Ranjan wrote:
> > Add iommu domain attribute for using system cache aka last level
> > cache by client drivers like GPU to set right attributes for caching
> > the hardware pagetables into the system cache.
> >
> > Signed-off-by: Sai Prakash Ranjan 
> > ---
> >  drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 +
> >  drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
> >  include/linux/iommu.h |  1 +
> >  3 files changed, 19 insertions(+)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > index b1cf8f0abc29..070d13f80c7e 100644
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > @@ -789,6 +789,9 @@ static int arm_smmu_init_domain_context(struct
> > iommu_domain *domain,
> >   if (smmu_domain->non_strict)
> >   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> >
> > + if (smmu_domain->sys_cache)
> > + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
> > +
> >   pgtbl_ops = alloc_io_pgtable_ops(fmt, _cfg, smmu_domain);
> >   if (!pgtbl_ops) {
> >   ret = -ENOMEM;
> > @@ -1520,6 +1523,9 @@ static int arm_smmu_domain_get_attr(struct
> > iommu_domain *domain,
> >   case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> >   *(int *)data = smmu_domain->non_strict;
> >   return 0;
> > + case DOMAIN_ATTR_SYS_CACHE:
> > + *((int *)data) = smmu_domain->sys_cache;
> > + return 0;
> >   default:
> >   return -ENODEV;
> >   }
> > @@ -1551,6 +1557,17 @@ static int arm_smmu_domain_set_attr(struct
> > iommu_domain *domain,
> >   else
> >   smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
> >   break;
> > + case DOMAIN_ATTR_SYS_CACHE:
> > + if (smmu_domain->smmu) {
> > + ret = -EPERM;
> > + goto out_unlock;
> > + }
> > +
> > + if (*((int *)data))
> > + smmu_domain->sys_cache = true;
> > + else
> > + smmu_domain->sys_cache = false;
> > + break;
> >   default:
> >   ret = -ENODEV;
> >   }
> > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > index 885840f3bec8..dfc44d806671 100644
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > @@ -373,6 +373,7 @@ struct arm_smmu_domain {
> >   struct mutexinit_mutex; /* Protects smmu pointer 
*/
> >   spinlock_t  cb_lock; /* Serialises ATS1* ops and 
TLB syncs */
> >   struct iommu_domain domain;
> > + boolsys_cache;
> >  };
> >
> >  struct arm_smmu_master_cfg {
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index b95a6f8db6ff..4f4bb9c6f8f6 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -118,6 +118,7 @@ enum iommu_attr {
> >   DOMAIN_ATTR_FSL_PAMUV1,
> >   DOMAIN_ATTR_NESTING,/* two stages of translation */
> >   DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> > + DOMAIN_ATTR_SYS_CACHE,
>
> I think you're trying to make this look generic, but it's really not.
> If we need to funnel io-pgtable quirks through domain attributes, then I
> think we should be open about that and add something like
> DOMAIN_ATTR_IO_PGTABLE_CFG which could take a struct of page-table
> configuration data for the domain (this could just be quirks initially,
> but maybe it's worth extending to take ias, oas and page size)
>

Actually the initial versions used DOMAIN_ATTR_QCOM_SYS_CACHE
to make it QCOM specific and not generic, I don't see anyone else
using this attribute, would that work?


No -- I'd prefer to have _one_ domain attribute for funneling all the
IP_PGTABLE_CFG data. Otherwise, we'll just end up with things like
DOMAIN_ATTR_QCOM_SYS_CACHE_EXT or DOMAIN_ATTR_QCOM_QUIRKS later on.



Right, that makes sense. I will add this in next version.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RESEND PATCH v2 2/5] dma-direct: add dma_direct_bypass() to force direct ops

2020-11-14 Thread Christoph Hellwig
On Sat, Nov 14, 2020 at 10:17:10AM -0500, Jonathan Marek wrote:
> Add a function to force direct ops and disable swiotlb for a deivce.

s/deivce/device/

> +#if IS_ENABLED(CONFIG_DMA_OPS_BYPASS) && 
> !IS_ENABLED(CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED)

overly long line.

> +#if IS_ENABLED(CONFIG_DMA_OPS_BYPASS) && 
> !IS_ENABLED(CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED)

Again.

> +int dma_direct_bypass(struct device *dev)
> +{
> + int ret;
> +
> + ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
> + if (ret)
> + return ret;
> +
> + dev->bus_dma_limit = DMA_BIT_MASK(64);
> + dev->dma_ops_bypass = true;
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(dma_direct_bypass);

But more importantly ARCH_HAS_FORCE_DMA_UNENCRYPTED is just a compile
time flag.  With this you disable the functionality for all the usual
x86, s390 and powerpc configs, while only a tiny number of systems
for bounce buffering.  But I think you can just trivialy check
force_dma_unencrypted instead.  We do not need an extra Kconfig symbol
symbol for this trivial helper.

Also the helper is misnamed and misplaced.  The semantics have nothing
to do with dma-direct, the fact that is uses the ops bypass is an
implementation detail.   It really fits into the iommu code, as it
allows the driver to use the IOMMU API for IOVA management, while using
the DMA API for cache management.  So it should be named to reflect
that, and also grow a kerneldoc comment explaining how it will be used.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 2/5] dma-direct: add dma_direct_bypass() to force direct ops

2020-11-14 Thread Christoph Hellwig
So despite a week long wait I'e only received this patch from the
series.  Please resend the whole series if you want a review.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] dma-pool: no need to check return value of debugfs_create functions

2020-11-14 Thread Christoph Hellwig
Thanks,

applied.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH kernel v4 2/2] powerpc/dma: Fallback to dma_ops when persistent memory present

2020-11-14 Thread Christoph Hellwig
On Thu, Oct 29, 2020 at 12:52:41PM +1100, Alexey Kardashevskiy wrote:
> +EXPORT_SYMBOL_GPL(arch_dma_map_page_direct);

I've dropped the unused exports and applied the series to dma-mapping-for-next.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RESEND PATCH v2 0/5] drm/msm: support for host-cached BOs

2020-11-14 Thread Jonathan Marek
v2:
 - added patches 2/3 to enable using dma_ops_bypass
 - changed DRM_MSM_GEM_SYNC_CACHE patch to use dma_sync_sg_for_device()
   and dma_sync_sg_for_cpu(), and renamed sync flags.

Not sure I did the right thing with for the dma_ops_bypass part,
this is what I came up with reading the emails.

Jonathan Marek (5):
  drm/msm: add MSM_BO_CACHED_COHERENT
  dma-direct: add dma_direct_bypass() to force direct ops
  drm/msm: call dma_direct_bypass()
  drm/msm: add DRM_MSM_GEM_SYNC_CACHE for non-coherent cache maintenance
  drm/msm: bump up the uapi version

 drivers/gpu/drm/msm/Kconfig|  1 +
 drivers/gpu/drm/msm/adreno/adreno_device.c |  1 +
 drivers/gpu/drm/msm/msm_drv.c  | 32 +++---
 drivers/gpu/drm/msm/msm_drv.h  |  3 ++
 drivers/gpu/drm/msm/msm_gem.c  | 31 +
 include/linux/dma-direct.h |  9 ++
 include/uapi/drm/msm_drm.h | 25 +++--
 kernel/dma/direct.c| 23 
 8 files changed, 118 insertions(+), 7 deletions(-)

-- 
2.26.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RESEND PATCH v2 2/5] dma-direct: add dma_direct_bypass() to force direct ops

2020-11-14 Thread Jonathan Marek
Add a function to force direct ops and disable swiotlb for a deivce.

Signed-off-by: Jonathan Marek 
---
 include/linux/dma-direct.h |  9 +
 kernel/dma/direct.c| 23 +++
 2 files changed, 32 insertions(+)

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 18aade195884..41f57e1b7aa5 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -124,4 +124,13 @@ int dma_direct_supported(struct device *dev, u64 mask);
 dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
size_t size, enum dma_data_direction dir, unsigned long attrs);
 
+#if IS_ENABLED(CONFIG_DMA_OPS_BYPASS) && 
!IS_ENABLED(CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED)
+int dma_direct_bypass(struct device *dev);
+#else
+static inline int dma_direct_bypass(struct device *dev)
+{
+   return -EIO;
+}
+#endif
+
 #endif /* _LINUX_DMA_DIRECT_H */
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 06c111544f61..304a5a77cccb 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -548,3 +548,26 @@ int dma_direct_set_offset(struct device *dev, phys_addr_t 
cpu_start,
return 0;
 }
 EXPORT_SYMBOL_GPL(dma_direct_set_offset);
+
+/**
+ * dma_direct_bypass - always use direct mapping path for device
+ * @dev:   device pointer
+ *
+ * Note: this also bypasses swiotlb. Not available for arch with
+ * force_dma_unencrypted(), since this doesn't deal with that.
+ */
+#if IS_ENABLED(CONFIG_DMA_OPS_BYPASS) && 
!IS_ENABLED(CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED)
+int dma_direct_bypass(struct device *dev)
+{
+   int ret;
+
+   ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
+   if (ret)
+   return ret;
+
+   dev->bus_dma_limit = DMA_BIT_MASK(64);
+   dev->dma_ops_bypass = true;
+   return 0;
+}
+EXPORT_SYMBOL_GPL(dma_direct_bypass);
+#endif
-- 
2.26.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-14 Thread Christoph Hellwig
Lots of > 80 char lines.  Please fix up the style.

I think this needs to set a dma mask as behavior for unlimited dma
mask vs the default 32-bit one can be very different.  I also think
you need to be able to pass the direction or have different tests
for directions.  bidirectional is not exactly heavily used and pays
more cache management penality.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-14 Thread Song Bao Hua (Barry Song)



> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Sunday, November 15, 2020 5:54 AM
> To: Song Bao Hua (Barry Song) 
> Cc: iommu@lists.linux-foundation.org; h...@lst.de; robin.mur...@arm.com;
> m.szyprow...@samsung.com; Linuxarm ;
> linux-kselft...@vger.kernel.org; xuwei (O) ; Joerg
> Roedel ; Will Deacon ; Shuah Khan
> 
> Subject: Re: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming DMA APIs
> 
> Lots of > 80 char lines.  Please fix up the style.

Checkpatch has changed 80 to 100. That's probably why my local checkpatch 
didn't report any warning:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bdc48fa11e46f867ea4d

I am happy to change them to be less than 80 if you like.

> 
> I think this needs to set a dma mask as behavior for unlimited dma
> mask vs the default 32-bit one can be very different. 

I actually prefer users bind real devices with real dma_mask to test rather 
than force to change
the dma_mask in this benchmark.

Some device might have 32bit dma_mask while some others might have unlimited. 
But both of
them can bind to this driver or unbind from it after the test is done. So users 
just need to bind
those different real devices with different real dma_mask to dma_benchmark.

This can reflect the real performance of the real device better, I think.

> I also think you need to be able to pass the direction or have different tests
> for directions.  bidirectional is not exactly heavily used and pays
> more cache management penality.

For this, I'd like to increase a direction option in the test app and pass the 
option to the benchmark
driver.

Thanks
Barry

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu