Re: [PATCH v3 5/6] media: uvcvideo: Use dma_alloc_noncontiguos API

2021-01-27 Thread Christoph Hellwig
On Wed, Jan 27, 2021 at 10:35:02PM +0100, Ricardo Ribalda wrote:
> I have used the current API here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/ribalda/linux.git/log/?h=uvc-noncontiguous
> 
> And I think the result is very clean. Great work!
> 
> I have tested it in X86 and in arm64, with similar performance as the
> previous patchset.
> 
> Maybe you want to cherry-pick that commit into your series. I can also
> send the patch to the list for review if you prefer.
> 
> At least in 5.11-rc5, the same dmadev worked in arm64 and x86.

Given that Sergey mentioned the v4l patchset needs more work, I'll
use your commit as the example for sending out my series.

Thanks for all the work!


Re: [PATCH RFC 7/9] iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers

2021-01-27 Thread Lu Baolu

Hi,

On 1/28/21 4:01 AM, Chuck Lever wrote:

From: Isaac J. Manjarres 

Add support for IOMMU drivers to have their own map_sg() callbacks.
This completes the path for having iommu_map_sg() invoke an IOMMU
driver's map_sg() callback, which can then invoke the io-pgtable
map_sg() callback with the entire scatter-gather list, so that it
can be processed entirely in the io-pgtable layer.

For IOMMU drivers that do not provide a callback, the default
implementation of iterating through the scatter-gather list while
calling iommu_map() will be used.

Signed-off-by: Isaac J. Manjarres 
Tested-by: Sai Prakash Ranjan 
[ cel: adjusted new iotlb_sync_map call site ]
Signed-off-by: Chuck Lever 
---
  drivers/iommu/iommu.c |   12 
  include/linux/iommu.h |5 +
  2 files changed, 17 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ed879a4d7fac..bd7adbd0339b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2551,6 +2551,18 @@ static size_t __iommu_map_sg(struct iommu_domain 
*domain, unsigned long iova,
unsigned int i = 0;
int ret;
  
+	if (ops->map_sg) {

+   ret = ops->map_sg(domain, iova, sg, nents, prot, gfp, &mapped);
+
+   if (ops->iotlb_sync_map)
+   ops->iotlb_sync_map(domain, iova, mapped);
+
+   if (ret)
+   goto out_err;
+
+   return mapped;
+   }
+
while (i <= nents) {
phys_addr_t s_phys = sg_phys(sg);
  
diff --git a/include/linux/iommu.h b/include/linux/iommu.h

index cd5f35022a25..667edc7b034a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -192,6 +192,8 @@ struct iommu_iotlb_gather {
   * @attach_dev: attach device to an iommu domain
   * @detach_dev: detach device from an iommu domain
   * @map: map a physically contiguous memory region to an iommu domain
+ * @map_sg: map a scatter-gather list of physically contiguous chunks to
+ *  an iommu domain.
   * @unmap: unmap a physically contiguous memory region from an iommu domain
   * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
@@ -243,6 +245,9 @@ struct iommu_ops {
void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
int (*map)(struct iommu_domain *domain, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+   int (*map_sg)(struct iommu_domain *domain, unsigned long iova,
+ struct scatterlist *sg, unsigned int nents, int prot,
+ gfp_t gfp, size_t *mapped);


I might be oversensitive, but what if the vendor iommu driver uses an
iova beyond the pre-allocated range? It's safer if we could pass an iova
length parameter as well, so that the iommu driver could do some sanity
checking.
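
For illustration only (this is not part of the posted patch), the sanity
check could look roughly like the sketch below; the extra "size" argument
and the length accumulation are hypothetical:

static int example_map_sg(struct iommu_domain *domain, unsigned long iova,
			  struct scatterlist *sg, unsigned int nents,
			  int prot, gfp_t gfp, size_t size, size_t *mapped)
{
	struct scatterlist *s;
	size_t total = 0;
	unsigned int i;

	/* Reject a list that would run past the range the core set aside. */
	for_each_sg(sg, s, nents, i)
		total += s->length;
	if (total > size)
		return -EINVAL;

	/* ... map the list as usual, updating *mapped ... */
	return 0;
}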

Best regards,
baolu


size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 size_t size, struct iommu_iotlb_gather *iotlb_gather);
void (*flush_iotlb_all)(struct iommu_domain *domain);





Re: [PATCH RFC 2/9] iommu: Add iova and size as parameters in iotlb_sync_map

2021-01-27 Thread Lu Baolu

On 1/28/21 4:00 AM, Chuck Lever wrote:

From: Yong Wu 

iotlb_sync_map allows IOMMU drivers to sync the TLB after completing the
whole mapping. This patch adds iova and size as parameters so that the
IOMMU driver can flush the TLB over the whole range at once after the iova
mapping, to improve performance.

Signed-off-by: Yong Wu 
Reviewed-by: Robin Murphy 
Signed-off-by: Chuck Lever 
---
  drivers/iommu/iommu.c  |4 ++--
  drivers/iommu/tegra-gart.c |7 +--
  include/linux/iommu.h  |3 ++-
  3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index c304a6a30d42..3d099a31ddca 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2443,7 +2443,7 @@ static int _iommu_map(struct iommu_domain *domain, 
unsigned long iova,
  
  	ret = __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);

if (ret == 0 && ops->iotlb_sync_map)
-   ops->iotlb_sync_map(domain);
+   ops->iotlb_sync_map(domain, iova, size);
  
  	return ret;

  }
@@ -2575,7 +2575,7 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, 
unsigned long iova,
}
  
  	if (ops->iotlb_sync_map)

-   ops->iotlb_sync_map(domain);
+   ops->iotlb_sync_map(domain, iova, mapped);
return mapped;
  
  out_err:

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index fac720273889..05e8e19b8269 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -261,7 +261,8 @@ static int gart_iommu_of_xlate(struct device *dev,
return 0;
  }
  
-static void gart_iommu_sync_map(struct iommu_domain *domain)

+static void gart_iommu_sync_map(struct iommu_domain *domain, unsigned long 
iova,
+   size_t size)
  {
FLUSH_GART_REGS(gart_handle);
  }
@@ -269,7 +270,9 @@ static void gart_iommu_sync_map(struct iommu_domain *domain)
  static void gart_iommu_sync(struct iommu_domain *domain,
struct iommu_iotlb_gather *gather)
  {
-   gart_iommu_sync_map(domain);
+   size_t length = gather->end - gather->start;
+
+   gart_iommu_sync_map(domain, gather->start, length);
  }
  
  static const struct iommu_ops gart_iommu_ops = {

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b3f0e2018c62..9ce0aa9e236b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -246,7 +246,8 @@ struct iommu_ops {
size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 size_t size, struct iommu_iotlb_gather *iotlb_gather);
void (*flush_iotlb_all)(struct iommu_domain *domain);
-   void (*iotlb_sync_map)(struct iommu_domain *domain);
+   void (*iotlb_sync_map)(struct iommu_domain *domain, unsigned long iova,
+  size_t size);
void (*iotlb_sync)(struct iommu_domain *domain,
   struct iommu_iotlb_gather *iotlb_gather);
phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t 
iova);




Reviewed-by: Lu Baolu 

Best regards,
baolu


Re: [PATCH RFC 1/9] iommu: Move iotlb_sync_map out from __iommu_map

2021-01-27 Thread Lu Baolu

Hi,

On 1/28/21 4:00 AM, Chuck Lever wrote:

From: Yong Wu 

At the end of __iommu_map, iotlb_sync_map is always called.

This patch moves iotlb_sync_map out of __iommu_map, since it is
unnecessary to call it for each sg segment, especially as iotlb_sync_map
currently flushes the whole TLB. Add a little helper, _iommu_map, for this.

Signed-off-by: Yong Wu 
Reviewed-by: Robin Murphy 
Signed-off-by: Chuck Lever 
---
  drivers/iommu/iommu.c |   23 ++-
  1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ffeebda8d6de..c304a6a30d42 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2426,9 +2426,6 @@ static int __iommu_map(struct iommu_domain *domain, 
unsigned long iova,
size -= pgsize;
}
  
-	if (ops->iotlb_sync_map)

-   ops->iotlb_sync_map(domain);
-
/* unroll mapping in case something went wrong */
if (ret)
iommu_unmap(domain, orig_iova, orig_size - size);
@@ -2438,18 +2435,31 @@ static int __iommu_map(struct iommu_domain *domain, 
unsigned long iova,
return ret;
  }
  
+static int _iommu_map(struct iommu_domain *domain, unsigned long iova,

+ phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+{
+   const struct iommu_ops *ops = domain->ops;
+   int ret;
+
+   ret = __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
+   if (ret == 0 && ops->iotlb_sync_map)
+   ops->iotlb_sync_map(domain);


Previous code called iotlb_sync_map() regardless of whether the mapping
was successful or not. Here the logic changes, and the callback is only
called if the mapping succeeds.

Any reason? It's safer to always call iotlb_sync_map(), even in the failed
mapping case. In this way, we can ensure the consistency of the cache as
much as possible.
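
A minimal sketch of the ordering suggested here (sync even when
__iommu_map() fails), shown against the helper added by this patch; this
is only meant to illustrate the question, not the code as posted:

static int _iommu_map(struct iommu_domain *domain, unsigned long iova,
		      phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
{
	const struct iommu_ops *ops = domain->ops;
	int ret;

	ret = __iommu_map(domain, iova, paddr, size, prot, gfp);
	/*
	 * Sync unconditionally, so any PTEs written before a failure
	 * still reach the hardware before the caller unwinds.
	 */
	if (ops->iotlb_sync_map)
		ops->iotlb_sync_map(domain);

	return ret;
}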

Best regards,
baolu


+
+   return ret;
+}
+
  int iommu_map(struct iommu_domain *domain, unsigned long iova,
  phys_addr_t paddr, size_t size, int prot)
  {
might_sleep();
-   return __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
+   return _iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
  }
  EXPORT_SYMBOL_GPL(iommu_map);
  
  int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,

  phys_addr_t paddr, size_t size, int prot)
  {
-   return __iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
+   return _iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
  }
  EXPORT_SYMBOL_GPL(iommu_map_atomic);
  
@@ -2533,6 +2543,7 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,

 struct scatterlist *sg, unsigned int nents, int 
prot,
 gfp_t gfp)
  {
+   const struct iommu_ops *ops = domain->ops;
size_t len = 0, mapped = 0;
phys_addr_t start;
unsigned int i = 0;
@@ -2563,6 +2574,8 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, 
unsigned long iova,
sg = sg_next(sg);
}
  
+	if (ops->iotlb_sync_map)

+   ops->iotlb_sync_map(domain);
return mapped;
  
  out_err:






Re: [PATCH RFC 0/9] Possible set of VT-d optimizations

2021-01-27 Thread Lu Baolu

Hi Chuck,

On 1/28/21 4:00 AM, Chuck Lever wrote:

Hi-

This collection of patches seems to get the best throughput results
so far. The NFS WRITE result is fully restored, and the NFS READ
result is very close to fully restored.


Very glad to see this. Thanks!

Can you please add the link below if you have a next version?

https://lore.kernel.org/linux-iommu/d81314ed-5673-44a6-b597-090e3cb83...@oracle.com/

It helps people understand what regression you have seen.

Best regards,
baolu



Children see throughput for 12 initial writers  = 5008474.03 kB/sec
Parent sees throughput for 12 initial writers   = 4996927.80 kB/sec
Min throughput per process  = 416956.88 kB/sec
Max throughput per process  = 417910.22 kB/sec
Avg throughput per process  = 417372.84 kB/sec
Min xfer = 1046272.00 kB
CPU Utilization: Wall time 2.515, CPU time 1.996, CPU utilization 79.37 %


Children see throughput for 12 rewriters= 5020584.59 kB/sec
Parent sees throughput for 12 rewriters = 5012539.29 kB/sec
Min throughput per process  = 417799.00 kB/sec
Max throughput per process  = 419082.22 kB/sec
Avg throughput per process  = 418382.05 kB/sec
Min xfer = 1046528.00 kB
CPU utilization: Wall time 2.507, CPU time 2.024, CPU utilization 80.73 %


Children see throughput for 12 readers  = 5805484.25 kB/sec
Parent sees throughput for 12 readers   = 5799535.68 kB/sec
Min throughput per process  = 482888.16 kB/sec
Max throughput per process  = 48.16 kB/sec
Avg throughput per process  = 483790.35 kB/sec
Min xfer = 1045760.00 kB
CPU utilization: Wall time 2.167, CPU time 1.964, CPU utilization 90.63 %


Children see throughput for 12 re-readers   = 5812227.16 kB/sec
Parent sees throughput for 12 re-readers= 5803793.06 kB/sec
Min throughput per process  = 483242.97 kB/sec
Max throughput per process  = 485724.41 kB/sec
Avg throughput per process  = 484352.26 kB/sec
Min xfer = 1043456.00 kB
CPU utilization: Wall time 2.161, CPU time 1.976, CPU utilization 91.45 %

I've included a simple-minded implementation of a map_sg op for
the Intel IOMMU. This is nothing more than a copy of the loop in
__iommu_map_sg() with the call to __iommu_map() replaced with a
call to intel_iommu_map().

---

Chuck Lever (1):
   iommu/vt-d: Introduce map_sg() for Intel IOMMUs

Isaac J. Manjarres (5):
   iommu/io-pgtable: Introduce map_sg() as a page table op
   iommu/io-pgtable-arm: Hook up map_sg()
   iommu/io-pgtable-arm-v7s: Hook up map_sg()
   iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers
   iommu/arm-smmu: Hook up map_sg()

Lu Baolu (1):
   iommu/vt-d: Add iotlb_sync_map callback

Yong Wu (2):
   iommu: Move iotlb_sync_map out from __iommu_map
   iommu: Add iova and size as parameters in iotlb_sync_map


  drivers/iommu/arm/arm-smmu/arm-smmu.c |  19 
  drivers/iommu/intel/iommu.c   | 131 --
  drivers/iommu/io-pgtable-arm-v7s.c|  90 ++
  drivers/iommu/io-pgtable-arm.c|  86 +
  drivers/iommu/iommu.c |  47 +++--
  drivers/iommu/tegra-gart.c|   7 +-
  include/linux/iommu.h |  16 +++-
  7 files changed, 353 insertions(+), 43 deletions(-)

--
Chuck Lever




[PATCH 0/3] Adding offset keeping option when mapping data via SWIOTLB.*

2021-01-27 Thread Jianxiong Gao via iommu
The NVMe driver and other applications may depend on the data offset
to operate correctly. Currently, when unaligned data is mapped via
SWIOTLB, the data is mapped slab-aligned in the SWIOTLB. This
patch adds an option to make sure the mapped data preserves the
offset of the original address.
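
To illustrate why the offset matters (a rough sketch, not code from this
series): the NVMe PRP layout derives how many bytes the first PRP entry
covers from the low bits of the DMA address, so a bounce buffer that
changes those bits changes the layout the controller expects. The helper
name below is made up; ctrl_page_size stands for the controller page size
programmed via CC.MPS:

static inline size_t first_prp_len(dma_addr_t dma_addr, size_t len,
				   size_t ctrl_page_size)
{
	/* Offset of the buffer within a controller page. */
	size_t offset = dma_addr & (ctrl_page_size - 1);

	/* Bytes covered by PRP1 before the transfer must chain to PRP2. */
	return min(len, ctrl_page_size - offset);
}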

Without the patch when creating xfs formatted disk on NVMe backends,
with swiotlb=force in kernel boot option, creates the following error:
meta-data=/dev/nvme2n1   isize=512    agcount=4, agsize=131072 blks
         =               sectsz=512   attr=2, projid32bit=1
         =               crc=1        finobt=1, sparse=0, rmapbt=0, reflink=0
data     =               bsize=4096   blocks=524288, imaxpct=25
         =               sunit=0      swidth=0 blks
naming   =version 2      bsize=4096   ascii-ci=0 ftype=1
log      =internal log   bsize=4096   blocks=2560, version=2
         =               sectsz=512   sunit=0 blks, lazy-count=1
realtime =none           extsz=4096   blocks=0, rtextents=0
mkfs.xfs: pwrite failed: Input/output error

Jianxiong Gao (3):
  Adding page_offset_mask to device_dma_parameters
  Add swiotlb offset preserving mapping when
dma_dma_parameters->page_offset_mask is non zero.
  Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

 drivers/nvme/host/pci.c |  4 
 include/linux/device.h  |  1 +
 include/linux/dma-mapping.h | 17 +
 kernel/dma/swiotlb.c| 16 +++-
 4 files changed, 37 insertions(+), 1 deletion(-)

-- 
2.27.0



Re: [PATCH v3] iommu/vt-d: do not use flush-queue when caching-mode is on

2021-01-27 Thread Lu Baolu

On 1/28/21 1:53 AM, Nadav Amit wrote:

From: Nadav Amit 

When an Intel IOMMU is virtualized, and a physical device is
passed-through to the VM, changes of the virtual IOMMU need to be
propagated to the physical IOMMU. The hypervisor therefore needs to
monitor PTE mappings in the IOMMU page-tables. Intel specifications
provide "caching-mode" capability that a virtual IOMMU uses to report
that the IOMMU is virtualized and a TLB flush is needed after mapping to
allow the hypervisor to propagate virtual IOMMU mappings to the physical
IOMMU. To the best of my knowledge no real physical IOMMU reports
"caching-mode" as turned on.

Synchronizing the virtual and the physical IOMMU tables is expensive if
the hypervisor is unaware which PTEs have changed, as the hypervisor is
required to walk all the virtualized tables and look for changes.
Consequently, domain flushes are much more expensive than page-specific
flushes on virtualized IOMMUs with passthrough devices. The kernel
therefore exploited the "caching-mode" indication to avoid domain
flushing and use page-specific flushing in virtualized environments. See
commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching
mode.")

This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use
of iova deferred flushing"). Now, when batched TLB flushing is used (the
default), full TLB domain flushes are performed frequently, requiring
the hypervisor to perform expensive synchronization between the virtual
TLB and the physical one.

Getting batched TLB flushes to use page-specific invalidations again in
such circumstances is not easy, since the TLB invalidation scheme
assumes that "full" domain TLB flushes are performed for scalability.

Disable batched TLB flushes when caching-mode is on, as the performance
benefit from using batched TLB invalidations is likely to be much
smaller than the overhead of the virtual-to-physical IOMMU page-tables
synchronization.

Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Nadav Amit 
Cc: David Woodhouse 
Cc: Lu Baolu 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: sta...@vger.kernel.org

---
v2->v3:

* Fix the fixes tag in the commit-log (Lu).
* Minor English rephrasing of the commit-log.

v1->v2:

* disable flush queue for all domains if caching-mode is on for any
   IOMMU (Lu).
---
  drivers/iommu/intel/iommu.c | 32 +++-
  1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 788119c5b021..de3dd617cf60 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5373,6 +5373,36 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
return ret;
  }
  
+static bool domain_use_flush_queue(void)

+{
+   struct dmar_drhd_unit *drhd;
+   struct intel_iommu *iommu;
+   bool r = true;
+
+   if (intel_iommu_strict)
+   return false;
+
+   /*
+* The flush queue implementation does not perform page-selective
+* invalidations that are required for efficient TLB flushes in virtual
+* environments. The benefit of batching is likely to be much lower than
+* the overhead of synchronizing the virtual and physical IOMMU
+* page-tables.
+*/
+   rcu_read_lock();
+   for_each_active_iommu(iommu, drhd) {
+   if (!cap_caching_mode(iommu->cap))
+   continue;
+
+   pr_warn_once("IOMMU batching is disabled due to 
virtualization");
+   r = false;
+   break;
+   }
+   rcu_read_unlock();
+
+   return r;
+}
+
  static int
  intel_iommu_domain_get_attr(struct iommu_domain *domain,
enum iommu_attr attr, void *data)
@@ -5383,7 +5413,7 @@ intel_iommu_domain_get_attr(struct iommu_domain *domain,
case IOMMU_DOMAIN_DMA:
switch (attr) {
case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
-   *(int *)data = !intel_iommu_strict;
+   *(int *)data = domain_use_flush_queue();
return 0;
default:
return -ENODEV;



Acked-by: Lu Baolu 

Best regards,
baolu


RE: [RFC PATCH v2] uacce: Add uacce_ctrl misc device

2021-01-27 Thread Song Bao Hua (Barry Song)



> -Original Message-
> From: Jason Gunthorpe [mailto:j...@ziepe.ca]
> Sent: Wednesday, January 27, 2021 7:20 AM
> To: Song Bao Hua (Barry Song) 
> Cc: Wangzhou (B) ; Greg Kroah-Hartman
> ; Arnd Bergmann ; Zhangfei Gao
> ; linux-accelerat...@lists.ozlabs.org;
> linux-ker...@vger.kernel.org; iommu@lists.linux-foundation.org;
> linux...@kvack.org; Liguozhu (Kenneth) ; chensihang
> (A) 
> Subject: Re: [RFC PATCH v2] uacce: Add uacce_ctrl misc device
> 
> On Tue, Jan 26, 2021 at 01:26:45AM +, Song Bao Hua (Barry Song) wrote:
> > > On Mon, Jan 25, 2021 at 11:35:22PM +, Song Bao Hua (Barry Song) wrote:
> > >
> > > > > On Mon, Jan 25, 2021 at 10:21:14PM +, Song Bao Hua (Barry Song)
> wrote:
> > > > > > mlock, while certainly be able to prevent swapping out, it won't
> > > > > > be able to stop page moving due to:
> > > > > > * memory compaction in alloc_pages()
> > > > > > * making huge pages
> > > > > > * numa balance
> > > > > > * memory compaction in CMA
> > > > >
> > > > > Enabling those things is a major reason to have SVA device in the
> > > > > first place, providing a SW API to turn it all off seems like the
> > > > > wrong direction.
> > > >
> > > > I wouldn't say this is a major reason to have SVA. If we read the
> > > > history of SVA and the papers, people would say that easy programming,
> > > > due to data structure sharing between CPU and device, and process space
> > > > isolation in the device are the major reasons for SVA. SVA also
> > > > declares it supports zero-copy, while zero-copy doesn't necessarily
> > > > depend on SVA.
> > >
> > > Once you have to explicitly make system calls to declare memory under
> > > IO, you loose all of that.
> > >
> > > Since you've asked the app to be explicit about the DMAs it intends to
> > > do, there is not really much reason to use SVA for those DMAs anymore.
> >
> > Let's see a non-SVA case. If we are not using SVA, we can have
> > a memory pool backed by hugetlb or pinning, and the app can allocate
> > memory from this pool and get stable I/O performance on the memory
> > from the pool. But the device has its separate page table, which
> > is not bound to this process, thus lacking the protection
> > of process space isolation. Plus, the CPU and the device are using
> > different addresses.
> 
> So you are relying on the platform to do the SVA for the device?
> 

Sorry for the late response.

uacce and its userspace framework UADK depend on SVA, leveraging
the enhanced security by isolated process address space.

This patch is mainly an extension for performance optimization, to
get stable high-performance I/O on pinned memory even though the
hardware supports IO page faults to get pages back after swapping
out or page migration.
IO page faults, however, cause serious latency jitter for high-speed
I/O.
Slow-speed devices don't need to use this extension.

> This feels like it goes back to another topic where I felt the SVA
> setup uAPI should be shared and not buried into every driver's unique
> ioctls.
> 
> Having something like this in a shared SVA system is somewhat less
> strange.

Sounds reasonable. On the other hand, uacce seems to be a common
uAPI for SVA, and probably the only one at the moment.

uacce is a framework, not a specific driver: any accelerator can
hook into this framework as long as the device provides uacce_ops
and registers itself via uacce_register(). uacce itself doesn't bind
to any specific hardware, so the uacce interfaces are a kind of
common uAPI :-)

> 
> Jason

Thanks
Barry



[PATCH 1/3] Adding page_offset_mask to device_dma_parameters

2021-01-27 Thread Jianxiong Gao via iommu
Some devices rely on the address offset in a page to function
correctly (NVMe driver as an example). These devices may use
a different page size than the Linux kernel. The address offset
has to be preserved upon mapping, and in order to do so, we
need to record the page_offset_mask first.

Signed-off-by: Jianxiong Gao 
---
 include/linux/device.h  |  1 +
 include/linux/dma-mapping.h | 17 +
 2 files changed, 18 insertions(+)

diff --git a/include/linux/device.h b/include/linux/device.h
index 1779f90eeb4c..f44e0659fc66 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -292,6 +292,7 @@ struct device_dma_parameters {
 */
unsigned int max_segment_size;
unsigned long segment_boundary_mask;
+   unsigned int page_offset_mask;
 };
 
 /**
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 2e49996a8f39..5529a31fefba 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -500,6 +500,23 @@ static inline int dma_set_seg_boundary(struct device *dev, 
unsigned long mask)
return -EIO;
 }
 
+static inline unsigned int dma_get_page_offset_mask(struct device *dev)
+{
+   if (dev->dma_parms)
+   return dev->dma_parms->page_offset_mask;
+   return 0;
+}
+
+static inline int dma_set_page_offset_mask(struct device *dev,
+   unsigned int page_offset_mask)
+{
+   if (dev->dma_parms) {
+   dev->dma_parms->page_offset_mask = page_offset_mask;
+   return 0;
+   }
+   return -EIO;
+}
+
 static inline int dma_get_cache_alignment(void)
 {
 #ifdef ARCH_DMA_MINALIGN
-- 
2.27.0



[PATCH 2/3] Add swiotlb offset preserving mapping when dma_dma_parameters->page_offset_mask is non zero.

2021-01-27 Thread Jianxiong Gao via iommu
For devices that need to preserve the address offset when mapping through
swiotlb, this patch adds offset preserving based on page_offset_mask
and keeps the offset if the mask is non-zero. This is needed for
device drivers like NVMe.

Signed-off-by: Jianxiong Gao 
---
 kernel/dma/swiotlb.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 7c42df6e6100..4cab35f2c9bc 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -468,7 +468,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, 
phys_addr_t orig_addr,
dma_addr_t tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start);
unsigned long flags;
phys_addr_t tlb_addr;
-   unsigned int nslots, stride, index, wrap;
+   unsigned int nslots, stride, index, wrap, page_offset_mask, page_offset;
int i;
unsigned long mask;
unsigned long offset_slots;
@@ -500,12 +500,16 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, 
phys_addr_t orig_addr,
? ALIGN(mask + 1, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT
: 1UL << (BITS_PER_LONG - IO_TLB_SHIFT);
 
+   page_offset_mask = dma_get_page_offset_mask(hwdev);
+   page_offset = orig_addr & page_offset_mask;
+   alloc_size += page_offset;
+
/*
 * For mappings greater than or equal to a page, we limit the stride
 * (and hence alignment) to a page size.
 */
nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
-   if (alloc_size >= PAGE_SIZE)
+   if ((alloc_size >= PAGE_SIZE) || (page_offset_mask > (1 << 
IO_TLB_SHIFT)))
stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT));
else
stride = 1;
@@ -583,6 +587,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, 
phys_addr_t orig_addr,
 */
for (i = 0; i < nslots; i++)
io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
+   /*
+* When keeping the offset of the original data, we need to advance
+* the tlb_addr by the offset of orig_addr.
+*/
+   tlb_addr += page_offset;
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
swiotlb_bounce(orig_addr, tlb_addr, mapping_size, 
DMA_TO_DEVICE);
@@ -598,7 +607,9 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, 
phys_addr_t tlb_addr,
  enum dma_data_direction dir, unsigned long attrs)
 {
unsigned long flags;
-   int i, count, nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> 
IO_TLB_SHIFT;
+   unsigned int num_page_offset_slabs, page_offset_mask = 
dma_get_page_offset_mask(hwdev);
+   int i, count;
+   int nslots = ALIGN(alloc_size + tlb_addr & page_offset_mask, 1 << 
IO_TLB_SHIFT) >> IO_TLB_SHIFT;
int index = (tlb_addr - io_tlb_start) >> IO_TLB_SHIFT;
phys_addr_t orig_addr = io_tlb_orig_addr[index];
 
@@ -610,6 +621,14 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, 
phys_addr_t tlb_addr,
((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
swiotlb_bounce(orig_addr, tlb_addr, mapping_size, 
DMA_FROM_DEVICE);
 
+   /*
+* When dma_get_page_offset_mask is used, we may have padded more slabs
+* when padding exceeds one slab. We need to move index back to the
+* beginning of the padding.
+*/
+   num_page_offset_slabs =  (tlb_addr & page_offset_mask) / (1 << 
IO_TLB_SHIFT);
+   index -= num_page_offset_slabs;
+
/*
 * Return the buffer to the free list by setting the corresponding
 * entries to indicate the number of contiguous entries available.
-- 
2.27.0



[PATCH 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-01-27 Thread Jianxiong Gao via iommu
The NVMe driver relies on the address offset to function properly.
This patch adds the offset-preserve mask to the NVMe driver when mapping
via dma_map_sg_attrs and unmapping via nvme_unmap_sg. The mask
depends on the page size defined by the CC.MPS register of the NVMe
controller.

Signed-off-by: Jianxiong Gao 
---
 drivers/nvme/host/pci.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 856aa31931c1..0b23f04068be 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -580,12 +580,15 @@ static void nvme_free_sgls(struct nvme_dev *dev, struct 
request *req)
 static void nvme_unmap_sg(struct nvme_dev *dev, struct request *req)
 {
struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
-
+   if (dma_set_page_offset_mask(dev->dev, NVME_CTRL_PAGE_SIZE - 1))
+   dev_warn(dev->dev, "dma_set_page_offset_mask failed to set 
offset\n");
if (is_pci_p2pdma_page(sg_page(iod->sg)))
pci_p2pdma_unmap_sg(dev->dev, iod->sg, iod->nents,
rq_dma_dir(req));
else
dma_unmap_sg(dev->dev, iod->sg, iod->nents, rq_dma_dir(req));
+   if (dma_set_page_offset_mask(dev->dev, 0))
+   dev_warn(dev->dev, "dma_set_page_offset_mask failed to reset 
offset\n");
 }
 
 static void nvme_unmap_data(struct nvme_dev *dev, struct request *req)
@@ -842,7 +845,7 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, 
struct request *req,
 {
struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
blk_status_t ret = BLK_STS_RESOURCE;
-   int nr_mapped;
+   int nr_mapped, offset_ret;
 
if (blk_rq_nr_phys_segments(req) == 1) {
struct bio_vec bv = req_bvec(req);
@@ -868,12 +871,24 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, 
struct request *req,
if (!iod->nents)
goto out_free_sg;
 
+   offset_ret = dma_set_page_offset_mask(dev->dev, NVME_CTRL_PAGE_SIZE - 
1);
+   if (offset_ret) {
+   dev_warn(dev->dev, "dma_set_page_offset_mask failed to set 
offset\n");
+   goto out_free_sg;
+   }
+
if (is_pci_p2pdma_page(sg_page(iod->sg)))
nr_mapped = pci_p2pdma_map_sg_attrs(dev->dev, iod->sg,
iod->nents, rq_dma_dir(req), DMA_ATTR_NO_WARN);
else
nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents,
 rq_dma_dir(req), DMA_ATTR_NO_WARN);
+
+   offset_ret = dma_set_page_offset_mask(dev->dev, 0);
+   if (offset_ret) {
+   dev_warn(dev->dev, "dma_set_page_offset_mask failed to reset 
offset\n");
+   goto out_free_sg;
+   }
if (!nr_mapped)
goto out_free_sg;
 
-- 
2.27.0



Re: [PATCH v3 2/2] vfio/iommu_type1: Fix some sanity checks in detach group

2021-01-27 Thread Alex Williamson
On Fri, 22 Jan 2021 17:26:35 +0800
Keqian Zhu  wrote:

> vfio_sanity_check_pfn_list() is used to check whether pfn_list and
> notifier are empty when removing the external domain, so it makes a
> wrong assumption that only the external domain will use the pinning
> interface.
> 
> Now we apply the pfn_list check when a vfio_dma is removed and apply
> the notifier check when all domains are removed.
> 
> Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices")
> Signed-off-by: Keqian Zhu 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 33 ++---
>  1 file changed, 10 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 161725395f2f..d8c10f508321 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -957,6 +957,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, 
> struct vfio_dma *dma,
>  
>  static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
>  {
> + WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list));
>   vfio_unmap_unpin(iommu, dma, true);
>   vfio_unlink_dma(iommu, dma);
>   put_task_struct(dma->task);
> @@ -2250,23 +2251,6 @@ static void vfio_iommu_unmap_unpin_reaccount(struct 
> vfio_iommu *iommu)
>   }
>  }
>  
> -static void vfio_sanity_check_pfn_list(struct vfio_iommu *iommu)
> -{
> - struct rb_node *n;
> -
> - n = rb_first(&iommu->dma_list);
> - for (; n; n = rb_next(n)) {
> - struct vfio_dma *dma;
> -
> - dma = rb_entry(n, struct vfio_dma, node);
> -
> - if (WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list)))
> - break;
> - }
> - /* mdev vendor driver must unregister notifier */
> - WARN_ON(iommu->notifier.head);
> -}
> -
>  /*
>   * Called when a domain is removed in detach. It is possible that
>   * the removed domain decided the iova aperture window. Modify the
> @@ -2366,10 +2350,10 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>   kfree(group);
>  
>   if (list_empty(&iommu->external_domain->group_list)) {
> - vfio_sanity_check_pfn_list(iommu);
> -
> - if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
> + if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
> + WARN_ON(iommu->notifier.head);
>   vfio_iommu_unmap_unpin_all(iommu);
> + }
>  
>   kfree(iommu->external_domain);
>   iommu->external_domain = NULL;
> @@ -2403,10 +2387,12 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>*/
>   if (list_empty(&domain->group_list)) {
>   if (list_is_singular(&iommu->domain_list)) {
> - if (!iommu->external_domain)
> + if (!iommu->external_domain) {
> + WARN_ON(iommu->notifier.head);
>   vfio_iommu_unmap_unpin_all(iommu);
> - else
> + } else {
>   vfio_iommu_unmap_unpin_reaccount(iommu);
> + }
>   }
>   iommu_domain_free(domain->domain);
>   list_del(&domain->next);
> @@ -2488,9 +2474,10 @@ static void vfio_iommu_type1_release(void *iommu_data)
>   struct vfio_iommu *iommu = iommu_data;
>   struct vfio_domain *domain, *domain_tmp;
>  
> + WARN_ON(iommu->notifier.head);

I don't see that this does any harm, but isn't it actually redundant?
It seems vfio-core only calls the iommu backend release function after
removing all the groups, so the tests in _detach_group should catch all
cases.  We're expecting the vfio bus/mdev driver to remove the notifier
when a device is closed, which necessarily occurs before detaching the
group.  Thanks,

Alex

> +
>   if (iommu->external_domain) {
>   vfio_release_domain(iommu->external_domain, true);
> - vfio_sanity_check_pfn_list(iommu);
>   kfree(iommu->external_domain);
>   }
>  



Re: [PATCH 2/5] kernel/dma: remove unnecessary unmap_kernel_range

2021-01-27 Thread Nicholas Piggin
Excerpts from Christoph Hellwig's message of January 27, 2021 5:10 pm:
> On Tue, Jan 26, 2021 at 05:08:46PM -0500, Konrad Rzeszutek Wilk wrote:
>> On Tue, Jan 26, 2021 at 02:54:01PM +1000, Nicholas Piggin wrote:
>> > vunmap will remove ptes.
>> 
>> Should there be some ASSERT after the vunmap to make sure that is the
>> case? 
> 
> Not really.  removing the PTEs is the whole point of vunmap.  Everything
> else is just house keeping.

Agree. I did double check this and wrote a quick test to check ptes were 
there before the vunmap and cleared after, just to make sure I didn't 
make a silly mistake with the patch. But in general drivers should be 
able to trust code behind the API call will do the right thing. Such 
assertions should go in the vunmap() implementation as appropriate.

Thanks,
Nick


Re: [PATCH v3 5/6] media: uvcvideo: Use dma_alloc_noncontiguos API

2021-01-27 Thread Ricardo Ribalda
Hi Christoph

On Wed, Jan 27, 2021 at 4:56 PM Christoph Hellwig wrote:
>
> On Wed, Jan 27, 2021 at 12:29:08AM +0100, Ricardo Ribalda wrote:
> > - Is there any platform where dma_alloc_noncontiguous can fail?
> > That is, !ops->alloc_noncontiguous and !dev->coherent_dma_mask.
> > If yes, then we need to add a function to let the driver know in
> > advance that it has to use the coherent allocator (usb_alloc_coherent
> > for uvc).
>
> dev->coherent_dma_mask is set by the driver.  So the only reason why
> dma_alloc_noncontiguos will fail is because is because it can't
> allocate any memory.
>
> > - In dma_alloc_noncontiguous, on the dma_alloc_pages fallback: if we
> > have a device where the DMA happens in only one direction, could we
> > not get more performance with DMA_FROM/TO_DEVICE instead of
> > DMA_BIDIRECTIONAL?
>
> Yes, we could probably do that.
>
> >
> >
> > Then I have tried to use the API, and I have encountered a problem: on
> > uvcvideo the device passed to the memory allocator is different for
> > DMA_PAGES and NON_CONTIGUOUS:
> > https://github.com/ribalda/linux/blob/042cd497739f71c8d4a83a67ee970369e2baca4a/drivers/media/usb/uvc/uvc_video.c#L1236
> >
> > I need to dig a bit tomorrow to figure out why this is, I have
> > hardware to test both paths, so it should not be too difficult.
>
> I always found the USB dma alloc API a little weird, but we might have
> to follow the scheme of the usb coherent wrappers there.

I have used the current API here:

https://git.kernel.org/pub/scm/linux/kernel/git/ribalda/linux.git/log/?h=uvc-noncontiguous

And I think the result is very clean. Great work!

I have tested it in X86 and in arm64, with similar performance as the
previous patchset.

Maybe you want to cherry-pick that commit into your series. I can also
send the patch to the list for review if you prefer.

At least in 5.11-rc5, the same dmadev worked in arm64 and x86.

Best regards!


-- 
Ricardo Ribalda


[PATCH RFC 9/9] iommu/vt-d: Introduce map_sg() for Intel IOMMUs

2021-01-27 Thread Chuck Lever
Attempt to reduce indirect call overhead when mapping a substantial
scatter-gather list.

Signed-off-by: Chuck Lever 
---
 drivers/iommu/intel/iommu.c |   37 +
 1 file changed, 37 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 013097b6d55f..deae39f1477a 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4952,6 +4952,42 @@ static int intel_iommu_map(struct iommu_domain *domain,
hpa >> VTD_PAGE_SHIFT, size, prot);
 }
 
+static int intel_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
+ struct scatterlist *sg, unsigned int nents,
+ int prot, gfp_t gfp, size_t *mapped)
+{
+   unsigned int i = 0;
+   phys_addr_t start;
+   size_t len = 0;
+   int ret;
+
+   while (i <= nents) {
+   phys_addr_t s_phys = sg_phys(sg);
+
+   if (len && s_phys != start + len) {
+   ret = intel_iommu_map(domain, iova + *mapped, start,
+ len, prot, gfp);
+   if (ret)
+   return ret;
+
+   *mapped += len;
+   len = 0;
+   }
+
+   if (len) {
+   len += sg->length;
+   } else {
+   len = sg->length;
+   start = s_phys;
+   }
+
+   if (++i < nents)
+   sg = sg_next(sg);
+   }
+
+   return 0;
+}
+
 static size_t intel_iommu_unmap(struct iommu_domain *domain,
unsigned long iova, size_t size,
struct iommu_iotlb_gather *gather)
@@ -5519,6 +,7 @@ const struct iommu_ops intel_iommu_ops = {
.aux_detach_dev = intel_iommu_aux_detach_device,
.aux_get_pasid  = intel_iommu_aux_get_pasid,
.map= intel_iommu_map,
+   .map_sg = intel_iommu_map_sg,
.iotlb_sync_map = intel_iommu_iotlb_sync_map,
.unmap  = intel_iommu_unmap,
.flush_iotlb_all= intel_flush_iotlb_all,




[PATCH RFC 4/9] iommu/io-pgtable: Introduce map_sg() as a page table op

2021-01-27 Thread Chuck Lever
From: Isaac J. Manjarres 

While mapping a scatter-gather list, iommu_map_sg() calls
into the IOMMU driver through an indirect call, which can
call into the io-pgtable code through another indirect call.

This sequence of going through the IOMMU core code, the IOMMU
driver, and finally the io-pgtable code occurs for every
element in the scatter-gather list, in the worst case, which
is not optimal.

Introduce a map_sg callback in the io-pgtable ops so that
IOMMU drivers can invoke it with the complete scatter-gather
list, so that it can be processed within the io-pgtable
code entirely, reducing the number of indirect calls, and
boosting overall iommu_map_sg() performance.

Signed-off-by: Isaac J. Manjarres 
Tested-by: Sai Prakash Ranjan 
Signed-off-by: Chuck Lever 
---
 include/linux/io-pgtable.h |6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index ea727eb1a1a9..6d0e73172603 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -147,6 +147,9 @@ struct io_pgtable_cfg {
  * struct io_pgtable_ops - Page table manipulation API for IOMMU drivers.
  *
  * @map:  Map a physically contiguous memory region.
+ * @map_sg:   Map a scatter-gather list of physically contiguous memory
+ *chunks. The mapped pointer argument is used to store how
+ *many bytes are mapped.
  * @unmap:Unmap a physically contiguous memory region.
  * @iova_to_phys: Translate iova to physical address.
  *
@@ -156,6 +159,9 @@ struct io_pgtable_cfg {
 struct io_pgtable_ops {
int (*map)(struct io_pgtable_ops *ops, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+   int (*map_sg)(struct io_pgtable_ops *ops, unsigned long iova,
+ struct scatterlist *sg, unsigned int nents, int prot,
+ gfp_t gfp, size_t *mapped);
size_t (*unmap)(struct io_pgtable_ops *ops, unsigned long iova,
size_t size, struct iommu_iotlb_gather *gather);
phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops,




[PATCH RFC 8/9] iommu/arm-smmu: Hook up map_sg()

2021-01-27 Thread Chuck Lever
From: Isaac J. Manjarres 

Now that everything is in place for iommu_map_sg() to defer
mapping a scatter-gather list to the io-pgtable layer, implement
the map_sg() callback in the SMMU driver, so that iommu_map_sg()
can invoke it with the entire scatter-gather list that will be
mapped.

Signed-off-by: Isaac J. Manjarres 
Tested-by: Sai Prakash Ranjan 
Signed-off-by: Chuck Lever 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c |   19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index d8c6bfde6a61..52acc6858512 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1208,6 +1208,24 @@ static int arm_smmu_map(struct iommu_domain *domain, 
unsigned long iova,
return ret;
 }
 
+static int arm_smmu_map_sg(struct iommu_domain *domain, unsigned long iova,
+  struct scatterlist *sg, unsigned int nents, int prot,
+  gfp_t gfp, size_t *mapped)
+{
+   struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+   struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
+   int ret;
+
+   if (!ops)
+   return -ENODEV;
+
+   arm_smmu_rpm_get(smmu);
+   ret = ops->map_sg(ops, iova, sg, nents, prot, gfp, mapped);
+   arm_smmu_rpm_put(smmu);
+
+   return ret;
+}
+
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 size_t size, struct iommu_iotlb_gather *gather)
 {
@@ -1624,6 +1642,7 @@ static struct iommu_ops arm_smmu_ops = {
.domain_free= arm_smmu_domain_free,
.attach_dev = arm_smmu_attach_dev,
.map= arm_smmu_map,
+   .map_sg = arm_smmu_map_sg,
.unmap  = arm_smmu_unmap,
.flush_iotlb_all= arm_smmu_flush_iotlb_all,
.iotlb_sync = arm_smmu_iotlb_sync,




Re: [PATCH v2] of/device: Update dma_range_map only when dev has valid dma-ranges

2021-01-27 Thread Rob Herring
On Tue, 19 Jan 2021 18:52:03 +0800, Yong Wu wrote:
> The commit e0d072782c73 ("dma-mapping: introduce DMA range map,
> supplanting dma_pfn_offset") always updates dma_range_map even though it
> was already set, as in the sunxi_mbus driver. The issue is reported at [1].
> This patch avoids that (updating it only when dev has valid dma-ranges).
> 
> Meanwhile, dma_range_map contains the devices' dma_ranges information.
> This patch moves the dma_range_map setup before of_iommu_configure. The
> iommu driver may need to know the dma_address requirements of its iommu
> consumer devices.
> 
> [1] 
> https://lore.kernel.org/linux-arm-kernel/5c7946f3-b56e-da00-a750-be097c7ce...@arm.com/
> 
> CC: Rob Herring 
> CC: Frank Rowand 
> Fixes: e0d072782c73 ("dma-mapping: introduce DMA range map, supplanting 
> dma_pfn_offset"),
> Suggested-by: Robin Murphy 
> Signed-off-by: Yong Wu 
> Signed-off-by: Paul Kocialkowski 
> Reviewed-by: Rob Herring 
> ---
>  drivers/of/device.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 

Applied, thanks!


[PATCH RFC 3/9] iommu/vt-d: Add iotlb_sync_map callback

2021-01-27 Thread Chuck Lever
From: Lu Baolu 

Some Intel VT-d hardware implementations don't support memory coherency
for page table walks (presented by the Page-Walk-coherency bit in the
ecap register), so software must flush the corresponding CPU cache
lines explicitly after each page table entry update.

The iommu_map_sg() code iterates through the given scatter-gather list
and invokes iommu_map() for each element in the scatter-gather list,
which calls into the vendor IOMMU driver through the iommu_ops callback. As
a result, a single sg mapping may lead to multiple cache line flushes,
which leads to the degradation of I/O performance after the commit
 ("iommu/vt-d: Convert intel iommu driver to the iommu
ops").

Fix this by adding iotlb_sync_map callback and centralizing the clflush
operations after all sg mappings.

Fixes: c588072bba6b5 ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
Reported-by: Chuck Lever 
Link: 
https://lore.kernel.org/linux-iommu/d81314ed-5673-44a6-b597-090e3cb83...@oracle.com/
Signed-off-by: Lu Baolu 
Cc: Robin Murphy 
[ cel: removed @first_pte, which is no longer used ]
Signed-off-by: Chuck Lever 
---
 drivers/iommu/intel/iommu.c |   90 +--
 1 file changed, 60 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f665322a0991..013097b6d55f 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2298,9 +2298,9 @@ static int
 __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 unsigned long phys_pfn, unsigned long nr_pages, int prot)
 {
-   struct dma_pte *first_pte = NULL, *pte = NULL;
unsigned int largepage_lvl = 0;
unsigned long lvl_pages = 0;
+   struct dma_pte *pte = NULL;
phys_addr_t pteval;
u64 attr;
 
@@ -2322,7 +2322,7 @@ __domain_mapping(struct dmar_domain *domain, unsigned 
long iov_pfn,
largepage_lvl = hardware_largepage_caps(domain, iov_pfn,
phys_pfn, nr_pages);
 
-   first_pte = pte = pfn_to_dma_pte(domain, iov_pfn, &largepage_lvl);
+   pte = pfn_to_dma_pte(domain, iov_pfn, &largepage_lvl);
if (!pte)
return -ENOMEM;
/* It is large page*/
@@ -2383,34 +2383,14 @@ __domain_mapping(struct dmar_domain *domain, unsigned 
long iov_pfn,
 * recalculate 'pte' and switch back to smaller pages for the
 * end of the mapping, if the trailing size is not enough to
 * use another superpage (i.e. nr_pages < lvl_pages).
+*
+* We leave clflush for the leaf pte changes to iotlb_sync_map()
+* callback.
 */
pte++;
if (!nr_pages || first_pte_in_page(pte) ||
-   (largepage_lvl > 1 && nr_pages < lvl_pages)) {
-   domain_flush_cache(domain, first_pte,
-  (void *)pte - (void *)first_pte);
+   (largepage_lvl > 1 && nr_pages < lvl_pages))
pte = NULL;
-   }
-   }
-
-   return 0;
-}
-
-static int
-domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
-  unsigned long phys_pfn, unsigned long nr_pages, int prot)
-{
-   int iommu_id, ret;
-   struct intel_iommu *iommu;
-
-   /* Do the real mapping first */
-   ret = __domain_mapping(domain, iov_pfn, phys_pfn, nr_pages, prot);
-   if (ret)
-   return ret;
-
-   for_each_domain_iommu(iommu_id, domain) {
-   iommu = g_iommus[iommu_id];
-   __mapping_notify_one(iommu, domain, iov_pfn, nr_pages);
}
 
return 0;
@@ -4943,7 +4923,6 @@ static int intel_iommu_map(struct iommu_domain *domain,
struct dmar_domain *dmar_domain = to_dmar_domain(domain);
u64 max_addr;
int prot = 0;
-   int ret;
 
if (iommu_prot & IOMMU_READ)
prot |= DMA_PTE_READ;
@@ -4969,9 +4948,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
/* Round up size to next multiple of PAGE_SIZE, if it and
   the low bits of hpa would take us onto the next page */
size = aligned_nrpages(hpa, size);
-   ret = domain_mapping(dmar_domain, iova >> VTD_PAGE_SHIFT,
-hpa >> VTD_PAGE_SHIFT, size, prot);
-   return ret;
+   return __domain_mapping(dmar_domain, iova >> VTD_PAGE_SHIFT,
+   hpa >> VTD_PAGE_SHIFT, size, prot);
 }
 
 static size_t intel_iommu_unmap(struct iommu_domain *domain,
@@ -5478,6 +5456,57 @@ static bool risky_device(struct pci_dev *pdev)
return false;
 }
 
+static void clflush_sync_map(struct dmar_domain *domain, unsigned long clf_pfn,
+unsigned long clf_pages)
+{
+   struct dma_pte 

[PATCH RFC 7/9] iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers

2021-01-27 Thread Chuck Lever
From: Isaac J. Manjarres 

Add support for IOMMU drivers to have their own map_sg() callbacks.
This completes the path for having iommu_map_sg() invoke an IOMMU
driver's map_sg() callback, which can then invoke the io-pgtable
map_sg() callback with the entire scatter-gather list, so that it
can be processed entirely in the io-pgtable layer.

For IOMMU drivers that do not provide a callback, the default
implementation of iterating through the scatter-gather list while
calling iommu_map() will be used.

Signed-off-by: Isaac J. Manjarres 
Tested-by: Sai Prakash Ranjan 
[ cel: adjusted new iotlb_sync_map call site ]
Signed-off-by: Chuck Lever 
---
 drivers/iommu/iommu.c |   12 
 include/linux/iommu.h |5 +
 2 files changed, 17 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ed879a4d7fac..bd7adbd0339b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2551,6 +2551,18 @@ static size_t __iommu_map_sg(struct iommu_domain 
*domain, unsigned long iova,
unsigned int i = 0;
int ret;
 
+   if (ops->map_sg) {
+   ret = ops->map_sg(domain, iova, sg, nents, prot, gfp, &mapped);
+
+   if (ops->iotlb_sync_map)
+   ops->iotlb_sync_map(domain, iova, mapped);
+
+   if (ret)
+   goto out_err;
+
+   return mapped;
+   }
+
while (i <= nents) {
phys_addr_t s_phys = sg_phys(sg);
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index cd5f35022a25..667edc7b034a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -192,6 +192,8 @@ struct iommu_iotlb_gather {
  * @attach_dev: attach device to an iommu domain
  * @detach_dev: detach device from an iommu domain
  * @map: map a physically contiguous memory region to an iommu domain
+ * @map_sg: map a scatter-gather list of physically contiguous chunks to
+ *  an iommu domain.
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
@@ -243,6 +245,9 @@ struct iommu_ops {
void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
int (*map)(struct iommu_domain *domain, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+   int (*map_sg)(struct iommu_domain *domain, unsigned long iova,
+ struct scatterlist *sg, unsigned int nents, int prot,
+ gfp_t gfp, size_t *mapped);
size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 size_t size, struct iommu_iotlb_gather *iotlb_gather);
void (*flush_iotlb_all)(struct iommu_domain *domain);




[PATCH RFC 5/9] iommu/io-pgtable-arm: Hook up map_sg()

2021-01-27 Thread Chuck Lever
From: Isaac J. Manjarres 

Implement the map_sg io-pgtable op for the ARM LPAE io-pgtable
code, so that IOMMU drivers can call it when they need to map
a scatter-gather list.

Signed-off-by: Isaac J. Manjarres 
Tested-by: Sai Prakash Ranjan 
Signed-off-by: Chuck Lever 
---
 drivers/iommu/io-pgtable-arm.c |   86 
 drivers/iommu/iommu.c  |   12 +++---
 include/linux/iommu.h  |8 
 3 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58e79b5..0c11529442b8 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -473,6 +473,91 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, 
unsigned long iova,
return ret;
 }
 
+static int arm_lpae_map_by_pgsize(struct io_pgtable_ops *ops,
+ unsigned long iova, phys_addr_t paddr,
+ size_t size, int iommu_prot, gfp_t gfp,
+ size_t *mapped)
+{
+   struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+   struct io_pgtable_cfg *cfg = &data->iop.cfg;
+   arm_lpae_iopte *ptep = data->pgd;
+   int ret, lvl = data->start_level;
+   arm_lpae_iopte prot = arm_lpae_prot_to_pte(data, iommu_prot);
+   unsigned int min_pagesz = 1 << __ffs(cfg->pgsize_bitmap);
+   long iaext = (s64)(iova + size - 1) >> cfg->ias;
+   size_t pgsize;
+
+   if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) {
+   pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 
0x%x\n",
+  iova, &paddr, size, min_pagesz);
+   return -EINVAL;
+   }
+
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+   iaext = ~iaext;
+   if (WARN_ON(iaext || (paddr + size - 1) >> cfg->oas))
+   return -ERANGE;
+
+   while (size) {
+   pgsize = iommu_pgsize(cfg->pgsize_bitmap, iova | paddr, size);
+   ret = __arm_lpae_map(data, iova, paddr, pgsize, prot, lvl, ptep,
+gfp);
+   if (ret)
+   return ret;
+
+   iova += pgsize;
+   paddr += pgsize;
+   *mapped += pgsize;
+   size -= pgsize;
+   }
+
+   return 0;
+}
+
+static int arm_lpae_map_sg(struct io_pgtable_ops *ops, unsigned long iova,
+  struct scatterlist *sg, unsigned int nents,
+  int iommu_prot, gfp_t gfp, size_t *mapped)
+{
+
+   size_t len = 0;
+   unsigned int i = 0;
+   int ret;
+   phys_addr_t start;
+
+   *mapped = 0;
+
+   /* If no access, then nothing to do */
+   if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+   return 0;
+
+   while (i <= nents) {
+   phys_addr_t s_phys = sg_phys(sg);
+
+   if (len && s_phys != start + len) {
+   ret = arm_lpae_map_by_pgsize(ops, iova + *mapped, start,
+len, iommu_prot, gfp,
+mapped);
+
+   if (ret)
+   return ret;
+
+   len = 0;
+   }
+
+   if (len) {
+   len += sg->length;
+   } else {
+   len = sg->length;
+   start = s_phys;
+   }
+
+   if (++i < nents)
+   sg = sg_next(sg);
+   }
+
+   return 0;
+}
+
 static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl,
arm_lpae_iopte *ptep)
 {
@@ -750,6 +835,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 
data->iop.ops = (struct io_pgtable_ops) {
.map= arm_lpae_map,
+   .map_sg = arm_lpae_map_sg,
.unmap  = arm_lpae_unmap,
.iova_to_phys   = arm_lpae_iova_to_phys,
};
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3d099a31ddca..ed879a4d7fac 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2346,8 +2346,8 @@ phys_addr_t iommu_iova_to_phys(struct iommu_domain 
*domain, dma_addr_t iova)
 }
 EXPORT_SYMBOL_GPL(iommu_iova_to_phys);
 
-static size_t iommu_pgsize(struct iommu_domain *domain,
-  unsigned long addr_merge, size_t size)
+size_t iommu_pgsize(unsigned long pgsize_bitmap, unsigned long addr_merge,
+   size_t size)
 {
unsigned int pgsize_idx;
size_t pgsize;
@@ -2366,7 +2366,7 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
pgsize = (1UL << (pgsize_idx + 1)) - 1;
 
/* throw away page sizes not supported by the hardware */
-   pgsize &= domain->pgsize_bitmap;
+   pgsize &= pgsize_bitmap;
 
/* make sure 

[PATCH RFC 6/9] iommu/io-pgtable-arm-v7s: Hook up map_sg()

2021-01-27 Thread Chuck Lever
From: Isaac J. Manjarres 

Implement the map_sg io-pgtable op for the ARMv7s io-pgtable
code, so that IOMMU drivers can call it when they need to map
a scatter-gather list.

Signed-off-by: Isaac J. Manjarres 
Tested-by: Sai Prakash Ranjan 
Signed-off-by: Chuck Lever 
---
 drivers/iommu/io-pgtable-arm-v7s.c |   90 
 1 file changed, 90 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 1d92ac948db7..8665dabb753b 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -545,6 +545,95 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, 
unsigned long iova,
return ret;
 }
 
+static int arm_v7s_map_by_pgsize(struct io_pgtable_ops *ops,
+unsigned long iova, phys_addr_t paddr,
+size_t size, int prot, gfp_t gfp,
+size_t *mapped)
+{
+   struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
+   struct io_pgtable *iop = &data->iop;
+   struct io_pgtable_cfg *cfg = &iop->cfg;
+   unsigned int min_pagesz = 1 << __ffs(cfg->pgsize_bitmap);
+   int ret;
+   size_t pgsize;
+
+   if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) {
+   pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 0x%x\n",
+  iova, &paddr, size, min_pagesz);
+   return -EINVAL;
+   }
+
+   if (WARN_ON((iova + size - 1) >= (1ULL << cfg->ias) ||
+   (paddr + size - 1) >= (1ULL << cfg->oas)))
+   return -ERANGE;
+
+   while (size) {
+   pgsize = iommu_pgsize(cfg->pgsize_bitmap, iova | paddr, size);
+   ret = __arm_v7s_map(data, iova, paddr, pgsize, prot, 1,
+   data->pgd, gfp);
+
+   if (iop->cfg.quirks & IO_PGTABLE_QUIRK_TLBI_ON_MAP) {
+   io_pgtable_tlb_flush_walk(&data->iop, iova, size,
+ ARM_V7S_BLOCK_SIZE(2));
+   } else {
+   wmb();
+   }
+
+   if (ret)
+   return ret;
+
+   iova += pgsize;
+   paddr += pgsize;
+   *mapped += pgsize;
+   size -= pgsize;
+   }
+
+   return 0;
+}
+
+static int arm_v7s_map_sg(struct io_pgtable_ops *ops, unsigned long iova,
+ struct scatterlist *sg, unsigned int nents,
+ int iommu_prot, gfp_t gfp, size_t *mapped)
+{
+   size_t len = 0;
+   unsigned int i = 0;
+   int ret;
+   phys_addr_t start;
+
+   *mapped = 0;
+
+   /* If no access, then nothing to do */
+   if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+   return 0;
+
+   while (i <= nents) {
+   phys_addr_t s_phys = sg_phys(sg);
+
+   if (len && s_phys != start + len) {
+   ret = arm_v7s_map_by_pgsize(ops, iova + *mapped, start,
+   len, iommu_prot, gfp,
+   mapped);
+
+   if (ret)
+   return ret;
+
+   len = 0;
+   }
+
+   if (len) {
+   len += sg->length;
+   } else {
+   len = sg->length;
+   start = s_phys;
+   }
+
+   if (++i < nents)
+   sg = sg_next(sg);
+   }
+
+   return 0;
+}
+
 static void arm_v7s_free_pgtable(struct io_pgtable *iop)
 {
struct arm_v7s_io_pgtable *data = io_pgtable_to_data(iop);
@@ -783,6 +872,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
 
data->iop.ops = (struct io_pgtable_ops) {
.map= arm_v7s_map,
+   .map_sg = arm_v7s_map_sg,
.unmap  = arm_v7s_unmap,
.iova_to_phys   = arm_v7s_iova_to_phys,
};


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH RFC 2/9] iommu: Add iova and size as parameters in iotlb_sync_map

2021-01-27 Thread Chuck Lever
From: Yong Wu 

iotlb_sync_map allows IOMMU drivers to sync the TLB after completing the
whole mapping. This patch adds iova and size as parameters to it, so that
the IOMMU driver can flush the TLB for the whole range at once after the
iova mapping, to improve performance.
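
For illustration, a driver whose hardware can invalidate by address range
could now do something like the following in its callback (hypothetical
driver; my_flush_tlb_range() is a made-up helper standing in for the
hardware access):

static void my_iommu_iotlb_sync_map(struct iommu_domain *domain,
				    unsigned long iova, size_t size)
{
	struct my_iommu_domain *dom = to_my_domain(domain);

	/* Flush only what was just mapped instead of the whole domain. */
	my_flush_tlb_range(dom, iova, size);
}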

Signed-off-by: Yong Wu 
Reviewed-by: Robin Murphy 
Signed-off-by: Chuck Lever 
---
 drivers/iommu/iommu.c  |4 ++--
 drivers/iommu/tegra-gart.c |7 +--
 include/linux/iommu.h  |3 ++-
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index c304a6a30d42..3d099a31ddca 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2443,7 +2443,7 @@ static int _iommu_map(struct iommu_domain *domain, 
unsigned long iova,
 
ret = __iommu_map(domain, iova, paddr, size, prot, gfp);
if (ret == 0 && ops->iotlb_sync_map)
-   ops->iotlb_sync_map(domain);
+   ops->iotlb_sync_map(domain, iova, size);
 
return ret;
 }
@@ -2575,7 +2575,7 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, 
unsigned long iova,
}
 
if (ops->iotlb_sync_map)
-   ops->iotlb_sync_map(domain);
+   ops->iotlb_sync_map(domain, iova, mapped);
return mapped;
 
 out_err:
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index fac720273889..05e8e19b8269 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -261,7 +261,8 @@ static int gart_iommu_of_xlate(struct device *dev,
return 0;
 }
 
-static void gart_iommu_sync_map(struct iommu_domain *domain)
+static void gart_iommu_sync_map(struct iommu_domain *domain, unsigned long iova,
+   size_t size)
 {
FLUSH_GART_REGS(gart_handle);
 }
@@ -269,7 +270,9 @@ static void gart_iommu_sync_map(struct iommu_domain *domain)
 static void gart_iommu_sync(struct iommu_domain *domain,
struct iommu_iotlb_gather *gather)
 {
-   gart_iommu_sync_map(domain);
+   size_t length = gather->end - gather->start;
+
+   gart_iommu_sync_map(domain, gather->start, length);
 }
 
 static const struct iommu_ops gart_iommu_ops = {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b3f0e2018c62..9ce0aa9e236b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -246,7 +246,8 @@ struct iommu_ops {
size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 size_t size, struct iommu_iotlb_gather *iotlb_gather);
void (*flush_iotlb_all)(struct iommu_domain *domain);
-   void (*iotlb_sync_map)(struct iommu_domain *domain);
+   void (*iotlb_sync_map)(struct iommu_domain *domain, unsigned long iova,
+  size_t size);
void (*iotlb_sync)(struct iommu_domain *domain,
   struct iommu_iotlb_gather *iotlb_gather);
phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t 
iova);


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH RFC 1/9] iommu: Move iotlb_sync_map out from __iommu_map

2021-01-27 Thread Chuck Lever
From: Yong Wu 

At the end of __iommu_map, it always calls iotlb_sync_map.

This patch moves iotlb_sync_map out of __iommu_map, since it is
unnecessary to call it for each sg segment, especially as iotlb_sync_map
currently flushes the whole TLB. Add a little helper, _iommu_map, for this.
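
The resulting shape of the sg path, sketched for illustration (this is not
the diff below; contiguous-run merging and the surrounding declarations are
omitted):

	/* Map each segment without any per-segment TLB sync... */
	for_each_sg(sgl, sg, nents, i) {
		ret = __iommu_map(domain, iova + mapped, sg_phys(sg),
				  sg->length, prot, gfp);
		if (ret)
			break;
		mapped += sg->length;
	}

	/* ...then sync once for the whole mapped range. */
	if (!ret && ops->iotlb_sync_map)
		ops->iotlb_sync_map(domain);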

Signed-off-by: Yong Wu 
Reviewed-by: Robin Murphy 
Signed-off-by: Chuck Lever 
---
 drivers/iommu/iommu.c |   23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ffeebda8d6de..c304a6a30d42 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2426,9 +2426,6 @@ static int __iommu_map(struct iommu_domain *domain, 
unsigned long iova,
size -= pgsize;
}
 
-   if (ops->iotlb_sync_map)
-   ops->iotlb_sync_map(domain);
-
/* unroll mapping in case something went wrong */
if (ret)
iommu_unmap(domain, orig_iova, orig_size - size);
@@ -2438,18 +2435,31 @@ static int __iommu_map(struct iommu_domain *domain, 
unsigned long iova,
return ret;
 }
 
+static int _iommu_map(struct iommu_domain *domain, unsigned long iova,
+ phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+{
+   const struct iommu_ops *ops = domain->ops;
+   int ret;
+
+   ret = __iommu_map(domain, iova, paddr, size, prot, gfp);
+   if (ret == 0 && ops->iotlb_sync_map)
+   ops->iotlb_sync_map(domain);
+
+   return ret;
+}
+
 int iommu_map(struct iommu_domain *domain, unsigned long iova,
  phys_addr_t paddr, size_t size, int prot)
 {
might_sleep();
-   return __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
+   return _iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
 }
 EXPORT_SYMBOL_GPL(iommu_map);
 
 int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
  phys_addr_t paddr, size_t size, int prot)
 {
-   return __iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
+   return _iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
 }
 EXPORT_SYMBOL_GPL(iommu_map_atomic);
 
@@ -2533,6 +2543,7 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, 
unsigned long iova,
 struct scatterlist *sg, unsigned int nents, int 
prot,
 gfp_t gfp)
 {
+   const struct iommu_ops *ops = domain->ops;
size_t len = 0, mapped = 0;
phys_addr_t start;
unsigned int i = 0;
@@ -2563,6 +2574,8 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, 
unsigned long iova,
sg = sg_next(sg);
}
 
+   if (ops->iotlb_sync_map)
+   ops->iotlb_sync_map(domain);
return mapped;
 
 out_err:


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH RFC 0/9] Possible set of VT-d optimizations

2021-01-27 Thread Chuck Lever
Hi-

This collection of patches seems to get the best throughput results
so far. The NFS WRITE result is fully restored, and the NFS READ
result is very close to fully restored.

Children see throughput for 12 initial writers  = 5008474.03 kB/sec
Parent sees throughput for 12 initial writers   = 4996927.80 kB/sec
Min throughput per process  = 416956.88 kB/sec 
Max throughput per process  = 417910.22 kB/sec
Avg throughput per process  = 417372.84 kB/sec
Min xfer= 1046272.00 kB
CPU Utilization: Wall time 2.515  CPU time 1.996  CPU utilization  79.37 %


Children see throughput for 12 rewriters= 5020584.59 kB/sec
Parent sees throughput for 12 rewriters = 5012539.29 kB/sec
Min throughput per process  = 417799.00 kB/sec 
Max throughput per process  = 419082.22 kB/sec
Avg throughput per process  = 418382.05 kB/sec
Min xfer= 1046528.00 kB
CPU utilization: Wall time 2.507  CPU time 2.024  CPU utilization  80.73 %


Children see throughput for 12 readers  = 5805484.25 kB/sec
Parent sees throughput for 12 readers   = 5799535.68 kB/sec
Min throughput per process  = 482888.16 kB/sec 
Max throughput per process  = 48.16 kB/sec
Avg throughput per process  = 483790.35 kB/sec
Min xfer= 1045760.00 kB
CPU utilization: Wall time 2.167  CPU time 1.964  CPU utilization  90.63 %


Children see throughput for 12 re-readers   = 5812227.16 kB/sec
Parent sees throughput for 12 re-readers= 5803793.06 kB/sec
Min throughput per process  = 483242.97 kB/sec 
Max throughput per process  = 485724.41 kB/sec
Avg throughput per process  = 484352.26 kB/sec
Min xfer= 1043456.00 kB
CPU utilization: Wall time 2.161  CPU time 1.976  CPU utilization  91.45 %

I've included a simple-minded implementation of a map_sg op for
the Intel IOMMU. This is nothing more than a copy of the loop in
__iommu_map_sg() with the call to __iommu_map() replaced with a
call to intel_iommu_map().
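
For reference, that op has roughly this shape (a sketch of what is
described above, not necessarily the exact hunk in the VT-d patch):

static int intel_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
			      struct scatterlist *sg, unsigned int nents,
			      int prot, gfp_t gfp, size_t *mapped)
{
	unsigned int i = 0;
	phys_addr_t start;
	size_t len = 0;
	int ret;

	*mapped = 0;
	while (i <= nents) {
		phys_addr_t s_phys = sg_phys(sg);

		/* Map the current contiguous run before starting a new one. */
		if (len && s_phys != start + len) {
			ret = intel_iommu_map(domain, iova + *mapped, start,
					      len, prot, gfp);
			if (ret)
				return ret;

			*mapped += len;
			len = 0;
		}

		if (len) {
			len += sg->length;
		} else {
			len = sg->length;
			start = s_phys;
		}

		if (++i < nents)
			sg = sg_next(sg);
	}

	return 0;
}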

---

Chuck Lever (1):
  iommu/vt-d: Introduce map_sg() for Intel IOMMUs

Isaac J. Manjarres (5):
  iommu/io-pgtable: Introduce map_sg() as a page table op
  iommu/io-pgtable-arm: Hook up map_sg()
  iommu/io-pgtable-arm-v7s: Hook up map_sg()
  iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers
  iommu/arm-smmu: Hook up map_sg()

Lu Baolu (1):
  iommu/vt-d: Add iotlb_sync_map callback

Yong Wu (2):
  iommu: Move iotlb_sync_map out from __iommu_map
  iommu: Add iova and size as parameters in iotlb_sync_map


 drivers/iommu/arm/arm-smmu/arm-smmu.c |  19 
 drivers/iommu/intel/iommu.c   | 131 --
 drivers/iommu/io-pgtable-arm-v7s.c|  90 ++
 drivers/iommu/io-pgtable-arm.c|  86 +
 drivers/iommu/iommu.c |  47 +++--
 drivers/iommu/tegra-gart.c|   7 +-
 include/linux/iommu.h |  16 +++-
 7 files changed, 353 insertions(+), 43 deletions(-)

--
Chuck Lever

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] of/device: Update dma_range_map only when dev has valid dma-ranges

2021-01-27 Thread Robin Murphy

On 2021-01-27 19:09, Rob Herring wrote:

On Wed, Jan 27, 2021 at 7:13 AM Robin Murphy  wrote:


[ + Christoph, Marek ]

On 2021-01-27 13:00, Paul Kocialkowski wrote:

Hi,

On Tue 19 Jan 21, 18:52, Yong Wu wrote:

The commit e0d072782c73 ("dma-mapping: introduce DMA range map,
supplanting dma_pfn_offset") always update dma_range_map even though it was
already set, like in the sunxi_mbus driver. the issue is reported at [1].
This patch avoid this(Updating it only when dev has valid dma-ranges).

Meanwhile, dma_range_map contains the devices' dma_ranges information,
This patch moves dma_range_map before of_iommu_configure. The iommu
driver may need to know the dma_address requirements of its iommu
consumer devices.


Just a gentle ping on this issue, it would be nice to have this fix merged
ASAP, in the next RC :)


Ack to that - Rob, Frank, do you want to take this through the OF tree,
or shall we take it through the DMA-mapping tree like the original culprit?


I've already got some fixes queued up and can take it.


Brilliant, thanks!


Suggested-by doesn't mean you are happy with the implementation. So
Acked-by or Reviewed-by?


It still feels slightly awkward to give a tag to say "yes, this is 
exactly what I suggested", but for the avoidance of doubt,


Reviewed-by: Robin Murphy 

Cheers,
Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] of/device: Update dma_range_map only when dev has valid dma-ranges

2021-01-27 Thread Robin Murphy

On 2021-01-27 19:07, Rob Herring wrote:

On Tue, Jan 19, 2021 at 4:52 AM Yong Wu  wrote:


The commit e0d072782c73 ("dma-mapping: introduce DMA range map,
supplanting dma_pfn_offset") always update dma_range_map even though it was
already set, like in the sunxi_mbus driver. the issue is reported at [1].
This patch avoid this(Updating it only when dev has valid dma-ranges).


only when dev *doesn't* have valid dma-ranges?


I believe the intent is to mean when a valid "dma-ranges" property is 
specified in DT. The implementation allows DT to take precedence even if 
platform code *has* already installed a dma_range_map, although we 
shouldn't expect that to ever happen (except perhaps in the wild early 
days of platform bringup).


Robin.


Meanwhile, dma_range_map contains the devices' dma_ranges information,
This patch moves dma_range_map before of_iommu_configure. The iommu
driver may need to know the dma_address requirements of its iommu
consumer devices.

[1] 
https://lore.kernel.org/linux-arm-kernel/5c7946f3-b56e-da00-a750-be097c7ce...@arm.com/

CC: Rob Herring 
CC: Frank Rowand 
Fixes: e0d072782c73 ("dma-mapping: introduce DMA range map, supplanting 
dma_pfn_offset"),
Suggested-by: Robin Murphy 
Signed-off-by: Yong Wu 
Signed-off-by: Paul Kocialkowski 
Reviewed-by: Rob Herring 
---
  drivers/of/device.c | 10 +++---
  1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/of/device.c b/drivers/of/device.c
index aedfaaafd3e7..1122daa8e273 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -162,9 +162,11 @@ int of_dma_configure_id(struct device *dev, struct 
device_node *np,
 mask = DMA_BIT_MASK(ilog2(end) + 1);
 dev->coherent_dma_mask &= mask;
 *dev->dma_mask &= mask;
-   /* ...but only set bus limit if we found valid dma-ranges earlier */
-   if (!ret)
+   /* ...but only set bus limit and range map if we found valid dma-ranges 
earlier */
+   if (!ret) {
 dev->bus_dma_limit = end;
+   dev->dma_range_map = map;
+   }

 coherent = of_dma_is_coherent(np);
 dev_dbg(dev, "device is%sdma coherent\n",
@@ -172,6 +174,9 @@ int of_dma_configure_id(struct device *dev, struct 
device_node *np,

 iommu = of_iommu_configure(dev, np, id);
 if (PTR_ERR(iommu) == -EPROBE_DEFER) {
+   /* Don't touch range map if it wasn't set from a valid 
dma-ranges */
+   if (!ret)
+   dev->dma_range_map = NULL;
 kfree(map);
 return -EPROBE_DEFER;
 }
@@ -181,7 +186,6 @@ int of_dma_configure_id(struct device *dev, struct 
device_node *np,

 arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);

-   dev->dma_range_map = map;
 return 0;
  }
  EXPORT_SYMBOL_GPL(of_dma_configure_id);
--
2.18.0


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] of/device: Update dma_range_map only when dev has valid dma-ranges

2021-01-27 Thread Rob Herring
On Wed, Jan 27, 2021 at 7:13 AM Robin Murphy  wrote:
>
> [ + Christoph, Marek ]
>
> On 2021-01-27 13:00, Paul Kocialkowski wrote:
> > Hi,
> >
> > On Tue 19 Jan 21, 18:52, Yong Wu wrote:
> >> The commit e0d072782c73 ("dma-mapping: introduce DMA range map,
> >> supplanting dma_pfn_offset") always update dma_range_map even though it was
> >> already set, like in the sunxi_mbus driver. the issue is reported at [1].
> >> This patch avoid this(Updating it only when dev has valid dma-ranges).
> >>
> >> Meanwhile, dma_range_map contains the devices' dma_ranges information,
> >> This patch moves dma_range_map before of_iommu_configure. The iommu
> >> driver may need to know the dma_address requirements of its iommu
> >> consumer devices.
> >
> > Just a gentle ping on this issue, it would be nice to have this fix merged
> > ASAP, in the next RC :)
>
> Ack to that - Rob, Frank, do you want to take this through the OF tree,
> or shall we take it through the DMA-mapping tree like the original culprit?

I've already got some fixes queued up and can take it.

Suggested-by doesn't mean you are happy with the implementation. So
Acked-by or Reviewed-by?

Rob
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] of/device: Update dma_range_map only when dev has valid dma-ranges

2021-01-27 Thread Rob Herring
On Tue, Jan 19, 2021 at 4:52 AM Yong Wu  wrote:
>
> The commit e0d072782c73 ("dma-mapping: introduce DMA range map,
> supplanting dma_pfn_offset") always update dma_range_map even though it was
> already set, like in the sunxi_mbus driver. the issue is reported at [1].
> This patch avoid this(Updating it only when dev has valid dma-ranges).

only when dev *doesn't* have valid dma-ranges?

> Meanwhile, dma_range_map contains the devices' dma_ranges information,
> This patch moves dma_range_map before of_iommu_configure. The iommu
> driver may need to know the dma_address requirements of its iommu
> consumer devices.
>
> [1] 
> https://lore.kernel.org/linux-arm-kernel/5c7946f3-b56e-da00-a750-be097c7ce...@arm.com/
>
> CC: Rob Herring 
> CC: Frank Rowand 
> Fixes: e0d072782c73 ("dma-mapping: introduce DMA range map, supplanting 
> dma_pfn_offset"),
> Suggested-by: Robin Murphy 
> Signed-off-by: Yong Wu 
> Signed-off-by: Paul Kocialkowski 
> Reviewed-by: Rob Herring 
> ---
>  drivers/of/device.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/of/device.c b/drivers/of/device.c
> index aedfaaafd3e7..1122daa8e273 100644
> --- a/drivers/of/device.c
> +++ b/drivers/of/device.c
> @@ -162,9 +162,11 @@ int of_dma_configure_id(struct device *dev, struct 
> device_node *np,
> mask = DMA_BIT_MASK(ilog2(end) + 1);
> dev->coherent_dma_mask &= mask;
> *dev->dma_mask &= mask;
> -   /* ...but only set bus limit if we found valid dma-ranges earlier */
> -   if (!ret)
> +   /* ...but only set bus limit and range map if we found valid 
> dma-ranges earlier */
> +   if (!ret) {
> dev->bus_dma_limit = end;
> +   dev->dma_range_map = map;
> +   }
>
> coherent = of_dma_is_coherent(np);
> dev_dbg(dev, "device is%sdma coherent\n",
> @@ -172,6 +174,9 @@ int of_dma_configure_id(struct device *dev, struct 
> device_node *np,
>
> iommu = of_iommu_configure(dev, np, id);
> if (PTR_ERR(iommu) == -EPROBE_DEFER) {
> +   /* Don't touch range map if it wasn't set from a valid 
> dma-ranges */
> +   if (!ret)
> +   dev->dma_range_map = NULL;
> kfree(map);
> return -EPROBE_DEFER;
> }
> @@ -181,7 +186,6 @@ int of_dma_configure_id(struct device *dev, struct 
> device_node *np,
>
> arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
>
> -   dev->dma_range_map = map;
> return 0;
>  }
>  EXPORT_SYMBOL_GPL(of_dma_configure_id);
> --
> 2.18.0
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] iommu/vt-d: do not use flush-queue when caching-mode is on

2021-01-27 Thread Nadav Amit
> On Jan 27, 2021, at 3:25 AM, Lu Baolu  wrote:
> 
> On 2021/1/27 14:17, Nadav Amit wrote:
>> From: Nadav Amit 
>> When an Intel IOMMU is virtualized, and a physical device is
>> passed-through to the VM, changes of the virtual IOMMU need to be
>> propagated to the physical IOMMU. The hypervisor therefore needs to
>> monitor PTE mappings in the IOMMU page-tables. Intel specifications
>> provide "caching-mode" capability that a virtual IOMMU uses to report
>> that the IOMMU is virtualized and a TLB flush is needed after mapping to
>> allow the hypervisor to propagate virtual IOMMU mappings to the physical
>> IOMMU. To the best of my knowledge no real physical IOMMU reports
>> "caching-mode" as turned on.
>> Synchronizing the virtual and the physical IOMMU tables is expensive if
>> the hypervisor is unaware which PTEs have changed, as the hypervisor is
>> required to walk all the virtualized tables and look for changes.
>> Consequently, domain flushes are much more expensive than page-specific
>> flushes on virtualized IOMMUs with passthrough devices. The kernel
>> therefore exploited the "caching-mode" indication to avoid domain
>> flushing and use page-specific flushing in virtualized environments. See
>> commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching
>> mode.")
>> This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use
>> of iova deferred flushing"). Now, when batched TLB flushing is used (the
>> default), full TLB domain flushes are performed frequently, requiring
>> the hypervisor to perform expensive synchronization between the virtual
>> TLB and the physical one.
>> Getting batched TLB flushes to use in such circumstances page-specific
>> invalidations again is not easy, since the TLB invalidation scheme
>> assumes that "full" domain TLB flushes are performed for scalability.
>> Disable batched TLB flushes when caching-mode is on, as the performance
>> benefit from using batched TLB invalidations is likely to be much
>> smaller than the overhead of the virtual-to-physical IOMMU page-tables
>> synchronization.
>> Fixes: 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching mode.")
> 
> Isn't it
> 
> Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing")
> 
> ?

Of course it is - bad copy-paste. I will send v3.

Thanks again,
Nadav
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3] iommu/vt-d: do not use flush-queue when caching-mode is on

2021-01-27 Thread Nadav Amit
From: Nadav Amit 

When an Intel IOMMU is virtualized, and a physical device is
passed-through to the VM, changes of the virtual IOMMU need to be
propagated to the physical IOMMU. The hypervisor therefore needs to
monitor PTE mappings in the IOMMU page-tables. Intel specifications
provide "caching-mode" capability that a virtual IOMMU uses to report
that the IOMMU is virtualized and a TLB flush is needed after mapping to
allow the hypervisor to propagate virtual IOMMU mappings to the physical
IOMMU. To the best of my knowledge no real physical IOMMU reports
"caching-mode" as turned on.

Synchronizing the virtual and the physical IOMMU tables is expensive if
the hypervisor is unaware which PTEs have changed, as the hypervisor is
required to walk all the virtualized tables and look for changes.
Consequently, domain flushes are much more expensive than page-specific
flushes on virtualized IOMMUs with passthrough devices. The kernel
therefore exploited the "caching-mode" indication to avoid domain
flushing and use page-specific flushing in virtualized environments. See
commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching
mode.")

This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use
of iova deferred flushing"). Now, when batched TLB flushing is used (the
default), full TLB domain flushes are performed frequently, requiring
the hypervisor to perform expensive synchronization between the virtual
TLB and the physical one.

Getting batched TLB flushes to use page-specific invalidations again in
such circumstances is not easy, since the TLB invalidation scheme
assumes that "full" domain TLB flushes are performed for scalability.

Disable batched TLB flushes when caching-mode is on, as the performance
benefit from using batched TLB invalidations is likely to be much
smaller than the overhead of the virtual-to-physical IOMMU page-tables
synchronization.
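
For context, the attribute is consumed when the DMA layer decides whether
to set up deferred flushing; roughly (a sketch, not the exact dma-iommu
code, and the iovad/callback names are placeholders):

	int attr = 0;

	if (!iommu_domain_get_attr(domain, DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
				   &attr) && attr)
		/* Batched (deferred) IOTLB invalidation is acceptable. */
		init_iova_flush_queue(iovad, flush_all_cb, entry_dtor_cb);
	/* else: strict mode, every unmap is invalidated synchronously. */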

Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Nadav Amit 
Cc: David Woodhouse 
Cc: Lu Baolu 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: sta...@vger.kernel.org

---
v2->v3:

* Fix the fixes tag in the commit-log (Lu).
* Minor English rephrasing of the commit-log.

v1->v2:

* disable flush queue for all domains if caching-mode is on for any
  IOMMU (Lu).
---
 drivers/iommu/intel/iommu.c | 32 +++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 788119c5b021..de3dd617cf60 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5373,6 +5373,36 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
return ret;
 }
 
+static bool domain_use_flush_queue(void)
+{
+   struct dmar_drhd_unit *drhd;
+   struct intel_iommu *iommu;
+   bool r = true;
+
+   if (intel_iommu_strict)
+   return false;
+
+   /*
+* The flush queue implementation does not perform page-selective
+* invalidations that are required for efficient TLB flushes in virtual
+* environments. The benefit of batching is likely to be much lower than
+* the overhead of synchronizing the virtual and physical IOMMU
+* page-tables.
+*/
+   rcu_read_lock();
+   for_each_active_iommu(iommu, drhd) {
+   if (!cap_caching_mode(iommu->cap))
+   continue;
+
+   pr_warn_once("IOMMU batching is disabled due to virtualization");
+   r = false;
+   break;
+   }
+   rcu_read_unlock();
+
+   return r;
+}
+
 static int
 intel_iommu_domain_get_attr(struct iommu_domain *domain,
enum iommu_attr attr, void *data)
@@ -5383,7 +5413,7 @@ intel_iommu_domain_get_attr(struct iommu_domain *domain,
case IOMMU_DOMAIN_DMA:
switch (attr) {
case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
-   *(int *)data = !intel_iommu_strict;
+   *(int *)data = domain_use_flush_queue();
return 0;
default:
return -ENODEV;
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/2] iommu/io-pgtable: Remove TLBI_ON_MAP quirk

2021-01-27 Thread Robin Murphy
IO_PGTABLE_QUIRK_TLBI_ON_MAP is now fully superseded by the
core API's iotlb_sync_map callback.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 8 +---
 include/linux/io-pgtable.h | 5 -
 2 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 1d92ac948db7..3a92f06aad43 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -535,12 +535,7 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, 
unsigned long iova,
 * Synchronise all PTE updates for the new mapping before there's
 * a chance for anything to kick off a table walk for the new iova.
 */
-   if (iop->cfg.quirks & IO_PGTABLE_QUIRK_TLBI_ON_MAP) {
-   io_pgtable_tlb_flush_walk(iop, iova, size,
- ARM_V7S_BLOCK_SIZE(2));
-   } else {
-   wmb();
-   }
+   wmb();
 
return ret;
 }
@@ -759,7 +754,6 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
 
if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
IO_PGTABLE_QUIRK_NO_PERMS |
-   IO_PGTABLE_QUIRK_TLBI_ON_MAP |
IO_PGTABLE_QUIRK_ARM_MTK_EXT |
IO_PGTABLE_QUIRK_NON_STRICT))
return NULL;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 2a5686ca2ba3..06753323a15e 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -68,10 +68,6 @@ struct io_pgtable_cfg {
 *  hardware which does not implement the permissions of a given
 *  format, and/or requires some format-specific default value.
 *
-* IO_PGTABLE_QUIRK_TLBI_ON_MAP: If the format forbids caching invalid
-*  (unmapped) entries but the hardware might do so anyway, perform
-*  TLB maintenance when mapping as well as when unmapping.
-*
 * IO_PGTABLE_QUIRK_ARM_MTK_EXT: (ARM v7s format) MediaTek IOMMUs extend
 *  to support up to 34 bits PA where the bit32 and bit33 are
 *  encoded in the bit9 and bit4 of the PTE respectively.
@@ -88,7 +84,6 @@ struct io_pgtable_cfg {
 */
#define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
#define IO_PGTABLE_QUIRK_NO_PERMS   BIT(1)
-   #define IO_PGTABLE_QUIRK_TLBI_ON_MAPBIT(2)
#define IO_PGTABLE_QUIRK_ARM_MTK_EXTBIT(3)
#define IO_PGTABLE_QUIRK_NON_STRICT BIT(4)
#define IO_PGTABLE_QUIRK_ARM_TTBR1  BIT(5)
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/2] iommu/msm: Hook up iotlb_sync_map

2021-01-27 Thread Robin Murphy
The core API can now accommodate invalidate-on-map style behaviour in a
single efficient call, so hook that up instead of having io-pgtable do
it piecemeal.

Signed-off-by: Robin Murphy 
---

Just a little quick cleanup on top of the Mediatek TLB patches now
queued - CC'ing a bunch of folks from the Git history in the hope that
someone can still test MSM... :)

 drivers/iommu/msm_iommu.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 040e85f70861..f0ba6a09b434 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -343,7 +343,6 @@ static int msm_iommu_domain_config(struct msm_priv *priv)
spin_lock_init(&priv->pgtlock);
 
priv->cfg = (struct io_pgtable_cfg) {
-   .quirks = IO_PGTABLE_QUIRK_TLBI_ON_MAP,
.pgsize_bitmap = msm_iommu_ops.pgsize_bitmap,
.ias = 32,
.oas = 32,
@@ -490,6 +489,14 @@ static int msm_iommu_map(struct iommu_domain *domain, 
unsigned long iova,
return ret;
 }
 
+static void msm_iommu_sync_map(struct iommu_domain *domain, unsigned long iova,
+  size_t size)
+{
+   struct msm_priv *priv = to_msm_priv(domain);
+
+   __flush_iotlb_range(iova, size, SZ_4K, false, priv);
+}
+
 static size_t msm_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
  size_t len, struct iommu_iotlb_gather *gather)
 {
@@ -680,6 +687,7 @@ static struct iommu_ops msm_iommu_ops = {
 * kick starting the other master.
 */
.iotlb_sync = NULL,
+   .iotlb_sync_map = msm_iommu_sync_map,
.iova_to_phys = msm_iommu_iova_to_phys,
.probe_device = msm_iommu_probe_device,
.release_device = msm_iommu_release_device,

base-commit: 0954d61a59e3c014e52b8d938bc12dc5a2e4949c
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RESEND] dma-mapping: benchmark: fix kernel crash when dma_map_single fails

2021-01-27 Thread Christoph Hellwig
Thanks,

applied.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 5/6] media: uvcvideo: Use dma_alloc_noncontiguos API

2021-01-27 Thread . Christoph Hellwig
On Wed, Jan 27, 2021 at 12:29:08AM +0100, Ricardo Ribalda wrote:
> - Is there any platform where dma_alloc_noncontiguos can fail?
> This is, !ops->alloc_noncontiguous and !dev->coherent_dma_mask
> If yes then we need to add a function to let the driver know in
> advance that it has to use the coherent allocator (usb_alloc_coherent
> for uvc)

dev->coherent_dma_mask is set by the driver.  So the only reason why
dma_alloc_noncontiguous will fail is because it can't
allocate any memory.

> - In dma_alloc_noncontiguos, on the dma_alloc_pages fallback. If we
> have a device where the dma happens in only one direction, could not
> get more performance with DMA_FROM/TO_DEVICE instead of
> DMA_BIDIRECTIONAL ?

Yes, we could probably do that.

> 
> 
> Then I have tried to use the API, and I have encountered a problem: on
> uvcvideo the device passed to the memory allocator is different for
> DMA_PAGES and NON_CONTIGUOUS:
> https://github.com/ribalda/linux/blob/042cd497739f71c8d4a83a67ee970369e2baca4a/drivers/media/usb/uvc/uvc_video.c#L1236
> 
> I need to dig a bit tomorrow to figure out why this is, I have
> hardware to test both paths, so it should not be too difficult.

I always found the USB dma alloc API a little weird, but we might have
to follow the scheme of the usb coherent wrappers there.
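
For illustration, the driver-side pattern under discussion would look
roughly like this; the signatures follow the proposed dma_alloc_noncontiguous()
API and should be treated as assumptions, since the interface was still
being revised at this point:

/* Rough sketch only; not uvcvideo code. */
static void *my_alloc_stream_buffer(struct device *dma_dev, size_t size,
				    struct sg_table **out_sgt)
{
	struct sg_table *sgt;
	void *vaddr;

	/* Non-contiguous pages, DMA-mapped once; the implementation is
	 * expected to fall back internally when no IOMMU is available. */
	sgt = dma_alloc_noncontiguous(dma_dev, size, DMA_FROM_DEVICE,
				      GFP_KERNEL, 0);
	if (!sgt)
		return NULL;

	/* CPU mapping so the driver can parse the payload. */
	vaddr = dma_vmap_noncontiguous(dma_dev, size, sgt);
	if (!vaddr) {
		dma_free_noncontiguous(dma_dev, size, sgt, DMA_FROM_DEVICE);
		return NULL;
	}

	*out_sgt = sgt;
	return vaddr;
}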
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v12 10/10] iommu/arm-smmu-v3: Add stall support for platform devices

2021-01-27 Thread Jean-Philippe Brucker
The SMMU provides a Stall model for handling page faults in platform
devices. It is similar to PCIe PRI, but doesn't require devices to have
their own translation cache. Instead, faulting transactions are parked
and the OS is given a chance to fix the page tables and retry the
transaction.

Enable stall for devices that support it (opt-in by firmware). When an
event corresponds to a translation error, call the IOMMU fault handler.
If the fault is recoverable, it will call us back to terminate or
continue the stall.

To use stall, device drivers need to enable IOMMU_DEV_FEAT_IOPF, which
initializes the fault queue for the device.

Tested-by: Zhangfei Gao 
Reviewed-by: Jonathan Cameron 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  43 
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  59 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 189 +-
 3 files changed, 276 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 7b15b7580c6e..59af0bbd2f7b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -354,6 +354,13 @@
 #define CMDQ_PRI_1_GRPID   GENMASK_ULL(8, 0)
 #define CMDQ_PRI_1_RESPGENMASK_ULL(13, 12)
 
+#define CMDQ_RESUME_0_RESP_TERM0UL
+#define CMDQ_RESUME_0_RESP_RETRY   1UL
+#define CMDQ_RESUME_0_RESP_ABORT   2UL
+#define CMDQ_RESUME_0_RESP GENMASK_ULL(13, 12)
+#define CMDQ_RESUME_0_SID  GENMASK_ULL(63, 32)
+#define CMDQ_RESUME_1_STAG GENMASK_ULL(15, 0)
+
 #define CMDQ_SYNC_0_CS GENMASK_ULL(13, 12)
 #define CMDQ_SYNC_0_CS_NONE0
 #define CMDQ_SYNC_0_CS_IRQ 1
@@ -370,6 +377,25 @@
 
 #define EVTQ_0_ID  GENMASK_ULL(7, 0)
 
+#define EVT_ID_TRANSLATION_FAULT   0x10
+#define EVT_ID_ADDR_SIZE_FAULT 0x11
+#define EVT_ID_ACCESS_FAULT0x12
+#define EVT_ID_PERMISSION_FAULT0x13
+
+#define EVTQ_0_SSV (1UL << 11)
+#define EVTQ_0_SSIDGENMASK_ULL(31, 12)
+#define EVTQ_0_SID GENMASK_ULL(63, 32)
+#define EVTQ_1_STAGGENMASK_ULL(15, 0)
+#define EVTQ_1_STALL   (1UL << 31)
+#define EVTQ_1_PnU (1UL << 33)
+#define EVTQ_1_InD (1UL << 34)
+#define EVTQ_1_RnW (1UL << 35)
+#define EVTQ_1_S2  (1UL << 39)
+#define EVTQ_1_CLASS   GENMASK_ULL(41, 40)
+#define EVTQ_1_TT_READ (1UL << 44)
+#define EVTQ_2_ADDRGENMASK_ULL(63, 0)
+#define EVTQ_3_IPA GENMASK_ULL(51, 12)
+
 /* PRI queue */
 #define PRIQ_ENT_SZ_SHIFT  4
 #define PRIQ_ENT_DWORDS((1 << PRIQ_ENT_SZ_SHIFT) >> 3)
@@ -464,6 +490,13 @@ struct arm_smmu_cmdq_ent {
enum pri_resp   resp;
} pri;
 
+   #define CMDQ_OP_RESUME  0x44
+   struct {
+   u32 sid;
+   u16 stag;
+   u8  resp;
+   } resume;
+
#define CMDQ_OP_CMD_SYNC0x46
struct {
u64 msiaddr;
@@ -522,6 +555,7 @@ struct arm_smmu_cmdq_batch {
 
 struct arm_smmu_evtq {
struct arm_smmu_queue   q;
+   struct iopf_queue   *iopf;
u32 max_stalls;
 };
 
@@ -659,7 +693,9 @@ struct arm_smmu_master {
struct arm_smmu_stream  *streams;
unsigned int    num_streams;
bool    ats_enabled;
bool    stall_enabled;
bool    sva_enabled;
bool    iopf_enabled;
struct list_head    bonds;
unsigned int    ssid_bits;
 };
@@ -678,6 +714,7 @@ struct arm_smmu_domain {
 
struct io_pgtable_ops   *pgtbl_ops;
bool    non_strict;
bool    stall_enabled;
atomic_t    nr_ats_masters;
 
enum arm_smmu_domain_stage  stage;
@@ -719,6 +756,7 @@ bool arm_smmu_master_sva_supported(struct arm_smmu_master 
*master);
 bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master);
 int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
 int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
+bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
 struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm,
void *drvdata);
 void 

[PATCH v12 09/10] ACPI/IORT: Enable stall support for platform devices

2021-01-27 Thread Jean-Philippe Brucker
Copy the "Stall supported" bit, that tells whether a named component
supports stall, into the dma-can-stall device property.

Acked-by: Jonathan Cameron 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/acpi/arm64/iort.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index c9a8bbb74b09..42820d7eb869 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -968,13 +968,15 @@ static int iort_pci_iommu_init(struct pci_dev *pdev, u16 
alias, void *data)
 static void iort_named_component_init(struct device *dev,
  struct acpi_iort_node *node)
 {
-   struct property_entry props[2] = {};
+   struct property_entry props[3] = {};
struct acpi_iort_named_component *nc;
 
nc = (struct acpi_iort_named_component *)node->node_data;
props[0] = PROPERTY_ENTRY_U32("pasid-num-bits",
  FIELD_GET(ACPI_IORT_NC_PASID_BITS,
nc->node_flags));
+   if (nc->node_flags & ACPI_IORT_NC_STALL_SUPPORTED)
+   props[1] = PROPERTY_ENTRY_BOOL("dma-can-stall");
 
if (device_add_properties(dev, props))
dev_warn(dev, "Could not add device properties\n");
-- 
2.30.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v12 07/10] iommu/arm-smmu-v3: Maintain a SID->device structure

2021-01-27 Thread Jean-Philippe Brucker
When handling faults from the event or PRI queue, we need to find the
struct device associated with a SID. Add a rb_tree to keep track of
SIDs.

Acked-by: Jonathan Cameron 
Reviewed-by: Eric Auger 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  13 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 161 
 2 files changed, 144 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index f985817c967a..7b15b7580c6e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -639,6 +639,15 @@ struct arm_smmu_device {
 
/* IOMMU core code handle */
struct iommu_device iommu;
+
+   struct rb_root  streams;
+   struct mutex    streams_mutex;
+};
+
+struct arm_smmu_stream {
+   u32 id;
+   struct arm_smmu_master  *master;
+   struct rb_node  node;
 };
 
 /* SMMU private data for each master */
@@ -647,8 +656,8 @@ struct arm_smmu_master {
struct device   *dev;
struct arm_smmu_domain  *domain;
struct list_headdomain_head;
-   u32 *sids;
-   unsigned int    num_sids;
+   struct arm_smmu_stream  *streams;
+   unsigned int    num_streams;
bool    ats_enabled;
bool    sva_enabled;
struct list_headbonds;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 88dd9feb32f4..3afec6ed8075 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -909,8 +909,8 @@ static void arm_smmu_sync_cd(struct arm_smmu_domain 
*smmu_domain,
 
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
list_for_each_entry(master, &smmu_domain->devices, domain_head) {
-   for (i = 0; i < master->num_sids; i++) {
-   cmd.cfgi.sid = master->sids[i];
+   for (i = 0; i < master->num_streams; i++) {
+   cmd.cfgi.sid = master->streams[i].id;
arm_smmu_cmdq_batch_add(smmu, &cmds, &cmd);
}
}
@@ -1355,6 +1355,32 @@ static int arm_smmu_init_l2_strtab(struct 
arm_smmu_device *smmu, u32 sid)
return 0;
 }
 
+__maybe_unused
+static struct arm_smmu_master *
+arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
+{
+   struct rb_node *node;
+   struct arm_smmu_stream *stream;
+   struct arm_smmu_master *master = NULL;
+
+   mutex_lock(&smmu->streams_mutex);
+   node = smmu->streams.rb_node;
+   while (node) {
+   stream = rb_entry(node, struct arm_smmu_stream, node);
+   if (stream->id < sid) {
+   node = node->rb_right;
+   } else if (stream->id > sid) {
+   node = node->rb_left;
+   } else {
+   master = stream->master;
+   break;
+   }
+   }
+   mutex_unlock(&smmu->streams_mutex);
+
+   return master;
+}
+
 /* IRQ and event handlers */
 static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 {
@@ -1588,8 +1614,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master 
*master)
 
arm_smmu_atc_inv_to_cmd(0, 0, 0, &cmd);
 
-   for (i = 0; i < master->num_sids; i++) {
-   cmd.atc.sid = master->sids[i];
+   for (i = 0; i < master->num_streams; i++) {
+   cmd.atc.sid = master->streams[i].id;
arm_smmu_cmdq_issue_cmd(master->smmu, &cmd);
}
 
@@ -1632,8 +1658,8 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain 
*smmu_domain, int ssid,
if (!master->ats_enabled)
continue;
 
-   for (i = 0; i < master->num_sids; i++) {
-   cmd.atc.sid = master->sids[i];
+   for (i = 0; i < master->num_streams; i++) {
+   cmd.atc.sid = master->streams[i].id;
arm_smmu_cmdq_batch_add(smmu_domain->smmu, &cmds, &cmd);
}
}
@@ -2065,13 +2091,13 @@ static void arm_smmu_install_ste_for_dev(struct 
arm_smmu_master *master)
int i, j;
struct arm_smmu_device *smmu = master->smmu;
 
-   for (i = 0; i < master->num_sids; ++i) {
-   u32 sid = master->sids[i];
+   for (i = 0; i < master->num_streams; ++i) {
+   u32 sid = master->streams[i].id;
__le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
 
/* Bridged PCI devices may end up with duplicated IDs */
for (j = 0; j < i; j++)
-   if (master->sids[j] == sid)
+   if 

[PATCH v12 08/10] dt-bindings: document stall property for IOMMU masters

2021-01-27 Thread Jean-Philippe Brucker
On ARM systems, some platform devices behind an IOMMU may support stall,
which is the ability to recover from page faults. Let the firmware tell us
when a device supports stall.

Reviewed-by: Rob Herring 
Signed-off-by: Jean-Philippe Brucker 
---
 .../devicetree/bindings/iommu/iommu.txt| 18 ++
 1 file changed, 18 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt 
b/Documentation/devicetree/bindings/iommu/iommu.txt
index 3c36334e4f94..26ba9e530f13 100644
--- a/Documentation/devicetree/bindings/iommu/iommu.txt
+++ b/Documentation/devicetree/bindings/iommu/iommu.txt
@@ -92,6 +92,24 @@ Optional properties:
   tagging DMA transactions with an address space identifier. By default,
   this is 0, which means that the device only has one address space.
 
+- dma-can-stall: When present, the master can wait for a transaction to
+  complete for an indefinite amount of time. Upon translation fault some
+  IOMMUs, instead of aborting the translation immediately, may first
+  notify the driver and keep the transaction in flight. This allows the OS
+  to inspect the fault and, for example, make physical pages resident
+  before updating the mappings and completing the transaction. Such an
+  IOMMU accepts a limited number of simultaneous stalled transactions before
+  having to either put back-pressure on the master, or abort new faulting
+  transactions.
+
+  Firmware has to opt in to stalling, because most buses and masters don't
+  support it. In particular it isn't compatible with PCI, where
+  transactions have to complete before a time limit. More generally it
+  won't work in systems and masters that haven't been designed for
+  stalling. For example the OS, in order to handle a stalled transaction,
+  may attempt to retrieve pages from secondary storage in a stalled
+  domain, leading to a deadlock.
+
 
 Notes:
 ==
-- 
2.30.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v12 06/10] iommu: Add a page fault handler

2021-01-27 Thread Jean-Philippe Brucker
Some systems allow devices to handle I/O Page Faults in the core mm. For
example systems implementing the PCIe PRI extension or Arm SMMU stall
model. Infrastructure for reporting these recoverable page faults was
added to the IOMMU core by commit 0c830e6b3282 ("iommu: Introduce device
fault report API"). Add a page fault handler for host SVA.

An IOMMU driver can now instantiate several fault workqueues and link them
to IOPF-capable devices. Drivers can choose between a single global
workqueue, one per IOMMU device, one per low-level fault queue, one per
domain, etc.

When it receives a fault event, most commonly in an IRQ handler, the
IOMMU driver reports the fault using iommu_report_device_fault(), which
calls the registered handler. The page fault handler then calls the mm
fault handler, and reports either success or failure with
iommu_page_response(). After the handler succeeds, the hardware retries
the access.

The iopf_param pointer could be embedded into iommu_fault_param. But
putting iopf_param into the iommu_param structure allows us not to care
about ordering between calls to iopf_queue_add_device() and
iommu_register_device_fault_handler().
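
For illustration, the reporting path in an IOMMU driver looks roughly like
this (sketch only: the hardware-event decoding helpers are made up, while
iommu_report_device_fault() and the iommu_fault structures are the existing
core API):

static int my_iommu_handle_stall_evt(struct device *dev, u64 *hw_evt)
{
	struct iommu_fault_event fault_evt = {
		.fault = {
			.type = IOMMU_FAULT_PAGE_REQ,
			.prm = {
				.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
				.grpid = my_evt_group_id(hw_evt),  /* assumed helper */
				.perm  = IOMMU_FAULT_PERM_READ,
				.addr  = my_evt_address(hw_evt),   /* assumed helper */
			},
		},
	};

	/* Queues the fault on the device's IOPF workqueue; the handler
	 * eventually answers through iommu_page_response(). */
	return iommu_report_device_fault(dev, &fault_evt);
}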

Reviewed-by: Jonathan Cameron 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/Makefile|   1 +
 drivers/iommu/iommu-sva-lib.h |  53 
 include/linux/iommu.h |   2 +
 drivers/iommu/io-pgfault.c| 461 ++
 4 files changed, 517 insertions(+)
 create mode 100644 drivers/iommu/io-pgfault.c

diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 61bd30cd8369..60fafc23dee6 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -28,3 +28,4 @@ obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
 obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
 obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o
+obj-$(CONFIG_IOMMU_SVA_LIB) += io-pgfault.o
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
index b40990aef3fd..031155010ca8 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -12,4 +12,57 @@ int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t 
min, ioasid_t max);
 void iommu_sva_free_pasid(struct mm_struct *mm);
 struct mm_struct *iommu_sva_find(ioasid_t pasid);
 
+/* I/O Page fault */
+struct device;
+struct iommu_fault;
+struct iopf_queue;
+
+#ifdef CONFIG_IOMMU_SVA_LIB
+int iommu_queue_iopf(struct iommu_fault *fault, void *cookie);
+
+int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev);
+int iopf_queue_remove_device(struct iopf_queue *queue,
+struct device *dev);
+int iopf_queue_flush_dev(struct device *dev);
+struct iopf_queue *iopf_queue_alloc(const char *name);
+void iopf_queue_free(struct iopf_queue *queue);
+int iopf_queue_discard_partial(struct iopf_queue *queue);
+
+#else /* CONFIG_IOMMU_SVA_LIB */
+static inline int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
+{
+   return -ENODEV;
+}
+
+static inline int iopf_queue_add_device(struct iopf_queue *queue,
+   struct device *dev)
+{
+   return -ENODEV;
+}
+
+static inline int iopf_queue_remove_device(struct iopf_queue *queue,
+  struct device *dev)
+{
+   return -ENODEV;
+}
+
+static inline int iopf_queue_flush_dev(struct device *dev)
+{
+   return -ENODEV;
+}
+
+static inline struct iopf_queue *iopf_queue_alloc(const char *name)
+{
+   return NULL;
+}
+
+static inline void iopf_queue_free(struct iopf_queue *queue)
+{
+}
+
+static inline int iopf_queue_discard_partial(struct iopf_queue *queue)
+{
+   return -ENODEV;
+}
+#endif /* CONFIG_IOMMU_SVA_LIB */
 #endif /* _IOMMU_SVA_LIB_H */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 00348e4c3c26..edc9be443a74 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -366,6 +366,7 @@ struct iommu_fault_param {
  * struct dev_iommu - Collection of per-device IOMMU data
  *
  * @fault_param: IOMMU detected device fault reporting data
+ * @iopf_param: I/O Page Fault queue and data
  * @fwspec: IOMMU fwspec data
  * @iommu_dev:  IOMMU device this device is linked to
  * @priv:   IOMMU Driver private data
@@ -376,6 +377,7 @@ struct iommu_fault_param {
 struct dev_iommu {
struct mutex lock;
struct iommu_fault_param*fault_param;
+   struct iopf_device_param*iopf_param;
struct iommu_fwspec *fwspec;
struct iommu_device *iommu_dev;
void*priv;
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
new file mode 100644
index ..1df8c1dcae77
--- /dev/null
+++ b/drivers/iommu/io-pgfault.c
@@ -0,0 +1,461 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Handle device page faults
+ *
+ * Copyright (C) 2020 ARM Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+

[PATCH v12 05/10] uacce: Enable IOMMU_DEV_FEAT_IOPF

2021-01-27 Thread Jean-Philippe Brucker
The IOPF (I/O Page Fault) feature is now enabled independently from the
SVA feature, because some IOPF implementations are device-specific and
do not require IOMMU support for PCIe PRI or Arm SMMU stall.

Enable IOPF unconditionally when enabling SVA for now. In the future, if
a device driver implementing a uacce interface doesn't need IOPF
support, it will need to tell the uacce module, for example with a new
flag.

Acked-by: Zhangfei Gao 
Signed-off-by: Jean-Philippe Brucker 
---
Cc: Arnd Bergmann 
Cc: Greg Kroah-Hartman 
Cc: Zhangfei Gao 
Cc: Zhou Wang 
---
 drivers/misc/uacce/uacce.c | 39 +-
 1 file changed, 30 insertions(+), 9 deletions(-)

diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
index d07af4edfcac..6db7a98486ec 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -385,6 +385,33 @@ static void uacce_release(struct device *dev)
kfree(uacce);
 }
 
+static unsigned int uacce_enable_sva(struct device *parent, unsigned int flags)
+{
+   if (!(flags & UACCE_DEV_SVA))
+   return flags;
+
+   flags &= ~UACCE_DEV_SVA;
+
+   if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_IOPF))
+   return flags;
+
+   if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA)) {
+   iommu_dev_disable_feature(parent, IOMMU_DEV_FEAT_IOPF);
+   return flags;
+   }
+
+   return flags | UACCE_DEV_SVA;
+}
+
+static void uacce_disable_sva(struct uacce_device *uacce)
+{
+   if (!(uacce->flags & UACCE_DEV_SVA))
+   return;
+
+   iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
+   iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
+}
+
 /**
  * uacce_alloc() - alloc an accelerator
  * @parent: pointer of uacce parent device
@@ -404,11 +431,7 @@ struct uacce_device *uacce_alloc(struct device *parent,
if (!uacce)
return ERR_PTR(-ENOMEM);
 
-   if (flags & UACCE_DEV_SVA) {
-   ret = iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA);
-   if (ret)
-   flags &= ~UACCE_DEV_SVA;
-   }
+   flags = uacce_enable_sva(parent, flags);
 
uacce->parent = parent;
uacce->flags = flags;
@@ -432,8 +455,7 @@ struct uacce_device *uacce_alloc(struct device *parent,
return uacce;
 
 err_with_uacce:
-   if (flags & UACCE_DEV_SVA)
-   iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
+   uacce_disable_sva(uacce);
kfree(uacce);
return ERR_PTR(ret);
 }
@@ -487,8 +509,7 @@ void uacce_remove(struct uacce_device *uacce)
mutex_unlock(&uacce->queues_lock);
 
/* disable sva now since no opened queues */
-   if (uacce->flags & UACCE_DEV_SVA)
-   iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
+   uacce_disable_sva(uacce);
 
if (uacce->cdev)
cdev_device_del(uacce->cdev, >dev);
-- 
2.30.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v12 04/10] iommu/vt-d: Support IOMMU_DEV_FEAT_IOPF

2021-01-27 Thread Jean-Philippe Brucker
Allow drivers to query and enable IOMMU_DEV_FEAT_IOPF, which amounts to
checking whether PRI is enabled.

Reviewed-by: Lu Baolu 
Signed-off-by: Jean-Philippe Brucker 
---
Cc: David Woodhouse 
Cc: Lu Baolu 
---
 drivers/iommu/intel/iommu.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f665322a0991..c777bd94df5d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5330,6 +5330,8 @@ static int siov_find_pci_dvsec(struct pci_dev *pdev)
 static bool
 intel_iommu_dev_has_feat(struct device *dev, enum iommu_dev_features feat)
 {
+   struct device_domain_info *info = get_domain_info(dev);
+
if (feat == IOMMU_DEV_FEAT_AUX) {
int ret;
 
@@ -5344,13 +5346,13 @@ intel_iommu_dev_has_feat(struct device *dev, enum 
iommu_dev_features feat)
return !!siov_find_pci_dvsec(to_pci_dev(dev));
}
 
-   if (feat == IOMMU_DEV_FEAT_SVA) {
-   struct device_domain_info *info = get_domain_info(dev);
+   if (feat == IOMMU_DEV_FEAT_IOPF)
+   return info && info->pri_supported;
 
+   if (feat == IOMMU_DEV_FEAT_SVA)
return info && (info->iommu->flags & VTD_FLAG_SVM_CAPABLE) &&
info->pasid_supported && info->pri_supported &&
info->ats_supported;
-   }
 
return false;
 }
@@ -5361,6 +5363,9 @@ intel_iommu_dev_enable_feat(struct device *dev, enum 
iommu_dev_features feat)
if (feat == IOMMU_DEV_FEAT_AUX)
return intel_iommu_enable_auxd(dev);
 
+   if (feat == IOMMU_DEV_FEAT_IOPF)
+   return intel_iommu_dev_has_feat(dev, feat) ? 0 : -ENODEV;
+
if (feat == IOMMU_DEV_FEAT_SVA) {
struct device_domain_info *info = get_domain_info(dev);
 
-- 
2.30.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v12 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA

2021-01-27 Thread Jean-Philippe Brucker
Some devices manage I/O Page Faults (IOPF) themselves instead of relying
on PCIe PRI or Arm SMMU stall. Allow their drivers to enable SVA without
mandating IOMMU-managed IOPF. The other device drivers now need to first
enable IOMMU_DEV_FEAT_IOPF before enabling IOMMU_DEV_FEAT_SVA. Enabling
IOMMU_DEV_FEAT_IOPF on its own doesn't have any effect visible to the
device driver, it is used in combination with other features.
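
The expected calling order for a driver that relies on IOMMU-managed IOPF,
sketched for illustration (error handling trimmed):

static int my_driver_enable_sva(struct device *dev)
{
	int ret;

	/* Enable IOPF first, SVA second; disable in the reverse order. */
	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_IOPF);
	if (ret)
		return ret;

	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA);
	if (ret)
		iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_IOPF);

	return ret;
}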

Signed-off-by: Jean-Philippe Brucker 
---
Cc: Arnd Bergmann 
Cc: David Woodhouse 
Cc: Greg Kroah-Hartman 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Cc: Will Deacon 
Cc: Zhangfei Gao 
Cc: Zhou Wang 
---
 include/linux/iommu.h | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b7ea11fc1a93..00348e4c3c26 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -156,10 +156,24 @@ struct iommu_resv_region {
enum iommu_resv_typetype;
 };
 
-/* Per device IOMMU features */
+/**
+ * enum iommu_dev_features - Per device IOMMU features
+ * @IOMMU_DEV_FEAT_AUX: Auxiliary domain feature
+ * @IOMMU_DEV_FEAT_SVA: Shared Virtual Addresses
+ * @IOMMU_DEV_FEAT_IOPF: I/O Page Faults such as PRI or Stall. Generally
+ *  enabling %IOMMU_DEV_FEAT_SVA requires
+ *  %IOMMU_DEV_FEAT_IOPF, but some devices manage I/O Page
+ *  Faults themselves instead of relying on the IOMMU. When
+ *  supported, this feature must be enabled before and
+ *  disabled after %IOMMU_DEV_FEAT_SVA.
+ *
+ * Device drivers query whether a feature is supported using
+ * iommu_dev_has_feature(), and enable it using iommu_dev_enable_feature().
+ */
 enum iommu_dev_features {
-   IOMMU_DEV_FEAT_AUX, /* Aux-domain feature */
-   IOMMU_DEV_FEAT_SVA, /* Shared Virtual Addresses */
+   IOMMU_DEV_FEAT_AUX,
+   IOMMU_DEV_FEAT_SVA,
+   IOMMU_DEV_FEAT_IOPF,
 };
 
 #define IOMMU_PASID_INVALID(-1U)
-- 
2.30.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v12 01/10] iommu: Fix comment for struct iommu_fwspec

2021-01-27 Thread Jean-Philippe Brucker
Commit 986d5ecc5699 ("iommu: Move fwspec->iommu_priv to struct
dev_iommu") removed iommu_priv from fwspec and commit 5702ee24182f
("ACPI/IORT: Check ATS capability in root complex nodes") added @flags.
Update the struct doc.

Acked-by: Jonathan Cameron 
Signed-off-by: Jean-Philippe Brucker 
---
 include/linux/iommu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b3f0e2018c62..bdf3f34a4457 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -570,7 +570,7 @@ struct iommu_group *fsl_mc_device_group(struct device *dev);
  * struct iommu_fwspec - per-device IOMMU instance data
  * @ops: ops for this device's IOMMU
  * @iommu_fwnode: firmware handle for this device's IOMMU
- * @iommu_priv: IOMMU driver private data for this device
+ * @flags: IOMMU_FWSPEC_* flags
  * @num_pasid_bits: number of PASID bits supported by this device
  * @num_ids: number of associated device IDs
  * @ids: IDs which this device may present to the IOMMU
-- 
2.30.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v12 02/10] iommu/arm-smmu-v3: Use device properties for pasid-num-bits

2021-01-27 Thread Jean-Philippe Brucker
The pasid-num-bits property shouldn't need a dedicated fwspec field,
it's a job for device properties. Add properties for IORT, and access
the number of PASID bits using device_property_read_u32().

Suggested-by: Robin Murphy 
Acked-by: Jonathan Cameron 
Signed-off-by: Jean-Philippe Brucker 
---
 include/linux/iommu.h   |  2 --
 drivers/acpi/arm64/iort.c   | 13 +++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  3 ++-
 drivers/iommu/of_iommu.c|  5 -
 4 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index bdf3f34a4457..b7ea11fc1a93 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -571,7 +571,6 @@ struct iommu_group *fsl_mc_device_group(struct device *dev);
  * @ops: ops for this device's IOMMU
  * @iommu_fwnode: firmware handle for this device's IOMMU
  * @flags: IOMMU_FWSPEC_* flags
- * @num_pasid_bits: number of PASID bits supported by this device
  * @num_ids: number of associated device IDs
  * @ids: IDs which this device may present to the IOMMU
  */
@@ -579,7 +578,6 @@ struct iommu_fwspec {
const struct iommu_ops  *ops;
struct fwnode_handle*iommu_fwnode;
u32 flags;
-   u32 num_pasid_bits;
unsigned intnum_ids;
u32 ids[];
 };
diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index d4eac6d7e9fb..c9a8bbb74b09 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -968,15 +968,16 @@ static int iort_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
 static void iort_named_component_init(struct device *dev,
  struct acpi_iort_node *node)
 {
+   struct property_entry props[2] = {};
struct acpi_iort_named_component *nc;
-   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-
-   if (!fwspec)
-   return;
 
nc = (struct acpi_iort_named_component *)node->node_data;
-   fwspec->num_pasid_bits = FIELD_GET(ACPI_IORT_NC_PASID_BITS,
-  nc->node_flags);
+   props[0] = PROPERTY_ENTRY_U32("pasid-num-bits",
+ FIELD_GET(ACPI_IORT_NC_PASID_BITS,
+   nc->node_flags));
+
+   if (device_add_properties(dev, props))
+   dev_warn(dev, "Could not add device properties\n");
 }
 
 static int iort_nc_iommu_map(struct device *dev, struct acpi_iort_node *node)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index baebaac34a83..88dd9feb32f4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2392,7 +2392,8 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
}
}
 
-   master->ssid_bits = min(smmu->ssid_bits, fwspec->num_pasid_bits);
+   device_property_read_u32(dev, "pasid-num-bits", &master->ssid_bits);
+   master->ssid_bits = min(smmu->ssid_bits, master->ssid_bits);
 
/*
 * Note that PASID must be enabled before, and disabled after ATS:
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index e505b9130a1c..a9d2df001149 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -210,11 +210,6 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
 of_pci_iommu_init, );
} else {
err = of_iommu_configure_device(master_np, dev, id);
-
-   fwspec = dev_iommu_fwspec_get(dev);
-   if (!err && fwspec)
-   of_property_read_u32(master_np, "pasid-num-bits",
-        &fwspec->num_pasid_bits);
}
 
/*
-- 
2.30.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v12 00/10] iommu: I/O page faults for SMMUv3

2021-01-27 Thread Jean-Philippe Brucker
Add stall support to the SMMUv3, along with a common I/O Page Fault
handler.

Since v11 I added a sanity check in arm_smmu_dev_disable_feature()
(patch 10): return -EBUSY if a device driver attempts to disable the IOPF
feature before SVA, since doing so would leak the IOPF param and fault handler.

v11: 
https://lore.kernel.org/linux-iommu/20210125110650.3232195-1-jean-phili...@linaro.org/
v10: 
https://lore.kernel.org/linux-iommu/20210121123623.2060416-1-jean-phili...@linaro.org/
v9: 
https://lore.kernel.org/linux-iommu/20210108145217.2254447-1-jean-phili...@linaro.org/

Jean-Philippe Brucker (10):
  iommu: Fix comment for struct iommu_fwspec
  iommu/arm-smmu-v3: Use device properties for pasid-num-bits
  iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
  iommu/vt-d: Support IOMMU_DEV_FEAT_IOPF
  uacce: Enable IOMMU_DEV_FEAT_IOPF
  iommu: Add a page fault handler
  iommu/arm-smmu-v3: Maintain a SID->device structure
  dt-bindings: document stall property for IOMMU masters
  ACPI/IORT: Enable stall support for platform devices
  iommu/arm-smmu-v3: Add stall support for platform devices

 drivers/iommu/Makefile|   1 +
 .../devicetree/bindings/iommu/iommu.txt   |  18 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  56 ++-
 drivers/iommu/iommu-sva-lib.h |  53 ++
 include/linux/iommu.h |  26 +-
 drivers/acpi/arm64/iort.c |  15 +-
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  59 ++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 351 +++--
 drivers/iommu/intel/iommu.c   |  11 +-
 drivers/iommu/io-pgfault.c| 461 ++
 drivers/iommu/of_iommu.c  |   5 -
 drivers/misc/uacce/uacce.c|  39 +-
 12 files changed, 1021 insertions(+), 74 deletions(-)
 create mode 100644 drivers/iommu/io-pgfault.c

-- 
2.30.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 00/13] iommu/amd: Add Generic IO Page Table Framework Support

2021-01-27 Thread Suravee Suthikulpanit




On 1/27/21 7:06 PM, Joerg Roedel wrote:

Hi Suravee,

On Tue, Dec 15, 2020 at 01:36:52AM -0600, Suravee Suthikulpanit wrote:
  

Suravee Suthikulpanit (13):
   iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
   iommu/amd: Prepare for generic IO page table framework
   iommu/amd: Move pt_root to struct amd_io_pgtable
   iommu/amd: Convert to using amd_io_pgtable
   iommu/amd: Declare functions as extern
   iommu/amd: Move IO page table related functions
   iommu/amd: Restructure code for freeing page table
   iommu/amd: Remove amd_iommu_domain_get_pgtable
   iommu/amd: Rename variables to be consistent with struct
 io_pgtable_ops
   iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
   iommu/amd: Introduce iommu_v1_iova_to_phys
   iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
   iommu/amd: Adopt IO page table framework for AMD IOMMU v1 page table


Applied this series, thanks for the work! Given testing goes well you
can consider this queued for 5.12.

Thanks,

Joerg



Thanks Joerg and Will, and welcome back!!!

Suravee
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 0/7] MediaTek IOMMU improve tlb flush performance in map/unmap

2021-01-27 Thread Will Deacon
On Thu, 7 Jan 2021 20:29:02 +0800, Yong Wu wrote:
> This patchset improves TLB flushing performance in iommu_map/unmap
> for the MediaTek IOMMU.
> 
> For iommu_map, the MediaTek IOMMU currently uses IO_PGTABLE_QUIRK_TLBI_ON_MAP
> to do a TLB flush for each memory chunk, which is unnecessary. We can
> improve this by flushing the TLB once at the end of iommu_map.
> 
> [...]
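
(For reference, the deferred-flush idea quoted above maps onto the iommu_ops
callbacks roughly as follows; this is a simplified sketch with illustrative
example_* placeholders, not the actual MediaTek code.)

/*
 * .map only updates the page tables; the core then calls .iotlb_sync_map
 * once for the whole range (using the iova/size parameters added by patch 2),
 * so a single range invalidation replaces one flush per mapped chunk.
 * example_pgtable_map() and example_flush_range() are placeholders.
 */
static int example_map(struct iommu_domain *domain, unsigned long iova,
		       phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
{
	/* Write page-table entries only; no TLB maintenance here. */
	return example_pgtable_map(domain, iova, paddr, size, prot, gfp);
}

static void example_iotlb_sync_map(struct iommu_domain *domain,
				   unsigned long iova, size_t size)
{
	/* One range invalidation covering everything just mapped. */
	example_flush_range(domain, iova, size);
}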

After discussion with Joerg, I'll queue this (and hopefully the next posting
of your IOMMU driver) along with the Arm SMMU patches, and then send that
all together.

Applied to will (for-joerg/mtk), thanks!

[1/7] iommu: Move iotlb_sync_map out from __iommu_map
  https://git.kernel.org/arm64/c/d8c1df02ac7f
[2/7] iommu: Add iova and size as parameters in iotlb_sync_map
  https://git.kernel.org/arm64/c/2ebbd25873ce
[3/7] iommu/mediatek: Add iotlb_sync_map to sync whole the iova range
  https://git.kernel.org/arm64/c/20143451eff0
[4/7] iommu: Switch gather->end to the inclusive end
  https://git.kernel.org/arm64/c/862c3715de8f
[5/7] iommu/io-pgtable: Allow io_pgtable_tlb ops optional
  https://git.kernel.org/arm64/c/77e0992aee4e
[6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  https://git.kernel.org/arm64/c/f21ae3b10084
[7/7] iommu/mediatek: Remove the tlb-ops for v7s
  https://git.kernel.org/arm64/c/0954d61a59e3

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] of/device: Update dma_range_map only when dev has valid dma-ranges

2021-01-27 Thread Robin Murphy

[ + Christoph, Marek ]

On 2021-01-27 13:00, Paul Kocialkowski wrote:

Hi,

On Tue 19 Jan 21, 18:52, Yong Wu wrote:

Commit e0d072782c73 ("dma-mapping: introduce DMA range map,
supplanting dma_pfn_offset") always updates dma_range_map even though it was
already set, as in the sunxi_mbus driver. The issue is reported at [1].
This patch avoids that by updating it only when the device has valid dma-ranges.

Meanwhile, dma_range_map contains the device's dma-ranges information, so
this patch moves the dma_range_map assignment before of_iommu_configure().
The IOMMU driver may need to know the DMA address requirements of its IOMMU
consumer devices.


Just a gentle ping on this issue, it would be nice to have this fix merged
ASAP, in the next RC :)


Ack to that - Rob, Frank, do you want to take this through the OF tree, 
or shall we take it through the DMA-mapping tree like the original culprit?


Thanks,
Robin.



Cheers,

Paul


[1] 
https://lore.kernel.org/linux-arm-kernel/5c7946f3-b56e-da00-a750-be097c7ce...@arm.com/

CC: Rob Herring 
CC: Frank Rowand 
Fixes: e0d072782c73 ("dma-mapping: introduce DMA range map, supplanting 
dma_pfn_offset"),
Suggested-by: Robin Murphy 
Signed-off-by: Yong Wu 
Signed-off-by: Paul Kocialkowski 
Reviewed-by: Rob Herring 
---
  drivers/of/device.c | 10 +++---
  1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/of/device.c b/drivers/of/device.c
index aedfaaafd3e7..1122daa8e273 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -162,9 +162,11 @@ int of_dma_configure_id(struct device *dev, struct device_node *np,
mask = DMA_BIT_MASK(ilog2(end) + 1);
dev->coherent_dma_mask &= mask;
*dev->dma_mask &= mask;
-   /* ...but only set bus limit if we found valid dma-ranges earlier */
-   if (!ret)
+   /* ...but only set bus limit and range map if we found valid dma-ranges earlier */
+   if (!ret) {
dev->bus_dma_limit = end;
+   dev->dma_range_map = map;
+   }
  
  	coherent = of_dma_is_coherent(np);

dev_dbg(dev, "device is%sdma coherent\n",
@@ -172,6 +174,9 @@ int of_dma_configure_id(struct device *dev, struct device_node *np,
  
  	iommu = of_iommu_configure(dev, np, id);

if (PTR_ERR(iommu) == -EPROBE_DEFER) {
+   /* Don't touch range map if it wasn't set from a valid dma-ranges */
+   if (!ret)
+   dev->dma_range_map = NULL;
kfree(map);
return -EPROBE_DEFER;
}
@@ -181,7 +186,6 @@ int of_dma_configure_id(struct device *dev, struct device_node *np,
  
  	arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
  
-	dev->dma_range_map = map;

return 0;
  }
  EXPORT_SYMBOL_GPL(of_dma_configure_id);
--
2.18.0




___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] of/device: Update dma_range_map only when dev has valid dma-ranges

2021-01-27 Thread Paul Kocialkowski
Hi,

On Tue 19 Jan 21, 18:52, Yong Wu wrote:
> Commit e0d072782c73 ("dma-mapping: introduce DMA range map,
> supplanting dma_pfn_offset") always updates dma_range_map even though it was
> already set, as in the sunxi_mbus driver. The issue is reported at [1].
> This patch avoids that by updating it only when the device has valid dma-ranges.
> 
> Meanwhile, dma_range_map contains the device's dma-ranges information, so
> this patch moves the dma_range_map assignment before of_iommu_configure().
> The IOMMU driver may need to know the DMA address requirements of its IOMMU
> consumer devices.

Just a gentle ping on this issue, it would be nice to have this fix merged
ASAP, in the next RC :)

Cheers,

Paul

> [1] 
> https://lore.kernel.org/linux-arm-kernel/5c7946f3-b56e-da00-a750-be097c7ce...@arm.com/
> 
> CC: Rob Herring 
> CC: Frank Rowand 
> Fixes: e0d072782c73 ("dma-mapping: introduce DMA range map, supplanting 
> dma_pfn_offset"),
> Suggested-by: Robin Murphy 
> Signed-off-by: Yong Wu 
> Signed-off-by: Paul Kocialkowski 
> Reviewed-by: Rob Herring 
> ---
>  drivers/of/device.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/of/device.c b/drivers/of/device.c
> index aedfaaafd3e7..1122daa8e273 100644
> --- a/drivers/of/device.c
> +++ b/drivers/of/device.c
> @@ -162,9 +162,11 @@ int of_dma_configure_id(struct device *dev, struct device_node *np,
>   mask = DMA_BIT_MASK(ilog2(end) + 1);
>   dev->coherent_dma_mask &= mask;
>   *dev->dma_mask &= mask;
> - /* ...but only set bus limit if we found valid dma-ranges earlier */
> - if (!ret)
> + /* ...but only set bus limit and range map if we found valid dma-ranges earlier */
> + if (!ret) {
>   dev->bus_dma_limit = end;
> + dev->dma_range_map = map;
> + }
>  
>   coherent = of_dma_is_coherent(np);
>   dev_dbg(dev, "device is%sdma coherent\n",
> @@ -172,6 +174,9 @@ int of_dma_configure_id(struct device *dev, struct device_node *np,
>  
>   iommu = of_iommu_configure(dev, np, id);
>   if (PTR_ERR(iommu) == -EPROBE_DEFER) {
> + /* Don't touch range map if it wasn't set from a valid dma-ranges */
> + if (!ret)
> + dev->dma_range_map = NULL;
>   kfree(map);
>   return -EPROBE_DEFER;
>   }
> @@ -181,7 +186,6 @@ int of_dma_configure_id(struct device *dev, struct device_node *np,
>  
>   arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
>  
> - dev->dma_range_map = map;
>   return 0;
>  }
>  EXPORT_SYMBOL_GPL(of_dma_configure_id);
> -- 
> 2.18.0
> 

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v1 2/2] iommu: add Unisoc iommu basic driver

2021-01-27 Thread Chunyan Zhang
On Fri, 22 Jan 2021 at 05:46, Robin Murphy  wrote:
>
> On 2021-01-21 11:23, Chunyan Zhang wrote:
> > From: Chunyan Zhang 
> >
> > This patch only adds display iommu support, the driver was tested with sprd
> > dpu and image codec processor.
> >
> > The iommu support for others would be added once finished tests with those
> > devices, such as a few signal processors, including VSP(video),
> > GSP(graphic), ISP(image), and camera CPP, etc.
> >
> > Signed-off-by: Chunyan Zhang 
> > ---
> >   drivers/iommu/Kconfig  |  12 +
> >   drivers/iommu/Makefile |   1 +
> >   drivers/iommu/sprd-iommu.c | 566 +
> >   3 files changed, 579 insertions(+)
> >   create mode 100644 drivers/iommu/sprd-iommu.c
> >
> > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> > index 192ef8f61310..79af62c519ae 100644
> > --- a/drivers/iommu/Kconfig
> > +++ b/drivers/iommu/Kconfig
> > @@ -408,4 +408,16 @@ config VIRTIO_IOMMU
> >
> > Say Y here if you intend to run this kernel as a guest.
> >
> > +config SPRD_IOMMU
> > + tristate "Unisoc IOMMU Support"
> > + depends on ARCH_SPRD || COMPILE_TEST
> > + select IOMMU_API
> > + help
> > +   Support for IOMMU on Unisoc's SoCs on which multi-media subsystems
> > +   need IOMMU, such as DPU, Image codec(jpeg) processor, and a few
> > +   signal processors, including VSP(video), GSP(graphic), ISP(image), 
> > and
> > +   CPP, etc.
> > +
> > +   Say Y here if you want multi-media functions.
> > +
> >   endif # IOMMU_SUPPORT
> > diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> > index 61bd30cd8369..5925b6af2123 100644
> > --- a/drivers/iommu/Makefile
> > +++ b/drivers/iommu/Makefile
> > @@ -28,3 +28,4 @@ obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
> >   obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
> >   obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
> >   obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o
> > +obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
> > diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
> > new file mode 100644
> > index ..44cde44017fa
> > --- /dev/null
> > +++ b/drivers/iommu/sprd-iommu.c
> > @@ -0,0 +1,566 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Unisoc IOMMU driver
> > + *
> > + * Copyright (C) 2020 Unisoc, Inc.
> > + * Author: Chunyan Zhang 
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
>
> You need <linux/dma-mapping.h> since you're using the DMA API.
>
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +/* SPRD IOMMU page is 4K size alignment */
> > +#define SPRD_IOMMU_PAGE_SHIFT12
> > +#define SPRD_IOMMU_PAGE_SIZE SZ_4K
> > +
> > +#define SPRD_EX_CFG  0x0
> > +#define SPRD_IOMMU_VAOR_BYPASS   BIT(4)
> > +#define SPRD_IOMMU_GATE_EN   BIT(1)
> > +#define SPRD_IOMMU_ENBIT(0)
> > +#define SPRD_EX_UPDATE   0x4
> > +#define SPRD_EX_FIRST_VPN0x8
> > +#define SPRD_EX_VPN_RANGE0xc
> > +#define SPRD_EX_FIRST_PPN0x10
> > +#define SPRD_EX_DEFAULT_PPN  0x14
> > +
> > +#define SPRD_IOMMU_VERSION   0x0
> > +#define SPRD_VERSION_MASKGENMASK(15, 8)
> > +#define SPRD_VERSION_SHIFT   0x8
> > +#define SPRD_VAU_CFG 0x4
> > +#define SPRD_VAU_UPDATE  0x8
> > +#define SPRD_VAU_AUTH_CFG0xc
> > +#define SPRD_VAU_FIRST_PPN   0x10
> > +#define SPRD_VAU_DEFAULT_PPN_RD  0x14
> > +#define SPRD_VAU_DEFAULT_PPN_WR  0x18
> > +#define SPRD_VAU_FIRST_VPN   0x1c
> > +#define SPRD_VAU_VPN_RANGE   0x20
> > +
> > +enum sprd_iommu_version {
> > + SPRD_IOMMU_EX,
> > + SPRD_IOMMU_VAU,
> > +};
> > +
> > +struct sprd_iommu_match_data {
> > + unsigned long reg_offset;
> > +};
> > +
> > +/*
> > + * struct sprd_iommu_device - high-level sprd iommu device representation,
> > + * including hardware information and configuration, also driver data, etc
> > + *
> > + * @mdata:   hardware configuration and information
> > + * @ver: sprd iommu device version
> > + * @prot_page:   protect page base address, data would be written to 
> > here
> > + *   while translation fault
> > + * @base:mapped base address for accessing registers
> > + * @dev: pointer to basic device structure
> > + * @iommu:   IOMMU core representation
> > + * @group:   IOMMU group
> > + */
> > +struct sprd_iommu_device {
> > + const struct sprd_iommu_match_data *mdata;
> > + enum sprd_iommu_version ver;
> > + u32 *prot_page_va;
> > + dma_addr_t  prot_page_pa;
> > + void __iomem*base;
> > + struct device   *dev;
> > + struct iommu_device iommu;
> > + struct iommu_group  *group;
> > +};
> > +
> > +struct sprd_iommu_domain {
> > + spinlock_t  pgtlock; /* lock for page table */
> > + struct iommu_domain domain;
> > + u32 *pgt_va; /* page table virtual address base */
> > + dma_addr_t  pgt_pa; /* 

Re: [PATCH v4 00/13] iommu/amd: Add Generic IO Page Table Framework Support

2021-01-27 Thread Joerg Roedel
Hi Suravee,

On Tue, Dec 15, 2020 at 01:36:52AM -0600, Suravee Suthikulpanit wrote:
 
> Suravee Suthikulpanit (13):
>   iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
>   iommu/amd: Prepare for generic IO page table framework
>   iommu/amd: Move pt_root to struct amd_io_pgtable
>   iommu/amd: Convert to using amd_io_pgtable
>   iommu/amd: Declare functions as extern
>   iommu/amd: Move IO page table related functions
>   iommu/amd: Restructure code for freeing page table
>   iommu/amd: Remove amd_iommu_domain_get_pgtable
>   iommu/amd: Rename variables to be consistent with struct
> io_pgtable_ops
>   iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
>   iommu/amd: Introduce iommu_v1_iova_to_phys
>   iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
>   iommu/amd: Adopt IO page table framework for AMD IOMMU v1 page table

Applied this series, thanks for the work! Given testing goes well you
can consider this queued for 5.12.

Thanks,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/amd: remove h from printk format specifier

2021-01-27 Thread Joerg Roedel
On Tue, Dec 15, 2020 at 01:30:21PM -0800, t...@redhat.com wrote:
>  drivers/iommu/amd/init.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied, thanks.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH -next] iommu: amd: Use DEFINE_SPINLOCK() for spinlock

2021-01-27 Thread Joerg Roedel
On Mon, Dec 28, 2020 at 09:51:12PM +0800, Zheng Yongjun wrote:
> spinlock can be initialized automatically with DEFINE_SPINLOCK()
> rather than explicitly calling spin_lock_init().
> 
> Signed-off-by: Zheng Yongjun 
> ---
>  drivers/iommu/amd/iommu_v2.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)

Applied, thanks.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH -next] iommu/amd/init: convert comma to semicolon

2021-01-27 Thread Joerg Roedel
On Mon, Dec 14, 2020 at 09:44:38PM +0800, Zheng Yongjun wrote:
> Replace a comma between expression statements by a semicolon.
> 
> Signed-off-by: Zheng Yongjun 
> ---
>  drivers/iommu/amd/init.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

What tree is this against? This code does not exist in v5.11-rc5.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 3/3] iommu/arm-smmu-v3: Reserving the entire SMMU register space

2021-01-27 Thread Zhen Lei
commit 52f3fab0067d ("iommu/arm-smmu-v3: Don't reserve implementation
defined register space") only reserves the basic SMMU register space, so the
ECMDQ register space is not covered and would have to be mapped separately.
Because the size of the ECMDQ resource is not fixed (it depends on
SMMU_IDR6.CMDQ_CONTROL_PAGE_LOG2NUMQ), handling its resource reservation
without conflicting with the PMCG is a bit more complicated.

Therefore, do not reserve the PMCG resources, and reserve the entire SMMU
register space instead.

Suggested-by: Robin Murphy 
Signed-off-by: Zhen Lei 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 --
 2 files changed, 4 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f04c55a7503c790..fde24403b06a9e3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3476,14 +3476,6 @@ static int arm_smmu_set_bus_ops(struct iommu_ops *ops)
return err;
 }
 
-static void __iomem *arm_smmu_ioremap(struct device *dev, resource_size_t start,
- resource_size_t size)
-{
-   struct resource res = DEFINE_RES_MEM(start, size);
-
-   return devm_ioremap_resource(dev, &res);
-}
-
 static int arm_smmu_device_probe(struct platform_device *pdev)
 {
int irq, ret;
@@ -3519,22 +3511,14 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
}
ioaddr = res->start;
 
-   /*
-* Don't map the IMPLEMENTATION DEFINED regions, since they may contain
-* the PMCG registers which are reserved by the PMU driver.
-*/
-   smmu->base = arm_smmu_ioremap(dev, ioaddr, ARM_SMMU_REG_SZ);
+   smmu->base = devm_ioremap_resource(dev, res);
if (IS_ERR(smmu->base))
return PTR_ERR(smmu->base);
 
-   if (arm_smmu_resource_size(smmu) > SZ_64K) {
-   smmu->page1 = arm_smmu_ioremap(dev, ioaddr + SZ_64K,
-  ARM_SMMU_REG_SZ);
-   if (IS_ERR(smmu->page1))
-   return PTR_ERR(smmu->page1);
-   } else {
+   if (arm_smmu_resource_size(smmu) > SZ_64K)
+   smmu->page1 = smmu->base + SZ_64K;
+   else
smmu->page1 = smmu->base;
-   }
 
/* Interrupt lines */
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index da525f46dab4fc1..63f2b476987d6ae 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -152,8 +152,6 @@
 #define ARM_SMMU_PRIQ_IRQ_CFG1 0xd8
 #define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc
 
-#define ARM_SMMU_REG_SZ0xe00
-
 /* Common MSI config fields */
 #define MSI_CFG0_ADDR_MASK GENMASK_ULL(51, 2)
 #define MSI_CFG2_SHGENMASK(5, 4)
-- 
1.8.3


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 1/3] perf/smmuv3: Don't reserve the PMCG register spaces

2021-01-27 Thread Zhen Lei
According to the SMMUv3 specification:
Each PMCG counter group is represented by one 4KB page (Page 0) with one
optional additional 4KB page (Page 1), both of which are at IMPLEMENTATION
DEFINED base addresses.

This means that the PMCG register spaces may be within the 64KB pages of
the SMMUv3 register space. When both the SMMU and PMCG drivers reserve
their own resources, a resource conflict occurs.

To avoid this conflict, don't reserve the PMCG regions.

Suggested-by: Robin Murphy 
Signed-off-by: Zhen Lei 
---
 drivers/perf/arm_smmuv3_pmu.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
index 74474bb322c3f26..e5e505a0804fe53 100644
--- a/drivers/perf/arm_smmuv3_pmu.c
+++ b/drivers/perf/arm_smmuv3_pmu.c
@@ -761,6 +761,29 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu *smmu_pmu)
dev_notice(smmu_pmu->dev, "option mask 0x%x\n", smmu_pmu->options);
 }
 
+static void __iomem *
+smmu_pmu_get_and_ioremap_resource(struct platform_device *pdev,
+ unsigned int index,
+ struct resource **res)
+{
+   void __iomem *base;
+   struct resource *r;
+
+   r = platform_get_resource(pdev, IORESOURCE_MEM, index);
+   if (!r) {
+   dev_err(&pdev->dev, "invalid resource\n");
+   return ERR_PTR(-EINVAL);
+   }
+   if (res)
+   *res = r;
+
+   base = devm_ioremap(&pdev->dev, r->start, resource_size(r));
+   if (!base)
+   return ERR_PTR(-ENOMEM);
+
+   return base;
+}
+
 static int smmu_pmu_probe(struct platform_device *pdev)
 {
struct smmu_pmu *smmu_pmu;
@@ -793,7 +816,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
.capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
};
 
-   smmu_pmu->reg_base = devm_platform_get_and_ioremap_resource(pdev, 0, &res_0);
+   smmu_pmu->reg_base = smmu_pmu_get_and_ioremap_resource(pdev, 0, &res_0);
if (IS_ERR(smmu_pmu->reg_base))
return PTR_ERR(smmu_pmu->reg_base);
 
@@ -801,7 +824,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
 
/* Determine if page 1 is present */
if (cfgr & SMMU_PMCG_CFGR_RELOC_CTRS) {
-   smmu_pmu->reloc_base = devm_platform_ioremap_resource(pdev, 1);
+   smmu_pmu->reloc_base = smmu_pmu_get_and_ioremap_resource(pdev, 1, NULL);
if (IS_ERR(smmu_pmu->reloc_base))
return PTR_ERR(smmu_pmu->reloc_base);
} else {
-- 
1.8.3


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 0/3] perf/smmuv3: Don't reserve the PMCG register spaces

2021-01-27 Thread Zhen Lei
v2 --> v3:
Patch 3 is updated because https://lkml.org/lkml/2021/1/22/532 has already been queued.

v1 --> v2:
According to Robin Murphy's suggestion: https://lkml.org/lkml/2021/1/20/470
Don't reserve the PMCG register spaces, and reserve the entire SMMU register 
space.

v1:
Since the PMCG may implement its register space (4KB Page0 and 4KB Page1)
within the SMMUv3 64KB Page0, a resource conflict occurs when the SMMUv3
driver reserves the 64KB Page0 resource in advance and the PMCG driver then
tries to reserve its Page0 and Page1 resources.

commit 52f3fab0067d6fa ("iommu/arm-smmu-v3: Don't reserve implementation
defined register space") reduced the resource reservation range of the SMMUv3
driver so that it only reserves the first 0xe00 bytes of the 64KB Page0, to
avoid the above-mentioned resource conflicts.

But SMMUv3.3 adds support for ECMDQ, whose register space is also implemented
in the SMMUv3 64KB Page0. This means we would need to build two separate
mappings. New features may be added in the future, and more independent
mappings may be required. A simple problem becomes complicated because the
user expects to map the entire SMMUv3 64KB Page0.

Therefore, the proper solution is: if the PMCG register resources are located
in the 64KB Page0 of the SMMU, the PMCG driver does not reserve them, since
the SMMUv3 driver has already reserved that region. Instead, the PMCG driver
only performs devm_ioremap() to ensure that it can work properly.
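
In code terms the difference is small. A minimal sketch of what the PMCG
probe path ends up doing (the helper name is made up for illustration; this
is not the actual driver code):

#include <linux/io.h>
#include <linux/ioport.h>
#include <linux/platform_device.h>

static void __iomem *pmcg_map_regs(struct platform_device *pdev,
				   struct resource *res)
{
	/*
	 * devm_ioremap_resource() would also call devm_request_mem_region()
	 * and therefore conflict with the SMMU driver's reservation of the
	 * same 64KB page:
	 *
	 *	return devm_ioremap_resource(&pdev->dev, res);
	 */

	/* Map without reserving, so both drivers can coexist. */
	return devm_ioremap(&pdev->dev, res->start, resource_size(res));
}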

Zhen Lei (3):
  perf/smmuv3: Don't reserve the PMCG register spaces
  perf/smmuv3: Add a MODULE_SOFTDEP() to indicate dependency on SMMU
  iommu/arm-smmu-v3: Reserving the entire SMMU register space

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 --
 drivers/perf/arm_smmuv3_pmu.c   | 28 ++--
 3 files changed, 30 insertions(+), 24 deletions(-)

-- 
1.8.3


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 2/3] perf/smmuv3: Add a MODULE_SOFTDEP() to indicate dependency on SMMU

2021-01-27 Thread Zhen Lei
MODULE_SOFTDEP() gives user space a hint about the module loading sequence, so
that when "modprobe arm_smmuv3_pmu" is executed, arm_smmu_v3.ko is
automatically loaded first.

Signed-off-by: Zhen Lei 
---
 drivers/perf/arm_smmuv3_pmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
index e5e505a0804fe53..9a305ac51208cd2 100644
--- a/drivers/perf/arm_smmuv3_pmu.c
+++ b/drivers/perf/arm_smmuv3_pmu.c
@@ -950,6 +950,7 @@ static void __exit arm_smmu_pmu_exit(void)
 module_exit(arm_smmu_pmu_exit);
 
 MODULE_DESCRIPTION("PMU driver for ARM SMMUv3 Performance Monitors Extension");
+MODULE_SOFTDEP("pre: arm_smmu_v3");
 MODULE_AUTHOR("Neil Leeder ");
 MODULE_AUTHOR("Shameer Kolothum ");
 MODULE_LICENSE("GPL v2");
-- 
1.8.3


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RESEND PATCH 1/1] iommu/amd: Remove unnecessary assignment

2021-01-27 Thread Joerg Roedel
On Thu, Dec 10, 2020 at 10:13:30AM +0800, Adrian Huang wrote:
>  drivers/iommu/amd/init.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)

Applied, thanks.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/6] IOMMU: Some more IOVA and core code tidy-up

2021-01-27 Thread Joerg Roedel
On Wed, Jan 06, 2021 at 09:35:05PM +0800, John Garry wrote:
> Just some tidy-up to IOVA and core code.
> 
> Based on v5.11-rc2
> 
> Differences to v1:
> - Add core IOMMU patches
> 
> John Garry (6):
>   iova: Make has_iova_flush_queue() private
>   iova: Delete copy_reserved_iova()
>   iova: Stop exporting some more functions
>   iommu: Stop exporting iommu_map_sg_atomic()
>   iommu: Delete iommu_domain_window_disable()
>   iommu: Delete iommu_dev_has_feature()

Applied, thanks.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] iommu/vt-d: do not use flush-queue when caching-mode is on

2021-01-27 Thread Lu Baolu

On 2021/1/27 14:17, Nadav Amit wrote:

From: Nadav Amit 

When an Intel IOMMU is virtualized, and a physical device is
passed-through to the VM, changes of the virtual IOMMU need to be
propagated to the physical IOMMU. The hypervisor therefore needs to
monitor PTE mappings in the IOMMU page-tables. Intel specifications
provide "caching-mode" capability that a virtual IOMMU uses to report
that the IOMMU is virtualized and a TLB flush is needed after mapping to
allow the hypervisor to propagate virtual IOMMU mappings to the physical
IOMMU. To the best of my knowledge no real physical IOMMU reports
"caching-mode" as turned on.

Synchronizing the virtual and the physical IOMMU tables is expensive if
the hypervisor is unaware which PTEs have changed, as the hypervisor is
required to walk all the virtualized tables and look for changes.
Consequently, domain flushes are much more expensive than page-specific
flushes on virtualized IOMMUs with passthrough devices. The kernel
therefore exploited the "caching-mode" indication to avoid domain
flushing and use page-specific flushing in virtualized environments. See
commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching
mode.")

This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use
of iova deferred flushing"). Now, when batched TLB flushing is used (the
default), full TLB domain flushes are performed frequently, requiring
the hypervisor to perform expensive synchronization between the virtual
TLB and the physical one.

Getting batched TLB flushes to use in such circumstances page-specific
invalidations again is not easy, since the TLB invalidation scheme
assumes that "full" domain TLB flushes are performed for scalability.

Disable batched TLB flushes when caching-mode is on, as the performance
benefit from using batched TLB invalidations is likely to be much
smaller than the overhead of the virtual-to-physical IOMMU page-tables
synchronization.

Fixes: 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching mode.")


Isn't it

Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing")

?

Best regards,
baolu


Signed-off-by: Nadav Amit 
Cc: David Woodhouse 
Cc: Lu Baolu 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: sta...@vger.kernel.org >
---
v1->v2:

* disable flush queue for all domains if caching-mode is on for any
   IOMMU (Lu).
---
  drivers/iommu/intel/iommu.c | 32 +++-
  1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 788119c5b021..de3dd617cf60 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5373,6 +5373,36 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
return ret;
  }
  
+static bool domain_use_flush_queue(void)
+{
+   struct dmar_drhd_unit *drhd;
+   struct intel_iommu *iommu;
+   bool r = true;
+
+   if (intel_iommu_strict)
+   return false;
+
+   /*
+* The flush queue implementation does not perform page-selective
+* invalidations that are required for efficient TLB flushes in virtual
+* environments. The benefit of batching is likely to be much lower than
+* the overhead of synchronizing the virtual and physical IOMMU
+* page-tables.
+*/
+   rcu_read_lock();
+   for_each_active_iommu(iommu, drhd) {
+   if (!cap_caching_mode(iommu->cap))
+   continue;
+
+   pr_warn_once("IOMMU batching is disabled due to 
virtualization");
+   r = false;
+   break;
+   }
+   rcu_read_unlock();
+
+   return r;
+}
+
  static int
  intel_iommu_domain_get_attr(struct iommu_domain *domain,
enum iommu_attr attr, void *data)
@@ -5383,7 +5413,7 @@ intel_iommu_domain_get_attr(struct iommu_domain *domain,
case IOMMU_DOMAIN_DMA:
switch (attr) {
case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
-   *(int *)data = !intel_iommu_strict;
+   *(int *)data = domain_use_flush_queue();
return 0;
default:
return -ENODEV;


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 07/33] iommu: Avoid reallocate default domain for a group

2021-01-27 Thread Yong Wu
On Tue, 2021-01-26 at 22:23 +, Will Deacon wrote:
> On Mon, Jan 11, 2021 at 07:18:48PM +0800, Yong Wu wrote:
> > If group->default_domain exists, avoid reallocate it.
> > 
> > In some iommu drivers, there may be several devices share a group. Avoid
> > realloc the default domain for this case.
> > 
> > Signed-off-by: Yong Wu 
> > ---
> >  drivers/iommu/iommu.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index 3d099a31ddca..f4b87e6abe80 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -266,7 +266,8 @@ int iommu_probe_device(struct device *dev)
> >  * support default domains, so the return value is not yet
> >  * checked.
> >  */
> > -   iommu_alloc_default_domain(group, dev);
> > +   if (!group->default_domain)
> > +   iommu_alloc_default_domain(group, dev);
> 
> I don't really get what this achieves, since iommu_alloc_default_domain()
> looks like this:
> 
> static int iommu_alloc_default_domain(struct iommu_group *group,
> struct device *dev)
> {
>   unsigned int type;
> 
>   if (group->default_domain)
>   return 0;
> 
>   ...
> 
> in which case, it should be fine?

Oh, sorry, I overlooked this; the current code is enough.
I will remove this patch and send the next version this week.
Thanks very much.

> 
> Will

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] iommu/arm-smmu-v3: add support for BBML

2021-01-27 Thread Robin Murphy

On 2021-01-27 07:36, Keqian Zhu wrote:



On 2021/1/27 10:01, Leizhen (ThunderTown) wrote:



On 2021/1/26 18:12, Will Deacon wrote:

On Mon, Jan 25, 2021 at 08:23:40PM +, Robin Murphy wrote:

Now we probably will need some degreee of BBML feature awareness for the
sake of SVA if and when we start using it for CPU pagetables, but I still
cannot see any need to consider it in io-pgtable.


Agreed; I don't think this is something that io-pgtable should have to care
about.

Hi,

I have a question here :-).
If the old table is not live, then the break procedure seems unnecessary. Do I 
miss something?


The MMU is allowed to prefetch translations at any time, so not 
following the proper update procedure could still potentially lead to a 
TLB conflict, even if there's no device traffic to worry about disrupting.
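
(For anyone following along, the "proper update procedure" here is the usual
break-before-make sequence; a rough sketch with hypothetical helpers, not
actual SMMU driver code:)

/* write_pte(), tlb_inv_range() and tlb_sync() are hypothetical stand-ins. */
static void bbm_replace_entry(u64 *ptep, u64 new_pte,
			      unsigned long iova, size_t size)
{
	write_pte(ptep, 0);		/* 1. break: invalidate the old entry */
	tlb_inv_range(iova, size);	/* 2. invalidate cached translations */
	tlb_sync();			/*    ...and wait for completion */
	write_pte(ptep, new_pte);	/* 3. make: install the new entry */
}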


Robin.


Thanks,
Keqian



Yes, SVA works in stall mode, and the failed device access requests are not
discarded.

Let me look for examples. The BBML usage scenario was described to me by a
former colleague.



Will

.




___
linux-arm-kernel mailing list
linux-arm-ker...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] iommu/arm-smmu-v3: Use DEFINE_RES_MEM() to simplify code

2021-01-27 Thread Leizhen (ThunderTown)



On 2021/1/27 17:23, Will Deacon wrote:
> On Wed, Jan 27, 2021 at 10:05:50AM +0800, Leizhen (ThunderTown) wrote:
>> I've sent another set of patches. https://lkml.org/lkml/2021/1/26/1065
>> If those patches are acceptable, then this one should be ignored.
> 
> I've already queued this one, so if you want me to drop it then you need to
> send me a revert.

Thanks. Since it's queued, keep it. I'll update the new patch set.

> 
> Will
> 
> .
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] iommu/arm-smmu-v3: Use DEFINE_RES_MEM() to simplify code

2021-01-27 Thread Will Deacon
On Wed, Jan 27, 2021 at 10:05:50AM +0800, Leizhen (ThunderTown) wrote:
> I've sent another set of patches. https://lkml.org/lkml/2021/1/26/1065
> If those patches are acceptable, then this one should be ignored.

I've already queued this one, so if you want me to drop it then you need to
send me a revert.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu