[RESEND PATCH v4 6/9] iommu/dma-iommu.c: Convert to use vm_map_pages()

2019-03-18 Thread Souptick Joarder
Convert to use vm_map_pages() to map a range of kernel
memory to a user vma.

Signed-off-by: Souptick Joarder 
---
 drivers/iommu/dma-iommu.c | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d19f3d6..bacebff 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -620,17 +620,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 
int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
 {
-   unsigned long uaddr = vma->vm_start;
-   unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   int ret = -ENXIO;
-
-   for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) {
-   ret = vm_insert_page(vma, uaddr, pages[i]);
-   if (ret)
-   break;
-   uaddr += PAGE_SIZE;
-   }
-   return ret;
+   return vm_map_pages(vma, pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
 }
 
 static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
-- 
1.9.1



[RESEND PATCH v4 1/9] mm: Introduce new vm_map_pages() and vm_map_pages_zero() API

2019-03-18 Thread Souptick Joarder
Previously, drivers had their own ways of mapping a range of
kernel pages/memory into a user vma, done by invoking
vm_insert_page() within a loop.

As this pattern is common across different drivers, it can be
generalized by creating new functions and using them across
the drivers.

vm_map_pages() is the API which can be used to map kernel
memory/pages in drivers which have considered vm_pgoff.

vm_map_pages_zero() is the API which can be used to map a range
of kernel memory/pages in drivers which have not considered
vm_pgoff; vm_pgoff is passed as 0 by default for those drivers.

We _could_ then at a later date "fix" these drivers which are using
vm_map_pages_zero() to behave according to the normal vm_pgoff
offsetting simply by removing the _zero suffix on the function
name and, if that causes regressions, it gives us an easy way to revert.

Tested on Rockchip hardware; the display is working, including talking
to Lima via PRIME.

Signed-off-by: Souptick Joarder 
Suggested-by: Russell King 
Suggested-by: Matthew Wilcox 
Reviewed-by: Mike Rapoport 
Tested-by: Heiko Stuebner 
---
 include/linux/mm.h |  4 +++
 mm/memory.c| 81 ++
 mm/nommu.c | 14 ++
 3 files changed, 99 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 80bb640..e0aaa73 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2565,6 +2565,10 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
unsigned long pfn, unsigned long size, pgprot_t);
 int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
+int vm_map_pages(struct vm_area_struct *vma, struct page **pages,
+   unsigned long num);
+int vm_map_pages_zero(struct vm_area_struct *vma, struct page **pages,
+   unsigned long num);
 vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn);
 vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/memory.c b/mm/memory.c
index e11ca9d..cad3e27 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1520,6 +1520,87 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(vm_insert_page);
 
+/*
+ * __vm_map_pages - maps a range of kernel pages into a user vma
+ * @vma: user vma to map to
+ * @pages: pointer to array of source kernel pages
+ * @num: number of pages in page array
+ * @offset: user's requested vm_pgoff
+ *
+ * This allows drivers to map a range of kernel pages into a user vma.
+ *
+ * Return: 0 on success and error code otherwise.
+ */
+static int __vm_map_pages(struct vm_area_struct *vma, struct page **pages,
+   unsigned long num, unsigned long offset)
+{
+   unsigned long count = vma_pages(vma);
+   unsigned long uaddr = vma->vm_start;
+   int ret, i;
+
+   /* Fail if the user requested offset is beyond the end of the object */
+   if (offset > num)
+   return -ENXIO;
+
+   /* Fail if the user requested size exceeds available object size */
+   if (count > num - offset)
+   return -ENXIO;
+
+   for (i = 0; i < count; i++) {
+   ret = vm_insert_page(vma, uaddr, pages[offset + i]);
+   if (ret < 0)
+   return ret;
+   uaddr += PAGE_SIZE;
+   }
+
+   return 0;
+}
+
+/**
+ * vm_map_pages - maps a range of kernel pages, starting at a non-zero offset
+ * @vma: user vma to map to
+ * @pages: pointer to array of source kernel pages
+ * @num: number of pages in page array
+ *
+ * Maps an object consisting of @num pages, catering for the user's
+ * requested vm_pgoff.
+ *
+ * If we fail to insert any page into the vma, the function will return
+ * immediately leaving any previously inserted pages present.  Callers
+ * from the mmap handler may immediately return the error as their caller
+ * will destroy the vma, removing any successfully inserted pages. Other
+ * callers should make their own arrangements for calling unmap_region().
+ *
+ * Context: Process context. Called by mmap handlers.
+ * Return: 0 on success and error code otherwise.
+ */
+int vm_map_pages(struct vm_area_struct *vma, struct page **pages,
+   unsigned long num)
+{
+   return __vm_map_pages(vma, pages, num, vma->vm_pgoff);
+}
+EXPORT_SYMBOL(vm_map_pages);
+
+/**
+ * vm_map_pages_zero - maps a range of kernel pages, starting at offset 0
+ * @vma: user vma to map to
+ * @pages: pointer to array of source kernel pages
+ * @num: number of pages in page array
+ *
+ * Similar to vm_map_pages(), except that it explicitly sets the offset
+ * to 0. This function is intended for the drivers that did not consider
+ * vm_pgoff.
+ *
+ * Context: Process context. Called by mmap handlers.
+ * Return: 0 on success and error code otherwise.
+ */
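
For illustration, a minimal sketch of a driver mmap handler built on the
new API; the mydrv_dev structure and its fields are hypothetical, not
part of this patch:

/*
 * Usage sketch only. Assume dev->pages is an array of dev->num_pages
 * kernel pages backing the buffer to be mapped.
 */
static int mydrv_mmap(struct file *file, struct vm_area_struct *vma)
{
	struct mydrv_dev *dev = file->private_data;

	/*
	 * Replaces the old open-coded vm_insert_page() loop; vm_pgoff
	 * and the requested vma size are validated inside vm_map_pages().
	 */
	return vm_map_pages(vma, dev->pages, dev->num_pages);
}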

[RESEND PATCH v4 0/9] mm: Use vm_map_pages() and vm_map_pages_zero() API

2019-03-18 Thread Souptick Joarder
Previously, drivers had their own ways of mapping a range of
kernel pages/memory into a user vma, done by invoking
vm_insert_page() within a loop.

As this pattern is common across different drivers, it can be
generalized by creating new functions and using them across
the drivers.

vm_map_pages() is the API which can be used to map kernel
memory/pages in drivers which have considered vm_pgoff.

vm_map_pages_zero() is the API which can be used to map a range
of kernel memory/pages in drivers which have not considered
vm_pgoff; vm_pgoff is passed as 0 by default for those drivers.

We _could_ then at a later date "fix" these drivers which are using
vm_map_pages_zero() to behave according to the normal vm_pgoff
offsetting simply by removing the _zero suffix on the function
name and, if that causes regressions, it gives us an easy way to revert.

Tested on Rockchip hardware; the display is working fine, including
talking to Lima via PRIME.

v1 -> v2:
Added a few Reviewed-by tags.

Updated the change log in [8/9].

In [7/9], vm_pgoff is treated in the V4L2 API as a 'cookie' to
select a buffer, not as an in-buffer offset, by design, and it
always wants to mmap a whole buffer from its beginning. After
discussing with Marek, additional changes were added so that
vm_map_pages() can be used instead of vm_map_pages_zero().

v2 -> v3:
Corrected the documentation as per review comment.

As suggested in v2, renamed the interfaces:
*vm_insert_range() -> vm_map_pages()* and
*vm_insert_range_buggy() -> vm_map_pages_zero()*.
As the interfaces were renamed, modified the code accordingly,
updated the change logs and modified the subject lines to use the
new names. There is no other change apart from the renaming and
using the new interfaces.

Patches [1/9] & [4/9] tested on Rockchip hardware.

v3 -> v4:
Fixed build warnings on patch [8/9] reported by kbuild test robot.

Souptick Joarder (9):
  mm: Introduce new vm_map_pages() and vm_map_pages_zero() API
  arm: mm: dma-mapping: Convert to use vm_map_pages()
  drivers/firewire/core-iso.c: Convert to use vm_map_pages_zero()
  drm/rockchip/rockchip_drm_gem.c: Convert to use vm_map_pages()
  drm/xen/xen_drm_front_gem.c: Convert to use vm_map_pages()
  iommu/dma-iommu.c: Convert to use vm_map_pages()
  videobuf2/videobuf2-dma-sg.c: Convert to use vm_map_pages()
  xen/gntdev.c: Convert to use vm_map_pages()
  xen/privcmd-buf.c: Convert to use vm_map_pages_zero()

 arch/arm/mm/dma-mapping.c  | 22 ++
 drivers/firewire/core-iso.c| 15 +---
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c| 17 +
 drivers/gpu/drm/xen/xen_drm_front_gem.c| 18 ++---
 drivers/iommu/dma-iommu.c  | 12 +---
 drivers/media/common/videobuf2/videobuf2-core.c|  7 ++
 .../media/common/videobuf2/videobuf2-dma-contig.c  |  6 --
 drivers/media/common/videobuf2/videobuf2-dma-sg.c  | 22 ++
 drivers/xen/gntdev.c   | 11 ++-
 drivers/xen/privcmd-buf.c  |  8 +--
 include/linux/mm.h |  4 ++
 mm/memory.c| 81 ++
 mm/nommu.c | 14 
 13 files changed, 134 insertions(+), 103 deletions(-)

-- 
1.9.1



Re: [PATCH] iommu/iova: Fix tracking of recently failed iova address size

2019-03-18 Thread Robin Murphy

On 15/03/2019 15:56, Robert Richter wrote:

We track the smallest size that failed for a 32 bit allocation. The
size decreases only, and only if we actually walked the tree and
noticed an allocation failure. Current code is broken and wrongly
updates the size value even if we did not try an allocation. This
leads to increased size values and we might go the slow path again
even if we have seen a failure before for the same or a smaller size.


That description wasn't too clear (since it rather contradicts itself by 
starting off with "XYZ happens" when the whole point is that XYZ doesn't 
actually happen properly), but having gone and looked at the code in 
context I think I understand it now - specifically, it's that the 
early-exit path for detecting that a 32-bit allocation request is too 
big to possibly succeed should never have gone via the route which 
assigns to max32_alloc_size.


In that respect, the diff looks correct, so modulo possibly tweaking the 
commit message,


Reviewed-by: Robin Murphy 

Thanks,
Robin.


Cc:  # 4.20+
Fixes: bee60e94a1e2 ("iommu/iova: Optimise attempts to allocate iova from 32bit 
address range")
Signed-off-by: Robert Richter 
---
  drivers/iommu/iova.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index f8d3ba247523..2de8122e218f 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -207,8 +207,10 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 		curr_iova = rb_entry(curr, struct iova, node);
 	} while (curr && new_pfn <= curr_iova->pfn_hi);
  
-	if (limit_pfn < size || new_pfn < iovad->start_pfn)
+	if (limit_pfn < size || new_pfn < iovad->start_pfn) {
+		iovad->max32_alloc_size = size;
 		goto iova32_full;
+	}
  
 	/* pfn_lo will point to size aligned address if size_aligned is set */
 	new->pfn_lo = new_pfn;
@@ -222,7 +224,6 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
return 0;
  
 iova32_full:
-	iovad->max32_alloc_size = size;
 	spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
 	return -ENOMEM;
  }
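
For anyone following along, a minimal sketch of the intended caching
behaviour, with a hypothetical tree-walk helper rather than the actual
iova.c code:

/*
 * Sketch only: max32_alloc_size caches the smallest size that failed,
 * so equal-or-larger requests can fail fast without walking the tree.
 */
static int alloc_range_sketch(struct iova_domain *iovad, unsigned long size,
			      unsigned long limit_pfn)
{
	/* Fast path: a same-or-smaller request already failed. */
	if (limit_pfn <= iovad->dma_32bit_pfn &&
	    size >= iovad->max32_alloc_size)
		return -ENOMEM;

	if (walk_tree_and_insert(iovad, size, limit_pfn))	/* hypothetical */
		return 0;

	/* Only a genuine walk failure may update the cached size. */
	iovad->max32_alloc_size = size;
	return -ENOMEM;
}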




Re: [RFC PATCH v1 04/15] iommu: Add DOMAIN_ATTR_PTBASE

2019-03-18 Thread Jordan Crouse
On Mon, Mar 18, 2019 at 10:53:21AM +0100, Joerg Roedel wrote:
> On Fri, Mar 01, 2019 at 12:38:26PM -0700, Jordan Crouse wrote:
> > Add an attribute to return the base address of the pagetable. This is used
> > by auxiliary domains from arm-smmu to return the address of the pagetable
> > to the leaf driver so that it can set the appropriate pagetable through
> > its own means.
> 
> What is this going to be used for? Page-table management is supposed to
> happen in the arm-smmu driver and the gpu driver only makes changes
> through iommu_map/iommu_unmap calls.

Adreno GPUs have an internal mechanism to switch the pagetable address in the
attached arm-smmu v2 IOMMU so that each individual rendering process can have
its own pagetable. The driver uses iommu_map and iommu_unmap to write the
pagetable, but the address of each individual pagetable needs to be queried
so it can be sent to the hardware. You can see the driver-specific code that
does this here:

https://patchwork.freedesktop.org/patch/289507/?series=57441&rev=1
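
As a rough sketch of the query itself, assuming the RFC's
DOMAIN_ATTR_PTBASE attribute and the existing iommu_domain_get_attr()
interface (not a definitive implementation):

/*
 * Hypothetical sketch: fetch the aux domain's pagetable base address
 * so the GPU can program it through its own pagetable-switch mechanism.
 */
static int gpu_query_ptbase(struct iommu_domain *aux_domain, u64 *ptbase)
{
	return iommu_domain_get_attr(aux_domain, DOMAIN_ATTR_PTBASE, ptbase);
}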

Jordan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


[PATCH v2 1/2] iommu/arm-smmu-v3: make sure stale cached copies of the L1STD are invalidated

2019-03-18 Thread Zhen Lei
After the content of the L1STD (Level 1 Stream Table Descriptor) in DDR has
been modified, we should make sure the cached copies are invalidated.
Signed-off-by: Zhen Lei 
---
 drivers/iommu/arm-smmu-v3.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d3880010c6cfc8c..9b6afa8e69f70f6 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1071,13 +1071,14 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_device *smmu,
*dst = cpu_to_le64(val);
 }
 
-static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
+static void __arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu,
+u32 sid, bool leaf)
 {
struct arm_smmu_cmdq_ent cmd = {
.opcode = CMDQ_OP_CFGI_STE,
.cfgi   = {
.sid= sid,
-   .leaf   = true,
+   .leaf   = leaf,
},
};
 
@@ -1085,6 +1086,16 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
arm_smmu_cmdq_issue_sync(smmu);
 }
 
+static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
+{
+   __arm_smmu_sync_ste_for_sid(smmu, sid, true);
+}
+
+static void arm_smmu_sync_std_for_sid(struct arm_smmu_device *smmu, u32 sid)
+{
+   __arm_smmu_sync_ste_for_sid(smmu, sid, false);
+}
+
 static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
				      __le64 *dst, struct arm_smmu_strtab_ent *ste)
 {
@@ -1232,6 +1243,7 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 
arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT);
arm_smmu_write_strtab_l1_desc(strtab, desc);
+   arm_smmu_sync_std_for_sid(smmu, sid);
return 0;
 }
 
-- 
1.8.3




[PATCH v2 2/2] iommu/arm-smmu-v3: allow the SMMU to be enabled in the kdump kernel

2019-03-18 Thread Zhen Lei
I don't know why device_shutdown() is not called in the first kernel
before execution switches to the secondary kernel. People may be
afraid that the function could crash the kernel again, because it
contains too many operations, so that the secondary kernel can never
be entered.
  
A device driver may have been configured in the first kernel but not
in the secondary kernel, because of the latter's limited memory
resources. (To simplify the description, call this kind of device an
"unexpected device".) Because such a device was not shut down in the
first kernel, it may still access memory in the secondary kernel. For
example, a netcard may still use its ring buffer to receive incoming
network packets in the secondary kernel.

commit b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU
is enabled in kdump kernel") set SMMU_GBPA.ABORT to abort the accesses
of the unexpected devices, but it also aborts the memory accesses of
the devices we need, such as the netcard. For example, a system may
have no hard disk, so the vmcore must be dumped through the network.

In fact, we can use STE.config=0b000 to abort the memory accesses of
the unexpected devices only, as shown below (see the sketch after this
list for step 3):
1. In the first kernel, all buffers used by the "unexpected" devices are
   correctly mapped, and they will not be corrupted by the secondary kernel
   because the latter has its own dedicated reserved memory.
2. Enter the secondary kernel, set SMMU_GBPA.ABORT=1, then disable the SMMU.
3. Preset all STE entries with STE.config=0b000. For a 2-level Stream Table,
   pre-allocate a dummy L2ST (Level 2 Stream Table) and make all
   L1STD.l2ptr pointers point to the dummy L2ST. The dummy L2ST is shared
   by all L1STDs (Level 1 Stream Table Descriptors).
4. Enable the SMMU. From now on, a newly attached device, if needed, will
   allocate a new L2ST accordingly and change the related L1STD.l2ptr
   pointer to it.
   Please note that we still rely on desc->l2ptr to judge whether the L2ST
   has been allocated or not, and don't care about the value of L1STD.l2ptr.

Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Signed-off-by: Zhen Lei 
---
 drivers/iommu/arm-smmu-v3.c | 72 -
 1 file changed, 51 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 9b6afa8e69f70f6..28b04d4aef62a9f 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1218,35 +1218,57 @@ static void arm_smmu_init_bypass_stes(u64 *strtab, unsigned int nent)
}
 }
 
-static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
+static int __arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid,
+struct arm_smmu_strtab_l1_desc *desc)
 {
-   size_t size;
void *strtab;
	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
-	struct arm_smmu_strtab_l1_desc *desc = &cfg->l1_desc[sid >> STRTAB_SPLIT];
 
-   if (desc->l2ptr)
-   return 0;
-
-   size = 1 << (STRTAB_SPLIT + ilog2(STRTAB_STE_DWORDS) + 3);
	strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS];
 
-	desc->span = STRTAB_SPLIT + 1;
-	desc->l2ptr = dmam_alloc_coherent(smmu->dev, size, &desc->l2ptr_dma,
-					  GFP_KERNEL | __GFP_ZERO);
if (!desc->l2ptr) {
-   dev_err(smmu->dev,
-   "failed to allocate l2 stream table for SID %u\n",
-   sid);
-   return -ENOMEM;
+   size_t size;
+
+   size = 1 << (STRTAB_SPLIT + ilog2(STRTAB_STE_DWORDS) + 3);
+		desc->l2ptr = dmam_alloc_coherent(smmu->dev, size,
+						  &desc->l2ptr_dma,
+						  GFP_KERNEL | __GFP_ZERO);
+   if (!desc->l2ptr) {
+			dev_err(smmu->dev,
+				"failed to allocate l2 stream table for SID %u\n",
+				sid);
+   return -ENOMEM;
+   }
+
+   desc->span = STRTAB_SPLIT + 1;
+   arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT);
}
 
-   arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT);
arm_smmu_write_strtab_l1_desc(strtab, desc);
+   return 0;
+}
+
+static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
+{
+   int ret;
+	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+	struct arm_smmu_strtab_l1_desc *desc = &cfg->l1_desc[sid >> STRTAB_SPLIT];
+
+   ret = __arm_smmu_init_l2_strtab(smmu, sid, desc);
+   if (ret)
+   return ret;
+
arm_smmu_sync_std_for_sid(smmu, sid);
return 0;
 }
 
+static int arm_smmu_init_dummy_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
+{
+   static struct arm_smmu_strtab_l1_desc dummy_desc;
+
+	return __arm_smmu_init_l2_strtab(smmu, sid, &dummy_desc);
+}

[PATCH v2 0/2] iommu/arm-smmu-v3: make sure the kdump kernel can work well when the SMMU is enabled

2019-03-18 Thread Zhen Lei
v1 --> v2:
1. Dropped part 2. Now we only use the SMMUv3 hardware feature STE.config=0b000
(report abort to device, no event recorded) to suppress the event messages
caused by the unexpected devices.
2. Rewrote the patch descriptions.

v1:
This patch series included two parts:
1. Patches 1-2 use dummy STE tables with the "STE abort" hardware feature to
   abort accesses by unexpected devices. For more details, see the
   description in patch 2.
2. If the "STE abort" feature is not supported, force the unexpected devices
   in the secondary kernel to use the memory maps they used in the first
   kernel. For more details, see patch 5.

Zhen Lei (2):
  iommu/arm-smmu-v3: make sure stale cached copies of the L1STD are
    invalidated
  iommu/arm-smmu-v3: allow the SMMU to be enabled in the kdump kernel

 drivers/iommu/arm-smmu-v3.c | 88 +
 1 file changed, 65 insertions(+), 23 deletions(-)

-- 
1.8.3




Re: [PATCH v5 05/22] iommu: Introduce cache_invalidate API

2019-03-18 Thread Auger Eric
Hi Jean,

On 3/18/19 12:01 PM, Jean-Philippe Brucker wrote:
> On 17/03/2019 16:43, Auger Eric wrote:
 diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
 index 532a64075f23..e4c6a447e85a 100644
 --- a/include/uapi/linux/iommu.h
 +++ b/include/uapi/linux/iommu.h
 @@ -159,4 +159,75 @@ struct iommu_pasid_table_config {
};
  };
  
+/* defines the granularity of the invalidation */
+enum iommu_inv_granularity {
+	IOMMU_INV_GRANU_DOMAIN,	/* domain-selective invalidation */
+	IOMMU_INV_GRANU_PASID,	/* pasid-selective invalidation */
+	IOMMU_INV_GRANU_ADDR,	/* page-selective invalidation */
+};
+
+/**
+ * Address Selective Invalidation Structure
+ *
+ * @flags indicates the granularity of the address-selective invalidation
+ * - if PASID bit is set, @pasid field is populated and the invalidation
+ *   relates to cache entries tagged with this PASID and matching the
+ *   address range.
+ * - if ARCHID bit is set, @archid is populated and the invalidation relates
+ *   to cache entries tagged with this architecture specific id and matching
+ *   the address range.
+ * - Both PASID and ARCHID can be set as they may tag different caches.
+ * - if neither PASID or ARCHID is set, global addr invalidation applies
+ * - LEAF flag indicates whether only the leaf PTE caching needs to be
+ *   invalidated and other paging structure caches can be preserved.
+ * @pasid: process address space id
+ * @archid: architecture-specific id
+ * @addr: first stage/level input address
+ * @granule_size: page/block size of the mapping in bytes
+ * @nb_granules: number of contiguous granules to be invalidated
+ */
+struct iommu_inv_addr_info {
+#define IOMMU_INV_ADDR_FLAGS_PASID	(1 << 0)
+#define IOMMU_INV_ADDR_FLAGS_ARCHID	(1 << 1)
+#define IOMMU_INV_ADDR_FLAGS_LEAF	(1 << 2)
+	__u32	flags;
+	__u32	archid;
+	__u64	pasid;
+	__u64	addr;
+	__u64	granule_size;
+	__u64	nb_granules;
+};
+
+/**
+ * First level/stage invalidation information
+ * @cache: bitfield that allows to select which caches to invalidate
+ * @granularity: defines the lowest granularity used for the invalidation:
+ *		 domain > pasid > addr
+ *
+ * Not all the combinations of cache/granularity make sense:
+ *
+ * type         |   DEV_IOTLB   |     IOTLB     |     PASID     |
+ * granularity  |               |               |     cache     |
+ * -------------+---------------+---------------+---------------+
+ * DOMAIN       |      N/A      |       Y       |       Y       |
+ * PASID        |       Y       |       Y       |       Y       |
+ * ADDR         |       Y       |       Y       |      N/A      |
+ */
+struct iommu_cache_invalidate_info {
+#define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
+	__u32	version;
+/* IOMMU paging structure cache */
+#define IOMMU_CACHE_INV_TYPE_IOTLB	(1 << 0) /* IOMMU IOTLB */
+#define IOMMU_CACHE_INV_TYPE_DEV_IOTLB	(1 << 1) /* Device IOTLB */
+#define IOMMU_CACHE_INV_TYPE_PASID	(1 << 2) /* PASID cache */
+	__u8	cache;
+	__u8	granularity;
+	__u8	padding[2];
+	union {
+		__u64	pasid;
>>> just realized there is already a pasid field in the addr_info, do we
>>> still need this?
>> I think so. Either you do a PASID based invalidation and you directly
>> use the pasid field, or you do an address based invalidation and you use
>> the addr_info, where the pasid may or may not be passed.
> 
> I guess a comment would be useful?
> 
> - Invalidations by %IOMMU_INV_GRANU_ADDR use field @addr_info.
> - Invalidations by %IOMMU_INV_GRANU_PASID use field @pasid.
> - Invalidations by %IOMMU_INV_GRANU_DOMAIN don't take an argument.

OK. I will add those comments in v7.

Thanks

Eric
> 
> Thanks,
> Jean
> 
>>
>> Thanks
>>
>> Eric
+		struct iommu_inv_addr_info addr_info;
+	};
+};
 +
 +
  #endif /* _UAPI_IOMMU_H */
>>>
>>> [Jacob Pan]
>>>
> 


Re: [PATCH v5 05/22] iommu: Introduce cache_invalidate API

2019-03-18 Thread Jean-Philippe Brucker
On 17/03/2019 16:43, Auger Eric wrote:
>>> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
>>> index 532a64075f23..e4c6a447e85a 100644
>>> --- a/include/uapi/linux/iommu.h
>>> +++ b/include/uapi/linux/iommu.h
>>> @@ -159,4 +159,75 @@ struct iommu_pasid_table_config {
>>> };
>>>  };
>>>  
>>> +/* defines the granularity of the invalidation */
>>> +enum iommu_inv_granularity {
>>> +	IOMMU_INV_GRANU_DOMAIN,	/* domain-selective invalidation */
>>> +	IOMMU_INV_GRANU_PASID,	/* pasid-selective invalidation */
>>> +	IOMMU_INV_GRANU_ADDR,	/* page-selective invalidation */
>>> +};
>>> +
>>> +/**
>>> + * Address Selective Invalidation Structure
>>> + *
>>> + * @flags indicates the granularity of the address-selective invalidation
>>> + * - if PASID bit is set, @pasid field is populated and the invalidation
>>> + *   relates to cache entries tagged with this PASID and matching the
>>> + *   address range.
>>> + * - if ARCHID bit is set, @archid is populated and the invalidation relates
>>> + *   to cache entries tagged with this architecture specific id and matching
>>> + *   the address range.
>>> + * - Both PASID and ARCHID can be set as they may tag different caches.
>>> + * - if neither PASID or ARCHID is set, global addr invalidation applies
>>> + * - LEAF flag indicates whether only the leaf PTE caching needs to be
>>> + *   invalidated and other paging structure caches can be preserved.
>>> + * @pasid: process address space id
>>> + * @archid: architecture-specific id
>>> + * @addr: first stage/level input address
>>> + * @granule_size: page/block size of the mapping in bytes
>>> + * @nb_granules: number of contiguous granules to be invalidated
>>> + */
>>> +struct iommu_inv_addr_info {
>>> +#define IOMMU_INV_ADDR_FLAGS_PASID	(1 << 0)
>>> +#define IOMMU_INV_ADDR_FLAGS_ARCHID	(1 << 1)
>>> +#define IOMMU_INV_ADDR_FLAGS_LEAF	(1 << 2)
>>> +	__u32	flags;
>>> +	__u32	archid;
>>> +	__u64	pasid;
>>> +	__u64	addr;
>>> +	__u64	granule_size;
>>> +	__u64	nb_granules;
>>> +};
>>> +
>>> +/**
>>> + * First level/stage invalidation information
>>> + * @cache: bitfield that allows to select which caches to invalidate
>>> + * @granularity: defines the lowest granularity used for the invalidation:
>>> + *		 domain > pasid > addr
>>> + *
>>> + * Not all the combinations of cache/granularity make sense:
>>> + *
>>> + * type         |   DEV_IOTLB   |     IOTLB     |     PASID     |
>>> + * granularity  |               |               |     cache     |
>>> + * -------------+---------------+---------------+---------------+
>>> + * DOMAIN       |      N/A      |       Y       |       Y       |
>>> + * PASID        |       Y       |       Y       |       Y       |
>>> + * ADDR         |       Y       |       Y       |      N/A      |
>>> + */
>>> +struct iommu_cache_invalidate_info {
>>> +#define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
>>> +	__u32	version;
>>> +/* IOMMU paging structure cache */
>>> +#define IOMMU_CACHE_INV_TYPE_IOTLB	(1 << 0) /* IOMMU IOTLB */
>>> +#define IOMMU_CACHE_INV_TYPE_DEV_IOTLB	(1 << 1) /* Device IOTLB */
>>> +#define IOMMU_CACHE_INV_TYPE_PASID	(1 << 2) /* PASID cache */
>>> +	__u8	cache;
>>> +	__u8	granularity;
>>> +	__u8	padding[2];
>>> +	union {
>>> +	__u64	pasid;
>> just realized there is already a pasid field in the addr_info, do we
>> still need this?
> I think so. Either you do a PASID based invalidation and you directly
> use the pasid field, or you do an address based invalidation and you use
> the addr_info, where the pasid may or may not be passed.

I guess a comment would be useful?

- Invalidations by %IOMMU_INV_GRANU_ADDR use field @addr_info.
- Invalidations by %IOMMU_INV_GRANU_PASID use field @pasid.
- Invalidations by %IOMMU_INV_GRANU_DOMAIN don't take an argument.
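
A minimal sketch of how a consumer could dispatch on @granularity under
that rule; the field names are from the quoted header, but the inv_*
handler functions are hypothetical:

/* Hypothetical dispatch on the invalidation granularity. */
static int handle_invalidate(struct iommu_cache_invalidate_info *info)
{
	switch (info->granularity) {
	case IOMMU_INV_GRANU_DOMAIN:	/* no argument */
		return inv_domain(info->cache);
	case IOMMU_INV_GRANU_PASID:	/* uses info->pasid */
		return inv_pasid(info->cache, info->pasid);
	case IOMMU_INV_GRANU_ADDR:	/* uses info->addr_info */
		return inv_addr(info->cache, &info->addr_info);
	default:
		return -EINVAL;
	}
}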

Thanks,
Jean

> 
> Thanks
> 
> Eric
>>> +		struct iommu_inv_addr_info addr_info;
>>> +	};
>>> +};
>>> +
>>> +
>>>  #endif /* _UAPI_IOMMU_H */
>>
>> [Jacob Pan]
>>



Re: [PATCH v1] iommu/vt-d: Switch to bitmap_zalloc()

2019-03-18 Thread Joerg Roedel
On Mon, Mar 04, 2019 at 11:07:37AM +0200, Andy Shevchenko wrote:
> Switch to bitmap_zalloc() to show clearly what we are allocating.
> Besides that, it returns a pointer of bitmap type instead of an opaque void *.
> 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/iommu/intel_irq_remapping.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)

Applied, thanks.
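
For reference, the general shape of such a conversion (a generic sketch,
not the actual intel_irq_remapping.c hunk):

/* Generic sketch of the conversion pattern. */
static unsigned long *alloc_bitmap_sketch(unsigned int nbits)
{
	/*
	 * before: kcalloc(BITS_TO_LONGS(nbits), sizeof(unsigned long),
	 *                 GFP_KERNEL) - an opaque void * sized by hand.
	 * after: the size is expressed in bits, the intent is explicit,
	 * and the result is freed with bitmap_free() instead of kfree().
	 */
	return bitmap_zalloc(nbits, GFP_KERNEL);
}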


Re: [PATCH v2] iommu/amd: fix sg->dma_address for sg->offset bigger than PAGE_SIZE

2019-03-18 Thread Joerg Roedel
Hi Stanislaw,

thanks a lot for looking into this and tracking it down!

On Wed, Mar 13, 2019 at 10:03:17AM +0100, Stanislaw Gruszka wrote:
> - s->dma_address += address + s->offset;
> + /* Add in the remaining piece of the scatter-gather offset that
> +  * was masked out when we were determining the physical address
> +  * via (sg_phys(s) & PAGE_MASK) earlier.
> +  */
> + s->dma_address += address + (s->offset & ~PAGE_MASK);
>   s->dma_length   = s->length;

Applied the patch for v5.1 (with an added Fixes-tag and a minor coding
style change) and will send it upstream soon.
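
To illustrate the arithmetic with a sketch (the mapping helper is
hypothetical, not the driver code): sg_phys(s) already includes the full
offset, so after rounding it down to a page boundary only the in-page
remainder of s->offset may be added back:

/*
 * Sketch: for s->offset >= PAGE_SIZE, the whole-page part of the offset
 * has already advanced sg_phys(s) past page boundaries, so adding the
 * full s->offset again would double-count it.
 */
static dma_addr_t map_sg_entry_sketch(struct scatterlist *s)
{
	phys_addr_t paddr = sg_phys(s) & PAGE_MASK;	/* page-aligned start */
	dma_addr_t iova = my_iommu_map(paddr, s->length);	/* hypothetical */

	return iova + (s->offset & ~PAGE_MASK);	/* in-page remainder only */
}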

Thanks again,

   Joerg


Re: [RFC PATCH v1 04/15] iommu: Add DOMAIN_ATTR_PTBASE

2019-03-18 Thread Joerg Roedel
On Fri, Mar 01, 2019 at 12:38:26PM -0700, Jordan Crouse wrote:
> Add an attribute to return the base address of the pagetable. This is used
> by auxiliary domains from arm-smmu to return the address of the pagetable
> to the leaf driver so that it can set the appropriate pagetable through
> its own means.

What is this going to be used for? Page-table management is supposed to
happen in the arm-smmu driver and the gpu driver only makes changes
through iommu_map/iommu_unmap calls.


Regards,

Joerg