Re: [PATCH v2 17/19] iommu: Add max num of cache and granu types

2019-04-29 Thread Auger Eric
Hi Jacob,

On 4/29/19 6:17 PM, Jacob Pan wrote:
> On Fri, 26 Apr 2019 18:22:46 +0200
> Auger Eric  wrote:
> 
>> Hi Jacob,
>>
>> On 4/24/19 1:31 AM, Jacob Pan wrote:
>>> To convert to/from cache types and granularities between generic and
>>> VT-d specific counterparts, a 2D arrary is used. Introduce the
>>> limits  
>> array
>>> to help define the converstion array size.  
>> conversion
>>>
> will fix, thanks
>>> Signed-off-by: Jacob Pan 
>>> ---
>>>  include/uapi/linux/iommu.h | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
>>> index 5c95905..2d8fac8 100644
>>> --- a/include/uapi/linux/iommu.h
>>> +++ b/include/uapi/linux/iommu.h
>>> @@ -197,6 +197,7 @@ struct iommu_inv_addr_info {
>>> __u64   granule_size;
>>> __u64   nb_granules;
>>>  };
>>> +#define NR_IOMMU_CACHE_INVAL_GRANU (3)
>>>  
>>>  /**
>>>   * First level/stage invalidation information
>>> @@ -235,6 +236,7 @@ struct iommu_cache_invalidate_info {
>>> struct iommu_inv_addr_info addr_info;
>>> };
>>>  };
>>> +#define NR_IOMMU_CACHE_TYPE    (3)
>>>  /**
>>>   * struct gpasid_bind_data - Information about device and guest PASID binding
>>>   * @gcr3:  Guest CR3 value from guest mm
>>>   
>> Is it really something that needs to be exposed in the uapi?
>>
> I put it in uapi since the related definitions for granularity and
> cache type are in the same file.
> Maybe putting them close together like this? I was thinking you can just
> fold it into your next series as one patch for introducing cache
> invalidation.
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> index 2d8fac8..4ff6929 100644
> --- a/include/uapi/linux/iommu.h
> +++ b/include/uapi/linux/iommu.h
> @@ -164,6 +164,7 @@ enum iommu_inv_granularity {
> IOMMU_INV_GRANU_DOMAIN, /* domain-selective invalidation */
> IOMMU_INV_GRANU_PASID,  /* pasid-selective invalidation */
> IOMMU_INV_GRANU_ADDR,   /* page-selective invalidation */
> +   NR_IOMMU_INVAL_GRANU,   /* number of invalidation granularities */
> };
>  
>  /**
> @@ -228,6 +229,7 @@ struct iommu_cache_invalidate_info {
>  #define IOMMU_CACHE_INV_TYPE_IOTLB (1 << 0) /* IOMMU IOTLB */
>  #define IOMMU_CACHE_INV_TYPE_DEV_IOTLB (1 << 1) /* Device IOTLB */
>  #define IOMMU_CACHE_INV_TYPE_PASID (1 << 2) /* PASID cache */
> +#define NR_IOMMU_CACHE_TYPE    (3)

OK I will add this.

Thanks

Eric
> __u8   cache;
> __u8   granularity;
> 
>> Thanks
>>
>> Eric
> 
> [Jacob Pan]
> 


Re: [PATCH v2 18/19] iommu/vt-d: Support flushing more translation cache types

2019-04-29 Thread Auger Eric
Hi Jacob,

On 4/29/19 11:29 PM, Jacob Pan wrote:
> On Sat, 27 Apr 2019 11:04:04 +0200
> Auger Eric  wrote:
> 
>> Hi Jacob,
>>
>> On 4/24/19 1:31 AM, Jacob Pan wrote:
>>> When Shared Virtual Memory is exposed to a guest via vIOMMU,
>>> extended IOTLB invalidation may be passed down from outside IOMMU
>>> subsystems. This patch adds invalidation functions that can be used
>>> for additional translation cache types.
>>>
>>> Signed-off-by: Jacob Pan 
>>> ---
>>>  drivers/iommu/dmar.c        | 48 ++++++++++++++++++++++++++++++++++++
>>>  include/linux/intel-iommu.h | 21 ++++++++++
>>>  2 files changed, 65 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
>>> index 9c49300..680894e 100644
>>> --- a/drivers/iommu/dmar.c
>>> +++ b/drivers/iommu/dmar.c
>>> @@ -1357,6 +1357,20 @@ void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
>>> 	qi_submit_sync(&desc, iommu);
>>>  }
>>>
>> /* PASID-based IOTLB Invalidate */
>>> +void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u64 addr, u32 pasid,
>>> +		unsigned int size_order, u64 granu)
>>> +{
>>> +   struct qi_desc desc;
>>> +
>>> +   desc.qw0 = QI_EIOTLB_PASID(pasid) | QI_EIOTLB_DID(did) |
>>> +   QI_EIOTLB_GRAN(granu) | QI_EIOTLB_TYPE;
>>> +   desc.qw1 = QI_EIOTLB_ADDR(addr) | QI_EIOTLB_IH(0) |
>>> +   QI_EIOTLB_AM(size_order);  
>> I see IH is hardcoded to 0. Don't you envision cascading the IH? On
>> ARM this was needed for performance's sake.
> Right, we should cascade IH based on IOMMU_INV_ADDR_FLAGS_LEAF. Just
> curious, how do you deduce the IH information on ARM? I guess you need
> to get the non-leaf page directory info?
> I will add an argument for IH.
On ARM we have the "Leaf" field in the stage1 TLB invalidation command.
"When Leaf==1, only cached entries for the last level of translation
table walk are required to be invalidated".
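
For illustration, a minimal sketch of what cascading IH could look like
(hypothetical: assumes a new "ih" argument fed from
IOMMU_INV_ADDR_FLAGS_LEAF, on top of the existing QI_EIOTLB_* macros;
not the actual patch):

void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u64 addr, u32 pasid,
		unsigned int size_order, u64 granu, bool ih)
{
	struct qi_desc desc;

	desc.qw0 = QI_EIOTLB_PASID(pasid) | QI_EIOTLB_DID(did) |
			QI_EIOTLB_GRAN(granu) | QI_EIOTLB_TYPE;
	/* ih == 1 means only leaf entries changed, so cached non-leaf
	 * paging-structure entries may be preserved - the VT-d analogue
	 * of the ARM "Leaf" hint described above.
	 */
	desc.qw1 = QI_EIOTLB_ADDR(addr) | QI_EIOTLB_IH(ih) |
			QI_EIOTLB_AM(size_order);
	desc.qw2 = 0;
	desc.qw3 = 0;
	qi_submit_sync(&desc, iommu);
}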

Thanks

Eric
>>> +   desc.qw2 = 0;
>>> +   desc.qw3 = 0;
>>> +   qi_submit_sync(&desc, iommu);
>>> +}
>>> +
>>>  void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
>>> 		u16 qdep, u64 addr, unsigned mask)
>>>  {
>>> @@ -1380,6 +1394,40 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
>>> 	qi_submit_sync(&desc, iommu);
>>>  }
>>>
>> /* Pasid-based Device-TLB Invalidation */
> yes, better to explain piotlb :), same for iotlb.
>>> +void qi_flush_dev_piotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
>>> +		u32 pasid, u16 qdep, u64 addr, unsigned size, u64 granu)
>>> +{
>>> +   struct qi_desc desc;
>>> +
>>> +   desc.qw0 = QI_DEV_EIOTLB_PASID(pasid) | QI_DEV_EIOTLB_SID(sid) |
>>> +   QI_DEV_EIOTLB_QDEP(qdep) | QI_DEIOTLB_TYPE |
>>> +   QI_DEV_IOTLB_PFSID(pfsid);
>>> +   desc.qw1 |= QI_DEV_EIOTLB_GLOB(granu);
> should be desc.qw1 =
>>> +
>>> +   /* If S bit is 0, we only flush a single page. If S bit is set,
>>> +    * the least significant zero bit indicates the size. VT-d spec
>>> +    * 6.5.2.6
>>> +    */
>>> +   if (!size)
>>> +   desc.qw0 = QI_DEV_EIOTLB_ADDR(addr) & ~QI_DEV_EIOTLB_SIZE;
>> desc.q1 |= ?
> Right, I also missed previous qw1 assignment.
>>> +   else {
>>> +   unsigned long mask = 1UL << (VTD_PAGE_SHIFT + size);
>>> +
>>> +   desc.qw1 = QI_DEV_EIOTLB_ADDR(addr & ~mask) | QI_DEV_EIOTLB_SIZE;
>> desc.q1 |=
> right, thanks
>>> +   }
>>> +   qi_submit_sync(&desc, iommu);
>>> +}
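
For reference, a sketch of qi_flush_dev_piotlb's size/address encoding
with the two qw1 fixes agreed above applied (illustrative only, reusing
the existing QI_DEV_EIOTLB_* macros; not the actual respin):

	desc.qw1 = QI_DEV_EIOTLB_GLOB(granu);

	if (!size) {
		/* S bit 0: flush the single 4K page at addr */
		desc.qw1 |= QI_DEV_EIOTLB_ADDR(addr) & ~QI_DEV_EIOTLB_SIZE;
	} else {
		/* S bit 1: the least significant zero bit of the address
		 * field encodes the range (VT-d spec 6.5.2.6).
		 */
		unsigned long mask = 1UL << (VTD_PAGE_SHIFT + size);

		desc.qw1 |= QI_DEV_EIOTLB_ADDR(addr & ~mask) | QI_DEV_EIOTLB_SIZE;
	}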
>>> +  
>> /* PASID-cache invalidation */
>>> +void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did, u64 granu, int pasid)
>>> +{
>>> +   struct qi_desc desc;
>>> +
>>> +   desc.qw0 = QI_PC_TYPE | QI_PC_DID(did) | QI_PC_GRAN(granu) |
>>> +   QI_PC_PASID(pasid);
>>> +   desc.qw1 = 0;
>>> +   desc.qw2 = 0;
>>> +   desc.qw3 = 0;
>>> +   qi_submit_sync(&desc, iommu);
>>> +}
>>>  /*
>>>   * Disable Queued Invalidation interface.
>>>   */
>>> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
>>> index 5d67d0d4..38e5efb 100644
>>> --- a/include/linux/intel-iommu.h
>>> +++ b/include/linux/intel-iommu.h
>>> @@ -339,7 +339,7 @@ enum {
>>>  #define QI_IOTLB_GRAN(gran)    (((u64)gran) >> (DMA_TLB_FLUSH_GRANU_OFFSET-4))
>>>  #define QI_IOTLB_ADDR(addr)    (((u64)addr) & VTD_PAGE_MASK)
>>>  #define QI_IOTLB_IH(ih)        (((u64)ih) << 6)
>>> -#define QI_IOTLB_AM(am)        (((u8)am))
>>> +#define QI_IOTLB_AM(am)        (((u8)am) & 0x3f)
>>>  #define QI_CC_FM(fm)   (((u64)fm) << 48)
>>>  #define QI_CC_SID(sid) (((u64)sid) << 32)
>>> @@ -357,17 +357,22 @@ enum {
>>>  #define QI_PC_DID(did) (((u64)did) << 16)
>>>  #define QI_PC_GRAN(gran)   (((u64)gran) << 4)
>>>  
>>> -#define QI_PC_ALL_PASIDS   (QI_PC_TYPE | QI_PC_GRAN(0))
>>> -#define QI_PC_PASID_SEL        (QI_PC_TYPE | QI_PC_GRAN(1))
>>> +/* PASID cache invalidation granu */
>>> +#define QI_PC_ALL_PASIDS   0
>>> +#define QI_PC_PASID_SEL        1
>>>  
>>>  #define QI_EIOTLB_ADDR(addr)   ((u64)(addr) & VTD_PAGE_MASK)

Re: [PATCH v3 5/8] iommu/vt-d: Implement def_domain_type iommu ops entry

2019-04-29 Thread Lu Baolu

Hi Christoph,

On 4/30/19 4:03 AM, Christoph Hellwig wrote:

@@ -3631,35 +3607,30 @@ static int iommu_no_mapping(struct device *dev)
if (iommu_dummy(dev))
return 1;
  
-	if (!iommu_identity_mapping)
-		return 0;
-

FYI, iommu_no_mapping has been refactored in for-next:

https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=x86/vt-d&id=48b2c937ea37a3bece0094b46450ed5267525289


Oh, yes! Thanks for letting me know this. Will rebase the code.




found = identity_mapping(dev);
if (found) {
+   /*
+    * If the device's dma_mask is less than the system's memory
+    * size then this is not a candidate for identity mapping.
+    */
+   u64 dma_mask = *dev->dma_mask;
+
+   if (dev->coherent_dma_mask &&
+   dev->coherent_dma_mask < dma_mask)
+   dma_mask = dev->coherent_dma_mask;
+
+   if (dma_mask < dma_get_required_mask(dev)) {


I know this is mostly existing code moved around, but it really needs
some fixing.  For one, dma_get_required_mask is supposed to return the
mask required to avoid bouncing for the given device.  E.g. for a device
behind an iommu it should always just return 32-bit.  If you really
want to check against system memory, please call dma_direct_get_required_mask
without the dma_ops indirection.

Second, I don't even think we need to check the coherent_dma_mask;
dma_direct is pretty good at always finding memory even without
an iommu.

Third, this doesn't take the bus_dma_mask into account.

This probably should just be:

if (min(*dev->dma_mask, dev->bus_dma_mask) <
dma_direct_get_required_mask(dev)) {


Agreed and will add this in the next version.

Best regards,
Lu Baolu


Re: [PATCH v3 02/10] swiotlb: Factor out slot allocation and free

2019-04-29 Thread Lu Baolu

Hi Robin,

On 4/29/19 7:06 PM, Robin Murphy wrote:

On 29/04/2019 06:10, Lu Baolu wrote:

Hi Christoph,

On 4/26/19 11:04 PM, Christoph Hellwig wrote:

On Thu, Apr 25, 2019 at 10:07:19AM +0800, Lu Baolu wrote:

This is not VT-d specific. It's just how generic IOMMU works.

Normally, IOMMU works in paging mode. So if a driver issues DMA with
IOVA  0x0123, IOMMU can remap it with a physical address 
0x0123.

But we should never expect IOMMU to remap 0x0123 with physical
address of 0x. That's the reason why I said that IOMMU will not
work there.


Well, with the iommu it doesn't happen.  With swiotlb it obviously
can happen, so drivers are fine with it.  Why would that suddenly
become an issue when swiotlb is called from the iommu code?



I would say IOMMU is DMA remapping, not DMA engine. :-)


I'm not sure I really follow the issue here - if we're copying the 
buffer to the bounce page(s) there's no conceptual difference from 
copying it to SWIOTLB slot(s), so there should be no need to worry about 
the original in-page offset.


 From the reply up-thread I guess you're trying to include an 
optimisation to only copy the head and tail of the buffer if it spans 
multiple pages, and directly map the ones in the middle, but AFAICS 
that's going to tie you to also using strict mode for TLB maintenance, 
which may not be a win overall depending on the balance between 
invalidation bandwidth vs. memcpy bandwidth. At least if we use standard 
SWIOTLB logic to always copy the whole thing, we should be able to 
release the bounce pages via the flush queue to allow 'safe' lazy unmaps.




With respect, even if we use the standard SWIOTLB logic, we still need
to use the strict mode for TLB maintenance.

Say some swiotlb slots are used by an untrusted device as bounce pages.
When the device driver unmaps the IOVA, the slots are freed, but the
mapping is still cached in the IOTLB, hence the untrusted device is
still able to access the slots. If the slots are then allocated to
other devices, the untrusted device is able to access the data buffers
of those devices.

Best regards,
Lu Baolu

Either way I think it would be worth just implementing the 
straightforward version first, then coming back to consider 
optimisations later.





[RFC/RFT PATCH 0/2] Optimize dma_*_from_contiguous calls

2019-04-29 Thread Nicolin Chen
This series of patches tries to optimize dma_*_from_contiguous calls:
PATCH-1 does some abstraction and cleanup.
PATCH-2 saves single pages and reduces fragmentation of the CMA area.

Both patches may impact the source of pages (CMA or normal)
depending on the use cases, so they are being tagged with "RFC/RFT".

Please check their commit messages for detail.

Nicolin Chen (2):
  dma-contiguous: Simplify dma_*_from_contiguous() function calls
  dma-contiguous: Use fallback alloc_pages for single pages

 arch/arm/mm/dma-mapping.c  | 14 +++-
 arch/arm64/mm/dma-mapping.c| 11 +++---
 arch/xtensa/kernel/pci-dma.c   | 19 +++---
 drivers/iommu/amd_iommu.c  | 20 +++
 drivers/iommu/intel-iommu.c| 20 ++-
 include/linux/dma-contiguous.h | 15 +++-
 kernel/dma/contiguous.c| 64 ++
 kernel/dma/direct.c| 24 +++--
 kernel/dma/remap.c | 11 ++
 9 files changed, 73 insertions(+), 125 deletions(-)

-- 
2.17.1



Re: [PATCH v3 01/10] iommu: Add helper to get minimal page size of domain

2019-04-29 Thread Lu Baolu

Hi Robin,

On 4/29/19 6:55 PM, Robin Murphy wrote:

On 21/04/2019 02:17, Lu Baolu wrote:

This makes it possible for other modules to know the minimal
page size supported by a domain without the knowledge of the
structure details.

Signed-off-by: Lu Baolu 
---
  include/linux/iommu.h | 13 +
  1 file changed, 13 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a5007d035218..46679ef19b7e 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -377,6 +377,14 @@ static inline void iommu_tlb_sync(struct 
iommu_domain *domain)

  domain->ops->iotlb_sync(domain);
  }
+static inline unsigned long domain_minimal_pgsize(struct iommu_domain 
*domain)

+{
+    if (domain && domain->pgsize_bitmap)
+        return 1 << __ffs(domain->pgsize_bitmap);


Nit: this would probably be more efficient on most architectures as:

 if (domain)
     return domain->pgsize_bitmap & -domain->pgsize_bitmap;
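
(A worked example with illustrative values: for pgsize_bitmap =
SZ_4K | SZ_2M | SZ_1G = 0x40201000, both 1 << __ffs(0x40201000) and
0x40201000 & -0x40201000 evaluate to 0x1000, i.e. 4K, but the AND form
needs no find-first-set followed by a shift.)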



It looks reasonable to me.



I'd also suggest s/minimal/min/ in the name, just to stop it getting too 
long. Otherwise, though, I like the idea, and there's at least one other 
place (in iommu-dma) that can make use of it straight away.


Okay, I will change the name to domain_min_pgsize().



Robin.



Best regards,
Lu Baolu


+
+    return 0;
+}
+
  /* PCI device grouping function */
  extern struct iommu_group *pci_device_group(struct device *dev);
  /* Generic device grouping function */
@@ -704,6 +712,11 @@ const struct iommu_ops 
*iommu_ops_from_fwnode(struct fwnode_handle *fwnode)

  return NULL;
  }
+static inline unsigned long domain_minimal_pgsize(struct iommu_domain 
*domain)

+{
+    return 0;
+}
+
  #endif /* CONFIG_IOMMU_API */
  #ifdef CONFIG_IOMMU_DEBUGFS





[PATCH v2 3/4] iommu/dma-iommu: Use the dev->coherent_dma_mask

2019-04-29 Thread Tom Murphy via iommu
Use the dev->coherent_dma_mask when allocating in the dma-iommu ops api.

Signed-off-by: Tom Murphy 
---
 drivers/iommu/dma-iommu.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index c18f74ad1e8b..df03104978d7 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -436,7 +436,8 @@ static void __iommu_dma_unmap(struct iommu_domain *domain, 
dma_addr_t dma_addr,
 }
 
 static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
-   size_t size, int prot, struct iommu_domain *domain)
+   size_t size, int prot, struct iommu_domain *domain,
+   dma_addr_t dma_limit)
 {
struct iommu_dma_cookie *cookie = domain->iova_cookie;
size_t iova_off = 0;
@@ -447,7 +448,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
phys_addr_t phys,
size = iova_align(>iovad, size + iova_off);
}
 
-   iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
+   iova = iommu_dma_alloc_iova(domain, size, dma_limit, dev);
if (!iova)
return DMA_MAPPING_ERROR;
 
@@ -490,7 +491,7 @@ static void *iommu_dma_alloc_contiguous(struct device *dev, 
size_t size,
return NULL;
 
*dma_handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot,
-   iommu_get_dma_domain(dev));
+   iommu_get_dma_domain(dev), dev->coherent_dma_mask);
if (*dma_handle == DMA_MAPPING_ERROR) {
if (!dma_release_from_contiguous(dev, page, count))
__free_pages(page, page_order);
@@ -760,7 +761,7 @@ static void *iommu_dma_alloc_pool(struct device *dev, 
size_t size,
 
*dma_handle = __iommu_dma_map(dev, page_to_phys(page), size,
dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs),
-   iommu_get_domain_for_dev(dev));
+   iommu_get_domain_for_dev(dev), dev->coherent_dma_mask);
if (*dma_handle == DMA_MAPPING_ERROR) {
dma_free_from_pool(vaddr, PAGE_ALIGN(size));
return NULL;
@@ -850,7 +851,7 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, 
struct page *page,
 
dma_handle =__iommu_dma_map(dev, phys, size,
dma_info_to_prot(dir, coherent, attrs),
-   iommu_get_dma_domain(dev));
+   iommu_get_dma_domain(dev), dma_get_mask(dev));
if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
dma_handle != DMA_MAPPING_ERROR)
arch_sync_dma_for_device(dev, phys, size, dir);
@@ -1065,7 +1066,7 @@ static dma_addr_t iommu_dma_map_resource(struct device 
*dev, phys_addr_t phys,
 
return __iommu_dma_map(dev, phys, size,
dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO,
-   iommu_get_dma_domain(dev));
+   iommu_get_dma_domain(dev), dma_get_mask(dev));
 }
 
 static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
@@ -1250,7 +1251,8 @@ static struct iommu_dma_msi_page 
*iommu_dma_get_msi_page(struct device *dev,
if (!msi_page)
return NULL;
 
-   iova = __iommu_dma_map(dev, msi_addr, size, prot, domain);
+   iova = __iommu_dma_map(dev, msi_addr, size, prot, domain,
+   dma_get_mask(dev));
if (iova == DMA_MAPPING_ERROR)
goto out_free_page;
 
-- 
2.17.1



[PATCH v2 4/4] iommu/amd: Convert the AMD iommu driver to the dma-iommu api

2019-04-29 Thread Tom Murphy via iommu
Convert the AMD iommu driver to the dma-iommu api. Remove the iova
handling and reserve region code from the AMD iommu driver.

Signed-off-by: Tom Murphy 
---
 drivers/iommu/Kconfig |   1 +
 drivers/iommu/amd_iommu.c | 680 --
 2 files changed, 70 insertions(+), 611 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6f07f3b21816..5914ba85180b 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -136,6 +136,7 @@ config AMD_IOMMU
select PCI_PASID
select IOMMU_API
select IOMMU_IOVA
+   select IOMMU_DMA
depends on X86_64 && PCI && ACPI
---help---
  With this option you can enable support for AMD IOMMU hardware in
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index ea3a5dc61bb0..366af2b27e7f 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -101,8 +102,6 @@ const struct iommu_ops amd_iommu_ops;
 static ATOMIC_NOTIFIER_HEAD(ppr_notifier);
 int amd_iommu_max_glx_val = -1;
 
-static const struct dma_map_ops amd_iommu_dma_ops;
-
 /*
  * general struct to manage commands send to an IOMMU
  */
@@ -115,21 +114,6 @@ struct kmem_cache *amd_iommu_irq_cache;
 static void update_domain(struct protection_domain *domain);
 static int protection_domain_init(struct protection_domain *domain);
 static void detach_device(struct device *dev);
-static void iova_domain_flush_tlb(struct iova_domain *iovad);
-
-/*
- * Data container for a dma_ops specific protection domain
- */
-struct dma_ops_domain {
-   /* generic protection domain information */
-   struct protection_domain domain;
-
-   /* IOVA RB-Tree */
-   struct iova_domain iovad;
-};
-
-static struct iova_domain reserved_iova_ranges;
-static struct lock_class_key reserved_rbtree_key;
 
 /****************************************************************************
  *
@@ -200,12 +184,6 @@ static struct protection_domain *to_pdomain(struct 
iommu_domain *dom)
return container_of(dom, struct protection_domain, domain);
 }
 
-static struct dma_ops_domain* to_dma_ops_domain(struct protection_domain 
*domain)
-{
-   BUG_ON(domain->flags != PD_DMA_OPS_MASK);
-   return container_of(domain, struct dma_ops_domain, domain);
-}
-
 static struct iommu_dev_data *alloc_dev_data(u16 devid)
 {
struct iommu_dev_data *dev_data;
@@ -1279,12 +1257,6 @@ static void domain_flush_pages(struct protection_domain 
*domain,
__domain_flush_pages(domain, address, size, 0);
 }
 
-/* Flush the whole IO/TLB for a given protection domain */
-static void domain_flush_tlb(struct protection_domain *domain)
-{
-   __domain_flush_pages(domain, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS, 0);
-}
-
 /* Flush the whole IO/TLB for a given protection domain - including PDE */
 static void domain_flush_tlb_pde(struct protection_domain *domain)
 {
@@ -1686,43 +1658,6 @@ static unsigned long iommu_unmap_page(struct 
protection_domain *dom,
return unmapped;
 }
 
-/****************************************************************************
- *
- * The next functions belong to the address allocator for the dma_ops
- * interface functions.
- *
- ****************************************************************************/
-
-
-static unsigned long dma_ops_alloc_iova(struct device *dev,
-   struct dma_ops_domain *dma_dom,
-   unsigned int pages, u64 dma_mask)
-{
-   unsigned long pfn = 0;
-
-   pages = __roundup_pow_of_two(pages);
-
-   if (dma_mask > DMA_BIT_MASK(32))
-   pfn = alloc_iova_fast(_dom->iovad, pages,
- IOVA_PFN(DMA_BIT_MASK(32)), false);
-
-   if (!pfn)
-   pfn = alloc_iova_fast(_dom->iovad, pages,
- IOVA_PFN(dma_mask), true);
-
-   return (pfn << PAGE_SHIFT);
-}
-
-static void dma_ops_free_iova(struct dma_ops_domain *dma_dom,
- unsigned long address,
- unsigned int pages)
-{
-   pages = __roundup_pow_of_two(pages);
-   address >>= PAGE_SHIFT;
-
-   free_iova_fast(_dom->iovad, address, pages);
-}
-
 /****************************************************************************
  *
  * The next functions belong to the domain allocation. A domain is
@@ -1824,40 +1759,25 @@ static void free_gcr3_table(struct protection_domain 
*domain)
free_page((unsigned long)domain->gcr3_tbl);
 }
 
-static void dma_ops_domain_flush_tlb(struct dma_ops_domain *dom)
-{
-   domain_flush_tlb(>domain);
-   domain_flush_complete(>domain);
-}
-
-static void iova_domain_flush_tlb(struct iova_domain *iovad)
-{
-   struct dma_ops_domain *dom;
-
-   dom = container_of(iovad, struct dma_ops_domain, iovad);
-
-   dma_ops_domain_flush_tlb(dom);

[PATCH v2 1/4] iommu: Add gfp parameter to iommu_ops::map

2019-04-29 Thread Tom Murphy via iommu
Add a gfp_t parameter to the iommu_ops::map function.
Remove the needless locking in the AMD iommu driver.

The iommu_ops::map function (or the iommu_map function which calls it)
was always supposed to be sleepable (according to Joerg's comment in
this thread: https://lore.kernel.org/patchwork/patch/977520/ ) and so
should probably have had a "might_sleep()" since it was written. However
currently the dma-iommu api can call iommu_map in an atomic context,
which it shouldn't do. This doesn't cause any problems because any iommu
driver which uses the dma-iommu api uses GFP_ATOMIC in its
iommu_ops::map function. But doing this wastes the memory allocator's
atomic pools.

We can remove the mutex lock from amd_iommu_map and amd_iommu_unmap.
iommu_map doesn’t lock while mapping and so no two calls should touch
the same iova range. The AMD driver already handles the page table page
allocations without locks so we can safely remove the locks.
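
A minimal sketch of the resulting split (the helper and the
iommu_map_atomic name are hypothetical, for illustration; the real
wiring is in the iommu.c hunk of this patch):

static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
		       phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
{
	/* ... existing iommu_map body, now passing gfp down to ops->map() ... */
	return domain->ops->map(domain, iova, paddr, size, prot, gfp);
}

int iommu_map(struct iommu_domain *domain, unsigned long iova,
	      phys_addr_t paddr, size_t size, int prot)
{
	might_sleep();	/* sleepable path may use GFP_KERNEL */
	return __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
}

int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
		     phys_addr_t paddr, size_t size, int prot)
{
	/* atomic contexts (e.g. dma-iommu callers) must not sleep */
	return __iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
}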

Signed-off-by: Tom Murphy 
---
 drivers/iommu/amd_iommu.c  | 14 ---
 drivers/iommu/arm-smmu-v3.c|  2 +-
 drivers/iommu/arm-smmu.c   |  2 +-
 drivers/iommu/dma-iommu.c  |  6 ++---
 drivers/iommu/exynos-iommu.c   |  2 +-
 drivers/iommu/intel-iommu.c|  2 +-
 drivers/iommu/iommu.c  | 43 +-
 drivers/iommu/ipmmu-vmsa.c |  2 +-
 drivers/iommu/msm_iommu.c  |  2 +-
 drivers/iommu/mtk_iommu.c  |  2 +-
 drivers/iommu/mtk_iommu_v1.c   |  2 +-
 drivers/iommu/omap-iommu.c |  2 +-
 drivers/iommu/qcom_iommu.c |  2 +-
 drivers/iommu/rockchip-iommu.c |  2 +-
 drivers/iommu/s390-iommu.c |  2 +-
 drivers/iommu/tegra-gart.c |  2 +-
 drivers/iommu/tegra-smmu.c |  2 +-
 include/linux/iommu.h  | 21 -
 18 files changed, 78 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index ebd062522cf5..ea3a5dc61bb0 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3092,7 +3092,8 @@ static int amd_iommu_attach_device(struct iommu_domain 
*dom,
 }
 
 static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
-phys_addr_t paddr, size_t page_size, int iommu_prot)
+phys_addr_t paddr, size_t page_size, int iommu_prot,
+gfp_t gfp)
 {
struct protection_domain *domain = to_pdomain(dom);
int prot = 0;
@@ -3106,9 +3107,7 @@ static int amd_iommu_map(struct iommu_domain *dom, 
unsigned long iova,
if (iommu_prot & IOMMU_WRITE)
prot |= IOMMU_PROT_IW;
 
-   mutex_lock(>api_lock);
-   ret = iommu_map_page(domain, iova, paddr, page_size, prot, GFP_KERNEL);
-   mutex_unlock(>api_lock);
+   ret = iommu_map_page(domain, iova, paddr, page_size, prot, gfp);
 
domain_flush_np_cache(domain, iova, page_size);
 
@@ -3119,16 +3118,11 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, 
unsigned long iova,
   size_t page_size)
 {
struct protection_domain *domain = to_pdomain(dom);
-   size_t unmap_size;
 
if (domain->mode == PAGE_MODE_NONE)
return 0;
 
-   mutex_lock(>api_lock);
-   unmap_size = iommu_unmap_page(domain, iova, page_size);
-   mutex_unlock(>api_lock);
-
-   return unmap_size;
+   return iommu_unmap_page(domain, iova, page_size);
 }
 
 static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d3880010c6cf..e5c48089b49f 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1777,7 +1777,7 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
 }
 
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
-   phys_addr_t paddr, size_t size, int prot)
+   phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
 {
struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 045d93884164..2d50db55b788 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1286,7 +1286,7 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
 }
 
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
-   phys_addr_t paddr, size_t size, int prot)
+   phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
 {
struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index fa5713a4f6f8..7a96c2c8f56b 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -440,7 +440,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
phys_addr_t phys,
if (!iova)
	return DMA_MAPPING_ERROR;

[PATCH v2 0/4] iommu/amd: Convert the AMD iommu driver to the dma-iommu api

2019-04-29 Thread Tom Murphy via iommu
Convert the AMD iommu driver to the dma-iommu api. Remove the iova
handling and reserve region code from the AMD iommu driver.

Change-log:
v2:
-Rebase on top of this series:
 http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-iommu-ops.3
-Add a gfp_t parameter to the iommu_ops::map function.
-Made use of the reserve region code inside the dma-iommu api

Tom Murphy (4):
  iommu: Add gfp parameter to iommu_ops::map
  iommu/dma-iommu: Handle deferred devices
  iommu/dma-iommu: Use the dev->coherent_dma_mask
  iommu/amd: Convert the AMD iommu driver to the dma-iommu api

 drivers/iommu/Kconfig  |   1 +
 drivers/iommu/amd_iommu.c  | 694 -
 drivers/iommu/arm-smmu-v3.c|   2 +-
 drivers/iommu/arm-smmu.c   |   2 +-
 drivers/iommu/dma-iommu.c  |  52 ++-
 drivers/iommu/exynos-iommu.c   |   2 +-
 drivers/iommu/intel-iommu.c|   2 +-
 drivers/iommu/iommu.c  |  43 +-
 drivers/iommu/ipmmu-vmsa.c |   2 +-
 drivers/iommu/msm_iommu.c  |   2 +-
 drivers/iommu/mtk_iommu.c  |   2 +-
 drivers/iommu/mtk_iommu_v1.c   |   2 +-
 drivers/iommu/omap-iommu.c |   2 +-
 drivers/iommu/qcom_iommu.c |   2 +-
 drivers/iommu/rockchip-iommu.c |   2 +-
 drivers/iommu/s390-iommu.c |   2 +-
 drivers/iommu/tegra-gart.c |   2 +-
 drivers/iommu/tegra-smmu.c |   2 +-
 include/linux/iommu.h  |  21 +-
 19 files changed, 187 insertions(+), 652 deletions(-)

-- 
2.17.1



[PATCH v2 2/4] iommu/dma-iommu: Handle deferred devices

2019-04-29 Thread Tom Murphy via iommu
Handle devices which defer their attach to the iommu in the dma-iommu api

Signed-off-by: Tom Murphy 
---
 drivers/iommu/dma-iommu.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 7a96c2c8f56b..c18f74ad1e8b 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -322,6 +322,17 @@ static int iommu_dma_init_domain(struct iommu_domain 
*domain, dma_addr_t base,
return iova_reserve_iommu_regions(dev, domain);
 }
 
+static int handle_deferred_device(struct device *dev)
+{
+   struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+   const struct iommu_ops *ops = domain->ops;
+
+   if (ops->is_attach_deferred && ops->is_attach_deferred(domain, dev))
+   return iommu_attach_device(domain, dev);
+
+   return 0;
+}
+
 /**
  * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API
  *page flags.
@@ -835,6 +846,8 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, 
struct page *page,
bool coherent = dev_is_dma_coherent(dev);
dma_addr_t dma_handle;
 
+   handle_deferred_device(dev);
+
dma_handle =__iommu_dma_map(dev, phys, size,
dma_info_to_prot(dir, coherent, attrs),
iommu_get_dma_domain(dev));
@@ -849,6 +862,8 @@ static void iommu_dma_unmap_page(struct device *dev, 
dma_addr_t dma_handle,
 {
struct iommu_domain *domain = iommu_get_dma_domain(dev);
 
+   handle_deferred_device(dev);
+
if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
phys_addr_t phys = iommu_iova_to_phys(domain, dma_handle);
 
@@ -873,6 +888,8 @@ static int __finalise_sg(struct device *dev, struct 
scatterlist *sg, int nents,
unsigned int cur_len = 0, max_len = dma_get_max_seg_size(dev);
int i, count = 0;
 
+   handle_deferred_device(dev);
+
for_each_sg(sg, s, nents, i) {
/* Restore this segment's original unaligned fields first */
unsigned int s_iova_off = sg_dma_address(s);
@@ -1022,6 +1039,8 @@ static void iommu_dma_unmap_sg(struct device *dev, struct 
scatterlist *sg,
struct scatterlist *tmp;
int i;
 
+   handle_deferred_device(dev);
+
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
iommu_dma_sync_sg_for_cpu(dev, sg, nents, dir);
 
@@ -1042,6 +1061,8 @@ static void iommu_dma_unmap_sg(struct device *dev, struct 
scatterlist *sg,
 static dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
+   handle_deferred_device(dev);
+
return __iommu_dma_map(dev, phys, size,
dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO,
iommu_get_dma_domain(dev));
@@ -1050,12 +1071,15 @@ static dma_addr_t iommu_dma_map_resource(struct device 
*dev, phys_addr_t phys,
 static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
+   handle_deferred_device(dev);
+
__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
 }
 
 static void *iommu_dma_alloc(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
 {
+   handle_deferred_device(dev);
gfp |= __GFP_ZERO;
 
 #ifdef CONFIG_DMA_DIRECT_REMAP
@@ -1076,6 +1100,8 @@ static void iommu_dma_free(struct device *dev, size_t 
size, void *cpu_addr,
 {
struct page *page;
 
+   handle_deferred_device(dev);
+
/*
 * cpu_addr can be one of 4 things depending on how it was allocated:
 *
@@ -1115,6 +1141,8 @@ static int iommu_dma_mmap(struct device *dev, struct 
vm_area_struct *vma,
unsigned long pfn;
int ret;
 
+   handle_deferred_device(dev);
+
vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
 
if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, ))
@@ -1143,6 +1171,8 @@ static int iommu_dma_get_sgtable(struct device *dev, 
struct sg_table *sgt,
struct page *page;
int ret;
 
+   handle_deferred_device(dev);
+
 #ifdef CONFIG_DMA_DIRECT_REMAP
if (is_vmalloc_addr(cpu_addr)) {
if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
-- 
2.17.1



Re: [PATCH v2 19/19] iommu/vt-d: Add svm/sva invalidate function

2019-04-29 Thread Jacob Pan
On Fri, 26 Apr 2019 19:23:03 +0200
Auger Eric  wrote:

> Hi Jacob,
> On 4/24/19 1:31 AM, Jacob Pan wrote:
> > When Shared Virtual Address (SVA) is enabled for a guest OS via
> > vIOMMU, we need to provide invalidation support at IOMMU API and
> > driver level. This patch adds Intel VT-d specific function to
> > implement iommu passdown invalidate API for shared virtual address.
> > 
> > The use case is for supporting caching structure invalidation
> > of assigned SVM capable devices. Emulated IOMMU exposes queue
> > invalidation capability and passes down all descriptors from the
> > guest to the physical IOMMU.
> > 
> > The assumption is that guest to host device ID mapping should be
> > resolved prior to calling IOMMU driver. Based on the device handle,
> > host IOMMU driver can replace certain fields before submit to the
> > invalidation queue.
> > 
> > Signed-off-by: Jacob Pan 
> > Signed-off-by: Ashok Raj 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  drivers/iommu/intel-iommu.c | 159 ++++++++++++++++++++++++++++++++
> >  1 file changed, 159 insertions(+)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index 89989b5..54a3d22 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -5338,6 +5338,164 @@ static void intel_iommu_aux_detach_device(struct iommu_domain *domain,
> > 	aux_domain_remove_dev(to_dmar_domain(domain), dev);
> >  }
> >  
> > +/*
> > + * 2D array for converting and sanitizing IOMMU generic TLB granularity
> > + * to VT-d granularity. Invalidation is typically included in the unmap
> > + * operation as a result of a DMA or VFIO unmap. However, an assigned
> > + * device's guest could own the first level page tables without being
> > + * shadowed by QEMU. In this case there is no pass-down unmap to the
> > + * host IOMMU as a result of unmap in the guest; only invalidations are
> > + * trapped and passed down.
> > + * In all cases, only first level TLB invalidation (request with PASID)
> > + * can be passed down, therefore we do not include IOTLB granularity for
> > + * requests without PASID (second level).
> > + *
> > + * For an example, to find the VT-d granularity encoding for IOTLB
> > + * type and page selective granularity within PASID:
> > + * X: indexed by iommu cache type
> > + * Y: indexed by enum iommu_inv_granularity
> > + * [IOMMU_INV_TYPE_TLB][IOMMU_INV_GRANU_PAGE_PASID]
> > + *
> > + * Granu_map array indicates validity of the table. 1: valid, 0: invalid
> > + *
> > + */
> > +const static int
> > inv_type_granu_map[NR_IOMMU_CACHE_TYPE][NR_IOMMU_CACHE_INVAL_GRANU]
> > = {  
> The size is frozen for a given uapi version so I guess you can
> hardcode the limits for a given version.
I guess I could, I just felt more readable this way.
> > +   /* PASID based IOTLB, support PASID selective and page selective */
> > +   {0, 1, 1},
> > +   /* PASID based dev TLBs, only support all PASIDs or single PASID */
> > +   {1, 1, 0},
> > +   /* PASID cache */
> > +   {1, 1, 0}
> > +};
> > +
> > +const static u64
> > inv_type_granu_table[NR_IOMMU_CACHE_TYPE][NR_IOMMU_CACHE_INVAL_GRANU]
> > = {
> > +   /* PASID based IOTLB */
> > +   {0, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID},
> > +   /* PASID based dev TLBs */
> > +   {QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0},
> > +   /* PASID cache */
> > +   {QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0},
> > +};  
> Can't you use a single matrix instead, ie. inv_type_granu_table
> 
The reason I have an additional inv_type_granu_map[] matrix is that
some fields can be 0 but still valid. A single matrix would not be
able to tell the difference between a valid 0 and an invalid entry.
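For example, QI_PC_ALL_PASIDS is redefined to 0 in patch 18/19, yet it
is a perfectly valid PASID-cache granularity; only the map matrix can
mark it as valid.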
> > +
> > +static inline int to_vtd_granularity(int type, int granu, u64 *vtd_granu)
> > +{
> > +   if (type >= NR_IOMMU_CACHE_TYPE || granu >= NR_IOMMU_CACHE_INVAL_GRANU ||
> > +   !inv_type_granu_map[type][granu])
> > +   return -EINVAL;
> > +
> > +   *vtd_granu = inv_type_granu_table[type][granu];
> > +
> > +   return 0;
> > +}
> > +
> > +static inline u64 to_vtd_size(u64 granu_size, u64 nr_granules)
> > +{
> > +   u64 nr_pages;  
> direct initialization?
will do, thanks
> > +   /* VT-d size is encoded as 2^size of 4K pages, 0 for 4k, 9 for 2MB, etc.
> > +    * IOMMU cache invalidate API passes granu_size in bytes, and number of
> > +    * granu size in contiguous memory.
> > +    */
> > +
> > +   nr_pages = (granu_size * nr_granules) >> VTD_PAGE_SHIFT;
> > +   return order_base_2(nr_pages);
> > +}
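
(Worked example: granu_size = 4KB with nr_granules = 512 gives
nr_pages = (4096 * 512) >> 12 = 512, and order_base_2(512) = 9, i.e.
the 2MB encoding mentioned in the comment above.)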
> > +
> > +static int intel_iommu_sva_invalidate(struct iommu_domain *domain,
> > +   struct device *dev, struct iommu_cache_invalidate_info *inv_info)
> > +{
> > +   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > +   struct device_domain_info *info;
> > +   struct intel_iommu *iommu;
> > +   unsigned long flags;
> > +   int cache_type;
> > +   u8 bus, devfn;
> > +   u16 did, sid;
> > +   int ret = 0;
> > +   

Re: [PATCH v2 18/19] iommu/vt-d: Support flushing more translation cache types

2019-04-29 Thread Jacob Pan
On Sat, 27 Apr 2019 11:04:04 +0200
Auger Eric  wrote:

> Hi Jacob,
> 
> On 4/24/19 1:31 AM, Jacob Pan wrote:
> > When Shared Virtual Memory is exposed to a guest via vIOMMU,
> > extended IOTLB invalidation may be passed down from outside IOMMU
> > subsystems. This patch adds invalidation functions that can be used
> > for additional translation cache types.
> > 
> > Signed-off-by: Jacob Pan 
> > ---
> >  drivers/iommu/dmar.c        | 48 ++++++++++++++++++++++++++++++++++++
> >  include/linux/intel-iommu.h | 21 ++++++++++
> >  2 files changed, 65 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> > index 9c49300..680894e 100644
> > --- a/drivers/iommu/dmar.c
> > +++ b/drivers/iommu/dmar.c
> > @@ -1357,6 +1357,20 @@ void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
> > 	qi_submit_sync(&desc, iommu);
> >  }
> >
> /* PASID-based IOTLB Invalidate */
> > +void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u64 addr, u32 pasid,
> > +		unsigned int size_order, u64 granu)
> > +{
> > +   struct qi_desc desc;
> > +
> > +   desc.qw0 = QI_EIOTLB_PASID(pasid) | QI_EIOTLB_DID(did) |
> > +   QI_EIOTLB_GRAN(granu) | QI_EIOTLB_TYPE;
> > +   desc.qw1 = QI_EIOTLB_ADDR(addr) | QI_EIOTLB_IH(0) |
> > +   QI_EIOTLB_AM(size_order);  
> I see IH is hardcoded to 0. Don't you envision cascading the IH? On
> ARM this was needed for performance's sake.
Right, we should cascade IH based on IOMMU_INV_ADDR_FLAGS_LEAF. Just
curious, how do you deduce the IH information on ARM? I guess you need
to get the non-leaf page directory info?
I will add an argument for IH.
> > +   desc.qw2 = 0;
> > +   desc.qw3 = 0;
> > +   qi_submit_sync(&desc, iommu);
> > +}
> > +
> >  void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
> > 		u16 qdep, u64 addr, unsigned mask)
> >  {
> > @@ -1380,6 +1394,40 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
> > 	qi_submit_sync(&desc, iommu);
> >  }
> >
> /* Pasid-based Device-TLB Invalidation */
yes, better to explain piotlb :), same for iotlb.
> > +void qi_flush_dev_piotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
> > +		u32 pasid, u16 qdep, u64 addr, unsigned size, u64 granu)
> > +{
> > +   struct qi_desc desc;
> > +
> > +   desc.qw0 = QI_DEV_EIOTLB_PASID(pasid) | QI_DEV_EIOTLB_SID(sid) |
> > +   QI_DEV_EIOTLB_QDEP(qdep) | QI_DEIOTLB_TYPE |
> > +   QI_DEV_IOTLB_PFSID(pfsid);
> > +   desc.qw1 |= QI_DEV_EIOTLB_GLOB(granu);
should be desc.qw1 =
> > +
> > +   /* If S bit is 0, we only flush a single page. If S bit is set,
> > +    * the least significant zero bit indicates the size. VT-d spec
> > +    * 6.5.2.6
> > +    */
> > +   if (!size)
> > +   desc.qw0 = QI_DEV_EIOTLB_ADDR(addr) & ~QI_DEV_EIOTLB_SIZE;
> desc.q1 |= ?
Right, I also missed previous qw1 assignment.
> > +   else {
> > +   unsigned long mask = 1UL << (VTD_PAGE_SHIFT + size);
> > +
> > +   desc.qw1 = QI_DEV_EIOTLB_ADDR(addr & ~mask) | QI_DEV_EIOTLB_SIZE;
> desc.q1 |=
right, thanks
> > +   }
> > +   qi_submit_sync(&desc, iommu);
> > +}
> > +  
> /* PASID-cache invalidation */
> > +void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did, u64 granu, int pasid)
> > +{
> > +   struct qi_desc desc;
> > +
> > +   desc.qw0 = QI_PC_TYPE | QI_PC_DID(did) | QI_PC_GRAN(granu) |
> > +   QI_PC_PASID(pasid);
> > +   desc.qw1 = 0;
> > +   desc.qw2 = 0;
> > +   desc.qw3 = 0;
> > +   qi_submit_sync(&desc, iommu);
> > +}
> >  /*
> >   * Disable Queued Invalidation interface.
> >   */
> > diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> > index 5d67d0d4..38e5efb 100644
> > --- a/include/linux/intel-iommu.h
> > +++ b/include/linux/intel-iommu.h
> > @@ -339,7 +339,7 @@ enum {
> >  #define QI_IOTLB_GRAN(gran)    (((u64)gran) >> (DMA_TLB_FLUSH_GRANU_OFFSET-4))
> >  #define QI_IOTLB_ADDR(addr)    (((u64)addr) & VTD_PAGE_MASK)
> >  #define QI_IOTLB_IH(ih)        (((u64)ih) << 6)
> > -#define QI_IOTLB_AM(am)        (((u8)am))
> > +#define QI_IOTLB_AM(am)        (((u8)am) & 0x3f)
> >  #define QI_CC_FM(fm)   (((u64)fm) << 48)
> >  #define QI_CC_SID(sid) (((u64)sid) << 32)
> > @@ -357,17 +357,22 @@ enum {
> >  #define QI_PC_DID(did) (((u64)did) << 16)
> >  #define QI_PC_GRAN(gran)   (((u64)gran) << 4)
> >  
> > -#define QI_PC_ALL_PASIDS   (QI_PC_TYPE | QI_PC_GRAN(0))
> > -#define QI_PC_PASID_SEL        (QI_PC_TYPE | QI_PC_GRAN(1))
> > +/* PASID cache invalidation granu */
> > +#define QI_PC_ALL_PASIDS   0
> > +#define QI_PC_PASID_SEL        1
> >  
> >  #define QI_EIOTLB_ADDR(addr)   ((u64)(addr) & VTD_PAGE_MASK)
> >  #define QI_EIOTLB_GL(gl)   (((u64)gl) << 7)
> >  #define QI_EIOTLB_IH(ih)   (((u64)ih) << 6)
> > -#define QI_EIOTLB_AM(am)   (((u64)am))
> > +#define QI_EIOTLB_AM(am)   (((u64)am) & 0x3f)
> >  #define QI_EIOTLB_PASID(pasid) (((u64)pasid) << 32)
> >  #define 

Re: [PATCH v2 0/9] PCI: add help pci_dev_id

2019-04-29 Thread Bjorn Helgaas
On Wed, Apr 24, 2019 at 09:10:21PM +0200, Heiner Kallweit wrote:
> In several places in the kernel we find PCI_DEVID used like this:
> PCI_DEVID(dev->bus->number, dev->devfn)
> Therefore create a helper for it.
> 
> v2:
> - apply the change to all affected places in the kernel
> 
> Heiner Kallweit (9):
>   PCI: add helper pci_dev_id
>   PCI: use helper pci_dev_id
>   r8169: use new helper pci_dev_id
>   powerpc/powernv/npu: use helper pci_dev_id
>   drm/amdkfd: use helper pci_dev_id
>   iommu/amd: use helper pci_dev_id
>   iommu/vt-d: use helper pci_dev_id
>   stmmac: pci: use helper pci_dev_id
>   platform/chrome: chromeos_laptop: use helper pci_dev_id
> 
>  arch/powerpc/platforms/powernv/npu-dma.c | 14 ++
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c|  3 +--
>  drivers/iommu/amd_iommu.c|  2 +-
>  drivers/iommu/intel-iommu.c  |  2 +-
>  drivers/iommu/intel_irq_remapping.c  |  2 +-
>  drivers/net/ethernet/realtek/r8169.c |  3 +--
>  drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c |  2 +-
>  drivers/pci/msi.c|  6 +++---
>  drivers/pci/search.c | 10 +++---
>  drivers/platform/chrome/chromeos_laptop.c|  2 +-
>  include/linux/pci.h  |  5 +
>  11 files changed, 24 insertions(+), 27 deletions(-)

Applied with acks/reviewed-by from Benson, Joerg, Christian, Alexey, and
David to pci/misc for v5.2, thanks!


Re: [PATCH v2 0/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

2019-04-29 Thread Christoph Hellwig
On Mon, Apr 29, 2019 at 04:57:20PM +0100, Marc Zyngier wrote:
> Thanks for having reworked this. I'm quite happy with the way this looks
> now (modulo the couple of nits Robin and I mentioned, which I'm to
> address myself).
> 
> Joerg: are you OK with this going via the irq tree?

As-is this has a trivial conflict with my
"implement generic dma_map_ops for IOMMUs" series.  But I can tweak
it a bit to avoid that conflict.


Re: [PATCH 14/26] iommu/dma: Refactor iommu_dma_free

2019-04-29 Thread Christoph Hellwig
On Mon, Apr 29, 2019 at 09:03:48PM +0200, Christoph Hellwig wrote:
> On Mon, Apr 29, 2019 at 02:59:43PM +0100, Robin Murphy wrote:
> > Hmm, I do still prefer my original flow with the dma_common_free_remap() 
> > call right out of the way at the end rather than being a special case in 
> > the middle of all the page-freeing (which is the kind of existing 
> > complexity I was trying to eliminate). I guess you've done this to avoid 
> > having "if (!dma_release_from_contiguous(...))..." twice like I ended up 
> > with, which is fair enough I suppose - once we manage to solve the new 
> > dma_{alloc,free}_contiguous() interface that may tip the balance so I can 
> > always revisit this then.
> 
> Ok, I'll try to accommodate that with a minor rework.

Does this look reasonable?

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 5b2a2bf44078..f884d22b1388 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -921,7 +921,7 @@ static void iommu_dma_free(struct device *dev, size_t size, 
void *cpu_addr,
 {
size_t alloc_size = PAGE_ALIGN(size);
int count = alloc_size >> PAGE_SHIFT;
-   struct page *page = NULL;
+   struct page *page = NULL, **pages = NULL;
 
__iommu_dma_unmap(dev, handle, size);
 
@@ -934,19 +934,17 @@ static void iommu_dma_free(struct device *dev, size_t 
size, void *cpu_addr,
 * If the address is remapped, then it's either non-coherent
 * or highmem CMA, or an iommu_dma_alloc_remap() construction.
 */
-   struct page **pages = __iommu_dma_get_pages(cpu_addr);
-
-   if (pages)
-   __iommu_dma_free_pages(pages, count);
-   else
+   pages = __iommu_dma_get_pages(cpu_addr);
+   if (!pages)
page = vmalloc_to_page(cpu_addr);
-
dma_common_free_remap(cpu_addr, alloc_size, VM_USERMAP);
} else {
/* Lowmem means a coherent atomic or CMA allocation */
page = virt_to_page(cpu_addr);
}
 
+   if (pages)
+   __iommu_dma_free_pages(pages, count);
if (page && !dma_release_from_contiguous(dev, page, count))
__free_pages(page, get_order(alloc_size));
 }


Re: [PATCH 11/26] iommu/dma: Factor out remapped pages lookup

2019-04-29 Thread Christoph Hellwig
On Mon, Apr 29, 2019 at 02:05:46PM +0100, Robin Murphy wrote:
> On 22/04/2019 18:59, Christoph Hellwig wrote:
>> From: Robin Murphy 
>>
>> Since we duplicate the find_vm_area() logic a few times in places where
>> we only care about the pages, factor out a helper to abstract it.
>>
>> Signed-off-by: Robin Murphy 
>> [hch: don't warn when not finding a region, as we'll rely on that later]
>
> Yeah, I did think about that and the things which it might make a little 
> easier, but preserved it as-is for the sake of keeping my modifications 
> quick and simple. TBH I'm now feeling more inclined to drop the WARNs 
> entirely at this point, since it's not like there's ever been any general 
> guarantee that freeing the wrong thing shouldn't just crash, but that's 
> something we can easily come back to later if need be.

Ok, I've dropped the warnings.


Re: [PATCH 14/26] iommu/dma: Refactor iommu_dma_free

2019-04-29 Thread Christoph Hellwig
On Mon, Apr 29, 2019 at 02:59:43PM +0100, Robin Murphy wrote:
> Hmm, I do still prefer my original flow with the dma_common_free_remap() 
> call right out of the way at the end rather than being a special case in 
> the middle of all the page-freeing (which is the kind of existing 
> complexity I was trying to eliminate). I guess you've done this to avoid 
> having "if (!dma_release_from_contiguous(...))..." twice like I ended up 
> with, which is fair enough I suppose - once we manage to solve the new 
> dma_{alloc,free}_contiguous() interface that may tip the balance so I can 
> always revisit this then.

Ok, I'll try to accommodate that with a minor rework.


Re: [PATCH 02/26] arm64/iommu: improve mmap bounds checking

2019-04-29 Thread Christoph Hellwig
On Mon, Apr 29, 2019 at 01:35:46PM +0100, Robin Murphy wrote:
> On 22/04/2019 18:59, Christoph Hellwig wrote:
>> The nr_pages checks should be done for all mmap requests, not just those
>> using remap_pfn_range.
>
> I think it probably makes sense now to just squash this with #22 one way or 
> the other, but if you really really still want to keep it as a separate 
> patch with a misleading commit message then I'm willing to keep my 
> complaints to myself :)

Well, I split this out in response to your earlier comments, so if you
prefer it squashed back in I can do that.

Re: Linux crash when using FTDI FT232R USB to serial converter on AMD boards.

2019-04-29 Thread Johan Hovold
On Mon, Apr 29, 2019 at 12:51:20PM +0200, starost...@gmail.com wrote:
> Hello,
> sorry for the further questions, but I am new to this list:
> Is Ubuntu server 19.04 with "kernel 5.0.9-050009-generic" good for this 
> test?

Yes, that might do, depending on what else Debian put in that kernel.

> Can I add attachments to this list?

Sure, it's even preferred. Just try to trim non-relevant bits, and
perhaps provide a link to the full log.

> And who are the xhci and iommu maintainers? Are they CC'd in this mail?

Yes, that's Mathias and Joerg, respectively, that I added on CC.

On 29.4.2019 at 11:48, Johan Hovold wrote:
> > So this is a debian 4.18 kernel seemingly crashing due to a xhci or
> > iommu issue.
> >
> > Can you reproduce this on a mainline kernel?
> >
> > If so, please post the corresponding logs to the lists and CC the xhci
> > and iommu maintainers (added to CC).

Here's an excerpt from the 4.18 log meanwhile:

[0.00] Linux version 4.18.0-0.bpo.1-amd64 
(debian-ker...@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 
6.3.0-18+deb9u1)) #1 SMP Debian 4.18.6-1~bpo9+1 (2018-09-13)

...

[  960.145603] xhci_hcd :15:00.0: WARN Set TR Deq Ptr cmd failed due to 
incorrect slot or ep state.
[ 1790.293623] xhci_hcd :15:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x address=0xfff5 flags=0x0020]
[ 1790.300905] xhci_hcd :15:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x address=0xfde9e000 flags=0x0020]

...

[ 4789.145364] xhci_hcd :15:00.0: WARN Set TR Deq Ptr cmd failed due to 
incorrect slot or ep state.
[ 4789.310916] xhci_hcd :15:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x address=0xff63c000 flags=0x0020]
[ 4789.317023] xhci_hcd :15:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x address=0xff5c6000 flags=0x0020]
[ 4789.702446] xhci_hcd :15:00.0: WARN Set TR Deq Ptr cmd failed due to 
incorrect slot or ep state.
[ 4789.766842] xhci_hcd :15:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x address=0xfdeaf000 flags=0x0020]
[ 4789.781531] AMD-Vi: Event logged [IO_PAGE_FAULT device=15:00.0 domain=0x 
address=0xfdeaf040 flags=0x0020]
[ 4790.093644] general protection fault:  [#1] SMP NOPTI
[ 4790.094186] CPU: 2 PID: 24561 Comm: readua Not tainted 4.18.0-0.bpo.1-amd64 
#1 Debian 4.18.6-1~bpo9+1
[ 4790.094738] Hardware name: Micro-Star International Co., Ltd MS-7B38/A320M 
PRO-VD PLUS (MS-7B38), BIOS 1.C0 11/02/2018
[ 4790.095333] RIP: 0010:prefetch_freepointer+0x10/0x20
[ 4790.095936] Code: 58 48 c7 c7 30 09 a5 b0 e8 4b 64 eb ff eb 90 90 66 2e 0f 
1f 84 00 00 00 00 00 0f 1f 44 00 00 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 
48 33 b7 38 01 00 00 0f 18 0e f3 c3 90 0f 1f 44 00 00 8b 
[ 4790.097314] RSP: 0018:b6e303e67ce0 EFLAGS: 00010286
[ 4790.098016] RAX:  RBX: 93773f762ca13047 RCX: 77df
[ 4790.098711] RDX: 77de RSI: 93773f762ca13047 RDI: 9a30de807c00
[ 4790.099413] RBP: 9a30d20cc018 R08: 9a30deca4de0 R09: 
[ 4790.100141] R10: 9a306f3c6638 R11:  R12: 006000c0
[ 4790.100871] R13: 0008 R14: 9a30de807c00 R15: 9a30de807c00
[ 4790.101619] FS:  () GS:9a30dec8(0063) 
knlGS:f7d6a700
[ 4790.102387] CS:  0010 DS: 002b ES: 002b CR0: 80050033
[ 4790.103157] CR2: ffc77e3c CR3: af0c CR4: 003406e0
[ 4790.103954] Call Trace:
[ 4790.104753]  kmem_cache_alloc_trace+0xb5/0x1c0
[ 4790.105580]  ? proc_do_submiturb+0x35a/0xda0 [usbcore]
[ 4790.106382]  proc_do_submiturb+0x35a/0xda0 [usbcore]
[ 4790.107189]  ? futex_wake+0x94/0x170
[ 4790.108009]  proc_submiturb_compat+0xb1/0xe0 [usbcore]
[ 4790.108851]  usbdev_do_ioctl+0x894/0x1170 [usbcore]
[ 4790.109704]  usbdev_compat_ioctl+0xc/0x10 [usbcore]
[ 4790.110553]  __ia32_compat_sys_ioctl+0xc0/0x250
[ 4790.111413]  do_fast_syscall_32+0x98/0x1e0
[ 4790.112280]  entry_SYSCALL_compat_after_hwframe+0x45/0x4d
[ 4790.113167] Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 
nf_defrag_ipv6 ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_tcpudp 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c 
iptable_filter binfmt_misc nls_ascii nls_cp437 vfat fat efi_pstore edac_mce_amd 
ccp rng_core kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass 
crct10dif_pclmul crc32_pclmul snd_hda_codec_hdmi snd_hda_intel 
ghash_clmulni_intel wmi_bmof snd_hda_codec efivars pcspkr snd_hda_core 
snd_hwdep snd_pcm k10temp snd_timer sg snd soundcore sp5100_tco evdev 
pcc_cpufreq acpi_cpufreq efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache 
jbd2 crc32c_generic fscrypto ecb sd_mod hid_generic usbhid hid crc32c_intel 
aesni_intel aes_x86_64 crypto_simd cryptd glue_helper amdgpu chash gpu_sched
[ 4790.120417]  i2c_algo_bit i2c_piix4 ttm drm_kms_helper ahci xhci_pci libahci 
xhci_hcd drm libata r8169 mii usbcore scsi_mod usb_common 

Re: Linux crash when using FTDI FT232R USB to serial converter on AMD boards.

2019-04-29 Thread Johan Hovold
On Fri, Apr 26, 2019 at 03:47:15PM +0200, starost...@gmail.com wrote:
> Hello all,
we are a development and manufacturing company that has used your FT232R serial 
converter for a couple of years with our software. We consume about a 
hundred pcs of FT232R per year. We use the FT232R as a USB serial converter 
> with direct access (no virtual serial port) and as a "hardware key" 
> using FTDIChip-ID. We operate our software with FT232R converters on 
> Windows and Debian Linux operating system.
> 
> We have used Intel motherboards with Intel processors so far. We want to 
> use AMD motherboards with AMD processors too. *We made a couple of tests 
> on AMD motherboards with AMD processors and Debian Linux 9.6, but we 
> have come across a big problem.
**When we open the internal EEPROM of the FT232R for reading, many 
error messages appear in the system log files, and then Debian Linux 
crashes after some time!*
> 
> 
> _1) Hardware configurations:_
> - motherboards with AMD A320M chipset:
>    - MSI A320M PRO-VD PLUS, 
> https://www.msi.com/Motherboard/support/A320M-PRO-VD-PLUS
>    - ASUS PRIME A320M-K, https://www.asus.com/Motherboards/PRIME-A320M-K
>    - GIGABYTE A320M-S2H, 
> https://www.gigabyte.com/Motherboard/GA-A320M-S2H-rev-1x#kf
> - latest bios installed, default bios configuration loaded,
> - CPU AMD Athlon 200GE, AMD Ryzen 3 2200G
> - 4GB RAM, SSD drive Kingston A400 120GB
> 
> _2a) Operating system A:_
> - Debian Linux 9.6 64bit, https://www.debian.org/distrib/, 
> https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-9.6.0-amd64-netinst.iso
> - installed from Netinst installer WITHOUT graphic dekstop, default 
> configuration
> - tested kernels
>    - default kernel 4.9.0.8-amd64, 
> https://packages.debian.org/stretch/linux-image-4.9.0-8-amd64
>    - backports kernel 4.18.0-0.bpo.1-amd64, 
> https://packages.debian.org/stretch-backports/linux-image-4.18.0-0.bpo.1-amd64
> 
> _2b) Operating system B:_
> - Ubuntu server 19.04 64bit, http://releases.ubuntu.com/19.04/, 
> http://releases.ubuntu.com/19.04/ubuntu-19.04-live-server-amd64.iso
> - installed WITHOUT graphic dekstop, default configuration
> - tested kernels
>    - default kernel 5.0.0-amd64, 
> https://packages.debian.org/stretch/linux-image-4.9.0-8-amd64
>    - experimental kernel 5.0.9-050009-generic amd64, 
> https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.0.9/
> 
> _3) Drivers_
> - libftd2xx drivers version 1.4.8, https://www.ftdichip.com/Drivers/D2XX.htm
> 
> _4) Performed tests_
> You can repeat this test:
> - connect FT232R into USB2.0 port (not USB3 port!)
> - use examples in directory: ...\libftd2xx-i386-1.4.8.tar\release\examples\
> - add parameter "-m32" to "CFLAGS" variable into "Rules.make" file
> - compile "\examples\EEPROM\user\read\"
> - run script "test.sh" - see attached file
> - *Debian Linux or Ubuntu server crashes after some minutes* - see 
> attached kernel logs from our system
> - see "kern.log" https://paste.ee/p/xxIZ2

So this is a Debian 4.18 kernel seemingly crashing due to an xhci or
iommu issue.

Can you reproduce this on a mainline kernel?

If so, please post the corresponding logs to the lists and CC the xhci
and iommu maintainers (added to CC).

> _5) Very important note_
> *This problem occurs when FT232R is connected into USB2.0 port only!*
> When it is connected into USB3 port, all works fine, no error messages, 
> no crash.
> 
> _6) Other test that we made_
> - we made other tests on Windows 10
>    - same configuration with ASUS PRIME A320M-K motherboard
>    - latest drivers + latest FTDI drivers
>    - FT232R connected to USB2.0 or USB3 - no problem
> 
> - we made the same tests on Intel architecture (that we use now)
>    - motherboard MSI B250M PRO-VH, CPU Intel Pentium G4560, 4GB RAM, SSD 
> drive Kingston A400 120GB
>    - same operating system Debian Linux 9.6 64bit as described above
>    - FT232R connected to USB2.0 or USB3 - no problem

Johan


Re: Linux crash when using FTDI FT232R USB to serial converter on AMD boards.

2019-04-29 Thread starost...@gmail.com

Hello,
sorry for the additional questions, but I am new to this list:
Is Ubuntu server 19.04 with "kernel 5.0.9-050009-generic" good for this 
test?

Can I add attachments to this list?
And who are the xhci and iommu maintainers? Are they CC'd in this mail?

starosta

On 29.4.2019 at 11:48, Johan Hovold wrote:

So this is a Debian 4.18 kernel seemingly crashing due to an xhci or
iommu issue.

Can you reproduce this on a mainline kernel?

If so, please post the corresponding logs to the lists and CC the xhci
and iommu maintainers (added to CC).




Re: Linux crash when using FTDI FT232R USB to serial converter on AMD boards.

2019-04-29 Thread starost...@gmail.com

Ok,
I attach the log from today's test on Ubuntu 19.04 with the 5.0.9-050009-generic kernel.
Full kern.log: https://paste.ee/p/yF3Qi#section0
Dmesg log: https://paste.ee/p/yF3Qi#section1

Summary:
- motherboard with AMD A320M chipset, CPU AMD Athlon 200GE
- Ubuntu 19.04 default installation, kernel 5.0.9-050009-generic
- connected "FTDI FT232R" (USB to serial converter) to USB port, latest 
FTDI libftd2xx drivers version 1.4.8 installed, 
https://www.ftdichip.com/Drivers/D2XX.htm

- ftdi_sio driver is unloaded (rmmod ftdi_sio)
- problem is with reading EEPROM content from FTDI FT232R chip
- problem is on USB2.0 port only, on USB3 port it works fine

starosta

On 29.4.2019 at 13:22, Johan Hovold wrote:

On Mon, Apr 29, 2019 at 12:51:20PM +0200, starost...@gmail.com wrote:

Hello,
sorry for the additional questions, but I am new to this list:
Is Ubuntu server 19.04 with "kernel 5.0.9-050009-generic" good for this
test?

Yes, that might do, depending on what else Debian put in that kernel.


Can I add attachments to this list?

Sure, it's even preferred. Just try to trim non-relevant bits, and
perhaps provide a link to the full log.


And who are the xhci and iommu maintainers? Are they CC'd in this mail?

Yes, that's Mathias and Joerg, respectively, that I added on CC.


On 29.4.2019 at 11:48, Johan Hovold wrote:

So this is a Debian 4.18 kernel seemingly crashing due to an xhci or
iommu issue.

Can you reproduce this on a mainline kernel?

If so, please post the corresponding logs to the lists and CC the xhci
and iommu maintainers (added to CC).

Here's an excerpt from the 4.18 log meanwhile:

[0.00] Linux version 4.18.0-0.bpo.1-amd64 
(debian-ker...@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 
6.3.0-18+deb9u1)) #1 SMP Debian 4.18.6-1~bpo9+1 (2018-09-13)

...

[  960.145603] xhci_hcd :15:00.0: WARN Set TR Deq Ptr cmd failed due to 
incorrect slot or ep state.
[ 1790.293623] xhci_hcd :15:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x address=0xfff5 flags=0x0020]
[ 1790.300905] xhci_hcd :15:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x address=0xfde9e000 flags=0x0020]

...

[ 4789.145364] xhci_hcd :15:00.0: WARN Set TR Deq Ptr cmd failed due to 
incorrect slot or ep state.
[ 4789.310916] xhci_hcd :15:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x address=0xff63c000 flags=0x0020]
[ 4789.317023] xhci_hcd :15:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x address=0xff5c6000 flags=0x0020]
[ 4789.702446] xhci_hcd :15:00.0: WARN Set TR Deq Ptr cmd failed due to 
incorrect slot or ep state.
[ 4789.766842] xhci_hcd :15:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x address=0xfdeaf000 flags=0x0020]
[ 4789.781531] AMD-Vi: Event logged [IO_PAGE_FAULT device=15:00.0 domain=0x 
address=0xfdeaf040 flags=0x0020]
[ 4790.093644] general protection fault:  [#1] SMP NOPTI
[ 4790.094186] CPU: 2 PID: 24561 Comm: readua Not tainted 4.18.0-0.bpo.1-amd64 
#1 Debian 4.18.6-1~bpo9+1
[ 4790.094738] Hardware name: Micro-Star International Co., Ltd MS-7B38/A320M 
PRO-VD PLUS (MS-7B38), BIOS 1.C0 11/02/2018
[ 4790.095333] RIP: 0010:prefetch_freepointer+0x10/0x20
[ 4790.095936] Code: 58 48 c7 c7 30 09 a5 b0 e8 4b 64 eb ff eb 90 90 66 2e 0f 1f 84 
00 00 00 00 00 0f 1f 44 00 00 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 
b7 38 01 00 00 0f 18 0e f3 c3 90 0f 1f 44 00 00 8b
[ 4790.097314] RSP: 0018:b6e303e67ce0 EFLAGS: 00010286
[ 4790.098016] RAX:  RBX: 93773f762ca13047 RCX: 77df
[ 4790.098711] RDX: 77de RSI: 93773f762ca13047 RDI: 9a30de807c00
[ 4790.099413] RBP: 9a30d20cc018 R08: 9a30deca4de0 R09: 
[ 4790.100141] R10: 9a306f3c6638 R11:  R12: 006000c0
[ 4790.100871] R13: 0008 R14: 9a30de807c00 R15: 9a30de807c00
[ 4790.101619] FS:  () GS:9a30dec8(0063) 
knlGS:f7d6a700
[ 4790.102387] CS:  0010 DS: 002b ES: 002b CR0: 80050033
[ 4790.103157] CR2: ffc77e3c CR3: af0c CR4: 003406e0
[ 4790.103954] Call Trace:
[ 4790.104753]  kmem_cache_alloc_trace+0xb5/0x1c0
[ 4790.105580]  ? proc_do_submiturb+0x35a/0xda0 [usbcore]
[ 4790.106382]  proc_do_submiturb+0x35a/0xda0 [usbcore]
[ 4790.107189]  ? futex_wake+0x94/0x170
[ 4790.108009]  proc_submiturb_compat+0xb1/0xe0 [usbcore]
[ 4790.108851]  usbdev_do_ioctl+0x894/0x1170 [usbcore]
[ 4790.109704]  usbdev_compat_ioctl+0xc/0x10 [usbcore]
[ 4790.110553]  __ia32_compat_sys_ioctl+0xc0/0x250
[ 4790.111413]  do_fast_syscall_32+0x98/0x1e0
[ 4790.112280]  entry_SYSCALL_compat_after_hwframe+0x45/0x4d
[ 4790.113167] Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 
nf_defrag_ipv6 ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_tcpudp 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c 
iptable_filter binfmt_misc nls_ascii nls_cp437 vfat fat efi_pstore 

Re: [PATCH v2 17/19] iommu: Add max num of cache and granu types

2019-04-29 Thread Jacob Pan
On Fri, 26 Apr 2019 18:22:46 +0200
Auger Eric  wrote:

> Hi Jacob,
> 
> On 4/24/19 1:31 AM, Jacob Pan wrote:
> > To convert to/from cache types and granularities between generic and
> > VT-d specific counterparts, a 2D arrary is used. Introduce the
> > limits  
> array
> > to help define the converstion array size.  
> conversion
> > 
will fix, thanks
> > Signed-off-by: Jacob Pan 
> > ---
> >  include/uapi/linux/iommu.h | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> > index 5c95905..2d8fac8 100644
> > --- a/include/uapi/linux/iommu.h
> > +++ b/include/uapi/linux/iommu.h
> > @@ -197,6 +197,7 @@ struct iommu_inv_addr_info {
> > __u64   granule_size;
> > __u64   nb_granules;
> >  };
> > +#define NR_IOMMU_CACHE_INVAL_GRANU (3)
> >  
> >  /**
> >   * First level/stage invalidation information
> > @@ -235,6 +236,7 @@ struct iommu_cache_invalidate_info {
> > struct iommu_inv_addr_info addr_info;
> > };
> >  };
> > +#define NR_IOMMU_CACHE_TYPE	(3)
> >  /**
> >   * struct gpasid_bind_data - Information about device and guest
> > PASID binding
> >   * @gcr3:  Guest CR3 value from guest mm
> >   
> Is it really something that needs to be exposed in the uapi?
> 
I put it in uapi since the related definitions for granularity and
cache type are in the same file.
Maybe putting them close together like this? I was thinking you can just
fold it into your next series as one patch for introducing cache
invalidation.
diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index 2d8fac8..4ff6929 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -164,6 +164,7 @@ enum iommu_inv_granularity {
IOMMU_INV_GRANU_DOMAIN, /* domain-selective invalidation */
IOMMU_INV_GRANU_PASID,  /* pasid-selective invalidation */
IOMMU_INV_GRANU_ADDR,   /* page-selective invalidation */
+   NR_IOMMU_INVAL_GRANU,   /* number of invalidation granularities */
 };
 
 /**
@@ -228,6 +229,7 @@ struct iommu_cache_invalidate_info {
 #define IOMMU_CACHE_INV_TYPE_IOTLB (1 << 0) /* IOMMU IOTLB */
 #define IOMMU_CACHE_INV_TYPE_DEV_IOTLB (1 << 1) /* Device IOTLB */
 #define IOMMU_CACHE_INV_TYPE_PASID (1 << 2) /* PASID cache */
+#define NR_IOMMU_CACHE_TYPE	(3)
__u8	cache;
__u8	granularity;

> Thanks
> 
> Eric

[Jacob Pan]
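
[Editor's note: to make the sizing concrete, here is a minimal sketch (not
from the patch) of how a vendor driver could dimension its generic-to-vendor
conversion table with these limits. The vtd_inv_op values are invented
placeholders, not real VT-d definitions:

	/* Sketch only: vendor-side ops are made up for illustration. */
	enum vtd_inv_op { VTD_INV_NONE, VTD_INV_DOMAIN, VTD_INV_PASID,
			  VTD_INV_ADDR };

	/* One vendor op per (cache type, granularity) pair */
	static const enum vtd_inv_op
	inv_type_granu_table[NR_IOMMU_CACHE_TYPE][NR_IOMMU_INVAL_GRANU] = {
		/* IOTLB        */ { VTD_INV_DOMAIN, VTD_INV_PASID, VTD_INV_ADDR },
		/* Device IOTLB */ { VTD_INV_NONE,   VTD_INV_PASID, VTD_INV_ADDR },
		/* PASID cache  */ { VTD_INV_DOMAIN, VTD_INV_PASID, VTD_INV_NONE },
	};

	/* cache_idx is the bit number of an IOMMU_CACHE_INV_TYPE_* flag and
	 * granu an iommu_inv_granularity value; both are range-checked
	 * before indexing, which is exactly what the limits are for. */
	static int to_vtd_inv_op(unsigned int cache_idx, unsigned int granu)
	{
		if (cache_idx >= NR_IOMMU_CACHE_TYPE ||
		    granu >= NR_IOMMU_INVAL_GRANU)
			return -EINVAL;
		return inv_type_granu_table[cache_idx][granu];
	}]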


Re: [PATCH v4 2/3] iommu/dma: Reserve IOVA for PCIe inaccessible DMA address

2019-04-29 Thread Robin Murphy

On 12/04/2019 04:13, Srinath Mannam wrote:

The dma_ranges field of the PCI host bridge structure has resource entries,
in sorted order, for the address ranges given through the dma-ranges DT
property. This list describes the accessible DMA address ranges, so it is
processed and IOVAs are reserved for the inaccessible address holes between
the entries.

This method is similar to the reservation of PCI I/O resource address
ranges in the IOMMU for each EP connected to the host bridge.

Signed-off-by: Srinath Mannam 
Based-on-patch-by: Oza Pawandeep 
Reviewed-by: Oza Pawandeep 
---
  drivers/iommu/dma-iommu.c | 19 +++
  1 file changed, 19 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d19f3d6..fb42d7c 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -212,6 +212,7 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
struct resource_entry *window;
unsigned long lo, hi;
+   phys_addr_t start = 0, end;
  
	resource_list_for_each_entry(window, &bridge->windows) {

if (resource_type(window->res) != IORESOURCE_MEM)
@@ -221,6 +222,24 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
hi = iova_pfn(iovad, window->res->end - window->offset);
reserve_iova(iovad, lo, hi);
}
+
+   /* Get reserved DMA windows from host bridge */
+   resource_list_for_each_entry(window, &bridge->dma_ranges) {
+   end = window->res->start - window->offset;
+resv_iova:
+   if (end - start) {
+   lo = iova_pfn(iovad, start);
+   hi = iova_pfn(iovad, end);
+   reserve_iova(iovad, lo, hi);
+   }
+   start = window->res->end - window->offset + 1;
+   /* If window is last entry */
+   if (window->node.next == &bridge->dma_ranges &&
+   end != DMA_BIT_MASK(sizeof(dma_addr_t) * BITS_PER_BYTE)) {


I still think that's a very silly way to write "~(dma_addr_t)0", but 
otherwise,


Acked-by: Robin Murphy 
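
[Editor's note: the two spellings Robin compares really are the same value.
A quick standalone check (user-space sketch, with the kernel's DMA_BIT_MASK
definition copied in; dma_addr_t is typedef'd locally as an assumption):

	#include <stdint.h>

	typedef uint64_t dma_addr_t;	/* stand-in for the kernel type */
	#define BITS_PER_BYTE	8
	#define DMA_BIT_MASK(n)	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

	_Static_assert(DMA_BIT_MASK(sizeof(dma_addr_t) * BITS_PER_BYTE) ==
		       ~(dma_addr_t)0, "same value, shorter spelling");]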


+   end = DMA_BIT_MASK(sizeof(dma_addr_t) * BITS_PER_BYTE);
+   goto resv_iova;
+   }
+   }
  }
  
  static int iova_reserve_iommu_regions(struct device *dev,
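
[Editor's note: to see what the loop above computes, here is a standalone
sketch (not kernel code) of the same hole-walking idea: given the sorted
accessible windows, everything between them, plus the tail after the last
window, gets reserved. The example ranges are made up:

	#include <stdio.h>
	#include <stdint.h>

	struct range { uint64_t start, end; };	/* inclusive, sorted */

	int main(void)
	{
		/* hypothetical dma-ranges: two accessible windows */
		struct range win[] = { { 0x00000000,  0x7fffffff },
				       { 0x100000000, 0x1ffffffff } };
		size_t i, n = sizeof(win) / sizeof(win[0]);
		uint64_t start = 0;

		for (i = 0; i < n; i++) {
			/* a hole ends where the next window begins */
			uint64_t end = win[i].start;

			if (end - start)
				printf("reserve [%#jx, %#jx)\n",
				       (uintmax_t)start, (uintmax_t)end);
			start = win[i].end + 1;
		}
		/* tail hole after the last window, up to ~(dma_addr_t)0 */
		printf("reserve [%#jx, %#jx]\n", (uintmax_t)start,
		       (uintmax_t)UINT64_MAX);
		return 0;
	}

This prints "reserve [0x80000000, 0x100000000)" for the gap between the two
windows and "reserve [0x200000000, 0xffffffffffffffff]" for the tail.]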





Re: [PATCH v2 0/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

2019-04-29 Thread Marc Zyngier
Hi Julien,

On 29/04/2019 15:44, Julien Grall wrote:
> Hi all,
> 
> On RT, the function iommu_dma_map_msi_msg expects to be called from 
> preemptible
> context. However, this is not always the case, resulting in a splat with
> CONFIG_DEBUG_ATOMIC_SLEEP:
> 
> [   48.875777] BUG: sleeping function called from invalid context at 
> kernel/locking/rtmutex.c:974
> [   48.875779] in_atomic(): 1, irqs_disabled(): 128, pid: 2103, name: ip
> [   48.875782] INFO: lockdep is turned off.
> [   48.875784] irq event stamp: 10684
> [   48.875786] hardirqs last  enabled at (10683): [] 
> _raw_spin_unlock_irqrestore+0x88/0x90
> [   48.875791] hardirqs last disabled at (10684): [] 
> _raw_spin_lock_irqsave+0x24/0x68
> [   48.875796] softirqs last  enabled at (0): [] 
> copy_process.isra.1.part.2+0x8d8/0x1970
> [   48.875801] softirqs last disabled at (0): [<>]   
> (null)
> [   48.875805] Preemption disabled at:
> [   48.875805] [] __setup_irq+0xd8/0x6c0
> [   48.875811] CPU: 2 PID: 2103 Comm: ip Not tainted 
> 5.0.3-rt1-7-g42ede9a0fed6 #45
> [   48.875815] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno 
> Development Platform, BIOS EDK II Jan 23 2017
> [   48.875817] Call trace:
> [   48.875818]  dump_backtrace+0x0/0x140
> [   48.875821]  show_stack+0x14/0x20
> [   48.875823]  dump_stack+0xa0/0xd4
> [   48.875827]  ___might_sleep+0x16c/0x1f8
> [   48.875831]  rt_spin_lock+0x5c/0x70
> [   48.875835]  iommu_dma_map_msi_msg+0x5c/0x1d8
> [   48.875839]  gicv2m_compose_msi_msg+0x3c/0x48
> [   48.875843]  irq_chip_compose_msi_msg+0x40/0x58
> [   48.875846]  msi_domain_activate+0x38/0x98
> [   48.875849]  __irq_domain_activate_irq+0x58/0xa0
> [   48.875852]  irq_domain_activate_irq+0x34/0x58
> [   48.875855]  irq_activate+0x28/0x30
> [   48.875858]  __setup_irq+0x2b0/0x6c0
> [   48.875861]  request_threaded_irq+0xdc/0x188
> [   48.875865]  sky2_setup_irq+0x44/0xf8
> [   48.875868]  sky2_open+0x1a4/0x240
> [   48.875871]  __dev_open+0xd8/0x188
> [   48.875874]  __dev_change_flags+0x164/0x1f0
> [   48.875877]  dev_change_flags+0x20/0x60
> [   48.875879]  do_setlink+0x2a0/0xd30
> [   48.875882]  __rtnl_newlink+0x5b4/0x6d8
> [   48.875885]  rtnl_newlink+0x50/0x78
> [   48.875888]  rtnetlink_rcv_msg+0x178/0x640
> [   48.875891]  netlink_rcv_skb+0x58/0x118
> [   48.875893]  rtnetlink_rcv+0x14/0x20
> [   48.875896]  netlink_unicast+0x188/0x200
> [   48.875898]  netlink_sendmsg+0x248/0x3d8
> [   48.875900]  sock_sendmsg+0x18/0x40
> [   48.875904]  ___sys_sendmsg+0x294/0x2d0
> [   48.875908]  __sys_sendmsg+0x68/0xb8
> [   48.875911]  __arm64_sys_sendmsg+0x20/0x28
> [   48.875914]  el0_svc_common+0x90/0x118
> [   48.875918]  el0_svc_handler+0x2c/0x80
> [   48.875922]  el0_svc+0x8/0xc
> 
> This series is a first attempt to rework how MSI are mapped and composed
> when an IOMMU is present.
> 
> I was able to test the changes in GICv2m and GICv3 ITS. I don't have
> hardware for the other interrupt controllers.
> 
> Cheers,
> 
> Julien Grall (7):
>   genirq/msi: Add a new field in msi_desc to store an IOMMU cookie
>   iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts
>   irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg()
>   irqchip/gic-v3-its: Don't map the MSI page in
> its_irq_compose_msi_msg()
>   irqchip/ls-scfg-msi: Don't map the MSI page in
> ls_scfg_msi_compose_msg()
>   irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b,
> s}i_msg()
>   iommu/dma-iommu: Remove iommu_dma_map_msi_msg()
> 
>  drivers/iommu/Kconfig |  1 +
>  drivers/iommu/dma-iommu.c | 49 +++
>  drivers/irqchip/irq-gic-v2m.c |  8 ++-
>  drivers/irqchip/irq-gic-v3-its.c  |  5 +++-
>  drivers/irqchip/irq-gic-v3-mbi.c  | 15 ++--
>  drivers/irqchip/irq-ls-scfg-msi.c |  7 +-
>  include/linux/dma-iommu.h | 22 --
>  include/linux/msi.h   | 26 +
>  kernel/irq/Kconfig|  3 +++
>  9 files changed, 109 insertions(+), 27 deletions(-)

Thanks for having reworked this. I'm quite happy with the way this looks
now (modulo the couple of nits Robin and I mentioned, which I'll
address myself).

Joerg: are you OK with this going via the irq tree?

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 2/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts

2019-04-29 Thread Marc Zyngier
On 29/04/2019 15:44, Julien Grall wrote:
> On RT, iommu_dma_map_msi_msg() may be called from non-preemptible
> context. This will lead to a splat with CONFIG_DEBUG_ATOMIC_SLEEP as
> the function uses spin_lock (spinlocks can sleep on RT).
> 
> iommu_dma_map_msi_msg() is used to map the MSI page in the IOMMU PT
> and update the MSI message with the IOVA.
> 
> Only the lookup of the MSI page needs to be done in
> preemptible context. As the MSI page cannot change over the lifecycle
> of the MSI interrupt, the lookup can be cached and re-used later on.
> 
> iommu_dma_map_msi_msg() is now split into two functions:
> - iommu_dma_prepare_msi(): This function will prepare the mapping
> in the IOMMU and store the cookie in the structure msi_desc. This
> function should be called in preemptible context.
> - iommu_dma_compose_msi_msg(): This function will update the MSI
> message with the IOVA when the device is behind an IOMMU.
> 
> Signed-off-by: Julien Grall 
> 
> ---
> Changes in v2:
> - Rework the commit message to use imperative mood
> - Use the MSI accessor to get/set the iommu cookie
> - Don't use ternary on return
> - Select CONFIG_IRQ_MSI_IOMMU
> - Pass an msi_desc rather than the irq number
> ---
>  drivers/iommu/Kconfig |  1 +
>  drivers/iommu/dma-iommu.c | 47 ++-
>  include/linux/dma-iommu.h | 23 +++
>  3 files changed, 62 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 6f07f3b21816..eb1c8cd243f9 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -94,6 +94,7 @@ config IOMMU_DMA
>   bool
>   select IOMMU_API
>   select IOMMU_IOVA
> + select IRQ_MSI_IOMMU
>   select NEED_SG_DMA_LENGTH
>  
>  config FSL_PAMU
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 77aabe637a60..2309f59cefa4 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -888,17 +888,18 @@ static struct iommu_dma_msi_page 
> *iommu_dma_get_msi_page(struct device *dev,
>   return NULL;
>  }
>  
> -void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
> +int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
>  {
> - struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
> + struct device *dev = msi_desc_to_dev(desc);
>   struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
>   struct iommu_dma_cookie *cookie;
>   struct iommu_dma_msi_page *msi_page;
> - phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
>   unsigned long flags;
>  
> - if (!domain || !domain->iova_cookie)
> - return;
> + if (!domain || !domain->iova_cookie) {
> + desc->iommu_cookie = NULL;

nit: This could be

msi_desc_set_iommu_cookie(desc, NULL);

now that you have introduced the relevant accessors.

> + return 0;
> + }
>  
>   cookie = domain->iova_cookie;
>  
> @@ -911,7 +912,37 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
>   msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
> 	spin_unlock_irqrestore(&cookie->msi_lock, flags);
>  
> - if (WARN_ON(!msi_page)) {
> + msi_desc_set_iommu_cookie(desc, msi_page);
> +
> + if (!msi_page)
> + return -ENOMEM;
> + else
> + return 0;
> +}
> +
> +void iommu_dma_compose_msi_msg(struct msi_desc *desc,
> +struct msi_msg *msg)
> +{
> + struct device *dev = msi_desc_to_dev(desc);
> + const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> + const struct iommu_dma_msi_page *msi_page;
> +
> + msi_page = msi_desc_get_iommu_cookie(desc);
> +
> + if (!domain || !domain->iova_cookie || WARN_ON(!msi_page))
> + return;
> +
> + msg->address_hi = upper_32_bits(msi_page->iova);
> + msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
> + msg->address_lo += lower_32_bits(msi_page->iova);
> +}
> +
> +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
> +{
> + struct msi_desc *desc = irq_get_msi_desc(irq);
> + phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
> +
> + if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
>   /*
>* We're called from a void callback, so the best we can do is
>* 'fail' by filling the message with obviously bogus values.
> @@ -922,8 +953,6 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
>   msg->address_lo = ~0U;
>   msg->data = ~0U;
>   } else {
> - msg->address_hi = upper_32_bits(msi_page->iova);
> - msg->address_lo &= cookie_msi_granule(cookie) - 1;
> - msg->address_lo += lower_32_bits(msi_page->iova);
> + iommu_dma_compose_msi_msg(desc, msg);
>   }
>  }
> 
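
[Editor's note: applied, Marc's suggestion would make the early-exit path
above read (sketch):

	if (!domain || !domain->iova_cookie) {
		/* Use the new accessor instead of poking the field */
		msi_desc_set_iommu_cookie(desc, NULL);
		return 0;
	}]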

Re: [PATCH v2 15/19] iommu/vt-d: Add bind guest PASID support

2019-04-29 Thread Jacob Pan
On Fri, 26 Apr 2019 18:15:27 +0200
Auger Eric  wrote:

> Hi Jacob,
> 
> On 4/24/19 1:31 AM, Jacob Pan wrote:
> > When supporting guest SVA with emulated IOMMU, the guest PASID
> > table is shadowed in VMM. Updates to guest vIOMMU PASID table
> > will result in PASID cache flush which will be passed down to
> > the host as bind guest PASID calls.
> > 
> > For the SL page tables, it will be harvested from device's
> > default domain (request w/o PASID), or aux domain in case of
> > mediated device.
> > 
> > .-------------.  .---------------------------.
> > |   vIOMMU    |  | Guest process CR3, FL only|
> > |             |  '---------------------------'
> > .----------------/
> > | PASID Entry |--- PASID cache flush ------
> > '-------------'   |
> > |             |   V
> > |             |  CR3 in GPA
> > '-------------'
> > Guest
> > ------| Shadow |--------------------------|--------
> >       v        v                          v
> > Host
> > .-------------.  .----------------------.
> > |   pIOMMU    |  | Bind FL for GVA-GPA  |
> > |             |  '----------------------'
> > .----------------/  |
> > | PASID Entry |     V (Nested xlate)
> > '----------------\.------------------------------.
> > |             |   |SL for GPA-HPA, default domain|
> > |             |   '------------------------------'
> > '-------------'
> > Where:
> >  - FL = First level/stage one page tables
> >  - SL = Second level/stage two page tables
> > 
> > Signed-off-by: Jacob Pan 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  drivers/iommu/intel-iommu.c |   4 +
> >  drivers/iommu/intel-svm.c   | 174 
> >  include/linux/intel-iommu.h |  10 ++-
> >  include/linux/intel-svm.h   |   7 ++
> >  4 files changed, 193 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c
> > b/drivers/iommu/intel-iommu.c index 77bbe1b..89989b5 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -5768,6 +5768,10 @@ const struct iommu_ops intel_iommu_ops = {
> > .dev_enable_feat= intel_iommu_dev_enable_feat,
> > .dev_disable_feat   = intel_iommu_dev_disable_feat,
> > .pgsize_bitmap  = INTEL_IOMMU_PGSIZES,
> > +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +   .sva_bind_gpasid= intel_svm_bind_gpasid,
> > +   .sva_unbind_gpasid  = intel_svm_unbind_gpasid,
> > +#endif
> >  };
> >  
> >  static void quirk_iommu_g4x_gfx(struct pci_dev *dev)
> > diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> > index 8fff212..0a973c2 100644
> > --- a/drivers/iommu/intel-svm.c
> > +++ b/drivers/iommu/intel-svm.c
> > @@ -227,6 +227,180 @@ static const struct mmu_notifier_ops
> > intel_mmuops = { 
> >  static DEFINE_MUTEX(pasid_mutex);
> >  static LIST_HEAD(global_svm_list);
> > +#define for_each_svm_dev() \
> > +   list_for_each_entry(sdev, &svm->devs, list) \
> > +   if (dev == sdev->dev)   \
> > +
> > +int intel_svm_bind_gpasid(struct iommu_domain *domain,
> > +   struct device *dev,
> > +   struct gpasid_bind_data *data)
> > +{
> > +   struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> > +   struct intel_svm_dev *sdev;
> > +   struct intel_svm *svm = NULL;
> > +   struct dmar_domain *ddomain;
> > +   int pasid_max;
> > +   int ret = 0;
> > +
> > +   if (WARN_ON(!iommu) || !data)
> > +   return -EINVAL;
> > +
> > +   if (dev_is_pci(dev)) {
> > +   pasid_max = pci_max_pasids(to_pci_dev(dev));
> > +   if (pasid_max < 0)
> > +   return -EINVAL;
> > +   } else
> > +   pasid_max = 1 << 20;
> > +
> > +   if (data->pasid <= 0 || data->pasid >= pasid_max)
> > +   return -EINVAL;
> > +
> > +   ddomain = to_dmar_domain(domain);
> > +   /* REVISIT:
> > +* Sanity check address width and paging mode support
> > +* width matching in two dimensions:
> > +* 1. paging mode CPU <= IOMMU
> > +* 2. address width Guest <= Host.
> > +*/
> > +   mutex_lock(&pasid_mutex);
> > +   svm = ioasid_find(NULL, data->pasid, NULL);
> > +   if (IS_ERR(svm)) {
> > +   ret = PTR_ERR(svm);
> > +   goto out;
> > +   }
> > +   if (svm) {
> > +   if (list_empty(&svm->devs)) {
> > +   dev_err(dev, "GPASID %d has no devices
> > bond but SVA is allocated\n",
> > +   data->pasid);
> > +   ret = -ENODEV; /*
> > +   * If we found svm for the
> > PASID, there must be at
> > +   * least one device bond,
> > otherwise svm should be freed.
> > +   */  
> The comment should be put after the list_empty check, I think. In which
> circumstances can this happen? I mean, isn't it a BUG_ON case?
Well, I think failing to bind guest PASID is not severe enough to the
host to use 

Re: [PATCH v2 7/7] iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

2019-04-29 Thread Robin Murphy

On 29/04/2019 15:44, Julien Grall wrote:

A recent change split iommu_dma_map_msi_msg() into two new functions. The
function was still implemented to avoid modifying all the callers at
once.

Now that all the callers have been reworked, iommu_dma_map_msi_msg() can
be removed.


Yay! The end of my horrible bodge :)

Reviewed-by: Robin Murphy 


Signed-off-by: Julien Grall 

---
 Changes in v2:
 - Rework the commit message
---
  drivers/iommu/dma-iommu.c | 20 
  include/linux/dma-iommu.h |  5 -
  2 files changed, 25 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 2309f59cefa4..12f4464828a4 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -936,23 +936,3 @@ void iommu_dma_compose_msi_msg(struct msi_desc *desc,
msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
msg->address_lo += lower_32_bits(msi_page->iova);
  }
-
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-   struct msi_desc *desc = irq_get_msi_desc(irq);
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
-
-   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
-   /*
-* We're called from a void callback, so the best we can do is
-* 'fail' by filling the message with obviously bogus values.
-* Since we got this far due to an IOMMU being present, it's
-* not like the existing address would have worked anyway...
-*/
-   msg->address_hi = ~0U;
-   msg->address_lo = ~0U;
-   msg->data = ~0U;
-   } else {
-   iommu_dma_compose_msi_msg(desc, msg);
-   }
-}
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 3fc48fbd6f63..ddd217c197df 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -82,7 +82,6 @@ int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t 
msi_addr);
  void iommu_dma_compose_msi_msg(struct msi_desc *desc,
   struct msi_msg *msg);
  
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);

  void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
  
  #else

@@ -122,10 +121,6 @@ static inline void iommu_dma_compose_msi_msg(struct 
msi_desc *desc,
  {
  }
  
-static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)

-{
-}
-
  static inline void iommu_dma_get_resv_regions(struct device *dev, struct 
list_head *list)
  {
  }




Re: [PATCH v2 2/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts

2019-04-29 Thread Robin Murphy

On 29/04/2019 15:44, Julien Grall wrote:

On RT, iommu_dma_map_msi_msg() may be called from non-preemptible
context. This will lead to a splat with CONFIG_DEBUG_ATOMIC_SLEEP as
the function uses spin_lock (spinlocks can sleep on RT).

iommu_dma_map_msi_msg() is used to map the MSI page in the IOMMU PT
and update the MSI message with the IOVA.

Only the lookup of the MSI page needs to be done in
preemptible context. As the MSI page cannot change over the lifecycle
of the MSI interrupt, the lookup can be cached and re-used later on.

iommu_dma_map_msi_msg() is now split into two functions:
 - iommu_dma_prepare_msi(): This function will prepare the mapping
 in the IOMMU and store the cookie in the structure msi_desc. This
 function should be called in preemptible context.
 - iommu_dma_compose_msi_msg(): This function will update the MSI
 message with the IOVA when the device is behind an IOMMU.

Signed-off-by: Julien Grall 

---
 Changes in v2:
 - Rework the commit message to use imperative mood
 - Use the MSI accessor to get/set the iommu cookie
 - Don't use ternary on return
 - Select CONFIG_IRQ_MSI_IOMMU
 - Pass an msi_desc rather than the irq number
---
  drivers/iommu/Kconfig |  1 +
  drivers/iommu/dma-iommu.c | 47 ++-
  include/linux/dma-iommu.h | 23 +++
  3 files changed, 62 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6f07f3b21816..eb1c8cd243f9 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -94,6 +94,7 @@ config IOMMU_DMA
bool
select IOMMU_API
select IOMMU_IOVA
+   select IRQ_MSI_IOMMU
select NEED_SG_DMA_LENGTH
  
  config FSL_PAMU

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77aabe637a60..2309f59cefa4 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -888,17 +888,18 @@ static struct iommu_dma_msi_page 
*iommu_dma_get_msi_page(struct device *dev,
return NULL;
  }
  
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)

+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
  {
-   struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
+   struct device *dev = msi_desc_to_dev(desc);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_dma_cookie *cookie;
struct iommu_dma_msi_page *msi_page;
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
unsigned long flags;
  
-	if (!domain || !domain->iova_cookie)

-   return;
+   if (!domain || !domain->iova_cookie) {
+   desc->iommu_cookie = NULL;
+   return 0;
+   }
  
  	cookie = domain->iova_cookie;
  
@@ -911,7 +912,37 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)

msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
	spin_unlock_irqrestore(&cookie->msi_lock, flags);
  
-	if (WARN_ON(!msi_page)) {

+   msi_desc_set_iommu_cookie(desc, msi_page);
+
+   if (!msi_page)
+   return -ENOMEM;
+   else


Nit: the "else" isn't really necessary.
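
[Editor's note: i.e. the tail of iommu_dma_prepare_msi() could simply be:

	if (!msi_page)
		return -ENOMEM;
	return 0;]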


+   return 0;
+}
+
+void iommu_dma_compose_msi_msg(struct msi_desc *desc,
+  struct msi_msg *msg)
+{
+   struct device *dev = msi_desc_to_dev(desc);
+   const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+   const struct iommu_dma_msi_page *msi_page;
+
+   msi_page = msi_desc_get_iommu_cookie(desc);
+
+   if (!domain || !domain->iova_cookie || WARN_ON(!msi_page))
+   return;
+
+   msg->address_hi = upper_32_bits(msi_page->iova);
+   msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
+   msg->address_lo += lower_32_bits(msi_page->iova);
+}
+
+void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+{
+   struct msi_desc *desc = irq_get_msi_desc(irq);
+   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
+
+   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
/*
 * We're called from a void callback, so the best we can do is
 * 'fail' by filling the message with obviously bogus values.
@@ -922,8 +953,6 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msg->address_lo = ~0U;
msg->data = ~0U;
} else {
-   msg->address_hi = upper_32_bits(msi_page->iova);
-   msg->address_lo &= cookie_msi_granule(cookie) - 1;
-   msg->address_lo += lower_32_bits(msi_page->iova);
+   iommu_dma_compose_msi_msg(desc, msg);
}
  }
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index e760dc5d1fa8..3fc48fbd6f63 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -71,12 +71,24 @@ void 

Re: [PATCH v2 1/7] genirq/msi: Add a new field in msi_desc to store an IOMMU cookie

2019-04-29 Thread Robin Murphy

On 29/04/2019 15:44, Julien Grall wrote:

When an MSI doorbell is located downstream of an IOMMU, it is required
to swizzle the physical address with an appropriately-mapped IOVA for any
device attached to one of our DMA ops domains.

At the moment, the allocation of the mapping may be done when composing
the message. However, the composing may happen in non-preemptible
context while the allocation must be done from preemptible
context.

A follow-up change will split the current logic in two functions
requiring to keep an IOMMU cookie per MSI.

A new field is introduced in msi_desc to store an IOMMU cookie. As the
cookie may not be required in some configuration, the field is protected
under a new config CONFIG_IRQ_MSI_IOMMU.

A pair of helpers has also been introduced to access the field.


Reviewed-by: Robin Murphy 


Signed-off-by: Julien Grall 

---
 Changes in v2:
 - Update the commit message to use imperative mood
 - Protect the field with a new config that will be selected by
 IOMMU_DMA later on
 - Add a set of helpers to access the new field
---
  include/linux/msi.h | 26 ++
  kernel/irq/Kconfig  |  3 +++
  2 files changed, 29 insertions(+)

diff --git a/include/linux/msi.h b/include/linux/msi.h
index 7e9b81c3b50d..82a308c19222 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -77,6 +77,9 @@ struct msi_desc {
struct device   *dev;
struct msi_msg  msg;
struct irq_affinity_desc*affinity;
+#ifdef CONFIG_IRQ_MSI_IOMMU
+   const void  *iommu_cookie;
+#endif
  
  	union {

/* PCI MSI/X specific data */
@@ -119,6 +122,29 @@ struct msi_desc {
  #define for_each_msi_entry_safe(desc, tmp, dev)   \
list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
  
+#ifdef CONFIG_IRQ_MSI_IOMMU

+static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
+{
+   return desc->iommu_cookie;
+}
+
+static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc,
+const void *iommu_cookie)
+{
+   desc->iommu_cookie = iommu_cookie;
+}
+#else
+static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
+{
+   return NULL;
+}
+
+static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc,
+const void *iommu_cookie)
+{
+}
+#endif
+
  #ifdef CONFIG_PCI_MSI
  #define first_pci_msi_entry(pdev) first_msi_entry(&(pdev)->dev)
  #define for_each_pci_msi_entry(desc, pdev)\
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 5f3e2baefca9..8fee06625c37 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -91,6 +91,9 @@ config GENERIC_MSI_IRQ_DOMAIN
select IRQ_DOMAIN_HIERARCHY
select GENERIC_MSI_IRQ
  
+config IRQ_MSI_IOMMU

+   bool
+
  config HANDLE_DOMAIN_IRQ
bool
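
[Editor's note: as a quick illustration of the accessor pair (a sketch, not
part of the patch), a caller stashes a pointer at prepare time and fetches
it back later; with CONFIG_IRQ_MSI_IOMMU disabled the getter simply returns
NULL:

	/* 'cookie' is whatever the IOMMU layer wants to remember,
	 * e.g. the MSI page looked up at prepare time. */
	static void example_stash(struct msi_desc *desc, const void *cookie)
	{
		msi_desc_set_iommu_cookie(desc, cookie);
	}

	static const void *example_fetch(struct msi_desc *desc)
	{
		return msi_desc_get_iommu_cookie(desc);
	}]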
  




Re: implement generic dma_map_ops for IOMMUs v3

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

Hi Robin,

please take a look at this series, which implements a completely generic
set of dma_map_ops for IOMMU drivers.  This is done by taking the
existing arm64 code, moving it to drivers/iommu and then massaging it
so that it can also work for architectures with DMA remapping.  This
should help future ports to support IOMMUs more easily, and also allow
the removal of various custom IOMMU dma_map_ops implementations, like Tom
was planning to for the AMD one.


Modulo a few nits, I think this looks pretty much good to go, and it 
would certainly be good to get as much as reasonable queued soon for the 
sake of progress elsewhere.


Catalin, Will, I think the arm64 parts are all OK but please take a look 
to confirm.


Thanks,
Robin.



A git tree is also available at:

 git://git.infradead.org/users/hch/misc.git dma-iommu-ops.3

Gitweb:

 
http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-iommu-ops.3

Changes since v2:
  - address various review comments and include patches from Robin

Changes since v1:
  - only include other headers in dma-iommu.h if CONFIG_DMA_IOMMU is enabled
  - keep using a scatterlist in iommu_dma_alloc
  - split out mmap/sgtable fixes and move them early in the series
  - updated a few commit logs




Re: [PATCH 26/26] arm64: trim includes in dma-mapping.c

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

With most of the previous functionality now elsewhere, a lot of the
headers included in this file are not needed.

Signed-off-by: Christoph Hellwig 
---
  arch/arm64/mm/dma-mapping.c | 11 ---
  1 file changed, 11 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 184ef9ccd69d..15bd768ceb7e 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -5,20 +5,9 @@
   */
  
  #include 

-#include 
-#include 
  #include 
-#include 
-#include 
-#include 
-#include 
  #include 
-#include 
  #include 
-#include 
-#include 
-#include 
-


Nit: please keep the blank line between linux/ and asm/ include blocks 
to match the predominant local style.


With that,

Reviewed-by: Robin Murphy 


  #include 
  
  pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,





Re: [PATCH 20/26] iommu/dma: Refactor iommu_dma_alloc, part 2

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

From: Robin Murphy 


Honestly, I don't think there's anything left of my patch here...


Apart from the iommu_dma_alloc_remap() case which remains sufficiently
different that it's better off being self-contained, the rest of the
logic can now be consolidated into a single flow which separates the
logically-distinct steps of allocating pages, getting the CPU address,
and finally getting the IOMMU address.


...and it certainly doesn't do that any more.

It's clear that we have fundamentally different ways of reading code, so 
I don't think it's productive to keep arguing personal preference - I 
still find the end result here a fair bit more tolerable than before, so 
if you update the commit message to reflect the actual change (at which 
point there's really nothing left of my authorship) I can live with it.


Robin.


Signed-off-by: Robin Murphy 
[hch: split the page allocation into a new helper to simplify the flow]
Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 65 +--
  1 file changed, 35 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9b269f0792f3..acdfe866cb29 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -955,35 +955,14 @@ static void iommu_dma_free(struct device *dev, size_t 
size, void *cpu_addr,
__iommu_dma_free(dev, size, cpu_addr);
  }
  
-static void *iommu_dma_alloc(struct device *dev, size_t size,

-   dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
+static void *iommu_dma_alloc_pages(struct device *dev, size_t size,
+   struct page **pagep, gfp_t gfp, unsigned long attrs)
  {
bool coherent = dev_is_dma_coherent(dev);
-   int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
size_t alloc_size = PAGE_ALIGN(size);
struct page *page = NULL;
void *cpu_addr;
  
-	gfp |= __GFP_ZERO;

-
-   if (gfpflags_allow_blocking(gfp) &&
-   !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
-   return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
-
-   if (!gfpflags_allow_blocking(gfp) && !coherent) {
-   cpu_addr = dma_alloc_from_pool(alloc_size, &page, gfp);
-   if (!cpu_addr)
-   return NULL;
-
-   *handle = __iommu_dma_map(dev, page_to_phys(page), size,
- ioprot);
-   if (*handle == DMA_MAPPING_ERROR) {
-   dma_free_from_pool(cpu_addr, alloc_size);
-   return NULL;
-   }
-   return cpu_addr;
-   }
-
if (gfpflags_allow_blocking(gfp))
page = dma_alloc_from_contiguous(dev, alloc_size >> PAGE_SHIFT,
 get_order(alloc_size),
@@ -993,33 +972,59 @@ static void *iommu_dma_alloc(struct device *dev, size_t 
size,
if (!page)
return NULL;
  
-	*handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot);

-   if (*handle == DMA_MAPPING_ERROR)
-   goto out_free_pages;
-
if (!coherent || PageHighMem(page)) {
pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
  
  		cpu_addr = dma_common_contiguous_remap(page, alloc_size,

VM_USERMAP, prot, __builtin_return_address(0));
if (!cpu_addr)
-   goto out_unmap;
+   goto out_free_pages;
  
  		if (!coherent)

arch_dma_prep_coherent(page, size);
} else {
cpu_addr = page_address(page);
}
+
+   *pagep = page;
memset(cpu_addr, 0, alloc_size);
return cpu_addr;
-out_unmap:
-   __iommu_dma_unmap(dev, *handle, size);
  out_free_pages:
if (!dma_release_from_contiguous(dev, page, alloc_size >> PAGE_SHIFT))
__free_pages(page, get_order(alloc_size));
return NULL;
  }
  
+static void *iommu_dma_alloc(struct device *dev, size_t size,

+   dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
+{
+   bool coherent = dev_is_dma_coherent(dev);
+   int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+   struct page *page = NULL;
+   void *cpu_addr;
+
+   gfp |= __GFP_ZERO;
+
+   if (gfpflags_allow_blocking(gfp) &&
+   !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
+   return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
+
+   if (!gfpflags_allow_blocking(gfp) && !coherent)
+   cpu_addr = dma_alloc_from_pool(PAGE_ALIGN(size), &page, gfp);
+   else
+   cpu_addr = iommu_dma_alloc_pages(dev, size, &page, gfp, attrs);
+   if (!cpu_addr)
+   return NULL;
+
+   *handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot);
+   if (*handle == DMA_MAPPING_ERROR) {
+   

[PATCH v2 7/7] iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

2019-04-29 Thread Julien Grall
A recent change split iommu_dma_map_msi_msg() into two new functions. The
function was still implemented to avoid modifying all the callers at
once.

Now that all the callers have been reworked, iommu_dma_map_msi_msg() can
be removed.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message
---
 drivers/iommu/dma-iommu.c | 20 
 include/linux/dma-iommu.h |  5 -
 2 files changed, 25 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 2309f59cefa4..12f4464828a4 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -936,23 +936,3 @@ void iommu_dma_compose_msi_msg(struct msi_desc *desc,
msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
msg->address_lo += lower_32_bits(msi_page->iova);
 }
-
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-   struct msi_desc *desc = irq_get_msi_desc(irq);
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
-
-   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
-   /*
-* We're called from a void callback, so the best we can do is
-* 'fail' by filling the message with obviously bogus values.
-* Since we got this far due to an IOMMU being present, it's
-* not like the existing address would have worked anyway...
-*/
-   msg->address_hi = ~0U;
-   msg->address_lo = ~0U;
-   msg->data = ~0U;
-   } else {
-   iommu_dma_compose_msi_msg(desc, msg);
-   }
-}
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 3fc48fbd6f63..ddd217c197df 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -82,7 +82,6 @@ int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t 
msi_addr);
 void iommu_dma_compose_msi_msg(struct msi_desc *desc,
   struct msi_msg *msg);
 
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 
 #else
@@ -122,10 +121,6 @@ static inline void iommu_dma_compose_msi_msg(struct 
msi_desc *desc,
 {
 }
 
-static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-}
-
 static inline void iommu_dma_get_resv_regions(struct device *dev, struct 
list_head *list)
 {
 }
-- 
2.11.0



Re: [PATCH 23/26] iommu/dma: Don't depend on CONFIG_DMA_DIRECT_REMAP

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

For entirely dma coherent architectures there is no requirement to ever
remap dma coherent allocations.  Move all the remap and pool code under
IS_ENABLED() checks and drop the Kconfig dependency.


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/Kconfig |  1 -
  drivers/iommu/dma-iommu.c | 16 +---
  2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index bdc14baf2ee5..6f07f3b21816 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -95,7 +95,6 @@ config IOMMU_DMA
select IOMMU_API
select IOMMU_IOVA
select NEED_SG_DMA_LENGTH
-   depends on DMA_DIRECT_REMAP
  
  config FSL_PAMU

bool "Freescale IOMMU support"
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 8fc6098c1eeb..278a9a960107 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -923,10 +923,11 @@ static void __iommu_dma_free(struct device *dev, size_t 
size, void *cpu_addr)
struct page *page = NULL;
  
  	/* Non-coherent atomic allocation? Easy */

-   if (dma_free_from_pool(cpu_addr, alloc_size))
+   if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
+   dma_free_from_pool(cpu_addr, alloc_size))
return;
  
-	if (is_vmalloc_addr(cpu_addr)) {

+   if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
/*
 * If it the address is remapped, then it's either non-coherent
 * or highmem CMA, or an iommu_dma_alloc_remap() construction.
@@ -972,7 +973,7 @@ static void *iommu_dma_alloc_pages(struct device *dev, 
size_t size,
if (!page)
return NULL;
  
-	if (!coherent || PageHighMem(page)) {

+   if (IS_ENABLED(CONFIG_DMA_REMAP) && (!coherent || PageHighMem(page))) {
pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
  
  		cpu_addr = dma_common_contiguous_remap(page, alloc_size,

@@ -1005,11 +1006,12 @@ static void *iommu_dma_alloc(struct device *dev, size_t 
size,
  
  	gfp |= __GFP_ZERO;
  
-	if (gfpflags_allow_blocking(gfp) &&

+   if (IS_ENABLED(CONFIG_DMA_REMAP) && gfpflags_allow_blocking(gfp) &&
!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
  
-	if (!gfpflags_allow_blocking(gfp) && !coherent)

+   if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
+   !gfpflags_allow_blocking(gfp) && !coherent)
cpu_addr = dma_alloc_from_pool(PAGE_ALIGN(size), &page, gfp);
else
cpu_addr = iommu_dma_alloc_pages(dev, size, &page, gfp, attrs);
@@ -1041,7 +1043,7 @@ static int iommu_dma_mmap(struct device *dev, struct 
vm_area_struct *vma,
if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
return -ENXIO;
  
-	if (is_vmalloc_addr(cpu_addr)) {

+   if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
struct page **pages = __iommu_dma_get_pages(cpu_addr);
  
  		if (pages)

@@ -1063,7 +1065,7 @@ static int iommu_dma_get_sgtable(struct device *dev, 
struct sg_table *sgt,
struct page *page;
int ret;
  
-	if (is_vmalloc_addr(cpu_addr)) {

+   if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
struct page **pages = __iommu_dma_get_pages(cpu_addr);
  
  		if (pages) {





[PATCH v2 5/7] irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg()

2019-04-29 Thread Julien Grall
ls_scfg_msi_compose_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg() must be called from a
preemptible context.

A recent patch split iommu_dma_map_msi_msg() into two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The Freescale SCFG MSI driver is reworked to avoid executing preemptible
code in non-preemptible context. This can be achieved by preparing the
MSI mapping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-ls-scfg-msi.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-ls-scfg-msi.c 
b/drivers/irqchip/irq-ls-scfg-msi.c
index c671b3212010..669d29105772 100644
--- a/drivers/irqchip/irq-ls-scfg-msi.c
+++ b/drivers/irqchip/irq-ls-scfg-msi.c
@@ -100,7 +100,7 @@ static void ls_scfg_msi_compose_msg(struct irq_data *data, 
struct msi_msg *msg)
msg->data |= cpumask_first(mask);
}
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 static int ls_scfg_msi_set_affinity(struct irq_data *irq_data,
@@ -141,6 +141,7 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain 
*domain,
unsigned int nr_irqs,
void *args)
 {
+   msi_alloc_info_t *info = args;
struct ls_scfg_msi *msi_data = domain->host_data;
int pos, err = 0;
 
@@ -157,6 +158,10 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain 
*domain,
if (err)
return err;
 
+   err = iommu_dma_prepare_msi(info->desc, msi_data->msiir_addr);
+   if (err)
+   return err;
+
irq_domain_set_info(domain, virq, pos,
_scfg_msi_parent_chip, msi_data,
handle_simple_irq, NULL, NULL);
-- 
2.11.0



[PATCH v2 4/7] irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg()

2019-04-29 Thread Julien Grall
its_irq_compose_msi_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg() must be called from a
preemptible context.

A recent change split iommu_dma_map_msi_msg() into two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The GICv3 ITS driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the MSI
mapping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v3-its.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 7577755bdcf4..12ddbcfe1b1e 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1179,7 +1179,7 @@ static void its_irq_compose_msi_msg(struct irq_data *d, 
struct msi_msg *msg)
msg->address_hi = upper_32_bits(addr);
msg->data   = its_get_event_id(d);
 
-   iommu_dma_map_msi_msg(d->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
 }
 
 static int its_irq_set_irqchip_state(struct irq_data *d,
@@ -2566,6 +2566,7 @@ static int its_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
 {
msi_alloc_info_t *info = args;
struct its_device *its_dev = info->scratchpad[0].ptr;
+   struct its_node *its = its_dev->its;
irq_hw_number_t hwirq;
int err;
int i;
@@ -2574,6 +2575,8 @@ static int its_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
if (err)
return err;
 
+   err = iommu_dma_prepare_msi(info->desc, its->get_msi_base(its_dev));
+
for (i = 0; i < nr_irqs; i++) {
err = its_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
-- 
2.11.0



[PATCH v2 2/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts

2019-04-29 Thread Julien Grall
On RT, iommu_dma_map_msi_msg() may be called from non-preemptible
context. This will lead to a splat with CONFIG_DEBUG_ATOMIC_SLEEP as
the function uses spin_lock (spinlocks can sleep on RT).

iommu_dma_map_msi_msg() is used to map the MSI page in the IOMMU PT
and update the MSI message with the IOVA.

Only the lookup of the MSI page needs to be done in
preemptible context. As the MSI page cannot change over the lifecycle
of the MSI interrupt, the lookup can be cached and re-used later on.

iommu_dma_map_msi_msg() is now split into two functions:
- iommu_dma_prepare_msi(): This function will prepare the mapping
in the IOMMU and store the cookie in the structure msi_desc. This
function should be called in preemptible context.
- iommu_dma_compose_msi_msg(): This function will update the MSI
message with the IOVA when the device is behind an IOMMU.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
- Use the MSI accessor to get/set the iommu cookie
- Don't use ternary on return
- Select CONFIG_IRQ_MSI_IOMMU
- Pass an msi_desc rather than the irq number
---
 drivers/iommu/Kconfig |  1 +
 drivers/iommu/dma-iommu.c | 47 ++-
 include/linux/dma-iommu.h | 23 +++
 3 files changed, 62 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6f07f3b21816..eb1c8cd243f9 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -94,6 +94,7 @@ config IOMMU_DMA
bool
select IOMMU_API
select IOMMU_IOVA
+   select IRQ_MSI_IOMMU
select NEED_SG_DMA_LENGTH
 
 config FSL_PAMU
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77aabe637a60..2309f59cefa4 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -888,17 +888,18 @@ static struct iommu_dma_msi_page 
*iommu_dma_get_msi_page(struct device *dev,
return NULL;
 }
 
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
 {
-   struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
+   struct device *dev = msi_desc_to_dev(desc);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_dma_cookie *cookie;
struct iommu_dma_msi_page *msi_page;
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
unsigned long flags;
 
-   if (!domain || !domain->iova_cookie)
-   return;
+   if (!domain || !domain->iova_cookie) {
+   desc->iommu_cookie = NULL;
+   return 0;
+   }
 
cookie = domain->iova_cookie;
 
@@ -911,7 +912,37 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
spin_unlock_irqrestore(&cookie->msi_lock, flags);
 
-   if (WARN_ON(!msi_page)) {
+   msi_desc_set_iommu_cookie(desc, msi_page);
+
+   if (!msi_page)
+   return -ENOMEM;
+   else
+   return 0;
+}
+
+void iommu_dma_compose_msi_msg(struct msi_desc *desc,
+  struct msi_msg *msg)
+{
+   struct device *dev = msi_desc_to_dev(desc);
+   const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+   const struct iommu_dma_msi_page *msi_page;
+
+   msi_page = msi_desc_get_iommu_cookie(desc);
+
+   if (!domain || !domain->iova_cookie || WARN_ON(!msi_page))
+   return;
+
+   msg->address_hi = upper_32_bits(msi_page->iova);
+   msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
+   msg->address_lo += lower_32_bits(msi_page->iova);
+}
+
+void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+{
+   struct msi_desc *desc = irq_get_msi_desc(irq);
+   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
+
+   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
/*
 * We're called from a void callback, so the best we can do is
 * 'fail' by filling the message with obviously bogus values.
@@ -922,8 +953,6 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msg->address_lo = ~0U;
msg->data = ~0U;
} else {
-   msg->address_hi = upper_32_bits(msi_page->iova);
-   msg->address_lo &= cookie_msi_granule(cookie) - 1;
-   msg->address_lo += lower_32_bits(msi_page->iova);
+   iommu_dma_compose_msi_msg(desc, msg);
}
 }
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index e760dc5d1fa8..3fc48fbd6f63 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -71,12 +71,24 @@ void iommu_dma_unmap_resource(struct device *dev, 
dma_addr_t handle,
size_t size, enum dma_data_direction dir, 

[PATCH v2 0/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

2019-04-29 Thread Julien Grall
Hi all,

On RT, the function iommu_dma_map_msi_msg() expects to be called from
preemptible context. However, this is not always the case, resulting in a
splat with CONFIG_DEBUG_ATOMIC_SLEEP:

[   48.875777] BUG: sleeping function called from invalid context at 
kernel/locking/rtmutex.c:974
[   48.875779] in_atomic(): 1, irqs_disabled(): 128, pid: 2103, name: ip
[   48.875782] INFO: lockdep is turned off.
[   48.875784] irq event stamp: 10684
[   48.875786] hardirqs last  enabled at (10683): [] 
_raw_spin_unlock_irqrestore+0x88/0x90
[   48.875791] hardirqs last disabled at (10684): [] 
_raw_spin_lock_irqsave+0x24/0x68
[   48.875796] softirqs last  enabled at (0): [] 
copy_process.isra.1.part.2+0x8d8/0x1970
[   48.875801] softirqs last disabled at (0): [<>]   
(null)
[   48.875805] Preemption disabled at:
[   48.875805] [] __setup_irq+0xd8/0x6c0
[   48.875811] CPU: 2 PID: 2103 Comm: ip Not tainted 
5.0.3-rt1-7-g42ede9a0fed6 #45
[   48.875815] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno 
Development Platform, BIOS EDK II Jan 23 2017
[   48.875817] Call trace:
[   48.875818]  dump_backtrace+0x0/0x140
[   48.875821]  show_stack+0x14/0x20
[   48.875823]  dump_stack+0xa0/0xd4
[   48.875827]  ___might_sleep+0x16c/0x1f8
[   48.875831]  rt_spin_lock+0x5c/0x70
[   48.875835]  iommu_dma_map_msi_msg+0x5c/0x1d8
[   48.875839]  gicv2m_compose_msi_msg+0x3c/0x48
[   48.875843]  irq_chip_compose_msi_msg+0x40/0x58
[   48.875846]  msi_domain_activate+0x38/0x98
[   48.875849]  __irq_domain_activate_irq+0x58/0xa0
[   48.875852]  irq_domain_activate_irq+0x34/0x58
[   48.875855]  irq_activate+0x28/0x30
[   48.875858]  __setup_irq+0x2b0/0x6c0
[   48.875861]  request_threaded_irq+0xdc/0x188
[   48.875865]  sky2_setup_irq+0x44/0xf8
[   48.875868]  sky2_open+0x1a4/0x240
[   48.875871]  __dev_open+0xd8/0x188
[   48.875874]  __dev_change_flags+0x164/0x1f0
[   48.875877]  dev_change_flags+0x20/0x60
[   48.875879]  do_setlink+0x2a0/0xd30
[   48.875882]  __rtnl_newlink+0x5b4/0x6d8
[   48.875885]  rtnl_newlink+0x50/0x78
[   48.875888]  rtnetlink_rcv_msg+0x178/0x640
[   48.875891]  netlink_rcv_skb+0x58/0x118
[   48.875893]  rtnetlink_rcv+0x14/0x20
[   48.875896]  netlink_unicast+0x188/0x200
[   48.875898]  netlink_sendmsg+0x248/0x3d8
[   48.875900]  sock_sendmsg+0x18/0x40
[   48.875904]  ___sys_sendmsg+0x294/0x2d0
[   48.875908]  __sys_sendmsg+0x68/0xb8
[   48.875911]  __arm64_sys_sendmsg+0x20/0x28
[   48.875914]  el0_svc_common+0x90/0x118
[   48.875918]  el0_svc_handler+0x2c/0x80
[   48.875922]  el0_svc+0x8/0xc

This series is a first attempt at reworking how MSIs are mapped and composed
when an IOMMU is present.

I was able to test the changes in GICv2m and GICv3 ITS. I don't have
hardware for the other interrupt controllers.
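
For irqchip code picking this up, the intended pattern is sketched below
(hypothetical driver "foo"; doorbell_phys stands in for the real doorbell
address and error handling is trimmed -- see the individual patches for
the real conversions):

static int foo_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
				unsigned int nr_irqs, void *args)
{
	msi_alloc_info_t *info = args;
	int err;

	/* Preemptible context: map the doorbell and cache the cookie. */
	err = iommu_dma_prepare_msi(info->desc, doorbell_phys);
	if (err)
		return err;

	/* ... allocate the hwirqs as before ... */
	return 0;
}

static void foo_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
{
	msg->address_hi = upper_32_bits(doorbell_phys);
	msg->address_lo = lower_32_bits(doorbell_phys);
	msg->data = data->hwirq;

	/* Possibly atomic context: only swizzle the address, no mapping. */
	iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
}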

Cheers,

Julien Grall (7):
  genirq/msi: Add a new field in msi_desc to store an IOMMU cookie
  iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts
  irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg()
  irqchip/gic-v3-its: Don't map the MSI page in
its_irq_compose_msi_msg()
  irqchip/ls-scfg-msi: Don't map the MSI page in
ls_scfg_msi_compose_msg()
  irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b,
s}i_msg()
  iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

 drivers/iommu/Kconfig |  1 +
 drivers/iommu/dma-iommu.c | 49 +++
 drivers/irqchip/irq-gic-v2m.c |  8 ++-
 drivers/irqchip/irq-gic-v3-its.c  |  5 +++-
 drivers/irqchip/irq-gic-v3-mbi.c  | 15 ++--
 drivers/irqchip/irq-ls-scfg-msi.c |  7 +-
 include/linux/dma-iommu.h | 22 --
 include/linux/msi.h   | 26 +
 kernel/irq/Kconfig|  3 +++
 9 files changed, 109 insertions(+), 27 deletions(-)

-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 6/7] irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg()

2019-04-29 Thread Julien Grall
The functions mbi_compose_m{b,s}i_msg() may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg() must be called
from a preemptible context.

A recent patch split iommu_dma_map_msi_msg() into two new functions:
one that must be called from preemptible context, the other has
no such requirement.

The GICv3 MSI driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the two MSI
mappings when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v3-mbi.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-mbi.c b/drivers/irqchip/irq-gic-v3-mbi.c
index fbfa7ff6deb1..d50f6cdf043c 100644
--- a/drivers/irqchip/irq-gic-v3-mbi.c
+++ b/drivers/irqchip/irq-gic-v3-mbi.c
@@ -84,6 +84,7 @@ static void mbi_free_msi(struct mbi_range *mbi, unsigned int 
hwirq,
 static int mbi_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
   unsigned int nr_irqs, void *args)
 {
+   msi_alloc_info_t *info = args;
struct mbi_range *mbi = NULL;
int hwirq, offset, i, err = 0;
 
@@ -104,6 +105,16 @@ static int mbi_irq_domain_alloc(struct irq_domain *domain, 
unsigned int virq,
 
hwirq = mbi->spi_start + offset;
 
+   err = iommu_dma_prepare_msi(info->desc,
+   mbi_phys_base + GICD_CLRSPI_NSR);
+   if (err)
+   return err;
+
+   err = iommu_dma_prepare_msi(info->desc,
+   mbi_phys_base + GICD_SETSPI_NSR);
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = mbi_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
@@ -142,7 +153,7 @@ static void mbi_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
msg[0].address_lo = lower_32_bits(mbi_phys_base + GICD_SETSPI_NSR);
msg[0].data = data->parent_data->hwirq;
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 #ifdef CONFIG_PCI_MSI
@@ -202,7 +213,7 @@ static void mbi_compose_mbi_msg(struct irq_data *data, 
struct msi_msg *msg)
msg[1].address_lo = lower_32_bits(mbi_phys_base + GICD_CLRSPI_NSR);
msg[1].data = data->parent_data->hwirq;
 
-   iommu_dma_map_msi_msg(data->irq, &msg[1]);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), &msg[1]);
 }
 
 /* Platform-MSI specific irqchip */
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 3/7] irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg()

2019-04-29 Thread Julien Grall
gicv2m_compose_msi_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg() must be called from a
preemptible context.

A recent change split iommu_dma_map_msi_msg() into two new functions:
one that must be called from preemptible context, the other has
no such requirement.

The GICv2m driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the MSI
mapping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v2m.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index f5fe0100f9ff..4359f0583377 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -110,7 +110,7 @@ static void gicv2m_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
msg->data -= v2m->spi_offset;
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 static struct irq_chip gicv2m_irq_chip = {
@@ -167,6 +167,7 @@ static void gicv2m_unalloc_msi(struct v2m_data *v2m, 
unsigned int hwirq,
 static int gicv2m_irq_domain_alloc(struct irq_domain *domain, unsigned int 
virq,
   unsigned int nr_irqs, void *args)
 {
+   msi_alloc_info_t *info = args;
struct v2m_data *v2m = NULL, *tmp;
int hwirq, offset, i, err = 0;
 
@@ -186,6 +187,11 @@ static int gicv2m_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
 
hwirq = v2m->spi_start + offset;
 
+   err = iommu_dma_prepare_msi(info->desc,
+   v2m->res.start + V2M_MSI_SETSPI_NS);
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = gicv2m_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 1/7] genirq/msi: Add a new field in msi_desc to store an IOMMU cookie

2019-04-29 Thread Julien Grall
When an MSI doorbell is located downstream of an IOMMU, it is required
to swizzle the physical address with an appropriately-mapped IOVA for any
device attached to one of our DMA ops domains.

At the moment, the allocation of the mapping may be done when composing
the message. However, the composing may be done in non-preemptible
context while the allocation must happen in preemptible context.

A follow-up change will split the current logic in two functions
requiring to keep an IOMMU cookie per MSI.

A new field is introduced in msi_desc to store an IOMMU cookie. As the
cookie may not be required in some configurations, the field is protected
under a new config option, CONFIG_IRQ_MSI_IOMMU.

A pair of helpers has also been introduced to access the field.
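
The intended usage (sketch only; the real callers arrive with the
dma-iommu patch in this series) is to stash the cookie on the
preemptible prepare path and read it back at compose time:

	/* prepare path (preemptible) */
	msi_desc_set_iommu_cookie(desc, msi_page);

	/* compose path (possibly atomic) */
	const struct iommu_dma_msi_page *msi_page =
			msi_desc_get_iommu_cookie(desc);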

Signed-off-by: Julien Grall 

---
Changes in v2:
- Update the commit message to use imperative mood
- Protect the field with a new config that will be selected by
IOMMU_DMA later on
- Add a set of helpers to access the new field
---
 include/linux/msi.h | 26 ++
 kernel/irq/Kconfig  |  3 +++
 2 files changed, 29 insertions(+)

diff --git a/include/linux/msi.h b/include/linux/msi.h
index 7e9b81c3b50d..82a308c19222 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -77,6 +77,9 @@ struct msi_desc {
struct device   *dev;
struct msi_msg  msg;
struct irq_affinity_desc*affinity;
+#ifdef CONFIG_IRQ_MSI_IOMMU
+   const void  *iommu_cookie;
+#endif
 
union {
/* PCI MSI/X specific data */
@@ -119,6 +122,29 @@ struct msi_desc {
 #define for_each_msi_entry_safe(desc, tmp, dev)\
list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
 
+#ifdef CONFIG_IRQ_MSI_IOMMU
+static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
+{
+   return desc->iommu_cookie;
+}
+
+static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc,
+const void *iommu_cookie)
+{
+   desc->iommu_cookie = iommu_cookie;
+}
+#else
+static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
+{
+   return NULL;
+}
+
+static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc,
+const void *iommu_cookie)
+{
+}
+#endif
+
 #ifdef CONFIG_PCI_MSI
 #define first_pci_msi_entry(pdev)  first_msi_entry(&(pdev)->dev)
 #define for_each_pci_msi_entry(desc, pdev) \
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 5f3e2baefca9..8fee06625c37 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -91,6 +91,9 @@ config GENERIC_MSI_IRQ_DOMAIN
select IRQ_DOMAIN_HIERARCHY
select GENERIC_MSI_IRQ
 
+config IRQ_MSI_IOMMU
+   bool
+
 config HANDLE_DOMAIN_IRQ
bool
 
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline

2019-04-29 Thread Joerg Roedel
On Fri, Apr 26, 2019 at 11:55:12AM -0400, Qian Cai wrote:
> https://git.sr.ht/~cai/linux-debug/blob/master/dmesg

Thanks, I can't see any definitions for unity ranges or exclusion ranges
in the IVRS table dump, which makes it even more weird.

Can you please send me the output of

for f in `ls -1 /sys/kernel/iommu_groups/*/reserved_regions`; do echo "---$f"; cat $f; done

to double-check?

Thanks,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 19/26] iommu/dma: Cleanup variable naming in iommu_dma_alloc

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

From: Robin Murphy 

Most importantly clear up the size / iosize confusion.  Also rename addr
to cpu_addr to match the surrounding code and make the intention a little
more clear.

Signed-off-by: Robin Murphy 
[hch: split from a larger patch]


I can't bring myself to actually ack "my" patch, but I am perfectly 
happy with the split :)


Robin.


Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 45 +++
  1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 95a12e975994..9b269f0792f3 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -960,64 +960,63 @@ static void *iommu_dma_alloc(struct device *dev, size_t 
size,
  {
bool coherent = dev_is_dma_coherent(dev);
int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
-   size_t iosize = size;
+   size_t alloc_size = PAGE_ALIGN(size);
struct page *page = NULL;
-   void *addr;
+   void *cpu_addr;
  
-	size = PAGE_ALIGN(size);

gfp |= __GFP_ZERO;
  
  	if (gfpflags_allow_blocking(gfp) &&

!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
-   return iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
+   return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
  
  	if (!gfpflags_allow_blocking(gfp) && !coherent) {

-   addr = dma_alloc_from_pool(size, &page, gfp);
-   if (!addr)
+   cpu_addr = dma_alloc_from_pool(alloc_size, &page, gfp);
+   if (!cpu_addr)
return NULL;
  
-		*handle = __iommu_dma_map(dev, page_to_phys(page), iosize,

+   *handle = __iommu_dma_map(dev, page_to_phys(page), size,
  ioprot);
if (*handle == DMA_MAPPING_ERROR) {
-   dma_free_from_pool(addr, size);
+   dma_free_from_pool(cpu_addr, alloc_size);
return NULL;
}
-   return addr;
+   return cpu_addr;
}
  
  	if (gfpflags_allow_blocking(gfp))

-   page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-get_order(size),
+   page = dma_alloc_from_contiguous(dev, alloc_size >> PAGE_SHIFT,
+get_order(alloc_size),
 gfp & __GFP_NOWARN);
if (!page)
-   page = alloc_pages(gfp, get_order(size));
+   page = alloc_pages(gfp, get_order(alloc_size));
if (!page)
return NULL;
  
-	*handle = __iommu_dma_map(dev, page_to_phys(page), iosize, ioprot);

+   *handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot);
if (*handle == DMA_MAPPING_ERROR)
goto out_free_pages;
  
  	if (!coherent || PageHighMem(page)) {

pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
  
-		addr = dma_common_contiguous_remap(page, size, VM_USERMAP, prot,

-   __builtin_return_address(0));
-   if (!addr)
+   cpu_addr = dma_common_contiguous_remap(page, alloc_size,
+   VM_USERMAP, prot, __builtin_return_address(0));
+   if (!cpu_addr)
goto out_unmap;
  
  		if (!coherent)

-   arch_dma_prep_coherent(page, iosize);
+   arch_dma_prep_coherent(page, size);
} else {
-   addr = page_address(page);
+   cpu_addr = page_address(page);
}
-   memset(addr, 0, size);
-   return addr;
+   memset(cpu_addr, 0, alloc_size);
+   return cpu_addr;
  out_unmap:
-   __iommu_dma_unmap(dev, *handle, iosize);
+   __iommu_dma_unmap(dev, *handle, size);
  out_free_pages:
-   if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
-   __free_pages(page, get_order(size));
+   if (!dma_release_from_contiguous(dev, page, alloc_size >> PAGE_SHIFT))
+   __free_pages(page, get_order(alloc_size));
return NULL;
  }
  


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 21/26] iommu/dma: Refactor iommu_dma_get_sgtable

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

Inline __iommu_dma_get_sgtable_page into the main function, and use the
fact that __iommu_dma_get_pages returns NULL for remapped contiguous
allocations to simplify the code flow a bit.


Yeah, even I was a bit dubious about the readability of "if (page)... 
else if (pages)..." that my attempt ended up with, so I don't really 
have anything to complain about here.


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 45 +++
  1 file changed, 17 insertions(+), 28 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index acdfe866cb29..138b85e675c8 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1070,42 +1070,31 @@ static int iommu_dma_mmap(struct device *dev, struct 
vm_area_struct *vma,
return __iommu_dma_mmap(pages, size, vma);
  }
  
-static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,

-   size_t size)
-{
-   int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
-
-   if (!ret)
-   sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
-   return ret;
-}
-
  static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs)
  {
-   unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   struct page **pages;
+   struct page *page;
+   int ret;
  
-	if (!is_vmalloc_addr(cpu_addr)) {

-   struct page *page = virt_to_page(cpu_addr);
-   return __iommu_dma_get_sgtable_page(sgt, page, size);
-   }
+   if (is_vmalloc_addr(cpu_addr)) {
+   struct page **pages = __iommu_dma_get_pages(cpu_addr);
  
-	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {

-   /*
-* DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-* hence in the vmalloc space.
-*/
-   struct page *page = vmalloc_to_page(cpu_addr);
-   return __iommu_dma_get_sgtable_page(sgt, page, size);
+   if (pages) {
+   return sg_alloc_table_from_pages(sgt, pages,
+   PAGE_ALIGN(size) >> PAGE_SHIFT,
+   0, size, GFP_KERNEL);
+   }
+
+   page = vmalloc_to_page(cpu_addr);
+   } else {
+   page = virt_to_page(cpu_addr);
}
  
-	pages = __iommu_dma_get_pages(cpu_addr);

-   if (WARN_ON_ONCE(!pages))
-   return -ENXIO;
-   return sg_alloc_table_from_pages(sgt, pages, count, 0, size,
-GFP_KERNEL);
+   ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+   if (!ret)
+   sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+   return ret;
  }
  
  static const struct dma_map_ops iommu_dma_ops = {



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 22/26] iommu/dma: Refactor iommu_dma_mmap

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

Inline __iommu_dma_mmap_pfn into the main function, and use the
fact that __iommu_dma_get_pages returns NULL for remapped contiguous
allocations to simplify the code flow a bit.


...and later we can squash __iommu_dma_mmap() once the dust settles on 
vm_map_pages() - seems good to me.


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 36 +++-
  1 file changed, 11 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 138b85e675c8..8fc6098c1eeb 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1025,21 +1025,12 @@ static void *iommu_dma_alloc(struct device *dev, size_t 
size,
return cpu_addr;
  }
  
-static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,

- unsigned long pfn, size_t size)
-{
-   return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
-  vma->vm_end - vma->vm_start,
-  vma->vm_page_prot);
-}
-
  static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs)
  {
unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   unsigned long off = vma->vm_pgoff;
-   struct page **pages;
+   unsigned long pfn, off = vma->vm_pgoff;
int ret;
  
  	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);

@@ -1050,24 +1041,19 @@ static int iommu_dma_mmap(struct device *dev, struct 
vm_area_struct *vma,
if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
return -ENXIO;
  
-	if (!is_vmalloc_addr(cpu_addr)) {

-   unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
-   return __iommu_dma_mmap_pfn(vma, pfn, size);
-   }
+   if (is_vmalloc_addr(cpu_addr)) {
+   struct page **pages = __iommu_dma_get_pages(cpu_addr);
  
-	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {

-   /*
-* DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-* hence in the vmalloc space.
-*/
-   unsigned long pfn = vmalloc_to_pfn(cpu_addr);
-   return __iommu_dma_mmap_pfn(vma, pfn, size);
+   if (pages)
+   return __iommu_dma_mmap(pages, size, vma);
+   pfn = vmalloc_to_pfn(cpu_addr);
+   } else {
+   pfn = page_to_pfn(virt_to_page(cpu_addr));
}
  
-	pages = __iommu_dma_get_pages(cpu_addr);

-   if (WARN_ON_ONCE(!pages))
-   return -ENXIO;
-   return __iommu_dma_mmap(pages, size, vma);
+   return remap_pfn_range(vma, vma->vm_start, pfn + off,
+  vma->vm_end - vma->vm_start,
+  vma->vm_page_prot);
  }
  
  static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 14/26] iommu/dma: Refactor iommu_dma_free

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

From: Robin Murphy 

The freeing logic was made particularly horrible by part of it being
opaque to the arch wrapper, which led to a lot of convoluted repetition
to ensure each path did everything in the right order. Now that it's
all private, we can pick apart and consolidate the logically-distinct
steps of freeing the IOMMU mapping, the underlying pages, and the CPU
remap (if necessary) into something much more manageable.

Signed-off-by: Robin Murphy 
[various cosmetic changes to the code flow]


Hmm, I do still prefer my original flow with the dma_common_free_remap() 
call right out of the way at the end rather than being a special case in 
the middle of all the page-freeing (which is the kind of existing 
complexity I was trying to eliminate). I guess you've done this to avoid 
having "if (!dma_release_from_contiguous(...))..." twice like I ended up 
with, which is fair enough I suppose - once we manage to solve the new 
dma_{alloc,free}_contiguous() interface that may tip the balance so I 
can always revisit this then.



Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 75 ++-
  1 file changed, 35 insertions(+), 40 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4632b9d301a1..9658c4cc3cfe 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -916,6 +916,41 @@ static void iommu_dma_unmap_resource(struct device *dev, 
dma_addr_t handle,
__iommu_dma_unmap(dev, handle, size);
  }
  
+static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,

+   dma_addr_t handle, unsigned long attrs)
+{
+   size_t alloc_size = PAGE_ALIGN(size);
+   int count = alloc_size >> PAGE_SHIFT;
+   struct page *page = NULL;
+
+   __iommu_dma_unmap(dev, handle, size);
+
+   /* Non-coherent atomic allocation? Easy */
+   if (dma_free_from_pool(cpu_addr, alloc_size))
+   return;
+
+   if (is_vmalloc_addr(cpu_addr)) {
+   /*
+* If it the address is remapped, then it's either non-coherent


s/If it/If/

My "easy; more involved; most complex" narrative definitely doesn't 
translate very well to a different order :)


Otherwise,

Reluctantly-Acked-by: Robin Murphy 


+* or highmem CMA, or an iommu_dma_alloc_remap() construction.
+*/
+   struct page **pages = __iommu_dma_get_pages(cpu_addr);
+
+   if (pages)
+   __iommu_dma_free_pages(pages, count);
+   else
+   page = vmalloc_to_page(cpu_addr);
+
+   dma_common_free_remap(cpu_addr, alloc_size, VM_USERMAP);
+   } else {
+   /* Lowmem means a coherent atomic or CMA allocation */
+   page = virt_to_page(cpu_addr);
+   }
+
+   if (page && !dma_release_from_contiguous(dev, page, count))
+   __free_pages(page, get_order(alloc_size));
+}
+
  static void *iommu_dma_alloc(struct device *dev, size_t size,
dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
  {
@@ -985,46 +1020,6 @@ static void *iommu_dma_alloc(struct device *dev, size_t 
size,
return addr;
  }
  
-static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,

-   dma_addr_t handle, unsigned long attrs)
-{
-   size_t iosize = size;
-
-   size = PAGE_ALIGN(size);
-   /*
-* @cpu_addr will be one of 4 things depending on how it was allocated:
-* - A remapped array of pages for contiguous allocations.
-* - A remapped array of pages from iommu_dma_alloc_remap(), for all
-*   non-atomic allocations.
-* - A non-cacheable alias from the atomic pool, for atomic
-*   allocations by non-coherent devices.
-* - A normal lowmem address, for atomic allocations by
-*   coherent devices.
-* Hence how dodgy the below logic looks...
-*/
-   if (dma_in_atomic_pool(cpu_addr, size)) {
-   __iommu_dma_unmap(dev, handle, iosize);
-   dma_free_from_pool(cpu_addr, size);
-   } else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-   struct page *page = vmalloc_to_page(cpu_addr);
-
-   __iommu_dma_unmap(dev, handle, iosize);
-   dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
-   dma_common_free_remap(cpu_addr, size, VM_USERMAP);
-   } else if (is_vmalloc_addr(cpu_addr)){
-   struct page **pages = __iommu_dma_get_pages(cpu_addr);
-
-   if (WARN_ON(!pages))
-   return;
-   __iommu_dma_unmap(dev, handle, iosize);
-   __iommu_dma_free_pages(pages, size >> PAGE_SHIFT);
-   dma_common_free_remap(cpu_addr, size, VM_USERMAP);
-   } else {
-   __iommu_dma_unmap(dev, handle, iosize);
-   

Re: [PATCH 13/26] iommu/dma: Remove __iommu_dma_free

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

We only have a single caller of this function left, so open code it there.


Heh, I even caught myself out for a moment thinking this looked 
redundant with #18 now, but no :)


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 21 ++---
  1 file changed, 2 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index b8e46e89a60a..4632b9d301a1 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -534,24 +534,6 @@ static struct page **__iommu_dma_get_pages(void *cpu_addr)
return area->pages;
  }
  
-/**

- * iommu_dma_free - Free a buffer allocated by iommu_dma_alloc_remap()
- * @dev: Device which owns this buffer
- * @pages: Array of buffer pages as returned by __iommu_dma_alloc_remap()
- * @size: Size of buffer in bytes
- * @handle: DMA address of buffer
- *
- * Frees both the pages associated with the buffer, and the array
- * describing them
- */
-static void __iommu_dma_free(struct device *dev, struct page **pages,
-   size_t size, dma_addr_t *handle)
-{
-   __iommu_dma_unmap(dev, *handle, size);
-   __iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
-   *handle = DMA_MAPPING_ERROR;
-}
-
  /**
   * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space
   * @dev: Device to allocate memory for. Must be a real device
@@ -1034,7 +1016,8 @@ static void iommu_dma_free(struct device *dev, size_t 
size, void *cpu_addr,
  
  		if (WARN_ON(!pages))

return;
-   __iommu_dma_free(dev, pages, iosize, &handle);
+   __iommu_dma_unmap(dev, handle, iosize);
+   __iommu_dma_free_pages(pages, size >> PAGE_SHIFT);
dma_common_free_remap(cpu_addr, size, VM_USERMAP);
} else {
__iommu_dma_unmap(dev, handle, iosize);


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

2019-04-29 Thread Julien Grall

Hi Marc,

On 23/04/2019 11:54, Marc Zyngier wrote:

On 18/04/2019 18:26, Julien Grall wrote:

On RT, the function iommu_dma_map_msi_msg may be called from
non-preemptible context. This will lead to a splat with
CONFIG_DEBUG_ATOMIC_SLEEP, as the function takes a spin_lock
(which can sleep on RT).

The function iommu_dma_map_msi_msg is used to map the MSI page in the
IOMMU PT and update the MSI message with the IOVA.

Only the lookup of the MSI page needs to happen in preemptible
context. As the MSI page cannot change over the lifecycle of the MSI
interrupt, the lookup can be cached and re-used later on.

This patch splits the function iommu_dma_map_msi_msg into two new
functions:
 - iommu_dma_prepare_msi: This function will prepare the mapping in
 the IOMMU and store the cookie in the structure msi_desc. This
 function should be called in preemptible context.
 - iommu_dma_compose_msi_msg: This function will update the MSI
 message with the IOVA when the device is behind an IOMMU.

Signed-off-by: Julien Grall 
---
  drivers/iommu/dma-iommu.c | 43 ---
  include/linux/dma-iommu.h | 21 +
  2 files changed, 53 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77aabe637a60..f5c1f1685095 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -888,17 +888,17 @@ static struct iommu_dma_msi_page 
*iommu_dma_get_msi_page(struct device *dev,
return NULL;
  }
  
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)

+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)


I quite like the idea of moving from having an irq to having an msi_desc
passed to the IOMMU layer...


  {
-   struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
+   struct device *dev = msi_desc_to_dev(desc);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_dma_cookie *cookie;
-   struct iommu_dma_msi_page *msi_page;
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
unsigned long flags;
  
-	if (!domain || !domain->iova_cookie)

-   return;
+   if (!domain || !domain->iova_cookie) {
+   desc->iommu_cookie = NULL;
+   return 0;
+   }
  
  	cookie = domain->iova_cookie;
  
@@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)

 * of an MSI from within an IPI handler.
 */
	spin_lock_irqsave(&cookie->msi_lock, flags);
-   msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
+   desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
	spin_unlock_irqrestore(&cookie->msi_lock, flags);
  
-	if (WARN_ON(!msi_page)) {

+   return (desc->iommu_cookie) ? 0 : -ENOMEM;
+}
+
+void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg)


... but I'd like it even better if it was uniform. Can you please move
the irq_get_msi_desc() to the callers of iommu_dma_compose_msi_msg(),
and make both functions take a msi_desc?


Make sense. I will modify iommu_dma_compose_msi_msg to take a msi_desc in 
parameter.

Cheers,

--
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 12/26] iommu/dma: Refactor the page array remapping allocator

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

Move the call to dma_common_pages_remap into __iommu_dma_alloc and
rename it to iommu_dma_alloc_remap.  This creates a self-contained
helper for remapped pages allocation and mapping.


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 54 +++
  1 file changed, 26 insertions(+), 28 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 8e2d9733113e..b8e46e89a60a 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -535,9 +535,9 @@ static struct page **__iommu_dma_get_pages(void *cpu_addr)
  }
  
  /**

- * iommu_dma_free - Free a buffer allocated by __iommu_dma_alloc()
+ * iommu_dma_free - Free a buffer allocated by iommu_dma_alloc_remap()
   * @dev: Device which owns this buffer
- * @pages: Array of buffer pages as returned by __iommu_dma_alloc()
+ * @pages: Array of buffer pages as returned by __iommu_dma_alloc_remap()
   * @size: Size of buffer in bytes
   * @handle: DMA address of buffer
   *
@@ -553,33 +553,35 @@ static void __iommu_dma_free(struct device *dev, struct 
page **pages,
  }
  
  /**

- * __iommu_dma_alloc - Allocate and map a buffer contiguous in IOVA space
+ * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space
   * @dev: Device to allocate memory for. Must be a real device
   * attached to an iommu_dma_domain
   * @size: Size of buffer in bytes
+ * @dma_handle: Out argument for allocated DMA handle
   * @gfp: Allocation flags
   * @attrs: DMA attributes for this allocation
- * @prot: IOMMU mapping flags
- * @handle: Out argument for allocated DMA handle
   *
   * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
   * but an IOMMU which supports smaller pages might not map the whole thing.
   *
- * Return: Array of struct page pointers describing the buffer,
- *or NULL on failure.
+ * Return: Mapped virtual address, or NULL on failure.
   */
-static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
-   gfp_t gfp, unsigned long attrs, int prot, dma_addr_t *handle)
+static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
+   dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
  {
struct iommu_domain *domain = iommu_get_dma_domain(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = >iovad;
+   bool coherent = dev_is_dma_coherent(dev);
+   int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+   pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
+   unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
struct page **pages;
struct sg_table sgt;
dma_addr_t iova;
-   unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
+   void *vaddr;
  
-	*handle = DMA_MAPPING_ERROR;

+   *dma_handle = DMA_MAPPING_ERROR;
  
  	min_size = alloc_sizes & -alloc_sizes;

if (min_size < PAGE_SIZE) {
@@ -605,7 +607,7 @@ static struct page **__iommu_dma_alloc(struct device *dev, 
size_t size,
	if (sg_alloc_table_from_pages(&sgt, pages, count, 0, size, GFP_KERNEL))
goto out_free_iova;
  
-	if (!(prot & IOMMU_CACHE)) {

+   if (!(ioprot & IOMMU_CACHE)) {
struct scatterlist *sg;
int i;
  
@@ -613,14 +615,21 @@ static struct page **__iommu_dma_alloc(struct device *dev, size_t size,

arch_dma_prep_coherent(sg_page(sg), sg->length);
}
  
-	if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, prot)

+   if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, ioprot)
< size)
goto out_free_sg;
  
-	*handle = iova;

+   vaddr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
+   __builtin_return_address(0));
+   if (!vaddr)
+   goto out_unmap;
+
+   *dma_handle = iova;
	sg_free_table(&sgt);
-   return pages;
+   return vaddr;
  
+out_unmap:

+   __iommu_dma_unmap(dev, iova, size);
  out_free_sg:
	sg_free_table(&sgt);
  out_free_iova:
@@ -989,18 +998,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t 
size,
size >> PAGE_SHIFT);
}
} else {
-   pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
-   struct page **pages;
-
-   pages = __iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
-   handle);
-   if (!pages)
-   return NULL;
-
-   addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
- __builtin_return_address(0));
-   if (!addr)
-   __iommu_dma_free(dev, pages, iosize, 

Re: [PATCH 11/26] iommu/dma: Factor out remapped pages lookup

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

From: Robin Murphy 

Since we duplicate the find_vm_area() logic a few times in places where
we only care about the pages, factor out a helper to abstract it.

Signed-off-by: Robin Murphy 
[hch: don't warn when not finding a region, as we'll rely on that later]


Yeah, I did think about that and the things which it might make a little 
easier, but preserved it as-is for the sake of keeping my modifications 
quick and simple. TBH I'm now feeling more inclined to drop the WARNs 
entirely at this point, since it's not like there's ever been any 
general guarantee that freeing the wrong thing shouldn't just crash, but 
that's something we can easily come back to later if need be.


Robin.


Signed-off-by: Christoph Hellwig 
---
  drivers/iommu/dma-iommu.c | 32 
  1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index b52c5d6be7b4..8e2d9733113e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -525,6 +525,15 @@ static struct page **__iommu_dma_alloc_pages(struct device 
*dev,
return pages;
  }
  
+static struct page **__iommu_dma_get_pages(void *cpu_addr)

+{
+   struct vm_struct *area = find_vm_area(cpu_addr);
+
+   if (!area || !area->pages)
+   return NULL;
+   return area->pages;
+}
+
  /**
   * iommu_dma_free - Free a buffer allocated by __iommu_dma_alloc()
   * @dev: Device which owns this buffer
@@ -1023,11 +1032,11 @@ static void iommu_dma_free(struct device *dev, size_t 
size, void *cpu_addr,
dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
dma_common_free_remap(cpu_addr, size, VM_USERMAP);
} else if (is_vmalloc_addr(cpu_addr)){
-   struct vm_struct *area = find_vm_area(cpu_addr);
+   struct page **pages = __iommu_dma_get_pages(cpu_addr);
  
-		if (WARN_ON(!area || !area->pages))

+   if (WARN_ON(!pages))
return;
-   __iommu_dma_free(dev, area->pages, iosize, );
+   __iommu_dma_free(dev, pages, iosize, );
dma_common_free_remap(cpu_addr, size, VM_USERMAP);
} else {
__iommu_dma_unmap(dev, handle, iosize);
@@ -1049,7 +1058,7 @@ static int iommu_dma_mmap(struct device *dev, struct 
vm_area_struct *vma,
  {
unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
unsigned long off = vma->vm_pgoff;
-   struct vm_struct *area;
+   struct page **pages;
int ret;
  
  	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);

@@ -1074,11 +1083,10 @@ static int iommu_dma_mmap(struct device *dev, struct 
vm_area_struct *vma,
return __iommu_dma_mmap_pfn(vma, pfn, size);
}
  
-	area = find_vm_area(cpu_addr);

-   if (WARN_ON(!area || !area->pages))
+   pages = __iommu_dma_get_pages(cpu_addr);
+   if (WARN_ON_ONCE(!pages))
return -ENXIO;
-
-   return __iommu_dma_mmap(area->pages, size, vma);
+   return __iommu_dma_mmap(pages, size, vma);
  }
  
  static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,

@@ -1096,7 +1104,7 @@ static int iommu_dma_get_sgtable(struct device *dev, 
struct sg_table *sgt,
unsigned long attrs)
  {
unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   struct vm_struct *area = find_vm_area(cpu_addr);
+   struct page **pages;
  
  	if (!is_vmalloc_addr(cpu_addr)) {

struct page *page = virt_to_page(cpu_addr);
@@ -1112,10 +1120,10 @@ static int iommu_dma_get_sgtable(struct device *dev, 
struct sg_table *sgt,
return __iommu_dma_get_sgtable_page(sgt, page, size);
}
  
-	if (WARN_ON(!area || !area->pages))

+   pages = __iommu_dma_get_pages(cpu_addr);
+   if (WARN_ON_ONCE(!pages))
return -ENXIO;
-
-   return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
+   return sg_alloc_table_from_pages(sgt, pages, count, 0, size,
 GFP_KERNEL);
  }
  


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

There is nothing really arm64 specific in the iommu_dma_ops
implementation, so move it to dma-iommu.c and keep a lot of symbols
self-contained.  Note the implementation does depend on the
DMA_DIRECT_REMAP infrastructure for now, so we'll have to make the
DMA_IOMMU support depend on it, but this will be relaxed soon.


Nothing looks objectionable, and boot testing with this much of the 
series merged has my coherent and non-coherent IOMMU-backed devices 
appearing to still work OK, so:


Acked-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  arch/arm64/mm/dma-mapping.c | 389 +---
  drivers/iommu/Kconfig   |   1 +
  drivers/iommu/dma-iommu.c   | 388 ---
  include/linux/dma-iommu.h   |  43 +---
  4 files changed, 369 insertions(+), 452 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 636fa7c64370..d1661f78eb4d 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -27,6 +27,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -58,27 +59,6 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
__dma_flush_area(page_address(page), size);
  }
  
-#ifdef CONFIG_IOMMU_DMA

-static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
- struct page *page, size_t size)
-{
-   int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
-
-   if (!ret)
-   sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
-
-   return ret;
-}
-
-static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
- unsigned long pfn, size_t size)
-{
-   return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
- vma->vm_end - vma->vm_start,
- vma->vm_page_prot);
-}
-#endif /* CONFIG_IOMMU_DMA */
-
  static int __init arm64_dma_init(void)
  {
WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
@@ -90,379 +70,18 @@ static int __init arm64_dma_init(void)
  arch_initcall(arm64_dma_init);
  
  #ifdef CONFIG_IOMMU_DMA

-#include 
-#include 
-#include 
-
-static void *__iommu_alloc_attrs(struct device *dev, size_t size,
-dma_addr_t *handle, gfp_t gfp,
-unsigned long attrs)
-{
-   bool coherent = dev_is_dma_coherent(dev);
-   int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
-   size_t iosize = size;
-   void *addr;
-
-   if (WARN(!dev, "cannot create IOMMU mapping for unknown device\n"))
-   return NULL;
-
-   size = PAGE_ALIGN(size);
-
-   /*
-* Some drivers rely on this, and we probably don't want the
-* possibility of stale kernel data being read by devices anyway.
-*/
-   gfp |= __GFP_ZERO;
-
-   if (!gfpflags_allow_blocking(gfp)) {
-   struct page *page;
-   /*
-* In atomic context we can't remap anything, so we'll only
-* get the virtually contiguous buffer we need by way of a
-* physically contiguous allocation.
-*/
-   if (coherent) {
-   page = alloc_pages(gfp, get_order(size));
-   addr = page ? page_address(page) : NULL;
-   } else {
-   addr = dma_alloc_from_pool(size, , gfp);
-   }
-   if (!addr)
-   return NULL;
-
-   *handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
-   if (*handle == DMA_MAPPING_ERROR) {
-   if (coherent)
-   __free_pages(page, get_order(size));
-   else
-   dma_free_from_pool(addr, size);
-   addr = NULL;
-   }
-   } else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-   pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
-   struct page *page;
-
-   page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-   get_order(size), gfp & __GFP_NOWARN);
-   if (!page)
-   return NULL;
-
-   *handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
-   if (*handle == DMA_MAPPING_ERROR) {
-   dma_release_from_contiguous(dev, page,
-   size >> PAGE_SHIFT);
-   return NULL;
-   }
-   addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
-  prot,
-  __builtin_return_address(0));
-   if (addr) {
-   if (!coherent)
-   

Re: [PATCH 02/26] arm64/iommu: improve mmap bounds checking

2019-04-29 Thread Robin Murphy

On 22/04/2019 18:59, Christoph Hellwig wrote:

The nr_pages checks should be done for all mmap requests, not just those
using remap_pfn_range.


I think it probably makes sense now to just squash this with #22 one way 
or the other, but if you really really still want to keep it as a 
separate patch with a misleading commit message then I'm willing to keep 
my complaints to myself :)


Robin.


Signed-off-by: Christoph Hellwig 
---
  arch/arm64/mm/dma-mapping.c | 21 -
  1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 674860e3e478..604c638b2787 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -73,19 +73,9 @@ static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
  static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
  unsigned long pfn, size_t size)
  {
-   int ret = -ENXIO;
-   unsigned long nr_vma_pages = vma_pages(vma);
-   unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   unsigned long off = vma->vm_pgoff;
-
-   if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
-   ret = remap_pfn_range(vma, vma->vm_start,
- pfn + off,
- vma->vm_end - vma->vm_start,
- vma->vm_page_prot);
-   }
-
-   return ret;
+   return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
+ vma->vm_end - vma->vm_start,
+ vma->vm_page_prot);
  }
  #endif /* CONFIG_IOMMU_DMA */
  
@@ -241,6 +231,8 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,

  void *cpu_addr, dma_addr_t dma_addr, size_t size,
  unsigned long attrs)
  {
+   unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+   unsigned long off = vma->vm_pgoff;
struct vm_struct *area;
int ret;
  
@@ -249,6 +241,9 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,

	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
return ret;
  
+	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)

+   return -ENXIO;
+
if (!is_vmalloc_addr(cpu_addr)) {
unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
return __swiotlb_mmap_pfn(vma, pfn, size);


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1] iommu/amd: flush not present cache in iommu_map_page

2019-04-29 Thread Christoph Hellwig
On Mon, Apr 29, 2019 at 01:17:44PM +0100, Tom Murphy wrote:
> Yes. My patches depend on the "iommu/vt-d: Delegate DMA domain to
> generic iommu" patch which is currently being reviewed.

Nice!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1] iommu/amd: flush not present cache in iommu_map_page

2019-04-29 Thread Tom Murphy via iommu
On Mon, Apr 29, 2019 at 12:59 PM Christoph Hellwig  wrote:
>
> On Sat, Apr 27, 2019 at 03:20:35PM +0100, Tom Murphy wrote:
> > I am working on another patch to improve the intel iotlb flushing in
> > the iommu ops patch which should cover this too.
>
> So are you looking into converting the intel-iommu driver to use
> dma-iommu as well?  That would be great!

Yes. My patches depend on the "iommu/vt-d: Delegate DMA domain to
generic iommu" patch which is currently being reviewed.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 12/21] dma-iommu: factor atomic pool allocations into helpers

2019-04-29 Thread Robin Murphy

On 29/04/2019 12:49, Christoph Hellwig wrote:

On Tue, Apr 23, 2019 at 11:01:44AM +0100, Robin Murphy wrote:

Wouldn't this suffice? Since we also use alloc_pages() in the coherent
atomic case, the free path should already be able to deal with it.

Let me take a proper look at v3 and see how it all looks in context.


Any comments on v3?  I've been deferring lots of other DMA work to
not create conflicts, so I'd hate to miss this merge window.


Ah, I did skim the commits in the branch, but I'll run through again and 
reply on the patches while my head's still in email mode...


Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1] iommu/amd: flush not present cache in iommu_map_page

2019-04-29 Thread Christoph Hellwig
On Sat, Apr 27, 2019 at 03:20:35PM +0100, Tom Murphy wrote:
> I am working on another patch to improve the intel iotlb flushing in
> the iommu ops patch which should cover this too.

So are you looking into converting the intel-iommu driver to use
dma-iommu as well?  That would be great!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 12/21] dma-iommu: factor atomic pool allocations into helpers

2019-04-29 Thread Christoph Hellwig
On Tue, Apr 23, 2019 at 11:01:44AM +0100, Robin Murphy wrote:
> Wouldn't this suffice? Since we also use alloc_pages() in the coherent 
> atomic case, the free path should already be able to deal with it.
>
> Let me take a proper look at v3 and see how it all looks in context.

Any comments on v3?  I've been deferring lots of other DMA work to
not create conflicts, so I'd hate to miss this merge window.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 02/10] swiotlb: Factor out slot allocation and free

2019-04-29 Thread Christoph Hellwig
On Mon, Apr 29, 2019 at 12:06:52PM +0100, Robin Murphy wrote:
>
> From the reply up-thread I guess you're trying to include an optimisation 
> to only copy the head and tail of the buffer if it spans multiple pages, 
> and directly map the ones in the middle, but AFAICS that's going to tie you 
> to also using strict mode for TLB maintenance, which may not be a win 
> overall depending on the balance between invalidation bandwidth vs. memcpy 
> bandwidth. At least if we use standard SWIOTLB logic to always copy the 
> whole thing, we should be able to release the bounce pages via the flush 
> queue to allow 'safe' lazy unmaps.

Oh.  The head and tail optimization is what I missed.  Yes, for that
we'd need the offset.

> Either way I think it would be worth just implementing the straightforward 
> version first, then coming back to consider optimisations later.

Agreed, let's start simple.  Especially as large DMA mappings or
allocations should usually be properly aligned anyway, and if not we
should fix that for multiple reasons.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 02/10] swiotlb: Factor out slot allocation and free

2019-04-29 Thread Robin Murphy

On 29/04/2019 06:10, Lu Baolu wrote:

Hi Christoph,

On 4/26/19 11:04 PM, Christoph Hellwig wrote:

On Thu, Apr 25, 2019 at 10:07:19AM +0800, Lu Baolu wrote:

This is not VT-d specific. It's just how generic IOMMU works.

Normally, IOMMU works in paging mode. So if a driver issues DMA with
IOVA  0x0123, IOMMU can remap it with a physical address 0x0123.
But we should never expect IOMMU to remap 0x0123 with physical
address of 0x. That's the reason why I said that IOMMU will not
work there.


Well, with the iommu it doesn't happen.  With swiotlb it obviosuly
can happen, so drivers are fine with it.  Why would that suddenly
become an issue when swiotlb is called from the iommu code?



I would say IOMMU is DMA remapping, not DMA engine. :-)


I'm not sure I really follow the issue here - if we're copying the 
buffer to the bounce page(s) there's no conceptual difference from 
copying it to SWIOTLB slot(s), so there should be no need to worry about 
the original in-page offset.


From the reply up-thread I guess you're trying to include an 
optimisation to only copy the head and tail of the buffer if it spans 
multiple pages, and directly map the ones in the middle, but AFAICS 
that's going to tie you to also using strict mode for TLB maintenance, 
which may not be a win overall depending on the balance between 
invalidation bandwidth vs. memcpy bandwidth. At least if we use standard 
SWIOTLB logic to always copy the whole thing, we should be able to 
release the bounce pages via the flush queue to allow 'safe' lazy unmaps.
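
Concretely, the copy half of that optimisation would look something like
the sketch below (purely illustrative, not from the posted series;
iova_pgsize is the IOMMU page size, head_buf/tail_buf the bounce pages):

	unsigned long mask = iova_pgsize - 1;
	size_t head = min(size, (iova_pgsize - (start & mask)) & mask);
	size_t tail = (start + size) & mask;

	if (head + tail >= size) {
		/* no fully-covered middle page: bounce the whole buffer */
	} else {
		memcpy(head_buf, buf, head);               /* unaligned head */
		memcpy(tail_buf, buf + size - tail, tail); /* unaligned tail */
		/* middle pages get IOMMU-mapped in place, no copy */
	}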


Either way I think it would be worth just implementing the 
straightforward version first, then coming back to consider 
optimisations later.


Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 01/10] iommu: Add helper to get minimal page size of domain

2019-04-29 Thread Robin Murphy

On 21/04/2019 02:17, Lu Baolu wrote:

This makes it possible for other modules to know the minimal
page size supported by a domain without knowledge of the
structure details.

Signed-off-by: Lu Baolu 
---
  include/linux/iommu.h | 13 +
  1 file changed, 13 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a5007d035218..46679ef19b7e 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -377,6 +377,14 @@ static inline void iommu_tlb_sync(struct iommu_domain 
*domain)
domain->ops->iotlb_sync(domain);
  }
  
+static inline unsigned long domain_minimal_pgsize(struct iommu_domain *domain)

+{
+   if (domain && domain->pgsize_bitmap)
+   return 1 << __ffs(domain->pgsize_bitmap);


Nit: this would probably be more efficient on most architectures as:

if (domain)
return domain->pgsize_bitmap & -domain->pgsize_bitmap;


I'd also suggest s/minimal/min/ in the name, just to stop it getting too 
long. Otherwise, though, I like the idea, and there's at least one other 
place (in iommu-dma) that can make use of it straight away.
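
i.e. folding both suggestions together, something like:

	static inline unsigned long domain_min_pgsize(struct iommu_domain *domain)
	{
		if (domain)
			return domain->pgsize_bitmap & -domain->pgsize_bitmap;
		return 0;
	}

(a zero pgsize_bitmap already falls out as 0 from the AND, so the second
half of the original check becomes redundant).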


Robin.


+
+   return 0;
+}
+
  /* PCI device grouping function */
  extern struct iommu_group *pci_device_group(struct device *dev);
  /* Generic device grouping function */
@@ -704,6 +712,11 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct 
fwnode_handle *fwnode)
return NULL;
  }
  
+static inline unsigned long domain_minimal_pgsize(struct iommu_domain *domain)

+{
+   return 0;
+}
+
  #endif /* CONFIG_IOMMU_API */
  
  #ifdef CONFIG_IOMMU_DEBUGFS



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 11/19] iommu/vt-d: Replace Intel specific PASID allocator with IOASID

2019-04-29 Thread Jean-Philippe Brucker
On 27/04/2019 09:38, Auger Eric wrote:
 --- a/drivers/iommu/intel-iommu.c
 +++ b/drivers/iommu/intel-iommu.c
 @@ -5153,7 +5153,7 @@ static void auxiliary_unlink_device(struct
 dmar_domain *domain, domain->auxd_refcnt--;
  
if (!domain->auxd_refcnt && domain->default_pasid > 0)
 -  intel_pasid_free_id(domain->default_pasid);
 +  ioasid_free(domain->default_pasid);
  }
  
  static int aux_domain_add_dev(struct dmar_domain *domain,
 @@ -5171,9 +5171,8 @@ static int aux_domain_add_dev(struct
 dmar_domain *domain, if (domain->default_pasid <= 0) {
int pasid;
  
 -  pasid = intel_pasid_alloc_id(domain, PASID_MIN,
 -
 pci_max_pasids(to_pci_dev(dev)),
 -   GFP_KERNEL);
 +  pasid = ioasid_alloc(NULL, PASID_MIN,
 pci_max_pasids(to_pci_dev(dev)) - 1,
 +  domain);
if (pasid <= 0) {  
>>> ioasid_t is a uint and returns INVALID_IOASID on error. Wouldn't it be
>>> simpler to make ioasid_alloc return an int?
>> Well, I think we still want the full uint range minus 1 (INVALID_IOASID).
>> Intel uses 20 bits but I think SMMUs use 32 bits for streamID? I
>> should just check 
>>  if (pasid == INVALID_IOASID) {
> Jean-Philippe may correct me but SMMU uses 20b SubstreamId which is a
> superset of PASIDs. StreamId is 32b.

Right, we use 20 bits for PASIDs (== SubstreamID really). Given the
choices that vendors are making for PASIDs (a global namespace rather
than per-VM), I wouldn't be surprised if they extend the size of PASIDs
in a couple of years, so I added the typedef ioasid_t to ease a possible
change from 32 to 64 bits in the future.
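
For reference, the definitions under discussion boil down to (as posted
in the IOASID series):

	typedef unsigned int ioasid_t;
	#define INVALID_IOASID ((ioasid_t)-1)

so the allocator can hand out the full unsigned range except the all-ones
value, and callers should test for INVALID_IOASID rather than for <= 0.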

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu