Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-04-15 Thread Lu Baolu

Hi James,

On 4/15/19 10:16 PM, James Sewart wrote:

> Hey Lu,
> 
>> On 10 Apr 2019, at 06:22, Lu Baolu  wrote:
>> 
>> Hi James,
>> 
>> On 4/6/19 2:02 AM, James Sewart wrote:
>>> Hey Lu,
>>> My bad, did some debugging on my end. The issue was swapping out
>>> find_domain for iommu_get_domain_for_dev. It seems in some situations the
>>> domain is not attached to the group but the device is expected to have the
>>> domain still stored in its archdata.
>>> I’ve attached the final patch with find_domain unremoved which seems to
>>> work in my testing.


>> Just looked into your v3 patch set and some thoughts from my end posted
>> here just for your information.
>> 
>> Let me post the problems we want to address.
>> 
>> 1. When allocating a new group for a device, how should we determine the
>> type of the default domain?
>> 
>> 2. If we need to put a device into an existing group which uses a
>> different type of domain from what the device desires to use, we might
>> break the functionality of the device.
>> 
>> My new thought is letting the iommu generic code determine the
>> default domain type (hence my proposed vendor-specific default domain
>> type patches could be dropped).
>> 
>> If the default domain type is dynamic mapping, and later in
>> iommu_no_mapping() we determine that we must use an identity domain,
>> we then call iommu_request_dm_for_dev(dev).
>> 
>> If the default domain type is identity mapping, and later in
>> iommu_no_mapping() we determine that we must use a dynamic domain,
>> we then call iommu_request_dma_domain_for_dev(dev).
>> 
>> We already have iommu_request_dm_for_dev() in iommu.c. We only need to
>> implement iommu_request_dma_domain_for_dev().
>> 
>> With this done, your patch titled "Create an IOMMU group for devices
>> that require an identity map" could also be dropped.
>> 
>> Any thoughts?


> There are a couple of issues I can think of. iommu_request_dm_for_dev changes
> the domain for all devices within the device’s group; if we may have
> devices with different domain requirements in the same group, then only the
> last initialised device will get the domain it wants. If we want to add
> iommu_request_dma_domain_for_dev then we would still need another IOMMU
> group for identity domain devices.


The current solution (a.k.a. v3) for this is to assign a new group to a
device which requires an identity-mapped domain. This will break the
functionality of direct device pass-through (to user level). The iommu
group represents the minimum granularity of iommu isolation and
protection, so the requirement for an identity-mapped domain should not
be a factor in forming a new group.

Both iommu_request_dma_domain_for_dev() and iommu_request_dm_for_dev()
only support groups with a single device in them.

The users could choose between domains of DMA type or IDENTITY type for
a group. If multiple devices share a single group and neither type works
for all of them, they have to disable the IOMMU. That isn't a valid
configuration from the IOMMU's point of view.



> Both with v3 and the proposed method here, iommu_should_identity_map is
> determining whether the device requires an identity map. In v3 this is
> called once up front by the generic code to determine which domain type
> to use for the IOMMU group. In the proposed method I think this would happen
> after initial domain allocation and would trigger the domain to be
> replaced with a different domain.
> 
> I prefer the solution in v3. What do you think?


The only concern with the solution in v3, from my point of view, is some
duplication. Even if we can ask the Intel IOMMU driver to tell us the
default domain type, we might have to change it after boot. For example,
for 32-bit PCI devices, we don't know whether a device is 64-bit or
32-bit capable during IOMMU initialization, so we might tell iommu.c to
use an identity-mapped domain. After boot we check again, and find out
that the device is only 32-bit capable (hence it can only access
physical memory below 4G), so the default identity domain doesn't work,
and we have to change it to a DMA domain via
iommu_request_dma_domain_for_dev().

So, to make it simple and straightforward, we can just let iommu.c
determine the default domain type, and after that the Intel IOMMU driver
can tweak it according to the quirk bits in the driver.
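
A minimal sketch of the flow being proposed, assuming a new helper
iommu_request_dma_domain_for_dev() mirroring the existing
iommu_request_dm_for_dev(); the check shown is a hypothetical stand-in
for the driver's real quirk logic:

    /* Sketch only: switch a device's group to the default domain type
     * the driver decides it really needs, once that is known. */
    static int intel_iommu_fixup_default_domain(struct device *dev)
    {
            if (device_requires_identity_map(dev))  /* assumed helper */
                    return iommu_request_dm_for_dev(dev);
            /* proposed counterpart of iommu_request_dm_for_dev() */
            return iommu_request_dma_domain_for_dev(dev);
    }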

Best regards,
Lu Baolu

Re: [PATCH 10/18] iommu/vt-d: Add custom allocator for IOASID

2019-04-15 Thread Jacob Pan
On Mon, 15 Apr 2019 14:37:11 -0600
Alex Williamson  wrote:

> On Mon,  8 Apr 2019 16:59:25 -0700
> Jacob Pan  wrote:
> 
> > When the VT-d driver runs in the guest, PASID allocation must be
> > performed via the virtual command interface. This patch registers a
> > custom IOASID allocator which takes precedence over the default
> > IDR based allocator. The resulting IOASID allocation will always
> > come from the host. This ensures that the PASID namespace is
> > system-wide.
> > 
> > Signed-off-by: Lu Baolu 
> > Signed-off-by: Liu, Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> >  drivers/iommu/intel-iommu.c | 50 +
> >  include/linux/intel-iommu.h |  1 +
> >  2 files changed, 51 insertions(+)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index 28cb713..a38d774 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -4820,6 +4820,42 @@ static int __init platform_optin_force_iommu(void)
> >    return 1;
> >  }
> >  
> > +static ioasid_t intel_ioasid_alloc(ioasid_t min, ioasid_t max, void *data)
> > +{
> > +   struct intel_iommu *iommu = data;
> > +   ioasid_t ioasid;
> > +
> > +   if (vcmd_alloc_pasid(iommu, &ioasid))
> > +           return INVALID_IOASID;
> > +   return ioasid;
> 
> How does this honor min/max?
> 
> > +}
> > +
> > +static int intel_ioasid_free(ioasid_t ioasid, void *data)
> > +{
> > +   struct iommu_pasid_alloc_info *svm;
> > +   struct intel_iommu *iommu = data;
> > +
> > +   if (!iommu || !cap_caching_mode(iommu->cap))
> > +           return -EINVAL;
> > +   /*
> > +    * Sanity checking of the ioasid owner is done at the upper
> > +    * layer, e.g. VFIO. We can only free the PASID when all the
> > +    * devices are unbound.
> > +    */
> > +   svm = ioasid_find(NULL, ioasid, NULL);
> > +   if (!svm) {
> > +           pr_warn("Freeing unbound IOASID %d\n", ioasid);
> > +           return -EBUSY;
> > +   }
> > +   vcmd_free_pasid(iommu, ioasid);
> > +
> > +   return 0;
> > +}
> > +
> > +static struct ioasid_allocator intel_iommu_ioasid_allocator = {
> > +   .alloc = intel_ioasid_alloc,
> > +   .free = intel_ioasid_free,
> > +};
> > +
> >  int __init intel_iommu_init(void)
> >  {
> >    int ret = -ENODEV;
> > @@ -4921,6 +4957,20 @@ int __init intel_iommu_init(void)
> >               "%s", iommu->name);
> >    iommu_device_set_ops(&iommu->iommu, &intel_iommu_ops);
> >    iommu_device_register(&iommu->iommu);
> > +  if (cap_caching_mode(iommu->cap) && sm_supported(iommu)) {
> > +          /*
> > +           * Register a custom ASID allocator if we are running
> > +           * in a guest, the purpose is to have a system wide PASID
> > +           * namespace among all PASID users.
> > +           * Note that only one vIOMMU in each guest is supported.
> 
> Why one vIOMMU per guest?  This would prevent guests with multiple PCI
> domains aiui.
> 
This is mainly for simplicity reasons. These are all virtual BDFs
anyway. As long as a guest BDF can be mapped to a host BDF, it should be
sufficient; am I missing anything?

From a PASID allocation perspective, it is not tied to any PCI device
until the bind call. We only need to track PASID ownership per guest.

The virtio-IOMMU spec does support multiple PCI domains, but I am not
sure whether that applies to all assigned devices, i.e. whether all
assigned devices are under the same domain. Perhaps Jean can help
clarify what the PASID allocation API looks like on virtio-IOMMU.
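
On the min/max question above: one illustrative way for this allocator
to honor the requested bounds would be to validate the PASID that the
virtual command interface returns (a sketch only, not part of the
posted patch):

    static ioasid_t intel_ioasid_alloc(ioasid_t min, ioasid_t max, void *data)
    {
            struct intel_iommu *iommu = data;
            ioasid_t ioasid;

            if (vcmd_alloc_pasid(iommu, &ioasid))
                    return INVALID_IOASID;
            /* The host picks the value; reject it if outside [min, max). */
            if (ioasid < min || ioasid >= max) {
                    vcmd_free_pasid(iommu, ioasid);
                    return INVALID_IOASID;
            }
            return ioasid;
    }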

> > +           */
> > +          intel_iommu_ioasid_allocator.pdata = (void *)iommu;
> > +          ret = ioasid_set_allocator(&intel_iommu_ioasid_allocator);
> > +          if (ret == -EBUSY) {
> > +                  pr_info("Custom IOASID allocator already registered\n");
> > +                  break;
> > +          }
> > +  }
> >    }
> >  
> >    bus_set_iommu(&pci_bus_type, &intel_iommu_ops);
> > diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> > index b29c85c..bc09d80 100644
> > --- a/include/linux/intel-iommu.h
> > +++ b/include/linux/intel-iommu.h
> > @@ -31,6 +31,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include 
> >  #include   
> 


Re: [PATCH 4/4] swiotlb-xen: ensure we have a single callsite for xen_dma_map_page

2019-04-15 Thread Stefano Stabellini
On Thu, 11 Apr 2019, Christoph Hellwig wrote:
> Refactor the code a bit to make further changes easier.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Stefano Stabellini 

> ---
>  drivers/xen/swiotlb-xen.c | 31 ---
>  1 file changed, 16 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 9a951504dc12..5dcb06fe9667 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -391,13 +391,8 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
>   if (dma_capable(dev, dev_addr, size) &&
>   !range_straddles_page_boundary(phys, size) &&
>   !xen_arch_need_swiotlb(dev, phys, dev_addr) &&
> - (swiotlb_force != SWIOTLB_FORCE)) {
> - /* we are not interested in the dma_addr returned by
> -  * xen_dma_map_page, only in the potential cache flushes executed
> -  * by the function. */
> - xen_dma_map_page(dev, page, dev_addr, offset, size, dir, attrs);
> - return dev_addr;
> - }
> + swiotlb_force != SWIOTLB_FORCE)
> + goto done;
>  
>   /*
>    * Oh well, have to allocate and map a bounce buffer.
> @@ -410,19 +405,25 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
>   return DMA_MAPPING_ERROR;
>  
>   dev_addr = xen_phys_to_bus(map);
> - xen_dma_map_page(dev, pfn_to_page(map >> PAGE_SHIFT),
> -                  dev_addr, map & ~PAGE_MASK, size, dir, attrs);
>  
>   /*
>* Ensure that the address returned is DMA'ble
>*/
> - if (dma_capable(dev, dev_addr, size))
> - return dev_addr;
> -
> - attrs |= DMA_ATTR_SKIP_CPU_SYNC;
> - swiotlb_tbl_unmap_single(dev, map, size, dir, attrs);
> + if (unlikely(!dma_capable(dev, dev_addr, size))) {
> + swiotlb_tbl_unmap_single(dev, map, size, dir,
> + attrs | DMA_ATTR_SKIP_CPU_SYNC);
> + return DMA_MAPPING_ERROR;
> + }
>  
> - return DMA_MAPPING_ERROR;
> + page = pfn_to_page(map >> PAGE_SHIFT);
> + offset = map & ~PAGE_MASK;
> +done:
> + /*
> +  * we are not interested in the dma_addr returned by xen_dma_map_page,
> +  * only in the potential cache flushes executed by the function.
> +  */
> + xen_dma_map_page(dev, page, dev_addr, offset, size, dir, attrs);
> + return dev_addr;
>  }
>  
>  /*
> -- 
> 2.20.1
> 


Re: [PATCH 2/4] swiotlb-xen: use ->map_page to implement ->map_sg

2019-04-15 Thread Stefano Stabellini
On Thu, 11 Apr 2019, Christoph Hellwig wrote:
> We can simply loop over the segments and map them, removing lots of
> duplicate code.

Right, the only difference is the additional dma_capable check which is
good to have.

Reviewed-by: Stefano Stabellini 


> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/xen/swiotlb-xen.c | 68 ++-
>  1 file changed, 10 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index d4bc3aabd44d..97a55c225593 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -517,24 +517,8 @@ xen_swiotlb_unmap_sg(struct device *hwdev, struct 
> scatterlist *sgl, int nelems,
>  
>  }
>  
> -/*
> - * Map a set of buffers described by scatterlist in streaming mode for DMA.
> - * This is the scatter-gather version of the above xen_swiotlb_map_page
> - * interface.  Here the scatter gather list elements are each tagged with the
> - * appropriate dma address and length.  They are obtained via
> - * sg_dma_{address,length}(SG).
> - *
> - * NOTE: An implementation may be able to use a smaller number of
> - *   DMA address/length pairs than there are SG table elements.
> - *   (for example via virtual mapping capabilities)
> - *   The routine returns the number of addr/length pairs actually
> - *   used, at most nents.
> - *
> - * Device ownership issues as mentioned above for xen_swiotlb_map_page are the
> - * same here.
> - */
>  static int
> -xen_swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems,
> +xen_swiotlb_map_sg(struct device *dev, struct scatterlist *sgl, int nelems,
>   enum dma_data_direction dir, unsigned long attrs)
>  {
>   struct scatterlist *sg;
> @@ -543,50 +527,18 @@ xen_swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems,
>   BUG_ON(dir == DMA_NONE);
>  
>   for_each_sg(sgl, sg, nelems, i) {
> - phys_addr_t paddr = sg_phys(sg);
> - dma_addr_t dev_addr = xen_phys_to_bus(paddr);
> -
> - if (swiotlb_force == SWIOTLB_FORCE ||
> - xen_arch_need_swiotlb(hwdev, paddr, dev_addr) ||
> - !dma_capable(hwdev, dev_addr, sg->length) ||
> - range_straddles_page_boundary(paddr, sg->length)) {
> - phys_addr_t map = swiotlb_tbl_map_single(hwdev,
> -  start_dma_addr,
> -  sg_phys(sg),
> -  sg->length,
> -  dir, attrs);
> - if (map == DMA_MAPPING_ERROR) {
> - dev_warn(hwdev, "swiotlb buffer is full\n");
> - /* Don't panic here, we expect map_sg users
> -to do proper error handling. */
> - attrs |= DMA_ATTR_SKIP_CPU_SYNC;
> - xen_swiotlb_unmap_sg(hwdev, sgl, i, dir, attrs);
> - sg_dma_len(sgl) = 0;
> - return 0;
> - }
> - dev_addr = xen_phys_to_bus(map);
> - xen_dma_map_page(hwdev, pfn_to_page(map >> PAGE_SHIFT),
> - dev_addr,
> - map & ~PAGE_MASK,
> - sg->length,
> - dir,
> - attrs);
> - sg->dma_address = dev_addr;
> - } else {
> - /* we are not interested in the dma_addr returned by
> -  * xen_dma_map_page, only in the potential cache flushes executed
> -  * by the function. */
> - xen_dma_map_page(hwdev, pfn_to_page(paddr >> PAGE_SHIFT),
> - dev_addr,
> - paddr & ~PAGE_MASK,
> - sg->length,
> - dir,
> - attrs);
> - sg->dma_address = dev_addr;
> - }
> + sg->dma_address = xen_swiotlb_map_page(dev, sg_page(sg),
> + sg->offset, sg->length, dir, attrs);
> + if (sg->dma_address == DMA_MAPPING_ERROR)
> + goto out_unmap;
>   sg_dma_len(sg) = sg->length;
>   }
> +
>   return nelems;
> +out_unmap:
> + xen_swiotlb_unmap_sg(dev, sgl, i, dir, attrs | DMA_ATTR_SKIP_CPU_SYNC);
> + sg_dma_len(sgl) = 0;
> + return 0;
>  }
>  
>  /*
> -- 
> 2.20.1
> 

Re: [PATCH 1/4] swiotlb-xen: make instances match their method names

2019-04-15 Thread Stefano Stabellini
On Thu, 11 Apr 2019, Christoph Hellwig wrote:
> Just drop two pointless _attrs prefixes to make the code a little
> more grep-able.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Stefano Stabellini 

> ---
>  drivers/xen/swiotlb-xen.c | 17 +++--
>  1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 877baf2a94f4..d4bc3aabd44d 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -504,9 +504,8 @@ xen_swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
>   * concerning calls here are the same as for swiotlb_unmap_page() above.
>   */
>  static void
> -xen_swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
> -int nelems, enum dma_data_direction dir,
> -unsigned long attrs)
> +xen_swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sgl, int nelems,
> + enum dma_data_direction dir, unsigned long attrs)
>  {
>   struct scatterlist *sg;
>   int i;
> @@ -535,9 +534,8 @@ xen_swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
>   * same here.
>   */
>  static int
> -xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
> -  int nelems, enum dma_data_direction dir,
> -  unsigned long attrs)
> +xen_swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems,
> + enum dma_data_direction dir, unsigned long attrs)
>  {
>   struct scatterlist *sg;
>   int i;
> @@ -562,8 +560,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
>   /* Don't panic here, we expect map_sg users
>  to do proper error handling. */
>   attrs |= DMA_ATTR_SKIP_CPU_SYNC;
> - xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
> -attrs);
> + xen_swiotlb_unmap_sg(hwdev, sgl, i, dir, attrs);
>   sg_dma_len(sgl) = 0;
>   return 0;
>   }
> @@ -690,8 +687,8 @@ const struct dma_map_ops xen_swiotlb_dma_ops = {
>   .sync_single_for_device = xen_swiotlb_sync_single_for_device,
>   .sync_sg_for_cpu = xen_swiotlb_sync_sg_for_cpu,
>   .sync_sg_for_device = xen_swiotlb_sync_sg_for_device,
> - .map_sg = xen_swiotlb_map_sg_attrs,
> - .unmap_sg = xen_swiotlb_unmap_sg_attrs,
> + .map_sg = xen_swiotlb_map_sg,
> + .unmap_sg = xen_swiotlb_unmap_sg,
>   .map_page = xen_swiotlb_map_page,
>   .unmap_page = xen_swiotlb_unmap_page,
>   .dma_supported = xen_swiotlb_dma_supported,
> -- 
> 2.20.1
> 


Re: [PATCH 3/4] swiotlb-xen: simplify the DMA sync method implementations

2019-04-15 Thread Stefano Stabellini
On Thu, 11 Apr 2019, Christoph Hellwig wrote:
> Get rid of the grand multiplexer and implement the sync_single_for_cpu
> and sync_single_for_device methods directly, and then loop over them
> for the scatterlist based variants.
> 
> Note that this also loses a few comments related to highlevel DMA API
> concepts, which have nothing to do with the swiotlb-xen implementation
> details.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Stefano Stabellini 

> ---
>  drivers/xen/swiotlb-xen.c | 84 +--
>  1 file changed, 28 insertions(+), 56 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 97a55c225593..9a951504dc12 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -455,48 +455,28 @@ static void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>   xen_unmap_single(hwdev, dev_addr, size, dir, attrs);
>  }
>  
> -/*
> - * Make physical memory consistent for a single streaming mode DMA translation
> - * after a transfer.
> - *
> - * If you perform a xen_swiotlb_map_page() but wish to interrogate the buffer
> - * using the cpu, yet do not wish to teardown the dma mapping, you must
> - * call this function before doing so.  At the next point you give the dma
> - * address back to the card, you must first perform a
> - * xen_swiotlb_dma_sync_for_device, and then the device again owns the buffer
> - */
>  static void
> -xen_swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
> - size_t size, enum dma_data_direction dir,
> - enum dma_sync_target target)
> +xen_swiotlb_sync_single_for_cpu(struct device *dev, dma_addr_t dma_addr,
> + size_t size, enum dma_data_direction dir)
>  {
> - phys_addr_t paddr = xen_bus_to_phys(dev_addr);
> + phys_addr_t paddr = xen_bus_to_phys(dma_addr);
>  
> - BUG_ON(dir == DMA_NONE);
> -
> - if (target == SYNC_FOR_CPU)
> - xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir);
> + xen_dma_sync_single_for_cpu(dev, dma_addr, size, dir);
>  
> - /* NOTE: We use dev_addr here, not paddr! */
> - if (is_xen_swiotlb_buffer(dev_addr))
> - swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
> -
> - if (target == SYNC_FOR_DEVICE)
> - xen_dma_sync_single_for_device(hwdev, dev_addr, size, dir);
> + if (is_xen_swiotlb_buffer(dma_addr))
> + swiotlb_tbl_sync_single(dev, paddr, size, dir, SYNC_FOR_CPU);
>  }
>  
> -void
> -xen_swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr,
> - size_t size, enum dma_data_direction dir)
> +static void
> +xen_swiotlb_sync_single_for_device(struct device *dev, dma_addr_t dma_addr,
> + size_t size, enum dma_data_direction dir)
>  {
> - xen_swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU);
> -}
> + phys_addr_t paddr = xen_bus_to_phys(dma_addr);
>  
> -void
> -xen_swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
> -size_t size, enum dma_data_direction dir)
> -{
> - xen_swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_DEVICE);
> + if (is_xen_swiotlb_buffer(dma_addr))
> + swiotlb_tbl_sync_single(dev, paddr, size, dir, SYNC_FOR_DEVICE);
> +
> + xen_dma_sync_single_for_device(dev, dma_addr, size, dir);
>  }
>  
>  /*
> @@ -541,38 +521,30 @@ xen_swiotlb_map_sg(struct device *dev, struct scatterlist *sgl, int nelems,
>   return 0;
>  }
>  
> -/*
> - * Make physical memory consistent for a set of streaming mode DMA translations
> - * after a transfer.
> - *
> - * The same as swiotlb_sync_single_* but for a scatter-gather list, same rules
> - * and usage.
> - */
>  static void
> -xen_swiotlb_sync_sg(struct device *hwdev, struct scatterlist *sgl,
> - int nelems, enum dma_data_direction dir,
> - enum dma_sync_target target)
> +xen_swiotlb_sync_sg_for_cpu(struct device *dev, struct scatterlist *sgl,
> + int nelems, enum dma_data_direction dir)
>  {
>   struct scatterlist *sg;
>   int i;
>  
> - for_each_sg(sgl, sg, nelems, i)
> - xen_swiotlb_sync_single(hwdev, sg->dma_address,
> - sg_dma_len(sg), dir, target);
> -}
> -
> -static void
> -xen_swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg,
> - int nelems, enum dma_data_direction dir)
> -{
> - xen_swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_CPU);
> + for_each_sg(sgl, sg, nelems, i) {
> + xen_swiotlb_sync_single_for_cpu(dev, sg->dma_address,
> + sg->length, dir);
> + }
>  }
>  
>  static void
> -xen_swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
> +xen_swiotlb_sync_sg_for_device(struct device *dev, struct scatterlist *sgl,
>  

Re: [PATCH 02/18] ioasid: Add custom IOASID allocator

2019-04-15 Thread Jacob Pan
On Mon, 15 Apr 2019 12:53:48 -0600
Alex Williamson  wrote:

> On Mon,  8 Apr 2019 16:59:17 -0700
> Jacob Pan  wrote:
> 
> > Sometimes, IOASID allocation must be handled by platform specific
> > code. The use cases are guest vIOMMU and pvIOMMU where IOASIDs need
> > to be allocated by the host via enlightened or paravirt interfaces.
> > 
> > This patch adds an extension to the IOASID allocator APIs such that
> > platform drivers can register a custom allocator, possibly at boot
> > time, to take over the allocation. IDR is still used for tracking
> > and searching purposes internal to the IOASID code. Private data of
> > an IOASID can also be set after the allocation.
> > 
> > Signed-off-by: Jacob Pan 
> > ---
> >  drivers/base/ioasid.c  | 124 +
> >  include/linux/ioasid.h |  28 ++-
> >  2 files changed, 143 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/base/ioasid.c b/drivers/base/ioasid.c
> > index cf122b2..294e856 100644
> > --- a/drivers/base/ioasid.c
> > +++ b/drivers/base/ioasid.c
> > @@ -17,6 +17,74 @@ struct ioasid_data {
> >  };
> >  
> >  static DEFINE_IDR(ioasid_idr);
> > +static DEFINE_MUTEX(ioasid_allocator_lock);
> > +static const struct ioasid_allocator *ioasid_allocator;
> > +
> > +
> > +/**
> > + * ioasid_set_allocator - register a custom allocator
> > + *
> > + * A custom allocator takes precedence over the default IDR-based
> > + * allocator. Private data associated with the ASID is managed by the
> > + * ASID common code, similar to IDR data.
> > + */
> > +int ioasid_set_allocator(struct ioasid_allocator *allocator)
> > +{
> > +   int ret = 0;
> > +
> > +   if (!allocator)
> > +           return -EINVAL;
> > +
> > +   mutex_lock(&ioasid_allocator_lock);
> > +   if (ioasid_allocator) {
> > +           ret = -EBUSY;
> > +           goto exit_unlock;
> > +   }
> > +   ioasid_allocator = allocator;
> > +
> > +exit_unlock:
> > +   mutex_unlock(&ioasid_allocator_lock);
> > +   return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(ioasid_set_allocator);  
> 
> Should this fault if there are existing idr's allocated?
> 
Yes, I think that would make things much simpler. There can be only one
allocator for the entire boot.
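
A sketch of that check, refusing registration once any IOASID already
exists (assuming the default IDR is the only other source of IDs at
this point; illustrative only):

    mutex_lock(&ioasid_allocator_lock);
    if (ioasid_allocator || !idr_is_empty(&ioasid_idr)) {
            ret = -EBUSY;
            goto exit_unlock;
    }
    ioasid_allocator = allocator;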
> > +
> > +/**
> > + * ioasid_clear_allocator - Free the custom IOASID allocator
> > + *
> > + * REVISIT: So far there is only one custom allocator allowed.
> > + */
> > +void ioasid_clear_allocator(void)
> > +{
> > +   mutex_lock(&ioasid_allocator_lock);
> > +   ioasid_allocator = NULL;
> > +   mutex_unlock(&ioasid_allocator_lock);
> > +}
> > +EXPORT_SYMBOL_GPL(ioasid_clear_allocator);
> > +
> > +/**
> > + * ioasid_set_data - Set private data for an allocated ioasid
> > + *
> > + * For IOASID that is already allocated, private data can be set
> > + * via this API. Future lookup can be done via ioasid_find.
> > + */
> > +int ioasid_set_data(ioasid_t ioasid, void *data)
> > +{
> > +   struct ioasid_data *ioasid_data;
> > +   int ret = 0;
> > +
> > +   idr_lock(&ioasid_idr);
> > +   ioasid_data = idr_find(&ioasid_idr, ioasid);
> > +   if (ioasid_data)
> > +           ioasid_data->private = data;
> > +   else
> > +           ret = -ENOENT;
> > +   idr_unlock(&ioasid_idr);
> > +   /* getter may use the private data */
> > +   synchronize_rcu();
> > +
> > +   return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(ioasid_set_data);
> >  
> >  /**
> >   * ioasid_alloc - Allocate an IOASID
> > @@ -32,7 +100,7 @@ static DEFINE_IDR(ioasid_idr);
> >  ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
> >                        void *private)
> >  {
> > -   int id = -1;
> > +   int id = INVALID_IOASID;
> > struct ioasid_data *data;
> >  
> > data = kzalloc(sizeof(*data), GFP_KERNEL);
> > @@ -42,13 +110,30 @@ ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
> >    data->set = set;
> > data->private = private;
> >  
> > +   /* Use custom allocator if available, otherwise default to IDR */
> > +   if (ioasid_allocator) {
> 
> If this races with ioasid_clear_allocator() ioasid_allocator might be
> set above, but NULL below to generate a segfault.  If this races with
> ioasid_set_allocator() an id can be allocated that the custom
> allocator doesn't track.
> 
Right, I need to move this under the lock below, and protect it in the
set and clear functions as well. Or delete the ioasid_clear_allocator()
function entirely, to prevent the case you mentioned.
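
A sketch of that fix, taking ioasid_allocator_lock across both the test
and the call so it cannot race with set/clear (illustrative only):

    mutex_lock(&ioasid_allocator_lock);
    if (ioasid_allocator) {
            id = ioasid_allocator->alloc(min, max, ioasid_allocator->pdata);
            if (id == INVALID_IOASID) {
                    mutex_unlock(&ioasid_allocator_lock);
                    pr_err("Failed ASID allocation by custom allocator\n");
                    goto exit_free;
            }
            min = id;
            max = id + 1;
    }
    mutex_unlock(&ioasid_allocator_lock);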

> > +           mutex_lock(&ioasid_allocator_lock);
> > +           id = ioasid_allocator->alloc(min, max, ioasid_allocator->pdata);
> > +           mutex_unlock(&ioasid_allocator_lock);
> > +           if (id == INVALID_IOASID) {
> > +                   pr_err("Failed ASID allocation by custom allocator\n");
> > +                   goto exit_free;
> > +           }
> > +           /*
> > +            * Use IDR to manage private data; also sanity check the
> > +            * custom allocator for duplicates.
> > +            */
> > +           min = id;
> > +           max = id + 1;
> > +   }
> > 

Re: [PATCH 10/18] iommu/vt-d: Add custom allocator for IOASID

2019-04-15 Thread Alex Williamson
On Mon,  8 Apr 2019 16:59:25 -0700
Jacob Pan  wrote:

> When the VT-d driver runs in the guest, PASID allocation must be
> performed via the virtual command interface. This patch registers a
> custom IOASID allocator which takes precedence over the default
> IDR based allocator. The resulting IOASID allocation will always
> come from the host. This ensures that the PASID namespace is
> system-wide.
> 
> Signed-off-by: Lu Baolu 
> Signed-off-by: Liu, Yi L 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/intel-iommu.c | 50 
> +
>  include/linux/intel-iommu.h |  1 +
>  2 files changed, 51 insertions(+)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 28cb713..a38d774 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4820,6 +4820,42 @@ static int __init platform_optin_force_iommu(void)
>   return 1;
>  }
>  
> +static ioasid_t intel_ioasid_alloc(ioasid_t min, ioasid_t max, void *data)
> +{
> + struct intel_iommu *iommu = data;
> + ioasid_t ioasid;
> +
> + if (vcmd_alloc_pasid(iommu, &ioasid))
> + return INVALID_IOASID;
> + return ioasid;

How does this honor min/max?

> +}
> +
> +static int intel_ioasid_free(ioasid_t ioasid, void *data)
> +{
> + struct iommu_pasid_alloc_info *svm;
> + struct intel_iommu *iommu = data;
> +
> + if (!iommu || !cap_caching_mode(iommu->cap))
> + return -EINVAL;
> + /*
> +  * Sanity checking of the ioasid owner is done at the upper layer,
> +  * e.g. VFIO. We can only free the PASID when all the devices are
> +  * unbound.
> +  */
> + svm = ioasid_find(NULL, ioasid, NULL);
> + if (!svm) {
> + pr_warn("Freeing unbound IOASID %d\n", ioasid);
> + return -EBUSY;
> + }
> + vcmd_free_pasid(iommu, ioasid);
> +
> + return 0;
> +}
> +
> +static struct ioasid_allocator intel_iommu_ioasid_allocator = {
> + .alloc = intel_ioasid_alloc,
> + .free = intel_ioasid_free,
> +};
> +
>  int __init intel_iommu_init(void)
>  {
>   int ret = -ENODEV;
> @@ -4921,6 +4957,20 @@ int __init intel_iommu_init(void)
>  "%s", iommu->name);
>   iommu_device_set_ops(&iommu->iommu, &intel_iommu_ops);
>   iommu_device_register(&iommu->iommu);
> + if (cap_caching_mode(iommu->cap) && sm_supported(iommu)) {
> + /*
> +  * Register a custom ASID allocator if we are running
> +  * in a guest, the purpose is to have a system wide 
> PASID
> +  * namespace among all PASID users.
> +  * Note that only one vIOMMU in each guest is supported.

Why one vIOMMU per guest?  This would prevent guests with multiple PCI
domains aiui.

> +  */
> + intel_iommu_ioasid_allocator.pdata = (void *)iommu;
> + ret = ioasid_set_allocator(&intel_iommu_ioasid_allocator);
> + if (ret == -EBUSY) {
> + pr_info("Custom IOASID allocator already registered\n");
> + break;
> + }
> + }
>   }
>  
>   bus_set_iommu(&pci_bus_type, &intel_iommu_ops);
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index b29c85c..bc09d80 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -31,6 +31,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 



Re: [PATCH 02/18] ioasid: Add custom IOASID allocator

2019-04-15 Thread Alex Williamson
On Mon,  8 Apr 2019 16:59:17 -0700
Jacob Pan  wrote:

> Sometimes, IOASID allocation must be handled by platform specific
> code. The use cases are guest vIOMMU and pvIOMMU where IOASIDs need
> to be allocated by the host via enlightened or paravirt interfaces.
> 
> This patch adds an extension to the IOASID allocator APIs such that
> platform drivers can register a custom allocator, possibly at boot
> time, to take over the allocation. IDR is still used for tracking
> and searching purposes internal to the IOASID code. Private data of
> an IOASID can also be set after the allocation.
> 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/base/ioasid.c  | 124 +
>  include/linux/ioasid.h |  28 ++-
>  2 files changed, 143 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/base/ioasid.c b/drivers/base/ioasid.c
> index cf122b2..294e856 100644
> --- a/drivers/base/ioasid.c
> +++ b/drivers/base/ioasid.c
> @@ -17,6 +17,74 @@ struct ioasid_data {
>  };
>  
>  static DEFINE_IDR(ioasid_idr);
> +static DEFINE_MUTEX(ioasid_allocator_lock);
> +static const struct ioasid_allocator *ioasid_allocator;
> +
> +
> +/**
> + * ioasid_set_allocator - register a custom allocator
> + *
> + * A custom allocator takes precedence over the default IDR-based allocator.
> + * Private data associated with the ASID is managed by the ASID common code,
> + * similar to IDR data.
> + */
> +int ioasid_set_allocator(struct ioasid_allocator *allocator)
> +{
> + int ret = 0;
> +
> + if (!allocator)
> + return -EINVAL;
> +
> + mutex_lock(&ioasid_allocator_lock);
> + if (ioasid_allocator) {
> + ret = -EBUSY;
> + goto exit_unlock;
> + }
> + ioasid_allocator = allocator;
> +
> +exit_unlock:
> + mutex_unlock(&ioasid_allocator_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_set_allocator);

Should this fault if there are existing idr's allocated?

> +
> +/**
> + * ioasid_clear_allocator - Free the custom IOASID allocator
> + *
> + * REVISIT: So far there is only one custom allocator allowed.
> + */
> +void ioasid_clear_allocator(void)
> +{
> + mutex_lock(&ioasid_allocator_lock);
> + ioasid_allocator = NULL;
> + mutex_unlock(&ioasid_allocator_lock);
> +}
> +EXPORT_SYMBOL_GPL(ioasid_clear_allocator);
> +
> +/**
> + * ioasid_set_data - Set private data for an allocated ioasid
> + *
> + * For IOASID that is already allocated, private data can be set
> + * via this API. Future lookup can be done via ioasid_find.
> + */
> +int ioasid_set_data(ioasid_t ioasid, void *data)
> +{
> + struct ioasid_data *ioasid_data;
> + int ret = 0;
> +
> + idr_lock(&ioasid_idr);
> + ioasid_data = idr_find(&ioasid_idr, ioasid);
> + if (ioasid_data)
> + ioasid_data->private = data;
> + else
> + ret = -ENOENT;
> + idr_unlock(&ioasid_idr);
> + /* getter may use the private data */
> + synchronize_rcu();
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_set_data);
>  
>  /**
>   * ioasid_alloc - Allocate an IOASID
> @@ -32,7 +100,7 @@ static DEFINE_IDR(ioasid_idr);
>  ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
> void *private)
>  {
> - int id = -1;
> + int id = INVALID_IOASID;
>   struct ioasid_data *data;
>  
>   data = kzalloc(sizeof(*data), GFP_KERNEL);
> @@ -42,13 +110,30 @@ ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
>   data->set = set;
>   data->private = private;
>  
> + /* Use custom allocator if available, otherwise default to IDR */
> + if (ioasid_allocator) {

If this races with ioasid_clear_allocator() ioasid_allocator might be
set above, but NULL below to generate a segfault.  If this races with
ioasid_set_allocator() an id can be allocated that the custom allocator
doesn't track.

> + mutex_lock(&ioasid_allocator_lock);
> + id = ioasid_allocator->alloc(min, max, ioasid_allocator->pdata);
> + mutex_unlock(&ioasid_allocator_lock);
> + if (id == INVALID_IOASID) {
> + pr_err("Failed ASID allocation by custom allocator\n");
> + goto exit_free;
> + }
> + /*
> +  * Use IDR to manage private data; also sanity check the custom
> +  * allocator for duplicates.
> +  */
> + min = id;
> + max = id + 1;
> + }
>   idr_preload(GFP_KERNEL);
>   idr_lock(&ioasid_idr);
>   data->id = id = idr_alloc(&ioasid_idr, data, min, max, GFP_ATOMIC);
>   idr_unlock(&ioasid_idr);
>   idr_preload_end();
>  
> - if (id < 0) {
> +exit_free:
> + if (id < 0 || id == INVALID_IOASID) {
>   kfree(data);

What if an ioasid is already allocated before the ioasid_allocator is
registered?  The .alloc callback above could return an id that
idr_alloc cannot provide, in which case this cleanup does not call the
custom allocator's free callback.

>   return 

Re: [PATCH v2 1/7] ACPI/IORT: Check ATS capability in root complex nodes

2019-04-15 Thread Robin Murphy

On 09/04/2019 17:52, Jean-Philippe Brucker wrote:

> Root complex node in IORT has a bit telling whether it supports ATS or
> not. Store this bit in the IOMMU fwspec when setting up a device, so it
> can be accessed later by an IOMMU driver.

Hmm, it doesn't feel quite right to store a copy of the same RC/SMMU
integration property for every endpoint...


It seems like it might be more logical to track this at the SMMU end, 
i.e. only allow ARM_SMMU_FEAT_ATS to be set if all RCs targeting that 
SMMU also support ATS. For the moment that seems sufficiently realistic, 
and unless some wacky topology ever actually shows up in silicon to 
complain, I'm inclined not to care too much about it being potentially 
overly restrictive.
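
A sketch of that idea (ARM_SMMU_FEAT_ATS is the driver's existing
feature bit; where exactly the per-RC walk would happen is an
assumption):

    /* While walking each root complex that masters through this SMMU: */
    if (!iort_pci_rc_supports_ats(node))
            smmu->features &= ~ARM_SMMU_FEAT_ATS;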


Furthermore, I'm now wondering whether it would make sense to push this
into the PCI layer as well (or instead), i.e. hook into pci_init_ats()
or pci_enable_ats(), and if the device is untrusted or the topology
doesn't support ATS, prevent the capability from ever being enabled at
all, rather than trying to mitigate it later at the SMMU end. What do
you reckon?
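
A rough sketch of that PCI-layer gate (hypothetical placement inside
pci_enable_ats(); dev->untrusted is the existing pci_dev field):

    /* Before setting PCI_ATS_CTRL_ENABLE in pci_enable_ats(): */
    if (dev->untrusted)
            return -EPERM;  /* never enable ATS for untrusted devices */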



> Use the negative version (NO_ATS) at the moment because it's not clear
> if/how the bit needs to be integrated in other firmware descriptions. The
> SMMU has a feature bit telling if it supports ATS, which might be
> sufficient in most systems for deciding whether or not we should enable
> the ATS capability in endpoints.

I can fairly confidently guarantee that it won't. For instance, MMU-600
reports IDR0.ATS==1 because MMU-600 implements the SMMUv3 architectural
ATS support. Actually making use of that support, though, still requires
an RC capable of generating the appropriate DTI-ATS messages, and that a
DTI interface is wired up correctly between the two.



> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/acpi/arm64/iort.c | 11 +++
>  include/linux/iommu.h     |  4 
>  2 files changed, 15 insertions(+)
> 
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index e48894e002ba..7f2c1c9c6b38 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -1028,6 +1028,14 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
>    dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
>  }
>  
> +static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node)
> +{
> +   struct acpi_iort_root_complex *pci_rc;
> +
> +   pci_rc = (struct acpi_iort_root_complex *)node->node_data;
> +   return pci_rc->ats_attribute & ACPI_IORT_ATS_SUPPORTED;
> +}
> +
>  /**
>   * iort_iommu_configure - Set-up IOMMU configuration for a device.
>   *
> @@ -1063,6 +1071,9 @@ const struct iommu_ops *iort_iommu_configure(struct device *dev)
>    info.node = node;
>    err = pci_for_each_dma_alias(to_pci_dev(dev),
>                                 iort_pci_iommu_init, &info);
> +
> +   if (!err && !iort_pci_rc_supports_ats(node))
> +           dev->iommu_fwspec->flags |= IOMMU_FWSPEC_PCI_NO_ATS;
>    } else {
>            int i = 0;
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 3dbeb457fb16..ed6738c358ca 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -509,10 +509,14 @@ struct iommu_fwspec {
>    const struct iommu_ops  *ops;
>    struct fwnode_handle    *iommu_fwnode;
>    void                    *iommu_priv;
> +  u32                     flags;
>    unsigned int            num_ids;
>    u32                     ids[1];
>  };
>  
> +/* Firmware disabled ATS in the root complex */

More likely firmware is just describing how the hardware was built ;)

Robin.


> +#define IOMMU_FWSPEC_PCI_NO_ATS         (1 << 0)
> +
>  int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
>                        const struct iommu_ops *ops);
>  void iommu_fwspec_free(struct device *dev);




Re: [PATCH 5/9] iommu/amd: Implement .flush_np_cache

2019-04-15 Thread Tom Murphy via iommu
This is a cut and paste from the current amd_iommu driver. I really
have no idea if it's a good idea or not. It looks like
joerg.roe...@amd.com might be the person to ask.

@Joerg Roedel should we keep this?

On Mon, Apr 15, 2019 at 7:33 AM Christoph Hellwig  wrote:
>
> > +static void amd_iommu_flush_np_cache(struct iommu_domain *domain,
> > + unsigned long iova, size_t size)
> > +{
> > + struct protection_domain *dom = to_pdomain(domain);
> > +
> > + if (unlikely(amd_iommu_np_cache)) {
>
> Is this case really so unlikely that it needs a static branch prediction
> hint?


[PATCH] iommu/amd: Add allocated domain to global list earlier

2019-04-15 Thread Tom Murphy via iommu
dma_ops_domain_free() expects the domain to be in a global list.
Arguably, it could be called before protection_domain_init().

Signed-off-by: Dmitry Safonov 
Signed-off-by: Tom Murphy 
---
 drivers/iommu/amd_iommu.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 02b351834a3b..bfb021d656aa 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1801,8 +1801,12 @@ static struct protection_domain *dma_ops_domain_alloc(void)
	if (!domain)
		return NULL;
 
-   if (protection_domain_init(domain))
-   goto free_domain;
+   if (protection_domain_init(domain)) {
+   kfree(domain);
+   return NULL;
+   }
+
+   add_domain_to_list(domain);
 
domain->mode = PAGE_MODE_3_LEVEL;
domain->pt_root = (void *)get_zeroed_page(GFP_KERNEL);
@@ -1824,8 +1828,6 @@ static struct protection_domain *dma_ops_domain_alloc(void)
/* Initialize reserved ranges */
iommu_dma_copy_reserved_iova(_iova_ranges, >domain);
 
-   add_domain_to_list(domain);
-
return domain;
 
 free_cookie:
-- 
2.17.1



Re: [PATCH 9/9] iommu/amd: Add allocated domain to global list earlier

2019-04-15 Thread Tom Murphy via iommu
> This seems like a fix to the existing code and should probably go out first.

I'll send this patch out on its own now.

On Mon, Apr 15, 2019 at 7:23 AM Christoph Hellwig  wrote:
>
> On Thu, Apr 11, 2019 at 07:47:38PM +0100, Tom Murphy via iommu wrote:
> > dma_ops_domain_free() expects domain to be in a global list.
> > Arguably, could be called before protection_domain_init().
> >
> > Signed-off-by: Dmitry Safonov 
> > Signed-off-by: Tom Murphy 
>
> This seems like a fix to the existing code and should probably go
> out first.


Re: [PATCH v2 6/7] iommu/arm-smmu-v3: Add support for PCI ATS

2019-04-15 Thread Jean-Philippe Brucker
On 15/04/2019 14:21, Will Deacon wrote:
> On Tue, Apr 09, 2019 at 05:52:44PM +0100, Jean-Philippe Brucker wrote:
>> PCIe devices can implement their own TLB, named Address Translation Cache
>> (ATC). Enable Address Translation Service (ATS) for devices that support
>> it and send them invalidation requests whenever we invalidate the IOTLBs.
>>
>> ATC invalidation is allowed to take up to 90 seconds, according to the
>> PCIe spec, so it is possible to get a SMMU command queue timeout during
>> normal operations. However we expect implementations to complete
>> invalidation in reasonable time.
>>
>> We only enable ATS for "trusted" devices, and currently rely on the
>> pci_dev->untrusted bit. For ATS we have to trust that:
> 
> AFAICT, devicetree has no way to describe a device as untrusted, so
> everything will be trusted by default on those systems. Is that right?

Yes, although I'm adding a devicetree property for PCI in v5.2:
https://lore.kernel.org/linux-pci/20190411211823.gu256...@google.com/T/#m9f2c036748bab6e262233280b22c1e6fab4e1607

>> (a) The device doesn't issue "translated" memory requests for addresses
>> that weren't returned by the SMMU in a Translation Completion. In
>> particular, if we give control of a device or device partition to a VM
>> or userspace, software cannot program the device to access arbitrary
>> "translated" addresses.
> 
> Any plans to look at split-stage ATS later on? I think that would allow
> us to pass untrusted devices through to a VM behind S1 ATS.

I haven't tested it so far; we can look at that after adding support for
nested translation.

>> (b) The device follows permissions granted by the SMMU in a Translation
>> Completion. If the device requested read+write permission and only
>> got read, then it doesn't write.
> 
> Guessing we just ignore execute permissions, or do we need read implies
> exec?

Without PASID, a function cannot ask for execute permission, only RO and
RW. In this case execution by the endpoint is never permitted (because
the Exe bit in an ATS completion is always zero).

With PASID, the endpoint must explicitly ask for execute permission (and
interestingly, can't obtain it if the page is mapped exec-only, because
in ATS exec implies read.)

[...]
>> +static bool disable_ats_check;
>> +module_param_named(disable_ats_check, disable_ats_check, bool, S_IRUGO);
>> +MODULE_PARM_DESC(disable_ats_check,
>> +"By default, the SMMU checks whether each incoming transaction marked 
>> as translated is allowed by the stream configuration. This option disables 
>> the check.");
> 
> Yikes, do we really want this option, or is it just a leftover from
> debugging?

I wasn't sure what to do with it, I'll drop it in next version

[...]
>> @@ -876,6 +911,14 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
>>  dev_err(smmu->dev, "retrying command fetch\n");
>>  case CMDQ_ERR_CERROR_NONE_IDX:
>>  return;
>> +case CMDQ_ERR_CERROR_ATC_INV_IDX:
>> +/*
>> + * ATC Invalidation Completion timeout. CONS is still pointing
>> + * at the CMD_SYNC. Attempt to complete other pending commands
>> + * by repeating the CMD_SYNC, though we might well end up back
>> + * here since the ATC invalidation may still be pending.
>> + */
>> +return;
> 
> This is pretty bad, as it means we're unable to unmap a page safely from a
> misbehaving device. Ideally, we'd block further transactions from the
> problematic endpoint, but I suppose we can't easily know which one it was,
> or inject a fault back into the unmap() path.
> 
> Having said that, won't we get a CMD_SYNC timeout long before the ATC timeout?

Yes, CMD_SYNC timeout is 1s, ATC timeout is 90s...

> Perhaps we could propagate that out of arm_smmu_cmdq_issue_sync() and fail the
> unmap?
> 
> Not sure, what do you think?

The callers of iommu_unmap() will free the page regardless of the return
value, even though the device could still be accessing it. But I'll look
at returning 0 if the CMD_SYNC times out, it's a good start for
consolidating this. With dma-iommu.c it will trigger a WARN().

It gets a worse with PRI, when the invalidation comes from an MMU
notifier and we can't even return an error. Ideally we'd hold a
reference to these pages until invalidation completes.

[...]
>> +/*
>> + * Find the smallest power of two that covers the range. Most
>> + * significant differing bit between start and end address indicates the
>> + * required span, ie. fls(start ^ end). For example:
>> + *
>> + * We want to invalidate pages [8; 11]. This is already the ideal range:
>> + *  x = 0b1000 ^ 0b1011 = 0b11
>> + *  span = 1 << fls(x) = 4
>> + *
>> + * To invalidate pages [7; 10], we need to invalidate [0; 15]:
>> + *  x = 0b0111 ^ 0b1010 = 0b1101
>> + *  span = 1 << fls(x) = 16
>> +   
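
The computation described in that comment, as a standalone sketch (the
actual hunk is truncated above; packing the result into the ATC
invalidation command is omitted):

    /* Smallest power-of-two span covering pages [start, end]. */
    static void atc_inv_span(unsigned long start, unsigned long end)
    {
            int log2_span = fls64(start ^ end);     /* highest differing bit */
            unsigned long span = 1UL << log2_span;
            unsigned long base = start & ~(span - 1);

            /* e.g. [8, 11] -> base 8, span 4; [7, 10] -> base 0, span 16 */
            /* ... encode base and log2_span into the CMD_ATC_INV command ... */
    }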

Re: [PATCH v2 1/7] ACPI/IORT: Check ATS capability in root complex nodes

2019-04-15 Thread Jean-Philippe Brucker
On 15/04/2019 14:21, Will Deacon wrote:
> On Tue, Apr 09, 2019 at 05:52:39PM +0100, Jean-Philippe Brucker wrote:
>> Root complex node in IORT has a bit telling whether it supports ATS or
>> not. Store this bit in the IOMMU fwspec when setting up a device, so it
>> can be accessed later by an IOMMU driver.
>>
>> Use the negative version (NO_ATS) at the moment because it's not clear
>> if/how the bit needs to be integrated in other firmware descriptions. The
>> SMMU has a feature bit telling if it supports ATS, which might be
>> sufficient in most systems for deciding whether or not we should enable
>> the ATS capability in endpoints.
> 
> Hmm, the SMMUv3 architecture manual is pretty explicit about this:
> 
>   | It [SMMU_IDR0.ATS] does not guarantee that client devices and intermediate
>   | components also support ATS and this must be determined separately.
> 
> so we may need to extend the PCI bindings to describe this. I think the
> negative logic is likely to get in the way if that's the case.

Right. For devicetree I can resurrect the patch I proposed a while ago
[1]. Rob wasn't keen on adding an "ats-supported" property to PCI host
bridge nodes. Instead the host controller drivers should deduce whether
ATS is supported from the compatible string. But I'll resend that patch
adding the property only to pci-host-ecam-generic.

[1] https://lists.linuxfoundation.org/pipermail/iommu/2017-May/022043.html

>> Signed-off-by: Jean-Philippe Brucker 
>> ---
>>  drivers/acpi/arm64/iort.c | 11 +++
>>  include/linux/iommu.h |  4 
>>  2 files changed, 15 insertions(+)
>>
>> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
>> index e48894e002ba..7f2c1c9c6b38 100644
>> --- a/drivers/acpi/arm64/iort.c
>> +++ b/drivers/acpi/arm64/iort.c
>> @@ -1028,6 +1028,14 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
>>  dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
>>  }
>>  
>> +static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node)
>> +{
>> +struct acpi_iort_root_complex *pci_rc;
>> +
>> +pci_rc = (struct acpi_iort_root_complex *)node->node_data;
>> +return pci_rc->ats_attribute & ACPI_IORT_ATS_SUPPORTED;
>> +}
> 
> Do we need to worry about the "noats" command-line option here? It feels
> like we should be checking with the PCI subsystem before telling the SMMU
> we're good to go.

I'm checking the noats option in arm_smmu_enable_ats() at the moment,
using the pci_ats_disabled() helper.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/1] iommu/arm-smmu: Log CBFRSYNRA register on context fault

2019-04-15 Thread Vivek Gautam
Bits[15:0] of the CBFRSYNRA register contain the StreamID of the
incoming transaction that generated the fault. Dump the CBFRSYNRA
register to get this information.
This is especially useful in a distributed SMMU architecture
where multiple masters are connected to the SMMU.
The SID information helps to quickly identify the faulting
master device.

Signed-off-by: Vivek Gautam 
---

V1 of the patch available @
https://lore.kernel.org/patchwork/patch/1061615/

Changes from v1:
 - Dump the raw register value of the CBFRSYNRA register in the
   context fault log rather than extracting the SID information
   and dumping that.
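
For readers decoding the raw value, the StreamID lives in the low 16
bits (a convenience snippet, not part of the patch):

    u32 sid = cbfrsynra & 0xffff;   /* CBFRSYNRA bits[15:0] = StreamID */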

 drivers/iommu/arm-smmu-regs.h | 2 ++
 drivers/iommu/arm-smmu.c  | 8 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index a1226e4ab5f8..e9132a926761 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -147,6 +147,8 @@ enum arm_smmu_s2cr_privcfg {
 #define CBAR_IRPTNDX_SHIFT 24
 #define CBAR_IRPTNDX_MASK  0xff
 
+#define ARM_SMMU_GR1_CBFRSYNRA(n)  (0x400 + ((n) << 2))
+
 #define ARM_SMMU_GR1_CBA2R(n)  (0x800 + ((n) << 2))
 #define CBA2R_RW64_32BIT   (0 << 0)
 #define CBA2R_RW64_64BIT   (1 << 0)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 045d93884164..a4773e8c6b0e 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -575,7 +575,9 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
struct arm_smmu_device *smmu = smmu_domain->smmu;
+   void __iomem *gr1_base = ARM_SMMU_GR1(smmu);
void __iomem *cb_base;
+   u32 cbfrsynra;
 
cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
fsr = readl_relaxed(cb_base + ARM_SMMU_CB_FSR);
@@ -585,10 +587,12 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 
fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
+   cbfrsynra = readl_relaxed(gr1_base +
+ ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));
 
	dev_err_ratelimited(smmu->dev,
-   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
-   fsr, iova, fsynr, cfg->cbndx);
+   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cbfrsynra=0x%x, cb=%d\n",
+   fsr, iova, fsynr, cbfrsynra, cfg->cbndx);
 
writel(fsr, cb_base + ARM_SMMU_CB_FSR);
return IRQ_HANDLED;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH 00/18] Shared virtual address IOMMU and VT-d support

2019-04-15 Thread Jacob Pan


Hi Joerg and all,

Just a gentle reminder in case you have any comments on this series. The
goal is to support vSVA, but without page requests. It shares code with
Eric Auger's SMMUv3 nested-mode series.

Jean, I took your ioasid allocator and modified it for PASID allocation,
as we discussed, to cover the need for a custom allocator for both VT-d
and pvIOMMU.

Thanks,

Jacob

On Mon,  8 Apr 2019 16:59:15 -0700
Jacob Pan  wrote:

> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on
> Intel platforms allow address space sharing between device DMA and
> applications. SVA can reduce programming complexity and enhance
> security. This series is intended to enable SVA virtualization, i.e.
> shared guest application address space and physical device DMA
> address. Only IOMMU portion of the changes are included in this
> series. Additional support is needed in VFIO and QEMU (will be
> submitted separately) to complete this functionality.
> 
> To make incremental changes and reduce the size of each patchset,
> this series does not include support for page request services.
> 
> In VT-d implementation, PASID table is per device and maintained in
> the host. Guest PASID table is shadowed in VMM where virtual IOMMU is
> emulated.
> 
>     .-------------.  .---------------------------.
>     |   vIOMMU    |  | Guest process CR3, FL only|
>     |             |  '---------------------------'
>     .----------------/
>     | PASID Entry |--- PASID cache flush -+
>     '-------------'                       |
>     |             |                       V
>     |             |                CR3 in GPA
>     '-------------'
> Guest
> ------| Shadow |----------------------|--------
>       v        v                      v
> Host
>     .-------------.  .----------------------.
>     |   pIOMMU    |  | Bind FL for GVA-GPA  |
>     |             |  '----------------------'
>     .----------------/  |
>     | PASID Entry |     V (Nested xlate)
>     '----------------\.------------------------------.
>     |             |   |SL for GPA-HPA, default domain|
>     |             |   '------------------------------'
>     '-------------'
> Where:
>  - FL = First level/stage one page tables
>  - SL = Second level/stage two page tables
> 
> 
> This work is based on collaboration with other developers on the IOMMU
> mailing list. Notably,
> 
> [1] [PATCH v6 00/22] SMMUv3 Nested Stage Setup by Eric Auger
> https://lkml.org/lkml/2019/3/17/124
> 
> [2] [RFC PATCH 2/6] drivers core: Add I/O ASID allocator by
> Jean-Philippe Brucker
> https://www.spinics.net/lists/iommu/msg30639.html
> 
> [3] [RFC PATCH 0/5] iommu: APIs for paravirtual PASID allocation by
> Lu Baolu https://lkml.org/lkml/2018/11/12/1921
> 
> There are roughly three parts:
> 1. Generic PASID allocator [1] with extension to support custom allocator
> 2. IOMMU cache invalidation passdown from guest to host
> 3. Guest PASID bind for nested translation
> 
> All generic IOMMU APIs are reused from [1], which has a v7 just
> published with no real impact to the patches used here. It is worth
> noting that unlike sMMU nested stage setup, where PASID table is
> owned by the guest, VT-d PASID table is owned by the host, individual
> PASIDs are bound instead of the PASID table.
> 
> 
> Jacob Pan (15):
>   ioasid: Add custom IOASID allocator
>   ioasid: Convert ioasid_idr to XArray
>   driver core: add per device iommu param
>   iommu: introduce device fault data
>   iommu: introduce device fault report API
>   iommu: Introduce attach/detach_pasid_table API
>   iommu/vt-d: Add custom allocator for IOASID
>   iommu/vt-d: Replace Intel specific PASID allocator with IOASID
>   iommu: Add guest PASID bind function
>   iommu/vt-d: Move domain helper to header
>   iommu/vt-d: Add nested translation support
>   iommu/vt-d: Add bind guest PASID support
>   iommu: add max num of cache and granu types
>   iommu/vt-d: Support flushing more translation cache types
>   iommu/vt-d: Add svm/sva invalidate function
> 
> Jean-Philippe Brucker (1):
>   drivers core: Add I/O ASID allocator
> 
> Liu, Yi L (1):
>   iommu: Introduce cache_invalidate API
> 
> Lu Baolu (1):
>   iommu/vt-d: Enlightened PASID allocation
> 
>  drivers/base/Kconfig        |   7 ++
>  drivers/base/Makefile       |   1 +
>  drivers/base/ioasid.c       | 211 +
>  drivers/iommu/Kconfig       |   1 +
>  drivers/iommu/dmar.c        |  48 +
>  drivers/iommu/intel-iommu.c | 219 --
>  drivers/iommu/intel-pasid.c | 191 +-
>  drivers/iommu/intel-pasid.h |  24 -
>  drivers/iommu/intel-svm.c   | 217 +++---
>  drivers/iommu/iommu.c       | 207 +++-
>  include/linux/device.h      |   3 +
>  include/linux/intel-iommu.h |  40 +--
>  include/linux/intel-svm.h   |   7 ++
>  include/linux/ioasid.h      |  66 
>  include/linux/iommu.h       | 127 

Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-04-15 Thread James Sewart via iommu
Hey Lu,

> On 10 Apr 2019, at 06:22, Lu Baolu  wrote:
> 
> Hi James,
> 
> On 4/6/19 2:02 AM, James Sewart wrote:
>> Hey Lu,
>> My bad, did some debugging on my end. The issue was swapping out
>> find_domain for iommu_get_domain_for_dev. It seems in some situations the
>> domain is not attached to the group but the device is expected to have the
>> domain still stored in its archdata.
>> I’ve attached the final patch with find_domain unremoved which seems to
>> work in my testing.
> 
> Just looked into your v3 patch set and some thoughts from my end posted
> here just for your information.
> 
> Let me post the problems we want to address.
> 
> 1. When allocating a new group for a device, how should we determine the
> type of the default domain?
> 
> 2. If we need to put a device into an existing group which uses a
> different type of domain from what the device desires to use, we might
> break the functionality of the device.
> 
> My new thought is letting the iommu generic code determine the
> default domain type (hence my proposed vendor-specific default domain
> type patches could be dropped).
> 
> If the default domain type is dynamic mapping, and later in
> iommu_no_mapping() we determine that we must use an identity domain,
> we then call iommu_request_dm_for_dev(dev).
> 
> If the default domain type is identity mapping, and later in
> iommu_no_mapping() we determine that we must use a dynamic domain,
> we then call iommu_request_dma_domain_for_dev(dev).
> 
> We already have iommu_request_dm_for_dev() in iommu.c. We only need to
> implement iommu_request_dma_domain_for_dev().
> 
> With this done, your patch titled "Create an IOMMU group for devices
> that require an identity map" could also be dropped.
> 
> Any thoughts?

There are a couple of issues I can think of. iommu_request_dm_for_dev() 
changes the domain for all devices within the device's group; if we have 
devices with different domain requirements in the same group, then only 
the last initialised device will get the domain it wants. If we want to 
add iommu_request_dma_domain_for_dev() then we would still need another 
IOMMU group for identity domain devices.
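
For reference, a minimal sketch of the missing helper, assuming the 
identity-map request path is first factored into a common 
request_default_domain_for_dev() helper (that helper name is an 
assumption, not existing code):

/* Sketch only: both wrappers share one "replace the group's default
 * domain" helper; only the requested domain type differs. */
static int request_default_domain_for_dev(struct device *dev, unsigned int type);

int iommu_request_dm_for_dev(struct device *dev)
{
	return request_default_domain_for_dev(dev, IOMMU_DOMAIN_IDENTITY);
}

int iommu_request_dma_domain_for_dev(struct device *dev)
{
	return request_default_domain_for_dev(dev, IOMMU_DOMAIN_DMA);
}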

With both v3 and the proposed method here, iommu_should_identity_map() 
determines whether the device requires an identity map. In v3 this is 
called once up front by the generic code to determine which domain type 
to use for the IOMMU group. In the proposed method I think this would 
happen after initial domain allocation and would trigger the domain to 
be replaced with a different domain.

I prefer the solution in v3. What do you think?

> 
>> Cheers,
>> James.
> 
> Best regards,
> Lu Baolu

Cheers,
James.


Re: [PATCH v2 6/7] iommu/arm-smmu-v3: Add support for PCI ATS

2019-04-15 Thread Will Deacon
On Tue, Apr 09, 2019 at 05:52:44PM +0100, Jean-Philippe Brucker wrote:
> PCIe devices can implement their own TLB, named Address Translation Cache
> (ATC). Enable Address Translation Service (ATS) for devices that support
> it and send them invalidation requests whenever we invalidate the IOTLBs.
> 
> ATC invalidation is allowed to take up to 90 seconds, according to the
> PCIe spec, so it is possible to get an SMMU command queue timeout during
> normal operation. However, we expect implementations to complete
> invalidation in reasonable time.
> 
> We only enable ATS for "trusted" devices, and currently rely on the
> pci_dev->untrusted bit. For ATS we have to trust that:

AFAICT, devicetree has no way to describe a device as untrusted, so
everything will be trusted by default on those systems. Is that right?
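
Concretely, the gating on pci_dev->untrusted amounts to something like 
the following sketch before ATS is enabled (using PAGE_SHIFT as the 
smallest translation unit is an assumption here):

static int maybe_enable_ats(struct pci_dev *pdev)
{
	/* External-facing devices are flagged untrusted; never give them
	 * an ATC, since a malicious one could forge translated requests. */
	if (pdev->untrusted)
		return -EPERM;

	return pci_enable_ats(pdev, PAGE_SHIFT);
}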

> (a) The device doesn't issue "translated" memory requests for addresses
> that weren't returned by the SMMU in a Translation Completion. In
> particular, if we give control of a device or device partition to a VM
> or userspace, software cannot program the device to access arbitrary
> "translated" addresses.

Any plans to look at split-stage ATS later on? I think that would allow
us to pass untrusted devices through to a VM behind S1 ATS.

> (b) The device follows permissions granted by the SMMU in a Translation
> Completion. If the device requested read+write permission and only
> got read, then it doesn't write.

Guessing we just ignore execute permissions, or do we need read implies
exec?

> (c) The device doesn't send Translated transactions for an address that
> was invalidated by an ATC invalidation.
> 
> Note that the PCIe specification explicitly requires all of these, so we
> can assume that implementations will cleanly shield ATCs from software.
> 
> All ATS translated requests still go through the SMMU, to walk the stream
> table and check that the device is actually allowed to send translated
> requests.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/iommu/arm-smmu-v3.c | 196 +++-
>  1 file changed, 191 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 3e7198ee9530..7819cd60d08b 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include 
> @@ -86,6 +87,7 @@
>  #define IDR5_VAX_52_BIT  1
>  
>  #define ARM_SMMU_CR0 0x20
> +#define CR0_ATSCHK   (1 << 4)
>  #define CR0_CMDQEN   (1 << 3)
>  #define CR0_EVTQEN   (1 << 2)
>  #define CR0_PRIQEN   (1 << 1)
> @@ -294,6 +296,7 @@
>  #define CMDQ_ERR_CERROR_NONE_IDX 0
>  #define CMDQ_ERR_CERROR_ILL_IDX  1
>  #define CMDQ_ERR_CERROR_ABT_IDX  2
> +#define CMDQ_ERR_CERROR_ATC_INV_IDX  3
>  
>  #define CMDQ_0_OP	GENMASK_ULL(7, 0)
>  #define CMDQ_0_SSV   (1UL << 11)
> @@ -312,6 +315,12 @@
>  #define CMDQ_TLBI_1_VA_MASK  GENMASK_ULL(63, 12)
>  #define CMDQ_TLBI_1_IPA_MASK GENMASK_ULL(51, 12)
>  
> +#define CMDQ_ATC_0_SSID  GENMASK_ULL(31, 12)
> +#define CMDQ_ATC_0_SID   GENMASK_ULL(63, 32)
> +#define CMDQ_ATC_0_GLOBAL(1UL << 9)
> +#define CMDQ_ATC_1_SIZE  GENMASK_ULL(5, 0)
> +#define CMDQ_ATC_1_ADDR_MASK GENMASK_ULL(63, 12)
> +
>  #define CMDQ_PRI_0_SSID  GENMASK_ULL(31, 12)
>  #define CMDQ_PRI_0_SID   GENMASK_ULL(63, 32)
>  #define CMDQ_PRI_1_GRPID GENMASK_ULL(8, 0)
> @@ -365,6 +374,11 @@ module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
>  MODULE_PARM_DESC(disable_bypass,
>  	"Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
>  
> +static bool disable_ats_check;
> +module_param_named(disable_ats_check, disable_ats_check, bool, S_IRUGO);
> +MODULE_PARM_DESC(disable_ats_check,
> +	"By default, the SMMU checks whether each incoming transaction marked as translated is allowed by the stream configuration. This option disables the check.");

Yikes, do we really want this option, or is it just a leftover from
debugging?

>  enum pri_resp {
>   PRI_RESP_DENY = 0,
>   PRI_RESP_FAIL = 1,
> @@ -433,6 +447,16 @@ struct arm_smmu_cmdq_ent {
>   u64 addr;
>   } tlbi;
>  
> + #define CMDQ_OP_ATC_INV 0x40
> + #define ATC_INV_SIZE_ALL	52
> + struct {
> + u32 sid;
> + u32 ssid;
> + u64 addr;
> +  
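
The rest of the hunk is truncated in the archive. For orientation, 
packing such a command out of the CMDQ_ATC_* fields defined above might 
look like this sketch; the size and global parameters are assumed, since 
the struct is cut off:

/* Sketch: encode an ATC invalidation into a two-dword command. */
static void atc_inv_to_cmd(u64 *cmd, u32 sid, u32 ssid, u64 addr,
			   u8 size, bool global)
{
	cmd[0] = FIELD_PREP(CMDQ_0_OP, CMDQ_OP_ATC_INV);
	cmd[0] |= FIELD_PREP(CMDQ_ATC_0_SSID, ssid);
	cmd[0] |= FIELD_PREP(CMDQ_ATC_0_SID, sid);
	if (global)
		cmd[0] |= CMDQ_ATC_0_GLOBAL;
	cmd[1] = FIELD_PREP(CMDQ_ATC_1_SIZE, size);
	cmd[1] |= addr & CMDQ_ATC_1_ADDR_MASK;
}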

Re: [PATCH v2 1/7] ACPI/IORT: Check ATS capability in root complex nodes

2019-04-15 Thread Will Deacon
On Tue, Apr 09, 2019 at 05:52:39PM +0100, Jean-Philippe Brucker wrote:
> Root complex node in IORT has a bit telling whether it supports ATS or
> not. Store this bit in the IOMMU fwspec when setting up a device, so it
> can be accessed later by an IOMMU driver.
> 
> Use the negative version (NO_ATS) at the moment because it's not clear
> if/how the bit needs to be integrated in other firmware descriptions. The
> SMMU has a feature bit telling if it supports ATS, which might be
> sufficient in most systems for deciding whether or not we should enable
> the ATS capability in endpoints.

Hmm, the SMMUv3 architecture manual is pretty explicit about this:

  | It [SMMU_IDR0.ATS] does not guarantee that client devices and intermediate
  | components also support ATS and this must be determined separately.

so we may need to extend the PCI bindings to describe this. I think the
negative logic is likely to get in the way if that's the case.

> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/acpi/arm64/iort.c | 11 +++
>  include/linux/iommu.h |  4 
>  2 files changed, 15 insertions(+)
> 
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index e48894e002ba..7f2c1c9c6b38 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -1028,6 +1028,14 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
>   dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
>  }
>  
> +static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node)
> +{
> + struct acpi_iort_root_complex *pci_rc;
> +
> + pci_rc = (struct acpi_iort_root_complex *)node->node_data;
> + return pci_rc->ats_attribute & ACPI_IORT_ATS_SUPPORTED;
> +}

Do we need to worry about the "noats" command-line option here? It feels
like we should be checking with the PCI subsystem before telling the SMMU
we're good to go.
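
One way to honour "noats" would be to ask the PCI core in the same 
helper. A sketch, noting that pci_ats_disabled() already exists for 
this, though where the check belongs is the open question:

static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node)
{
	struct acpi_iort_root_complex *pci_rc;

	/* Respect the "noats" kernel command-line option. */
	if (pci_ats_disabled())
		return false;

	pci_rc = (struct acpi_iort_root_complex *)node->node_data;
	return pci_rc->ats_attribute & ACPI_IORT_ATS_SUPPORTED;
}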

I'll need Lorenzo's ack on this.

Will


Re: [PATCH v2 0/7] Add PCI ATS support to Arm SMMUv3

2019-04-15 Thread Will Deacon
Hi Jean-Philippe,

On Tue, Apr 09, 2019 at 05:52:38PM +0100, Jean-Philippe Brucker wrote:
> This series enables PCI ATS in SMMUv3. Changes since v1 [1]:
> 
> * Simplify the SMMU structures (patches 2-4 are new).
> 
> * Don't enable ATS for devices that are attached to a bypass domain,
>   because in that case a translation request would cause F_BAD_ATS_TREQ.
>   Enable ATS in attach_dev() rather than add_device().
> 
> * Enable ATS for stage-1 and stage-2 STEs alike. There is no benefit to
>   disabling ATS for stage-2 domains.
> 
> [1] https://www.spinics.net/lists/arm-kernel/msg714628.html
> 
> Jean-Philippe Brucker (7):
>   ACPI/IORT: Check ATS capability in root complex nodes
>   iommu/arm-smmu-v3: Rename arm_smmu_master_data to arm_smmu_master
>   iommu/arm-smmu-v3: Store StreamIDs in master
>   iommu/arm-smmu-v3: Add a master->domain pointer
>   iommu/arm-smmu-v3: Link domains and devices
>   iommu/arm-smmu-v3: Add support for PCI ATS
>   iommu/arm-smmu-v3: Disable tagged pointers

Patches 2, 3, 4, 5 and 7 look fine to me. I've left some comments on the
other two.

Cheers,

Will


Re: [PATCH 1/1] swiotlb: save io_tlb_used to local variable before leaving critical section

2019-04-15 Thread Dongli Zhang



On 04/15/2019 07:26 PM, Konrad Rzeszutek Wilk wrote:
> 
> 
> On Mon, Apr 15, 2019, 2:50 AM Dongli Zhang  wrote:
> 
> As the patch to be fixed is still in Konrad's own tree, I will send the v2
> for the patch to be fixed, instead of an incremental fix.
> 
> 
> I squashed it in.

Thank you very much!

I saw it is at:

https://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb.git/commit/?h=devel/for-linus-5.1&id=53b29c336830db48ad3dc737f88b8c065b1f0851

Dongli Zhang


Re: [PATCH 1/1] iommu/arm-smmu: Add SID information to context fault log

2019-04-15 Thread Vivek Gautam


On 4/15/2019 3:11 PM, Robin Murphy wrote:

On 15/04/2019 09:07, Vivek Gautam wrote:

Extract the SID and add the information to the context fault log.
This is especially useful in a distributed smmu architecture
where multiple masters are connected to the smmu. SID information
helps to quickly identify the faulting master device.


Hmm, given how it's UNKNOWN for translation faults, which are arguably 
the most likely context fault, I reckon it probably makes more sense 
to just dump the raw register value for the user to interpret, as we 
do for fsr/fsynr.



Thanks, Robin. Sure, I will update it to dump the raw register value.

Regards
Vivek



Robin.


Signed-off-by: Vivek Gautam 
---
 drivers/iommu/arm-smmu-regs.h |  4 ++++
 drivers/iommu/arm-smmu.c      | 14 ++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index a1226e4ab5f8..e5be0344b610 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -147,6 +147,10 @@ enum arm_smmu_s2cr_privcfg {
 #define CBAR_IRPTNDX_SHIFT	24
 #define CBAR_IRPTNDX_MASK	0xff
 
+#define ARM_SMMU_GR1_CBFRSYNRA(n)	(0x400 + ((n) << 2))
+#define CBFRSYNRA_V2_SID_MASK	0xffff
+#define CBFRSYNRA_V1_SID_MASK	0x7fff
+
 #define ARM_SMMU_GR1_CBA2R(n)	(0x800 + ((n) << 2))
 #define CBA2R_RW64_32BIT	(0 << 0)
 #define CBA2R_RW64_64BIT	(1 << 0)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 045d93884164..aa3426dc68d0 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -575,7 +575,10 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	void __iomem *gr1_base = ARM_SMMU_GR1(smmu);
 	void __iomem *cb_base;
+	u32 cbfrsynra;
+	u16 sid;
 
 	cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
 	fsr = readl_relaxed(cb_base + ARM_SMMU_CB_FSR);
@@ -586,9 +589,16 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
 	iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
 
+	cbfrsynra = readl_relaxed(gr1_base +
+				  ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));
+	if (smmu->version > ARM_SMMU_V1)
+		sid = cbfrsynra & CBFRSYNRA_V2_SID_MASK;
+	else
+		sid = cbfrsynra & CBFRSYNRA_V1_SID_MASK;
+
 	dev_err_ratelimited(smmu->dev,
-		"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
-		fsr, iova, fsynr, cfg->cbndx);
+		"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d sid = %u\n",
+		fsr, iova, fsynr, cfg->cbndx, sid);
 
 	writel(fsr, cb_base + ARM_SMMU_CB_FSR);
 	return IRQ_HANDLED;



Re: [PATCH 1/1] swiotlb: save io_tlb_used to local variable before leaving critical section

2019-04-15 Thread Konrad Rzeszutek Wilk
On Mon, Apr 15, 2019, 2:50 AM Dongli Zhang  wrote:

> As the patch to be fixed is still in Konrad's own tree, I will send the v2
> for the patch to be fixed, instead of an incremental fix.
>

I squashed it in.


> Dongli Zhang
>
> On 4/12/19 10:13 PM, Joe Jin wrote:
> > I'm good to have this patch, which helps identify whether the cause of
> > failure is fragmentation or the buffer really being used up.
> >
> > On 4/12/19 4:38 AM, Dongli Zhang wrote:
> >> When swiotlb is full, the kernel would print io_tlb_used. However, the
> >> result might be inaccurate at that time because we have left the
> >> critical section protected by the spinlock.
> >>
> >> Therefore, we back up io_tlb_used into a local variable before leaving
> >> the critical section.
> >>
> >> Fixes: 83ca25948940 ("swiotlb: dump used and total slots when swiotlb buffer is full")
> >> Suggested-by: Håkon Bugge 
> >> Signed-off-by: Dongli Zhang 
> >
> > Reviewed-by: Joe Jin 
> >
> > Thanks,
> > Joe
> >
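
For anyone following along, the squashed-in change takes the shape the 
commit message describes: snapshot the counter before dropping the lock, 
then print the snapshot. A sketch of the pattern, not the exact hunk:

	unsigned long tmp_io_tlb_used;

not_found:
	/* Snapshot io_tlb_used inside the spinlock-protected critical
	 * section so the value printed below is coherent. */
	tmp_io_tlb_used = io_tlb_used;
	spin_unlock_irqrestore(&io_tlb_lock, flags);

	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
		dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
			 size, io_tlb_nslabs, tmp_io_tlb_used);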

Re: [PATCH 1/1] iommu/arm-smmu: Add SID information to context fault log

2019-04-15 Thread Robin Murphy

On 15/04/2019 09:07, Vivek Gautam wrote:

Extract the SID and add the information to the context fault log.
This is especially useful in a distributed smmu architecture
where multiple masters are connected to the smmu. SID information
helps to quickly identify the faulting master device.


Hmm, given how it's UNKNOWN for translation faults, which are arguably 
the most likely context fault, I reckon it probably makes more sense to 
just dump the raw register value for the user to interpret, as we do for 
fsr/fsynr.


Robin.
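
The raw-value alternative Robin describes would be a single register 
read with no version-specific decoding, along these lines (a sketch):

	cbfrsynra = readl_relaxed(gr1_base + ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));

	dev_err_ratelimited(smmu->dev,
		"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cbfrsynra=0x%x, cb=%d\n",
		fsr, iova, fsynr, cbfrsynra, cfg->cbndx);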


Signed-off-by: Vivek Gautam 
---
  drivers/iommu/arm-smmu-regs.h |  4 ++++
  drivers/iommu/arm-smmu.c      | 14 ++++++++++--
  2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index a1226e4ab5f8..e5be0344b610 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -147,6 +147,10 @@ enum arm_smmu_s2cr_privcfg {
 #define CBAR_IRPTNDX_SHIFT	24
 #define CBAR_IRPTNDX_MASK	0xff
 
+#define ARM_SMMU_GR1_CBFRSYNRA(n)	(0x400 + ((n) << 2))
+#define CBFRSYNRA_V2_SID_MASK	0xffff
+#define CBFRSYNRA_V1_SID_MASK	0x7fff
+
 #define ARM_SMMU_GR1_CBA2R(n)	(0x800 + ((n) << 2))
 #define CBA2R_RW64_32BIT	(0 << 0)
 #define CBA2R_RW64_64BIT	(1 << 0)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 045d93884164..aa3426dc68d0 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -575,7 +575,10 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	void __iomem *gr1_base = ARM_SMMU_GR1(smmu);
 	void __iomem *cb_base;
+	u32 cbfrsynra;
+	u16 sid;
 
 	cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
 	fsr = readl_relaxed(cb_base + ARM_SMMU_CB_FSR);
@@ -586,9 +589,16 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
 	iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
 
+	cbfrsynra = readl_relaxed(gr1_base +
+				  ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));
+	if (smmu->version > ARM_SMMU_V1)
+		sid = cbfrsynra & CBFRSYNRA_V2_SID_MASK;
+	else
+		sid = cbfrsynra & CBFRSYNRA_V1_SID_MASK;
+
 	dev_err_ratelimited(smmu->dev,
-		"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
-		fsr, iova, fsynr, cfg->cbndx);
+		"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d sid = %u\n",
+		fsr, iova, fsynr, cfg->cbndx, sid);
 
 	writel(fsr, cb_base + ARM_SMMU_CB_FSR);
 	return IRQ_HANDLED;




Re: [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver

2019-04-15 Thread Steven Price
On 15/04/2019 10:18, Daniel Vetter wrote:
> On Fri, Apr 05, 2019 at 05:42:33PM +0100, Steven Price wrote:
>> On 05/04/2019 17:16, Alyssa Rosenzweig wrote:
>>> acronym once ever and have it as a "??"), I'm not sure how to respond to
>>> that... We don't know how to allocate memory for the GPU-internal data
>>> structures (the tiler heap, for instance, but also a few others I've
>>> just named "misc_0" and "scratchpad" -- guessing one of those is for
>>> "TLS"). With kbase, I took the worst-case strategy of allocating
>>> gigantic chunks on startup with tiny commit counts and GROW_ON_GPF set.
>>> With the new driver, well, our memory consumption is scary since
>>> implementing GROW_ON_GPF in an upstream-friendly way is a bit more work
>>> and isn't expected to hit the 5.2 window.
>>
>> Yes GROW_ON_GPF is pretty much required for the tiler heap - it's not
>> (reasonably) possible to determine how big it should be. The Arm user
>> space driver does the same approach (tiny commit count, but allow it to
>> grow).
> 
> Jumping in here with a drive-through comment ...
> 
> Growing gem bo and dma-buf is going to be endless amounts of fun, since we
> hard-coded that their size is invariant.
> 
> I think the only reasonable way to implement this is if you allocate a
> really huge bo, map it, but only put the pages in on faulting. Or when
> really evil userspace tries to export it. Actually changing the underlying
> buffer size is not going to work I think.

Yes the idea is that you allocate a large amount of virtual address
space, but only put a few physical pages in. If the GPU needs more you
fault them in as necessary. The "buffer size" (i.e. virtual address
region) never changes size.
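
As a sketch of that scheme (names hypothetical, not the driver's actual 
API): reserve the whole range up front and, on a GPU page fault, back 
just the faulting granule:

/* Hypothetical grow-on-fault handler; ops is the heap's io-pgtable. */
static int heap_grow_on_fault(struct io_pgtable_ops *ops, unsigned long iova,
			      size_t granule)
{
	struct page *page = alloc_page(GFP_KERNEL | __GFP_ZERO);

	if (!page)
		return -ENOMEM;

	/* Map only the faulting granule; the reserved IOVA range is fixed,
	 * so the BO's nominal size never changes. */
	return ops->map(ops, iova & ~(granule - 1), page_to_phys(page),
			granule, IOMMU_READ | IOMMU_WRITE);
}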

> Note: I didn't read kbase, so might be totally wrong in how GROW_ON_GPF
> works.

For kbase we simply don't support exporting this type of memory, and are
fairly restrictive about mapping it into user space (user space
shouldn't normally need to read it).

Since Panfrost is using GEM BO it will have to deal with malicious user
space. But it would be sufficient to simply fully back the region in
that case.

Recent versions of kbase also support what is termed JIT (Just In Time
memory allocation). Basically this involves the kernel driver
allocating/freeing memory regions just before the job is loaded onto the
GPU. These regions might also be GROW_ON_GPF. The intention is that when
there isn't memory pressure these regions can be kept between jobs, but
under memory pressure they can be discarded and recreated if they turn
out to be needed again.

Given the differences between the Panfrost and the proprietary user
space I'm not sure exactly what form this will need to be for Panfrost,
but Vulkan makes memory management "more interesting"! Allocating
upfront for the worst case can become prohibitively expensive.

Steve


Re: [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver

2019-04-15 Thread Daniel Vetter
On Fri, Apr 05, 2019 at 05:42:33PM +0100, Steven Price wrote:
> On 05/04/2019 17:16, Alyssa Rosenzweig wrote:
> > acronym once ever and have it as a "??"), I'm not sure how to respond to
> > that... We don't know how to allocate memory for the GPU-internal data
> > structures (the tiler heap, for instance, but also a few others I've
> > just named "misc_0" and "scratchpad" -- guessing one of those is for
> > "TLS"). With kbase, I took the worst-case strategy of allocating
> > gigantic chunks on startup with tiny commit counts and GROW_ON_GPF set.
> > With the new driver, well, our memory consumption is scary since
> > implementing GROW_ON_GPF in an upstream-friendly way is a bit more work
> > and isn't expected to hit the 5.2 window.
> 
> Yes GROW_ON_GPF is pretty much required for the tiler heap - it's not
> (reasonably) possible to determine how big it should be. The Arm user
> space driver does the same approach (tiny commit count, but allow it to
> grow).

Jumping in here with a drive-through comment ...

Growing gem bo and dma-buf is going to be endless amounts of fun, since we
hard-coded that their size is invariant.

I think the only reasonable way to implement this is if you allocate a
really huge bo, map it, but only put the pages in on faulting. Or when
really evil userspace tries to export it. Actually changing the underlying
buffer size is not going to work I think.

Note: I didn't read kbase, so might be totally wrong in how GROW_ON_GPF
works.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH 1/1] iommu/arm-smmu: Add SID information to context fault log

2019-04-15 Thread Vivek Gautam
Extract the SID and add the information to the context fault log.
This is especially useful in a distributed smmu architecture
where multiple masters are connected to the smmu. SID information
helps to quickly identify the faulting master device.

Signed-off-by: Vivek Gautam 
---
 drivers/iommu/arm-smmu-regs.h |  4 ++++
 drivers/iommu/arm-smmu.c      | 14 ++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index a1226e4ab5f8..e5be0344b610 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -147,6 +147,10 @@ enum arm_smmu_s2cr_privcfg {
 #define CBAR_IRPTNDX_SHIFT 24
 #define CBAR_IRPTNDX_MASK  0xff
 
+#define ARM_SMMU_GR1_CBFRSYNRA(n)  (0x400 + ((n) << 2))
+#define CBFRSYNRA_V2_SID_MASK	0xffff
+#define CBFRSYNRA_V1_SID_MASK	0x7fff
+
 #define ARM_SMMU_GR1_CBA2R(n)	(0x800 + ((n) << 2))
 #define CBA2R_RW64_32BIT	(0 << 0)
 #define CBA2R_RW64_64BIT	(1 << 0)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 045d93884164..aa3426dc68d0 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -575,7 +575,10 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	void __iomem *gr1_base = ARM_SMMU_GR1(smmu);
 	void __iomem *cb_base;
+	u32 cbfrsynra;
+	u16 sid;
 
 	cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
 	fsr = readl_relaxed(cb_base + ARM_SMMU_CB_FSR);
@@ -586,9 +589,16 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
 	iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
 
+	cbfrsynra = readl_relaxed(gr1_base +
+				  ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));
+	if (smmu->version > ARM_SMMU_V1)
+		sid = cbfrsynra & CBFRSYNRA_V2_SID_MASK;
+	else
+		sid = cbfrsynra & CBFRSYNRA_V1_SID_MASK;
+
 	dev_err_ratelimited(smmu->dev,
-		"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
-		fsr, iova, fsynr, cfg->cbndx);
+		"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d sid = %u\n",
+		fsr, iova, fsynr, cfg->cbndx, sid);
 
 	writel(fsr, cb_base + ARM_SMMU_CB_FSR);
 	return IRQ_HANDLED;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH 1/1] swiotlb: save io_tlb_used to local variable before leaving critical section

2019-04-15 Thread Dongli Zhang
As the patch to be fixed is still in Konrad's own tree, I will send the v2 for
the patch to be fixed, instead of an incremental fix.

Dongli Zhang

On 4/12/19 10:13 PM, Joe Jin wrote:
> I'm good to have this patch, which helps identify whether the cause of
> failure is fragmentation or the buffer really being used up.
> 
> On 4/12/19 4:38 AM, Dongli Zhang wrote:
>> When swiotlb is full, the kernel would print io_tlb_used. However, the
>> result might be inaccurate at that time because we have left the critical
>> section protected by the spinlock.
>>
>> Therefore, we back up io_tlb_used into a local variable before leaving
>> the critical section.
>>
>> Fixes: 83ca25948940 ("swiotlb: dump used and total slots when swiotlb buffer is full")
>> Suggested-by: Håkon Bugge 
>> Signed-off-by: Dongli Zhang 
> 
> Reviewed-by: Joe Jin  
> 
> Thanks,
> Joe
> 

Re: [PATCH 4/9] iommu/dma-iommu: Add iommu_dma_map_page_coherent

2019-04-15 Thread Christoph Hellwig
> @@ -693,7 +694,19 @@ dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
>   unsigned long offset, size_t size, int prot)
>  {
>   return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot,
> - iommu_get_dma_domain(dev));
> + iommu_get_dma_domain(dev), dma_get_mask(dev));
> +}
> +
> +dma_addr_t iommu_dma_map_page_coherent(struct device *dev, struct page *page,
> + unsigned long offset, size_t size, int prot)
> +{

Note that my tree pointed to above removes the iommu_dma_map_page
wrapper.  I think for the coherent mapping we should also just
call __iommu_dma_map directly and not introduce another wrapper.

> + dma_addr_t dma_mask = dev->coherent_dma_mask;
> +
> + if (!dma_mask)
> + dma_mask = dma_get_mask(dev);
> +
> + return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot,
> + iommu_get_dma_domain(dev), dma_mask);
>  }

In the DMA API there is no fallback when the coherent_dma_mask is not
set.  While various bits of legacy code do that, we should not copy it
to new code.
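
Taken together, the coherent path would then call __iommu_dma_map() 
directly, with the coherent mask and no fallback, roughly:

	return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot,
			       iommu_get_dma_domain(dev),
			       dev->coherent_dma_mask);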


Re: [PATCH 5/9] iommu/amd: Implement .flush_np_cache

2019-04-15 Thread Christoph Hellwig
> +static void amd_iommu_flush_np_cache(struct iommu_domain *domain,
> + unsigned long iova, size_t size)
> +{
> + struct protection_domain *dom = to_pdomain(domain);
> +
> + if (unlikely(amd_iommu_np_cache)) {

Is this case really so unlikely that it needs a static branch prediction
hint?
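
For comparison, the unhinted form would read as below; the call in the 
body is assumed, since the hunk above is truncated:

static void amd_iommu_flush_np_cache(struct iommu_domain *domain,
				     unsigned long iova, size_t size)
{
	struct protection_domain *dom = to_pdomain(domain);

	/* amd_iommu_np_cache is fixed at probe time, so a plain test
	 * is clear enough without a branch-prediction hint. */
	if (amd_iommu_np_cache)
		domain_flush_pages(dom, iova, size);
}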


Re: [PATCH 1/9] iommu/dma-iommu: Add iommu_map_atomic

2019-04-15 Thread Christoph Hellwig
On Thu, Apr 11, 2019 at 07:47:30PM +0100, Tom Murphy via iommu wrote:
> The iommu ops .map function (or the iommu_map function which calls it)
> was always supposed to be sleepable (according to Joerg's comment in
> this thread: https://lore.kernel.org/patchwork/patch/977520/ ) and so
> should probably have had a "might_sleep()" since it was written. However
> currently the dma-iommu api can call iommu_map in an atomic context,
> which it shouldn't do. This doesn't cause any problems because any iommu
> driver which uses the dma-iommu api uses GFP_ATOMIC in its iommu_ops
> .map function. But doing this wastes the memory allocator's atomic pools.
> 
> Add a new function iommu_map_atomic, use it in the dma-iommu api and add
> “might_sleep()” to the iommu_map function.
> 
> After this change all drivers which use the dma-iommu api need to
> implement the new iommu_ops .map_atomic function. For the moment, just
> reuse the driver's iommu_ops .map function for .map_atomic.

Instead of duplicating the callchain shouldn't we just pass down a flag
to indicate if the context can sleep or not?  Especially as all the
existing arm drivers already implement the atomic behavior, which is
a majority of struct iommu_ops based iommu drivers.
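
The flag-passing alternative might look like this sketch (hypothetical 
signature, not the patch's API); the driver's .map callback would grow a 
matching gfp_t argument:

int iommu_map_gfp(struct iommu_domain *domain, unsigned long iova,
		  phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
{
	/* Atomic callers pass GFP_ATOMIC; everyone else may sleep. */
	if (gfpflags_allow_blocking(gfp))
		might_sleep();

	return domain->ops->map(domain, iova, paddr, size, prot, gfp);
}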

Re: [PATCH 3/9] iommu/dma-iommu: Add iommu_dma_copy_reserved_iova, iommu_dma_apply_resv_region to the dma-iommu api

2019-04-15 Thread Christoph Hellwig
On Thu, Apr 11, 2019 at 07:47:32PM +0100, Tom Murphy via iommu wrote:
> +
> +	WARN_ON_ONCE(iommu_dma_reserve_iova(domain, region->start, end) == NULL);

Overly long line..


Re: [PATCH 9/9] iommu/amd: Add allocated domain to global list earlier

2019-04-15 Thread Christoph Hellwig
On Thu, Apr 11, 2019 at 07:47:38PM +0100, Tom Murphy via iommu wrote:
> dma_ops_domain_free() expects domain to be in a global list.
> Arguably, it could be called before protection_domain_init().
> 
> Signed-off-by: Dmitry Safonov 
> Signed-off-by: Tom Murphy 

This seems like a fix to the existing code and should probably go
out first.


Re: [PATCH 8/9] iommu/amd: Clean up unused functions

2019-04-15 Thread Christoph Hellwig
On Thu, Apr 11, 2019 at 07:47:37PM +0100, Tom Murphy via iommu wrote:
> Now that we are using the dma-iommu api we have a lot of unused code.
> This patch removes all that unused code.

This should be merged into the previous patch.


Re: [PATCH 0/9] iommu/amd: Convert the AMD iommu driver to the dma-iommu api

2019-04-15 Thread Christoph Hellwig
Hi Tom,

can you rebase this on top of this series:

https://lists.linuxfoundation.org/pipermail/iommu/2019-March/034357.html

or the version addressing Robin's comments here:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-iommu-ops.3

so that we can avoid introducing more dma_map_ops boilerplate code?