Re: [PATCH v2] ACPI: VIOT: Fix ACS setup

2022-06-30 Thread Jean-Philippe Brucker
On Thu, Jun 30, 2022 at 11:40:59AM +0200, Eric Auger wrote:
> Currently acpi_viot_init() gets called after the pci
> device has been scanned and pci_enable_acs() has been called.
> So pci_request_acs() fails to be taken into account leading
> to wrong single iommu group topologies when dealing with
> multi-function root ports for instance.
> 
> We cannot simply move the acpi_viot_init() earlier, similarly
> as the IORT init because the VIOT parsing relies on the pci
> scan. However we can detect VIOT is present earlier and in
> such a case, request ACS. Introduce a new acpi_viot_early_init()
> routine that allows to call pci_request_acs() before the scan.
> 
> While at it, guard the call to pci_request_acs() with #ifdef
> CONFIG_PCI.
> 
> Fixes: 3cf485540e7b ("ACPI: Add driver for the VIOT table")
> Signed-off-by: Eric Auger 
> Reported-by: Jin Liu 

Reviewed-by: Jean-Philippe Brucker 
Tested-by: Jean-Philippe Brucker 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] ACPI: VIOT: Fix ACS setup

2022-06-29 Thread Jean-Philippe Brucker
Hi Eric,

On Mon, Jun 27, 2022 at 02:55:34PM +0200, Eric Auger wrote:
> Currently acpi_viot_init() gets called after the pci
> device has been scanned and pci_enable_acs() has been called.
> So pci_request_acs() fails to be taken into account leading
> to wrong single iommu group topologies when dealing with
> multi-function root ports for instance.
> 
> We cannot simply move the acpi_viot_init() earlier, similarly
> as the IORT init because the VIOT parsing relies on the pci
> scan. However we can detect VIOT is present earlier and in
> such a case, request ACS. Introduce a new acpi_viot_early_init()
> routine that allows to call pci_request_acs() before the scan.
> 
> Fixes: 3cf485540e7b ("ACPI: Add driver for the VIOT table")
> Signed-off-by: Eric Auger 
> Reported-by: Jin Liu 

Thanks for the fix, the patch makes sense and fixes the issue.

I wondered whether we should keep the logic where we only request ACS if
an IOMMU is found to manage a PCI range, but I can't see any harm in
requesting it regardless (plus there is a precedent with AMD IOMMU).
I could imagine some VMM wanting to only put an IOMMU in front of its MMIO
devices and leave PCI to roam free, but that seems like a stretch.

There is another issue with the existing code, though: we can't call
pci_request_acs() when CONFIG_PCI is disabled because no stub is defined.
Could you wrap the call in an #ifdef?

> ---
>  drivers/acpi/bus.c|  1 +
>  drivers/acpi/viot.c   | 23 +--
>  include/linux/acpi_viot.h |  2 ++
>  3 files changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
> index 86fa61a21826..906ad8153fd9 100644
> --- a/drivers/acpi/bus.c
> +++ b/drivers/acpi/bus.c
> @@ -1400,6 +1400,7 @@ static int __init acpi_init(void)
>  
>   pci_mmcfg_late_init();
>   acpi_iort_init();
> + acpi_viot_early_init();
>   acpi_hest_init();
>   acpi_ghes_init();
>   acpi_scan_init();
> diff --git a/drivers/acpi/viot.c b/drivers/acpi/viot.c
> index d2256326c73a..3c1be123e4d6 100644
> --- a/drivers/acpi/viot.c
> +++ b/drivers/acpi/viot.c
> @@ -248,6 +248,23 @@ static int __init viot_parse_node(const struct 
> acpi_viot_header *hdr)
>   return ret;
>  }
>  
> +/**
> + * acpi_viot_early_init - Test the presence of VIOT and enable ACS
> + *
> + * If the VIOT does exist, ACS must be enabled. This cannot be
> + * done in acpi_viot_init() which is called after the bus scan
> + */
> +void __init acpi_viot_early_init(void)
> +{
> + acpi_status status;
> + struct acpi_table_header *hdr;
> +
> + status = acpi_get_table(ACPI_SIG_VIOT, 0, );
> + if (!ACPI_FAILURE(status))
> + pci_request_acs();
> + acpi_put_table(hdr);

I'd rather not call acpi_put_table() in case of failure. I know it is
handled but it looks fragile and I couldn't find any other user of
acpi_get_table() doing this.

> +}
> +
>  /**
>   * acpi_viot_init - Parse the VIOT table
>   *
> @@ -319,12 +336,6 @@ static int viot_pci_dev_iommu_init(struct pci_dev *pdev, 
> u16 dev_id, void *data)
>   epid = ((domain_nr - ep->segment_start) << 16) +
>   dev_id - ep->bdf_start + ep->endpoint_id;
>  
> - /*
> -  * If we found a PCI range managed by the viommu, we're
> -  * the one that has to request ACS.
> -  */
> - pci_request_acs();
> -
>   return viot_dev_iommu_init(>dev, ep->viommu,
>  epid);
>   }
> diff --git a/include/linux/acpi_viot.h b/include/linux/acpi_viot.h
> index 1eb8ee5b0e5f..e58d60f8ff2e 100644
> --- a/include/linux/acpi_viot.h
> +++ b/include/linux/acpi_viot.h
> @@ -6,10 +6,12 @@
>  #include 
>  
>  #ifdef CONFIG_ACPI_VIOT
> +void __init acpi_viot_early_init(void);
>  void __init acpi_viot_init(void);
>  int viot_iommu_configure(struct device *dev);
>  #else
>  static inline void acpi_viot_init(void) {}
> +static inline void acpi_viot_early_init(void) {}

nit: different declaration order

Thanks,
Jean


>  static inline int viot_iommu_configure(struct device *dev)
>  {
>   return -ENODEV;
> -- 
> 2.35.3
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v9 10/11] iommu: Per-domain I/O page fault handling

2022-06-28 Thread Jean-Philippe Brucker
On Tue, Jun 28, 2022 at 07:53:39PM +0800, Baolu Lu wrote:
> > > > Once the iopf_handle_single() is removed, the name of
> > > > iopf_handle_group() looks a little weired
> > > > 
> > > > and confused, does this group mean the iommu group (domain) ?
> > > > while I take some minutes to
> > > 
> > > No. This is not the iommu group. It's page request group defined by the
> > > PCI SIG spec. Multiple page requests could be put in a group with a
> > > same group id. All page requests in a group could be responded to device
> > > in one shot.
> > 
> > Thanks your explaination, understand the concept of PCIe PRG.  I meant
> > 
> > do we still have the necessity to mention the "group" here in the name
> > 
> > iopf_handle_group(),  which one is better ? iopf_handle_prg() or
> > 
> > iopf_handler(),  perhaps none of them ? :)
> 
> Oh! Sorry for the misunderstanding.
> 
> I have no strong feeling to change this naming. :-) All the names
> express what the helper does. Jean is the author of this framework. If
> he has the same idea as you, I don't mind renaming it in this patch.

I'm not attached to the name, and I see how it could be confusing. Given
that io-pgfault is not only for PCIe, 'prg' is not the best here either.
iopf_handle_faults(), or just iopf_handler(), seem more suitable.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v9 10/11] iommu: Per-domain I/O page fault handling

2022-06-28 Thread Jean-Philippe Brucker
On Tue, Jun 28, 2022 at 08:39:36AM +, Tian, Kevin wrote:
> > From: Lu Baolu 
> > Sent: Tuesday, June 21, 2022 10:44 PM
> > 
> > Tweak the I/O page fault handling framework to route the page faults to
> > the domain and call the page fault handler retrieved from the domain.
> > This makes the I/O page fault handling framework possible to serve more
> > usage scenarios as long as they have an IOMMU domain and install a page
> > fault handler in it. Some unused functions are also removed to avoid
> > dead code.
> > 
> > The iommu_get_domain_for_dev_pasid() which retrieves attached domain
> > for a {device, PASID} pair is used. It will be used by the page fault
> > handling framework which knows {device, PASID} reported from the iommu
> > driver. We have a guarantee that the SVA domain doesn't go away during
> > IOPF handling, because unbind() waits for pending faults with
> > iopf_queue_flush_dev() before freeing the domain. Hence, there's no need
> > to synchronize life cycle of the iommu domains between the unbind() and
> > the interrupt threads.
> 
> I found iopf_queue_flush_dev() is only called in intel-iommu driver. Did
> I overlook anything?

The SMMU driver will need it as well when we upstream PRI support.
Currently it only supports stall, and that requires the device driver to
flush all DMA including stalled transactions *before* calling unbind(), so
ne need for iopf_queue_flush_dev() in this case.

Thanks,
Jean

> 
> >  static void iopf_handle_group(struct work_struct *work)
> >  {
> > struct iopf_group *group;
> > +   struct iommu_domain *domain;
> > struct iopf_fault *iopf, *next;
> > enum iommu_page_response_code status =
> > IOMMU_PAGE_RESP_SUCCESS;
> > 
> > group = container_of(work, struct iopf_group, work);
> > +   domain = iommu_get_domain_for_dev_pasid(group->dev,
> > +   group->last_fault.fault.prm.pasid);
> > +   if (!domain || !domain->iopf_handler)
> > +   status = IOMMU_PAGE_RESP_INVALID;
> 
> Miss a comment on why no refcnt is required on domain as explained
> in the commit msg.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

2022-06-20 Thread Jean-Philippe Brucker
On Fri, Jun 17, 2022 at 10:23:13PM +0800, Zhangfei Gao wrote:
> @@ -312,12 +345,20 @@ static ssize_t available_instances_show(struct device
> *dev,
>                  char *buf)
>  {
>  struct uacce_device *uacce = to_uacce_device(dev);
> +    ssize_t ret;
> 
> -    if (!uacce->ops->get_available_instances)
> -        return -ENODEV;
> +    mutex_lock(_mutex);
> +    if (!uacce->ops || !uacce->ops->get_available_instances) {

Doesn't the sysfs group go away with uacce_remove()?  We shouldn't need
this check

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

2022-06-20 Thread Jean-Philippe Brucker
On Fri, Jun 17, 2022 at 02:05:21PM +0800, Zhangfei Gao wrote:
> > The refcount only ensures that the uacce_device object is not freed as
> > long as there are open fds. But uacce_remove() can run while there are
> > open fds, or fds in the process of being opened. And atfer uacce_remove()
> > runs, the uacce_device object still exists but is mostly unusable. For
> > example once the module is freed, uacce->ops is not valid anymore. But
> > currently uacce_fops_open() may dereference the ops in this case:
> > 
> > uacce_fops_open()
> >  if (!uacce->parent->driver)
> >  /* Still valid, keep going */  
> >  ...rmmod
> >  uacce_remove()
> >  ... free_module()
> >  uacce->ops->get_queue() /* BUG */
> 
> uacce_remove should wait for uacce->queues_lock, until fops_open release the
> lock.
> If open happen just after the uacce_remove: unlock, uacce_bind_queue in open
> should fail.

Ah yes sorry, I lost sight of what this patch was adding. But we could
have the same issue with the patch, just in a different order, no?

uacce_fops_open()
 uacce = xa_load()
 ...rmmod
 uacce_remove()
  mutex_lock()
  mutex_unlock()
 mutex_lock()
 if (!uacce->parent->driver)
 /* Still valid, keep going */   parent->driver = NULL
 free_module()
 uacce->ops->get_queue() /* BUG */


> > Accessing uacce->ops after free_module() is a use-after-free. We need all
> you men parent release the resources.
> > the fops to synchronize with uacce_remove() to ensure they don't use any
> > resource of the parent after it's been freed.
> After fops_open, currently we are counting on parent driver stop all dma
> first, then call uacce_remove, which is assumption.
> Like drivers/crypto/hisilicon/zip/zip_main.c: hisi_qm_wait_task_finish,
> which will wait uacce_release.
> If comments this , there may other issue,
> Unable to handle kernel paging request at virtual address 8b700204
> pc : hisi_qm_cache_wb.part.0+0x2c/0xa0
> 
> > I see uacce_fops_poll() may have the same problem, and should be inside
> > uacce_mutex.
> Do we need consider this, uacce_remove can happen anytime but not waiting
> dma stop?

No, the parent driver must stop DMA before calling uacce_remove(), there
is no way around that

> 
> Not sure uacce_mutex can do this.
> Currently the sequence is
> mutex_lock(>queues_lock);
> mutex_lock(_mutex);

We should document why some ops use one lock or the other. I believe it's
to avoid circular lock dependency between ioctl and mmap, do you know if
there was another reason?

> 
> Or we set all the callbacks of uacce_ops to NULL?

That would be cleaner, though we already use the queue state to indicate
whether it is usable or not. I think we just need to extend that to all
ops.

How about the following patch? Unfortunately it still has the lock
disparity between ioctl and mmap because of the circular lockking with
mmap_sem, I don't know how to make that cleaner.

--- 8< ---

>From c7c2b051ec19285bbb973f8a2a5e58bb5326e00e Mon Sep 17 00:00:00 2001
From: Jean-Philippe Brucker 
Date: Mon, 20 Jun 2022 10:10:41 +0100
Subject: [PATCH] uacce: Tidy up locking

The uacce driver must deal with a possible removal of the parent driver
or device at any time. At the moment there are several issues that may
result in use-after-free. Tidy up the locking to handle module removal.

When unbinding the parent device from its driver, the driver calls
uacce_remove(). This function removes the cdev, ensuring that no new
uacce file descriptor will be opened, but existing fds are still open
and uacce fops may be called after uacce_remove() completes, when the
parent module is gone. Each open fd holds a reference to the uacce
device, ensuring that the structure cannot be freed until all fds are
closed. But the uacce fops may still access uacce->ops which belonged to
the parent module, now freed. To solve this:

* use the global uacce_mutex to serialize uacce_fops_open() against
  uacce_remove(), and q->mutex to serialize all other fops against
  uacce_remove().

* q->mutex replaces the less scalable uacce->queues_lock. The queues
  list is now protected by uacce_mutex, and the queue state by q->mutex.
  Note that scalability is only desirable for poll(), since the other
  fops are only used during setup.

* uacce_queue_is_valid(), checked under q->mutex, denotes whe

Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

2022-06-16 Thread Jean-Philippe Brucker
On Thu, Jun 16, 2022 at 12:10:18PM +0800, Zhangfei Gao wrote:
> > > diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
> > > index 281c54003edc..b6219c6bfb48 100644
> > > --- a/drivers/misc/uacce/uacce.c
> > > +++ b/drivers/misc/uacce/uacce.c
> > > @@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, 
> > > struct file *filep)
> > >   if (!q)
> > >   return -ENOMEM;
> > > + mutex_lock(>queues_lock);
> > > +
> > > + if (!uacce->parent->driver) {
> > I don't think this is useful, because the core clears parent->driver after
> > having run uacce_remove():
> > 
> >rmmod hisi_zip   open()
> > ...  uacce_fops_open()
> > __device_release_driver() ...
> >  pci_device_remove()
> >   hisi_zip_remove()
> >hisi_qm_uninit()
> > uacce_remove()
> >  ...  ...
> >   mutex_lock(uacce->queues_lock)
> >  ...  if (!uacce->parent->driver)
> >  device_unbind_cleanup()  /* driver still valid, proceed */
> >   dev->driver = NULL
> 
> The check  if (!uacce->parent->driver) is required, otherwise NULL pointer
> may happen.

I agree we need something, what I mean is that this check is not
sufficient.

> iommu_sva_bind_device
> const struct iommu_ops *ops = dev_iommu_ops(dev);  ->
> dev->iommu->iommu_dev->ops
> 
> rmmod has no issue, but remove parent pci device has the issue.

Ah right, relying on the return value of bind() wouldn't be enough even if
we mandated SVA.

[...]
> > 
> > I think we need the global uacce_mutex to serialize uacce_remove() and
> > uacce_fops_open(). uacce_remove() would do everything, including
> > xa_erase(), while holding that mutex. And uacce_fops_open() would try to
> > obtain the uacce object from the xarray while holding the mutex, which
> > fails if the uacce object is being removed.
> 
> Since fops_open get char device refcount, uacce_release will not happen
> until open returns.

The refcount only ensures that the uacce_device object is not freed as
long as there are open fds. But uacce_remove() can run while there are
open fds, or fds in the process of being opened. And atfer uacce_remove()
runs, the uacce_device object still exists but is mostly unusable. For
example once the module is freed, uacce->ops is not valid anymore. But
currently uacce_fops_open() may dereference the ops in this case:

uacce_fops_open()
 if (!uacce->parent->driver)
 /* Still valid, keep going */  
 ...rmmod
 uacce_remove()
 ... free_module()
 uacce->ops->get_queue() /* BUG */

Accessing uacce->ops after free_module() is a use-after-free. We need all
the fops to synchronize with uacce_remove() to ensure they don't use any
resource of the parent after it's been freed. 

I see uacce_fops_poll() may have the same problem, and should be inside
uacce_mutex.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

2022-06-15 Thread Jean-Philippe Brucker
Hi,

On Fri, Jun 10, 2022 at 08:34:23PM +0800, Zhangfei Gao wrote:
> The uacce parent's module can be removed when uacce is working,
> which may cause troubles.
> 
> If rmmod/uacce_remove happens just after fops_open: bind_queue,
> the uacce_remove can not remove the bound queue since it is not
> added to the queue list yet, which blocks the uacce_disable_sva.
> 
> Change queues_lock area to make sure the bound queue is added to
> the list thereby can be searched in uacce_remove.
> 
> And uacce->parent->driver is checked immediately in case rmmod is
> just happening.
> 
> Also the parent driver must always stop DMA before calling
> uacce_remove.
> 
> Signed-off-by: Yang Shen 
> Signed-off-by: Zhangfei Gao 
> ---
>  drivers/misc/uacce/uacce.c | 19 +--
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
> index 281c54003edc..b6219c6bfb48 100644
> --- a/drivers/misc/uacce/uacce.c
> +++ b/drivers/misc/uacce/uacce.c
> @@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, struct 
> file *filep)
>   if (!q)
>   return -ENOMEM;
>  
> + mutex_lock(>queues_lock);
> +
> + if (!uacce->parent->driver) {

I don't think this is useful, because the core clears parent->driver after
having run uacce_remove():

  rmmod hisi_zipopen()
   ...   uacce_fops_open()
   __device_release_driver()  ...
pci_device_remove()
 hisi_zip_remove()
  hisi_qm_uninit()
   uacce_remove()
...   ...
  mutex_lock(uacce->queues_lock)
...   if (!uacce->parent->driver)
device_unbind_cleanup()   /* driver still valid, proceed */
 dev->driver = NULL

Since uacce_remove() disabled SVA, the following uacce_bind_queue() will
fail anyway. However, if uacce->flags does not have UACCE_DEV_SVA set,
we'll proceed further and call uacce->ops->get_queue(), which does not
exist anymore since the parent module is gone.

I think we need the global uacce_mutex to serialize uacce_remove() and
uacce_fops_open(). uacce_remove() would do everything, including
xa_erase(), while holding that mutex. And uacce_fops_open() would try to
obtain the uacce object from the xarray while holding the mutex, which
fails if the uacce object is being removed.

Thanks,
Jean

> + ret = -ENODEV;
> + goto out_with_lock;
> + }
> +
>   ret = uacce_bind_queue(uacce, q);
>   if (ret)
> - goto out_with_mem;
> + goto out_with_lock;
>  
>   q->uacce = uacce;
>  
> @@ -153,7 +160,6 @@ static int uacce_fops_open(struct inode *inode, struct 
> file *filep)
>   uacce->inode = inode;
>   q->state = UACCE_Q_INIT;
>  
> - mutex_lock(>queues_lock);
>   list_add(>list, >queues);
>   mutex_unlock(>queues_lock);
>  
> @@ -161,7 +167,8 @@ static int uacce_fops_open(struct inode *inode, struct 
> file *filep)
>  
>  out_with_bond:
>   uacce_unbind_queue(q);
> -out_with_mem:
> +out_with_lock:
> + mutex_unlock(>queues_lock);
>   kfree(q);
>   return ret;
>  }
> @@ -171,10 +178,10 @@ static int uacce_fops_release(struct inode *inode, 
> struct file *filep)
>   struct uacce_queue *q = filep->private_data;
>  
>   mutex_lock(>uacce->queues_lock);
> - list_del(>list);
> - mutex_unlock(>uacce->queues_lock);
>   uacce_put_queue(q);
>   uacce_unbind_queue(q);
> + list_del(>list);
> + mutex_unlock(>uacce->queues_lock);
>   kfree(q);
>  
>   return 0;
> @@ -513,10 +520,10 @@ void uacce_remove(struct uacce_device *uacce)
>   uacce_put_queue(q);
>   uacce_unbind_queue(q);
>   }
> - mutex_unlock(>queues_lock);
>  
>   /* disable sva now since no opened queues */
>   uacce_disable_sva(uacce);
> + mutex_unlock(>queues_lock);
>  
>   if (uacce->cdev)
>   cdev_device_del(uacce->cdev, >dev);
> -- 
> 2.36.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 03/10] iommu/sva: Add iommu_sva_domain support

2022-05-25 Thread Jean-Philippe Brucker
On Wed, May 25, 2022 at 11:07:49AM +0100, Robin Murphy wrote:
> > Did you mean @handler and @handler_token staffs below?
> > 
> > struct iommu_domain {
> >      unsigned type;
> >      const struct iommu_domain_ops *ops;
> >      unsigned long pgsize_bitmap;    /* Bitmap of page sizes in use */
> >      iommu_fault_handler_t handler;
> >      void *handler_token;
> >      struct iommu_domain_geometry geometry;
> >      struct iommu_dma_cookie *iova_cookie;
> > };
> > 
> > Is it only for DMA domains? From the point view of IOMMU faults, it
> > seems to be generic.
> 
> Yes, it's the old common iommu_set_fault_handler() stuff (which arguably is
> more of a "notifier" than a "handler"), but I assume that that's irrelevant
> if SVA is using IOPF instead?

Yes IOMMU drivers call either the newer iommu_report_device_fault() or the
old report_iommu_fault(), and only the former can support IOPF/SVA. I've
tried to merge them before but never completed it. I think the main issue
was with finding the endpoint that caused the fault from the fault
handler. Some IOMMU drivers just pass the IOMMU device to
report_iommu_fault(). I'll probably pick that up at some point.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 06/10] iommu/sva: Refactoring iommu_sva_bind/unbind_device()

2022-05-25 Thread Jean-Philippe Brucker
On Wed, May 25, 2022 at 02:04:49AM +, Tian, Kevin wrote:
> > From: Jean-Philippe Brucker 
> > Sent: Tuesday, May 24, 2022 6:58 PM
> > 
> > On Tue, May 24, 2022 at 10:22:28AM +, Tian, Kevin wrote:
> > > > From: Lu Baolu 
> > > > Sent: Thursday, May 19, 2022 3:21 PM
> > > >
> > > > The existing iommu SVA interfaces are implemented by calling the SVA
> > > > specific iommu ops provided by the IOMMU drivers. There's no need for
> > > > any SVA specific ops in iommu_ops vector anymore as we can achieve
> > > > this through the generic attach/detach_dev_pasid domain ops.
> > >
> > > set/block_pasid_dev, to be consistent.
> > >
> > > > +
> > > > +   mutex_lock(_sva_lock);
> > > > +   /* Search for an existing domain. */
> > > > +   domain = iommu_get_domain_for_dev_pasid(dev, mm->pasid);
> > > > +   if (domain) {
> > > > +   sva_domain = to_sva_domain(domain);
> > > > +   refcount_inc(_domain->bond.users);
> > > > +   goto out_success;
> > > > +   }
> > > > +
> > >
> > > why would one device/pasid be bound to a mm more than once?
> > 
> > Device drivers can call bind() multiple times for the same device and mm,
> > for example if one process wants to open multiple accelerator queues.
> > 
> 
> Is it clearer to have a sva_bond_get/put() pair instead of calling
> bind() multiple times here? 

I don't think it's clearer, and it would force device drivers to keep
track of {dev, mm} pairs, when the IOMMU subsystem already does that.
At the moment a device driver calls

bond = iommu_sva_bind_device(dev, mm)

for each ADI that it wants to assign to userspace. If a process happens to
want multiple ADIs on one device, then the {dev, mm} parameters are the
same and bind() returns the same bond. Since the IOMMU driver needs to
track these anyway, it might as well refcount them.

Thanks,
Jean


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 06/10] iommu/sva: Refactoring iommu_sva_bind/unbind_device()

2022-05-24 Thread Jean-Philippe Brucker
On Tue, May 24, 2022 at 10:22:28AM +, Tian, Kevin wrote:
> > From: Lu Baolu 
> > Sent: Thursday, May 19, 2022 3:21 PM
> > 
> > The existing iommu SVA interfaces are implemented by calling the SVA
> > specific iommu ops provided by the IOMMU drivers. There's no need for
> > any SVA specific ops in iommu_ops vector anymore as we can achieve
> > this through the generic attach/detach_dev_pasid domain ops.
> 
> set/block_pasid_dev, to be consistent.
> 
> > +
> > +   mutex_lock(_sva_lock);
> > +   /* Search for an existing domain. */
> > +   domain = iommu_get_domain_for_dev_pasid(dev, mm->pasid);
> > +   if (domain) {
> > +   sva_domain = to_sva_domain(domain);
> > +   refcount_inc(_domain->bond.users);
> > +   goto out_success;
> > +   }
> > +
> 
> why would one device/pasid be bound to a mm more than once?

Device drivers can call bind() multiple times for the same device and mm,
for example if one process wants to open multiple accelerator queues.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 06/10] iommu/sva: Refactoring iommu_sva_bind/unbind_device()

2022-05-20 Thread Jean-Philippe Brucker
On Fri, May 20, 2022 at 02:38:12PM +0800, Baolu Lu wrote:
> On 2022/5/20 00:39, Jean-Philippe Brucker wrote:
> > > +struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct 
> > > mm_struct *mm)
> > > +{
> > > + struct iommu_sva_domain *sva_domain;
> > > + struct iommu_domain *domain;
> > > + ioasid_t max_pasid = 0;
> > > + int ret = -EINVAL;
> > > +
> > > + /* Allocate mm->pasid if necessary. */
> > > + if (!dev->iommu->iommu_dev->pasids)
> > > + return ERR_PTR(-EOPNOTSUPP);
> > > +
> > > + if (dev_is_pci(dev)) {
> > > + max_pasid = pci_max_pasids(to_pci_dev(dev));
> > > + if (max_pasid < 0)
> > > + return ERR_PTR(max_pasid);
> > > + } else {
> > > + ret = device_property_read_u32(dev, "pasid-num-bits",
> > > +_pasid);
> > > + if (ret)
> > > + return ERR_PTR(ret);
> > > + max_pasid = (1UL << max_pasid);
> > > + }
> > The IOMMU driver needs this PASID width information earlier, when creating
> > the PASID table (in .probe_device(), .attach_dev()). Since we're moving it
> > to the IOMMU core to avoid code duplication, it should be done earlier and
> > stored in dev->iommu
> 
> Yes, really. How about below changes?
> 
> From f1382579e8a15ca49acdf758d38fd36451ea174d Mon Sep 17 00:00:00 2001
> From: Lu Baolu 
> Date: Mon, 28 Feb 2022 15:01:35 +0800
> Subject: [PATCH 1/1] iommu: Add pasids field in struct dev_iommu
> 
> Use this field to save the number of PASIDs that a device is able to
> consume. It is a generic attribute of a device and lifting it into the
> per-device dev_iommu struct could help to avoid the boilerplate code
> in various IOMMU drivers.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/iommu.c | 15 +++
>  include/linux/iommu.h |  2 ++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index e49c5a5b8cc1..6b731171d42f 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -194,6 +195,8 @@ EXPORT_SYMBOL_GPL(iommu_device_unregister);
>  static struct dev_iommu *dev_iommu_get(struct device *dev)
>  {
>   struct dev_iommu *param = dev->iommu;
> + u32 max_pasids = 0;
> + int ret;
> 
>   if (param)
>   return param;
> @@ -202,6 +205,18 @@ static struct dev_iommu *dev_iommu_get(struct device
> *dev)
>   if (!param)
>   return NULL;
> 
> + if (dev_is_pci(dev)) {
> + ret = pci_max_pasids(to_pci_dev(dev));
> + if (ret > 0)
> + max_pasids = ret;
> + } else {
> + ret = device_property_read_u32(dev, "pasid-num-bits",
> +_pasids);
> + if (!ret)
> + max_pasids = (1UL << max_pasids);
> + }
> + param->pasids = max_pasids;
> +

we could also do a min() with the IOMMU PASID size here

>   mutex_init(>lock);
>   dev->iommu = param;
>   return param;
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 45f274b2640d..d4296136ba75 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -371,6 +371,7 @@ struct iommu_fault_param {
>   * @fwspec:   IOMMU fwspec data
>   * @iommu_dev:IOMMU device this device is linked to
>   * @priv: IOMMU Driver private data
> + * @pasids:   number of supported PASIDs

'max_pasids' to stay consistent?

Thanks,
Jean

>   *
>   * TODO: migrate other per device data pointers under iommu_dev_data, e.g.
>   *   struct iommu_group  *iommu_group;
> @@ -382,6 +383,7 @@ struct dev_iommu {
>   struct iommu_fwspec *fwspec;
>   struct iommu_device *iommu_dev;
>   void*priv;
> + u32 pasids;
>  };
> 
>  int iommu_device_register(struct iommu_device *iommu,
> -- 
> 2.25.1
> 
> Best regards,
> baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 08/10] iommu: Prepare IOMMU domain for IOPF

2022-05-19 Thread Jean-Philippe Brucker
On Thu, May 19, 2022 at 03:20:45PM +0800, Lu Baolu wrote:
> This adds some mechanisms around the iommu_domain so that the I/O page
> fault handling framework could route a page fault to the domain and
> call the fault handler from it.
> 
> Add pointers to the page fault handler and its private data in struct
> iommu_domain. The fault handler will be called with the private data
> as a parameter once a page fault is routed to the domain. Any kernel
> component which owns an iommu domain could install handler and its
> private parameter so that the page fault could be further routed and
> handled.
> 
> This also prepares the SVA implementation to be the first consumer of
> the per-domain page fault handling model.
> 
> Suggested-by: Jean-Philippe Brucker 
> Signed-off-by: Lu Baolu 
> ---
>  include/linux/iommu.h |  3 ++
>  drivers/iommu/io-pgfault.c|  7 
>  drivers/iommu/iommu-sva-lib.c | 65 +++
>  3 files changed, 75 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index e4ce2fe0e144..45f274b2640d 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -100,6 +100,9 @@ struct iommu_domain {
>   void *handler_token;
>   struct iommu_domain_geometry geometry;
>   struct iommu_dma_cookie *iova_cookie;
> + enum iommu_page_response_code (*iopf_handler)(struct iommu_fault *fault,
> +   void *data);
> + void *fault_data;
>  };
>  
>  static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
> diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
> index 1df8c1dcae77..aee9e033012f 100644
> --- a/drivers/iommu/io-pgfault.c
> +++ b/drivers/iommu/io-pgfault.c
> @@ -181,6 +181,13 @@ static void iopf_handle_group(struct work_struct *work)
>   * request completes, outstanding faults will have been dealt with by the 
> time
>   * the PASID is freed.
>   *
> + * Any valid page fault will be eventually routed to an iommu domain and the
> + * page fault handler installed there will get called. The users of this
> + * handling framework should guarantee that the iommu domain could only be
> + * freed after the device has stopped generating page faults (or the iommu
> + * hardware has been set to block the page faults) and the pending page 
> faults
> + * have been flushed.
> + *
>   * Return: 0 on success and <0 on error.
>   */
>  int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
> diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
> index 568e0f64edac..317ab8e8c149 100644
> --- a/drivers/iommu/iommu-sva-lib.c
> +++ b/drivers/iommu/iommu-sva-lib.c
> @@ -72,6 +72,69 @@ struct mm_struct *iommu_sva_find(ioasid_t pasid)
>  }
>  EXPORT_SYMBOL_GPL(iommu_sva_find);
>  
> +/*
> + * I/O page fault handler for SVA
> + *
> + * Copied from io-pgfault.c with mmget_not_zero() added before
> + * mmap_read_lock().

Comment doesn't really belong here, maybe better in the commit message.
Apart from that

Reviewed-by: Jean-Philippe Brucker 

> + */
> +static enum iommu_page_response_code
> +iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
> +{
> + vm_fault_t ret;
> + struct mm_struct *mm;
> + struct vm_area_struct *vma;
> + unsigned int access_flags = 0;
> + struct iommu_domain *domain = data;
> + unsigned int fault_flags = FAULT_FLAG_REMOTE;
> + struct iommu_fault_page_request *prm = >prm;
> + enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
> +
> + if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
> + return status;
> +
> + mm = domain_to_mm(domain);
> + if (IS_ERR_OR_NULL(mm) || !mmget_not_zero(mm))
> + return status;
> +
> + mmap_read_lock(mm);
> +
> + vma = find_extend_vma(mm, prm->addr);
> + if (!vma)
> + /* Unmapped area */
> + goto out_put_mm;
> +
> + if (prm->perm & IOMMU_FAULT_PERM_READ)
> + access_flags |= VM_READ;
> +
> + if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
> + access_flags |= VM_WRITE;
> + fault_flags |= FAULT_FLAG_WRITE;
> + }
> +
> + if (prm->perm & IOMMU_FAULT_PERM_EXEC) {
> + access_flags |= VM_EXEC;
> + fault_flags |= FAULT_FLAG_INSTRUCTION;
> + }
> +
> + if (!(prm->perm & IOMMU_FAULT_PERM_PRIV))
> + fault_flags |= FAULT_FLAG_USER;
> +
> + if (access_flags & ~vma->vm_flags)
> + /* Access fault */
> + goto out_put_mm;
> +
> + ret = handle

Re: [PATCH v7 06/10] iommu/sva: Refactoring iommu_sva_bind/unbind_device()

2022-05-19 Thread Jean-Philippe Brucker
On Thu, May 19, 2022 at 03:20:43PM +0800, Lu Baolu wrote:
> The existing iommu SVA interfaces are implemented by calling the SVA
> specific iommu ops provided by the IOMMU drivers. There's no need for
> any SVA specific ops in iommu_ops vector anymore as we can achieve
> this through the generic attach/detach_dev_pasid domain ops.
> 
> This refactors the IOMMU SVA interfaces implementation by using the
> set/block_pasid_dev ops and align them with the concept of the SVA
> iommu domain. Put the new SVA code in the sva related file in order
> to make it self-contained.
> 
> Signed-off-by: Lu Baolu 
> ---
>  include/linux/iommu.h |  48 --
>  drivers/iommu/iommu-sva-lib.h |   1 +
>  drivers/iommu/iommu-sva-lib.c | 113 
>  drivers/iommu/iommu.c | 119 --
>  4 files changed, 170 insertions(+), 111 deletions(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index e8cf82d46ce1..d9ac5ebe5bbb 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -635,6 +635,7 @@ struct iommu_fwspec {
>   */
>  struct iommu_sva {
>   struct device   *dev;
> + refcount_t  users;
>  };
>  
>  int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
> @@ -677,11 +678,6 @@ int iommu_dev_enable_feature(struct device *dev, enum 
> iommu_dev_features f);
>  int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features f);
>  bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features 
> f);
>  
> -struct iommu_sva *iommu_sva_bind_device(struct device *dev,
> - struct mm_struct *mm);
> -void iommu_sva_unbind_device(struct iommu_sva *handle);
> -u32 iommu_sva_get_pasid(struct iommu_sva *handle);
> -
>  int iommu_device_use_default_domain(struct device *dev);
>  void iommu_device_unuse_default_domain(struct device *dev);
>  
> @@ -693,6 +689,8 @@ int iommu_set_device_pasid(struct iommu_domain *domain, 
> struct device *dev,
>  ioasid_t pasid);
>  void iommu_block_device_pasid(struct iommu_domain *domain, struct device 
> *dev,
> ioasid_t pasid);
> +struct iommu_domain *
> +iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid);
>  #else /* CONFIG_IOMMU_API */
>  
>  struct iommu_ops {};
> @@ -1023,21 +1021,6 @@ iommu_dev_disable_feature(struct device *dev, enum 
> iommu_dev_features feat)
>   return -ENODEV;
>  }
>  
> -static inline struct iommu_sva *
> -iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
> -{
> - return NULL;
> -}
> -
> -static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
> -{
> -}
> -
> -static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
> -{
> - return IOMMU_PASID_INVALID;
> -}
> -
>  static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
>  {
>   return NULL;
> @@ -1077,6 +1060,12 @@ static inline void iommu_block_device_pasid(struct 
> iommu_domain *domain,
>   struct device *dev, ioasid_t pasid)
>  {
>  }
> +
> +static inline struct iommu_domain *
> +iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
> +{
> + return NULL;
> +}
>  #endif /* CONFIG_IOMMU_API */
>  
>  /**
> @@ -1108,6 +1097,10 @@ iommu_sva_alloc_domain(struct bus_type *bus, struct 
> mm_struct *mm);
>  void iommu_sva_free_domain(struct iommu_domain *domain);
>  int iommu_sva_set_domain(struct iommu_domain *domain, struct device *dev,
>ioasid_t pasid);
> +struct iommu_sva *iommu_sva_bind_device(struct device *dev,
> + struct mm_struct *mm);
> +void iommu_sva_unbind_device(struct iommu_sva *handle);
> +u32 iommu_sva_get_pasid(struct iommu_sva *handle);
>  #else /* CONFIG_IOMMU_SVA */
>  static inline struct iommu_domain *
>  iommu_sva_alloc_domain(struct bus_type *bus, struct mm_struct *mm)
> @@ -1124,6 +1117,21 @@ static inline int iommu_sva_set_domain(struct 
> iommu_domain *domain,
>  {
>   return -EINVAL;
>  }
> +
> +static inline struct iommu_sva *
> +iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
> +{
> + return NULL;
> +}
> +
> +static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
> +{
> +}
> +
> +static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
> +{
> + return IOMMU_PASID_INVALID;
> +}
>  #endif /* CONFIG_IOMMU_SVA */
>  
>  #endif /* __LINUX_IOMMU_H */
> diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
> index 1be21e6b93ec..ebab5a8cb126 100644
> --- a/drivers/iommu/iommu-sva-lib.h
> +++ b/drivers/iommu/iommu-sva-lib.h
> @@ -20,6 +20,7 @@ struct iopf_queue;
>  struct iommu_sva_domain {
>   struct iommu_domain domain;
>   struct mm_struct*mm;
> + struct iommu_svabond;
>  };
>  
>  #define to_sva_domain(d) 

Re: [PATCH v7 05/10] arm-smmu-v3/sva: Add SVA domain support

2022-05-19 Thread Jean-Philippe Brucker
On Thu, May 19, 2022 at 03:20:42PM +0800, Lu Baolu wrote:
> Add support for domain ops callbacks for an SVA domain.
> 
> Signed-off-by: Lu Baolu 

Reviewed-by: Jean-Philippe Brucker 

(I'll try to take some time next cycle to clean up the driver following
this change)

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  4 ++
>  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 46 +++
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  6 +++
>  3 files changed, 56 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index d2ba86470c42..ec77f6a51ff9 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -758,6 +758,10 @@ struct iommu_sva *arm_smmu_sva_bind(struct device *dev, 
> struct mm_struct *mm);
>  void arm_smmu_sva_unbind(struct iommu_sva *handle);
>  u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
>  void arm_smmu_sva_notifier_synchronize(void);
> +int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
> +   struct device *dev, ioasid_t id);
> +void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
> +struct device *dev, ioasid_t id);
>  #else /* CONFIG_ARM_SMMU_V3_SVA */
>  static inline bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
>  {
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index f155d406c5d5..6969974ca89e 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -549,3 +549,49 @@ void arm_smmu_sva_notifier_synchronize(void)
>*/
>   mmu_notifier_synchronize();
>  }
> +
> +int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
> +   struct device *dev, ioasid_t id)
> +{
> + int ret = 0;
> + struct mm_struct *mm;
> + struct iommu_sva *handle;
> +
> + if (domain->type != IOMMU_DOMAIN_SVA)
> + return -EINVAL;
> +
> + mm = domain_to_mm(domain);
> + if (WARN_ON(!mm))
> + return -ENODEV;
> +
> + mutex_lock(_lock);
> + handle = __arm_smmu_sva_bind(dev, mm);
> + if (IS_ERR(handle))
> + ret = PTR_ERR(handle);
> + mutex_unlock(_lock);
> +
> + return ret;
> +}
> +
> +void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
> +struct device *dev, ioasid_t id)
> +{
> + struct arm_smmu_bond *bond = NULL, *t;
> + struct mm_struct *mm = domain_to_mm(domain);
> + struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> +
> + mutex_lock(_lock);
> + list_for_each_entry(t, >bonds, list) {
> + if (t->mm == mm) {
> + bond = t;
> + break;
> + }
> + }
> +
> + if (!WARN_ON(!bond) && refcount_dec_and_test(>refs)) {
> + list_del(>list);
> + arm_smmu_mmu_notifier_put(bond->smmu_mn);
> + kfree(bond);
> + }
> + mutex_unlock(_lock);
> +}
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 6e2cd082c670..4ad3ca70cf89 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2858,6 +2858,12 @@ static struct iommu_ops arm_smmu_ops = {
>   .page_response  = arm_smmu_page_response,
>   .pgsize_bitmap  = -1UL, /* Restricted during device attach */
>   .owner  = THIS_MODULE,
> +#ifdef CONFIG_ARM_SMMU_V3_SVA
> + .sva_domain_ops = &(const struct iommu_domain_ops) {
> + .set_dev_pasid  = arm_smmu_sva_attach_dev_pasid,
> + .block_dev_pasid= arm_smmu_sva_detach_dev_pasid,
> + },
> +#endif
>   .default_domain_ops = &(const struct iommu_domain_ops) {
>   .attach_dev = arm_smmu_attach_dev,
>   .map_pages  = arm_smmu_map_pages,
> -- 
> 2.25.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 03/10] iommu/sva: Add iommu_sva_domain support

2022-05-19 Thread Jean-Philippe Brucker
On Thu, May 19, 2022 at 03:20:40PM +0800, Lu Baolu wrote:
> The iommu_sva_domain represents a hardware pagetable that the IOMMU
> hardware could use for SVA translation. This adds some infrastructure
> to support SVA domain in the iommu common layer. It includes:
> 
> - Add a new struct iommu_sva_domain and new IOMMU_DOMAIN_SVA domain
>   type.
> - Add a new domain ops pointer in iommu_ops. The IOMMU drivers that
>   support SVA should provide the callbacks.
> - Add helpers to allocate and free an SVA domain.
> - Add helpers to set an SVA domain to a device and the reverse
>   operation.
> 
> Some buses, like PCI, route packets without considering the PASID value.
> Thus a DMA target address with PASID might be treated as P2P if the
> address falls into the MMIO BAR of other devices in the group. To make
> things simple, the attach/detach interfaces only apply to devices
> belonging to the singleton groups, and the singleton is immutable in
> fabric i.e. not affected by hotplug.
> 
> The iommu_set/block_device_pasid() can be used for other purposes,
> such as kernel DMA with pasid, mediation device, etc. Hence, it is put
> in the iommu.c.
> 
> Suggested-by: Jean-Philippe Brucker 
> Suggested-by: Jason Gunthorpe 
> Signed-off-by: Lu Baolu 
> ---
>  include/linux/iommu.h | 51 +
>  drivers/iommu/iommu-sva-lib.h | 15 
>  drivers/iommu/iommu-sva-lib.c | 48 +++
>  drivers/iommu/iommu.c | 71 +++
>  4 files changed, 185 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 0c358b7c583b..e8cf82d46ce1 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -64,6 +64,9 @@ struct iommu_domain_geometry {
>  #define __IOMMU_DOMAIN_PT(1U << 2)  /* Domain is identity mapped   */
>  #define __IOMMU_DOMAIN_DMA_FQ(1U << 3)  /* DMA-API uses flush queue  
>   */
>  
> +#define __IOMMU_DOMAIN_SHARED(1U << 4)  /* Page table shared from 
> CPU  */
> +#define __IOMMU_DOMAIN_HOST_VA   (1U << 5)  /* Host CPU virtual address 
> */
> +
>  /*
>   * This are the possible domain-types
>   *
> @@ -86,6 +89,8 @@ struct iommu_domain_geometry {
>  #define IOMMU_DOMAIN_DMA_FQ  (__IOMMU_DOMAIN_PAGING |\
>__IOMMU_DOMAIN_DMA_API |   \
>__IOMMU_DOMAIN_DMA_FQ)
> +#define IOMMU_DOMAIN_SVA (__IOMMU_DOMAIN_SHARED |\
> +  __IOMMU_DOMAIN_HOST_VA)
>  
>  struct iommu_domain {
>   unsigned type;
> @@ -254,6 +259,7 @@ struct iommu_ops {
>   int (*def_domain_type)(struct device *dev);
>  
>   const struct iommu_domain_ops *default_domain_ops;
> + const struct iommu_domain_ops *sva_domain_ops;
>   unsigned long pgsize_bitmap;
>   struct module *owner;
>  };
> @@ -262,6 +268,8 @@ struct iommu_ops {
>   * struct iommu_domain_ops - domain specific operations
>   * @attach_dev: attach an iommu domain to a device
>   * @detach_dev: detach an iommu domain from a device
> + * @set_dev_pasid: set an iommu domain to a pasid of device
> + * @block_dev_pasid: block pasid of device from using iommu domain
>   * @map: map a physically contiguous memory region to an iommu domain
>   * @map_pages: map a physically contiguous set of pages of the same size to
>   * an iommu domain.
> @@ -282,6 +290,10 @@ struct iommu_ops {
>  struct iommu_domain_ops {
>   int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
>   void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
> + int (*set_dev_pasid)(struct iommu_domain *domain, struct device *dev,
> +  ioasid_t pasid);
> + void (*block_dev_pasid)(struct iommu_domain *domain, struct device *dev,
> + ioasid_t pasid);
>  
>   int (*map)(struct iommu_domain *domain, unsigned long iova,
>  phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
> @@ -677,6 +689,10 @@ int iommu_group_claim_dma_owner(struct iommu_group 
> *group, void *owner);
>  void iommu_group_release_dma_owner(struct iommu_group *group);
>  bool iommu_group_dma_owner_claimed(struct iommu_group *group);
>  
> +int iommu_set_device_pasid(struct iommu_domain *domain, struct device *dev,
> +ioasid_t pasid);
> +void iommu_block_device_pasid(struct iommu_domain *domain, struct device 
> *dev,
> +   ioasid_t pasid);
>  #else /* CONFIG_IOMMU_API */
>  
>  struct iommu_ops {};
> @@ -1050,6 +1066,17 @@ static inline bool 
> iommu_group_dma_owner_cla

Re: [PATCH v7 02/10] iommu: Remove SVM_FLAG_SUPERVISOR_MODE support

2022-05-19 Thread Jean-Philippe Brucker
On Thu, May 19, 2022 at 03:20:39PM +0800, Lu Baolu wrote:
> The current kernel DMA with PASID support is based on the SVA with a flag
> SVM_FLAG_SUPERVISOR_MODE. The IOMMU driver binds the kernel memory address
> space to a PASID of the device. The device driver programs the device with
> kernel virtual address (KVA) for DMA access. There have been security and
> functional issues with this approach:
> 
> - The lack of IOTLB synchronization upon kernel page table updates.
>   (vmalloc, module/BPF loading, CONFIG_DEBUG_PAGEALLOC etc.)
> - Other than slight more protection, using kernel virtual address (KVA)
>   has little advantage over physical address. There are also no use
>   cases yet where DMA engines need kernel virtual addresses for in-kernel
>   DMA.
> 
> This removes SVM_FLAG_SUPERVISOR_MODE support from the IOMMU interface.
> The device drivers are suggested to handle kernel DMA with PASID through
> the kernel DMA APIs.
> 
> The drvdata parameter in iommu_sva_bind_device() and all callbacks is not
> needed anymore. Cleanup them as well.
> 
> Link: https://lore.kernel.org/linux-iommu/20210511194726.gp1002...@nvidia.com/
> Signed-off-by: Jacob Pan 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Jason Gunthorpe 

For the SMMU bits

Reviewed-by: Jean-Philippe Brucker 

> ---
>  include/linux/intel-iommu.h   |  3 +-
>  include/linux/intel-svm.h | 13 -
>  include/linux/iommu.h |  8 +--
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  5 +-
>  drivers/dma/idxd/cdev.c   |  2 +-
>  drivers/dma/idxd/init.c   | 24 +---
>  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  3 +-

>  drivers/iommu/intel/svm.c | 57 +--
>  drivers/iommu/iommu.c |  5 +-
>  drivers/misc/uacce/uacce.c|  2 +-
>  10 files changed, 26 insertions(+), 96 deletions(-)
> 
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 4f29139bbfc3..df23300cfa88 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -739,8 +739,7 @@ struct intel_iommu *device_to_iommu(struct device *dev, 
> u8 *bus, u8 *devfn);
>  extern void intel_svm_check(struct intel_iommu *iommu);
>  extern int intel_svm_enable_prq(struct intel_iommu *iommu);
>  extern int intel_svm_finish_prq(struct intel_iommu *iommu);
> -struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm,
> -  void *drvdata);
> +struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm);
>  void intel_svm_unbind(struct iommu_sva *handle);
>  u32 intel_svm_get_pasid(struct iommu_sva *handle);
>  int intel_svm_page_response(struct device *dev, struct iommu_fault_event 
> *evt,
> diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
> index 207ef06ba3e1..f9a0d44f6fdb 100644
> --- a/include/linux/intel-svm.h
> +++ b/include/linux/intel-svm.h
> @@ -13,17 +13,4 @@
>  #define PRQ_RING_MASK((0x1000 << PRQ_ORDER) - 0x20)
>  #define PRQ_DEPTH((0x1000 << PRQ_ORDER) >> 5)
>  
> -/*
> - * The SVM_FLAG_SUPERVISOR_MODE flag requests a PASID which can be used only
> - * for access to kernel addresses. No IOTLB flushes are automatically done
> - * for kernel mappings; it is valid only for access to the kernel's static
> - * 1:1 mapping of physical memory — not to vmalloc or even module mappings.
> - * A future API addition may permit the use of such ranges, by means of an
> - * explicit IOTLB flush call (akin to the DMA API's unmap method).
> - *
> - * It is unlikely that we will ever hook into flush_tlb_kernel_range() to
> - * do such IOTLB flushes automatically.
> - */
> -#define SVM_FLAG_SUPERVISOR_MODE BIT(0)
> -
>  #endif /* __INTEL_SVM_H__ */
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index da423e87f248..0c358b7c583b 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -243,8 +243,7 @@ struct iommu_ops {
>   int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
>   int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
>  
> - struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
> -   void *drvdata);
> + struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm);
>   void (*sva_unbind)(struct iommu_sva *handle);
>   u32 (*sva_get_pasid)(struct iommu_sva *handle);
>  
> @@ -667,8 +666,7 @@ int iommu_dev_disable_feature(struct device *dev, enum 
> iommu_dev_features f);
>  bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_f

Re: [PATCH v7 01/10] iommu: Add pasids field in struct iommu_device

2022-05-19 Thread Jean-Philippe Brucker
Hi Baolu,

On Thu, May 19, 2022 at 03:20:38PM +0800, Lu Baolu wrote:
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 88817a3376ef..6e2cd082c670 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -3546,6 +3546,7 @@ static int arm_smmu_device_hw_probe(struct 
> arm_smmu_device *smmu)
>   /* SID/SSID sizes */
>   smmu->ssid_bits = FIELD_GET(IDR1_SSIDSIZE, reg);
>   smmu->sid_bits = FIELD_GET(IDR1_SIDSIZE, reg);
> + smmu->iommu.pasids = smmu->ssid_bits;

This should be 1UL << smmu->ssid_bits

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 03/12] iommu: Add attach/detach_dev_pasid domain ops

2022-05-12 Thread Jean-Philippe Brucker
On Wed, May 11, 2022 at 09:02:40AM -0300, Jason Gunthorpe wrote:
> On Wed, May 11, 2022 at 08:54:39AM +0100, Jean-Philippe Brucker wrote:
> > > > > Then 'detach pasid' is:
> > > > >
> > > > > iommu_ops->blocking_domain->ops->attach_dev_pasid(domain, dev,
> > > > pasid);
> > > > >
> > > > > And we move away from the notion of 'detach' and in the direction that
> > > > > everything continuously has a domain set. PASID would logically
> > > > > default to blocking_domain, though we wouldn't track this anywhere.
> > > > 
> > > > I am not sure whether we still need to keep the blocking domain concept
> > > > when we are entering the new PASID world. Please allow me to wait and
> > > > listen to more opinions.
> > > > 
> > > 
> > > I'm with Jason on this direction. In concept after a PASID is detached 
> > > it's
> > > essentially blocked. Implementation-wise it doesn't prevent the iommu
> > > driver from marking the PASID entry as non-present as doing in this
> > > series instead of actually pointing to the empty page table of the block
> > > domain. But api-wise it does make the entire semantics more consistent.
> > 
> > This is all internal to IOMMU so I don't think we should be concerned
> > about API consistency. I prefer a straighforward detach() operation
> > because that way IOMMU drivers don't have to keep track of which domain is
> > attached to which PASID. That code can be factored into the IOMMU core.
> 
> Why would a driver need to keep additional tracking?
> 
> > In addition to clearing contexts, detach() also needs to invalidate TLBs,
> > and for that the SMMU driver needs to know the old ASID (!= PASID) that
> > was used by the context descriptor. We can certainly work around a missing
> > detach() to implement this, but it will be convoluted.
> 
> It is not "missing" it is just renamed to 
> blocking_domain->ops->set_dev_pasid()
> 
> The implementation of that function would be identical to
> detach_dev_pasid.

  attach(dev, pasid, sva_domain)
  detach(dev, pasid, sva_domain)

versus

  set_dev_pasid(dev, pasid, sva_domain)
  set_dev_pasid(dev, pasid, blocking)

we loose the information of the domain previously attached, and the SMMU
driver has to retrieve it to find the ASID corresponding to the mm. 

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 02/12] iommu: Add pasid_bits field in struct dev_iommu

2022-05-11 Thread Jean-Philippe Brucker
On Wed, May 11, 2022 at 10:25:48AM +0800, Baolu Lu wrote:
> On 2022/5/10 22:34, Jason Gunthorpe wrote:
> > On Tue, May 10, 2022 at 02:17:28PM +0800, Lu Baolu wrote:
> > 
> > >   int iommu_device_register(struct iommu_device *iommu,
> > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> > > b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > index 627a3ed5ee8f..afc63fce6107 100644
> > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > @@ -2681,6 +2681,8 @@ static struct iommu_device 
> > > *arm_smmu_probe_device(struct device *dev)
> > >   smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
> > >   master->stall_enabled = true;
> > > + dev->iommu->pasid_bits = master->ssid_bits;
> > >   return >iommu;
> > >   err_free_master:
> > > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > > index 2990f80c5e08..99643f897f26 100644
> > > +++ b/drivers/iommu/intel/iommu.c
> > > @@ -4624,8 +4624,11 @@ static struct iommu_device 
> > > *intel_iommu_probe_device(struct device *dev)
> > >   if (pasid_supported(iommu)) {
> > >   int features = pci_pasid_features(pdev);
> > > - if (features >= 0)
> > > + if (features >= 0) {
> > >   info->pasid_supported = 
> > > features | 1;
> > > + dev->iommu->pasid_bits =
> > > + fls(pci_max_pasids(pdev)) - 1;
> > > + }
> > 
> > It is not very nice that both the iommu drivers have to duplicate the
> > code to read the pasid capability out of the PCI device.
> > 
> > IMHO it would make more sense for the iommu layer to report the
> > capability of its own HW block only, and for the core code to figure
> > out the master's limitation using a bus-specific approach.
> 
> Fair enough. The iommu hardware capability could be reported in
> 
> /**
>  * struct iommu_device - IOMMU core representation of one IOMMU hardware
>  *   instance
>  * @list: Used by the iommu-core to keep a list of registered iommus
>  * @ops: iommu-ops for talking to this iommu
>  * @dev: struct device for sysfs handling
>  */
> struct iommu_device {
> struct list_head list;
> const struct iommu_ops *ops;
> struct fwnode_handle *fwnode;
> struct device *dev;
> };
> 
> I haven't checked ARM code yet, but it works for x86 as far as I can
> see.

Arm also supports non-PCI PASID by reading a firmware property:

device_property_read_u32(dev, "pasid-num-bits", >ssid_bits);

should be the only difference

Thanks,
Jean

> 
> > 
> > It is also unfortunate that the enable/disable pasid is inside the
> > iommu driver as well - ideally the PCI driver itself would do this
> > when it knows it wants to use PASIDs.
> > 
> > The ordering interaction with ATS makes this look quite annoying
> > though. :(
> > 
> > I'm also not convinced individual IOMMU drivers should be forcing ATS
> > on, there are performance and functional implications here. Using ATS
> > or not is possibly best left as an administrator policy controlled by
> > the core code. Again we seem to have some mess.
> 
> Agreed with you. This has already been in my task list. I will start to
> solve it after the iommufd tasks.
> 
> Best regards,
> baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 03/12] iommu: Add attach/detach_dev_pasid domain ops

2022-05-11 Thread Jean-Philippe Brucker
On Wed, May 11, 2022 at 04:09:14AM +, Tian, Kevin wrote:
> > From: Baolu Lu 
> > Sent: Wednesday, May 11, 2022 10:32 AM
> > 
> > On 2022/5/10 22:02, Jason Gunthorpe wrote:
> > > On Tue, May 10, 2022 at 02:17:29PM +0800, Lu Baolu wrote:
> > >
> > >> This adds a pair of common domain ops for this purpose and adds
> > helpers
> > >> to attach/detach a domain to/from a {device, PASID}.
> > >
> > > I wonder if this should not have a detach op - after discussing with
> > > Robin we can see that detach_dev is not used in updated
> > > drivers. Instead attach_dev acts as 'set_domain'
> > >
> > > So, it would be more symmetrical if attaching a blocking_domain to the
> > > PASID was the way to 'detach'.
> > >
> > > This could be made straightforward by following the sketch I showed to
> > > have a static, global blocing_domain and providing a pointer to it in
> > > struct iommu_ops
> > >
> > > Then 'detach pasid' is:
> > >
> > > iommu_ops->blocking_domain->ops->attach_dev_pasid(domain, dev,
> > pasid);
> > >
> > > And we move away from the notion of 'detach' and in the direction that
> > > everything continuously has a domain set. PASID would logically
> > > default to blocking_domain, though we wouldn't track this anywhere.
> > 
> > I am not sure whether we still need to keep the blocking domain concept
> > when we are entering the new PASID world. Please allow me to wait and
> > listen to more opinions.
> > 
> 
> I'm with Jason on this direction. In concept after a PASID is detached it's
> essentially blocked. Implementation-wise it doesn't prevent the iommu
> driver from marking the PASID entry as non-present as doing in this
> series instead of actually pointing to the empty page table of the block
> domain. But api-wise it does make the entire semantics more consistent.

This is all internal to IOMMU so I don't think we should be concerned
about API consistency. I prefer a straighforward detach() operation
because that way IOMMU drivers don't have to keep track of which domain is
attached to which PASID. That code can be factored into the IOMMU core.

In addition to clearing contexts, detach() also needs to invalidate TLBs,
and for that the SMMU driver needs to know the old ASID (!= PASID) that
was used by the context descriptor. We can certainly work around a missing
detach() to implement this, but it will be convoluted.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 10/12] iommu: Prepare IOMMU domain for IOPF

2022-05-05 Thread Jean-Philippe Brucker
Hi Baolu,

On Thu, May 05, 2022 at 04:31:38PM +0800, Baolu Lu wrote:
> On 2022/5/4 02:20, Jean-Philippe Brucker wrote:
> > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > > index 7cae631c1baa..33449523afbe 100644
> > > --- a/drivers/iommu/iommu.c
> > > +++ b/drivers/iommu/iommu.c
> > > @@ -3174,3 +3174,24 @@ void iommu_detach_device_pasid(struct iommu_domain 
> > > *domain,
> > >   iommu_group_put(group);
> > >   }
> > > +
> > > +struct iommu_domain *iommu_get_domain_for_dev_pasid(struct device *dev,
> > > + ioasid_t pasid)
> > > +{
> > > + struct iommu_domain *domain;
> > > + struct iommu_group *group;
> > > +
> > > + if (!pasid_valid(pasid))
> > > + return NULL;
> > > +
> > > + group = iommu_group_get(dev);
> > > + if (!group)
> > > + return NULL;
> > > +
> > > + mutex_lock(>mutex);
> > Unfortunately this still causes the deadlock when unbind() flushes the
> > IOPF queue while holding the group mutex.
> 
> Sorry, I didn't get your point here.
> 
> Do you mean unbind() could hold group mutex before calling this helper?
> The group mutex is only available in iommu.c. The unbind() has no means
> to hold this lock. Or, I missed anything?

I wasn't clear, it's iommu_detach_device_pasid() that holds the
group->mutex:

 iommu_sva_unbind_device()  |
  iommu_detach_device_pasid()   |
   mutex_lock(>mutex)|
   domain->ops->detach_dev_pasid()  | iopf_handle_group()
iopf_queue_flush_dev()  |  iommu_get_domain_for_dev_pasid()
 ... wait for IOPF work |   mutex_lock(>mutex)
|... deadlock

Thanks,
Jean

> 
> Best regards,
> baolu
> 
> > 
> > If we make this function private to IOPF, then we can get rid of this
> > mutex_lock(). It's OK because:
> > 
> > * xarray protects its internal state with RCU, so we can call
> >xa_load() outside the lock.
> > 
> > * The domain obtained from xa_load is finalized. Its content is valid
> >because xarray stores the domain using rcu_assign_pointer(), which has a
> >release memory barrier, which pairs with data dependencies in IOPF
> >(domain->sva_ioas etc).
> > 
> >We'll need to be careful about this when allowing other users to install
> >a fault handler. Should be fine as long as the handler and data are
> >installed before the domain is added to pasid_array.
> > 
> > * We know the domain is valid the whole time IOPF is using it, because
> >unbind() waits for pending faults.
> > 
> > We just need a comment explaining the last point, something like:
> > 
> > /*
> > * Safe to fetch outside the group mutex because:
> >  * - xarray protects its internal state with RCU
> >  * - the domain obtained is either NULL or fully formed
> > * - the IOPF work is the only caller and is flushed before the
> > *   domain is freed.
> >  */
> > 
> > Thanks,
> > Jean
> > 
> > > + domain = xa_load(>pasid_array, pasid);
> > > + mutex_unlock(>mutex);
> > > + iommu_group_put(group);
> > > +
> > > + return domain;
> > > +}
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 12/12] iommu: Rename iommu-sva-lib.{c,h}

2022-05-03 Thread Jean-Philippe Brucker
On Mon, May 02, 2022 at 09:48:42AM +0800, Lu Baolu wrote:
> Rename iommu-sva-lib.c[h] to iommu-sva.c[h] as it contains all code
> for SVA implementation in iommu core.
> 
> Signed-off-by: Lu Baolu 

Reviewed-by: Jean-Philippe Brucker 

> ---
>  drivers/iommu/{iommu-sva-lib.h => iommu-sva.h}  | 0
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +-
>  drivers/iommu/intel/iommu.c | 2 +-
>  drivers/iommu/intel/svm.c   | 2 +-
>  drivers/iommu/io-pgfault.c  | 2 +-
>  drivers/iommu/{iommu-sva-lib.c => iommu-sva.c}  | 2 +-
>  drivers/iommu/Makefile  | 2 +-
>  8 files changed, 7 insertions(+), 7 deletions(-)
>  rename drivers/iommu/{iommu-sva-lib.h => iommu-sva.h} (100%)
>  rename drivers/iommu/{iommu-sva-lib.c => iommu-sva.c} (99%)
> 
> diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva.h
> similarity index 100%
> rename from drivers/iommu/iommu-sva-lib.h
> rename to drivers/iommu/iommu-sva.h
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index 0ace04b27d4b..73a336e17dc8 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -9,7 +9,7 @@
>  #include 
>  
>  #include "arm-smmu-v3.h"
> -#include "../../iommu-sva-lib.h"
> +#include "../../iommu-sva.h"
>  #include "../../io-pgtable-arm.h"
>  
>  struct arm_smmu_mmu_notifier {
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 543d3ef1c102..ca2bd17eec41 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -31,7 +31,7 @@
>  #include 
>  
>  #include "arm-smmu-v3.h"
> -#include "../../iommu-sva-lib.h"
> +#include "../../iommu-sva.h"
>  
>  static bool disable_bypass = true;
>  module_param(disable_bypass, bool, 0444);
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 46e2eb15197b..b38f50810459 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -27,7 +27,7 @@
>  #include 
>  
>  #include "../irq_remapping.h"
> -#include "../iommu-sva-lib.h"
> +#include "../iommu-sva.h"
>  #include "pasid.h"
>  #include "cap_audit.h"
>  
> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> index 6084f960ba27..38c33cde177e 100644
> --- a/drivers/iommu/intel/svm.c
> +++ b/drivers/iommu/intel/svm.c
> @@ -25,7 +25,7 @@
>  
>  #include "pasid.h"
>  #include "perf.h"
> -#include "../iommu-sva-lib.h"
> +#include "../iommu-sva.h"
>  
>  static irqreturn_t prq_event_thread(int irq, void *d);
>  static void intel_svm_drain_prq(struct device *dev, u32 pasid);
> diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
> index 8a2bb56e1474..a9ecf6bf5500 100644
> --- a/drivers/iommu/io-pgfault.c
> +++ b/drivers/iommu/io-pgfault.c
> @@ -11,7 +11,7 @@
>  #include 
>  #include 
>  
> -#include "iommu-sva-lib.h"
> +#include "iommu-sva.h"
>  
>  /**
>   * struct iopf_queue - IO Page Fault queue
> diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva.c
> similarity index 99%
> rename from drivers/iommu/iommu-sva-lib.c
> rename to drivers/iommu/iommu-sva.c
> index ae3595d60f38..b631765fa8c0 100644
> --- a/drivers/iommu/iommu-sva-lib.c
> +++ b/drivers/iommu/iommu-sva.c
> @@ -7,7 +7,7 @@
>  #include 
>  #include 
>  
> -#include "iommu-sva-lib.h"
> +#include "iommu-sva.h"
>  
>  static DEFINE_MUTEX(iommu_sva_lock);
>  static DECLARE_IOASID_SET(iommu_sva_pasid);
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 44475a9b3eea..c1763476162b 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -27,6 +27,6 @@ obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
>  obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
>  obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
>  obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
> -obj-$(CONFIG_IOMMU_SVA) += iommu-sva-lib.o io-pgfault.o
> +obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o io-pgfault.o
>  obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
>  obj-$(CONFIG_APPLE_DART) += apple-dart.o
> -- 
> 2.25.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 11/12] iommu: Per-domain I/O page fault handling

2022-05-03 Thread Jean-Philippe Brucker
On Mon, May 02, 2022 at 09:48:41AM +0800, Lu Baolu wrote:
> Tweak the I/O page fault handling framework to route the page faults to
> the domain and call the page fault handler retrieved from the domain.
> This makes the I/O page fault handling framework possible to serve more
> usage scenarios as long as they have an IOMMU domain and install a page
> fault handler in it. Some unused functions are also removed to avoid
> dead code.
> 
> Signed-off-by: Lu Baolu 

Reviewed-by: Jean-Philippe Brucker 

> ---
>  drivers/iommu/iommu-sva-lib.h |  1 -
>  drivers/iommu/io-pgfault.c| 64 ---
>  drivers/iommu/iommu-sva-lib.c | 20 ---
>  3 files changed, 7 insertions(+), 78 deletions(-)
> 
> diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
> index 5776b4c80cc1..e7813c6706fb 100644
> --- a/drivers/iommu/iommu-sva-lib.h
> +++ b/drivers/iommu/iommu-sva-lib.h
> @@ -8,7 +8,6 @@
>  #include 
>  #include 
>  
> -struct mm_struct *iommu_sva_find(ioasid_t pasid);
>  struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain);
>  
>  /* I/O Page fault */
> diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
> index 1df8c1dcae77..8a2bb56e1474 100644
> --- a/drivers/iommu/io-pgfault.c
> +++ b/drivers/iommu/io-pgfault.c
> @@ -69,69 +69,18 @@ static int iopf_complete_group(struct device *dev, struct 
> iopf_fault *iopf,
>   return iommu_page_response(dev, );
>  }
>  
> -static enum iommu_page_response_code
> -iopf_handle_single(struct iopf_fault *iopf)
> -{
> - vm_fault_t ret;
> - struct mm_struct *mm;
> - struct vm_area_struct *vma;
> - unsigned int access_flags = 0;
> - unsigned int fault_flags = FAULT_FLAG_REMOTE;
> - struct iommu_fault_page_request *prm = >fault.prm;
> - enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
> -
> - if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
> - return status;
> -
> - mm = iommu_sva_find(prm->pasid);
> - if (IS_ERR_OR_NULL(mm))
> - return status;
> -
> - mmap_read_lock(mm);
> -
> - vma = find_extend_vma(mm, prm->addr);
> - if (!vma)
> - /* Unmapped area */
> - goto out_put_mm;
> -
> - if (prm->perm & IOMMU_FAULT_PERM_READ)
> - access_flags |= VM_READ;
> -
> - if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
> - access_flags |= VM_WRITE;
> - fault_flags |= FAULT_FLAG_WRITE;
> - }
> -
> - if (prm->perm & IOMMU_FAULT_PERM_EXEC) {
> - access_flags |= VM_EXEC;
> - fault_flags |= FAULT_FLAG_INSTRUCTION;
> - }
> -
> - if (!(prm->perm & IOMMU_FAULT_PERM_PRIV))
> - fault_flags |= FAULT_FLAG_USER;
> -
> - if (access_flags & ~vma->vm_flags)
> - /* Access fault */
> - goto out_put_mm;
> -
> - ret = handle_mm_fault(vma, prm->addr, fault_flags, NULL);
> - status = ret & VM_FAULT_ERROR ? IOMMU_PAGE_RESP_INVALID :
> - IOMMU_PAGE_RESP_SUCCESS;
> -
> -out_put_mm:
> - mmap_read_unlock(mm);
> - mmput(mm);
> -
> - return status;
> -}
> -
>  static void iopf_handle_group(struct work_struct *work)
>  {
>   struct iopf_group *group;
> + struct iommu_domain *domain;
>   struct iopf_fault *iopf, *next;
>   enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS;
>  
>   group = container_of(work, struct iopf_group, work);
> + domain = iommu_get_domain_for_dev_pasid(group->dev,
> + group->last_fault.fault.prm.pasid);
> + if (!domain || !domain->iopf_handler)
> + status = IOMMU_PAGE_RESP_INVALID;
>  
>   list_for_each_entry_safe(iopf, next, >faults, list) {
>   /*
> @@ -139,7 +88,8 @@ static void iopf_handle_group(struct work_struct *work)
>* faults in the group if there is an error.
>*/
>   if (status == IOMMU_PAGE_RESP_SUCCESS)
> - status = iopf_handle_single(iopf);
> + status = domain->iopf_handler(>fault,
> +   domain->fault_data);
>  
>   if (!(iopf->fault.prm.flags &
> IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE))
> diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
> index 05a7d2f0e46f..ae3595d60f38 100644
> --- a/drivers/iommu/iommu-sva-lib.c
> +++ b/drivers/iommu/iommu-sva-lib.c
> @@ -69,26 +69,6 @@ static int iommu_sva_alloc_pasid(struct mm_s

Re: [PATCH v5 10/12] iommu: Prepare IOMMU domain for IOPF

2022-05-03 Thread Jean-Philippe Brucker
On Mon, May 02, 2022 at 09:48:40AM +0800, Lu Baolu wrote:
> This adds some mechanisms around the iommu_domain so that the I/O page
> fault handling framework could route a page fault to the domain and
> call the fault handler from it.
> 
> Add pointers to the page fault handler and its private data in struct
> iommu_domain. The fault handler will be called with the private data
> as a parameter once a page fault is routed to the domain. Any kernel
> component which owns an iommu domain could install handler and its
> private parameter so that the page fault could be further routed and
> handled.
> 
> A new helper iommu_get_domain_for_dev_pasid() which retrieves attached
> domain for a {device, PASID} is added. It will be used by the page fault
> handling framework which knows {device, PASID} reported from the iommu
> driver. We have a guarantee that the SVA domain doesn't go away during
> IOPF handling, because unbind() waits for pending faults with
> iopf_queue_flush_dev() before freeing the domain. Hence, there's no need
> to synchronize life cycle of the iommu domains between the unbind() and
> the interrupt threads.
> 
> This also prepares the SVA implementation to be the first consumer of
> the per-domain page fault handling model.
> 
> Signed-off-by: Lu Baolu 
> ---
>  include/linux/iommu.h | 12 +++
>  drivers/iommu/iommu-sva-lib.c | 65 +++
>  drivers/iommu/iommu.c | 21 +++
>  3 files changed, 98 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 19718939d9df..1164524814cb 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -102,6 +102,9 @@ struct iommu_domain {
>   struct iommu_domain_geometry geometry;
>   struct iommu_dma_cookie *iova_cookie;
>   struct iommu_sva_ioas *sva_ioas;
> + enum iommu_page_response_code (*iopf_handler)(struct iommu_fault *fault,
> +   void *data);
> + void *fault_data;
>  };
>  
>  static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
> @@ -686,6 +689,9 @@ int iommu_attach_device_pasid(struct iommu_domain *domain,
> struct device *dev, ioasid_t pasid);
>  void iommu_detach_device_pasid(struct iommu_domain *domain,
>  struct device *dev, ioasid_t pasid);
> +struct iommu_domain *
> +iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid);
> +
>  #else /* CONFIG_IOMMU_API */
>  
>  struct iommu_ops {};
> @@ -1055,6 +1061,12 @@ static inline void iommu_detach_device_pasid(struct 
> iommu_domain *domain,
>struct device *dev, ioasid_t pasid)
>  {
>  }
> +
> +static inline struct iommu_domain *
> +iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
> +{
> + return NULL;
> +}
>  #endif /* CONFIG_IOMMU_API */
>  
>  #ifdef CONFIG_IOMMU_SVA
> diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
> index 992388106da0..05a7d2f0e46f 100644
> --- a/drivers/iommu/iommu-sva-lib.c
> +++ b/drivers/iommu/iommu-sva-lib.c
> @@ -135,6 +135,69 @@ static void iommu_sva_ioas_put(struct iommu_sva_ioas 
> *ioas)
>   }
>  }
>  
> +/*
> + * I/O page fault handler for SVA
> + *
> + * Copied from io-pgfault.c with mmget_not_zero() added before
> + * mmap_read_lock().
> + */
> +static enum iommu_page_response_code
> +iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
> +{
> + vm_fault_t ret;
> + struct mm_struct *mm;
> + struct vm_area_struct *vma;
> + unsigned int access_flags = 0;
> + struct iommu_domain *domain = data;
> + unsigned int fault_flags = FAULT_FLAG_REMOTE;
> + struct iommu_fault_page_request *prm = >prm;
> + enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
> +
> + if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
> + return status;
> +
> + mm = iommu_sva_domain_mm(domain);
> + if (IS_ERR_OR_NULL(mm) || !mmget_not_zero(mm))
> + return status;
> +
> + mmap_read_lock(mm);
> +
> + vma = find_extend_vma(mm, prm->addr);
> + if (!vma)
> + /* Unmapped area */
> + goto out_put_mm;
> +
> + if (prm->perm & IOMMU_FAULT_PERM_READ)
> + access_flags |= VM_READ;
> +
> + if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
> + access_flags |= VM_WRITE;
> + fault_flags |= FAULT_FLAG_WRITE;
> + }
> +
> + if (prm->perm & IOMMU_FAULT_PERM_EXEC) {
> + access_flags |= VM_EXEC;
> + fault_flags |= FAULT_FLAG_INSTRUCTION;
> + }
> +
> + if (!(prm->perm & IOMMU_FAULT_PERM_PRIV))
> + fault_flags |= FAULT_FLAG_USER;
> +
> + if (access_flags & ~vma->vm_flags)
> + /* Access fault */
> + goto out_put_mm;
> +
> + ret = handle_mm_fault(vma, prm->addr, fault_flags, NULL);
> + status = ret & VM_FAULT_ERROR ? 

Re: [PATCH v5 09/12] iommu: Remove SVA related callbacks from iommu ops

2022-05-03 Thread Jean-Philippe Brucker
On Mon, May 02, 2022 at 09:48:39AM +0800, Lu Baolu wrote:
> These ops'es have been replaced with the dev_attach/detach_pasid domain
> ops'es. There's no need for them anymore. Remove them to avoid dead
> code.
> 
> Signed-off-by: Lu Baolu 

Reviewed-by: Jean-Philippe Brucker 

> ---
>  include/linux/intel-iommu.h   |  4 --
>  include/linux/iommu.h |  8 ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   | 17 ---
>  drivers/iommu/iommu-sva-lib.h |  1 -
>  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 41 
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  3 --
>  drivers/iommu/intel/iommu.c   |  3 --
>  drivers/iommu/intel/svm.c | 49 ---
>  drivers/iommu/iommu-sva-lib.c |  4 +-
>  9 files changed, 2 insertions(+), 128 deletions(-)
> 
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 3b4ca16f53e2..5af24befc9f1 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -738,10 +738,6 @@ struct intel_iommu *device_to_iommu(struct device *dev, 
> u8 *bus, u8 *devfn);
>  extern void intel_svm_check(struct intel_iommu *iommu);
>  extern int intel_svm_enable_prq(struct intel_iommu *iommu);
>  extern int intel_svm_finish_prq(struct intel_iommu *iommu);
> -struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm,
> -  void *drvdata);
> -void intel_svm_unbind(struct iommu_sva *handle);
> -u32 intel_svm_get_pasid(struct iommu_sva *handle);
>  int intel_svm_page_response(struct device *dev, struct iommu_fault_event 
> *evt,
>   struct iommu_page_response *msg);
>  extern const struct iommu_domain_ops intel_svm_domain_ops;
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index c5a16b47cae8..19718939d9df 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -214,9 +214,6 @@ struct iommu_iotlb_gather {
>   * @dev_has/enable/disable_feat: per device entries to check/enable/disable
>   *   iommu specific features.
>   * @dev_feat_enabled: check enabled feature
> - * @sva_bind: Bind process address space to device
> - * @sva_unbind: Unbind process address space from device
> - * @sva_get_pasid: Get PASID associated to a SVA handle
>   * @page_response: handle page request response
>   * @def_domain_type: device default domain type, return value:
>   *   - IOMMU_DOMAIN_IDENTITY: must use an identity domain
> @@ -250,11 +247,6 @@ struct iommu_ops {
>   int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
>   int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
>  
> - struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
> -   void *drvdata);
> - void (*sva_unbind)(struct iommu_sva *handle);
> - u32 (*sva_get_pasid)(struct iommu_sva *handle);
> -
>   int (*page_response)(struct device *dev,
>struct iommu_fault_event *evt,
>struct iommu_page_response *msg);
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 7631c00fdcbd..2513309ec0db 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -754,10 +754,6 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master 
> *master);
>  int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
>  int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
>  bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
> -struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm,
> - void *drvdata);
> -void arm_smmu_sva_unbind(struct iommu_sva *handle);
> -u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
>  void arm_smmu_sva_notifier_synchronize(void);
>  int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
> struct device *dev, ioasid_t id);
> @@ -794,19 +790,6 @@ static inline bool arm_smmu_master_iopf_supported(struct 
> arm_smmu_master *master
>   return false;
>  }
>  
> -static inline struct iommu_sva *
> -arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
> -{
> - return ERR_PTR(-ENODEV);
> -}
> -
> -static inline void arm_smmu_sva_unbind(struct iommu_sva *handle) {}
> -
> -static inline u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle)
> -{
> - return IOMMU_PASID_INVALID;
> -}
> -
>  static inline void arm_smmu_sva_noti

Re: [PATCH v5 07/12] arm-smmu-v3/sva: Add SVA domain support

2022-05-03 Thread Jean-Philippe Brucker
On Mon, May 02, 2022 at 09:48:37AM +0800, Lu Baolu wrote:
> Add support for SVA domain allocation and provide an SVA-specific
> iommu_domain_ops.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   | 14 +++
>  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 42 +++
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 21 ++
>  3 files changed, 77 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index cd48590ada30..7631c00fdcbd 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -759,6 +759,10 @@ struct iommu_sva *arm_smmu_sva_bind(struct device *dev, 
> struct mm_struct *mm,
>  void arm_smmu_sva_unbind(struct iommu_sva *handle);
>  u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
>  void arm_smmu_sva_notifier_synchronize(void);
> +int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
> +   struct device *dev, ioasid_t id);
> +void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
> +struct device *dev, ioasid_t id);
>  #else /* CONFIG_ARM_SMMU_V3_SVA */
>  static inline bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
>  {
> @@ -804,5 +808,15 @@ static inline u32 arm_smmu_sva_get_pasid(struct 
> iommu_sva *handle)
>  }
>  
>  static inline void arm_smmu_sva_notifier_synchronize(void) {}
> +
> +static inline int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
> + struct device *dev, ioasid_t id)
> +{
> + return -ENODEV;
> +}
> +
> +static inline void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
> +  struct device *dev,
> +  ioasid_t id) {}
>  #endif /* CONFIG_ARM_SMMU_V3_SVA */
>  #endif /* _ARM_SMMU_V3_H */
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index c623dae1e115..3b843cd3ed67 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -541,3 +541,45 @@ void arm_smmu_sva_notifier_synchronize(void)
>*/
>   mmu_notifier_synchronize();
>  }
> +
> +int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
> +   struct device *dev, ioasid_t id)
> +{
> + int ret = 0;
> + struct iommu_sva *handle;
> + struct mm_struct *mm = iommu_sva_domain_mm(domain);
> +
> + if (domain->type != IOMMU_DOMAIN_SVA || !mm)

We wouldn't get that far with a non-SVA domain since iommu_sva_domain_mm()
would dereference a NULL pointer. Could you move it after the domain->type
check, and maybe add a WARN_ON()?  It could help catch issues in future
API changes.

> + return -EINVAL;
> +
> + mutex_lock(_lock);
> + handle = __arm_smmu_sva_bind(dev, mm);
> + if (IS_ERR(handle))
> + ret = PTR_ERR(handle);
> + mutex_unlock(_lock);
> +
> + return ret;
> +}
> +
> +void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
> +struct device *dev, ioasid_t id)
> +{
> + struct arm_smmu_bond *bond = NULL, *t;
> + struct mm_struct *mm = iommu_sva_domain_mm(domain);
> + struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> +
> + mutex_lock(_lock);
> + list_for_each_entry(t, >bonds, list) {
> + if (t->mm == mm) {
> + bond = t;
> + break;
> + }
> + }
> +
> + if (!WARN_ON(!bond) && refcount_dec_and_test(>refs)) {
> + list_del(>list);
> + arm_smmu_mmu_notifier_put(bond->smmu_mn);
> + kfree(bond);
> + }
> + mutex_unlock(_lock);
> +}
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index afc63fce6107..bd80de0bad98 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1995,10 +1995,31 @@ static bool arm_smmu_capable(enum iommu_cap cap)
>   }
>  }
>  
> +static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
> +{
> + kfree(domain);
> +}
> +
> +static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
> + .attach_dev_pasid   = arm_smmu_sva_attach_dev_pasid,
> + .detach_dev_pasid   = arm_smmu_sva_detach_dev_pasid,
> + .free   = arm_smmu_sva_domain_free,
> +};
> +
>  static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
>  {
>   struct arm_smmu_domain *smmu_domain;
>  
> + if (type == IOMMU_DOMAIN_SVA) {
> + struct iommu_domain *domain;
> +
> + domain = kzalloc(sizeof(*domain), GFP_KERNEL);
> + if (domain)
> + domain->ops = 

Re: [PATCH v5 04/12] iommu/sva: Basic data structures for SVA

2022-05-03 Thread Jean-Philippe Brucker
On Mon, May 02, 2022 at 09:48:34AM +0800, Lu Baolu wrote:
> Use below data structures for SVA implementation in the IOMMU core:
> 
> - struct iommu_sva_ioas
>   Represent the I/O address space shared with an application CPU address
>   space. This structure has a 1:1 relationship with an mm_struct. It
>   grabs a "mm->mm_count" refcount during creation and drop it on release.

Do we actually need this structure?  At the moment it only keeps track of
bonds, which we can move to struct dev_iommu. Replacing it by a mm pointer
in struct iommu_domain simplifies the driver and seems to work

Thanks,
Jean

> 
> - struct iommu_domain (IOMMU_DOMAIN_SVA type)
>   Represent a hardware pagetable that the IOMMU hardware could use for
>   SVA translation. Multiple iommu domains could be bound with an SVA ioas
>   and each grabs a refcount from ioas in order to make sure ioas could
>   only be freed after all domains have been unbound.
> 
> - struct iommu_sva
>   Represent a bond relationship between an SVA ioas and an iommu domain.
>   If a bond already exists, it's reused and a reference is taken.
> 
> Signed-off-by: Lu Baolu 
> ---
>  include/linux/iommu.h | 14 +-
>  drivers/iommu/iommu-sva-lib.h |  1 +
>  drivers/iommu/iommu-sva-lib.c | 18 ++
>  3 files changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index ab36244d4e94..f582f434c513 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -42,6 +42,7 @@ struct notifier_block;
>  struct iommu_sva;
>  struct iommu_fault_event;
>  struct iommu_dma_cookie;
> +struct iommu_sva_ioas;
>  
>  /* iommu fault flags */
>  #define IOMMU_FAULT_READ 0x0
> @@ -64,6 +65,9 @@ struct iommu_domain_geometry {
>  #define __IOMMU_DOMAIN_PT(1U << 2)  /* Domain is identity mapped   */
>  #define __IOMMU_DOMAIN_DMA_FQ(1U << 3)  /* DMA-API uses flush queue  
>   */
>  
> +#define __IOMMU_DOMAIN_SHARED(1U << 4)  /* Page table shared from 
> CPU  */
> +#define __IOMMU_DOMAIN_HOST_VA   (1U << 5)  /* Host CPU virtual address 
> */
> +
>  /*
>   * This are the possible domain-types
>   *
> @@ -86,6 +90,8 @@ struct iommu_domain_geometry {
>  #define IOMMU_DOMAIN_DMA_FQ  (__IOMMU_DOMAIN_PAGING |\
>__IOMMU_DOMAIN_DMA_API |   \
>__IOMMU_DOMAIN_DMA_FQ)
> +#define IOMMU_DOMAIN_SVA (__IOMMU_DOMAIN_SHARED |\
> +  __IOMMU_DOMAIN_HOST_VA)
>  
>  struct iommu_domain {
>   unsigned type;
> @@ -95,6 +101,7 @@ struct iommu_domain {
>   void *handler_token;
>   struct iommu_domain_geometry geometry;
>   struct iommu_dma_cookie *iova_cookie;
> + struct iommu_sva_ioas *sva_ioas;
>  };
>  
>  static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
> @@ -628,7 +635,12 @@ struct iommu_fwspec {
>   * struct iommu_sva - handle to a device-mm bond
>   */
>  struct iommu_sva {
> - struct device   *dev;
> + struct device   *dev;
> + struct iommu_sva_ioas   *sva_ioas;
> + struct iommu_domain *domain;
> + /* Link to sva ioas's bonds list */
> + struct list_headnode;
> + refcount_t  users;
>  };
>  
>  int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
> diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
> index 8909ea1094e3..9c5e108e2c8a 100644
> --- a/drivers/iommu/iommu-sva-lib.h
> +++ b/drivers/iommu/iommu-sva-lib.h
> @@ -10,6 +10,7 @@
>  
>  int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max);
>  struct mm_struct *iommu_sva_find(ioasid_t pasid);
> +struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain);
>  
>  /* I/O Page fault */
>  struct device;
> diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
> index 106506143896..d524a402be3b 100644
> --- a/drivers/iommu/iommu-sva-lib.c
> +++ b/drivers/iommu/iommu-sva-lib.c
> @@ -3,6 +3,8 @@
>   * Helpers for IOMMU drivers implementing SVA
>   */
>  #include 
> +#include 
> +#include 
>  #include 
>  
>  #include "iommu-sva-lib.h"
> @@ -10,6 +12,22 @@
>  static DEFINE_MUTEX(iommu_sva_lock);
>  static DECLARE_IOASID_SET(iommu_sva_pasid);
>  
> +struct iommu_sva_ioas {
> + struct mm_struct *mm;
> + ioasid_t pasid;
> +
> + /* Counter of domains attached to this ioas. */
> + refcount_t users;
> +
> + /* All bindings are linked here. */
> + struct list_head bonds;
> +};
> +
> +struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain)
> +{
> + return domain->sva_ioas->mm;
> +}
> +
>  /**
>   * iommu_sva_alloc_pasid - Allocate a PASID for the mm
>   * @mm: the mm
> -- 
> 2.25.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 03/12] iommu: Add attach/detach_dev_pasid domain ops

2022-05-03 Thread Jean-Philippe Brucker
On Mon, May 02, 2022 at 09:48:33AM +0800, Lu Baolu wrote:
> Attaching an IOMMU domain to a PASID of a device is a generic operation
> for modern IOMMU drivers which support PASID-granular DMA address
> translation. Currently visible usage scenarios include (but not limited):
> 
>  - SVA (Shared Virtual Address)
>  - kernel DMA with PASID
>  - hardware-assist mediated device
> 
> This adds a pair of common domain ops for this purpose and adds helpers
> to attach/detach a domain to/from a {device, PASID}. Some buses, like
> PCI, route packets without considering the PASID value. Thus a DMA target
> address with PASID might be treated as P2P if the address falls into the
> MMIO BAR of other devices in the group. To make things simple, these
> interfaces only apply to devices belonging to the singleton groups, and
> the singleton is immutable in fabric i.e. not affected by hotplug.
> 
> Signed-off-by: Lu Baolu 

Reviewed-by: Jean-Philippe Brucker 

just a nit below

> ---
>  include/linux/iommu.h | 21 
>  drivers/iommu/iommu.c | 76 +++
>  2 files changed, 97 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index b8ffaf2cb1d0..ab36244d4e94 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -263,6 +263,8 @@ struct iommu_ops {
>   * struct iommu_domain_ops - domain specific operations
>   * @attach_dev: attach an iommu domain to a device
>   * @detach_dev: detach an iommu domain from a device
> + * @attach_dev_pasid: attach an iommu domain to a pasid of device
> + * @detach_dev_pasid: detach an iommu domain from a pasid of device
>   * @map: map a physically contiguous memory region to an iommu domain
>   * @map_pages: map a physically contiguous set of pages of the same size to
>   * an iommu domain.
> @@ -283,6 +285,10 @@ struct iommu_ops {
>  struct iommu_domain_ops {
>   int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
>   void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
> + int (*attach_dev_pasid)(struct iommu_domain *domain,
> + struct device *dev, ioasid_t pasid);
> + void (*detach_dev_pasid)(struct iommu_domain *domain,
> +  struct device *dev, ioasid_t pasid);
>  
>   int (*map)(struct iommu_domain *domain, unsigned long iova,
>  phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
> @@ -678,6 +684,10 @@ int iommu_group_claim_dma_owner(struct iommu_group 
> *group, void *owner);
>  void iommu_group_release_dma_owner(struct iommu_group *group);
>  bool iommu_group_dma_owner_claimed(struct iommu_group *group);
>  
> +int iommu_attach_device_pasid(struct iommu_domain *domain,
> +   struct device *dev, ioasid_t pasid);
> +void iommu_detach_device_pasid(struct iommu_domain *domain,
> +struct device *dev, ioasid_t pasid);
>  #else /* CONFIG_IOMMU_API */
>  
>  struct iommu_ops {};
> @@ -1051,6 +1061,17 @@ static inline bool 
> iommu_group_dma_owner_claimed(struct iommu_group *group)
>  {
>   return false;
>  }
> +
> +static inline int iommu_attach_device_pasid(struct iommu_domain *domain,
> + struct device *dev, ioasid_t pasid)
> +{
> + return -ENODEV;
> +}
> +
> +static inline void iommu_detach_device_pasid(struct iommu_domain *domain,
> +  struct device *dev, ioasid_t pasid)
> +{
> +}
>  #endif /* CONFIG_IOMMU_API */
>  
>  /**
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 29906bc16371..89c9d19ddb28 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -38,6 +38,7 @@ struct iommu_group {
>   struct kobject kobj;
>   struct kobject *devices_kobj;
>   struct list_head devices;
> + struct xarray pasid_array;
>   struct mutex mutex;
>   void *iommu_data;
>   void (*iommu_data_release)(void *iommu_data);
> @@ -630,6 +631,7 @@ struct iommu_group *iommu_group_alloc(void)
>   mutex_init(>mutex);
>   INIT_LIST_HEAD(>devices);
>   INIT_LIST_HEAD(>entry);
> + xa_init(>pasid_array);
>  
>   ret = ida_simple_get(_group_ida, 0, 0, GFP_KERNEL);
>   if (ret < 0) {
> @@ -3190,3 +3192,77 @@ bool iommu_group_dma_owner_claimed(struct iommu_group 
> *group)
>   return user;
>  }
>  EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
> +
> +/*
> + * Use standard PCI bus topology and isolation features to check immutable
> + * singleton. Otherwise, assume the bus is static and then singleton can
> + * know 

Re: [PATCH v5 02/12] iommu: Add pasid_bits field in struct dev_iommu

2022-05-03 Thread Jean-Philippe Brucker
On Mon, May 02, 2022 at 09:48:32AM +0800, Lu Baolu wrote:
> Use this field to save the pasid/ssid bits that a device is able to
> support with its IOMMU hardware. It is a generic attribute of a device
> and lifting it into the per-device dev_iommu struct makes it possible
> to allocate a PASID for device without calls into the IOMMU drivers.
> Any iommu driver which suports PASID related features should set this
> field before features are enabled on the devices.
> 
> For initialization of this field in the VT-d driver, the
> info->pasid_supported is only set for PCI devices. So the status is
> that non-PCI SVA hasn't been supported yet. Setting this field only for
> PCI devices has no functional change.
> 
> Signed-off-by: Lu Baolu 

Reviewed-by: Jean-Philippe Brucker 

> ---
>  include/linux/iommu.h   | 1 +
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 ++
>  drivers/iommu/intel/iommu.c | 5 -
>  3 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 5e1afe169549..b8ffaf2cb1d0 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -373,6 +373,7 @@ struct dev_iommu {
>   struct iommu_fwspec *fwspec;
>   struct iommu_device *iommu_dev;
>   void*priv;
> + unsigned intpasid_bits;
>  };
>  
>  int iommu_device_register(struct iommu_device *iommu,
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 627a3ed5ee8f..afc63fce6107 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2681,6 +2681,8 @@ static struct iommu_device 
> *arm_smmu_probe_device(struct device *dev)
>   smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
>   master->stall_enabled = true;
>  
> + dev->iommu->pasid_bits = master->ssid_bits;
> +
>   return >iommu;
>  
>  err_free_master:
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index cf43e8f9091b..170eb777d57b 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4611,8 +4611,11 @@ static struct iommu_device 
> *intel_iommu_probe_device(struct device *dev)
>   if (pasid_supported(iommu)) {
>   int features = pci_pasid_features(pdev);
>  
> - if (features >= 0)
> + if (features >= 0) {
>   info->pasid_supported = features | 1;
> + dev->iommu->pasid_bits =
> + fls(pci_max_pasids(pdev)) - 1;
> + }
>   }
>  
>   if (info->ats_supported && ecap_prs(iommu->ecap) &&
> -- 
> 2.25.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-05-03 Thread Jean-Philippe Brucker
On Sat, Apr 30, 2022 at 03:33:17PM +0800, Baolu Lu wrote:
> Jean, another quick question about the iommu_sva_bind_device()
> 
> /**
>  * iommu_sva_bind_device() - Bind a process address space to a device
>  * @dev: the device
>  * @mm: the mm to bind, caller must hold a reference to it
>  * @drvdata: opaque data pointer to pass to bind callback
> 
> This interface requires the caller to take a reference to mm. Which
> reference should it take, mm->mm_count or mm->mm_users? It's better to
> make it explicit in this comment.

Agreed, it's mm_users as required by mmu_notifier_register()

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-29 Thread Jean-Philippe Brucker
On Fri, Apr 29, 2022 at 06:51:17AM -0700, Fenghua Yu wrote:
> Hi, Baolu,
> 
> On Fri, Apr 29, 2022 at 03:53:57PM +0800, Baolu Lu wrote:
> > On 2022/4/28 16:39, Jean-Philippe Brucker wrote:
> > > > The address space is what the OOM killer is after.  That gets refcounted
> > > > with mmget()/mmput()/mm->mm_users.  The OOM killer is satiated by the
> > > > page freeing done in __mmput()->exit_mmap().
> > > > 
> > > > Also, all the VMAs should be gone after exit_mmap().  So, even if
> > > > vma->vm_file was holding a reference to a device driver, that reference
> > > > should be gone by the time __mmdrop() is actually freeing the PASID.
> > > 
> > > I agree with all that. The concern was about tearing down the PASID in the
> > > IOMMU and device from the release() MMU notifier, which would happen in
> > > exit_mmap(). But doing the teardown at or before __mmdrop() is fine. And
> > > since the IOMMU drivers need to hold mm->mm_count anyway between bind()
> > > and unbind(), I think Fenghua's fix works.
> > 
> > But I didn't find mmgrab()/mmdrop() get called in both arm and intel
> > IOMMU drivers.
> > 
> > $ git grep mmgrab drivers/iommu/
> > [no output]
> > 
> > Do we need to add these in a separated fix patch, or I missed anything
> > here?
> 
> On both ARM and X86, sva_bind() calls mmu_notifier_register()->mmgrab() and
> sva_unbind() calls mmu_notifier_unregister()/mmu_notifier_put()->mmdrop().

Yes, although for Arm I realized the mmu_notifier grab wasn't sufficient
so I sent a separate fix that should go in 5.18 as well
https://lore.kernel.org/linux-iommu/20220426130444.300556-1-jean-phili...@linaro.org/
The Arm driver still touches the arch mm context after mmu_notifier_put().
I don't think X86 has that problem.

Thanks,
Jean

> So mm->mm_count are already counted in existing ARM and X86 binding and
> unbinding. The fix patch just frees the PASID in __mmdrop() after
> no more mm->mm_count.
> 
> There is no need to add extra  mmgrab() and mmdrop() pair.
> 
> Thanks.
> 
> -Fenghua
> 
>  
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-28 Thread Jean-Philippe Brucker
On Thu, Apr 28, 2022 at 08:09:04AM -0700, Dave Hansen wrote:
> On 4/25/22 21:20, Fenghua Yu wrote:
> >>From 84aa68f6174439d863c40cdc2db0e1b89d620dd0 Mon Sep 17 00:00:00 2001
> > From: Fenghua Yu 
> > Date: Fri, 15 Apr 2022 00:51:33 -0700
> > Subject: [PATCH] iommu/sva: Fix PASID use-after-free issue
> > 
> > A PASID might be still used on ARM after it is freed in __mmput().
> 
> Is it really just ARM?
> 
> > process:
> > open()->sva_bind()->ioasid_alloc() = N; // Get PASID N for the mm
> > exit();
> > exit_mm()->__mmput()->mm_pasid_drop()->mm->pasid = -1; // PASID -1
> > exit_files()->release(dev)->sva_unbind()->use mm->pasid; // Failure
> > 
> > To avoid the use-after-free issue, free the PASID after no device uses it,
> > i.e. after all devices are unbound from the mm.
> > 
> > sva_bind()/sva_unbind() call mmgrab()/mmdrop() to track mm->mm_count.
> > __mmdrop() is called only after mm->mm_count is zero. So freeing the PASID
> > in __mmdrop() guarantees the PASID is safely freed only after no device
> > is bound to the mm.
> 
> Does this changelog work for everyone?
> 
> ==
> 
> tl;dr: The PASID is being freed too early.  It needs to stay around
> until after device drivers that might be using it have had a chance to
> clear it out of the hardware.
> 
> --
> 
> As a reminder:
> 
> mmget() /mmput()  refcount the mm's address space
> mmgrab()/mmdrop() refcount the mm itself
> 
> The PASID is currently tied to the life of the mm's address space and
> freed in __mmput().  This makes logical sense because the PASID can't be
> used once the address space is gone.
> 
> But, this misses an important point: even after the address space is
> gone, the PASID will still be programmed into a device.  Device drivers
> might, for instance, still need to flush operations that are outstanding
> and need to use that PASID.  They do this at ->release() time.

It's not really clear which release() this is. For us it's file descriptor
release() (not MMU notifier release(), which is how I initially understood
this sentence)

> 
> Device drivers hold a reference on the mm itself and drop it at
> ->release() time.  But, the device driver holds a reference mm itself,

"to the mm"

(To be pendantic it's the IOMMU driver that holds this reference, and
the device driver calls the IOMMU driver to release it, but the idea is
the same.)

> not the address space.  The address space (and the PASID) is long gone
> by the time the driver tries to clean up.  This is effectively a
> use-after-free bug on the PASID.
> 
> To fix this, move the PASID free operation from __mmput() to __mmdrop().
>  This ensures that the device drivers' existing mmgrab() keeps the PASID
> allocated until they drop their mm reference.

Good changelog otherwise

Thanks,
Jean

> 
> >  kernel/fork.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 9796897560ab..35a3beff140b 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -792,6 +792,7 @@ void __mmdrop(struct mm_struct *mm)
> > mmu_notifier_subscriptions_destroy(mm);
> > check_mm(mm);
> > put_user_ns(mm->user_ns);
> > +   mm_pasid_drop(mm);
> > free_mm(mm);
> >  }
> >  EXPORT_SYMBOL_GPL(__mmdrop);
> > @@ -1190,7 +1191,6 @@ static inline void __mmput(struct mm_struct *mm)
> > }
> > if (mm->binfmt)
> > module_put(mm->binfmt->module);
> > -   mm_pasid_drop(mm);
> > mmdrop(mm);
> >  }
> >  
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 11/12] iommu: Per-domain I/O page fault handling

2022-04-28 Thread Jean-Philippe Brucker
On Thu, Apr 21, 2022 at 01:21:20PM +0800, Lu Baolu wrote:
>  static void iopf_handle_group(struct work_struct *work)
>  {
>   struct iopf_group *group;
> @@ -134,12 +78,23 @@ static void iopf_handle_group(struct work_struct *work)
>   group = container_of(work, struct iopf_group, work);
>  
>   list_for_each_entry_safe(iopf, next, >faults, list) {
> + struct iommu_domain *domain;
> +
> + domain = iommu_get_domain_for_dev_pasid_async(group->dev,
> + iopf->fault.prm.pasid);

Reading the PCIe spec again (v6.0 10.4.1.1 PASID Usage), all faults within
the group have the same PASID so we could move the domain fetch out of the
loop. It does deviate from the old behavior, though, so we could change
it later.

Thanks,
Jean

> + if (!domain || !domain->iopf_handler)
> + status = IOMMU_PAGE_RESP_INVALID;
> +
>   /*
>* For the moment, errors are sticky: don't handle subsequent
>* faults in the group if there is an error.
>*/
>   if (status == IOMMU_PAGE_RESP_SUCCESS)
> - status = iopf_handle_single(iopf);
> + status = domain->iopf_handler(>fault,
> +   domain->fault_data);
> +
> + if (domain)
> + iommu_domain_put_async(domain);
>  
>   if (!(iopf->fault.prm.flags &
> IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE))
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 03/12] iommu: Add attach/detach_dev_pasid domain ops

2022-04-28 Thread Jean-Philippe Brucker
On Thu, Apr 21, 2022 at 01:21:12PM +0800, Lu Baolu wrote:
> Attaching an IOMMU domain to a PASID of a device is a generic operation
> for modern IOMMU drivers which support PASID-granular DMA address
> translation. Currently visible usage scenarios include (but not limited):
> 
>  - SVA (Shared Virtual Address)
>  - kernel DMA with PASID
>  - hardware-assist mediated device
> 
> This adds a pair of common domain ops for this purpose and adds helpers
> to attach/detach a domain to/from a {device, PASID}. Some buses, like
> PCI, route packets without considering the PASID value. Thus a DMA target
> address with PASID might be treated as P2P if the address falls into the
> MMIO BAR of other devices in the group. To make things simple, these
> interfaces only apply to devices belonging to the singleton groups, and
> the singleton is immutable in fabric i.e. not affected by hotplug.
> 
> Signed-off-by: Lu Baolu 
[...]
> +/*
> + * Use standard PCI bus topology, isolation features, and DMA
> + * alias quirks to check the immutable singleton attribute. If
> + * the device came from DT, assume it is static and then
> + * singleton can know from the device count in the group.
> + */
> +static bool device_group_immutable_singleton(struct device *dev)
> +{
> + struct iommu_group *group = iommu_group_get(dev);
> + int count;
> +
> + if (!group)
> + return false;
> +
> + mutex_lock(>mutex);
> + count = iommu_group_device_count(group);
> + mutex_unlock(>mutex);
> + iommu_group_put(group);
> +
> + if (count != 1)
> + return false;
> +
> + if (dev_is_pci(dev)) {
> + struct pci_dev *pdev = to_pci_dev(dev);
> +
> + /*
> +  * The device could be considered to be fully isolated if
> +  * all devices on the path from the device to the host-PCI
> +  * bridge are protected from peer-to-peer DMA by ACS.
> +  */
> + if (!pci_acs_path_enabled(pdev, NULL, REQ_ACS_FLAGS))
> + return false;
> +
> + /* Filter out devices which has any alias device. */
> + if (pci_for_each_dma_alias(pdev, has_pci_alias, pdev))
> + return false;

Aren't aliases already added to the group by pci_device_group()?  If so
it's part of the count check above

> +
> + return true;
> + }
> +
> + /*
> +  * If the device came from DT, assume it is static and then
> +  * singleton can know from the device count in the group.
> +  */
> + return is_of_node(dev_fwnode(dev));

I don't think DT is relevant here because a platform device enumerated
through ACPI will also have its own group. It should be safe to stick to
what the IOMMU drivers declare with their device_group() callback. Except
for PCI those groups should be immutable so we can return true here.

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 10/12] iommu: Prepare IOMMU domain for IOPF

2022-04-28 Thread Jean-Philippe Brucker
Hi Baolu,

On Thu, Apr 21, 2022 at 01:21:19PM +0800, Lu Baolu wrote:
> +/*
> + * Get the attached domain for asynchronous usage, for example the I/O
> + * page fault handling framework. The caller get a reference counter
> + * of the domain automatically on a successful return and should put
> + * it with iommu_domain_put() after usage.
> + */
> +struct iommu_domain *
> +iommu_get_domain_for_dev_pasid_async(struct device *dev, ioasid_t pasid)
> +{
> + struct iommu_domain *domain;
> + struct iommu_group *group;
> +
> + if (!pasid_valid(pasid))
> + return NULL;
> +
> + group = iommu_group_get(dev);
> + if (!group)
> + return NULL;
> +
> + mutex_lock(>mutex);

There is a possible deadlock between unbind() and the fault handler:

 unbind()iopf_handle_group()
  mutex_lock(>mutex)
  iommu_detach_device_pasid()
   iopf_queue_flush_dev() iommu_get_domain_for_dev_pasid_async()
... waits for IOPF workmutex_lock(>mutex)

I was wrong in my previous review: we do have a guarantee that the SVA
domain does not go away during IOPF handling, because unbind() waits for
pending faults with iopf_queue_flush_dev() before freeing the domain (or
for Arm stall, knows that there are no pending faults). So we can just get
rid of domain->async_users and the group->mutex in IOPF, I think?

Thanks,
Jean

> + domain = xa_load(>pasid_array, pasid);
> + if (domain)
> + refcount_inc(>async_users);
> + mutex_unlock(>mutex);
> + iommu_group_put(group);
> +
> + return domain;
> +}
> -- 
> 2.25.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-28 Thread Jean-Philippe Brucker
On Wed, Apr 27, 2022 at 05:54:57PM -0700, Fenghua Yu wrote:
> Hi, Dave and Jean,
> 
> On Tue, Apr 26, 2022 at 01:04:45PM +0800, Zhangfei Gao wrote:
> > 
> > 
> > On 2022/4/26 下午12:20, Fenghua Yu wrote:
> > > Hi, Jean and Zhangfei,
> > > 
> > > On Mon, Apr 25, 2022 at 05:13:02PM +0100, Jean-Philippe Brucker wrote:
> > > > Could we move mm_pasid_drop() to __mmdrop() instead of __mmput()?  For 
> > > > Arm
> > > > we do need to hold the mm_count until unbind(), and mmgrab()/mmdrop() is
> > > > also part of Lu's rework [1].
> > > Is this a right fix for the issue? Could you please test it on ARM?
> > > I don't have an ARM machine.
> > > 
> > > Thanks.
> > > 
> > > -Fenghua
> > > 
> > >  From 84aa68f6174439d863c40cdc2db0e1b89d620dd0 Mon Sep 17 00:00:00 2001
> > > From: Fenghua Yu 
> > > Date: Fri, 15 Apr 2022 00:51:33 -0700
> > > Subject: [PATCH] iommu/sva: Fix PASID use-after-free issue
> > > 
> > > A PASID might be still used on ARM after it is freed in __mmput().
> > > 
> > > process:
> > >   open()->sva_bind()->ioasid_alloc() = N; // Get PASID N for the mm
> > >   exit();
> > >   exit_mm()->__mmput()->mm_pasid_drop()->mm->pasid = -1; // PASID -1
> > >   exit_files()->release(dev)->sva_unbind()->use mm->pasid; // Failure
> > > 
> > > To avoid the use-after-free issue, free the PASID after no device uses it,
> > > i.e. after all devices are unbound from the mm.
> > > 
> > > sva_bind()/sva_unbind() call mmgrab()/mmdrop() to track mm->mm_count.
> > > __mmdrop() is called only after mm->mm_count is zero. So freeing the PASID
> > > in __mmdrop() guarantees the PASID is safely freed only after no device
> > > is bound to the mm.
> > > 
> > > Fixes: 701fac40384f ("iommu/sva: Assign a PASID to mm on PASID allocation 
> > > and free it on mm exit")
> > > 
> > > Reported-by: Zhangfei Gao 
> > > Suggested-by: Jean-Philippe Brucker 
> > > Suggested-by: Jacob Pan 
> > > Signed-off-by: Fenghua Yu 
> > Thanks for the fix.
> > 
> > Tested-by: Zhangfei Gao 
> > 
> > 
> > > ---
> > >   kernel/fork.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/fork.c b/kernel/fork.c
> > > index 9796897560ab..35a3beff140b 100644
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -792,6 +792,7 @@ void __mmdrop(struct mm_struct *mm)
> > >   mmu_notifier_subscriptions_destroy(mm);
> > >   check_mm(mm);
> > >   put_user_ns(mm->user_ns);
> > > + mm_pasid_drop(mm);
> > >   free_mm(mm);
> > >   }
> > >   EXPORT_SYMBOL_GPL(__mmdrop);
> > > @@ -1190,7 +1191,6 @@ static inline void __mmput(struct mm_struct *mm)
> > >   }
> > >   if (mm->binfmt)
> > >   module_put(mm->binfmt->module);
> > > - mm_pasid_drop(mm);
> > >   mmdrop(mm);
> > >   }
> > 
> 
> Is this patch a good fix? Will you help push the fix into upstream?

Yes, I think it's the right thing to do for now. Could you resend it
separately so it gets visibility from the maintainers?

Reviewed-by: Jean-Philippe Brucker 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-28 Thread Jean-Philippe Brucker
On Tue, Apr 26, 2022 at 04:31:57PM -0700, Dave Hansen wrote:
> On 4/26/22 09:48, Jean-Philippe Brucker wrote:
> > On Tue, Apr 26, 2022 at 08:27:00AM -0700, Dave Hansen wrote:
> >> On 4/25/22 09:40, Jean-Philippe Brucker wrote:
> >>> The problem is that we'd have to request the device driver to stop DMA
> >>> before we can destroy the context and free the PASID. We did consider
> >>> doing this in the release() MMU notifier, but there were concerns about
> >>> blocking mmput() for too long (for example
> >>> https://lore.kernel.org/linux-iommu/4d68da96-0ad5-b412-5987-2f7a6aa79...@amd.com/
> >>> though I think there was a more recent discussion). We also need to drain
> >>> the PRI and fault queues to get rid of all references to that PASID.
> >> Is the concern truly about blocking mmput() itself?  Or, is it about
> >> releasing the resources associated with the mm?
> > The latter I think, this one was about releasing pages as fast as possible
> > if the process is picked by the OOM killer. 
> 
> We're tying the PASID to the life of the mm itself, not the mm's address
> space.  That means the PASID should be tied to
> mmgrab()/mmdrop()/mm->mm_count.
> 
> The address space is what the OOM killer is after.  That gets refcounted
> with mmget()/mmput()/mm->mm_users.  The OOM killer is satiated by the
> page freeing done in __mmput()->exit_mmap().
> 
> Also, all the VMAs should be gone after exit_mmap().  So, even if
> vma->vm_file was holding a reference to a device driver, that reference
> should be gone by the time __mmdrop() is actually freeing the PASID.

I agree with all that. The concern was about tearing down the PASID in the
IOMMU and device from the release() MMU notifier, which would happen in
exit_mmap(). But doing the teardown at or before __mmdrop() is fine. And
since the IOMMU drivers need to hold mm->mm_count anyway between bind()
and unbind(), I think Fenghua's fix works.

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-26 Thread Jean-Philippe Brucker
On Tue, Apr 26, 2022 at 08:27:00AM -0700, Dave Hansen wrote:
> On 4/25/22 09:40, Jean-Philippe Brucker wrote:
> > The problem is that we'd have to request the device driver to stop DMA
> > before we can destroy the context and free the PASID. We did consider
> > doing this in the release() MMU notifier, but there were concerns about
> > blocking mmput() for too long (for example
> > https://lore.kernel.org/linux-iommu/4d68da96-0ad5-b412-5987-2f7a6aa79...@amd.com/
> > though I think there was a more recent discussion). We also need to drain
> > the PRI and fault queues to get rid of all references to that PASID.
> 
> Is the concern truly about blocking mmput() itself?  Or, is it about
> releasing the resources associated with the mm?

The latter I think, this one was about releasing pages as fast as possible
if the process is picked by the OOM killer. 

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2] iommu/arm-smmu-v3-sva: Fix mm use-after-free

2022-04-26 Thread Jean-Philippe Brucker
We currently call arm64_mm_context_put() without holding a reference to
the mm, which can result in use-after-free. Call mmgrab()/mmdrop() to
ensure the mm only gets freed after we unpinned the ASID.

Fixes: 32784a9562fb ("iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()")
Signed-off-by: Jean-Philippe Brucker 
---
v2: Add missing include
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 22ddd05bbdcd..5d029e87c8af 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "arm-smmu-v3.h"
@@ -96,9 +97,14 @@ static struct arm_smmu_ctx_desc 
*arm_smmu_alloc_shared_cd(struct mm_struct *mm)
struct arm_smmu_ctx_desc *cd;
struct arm_smmu_ctx_desc *ret = NULL;
 
+   /* Don't free the mm until we release the ASID */
+   mmgrab(mm);
+
asid = arm64_mm_context_get(mm);
-   if (!asid)
-   return ERR_PTR(-ESRCH);
+   if (!asid) {
+   err = -ESRCH;
+   goto out_drop_mm;
+   }
 
cd = kzalloc(sizeof(*cd), GFP_KERNEL);
if (!cd) {
@@ -165,6 +171,8 @@ static struct arm_smmu_ctx_desc 
*arm_smmu_alloc_shared_cd(struct mm_struct *mm)
kfree(cd);
 out_put_context:
arm64_mm_context_put(mm);
+out_drop_mm:
+   mmdrop(mm);
return err < 0 ? ERR_PTR(err) : ret;
 }
 
@@ -173,6 +181,7 @@ static void arm_smmu_free_shared_cd(struct 
arm_smmu_ctx_desc *cd)
if (arm_smmu_free_asid(cd)) {
/* Unpin ASID */
arm64_mm_context_put(cd->mm);
+   mmdrop(cd->mm);
kfree(cd);
}
 }
-- 
2.35.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu-v3-sva: Fix mm use-after-free

2022-04-26 Thread Jean-Philippe Brucker
On Tue, Apr 26, 2022 at 08:20:52PM +0800, zhangfei@foxmail.com wrote:
> Hi, Jean
> 
> On 2022/4/26 下午6:04, Jean-Philippe Brucker wrote:
> > We currently call arm64_mm_context_put() without holding a reference to
> > the mm, which can result in use-after-free. Call mmgrab()/mmdrop() to
> > ensure the mm only gets freed after we unpinned the ASID.
> > 
> > Fixes: 32784a9562fb ("iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()")
> > Signed-off-by: Jean-Philippe Brucker 
> > ---
> >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 12 ++--
> >   1 file changed, 10 insertions(+), 2 deletions(-)
> Missing +#include  for compile.

Ah thanks, I sent the wrong version.

> We still need the fix move mm_pasid_drop from __mmput to __mmdrop, right?

Yes, that's Fenghua's patch

Thanks,
Jean

> 
> 1. Test OK with the mm_pasid_drop patch.
> 
> 2. Test fail if revert the mm_pasid_drop patch,
> uacce_fops_release
> Unable to handle kernel paging request at virtual address 00083cc6ffc0
> 
> By the way, we use mmgrab in bind and mmput in unbind before,
> then the fops_release is not be called if exit without close(fd).
> 
> This patch does not have this issue, still not understand why.
> 
> Thanks
> 
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
> > b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> > index 582114f94da0..c93477a2d7f1 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> > @@ -98,9 +98,14 @@ static struct arm_smmu_ctx_desc 
> > *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
> > struct arm_smmu_ctx_desc *cd;
> > struct arm_smmu_ctx_desc *ret = NULL;
> > +   /* Don't free the mm until we release the ASID */
> > +   mmgrab(mm);
> > +
> > asid = arm64_mm_context_get(mm);
> > -   if (!asid)
> > -   return ERR_PTR(-ESRCH);
> > +   if (!asid) {
> > +   err = -ESRCH;
> > +   goto out_drop_mm;
> > +   }
> > cd = kzalloc(sizeof(*cd), GFP_KERNEL);
> > if (!cd) {
> > @@ -167,6 +172,8 @@ static struct arm_smmu_ctx_desc 
> > *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
> > kfree(cd);
> >   out_put_context:
> > arm64_mm_context_put(mm);
> > +out_drop_mm:
> > +   mmdrop(mm);
> > return err < 0 ? ERR_PTR(err) : ret;
> >   }
> > @@ -175,6 +182,7 @@ static void arm_smmu_free_shared_cd(struct 
> > arm_smmu_ctx_desc *cd)
> > if (arm_smmu_free_asid(cd)) {
> > /* Unpin ASID */
> > arm64_mm_context_put(cd->mm);
> > +   mmdrop(cd->mm);
> > kfree(cd);
> > }
> >   }
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH] iommu/arm-smmu-v3-sva: Fix mm use-after-free

2022-04-26 Thread Jean-Philippe Brucker
We currently call arm64_mm_context_put() without holding a reference to
the mm, which can result in use-after-free. Call mmgrab()/mmdrop() to
ensure the mm only gets freed after we unpinned the ASID.

Fixes: 32784a9562fb ("iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()")
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 582114f94da0..c93477a2d7f1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -98,9 +98,14 @@ static struct arm_smmu_ctx_desc 
*arm_smmu_alloc_shared_cd(struct mm_struct *mm)
struct arm_smmu_ctx_desc *cd;
struct arm_smmu_ctx_desc *ret = NULL;
 
+   /* Don't free the mm until we release the ASID */
+   mmgrab(mm);
+
asid = arm64_mm_context_get(mm);
-   if (!asid)
-   return ERR_PTR(-ESRCH);
+   if (!asid) {
+   err = -ESRCH;
+   goto out_drop_mm;
+   }
 
cd = kzalloc(sizeof(*cd), GFP_KERNEL);
if (!cd) {
@@ -167,6 +172,8 @@ static struct arm_smmu_ctx_desc 
*arm_smmu_alloc_shared_cd(struct mm_struct *mm)
kfree(cd);
 out_put_context:
arm64_mm_context_put(mm);
+out_drop_mm:
+   mmdrop(mm);
return err < 0 ? ERR_PTR(err) : ret;
 }
 
@@ -175,6 +182,7 @@ static void arm_smmu_free_shared_cd(struct 
arm_smmu_ctx_desc *cd)
if (arm_smmu_free_asid(cd)) {
/* Unpin ASID */
arm64_mm_context_put(cd->mm);
+   mmdrop(cd->mm);
kfree(cd);
}
 }
-- 
2.35.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-25 Thread Jean-Philippe Brucker
On Mon, Apr 25, 2022 at 08:55:46AM -0700, Dave Hansen wrote:
> On 4/25/22 07:26, Jean-Philippe Brucker wrote:
> >>
> >> How does the IOMMU hardware know that all activity to a given PASID is
> >> finished?  That activity should, today, be independent of an mm or a
> >> fd's lifetime.
> > In the case of uacce, it's tied to the fd lifetime: opening an accelerator
> > queue calls iommu_sva_bind_device(), which sets up the PASID context in
> > the IOMMU. Closing the queue calls iommu_sva_unbind_device() which
> > destroys the PASID context (after the device driver stopped all DMA for
> > this PASID).
> 
> Could this PASID context destruction move from being "fd-based" to
> happening under mm_pasid_drop()?  Logically, it seems like that should
> work because mm_pasid_drop() happens after exit_mmap() where the VMAs
> (which hold references to 'struct file' via vma->vm_file) are torn down.

The problem is that we'd have to request the device driver to stop DMA
before we can destroy the context and free the PASID. We did consider
doing this in the release() MMU notifier, but there were concerns about
blocking mmput() for too long (for example
https://lore.kernel.org/linux-iommu/4d68da96-0ad5-b412-5987-2f7a6aa79...@amd.com/
though I think there was a more recent discussion). We also need to drain
the PRI and fault queues to get rid of all references to that PASID.

At the moment we disable (but not destroy) the PASID context in release(),
so when the process gets killed pending DMA transactions are silently
ignored. Then the device driver informs us through unbind() that no DMA is
active anymore and we can finish cleaning up, then reuse the PASID.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-25 Thread Jean-Philippe Brucker
Hi Jacob,

On Mon, Apr 25, 2022 at 08:34:44AM -0700, Jacob Pan wrote:
> Hi Jean-Philippe,
> 
> On Mon, 25 Apr 2022 15:26:40 +0100, Jean-Philippe Brucker
>  wrote:
> 
> > On Mon, Apr 25, 2022 at 07:18:36AM -0700, Dave Hansen wrote:
> > > On 4/25/22 06:53, Jean-Philippe Brucker wrote:  
> > > > On Sat, Apr 23, 2022 at 07:13:39PM +0800, zhangfei@foxmail.com
> > > > wrote:  
> > > >>>> On 5.17
> > > >>>> fops_release is called automatically, as well as
> > > >>>> iommu_sva_unbind_device. On 5.18-rc1.
> > > >>>> fops_release is not called, have to manually call close(fd)  
> > > >>> Right that's weird  
> > > >> Looks it is caused by the fix patch, via mmget, which may add
> > > >> refcount of fd.  
> > > > Yes indirectly I think: when the process mmaps the queue,
> > > > mmap_region() takes a reference to the uacce fd. That reference is
> > > > released either by explicit close() or munmap(), or by exit_mmap()
> > > > (which is triggered by mmput()). Since there is an mm->fd dependency,
> > > > we cannot add a fd->mm dependency, so no mmget()/mmput() in
> > > > bind()/unbind().
> > > > 
> > > > I guess we should go back to refcounted PASIDs instead, to avoid
> > > > freeing them until unbind().  
> > > 
> > > Yeah, this is a bit gnarly for -rc4.  Let's just make sure there's
> > > nothing else simple we can do.
> > > 
> > > How does the IOMMU hardware know that all activity to a given PASID is
> > > finished?  That activity should, today, be independent of an mm or a
> > > fd's lifetime.  
> > 
> > In the case of uacce, it's tied to the fd lifetime: opening an accelerator
> > queue calls iommu_sva_bind_device(), which sets up the PASID context in
> > the IOMMU. Closing the queue calls iommu_sva_unbind_device() which
> > destroys the PASID context (after the device driver stopped all DMA for
> > this PASID).
> > 
> For VT-d, it is essentially the same flow except managed by the individual
> drivers such as DSA.
> If free() happens before unbind(), we deactivate the PASIDs and suppress
> faults from the device. When the unbind finally comes, we finalize the
> PASID teardown. It seems we have a need for an intermediate state where
> PASID is "pending free"?

Yes we do have that state, though I'm not sure we need to make it explicit
in the ioasid allocator.

Could we move mm_pasid_drop() to __mmdrop() instead of __mmput()?  For Arm
we do need to hold the mm_count until unbind(), and mmgrab()/mmdrop() is
also part of Lu's rework [1].

Thanks,
Jean

[1] 
https://lore.kernel.org/linux-iommu/20220421052121.3464100-9-baolu...@linux.intel.com/
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-25 Thread Jean-Philippe Brucker
On Mon, Apr 25, 2022 at 07:18:36AM -0700, Dave Hansen wrote:
> On 4/25/22 06:53, Jean-Philippe Brucker wrote:
> > On Sat, Apr 23, 2022 at 07:13:39PM +0800, zhangfei@foxmail.com wrote:
> >>>> On 5.17
> >>>> fops_release is called automatically, as well as iommu_sva_unbind_device.
> >>>> On 5.18-rc1.
> >>>> fops_release is not called, have to manually call close(fd)
> >>> Right that's weird
> >> Looks it is caused by the fix patch, via mmget, which may add refcount of
> >> fd.
> > Yes indirectly I think: when the process mmaps the queue, mmap_region()
> > takes a reference to the uacce fd. That reference is released either by
> > explicit close() or munmap(), or by exit_mmap() (which is triggered by
> > mmput()). Since there is an mm->fd dependency, we cannot add a fd->mm
> > dependency, so no mmget()/mmput() in bind()/unbind().
> > 
> > I guess we should go back to refcounted PASIDs instead, to avoid freeing
> > them until unbind().
> 
> Yeah, this is a bit gnarly for -rc4.  Let's just make sure there's
> nothing else simple we can do.
> 
> How does the IOMMU hardware know that all activity to a given PASID is
> finished?  That activity should, today, be independent of an mm or a
> fd's lifetime.

In the case of uacce, it's tied to the fd lifetime: opening an accelerator
queue calls iommu_sva_bind_device(), which sets up the PASID context in
the IOMMU. Closing the queue calls iommu_sva_unbind_device() which
destroys the PASID context (after the device driver stopped all DMA for
this PASID).

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-25 Thread Jean-Philippe Brucker
On Sat, Apr 23, 2022 at 07:13:39PM +0800, zhangfei@foxmail.com wrote:
> > > On 5.17
> > > fops_release is called automatically, as well as iommu_sva_unbind_device.
> > > On 5.18-rc1.
> > > fops_release is not called, have to manually call close(fd)
> > Right that's weird
> Looks it is caused by the fix patch, via mmget, which may add refcount of
> fd.

Yes indirectly I think: when the process mmaps the queue, mmap_region()
takes a reference to the uacce fd. That reference is released either by
explicit close() or munmap(), or by exit_mmap() (which is triggered by
mmput()). Since there is an mm->fd dependency, we cannot add a fd->mm
dependency, so no mmget()/mmput() in bind()/unbind().

I guess we should go back to refcounted PASIDs instead, to avoid freeing
them until unbind().

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-22 Thread Jean-Philippe Brucker
On Fri, Apr 22, 2022 at 09:15:01PM +0800, zhangfei@foxmail.com wrote:
> > I'm trying to piece together what happens from the kernel point of view.
> > 
> > * master process with mm A opens a queue fd through uacce, which calls
> >iommu_sva_bind_device(dev, A) -> PASID 1
> > 
> > * master forks and exits. Child (daemon) gets mm B, inherits the queue fd.
> >The device is still bound to mm A with PASID 1, since the queue fd is
> >still open.
> 
> > We discussed this before, but I don't remember where we left off. The
> > child can't use the queue because its mappings are not copied on fork(),
> > and the queue is still bound to the parent mm A. The child either needs to
> > open a new queue or take ownership of the old one with a new uacce ioctl.
> Yes, currently nginx aligned with the case.
> Child process (worker process) reopen uacce,
> 
> Master process (do init) open uacce, iommu_sva_bind_device(dev, A) -> PASID
> 1
> Master process fork Child (daemon) and exit.
> 
> Child (daemon)  does not use PASID 1 any more, only fork and manage worker
> process.
> Worker process reopen uacce, iommu_sva_bind_device(dev, B) PASID 2
> 
> So it is expected.

Yes, that's fine

> > Is that the "IMPLEMENT_DYNAMIC_BIND_FN()" you mention, something out of
> > tree?  This operation should unbind from A before binding to B, no?
> > Otherwise we leak PASID 1.
> In 5.16 PASID 1 from master is hold until nginx service stop.
> nginx start
> master:
> iommu_sva_alloc_pasid mm->pasid=1      // master process
> 
> lynx https start:
> iommu_sva_alloc_pasid mm->pasid=2    //worker process
> 
> nginx stop:  from fops_release
> iommu_sva_free_pasid mm->pasid=2   // worker process
> iommu_sva_free_pasid mm->pasid=1  // master process

That's the expected behavior (master could close its fd before forking, in
order to free things up earlier, but it's not mandatory)

> Have one silly question.
> 
> kerne driver
> fops_open
> iommu_sva_bind_device
> 
> fops_release
> iommu_sva_unbind_device
> 
> application
> main()
> fd = open
> return;
> 
> Application exit but not close(fd), is it expected fops_release will be
> called automatically by system?

Yes, the application doesn't have to call close() explicitly, the file
descriptor is closed automatically on exit. Note that the fd is copied on
fork(), so it is only released once parent and all child processes exit.

> On 5.17
> fops_release is called automatically, as well as iommu_sva_unbind_device.
> On 5.18-rc1.
> fops_release is not called, have to manually call close(fd)

Right that's weird

> Since nginx may have a issue, it does not call close(fd) when nginx -s quit.

And you're sure that none of the processes are still alive or in zombie
state?  Just to cover every possibility.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-22 Thread Jean-Philippe Brucker
On Fri, Apr 22, 2022 at 05:03:10PM +0800, zhangfei@foxmail.com wrote:
[...]
> > Have tested, still got some issue with our openssl-engine.
> > 
> > 1. If openssl-engine does not register rsa, nginx works well.
> > 
> > 2. If openssl-engine register rsa, nginx also works, but ioasid is not
> > freed when nginx stop.
> > 
> > IMPLEMENT_DYNAMIC_BIND_FN(bind_fn)
> > bind_fn
> > ENGINE_set_RSA(e, rsa_methods())
> > 
> > destroy_fn
> > 
> > If ENGINE_set_RSA is set, nginx start and stop will NOT call destroy_fn.
> > Even rsa_methods is almost new via RSA_meth_new.
> > 
> > In 5.18-rcx, this caused ioasid  not freed in nginx start and stop.
> > In 5.17, though destroy_fn is not called, but ioasid is freed when nginx
> > stop, so not noticed this issue before.
> 
> 1. uacce_fops_release
> In 5.16 or 5.17
> In fact, we aslo has the issue: openssl engine does not call destroy_fn ->
> close(uacce_fd)
> But system will automatically close all opened fd,
> so uacce_fops_release is also called and free ioasid.
> 
> Have one experiment, not call close fd
> 
> log: open uacce fd but no close
> [ 2583.471225]  dump_backtrace+0x0/0x1a0
> [ 2583.474876]  show_stack+0x20/0x30
> [ 2583.478178]  dump_stack_lvl+0x8c/0xb8
> [ 2583.481825]  dump_stack+0x18/0x34
> [ 2583.485126]  uacce_fops_release+0x44/0xdc
> [ 2583.489117]  __fput+0x78/0x240
> [ 2583.492159]  fput+0x18/0x28
> [ 2583.495288]  task_work_run+0x88/0x160
> [ 2583.498936]  do_notify_resume+0x214/0x490
> [ 2583.502927]  el0_svc+0x58/0x70
> [ 2583.505968]  el0t_64_sync_handler+0xb0/0xb8
> [ 2583.510132]  el0t_64_sync+0x1a0/0x1a4
> [ 2583.582292]  uacce_fops_release q=d6674128
> 
> In 5.18, since refcount was add.
> The opened uacce fd was not closed automatically by system
> So we see the issue.
> 
> log: open uacce fd but no close
>  [  106.360140]  uacce_fops_open q=ccc38d74
> [  106.364929]  ioasid_alloc ioasid=1
> [  106.368585]  iommu_sva_alloc_pasid pasid=1
> [  106.372943]  iommu_sva_bind_device handle=6cca298a
> // ioasid is not free

I'm trying to piece together what happens from the kernel point of view.

* master process with mm A opens a queue fd through uacce, which calls
  iommu_sva_bind_device(dev, A) -> PASID 1

* master forks and exits. Child (daemon) gets mm B, inherits the queue fd.
  The device is still bound to mm A with PASID 1, since the queue fd is
  still open.

We discussed this before, but I don't remember where we left off. The
child can't use the queue because its mappings are not copied on fork(),
and the queue is still bound to the parent mm A. The child either needs to
open a new queue or take ownership of the old one with a new uacce ioctl.
Is that the "IMPLEMENT_DYNAMIC_BIND_FN()" you mention, something out of
tree?  This operation should unbind from A before binding to B, no?
Otherwise we leak PASID 1.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 12/13] iommu/virtio: Clean up bus_set_iommu()

2022-04-21 Thread Jean-Philippe Brucker
On Thu, Apr 14, 2022 at 01:42:41PM +0100, Robin Murphy wrote:
> Stop calling bus_set_iommu() since it's now unnecessary, and simplify
> the probe failure path accordingly.
> 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/virtio-iommu.c | 24 
>  1 file changed, 24 deletions(-)
> 
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 25be4b822aa0..371f8657c0ce 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -7,7 +7,6 @@
>  
>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>  
> -#include 
>  #include 
>  #include 
>  #include 

 isn't needed anymore either. In any case it
looks great, thanks

Reviewed-by: Jean-Philippe Brucker 

and tested on QEMU (so only PCI for now)


> @@ -1146,26 +1145,6 @@ static int viommu_probe(struct virtio_device *vdev)
>  
>   iommu_device_register(>iommu, _ops, parent_dev);
>  
> -#ifdef CONFIG_PCI
> - if (pci_bus_type.iommu_ops != _ops) {
> - ret = bus_set_iommu(_bus_type, _ops);
> - if (ret)
> - goto err_unregister;
> - }
> -#endif
> -#ifdef CONFIG_ARM_AMBA
> - if (amba_bustype.iommu_ops != _ops) {
> - ret = bus_set_iommu(_bustype, _ops);
> - if (ret)
> - goto err_unregister;
> - }
> -#endif
> - if (platform_bus_type.iommu_ops != _ops) {
> - ret = bus_set_iommu(_bus_type, _ops);
> - if (ret)
> - goto err_unregister;
> - }
> -
>   vdev->priv = viommu;
>  
>   dev_info(dev, "input address: %u bits\n",
> @@ -1174,9 +1153,6 @@ static int viommu_probe(struct virtio_device *vdev)
>  
>   return 0;
>  
> -err_unregister:
> - iommu_device_sysfs_remove(>iommu);
> - iommu_device_unregister(>iommu);
>  err_free_vqs:
>   vdev->config->del_vqs(vdev);
>  
> -- 
> 2.28.0.dirty
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-20 Thread Jean-Philippe Brucker
Hi,

On Fri, Apr 15, 2022 at 02:51:08AM -0700, Fenghua Yu wrote:
> From a6444e1e5bd8076f5e5c5e950d3192de327f0c9c Mon Sep 17 00:00:00 2001
> From: Fenghua Yu 
> Date: Fri, 15 Apr 2022 00:51:33 -0700
> Subject: [RFC PATCH] iommu/sva: Fix PASID use-after-free issue
> 
> A PASID might be still used even though it is freed on mm exit.
> 
> process A:
>   sva_bind();
>   ioasid_alloc() = N; // Get PASID N for the mm
>   fork(): // spawn process B
>   exit();
>   ioasid_free(N);
> 
> process B:
>   device uses PASID N -> failure
>   sva_unbind();
> 
> Dave Hansen suggests to take a refcount on the mm whenever binding the
> PASID to a device and drop the refcount on unbinding. The mm won't be
> dropped if the PASID is still bound to it.
> 
> Fixes: 701fac40384f ("iommu/sva: Assign a PASID to mm on PASID allocation and 
> free it on mm exit")
> 
> Reported-by: Zhangfei Gao 
> Suggested-by: Dave Hansen" 
> Signed-off-by: Fenghua Yu 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 6 ++
>  drivers/iommu/intel/svm.c   | 4 
>  2 files changed, 10 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index 22ddd05bbdcd..3fcb842a0df0 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -7,6 +7,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "arm-smmu-v3.h"
>  #include "../../iommu-sva-lib.h"
> @@ -363,6 +364,9 @@ arm_smmu_sva_bind(struct device *dev, struct mm_struct 
> *mm, void *drvdata)
>  
>   mutex_lock(_lock);
>   handle = __arm_smmu_sva_bind(dev, mm);
> + /* Take an mm refcount on a successful bind. */
> + if (!IS_ERR(handle))
> + mmget(mm);
>   mutex_unlock(_lock);
>   return handle;
>  }
> @@ -372,6 +376,8 @@ void arm_smmu_sva_unbind(struct iommu_sva *handle)
>   struct arm_smmu_bond *bond = sva_to_bond(handle);
>  
>   mutex_lock(_lock);
> + /* Drop an mm refcount. */
> + mmput(bond->mm);

I do like the idea because it will simplify the driver. We can't call
mmput() here, though, because it may call the release() MMU notifier which
will try to grab sva_lock, already held.

I also found another use-after-free in arm_smmu_free_shared_cd(), where we
call arm64_mm_context_put() when the mm could already be freed. There used
to be an mmgrab() preventing this but it went away during a rewrite.

To fix both we could just move mmput() at the end of unbind() but I'd
rather do a proper cleanup removing the release() notifier right away.
Zhangfei, could you try the patch below?

Thanks,
Jean

--- 8< ---

>From 4e09c0d71dfb35fc90915bd1e36545027fbf8a03 Mon Sep 17 00:00:00 2001
From: Jean-Philippe Brucker 
Date: Wed, 20 Apr 2022 10:19:24 +0100
Subject: [PATCH] iommu/arm-smmu-v3-sva: Fix PASID and mm use-after-free issues

Commit 701fac40384f ("iommu/sva: Assign a PASID to mm on PASID
allocation and free it on mm exit") frees the PASID earlier than what
the SMMUv3 driver expects. At the moment the SMMU driver handles mm exit
in the release() MMU notifier by quiescing the context descriptor. The
context descriptor is only made invalid in unbind(), after the device
driver ensured the PASID is not used anymore. Releasing the PASID on mm
exit may cause it to be reallocated while it is still used by the
context descriptor.

There is another use-after-free, present since the beginning, where we
call arm64_mm_context_put() without a guarantee that mm_count is held.

Dave Hansen suggests to grab mm_users whenever binding the mm to a
device and drop it on unbinding. With that we can fix both issues and
simplify the driver by removing the release() notifier.

Fixes: 32784a9562fb ("iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()")
Reported-by: Zhangfei Gao 
Suggested-by: Dave Hansen 
Signed-off-by: Fenghua Yu 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  1 -
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 49 +--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 14 +-
 3 files changed, 15 insertions(+), 49 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index cd48590ada30..d50d215d946c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -735,7 +735,6 @@ static inline struct arm_smmu_domain *to_smmu_domain(struct 
iommu_domain *dom)
 
 extern struct xarray arm_smmu_asid_xa;
 extern struct mutex arm_smmu_asid_lock;
-extern struct arm_smmu_ctx_desc quiet_cd;
 
 int arm_smmu_write_ctx_desc(s

Re: [PATCH] iommu/arm-smmu-v3: Fix size calculation in arm_smmu_mm_invalidate_range()

2022-04-20 Thread Jean-Philippe Brucker
On Tue, Apr 19, 2022 at 02:01:58PM -0700, Nicolin Chen wrote:
> The arm_smmu_mm_invalidate_range function is designed to be called
> by mm core for Shared Virtual Addressing purpose between IOMMU and
> CPU MMU. However, the ways of two subsystems defining their "end"
> addresses are slightly different. IOMMU defines its "end" address
> using the last address of an address range, while mm core defines
> that using the following address of an address range:
> 
>   include/linux/mm_types.h:
>   unsigned long vm_end;
>   /* The first byte after our end address ...
> 
> This mismatch resulted in an incorrect calculation for size so it
> failed to be page-size aligned. Further, it caused a dead loop at
> "while (iova < end)" check in __arm_smmu_tlb_inv_range function.
> 
> This patch fixes the issue by doing the calculation correctly.
> 
> Fixes: 2f7e8c553e98d ("iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Nicolin Chen 

Thanks for the fix, I guess we didn't catch this earlier because our test
platforms didn't support range invalidation, so __arm_smmu_tlb_inv_range()
would always use PAGE_SIZE as increment.

Reviewed-by: Jean-Philippe Brucker 

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index 22ddd05bbdcd..c623dae1e115 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -183,7 +183,14 @@ static void arm_smmu_mm_invalidate_range(struct 
> mmu_notifier *mn,
>  {
>   struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
>   struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
> - size_t size = end - start + 1;
> + size_t size;
> +
> + /*
> +  * The mm_types defines vm_end as the first byte after the end address,
> +  * different from IOMMU subsystem using the last address of an address
> +  * range. So do a simple translation here by calculating size correctly.
> +  */
> + size = end - start;
>  
>   if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM))
>   arm_smmu_tlb_inv_range_asid(start, size, smmu_mn->cd->asid,
> -- 
> 2.17.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-12 Thread Jean-Philippe Brucker
Hi,

On Tue, Apr 12, 2022 at 07:36:21AM -0700, Dave Hansen wrote:
> On 4/12/22 00:04, zhangfei@foxmail.com wrote:
> > master process quit, mmput ->  mm_pasid_drop->ioasid_free
> > But this ignore driver's iommu_sva_unbind_device function,
> > iommu_sva_bind_device and iommu_sva_unbind_device are not pair,  So
> > driver does not know ioasid is freed.
> > 
> > Any suggestion?
> 
> It sounds like you're saying that the device is still abound to the
> PASID even though the mm has exited and freed the PASID.  This is
> essentially a use-after-free for the PASID.  Right?
> 
> The right thing to do here is to have the PASID code hold a reference on
> the mm.  The mm "owns" the PASID for its entire lifetime and if anything
> needs the PASID to live longer, its only recourse for doing that is via
> an mmget().  I _think_ mmget() is the right thing as opposed to mmgrab()
> because the PASID users actually need the page tables to be around.
> 
> This would still be nice to confirm with some traces of fork()/exit()
> and the iommu_sva_{bind,unbind} and ioasid_{alloc,free} functions.
> 
> I wonder if the Intel and ARM IOMMU code differ in the way they keep
> references to the mm, or if this affects Intel as well, but we just
> haven't tested the code enough.

The Arm code was written expecting the PASID to be freed on unbind(), not
mm exit. I missed the change of behavior, sorry (I thought your plan was
to extend PASID lifetime, not shorten it?) but as is it seems very broken.
For example in the iommu_sva_unbind_device(), we have
arm_smmu_mmu_notifier_put() clearing the PASID table entry for
"mm->pasid", which is going to end badly if the PASID has been cleared or
reallocated. We can't clear the PASID entry in mm exit because at that
point the device may still be issuing DMA for that PASID and we need to
quiesce the entry rather than deactivate it. We can only deactivate it
once the device driver has properly stopped the device, at which point it
can call unbind(). There may be other issues but I can't check it
thoroughly until next week.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH RFC 10/11] iommu: Make IOPF handling framework generic

2022-03-22 Thread Jean-Philippe Brucker
On Tue, Mar 22, 2022 at 10:24:26AM +, Tian, Kevin wrote:
> > From: Jean-Philippe Brucker 
> > Sent: Tuesday, March 22, 2022 6:06 PM
> > 
> > On Tue, Mar 22, 2022 at 01:00:08AM +, Tian, Kevin wrote:
> > > > From: Jean-Philippe Brucker 
> > > > Sent: Monday, March 21, 2022 7:42 PM
> > > >
> > > > Hi Kevin,
> > > >
> > > > On Mon, Mar 21, 2022 at 08:09:36AM +, Tian, Kevin wrote:
> > > > > > From: Lu Baolu 
> > > > > > Sent: Sunday, March 20, 2022 2:40 PM
> > > > > >
> > > > > > The existing IOPF handling framework only handles the I/O page 
> > > > > > faults
> > for
> > > > > > SVA. Ginven that we are able to link iommu domain with each I/O
> > page
> > > > fault,
> > > > > > we can now make the I/O page fault handling framework more
> > general
> > > > for
> > > > > > more types of page faults.
> > > > >
> > > > > "make ... generic" in subject line is kind of confusing. Reading this 
> > > > > patch
> > I
> > > > > think you really meant changing from per-device fault handling to per-
> > > > domain
> > > > > fault handling. This is more accurate in concept since the fault is 
> > > > > caused
> > by
> > > > > the domain page table. 
> > > >
> > > > I tend to disagree with that last part. The fault is caused by a 
> > > > specific
> > > > device accessing shared page tables. We should keep that device
> > > > information throughout the fault handling, so that we can report it to 
> > > > the
> > > > driver when things go wrong. A process can have multiple threads bound
> > to
> > > > different devices, they share the same mm so if the driver wanted to
> > > > signal a misbehaving thread, similarly to a SEGV on the CPU side, it 
> > > > would
> > > > need the device information to precisely report it to userspace.
> > > >
> > >
> > > iommu driver can include the device information in the fault data. But
> > > in concept the IOPF should be reported per domain.
> > 
> > So I don't remember where we left off on that topic, what about fault
> > injection into guests?  In that case device info is more than just
> > diagnostic, fault injection can't work without it. I think we talked about
> > passing a device cookie to userspace, just want to make sure.
> > 
> > > and I agree with Jason that at most we can send SEGV to the entire thread
> > > group since there is no way to associate a DMA back to a thread which
> > > initiates the DMA.
> > 
> > The point is providing the most accurate information to the device driver
> > for diagnostics and debugging. A process opens multiple queues to
> > different devices, then if one of the queues issues invalid DMA, the
> > driver won't even know which queue is broken if you only report the target
> > mm and not the source dev. I don't think we gain anything from discarding
> > the device information from the fault path.
> > 
> 
> In case I didn't make it clear, what I talked about is just about having iommu
> core to report IOPF per domain handler vs. per device handler while this
> design choice doesn't change what the fault data should include (device,
> pasid, addr, etc.). i.e. it always includes all the information provided by 
> the
> iommu driver no matter how the fault is reported upwards.

Right thanks, I misunderstood.

Thanks,
Jean

> 
> e.g. with iommufd it is iommufd to register a IOPF handler per managed
> domain and receive IOPF on those domains. If necessary, iommufd further
> forwards to userspace including device cookie according to the fault data.
> 
> Thanks
> Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH RFC 10/11] iommu: Make IOPF handling framework generic

2022-03-22 Thread Jean-Philippe Brucker
On Tue, Mar 22, 2022 at 01:00:08AM +, Tian, Kevin wrote:
> > From: Jean-Philippe Brucker 
> > Sent: Monday, March 21, 2022 7:42 PM
> > 
> > Hi Kevin,
> > 
> > On Mon, Mar 21, 2022 at 08:09:36AM +, Tian, Kevin wrote:
> > > > From: Lu Baolu 
> > > > Sent: Sunday, March 20, 2022 2:40 PM
> > > >
> > > > The existing IOPF handling framework only handles the I/O page faults 
> > > > for
> > > > SVA. Ginven that we are able to link iommu domain with each I/O page
> > fault,
> > > > we can now make the I/O page fault handling framework more general
> > for
> > > > more types of page faults.
> > >
> > > "make ... generic" in subject line is kind of confusing. Reading this 
> > > patch I
> > > think you really meant changing from per-device fault handling to per-
> > domain
> > > fault handling. This is more accurate in concept since the fault is 
> > > caused by
> > > the domain page table. 
> > 
> > I tend to disagree with that last part. The fault is caused by a specific
> > device accessing shared page tables. We should keep that device
> > information throughout the fault handling, so that we can report it to the
> > driver when things go wrong. A process can have multiple threads bound to
> > different devices, they share the same mm so if the driver wanted to
> > signal a misbehaving thread, similarly to a SEGV on the CPU side, it would
> > need the device information to precisely report it to userspace.
> > 
> 
> iommu driver can include the device information in the fault data. But
> in concept the IOPF should be reported per domain.

So I don't remember where we left off on that topic, what about fault
injection into guests?  In that case device info is more than just
diagnostic, fault injection can't work without it. I think we talked about
passing a device cookie to userspace, just want to make sure.

> and I agree with Jason that at most we can send SEGV to the entire thread
> group since there is no way to associate a DMA back to a thread which 
> initiates the DMA.

The point is providing the most accurate information to the device driver
for diagnostics and debugging. A process opens multiple queues to
different devices, then if one of the queues issues invalid DMA, the
driver won't even know which queue is broken if you only report the target
mm and not the source dev. I don't think we gain anything from discarding
the device information from the fault path.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH RFC 10/11] iommu: Make IOPF handling framework generic

2022-03-22 Thread Jean-Philippe Brucker
On Tue, Mar 22, 2022 at 01:03:14PM +0800, Lu Baolu wrote:
> On 2022/3/21 20:43, Jason Gunthorpe wrote:
> > On Mon, Mar 21, 2022 at 11:42:16AM +0000, Jean-Philippe Brucker wrote:
> > 
> > > I tend to disagree with that last part. The fault is caused by a specific
> > > device accessing shared page tables. We should keep that device
> > > information throughout the fault handling, so that we can report it to the
> > > driver when things go wrong.
> > SVA faults should never be reported to drivers??
> > 
> 
> When things go wrong, the corresponding response code will be responded
> to the device through iommu_page_response(). The hardware should then
> report the failure to the device driver and the device driver will
> handle it in the device-specific way. There's no need to propagate the
> I/O page faults to the device driver in any case. Do I understand it
> right?

In theory yes, but devices don't necessarily have the ability to report
precise errors, we may have more information.

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RFC 10/11] iommu: Make IOPF handling framework generic

2022-03-21 Thread Jean-Philippe Brucker
Hi Kevin,

On Mon, Mar 21, 2022 at 08:09:36AM +, Tian, Kevin wrote:
> > From: Lu Baolu 
> > Sent: Sunday, March 20, 2022 2:40 PM
> > 
> > The existing IOPF handling framework only handles the I/O page faults for
> > SVA. Ginven that we are able to link iommu domain with each I/O page fault,
> > we can now make the I/O page fault handling framework more general for
> > more types of page faults.
> 
> "make ... generic" in subject line is kind of confusing. Reading this patch I
> think you really meant changing from per-device fault handling to per-domain
> fault handling. This is more accurate in concept since the fault is caused by
> the domain page table. 

I tend to disagree with that last part. The fault is caused by a specific
device accessing shared page tables. We should keep that device
information throughout the fault handling, so that we can report it to the
driver when things go wrong. A process can have multiple threads bound to
different devices, they share the same mm so if the driver wanted to
signal a misbehaving thread, similarly to a SEGV on the CPU side, it would
need the device information to precisely report it to userspace.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH RFC 10/11] iommu: Make IOPF handling framework generic

2022-03-21 Thread Jean-Philippe Brucker
On Sun, Mar 20, 2022 at 02:40:29PM +0800, Lu Baolu wrote:
> The existing IOPF handling framework only handles the I/O page faults for
> SVA. Ginven that we are able to link iommu domain with each I/O page fault,
> we can now make the I/O page fault handling framework more general for
> more types of page faults.
> 
> Signed-off-by: Lu Baolu 
> ---
>  include/linux/iommu.h |  4 +++
>  drivers/iommu/io-pgfault.c| 67 ++-
>  drivers/iommu/iommu-sva-lib.c | 59 ++
>  3 files changed, 73 insertions(+), 57 deletions(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 803e7b07605e..11c65a7bed88 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -50,6 +50,8 @@ struct iommu_dma_cookie;
>  typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
>   struct device *, unsigned long, int, void *);
>  typedef int (*iommu_dev_fault_handler_t)(struct iommu_fault *, void *);
> +typedef enum iommu_page_response_code (*iommu_domain_iopf_handler_t)
> + (struct iommu_fault *, void *);
>  
>  struct iommu_domain_geometry {
>   dma_addr_t aperture_start; /* First address that can be mapped*/
> @@ -101,6 +103,8 @@ struct iommu_domain {
>   struct iommu_domain_geometry geometry;
>   struct iommu_dma_cookie *iova_cookie;
>   struct mm_struct *sva_cookie;
> + iommu_domain_iopf_handler_t fault_handler;
> + void *fault_data;
>  };
>  
>  static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
> diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
> index 1df8c1dcae77..dad0e40cd8d2 100644
> --- a/drivers/iommu/io-pgfault.c
> +++ b/drivers/iommu/io-pgfault.c
> @@ -69,62 +69,6 @@ static int iopf_complete_group(struct device *dev, struct 
> iopf_fault *iopf,
>   return iommu_page_response(dev, );
>  }
>  
> -static enum iommu_page_response_code
> -iopf_handle_single(struct iopf_fault *iopf)
> -{
> - vm_fault_t ret;
> - struct mm_struct *mm;
> - struct vm_area_struct *vma;
> - unsigned int access_flags = 0;
> - unsigned int fault_flags = FAULT_FLAG_REMOTE;
> - struct iommu_fault_page_request *prm = >fault.prm;
> - enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
> -
> - if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
> - return status;
> -
> - mm = iommu_sva_find(prm->pasid);
> - if (IS_ERR_OR_NULL(mm))
> - return status;
> -
> - mmap_read_lock(mm);
> -
> - vma = find_extend_vma(mm, prm->addr);
> - if (!vma)
> - /* Unmapped area */
> - goto out_put_mm;
> -
> - if (prm->perm & IOMMU_FAULT_PERM_READ)
> - access_flags |= VM_READ;
> -
> - if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
> - access_flags |= VM_WRITE;
> - fault_flags |= FAULT_FLAG_WRITE;
> - }
> -
> - if (prm->perm & IOMMU_FAULT_PERM_EXEC) {
> - access_flags |= VM_EXEC;
> - fault_flags |= FAULT_FLAG_INSTRUCTION;
> - }
> -
> - if (!(prm->perm & IOMMU_FAULT_PERM_PRIV))
> - fault_flags |= FAULT_FLAG_USER;
> -
> - if (access_flags & ~vma->vm_flags)
> - /* Access fault */
> - goto out_put_mm;
> -
> - ret = handle_mm_fault(vma, prm->addr, fault_flags, NULL);
> - status = ret & VM_FAULT_ERROR ? IOMMU_PAGE_RESP_INVALID :
> - IOMMU_PAGE_RESP_SUCCESS;
> -
> -out_put_mm:
> - mmap_read_unlock(mm);
> - mmput(mm);
> -
> - return status;
> -}
> -
>  static void iopf_handle_group(struct work_struct *work)
>  {
>   struct iopf_group *group;
> @@ -134,12 +78,21 @@ static void iopf_handle_group(struct work_struct *work)
>   group = container_of(work, struct iopf_group, work);
>  
>   list_for_each_entry_safe(iopf, next, >faults, list) {
> + struct iommu_domain *domain;
> +
> + domain = iommu_get_domain_for_dev_pasid(group->dev,
> + iopf->fault.prm.pasid);

Do we have a guarantee that the domain is not freed while we handle the
fault?  We could prevent unbind() while there are pending faults on this
bond. But a refcount on SVA domains could defer freeing, and would also
help with keeping the semantics where bind() returns a single refcounted
bond for any {dev, mm}.

Given that this path is full of circular locking pitfalls, and to keep the
fault handler efficient (well, at least not make it worse), we should
probably keep a getter like iommu_sva_find() that does not require
locking.

> +
> + if (!domain || !domain->fault_handler)
> + status = IOMMU_PAGE_RESP_INVALID;
> +
>   /*
>* For the moment, errors are sticky: don't handle subsequent
>* faults in the group if there is an error.
>*/
>   if (status == 

Re: [PATCH RFC 08/11] iommu: Handle IO page faults directly

2022-03-21 Thread Jean-Philippe Brucker
On Sun, Mar 20, 2022 at 02:40:27PM +0800, Lu Baolu wrote:
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index c0966fc9b686..4f90b71c6f6e 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -27,6 +27,8 @@
>  #include 
>  #include 
>  
> +#include "iommu-sva-lib.h"
> +
>  static struct kset *iommu_group_kset;
>  static DEFINE_IDA(iommu_group_ida);
>  
> @@ -1177,10 +1179,9 @@ int iommu_report_device_fault(struct device *dev, 
> struct iommu_fault_event *evt)
>   if (!param || !evt)
>   return -EINVAL;
>  
> - /* we only report device fault if there is a handler registered */
>   mutex_lock(>lock);
>   fparam = param->fault_param;
> - if (!fparam || !fparam->handler) {
> + if (!fparam) {
>   ret = -EINVAL;
>   goto done_unlock;
>   }
> @@ -1198,7 +1199,11 @@ int iommu_report_device_fault(struct device *dev, 
> struct iommu_fault_event *evt)
>   mutex_unlock(>lock);
>   }
>  
> - ret = fparam->handler(>fault, fparam->data);
> + if (fparam->handler)
> + ret = fparam->handler(>fault, fparam->data);
> + else
> + ret = iommu_queue_iopf(>fault, fparam->data);
> +

I like the change, but we'll need to consolidate this, because now if the
driver registers a fault handler it disables IOPF. We could instead
prevent registration if an IOPF param is present. We could also just merge
fparam->handler but eventually I'd like to make IOPF fall back to the
fault handler registered by device driver, in case of invalid page faults.
I have a couple patches for this but am still missing some bits.

Thanks,
Jean

>   if (ret && evt_pending) {
>   mutex_lock(>lock);
>   list_del(_pending->list);
> -- 
> 2.25.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RFC 06/11] iommu/sva: Use attach/detach_pasid_dev in SVA interfaces

2022-03-21 Thread Jean-Philippe Brucker
On Sun, Mar 20, 2022 at 02:40:25PM +0800, Lu Baolu wrote:
> diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
> index 106506143896..47cf98e661ff 100644
> --- a/drivers/iommu/iommu-sva-lib.c
> +++ b/drivers/iommu/iommu-sva-lib.c
> @@ -3,6 +3,8 @@
>   * Helpers for IOMMU drivers implementing SVA
>   */
>  #include 
> +#include 
> +#include 
>  #include 
>  
>  #include "iommu-sva-lib.h"
> @@ -69,3 +71,101 @@ struct mm_struct *iommu_sva_find(ioasid_t pasid)
>   return ioasid_find(_sva_pasid, pasid, __mmget_not_zero);
>  }
>  EXPORT_SYMBOL_GPL(iommu_sva_find);
> +
> +static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev)
> +{
> + struct bus_type *bus = dev->bus;
> + struct iommu_domain *domain;
> +
> + if (!bus || !bus->iommu_ops)
> + return NULL;
> +
> + domain = bus->iommu_ops->domain_alloc(IOMMU_DOMAIN_SVA);
> + if (domain)
> + domain->type = IOMMU_DOMAIN_SVA;
> +
> + return domain;
> +}
> +
> +/**
> + * iommu_sva_bind_device() - Bind a process address space to a device
> + * @dev: the device
> + * @mm: the mm to bind, caller must hold a reference to it
> + * @drvdata: opaque data pointer to pass to bind callback
> + *
> + * Create a bond between device and address space, allowing the device to 
> access
> + * the mm using the returned PASID. If a bond already exists between @device 
> and
> + * @mm, it is returned and an additional reference is taken.

This is not true anymore, we return a different structure for each call.

> Caller must call
> + * iommu_sva_unbind_device() to release each reference.
> + *
> + * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first, to
> + * initialize the required SVA features.
> + *
> + * On error, returns an ERR_PTR value.
> + */
> +struct iommu_sva *
> +iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void 
> *drvdata)
> +{
> + int ret = -EINVAL;
> + struct iommu_sva *handle;
> + struct iommu_domain *domain;
> +
> + handle = kzalloc(sizeof(*handle), GFP_KERNEL);
> + if (!handle)
> + return ERR_PTR(-ENOMEM);
> +
> + ret = iommu_sva_alloc_pasid(mm, 1, (1U << dev->iommu->pasid_bits) - 1);
> + if (ret)
> + goto out;
> +
> + domain = iommu_sva_domain_alloc(dev);
> + if (!domain) {
> + ret = -ENOMEM;
> + goto out;
> + }
> + domain->sva_cookie = mm;
> +
> + ret = iommu_attach_device_pasid(domain, dev, mm->pasid);
> + if (ret)
> + goto out_free_domain;
> +
> + handle->dev = dev;
> + handle->domain = domain;
> + handle->pasid = mm->pasid;
> +
> + return handle;
> +
> +out_free_domain:
> + iommu_domain_free(domain);
> +out:
> + kfree(handle);
> +
> + return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
> +
> +/**
> + * iommu_sva_unbind_device() - Remove a bond created with 
> iommu_sva_bind_device
> + * @handle: the handle returned by iommu_sva_bind_device()
> + *
> + * Put reference to a bond between device and address space.

Same here. But I'd prefer keeping the old behavior so device drivers don't
have to keep track of {dev, mm} pairs themselves.

Thanks,
Jean

> The device should
> + * not be issuing any more transaction for this PASID. All outstanding page
> + * requests for this PASID must have been flushed to the IOMMU.
> + */
> +void iommu_sva_unbind_device(struct iommu_sva *handle)
> +{
> + struct device *dev = handle->dev;
> + struct iommu_domain *domain = handle->domain;
> + struct mm_struct *mm = domain->sva_cookie;
> +
> + iommu_detach_device_pasid(domain, dev, mm->pasid);
> + iommu_domain_free(domain);
> + kfree(handle);
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
> +
> +u32 iommu_sva_get_pasid(struct iommu_sva *handle)
> +{
> + return handle->pasid;
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RFC 05/11] arm-smmu-v3/sva: Add SVA domain support

2022-03-21 Thread Jean-Philippe Brucker
On Sun, Mar 20, 2022 at 02:40:24PM +0800, Lu Baolu wrote:
> Add support for SVA domain allocation and provide an SVA-specific
> iommu_domain_ops.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   | 14 ++
>  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 45 +++
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 13 +-
>  3 files changed, 71 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index cd48590ada30..7631c00fdcbd 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -759,6 +759,10 @@ struct iommu_sva *arm_smmu_sva_bind(struct device *dev, 
> struct mm_struct *mm,
>  void arm_smmu_sva_unbind(struct iommu_sva *handle);
>  u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
>  void arm_smmu_sva_notifier_synchronize(void);
> +int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
> +   struct device *dev, ioasid_t id);
> +void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
> +struct device *dev, ioasid_t id);
>  #else /* CONFIG_ARM_SMMU_V3_SVA */
>  static inline bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
>  {
> @@ -804,5 +808,15 @@ static inline u32 arm_smmu_sva_get_pasid(struct 
> iommu_sva *handle)
>  }
>  
>  static inline void arm_smmu_sva_notifier_synchronize(void) {}
> +
> +static inline int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
> + struct device *dev, ioasid_t id)
> +{
> + return -ENODEV;
> +}
> +
> +static inline void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
> +  struct device *dev,
> +  ioasid_t id) {}
>  #endif /* CONFIG_ARM_SMMU_V3_SVA */
>  #endif /* _ARM_SMMU_V3_H */
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index 22ddd05bbdcd..1e114b9dc17f 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -534,3 +534,48 @@ void arm_smmu_sva_notifier_synchronize(void)
>*/
>   mmu_notifier_synchronize();
>  }
> +
> +int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
> +   struct device *dev, ioasid_t id)
> +{
> + int ret = 0;
> + struct iommu_sva *handle;
> + struct mm_struct *mm = domain->sva_cookie;
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +
> + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1 ||

This check is for the parent domain, iommu_get_domain_for_dev(dev)

> + domain->type != IOMMU_DOMAIN_SVA || !mm)
> + return -EINVAL;
> +
> + mutex_lock(_lock);
> + handle = __arm_smmu_sva_bind(dev, mm);
> + if (IS_ERR_OR_NULL(handle))
> + ret = PTR_ERR(handle);
> + mutex_unlock(_lock);
> +
> + return ret;
> +}
> +
> +void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
> +struct device *dev, ioasid_t id)
> +{
> + struct arm_smmu_bond *bond = NULL, *t;
> + struct mm_struct *mm = domain->sva_cookie;
> + struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> +
> + mutex_lock(_lock);
> + list_for_each_entry(t, >bonds, list) {
> + if (t->mm == mm) {
> + bond = t;
> + break;
> + }
> + }
> +
> + if (!WARN_ON(!bond) && refcount_dec_and_test(>refs)) {
> + list_del(>list);
> + arm_smmu_mmu_notifier_put(bond->smmu_mn);
> + iommu_sva_free_pasid(bond->mm);

Can be dropped since iommu.c does PASID allocation (also the one in
__arm_smmu_sva_bind() as a cleanup patch)

> + kfree(bond);
> + }
> + mutex_unlock(_lock);
> +}
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 8e262210b5ad..2e9d3cd30510 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -88,6 +88,8 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
>   { 0, NULL},
>  };
>  
> +static void arm_smmu_domain_free(struct iommu_domain *domain);
> +
>  static void parse_driver_options(struct arm_smmu_device *smmu)
>  {
>   int i = 0;
> @@ -1995,6 +1997,12 @@ static bool arm_smmu_capable(enum iommu_cap cap)
>   }
>  }
>  
> +static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
> + .attach_dev_pasid   = arm_smmu_sva_attach_dev_pasid,
> + .detach_dev_pasid   = arm_smmu_sva_detach_dev_pasid,
> + .free   = arm_smmu_domain_free,
> +};
> +
>  static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
>  {
>   

Re: [PATCH RFC 01/11] iommu: Add pasid_bits field in struct dev_iommu

2022-03-21 Thread Jean-Philippe Brucker
Hi Baolu,

On Sun, Mar 20, 2022 at 02:40:20PM +0800, Lu Baolu wrote:
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 627a3ed5ee8f..8e262210b5ad 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2812,6 +2812,7 @@ static int arm_smmu_dev_enable_feature(struct device 
> *dev,
>   master->iopf_enabled = true;
>   return 0;
>   case IOMMU_DEV_FEAT_SVA:
> + dev->iommu->pasid_bits = master->ssid_bits;

This would be better in arm_smmu_probe_device()

Thanks,
Jean

>   return arm_smmu_master_enable_sva(master);
>   default:
>   return -EINVAL;
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 8/8] iommu: Remove unused driver data in sva_bind_device

2022-03-15 Thread Jean-Philippe Brucker
On Mon, Mar 14, 2022 at 10:07:12PM -0700, Jacob Pan wrote:
> No one is using drvdata for sva_bind_device after kernel SVA support is
> removed from VT-d driver. Remove the drvdata parameter as well.
> 
> Signed-off-by: Jacob Pan 

Reviewed-by: Jean-Philippe Brucker 

> ---
>  drivers/dma/idxd/cdev.c | 2 +-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 ++---
>  drivers/iommu/intel/svm.c   | 9 -
>  drivers/iommu/iommu.c   | 4 ++--
>  drivers/misc/uacce/uacce.c  | 2 +-
>  include/linux/intel-iommu.h | 3 +--
>  include/linux/iommu.h   | 9 +++--
>  8 files changed, 15 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c
> index b9b2b4a4124e..312ec37ebf91 100644
> --- a/drivers/dma/idxd/cdev.c
> +++ b/drivers/dma/idxd/cdev.c
> @@ -100,7 +100,7 @@ static int idxd_cdev_open(struct inode *inode, struct 
> file *filp)
>   filp->private_data = ctx;
>  
>   if (device_pasid_enabled(idxd)) {
> - sva = iommu_sva_bind_device(dev, current->mm, NULL);
> + sva = iommu_sva_bind_device(dev, current->mm);
>   if (IS_ERR(sva)) {
>   rc = PTR_ERR(sva);
>   dev_err(dev, "pasid allocation failed: %d\n", rc);
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index a737ba5f727e..eb2f5cb0701a 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -354,7 +354,7 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct 
> *mm)
>  }
>  
>  struct iommu_sva *
> -arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
> +arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
>  {
>   struct iommu_sva *handle;
>   struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index cd48590ada30..d2ba86470c42 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -754,8 +754,7 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master 
> *master);
>  int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
>  int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
>  bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
> -struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm,
> - void *drvdata);
> +struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct 
> *mm);
>  void arm_smmu_sva_unbind(struct iommu_sva *handle);
>  u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
>  void arm_smmu_sva_notifier_synchronize(void);
> @@ -791,7 +790,7 @@ static inline bool arm_smmu_master_iopf_supported(struct 
> arm_smmu_master *master
>  }
>  
>  static inline struct iommu_sva *
> -arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
> +arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
>  {
>   return ERR_PTR(-ENODEV);
>  }
> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> index 37d6218f173b..94deb58375f5 100644
> --- a/drivers/iommu/intel/svm.c
> +++ b/drivers/iommu/intel/svm.c
> @@ -500,8 +500,7 @@ int intel_svm_unbind_gpasid(struct device *dev, u32 pasid)
>   return ret;
>  }
>  
> -static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm,
> -  unsigned int flags)
> +static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm)
>  {
>   ioasid_t max_pasid = dev_is_pci(dev) ?
>   pci_max_pasids(to_pci_dev(dev)) : intel_pasid_max_id;
> @@ -1002,20 +1001,20 @@ static irqreturn_t prq_event_thread(int irq, void *d)
>   return IRQ_RETVAL(handled);
>  }
>  
> -struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm, 
> void *drvdata)
> +struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm)
>  {
>   struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
>   struct iommu_sva *sva;
>   int ret;
>  
>   mutex_lock(_mutex);
> - ret = intel_svm_alloc_pasid(dev, mm, flags);
> + ret = intel_svm_alloc_pasid(dev, mm);
>   if (ret) {
>   mutex_unlock(_mutex);
>   return ERR_PTR(ret);
>   }
>  
> - sva = intel_svm_bin

Re: [PATCH v2 2/8] iommu: Add attach/detach_dev_pasid domain ops

2022-03-15 Thread Jean-Philippe Brucker
On Mon, Mar 14, 2022 at 10:07:06PM -0700, Jacob Pan wrote:
> From: Lu Baolu 
> 
> An IOMMU domain represents an address space which can be attached by
> devices that perform DMA within a domain. However, for platforms with
> PASID capability the domain attachment needs be handled at device+PASID
> level. There can be multiple PASIDs within a device and multiple devices
> attached to a given domain.
> This patch introduces a new IOMMU op which support device, PASID, and
> IOMMU domain attachment. The immediate use case is for PASID capable
> devices to perform DMA under DMA APIs.
> 
> Signed-off-by: Lu Baolu 
> Signed-off-by: Jacob Pan 
> ---
>  include/linux/iommu.h | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 369f05c2a4e2..fde5b933dbe3 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -227,6 +227,8 @@ struct iommu_iotlb_gather {
>   * @aux_get_pasid: get the pasid given an aux-domain
>   * @sva_bind: Bind process address space to device
>   * @sva_unbind: Unbind process address space from device
> + * @attach_dev_pasid: attach an iommu domain to a pasid of device
> + * @detach_dev_pasid: detach an iommu domain from a pasid of device

Isn't that operation "assign a PASID to a domain" instead?  In patch 5,
the domain is already attached to the device, so set_domain_pasid() might
be clearer and to the point. If the IOMMU driver did the allocation we
could also avoid patch 1.

If I understand correctly this series is not about a generic PASID API
that allows drivers to manage multiple DMA address spaces, because there
still doesn't seem to be any interest in that. It's about the specific
IDXD use-case, so let's focus on that. We can introduce a specialized call
such as (iommu|dma)_set_device_pasid(), which will be easy to consolidate
later into a more generic "dma_enable_pasid()" API if that ever seems
useful.

Thanks,
Jean

>   * @sva_get_pasid: Get PASID associated to a SVA handle
>   * @page_response: handle page request response
>   * @cache_invalidate: invalidate translation caches
> @@ -296,6 +298,10 @@ struct iommu_ops {
>   struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
> void *drvdata);
>   void (*sva_unbind)(struct iommu_sva *handle);
> + int (*attach_dev_pasid)(struct iommu_domain *domain,
> + struct device *dev, ioasid_t id);
> + void (*detach_dev_pasid)(struct iommu_domain *domain,
> +  struct device *dev, ioasid_t id);
>   u32 (*sva_get_pasid)(struct iommu_sva *handle);
>  
>   int (*page_response)(struct device *dev,
> -- 
> 2.25.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 7/7] iommu: Add iommu_domain::domain_ops

2022-01-26 Thread Jean-Philippe Brucker
On Mon, Jan 24, 2022 at 12:33:02PM -0400, Jason Gunthorpe wrote:
> On Mon, Jan 24, 2022 at 10:16:07AM +0000, Jean-Philippe Brucker wrote:
> > On Mon, Jan 24, 2022 at 09:58:18AM +, Tian, Kevin wrote:
> > > > From: Lu Baolu 
> > > > Sent: Monday, January 24, 2022 3:11 PM
> > > > +/**
> > > > + * struct domain_ops - per-domain ops
> > > > + * @attach_dev: attach an iommu domain to a device
> > > > + * @detach_dev: detach an iommu domain from a device
> > > 
> > > What is the criteria about whether an op should be iommu_ops or domain_ops
> > > when it requires both domain and device pointers like above two (and 
> > > future
> > > PASID-based attach)?
> > > 
> > > Other examples include:
> > >   @apply_resv_region
> > >   @is_attach_deferred
> > 
> > Could attach_dev() be an IOMMU op?  So a driver could set the domain ops
> > in attach_dev() rather than domain_alloc(). That would allow to install
> > map()/unmap() ops that are tailored for the device's IOMMU, which we don't
> > know at domain_alloc() time. 
> 
> I think we should be moving toward 'domain_alloc' returning the
> correct domain and the way the driver implements the domain shouldn't
> change after that.
> 
> > I'm thinking about a guest that has both physical and virtual
> > endpoints, which would ideally use different kinds of domain ops to
> > support both efficiently (caching mode vs page tables)
> 
> In this case shouldn't domain_alloc() reached from the struct device
> already do the correct thing?

Sure, if we can finalise the domains before attach that could also clean
up the drivers a bit.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 7/7] iommu: Add iommu_domain::domain_ops

2022-01-24 Thread Jean-Philippe Brucker
On Mon, Jan 24, 2022 at 09:58:18AM +, Tian, Kevin wrote:
> > From: Lu Baolu 
> > Sent: Monday, January 24, 2022 3:11 PM
> > +/**
> > + * struct domain_ops - per-domain ops
> > + * @attach_dev: attach an iommu domain to a device
> > + * @detach_dev: detach an iommu domain from a device
> 
> What is the criteria about whether an op should be iommu_ops or domain_ops
> when it requires both domain and device pointers like above two (and future
> PASID-based attach)?
> 
> Other examples include:
>   @apply_resv_region
>   @is_attach_deferred

Could attach_dev() be an IOMMU op?  So a driver could set the domain ops
in attach_dev() rather than domain_alloc(). That would allow to install
map()/unmap() ops that are tailored for the device's IOMMU, which we don't
know at domain_alloc() time. I'm thinking about a guest that has both
physical and virtual endpoints, which would ideally use different kinds of
domain ops to support both efficiently (caching mode vs page tables)

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] dt-bindings: Add Arm SMMUv3 PMCG binding

2021-12-10 Thread Jean-Philippe Brucker
On Thu, Nov 18, 2021 at 03:50:54PM +, Robin Murphy wrote:
> > > +  An SMMUv3 may have several Performance Monitor Counter Group (PMCG).
> > > +  They are standalone performance monitoring units that support both
> > > +  architected and IMPLEMENTATION DEFINED event counters.
> > 
> > Humm, I don't know that I agree they are standalone. They could be I
> > guess, but looking at the MMU-600 spec the PMCG looks like it's just a
> > subset of registers in a larger block. This seems similar to MPAM
> > (which I'm working on a binding for) where it's just a register map
> > and interrupts, but every other possible resource is unspecified by
> > the architecture.
> 
> They're "standalone" in the sense that they don't have to be part of an
> SMMU, they could be part of a PCIe root complex or other SoC device that
> couples to an SMMU (e.g. anything that can speak AMBA DTI, in the case of
> our SMMU implementations).

The "standalone" word came from the SMMUv3 spec (IHI0070D.b 10.1):

  The Performance Monitor Counter Groups are standalone monitoring
  facilities and, as such, can be implemented in separate components that
  are all associated with (but not necessarily part of) an SMMU.

> 
> In fact our SMMU TBUs are pretty much separate devices themselves, they just
> *only* speak DTI, so access to their registers is proxied through the TCU
> programming interface.
> 
> > The simplest change from this would be just specifying that the PMCG
> > is child node(s) of whatever it is part of. The extreme would be this
> > is all part of the SMMU binding (i.e. reg entry X is PMCG registers,
> > interrupts entry Y is pmu irq).
> 
> Being a child of its associated device doesn't seem too bad semantically,
> however how would we describe a PMCG as a child of a PCIe node when its
> "reg" property still exists in the parent address space and not PCI
> config/memory space like any of its siblings? Also in practical terms,
> consuming that binding in Linux and getting the things to probe when it may
> want to be independent of whether we even understand the parent node at all
> could be... unpleasant.

So there are multiple options for what "the PMCG is part of".

(a) The SMMU: the spec guarantees that a PMCG is associated with an SMMU.

(b) The MMIO region: may be within the SMMU (as with MMU-600), outside of
it (as does another implementation, two 64k pages after the SMMU base)
or, theoretically, within a separate device (e.g. PCIe controller).

(c) The thing being measured: does not necessarily match the MMIO region.
For example a TBU attached to the PCIe RC but the PMCG MMIO is within
the SMMU region.

(d) None: the PMCG can be probed and driven separately from the SMMU and
other components, as demonstrated by Linux.

Which one is normally picked to decide where to insert a devicetree node?
I guess (b)?  I picked (d) so far as the easiest choice.

(a) is also a reasonable choice, being based on the spec, but it might be
confusing to have a PMCG node inside the SMMU node when the MMIO region is
external, possibly belonging to another device. For the same reason we
could discard (c).

(b) feels more natural, although it's not clear what to do when the PMCG
MMIO region is external or adjacent to the SMMU region. Does the node go
inside the SMMU node or one level up?

Thanks,
Jean

> 
> Robin.
> 
> > > +
> > > +properties:
> > > +  $nodename:
> > > +pattern: "^pmu@[0-9a-f]*"
> > 
> > s/*/+/
> > 
> > Need at least 1 digit.
> > 
> > > +  compatible:
> > > +oneOf:
> > > +  - items:
> > > +- enum:
> > > +  - hisilicon,smmu-v3-pmcg-hip08
> > > +- const: arm,smmu-v3-pmcg
> > > +  - const: arm,smmu-v3-pmcg
> > > +
> > > +  reg:
> > > +description: |
> > > +  Base addresses of the PMCG registers. Either a single address for 
> > > Page 0
> > > +  or an additional address for Page 1, where some registers can be
> > > +  relocated with SMMU_PMCG_CFGR.RELOC_CTRS.
> > > +minItems: 1
> > > +maxItems: 2
> > > +
> > > +  interrupts:
> > > +maxItems: 1
> > > +
> > > +  msi-parent: true
> > > +
> > > +required:
> > > +  - compatible
> > > +  - reg
> > > +
> > > +additionalProperties: false
> > > +
> > > +examples:
> > > +  - |+
> > > +#include 
> > > +#include 
> > > +
> > > +pmu@2b42 {
> > > +compatible = "arm,smmu-v3-pmcg";
> > > +reg = <0 0x2b42 0 0x1000>,
> > > +  <0 0x2b43 0 0x1000>;
> > > +interrupts = ;
> > > +msi-parent = < 0xff>;
> > > +};
> > > +
> > > +pmu@2b44 {
> > > +compatible = "arm,smmu-v3-pmcg";
> > > +reg = <0 0x2b44 0 0x1000>,
> > > +  <0 0x2b45 0 0x1000>;
> > > +interrupts = ;
> > > +msi-parent = < 0xff>;
> > > +};
> > > --
> > > 2.33.1
> > > 
___
iommu mailing list
iommu@lists.linux-foundation.org

Re: [PATCH] iommu/arm-smmu-v3: Constify arm_smmu_mmu_notifier_ops

2021-12-10 Thread Jean-Philippe Brucker
On Sat, Dec 04, 2021 at 11:33:01PM +0100, Rikard Falkeborn wrote:
> The only usage of arm_smmu_mmu_notifier_ops is to assign its address to
> the ops field in the mmu_notifier struct, which is a pointer to const
> struct mmu_notifier_ops. Make it const to allow the compiler to put it
> in read-only memory.
> 
> Signed-off-by: Rikard Falkeborn 

Reviewed-by: Jean-Philippe Brucker 

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index ee66d1f4cb81..a737ba5f727e 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -220,7 +220,7 @@ static void arm_smmu_mmu_notifier_free(struct 
> mmu_notifier *mn)
>   kfree(mn_to_smmu(mn));
>  }
>  
> -static struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
> +static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
>   .invalidate_range   = arm_smmu_mm_invalidate_range,
>   .release= arm_smmu_mm_release,
>   .free_notifier  = arm_smmu_mmu_notifier_free,
> -- 
> 2.34.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/4] ioasid: Reserve a global PASID for in-kernel DMA

2021-12-10 Thread Jean-Philippe Brucker
On Thu, Dec 09, 2021 at 10:14:04AM -0800, Jacob Pan wrote:
> > This looks like we're just one step away from device drivers needing
> > multiple PASIDs for kernel DMA so I'm trying to figure out how to evolve
> > the API towards that. It's probably as simple as keeping a kernel IOASID
> > set at first, but then we'll probably want to optimize by having multiple
> > overlapping sets for each device driver (all separate from the SVA set).
> Sounds reasonable to start with a kernel set for in-kernel DMA once we need
> multiple ones. But I am not sure what *overlapping* sets mean here, could
> you explain?

Given that each device uses a separate PASID table, we could allocate the
same set of PASID values for different device drivers. We just need to
make sure that those values are different from PASIDs allocated for user
SVA.

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/4] ioasid: Reserve a global PASID for in-kernel DMA

2021-12-09 Thread Jean-Philippe Brucker
Hi Jacob,

On Tue, Dec 07, 2021 at 05:47:11AM -0800, Jacob Pan wrote:
> In-kernel DMA is managed by DMA mapping APIs, which supports per device
> addressing mode for legacy DMA requests. With the introduction of
> Process Address Space ID (PASID), device DMA can now target at a finer
> granularity per PASID + Requester ID (RID).
> 
> However, for in-kernel DMA there is no need to differentiate between
> legacy DMA and DMA with PASID in terms of mapping. DMA address mapping
> for RID+PASID can be made identical to the RID. The benefit for the
> drivers is the continuation of DMA mapping APIs without change.
> 
> This patch reserves a special IOASID for devices that perform in-kernel
> DMA requests with PASID. This global IOASID is excluded from the
> IOASID allocator. The analogous case is PASID #0, a special PASID
> reserved for DMA requests without PASID (legacy). We could have different
> kernel PASIDs for individual devices, but for simplicity reasons, a
> globally reserved one will fit the bill.
> 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
>  drivers/iommu/intel/iommu.c | 4 ++--
>  drivers/iommu/intel/pasid.h | 3 +--
>  drivers/iommu/intel/svm.c   | 2 +-
>  drivers/iommu/ioasid.c  | 2 ++
>  include/linux/ioasid.h  | 4 
>  6 files changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index ee66d1f4cb81..ac79a37ffe06 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -329,7 +329,7 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct 
> *mm)
>   return ERR_PTR(-ENOMEM);
>  
>   /* Allocate a PASID for this mm if necessary */
> - ret = iommu_sva_alloc_pasid(mm, 1, (1U << master->ssid_bits) - 1);
> + ret = iommu_sva_alloc_pasid(mm, IOASID_ALLOC_BASE, (1U << 
> master->ssid_bits) - 1);

I'd rather keep hardware limits as parameters here. PASID#0 is reserved by
the SMMUv3 hardware so we have to pass at least 1 here, but VT-d could
change RID_PASID and pass 0. On the other hand IOASID_DMA_PASID depends on
device drivers needs and is not needed on all systems, so I think could
stay within the ioasid allocator. Could VT-d do an ioasid_alloc()/ioasid_get()
to reserve this global PASID, storing it under the device_domain_lock?

This looks like we're just one step away from device drivers needing
multiple PASIDs for kernel DMA so I'm trying to figure out how to evolve
the API towards that. It's probably as simple as keeping a kernel IOASID
set at first, but then we'll probably want to optimize by having multiple
overlapping sets for each device driver (all separate from the SVA set).

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API

2021-12-08 Thread Jean-Philippe Brucker
On Wed, Dec 08, 2021 at 08:56:16AM -0400, Jason Gunthorpe wrote:
> From a progress perspective I would like to start with simple 'page
> tables in userspace', ie no PASID in this step.
> 
> 'page tables in userspace' means an iommufd ioctl to create an
> iommu_domain where the IOMMU HW is directly travesering a
> device-specific page table structure in user space memory. All the HW
> today implements this by using another iommu_domain to allow the IOMMU
> HW DMA access to user memory - ie nesting or multi-stage or whatever.
> 
> This would come along with some ioctls to invalidate the IOTLB.
> 
> I'm imagining this step as a iommu_group->op->create_user_domain()
> driver callback which will create a new kind of domain with
> domain-unique ops. Ie map/unmap related should all be NULL as those
> are impossible operations.
> 
> From there the usual struct device (ie RID) attach/detatch stuff needs
> to take care of routing DMAs to this iommu_domain.
> 
> Step two would be to add the ability for an iommufd using driver to
> request that a RID is connected to an iommu_domain. This
> connection can be requested for any kind of iommu_domain, kernel owned
> or user owned.
> 
> I don't quite have an answer how exactly the SMMUv3 vs Intel
> difference in PASID routing should be resolved.

In SMMUv3 the user pgd is always stored in the PASID table (actually
called "context descriptor table" but I want to avoid confusion with the
VT-d "context table"). And to access the PASID table, the SMMUv3 first
translate its GPA into a PA using the stage-2 page table. For userspace to
pass individual pgds to the kernel, as opposed to passing whole PASID
tables, the host kernel needs to reserve GPA space and map it in stage-2,
so it can store the PASID table in there. Userspace manages GPA space.

This would be easy for a single pgd. In this case the PASID table has a
single entry and userspace could just pass one GPA page during
registration. However it isn't easily generalized to full PASID support,
because managing a multi-level PASID table will require runtime GPA
allocation, and that API is awkward. That's why we opted for "attach PASID
table" operation rather than "attach page table" (back then the choice was
easy since VT-d used the same concept).

So I think the simplest way to support nesting is still to have separate
modes of operations depending on the hardware.

Thanks,
Jean

> 
> to get answers I'm hoping to start building some sketch RFCs for these
> different things on iommufd, hopefully in January. I'm looking at user
> page tables, PASID, dirty tracking and userspace IO fault handling as
> the main features iommufd must tackle.
> 
> The purpose of the sketches would be to validate that the HW features
> we want to exposed can work will with the choices the base is making.
> 
> Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 2/5] iommu/virtio: Support bypass domains

2021-12-01 Thread Jean-Philippe Brucker
The VIRTIO_IOMMU_F_BYPASS_CONFIG feature adds a new flag to the ATTACH
request, that creates a bypass domain. Use it to enable identity
domains.

When VIRTIO_IOMMU_F_BYPASS_CONFIG is not supported by the device, we
currently fail attaching to an identity domain. Future patches will
instead create identity mappings in this case.

Reviewed-by: Kevin Tian 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 80930ce04a16..14dfee76fd19 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -71,6 +71,7 @@ struct viommu_domain {
struct rb_root_cached   mappings;
 
unsigned long   nr_endpoints;
+   boolbypass;
 };
 
 struct viommu_endpoint {
@@ -587,7 +588,9 @@ static struct iommu_domain *viommu_domain_alloc(unsigned 
type)
 {
struct viommu_domain *vdomain;
 
-   if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
+   if (type != IOMMU_DOMAIN_UNMANAGED &&
+   type != IOMMU_DOMAIN_DMA &&
+   type != IOMMU_DOMAIN_IDENTITY)
return NULL;
 
vdomain = kzalloc(sizeof(*vdomain), GFP_KERNEL);
@@ -630,6 +633,17 @@ static int viommu_domain_finalise(struct viommu_endpoint 
*vdev,
vdomain->map_flags  = viommu->map_flags;
vdomain->viommu = viommu;
 
+   if (domain->type == IOMMU_DOMAIN_IDENTITY) {
+   if (!virtio_has_feature(viommu->vdev,
+   VIRTIO_IOMMU_F_BYPASS_CONFIG)) {
+   ida_free(>domain_ids, vdomain->id);
+   vdomain->viommu = NULL;
+   return -EOPNOTSUPP;
+   }
+
+   vdomain->bypass = true;
+   }
+
return 0;
 }
 
@@ -691,6 +705,9 @@ static int viommu_attach_dev(struct iommu_domain *domain, 
struct device *dev)
.domain = cpu_to_le32(vdomain->id),
};
 
+   if (vdomain->bypass)
+   req.flags |= cpu_to_le32(VIRTIO_IOMMU_ATTACH_F_BYPASS);
+
for (i = 0; i < fwspec->num_ids; i++) {
req.endpoint = cpu_to_le32(fwspec->ids[i]);
 
@@ -1132,6 +1149,7 @@ static unsigned int features[] = {
VIRTIO_IOMMU_F_DOMAIN_RANGE,
VIRTIO_IOMMU_F_PROBE,
VIRTIO_IOMMU_F_MMIO,
+   VIRTIO_IOMMU_F_BYPASS_CONFIG,
 };
 
 static struct virtio_device_id id_table[] = {
-- 
2.34.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 3/5] iommu/virtio: Sort reserved regions

2021-12-01 Thread Jean-Philippe Brucker
To ease identity mapping support, keep the list of reserved regions
sorted.

Reviewed-by: Eric Auger 
Reviewed-by: Kevin Tian 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 14dfee76fd19..1b3c1f2741c6 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -423,7 +423,7 @@ static int viommu_add_resv_mem(struct viommu_endpoint *vdev,
size_t size;
u64 start64, end64;
phys_addr_t start, end;
-   struct iommu_resv_region *region = NULL;
+   struct iommu_resv_region *region = NULL, *next;
unsigned long prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
start = start64 = le64_to_cpu(mem->start);
@@ -454,7 +454,12 @@ static int viommu_add_resv_mem(struct viommu_endpoint 
*vdev,
if (!region)
return -ENOMEM;
 
-   list_add(>list, >resv_regions);
+   /* Keep the list sorted */
+   list_for_each_entry(next, >resv_regions, list) {
+   if (next->start > region->start)
+   break;
+   }
+   list_add_tail(>list, >list);
return 0;
 }
 
-- 
2.34.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 4/5] iommu/virtio: Pass end address to viommu_add_mapping()

2021-12-01 Thread Jean-Philippe Brucker
To support identity mappings, the virtio-iommu driver must be able to
represent full 64-bit ranges internally. Pass (start, end) instead of
(start, size) to viommu_add/del_mapping().

Clean comments. The one about the returned size was never true: when
sweeping the whole address space the returned size will most certainly
be smaller than 2^64.

Reviewed-by: Eric Auger 
Reviewed-by: Kevin Tian 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 1b3c1f2741c6..2fa370c2659c 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -311,8 +311,8 @@ static int viommu_send_req_sync(struct viommu_dev *viommu, 
void *buf,
  *
  * On success, return the new mapping. Otherwise return NULL.
  */
-static int viommu_add_mapping(struct viommu_domain *vdomain, unsigned long 
iova,
- phys_addr_t paddr, size_t size, u32 flags)
+static int viommu_add_mapping(struct viommu_domain *vdomain, u64 iova, u64 end,
+ phys_addr_t paddr, u32 flags)
 {
unsigned long irqflags;
struct viommu_mapping *mapping;
@@ -323,7 +323,7 @@ static int viommu_add_mapping(struct viommu_domain 
*vdomain, unsigned long iova,
 
mapping->paddr  = paddr;
mapping->iova.start = iova;
-   mapping->iova.last  = iova + size - 1;
+   mapping->iova.last  = end;
mapping->flags  = flags;
 
spin_lock_irqsave(>mappings_lock, irqflags);
@@ -338,26 +338,24 @@ static int viommu_add_mapping(struct viommu_domain 
*vdomain, unsigned long iova,
  *
  * @vdomain: the domain
  * @iova: start of the range
- * @size: size of the range. A size of 0 corresponds to the entire address
- * space.
+ * @end: end of the range
  *
- * On success, returns the number of unmapped bytes (>= size)
+ * On success, returns the number of unmapped bytes
  */
 static size_t viommu_del_mappings(struct viommu_domain *vdomain,
- unsigned long iova, size_t size)
+ u64 iova, u64 end)
 {
size_t unmapped = 0;
unsigned long flags;
-   unsigned long last = iova + size - 1;
struct viommu_mapping *mapping = NULL;
struct interval_tree_node *node, *next;
 
spin_lock_irqsave(>mappings_lock, flags);
-   next = interval_tree_iter_first(>mappings, iova, last);
+   next = interval_tree_iter_first(>mappings, iova, end);
while (next) {
node = next;
mapping = container_of(node, struct viommu_mapping, iova);
-   next = interval_tree_iter_next(node, iova, last);
+   next = interval_tree_iter_next(node, iova, end);
 
/* Trying to split a mapping? */
if (mapping->iova.start < iova)
@@ -656,8 +654,8 @@ static void viommu_domain_free(struct iommu_domain *domain)
 {
struct viommu_domain *vdomain = to_viommu_domain(domain);
 
-   /* Free all remaining mappings (size 2^64) */
-   viommu_del_mappings(vdomain, 0, 0);
+   /* Free all remaining mappings */
+   viommu_del_mappings(vdomain, 0, ULLONG_MAX);
 
if (vdomain->viommu)
ida_free(>viommu->domain_ids, vdomain->id);
@@ -742,6 +740,7 @@ static int viommu_map(struct iommu_domain *domain, unsigned 
long iova,
 {
int ret;
u32 flags;
+   u64 end = iova + size - 1;
struct virtio_iommu_req_map map;
struct viommu_domain *vdomain = to_viommu_domain(domain);
 
@@ -752,7 +751,7 @@ static int viommu_map(struct iommu_domain *domain, unsigned 
long iova,
if (flags & ~vdomain->map_flags)
return -EINVAL;
 
-   ret = viommu_add_mapping(vdomain, iova, paddr, size, flags);
+   ret = viommu_add_mapping(vdomain, iova, end, paddr, flags);
if (ret)
return ret;
 
@@ -761,7 +760,7 @@ static int viommu_map(struct iommu_domain *domain, unsigned 
long iova,
.domain = cpu_to_le32(vdomain->id),
.virt_start = cpu_to_le64(iova),
.phys_start = cpu_to_le64(paddr),
-   .virt_end   = cpu_to_le64(iova + size - 1),
+   .virt_end   = cpu_to_le64(end),
.flags  = cpu_to_le32(flags),
};
 
@@ -770,7 +769,7 @@ static int viommu_map(struct iommu_domain *domain, unsigned 
long iova,
 
ret = viommu_send_req_sync(vdomain->viommu, , sizeof(map));
if (ret)
-   viommu_del_mappings(vdomain, iova, size);
+   viommu_del_mappings(vdomain, iova, end);
 
return ret;
 }
@@ -783,7 +782,7 @@ static size_t viommu_unmap(struct iommu_domain *domain, 
unsigned long iova,
struct virtio_iommu_req_unmap un

[PATCH v3 1/5] iommu/virtio: Add definitions for VIRTIO_IOMMU_F_BYPASS_CONFIG

2021-12-01 Thread Jean-Philippe Brucker
Add definitions for the VIRTIO_IOMMU_F_BYPASS_CONFIG, which supersedes
VIRTIO_IOMMU_F_BYPASS.

Reviewed-by: Kevin Tian 
Signed-off-by: Jean-Philippe Brucker 
---
 include/uapi/linux/virtio_iommu.h | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/virtio_iommu.h 
b/include/uapi/linux/virtio_iommu.h
index 237e36a280cb..1ff357f0d72e 100644
--- a/include/uapi/linux/virtio_iommu.h
+++ b/include/uapi/linux/virtio_iommu.h
@@ -16,6 +16,7 @@
 #define VIRTIO_IOMMU_F_BYPASS  3
 #define VIRTIO_IOMMU_F_PROBE   4
 #define VIRTIO_IOMMU_F_MMIO5
+#define VIRTIO_IOMMU_F_BYPASS_CONFIG   6
 
 struct virtio_iommu_range_64 {
__le64  start;
@@ -36,6 +37,8 @@ struct virtio_iommu_config {
struct virtio_iommu_range_32domain_range;
/* Probe buffer size */
__le32  probe_size;
+   __u8bypass;
+   __u8reserved[3];
 };
 
 /* Request types */
@@ -66,11 +69,14 @@ struct virtio_iommu_req_tail {
__u8reserved[3];
 };
 
+#define VIRTIO_IOMMU_ATTACH_F_BYPASS   (1 << 0)
+
 struct virtio_iommu_req_attach {
struct virtio_iommu_req_headhead;
__le32  domain;
__le32  endpoint;
-   __u8reserved[8];
+   __le32  flags;
+   __u8reserved[4];
struct virtio_iommu_req_tailtail;
 };
 
-- 
2.34.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 5/5] iommu/virtio: Support identity-mapped domains

2021-12-01 Thread Jean-Philippe Brucker
Support identity domains for devices that do not offer the
VIRTIO_IOMMU_F_BYPASS_CONFIG feature, by creating 1:1 mappings between
the virtual and physical address space. Identity domains created this
way still perform noticeably better than DMA domains, because they don't
have the overhead of setting up and tearing down mappings at runtime.
The performance difference between this and bypass is minimal in
comparison.

It does not matter that the physical addresses in the identity mappings
do not all correspond to memory. By enabling passthrough we are trusting
the device driver and the device itself to only perform DMA to suitable
locations. In some cases it may even be desirable to perform DMA to MMIO
regions.

Reviewed-by: Eric Auger 
Reviewed-by: Kevin Tian 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c | 61 +---
 1 file changed, 57 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 2fa370c2659c..6a8a52b4297b 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -375,6 +375,55 @@ static size_t viommu_del_mappings(struct viommu_domain 
*vdomain,
return unmapped;
 }
 
+/*
+ * Fill the domain with identity mappings, skipping the device's reserved
+ * regions.
+ */
+static int viommu_domain_map_identity(struct viommu_endpoint *vdev,
+ struct viommu_domain *vdomain)
+{
+   int ret;
+   struct iommu_resv_region *resv;
+   u64 iova = vdomain->domain.geometry.aperture_start;
+   u64 limit = vdomain->domain.geometry.aperture_end;
+   u32 flags = VIRTIO_IOMMU_MAP_F_READ | VIRTIO_IOMMU_MAP_F_WRITE;
+   unsigned long granule = 1UL << __ffs(vdomain->domain.pgsize_bitmap);
+
+   iova = ALIGN(iova, granule);
+   limit = ALIGN_DOWN(limit + 1, granule) - 1;
+
+   list_for_each_entry(resv, >resv_regions, list) {
+   u64 resv_start = ALIGN_DOWN(resv->start, granule);
+   u64 resv_end = ALIGN(resv->start + resv->length, granule) - 1;
+
+   if (resv_end < iova || resv_start > limit)
+   /* No overlap */
+   continue;
+
+   if (resv_start > iova) {
+   ret = viommu_add_mapping(vdomain, iova, resv_start - 1,
+(phys_addr_t)iova, flags);
+   if (ret)
+   goto err_unmap;
+   }
+
+   if (resv_end >= limit)
+   return 0;
+
+   iova = resv_end + 1;
+   }
+
+   ret = viommu_add_mapping(vdomain, iova, limit, (phys_addr_t)iova,
+flags);
+   if (ret)
+   goto err_unmap;
+   return 0;
+
+err_unmap:
+   viommu_del_mappings(vdomain, 0, iova);
+   return ret;
+}
+
 /*
  * viommu_replay_mappings - re-send MAP requests
  *
@@ -637,14 +686,18 @@ static int viommu_domain_finalise(struct viommu_endpoint 
*vdev,
vdomain->viommu = viommu;
 
if (domain->type == IOMMU_DOMAIN_IDENTITY) {
-   if (!virtio_has_feature(viommu->vdev,
-   VIRTIO_IOMMU_F_BYPASS_CONFIG)) {
+   if (virtio_has_feature(viommu->vdev,
+  VIRTIO_IOMMU_F_BYPASS_CONFIG)) {
+   vdomain->bypass = true;
+   return 0;
+   }
+
+   ret = viommu_domain_map_identity(vdev, vdomain);
+   if (ret) {
ida_free(>domain_ids, vdomain->id);
vdomain->viommu = NULL;
return -EOPNOTSUPP;
}
-
-   vdomain->bypass = true;
}
 
return 0;
-- 
2.34.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 0/5] iommu/virtio: Add identity domains

2021-12-01 Thread Jean-Philippe Brucker
Support identity domains, allowing to only enable IOMMU protection for a
subset of endpoints (those assigned to userspace, for example). Users
may enable identity domains at compile time
(CONFIG_IOMMU_DEFAULT_PASSTHROUGH), boot time (iommu.passthrough=1) or
runtime (/sys/kernel/iommu_groups/*/type = identity).

Since v2 [1] I fixed the padding in patch 1 and a rebase error in patch
5, reported by Eric.

Patches 1-2 support identity domains using the optional
VIRTIO_IOMMU_F_BYPASS_CONFIG feature, which was accepted into the spec
[2]. Patches 3-5 add a fallback to identity mappings, when the feature
is not supported.

QEMU patches are on my virtio-iommu/bypass branch [3], and depend on the
UAPI update.

[1] 
https://lore.kernel.org/linux-iommu/20211123155301.1047943-1-jean-phili...@linaro.org/
[2] https://github.com/oasis-tcs/virtio-spec/issues/119
[3] https://jpbrucker.net/git/qemu/log/?h=virtio-iommu/bypass

Jean-Philippe Brucker (5):
  iommu/virtio: Add definitions for VIRTIO_IOMMU_F_BYPASS_CONFIG
  iommu/virtio: Support bypass domains
  iommu/virtio: Sort reserved regions
  iommu/virtio: Pass end address to viommu_add_mapping()
  iommu/virtio: Support identity-mapped domains

 include/uapi/linux/virtio_iommu.h |   8 ++-
 drivers/iommu/virtio-iommu.c  | 113 +-
 2 files changed, 101 insertions(+), 20 deletions(-)

-- 
2.34.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 5/5] iommu/virtio: Support identity-mapped domains

2021-11-29 Thread Jean-Philippe Brucker
On Sat, Nov 27, 2021 at 06:09:56PM +0100, Eric Auger wrote:
> > -   vdomain->viommu = 0;
> > +   vdomain->viommu = NULL;
> nit: that change could have been done in patch 2

Ah yes, I changed that in v2 but fixed up the wrong patch

> > return -EOPNOTSUPP;
> > }
> > -
> > -   vdomain->bypass = true;
> > }
> >  
> > return 0;
> Besides
> Reviewed-by: Eric Auger 

Thanks!
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 4/5] iommu/virtio: Pass end address to viommu_add_mapping()

2021-11-29 Thread Jean-Philippe Brucker
On Sat, Nov 27, 2021 at 06:09:56PM -0500, Michael S. Tsirkin wrote:
> > > -static int viommu_add_mapping(struct viommu_domain *vdomain, unsigned 
> > > long iova,
> > > -   phys_addr_t paddr, size_t size, u32 flags)
> > > +static int viommu_add_mapping(struct viommu_domain *vdomain, u64 iova, 
> > > u64 end,
> > > +   phys_addr_t paddr, u32 flags)
> > >  {
> > >   unsigned long irqflags;
> > >   struct viommu_mapping *mapping;
> 
> I am worried that API changes like that will cause subtle
> bugs since types of arguments change but not their
> number. If we forgot to update some callers it will all be messed up.
> 
> How about passing struct Range instead?

I gave struct range a try but it looks messier overall since it would only
be used to pass arguments. I think the update is safe enough because there
is one caller for viommu_add_mapping() and two for viommu_del_mappings(),
at the moment.

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 2/5] iommu/virtio: Support bypass domains

2021-11-29 Thread Jean-Philippe Brucker
On Sat, Nov 27, 2021 at 05:18:28PM +0100, Eric Auger wrote:
> Hi Jean,
> 
> On 11/23/21 4:52 PM, Jean-Philippe Brucker wrote:
> > The VIRTIO_IOMMU_F_BYPASS_CONFIG feature adds a new flag to the ATTACH
> > request, that creates a bypass domain. Use it to enable identity
> > domains.
> >
> > When VIRTIO_IOMMU_F_BYPASS_CONFIG is not supported by the device, we
> > currently fail attaching to an identity domain. Future patches will
> > instead create identity mappings in this case.
> >
> > Reviewed-by: Kevin Tian 
> > Signed-off-by: Jean-Philippe Brucker 
> > ---
> >  drivers/iommu/virtio-iommu.c | 20 +++-
> >  1 file changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> > index 80930ce04a16..ee8a7afd667b 100644
> > --- a/drivers/iommu/virtio-iommu.c
> > +++ b/drivers/iommu/virtio-iommu.c
> > @@ -71,6 +71,7 @@ struct viommu_domain {
> > struct rb_root_cached   mappings;
> >  
> > unsigned long   nr_endpoints;
> > +   boolbypass;
> >  };
> >  
> >  struct viommu_endpoint {
> > @@ -587,7 +588,9 @@ static struct iommu_domain 
> > *viommu_domain_alloc(unsigned type)
> >  {
> > struct viommu_domain *vdomain;
> >  
> > -   if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
> > +   if (type != IOMMU_DOMAIN_UNMANAGED &&
> > +   type != IOMMU_DOMAIN_DMA &&
> > +   type != IOMMU_DOMAIN_IDENTITY)
> > return NULL;
> >  
> > vdomain = kzalloc(sizeof(*vdomain), GFP_KERNEL);
> > @@ -630,6 +633,17 @@ static int viommu_domain_finalise(struct 
> > viommu_endpoint *vdev,
> > vdomain->map_flags  = viommu->map_flags;
> > vdomain->viommu = viommu;
> >  
> > +   if (domain->type == IOMMU_DOMAIN_IDENTITY) {
> > +   if (!virtio_has_feature(viommu->vdev,
> nit: couldn't the check be done before the ida_alloc_range(),
> simplifying the failure cleanup?

It could, but patch 5 falls back to identity mappings, which is better
left at the end of the function to keep the error path simple. I put this
at the end already here, so patch 5 doesn't need to move things around.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 1/5] iommu/virtio: Add definitions for VIRTIO_IOMMU_F_BYPASS_CONFIG

2021-11-29 Thread Jean-Philippe Brucker
Hi Eric,

On Sat, Nov 27, 2021 at 08:59:25AM +0100, Eric Auger wrote:
> > @@ -36,6 +37,8 @@ struct virtio_iommu_config {
> > struct virtio_iommu_range_32domain_range;
> > /* Probe buffer size */
> > __le32  probe_size;
> > +   __u8bypass;
> > +   __u8reserved[7];
> in [PATCH v3] virtio-iommu: Rework the bypass feature I see
> 
> +  u8 bypass;
> +  u8 reserved[3];
> 
> What was exactly voted?

Good catch, this should be 3. It brings the config struct to 40 bytes,
which is the size compilers generate when there is no reserved field.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 4/5] iommu/virtio: Pass end address to viommu_add_mapping()

2021-11-23 Thread Jean-Philippe Brucker
To support identity mappings, the virtio-iommu driver must be able to
represent full 64-bit ranges internally. Pass (start, end) instead of
(start, size) to viommu_add/del_mapping().

Clean comments. The one about the returned size was never true: when
sweeping the whole address space the returned size will most certainly
be smaller than 2^64.

Reviewed-by: Kevin Tian 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index d63ec4d11b00..eceb9281c8c1 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -311,8 +311,8 @@ static int viommu_send_req_sync(struct viommu_dev *viommu, 
void *buf,
  *
  * On success, return the new mapping. Otherwise return NULL.
  */
-static int viommu_add_mapping(struct viommu_domain *vdomain, unsigned long 
iova,
- phys_addr_t paddr, size_t size, u32 flags)
+static int viommu_add_mapping(struct viommu_domain *vdomain, u64 iova, u64 end,
+ phys_addr_t paddr, u32 flags)
 {
unsigned long irqflags;
struct viommu_mapping *mapping;
@@ -323,7 +323,7 @@ static int viommu_add_mapping(struct viommu_domain 
*vdomain, unsigned long iova,
 
mapping->paddr  = paddr;
mapping->iova.start = iova;
-   mapping->iova.last  = iova + size - 1;
+   mapping->iova.last  = end;
mapping->flags  = flags;
 
spin_lock_irqsave(>mappings_lock, irqflags);
@@ -338,26 +338,24 @@ static int viommu_add_mapping(struct viommu_domain 
*vdomain, unsigned long iova,
  *
  * @vdomain: the domain
  * @iova: start of the range
- * @size: size of the range. A size of 0 corresponds to the entire address
- * space.
+ * @end: end of the range
  *
- * On success, returns the number of unmapped bytes (>= size)
+ * On success, returns the number of unmapped bytes
  */
 static size_t viommu_del_mappings(struct viommu_domain *vdomain,
- unsigned long iova, size_t size)
+ u64 iova, u64 end)
 {
size_t unmapped = 0;
unsigned long flags;
-   unsigned long last = iova + size - 1;
struct viommu_mapping *mapping = NULL;
struct interval_tree_node *node, *next;
 
spin_lock_irqsave(>mappings_lock, flags);
-   next = interval_tree_iter_first(>mappings, iova, last);
+   next = interval_tree_iter_first(>mappings, iova, end);
while (next) {
node = next;
mapping = container_of(node, struct viommu_mapping, iova);
-   next = interval_tree_iter_next(node, iova, last);
+   next = interval_tree_iter_next(node, iova, end);
 
/* Trying to split a mapping? */
if (mapping->iova.start < iova)
@@ -656,8 +654,8 @@ static void viommu_domain_free(struct iommu_domain *domain)
 {
struct viommu_domain *vdomain = to_viommu_domain(domain);
 
-   /* Free all remaining mappings (size 2^64) */
-   viommu_del_mappings(vdomain, 0, 0);
+   /* Free all remaining mappings */
+   viommu_del_mappings(vdomain, 0, ULLONG_MAX);
 
if (vdomain->viommu)
ida_free(>viommu->domain_ids, vdomain->id);
@@ -742,6 +740,7 @@ static int viommu_map(struct iommu_domain *domain, unsigned 
long iova,
 {
int ret;
u32 flags;
+   u64 end = iova + size - 1;
struct virtio_iommu_req_map map;
struct viommu_domain *vdomain = to_viommu_domain(domain);
 
@@ -752,7 +751,7 @@ static int viommu_map(struct iommu_domain *domain, unsigned 
long iova,
if (flags & ~vdomain->map_flags)
return -EINVAL;
 
-   ret = viommu_add_mapping(vdomain, iova, paddr, size, flags);
+   ret = viommu_add_mapping(vdomain, iova, end, paddr, flags);
if (ret)
return ret;
 
@@ -761,7 +760,7 @@ static int viommu_map(struct iommu_domain *domain, unsigned 
long iova,
.domain = cpu_to_le32(vdomain->id),
.virt_start = cpu_to_le64(iova),
.phys_start = cpu_to_le64(paddr),
-   .virt_end   = cpu_to_le64(iova + size - 1),
+   .virt_end   = cpu_to_le64(end),
.flags  = cpu_to_le32(flags),
};
 
@@ -770,7 +769,7 @@ static int viommu_map(struct iommu_domain *domain, unsigned 
long iova,
 
ret = viommu_send_req_sync(vdomain->viommu, , sizeof(map));
if (ret)
-   viommu_del_mappings(vdomain, iova, size);
+   viommu_del_mappings(vdomain, iova, end);
 
return ret;
 }
@@ -783,7 +782,7 @@ static size_t viommu_unmap(struct iommu_domain *domain, 
unsigned long iova,
struct virtio_iommu_req_unmap unmap;
  

[PATCH v2 5/5] iommu/virtio: Support identity-mapped domains

2021-11-23 Thread Jean-Philippe Brucker
Support identity domains for devices that do not offer the
VIRTIO_IOMMU_F_BYPASS_CONFIG feature, by creating 1:1 mappings between
the virtual and physical address space. Identity domains created this
way still perform noticeably better than DMA domains, because they don't
have the overhead of setting up and tearing down mappings at runtime.
The performance difference between this and bypass is minimal in
comparison.

It does not matter that the physical addresses in the identity mappings
do not all correspond to memory. By enabling passthrough we are trusting
the device driver and the device itself to only perform DMA to suitable
locations. In some cases it may even be desirable to perform DMA to MMIO
regions.

Reviewed-by: Kevin Tian 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c | 63 +---
 1 file changed, 58 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index eceb9281c8c1..6a8a52b4297b 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -375,6 +375,55 @@ static size_t viommu_del_mappings(struct viommu_domain 
*vdomain,
return unmapped;
 }
 
+/*
+ * Fill the domain with identity mappings, skipping the device's reserved
+ * regions.
+ */
+static int viommu_domain_map_identity(struct viommu_endpoint *vdev,
+ struct viommu_domain *vdomain)
+{
+   int ret;
+   struct iommu_resv_region *resv;
+   u64 iova = vdomain->domain.geometry.aperture_start;
+   u64 limit = vdomain->domain.geometry.aperture_end;
+   u32 flags = VIRTIO_IOMMU_MAP_F_READ | VIRTIO_IOMMU_MAP_F_WRITE;
+   unsigned long granule = 1UL << __ffs(vdomain->domain.pgsize_bitmap);
+
+   iova = ALIGN(iova, granule);
+   limit = ALIGN_DOWN(limit + 1, granule) - 1;
+
+   list_for_each_entry(resv, >resv_regions, list) {
+   u64 resv_start = ALIGN_DOWN(resv->start, granule);
+   u64 resv_end = ALIGN(resv->start + resv->length, granule) - 1;
+
+   if (resv_end < iova || resv_start > limit)
+   /* No overlap */
+   continue;
+
+   if (resv_start > iova) {
+   ret = viommu_add_mapping(vdomain, iova, resv_start - 1,
+(phys_addr_t)iova, flags);
+   if (ret)
+   goto err_unmap;
+   }
+
+   if (resv_end >= limit)
+   return 0;
+
+   iova = resv_end + 1;
+   }
+
+   ret = viommu_add_mapping(vdomain, iova, limit, (phys_addr_t)iova,
+flags);
+   if (ret)
+   goto err_unmap;
+   return 0;
+
+err_unmap:
+   viommu_del_mappings(vdomain, 0, iova);
+   return ret;
+}
+
 /*
  * viommu_replay_mappings - re-send MAP requests
  *
@@ -637,14 +686,18 @@ static int viommu_domain_finalise(struct viommu_endpoint 
*vdev,
vdomain->viommu = viommu;
 
if (domain->type == IOMMU_DOMAIN_IDENTITY) {
-   if (!virtio_has_feature(viommu->vdev,
-   VIRTIO_IOMMU_F_BYPASS_CONFIG)) {
+   if (virtio_has_feature(viommu->vdev,
+  VIRTIO_IOMMU_F_BYPASS_CONFIG)) {
+   vdomain->bypass = true;
+   return 0;
+   }
+
+   ret = viommu_domain_map_identity(vdev, vdomain);
+   if (ret) {
ida_free(>domain_ids, vdomain->id);
-   vdomain->viommu = 0;
+   vdomain->viommu = NULL;
return -EOPNOTSUPP;
}
-
-   vdomain->bypass = true;
}
 
return 0;
-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 3/5] iommu/virtio: Sort reserved regions

2021-11-23 Thread Jean-Philippe Brucker
To ease identity mapping support, keep the list of reserved regions
sorted.

Reviewed-by: Kevin Tian 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index ee8a7afd667b..d63ec4d11b00 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -423,7 +423,7 @@ static int viommu_add_resv_mem(struct viommu_endpoint *vdev,
size_t size;
u64 start64, end64;
phys_addr_t start, end;
-   struct iommu_resv_region *region = NULL;
+   struct iommu_resv_region *region = NULL, *next;
unsigned long prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
start = start64 = le64_to_cpu(mem->start);
@@ -454,7 +454,12 @@ static int viommu_add_resv_mem(struct viommu_endpoint 
*vdev,
if (!region)
return -ENOMEM;
 
-   list_add(>list, >resv_regions);
+   /* Keep the list sorted */
+   list_for_each_entry(next, >resv_regions, list) {
+   if (next->start > region->start)
+   break;
+   }
+   list_add_tail(>list, >list);
return 0;
 }
 
-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 2/5] iommu/virtio: Support bypass domains

2021-11-23 Thread Jean-Philippe Brucker
The VIRTIO_IOMMU_F_BYPASS_CONFIG feature adds a new flag to the ATTACH
request, that creates a bypass domain. Use it to enable identity
domains.

When VIRTIO_IOMMU_F_BYPASS_CONFIG is not supported by the device, we
currently fail attaching to an identity domain. Future patches will
instead create identity mappings in this case.

Reviewed-by: Kevin Tian 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 80930ce04a16..ee8a7afd667b 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -71,6 +71,7 @@ struct viommu_domain {
struct rb_root_cached   mappings;
 
unsigned long   nr_endpoints;
+   boolbypass;
 };
 
 struct viommu_endpoint {
@@ -587,7 +588,9 @@ static struct iommu_domain *viommu_domain_alloc(unsigned 
type)
 {
struct viommu_domain *vdomain;
 
-   if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
+   if (type != IOMMU_DOMAIN_UNMANAGED &&
+   type != IOMMU_DOMAIN_DMA &&
+   type != IOMMU_DOMAIN_IDENTITY)
return NULL;
 
vdomain = kzalloc(sizeof(*vdomain), GFP_KERNEL);
@@ -630,6 +633,17 @@ static int viommu_domain_finalise(struct viommu_endpoint 
*vdev,
vdomain->map_flags  = viommu->map_flags;
vdomain->viommu = viommu;
 
+   if (domain->type == IOMMU_DOMAIN_IDENTITY) {
+   if (!virtio_has_feature(viommu->vdev,
+   VIRTIO_IOMMU_F_BYPASS_CONFIG)) {
+   ida_free(>domain_ids, vdomain->id);
+   vdomain->viommu = 0;
+   return -EOPNOTSUPP;
+   }
+
+   vdomain->bypass = true;
+   }
+
return 0;
 }
 
@@ -691,6 +705,9 @@ static int viommu_attach_dev(struct iommu_domain *domain, 
struct device *dev)
.domain = cpu_to_le32(vdomain->id),
};
 
+   if (vdomain->bypass)
+   req.flags |= cpu_to_le32(VIRTIO_IOMMU_ATTACH_F_BYPASS);
+
for (i = 0; i < fwspec->num_ids; i++) {
req.endpoint = cpu_to_le32(fwspec->ids[i]);
 
@@ -1132,6 +1149,7 @@ static unsigned int features[] = {
VIRTIO_IOMMU_F_DOMAIN_RANGE,
VIRTIO_IOMMU_F_PROBE,
VIRTIO_IOMMU_F_MMIO,
+   VIRTIO_IOMMU_F_BYPASS_CONFIG,
 };
 
 static struct virtio_device_id id_table[] = {
-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 1/5] iommu/virtio: Add definitions for VIRTIO_IOMMU_F_BYPASS_CONFIG

2021-11-23 Thread Jean-Philippe Brucker
Add definitions for the VIRTIO_IOMMU_F_BYPASS_CONFIG, which supersedes
VIRTIO_IOMMU_F_BYPASS.

Reviewed-by: Kevin Tian 
Signed-off-by: Jean-Philippe Brucker 
---
 include/uapi/linux/virtio_iommu.h | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/virtio_iommu.h 
b/include/uapi/linux/virtio_iommu.h
index 237e36a280cb..cafd8cf7febf 100644
--- a/include/uapi/linux/virtio_iommu.h
+++ b/include/uapi/linux/virtio_iommu.h
@@ -16,6 +16,7 @@
 #define VIRTIO_IOMMU_F_BYPASS  3
 #define VIRTIO_IOMMU_F_PROBE   4
 #define VIRTIO_IOMMU_F_MMIO5
+#define VIRTIO_IOMMU_F_BYPASS_CONFIG   6
 
 struct virtio_iommu_range_64 {
__le64  start;
@@ -36,6 +37,8 @@ struct virtio_iommu_config {
struct virtio_iommu_range_32domain_range;
/* Probe buffer size */
__le32  probe_size;
+   __u8bypass;
+   __u8reserved[7];
 };
 
 /* Request types */
@@ -66,11 +69,14 @@ struct virtio_iommu_req_tail {
__u8reserved[3];
 };
 
+#define VIRTIO_IOMMU_ATTACH_F_BYPASS   (1 << 0)
+
 struct virtio_iommu_req_attach {
struct virtio_iommu_req_headhead;
__le32  domain;
__le32  endpoint;
-   __u8reserved[8];
+   __le32  flags;
+   __u8reserved[4];
struct virtio_iommu_req_tailtail;
 };
 
-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 0/5] iommu/virtio: Add identity domains

2021-11-23 Thread Jean-Philippe Brucker
Support identity domains, allowing to only enable IOMMU protection for a
subset of endpoints (those assigned to userspace, for example). Users
may enable identity domains at compile time
(CONFIG_IOMMU_DEFAULT_PASSTHROUGH), boot time (iommu.passthrough=1) or
runtime (/sys/kernel/iommu_groups/*/type = identity).

Since v1 [1] I rebased onto v5.16-rc and added Kevin's review tag.
The specification update for the new feature has now been accepted [2].

Patches 1-2 support identity domains using the optional
VIRTIO_IOMMU_F_BYPASS_CONFIG feature, and patches 3-5 add a fallback to
identity mappings, when the feature is not supported.

QEMU patches are on my virtio-iommu/bypass branch [3], and depend on the
UAPI update.

[1] 
https://lore.kernel.org/linux-iommu/20211013121052.518113-1-jean-phili...@linaro.org/
[2] https://github.com/oasis-tcs/virtio-spec/issues/119
[3] https://jpbrucker.net/git/qemu/log/?h=virtio-iommu/bypass

Jean-Philippe Brucker (5):
  iommu/virtio: Add definitions for VIRTIO_IOMMU_F_BYPASS_CONFIG
  iommu/virtio: Support bypass domains
  iommu/virtio: Sort reserved regions
  iommu/virtio: Pass end address to viommu_add_mapping()
  iommu/virtio: Support identity-mapped domains

 include/uapi/linux/virtio_iommu.h |   8 ++-
 drivers/iommu/virtio-iommu.c  | 113 +-
 2 files changed, 101 insertions(+), 20 deletions(-)

-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 3/3] perf/smmuv3: Synthesize IIDR from CoreSight ID registers

2021-11-17 Thread Jean-Philippe Brucker
From: Robin Murphy 

The SMMU_PMCG_IIDR register was not present in older revisions of the
Arm SMMUv3 spec. On Arm Ltd. implementations, the IIDR value consists of
fields from several PIDR registers, allowing us to present a
standardized identifier to userspace.

Signed-off-by: Robin Murphy 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/perf/arm_smmuv3_pmu.c | 55 ++-
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
index 19697617153a..598d6978280d 100644
--- a/drivers/perf/arm_smmuv3_pmu.c
+++ b/drivers/perf/arm_smmuv3_pmu.c
@@ -76,6 +76,10 @@
 #define SMMU_PMCG_CR0xE04
 #define SMMU_PMCG_CR_ENABLE BIT(0)
 #define SMMU_PMCG_IIDR  0xE08
+#define SMMU_PMCG_IIDR_PRODUCTIDGENMASK(31, 20)
+#define SMMU_PMCG_IIDR_VARIANT  GENMASK(19, 16)
+#define SMMU_PMCG_IIDR_REVISION GENMASK(15, 12)
+#define SMMU_PMCG_IIDR_IMPLEMENTER  GENMASK(11, 0)
 #define SMMU_PMCG_CEID0 0xE20
 #define SMMU_PMCG_CEID1 0xE28
 #define SMMU_PMCG_IRQ_CTRL  0xE50
@@ -84,6 +88,20 @@
 #define SMMU_PMCG_IRQ_CFG1  0xE60
 #define SMMU_PMCG_IRQ_CFG2  0xE64
 
+/* IMP-DEF ID registers */
+#define SMMU_PMCG_PIDR0 0xFE0
+#define SMMU_PMCG_PIDR0_PART_0  GENMASK(7, 0)
+#define SMMU_PMCG_PIDR1 0xFE4
+#define SMMU_PMCG_PIDR1_DES_0   GENMASK(7, 4)
+#define SMMU_PMCG_PIDR1_PART_1  GENMASK(3, 0)
+#define SMMU_PMCG_PIDR2 0xFE8
+#define SMMU_PMCG_PIDR2_REVISIONGENMASK(7, 4)
+#define SMMU_PMCG_PIDR2_DES_1   GENMASK(2, 0)
+#define SMMU_PMCG_PIDR3 0xFEC
+#define SMMU_PMCG_PIDR3_REVAND  GENMASK(7, 4)
+#define SMMU_PMCG_PIDR4 0xFD0
+#define SMMU_PMCG_PIDR4_DES_2   GENMASK(3, 0)
+
 /* MSI config fields */
 #define MSI_CFG0_ADDR_MASK  GENMASK_ULL(51, 2)
 #define MSI_CFG2_MEMATTR_DEVICE_nGnRE   0x1
@@ -755,6 +773,41 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu 
*smmu_pmu)
dev_notice(smmu_pmu->dev, "option mask 0x%x\n", smmu_pmu->options);
 }
 
+static bool smmu_pmu_coresight_id_regs(struct smmu_pmu *smmu_pmu)
+{
+   return of_device_is_compatible(smmu_pmu->dev->of_node,
+  "arm,mmu-600-pmcg");
+}
+
+static void smmu_pmu_get_iidr(struct smmu_pmu *smmu_pmu)
+{
+   u32 iidr = readl_relaxed(smmu_pmu->reg_base + SMMU_PMCG_IIDR);
+
+   if (!iidr && smmu_pmu_coresight_id_regs(smmu_pmu)) {
+   u32 pidr0 = readl(smmu_pmu->reg_base + SMMU_PMCG_PIDR0);
+   u32 pidr1 = readl(smmu_pmu->reg_base + SMMU_PMCG_PIDR1);
+   u32 pidr2 = readl(smmu_pmu->reg_base + SMMU_PMCG_PIDR2);
+   u32 pidr3 = readl(smmu_pmu->reg_base + SMMU_PMCG_PIDR3);
+   u32 pidr4 = readl(smmu_pmu->reg_base + SMMU_PMCG_PIDR4);
+
+   u32 productid = FIELD_GET(SMMU_PMCG_PIDR0_PART_0, pidr0) |
+   (FIELD_GET(SMMU_PMCG_PIDR1_PART_1, pidr1) << 8);
+   u32 variant = FIELD_GET(SMMU_PMCG_PIDR2_REVISION, pidr2);
+   u32 revision = FIELD_GET(SMMU_PMCG_PIDR3_REVAND, pidr3);
+   u32 implementer =
+   FIELD_GET(SMMU_PMCG_PIDR1_DES_0, pidr1) |
+   (FIELD_GET(SMMU_PMCG_PIDR2_DES_1, pidr2) << 4) |
+   (FIELD_GET(SMMU_PMCG_PIDR4_DES_2, pidr4) << 8);
+
+   iidr = FIELD_PREP(SMMU_PMCG_IIDR_PRODUCTID, productid) |
+  FIELD_PREP(SMMU_PMCG_IIDR_VARIANT, variant) |
+  FIELD_PREP(SMMU_PMCG_IIDR_REVISION, revision) |
+  FIELD_PREP(SMMU_PMCG_IIDR_IMPLEMENTER, implementer);
+   }
+
+   smmu_pmu->iidr = iidr;
+}
+
 static int smmu_pmu_probe(struct platform_device *pdev)
 {
struct smmu_pmu *smmu_pmu;
@@ -826,7 +879,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
return err;
}
 
-   smmu_pmu->iidr = readl_relaxed(smmu_pmu->reg_base + SMMU_PMCG_IIDR);
+   smmu_pmu_get_iidr(smmu_pmu);
 
name = devm_kasprintf(>dev, GFP_KERNEL, "smmuv3_pmcg_%llx",
  (res_0->start) >> SMMU_PMCG_PA_SHIFT);
-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 2/3] perf/smmuv3: Add devicetree support

2021-11-17 Thread Jean-Philippe Brucker
Add device-tree support to the SMMUv3 PMCG driver.

Signed-off-by: Jay Chen 
Signed-off-by: Jean-Philippe Brucker 
Signed-off-by: Robin Murphy 
---
 drivers/perf/arm_smmuv3_pmu.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
index 226348822ab3..19697617153a 100644
--- a/drivers/perf/arm_smmuv3_pmu.c
+++ b/drivers/perf/arm_smmuv3_pmu.c
@@ -47,6 +47,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -834,7 +835,8 @@ static int smmu_pmu_probe(struct platform_device *pdev)
return -EINVAL;
}
 
-   smmu_pmu_get_acpi_options(smmu_pmu);
+   if (!dev->of_node)
+   smmu_pmu_get_acpi_options(smmu_pmu);
 
/* Pick one CPU to be the preferred one to use */
smmu_pmu->on_cpu = raw_smp_processor_id();
@@ -884,9 +886,16 @@ static void smmu_pmu_shutdown(struct platform_device *pdev)
smmu_pmu_disable(_pmu->pmu);
 }
 
+static const struct of_device_id smmu_pmu_of_match[] = {
+   { .compatible = "arm,smmu-v3-pmcg" },
+   {}
+};
+MODULE_DEVICE_TABLE(of, smmu_pmu_of_match);
+
 static struct platform_driver smmu_pmu_driver = {
.driver = {
.name = "arm-smmu-v3-pmcg",
+   .of_match_table = of_match_ptr(smmu_pmu_of_match),
.suppress_bind_attrs = true,
},
.probe = smmu_pmu_probe,
-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 1/3] dt-bindings: Add Arm SMMUv3 PMCG binding

2021-11-17 Thread Jean-Philippe Brucker
Add binding for the Arm SMMUv3 PMU. Each node represents a PMCG, and is
placed as a sibling node of the SMMU. Although the PMCGs registers may
be within the SMMU MMIO region, they are separate devices, and there can
be multiple PMCG devices for each SMMU (for example one for the TCU and
one for each TBU).

Signed-off-by: Jean-Philippe Brucker 
Signed-off-by: Robin Murphy 
---
 .../bindings/perf/arm,smmu-v3-pmcg.yaml   | 70 +++
 1 file changed, 70 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/perf/arm,smmu-v3-pmcg.yaml

diff --git a/Documentation/devicetree/bindings/perf/arm,smmu-v3-pmcg.yaml 
b/Documentation/devicetree/bindings/perf/arm,smmu-v3-pmcg.yaml
new file mode 100644
index ..a4b53a6a1ebf
--- /dev/null
+++ b/Documentation/devicetree/bindings/perf/arm,smmu-v3-pmcg.yaml
@@ -0,0 +1,70 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/perf/arm,smmu-v3-pmcg.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Arm SMMUv3 Performance Monitor Counter Group
+
+maintainers:
+  - Will Deacon 
+  - Robin Murphy 
+
+description: |
+  An SMMUv3 may have several Performance Monitor Counter Group (PMCG).
+  They are standalone performance monitoring units that support both
+  architected and IMPLEMENTATION DEFINED event counters.
+
+properties:
+  $nodename:
+pattern: "^pmu@[0-9a-f]*"
+  compatible:
+oneOf:
+  - items:
+  - const: arm,mmu-600-pmcg
+  - const: arm,smmu-v3-pmcg
+  - const: arm,smmu-v3-pmcg
+
+  reg:
+items:
+  - description: Register page 0
+  - description: Register page 1, if SMMU_PMCG_CFGR.RELOC_CTRS = 1
+minItems: 1
+
+  interrupts:
+maxItems: 1
+
+  msi-parent: true
+
+required:
+  - compatible
+  - reg
+
+anyOf:
+  - required:
+  - interrupts
+  - required:
+  - msi-parent
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+#include 
+
+pmu@2b42 {
+compatible = "arm,smmu-v3-pmcg";
+reg = <0x2b42 0x1000>,
+  <0x2b43 0x1000>;
+interrupts = ;
+msi-parent = < 0xff>;
+};
+
+pmu@2b44 {
+compatible = "arm,smmu-v3-pmcg";
+reg = <0x2b44 0x1000>,
+  <0x2b45 0x1000>;
+interrupts = ;
+msi-parent = < 0xff>;
+};
-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 0/3] perf/smmuv3: Support devicetree

2021-11-17 Thread Jean-Philippe Brucker
Add devicetree binding for the SMMUv3 PMU, called Performance Monitoring
Counter Group (PMCG) in the spec. Each SMMUv3 implementation can have
multiple independent PMCGs, for example one for the Translation Control
Unit (TCU) and one per Translation Buffer Unit (TBU).

Since v1 [1]:
* Fixed warnings in the binding doc
* Removed hip08 support
* Merged Robin's version. I took the liberty of splitting the driver
  patch into 2 and 3. One fix in patch 3, and whitespace changes (the
  driver uses spaces instead of tabs to align #define values, which I
  was going to fix but actually seems more common across the tree.)

[1] 
https://lore.kernel.org/linux-iommu/2026113536.69758-1-jean-phili...@linaro.org/

Jean-Philippe Brucker (2):
  dt-bindings: Add Arm SMMUv3 PMCG binding
  perf/smmuv3: Add devicetree support

Robin Murphy (1):
  perf/smmuv3: Synthesize IIDR from CoreSight ID registers

 .../bindings/perf/arm,smmu-v3-pmcg.yaml   | 70 +++
 drivers/perf/arm_smmuv3_pmu.c | 66 -
 2 files changed, 134 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/perf/arm,smmu-v3-pmcg.yaml

-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/2] perf/smmuv3: Support devicetree

2021-11-16 Thread Jean-Philippe Brucker
On Tue, Nov 16, 2021 at 05:00:14PM +, Robin Murphy wrote:
> On 2021-11-16 15:42, Jean-Philippe Brucker wrote:
> > On Tue, Nov 16, 2021 at 12:02:47PM +, Robin Murphy wrote:
> > > On 2021-11-16 11:35, Jean-Philippe Brucker wrote:
> > > > Add devicetree binding for the SMMUv3 PMU, called Performance Monitoring
> > > > Counter Group (PMCG) in the spec. Each SMMUv3 implementation can have
> > > > multiple independent PMCGs, for example one for the Translation Control
> > > > Unit (TCU) and one per Translation Buffer Unit (TBU).
> > > > 
> > > > I previously sent the binding as reply to Jay Chen's thread implementing
> > > > device tree support [1]. This posting addresses the comments from that
> > > > thread.
> > > 
> > > Ha, I'd also resurrected this and was planning to post it at some point 
> > > this
> > > week[0] - you should have said :)
> > 
> > Ah sorry about that, I just resent because there was some demand for it at
> > Linaro
> 
> Heh, no worries - it's not like you were even CC'ed on the thread where I
> only mentioned I *might* do it.
> 
> Can I get away with being cheeky and just saying that my review comments are
> the diff between my branch and yours, I wonder...

Sure, that works for me, I'll send a v2 this week or so

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] dt-bindings: Add Arm SMMUv3 PMCG binding

2021-11-16 Thread Jean-Philippe Brucker
On Tue, Nov 16, 2021 at 08:02:53AM -0600, Rob Herring wrote:
> My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
> on your patch (DT_CHECKER_FLAGS is new in v5.13):
> 
> yamllint warnings/errors:
> ./Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.yaml:24:9: 
> [warning] wrong indentation: expected 10 but found 8 (indentation)
> ./Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.yaml:25:11: 
> [warning] wrong indentation: expected 12 but found 10 (indentation)
> 
> dtschema/dtc warnings/errors:
> /builds/robherring/linux-dt-review/Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.example.dt.yaml:
>  example-0: pmu@2b42:reg:0: [0, 725745664, 0, 4096] is too long
>   From schema: 
> /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
> /builds/robherring/linux-dt-review/Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.example.dt.yaml:
>  example-0: pmu@2b42:reg:1: [0, 725811200, 0, 4096] is too long
>   From schema: 
> /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
> /builds/robherring/linux-dt-review/Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.example.dt.yaml:
>  example-0: pmu@2b44:reg:0: [0, 725876736, 0, 4096] is too long
>   From schema: 
> /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
> /builds/robherring/linux-dt-review/Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.example.dt.yaml:
>  example-0: pmu@2b44:reg:1: [0, 725942272, 0, 4096] is too long
>   From schema: 
> /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
> 
> doc reference errors (make refcheckdocs):
> 
> See https://patchwork.ozlabs.org/patch/1555758
> 
> This check can fail if there are any dependencies. The base for a patch
> series is generally the most recent rc1.
> 
> If you already ran 'make dt_binding_check' and didn't see the above
> error(s), then make sure 'yamllint' is installed and dt-schema is up to
> date:
> 
> pip3 install dtschema --upgrade
> 
> Please check and re-submit.

Right I'll fix those, I had only run dtbs_check

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] perf/smmuv3: Add devicetree support

2021-11-16 Thread Jean-Philippe Brucker
On Tue, Nov 16, 2021 at 12:06:36PM +, John Garry wrote:
> On 16/11/2021 11:35, Jean-Philippe Brucker wrote:
> > Add device-tree support to the SMMUv3 PMCG.  One small cosmetic change
> > while factoring the option mask printout: don't display it when zero, it
> > only contains one erratum at the moment.
> > 
> > Signed-off-by: Jay Chen 
> > Signed-off-by: Jean-Philippe Brucker 
> > ---
> >   drivers/perf/arm_smmuv3_pmu.c | 25 +++--
> >   1 file changed, 23 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
> > index 226348822ab3..958325ac103a 100644
> > --- a/drivers/perf/arm_smmuv3_pmu.c
> > +++ b/drivers/perf/arm_smmuv3_pmu.c
> > @@ -47,6 +47,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >   #include 
> >   #include 
> >   #include 
> > @@ -750,8 +751,15 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu 
> > *smmu_pmu)
> > smmu_pmu->options |= SMMU_PMCG_EVCNTR_RDONLY;
> > break;
> > }
> > +}
> > +
> > +static void smmu_pmu_get_of_options(struct smmu_pmu *smmu_pmu)
> > +{
> > +   struct device_node *node = smmu_pmu->dev->of_node;
> > -   dev_notice(smmu_pmu->dev, "option mask 0x%x\n", smmu_pmu->options);
> > +   if (of_device_is_compatible(node, "hisilicon,smmu-v3-pmcg-hip08"))
> 
> I don't think that this is necessary. We don't support DT for hip08, nor
> have any plans to. Incidentally, was this binding missing in your series?

Ok I'll drop this (and the compatible value from patch 1)

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/2] perf/smmuv3: Support devicetree

2021-11-16 Thread Jean-Philippe Brucker
On Tue, Nov 16, 2021 at 12:02:47PM +, Robin Murphy wrote:
> On 2021-11-16 11:35, Jean-Philippe Brucker wrote:
> > Add devicetree binding for the SMMUv3 PMU, called Performance Monitoring
> > Counter Group (PMCG) in the spec. Each SMMUv3 implementation can have
> > multiple independent PMCGs, for example one for the Translation Control
> > Unit (TCU) and one per Translation Buffer Unit (TBU).
> > 
> > I previously sent the binding as reply to Jay Chen's thread implementing
> > device tree support [1]. This posting addresses the comments from that
> > thread.
> 
> Ha, I'd also resurrected this and was planning to post it at some point this
> week[0] - you should have said :)

Ah sorry about that, I just resent because there was some demand for it at
Linaro

> > Patch 1 adds two compatible strings. "arm,smmu-v3-pmcg" is common to all
> > PMCGs. "hisilicon,smmu-v3-pmcg-hip08" allows to support the same quirk
> > as IORT for that implementation (see patch 2). We'll probably want to
> > also introduce compatible strings for each implementation that has
> > additional perf events. For example the MMU-600 implementation has
> > different events for TCU and TBU PMCGs [2], but both components have the
> > same device IDs. So the driver could differentiate them if they had two
> > distinct compatible strings such as "arm,mmu-600-pmcg-tbu" and
> > "arm,mmu-600-pmcg-tcu".
> 
> Actually it only needs a general MMU-600 compatible, since once you know
> it's an Arm Ltd. implementation, you can assume the pattern for the IMP_DEF
> ID registers to figure out the rest.

It might be an error in the MMU-600 spec specifically, both TBU and TCU
PMU registers have a 0x83 PIDR0, where I think the TBU should be 0x84 (the
revC model uses that value). It's possible that the implementation
actually has 0x84 instead.

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/2] dt-bindings: Add Arm SMMUv3 PMCG binding

2021-11-16 Thread Jean-Philippe Brucker
Add binding for the Arm SMMUv3 PMU. Each node represents a PMCG, and is
placed as a sibling node of the SMMU. Although the PMCGs registers may
be within the SMMU MMIO region, they are separate devices, and there can
be multiple PMCG devices for each SMMU (for example one for the TCU and
one for each TBU).

Signed-off-by: Jean-Philippe Brucker 
---
 .../bindings/iommu/arm,smmu-v3-pmcg.yaml  | 67 +++
 1 file changed, 67 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.yaml

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.yaml 
b/Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.yaml
new file mode 100644
index ..a893e071fdb4
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.yaml
@@ -0,0 +1,67 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/iommu/arm,smmu-v3-pmcg.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Arm SMMUv3 Performance Monitor Counter Group
+
+maintainers:
+  - Will Deacon 
+  - Robin Murphy 
+
+description: |+
+  An SMMUv3 may have several Performance Monitor Counter Group (PMCG).
+  They are standalone performance monitoring units that support both
+  architected and IMPLEMENTATION DEFINED event counters.
+
+properties:
+  $nodename:
+pattern: "^pmu@[0-9a-f]*"
+  compatible:
+oneOf:
+  - items:
+- enum:
+  - hisilicon,smmu-v3-pmcg-hip08
+- const: arm,smmu-v3-pmcg
+  - const: arm,smmu-v3-pmcg
+
+  reg:
+description: |
+  Base addresses of the PMCG registers. Either a single address for Page 0
+  or an additional address for Page 1, where some registers can be
+  relocated with SMMU_PMCG_CFGR.RELOC_CTRS.
+minItems: 1
+maxItems: 2
+
+  interrupts:
+maxItems: 1
+
+  msi-parent: true
+
+required:
+  - compatible
+  - reg
+
+additionalProperties: false
+
+examples:
+  - |+
+#include 
+#include 
+
+pmu@2b42 {
+compatible = "arm,smmu-v3-pmcg";
+reg = <0 0x2b42 0 0x1000>,
+  <0 0x2b43 0 0x1000>;
+interrupts = ;
+msi-parent = < 0xff>;
+};
+
+pmu@2b44 {
+compatible = "arm,smmu-v3-pmcg";
+reg = <0 0x2b44 0 0x1000>,
+  <0 0x2b45 0 0x1000>;
+interrupts = ;
+msi-parent = < 0xff>;
+};
-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 0/2] perf/smmuv3: Support devicetree

2021-11-16 Thread Jean-Philippe Brucker
Add devicetree binding for the SMMUv3 PMU, called Performance Monitoring
Counter Group (PMCG) in the spec. Each SMMUv3 implementation can have
multiple independent PMCGs, for example one for the Translation Control
Unit (TCU) and one per Translation Buffer Unit (TBU).

I previously sent the binding as reply to Jay Chen's thread implementing
device tree support [1]. This posting addresses the comments from that
thread.

Patch 1 adds two compatible strings. "arm,smmu-v3-pmcg" is common to all
PMCGs. "hisilicon,smmu-v3-pmcg-hip08" allows to support the same quirk
as IORT for that implementation (see patch 2). We'll probably want to
also introduce compatible strings for each implementation that has
additional perf events. For example the MMU-600 implementation has
different events for TCU and TBU PMCGs [2], but both components have the
same device IDs. So the driver could differentiate them if they had two
distinct compatible strings such as "arm,mmu-600-pmcg-tbu" and
"arm,mmu-600-pmcg-tcu".

The series doesn't deal with this because for testing I use a software
model which only implements architected events. I do not include DTS
change for that platform because enabling PMCGs requires an additional
model option. See my branch smmu/pmu-dt [3] for details.

[1] https://lore.kernel.org/all/20200707150114.GC159413@myrica/
[2] 
https://developer.arm.com/documentation/100310/0202/Functional-description/Operation/Performance-Monitoring-Unit
[3] https://jpbrucker.net/git/linux/log/?h=smmu/pmu-dt

Jean-Philippe Brucker (2):
  dt-bindings: Add Arm SMMUv3 PMCG binding
  perf/smmuv3: Add devicetree support

 .../bindings/iommu/arm,smmu-v3-pmcg.yaml  | 67 +++
 drivers/perf/arm_smmuv3_pmu.c | 25 ++-
 2 files changed, 90 insertions(+), 2 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/iommu/arm,smmu-v3-pmcg.yaml

-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/2] perf/smmuv3: Add devicetree support

2021-11-16 Thread Jean-Philippe Brucker
Add device-tree support to the SMMUv3 PMCG.  One small cosmetic change
while factoring the option mask printout: don't display it when zero, it
only contains one erratum at the moment.

Signed-off-by: Jay Chen 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/perf/arm_smmuv3_pmu.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
index 226348822ab3..958325ac103a 100644
--- a/drivers/perf/arm_smmuv3_pmu.c
+++ b/drivers/perf/arm_smmuv3_pmu.c
@@ -47,6 +47,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -750,8 +751,15 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu 
*smmu_pmu)
smmu_pmu->options |= SMMU_PMCG_EVCNTR_RDONLY;
break;
}
+}
+
+static void smmu_pmu_get_of_options(struct smmu_pmu *smmu_pmu)
+{
+   struct device_node *node = smmu_pmu->dev->of_node;
 
-   dev_notice(smmu_pmu->dev, "option mask 0x%x\n", smmu_pmu->options);
+   if (of_device_is_compatible(node, "hisilicon,smmu-v3-pmcg-hip08"))
+   /* HiSilicon Erratum 162001800 */
+   smmu_pmu->options |= SMMU_PMCG_EVCNTR_RDONLY;
 }
 
 static int smmu_pmu_probe(struct platform_device *pdev)
@@ -834,7 +842,13 @@ static int smmu_pmu_probe(struct platform_device *pdev)
return -EINVAL;
}
 
-   smmu_pmu_get_acpi_options(smmu_pmu);
+   if (dev->of_node)
+   smmu_pmu_get_of_options(smmu_pmu);
+   else
+   smmu_pmu_get_acpi_options(smmu_pmu);
+
+   if (smmu_pmu->options)
+   dev_notice(dev, "option mask 0x%x\n", smmu_pmu->options);
 
/* Pick one CPU to be the preferred one to use */
smmu_pmu->on_cpu = raw_smp_processor_id();
@@ -884,10 +898,17 @@ static void smmu_pmu_shutdown(struct platform_device 
*pdev)
smmu_pmu_disable(_pmu->pmu);
 }
 
+static const struct of_device_id arm_smmu_pmu_match[] = {
+   { .compatible = "arm,smmu-v3-pmcg" },
+   {},
+};
+MODULE_DEVICE_TABLE(of, arm_smmu_pmu_match);
+
 static struct platform_driver smmu_pmu_driver = {
.driver = {
.name = "arm-smmu-v3-pmcg",
.suppress_bind_attrs = true,
+   .of_match_table = of_match_ptr(arm_smmu_pmu_match),
},
.probe = smmu_pmu_probe,
.remove = smmu_pmu_remove,
-- 
2.33.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/5] iommu/virtio: Add identity domains

2021-10-22 Thread Jean-Philippe Brucker
On Fri, Oct 22, 2021 at 06:16:27AM -0400, Michael S. Tsirkin wrote:
> On Wed, Oct 13, 2021 at 01:10:48PM +0100, Jean-Philippe Brucker wrote:
> > Support identity domains, allowing to only enable IOMMU protection for a
> > subset of endpoints (those assigned to userspace, for example). Users
> > may enable identity domains at compile time
> > (CONFIG_IOMMU_DEFAULT_PASSTHROUGH), boot time (iommu.passthrough=1) or
> > runtime (/sys/kernel/iommu_groups/*/type = identity).
> 
> 
> I put this in my branch so it can get testing under linux-next,
> but pls notice if the ballot does not conclude in time
> for the merge window I won't send it to Linus.

Makes sense, thank you. I sent a new version of the spec change with
clarifications
https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg07969.html

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC 10/20] iommu/iommufd: Add IOMMU_DEVICE_GET_INFO

2021-10-22 Thread Jean-Philippe Brucker
On Thu, Oct 21, 2021 at 08:22:23PM -0300, Jason Gunthorpe wrote:
> On Thu, Oct 21, 2021 at 03:58:02PM +0100, Jean-Philippe Brucker wrote:
> > On Thu, Oct 21, 2021 at 02:26:00AM +, Tian, Kevin wrote:
> > > > I'll leave it to Jean to confirm. If only coherent DMA can be used in
> > > > the guest on other platforms, suppose VFIO should not blindly set
> > > > IOMMU_CACHE and in concept it should deny assigning a non-coherent
> > > > device since no co-ordination with guest exists today.
> > > 
> > > Jean, what's your opinion?
> > 
> > Yes a sanity check to prevent assigning non-coherent devices would be
> > good, though I'm not particularly worried about non-coherent devices. PCIe
> > on Arm should be coherent (according to the Base System Architecture). So
> > vfio-pci devices should be coherent, but vfio-platform and mdev are
> > case-by-case (hopefully all coherent since it concerns newer platforms).
> > 
> > More worrying, I thought we disabled No-Snoop for VFIO but I was wrong,
> > it's left enabled. On Arm I don't think userspace can perform the right
> > cache maintenance operations to maintain coherency with a device that
> > issues No-Snoop writes. Userspace can issue clean+invalidate but not
> > invalidate alone, so there is no equivalent to
> > arch_sync_dma_for_cpu().
> 
> So what happens in a VM? Does a VM know that arch_sync_dma_for_cpu()
> is not available?

This would only affect userspace drivers, it's only host or guest
userspace that cannot issue the maintenance operations. The VM can do
arch_sync_dma_for_cpu()

Thanks,
Jean

> 
> And how does this work with the nested IOMMU translation? I thought I
> read in the SMMU spec that the io page table entries could control
> cachability including in nesting cases?
> 
> > I think the worse that can happen is the device owner shooting itself in
> > the foot by using No-Snoop, but would it hurt to disable it?
> 
> No, the worst is the same as Intel - a driver running in the guest VM
> assumes it can use arch_sync_dma_for_cpu() and acts accordingly,
> resulting in a broken VM.
> 
> Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC 10/20] iommu/iommufd: Add IOMMU_DEVICE_GET_INFO

2021-10-21 Thread Jean-Philippe Brucker
On Thu, Oct 21, 2021 at 02:26:00AM +, Tian, Kevin wrote:
> > I'll leave it to Jean to confirm. If only coherent DMA can be used in
> > the guest on other platforms, suppose VFIO should not blindly set
> > IOMMU_CACHE and in concept it should deny assigning a non-coherent
> > device since no co-ordination with guest exists today.
> 
> Jean, what's your opinion?

Yes a sanity check to prevent assigning non-coherent devices would be
good, though I'm not particularly worried about non-coherent devices. PCIe
on Arm should be coherent (according to the Base System Architecture). So
vfio-pci devices should be coherent, but vfio-platform and mdev are
case-by-case (hopefully all coherent since it concerns newer platforms).

More worrying, I thought we disabled No-Snoop for VFIO but I was wrong,
it's left enabled. On Arm I don't think userspace can perform the right
cache maintenance operations to maintain coherency with a device that
issues No-Snoop writes. Userspace can issue clean+invalidate but not
invalidate alone, so there is no equivalent to arch_sync_dma_for_cpu().
I think the worse that can happen is the device owner shooting itself in
the foot by using No-Snoop, but would it hurt to disable it?

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


  1   2   3   4   5   6   7   8   9   10   >