Re: [PATCH v3 1/8] iommu: Add iopf domain attach/detach/replace interface

2024-02-20 Thread Baolu Lu

On 2024/2/21 14:49, Tian, Kevin wrote:

+struct iopf_attach_cookie {
+   struct iommu_domain *domain;
+   struct device *dev;
+   unsigned int pasid;
+   refcount_t users;
+
+   void *private;
+   void (*release)(struct iopf_attach_cookie *cookie);
+};

this cookie has nothing specific to iopf.

it may makes more sense to build a generic iommu_attach_device_cookie()
helper so the same object can be reused in future other usages too.

within iommu core it can check domain iopf handler and this generic cookie
to update iopf specific data e.g. the pasid_cookie xarray.

This means attaching an iopf-capable domain follows two steps:

1) Attaching the domain to the device.
2) Setting up the iopf data, necessary for handling iopf data.

This creates a time window during which the iopf is enabled, but the
software cannot handle it. Or not?


why two steps? in attach you can setup the iopf data when recognizing
that the domain is iopf capable...


Oh, maybe I misunderstood. So your proposal is to make the new interface
generic, not for iopf only?

Best regards,
baolu



RE: [PATCH v3 1/8] iommu: Add iopf domain attach/detach/replace interface

2024-02-20 Thread Tian, Kevin
> From: Baolu Lu 
> Sent: Wednesday, February 21, 2024 3:21 PM
> 
> On 2024/2/21 14:49, Tian, Kevin wrote:
>  +struct iopf_attach_cookie {
>  +struct iommu_domain *domain;
>  +struct device *dev;
>  +unsigned int pasid;
>  +refcount_t users;
>  +
>  +void *private;
>  +void (*release)(struct iopf_attach_cookie *cookie);
>  +};
> >>> this cookie has nothing specific to iopf.
> >>>
> >>> it may makes more sense to build a generic
> iommu_attach_device_cookie()
> >>> helper so the same object can be reused in future other usages too.
> >>>
> >>> within iommu core it can check domain iopf handler and this generic
> cookie
> >>> to update iopf specific data e.g. the pasid_cookie xarray.
> >> This means attaching an iopf-capable domain follows two steps:
> >>
> >> 1) Attaching the domain to the device.
> >> 2) Setting up the iopf data, necessary for handling iopf data.
> >>
> >> This creates a time window during which the iopf is enabled, but the
> >> software cannot handle it. Or not?
> >>
> > why two steps? in attach you can setup the iopf data when recognizing
> > that the domain is iopf capable...
> 
> Oh, maybe I misunderstood. So your proposal is to make the new interface
> generic, not for iopf only?
> 

yes.


RE: [PATCH v3 1/8] iommu: Add iopf domain attach/detach/replace interface

2024-02-20 Thread Tian, Kevin
> From: Baolu Lu 
> Sent: Wednesday, February 21, 2024 1:53 PM
> 
> On 2024/2/7 16:11, Tian, Kevin wrote:
> >> From: Lu Baolu 
> >> Sent: Monday, January 22, 2024 3:39 PM
> >>
> >> There is a slight difference between iopf domains and non-iopf domains.
> >> In the latter, references to domains occur between attach and detach;
> >> While in the former, due to the existence of asynchronous iopf handling
> >> paths, references to the domain may occur after detach, which leads to
> >> potential UAF issues.
> >
> > Does UAF still exist if iommu driver follows the guidance you just added
> > to iopf_queue_remove_device()?
> >
> > it clearly says that the driver needs to disable IOMMU PRI reception,
> > remove device from iopf queue and disable PRI on the device.
> 
> The iopf_queue_remove_device() function is only called after the last
> iopf-capable domain is detached from the device. It may not be called
> during domain replacement. Hence, there is no guarantee that
> iopf_queue_remove_device() will be called when a domain is detached from
> the device.

oh yes. More accurately even the last detach may not trigger it.

e.g. idxd driver does it at device/driver unbind.

> 
> >
> > presumably those are all about what needs to be done in the detach
> > operation. Then once detach completes there should be no more
> > reference to the domain from the iopf path?
> 
> The domain pointer stored in the iopf_group structure is only released
> after the iopf response, possibly after the domain is detached from the
> device. Thus, the domain pointer can only be freed after the iopf
> response.

make sense.

> 
> >
> >>
> >> +struct iopf_attach_cookie {
> >> +  struct iommu_domain *domain;
> >> +  struct device *dev;
> >> +  unsigned int pasid;
> >> +  refcount_t users;
> >> +
> >> +  void *private;
> >> +  void (*release)(struct iopf_attach_cookie *cookie);
> >> +};
> >
> > this cookie has nothing specific to iopf.
> >
> > it may makes more sense to build a generic iommu_attach_device_cookie()
> > helper so the same object can be reused in future other usages too.
> >
> > within iommu core it can check domain iopf handler and this generic cookie
> > to update iopf specific data e.g. the pasid_cookie xarray.
> 
> This means attaching an iopf-capable domain follows two steps:
> 
> 1) Attaching the domain to the device.
> 2) Setting up the iopf data, necessary for handling iopf data.
> 
> This creates a time window during which the iopf is enabled, but the
> software cannot handle it. Or not?
> 

why two steps? in attach you can setup the iopf data when recognizing
that the domain is iopf capable...


Re: [PATCH v3 6/8] iommufd: IOPF-capable hw page table attach/detach/replace

2024-02-20 Thread Baolu Lu

On 2024/2/20 21:57, Joel Granados wrote:

diff --git a/drivers/iommu/iommufd/fault.c b/drivers/iommu/iommufd/fault.c
index e752d1c49dde..a4a49f3cd4c2 100644
--- a/drivers/iommu/iommufd/fault.c
+++ b/drivers/iommu/iommufd/fault.c
@@ -267,3 +267,125 @@ int iommufd_fault_iopf_handler(struct iopf_group *group)
  
  	return 0;

  }
+
+static void release_attach_cookie(struct iopf_attach_cookie *cookie)
+{
+   struct iommufd_hw_pagetable *hwpt = cookie->domain->fault_data;

There is a possibility here of cookie->domain being NULL. When you call
release_attach_cookie from iommufd_fault_domain_attach_dev if
idev->iopf_enabled is false. In this case, you have not set the domain
yet.


Yes. Good catch!




+   struct iommufd_device *idev = cookie->private;
+
+   refcount_dec(>obj.users);
+   refcount_dec(>obj.users);

You should decrease this ref count only if the cookie actually had a
domain.

This function could be something like this:

static void release_attach_cookie(struct iopf_attach_cookie *cookie)
{
struct iommufd_hw_pagetable *hwpt;
struct iommufd_device *idev = cookie->private;

refcount_dec(>obj.users);
if (cookie->domain) {
hwpt = cookie->domain->fault_data;
refcount_dec(>obj.users);
}
kfree(cookie);
}


Yeah, fixed.


+   kfree(cookie);
+}
+
+static int iommufd_fault_iopf_enable(struct iommufd_device *idev)
+{
+   int ret;
+
+   if (idev->iopf_enabled)
+   return 0;
+
+   ret = iommu_dev_enable_feature(idev->dev, IOMMU_DEV_FEAT_IOPF);
+   if (ret)
+   return ret;
+
+   idev->iopf_enabled = true;
+
+   return 0;
+}
+
+static void iommufd_fault_iopf_disable(struct iommufd_device *idev)
+{
+   if (!idev->iopf_enabled)
+   return;
+
+   iommu_dev_disable_feature(idev->dev, IOMMU_DEV_FEAT_IOPF);
+   idev->iopf_enabled = false;
+}
+
+int iommufd_fault_domain_attach_dev(struct iommufd_hw_pagetable *hwpt,
+   struct iommufd_device *idev)
+{
+   struct iopf_attach_cookie *cookie;
+   int ret;
+
+   cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
+   if (!cookie)
+   return -ENOMEM;
+
+   refcount_inc(>obj.users);
+   refcount_inc(>obj.users);
+   cookie->release = release_attach_cookie;
+   cookie->private = idev;
+
+   if (!idev->iopf_enabled) {
+   ret = iommufd_fault_iopf_enable(idev);
+   if (ret)
+   goto out_put_cookie;

You have not set domain here and release_attach_cookie will try to
access a null address.


Fixed as above.

Best regards,
baolu



Re: [PATCH v3 5/8] iommufd: Associate fault object with iommufd_hw_pgtable

2024-02-20 Thread Baolu Lu

On 2024/2/7 16:14, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Monday, January 22, 2024 3:39 PM

+
+int iommufd_fault_iopf_handler(struct iopf_group *group)
+{
+   struct iommufd_hw_pagetable *hwpt = group->cookie->domain-

fault_data;

+   struct iommufd_fault *fault = hwpt->fault;
+


why not directly using iommufd_fault as the fault_data?


The relationship among these structures is:

iommufd_hwpt -> iommu_domain
  ^
  |
  v
iommufd_fault

It appears preferable to hook the hwpt instead of iommufd_fault to an
iommu_domain structure.

Best regards,
baolu




Re: [PATCH v3 1/8] iommu: Add iopf domain attach/detach/replace interface

2024-02-20 Thread Baolu Lu

On 2024/2/7 16:11, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Monday, January 22, 2024 3:39 PM

There is a slight difference between iopf domains and non-iopf domains.
In the latter, references to domains occur between attach and detach;
While in the former, due to the existence of asynchronous iopf handling
paths, references to the domain may occur after detach, which leads to
potential UAF issues.


Does UAF still exist if iommu driver follows the guidance you just added
to iopf_queue_remove_device()?

it clearly says that the driver needs to disable IOMMU PRI reception,
remove device from iopf queue and disable PRI on the device.


The iopf_queue_remove_device() function is only called after the last
iopf-capable domain is detached from the device. It may not be called
during domain replacement. Hence, there is no guarantee that
iopf_queue_remove_device() will be called when a domain is detached from
the device.



presumably those are all about what needs to be done in the detach
operation. Then once detach completes there should be no more
reference to the domain from the iopf path?


The domain pointer stored in the iopf_group structure is only released
after the iopf response, possibly after the domain is detached from the
device. Thus, the domain pointer can only be freed after the iopf
response.





+struct iopf_attach_cookie {
+   struct iommu_domain *domain;
+   struct device *dev;
+   unsigned int pasid;
+   refcount_t users;
+
+   void *private;
+   void (*release)(struct iopf_attach_cookie *cookie);
+};


this cookie has nothing specific to iopf.

it may makes more sense to build a generic iommu_attach_device_cookie()
helper so the same object can be reused in future other usages too.

within iommu core it can check domain iopf handler and this generic cookie
to update iopf specific data e.g. the pasid_cookie xarray.


This means attaching an iopf-capable domain follows two steps:

1) Attaching the domain to the device.
2) Setting up the iopf data, necessary for handling iopf data.

This creates a time window during which the iopf is enabled, but the
software cannot handle it. Or not?

Best regards,
baolu



Re: [PATCH v3 6/8] iommufd: IOPF-capable hw page table attach/detach/replace

2024-02-20 Thread Joel Granados
On Mon, Jan 22, 2024 at 03:39:01PM +0800, Lu Baolu wrote:
> The iopf-capable hw page table attach/detach/replace should use the iommu
> iopf-specific interfaces. The pointer to iommufd_device is stored in the
> private field of the attachment cookie, so that it can be easily retrieved
> in the fault handling paths. The references to iommufd_device and
> iommufd_hw_pagetable objects are held until the cookie is released, which
> happens after the hw_pagetable is detached from the device and all
> outstanding iopf's are responded to. This guarantees that both the device
> and hw_pagetable are valid before domain detachment and outstanding faults
> are handled.
> 
> The iopf-capable hw page tables can only be attached to devices that
> support the IOMMU_DEV_FEAT_IOPF feature. On the first attachment of an
> iopf-capable hw_pagetable to the device, the IOPF feature is enabled on
> the device. Similarly, after the last iopf-capable hwpt is detached from
> the device, the IOPF feature is disabled on the device.
> 
> The current implementation allows a replacement between iopf-capable and
> non-iopf-capable hw page tables. This matches the nested translation use
> case, where a parent domain is attached by default and can then be
> replaced with a nested user domain with iopf support.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/iommufd/iommufd_private.h |   7 ++
>  drivers/iommu/iommufd/device.c  |  15 ++-
>  drivers/iommu/iommufd/fault.c   | 122 
>  3 files changed, 141 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/iommufd/iommufd_private.h 
> b/drivers/iommu/iommufd/iommufd_private.h
> index 2780bed0c6b1..9844a1289c01 100644
> --- a/drivers/iommu/iommufd/iommufd_private.h
> +++ b/drivers/iommu/iommufd/iommufd_private.h
> @@ -398,6 +398,7 @@ struct iommufd_device {
>   /* always the physical device */
>   struct device *dev;
>   bool enforce_cache_coherency;
> + bool iopf_enabled;
>   /* outstanding faults awaiting response indexed by fault group id */
>   struct xarray faults;
>  };
> @@ -459,6 +460,12 @@ iommufd_get_fault(struct iommufd_ucmd *ucmd, u32 id)
>  int iommufd_fault_alloc(struct iommufd_ucmd *ucmd);
>  void iommufd_fault_destroy(struct iommufd_object *obj);
>  int iommufd_fault_iopf_handler(struct iopf_group *group);
> +int iommufd_fault_domain_attach_dev(struct iommufd_hw_pagetable *hwpt,
> + struct iommufd_device *idev);
> +void iommufd_fault_domain_detach_dev(struct iommufd_hw_pagetable *hwpt,
> +  struct iommufd_device *idev);
> +int iommufd_fault_domain_replace_dev(struct iommufd_hw_pagetable *hwpt,
> +  struct iommufd_device *idev);
>  
>  #ifdef CONFIG_IOMMUFD_TEST
>  int iommufd_test(struct iommufd_ucmd *ucmd);
> diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
> index d70913ee8fdf..c4737e876ebc 100644
> --- a/drivers/iommu/iommufd/device.c
> +++ b/drivers/iommu/iommufd/device.c
> @@ -377,7 +377,10 @@ int iommufd_hw_pagetable_attach(struct 
> iommufd_hw_pagetable *hwpt,
>* attachment.
>*/
>   if (list_empty(>igroup->device_list)) {
> - rc = iommu_attach_group(hwpt->domain, idev->igroup->group);
> + if (hwpt->fault_capable)
> + rc = iommufd_fault_domain_attach_dev(hwpt, idev);
> + else
> + rc = iommu_attach_group(hwpt->domain, 
> idev->igroup->group);
>   if (rc)
>   goto err_unresv;
>   idev->igroup->hwpt = hwpt;
> @@ -403,7 +406,10 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev)
>   mutex_lock(>igroup->lock);
>   list_del(>group_item);
>   if (list_empty(>igroup->device_list)) {
> - iommu_detach_group(hwpt->domain, idev->igroup->group);
> + if (hwpt->fault_capable)
> + iommufd_fault_domain_detach_dev(hwpt, idev);
> + else
> + iommu_detach_group(hwpt->domain, idev->igroup->group);
>   idev->igroup->hwpt = NULL;
>   }
>   if (hwpt_is_paging(hwpt))
> @@ -498,7 +504,10 @@ iommufd_device_do_replace(struct iommufd_device *idev,
>   goto err_unlock;
>   }
>  
> - rc = iommu_group_replace_domain(igroup->group, hwpt->domain);
> + if (old_hwpt->fault_capable || hwpt->fault_capable)
> + rc = iommufd_fault_domain_replace_dev(hwpt, idev);
> + else
> + rc = iommu_group_replace_domain(igroup->group, hwpt->domain);
>   if (rc)
>   goto err_unresv;
>  
> diff --git a/drivers/iommu/iommufd/fault.c b/drivers/iommu/iommufd/fault.c
> index e752d1c49dde..a4a49f3cd4c2 100644
> --- a/drivers/iommu/iommufd/fault.c
> +++ b/drivers/iommu/iommufd/fault.c
> @@ -267,3 +267,125 @@ int iommufd_fault_iopf_handler(struct iopf_group *group)
>  
>   return 0;
>  }
> +
> +static void