Thank you for your contribution! Sashiko AI review found 3 potential issue(s) 
to consider:

Pre-existing issues:
- [High] Use-After-Free in IOCTLs due to lack of device unplug synchronization
- [High] Deterministic Use-After-Free in GEM callbacks when userspace closes 
the file descriptor after device removal
- [High] Double free/unbind of the IOMMU SVA device handle on PASID failure
--

commit 8636266787707619953362d31190f3d7097a3d27
Author: Lizhi Hou <[email protected]>

accel/amdxdna: Fix amdxdna_client lifetime race during device removal

In amdxdna_remove(), all amdxdna_client structures are freed after
calling drm_dev_unplug(). However, drm_dev_unplug() does not force
existing file descriptors to be closed, so amdxdna_drm_close() may be
called after amdxdna_remove() has completed.

As a result, accessing client->pid for debug output in
amdxdna_drm_close() can lead to a use-after-free, since the access is
not protected by drm_dev_enter().

Fix this by moving the debug output after drm_dev_enter(). Similarly,
add the drm_dev_enter() protection for amdxdna_drm_open().

> diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.c 
> b/drivers/accel/amdxdna/amdxdna_pci_drv.c
> index 1f066ed8d2365..a22c986f1b331 100644
> --- a/drivers/accel/amdxdna/amdxdna_pci_drv.c
> +++ b/drivers/accel/amdxdna/amdxdna_pci_drv.c
> @@ -108,10 +108,16 @@ static int amdxdna_drm_open(struct drm_device *ddev, 
> struct drm_file *filp)
>  {
>       struct amdxdna_dev *xdna = to_xdna_dev(ddev);
>       struct amdxdna_client *client;
> +     int ret = 0, idx;
> +
> +     if (!drm_dev_enter(ddev, &idx))
> +             return -ENODEV;
>  
>       client = kzalloc_obj(*client);
> -     if (!client)
> -             return -ENOMEM;
> +     if (!client) {
> +             ret = -ENOMEM;
> +             goto dev_exit;
> +     }
>  
>       client->pid = pid_nr(rcu_access_pointer(filp->pid));
>       client->xdna = xdna;
> @@ -125,7 +131,8 @@ static int amdxdna_drm_open(struct drm_device *ddev, 
> struct drm_file *filp)
>                       if (!amdxdna_use_carveout(xdna)) {
>                               XDNA_ERR(xdna, "PASID unavailable and carveout 
> not configured");
>                               kfree(client);
> -                             return -EINVAL;
> +                             ret = -EINVAL;
> +                             goto dev_exit;
>                       }

[Severity: High]
This isn't a bug introduced by this patch, but does this error handling path
lead to a double-free of the SVA device handle?

In amdxdna_sva_init(), if iommu_sva_get_pasid() fails, the code calls
iommu_sva_unbind_device(client->sva) but leaves client->sva pointing to the
unbound handle:

drivers/accel/amdxdna/amdxdna_pci_drv.c:amdxdna_sva_init() {
        client->pasid = iommu_sva_get_pasid(client->sva);
        if (client->pasid == IOMMU_PASID_INVALID) {
                iommu_sva_unbind_device(client->sva);
                XDNA_ERR(xdna, "SVA get pasid failed");
                return -ENODEV;
        }
}

If amdxdna_use_carveout() is true, the error is ignored here, and the open
succeeds.

Later, when amdxdna_client_cleanup() calls amdxdna_sva_fini(), the
IS_ERR_OR_NULL(client->sva) check passes since client->sva is a dangling
pointer, leading to a second call to iommu_sva_unbind_device(client->sva).

[ ... ]
> @@ -175,11 +186,10 @@ static void amdxdna_drm_close(struct drm_device *ddev, 
> struct drm_file *filp)
>       struct amdxdna_dev *xdna = to_xdna_dev(ddev);
>       int idx;
>  
> -     XDNA_DBG(xdna, "closing pid %d", client->pid);
> -
>       if (!drm_dev_enter(&xdna->ddev, &idx))
>               return;
>  
> +     XDNA_DBG(xdna, "closing pid %d", client->pid);
>       mutex_lock(&xdna->dev_lock);
>       amdxdna_client_cleanup(client);
>       mutex_unlock(&xdna->dev_lock);

[Severity: High]
This is a pre-existing issue, but while drm_dev_enter() now protects open
and close, are the ioctl handlers still vulnerable to a use-after-free during
device removal?

The DRM core's drm_ioctl() does not hold drm_dev_enter() and only performs a
lockless check of drm_dev_is_unplugged(). If an ioctl is running concurrently
with amdxdna_remove(), drm_dev_unplug() will not wait for the ioctl to finish.
The client structure can be freed while concurrently running ioctls try to
access it.

For example, in amdxdna_drm_create_bo_ioctl():

drivers/accel/amdxdna/amdxdna_gem.c:amdxdna_drm_create_bo_ioctl() {
        struct amdxdna_client *client = filp->driver_priv;
        ...
}

Should drm_dev_enter() be added to the ioctl handlers as well to prevent
accessing a freed client structure?


[Severity: High]
This is a pre-existing issue, but does freeing the client structure during
device removal cause a deterministic use-after-free when GEM objects are
eventually closed?

When amdxdna_remove() forcefully frees the client structure via
amdxdna_client_cleanup(), active GEM objects created by this client are not
destroyed because the DRM core holds references to them as long as the user's
file descriptor remains open.

When the user eventually closes the file descriptor or unmaps the GEM objects,
amdxdna_gem_obj_close() is invoked, which accesses the freed client structure:

drivers/accel/amdxdna/amdxdna_gem.c:amdxdna_gem_del_bo_usage() {
        guard(mutex)(&client->mm_lock);
        ...
}

Could the client structure be kept alive until all GEM objects drop their
references?

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to