Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
Pre-existing issues: - [High] Use-After-Free in IOCTLs due to lack of device unplug synchronization - [High] Deterministic Use-After-Free in GEM callbacks when userspace closes the file descriptor after device removal - [High] Double free/unbind of the IOMMU SVA device handle on PASID failure -- commit 8636266787707619953362d31190f3d7097a3d27 Author: Lizhi Hou <[email protected]> accel/amdxdna: Fix amdxdna_client lifetime race during device removal In amdxdna_remove(), all amdxdna_client structures are freed after calling drm_dev_unplug(). However, drm_dev_unplug() does not force existing file descriptors to be closed, so amdxdna_drm_close() may be called after amdxdna_remove() has completed. As a result, accessing client->pid for debug output in amdxdna_drm_close() can lead to a use-after-free, since the access is not protected by drm_dev_enter(). Fix this by moving the debug output after drm_dev_enter(). Similarly, add the drm_dev_enter() protection for amdxdna_drm_open(). > diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.c > b/drivers/accel/amdxdna/amdxdna_pci_drv.c > index 1f066ed8d2365..a22c986f1b331 100644 > --- a/drivers/accel/amdxdna/amdxdna_pci_drv.c > +++ b/drivers/accel/amdxdna/amdxdna_pci_drv.c > @@ -108,10 +108,16 @@ static int amdxdna_drm_open(struct drm_device *ddev, > struct drm_file *filp) > { > struct amdxdna_dev *xdna = to_xdna_dev(ddev); > struct amdxdna_client *client; > + int ret = 0, idx; > + > + if (!drm_dev_enter(ddev, &idx)) > + return -ENODEV; > > client = kzalloc_obj(*client); > - if (!client) > - return -ENOMEM; > + if (!client) { > + ret = -ENOMEM; > + goto dev_exit; > + } > > client->pid = pid_nr(rcu_access_pointer(filp->pid)); > client->xdna = xdna; > @@ -125,7 +131,8 @@ static int amdxdna_drm_open(struct drm_device *ddev, > struct drm_file *filp) > if (!amdxdna_use_carveout(xdna)) { > XDNA_ERR(xdna, "PASID unavailable and carveout > not configured"); > kfree(client); > - return -EINVAL; > + ret = -EINVAL; > + goto dev_exit; > } [Severity: High] This isn't a bug introduced by this patch, but does this error handling path lead to a double-free of the SVA device handle? In amdxdna_sva_init(), if iommu_sva_get_pasid() fails, the code calls iommu_sva_unbind_device(client->sva) but leaves client->sva pointing to the unbound handle: drivers/accel/amdxdna/amdxdna_pci_drv.c:amdxdna_sva_init() { client->pasid = iommu_sva_get_pasid(client->sva); if (client->pasid == IOMMU_PASID_INVALID) { iommu_sva_unbind_device(client->sva); XDNA_ERR(xdna, "SVA get pasid failed"); return -ENODEV; } } If amdxdna_use_carveout() is true, the error is ignored here, and the open succeeds. Later, when amdxdna_client_cleanup() calls amdxdna_sva_fini(), the IS_ERR_OR_NULL(client->sva) check passes since client->sva is a dangling pointer, leading to a second call to iommu_sva_unbind_device(client->sva). [ ... ] > @@ -175,11 +186,10 @@ static void amdxdna_drm_close(struct drm_device *ddev, > struct drm_file *filp) > struct amdxdna_dev *xdna = to_xdna_dev(ddev); > int idx; > > - XDNA_DBG(xdna, "closing pid %d", client->pid); > - > if (!drm_dev_enter(&xdna->ddev, &idx)) > return; > > + XDNA_DBG(xdna, "closing pid %d", client->pid); > mutex_lock(&xdna->dev_lock); > amdxdna_client_cleanup(client); > mutex_unlock(&xdna->dev_lock); [Severity: High] This is a pre-existing issue, but while drm_dev_enter() now protects open and close, are the ioctl handlers still vulnerable to a use-after-free during device removal? The DRM core's drm_ioctl() does not hold drm_dev_enter() and only performs a lockless check of drm_dev_is_unplugged(). If an ioctl is running concurrently with amdxdna_remove(), drm_dev_unplug() will not wait for the ioctl to finish. The client structure can be freed while concurrently running ioctls try to access it. For example, in amdxdna_drm_create_bo_ioctl(): drivers/accel/amdxdna/amdxdna_gem.c:amdxdna_drm_create_bo_ioctl() { struct amdxdna_client *client = filp->driver_priv; ... } Should drm_dev_enter() be added to the ioctl handlers as well to prevent accessing a freed client structure? [Severity: High] This is a pre-existing issue, but does freeing the client structure during device removal cause a deterministic use-after-free when GEM objects are eventually closed? When amdxdna_remove() forcefully frees the client structure via amdxdna_client_cleanup(), active GEM objects created by this client are not destroyed because the DRM core holds references to them as long as the user's file descriptor remains open. When the user eventually closes the file descriptor or unmaps the GEM objects, amdxdna_gem_obj_close() is invoked, which accesses the freed client structure: drivers/accel/amdxdna/amdxdna_gem.c:amdxdna_gem_del_bo_usage() { guard(mutex)(&client->mm_lock); ... } Could the client structure be kept alive until all GEM objects drop their references? -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=1
