Thank you for your contribution! Sashiko AI review found 4 potential issue(s) 
to consider:

New issues:
- [Medium] The `userptr->pages.drm` field is initialized after the MMU interval 
notifier is inserted, exposing it to concurrent lockless access by notifier 
callbacks.

Pre-existing issues:
- [High] Device pages are leaked on the error path in `drm_gpusvm_get_pages()` 
because `svm_pages->dpagemap` is not populated before jumping to the unmap 
cleanup logic.
- [High] `__drm_gpusvm_unmap_pages()` incorrectly passes system pages to the 
`device_unmap` callback when a mixed range is mapped on an IOVA-enabled device.
- [High] Uninitialized memory read of `svm_pages->dma_addr[0].dir` in the error 
path of `__drm_gpusvm_unmap_pages`.
--

commit 5025fa9183334de09c92fc07865022970a487d35
Author: Honglei Huang <[email protected]>
Subject: drm/gpusvm: embed struct drm_device into drm_gpusvm_pages

drm_gpusvm_pages is the layer that actually represents physical
pages/mappings it owns the dma_addr array, the dma_iova_state...
With the previous patch, so drm_gpusvm_pages is now strictly about
physical pages and their DMA view.

> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index 6cb38c76dc7be..d6407c94cdda3 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -1135,11 +1136,16 @@ static void __drm_gpusvm_unmap_pages(struct 
> drm_gpusvm *gpusvm,
>                                    unsigned long npages)
>  {
>       struct drm_pagemap *dpagemap = svm_pages->dpagemap;
> -     struct device *dev = gpusvm->drm->dev;
> +     struct device *dev;
>       unsigned long i, j;
>  
>       lockdep_assert_held(&gpusvm->notifier_lock);
>  
> +     if (!svm_pages->drm)
> +             return;
> +
> +     dev = svm_pages->drm->dev;
> +
>       if (svm_pages->flags.has_dma_mapping) {

[Severity: High]
This is a pre-existing issue, but I wanted to ask about the logic when IOVA is
used in __drm_gpusvm_unmap_pages(). When use_iova is true, system pages
evaluating this check:

    if (!use_iova && addr->proto == DRM_INTERCONNECT_SYSTEM)
        dma_unmap_page(dev, addr->addr, PAGE_SIZE << addr->order, addr->dir);
    else if (dpagemap && dpagemap->ops->device_unmap)
        dpagemap->ops->device_unmap(dpagemap, dev, addr);

would fall through to the else if branch. If the SVM range contains both
system and device pages, dpagemap would be non-NULL, causing the system pages
to be erroneously passed to dpagemap->ops->device_unmap(). Could this result
in an incompatible address format being passed to the device pagemap driver?

[Severity: High]
Another pre-existing issue in this same function: when mapping the very first
page fails in drm_gpusvm_get_pages(), the error path calls this unmap function
with npages (or num_dma_mapped) set to 0. The code here does:

    if (use_iova)
        dma_iova_destroy(dev, &svm_pages->state, svm_pages->state_offset,
                         svm_pages->dma_addr[0].dir, 0);

Since svm_pages->dma_addr is allocated with kvmalloc_objs() (which does not
zero memory) and num_dma_mapped is 0, is it possible this passes
uninitialized memory as the DMA direction?

[ ... ]

> @@ -1590,7 +1599,7 @@ int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
>       }
>  
>       if (dma_use_iova(state)) {
> -             err = dma_iova_sync(gpusvm->drm->dev, state, 0,
> +             err = dma_iova_sync(svm_pages->drm->dev, state, 0,
>                                   svm_pages->state_offset);
>               if (err)
>                       goto err_unmap;

[Severity: High]
This isn't a bug introduced by this patch, but I have a question about the
error path here. In drm_gpusvm_get_pages(), if an error occurs while mapping
a device page:

    if (dma_mapping_error(svm_pages->drm->dev, svm_pages->dma_addr[j].addr)) {
        err = -EFAULT;
        goto err_unmap;
    }

The jump to err_unmap happens before svm_pages->dpagemap is populated
at the end of the loop. When __drm_gpusvm_unmap_pages() is then called,
it relies on dpagemap which resolves from svm_pages->dpagemap (which is
still NULL). Does this mean the device's unmap callback is skipped, leaking
all device pages previously mapped in the current batch?

[ ... ]

> diff --git a/drivers/gpu/drm/xe/xe_userptr.c b/drivers/gpu/drm/xe/xe_userptr.c
> index 6761005c0b900..1b540e62af6c3 100644
> --- a/drivers/gpu/drm/xe/xe_userptr.c
> +++ b/drivers/gpu/drm/xe/xe_userptr.c
> @@ -390,6 +390,7 @@ int xe_userptr_setup(struct xe_userptr_vma *uvma, 
> unsigned long start,
>                    unsigned long range)
>  {
>       struct xe_userptr *userptr = &uvma->userptr;
> +     struct xe_vm *vm = xe_vma_vm(&uvma->vma);
>       int err;
>  
>       INIT_LIST_HEAD(&userptr->invalidate_link);
> @@ -402,6 +403,7 @@ int xe_userptr_setup(struct xe_userptr_vma *uvma, 
> unsigned long start,
>               return err;
>  
>       userptr->pages.notifier_seq = LONG_MAX;
> +     userptr->pages.drm = &vm->xe->drm;
>  
>       return 0;
>  }

[Severity: Medium]
Could this ordering cause a data race? The MMU interval notifier is inserted
just before this assignment:

    err = mmu_interval_notifier_insert(&userptr->notifier, current->mm,
                                       start, range,
                                       &vma_userptr_notifier_ops);
    if (err)
        return err;

This immediately exposes the notifier to asynchronous invalidation callbacks.
If the invalidation callback executes concurrently on another CPU, it might
read userptr->pages.drm before it is initialized here, leading to potential
KCSAN warnings or lockless access issues. Should the initialization of
userptr->pages.drm happen before the notifier is inserted?

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=2

Reply via email to