On Wed, Jun 17, 2026 at 10:50:25AM +0000, [email protected] wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s)
> to consider:
>
> New issues:
> - [Low] Compilation error due to missing `pages` member in `struct
> xe_svm_range`.
>
> Pre-existing issues:
> - [High] Critical error paths swallow error codes by jumping to `out`,
> causing an infinite GPU page fault livelock.
> --
>
> commit c7d50cf1df054885b8dfd4565b263894bfd104b3
> Author: Honglei Huang <[email protected]>
> Subject: drm/gpusvm: split MM state flags out of drm_gpusvm_pages_flags
>
> This commit splits the MM state flags out of drm_gpusvm_pages_flags and moves
> them to a new drm_gpusvm_range_flags structure. It keeps the mapping state
> flags in drm_gpusvm_pages_flags and updates usage in the amd and xe drivers.
>
> Link:
> https://gitlab.freedesktop.org/mbrost/xe-kernel-driver-svn-perf-6-15-2025/-/commit/623f6a50c037d9e44f6c9fbe6859a0ba7ad50177
>
> > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> > index e1651e70c8f01..05578e187d204 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.c
> > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > @@ -134,7 +134,8 @@ xe_svm_garbage_collector_add_range(struct xe_vm *vm,
> > struct xe_svm_range *range,
> >
> > range_debug(range, "GARBAGE COLLECTOR ADD");
> >
> > - drm_gpusvm_range_set_unmapped(&range->base, mmu_range);
> > + drm_gpusvm_range_set_unmapped(&range->base, &range->pages, 1,
> > + mmu_range);
>
> [Severity: Low]
> Does this cause a compilation error due to pages not being a direct member
> of struct xe_svm_range? It looks like the pages struct is located inside the
> embedded base field (range->base.pages).
>
> While this might be fixed in a later patch in the series, should each commit
> compile successfully on its own to preserve bisection?
>
So s/range->pages/range->base.pages/
> [ ... ]
>
> > @@ -1248,7 +1249,7 @@ static int __xe_svm_handle_pagefault(struct xe_vm
> > *vm, struct xe_vma *vma,
> >
> > xe_svm_range_fault_count_stats_incr(gt, range);
> >
> > - if (ctx.devmem_only && !range->base.pages.flags.migrate_devmem) {
> > + if (ctx.devmem_only && !range->base.flags.migrate_devmem) {
> > err = -EACCES;
> > goto out;
> > }
>
> [Severity: High]
> This is a pre-existing issue, but can this error path swallow the error code
> by
> jumping to the out label?
>
> Looking at __xe_svm_handle_pagefault(), the out label appears to
> unconditionally return 0:
>
> out:
> xe_svm_range_fault_us_stats_incr(gt, range, start);
> return 0;
>
> If we set err = -EACCES and jump to out, or if xe_svm_range_get_pages()
> fails and jumps to out later in the function, does returning 0 cause the
> caller to believe the page fault was handled successfully?
>
> Since the pages weren't mapped, wouldn't the GPU immediately retry the
> instruction and fault again, potentially causing an infinite livelock? Should
> these error paths jump to err_out instead?
>
Yep, posted a fix here: https://patchwork.freedesktop.org/series/168716/
Matt
> --
> Sashiko AI review ยท
> https://sashiko.dev/#/patchset/[email protected]?part=1