On Wed, Jun 17, 2026 at 10:50:25AM +0000, [email protected] wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) 
> to consider:
> 
> New issues:
> - [Low] Compilation error due to missing `pages` member in `struct 
> xe_svm_range`.
> 
> Pre-existing issues:
> - [High] Critical error paths swallow error codes by jumping to `out`, 
> causing an infinite GPU page fault livelock.
> --
> 
> commit c7d50cf1df054885b8dfd4565b263894bfd104b3
> Author: Honglei Huang <[email protected]>
> Subject: drm/gpusvm: split MM state flags out of drm_gpusvm_pages_flags
> 
> This commit splits the MM state flags out of drm_gpusvm_pages_flags and moves
> them to a new drm_gpusvm_range_flags structure. It keeps the mapping state
> flags in drm_gpusvm_pages_flags and updates usage in the amd and xe drivers.
> 
> Link: 
> https://gitlab.freedesktop.org/mbrost/xe-kernel-driver-svn-perf-6-15-2025/-/commit/623f6a50c037d9e44f6c9fbe6859a0ba7ad50177
> 
> > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> > index e1651e70c8f01..05578e187d204 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.c
> > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > @@ -134,7 +134,8 @@ xe_svm_garbage_collector_add_range(struct xe_vm *vm, 
> > struct xe_svm_range *range,
> >  
> >     range_debug(range, "GARBAGE COLLECTOR ADD");
> >  
> > -   drm_gpusvm_range_set_unmapped(&range->base, mmu_range);
> > +   drm_gpusvm_range_set_unmapped(&range->base, &range->pages, 1,
> > +                                 mmu_range);
> 
> [Severity: Low]
> Does this cause a compilation error due to pages not being a direct member
> of struct xe_svm_range? It looks like the pages struct is located inside the
> embedded base field (range->base.pages).
> 
> While this might be fixed in a later patch in the series, should each commit
> compile successfully on its own to preserve bisection?
> 

So s/range->pages/range->base.pages/

> [ ... ]
> 
> > @@ -1248,7 +1249,7 @@ static int __xe_svm_handle_pagefault(struct xe_vm 
> > *vm, struct xe_vma *vma,
> >  
> >     xe_svm_range_fault_count_stats_incr(gt, range);
> >  
> > -   if (ctx.devmem_only && !range->base.pages.flags.migrate_devmem) {
> > +   if (ctx.devmem_only && !range->base.flags.migrate_devmem) {
> >             err = -EACCES;
> >             goto out;
> >     }
> 
> [Severity: High]
> This is a pre-existing issue, but can this error path swallow the error code 
> by
> jumping to the out label?
> 
> Looking at __xe_svm_handle_pagefault(), the out label appears to
> unconditionally return 0:
> 
> out:
>       xe_svm_range_fault_us_stats_incr(gt, range, start);
>       return 0;
> 
> If we set err = -EACCES and jump to out, or if xe_svm_range_get_pages()
> fails and jumps to out later in the function, does returning 0 cause the
> caller to believe the page fault was handled successfully?
> 
> Since the pages weren't mapped, wouldn't the GPU immediately retry the
> instruction and fault again, potentially causing an infinite livelock? Should
> these error paths jump to err_out instead?
> 

Yep, posted a fix here: https://patchwork.freedesktop.org/series/168716/

Matt

> -- 
> Sashiko AI review ยท 
> https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to