Hi, Matt

On Mon, 2025-10-27 at 17:13 -0700, Matthew Brost wrote:
> On Sat, Oct 25, 2025 at 02:04:12PM +0200, Thomas Hellström wrote:
> > Data present in foreign device memory may cause migration to fail.
> > For now, retry once after first migrating to system.
> > 
> > Signed-off-by: Thomas Hellström <[email protected]>
> > ---
> >  drivers/gpu/drm/xe/xe_svm.c | 19 +++++++++++++++----
> >  1 file changed, 15 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_svm.c
> > b/drivers/gpu/drm/xe/xe_svm.c
> > index 9814f95cb212..41e075aa015c 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.c
> > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > @@ -1529,13 +1529,24 @@ struct drm_pagemap
> > *xe_vma_resolve_pagemap(struct xe_vma *vma, struct xe_tile *t
> >  int xe_svm_alloc_vram(struct xe_svm_range *range, const struct
> > drm_gpusvm_ctx *ctx,
> >                   struct drm_pagemap *dpagemap)
> >  {
> > +   int err, retries = 1;
> > +
> >     xe_assert(range_to_vm(&range->base)->xe, range-
> > >base.pages.flags.migrate_devmem);
> >     range_debug(range, "ALLOCATE VRAM");
> >  
> > -   return drm_pagemap_populate_mm(dpagemap,
> > xe_svm_range_start(range),
> > -                                  xe_svm_range_end(range),
> > -                                  range->base.gpusvm->mm,
> > -                                  ctx->timeslice_ms);
> > +retry:
> > +   err = drm_pagemap_populate_mm(dpagemap,
> > xe_svm_range_start(range),
> > +                                 xe_svm_range_end(range),
> > +                                 range->base.gpusvm->mm,
> > +                                 ctx->timeslice_ms);
> > +   if ((err == -EBUSY || err == -EFAULT) && retries--) {
> 
> I don't think this is what we want to do here. -EFAULT indicates that
> the pages are entirely present somewhere in device memory. This could
> be
> either on the local device or on a foreign device, but we don’t have
> enough information here to determine which case it is.
> 
> If this is on our local device, we're always good. This could occur
> when playing mremap games.
> 
> If it's on a foreign device, things get trickier. If our interconnect
> supports atomics, we're still good. But if the interconnect doesn't
> support atomics (e.g., PCIe P2P) and this is an atomic fault, then we
> need to move the memory. Also, if there's no path between device
> memories, then of course we need to move the memory.
> 
> Again, we don’t have enough information here to make the correct
> decision. We really need to call drm_gpusvm_range_get_pages to gather
> the CPU pages in order to make this kind of decision. Ideally, the
> logic
> should be built into drm_gpusvm_range_get_pages to understand atomic
> migration requirements.

For multi-device I'm just looking at a patch that considers p2p
migration and at the same time returns 0 if data is placed in
compatible memory, given migration policies and interconnects. But
until that patch lands, we need a way to evict memory from foreign
devices so that we can migrate to the desired device.

I would have expected that if memory is already present in local device
memory we'd have the xe_svm_range_in_vram() flag set and would not
attempt to migrate, at least in most cases? Currently, if data is
already fully or partly present in another p2p device's memory and we
ignore the -EFAULT, then get_pages() wouldn't detect that? Or, we could
look at the dpagemap returned from get_pages() and retry the migration
at that point.

We also need to realize that with multi-gpu, the chances of migration
races increase dramatically, and whether those return -EBUSY or -EFAULT
appears a bit arbitrary to me. We can't really assume that cpages == 0
means all pages are already where they are supposed to be.

My current thinking on how to handle all this is the following:

1) xe_svm_range_in_vram(): first check, to avoid migration.
2) (NEW, not implemented yet) If we decide to migrate, first run
hmm_range_fault() without the faulting flag to determine current memory
residency - perhaps optional. This may reject migration more
efficiently than collecting pages for migration and then inspecting
them, because by then we've already sent an invalidation event.
3) Call into drm_pagemap_populate_mm(). This collects all compatible
and system pages, and determines what to migrate. If no migration is
needed, return 0. If racing, or needing to migrate foreign device
memory to system, return -EBUSY.
4) If -EBUSY, evict and retry the migration once.

For now, I think we should make atomic faults use local VRAM only.

> 
> Once drm_gpusvm_range_get_pages returns, we can take appropriate
> action.
> Initially, for simplicity, this might just be a bounce to system
> memory.
> Later, it could evolve into a direct device-to-device move.

I agree we need a pass with hmm_range_fault(); the question is in what
order we do this. I think a pass without the fault flag, before trying
to migrate, would

a) avoid populating with system pages data that's going to be in VRAM
anyway, and
b) possibly avoid collecting migrate pages, and thus also an
invalidation event for all devices.

The drawback is that we might unnecessarily run a non-faulting
hmm_range_fault() when we need to migrate anyway. My thinking is that
would be a rather quick call, though, compared to the reverse lookups
in the migrate code.

> 
> The logic inside drm_gpusvm_range_get_pages would likely involve
> devmem_only combined with a drm_pagemap passed in, which can detect
> connectivity and atomic support between devices—based on the
> drm_pagemap
> extracted from the ZDD.
> 
> Let me know if this makes sense, or if you have thought about doing
> this in a follow-up.

In any case, I think we need to set up a flow-chart / flow-list similar
to the above and consider the most important cases and what to do with
them. For now, I think we can replace this patch, if necessary, with a
dpagemap check of whether the desired pagemap equals what's present,
and rerun after that. We'd probably need that anyway.

Thomas
> 
> Matt
> 
> > +           range_debug(range, "ALLOCATE VRAM - Retry.");
> > +
> > +           drm_gpusvm_range_evict(range->base.gpusvm, &range-
> > >base);
> > +           goto retry;
> > +   }
> > +
> > +   return err;
> >  }
> >  
> >  static struct drm_pagemap_addr
> > -- 
> > 2.51.0
> > 
