On Fri, Jan 30, 2026 at 12:38:10PM -0800, Andrew Morton wrote:
> On Fri, 30 Jan 2026 20:56:31 +0100 Thomas Hellström
> <[email protected]> wrote:
>
> > > > --- a/mm/hmm.c
> > > > +++ b/mm/hmm.c
> > > > @@ -674,6 +674,13 @@ int hmm_range_fault(struct hmm_range *range)
> > > >  			return -EBUSY;
> > > >  		ret = walk_page_range(mm, hmm_vma_walk.last, range->end,
> > > >  				      &hmm_walk_ops, &hmm_vma_walk);
> > > > +		/*
> > > > +		 * Conditionally reschedule to let other work items get
> > > > +		 * a chance to unlock device-private pages whose locks
> > > > +		 * we're spinning on.
> > > > +		 */
> > > > +		cond_resched();
> > > > +
> > > >  		/*
> > > >  		 * When -EBUSY is returned the loop restarts with
> > > >  		 * hmm_vma_walk.last set to an address that has not been stored
> > >
> > > If the process which is running hmm_range_fault() has
> > > SCHED_FIFO/SCHED_RR then cond_resched() doesn't work. An explicit
> > > msleep() would be better?
> >
> > Unfortunately hmm_range_fault() is typically called from a gpu
> > pagefault handler and it's crucial to get the gpu up and running again
> > as fast as possible.
>
> Would a millisecond matter? Regular old preemption will often cause
> longer delays.
>
I think a millisecond is too high. We are aiming to get GPU page faults
serviced in 10-15us of CPU time (GPU copy time varies with the size of
the fault and copy-bus speed, but is still at most 200us).

Matt

> > Is there a way we could test for the cases where cond_resched() doesn't
> > work and in that case instead call sched_yield(), at least on -EBUSY
> > errors?
>
> kernel-internal sched_yield() was taken away years ago and I don't
> think there's a replacement, particularly one which will cause a
> realtime-policy task to yield to a non-rt-policy one.
>
> It's common for kernel code to forget that it could have realtime
> policy - we probably have potential lockups in various places.
>
> I suggest you rerun your testcase with this patch using `chrt -r', see
> if my speculation is correct.
