On Fri, 30 Jan 2026 13:01:24 -0800 Matthew Brost <[email protected]> wrote:
> > > Unfortunately hmm_range_fault() is typically called from a gpu
> > > pagefault handler and it's crucial to get the gpu up and running again
> > > as fast as possible.
> >
> > Would a millisecond matter? Regular old preemption will often cause
> > longer delays.
> >
>
> I think millisecond is too high. We are aiming to GPU page faults
> serviced in 10-15us of CPU time (GPU copy time varies based on size of
> fault / copy bus speed but still at most 200us).

But it's a rare case?

Am I incorrect in believing that getting preempted will cause latencies
much larger than this?

> Matt
>
> > > Is there a way we could test for the cases where cond_resched() doesn't
> > > work and in that case instead call sched_yield(), at least on -EBUSY
> > > errors?
> >
> > kernel-internal sched_yield() was taken away years ago and I don't
> > think there's a replacement, particularly one which will cause a
> > realtime-policy task to yield to a non-rt-policy one.
> >
> > It's common for kernel code to forget that it could have realtime
> > policy - we probably have potential lockups in various places.
> >
> > I suggest you rerun your testcase with this patch using `chrt -r', see
> > if my speculation is correct. Please?
