Hi,

I think it is better to just not backport
0b8b2668f9981c1fefc2ef892bd915288ef01f33 ("um: insert scheduler ticks
when userspace does not yield").

Benjamin

On Thu, 2025-05-08 at 19:00 +0200, Christian Lamparter wrote:
> Hi,
>
> On 3/14/25 2:08 PM, Benjamin Berg wrote:
> > From: Benjamin Berg <benjamin.b...@intel.com>
> > um: work around sched_yield not yielding in time-travel mode
> >
> > sched_yield by a userspace may not actually cause scheduling in
> > time-travel mode as no time has passed. In the case seen it appears to
> > be a badly implemented userspace spinlock in ASAN. Unfortunately, with
> > time-travel it causes an extreme slowdown or even deadlock depending on
> > the kernel configuration (CONFIG_UML_MAX_USERSPACE_ITERATIONS).
> >
> > Work around it by accounting time to the process whenever it executes a
> > sched_yield syscall.
> >
> > Signed-off-by: Benjamin Berg <benjamin.b...@intel.com>
>
> From what I can tell the patch mentioned above was backported to
> 6.12.27 by:
> <https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/arch/um?id=887c5c12e80c8424bd471122d2e8b6b462e12874>
>
> but without the upstream
> > Commit 0b8b2668f9981c1fefc2ef892bd915288ef01f33
> > Author: Benjamin Berg <benjamin.b...@intel.com>
> > Date: Thu Oct 10 16:25:37 2024 +0200
> > um: insert scheduler ticks when userspace does not yield
> >
> > In time-travel mode userspace can do a lot of work without any time
> > passing. Unfortunately, this can result in OOM situations as the RCU
> > core code will never be run. [...]
>
> the kernel build for 6.12.27 for the UM-Target will fail:
>
> > /usr/bin/ld: arch/um/kernel/skas/syscall.o: in function `handle_syscall':
> > linux-6.12.27/arch/um/kernel/skas/syscall.c:43:(.text+0xa2): undefined
> > reference to `tt_extra_sched_jiffies'
> > collect2: error: ld returned 1 exit status
>
> is it possible to backport 0b8b2668f9981c1fefc2ef892bd915288ef01f33 too?
> Or is it better to revert 887c5c12e80c8424bd471122d2e8b6b462e12874 again
> in the stable releases?
>
> Best Regards,
> Christian Lamparter
>
> >
> > ---
> >
> > I suspect it is this code in ASAN that uses sched_yield
> >
> > https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp
> > though there are also some other places that use sched_yield.
> >
> > I doubt that code is reasonable. At the same time, not sure that
> > sched_yield is behaving as advertised either as it obviously is not
> > necessarily relinquishing the CPU.
> > ---
> >  arch/um/include/linux/time-internal.h |  2 ++
> >  arch/um/kernel/skas/syscall.c         | 11 +++++++++++
> >  2 files changed, 13 insertions(+)
> >
> > diff --git a/arch/um/include/linux/time-internal.h b/arch/um/include/linux/time-internal.h
> > index b22226634ff6..138908b999d7 100644
> > --- a/arch/um/include/linux/time-internal.h
> > +++ b/arch/um/include/linux/time-internal.h
> > @@ -83,6 +83,8 @@ extern void time_travel_not_configured(void);
> >  #define time_travel_del_event(...) time_travel_not_configured()
> >  #endif /* CONFIG_UML_TIME_TRAVEL_SUPPORT */
> >
> > +extern unsigned long tt_extra_sched_jiffies;
> > +
> >  /*
> >   * Without CONFIG_UML_TIME_TRAVEL_SUPPORT this is a linker error if used,
> >   * which is intentional since we really shouldn't link it in that case.
> > diff --git a/arch/um/kernel/skas/syscall.c b/arch/um/kernel/skas/syscall.c
> > index b09e85279d2b..a5beaea2967e 100644
> > --- a/arch/um/kernel/skas/syscall.c
> > +++ b/arch/um/kernel/skas/syscall.c
> > @@ -31,6 +31,17 @@ void handle_syscall(struct uml_pt_regs *r)
> >  		goto out;
> >
> >  	syscall = UPT_SYSCALL_NR(r);
> > +
> > +	/*
> > +	 * If no time passes, then sched_yield may not actually yield, causing
> > +	 * broken spinlock implementations in userspace (ASAN) to hang for long
> > +	 * periods of time.
> > +	 */
> > +	if ((time_travel_mode == TT_MODE_INFCPU ||
> > +	     time_travel_mode == TT_MODE_EXTERNAL) &&
> > +	    syscall == __NR_sched_yield)
> > +		tt_extra_sched_jiffies += 1;
> > +
> >  	if (syscall >= 0 && syscall < __NR_syscalls) {
> >  		unsigned long ret = EXECUTE_SYSCALL(syscall, regs);
> >
> >
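
For reference, the userspace pattern that the quoted commit message
describes can be sketched roughly as below. This is only an illustration
and not the sanitizer_common code: the names yield_lock, worker and
run_workers are made up for this example. The waiter spins on a flag and
calls sched_yield() between attempts, so it only makes progress if the
yield really lets the lock holder run. On a normal host this completes
quickly; the quoted patch is about the case where no time passes and the
yield does not cause scheduling.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical yield-based spinlock, for illustration only. */
typedef struct {
	atomic_flag locked;
} yield_lock;

static void yield_lock_acquire(yield_lock *l)
{
	/* Spin until the flag is clear, yielding between attempts. */
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		sched_yield(); /* relies on the holder getting to run */
}

static void yield_lock_release(yield_lock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

#define WORKERS 4
#define ROUNDS 10000

static yield_lock lock = { ATOMIC_FLAG_INIT };
static long counter;

static void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < ROUNDS; i++) {
		yield_lock_acquire(&lock);
		counter++; /* protected by the lock */
		yield_lock_release(&lock);
	}
	return NULL;
}

/* Start the workers, wait for them, and return the final counter. */
static long run_workers(void)
{
	pthread_t threads[WORKERS];

	counter = 0;
	for (int i = 0; i < WORKERS; i++) {
		int rc = pthread_create(&threads[i], NULL, worker, NULL);

		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < WORKERS; i++)
		pthread_join(threads[i], NULL);
	return counter;
}
```

Calling run_workers() from a small main() and building with -pthread is
enough to try it. Every lock handover under contention depends on
sched_yield() actually giving up the CPU, which is the assumption that
the quoted commit message says may not hold in time-travel mode.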