On 23/03/16(Wed) 21:35, Mark Kettenis wrote:
> > Date: Mon, 21 Mar 2016 16:51:16 +0100 (CET)
> > From: Mark Kettenis <mark.kette...@xs4all.nl>
> > 
> > > Date: Sat, 19 Mar 2016 13:53:07 +0100
> > > From: Martin Pieuchot <m...@openbsd.org>
> > > 
> > > Applications using multiple threads often call sched_yield(2) to
> > > indicate that one of the threads cannot make any progress because
> > > it is waiting for a resource held by another one.
> > > 
> > > One example of this scenario is the _spinlock() implementation of
> > > our librthread.  But if you look on https://codesearch.debian.net
> > > you can find many more use cases, notably MySQL, PostgreSQL, JDK,
> > > libreoffice, etc.
> > > 
> > > Now the problem with our current scheduler is that the priority of
> > > a thread decreases when it is the "curproc" of a CPU.  So the
> > > threads that don't run and sched_yield(2) end up having a higher
> > > priority than the thread holding the resource.  Which means that
> > > it's really hard for such multi-threaded applications to make
> > > progress, resulting in a high number of IPIs.  That'd also explain
> > > why, if you have more CPUs, let's say 4 instead of 2, your
> > > application will more likely make some progress and you'll see
> > > less stuttering/freezing.
> > > 
> > > So what the diff below does is that it penalizes the threads of
> > > multi-threaded applications such that progress can be made.  It is
> > > inspired by the recent scheduler work done by Michal Mazurek on
> > > tech@.
> > > 
> > > I experimented with various values for "p_priority" and this one
> > > is the one that generates the fewest IPIs when watching an HD
> > > video in firefox.  Because yes, with this diff, now I can.
> > > 
> > > I'd like to know if dereferencing ``p_p'' is safe without holding
> > > the KERNEL_LOCK.
> > > 
> > > I'm also interested in hearing from more people using
> > > multi-threaded applications.
> > 
> > This doesn't only change the sched_yield() behaviour, but also
> > modifies how in-kernel yield() calls behave for threaded processes.
> > That is probably not right.
> 
> So here is a diff that keeps yield() the same and adds the code in the
> sched_yield(2) implementation instead.
I'm not a big fan of code duplication.  What about checking for
p_usrpri >= PUSER instead?

> It also changes the logic that picks the priority to walk the
> complete thread list and pick the highest priority of all the
> threads.  That should be enough to make sure the calling thread is
> scheduled behind all other threads from the same process.  That does
> require us to grab the kernel lock though.  So this removes NOLOCK
> from sched_yield(2).  I don't think that is a big issue.
> 
> Still improves video playback in firefox on my x1.

This is another kind of hack :)  It's certainly less intrusive, but I'm
still not sure we want it.  Reducing the priority of a thread to the
worst priority of its siblings might still not be enough.

Now I'd like to know how many times sched_yield() really triggers a
context switch for threaded programs with this version of the diff.

Without any diff I observed that only 20 to 25% of the sched_yield()
calls made by firefox result in a different thread being chosen by
mi_switch().  Something tells me that per-CPU runqueues are to blame
here.
> Index: syscalls.master
> ===================================================================
> RCS file: /cvs/src/sys/kern/syscalls.master,v
> retrieving revision 1.167
> diff -u -p -r1.167 syscalls.master
> --- syscalls.master	21 Mar 2016 22:41:29 -0000	1.167
> +++ syscalls.master	23 Mar 2016 20:33:45 -0000
> @@ -514,7 +514,7 @@
>  #else
>  297	UNIMPL
>  #endif
> -298	STD NOLOCK	{ int sys_sched_yield(void); }
> +298	STD		{ int sys_sched_yield(void); }
>  299	STD NOLOCK	{ pid_t sys_getthrid(void); }
>  300	OBSOL		t32___thrsleep
>  301	STD		{ int sys___thrwakeup(const volatile void *ident, \
> Index: kern_synch.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/kern_synch.c,v
> retrieving revision 1.129
> diff -u -p -r1.129 kern_synch.c
> --- kern_synch.c	9 Mar 2016 13:38:50 -0000	1.129
> +++ kern_synch.c	23 Mar 2016 20:33:45 -0000
> @@ -1,4 +1,4 @@
> -/*	$OpenBSD: kern_synch.c,v 1.129 2016/03/09 13:38:50 mpi Exp $	*/
> +/*	$openbsd: kern_synch.c,v 1.129 2016/03/09 13:38:50 mpi Exp $	*/
>  /*	$NetBSD: kern_synch.c,v 1.37 1996/04/22 01:38:37 christos Exp $	*/
>  
>  /*
> @@ -432,7 +432,24 @@ wakeup(const volatile void *chan)
>  int
>  sys_sched_yield(struct proc *p, void *v, register_t *retval)
>  {
> -	yield();
> +	struct proc *q;
> +	int s;
> +
> +	SCHED_LOCK(s);
> +	/*
> +	 * If one of the threads of a multi-threaded process called
> +	 * sched_yield(2), drop its priority to ensure its siblings
> +	 * can make some progress.
> +	 */
> +	p->p_priority = p->p_usrpri;
> +	TAILQ_FOREACH(q, &p->p_p->ps_threads, p_thr_link)
> +		p->p_priority = max(p->p_priority, q->p_priority);
> +	p->p_stat = SRUN;
> +	setrunqueue(p);
> +	p->p_ru.ru_nvcsw++;
> +	mi_switch();
> +	SCHED_UNLOCK(s);
> +
>  	return (0);
> }