On 2019-06-21 17:11:26, Martin Pieuchot <[email protected]> wrote: > On 06/06/19(Thu) 15:16, Martin Pieuchot wrote: > > On 02/06/19(Sun) 16:41, Martin Pieuchot wrote: > > > On 01/06/19(Sat) 18:55, Martin Pieuchot wrote: > > > > Diff below exists mainly for documentation and test purposes. If > > > > you're not interested about how to break the scheduler internals in > > > > pieces, don't read further and go straight to testing! > > > > > > > > - First change is to stop calling tsleep(9) at PUSER. That makes > > > > it clear that all "sleeping priorities" are smaller than PUSER. > > > > That's important to understand for the diff below. `p_priority' > > > > is currently a placeholder for the "sleeping priority" and the > > > > "runnqueue priority". Both fields are separated by this diff. > > > > > > > > - When a thread goes to sleep, the priority argument of tsleep(9) is > > > > now recorded in `p_slpprio'. This argument can be considered as part > > > > of the sleep queue. Its purpose is to place the thread into a higher > > > > runqueue when awoken. > > > > > > > > - Currently, for stopped threads, `p_priority' correspond to > > > > `p_usrpri'. > > > > So setrunnable() has been untangled to place SSTOP and SSLEEP threads > > > > in the preferred queue without having to use `p_priority'. Note that > > > > `p_usrpri' is still recalculated *after* having called setrunqueue(). > > > > This is currently fine because setrunnable() is called with > > > > SCHED_LOCK() > > > > but it will be racy when we'll split it. > > > > > > > > - A new field, `p_runprio' has been introduced. It should be considered > > > > as part of the per-CPU runqueues. It indicates where a current thread > > > > is placed. > > > > > > > > - `spc_curpriority' is now updated at every context-switch. That means > > > > need_resched() won't be called after comparing an out-of-date value. > > > > At the same time, `p_usrpri' is initialized to the highest possible > > > > value for idle threads. > > > > > > > > - resched_proc() was calling need_resched() in the following conditions: > > > > - If the SONPROC thread has a higher priority that the current > > > > running thread (itself). > > > > - Twice in setrunnable() when we know that p_priority <= p_usrpri. > > > > - If schedcpu() considered that a thread, after updating its prio, > > > > should preempt the one running on the CPU pointed by `p_cpu'. > > > > > > > > The diff below simplify all of that by calling need_resched() when: > > > > - A thread is inserted in a CPU runqueue at a higher priority than > > > > the one SONPROC. > > > > - schedcpu() decides that a thread in SRUN state should preempt the > > > > one SONPROC. > > > > > > > > - `p_estcpu' `p_usrpri' and `p_slptime' which represent the "priority" > > > > of a thread are now updated while holding a per-thread mutex. As a > > > > result schedclock() and donice() no longer takes the SCHED_LOCK(), > > > > and schedcpu() almost never take it. > > > > > > > > - With this diff top(1) and ps(1) will report the "real" `p_usrpi' value > > > > when displaying priorities. This is helpful to understand what's > > > > happening: > > > > > > > > load averages: 0.99, 0.56, 0.25 two.lab.grenadille.net > > > > 23:42:10 > > > > 70 threads: 68 idle, 2 on processor > > > > up 0:09 > > > > CPU0: 0.0% user, 0.0% nice, 51.0% sys, 2.0% spin, 0.0% intr, 47.1% > > > > idle > > > > CPU1: 2.0% user, 0.0% nice, 51.0% sys, 3.9% spin, 0.0% intr, 43.1% > > > > idle > > > > Memory: Real: 47M/1005M act/tot Free: 2937M Cache: 812M Swap: 0K/4323M > > > > > > > > PID TID PRI NICE SIZE RES STATE WAIT TIME CPU > > > > COMMAND > > > > 81000 145101 72 0 0K 1664K sleep/1 bored 1:15 36.96% > > > > softnet > > > > 47133 244097 73 0 2984K 4408K sleep/1 netio 1:06 35.06% cvs > > > > 64749 522184 66 0 176K 148K onproc/1 - 0:55 28.81% nfsd > > > > 21615 602473 127 0 0K 1664K sleep/0 - 7:22 0.00% > > > > idle0 > > > > 12413 606242 127 0 0K 1664K sleep/1 - 7:08 0.00% > > > > idle1 > > > > 85778 338258 50 0 4936K 7308K idle select 0:10 0.00% ssh > > > > > > > > 22771 575513 50 0 176K 148K sleep/0 nfsd 0:02 0.00% > > > > nfsd > > > > .... > > > > > > > > > > > > - The removal of `p_priority' and the change that makes mi_switch() > > > > always update `spc_curpriority' might introduce some changes in > > > > behavior, especially with kernel threads that were not going through > > > > tsleep(9). We currently have some situations where the priority of > > > > the running thread isn't correctly reflected. This diff changes that > > > > which means we should be able to better understand where the problems > > > > are. > > > > > > > > I'd be interested in comments/tests/reviews before continuing in this > > > > direction. Note that at least part of this diff are required to split > > > > the accounting apart from the SCHED_LOCK() as well. > > > > > > > > I'll also work on exporting scheduler statistics unless somebody wants > > > > to beat me :) > > > > > > Updated diff to use IPL_SCHED and rebased to apply on top of -current :) > > > > Updated diff that fixes a pagefault reported by sthen@. > > Rebased diff on top of -current. I'm still looking for tests and > comments :) >
I've been running with this for a day or so. It survived a ports/www/mozilla-firefox build using all 4 CPU cores, and a kernel build likewise using all 4 cores. I've been able to browse the net, play ports/games/prboom (a Doom engine), play ports/games/wesnoth, play music, watch videos, etc. No problems encountered so far. -- Bryan
