On Wed, May 15, 2019 at 09:05:32AM -0500, Amit Kulkarni wrote:
> Hi,
>
> This effort is heavily based on top of Gregor's and Michal's diffs. I tried
> to incorporate the feedback various people gave them in 2011/2016. I split
> the new code behind an ifdef so people can do a straight comparison, and
> tried very hard not to delete existing code, just to shift it around. The
> main motivation was to find out whether incremental improvements to the
> scheduler are possible. After some time I have still not completely
> understood the load average part, the p_priority/p_usrpri calculations, or
> the cpuset code. I looked heavily at the DragonFly (simpler than FreeBSD)
> and FreeBSD code. As a newcomer, I find the OpenBSD code much easier to
> read. Thanks for the detailed explanations given to Michal and in later
> diffs posted to tech@; they helped a lot when trying to understand the
> decisions made by other devs. The commit messages help a little, but the
> explanations help a lot more: they invite discussion, and end users learn
> much more in the process.
>
> This diff survived several tens of kernel builds, a bsd.sp build, a bsd.rd
> build, a bsd.mp build without these changes, userland/xenocara, and a make
> regress a few days ago, all on 3 amd64 desktops. I ran it under all the
> scenarios listed in the previous sentence. No major invasive changes were
> attempted; all tools should work as is. If something is broken, please let
> me know. This feels faster than -current, but I am not sure by how much; a
> placebo effect may be in play.
>
> I think there is a bug in resched_proc() which is fixed in mi_switch(),
> plus a quick enhancement in cpu_switchto(), in p_slptime handling, and a
> precise round-robin interval for each proc unless it is preempted or it
> yields.
>
> I tried to fiddle with time slices other than hz/10. I am not sure that
> will work on other arches, but I tried to stay within MI code, so it
> should. Other than counting, I have no idea how to verify that a proc is
> actually switched away after its round-robin interval is over. I just went
> by what is in the existing code. I tried a higher slice like 200 ms, but
> did not want to exceed the 1-second rucheck() limit in kern_resource.
>
> I did rudimentary work on affinity without adding an affinity field to
> struct proc or struct schedstate_percpu (as DragonFly/FreeBSD do). I
> observed the imbalance in the run queues. Affinity works most of the time
> under light load. There is a problem when I try to comment out
> sched_steal_proc() in kern_sched; that is the problem with this iteration
> of the diff.
>
> I am not sure the reduction from 32 queues to a single TAILQ would be
> accepted, but I tried it anyway; it is definitely faster. This code
> reduces the complexity of deciding which queue to place a proc on. There
> is no current support for a real-time queue or other scheduling classes,
> so IMHO this is a simplification.
>
> I tried to give a detailed explanation of my thinking. I am sending the
> complete diff, but will split it up if parts of it are found to be valid.
>
> In any case, I request that a small change to top be accepted, to display
> p_priority directly.
>
> Thanks
Hi,

I tried your diff. It makes the game games/gzdoom unplayable because of too much stuttering. Apart from that case, I did not feel any change on my Intel i7-8550U.