On 03/10/2018 08:16 AM, Peter Zijlstra wrote:
> On Fri, Mar 09, 2018 at 06:06:29PM -0500, Waiman Long wrote:
>> So you are talking about sched_relax_domain_level and
> That one I wouldn't be sad to see the back of.
>
>> sched_load_balance.
> This one, that's critical. And this is the perfect time to try and fix
> the whole isolcpus issue.
>
> The primary issue is that to make equivalent functionality available
> through cpuset, we need to basically start all tasks outside the root
> group.
>
> The equivalent of isolcpus=xxx is a cgroup setup like:
>
>         root
>         /  \
>   system    other
>
> Where other has the @xxx cpus and system the remainder and
> root.sched_load_balance = 0.

I saw in the kernel-parameters.txt file that the isolcpus option was
deprecated - use cpusets instead. However, there doesn't seem to be any
documentation on the right way to do it. Of course, we can achieve
similar results with what you have outlined above, but the process is
more complex than just adding another boot command line argument with
isolcpus. So I doubt isolcpus will die anytime soon unless we can make
the alternative as easy to use.

> Back before cgroups (and the new workqueue stuff), we could've started
> everything in the !root group, no worry. But now that doesn't work,
> because a bunch of controllers can't deal with that and everything
> cgroup expects the cgroupfs to be empty on boot.

AFAIK, all the processes belong to the root cgroup on boot. And the root
cgroup is usually special in that the controller may not exert any
control over processes in the root cgroup. Many controllers become
active for processes in the child cgroups only. Would you mind
elaborating on what doesn't quite work currently?

> It's one of my biggest regrets that I didn't 'fix' this before cgroups
> came along.
>
>> I have not removed any bits. I just haven't exposed
>> them yet. It does seem like these 2 control knobs are useful from the
>> scheduling perspective. Do we also need cpu_exclusive or just the two
>> sched control knobs are enough?
> I always forget if we need exclusive for load_balance to work; I'll
> peruse the document/code.

I think the cpu_exclusive feature can be useful to enforce that CPUs
allocated to the "other" isolated cgroup cannot be used by the processes
under the "system" parent.

I know that there is special code to handle the isolcpus option. How
about changing it to create an exclusive cpuset automatically instead?
Applications that need to run on those isolated CPUs can then use the
standard cgroup process to be moved into the isolated cgroup.

For example,

    isolcpus=<cpuset-name>,<cpu-id-list> or
    isolcpuset=<cpuset-name>[,cpu:<cpu-id-list>][,mem:<memory-node-list>]

We can then retire the old usage and encourage users to use the cgroup
API to manage it.

Cheers,
Longman
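
To make the proposed isolcpuset= form more concrete, here is a minimal
Python sketch of how such a value might be split up and mapped onto
cpuset control files. Everything in it is an assumption for
illustration: the isolcpuset= parameter is only a suggestion in this
mail and is not implemented anywhere, the helper names (IsolCpuset,
parse_isolcpuset, planned_writes) are hypothetical, and the paths assume
a cgroup v1 cpuset hierarchy mounted at /sys/fs/cgroup/cpuset. Because a
CPU list can itself contain commas, the sketch treats any bare token as
a continuation of the preceding cpu: or mem: list. It only builds a list
of (file, value) pairs and never touches the filesystem.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical keys, taken from the syntax proposed in this mail.
LIST_KEYS = ("cpu", "mem")


@dataclass
class IsolCpuset:
    name: str
    cpus: Optional[str] = None  # kernel-style CPU list, e.g. "2-3,6"
    mems: Optional[str] = None  # memory node list, e.g. "0"


def parse_isolcpuset(value: str) -> IsolCpuset:
    """Parse '<name>[,cpu:<list>][,mem:<list>]' (proposed syntax only)."""
    tokens = value.split(",")
    name = tokens[0]
    if not name or ":" in name:
        raise ValueError("missing cpuset name")
    fields = {}
    current = None
    for tok in tokens[1:]:
        key, sep, rest = tok.partition(":")
        if sep and key in LIST_KEYS:
            if key in fields or not rest:
                raise ValueError(f"bad or duplicate '{key}:' field")
            current = key
            fields[key] = [rest]
        elif tok and not sep and current is not None:
            # Assumption: a bare token continues the preceding list,
            # since CPU lists themselves contain commas.
            fields[current].append(tok)
        else:
            raise ValueError(f"unexpected token {tok!r}")
    return IsolCpuset(
        name=name,
        cpus=",".join(fields["cpu"]) if "cpu" in fields else None,
        mems=",".join(fields["mem"]) if "mem" in fields else None,
    )


def planned_writes(
    spec: IsolCpuset, root: str = "/sys/fs/cgroup/cpuset"
) -> List[Tuple[str, str]]:
    """Return (file, value) pairs for the layout discussed in the thread.

    Assumes a cgroup v1 cpuset mount at `root`; nothing is written.
    """
    base = f"{root}/{spec.name}"
    writes = []
    if spec.cpus is not None:
        writes.append((f"{base}/cpuset.cpus", spec.cpus))
    if spec.mems is not None:
        writes.append((f"{base}/cpuset.mems", spec.mems))
    # Keep other cpusets off the isolated CPUs.
    writes.append((f"{base}/cpuset.cpu_exclusive", "1"))
    # As in the quoted setup: root.sched_load_balance = 0.
    writes.append((f"{root}/cpuset.sched_load_balance", "0"))
    return writes
```

Calling planned_writes(parse_isolcpuset("other,cpu:2-3,6,mem:0")) yields
the writes for an "other" cpuset holding CPUs 2-3 and 6 on memory node
0. Whether the kernel would accept the writes in that order, and how the
"system" sibling gets the remaining CPUs, is left open here.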