On 2/8/2018 10:54 PM, Mike Galbraith wrote:
> On Thu, 2018-02-08 at 14:19 -0800, Rohit Jain wrote:
>> This patch introduces the sysctl for sched_domain based migration costs.
>> These in turn can be used for performance tuning of workloads.
>
> With this patch, we trade 1 completely bogus constant (cost is really
> highly variable) for 3, twiddling of which has zero effect unless you
> trigger a domain rebuild afterward, which is neither mentioned in the
> changelog, nor documented.
>
> bogo-numbers++ is kinda hard to love.

Yup, the domain rebuild is missing.

I am no fan of tunables, the fewer the better, but one of the several
flaws of the single figure for migration cost is that it ignores the
very large difference in cost between migrating across near and far
levels of the cache hierarchy.

Migration between CPUs of the same core should be free, as they share
L1 cache. Rohit defined a tunable for it, but IMO it could be
hard-coded to 0. Migration between CPUs in different sockets is the
most expensive and is represented by the existing
sysctl_sched_migration_cost tunable. Migration between CPUs in the
same core cluster, or in the same socket, is somewhere in between, as
they share L2 or L3 cache. We could avoid a separate tunable by
setting it to sysctl_sched_migration_cost / 10.

- Steve
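
To illustrate the tiering described above, here is a minimal user-space
sketch in C. It is not kernel code and not part of Rohit's patch: the
enum, its values, and the function name are hypothetical, and the
divisor of 10 is only the figure floated above, not a measured ratio.

```c
/*
 * Hypothetical topology tiers for a migration; these names are
 * illustrative only and do not correspond to kernel identifiers.
 */
enum mig_tier {
	MIG_SAME_CORE,     /* SMT siblings, shared L1 cache */
	MIG_SAME_SOCKET,   /* same core cluster or socket, shared L2/L3 */
	MIG_CROSS_SOCKET   /* different sockets, no shared cache */
};

/*
 * Derive a per-tier migration cost (in ns) from one base value, so
 * that only the existing single tunable needs to be exposed.
 * base_cost_ns stands in for sysctl_sched_migration_cost.
 */
static unsigned int migration_cost_ns(enum mig_tier tier,
				      unsigned int base_cost_ns)
{
	switch (tier) {
	case MIG_SAME_CORE:
		return 0;                  /* assumed free: L1 is shared */
	case MIG_SAME_SOCKET:
		return base_cost_ns / 10;  /* assumed ratio, not measured */
	case MIG_CROSS_SOCKET:
	default:
		return base_cost_ns;       /* the existing tunable as-is */
	}
}
```

With a base of 500000 ns, for example, the same-socket tier comes out
at 50000 ns and the same-core tier at 0, so one knob still scales all
three levels.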