SD_PREFER_SIBLING adds a bias towards spreading tasks across the groups
of the _parent_ sched_domain, even if no sched_group is overloaded. It
is currently set on:

   1. SMT level to promote spreading to sibling cores rather than using
      sibling HW-threads (caff37ef96eac7fe96a).

   2. Non-NUMA levels which don't have the SD_SHARE_PKG_RESOURCES flag
      set (= DIE level in the default topology) as it was found to
      improve benchmarks on certain NUMA systems (6956dc568f34107f1d02b).

   3. Any non-NUMA level that inherits the flag because its parent
      sched_domain level is eliminated in the degenerate step of the
      sched_domain hierarchy setup (= MC level in the default
      topology).

Preferring siblings seems to be a useful tweak for all non-NUMA levels,
so we should enable it on all of them. As it is, the default topology
can end up with the flag set on SMT and DIE, but not on MC in between.

Signed-off-by: Morten Rasmussen <[email protected]>
cc: Ingo Molnar <[email protected]>
cc: Peter Zijlstra <[email protected]>
---
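Note for reviewers: the intended outcome can be sketched as a small
standalone model. This is only an illustration, not kernel code. The
MODEL_* names, their bit values, and both helper functions are
hypothetical, and the real sd_init() uses an if/else-if chain that is
flattened here into a single NUMA check.

```c
#include <stdbool.h>

/* Illustrative flag bits; the values are placeholders, not the kernel's. */
enum {
	MODEL_SD_SHARE_CPUCAPACITY   = 1 << 0,	/* SMT level */
	MODEL_SD_SHARE_PKG_RESOURCES = 1 << 1,	/* MC level */
	MODEL_SD_SERIALIZE           = 1 << 2,
	MODEL_SD_PREFER_SIBLING      = 1 << 3,
	MODEL_SD_NUMA                = 1 << 4,
};

/*
 * Hypothetical model of the flag handling after this patch:
 * SD_PREFER_SIBLING starts out set for every level and is cleared
 * again only when the level carries SD_NUMA.
 */
static unsigned int model_sd_flags(unsigned int topology_flags)
{
	unsigned int flags = MODEL_SD_PREFER_SIBLING | topology_flags;

	if (flags & MODEL_SD_NUMA) {
		flags &= ~(unsigned int)MODEL_SD_PREFER_SIBLING;
		flags |= MODEL_SD_SERIALIZE;
	}

	return flags;
}

/* True if a level built from topology_flags ends up preferring siblings. */
static bool model_prefers_sibling(unsigned int topology_flags)
{
	return (model_sd_flags(topology_flags) & MODEL_SD_PREFER_SIBLING) != 0;
}
```

Under this model, model_prefers_sibling() is true for the SMT, MC and
DIE flag combinations of the default topology and false for any level
that passes MODEL_SD_NUMA.
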
 kernel/sched/topology.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 6798276d29af..7f70806bfa0f 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1122,7 +1122,7 @@ sd_init(struct sched_domain_topology_level *tl,
                                        | 0*SD_SHARE_CPUCAPACITY
                                        | 0*SD_SHARE_PKG_RESOURCES
                                        | 0*SD_SERIALIZE
-                                       | 0*SD_PREFER_SIBLING
+                                       | 1*SD_PREFER_SIBLING
                                        | 0*SD_NUMA
                                        | sd_flags
                                        ,
@@ -1153,7 +1153,6 @@ sd_init(struct sched_domain_topology_level *tl,
        }
 
        if (sd->flags & SD_SHARE_CPUCAPACITY) {
-               sd->flags |= SD_PREFER_SIBLING;
                sd->imbalance_pct = 110;
                sd->smt_gain = 1178; /* ~15% */
 
@@ -1168,6 +1167,7 @@ sd_init(struct sched_domain_topology_level *tl,
                sd->busy_idx = 3;
                sd->idle_idx = 2;
 
+               sd->flags &= ~SD_PREFER_SIBLING;
                sd->flags |= SD_SERIALIZE;
                if (sched_domains_numa_distance[tl->numa_level] > RECLAIM_DISTANCE) {
                        sd->flags &= ~(SD_BALANCE_EXEC |
@@ -1177,7 +1177,6 @@ sd_init(struct sched_domain_topology_level *tl,
 
 #endif
        } else {
-               sd->flags |= SD_PREFER_SIBLING;
                sd->cache_nice_tries = 1;
                sd->busy_idx = 2;
                sd->idle_idx = 1;
-- 
2.7.4
