On Fri, 2015-07-03 at 08:40 +0200, Mike Galbraith wrote:

> Hm.  Seems what this load should like best is if we detect 1:N, skip all
> of the routine gyrations, ie move the N (workers) infrequently, expend
> search cycles frequently only on the 1 (dispatch).
> 
> Ponder..

While taking a refresher peek at wake_wide(), it seems it's not really
paying attention when the waker of many is itself the one being awakened.
I wonder if your load would see more benefit if it watched for that too,
like so.. rashly assuming I didn't wreck it completely (iow, completely
untested).

---
 kernel/sched/fair.c |   36 ++++++++++++++++++++++--------------
 1 file changed, 22 insertions(+), 14 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4586,10 +4586,23 @@ static void record_wakee(struct task_str
                current->wakee_flips >>= 1;
                current->wakee_flip_decay_ts = jiffies;
        }
+       if (time_after(jiffies, p->wakee_flip_decay_ts + HZ)) {
+               p->wakee_flips >>= 1;
+               p->wakee_flip_decay_ts = jiffies;
+       }
 
        if (current->last_wakee != p) {
                current->last_wakee = p;
                current->wakee_flips++;
+               /*
+                * Flip the buddy as well.  It's the ratio of flips
+                * with a socket size decayed cutoff that determines
+                * whether the pair are considered to be part of 1:N
+                * or M*N loads of a size that we need to spread, so
+                * ensure flips of both load components.  The waker
+                * of many will have many more flips than its wakees.
+                */
+               p->wakee_flips++;
        }
 }
 
@@ -4732,24 +4745,19 @@ static long effective_load(struct task_g
 
 static int wake_wide(struct task_struct *p)
 {
+       unsigned long max = max(current->wakee_flips, p->wakee_flips);
+       unsigned long min = min(current->wakee_flips, p->wakee_flips);
        int factor = this_cpu_read(sd_llc_size);
 
        /*
-        * Yeah, it's the switching-frequency, could means many wakee or
-        * rapidly switch, use factor here will just help to automatically
-        * adjust the loose-degree, so bigger node will lead to more pull.
+        * Yeah, it's a switching-frequency heuristic, and could mean the
+        * intended many wakees/waker relationship, or rapidly switching
+        * between a few.  Use factor to try to automatically adjust such
+        * that the load spreads when it grows beyond what will fit in llc.
         */
-       if (p->wakee_flips > factor) {
-               /*
-                * wakee is somewhat hot, it needs certain amount of cpu
-                * resource, so if waker is far more hot, prefer to leave
-                * it alone.
-                */
-               if (current->wakee_flips > (factor * p->wakee_flips))
-                       return 1;
-       }
-
-       return 0;
+       if (min < factor)
+               return 0;
+       return max > min * factor;
 }
 
 static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
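
For anyone wanting to poke at the numbers without booting a kernel, here is
a minimal userspace sketch of the flip accounting and cutoff from the patch
above.  It is not kernel code: struct task, simulate() and MAX_WORKERS are
made up for illustration, llc_size stands in for sd_llc_size, the
once-per-second decay is left out, and the workers never wake anything
themselves.

```c
#include <stddef.h>

#define MAX_WORKERS 64	/* arbitrary bound for this sketch */

/* Hypothetical stand-in for the two task_struct fields the patch uses. */
struct task {
	unsigned int wakee_flips;
	const struct task *last_wakee;
};

/* Model of the patched record_wakee(), with the time-based decay omitted. */
static void record_wakee(struct task *waker, struct task *wakee)
{
	if (waker->last_wakee != wakee) {
		waker->last_wakee = wakee;
		waker->wakee_flips++;
		wakee->wakee_flips++;	/* flip the buddy as well */
	}
}

/* Model of the patched wake_wide(); llc_size plays the role of sd_llc_size. */
static int wake_wide(const struct task *waker, const struct task *wakee,
		     unsigned int llc_size)
{
	unsigned int a = waker->wakee_flips;
	unsigned int b = wakee->wakee_flips;
	unsigned int hi = a > b ? a : b;
	unsigned int lo = a > b ? b : a;

	/* The less active side must itself be flipping at least llc_size times. */
	if (lo < llc_size)
		return 0;
	return hi > lo * llc_size;
}

/*
 * One dispatcher wakes nworkers workers round-robin for the given number
 * of rounds, then we ask whether the dispatcher/worker[0] pair is "wide".
 */
static int simulate(unsigned int nworkers, unsigned int rounds,
		    unsigned int llc_size)
{
	struct task dispatcher = { 0, NULL };
	struct task workers[MAX_WORKERS] = { { 0, NULL } };
	unsigned int r, i;

	if (nworkers == 0)
		return 0;
	if (nworkers > MAX_WORKERS)
		nworkers = MAX_WORKERS;

	for (r = 0; r < rounds; r++)
		for (i = 0; i < nworkers; i++)
			record_wakee(&dispatcher, &workers[i]);

	return wake_wide(&dispatcher, &workers[0], llc_size);
}
```

With llc_size 4, simulate(16, 8, 4) leaves the dispatcher at 128 flips and
each worker at 8, so 128 > 8 * 4 and the pair is treated as wide.
simulate(2, 8, 4) gives 16 vs 8 and stays affine, and simulate(16, 2, 4)
stays affine because the worker's 2 flips are below the cutoff.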

