On 08/01/2013 06:37 AM, Peter Zijlstra wrote:
> On Thu, Aug 01, 2013 at 02:23:19AM -0400, Rik van Riel wrote:
>> Subject: [PATCH,RFC] numa,sched: use group fault statistics in numa placement
>>
>> Here is a quick strawman on how the group fault stuff could be used
>> to help pick the best node for a task. This is likely to be quite
>> suboptimal and in need of tweaking. My main goal is to get this to
>> Peter & Mel before it's breakfast time on their side of the Atlantic...
>>
>> This goes on top of "sched, numa: Use {cpu, pid} to create task groups
>> for shared faults"
>>
>> Enjoy :)
>>
>> +       /*
>> +        * Should we stay on our own, or move in with the group?
>> +        * The absolute count of faults may not be useful, but comparing
>> +        * the fraction of accesses in each top node may give us a hint
>> +        * where to start looking for a migration target.
>> +        *
>> +        *  max_group_faults     max_faults
>> +        * ------------------ > ------------
>> +        * total_group_faults   total_faults
>> +        */
>> +       if (max_group_nid >= 0 && max_group_nid != max_nid) {
>> +               if (max_group_faults * total_faults >
>> +                               max_faults * total_group_faults)
>> +                       max_nid = max_group_nid;
>> +       }
>
> This makes sense.. another part of the problem, which you might already
> have spotted, is selecting a task to swap with.
>
> If you only look at per-task faults it is often impossible to find a
> suitable swap task, because moving you to a more suitable node would
> degrade the other task -- below is a patch you've already seen but I
> haven't yet posted, because I'm not at all sure it's something 'sane' :-)

I did not realize you had not posted that patch yet, and was
actually building on top of it :)

I suspect that comparing both per-task and per-group fault weights
in task_numa_compare should make your code do the right thing in
task_numa_migrate.

I suspect there will be enough randomness in accesses that they
will never be exactly the same, so we might not need an explicit
tie breaker.

However, if numa_migrate_preferred fails, we may want to try
migrating to any node that has a better score than the current
one.  After all, if we have a group of tasks that would fit in
two NUMA nodes, we don't want half of the tasks stuck in place
because the top node is full. We want them to move to the #2
node at some point.

--
All rights reversed
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
