Re: CFS scheduler unfairly prefers pinned tasks

2015-10-11 Thread paul . szabo
I wrote:

  The Linux CFS scheduler prefers pinned tasks and unfairly
  gives more CPU time to tasks that have set CPU affinity.
  ...
  I believe I have now solved the problem, simply by setting:
    for n in /proc/sys/kernel/sched_domain/cpu*/domain0/min_interval; do echo 0 > $n; done
    for n in /proc/sys/kernel/sched_domain/cpu*/domain0/max_interval; do echo 1 > $n; done

Testing with real-life jobs, I found I needed min_- and max_interval for
domain1 also, and a couple of other non-default values, so:

  for n in /proc/sys/kernel/sched_domain/cpu*/dom*/min_interval; do echo 0 > $n; done
  for n in /proc/sys/kernel/sched_domain/cpu*/dom*/max_interval; do echo 1 > $n; done
  echo 10 > /proc/sys/kernel/sched_latency_ns
  echo 10 > /proc/sys/kernel/sched_min_granularity_ns
  echo 1 >  /proc/sys/kernel/sched_wakeup_granularity_ns

and then things seem fair and my users are happy.
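
(A read-back to confirm what actually took effect may be worth doing,
since the kernel may clamp or round some of these values; a minimal
sketch, and note that none of this persists across a reboot:)

  for f in /proc/sys/kernel/sched_domain/cpu*/dom*/min_interval \
           /proc/sys/kernel/sched_domain/cpu*/dom*/max_interval \
           /proc/sys/kernel/sched_latency_ns \
           /proc/sys/kernel/sched_min_granularity_ns \
           /proc/sys/kernel/sched_wakeup_granularity_ns; do
    echo "$f = $(cat $f)"
  done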

Thanks, Paul

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney   Australia



Re: CFS scheduler unfairly prefers pinned tasks

2015-10-11 Thread paul . szabo
I wrote:

  The Linux CFS scheduler prefers pinned tasks and unfairly
  gives more CPU time to tasks that have set CPU affinity.

I believe I have now solved the problem, simply by setting:

  for n in /proc/sys/kernel/sched_domain/cpu*/domain0/min_interval; do echo 0 > $n; done
  for n in /proc/sys/kernel/sched_domain/cpu*/domain0/max_interval; do echo 1 > $n; done

I am not sure what the domain1 values (which I see exist on my
4*E5-4627v2 server) would be for. So far I do not see any negative
effects of using these (extreme?) settings. (An explanation of what
these things are meant for, or pointers to documentation, would be
appreciated.)
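
(For what it is worth: this sched_domain tree only exists when
CONFIG_SCHED_DEBUG is enabled, and in that case each domain directory
also has a "name" file identifying its topology level. A quick way to
see what domain0 and domain1 correspond to on a given box:

  for n in /proc/sys/kernel/sched_domain/cpu0/domain*/name; do
    echo "$n = $(cat $n)"
  done

On a multi-socket machine, domain1 is likely a package- or NUMA-level
balancing domain sitting above the per-core/per-cache domain0.)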

---

Thanks for the insightful discussion.

(Scary, isn't it?)

Thanks, Paul

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney   Australia


Re: CFS scheduler unfairly prefers pinned tasks

2015-10-10 Thread Wanpeng Li

On 10/10/15 11:59 AM, Wanpeng Li wrote:

Hi Paul,
On 10/8/15 4:19 PM, Mike Galbraith wrote:

On Tue, 2015-10-06 at 04:45 +0200, Mike Galbraith wrote:

On Tue, 2015-10-06 at 08:48 +1100, paul.sz...@sydney.edu.au wrote:

The Linux CFS scheduler prefers pinned tasks and unfairly
gives more CPU time to tasks that have set CPU affinity.
This effect is observed with or without CGROUP controls.

To demonstrate: on an otherwise idle machine, as some user
run several processes pinned to each CPU, one for each CPU
(as many as CPUs present in the system) e.g. for a quad-core
non-HyperThreaded machine:

   taskset -c 0 perl -e 'while(1){1}' &
   taskset -c 1 perl -e 'while(1){1}' &
   taskset -c 2 perl -e 'while(1){1}' &
   taskset -c 3 perl -e 'while(1){1}' &

and (as that same or some other user) run some without
pinning:

   perl -e 'while(1){1}' &
   perl -e 'while(1){1}' &

and use e.g.   top   to observe that the pinned processes get
more CPU time than "fair".


Interesting, I can reproduce it w/ your simple script. However, they
are fair when the number of pinned perl tasks is equal to the number of
unpinned perl tasks. I will dig into it more deeply.


For the pinned tasks, when the task affinity is set to all the available
CPUs instead of a separate CPU each as in your test, things are fair
between pinned and unpinned tasks. So I suspect the imbalance is due to
overhead associated with the migration machinery.
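
That is, a variant of the test along these lines (an illustrative
sketch for the quad-core case) stays fair:

   taskset -c 0-3 perl -e 'while(1){1}' &
   taskset -c 0-3 perl -e 'while(1){1}' &
   taskset -c 0-3 perl -e 'while(1){1}' &
   taskset -c 0-3 perl -e 'while(1){1}' &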


Regards,
Wanpeng Li



Re: CFS scheduler unfairly prefers pinned tasks

2015-10-09 Thread Wanpeng Li

Hi Paul,
On 10/8/15 4:19 PM, Mike Galbraith wrote:

On Tue, 2015-10-06 at 04:45 +0200, Mike Galbraith wrote:

On Tue, 2015-10-06 at 08:48 +1100, paul.sz...@sydney.edu.au wrote:

The Linux CFS scheduler prefers pinned tasks and unfairly
gives more CPU time to tasks that have set CPU affinity.
This effect is observed with or without CGROUP controls.

To demonstrate: on an otherwise idle machine, as some user
run several processes pinned to each CPU, one for each CPU
(as many as CPUs present in the system) e.g. for a quad-core
non-HyperThreaded machine:

   taskset -c 0 perl -e 'while(1){1}' &
   taskset -c 1 perl -e 'while(1){1}' &
   taskset -c 2 perl -e 'while(1){1}' &
   taskset -c 3 perl -e 'while(1){1}' &

and (as that same or some other user) run some without
pinning:

   perl -e 'while(1){1}' &
   perl -e 'while(1){1}' &

and use e.g.   top   to observe that the pinned processes get
more CPU time than "fair".


Interesting, I can reproduce it w/ your simple script. However, they are
fair when the number of pinned perl tasks is equal to the number of
unpinned perl tasks. I will dig into it more deeply.


Regards,
Wanpeng Li


Re: CFS scheduler unfairly prefers pinned tasks

2015-10-08 Thread Mike Galbraith
On Fri, 2015-10-09 at 08:55 +1100, paul.sz...@sydney.edu.au wrote:

> >> Good to see that you agree ...
> > Weeell, we've disagreed on pretty much everything ...
> 
> Sorry I disagree: we do agree on the essence. :-)

P.S.

To some extent.  If the essence is $subject, nope, we definitely
disagree.  If the essence is that _group_ scheduling is not strictly
fair, then we agree.  The "must be fixed" bit, I also disagree with.
"Maybe wants fixing" I can agree with ;-)

-Mike



Re: CFS scheduler unfairly prefers pinned tasks

2015-10-08 Thread Mike Galbraith
On Fri, 2015-10-09 at 08:55 +1100, paul.sz...@sydney.edu.au wrote:
> Dear Mike,
> 
> >>> I see a fairness issue ... but one opposite to your complaint.
> >> Why is that opposite? ...
> >
> > Well, not exactly opposite, only opposite in that the one pert task also
> > receives MORE than its fair share when unpinned.  Two 100% hogs sharing
> > one CPU should each get 50% of that CPU. ...
> 
> But you are using CGROUPs, grouping all oinks into one group, and the
> one pert into another: requesting each group to get the same total CPU.
> Since pert has one process only, the most he can get is 100% (not 400%),
> and it is quite OK for the oinks together to get 700%.

Well, that of course depends on what you call fair.  I realize why and
where it happens.  I told weight adjustment to keep its grubby mitts off
of autogroups, and of course the "problem" went away.  Back to the
viewpoint thing, with two users, each having been _placed_ in a group, I
can well imagine a user who is trying to use all of his authorized
bandwidth raising an eyebrow when he sees one of his tasks getting 24
whole milliseconds per second with an allegedly fair scheduler.

I can see it both ways.  What's going to come out of this is probably
going to be "tough titty, yes, group scheduling has side effects, and
this is one".  I already know it does.  Question is only whether the
weight adjustment gears are spinning as intended or not.

> > IFF ... massively parallel and synchronized ...
> 
> You would be making the assumption that you had the machine to yourself:
> might be the wrong thing to assume.

Yup, it would be a doomed attempt: running a load which cannot thrive
in a shared environment in exactly such an environment.  Are any of the
compute loads
you're having trouble with.. in the math department..  perhaps doing oh,
say complex math goop that feeds the output of one parallel computation
into the next parallel computation? :)

-Mike



Re: CFS scheduler unfairly prefers pinned tasks

2015-10-08 Thread paul . szabo
Dear Mike,

>>> I see a fairness issue ... but one opposite to your complaint.
>> Why is that opposite? ...
>
> Well, not exactly opposite, only opposite in that the one pert task also
> receives MORE than its fair share when unpinned.  Two 100% hogs sharing
> one CPU should each get 50% of that CPU. ...

But you are using CGROUPs, grouping all oinks into one group, and the
one pert into another: requesting each group to get the same total CPU.
Since pert has one process only, the most he can get is 100% (not 400%),
and it is quite OK for the oinks together to get 700%.

> IFF ... massively parallel and synchronized ...

You would be making the assumption that you had the machine to yourself:
might be the wrong thing to assume.

>> Good to see that you agree ...
> Weeell, we've disagreed on pretty much everything ...

Sorry I disagree: we do agree on the essence. :-)

Cheers, Paul

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney   Australia


Re: CFS scheduler unfairly prefers pinned tasks

2015-10-08 Thread Mike Galbraith
On Thu, 2015-10-08 at 21:54 +1100, paul.sz...@sydney.edu.au wrote:
> Dear Mike,
> 
> > I see a fairness issue ... but one opposite to your complaint.
> 
> Why is that opposite? I think it would be fair for the one pert process
> to get 100% CPU; the many oink processes can get everything else. That
> the one oink is at a lowly 10% (when the others are at 100%) is of no consequence.

Well, not exactly opposite, only opposite in that the one pert task also
receives MORE than its fair share when unpinned.  Two 100% hogs sharing
one CPU should each get 50% of that CPU.  The fact that the oink group
contains 8 tasks vs 1 for the pert group should be irrelevant, but what
that last oinker is getting is 1/9 of a CPU, and there just happen to be
9 runnable tasks total, 1 in group pert, and 8 in group oink.
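
A rough sanity check of that 1/9, assuming default cpu.shares of 1024
per group, with each group's weight split across its runnable tasks:

  oink entity weight on CPU 7:  1024 / 8 oinkers = 128
  pert entity weight on CPU 7:  1024
  oink's share of CPU 7:        128 / (128 + 1024) = 1/9 ~ 11.1%
  pert's share of CPU 7:        1024 / 1152 = 8/9 ~ 88.9%

which lines up with the 88.83% pert / 11.07% oink figures in the
original top output.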

IFF that ratio were to prove to be a constant, AND the oink group were a
massively parallel and synchronized compute job on a huge box, that
entire compute job would not just be slowed down by the factor of 2 that
a fair distribution would impose; on, say, a 1000-core box it'd be..
utterly dead, because you'd put it out of your misery.

vogelweide:~/:[0]# cgexec -g cpu:foo bash
vogelweide:~/:[0]# for i in `seq 0 63`; do taskset -c $i cpuhog& done
[1] 8025
[2] 8026
...
vogelweide:~/:[130]# cgexec -g cpu:bar bash
vogelweide:~/:[130]# taskset -c 63 pert 10 (report every 10 seconds)
2260.91 MHZ CPU
perturbation threshold 0.024 usecs.
pert/s:  255 >2070.76us:   38 min:  0.05 max:4065.46 avg: 93.83 sum/s: 23946us overhead: 2.39%
pert/s:  255 >2070.32us:   37 min:  1.32 max:4039.94 avg: 92.82 sum/s: 23744us overhead: 2.37%
pert/s:  253 >2069.85us:   38 min:  0.05 max:4036.44 avg: 94.89 sum/s: 24054us overhead: 2.41%

Hm, that's a kinda odd looking number from my 64 core box, but whatever,
it's far from fair according to my definition thereof.  Poor little oink
plus all other cycles not spent in pert's tight loop add up to ~24ms/s.

> Good to see that you agree on the fairness issue... it MUST be fixed!
> CFS might be wrong or wasteful, but never unfair.

Weeell, we've disagreed on pretty much everything we've talked about so
far, but I can well imagine that what I see in the share update business
_could_ be part of your massive compute job woes.

-Mike



Re: CFS scheduler unfairly prefers pinned tasks

2015-10-08 Thread Peter Zijlstra
On Thu, Oct 08, 2015 at 09:54:21PM +1100, paul.sz...@sydney.edu.au wrote:
> Good to see that you agree on the fairness issue... it MUST be fixed!
> CFS might be wrong or wasteful, but never unfair.

I've not yet had time to look at the case at hand, but there are what
are called 'infeasible weight' scenarios, for which it is impossible to
be fair.

Also, CFS must remain a practical scheduler, which places bounds on the
number of weird cases we can deal with.
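
(A toy infeasible-weight example, with made-up numbers: two CPUs, one
nice-0 task (weight 1024) and four nice-19 tasks (weight 15 each).

  total weight:          1024 + 4*15 = 1084
  nice-0 'fair' share:   2 CPUs * 1024/1084 ~ 1.89 CPUs

No task can run on more than one CPU at a time, so no schedule can
deliver that share; strict weight-proportional fairness is impossible
here.)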


Re: CFS scheduler unfairly prefers pinned tasks

2015-10-08 Thread paul . szabo
Dear Mike,

> I see a fairness issue ... but one opposite to your complaint.

Why is that opposite? I think it would be fair for the one pert process
to get 100% CPU; the many oink processes can get everything else. That
the one oink is at a lowly 10% (when the others are at 100%) is of no
consequence.

What happens when you un-pin pert: does it get 100%? What if you run two
perts? Have you reproduced my observations?

---

Good to see that you agree on the fairness issue... it MUST be fixed!
CFS might be wrong or wasteful, but never unfair.

Cheers, Paul

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney   Australia


Re: CFS scheduler unfairly prefers pinned tasks

2015-10-08 Thread Mike Galbraith
On Tue, 2015-10-06 at 04:45 +0200, Mike Galbraith wrote:
> On Tue, 2015-10-06 at 08:48 +1100, paul.sz...@sydney.edu.au wrote:
> > The Linux CFS scheduler prefers pinned tasks and unfairly
> > gives more CPU time to tasks that have set CPU affinity.
> > This effect is observed with or without CGROUP controls.
> > 
> > To demonstrate: on an otherwise idle machine, as some user
> > run several processes pinned to each CPU, one for each CPU
> > (as many as CPUs present in the system) e.g. for a quad-core
> > non-HyperThreaded machine:
> > 
> >   taskset -c 0 perl -e 'while(1){1}' &
> >   taskset -c 1 perl -e 'while(1){1}' &
> >   taskset -c 2 perl -e 'while(1){1}' &
> >   taskset -c 3 perl -e 'while(1){1}' &
> > 
> > and (as that same or some other user) run some without
> > pinning:
> > 
> >   perl -e 'while(1){1}' &
> >   perl -e 'while(1){1}' &
> > 
> > and use e.g.   top   to observe that the pinned processes get
> > more CPU time than "fair".

I see a fairness issue with pinned tasks and group scheduling, but one
opposite to your complaint.
 
Two task groups, one with 8 hogs (oink), one with 1 (pert), all are pinned.
  PID USER  PR  NI  VIRT  RES  SHR S  %CPU  %MEM    TIME+ P COMMAND
 3269 root  20   0  4060  724  648 R 100.0 0.004  1:00.02 1 oink
 3270 root  20   0  4060  652  576 R 100.0 0.004  0:59.84 2 oink
 3271 root  20   0  4060  692  616 R 100.0 0.004  0:59.95 3 oink
 3274 root  20   0  4060  608  532 R 100.0 0.004  1:00.01 6 oink
 3273 root  20   0  4060  728  652 R 99.90 0.005  0:59.98 5 oink
 3272 root  20   0  4060  644  568 R 99.51 0.004  0:59.80 4 oink
 3268 root  20   0  4060  612  536 R 99.41 0.004  0:59.67 0 oink
 3279 root  20   0  8312  804  708 R 88.83 0.005  0:53.06 7 pert
 3275 root  20   0  4060  656  580 R 11.07 0.004  0:06.98 7 oink
That group share math would make a huge compute group with progress
checkpoints sharing an SGI monster with one other hog amusing to watch.
  
-Mike



Re: CFS scheduler unfairly prefers pinned tasks

2015-10-06 Thread Mike Galbraith
On Wed, 2015-10-07 at 07:44 +1100, paul.sz...@sydney.edu.au wrote:

> I agree that pinning may be bad... should not the kernel penalize the
> badly pinned processes?

I didn't say pinning is bad, I said that what you're seeing is not a bug.

-Mike



Re: CFS scheduler unfairly prefers pinned tasks

2015-10-06 Thread paul . szabo
Dear Mike,

>> ... the CFS is meant to be fair, using things like vruntime
>> to preempt, and throttling. Why are those pinned tasks not preempted or
>> throttled?
>
> Imagine you own a 8192 CPU box for a moment, all CPUs having one pinned
> task, plus one extra unpinned task, and ponder what would have to happen
> in order to meet your utilization expectation. ...

Sorry, but the kernel contradicts that. As per my original report,
things are "fair" in the case of:
 - with CGROUP controls and the two kinds of processes run by
   different users, when there is just one un-pinned process
and that is so on my quad-core i5-3470 baby and my 32-core 4*E5-4627v2
server (and everywhere that I tested). The kernel is smart and gets it
right for one un-pinned process: why not for two?

Now re-testing further (on some machines with CGROUP): on the i5-3470
things are still fair with one un-pinned (becoming un-fair with two); on
the 4*E5-4627v2 they are still fair with 4 un-pinned (becoming un-fair
with 5). Does this suggest that the kernel does things right within each
physical CPU, but breaks across several (or exactly the contrary)? Maybe
not: on a 2*E5530 machine, things are fair with just one un-pinned and
already un-fair with 2.

> What you're seeing is not a bug.  No task can occupy more than one CPU
> at a time, making space reservation on multiple CPUs a very bad idea.

I agree that pinning may be bad... should not the kernel penalize the
badly pinned processes?

Cheers, Paul

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney   Australia


Re: CFS scheduler unfairly prefers pinned tasks

2015-10-06 Thread Mike Galbraith
On Tue, 2015-10-06 at 21:06 +1100, paul.sz...@sydney.edu.au wrote:

> And further... the CFS is meant to be fair, using things like vruntime
> to preempt, and throttling. Why are those pinned tasks not preempted or
> throttled?

Imagine you own an 8192 CPU box for a moment, all CPUs having one pinned
task, plus one extra unpinned task, and ponder what would have to happen
in order to meet your utilization expectation.  Right.

What you're seeing is not a bug.  No task can occupy more than one CPU
at a time, making space reservation on multiple CPUs a very bad idea.

-Mike



Re: CFS scheduler unfairly prefers pinned tasks

2015-10-06 Thread paul . szabo
Dear Mike,

>> .. CFS ... unfairly gives more CPU time to [pinned] tasks ...
>
> If they can all migrate, load balancing can move any of them to try to
> fix the permanent imbalance, so they'll all bounce about sharing a CPU
> with some other hog, and it all kinda sorta works out.
>
> When most are pinned, to make it work out long term you'd have to be
> short term unfair, walking the unpinned minority around the box in a
> carefully orchestrated dance... and have omniscient powers that assure
> that none of the tasks you're trying to equalize is gonna do something
> rude like leave, sleep, fork or whatever, and muck up the grand plan.

Could not your argument be turned around: for a pinned task it is harder
to find an idle CPU, so it should get less time?

But really... those pinned tasks do not hog the CPU forever. Whatever
kicks them off: could not that be done just a little earlier?

And further... the CFS is meant to be fair, using things like vruntime
to preempt, and throttling. Why are those pinned tasks not preempted or
throttled?

Thanks, Paul

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney   Australia


Re: CFS scheduler unfairly prefers pinned tasks

2015-10-05 Thread Mike Galbraith
On Tue, 2015-10-06 at 08:48 +1100, paul.sz...@sydney.edu.au wrote:
> The Linux CFS scheduler prefers pinned tasks and unfairly
> gives more CPU time to tasks that have set CPU affinity.
> This effect is observed with or without CGROUP controls.
> 
> To demonstrate: on an otherwise idle machine, as some user
> run several processes pinned to each CPU, one for each CPU
> (as many as CPUs present in the system) e.g. for a quad-core
> non-HyperThreaded machine:
> 
>   taskset -c 0 perl -e 'while(1){1}' &
>   taskset -c 1 perl -e 'while(1){1}' &
>   taskset -c 2 perl -e 'while(1){1}' &
>   taskset -c 3 perl -e 'while(1){1}' &
> 
> and (as that same or some other user) run some without
> pinning:
> 
>   perl -e 'while(1){1}' &
>   perl -e 'while(1){1}' &
> 
> and use e.g.   top   to observe that the pinned processes get
> more CPU time than "fair".
> 
> Fairness is obtained when either:
>  - there are as many un-pinned processes as CPUs; or
>  - with CGROUP controls and the two kinds of processes run by
>different users, when there is just one un-pinned process; or
>  - if the pinning is turned off for these processes (or they
>are started without).
> 
> Any insight is welcome!

If they can all migrate, load balancing can move any of them to try to
fix the permanent imbalance, so they'll all bounce about sharing a CPU
with some other hog, and it all kinda sorta works out.

When most are pinned, to make it work out long term you'd have to be
short term unfair, walking the unpinned minority around the box in a
carefully orchestrated dance... and have omniscient powers that assure
that none of the tasks you're trying to equalize is gonna do something
rude like leave, sleep, fork or whatever, and muck up the grand plan.
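
(One way to watch that walking-about in the original test case, PSR
being the CPU each task last ran on; a rough sketch:

  watch -n1 "ps -C perl -o pid,psr,%cpu,comm"

The unpinned perls should be seen hopping between CPUs while the pinned
ones stay put.)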

-Mike



CFS scheduler unfairly prefers pinned tasks

2015-10-05 Thread paul . szabo
The Linux CFS scheduler prefers pinned tasks and unfairly
gives more CPU time to tasks that have set CPU affinity.
This effect is observed with or without CGROUP controls.

To demonstrate: on an otherwise idle machine, as some user
run several processes pinned to each CPU, one for each CPU
(as many as CPUs present in the system) e.g. for a quad-core
non-HyperThreaded machine:

  taskset -c 0 perl -e 'while(1){1}' &
  taskset -c 1 perl -e 'while(1){1}' &
  taskset -c 2 perl -e 'while(1){1}' &
  taskset -c 3 perl -e 'while(1){1}' &

and (as that same or some other user) run some without
pinning:

  perl -e 'while(1){1}' &
  perl -e 'while(1){1}' &

and use e.g.   top   to observe that the pinned processes get
more CPU time than "fair".

Fairness is obtained when either:
 - there are as many un-pinned processes as CPUs; or
 - with CGROUP controls and the two kinds of processes run by
   different users, when there is just one un-pinned process; or
 - if the pinning is turned off for these processes (or they
   are started without).

Any insight is welcome!
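
(To put a number on "fair": let the hogs run for a minute or so, then
compare accumulated CPU time, e.g.:

  sleep 60
  ps -C perl -o pid,psr,time,%cpu --sort=-time

The pinned perls should show visibly more accumulated time than the
unpinned ones.)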

---

I would appreciate replies direct to me as I am not subscribed to the
linux-kernel mailing list (but will try to watch the archives).

This bug is also reported to Debian, please see
  http://bugs.debian.org/800945

I use Debian with the 3.16 kernel, have not yet tried 4.* kernels.


Thanks, Paul

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney   Australia

