Re: [RFC PATCH v2 0/6] SCHED_DEADLINE server infrastructure

2020-09-08 Thread Juri Lelli
Hi Pavel,

On 09/09/20 00:22, Pavel Machek wrote:
> Hi!
> 
> > This is RFC v2 of Peter's SCHED_DEADLINE server infrastructure
> > implementation [1].
> > 
> > SCHED_DEADLINE servers can help fixing starvation issues of low priority
> > tasks (e.g., SCHED_OTHER) when higher priority tasks monopolize CPU
> > cycles. Today we have RT Throttling; DEADLINE servers should be able to
> > replace and improve that.
> 
> It would be worth noting what a "server" is in this context.

It comes from Constant Bandwidth Server (CBS), the algorithm that
SCHED_DEADLINE implements [1].
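
To make the term a bit more concrete, here is a rough user-space sketch
of the CBS rules as described in [1]. It is plain Python rather than
kernel code, the class and method names are made up for illustration,
and the 50ms/1s reservation is only an example value, not something
taken from the patchset.

```python
from dataclasses import dataclass


@dataclass
class CbsServer:
    """Toy model of one CBS reservation; all times are in microseconds."""
    runtime: int             # budget granted every period
    period: int              # replenishment period
    deadline_rel: int        # relative deadline
    remaining: int = 0       # remaining runtime
    deadline: int = 0        # absolute scheduling deadline
    throttled: bool = False

    def wakeup(self, now: int) -> None:
        # Wakeup rule: start a fresh (runtime, deadline) pair if the old
        # deadline is in the past, or if keeping the old pair would exceed
        # the reserved bandwidth runtime/period. The second test is
        # remaining / (deadline - now) > runtime / period, cross-multiplied
        # to stay in integers.
        overflow = self.remaining * self.period > self.runtime * (self.deadline - now)
        if self.deadline < now or overflow:
            self.deadline = now + self.deadline_rel
            self.remaining = self.runtime

    def account(self, delta: int) -> None:
        # Charge executed time; an exhausted budget throttles the server
        # until its replenishment time.
        self.remaining -= delta
        if self.remaining <= 0:
            self.throttled = True

    def replenish(self) -> None:
        # At replenishment time: postpone the deadline by one period and
        # refill the budget.
        self.deadline += self.period
        self.remaining += self.runtime
        self.throttled = False


server = CbsServer(runtime=50_000, period=1_000_000, deadline_rel=1_000_000)
server.wakeup(now=1_000_000)   # fresh reservation
server.account(50_000)         # budget exhausted, server is throttled
server.replenish()             # deadline moves one period ahead
```

In this model a "server" is just such a (runtime, period) reservation
that the scheduler uses to run other entities (here: fair tasks) on its
behalf.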

> 
> It is not a white box with a CPU inside, and it is not even a userland
> process, afaict.
>
> The subject is quite confusing.

Best,
Juri

1 - https://elixir.bootlin.com/linux/latest/source/Documentation/scheduler/sched-deadline.rst#L42



Re: [RFC PATCH v2 0/6] SCHED_DEADLINE server infrastructure

2020-09-08 Thread Pavel Machek
Hi!

> This is RFC v2 of Peter's SCHED_DEADLINE server infrastructure
> implementation [1].
> 
> SCHED_DEADLINE servers can help fixing starvation issues of low priority
> tasks (e.g., SCHED_OTHER) when higher priority tasks monopolize CPU
> cycles. Today we have RT Throttling; DEADLINE servers should be able to
> replace and improve that.

It would be worth noting what a "server" is in this context.

It is not a white box with a CPU inside, and it is not even a userland
process, afaict.

The subject is quite confusing.

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [RFC PATCH v2 0/6] SCHED_DEADLINE server infrastructure

2020-08-07 Thread luca abeni
Hi Juri,

thanks for sharing the v2 patchset!

In the next days I'll have a look at it, and try some tests...

In the meanwhile, I have some questions/comments after a first quick
look.

If I understand well, the patchset does not apply deadline servers to
FIFO and RR tasks, right? How does this patchset interact with RT
throttling?

If I understand well, patch 6/6 does something like "use deadline
servers for SCHED_OTHER only if FIFO/RR tasks risk to starve
SCHED_OTHER tasks"... Right? I understand this is because you do not
want to delay RT tasks if they are not starving other tasks. But then,
maybe what you want is not deadline-based scheduling. Maybe a
reservation-based scheduler based on fixed priorities is what you want?
(with SCHED_DEADLINE, you could provide exact performance guarantees to
SCHED_OTHER tasks, but I suspect patch 6/6 breaks these guarantees?)


Thanks,
Luca

On Fri,  7 Aug 2020 11:50:45 +0200
Juri Lelli wrote:

> Hi,
> 
> This is RFC v2 of Peter's SCHED_DEADLINE server infrastructure
> implementation [1].
> 
> SCHED_DEADLINE servers can help fixing starvation issues of low
> priority tasks (e.g., SCHED_OTHER) when higher priority tasks
> monopolize CPU cycles. Today we have RT Throttling; DEADLINE servers
> should be able to replace and improve that.
> 
> I rebased Peter's patches (adding changelogs where needed) on
> tip/sched/core as of today and incorporated fixes to issues discussed
> during RFC v1. Current set seems to even boot on real HW! :-)
> 
> While playing with the RFC v1 set (and discussing it further offline
> with Daniel), it became clear that the behavior needs to change
> slightly. Patch 6/6 is a (cumbersome?) attempt to show what's probably
> needed.
> The problem with the "original" implementation is that FIFO tasks
> might suffer preemption from NORMAL tasks even when spare CPU cycles
> are available. In fact, the fair deadline server is enqueued right
> away when NORMAL tasks wake up and they are first scheduled by the
> server, thus potentially preempting a well-behaving FIFO task. This is
> of course not ideal. So, in patch 6/6 I propose to use some kind of
> starvation monitor/watchdog that delays enqueuing of deadline servers
> to the point when fair tasks might start to actually suffer from
> starvation (just randomly picked HZ/2 for now). One problem I already
> see with the current implementation is that it adds overhead to fair
> paths, so I'm pretty sure there are better ways to implement the idea
> (e.g., Daniel already suggested using a starvation monitor kthread
> sort of thing).
> 
> Receiving comments and suggestions is the sole purpose of this posting
> at this stage. Hopefully we can further discuss the idea at Plumbers
> in a few weeks. So, please don't focus too much on the actual
> implementation (which I plan to revise anyway after I'm back from PTO
> :), but try to see if this might actually fly. The feature seems to
> be very much needed.
> 
> Thanks!
> 
> Juri
> 
> 1 -
> https://lore.kernel.org/lkml/20190726145409.947503...@infradead.org/
> 
> Juri Lelli (1):
>   sched/fair: Implement starvation monitor
> 
> Peter Zijlstra (5):
>   sched: Unify runtime accounting across classes
>   sched/deadline: Collect sched_dl_entity initialization
>   sched/deadline: Move bandwidth accounting into {en,de}queue_dl_entity
>   sched/deadline: Introduce deadline servers
>   sched/fair: Add trivial fair server
> 
>  include/linux/sched.h    |  28 ++-
>  kernel/sched/core.c      |  23 +-
>  kernel/sched/deadline.c  | 483 ---
>  kernel/sched/fair.c      | 136 ++-
>  kernel/sched/rt.c        |  17 +-
>  kernel/sched/sched.h     |  50 +++-
>  kernel/sched/stop_task.c |  16 +-
>  7 files changed, 522 insertions(+), 231 deletions(-)
> 



Re: [RFC PATCH v2 0/6] SCHED_DEADLINE server infrastructure

2020-08-07 Thread peterz
On Fri, Aug 07, 2020 at 03:16:32PM +0200, luca abeni wrote:

> If I understand well, the patchset does not apply deadline servers to
> FIFO and RR tasks, right? How does this patchset interact with RT
> throttling?

ideally it will replace it ;-)

But of course, there's the whole cgroup trainwreck waiting there :/


Re: [RFC PATCH v2 0/6] SCHED_DEADLINE server infrastructure

2020-08-07 Thread Juri Lelli
On 07/08/20 15:41, luca abeni wrote:
> Hi Juri,
> 
> On Fri, 7 Aug 2020 15:30:41 +0200
> Juri Lelli wrote:
> [...]
> > > In the meanwhile, I have some questions/comments after a first quick
> > > look.
> > > 
> > > If I understand well, the patchset does not apply deadline servers
> > > to FIFO and RR tasks, right? How does this patchset interact with RT
> > > throttling?  
> > 
> > Well, it's something like the dual of it, in that RT Throttling
> > directly throttles RT tasks to make spare CPU cycles available to
> > fair tasks while this patchset introduces deadline servers to
> > schedule fair tasks, thus still reserving CPU time for those (when
> > needed).
> 
> Ah, OK... I was thinking about using deadline servers for both RT and
> non-RT tasks. And to use them not only to throttle, but also to provide
> some kind of performance guarantees (to all the scheduling classes).
> Think about what can be done when combining this mechanism with
> cgroups/containers :)
> 
> [...]
> > > I understand this is because you do not
> > > want to delay RT tasks if they are not starving other tasks. But
> > > then, maybe what you want is not deadline-based scheduling. Maybe a
> > > reservation-based scheduler based on fixed priorities is what you
> > > want? (with SCHED_DEADLINE, you could provide exact performance
> > > guarantees to SCHED_OTHER tasks, but I suspect patch 6/6 breaks
> > > these guarantees?)  
> > 
> > I agree that we are not interested in exact guarantees in this case,
> > but why not using something that it's already there and would give us
> > the functionality we need (fix starvation for fair)?
> 
> Ok, if performance guarantees to non-RT tasks are not the goal, then I
> agree. I was thinking that since the patchset provides a mechanism to
> schedule various classes of tasks through deadline servers, then
> using these servers to provide some kinds of guarantees could be
> interesting ;-)

I'm not saying that the wider-scope approach is not worth pursuing; you
know I would be very much interested in that as well :-), but I'd leave
it for a later time. This proposal looks reasonably achievable in a
fairly short time and it should provide provable benefits to production
today. Plus, you are again right: the foundations would already be there
when we are ready for the wider solution.



Re: [RFC PATCH v2 0/6] SCHED_DEADLINE server infrastructure

2020-08-07 Thread luca abeni
Hi Juri,

On Fri, 7 Aug 2020 15:30:41 +0200
Juri Lelli wrote:
[...]
> > In the meanwhile, I have some questions/comments after a first quick
> > look.
> > 
> > If I understand well, the patchset does not apply deadline servers
> > to FIFO and RR tasks, right? How does this patchset interact with RT
> > throttling?  
> 
> Well, it's something like the dual of it, in that RT Throttling
> directly throttles RT tasks to make spare CPU cycles available to
> fair tasks while this patchset introduces deadline servers to
> schedule fair tasks, thus still reserving CPU time for those (when
> needed).

Ah, OK... I was thinking about using deadline servers for both RT and
non-RT tasks. And to use them not only to throttle, but also to provide
some kind of performance guarantees (to all the scheduling classes).
Think about what can be done when combining this mechanism with
cgroups/containers :)

[...]
> > I understand this is because you do not
> > want to delay RT tasks if they are not starving other tasks. But
> > then, maybe what you want is not deadline-based scheduling. Maybe a
> > reservation-based scheduler based on fixed priorities is what you
> > want? (with SCHED_DEADLINE, you could provide exact performance
> > guarantees to SCHED_OTHER tasks, but I suspect patch 6/6 breaks
> > these guarantees?)  
> 
> I agree that we are not interested in exact guarantees in this case,
> but why not using something that it's already there and would give us
> the functionality we need (fix starvation for fair)?

Ok, if performance guarantees to non-RT tasks are not the goal, then I
agree. I was thinking that since the patchset provides a mechanism to
schedule various classes of tasks through deadline servers, then
using these servers to provide some kinds of guarantees could be
interesting ;-)



Thanks,
Luca

> It would also
> work well in presence of "real" deadline tasks I think, in that you
> could account for these fair servers while performing admission
> control.
> 
> Best,
> 
> Juri
> 



Re: [RFC PATCH v2 0/6] SCHED_DEADLINE server infrastructure

2020-08-07 Thread Juri Lelli
Hi Luca,

On 07/08/20 15:16, luca abeni wrote:
> Hi Juri,
> 
> thanks for sharing the v2 patchset!
> 
> In the next days I'll have a look at it, and try some tests...

Thanks!

> In the meanwhile, I have some questions/comments after a first quick
> look.
> 
> If I understand well, the patchset does not apply deadline servers to
> FIFO and RR tasks, right? How does this patchset interact with RT
> throttling?

Well, it's something like the dual of it, in that RT Throttling directly
throttles RT tasks to make spare CPU cycles available to fair tasks
while this patchset introduces deadline servers to schedule fair tasks,
thus still reserving CPU time for those (when needed).

> If I understand well, patch 6/6 does something like "use deadline
> servers for SCHED_OTHER only if FIFO/RR tasks risk to starve
> SCHED_OTHER tasks"... Right?

That's the basic idea, yes.

> I understand this is because you do not
> want to delay RT tasks if they are not starving other tasks. But then,
> maybe what you want is not deadline-based scheduling. Maybe a
> reservation-based scheduler based on fixed priorities is what you want?
> (with SCHED_DEADLINE, you could provide exact performance guarantees to
> SCHED_OTHER tasks, but I suspect patch 6/6 breaks these guarantees?)

I agree that we are not interested in exact guarantees in this case, but
why not using something that it's already there and would give us the
functionality we need (fix starvation for fair)? It would also work well
in presence of "real" deadline tasks I think, in that you could account
for these fair servers while performing admission control.
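
To illustrate the admission control point, a minimal sketch (user-space
Python, not kernel code) could look like the following. It assumes a
single CPU, a utilization cap of 95% (the default ratio of
sched_rt_runtime_us to sched_rt_period_us), and a hypothetical fair
server reserving 50ms every 1s; the function names are invented for
this example.

```python
from fractions import Fraction

# Assumed cap: default sched_rt_runtime_us / sched_rt_period_us.
DEFAULT_CAP = Fraction(950_000, 1_000_000)


def utilization(runtime_us, period_us):
    """Bandwidth of one reservation as an exact fraction."""
    return Fraction(runtime_us, period_us)


def admit(new_task, admitted, fair_server, cap=DEFAULT_CAP):
    """Return True if new_task still fits once the fair server's
    bandwidth is accounted for next to the already admitted tasks.

    Every reservation is a (runtime_us, period_us) tuple.
    """
    total = utilization(*fair_server)
    total += sum(utilization(*task) for task in admitted)
    total += utilization(*new_task)
    return total <= cap


fair_server = (50_000, 1_000_000)   # hypothetical: 5% for fair tasks
admitted = [(10_000, 100_000)]      # one existing deadline task: 10%

print(admit((80_000, 100_000), admitted, fair_server))  # 5% + 10% + 80%
print(admit((85_000, 100_000), admitted, fair_server))  # 5% + 10% + 85%
```

The first request fits exactly under the cap, the second does not: the
fair server's bandwidth is simply one more term in the sum.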

Best,

Juri



[RFC PATCH v2 0/6] SCHED_DEADLINE server infrastructure

2020-08-07 Thread Juri Lelli
Hi,

This is RFC v2 of Peter's SCHED_DEADLINE server infrastructure
implementation [1].

SCHED_DEADLINE servers can help fixing starvation issues of low priority
tasks (e.g., SCHED_OTHER) when higher priority tasks monopolize CPU
cycles. Today we have RT Throttling; DEADLINE servers should be able to
replace and improve that.

I rebased Peter's patches (adding changelogs where needed) on
tip/sched/core as of today and incorporated fixes to issues discussed
during RFC v1. Current set seems to even boot on real HW! :-)

While playing with the RFC v1 set (and discussing it further offline with
Daniel), it became clear that the behavior needs to change slightly.
Patch 6/6 is a (cumbersome?) attempt to show what's probably needed.
The problem with the "original" implementation is that FIFO tasks might
suffer preemption from NORMAL tasks even when spare CPU cycles are
available. In fact, the fair deadline server is enqueued right away when
NORMAL tasks wake up and they are first scheduled by the server, thus
potentially preempting a well-behaving FIFO task. This is of course not
ideal.
So, in patch 6/6 I propose to use some kind of starvation
monitor/watchdog that delays enqueuing of deadline servers to the point
when fair tasks might start to actually suffer from starvation (just
randomly picked HZ/2 for now). One problem I already see with the
current implementation is that it adds overhead to fair paths, so I'm
pretty sure there are better ways to implement the idea (e.g., Daniel
already suggested using a starvation monitor kthread sort of thing).
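
The delayed enqueue idea can be sketched roughly as follows. This is a
simplified user-space model in Python, not the code from patch 6/6: the
class and method names are hypothetical, HZ = 250 is an assumed tick
rate, and time is counted in jiffies.

```python
HZ = 250                         # assumed tick rate (jiffies per second)
STARVATION_THRESHOLD = HZ // 2   # the HZ/2 value mentioned above


class StarvationMonitor:
    """Enqueue the fair server only after fair tasks have waited too long."""

    def __init__(self, threshold=STARVATION_THRESHOLD):
        self.threshold = threshold
        self.waiting_since = None     # jiffy at which fair tasks started waiting
        self.server_enqueued = False

    def fair_runnable(self, now):
        # Fair tasks woke up: start timing, but do not enqueue the server yet.
        if self.waiting_since is None:
            self.waiting_since = now

    def fair_ran(self):
        # Fair tasks got CPU time on their own: no starvation, reset.
        # (Simplification: the server is also dropped at this point.)
        self.waiting_since = None
        self.server_enqueued = False

    def tick(self, now):
        # Periodic check: enqueue the server once the wait reaches the threshold.
        if (self.waiting_since is not None
                and not self.server_enqueued
                and now - self.waiting_since >= self.threshold):
            self.server_enqueued = True
        return self.server_enqueued


monitor = StarvationMonitor()
monitor.fair_runnable(now=1000)
early = monitor.tick(now=1100)   # waited 100 jiffies, below HZ/2
late = monitor.tick(now=1125)    # waited 125 jiffies, server gets enqueued
```

A well-behaving FIFO task that leaves spare cycles lets fair tasks run
before the threshold is reached, so the server is never enqueued and the
FIFO task is never preempted by it.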

Receiving comments and suggestions is the sole purpose of this posting
at this stage. Hopefully we can further discuss the idea at Plumbers in
a few weeks. So, please don't focus too much on the actual implementation
(which I plan to revise anyway after I'm back from PTO :), but try to
see if this might actually fly. The feature seems to be very much needed.

Thanks!

Juri

1 - https://lore.kernel.org/lkml/20190726145409.947503...@infradead.org/

Juri Lelli (1):
  sched/fair: Implement starvation monitor

Peter Zijlstra (5):
  sched: Unify runtime accounting across classes
  sched/deadline: Collect sched_dl_entity initialization
  sched/deadline: Move bandwidth accounting into {en,de}queue_dl_entity
  sched/deadline: Introduce deadline servers
  sched/fair: Add trivial fair server

 include/linux/sched.h    |  28 ++-
 kernel/sched/core.c      |  23 +-
 kernel/sched/deadline.c  | 483 ---
 kernel/sched/fair.c      | 136 ++-
 kernel/sched/rt.c        |  17 +-
 kernel/sched/sched.h     |  50 +++-
 kernel/sched/stop_task.c |  16 +-
 7 files changed, 522 insertions(+), 231 deletions(-)

-- 
2.26.2