Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-10 Thread Daniel Bristot de Oliveira
On 10/10/18 2:24 PM, Peter Zijlstra wrote:
>> I believe there were some papers circulated last year that looked at
>> something similar to this when you had overlapping or completely disjoint
>> CPUsets. I think it would be nice to drag them into the discussion. Has this
>> been considered? (If so, sorry for adding line-noise!)
> Hurm, was that one of Bjorn's papers? Didn't that deal with AC of
> disjoint/overlapping sets?
> 

This paper:
https://people.mpi-sws.org/~bbb/papers/pdf/rtsj14.pdf

But, unless I am wrong, there were later findings that showed some
imprecision in this paper.

Anyway, it does not analyse locking properties, only the scheduling of
independent tasks - it is a start, but far from what we do here.

(btw this paper is really complex...)

The locking problem for such a case - APA with the nesting of different
locks in the locking implementation (we use a raw spinlock for this, and
the method could also be used for rw locks/semaphores in the future,
nesting rw_lock(mutex_proxy(raw_spinlock()))) - is an open problem from
the academic point of view.

I explained these things (nested locking and the need of APA for locking)
as "Open Problems" at RTSOPS (part of ECRTS) earlier this year:

http://rtsops2018.loria.fr/wp-content/uploads/2018/07/RTSOPS18_proceedings_final.pdf

Bjorn was there, and not only him... Baruah, Davis, and Alan Burns were there too.

There is some work being done on more complex locking in academia, for example:

https://www.cs.unc.edu/~jarretc/papers/ecrts18b_long.pdf

But still, the task model used in these analyses is not the
Linux one.

-- Daniel


Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-10 Thread Juri Lelli
On 10/10/18 13:56, Henrik Austad wrote:
> On Tue, Oct 09, 2018 at 11:24:26AM +0200, Juri Lelli wrote:
> > Hi all,
> 
> Hi, nice series, I have a lot of details to grok, but I like the idea of PE
> 
> > Proxy Execution (also goes under several other names) isn't a new
> > concept, it has been mentioned already in the past to this community
> > (both in email discussions and at conferences [1, 2]), but no actual
> > implementation that applies to a fairly recent kernel exists as of today
> > (of which I'm aware, at least - happy to be proven wrong).
> > 
> > Very broadly speaking, more info below, proxy execution enables a task
> > to run using the context of some other task that is "willing" to
> > participate in the mechanism, as this helps both tasks to improve
> > performance (w.r.t. the latter task not participating to proxy
> > execution).
> 
> From what I remember, PEP was originally proposed for a global EDF, and as 
> far as my head has been able to read this series, this implementation is 
> planned for not only deadline, but eventually also for sched_(rr|fifo|other) 
> - is that correct?

Correct, this is cross class.

> I have a bit of concern when it comes to affinities and where the 
> lock owner will actually execute while in the context of the proxy, 
> especially when you run into the situation where you have disjoint CPU 
> affinities for _rr tasks to ensure the deadlines.

Well, it's the (scheduling) context of the proxy that is potentially
moved around. The lock owner stays inside its affinity.

> I believe there were some papers circulated last year that looked at 
> something similar to this when you had overlapping or completely disjoint 
> CPUsets. I think it would be nice to drag them into the discussion. Has this 
> been considered? (If so, sorry for adding line-noise!)

I think you refer to BBB's work. Not sure it applies here, though
(considering the above).

> Let me know if my attempt at translating brainlanguage into semi-coherent 
> english failed and I'll do another attempt

You succeeded! (that's assuming that I got your questions right of
course :)
> 
> > This RFD/proof of concept aims at starting a discussion about how we can
> > get proxy execution in mainline. But, first things first, why do we even
> > care about it?
> > 
> > I'm pretty confident with saying that the line of development that is
> > mainly interested in this at the moment is the one that might benefit
> > in allowing non privileged processes to use deadline scheduling [3].
> > The main missing bit before we can safely relax the root privileges
> > constraint is a proper priority inheritance mechanism, which translates
> > to bandwidth inheritance [4, 5] for deadline scheduling, or to some sort
> > of interpretation of the concept of running a task holding a (rt_)mutex
> > within the bandwidth allotment of some other task that is blocked on the
> > same (rt_)mutex.
> > 
> > The concept itself is pretty general however, and it is not hard to
> > foresee possible applications in other scenarios (say for example nice
> > values/shares across co-operating CFS tasks or clamping values [6]).
> > But I'm already digressing, so let's get back to the code that comes
> > with this cover letter.
> > 
> > One can define the scheduling context of a task as all the information
> > in task_struct that the scheduler needs to implement a policy and the
> > execution context as all the state required to actually "run" the task.
> > An example of scheduling context might be the information contained in
> > task_struct se, rt and dl fields; affinity pertains instead to execution
> > context (and I guess deciding what pertains to what is actually up for
> > discussion as well ;-). Patch 04/08 implements such distinction.
> 
> I really like the idea of splitting scheduling ctx and execution context!
> 
> > As implemented in this set, a link between scheduling contexts of
> > different tasks might be established when a task blocks on a mutex held
> > by some other task (blocked_on relation). In this case the former task
> > starts to be considered a potential proxy for the latter (mutex owner).
> > One key change in how mutexes work made in here is that waiters don't
> > really sleep: they are not dequeued, so they can be picked up by the
> > scheduler when it runs.  If a waiter (potential proxy) task is selected
> > by the scheduler, the blocked_on relation is used to find the mutex
> > owner and put that to run on the CPU, using the proxy task scheduling
> > context.
> > 
> >Follow the blocked-on relation:
> >   
> >   ,-> task   <- proxy, picked by scheduler
> >   | | blocked-on
> >   | v
> >  blocked-task |   mutex
> >   | | owner
> >   | v
> >   `-- task   <- gets to run using proxy info
> > 
> > Now, the situation is (of course) more tricky than depicted so far
> > because we have to deal 

Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-10 Thread Juri Lelli
On 10/10/18 13:23, Peter Zijlstra wrote:
> On Wed, Oct 10, 2018 at 01:16:29PM +0200, luca abeni wrote:
> > On Wed, 10 Oct 2018 12:57:10 +0200
> > Peter Zijlstra  wrote:
> > 
> > > On Wed, Oct 10, 2018 at 12:34:17PM +0200, luca abeni wrote:
> > > > So, I would propose to make the proxy() function of the patch more
> > > > generic, and not strictly bound to mutexes. Maybe a task structure
> > > > can contain a list of tasks for which the task can act as a proxy,
> > > > and we can have a function like "I want to act as a proxy for task
> > > > T" to be invoked when a task blocks?  
> > > 
> > > Certainly possible, but that's something I'd prefer to look at after
> > > it all 'works'.
> > 
> > Of course :)
> > I was mentioning this idea because maybe it can have some impact on the
> > design.
> > 
> > BTW, here is another "interesting" issue I had in the past with changes
> > like this one: how do we check if the patchset works as expected?
> > 
> > "No crashes" is surely a requirement, but I think we also need some
> > kind of testcase that fails if the inheritance mechanism is not working
> > properly, and is successful if the inheritance works.
> > 
> > Maybe we can develop some testcase based on rt-app (if no one has such a
> > testcase already)
> 
> Indeed; IIRC there is a test suite that mostly covers the FIFO-PI stuff,
> that should obviously still pass. Steve, do you know where that lives?
> 
> For the extended DL stuff, we'd need new tests.

This one, right?

https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git/tree/src/pi_tests/pi_stress.c?h=stable/v1.0

It looks like it supports DEADLINE as well.. although I'll have to check
again what it does for the DEADLINE case.
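A testcase along the lines luca suggests could be expressed in rt-app's JSON grammar, roughly like the sketch below. This is illustrative only: the exact key spellings should be checked against the rt-app documentation for the version in use, and the scenario itself (a SCHED_OTHER holder and a SCHED_DEADLINE waiter contending on one mutex, so inheritance determines whether the waiter meets its deadline) is hypothetical.

```json
{
  "global": { "duration": 10, "default_policy": "SCHED_OTHER" },
  "resources": { "mtx": { "type": "mutex" } },
  "tasks": {
    "holder": {
      "policy": "SCHED_OTHER",
      "phases": { "p0": { "lock": "mtx", "run": 5000, "unlock": "mtx" } }
    },
    "dl_waiter": {
      "policy": "SCHED_DEADLINE",
      "dl-runtime": 2000, "dl-period": 10000, "dl-deadline": 10000,
      "phases": { "p0": { "lock": "mtx", "run": 1000, "unlock": "mtx" } }
    }
  }
}
```

Checking dl_waiter's wakeup latencies in the rt-app log would then show whether the holder was boosted (or proxied) while holding mtx.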


Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-10 Thread Peter Zijlstra
On Wed, Oct 10, 2018 at 01:56:39PM +0200, Henrik Austad wrote:
> On Tue, Oct 09, 2018 at 11:24:26AM +0200, Juri Lelli wrote:
> > Hi all,
> 
> Hi, nice series, I have a lot of details to grok, but I like the idea of PE
> 
> > Proxy Execution (also goes under several other names) isn't a new
> > concept, it has been mentioned already in the past to this community
> > (both in email discussions and at conferences [1, 2]), but no actual
> > implementation that applies to a fairly recent kernel exists as of today
> > (of which I'm aware, at least - happy to be proven wrong).
> > 
> > Very broadly speaking, more info below, proxy execution enables a task
> > to run using the context of some other task that is "willing" to
> > participate in the mechanism, as this helps both tasks to improve
> > performance (w.r.t. the latter task not participating to proxy
> > execution).
> 
> From what I remember, PEP was originally proposed for a global EDF, and as 
> far as my head has been able to read this series, this implementation is 
> planned for not only deadline, but eventually also for sched_(rr|fifo|other) 
> - is that correct?

This implementation covers every scheduling class unconditionally. It
directly uses the scheduling function to order things; where PI
re-implements the FIFO scheduling function to order the blocked lists.

> I have a bit of concern when it comes to affinities and where the 
> lock owner will actually execute while in the context of the proxy, 
> especially when you run into the situation where you have disjoint CPU 
> affinities for _rr tasks to ensure the deadlines.

The affinities of execution contexts are respected.

> I believe there were some papers circulated last year that looked at 
> something similar to this when you had overlapping or completely disjoint 
> CPUsets. I think it would be nice to drag them into the discussion. Has this 
> been considered? (If so, sorry for adding line-noise!)

Hurm, was that one of Bjorn's papers? Didn't that deal with AC of
disjoint/overlapping sets?

> > One can define the scheduling context of a task as all the information
> > in task_struct that the scheduler needs to implement a policy and the
> > execution context as all the state required to actually "run" the task.
> > An example of scheduling context might be the information contained in
> > task_struct se, rt and dl fields; affinity pertains instead to execution
> > context (and I guess deciding what pertains to what is actually up for
> > discussion as well ;-). Patch 04/08 implements such distinction.
> 
> I really like the idea of splitting scheduling ctx and execution context!

Right; so this whole thing relies on 'violating' affinities for
scheduling contexts, but respects affinities for execution contexts.

The basic observation is that affinities only matter when you execute
code.

This then also gives a fairly clear definition of what an execution
context is.
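The split Peter describes can be illustrated with a toy model (hypothetical types and names, not the patch's actual layout): ordering decisions read only the scheduling context, while placement consults only the execution context's affinity mask, since affinity matters only when code actually executes.

```c
#include <stdint.h>

/* Illustrative split: what the policy needs vs. what running needs. */
struct sched_ctx {
	uint64_t deadline;	/* policy information: who goes first */
};

struct exec_ctx {
	uint64_t cpu_mask;	/* execution information: where it may run */
};

/* Placement check: consults only the execution context. */
static int can_run_on(const struct exec_ctx *ec, int cpu)
{
	return (int)((ec->cpu_mask >> cpu) & 1);
}

/* Ordering check: consults only the scheduling context (EDF-style). */
static int runs_before(const struct sched_ctx *a, const struct sched_ctx *b)
{
	return a->deadline < b->deadline;
}
```

In proxy execution terms, a waiter's sched_ctx may "migrate" freely to wherever its owner's exec_ctx is allowed to run, which is what "violating affinities for scheduling contexts" amounts to.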


Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-10 Thread Henrik Austad
On Tue, Oct 09, 2018 at 11:24:26AM +0200, Juri Lelli wrote:
> Hi all,

Hi, nice series, I have a lot of details to grok, but I like the idea of PE

> Proxy Execution (also goes under several other names) isn't a new
> concept, it has been mentioned already in the past to this community
> (both in email discussions and at conferences [1, 2]), but no actual
> implementation that applies to a fairly recent kernel exists as of today
> (of which I'm aware, at least - happy to be proven wrong).
> 
> Very broadly speaking, more info below, proxy execution enables a task
> to run using the context of some other task that is "willing" to
> participate in the mechanism, as this helps both tasks to improve
> performance (w.r.t. the latter task not participating to proxy
> execution).

From what I remember, PEP was originally proposed for a global EDF, and as 
far as my head has been able to read this series, this implementation is 
planned for not only deadline, but eventually also for sched_(rr|fifo|other) 
- is that correct?

I have a bit of concern when it comes to affinities and where the 
lock owner will actually execute while in the context of the proxy, 
especially when you run into the situation where you have disjoint CPU 
affinities for _rr tasks to ensure the deadlines.

I believe there were some papers circulated last year that looked at 
something similar to this when you had overlapping or completely disjoint 
CPUsets. I think it would be nice to drag them into the discussion. Has this 
been considered? (If so, sorry for adding line-noise!)

Let me know if my attempt at translating brainlanguage into semi-coherent 
english failed and I'll do another attempt

> This RFD/proof of concept aims at starting a discussion about how we can
> get proxy execution in mainline. But, first things first, why do we even
> care about it?
> 
> I'm pretty confident with saying that the line of development that is
> mainly interested in this at the moment is the one that might benefit
> in allowing non privileged processes to use deadline scheduling [3].
> The main missing bit before we can safely relax the root privileges
> constraint is a proper priority inheritance mechanism, which translates
> to bandwidth inheritance [4, 5] for deadline scheduling, or to some sort
> of interpretation of the concept of running a task holding a (rt_)mutex
> within the bandwidth allotment of some other task that is blocked on the
> same (rt_)mutex.
> 
> The concept itself is pretty general however, and it is not hard to
> foresee possible applications in other scenarios (say for example nice
> values/shares across co-operating CFS tasks or clamping values [6]).
> But I'm already digressing, so let's get back to the code that comes
> with this cover letter.
> 
> One can define the scheduling context of a task as all the information
> in task_struct that the scheduler needs to implement a policy and the
> execution context as all the state required to actually "run" the task.
> An example of scheduling context might be the information contained in
> task_struct se, rt and dl fields; affinity pertains instead to execution
> context (and I guess deciding what pertains to what is actually up for
> discussion as well ;-). Patch 04/08 implements such distinction.

I really like the idea of splitting scheduling ctx and execution context!

> As implemented in this set, a link between scheduling contexts of
> different tasks might be established when a task blocks on a mutex held
> by some other task (blocked_on relation). In this case the former task
> starts to be considered a potential proxy for the latter (mutex owner).
> One key change in how mutexes work made in here is that waiters don't
> really sleep: they are not dequeued, so they can be picked up by the
> scheduler when it runs.  If a waiter (potential proxy) task is selected
> by the scheduler, the blocked_on relation is used to find the mutex
> owner and put that to run on the CPU, using the proxy task scheduling
> context.
> 
>Follow the blocked-on relation:
>   
>   ,-> task   <- proxy, picked by scheduler
>   | | blocked-on
>   | v
>  blocked-task |   mutex
>   | | owner
>   | v
>   `-- task   <- gets to run using proxy info
> 
> Now, the situation is (of course) more tricky than depicted so far
> because we have to deal with all sort of possible states the mutex
> owner might be in while a potential proxy is selected by the scheduler,
> e.g. owner might be sleeping, running on a different CPU, blocked on
> another mutex itself... so, I'd kindly refer people to have a look at
> 05/08 proxy() implementation and comments.

My head hurt already.. :)

> Peter kindly shared his WIP patches with us (me, Luca, Tommaso, Claudio,
> Daniel, the Pisa gang) a while ago, but I could seriously have a decent
> look at them only recently (thanks a lot to the 

Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-10 Thread Henrik Austad
On Tue, Oct 09, 2018 at 11:24:26AM +0200, Juri Lelli wrote:
> Hi all,

Hi, nice series, I have a lot of details to grok, but I like the idea of PE

> Proxy Execution (also goes under several other names) isn't a new
> concept, it has been mentioned already in the past to this community
> (both in email discussions and at conferences [1, 2]), but no actual
> implementation that applies to a fairly recent kernel exists as of today
> (of which I'm aware of at least - happy to be proven wrong).
> 
> Very broadly speaking, more info below, proxy execution enables a task
> to run using the context of some other task that is "willing" to
> participate in the mechanism, as this helps both tasks to improve
> performance (w.r.t. the latter task not participating to proxy
> execution).

From what I remember, PEP was originally proposed for a global EDF, and as 
far as my head has been able to read this series, this implementation is 
planned for not only deadline, but eventuall also for sched_(rr|fifo|other) 
- is that correct?

I have a bit of concern when it comes to affinities and and where the 
lock owner will actually execute while in the context of the proxy, 
especially when you run into the situation where you have disjoint CPU 
affinities for _rr tasks to ensure the deadlines.

I believe there were some papers circulated last year that looked at 
something similar to this when you had overlapping or completely disjoint 
CPUsets I think it would be nice to drag into the discussion. Has this been 
considered? (if so, sorry for adding line-noise!)

Let me know if my attempt at translating brainlanguage into semi-coherent 
english failed and I'll do another attempt

> This RFD/proof of concept aims at starting a discussion about how we can
> get proxy execution in mainline. But, first things first, why do we even
> care about it?
> 
> I'm pretty confident with saying that the line of development that is
> mainly interested in this at the moment is the one that might benefit
> in allowing non privileged processes to use deadline scheduling [3].
> The main missing bit before we can safely relax the root privileges
> constraint is a proper priority inheritance mechanism, which translates
> to bandwidth inheritance [4, 5] for deadline scheduling, or to some sort
> of interpretation of the concept of running a task holding a (rt_)mutex
> within the bandwidth allotment of some other task that is blocked on the
> same (rt_)mutex.
> 
> The concept itself is pretty general however, and it is not hard to
> foresee possible applications in other scenarios (say for example nice
> values/shares across co-operating CFS tasks or clamping values [6]).
> But I'm already digressing, so let's get back to the code that comes
> with this cover letter.
> 
> One can define the scheduling context of a task as all the information
> in task_struct that the scheduler needs to implement a policy and the
> execution contex as all the state required to actually "run" the task.
> An example of scheduling context might be the information contained in
> task_struct se, rt and dl fields; affinity pertains instead to execution
> context (and I guess decideing what pertains to what is actually up for
> discussion as well ;-). Patch 04/08 implements such distinction.

I really like the idea of splitting scheduling ctx and execution context!

> As implemented in this set, a link between scheduling contexts of
> different tasks might be established when a task blocks on a mutex held
> by some other task (blocked_on relation). In this case the former task
> starts to be considered a potential proxy for the latter (mutex owner).
> One key change in how mutexes work made in here is that waiters don't
> really sleep: they are not dequeued, so they can be picked up by the
> scheduler when it runs.  If a waiter (potential proxy) task is selected
> by the scheduler, the blocked_on relation is used to find the mutex
> owner and put that to run on the CPU, using the proxy task scheduling
> context.
> 
>  Follow the blocked-on relation:
> 
>                 ,-> task          <- proxy, picked by scheduler
>                 |     | blocked-on
>                 |     v
>    blocked-task |   mutex
>                 |     | owner
>                 |     v
>                 `-> task          <- gets to run using proxy info
> 
> Now, the situation is (of course) more tricky than depicted so far
> because we have to deal with all sort of possible states the mutex
> owner might be in while a potential proxy is selected by the scheduler,
> e.g. owner might be sleeping, running on a different CPU, blocked on
> another mutex itself... so, I'd kindly refer people to have a look at
> 05/08 proxy() implementation and comments.

My head hurt already.. :)

> Peter kindly shared his WIP patches with us (me, Luca, Tommaso, Claudio,
> Daniel, the Pisa gang) a while ago, but I could seriously have a decent
> look at them only recently (thanks a lot to the 

Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-10 Thread luca abeni
Hi all,

On Tue,  9 Oct 2018 11:24:26 +0200
Juri Lelli  wrote:

> Hi all,
> 
> Proxy Execution (also goes under several other names) isn't a new
> concept, it has been mentioned already in the past to this community
> (both in email discussions and at conferences [1, 2]), but no actual
> implementation that applies to a fairly recent kernel exists as of
> today (of which I'm aware of at least - happy to be proven wrong).
> 
> Very broadly speaking, more info below, proxy execution enables a task
> to run using the context of some other task that is "willing" to
> participate in the mechanism, as this helps both tasks to improve
> performance (w.r.t. the latter task not participating in proxy
> execution).

First of all, thanks to Juri for working on this!

I am looking at the patchset, and I have some questions / comments
(maybe some of my questions are really stupid, I do not know... :)


To begin, I have a very high-level comment about proxy execution: I
believe proxy execution is a very powerful concept, that can be used in
many cases, not only to do inheritance on mutual exclusion (think, for
example, about pipelines of tasks: a task implementing a stage of the
pipeline can act as a proxy for the task implementing the previous
stage).

So, I would propose to make the proxy() function of patch more generic,
and not strictly bound to mutexes. Maybe a task structure can contain a
list of tasks for which the task can act as a proxy, and we can have a
function like "I want to act as a proxy for task T" to be invoked when
a task blocks?



Luca


Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-10 Thread Peter Zijlstra
On Wed, Oct 10, 2018 at 01:16:29PM +0200, luca abeni wrote:
> On Wed, 10 Oct 2018 12:57:10 +0200
> Peter Zijlstra  wrote:
> 
> > On Wed, Oct 10, 2018 at 12:34:17PM +0200, luca abeni wrote:
> > > So, I would propose to make the proxy() function of patch more
> > > generic, and not strictly bound to mutexes. Maybe a task structure
> > > can contain a list of tasks for which the task can act as a proxy,
> > > and we can have a function like "I want to act as a proxy for task
> > > T" to be invoked when a task blocks?  
> > 
> > Certainly possible, but that's something I'd prefer to look at after
> > it all 'works'.
> 
> Of course :)
> I was mentioning this idea because maybe it can have some impact on the
> design.
> 
> BTW, here is another "interesting" issue I had in the past with changes
> like this one: how do we check if the patchset works as expected?
> 
> "No crashes" is surely a requirement, but I think we also need some
> kind of testcase that fails if the inheritance mechanism is not working
> properly, and is successful if the inheritance works.
> 
> Maybe we can develop some testcase based on rt-app (if no one has such a
> testcase already)

Indeed; IIRC there is a test suite that mostly covers the FIFO-PI stuff,
that should obviously still pass. Steve, do you know where that lives?

For the extended DL stuff, we'd need new tests.


Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-10 Thread luca abeni
On Wed, 10 Oct 2018 12:57:10 +0200
Peter Zijlstra  wrote:

> On Wed, Oct 10, 2018 at 12:34:17PM +0200, luca abeni wrote:
> > So, I would propose to make the proxy() function of patch more
> > generic, and not strictly bound to mutexes. Maybe a task structure
> > can contain a list of tasks for which the task can act as a proxy,
> > and we can have a function like "I want to act as a proxy for task
> > T" to be invoked when a task blocks?  
> 
> Certainly possible, but that's something I'd prefer to look at after
> it all 'works'.

Of course :)
I was mentioning this idea because maybe it can have some impact on the
design.

BTW, here is another "interesting" issue I had in the past with changes
like this one: how do we check if the patchset works as expected?

"No crashes" is surely a requirement, but I think we also need some
kind of testcase that fails if the inheritance mechanism is not working
properly, and is successful if the inheritance works.

Maybe we can develop some testcase based on rt-app (if no one has such a
testcase already)


Thanks,
Luca
> The mutex blocking thing doesn't require external
> interfaces etc..



Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-10 Thread Peter Zijlstra
On Wed, Oct 10, 2018 at 12:34:17PM +0200, luca abeni wrote:
> So, I would propose to make the proxy() function of patch more generic,
> and not strictly bound to mutexes. Maybe a task structure can contain a
> list of tasks for which the task can act as a proxy, and we can have a
> function like "I want to act as a proxy for task T" to be invoked when
> a task blocks?

Certainly possible, but that's something I'd prefer to look at after it
all 'works'. The mutex blocking thing doesn't require external
interfaces etc..


Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-09 Thread Juri Lelli
On 09/10/18 13:56, Daniel Bristot de Oliveira wrote:
> On 10/9/18 12:51 PM, Sebastian Andrzej Siewior wrote:
> >> The main concerns I have with the current approach is that, being based
> >> on mutex.c, it's both
> >>
> >>  - not linked with futexes
> >>  - not involving "legacy" priority inheritance (rt_mutex.c)
> >>
> >> I believe one of the main reasons Peter started this on mutexes is to
> >> have better coverage of potential problems (which I can assure everybody
> >> it had). I'm not yet sure what should we do moving forward, and this is
> >> exactly what I'd be pleased to hear your opinions on.
> > wasn't the idea to get rid of rt_mutex once it works?

Looks like it was (see Peter's reply).

> As far as I know, it is. But there is some additional complexity
> involving a -rt version of this patch, for instance:
> 
> What should the protocol do if the migrating thread has migration
> disabled?
> 
> The side effects of, for instance, ignoring the migrate_disable() would
> add noise for the initial implementation... too much complexity at once.
> 
> IMHO, once it works in the non-rt, it will be easier to do the changes
> needed to integrate it with -rt.
> 
> Thoughts?

Makes sense to me. Maybe we should still keep eventual integration in
mind, so that we don't take decisions we would regret.


Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-09 Thread Daniel Bristot de Oliveira
On 10/9/18 12:51 PM, Sebastian Andrzej Siewior wrote:
>> The main concerns I have with the current approach is that, being based
>> on mutex.c, it's both
>>
>>  - not linked with futexes
>>  - not involving "legacy" priority inheritance (rt_mutex.c)
>>
>> I believe one of the main reasons Peter started this on mutexes is to
>> have better coverage of potential problems (which I can assure everybody
>> it had). I'm not yet sure what should we do moving forward, and this is
>> exactly what I'd be pleased to hear your opinions on.
> wasn't the idea to get rid of rt_mutex once it works?

As far as I know, it is. But there is some additional complexity
involving a -rt version of this patch, for instance:

What should the protocol do if the migrating thread has migration
disabled?

The side effects of, for instance, ignoring the migrate_disable() would
add noise for the initial implementation... too much complexity at once.

IMHO, once it works in the non-rt, it will be easier to do the changes
needed to integrate it with -rt.

Thoughts?

-- Daniel



Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-09 Thread Sebastian Andrzej Siewior
On 2018-10-09 11:24:26 [+0200], Juri Lelli wrote:
> The main concerns I have with the current approach is that, being based
> on mutex.c, it's both
> 
>  - not linked with futexes
>  - not involving "legacy" priority inheritance (rt_mutex.c)
> 
> I believe one of the main reasons Peter started this on mutexes is to
> have better coverage of potential problems (which I can assure everybody
> it had). I'm not yet sure what should we do moving forward, and this is
> exactly what I'd be pleased to hear your opinions on.

wasn't the idea to get rid of rt_mutex once it works?

Sebastian


Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-09 Thread Juri Lelli
On 09/10/18 11:44, Peter Zijlstra wrote:
> On Tue, Oct 09, 2018 at 11:24:26AM +0200, Juri Lelli wrote:
> > The main concerns I have with the current approach is that, being based
> > on mutex.c, it's both
> > 
> >  - not linked with futexes
> >  - not involving "legacy" priority inheritance (rt_mutex.c)
> > 
> > I believe one of the main reasons Peter started this on mutexes is to
> > have better coverage of potential problems (which I can assure everybody
> > it had). I'm not yet sure what should we do moving forward, and this is
> > exactly what I'd be pleased to hear your opinions on.
> 
> Well that, and mutex was 'simple', I didn't have to go rip out all the
> legacy PI crud.

Indeed.

> If this all ends up working well, the solution is 'simple' and we can
> simply copy mutex to rt_mutex or something along those lines if we want
> to keep the distinction between them. Alternatively we simply delete
> rt_mutex.

Ah.. right.. sounds *scary*, but I guess it might be an option after
all.

> Thanks for reviving this.. it's been an 'interesting' year and a half
> since I wrote all this and I've really not had time to work on it.

n/p, it has been a long standing thing to look at for me as well. Thanks
again for sharing your patches!


Re: [RFD/RFC PATCH 0/8] Towards implementing proxy execution

2018-10-09 Thread Peter Zijlstra
On Tue, Oct 09, 2018 at 11:24:26AM +0200, Juri Lelli wrote:
> The main concerns I have with the current approach is that, being based
> on mutex.c, it's both
> 
>  - not linked with futexes
>  - not involving "legacy" priority inheritance (rt_mutex.c)
> 
> I believe one of the main reasons Peter started this on mutexes is to
> have better coverage of potential problems (which I can assure everybody
> it had). I'm not yet sure what should we do moving forward, and this is
> exactly what I'd be pleased to hear your opinions on.

Well that, and mutex was 'simple', I didn't have to go rip out all the
legacy PI crud.

If this all ends up working well, the solution is 'simple' and we can
simply copy mutex to rt_mutex or something along those lines if we want
to keep the distinction between them. Alternatively we simply delete
rt_mutex.

Thanks for reviving this.. it's been an 'interesting' year and a half
since I wrote all this and I've really not had time to work on it.

