Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-02-02 Thread Luca Abeni

Hi Peter,

On 01/31/2015 10:56 AM, Peter Zijlstra wrote:

On Fri, Jan 30, 2015 at 10:35:02AM +0000, Juri Lelli wrote:

So, we do the safe thing only in case of throttling.


No, even for the !throttle aka running tasks. We only use
dl_{runtime,deadline,period} for replenishment, until that time we
observe the old runtime/deadline set by the previous replenishment.


I guess it is more than
ok for now, while we hopefully find some spare cycle to implement a
complete solution :/.


Yeah, I bet the fun part is computing the 0-lag across the entire root
domain, per-cpu 0-lag isn't correct afaict.

Uhm... This is an interesting problem.

I _suspect_ the 0-lag time does not depend on the number of CPUs. In other
words: I _suspect_ that when you kill a SCHED_DEADLINE task its bandwidth
should be released at time
t0 = scheduling_deadline - current_budget / maximum_budget * period
and this time is not affected by the fact that the task is scheduled by
global EDF or by EDF on a single core.
But I have no proof of this (and I have changed my mind on this multiple
times :).


On a side note: as far as I can see, releasing the bandwidth at the end
of the current reservation period (that is, when the time is equal to the
current scheduling deadline) should be safe.
Basically, by doing this we assume that the task already consumed all of
its budget for the current reservation period.
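In symbols (notation mine: $q$ is the remaining budget, $Q$ the maximum budget, $P$ the reservation period and $d$ the current scheduling deadline), the side note holds because the 0-lag time from the formula in the previous message can never be later than $d$:

```latex
t_0 \;=\; d - \frac{q}{Q}\,P \;\le\; d ,
\qquad \text{since } 0 \le q \le Q \text{ and } P > 0
\;\Rightarrow\; \frac{q}{Q}\,P \ge 0 .
```

so waiting until $d$ always waits at least as long as waiting until the 0-lag point: pessimistic, but safe.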



Luca
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-02-02 Thread Juri Lelli
On 31/01/2015 09:56, Peter Zijlstra wrote:
> On Fri, Jan 30, 2015 at 10:35:02AM +0000, Juri Lelli wrote:
>> So, we do the safe thing only in case of throttling.
> 
> No, even for the !throttle aka running tasks. We only use
> dl_{runtime,deadline,period} for replenishment, until that time we
> observe the old runtime/deadline set by the previous replenishment.
> 

Oh, right. We set dl_new in __dl_clear_params(), nice.

Thanks,

- Juri

>> I guess it is more than
>> ok for now, while we hopefully find some spare cycle to implement a
>> complete solution :/.
> 
> Yeah, I bet the fun part is computing the 0-lag across the entire root
> domain, per-cpu 0-lag isn't correct afaict.
> 



Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-31 Thread Peter Zijlstra
On Fri, Jan 30, 2015 at 10:35:02AM +0000, Juri Lelli wrote:
> So, we do the safe thing only in case of throttling.

No, even for the !throttle aka running tasks. We only use
dl_{runtime,deadline,period} for replenishment, until that time we
observe the old runtime/deadline set by the previous replenishment.

> I guess it is more than
> ok for now, while we hopefully find some spare cycle to implement a
> complete solution :/.

Yeah, I bet the fun part is computing the 0-lag across the entire root
domain, per-cpu 0-lag isn't correct afaict.


Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-30 Thread Juri Lelli
Hi Peter,

On 28/01/2015 14:08, Peter Zijlstra wrote:
> On Thu, Jan 15, 2015 at 02:35:46PM +0100, Luca Abeni wrote:
> 
>>> From what I understand we should either modify the task's run/sleep stats
>>> when we change its parameters or we should schedule a delayed release of
>>> the bandwidth delta (when it reaches its 0-lag point, if that's in the
>>> future).
> 
>> I suspect the correct behaviour can be difficult to implement:
>> - When a SCHED_DEADLINE task ends (or changes its scheduling policy to
>>   something different), its bandwidth cannot be released immediately,
>>   but should be released at the "0-lag time" (which reminds me about the
>>   GRUB patches... I had to implement a similar behaviour in those patches :)
>> - The same applies when the task changes its scheduling parameters decreasing
>>   its bandwidth. In this case, we also need to update the current runtime (if
>>   it is larger than the new runtime, set it to the new maximum value - I 
>> think
>>   this is the safest thing to do)
>> - When a task changes its parameters to increase its bandwidth, we do not
>>   have such problems.
>>
>> As far as I can see, if we apply the runtime / deadline changes starting from
>> the next reservation period we are safe (because the "0-lag time" is always
>> smaller than the current scheduling deadline).
>> This might cause some transient overloads (because if I change the parameters
>> of a task at time t, the update takes action a little bit later - at the next
>> scheduling deadline), but guarantees that a task never consumes more than
>> expected (for example: if a task continuously changes its bandwidth between
>> 0.4 and 0.3, it will never consume more than 0.4. I suspect that if we
>> immediately update dl_se->deadline and dl_se->runtime a task can end up
>> consuming much more CPU time).
> 
> 
> OK, how about something like this; it seems to survive your Bug-Test for
> at least 50 cycles.
> 

Thanks for the patch!

> ---
>  kernel/sched/core.c | 33 -
>  kernel/sched/deadline.c |  3 ++-
>  2 files changed, 30 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ade2958a9197..d787d6553d72 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1816,6 +1816,10 @@ void __dl_clear_params(struct task_struct *p)
>   dl_se->dl_period = 0;
>   dl_se->flags = 0;
>   dl_se->dl_bw = 0;
> +
> + dl_se->dl_throttled = 0;
> + dl_se->dl_new = 1;
> + dl_se->dl_yielded = 0;
>  }
>  
>  /*
> @@ -1844,7 +1848,7 @@ static void __sched_fork(unsigned long clone_flags, 
> struct task_struct *p)
>  #endif
>  
>   RB_CLEAR_NODE(&p->dl.rb_node);
> - hrtimer_init(&p->dl.dl_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> + init_dl_task_timer(&p->dl);
>   __dl_clear_params(p);
>  
>   INIT_LIST_HEAD(&p->rt.run_list);
> @@ -2054,6 +2058,9 @@ static inline int dl_bw_cpus(int i)
>   * allocated bandwidth to reflect the new situation.
>   *
>   * This function is called while holding p's rq->lock.
> + *
> + * XXX we should delay bw change until the task's 0-lag point, see
> + * __setparam_dl().
>   */
>  static int dl_overflow(struct task_struct *p, int policy,
>  const struct sched_attr *attr)
> @@ -3258,15 +3265,31 @@ __setparam_dl(struct task_struct *p, const struct 
> sched_attr *attr)
>  {
>   struct sched_dl_entity *dl_se = &p->dl;
>  
> - init_dl_task_timer(dl_se);
>   dl_se->dl_runtime = attr->sched_runtime;
>   dl_se->dl_deadline = attr->sched_deadline;
>   dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
>   dl_se->flags = attr->sched_flags;
>   dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
> - dl_se->dl_throttled = 0;
> - dl_se->dl_new = 1;
> - dl_se->dl_yielded = 0;
> +
> + /*
> +  * Changing the parameters of a task is 'tricky' and we're not doing
> +  * the correct thing -- also see task_dead_dl() and switched_from_dl().
> +  *
> +  * What we SHOULD do is delay the bandwidth release until the 0-lag
> +  * point. This would include retaining the task_struct until that time
> +  * and change dl_overflow() to not immediately decrement the current
> +  * amount.
> +  *
> +  * Instead we retain the current runtime/deadline and let the new
> +  * parameters take effect after the current reservation period lapses.
> +  * This is safe (albeit pessimistic) because the 0-lag point is always
> +  * before the current scheduling deadline.
> +  *
> +  * We can still have temporary overloads because we do not delay the
> +  * change in bandwidth until that time; so admission control is
> +  * not on the safe side. It does however guarantee tasks will never
> +  * consume more than promised.

So, we do the safe thing only in case of throttling. I guess it is more than
ok for now, while we hopefully find some spare cycle to implement a
complete solution :/.

Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-30 Thread Luca Abeni

Hi Peter,

On 01/28/2015 03:08 PM, Peter Zijlstra wrote:

On Thu, Jan 15, 2015 at 02:35:46PM +0100, Luca Abeni wrote:


From what I understand we should either modify the task's run/sleep stats
when we change its parameters or we should schedule a delayed release of
the bandwidth delta (when it reaches its 0-lag point, if that's in the
future).



I suspect the correct behaviour can be difficult to implement:
- When a SCHED_DEADLINE task ends (or changes its scheduling policy to
   something different), its bandwidth cannot be released immediately,
   but should be released at the "0-lag time" (which reminds me about the
   GRUB patches... I had to implement a similar behaviour in those patches :)
- The same applies when the task changes its scheduling parameters decreasing
   its bandwidth. In this case, we also need to update the current runtime (if
   it is larger than the new runtime, set it to the new maximum value - I think
   this is the safest thing to do)
- When a task changes its parameters to increase its bandwidth, we do not
   have such problems.

As far as I can see, if we apply the runtime / deadline changes starting from
the next reservation period we are safe (because the "0-lag time" is always
smaller than the current scheduling deadline).
This might cause some transient overloads (because if I change the parameters
of a task at time t, the update takes action a little bit later - at the next
scheduling deadline), but guarantees that a task never consumes more than
expected (for example: if a task continuously changes its bandwidth between
0.4 and 0.3, it will never consume more than 0.4. I suspect that if we
immediately update dl_se->deadline and dl_se->runtime a task can end up
consuming much more CPU time).



OK, how about something like this; it seems to survive your Bug-Test for
at least 50 cycles.

I tested your patch for 1 day, and it works fine for my testcases (no crashes,
hangs, or strange behaviours).
Also, the comments look ok to me.


Thanks,
Luca



---
  kernel/sched/core.c | 33 -
  kernel/sched/deadline.c |  3 ++-
  2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ade2958a9197..d787d6553d72 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1816,6 +1816,10 @@ void __dl_clear_params(struct task_struct *p)
dl_se->dl_period = 0;
dl_se->flags = 0;
dl_se->dl_bw = 0;
+
+   dl_se->dl_throttled = 0;
+   dl_se->dl_new = 1;
+   dl_se->dl_yielded = 0;
  }

  /*
@@ -1844,7 +1848,7 @@ static void __sched_fork(unsigned long clone_flags, 
struct task_struct *p)
  #endif

RB_CLEAR_NODE(&p->dl.rb_node);
-   hrtimer_init(&p->dl.dl_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+   init_dl_task_timer(&p->dl);
__dl_clear_params(p);

INIT_LIST_HEAD(&p->rt.run_list);
@@ -2054,6 +2058,9 @@ static inline int dl_bw_cpus(int i)
   * allocated bandwidth to reflect the new situation.
   *
   * This function is called while holding p's rq->lock.
+ *
+ * XXX we should delay bw change until the task's 0-lag point, see
+ * __setparam_dl().
   */
  static int dl_overflow(struct task_struct *p, int policy,
   const struct sched_attr *attr)
@@ -3258,15 +3265,31 @@ __setparam_dl(struct task_struct *p, const struct 
sched_attr *attr)
  {
struct sched_dl_entity *dl_se = &p->dl;

-   init_dl_task_timer(dl_se);
dl_se->dl_runtime = attr->sched_runtime;
dl_se->dl_deadline = attr->sched_deadline;
dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
dl_se->flags = attr->sched_flags;
dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
-   dl_se->dl_throttled = 0;
-   dl_se->dl_new = 1;
-   dl_se->dl_yielded = 0;
+
+   /*
+* Changing the parameters of a task is 'tricky' and we're not doing
+* the correct thing -- also see task_dead_dl() and switched_from_dl().
+*
+* What we SHOULD do is delay the bandwidth release until the 0-lag
+* point. This would include retaining the task_struct until that time
+* and change dl_overflow() to not immediately decrement the current
+* amount.
+*
+* Instead we retain the current runtime/deadline and let the new
+* parameters take effect after the current reservation period lapses.
+* This is safe (albeit pessimistic) because the 0-lag point is always
+* before the current scheduling deadline.
+*
+* We can still have temporary overloads because we do not delay the
+* change in bandwidth until that time; so admission control is
+* not on the safe side. It does however guarantee tasks will never
+* consume more than promised.
+*/
  }

  /*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index b5209

Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-28 Thread Luca Abeni

Hi Peter,

On 01/28/2015 03:08 PM, Peter Zijlstra wrote:

On Thu, Jan 15, 2015 at 02:35:46PM +0100, Luca Abeni wrote:


From what I understand we should either modify the task's run/sleep stats
when we change its parameters or we should schedule a delayed release of
the bandwidth delta (when it reaches its 0-lag point, if that's in the
future).



I suspect the correct behaviour can be difficult to implement:
- When a SCHED_DEADLINE task ends (or changes its scheduling policy to
   something different), its bandwidth cannot be released immediately,
   but should be released at the "0-lag time" (which reminds me about the
   GRUB patches... I had to implement a similar behaviour in those patches :)
- The same applies when the task changes its scheduling parameters decreasing
   its bandwidth. In this case, we also need to update the current runtime (if
   it is larger than the new runtime, set it to the new maximum value - I think
   this is the safest thing to do)
- When a task changes its parameters to increase its bandwidth, we do not
   have such problems.

As far as I can see, if we apply the runtime / deadline changes starting from
the next reservation period we are safe (because the "0-lag time" is always
smaller than the current scheduling deadline).
This might cause some transient overloads (because if I change the parameters
of a task at time t, the update takes action a little bit later - at the next
scheduling deadline), but guarantees that a task never consumes more than
expected (for example: if a task continuously changes its bandwidth between
0.4 and 0.3, it will never consume more than 0.4. I suspect that if we
immediately update dl_se->deadline and dl_se->runtime a task can end up
consuming much more CPU time).



OK, how about something like this; it seems to survive your Bug-Test for
at least 50 cycles.

Thanks; tomorrow I'll test this patch and I'll let you know how it works in
my setup (but looking at it I am pretty sure it will solve my issue).


Thanks,
Luca



---
  kernel/sched/core.c | 33 -
  kernel/sched/deadline.c |  3 ++-
  2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ade2958a9197..d787d6553d72 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1816,6 +1816,10 @@ void __dl_clear_params(struct task_struct *p)
dl_se->dl_period = 0;
dl_se->flags = 0;
dl_se->dl_bw = 0;
+
+   dl_se->dl_throttled = 0;
+   dl_se->dl_new = 1;
+   dl_se->dl_yielded = 0;
  }

  /*
@@ -1844,7 +1848,7 @@ static void __sched_fork(unsigned long clone_flags, 
struct task_struct *p)
  #endif

RB_CLEAR_NODE(&p->dl.rb_node);
-   hrtimer_init(&p->dl.dl_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+   init_dl_task_timer(&p->dl);
__dl_clear_params(p);

INIT_LIST_HEAD(&p->rt.run_list);
@@ -2054,6 +2058,9 @@ static inline int dl_bw_cpus(int i)
   * allocated bandwidth to reflect the new situation.
   *
   * This function is called while holding p's rq->lock.
+ *
+ * XXX we should delay bw change until the task's 0-lag point, see
+ * __setparam_dl().
   */
  static int dl_overflow(struct task_struct *p, int policy,
   const struct sched_attr *attr)
@@ -3258,15 +3265,31 @@ __setparam_dl(struct task_struct *p, const struct 
sched_attr *attr)
  {
struct sched_dl_entity *dl_se = &p->dl;

-   init_dl_task_timer(dl_se);
dl_se->dl_runtime = attr->sched_runtime;
dl_se->dl_deadline = attr->sched_deadline;
dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
dl_se->flags = attr->sched_flags;
dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
-   dl_se->dl_throttled = 0;
-   dl_se->dl_new = 1;
-   dl_se->dl_yielded = 0;
+
+   /*
+* Changing the parameters of a task is 'tricky' and we're not doing
+* the correct thing -- also see task_dead_dl() and switched_from_dl().
+*
+* What we SHOULD do is delay the bandwidth release until the 0-lag
+* point. This would include retaining the task_struct until that time
+* and change dl_overflow() to not immediately decrement the current
+* amount.
+*
+* Instead we retain the current runtime/deadline and let the new
+* parameters take effect after the current reservation period lapses.
+* This is safe (albeit pessimistic) because the 0-lag point is always
+* before the current scheduling deadline.
+*
+* We can still have temporary overloads because we do not delay the
+* change in bandwidth until that time; so admission control is
+* not on the safe side. It does however guarantee tasks will never
+* consume more than promised.
+*/
  }

  /*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index b52

Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-28 Thread Peter Zijlstra
On Thu, Jan 15, 2015 at 02:35:46PM +0100, Luca Abeni wrote:

> >From what I understand we should either modify the task's run/sleep stats
> >when we change its parameters or we should schedule a delayed release of
> >the bandwidth delta (when it reaches its 0-lag point, if that's in the
> >future).

> I suspect the correct behaviour can be difficult to implement:
> - When a SCHED_DEADLINE task ends (or changes its scheduling policy to
>   something different), its bandwidth cannot be released immediately,
>   but should be released at the "0-lag time" (which reminds me about the
>   GRUB patches... I had to implement a similar behaviour in those patches :)
> - The same applies when the task changes its scheduling parameters decreasing
>   its bandwidth. In this case, we also need to update the current runtime (if
>   it is larger than the new runtime, set it to the new maximum value - I think
>   this is the safest thing to do)
> - When a task changes its parameters to increase its bandwidth, we do not
>   have such problems.
> 
> As far as I can see, if we apply the runtime / deadline changes starting from
> the next reservation period we are safe (because the "0-lag time" is always
> smaller than the current scheduling deadline).
> This might cause some transient overloads (because if I change the parameters
> of a task at time t, the update takes action a little bit later - at the next
> scheduling deadline), but guarantees that a task never consumes more than
> expected (for example: if a task continuously changes its bandwidth between
> 0.4 and 0.3, it will never consume more than 0.4. I suspect that if we
> immediately update dl_se->deadline and dl_se->runtime a task can end up
> consuming much more CPU time).


OK, how about something like this; it seems to survive your Bug-Test for
at least 50 cycles.

---
 kernel/sched/core.c | 33 -
 kernel/sched/deadline.c |  3 ++-
 2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ade2958a9197..d787d6553d72 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1816,6 +1816,10 @@ void __dl_clear_params(struct task_struct *p)
dl_se->dl_period = 0;
dl_se->flags = 0;
dl_se->dl_bw = 0;
+
+   dl_se->dl_throttled = 0;
+   dl_se->dl_new = 1;
+   dl_se->dl_yielded = 0;
 }
 
 /*
@@ -1844,7 +1848,7 @@ static void __sched_fork(unsigned long clone_flags, 
struct task_struct *p)
 #endif
 
RB_CLEAR_NODE(&p->dl.rb_node);
-   hrtimer_init(&p->dl.dl_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+   init_dl_task_timer(&p->dl);
__dl_clear_params(p);
 
INIT_LIST_HEAD(&p->rt.run_list);
@@ -2054,6 +2058,9 @@ static inline int dl_bw_cpus(int i)
  * allocated bandwidth to reflect the new situation.
  *
  * This function is called while holding p's rq->lock.
+ *
+ * XXX we should delay bw change until the task's 0-lag point, see
+ * __setparam_dl().
  */
 static int dl_overflow(struct task_struct *p, int policy,
   const struct sched_attr *attr)
@@ -3258,15 +3265,31 @@ __setparam_dl(struct task_struct *p, const struct 
sched_attr *attr)
 {
struct sched_dl_entity *dl_se = &p->dl;
 
-   init_dl_task_timer(dl_se);
dl_se->dl_runtime = attr->sched_runtime;
dl_se->dl_deadline = attr->sched_deadline;
dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
dl_se->flags = attr->sched_flags;
dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
-   dl_se->dl_throttled = 0;
-   dl_se->dl_new = 1;
-   dl_se->dl_yielded = 0;
+
+   /*
+* Changing the parameters of a task is 'tricky' and we're not doing
+* the correct thing -- also see task_dead_dl() and switched_from_dl().
+*
+* What we SHOULD do is delay the bandwidth release until the 0-lag
+* point. This would include retaining the task_struct until that time
+* and change dl_overflow() to not immediately decrement the current
+* amount.
+*
+* Instead we retain the current runtime/deadline and let the new
+* parameters take effect after the current reservation period lapses.
+* This is safe (albeit pessimistic) because the 0-lag point is always
+* before the current scheduling deadline.
+*
+* We can still have temporary overloads because we do not delay the
+* change in bandwidth until that time; so admission control is
+* not on the safe side. It does however guarantee tasks will never
+* consume more than promised.
+*/
 }
 
 /*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index b52092f2636d..726470d47f87 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1094,6 +1094,7 @@ static void task_dead_dl(struct task_struct *p)
 * Since we are TASK_DEAD we won't slip out of the domain!
 */
   

Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-15 Thread Luca Abeni

Hi Peter,

On 01/15/2015 01:23 PM, Peter Zijlstra wrote:

On Thu, Jan 15, 2015 at 12:23:43PM +0100, Luca Abeni wrote:

There are some parts of the patch that I do not understand (for example:
if I understand well, if the task is not throttled you set dl_new to 1...
And if it is throttled you change its current runtime and scheduling deadline...
Why?)
Anyway, I tested the patch and I confirm that it fixes the hang I originally
reported.


I'll go look at the patch in a wee bit. But Juri, Luca, could you
describe the correct behaviour?


From what I understand we should either modify the task's run/sleep stats
when we change its parameters or we should schedule a delayed release of
the bandwidth delta (when it reaches its 0-lag point, if that's in the
future).

I suspect the correct behaviour can be difficult to implement:
- When a SCHED_DEADLINE task ends (or changes its scheduling policy to
  something different), its bandwidth cannot be released immediately,
  but should be released at the "0-lag time" (which reminds me about the
  GRUB patches... I had to implement a similar behaviour in those patches :)
- The same applies when the task changes its scheduling parameters decreasing
  its bandwidth. In this case, we also need to update the current runtime (if
  it is larger than the new runtime, set it to the new maximum value - I think
  this is the safest thing to do)
- When a task changes its parameters to increase its bandwidth, we do not
  have such problems.

As far as I can see, if we apply the runtime / deadline changes starting from
the next reservation period we are safe (because the "0-lag time" is always
smaller than the current scheduling deadline).
This might cause some transient overloads (because if I change the parameters
of a task at time t, the update takes action a little bit later - at the next
scheduling deadline), but guarantees that a task never consumes more than
expected (for example: if a task continuously changes its bandwidth between
0.4 and 0.3, it will never consume more than 0.4. I suspect that if we
immediately update dl_se->deadline and dl_se->runtime a task can end up
consuming much more CPU time).


Luca



Doing neither will allow one to game the reservation system afaict.





Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-15 Thread Peter Zijlstra
On Thu, Jan 15, 2015 at 12:23:43PM +0100, Luca Abeni wrote:
> There are some parts of the patch that I do not understand (for example:
> if I understand well, if the task is not throttled you set dl_new to 1...
> And if it is throttled you change its current runtime and scheduling 
> deadline...
> Why?)
> Anyway, I tested the patch and I confirm that it fixes the hang I originally
> reported.

I'll go look at the patch in a wee bit. But Juri, Luca, could you
describe the correct behaviour?

From what I understand we should either modify the task's run/sleep stats
when we change its parameters or we should schedule a delayed release of
the bandwidth delta (when it reaches its 0-lag point, if that's in the
future).

Doing neither will allow one to game the reservation system afaict.


Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-15 Thread Luca Abeni

Hi Kirill,

On 01/14/2015 01:43 PM, Kirill Tkhai wrote:
[...]

Say we have a userspace task that evaluates and changes runtime
parameters for other tasks (basically what Luca is doing IIRC), and the
changes keep resetting the sleep time, the whole guarantee system comes
down, rendering the deadline system useless.

   If in the future we allow non-privileged users to increase deadline,
   we will reflect that in __setparam_dl() too.

  I'd say it is better to implement the right behavior even for root, so
  that we will find it right when we'll grant access to non root users
  too. Also, even if root can do everything, we always try not to break
  guarantees that come with admission control (root or non root that is).


Just so.


How about something like this?

There are some parts of the patch that I do not understand (for example:
if I understand well, if the task is not throttled you set dl_new to 1...
And if it is throttled you change its current runtime and scheduling deadline...
Why?)
Anyway, I tested the patch and I confirm that it fixes the hang I originally
reported.


Luca


  include/linux/sched/deadline.h |  2 ++
  kernel/sched/core.c| 33 +
  kernel/sched/deadline.c| 10 +-
  kernel/sched/sched.h   |  1 -
  4 files changed, 32 insertions(+), 14 deletions(-)
diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
index 9d303b8..c70a7fc 100644
--- a/include/linux/sched/deadline.h
+++ b/include/linux/sched/deadline.h
@@ -21,4 +21,6 @@ static inline int dl_task(struct task_struct *p)
return dl_prio(p->prio);
  }

+extern enum hrtimer_restart dl_task_timer(struct hrtimer *timer);
+
  #endif /* _SCHED_DEADLINE_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c0accc0..edf9d91 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1809,6 +1809,8 @@ void __dl_clear_params(struct task_struct *p)
  {
struct sched_dl_entity *dl_se = &p->dl;

+   dl_se->dl_throttled = 0;
+   dl_se->dl_yielded = 0;
dl_se->dl_runtime = 0;
dl_se->dl_deadline = 0;
dl_se->dl_period = 0;
@@ -1840,6 +1842,7 @@ static void __sched_fork(unsigned long clone_flags, 
struct task_struct *p)

RB_CLEAR_NODE(&p->dl.rb_node);
hrtimer_init(&p->dl.dl_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+   p->dl.dl_timer.function = dl_task_timer;
__dl_clear_params(p);

INIT_LIST_HEAD(&p->rt.run_list);
@@ -3250,16 +3253,38 @@ static void
  __setparam_dl(struct task_struct *p, const struct sched_attr *attr)
  {
struct sched_dl_entity *dl_se = &p->dl;
+   struct hrtimer *timer = &dl_se->dl_timer;
+   struct rq *rq = task_rq(p);
+   int pending = 0;
+
+   if (hrtimer_is_queued(timer)) {
+   /*
+* Not need smp_rmb() here, synchronization points
+* are rq lock in our caller and in dl_task_timer().
+*/
+   pending = 1;
+   } else if (hrtimer_callback_running(timer)) {
+   smp_rmb(); /* Pairs with rq lock in timer handler */
+
+   /* Has the timer handler already finished? */
+   if (dl_se->dl_throttled)
+   pending = 1; /* No */
+   }
+
+   if (!pending) {
+   BUG_ON(dl_se->dl_throttled);
+   dl_se->dl_new = 1;
+   } else {
+   dl_se->deadline = rq_clock(rq) + attr->sched_deadline;
+   dl_se->runtime = attr->sched_runtime;
+   }

-   init_dl_task_timer(dl_se);
+   dl_se->dl_yielded = 0;
dl_se->dl_runtime = attr->sched_runtime;
dl_se->dl_deadline = attr->sched_deadline;
dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
dl_se->flags = attr->sched_flags;
dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
-   dl_se->dl_throttled = 0;
-   dl_se->dl_new = 1;
-   dl_se->dl_yielded = 0;
  }

  /*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index b52092f..b457ca7 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -500,7 +500,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, 
bool boosted)
   * updating (and the queueing back to dl_rq) will be done by the
   * next call to enqueue_task_dl().
   */
-static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
+enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
  {
struct sched_dl_entity *dl_se = container_of(timer,
 struct sched_dl_entity,
@@ -559,14 +559,6 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer 
*timer)
return HRTIMER_NORESTART;
  }

-void init_dl_task_timer(struct sched_dl_entity *dl_se)
-{
-   struct hrtimer *timer = &dl_se->dl_timer;
-
-   hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
-   timer->function = dl_task_timer;
-}
-
  stat

Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-14 Thread Kirill Tkhai
13.01.2015, 17:04, "Peter Zijlstra" :
> On Tue, Jan 12, 2015 at 12:26:40PM +0300, Kirill Tkhai wrote:
>>>  Well, I'm inclined to agree to Luca's viewpoint. We should not change
>>>  parameters of a throttled task or we may affect other tasks.
>>  Could you explain your viewpoint more? How does this affects other tasks?
>
> I agree with Juri and Luca here. Luca gave an example that DoS's
> SCHED_DEADLINE.
>
> In Luca's example we'd constantly call sched_setattr() on a task, which
> results in it never throttling and that will affect other tasks, because
> if we're running, they cannot be.
>>  As I understand, in __setparam_dl() we are sure that there is enough
>>  dl_bw. In __sched_setscheduler() we call it after dl_overflow() check.
>
> Yes, _but_, as per the above. BW includes the sleep time, if we fail to
> sleep its all pointless.
>
> We got the runtime (before the throttle) under the promise that we'd go
> sleep. When we change our parameters while being throttled we should
> adjust the throttle time accordingly. If say we changed from 30% to 35%
> we are allowed to sleep less. Now sleeping more than strictly required
> is only detrimental to ourselves, so that is not (too) critical.
>
> However, the other way around, if we change from 35% to 30% we should
> now effectively sleep longer (for the same runtime we already consumed),
> and we must do this, because admission control will instantly assume the
> 5% change in bandwidth to be available. If we sleep short, this BW is
> not in fact available and we break our guarantees.
>   Anyway, I am fine with every patch that fixes the bug :)
   Deadline class requires root privileges. So, I do not see a problem
   here. Please, see __sched_setscheduler().
>
> It's not a priv issue; not doing this makes it impossible to use
> SCHED_DEADLINE in the intended way, even for root.
>
> Say we have a userspace task that evaluates and changes runtime
> parameters for other tasks (basically what Luca is doing IIRC), and the
> changes keep resetting the sleep time, the whole guarantee system comes
> down, rendering the deadline system useless.
   If in the future we allow non-privileged users to increase deadline,
   we will reflect that in __setparam_dl() too.
>>>  I'd say it is better to implement the right behavior even for root, so
>>>  that we will find it right when we'll grant access to non root users
>>>  too. Also, even if root can do everything, we always try not to break
>>>  guarantees that come with admission control (root or non root that is).
>
> Just so.

How about something like this?

 include/linux/sched/deadline.h |  2 ++
 kernel/sched/core.c| 33 +
 kernel/sched/deadline.c| 10 +-
 kernel/sched/sched.h   |  1 -
 4 files changed, 32 insertions(+), 14 deletions(-)
diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
index 9d303b8..c70a7fc 100644
--- a/include/linux/sched/deadline.h
+++ b/include/linux/sched/deadline.h
@@ -21,4 +21,6 @@ static inline int dl_task(struct task_struct *p)
return dl_prio(p->prio);
 }
 
+extern enum hrtimer_restart dl_task_timer(struct hrtimer *timer);
+
 #endif /* _SCHED_DEADLINE_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c0accc0..edf9d91 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1809,6 +1809,8 @@ void __dl_clear_params(struct task_struct *p)
 {
struct sched_dl_entity *dl_se = &p->dl;
 
+   dl_se->dl_throttled = 0;
+   dl_se->dl_yielded = 0;
dl_se->dl_runtime = 0;
dl_se->dl_deadline = 0;
dl_se->dl_period = 0;
@@ -1840,6 +1842,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 
RB_CLEAR_NODE(&p->dl.rb_node);
hrtimer_init(&p->dl.dl_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+   p->dl.dl_timer.function = dl_task_timer;
__dl_clear_params(p);
 
INIT_LIST_HEAD(&p->rt.run_list);
@@ -3250,16 +3253,38 @@ static void
 __setparam_dl(struct task_struct *p, const struct sched_attr *attr)
 {
struct sched_dl_entity *dl_se = &p->dl;
+   struct hrtimer *timer = &dl_se->dl_timer;
+   struct rq *rq = task_rq(p);
+   int pending = 0;
+
+   if (hrtimer_is_queued(timer)) {
+   /*
+* No need for smp_rmb() here; the synchronization
+* points are the rq lock in our caller and in dl_task_timer().
+*/
+   pending = 1;
+   } else if (hrtimer_callback_running(timer)) {
+   smp_rmb(); /* Pairs with rq lock in timer handler */
+
+   /* Has the timer handler already finished? */
+   if (dl_se->dl_throttled)
+   pending = 1; /* No */
+   }
+
+   if (!pending) {
+   BUG_ON(dl_se->dl_throttled);
+   dl_se->dl_new = 1;
+   } else {
+   dl_se->deadline = rq_clock(rq) + attr->sched_deadline;
+   

Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-13 Thread Peter Zijlstra
On Tue, Jan 12, 2015 at 12:26:40PM +0300, Kirill Tkhai wrote:
> > Well, I'm inclined to agree with Luca's viewpoint. We should not change
> > parameters of a throttled task or we may affect other tasks.
> 
> Could you explain your viewpoint more? How does this affect other tasks?

I agree with Juri and Luca here. Luca gave an example that DoS's
SCHED_DEADLINE.

In Luca's example we'd constantly call sched_setattr() on a task, which
results in it never throttling and that will affect other tasks, because
if we're running, they cannot be.

> As I understand, in __setparam_dl() we are sure that there is enough
> dl_bw. In __sched_setscheduler() we call it after dl_overflow() check.

Yes, _but_, as per the above. BW includes the sleep time; if we fail to
sleep, it's all pointless.

We got the runtime (before the throttle) under the promise that we'd go
sleep. When we change our parameters while being throttled we should
adjust the throttle time accordingly. If say we changed from 30% to 35%
we are allowed to sleep less. Now sleeping more than strictly required
is only detrimental to ourselves, so that is not (too) critical.

However, the other way around, if we change from 35% to 30% we should
now effectively sleep longer (for the same runtime we already consumed),
and we must do this, because admission control will instantly assume the
5% change in bandwidth to be available. If we sleep short, this BW is
not in fact available and we break our guarantees.

> >>>  Anyway, I am fine with every patch that fixes the bug :)
> >>  Deadline class requires root privileges. So, I do not see a problem
> >>  here. Please, see __sched_setscheduler().

It's not a priv issue; not doing this makes it impossible to use
SCHED_DEADLINE in the intended way, even for root.

Say we have a userspace task that evaluates and changes runtime
parameters for other tasks (basically what Luca is doing IIRC), and the
changes keep resetting the sleep time, the whole guarantee system comes
down, rendering the deadline system useless.

> >>  If in the future we allow non-privileged users to increase deadline,
> >>  we will reflect that in __setparam_dl() too.
> >
> > I'd say it is better to implement the right behavior even for root, so
> > that we will find it right when we'll grant access to non root users
> > too. Also, even if root can do everything, we always try not to break
> > guarantees that come with admission control (root or non root that is).

Just so.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-13 Thread Kirill Tkhai
Hi, Juri,

13.01.2015, 11:10, "Juri Lelli" :
> Hi all,
>
> really sorry for the huge delay in replying to this! :(
>
> On 07/01/2015 12:29, Kirill Tkhai wrote:
>>  On Ср, 2015-01-07 at 08:01 +0100, Luca Abeni wrote:
>>>  Hi Kirill,
>>>
>>>  On Tue, 06 Jan 2015 02:07:21 +0300
>>>  Kirill Tkhai  wrote:
  On Пн, 2015-01-05 at 16:21 +0100, Luca Abeni wrote:
>>>  [...]
>  For reference, I attach the patch I am using locally (based on what
>  I suggested in my previous mail) and seems to work fine here.
>
>  Based on your comments, I suspect my patch can be further
>  simplified by moving the call to init_dl_task_timer() in
>  __sched_fork().
  It seems this way has problems. The first one is that task may become
  throttled again, and we will start dl_timer again.
>>>  Well, in my understanding if I change the parameters of a
>>>  SCHED_DEADLINE task when it is throttled, it stays throttled... So, the
>>>  task might not become throttled again before the dl timer fires.
>>>  So, I hoped this problem does not exist. But I might be wrong.
>>  You keep zeroing of dl_se->dl_throttled, and further enqueue_task()
>>  places it on the dl_rq. So, further update_curr_dl() may make it throttled
>>  again, and it will try to start dl_timer (which is already set).
  The second is that
  it's better to minimize the number of combinations of situations we have.
  Let's keep only one combination: timer is set <-> task is throttled.
>>>  Yes, this was my goal too... So, if I change the parameters of a task
>>>  when it is throttled, I leave dl_throttled set to 1 and I leave the
>>>  timer active.
>>  As I see,
>>
>>  dl_se->dl_throttled = 0;
>>
>>  is still in __setparam_dl() after your patch, so you do not leave
>>  it set to 1.
>>>  [...]
>>  @@ -3250,16 +3251,19 @@ static void
>>    __setparam_dl(struct task_struct *p, const struct sched_attr
>>  *attr) {
>>    struct sched_dl_entity *dl_se = &p->dl;
>>  +    struct hrtimer *timer = &dl_se->dl_timer;
>>  +
>>  + if (!hrtimer_active(timer) ||
>>  hrtimer_try_to_cancel(timer) != -1) {
>  Just for the sake of curiosity, why trying to cancel the timer
>  ("|| hrtimer_try_to_cancel(timer)") here? If it is active, cannot
>  we leave it active (without touching dl_throttled, dl_new and
>  dl_yielded)?
>
>  I mean: if I try to change the parameters of a task when it is
>  throttled, I'd like it to stay throttled until the end of the
>  reservation period... Or am I missing something?
  I think that when people change a task's parameters, they want the
  kernel to react to this immediately. For example, you want to kill a
  throttled deadline task. You change its parameters, but nothing happens.
  I think all developers had this use case when they were debugging
  deadline class.
>>>  I see... Different people have different requirements :)
>>>  My goal was to do something like adaptive scheduling (or scheduling
>>>  tasks with mode changes), so I did not want that changing the
>>>  scheduling parameters of a task affected the scheduling of the other
>>>  tasks... But if a task exits the throttled state when I change its
>>>  parameters, it might consume much more than the reserved CPU time.
>>>  Also, I suspect this kind of approach can be exploited by malicious
>>>  users: if I create a task with runtime 30ms and period 100ms, and I
>>>  change its scheduling parameters (to runtime=29ms and back) frequently
>>>  enough, I can consume much more than 30% of the CPU time...
>
> Well, I'm inclined to agree with Luca's viewpoint. We should not change
> parameters of a throttled task or we may affect other tasks.

Could you explain your viewpoint more? How does this affect other tasks?

As I understand, in __setparam_dl() we are sure that there is enough
dl_bw. In __sched_setscheduler() we call it after dl_overflow() check.

>>>  Anyway, I am fine with every patch that fixes the bug :)
>>  Deadline class requires root privileges. So, I do not see a problem
>>  here. Please, see __sched_setscheduler().
>>
>>  If in the future we allow non-privileged users to increase deadline,
>>  we will reflect that in __setparam_dl() too.
>
> I'd say it is better to implement the right behavior even for root, so
> that we will find it right when we'll grant access to non root users
> too. Also, even if root can do everything, we always try not to break
> guarantees that come with admission control (root or non root that is).


Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-13 Thread Juri Lelli
Hi all,

really sorry for the huge delay in replying to this! :(

On 07/01/2015 12:29, Kirill Tkhai wrote:
> On Ср, 2015-01-07 at 08:01 +0100, Luca Abeni wrote:
>> Hi Kirill,
>>
>> On Tue, 06 Jan 2015 02:07:21 +0300
>> Kirill Tkhai  wrote:
>>
>>> On Пн, 2015-01-05 at 16:21 +0100, Luca Abeni wrote:
>> [...]
 For reference, I attach the patch I am using locally (based on what
 I suggested in my previous mail) and seems to work fine here.

 Based on your comments, I suspect my patch can be further
 simplified by moving the call to init_dl_task_timer() in
 __sched_fork().
>>>
>>> It seems this way has problems. The first one is that task may become
>>> throttled again, and we will start dl_timer again.
>> Well, in my understanding if I change the parameters of a
>> SCHED_DEADLINE task when it is throttled, it stays throttled... So, the
>> task might not become throttled again before the dl timer fires.
>> So, I hoped this problem does not exist. But I might be wrong.
> 
> You keep zeroing of dl_se->dl_throttled, and further enqueue_task()
> places it on the dl_rq. So, further update_curr_dl() may make it throttled
> again, and it will try to start dl_timer (which is already set).
> 
>>> The second is that
>>> it's better to minimize the number of combinations of situations we have.
>>> Let's keep only one combination: timer is set <-> task is throttled.
>> Yes, this was my goal too... So, if I change the parameters of a task
>> when it is throttled, I leave dl_throttled set to 1 and I leave the
>> timer active.
> 
> As I see,
> 
> dl_se->dl_throttled = 0;
> 
> is still in __setparam_dl() after your patch, so you do not leave
> it set to 1.
> 
>>
>> [...]
> @@ -3250,16 +3251,19 @@ static void
>   __setparam_dl(struct task_struct *p, const struct sched_attr
> *attr) {
>   struct sched_dl_entity *dl_se = &p->dl;
> +struct hrtimer *timer = &dl_se->dl_timer;
> +
> + if (!hrtimer_active(timer) ||
> hrtimer_try_to_cancel(timer) != -1) {
 Just for the sake of curiosity, why trying to cancel the timer
 ("|| hrtimer_try_to_cancel(timer)") here? If it is active, cannot
 we leave it active (without touching dl_throttled, dl_new and
 dl_yielded)?

 I mean: if I try to change the parameters of a task when it is
 throttled, I'd like it to stay throttled until the end of the
 reservation period... Or am I missing something?
>>>
>>> I think that when people change a task's parameters, they want the
>>> kernel to react to this immediately. For example, you want to kill a
>>> throttled deadline task. You change its parameters, but nothing happens.
>>> I think all developers had this use case when they were debugging
>>> deadline class.
>> I see... Different people have different requirements :)
>> My goal was to do something like adaptive scheduling (or scheduling
>> tasks with mode changes), so I did not want that changing the
>> scheduling parameters of a task affected the scheduling of the other
>> tasks... But if a task exits the throttled state when I change its
>> parameters, it might consume much more than the reserved CPU time.
>> Also, I suspect this kind of approach can be exploited by malicious
>> users: if I create a task with runtime 30ms and period 100ms, and I
>> change its scheduling parameters (to runtime=29ms and back) frequently
>> enough, I can consume much more than 30% of the CPU time...
>>

Well, I'm inclined to agree with Luca's viewpoint. We should not change
parameters of a throttled task or we may affect other tasks.

>> Anyway, I am fine with every patch that fixes the bug :)
> 
> Deadline class requires root privileges. So, I do not see a problem
> here. Please, see __sched_setscheduler().
> 
> If in the future we allow non-privileged users to increase deadline,
> we will reflect that in __setparam_dl() too.
> 

I'd say it is better to implement the right behavior even for root, so
that we will find it right when we'll grant access to non root users
too. Also, even if root can do everything, we always try not to break
guarantees that come with admission control (root or non root that is).

Thanks a lot,

- Juri



Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-09 Thread Kirill Tkhai

09.01.2015, 14:15, "Luca Abeni" :
> Hi Kirill,
>
> On 01/07/2015 02:04 PM, Kirill Tkhai wrote:
> [...]
  If in the future we allow non-privileged users to increase deadline,
  we will reflect that in __setparam_dl() too.
>>>  Ok.
>>  Does my patch help you? It helps me, but anyway I need your confirmation.
>
> Sorry about the delay... Anyway, I finally had time to test your patch, and
> as expected it fixes the hang I was seeing.

Thanks for the testing.

Kirill


Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-09 Thread Luca Abeni

Hi Kirill,

On 01/07/2015 02:04 PM, Kirill Tkhai wrote:
[...]

If in the future we allow non-privileged users to increase deadline,
we will reflect that in __setparam_dl() too.

Ok.


Does my patch help you? It helps me, but anyway I need your confirmation.

Sorry about the delay... Anyway, I finally had time to test your patch, and
as expected it fixes the hang I was seeing.



Thanks,
Luca



Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-07 Thread Luca Abeni

On 01/07/2015 02:04 PM, Kirill Tkhai wrote:
[...]

and further enqueue_task() places it on the dl_rq.

I was under the impression that no further enqueue_task() will happen (since
the task is throttled, it is not on runqueue, so __sched_setscheduler() will
not dequeue/enqueue it).
But I am probably missing something else :)


We have two concepts of "on runqueue". The first one is rq->on_rq; it means
that a task is "queued". The second is on_dl_rq(dl_se).

When a task is not "queued", it is never on the dl_rq.

When a task is "queued" it may be in two states:
1) on_dl_rq() -- the task is not throttled;
2) !on_dl_rq() -- the task is throttled.

So when we are discussing a throttled task, the task is "queued". If
you clear dl_throttled, __sched_setscheduler() places it back in both
senses: on_rq and on_dl_rq, and the task becomes available for picking
in __schedule().

Ah, I see. Thanks for explaining! Now everything is much clearer and I agree
with you.

[...]

Does my patch help you? It helps me, but anyway I need your confirmation.

I am just back from vacation, and I had no time to test it yet... I hope to
test it before the end of the week, and I'll let you know (but now I am
convinced that it should help).



Thanks again,
Luca



Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-07 Thread Kirill Tkhai
On Ср, 2015-01-07 at 13:45 +0100, Luca Abeni wrote:
> On 01/07/2015 01:29 PM, Kirill Tkhai wrote:
> [...]
>  Based on your comments, I suspect my patch can be further
>  simplified by moving the call to init_dl_task_timer() in
>  __sched_fork().
> >>>
> >>> It seems this way has problems. The first one is that task may become
> >>> throttled again, and we will start dl_timer again.
> >> Well, in my understanding if I change the parameters of a
> >> SCHED_DEADLINE task when it is throttled, it stays throttled... So, the
> >> task might not become throttled again before the dl timer fires.
> >> So, I hoped this problem does not exist. But I might be wrong.
> >
> > You keep zeroing of dl_se->dl_throttled
> Right... Now that you point this out, I realize that dl_se->dl_throttled = 0
> should be inside the if().
> 
> > and further enqueue_task() places it on the dl_rq.
> I was under the impression that no further enqueue_task() will happen (since
> the task is throttled, it is not on runqueue, so __sched_setscheduler() will
> not dequeue/enqueue it).
> But I am probably missing something else :)

We have two concepts of "on runqueue". The first one is rq->on_rq; it means
that a task is "queued". The second is on_dl_rq(dl_se).

When a task is not "queued", it is never on the dl_rq.

When a task is "queued" it may be in two states:
1) on_dl_rq() -- the task is not throttled;
2) !on_dl_rq() -- the task is throttled.

So when we are discussing a throttled task, the task is "queued". If
you clear dl_throttled, __sched_setscheduler() places it back in both
senses: on_rq and on_dl_rq, and the task becomes available for picking
in __schedule().

> 
> >>> The second is that
> >>> it's better to minimize the number of combinations of situations we have.
> >>> Let's keep only one combination: timer is set <-> task is throttled.
> >> Yes, this was my goal too... So, if I change the parameters of a task
> >> when it is throttled, I leave dl_throttled set to 1 and I leave the
> >> timer active.
> >
> > As I see,
> >
> > dl_se->dl_throttled = 0;
> >
> > is still in __setparam_dl() after your patch, so you do not leave
> > it set to 1.
> You are right, my fault.
> 
> [...]
> >>> I think that when people change task's parameters, they want the
> >>> kernel reacts on this immediately. For example, you want to kill
> >>> throttled deadline task. You change parameters, but nothing happens.
> >>> I think all developers had this use case when they were debugging
> >>> deadline class.
> >> I see... Different people have different requirements :)
> >> My goal was to do something like adaptive scheduling (or scheduling
> >> tasks with mode changes), so I did not want that changing the
> >> scheduling parameters of a task affected the scheduling of the other
> >> tasks... But if a task exits the throttled state when I change its
> >> parameters, it might consume much more than the reserved CPU time.
> >> Also, I suspect this kind of approach can be exploited by malicious
> >> users: if I create a task with runtime 30ms and period 100ms, and I
> >> change its scheduling parameters (to runtime=29ms and back) frequently
> >> enough, I can consume much more than 30% of the CPU time...
> >>
> >> Anyway, I am fine with every patch that fixes the bug :)
> >
> > Deadline class requires root privileges. So, I do not see a problem
> > here. Please, see __sched_setscheduler().
> I know... But the final goal is to allow non-root users to use SCHED_DEADLINE,
> so I was thinking about future problems.

I think everything may change many times before we implement that. It's better
to keep the code in a consistent state.

> > If in the future we allow non-privileged users to increase deadline,
> > we will reflect that in __setparam_dl() too.
> Ok.

Does my patch help you? It helps me, but anyway I need your confirmation.

Thanks,
Kirill.



Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-07 Thread Luca Abeni

On 01/07/2015 01:29 PM, Kirill Tkhai wrote:
[...]

Based on your comments, I suspect my patch can be further
simplified by moving the call to init_dl_task_timer() in
__sched_fork().


It seems this way has problems. The first one is that task may become
throttled again, and we will start dl_timer again.

Well, in my understanding if I change the parameters of a
SCHED_DEADLINE task when it is throttled, it stays throttled... So, the
task might not become throttled again before the dl timer fires.
So, I hoped this problem does not exist. But I might be wrong.


You keep zeroing of dl_se->dl_throttled

Right... Now that you point this out, I realize that dl_se->dl_throttled = 0
should be inside the if().


and further enqueue_task() places it on the dl_rq.

I was under the impression that no further enqueue_task() will happen (since
the task is throttled, it is not on runqueue, so __sched_setscheduler() will
not dequeue/enqueue it).
But I am probably missing something else :)


The second is that
it's better to minimize the number of combinations of situations we have.
Let's keep only one combination: timer is set <-> task is throttled.

Yes, this was my goal too... So, if I change the parameters of a task
when it is throttled, I leave dl_throttled set to 1 and I leave the
timer active.


As I see,

dl_se->dl_throttled = 0;

is still in __setparam_dl() after your patch, so you do not leave
it set to 1.

You are right, my fault.

[...]

I think that when people change a task's parameters, they want the
kernel to react to this immediately. For example, you want to kill a
throttled deadline task. You change its parameters, but nothing happens.
I think all developers had this use case when they were debugging
deadline class.

I see... Different people have different requirements :)
My goal was to do something like adaptive scheduling (or scheduling
tasks with mode changes), so I did not want that changing the
scheduling parameters of a task affected the scheduling of the other
tasks... But if a task exits the throttled state when I change its
parameters, it might consume much more than the reserved CPU time.
Also, I suspect this kind of approach can be exploited by malicious
users: if I create a task with runtime 30ms and period 100ms, and I
change its scheduling parameters (to runtime=29ms and back) frequently
enough, I can consume much more than 30% of the CPU time...

Anyway, I am fine with every patch that fixes the bug :)


Deadline class requires root privileges. So, I do not see a problem
here. Please, see __sched_setscheduler().

I know... But the final goal is to allow non-root users to use SCHED_DEADLINE,
so I was thinking about future problems.



If in the future we allow non-privileged users to increase deadline,
we will reflect that in __setparam_dl() too.

Ok.



Thanks,
Luca



Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-07 Thread Kirill Tkhai
On Ср, 2015-01-07 at 08:01 +0100, Luca Abeni wrote:
> Hi Kirill,
> 
> On Tue, 06 Jan 2015 02:07:21 +0300
> Kirill Tkhai  wrote:
> 
> > On Пн, 2015-01-05 at 16:21 +0100, Luca Abeni wrote:
> [...]
> > > For reference, I attach the patch I am using locally (based on what
> > > I suggested in my previous mail) and seems to work fine here.
> > > 
> > > Based on your comments, I suspect my patch can be further
> > > simplified by moving the call to init_dl_task_timer() in
> > > __sched_fork().
> > 
> > It seems this way has problems. The first one is that task may become
> > throttled again, and we will start dl_timer again.
> Well, in my understanding if I change the parameters of a
> SCHED_DEADLINE task when it is throttled, it stays throttled... So, the
> task might not become throttled again before the dl timer fires.
> So, I hoped this problem does not exist. But I might be wrong.

You keep zeroing of dl_se->dl_throttled, and further enqueue_task()
places it on the dl_rq. So, further update_curr_dl() may make it throttled
again, and it will try to start dl_timer (which is already set).

> > The second is that
> > it's better to minimize the number of combinations of situations we have.
> > Let's keep only one combination: timer is set <-> task is throttled.
> Yes, this was my goal too... So, if I change the parameters of a task
> when it is throttled, I leave dl_throttled set to 1 and I leave the
> timer active.

As I see,

dl_se->dl_throttled = 0;

is still in __setparam_dl() after your patch, so you do not leave
it set to 1.

> 
> [...]
> > > > @@ -3250,16 +3251,19 @@ static void
> > > >   __setparam_dl(struct task_struct *p, const struct sched_attr
> > > > *attr) {
> > > > struct sched_dl_entity *dl_se = &p->dl;
> > > > +struct hrtimer *timer = &dl_se->dl_timer;
> > > > +
> > > > +   if (!hrtimer_active(timer) ||
> > > > hrtimer_try_to_cancel(timer) != -1) {
> > > Just for the sake of curiosity, why trying to cancel the timer
> > > ("|| hrtimer_try_to_cancel(timer)") here? If it is active, cannot
> > > we leave it active (without touching dl_throttled, dl_new and
> > > dl_yielded)?
> > > 
> > > I mean: if I try to change the parameters of a task when it is
> > > throttled, I'd like it to stay throttled until the end of the
> > > reservation period... Or am I missing something?
> > 
> > I think that when people change a task's parameters, they want the
> > kernel to react to this immediately. For example, you want to kill a
> > throttled deadline task. You change its parameters, but nothing happens.
> > I think all developers had this use case when they were debugging
> > deadline class.
> I see... Different people have different requirements :)
> My goal was to do something like adaptive scheduling (or scheduling
> tasks with mode changes), so I did not want that changing the
> scheduling parameters of a task affected the scheduling of the other
> tasks... But if a task exits the throttled state when I change its
> parameters, it might consume much more than the reserved CPU time.
> Also, I suspect this kind of approach can be exploited by malicious
> users: if I create a task with runtime 30ms and period 100ms, and I
> change its scheduling parameters (to runtime=29ms and back) frequently
> enough, I can consume much more than 30% of the CPU time...
> 
> Anyway, I am fine with every patch that fixes the bug :)

Deadline class requires root privileges. So, I do not see a problem
here. Please, see __sched_setscheduler().

If in the future we allow non-privileged users to increase deadline,
we will reflect that in __setparam_dl() too.

Thanks,
Kirill.



Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-06 Thread Luca Abeni
Hi Kirill,

On Tue, 06 Jan 2015 02:07:21 +0300
Kirill Tkhai  wrote:

> On Пн, 2015-01-05 at 16:21 +0100, Luca Abeni wrote:
[...]
> > For reference, I attach the patch I am using locally (based on what
> > I suggested in my previous mail) and seems to work fine here.
> > 
> > Based on your comments, I suspect my patch can be further
> > simplified by moving the call to init_dl_task_timer() in
> > __sched_fork().
> 
> It seems this way has problems. The first one is that task may become
> throttled again, and we will start dl_timer again.
Well, in my understanding if I change the parameters of a
SCHED_DEADLINE task when it is throttled, it stays throttled... So, the
task might not become throttled again before the dl timer fires.
So, I hoped this problem does not exist. But I might be wrong.

> The second is that
> it's better to minimize the number of combinations of situations we have.
> Let's keep only one combination: timer is set <-> task is throttled.
Yes, this was my goal too... So, if I change the parameters of a task
when it is throttled, I leave dl_throttled set to 1 and I leave the
timer active.

[...]
> > > @@ -3250,16 +3251,19 @@ static void
> > >   __setparam_dl(struct task_struct *p, const struct sched_attr
> > > *attr) {
> > >   struct sched_dl_entity *dl_se = &p->dl;
> > > +struct hrtimer *timer = &dl_se->dl_timer;
> > > +
> > > + if (!hrtimer_active(timer) ||
> > > hrtimer_try_to_cancel(timer) != -1) {
> > Just for the sake of curiosity, why trying to cancel the timer
> > ("|| hrtimer_try_to_cancel(timer)") here? If it is active, cannot
> > we leave it active (without touching dl_throttled, dl_new and
> > dl_yielded)?
> > 
> > I mean: if I try to change the parameters of a task when it is
> > throttled, I'd like it to stay throttled until the end of the
> > reservation period... Or am I missing something?
> 
> I think that when people change a task's parameters, they want the
> kernel to react immediately. For example, you want to kill a
> throttled deadline task. You change its parameters, but nothing happens.
> I think all developers hit this use case when they were debugging the
> deadline class.
I see... Different people have different requirements :)
My goal was to do something like adaptive scheduling (or scheduling
tasks with mode changes), so I did not want changing the scheduling
parameters of one task to affect the scheduling of the other
tasks... But if a task exits the throttled state when I change its
parameters, it might consume much more than the reserved CPU time.
Also, I suspect this kind of approach can be exploited by malicious
users: if I create a task with runtime 30ms and period 100ms, and I
change its scheduling parameters (to runtime=29ms and back) frequently
enough, I can consume much more than 30% of the CPU time...

Anyway, I am fine with every patch that fixes the bug :)



Thanks,
Luca


Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-05 Thread Kirill Tkhai
On Mon, 2015-01-05 at 16:21 +0100, Luca Abeni wrote:
> Hi Kirill,
> 
> On 01/04/2015 11:51 PM, Kirill Tkhai wrote:
> > Hi, Luca,
> >
> > I've just noticed this.
> [...]
> > I think we should do something like below.
> >
> > hrtimer_init() is already called in sched_fork(), so we shouldn't do this
> > twice. Also, we shouldn't reinit dl_throttled etc if timer is pending,
> > and we should prevent a race here.
> Thanks for the comments (I did not notice that hrtimer_init() was called 
> twice)
> and for the patch. I'll test it in the next days.
> 
> For reference, I attach the patch I am using locally (based on what I
> suggested in my previous mail); it seems to work fine here.
> 
> Based on your comments, I suspect my patch can be further simplified by moving
> the call to init_dl_task_timer() in __sched_fork().

It seems this way has problems. The first one is that the task may become throttled
again, and we will start dl_timer again. The second is that it's better to
minimize the number of combinations of situations we have. Let's keep only one
combination: timer is set <-> task is throttled. This makes further development
easier (fewer combinations to keep in mind).

> [...]
> > @@ -3250,16 +3251,19 @@ static void
> >   __setparam_dl(struct task_struct *p, const struct sched_attr *attr)
> >   {
> > struct sched_dl_entity *dl_se = &p->dl;
> > +   struct hrtimer *timer = &dl_se->dl_timer;
> > +
> > +   if (!hrtimer_active(timer) || hrtimer_try_to_cancel(timer) != -1) {
> Just for the sake of curiosity, why trying to cancel the timer
> ("|| hrtimer_try_to_cancel(timer)") here? If it is active, cannot we leave it
> active (without touching dl_throttled, dl_new and dl_yielded)?
> 
> I mean: if I try to change the parameters of a task when it is throttled,
> I'd like it to stay throttled until the end of the reservation period...
> Or am I missing something?

I think that when people change a task's parameters, they want the kernel
to react immediately. For example, you want to kill a throttled deadline task.
You change its parameters, but nothing happens. I think all developers hit this
use case when they were debugging the deadline class.

> 
> 
>   Thanks,
>   Luca
> 
> > +   dl_se->dl_throttled = 0;
> > +   dl_se->dl_new = 1;
> > +   dl_se->dl_yielded = 0;
> > +   }
> >
> > -   init_dl_task_timer(dl_se);
> > dl_se->dl_runtime = attr->sched_runtime;
> > dl_se->dl_deadline = attr->sched_deadline;
> > dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
> > dl_se->flags = attr->sched_flags;
> > dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
> > -   dl_se->dl_throttled = 0;
> > -   dl_se->dl_new = 1;
> > -   dl_se->dl_yielded = 0;
> >   }
> >
> >   /*
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index e5db8c6..8649bd6 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -500,7 +500,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
> >* updating (and the queueing back to dl_rq) will be done by the
> >* next call to enqueue_task_dl().
> >*/
> > -static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
> > +enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
> >   {
> > struct sched_dl_entity *dl_se = container_of(timer,
> >  struct sched_dl_entity,
> > @@ -559,14 +559,6 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
> > return HRTIMER_NORESTART;
> >   }
> >
> > -void init_dl_task_timer(struct sched_dl_entity *dl_se)
> > -{
> > -   struct hrtimer *timer = &dl_se->dl_timer;
> > -
> > -   hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> > -   timer->function = dl_task_timer;
> > -}
> > -
> >   static
> >   int dl_runtime_exceeded(struct rq *rq, struct sched_dl_entity *dl_se)
> >   {
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index 9a2a45c..ad3a2f0 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -1257,7 +1257,6 @@ extern void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 runtime
> >
> >   extern struct dl_bandwidth def_dl_bandwidth;
> >   extern void init_dl_bandwidth(struct dl_bandwidth *dl_b, u64 period, u64 runtime);
> > -extern void init_dl_task_timer(struct sched_dl_entity *dl_se);
> >
> >   unsigned long to_ratio(u64 period, u64 runtime);
> >
> >
> 

Please, try this one instead of the patch I sent yesterday.

[PATCH] sched/dl: Prevent initialization of already set dl_timer

__setparam_dl() may be called for a task which is already of
deadline class. If the task is throttled, we corrupt its dl_timer
when we are doing hrtimer_init() in init_dl_task_timer().

It seems that __setparam_dl() is not the place where we want
to use cancel_dl_timer(). It may unlock rq while the call chain
is

Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-05 Thread Luca Abeni

Hi Kirill,

On 01/04/2015 11:51 PM, Kirill Tkhai wrote:

Hi, Luca,

I've just noticed this.

[...]

I think we should do something like below.

hrtimer_init() is already called in sched_fork(), so we shouldn't do this
twice. Also, we shouldn't reinit dl_throttled etc if timer is pending,
and we should prevent a race here.

Thanks for the comments (I did not notice that hrtimer_init() was called twice)
and for the patch. I'll test it in the next days.

For reference, I attach the patch I am using locally (based on what I suggested
in my previous mail); it seems to work fine here.

Based on your comments, I suspect my patch can be further simplified by moving
the call to init_dl_task_timer() in __sched_fork().

[...]

@@ -3250,16 +3251,19 @@ static void
  __setparam_dl(struct task_struct *p, const struct sched_attr *attr)
  {
struct sched_dl_entity *dl_se = &p->dl;
+   struct hrtimer *timer = &dl_se->dl_timer;
+
+   if (!hrtimer_active(timer) || hrtimer_try_to_cancel(timer) != -1) {

Just for the sake of curiosity, why trying to cancel the timer
("|| hrtimer_try_to_cancel(timer)") here? If it is active, cannot we leave it
active (without touching dl_throttled, dl_new and dl_yielded)?

I mean: if I try to change the parameters of a task when it is throttled, I'd
like it to stay throttled until the end of the reservation period... Or am I
missing something?


Thanks,
Luca


+   dl_se->dl_throttled = 0;
+   dl_se->dl_new = 1;
+   dl_se->dl_yielded = 0;
+   }

-   init_dl_task_timer(dl_se);
dl_se->dl_runtime = attr->sched_runtime;
dl_se->dl_deadline = attr->sched_deadline;
dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
dl_se->flags = attr->sched_flags;
dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
-   dl_se->dl_throttled = 0;
-   dl_se->dl_new = 1;
-   dl_se->dl_yielded = 0;
  }

  /*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index e5db8c6..8649bd6 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -500,7 +500,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
   * updating (and the queueing back to dl_rq) will be done by the
   * next call to enqueue_task_dl().
   */
-static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
+enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
  {
struct sched_dl_entity *dl_se = container_of(timer,
 struct sched_dl_entity,
@@ -559,14 +559,6 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
return HRTIMER_NORESTART;
  }

-void init_dl_task_timer(struct sched_dl_entity *dl_se)
-{
-   struct hrtimer *timer = &dl_se->dl_timer;
-
-   hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
-   timer->function = dl_task_timer;
-}
-
  static
  int dl_runtime_exceeded(struct rq *rq, struct sched_dl_entity *dl_se)
  {
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 9a2a45c..ad3a2f0 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1257,7 +1257,6 @@ extern void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 runtime

  extern struct dl_bandwidth def_dl_bandwidth;
  extern void init_dl_bandwidth(struct dl_bandwidth *dl_b, u64 period, u64 runtime);
-extern void init_dl_task_timer(struct sched_dl_entity *dl_se);

  unsigned long to_ratio(u64 period, u64 runtime);




From 7a0e6747c40cf9186f3645eb94408090ab11936a Mon Sep 17 00:00:00 2001
From: Luca Abeni 
Date: Sat, 27 Dec 2014 18:20:57 +0100
Subject: [PATCH 03/11] Do not initialize the deadline timer if it is already
 initialized

After commit 67dfa1b756f250972bde31d65e3f8fde6aeddc5b changing the
parameters of a SCHED_DEADLINE tasks might crash the system. This
happens because 67dfa1b756f250972bde31d65e3f8fde6aeddc5b removed the
following lines from init_dl_task_timer():
-   if (hrtimer_active(timer)) {
-   hrtimer_try_to_cancel(timer);
-   return;
-   }

As a result, if sched_setattr() is invoked to change the parameters
(runtime or period) of a SCHED_DEADLINE task, init_dl_task_timer()
might end up initializing a pending timer.

Fix this problem by modifying __setparam_dl() to call init_dl_task_timer()
only if the task is not already a SCHED_DEADLINE one.
---
 kernel/sched/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b5797b7..470111c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3251,7 +3251,9 @@ __setparam_dl(struct task_struct *p, const struct sched_attr *attr)
 {
 	struct sched_dl_entity *dl_se = &p->dl;
 
-	init_dl_task_timer(dl_se);
+	if (p->sched_class != &dl_sched_class) {
+		init_dl_task_timer(dl_se);
+	}
 	dl_se->dl_runtime = attr->sched_runtime;
 	dl_se->dl_deadl

Re: Another SCHED_DEADLINE bug (with bisection and possible fix)

2015-01-04 Thread Kirill Tkhai
Hi, Luca,

I've just noticed this.

30.12.2014, 02:27, "luca abeni" :
> Hi all,
>
> when running some experiments on current git master, I noticed a
> regression respect to version 3.18 of the kernel: when invoking
> sched_setattr() to change the SCHED_DEADLINE parameters of a task that
> is already scheduled by SCHED_DEADLINE, it is possible to crash the
> system.
>
> The bug can be reproduced with this testcase:
> http://disi.unitn.it/~abeni/reclaiming/bug-test.tgz
> Uncompress it, enter the "Bug-Test" directory, and type "make test".
> After few cycles, my test machine (a laptop with an intel i7 CPU)
> becomes unusable, and freezes.
>
> Since I know that 3.18 is not affected by this bug, I tried a bisect,
> that pointed to commit 67dfa1b756f250972bde31d65e3f8fde6aeddc5b
> (sched/deadline: Implement cancel_dl_timer() to use in
> switched_from_dl()).
> By looking at that commit, I suspect the problem is that it removes the
> following lines from init_dl_task_timer():
> -   if (hrtimer_active(timer)) {
> -   hrtimer_try_to_cancel(timer);
> -   return;
> -   }
>
> As a result, when changing the parameters of a SCHED_DEADLINE task
> init_dl_task_timer() is invoked again, and it can initialize a pending
> timer (not sure why, but this seems to be the cause of the freezes I am
> seeing).
>
> So, I modified core.c::__setparam_dl() to invoke init_dl_task_timer()
> only if the task is not already scheduled by SCHED_DEADLINE...
> Basically, I changed
> init_dl_task_timer(dl_se);
> into
> if (p->sched_class != &dl_sched_class) {
> init_dl_task_timer(dl_se);
> }
>
> I am not sure if this is the correct fix, but with this change the
> kernel survives my test script (mentioned above), and arrives to 500
> cycles (without my patch, it crashes after 2 or 3 cycles).
>
> What do you think? Is my patch correct, or should I fix the issue in a
> different way?

I think we should do something like below.

hrtimer_init() is already called in sched_fork(), so we shouldn't do this
twice. Also, we shouldn't reinit dl_throttled etc if timer is pending,
and we should prevent a race here.

This passes your test (I waited for 30 cycles), but there's no SOB,
because I need a little time to check everything once again.

 include/linux/sched/deadline.h |  2 ++
 kernel/sched/core.c            | 12 
 kernel/sched/deadline.c        | 10 +-
 kernel/sched/sched.h           |  1 -
 4 files changed, 11 insertions(+), 14 deletions(-)
diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
index 9d303b8..c70a7fc 100644
--- a/include/linux/sched/deadline.h
+++ b/include/linux/sched/deadline.h
@@ -21,4 +21,6 @@ static inline int dl_task(struct task_struct *p)
return dl_prio(p->prio);
 }
 
+extern enum hrtimer_restart dl_task_timer(struct hrtimer *timer);
+
 #endif /* _SCHED_DEADLINE_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c0accc0..a388cc8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1840,6 +1840,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 
RB_CLEAR_NODE(&p->dl.rb_node);
hrtimer_init(&p->dl.dl_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+   p->dl.dl_timer.function = dl_task_timer;
__dl_clear_params(p);
 
INIT_LIST_HEAD(&p->rt.run_list);
@@ -3250,16 +3251,19 @@ static void
 __setparam_dl(struct task_struct *p, const struct sched_attr *attr)
 {
struct sched_dl_entity *dl_se = &p->dl;
+   struct hrtimer *timer = &dl_se->dl_timer;
+
+   if (!hrtimer_active(timer) || hrtimer_try_to_cancel(timer) != -1) {
+   dl_se->dl_throttled = 0;
+   dl_se->dl_new = 1;
+   dl_se->dl_yielded = 0;
+   }
 
-   init_dl_task_timer(dl_se);
dl_se->dl_runtime = attr->sched_runtime;
dl_se->dl_deadline = attr->sched_deadline;
dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
dl_se->flags = attr->sched_flags;
dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
-   dl_se->dl_throttled = 0;
-   dl_se->dl_new = 1;
-   dl_se->dl_yielded = 0;
 }
 
 /*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index e5db8c6..8649bd6 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -500,7 +500,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
  * updating (and the queueing back to dl_rq) will be done by the
  * next call to enqueue_task_dl().
  */
-static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
+enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 {
struct sched_dl_entity *dl_se = container_of(timer,
 struct sched_dl_entity,
@@ -559,14 +559,6 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
return HRTIMER_NORESTART;
 }
 
-void init_dl_task_timer(struct sched_