Re: [RFC PATCH v1 04/11] sched/idle: make the fast idle path for short idle periods

2017-07-12 Thread Peter Zijlstra
On Tue, Jul 11, 2017 at 06:33:55PM +0200, Frederic Weisbecker wrote:
> if (!tick_nohz_full_cpu(smp_processor_id()) &&
>     likely(predicted_idle_us < short_idle_threshold))
> 	cpuidle_fast();
> 
> Ugly but safer!

I'd not overly worry about this, cpuidle_fast() isn't anything that's
likely to ever happen.


Re: [RFC PATCH v1 04/11] sched/idle: make the fast idle path for short idle periods

2017-07-11 Thread Li, Aubrey


On 2017/7/12 13:03, Paul E. McKenney wrote:
> On Wed, Jul 12, 2017 at 11:19:59AM +0800, Li, Aubrey wrote:
>> On 2017/7/12 2:11, Paul E. McKenney wrote:
>>> On Tue, Jul 11, 2017 at 06:33:55PM +0200, Frederic Weisbecker wrote:
>>>> On Tue, Jul 11, 2017 at 05:58:47AM -0700, Paul E. McKenney wrote:
>>>>> On Mon, Jul 10, 2017 at 09:38:34AM +0800, Aubrey Li wrote:
>>>>>> From: Aubrey Li
>>>>>>
>>>>>> The system will enter a fast idle loop if the predicted idle period
>>>>>> is shorter than the threshold.
>>>>>> ---
>>>>>>  kernel/sched/idle.c | 9 ++++++++-
>>>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
>>>>>> index cf6c11f..16a766c 100644
>>>>>> --- a/kernel/sched/idle.c
>>>>>> +++ b/kernel/sched/idle.c
>>>>>> @@ -280,6 +280,8 @@ static void cpuidle_generic(void)
>>>>>>   */
>>>>>>  static void do_idle(void)
>>>>>>  {
>>>>>> +	unsigned int predicted_idle_us;
>>>>>> +	unsigned int short_idle_threshold = jiffies_to_usecs(1) / 2;
>>>>>>  	/*
>>>>>>  	 * If the arch has a polling bit, we maintain an invariant:
>>>>>>  	 *
>>>>>> @@ -291,7 +293,12 @@ static void do_idle(void)
>>>>>>
>>>>>>  	__current_set_polling();
>>>>>>
>>>>>> -	cpuidle_generic();
>>>>>> +	predicted_idle_us = cpuidle_predict();
>>>>>> +
>>>>>> +	if (likely(predicted_idle_us < short_idle_threshold))
>>>>>> +		cpuidle_fast();
>>>>>
>>>>> What if we get here from nohz_full usermode execution?  In that
>>>>> case, if I remember correctly, the scheduling-clock interrupt
>>>>> will still be disabled, and would have to be re-enabled before
>>>>> we could safely invoke cpuidle_fast().
>>>>>
>>>>> Or am I missing something here?
>>>>
>>>> That's a good point. It's partially ok because if the tick is needed
>>>> for something specific, it is not entirely stopped but programmed to that
>>>> deadline.
>>>>
>>>> Now there is some idle specific code when we enter dynticks-idle. See
>>>> tick_nohz_start_idle(), tick_nohz_stop_idle(), sched_clock_idle_wakeup_event()
>>>> and some subsystems that react differently when we enter dyntick idle
>>>> mode (scheduler_tick_max_deferment) so the tick may need a reevaluation.
>>>>
>>>> For now I'd rather suggest that we treat full nohz as an exception case here
>>>> and do:
>>>>
>>>> if (!tick_nohz_full_cpu(smp_processor_id()) &&
>>>>     likely(predicted_idle_us < short_idle_threshold))
>>>> 	cpuidle_fast();
>>>>
>>>> Ugly but safer!
>>>
>>> Works for me!
>>
>> I guess those who enabled full nohz (for example, the financial guys who need
>> the system to respond as fast as possible) won't like this compromise ;)
> 
> And some HPC guys and some real-time guys with CPU-bound real-time
> processing, so there are likely quite a few different views on this
> compromise.
> 
>> How about adding rcu_idle enter/exit back only for the full nohz case in fast idle?
>> RCU idle is the only risky operation to remove from the fast idle path.
>> Compared to adding RCU idle back, going to the normal idle path has more
>> overhead IMHO.
> 
> That might work, but I would need to see the actual patch.  Frederic
> Weisbecker should look at it as well.
> 
Okay, let me address the first round of comments and deliver v2 soon.

Thanks,
-Aubrey


Re: [RFC PATCH v1 04/11] sched/idle: make the fast idle path for short idle periods

2017-07-11 Thread Paul E. McKenney
On Wed, Jul 12, 2017 at 11:19:59AM +0800, Li, Aubrey wrote:
> On 2017/7/12 2:11, Paul E. McKenney wrote:
> > On Tue, Jul 11, 2017 at 06:33:55PM +0200, Frederic Weisbecker wrote:
> >> On Tue, Jul 11, 2017 at 05:58:47AM -0700, Paul E. McKenney wrote:
> >>> On Mon, Jul 10, 2017 at 09:38:34AM +0800, Aubrey Li wrote:
> >>>> From: Aubrey Li
> >>>>
> >>>> The system will enter a fast idle loop if the predicted idle period
> >>>> is shorter than the threshold.
> >>>> ---
> >>>>  kernel/sched/idle.c | 9 ++++++++-
> >>>>  1 file changed, 8 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> >>>> index cf6c11f..16a766c 100644
> >>>> --- a/kernel/sched/idle.c
> >>>> +++ b/kernel/sched/idle.c
> >>>> @@ -280,6 +280,8 @@ static void cpuidle_generic(void)
> >>>>   */
> >>>>  static void do_idle(void)
> >>>>  {
> >>>> +	unsigned int predicted_idle_us;
> >>>> +	unsigned int short_idle_threshold = jiffies_to_usecs(1) / 2;
> >>>>  	/*
> >>>>  	 * If the arch has a polling bit, we maintain an invariant:
> >>>>  	 *
> >>>> @@ -291,7 +293,12 @@ static void do_idle(void)
> >>>>
> >>>>  	__current_set_polling();
> >>>>
> >>>> -	cpuidle_generic();
> >>>> +	predicted_idle_us = cpuidle_predict();
> >>>> +
> >>>> +	if (likely(predicted_idle_us < short_idle_threshold))
> >>>> +		cpuidle_fast();
> >>>
> >>> What if we get here from nohz_full usermode execution?  In that
> >>> case, if I remember correctly, the scheduling-clock interrupt
> >>> will still be disabled, and would have to be re-enabled before
> >>> we could safely invoke cpuidle_fast().
> >>>
> >>> Or am I missing something here?
> >>
> >> That's a good point. It's partially ok because if the tick is needed
> >> for something specific, it is not entirely stopped but programmed to that
> >> deadline.
> >>
> >> Now there is some idle specific code when we enter dynticks-idle. See
> >> tick_nohz_start_idle(), tick_nohz_stop_idle(), sched_clock_idle_wakeup_event()
> >> and some subsystems that react differently when we enter dyntick idle
> >> mode (scheduler_tick_max_deferment) so the tick may need a reevaluation.
> >>
> >> For now I'd rather suggest that we treat full nohz as an exception case here
> >> and do:
> >>
> >> if (!tick_nohz_full_cpu(smp_processor_id()) &&
> >>     likely(predicted_idle_us < short_idle_threshold))
> >> 	cpuidle_fast();
> >>
> >> Ugly but safer!
> > 
> > Works for me!
> 
> I guess those who enabled full nohz (for example, the financial guys who need
> the system to respond as fast as possible) won't like this compromise ;)

And some HPC guys and some real-time guys with CPU-bound real-time
processing, so there are likely quite a few different views on this
compromise.

> How about adding rcu_idle enter/exit back only for the full nohz case in fast idle?
> RCU idle is the only risky operation to remove from the fast idle path.
> Compared to adding RCU idle back, going to the normal idle path has more
> overhead IMHO.

That might work, but I would need to see the actual patch.  Frederic
Weisbecker should look at it as well.

Thanx, Paul



Re: [RFC PATCH v1 04/11] sched/idle: make the fast idle path for short idle periods

2017-07-11 Thread Li, Aubrey
On 2017/7/12 2:11, Paul E. McKenney wrote:
> On Tue, Jul 11, 2017 at 06:33:55PM +0200, Frederic Weisbecker wrote:
>> On Tue, Jul 11, 2017 at 05:58:47AM -0700, Paul E. McKenney wrote:
>>> On Mon, Jul 10, 2017 at 09:38:34AM +0800, Aubrey Li wrote:
>>>> From: Aubrey Li
>>>>
>>>> The system will enter a fast idle loop if the predicted idle period
>>>> is shorter than the threshold.
>>>> ---
>>>>  kernel/sched/idle.c | 9 ++++++++-
>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
>>>> index cf6c11f..16a766c 100644
>>>> --- a/kernel/sched/idle.c
>>>> +++ b/kernel/sched/idle.c
>>>> @@ -280,6 +280,8 @@ static void cpuidle_generic(void)
>>>>   */
>>>>  static void do_idle(void)
>>>>  {
>>>> +	unsigned int predicted_idle_us;
>>>> +	unsigned int short_idle_threshold = jiffies_to_usecs(1) / 2;
>>>>  	/*
>>>>  	 * If the arch has a polling bit, we maintain an invariant:
>>>>  	 *
>>>> @@ -291,7 +293,12 @@ static void do_idle(void)
>>>>
>>>>  	__current_set_polling();
>>>>
>>>> -	cpuidle_generic();
>>>> +	predicted_idle_us = cpuidle_predict();
>>>> +
>>>> +	if (likely(predicted_idle_us < short_idle_threshold))
>>>> +		cpuidle_fast();
>>>
>>> What if we get here from nohz_full usermode execution?  In that
>>> case, if I remember correctly, the scheduling-clock interrupt
>>> will still be disabled, and would have to be re-enabled before
>>> we could safely invoke cpuidle_fast().
>>>
>>> Or am I missing something here?
>>
>> That's a good point. It's partially ok because if the tick is needed
>> for something specific, it is not entirely stopped but programmed to that
>> deadline.
>>
>> Now there is some idle specific code when we enter dynticks-idle. See
>> tick_nohz_start_idle(), tick_nohz_stop_idle(), sched_clock_idle_wakeup_event()
>> and some subsystems that react differently when we enter dyntick idle
>> mode (scheduler_tick_max_deferment) so the tick may need a reevaluation.
>>
>> For now I'd rather suggest that we treat full nohz as an exception case here
>> and do:
>>
>> if (!tick_nohz_full_cpu(smp_processor_id()) &&
>>     likely(predicted_idle_us < short_idle_threshold))
>> 	cpuidle_fast();
>>
>> Ugly but safer!
> 
> Works for me!
>

I guess those who enabled full nohz (for example, the financial guys who need
the system to respond as fast as possible) won't like this compromise ;)

How about adding rcu_idle enter/exit back only for the full nohz case in fast idle?
RCU idle is the only risky operation to remove from the fast idle path.
Compared to adding RCU idle back, going to the normal idle path has more
overhead IMHO.

Thanks,
-Aubrey
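Editorial sketch: one possible reading of the proposal above, modeled in
userspace C. The thread does not show the actual v2 patch, so everything here
is an assumption: `struct idle_trace`, `fast_idle_model()` and the
`cpu_is_nohz_full` flag are hypothetical names, and the counters merely stand
in for the kernel's RCU idle enter/exit calls and the body of cpuidle_fast().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping: counts what the modeled fast path did. */
struct idle_trace {
	unsigned int rcu_enter;	/* stands in for an RCU idle-enter call */
	unsigned int rcu_exit;	/* stands in for an RCU idle-exit call */
	unsigned int fast_idle;	/* stands in for the body of cpuidle_fast() */
};

/*
 * Model of the idea: every CPU may take the fast idle path, but only
 * nohz_full CPUs pay for the RCU idle enter/exit pair around it.
 */
void fast_idle_model(bool cpu_is_nohz_full, struct idle_trace *t)
{
	assert(t != NULL);

	if (cpu_is_nohz_full)
		t->rcu_enter++;

	t->fast_idle++;

	if (cpu_is_nohz_full)
		t->rcu_exit++;
}
```

In this model a regular CPU records only the fast idle step, while a
nohz_full CPU records a balanced enter/exit pair around it. Whether that is
sufficient for a tick-stopped CPU is exactly what the reviewers ask to see
in the actual patch.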


Re: [RFC PATCH v1 04/11] sched/idle: make the fast idle path for short idle periods

2017-07-11 Thread Paul E. McKenney
On Tue, Jul 11, 2017 at 06:33:55PM +0200, Frederic Weisbecker wrote:
> On Tue, Jul 11, 2017 at 05:58:47AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 10, 2017 at 09:38:34AM +0800, Aubrey Li wrote:
> > > From: Aubrey Li 
> > > 
> > > The system will enter a fast idle loop if the predicted idle period
> > > is shorter than the threshold.
> > > ---
> > >  kernel/sched/idle.c | 9 ++++++++-
> > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> > > index cf6c11f..16a766c 100644
> > > --- a/kernel/sched/idle.c
> > > +++ b/kernel/sched/idle.c
> > > @@ -280,6 +280,8 @@ static void cpuidle_generic(void)
> > >   */
> > >  static void do_idle(void)
> > >  {
> > > +	unsigned int predicted_idle_us;
> > > +	unsigned int short_idle_threshold = jiffies_to_usecs(1) / 2;
> > >  	/*
> > >  	 * If the arch has a polling bit, we maintain an invariant:
> > >  	 *
> > > @@ -291,7 +293,12 @@ static void do_idle(void)
> > > 
> > >  	__current_set_polling();
> > > 
> > > -	cpuidle_generic();
> > > +	predicted_idle_us = cpuidle_predict();
> > > +
> > > +	if (likely(predicted_idle_us < short_idle_threshold))
> > > +		cpuidle_fast();
> > 
> > What if we get here from nohz_full usermode execution?  In that
> > case, if I remember correctly, the scheduling-clock interrupt
> > will still be disabled, and would have to be re-enabled before
> > we could safely invoke cpuidle_fast().
> > 
> > Or am I missing something here?
> 
> That's a good point. It's partially ok because if the tick is needed
> for something specific, it is not entirely stopped but programmed to that
> deadline.
> 
> Now there is some idle specific code when we enter dynticks-idle. See
> tick_nohz_start_idle(), tick_nohz_stop_idle(), sched_clock_idle_wakeup_event()
> and some subsystems that react differently when we enter dyntick idle
> mode (scheduler_tick_max_deferment) so the tick may need a reevaluation.
> 
> For now I'd rather suggest that we treat full nohz as an exception case here
> and do:
> 
> if (!tick_nohz_full_cpu(smp_processor_id()) &&
>     likely(predicted_idle_us < short_idle_threshold))
> 	cpuidle_fast();
> 
> Ugly but safer!

Works for me!

Thanx, Paul



Re: [RFC PATCH v1 04/11] sched/idle: make the fast idle path for short idle periods

2017-07-11 Thread Frederic Weisbecker
On Tue, Jul 11, 2017 at 05:58:47AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 10, 2017 at 09:38:34AM +0800, Aubrey Li wrote:
> > From: Aubrey Li 
> > 
> > The system will enter a fast idle loop if the predicted idle period
> > is shorter than the threshold.
> > ---
> >  kernel/sched/idle.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> > index cf6c11f..16a766c 100644
> > --- a/kernel/sched/idle.c
> > +++ b/kernel/sched/idle.c
> > @@ -280,6 +280,8 @@ static void cpuidle_generic(void)
> >   */
> >  static void do_idle(void)
> >  {
> > +	unsigned int predicted_idle_us;
> > +	unsigned int short_idle_threshold = jiffies_to_usecs(1) / 2;
> >  	/*
> >  	 * If the arch has a polling bit, we maintain an invariant:
> >  	 *
> > @@ -291,7 +293,12 @@ static void do_idle(void)
> > 
> >  	__current_set_polling();
> > 
> > -	cpuidle_generic();
> > +	predicted_idle_us = cpuidle_predict();
> > +
> > +	if (likely(predicted_idle_us < short_idle_threshold))
> > +		cpuidle_fast();
> 
> What if we get here from nohz_full usermode execution?  In that
> case, if I remember correctly, the scheduling-clock interrupt
> will still be disabled, and would have to be re-enabled before
> we could safely invoke cpuidle_fast().
> 
> Or am I missing something here?

That's a good point. It's partially ok because if the tick is needed
for something specific, it is not entirely stopped but programmed to that
deadline.

Now there is some idle specific code when we enter dynticks-idle. See
tick_nohz_start_idle(), tick_nohz_stop_idle(), sched_clock_idle_wakeup_event()
and some subsystems that react differently when we enter dyntick idle
mode (scheduler_tick_max_deferment) so the tick may need a reevaluation.

For now I'd rather suggest that we treat full nohz as an exception case here
and do:

if (!tick_nohz_full_cpu(smp_processor_id()) &&
    likely(predicted_idle_us < short_idle_threshold))
	cpuidle_fast();

Ugly but safer!

Thanks.
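Editorial sketch: the suggested exception can be read as a two-input
decision. The userspace C below is only an illustration of that reading;
`enum idle_path`, `pick_idle_path()` and the `cpu_is_nohz_full` flag are
hypothetical names, with the flag standing in for
tick_nohz_full_cpu(smp_processor_id()).

```c
#include <stdbool.h>

/* Hypothetical labels for the two idle paths discussed in the thread. */
enum idle_path {
	IDLE_PATH_GENERIC,	/* stands in for cpuidle_generic() */
	IDLE_PATH_FAST		/* stands in for cpuidle_fast() */
};

/*
 * The fast path is taken only when the CPU is not nohz_full AND the
 * predicted idle period is below the threshold. Everything else,
 * including every nohz_full CPU, falls back to the generic path.
 */
enum idle_path pick_idle_path(bool cpu_is_nohz_full,
			      unsigned int predicted_idle_us,
			      unsigned int short_idle_threshold)
{
	if (!cpu_is_nohz_full && predicted_idle_us < short_idle_threshold)
		return IDLE_PATH_FAST;

	return IDLE_PATH_GENERIC;
}
```

Under this reading a nohz_full CPU never reaches the fast path, however
short the predicted idle period is, which is the cost the follow-up
messages in the thread object to.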


Re: [RFC PATCH v1 04/11] sched/idle: make the fast idle path for short idle periods

2017-07-11 Thread Paul E. McKenney
On Mon, Jul 10, 2017 at 09:38:34AM +0800, Aubrey Li wrote:
> From: Aubrey Li 
> 
> The system will enter a fast idle loop if the predicted idle period
> is shorter than the threshold.
> ---
>  kernel/sched/idle.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index cf6c11f..16a766c 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -280,6 +280,8 @@ static void cpuidle_generic(void)
>   */
>  static void do_idle(void)
>  {
> +	unsigned int predicted_idle_us;
> +	unsigned int short_idle_threshold = jiffies_to_usecs(1) / 2;
>  	/*
>  	 * If the arch has a polling bit, we maintain an invariant:
>  	 *
> @@ -291,7 +293,12 @@ static void do_idle(void)
> 
>  	__current_set_polling();
> 
> -	cpuidle_generic();
> +	predicted_idle_us = cpuidle_predict();
> +
> +	if (likely(predicted_idle_us < short_idle_threshold))
> +		cpuidle_fast();

What if we get here from nohz_full usermode execution?  In that
case, if I remember correctly, the scheduling-clock interrupt
will still be disabled, and would have to be re-enabled before
we could safely invoke cpuidle_fast().

Or am I missing something here?

Thanx, Paul

> +	else
> +		cpuidle_generic();
> 
>  	__current_clr_polling();
> 
> -- 
> 2.7.4
> 



[RFC PATCH v1 04/11] sched/idle: make the fast idle path for short idle periods

2017-07-09 Thread Aubrey Li
From: Aubrey Li 

The system will enter a fast idle loop if the predicted idle period
is shorter than the threshold.
---
 kernel/sched/idle.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index cf6c11f..16a766c 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -280,6 +280,8 @@ static void cpuidle_generic(void)
  */
 static void do_idle(void)
 {
+	unsigned int predicted_idle_us;
+	unsigned int short_idle_threshold = jiffies_to_usecs(1) / 2;
 	/*
 	 * If the arch has a polling bit, we maintain an invariant:
 	 *
@@ -291,7 +293,12 @@ static void do_idle(void)
 
 	__current_set_polling();
 
-	cpuidle_generic();
+	predicted_idle_us = cpuidle_predict();
+
+	if (likely(predicted_idle_us < short_idle_threshold))
+		cpuidle_fast();
+	else
+		cpuidle_generic();
 
 	__current_clr_polling();
 
 
-- 
2.7.4
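Editorial sketch: to make the threshold in the patch concrete, the userspace
C below models `jiffies_to_usecs(1) / 2` as half of one tick period. The
helper names (`usecs_per_jiffy`, `short_idle_threshold_us`, `use_fast_idle`)
and the `hz` parameter are assumptions for illustration, not kernel code,
and the plain division only approximates the kernel's conversion for tick
rates that do not divide one million evenly.

```c
#include <assert.h>

/*
 * Hypothetical userspace stand-in for jiffies_to_usecs(1):
 * microseconds per tick for a tick rate given in Hz.
 */
unsigned int usecs_per_jiffy(unsigned int hz)
{
	assert(hz > 0);
	return 1000000u / hz;
}

/* Half a tick, mirroring "jiffies_to_usecs(1) / 2" in the patch. */
unsigned int short_idle_threshold_us(unsigned int hz)
{
	return usecs_per_jiffy(hz) / 2;
}

/*
 * Returns 1 when the modeled do_idle() would pick the fast path,
 * 0 when it would fall back to the generic path.
 */
int use_fast_idle(unsigned int predicted_idle_us, unsigned int hz)
{
	return predicted_idle_us < short_idle_threshold_us(hz);
}
```

With a 1000 Hz tick this gives a 500 us threshold, and with a 250 Hz tick
2000 us, so the same predicted idle period can take different paths
depending on the configured tick rate.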


