Re: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-12 Thread Rafael J. Wysocki
On Monday, March 12, 2018 12:02:14 AM CET Doug Smythies wrote:
> On 2018.03.11 08:52 Doug Smythies wrote:
> > On 2018.03.11 03:22 Rafael J. Wysocki wrote:
> >> On Sunday, March 11, 2018 8:43:02 AM CET Doug Smythies wrote:
> >>> On 2018.03.10 15:55 Rafael J. Wysocki wrote: 
>  On Saturday, March 10, 2018 5:07:36 PM CET Doug Smythies wrote:
> > On 2018.03.10 01:00 Rafael J. Wysocki wrote:
> >>>
> >>> ... [snip] ...
> >>> 
>  The information that they often spend more time than a tick
>  period in state 0 in one go *is* relevant, though.
> 
> 
>  That issue can be dealt with in a couple of ways and the patch below is a
>  rather straightforward attempt to do that.  The idea, basically, is to discard
>  the result of governor prediction if the tick has been stopped already and
>  the predicted idle duration is within the tick range.
> >>>
>  Please try it on top of the v3 and tell me if you see an improvement.
> >>> 
> >>> It seems pretty good so far.
> >>> See a new line added to the previous graph, "rjwv3plus".
> >>> 
> >>> http://fast.smythies.com/rjwv3plus_100.png
> >>
> >> OK, cool!
> >>
> >> Below is a respin of the last patch which also prevents shallow states from
> >> being chosen due to interactivity_req when the tick is stopped.
> >>
> >> You may also add a poll_idle() fix I've just posted:
> >>
> >> https://patchwork.kernel.org/patch/10274595/
> >>
> >> on top of this.  It makes quite a bit of a difference for me. :-)
> >
> > I will add and test, but I already know from testing previous versions
> > of this patch, from Rik van Riel and myself, that the results will be
> > awesome.
> 
> And the results are indeed awesome.
> 
> A four-hour 100% load on one CPU test was run, with trace; however,
> there is nothing to report, as everything is great.
> 
> The same graph as the last couple of days, with a new line added for
> V3 + the respin of patch 7 of 6 + the poll-idle fix, called rjwv3pp,
> is here:
> 
> http://fast.smythies.com/rjwv3pp_100.png

That looks great, thank you!



RE: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-11 Thread Doug Smythies
On 2018.03.11 08:52 Doug Smythies wrote:
> On 2018.03.11 03:22 Rafael J. Wysocki wrote:
>> On Sunday, March 11, 2018 8:43:02 AM CET Doug Smythies wrote:
>>> On 2018.03.10 15:55 Rafael J. Wysocki wrote: 
 On Saturday, March 10, 2018 5:07:36 PM CET Doug Smythies wrote:
> On 2018.03.10 01:00 Rafael J. Wysocki wrote:
>>>
>>> ... [snip] ...
>>> 
 The information that they often spend more time than a tick
 period in state 0 in one go *is* relevant, though.


 That issue can be dealt with in a couple of ways and the patch below is a
 rather straightforward attempt to do that.  The idea, basically, is to discard
 the result of governor prediction if the tick has been stopped already and
 the predicted idle duration is within the tick range.
>>>
 Please try it on top of the v3 and tell me if you see an improvement.
>>> 
>>> It seems pretty good so far.
>>> See a new line added to the previous graph, "rjwv3plus".
>>> 
>>> http://fast.smythies.com/rjwv3plus_100.png
>>
>> OK, cool!
>>
>> Below is a respin of the last patch which also prevents shallow states from
>> being chosen due to interactivity_req when the tick is stopped.
>>
>> You may also add a poll_idle() fix I've just posted:
>>
>> https://patchwork.kernel.org/patch/10274595/
>>
>> on top of this.  It makes quite a bit of a difference for me. :-)
>
> I will add and test, but I already know from testing previous versions
> of this patch, from Rik van Riel and myself, that the results will be
> awesome.

And the results are indeed awesome.

A four-hour 100% load on one CPU test was run, with trace; however,
there is nothing to report, as everything is great.

The same graph as the last couple of days, with a new line added for
V3 + the respin of patch 7 of 6 + the poll-idle fix, called rjwv3pp,
is here:

http://fast.smythies.com/rjwv3pp_100.png

... Doug




RE: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-11 Thread Doug Smythies
On 2018.03.11 03:22 Rafael J. Wysocki wrote:
> On Sunday, March 11, 2018 8:43:02 AM CET Doug Smythies wrote:
>> On 2018.03.10 15:55 Rafael J. Wysocki wrote: 
>>>On Saturday, March 10, 2018 5:07:36 PM CET Doug Smythies wrote:
 On 2018.03.10 01:00 Rafael J. Wysocki wrote:
>>>
>> ... [snip] ...
>> 
>>> The information that they often spend more time than a tick
 period in state 0 in one go *is* relevant, though.
>>>
>>>
>>> That issue can be dealt with in a couple of ways and the patch below is a
>>> rather straightforward attempt to do that.  The idea, basically, is to discard
>>> the result of governor prediction if the tick has been stopped already and
>>> the predicted idle duration is within the tick range.
>>>
>>> Please try it on top of the v3 and tell me if you see an improvement.
>> 
>> It seems pretty good so far.
>> See a new line added to the previous graph, "rjwv3plus".
>> 
>> http://fast.smythies.com/rjwv3plus_100.png
>
> OK, cool!
>
> Below is a respin of the last patch which also prevents shallow states from
> being chosen due to interactivity_req when the tick is stopped.
>
> You may also add a poll_idle() fix I've just posted:
>
> https://patchwork.kernel.org/patch/10274595/
>
> on top of this.  It makes quite a bit of a difference for me. :-)

I will add and test, but I already know from testing previous versions
of this patch, from Rik van Riel and myself, that the results will be
awesome.

>
>> I'll do another 100% load on one CPU test overnight, this time with
>> a trace.

The only thing I'll add from the 7 hour overnight test with trace is that
there were 0 occurrences of excessive time spent in idle states above 0.
The histograms show those idle states almost entirely limited to
one tick time (I am using a 1000 Hz kernel, so one tick is 1000 uSec). Exceptions:

Idle State: 3  CPU: 0: 1 occurrence of 1790 uSec (which is O.K. anyhow)
Idle State: 3  CPU: 6: 1 occurrence of 2372 uSec (which is O.K. anyhow)

... Doug




Re: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-11 Thread Rafael J. Wysocki
On Sunday, March 11, 2018 11:21:36 AM CET Rafael J. Wysocki wrote:
> On Sunday, March 11, 2018 8:43:02 AM CET Doug Smythies wrote:
> > On 2018.03.10 15:55 Rafael J. Wysocki wrote: 
> > >On Saturday, March 10, 2018 5:07:36 PM CET Doug Smythies wrote:
> > >> On 2018.03.10 01:00 Rafael J. Wysocki wrote:
> > >
> > ... [snip] ...
> > 
> > > The information that they often spend more time than a tick
> > > period in state 0 in one go *is* relevant, though.
> > >
> > >
> > > That issue can be dealt with in a couple of ways and the patch below is a
> > > rather straightforward attempt to do that.  The idea, basically, is to discard
> > > the result of governor prediction if the tick has been stopped already and
> > > the predicted idle duration is within the tick range.
> > >
> > > Please try it on top of the v3 and tell me if you see an improvement.
> > 
> > It seems pretty good so far.
> > See a new line added to the previous graph, "rjwv3plus".
> > 
> > http://fast.smythies.com/rjwv3plus_100.png
> 
> OK, cool!
> 
> Below is a respin of the last patch which also prevents shallow states from
> being chosen due to interactivity_req when the tick is stopped.

Actually appending the patch this time, sorry.

> You may also add a poll_idle() fix I've just posted:
> 
> https://patchwork.kernel.org/patch/10274595/
> 
> on top of this.  It makes quite a bit of a difference for me. :-)
> 
> > I'll do another 100% load on one CPU test overnight, this time with
> > a trace.
> 
> Thanks!

---
From: Rafael J. Wysocki 
Subject: [PATCH] cpuidle: menu: Avoid selecting shallow states with stopped tick

If the scheduler tick has been stopped already and the governor
selects a shallow idle state, the CPU can spend a long time in that
state if the selection is based on an inaccurate prediction of idle
period duration.  That effect turns out to occur relatively often,
so it needs to be mitigated.

To that end, modify the menu governor to discard the result of the
idle period duration prediction if it is less than the tick period
duration and the tick is stopped, unless the tick timer is going to
expire soon.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/cpuidle/governors/menu.c |   26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

Index: linux-pm/drivers/cpuidle/governors/menu.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/menu.c
+++ linux-pm/drivers/cpuidle/governors/menu.c
@@ -297,6 +297,7 @@ static int menu_select(struct cpuidle_dr
unsigned long nr_iowaiters, cpu_load;
int resume_latency = dev_pm_qos_raw_read_value(device);
ktime_t tick_time;
+   unsigned int tick_us;
 
if (data->needs_update) {
menu_update(drv, dev);
@@ -315,6 +316,7 @@ static int menu_select(struct cpuidle_dr
 
/* determine the expected residency time, round up */
data->next_timer_us = ktime_to_us(tick_nohz_get_sleep_length(&tick_time));
+   tick_us = ktime_to_us(tick_time);
 
get_iowait_load(&nr_iowaiters, &cpu_load);
data->bucket = which_bucket(data->next_timer_us, nr_iowaiters);
@@ -354,12 +356,24 @@ static int menu_select(struct cpuidle_dr
data->predicted_us = min(data->predicted_us, expected_interval);
 
/*
-* Use the performance multiplier and the user-configurable
-* latency_req to determine the maximum exit latency.
+* If the tick is already stopped, the cost of possible misprediction is
+* much higher, because the CPU may be stuck in a shallow idle state for
+* a long time as a result of it.  For this reason, if that happens say
+* we might mispredict and try to force the CPU into a state for which
+* we would have stopped the tick, unless the tick timer is going to
+* expire really soon anyway.
 */
-   interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
-   if (latency_req > interactivity_req)
-   latency_req = interactivity_req;
+   if (tick_nohz_tick_stopped() && data->predicted_us < TICK_USEC_HZ) {
+   data->predicted_us = min_t(unsigned int, TICK_USEC_HZ, tick_us);
+   } else {
+   /*
+* Use the performance multiplier and the user-configurable
+* latency_req to determine the maximum exit latency.
+*/
+   interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
+   if (latency_req > interactivity_req)
+   latency_req = interactivity_req;
+   }
 
/*
 * Find the idle state with the lowest power while satisfying
@@ -403,8 +417,6 @@ static int menu_select(struct cpuidle_dr
 */
if (first_idx > idx &&
drv->states[first_idx].target_residency < TICK_USEC_HZ) {
-   unsigned int tick_us = 
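
The diff above is cut off, but the decision it adds boils down to roughly the
sketch below.  This is an illustration only, not the actual menu.c code:
menu_adjust_prediction(), tick_period_us and time_to_tick_us are made-up names
standing in for TICK_USEC_HZ and the tick_us value obtained from
tick_nohz_get_sleep_length().

#include <stdio.h>

/*
 * Illustrative sketch: if the tick has already been stopped and the
 * governor's prediction is shorter than the tick period, distrust the
 * prediction and use the tick period instead, capped by the time left
 * until the tick timer actually fires, so that a deep enough idle state
 * gets selected.
 */
static unsigned int menu_adjust_prediction(int tick_stopped,
                                           unsigned int predicted_us,
                                           unsigned int tick_period_us,
                                           unsigned int time_to_tick_us)
{
        if (tick_stopped && predicted_us < tick_period_us) {
                /* Equivalent of min_t(unsigned int, TICK_USEC_HZ, tick_us). */
                return time_to_tick_us < tick_period_us ?
                        time_to_tick_us : tick_period_us;
        }
        return predicted_us;
}

int main(void)
{
        /* Tick stopped, 150 us predicted: bumped up to the 1000 us tick period. */
        printf("%u\n", menu_adjust_prediction(1, 150, 1000, 5000));
        /* Tick still running: the 150 us prediction is used as-is. */
        printf("%u\n", menu_adjust_prediction(0, 150, 1000, 5000));
        return 0;
}

In other words, a short prediction is only trusted while the tick is still
running; once the tick has been stopped, the selection falls back to the tick
period (or to the actual time left until the next timer event, if that is
shorter).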

Re: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-11 Thread Rafael J. Wysocki
On Sunday, March 11, 2018 8:43:02 AM CET Doug Smythies wrote:
> On 2018.03.10 15:55 Rafael J. Wysocki wrote: 
> >On Saturday, March 10, 2018 5:07:36 PM CET Doug Smythies wrote:
> >> On 2018.03.10 01:00 Rafael J. Wysocki wrote:
> >
> ... [snip] ...
> 
> > The information that they often spend more time than a tick
> > period in state 0 in one go *is* relevant, though.
> >
> >
> > That issue can be dealt with in a couple of ways and the patch below is a
> > rather straightforward attempt to do that.  The idea, basically, is to discard
> > the result of governor prediction if the tick has been stopped already and
> > the predicted idle duration is within the tick range.
> >
> > Please try it on top of the v3 and tell me if you see an improvement.
> 
> It seems pretty good so far.
> See a new line added to the previous graph, "rjwv3plus".
> 
> http://fast.smythies.com/rjwv3plus_100.png

OK, cool!

Below is a respin of the last patch which also prevents shallow states from
being chosen due to interactivity_req when the tick is stopped.

You may also add a poll_idle() fix I've just posted:

https://patchwork.kernel.org/patch/10274595/

on top of this.  It makes quite a bit of a difference for me. :-)

> I'll do another 100% load on one CPU test overnight, this time with
> a trace.

Thanks!



RE: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-10 Thread Doug Smythies
On 2018.03.10 15:55 Rafael J. Wysocki wrote: 
>On Saturday, March 10, 2018 5:07:36 PM CET Doug Smythies wrote:
>> On 2018.03.10 01:00 Rafael J. Wysocki wrote:
>
... [snip] ...

> The information that they often spend more time than a tick
> period in state 0 in one go *is* relevant, though.
>
>
> That issue can be dealt with in a couple of ways and the patch below is a
> rather straightforward attempt to do that.  The idea, basically, is to discard
> the result of governor prediction if the tick has been stopped already and
> the predicted idle duration is within the tick range.
>
> Please try it on top of the v3 and tell me if you see an improvement.

It seems pretty good so far.
See a new line added to the previous graph, "rjwv3plus".

http://fast.smythies.com/rjwv3plus_100.png

I'll do another 100% load on one CPU test overnight, this time with
a trace.

... Doug




Re: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-10 Thread Rafael J. Wysocki
On Saturday, March 10, 2018 5:07:36 PM CET Doug Smythies wrote:
> On 2018.03.10 01:00 Rafael J. Wysocki wrote:
> > On Saturday, March 10, 2018 8:41:39 AM CET Doug Smythies wrote:
> >> 
> >> With apologies to those that do not like the term "PowerNightmares",
> >
> > OK, and what exactly do you count as "PowerNightmares"?
> 
> I'll snip some below and then explain:
> 
> ...[snip]...
> 
> >> 
> >> Kernel 4.16-rc4: Summary: Average processor package power 27.41 watts
> >> 
>  Idle State 0: Total Entries: 9096 : PowerNightmares: 6540 : Not PN time (seconds): 0.051532 : PN time: 7886.309553 : Ratio: 153037.133492
>  Idle State 1: Total Entries: 28731 : PowerNightmares: 215 : Not PN time (seconds): 0.211999 : PN time: 77.395467 : Ratio: 365.074679
>  Idle State 2: Total Entries: 4474 : PowerNightmares: 97 : Not PN time (seconds): 1.959059 : PN time: 0.874112 : Ratio: 0.446190
>  Idle State 3: Total Entries: 2319 : PowerNightmares: 0 : Not PN time (seconds): 1.663376 : PN time: 0.00 : Ratio: 0.00
> 
> O.K. let's go deeper than the summary, and focus on idle state 0, which has 
> been my area of interest in this saga.
> 
> Idle State 0:
> CPU: 0: Entries: 2093 : PowerNightmares: 1136 : Not PN time (seconds): 0.024840 : PN time: 1874.417840 : Ratio: 75459.655439
> CPU: 1: Entries: 1051 : PowerNightmares: 721 : Not PN time (seconds): 0.004668 : PN time: 198.845193 : Ratio: 42597.513425
> CPU: 2: Entries: 759 : PowerNightmares: 583 : Not PN time (seconds): 0.003299 : PN time: 1099.069256 : Ratio: 333152.246028
> CPU: 3: Entries: 1033 : PowerNightmares: 1008 : Not PN time (seconds): 0.000361 : PN time: 1930.340683 : Ratio: 5347203.995237
> CPU: 4: Entries: 1310 : PowerNightmares: 1025 : Not PN time (seconds): 0.006214 : PN time: 1332.497114 : Ratio: 214434.682950
> CPU: 5: Entries: 1097 : PowerNightmares: 848 : Not PN time (seconds): 0.005029 : PN time: 785.366864 : Ratio: 156167.601340
> CPU: 6: Entries: 1753 : PowerNightmares: 1219 : Not PN time (seconds): 0.007121 : PN time: 665.772603 : Ratio: 93494.256958
> 
> Note: CPU 7 is busy and doesn't go into idle at all.
> 
> And also look at the histograms of the times spent in idle state 0:
> CPU 3 might be the most interesting:
> 
> Idle State: 0  CPU: 3:
> 4 1
> 5 3
> 7 2
> 11 1
> 12 1
> 13 2
> 14 3
> 15 3
> 17 4
> 18 1
> 19 2
> 28 2
> 7563 1
> 8012 1
>  1006
> 
> Where:
> Column 1 is the time in microseconds it was in that idle state
> up to  microseconds, which includes anything more.
> Column 2 is the number of occurrences of that time.
> 
> Notice that 1008 times out of the 1033, it spent an excessive amount
> of time in idle state 0, leading to excessive power consumption.
> I adopted Thomas Ilsche's "Powernightmare" term for this several
> months ago.
> 
> This CPU 3 example was pretty clear, but sometimes it is not so
> obvious. I admit that my thresholds for whether or not something is a
> "powernightmare" are somewhat arbitrary, and I'll change them to
> whatever anybody wants. Currently:
> 
> #define THRESHOLD_0 100   /* Idle state 0 PowerNightmare threshold in microseconds */
> #define THRESHOLD_1 1000  /* Idle state 1 PowerNightmare threshold in microseconds */
> #define THRESHOLD_2 2000  /* Idle state 2 PowerNightmare threshold in microseconds */
> #define THRESHOLD_3 4000  /* Idle state 3 PowerNightmare threshold in microseconds */

That's clear, thanks!

Well, the main reason why I have a problem with the "powernightmare" term is
that it is somewhat arbitrary overall.  After all, you ended up having to
explain what you meant in detail even though you had used it in the
previous message, so it doesn't appear all that useful to me. :-)

Also, the current work isn't even concerned about idle times below the
length of the tick period, so the information that some CPUs spent over
100 us in state 0 for a certain number of times during the test is not that
relevant here.  The information that they often spend more time than a tick
period in state 0 in one go *is* relevant, though.

The $subject patch series is about adding a safety net for possible governor
mispredictions using the existing tick infrastructure, on the one hand, and
about avoiding unnecessary timer manipulation overhead related to the
stopping and starting of the tick, on the other hand.  Of course, the safety
net will not improve the accuracy of governor predictions; it may only
reduce their impact.

That said, it doesn't catch one case which turns out to be quite significant
and which is when the tick is stopped already and the governor predicts short
idle.  That, again, may cause the CPU to spend a long time in a shallow idle
state which then will qualify as a "powernightmare" I suppose.  If I'm reading
your data correctly, that is the main reason for the majority of cases in which
CPUs spend  us and more in state 0 on your system.

That issue can be dealt with in a couple of ways and the 

RE: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-10 Thread Doug Smythies
On 2018.03.10 01:00 Rafael J. Wysocki wrote:
> On Saturday, March 10, 2018 8:41:39 AM CET Doug Smythies wrote:
>> 
>> With apologies to those that do not like the term "PowerNightmares",
>
> OK, and what exactly do you count as "PowerNightmares"?

I'll snip some below and then explain:

...[snip]...

>> 
>> Kernel 4.16-rc4: Summary: Average processor package power 27.41 watts
>> 
>> Idle State 0: Total Entries: 9096 : PowerNightmares: 6540 : Not PN time (seconds): 0.051532 : PN time: 7886.309553 : Ratio: 153037.133492
>> Idle State 1: Total Entries: 28731 : PowerNightmares: 215 : Not PN time (seconds): 0.211999 : PN time: 77.395467 : Ratio: 365.074679
>> Idle State 2: Total Entries: 4474 : PowerNightmares: 97 : Not PN time (seconds): 1.959059 : PN time: 0.874112 : Ratio: 0.446190
>> Idle State 3: Total Entries: 2319 : PowerNightmares: 0 : Not PN time (seconds): 1.663376 : PN time: 0.00 : Ratio: 0.00

O.K. let's go deeper than the summary, and focus on idle state 0, which has 
been my area of interest in this saga.

Idle State 0:
CPU: 0: Entries: 2093 : PowerNightmares: 1136 : Not PN time (seconds): 0.024840 : PN time: 1874.417840 : Ratio: 75459.655439
CPU: 1: Entries: 1051 : PowerNightmares: 721 : Not PN time (seconds): 0.004668 : PN time: 198.845193 : Ratio: 42597.513425
CPU: 2: Entries: 759 : PowerNightmares: 583 : Not PN time (seconds): 0.003299 : PN time: 1099.069256 : Ratio: 333152.246028
CPU: 3: Entries: 1033 : PowerNightmares: 1008 : Not PN time (seconds): 0.000361 : PN time: 1930.340683 : Ratio: 5347203.995237
CPU: 4: Entries: 1310 : PowerNightmares: 1025 : Not PN time (seconds): 0.006214 : PN time: 1332.497114 : Ratio: 214434.682950
CPU: 5: Entries: 1097 : PowerNightmares: 848 : Not PN time (seconds): 0.005029 : PN time: 785.366864 : Ratio: 156167.601340
CPU: 6: Entries: 1753 : PowerNightmares: 1219 : Not PN time (seconds): 0.007121 : PN time: 665.772603 : Ratio: 93494.256958

Note: CPU 7 is busy and doesn't go into idle at all.

And also look at the histograms of the times spent in idle state 0:
CPU 3 might be the most interesting:

Idle State: 0  CPU: 3:
4 1
5 3
7 2
11 1
12 1
13 2
14 3
15 3
17 4
18 1
19 2
28 2
7563 1
8012 1
 1006

Where:
Column 1 is the time in microseconds it was in that idle state
up to  microseconds, which includes anything more.
Column 2 is the number of occurrences of that time.

Notice that 1008 times out of the 1033, it spent an excessive amount
of time in idle state 0, leading to excessive power consumption.
I adopted Thomas Ilsche's "Powernightmare" term for this several
months ago.

This CPU 3 example was pretty clear, but sometimes it is not so
obvious. I admit that my thresholds for whether or not something is a
"powernightmare" are somewhat arbitrary, and I'll change them to
whatever anybody wants. Currently:

#define THRESHOLD_0 100   /* Idle state 0 PowerNightmare threshold in microseconds */
#define THRESHOLD_1 1000  /* Idle state 1 PowerNightmare threshold in microseconds */
#define THRESHOLD_2 2000  /* Idle state 2 PowerNightmare threshold in microseconds */
#define THRESHOLD_3 4000  /* Idle state 3 PowerNightmare threshold in microseconds */
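
A minimal sketch of what that per-entry classification amounts to (illustrative
only; pn_account(), struct pn_stats and the sample values are hypothetical
names, not the code of the actual trace post-processing tool):

#include <stdio.h>

/*
 * An idle residency longer than the threshold for its state counts as a
 * "PowerNightmare"; its time is accumulated separately from ordinary
 * idle time, matching the Entries / PowerNightmares / PN time / Not PN
 * time columns reported above.
 */
static const unsigned int pn_threshold_us[4] = { 100, 1000, 2000, 4000 };

struct pn_stats {
        unsigned long entries;
        unsigned long nightmares;
        double pn_time_s;
        double not_pn_time_s;
};

static void pn_account(struct pn_stats *s, int state, unsigned int residency_us)
{
        s->entries++;
        if (residency_us > pn_threshold_us[state]) {
                s->nightmares++;
                s->pn_time_s += residency_us / 1e6;
        } else {
                s->not_pn_time_s += residency_us / 1e6;
        }
}

int main(void)
{
        struct pn_stats s = { 0 };

        /* A few residencies from the idle state 0, CPU 3 histogram above. */
        pn_account(&s, 0, 28);
        pn_account(&s, 0, 7563);
        pn_account(&s, 0, 8012);

        printf("Entries: %lu : PowerNightmares: %lu : Not PN time (seconds): %f : PN time: %f\n",
               s.entries, s.nightmares, s.not_pn_time_s, s.pn_time_s);
        return 0;
}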

Let's have a look at another example from the same test run:

Idle State 1:
CPU: 0: Entries: 3104 : PowerNightmares: 41 : Not PN time (seconds): 0.012196 : PN time: 10.841577 : Ratio: 888.945312
CPU: 1: Entries: 2637 : PowerNightmares: 40 : Not PN time (seconds): 0.013135 : PN time: 11.334686 : Ratio: 862.937649
CPU: 2: Entries: 1618 : PowerNightmares: 41 : Not PN time (seconds): 0.008351 : PN time: 10.193641 : Ratio: 1220.649147
CPU: 3: Entries: 10180 : PowerNightmares: 31 : Not PN time (seconds): 0.087596 : PN time: 14.748787 : Ratio: 168.372836
CPU: 4: Entries: 3878 : PowerNightmares: 22 : Not PN time (seconds): 0.040360 : PN time: 14.207233 : Ratio: 352.012710
CPU: 5: Entries: 3658 : PowerNightmares: 1 : Not PN time (seconds): 0.031188 : PN time: 0.604176 : Ratio: 19.372066
CPU: 6: Entries: 3656 : PowerNightmares: 39 : Not PN time (seconds): 0.019173 : PN time: 15.465367 : Ratio: 806.622179

Idle State: 1  CPU: 2:
0 230
1 566
2 161
3 86
4 61
5 13
6 32
7 37
8 42
9 41
10 4
11 41
12 38
13 24
14 27
15 26
16 5
17 21
18 16
19 17
20 15
21 1
22 12
23 17
24 16
25 11
26 2
27 5
28 5
29 3
35 1
47 1
1733 1
1850 1
2027 1
3929 1
 37

The 41 "Powernightmares" out of 1618 seems correct to me.
Even if someone claims that the threshold should have been >3929 uSec,
there are still 37 "powenightmares".

>>> 
>> Graph of package power versus time: http://fast.smythies.com/rjwv3_100.png
>
> The graph actually shows an improvement to my eyes, as the blue line is quite
> consistently above the red one except for a few regions (and I don't really
> understand the drop in the blue line by the end of the test window).

Agreed, it shows improvement, as does the average package power.

The roughly 22 minute drop in the reference test, unmodified kernel 4.16-rc4,
the blue line, is something 

Re: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-10 Thread Rafael J. Wysocki
On Saturday, March 10, 2018 6:01:31 AM CET Mike Galbraith wrote:
> On Fri, 2018-03-09 at 10:34 +0100, Rafael J. Wysocki wrote:
> > Hi All,
> > 
> > Thanks a lot for the discussion and testing so far!
> > 
> > This is a total respin of the whole series, so please look at it afresh.
> > Patches 2 and 3 are the most similar to their previous versions, but
> > still they are different enough.
> 
> Respin of testdrive...

Appreciated, thanks!

> i4790 booted nopti nospectre_v2
> 
> 30 sec tbench
> 4.16.0.g1b88acc-master (virgin)
> Throughput 559.279 MB/sec  1 clients  1 procs  max_latency=0.046 ms
> Throughput 997.119 MB/sec  2 clients  2 procs  max_latency=0.246 ms
> Throughput 1693.04 MB/sec  4 clients  4 procs  max_latency=4.309 ms
> Throughput 3597.2 MB/sec  8 clients  8 procs  max_latency=6.760 ms
> Throughput 3474.55 MB/sec  16 clients  16 procs  max_latency=6.743 ms
> 
> 4.16.0.g1b88acc-master (+ v2)
> Throughput 588.929 MB/sec  1 clients  1 procs  max_latency=0.291 ms
> Throughput 1080.93 MB/sec  2 clients  2 procs  max_latency=0.639 ms
> Throughput 1826.3 MB/sec  4 clients  4 procs  max_latency=0.647 ms
> Throughput 3561.01 MB/sec  8 clients  8 procs  max_latency=1.279 ms
> Throughput 3382.98 MB/sec  16 clients  16 procs  max_latency=4.817 ms
> 
> 4.16.0.g1b88acc-master (+ v3)
> Throughput 588.711 MB/sec  1 clients  1 procs  max_latency=0.067 ms
> Throughput 1077.71 MB/sec  2 clients  2 procs  max_latency=0.298 ms

This is a bit better than "raw".  Around 8-9% I'd say.

> Throughput 1803.47 MB/sec  4 clients  4 procs  max_latency=0.667 ms

This one is too, but not as much.

> Throughput 3591.4 MB/sec  8 clients  8 procs  max_latency=4.999 ms
> Throughput 3444.74 MB/sec  16 clients  16 procs  max_latency=1.995 ms

And these are slightly worse, but just slightly.

> 4.16.0.g1b88acc-master (+ my local patches)
> Throughput 722.559 MB/sec  1 clients  1 procs  max_latency=0.087 ms
> Throughput 1208.59 MB/sec  2 clients  2 procs  max_latency=0.289 ms
> Throughput 2071.94 MB/sec  4 clients  4 procs  max_latency=0.654 ms
> Throughput 3784.91 MB/sec  8 clients  8 procs  max_latency=0.974 ms
> Throughput 3644.4 MB/sec  16 clients  16 procs  max_latency=5.620 ms
> 
> turbostat -q -- firefox /root/tmp/video/BigBuckBunny-DivXPlusHD.mkv & sleep 
> 300;killall firefox
> 
> PkgWatt
>   1 2 3
> 4.16.0.g1b88acc-master 6.95  7.03  6.91 (virgin)
> 4.16.0.g1b88acc-master 7.20  7.25  7.26 (+v2)
> 4.16.0.g1b88acc-master 7.04  6.97  7.07 (+v3)
> 4.16.0.g1b88acc-master 6.90  7.06  6.95 (+my patches)
> 
> No change wrt nohz high frequency cross core scheduling overhead, but
> the light load power consumption oddity did go away.

OK

> (btw, don't read anything into max_latency numbers, that's GUI noise)

I see.

Thanks!



Re: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-10 Thread Rafael J. Wysocki
On Saturday, March 10, 2018 8:41:39 AM CET Doug Smythies wrote:
> On 2018.03.09 07:19 Rik van Riel wrote:
> > On Fri, 2018-03-09 at 10:34 +0100, Rafael J. Wysocki wrote:
> >> Hi All,
> >> 
> >> Thanks a lot for the discussion and testing so far!
> >> 
> >> This is a total respin of the whole series, so please look at it
> >> afresh.
> >> Patches 2 and 3 are the most similar to their previous versions, but
> >> still they are different enough.
> >
> > This series gives no RCU errors on startup,
> > and no CPUs seem to be getting stuck any more.
> 
> Confirmed on my test server. Boot is normal and no other errors, so far.

Thanks for testing, much appreciated!

> Part 1: Idle test:
> 
> I was able to repeat Mike's higher power issue under very light load,
> well no load in my case, with V2.
> 
> V3 is much better.
> 
> A one hour trace on my very idle server was 22 times smaller with V3
> than V2, and mainly due to idle state 4 not exiting and re-entering
> every tick time for great periods of time.
> 
> Disclaimer: From past experience, 1 hour is not nearly long enough
> for this test. Issues tend to come in bunches, sometimes many hours
> apart.
> 
> V2:
> Idle State 4: Entries: 1359560
> CPU: 0: Entries: 125305
> CPU: 1: Entries: 62489
> CPU: 2: Entries: 10203
> CPU: 3: Entries: 108107
> CPU: 4: Entries: 19915
> CPU: 5: Entries: 430253
> CPU: 6: Entries: 564650
> CPU: 7: Entries: 38638
> 
> V3:
> Idle State 4: Entries: 64505
> CPU: 0: Entries: 13060
> CPU: 1: Entries: 5266
> CPU: 2: Entries: 15744
> CPU: 3: Entries: 5574
> CPU: 4: Entries: 8425
> CPU: 5: Entries: 6270
> CPU: 6: Entries: 5592
> CPU: 7: Entries: 4574
> 
> Kernel 4.16-rc4:
> Idle State 4: Entries: 61390
> CPU: 0: Entries: 9529
> CPU: 1: Entries: 10556
> CPU: 2: Entries: 5478
> CPU: 3: Entries: 5991
> CPU: 4: Entries: 3686
> CPU: 5: Entries: 7610
> CPU: 6: Entries: 11074
> CPU: 7: Entries: 7466
> 
> With apologies to those that do not like the term "PowerNightmares",

OK, and what exactly do you count as "PowerNightmares"?

> it has become very ingrained in my tools:
> 
> V2:
> 1 hour idle Summary:
> 
> Idle State 0: Total Entries: 113 : PowerNightmares: 56 : Not PN time (seconds): 0.001224 : PN time: 65.543239 : Ratio: 53548.397792
> Idle State 1: Total Entries: 1015 : PowerNightmares: 42 : Not PN time (seconds): 0.053986 : PN time: 21.054470 : Ratio: 389.998703
> Idle State 2: Total Entries: 1382 : PowerNightmares: 17 : Not PN time (seconds): 0.728686 : PN time: 6.046906 : Ratio: 8.298370
> Idle State 3: Total Entries: 113 : PowerNightmares: 13 : Not PN time (seconds): 0.069055 : PN time: 6.021458 : Ratio: 87.198002

The V2 had a serious bug, please discard it entirely.

> 
> V3:
> 1 hour idle Summary: Average processor package power 3.78 watts
> 
> Idle State 0: Total Entries: 134 : PowerNightmares: 109 : Not PN time (seconds): 0.000477 : PN time: 144.719723 : Ratio: 303395.646541
> Idle State 1: Total Entries: 1104 : PowerNightmares: 84 : Not PN time (seconds): 0.052639 : PN time: 74.639142 : Ratio: 1417.943768
> Idle State 2: Total Entries: 968 : PowerNightmares: 141 : Not PN time (seconds): 0.325953 : PN time: 128.235137 : Ratio: 393.416035
> Idle State 3: Total Entries: 295 : PowerNightmares: 103 : Not PN time (seconds): 0.164884 : PN time: 97.159421 : Ratio: 589.259243
> 
> Kernel 4.16-rc4: Average processor package power (excluding a few minutes of 
> abnormal power) 3.70 watts.
> 1 hour idle Summary:
> 
> Idle State 0: Total Entries: 168 : PowerNightmares: 59 : Not PN time (seconds): 0.001323 : PN time: 81.802197 : Ratio: 61830.836545
> Idle State 1: Total Entries: 1669 : PowerNightmares: 78 : Not PN time (seconds): 0.022003 : PN time: 37.477413 : Ratio: 1703.286509
> Idle State 2: Total Entries: 1447 : PowerNightmares: 30 : Not PN time (seconds): 0.502672 : PN time: 0.789344 : Ratio: 1.570296
> Idle State 3: Total Entries: 176 : PowerNightmares: 0 : Not PN time (seconds): 0.259425 : PN time: 0.00 : Ratio: 0.00
> 
> Part 2: 100% load on one CPU test. Test duration 4 hours
> 
> V3: Summary: Average processor package power 26.75 watts
> 
> Idle State 0: Total Entries: 10039 : PowerNightmares: 7186 : Not PN time (seconds): 0.067477 : PN time: 6215.220295 : Ratio: 92108.722903
> Idle State 1: Total Entries: 17268 : PowerNightmares: 195 : Not PN time (seconds): 0.213049 : PN time: 55.905323 : Ratio: 262.405939
> Idle State 2: Total Entries: 5858 : PowerNightmares: 676 : Not PN time (seconds): 2.578006 : PN time: 167.282069 : Ratio: 64.888161
> Idle State 3: Total Entries: 1500 : PowerNightmares: 488 : Not PN time (seconds): 0.772463 : PN time: 125.514015 : Ratio: 162.485472
> 
> Kernel 4.16-rc4: Summary: Average processor package power 27.41 watts
> 
> Idle State 0: Total Entries: 9096 : PowerNightmares: 6540 : Not PN time 
> (seconds): 0.051532 : PN time: 7886.309553 : Ratio: 153037.133492
> Idle State 1: Total Entries: 28731 : PowerNightmares: 215 : Not PN time 
> (seconds): 

RE: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-09 Thread Doug Smythies
On 2018.03.09 07:19 Rik van Riel wrote:
> On Fri, 2018-03-09 at 10:34 +0100, Rafael J. Wysocki wrote:
>> Hi All,
>> 
>> Thanks a lot for the discussion and testing so far!
>> 
>> This is a total respin of the whole series, so please look at it
>> afresh.
>> Patches 2 and 3 are the most similar to their previous versions, but
>> still they are different enough.
>
> This series gives no RCU errors on startup,
> and no CPUs seem to be getting stuck any more.

Confirmed on my test server. Boot is normal and no other errors, so far.

Part 1: Idle test:

I was able to repeat Mike's higher-power issue under very light load
(no load at all, in my case) with V2.

V3 is much better.

A one hour trace on my very idle server was 22 times smaller with V3
than with V2, mainly because idle state 4 no longer exits and re-enters
on every tick for long stretches of time.

Disclaimer: From past experience, 1 hour is not nearly long enough
for this test. Issues tend to come in bunches, sometimes many hours
apart.

V2:
Idle State 4: Entries: 1359560
CPU: 0: Entries: 125305
CPU: 1: Entries: 62489
CPU: 2: Entries: 10203
CPU: 3: Entries: 108107
CPU: 4: Entries: 19915
CPU: 5: Entries: 430253
CPU: 6: Entries: 564650
CPU: 7: Entries: 38638

V3:
Idle State 4: Entries: 64505
CPU: 0: Entries: 13060
CPU: 1: Entries: 5266
CPU: 2: Entries: 15744
CPU: 3: Entries: 5574
CPU: 4: Entries: 8425
CPU: 5: Entries: 6270
CPU: 6: Entries: 5592
CPU: 7: Entries: 4574

Kernel 4.16-rc4:
Idle State 4: Entries: 61390
CPU: 0: Entries: 9529
CPU: 1: Entries: 10556
CPU: 2: Entries: 5478
CPU: 3: Entries: 5991
CPU: 4: Entries: 3686
CPU: 5: Entries: 7610
CPU: 6: Entries: 11074
CPU: 7: Entries: 7466
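
For reference, per-CPU entry counts like the above can be tallied from the
ftrace cpu_idle events roughly as follows. This is only an illustrative
sketch, not the actual post-processing script; it assumes the usual
"cpu_idle: state=<N> cpu_id=<C>" event format, where state 4294967295
marks an idle exit rather than an entry:

  import re
  from collections import defaultdict

  EVENT = re.compile(r"cpu_idle: state=(\d+) cpu_id=(\d+)")
  EXIT_STATE = 4294967295            # PWR_EVENT_EXIT, i.e. leaving idle

  counts = defaultdict(lambda: defaultdict(int))   # state -> cpu -> entries
  with open("trace.txt") as trace:   # e.g. output of "trace-cmd report" or similar
      for line in trace:
          m = EVENT.search(line)
          if not m:
              continue
          state, cpu = int(m.group(1)), int(m.group(2))
          if state != EXIT_STATE:
              counts[state][cpu] += 1

  for state in sorted(counts):
      print(f"Idle State {state}: Entries: {sum(counts[state].values())}")
      for cpu in sorted(counts[state]):
          print(f"CPU: {cpu}: Entries: {counts[state][cpu]}")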

With apologies to those that do not like the term "PowerNightmares",
it has become very ingrained in my tools:

V2:
1 hour idle Summary:

Idle State 0: Total Entries: 113 : PowerNightmares: 56 : Not PN time (seconds): 
0.001224 : PN time: 65.543239 : Ratio: 53548.397792
Idle State 1: Total Entries: 1015 : PowerNightmares: 42 : Not PN time 
(seconds): 0.053986 : PN time: 21.054470 : Ratio: 389.998703
Idle State 2: Total Entries: 1382 : PowerNightmares: 17 : Not PN time 
(seconds): 0.728686 : PN time: 6.046906 : Ratio: 8.298370
Idle State 3: Total Entries: 113 : PowerNightmares: 13 : Not PN time (seconds): 
0.069055 : PN time: 6.021458 : Ratio: 87.198002

V3:
1 hour idle Summary: Average processor package power 3.78 watts

Idle State 0: Total Entries: 134 : PowerNightmares: 109 : Not PN time 
(seconds): 0.000477 : PN time: 144.719723 : Ratio: 303395.646541
Idle State 1: Total Entries: 1104 : PowerNightmares: 84 : Not PN time 
(seconds): 0.052639 : PN time: 74.639142 : Ratio: 1417.943768
Idle State 2: Total Entries: 968 : PowerNightmares: 141 : Not PN time 
(seconds): 0.325953 : PN time: 128.235137 : Ratio: 393.416035
Idle State 3: Total Entries: 295 : PowerNightmares: 103 : Not PN time 
(seconds): 0.164884 : PN time: 97.159421 : Ratio: 589.259243

Kernel 4.16-rc4: Average processor package power (excluding a few minutes of 
abnormal power) 3.70 watts.
1 hour idle Summary:

Idle State 0: Total Entries: 168 : PowerNightmares: 59 : Not PN time (seconds): 
0.001323 : PN time: 81.802197 : Ratio: 61830.836545
Idle State 1: Total Entries: 1669 : PowerNightmares: 78 : Not PN time 
(seconds): 0.022003 : PN time: 37.477413 : Ratio: 1703.286509
Idle State 2: Total Entries: 1447 : PowerNightmares: 30 : Not PN time 
(seconds): 0.502672 : PN time: 0.789344 : Ratio: 1.570296
Idle State 3: Total Entries: 176 : PowerNightmares: 0 : Not PN time (seconds): 
0.259425 : PN time: 0.00 : Ratio: 0.00
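
For clarity, the summary lines above and below split, for each idle state, the
total residency time into "PN time" (time spent in entries classified as
PowerNightmares) and "Not PN time", with Ratio being simply PN time divided by
Not PN time (e.g. 65.543239 / 0.001224 is about 53548, as in the V2 state 0
line). A rough sketch of that aggregation, with the PowerNightmare criterion
left as a per-state threshold parameter because the exact definition is not
spelled out here, and not the actual tool:

  from collections import defaultdict

  def summarize(intervals, pn_threshold):
      """intervals: iterable of (state, residency_seconds) taken from the trace.
      pn_threshold: dict mapping state -> residency above which an entry is
      counted as a PowerNightmare (a placeholder for the real criterion)."""
      stats = defaultdict(lambda: {"entries": 0, "pn": 0, "pn_t": 0.0, "ok_t": 0.0})
      for state, residency in intervals:
          s = stats[state]
          s["entries"] += 1
          if residency > pn_threshold.get(state, float("inf")):
              s["pn"] += 1
              s["pn_t"] += residency
          else:
              s["ok_t"] += residency
      for state in sorted(stats):
          s = stats[state]
          ratio = s["pn_t"] / s["ok_t"] if s["ok_t"] else 0.0
          print(f"Idle State {state}: Total Entries: {s['entries']} : "
                f"PowerNightmares: {s['pn']} : Not PN time (seconds): {s['ok_t']:.6f} : "
                f"PN time: {s['pn_t']:.6f} : Ratio: {ratio:.6f}")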

Part 2: 100% load on one CPU test. Test duration 4 hours
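
(The load generator itself is not shown here; a trivial stand-in, not
necessarily what was actually used, is just a busy loop pinned to one CPU:)

  import os

  os.sched_setaffinity(0, {3})   # Linux-only; pin to one CPU, CPU 3 chosen arbitrarily
  while True:                    # spin forever -> 100% load on that CPU
      pass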

V3: Summary: Average processor package power 26.75 watts

Idle State 0: Total Entries: 10039 : PowerNightmares: 7186 : Not PN time 
(seconds): 0.067477 : PN time: 6215.220295 : Ratio: 92108.722903
Idle State 1: Total Entries: 17268 : PowerNightmares: 195 : Not PN time 
(seconds): 0.213049 : PN time: 55.905323 : Ratio: 262.405939
Idle State 2: Total Entries: 5858 : PowerNightmares: 676 : Not PN time 
(seconds): 2.578006 : PN time: 167.282069 : Ratio: 64.888161
Idle State 3: Total Entries: 1500 : PowerNightmares: 488 : Not PN time 
(seconds): 0.772463 : PN time: 125.514015 : Ratio: 162.485472

Kernel 4.16-rc4: Summary: Average processor package power 27.41 watts

Idle State 0: Total Entries: 9096 : PowerNightmares: 6540 : Not PN time 
(seconds): 0.051532 : PN time: 7886.309553 : Ratio: 153037.133492
Idle State 1: Total Entries: 28731 : PowerNightmares: 215 : Not PN time 
(seconds): 0.211999 : PN time: 77.395467 : Ratio: 365.074679
Idle State 2: Total Entries: 4474 : PowerNightmares: 97 : Not PN time 
(seconds): 1.959059 : PN time: 0.874112 : Ratio: 0.446190
Idle State 3: Total Entries: 2319 : PowerNightmares: 0 : Not PN time (seconds): 
1.663376 : PN time: 0.00 : Ratio: 0.00

Graph of package power versus time: http://fast.smythies.com/rjwv3_100.png

... Doug




Re: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-09 Thread Mike Galbraith
On Fri, 2018-03-09 at 10:34 +0100, Rafael J. Wysocki wrote:
> Hi All,
> 
> Thanks a lot for the discussion and testing so far!
> 
> This is a total respin of the whole series, so please look at it afresh.
> Patches 2 and 3 are the most similar to their previous versions, but
> still they are different enough.

Respin of testdrive...

i4790 booted nopti nospectre_v2

30 sec tbench
4.16.0.g1b88acc-master (virgin)
Throughput 559.279 MB/sec  1 clients  1 procs  max_latency=0.046 ms
Throughput 997.119 MB/sec  2 clients  2 procs  max_latency=0.246 ms
Throughput 1693.04 MB/sec  4 clients  4 procs  max_latency=4.309 ms
Throughput 3597.2 MB/sec  8 clients  8 procs  max_latency=6.760 ms
Throughput 3474.55 MB/sec  16 clients  16 procs  max_latency=6.743 ms

4.16.0.g1b88acc-master (+ v2)
Throughput 588.929 MB/sec  1 clients  1 procs  max_latency=0.291 ms
Throughput 1080.93 MB/sec  2 clients  2 procs  max_latency=0.639 ms
Throughput 1826.3 MB/sec  4 clients  4 procs  max_latency=0.647 ms
Throughput 3561.01 MB/sec  8 clients  8 procs  max_latency=1.279 ms
Throughput 3382.98 MB/sec  16 clients  16 procs  max_latency=4.817 ms

4.16.0.g1b88acc-master (+ v3)
Throughput 588.711 MB/sec  1 clients  1 procs  max_latency=0.067 ms
Throughput 1077.71 MB/sec  2 clients  2 procs  max_latency=0.298 ms
Throughput 1803.47 MB/sec  4 clients  4 procs  max_latency=0.667 ms
Throughput 3591.4 MB/sec  8 clients  8 procs  max_latency=4.999 ms
Throughput 3444.74 MB/sec  16 clients  16 procs  max_latency=1.995 ms

4.16.0.g1b88acc-master (+ my local patches)
Throughput 722.559 MB/sec  1 clients  1 procs  max_latency=0.087 ms
Throughput 1208.59 MB/sec  2 clients  2 procs  max_latency=0.289 ms
Throughput 2071.94 MB/sec  4 clients  4 procs  max_latency=0.654 ms
Throughput 3784.91 MB/sec  8 clients  8 procs  max_latency=0.974 ms
Throughput 3644.4 MB/sec  16 clients  16 procs  max_latency=5.620 ms

turbostat -q -- firefox /root/tmp/video/BigBuckBunny-DivXPlusHD.mkv & sleep 
300;killall firefox

PkgWatt
                            1      2      3
4.16.0.g1b88acc-master    6.95   7.03   6.91   (virgin)
4.16.0.g1b88acc-master    7.20   7.25   7.26   (+v2)
4.16.0.g1b88acc-master    7.04   6.97   7.07   (+v3)
4.16.0.g1b88acc-master    6.90   7.06   6.95   (+my patches)

No change with respect to the nohz high-frequency cross-core scheduling
overhead, but the light-load power consumption oddity did go away.

(btw, don't read anything into max_latency numbers, that's GUI noise)

-Mike


Re: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework

2018-03-09 Thread Rik van Riel
On Fri, 2018-03-09 at 10:34 +0100, Rafael J. Wysocki wrote:
> Hi All,
> 
> Thanks a lot for the discussion and testing so far!
> 
> This is a total respin of the whole series, so please look at it
> afresh.
> Patches 2 and 3 are the most similar to their previous versions, but
> still they are different enough.

This series gives no RCU errors on startup,
and no CPUs seem to be getting stuck any more.

I will run some performance tests with these
patches.

-- 
All Rights Reversed.
