from:"Stratos Karafotis"


On 03/05/2013 09:34 AM, Viresh Kumar wrote:

> On 5 March 2013 13:22, Stratos Karafotis <strat...@semaphore.gr> wrote:
> I misread it here when I looked at this mail for the first time. :)
> I strongly believe that we need a full stop (.) before "Every sampling_rate",
> otherwise it looks like we check for down_factor while increasing freq :)


I agree. I will do that.


> Even now we aren't checking this 80% thing, right? And so in your patch we can
> actually fix the patch too with the right logic of code.. And
> documentation too :)


In my opinion the logic was initially correct. It was broken in the same 
commit that also broke sampling_down_factor.


Now we check if load < (cs_tuners.down_threshold - 10) to decrease freq.
Down threshold is 20, so we actually check the 80% idle.

I think the subtraction of 10 from down_threshold is wrong. It seems 
similar to ondemand, but there is no reason for it in conservative.
The user can simply select the down_threshold, and the load will be compared 
with the user's value. There is no need to alter the user's selection.


I will prepare a patchset for these changes.

Regards,
Stratos
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH linux-next] cpufreq: conservative: Fix sampling_down_factor functionality


Hi David,
On 03/05/2013 04:21 PM, David C Niemi wrote:


> I should clarify -- I wrote the sampling_down_factor in the *ondemand* 
> governor.  I chose the name of the parameter based on the vaguely similar 
> parameter in the conservative governor, but the documentation that was 
> referenced (about it only applying at top speed and the comment about skipping 
> evaluation opportunities when it is active) was written by me in reference to 
> the ondemand governor.  It could be that someone backported some of the 
> ondemand sampling_down_factor's behavior to the conservative governor.
>
> I'd like to ask -- what is the intended use of the conservative governor these 
> days as differentiated from the ondemand governor?  At one time it seemed more 
> oriented towards power savings, but the ondemand governor had picked up most or 
> all of its power-saving features.


Thanks for the information.
I would agree about the use of conservative, but I think that I'm not
the right person to answer this question. :)

Regards,
Stratos

[PATCH 1/3 linux-next] cpufreq: conservative: Fix sampling_down_factor functionality

sampling_down_factor tunable is unused since commit
8e677ce83bf41ba9c74e5b6d9ee60b07d4e5ed93 (4 years ago).

This patch restores the original functionality and documents the
tunable.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 Documentation/cpu-freq/governors.txt   |  6 ++++++
 drivers/cpufreq/cpufreq_conservative.c | 11 ++++++++---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt
index c7a2eb8..4dfed30 100644
--- a/Documentation/cpu-freq/governors.txt
+++ b/Documentation/cpu-freq/governors.txt
@@ -191,6 +191,12 @@ governor but for the opposite direction.  For example when set to its
 default value of '20' it means that if the CPU usage needs to be below
 20% between samples to have the frequency decreased.
 
+sampling_down_factor: similar functionality as in ondemand governor.
+But in conservative, it controls the rate at which the kernel makes
+a decision on when to decrease the frequency while running in any
+speed. Load for frequency increase is still evaluated every
+sampling rate.
+
 3. The Governor Interface in the CPUfreq Core
 =============================================
 
diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 4fd0006..1e3be56 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -44,9 +44,9 @@ static struct cs_dbs_tuners cs_tuners = {
 
 /*
  * Every sampling_rate, we check, if current idle time is less than 20%
- * (default), then we try to increase frequency Every sampling_rate *
- * sampling_down_factor, we check, if current idle time is more than 80%, then
- * we try to decrease frequency
+ * (default), then we try to increase frequency. Every sampling_rate *
+ * sampling_down_factor, we check, if current idle time is more than 80%
+ * (default), then we try to decrease frequency
  *
  * Any frequency increase takes it to the maximum frequency. Frequency reduction
  * happens at minimum steps of 5% (default) of maximum frequency
@@ -87,6 +87,11 @@ static void cs_check_cpu(int cpu, unsigned int load)
 		return;
 	}
 
+	/* if sampling_down_factor is active break out early */
+	if (++dbs_info->down_skip < cs_tuners.sampling_down_factor)
+		return;
+	dbs_info->down_skip = 0;
+
 	/*
 	 * The optimal frequency is the frequency that is the lowest that can
 	 * support the current CPU usage without triggering the up policy. To be
-- 
1.8.1.4
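
For readers following the thread, the early-return logic added to
cs_check_cpu() above can be tried outside the kernel. This is a minimal
userspace sketch, not the kernel code: struct demo_cs_state and
demo_should_check_down() are hypothetical names introduced here for
illustration; only the counter logic is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the governor's per-CPU state and tunable. */
struct demo_cs_state {
	unsigned int down_skip;            /* samples seen since the last down check */
	unsigned int sampling_down_factor; /* tunable, 1..10 in the patch */
};

/*
 * Returns true when this sample should evaluate a frequency decrease.
 * Same shape as the patch: pre-increment the counter, bail out while it
 * is below sampling_down_factor, otherwise reset it and carry on.
 */
static bool demo_should_check_down(struct demo_cs_state *s)
{
	if (++s->down_skip < s->sampling_down_factor)
		return false;
	s->down_skip = 0;
	return true;
}
```

With sampling_down_factor = 3 the function returns false twice and true
on the third sample, so a decrease is only considered once every
3 * sampling_rate. With the default of 1 it returns true on every sample.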


[PATCH 2/3 linux-next] cpufreq: conservative: Fix the logic in frequency decrease checking

When we evaluate the CPU load for frequency decrease we have to compare
the load against down_threshold. There is no need to subtract 10 points
from down_threshold.

Instead, we have to use the default down_threshold or user's selection
unmodified.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_conservative.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 1e3be56..08be431 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -92,12 +92,8 @@ static void cs_check_cpu(int cpu, unsigned int load)
 		return;
 	dbs_info->down_skip = 0;
 
-	/*
-	 * The optimal frequency is the frequency that is the lowest that can
-	 * support the current CPU usage without triggering the up policy. To be
-	 * safe, we focus 10 points under the threshold.
-	 */
-	if (load < (cs_tuners.down_threshold - 10)) {
+	/* Check for frequency decrease */
+	if (load < cs_tuners.down_threshold) {
 		freq_target = (cs_tuners.freq_step * policy->max) / 100;
 
 		dbs_info->requested_freq -= freq_target;
-- 
1.8.1.4


[PATCH 3/3 linux-next] cpufreq: conservative: Use an inline function to evaluate freq_target

Use an inline function to evaluate freq_target to avoid duplicate code.

Also, define a macro for the default frequency step and fix the
calculation of freq_target when the max freq is less than 100.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_conservative.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 08be431..029de49 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -28,6 +28,7 @@
 /* Conservative governor macros */
 #define DEF_FREQUENCY_UP_THRESHOLD (80)
 #define DEF_FREQUENCY_DOWN_THRESHOLD   (20)
+#define DEF_FREQUENCY_STEP (5)
 #define DEF_SAMPLING_DOWN_FACTOR   (1)
 #define MAX_SAMPLING_DOWN_FACTOR   (10)
 
@@ -39,9 +40,20 @@ static struct cs_dbs_tuners cs_tuners = {
 	.down_threshold = DEF_FREQUENCY_DOWN_THRESHOLD,
 	.sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR,
 	.ignore_nice = 0,
-	.freq_step = 5,
+	.freq_step = DEF_FREQUENCY_STEP,
 };
 
+static inline unsigned int get_freq_target(struct cpufreq_policy *policy)
+{
+	unsigned int freq_target = (cs_tuners.freq_step * policy->max) / 100;
+
+	/* max freq cannot be less than 100. But who knows... */
+	if (unlikely(freq_target == 0))
+		freq_target = DEF_FREQUENCY_STEP * 1000; /* frequency in KHz */
+
+	return freq_target;
+}
+
 /*
  * Every sampling_rate, we check, if current idle time is less than 20%
  * (default), then we try to increase frequency. Every sampling_rate *
@@ -55,7 +67,6 @@ static void cs_check_cpu(int cpu, unsigned int load)
 {
 	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, cpu);
 	struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
-	unsigned int freq_target;
 
 	/*
 	 * break out if we 'cannot' reduce the speed as the user might
@@ -72,13 +83,7 @@ static void cs_check_cpu(int cpu, unsigned int load)
 		if (dbs_info->requested_freq == policy->max)
 			return;
 
-		freq_target = (cs_tuners.freq_step * policy->max) / 100;
-
-		/* max freq cannot be less than 100. But who knows */
-		if (unlikely(freq_target == 0))
-			freq_target = 5;
-
-		dbs_info->requested_freq += freq_target;
+		dbs_info->requested_freq += get_freq_target(policy);
 		if (dbs_info->requested_freq > policy->max)
 			dbs_info->requested_freq = policy->max;
 
@@ -94,9 +99,7 @@ static void cs_check_cpu(int cpu, unsigned int load)
 
 	/* Check for frequency decrease */
 	if (load < cs_tuners.down_threshold) {
-		freq_target = (cs_tuners.freq_step * policy->max) / 100;
-
-		dbs_info->requested_freq -= freq_target;
+		dbs_info->requested_freq -= get_freq_target(policy);
 		if (dbs_info->requested_freq < policy->min)
 			dbs_info->requested_freq = policy->min;
 
-- 
1.8.1.4


Re: [PATCH 3/3 linux-next] cpufreq: conservative: Use an inline function to evaluate freq_target

2013-03-06 Thread Stratos Karafotis

On 03/06/2013 03:23 PM, Viresh Kumar wrote:
> At least my poor mind can't make out how. To me it looks broken now.
>
>
> When can we enter this if block? Probably only in the case where max freq is
> less than 100 KHz (and because the freq unit in cpufreq is KHz, its exact
> value is less than 100). Let's say it's 90.
>
> So, we will get into your if block now and would set freq_target to 90 - 5000.
>
> So it's broken, isn't it?
>
> Rest is fine.
>
 

Of course you are right. I'm sorry for the confusion.
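
The arithmetic behind the objection can be checked with a small userspace
sketch. demo_freq_target() is a hypothetical helper that mirrors the
integer division in get_freq_target(); the fallback is passed in as a
parameter so that the v1 value (DEF_FREQUENCY_STEP * 1000 = 5000) and the
plain value 5 can be compared. The frequencies used with it are made-up
inputs, not values from real hardware.

```c
/*
 * Step size in kHz: freq_step percent of the policy maximum, using the
 * same integer division as the patch. fallback_khz is used only when the
 * division truncates to zero.
 */
static unsigned int demo_freq_target(unsigned int freq_step,
				     unsigned int max_khz,
				     unsigned int fallback_khz)
{
	unsigned int target = (freq_step * max_khz) / 100;

	if (target == 0)
		target = fallback_khz;

	return target;
}
```

With freq_step = 5 the division only truncates to 0 when max_khz is below
20. In that case a 5000 kHz step is far larger than the whole frequency
range, whereas a step of 5 stays in the same order of magnitude as max.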

Below is v2 of this patch.

Thanks,
Stratos

--------8<--------
Use an inline function to evaluate freq_target to avoid duplicate code.

Also, define a macro for the default frequency step.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_conservative.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 08be431..3fb921d 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -28,6 +28,7 @@
 /* Conservative governor macros */
 #define DEF_FREQUENCY_UP_THRESHOLD (80)
 #define DEF_FREQUENCY_DOWN_THRESHOLD   (20)
+#define DEF_FREQUENCY_STEP (5)
 #define DEF_SAMPLING_DOWN_FACTOR   (1)
 #define MAX_SAMPLING_DOWN_FACTOR   (10)
 
@@ -39,9 +40,20 @@ static struct cs_dbs_tuners cs_tuners = {
 	.down_threshold = DEF_FREQUENCY_DOWN_THRESHOLD,
 	.sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR,
 	.ignore_nice = 0,
-	.freq_step = 5,
+	.freq_step = DEF_FREQUENCY_STEP,
 };
 
+static inline unsigned int get_freq_target(struct cpufreq_policy *policy)
+{
+	unsigned int freq_target = (cs_tuners.freq_step * policy->max) / 100;
+
+	/* max freq cannot be less than 100. But who knows... */
+	if (unlikely(freq_target == 0))
+		freq_target = DEF_FREQUENCY_STEP;
+
+	return freq_target;
+}
+
 /*
  * Every sampling_rate, we check, if current idle time is less than 20%
  * (default), then we try to increase frequency. Every sampling_rate *
@@ -55,7 +67,6 @@ static void cs_check_cpu(int cpu, unsigned int load)
 {
 	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, cpu);
 	struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
-	unsigned int freq_target;
 
 	/*
 	 * break out if we 'cannot' reduce the speed as the user might
@@ -72,13 +83,7 @@ static void cs_check_cpu(int cpu, unsigned int load)
 		if (dbs_info->requested_freq == policy->max)
 			return;
 
-		freq_target = (cs_tuners.freq_step * policy->max) / 100;
-
-		/* max freq cannot be less than 100. But who knows */
-		if (unlikely(freq_target == 0))
-			freq_target = 5;
-
-		dbs_info->requested_freq += freq_target;
+		dbs_info->requested_freq += get_freq_target(policy);
 		if (dbs_info->requested_freq > policy->max)
 			dbs_info->requested_freq = policy->max;
 
@@ -94,9 +99,7 @@ static void cs_check_cpu(int cpu, unsigned int load)
 
 	/* Check for frequency decrease */
 	if (load < cs_tuners.down_threshold) {
-		freq_target = (cs_tuners.freq_step * policy->max) / 100;
-
-		dbs_info->requested_freq -= freq_target;
+		dbs_info->requested_freq -= get_freq_target(policy);
 		if (dbs_info->requested_freq < policy->min)
 			dbs_info->requested_freq = policy->min;
 
-- 
1.8.1.4

 

Re: [PATCH 2/3 linux-next] cpufreq: conservative: Fix the logic in frequency decrease checking

2013-03-06 Thread Stratos Karafotis

On 03/06/2013 09:35 PM, David C Niemi wrote:
> The 10 sounds like an attempt to add some hysteresis to the up/down 
> decisionmaking.  If you take it out, you should make sure you don't get into 
> situations where you're continually switching rapidly between two 
> frequencies.  (In the ondemand governor some care was also taken to avoid the 
> cost of doing a CPU idleness evaluation counting towards the CPU looking busy 
> enough to upshift; I am not familiar enough with Conservative to know whether 
> that is a problem for it too).
>
> DCN

This is true for ondemand but, as you know, there is a separate tunable 
(down_threshold) in conservative with a default value of 20.
It's independent of up_threshold (default 80), so I believe there is no
need to add hysteresis.
Also, if we subtract 10 from down_threshold, we change the user's decision
about this threshold. For example, a user who sets down_threshold to 25 wants
this value to be 25, not 15.

Checking the initial commit of the conservative governor, we can see that it
did not use a hysteresis factor. This was added later (by mistake, in my opinion)
as an attempt to make conservative function similarly to ondemand.
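
To make the argument concrete, here is a minimal userspace sketch of the
two-threshold decision. It is an illustration under assumptions, not the
governor code: demo_decide() and enum demo_action are hypothetical names,
and the "load > up_threshold" comparison for the increase path is assumed
here rather than quoted from the patches in this thread.

```c
enum demo_action { DEMO_HOLD, DEMO_UP, DEMO_DOWN };

/*
 * Decide what to do for a given load (in percent). Any load between
 * down_threshold and up_threshold leaves the frequency alone, so the gap
 * between the two tunables already acts as a dead band.
 */
static enum demo_action demo_decide(unsigned int load,
				    unsigned int up_threshold,
				    unsigned int down_threshold)
{
	if (load > up_threshold)
		return DEMO_UP;
	if (load < down_threshold)
		return DEMO_DOWN;
	return DEMO_HOLD;
}
```

With the defaults (80 and 20) a load of 50 holds the current frequency.
If the user sets down_threshold to 25, a load of 18 requests a decrease;
passing 25 - 10 instead, as the old code effectively did, turns the same
sample into a hold.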

[PATCH 1/2] cpufreq: ondemand: Fix typos in comments

2013-02-08 Thread Stratos Karafotis

Fix some typos in comments.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_ondemand.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 09b27ae..f3eb26c 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -26,7 +26,7 @@
 
 #include "cpufreq_governor.h"
 
-/* On-demand governor macors */
+/* On-demand governor macros */
 #define DEF_FREQUENCY_DOWN_DIFFERENTIAL		(10)
 #define DEF_FREQUENCY_UP_THRESHOLD (80)
 #define DEF_SAMPLING_DOWN_FACTOR   (1)
@@ -66,7 +66,7 @@ static void ondemand_powersave_bias_init_cpu(int cpu)
  * efficient idling at a higher frequency/voltage is.
  * Pavel Machek says this is not so for various generations of AMD and old
  * Intel systems.
- * Mike Chan (androidlcom) calis this is also not true for ARM.
+ * Mike Chan (android.com) claims this is also not true for ARM.
  * Because of this, whitelist specific known (series) of CPUs by default, and
  * leave all others up to the user.
  */
@@ -74,7 +74,7 @@ static int should_io_be_busy(void)
 {
 #if defined(CONFIG_X86)
 	/*
-	 * For Intel, Core 2 (model 15) andl later have an efficient idle.
+	 * For Intel, Core 2 (model 15) and later have an efficient idle.
 	 */
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
 			boot_cpu_data.x86 == 6 &&
@@ -159,8 +159,8 @@ static void dbs_freq_increase(struct cpufreq_policy *p, unsigned int freq)
 
 /*
  * Every sampling_rate, we check, if current idle time is less than 20%
- * (default), then we try to increase frequency Every sampling_rate, we look for
- * a the lowest frequency which can sustain the load while keeping idle time
+ * (default), then we try to increase frequency. Every sampling_rate, we look
+ * for the lowest frequency which can sustain the load while keeping idle time
  * over 30%. If such a frequency exist, we try to decrease to this frequency.
  *
  * Any frequency increase takes it to the maximum frequency. Frequency reduction
@@ -267,7 +267,7 @@ static ssize_t show_sampling_rate_min(struct kobject *kobj,
  * update_sampling_rate - update sampling rate effective immediately if needed.
  * @new_rate: new sampling rate
  *
- * If new rate is smaller than the old, simply updaing
+ * If new rate is smaller than the old, simply updating
  * dbs_tuners_int.sampling_rate might not be appropriate. For example, if the
  * original sampling_rate was 1 second and the requested new sampling rate is 10
  * ms because the user needs immediate reaction from ondemand governor, but not
-- 
1.8.1.2


[PATCH 2/2] cpufreq: conservative: Fix typos in comments

2013-02-08 Thread Stratos Karafotis

Fix a couple of typos in comments.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_conservative.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index e8bb915..4fd0006 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -25,7 +25,7 @@
 
 #include "cpufreq_governor.h"
 
-/* Conservative governor macors */
+/* Conservative governor macros */
 #define DEF_FREQUENCY_UP_THRESHOLD (80)
 #define DEF_FREQUENCY_DOWN_THRESHOLD   (20)
 #define DEF_SAMPLING_DOWN_FACTOR   (1)
@@ -144,7 +144,7 @@ static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
 
 	/*
 	 * we only care if our internally tracked freq moves outside the 'valid'
-	 * ranges of freqency available to us otherwise we do not change it
+	 * ranges of frequency available to us otherwise we do not change it
 	*/
 	if (dbs_info->requested_freq > policy->max
 			|| dbs_info->requested_freq < policy->min)
-- 
1.8.1.2


[PATCH linux-next] x86: smpboot.c: Remove unused variable c

2013-02-12 Thread Stratos Karafotis

This patch removes the unused variable 'c' in mwait_play_dead and fixes
the following warning:

arch/x86/kernel/smpboot.c: In function ‘mwait_play_dead’:
arch/x86/kernel/smpboot.c:1370:22: warning:
unused variable ‘c’ [-Wunused-variable]

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 arch/x86/kernel/smpboot.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index a6ceaed..7bc998a 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1367,7 +1367,6 @@ static inline void mwait_play_dead(void)
 	unsigned int highest_subcstate = 0;
 	int i;
 	void *mwait_ptr;
-	struct cpuinfo_x86 *c = __this_cpu_ptr(&cpu_info);
 
 	if (!this_cpu_has(X86_FEATURE_MWAIT))
 		return;
-- 
1.8.1.2


[PATCH] cpufreq: ondemand: Replace down_differential tuner with adj_up_threshold

2013-02-05 Thread Stratos Karafotis

In order to avoid the calculation of up_threshold - down_differential
every time that the frequency must be decreased, we replace the
down_differential tuner with the adj_up_threshold which keeps the
difference across multiple checks.

Update the adj_up_threshold only when the up_threshold is also updated.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_governor.h |  2 +-
 drivers/cpufreq/cpufreq_ondemand.c | 16 ++++++++++------
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index f661654..4250944 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -108,7 +108,7 @@ struct od_dbs_tuners {
 	unsigned int sampling_rate;
 	unsigned int sampling_down_factor;
 	unsigned int up_threshold;
-	unsigned int down_differential;
+	unsigned int adj_up_threshold;
 	unsigned int powersave_bias;
 	unsigned int io_is_busy;
 };
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 7731f7c..2cc76ad 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -47,7 +47,8 @@ static struct cpufreq_governor cpufreq_gov_ondemand;
 static struct od_dbs_tuners od_tuners = {
 	.up_threshold = DEF_FREQUENCY_UP_THRESHOLD,
 	.sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR,
-	.down_differential = DEF_FREQUENCY_DOWN_DIFFERENTIAL,
+	.adj_up_threshold = DEF_FREQUENCY_UP_THRESHOLD -
+			    DEF_FREQUENCY_DOWN_DIFFERENTIAL,
 	.ignore_nice = 0,
 	.powersave_bias = 0,
 };
@@ -192,11 +193,9 @@ static void od_check_cpu(int cpu, unsigned int load_freq)
 	 * support the current CPU usage without triggering the up policy. To be
 	 * safe, we focus 10 points under the threshold.
 	 */
-	if (load_freq < (od_tuners.up_threshold - od_tuners.down_differential) *
-			policy->cur) {
+	if (load_freq < od_tuners.adj_up_threshold * policy->cur) {
 		unsigned int freq_next;
-		freq_next = load_freq / (od_tuners.up_threshold -
-				od_tuners.down_differential);
+		freq_next = load_freq / od_tuners.adj_up_threshold;
 
 		/* No longer fully busy, reset rate_mult */
 		dbs_info->rate_mult = 1;
@@ -351,6 +350,10 @@ static ssize_t store_up_threshold(struct kobject *a, struct attribute *b,
 			input < MIN_FREQUENCY_UP_THRESHOLD) {
 		return -EINVAL;
 	}
+	/* Calculate the new adj_up_threshold */
+	od_tuners.adj_up_threshold += input;
+	od_tuners.adj_up_threshold -= od_tuners.up_threshold;
+
 	od_tuners.up_threshold = input;
 	return count;
 }
@@ -507,7 +510,8 @@ static int __init cpufreq_gov_dbs_init(void)
 	if (idle_time != -1ULL) {
 		/* Idle micro accounting is supported. Use finer thresholds */
 		od_tuners.up_threshold = MICRO_FREQUENCY_UP_THRESHOLD;
-		od_tuners.down_differential = MICRO_FREQUENCY_DOWN_DIFFERENTIAL;
+		od_tuners.adj_up_threshold = MICRO_FREQUENCY_UP_THRESHOLD -
+					     MICRO_FREQUENCY_DOWN_DIFFERENTIAL;
 		/*
 		 * In nohz/micro accounting case we set the minimum frequency
 		 * not depending on HZ, but fixed (very low). The deferred
-- 
1.8.1
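
The update in store_up_threshold() above keeps the distance between
up_threshold and adj_up_threshold constant without storing the
differential itself. A minimal userspace sketch of that bookkeeping
follows; struct demo_od_tuners and demo_store_up_threshold() are
hypothetical names used only for illustration, and the input validation
done by the real sysfs handler is left out.

```c
/* Hypothetical stand-in for the two ondemand tunables involved. */
struct demo_od_tuners {
	unsigned int up_threshold;
	unsigned int adj_up_threshold; /* up_threshold minus the differential */
};

/*
 * Apply a new up_threshold. Adding the new value before subtracting the
 * old one shifts adj_up_threshold by the same amount as up_threshold,
 * and the intermediate result cannot wrap below zero.
 */
static void demo_store_up_threshold(struct demo_od_tuners *t,
				    unsigned int input)
{
	t->adj_up_threshold += input;
	t->adj_up_threshold -= t->up_threshold;
	t->up_threshold = input;
}
```

Starting from 80 and 70 (a differential of 10), storing 95 gives 95 and
85, and storing 60 afterwards gives 60 and 50, so the differential of 10
is preserved across updates.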


[PATCH] drivers: android: Restructure code in lowmemorykiller

2013-01-31 Thread Stratos Karafotis

This patch restructures code for better readability and easier
maintenance.

Also introduces lowmemorykiller.h header file.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/staging/android/lowmemorykiller.c | 162 ++
 drivers/staging/android/lowmemorykiller.h |  42 
 2 files changed, 139 insertions(+), 65 deletions(-)
 create mode 100644 drivers/staging/android/lowmemorykiller.h

diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
index 3b91b0f..ade8584 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -38,50 +38,44 @@
 #include <linux/rcupdate.h>
 #include <linux/profile.h>
 #include <linux/notifier.h>
+#include "lowmemorykiller.h"
 
-static uint32_t lowmem_debug_level = 2;
-static short lowmem_adj[6] = {
+static short lowmem_adj[LOWMEM_ARRAY_SIZE] = {
 	0,
 	1,
 	6,
 	12,
 };
-static int lowmem_adj_size = 4;
-static int lowmem_minfree[6] = {
+static int lowmem_minfree[LOWMEM_ARRAY_SIZE] = {
 	3 * 512,	/* 6MB */
 	2 * 1024,	/* 8MB */
 	4 * 1024,	/* 16MB */
 	16 * 1024,	/* 64MB */
 };
-static int lowmem_minfree_size = 4;
 
-static unsigned long lowmem_deathpending_timeout;
+static int lowmem_adj_size = DEF_LOWMEM_SIZE;
+static int lowmem_minfree_size = DEF_LOWMEM_SIZE;
+static uint32_t lowmem_debug_level = DEF_DEBUG_LEVEL;
 
-#define lowmem_print(level, x...)			\
-	do {						\
-		if (lowmem_debug_level >= (level))	\
-			printk(x);			\
-	} while (0)
+static unsigned long lowmem_deathpending_timeout;
+static short min_score_adj;
+static struct selected_struct selected;
 
-static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
+static void set_min_score_adj(struct shrink_control *sc)
 {
-	struct task_struct *tsk;
-	struct task_struct *selected = NULL;
-	int rem = 0;
-	int tasksize;
 	int i;
-	short min_score_adj = OOM_SCORE_ADJ_MAX + 1;
-	int selected_tasksize = 0;
-	short selected_oom_score_adj;
 	int array_size = ARRAY_SIZE(lowmem_adj);
 	int other_free = global_page_state(NR_FREE_PAGES);
 	int other_file = global_page_state(NR_FILE_PAGES) -
 						global_page_state(NR_SHMEM);
 
+	min_score_adj = OOM_SCORE_ADJ_MAX + 1;
+
 	if (lowmem_adj_size < array_size)
 		array_size = lowmem_adj_size;
 	if (lowmem_minfree_size < array_size)
 		array_size = lowmem_minfree_size;
+
 	for (i = 0; i < array_size; i++) {
 		if (other_free < lowmem_minfree[i] &&
 		    other_file < lowmem_minfree[i]) {
@@ -89,10 +83,82 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
 			break;
 		}
 	}
+
 	if (sc->nr_to_scan > 0)
 		lowmem_print(3, "lowmem_shrink %lu, %x, ofree %d %d, ma %hd\n",
 				sc->nr_to_scan, sc->gfp_mask, other_free,
 				other_file, min_score_adj);
+	return;
+}
+
+static enum lowmem_scan_t scan_process(struct task_struct *tsk)
+{
+	struct task_struct *p;
+	short oom_score_adj;
+	int tasksize;
+
+	if (tsk->flags & PF_KTHREAD)
+		return LMK_SCAN_CONTINUE;
+
+	p = find_lock_task_mm(tsk);
+	if (!p)
+		return LMK_SCAN_CONTINUE;
+
+	if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
+	    time_before_eq(jiffies, lowmem_deathpending_timeout)) {
+		task_unlock(p);
+		rcu_read_unlock();
+		return LMK_SCAN_ABORT;
+	}
+
+	oom_score_adj = p->signal->oom_score_adj;
+	if (oom_score_adj < min_score_adj) {
+		task_unlock(p);
+		return LMK_SCAN_CONTINUE;
+	}
+
+	tasksize = get_mm_rss(p->mm);
+	task_unlock(p);
+	if (tasksize <= 0)
+		return LMK_SCAN_CONTINUE;
+
+	if (selected.task) {
+		if (oom_score_adj < selected.oom_score_adj)
+			return LMK_SCAN_CONTINUE;
+
+		if (oom_score_adj == selected.oom_score_adj &&
+		    tasksize <= selected.tasksize)
+			return LMK_SCAN_CONTINUE;
+	}
+
+	selected.task = p;
+	selected.tasksize = tasksize;
+	selected.oom_score_adj = oom_score_adj;
+	lowmem_print(2, "select %d (%s), adj %hd, size %d, to kill\n",
+		     p->pid, p->comm, oom_score_adj, tasksize);
+
+	return LMK_SCAN_OK;
+}
+
+static inline void kill_selected(void)
+{
+	lowmem_print(1, "send sigkill to %d (%s), adj %hd, size %d\n",
+		     selected.task->pid, selected.task->comm,
+		     selected.oom_score_adj, selected.tasksize

Re: [PATCH] drivers: android: Restructure code in lowmemorykiller

2013-01-31 Thread Stratos Karafotis


On 02/01/2013 12:25 AM, Greg Kroah-Hartman wrote:

> Given that no one is working on it, why does it need to be maintained
> easier?  :)


Thanks for your immediate response.
I was thinking of working on this driver. Is it going to be obsolete or 
something?



> Why create a .h file?  Who needs it?  Only create a .h file if some
> other .c file needs something in it, never for just one .c file.
>
> sorry,
>
> greg k-h



I will fix it.

Thanks again,
Stratos

[PATCH] regmap: debugfs: Fix compiler warning

2013-02-02 Thread Stratos Karafotis

This patch fixes the following compiler warning of uninitialized
variable:

drivers/base/regmap/regmap-debugfs.c: In function ‘regmap_read_debugfs’:
drivers/base/regmap/regmap-debugfs.c:180:9: warning: ‘ret’ may be used
uninitialized in this function [-Wmaybe-uninitialized]

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/base/regmap/regmap-debugfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/regmap/regmap-debugfs.c b/drivers/base/regmap/regmap-debugfs.c
index d9a6c94..aa570af 100644
--- a/drivers/base/regmap/regmap-debugfs.c
+++ b/drivers/base/regmap/regmap-debugfs.c
@@ -80,7 +80,7 @@ static unsigned int regmap_debugfs_get_dump_start(struct regmap *map,
 {
 	struct regmap_debugfs_off_cache *c = NULL;
 	loff_t p = 0;
-	unsigned int i, ret;
+	unsigned int i, ret = base;
 
 	/*
 	 * If we don't have a cache build one so we don't have to do a
-- 
1.8.1


[PATCH linux-next] cpufreq: governors: Calculate iowait time only when necessary

2013-02-27 Thread Stratos Karafotis

Currently we always calculate the CPU iowait time and add it to idle time.
If we are in ondemand and we use io_is_busy, we re-calculate iowait time
and we subtract it from idle time.

With this patch iowait time is calculated only when necessary avoiding
the double call to get_cpu_iowait_time_us. We use a parameter in
function get_cpu_idle_time to distinguish when the iowait time will be
added to idle time or not, without the need of keeping the prev_io_wait.
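
As a rough illustration of the idea, the parameter can be modelled in
userspace as below. This is a sketch under assumptions, not the kernel
function: struct demo_cpu_sample and demo_idle_time() are hypothetical
names, and the two counters stand in for the values that
get_cpu_idle_time_us() and get_cpu_iowait_time_us() would return.

```c
#include <stdint.h>

/* Hypothetical snapshot of one CPU's counters, in microseconds. */
struct demo_cpu_sample {
	uint64_t idle_us;
	uint64_t iowait_us;
};

/*
 * Idle time as seen by the governor. When io_busy is set, time spent
 * waiting for IO is treated as busy, so it is simply not added and the
 * iowait counter does not need to be read a second time.
 */
static uint64_t demo_idle_time(const struct demo_cpu_sample *s, int io_busy)
{
	uint64_t idle = s->idle_us;

	if (!io_busy)
		idle += s->iowait_us;

	return idle;
}
```

A single flag replaces the add-then-subtract sequence: conservative
always passes 0, while ondemand passes its io_is_busy tunable.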

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_conservative.c |  2 +-
 drivers/cpufreq/cpufreq_governor.c | 46 +-
 drivers/cpufreq/cpufreq_governor.h |  3 +--
 drivers/cpufreq/cpufreq_ondemand.c | 11 +++-
 4 files changed, 29 insertions(+), 33 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 4fd0006..dfe652c 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -242,7 +242,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
 		struct cs_cpu_dbs_info_s *dbs_info;
 		dbs_info = &per_cpu(cs_cpu_dbs_info, j);
 		dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
-						&dbs_info->cdbs.prev_cpu_wall);
+					&dbs_info->cdbs.prev_cpu_wall, 0);
 		if (cs_tuners.ignore_nice)
 			dbs_info->cdbs.prev_cpu_nice =
 				kcpustat_cpu(j).cpustat[CPUTIME_NICE];
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 5a76086..a322bda 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -50,13 +50,13 @@ static inline u64 get_cpu_idle_time_jiffy(unsigned int cpu, 
u64 *wall)
return cputime_to_usecs(idle_time);
 }
 
-u64 get_cpu_idle_time(unsigned int cpu, u64 *wall)
+u64 get_cpu_idle_time(unsigned int cpu, u64 *wall, int io_busy)
 {
u64 idle_time = get_cpu_idle_time_us(cpu, NULL);
 
if (idle_time == -1ULL)
return get_cpu_idle_time_jiffy(cpu, wall);
-   else
+   else if (!io_busy)
idle_time += get_cpu_iowait_time_us(cpu, wall);
 
return idle_time;
@@ -83,13 +83,22 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
/* Get Absolute Load (in terms of freq for ondemand gov) */
for_each_cpu(j, policy->cpus) {
struct cpu_dbs_common_info *j_cdbs;
-   u64 cur_wall_time, cur_idle_time, cur_iowait_time;
-   unsigned int idle_time, wall_time, iowait_time;
+   u64 cur_wall_time, cur_idle_time;
+   unsigned int idle_time, wall_time;
unsigned int load;
+   int io_busy = 0;
 
j_cdbs = dbs_data->get_cpu_cdbs(j);
 
-   cur_idle_time = get_cpu_idle_time(j, &cur_wall_time);
+   /*
+* For the purpose of ondemand, waiting for disk IO is
+* an indication that you're performance critical, and
+* not that the system is actually idle. So do not add
+* the iowait time to the cpu idle time.
+*/
+   if (dbs_data->governor == GOV_ONDEMAND)
+   io_busy = od_tuners->io_is_busy;
+   cur_idle_time = get_cpu_idle_time(j, &cur_wall_time, io_busy);
 
wall_time = (unsigned int)
(cur_wall_time - j_cdbs->prev_cpu_wall);
@@ -117,29 +126,6 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
idle_time += jiffies_to_usecs(cur_nice_jiffies);
}
 
-   if (dbs_data->governor == GOV_ONDEMAND) {
-   struct od_cpu_dbs_info_s *od_j_dbs_info =
-   dbs_data->get_cpu_dbs_info_s(cpu);
-
-   cur_iowait_time = get_cpu_iowait_time_us(j,
-   &cur_wall_time);
-   if (cur_iowait_time == -1ULL)
-   cur_iowait_time = 0;
-
-   iowait_time = (unsigned int) (cur_iowait_time -
-   od_j_dbs_info->prev_cpu_iowait);
-   od_j_dbs_info->prev_cpu_iowait = cur_iowait_time;
-
-   /*
-* For the purpose of ondemand, waiting for disk IO is
-* an indication that you're performance critical, and
-* not that the system is actually idle. So subtract the
-* iowait time from the cpu idle time.
-*/
-   if (od_tuners->io_is_busy && idle_time >= iowait_time)
-   idle_time -= iowait_time;
-   }
-
if (unlikely(!wall_time || wall_time

Re: [PATCH v2 linux-next] cpufreq: governors: Calculate iowait time only when necessary

2013-02-28 Thread Stratos Karafotis

Hi Viresh,

On 02/28/2013 08:58 AM, Viresh Kumar wrote:
> I have really spent some 10-15 minutes reviewing this patch as initially it
> looked to me like something is missing here and calculations are going wrong.
>
> But it was fine :)

Thanks for your review.
 
>> diff --git a/drivers/cpufreq/cpufreq_ondemand.c 
>> b/drivers/cpufreq/cpufreq_ondemand.c
>> index f3eb26c..46be2c4 100644
>> --- a/drivers/cpufreq/cpufreq_ondemand.c
>> +++ b/drivers/cpufreq/cpufreq_ondemand.c
>> @@ -339,11 +339,20 @@ static ssize_t store_io_is_busy(struct kobject *a, 
>> struct attribute *b,
>>   {
>
>> +   /* we need to re-evaluate prev_cpu_idle */
>> +   for_each_online_cpu(j) {
>> +   struct od_cpu_dbs_info_s *dbs_info;
>> +   dbs_info = &per_cpu(od_cpu_dbs_info, j);
>
> Probably squash above two lines and give a blank line after it.
>
>> +   dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
>> +   &dbs_info->cdbs.prev_cpu_wall, od_tuners.io_is_busy);
>> +   }
>>  return count;
>>   }
>
> And after that add my:
>
> Acked-by: Viresh Kumar <viresh.ku...@linaro.org>
 

Below is V2 with your suggestion included.

Regards,
Stratos


8--
Currently we always calculate the CPU iowait time and add it to idle time.
If we are in ondemand and we use io_is_busy, we re-calculate iowait time
and we subtract it from idle time.

With this patch iowait time is calculated only when necessary avoiding
the double call to get_cpu_iowait_time_us. We use a parameter in
function get_cpu_idle_time to distinguish when the iowait time will be
added to idle time or not, without the need of keeping the prev_io_wait.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_conservative.c |  2 +-
 drivers/cpufreq/cpufreq_governor.c | 46 +-
 drivers/cpufreq/cpufreq_governor.h |  3 +--
 drivers/cpufreq/cpufreq_ondemand.c | 11 +++-
 4 files changed, 29 insertions(+), 33 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c 
b/drivers/cpufreq/cpufreq_conservative.c
index 4fd0006..dfe652c 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -242,7 +242,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, 
struct attribute *b,
struct cs_cpu_dbs_info_s *dbs_info;
dbs_info = &per_cpu(cs_cpu_dbs_info, j);
dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
-   &dbs_info->cdbs.prev_cpu_wall);
+   &dbs_info->cdbs.prev_cpu_wall, 0);
if (cs_tuners.ignore_nice)
dbs_info-cdbs.prev_cpu_nice =
kcpustat_cpu(j).cpustat[CPUTIME_NICE];
diff --git a/drivers/cpufreq/cpufreq_governor.c 
b/drivers/cpufreq/cpufreq_governor.c
index 5a76086..a322bda 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -50,13 +50,13 @@ static inline u64 get_cpu_idle_time_jiffy(unsigned int cpu, 
u64 *wall)
return cputime_to_usecs(idle_time);
 }
 
-u64 get_cpu_idle_time(unsigned int cpu, u64 *wall)
+u64 get_cpu_idle_time(unsigned int cpu, u64 *wall, int io_busy)
 {
u64 idle_time = get_cpu_idle_time_us(cpu, NULL);
 
if (idle_time == -1ULL)
return get_cpu_idle_time_jiffy(cpu, wall);
-   else
+   else if (!io_busy)
idle_time += get_cpu_iowait_time_us(cpu, wall);
 
return idle_time;
@@ -83,13 +83,22 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
/* Get Absolute Load (in terms of freq for ondemand gov) */
for_each_cpu(j, policy->cpus) {
struct cpu_dbs_common_info *j_cdbs;
-   u64 cur_wall_time, cur_idle_time, cur_iowait_time;
-   unsigned int idle_time, wall_time, iowait_time;
+   u64 cur_wall_time, cur_idle_time;
+   unsigned int idle_time, wall_time;
unsigned int load;
+   int io_busy = 0;
 
j_cdbs = dbs_data->get_cpu_cdbs(j);
 
-   cur_idle_time = get_cpu_idle_time(j, &cur_wall_time);
+   /*
+* For the purpose of ondemand, waiting for disk IO is
+* an indication that you're performance critical, and
+* not that the system is actually idle. So do not add
+* the iowait time to the cpu idle time.
+*/
+   if (dbs_data->governor == GOV_ONDEMAND)
+   io_busy = od_tuners->io_is_busy;
+   cur_idle_time = get_cpu_idle_time(j, &cur_wall_time, io_busy);
 
wall_time = (unsigned int)
(cur_wall_time - j_cdbs->prev_cpu_wall);
@@ -117,29 +126,6 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
idle_time += jiffies_to_usecs(cur_nice_jiffies

[PATCH next-20130124] Sound: pci: Fix unused variable warning in patch_sigmatel.c

2013-01-24 Thread Stratos Karafotis

Fix the following build warnings

sound/pci/hda/patch_sigmatel.c: In function ‘stac92hd71bxx_fixup_hp’:
sound/pci/hda/patch_sigmatel.c:2434:24: warning: unused variable ‘spec’ 
[-Wunused-variable]

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 sound/pci/hda/patch_sigmatel.c | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c
index 0aa0ceb..f269d1f 100644
--- a/sound/pci/hda/patch_sigmatel.c
+++ b/sound/pci/hda/patch_sigmatel.c
@@ -2365,7 +2365,6 @@ static void stac92hd71bxx_fixup_ref(struct hda_codec 
*codec,
 static void stac92hd71bxx_fixup_hp_m4(struct hda_codec *codec,
  const struct hda_fixup *fix, int action)
 {
-   struct sigmatel_spec *spec = codec->spec;
struct hda_jack_tbl *jack;
 
if (action != HDA_FIXUP_ACT_PRE_PROBE)
@@ -2381,7 +2380,7 @@ static void stac92hd71bxx_fixup_hp_m4(struct hda_codec 
*codec,
if (jack)
jack->private_data = 0x02;
 
-   spec->gpio_mask |= 0x02;
+   ((struct sigmatel_spec *)codec->spec)->gpio_mask |= 0x02;
 
/* enable internal microphone */
snd_hda_codec_set_pincfg(codec, 0x0e, 0x01813040);
@@ -2420,19 +2419,15 @@ static void stac92hd71bxx_fixup_hp_dv5(struct hda_codec 
*codec,
 static void stac92hd71bxx_fixup_hp_hdx(struct hda_codec *codec,
   const struct hda_fixup *fix, int action)
 {
-   struct sigmatel_spec *spec = codec->spec;
-
if (action != HDA_FIXUP_ACT_PRE_PROBE)
return;
-   spec->gpio_led = 0x08;
+   ((struct sigmatel_spec *)codec->spec)->gpio_led = 0x08;
 }
 
 
 static void stac92hd71bxx_fixup_hp(struct hda_codec *codec,
   const struct hda_fixup *fix, int action)
 {
-   struct sigmatel_spec *spec = codec->spec;
-
if (action != HDA_FIXUP_ACT_PRE_PROBE)
return;
 
@@ -2456,8 +2451,9 @@ static void stac92hd71bxx_fixup_hp(struct hda_codec 
*codec,
 
if (find_mute_led_cfg(codec, 1))
snd_printd("mute LED gpio %d polarity %d\n",
-   spec->gpio_led,
-   spec->gpio_led_polarity);
+   ((struct sigmatel_spec *)codec->spec)->gpio_led,
+   ((struct sigmatel_spec *)codec->spec)->
+   gpio_led_polarity);
 
 }
 
-- 
1.8.1


[PATCH] drivers: infiniband: Fix compiler warning

2013-01-25 Thread Stratos Karafotis

This patch fixes the following compiler warning of uninitialized variable

drivers/infiniband/hw/mlx4/qp.c: In function ‘mlx4_ib_post_send’:
drivers/infiniband/hw/mlx4/qp.c:1862:62: warning: ‘vlan’ may be used 
uninitialized
in this function [-Wmaybe-uninitialized]
drivers/infiniband/hw/mlx4/qp.c:1752:6: note: ‘vlan’ was declared here

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/infiniband/hw/mlx4/qp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 19e0637..37829b6 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -1749,7 +1749,7 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, 
struct ib_send_wr *wr,
int is_eth;
int is_vlan = 0;
int is_grh;
-   u16 vlan;
+   u16 vlan = 0;
int err = 0;

send_size = 0;
-- 
1.8.1


[PATCH linux-next] net: ipv6: Fix compiler warning

2013-02-18 Thread Stratos Karafotis

Fix the following compiler warning (also a checkpatch error):

net/ipv6/xfrm6_mode_tunnel.c: In function ‘xfrm6_mode_tunnel_input’:
net/ipv6/xfrm6_mode_tunnel.c:72:2: warning: suggest parentheses around
assignment used as truth value [-Wparentheses]

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 net/ipv6/xfrm6_mode_tunnel.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/xfrm6_mode_tunnel.c b/net/ipv6/xfrm6_mode_tunnel.c
index 93c41a8..9bf6a74 100644
--- a/net/ipv6/xfrm6_mode_tunnel.c
+++ b/net/ipv6/xfrm6_mode_tunnel.c
@@ -69,7 +69,8 @@ static int xfrm6_mode_tunnel_input(struct xfrm_state *x, 
struct sk_buff *skb)
if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
goto out;
 
-   if (err = skb_unclone(skb, GFP_ATOMIC))
+   err = skb_unclone(skb, GFP_ATOMIC);
+   if (err)
goto out;
 
if (x->props.flags & XFRM_STATE_DECAP_DSCP)
-- 
1.8.1.2


[PATCH linux-next] cpufreq: ondemand: Calculate gradient of CPU load to early increase frequency

2013-02-20 Thread Stratos Karafotis

Instead of checking only the absolute value of CPU load_freq to increase
frequency, we detect forthcoming CPU load rise and increase frequency
earlier.

Every sampling rate, we calculate the gradient of load_freq.
If it is too steep we assume that the load most probably will
go over up_threshold in next iteration(s). We reduce up_threshold
by early_differential to achieve frequency increase in the current
iteration.

A new tuner early_demand is introduced to enable this functionality
(disabled by default). Also we use new tuners to control early demand:

- early_differential: controls the final up_threshold
- grad_up_threshold: over this gradient of load we will decrease
up_threshold by early_differential.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_governor.c |  1 +
 drivers/cpufreq/cpufreq_governor.h |  4 
 drivers/cpufreq/cpufreq_ondemand.c | 41 +-
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c 
b/drivers/cpufreq/cpufreq_governor.c
index 5a76086..348cb80 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -276,6 +276,7 @@ int cpufreq_governor_dbs(struct dbs_data *dbs_data,
} else {
od_dbs_info->rate_mult = 1;
od_dbs_info->sample_type = OD_NORMAL_SAMPLE;
+   od_dbs_info->prev_load_freq = 0;
od_ops->powersave_bias_init_cpu(cpu);
 
if (!policy->governor->initialized)
diff --git a/drivers/cpufreq/cpufreq_governor.h 
b/drivers/cpufreq/cpufreq_governor.h
index d2ac911..d1425a7 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -94,6 +94,7 @@ struct od_cpu_dbs_info_s {
unsigned int freq_hi_jiffies;
unsigned int rate_mult;
unsigned int sample_type:1;
+   unsigned int prev_load_freq;
 };
 
 struct cs_cpu_dbs_info_s {
@@ -112,6 +113,9 @@ struct od_dbs_tuners {
unsigned int adj_up_threshold;
unsigned int powersave_bias;
unsigned int io_is_busy;
+   unsigned int early_differential;
+   unsigned int grad_up_threshold;
+   unsigned int early_demand;
 };
 
 struct cs_dbs_tuners {
diff --git a/drivers/cpufreq/cpufreq_ondemand.c 
b/drivers/cpufreq/cpufreq_ondemand.c
index f3eb26c..458806f 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -30,6 +30,8 @@
 #define DEF_FREQUENCY_DOWN_DIFFERENTIAL (10)
 #define DEF_FREQUENCY_UP_THRESHOLD (80)
 #define DEF_SAMPLING_DOWN_FACTOR   (1)
+#define DEF_GRAD_UP_THESHOLD   (50)
+#define DEF_EARLY_DIFFERENTIAL (45)
 #define MAX_SAMPLING_DOWN_FACTOR   (10)
 #define MICRO_FREQUENCY_DOWN_DIFFERENTIAL  (3)
 #define MICRO_FREQUENCY_UP_THRESHOLD   (95)
@@ -49,8 +51,11 @@ static struct od_dbs_tuners od_tuners = {
.sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR,
.adj_up_threshold = DEF_FREQUENCY_UP_THRESHOLD -
DEF_FREQUENCY_DOWN_DIFFERENTIAL,
+   .early_differential = DEF_EARLY_DIFFERENTIAL,
+   .grad_up_threshold = DEF_GRAD_UP_THESHOLD,
.ignore_nice = 0,
.powersave_bias = 0,
+   .early_demand = 0,
 };
 
 static void ondemand_powersave_bias_init_cpu(int cpu)
@@ -170,11 +175,29 @@ static void od_check_cpu(int cpu, unsigned int load_freq)
 {
struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
+   unsigned int up_threshold = od_tuners.up_threshold;
+   unsigned int grad;
 
dbs_info->freq_lo = 0;
 
+   /*
+* Calculate the gradient of load_freq. If it is too steep we assume
+* that the load will go over up_threshold in next iteration(s). We
+* reduce up_threshold by early_differential to achieve frequency
+* increase earlier
+*/
+   if (od_tuners.early_demand) {
+   if (load_freq > dbs_info->prev_load_freq) {
+   grad = load_freq - dbs_info->prev_load_freq;
+
+   if (grad > od_tuners.grad_up_threshold * policy->cur)
+   up_threshold -= od_tuners.early_differential;
+   }
+   dbs_info->prev_load_freq = load_freq;
+   }
+
/* Check for frequency increase */
-   if (load_freq > od_tuners.up_threshold * policy->cur) {
+   if (load_freq > up_threshold * policy->cur) {
/* If switching to max speed, apply sampling_down_factor */
if (policy->cur < policy->max)
dbs_info->rate_mult =
@@ -438,12 +461,26 @@ static ssize_t store_powersave_bias(struct kobject *a, 
struct attribute *b,
return count;
 }
 
+static ssize_t store_early_demand(struct kobject

Re: [PATCH v2 linux-next] cpufreq: ondemand: Calculate gradient of CPU load to early increase frequency

2013-02-21 Thread Stratos Karafotis

Hi Viresh,

Thank you very much for your review and your suggestions.

On 02/21/2013 06:59 AM, Viresh Kumar wrote:
> Sorry for this but i already have a patchset which has changed these files
> to some extent. Can you please rebase over them? Actually my patchset
> is already accepted, its just that rafael didn't wanted to have them for 3.9.
>
> http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/heads/cpufreq-for-3.10

No problem. I rebased the patch over your tree.

> So, probably you just don't need this tunable: early_differential.
> Rather just increase the frequency without doing this calculation:
>
> if (load_freq > od_tuners.up_threshold * policy->cur) {

I agree with your suggestion. This is a simpler approach. So, if the
gradient is greater than grad_up_threshold, we increase the frequency
immediately.

For example, if the previous load was 10 and the current load is 65, the
gradient will be 55 (> grad_up_threshold) and we increase the frequency.
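
The arithmetic of that example can be checked with a small sketch. It is a simplification under stated assumptions: loads are plain percentages here, whereas the patch compares load_freq against grad_up_threshold * policy->cur, and `early_boost` is a hypothetical helper name, not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true when the load rose by more than grad_up_threshold
 * percentage points between two consecutive samples.
 */
static bool early_boost(unsigned int prev_load, unsigned int cur_load,
                        unsigned int grad_up_threshold)
{
    /* Loads and the threshold are percentages in this sketch. */
    assert(prev_load <= 100 && cur_load <= 100);
    assert(grad_up_threshold <= 100);

    /* Only a rising load has a positive gradient worth checking. */
    return cur_load > prev_load &&
           cur_load - prev_load > grad_up_threshold;
}
```

With the default threshold of 50 from the patch, `early_boost(10, 65, 50)` is true because the gradient is 55, while a step from 10 to 60 does not qualify because the comparison is strict.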

>> +   if (od_tuners.early_demand) {
>> +   if (load_freq > dbs_info->prev_load_freq) {
>
> && (load_freq < od_tuners.up_threshold * policy->cur) ??

In my opinion this is not necessary. If load_freq is greater, then we have
to increase the frequency anyway.

 
>> +show_one(od, early_demand, early_demand);
>
> What about making other two tunables rw?
 

Of course. I added the grad_up_threshold.

Patch v2 follows.

Thanks again,
Stratos

8--
Instead of checking only the absolute value of CPU load_freq to increase
frequency, we detect forthcoming CPU load rise and increase frequency
earlier.

Every sampling rate, we calculate the gradient of load_freq. If it is 
too steep we assume that the load most probably will go over 
up_threshold in next iteration(s) and we increase frequency immediately.

New tuners are introduced:
- early_demand: to enable this functionality (disabled by default).
- grad_up_threshold: over this gradient of load we will increase
frequency immediately.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_governor.c |  1 +
 drivers/cpufreq/cpufreq_governor.h |  6 +++-
 drivers/cpufreq/cpufreq_ondemand.c | 72 ++
 3 files changed, 72 insertions(+), 7 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c 
b/drivers/cpufreq/cpufreq_governor.c
index 7722505..e737aa9 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -322,6 +322,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
} else {
od_dbs_info->rate_mult = 1;
od_dbs_info->sample_type = OD_NORMAL_SAMPLE;
+   od_dbs_info->prev_load_freq = 0;
od_ops->powersave_bias_init_cpu(cpu);
}
 
diff --git a/drivers/cpufreq/cpufreq_governor.h 
b/drivers/cpufreq/cpufreq_governor.h
index 6301790..c9a237a 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -96,6 +96,7 @@ struct od_cpu_dbs_info_s {
unsigned int freq_hi_jiffies;
unsigned int rate_mult;
unsigned int sample_type:1;
+   unsigned int prev_load_freq;
 };
 
 struct cs_cpu_dbs_info_s {
@@ -114,6 +115,8 @@ struct od_dbs_tuners {
unsigned int adj_up_threshold;
unsigned int powersave_bias;
unsigned int io_is_busy;
+   unsigned int grad_up_threshold;
+   unsigned int early_demand;
 };
 
 struct cs_dbs_tuners {
@@ -160,7 +163,8 @@ struct od_ops {
void (*powersave_bias_init_cpu)(int cpu);
unsigned int (*powersave_bias_target)(struct cpufreq_policy *policy,
unsigned int freq_next, unsigned int relation);
-   void (*freq_increase)(struct cpufreq_policy *p, unsigned int freq);
+   void (*freq_increase)(struct od_cpu_dbs_info_s *dbs_info,
+ struct cpufreq_policy *p, unsigned int freq);
 };
 
 struct cs_ops {
diff --git a/drivers/cpufreq/cpufreq_ondemand.c 
b/drivers/cpufreq/cpufreq_ondemand.c
index c5fd794..4c948af 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -31,6 +31,7 @@
 #define DEF_FREQUENCY_DOWN_DIFFERENTIAL (10)
 #define DEF_FREQUENCY_UP_THRESHOLD (80)
 #define DEF_SAMPLING_DOWN_FACTOR   (1)
+#define DEF_GRAD_UP_THRESHOLD  (50)
 #define MAX_SAMPLING_DOWN_FACTOR   (10)
 #define MICRO_FREQUENCY_DOWN_DIFFERENTIAL  (3)
 #define MICRO_FREQUENCY_UP_THRESHOLD   (95)
@@ -139,11 +140,16 @@ static void ondemand_powersave_bias_init(void)
}
 }
 
-static void dbs_freq_increase(struct cpufreq_policy *p, unsigned int freq)
+static void dbs_freq_increase(struct od_cpu_dbs_info_s *dbs_info,
+ struct cpufreq_policy *p, unsigned int freq)
 {
struct dbs_data *dbs_data = p->governor_data;
struct od_dbs_tuners

Re: [PATCH v3 linux-next] cpufreq: ondemand: Calculate gradient of CPU load to early increase frequency

2013-02-21 Thread Stratos Karafotis

Hi,

On 02/21/2013 05:33 PM, Viresh Kumar wrote:
> Hi Again,
>
> int boost_freq = 0;
>
> if (od_tuners->early_demand) {
> if (load_freq > dbs_info->prev_load_freq && (load_freq
> - dbs_info->prev_load_freq >
> od_tuners->grad_up_threshold * policy->cur)) {
> boost_freq = 1;
> }
>
> dbs_info->prev_load_freq = load_freq;
> }
>
>  /* Check for frequency increase */
>  if (boost_freq || (load_freq > od_tuners->up_threshold * 
> policy->cur)) {
>   increase-freq;
>
>
> This would get rid of duplicate calls to increase_freq() and also
> avoid changing its
> prototype.
 

Thanks again. V3 with your suggestion follows.

Regards,
Stratos

---8
Instead of checking only the absolute value of CPU load_freq to increase
frequency, we detect forthcoming CPU load rise and increase frequency
earlier.

Every sampling rate, we calculate the gradient of load_freq. If it is
too steep we assume that the load most probably will go over
up_threshold in next iteration(s) and we increase frequency immediately.

New tuners are introduced:
- early_demand: to enable this functionality (disabled by default).
- grad_up_threshold: over this gradient of load we will increase
frequency immediately.
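
The decision described above can be sketched as a stand-alone function. This is a user-space illustration under stated assumptions, not the kernel implementation: `struct tuners_sketch`, `struct cpu_sketch` and `should_increase` are hypothetical names, and only the fields needed for the decision are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the governor tunables named in the patch. */
struct tuners_sketch {
    unsigned int up_threshold;       /* percent, e.g. 80 */
    unsigned int grad_up_threshold;  /* percent, e.g. 50 */
    bool early_demand;               /* feature switch, off by default */
};

/* Hypothetical per-CPU state: only the previous sample is needed. */
struct cpu_sketch {
    unsigned int prev_load_freq;
};

/*
 * Decide whether one sample should trigger a frequency increase.
 * load_freq is load (percent) multiplied by frequency (kHz) and cur_khz
 * is the current policy frequency, mirroring the units in the patch.
 */
static bool should_increase(struct cpu_sketch *cpu,
                            const struct tuners_sketch *t,
                            unsigned int load_freq, unsigned int cur_khz)
{
    bool boost = false;

    assert(cpu != NULL && t != NULL);

    if (t->early_demand) {
        /* Rising load whose step exceeds the gradient threshold. */
        if (load_freq > cpu->prev_load_freq &&
            load_freq - cpu->prev_load_freq >
                t->grad_up_threshold * cur_khz)
            boost = true;
        cpu->prev_load_freq = load_freq;
    }

    /* Either the early boost or the usual absolute check fires. */
    return boost || load_freq > t->up_threshold * cur_khz;
}
```

At 1 GHz (1000000 kHz) with the defaults above, a jump from 10% to 65% load triggers the early boost even though 65% is below up_threshold, while a later sample at 70% triggers nothing.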

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_governor.c |  1 +
 drivers/cpufreq/cpufreq_governor.h |  3 ++
 drivers/cpufreq/cpufreq_ondemand.c | 59 +-
 3 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c 
b/drivers/cpufreq/cpufreq_governor.c
index 7722505..e737aa9 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -322,6 +322,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
} else {
od_dbs_info->rate_mult = 1;
od_dbs_info->sample_type = OD_NORMAL_SAMPLE;
+   od_dbs_info->prev_load_freq = 0;
od_ops->powersave_bias_init_cpu(cpu);
}
 
diff --git a/drivers/cpufreq/cpufreq_governor.h 
b/drivers/cpufreq/cpufreq_governor.h
index 6301790..3a703b9 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -96,6 +96,7 @@ struct od_cpu_dbs_info_s {
unsigned int freq_hi_jiffies;
unsigned int rate_mult;
unsigned int sample_type:1;
+   unsigned int prev_load_freq;
 };
 
 struct cs_cpu_dbs_info_s {
@@ -114,6 +115,8 @@ struct od_dbs_tuners {
unsigned int adj_up_threshold;
unsigned int powersave_bias;
unsigned int io_is_busy;
+   unsigned int grad_up_threshold;
+   unsigned int early_demand;
 };
 
 struct cs_dbs_tuners {
diff --git a/drivers/cpufreq/cpufreq_ondemand.c 
b/drivers/cpufreq/cpufreq_ondemand.c
index c5fd794..fa4b21e 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -31,6 +31,7 @@
 #define DEF_FREQUENCY_DOWN_DIFFERENTIAL (10)
 #define DEF_FREQUENCY_UP_THRESHOLD (80)
 #define DEF_SAMPLING_DOWN_FACTOR   (1)
+#define DEF_GRAD_UP_THRESHOLD  (50)
 #define MAX_SAMPLING_DOWN_FACTOR   (10)
 #define MICRO_FREQUENCY_DOWN_DIFFERENTIAL  (3)
 #define MICRO_FREQUENCY_UP_THRESHOLD   (95)
@@ -168,11 +169,26 @@ static void od_check_cpu(int cpu, unsigned int load_freq)
struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
struct dbs_data *dbs_data = policy->governor_data;
struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+   int boost_freq = 0;
 
dbs_info->freq_lo = 0;
 
+   /*
+* Calculate the gradient of load_freq. If it is too steep we assume
+* that the load will go over up_threshold in next iteration(s) and
+* we increase the frequency immediately
+*/
+   if (od_tuners->early_demand) {
+   if (load_freq > dbs_info->prev_load_freq &&
+  (load_freq - dbs_info->prev_load_freq >
+   od_tuners->grad_up_threshold * policy->cur))
+   boost_freq = 1;
+
+   dbs_info->prev_load_freq = load_freq;
+   }
+
/* Check for frequency increase */
-   if (load_freq > od_tuners->up_threshold * policy->cur) {
+   if (boost_freq || (load_freq > od_tuners->up_threshold * policy->cur)) {
/* If switching to max speed, apply sampling_down_factor */
if (policy->cur < policy->max)
dbs_info->rate_mult =
@@ -445,12 +461,47 @@ static ssize_t store_powersave_bias(struct cpufreq_policy 
*policy,
return count;
 }
 
+static ssize_t store_grad_up_threshold(struct cpufreq_policy *policy,
+   const char *buf, size_t count)
+{
+   struct dbs_data *dbs_data = policy->governor_data;
+   struct

[PATCH] cpufreq: governors: Remove duplicate check of target freq in supported range

2013-08-26 Thread Stratos Karafotis

Function __cpufreq_driver_target checks if target_freq is within
policy->min and policy->max range. generic_powersave_bias_target also
checks if target_freq is valid through cpufreq_frequency_table_target
call. So, drop the unnecessary duplicate check in *_check_cpu functions.
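
Why the caller-side check is redundant can be sketched as follows. This is a user-space illustration, not kernel code: `struct limits_sketch`, `target_clamped`, `request_with_precheck` and `request_without_precheck` are hypothetical names, and `target_clamped` only models the range check that the description attributes to __cpufreq_driver_target.

```c
#include <assert.h>

/* Hypothetical policy limits in kHz. */
struct limits_sketch {
    unsigned int min;
    unsigned int max;
};

/* Stand-in for the range check the callee performs on its own. */
static unsigned int target_clamped(const struct limits_sketch *l,
                                   unsigned int target)
{
    assert(l->min <= l->max);

    if (target > l->max)
        target = l->max;
    if (target < l->min)
        target = l->min;
    return target;
}

/* Caller that caps the request first, as the governors did. */
static unsigned int request_with_precheck(const struct limits_sketch *l,
                                          unsigned int requested)
{
    if (requested > l->max)
        requested = l->max;
    return target_clamped(l, requested);
}

/* Caller that relies on the callee alone, as the patch proposes. */
static unsigned int request_without_precheck(const struct limits_sketch *l,
                                             unsigned int requested)
{
    return target_clamped(l, requested);
}
```

Both callers end up selecting the same frequency for any request; the difference is only in what value the caller keeps in its own bookkeeping afterwards.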

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_conservative.c | 4 
 drivers/cpufreq/cpufreq_ondemand.c | 3 ---
 2 files changed, 7 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c 
b/drivers/cpufreq/cpufreq_conservative.c
index 7f67a75..f62d822 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -67,8 +67,6 @@ static void cs_check_cpu(int cpu, unsigned int load)
return;
 
dbs_info->requested_freq += get_freq_target(cs_tuners, policy);
-   if (dbs_info->requested_freq > policy->max)
-   dbs_info->requested_freq = policy->max;
 
__cpufreq_driver_target(policy, dbs_info->requested_freq,
CPUFREQ_RELATION_H);
@@ -89,8 +87,6 @@ static void cs_check_cpu(int cpu, unsigned int load)
return;
 
dbs_info->requested_freq -= get_freq_target(cs_tuners, policy);
-   if (dbs_info->requested_freq < policy->min)
-   dbs_info->requested_freq = policy->min;
 
__cpufreq_driver_target(policy, dbs_info->requested_freq,
CPUFREQ_RELATION_L);
diff --git a/drivers/cpufreq/cpufreq_ondemand.c 
b/drivers/cpufreq/cpufreq_ondemand.c
index 87f3305..32f26f6 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -177,9 +177,6 @@ static void od_check_cpu(int cpu, unsigned int load)
/* No longer fully busy, reset rate_mult */
dbs_info->rate_mult = 1;
 
-   if (freq_next < policy->min)
-   freq_next = policy->min;
-
if (!od_tuners->powersave_bias) {
__cpufreq_driver_target(policy, freq_next,
CPUFREQ_RELATION_L);
-- 
1.8.3.1


[PATCH] cpufreq: governor: Fix typos in comments

2013-08-26 Thread Stratos Karafotis

- 'Governer' should be 'Governor'
- 'S' is used for Siemens (electrical conductance) in SI units.
Use small 's' for seconds.

Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
---
 drivers/cpufreq/cpufreq_governor.c |  2 +-
 drivers/cpufreq/cpufreq_governor.h | 12 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c 
b/drivers/cpufreq/cpufreq_governor.c
index 8742736..fea104a 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -230,7 +230,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 
policy->governor_data = dbs_data;
 
-   /* policy latency is in nS. Convert it to uS first */
+   /* policy latency is in ns. Convert it to us first */
latency = policy->cpuinfo.transition_latency / 1000;
if (latency == 0)
latency = 1;
diff --git a/drivers/cpufreq/cpufreq_governor.h 
b/drivers/cpufreq/cpufreq_governor.h
index a02d78b..88cd39f 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -25,11 +25,11 @@
 /*
  * The polling frequency depends on the capability of the processor. Default
  * polling frequency is 1000 times the transition latency of the processor. The
- * governor will work on any processor with transition latency <= 10mS, using
+ * governor will work on any processor with transition latency <= 10ms, using
  * appropriate sampling rate.
  *
- * For CPUs with transition latency > 10mS (mostly drivers with 
CPUFREQ_ETERNAL)
- * this governor will not work. All times here are in uS.
+ * For CPUs with transition latency > 10ms (mostly drivers with 
CPUFREQ_ETERNAL)
+ * this governor will not work. All times here are in us (micro seconds).
  */
 #define MIN_SAMPLING_RATE_RATIO (2)
 #define LATENCY_MULTIPLIER (1000)
@@ -162,7 +162,7 @@ struct cs_cpu_dbs_info_s {
unsigned int enable:1;
 };
 
-/* Per policy Governers sysfs tunables */
+/* Per policy Governors sysfs tunables */
 struct od_dbs_tuners {
unsigned int ignore_nice_load;
unsigned int sampling_rate;
@@ -181,7 +181,7 @@ struct cs_dbs_tuners {
unsigned int freq_step;
 };
 
-/* Common Governer data across policies */
+/* Common Governor data across policies */
 struct dbs_data;
 struct common_dbs_data {
/* Common across governors */
@@ -205,7 +205,7 @@ struct common_dbs_data {
void *gov_ops;
 };
 
-/* Governer Per policy data */
+/* Governor Per policy data */
 struct dbs_data {
struct common_dbs_data *cdata;
unsigned int min_sampling_rate;
-- 
1.8.3.1


Re: [PATCH] cpufreq: governors: Remove duplicate check of target freq in supported range

2013-08-27 Thread Stratos Karafotis

On 08/27/2013 08:57 AM, Viresh Kumar wrote:
> On 27 August 2013 00:07, Stratos Karafotis <strat...@semaphore.gr> wrote:
>>  drivers/cpufreq/cpufreq_conservative.c | 4 
>
> Get rid of few more checks..
>
> /* if we are already at full speed then break out early */
> if (dbs_info->requested_freq == policy->max)
> return;
>
>
> /*
> * if we cannot reduce the frequency anymore, break out early
> */
> if (policy->cur == policy->min)
> return;
 

I think we should keep these checks because:

1) They shorten the execution path (there is no unnecessary call to
__cpufreq_driver_target).
2) If my patch is accepted, we need them to avoid continuously
increasing dbs_info->requested_freq. With my patch, requested_freq
can temporarily go beyond policy->min and policy->max.
__cpufreq_driver_target will select the correct frequency (within
policy->min and policy->max). Then, dbs_cpufreq_notifier will adjust
requested_freq.

I hope the logic in 2) is acceptable.


Thanks,
Stratos

Re: [PATCH] cpufreq: governors: Remove duplicate check of target freq in supported range

2013-08-27 Thread Stratos Karafotis


On 08/27/2013 07:07 PM, Viresh Kumar wrote:

> On 27 August 2013 21:16, Stratos Karafotis <strat...@semaphore.gr> wrote:
>> I think we should keep these checks because:
>>
>> 1) They shorten the execution code (there is no unnecessary call of
>> __cpufreq_driver_target)
>
> I don't really count this one.. This is how code is present everywhere in
> kernel.. These checks are present in routines and callers don't need to
> take care of them..


I mean that if we get rid of the code you mentioned, we will have
an extra call to __cpufreq_driver_target in some cases.


>> 2) In case my patch will be accepted, we need them to avoid continuously
>> increase of dbs_info->requested_freq. With my patch the requested_freq
>> can temporarily overcome policy->min and policy->max. __cpufreq_driver_target
>> will select the correct frequency (within policy->min and policy->max).
>> Then, dbs_cpufreq_notifier will adjust requested_freq.
>
> Sorry, I couldn't understand what you meant here :(



I'm sorry. Let me try to explain this better.

With my patch, dbs_info->requested_freq will not be capped within
policy->min and policy->max in cs_check_cpu.
So it may temporarily have a value greater than policy->max
or lower than policy->min.
When we call __cpufreq_driver_target, the correct frequency will be
selected, because __cpufreq_driver_target takes care to adjust the
target frequency within the policy range.
Eventually, dbs_cpufreq_notifier will adjust dbs_info->requested_freq
to within the policy range, if needed.

If we remove
if (dbs_info->requested_freq == policy->max)
return;
and
if (policy->cur == policy->min)
return;

requested_freq will keep increasing or decreasing in each iteration and
will finally overflow or underflow.

Consider, for example, a CPU with policy->max = 1000MHz where
the current frequency is 950MHz. With a constant load above
up_threshold, the requested_freq in the first iteration will be 1000MHz
and __cpufreq_driver_target will select the 1000MHz frequency.

In the second iteration, requested_freq will be 1050MHz, and
__cpufreq_driver_target will select 1000MHz. dbs_cpufreq_notifier
will adjust requested_freq back to 1000MHz.

In the next iterations, dbs_cpufreq_notifier will not be called, so we
need the above check (dbs_info->requested_freq == policy->max) to
prevent requested_freq from growing arbitrarily.
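
The argument above can be illustrated with a small user-space toy model.
This is only a sketch under stated assumptions, not the kernel code: the
toy_* names are hypothetical, the step of 100MHz is chosen so that the
first step overshoots policy->max, and the notifier is assumed to run
only when a frequency transition actually happens.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the policy and governor state (values in MHz). */
struct toy_policy {
	unsigned int min, max, cur;
};

struct toy_dbs {
	unsigned int requested_freq;
};

/*
 * Models the clamping done by the driver target call: the target is
 * forced into [min, max] and a transition happens only if it differs
 * from the current frequency. Returns true when a transition happened.
 */
static bool toy_driver_target(struct toy_policy *p, unsigned int target)
{
	if (target > p->max)
		target = p->max;
	if (target < p->min)
		target = p->min;
	if (target == p->cur)
		return false;
	p->cur = target;
	return true;
}

/* Models the notifier: pull requested_freq back into the policy range. */
static void toy_notifier(const struct toy_policy *p, struct toy_dbs *d)
{
	if (d->requested_freq > p->max || d->requested_freq < p->min)
		d->requested_freq = p->cur;
}

/* One sample with load above up_threshold. */
static void toy_step_up(struct toy_policy *p, struct toy_dbs *d,
			unsigned int step, bool keep_check)
{
	/* The early return under discussion. */
	if (keep_check && d->requested_freq == p->max)
		return;

	d->requested_freq += step;
	if (toy_driver_target(p, d->requested_freq))
		toy_notifier(p, d);	/* assumed: only on a real transition */
}

/* Runs the model and returns requested_freq after the given iterations. */
unsigned int toy_run(unsigned int iterations, bool keep_check)
{
	struct toy_policy p = { .min = 200, .max = 1000, .cur = 950 };
	struct toy_dbs d = { .requested_freq = 950 };
	unsigned int i;

	assert(p.min <= p.cur && p.cur <= p.max);
	for (i = 0; i < iterations; i++)
		toy_step_up(&p, &d, 100, keep_check);
	return d.requested_freq;
}
```

In this model, toy_run(5, true) stays at 1000, while toy_run(5, false)
keeps adding the step on every sample because no transition occurs and
so nothing pulls requested_freq back.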

I hope my explanation was better now. :)


Thanks,
Stratos

Re: [PATCH 1/4] nohz: Only update sleeptime stats locally

2013-08-19 Thread Stratos Karafotis

On 08/18/2013 08:04 PM, Oleg Nesterov wrote:
 Sorry for double post. forgot to cc cpufreq maintainers.
 
 On 08/16, Frederic Weisbecker wrote:

 To fix this, lets only update the sleeptime stats locally when the CPU
 exits from idle.
 
 I am in no position to ack the changes in this area, but I like this
 change very much. Because, as a code reader, I was totally confused by
 
   if (last_update_time)
   update_ts_time_stats()
 
 code and it looks obviously wrong.
 
 I added more cc's. It seems to me that 9366d840 "cpufreq: governors:
 Calculate iowait time only when necessary" doesn't realize what
 
   -   u64 idle_time = get_cpu_idle_time_us(cpu, NULL);
   +   u64 idle_time = get_cpu_idle_time_us(cpu, io_busy ? wall : NULL);
 
 actually means. OTOH, get_cpu_iowait_time_us() was called with
 last_update_time != NULL even before this patch...

To be honest, I am unfamiliar with tick-sched code.
With patch 9366d840, I was trying to avoid duplicate calls to the
get_cpu_iowait_time_us function. I just saw that the original
code was calling update_ts_time_stats within get_cpu_idle_time_us
and get_cpu_iowait_time_us, and I thought that I should keep calling
these functions with a non-NULL parameter to update the time stats.

In fact, the original patch submission was without this:
-   u64 idle_time = get_cpu_idle_time_us(cpu, NULL);
+   u64 idle_time = get_cpu_idle_time_us(cpu, io_busy ? wall : NULL);
and the idle time calculation was wrong (ondemand couldn't increase to max freq).


For your convenience, the call paths before and after this patch:

Before patch

get_cpu_idle_time(j, &cur_wall_time);
    u64 idle_time = get_cpu_idle_time_us(cpu, NULL);
    idle_time += get_cpu_iowait_time_us(cpu, wall);
        update_ts_time_stats(cpu, ts, now, last_update_time);
...
get_cpu_iowait_time_us(j, &cur_wall_time);
    update_ts_time_stats(cpu, ts, now, last_update_time);


After patch (io_busy = 1)

cur_idle_time = get_cpu_idle_time(j, &cur_wall_time, io_busy);
    u64 idle_time = get_cpu_idle_time_us(cpu, io_busy ? wall : NULL);
        update_ts_time_stats(cpu, ts, now, last_update_time);


After patch (io_busy = 0)

cur_idle_time = get_cpu_idle_time(j, &cur_wall_time, io_busy);
    u64 idle_time = get_cpu_idle_time_us(cpu, io_busy ? wall : NULL);
    idle_time += get_cpu_iowait_time_us(cpu, wall);
        update_ts_time_stats(cpu, ts, now, last_update_time);
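
The two "after patch" paths differ only in whether iowait time is folded
into the idle time. A minimal sketch of that selection follows; the
toy_tick struct and function name are hypothetical stand-ins for the
per-CPU counters, not the tick-sched API, and the stats update itself is
not modelled.

```c
#include <stdint.h>

/* Hypothetical per-CPU counters, in microseconds. */
struct toy_tick {
	uint64_t idle_us;	/* idle time excluding iowait */
	uint64_t iowait_us;	/* time idle while waiting for I/O */
};

/*
 * With io_busy set, iowait counts as busy time, so only the plain idle
 * time is reported. Otherwise iowait is added to the idle time.
 */
uint64_t toy_get_cpu_idle_time(const struct toy_tick *ts, int io_busy)
{
	uint64_t idle_time = ts->idle_us;

	if (!io_busy)
		idle_time += ts->iowait_us;

	return idle_time;
}
```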


Regards,
Stratos

Re: oops during boot with CONFIG_SND_DYNAMIC_MINORS not set

2013-08-22 Thread Stratos Karafotis

On 08/22/2013 10:59 AM, Takashi Iwai wrote:
 At Thu, 22 Aug 2013 00:42:41 +0300,
 Stratos Karafotis wrote:

 Hi,

 I get the following oops during boot when building with
 CONFIG_SND_DYNAMIC_MINORS not set (3.11-rc6).
 The issue vanishes when building the kernel with CONFIG_SND_DYNAMIC_MINORS=y,
 as suggested in the printk message.

 Regards,
 Stratos


 
 Could you check the patch below?
 Thanks!
 
 
 Takashi
 
 ---
 From: Takashi Iwai ti...@suse.de
 Subject: [PATCH] ALSA: hda - Fix NULL dereference with 
 CONFIG_SND_DYNAMIC_MINORS=n
 
 Without the dynamic minor assignment, HDMI codec may have less PCM
 instances than the number of pins, which eventually leads to Oops.
 
 Reported-by: Stratos Karafotis strat...@semaphore.gr
 Cc: sta...@vger.kernel.org
 Signed-off-by: Takashi Iwai ti...@suse.de
 ---
   sound/pci/hda/patch_hdmi.c | 3 +++
   1 file changed, 3 insertions(+)
 
 diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
 index 030ca86..e2cb92b 100644
 --- a/sound/pci/hda/patch_hdmi.c
 +++ b/sound/pci/hda/patch_hdmi.c
 @@ -1781,6 +1781,9 @@ static int generic_hdmi_build_controls(struct hda_codec *codec)
   struct snd_pcm_chmap *chmap;
   struct snd_kcontrol *kctl;
   int i;
 +
 + if (pin_idx >= codec->num_pcms)
 + break;
   err = snd_pcm_add_chmap_ctls(codec->pcm_info[pin_idx].pcm,
                                SNDRV_PCM_STREAM_PLAYBACK,
                                NULL, 0, pin_idx, &chmap);
 

Hi,

Unfortunately, still the same problem after applying your patch.

Regards,
Stratos


[   12.828335] Oops:  [#1] SMP 
[   12.829237] Modules linked in: snd_hda_codec_hdmi arc4 rt2800pci 
eeprom_93cx6 rt2x00pci iTCO_wdt snd_hda_codec_realtek iTCO_vendor_support 
rt2800lib crc_ccitt eeepc_wmi rt2x00mmio rt2x00lib asus_wmi mac80211 
sparse_keymap cfg80211 rfkill r8169 snd_hda_intel(+) snd_hda_codec snd_hwdep 
snd_seq snd_seq_device mii snd_pcm snd_page_alloc snd_timer snd soundcore 
i2c_i801 i2c_core lpc_ich mfd_core serio_raw pcspkr uinput binfmt_misc 
usb_storage video wmi
[   12.833282] CPU: 2 PID: 405 Comm: systemd-udevd Not tainted 3.11.0-rc6+ #5
[   12.834299] Hardware name: ASUSTeK COMPUTER INC. CM6870/CM6870, BIOS 0606 
08/27/2012
[   12.835334] task: 880212d45d40 ti: 880210bd6000 task.ti: 
880210bd6000
[   12.836411] RIP: 0010:[a00ae992]  [a00ae992] 
snd_pcm_add_chmap_ctls+0xd2/0x160 [snd_pcm]
[   12.837505] RSP: 0018:880210bd7968  EFLAGS: 00010246
[   12.838596] RAX: a00b451e RBX: 880212348500 RCX: 
[   12.839721] RDX: a00b4533 RSI: 880212348500 RDI: 880210bd7980
[   12.840831] RBP: 880210bd79f8 R08:  R09: 880216003b00
[   12.841944] R10:  R11: 8802103210c0 R12: 
[   12.843073] R13: 880210bd7a10 R14:  R15: 880210bd7980
[   12.844218] FS:  7f902eec6880() GS:88021ec8() 
knlGS:
[   12.845403] CS:  0010 DS:  ES:  CR0: 80050033
[   12.846566] CR2: 0018 CR3: 00021054f000 CR4: 001407e0
[   12.847737] Stack:
[   12.848892]   0c334000  
0003
[   12.850086]   a00b451e 1011 

[   12.851275]  a00aca80 a00aecd0  
a00adb30
[   12.852465] Call Trace:
[   12.853646]  [a00aca80] ? snd_pcm_hw_rule_msbits+0x50/0x50 
[snd_pcm]
[   12.854852]  [a00aecd0] ? snd_pcm_hw_rule_ratdens+0x2b0/0x2b0 
[snd_pcm]
[   12.856062]  [a00adb30] ? snd_pcm_hw_param_last+0x240/0x240 
[snd_pcm]
[   12.857298]  [a02a27fd] generic_hdmi_build_controls+0x15d/0x200 
[snd_hda_codec_hdmi]
[   12.858524]  [a02a1827] ? generic_hdmi_init+0xb7/0xd0 
[snd_hda_codec_hdmi]
[   12.859757]  [a00ee812] snd_hda_codec_build_controls+0x1c2/0x220 
[snd_hda_codec]
[   12.861003]  [a00e9255] ? snd_hda_codec_configure+0x295/0x450 
[snd_hda_codec]
[   12.862264]  [a00ee898] snd_hda_build_controls+0x28/0x80 
[snd_hda_codec]
[   12.863523]  [a0084bed] azx_probe_continue+0x84d/0xcc0 
[snd_hda_intel]
[   12.864744]  [a0084060] ? perf_trace_azx_pcm_trigger+0xe0/0xe0 
[snd_hda_intel]
[   12.865979]  [a0082ee0] ? azx_resume+0x130/0x130 [snd_hda_intel]
[   12.867205]  [a0083c20] ? azx_pcm_prepare+0x5f0/0x5f0 
[snd_hda_intel]
[   12.868433]  [a00828f0] ? azx_runtime_suspend+0x40/0x40 
[snd_hda_intel]
[   12.869658]  [a0081800] ? azx_remove+0x30/0x30 [snd_hda_intel]
[   12.870883]  [a00854bf] azx_probe+0x3bf/0x7e0 [snd_hda_intel]
[   12.872131]  [8130b3ee] local_pci_probe+0x3e/0x70
[   12.873347]  [8130c6d1] pci_device_probe+0x121/0x130
[   12.874584]  [813bf3c7] driver_probe_device+0x87/0x390
[   12.875803]  [813bf7a3

Re: oops during boot with CONFIG_SND_DYNAMIC_MINORS not set

2013-08-22 Thread Stratos Karafotis

On 08/23/2013 12:23 AM, Takashi Iwai wrote:
 At Thu, 22 Aug 2013 19:03:44 +0300,
 Stratos Karafotis wrote:

 On 08/22/2013 10:59 AM, Takashi Iwai wrote:
 At Thu, 22 Aug 2013 00:42:41 +0300,
 Stratos Karafotis wrote:

 Hi,

 I get the following oops during boot when building with
 CONFIG_SND_DYNAMIC_MINORS not set (3.11-rc6).
 The issue vanishes when building the kernel with CONFIG_SND_DYNAMIC_MINORS=y,
 as suggested in the printk message.

 Regards,
 Stratos



 Could you check the patch below?
 Thanks!


 Takashi

 ---
 From: Takashi Iwai ti...@suse.de
 Subject: [PATCH] ALSA: hda - Fix NULL dereference with 
 CONFIG_SND_DYNAMIC_MINORS=n

 Without the dynamic minor assignment, HDMI codec may have less PCM
 instances than the number of pins, which eventually leads to Oops.

 Reported-by: Stratos Karafotis strat...@semaphore.gr
 Cc: sta...@vger.kernel.org
 Signed-off-by: Takashi Iwai ti...@suse.de
 ---
sound/pci/hda/patch_hdmi.c | 3 +++
1 file changed, 3 insertions(+)

 diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
 index 030ca86..e2cb92b 100644
 --- a/sound/pci/hda/patch_hdmi.c
 +++ b/sound/pci/hda/patch_hdmi.c
 @@ -1781,6 +1781,9 @@ static int generic_hdmi_build_controls(struct hda_codec *codec)
 struct snd_pcm_chmap *chmap;
 struct snd_kcontrol *kctl;
 int i;
 +
 +   if (pin_idx >= codec->num_pcms)
 +   break;
 err = snd_pcm_add_chmap_ctls(codec->pcm_info[pin_idx].pcm,
  SNDRV_PCM_STREAM_PLAYBACK,
  NULL, 0, pin_idx, &chmap);


 Hi,

 Unfortunately, still the same problem after applying your patch.
 
 Bah, it's a wrong one.  The patch below should work better.
 Sorry for inconvenience.

No problem! :)

Yes, no more oops with the latest patch.
Tested with CONFIG_SND_DYNAMIC_MINORS=y and n.

There is an error -16 now (please see below), but I guess it's normal(?)

Thanks,
Stratos


[   16.392453] ALSA sound/pci/hda/hda_codec.c:4506 Too many HDMI devices
[   16.392457] ALSA sound/pci/hda/hda_codec.c:4508 Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
[   16.392460] ALSA sound/pci/hda/hda_codec.c:4506 Too many HDMI devices
[   16.392461] ALSA sound/pci/hda/hda_codec.c:4508 Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
[   16.409640] ALSA sound/core/control.c:349 control 0:0:0:HDMI/DP Jack:0 is already present
[   16.410627] hda_codec: cannot build controls for #3 (error -16)
[   16.411683] input: HDA NVidia HDMI/DP as /devices/pci:00/:00:01.0/:01:00.1/sound/card1/input15
[   16.411844] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci:00/:00:01.0/:01:00.1/sound/card1/input16
[   16.411986] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci:00/:00:01.0/:01:00.1/sound/card1/input17
[   19.682495] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
[   21.367420] FAT-fs (sda1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[   21.420191] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)
[   21.847085] type=1305 audit(1377207061.920:3): audit_pid=534 old=0 auid=4294967295 ses=4294967295 res=1
[   22.164707] alsactl[549]: segfault at 1 ip 7f6c8ad58a7d sp 7fff19912de0 error 4 in libc-2.17.so[7f6c8ad1+1b6000]


Re: [PATCH] cpufreq: conservative: fix requested_freq reduction issue

2013-11-08 Thread Stratos Karafotis

On Fri, Nov 8, 2013 at 6:55 AM, Viresh Kumar viresh.ku...@linaro.org wrote:
 On 8 November 2013 00:36, Stratos Karafotis skarafo...@gmail.com wrote:
 I think the existing code already checks if the requested_freq is greater
 than policy->max in __cpufreq_driver_target.

 Yes it does. But the problem is:
 - cs_check_cpu() sets requested_freq above policy->max
 - We execute following code because (requested_freq != policy->max)

 dbs_info->requested_freq += get_freq_target(cs_tuners, policy);
 __cpufreq_driver_target(policy, dbs_info->requested_freq,
 CPUFREQ_RELATION_H);
 - In __cpufreq_driver_target(), we don't do anything and return early..
 - Above will keep on repeating all the time..

 If we change the code as I have suggested it to be:
 - After first loop where requested_freq went over policy->max, we will
 return early from cs_check_cpu(), but we have already set freq to max..

 If we put this check earlier, cpufreq will never reach policy->max.

 Can you please explain why do you see that happening?

Please let me rephrase my previous post. In some circumstances (depending
on freq_step and freq_table values) the CPU frequency will never reach
policy->max.

For example, suppose that (values in MHz, for simplicity):
policy->max = 1000
policy->cur = 800
requested_freq = 800
freq_target = 300

In the 'first' iteration, if we return early with this code (because
requested_freq will be 1100):
if (dbs_info->requested_freq >= policy->max)
 return;

the CPU freq will never go over 800MHz.

I think the current code works correctly.
- The requested freq will go to 1100 in the first iteration.
- __cpufreq_driver_target will change the CPU freq to 1000
- dbs_cpufreq_notifier will adjust the requested_freq to 1000

In the next iteration the code:
if (dbs_info->requested_freq == policy->max)
return;

will keep the freq at max and break out early.

So, I think there is no need for an extra check because of the
dbs_cpufreq_notifier code.


Thanks,
Stratos Karafotis

Re: [PATCH] cpufreq: conservative: fix requested_freq reduction issue

2013-11-08 Thread Stratos Karafotis

On Fri, Nov 8, 2013 at 8:16 PM, Viresh Kumar viresh.ku...@linaro.org wrote:
 On 8 November 2013 23:13, Stratos Karafotis skarafo...@gmail.com wrote:
 Please let me rephrase my previous post. In some circumstances (depending
 on freq_step and freq_table values) the CPU frequency will never reach
 policy->max.

 For example, suppose that (values in MHz, for simplicity):
 policy->max = 1000
 policy->cur = 800
 requested_freq = 800
 freq_target = 300

 In the 'first' iteration, if we return early with this code (because
 requested_freq will be 1100):
 if (dbs_info->requested_freq >= policy->max)
  return;

 That's not correct. At this point requested_freq would have been
 800 only, and would have increased after this instruction to 1100.
 So, in the first transition we will go to max freq, but not from the
 second.

 Though this piece of code is more simplified by the new solution
 I gave.


Yes, you are right.
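
For reference, the example numbers can be traced with the check placed
before the increment, as described above. This is a user-space sketch
only: the sketch_* names are hypothetical, the driver is reduced to a
simple clamp to the maximum, and the notifier is left out on purpose.

```c
#include <assert.h>

/* Toy state for the example (values in MHz). */
struct sketch_state {
	unsigned int max;
	unsigned int cur;
	unsigned int requested_freq;
};

/* One increase step with the ">= max" check done before the increment. */
static void sketch_increase(struct sketch_state *s, unsigned int freq_target)
{
	if (s->requested_freq >= s->max)
		return;

	s->requested_freq += freq_target;

	/* Stand-in for the driver clamping the target to policy->max. */
	s->cur = s->requested_freq > s->max ? s->max : s->requested_freq;
}

/* Runs the example (max 1000, cur 800, step 300) for some iterations. */
struct sketch_state sketch_run(unsigned int iterations)
{
	struct sketch_state s = { .max = 1000, .cur = 800,
				  .requested_freq = 800 };
	unsigned int i;

	for (i = 0; i < iterations; i++)
		sketch_increase(&s, 300);

	assert(s.cur <= s.max);
	return s;
}
```

In this model the first iteration already moves cur to 1000, and later
iterations return early with requested_freq left at 1100 rather than
growing further.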

 CPU freq will never go over 800MHz.

 I think the current code works correctly.
 - The requested freq will go to 1100 in first iteration.
 - __cpufreq_driver_target will change CPU freq to 1000
 - dbs_cpufreq_notifier will adjust the requested_freq to 1000

 So, I think there is no need for an extra check because of
 dbs_cpufreq_notifier code.

 Now with the new code in place we are correcting requested_freq
 in cs_check_cpu(), then why do we need dbs_cpufreq_notifier()?

 What do you think?

I removed the check you proposed in commit 934dac1ea072 to avoid
the duplicate check in cs_check_cpu and in dbs_cpufreq_notifier.

I agree that we don't need dbs_cpufreq_notifier if we move the checks into
cs_check_cpu. But I'm not 100% sure whether the notifier also covers
other cases and whether it can be safely removed.


Stratos Karafotis

[PATCH] cpufreq: Remove unnecessary braces

2014-03-19 Thread Stratos Karafotis

Remove 3 sets of unnecessary braces

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/cpufreq.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 1eafd8c..ca3c01f 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1325,17 +1325,16 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
 	cpus = cpumask_weight(policy->cpus);
 	up_read(&policy->rwsem);
 
-	if (cpu != policy->cpu) {
+	if (cpu != policy->cpu)
 		sysfs_remove_link(&dev->kobj, "cpufreq");
-	} else if (cpus > 1) {
+	else if (cpus > 1) {
 		new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu);
 		if (new_cpu >= 0) {
 			update_policy_cpu(policy, new_cpu);
 
-			if (!cpufreq_suspended) {
+			if (!cpufreq_suspended)
 				pr_debug("%s: policy Kobject moved to cpu: %d from: %d\n",
 					 __func__, new_cpu, cpu);
-			}
 		}
 	}
 
@@ -2158,11 +2157,9 @@ int cpufreq_update_policy(unsigned int cpu)
 		if (!policy->cur) {
 			pr_debug("Driver did not initialize current freq\n");
 			policy->cur = new_policy.cur;
-		} else {
-			if (policy->cur != new_policy.cur && has_target())
+		} else if (policy->cur != new_policy.cur && has_target())
 				cpufreq_out_of_sync(cpu, policy->cur,
 								new_policy.cur);
-		}
 	}
 
 	ret = cpufreq_set_policy(policy, &new_policy);
-- 
1.8.5.3

[PATCH] cpufreq: Fix checkpatch errors and warnings

2014-03-19 Thread Stratos Karafotis

Fix 2 checkpatch errors about using assignment in if condition,
1 checkpatch error about a required space after comma
and 3 warnings about line over 80 characters.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/cpufreq.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index e3aa9de..1eafd8c 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -939,8 +939,10 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
 	up_write(&policy->rwsem);
 
 	if (has_target()) {
-		if ((ret = __cpufreq_governor(policy, CPUFREQ_GOV_START)) ||
-			(ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS))) {
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
+		if (!ret)
+			ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
+		if (ret) {
 			pr_err("%s: Failed to start governor\n", __func__);
 			return ret;
 		}
@@ -1394,14 +1396,13 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
 
 		if (!cpufreq_suspended)
 			cpufreq_policy_free(policy);
-	} else {
-		if (has_target()) {
-			if ((ret = __cpufreq_governor(policy, CPUFREQ_GOV_START)) ||
-					(ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS))) {
-				pr_err("%s: Failed to start governor\n",
-				       __func__);
-				return ret;
-			}
+	} else if (has_target()) {
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
+		if (!ret)
+			ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
+		if (ret) {
+			pr_err("%s: Failed to start governor\n", __func__);
+			return ret;
 		}
 	}
 
@@ -2086,7 +2087,7 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 		if (old_gov) {
 			__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 			up_write(&policy->rwsem);
-			__cpufreq_governor(policy,CPUFREQ_GOV_POLICY_EXIT);
+			__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
 			down_write(&policy->rwsem);
 		}
 
 
-- 
1.8.5.3
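
As a side note on the "assignment in if condition" fix: the rewritten
form is meant to keep the same short-circuit order as the original. A
stand-alone sketch of the two forms follows; the sketch_* functions are
hypothetical stand-ins and do not call any cpufreq code.

```c
/* Hypothetical stand-in for a governor call; returns the code it is given. */
static int sketch_governor_call(int result)
{
	return result;
}

/* Old form: assignments inside the if condition (flagged by checkpatch). */
int sketch_start_old(int start_result, int limits_result)
{
	int ret;

	if ((ret = sketch_governor_call(start_result)) ||
	    (ret = sketch_governor_call(limits_result)))
		return ret;

	return 0;
}

/* New form: the second call still runs only if the first returned 0. */
int sketch_start_new(int start_result, int limits_result)
{
	int ret = sketch_governor_call(start_result);

	if (!ret)
		ret = sketch_governor_call(limits_result);

	return ret;
}
```

Both functions return the first non-zero result, or 0 if both calls
succeed.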


Re: [PATCH] cpufreq: Remove unnecessary braces

2014-03-19 Thread Stratos Karafotis

On 20/03/2014 12:45 πμ, Rafael J. Wysocki wrote:
 On Wednesday, March 19, 2014 11:33:00 PM Stratos Karafotis wrote:
 Remove 3 sets of unnecessary braces

 Signed-off-by: Stratos Karafotis strat...@semaphore.gr
 ---
  drivers/cpufreq/cpufreq.c | 11 ++++-------
  1 file changed, 4 insertions(+), 7 deletions(-)

 diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
 index 1eafd8c..ca3c01f 100644
 --- a/drivers/cpufreq/cpufreq.c
 +++ b/drivers/cpufreq/cpufreq.c
 @@ -1325,17 +1325,16 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
  cpus = cpumask_weight(policy->cpus);
  up_read(&policy->rwsem);
  
 -if (cpu != policy->cpu) {
 +if (cpu != policy->cpu)
  sysfs_remove_link(&dev->kobj, "cpufreq");
 -} else if (cpus > 1) {
 
 These braces aren't in fact unnecessary, they are in accordance with
 CodingStyle.
 
 +else if (cpus > 1) {
  new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu);
  if (new_cpu >= 0) {
  update_policy_cpu(policy, new_cpu);
  
 -if (!cpufreq_suspended) {
 +if (!cpufreq_suspended)
  pr_debug("%s: policy Kobject moved to cpu: %d from: %d\n",
   __func__, new_cpu, cpu);
 -}
  }
  }
  
 @@ -2158,11 +2157,9 @@ int cpufreq_update_policy(unsigned int cpu)
  if (!policy->cur) {
  pr_debug("Driver did not initialize current freq\n");
  policy->cur = new_policy.cur;
 -} else {
 -if (policy->cur != new_policy.cur && has_target())
 +} else if (policy->cur != new_policy.cur && has_target())
 
 And here too.
 
  cpufreq_out_of_sync(cpu, policy->cur,
  new_policy.cur);
 -}
  }
  
  ret = cpufreq_set_policy(policy, &new_policy);

 

I'm sorry for the inconvenience. I read the CodingStyle again (more carefully :) ).
I'm sending the corrected patch with the single case of unnecessary braces.

Thanks,
Stratos

8---
Remove unnecessary braces from a single statement.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/cpufreq.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index e3aa9de..220c4a9 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1330,10 +1330,9 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
 		if (new_cpu >= 0) {
 			update_policy_cpu(policy, new_cpu);
 
-			if (!cpufreq_suspended) {
+			if (!cpufreq_suspended)
 				pr_debug("%s: policy Kobject moved to cpu: %d from: %d\n",
 					 __func__, new_cpu, cpu);
-			}
 		}
 	}
 
-- 
1.8.5.3



Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

2013-06-07 Thread Stratos Karafotis

On 06/05/2013 11:35 PM, Rafael J. Wysocki wrote:
 On Wednesday, June 05, 2013 08:13:26 PM Stratos Karafotis wrote:
 Hi Borislav,

 On 06/05/2013 07:17 PM, Borislav Petkov wrote:
 On Wed, Jun 05, 2013 at 07:01:25PM +0300, Stratos Karafotis wrote:
 Ondemand calculates load in terms of frequency and increases it only
 if the load_freq is greater than up_threshold multiplied by current
 or average frequency. This seems to produce oscillations of frequency
 between min and max because, for example, a relatively small load can
 easily saturate minimum frequency and lead the CPU to max. Then, the
 CPU will decrease back to min due to a small load_freq.

 Right, and I think this is how we want it, no?

 The thing is, the faster you finish your work, the faster you can become
 idle and save power.

 This is exactly the goal of this patch. To use more efficiently middle
 frequencies to finish faster the work.

 If you switch frequencies in a staircase-like manner, you're going to
 take longer to finish, in certain cases, and burn more power while doing
 so.

 This is not true with this patch. It switches to middle frequencies
 when the load < up_threshold.
 Now, ondemand does not increase freq. CPU runs in lowest freq till the
 load is greater than up_threshold.

 Btw, racing to idle is also a good example for why you want boosting:
 you want to go max out the core but stay within power limits so that you
 can finish sooner.

 This patch changes the calculation method of load and target frequency
 considering 2 points:
 - Load computation should be independent from current or average
 measured frequency. For example an absolute load 80% at 100MHz is not
 necessarily equivalent to 8% at 1000MHz in the next sampling interval.
 - Target frequency should be increased to any value of frequency table
 proportional to absolute load, instead to only the max. Thus:

 Target frequency = C * load

 where C = policy->cpuinfo.max_freq / 100

 Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
 Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
 increase ~1.5% in performance. cpufreq_stats (time_in_state) shows
 that middle frequencies are used more, with this patch. Highest
 and lowest frequencies were used less by ~9%
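
The proportional rule quoted above (target frequency = C * load, with
C = policy->cpuinfo.max_freq / 100) can be sketched in a few lines. This
is an illustration only: the function name and parameter names are
hypothetical, and matching the result to an entry of the frequency table
is not shown.

```c
#include <assert.h>

/*
 * Target frequency proportional to the absolute load.
 * max_freq_khz plays the role of policy->cpuinfo.max_freq and
 * load is a percentage in the range 0..100.
 */
unsigned int sketch_target_freq(unsigned int max_freq_khz, unsigned int load)
{
	assert(load <= 100);

	/* Multiply first so the integer division does not lose precision. */
	return load * max_freq_khz / 100;
}
```

For a 3.4GHz maximum (3400000 kHz), a 50% load maps to 1700000 kHz in
this sketch.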
 
 Can you also use powertop to measure the percentage of time spent in idle
 states for the same workload with and without your patchset?  Also, it would
 be good to measure the total energy consumption somehow ...
 
 Thanks,
 Rafael

Hi Rafael,

I repeated the tests, also extracting powertop results.
Measurement steps with and without this patch:
1) Reboot the system
2) Run the Phoronix benchmark of the Linux Kernel Compilation 3.1 test twice
   without taking measurements
3) Wait a few minutes
4) Run Phoronix and powertop for 100 secs and take measurements.

I will try to repeat the test and take measurements with turbostat as
Borislav suggested.


Thanks,
Stratos

--
Test WITHOUT this patch:

Phoronix Test Suite v4.6.0

Installed: pts/build-linux-kernel-1.3.0

System Information

Hardware:
Processor: Intel Core i7-3770 @ 3.40GHz (8 Cores), Motherboard: ASUS CM6870, 
Chipset: Intel Xeon E3-1200 v2/3rd, Memory: 2 x 4096 MB DDR3-1600MHz 
HY64C1C1624ZY, Disk: 1000GB Seagate ST1000DM003-9YN1, Graphics: NVIDIA GeForce 
GT 640 3072MB, Audio: Realtek ALC892, Monitor: S23B350, Network: Realtek 
RTL8111/8168 + Ralink RT3090 Wireless 802.11n 1T/1R

Software:
OS: Fedora 18, Kernel: 3.10.0-rc3v+ (x86_64), Desktop: KDE 4.10.3, Display 
Server: X Server 1.13.3, Display Driver: nouveau 1.0.7, File-System: ext4, 
Screen Resolution: 1920x1080

Would you like to save these test results (Y/n): n


Timed Linux Kernel Compilation 3.1:
pts/build-linux-kernel-1.3.0
Test 1 of 1
Estimated Trial Run Count:3
Estimated Time To Completion: 2 Minutes
Running Pre-Test Script @ 21:41:19
Started Run 1 @ 21:41:30
Running Interim Test Script @ 21:41:44
Started Run 2 @ 21:41:47
Running Interim Test Script @ 21:42:02
Started Run 3 @ 21:42:05
Running Interim Test Script @ 21:42:15  [Std. Dev: 19.28%]
Started Run 4 @ 21:42:19
Running Interim Test Script @ 21:42:29  [Std. Dev: 18.72%]
Started Run 5 @ 21:42:32
Running Interim Test Script @ 21:42:42  [Std. Dev: 17.84%]
Started Run 6 @ 21:42:46  [Std. Dev: 16.91%]
Running Post-Test Script @ 21:42:55

Test Results:
11.073544979095
14.059958934784
9.6814110279083
9.6158590316772
9.5762379169464
9.5944919586182

Average: 10.60 Seconds

Powertop results:
http://www.semaphore.gr/results/powertop_without.html


-
Test WITH this patch:

Phoronix Test Suite v4.6.0

Installed: pts/build-linux-kernel-1.3.0

System Information

Hardware:
Processor: Intel Core i7-3770 @ 3.40GHz

Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

2013-06-08 Thread Stratos Karafotis

On 06/07/2013 11:57 PM, Rafael J. Wysocki wrote:
 On Friday, June 07, 2013 10:14:34 PM Stratos Karafotis wrote:
 On 06/05/2013 11:35 PM, Rafael J. Wysocki wrote:
 On Wednesday, June 05, 2013 08:13:26 PM Stratos Karafotis wrote:
 Hi Borislav,

 On 06/05/2013 07:17 PM, Borislav Petkov wrote:
 On Wed, Jun 05, 2013 at 07:01:25PM +0300, Stratos Karafotis wrote:
 Ondemand calculates load in terms of frequency and increases it only
 if the load_freq is greater than up_threshold multiplied by current
 or average frequency. This seems to produce oscillations of frequency
 between min and max because, for example, a relatively small load can
 easily saturate minimum frequency and lead the CPU to max. Then, the
 CPU will decrease back to min due to a small load_freq.

 Right, and I think this is how we want it, no?

 The thing is, the faster you finish your work, the faster you can become
 idle and save power.

 This is exactly the goal of this patch. To use more efficiently middle
 frequencies to finish faster the work.

 If you switch frequencies in a staircase-like manner, you're going to
 take longer to finish, in certain cases, and burn more power while doing
 so.

 This is not true with this patch. It switches to middle frequencies
 when the load < up_threshold.
 Now, ondemand does not increase freq. CPU runs in lowest freq till the
 load is greater than up_threshold.

 Btw, racing to idle is also a good example for why you want boosting:
 you want to go max out the core but stay within power limits so that you
 can finish sooner.

 This patch changes the calculation method of load and target frequency
 considering 2 points:
 - Load computation should be independent from current or average
 measured frequency. For example an absolute load 80% at 100MHz is not
 necessarily equivalent to 8% at 1000MHz in the next sampling interval.
 - Target frequency should be increased to any value of frequency table
 proportional to absolute load, instead to only the max. Thus:

 Target frequency = C * load

 where C = policy->cpuinfo.max_freq / 100

 Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
 Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
 increase ~1.5% in performance. cpufreq_stats (time_in_state) shows
 that middle frequencies are used more, with this patch. Highest
 and lowest frequencies were used less by ~9%

 Can you also use powertop to measure the percentage of time spent in idle
 states for the same workload with and without your patchset?  Also, it would
 be good to measure the total energy consumption somehow ...

 Thanks,
 Rafael

 Hi Rafael,

 I repeated the tests, also extracting powertop results.
 Measurement steps with and without this patch:
 1) Reboot the system
 2) Run the Phoronix benchmark of the Linux Kernel Compilation 3.1 test twice
 without taking measurements
 3) Wait a few minutes
 4) Run Phoronix and powertop for 100 secs and take measurements.
 
 Well, while this is not conclusive, it definitely looks very promising. :-)
 
 We're seeing measurable performance improvement with the patchset applied *and*
 more time spent in idle states both at the same time.  I'd be very surprised if
 the energy consumption measurements did not confirm that the patchset allowed
 us to reduce it.
 
 If my computations are correct (somebody please check), the cores spent about
 20% more time in idle on the average with the patchset applied and in addition
 to that the cc6 residency was greater by about 2% on the average with respect
 to the kernel without the patchset.
 
 We need to verify if there are gains (or at least no regressions) with other
 workloads, but since this *also* reduces code complexity quite a bit, I'm
 seriously considering taking it.
 
 I will try to repeat the test and take measurements with turbostat as
 Borislav suggested.
 
 Please do!
 
 Thanks,
 Rafael
 

Hi,

I repeated the tests extracting results from turbostat.
Measurement steps with and without this patch:
1) Reboot system
2) Running twice Phoronix benchmark of Linux Kernel Compilation 3.1 test
   without taking measurement
3) Wait few minutes
4) Run Phoronix and turbostat (-i 100) and take measurement


Thanks,
Stratos

--
Test WITHOUT this patch:

Phoronix Test Suite v4.6.0

Installed: pts/build-linux-kernel-1.3.0

System Information

Hardware:
Processor: Intel Core i7-3770 @ 3.40GHz (8 Cores), Motherboard: ASUS CM6870, 
Chipset: Intel Xeon E3-1200 v2/3rd, Memory: 2 x 4096 MB DDR3-1600MHz 
HY64C1C1624ZY, Disk: 1000GB Seagate ST1000DM003-9YN1, Graphics: NVIDIA GeForce 
GT 640 3072MB, Audio: Realtek ALC892, Monitor: S23B350, Network: Realtek 
RTL8111/8168 + Ralink RT3090 Wireless 802.11n 1T/1R

Software:
OS: Fedora 18, Kernel: 3.10.0-rc3v+ (x86_64), Desktop: KDE 4.10.3, Display 
Server: X Server 1.13.3, Display Driver: nouveau 1.0.7, File-System: ext4, 
Screen Resolution: 1920x1080

Would you like to save these test results (Y

Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

2013-06-08 Thread Stratos Karafotis

I also ran the test the way you mentioned, but I chose to run turbostat
for 100 sec, as I did with powertop.
The benchmark actually lasts about 96 secs.

I think that we use almost the same energy over 100 sec to run the same load a
little bit faster. I think this also means a reduction in power consumption.

I will also send the results running the test as you said.

Thanks again,
Stratos

Rafael J. Wysocki r...@sisk.pl wrote:

On Saturday, June 08, 2013 12:56:00 PM Stratos Karafotis wrote:
 On 06/07/2013 11:57 PM, Rafael J. Wysocki wrote:
  On Friday, June 07, 2013 10:14:34 PM Stratos Karafotis wrote:
  On 06/05/2013 11:35 PM, Rafael J. Wysocki wrote:
  On Wednesday, June 05, 2013 08:13:26 PM Stratos Karafotis wrote:
  Hi Borislav,
 
  On 06/05/2013 07:17 PM, Borislav Petkov wrote:
  On Wed, Jun 05, 2013 at 07:01:25PM +0300, Stratos Karafotis wrote:
  Ondemand calculates load in terms of frequency and increases it only
  if the load_freq is greater than up_threshold multiplied by current
  or average frequency. This seems to produce oscillations of frequency
  between min and max because, for example, a relatively small load can
  easily saturate minimum frequency and lead the CPU to max. Then, the
  CPU will decrease back to min due to a small load_freq.
 
  Right, and I think this is how we want it, no?
 
  The thing is, the faster you finish your work, the faster you can become
  idle and save power.
 
  This is exactly the goal of this patch: to use the middle frequencies more
  efficiently so that the work finishes faster.
 
  If you switch frequencies in a staircase-like manner, you're going to
  take longer to finish, in certain cases, and burn more power while doing
  so.
 
  This is not true with this patch. It switches to middle frequencies
  when the load < up_threshold.
  Currently, ondemand does not increase the frequency in that case: the CPU
  runs at the lowest frequency until the load is greater than up_threshold.
 
  Btw, racing to idle is also a good example for why you want boosting:
  you want to go max out the core but stay within power limits so that you
  can finish sooner.
 
  This patch changes the calculation method of load and target frequency
  considering 2 points:
  - Load computation should be independent from current or average
  measured frequency. For example an absolute load 80% at 100MHz is not
  necessarily equivalent to 8% at 1000MHz in the next sampling interval.
  - Target frequency should be increased to any value of frequency table
  proportional to absolute load, instead of only the max. Thus:
 
  Target frequency = C * load
 
  where C = policy->cpuinfo.max_freq / 100
 
  Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
  Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
  increase ~1.5% in performance. cpufreq_stats (time_in_state) shows
  that middle frequencies are used more, with this patch. Highest
  and lowest frequencies were used less by ~9%
 
  Can you also use powertop to measure the percentage of time spent in idle
  states for the same workload with and without your patchset?  Also, it would
  be good to measure the total energy consumption somehow ...
 
  Thanks,
  Rafael
 
  Hi Rafael,
 
  I repeated the tests extracting also powertop results.
  Measurement steps with and without this patch:
  1) Reboot system
  2) Running twice Phoronix benchmark of Linux Kernel Compilation 3.1 test
  without taking measurement
  3) Wait few minutes
  4) Run Phoronix and powertop for 100secs and take measurement.
  
  Well, while this is not conclusive, it definitely looks very promising. :-)
  
  We're seeing measurable performance improvement with the patchset applied *and*
  more time spent in idle states both at the same time.  I'd be very surprised if
  the energy consumption measurements did not confirm that the patchset allowed
  us to reduce it.
  
  If my computations are correct (somebody please check), the cores spent about
  20% more time in idle on the average with the patchset applied and in addition
  to that the cc6 residency was greater by about 2% on the average with respect
  to the kernel without the patchset.
  
  We need to verify if there are gains (or at least no regressions) with other
  workloads, but since this *also* reduces code complexity quite a bit, I'm
  seriously considering taking it.
  
  I will try to repeat the test and take measurements with turbostat as
  Borislav suggested.
  
  Please do!
  
  Thanks,
  Rafael
  
 
 Hi,
 
 I repeated the tests extracting results from turbostat.
 Measurement steps with and without this patch:
 1) Reboot system
 2) Running twice Phoronix benchmark of Linux Kernel Compilation 3.1 test
without taking measurement
 3) Wait few minutes
 4) Run Phoronix and turbostat (-i 100) and take measurement

You need to do something like

# ./turbostat command invoking the phoronix suite

Did you do that?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source

Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

2013-06-08 Thread Stratos Karafotis

On 06/08/2013 05:05 PM, Rafael J. Wysocki wrote:
 On Saturday, June 08, 2013 03:34:29 PM Stratos Karafotis wrote:
 I also did the test with the way you mentioned. But I thought to run 
 turbostat for 100 sec as I did with powertop.
 
 Ah, OK.
 
 Actually benchmark lasts about 96 secs.

 I think that we use almost the same energy over 100 sec to run the same load
 a little bit faster. I think this also means a reduction in power consumption.

 I will also send the results running the test as you said.
 
 Cool, thanks!

More results running:
./turbostat phoronix-test-suite benchmark pts/build-linux-kernel

Measurement steps with and without this patch:
1) Reboot system
2) Run twice the command above without taking measurement
3) Wait few minutes
4) Run the command and take measurement

Thanks,
Stratos

--
Test WITHOUT this patch:

Phoronix Test Suite v4.6.0

Installed: pts/build-linux-kernel-1.3.0

System Information

Hardware:
Processor: Intel Core i7-3770 @ 3.40GHz (8 Cores), Motherboard: ASUS CM6870, 
Chipset: Intel Xeon E3-1200 v2/3rd, Memory: 2 x 4096 MB DDR3-1600MHz 
HY64C1C1624ZY, Disk: 1000GB Seagate ST1000DM003-9YN1, Graphics: NVIDIA GeForce 
GT 640 3072MB, Audio: Realtek ALC892, Monitor: S23B350, Network: Realtek 
RTL8111/8168 + Ralink RT3090 Wireless 802.11n 1T/1R

Software:
OS: Fedora 18, Kernel: 3.10.0-rc3v+ (x86_64), Desktop: KDE 4.10.3, Display 
Server: X Server 1.13.3, Display Driver: nouveau 1.0.7, File-System: ext4, 
Screen Resolution: 1920x1080

Would you like to save these test results (Y/n): 

Timed Linux Kernel Compilation 3.1:
pts/build-linux-kernel-1.3.0
Test 1 of 1
Estimated Trial Run Count:3
Estimated Time To Completion: 2 Minutes
Running Pre-Test Script @ 22:59:35
Started Run 1 @ 22:59:46
Running Interim Test Script @ 23:00:00
Started Run 2 @ 23:00:04
Running Interim Test Script @ 23:00:13
Started Run 3 @ 23:00:17
Running Interim Test Script @ 23:00:26  [Std. Dev: 10.04%]
Started Run 4 @ 23:00:30
Running Interim Test Script @ 23:00:39  [Std. Dev: 8.98%]
Started Run 5 @ 23:00:43
Running Interim Test Script @ 23:00:53  [Std. Dev: 7.80%]
Started Run 6 @ 23:00:56  [Std. Dev: 7.21%]
Running Post-Test Script @ 23:01:06

Test Results:
11.121481895447
9.3301539421082
9.4521908760071
9.3115320205688
9.720575094223
9.396096944809

Average: 9.72 Seconds

cor CPU    %c0  GHz  TSC SMI    %c1    %c3    %c6    %c7 CTMP PTMP   %pc2   %pc3   %pc6   %pc7  Pkg_W  Cor_W GFX_W
         40.96 3.57 3.39   0   9.83   3.36  45.85   0.00   46   46   0.00   0.00   0.00   0.00  27.25  21.27  0.00
  0   0  37.65 3.67 3.39   0  20.53   3.18  38.64   0.00   46   46   0.00   0.00   0.00   0.00  27.25  21.27  0.00
  0   4  52.10 3.54 3.39   0   6.08
  1   1  35.21 3.66 3.39   0  11.45   3.80  49.54   0.00   41
  1   5  41.99 3.45 3.39   0   4.66
  2   2  35.46 3.66 3.39   0  10.97   3.60  49.97   0.00   38
  2   6  41.90 3.48 3.39   0   4.53
  3   3  39.44 3.69 3.39   0  12.46   2.86  45.24   0.00   41
  3   7  43.90 3.45 3.39   0   7.99
94.876210 sec


-
Test WITH this patch:

Phoronix Test Suite v4.6.0

Installed: pts/build-linux-kernel-1.3.0

System Information

Hardware:
Processor: Intel Core i7-3770 @ 3.40GHz (8 Cores), Motherboard: ASUS CM6870, 
Chipset: Intel Xeon E3-1200 v2/3rd, Memory: 2 x 4096 MB DDR3-1600MHz 
HY64C1C1624ZY, Disk: 1000GB Seagate ST1000DM003-9YN1, Graphics: NVIDIA GeForce 
GT 640 3072MB, Audio: Realtek ALC892, Monitor: S23B350, Network: Realtek 
RTL8111/8168 + Ralink RT3090 Wireless 802.11n 1T/1R

Software:
OS: Fedora 18, Kernel: 3.10.0-rc3+ (x86_64), Desktop: KDE 4.10.3, Display 
Server: X Server 1.13.3, Display Driver: nouveau 1.0.7, File-System: ext4, 
Screen Resolution: 1920x1080

Would you like to save these test results (Y/n): 

Timed Linux Kernel Compilation 3.1:
pts/build-linux-kernel-1.3.0
Test 1 of 1
Estimated Trial Run Count:3
Estimated Time To Completion: 2 Minutes
Running Pre-Test Script @ 22:48:20
Started Run 1 @ 22:48:30
Running Interim Test Script @ 22:48:44
Started Run 2 @ 22:48:47
Running Interim Test Script @ 22:48:56
Started Run 3 @ 22:49:00
Running Interim Test Script @ 22:49:10  [Std. Dev: 4.68%]
Started Run 4 @ 22:49:13
Running Interim Test Script @ 22:49:23  [Std. Dev: 4.72%]
Started Run 5 @ 22:49:26
Running Interim Test Script @ 22:49:35  [Std. Dev: 4.25%]
Started Run 6 @ 22:49:39  [Std. Dev: 3.98%]
Running Post-Test Script @ 22:49:48

Test Results:
10.205597162247
9.2953701019287
9.8262219429016
9.2547709941864
9.4089620113373
9.3398430347443

Average: 9.56 Seconds

Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

2013-06-09 Thread Stratos Karafotis

On 06/09/2013 07:26 PM, Borislav Petkov wrote:
 On Sun, Jun 09, 2013 at 12:18:09AM +0200, Rafael J. Wysocki wrote:
 The average power drawn by the package is slightly higher with the
 patchset applied (27.66 W vs 27.25 W), but since the time needed to
 complete the workload with the patchset applied was shorter by about
 2.3 sec, the total energy used was less in the latter case (by about
 25.7 J if I'm not mistaken, or 1% relative). This means that in the
 absence of a power limit between 27.25 W and 27.66 W it's better to
 use the kernel with the patchset applied for that particular workload
 from the performance and energy usage perspective.

 Good, hopefully that's going to be confirmed on other systems and/or
 with other workloads. :-)
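
The energy comparison above can be checked roughly with energy = average power x run time. The sketch below uses the package power figures quoted in the thread and the 94.876210 sec turbostat duration of the run without the patchset; the run time with the patchset is taken as "about 2.3 sec" shorter, which is an approximation, so the result only lands in the same ballpark as the quoted figure (roughly 25 J, about 1%).

```python
# Back-of-the-envelope check of the energy comparison quoted above.
# Energy (J) = average package power (W) x run time (s).

PKG_W_WITHOUT = 27.25    # average Pkg_W without the patchset (from the thread)
PKG_W_WITH = 27.66       # average Pkg_W with the patchset (from the thread)
T_WITHOUT = 94.876210    # seconds, turbostat run without the patchset
T_SAVED = 2.3            # "about 2.3 sec" shorter: an approximation, not a measurement


def energy_joules(avg_power_w, seconds):
    """Total energy, assuming the average power holds over the whole run."""
    return avg_power_w * seconds


e_without = energy_joules(PKG_W_WITHOUT, T_WITHOUT)
e_with = energy_joules(PKG_W_WITH, T_WITHOUT - T_SAVED)
saved = e_without - e_with          # positive means the patched kernel used less
relative = saved / e_without

if __name__ == "__main__":
    print(f"without: {e_without:.1f} J, with: {e_with:.1f} J")
    print(f"saved: {saved:.1f} J ({relative:.2%})")
```

The point of the exercise is that a slightly higher average power can still mean less total energy when the run is short enough.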
 
 Yep, I see similar results on my AMD F15h.
 
 So there's a register which tells you what the current energy
 consumption in Watts is and support for it is integrated in lm_sensors.
 I did one read per second, for the duration of the kernel build (10-r5 +
 tip), with and without the patch, and averaged out the results:
 
 without
 ===
 
 1. 158 samples, avg Watts: 116.915
 2. 158 samples, avg Watts: 116.855
 3. 158 samples, avg Watts: 116.737
 4. 158 samples, avg Watts: 116.792
 
 = 116.82475 avg Watts.
 
 with
 
 
 1. 157 samples, avg Watts: 116.496
 2. 156 samples, avg Watts: 117.535
 3. 156 samples, avg Watts: 118.174
 4. 157 samples, avg Watts: 117.95
 
 = 117.53875 avg Watts.
 
 So there's a slight rise in the average power consumption but the
 samples count drops by 1 or 2, which is consistent with the observed
 kernel build speedup of 1 or 2 seconds.
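
The averages above can be reproduced with a few lines. The sketch below also estimates energy per run as samples x average watts, which assumes that exactly one sample was taken per second (as described above) and that the sample count approximates the build duration; under that assumption the runs with the patch come out slightly lower in total energy (roughly 0.3%), despite the higher average power.

```python
# (samples, average watts) per run, copied from the measurements above.
WITHOUT = [(158, 116.915), (158, 116.855), (158, 116.737), (158, 116.792)]
WITH = [(157, 116.496), (156, 117.535), (156, 118.174), (157, 117.95)]


def mean_watts(runs):
    """Plain average of the per-run average power."""
    return sum(watts for _, watts in runs) / len(runs)


def mean_energy_joules(runs, sample_period_s=1.0):
    """Average energy per run.

    Assumption: one power sample per sample_period_s seconds, so
    samples * period approximates the duration of the run.
    """
    return sum(n * sample_period_s * watts for n, watts in runs) / len(runs)


if __name__ == "__main__":
    for name, runs in (("without", WITHOUT), ("with", WITH)):
        print(f"{name}: {mean_watts(runs):.5f} W avg, "
              f"{mean_energy_joules(runs):.1f} J per run (estimate)")
```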
 
 perf doesn't show any significant difference with and without the patch
 but those are single runs only.
 
 without
 ===
 
   Performance counter stats for 'make -j9':
 
      1167856.647713 task-clock                #    7.272 CPUs utilized
           1,071,177 context-switches          #    0.917 K/sec
              52,844 cpu-migrations            #    0.045 K/sec
          43,600,721 page-faults               #    0.037 M/sec
   4,712,068,048,465 cycles                    #    4.035 GHz
   1,181,730,064,794 stalled-cycles-frontend   #   25.08% frontend cycles idle
     243,576,229,438 stalled-cycles-backend    #    5.17% backend  cycles idle
   2,966,369,010,209 instructions              #    0.63  insns per cycle
                                               #    0.40  stalled cycles per insn
     651,136,706,156 branches                  #  557.548 M/sec
      34,582,447,788 branch-misses             #    5.31% of all branches
 
       160.599796045 seconds time elapsed
 
 with
 
 
   Performance counter stats for 'make -j9':
 
      1169278.095561 task-clock                #    7.271 CPUs utilized
           1,076,528 context-switches          #    0.921 K/sec
              53,284 cpu-migrations            #    0.046 K/sec
          43,598,610 page-faults               #    0.037 M/sec
   4,721,747,687,668 cycles                    #    4.038 GHz
   1,182,301,583,422 stalled-cycles-frontend   #   25.04% frontend cycles idle
     248,675,448,161 stalled-cycles-backend    #    5.27% backend  cycles idle
   2,967,419,684,598 instructions              #    0.63  insns per cycle
                                               #    0.40  stalled cycles per insn
     651,527,448,140 branches                  #  557.205 M/sec
      34,560,656,638 branch-misses             #    5.30% of all branches
 
       160.811815170 seconds time elapsed

Hi,

Boris, thanks so much for your tests!

Rafael, thanks for your analysis!

I did some additional tests to see how the CPU behaves at its low and high
limits.

I used the Phoronix Java SciMark 2.0 test (FFT, Monte Carlo, etc.) to check the
patch under really heavy loads. The results were almost identical with and
without this patch. This is the expected behavior, because I believe the load
is greater than up_threshold most of the time in these cases.

With this patch:
Duration: 120.568521 sec
Pkg_W: 20.97

Without this patch:
Duration: 120.606813 sec
Pkg_W: 21.11


I also used a small program to check the CPU under very small loads with a
duration comparable to the sampling rate (1 in my config).
The program runs a tight 'for' loop for ~ (2 x sampling_rate).
After this it sleeps for 5000us.
This is repeated 100 times and then the program sleeps for 1 sec.
The above procedure repeats 15 times.

The results show a slowdown (~4%) WITH this patch.
However, less energy was used WITH this patch (25.23J, ~3.3%).

Thanks,
Stratos


WITHOUT patch:

Starting benchmark
run 0
Avg time: 21907 us
run 1
Avg time: 21792 us
run 2
Avg time: 21827 us
run 3
Avg time: 21831 us
run 4
Avg time: 21828 us
run 5
Avg time: 21838 us
run 6
Avg time: 21819 us
run 7
Avg time: 21836 us
run 8
Avg time: 21761 us
run 9
Avg time: 21586 us
run 10
Avg time: 20366 us
run 11
Avg time: 21732 us
run 12
Avg time: 20225 us
run 13
Avg time: 21818 us
run 14
Avg time: 21812

Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

2013-06-10 Thread Stratos Karafotis

On 06/09/2013 11:58 PM, Rafael J. Wysocki wrote:
Well, this means that your changes may hurt performance if the load comes and
goes in spikes, which is not so good. The fact that they cause less energy to
be used at the same time kind of balance that, though. [After all, we're
talking about the ondemand governor which should be used if the user wants to
sacrifice some performance for energy savings.]

It would be interesting to see if the picture changes for different time
intervals in your test program (e.g. loop duration that is not a multiple of
sampling_rate and sleep times different from 5000 us) to rule out any random
coincidences.

Can you possibly prepare a graph showing both the execution time and energy
consumption for several different loop durations in your program (let's keep
the 5000 us sleep for now), including multiples of sampling_rate as well as
some other durations?

Hi,

I tested different loop durations with my program from 1,000us to 1,000,000us.
The logic is almost the same as in the previous test:

1) Run a 'for' loop for a period T (~ 1000-100us)
2) sleep for 5000us
3) Repeat steps 1-2, 50 times.
4) sleep for 1s
5) Repeat 1-4, 5 times.
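
The steps above can be sketched as a small load generator. This is not the program used for the measurements (which is not shown in the thread); it is a hypothetical user-space approximation, and the function names, the use of a wall-clock deadline instead of a fixed-count 'for' loop, and the way the average is computed are all assumptions.

```python
import time


def busy_loop(duration_us):
    """Spin until roughly duration_us microseconds have elapsed."""
    deadline = time.perf_counter() + duration_us / 1e6
    while time.perf_counter() < deadline:
        pass


def run_benchmark(busy_us, sleep_us=5000, inner=50, outer=5, pause_s=1.0):
    """Return the average time per busy+sleep iteration (in us) for each run."""
    averages = []
    for _ in range(outer):                  # step 5: repeat the whole procedure
        start = time.perf_counter()
        for _ in range(inner):              # step 3: repeat steps 1-2
            busy_loop(busy_us)              # step 1: load for a period T
            time.sleep(sleep_us / 1e6)      # step 2: short sleep
        elapsed_us = (time.perf_counter() - start) * 1e6
        averages.append(elapsed_us / inner)
        time.sleep(pause_s)                 # step 4: long pause between runs
    return averages


if __name__ == "__main__":
    for run, avg in enumerate(run_benchmark(busy_us=20000)):
        print(f"run {run}")
        print(f"Avg time: {avg:.0f} us")
```

Sweeping busy_us across values below, at, and above the governor's sampling rate gives the kind of series plotted in the spreadsheet; energy would have to be measured separately, for example by wrapping the run in turbostat.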

The results:
https://docs.google.com/spreadsheet/ccc?key=0AnMfNYUV1k0ddE13ZUtYdGs2dUVRdG00bVRVT3JScWcusp=sharing

Sheet1 (ProcessX1) includes the results from the test program running
as a single copy. The second one (ProcessX4) includes the results from
running 4 copies of the test program in parallel (using a bash script that
waits for the end of execution).

Graphs show the difference(%) in total execution time and total energy without
and with the patch.
Negative values mean that the test *with* the patch had better performance or
used less energy.

Test shows that below sampling rate (1us in my config), ondemand with this
patch behaves better (both in performance and consumption).
Though, in this test, for loads with 1us duration = 20us ondemand
behaves better without the patch.

Thanks,
Stratos

Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

2013-06-13 Thread Stratos Karafotis

Hi Rafael,

On 06/11/2013 02:24 AM, Rafael J. Wysocki wrote:
On Tuesday, June 11, 2013 12:57:26 AM Stratos Karafotis wrote:
On 06/09/2013 11:58 PM, Rafael J. Wysocki wrote:
Well, this means that your changes may hurt performance if the load comes and
goes in spikes, which is not so good. The fact that they cause less energy to
be used at the same time kind of balance that, though. [After all, we're
talking about the ondemand governor which should be used if the user wants to
sacrifice some performance for energy savings.]

Hi,

I tested different loop durations with my program from 1,000us to 1,000,000us.
The logic is almost the same as in the previous test:

1) Run a 'for' loop for a period T (~ 1000-100us)
2) sleep for 5000us
3) Repeat steps 1-2, 50 times.
4) sleep for 1s
5) Repeat 1-4, 5 times.

The results:
https://docs.google.com/spreadsheet/ccc?key=0AnMfNYUV1k0ddE13ZUtYdGs2dUVRdG00bVRVT3JScWcusp=sharing

Graphs show the difference(%) in total execution time and total energy without
and with the patch.
Negative values mean that the test *with* the patch had better performance or
used less energy.

Test shows that below sampling rate (1us in my config), ondemand with this
patch behaves better (both in performance and consumption).
Though, in this test, for loads with 1us duration = 20us ondemand
behaves better without the patch.

Thanks for these results!

Well, I'd say that this doesn't look rosy any more, so the jury is still out.

We need more testing with different workloads and on different hardware. I'll
try to arrange something to that end.

Please let me share some more test results using aim9 benchmark suite:
https://docs.google.com/spreadsheet/ccc?key=0AnMfNYUV1k0ddDdGdlJyUHpqT2xGY1lBOEt2UEVnNlEusp=sharing

Each test was running for 10sec.
Total execution time with and without the patch was almost identical, which is
expected since the tests in aim9 run for a specific period.
The energy during the test run was increased by 0.43% with the patch.
The performance was increased by 1.25% (average) with this patch.

Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

2013-06-13 Thread Stratos Karafotis

On 06/14/2013 12:40 AM, Borislav Petkov wrote:
On Fri, Jun 14, 2013 at 12:22:18AM +0300, Stratos Karafotis wrote:
Please let me share some more test results using aim9 benchmark suite:
https://docs.google.com/spreadsheet/ccc?key=0AnMfNYUV1k0ddDdGdlJyUHpqT2xGY1lBOEt2UEVnNlEusp=sharing

Each test was running for 10sec.
Total execution time with and without the patch was almost identical, which is
expected since the tests in aim9 run for a specific period.
The energy during the test run was increased by 0.43% with the patch.
The performance was increased by 1.25% (average) with this patch.

Not bad. However, exec_test and fork_test are kinda unexpected with such
a high improvement percentage. Happen to have an explanation?

FWIW, if we don't find any serious perf/power regressions with
this patch, I'd say it is worth applying even solely for the code
simplification it brings.

Although I'm not sure what explains the unexpected improvement, I can confirm
it (I ran the test again). There is also a significant improvement in
Directory searches (+5.79%), Disk Copies (+1.19%), shell scripts
(1.20%, 1.51%, 2.38%) and tcp/udp tests (3.62%, 1.41%).

I believe that ondemand performs better with this patch under
medium loads. Maybe these operations produce small to medium loads (lower
than up_threshold) and push the CPU to medium frequencies. Without the
patch, the CPU stays longer at the min frequency.

Thanks,
Stratos


Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

2013-06-14 Thread Stratos Karafotis

Hi,

On 06/14/2013 03:55 PM, Rafael J. Wysocki wrote:
 On Friday, June 14, 2013 02:44:01 PM Borislav Petkov wrote:
 On Fri, Jun 14, 2013 at 02:46:38PM +0200, Rafael J. Wysocki wrote:
 OK, so here's a deal. After 3.10-rc1 goes out, I'll put this into
 linux-next

 Yeah, you mean 3.11-rc1 here...
 
 Sure, sorry for the confusion.
 
 for 3.12, so that people have a few more weeks to complain. If they
 don't, it'll go into 3.12.

 but yep, sounds like a deal.
 
 Cool, thanks!


Great news! :-)

Thank you all, for your help and for your valuable time!

Regards,
Stratos


Re: [PATCH v2 2/3] cpufreq: Remove unused function __cpufreq_driver_getavg

2013-06-04 Thread Stratos Karafotis


On 06/04/2013 08:19 AM, Viresh Kumar wrote:

On 4 June 2013 01:18, Stratos Karafotis strat...@semaphore.gr wrote:

Calculation of frequency target in ondemand governor changed and it is


s/frequency target/target frequency


I will change it also in 3/3 that I use the same.


independent from measured average frequency.

Remove unused __cpufreq_driver_getavg function and getavg member from
cpufreq_driver struct. Also, remove the callback getavg in
acpi_cpufreq_driver.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
  drivers/cpufreq/Makefile   |  2 +-
  drivers/cpufreq/acpi-cpufreq.c |  5 -
  drivers/cpufreq/cpufreq.c  | 12 
  include/linux/cpufreq.h|  6 --
  4 files changed, 1 insertion(+), 24 deletions(-)

diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 6ad0b91..aebd4ef 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -23,7 +23,7 @@ obj-$(CONFIG_GENERIC_CPUFREQ_CPU0)+= cpufreq-cpu0.o
  # powernow-k8 can load then. ACPI is preferred to all other hardware-specific drivers.
  # speedstep-* is preferred over p4-clockmod.

-obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o mperf.o


Should this be done in 3/3 ?



acpi-cpufreq does not use mperf after 2/3. Why should we compile it with
CONFIG_X86_ACPI_CPUFREQ?
Do you want me to move the change in 3/3?

Thanks,
Stratos


Re: [PATCH v2 1/3] cpufreq: ondemand: Change the calculation of target frequency

2013-06-04 Thread Stratos Karafotis

On 06/03/2013 11:38 PM, David C Niemi wrote:
 
 Interesting analysis; I just got back from vacation and have not had a chance 
 to comment until now.
 
 I like Stratos' general idea of making the decision to upshift or downshift 
 independent of current frequency, as it makes things simpler and potentially 
 more stable.  But I believe it will be important to measure performance and 
 power consumption in a wider range of use cases to know whether it is an 
 overall win (or whether it can at least be tuned to match the status quo for 
 various use cases).
 
 In my main use case (network servers), I don't think using more middle 
 frequencies is a good thing at all; as soon as a load gets heavy even briefly 
 I want the CPU doing all it can until the load has clearly abated.  The main 
 competition in this use case is between using ondemand (tuned for performance 
 at the cost of some extra power consumption) or the performance governor 
 (which cannot be tuned at all, and where C-states are the only hope for 
 moderating power consumption).
 
 A couple of additional points -- it is possible to get excellent overall 
 performance and avoid oscillation using ondemand right now by using a low 
 up_threshold and a sampling_down_factor of around 100; in this case you spend 
 most of your time at either the lowest or highest possible frequency and you 
 spend very little time thinking about slowing down.  The main downside of 
 this is an increase in power consumption, so it is not a battery-friendly 
 approach, but someone will need to also measure power consumption if we want 
 to justify a change from the status quo on that basis.  There are dozens of 
 ways to save power at the expense of performance or vice versa, so any major 
 change like this needs to be analyzed for both, in case your patch just 
 results in running at higher average frequencies and gets its performance 
 boost from that.
 
 David C Niemi

Hi David, 
Thanks for your comments!

In your case, the behavior of ondemand will not change for the worse.
up_threshold/sampling_down_factor remain as they are.
So, for loads above up_threshold ondemand will behave the same.

For loads lower than up_threshold, with the old method the CPU will remain
at the lowest frequency or downshift to a middle one.
With this patch, the CPU will remain at the lowest frequency, downshift to a
middle frequency, or upshift to a middle frequency. So, I think we will
have better performance with the patch.

I know that CPU load tends to be chaotic, but please let me try to explain
my logic with a theoretical example comparing ondemand with and without
this patch, which I think will be valid in many cases.

Let's assume for simplicity a single core CPU with available
frequencies 100-1000MHz in steps of 100MHz. The architecture does
not support APERF/MPERF to measure average frequency. All tunables
are at their default values. As the initial state we consider that the CPU is
idling at 100MHz with load = 0 (ideally).

A process needs CPU time and in the next iteration ondemand calculates
the load of the previous sampling interval.
There are 3 different possible paths:
1) Load is greater than up_threshold: with or without the patch, the CPU will
increase to max.
2) Load is lower than 10: with or without the patch, the CPU will remain at the
lowest freq.
3) Load is between 10 and up_threshold, for example 50:
without this patch, the CPU will remain at 100MHz;
with this patch, the CPU will increase to a frequency that is directly
proportional to the load (500MHz).
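
The three paths can be illustrated with a toy model of the decision taken from the idle state. This is a simplified sketch, not the kernel code: the frequency table is the hypothetical one from the example, the up_threshold value of 95 is an assumed default, rounding the target up to the next table entry is an assumption, and the old method is reduced to "jump to max or stay put" (its downshift path is omitted).

```python
import bisect

# Hypothetical frequency table from the example: 100-1000 MHz in 100 MHz steps (kHz).
FREQ_TABLE_KHZ = [mhz * 1000 for mhz in range(100, 1001, 100)]
MAX_FREQ_KHZ = FREQ_TABLE_KHZ[-1]
UP_THRESHOLD = 95  # assumed value; the mail only says the tunables are at defaults


def pick_at_or_above(target_khz):
    """Lowest table frequency >= target (rounding up is an assumption)."""
    i = bisect.bisect_left(FREQ_TABLE_KHZ, target_khz)
    return FREQ_TABLE_KHZ[min(i, len(FREQ_TABLE_KHZ) - 1)]


def next_freq_with_patch(load):
    """Target frequency = C * load, with C = max_freq / 100."""
    if load > UP_THRESHOLD:
        return MAX_FREQ_KHZ
    return pick_at_or_above(load * MAX_FREQ_KHZ // 100)


def next_freq_without_patch(load, cur_khz):
    """Simplified old behaviour: jump to max or stay put (downshift omitted)."""
    return MAX_FREQ_KHZ if load > UP_THRESHOLD else cur_khz


if __name__ == "__main__":
    for load in (96, 5, 50):  # paths 1, 2 and 3, starting from 100 MHz
        old = next_freq_without_patch(load, FREQ_TABLE_KHZ[0])
        new = next_freq_with_patch(load)
        print(f"load {load:3d}: without patch {old} kHz, with patch {new} kHz")
```

Only path 3 differs between the two functions, which is the case the rest of the example examines.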

If we are concerned about performance, ondemand will behave better with this
patch in case 3. But what about power consumption? I would say that this
depends on the duration of the load:

3a) Suppose that the process causes a CPU load of 50 for 5 sampling periods
without this patch.
Without this patch, the CPU will remain at 100MHz for 5 sampling periods.
With this patch, the CPU will increase to 500MHz, most probably for ~1 sampling
period.

3b) The process causes a CPU load of 50 for 1 sampling period.
Without this patch, the CPU will remain at 100MHz for 1 sampling period.
With this patch, the CPU will increase to 500MHz for 1 sampling period.

3c) The process causes a CPU load of 50, which then increases to 100 for the
next iterations (most probably because the process started in the middle of a
sampling period).
Without this patch, the CPU will remain at 100MHz for the 1st period and then
it will increase to 1000MHz for the next iterations.
With this patch, the CPU will increase to 500MHz for the 1st period and then
it will increase to 1000MHz for the next iterations.

The only case in which the new method will be less power efficient is 3b), but
I think there will be a significant improvement in performance for 3a) and 3c).

The results will be similar when the governor upshifts from any other frequency.

Using the highest frequency, the proposed method will downshift to lower 
frequencies
because with the 'old' method the calculation it's dependent from the current 
frequency
and

Re: [PATCH v2 2/3] cpufreq: Remove unused function __cpufreq_driver_getavg

I think you are right. I will reorder 2/3 and 3/3 with the change you suggested.

Thanks,
Stratos

Viresh Kumar viresh.ku...@linaro.org wrote:

On 4 June 2013 20:36, Stratos Karafotis strat...@semaphore.gr wrote:
 On 06/04/2013 08:19 AM, Viresh Kumar wrote:
 Should this be done in 3/3 ?


 acpi-cpufreq does not use mperf after 2/3. Why should we compile it with
 CONFIG_X86_ACPI_CPUFREQ?
 Do you want me to move the change in 3/3?

I somehow feel now that 3/3 should come before 2/3 and then this change
should be merged into it. And at the end we can have this patch as 3/3..

What do you say? core should go last and users/drivers must go first.

[PATCH v3 3/3] cpufreq: Remove unused function __cpufreq_driver_getavg

Calculation of target frequency in ondemand governor changed and it is
independent from measured average frequency.

Remove unused __cpufreq_driver_getavg function and getavg member from
cpufreq_driver struct.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/cpufreq.c | 12 
 include/linux/cpufreq.h   |  6 --
 2 files changed, 18 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index f8c2860..a61aacb 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1584,18 +1584,6 @@ fail:
 }
 EXPORT_SYMBOL_GPL(cpufreq_driver_target);
 
-int __cpufreq_driver_getavg(struct cpufreq_policy *policy, unsigned int cpu)
-{
-	if (cpufreq_disabled())
-		return 0;
-
-	if (!cpufreq_driver->getavg)
-		return 0;
-
-	return cpufreq_driver->getavg(policy, cpu);
-}
-EXPORT_SYMBOL_GPL(__cpufreq_driver_getavg);
-
 /*
  * when event is CPUFREQ_GOV_LIMITS
  */
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index d939056..50f19ad 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -215,10 +215,6 @@ extern int __cpufreq_driver_target(struct cpufreq_policy *policy,
   unsigned int target_freq,
   unsigned int relation);
 
-
-extern int __cpufreq_driver_getavg(struct cpufreq_policy *policy,
-  unsigned int cpu);
-
 int cpufreq_register_governor(struct cpufreq_governor *governor);
 void cpufreq_unregister_governor(struct cpufreq_governor *governor);
 
@@ -258,8 +254,6 @@ struct cpufreq_driver {
 	unsigned int	(*get)	(unsigned int cpu);
 
 	/* optional */
-	unsigned int (*getavg)	(struct cpufreq_policy *policy,
-				 unsigned int cpu);
 	int	(*bios_limit)	(int cpu, unsigned int *limit);
 
 	int	(*exit)		(struct cpufreq_policy *policy);
-- 
1.8.1.4


[PATCH v3 0/3] cpufreq: ondemand: Change the calculation of target frequency

Changes since v2:
- Reorder patches 2/3 and 3/3
- Fix typos in patch changelog

Changes since v1:
- Use policy->cpuinfo.max_freq in the calculation formula
of target frequency instead of policy->max
- Split the patch into 3 parts

Stratos Karafotis (3):
  cpufreq: ondemand: Change the calculation of target frequency
  cpufreq: Remove unused APERF/MPERF support
  cpufreq: Remove unused function __cpufreq_driver_getavg

 arch/x86/include/asm/processor.h   | 29 --
 drivers/cpufreq/Makefile   |  2 +-
 drivers/cpufreq/acpi-cpufreq.c |  5 
 drivers/cpufreq/cpufreq.c  | 12 -
 drivers/cpufreq/cpufreq_governor.c | 10 +---
 drivers/cpufreq/cpufreq_governor.h |  1 -
 drivers/cpufreq/cpufreq_ondemand.c | 39 ++---
 drivers/cpufreq/mperf.c| 51 --
 drivers/cpufreq/mperf.h|  9 ---
 include/linux/cpufreq.h|  6 -
 10 files changed, 9 insertions(+), 155 deletions(-)
 delete mode 100644 drivers/cpufreq/mperf.c
 delete mode 100644 drivers/cpufreq/mperf.h

-- 
1.8.1.4


[PATCH v3 2/3] cpufreq: Remove unused APERF/MPERF support

Calculation of target frequency in ondemand governor changed and it is
independent from measured average frequency.

Remove unused APERF/MPERF support.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 arch/x86/include/asm/processor.h | 29 ---
 drivers/cpufreq/Makefile |  2 +-
 drivers/cpufreq/acpi-cpufreq.c   |  5 
 drivers/cpufreq/mperf.c  | 51 
 drivers/cpufreq/mperf.h  |  9 ---
 5 files changed, 1 insertion(+), 95 deletions(-)
 delete mode 100644 drivers/cpufreq/mperf.c
 delete mode 100644 drivers/cpufreq/mperf.h

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 4b3..2874a3b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -941,35 +941,6 @@ extern int set_tsc_mode(unsigned int val);
 
 extern u16 amd_get_nb_id(int cpu);
 
-struct aperfmperf {
-	u64 aperf, mperf;
-};
-
-static inline void get_aperfmperf(struct aperfmperf *am)
-{
-	WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_APERFMPERF));
-
-	rdmsrl(MSR_IA32_APERF, am->aperf);
-	rdmsrl(MSR_IA32_MPERF, am->mperf);
-}
-
-#define APERFMPERF_SHIFT 10
-
-static inline
-unsigned long calc_aperfmperf_ratio(struct aperfmperf *old,
-				    struct aperfmperf *new)
-{
-	u64 aperf = new->aperf - old->aperf;
-	u64 mperf = new->mperf - old->mperf;
-	unsigned long ratio = aperf;
-
-	mperf >>= APERFMPERF_SHIFT;
-	if (mperf)
-		ratio = div64_u64(aperf, mperf);
-
-	return ratio;
-}
-
 extern unsigned long arch_align_stack(unsigned long sp);
 extern void free_init_pages(char *what, unsigned long begin, unsigned long end);
 
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index c956094..7f823d9 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -23,7 +23,7 @@ obj-$(CONFIG_GENERIC_CPUFREQ_CPU0)+= cpufreq-cpu0.o
 # powernow-k8 can load then. ACPI is preferred to all other hardware-specific drivers.
 # speedstep-* is preferred over p4-clockmod.
 
-obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o mperf.o
+obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o
 obj-$(CONFIG_X86_POWERNOW_K8)  += powernow-k8.o
 obj-$(CONFIG_X86_PCC_CPUFREQ)  += pcc-cpufreq.o
 obj-$(CONFIG_X86_POWERNOW_K6)  += powernow-k6.o
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 17e3496..d3a5f35 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -45,7 +45,6 @@
 #include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
-#include "mperf.h"
 
 MODULE_AUTHOR("Paul Diefenbaugh, Dominik Brodowski");
 MODULE_DESCRIPTION("ACPI Processor P-States Driver");
@@ -842,10 +841,6 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
 	/* notify BIOS that we exist */
 	acpi_processor_notify_smm(THIS_MODULE);
 
-	/* Check for APERF/MPERF support in hardware */
-	if (boot_cpu_has(X86_FEATURE_APERFMPERF))
-		acpi_cpufreq_driver.getavg = cpufreq_get_measured_perf;
-
 	pr_debug("CPU%u - ACPI performance management activated.\n", cpu);
 	for (i = 0; i < perf->state_count; i++)
 		pr_debug(" %cP%d: %d MHz, %d mW, %d uS\n",
diff --git a/drivers/cpufreq/mperf.c b/drivers/cpufreq/mperf.c
deleted file mode 100644
index 911e193..000
--- a/drivers/cpufreq/mperf.c
+++ /dev/null
@@ -1,51 +0,0 @@
-#include <linux/kernel.h>
-#include <linux/smp.h>
-#include <linux/module.h>
-#include <linux/init.h>
-#include <linux/cpufreq.h>
-#include <linux/slab.h>
-
-#include "mperf.h"
-
-static DEFINE_PER_CPU(struct aperfmperf, acfreq_old_perf);
-
-/* Called via smp_call_function_single(), on the target CPU */
-static void read_measured_perf_ctrs(void *_cur)
-{
-   struct aperfmperf *am = _cur;
-
-   get_aperfmperf(am);
-}
-
-/*
- * Return the measured active (C0) frequency on this CPU since last call
- * to this function.
- * Input: cpu number
- * Return: Average CPU frequency in terms of max frequency (zero on error)
- *
- * We use IA32_MPERF and IA32_APERF MSRs to get the measured performance
- * over a period of time, while CPU is in C0 state.
- * IA32_MPERF counts at the rate of max advertised frequency
- * IA32_APERF counts at the rate of actual CPU frequency
- * Only IA32_APERF/IA32_MPERF ratio is architecturally defined and
- * no meaning should be associated with absolute values of these MSRs.
- */
-unsigned int cpufreq_get_measured_perf(struct cpufreq_policy *policy,
-					unsigned int cpu)
-{
-	struct aperfmperf perf;
-	unsigned long ratio;
-	unsigned int retval;
-
-	if (smp_call_function_single(cpu, read_measured_perf_ctrs, &perf, 1))
-		return 0;
-
-	ratio = calc_aperfmperf_ratio(&per_cpu(acfreq_old_perf, cpu), &perf);
-   per_cpu(acfreq_old_perf, cpu) = perf

[PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

Ondemand calculates load in terms of frequency and increases it only
if the load_freq is greater than up_threshold multiplied by current
or average frequency. This seems to produce oscillations of frequency
between min and max because, for example, a relatively small load can
easily saturate minimum frequency and lead the CPU to max. Then, the
CPU will decrease back to min due to a small load_freq.

This patch changes the calculation method of load and target frequency
considering 2 points:
- Load computation should be independent of the current or average
measured frequency. For example, an absolute load of 80% at 100MHz is not
necessarily equivalent to 8% at 1000MHz in the next sampling interval.
- Target frequency should be increased to any value in the frequency table
proportional to absolute load, instead of only to the max. Thus:

Target frequency = C * load

where C = policy->cpuinfo.max_freq / 100

Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
increase of ~1.5% in performance. cpufreq_stats (time_in_state) shows
that middle frequencies are used more with this patch. Highest
and lowest frequencies were used less by ~9%.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/cpufreq_governor.c | 10 +-
 drivers/cpufreq/cpufreq_governor.h |  1 -
 drivers/cpufreq/cpufreq_ondemand.c | 39 +++---
 3 files changed, 8 insertions(+), 42 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index a849b2d..47c8077 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -54,7 +54,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 
 	policy = cdbs->cur_policy;
 
-	/* Get Absolute Load (in terms of freq for ondemand gov) */
+	/* Get Absolute Load */
 	for_each_cpu(j, policy->cpus) {
 		struct cpu_dbs_common_info *j_cdbs;
 		u64 cur_wall_time, cur_idle_time;
@@ -105,14 +105,6 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 
 		load = 100 * (wall_time - idle_time) / wall_time;
 
-		if (dbs_data->cdata->governor == GOV_ONDEMAND) {
-			int freq_avg = __cpufreq_driver_getavg(policy, j);
-			if (freq_avg <= 0)
-				freq_avg = policy->cur;
-
-			load *= freq_avg;
-		}
-
 		if (load > max_load)
 			max_load = load;
 	}
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index e7bbf76..c305cad 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -169,7 +169,6 @@ struct od_dbs_tuners {
unsigned int sampling_rate;
unsigned int sampling_down_factor;
unsigned int up_threshold;
-   unsigned int adj_up_threshold;
unsigned int powersave_bias;
unsigned int io_is_busy;
 };
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 4b9bb5d..62e67a9 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -29,11 +29,9 @@
 #include "cpufreq_governor.h"
 
 /* On-demand governor macros */
-#define DEF_FREQUENCY_DOWN_DIFFERENTIAL		(10)
 #define DEF_FREQUENCY_UP_THRESHOLD		(80)
 #define DEF_SAMPLING_DOWN_FACTOR		(1)
 #define MAX_SAMPLING_DOWN_FACTOR		(10)
-#define MICRO_FREQUENCY_DOWN_DIFFERENTIAL	(3)
 #define MICRO_FREQUENCY_UP_THRESHOLD		(95)
 #define MICRO_FREQUENCY_MIN_SAMPLE_RATE		(1)
 #define MIN_FREQUENCY_UP_THRESHOLD		(11)
@@ -159,14 +157,10 @@ static void dbs_freq_increase(struct cpufreq_policy *p, unsigned int freq)
 
 /*
  * Every sampling_rate, we check, if current idle time is less than 20%
- * (default), then we try to increase frequency. Every sampling_rate, we look
- * for the lowest frequency which can sustain the load while keeping idle time
- * over 30%. If such a frequency exist, we try to decrease to this frequency.
- *
- * Any frequency increase takes it to the maximum frequency. Frequency reduction
- * happens at minimum steps of 5% (default) of current frequency
+ * (default), then we try to increase frequency. Else, we adjust the frequency
+ * proportional to load.
  */
-static void od_check_cpu(int cpu, unsigned int load_freq)
+static void od_check_cpu(int cpu, unsigned int load)
 {
 	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
 	struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
@@ -176,29 +170,17 @@ static void od_check_cpu(int cpu, unsigned int load_freq)
 	dbs_info->freq_lo = 0;
 
 	/* Check for frequency increase */
-	if (load_freq > od_tuners->up_threshold * policy->cur) {
+	if (load > od_tuners->up_threshold

Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency


Hi Borislav,

On 06/05/2013 07:17 PM, Borislav Petkov wrote:

On Wed, Jun 05, 2013 at 07:01:25PM +0300, Stratos Karafotis wrote:

Ondemand calculates load in terms of frequency and increases it only
if the load_freq is greater than up_threshold multiplied by current
or average frequency. This seems to produce oscillations of frequency
between min and max because, for example, a relatively small load can
easily saturate minimum frequency and lead the CPU to max. Then, the
CPU will decrease back to min due to a small load_freq.


Right, and I think this is how we want it, no?

The thing is, the faster you finish your work, the faster you can become
idle and save power.


This is exactly the goal of this patch: to use the middle frequencies
more efficiently so that the work finishes faster.


If you switch frequencies in a staircase-like manner, you're going to
take longer to finish, in certain cases, and burn more power while doing
so.


This is not true with this patch. It switches to middle frequencies
when the load < up_threshold.
Currently, ondemand does not increase freq in that case. The CPU runs at the
lowest freq until the load is greater than up_threshold.


Btw, racing to idle is also a good example for why you want boosting:
you want to go max out the core but stay within power limits so that you
can finish sooner.


This patch changes the calculation method of load and target frequency
considering 2 points:
- Load computation should be independent from current or average
measured frequency. For example an absolute load 80% at 100MHz is not
necessarily equivalent to 8% at 1000MHz in the next sampling interval.
- Target frequency should be increased to any value of frequency table
proportional to absolute load, instead to only the max. Thus:

Target frequency = C * load

where C = policy->cpuinfo.max_freq / 100

Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
increase ~1.5% in performance. cpufreq_stats (time_in_state) shows
that middle frequencies are used more, with this patch. Highest
and lowest frequencies were used less by ~9%


I read this as the workload takes longer to complete which means
higher power consumption and longer execution times which means less
time spent in idle. And I don't think we want that.

Yes, no?


In my opinion, no.
Running the benchmark mentioned in changelog shows shorter execution
time by ~1.5%

Thanks,
Stratos

[PATCH linux-next] ext4: inode: Fix compiler warning

Fix the following compiler warning:

fs/ext4/inode.c: In function ‘ext4_da_writepages’:
fs/ext4/inode.c:2212:6: warning: ‘err’ may be used uninitialized in this
function [-Wmaybe-uninitialized]
fs/ext4/inode.c:2155:6: note: ‘err’ was declared here

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 fs/ext4/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 442c5d2..702428c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2152,7 +2152,7 @@ static int mpage_map_and_submit_extent(handle_t *handle,
 {
 	struct inode *inode = mpd->inode;
 	struct ext4_map_blocks *map = &mpd->map;
-	int err;
+	int err = 0;
 	loff_t disksize;
 
 	mpd->io_submit.io_end->offset =
-- 
1.8.1.4
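
As an illustration of the kind of warning the ext4 patch above silences, here is a minimal sketch of the general pattern. It is not the ext4 code: map_one() and process() are hypothetical names, and whether a given gcc version actually emits -Wmaybe-uninitialized for this exact shape depends on the compiler and optimization level.

```c
/* Hypothetical stand-in for a helper that can fail with a negative errno. */
static int map_one(int i)
{
	return i < 0 ? -1 : 0;
}

/*
 * Hypothetical caller. If "err" were declared as plain "int err;", there
 * would be a path (count <= 0, loop body never runs) on which "err" is
 * returned without ever being assigned. Initializing it to 0 closes that
 * path, which mirrors the one-line change in the patch above.
 */
static int process(int count)
{
	int err = 0;
	int i;

	for (i = 0; i < count; i++) {
		err = map_one(i);
		if (err < 0)
			break;
	}

	return err;
}
```

With the initializer in place, process(0) returns 0 instead of an indeterminate value.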


Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

Hi Rafael,

I will try to provide the requested info (although I'm not sure how to measure
total energy :) )

Thanks,
Stratos

Rafael J. Wysocki r...@sisk.pl wrote:

On Wednesday, June 05, 2013 08:13:26 PM Stratos Karafotis wrote:
 Hi Borislav,
 
 On 06/05/2013 07:17 PM, Borislav Petkov wrote:
  On Wed, Jun 05, 2013 at 07:01:25PM +0300, Stratos Karafotis wrote:
  Ondemand calculates load in terms of frequency and increases it only
  if the load_freq is greater than up_threshold multiplied by current
  or average frequency. This seems to produce oscillations of frequency
  between min and max because, for example, a relatively small load can
  easily saturate minimum frequency and lead the CPU to max. Then, the
  CPU will decrease back to min due to a small load_freq.
 
  Right, and I think this is how we want it, no?
 
  The thing is, the faster you finish your work, the faster you can become
  idle and save power.
 
 This is exactly the goal of this patch. To use more efficiently middle
 frequencies to finish faster the work.
 
  If you switch frequencies in a staircase-like manner, you're going to
  take longer to finish, in certain cases, and burn more power while doing
  so.
 
 This is not true with this patch. It switches to middle frequencies
 when the load < up_threshold.
 Now, ondemand does not increase freq. CPU runs in lowest freq till the
 load is greater than up_threshold.
 
  Btw, racing to idle is also a good example for why you want boosting:
  you want to go max out the core but stay within power limits so that you
  can finish sooner.
 
  This patch changes the calculation method of load and target frequency
  considering 2 points:
  - Load computation should be independent from current or average
  measured frequency. For example an absolute load 80% at 100MHz is not
  necessarily equivalent to 8% at 1000MHz in the next sampling interval.
  - Target frequency should be increased to any value of frequency table
  proportional to absolute load, instead to only the max. Thus:
 
  Target frequency = C * load
 
  where C = policy->cpuinfo.max_freq / 100
 
  Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
  Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
  increase ~1.5% in performance. cpufreq_stats (time_in_state) shows
  that middle frequencies are used more, with this patch. Highest
  and lowest frequencies were used less by ~9%

Can you also use powertop to measure the percentage of time spent in idle
states for the same workload with and without your patchset?  Also, it would
be good to measure the total energy consumption somehow ...

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

Thanks Viresh. I don't think I could have explained this any better.
Thanks also for the acknowledgment!

Stratos

Viresh Kumar viresh.ku...@linaro.org wrote:

On 6 June 2013 15:31, Borislav Petkov b...@suse.de wrote:

 Hold on, you say above "easily saturate minimum frequency and lead the
 CPU to max". I read this as we jump straight to max P-state where we
 even boost.

Probably he meant: at the lowest frequencies, a small load on the system
may look like a huge one. For example, a 20-30% load at max freq can be a 95% load
at min freq. And so we jump to max freq even for this load and come back
pretty quickly, as this load isn't sustained for long. Beyond that, we wait for
the load to go over up_threshold before increasing freq.
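
The effect Viresh describes can be checked with a little arithmetic. The sketch below assumes purely CPU-bound work whose busy time scales inversely with frequency; the helper name load_at_freq and the 300 MHz / 1000 MHz figures in the note are illustrative assumptions, not values taken from this thread.

```c
/*
 * Hypothetical helper: given the load (in percent) that some work causes
 * at max_freq, estimate the load the same work would cause at a lower
 * frequency, assuming busy time scales as 1/frequency. The result is
 * capped at 100 because a CPU cannot be more than fully busy.
 * Frequencies are in kHz.
 */
static unsigned int load_at_freq(unsigned int load_at_max,
				 unsigned int max_freq,
				 unsigned int freq)
{
	unsigned long long scaled;

	if (freq == 0)
		return 100;

	scaled = (unsigned long long)load_at_max * max_freq / freq;
	return scaled > 100 ? 100 : (unsigned int)scaled;
}
```

Under these assumptions, a 25% load at 1000 MHz becomes roughly 83% at 300 MHz, and a 30% load already saturates the 300 MHz step.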

 CPU to max finishes the work faster than middle frequencies, if you're
 CPU-bound.

He isn't removing this feature at all.

Current code is:

if (load > up_threshold)
   goto maxfreq.
else
   don't increase freq, maybe decrease it in steps

What he is doing is:

if (load > up_threshold)
   goto maxfreq.
else
   increase/decrease freq based on current load.

So, if up_threshold is 95 and load remains < 95, his patch will
give significant improvement both power & performance wise.

Else, it shouldn't decrease it.
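
The two branches in the pseudocode above can be written out as a small C sketch. This is a simplified model, not the governor code: the function names are hypothetical, the step-wise decrease of the current code is reduced to "hold the current frequency", and the clamp to min_freq is an assumption added so the result stays inside the policy limits.

```c
/* Frequencies in kHz, load and up_threshold in percent (0-100). */

/* Simplified model of the current behaviour: jump to max or hold. */
static unsigned int next_freq_before(unsigned int load,
				     unsigned int up_threshold,
				     unsigned int cur_freq,
				     unsigned int max_freq)
{
	if (load > up_threshold)
		return max_freq;

	return cur_freq;	/* step-wise decrease omitted in this sketch */
}

/* Simplified model of the patched behaviour: jump to max or scale to load. */
static unsigned int next_freq_after(unsigned int load,
				    unsigned int up_threshold,
				    unsigned int min_freq,
				    unsigned int max_freq)
{
	unsigned int freq;

	if (load > up_threshold)
		return max_freq;

	freq = (unsigned int)((unsigned long long)load * max_freq / 100);
	return freq < min_freq ? min_freq : freq;
}
```

With up_threshold at 95, a 50% load on a 300-1000 MHz CPU leaves the first model at its current frequency, while the second picks 500 MHz. Above the threshold both models go straight to max.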

Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

On 06/06/2013 03:10 PM, Borislav Petkov wrote:
 On Thu, Jun 06, 2013 at 03:40:13PM +0530, Viresh Kumar wrote:
 his patch will give significant improvement both power & performance wise.
 
 Yes, and I'd like to see the paperwork on that. Numbers, and on a couple
 of platforms/vendors if possible, please.
 
 Thanks.
 

On 06/06/2013 04:15 PM, Borislav Petkov wrote:
 Please do not top-post.
 
 On Thu, Jun 06, 2013 at 03:54:20PM +0300, Stratos Karafotis wrote:
 I will try to provide the requested info (although, I'm not sure how
 to measure total energy :) )
 
 tools/power/x86/turbostat looks like a good tool. It can show, a.o.,
 power consumption in Watts on modern Intels and other interesting stuff.
 
 HTH.
 

Apologies for top-posting. I was able to send email only from my phone.

Thanks for your hint about turbostat.

As you have most probably understood, I'm an individual amateur kernel developer.
I could provide some numbers from x86 architecture as Rafael suggested.
But unfortunately, I don't have access to more sources/infrastructure.
So, I will not be able to provide numbers from different platform(s).

I've already provided some benchmarks from x86 (3.10-rc3) and also
tested the patch in 3.4.47 kernel (ARM, Nexus 4 phone, ~1000 installations)
and in 3.0.80 kernel (ARM, Samsung Galaxy S phone, ~1500 installations).

Kindly let me know if a couple of platforms/vendors is a show stopper
for this patch series. If yes, please ignore this patch and accept
my apologies for wasting your time. I am just trying to contribute
to this project (I believe there is space here for amateur developers).

Many thanks to Rafael, who helped and guided me.
Thanks to Viresh for his helpful comments and his acknowledgment of
the patch.

Best Regards,
Stratos

Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency


On 06/06/2013 08:11 PM, Borislav Petkov wrote:

On Thu, Jun 06, 2013 at 07:46:17PM +0300, Stratos Karafotis wrote:

Apologies for top-posting. I was able to send email only from my phone.

Thanks for your hint about turbostat.

As you have most probably understood, I'm an individual amateur kernel developer.
I could provide some numbers from x86 architecture as Rafael suggested.
But unfortunately, I don't have access to more sources/infrastructure.
So, I will not be able to provide numbers from different platform(s).

I've already provided some benchmarks from x86 (3.10-rc3) and also
tested the patch in 3.4.47 kernel (ARM, Nexus 4 phone, ~1000 installations)
and in 3.0.80 kernel (ARM, Samsung Galaxy S phone, ~1500 installations).

Kindly let me know if a couple of platforms/vendors is a show stopper
for this patch series. If yes, please ignore this patch and accept
my apologies for wasting your time. I am just trying to contribute
to this project (I believe there is space here for amateur developers).


I'm in no way discouraging you from contributing to the kernel - on the
contrary: you should continue doing that.


I will try! :)


I'm just trying to make sure that a change like that doesn't hurt
existing systems, thus the request to test on a couple of platforms. If
you don't have other platforms, that's fine, we'll find them somewhere. :-)

I'm hoping you can understand my point of view too, though - how would you feel
if a patch shows improvement on my box but slows down yours - you won't
be very happy with it, right? That's why we generally want to test such
power/performance tweaks on a wider range of machines.


I totally understand your point of view and I think you are absolutely
right. I just wanted to state that I am not able to provide numbers
for other platforms due to lack of hardware.


But you said you have a i7-3770 CPU on which, I think, turbostat should
be able to show you how the power consumption looks like.

And if so, you could measure that consumption once with, and once
without your patch. This will give us initial numbers, at least.

How does that sound?



That sounds perfect! I will provide numbers for i7 soon.

Thanks for your comments!
Stratos


Re: [RFC PATCH] cpufreq: ondemand: Increase frequency to any value proportional to load

2013-05-29 Thread Stratos Karafotis

On 05/28/2013 11:54 PM, Rafael J. Wysocki wrote:
 On Tuesday, May 28, 2013 08:03:19 PM Stratos Karafotis wrote:
 I mean any value of freq table. Please let me know if you want me to rephrase
 it in description.
 
 Yes, it would be nice to be more precise.

OK sure, I will add a more precise description.


 Which is equivalent to saying that __cpufreq_driver_getavg() is not useful and
 may be removed (but the patch doesn't do that and I wonder why?), but surely
 the developer who added it wouldn't agree.
 
 So, please explain: why can we drop __cpufreq_driver_getavg()?
 

With the newly proposed method, the next frequency does not depend on the current
or average frequency. It depends only on load.
So, we don't need support for freq feedback from hardware anymore.
Even if, due to the underlying hardware coordination mechanism, the CPU runs at a
different frequency than the requested one, the calculation of load and of target
frequency will remain unaffected with this patch.
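
A minimal sketch of the frequency-independent calculation described above, assuming frequencies in kHz and times in arbitrary but equal units. The function names are hypothetical; the load expression follows the line visible in the governor diff (100 * (wall_time - idle_time) / wall_time), and the target follows the changelog formula "Target frequency = C * load" with C = max_freq / 100.

```c
/*
 * Hypothetical helper: absolute load in percent over one sampling
 * interval. It uses only wall and idle time, so the frequency the CPU
 * actually ran at never enters the calculation.
 */
static unsigned int absolute_load(unsigned long long wall_time,
				  unsigned long long idle_time)
{
	if (wall_time == 0 || idle_time > wall_time)
		return 0;

	return (unsigned int)(100 * (wall_time - idle_time) / wall_time);
}

/* Hypothetical helper: target frequency proportional to load. */
static unsigned int target_freq(unsigned int max_freq, unsigned int load)
{
	return (unsigned int)((unsigned long long)load * max_freq / 100);
}
```

For a CPU with a 1000 MHz maximum that was idle for half of the interval, absolute_load() gives 50 and target_freq() gives 500 MHz, which matches the example quoted later in this message.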

With full respect to ondemand coauthor, and if the new method is acceptable,
I could send a patch to revert the original one about the IA32_APERF and
IA32_MPERF MSR support.


 Thus, in the comparison with up_threshold to increase frequency we actually
 do this (in cases that getavg is not implemented):
 if (load > up_threshold)
  increase to max

 So, after the patch we keep the same comparison to decide about the max 
 frequency.
 I thought, that below up_threshold is 'fair' to decide about the next
 frequency with formula that frequency is proportional to load.
 For example in a CPU with min freq 100MHz and max 1000MHz with a
 load 50 target frequency should be 500MHz.
 
 Well, that sounds reasonable, but the patch actually does more than that.

I'm sorry, but I didn't understand your last point.


Thanks for your comments,
Stratos


Re: [RFC PATCH] cpufreq: ondemand: Increase frequency to any value proportional to load

2013-05-30 Thread Stratos Karafotis


On 05/30/2013 01:29 AM, Rafael J. Wysocki wrote:

On Wednesday, May 29, 2013 06:15:56 PM Stratos Karafotis wrote:

On 05/28/2013 11:54 PM, Rafael J. Wysocki wrote:

On Tuesday, May 28, 2013 08:03:19 PM Stratos Karafotis wrote:

I mean any value of freq table. Please let me know if you want me to rephrase
it in description.


Yes, it would be nice to be more precise.


OK sure, I will add a more precise description.



Which is equivalent to saying that __cpufreq_driver_getavg() is not useful and
may be removed (but the patch doesn't do that and I wonder why?), but surely
the developer who added it wouldn't agree.

So, please explain: why can we drop __cpufreq_driver_getavg()?



With the newly proposed method, the next frequency does not depend on the current
or average frequency. It depends only on load.
So, we don't need support for freq feedback from hardware anymore.


OK, but that's a more significant change than the changelog suggests.
The changelog should tell the whole story and explain the rationale. :-)

So, please explain that in fact it is not necessary to use the current or
average frequency to compute the target and why that is the case.

Also the patch should remove __cpufreq_driver_getavg() and the callback used by
it, since that code will be dead after applying it anyway.


Even if, due to the underlying hardware coordination mechanism, the CPU runs at a
different frequency than the requested one, the calculation of load and of target
frequency will remain unaffected with this patch.

With full respect to ondemand coauthor, and if the new method is acceptable,
I could send a patch to revert the original one about the IA32_APERF and
IA32_MPERF MSR support.


I'm not sure what you mean by revert, but please do as I said above.


Thus, in the comparison with up_threshold to increase frequency we actually
do this (in cases that getavg is not implemented):
if (load > up_threshold)
increase to max

So, after the patch we keep the same comparison to decide about the max 
frequency.
I thought, that below up_threshold is 'fair' to decide about the next
frequency with formula that frequency is proportional to load.
For example in a CPU with min freq 100MHz and max 1000MHz with a
load 50 target frequency should be 500MHz.


Well, that sounds reasonable, but the patch actually does more than that.


I'm sorry, but I didn't understand your last point.


Please see above.

The changelog doesn't even mention that the code is being switched from using
measured past frequencies to not using them, because you think that there's a
better way of computing the target (which by the way I can agree with :-)).

Thanks,
Rafael




OK, I will send a new patch that includes your corrections and suggestions.

Thanks so much for your time and your comments!

Stratos


[PATCH] cpufreq: ondemand: Change the calculation of target frequency

2013-05-30 Thread Stratos Karafotis

Ondemand calculates load in terms of frequency and increases it only
if the load_freq is greater than up_threshold multiplied by current
or average frequency. This seems to produce oscillations of frequency
between min and max because, for example, a relatively small load can
easily saturate minimum frequency and lead the CPU to max. Then, the
CPU will decrease back to min due to a small load_freq.

This patch changes the calculation method of load and target frequency
considering 2 points:
- Load computation should be independent of the current or average
measured frequency. For example, an absolute load of 80% at 100MHz is not
necessarily equivalent to 8% at 1000MHz in the next sampling interval.
- Target frequency should be increased to any value in the frequency table
proportional to absolute load, instead of only to the max.

The patch also removes the code left unused as a result of the above changes.

Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
increase of ~1.5% in performance. cpufreq_stats (time_in_state) shows
that middle frequencies are used more with this patch. Highest
and lowest frequencies were used less by ~9%.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 arch/x86/include/asm/processor.h   | 29 --
 drivers/cpufreq/Makefile   |  2 +-
 drivers/cpufreq/acpi-cpufreq.c |  5 
 drivers/cpufreq/cpufreq.c  | 21 
 drivers/cpufreq/cpufreq_governor.c | 10 +---
 drivers/cpufreq/cpufreq_governor.h |  1 -
 drivers/cpufreq/cpufreq_ondemand.c | 39 ++---
 drivers/cpufreq/mperf.c| 51 --
 drivers/cpufreq/mperf.h|  9 ---
 include/linux/cpufreq.h|  6 -
 10 files changed, 9 insertions(+), 164 deletions(-)
 delete mode 100644 drivers/cpufreq/mperf.c
 delete mode 100644 drivers/cpufreq/mperf.h

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 4b3..2874a3b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -941,35 +941,6 @@ extern int set_tsc_mode(unsigned int val);
 
 extern u16 amd_get_nb_id(int cpu);
 
-struct aperfmperf {
-	u64 aperf, mperf;
-};
-
-static inline void get_aperfmperf(struct aperfmperf *am)
-{
-	WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_APERFMPERF));
-
-	rdmsrl(MSR_IA32_APERF, am->aperf);
-	rdmsrl(MSR_IA32_MPERF, am->mperf);
-}
-
-#define APERFMPERF_SHIFT 10
-
-static inline
-unsigned long calc_aperfmperf_ratio(struct aperfmperf *old,
-				    struct aperfmperf *new)
-{
-	u64 aperf = new->aperf - old->aperf;
-	u64 mperf = new->mperf - old->mperf;
-	unsigned long ratio = aperf;
-
-	mperf >>= APERFMPERF_SHIFT;
-	if (mperf)
-		ratio = div64_u64(aperf, mperf);
-
-	return ratio;
-}
-
 extern unsigned long arch_align_stack(unsigned long sp);
 extern void free_init_pages(char *what, unsigned long begin, unsigned long end);
 
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 315b923..4519fb1 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -23,7 +23,7 @@ obj-$(CONFIG_GENERIC_CPUFREQ_CPU0)+= cpufreq-cpu0.o
 # powernow-k8 can load then. ACPI is preferred to all other hardware-specific drivers.
 # speedstep-* is preferred over p4-clockmod.
 
-obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o mperf.o
+obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o
 obj-$(CONFIG_X86_POWERNOW_K8)  += powernow-k8.o
 obj-$(CONFIG_X86_PCC_CPUFREQ)  += pcc-cpufreq.o
 obj-$(CONFIG_X86_POWERNOW_K6)  += powernow-k6.o
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 11b8b4b..0025cdd 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -45,7 +45,6 @@
 #include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
-#include "mperf.h"
 
 MODULE_AUTHOR("Paul Diefenbaugh, Dominik Brodowski");
 MODULE_DESCRIPTION("ACPI Processor P-States Driver");
@@ -842,10 +841,6 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
    /* notify BIOS that we exist */
    acpi_processor_notify_smm(THIS_MODULE);
 
-   /* Check for APERF/MPERF support in hardware */
-   if (boot_cpu_has(X86_FEATURE_APERFMPERF))
-       acpi_cpufreq_driver.getavg = cpufreq_get_measured_perf;
-
    pr_debug("CPU%u - ACPI performance management activated.\n", cpu);
    for (i = 0; i < perf->state_count; i++)
        pr_debug(" %cP%d: %d MHz, %d mW, %d uS\n",
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2d53f47..04ceddc 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1502,27 +1502,6 @@ no_policy:
 }
 EXPORT_SYMBOL_GPL(cpufreq_driver_target);
 
-int __cpufreq_driver_getavg(struct cpufreq_policy *policy

Re: [PATCH] cpufreq: ondemand: Change the calculation of target frequency

2013-05-31 Thread Stratos Karafotis

On 05/31/2013 11:51 AM, Viresh Kumar wrote:
 ---
   arch/x86/include/asm/processor.h   | 29 --
   drivers/cpufreq/Makefile   |  2 +-
   drivers/cpufreq/acpi-cpufreq.c |  5 
   drivers/cpufreq/cpufreq.c  | 21 
   drivers/cpufreq/cpufreq_governor.c | 10 +---
   drivers/cpufreq/cpufreq_governor.h |  1 -
   drivers/cpufreq/cpufreq_ondemand.c | 39 ++---
   drivers/cpufreq/mperf.c| 51 
 --
   drivers/cpufreq/mperf.h|  9 ---
   include/linux/cpufreq.h|  6 -
   10 files changed, 9 insertions(+), 164 deletions(-)
   delete mode 100644 drivers/cpufreq/mperf.c
   delete mode 100644 drivers/cpufreq/mperf.h
 
 I believe you should have removed other users of getavg() in a separate
 patch and also cc'd relevant people so that you can some review comments
 from  them.

I will split the patch in two. If it's OK, I will keep the removal of 
__cpufreq_driver_getavg in the original patch and move the clean up of
APERF/MPERF support in a second patch. I will also cc relevant people.


  /* Check for frequency increase */
 -   if (load_freq > od_tuners->up_threshold * policy->cur) {
 +   if (load > od_tuners->up_threshold) {
 
 Chances of this getting hit are minimal now.. I don't know if keeping
 this will change anything now :)

Actually, no. This is getting hit pretty often.
Please find attached the cpufreq statistics (trans_table) gathered during a
build of the 3.4 kernel. With the default up_threshold (95), the transition to
max happened many times because load was greater than up_threshold.
I also decided to keep this code so that up_threshold functionality stays unaffected.
 
On 05/31/2013 03:42 PM, Rafael J. Wysocki wrote:
 On Friday, May 31, 2013 02:24:59 PM Viresh Kumar wrote:
 +   } else {
 +   /* Calculate the next frequency proportional to load */
  unsigned int freq_next;
 -   freq_next = load_freq / od_tuners->adj_up_threshold;
 +   freq_next = load * policy->max / 100;

 Rafael asked why you believe this is the right formula and I really couldn't
 find an appropriate answer to that, sorry :(
 
 Right, it would be good to explain that.
 
 Proportional to load means C * load, so why is policy->max / 100 *the*
 right C?
 

I think I finally(?) see your point. The right C should be
policy->cpuinfo.max_freq / 100.
This way the target frequency will be proportional to load, and the
calculation will map the load onto the CPU freq table.
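
As a rough illustration of the mapping described above, the sketch below computes a target proportional to load and then picks an entry from a frequency table. The helper names (od_proportional_target, od_pick_from_table), the plain ascending array standing in for the freq table, and the choice of the lowest entry at or above the target are assumptions made for illustration; they are not taken from the patch.

```c
#include <stddef.h>

/*
 * Hypothetical helper: target frequency in kHz, proportional to load.
 * load is a percentage (0..100); max_freq_khz stands in for
 * policy->cpuinfo.max_freq.
 */
unsigned int od_proportional_target(unsigned int load,
				    unsigned int max_freq_khz)
{
	return load * max_freq_khz / 100;
}

/*
 * Hypothetical helper: map a target onto a table sorted in ascending
 * order by returning the lowest entry at or above the target. If the
 * target is above every entry, the highest entry is returned; an empty
 * table gives 0.
 */
unsigned int od_pick_from_table(const unsigned int *table, size_t n,
				unsigned int target_khz)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (table[i] >= target_khz)
			return table[i];

	return n ? table[n - 1] : 0;
}
```

With a maximum of 3400000 kHz, a load of 50 gives a target of 1700000 kHz, which the second helper then rounds up to the nearest table entry.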

I will update the patch according to your observations and suggestions.

Thanks,
Stratos
   From  :    To
         :  3401000   340   330   310   300   290   280   260   250   240   220   210   200   190   170   160
  3401000:        0     0     4     2     4     2     3     0     2     1     1     1     1     4     0    29
      340:        0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
      330:        4     0     0     0     1     0     0     0     0     0     0     1     0     0     0     7
      310:        2     0     0     0     1     0     0     0     0     0     0     0     0     0     0     0
      300:        4     0     0     0     0     0     0     0     0     1     0     0     0     0     0     4
      290:        1     0     0     0     1     0     0     0     0     0     0     0     0     0     0     7
      280:        3     0     0     0     0     1     0     0     0     1     0     0     0     0     0     3
      260:        0     0     0     0     0     1     0     0     0     1     0     0     0     0     0     4
      250:        0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     4
      240:        3     0     0     0     0     0     1     0     0     0     0     0     0     0     0     7
      220:        1     0     0     0     0     0     0     0     0     1     0     0     0     0     0     3
      210:        1     0     0     0     0     0     0     1     0     0     0     0     0     1     0     4
      200:        1     0     0     0

Re: [PATCH] cpufreq: ondemand: Change the calculation of target frequency

2013-06-01 Thread Stratos Karafotis

On 06/01/2013 03:27 PM, Rafael J. Wysocki wrote:
 On Friday, May 31, 2013 07:33:06 PM Stratos Karafotis wrote:
 On 05/31/2013 11:51 AM, Viresh Kumar wrote:
 ---
arch/x86/include/asm/processor.h   | 29 --
drivers/cpufreq/Makefile   |  2 +-
drivers/cpufreq/acpi-cpufreq.c |  5 
drivers/cpufreq/cpufreq.c  | 21 
drivers/cpufreq/cpufreq_governor.c | 10 +---
drivers/cpufreq/cpufreq_governor.h |  1 -
drivers/cpufreq/cpufreq_ondemand.c | 39 ++---
drivers/cpufreq/mperf.c| 51 
 --
drivers/cpufreq/mperf.h|  9 ---
include/linux/cpufreq.h|  6 -
10 files changed, 9 insertions(+), 164 deletions(-)
delete mode 100644 drivers/cpufreq/mperf.c
delete mode 100644 drivers/cpufreq/mperf.h

 I believe you should have removed other users of getavg() in a separate
 patch and also cc'd relevant people so that you can some review comments
 from  them.

 I will split the patch in two. If it's OK, I will keep the removal of
 __cpufreq_driver_getavg in the original patch and move the clean up of
 APERF/MPERF support in a second patch. I will also cc relevant people.


   /* Check for frequency increase */
 -   if (load_freq > od_tuners->up_threshold * policy->cur) {
 +   if (load > od_tuners->up_threshold) {

 Chances of this getting hit are minimal now.. I don't know if keeping
 this will change anything now :)

 Actually, no. This getting hit pretty often.
 Please find attached the cpufreq statistics - trans_table during build
 of 3.4 kernel. With default up_threshold (95), the transition to max
 happened many times because of load was greater than up_threshold.
 I also thought to keep this code to leave up_threshold functionality 
 unaffected.
   
 On 05/31/2013 03:42 PM, Rafael J. Wysocki wrote:
 On Friday, May 31, 2013 02:24:59 PM Viresh Kumar wrote:
 +   } else {
 +   /* Calculate the next frequency proportional to load */
   unsigned int freq_next;
 -   freq_next = load_freq / od_tuners->adj_up_threshold;
 +   freq_next = load * policy->max / 100;

 Rafael asked why you believe this is the right formula and I really 
 couldn't
 find an appropriate answer to that, sorry :(

 Right, it would be good to explain that.

 Proportional to load means C * load, so why is policy->max / 100 *the*
 right C?


 I think, finally(?) I see your point. The right C should be
 policy->cpuinfo.max_freq / 100.
 This way the target frequency will be proportional to load and the
 calculation will map the load to CPU freq table.
 
 That seems to mean "take the percentage of policy->cpuinfo.max_freq
 proportional to the current load and use the available frequency closest
 to that".  Is that correct?
 
 Rafael
 
 

In my opinion, yes. Yesterday, after your question, I thought of normalizing
load to the range policy->min - policy->max. But I think taking the percentage
of cpuinfo.max, as you said, is the more correct approach.
Actually, I ran my tests with the percentage of policy->max, which was equal to
cpuinfo.max.

Unless I'm missing something here. :)
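
To make the two alternatives mentioned above concrete, here is a minimal sketch of both mappings. The function names are hypothetical and the frequencies are plain kHz values passed in by the caller; nothing here is taken from the patch itself.

```c
/*
 * Hypothetical helper: target as a percentage of the hardware maximum
 * (cpuinfo.max). load is a percentage in 0..100.
 */
unsigned int target_pct_of_max(unsigned int load,
			       unsigned int cpuinfo_max_khz)
{
	return load * cpuinfo_max_khz / 100;
}

/*
 * Hypothetical helper: load normalized onto the policy->min .. policy->max
 * range. Assumes policy_min_khz <= policy_max_khz.
 */
unsigned int target_normalized(unsigned int load,
			       unsigned int policy_min_khz,
			       unsigned int policy_max_khz)
{
	return policy_min_khz +
	       load * (policy_max_khz - policy_min_khz) / 100;
}
```

For a 1600000..3400000 kHz range and a load of 50, the first mapping gives 1700000 kHz while the normalized one gives 2500000 kHz, so the normalized variant never asks for less than the policy minimum but is no longer a pure C * load.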

Thanks,
Stratos

Re: [PATCH] cpufreq: ondemand: Change the calculation of target frequency

2013-06-01 Thread Stratos Karafotis

On 06/01/2013 05:56 PM, Viresh Kumar wrote:
 On 31 May 2013 22:03, Stratos Karafotis strat...@semaphore.gr wrote:
 On 05/31/2013 11:51 AM, Viresh Kumar wrote:
 I believe you should have removed other users of getavg() in a separate
 patch and also cc'd relevant people so that you can some review comments
 from  them.

 I will split the patch in two. If it's OK, I will keep the removal of
 __cpufreq_driver_getavg in the original patch and move the clean up of
 APERF/MPERF support in a second patch. I will also cc relevant people.
 
 Even removal of __cpufreq_driver_getavg() should be done in a separate
 patch, so that it can be reverted easily if required later.

Thanks, Viresh. I will do the removal of that function in a separate patch.
Should I use a third patch for it? Or should I include it in the patch that
removes APERF/MPERF support?

 Proportional to load means C * load, so why is policy->max / 100 *the*
 right C?

 I think, finally(?) I see your point. The right C should be
 policy->cpuinfo.max_freq / 100.
 
 Why are you changing it to cpuinfo.max_freq?? This is fixed once a driver is
 initialized.. but user may request a lower max freq for a governor or policy.
 Which is actually reflected in policy->max I believe.

My initial thought was to have a linear function that calculates the target
freq proportional to load (I will use 'C' as the function's slope, as Rafael
did):

freq_target = C * load

For simplicity, let's assume that load is between 0 and 1, as it is initially
calculated in the governor.
Ideally, for load = 0 we should have freq_target = 0, and for load = 1,
freq_target = cpuinfo.max.

So, the slope C = cpuinfo.max.

I think it's a matter of definition what policy->min and policy->max can do.
Should they change the slope C? Or only limit freq_target?
I don't think that policy->max (or min) should affect HOW the governor
calculates freq_target (the slope C); it should only limit the calculated result.

Maybe we could have separate tunables to affect the slope C.

If I'm wrong about the definition of policy->min and policy->max, I will change
the code accordingly.
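
The difference between "policy->max changes the slope" and "policy->max only limits the result" can be sketched as follows. The struct and function names are hypothetical, load is taken as a percentage (0..100) rather than 0..1, and the clamping helper is written out by hand so the example stays self-contained.

```c
/* Hypothetical container for the limits discussed here, all in kHz. */
struct od_limits {
	unsigned int cpuinfo_max;
	unsigned int policy_min;
	unsigned int policy_max;
};

/* Clamp v into [lo, hi]; assumes lo <= hi. */
static unsigned int od_clamp(unsigned int v, unsigned int lo, unsigned int hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Slope comes from cpuinfo_max; the policy limits only cap the result. */
unsigned int od_target_capped(unsigned int load, const struct od_limits *l)
{
	return od_clamp(load * l->cpuinfo_max / 100,
			l->policy_min, l->policy_max);
}

/* Slope comes from policy_max: lowering policy_max re-scales every load. */
unsigned int od_target_rescaled(unsigned int load, const struct od_limits *l)
{
	return od_clamp(load * l->policy_max / 100,
			l->policy_min, l->policy_max);
}
```

With cpuinfo_max = 1000000, policy_min = 100000 and policy_max = 600000, a load of 50 yields 500000 kHz in the capped variant and 300000 kHz in the re-scaled one, so only the re-scaled variant lowers the frequency chosen for loads that never needed the top frequencies.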


 Over that why keeping following check is useful anymore?
 
 if (load_freq > od_tuners->up_threshold)
  goto max.
 
 As, if load is over 95, then even policy->max * 95 / 100 will even give almost
 the same freq.
 

I thought that too. But maybe the user selects a lower value for up_threshold
(for example, up_threshold = 60). In my opinion, we have to keep up_threshold,
to give the user the possibility of reaching max freq with small loads.
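
A minimal sketch of why the check still matters with a lowered up_threshold: above the threshold the governor jumps straight to the maximum, below it the proportional formula applies. The function name and parameters are hypothetical, and capping the proportional result at policy max is an assumption of this sketch.

```c
/*
 * Hypothetical decision helper. load and up_threshold are percentages
 * (0..100); frequencies are in kHz.
 */
unsigned int od_next_target(unsigned int load, unsigned int up_threshold,
			    unsigned int policy_max_khz,
			    unsigned int cpuinfo_max_khz)
{
	unsigned int target;

	/* Above the threshold: go directly to the highest allowed frequency. */
	if (load > up_threshold)
		return policy_max_khz;

	/* Otherwise: proportional to load, never above the policy maximum. */
	target = load * cpuinfo_max_khz / 100;
	return target < policy_max_khz ? target : policy_max_khz;
}
```

With up_threshold = 60 and a 1000000 kHz maximum, a load of 70 already selects 1000000 kHz, whereas with up_threshold = 95 the same load selects 700000 kHz.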


Thanks for your comments!
Stratos 

Re: [PATCH] cpufreq: ondemand: Change the calculation of target frequency

On 06/03/2013 02:24 PM, Viresh Kumar wrote:
On 3 June 2013 16:27, Rafael J. Wysocki r...@sisk.pl wrote:
The question is if we want policy->max to re-scale them effectively (i.e. to
change weights so that the maximum load maps to the highest frequency available
at the moment) or if we want policy->max to work as a cap (i.e. to map all
loads above certain value to the maximum frequency available at the moment, so
that the criteria for selecting the lower frequencies don't change). In my
opinion the second option is better, because it means "OK, we can't use some
high frequencies, but let's not hurt performance for the loads that wouldn't
require them anyway". Otherwise, we'll effectively throttle all loads and
that not only causes performance to drop, but also causes more energy to be
used overall.

I wouldn't say that I don't agree with you as I do to some extent.

But the question that comes to my mind now is: Why is policy->max reduced
in the first place? User doesn't have control over which freqs to expose, so
available_frequencies will stay the same. The only thing he is capable
of doing is: reduce policy->max.. Which in a way means that.. I don't want to
use higher frequencies (power, thermal reasons) and I know performance will
go down with it and I don't care for it now.

And this way I feel policy->max isn't a bad choice either.

BUT as you said: policy->max isn't there to say don't be so aggressive now in
increasing frequencies but just only to say, don't go over this frequency..

So, probably we can use cpuinfo.max_freq :)

Both of you know better than me, but I also believe that cpuinfo.max is more
appropriate. Although the decision has been taken, let me share a spreadsheet
that shows the 2 cases.
https://docs.google.com/spreadsheet/ccc?key=0AnMfNYUV1k0ddHh1OUFXa0kxcGZJaXd4am1sdmVWT0Eusp=sharing

I will prepare v2 of this patch, which uses cpuinfo.max_freq instead of
policy->max.
Also, I will split the patch into 3 parts as suggested.

Thank you for your comments and suggestions!
Stratos

[PATCH v2 1/3] cpufreq: ondemand: Change the calculation of target frequency

Ondemand calculates load in terms of frequency and increases it only
if the load_freq is greater than up_threshold multiplied by current
or average frequency. This seems to produce oscillations of frequency
between min and max because, for example, a relatively small load can
easily saturate minimum frequency and lead the CPU to max. Then, the
CPU will decrease back to min due to a small load_freq.

This patch changes the calculation method of load and target frequency
considering 2 points:
- Load computation should be independent from current or average
measured frequency. For example an absolute load 80% at 100MHz is not
necessarily equivalent to 8% at 1000MHz in the next sampling interval.
- Target frequency should be increased to any value of frequency table
proportional to absolute load, instead of only to the max. Thus:

Target frequency = C * load

where C = policy->cpuinfo.max_freq / 100

Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
increase of ~1.5% in performance. cpufreq_stats (time_in_state) shows
that middle frequencies are used more with this patch. Highest
and lowest frequencies were used less by ~9%.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/cpufreq_governor.c | 10 +-
 drivers/cpufreq/cpufreq_governor.h |  1 -
 drivers/cpufreq/cpufreq_ondemand.c | 39 +++---
 3 files changed, 8 insertions(+), 42 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 7532570..a2a56c4 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -53,7 +53,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 
    policy = cdbs->cur_policy;
 
-   /* Get Absolute Load (in terms of freq for ondemand gov) */
+   /* Get Absolute Load */
    for_each_cpu(j, policy->cpus) {
        struct cpu_dbs_common_info *j_cdbs;
        u64 cur_wall_time, cur_idle_time;
@@ -104,14 +104,6 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 
        load = 100 * (wall_time - idle_time) / wall_time;
 
-       if (dbs_data->cdata->governor == GOV_ONDEMAND) {
-           int freq_avg = __cpufreq_driver_getavg(policy, j);
-           if (freq_avg <= 0)
-               freq_avg = policy->cur;
-
-           load *= freq_avg;
-       }
-
        if (load > max_load)
            max_load = load;
    }
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index e7bbf76..c305cad 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -169,7 +169,6 @@ struct od_dbs_tuners {
unsigned int sampling_rate;
unsigned int sampling_down_factor;
unsigned int up_threshold;
-   unsigned int adj_up_threshold;
unsigned int powersave_bias;
unsigned int io_is_busy;
 };
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 4b9bb5d..62e67a9 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -29,11 +29,9 @@
 #include "cpufreq_governor.h"
 
 /* On-demand governor macros */
-#define DEF_FREQUENCY_DOWN_DIFFERENTIAL        (10)
 #define DEF_FREQUENCY_UP_THRESHOLD             (80)
 #define DEF_SAMPLING_DOWN_FACTOR               (1)
 #define MAX_SAMPLING_DOWN_FACTOR               (100000)
-#define MICRO_FREQUENCY_DOWN_DIFFERENTIAL      (3)
 #define MICRO_FREQUENCY_UP_THRESHOLD           (95)
 #define MICRO_FREQUENCY_MIN_SAMPLE_RATE        (10000)
 #define MIN_FREQUENCY_UP_THRESHOLD             (11)
@@ -159,14 +157,10 @@ static void dbs_freq_increase(struct cpufreq_policy *p, unsigned int freq)
 
 /*
  * Every sampling_rate, we check, if current idle time is less than 20%
- * (default), then we try to increase frequency. Every sampling_rate, we look
- * for the lowest frequency which can sustain the load while keeping idle time
- * over 30%. If such a frequency exist, we try to decrease to this frequency.
- *
- * Any frequency increase takes it to the maximum frequency. Frequency reduction
- * happens at minimum steps of 5% (default) of current frequency
+ * (default), then we try to increase frequency. Else, we adjust the frequency
+ * proportional to load.
  */
-static void od_check_cpu(int cpu, unsigned int load_freq)
+static void od_check_cpu(int cpu, unsigned int load)
 {
    struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
    struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
@@ -176,29 +170,17 @@ static void od_check_cpu(int cpu, unsigned int load_freq)
    dbs_info->freq_lo = 0;
 
    /* Check for frequency increase */
-   if (load_freq > od_tuners->up_threshold * policy->cur) {
+   if (load > od_tuners->up_threshold

[PATCH v2 0/3] cpufreq: ondemand: Change the calculation of target frequency

Changes since v1:
- Use policy->cpuinfo.max_freq in the calculation formula
  of target frequency instead of policy->max
- Split the patch into 3 parts

Stratos Karafotis (3):
  cpufreq: ondemand: Change the calculation of target frequency
  cpufreq: Remove unused function __cpufreq_driver_getavg
  cpufreq: Remove unused APERF/MPERF support

 arch/x86/include/asm/processor.h   | 29 --
 drivers/cpufreq/Makefile   |  2 +-
 drivers/cpufreq/acpi-cpufreq.c |  5 
 drivers/cpufreq/cpufreq.c  | 12 -
 drivers/cpufreq/cpufreq_governor.c | 10 +---
 drivers/cpufreq/cpufreq_governor.h |  1 -
 drivers/cpufreq/cpufreq_ondemand.c | 39 ++---
 drivers/cpufreq/mperf.c| 51 --
 drivers/cpufreq/mperf.h|  9 ---
 include/linux/cpufreq.h|  6 -
 10 files changed, 9 insertions(+), 155 deletions(-)
 delete mode 100644 drivers/cpufreq/mperf.c
 delete mode 100644 drivers/cpufreq/mperf.h

[PATCH v2 2/3] cpufreq: Remove unused function __cpufreq_driver_getavg

Calculation of frequency target in ondemand governor changed and it is
independent from measured average frequency.

Remove the unused __cpufreq_driver_getavg function and the getavg member from
the cpufreq_driver struct. Also, remove the getavg callback in
acpi_cpufreq_driver.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/Makefile   |  2 +-
 drivers/cpufreq/acpi-cpufreq.c |  5 -
 drivers/cpufreq/cpufreq.c  | 12 
 include/linux/cpufreq.h|  6 --
 4 files changed, 1 insertion(+), 24 deletions(-)

diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 6ad0b91..aebd4ef 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -23,7 +23,7 @@ obj-$(CONFIG_GENERIC_CPUFREQ_CPU0)+= cpufreq-cpu0.o
 # powernow-k8 can load then. ACPI is preferred to all other hardware-specific drivers.
 # speedstep-* is preferred over p4-clockmod.
 
-obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o mperf.o
+obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o
 obj-$(CONFIG_X86_POWERNOW_K8)  += powernow-k8.o
 obj-$(CONFIG_X86_PCC_CPUFREQ)  += pcc-cpufreq.o
 obj-$(CONFIG_X86_POWERNOW_K6)  += powernow-k6.o
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 17e3496..d3a5f35 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -45,7 +45,6 @@
 #include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
-#include "mperf.h"
 
 MODULE_AUTHOR("Paul Diefenbaugh, Dominik Brodowski");
 MODULE_DESCRIPTION("ACPI Processor P-States Driver");
@@ -842,10 +841,6 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
    /* notify BIOS that we exist */
    acpi_processor_notify_smm(THIS_MODULE);
 
-   /* Check for APERF/MPERF support in hardware */
-   if (boot_cpu_has(X86_FEATURE_APERFMPERF))
-       acpi_cpufreq_driver.getavg = cpufreq_get_measured_perf;
-
    pr_debug("CPU%u - ACPI performance management activated.\n", cpu);
    for (i = 0; i < perf->state_count; i++)
        pr_debug(" %cP%d: %d MHz, %d mW, %d uS\n",
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index f8c2860..a61aacb 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1584,18 +1584,6 @@ fail:
 }
 EXPORT_SYMBOL_GPL(cpufreq_driver_target);
 
-int __cpufreq_driver_getavg(struct cpufreq_policy *policy, unsigned int cpu)
-{
-   if (cpufreq_disabled())
-       return 0;
-
-   if (!cpufreq_driver->getavg)
-       return 0;
-
-   return cpufreq_driver->getavg(policy, cpu);
-}
-EXPORT_SYMBOL_GPL(__cpufreq_driver_getavg);
-
 /*
  * when event is CPUFREQ_GOV_LIMITS
  */
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index d939056..50f19ad 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -215,10 +215,6 @@ extern int __cpufreq_driver_target(struct cpufreq_policy *policy,
                                    unsigned int target_freq,
                                    unsigned int relation);
 
-
-extern int __cpufreq_driver_getavg(struct cpufreq_policy *policy,
-  unsigned int cpu);
-
 int cpufreq_register_governor(struct cpufreq_governor *governor);
 void cpufreq_unregister_governor(struct cpufreq_governor *governor);
 
@@ -258,8 +254,6 @@ struct cpufreq_driver {
unsigned int(*get)  (unsigned int cpu);
 
/* optional */
-   unsigned int (*getavg)  (struct cpufreq_policy *policy,
-unsigned int cpu);
int (*bios_limit)   (int cpu, unsigned int *limit);
 
int (*exit) (struct cpufreq_policy *policy);
-- 
1.8.1.4


[PATCH v3 3/3] cpufreq: Remove unused APERF/MPERF support

Calculation of frequency target in ondemand governor changed and it is
independent from measured average frequency.

Remove unused APERF/MPERF support.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 arch/x86/include/asm/processor.h | 29 ---
 drivers/cpufreq/mperf.c  | 51 
 drivers/cpufreq/mperf.h  |  9 ---
 3 files changed, 89 deletions(-)
 delete mode 100644 drivers/cpufreq/mperf.c
 delete mode 100644 drivers/cpufreq/mperf.h

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 4b3..2874a3b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -941,35 +941,6 @@ extern int set_tsc_mode(unsigned int val);
 
 extern u16 amd_get_nb_id(int cpu);
 
-struct aperfmperf {
-   u64 aperf, mperf;
-};
-
-static inline void get_aperfmperf(struct aperfmperf *am)
-{
-   WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_APERFMPERF));
-
-   rdmsrl(MSR_IA32_APERF, am->aperf);
-   rdmsrl(MSR_IA32_MPERF, am->mperf);
-}
-
-#define APERFMPERF_SHIFT 10
-
-static inline
-unsigned long calc_aperfmperf_ratio(struct aperfmperf *old,
-                                    struct aperfmperf *new)
-{
-   u64 aperf = new->aperf - old->aperf;
-   u64 mperf = new->mperf - old->mperf;
-   unsigned long ratio = aperf;
-
-   mperf >>= APERFMPERF_SHIFT;
-   if (mperf)
-       ratio = div64_u64(aperf, mperf);
-
-   return ratio;
-}
-
 extern unsigned long arch_align_stack(unsigned long sp);
 extern void free_init_pages(char *what, unsigned long begin, unsigned long end);
 
diff --git a/drivers/cpufreq/mperf.c b/drivers/cpufreq/mperf.c
deleted file mode 100644
index 911e193..000
--- a/drivers/cpufreq/mperf.c
+++ /dev/null
@@ -1,51 +0,0 @@
-#include <linux/kernel.h>
-#include <linux/smp.h>
-#include <linux/module.h>
-#include <linux/init.h>
-#include <linux/cpufreq.h>
-#include <linux/slab.h>
-
-#include "mperf.h"
-
-static DEFINE_PER_CPU(struct aperfmperf, acfreq_old_perf);
-
-/* Called via smp_call_function_single(), on the target CPU */
-static void read_measured_perf_ctrs(void *_cur)
-{
-   struct aperfmperf *am = _cur;
-
-   get_aperfmperf(am);
-}
-
-/*
- * Return the measured active (C0) frequency on this CPU since last call
- * to this function.
- * Input: cpu number
- * Return: Average CPU frequency in terms of max frequency (zero on error)
- *
- * We use IA32_MPERF and IA32_APERF MSRs to get the measured performance
- * over a period of time, while CPU is in C0 state.
- * IA32_MPERF counts at the rate of max advertised frequency
- * IA32_APERF counts at the rate of actual CPU frequency
- * Only IA32_APERF/IA32_MPERF ratio is architecturally defined and
- * no meaning should be associated with absolute values of these MSRs.
- */
-unsigned int cpufreq_get_measured_perf(struct cpufreq_policy *policy,
-                                       unsigned int cpu)
-{
-   struct aperfmperf perf;
-   unsigned long ratio;
-   unsigned int retval;
-
-   if (smp_call_function_single(cpu, read_measured_perf_ctrs, &perf, 1))
-       return 0;
-
-   ratio = calc_aperfmperf_ratio(&per_cpu(acfreq_old_perf, cpu), &perf);
-   per_cpu(acfreq_old_perf, cpu) = perf;
-
-   retval = (policy->cpuinfo.max_freq * ratio) >> APERFMPERF_SHIFT;
-
-   return retval;
-}
-EXPORT_SYMBOL_GPL(cpufreq_get_measured_perf);
-MODULE_LICENSE("GPL");
diff --git a/drivers/cpufreq/mperf.h b/drivers/cpufreq/mperf.h
deleted file mode 100644
index 5dbf295..000
--- a/drivers/cpufreq/mperf.h
+++ /dev/null
@@ -1,9 +0,0 @@
-/*
- *  (c) 2010 Advanced Micro Devices, Inc.
- *  Your use of this code is subject to the terms and conditions of the
- *  GNU general public license version 2. See COPYING or
- *  http://www.gnu.org/licenses/gpl.html
- */
-
-unsigned int cpufreq_get_measured_perf(struct cpufreq_policy *policy,
-   unsigned int cpu);
-- 
1.8.1.4


Re: [PATCH v3 linux-next] cpufreq: ondemand: Calculate gradient of CPU load to early increase frequency

2013-04-26 Thread Stratos Karafotis


On 04/09/2013 07:56 PM, Stratos Karafotis wrote:

On 04/05/2013 10:50 PM, Stratos Karafotis wrote:

Hi Viresh,

On 04/04/2013 07:54 AM, Viresh Kumar wrote:

Hi Stratos,

Yes, your results show some improvements. BUT if performance is the
only thing
we were looking for, then we will never use ondemand governor but
performance
governor.

I suspect this little increase in performance must have increased
power numbers
too (significantly). So, if you can get numbers in the form of
power/performance
with and without your patch, it will be great.

--
viresh



I ran some more tests. I increased the number of iterations to 100
(from 20).
I also tested for counter 1,000,000 (~4200us), 5,000,000 (~1us),
15,000,000 (~3us).

This time, I also extracted statistics from the cpufreq_stats driver. I
think this will be an indication of power consumption. Below are the
results, and attached is the program I used to get these numbers.


Any comments would be appreciated.

Thanks,
Stratos




Ping.

Thanks,
Stratos

[RFC PATCH] cpufreq: ondemand: Increase frequency to any value proportional to load

2013-05-27 Thread Stratos Karafotis


Ondemand increases frequency only if the load_freq is greater than
up_threshold. This seems to produce oscillations of frequency between
min and max because a relatively small load can easily saturate minimum
frequency and lead the CPU to max. Then, the CPU will decrease back to
min due to a small load_freq.

With this patch the frequency can be increased to any value,
proportional to load, if the load is below up_threshold. Thus, middle
frequencies are used more. Absolute load is used for the calculation of
frequency.

Phoronix benchmark results of Linux Kernel Compilation 3.1 tests are
attached with and without this patch. cpufreq_stats (time_in_state) are
also included. Tested on Intel i7-3770 CPU @ 3.40GHz and on
Quad core 1500 MHz Krait.


Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
drivers/cpufreq/cpufreq_governor.c | 10 +-
drivers/cpufreq/cpufreq_governor.h |  1 -
drivers/cpufreq/cpufreq_ondemand.c | 39 --
3 files changed, 9 insertions(+), 41 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 5af40ad..eeab30a 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -97,7 +97,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 
    policy = cdbs->cur_policy;
 
-   /* Get Absolute Load (in terms of freq for ondemand gov) */
+   /* Get Absolute Load */
    for_each_cpu(j, policy->cpus) {
        struct cpu_dbs_common_info *j_cdbs;
        u64 cur_wall_time, cur_idle_time;
@@ -148,14 +148,6 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 
        load = 100 * (wall_time - idle_time) / wall_time;
 
-       if (dbs_data->cdata->governor == GOV_ONDEMAND) {
-           int freq_avg = __cpufreq_driver_getavg(policy, j);
-           if (freq_avg <= 0)
-               freq_avg = policy->cur;
-
-           load *= freq_avg;
-       }
-
        if (load > max_load)
            max_load = load;
    }
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index e16a961..e899c11 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -169,7 +169,6 @@ struct od_dbs_tuners {
unsigned int sampling_rate;
unsigned int sampling_down_factor;
unsigned int up_threshold;
-   unsigned int adj_up_threshold;
unsigned int powersave_bias;
unsigned int io_is_busy;
};
diff --git a/drivers/cpufreq/cpufreq_ondemand.c 
b/drivers/cpufreq/cpufreq_ondemand.c
index 4b9bb5d..bf2ae64 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -29,11 +29,9 @@
#include cpufreq_governor.h

/* On-demand governor macros */
-#define DEF_FREQUENCY_DOWN_DIFFERENTIAL(10)
#define DEF_FREQUENCY_UP_THRESHOLD  (80)
#define DEF_SAMPLING_DOWN_FACTOR(1)
#define MAX_SAMPLING_DOWN_FACTOR(10)
-#define MICRO_FREQUENCY_DOWN_DIFFERENTIAL  (3)
#define MICRO_FREQUENCY_UP_THRESHOLD(95)
#define MICRO_FREQUENCY_MIN_SAMPLE_RATE (1)
#define MIN_FREQUENCY_UP_THRESHOLD  (11)
@@ -159,14 +157,12 @@ static void dbs_freq_increase(struct cpufreq_policy *p, 
unsigned int freq)

/*
 * Every sampling_rate, we check, if current idle time is less than 20%
- * (default), then we try to increase frequency. Every sampling_rate, we look
- * for the lowest frequency which can sustain the load while keeping idle time
- * over 30%. If such a frequency exist, we try to decrease to this frequency.
+ * (default), then we try to increase frequency. Else, we adjust the frequency
+ * proportional to load.
 *
- * Any frequency increase takes it to the maximum frequency. Frequency 
reduction
- * happens at minimum steps of 5% (default) of current frequency
+ * Any frequency increase takes it to the maximum frequency.
 */
-static void od_check_cpu(int cpu, unsigned int load_freq)
+static void od_check_cpu(int cpu, unsigned int load)
{
struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
@@ -176,29 +172,17 @@ static void od_check_cpu(int cpu, unsigned int load_freq)
dbs_info->freq_lo = 0;

/* Check for frequency increase */
-   if (load_freq > od_tuners->up_threshold * policy->cur) {
+   if (load > od_tuners->up_threshold) {
/* If switching to max speed, apply sampling_down_factor */
if (policy->cur < policy->max)
dbs_info->rate_mult =
od_tuners->sampling_down_factor;
dbs_freq_increase(policy, policy-max);
return;
-   }
-
-   /* Check for frequency decrease */
-   /* if we cannot reduce the frequency anymore, break out early

Re: [RFC PATCH] cpufreq: ondemand: Increase frequency to any value proportional to load

2013-05-28 Thread Stratos Karafotis

Hi Rafael,

Thank you for your prompt reply and your comments!

On 05/28/2013 02:43 PM, Rafael J. Wysocki wrote:
>> With this patch the frequency can be increased to any value,
>
> What exactly does "any value" mean here?
>

I mean any value in the frequency table. Please let me know if you want me to
rephrase it in the description.

> Can you please comment the results in the changelog?  Attachments don't
> show up in git logs. :-)

I'm sorry, you are right. I added comments in the patch description.

>
> Can you please explain why this is the right formula?
>


Without this patch, we compare load_freq with up_threshold * policy->cur to
decide whether to go to the max frequency, where:
load_freq = load * freq_avg

Most cpufreq drivers do not implement the getavg function (I found it
implemented only in acpi-cpufreq). Therefore:
freq_avg = policy->cur

Thus, when getavg is not implemented, the comparison that decides about a
frequency increase is effectively:
if (load > up_threshold)
increase to max

So, after the patch we keep the same comparison to decide about the max
frequency.
Below up_threshold, I thought it 'fair' to choose the next frequency with a
formula that makes the frequency proportional to load.
For example, on a CPU with min freq 100MHz and max 1000MHz, a load of 50
should give a target frequency of 500MHz.
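
The decision described above can be sketched in a few lines of C. This is only
an illustration, not the code from the patch: next_freq() and its parameters
are hypothetical names, frequencies are taken to be in kHz, and the explicit
clamp to min_freq is an assumption made here so that the sketch stands alone.

```c
#include <assert.h>

/*
 * Hypothetical helper (not from the patch): map an absolute load in
 * percent to a target frequency in kHz.
 */
static unsigned int next_freq(unsigned int load, unsigned int up_threshold,
			      unsigned int min_freq, unsigned int max_freq)
{
	unsigned int freq;

	assert(load <= 100);	/* load is a percentage */

	/* Above the threshold: go straight to max, as before the patch. */
	if (load > up_threshold)
		return max_freq;

	/* Otherwise: frequency proportional to load. */
	freq = load * max_freq / 100;

	/* Assumption for this sketch: never go below the minimum. */
	return freq < min_freq ? min_freq : freq;
}
```

With min 100 MHz, max 1000 MHz and up_threshold 80, a load of 50 gives
500 MHz and a load of 90 gives 1000 MHz.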


Thanks,
Stratos

---8-
Ondemand increases frequency only if the load_freq is greater than
up_threshold. This seems to produce oscillations of frequency between
min and max because a relatively small load can easily saturate minimum
frequency and lead the CPU to max. Then, the CPU will decrease back to
min due to a small load_freq.

With this patch the frequency can be increased to any value,
proportional to load, if the load is below up_threshold. Thus, middle
frequencies are used more. Absolute load is used for the calculation of
frequency.

Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
Phoronix benchmark of Linux Kernel Compilation 3.1 test shows a
1.5% increase in performance. cpufreq_stats (time_in_state) shows
that middle frequencies are used more with this patch. Highest
and lowest frequencies were used less by ~9%.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/cpufreq_governor.c | 10 +-
 drivers/cpufreq/cpufreq_governor.h |  1 -
 drivers/cpufreq/cpufreq_ondemand.c | 39 +++---
 3 files changed, 8 insertions(+), 42 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 5af40ad..eeab30a 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -97,7 +97,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 
policy = cdbs->cur_policy;
 
-   /* Get Absolute Load (in terms of freq for ondemand gov) */
+   /* Get Absolute Load */
for_each_cpu(j, policy->cpus) {
struct cpu_dbs_common_info *j_cdbs;
u64 cur_wall_time, cur_idle_time;
@@ -148,14 +148,6 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 
load = 100 * (wall_time - idle_time) / wall_time;
 
-   if (dbs_data->cdata->governor == GOV_ONDEMAND) {
-   int freq_avg = __cpufreq_driver_getavg(policy, j);
-   if (freq_avg <= 0)
-   freq_avg = policy->cur;
-
-   load *= freq_avg;
-   }
-
if (load > max_load)
max_load = load;
}
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index e16a961..e899c11 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -169,7 +169,6 @@ struct od_dbs_tuners {
unsigned int sampling_rate;
unsigned int sampling_down_factor;
unsigned int up_threshold;
-   unsigned int adj_up_threshold;
unsigned int powersave_bias;
unsigned int io_is_busy;
 };
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 4b9bb5d..c1e6d3e 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -29,11 +29,9 @@
 #include "cpufreq_governor.h"
 
 /* On-demand governor macros */
-#define DEF_FREQUENCY_DOWN_DIFFERENTIAL (10)
 #define DEF_FREQUENCY_UP_THRESHOLD (80)
 #define DEF_SAMPLING_DOWN_FACTOR   (1)
 #define MAX_SAMPLING_DOWN_FACTOR   (100000)
-#define MICRO_FREQUENCY_DOWN_DIFFERENTIAL  (3)
 #define MICRO_FREQUENCY_UP_THRESHOLD   (95)
 #define MICRO_FREQUENCY_MIN_SAMPLE_RATE (10000)
 #define MIN_FREQUENCY_UP_THRESHOLD (11)
@@ -159,14 +157,10 @@ static void dbs_freq_increase(struct cpufreq_policy *p, unsigned int freq)
 
 /*
  * Every sampling_rate, we check, if current idle time is less

[PATCH] cpufreq: ondemand: Remove redundant return statement

2013-10-31 Thread Stratos Karafotis

After commit dfa5bb622555d9da0df21b50f46ebdeef390041b
"cpufreq: ondemand: Change the calculation of target frequency",
this return statement is no longer needed.

Reported-by: Henrik Nilsson karl.henrik.nils...@gmail.com
Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/cpufreq_ondemand.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 32f26f6..18d4091 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -168,7 +168,6 @@ static void od_check_cpu(int cpu, unsigned int load)
dbs_info->rate_mult =
od_tuners->sampling_down_factor;
dbs_freq_increase(policy, policy->max);
-   return;
} else {
/* Calculate the next frequency proportional to load */
unsigned int freq_next;
-- 
1.8.3.1

Re: [PATCH] cpufreq: ondemand: Remove redundant return statement

2013-11-01 Thread Stratos Karafotis


On 11/01/2013 02:18 AM, Rafael J. Wysocki wrote:
> On Friday, November 01, 2013 12:09:16 AM Viresh Kumar wrote:
>> On 31 October 2013 23:57, Stratos Karafotis <strat...@semaphore.gr> wrote:
>>> After commit dfa5bb622555d9da0df21b50f46ebdeef390041b
>>> "cpufreq: ondemand: Change the calculation of target frequency",
>>> this return statement is no longer needed.
>>>
>>> Reported-by: Henrik Nilsson <karl.henrik.nils...@gmail.com>
>>> Signed-off-by: Stratos Karafotis <strat...@semaphore.gr>
>>> ---
>>>   drivers/cpufreq/cpufreq_ondemand.c | 1 -
>>>   1 file changed, 1 deletion(-)
>>>
>>> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
>>> index 32f26f6..18d4091 100644
>>> --- a/drivers/cpufreq/cpufreq_ondemand.c
>>> +++ b/drivers/cpufreq/cpufreq_ondemand.c
>>> @@ -168,7 +168,6 @@ static void od_check_cpu(int cpu, unsigned int load)
>>>  dbs_info->rate_mult =
>>>  od_tuners->sampling_down_factor;
>>>  dbs_freq_increase(policy, policy->max);
>>> -   return;
>>>  } else {
>>>  /* Calculate the next frequency proportional to load */
>>>  unsigned int freq_next;
>>
>> Acked-by: Viresh Kumar <viresh.ku...@linaro.org>
>
> Queued up for 3.13, thanks!



Thank you both for your immediate response!

Stratos

[PATCH 0/7] cpufreq: intel_pstate: Debugfs file addition and cleanups

Hi all,

Below is some work on the intel_pstate driver: mostly cleanup
(patches 1/7, 5/7, 6/7, 7/7), some code simplification (patch 4/7),
and the addition of a stats file in debugfs that presents the driver's
statistics (patch 3/7).

Please note that patch 3/7 depends on 2/7 (which also simplifies the
calculation if tracing is on).


Thanks!

Stratos Karafotis (7):
  cpufreq: intel_pstate: Remove duplicate CPU ID check
  cpufreq: intel_pstate: Avoid duplicate call of
intel_pstate_get_scaled_busy
  cpufreq: intel_pstate: Add debugfs file stats
  cpufreq: intel_pstate: Simplify code in
intel_pstate_adjust_busy_pstate
  cpufreq: intel_pstate: Remove redundant includes
  cpufreq: intel_pstate: Trivial code cleanup
  cpufreq: intel_pstate: Make intel_pstate_kobject local

 drivers/cpufreq/intel_pstate.c | 237 +++--
 1 file changed, 135 insertions(+), 102 deletions(-)

-- 
1.9.3

[PATCH 5/7] cpufreq: intel_pstate: Remove redundant includes

Also put them in alphabetical order.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/intel_pstate.c | 17 ++---
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 26a0262..d4f0518 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -10,26 +10,13 @@
  * of the License.
  */
 
-#include <linux/kernel.h>
-#include <linux/kernel_stat.h>
-#include <linux/module.h>
-#include <linux/ktime.h>
-#include <linux/hrtimer.h>
-#include <linux/tick.h>
-#include <linux/slab.h>
-#include <linux/sched.h>
-#include <linux/list.h>
+#include <linux/acpi.h>
 #include <linux/cpu.h>
 #include <linux/cpufreq.h>
-#include <linux/sysfs.h>
-#include <linux/types.h>
-#include <linux/fs.h>
 #include <linux/debugfs.h>
-#include <linux/acpi.h>
+#include <linux/module.h>
 #include <trace/events/power.h>
 
-#include <asm/div64.h>
-#include <asm/msr.h>
 #include <asm/cpu_device_id.h>
 
 #define BYT_RATIOS 0x66a
-- 
1.9.3

[PATCH 2/7] cpufreq: intel_pstate: Avoid duplicate call of intel_pstate_get_scaled_busy

Store the busy_scaled value to avoid a duplicate call of
intel_pstate_get_scaled_busy on every sampling interval.

Also, rename the function to intel_pstate_calc_scaled_busy.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/intel_pstate.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 4e7f492..31e2ae5 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -55,6 +55,7 @@ static inline int32_t div_fp(int32_t x, int32_t y)
 
 struct sample {
int32_t core_pct_busy;
+   int32_t busy_scaled;
u64 aperf;
u64 mperf;
int freq;
@@ -604,7 +605,7 @@ static inline void intel_pstate_set_sample_time(struct cpudata *cpu)
mod_timer_pinned(&cpu->timer, jiffies + delay);
 }
 
-static inline int32_t intel_pstate_get_scaled_busy(struct cpudata *cpu)
+static inline void intel_pstate_calc_scaled_busy(struct cpudata *cpu)
 {
int32_t core_busy, max_pstate, current_pstate, sample_ratio;
u32 duration_us;
@@ -624,20 +625,19 @@ static inline int32_t intel_pstate_get_scaled_busy(struct cpudata *cpu)
core_busy = mul_fp(core_busy, sample_ratio);
}
 
-   return core_busy;
+   cpu->sample.busy_scaled = core_busy;
 }
 
 static inline void intel_pstate_adjust_busy_pstate(struct cpudata *cpu)
 {
-   int32_t busy_scaled;
struct _pid *pid;
signed int ctl = 0;
int steps;
 
pid = &cpu->pid;
-   busy_scaled = intel_pstate_get_scaled_busy(cpu);
+   intel_pstate_calc_scaled_busy(cpu);
 
-   ctl = pid_calc(pid, busy_scaled);
+   ctl = pid_calc(pid, cpu->sample.busy_scaled);
 
steps = abs(ctl);
 
@@ -659,7 +659,7 @@ static void intel_pstate_timer_func(unsigned long __data)
intel_pstate_adjust_busy_pstate(cpu);
 
trace_pstate_sample(fp_toint(sample->core_pct_busy),
-   fp_toint(intel_pstate_get_scaled_busy(cpu)),
+   fp_toint(sample->busy_scaled),
cpu->pstate.current_pstate,
sample->mperf,
sample->aperf,
-- 
1.9.3
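
The idea of patch 2/7, computing the scaled busy value once per sampling
interval and letting both consumers read the stored field, can be sketched in
isolation as follows. The structure and function names are simplified,
hypothetical stand-ins rather than the driver's real definitions, and the
scaling itself is replaced by a placeholder.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the driver's struct sample. */
struct sample {
	int32_t core_pct_busy;
	int32_t busy_scaled;	/* cached result of the calculation */
};

static unsigned int calc_calls;	/* counts how often the value is computed */

/* Compute once per sampling interval and store the result. */
static void calc_scaled_busy(struct sample *s)
{
	calc_calls++;
	s->busy_scaled = s->core_pct_busy;	/* placeholder for the real scaling */
}

/* Both consumers read the stored field instead of recomputing it. */
static int32_t pid_input(const struct sample *s)
{
	return s->busy_scaled;
}

static int32_t trace_value(const struct sample *s)
{
	return s->busy_scaled;
}
```

After one call to calc_scaled_busy(), pid_input() and trace_value() return the
same value and the calculation has run only once.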

[PATCH 7/7] cpufreq: intel_pstate: Make intel_pstate_kobject local

Since we never remove sysfs entry, we can make the intel_pstate_kobject
local.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/intel_pstate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index fa44f0f..9533fff 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -387,10 +387,10 @@ static struct attribute *intel_pstate_attributes[] = {
 static struct attribute_group intel_pstate_attr_group = {
.attrs = intel_pstate_attributes,
 };
-static struct kobject *intel_pstate_kobject;
 
 static void intel_pstate_sysfs_expose_params(void)
 {
+   struct kobject *intel_pstate_kobject;
int rc;
 
intel_pstate_kobject = kobject_create_and_add("intel_pstate",
-- 
1.9.3

[PATCH 6/7] cpufreq: intel_pstate: Trivial code cleanup

Remove unnecessary blank lines.
Remove unnecessary parentheses.
Remove unnecessary braces.
Put the code in one line where possible.
Add blank lines after variable declarations.
Alignment to open parenthesis.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/intel_pstate.c | 96 --
 1 file changed, 45 insertions(+), 51 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index d4f0518..fa44f0f 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -142,7 +142,7 @@ static struct perf_limits limits = {
 };
 
 static inline void pid_reset(struct _pid *pid, int setpoint, int busy,
-   int deadband, int integral) {
+int deadband, int integral) {
pid->setpoint = setpoint;
pid->deadband  = deadband;
pid->integral  = int_tofp(integral);
@@ -161,7 +161,6 @@ static inline void pid_i_gain_set(struct _pid *pid, int percent)
 
 static inline void pid_d_gain_set(struct _pid *pid, int percent)
 {
-
pid->d_gain = div_fp(int_tofp(percent), int_tofp(100));
 }
 
@@ -192,9 +191,9 @@ static signed int pid_calc(struct _pid *pid, int32_t busy)
 
result = pterm + mul_fp(pid->integral, pid->i_gain) + dterm;
if (result >= 0)
-   result = result + (1 << (FRAC_BITS-1));
+   result += 1 << (FRAC_BITS-1);
else
-   result = result - (1 << (FRAC_BITS-1));
+   result -= 1 << (FRAC_BITS-1);
return (signed int)fp_toint(result);
 }
 
@@ -204,20 +203,16 @@ static inline void intel_pstate_busy_pid_reset(struct cpudata *cpu)
pid_d_gain_set(&cpu->pid, pid_params.d_gain_pct);
pid_i_gain_set(&cpu->pid, pid_params.i_gain_pct);
 
-   pid_reset(&cpu->pid,
-   pid_params.setpoint,
-   100,
-   pid_params.deadband,
-   0);
+   pid_reset(&cpu->pid, pid_params.setpoint, 100, pid_params.deadband, 0);
 }
 
 static inline void intel_pstate_reset_all_pid(void)
 {
unsigned int cpu;
-   for_each_online_cpu(cpu) {
+
+   for_each_online_cpu(cpu)
if (all_cpu_data[cpu])
intel_pstate_busy_pid_reset(all_cpu_data[cpu]);
-   }
 }
 
 /************************** debugfs begin ************************/
@@ -227,13 +222,13 @@ static int pid_param_set(void *data, u64 val)
intel_pstate_reset_all_pid();
return 0;
 }
+
 static int pid_param_get(void *data, u64 *val)
 {
*val = *(u32 *)data;
return 0;
 }
-DEFINE_SIMPLE_ATTRIBUTE(fops_pid_param, pid_param_get,
-   pid_param_set, "%llu\n");
+DEFINE_SIMPLE_ATTRIBUTE(fops_pid_param, pid_param_get, pid_param_set, "%llu\n");
 
 struct pid_param {
char *name;
@@ -310,8 +305,8 @@ static void intel_pstate_debug_expose_params(void)
return;
while (pid_files[i].name) {
debugfs_create_file(pid_files[i].name, 0660,
-   debugfs_parent, pid_files[i].value,
-   &fops_pid_param);
+   debugfs_parent, pid_files[i].value,
+   &fops_pid_param);
i++;
}
debugfs_create_file("stats", S_IRUSR | S_IRGRP, debugfs_parent, NULL,
@@ -329,10 +324,11 @@ static void intel_pstate_debug_expose_params(void)
}
 
 static ssize_t store_no_turbo(struct kobject *a, struct attribute *b,
-   const char *buf, size_t count)
+ const char *buf, size_t count)
 {
unsigned int input;
int ret;
+
ret = sscanf(buf, "%u", &input);
if (ret != 1)
return -EINVAL;
@@ -342,10 +338,11 @@ static ssize_t store_no_turbo(struct kobject *a, struct attribute *b,
 }
 
 static ssize_t store_max_perf_pct(struct kobject *a, struct attribute *b,
-   const char *buf, size_t count)
+ const char *buf, size_t count)
 {
unsigned int input;
int ret;
+
ret = sscanf(buf, "%u", &input);
if (ret != 1)
return -EINVAL;
@@ -353,14 +350,16 @@ static ssize_t store_max_perf_pct(struct kobject *a, struct attribute *b,
limits.max_sysfs_pct = clamp_t(int, input, 0 , 100);
limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct);
limits.max_perf = div_fp(int_tofp(limits.max_perf_pct), int_tofp(100));
+
return count;
 }
 
 static ssize_t store_min_perf_pct(struct kobject *a, struct attribute *b,
-   const char *buf, size_t count)
+ const char *buf, size_t count)
 {
unsigned int input;
int ret;
+
ret = sscanf(buf, "%u", &input);
if (ret != 1)
return -EINVAL;
@@ -397,8 +396,7 @@ static void intel_pstate_sysfs_expose_params

[PATCH 4/7] cpufreq: intel_pstate: Simplify code in intel_pstate_adjust_busy_pstate

Simplify the code by removing the inline functions
pstate_increase and pstate_decrease and use directly the
intel_pstate_set_pstate.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/intel_pstate.c | 26 +++---
 1 file changed, 3 insertions(+), 23 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 3a49269..26a0262 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -588,21 +588,6 @@ static void intel_pstate_set_pstate(struct cpudata *cpu, int pstate)
pstate_funcs.set(cpu, pstate);
 }
 
-static inline void intel_pstate_pstate_increase(struct cpudata *cpu, int steps)
-{
-   int target;
-   target = cpu->pstate.current_pstate + steps;
-
-   intel_pstate_set_pstate(cpu, target);
-}
-
-static inline void intel_pstate_pstate_decrease(struct cpudata *cpu, int steps)
-{
-   int target;
-   target = cpu->pstate.current_pstate - steps;
-   intel_pstate_set_pstate(cpu, target);
-}
-
 static void intel_pstate_get_cpu_pstates(struct cpudata *cpu)
 {
cpu->pstate.min_pstate = pstate_funcs.get_min();
@@ -695,20 +680,15 @@ static inline void intel_pstate_calc_scaled_busy(struct cpudata *cpu)
 static inline void intel_pstate_adjust_busy_pstate(struct cpudata *cpu)
 {
struct _pid *pid;
-   signed int ctl = 0;
-   int steps;
+   signed int ctl;
 
pid = &cpu->pid;
intel_pstate_calc_scaled_busy(cpu);
 
ctl = pid_calc(pid, cpu->sample.busy_scaled);
 
-   steps = abs(ctl);
-
-   if (ctl < 0)
-   intel_pstate_pstate_increase(cpu, steps);
-   else
-   intel_pstate_pstate_decrease(cpu, steps);
+   /* Negative values of ctl increase the pstate and vice versa */
+   intel_pstate_set_pstate(cpu, cpu->pstate.current_pstate - ctl);
 }
 
 static void intel_pstate_timer_func(unsigned long __data)
-- 
1.9.3
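
Why the single subtraction in patch 4/7 is equivalent to the removed
increase/decrease helpers can be checked with a small standalone sketch.
target_old(), target_new() and targets_agree() are hypothetical names used
only here; they model the target P-state computation and leave out the call to
intel_pstate_set_pstate.

```c
#include <stdlib.h>

/* Old logic: choose increase or decrease from the sign of ctl. */
static int target_old(int current_pstate, int ctl)
{
	int steps = abs(ctl);

	if (ctl < 0)
		return current_pstate + steps;	/* former pstate_increase */
	return current_pstate - steps;		/* former pstate_decrease */
}

/* New logic: one subtraction covers both signs of ctl. */
static int target_new(int current_pstate, int ctl)
{
	return current_pstate - ctl;
}

/* Returns 1 if both variants agree for every ctl in [-range, range]. */
static int targets_agree(int current_pstate, int range)
{
	int ctl;

	for (ctl = -range; ctl <= range; ctl++)
		if (target_old(current_pstate, ctl) !=
		    target_new(current_pstate, ctl))
			return 0;
	return 1;
}
```

For a negative ctl, abs(ctl) equals -ctl, so current + abs(ctl) is
current - ctl; for a non-negative ctl both forms are already identical.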

[PATCH 1/7] cpufreq: intel_pstate: Remove duplicate CPU ID check

We check the CPU ID during driver init. There is no need
to do it again per logical CPU initialization.

So, remove the duplicate check.

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/intel_pstate.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index aebd457..4e7f492 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -691,14 +691,8 @@ MODULE_DEVICE_TABLE(x86cpu, intel_pstate_cpu_ids);
 
 static int intel_pstate_init_cpu(unsigned int cpunum)
 {
-
-   const struct x86_cpu_id *id;
struct cpudata *cpu;
 
-   id = x86_match_cpu(intel_pstate_cpu_ids);
-   if (!id)
-   return -ENODEV;
-
all_cpu_data[cpunum] = kzalloc(sizeof(struct cpudata), GFP_KERNEL);
if (!all_cpu_data[cpunum])
return -ENOMEM;
-- 
1.9.3

[PATCH 3/7] cpufreq: intel_pstate: Add debugfs file stats

Add a stats file in debugfs under the driver's parent directory
(pstate_snb) which counts the time in usecs per requested
P state and the number of times the specific state
was requested.

The file presents the statistics per logical CPU in the
following format. The time is displayed in msecs:

CPU0
P-state        Time     Count
     16     4882777     23632
     17       21210       174
     18      549781      3300
     19       51171       461
     20       35487       394
     21       18173       219
     22       13752       258
     23        6048       172
     24        7754       177
     25        4587       151
     26        5465       162
     27        1432        47
     28         863        54
     29        1448        50
     30        1030        47
     31        1472        62
     32        2221        68
     33        1869        60
     34        2140        70
     39       85446      3803

...

The file can be used for debugging but also for monitoring
various system workloads.

Also, make the debugfs_parent local as we never remove
the driver's debugfs files.
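
The way the diff in this patch maps a P-state to a slot in the per-CPU stats
array can be illustrated with a standalone sketch. struct pstate_range,
stats_index() and stats_count() are hypothetical names standing in for the
min_pstate/max_pstate fields of cpu->pstate and for stats_state_index(); the
reading that the last slot collects every state above max_pstate (presumably
the turbo state, 39 in the sample output) is an interpretation of that diff.

```c
/* Hypothetical stand-in for the min/max fields of cpu->pstate. */
struct pstate_range {
	int min_pstate;
	int max_pstate;
};

/* Slot for a requested P-state: one slot per state from min to max. */
static unsigned int stats_index(const struct pstate_range *r, int pstate)
{
	if (pstate <= r->max_pstate)
		return pstate - r->min_pstate;

	/* Anything above max_pstate shares one extra slot at the end. */
	return r->max_pstate - r->min_pstate + 1;
}

/* Number of slots: (max - min + 1) regular states plus the extra one. */
static unsigned int stats_count(const struct pstate_range *r)
{
	return r->max_pstate - r->min_pstate + 2;
}
```

With min_pstate 16 and max_pstate 34, as in the sample output, this gives 20
slots: indices 0 to 18 for states 16 to 34, and index 19 for state 39.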

Signed-off-by: Stratos Karafotis strat...@semaphore.gr
---
 drivers/cpufreq/intel_pstate.c | 80 +-
 1 file changed, 79 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 31e2ae5..3a49269 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -86,6 +86,12 @@ struct _pid {
int32_t last_err;
 };
 
+struct pstate_stat {
+   int pstate;
+   u64 time;
+   u64 count;
+};
+
 struct cpudata {
int cpu;
 
@@ -99,6 +105,7 @@ struct cpudata {
u64 prev_aperf;
u64 prev_mperf;
struct sample sample;
+   struct pstate_stat *stats;
 };
 
 static struct cpudata **all_cpu_data;
@@ -256,9 +263,59 @@ static struct pid_param pid_files[] = {
{NULL, NULL}
 };
 
-static struct dentry *debugfs_parent;
+static inline unsigned int stats_state_index(struct cpudata *cpu, int pstate)
+{
+   if (pstate <= cpu->pstate.max_pstate)
+   return pstate - cpu->pstate.min_pstate;
+   else
+   return cpu->pstate.max_pstate - cpu->pstate.min_pstate + 1;
+}
+
+static int stats_debug_show(struct seq_file *m, void *unused)
+{
+   struct cpudata *cpu;
+   int i, j, cnt;
+
+   get_online_cpus();
+   for_each_online_cpu(i) {
+   if (all_cpu_data[i])
+   cpu = all_cpu_data[i];
+   else
+   continue;
+
+   seq_printf(m, "CPU%u\n", i);
+   seq_puts(m, "P-state        Time     Count\n");
+
+   cnt = cpu->pstate.max_pstate - cpu->pstate.min_pstate + 2;
+   for (j = 0; j < cnt; j++)
+   seq_printf(m, "%7u %11llu %9llu\n",
+  cpu->stats[j].pstate,
+  cpu->stats[j].time / USEC_PER_MSEC,
+  cpu->stats[j].count);
+
+   seq_puts(m, "\n");
+   }
+   put_online_cpus();
+
+   return 0;
+}
+
+static int stats_debug_open(struct inode *inode, struct file *file)
+{
+   return single_open(file, stats_debug_show, inode->i_private);
+}
+
+static const struct file_operations fops_stats_pstate = {
+   .open   = stats_debug_open,
+   .read   = seq_read,
+   .llseek = seq_lseek,
+   .release= single_release,
+   .owner  = THIS_MODULE,
+};
+
 static void intel_pstate_debug_expose_params(void)
 {
+   struct dentry *debugfs_parent;
int i = 0;
 
debugfs_parent = debugfs_create_dir("pstate_snb", NULL);
@@ -270,6 +327,8 @@ static void intel_pstate_debug_expose_params(void)
&fops_pid_param);
i++;
}
+   debugfs_create_file("stats", S_IRUSR | S_IRGRP, debugfs_parent, NULL,
+   &fops_stats_pstate);
 }
 
 /************************** debugfs end ************************/
@@ -610,6 +669,7 @@ static inline void intel_pstate_calc_scaled_busy(struct cpudata *cpu)
int32_t core_busy, max_pstate, current_pstate, sample_ratio;
u32 duration_us;
u32 sample_time;
+   unsigned int i;
 
core_busy = cpu->sample.core_pct_busy;
max_pstate = int_tofp(cpu->pstate.max_pstate);
@@ -626,6 +686,10 @@ static inline void intel_pstate_calc_scaled_busy(struct cpudata *cpu)
}
 
cpu->sample.busy_scaled = core_busy;
+
+   i = stats_state_index(cpu, cpu->pstate.current_pstate);
+   cpu->stats[i].time += duration_us;
+   cpu->stats[i].count++;
 }
 
 static inline void intel_pstate_adjust_busy_pstate(struct cpudata *cpu)
@@ -692,6 +756,7 @@ MODULE_DEVICE_TABLE(x86cpu, intel_pstate_cpu_ids);
 static int intel_pstate_init_cpu(unsigned int cpunum)
 {
struct cpudata *cpu;
+   unsigned int i, cnt;
 
all_cpu_data[cpunum

Re: [PATCH 7/7] cpufreq: intel_pstate: Make intel_pstate_kobject local