[PATCH] of: add missing documentation for of_platform_populate()

2012-11-23 Thread Javi Merino
15c3597d (dt/platform: allow device name to be overridden) added a
lookup parameter to of_platform_populate() but did not update the
documentation.  This patch adds the missing documentation entry.

Cc: Grant Likely grant.lik...@secretlab.ca
Cc: Jiri Kosina triv...@kernel.org
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 drivers/of/platform.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index b80891b..e0a6514 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -436,6 +436,7 @@ EXPORT_SYMBOL(of_platform_bus_probe);
  * of_platform_populate() - Populate platform_devices from device tree data
  * @root: parent of the first level to probe or NULL for the root of the tree
  * @matches: match table, NULL to use the default
+ * @lookup: auxdata table for matching id and platform_data with device nodes
  * @parent: parent to hook devices from, NULL for toplevel
  *
  * Similar to of_platform_bus_probe(), this function walks the device tree
-- 
1.7.9.5
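
To illustrate what the @lookup argument documented above is for, here is
a minimal user-space sketch of an auxdata-style table lookup.  The struct
layout, the helper name auxdata_lookup() and the example table are
simplified assumptions made for illustration; they are not the kernel's
definitions (the real table type is struct of_dev_auxdata).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for an auxdata entry (hypothetical layout). */
struct auxdata_entry {
    const char *compatible;   /* "compatible" string to match; NULL ends the table */
    uint64_t phys_addr;       /* base address of the device node */
    const char *name;         /* device name to use instead of the default */
    void *platform_data;      /* platform_data to attach to the device */
};

/*
 * Return the entry matching both the compatible string and the base
 * address, or NULL when the table is NULL or has no match.
 */
static const struct auxdata_entry *
auxdata_lookup(const struct auxdata_entry *lookup,
               const char *compatible, uint64_t phys_addr)
{
    assert(compatible != NULL);

    if (!lookup)
        return NULL;

    for (; lookup->compatible; lookup++) {
        if (lookup->phys_addr == phys_addr &&
            strcmp(lookup->compatible, compatible) == 0)
            return lookup;
    }
    return NULL;
}

/* Example table: one UART gets an overridden name. */
static const struct auxdata_entry example_lookup[] = {
    { "arm,pl011", 0x10009000, "uart0", NULL },
    { NULL, 0, NULL, NULL },   /* sentinel */
};
```

A node that matches an entry would be created with that entry's name and
platform_data; passing a NULL table simply means no overrides apply.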


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] irqchip: gic: Don't complain in gic_get_cpumask() if UP system

2013-07-12 Thread Javi Merino
On Sat, Jul 06, 2013 at 12:39:33AM +0100, Stephen Boyd wrote:
 In a uniprocessor implementation the interrupt processor targets
 registers are read-as-zero/write-ignored (RAZ/WI). Unfortunately
 gic_get_cpumask() will print a critical message saying
 
  GIC CPU mask not found - kernel will fail to boot.
 
 if these registers all read as zero, but there won't actually be
 a problem on uniprocessor systems and the kernel will boot just
 fine. Skip this check if we're running a UP kernel or if we
 detect that the hardware only supports a single processor.
 
 Cc: Nicolas Pitre n...@linaro.org
 Cc: Russell King rmk+ker...@arm.linux.org.uk
 Signed-off-by: Stephen Boyd sb...@codeaurora.org
 ---
 
 Maybe we should just drop the check entirely? It looks like it may
 just be debug code that won't ever trigger in practice, even on the
 11MPCore that caused this code to be introduced.

I agree, we should drop the check.  It's annoying in uniprocessors and
unlikely to be found in the real world unless your gic entry in the dt
is wrong.
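
For reference, the check being discussed can be modelled in user space
roughly as follows.  This is only a sketch: the register array, the
number of registers scanned and both function names are assumptions made
for illustration, not the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_TARGET_REGS 8   /* assumption: the first 8 target registers are scanned */

/*
 * Scan the (modelled) interrupt processor targets registers and return
 * the first non-zero CPU mask, folded down to 8 bits.  Returns 0 when
 * all registers read as zero, which is what a uniprocessor GIC reports.
 */
static uint8_t model_get_cpumask(const uint32_t regs[NUM_TARGET_REGS])
{
    uint32_t mask = 0;
    size_t i;

    assert(regs != NULL);
    for (i = 0; i < NUM_TARGET_REGS; i++) {
        mask = regs[i];
        mask |= mask >> 16;
        mask |= mask >> 8;
        if (mask)
            break;
    }
    return (uint8_t)mask;
}

/* A zero mask only indicates a problem when more than one CPU is possible. */
static int cpumask_is_suspicious(uint8_t mask, unsigned int possible_cpus)
{
    return mask == 0 && possible_cpus > 1;
}
```

With RAZ/WI registers the first function returns 0, and the second one
shows why that result is harmless when only one CPU is possible.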



[PATCH RESEND] smp: harmonize prototypes of smp functions

2013-09-10 Thread Javi Merino
Avoid unnecessary casts from int to bool in smp functions.  Some
functions in kernel/smp.c have a wait parameter that can be set to one
if you want to wait for the command to complete.  It's defined as bool
in a few of them and int in the rest.  If a function with wait
declared as int calls a function whose prototype has wait defined as
bool, the compiler needs to test if the int is != 0 and change it to 1
if so.  This useless check can be avoided if we are consistent and
make all the functions use the same type for this parameter.

For example in arm, before this patch:

800464e4 <smp_call_function>:
800464e4:   b538        push    {r3, r4, r5, lr}
800464e6:   460d        mov     r5, r1
800464e8:   4613        mov     r3, r2          ; move wait to r3
800464ea:   f64f 448c   movw    r4, #64652
800464ee:   3300        adds    r3, #0          ; test if wait is 0
800464f0:   f2c8 0425   movt    r4, #32805
800464f4:   4601        mov     r1, r0
800464f6:   bf18        it      ne
800464f8:   2301        movne   r3, #1          ; if it is not, wait = 1
800464fa:   462a        mov     r2, r5
800464fc:   6820        ldr     r0, [r4, #0]
800464fe:   f7ff fea9   bl      80046254 <smp_call_function_many>
80046502:   2000        movs    r0, #0
80046504:   bd38        pop     {r3, r4, r5, pc}
80046506:   bf00        nop

After the patch:

800464e4 <smp_call_function>:
800464e4:   b538        push    {r3, r4, r5, lr}
800464e6:   460d        mov     r5, r1
800464e8:   4613        mov     r3, r2          ; just move it to r3
800464ea:   f64f 448c   movw    r4, #64652
800464ee:   4601        mov     r1, r0
800464f0:   f2c8 0425   movt    r4, #32805
800464f4:   462a        mov     r2, r5
800464f6:   6820        ldr     r0, [r4, #0]
800464f8:   f7ff feac   bl      80046254 <smp_call_function_many>
800464fc:   2000        movs    r0, #0
800464fe:   bd38        pop     {r3, r4, r5, pc}

Same for x86.  Before:

8109bf10 <smp_call_function>:
8109bf10:   55                      push   %rbp
8109bf11:   48 89 e5                mov    %rsp,%rbp
8109bf14:   31 c9                   xor    %ecx,%ecx    ; ecx = 0
8109bf16:   85 d2                   test   %edx,%edx    ; test if wait is 0
8109bf18:   48 89 f2                mov    %rsi,%rdx
8109bf1b:   48 89 fe                mov    %rdi,%rsi
8109bf1e:   48 8b 3d 4b d3 76 00    mov    0x76d34b(%rip),%rdi    # 81809270 <cpu_online_mask>
8109bf25:   0f 95 c1                setne  %cl          ; if it is not, ecx = 1
8109bf28:   e8 43 fc ff ff          callq  8109bb70 <smp_call_function_many>
8109bf2d:   31 c0                   xor    %eax,%eax
8109bf2f:   5d                      pop    %rbp
8109bf30:   c3                      retq

After:

8109bf20 <smp_call_function>:
8109bf20:   55                      push   %rbp
8109bf21:   48 89 e5                mov    %rsp,%rbp
8109bf24:   89 d1                   mov    %edx,%ecx    ; just move wait to ecx
8109bf26:   48 89 f2                mov    %rsi,%rdx
8109bf29:   48 89 fe                mov    %rdi,%rsi
8109bf2c:   48 8b 3d 3d d3 76 00    mov    0x76d33d(%rip),%rdi    # 81809270 <cpu_online_mask>
8109bf33:   e8 48 fc ff ff          callq  8109bb80 <smp_call_function_many>
8109bf38:   31 c0                   xor    %eax,%eax
8109bf3a:   5d                      pop    %rbp
8109bf3b:   c3                      retq
8109bf3c:   0f 1f 40 00             nopl   0x0(%rax)

Cc: Andrew Morton a...@linux-foundation.org
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 include/linux/smp.h |    6 +++---
 kernel/smp.c        |    6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index c181399..a894405 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -72,7 +72,7 @@ extern void smp_cpus_done(unsigned int max_cpus);
  */
 int smp_call_function(smp_call_func_t func, void *info, int wait);
 void smp_call_function_many(const struct cpumask *mask,
-   smp_call_func_t func, void *info, bool wait);
+   smp_call_func_t func, void *info, int wait);
 
 void __smp_call_function_single(int cpuid, struct call_single_data *data,
int wait);
@@ -104,7 +104,7 @@ int on_each_cpu(smp_call_func_t func, void *info, int wait);
  * the local one.
  */
 void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func,
-   void *info, bool wait);
+   void *info, int wait);
 
 /*
  * Call a function
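
The conversion described in the commit message can be reproduced in a
few lines of plain C.  The sketch below is a stand-alone illustration;
all four function names and the self-check are hypothetical and do not
appear in kernel/smp.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Callee declared with a bool parameter: the value is normalised to 0 or 1. */
static int callee_bool(bool wait)
{
    return wait;
}

/* Callee declared with an int parameter: the value is passed through untouched. */
static int callee_int(int wait)
{
    return wait;
}

/* int caller -> bool callee: the compiler must emit "wait != 0" here. */
static int call_via_bool(int wait)
{
    return callee_bool(wait);
}

/* int caller -> int callee: a plain register move is enough. */
static int call_via_int(int wait)
{
    return callee_int(wait);
}

/* Self-check: a non-zero int collapses to 1 only on the bool path. */
static void check_wait_conversion(void)
{
    assert(call_via_bool(2) == 1);
    assert(call_via_int(2) == 2);
    assert(call_via_bool(0) == 0);
}
```

The extra test-and-set in call_via_bool() corresponds to the adds/movne
and test/setne pairs in the "before" listings above.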

Re: [PATCH] smp: harmonize prototypes of smp functions

2013-09-18 Thread Javi Merino
On Tue, Sep 17, 2013 at 10:22:28PM +0100, Andrew Morton wrote:
 On Mon,  2 Sep 2013 15:33:13 +0100 Javi Merino javi.mer...@arm.com wrote:
 
  Avoid unnecessary casts from int to bool in smp functions.  Some
  functions in kernel/smp.c have a wait parameter that can be set to one
  if you want to wait for the command to complete.  It's defined as bool
  in a few of them and int in the rest.  If a function with wait
  declared as int calls a function whose prototype has wait defined as
  bool, the compiler needs to test if the int is != 0 and change it to 1
  if so.  This useless check can be avoided if we are consistent and
  make all the functions use the same type for this parameter.
 
 Yes, that's a problem with bool.
 
 But the `wait' argument *is* a boolean and switching everything over to
 use bool (instead of int) should provide similar code-size savings.
 Did you evaluate that approach?

I did; you get exactly the same code-size savings.  But then I read
this[0] and thought that int was preferred.

[0] https://lkml.org/lkml/2013/8/31/138

I can submit the bool patch instead if you prefer it.  Cheers,
Javi



[PATCH] smp: harmonize prototypes of smp functions

2013-09-02 Thread Javi Merino

Re: [PATCH] tools/thermal: tmon: fix compilation errors when building statically

2014-06-10 Thread Javi Merino
Hi Rui,

On Mon, Jun 02, 2014 at 12:54:57PM +0100, Jacob Pan wrote:
 On Mon,  2 Jun 2014 18:08:17 +0100
 Javi Merino javi.mer...@arm.com wrote:
 
  tmon fails to build statically with the following error:
  
  $ make LDFLAGS=-static
  gcc -O1 -Wall -Wshadow -W -Wformat -Wimplicit-function-declaration
  -Wimplicit-int -fstack-protector -D VERSION=\"1.0\" -static tmon.o
  tui.o sysfs.o pid.o   -o tmon -lm -lpanel -lncursesw  -lpthread
  tmon.o: In function `tmon_sig_handler':
  tmon.c:(.text+0x21): undefined reference to `stdscr'
  tmon.o: In function `tmon_cleanup':
  tmon.c:(.text+0xb9): undefined reference to `stdscr'
  tmon.c:(.text+0x11e): undefined reference to `stdscr'
  tmon.c:(.text+0x123): undefined reference to `keypad'
  tmon.c:(.text+0x12d): undefined reference to `nocbreak'
  tmon.o: In function `main':
  tmon.c:(.text+0x785): undefined reference to `stdscr'
  tmon.c:(.text+0x78a): undefined reference to `nodelay'
  tui.o: In function `setup_windows':
  tui.c:(.text+0x131): undefined reference to `stdscr'
  tui.c:(.text+0x176): undefined reference to `stdscr'
  tui.c:(.text+0x19f): undefined reference to `stdscr'
  tui.c:(.text+0x1cc): undefined reference to `stdscr'
  tui.c:(.text+0x1ff): undefined reference to `stdscr'
  tui.o:tui.c:(.text+0x229): more undefined references to `stdscr' follow
  tui.o: In function `show_cooling_device':
  [...]
  
  stdscr() and friends are in libtinfo (part of ncurses) so add it to
  the libraries that are linked in when compiling tmon to fix it.
  
 Acked-by: Jacob Pan jacob.jun@linux.intel.com

Thanks!

Rui, can you pick this up?

Cheers,
Javi



Re: [RFC PATCH v3 4/7] thermal: introduce the Power Actor API

2014-06-12 Thread Javi Merino
On Wed, Jun 11, 2014 at 12:32:54PM +0100, Eduardo Valentin wrote:
 Hello Javi,
 
 On Tue, Jun 03, 2014 at 11:18:32AM +0100, Javi Merino wrote:
  This patch introduces the Power Actor API in the thermal framework.
  With it, devices that can report their power consumption and control
  it can be registered.  This base interface is meant to be used to
  derive specific power actors, such as a cpu power actor.
  
  Cc: Zhang Rui rui.zh...@intel.com
  Cc: Eduardo Valentin edubez...@gmail.com
  Signed-off-by: Javi Merino javi.mer...@arm.com
  ---
   Documentation/thermal/power_actor.txt     | 38 ++
   drivers/thermal/Kconfig                   |  3 ++
   drivers/thermal/Makefile                  |  2 +
   drivers/thermal/power_actor/Makefile      |  5 +++
   drivers/thermal/power_actor/power_actor.c | 64 +++
   drivers/thermal/power_actor/power_actor.h | 64 +++
 
 Do you think this API may have other users other than thermal?
 
 If yes, I propose moving it somewhere else, such as driver/base/power.
 

It could be used by others, but I think we should move it when those
other users appear.

   6 files changed, 176 insertions(+)
   create mode 100644 Documentation/thermal/power_actor.txt
   create mode 100644 drivers/thermal/power_actor/Makefile
   create mode 100644 drivers/thermal/power_actor/power_actor.c
   create mode 100644 drivers/thermal/power_actor/power_actor.h
  
  diff --git a/Documentation/thermal/power_actor.txt b/Documentation/thermal/power_actor.txt
  new file mode 100644
  index ..5a61f32ec143
  --- /dev/null
  +++ b/Documentation/thermal/power_actor.txt
  @@ -0,0 +1,38 @@
  +
  +Power Actor API
  +===============
  +
  +The base power actor API is meant to be used to derive specific power
 
 
 How about having a deeper explanation? Maybe including one or two
 paragraphs explaining why and when using this API makes sense.
 
 One or two example scenarios would also help.

Ok, I'll add it.

 An explanation of how this differs from the cooling API would also be good to have.
 
  +actors, such as a cpu power actor.  When registering, they should call
  +`power_actor_register()` with a unique `enum power_actor_types`.  When
 
 Why this enum? Maybe it is a design copy of PM QoS?

To be able to tell different power actors apart.  I haven't looked
into PM QoS.

 But to me this is bound to struct device, no?

You're suggesting that we should add this as a field to struct device?
Could do.

  +unregistering, the power actor should call `power_actor_unregister()`
  +with the `struct power_actor *` received in the call to
  +`power_actor_register()`.
  +
  +Callbacks
  +---------
  +
  +1. u32 get_req_power(struct power_actor *actor)
  +@actor: a valid `struct power_actor *` registered with
  +`power_actor_register()`
  +
  +`get_req_power()` returns the current requested power in milliwatts.
  +
  +2. u32 get_max_power(struct power_actor *actor)
  +@actor: a valid `struct power_actor *` registered with
  +`power_actor_register()`
  +
  +`get_max_power()` returns the maximum power that the device could
  +consume if it was fully utilized.  It's a function as some devices'
  +maximum power consumption can change due to external factors such as
  +temperature.
  +
  +3. int set_power(struct power_actor *actor, u32 power)
  +@actor: a valid `struct power_actor *` registered with
  +`power_actor_register()`
  +@power: power in milliwatts
  +
  +`set_power()` should configure the device to consume @power
  +milliwatts.
  +
  +Returns 0 on success, -E* on error.
  diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
  index 2d51912a6e40..47e2f15537ca 100644
  --- a/drivers/thermal/Kconfig
  +++ b/drivers/thermal/Kconfig
  @@ -89,6 +89,9 @@ config THERMAL_GOV_USER_SPACE
  help
Enable this to let the user space manage the platform thermals.
   
  +config THERMAL_POWER_ACTOR
  +   bool
  +
 
 Why empty description/help?

Because it's not an option that users can select.

   config CPU_THERMAL
  bool "generic cpu cooling support"
  depends on CPU_FREQ
  diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
  index 54e4ec9eb5df..878a02cab7d1 100644
  --- a/drivers/thermal/Makefile
  +++ b/drivers/thermal/Makefile
  @@ -14,6 +14,8 @@ thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE)  += fair_share.o
   thermal_sys-$(CONFIG_THERMAL_GOV_STEP_WISE)+= step_wise.o
   thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)   += user_space.o
   
  +obj-$(CONFIG_THERMAL_POWER_ACTOR) += power_actor/
  +
   # cpufreq cooling
   thermal_sys-$(CONFIG_CPU_THERMAL)  += cpu_cooling.o
   
  diff --git a/drivers/thermal/power_actor/Makefile b/drivers/thermal/power_actor/Makefile
  new file mode 100644
  index ..46478f4928be
  --- /dev/null
  +++ b/drivers/thermal/power_actor/Makefile
  @@ -0,0 +1,5 @@
  +#
  +# Makefile for the power actors
  +#
  +
  +obj-y += power_actor.o
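
As a rough illustration of the three callbacks documented in the quoted
power_actor.txt, here is a user-space sketch of a toy actor.  The
contents of power_actor.h are not shown in this thread, so the struct
layouts, the toy_* functions and the field names below are all
assumptions made for the example, not the proposed kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

struct power_actor;

/* The three callbacks described in the documentation above. */
struct power_actor_ops {
    uint32_t (*get_req_power)(struct power_actor *actor);
    uint32_t (*get_max_power)(struct power_actor *actor);
    int (*set_power)(struct power_actor *actor, uint32_t power);
};

/* Hypothetical actor layout: ops plus private state for a toy device. */
struct power_actor {
    const struct power_actor_ops *ops;
    uint32_t max_mw;       /* most the toy device can draw, in milliwatts */
    uint32_t granted_mw;   /* last budget accepted by set_power() */
    uint32_t load_pct;     /* current utilisation, 0..100 */
};

static uint32_t toy_get_max_power(struct power_actor *actor)
{
    return actor->max_mw;
}

/* Requested power: what the device would draw at its current load. */
static uint32_t toy_get_req_power(struct power_actor *actor)
{
    assert(actor->load_pct <= 100);
    return actor->max_mw * actor->load_pct / 100;
}

/* Accept any budget up to the maximum; reject anything larger. */
static int toy_set_power(struct power_actor *actor, uint32_t power)
{
    if (power > actor->max_mw)
        return -EINVAL;
    actor->granted_mw = power;
    return 0;
}

static const struct power_actor_ops toy_ops = {
    .get_req_power = toy_get_req_power,
    .get_max_power = toy_get_max_power,
    .set_power = toy_set_power,
};
```

A governor would query get_req_power() and get_max_power() on each
actor, split the available budget, and hand each share back through
set_power(), checking for a negative error code.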

Re: [RFC PATCH v3 5/7] thermal: add a basic cpu power actor

2014-06-12 Thread Javi Merino
On Wed, Jun 11, 2014 at 01:05:37PM +0100, Eduardo Valentin wrote:
 Hello Javi,
 
 On Tue, Jun 03, 2014 at 11:18:33AM +0100, Javi Merino wrote:
  Introduce a power actor for cpus.  It has a basic power model to get
  the current power utilization and uses cpufreq cooling devices to set
  the desired power.  It uses the current frequency (as reported by
  cpufreq) as well as load and OPPs for the power calculations.  The
  cpus must have registered their OPPs in the OPP library.
 
  Cc: Zhang Rui rui.zh...@intel.com
  Cc: Eduardo Valentin edubez...@gmail.com
  Signed-off-by: Punit Agrawal punit.agra...@arm.com
  Signed-off-by: Javi Merino javi.mer...@arm.com
  ---
   Documentation/thermal/power_actor.txt     | 126 +++
   drivers/thermal/Kconfig                   |   5 +
   drivers/thermal/power_actor/Kconfig       |   9 +
   drivers/thermal/power_actor/Makefile      |   2 +
   drivers/thermal/power_actor/cpu_actor.c   | 601 ++
   drivers/thermal/power_actor/power_actor.h |  41 ++
   6 files changed, 784 insertions(+)
   create mode 100644 drivers/thermal/power_actor/Kconfig
   create mode 100644 drivers/thermal/power_actor/cpu_actor.c
 
  diff --git a/Documentation/thermal/power_actor.txt b/Documentation/thermal/power_actor.txt
  index 5a61f32ec143..fd51760615bf 100644
  --- a/Documentation/thermal/power_actor.txt
  +++ b/Documentation/thermal/power_actor.txt
  @@ -36,3 +36,129 @@ temperature.
   milliwatts.
 
   Returns 0 on success, -E* on error.
  +
  +CPU Power Actor API
  +===================
  +
  +A simple power model for CPUs.  The current power is calculated as
  +dynamic + (optionally) static power.  This power model requires that
  +the operating-points of the CPUs are registered using the kernel's opp
  +library and the `cpufreq_frequency_table` is assigned to the `struct
  +device` of the cpu.  If you are using the `cpufreq-cpu0.c` driver then
  +the `cpufreq_frequency_table` should already be assigned to the cpu
  +device.
  +
  +The `tz` and `plat_static_func` parameters of
  +`power_cpu_actor_register()` are optional.  If you don't provide them,
  +only dynamic power will be considered.
  +
  +Dynamic power
  +-------------
  +
  +The dynamic power consumption of a processor depends
  +on many factors.  For a given processor implementation the primary
  +factors are:
  +
  +- The time the processor spends running, consuming dynamic power, as
  +  compared to the time in idle states where dynamic consumption is
  +  negligible.  Herein we refer to this as 'utilisation'.
  +- The voltage and frequency levels as a result of DVFS.  The DVFS
  +  level is a dominant factor governing power consumption.
  +- In running time the 'execution' behaviour (instruction types, memory
  +  access patterns and so forth) causes, in most cases, a second order
  +  variation.  In pathological cases this variation can be significant,
  +  but typically it is of a much lesser impact than the factors above.
  +
  +A high level dynamic power consumption model may then be represented as:
  +
  +Pdyn = f(run) * Voltage^2 * Frequency * Utilisation
  +
  +f(run) here represents the described execution behaviour and its
  +result has units of Watts/Hz/Volt^2 (this is often expressed in
  +mW/MHz/uVolt^2).
  +
  +The detailed behaviour for f(run) could be modelled on-line.  However,
  +in practice, such an on-line model has dependencies on a number of
  +implementation specific processor support and characterisation
  +factors.  Therefore, in initial implementation that contribution is
  +represented as a constant coefficient.  This is a simplification
  +consistent with the relative contribution to overall power variation.
  +
  +In this simplified representation our model becomes:
  +
  +Pdyn = Kd * Voltage^2 * Frequency * Utilisation
  +
  +Where Kd (capacitance) represents an indicative running time dynamic
  +power coefficient in fundamental units of mW/MHz/uVolt^2
  +
  +Static Power
  +------------
  +
  +Static leakage power consumption depends on a number of factors.  For a
  +given circuit implementation the primary factors are:
  +
  +- Time the circuit spends in each 'power state'
  +- Temperature
  +- Operating voltage
  +- Process grade
  +
  +The time the circuit spends in each 'power state' for a given
  +evaluation period at first order means OFF or ON.  However,
  +'retention' states can also be supported that reduce power during
  +inactive periods without loss of context.
  +
  +Note: The visibility of state entries to the OS can vary, according to
  +platform specifics, and this can then impact the accuracy of a model
  +based on OS state information alone.  It might be possible in some
  +cases to extract more accurate information from system resources.
  +
  +The temperature, operating voltage and process 'grade' (slow to fast)
  +of the circuit are all significant factors in static leakage power
  +consumption.  All of these have complex relationships to static
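
The simplified dynamic power model quoted above,
Pdyn = Kd * Voltage^2 * Frequency * Utilisation, can be written as a
one-line helper.  This is a sketch only: the function name is
hypothetical, and the units (volts rather than microvolts, utilisation
as a fraction) are assumptions chosen to keep the numbers readable.

```c
#include <assert.h>

/*
 * Pdyn = Kd * Voltage^2 * Frequency * Utilisation
 *
 * Assumed units for this sketch: kd in mW/(MHz * V^2), voltage in volts,
 * frequency in MHz, utilisation as a fraction in [0, 1].  Result in mW.
 */
static double dynamic_power_mw(double kd, double volts, double freq_mhz,
                               double utilisation)
{
    assert(utilisation >= 0.0 && utilisation <= 1.0);
    return kd * volts * volts * freq_mhz * utilisation;
}
```

With kd = 0.5, a CPU at 1.0 V and 1000 MHz that is busy half the time
comes out at 250 mW; doubling the voltage alone would quadruple that,
which is why the DVFS level dominates the estimate.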

Re: [RFC PATCH v3 5/7] thermal: add a basic cpu power actor

2014-06-13 Thread Javi Merino
On Thu, Jun 12, 2014 at 03:26:50PM +0100, Javi Merino wrote:
 On Wed, Jun 11, 2014 at 01:05:37PM +0100, Eduardo Valentin wrote:
  On Tue, Jun 03, 2014 at 11:18:33AM +0100, Javi Merino wrote:
   Introduce a power actor for cpus.  It has a basic power model to get
   the current power utilization and uses cpufreq cooling devices to set
   the desired power.  It uses the current frequency (as reported by
   cpufreq) as well as load and OPPs for the power calculations.  The
   cpus must have registered their OPPs in the OPP library.
  
   Cc: Zhang Rui rui.zh...@intel.com
   Cc: Eduardo Valentin edubez...@gmail.com
   Signed-off-by: Punit Agrawal punit.agra...@arm.com
   Signed-off-by: Javi Merino javi.mer...@arm.com
   ---
    Documentation/thermal/power_actor.txt     | 126 +++
    drivers/thermal/Kconfig                   |   5 +
    drivers/thermal/power_actor/Kconfig       |   9 +
    drivers/thermal/power_actor/Makefile      |   2 +
    drivers/thermal/power_actor/cpu_actor.c   | 601 ++
    drivers/thermal/power_actor/power_actor.h |  41 ++
    6 files changed, 784 insertions(+)
    create mode 100644 drivers/thermal/power_actor/Kconfig
    create mode 100644 drivers/thermal/power_actor/cpu_actor.c
  
[...]
   diff --git a/drivers/thermal/power_actor/Kconfig b/drivers/thermal/power_actor/Kconfig
   new file mode 100644
   index ..fa542ca99cdb
   --- /dev/null
   +++ b/drivers/thermal/power_actor/Kconfig
   @@ -0,0 +1,9 @@
   +#
   +# Thermal power actor configuration
   +#
   +
   +config THERMAL_POWER_ACTOR_CPU
   + bool
   + prompt "Simple power model for a CPU"
   + help
   +   A simple CPU power model
 
  A better help is always welcome.
 
 I guess I can repeat some of the text in the Documentation here.

I've been thinking about it and I'd rather remove the prompt
altogether.  There's no point in asking the user about this: drivers
that use this API can select THERMAL_POWER_ACTOR_CPU themselves.  So
I'll remove the help text.

[...]
   diff --git a/drivers/thermal/power_actor/power_actor.h b/drivers/thermal/power_actor/power_actor.h
   index 28098f43630b..230317c284b2 100644
   --- a/drivers/thermal/power_actor/power_actor.h
   +++ b/drivers/thermal/power_actor/power_actor.h
   @@ -17,11 +17,16 @@
    #ifndef __POWER_ACTOR_H__
    #define __POWER_ACTOR_H__
   
   +#include <linux/cpumask.h>
   +#include <linux/device.h>
   +#include <linux/err.h>
    #include <linux/list.h>
   +#include <linux/thermal.h>
   
    #define MAX_NUM_ACTORS 8
   
    enum power_actor_types {
   + POWER_ACTOR_CPU,
 
  Using struct device is more scalable no? What if we want to provide a
  power actor for a specific bus, or device, or coprocessor? Are we going
  to maintain this enum for every single new user?
 
 The counterargument would be, does every single device need to carry
 this?  Anyway, I'll see if we can add the power actor to the struct
 device.

I've decided to remove the only user of the power_actor_types enum and
remove this field as well.  It simplifies things, which is always
good.

Cheers,
Javi



[RFC PATCH v2 3/7] thermal: let governors have private data for each thermal zone

2014-05-20 Thread Javi Merino
A governor may need to store its current state between calls to
throttle().  That state depends on the thermal zone, so store it as
private data in struct thermal_zone_device.

The governors may have two new ops: bind_to_tz() and unbind_from_tz().
When provided, these functions let governors do some initialization
and teardown when they are bound/unbound to a tz and possibly store that
information in the governor_data field of the struct
thermal_zone_device.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 drivers/thermal/thermal_core.c |   83
 include/linux/thermal.h        |    9 +
 2 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 71b0ec0c370d..1b13d8e0cfd1 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -72,6 +72,58 @@ static struct thermal_governor *__find_governor(const char *name)
 	return NULL;
 }
 
+/**
+ * bind_previous_governor - bind the previous governor of the thermal zone
+ * @tz:		a valid pointer to a struct thermal_zone_device
+ * @failed_gov_name:	the name of the governor that failed to register
+ *
+ * Register the previous governor of the thermal zone after a new
+ * governor has failed to be bound.
+ */
+static void bind_previous_governor(struct thermal_zone_device *tz,
+				const char *failed_gov_name)
+{
+	if (tz->governor && tz->governor->bind_to_tz) {
+		if (tz->governor->bind_to_tz(tz)) {
+			dev_warn(&tz->device,
+				"governor %s failed to bind and the previous one (%s) failed to register again, thermal zone %s has no governor\n",
+				failed_gov_name, tz->governor->name, tz->type);
+			tz->governor = NULL;
+		}
+	}
+}
+
+/**
+ * thermal_set_governor() - Switch to another governor
+ * @tz:		a valid pointer to a struct thermal_zone_device
+ * @new_gov:	pointer to the new governor
+ *
+ * Change the governor of thermal zone @tz.
+ *
+ * Returns 0 on success, an error if the new governor's bind_to_tz() failed.
+ */
+static int thermal_set_governor(struct thermal_zone_device *tz,
+				struct thermal_governor *new_gov)
+{
+	int ret = 0;
+
+	if (tz->governor && tz->governor->unbind_from_tz)
+		tz->governor->unbind_from_tz(tz);
+
+	if (new_gov && new_gov->bind_to_tz) {
+		ret = new_gov->bind_to_tz(tz);
+		if (ret) {
+			bind_previous_governor(tz, new_gov->name);
+
+			return ret;
+		}
+	}
+
+	tz->governor = new_gov;
+
+	return ret;
+}
+
 int thermal_register_governor(struct thermal_governor *governor)
 {
 	int err;
@@ -104,8 +156,15 @@ int thermal_register_governor(struct thermal_governor *governor)
 
 		name = pos->tzp->governor_name;
 
-		if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH))
-			pos->governor = governor;
+		if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH)) {
+			int ret;
+
+			ret = thermal_set_governor(pos, governor);
+			if (ret)
+				dev_warn(&pos->device,
+					"Failed to set governor %s for thermal zone %s: %d\n",
+					governor->name, pos->type, ret);
+		}
 	}
 
 	mutex_unlock(&thermal_list_lock);
@@ -131,7 +190,7 @@ void thermal_unregister_governor(struct thermal_governor *governor)
 	list_for_each_entry(pos, &thermal_tz_list, node) {
 		if (!strnicmp(pos->governor->name, governor->name,
 				THERMAL_NAME_LENGTH))
-			pos->governor = NULL;
+			thermal_set_governor(pos, NULL);
 	}
 
 	mutex_unlock(&thermal_list_lock);
@@ -756,8 +815,9 @@ policy_store(struct device *dev, struct device_attribute *attr,
 	if (!gov)
 		goto exit;
 
-	tz->governor = gov;
-	ret = count;
+	ret = thermal_set_governor(tz, gov);
+	if (!ret)
+		ret = count;
 
 exit:
 	mutex_unlock(&thermal_governor_lock);
@@ -1452,6 +1512,7 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
 	int result;
 	int count;
 	int passive = 0;
+	struct thermal_governor *governor;
 
 	if (type && strlen(type) >= THERMAL_NAME_LENGTH)
 		return ERR_PTR(-EINVAL);
@@ -1542,9 +1603,15 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
 	mutex_lock(&thermal_governor_lock);
 
 	if (tz->tzp)
-		tz->governor = __find_governor(tz->tzp
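
The bind/unbind flow of this patch can be exercised outside the kernel
with a small model.  The sketch below mirrors the control flow of
thermal_set_governor() and bind_previous_governor(), but the struct
zone and struct governor types, set_governor() and the toy governors
are simplified stand-ins invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct zone;

struct governor {
    const char *name;
    int (*bind_to_tz)(struct zone *tz);       /* optional, may be NULL */
    void (*unbind_from_tz)(struct zone *tz);  /* optional, may be NULL */
};

struct zone {
    const struct governor *governor;
    void *governor_data;   /* private per-zone state owned by the governor */
};

/* Switch governors; on failure try to restore the previous one. */
static int set_governor(struct zone *tz, const struct governor *new_gov)
{
    int ret;

    assert(tz != NULL);
    if (tz->governor && tz->governor->unbind_from_tz)
        tz->governor->unbind_from_tz(tz);

    if (new_gov && new_gov->bind_to_tz) {
        ret = new_gov->bind_to_tz(tz);
        if (ret) {
            /* Re-bind the old governor; drop it if that fails too. */
            if (tz->governor && tz->governor->bind_to_tz &&
                tz->governor->bind_to_tz(tz))
                tz->governor = NULL;
            return ret;
        }
    }

    tz->governor = new_gov;
    return 0;
}

/* Toy governors used to exercise both paths. */
static int state_a;
static int bind_ok(struct zone *tz) { tz->governor_data = &state_a; return 0; }
static void unbind(struct zone *tz) { tz->governor_data = NULL; }
static int bind_fail(struct zone *tz) { (void)tz; return -1; }

static const struct governor gov_a = { "a", bind_ok, unbind };
static const struct governor gov_bad = { "bad", bind_fail, NULL };
```

Switching from gov_a to gov_bad fails and leaves gov_a bound again with
its private data restored; switching to NULL unbinds and leaves the
zone without a governor.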

[RFC PATCH v2 1/7] tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks

2014-05-20 Thread Javi Merino
From: Steven Rostedt (Red Hat) rost...@goodmis.org

Being able to show a cpumask of events can be useful as some events
may affect only some CPUs. There is no standard way to record the
cpumask and converting it to a string is rather expensive during
the trace as traces happen in hotpaths. It would be better to record
the raw event mask and be able to parse it at print time.

The following macros were added for use with the TRACE_EVENT() macro:

  __bitmask()
  __assign_bitmask()
  __get_bitmask()

To test this, I added this to the sched_migrate_task event, which
looked like this:

TRACE_EVENT(sched_migrate_task,

	TP_PROTO(struct task_struct *p, int dest_cpu, const struct cpumask *cpus),

	TP_ARGS(p, dest_cpu, cpus),

	TP_STRUCT__entry(
		__array(	char,	comm,	TASK_COMM_LEN	)
		__field(	pid_t,	pid			)
		__field(	int,	prio			)
		__field(	int,	orig_cpu		)
		__field(	int,	dest_cpu		)
		__bitmask(	cpumask, num_possible_cpus())
	),

	TP_fast_assign(
		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
		__entry->pid		= p->pid;
		__entry->prio		= p->prio;
		__entry->orig_cpu	= task_cpu(p);
		__entry->dest_cpu	= dest_cpu;
		__assign_bitmask(cpumask, cpumask_bits(cpus), num_possible_cpus());
	),

	TP_printk("comm=%s pid=%d prio=%d orig_cpu=%d dest_cpu=%d cpumask=%s",
		  __entry->comm, __entry->pid, __entry->prio,
		  __entry->orig_cpu, __entry->dest_cpu,
		  __get_bitmask(cpumask))
);

With the output of:

ksmtuned-3613  [003] d..2   485.220508: sched_migrate_task: 
comm=ksmtuned pid=3615 prio=120 orig_cpu=3 dest_cpu=2 cpumask=,000f
 migration/1-13[001] d..5   485.221202: sched_migrate_task: 
comm=ksmtuned pid=3614 prio=120 orig_cpu=1 dest_cpu=0 cpumask=,000f
 awk-3615  [002] d.H5   485.221747: sched_migrate_task: 
comm=rcu_preempt pid=7 prio=120 orig_cpu=0 dest_cpu=1 cpumask=,00ff
 migration/2-18[002] d..5   485.222062: sched_migrate_task: 
comm=ksmtuned pid=3615 prio=120 orig_cpu=2 dest_cpu=3 cpumask=,000f
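
The idea of recording the raw mask and turning it into text only at
print time can be sketched in plain userspace C.  This is an
illustration, not the kernel implementation: format_bitmask() is a
hypothetical helper, and the layout (comma-separated groups of eight
hex digits, most significant group first) is an assumption modelled on
how the kernel usually prints bitmaps.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: render nr_words 32-bit words (least significant
 * word first in the array) as comma-separated hex groups, most
 * significant group first.  Returns the number of characters written,
 * or -1 if the buffer is too small.
 */
static int format_bitmask(char *buf, size_t len,
			  const uint32_t *words, size_t nr_words)
{
	size_t pos = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	/* Walk from the most significant word down to word 0. */
	for (size_t i = nr_words; i-- > 0; ) {
		int n = snprintf(buf + pos, len - pos, "%s%08x",
				 i == nr_words - 1 ? "" : ",",
				 (unsigned int)words[i]);

		/* snprintf() reports the length it wanted to write. */
		if (n < 0 || (size_t)n >= len - pos)
			return -1;
		pos += (size_t)n;
	}

	return (int)pos;
}
```

The hot path would only copy the words; a function like this would run
later, when the trace is read.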

Link: http://lkml.kernel.org/r/1399377998-14870-6-git-send-email-javi.mer...@arm.com
Link: http://lkml.kernel.org/r/20140506132238.22e13...@gandalf.local.home

Suggested-by: Javi Merino <javi.mer...@arm.com>
Tested-by: Javi Merino <javi.mer...@arm.com>
Signed-off-by: Steven Rostedt <rost...@goodmis.org>
---
 include/linux/ftrace_event.h |3 +++
 include/linux/trace_seq.h|   10 
 include/trace/ftrace.h   |   57 +-
 kernel/trace/trace_output.c  |   41 ++
 4 files changed, 110 insertions(+), 1 deletion(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index d16da3e53bc7..cff3106ffe2c 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -38,6 +38,9 @@ const char *ftrace_print_symbols_seq_u64(struct trace_seq *p,
 *symbol_array);
 #endif
 
+const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
+unsigned int bitmask_size);
+
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
diff --git a/include/linux/trace_seq.h b/include/linux/trace_seq.h
index a32d86ec8bf2..136116924d8d 100644
--- a/include/linux/trace_seq.h
+++ b/include/linux/trace_seq.h
@@ -46,6 +46,9 @@ extern int trace_seq_putmem_hex(struct trace_seq *s, const void *mem,
 extern void *trace_seq_reserve(struct trace_seq *s, size_t len);
 extern int trace_seq_path(struct trace_seq *s, const struct path *path);
 
+extern int trace_seq_bitmask(struct trace_seq *s, const unsigned long *maskp,
+int nmaskbits);
+
 #else /* CONFIG_TRACING */
 static inline int trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
 {
@@ -57,6 +60,13 @@ trace_seq_bprintf(struct trace_seq *s, const char *fmt, const u32 *binary)
return 0;
 }
 
+static inline int
+trace_seq_bitmask(struct trace_seq *s, const unsigned long *maskp,
+ int nmaskbits)
+{
+   return 0;
+}
+
 static inline int trace_print_seq(struct seq_file *m, struct trace_seq *s)
 {
return 0;
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 0a1a4f7caf09..9b7a989dcbcc 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -53,6 +53,9 @@
 #undef __string
 #define __string(item, src) __dynamic_array(char, item, -1)
 
+#undef __bitmask
+#define __bitmask(item, nr_bits) __dynamic_array(char, item, -1)
+
 #undef TP_STRUCT__entry

[RFC PATCH v2 5/7] thermal: add a basic cpu power actor

2014-05-20 Thread Javi Merino
Introduce a power actor for cpus.  It has a basic power model to get
the current power utilization and uses cpufreq cooling devices to set
the desired power.  It uses the current frequency (as reported by
cpufreq) as well as load and OPPs for the power calculations.  The
cpus must have registered their OPPs in the OPP library.
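
The dynamic power model that this patch documents (Pdyn = Kd *
Voltage^2 * Frequency * Utilisation) can be sketched as below.  This is
a floating-point illustration and not the code in cpu_actor.c:
cpu_dynamic_power_mw() is a hypothetical name, and the units (Kd in
mW/(MHz * V^2), voltage in volts, frequency in MHz) are assumptions
chosen for readability.

```c
/*
 * Sketch of Pdyn = Kd * Voltage^2 * Frequency * Utilisation.
 * Assumed units: kd in mW/(MHz * V^2), volts in volts, freq_mhz in
 * MHz, utilisation as a fraction in [0, 1].  Returns milliwatts.
 */
static double cpu_dynamic_power_mw(double kd, double volts,
				   double freq_mhz, double utilisation)
{
	/* Clamp utilisation so a bad load estimate cannot go negative
	 * or exceed the fully-busy case. */
	if (utilisation < 0.0)
		utilisation = 0.0;
	if (utilisation > 1.0)
		utilisation = 1.0;

	return kd * volts * volts * freq_mhz * utilisation;
}
```

Because voltage enters squared, moving to a lower OPP (lower voltage
and frequency together) reduces power much faster than reducing
utilisation alone.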

Cc: Zhang Rui <rui.zh...@intel.com>
Cc: Eduardo Valentin <edubez...@gmail.com>
Signed-off-by: Punit Agrawal <punit.agra...@arm.com>
Signed-off-by: Javi Merino <javi.mer...@arm.com>
---
 Documentation/thermal/power_actor.txt |   46 
 drivers/thermal/Kconfig   |5 +
 drivers/thermal/power_actor/Kconfig   |9 +
 drivers/thermal/power_actor/Makefile  |2 +
 drivers/thermal/power_actor/cpu_actor.c   |  419 +
 drivers/thermal/power_actor/power_actor.h |   23 ++
 6 files changed, 504 insertions(+)
 create mode 100644 drivers/thermal/power_actor/Kconfig
 create mode 100644 drivers/thermal/power_actor/cpu_actor.c

diff --git a/Documentation/thermal/power_actor.txt b/Documentation/thermal/power_actor.txt
index a0f06e091907..d74909376610 100644
--- a/Documentation/thermal/power_actor.txt
+++ b/Documentation/thermal/power_actor.txt
@@ -27,3 +27,49 @@ Callbacks
 milliwatts.
 
 Returns 0 on success, -E* on error.
+
+CPU Power Actor API
+===================
+A simple power model for CPUs.  The current power is calculated as
+dynamic power.  The dynamic power consumption of a processor depends
+on many factors.  For a given processor implementation the primary
+factors are:
+
+- The time the processor spends running, consuming dynamic power, as
+  compared to the time in idle states where dynamic consumption is
+  negligible.  Herein we refer to this as 'utilisation'.
+- The voltage and frequency levels as a result of DVFS.  The DVFS
+  level is a dominant factor governing power consumption.
+- In running time the 'execution' behaviour (instruction types, memory
+  access patterns and so forth) causes, in most cases, a second order
+  variation.  In pathological cases this variation can be significant,
+  but typically it is of a much lesser impact than the factors above.
+
+A high level dynamic power consumption model may then be represented as:
+
+Pdyn = f(run) * Voltage^2 * Frequency * Utilisation
+
+f(run) here represents the described execution behaviour and its
+result has units of Watts/Hz/Volt^2 (this is often expressed in
+mW/MHz/uVolt^2).
+
+The detailed behaviour for f(run) could be modelled on-line.  However,
+in practice, such an on-line model has dependencies on a number of
+implementation specific processor support and characterisation
+factors.  Therefore, in the initial implementation that contribution is
+represented as a constant coefficient.  This is a simplification
+consistent with the relative contribution to overall power variation.
+
+In this simplified representation our model becomes:
+
+Pdyn = Kd * Voltage^2 * Frequency * Utilisation
+
+Where Kd (capacitance) represents an indicative running time dynamic
+power coefficient in fundamental units of mW/MHz/uVolt^2
+
+This power model requires that the operating-points of the CPUs are
+registered using the kernel's opp library and the
+`cpufreq_frequency_table` is assigned to the `struct device` of the
+cpu.  If you are using the `cpufreq-cpu0.c` driver then the
+`cpufreq_frequency_table` should already be assigned to the cpu
+device.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 47e2f15537ca..1818c4fa60b8 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -92,6 +92,11 @@ config THERMAL_GOV_USER_SPACE
 config THERMAL_POWER_ACTOR
bool
 
+menu "Power actors"
+depends on THERMAL_POWER_ACTOR
+source "drivers/thermal/power_actor/Kconfig"
+endmenu
+
 config CPU_THERMAL
 	bool "generic cpu cooling support"
depends on CPU_FREQ
diff --git a/drivers/thermal/power_actor/Kconfig b/drivers/thermal/power_actor/Kconfig
new file mode 100644
index 000000000000..fa542ca99cdb
--- /dev/null
+++ b/drivers/thermal/power_actor/Kconfig
@@ -0,0 +1,9 @@
+#
+# Thermal power actor configuration
+#
+
+config THERMAL_POWER_ACTOR_CPU
+   bool
+	prompt "Simple power model for a CPU"
+   help
+ A simple CPU power model
diff --git a/drivers/thermal/power_actor/Makefile b/drivers/thermal/power_actor/Makefile
index 46478f4928be..6f04b92997e6 100644
--- a/drivers/thermal/power_actor/Makefile
+++ b/drivers/thermal/power_actor/Makefile
@@ -3,3 +3,5 @@
 #
 
 obj-y += power_actor.o
+
+obj-$(CONFIG_THERMAL_POWER_ACTOR_CPU) += cpu_actor.o
diff --git a/drivers/thermal/power_actor/cpu_actor.c b/drivers/thermal/power_actor/cpu_actor.c
new file mode 100644
index 000000000000..0d76d52609fa
--- /dev/null
+++ b/drivers/thermal/power_actor/cpu_actor.c
@@ -0,0 +1,419 @@
+/*
+ * A basic cpu actor
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU

[RFC PATCH v2 2/7] thermal: document struct thermal_zone_device and thermal_governor

2014-05-20 Thread Javi Merino
Document struct thermal_zone_device and struct thermal_governor fields
and their use by the thermal framework code.

Cc: Zhang Rui <rui.zh...@intel.com>
Cc: Eduardo Valentin <edubez...@gmail.com>
Signed-off-by: Javi Merino <javi.mer...@arm.com>
---

Hi linux-pm,

This was sent as a separate patch to linux-pm and can be merged
independently, as it documents the current thermal framework.

 include/linux/thermal.h |   44 ++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f7e11c7ea7d9..9b7cb804e03f 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -158,6 +158,40 @@ struct thermal_attr {
char name[THERMAL_NAME_LENGTH];
 };
 
+/**
+ * struct thermal_zone_device - structure for a thermal zone
+ * @id:		unique id number for each thermal zone
+ * @type:	the thermal zone device type
+ * @device:	struct device for this thermal zone
+ * @trip_temp_attrs:	attributes for trip points for sysfs: trip temperature
+ * @trip_type_attrs:	attributes for trip points for sysfs: trip type
+ * @trip_hyst_attrs:	attributes for trip points for sysfs: trip hysteresis
+ * @devdata:	private pointer for device private data
+ * @trips:	number of trip points the thermal zone supports
+ * @passive_delay:	number of milliseconds to wait between polls when
+ *			performing passive cooling.  Only used by the step-wise
+ *			governor
+ * @polling_delay:	number of milliseconds to wait between polls when
+ *			checking whether trip points have been crossed (0 for
+ *			interrupt driven systems)
+ * @temperature:	current temperature.  This is only for core code,
+ *			drivers should use thermal_zone_get_temp() to get the
+ *			current temperature
+ * @last_temperature:	previous temperature read
+ * @emul_temperature:	emulated temperature when using CONFIG_THERMAL_EMULATION
+ * @passive:		step-wise specific parameter.  1 if you've crossed a passive
+ *			trip point, 0 otherwise
+ * @forced_passive:	step-wise specific parameter.  If > 0, temperature at
+ *			which to switch on all cpufreq cooling devices.
+ * @ops:	operations this thermal_zone_device supports
+ * @tzp:	thermal zone parameters
+ * @governor:	pointer to the governor for this thermal zone
+ * @thermal_instances:	list of struct thermal_instance of this thermal zone
+ * @idr:	struct idr to generate unique id for this zone's cooling devices
+ * @lock:	lock to protect thermal_instances list
+ * @node:	node in thermal_tz_list (in thermal_core.c)
+ * @poll_queue:	delayed work for polling
+ */
 struct thermal_zone_device {
int id;
char type[THERMAL_NAME_LENGTH];
@@ -179,12 +213,18 @@ struct thermal_zone_device {
struct thermal_governor *governor;
struct list_head thermal_instances;
struct idr idr;
-   struct mutex lock; /* protect thermal_instances list */
+   struct mutex lock;
struct list_head node;
struct delayed_work poll_queue;
 };
 
-/* Structure that holds thermal governor information */
+/**
+ * struct thermal_governor - structure that holds thermal governor information
+ * @name:  name of the governor
+ * @throttle:  callback called for every trip point even if temperature is
+ * below the trip point temperature
+ * @governor_list: node in thermal_governor_list (in thermal_core.c)
+ */
 struct thermal_governor {
char name[THERMAL_NAME_LENGTH];
int (*throttle)(struct thermal_zone_device *tz, int trip);
-- 
1.7.9.5




[RFC PATCH v2 4/7] thermal: introduce the Power Actor API

2014-05-20 Thread Javi Merino
This patch introduces the Power Actor API in the thermal framework.
With it, devices that can report their power consumption and control
it can be registered.  This base interface is meant to be used to
derive specific power actors, such as a cpu power actor.
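
The shape of such an interface can be sketched in userspace C.  All
names below (demo_actor, demo_actor_ops, demo_actor_init, fake_dev) are
hypothetical and only mirror the two callbacks that the documentation
in this patch describes; the check that both callbacks are present is
an assumption about what registration validates.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_actor;

/* The two callbacks every actor provides, both in milliwatts. */
struct demo_actor_ops {
	uint32_t (*get_req_power)(struct demo_actor *actor);
	int (*set_power)(struct demo_actor *actor, uint32_t power);
};

struct demo_actor {
	const struct demo_actor_ops *ops;
	uint32_t max_power;	/* milliwatts */
	void *privdata;		/* device-specific state */
};

/* Initialise an actor; refuse ops that lack either callback. */
static int demo_actor_init(struct demo_actor *actor,
			   const struct demo_actor_ops *ops,
			   uint32_t max_power, void *privdata)
{
	if (!actor || !ops || !ops->get_req_power || !ops->set_power)
		return -1;

	actor->ops = ops;
	actor->max_power = max_power;
	actor->privdata = privdata;
	return 0;
}

/* A made-up device that remembers what it asked for and was given. */
struct fake_dev {
	uint32_t requested_mw;
	uint32_t granted_mw;
};

static uint32_t fake_get_req_power(struct demo_actor *actor)
{
	struct fake_dev *dev = actor->privdata;

	return dev->requested_mw;
}

static int fake_set_power(struct demo_actor *actor, uint32_t power)
{
	struct fake_dev *dev = actor->privdata;

	if (power > actor->max_power)
		return -1;
	dev->granted_mw = power;
	return 0;
}

static const struct demo_actor_ops fake_ops = {
	.get_req_power = fake_get_req_power,
	.set_power = fake_set_power,
};
```

A specific actor, such as one for cpus, would supply its own ops and
keep its model parameters behind privdata.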

Cc: Zhang Rui <rui.zh...@intel.com>
Cc: Eduardo Valentin <edubez...@gmail.com>
Signed-off-by: Javi Merino <javi.mer...@arm.com>
---
 Documentation/thermal/power_actor.txt |   29 +
 drivers/thermal/Kconfig   |3 ++
 drivers/thermal/Makefile  |2 +
 drivers/thermal/power_actor/Makefile  |5 +++
 drivers/thermal/power_actor/power_actor.c |   66 +
 drivers/thermal/power_actor/power_actor.h |   63 +++
 6 files changed, 168 insertions(+)
 create mode 100644 Documentation/thermal/power_actor.txt
 create mode 100644 drivers/thermal/power_actor/Makefile
 create mode 100644 drivers/thermal/power_actor/power_actor.c
 create mode 100644 drivers/thermal/power_actor/power_actor.h

diff --git a/Documentation/thermal/power_actor.txt b/Documentation/thermal/power_actor.txt
new file mode 100644
index 000000000000..a0f06e091907
--- /dev/null
+++ b/Documentation/thermal/power_actor.txt
@@ -0,0 +1,29 @@
+
+Power Actor API
+===============
+
+The base power actor API is meant to be used to derive specific power
+actors, such as a cpu power actor.  When registering, they should call
+`power_actor_register()` with a unique `enum power_actor_types`.  When
+unregistering, the power actor should call `power_actor_unregister()`
+with the `struct power_actor *` received in the call to
+`power_actor_register()`.
+
+Callbacks
+---------
+
+1. u32 get_req_power(struct power_actor *actor)
+@actor: a valid `struct power_actor *` registered with
+`power_actor_register()`
+
+`get_req_power()` returns the current requested power in milliwatts.
+
+2. int set_power(struct power_actor *actor, u32 power)
+@actor: a valid `struct power_actor *` registered with
+`power_actor_register()`
+@power: power in milliwatts
+
+`set_power()` should configure the device to consume @power
+milliwatts.
+
+Returns 0 on success, -E* on error.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 2d51912a6e40..47e2f15537ca 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -89,6 +89,9 @@ config THERMAL_GOV_USER_SPACE
help
  Enable this to let the user space manage the platform thermals.
 
+config THERMAL_POWER_ACTOR
+   bool
+
 config CPU_THERMAL
 	bool "generic cpu cooling support"
depends on CPU_FREQ
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 54e4ec9eb5df..878a02cab7d1 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -14,6 +14,8 @@ thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE)  += fair_share.o
 thermal_sys-$(CONFIG_THERMAL_GOV_STEP_WISE)+= step_wise.o
 thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)   += user_space.o
 
+obj-$(CONFIG_THERMAL_POWER_ACTOR) += power_actor/
+
 # cpufreq cooling
 thermal_sys-$(CONFIG_CPU_THERMAL)  += cpu_cooling.o
 
diff --git a/drivers/thermal/power_actor/Makefile b/drivers/thermal/power_actor/Makefile
new file mode 100644
index 000000000000..46478f4928be
--- /dev/null
+++ b/drivers/thermal/power_actor/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for the power actors
+#
+
+obj-y += power_actor.o
diff --git a/drivers/thermal/power_actor/power_actor.c b/drivers/thermal/power_actor/power_actor.c
new file mode 100644
index 000000000000..3edcb1ab4dff
--- /dev/null
+++ b/drivers/thermal/power_actor/power_actor.c
@@ -0,0 +1,66 @@
+/*
+ * Basic interface for power actors
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#define pr_fmt(fmt) "Power actor: " fmt
+
+#include <linux/err.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+
+#include "power_actor.h"
+
+LIST_HEAD(actor_list);
+
+/**
+ * power_actor_register - Register an actor in the power actor API
+ * @type:  actor type
+ * @ops:   struct power_actor_ops for this actor
+ * @max_power: maximum power that this actor can consume
+ * @privdata:  pointer to private data related to the actor
+ *
+ * Returns the struct power_actor * on success, ERR_PTR() on failure
+ */
+struct power_actor *power_actor_register(enum power_actor_types type,
+   struct power_actor_ops *ops,
+   u32 max_power, void *privdata)
+{
+   struct power_actor *actor;
+
+	if (!ops->get_req_power || !ops

[RFC PATCH v2 0/7] The power allocator thermal governor

2014-05-20 Thread Javi Merino
Hi linux-pm,

This is v2 of the RFC we sent in [1].  The power allocator governor
allocates device power to control temperature.  This requires
transforming performance requests into requested power, which we do
with the aid of power models.  Patch 5 (thermal: add a basic cpu power
actor) implements a simple power model for cpus.  The division of
power between the actors ensures that power is allocated where it is
needed the most, based on the current workload.
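
One simple way to divide a power budget between actors is a
proportional split on what each one requests.  The sketch below is an
assumption for illustration, not necessarily the scheme implemented in
patch 6: divide_power() is a hypothetical name, and rounding down means
a few milliwatts of the budget may be left unallocated.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Split budget_mw between nr actors in proportion to req_mw[].
 * If the requests fit in the budget, every actor gets what it asked
 * for; otherwise each share is scaled down by budget / total request.
 */
static void divide_power(const uint32_t *req_mw, uint32_t *granted_mw,
			 size_t nr, uint32_t budget_mw)
{
	uint64_t total_req = 0;

	for (size_t i = 0; i < nr; i++)
		total_req += req_mw[i];

	for (size_t i = 0; i < nr; i++) {
		if (total_req <= budget_mw) {
			granted_mw[i] = req_mw[i];
			continue;
		}
		/* total_req > budget_mw >= 0 here, so it is non-zero.
		 * The 64-bit product cannot overflow. */
		granted_mw[i] = (uint32_t)(((uint64_t)req_mw[i] * budget_mw) /
					   total_req);
	}
}
```

With requests of 3000 mW and 1000 mW and a budget of 2000 mW, the
actors are granted 1500 mW and 500 mW respectively.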

[1] http://thread.gmane.org/gmane.linux.power-management.general/45000 

Patches 1 and 2 are not strictly part of this series and can be merged
separately.  Patch 1 (tracing: Add __bitmask() macro to trace events
to cpumasks and other bitmasks) is already in for-next.  Patch 2
(thermal: document struct thermal_zone_device and thermal_governor)
has already been submitted to linux-pm[2] and is generic.

[2] http://article.gmane.org/gmane.linux.power-management.general/45434

Changes since v1:
  - Fixed finding cpufreq cooling devices in cpufreq_frequency_change()
  - Replaced the reworked cooling device interface with a separate
power actor API
  - Addressed most of Eduardo's comments
  - Incorporated ftrace support for bitmask to trace cpumasks

Todo:
  - Add static power to the cpu power model
  - Change the PI controller into a PID controller
  - Turn power actors into a device
  - Let platforms override the power allocator governor parameters
  - Add more tracing and provide scripts to evaluate the proposal.
  - Tune it to achieve the temperature stability we are aiming for

Cheers,
Javi & Punit

Javi Merino (6):
  thermal: document struct thermal_zone_device and thermal_governor
  thermal: let governors have private data for each thermal zone
  thermal: introduce the Power Actor API
  thermal: add a basic cpu power actor
  thermal: introduce the Power Allocator governor
  thermal: add trace events to the power allocator governor

Steven Rostedt (Red Hat) (1):
  tracing: Add __bitmask() macro to trace events to cpumasks and other
bitmasks

 Documentation/thermal/power_actor.txt |   75 +
 Documentation/thermal/power_allocator.txt |   42 +++
 drivers/thermal/Kconfig   |   23 ++
 drivers/thermal/Makefile  |3 +
 drivers/thermal/power_actor/Kconfig   |9 +
 drivers/thermal/power_actor/Makefile  |7 +
 drivers/thermal/power_actor/cpu_actor.c   |  424 +++
 drivers/thermal/power_actor/power_actor.c |   66 +
 drivers/thermal/power_actor/power_actor.h |   86 ++
 drivers/thermal/power_allocator.c |  452 +
 drivers/thermal/thermal_core.c|   90 +-
 drivers/thermal/thermal_core.h|8 +
 include/linux/ftrace_event.h  |3 +
 include/linux/thermal.h   |   58 +++-
 include/linux/trace_seq.h |   10 +
 include/trace/events/thermal.h|   38 +++
 include/trace/events/thermal_governor.h   |   35 +++
 include/trace/ftrace.h|   57 +++-
 kernel/trace/trace_output.c   |   41 +++
 19 files changed, 1515 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/thermal/power_actor.txt
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/power_actor/Kconfig
 create mode 100644 drivers/thermal/power_actor/Makefile
 create mode 100644 drivers/thermal/power_actor/cpu_actor.c
 create mode 100644 drivers/thermal/power_actor/power_actor.c
 create mode 100644 drivers/thermal/power_actor/power_actor.h
 create mode 100644 drivers/thermal/power_allocator.c
 create mode 100644 include/trace/events/thermal.h
 create mode 100644 include/trace/events/thermal_governor.h

-- 
1.7.9.5




[RFC PATCH v2 6/7] thermal: introduce the Power Allocator governor

2014-05-20 Thread Javi Merino
The power allocator governor is a thermal governor that controls system
and device power allocation to control temperature.  Conceptually, the
implementation takes a system view of heat dissipation by managing
multiple heat sources.

This governor relies on power-aware cooling devices (power actors) to
operate.  That is, cooling devices whose thermal_cooling_device_ops
accept THERMAL_UNIT_POWER.

It uses a Proportional Integral (PI) controller driven by the
temperature of the thermal zone to compute a total power budget.  This
budget is then allocated to each cooling device that can have bearing
on the temperature we are trying to control.  The governor decides how
much power to give each cooling device based on the performance they
are requesting.  The PI controller sizes the total power budget so
that the temperature does not exceed the control temperature.
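
A textbook PI step of the kind described above can be sketched as
follows.  This is a floating-point illustration under assumptions, not
the code in power_allocator.c (which works in fixed point): pi_state,
pi_power_budget() and the gains k_p and k_i are hypothetical, and no
anti-windup handling is shown.

```c
struct pi_state {
	double k_p;		/* proportional gain, mW per degree */
	double k_i;		/* integral gain, mW per degree */
	double err_integral;	/* accumulated temperature error */
};

/*
 * One controller step: returns the total power budget in milliwatts.
 * A positive error (zone cooler than the control temperature) raises
 * the budget above the sustainable power, a negative one lowers it.
 */
static double pi_power_budget(struct pi_state *st, double control_temp,
			      double current_temp, double sustainable_mw,
			      double max_allocatable_mw)
{
	double err = control_temp - current_temp;
	double budget;

	st->err_integral += err;
	budget = sustainable_mw + st->k_p * err + st->k_i * st->err_integral;

	/* Never hand out negative power or more than the actors can use. */
	if (budget < 0.0)
		budget = 0.0;
	if (budget > max_allocatable_mw)
		budget = max_allocatable_mw;

	return budget;
}
```

The sustainable power acts as the operating point: when the error and
its integral are both zero, the budget equals that estimate.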

Cc: Zhang Rui <rui.zh...@intel.com>
Cc: Eduardo Valentin <edubez...@gmail.com>
Signed-off-by: Punit Agrawal <punit.agra...@arm.com>
Signed-off-by: Javi Merino <javi.mer...@arm.com>
---
 Documentation/thermal/power_allocator.txt |   42 +++
 drivers/thermal/Kconfig   |   15 +
 drivers/thermal/Makefile  |1 +
 drivers/thermal/power_allocator.c |  442 +
 drivers/thermal/thermal_core.c|7 +-
 drivers/thermal/thermal_core.h|8 +
 include/linux/thermal.h   |5 +
 7 files changed, 519 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/power_allocator.c

diff --git a/Documentation/thermal/power_allocator.txt b/Documentation/thermal/power_allocator.txt
new file mode 100644
index 000000000000..daedf117611a
--- /dev/null
+++ b/Documentation/thermal/power_allocator.txt
@@ -0,0 +1,42 @@
+
+Integration of the power_allocator governor in a platform
+=========================================================
+
+Registering thermal_zone_device
+-------------------------------
+
+An estimate of the sustainable dissipatable power (in mW) should be
+provided while registering the thermal zone.  This is the maximum
+sustained power for allocation at the desired maximum temperature.
+This number can vary with operating conditions, but the closed loop
+of the controller should take care of those variations, so
+`max_dissipatable_power` only needs to be an estimate.  Register your
+thermal zone with `thermal_zone_params` that have a
+`max_dissipatable_power`.  If you weren't passing any
+`thermal_zone_params`, then something like this will do:
+
+   static const struct thermal_zone_params tz_params = {
+   .max_dissipatable_power = 3500,
+   };
+
+and then pass `tz_params` as the 6th parameter to
+`thermal_zone_device_register()`.
+
+Trip points
+-----------
+
+The governor requires the following two trip points:
+
+1.  switch on trip point: temperature above which the governor
+control loop starts operating
+2.  desired temperature trip point: it should be higher than the
+switch on trip point. It is the target temperature the governor
+is controlling for.
+
+The trip points can be either active or passive.
+
+Power actors
+------------
+
+Devices controlled by this governor must be registered with the power
+actor API.  Read `power_actor.txt` for more information about them.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 1818c4fa60b8..e5b338a7cab9 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -71,6 +71,14 @@ config THERMAL_DEFAULT_GOV_USER_SPACE
  Select this if you want to let the user space manage the
  platform thermals.
 
+config THERMAL_DEFAULT_GOV_POWER_ALLOCATOR
+	bool "power_allocator"
+   select THERMAL_GOV_POWER_ALLOCATOR
+   help
+ Select this if you want to control temperature based on
+ system and device power allocation. This governor relies on
+ power actors to operate.
+
 endchoice
 
 config THERMAL_GOV_FAIR_SHARE
@@ -89,6 +97,13 @@ config THERMAL_GOV_USER_SPACE
help
  Enable this to let the user space manage the platform thermals.
 
+config THERMAL_GOV_POWER_ALLOCATOR
+	bool "Power allocator thermal governor"
+   select THERMAL_POWER_ACTOR
+   help
+ Enable this to manage platform thermals by dynamically
+ allocating and limiting power to devices.
+
 config THERMAL_POWER_ACTOR
bool
 
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 878a02cab7d1..c5b47f058675 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -13,6 +13,7 @@ thermal_sys-$(CONFIG_THERMAL_OF)  += of-thermal.o
 thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE)   += fair_share.o
 thermal_sys-$(CONFIG_THERMAL_GOV_STEP_WISE)+= step_wise.o
 thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)   += user_space.o
+thermal_sys-$(CONFIG_THERMAL_GOV_POWER_ALLOCATOR)  += power_allocator.o
 
 obj-$(CONFIG_THERMAL_POWER_ACTOR) += power_actor/
 
diff

[RFC PATCH v2 7/7] thermal: add trace events to the power allocator governor

2014-05-20 Thread Javi Merino
Add trace events for the power allocator governor and the power actor
interface of the cpu cooling device.

Cc: Zhang Rui <rui.zh...@intel.com>
Cc: Eduardo Valentin <edubez...@gmail.com>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
Signed-off-by: Javi Merino <javi.mer...@arm.com>

---

trace-cmd needs the patch attached in
http://article.gmane.org/gmane.linux.kernel/1704423 for this to work.

 drivers/thermal/power_actor/cpu_actor.c |5 
 drivers/thermal/power_allocator.c   |   12 +-
 include/trace/events/thermal.h  |   38 +++
 include/trace/events/thermal_governor.h |   35 
 4 files changed, 89 insertions(+), 1 deletion(-)
 create mode 100644 include/trace/events/thermal.h
 create mode 100644 include/trace/events/thermal_governor.h

diff --git a/drivers/thermal/power_actor/cpu_actor.c b/drivers/thermal/power_actor/cpu_actor.c
index 0d76d52609fa..61f1edc13ec2 100644
--- a/drivers/thermal/power_actor/cpu_actor.c
+++ b/drivers/thermal/power_actor/cpu_actor.c
@@ -27,6 +27,9 @@
 #include <linux/printk.h>
 #include <linux/slab.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/thermal.h>
+
 #include "power_actor.h"
 
 /**
@@ -188,6 +191,8 @@ static int cpu_set_power(struct power_actor *actor, u32 power)
 		return -EINVAL;
 	}
 
+	trace_thermal_power_limit(&cpu_actor->cpumask, freq, cdev_state, power);
+
 	return cdev->ops->set_cur_state(cdev, cdev_state);
 }
 
diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 836c834a898c..b1ebcecb1c15 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -19,6 +19,9 @@
 #include <linux/slab.h>
 #include <linux/thermal.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/thermal_governor.h>
+
 #include "power_actor/power_actor.h"
 #include "thermal_core.h"
 
@@ -117,7 +120,14 @@ static u32 pi_controller(struct thermal_zone_device *tz,
 	power_range = tz->tzp->max_dissipatable_power +
 		frac_to_int(power_range);
 
-	return clamp(power_range, (s64)0, (s64)max_allocatable_power);
+	power_range = clamp(power_range, (s64)0, (s64)max_allocatable_power);
+
+	trace_thermal_power_allocator_pi(frac_to_int(err),
+					frac_to_int(params->err_integral),
+					frac_to_int(p), frac_to_int(i),
+					power_range);
+
+	return power_range;
 }
 
 /**
diff --git a/include/trace/events/thermal.h b/include/trace/events/thermal.h
new file mode 100644
index 000000000000..6496da62276d
--- /dev/null
+++ b/include/trace/events/thermal.h
@@ -0,0 +1,38 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM thermal
+
+#if !defined(_TRACE_THERMAL_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_THERMAL_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(thermal_power_limit,
+	TP_PROTO(const struct cpumask *cpus, unsigned int freq,
+		unsigned long cdev_state, unsigned long power),
+
+	TP_ARGS(cpus, freq, cdev_state, power),
+
+	TP_STRUCT__entry(
+		__bitmask(cpumask, num_possible_cpus())
+		__field(unsigned int,  freq      )
+		__field(unsigned long, cdev_state)
+		__field(unsigned long, power     )
+	),
+
+	TP_fast_assign(
+		__assign_bitmask(cpumask, cpumask_bits(cpus),
+				num_possible_cpus());
+		__entry->freq = freq;
+		__entry->cdev_state = cdev_state;
+		__entry->power = power;
+	),
+
+	TP_printk("cpus=%s freq=%u cdev_state=%lu power=%lu",
+		__get_bitmask(cpumask), __entry->freq, __entry->cdev_state,
+		__entry->power)
+);
+
+#endif /* _TRACE_THERMAL_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/trace/events/thermal_governor.h b/include/trace/events/thermal_governor.h
new file mode 100644
index 000000000000..1fbf5c91f659
--- /dev/null
+++ b/include/trace/events/thermal_governor.h
@@ -0,0 +1,35 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM thermal_governor
+
+#if !defined(_TRACE_THERMAL_GOVERNOR_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_THERMAL_GOVERNOR_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(thermal_power_allocator_pi,
+	TP_PROTO(s32 err, s32 err_integral, s64 p, s64 i, s32 output),
+	TP_ARGS(err, err_integral, p, i, output),
+	TP_STRUCT__entry(
+		__field(s32, err         )
+		__field(s32, err_integral)
+		__field(s64, p           )
+		__field(s64, i           )
+		__field(s32, output      )
+	),
+	TP_fast_assign(
+		__entry->err = err;
+		__entry->err_integral = err_integral;
+		__entry->p = p

Re: [RFC PATCH v2 1/7] tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks

2014-05-21 Thread Javi Merino
Hi Steve,

On Wed, May 21, 2014 at 03:07:23AM +0100, Steven Rostedt wrote:
> Hmm, I didn't think about cross tree dependencies. I already pushed this
> patch to my for-next branch which is already in linux-next, and I do not
> rebase this branch unless there's a really good need to.
> 
> I guess I needed to make a separate branch that you could have pulled
> separately. I'm not sure how we want to proceed, unless you wait till
> Linus pulls my branch before you add this to your tree.
> 
> Maybe it would be OK to cherry pick it? I'm not sure Linus would want
> that.
> 
> Maybe I can make a separate branch that only has this patch and merge it
> into my tree, where git will handle the duplicate. But then we have a
> strange history.
> 
> How urgent is your change? Can it wait till my stuff makes it into
> Linus's tree in the 3.16 merge window?

Sorry for the confusion.  I was expecting you to merge this patch via
your tree.  I said that in the cover letter for the series, but I
should've repeated it in this patch.  I included it in the series for
completeness and I'll remove it once it reaches mainline.

Cheers,
Javi

 On Tue, 2014-05-20 at 15:10 +0100, Javi Merino wrote:
  From: Steven Rostedt (Red Hat) rost...@goodmis.org
  
  Being able to show a cpumask of events can be useful as some events
  may affect only some CPUs. There is no standard way to record the
  cpumask and converting it to a string is rather expensive during
  the trace as traces happen in hotpaths. It would be better to record
  the raw event mask and be able to parse it at print time.
  
  The following macros were added for use with the TRACE_EVENT() macro:
  
__bitmask()
__assign_bitmask()
__get_bitmask()
  
  To test this, I added this to the sched_migrate_task event, which
  looked like this:
  
  TRACE_EVENT(sched_migrate_task,
  
      TP_PROTO(struct task_struct *p, int dest_cpu, const struct cpumask *cpus),
  
      TP_ARGS(p, dest_cpu, cpus),
  
      TP_STRUCT__entry(
          __array(    char,   comm,   TASK_COMM_LEN   )
          __field(    pid_t,  pid                     )
          __field(    int,    prio                    )
          __field(    int,    orig_cpu                )
          __field(    int,    dest_cpu                )
          __bitmask(  cpumask, num_possible_cpus()    )
      ),
  
      TP_fast_assign(
          memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
          __entry->pid        = p->pid;
          __entry->prio       = p->prio;
          __entry->orig_cpu   = task_cpu(p);
          __entry->dest_cpu   = dest_cpu;
          __assign_bitmask(cpumask, cpumask_bits(cpus), num_possible_cpus());
      ),
  
      TP_printk("comm=%s pid=%d prio=%d orig_cpu=%d dest_cpu=%d cpumask=%s",
            __entry->comm, __entry->pid, __entry->prio,
            __entry->orig_cpu, __entry->dest_cpu,
            __get_bitmask(cpumask))
  );
  
  With the output of:
  
  ksmtuned-3613  [003] d..2   485.220508: sched_migrate_task: 
  comm=ksmtuned pid=3615 prio=120 orig_cpu=3 dest_cpu=2 
  cpumask=,000f
   migration/1-13[001] d..5   485.221202: sched_migrate_task: 
  comm=ksmtuned pid=3614 prio=120 orig_cpu=1 dest_cpu=0 
  cpumask=,000f
   awk-3615  [002] d.H5   485.221747: sched_migrate_task: 
  comm=rcu_preempt pid=7 prio=120 orig_cpu=0 dest_cpu=1 
  cpumask=,00ff
   migration/2-18[002] d..5   485.222062: sched_migrate_task: 
  comm=ksmtuned pid=3615 prio=120 orig_cpu=2 dest_cpu=3 
  cpumask=,000f
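
The record-now, format-later idea described above can be sketched in plain userspace C. This is only an illustration, not the tracing implementation: struct migrate_record, record_migration(), format_bitmask() and MASK_WORDS are hypothetical names, and the comma-separated hexadecimal layout is an assumption chosen to resemble the cpumask= field in the output.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MASK_WORDS 2	/* hypothetical: room for 64 CPUs */

/* What the hot path stores: raw mask words, no string formatting. */
struct migrate_record {
	int orig_cpu;
	int dest_cpu;
	uint32_t cpumask[MASK_WORDS];
};

/* Hot path: a plain copy of the mask words. */
static void record_migration(struct migrate_record *rec, int orig_cpu,
			     int dest_cpu, const uint32_t *mask)
{
	rec->orig_cpu = orig_cpu;
	rec->dest_cpu = dest_cpu;
	memcpy(rec->cpumask, mask, sizeof(rec->cpumask));
}

/*
 * Print time: format the words, most significant first, separated by
 * commas.  Returns the string length, or -1 if buf is too small.
 */
static int format_bitmask(const uint32_t *words, size_t nwords,
			  char *buf, size_t len)
{
	size_t used = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (size_t i = nwords; i-- > 0; ) {
		int n = snprintf(buf + used, len - used, "%08x%s",
				 (unsigned int)words[i], i ? "," : "");

		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

The hot path only pays for a memcpy(); the snprintf() calls happen when the record is printed.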
  
  Link: 
  http://lkml.kernel.org/r/1399377998-14870-6-git-send-email-javi.mer...@arm.com
  Link: http://lkml.kernel.org/r/20140506132238.22e13...@gandalf.local.home
  
  Suggested-by: Javi Merino javi.mer...@arm.com
  Tested-by: Javi Merino javi.mer...@arm.com
  Signed-off-by: Steven Rostedt rost...@goodmis.org
 
 


Re: [RFC][PATCH v3] tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks

2014-05-15 Thread Javi Merino
On Thu, May 15, 2014 at 12:36:09PM +0100, Steven Rostedt wrote:
 On Thu, 15 May 2014 07:34:05 -0400
 Steven Rostedt rost...@goodmis.org wrote:
 
 
  Should work now. Thanks for testing.
 
 If all goes well, can you please give me your Tested-by for both
 patches. I've already run my updated bitmask kernel patch through my
 tests (although, there's still no event that uses it).

I've tested it on my setup and it works beautifully now.  You can add
my Tested-by to both the kernel and the trace-cmd patches.  I'll
include the kernel one together with a patch that uses it as part of
our next series, thanks!

 I'll push it to my for-next repo. I'll also work on updating
 libtraceevent with the trace-cmd patch such that perf gets the benefit
 of this new macro as well.

Yup, I guess we'll need a new version of trace-cmd to support this
when it gets into the kernel.

Thanks for the prompt response,
Javi



[PATCH] thermal: document struct thermal_zone_device and thermal_governor

2014-05-16 Thread Javi Merino
Document the fields of struct thermal_zone_device and struct
thermal_governor and their use by the thermal framework code.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin eduardo.valen...@ti.com
Signed-off-by: Javi Merino javi.mer...@arm.com

---

Hi linux-pm,

I have some patches that add new fields to these structures, but I
don't have a good place to describe those fields because these structs
are mostly undocumented, so I thought I'd document them.

I'm unsure about some of the descriptions, especially for passive and
forced_passive, so please review them.

 include/linux/thermal.h |   44 ++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f7e11c7ea7d9..af928c667dba 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -158,6 +158,40 @@ struct thermal_attr {
char name[THERMAL_NAME_LENGTH];
 };
 
+/**
+ * struct thermal_zone_device - structure for a thermal zone
+ * @id:		unique id number for each thermal zone
+ * @type:	the thermal zone device type
+ * @device:	struct device for this thermal zone
+ * @trip_temp_attrs:	attributes for trip points for sysfs: trip temperature
+ * @trip_type_attrs:	attributes for trip points for sysfs: trip type
+ * @trip_hyst_attrs:	attributes for trip points for sysfs: trip hysteresis
+ * @devdata:	private pointer for device private data
+ * @trips:	number of trip points the thermal zone supports
+ * @passive_delay:	number of milliseconds to wait between polls when
+ *			performing passive cooling.  Only used by the step-wise
+ *			governor
+ * @polling_delay:	number of milliseconds to wait between polls when
+ *			checking whether trip points have been crossed (0 for
+ *			interrupt driven systems)
+ * @temperature:	current temperature.  This is only for core code,
+ *			drivers should use thermal_zone_get_temp() to get the
+ *			current temperature
+ * @last_temperature:	previous temperature read
+ * @emul_temperature:	emulated temperature when using CONFIG_THERMAL_EMULATION
+ * @passive:		step-wise specific parameter.  1 if you've crossed a passive
+ *			trip point, 0 otherwise
+ * @forced_passive:	step-wise specific parameter.  If > 0, temperature at
+ *			which to switch on all cpufreq cooling devices.
+ * @ops:	operations this thermal_zone_device supports
+ * @tzp:	thermal zone parameters
+ * @governor:	pointer to the governor for this thermal zone
+ * @thermal_instances:	list of struct thermal_instance of this thermal zone
+ * @idr:	struct idr to generate unique id for this zone's cooling devices
+ * @lock:	lock to protect thermal_instances list
+ * @node:	node in thermal_tz_list (in thermal_core.c)
+ * @poll_queue:	delayed work for polling
+ */
 struct thermal_zone_device {
int id;
char type[THERMAL_NAME_LENGTH];
@@ -179,12 +213,18 @@ struct thermal_zone_device {
struct thermal_governor *governor;
struct list_head thermal_instances;
struct idr idr;
-   struct mutex lock; /* protect thermal_instances list */
+   struct mutex lock;
struct list_head node;
struct delayed_work poll_queue;
 };
 
-/* Structure that holds thermal governor information */
+/**
+ * struct thermal_governor - structure that holds thermal governor information
+ * @name:	name of the governor
+ * @throttle:	callback called for every trip point even if temperature is
+ *		below the trip point temperature
+ * @governor_list:	node in thermal_governor_list (in thermal_core.c)
+ */
 struct thermal_governor {
char name[THERMAL_NAME_LENGTH];
int (*throttle)(struct thermal_zone_device *tz, int trip);
-- 
1.7.9.5




Re: [RFC PATCH 2/5] thermal: cpu_cooling: Add notifications support for the clients

2014-05-16 Thread Javi Merino
Hi Amit,

On Thu, May 08, 2014 at 03:37:57PM +0100, Amit Daniel Kachhap wrote:
 This patch adds notification support for clients of the cpu_cooling
 APIs which may want to do something interesting after receiving these
 cpu_cooling events. The notifier structure passed is of both Set/Get type.
 The notification events can be of type,
 1. CPU_COOLING_SET_STATE_PRE
 2. CPU_COOLING_SET_STATE_POST
 3. CPU_COOLING_GET_CUR_STATE
 4. CPU_COOLING_GET_MAX_STATE
 
 The advantage of these notifications is that they differentiate between the
 P states in the cpufreq table and the cooling states. The clients of these
 events may group a few P states into one cooling state. Also, some more
 cooling states can be enabled when the maximum P state is reached. Post
 notifications can be used for those cases.
 
 Signed-off-by: Amit Daniel Kachhap amit.dan...@samsung.com
 ---
  drivers/thermal/cpu_cooling.c |   99 +++-
  include/linux/cpu_cooling.h   |   55 +++
  2 files changed, 151 insertions(+), 3 deletions(-)
 
 diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
 index 21f44d4..e2aeb36 100644
 --- a/drivers/thermal/cpu_cooling.c
 +++ b/drivers/thermal/cpu_cooling.c
 @@ -50,6 +50,7 @@ struct cpufreq_cooling_device {
   unsigned int cpufreq_state;
   unsigned int cpufreq_val;
   struct cpumask allowed_cpus;
 + struct cpufreq_cooling_status request_status;
   struct list_head node;
  };
  static DEFINE_IDR(cpufreq_idr);
 @@ -59,6 +60,8 @@ static DEFINE_MUTEX(cooling_cpufreq_lock);
  #define NOTIFY_INVALID NULL
  static struct cpufreq_cooling_device *notify_device;
  
 +/* Notfier list to validates/updates the cpufreq cooling states */
 +static BLOCKING_NOTIFIER_HEAD(cpufreq_cooling_state_notifier_list);
  /* A list to hold all the cpufreq cooling devices registered */
  static LIST_HEAD(cpufreq_cooling_list);
  
 @@ -266,6 +269,21 @@ static unsigned int get_cpu_frequency(unsigned int cpu, unsigned long level)
  	return freq;
  }
  
 +static int
 +cpufreq_cooling_notify_states(struct cpufreq_cooling_status *request,
 +				enum cpu_cooling_state_ops op)
 +{
 +	/* Invoke the notifiers which have registered for this state change */
 +	if (op == CPU_COOLING_SET_STATE_PRE ||
 +		op == CPU_COOLING_SET_STATE_POST ||
 +		op == CPU_COOLING_GET_MAX_STATE ||
 +		op == CPU_COOLING_GET_CUR_STATE) {
 +		blocking_notifier_call_chain(
 +			&cpufreq_cooling_state_notifier_list, op, request);
 +	}
 +	return 0;
 +}
 +
  /**
   * cpufreq_apply_cooling - function to apply frequency clipping.
   * @cpufreq_device: cpufreq_cooling_device pointer containing frequency
 @@ -285,9 +303,18 @@ static int cpufreq_apply_cooling(struct cpufreq_cooling_device *cpufreq_device,
  	struct cpumask *mask = &cpufreq_device->allowed_cpus;
  	unsigned int cpu = cpumask_any(mask);
  
 +	cpufreq_device->request_status.cur_state =
 +		cpufreq_device->cpufreq_state;
 +	cpufreq_device->request_status.new_state = cooling_state;
 +
 +	cpufreq_cooling_notify_states(&cpufreq_device->request_status,
 +		CPU_COOLING_SET_STATE_PRE);
 +
 +	cooling_state = cpufreq_device->request_status.new_state;
  
  	/* Check if the old cooling action is same as new cooling action */
 -	if (cpufreq_device->cpufreq_state == cooling_state)
 +	if (cpufreq_device->cpufreq_state ==
 +		cpufreq_device->request_status.new_state)
  		return 0;
  
  	clip_freq = get_cpu_frequency(cpu, cooling_state);
 @@ -304,7 +331,8 @@ static int cpufreq_apply_cooling(struct cpufreq_cooling_device *cpufreq_device,
  	}
  
  	notify_device = NOTIFY_INVALID;
 -
 +	cpufreq_cooling_notify_states(&cpufreq_device->request_status,
 +		CPU_COOLING_SET_STATE_POST);
  	return 0;
  }
  
 @@ -383,6 +411,11 @@ static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
  	if (count > 0)
  		*state = count;
  
 +	cpufreq_device->request_status.max_state = count;
 +	cpufreq_cooling_notify_states(&cpufreq_device->request_status,
 +		CPU_COOLING_GET_MAX_STATE);
 +	*state = cpufreq_device->request_status.max_state;
 +

I think this should all be inside the if (count > 0).  If not, then
remove it, as it is dead code now.
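
A minimal userspace sketch of the first option, with the notification and the final assignment both guarded by the count check. struct cooling_status, max_state_hook_t and get_max_state() are hypothetical stand-ins rather than the driver's code, and returning -EINVAL for the empty case is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the request structure in the patch. */
struct cooling_status {
	unsigned long max_state;
};

/* Hypothetical client hook that may adjust max_state. */
typedef void (*max_state_hook_t)(struct cooling_status *status);

/*
 * Only notify and only write *state when there is a valid count, so
 * the count check guards everything instead of being overwritten by
 * an unconditional assignment afterwards.
 */
static int get_max_state(unsigned int count, max_state_hook_t hook,
			 unsigned long *state)
{
	struct cooling_status status;

	if (count == 0)
		return -EINVAL;	/* *state is left untouched */

	status.max_state = count;
	if (hook)
		hook(&status);
	*state = status.max_state;

	return 0;
}

/* Example hook: expose only half of the available states. */
static void halve_max_state(struct cooling_status *status)
{
	status->max_state /= 2;
}
```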

Cheers,
Javi



Re: [PATCH v1 2/6] thermal: cpu_cooling: Support passing driver private data.

2014-05-29 Thread Javi Merino
Hi Amit,

One minor comment.

On Thu, May 29, 2014 at 09:15:30AM +0100, Amit Daniel Kachhap wrote:
 This patch allows the caller of the cpufreq cooling APIs to register along
 with their driver data, which will be useful when receiving any cooling state
 notifications.
 This patch is in preparation for adding notification support for cpufreq
 cooling changes. This change also removes the unnecessary exposure of the
 internal housekeeping structure via thermal_cooling_device->devdata to the
 users of the cpufreq cooling APIs. As part of this implementation, this private
 local structure (cpufreq_dev) is now stored in a list and scanned for each
 call to the cpu cooling interfaces.
 
 Signed-off-by: Amit Daniel Kachhap amit.dan...@samsung.com
 ---
  Documentation/thermal/cpu-cooling-api.txt  |3 +-
  drivers/thermal/cpu_cooling.c  |   89 
 
  drivers/thermal/db8500_cpufreq_cooling.c   |2 +-
  drivers/thermal/samsung/exynos_thermal_common.c|2 +-
  drivers/thermal/ti-soc-thermal/ti-thermal-common.c |2 +-
  include/linux/cpu_cooling.h|5 +-
  6 files changed, 80 insertions(+), 23 deletions(-)
 
 diff --git a/Documentation/thermal/cpu-cooling-api.txt 
 b/Documentation/thermal/cpu-cooling-api.txt
 index fca24c9..aaa07c6 100644
 --- a/Documentation/thermal/cpu-cooling-api.txt
 +++ b/Documentation/thermal/cpu-cooling-api.txt
 @@ -17,13 +17,14 @@ the user. The registration APIs returns the cooling 
 device pointer.
 
  1.1 cpufreq registration/unregistration APIs
  1.1.1 struct thermal_cooling_device *cpufreq_cooling_register(
 -   struct cpumask *clip_cpus)
 +   struct cpumask *clip_cpus, void *devdata)
 
  This interface function registers the cpufreq cooling device with the 
 name
  thermal-cpufreq-%x. This api can support multiple instances of cpufreq
  cooling devices.
 
 clip_cpus: cpumask of cpus where the frequency constraints will happen.
 +   devdata: driver private data pointer.
 
  1.1.2 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
 
 diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
 index 16388b0..6d145d5 100644
 --- a/drivers/thermal/cpu_cooling.c
 +++ b/drivers/thermal/cpu_cooling.c
 @@ -50,16 +50,18 @@ struct cpufreq_cooling_device {
 unsigned int cpufreq_state;
 unsigned int cpufreq_val;
 struct cpumask allowed_cpus;
 +   struct list_head node;
  };
  static DEFINE_IDR(cpufreq_idr);
  static DEFINE_MUTEX(cooling_cpufreq_lock);
 
 -static unsigned int cpufreq_dev_count;
 -
  /* notify_table passes value to the CPUFREQ_ADJUST callback function. */
  #define NOTIFY_INVALID NULL
  static DEFINE_PER_CPU(struct cpufreq_cooling_device *, notify_device);
 
 +/* A list to hold all the cpufreq cooling devices registered */
 +static LIST_HEAD(cpufreq_cooling_list);
 +
  /**
   * get_idr - function to get a unique id.
   * @idr: struct idr * handle used to create a id.
 @@ -98,6 +100,26 @@ static void release_idr(struct idr *idr, int id)
 
  /* Below code defines functions to be used for cpufreq as cooling device */
 
 +/**
 + * cpufreq_cooling_get_info - function to cpufreq_dev for the corresponding 
 cdev
 + * @cdev: pointer of the cooling device
 + *
 + * This function will loop through the cpufreq_cooling_device list and
 + * return the matching device
 + *

You should add a Locking: section here which documents that this
function must be called with cooling_cpufreq_lock held.
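
As a rough illustration of what such a comment could look like, here is a userspace sketch. Everything in it is a stand-in: struct cooling_dev, struct cpufreq_cooling_entry, cooling_list and the cooling_lock_held flag are hypothetical, and the assert() only mimics what lockdep_assert_held() would check in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the structures in the patch. */
struct cooling_dev {
	int id;
};

struct cpufreq_cooling_entry {
	struct cooling_dev *cool_dev;
	struct cpufreq_cooling_entry *next;
};

static bool cooling_lock_held;	/* stand-in for cooling_cpufreq_lock */
static struct cpufreq_cooling_entry *cooling_list;

/**
 * cooling_get_info - find the list entry for a cooling device
 * @cdev: cooling device to look up
 *
 * Locking: the caller must hold the lock that protects the list
 * (cooling_cpufreq_lock in the patch).
 *
 * Return: the matching entry, or NULL if @cdev is not in the list.
 */
static struct cpufreq_cooling_entry *cooling_get_info(struct cooling_dev *cdev)
{
	struct cpufreq_cooling_entry *entry;

	/* Userspace analogue of lockdep_assert_held(). */
	assert(cooling_lock_held);

	for (entry = cooling_list; entry; entry = entry->next)
		if (entry->cool_dev == cdev)
			return entry;

	return NULL;
}
```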

Cheers,
Javi

 + * Return: cpufreq_cooling_device if matched with the cdev or NULL if not
 + * matched.
 + */
 +static inline struct cpufreq_cooling_device *
 +cpufreq_cooling_get_info(struct thermal_cooling_device *cdev)
 +{
 +	struct cpufreq_cooling_device *cpufreq_dev = NULL;
 +	list_for_each_entry(cpufreq_dev, &cpufreq_cooling_list, node)
 +		if (cpufreq_dev->cool_dev == cdev)
 +			break;
 +	return cpufreq_dev;
 +}
 +




Re: [PATCH v1 3/6] thermal: thermal-core: Add notifications support for the cooling states

2014-05-29 Thread Javi Merino
Hi Amit,

On Thu, May 29, 2014 at 09:15:31AM +0100, Amit Daniel Kachhap wrote:
 This patch adds notification infrastructure for any requests related to
 cooling states. The notifier structure passed is of both Get/Set type, so the
 receiver of these can sense the new/cur/max cooling state as decided by the
 thermal governor.
 In addition to that, it can also override the cooling state and may do
 something interesting after receiving these CPU cooling events, such as
 masking some states, enabling some extra conditional states or performing any
 extra operation for aggressive thermal cooling.
 The notification events can be of type,
 
 1. COOLING_SET_STATE_PRE
 2. COOLING_SET_STATE_POST
 3. COOLING_GET_CUR_STATE
 4. COOLING_GET_MAX_STATE
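
The override behaviour described above can be sketched in userspace C. This is not the kernel notifier chain: the enum values and the field names follow the patch, but the struct layout, the fixed-size handler table and the helpers register_handler(), notify_states(), clamp_to_two() and apply_state() are all hypothetical.

```c
#include <stddef.h>

/* Names follow the patch; the definitions here are assumptions. */
enum cooling_state_ops {
	COOLING_SET_STATE_PRE,
	COOLING_SET_STATE_POST,
	COOLING_GET_CUR_STATE,
	COOLING_GET_MAX_STATE,
};

struct thermal_cooling_status {
	unsigned long cur_state;
	unsigned long new_state;
	unsigned long max_state;
	void *devdata;
};

typedef void (*cooling_handler_t)(enum cooling_state_ops op,
				  struct thermal_cooling_status *request);

#define MAX_HANDLERS 4
static cooling_handler_t handlers[MAX_HANDLERS];
static size_t nr_handlers;

static int register_handler(cooling_handler_t fn)
{
	if (nr_handlers == MAX_HANDLERS)
		return -1;
	handlers[nr_handlers++] = fn;
	return 0;
}

/* Call every registered handler; each one may modify *request. */
static void notify_states(struct thermal_cooling_status *request,
			  enum cooling_state_ops op)
{
	for (size_t i = 0; i < nr_handlers; i++)
		handlers[i](op, request);
}

/* Example client: mask out every cooling state above 2. */
static void clamp_to_two(enum cooling_state_ops op,
			 struct thermal_cooling_status *request)
{
	if (op == COOLING_SET_STATE_PRE && request->new_state > 2)
		request->new_state = 2;
}

/* Caller side: clients see the request before and after it is applied. */
static unsigned long apply_state(unsigned long cur, unsigned long wanted)
{
	struct thermal_cooling_status req = {
		.cur_state = cur,
		.new_state = wanted,
	};

	notify_states(&req, COOLING_SET_STATE_PRE);
	/* The possibly overridden req.new_state is what gets applied. */
	notify_states(&req, COOLING_SET_STATE_POST);

	return req.new_state;
}
```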
 
 Signed-off-by: Amit Daniel Kachhap amit.dan...@samsung.com
 ---
  Documentation/thermal/sysfs-api.txt |   21 +++
  drivers/thermal/thermal_core.c  |   69 
 ++-
  include/linux/thermal.h |   21 +++
  3 files changed, 109 insertions(+), 2 deletions(-)
 
 diff --git a/Documentation/thermal/sysfs-api.txt 
 b/Documentation/thermal/sysfs-api.txt
 index 87519cb..5f45e03 100644
 --- a/Documentation/thermal/sysfs-api.txt
 +++ b/Documentation/thermal/sysfs-api.txt
 @@ -92,6 +92,27 @@ temperature) and throttle appropriate devices.
  It deletes the corresponding entry form /sys/class/thermal folder and
  unbind itself from all the thermal zone devices using it.
  
 +1.2.3 int thermal_cooling_register_notifier(struct notifier_block *nb)
 +
 +This interface function registers the client notifier handler. The 
 notifier
 +handler can use this to monitor or update any cooling state requests.
 +nb: notifier structure containing client notifier handler.
 +
 +1.2.4 int thermal_cooling_unregister_notifier(struct notifier_block *nb)
 +
 +This interface function unregisters the client notifier handler.
 +nb: notifier structure containing client notifier handler.
 +
 +1.2.5 int thermal_cooling_notify_states(struct thermal_cooling_status 
 *request,enum cooling_state_ops op)
 +
 +This interface function invokes the earlier registered cooling states 
 handler.
 +request: holds the relevant cooling state value.
 + .cur_state: current cooling state.
 + .new_state: new cooling state to be set.
 + .max_state: max cooling state.
 + .devdata: driver private data pointer.
 +op: describes various operation supported.
 +
  1.3 interface for binding a thermal zone device with a thermal cooling device
  1.3.1 int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
   int trip, struct thermal_cooling_device *cdev,
 diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
 index 71b0ec0..1a60f83 100644
 --- a/drivers/thermal/thermal_core.c
 +++ b/drivers/thermal/thermal_core.c
 @@ -52,6 +52,8 @@ static DEFINE_MUTEX(thermal_idr_lock);
  static LIST_HEAD(thermal_tz_list);
  static LIST_HEAD(thermal_cdev_list);
  static LIST_HEAD(thermal_governor_list);
 +/* Notfier list to validates/updates the cpufreq cooling states */
 +static BLOCKING_NOTIFIER_HEAD(cooling_state_notifier_list);
  
  static DEFINE_MUTEX(thermal_list_lock);
  static DEFINE_MUTEX(thermal_governor_lock);
 @@ -1073,8 +1075,71 @@ static struct class thermal_class = {
  };
  
  /**
 - * __thermal_cooling_device_register() - register a new thermal cooling 
 device
 - * @np:  a pointer to a device tree node.
 + * thermal_cooling_notify_states - Invoke the necessary cooling states 
 handler.
 + * @request: holds the relevant cooling state value. say if the cooling state
 + * operation is of type COOLING_GET_MAX_STATE, then request holds
 + * the current max cooling state value.
 + * @op: different operations supported
 + *
 + * This API allows the registered user to recieve the different cooling
 + * notifications like current state, max state and set state.
 + *
 + * Return: 0 (success)
 + */
 +int thermal_cooling_notify_states(struct thermal_cooling_status *request,
 +				enum cooling_state_ops op)
 +{
 +	/* Invoke the notifiers which have registered for this state change */
 +	if (op == COOLING_SET_STATE_PRE ||
 +		op == COOLING_SET_STATE_POST ||
 +		op == COOLING_GET_MAX_STATE ||
 +		op == COOLING_GET_CUR_STATE) {
 +		blocking_notifier_call_chain(
 +			&cooling_state_notifier_list, op, request);
 +	}
 +	return 0;
 +}
 +EXPORT_SYMBOL_GPL(thermal_cooling_notify_states);
 +
 +/**
 + * thermal_cooling_register_notifier - registers a notifier with thermal 
 cooling.
 + * @nb:  notifier function to register.
 + *
 + * Add a driver to receive all cooling notifications like current state,
 + * max state and set state. The drivers after reading the events can perform
 + * some mapping like grouping some P states into 1 cooling state.
 + *
 + * Return: 0 (success)
 + */
 +int 

Re: [PATCH v1 6/6] ACPI: thermal: processor: Use the generic cpufreq infrastructure

2014-05-29 Thread Javi Merino
Hi Amit,

On Thu, May 29, 2014 at 09:15:34AM +0100, Amit Daniel Kachhap wrote:
 This patch upgrades the ACPI cpufreq cooling portions to use the generic
 cpufreq cooling infrastructure. There should not be any functionality
 related changes as the same behaviour is provided by the generic
 cpufreq APIs with the notifier mechanism.
 
 Signed-off-by: Amit Daniel Kachhap amit.dan...@samsung.com
 ---
  drivers/acpi/processor_driver.c  |6 +-
  drivers/acpi/processor_thermal.c |  235 
 ++
  include/acpi/processor.h |3 +-
  3 files changed, 115 insertions(+), 129 deletions(-)
 
 diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
 index 7f70f31..10aba4a 100644
 --- a/drivers/acpi/processor_driver.c
 +++ b/drivers/acpi/processor_driver.c
 @@ -36,6 +36,7 @@
  #include <linux/cpuidle.h>
  #include <linux/slab.h>
  #include <linux/acpi.h>
 +#include <linux/cpu_cooling.h>
 
  #include <acpi/processor.h>
 
 @@ -178,8 +179,7 @@ static int __acpi_processor_start(struct acpi_device *device)
  	if (!cpuidle_get_driver() || cpuidle_get_driver() == &acpi_idle_driver)
  		acpi_processor_power_init(pr);
 
 -	pr->cdev = thermal_cooling_device_register("Processor", device,
 -						   &processor_cooling_ops);
 +	pr->cdev = acpi_processor_cooling_register(device);

With this you have removed the only cooling device whose type was
"Processor".  There's special code for dealing with this cooling
device in drivers/thermal/thermal_core.c:passive_store():

		list_for_each_entry(cdev, &thermal_cdev_list, node) {
			if (!strncmp("Processor", cdev->type,
				     sizeof("Processor")))
				thermal_zone_bind_cooling_device(tz,
						THERMAL_TRIPS_NONE, cdev,
						THERMAL_NO_LIMIT,
						THERMAL_NO_LIMIT);
		}
		mutex_unlock(&thermal_list_lock);
		if (!tz->passive_delay)

With your change, that code is now dead as it can't do anything.  Now
I don't know what you should do with it: either remove it or make it
match the cpufreq cooling device.  But this patch should deal with
that code as well.

Cheers,
Javi




Re: [PATCH v1 3/6] thermal: thermal-core: Add notifications support for the cooling states

2014-06-02 Thread Javi Merino
On Mon, Jun 02, 2014 at 10:31:19AM +0100, Amit Kachhap wrote:
 On 5/29/14, Javi Merino javi.mer...@arm.com wrote:
  Hi Amit,
 
  On Thu, May 29, 2014 at 09:15:31AM +0100, Amit Daniel Kachhap wrote:
  This patch adds notification infrastructure for any requests related to
  cooling
  states. The notifier structure passed is of both Get/Set type. So the
  receiver
  of these can sense the new/cur/max cooling state as decided by thermal
  governor.
  In addition to that it can also override the cooling state and may do
  something
  interesting after receiving these CPU cooling events such as masking some
  states, enabling some extra conditional states or perform any extra
  operation
  for aggressive thermal cooling.
  The notfications events can be of type,
 
  1. COOLING_SET_STATE_PRE
  2. COOLING_SET_STATE_POST
  3. COOLING_GET_CUR_STATE
  4. COOLING_GET_MAX_STATE
 
  Signed-off-by: Amit Daniel Kachhap amit.dan...@samsung.com
  ---
   Documentation/thermal/sysfs-api.txt |   21 +++
   drivers/thermal/thermal_core.c  |   69
  ++-
   include/linux/thermal.h |   21 +++
   3 files changed, 109 insertions(+), 2 deletions(-)
 
  diff --git a/Documentation/thermal/sysfs-api.txt
  b/Documentation/thermal/sysfs-api.txt
  index 87519cb..5f45e03 100644
  --- a/Documentation/thermal/sysfs-api.txt
  +++ b/Documentation/thermal/sysfs-api.txt
  @@ -92,6 +92,27 @@ temperature) and throttle appropriate devices.
   It deletes the corresponding entry form /sys/class/thermal folder
  and
   unbind itself from all the thermal zone devices using it.
 
  +1.2.3 int thermal_cooling_register_notifier(struct notifier_block *nb)
  +
  +This interface function registers the client notifier handler. The
  notifier
  +handler can use this to monitor or update any cooling state
  requests.
  +nb: notifier structure containing client notifier handler.
  +
  +1.2.4 int thermal_cooling_unregister_notifier(struct notifier_block *nb)
  +
  +This interface function unregisters the client notifier handler.
  +nb: notifier structure containing client notifier handler.
  +
  +1.2.5 int thermal_cooling_notify_states(struct thermal_cooling_status
  *request,enum cooling_state_ops op)
  +
  +This interface function invokes the earlier registered cooling states
  handler.
  +request: holds the relevant cooling state value.
  +  .cur_state: current cooling state.
  +  .new_state: new cooling state to be set.
  +  .max_state: max cooling state.
  +  .devdata: driver private data pointer.
  +op: describes various operation supported.
  +
   1.3 interface for binding a thermal zone device with a thermal cooling
  device
   1.3.1 int thermal_zone_bind_cooling_device(struct thermal_zone_device
  *tz,
 int trip, struct thermal_cooling_device *cdev,
  diff --git a/drivers/thermal/thermal_core.c
  b/drivers/thermal/thermal_core.c
  index 71b0ec0..1a60f83 100644
  --- a/drivers/thermal/thermal_core.c
  +++ b/drivers/thermal/thermal_core.c
  @@ -52,6 +52,8 @@ static DEFINE_MUTEX(thermal_idr_lock);
   static LIST_HEAD(thermal_tz_list);
   static LIST_HEAD(thermal_cdev_list);
   static LIST_HEAD(thermal_governor_list);
  +/* Notfier list to validates/updates the cpufreq cooling states */
  +static BLOCKING_NOTIFIER_HEAD(cooling_state_notifier_list);
 
   static DEFINE_MUTEX(thermal_list_lock);
   static DEFINE_MUTEX(thermal_governor_lock);
  @@ -1073,8 +1075,71 @@ static struct class thermal_class = {
   };
 
   /**
  - * __thermal_cooling_device_register() - register a new thermal cooling
  device
  - * @np:   a pointer to a device tree node.
  + * thermal_cooling_notify_states - Invoke the necessary cooling states
  handler.
  + * @request: holds the relevant cooling state value. say if the cooling
  state
  + * operation is of type COOLING_GET_MAX_STATE, then request holds
  + * the current max cooling state value.
  + * @op: different operations supported
  + *
  + * This API allows the registered user to recieve the different cooling
  + * notifications like current state, max state and set state.
  + *
  + * Return: 0 (success)
  + */
  +int thermal_cooling_notify_states(struct thermal_cooling_status
  *request,
  +  enum cooling_state_ops op)
  +{
  +  /* Invoke the notifiers which have registered for this state change */
  +  if (op == COOLING_SET_STATE_PRE ||
  +  op == COOLING_SET_STATE_POST ||
  +  op == COOLING_GET_MAX_STATE ||
  +  op == COOLING_GET_CUR_STATE) {
  +  blocking_notifier_call_chain(
  +  cooling_state_notifier_list, op, request);
  +  }
  +  return 0;
  +}
  +EXPORT_SYMBOL_GPL(thermal_cooling_notify_states);
  +
  +/**
  + * thermal_cooling_register_notifier - registers a notifier with thermal
  cooling.
  + * @nb:   notifier function to register.
  + *
  + * Add a driver to receive all cooling notifications like current state,
  + * max state

Re: [PATCH v1 6/6] ACPI: thermal: processor: Use the generic cpufreq infrastructure

2014-06-02 Thread Javi Merino
On Mon, Jun 02, 2014 at 10:21:48AM +0100, Amit Kachhap wrote:
 Hi Javi,
 
 On 5/29/14, Javi Merino javi.mer...@arm.com wrote:
  Hi Amit,
 
  On Thu, May 29, 2014 at 09:15:34AM +0100, Amit Daniel Kachhap wrote:
  This patch upgrades the ACPI cpufreq cooling portions to use the generic
  cpufreq cooling infrastructure. There should not be any functionality
  related changes as the same behaviour is provided by the generic
  cpufreq APIs with the notifier mechanism.
 
  Signed-off-by: Amit Daniel Kachhap amit.dan...@samsung.com
  ---
   drivers/acpi/processor_driver.c  |6 +-
   drivers/acpi/processor_thermal.c |  235
  ++
   include/acpi/processor.h |3 +-
   3 files changed, 115 insertions(+), 129 deletions(-)
 
  diff --git a/drivers/acpi/processor_driver.c
  b/drivers/acpi/processor_driver.c
  index 7f70f31..10aba4a 100644
  --- a/drivers/acpi/processor_driver.c
  +++ b/drivers/acpi/processor_driver.c
  @@ -36,6 +36,7 @@
   #include <linux/cpuidle.h>
   #include <linux/slab.h>
   #include <linux/acpi.h>
  +#include <linux/cpu_cooling.h>
 
   #include <acpi/processor.h>
 
  @@ -178,8 +179,7 @@ static int __acpi_processor_start(struct acpi_device *device)
   	if (!cpuidle_get_driver() || cpuidle_get_driver() == &acpi_idle_driver)
   		acpi_processor_power_init(pr);
 
  -	pr->cdev = thermal_cooling_device_register("Processor", device,
  -						   &processor_cooling_ops);
  +	pr->cdev = acpi_processor_cooling_register(device);
 
  With this you have removed the only cooling device whose type was
  "Processor".  There's special code for dealing with this cooling
  device in drivers/thermal/thermal_core.c:passive_store():
 
  		list_for_each_entry(cdev, &thermal_cdev_list, node) {
  			if (!strncmp("Processor", cdev->type,
  				     sizeof("Processor")))
  				thermal_zone_bind_cooling_device(tz,
  						THERMAL_TRIPS_NONE, cdev,
  						THERMAL_NO_LIMIT,
  						THERMAL_NO_LIMIT);
  		}
  		mutex_unlock(&thermal_list_lock);
  		if (!tz->passive_delay)
 
  With your change, that code is now dead as it can't do anything.  Now
  I don't know what you should do with it: either remove it or make it
  match the cpufreq cooling device.  But this patch should deal with
  that code as well.
 Nice catch. I somehow missed modifying this section.
 I think the following changes should fix this,
 -			if (!strncmp("Processor", cdev->type,
 -				     sizeof("Processor")))
 +			if (!strncmp("thermal-cpufreq", cdev->type,
 +				     sizeof("thermal-cpufreq")))
 				thermal_zone_bind_cooling_device(tz,
 

That should do it.  I don't really understand why this code is
specifically looking for ACPI processor cooling devices, but I guess
that's the least disruptive change you can make.
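
One detail that may be worth double-checking in the proposed comparison: sizeof on a string literal counts the terminating NUL, so strncmp() with that length only succeeds on an exact match. The cpu-cooling documentation quoted earlier in this series gives the device name as thermal-cpufreq-%x, which would carry a suffix. The sketch below is plain userspace C and shows the difference; CPUFREQ_TYPE, type_matches_exact() and type_matches_prefix() are hypothetical names.

```c
#include <stdbool.h>
#include <string.h>

#define CPUFREQ_TYPE "thermal-cpufreq"

/* sizeof includes the trailing NUL: only the exact string matches. */
static bool type_matches_exact(const char *type)
{
	return strncmp(CPUFREQ_TYPE, type, sizeof(CPUFREQ_TYPE)) == 0;
}

/* strlen excludes the NUL: any string starting with the prefix matches. */
static bool type_matches_prefix(const char *type)
{
	return strncmp(CPUFREQ_TYPE, type, strlen(CPUFREQ_TYPE)) == 0;
}
```

With these helpers, type_matches_exact("thermal-cpufreq-0") is false while type_matches_prefix("thermal-cpufreq-0") is true, so a prefix comparison may be what is needed if the device type really carries the id suffix.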

Cheers,
Javi



Re: [PATCH] thermal: document struct thermal_zone_device and thermal_governor

2014-05-23 Thread Javi Merino
On Fri, May 23, 2014 at 02:44:51PM +0100, Eduardo Valentin wrote:
 Hello Javi,

Hi Eduardo,

Sorry for the wrong From: in my previous email, I hope that this one
goes out with the correct one.

 On Fri, May 23, 2014 at 10:35:52AM +0100, Javi Merino wrote:
  On Thu, May 22, 2014 at 04:27:30PM +0100, Eduardo Valentin wrote:
   On Fri, May 16, 2014 at 12:16:08PM +0100, Javi Merino wrote:
   
Document struct struct thermal_zone_device and struct thermal_governor
fields and their use by the thermal framework code.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin eduardo.valen...@ti.com
Signed-off-by: Javi Merino javi.mer...@arm.com

---

Hi linux-pm,

I have some patches that add new fields to these structures but I
don't have a good place to describe those fields as these structs are
mostly undocumented so I thought I'd document them.

I'm unsure about some of the descriptions, specially for passive and
forced_passive so please review them.

 include/linux/thermal.h |   44 
++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f7e11c7ea7d9..af928c667dba 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -158,6 +158,40 @@ struct thermal_attr {
char name[THERMAL_NAME_LENGTH];
 };
 
+/**
+ * struct thermal_zone_device - structure for a thermal zone
+ * @id:unique id number for each thermal zone
+ * @type:  the thermal zone device type
+ * @device:struct device for this thermal zone
+ * @trip_temp_attrs:   attributes for trip points for sysfs: trip temperature
+ * @trip_type_attrs:   attributes for trip points for sysfs: trip type
+ * @trip_hyst_attrs:   attributes for trip points for sysfs: trip hysteresis
+ * @devdata:   private pointer for device private data
+ * @trips: number of trip points the thermal zone supports
+ * @passive_delay: number of milliseconds to wait between polls when
+ * performing passive cooling.  Only used by the step-wise
   
   I don't think this parameter is specific to step-wise.
  
  It's only used by step-wise, all the other governors always poll at
  polling_delay.
 
 Agreed, but the fact we have only one governor using it, it does not
 imply it is a specific concept for that governor. The concept can and
 should be reused by other governors. Idea is simple, you have different
 temporal / thermal evolution when idle, when monitoring passive cooling 
 actions. And this fact won't change, no matter the policy you apply.

If this is so generic, shouldn't it be done by the thermal core code
instead of left to the governors?  Something like:

if (temperature > temp_of_first_passive_trip_point)
        tz->passive = 1;

  While we are at it, can we change the name so that it's more generic and
  can be reused by other governors?  Something like idle_poll and
  active_poll?  I'm open to suggestions for a better name.
  
 
 active_poll means when cooling devices are active? which kind of cooling
 devices?

Ok, I guess it's just confusing in my head.  I'll change the power
allocator governor to set tz->passive to true when we want to poll at
a faster rate.
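
To make the polling idea concrete, here is a rough sketch of how the
polling interval could be picked from tz->passive.  The struct and the
helper name are made up for illustration, and the fallback to
polling_delay when passive_delay is 0 is an assumption of mine, so take
it as a sketch rather than what the thermal core actually does:

```c
/* Hypothetical, trimmed-down stand-in for struct thermal_zone_device. */
struct tz_sketch {
	int passive;		/* non-zero once a passive trip point is crossed */
	int passive_delay;	/* ms between polls while passively cooling */
	int polling_delay;	/* ms between polls otherwise, 0 if interrupt driven */
};

/*
 * Return the delay in ms before the next poll, or 0 if the zone should
 * not be polled at all.  The faster passive_delay is used only while
 * tz->passive is set and a passive delay has been configured.
 */
static int next_poll_delay(const struct tz_sketch *tz)
{
	if (tz->passive && tz->passive_delay)
		return tz->passive_delay;

	return tz->polling_delay;
}
```

With something like this, any governor that sets tz->passive gets the
faster polling rate without knowing anything about delays itself.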

+ * governor
+ * @polling_delay: number of milliseconds to wait between polls when
+ * checking whether trip points have been crossed (0 for
+ * interrupt driven systems)
+ * @temperature:   current temperature.  This is only for core code,
+ * drivers should use thermal_zone_get_temp() to get the
+ * current temperature
+ * @last_temperature:  previous temperature read
+ * @emul_temperature:  emulated temperature when using CONFIG_THERMAL_EMULATION
+ * @passive:   step-wise specific parameter.  1 if you've crossed a passive
+ * trip point, 0 otherwise
   
   ditto.
  
  ditto, no other governor changes passive and it can't be changed from
  sysfs.
  
+ * @forced_passive:step-wise specific parameter.  If > 0, temperature at
+ * which to switch on all cpufreq cooling devices.
   
   ditto.
  
  Again, this is step-wise specific.  You can change it from sysfs, but
  the other governors (userspace and fair-share) will happily ignore
  you.
  
   Also, governors are not aware of specific cooling devices. 
  
  I didn't say that governors are aware of specific cooling devices.  I
  said that if forced_passive is greater than 0, then only cpufreq
  cooling devices are switched on.  It's wrong, it's not cpufreq cooling
  devices, it's ACPI Processor cooling devices.
  
 
 I mean, we don't have only cpufreq cooling devices. 

True

Re: [PATCH v3 4/6] acerhdf: Use bang-bang thermal governor

2014-05-12 Thread Javi Merino
Hi Peter,

On Mon, May 12, 2014 at 11:27:08AM +0100, Peter Feuerer wrote:
 Javi Merino writes:
  On Sat, May 03, 2014 at 06:59:24PM +0100, Peter Feuerer wrote:
  acerhdf has been doing an on-off fan control using hysteresis by
  post-manipulating the outcome of thermal subsystem trip point handling.
  This patch enables acerhdf to use the bang-bang governor, which is
  intended for on-off controlled fans.
  
  Cc: Andrew Morton a...@linux-foundation.org
  CC: Zhang Rui rui.zh...@intel.com
  Cc: Andreas Mohr a...@lisas.de
  Cc: Borislav Petkov b...@suse.de
  Cc: Javi Merino javi.mer...@arm.com
  Signed-off-by: Peter Feuerer pe...@piie.net
  ---
   drivers/platform/x86/Kconfig   |  2 +-
   drivers/platform/x86/acerhdf.c | 34 +-
   2 files changed, 30 insertions(+), 6 deletions(-)
  
  diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
  index 27df2c5..0c15d89 100644
  --- a/drivers/platform/x86/Kconfig
  +++ b/drivers/platform/x86/Kconfig
  @@ -38,7 +38,7 @@ config ACER_WMI
   
   config ACERHDF
 tristate "Acer Aspire One temperature and fan driver"
  -  depends on THERMAL && ACPI
  +  depends on ACPI && THERMAL_GOV_BANG_BANG
 ---help---
   This is a driver for Acer Aspire One netbooks. It allows to access
   the temperature sensor and to control the fan.
  diff --git a/drivers/platform/x86/acerhdf.c 
  b/drivers/platform/x86/acerhdf.c
  index 176edbd..afaa849 100644
  --- a/drivers/platform/x86/acerhdf.c
  +++ b/drivers/platform/x86/acerhdf.c
  @@ -50,7 +50,7 @@
*/
   #undef START_IN_KERNEL_MODE
   
  -#define DRV_VER "0.5.30"
  +#define DRV_VER "0.5.31"
   
   /*
* According to the Atom N270 datasheet,
  @@ -259,6 +259,14 @@ static const struct bios_settings_t bios_tbl[] = {
   
   static const struct bios_settings_t *bios_cfg __read_mostly;
   
  +/*
  + * this struct is used to instruct thermal layer to use bang_bang instead of
  + * default governor for acerhdf
  + */
  +static struct thermal_zone_params acerhdf_zone_params = {
  +  .governor_name = "bang_bang",
  +};
  +
   static int acerhdf_get_temp(int *temp)
   {
 u8 read_temp;
  @@ -440,6 +448,15 @@ static int acerhdf_get_trip_type(struct thermal_zone_device *thermal, int trip,
 return 0;
   }
   
  +static int acerhdf_get_trip_hyst(struct thermal_zone_device *thermal, int trip,
  +   unsigned long *temp)
  +{
  +  if (trip == 0)
  +  *temp = fanon - fanoff;
  +
  +  return 0;
  +}
  +
  
  I think you should only return 0 if you've updated the temperature.
  Otherwise you're telling the calling function that everything went all
  right but you may be leaving garbage in *temp.  What about
  
  if (trip != 0)
          return -EINVAL;
  
  *temp = fanon - fanoff;
  return 0;
 
 Yes, sounds good.  What about trip == 1?  This is a valid trip point for 
 acerhdf (critical) and one could argue, that it has a valid hysteresis of 
 0.  - Thus EINVAL would be wrong here.

Then for trip == 1 you should also write something in *temp if it's a
valid trip point and you don't want to return -EINVAL.  I don't know
what is the correct hysteresis for trip == 1. Maybe 0?  What I'm
trying to say is that you should return 0 only if you've updated
*temp.
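
Something along these lines is what I have in mind.  It's only a
sketch: the function name is made up, I've dropped the struct
thermal_zone_device argument so that it stands on its own, the
fanon/fanoff values are example numbers, and a hysteresis of 0 for the
critical trip point is an assumption on my part:

```c
#include <errno.h>

/* Example stand-ins for the acerhdf fanon/fanoff settings (millicelsius). */
static unsigned int fanon = 63000;
static unsigned int fanoff = 58000;

/*
 * Sketch of a get_trip_hyst() style callback that returns 0 only when
 * *temp has been written.  Trip 0 is the fan trip point, trip 1 the
 * critical one; anything else is rejected without touching *temp.
 */
static int sketch_get_trip_hyst(int trip, unsigned long *temp)
{
	switch (trip) {
	case 0:
		*temp = fanon - fanoff;
		return 0;
	case 1:
		/* Assumed: the critical trip point has no hysteresis. */
		*temp = 0;
		return 0;
	default:
		return -EINVAL;
	}
}
```

That way the caller can trust *temp whenever the return value is 0.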

Cheers,
Javi



Re: [RFC][PATCH v2] tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks

2014-05-14 Thread Javi Merino
Hi Steven,

On Wed, May 07, 2014 at 04:12:38AM +0100, Steven Rostedt wrote:
 
 Being able to show a cpumask of events can be useful as some events
 may affect only some CPUs. There is no standard way to record the
 cpumask and converting it to a string is rather expensive during
 the trace as traces happen in hotpaths. It would be better to record
 the raw event mask and be able to parse it at print time.
 
 The following macros were added for use with the TRACE_EVENT() macro:
 
   __bitmask()
   __assign_bitmask()
   __get_bitmask()
 
 To test this, I added this to the sched_migrate_task event, which
 looked like this:
 
 TRACE_EVENT(sched_migrate_task,
 
   TP_PROTO(struct task_struct *p, int dest_cpu, const struct cpumask *cpus),
 
   TP_ARGS(p, dest_cpu, cpus),
 
   TP_STRUCT__entry(
   __array(char,   comm,   TASK_COMM_LEN   )
   __field(pid_t,  pid )
   __field(int,prio)
   __field(int,orig_cpu)
   __field(int,dest_cpu)
   __bitmask(  cpumask, cpus   )
   ),
 
   TP_fast_assign(
   memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
   __entry->pid= p->pid;
   __entry->prio   = p->prio;
   __entry->orig_cpu   = task_cpu(p);
   __entry->dest_cpu   = dest_cpu;
   __assign_bitmask(cpumask, cpumask_bits(cpus));
   ),
 
   TP_printk("comm=%s pid=%d prio=%d orig_cpu=%d dest_cpu=%d cpumask=%s",
 __entry->comm, __entry->pid, __entry->prio,
 __entry->orig_cpu, __entry->dest_cpu,
 __get_bitmask(cpumask))
 );
 
 With the output of:
 
 ksmtuned-3613  [003] d..2   485.220508: sched_migrate_task: 
 comm=ksmtuned pid=3615 prio=120 orig_cpu=3 dest_cpu=2 
 cpumask=,000f
  migration/1-13[001] d..5   485.221202: sched_migrate_task: 
 comm=ksmtuned pid=3614 prio=120 orig_cpu=1 dest_cpu=0 
 cpumask=,000f
  awk-3615  [002] d.H5   485.221747: sched_migrate_task: 
 comm=rcu_preempt pid=7 prio=120 orig_cpu=0 dest_cpu=1 
 cpumask=,00ff
  migration/2-18[002] d..5   485.222062: sched_migrate_task: 
 comm=ksmtuned pid=3615 prio=120 orig_cpu=2 dest_cpu=3 
 cpumask=,000f
 
 Link: 
 http://lkml.kernel.org/r/1399377998-14870-6-git-send-email-javi.mer...@arm.com
 
 Suggested-by: Javi Merino javi.mer...@arm.com
 Signed-off-by: Steven Rostedt rost...@goodmis.org
 ---
 
 Changes since v1: Use bitmask name instead of cpumask naming.
 
 ---
  include/linux/ftrace_event.h |  3 +++
  include/linux/trace_seq.h| 10 
  include/trace/ftrace.h   | 57 
 +++-
  kernel/trace/trace_output.c  | 41 +++
  4 files changed, 110 insertions(+), 1 deletion(-)
 
 diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
 index d16da3e..cff3106 100644
 --- a/include/linux/ftrace_event.h
 +++ b/include/linux/ftrace_event.h
 @@ -38,6 +38,9 @@ const char *ftrace_print_symbols_seq_u64(struct trace_seq 
 *p,
*symbol_array);
  #endif
  
 +const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
 +  unsigned int bitmask_size);
 +
  const char *ftrace_print_hex_seq(struct trace_seq *p,
const unsigned char *buf, int len);
  
 diff --git a/include/linux/trace_seq.h b/include/linux/trace_seq.h
 index a32d86e..1361169 100644
 --- a/include/linux/trace_seq.h
 +++ b/include/linux/trace_seq.h
 @@ -46,6 +46,9 @@ extern int trace_seq_putmem_hex(struct trace_seq *s, const 
 void *mem,
  extern void *trace_seq_reserve(struct trace_seq *s, size_t len);
  extern int trace_seq_path(struct trace_seq *s, const struct path *path);
  
 +extern int trace_seq_bitmask(struct trace_seq *s, const unsigned long *maskp,
 +  int nmaskbits);
 +
  #else /* CONFIG_TRACING */
  static inline int trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
  {
 @@ -57,6 +60,13 @@ trace_seq_bprintf(struct trace_seq *s, const char *fmt, 
 const u32 *binary)
   return 0;
  }
  
 +static inline int
 +trace_seq_bitmask(struct trace_seq *s, const unsigned long *maskp,
 +   int nmaskbits)
 +{
 + return 0;
 +}
 +
  static inline int trace_print_seq(struct seq_file *m, struct trace_seq *s)
  {
   return 0;
 diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
 index 0a1a4f7..d9c44af 100644
 --- a/include/trace/ftrace.h
 +++ b/include/trace/ftrace.h
 @@ -53,6 +53,9 @@
  #undef __string
  #define __string(item, src) __dynamic_array(char, item, -1)
  
 +#undef __bitmask
 +#define __bitmask(item, src) __dynamic_array(char, item, -1)
 +
  #undef TP_STRUCT__entry

Re: [RFC][PATCH v2] tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks

2014-05-14 Thread Javi Merino
On Wed, May 14, 2014 at 04:36:23PM +0100, Steven Rostedt wrote:
 On Wed, 14 May 2014 15:23:24 +0100
 Javi Merino javi.mer...@arm.com wrote:
 
   diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
   index 0a1a4f7..d9c44af 100644
   --- a/include/trace/ftrace.h
   +++ b/include/trace/ftrace.h
   @@ -53,6 +53,9 @@
#undef __string
#define __string(item, src) __dynamic_array(char, item, -1)

   +#undef __bitmask
   +#define __bitmask(item, src) __dynamic_array(char, item, -1)
   +
#undef TP_STRUCT__entry
#define TP_STRUCT__entry(args...) args

   @@ -128,6 +131,9 @@
#undef __string
#define __string(item, src) __dynamic_array(char, item, -1)

   +#undef __string
   +#define __string(item, src) __dynamic_array(unsigned long, item, -1)
   +
  
  This overrides the previous definition of __string() and looks like it
  shouldn't be here.
 
 Bah! I knew there was a reason I didn't push this out to my for-next
 branch yet. I can still rebase :-)
 
 That should have been __bitmask(). Hmm, amazing it still worked.
 
  
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)   
   \
 struct ftrace_data_offsets_##call { \
   @@ -200,6 +206,15 @@
#undef __get_str
#define __get_str(field) (char *)__get_dynamic_array(field)

   +#undef __get_bitmask
   +#define __get_bitmask(field) \
   + ({  \
   + void *__bitmask = __get_dynamic_array(field);   \
   + unsigned int __bitmask_size;\
   + __bitmask_size = (__entry->__data_loc_##field >> 16) & 0xffff; \
   + ftrace_print_bitmask_seq(p, __bitmask, __bitmask_size); \
   + })
   +
#undef __print_flags
#define __print_flags(flag, delim, flag_array...)
   \
 ({  \
   @@ -322,6 +337,9 @@ static struct trace_event_functions 
   ftrace_event_type_funcs_##call = {\
#undef __string
#define __string(item, src) __dynamic_array(char, item, -1)

   +#undef __bitmask
   +#define __bitmask(item, src) __dynamic_array(unsigned long, item, -1)
   +
  
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, func, print) 
   \
static int notrace __init
   \
   @@ -372,6 +390,29 @@ ftrace_define_fields_##call(struct ftrace_event_call 
   *event_call)\
#define __string(item, src) __dynamic_array(char, item, \
 strlen((src) ? (const char *)(src) : "(null)") + 1)

   +/*
   + * __bitmask_size_in_bytes_raw is the number of bytes needed to hold
   + * num_possible_cpus().
   + */
   +#define __bitmask_size_in_bytes_raw  \
   + ((num_possible_cpus() + 7) / 8)
   +
   +#define __bitmask_size_in_longs \
   + ((__bitmask_size_in_bytes_raw + ((BITS_PER_LONG / 8) - 1))  \
   +  / (BITS_PER_LONG / 8))
   +
   +/*
   + * __bitmask_size_in_bytes is the number of bytes needed to hold
   + * num_possible_cpus() padded out to the nearest long. This is what
   + * is saved in the buffer, just to be consistent.
   + */
   +#define __bitmask_size_in_bytes  \
   + (__bitmask_size_in_longs * (BITS_PER_LONG / 8))
   +
   +#undef __bitmask
   +#define __bitmask(item, src) __dynamic_array(unsigned long, item, \
   +  __bitmask_size_in_longs)
   +
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
static inline notrace int ftrace_get_offsets_##call( \
   @@ -513,12 +554,22 @@ static inline notrace int ftrace_get_offsets_##call(\
 __entry->__data_loc_##item = __data_offsets.item;

#undef __string
   -#define __string(item, src) __dynamic_array(char, item, -1) \
   +#define __string(item, src) __dynamic_array(char, item, -1)

#undef __assign_str
#define __assign_str(dst, src) \
 strcpy(__get_str(dst), (src) ? (const char *)(src) : "(null)");

   +#undef __bitmask
   +#define __bitmask(item, src) __dynamic_array(unsigned long, item, -1)
  
  Why src?  It's not used in any of the definitions of the __bitmask()
  macro, can we remove it?
 
 Hmm, I may need to refactor this. I may pull this patch for now to push
 the rest to for-next.
 
 I just noticed that we need a way to specify the length of the bitmask.
 Right now it's hardcoded as num_possible_cpus, which Mathieu was
 asking for a more generic approach.
 
 OK, let me pull this patch out of my tree (thank god I never pushed
 it), and work on it a bit more.
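
For reference, the rounding done by the size macros quoted above could
take the number of bits as a parameter instead of hardcoding
num_possible_cpus().  This is only a userspace sketch with invented
helper names, not a proposal for the actual macros:

```c
#include <limits.h>
#include <stddef.h>

/* Bits in an unsigned long; plays the role of the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of bytes needed to hold nr_bits bits. */
static size_t bitmask_size_in_bytes_raw(size_t nr_bits)
{
	return (nr_bits + 7) / 8;
}

/* Number of unsigned longs needed to hold nr_bits bits. */
static size_t bitmask_size_in_longs(size_t nr_bits)
{
	size_t bytes_per_long = SKETCH_BITS_PER_LONG / 8;

	return (bitmask_size_in_bytes_raw(nr_bits) + bytes_per_long - 1) /
		bytes_per_long;
}

/* Bytes actually stored: the raw size padded out to a whole long. */
static size_t bitmask_size_in_bytes(size_t nr_bits)
{
	return bitmask_size_in_longs(nr_bits) * (SKETCH_BITS_PER_LONG / 8);
}
```

With 64-bit longs, a 4-bit mask needs 1 raw byte, fits in 1 long and
takes 8 bytes in the buffer.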

Ok, well keep me

Re: [RFC][PATCH v3] tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks

2014-05-14 Thread Javi Merino
;
   __entry->prio   = p->prio;
   __entry->orig_cpu   = task_cpu(p);
   __entry->dest_cpu   = dest_cpu;
   __assign_bitmask(cpumask, cpumask_bits(cpus), num_possible_cpus());
   ),
 
   TP_printk("comm=%s pid=%d prio=%d orig_cpu=%d dest_cpu=%d cpumask=%s",
 __entry->comm, __entry->pid, __entry->prio,
 __entry->orig_cpu, __entry->dest_cpu,
 __get_bitmask(cpumask))
 );
 
 With the output of:
 
 ksmtuned-3613  [003] d..2   485.220508: sched_migrate_task: 
 comm=ksmtuned pid=3615 prio=120 orig_cpu=3 dest_cpu=2 
 cpumask=,000f
  migration/1-13[001] d..5   485.221202: sched_migrate_task: 
 comm=ksmtuned pid=3614 prio=120 orig_cpu=1 dest_cpu=0 
 cpumask=,000f
  awk-3615  [002] d.H5   485.221747: sched_migrate_task: 
 comm=rcu_preempt pid=7 prio=120 orig_cpu=0 dest_cpu=1 
 cpumask=,00ff
  migration/2-18[002] d..5   485.222062: sched_migrate_task: 
 comm=ksmtuned pid=3615 prio=120 orig_cpu=2 dest_cpu=3 
 cpumask=,000f
 
 Link: 
 http://lkml.kernel.org/r/1399377998-14870-6-git-send-email-javi.mer...@arm.com
 
 Suggested-by: Javi Merino javi.mer...@arm.com
 Signed-off-by: Steven Rostedt rost...@goodmis.org
 ---
  include/linux/ftrace_event.h |  3 +++
  include/linux/trace_seq.h| 10 
  include/trace/ftrace.h   | 57 
 +++-
  kernel/trace/trace_output.c  | 41 +++
  4 files changed, 110 insertions(+), 1 deletion(-)
 
 diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
 index d16da3e..cff3106 100644
 --- a/include/linux/ftrace_event.h
 +++ b/include/linux/ftrace_event.h
 @@ -38,6 +38,9 @@ const char *ftrace_print_symbols_seq_u64(struct trace_seq 
 *p,
*symbol_array);
  #endif
  
 +const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
 +  unsigned int bitmask_size);
 +
  const char *ftrace_print_hex_seq(struct trace_seq *p,
const unsigned char *buf, int len);
  
 diff --git a/include/linux/trace_seq.h b/include/linux/trace_seq.h
 index a32d86e..1361169 100644
 --- a/include/linux/trace_seq.h
 +++ b/include/linux/trace_seq.h
 @@ -46,6 +46,9 @@ extern int trace_seq_putmem_hex(struct trace_seq *s, const 
 void *mem,
  extern void *trace_seq_reserve(struct trace_seq *s, size_t len);
  extern int trace_seq_path(struct trace_seq *s, const struct path *path);
  
 +extern int trace_seq_bitmask(struct trace_seq *s, const unsigned long *maskp,
 +  int nmaskbits);
 +
  #else /* CONFIG_TRACING */
  static inline int trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
  {
 @@ -57,6 +60,13 @@ trace_seq_bprintf(struct trace_seq *s, const char *fmt, 
 const u32 *binary)
   return 0;
  }
  
 +static inline int
 +trace_seq_bitmask(struct trace_seq *s, const unsigned long *maskp,
 +   int nmaskbits)
 +{
 + return 0;
 +}
 +
  static inline int trace_print_seq(struct seq_file *m, struct trace_seq *s)
  {
   return 0;
 diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
 index 0a1a4f7..9b7a989 100644
 --- a/include/trace/ftrace.h
 +++ b/include/trace/ftrace.h
 @@ -53,6 +53,9 @@
  #undef __string
  #define __string(item, src) __dynamic_array(char, item, -1)
  
 +#undef __bitmask
 +#define __bitmask(item, nr_bits) __dynamic_array(char, item, -1)
 +
  #undef TP_STRUCT__entry
  #define TP_STRUCT__entry(args...) args
  
 @@ -128,6 +131,9 @@
  #undef __string
  #define __string(item, src) __dynamic_array(char, item, -1)
  
 +#undef __bitmask
 +#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
 +
  #undef DECLARE_EVENT_CLASS
  #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)   
 \
   struct ftrace_data_offsets_##call { \
 @@ -200,6 +206,15 @@
  #undef __get_str
  #define __get_str(field) (char *)__get_dynamic_array(field)
  
 +#undef __get_bitmask
 +#define __get_bitmask(field) \
 + ({  \
 + void *__bitmask = __get_dynamic_array(field);   \
 + unsigned int __bitmask_size;\
 + __bitmask_size = (__entry->__data_loc_##field >> 16) & 0xffff; \
 + ftrace_print_bitmask_seq(p, __bitmask, __bitmask_size); \
 + })
 +
  #undef __print_flags
  #define __print_flags(flag, delim, flag_array...)\
   ({  \
 @@ -322,6 +337,9 @@ static struct trace_event_functions 
 ftrace_event_type_funcs_##call = {\
  #undef __string
  #define __string(item, src) __dynamic_array(char, item, -1)
  
 +#undef __bitmask
 +#define

[RFC PATCH v4 6/7] thermal: add trace events to the power allocator governor

2014-06-17 Thread Javi Merino
Add trace events for the power allocator governor and the power actor
interface of the cpu cooling device.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Cc: Steven Rostedt rost...@goodmis.org
Cc: Frederic Weisbecker fweis...@gmail.com
Cc: Ingo Molnar mi...@redhat.com
Signed-off-by: Javi Merino javi.mer...@arm.com

---

trace-cmd needs the patch attached in
http://article.gmane.org/gmane.linux.kernel/1704423 for this to work.

 drivers/thermal/cpu_actor.c  |  5 +++
 drivers/thermal/power_allocator.c| 12 ++-
 include/trace/events/thermal_power.h | 62 
 3 files changed, 78 insertions(+), 1 deletion(-)
 create mode 100644 include/trace/events/thermal_power.h

diff --git a/drivers/thermal/cpu_actor.c b/drivers/thermal/cpu_actor.c
index 67897b1ded62..4ef715dea87f 100644
--- a/drivers/thermal/cpu_actor.c
+++ b/drivers/thermal/cpu_actor.c
@@ -27,6 +27,8 @@
 #include <linux/printk.h>
 #include <linux/slab.h>
 
+#include <trace/events/thermal_power.h>
+
 #include "power_actor.h"
 
 /**
@@ -297,6 +299,9 @@ static int cpu_set_power(struct power_actor *actor,
return -EINVAL;
}
 
+   trace_thermal_power_limit(cpu_actor->cpumask, target_freq, cdev_state,
+   power);
+
return cdev->ops->set_cur_state(cdev, cdev_state);
 }
 
diff --git a/drivers/thermal/power_allocator.c 
b/drivers/thermal/power_allocator.c
index dd781eb29568..a10c5ed26820 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -19,6 +19,9 @@
 #include <linux/slab.h>
 #include <linux/thermal.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/thermal_power.h>
+
 #include "power_actor.h"
 #include "thermal_core.h"
 
@@ -133,7 +136,14 @@ static u32 pid_controller(struct thermal_zone_device *tz,
/* feed-forward the known sustainable dissipatable power */
power_range = tz->tzp->sustainable_power + frac_to_int(power_range);
 
-   return clamp(power_range, (s64)0, (s64)max_allocatable_power);
+   power_range = clamp(power_range, (s64)0, (s64)max_allocatable_power);
+
+   trace_thermal_power_allocator_pid(frac_to_int(err),
+   frac_to_int(params->err_integral),
+   frac_to_int(p), frac_to_int(i),
+   frac_to_int(d), power_range);
+
+   return power_range;
 }
 
 /**
diff --git a/include/trace/events/thermal_power.h 
b/include/trace/events/thermal_power.h
new file mode 100644
index ..6629f8b4ca9f
--- /dev/null
+++ b/include/trace/events/thermal_power.h
@@ -0,0 +1,62 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM thermal_power
+
+#if !defined(_TRACE_THERMAL_POWER_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_THERMAL_POWER_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(thermal_power_allocator_pid,
+   TP_PROTO(s32 err, s32 err_integral, s64 p, s64 i, s64 d, s32 output),
+   TP_ARGS(err, err_integral, p, i, d, output),
+   TP_STRUCT__entry(
+   __field(s32, err )
+   __field(s32, err_integral)
+   __field(s64, p   )
+   __field(s64, i   )
+   __field(s64, d   )
+   __field(s32, output  )
+   ),
+   TP_fast_assign(
+   __entry->err = err;
+   __entry->err_integral = err_integral;
+   __entry->p = p;
+   __entry->i = i;
+   __entry->d = d;
+   __entry->output = output;
+   ),
+
+   TP_printk("err=%d err_integral=%d p=%lld i=%lld d=%lld output=%d",
+   __entry->err, __entry->err_integral,
+   __entry->p, __entry->i, __entry->d, __entry->output)
+);
+
+TRACE_EVENT(thermal_power_limit,
+   TP_PROTO(const struct cpumask *cpus, unsigned int freq,
+   unsigned long cdev_state, u32 power),
+
+   TP_ARGS(cpus, freq, cdev_state, power),
+
+   TP_STRUCT__entry(
+   __bitmask(cpumask, num_possible_cpus())
+   __field(unsigned int,  freq  )
+   __field(unsigned long, cdev_state)
+   __field(u32,   power )
+   ),
+
+   TP_fast_assign(
+   __assign_bitmask(cpumask, cpumask_bits(cpus),
+   num_possible_cpus());
+   __entry->freq = freq;
+   __entry->cdev_state = cdev_state;
+   __entry->power = power;
+   ),
+
+   TP_printk("cpus=%s freq=%u cdev_state=%lu power=%u",
+   __get_bitmask(cpumask), __entry->freq, __entry->cdev_state,
+   __entry->power)
+);
+#endif /* _TRACE_THERMAL_POWER_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
-- 
1.9.1



[RFC PATCH v4 1/7] thermal: document struct thermal_zone_device and thermal_governor

2014-06-17 Thread Javi Merino
Document struct thermal_zone_device and struct thermal_governor fields
and their use by the thermal framework code.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
Hi linux-pm,

This patch is independent of the whole series and can be merged independently

 include/linux/thermal.h | 46 --
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f7e11c7ea7d9..0305cde21a74 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -158,6 +158,42 @@ struct thermal_attr {
char name[THERMAL_NAME_LENGTH];
 };
 
+/**
+ * struct thermal_zone_device - structure for a thermal zone
+ * @id:unique id number for each thermal zone
+ * @type:  the thermal zone device type
+ * @device:struct device for this thermal zone
+ * @trip_temp_attrs:   attributes for trip points for sysfs: trip temperature
+ * @trip_type_attrs:   attributes for trip points for sysfs: trip type
+ * @trip_hyst_attrs:   attributes for trip points for sysfs: trip hysteresis
+ * @devdata:   private pointer for device private data
+ * @trips: number of trip points the thermal zone supports
+ * @passive_delay: number of milliseconds to wait between polls when
+ * performing passive cooling.  Currently only used by the
+ * step-wise governor
+ * @polling_delay: number of milliseconds to wait between polls when
+ * checking whether trip points have been crossed (0 for
+ * interrupt driven systems)
+ * @temperature:   current temperature.  This is only for core code,
+ * drivers should use thermal_zone_get_temp() to get the
+ * current temperature
+ * @last_temperature:  previous temperature read
+ * @emul_temperature:  emulated temperature when using CONFIG_THERMAL_EMULATION
+ * @passive:   1 if you've crossed a passive trip point, 0 otherwise.
+ * Currently only used by the step-wise governor.
+ * @forced_passive:If > 0, temperature at which to switch on all ACPI
+ * processor cooling devices.  Currently only used by the
+ * step-wise governor.
+ * @ops:   operations this thermal_zone_device supports
+ * @tzp:   thermal zone parameters
+ * @governor:  pointer to the governor for this thermal zone
+ * @thermal_instances: list of struct thermal_instance of this thermal zone
+ * @idr:   struct idr to generate unique id for this zone's cooling
+ * devices
+ * @lock:  lock to protect thermal_instances list
+ * @node:  node in thermal_tz_list (in thermal_core.c)
+ * @poll_queue:delayed work for polling
+ */
 struct thermal_zone_device {
int id;
char type[THERMAL_NAME_LENGTH];
@@ -179,12 +215,18 @@ struct thermal_zone_device {
struct thermal_governor *governor;
struct list_head thermal_instances;
struct idr idr;
-   struct mutex lock; /* protect thermal_instances list */
+   struct mutex lock;
struct list_head node;
struct delayed_work poll_queue;
 };
 
-/* Structure that holds thermal governor information */
+/**
+ * struct thermal_governor - structure that holds thermal governor information
+ * @name:  name of the governor
+ * @throttle:  callback called for every trip point even if temperature is
+ * below the trip point temperature
+ * @governor_list: node in thermal_governor_list (in thermal_core.c)
+ */
 struct thermal_governor {
char name[THERMAL_NAME_LENGTH];
int (*throttle)(struct thermal_zone_device *tz, int trip);
-- 
1.9.1




[RFC PATCH v4 2/7] thermal: let governors have private data for each thermal zone

2014-06-17 Thread Javi Merino
A governor may need to store its current state between calls to
throttle().  That state depends on the thermal zone, so store it as
private data in struct thermal_zone_device.

The governors may have two new ops: bind_to_tz() and unbind_from_tz().
When provided, these functions let governors do some initialization
and teardown when they are bound/unbound to a tz and possibly store that
information in the governor_data field of the struct
thermal_zone_device.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 drivers/thermal/thermal_core.c | 83 ++
 include/linux/thermal.h|  9 +
 2 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 71b0ec0c370d..3da99dd80ad5 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -72,6 +72,58 @@ static struct thermal_governor *__find_governor(const char *name)
return NULL;
 }
 
+/**
+ * bind_previous_governor() - bind the previous governor of the thermal zone
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @failed_gov_name:   the name of the governor that failed to register
+ *
+ * Register the previous governor of the thermal zone after a new
+ * governor has failed to be bound.
+ */
+static void bind_previous_governor(struct thermal_zone_device *tz,
+   const char *failed_gov_name)
+{
+   if (tz->governor && tz->governor->bind_to_tz) {
+   if (tz->governor->bind_to_tz(tz)) {
+   dev_warn(&tz->device,
+   "governor %s failed to bind and the previous one (%s) failed to register again, thermal zone %s has no governor\n",
+   failed_gov_name, tz->governor->name, tz->type);
+   tz->governor = NULL;
+   }
+   }
+}
+
+/**
+ * thermal_set_governor() - Switch to another governor
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @new_gov:   pointer to the new governor
+ *
+ * Change the governor of thermal zone @tz.
+ *
+ * Return: 0 on success, an error if the new governor's bind_to_tz() failed.
+ */
+static int thermal_set_governor(struct thermal_zone_device *tz,
+   struct thermal_governor *new_gov)
+{
+   int ret = 0;
+
+   if (tz->governor && tz->governor->unbind_from_tz)
+   tz->governor->unbind_from_tz(tz);
+
+   if (new_gov && new_gov->bind_to_tz) {
+   ret = new_gov->bind_to_tz(tz);
+   if (ret) {
+   bind_previous_governor(tz, new_gov->name);
+
+   return ret;
+   }
+   }
+
+   tz->governor = new_gov;
+
+   return ret;
+}
+
 int thermal_register_governor(struct thermal_governor *governor)
 {
int err;
@@ -104,8 +156,15 @@ int thermal_register_governor(struct thermal_governor 
*governor)
 
name = pos->tzp->governor_name;
 
-   if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH))
-   pos->governor = governor;
+   if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH)) {
+   int ret;
+
+   ret = thermal_set_governor(pos, governor);
+   if (ret)
+   dev_warn(&pos->device,
+   "Failed to set governor %s for thermal zone %s: %d\n",
+   governor->name, pos->type, ret);
+   }
}
 
mutex_unlock(&thermal_list_lock);
@@ -131,7 +190,7 @@ void thermal_unregister_governor(struct thermal_governor *governor)
list_for_each_entry(pos, &thermal_tz_list, node) {
if (!strnicmp(pos->governor->name, governor->name,
THERMAL_NAME_LENGTH))
-   pos->governor = NULL;
+   thermal_set_governor(pos, NULL);
}
 
mutex_unlock(&thermal_list_lock);
@@ -756,8 +815,9 @@ policy_store(struct device *dev, struct device_attribute *attr,
if (!gov)
goto exit;
 
-   tz->governor = gov;
-   ret = count;
+   ret = thermal_set_governor(tz, gov);
+   if (!ret)
+   ret = count;
 
 exit:
mutex_unlock(&thermal_governor_lock);
@@ -1452,6 +1512,7 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
int result;
int count;
int passive = 0;
+   struct thermal_governor *governor;
 
if (type && strlen(type) >= THERMAL_NAME_LENGTH)
return ERR_PTR(-EINVAL);
@@ -1542,9 +1603,15 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
mutex_lock(thermal_governor_lock);
 
if (tz-tzp)
-   tz-governor = __find_governor(tz-tzp
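
The governor switch in thermal_set_governor() above follows an
unbind, bind, roll back on failure pattern.  Below is a minimal
user-space sketch of that pattern.  It is not the kernel code: the
demo_* structs and callbacks are hypothetical stand-ins, and the body
of bind_previous_governor() is not shown in this excerpt, so the
sketch assumes it simply re-binds the governor that was active before.

```c
#include <assert.h>
#include <stddef.h>

struct demo_zone;

/* Hypothetical stand-in for struct thermal_governor. */
struct demo_governor {
	const char *name;
	int (*bind_to_tz)(struct demo_zone *tz);	/* optional */
	void (*unbind_from_tz)(struct demo_zone *tz);	/* optional */
};

/* Hypothetical stand-in for struct thermal_zone_device. */
struct demo_zone {
	struct demo_governor *governor;
	int bound;	/* how many governors are currently bound: 0 or 1 */
};

/*
 * Switch @tz to @new_gov.  If the new governor fails to bind, re-bind
 * the previous one (assumed behaviour of bind_previous_governor()) and
 * leave tz->governor untouched.
 */
static int demo_set_governor(struct demo_zone *tz,
			     struct demo_governor *new_gov)
{
	struct demo_governor *old = tz->governor;
	int ret = 0;

	if (old && old->unbind_from_tz)
		old->unbind_from_tz(tz);

	if (new_gov && new_gov->bind_to_tz) {
		ret = new_gov->bind_to_tz(tz);
		if (ret) {
			if (old && old->bind_to_tz)
				old->bind_to_tz(tz);
			return ret;
		}
	}

	tz->governor = new_gov;
	return ret;
}

/* Callbacks used only to exercise the sketch. */
static int demo_bind_ok(struct demo_zone *tz)
{
	tz->bound++;
	return 0;
}

static int demo_bind_fail(struct demo_zone *tz)
{
	(void)tz;
	return -22;	/* same value as -EINVAL on Linux */
}

static void demo_unbind(struct demo_zone *tz)
{
	tz->bound--;
}
```

The point of the rollback is that a failed switch leaves the zone with
a working governor instead of none at all.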

[RFC PATCH v4 4/7] thermal: add a basic cpu power actor

2014-06-17 Thread Javi Merino
Introduce a power actor for cpus.  It has a basic power model to get
the current power utilization and uses cpufreq cooling devices to set
the desired power.  It uses the current frequency (as reported by
cpufreq) as well as load and OPPs for the power calculations.  The
cpus must have registered their OPPs in the OPP library.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Punit Agrawal punit.agra...@arm.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/power_actor.txt | 125 +
 drivers/thermal/Kconfig   |   3 +
 drivers/thermal/Makefile  |   1 +
 drivers/thermal/cpu_actor.c   | 479 ++
 drivers/thermal/power_actor.h |  30 +++
 5 files changed, 638 insertions(+)
 create mode 100644 drivers/thermal/cpu_actor.c

diff --git a/Documentation/thermal/power_actor.txt b/Documentation/thermal/power_actor.txt
index 11ca2d0bf0bd..c96344f12599 100644
--- a/Documentation/thermal/power_actor.txt
+++ b/Documentation/thermal/power_actor.txt
@@ -54,3 +54,128 @@ temperature.
 milliwatts.
 
 Returns 0 on success, -E* on error.
+
+CPU Power Actor API
+===================
+
+A simple power model for CPUs.  The current power is calculated as
+dynamic + (optionally) static power.  This power model requires that
+the operating-points of the CPUs are registered using the kernel's opp
+library and the `cpufreq_frequency_table` is assigned to the `struct
+device` of the cpu.  If you are using the `cpufreq-cpu0.c` driver then
+the `cpufreq_frequency_table` should already be assigned to the cpu
+device.
+
+The `plat_static_func` parameter of `power_cpu_actor_register()` is
+optional.  If you don't provide it, only dynamic power will be
+considered.
+
+Dynamic power
+-------------
+
+The dynamic power consumption of a processor depends on many factors.
+For a given processor implementation the primary factors are:
+
+- The time the processor spends running, consuming dynamic power, as
+  compared to the time in idle states where dynamic consumption is
+  negligible.  Herein we refer to this as 'utilisation'.
+- The voltage and frequency levels as a result of DVFS.  The DVFS
+  level is a dominant factor governing power consumption.
+- In running time the 'execution' behaviour (instruction types, memory
+  access patterns and so forth) causes, in most cases, a second order
+  variation.  In pathological cases this variation can be significant,
+  but typically it is of a much lesser impact than the factors above.
+
+A high level dynamic power consumption model may then be represented as:
+
+Pdyn = f(run) * Voltage^2 * Frequency * Utilisation
+
+f(run) here represents the described execution behaviour and its
+result has units of Watts/Hz/Volt^2 (this is often expressed in
+mW/MHz/uVolt^2)
+
+The detailed behaviour for f(run) could be modelled on-line.  However,
+in practice, such an on-line model has dependencies on a number of
+implementation specific processor support and characterisation
+factors.  Therefore, in the initial implementation that contribution is
+represented as a constant coefficient.  This is a simplification
+consistent with the relative contribution to overall power variation.
+
+In this simplified representation our model becomes:
+
+Pdyn = Kd * Voltage^2 * Frequency * Utilisation
+
+Where Kd (capacitance) represents an indicative running time dynamic
+power coefficient in fundamental units of mW/MHz/uVolt^2
+
+Static Power
+------------
+
+Static leakage power consumption depends on a number of factors.  For a
+given circuit implementation the primary factors are:
+
+- Time the circuit spends in each 'power state'
+- Temperature
+- Operating voltage
+- Process grade
+
+The time the circuit spends in each 'power state' for a given
+evaluation period at first order means OFF or ON.  However,
+'retention' states can also be supported that reduce power during
+inactive periods without loss of context.
+
+Note: The visibility of state entries to the OS can vary, according to
+platform specifics, and this can then impact the accuracy of a model
+based on OS state information alone.  It might be possible in some
+cases to extract more accurate information from system resources.
+
+The temperature, operating voltage and process 'grade' (slow to fast)
+of the circuit are all significant factors in static leakage power
+consumption.  All of these have complex relationships to static power.
+
+Circuit implementation specific factors include the chosen silicon
+process as well as the type, number and size of transistors in both
+the logic gates and any RAM elements included.
+
+The static power consumption modelling must take into account the
+power managed regions that are implemented.  Taking the example of an
+ARM processor cluster, the modelling would take into account whether
+each CPU can be powered OFF separately or if only a single power
+region is implemented
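
The simplified dynamic power model in the text above,
Pdyn = Kd * Voltage^2 * Frequency * Utilisation, together with the
optional static term, can be sketched as follows.  This is an
illustration only, not the code in cpu_actor.c: the demo_* names are
hypothetical, the callback merely plays the role of the optional
`plat_static_func` (its real signature is not shown in this excerpt),
and the units (Kd in mW/MHz/V^2, voltage in volts) are chosen for
readability.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical optional callback playing the role of plat_static_func. */
typedef double (*demo_static_fn)(double volts);

/*
 * Pdyn = Kd * Voltage^2 * Frequency * Utilisation
 * kd is assumed to be in mW/MHz/V^2; utilisation is clamped to [0, 1].
 */
static double demo_dynamic_power_mw(double kd, double volts,
				    double freq_mhz, double utilisation)
{
	if (utilisation < 0.0)
		utilisation = 0.0;
	if (utilisation > 1.0)
		utilisation = 1.0;

	return kd * volts * volts * freq_mhz * utilisation;
}

/* Total power: dynamic plus static, the latter only if a model is given. */
static double demo_cpu_power_mw(double kd, double volts, double freq_mhz,
				double utilisation, demo_static_fn static_fn)
{
	double power = demo_dynamic_power_mw(kd, volts, freq_mhz, utilisation);

	if (static_fn)
		power += static_fn(volts);

	return power;
}

/* Example static model used only for illustration: a flat 50 mW leak. */
static double demo_flat_leakage_mw(double volts)
{
	(void)volts;
	return 50.0;
}
```

Passing NULL as the static model gives the dynamic-only behaviour the
documentation describes for a missing `plat_static_func`.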

[RFC PATCH v4 0/7] The power allocator thermal governor

2014-06-17 Thread Javi Merino
Hi linux-pm,

The power allocator governor allocates device power to control
temperature.  This requires transforming performance requests into
requested power, which we do with the aid of power models.  Patch 4
(thermal: add a basic cpu power actor) implements a simple power model
for cpus.  The division of power between the actors ensures that power
is allocated where it is needed the most, based on the current
workload.

Patch 1 is a generic documentation of the current thermal framework
and can be merged separately.

Changes since v3:
  - Use tz->passive to poll faster when the first trip point is hit.
  - Don't make a special directory for power_actors
  - Add a DT property for sustainable-power
  - Simplify the static power interface and pass the current thermal
zone in every power_actor_ops to remove the controversial
enum power_actor_types
  - Use locks with the actor_list list
  - Use cpufreq_get() to get the frequency of the cpu instead of
using the notifiers.
  - Remove the prompt for THERMAL_POWER_ACTOR_CPU when configuring
the kernel

Changes since v2:
  - Changed the PI controller into a PID controller
  - Added static power to the cpu power model
  - tz parameter max_dissipatable_power renamed to sustainable_power
  - Register the cpufreq cooling device as part of the
power_cpu_actor registration.

Changes since v1:
  - Fixed finding cpufreq cooling devices in cpufreq_frequency_change()
  - Replaced the cooling device interface with a separate power actor
API
  - Addressed most of Eduardo's comments
  - Incorporated ftrace support for bitmask to trace cpumasks

Todo:
  - Rethink the use of trip points and make it less intrusive
  - Let platforms override the power allocator governor parameters
  - Add more tracing and provide scripts to evaluate the proposal.
  - Tune it to achieve the temperature stability we are aiming for

Cheers,
Javi & Punit

Javi Merino (6):
  thermal: document struct thermal_zone_device and thermal_governor
  thermal: let governors have private data for each thermal zone
  thermal: introduce the Power Actor API
  thermal: add a basic cpu power actor
  thermal: introduce the Power Allocator governor
  thermal: add trace events to the power allocator governor

Punit Agrawal (1):
  of: thermal: Introduce sustainable power for a thermal zone

 .../devicetree/bindings/thermal/thermal.txt|   4 +
 Documentation/thermal/power_actor.txt  | 181 
 Documentation/thermal/power_allocator.txt  |  41 ++
 drivers/thermal/Kconfig|  21 +
 drivers/thermal/Makefile   |   5 +
 drivers/thermal/cpu_actor.c| 484 +
 drivers/thermal/of-thermal.c   |   4 +
 drivers/thermal/power_actor.c  |  68 +++
 drivers/thermal/power_actor.h  |  91 
 drivers/thermal/power_allocator.c  | 477 
 drivers/thermal/thermal_core.c |  90 +++-
 drivers/thermal/thermal_core.h |   8 +
 include/linux/thermal.h|  63 ++-
 include/trace/events/thermal_power.h   |  62 +++
 14 files changed, 1588 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/thermal/power_actor.txt
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/cpu_actor.c
 create mode 100644 drivers/thermal/power_actor.c
 create mode 100644 drivers/thermal/power_actor.h
 create mode 100644 drivers/thermal/power_allocator.c
 create mode 100644 include/trace/events/thermal_power.h

-- 
1.9.1




[RFC PATCH v4 5/7] thermal: introduce the Power Allocator governor

2014-06-17 Thread Javi Merino
The power allocator governor is a thermal governor that controls system
and device power allocation to control temperature.  Conceptually, the
implementation divides the sustainable power of a thermal zone among
all the heat sources in that zone.

This governor relies on power actors, entities that represent heat
sources.  They can report current and maximum power consumption and
can set a given maximum power consumption, usually via a cooling
device.

The governor uses a Proportional Integral Derivative (PID) controller
driven by the temperature of the thermal zone.  The output of the
controller is a power budget that is then allocated to each power
actor that can have bearing on the temperature we are trying to
control.  It decides how much power to give each cooling device based
on the performance they are requesting.  The PID controller sizes the
total power budget so that the thermal zone does not exceed the
control temperature.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Punit Agrawal punit.agra...@arm.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/power_allocator.txt |  41 +++
 drivers/thermal/Kconfig   |  15 +
 drivers/thermal/Makefile  |   1 +
 drivers/thermal/power_allocator.c | 467 ++
 drivers/thermal/thermal_core.c|   7 +-
 drivers/thermal/thermal_core.h|   8 +
 include/linux/thermal.h   |   8 +
 7 files changed, 546 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/power_allocator.c

diff --git a/Documentation/thermal/power_allocator.txt b/Documentation/thermal/power_allocator.txt
new file mode 100644
index ..93a3ce90322d
--- /dev/null
+++ b/Documentation/thermal/power_allocator.txt
@@ -0,0 +1,41 @@
+Integration of the power_allocator governor in a platform
+=========================================================
+
+Registering thermal_zone_device
+-------------------------------
+
+An estimate of the sustainable dissipatable power (in mW) should be
+provided while registering the thermal zone.  This is the maximum
+sustained power for allocation at the desired maximum temperature.
+This number can vary with conditions, but the closed loop of the
+controller should absorb those variations, so `sustainable_power`
+only needs to be an estimate.  Register your thermal zone with
+`thermal_zone_params` that set `sustainable_power`.  If you weren't
+passing any
+`thermal_zone_params`, then something like this will do:
+
+   static const struct thermal_zone_params tz_params = {
+   .sustainable_power = 3500,
+   };
+
+and then pass `tz_params` as the 6th parameter to
+`thermal_zone_device_register()`
+
+Trip points
+-----------
+
+The governor requires the following two trip points:
+
+1.  switch on trip point: temperature above which the governor
+control loop starts operating
+2.  desired temperature trip point: it should be higher than the
+switch on trip point. It is the target temperature the governor
+is controlling for.
+
+The trip points can be either active or passive.
+
+Power actors
+------------
+
+Devices controlled by this governor must be registered with the power
+actor API.  Read `power_actor.txt` for more information about them.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index c3cb4be49695..fef56342450a 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -71,6 +71,14 @@ config THERMAL_DEFAULT_GOV_USER_SPACE
  Select this if you want to let the user space manage the
  platform thermals.
 
+config THERMAL_DEFAULT_GOV_POWER_ALLOCATOR
+	bool "power_allocator"
+   select THERMAL_GOV_POWER_ALLOCATOR
+   help
+ Select this if you want to control temperature based on
+ system and device power allocation. This governor relies on
+ power actors to operate.
+
 endchoice
 
 config THERMAL_GOV_FAIR_SHARE
@@ -89,6 +97,13 @@ config THERMAL_GOV_USER_SPACE
help
  Enable this to let the user space manage the platform thermals.
 
+config THERMAL_GOV_POWER_ALLOCATOR
+	bool "Power allocator thermal governor"
+   select THERMAL_POWER_ACTOR
+   help
+ Enable this to manage platform thermals by dynamically
+ allocating and limiting power to devices.
+
 config THERMAL_POWER_ACTOR
bool
 
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 74f97c90a46c..e74d57d0fe61 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -13,6 +13,7 @@ thermal_sys-$(CONFIG_THERMAL_OF)  += of-thermal.o
 thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE)   += fair_share.o
 thermal_sys-$(CONFIG_THERMAL_GOV_STEP_WISE)+= step_wise.o
 thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)   += user_space.o
+thermal_sys
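
The control flow described in the commit message above (a PID
controller turns the temperature error into a power budget, which is
then split among actors according to what they request) can be
sketched as below.  The body of power_allocator.c is not part of this
excerpt, so this is not the patch's algorithm: the gains, the integer
arithmetic, the proportional split and all demo_* names are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical PID state; gains are in mW per degree of error. */
struct demo_pid {
	int k_p;
	int k_i;
	int k_d;
	int integral;	/* accumulated error */
	int prev_err;	/* error seen on the previous call */
};

/*
 * Return a power budget in mW: the sustainable power adjusted by the
 * PID output.  A zone below the control temperature gets more than
 * sustainable_mw, a zone above it gets less (never below zero).
 */
static int demo_pid_budget(struct demo_pid *pid, int sustainable_mw,
			   int control_temp, int temp)
{
	int err = control_temp - temp;
	int out;

	pid->integral += err;
	out = pid->k_p * err + pid->k_i * pid->integral +
	      pid->k_d * (err - pid->prev_err);
	pid->prev_err = err;

	out += sustainable_mw;
	return out < 0 ? 0 : out;
}

/*
 * Split @budget_mw among @n actors in proportion to their requests.
 * If the requests fit in the budget, every actor gets what it asked for.
 */
static void demo_divide_power(const unsigned int *req_mw,
			      unsigned int *granted_mw, size_t n,
			      unsigned int budget_mw)
{
	unsigned long long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += req_mw[i];

	for (i = 0; i < n; i++) {
		if (total <= budget_mw)
			granted_mw[i] = req_mw[i];
		else
			granted_mw[i] = (unsigned int)
				((unsigned long long)req_mw[i] * budget_mw / total);
	}
}
```

With a proportional gain of 100 and a sustainable power of 3500 mW, a
zone 5 degrees below the control temperature gets a 4000 mW budget and
a zone 10 degrees above it gets 2500 mW.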

[RFC PATCH v4 3/7] thermal: introduce the Power Actor API

2014-06-17 Thread Javi Merino
This patch introduces the Power Actor API in the thermal framework.
With it, devices that can report their power consumption and control
it can be registered.  This base interface is meant to be used to
derive specific power actors, such as a cpu power actor.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/power_actor.txt | 56 +
 drivers/thermal/Kconfig   |  3 ++
 drivers/thermal/Makefile  |  3 ++
 drivers/thermal/power_actor.c | 68 +++
 drivers/thermal/power_actor.h | 61 +++
 5 files changed, 191 insertions(+)
 create mode 100644 Documentation/thermal/power_actor.txt
 create mode 100644 drivers/thermal/power_actor.c
 create mode 100644 drivers/thermal/power_actor.h

diff --git a/Documentation/thermal/power_actor.txt b/Documentation/thermal/power_actor.txt
new file mode 100644
index ..11ca2d0bf0bd
--- /dev/null
+++ b/Documentation/thermal/power_actor.txt
@@ -0,0 +1,56 @@
+Power Actor API
+===============
+
+The base power actor API is meant to be used to derive specific power
+actors, such as a cpu power actor.  Power actors can be registered by
+calling `power_actor_register()` and should be unregistered by calling
+`power_actor_unregister()` with the `struct power_actor *` received in
+the call to `power_actor_register()`.
+
+This can't be implemented using the cooling device API because:
+
+1.  get_max_state() gives you the maximum cooling state which, for
+passive devices, is the minimum performance (frequency in case of
+cpufreq cdev).  get_max_power() gives you the maximum power, which
+gives you the maximum performance (frequency in the case of CPUs,
+GPUs and buses)
+
+2.  You need to pass the thermal_zone_device to all the callbacks,
+something that the current cooling device API doesn't do.
+
+Callbacks
+---------
+
+1. u32 get_req_power(struct power_actor *actor,
+   struct thermal_zone_device *tz)
+@actor: a valid `struct power_actor *` registered with
+`power_actor_register()`
+@tz:   the thermal zone closest to the actor (typically, the thermal
+   zone the caller is operating on)
+
+`get_req_power()` returns the current requested power in milliwatts.
+
+2. u32 get_max_power(struct power_actor *actor,
+   struct thermal_zone_device *tz)
+@actor: a valid `struct power_actor *` registered with
+`power_actor_register()`
+@tz:   the thermal zone closest to the actor (typically, the thermal
+   zone the caller is operating on)
+
+`get_max_power()` returns the maximum power that the device could
+consume if it was fully utilized.  It's a function as some devices'
+maximum power consumption can change due to external factors such as
+temperature.
+
+3. int set_power(struct power_actor *actor,
+   struct thermal_zone_device *tz, u32 power)
+@actor: a valid `struct power_actor *` registered with
+`power_actor_register()`
+@tz:   the thermal zone closest to the actor (typically, the thermal
+   zone the caller is operating on)
+@power: power in milliwatts
+
+`set_power()` should configure the device to consume @power
+milliwatts.
+
+Returns 0 on success, -E* on error.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index f9a13867cb70..ce4ebe17252c 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -89,6 +89,9 @@ config THERMAL_GOV_USER_SPACE
help
  Enable this to let the user space manage the platform thermals.
 
+config THERMAL_POWER_ACTOR
+   bool
+
 config CPU_THERMAL
 	bool "generic cpu cooling support"
depends on CPU_FREQ
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index de0636a57a64..d83aa42ab573 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -14,6 +14,9 @@ thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE)  += fair_share.o
 thermal_sys-$(CONFIG_THERMAL_GOV_STEP_WISE)+= step_wise.o
 thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)   += user_space.o
 
+# power actors
+obj-$(CONFIG_THERMAL_POWER_ACTOR) += power_actor.o
+
 # cpufreq cooling
 thermal_sys-$(CONFIG_CPU_THERMAL)  += cpu_cooling.o
 
diff --git a/drivers/thermal/power_actor.c b/drivers/thermal/power_actor.c
new file mode 100644
index ..d4f7bdbe371e
--- /dev/null
+++ b/drivers/thermal/power_actor.c
@@ -0,0 +1,68 @@
+/*
+ * Basic interface for power actors
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even

[RFC PATCH v4 7/7] of: thermal: Introduce sustainable power for a thermal zone

2014-06-17 Thread Javi Merino
From: Punit Agrawal punit.agra...@arm.com

Introduce an optional property called sustainable-power, which
represents the power (in mW) which the thermal zone can safely
dissipate.

If provided, the property is parsed and associated with the thermal
zone via the thermal zone parameters.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Punit Agrawal punit.agra...@arm.com
---
 Documentation/devicetree/bindings/thermal/thermal.txt | 4 
 drivers/thermal/of-thermal.c  | 4 
 2 files changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt
index f5db6b72a36f..c6eb9a8d2aed 100644
--- a/Documentation/devicetree/bindings/thermal/thermal.txt
+++ b/Documentation/devicetree/bindings/thermal/thermal.txt
@@ -167,6 +167,10 @@ Optional property:
by means of sensor ID. Additional coefficients are
interpreted as constant offset.
 
+- sustainable-power:   An estimate of the sustainable power (in mW) that the
+  Type: unsigned   thermal zone can dissipate.
+  Size: one cell
+
 Note: The delay properties are bound to the maximum dT/dt (temperature
 derivative over time) in two situations for a thermal zone:
 (i)  - when passive cooling is activated (polling-delay-passive); and
diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
index 04b1be7fa018..eaf81ea654b9 100644
--- a/drivers/thermal/of-thermal.c
+++ b/drivers/thermal/of-thermal.c
@@ -769,6 +769,7 @@ int __init of_parse_thermal_zones(void)
 	for_each_child_of_node(np, child) {
 		struct thermal_zone_device *zone;
 		struct thermal_zone_params *tzp;
+		u32 prop;
 
 		tz = thermal_of_build_thermal_zone(child);
 		if (IS_ERR(tz)) {
@@ -791,6 +792,9 @@ int __init of_parse_thermal_zones(void)
 		/* No hwmon because there might be hwmon drivers registering */
 		tzp->no_hwmon = true;
 
+		if (!of_property_read_u32(child, "sustainable-power", &prop))
+			tzp->sustainable_power = prop;
+
 		zone = thermal_zone_device_register(child->name, tz->ntrips,
 						    0, tz,
 						    ops, tzp,
-- 
1.9.1




Re: [RFC PATCH v4 6/7] thermal: add trace events to the power allocator governor

2014-06-17 Thread Javi Merino
On Tue, Jun 17, 2014 at 12:18:38PM +0100, Steven Rostedt wrote:
> On Tue, 17 Jun 2014 10:14:52 +0100
> Javi Merino javi.mer...@arm.com wrote:
> 
> > Add trace events for the power allocator governor and the power actor
> > interface of the cpu cooling device.
> > 
> > Cc: Zhang Rui rui.zh...@intel.com
> > Cc: Eduardo Valentin edubez...@gmail.com
> > Cc: Steven Rostedt rost...@goodmis.org
> > Cc: Frederic Weisbecker fweis...@gmail.com
> > Cc: Ingo Molnar mi...@redhat.com
> > Signed-off-by: Javi Merino javi.mer...@arm.com
> > 
> > ---
> > 
> > trace-cmd needs the patch attached in
> > http://article.gmane.org/gmane.linux.kernel/1704423 for this to work.
> 
> The recently released trace-cmd v2.4 contains this.

Good to know, I'll drop the text from future uploads.  Thanks!
Javi



[PATCH] tools/thermal: tmon: fix compilation errors when building statically

2014-06-02 Thread Javi Merino
tmon fails to build statically with the following error:

$ make LDFLAGS=-static
gcc -O1 -Wall -Wshadow -W -Wformat -Wimplicit-function-declaration -Wimplicit-int -fstack-protector -D VERSION=\"1.0\" -static tmon.o tui.o sysfs.o pid.o   -o tmon -lm -lpanel -lncursesw  -lpthread
tmon.o: In function `tmon_sig_handler':
tmon.c:(.text+0x21): undefined reference to `stdscr'
tmon.o: In function `tmon_cleanup':
tmon.c:(.text+0xb9): undefined reference to `stdscr'
tmon.c:(.text+0x11e): undefined reference to `stdscr'
tmon.c:(.text+0x123): undefined reference to `keypad'
tmon.c:(.text+0x12d): undefined reference to `nocbreak'
tmon.o: In function `main':
tmon.c:(.text+0x785): undefined reference to `stdscr'
tmon.c:(.text+0x78a): undefined reference to `nodelay'
tui.o: In function `setup_windows':
tui.c:(.text+0x131): undefined reference to `stdscr'
tui.c:(.text+0x176): undefined reference to `stdscr'
tui.c:(.text+0x19f): undefined reference to `stdscr'
tui.c:(.text+0x1cc): undefined reference to `stdscr'
tui.c:(.text+0x1ff): undefined reference to `stdscr'
tui.o:tui.c:(.text+0x229): more undefined references to `stdscr' follow
tui.o: In function `show_cooling_device':
[...]

stdscr() and friends are in libtinfo (part of ncurses) so add it to
the libraries that are linked in when compiling tmon to fix it.

Cc: Jacob Pan jacob.jun@linux.intel.com
Cc: Zhang Rui rui.zh...@intel.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 tools/thermal/tmon/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/thermal/tmon/Makefile b/tools/thermal/tmon/Makefile
index 447321104ec0..e775adcbd29f 100644
--- a/tools/thermal/tmon/Makefile
+++ b/tools/thermal/tmon/Makefile
@@ -21,7 +21,7 @@ OBJS = tmon.o tui.o sysfs.o pid.o
 OBJS +=
 
 tmon: $(OBJS) Makefile tmon.h
-	$(CC) ${CFLAGS} $(LDFLAGS) $(OBJS)  -o $(TARGET) -lm -lpanel -lncursesw  -lpthread
+	$(CC) ${CFLAGS} $(LDFLAGS) $(OBJS)  -o $(TARGET) -lm -lpanel -lncursesw -ltinfo -lpthread
 
 valgrind: tmon
 	sudo valgrind -v --track-origins=yes --tool=memcheck --leak-check=yes --show-reachable=yes --num-callers=20 --track-fds=yes ./$(TARGET)  1> /dev/null
-- 
1.9.1




[PATCH] thermal: document struct thermal_zone_device and thermal_governor

2014-06-02 Thread Javi Merino
Document struct thermal_zone_device and struct thermal_governor fields
and their use by the thermal framework code.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com

---

Hi linux-pm,

I have some patches that add new fields to these structures but I
don't have a good place to describe those fields as these structs are
mostly undocumented so I thought I'd document them.

Changes since v1:
  * Clarified that some parameters are currently only used by the
step-wise governor.
  * Clarified that forced_passive operates on ACPI processor cooling
devices.

 include/linux/thermal.h | 45 +++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f7e11c7ea7d9..6fca46c82c4d 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -158,6 +158,41 @@ struct thermal_attr {
char name[THERMAL_NAME_LENGTH];
 };
 
+/**
+ * struct thermal_zone_device - structure for a thermal zone
+ * @id:		unique id number for each thermal zone
+ * @type:	the thermal zone device type
+ * @device:	struct device for this thermal zone
+ * @trip_temp_attrs:	attributes for trip points for sysfs: trip temperature
+ * @trip_type_attrs:	attributes for trip points for sysfs: trip type
+ * @trip_hyst_attrs:	attributes for trip points for sysfs: trip hysteresis
+ * @devdata:	private pointer for device private data
+ * @trips:	number of trip points the thermal zone supports
+ * @passive_delay:	number of milliseconds to wait between polls when
+ *		performing passive cooling.  Currently only used by the
+ *		step-wise governor
+ * @polling_delay:	number of milliseconds to wait between polls when
+ *		checking whether trip points have been crossed (0 for
+ *		interrupt driven systems)
+ * @temperature:	current temperature.  This is only for core code,
+ *		drivers should use thermal_zone_get_temp() to get the
+ *		current temperature
+ * @last_temperature:	previous temperature read
+ * @emul_temperature:	emulated temperature when using CONFIG_THERMAL_EMULATION
+ * @passive:	1 if you've crossed a passive trip point, 0 otherwise.
+ *		Currently only used by the step-wise governor.
+ * @forced_passive:	If > 0, temperature at which to switch on all ACPI
+ *		processor cooling devices.  Currently only used by the
+ *		step-wise governor.
+ * @ops:	operations this thermal_zone_device supports
+ * @tzp:	thermal zone parameters
+ * @governor:	pointer to the governor for this thermal zone
+ * @thermal_instances:	list of struct thermal_instance of this thermal zone
+ * @idr:	struct idr to generate unique id for this zone's cooling devices
+ * @lock:	lock to protect thermal_instances list
+ * @node:	node in thermal_tz_list (in thermal_core.c)
+ * @poll_queue:	delayed work for polling
+ */
 struct thermal_zone_device {
int id;
char type[THERMAL_NAME_LENGTH];
@@ -179,12 +214,18 @@ struct thermal_zone_device {
struct thermal_governor *governor;
struct list_head thermal_instances;
struct idr idr;
-   struct mutex lock; /* protect thermal_instances list */
+   struct mutex lock;
struct list_head node;
struct delayed_work poll_queue;
 };
 
-/* Structure that holds thermal governor information */
+/**
+ * struct thermal_governor - structure that holds thermal governor information
+ * @name:  name of the governor
+ * @throttle:  callback called for every trip point even if temperature is
+ * below the trip point temperature
+ * @governor_list: node in thermal_governor_list (in thermal_core.c)
+ */
 struct thermal_governor {
char name[THERMAL_NAME_LENGTH];
int (*throttle)(struct thermal_zone_device *tz, int trip);
-- 
1.9.1




[RFC PATCH v3 0/7] The power allocator thermal governor

2014-06-03 Thread Javi Merino
Hi linux-pm,

The power allocator governor allocates device power to control
temperature.  This requires transforming performance requests into
requested power, which we do with the aid of power models.  Patch 5
(thermal: add a basic cpu power actor) implements a simple power model
for cpus.  The division of power between the actors ensures that power
is allocated where it is needed the most, based on the current
workload.

Patches 1 and 2 are not proper parts of this series and can be merged
separately.  Patch 1 (tracing: Add __bitmask() macro to trace events
to cpumasks and other bitmasks) is already in for-next[1].  Patch 2
(thermal: document struct thermal_zone_device and thermal_governor)
has already been submitted to linux-pm[2] and is generic.

[1] https://git.kernel.org/cgit/linux/kernel/git/rostedt/linux-trace.git/commit/?h=for-next&id=4449bf927b61bdb4389393c6fea6837214d1ace7
[2] http://article.gmane.org/gmane.linux.power-management.general/46041

Changes since v2:
  - Changed the PI controller into a PID controller
  - Added static power to the cpu power model
  - tz parameter max_dissipatable_power renamed to sustainable_power
  - Register the cpufreq cooling device as part of the
power_cpu_actor registration.

Changes since v1:
  - Fixed finding cpufreq cooling devices in cpufreq_frequency_change()
  - Replaced the cooling device interface with a separate power actor
API
  - Addressed most of Eduardo's comments
  - Incorporated ftrace support for bitmask to trace cpumasks

Todo:
  - Fix kerneldoc strings
  - Use tz->passive
  - Use cpufreq_get() to get the frequency of the cpu instead of
using the notifiers.
  - Rethink the use of trip points and make it less intrusive
  - Let platforms override the power allocator governor parameters
  - Add more tracing and provide scripts to evaluate the proposal.
  - Tune it to achieve the temperature stability we are aiming for

Cheers,
Javi & Punit

Javi Merino (6):
  thermal: document struct thermal_zone_device and thermal_governor
  thermal: let governors have private data for each thermal zone
  thermal: introduce the Power Actor API
  thermal: add a basic cpu power actor
  thermal: introduce the Power Allocator governor
  thermal: add trace events to the power allocator governor

Steven Rostedt (Red Hat) (1):
  tracing: Add __bitmask() macro to trace events to cpumasks and other
bitmasks

 Documentation/thermal/power_actor.txt | 164 
 Documentation/thermal/power_allocator.txt |  42 +++
 drivers/thermal/Kconfig   |  23 ++
 drivers/thermal/Makefile  |   3 +
 drivers/thermal/power_actor/Kconfig   |   9 +
 drivers/thermal/power_actor/Makefile  |   7 +
 drivers/thermal/power_actor/cpu_actor.c   | 606 ++
 drivers/thermal/power_actor/power_actor.c |  64 
 drivers/thermal/power_actor/power_actor.h | 105 ++
 drivers/thermal/power_allocator.c | 465 +++
 drivers/thermal/thermal_core.c|  90 -
 drivers/thermal/thermal_core.h|   8 +
 include/linux/ftrace_event.h  |   3 +
 include/linux/thermal.h   |  62 ++-
 include/linux/trace_seq.h |  10 +
 include/trace/events/thermal.h|  38 ++
 include/trace/events/thermal_governor.h   |  37 ++
 include/trace/ftrace.h|  57 ++-
 kernel/trace/trace_output.c   |  41 ++
 19 files changed, 1822 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/thermal/power_actor.txt
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/power_actor/Kconfig
 create mode 100644 drivers/thermal/power_actor/Makefile
 create mode 100644 drivers/thermal/power_actor/cpu_actor.c
 create mode 100644 drivers/thermal/power_actor/power_actor.c
 create mode 100644 drivers/thermal/power_actor/power_actor.h
 create mode 100644 drivers/thermal/power_allocator.c
 create mode 100644 include/trace/events/thermal.h
 create mode 100644 include/trace/events/thermal_governor.h

-- 
1.9.1




[RFC PATCH v3 1/7] tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks

2014-06-03 Thread Javi Merino
From: Steven Rostedt (Red Hat) rost...@goodmis.org

Being able to show a cpumask of events can be useful as some events
may affect only some CPUs. There is no standard way to record the
cpumask and converting it to a string is rather expensive during
the trace as traces happen in hotpaths. It would be better to record
the raw event mask and be able to parse it at print time.

The following macros were added for use with the TRACE_EVENT() macro:

  __bitmask()
  __assign_bitmask()
  __get_bitmask()

To test this, I added this to the sched_migrate_task event, which
looked like this:

TRACE_EVENT(sched_migrate_task,

	TP_PROTO(struct task_struct *p, int dest_cpu, const struct cpumask *cpus),

	TP_ARGS(p, dest_cpu, cpus),

	TP_STRUCT__entry(
		__array(char,   comm,   TASK_COMM_LEN   )
		__field(pid_t,  pid )
		__field(int,    prio)
		__field(int,    orig_cpu)
		__field(int,    dest_cpu)
		__bitmask(  cpumask, num_possible_cpus())
	),

	TP_fast_assign(
		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
		__entry->pid        = p->pid;
		__entry->prio       = p->prio;
		__entry->orig_cpu   = task_cpu(p);
		__entry->dest_cpu   = dest_cpu;
		__assign_bitmask(cpumask, cpumask_bits(cpus), num_possible_cpus());
	),

	TP_printk("comm=%s pid=%d prio=%d orig_cpu=%d dest_cpu=%d cpumask=%s",
		  __entry->comm, __entry->pid, __entry->prio,
		  __entry->orig_cpu, __entry->dest_cpu,
		  __get_bitmask(cpumask))
);

With the output of:

ksmtuned-3613  [003] d..2   485.220508: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=3 dest_cpu=2 cpumask=,000f
 migration/1-13[001] d..5   485.221202: sched_migrate_task: comm=ksmtuned pid=3614 prio=120 orig_cpu=1 dest_cpu=0 cpumask=,000f
 awk-3615  [002] d.H5   485.221747: sched_migrate_task: comm=rcu_preempt pid=7 prio=120 orig_cpu=0 dest_cpu=1 cpumask=,00ff
 migration/2-18[002] d..5   485.222062: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=2 dest_cpu=3 cpumask=,000f
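
To illustrate the "record raw, parse at print time" idea, here is a minimal user-space sketch of the print-time step.  It is not the kernel implementation: format_bitmask() is a hypothetical helper, it assumes the mask is stored as an array of 32-bit words, and it always pads to whole words.  It renders the mask as comma-separated 32-bit hex words, most significant word first, which is similar to how the kernel prints cpumasks.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel function): format the low nbits of
 * mask[] as comma-separated 32-bit hex words, most significant first.
 * mask[0] holds bits 0-31.  Returns the number of characters written
 * (excluding the terminating NUL), or -1 if buf is too small.
 */
int format_bitmask(const uint32_t *mask, unsigned int nbits,
		   char *buf, size_t len)
{
	unsigned int nwords = (nbits + 31) / 32;
	unsigned int rem = nbits % 32;
	size_t off = 0;

	if (nwords == 0)
		return -1;

	for (unsigned int i = nwords; i-- > 0; ) {
		uint32_t word = mask[i];
		int n;

		/* Discard bits beyond nbits in the top word. */
		if (i == nwords - 1 && rem)
			word &= (UINT32_C(1) << rem) - 1;

		n = snprintf(buf + off, len - off, "%s%08x",
			     i == nwords - 1 ? "" : ",", (unsigned int)word);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	return (int)off;
}
```

The expensive string conversion happens only when the trace is read, so the hot path just copies the raw words.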

Link: http://lkml.kernel.org/r/1399377998-14870-6-git-send-email-javi.mer...@arm.com
Link: http://lkml.kernel.org/r/20140506132238.22e13...@gandalf.local.home

Suggested-by: Javi Merino javi.mer...@arm.com
Tested-by: Javi Merino javi.mer...@arm.com
Signed-off-by: Steven Rostedt rost...@goodmis.org
---

Note: When sending the pull request to Linus, state that you
cherry-picked the commit from
https://git.kernel.org/cgit/linux/kernel/git/rostedt/linux-trace.git/commit/?h=for-next&id=4449bf927b61bdb4389393c6fea6837214d1ace7

 include/linux/ftrace_event.h |  3 +++
 include/linux/trace_seq.h| 10 
 include/trace/ftrace.h   | 57 +++-
 kernel/trace/trace_output.c  | 41 +++
 4 files changed, 110 insertions(+), 1 deletion(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index d16da3e53bc7..cff3106ffe2c 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -38,6 +38,9 @@ const char *ftrace_print_symbols_seq_u64(struct trace_seq *p,
 *symbol_array);
 #endif
 
+const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
+unsigned int bitmask_size);
+
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
diff --git a/include/linux/trace_seq.h b/include/linux/trace_seq.h
index a32d86ec8bf2..136116924d8d 100644
--- a/include/linux/trace_seq.h
+++ b/include/linux/trace_seq.h
@@ -46,6 +46,9 @@ extern int trace_seq_putmem_hex(struct trace_seq *s, const void *mem,
 extern void *trace_seq_reserve(struct trace_seq *s, size_t len);
 extern int trace_seq_path(struct trace_seq *s, const struct path *path);
 
+extern int trace_seq_bitmask(struct trace_seq *s, const unsigned long *maskp,
+int nmaskbits);
+
 #else /* CONFIG_TRACING */
 static inline int trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
 {
@@ -57,6 +60,13 @@ trace_seq_bprintf(struct trace_seq *s, const char *fmt, const u32 *binary)
return 0;
 }
 
+static inline int
+trace_seq_bitmask(struct trace_seq *s, const unsigned long *maskp,
+ int nmaskbits)
+{
+   return 0;
+}
+
 static inline int trace_print_seq(struct seq_file *m, struct trace_seq *s)
 {
return 0;
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 0a1a4f7caf09..9b7a989dcbcc 100644
--- a/include/trace/ftrace.h
+++ b/include

[RFC PATCH v3 2/7] thermal: document struct thermal_zone_device and thermal_governor

2014-06-03 Thread Javi Merino
Document struct thermal_zone_device and struct thermal_governor fields
and their use by the thermal framework code.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 include/linux/thermal.h | 45 +++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f7e11c7ea7d9..6fca46c82c4d 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -158,6 +158,41 @@ struct thermal_attr {
char name[THERMAL_NAME_LENGTH];
 };
 
+/**
+ * struct thermal_zone_device - structure for a thermal zone
+ * @id:	unique id number for each thermal zone
+ * @type:	the thermal zone device type
+ * @device:	struct device for this thermal zone
+ * @trip_temp_attrs:	attributes for trip points for sysfs: trip temperature
+ * @trip_type_attrs:	attributes for trip points for sysfs: trip type
+ * @trip_hyst_attrs:	attributes for trip points for sysfs: trip hysteresis
+ * @devdata:	private pointer for device private data
+ * @trips:	number of trip points the thermal zone supports
+ * @passive_delay:	number of milliseconds to wait between polls when
+ *			performing passive cooling.  Currently only used by the
+ *			step-wise governor
+ * @polling_delay:	number of milliseconds to wait between polls when
+ *			checking whether trip points have been crossed (0 for
+ *			interrupt driven systems)
+ * @temperature:	current temperature.  This is only for core code,
+ *			drivers should use thermal_zone_get_temp() to get the
+ *			current temperature
+ * @last_temperature:	previous temperature read
+ * @emul_temperature:	emulated temperature when using CONFIG_THERMAL_EMULATION
+ * @passive:	1 if you've crossed a passive trip point, 0 otherwise.
+ *			Currently only used by the step-wise governor.
+ * @forced_passive:	If > 0, temperature at which to switch on all ACPI
+ *			processor cooling devices.  Currently only used by the
+ *			step-wise governor.
+ * @ops:	operations this thermal_zone_device supports
+ * @tzp:	thermal zone parameters
+ * @governor:	pointer to the governor for this thermal zone
+ * @thermal_instances:	list of struct thermal_instance of this thermal zone
+ * @idr:	struct idr to generate unique id for this zone's cooling devices
+ * @lock:	lock to protect thermal_instances list
+ * @node:	node in thermal_tz_list (in thermal_core.c)
+ * @poll_queue:	delayed work for polling
+ */
 struct thermal_zone_device {
int id;
char type[THERMAL_NAME_LENGTH];
@@ -179,12 +214,18 @@ struct thermal_zone_device {
struct thermal_governor *governor;
struct list_head thermal_instances;
struct idr idr;
-   struct mutex lock; /* protect thermal_instances list */
+   struct mutex lock;
struct list_head node;
struct delayed_work poll_queue;
 };
 
-/* Structure that holds thermal governor information */
+/**
+ * struct thermal_governor - structure that holds thermal governor information
+ * @name:  name of the governor
+ * @throttle:  callback called for every trip point even if temperature is
+ * below the trip point temperature
+ * @governor_list: node in thermal_governor_list (in thermal_core.c)
+ */
 struct thermal_governor {
char name[THERMAL_NAME_LENGTH];
int (*throttle)(struct thermal_zone_device *tz, int trip);
-- 
1.9.1




[RFC PATCH v3 5/7] thermal: add a basic cpu power actor

2014-06-03 Thread Javi Merino
Introduce a power actor for cpus.  It has a basic power model to get
the current power utilization and uses cpufreq cooling devices to set
the desired power.  It uses the current frequency (as reported by
cpufreq) as well as load and OPPs for the power calculations.  The
cpus must have registered their OPPs in the OPP library.
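
As a rough illustration of the dynamic part of the model documented by this patch (Pdyn = Kd * Voltage^2 * Frequency * Utilisation), the calculation can be sketched in user-space C.  This is only a sketch under stated assumptions: the struct and function names are hypothetical, floating point is used for readability (kernel code would use fixed-point arithmetic), and the units (volts, MHz, a coefficient in mW/MHz/V^2) are chosen for the example rather than taken from the patch.

```c
/*
 * Illustrative only: names and units are assumptions for this sketch,
 * not the interface introduced by the patch.
 */
struct opp_point {
	double freq_mhz;	/* operating frequency in MHz */
	double volt;		/* operating voltage in volts */
};

/*
 * Dynamic power in mW for one operating point.
 * kd:          constant coefficient standing in for f(run), mW/MHz/V^2
 * utilisation: fraction of time spent running, clamped to [0, 1]
 */
double dynamic_power_mw(double kd, struct opp_point opp, double utilisation)
{
	if (utilisation < 0.0)
		utilisation = 0.0;
	if (utilisation > 1.0)
		utilisation = 1.0;

	return kd * opp.volt * opp.volt * opp.freq_mhz * utilisation;
}
```

Because voltage enters squared, lowering the operating point reduces power faster than the frequency drop alone would suggest, which is why the DVFS level dominates the estimate.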

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Punit Agrawal punit.agra...@arm.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/power_actor.txt | 126 +++
 drivers/thermal/Kconfig   |   5 +
 drivers/thermal/power_actor/Kconfig   |   9 +
 drivers/thermal/power_actor/Makefile  |   2 +
 drivers/thermal/power_actor/cpu_actor.c   | 601 ++
 drivers/thermal/power_actor/power_actor.h |  41 ++
 6 files changed, 784 insertions(+)
 create mode 100644 drivers/thermal/power_actor/Kconfig
 create mode 100644 drivers/thermal/power_actor/cpu_actor.c

diff --git a/Documentation/thermal/power_actor.txt b/Documentation/thermal/power_actor.txt
index 5a61f32ec143..fd51760615bf 100644
--- a/Documentation/thermal/power_actor.txt
+++ b/Documentation/thermal/power_actor.txt
@@ -36,3 +36,129 @@ temperature.
 milliwatts.
 
 Returns 0 on success, -E* on error.
+
+CPU Power Actor API
+===================
+
+A simple power model for CPUs.  The current power is calculated as
+dynamic + (optionally) static power.  This power model requires that
+the operating-points of the CPUs are registered using the kernel's opp
+library and the `cpufreq_frequency_table` is assigned to the `struct
+device` of the cpu.  If you are using the `cpufreq-cpu0.c` driver then
+the `cpufreq_frequency_table` should already be assigned to the cpu
+device.
+
+The `tz` and `plat_static_func` parameters of
+`power_cpu_actor_register()` are optional.  If you don't provide them,
+only dynamic power will be considered.
+
+Dynamic power
+-------------
+
+The dynamic power consumption of a processor depends
+on many factors.  For a given processor implementation the primary
+factors are:
+
+- The time the processor spends running, consuming dynamic power, as
+  compared to the time in idle states where dynamic consumption is
+  negligible.  Herein we refer to this as 'utilisation'.
+- The voltage and frequency levels as a result of DVFS.  The DVFS
+  level is a dominant factor governing power consumption.
+- In running time the 'execution' behaviour (instruction types, memory
+  access patterns and so forth) causes, in most cases, a second order
+  variation.  In pathological cases this variation can be significant,
+  but typically it is of a much lesser impact than the factors above.
+
+A high level dynamic power consumption model may then be represented as:
+
+Pdyn = f(run) * Voltage^2 * Frequency * Utilisation
+
+f(run) here represents the described execution behaviour and its
+result has units of Watts/Hz/Volt^2 (this is often expressed in
+mW/MHz/uVolt^2).
+
+The detailed behaviour for f(run) could be modelled on-line.  However,
+in practice, such an on-line model has dependencies on a number of
+implementation specific processor support and characterisation
+factors.  Therefore, in the initial implementation that contribution is
+represented as a constant coefficient.  This is a simplification
+consistent with the relative contribution to overall power variation.
+
+In this simplified representation our model becomes:
+
+Pdyn = Kd * Voltage^2 * Frequency * Utilisation
+
+Where Kd (capacitance) represents an indicative running time dynamic
+power coefficient in fundamental units of mW/MHz/uVolt^2
+
+Static Power
+------------
+
+Static leakage power consumption depends on a number of factors.  For a
+given circuit implementation the primary factors are:
+
+- Time the circuit spends in each 'power state'
+- Temperature
+- Operating voltage
+- Process grade
+
+The time the circuit spends in each 'power state' for a given
+evaluation period at first order means OFF or ON.  However,
+'retention' states can also be supported that reduce power during
+inactive periods without loss of context.
+
+Note: The visibility of state entries to the OS can vary, according to
+platform specifics, and this can then impact the accuracy of a model
+based on OS state information alone.  It might be possible in some
+cases to extract more accurate information from system resources.
+
+The temperature, operating voltage and process 'grade' (slow to fast)
+of the circuit are all significant factors in static leakage power
+consumption.  All of these have complex relationships to static power.
+
+Circuit implementation specific factors include the chosen silicon
+process as well as the type, number and size of transistors in both
+the logic gates and any RAM elements included.
+
+The static power consumption modelling must take into account the
+power managed regions that are implemented.  Taking the example of an
+ARM processor cluster

[RFC PATCH v3 6/7] thermal: introduce the Power Allocator governor

2014-06-03 Thread Javi Merino
The power allocator governor is a thermal governor that controls system
and device power allocation to control temperature.  Conceptually, the
implementation divides the sustainable power of a thermal zone among
all the heat sources in that zone.

This governor relies on power actors, entities that represent heat
sources.  They can report current and maximum power consumption and
can set a given maximum power consumption, usually via a cooling
device.

The governor uses a Proportional Integral Derivative (PID) controller
driven by the temperature of the thermal zone.  The output of the
controller is a power budget that is then allocated to each power
actor that can have bearing on the temperature we are trying to
control.  It decides how much power to give each cooling device based
on the performance they are requesting.  The PID controller adjusts
the total power budget so that the temperature of the thermal zone
does not exceed the control temperature.
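
The control loop described above can be sketched in user-space C as follows.  This is a minimal sketch, not the code in this patch: all names are hypothetical, floating point replaces the kernel's fixed-point arithmetic, the gains are left to the caller, and the proportional split in divide_power() is an assumption about how "based on the performance they are requesting" might be realised.  The feed-forward of the sustainable power and the clamp to [0, max_allocatable_power] follow the pid_controller() hunk shown later in this series.

```c
#include <stddef.h>

/* Hypothetical controller state; gains are chosen by the caller. */
struct pid_state {
	double k_p, k_i, k_d;	/* proportional, integral, derivative gains */
	double err_integral;	/* accumulated error */
	double prev_err;	/* error seen on the previous call */
};

/* Return the power budget in mW for the current temperature reading. */
double pid_power_budget(struct pid_state *s, double control_temp,
			double current_temp, double sustainable_power,
			double max_allocatable_power)
{
	double err = control_temp - current_temp;
	double p, i, d, budget;

	p = s->k_p * err;
	s->err_integral += err;
	i = s->k_i * s->err_integral;
	d = s->k_d * (err - s->prev_err);
	s->prev_err = err;

	/* Feed-forward the known sustainable power, then clamp. */
	budget = sustainable_power + p + i + d;
	if (budget < 0.0)
		budget = 0.0;
	if (budget > max_allocatable_power)
		budget = max_allocatable_power;

	return budget;
}

/* Split the budget among n actors in proportion to their requests. */
void divide_power(const double *req, double *granted, size_t n, double budget)
{
	double total = 0.0;
	size_t k;

	for (k = 0; k < n; k++)
		total += req[k];

	for (k = 0; k < n; k++)
		granted[k] = total > 0.0 ? budget * req[k] / total : 0.0;
}
```

When the zone is below the control temperature the error is positive and the budget grows above the sustainable power; once the zone overshoots, the error turns negative and the budget shrinks, throttling the actors.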

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Punit Agrawal punit.agra...@arm.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/power_allocator.txt |  42 +++
 drivers/thermal/Kconfig   |  15 +
 drivers/thermal/Makefile  |   1 +
 drivers/thermal/power_allocator.c | 455 ++
 drivers/thermal/thermal_core.c|   7 +-
 drivers/thermal/thermal_core.h|   8 +
 include/linux/thermal.h   |   8 +
 7 files changed, 535 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/power_allocator.c

diff --git a/Documentation/thermal/power_allocator.txt b/Documentation/thermal/power_allocator.txt
new file mode 100644
index ..cfa66933af86
--- /dev/null
+++ b/Documentation/thermal/power_allocator.txt
@@ -0,0 +1,42 @@
+
+Integration of the power_allocator governor in a platform
+=========================================================
+
+Registering thermal_zone_device
+-------------------------------
+
+An estimate of the sustainable dissipatable power (in mW) should be
+provided while registering the thermal zone.  This is the maximum
+sustained power for allocation at the desired maximum temperature.
+This number can vary with operating conditions, but the closed loop of
+the controller should take care of those variations, so
+`sustainable_power` only needs to be an estimate.  Register your
+thermal zone with `thermal_zone_params` that have a
+`sustainable_power`.  If you weren't passing any
+`thermal_zone_params`, then something like this will do:
+
+	static const struct thermal_zone_params tz_params = {
+		.sustainable_power = 3500,
+	};
+
+and then pass `tz_params` as the 5th parameter to
+`thermal_zone_device_register()`
+
+Trip points
+-----------
+
+The governor requires the following two trip points:
+
+1.  switch on trip point: temperature above which the governor
+control loop starts operating
+2.  desired temperature trip point: it should be higher than the
+switch on trip point. It is the target temperature the governor
+is controlling for.
+
+The trip points can be either active or passive.
+
+Power actors
+------------
+
+Devices controlled by this governor must be registered with the power
+actor API.  Read `power_actor.txt` for more information about them.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 1818c4fa60b8..e5b338a7cab9 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -71,6 +71,14 @@ config THERMAL_DEFAULT_GOV_USER_SPACE
  Select this if you want to let the user space manage the
  platform thermals.
 
+config THERMAL_DEFAULT_GOV_POWER_ALLOCATOR
+	bool "power_allocator"
+   select THERMAL_GOV_POWER_ALLOCATOR
+   help
+ Select this if you want to control temperature based on
+ system and device power allocation. This governor relies on
+ power actors to operate.
+
 endchoice
 
 config THERMAL_GOV_FAIR_SHARE
@@ -89,6 +97,13 @@ config THERMAL_GOV_USER_SPACE
help
  Enable this to let the user space manage the platform thermals.
 
+config THERMAL_GOV_POWER_ALLOCATOR
+	bool "Power allocator thermal governor"
+   select THERMAL_POWER_ACTOR
+   help
+ Enable this to manage platform thermals by dynamically
+ allocating and limiting power to devices.
+
 config THERMAL_POWER_ACTOR
bool
 
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 878a02cab7d1..c5b47f058675 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -13,6 +13,7 @@ thermal_sys-$(CONFIG_THERMAL_OF)  += of-thermal.o
 thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE)   += fair_share.o
 thermal_sys-$(CONFIG_THERMAL_GOV_STEP_WISE)+= step_wise.o
 thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)   += user_space.o
+thermal_sys

[RFC PATCH v3 7/7] thermal: add trace events to the power allocator governor

2014-06-03 Thread Javi Merino
Add trace events for the power allocator governor and the power actor
interface of the cpu cooling device.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Cc: Steven Rostedt rost...@goodmis.org
Cc: Frederic Weisbecker fweis...@gmail.com
Cc: Ingo Molnar mi...@redhat.com
Signed-off-by: Javi Merino javi.mer...@arm.com

---

trace-cmd needs the patch attached in
http://article.gmane.org/gmane.linux.kernel/1704423 for this to work.

 drivers/thermal/power_actor/cpu_actor.c |  5 +
 drivers/thermal/power_allocator.c   | 12 ++-
 include/trace/events/thermal.h  | 38 +
 include/trace/events/thermal_governor.h | 37 
 4 files changed, 91 insertions(+), 1 deletion(-)
 create mode 100644 include/trace/events/thermal.h
 create mode 100644 include/trace/events/thermal_governor.h

diff --git a/drivers/thermal/power_actor/cpu_actor.c b/drivers/thermal/power_actor/cpu_actor.c
index 6eac80d119a5..396689b8fd40 100644
--- a/drivers/thermal/power_actor/cpu_actor.c
+++ b/drivers/thermal/power_actor/cpu_actor.c
@@ -27,6 +27,9 @@
 #include <linux/printk.h>
 #include <linux/slab.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/thermal.h>
+
 #include "power_actor.h"
 
 /**
@@ -297,6 +300,8 @@ static int cpu_set_power(struct power_actor *actor, u32 power)
 		return -EINVAL;
 	}
 
+	trace_thermal_power_limit(cpu_actor->cpumask, freq, cdev_state, power);
+
 	return cdev->ops->set_cur_state(cdev, cdev_state);
 }
 
diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 9c7e7f212eb6..1e57a810903d 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -19,6 +19,9 @@
 #include <linux/slab.h>
 #include <linux/thermal.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/thermal_governor.h>
+
 #include "power_actor/power_actor.h"
 #include "thermal_core.h"
 
@@ -130,7 +133,14 @@ static u32 pid_controller(struct thermal_zone_device *tz,
 	/* feed-forward the known sustainable dissipatable power */
 	power_range = tz->tzp->sustainable_power + frac_to_int(power_range);
 
-	return clamp(power_range, (s64)0, (s64)max_allocatable_power);
+	power_range = clamp(power_range, (s64)0, (s64)max_allocatable_power);
+
+	trace_thermal_power_allocator_pid(frac_to_int(err),
+					frac_to_int(params->err_integral),
+					frac_to_int(p), frac_to_int(i),
+					frac_to_int(d), power_range);
+
+	return power_range;
 }
 
 /**
diff --git a/include/trace/events/thermal.h b/include/trace/events/thermal.h
new file mode 100644
index ..ce326e5cfe90
--- /dev/null
+++ b/include/trace/events/thermal.h
@@ -0,0 +1,38 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM thermal
+
+#if !defined(_TRACE_THERMAL_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_THERMAL_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(thermal_power_limit,
+	TP_PROTO(const struct cpumask *cpus, unsigned int freq,
+		unsigned long cdev_state, u32 power),
+
+	TP_ARGS(cpus, freq, cdev_state, power),
+
+	TP_STRUCT__entry(
+		__bitmask(cpumask, num_possible_cpus())
+		__field(unsigned int,  freq  )
+		__field(unsigned long, cdev_state)
+		__field(u32,   power )
+	),
+
+	TP_fast_assign(
+		__assign_bitmask(cpumask, cpumask_bits(cpus),
+				num_possible_cpus());
+		__entry->freq = freq;
+		__entry->cdev_state = cdev_state;
+		__entry->power = power;
+	),
+
+	TP_printk("cpus=%s freq=%u cdev_state=%lu power=%u",
+		__get_bitmask(cpumask), __entry->freq, __entry->cdev_state,
+		__entry->power)
+);
+
+#endif /* _TRACE_THERMAL_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/trace/events/thermal_governor.h b/include/trace/events/thermal_governor.h
new file mode 100644
index ..5e5b25a41a6b
--- /dev/null
+++ b/include/trace/events/thermal_governor.h
@@ -0,0 +1,37 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM thermal_governor
+
+#if !defined(_TRACE_THERMAL_GOVERNOR_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_THERMAL_GOVERNOR_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(thermal_power_allocator_pid,
+	TP_PROTO(s32 err, s32 err_integral, s64 p, s64 i, s64 d, s32 output),
+	TP_ARGS(err, err_integral, p, i, d, output),
+	TP_STRUCT__entry(
+		__field(s32, err )
+		__field(s32, err_integral)
+		__field(s64, p   )
+		__field(s64, i   )
+		__field(s64, d   )
+		__field(s32, output  )
+	),
+	TP_fast_assign(
+		__entry->err

[RFC PATCH v3 3/7] thermal: let governors have private data for each thermal zone

2014-06-03 Thread Javi Merino
A governor may need to store its current state between calls to
throttle().  That state depends on the thermal zone, so store it as
private data in struct thermal_zone_device.

The governors may have two new ops: bind_to_tz() and unbind_from_tz().
When provided, these functions let governors do some initialization
and teardown when they are bound/unbound to a tz and possibly store that
information in the governor_data field of the struct
thermal_zone_device.
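
The bind/unbind flow can be sketched with user-space stand-ins for the kernel structures.  This is a simplified model, not the patch itself: struct zone and struct governor are hypothetical reductions of struct thermal_zone_device and struct thermal_governor, and demo_governor is invented for the example.  The switching logic mirrors thermal_set_governor() and bind_previous_governor() from the diff below.

```c
#include <errno.h>
#include <stdlib.h>

struct zone;

/* Reduced stand-in for struct thermal_governor. */
struct governor {
	const char *name;
	int (*bind_to_tz)(struct zone *tz);
	void (*unbind_from_tz)(struct zone *tz);
};

/* Reduced stand-in for struct thermal_zone_device. */
struct zone {
	const struct governor *governor;
	void *governor_data;	/* per-zone private data of the governor */
};

/* Example private state a governor might keep between throttle() calls. */
struct demo_state {
	int throttle_calls;
};

int demo_bind(struct zone *tz)
{
	struct demo_state *st = calloc(1, sizeof(*st));

	if (!st)
		return -ENOMEM;
	tz->governor_data = st;
	return 0;
}

void demo_unbind(struct zone *tz)
{
	free(tz->governor_data);
	tz->governor_data = NULL;
}

const struct governor demo_governor = {
	.name = "demo",
	.bind_to_tz = demo_bind,
	.unbind_from_tz = demo_unbind,
};

/* Switch governors: unbind the old one, bind the new one, fall back on error. */
int zone_set_governor(struct zone *tz, const struct governor *new_gov)
{
	int ret;

	if (tz->governor && tz->governor->unbind_from_tz)
		tz->governor->unbind_from_tz(tz);

	if (new_gov && new_gov->bind_to_tz) {
		ret = new_gov->bind_to_tz(tz);
		if (ret) {
			/* Re-bind the previous governor; drop it if that fails too. */
			if (tz->governor && tz->governor->bind_to_tz &&
			    tz->governor->bind_to_tz(tz))
				tz->governor = NULL;
			return ret;
		}
	}

	tz->governor = new_gov;
	return 0;
}
```

Keeping the state in the zone rather than in the governor lets one governor serve several thermal zones without sharing state between them.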

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 drivers/thermal/thermal_core.c | 83 ++
 include/linux/thermal.h|  9 +
 2 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 71b0ec0c370d..1b13d8e0cfd1 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -72,6 +72,58 @@ static struct thermal_governor *__find_governor(const char *name)
return NULL;
 }
 
+/**
+ * bind_previous_governor - bind the previous governor of the thermal zone
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @failed_gov_name:   the name of the governor that failed to register
+ *
+ * Register the previous governor of the thermal zone after a new
+ * governor has failed to be bound.
+ */
+static void bind_previous_governor(struct thermal_zone_device *tz,
+				const char *failed_gov_name)
+{
+	if (tz->governor && tz->governor->bind_to_tz) {
+		if (tz->governor->bind_to_tz(tz)) {
+			dev_warn(&tz->device,
+				"governor %s failed to bind and the previous one (%s) failed to register again, thermal zone %s has no governor\n",
+				failed_gov_name, tz->governor->name, tz->type);
+			tz->governor = NULL;
+		}
+	}
+}
+
+/**
+ * thermal_set_governor() - Switch to another governor
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @new_gov:   pointer to the new governor
+ *
+ * Change the governor of thermal zone @tz.
+ *
+ * Returns 0 on success, an error if the new governor's bind_to_tz() failed.
+ */
+static int thermal_set_governor(struct thermal_zone_device *tz,
+				struct thermal_governor *new_gov)
+{
+	int ret = 0;
+
+	if (tz->governor && tz->governor->unbind_from_tz)
+		tz->governor->unbind_from_tz(tz);
+
+	if (new_gov && new_gov->bind_to_tz) {
+		ret = new_gov->bind_to_tz(tz);
+		if (ret) {
+			bind_previous_governor(tz, new_gov->name);
+
+			return ret;
+		}
+	}
+
+	tz->governor = new_gov;
+
+	return ret;
+}
+
 int thermal_register_governor(struct thermal_governor *governor)
 {
int err;
@@ -104,8 +156,15 @@ int thermal_register_governor(struct thermal_governor *governor)
 
 		name = pos->tzp->governor_name;
 
-		if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH))
-			pos->governor = governor;
+		if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH)) {
+			int ret;
+
+			ret = thermal_set_governor(pos, governor);
+			if (ret)
+				dev_warn(&pos->device,
+					"Failed to set governor %s for thermal zone %s: %d\n",
+					governor->name, pos->type, ret);
+		}
 	}
 
 	mutex_unlock(&thermal_list_lock);
@@ -131,7 +190,7 @@ void thermal_unregister_governor(struct thermal_governor *governor)
 	list_for_each_entry(pos, &thermal_tz_list, node) {
 		if (!strnicmp(pos->governor->name, governor->name,
 				THERMAL_NAME_LENGTH))
-			pos->governor = NULL;
+			thermal_set_governor(pos, NULL);
 	}
 
 	mutex_unlock(&thermal_list_lock);
@@ -756,8 +815,9 @@ policy_store(struct device *dev, struct device_attribute *attr,
 	if (!gov)
 		goto exit;
 
-	tz->governor = gov;
-	ret = count;
+	ret = thermal_set_governor(tz, gov);
+	if (!ret)
+		ret = count;
 
 exit:
 	mutex_unlock(&thermal_governor_lock);
@@ -1452,6 +1512,7 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
 	int result;
 	int count;
 	int passive = 0;
+	struct thermal_governor *governor;
 
 	if (type && strlen(type) >= THERMAL_NAME_LENGTH)
 		return ERR_PTR(-EINVAL);
@@ -1542,9 +1603,15 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
 	mutex_lock(&thermal_governor_lock);
 
 	if (tz->tzp)
-		tz->governor = __find_governor(tz->tzp

[RFC PATCH v3 4/7] thermal: introduce the Power Actor API

2014-06-03 Thread Javi Merino
This patch introduces the Power Actor API in the thermal framework.
With it, devices that can report their power consumption and control
it can be registered.  This base interface is meant to be used to
derive specific power actors, such as a cpu power actor.
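
As an illustration of how a specific actor might be derived from this base interface, here is a user-space sketch.  The three callback signatures follow the documentation added by this patch (with uint32_t standing in for u32); everything else is an assumption made for the example, including the layout of struct power_actor, its `ops` and `data` fields, and the demo device.

```c
#include <errno.h>
#include <stdint.h>

struct power_actor;

/* Callbacks as documented in power_actor.txt. */
struct power_actor_ops {
	uint32_t (*get_req_power)(struct power_actor *actor);
	uint32_t (*get_max_power)(struct power_actor *actor);
	int (*set_power)(struct power_actor *actor, uint32_t power);
};

/* Assumed layout for this sketch only. */
struct power_actor {
	const struct power_actor_ops *ops;
	void *data;		/* actor-private data */
};

/* Hypothetical device with a fixed request and a configurable limit. */
struct demo_dev {
	uint32_t requested_mw;
	uint32_t max_mw;
	uint32_t limit_mw;
};

uint32_t demo_get_req_power(struct power_actor *actor)
{
	const struct demo_dev *dev = actor->data;

	return dev->requested_mw;
}

uint32_t demo_get_max_power(struct power_actor *actor)
{
	const struct demo_dev *dev = actor->data;

	return dev->max_mw;
}

/* Returns 0 on success, -EINVAL if the device cannot consume that much. */
int demo_set_power(struct power_actor *actor, uint32_t power)
{
	struct demo_dev *dev = actor->data;

	if (power > dev->max_mw)
		return -EINVAL;

	dev->limit_mw = power;
	return 0;
}

const struct power_actor_ops demo_ops = {
	.get_req_power = demo_get_req_power,
	.get_max_power = demo_get_max_power,
	.set_power = demo_set_power,
};
```

A governor only sees the three callbacks, so it can budget power across CPUs, GPUs or other devices without knowing how each one enforces its limit.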

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/power_actor.txt | 38 ++
 drivers/thermal/Kconfig   |  3 ++
 drivers/thermal/Makefile  |  2 +
 drivers/thermal/power_actor/Makefile  |  5 +++
 drivers/thermal/power_actor/power_actor.c | 64 +++
 drivers/thermal/power_actor/power_actor.h | 64 +++
 6 files changed, 176 insertions(+)
 create mode 100644 Documentation/thermal/power_actor.txt
 create mode 100644 drivers/thermal/power_actor/Makefile
 create mode 100644 drivers/thermal/power_actor/power_actor.c
 create mode 100644 drivers/thermal/power_actor/power_actor.h

diff --git a/Documentation/thermal/power_actor.txt b/Documentation/thermal/power_actor.txt
new file mode 100644
index ..5a61f32ec143
--- /dev/null
+++ b/Documentation/thermal/power_actor.txt
@@ -0,0 +1,38 @@
+
+Power Actor API
+===============
+
+The base power actor API is meant to be used to derive specific power
+actors, such as a cpu power actor.  When registering, they should call
+`power_actor_register()` with a unique `enum power_actor_types`.  When
+unregistering, the power actor should call `power_actor_unregister()`
+with the `struct power_actor *` received in the call to
+`power_actor_register()`.
+
+Callbacks
+---------
+
+1. u32 get_req_power(struct power_actor *actor)
+@actor: a valid `struct power_actor *` registered with
+`power_actor_register()`
+
+`get_req_power()` returns the current requested power in milliwatts.
+
+2. u32 get_max_power(struct power_actor *actor)
+@actor: a valid `struct power_actor *` registered with
+`power_actor_register()`
+
+`get_max_power()` returns the maximum power that the device could
+consume if it was fully utilized.  It's a function as some devices'
+maximum power consumption can change due to external factors such as
+temperature.
+
+3. int set_power(struct power_actor *actor, u32 power)
+@actor: a valid `struct power_actor *` registered with
+`power_actor_register()`
+@power: power in milliwatts
+
+`set_power()` should configure the device to consume @power
+milliwatts.
+
+Returns 0 on success, -E* on error.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 2d51912a6e40..47e2f15537ca 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -89,6 +89,9 @@ config THERMAL_GOV_USER_SPACE
help
  Enable this to let the user space manage the platform thermals.
 
+config THERMAL_POWER_ACTOR
+   bool
+
 config CPU_THERMAL
 	bool "generic cpu cooling support"
depends on CPU_FREQ
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 54e4ec9eb5df..878a02cab7d1 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -14,6 +14,8 @@ thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE)  += fair_share.o
 thermal_sys-$(CONFIG_THERMAL_GOV_STEP_WISE)+= step_wise.o
 thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)   += user_space.o
 
+obj-$(CONFIG_THERMAL_POWER_ACTOR) += power_actor/
+
 # cpufreq cooling
 thermal_sys-$(CONFIG_CPU_THERMAL)  += cpu_cooling.o
 
diff --git a/drivers/thermal/power_actor/Makefile b/drivers/thermal/power_actor/Makefile
new file mode 100644
index ..46478f4928be
--- /dev/null
+++ b/drivers/thermal/power_actor/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for the power actors
+#
+
+obj-y += power_actor.o
diff --git a/drivers/thermal/power_actor/power_actor.c b/drivers/thermal/power_actor/power_actor.c
new file mode 100644
index ..d891deb0e2a1
--- /dev/null
+++ b/drivers/thermal/power_actor/power_actor.c
@@ -0,0 +1,64 @@
+/*
+ * Basic interface for power actors
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#define pr_fmt(fmt) "Power actor: " fmt
+
+#include <linux/err.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+
+#include "power_actor.h"
+
+LIST_HEAD(actor_list);
+
+/**
+ * power_actor_register - Register an actor in the power actor API
+ * @type:  actor type
+ * @ops:   struct power_actor_ops for this actor
+ * @privdata:  pointer to private data related to the actor
+ *
+ * Returns the struct power_actor * on success, ERR_PTR

Re: [PATCH] linux/thermal.h: Rename KELVIN_TO_CELSIUS to DECI_KELVIN_TO_CELSIUS

2014-03-24 Thread Javi Merino
On Mon, Mar 24, 2014 at 05:29:09PM +0000, Rasmus Villemoes wrote:
> The macros KELVIN_TO_CELSIUS and CELSIUS_TO_KELVIN actually work on
> decikelvins, so rename them to reflect their actual semantics.
>
> Signed-off-by: Rasmus Villemoes li...@rasmusvillemoes.dk
> ---
>  drivers/acpi/thermal.c  | 12 ++--
>  drivers/platform/x86/asus-wmi.c |  2 +-
>  drivers/platform/x86/intel_menlow.c |  8 
>  include/linux/thermal.h |  6 +++---
>  4 files changed, 14 insertions(+), 14 deletions(-)

[snip]

 diff --git a/include/linux/thermal.h b/include/linux/thermal.h
 index f7e11c7..c978aa3 100644
 --- a/include/linux/thermal.h
 +++ b/include/linux/thermal.h
 @@ -41,9 +41,9 @@
  #define THERMAL_NO_LIMIT THERMAL_CSTATE_INVALID
  
  /* Unit conversion macros */
 -#define KELVIN_TO_CELSIUS(t) (long)(((long)t-2732 >= 0) ? \
 - ((long)t-2732+5)/10 : ((long)t-2732-5)/10)
 -#define CELSIUS_TO_KELVIN(t) ((t)*10+2732)
 +#define DECI_KELVIN_TO_CELSIUS(t) (long)(((long)t-2732 >= 0) ? \
 + ((long)t-2732+5)/10 : ((long)t-2732-5)/10)
 +#define CELSIUS_TO_DECI_KELVIN(t) ((t)*10+2732)

While you are at it, you could also make it use a statement expression
as Joe Perches suggested earlier:

http://thread.gmane.org/gmane.linux.power-management.general/43978/focus=1671955
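
For illustration, a minimal sketch of what the statement-expression form
could look like.  It assumes the GNU C ({ ... }) extension that the kernel
already relies on; the local name __dk is made up for this example and this
is not the code from Joe's mail.

```c
/* Sketch only: relies on the GNU C statement-expression extension. */
#define DECI_KELVIN_TO_CELSIUS(t) ({					\
	long __dk = (long)(t) - 2732;	/* offset from 0 C, in 0.1 K */	\
	(__dk >= 0) ? (__dk + 5) / 10 : (__dk - 5) / 10;		\
})

#define CELSIUS_TO_DECI_KELVIN(t)	((t) * 10 + 2732)
```

The argument is evaluated exactly once and is fully parenthesised, so
passing an expression or something with side effects behaves as expected.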

Cheers,
Javi



checkpatch: false positives when parsing trace includes

2014-04-29 Thread Javi Merino
Hi,

checkpatch complains about the spaces before the close parenthesis in
trace events:

ERROR: space prohibited before that close parenthesis ')'
#94: FILE: include/trace/events/thermal.h:14:
+   __field(unsigned int,  freq  )

However, in that directory, that's actually the norm, not the
exception:

$ git grep '__field(' include/trace/events/ | grep -P '[ \t]+\)' | wc -l
1284
$ git grep '__field(' include/trace/events/ | wc -l
1783
$

More than 70% of the __field() entries *have* spaces before the close
parenthesis.  Should checkpatch make an exception for this directory
and not flag it as an error?

Cheers,
Javi



Re: [PATCH v2 3/4] thermal: Added Bang-bang thermal governor

2014-04-29 Thread Javi Merino
On Tue, Apr 29, 2014 at 10:17:56AM +0100, Peter Feuerer wrote:
 The bang-bang thermal governor uses a hysteresis to switch abruptly on
 or off a cooling device.  It is intended to control fans, which can
 not be throttled but just switched on or off.
 Bang-bang cannot be set as default governor as it is intended for
 special devices only.  For those special devices the driver needs to
 explicitly request it.

I don't really understand why step-wise doesn't work for you (AIUI,
this governor should be a subset of it).  I'll let others comment on
that; just a minor comment below.

[...]
 diff --git a/drivers/thermal/gov_bang_bang.c b/drivers/thermal/gov_bang_bang.c
 new file mode 100644
 index 000..328dde0
 --- /dev/null
 +++ b/drivers/thermal/gov_bang_bang.c
 @@ -0,0 +1,124 @@
 +/*
 + *  gov_bang_bang.c - A simple thermal throttling governor using hysteresis
 + *
 + *  Copyright (C) 2014 Peter Feuerer pe...@piie.net
 + *
 + *  Based on step_wise.c with following Copyrights:
 + *  Copyright (C) 2012 Intel Corp
 + *  Copyright (C) 2012 Durgadoss R durgados...@intel.com
 + *
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License as published by
 + * the Free Software Foundation, version 2.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
 + * the GNU General Public License for more details.
 + *
 + */
 +
 +#include <linux/thermal.h>
 +
 +#include "thermal_core.h"
 +
 +static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 +{
 +	long trip_temp;
 +	unsigned long trip_hyst;
 +	struct thermal_instance *instance;
 +
 +	tz->ops->get_trip_temp(tz, trip, &trip_temp);
 +	tz->ops->get_trip_hyst(tz, trip, &trip_hyst);
 +
 +	dev_dbg(&tz->device, "Trip%d[temp=%ld]:temp=%d:hyst=%ld\n",
 +		trip, trip_temp, tz->temperature,
 +		trip_hyst);
 +
 +	mutex_lock(&tz->lock);
 +
 +	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
 +		if (instance->trip != trip)
 +			continue;
 +
 +		/* in case fan is neither on nor off set the fan to active */
 +		if (instance->target != 0 && instance->target != 1)
 +			instance->target = 1;

I think you should add a pr_warn() here to warn the user that the
governor is being used with a cooling device that seems to support
more than one cooling state.
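
For reference, the on/off decision with hysteresis can be sketched in plain
C as below.  This is a userspace illustration and not the code in the
patch: bang_bang_next_state() and its parameters are names made up for the
example, and temperatures are assumed to be in millidegrees Celsius.

```c
/*
 * Userspace sketch of bang-bang control with hysteresis.
 * bang_bang_next_state() is a hypothetical helper, not a kernel function.
 */
static unsigned long bang_bang_next_state(unsigned long cur_state,
					  long temp, long trip_temp,
					  long trip_hyst)
{
	/* Fan off and temperature reached the trip point: switch on. */
	if (cur_state == 0 && temp >= trip_temp)
		return 1;

	/* Fan on and temperature fell below trip - hysteresis: switch off. */
	if (cur_state == 1 && temp < trip_temp - trip_hyst)
		return 0;

	/* Inside the hysteresis band: keep the current state. */
	return cur_state;
}
```

With a trip point of 60000 and a hysteresis of 5000, the fan switches on at
60000 and only switches off again once the temperature drops below 55000.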

Cheers,
Javi



Re: [PATCH v2 4/4] acerhdf: Use bang-bang thermal governor

2014-04-29 Thread Javi Merino
On Tue, Apr 29, 2014 at 10:17:57AM +0100, Peter Feuerer wrote:
 acerhdf has been doing an on-off fan control using hysteresis by
 post-manipulating the outcome of thermal subsystem trip point handling.
 This patch enables acerhdf to use the bang-bang governor, which is
 intended for on-off controlled fans.
 
 CC: Zhang Rui rui.zh...@intel.com
 Cc: Andreas Mohr a...@lisas.de
 Cc: Borislav Petkov b...@suse.de
 Signed-off-by: Peter Feuerer pe...@piie.net
 ---
  drivers/platform/x86/Kconfig   |  2 +-
  drivers/platform/x86/acerhdf.c | 48 +++---
  2 files changed, 41 insertions(+), 9 deletions(-)
 
 diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
 index 27df2c5..0c15d89 100644
 --- a/drivers/platform/x86/Kconfig
 +++ b/drivers/platform/x86/Kconfig
 @@ -38,7 +38,7 @@ config ACER_WMI
  
  config ACERHDF
   tristate "Acer Aspire One temperature and fan driver"
 - depends on THERMAL && ACPI
 + depends on ACPI && THERMAL_GOV_BANG_BANG
   ---help---
 This is a driver for Acer Aspire One netbooks. It allows to access
 the temperature sensor and to control the fan.
 diff --git a/drivers/platform/x86/acerhdf.c b/drivers/platform/x86/acerhdf.c
 index 176edbd..f3884f9 100644
 --- a/drivers/platform/x86/acerhdf.c
 +++ b/drivers/platform/x86/acerhdf.c
 @@ -50,7 +50,7 @@
   */
  #undef START_IN_KERNEL_MODE
  
 -#define DRV_VER "0.5.30"
 +#define DRV_VER "0.5.31"
  
  /*
   * According to the Atom N270 datasheet,
 @@ -135,8 +135,8 @@ struct bios_settings_t {
   const char *vendor;
   const char *product;
   const char *version;
 - unsigned char fanreg;
 - unsigned char tempreg;
 + u8 fanreg;
 + u8 tempreg;
   struct fancmd cmd;
   int mcmd_enable;
  };
 @@ -259,6 +259,17 @@ static const struct bios_settings_t bios_tbl[] = {
  
  static const struct bios_settings_t *bios_cfg __read_mostly;
  
 +/*
 + * this struct is used to instruct thermal layer to use bang_bang instead of
 + * default governor for acerhdf
 + */
 +static struct thermal_zone_params acerhdf_zone_params = {
 + .governor_name = "bang_bang",
 + .no_hwmon = 0,
 + .num_tbps = 0,
 + .tbp = 0,
 +};

You don't need to initialize statics to 0.  checkpatch only flags it
as an error for a plain variable, but I think the same applies to
fields in a static struct.
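
To illustrate: C zero-initializes every member that is left out of an
initializer, so naming only the field you care about is enough.  A small
userspace sketch; struct zone_params is a stand-in for the example and not
the real struct thermal_zone_params.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct thermal_zone_params; field names mirror the patch. */
struct zone_params {
	const char *governor_name;
	bool no_hwmon;
	int num_tbps;
	void *tbp;
};

/* Only one field is named; the remaining members are implicitly zero. */
static struct zone_params acerhdf_zone_params = {
	.governor_name = "bang_bang",
};
```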

 +
  static int acerhdf_get_temp(int *temp)
  {
   u8 read_temp;
 @@ -436,6 +447,17 @@ static int acerhdf_get_trip_type(struct thermal_zone_device *thermal, int trip,
  {
   if (trip == 0)
   *type = THERMAL_TRIP_ACTIVE;
 + if (trip == 1)
 + *type = THERMAL_TRIP_CRITICAL;

This looks like an unrelated change that should be in a patch of its
own.

 +
 + return 0;
 +}
 +
 +static int acerhdf_get_trip_hyst(struct thermal_zone_device *thermal, int trip,
 +  unsigned long *temp)
 +{
 + if (trip == 0)
 + *temp = fanon - fanoff;
  
   return 0;
  }
 @@ -445,6 +467,8 @@ static int acerhdf_get_trip_temp(struct thermal_zone_device *thermal, int trip,
  {
   if (trip == 0)
   *temp = fanon;
 + else if (trip == 1)
 + *temp = ACERHDF_TEMP_CRIT;
  
   return 0;
  }
 @@ -464,8 +488,10 @@ static struct thermal_zone_device_ops acerhdf_dev_ops = {
   .get_mode = acerhdf_get_mode,
   .set_mode = acerhdf_set_mode,
   .get_trip_type = acerhdf_get_trip_type,
 + .get_trip_hyst = acerhdf_get_trip_hyst,
   .get_trip_temp = acerhdf_get_trip_temp,
   .get_crit_temp = acerhdf_get_crit_temp,
 + .notify = NULL,

Same as before, no need to initialize a static to NULL.

Cheers,
Javi



Re: [PATCH v2 3/4] thermal: Added Bang-bang thermal governor

2014-04-30 Thread Javi Merino
Hi Peter,

On Tue, Apr 29, 2014 at 10:31:42PM +0100, Peter Feuerer wrote:
 Peter Feuerer writes:
 
  Javi Merino writes:
 [...]
 
  diff --git a/drivers/thermal/gov_bang_bang.c b/drivers/thermal/gov_bang_bang.c
  new file mode 100644
  index 000..328dde0
  --- /dev/null
  +++ b/drivers/thermal/gov_bang_bang.c
  @@ -0,0 +1,124 @@
  +/*
  + *  gov_bang_bang.c - A simple thermal throttling governor using hysteresis
  + *
  + *  Copyright (C) 2014 Peter Feuerer pe...@piie.net
  + *
  + *  Based on step_wise.c with following Copyrights:
  + *  Copyright (C) 2012 Intel Corp
  + *  Copyright (C) 2012 Durgadoss R durgados...@intel.com
  + *
  + *
  + * This program is free software; you can redistribute it and/or modify
  + * it under the terms of the GNU General Public License as published by
  + * the Free Software Foundation, version 2.
  + *
  + * This program is distributed in the hope that it will be useful,
  + * but WITHOUT ANY WARRANTY; without even the implied warranty of
  + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
  + * the GNU General Public License for more details.
  + *
  + */
  +
  +#include <linux/thermal.h>
  +
  +#include "thermal_core.h"
  +
  +static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
  +{
  +	long trip_temp;
  +	unsigned long trip_hyst;
  +	struct thermal_instance *instance;
  +
  +	tz->ops->get_trip_temp(tz, trip, &trip_temp);
  +	tz->ops->get_trip_hyst(tz, trip, &trip_hyst);
  +
  +	dev_dbg(&tz->device, "Trip%d[temp=%ld]:temp=%d:hyst=%ld\n",
  +		trip, trip_temp, tz->temperature,
  +		trip_hyst);
  +
  +	mutex_lock(&tz->lock);
  +
  +	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
  +		if (instance->trip != trip)
  +			continue;
  +
  +		/* in case fan is neither on nor off set the fan to active */
  +		if (instance->target != 0 && instance->target != 1)
  +			instance->target = 1;
  
  I think you should add a pr_warn() here to warn the user that the
  governor is being used with a cooling device that seems to support
  more than one cooling state.
  
  Strange thing is, that the first time it is actually called with acerhdf 
  attached, it comes in with instance→target = -1 … I did not yet find out, 
  why.
  
  I'll further investigate on this and add some warning to it.
 
 I found out, that the default init state of a cooling device is:
 
 drivers/thermal/thermal_core.c:
  960 dev->target = THERMAL_NO_TARGET;
 
 While drivers/thermal/thermal_core.h:
  30 /* Initial state of a cooling device during binding */
  31 #define THERMAL_NO_TARGET -1UL
 
 
 So I changed my patch to this:
 
 +	/* in case fan is in initial state, switch the fan off */
 +	if (instance->target == THERMAL_NO_TARGET)
 +		instance->target = 0;
 +
 +	/* in case fan is neither on nor off set the fan to active */
 +	if (instance->target != 0 && instance->target != 1) {
 +		pr_warn("Thermal instance %s controlled by bang-bang has unexpected state: %ld\n",
 +			instance->name, instance->target);
 +		instance->target = 1;
 +	}

That sounds like a better solution, thanks!
Javi
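
As a side note, the order of the two checks matters because -1UL is the
largest unsigned long, so the initial state would otherwise trip the
"neither 0 nor 1" test and trigger a spurious warning.  A standalone sketch
of that logic, where sanitize_target() is a hypothetical helper written for
this example only:

```c
/* Same definition as quoted from drivers/thermal/thermal_core.h. */
#define THERMAL_NO_TARGET -1UL

/*
 * Hypothetical helper mirroring the order of checks in the revised hunk:
 * returns the state the governor should start from, and sets *warn when
 * the incoming target was neither "no target", 0 nor 1.
 */
static unsigned long sanitize_target(unsigned long target, int *warn)
{
	*warn = 0;

	if (target == THERMAL_NO_TARGET)
		return 0;	/* initial state: start with the fan off */

	if (target != 0 && target != 1) {
		*warn = 1;	/* device seems to have more than two states */
		return 1;
	}

	return target;
}
```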



Re: [PATCH v3 4/6] acerhdf: Use bang-bang thermal governor

2014-05-06 Thread Javi Merino
Hi Peter,

On Sat, May 03, 2014 at 06:59:24PM +0100, Peter Feuerer wrote:
 acerhdf has been doing an on-off fan control using hysteresis by
 post-manipulating the outcome of thermal subsystem trip point handling.
 This patch enables acerhdf to use the bang-bang governor, which is
 intended for on-off controlled fans.
 
 Cc: Andrew Morton a...@linux-foundation.org
 CC: Zhang Rui rui.zh...@intel.com
 Cc: Andreas Mohr a...@lisas.de
 Cc: Borislav Petkov b...@suse.de
 Cc: Javi Merino javi.mer...@arm.com
 Signed-off-by: Peter Feuerer pe...@piie.net
 ---
  drivers/platform/x86/Kconfig   |  2 +-
  drivers/platform/x86/acerhdf.c | 34 +-
  2 files changed, 30 insertions(+), 6 deletions(-)
 
 diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
 index 27df2c5..0c15d89 100644
 --- a/drivers/platform/x86/Kconfig
 +++ b/drivers/platform/x86/Kconfig
 @@ -38,7 +38,7 @@ config ACER_WMI
  
  config ACERHDF
   tristate "Acer Aspire One temperature and fan driver"
 - depends on THERMAL && ACPI
 + depends on ACPI && THERMAL_GOV_BANG_BANG
   ---help---
 This is a driver for Acer Aspire One netbooks. It allows to access
 the temperature sensor and to control the fan.
 diff --git a/drivers/platform/x86/acerhdf.c b/drivers/platform/x86/acerhdf.c
 index 176edbd..afaa849 100644
 --- a/drivers/platform/x86/acerhdf.c
 +++ b/drivers/platform/x86/acerhdf.c
 @@ -50,7 +50,7 @@
   */
  #undef START_IN_KERNEL_MODE
  
 -#define DRV_VER "0.5.30"
 +#define DRV_VER "0.5.31"
  
  /*
   * According to the Atom N270 datasheet,
 @@ -259,6 +259,14 @@ static const struct bios_settings_t bios_tbl[] = {
  
  static const struct bios_settings_t *bios_cfg __read_mostly;
  
 +/*
 + * this struct is used to instruct thermal layer to use bang_bang instead of
 + * default governor for acerhdf
 + */
 +static struct thermal_zone_params acerhdf_zone_params = {
 + .governor_name = "bang_bang",
 +};
 +
  static int acerhdf_get_temp(int *temp)
  {
   u8 read_temp;
 @@ -440,6 +448,15 @@ static int acerhdf_get_trip_type(struct thermal_zone_device *thermal, int trip,
   return 0;
  }
  
 +static int acerhdf_get_trip_hyst(struct thermal_zone_device *thermal, int trip,
 +  unsigned long *temp)
 +{
 + if (trip == 0)
 + *temp = fanon - fanoff;
 +
 + return 0;
 +}
 +

I think you should only return 0 if you've updated the temperature.
Otherwise you're telling the calling function that everything went all
right but you may be leaving garbage in *temp.  What about

	if (trip != 0)
		return -EINVAL;

	*temp = fanon - fanoff;
	return 0;
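
A self-contained version of that idea, for illustration only.  The
signature is simplified (the real op also takes the thermal zone pointer)
and the fanon/fanoff values are arbitrary example numbers rather than the
driver's defaults.

```c
#include <errno.h>

/* Stand-ins for the module parameters; values are examples only. */
static unsigned int fanon = 60000;
static unsigned int fanoff = 53000;

/* Simplified signature: the real op also takes the thermal zone pointer. */
static int acerhdf_get_trip_hyst(int trip, unsigned long *temp)
{
	if (trip != 0)
		return -EINVAL;	/* leave *temp untouched on error */

	*temp = fanon - fanoff;
	return 0;
}
```

This way a caller that checks the return value never consumes an
uninitialized hysteresis.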

Cheers,
Javi



Re: [PATCH v11 0/4] thermal: samsung: Clean up and add support for Exynos5420

2014-04-08 Thread Javi Merino
On Thu, Mar 20, 2014 at 02:45:54AM +, Naveen Krishna Ch wrote:
 Hello Tomasz,
 
 On 20 March 2014 00:58, Tomasz Figa t.f...@samsung.com wrote:
  Hi Leela,
 
 
  On 19.03.2014 12:19, Leela Krishna Amudala wrote:
 
  Hi All,
 
  I didn't see this series in mainline, Any comments for this ?
 
 
  Naveen had posted v12 of this series and I believe all the patches got my
  Reviewed-by.
 
  Best regards,
  Tomasz
 Thanks for your reply.
 These patches are posted to the mailing list quite some time now.
 Anything i can do to speed up the process.

These patches create 5 thermal zones, one for each sensor.  Due to the
way the exynos code in drivers/thermal/samsung/exynos_thermal_common.c
registers thermal zones, each one gets a cpufreq_cooling_device.
Therefore you end up with 5 thermal zones, each of which is controlling
the cpu frequency.

Wouldn't it be better to collate these 5 sensors to create one thermal
zone for the whole SoC?  Maybe a simplistic algorithm like reporting
the maximum temperature of all the sensors as the thermal zone
temperature is good enough.
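
Something along these lines is what I have in mind.  It is only a sketch:
soc_zone_get_temp() is a made-up name, and how the individual sensor
readings are collected is left out.

```c
#include <stddef.h>

/*
 * Hypothetical aggregation helper: report the hottest sensor as the
 * temperature of the single SoC-wide thermal zone.
 * Returns 0 on success, -1 if there are no sensors to read.
 */
static int soc_zone_get_temp(const long *sensor_temps, size_t nr_sensors,
			     long *zone_temp)
{
	size_t i;
	long max;

	if (!sensor_temps || !zone_temp || nr_sensors == 0)
		return -1;

	max = sensor_temps[0];
	for (i = 1; i < nr_sensors; i++)
		if (sensor_temps[i] > max)
			max = sensor_temps[i];

	*zone_temp = max;
	return 0;
}
```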

Cheers,
Javi



[RFC PATCH v5 01/10] tracing: Add array printing helpers

2014-07-10 Thread Javi Merino
From: Dave Martin dave.mar...@arm.com

If a trace event contains an array, there is currently no standard
way to format this for text output.  Drivers are currently hacking
around this by a) local hacks that use the trace_seq functionality
directly, or b) just not printing that information.  For fixed size
arrays, formatting of the elements can be open-coded, but this gets
cumbersome for arrays of non-trivial size.

These approaches result in non-standard content of the event format
description delivered to userspace, so userland tools need to be
taught to understand and parse each array printing method
individually.

This patch implements common __print_<type>_array() helpers that
tracepoint implementations can use instead of reinventing them.  A
simple C-style syntax is used to delimit the array and its elements
{like,this}.

So that the helpers can be used with large static arrays as well as
dynamic arrays, they take a pointer and element count: they can be
used with __get_dynamic_array() for use with dynamic arrays.
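
As a rough userspace illustration of the {like,this} output format, taking
a pointer and an element count as described above.  format_u32_array() is a
stand-in written for this example, not one of the helpers added by the
patch, and printing the elements in decimal is an assumption.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Userspace stand-in for the proposed helpers: formats @count u32 values
 * as "{a,b,c}" into @out.  Returns the number of characters written
 * (excluding the trailing NUL), or -1 if @out is too small.
 */
static int format_u32_array(char *out, size_t out_len,
			    const uint32_t *buf, int count)
{
	size_t pos = 0;
	int i, n;

	n = snprintf(out, out_len, "{");
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	pos = n;

	for (i = 0; i < count; i++) {
		/* The first element has no prefix, later ones get a comma. */
		n = snprintf(out + pos, out_len - pos, "%s%u",
			     i ? "," : "", (unsigned int)buf[i]);
		if (n < 0 || (size_t)n >= out_len - pos)
			return -1;
		pos += n;
	}

	n = snprintf(out + pos, out_len - pos, "}");
	if (n < 0 || (size_t)n >= out_len - pos)
		return -1;

	return (int)(pos + n);
}
```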

Cc: Steven Rostedt rost...@goodmis.org
Cc: Ingo Molnar mi...@redhat.com
Signed-off-by: Dave Martin dave.mar...@arm.com
---
 include/linux/ftrace_event.h |  9 
 include/trace/ftrace.h   | 17 ++
 kernel/trace/trace_output.c  | 55 
 3 files changed, 81 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index cff3106ffe2c..919f21a3420b 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -44,6 +44,15 @@ const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
+const char *ftrace_print_u8_array_seq(struct trace_seq *p,
+ const u8 *buf, int count);
+const char *ftrace_print_u16_array_seq(struct trace_seq *p,
+  const u16 *buf, int count);
+const char *ftrace_print_u32_array_seq(struct trace_seq *p,
+  const u32 *buf, int count);
+const char *ftrace_print_u64_array_seq(struct trace_seq *p,
+  const u64 *buf, int count);
+
 struct trace_iterator;
 struct trace_event;
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 26b4f2e13275..15bc5d417aea 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -263,6 +263,19 @@
 #undef __print_hex
 #define __print_hex(buf, buf_len) ftrace_print_hex_seq(p, buf, buf_len)
 
+#undef __print_u8_array
+#define __print_u8_array(array, count) \
+   ftrace_print_u8_array_seq(p, array, count)
+#undef __print_u16_array
+#define __print_u16_array(array, count)\
+   ftrace_print_u16_array_seq(p, array, count)
+#undef __print_u32_array
+#define __print_u32_array(array, count)\
+   ftrace_print_u32_array_seq(p, array, count)
+#undef __print_u64_array
+#define __print_u64_array(array, count)\
+   ftrace_print_u64_array_seq(p, array, count)
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 static notrace enum print_line_t   \
@@ -676,6 +689,10 @@ static inline void ftrace_test_probe_##call(void) \
 #undef __get_dynamic_array_len
 #undef __get_str
 #undef __get_bitmask
+#undef __print_u8_array
+#undef __print_u16_array
+#undef __print_u32_array
+#undef __print_u64_array
 
 #undef TP_printk
 #define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index f3dad80c20b2..b46238e75523 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -454,6 +454,61 @@ ftrace_print_hex_seq(struct trace_seq *p, const unsigned char *buf, int buf_len)
 }
 EXPORT_SYMBOL(ftrace_print_hex_seq);
 
+static const char *
+ftrace_print_array_seq(struct trace_seq *p, const void *buf, int buf_len,
+  bool (*iterator)(struct trace_seq *p, const char *prefix,
+   const void **buf, int *buf_len))
+{
+	const char *ret = p->buffer + p->len;
+	const char *prefix = "";
+
+	trace_seq_putc(p, '{');
+
+	if (iterator(p, prefix, &buf, &buf_len)) {
+		prefix = ",";
+
+		while (iterator(p, prefix, &buf, &buf_len))
+			;
+   }
+
+   trace_seq_putc(p, '}');
+   trace_seq_putc(p, 0);
+
+   return ret;
+}
+
+#define DEFINE_PRINT_ARRAY(type, printk_type, format)  \
+static bool\
+ftrace_print_array_iterator_##type(struct trace_seq *p, const char *prefix, \
+  const void **buf, int *buf_len)  \
+{   

[RFC PATCH v5 03/10] tools lib traceevent: Add support for __print_u{8,16,32,64}_array()

2014-07-10 Thread Javi Merino
Trace can now generate traces with u8, u16, u32 and u64 dynamic
arrays.  Add support to parse them.

Cc: Arnaldo Carvalho de Melo a...@redhat.com
Cc: Steven Rostedt srost...@redhat.com
Cc: Jiri Olsa jo...@redhat.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 tools/lib/traceevent/event-parse.c | 62 +++---
 tools/lib/traceevent/event-parse.h |  4 +++
 2 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 8a0a8749df4c..8f25903d6e72 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -753,6 +753,10 @@ static void free_arg(struct print_arg *arg)
 		free_arg(arg->symbol.field);
 		free_flag_sym(arg->symbol.symbols);
 		break;
+	case PRINT_U8:
+	case PRINT_U16:
+	case PRINT_U32:
+	case PRINT_U64:
 	case PRINT_HEX:
 		free_arg(arg->num.field);
 		free_arg(arg->num.size);
@@ -2827,6 +2831,22 @@ process_function(struct event_format *event, struct print_arg *arg,
 		free_token(token);
 		return process_hex(event, arg, tok);
 	}
+	if (strcmp(token, "__print_u8_array") == 0) {
+		free_token(token);
+		return process_num(event, arg, tok, PRINT_U8);
+	}
+	if (strcmp(token, "__print_u16_array") == 0) {
+		free_token(token);
+		return process_num(event, arg, tok, PRINT_U16);
+	}
+	if (strcmp(token, "__print_u32_array") == 0) {
+		free_token(token);
+		return process_num(event, arg, tok, PRINT_U32);
+	}
+	if (strcmp(token, "__print_u64_array") == 0) {
+		free_token(token);
+		return process_num(event, arg, tok, PRINT_U64);
+	}
 	if (strcmp(token, "__get_str") == 0) {
 		free_token(token);
 		return process_str(event, arg, tok);
@@ -3355,6 +3375,10 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
break;
case PRINT_FLAGS:
case PRINT_SYMBOL:
+   case PRINT_U8:
+   case PRINT_U16:
+   case PRINT_U32:
+   case PRINT_U64:
case PRINT_HEX:
break;
case PRINT_TYPE:
@@ -3660,7 +3684,7 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
unsigned long long val, fval;
unsigned long addr;
char *str;
-   unsigned char *hex;
+   void *num;
int print;
int i, len;
 
@@ -3739,13 +3763,17 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
 			}
 		}
 		break;
+	case PRINT_U8:
+	case PRINT_U16:
+	case PRINT_U32:
+	case PRINT_U64:
 	case PRINT_HEX:
 		if (arg->num.field->type == PRINT_DYNAMIC_ARRAY) {
 			unsigned long offset;
 			offset = pevent_read_number(pevent,
 				data + arg->num.field->dynarray.field->offset,
 				arg->num.field->dynarray.field->size);
-			hex = data + (offset & 0xffff);
+			num = data + (offset & 0xffff);
 		} else {
 			field = arg->num.field->field.field;
 			if (!field) {
@@ -3755,13 +3783,24 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
 					goto out_warning_field;
 				arg->num.field->field.field = field;
 			}
-			hex = data + field->offset;
+			num = data + field->offset;
 		}
 		len = eval_num_arg(data, size, event, arg->num.size);
 		for (i = 0; i < len; i++) {
 			if (i)
 				trace_seq_putc(s, ' ');
-			trace_seq_printf(s, "%02x", hex[i]);
+			if (arg->type == PRINT_HEX)
+				trace_seq_printf(s, "%02x",
+						((uint8_t *)num)[i]);
+			else if (arg->type == PRINT_U8)
+				trace_seq_printf(s, "%u", ((uint8_t *)num)[i]);
+			else if (arg->type == PRINT_U16)
+				trace_seq_printf(s, "%u", ((uint16_t *)num)[i]);
+			else if (arg->type == PRINT_U32)
+				trace_seq_printf(s, "%u", ((uint32_t *)num)[i]);
+			else	/* PRINT_U64 */
+				trace_seq_printf(s, "%lu",
+						((uint64_t *)num)[i]);
}
break;
 
@@ -4922,7 +4961,20 @@ static void print_args(struct print_arg *args)
 		printf(")");
break

[RFC PATCH v5 05/10] thermal: let governors have private data for each thermal zone

2014-07-10 Thread Javi Merino
A governor may need to store its current state between calls to
throttle().  That state depends on the thermal zone, so store it as
private data in struct thermal_zone_device.

The governors may have two new ops: bind_to_tz() and unbind_from_tz().
When provided, these functions let governors do some initialization
and teardown when they are bound/unbound to a tz and possibly store that
information in the governor_data field of the struct
thermal_zone_device.
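
To give an idea of how a governor could use this, here is a userspace
sketch with trimmed-down stand-ins for the two structures.  The example_*
names and struct example_gov_state are made up for illustration, and
calloc()/free() stand in for the kernel allocators.

```c
#include <stdlib.h>
#include <errno.h>

/* Trimmed-down stand-ins for the kernel structures discussed above. */
struct thermal_zone_device;

struct thermal_governor {
	int (*bind_to_tz)(struct thermal_zone_device *tz);
	void (*unbind_from_tz)(struct thermal_zone_device *tz);
	int (*throttle)(struct thermal_zone_device *tz, int trip);
};

struct thermal_zone_device {
	struct thermal_governor *governor;
	void *governor_data;	/* owned by the bound governor */
};

/* Hypothetical per-zone state for an example governor. */
struct example_gov_state {
	int nr_throttle_calls;
};

static int example_bind_to_tz(struct thermal_zone_device *tz)
{
	struct example_gov_state *state = calloc(1, sizeof(*state));

	if (!state)
		return -ENOMEM;

	tz->governor_data = state;
	return 0;
}

static void example_unbind_from_tz(struct thermal_zone_device *tz)
{
	free(tz->governor_data);
	tz->governor_data = NULL;
}

static int example_throttle(struct thermal_zone_device *tz, int trip)
{
	struct example_gov_state *state = tz->governor_data;

	(void)trip;
	state->nr_throttle_calls++;	/* state survives between calls */
	return 0;
}

static struct thermal_governor example_governor = {
	.bind_to_tz = example_bind_to_tz,
	.unbind_from_tz = example_unbind_from_tz,
	.throttle = example_throttle,
};
```

Because the state hangs off the thermal zone, the same governor can be
bound to several zones without them sharing anything.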

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 drivers/thermal/thermal_core.c | 83 ++
 include/linux/thermal.h|  9 +
 2 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 71b0ec0c370d..3da99dd80ad5 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -72,6 +72,58 @@ static struct thermal_governor *__find_governor(const char *name)
return NULL;
 }
 
+/**
+ * bind_previous_governor() - bind the previous governor of the thermal zone
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @failed_gov_name:   the name of the governor that failed to register
+ *
+ * Register the previous governor of the thermal zone after a new
+ * governor has failed to be bound.
+ */
+static void bind_previous_governor(struct thermal_zone_device *tz,
+   const char *failed_gov_name)
+{
+	if (tz->governor && tz->governor->bind_to_tz) {
+		if (tz->governor->bind_to_tz(tz)) {
+			dev_warn(&tz->device,
+				"governor %s failed to bind and the previous one (%s) failed to register again, thermal zone %s has no governor\n",
+				failed_gov_name, tz->governor->name, tz->type);
+			tz->governor = NULL;
+		}
+	}
+}
+
+/**
+ * thermal_set_governor() - Switch to another governor
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @new_gov:   pointer to the new governor
+ *
+ * Change the governor of thermal zone @tz.
+ *
+ * Return: 0 on success, an error if the new governor's bind_to_tz() failed.
+ */
+static int thermal_set_governor(struct thermal_zone_device *tz,
+   struct thermal_governor *new_gov)
+{
+   int ret = 0;
+
+	if (tz->governor && tz->governor->unbind_from_tz)
+		tz->governor->unbind_from_tz(tz);
+
+	if (new_gov && new_gov->bind_to_tz) {
+		ret = new_gov->bind_to_tz(tz);
+		if (ret) {
+			bind_previous_governor(tz, new_gov->name);
+
+			return ret;
+		}
+	}
+
+	tz->governor = new_gov;
+
+	return ret;
+}
+
 int thermal_register_governor(struct thermal_governor *governor)
 {
int err;
@@ -104,8 +156,15 @@ int thermal_register_governor(struct thermal_governor *governor)
 
 		name = pos->tzp->governor_name;
 
-		if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH))
-			pos->governor = governor;
+		if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH)) {
+			int ret;
+
+			ret = thermal_set_governor(pos, governor);
+			if (ret)
+				dev_warn(&pos->device,
+					"Failed to set governor %s for thermal zone %s: %d\n",
+					governor->name, pos->type, ret);
+		}
 	}
 
 	mutex_unlock(&thermal_list_lock);
@@ -131,7 +190,7 @@ void thermal_unregister_governor(struct thermal_governor *governor)
 	list_for_each_entry(pos, &thermal_tz_list, node) {
 		if (!strnicmp(pos->governor->name, governor->name,
 				THERMAL_NAME_LENGTH))
-			pos->governor = NULL;
+			thermal_set_governor(pos, NULL);
}
 
mutex_unlock(thermal_list_lock);
@@ -756,8 +815,9 @@ policy_store(struct device *dev, struct device_attribute *attr,
 	if (!gov)
 		goto exit;
 
-	tz->governor = gov;
-	ret = count;
+	ret = thermal_set_governor(tz, gov);
+	if (!ret)
+		ret = count;
 
 exit:
 	mutex_unlock(&thermal_governor_lock);
@@ -1452,6 +1512,7 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
 	int result;
 	int count;
 	int passive = 0;
+	struct thermal_governor *governor;
 
 	if (type && strlen(type) >= THERMAL_NAME_LENGTH)
 		return ERR_PTR(-EINVAL);
@@ -1542,9 +1603,15 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
 	mutex_lock(&thermal_governor_lock);
 
 	if (tz->tzp)
-		tz->governor = __find_governor(tz->tzp

[RFC PATCH v5 04/10] thermal: document struct thermal_zone_device and thermal_governor

2014-07-10 Thread Javi Merino
Document struct thermal_zone_device and struct thermal_governor fields
and their use by the thermal framework code.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 include/linux/thermal.h | 46 --
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f7e11c7ea7d9..0305cde21a74 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -158,6 +158,42 @@ struct thermal_attr {
char name[THERMAL_NAME_LENGTH];
 };
 
+/**
+ * struct thermal_zone_device - structure for a thermal zone
+ * @id:unique id number for each thermal zone
+ * @type:  the thermal zone device type
+ * @device:struct device for this thermal zone
+ * @trip_temp_attrs:   attributes for trip points for sysfs: trip temperature
+ * @trip_type_attrs:   attributes for trip points for sysfs: trip type
+ * @trip_hyst_attrs:   attributes for trip points for sysfs: trip hysteresis
+ * @devdata:   private pointer for device private data
+ * @trips: number of trip points the thermal zone supports
+ * @passive_delay: number of milliseconds to wait between polls when
+ * performing passive cooling.  Currently only used by the
+ * step-wise governor
+ * @polling_delay: number of milliseconds to wait between polls when
+ * checking whether trip points have been crossed (0 for
+ * interrupt driven systems)
+ * @temperature:   current temperature.  This is only for core code,
+ * drivers should use thermal_zone_get_temp() to get the
+ * current temperature
+ * @last_temperature:  previous temperature read
+ * @emul_temperature:  emulated temperature when using CONFIG_THERMAL_EMULATION
+ * @passive:   1 if you've crossed a passive trip point, 0 otherwise.
+ * Currently only used by the step-wise governor.
+ * @forced_passive:If > 0, temperature at which to switch on all ACPI
+ * processor cooling devices.  Currently only used by the
+ * step-wise governor.
+ * @ops:   operations this thermal_zone_device supports
+ * @tzp:   thermal zone parameters
+ * @governor:  pointer to the governor for this thermal zone
+ * @thermal_instances: list of struct thermal_instance of this thermal zone
+ * @idr:   struct idr to generate unique id for this zone's cooling
+ * devices
+ * @lock:  lock to protect thermal_instances list
+ * @node:  node in thermal_tz_list (in thermal_core.c)
+ * @poll_queue:delayed work for polling
+ */
 struct thermal_zone_device {
int id;
char type[THERMAL_NAME_LENGTH];
@@ -179,12 +215,18 @@ struct thermal_zone_device {
struct thermal_governor *governor;
struct list_head thermal_instances;
struct idr idr;
-   struct mutex lock; /* protect thermal_instances list */
+   struct mutex lock;
struct list_head node;
struct delayed_work poll_queue;
 };
 
-/* Structure that holds thermal governor information */
+/**
+ * struct thermal_governor - structure that holds thermal governor information
+ * @name:  name of the governor
+ * @throttle:  callback called for every trip point even if temperature is
+ * below the trip point temperature
+ * @governor_list: node in thermal_governor_list (in thermal_core.c)
+ */
 struct thermal_governor {
char name[THERMAL_NAME_LENGTH];
int (*throttle)(struct thermal_zone_device *tz, int trip);
-- 
1.9.1



[RFC PATCH v5 00/10] The power allocator thermal governor

2014-07-10 Thread Javi Merino
Hi linux-pm,

The power allocator governor allocates device power to control
temperature.  This requires transforming performance requests into
requested power, which we do with the aid of power models.  Patch 7
(thermal: add a basic cpu power actor) implements a simple power model
for cpus.  The division of power between the actors ensures that power
is allocated where it is needed the most, based on the current
workload.
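
As a rough illustration of the division of power described above, the
sketch below splits a power budget among actors in proportion to what
each one requests and caps every grant at that actor's maximum.  It is
a userspace sketch, not code from this series: the name
`divide_power()` and its signature are assumptions made for the
example, and it does not redistribute the budget left over when an
actor hits its cap.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch set): split power_range (mW)
 * among n actors in proportion to their requested power, capping each
 * grant at the actor's maximum.  Returns the total power granted.
 */
uint32_t divide_power(const uint32_t *req_power, const uint32_t *max_power,
		      size_t n, uint32_t power_range, uint32_t *granted_power)
{
	uint64_t total_req = 0;
	uint32_t total_granted = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total_req += req_power[i];

	for (i = 0; i < n; i++) {
		uint64_t share = 0;

		/* With no requests at all there is nothing to share out. */
		if (total_req)
			share = (uint64_t)req_power[i] * power_range / total_req;

		if (share > max_power[i])
			share = max_power[i];

		granted_power[i] = (uint32_t)share;
		total_granted += granted_power[i];
	}

	return total_granted;
}
```

With requests of 1000 mW and 3000 mW and a 2000 mW budget, the two
actors are granted 500 mW and 1500 mW respectively.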

Patches 1-3 add array printing helpers to ftrace, which we then use
in patch 9.  Patch 4 is generic documentation of the current thermal
framework and can be merged separately.

Changes since v4:
  - Add more tracing
  - Document some of the limitations of the power allocator governor
  - Export the power_actor API and move power_actor.h to include/linux

Changes since v3:
  - Use tz->passive to poll faster when the first trip point is hit.
  - Don't make a special directory for power_actors
  - Add a DT property for sustainable-power
  - Simplify the static power interface and pass the current thermal
zone in every power_actor_ops to remove the controversial
enum power_actor_types
  - Use locks with the actor_list list
  - Use cpufreq_get() to get the frequency of the cpu instead of
using the notifiers.
  - Remove the prompt for THERMAL_POWER_ACTOR_CPU when configuring
the kernel

Changes since v2:
  - Changed the PI controller into a PID controller
  - Added static power to the cpu power model
  - tz parameter max_dissipatable_power renamed to sustainable_power
  - Register the cpufreq cooling device as part of the
power_cpu_actor registration.

Changes since v1:
  - Fixed finding cpufreq cooling devices in cpufreq_frequency_change()
  - Replaced the cooling device interface with a separate power actor
API
  - Addressed most of Eduardo's comments
  - Incorporated ftrace support for bitmask to trace cpumasks

Todo:
  - Rethink the use of trip points and make it less intrusive
  - Let platforms override the power allocator governor parameters
  - Provide scripts to evaluate the proposal
  - Let the governor operate on cooling devices as well

Cheers,
Javi & Punit

Dave Martin (1):
  tracing: Add array printing helpers

Javi Merino (8):
  tools lib traceevent: Generalize numeric argument
  tools lib traceevent: Add support for __print_u{8,16,32,64}_array()
  thermal: document struct thermal_zone_device and thermal_governor
  thermal: let governors have private data for each thermal zone
  thermal: introduce the Power Actor API
  thermal: add a basic cpu power actor
  thermal: introduce the Power Allocator governor
  thermal: add trace events to the power allocator governor

Punit Agrawal (1):
  of: thermal: Introduce sustainable power for a thermal zone

 .../devicetree/bindings/thermal/thermal.txt|   4 +
 Documentation/thermal/power_actor.txt  | 181 
 Documentation/thermal/power_allocator.txt  |  61 +++
 drivers/thermal/Kconfig|  23 +
 drivers/thermal/Makefile   |   5 +
 drivers/thermal/cpu_actor.c| 501 +
 drivers/thermal/of-thermal.c   |   4 +
 drivers/thermal/power_actor.c  |  70 +++
 drivers/thermal/power_allocator.c  | 485 
 drivers/thermal/thermal_core.c |  90 +++-
 drivers/thermal/thermal_core.h |   8 +
 include/linux/ftrace_event.h   |   9 +
 include/linux/power_actor.h|  86 
 include/linux/thermal.h|  63 ++-
 include/trace/events/thermal_power_allocator.h | 138 ++
 include/trace/ftrace.h |  17 +
 kernel/trace/trace_output.c|  55 +++
 tools/lib/traceevent/event-parse.c |  88 +++-
 tools/lib/traceevent/event-parse.h |   8 +-
 19 files changed, 1865 insertions(+), 31 deletions(-)
 create mode 100644 Documentation/thermal/power_actor.txt
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/cpu_actor.c
 create mode 100644 drivers/thermal/power_actor.c
 create mode 100644 drivers/thermal/power_allocator.c
 create mode 100644 include/linux/power_actor.h
 create mode 100644 include/trace/events/thermal_power_allocator.h

-- 
1.9.1



[RFC PATCH v5 06/10] thermal: introduce the Power Actor API

2014-07-10 Thread Javi Merino
This patch introduces the Power Actor API in the thermal framework.
With it, devices that can report their power consumption and control
it can be registered.  This base interface is meant to be used to
derive specific power actors, such as a cpu power actor.
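
To make the shape of the interface concrete, here is a minimal
userspace sketch of how a specific actor could be derived from the base
one.  The three callback signatures follow power_actor.txt in this
patch (with u32 spelled uint32_t); everything else is an assumption for
illustration: the layout of `struct power_actor` beyond its `ops`
pointer, the `toy_*` names, and the policy of rejecting a limit above
the device maximum.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct thermal_zone_device;	/* opaque in this userspace sketch */
struct power_actor;

/* Callback signatures as documented in power_actor.txt. */
struct power_actor_ops {
	uint32_t (*get_req_power)(struct power_actor *actor,
				  struct thermal_zone_device *tz);
	uint32_t (*get_max_power)(struct power_actor *actor,
				  struct thermal_zone_device *tz);
	int (*set_power)(struct power_actor *actor,
			 struct thermal_zone_device *tz, uint32_t power);
};

/* Assumed layout: the data pointer is hypothetical. */
struct power_actor {
	const struct power_actor_ops *ops;
	void *data;
};

/* Hypothetical device from which a specific actor is derived. */
struct toy_device {
	uint32_t req_mw;	/* power the device currently asks for */
	uint32_t max_mw;	/* power when fully utilised */
	uint32_t limit_mw;	/* last limit applied by set_power() */
};

static uint32_t toy_get_req_power(struct power_actor *actor,
				  struct thermal_zone_device *tz)
{
	const struct toy_device *dev = actor->data;

	(void)tz;
	return dev->req_mw;
}

static uint32_t toy_get_max_power(struct power_actor *actor,
				  struct thermal_zone_device *tz)
{
	const struct toy_device *dev = actor->data;

	(void)tz;
	return dev->max_mw;
}

static int toy_set_power(struct power_actor *actor,
			 struct thermal_zone_device *tz, uint32_t power)
{
	struct toy_device *dev = actor->data;

	(void)tz;
	/* Toy policy: refuse limits above what the device can consume. */
	if (power > dev->max_mw)
		return -EINVAL;

	dev->limit_mw = power;
	return 0;
}

const struct power_actor_ops toy_actor_ops = {
	.get_req_power = toy_get_req_power,
	.get_max_power = toy_get_max_power,
	.set_power = toy_set_power,
};
```

A caller such as a governor would only go through `actor->ops`, which
is what lets cpu, gpu or bus actors share one interface.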

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/power_actor.txt | 56 
 drivers/thermal/Kconfig   |  3 ++
 drivers/thermal/Makefile  |  3 ++
 drivers/thermal/power_actor.c | 70 +++
 include/linux/power_actor.h   | 60 ++
 5 files changed, 192 insertions(+)
 create mode 100644 Documentation/thermal/power_actor.txt
 create mode 100644 drivers/thermal/power_actor.c
 create mode 100644 include/linux/power_actor.h

diff --git a/Documentation/thermal/power_actor.txt b/Documentation/thermal/power_actor.txt
new file mode 100644
index ..11ca2d0bf0bd
--- /dev/null
+++ b/Documentation/thermal/power_actor.txt
@@ -0,0 +1,56 @@
+Power Actor API
+===============
+
+The base power actor API is meant to be used to derive specific power
+actors, such as a cpu power actor.  Power actors can be registered by
+calling `power_actor_register()` and should be unregistered by calling
+`power_actor_unregister()` with the `struct power_actor *` received in
+the call to `power_actor_register()`.
+
+This can't be implemented using the cooling device API because:
+
+1.  get_max_state() gives you the maximum cooling state which, for
+passive devices, is the minimum performance (frequency in case of
+cpufreq cdev).  get_max_power() gives you the maximum power, which
+gives you the maximum performance (frequency in the case of CPUs,
+GPUs and buses)
+
+2.  You need to pass the thermal_zone_device to all the callbacks,
+something that the current cooling device API doesn't do.
+
+Callbacks
+---------
+
+1. u32 get_req_power(struct power_actor *actor,
+   struct thermal_zone_device *tz)
+@actor: a valid `struct power_actor *` registered with
+`power_actor_register()`
+@tz:   the thermal zone closest to the actor (typically, the thermal
+   zone the caller is operating on)
+
+`get_req_power()` returns the current requested power in milliwatts.
+
+2. u32 get_max_power(struct power_actor *actor,
+   struct thermal_zone_device *tz)
+@actor: a valid `struct power_actor *` registered with
+`power_actor_register()`
+@tz:   the thermal zone closest to the actor (typically, the thermal
+   zone the caller is operating on)
+
+`get_max_power()` returns the maximum power that the device could
+consume if it was fully utilized.  It's a function as some devices'
+maximum power consumption can change due to external factors such as
+temperature.
+
+3. int set_power(struct power_actor *actor,
+   struct thermal_zone_device *tz, u32 power)
+@actor: a valid `struct power_actor *` registered with
+`power_actor_register()`
+@tz:   the thermal zone closest to the actor (typically, the thermal
+   zone the caller is operating on)
+@power: power in milliwatts
+
+`set_power()` should configure the device to consume @power
+milliwatts.
+
+Returns 0 on success, -E* on error.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index f9a13867cb70..ce4ebe17252c 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -89,6 +89,9 @@ config THERMAL_GOV_USER_SPACE
help
  Enable this to let the user space manage the platform thermals.
 
+config THERMAL_POWER_ACTOR
+   bool
+
 config CPU_THERMAL
bool "generic cpu cooling support"
depends on CPU_FREQ
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index de0636a57a64..d83aa42ab573 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -14,6 +14,9 @@ thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE)  += fair_share.o
 thermal_sys-$(CONFIG_THERMAL_GOV_STEP_WISE)+= step_wise.o
 thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)   += user_space.o
 
+# power actors
+obj-$(CONFIG_THERMAL_POWER_ACTOR) += power_actor.o
+
 # cpufreq cooling
 thermal_sys-$(CONFIG_CPU_THERMAL)  += cpu_cooling.o
 
diff --git a/drivers/thermal/power_actor.c b/drivers/thermal/power_actor.c
new file mode 100644
index ..0b123f9850ea
--- /dev/null
+++ b/drivers/thermal/power_actor.c
@@ -0,0 +1,70 @@
+/*
+ * Basic interface for power actors
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed as is WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even

[RFC PATCH v5 10/10] of: thermal: Introduce sustainable power for a thermal zone

2014-07-10 Thread Javi Merino
From: Punit Agrawal punit.agra...@arm.com

Introduce an optional property called sustainable-power, which
represents the power (in mW) which the thermal zone can safely
dissipate.

If provided, the property is parsed and associated with the thermal
zone via the thermal zone parameters.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Punit Agrawal punit.agra...@arm.com
---
 Documentation/devicetree/bindings/thermal/thermal.txt | 4 
 drivers/thermal/of-thermal.c  | 4 
 2 files changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt
index f5db6b72a36f..c6eb9a8d2aed 100644
--- a/Documentation/devicetree/bindings/thermal/thermal.txt
+++ b/Documentation/devicetree/bindings/thermal/thermal.txt
@@ -167,6 +167,10 @@ Optional property:
by means of sensor ID. Additional coefficients are
interpreted as constant offset.
 
+- sustainable-power:   An estimate of the sustainable power (in mW) that the
+  Type: unsigned   thermal zone can dissipate.
+  Size: one cell
+
 Note: The delay properties are bound to the maximum dT/dt (temperature
 derivative over time) in two situations for a thermal zone:
 (i)  - when passive cooling is activated (polling-delay-passive); and
diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
index 04b1be7fa018..eaf81ea654b9 100644
--- a/drivers/thermal/of-thermal.c
+++ b/drivers/thermal/of-thermal.c
@@ -769,6 +769,7 @@ int __init of_parse_thermal_zones(void)
for_each_child_of_node(np, child) {
struct thermal_zone_device *zone;
struct thermal_zone_params *tzp;
+   u32 prop;
 
tz = thermal_of_build_thermal_zone(child);
if (IS_ERR(tz)) {
@@ -791,6 +792,9 @@ int __init of_parse_thermal_zones(void)
/* No hwmon because there might be hwmon drivers registering */
tzp->no_hwmon = true;
 
+   if (!of_property_read_u32(child, "sustainable-power", &prop))
+   tzp->sustainable_power = prop;
+
zone = thermal_zone_device_register(child->name, tz->ntrips,
0, tz,
ops, tzp,
-- 
1.9.1



[RFC PATCH v5 07/10] thermal: add a basic cpu power actor

2014-07-10 Thread Javi Merino
Introduce a power actor for cpus.  It has a basic power model to get
the current power utilization and uses cpufreq cooling devices to set
the desired power.  It uses the current frequency (as reported by
cpufreq) as well as load and OPPs for the power calculations.  The
cpus must have registered their OPPs in the OPP library.
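
The dynamic part of the model that this patch documents in
power_actor.txt, Pdyn = Kd * Voltage^2 * Frequency * Utilisation, can
be sketched in userspace C as follows.  The function names, the choice
of MHz and mV as inputs and the 10^9 scaling of the coefficient are
assumptions made for this example; the weighting of the per-cpu power
by load in percent follows the `(raw_cpu_power * load) / 100` step of
get_dynamic_power().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Power of one cpu running flat out at a given operating point:
 * Kd * Voltage^2 * Frequency.  The 10^9 divisor is an assumed scaling
 * that yields mW for a frequency in MHz and a voltage in mV.
 */
uint32_t opp_power_mw(uint32_t kd, uint32_t freq_mhz, uint32_t voltage_mv)
{
	uint64_t power = (uint64_t)kd * freq_mhz * voltage_mv * voltage_mv;

	return (uint32_t)(power / 1000000000ULL);
}

/* Sum the per-cpu power, weighting each cpu by its load in percent. */
uint32_t dynamic_power_mw(uint32_t raw_cpu_power, const uint32_t *load,
			  size_t ncpus)
{
	uint32_t power = 0;
	size_t i;

	for (i = 0; i < ncpus; i++)
		power += (raw_cpu_power * load[i]) / 100;

	return power;
}
```

For example, with kd = 700 at 1200 MHz and 1000 mV a fully loaded cpu
accounts for 840 mW under this scaling, and four cpus at 100%, 50%, 0%
and 25% load add up to 1470 mW.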

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Punit Agrawal punit.agra...@arm.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/power_actor.txt | 125 +
 drivers/thermal/Kconfig   |   5 +
 drivers/thermal/Makefile  |   1 +
 drivers/thermal/cpu_actor.c   | 488 ++
 include/linux/power_actor.h   |  26 ++
 5 files changed, 645 insertions(+)
 create mode 100644 drivers/thermal/cpu_actor.c

diff --git a/Documentation/thermal/power_actor.txt b/Documentation/thermal/power_actor.txt
index 11ca2d0bf0bd..c96344f12599 100644
--- a/Documentation/thermal/power_actor.txt
+++ b/Documentation/thermal/power_actor.txt
@@ -54,3 +54,128 @@ temperature.
 milliwatts.
 
 Returns 0 on success, -E* on error.
+
+CPU Power Actor API
+===================
+
+A simple power model for CPUs.  The current power is calculated as
+dynamic + (optionally) static power.  This power model requires that
+the operating-points of the CPUs are registered using the kernel's opp
+library and the `cpufreq_frequency_table` is assigned to the `struct
+device` of the cpu.  If you are using the `cpufreq-cpu0.c` driver then
+the `cpufreq_frequency_table` should already be assigned to the cpu
+device.
+
+The `plat_static_func` parameter of `power_cpu_actor_register()` is
+optional.  If you don't provide it, only dynamic power will be
+considered.
+
+Dynamic power
+-------------
+
+The dynamic power consumption of a processor depends on many factors.
+For a given processor implementation the primary factors are:
+
+- The time the processor spends running, consuming dynamic power, as
+  compared to the time in idle states where dynamic consumption is
+  negligible.  Herein we refer to this as 'utilisation'.
+- The voltage and frequency levels as a result of DVFS.  The DVFS
+  level is a dominant factor governing power consumption.
+- In running time the 'execution' behaviour (instruction types, memory
+  access patterns and so forth) causes, in most cases, a second order
+  variation.  In pathological cases this variation can be significant,
+  but typically it is of a much lesser impact than the factors above.
+
+A high level dynamic power consumption model may then be represented as:
+
+Pdyn = f(run) * Voltage^2 * Frequency * Utilisation
+
+f(run) here represents the described execution behaviour and its
+result has units of Watts/Hz/Volt^2 (this is often expressed in
+mW/MHz/uVolt^2).
+
+The detailed behaviour for f(run) could be modelled on-line.  However,
+in practice, such an on-line model has dependencies on a number of
+implementation specific processor support and characterisation
+factors.  Therefore, in the initial implementation that contribution is
+represented as a constant coefficient.  This is a simplification
+consistent with the relative contribution to overall power variation.
+
+In this simplified representation our model becomes:
+
+Pdyn = Kd * Voltage^2 * Frequency * Utilisation
+
+Where Kd (capacitance) represents an indicative running time dynamic
+power coefficient in fundamental units of mW/MHz/uVolt^2
+
+Static Power
+------------
+
+Static leakage power consumption depends on a number of factors.  For a
+given circuit implementation the primary factors are:
+
+- Time the circuit spends in each 'power state'
+- Temperature
+- Operating voltage
+- Process grade
+
+The time the circuit spends in each 'power state' for a given
+evaluation period at first order means OFF or ON.  However,
+'retention' states can also be supported that reduce power during
+inactive periods without loss of context.
+
+Note: The visibility of state entries to the OS can vary, according to
+platform specifics, and this can then impact the accuracy of a model
+based on OS state information alone.  It might be possible in some
+cases to extract more accurate information from system resources.
+
+The temperature, operating voltage and process 'grade' (slow to fast)
+of the circuit are all significant factors in static leakage power
+consumption.  All of these have complex relationships to static power.
+
+Circuit implementation specific factors include the chosen silicon
+process as well as the type, number and size of transistors in both
+the logic gates and any RAM elements included.
+
+The static power consumption modelling must take into account the
+power managed regions that are implemented.  Taking the example of an
+ARM processor cluster, the modelling would take into account whether
+each CPU can be powered OFF separately or if only a single power
+region is implemented

[RFC PATCH v5 08/10] thermal: introduce the Power Allocator governor

2014-07-10 Thread Javi Merino
The power allocator governor is a thermal governor that controls system
and device power allocation to control temperature.  Conceptually, the
implementation divides the sustainable power of a thermal zone among
all the heat sources in that zone.

This governor relies on power actors, entities that represent heat
sources.  They can report current and maximum power consumption and
can set a given maximum power consumption, usually via a cooling
device.

The governor uses a Proportional Integral Derivative (PID) controller
driven by the temperature of the thermal zone.  The output of the
controller is a power budget that is then allocated to each power
actor that can have bearing on the temperature we are trying to
control.  It decides how much power to give each cooling device based
on the performance they are requesting.  The PID controller sizes the
total power budget so that the thermal zone does not exceed the
control temperature.
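
A minimal userspace sketch of such a controller is shown below.  It
keeps three things that are visible in this series: the error is the
control temperature minus the current temperature, the sustainable
power is fed forward, and the result is clamped between 0 and the
maximum allocatable power.  The rest is assumed for illustration: plain
integers instead of the governor's fixed-point arithmetic, gains
expressed in mW per degree Celsius, temperatures in millicelsius, and
the names `pid_gains`, `pid_state` and `pid_power_budget()`.

```c
#include <stdint.h>

/* Hypothetical gains, in mW per degree Celsius of error. */
struct pid_gains {
	int32_t k_p;
	int32_t k_i;
	int32_t k_d;
};

struct pid_state {
	int64_t err_integral;	/* accumulated error, in millicelsius */
	int64_t prev_err;	/* error seen on the previous tick */
};

/* Return the power budget (mW) for this tick of the control loop. */
uint32_t pid_power_budget(struct pid_state *st, const struct pid_gains *g,
			  int32_t control_temp, int32_t current_temp,
			  uint32_t sustainable_power,
			  uint32_t max_allocatable_power)
{
	/* A positive error means there is thermal headroom left. */
	int64_t err = (int64_t)control_temp - current_temp;
	int64_t p, i, d, power_range;

	st->err_integral += err;

	/* Temperatures are in millicelsius, hence the division by 1000. */
	p = g->k_p * err / 1000;
	i = g->k_i * st->err_integral / 1000;
	d = g->k_d * (err - st->prev_err) / 1000;
	st->prev_err = err;

	/* Feed-forward the sustainable power, then clamp the result. */
	power_range = (int64_t)sustainable_power + p + i + d;
	if (power_range < 0)
		power_range = 0;
	if (power_range > (int64_t)max_allocatable_power)
		power_range = max_allocatable_power;

	return (uint32_t)power_range;
}
```

Because the integral and derivative terms depend on the history in
`pid_state`, the function is expected to be called once per polling
tick of the thermal zone.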

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Punit Agrawal punit.agra...@arm.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/power_allocator.txt |  61 
 drivers/thermal/Kconfig   |  15 +
 drivers/thermal/Makefile  |   1 +
 drivers/thermal/power_allocator.c | 467 ++
 drivers/thermal/thermal_core.c|   7 +-
 drivers/thermal/thermal_core.h|   8 +
 include/linux/thermal.h   |   8 +
 7 files changed, 566 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/power_allocator.c

diff --git a/Documentation/thermal/power_allocator.txt b/Documentation/thermal/power_allocator.txt
new file mode 100644
index ..1859074dadcb
--- /dev/null
+++ b/Documentation/thermal/power_allocator.txt
@@ -0,0 +1,61 @@
+Integration of the power_allocator governor in a platform
+=========================================================
+
+Registering thermal_zone_device
+-------------------------------
+
+An estimate of the sustainable dissipatable power (in mW) should be
+provided while registering the thermal zone.  This is the maximum
+sustained power for allocation at the desired maximum temperature.
+This number can vary for different conditions, but the closed loop of
+the controller should take care of those variations, so
+`sustainable_power` only needs to be an estimate.  Register your
+thermal zone with `thermal_zone_params` that have a
+`sustainable_power`.  If you weren't passing any
+`thermal_zone_params`, then something like this will do:
+
+   static const struct thermal_zone_params tz_params = {
+   .sustainable_power = 3500,
+   };
+
+and then pass `tz_params` as the 6th parameter to
+`thermal_zone_device_register()`
+
+Trip points
+-----------
+
+The governor requires the following two trip points:
+
+1.  switch on trip point: temperature above which the governor
+control loop starts operating
+2.  desired temperature trip point: it should be higher than the
+switch on trip point. It is the target temperature the governor
+is controlling for.
+
+The trip points can be either active or passive.
+
+Power actors
+------------
+
+Devices controlled by this governor must be registered with the power
+actor API.  Read `power_actor.txt` for more information about them.
+
+Limitations of the power allocator governor
+===========================================
+
+The power allocator governor can't work with cooling devices directly.
+A power actor can be created to interface between the governor and the
+cooling device (see cpu_actor.c for an example).  Otherwise, if you
+have power actors and cooling devices that are next to the same
+thermal sensor, create two thermal zones, one for each type.  Use the
+power allocator governor for the power actor thermal zone with the
+power actors and any other governor for the one with cooling devices.
+
+The power allocator governor's PID controller is highly dependent on a
+periodic tick.  If you have a driver that calls
+`thermal_zone_device_update()` (or anything that ends up calling the
+governor's `throttle()` function) repetitively, the governor response
+won't be very good.  Note that this is not particular to this
+governor, step-wise will also misbehave if you call its throttle()
+faster than the normal thermal framework tick (due to interrupts for
+example) as it will overreact.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 249b196deffd..0e76c0dab5f3 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -71,6 +71,14 @@ config THERMAL_DEFAULT_GOV_USER_SPACE
  Select this if you want to let the user space manage the
  platform thermals.
 
+config THERMAL_DEFAULT_GOV_POWER_ALLOCATOR
+   bool "power_allocator"
+   select THERMAL_GOV_POWER_ALLOCATOR
+   help
+ Select this if you want to control

[RFC PATCH v5 02/10] tools lib traceevent: Generalize numeric argument

2014-07-10 Thread Javi Merino
Numeric arguments can be printed in different bases, so rename the hex
member of struct print_arg (and struct print_arg_hex) to num so that
it can be used for formats other than PRINT_HEX.

Cc: Steven Rostedt srost...@redhat.com
Cc: Arnaldo Carvalho de Melo a...@redhat.com
Cc: Jiri Olsa jo...@redhat.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 tools/lib/traceevent/event-parse.c | 26 +-
 tools/lib/traceevent/event-parse.h |  4 ++--
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 93825a17dcce..8a0a8749df4c 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -754,8 +754,8 @@ static void free_arg(struct print_arg *arg)
free_flag_sym(arg->symbol.symbols);
break;
case PRINT_HEX:
-   free_arg(arg->hex.field);
-   free_arg(arg->hex.size);
+   free_arg(arg->num.field);
+   free_arg(arg->num.size);
break;
case PRINT_TYPE:
free(arg-typecast.type);
@@ -2503,7 +2503,7 @@ process_hex(struct event_format *event, struct print_arg *arg, char **tok)
if (test_type_token(type, token, EVENT_DELIM, ","))
goto out_free;
 
-   arg->hex.field = field;
+   arg->num.field = field;
 
free_token(token);
 
@@ -2519,7 +2519,7 @@ process_hex(struct event_format *event, struct print_arg *arg, char **tok)
if (test_type_token(type, token, EVENT_DELIM, ")"))
goto out_free;
 
-   arg->hex.size = field;
+   arg->num.size = field;
 
free_token(token);
type = read_token_item(tok);
@@ -3740,24 +3740,24 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
}
break;
case PRINT_HEX:
-   if (arg->hex.field->type == PRINT_DYNAMIC_ARRAY) {
+   if (arg->num.field->type == PRINT_DYNAMIC_ARRAY) {
unsigned long offset;
offset = pevent_read_number(pevent,
-   data + arg->hex.field->dynarray.field->offset,
-   arg->hex.field->dynarray.field->size);
+   data + arg->num.field->dynarray.field->offset,
+   arg->num.field->dynarray.field->size);
hex = data + (offset & 0xffff);
} else {
-   field = arg->hex.field->field.field;
+   field = arg->num.field->field.field;
if (!field) {
-   str = arg->hex.field->field.name;
+   str = arg->num.field->field.name;
field = pevent_find_any_field(event, str);
if (!field)
goto out_warning_field;
-   arg->hex.field->field.field = field;
+   arg->num.field->field.field = field;
}
hex = data + field->offset;
}
-   len = eval_num_arg(data, size, event, arg->hex.size);
+   len = eval_num_arg(data, size, event, arg->num.size);
for (i = 0; i < len; i++) {
if (i)
trace_seq_putc(s, ' ');
@@ -4923,9 +4923,9 @@ static void print_args(struct print_arg *args)
break;
case PRINT_HEX:
printf("__print_hex(");
-   print_args(args->hex.field);
+   print_args(args->num.field);
printf(", ");
-   print_args(args->hex.size);
+   print_args(args->num.size);
printf(")");
break;
case PRINT_STRING:
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 7a3873ff9a4f..2bf72e908a74 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -240,7 +240,7 @@ struct print_arg_symbol {
struct print_flag_sym   *symbols;
 };
 
-struct print_arg_hex {
+struct print_arg_num {
struct print_arg *field;
struct print_arg *size;
 };
@@ -291,7 +291,7 @@ struct print_arg {
struct print_arg_typecast   typecast;
struct print_arg_flags  flags;
struct print_arg_symbol symbol;
-   struct print_arg_hex hex;
+   struct print_arg_num num;
struct print_arg_func   func;
struct print_arg_string string;
struct print_arg_bitmask bitmask;
-- 
1.9.1



[RFC PATCH v5 09/10] thermal: add trace events to the power allocator governor

2014-07-10 Thread Javi Merino
Add trace events for the power allocator governor and the power actor
interface of the cpu cooling device.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Cc: Steven Rostedt rost...@goodmis.org
Cc: Frederic Weisbecker fweis...@gmail.com
Cc: Ingo Molnar mi...@redhat.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 drivers/thermal/cpu_actor.c|  17 ++-
 drivers/thermal/power_allocator.c  |  22 +++-
 include/trace/events/thermal_power_allocator.h | 138 +
 3 files changed, 173 insertions(+), 4 deletions(-)
 create mode 100644 include/trace/events/thermal_power_allocator.h

diff --git a/drivers/thermal/cpu_actor.c b/drivers/thermal/cpu_actor.c
index 45ea4fa92ea0..b5ed2e80e288 100644
--- a/drivers/thermal/cpu_actor.c
+++ b/drivers/thermal/cpu_actor.c
@@ -28,6 +28,8 @@
 #include <linux/printk.h>
 #include <linux/slab.h>
 
+#include <trace/events/thermal_power_allocator.h>
+
 /**
  * struct power_table - frequency to power conversion
  * @frequency: frequency in KHz
@@ -184,11 +186,12 @@ static u32 get_static_power(struct cpu_actor *cpu_actor,
  */
 static u32 get_dynamic_power(struct cpu_actor *cpu_actor, unsigned long freq)
 {
-   int cpu;
-   u32 power = 0, raw_cpu_power, total_load = 0;
+   int i, cpu;
+   u32 power = 0, raw_cpu_power, total_load = 0, load_cpu[NR_CPUS];
 
raw_cpu_power = cpu_freq_to_power(cpu_actor, freq);
 
+   i = 0;
for_each_cpu(cpu, &cpu_actor->cpumask) {
u32 load;
 
@@ -198,8 +201,15 @@ static u32 get_dynamic_power(struct cpu_actor *cpu_actor, unsigned long freq)
load = get_load(cpu_actor, cpu);
power += (raw_cpu_power * load) / 100;
total_load += load;
+   load_cpu[i] = load;
+
+   i++;
}
 
+   trace_thermal_power_actor_cpu_get_dyn_power(&cpu_actor->cpumask, freq,
+   raw_cpu_power, load_cpu, i,
+   power);
+
cpu_actor-last_load = total_load;
 
return power;
@@ -296,6 +306,9 @@ static int cpu_set_power(struct power_actor *actor,
return -EINVAL;
}
 
+   trace_thermal_power_actor_cpu_limit(&cpu_actor->cpumask, target_freq,
+   cdev_state, power);
+
return cdev->ops->set_cur_state(cdev, cdev_state);
 }
 
diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index eb1797cd859b..e6793d6d1288 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -20,6 +20,9 @@
 #include <linux/slab.h>
 #include <linux/thermal.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/thermal_power_allocator.h>
+
 #include "thermal_core.h"
 
 #define FRAC_BITS 8
@@ -133,7 +136,14 @@ static u32 pid_controller(struct thermal_zone_device *tz,
/* feed-forward the known sustainable dissipatable power */
power_range = tz->tzp->sustainable_power + frac_to_int(power_range);
 
-   return clamp(power_range, (s64)0, (s64)max_allocatable_power);
+   power_range = clamp(power_range, (s64)0, (s64)max_allocatable_power);
+
+   trace_thermal_power_allocator_pid(frac_to_int(err),
+   frac_to_int(params->err_integral),
+   frac_to_int(p), frac_to_int(i),
+   frac_to_int(d), power_range);
+
+   return power_range;
 }
 
 /**
@@ -214,7 +224,7 @@ static int allocate_power(struct thermal_zone_device *tz,
struct power_actor *actor;
u32 *req_power, *max_power, *granted_power;
u32 total_req_power, max_allocatable_power;
-   u32 power_range;
+   u32 total_granted_power, power_range;
int i, num_actors, ret = 0;
 
mutex_lock(&tz->lock);
@@ -265,12 +275,20 @@ static int allocate_power(struct thermal_zone_device *tz,
divvy_up_power(req_power, max_power, num_actors, total_req_power,
power_range, granted_power);
 
+   total_granted_power = 0;
i = 0;
list_for_each_entry_rcu(actor, &actor_list, actor_node) {
actor->ops->set_power(actor, tz, granted_power[i]);
+   total_granted_power += granted_power[i];
+
i++;
}
 
+   trace_thermal_power_allocator(req_power, total_req_power, granted_power,
+   total_granted_power, num_actors, power_range,
+   max_allocatable_power, current_temp,
+   (s32)control_temp - (s32)current_temp);
+
devm_kfree(&tz->device, granted_power);
 free_max_power:
devm_kfree(&tz->device, max_power);
diff --git a/include/trace/events/thermal_power_allocator.h b/include/trace/events/thermal_power_allocator.h
new file mode 100644
index ..9140cec55c63
--- /dev/null
+++ b/include/trace/events

Re: [RFC PATCH v5 09/10] thermal: add trace events to the power allocator governor

2014-07-10 Thread Javi Merino
On Thu, Jul 10, 2014 at 04:44:51PM +0100, Steven Rostedt wrote:
> On Thu, 10 Jul 2014 15:18:47 +0100
> Javi Merino javi.mer...@arm.com wrote:
>
> > Add trace events for the power allocator governor and the power actor
> > interface of the cpu cooling device.
> >
> > Cc: Zhang Rui rui.zh...@intel.com
> > Cc: Eduardo Valentin edubez...@gmail.com
> > Cc: Steven Rostedt rost...@goodmis.org
> > Cc: Frederic Weisbecker fweis...@gmail.com
> > Cc: Ingo Molnar mi...@redhat.com
> > Signed-off-by: Javi Merino javi.mer...@arm.com
> > ---
> >  drivers/thermal/cpu_actor.c|  17 ++-
> >  drivers/thermal/power_allocator.c  |  22 +++-
> >  include/trace/events/thermal_power_allocator.h | 138 +
> >  3 files changed, 173 insertions(+), 4 deletions(-)
> >  create mode 100644 include/trace/events/thermal_power_allocator.h
> >
> > diff --git a/drivers/thermal/cpu_actor.c b/drivers/thermal/cpu_actor.c
> > index 45ea4fa92ea0..b5ed2e80e288 100644
> > --- a/drivers/thermal/cpu_actor.c
> > +++ b/drivers/thermal/cpu_actor.c
> > @@ -28,6 +28,8 @@
> >  #include <linux/printk.h>
> >  #include <linux/slab.h>
> >
> > +#include <trace/events/thermal_power_allocator.h>
> > +
> >  /**
> >   * struct power_table - frequency to power conversion
> >   * @frequency: frequency in KHz
> > @@ -184,11 +186,12 @@ static u32 get_static_power(struct cpu_actor *cpu_actor,
> >   */
> >  static u32 get_dynamic_power(struct cpu_actor *cpu_actor, unsigned long freq)
> >  {
> > -   int cpu;
> > -   u32 power = 0, raw_cpu_power, total_load = 0;
> > +   int i, cpu;
> > +   u32 power = 0, raw_cpu_power, total_load = 0, load_cpu[NR_CPUS];
>
> When NR_CPUS == 1024, you just killed the stack, as you added 4K to it.
> We upped the stack recently to 16k, but still.

True, this array should be static.

   
> > raw_cpu_power = cpu_freq_to_power(cpu_actor, freq);
> >
> > +   i = 0;
> > for_each_cpu(cpu, &cpu_actor->cpumask) {
> > u32 load;
> >
> > @@ -198,8 +201,15 @@ static u32 get_dynamic_power(struct cpu_actor *cpu_actor, unsigned long freq)
> > load = get_load(cpu_actor, cpu);
> > power += (raw_cpu_power * load) / 100;
> > total_load += load;
> > +   load_cpu[i] = load;
> > +
> > +   i++;
> > }
> >
> > +   trace_thermal_power_actor_cpu_get_dyn_power(&cpu_actor->cpumask, freq,
> > +   raw_cpu_power, load_cpu, i,
> > +   power);
>
> How many CPUs are you saving load_cpu on? A trace event can't be bigger
> than a page. And the data is actually a little less than that with the
> required headers.

The biggest system I've tested it on is an 8 cpu system (with
NR_CPUS==8).  So yes, small and we haven't seen any issues.

Are you saying that we are siphoning too much data through ftrace?  We
find it really valuable to collect information during a run and process
it afterwards, but I can see how this may not be feasible for systems
with thousands of cpus.



Re: [RFC PATCH v5 09/10] thermal: add trace events to the power allocator governor

2014-07-11 Thread Javi Merino
On Thu, Jul 10, 2014 at 07:03:50PM +0100, Steven Rostedt wrote:
 On Thu, 10 Jul 2014 17:20:14 +0100
 Javi Merino javi.mer...@arm.com wrote:
 
 
   
   How many CPUs are you saving load_cpu on? A trace event can't be bigger
   than a page. And the data is actually a little less than that with the
   required headers.
  
  The biggest system I've tested it on is an 8 cpu system (with
  NR_CPUS==8).  So yes, small and we haven't seen any issues.
  
  Are you saying that we are siphoning too much data through ftrace?  We
  find it really valuable to collect information during a run and process
  it afterwards, but I can see how this may not be feasible for systems
  with thousands of cpus.
 
 Only too much for a single event. Perhaps have the tracepoint post per
 CPU? Then you wouldn't need that array.

Sounds good, I'll do that.
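
The per-CPU approach suggested above can be sketched in user-space C.
This is only an illustration under stated assumptions: struct
load_event, struct event_log and emit_cpu_load_event() are hypothetical
stand-ins for a tracepoint and the ftrace ring buffer, and the loads[]
argument stands in for get_load().  None of them are kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size record, standing in for one trace event. */
struct load_event {
	int cpu;
	uint32_t load;	/* percent, 0..100 */
};

/* Hypothetical sink; a real tracepoint writes to the ftrace ring buffer. */
struct event_log {
	struct load_event *events;
	size_t capacity;
	size_t count;
};

/* Record one small event; silently drop it if the log is full. */
static void emit_cpu_load_event(struct event_log *log, int cpu, uint32_t load)
{
	assert(log != NULL);

	if (log->count < log->capacity) {
		log->events[log->count].cpu = cpu;
		log->events[log->count].load = load;
		log->count++;
	}
}

/*
 * Sum the dynamic power of all CPUs, emitting one event per CPU from
 * inside the loop.  No NR_CPUS-sized array is needed, so the stack
 * usage and the size of each event no longer grow with the CPU count.
 */
static uint32_t dynamic_power(struct event_log *log, const uint32_t *loads,
			      int ncpus, uint32_t raw_cpu_power)
{
	uint32_t power = 0;
	int cpu;

	for (cpu = 0; cpu < ncpus; cpu++) {
		uint32_t load = loads[cpu];	/* stand-in for get_load() */

		power += (raw_cpu_power * load) / 100;
		emit_cpu_load_event(log, cpu, load);
	}

	return power;
}
```

Each event carries a fixed, small payload, which keeps it well under the
one-page limit mentioned above regardless of how many CPUs the system has.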



[PATCH RESEND] thermal: document struct thermal_zone_device and thermal_governor

2014-06-25 Thread Javi Merino
Document struct thermal_zone_device and struct thermal_governor fields
and their use by the thermal framework code.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com

---

Hi linux-pm,

I have some patches that add new fields to these structures but I
don't have a good place to describe those fields as these structs are
mostly undocumented so I thought I'd document them.

Changes since v1:
  * Clarified that some parameters are currently only used by the
step-wise governor.
  * Clarified that forced_passive operates on ACPI processor cooling
devices.

 include/linux/thermal.h | 46 --
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f7e11c7ea7d9..0305cde21a74 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -158,6 +158,42 @@ struct thermal_attr {
char name[THERMAL_NAME_LENGTH];
 };
 
+/**
+ * struct thermal_zone_device - structure for a thermal zone
+ * @id:unique id number for each thermal zone
+ * @type:  the thermal zone device type
+ * @device:struct device for this thermal zone
+ * @trip_temp_attrs:   attributes for trip points for sysfs: trip temperature
+ * @trip_type_attrs:   attributes for trip points for sysfs: trip type
+ * @trip_hyst_attrs:   attributes for trip points for sysfs: trip hysteresis
+ * @devdata:   private pointer for device private data
+ * @trips: number of trip points the thermal zone supports
+ * @passive_delay: number of milliseconds to wait between polls when
+ * performing passive cooling.  Currently only used by the
+ * step-wise governor
+ * @polling_delay: number of milliseconds to wait between polls when
+ * checking whether trip points have been crossed (0 for
+ * interrupt driven systems)
+ * @temperature:   current temperature.  This is only for core code,
+ * drivers should use thermal_zone_get_temp() to get the
+ * current temperature
+ * @last_temperature:  previous temperature read
+ * @emul_temperature:  emulated temperature when using CONFIG_THERMAL_EMULATION
+ * @passive:   1 if you've crossed a passive trip point, 0 otherwise.
+ * Currently only used by the step-wise governor.
+ * @forced_passive:If > 0, temperature at which to switch on all ACPI
+ * processor cooling devices.  Currently only used by the
+ * step-wise governor.
+ * @ops:   operations this thermal_zone_device supports
+ * @tzp:   thermal zone parameters
+ * @governor:  pointer to the governor for this thermal zone
+ * @thermal_instances: list of struct thermal_instance of this thermal zone
+ * @idr:   struct idr to generate unique id for this zone's cooling
+ * devices
+ * @lock:  lock to protect thermal_instances list
+ * @node:  node in thermal_tz_list (in thermal_core.c)
+ * @poll_queue:delayed work for polling
+ */
 struct thermal_zone_device {
int id;
char type[THERMAL_NAME_LENGTH];
@@ -179,12 +215,18 @@ struct thermal_zone_device {
struct thermal_governor *governor;
struct list_head thermal_instances;
struct idr idr;
-   struct mutex lock; /* protect thermal_instances list */
+   struct mutex lock;
struct list_head node;
struct delayed_work poll_queue;
 };
 
-/* Structure that holds thermal governor information */
+/**
+ * struct thermal_governor - structure that holds thermal governor information
+ * @name:  name of the governor
+ * @throttle:  callback called for every trip point even if temperature is
+ * below the trip point temperature
+ * @governor_list: node in thermal_governor_list (in thermal_core.c)
+ */
 struct thermal_governor {
char name[THERMAL_NAME_LENGTH];
int (*throttle)(struct thermal_zone_device *tz, int trip);
-- 
1.9.1




Re: [RFC PATCH 3/3] thermal: trace: Trace when temperature is above a trip point

2014-06-25 Thread Javi Merino
On Tue, Jun 24, 2014 at 11:41:38AM +0100, Punit Agrawal wrote:
 Javi Merino javi.mer...@arm.com writes:
 
  Hi Punit,
 
  On Wed, Jun 11, 2014 at 12:31:44PM +0100, Punit Agrawal wrote:
  Create a new event to trace when the temperature is above a trip
  point. Use the trace-point when handling non-critical and critical
  trip points.
  
  Cc: Zhang Rui rui.zh...@intel.com
  Cc: Eduardo Valentin edubez...@gmail.com
  Cc: Steven Rostedt rost...@goodmis.org
  Cc: Frederic Weisbecker fweis...@gmail.com
  Cc: Ingo Molnar mi...@redhat.com
  Signed-off-by: Punit Agrawal punit.agra...@arm.com
  ---
  Hi Steven,
  
  I am facing an issue with partial trace being emitted when using
  __print_symbolic in this patch. 
  
  When the trip_type is THERMAL_TRIP_ACTIVE (i.e., the first value in
  the symbol map), the emitted trace contains the corresponding string
  (active). But for other values of trip_type an empty string is
  emitted in the trace.
  
  I've looked at other uses of __print_symbolic in the kernel and don't
  see any difference in usage. Do you know what could be causing this or
  alternately have any pointers on how to debug this behaviour?
  
  Thanks.
  Punit
  
   drivers/thermal/fair_share.c   |7 ++-
   drivers/thermal/step_wise.c|5 -
   drivers/thermal/thermal_core.c |2 ++
   include/trace/events/thermal.h |   30 ++
   4 files changed, 42 insertions(+), 2 deletions(-)
  
  diff --git a/drivers/thermal/fair_share.c b/drivers/thermal/fair_share.c
  index 944ba2f..2cddd68 100644
  --- a/drivers/thermal/fair_share.c
  +++ b/drivers/thermal/fair_share.c
  @@ -23,6 +23,7 @@
*/
   
   #include <linux/thermal.h>
  +#include <trace/events/thermal.h>
   
   #include "thermal_core.h"
   
  @@ -34,14 +35,18 @@ static int get_trip_level(struct thermal_zone_device 
  *tz)
   {
 int count = 0;
 unsigned long trip_temp;
  +  enum thermal_trip_type trip_type;
   
 if (tz->trips == 0 || !tz->ops->get_trip_temp)
 return 0;
   
 for (count = 0; count < tz->trips; count++) {
 tz->ops->get_trip_temp(tz, count, &trip_temp);
  -  if (tz->temperature < trip_temp)
  +  if (tz->temperature < trip_temp) {
  +  tz->ops->get_trip_type(tz, count, &trip_type);
  +  trace_thermal_zone_trip(tz, count, trip_type);
 
  This should be outside the if condition.  You want to report when trip
  points have been hit, like in the step_wise code below.
 
 
 It turned out to be a bit more subtle than moving the trace outside the
 if.

True, it was more difficult than what I said.  

 I have the below fixup with an added comment. Let me know if that
 doesn't solve the problem.

I don't have a reproducer, I just spotted it while reading the code.
The below fix seems to be the right thing.

 -- 8 --
 Subject: [PATCH] fixup! thermal: trace: Trace when temperature is above a
  trip point
 
 ---
  drivers/thermal/fair_share.c |   15 +++
  1 file changed, 11 insertions(+), 4 deletions(-)
 
 diff --git a/drivers/thermal/fair_share.c b/drivers/thermal/fair_share.c
 index 2cddd68..6e0a3fb 100644
 --- a/drivers/thermal/fair_share.c
 +++ b/drivers/thermal/fair_share.c
 @@ -42,12 +42,19 @@ static int get_trip_level(struct thermal_zone_device *tz)
  
   for (count = 0; count < tz->trips; count++) {
   tz->ops->get_trip_temp(tz, count, &trip_temp);
 - if (tz->temperature < trip_temp) {
 - tz->ops->get_trip_type(tz, count, &trip_type);
 - trace_thermal_zone_trip(tz, count, trip_type);
 + if (tz->temperature < trip_temp)
   break;
 - }
   }
 +
 + /*
 +  * count > 0 only if temperature is greater than first trip
 +  * point, in which case, trip_point = count - 1
 +  */
 + if (count > 0) {
 + tz->ops->get_trip_type(tz, count - 1, &trip_type);
 + trace_thermal_zone_trip(tz, count - 1, trip_type);
 + }
 +
   return count;
  }
  
 -- 
 1.7.10.4
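
The index arithmetic in the fixup can be checked with a small user-space
sketch.  It is an illustration only: the trip temperatures are passed in
as a plain array instead of going through tz->ops->get_trip_temp(), and
the hypothetical traced_trip output parameter replaces the call to
trace_thermal_zone_trip() (-1 means no event would be emitted).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the number of leading trip points whose temperature is at or
 * below the current temperature.  *traced_trip receives the index of
 * the highest such trip point (count - 1), or -1 if the temperature is
 * below the first trip point.
 */
static int get_trip_level(const long *trip_temps, int ntrips, long temperature,
			  int *traced_trip)
{
	int count;

	assert(traced_trip != NULL);
	*traced_trip = -1;

	if (ntrips == 0)
		return 0;

	for (count = 0; count < ntrips; count++) {
		if (temperature < trip_temps[count])
			break;
	}

	/* count > 0 only if the first trip point has been reached */
	if (count > 0)
		*traced_trip = count - 1;

	return count;
}
```

With trip points at 40000, 60000 and 80000, a temperature of 65000
yields a level of 2 and reports trip point 1, while a temperature of
30000 yields level 0 and reports nothing.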



[PATCH TRIVIAL] thermal: cpu_cooling: fix typo highjack - hijack

2014-06-25 Thread Javi Merino
Cc: Eduardo Valentin eduardo.valen...@ti.com
Cc: Zhang Rui rui.zh...@intel.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 drivers/thermal/cpu_cooling.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 84a75f89bf74..1ab0018271c5 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -305,7 +305,7 @@ static int cpufreq_apply_cooling(struct 
cpufreq_cooling_device *cpufreq_device,
  * @event: value showing cpufreq event for which this function invoked.
  * @data: callback-specific data
  *
- * Callback to highjack the notification on cpufreq policy transition.
+ * Callback to hijack the notification on cpufreq policy transition.
  * Every time there is a change in policy, we will intercept and
  * update the cpufreq policy with thermal constraints.
  *
-- 
1.9.1




Re: [PATCH] MAINTAINERS: Update Eduardo Valentin's email address

2014-06-26 Thread Javi Merino
On Mon, Jun 02, 2014 at 06:28:00PM +0100, Eduardo Valentin wrote:
 Hello Lee Jones,
 
 On Fri, May 30, 2014 at 11:03:28AM +0100, Lee Jones wrote:
  Eduardo TI address is bouncing, but it looks like he's still
  contributing via his Gmail address.
  
 
 Thanks for being proactive! I actually sent the very same patch a while
 ago:
 
 
 https://lkml.org/lkml/2014/4/2/644
 
 
 Maybe it's fallen into the cracks.
 
 Rui, any idea what is the status of that patch?

I've been bitten by this *again*.  This patch should've been in
3.15-rc1, it's 3.16-rc2 and Eduardo's email is still outdated in the
MAINTAINERS file.  The thermal tree seems to be pretty stale lately
and patches are rotting in the list, but this update should really get
into the kernel ASAP.

Eduardo, can you send this to triv...@kernel.org ?



Re: [RFC PATCH 3/3] thermal: trace: Trace when temperature is above a trip point

2014-06-20 Thread Javi Merino
Hi Punit,

On Wed, Jun 11, 2014 at 12:31:44PM +0100, Punit Agrawal wrote:
 Create a new event to trace when the temperature is above a trip
 point. Use the trace-point when handling non-critical and critical
 trip points.
 
 Cc: Zhang Rui rui.zh...@intel.com
 Cc: Eduardo Valentin edubez...@gmail.com
 Cc: Steven Rostedt rost...@goodmis.org
 Cc: Frederic Weisbecker fweis...@gmail.com
 Cc: Ingo Molnar mi...@redhat.com
 Signed-off-by: Punit Agrawal punit.agra...@arm.com
 ---
 Hi Steven,
 
 I am facing an issue with partial trace being emitted when using
 __print_symbolic in this patch. 
 
 When the trip_type is THERMAL_TRIP_ACTIVE (i.e., the first value in
 the symbol map), the emitted trace contains the corresponding string
 (active). But for other values of trip_type an empty string is
 emitted in the trace.
 
 I've looked at other uses of __print_symbolic in the kernel and don't
 see any difference in usage. Do you know what could be causing this or
 alternately have any pointers on how to debug this behaviour?
 
 Thanks.
 Punit
 
  drivers/thermal/fair_share.c   |7 ++-
  drivers/thermal/step_wise.c|5 -
  drivers/thermal/thermal_core.c |2 ++
  include/trace/events/thermal.h |   30 ++
  4 files changed, 42 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/thermal/fair_share.c b/drivers/thermal/fair_share.c
 index 944ba2f..2cddd68 100644
 --- a/drivers/thermal/fair_share.c
 +++ b/drivers/thermal/fair_share.c
 @@ -23,6 +23,7 @@
   */
  
  #include <linux/thermal.h>
 +#include <trace/events/thermal.h>
  
  #include "thermal_core.h"
  
 @@ -34,14 +35,18 @@ static int get_trip_level(struct thermal_zone_device *tz)
  {
   int count = 0;
   unsigned long trip_temp;
 + enum thermal_trip_type trip_type;
  
   if (tz->trips == 0 || !tz->ops->get_trip_temp)
   return 0;
  
   for (count = 0; count < tz->trips; count++) {
   tz->ops->get_trip_temp(tz, count, &trip_temp);
 - if (tz->temperature < trip_temp)
 + if (tz->temperature < trip_temp) {
 + tz->ops->get_trip_type(tz, count, &trip_type);
 + trace_thermal_zone_trip(tz, count, trip_type);

This should be outside the if condition.  You want to report when trip
points have been hit, like in the step_wise code below.

   break;
 + }
   }
   return count;
  }
 diff --git a/drivers/thermal/step_wise.c b/drivers/thermal/step_wise.c
 index f251521..3b54c2c 100644
 --- a/drivers/thermal/step_wise.c
 +++ b/drivers/thermal/step_wise.c
 @@ -23,6 +23,7 @@
   */
  
  #include <linux/thermal.h>
 +#include <trace/events/thermal.h>
  
  #include "thermal_core.h"
  
 @@ -129,8 +130,10 @@ static void thermal_zone_trip_update(struct 
 thermal_zone_device *tz, int trip)
  
   trend = get_tz_trend(tz, trip);
  
 - if (tz->temperature >= trip_temp)
 + if (tz->temperature >= trip_temp) {
   throttle = true;
 + trace_thermal_zone_trip(tz, trip, trip_type);
 + }
  
   dev_dbg(&tz->device, "Trip%d[type=%d,temp=%ld]:trend=%d,throttle=%d\n",
   trip, trip_type, trip_temp, trend, throttle);
 diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
 index c74c78d..454884a 100644
 --- a/drivers/thermal/thermal_core.c
 +++ b/drivers/thermal/thermal_core.c
 @@ -371,6 +371,8 @@ static void handle_critical_trips(struct 
 thermal_zone_device *tz,
   if (tz->temperature < trip_temp)
   return;
  
 + trace_thermal_zone_trip(tz, trip, trip_type);
 +
   if (tz->ops->notify)
   tz->ops->notify(tz, trip, trip_type);
  

Cheers,
Javi



Re: [PATCH RESEND] thermal: document struct thermal_zone_device and thermal_governor

2014-07-08 Thread Javi Merino
On Wed, Jun 25, 2014 at 11:00:12AM +0100, Javi Merino wrote:
 Document struct thermal_zone_device and struct thermal_governor fields
 and their use by the thermal framework code.
 
 Cc: Zhang Rui rui.zh...@intel.com
 Cc: Eduardo Valentin edubez...@gmail.com
 Signed-off-by: Javi Merino javi.mer...@arm.com
 
 ---
 
 Hi linux-pm,
 
 I have some patches that add new fields to these structures but I
 don't have a good place to describe those fields as these structs are
 mostly undocumented so I thought I'd document them.
 
 Changes since v1:
   * Clarified that some parameters are currently only used by the
 step-wise governor.
   * Clarified that forced_passive operates on ACPI processor cooling
 devices.
 
  include/linux/thermal.h | 46 --
  1 file changed, 44 insertions(+), 2 deletions(-)
 
 diff --git a/include/linux/thermal.h b/include/linux/thermal.h
 index f7e11c7ea7d9..0305cde21a74 100644
 --- a/include/linux/thermal.h
 +++ b/include/linux/thermal.h
 @@ -158,6 +158,42 @@ struct thermal_attr {
   char name[THERMAL_NAME_LENGTH];
  };
  
 +/**
 + * struct thermal_zone_device - structure for a thermal zone
 + * @id:  unique id number for each thermal zone
 + * @type:the thermal zone device type
 + * @device:  struct device for this thermal zone
 + * @trip_temp_attrs: attributes for trip points for sysfs: trip temperature
 + * @trip_type_attrs: attributes for trip points for sysfs: trip type
 + * @trip_hyst_attrs: attributes for trip points for sysfs: trip hysteresis
 + * @devdata: private pointer for device private data
 + * @trips:   number of trip points the thermal zone supports
 + * @passive_delay:   number of milliseconds to wait between polls when
 + *   performing passive cooling.  Currently only used by the
 + *   step-wise governor
 + * @polling_delay:   number of milliseconds to wait between polls when
 + *   checking whether trip points have been crossed (0 for
 + *   interrupt driven systems)
 + * @temperature: current temperature.  This is only for core code,
 + *   drivers should use thermal_zone_get_temp() to get the
 + *   current temperature
 + * @last_temperature:previous temperature read
 + * @emul_temperature:emulated temperature when using 
 CONFIG_THERMAL_EMULATION
 + * @passive: 1 if you've crossed a passive trip point, 0 otherwise.
 + *   Currently only used by the step-wise governor.
 + * @forced_passive:  If > 0, temperature at which to switch on all ACPI
 + *   processor cooling devices.  Currently only used by the
 + *   step-wise governor.
 + * @ops: operations this thermal_zone_device supports
 + * @tzp: thermal zone parameters
 + * @governor:pointer to the governor for this thermal zone
 + * @thermal_instances:   list of struct thermal_instance of this 
 thermal zone
 + * @idr: struct idr to generate unique id for this zone's cooling
 + *   devices
 + * @lock:lock to protect thermal_instances list
 + * @node:node in thermal_tz_list (in thermal_core.c)
 + * @poll_queue:  delayed work for polling
 + */
  struct thermal_zone_device {
   int id;
   char type[THERMAL_NAME_LENGTH];
 @@ -179,12 +215,18 @@ struct thermal_zone_device {
   struct thermal_governor *governor;
   struct list_head thermal_instances;
   struct idr idr;
 - struct mutex lock; /* protect thermal_instances list */
 + struct mutex lock;
   struct list_head node;
   struct delayed_work poll_queue;
  };
  
 -/* Structure that holds thermal governor information */
 +/**
 + * struct thermal_governor - structure that holds thermal governor 
 information
 + * @name:name of the governor
 + * @throttle:callback called for every trip point even if 
 temperature is
 + *   below the trip point temperature
 + * @governor_list:   node in thermal_governor_list (in thermal_core.c)
 + */
  struct thermal_governor {
   char name[THERMAL_NAME_LENGTH];
   int (*throttle)(struct thermal_zone_device *tz, int trip);
 -- 
 1.9.1
 

I think I've addressed all the comments in the previous version, any
other objections?



Re: [PATCH] thermal: tell cooling devices when a trip_point changes

2014-07-31 Thread Javi Merino
On Thu, Jul 31, 2014 at 12:10:40AM +0100, Matt Longnecker wrote:
 Some hardware can react autonomously at a programmed temperature.
 For example, an SoC might implement a last ditch throttle or a
 hardware thermal shutdown. The driver for such a device can
 register itself as a cooling_device with the thermal framework.
 
 With this change, the thermal framework notifies such a driver
 when userspace alters the relevant trip temperature so that
 the driver can reprogram its hardware

Why can't you just use the existing cooling device interface?  Cooling
devices can be bound to trip points.  Most thermal governors will
increase cooling for that cooling device when the trip point is hit.
The last ditch throttle or hardware thermal shutdown will then kick
in when the cooling state changes to 1.

If the existing governors are too complex for what you want, you can
have a look at the bang bang governor[0] which (I think) is bound to
be merged soon.

[0] http://article.gmane.org/gmane.linux.kernel/1753348

Cheers,
Javi



Re: [PATCH] thermal: of: look for sensor driver parent node if device node missing

2014-07-25 Thread Javi Merino
On Fri, Jul 25, 2014 at 09:22:00AM +0100, Laxman Dewangan wrote:
 Thanks Rui.
 It seems I have put the wrong email-id for Eduardo (which I got from 
 get_maintainer) and the original patch not reached to Eduardo.
 
 Do I need to re-post patch?
 
 Thanks,
 Laxman
 
 On Thursday 24 July 2014 08:45 PM, Zhang, Rui wrote:
  Hi, Laxman,
 
  As Eduardo is the of thermal author and maintainer, I will take your patch 
  only if you can get ACK from Eduardo.
 
  Eduardo,
  Do you have any comments on this?

[Fixed Eduardo's email.]

  -Original Message-
  From: Laxman Dewangan [mailto:ldewan...@nvidia.com]
  Sent: Thursday, July 24, 2014 5:49 PM
  To: Zhang, Rui; eduardo.valen...@ti.com
  Cc: linux...@vger.kernel.org; linux-kernel@vger.kernel.org
  Subject: Re: [PATCH] thermal: of: look for sensor driver parent node if
  device node missing
  Importance: High
 
  On Monday 14 July 2014 04:42 PM, Laxman Dewangan wrote:
  There are some mfd devices which supports junction thermal interrupt
  like ams,AS3722. The DT binding of these devices are defined as the
  flat and drivers for sub module of such devices are registered as the
  mfd_add_devices. In this method, the sub devices registered as
  platform driver and these do not have the of_node pointer on their
  device structure. In this case, use the parent of_node pointer to get
  the required of_node pointer.
 
  Any comment please?
 
 



Re: [PATCH-REPOST] thermal: of: look for sensor driver parent node if device node missing

2014-07-25 Thread Javi Merino
[Fixed Eduardo's email *again*]

On Fri, Jul 25, 2014 at 10:19:31AM +0100, Laxman Dewangan wrote:
 There are some mfd devices which supports junction thermal interrupt
 like ams,AS3722. The DT binding of these devices are defined as the
 flat and drivers for sub module of such devices are registered as
 the mfd_add_devices. In this method, the sub devices registered as
 platform driver and these do not have the of_node pointer on their
 device structure. In this case, use the parent of_node pointer to
 get the required of_node pointer.
 
 Signed-off-by: Laxman Dewangan ldewan...@nvidia.com
 ---
 I typed a different email ID for Eduardo and so it did not reach him.
 Resending the patch with correct ID.
 
  drivers/thermal/of-thermal.c | 2 ++
  1 file changed, 2 insertions(+)
 
 diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
 index 04b1be7..85a7d71 100644
 --- a/drivers/thermal/of-thermal.c
 +++ b/drivers/thermal/of-thermal.c
 @@ -396,6 +396,8 @@ thermal_zone_of_sensor_register(struct device *dev, int 
 sensor_id,
   return ERR_PTR(-EINVAL);
  
   sensor_np = dev->of_node;
 + if (!sensor_np && dev->parent)
 + sensor_np = dev->parent->of_node;
  
   for_each_child_of_node(np, child) {
   struct of_phandle_args sensor_specs;
 -- 
 1.8.1.5
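
The fallback added by this patch can be illustrated with a minimal
user-space sketch.  The struct device and struct device_node below are
simplified local definitions that only borrow the kernel's names, and
sensor_node() is a hypothetical helper, not a function from
of-thermal.c.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures of the same name. */
struct device_node {
	const char *name;
};

struct device {
	struct device *parent;
	struct device_node *of_node;
};

/*
 * Pick the device tree node to match sensors against: the device's own
 * node if it has one, otherwise its parent's node (the MFD case, where
 * sub-devices are registered without an of_node of their own).
 */
static struct device_node *sensor_node(const struct device *dev)
{
	struct device_node *np;

	assert(dev != NULL);

	np = dev->of_node;
	if (!np && dev->parent)
		np = dev->parent->of_node;

	return np;
}
```

A sub-device created without an of_node resolves to its parent's node,
while a device that already has a node keeps using its own.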
 
 



[PATCH] trace-cmd: link trace-record in ctracecmd.so

2014-08-15 Thread Javi Merino
Commit b1e9220819eb (trace-cmd: Report stats for all instances) moved
tracecmd_stat_cpu() from trace-recorder.c to trace-record.c.  This
makes import ctracecmd fail in tracecmd.py:

$ python tracecmd.py
Traceback (most recent call last):
  File "tracecmd.py", line 22, in <module>
    from ctracecmd import *
ImportError: ctracecmd.so: undefined symbol: tracecmd_stat_cpu

Link trace-record in ctracecmd.so to add back the required symbol.

Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Makefile | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index cbe0eb9..308d7f8 100644
--- a/Makefile
+++ b/Makefile
@@ -317,8 +317,9 @@ KERNEL_SHARK_OBJS = $(TRACE_VIEW_OBJS) $(TRACE_GRAPH_OBJS) 
$(TRACE_GUI_OBJS) \
 
 PEVENT_LIB_OBJS = event-parse.o trace-seq.o parse-filter.o parse-utils.o
 TCMD_LIB_OBJS = $(PEVENT_LIB_OBJS) trace-util.o trace-input.o trace-ftrace.o \
-   trace-output.o trace-recorder.o trace-restore.o 
trace-usage.o \
-   trace-blk-hack.o kbuffer-parse.o event-plugin.o
+   trace-output.o trace-record.o trace-recorder.o \
+   trace-restore.o trace-usage.o trace-blk-hack.o \
+   kbuffer-parse.o event-plugin.o
 
 PLUGIN_OBJS =
 PLUGIN_OBJS += plugin_jbd2.o
-- 
1.9.1




Re: [RFC PATCH v5 05/10] thermal: let governors have private data for each thermal zone

2014-08-19 Thread Javi Merino
On Tue, Aug 19, 2014 at 01:49:32PM +0100, edubez...@gmail.com wrote:
 Javi,

Hi Eduardo,

 On Thu, Jul 10, 2014 at 10:18 AM, Javi Merino javi.mer...@arm.com wrote:
  A governor may need to store its current state between calls to
  throttle().  That state depends on the thermal zone, so store it as
  private data in struct thermal_zone_device.
 
  The governors may have two new ops: bind_to_tz() and unbind_from_tz().
  When provided, these functions let governors do some initialization
  and teardown when they are bound/unbound to a tz and possibly store that
  information in the governor_data field of the struct
  thermal_zone_device.
 
  Cc: Zhang Rui rui.zh...@intel.com
  Cc: Eduardo Valentin edubez...@gmail.com
  Signed-off-by: Javi Merino javi.mer...@arm.com
  ---
   drivers/thermal/thermal_core.c | 83 
  ++
   include/linux/thermal.h|  9 +
   2 files changed, 84 insertions(+), 8 deletions(-)
 
  diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
  index 71b0ec0c370d..3da99dd80ad5 100644
  --- a/drivers/thermal/thermal_core.c
  +++ b/drivers/thermal/thermal_core.c
  @@ -72,6 +72,58 @@ static struct thermal_governor *__find_governor(const 
  char *name)
  return NULL;
   }
 
  +/**
  + * bind_previous_governor() - bind the previous governor of the thermal 
  zone
  + * @tz:a valid pointer to a struct thermal_zone_device
  + * @failed_gov_name:   the name of the governor that failed to register
  + *
  + * Register the previous governor of the thermal zone after a new
  + * governor has failed to be bound.
  + */
  +static void bind_previous_governor(struct thermal_zone_device *tz,
  +   const char *failed_gov_name)
  +{
  +   if (tz->governor && tz->governor->bind_to_tz) {
  +   if (tz->governor->bind_to_tz(tz)) {
  +   dev_warn(&tz->device,
  +   "governor %s failed to bind and the 
  previous one (%s) failed to register again, thermal zone %s has no 
  governor\n",
  +   failed_gov_name, tz->governor->name, 
  tz->type);
 
 The above must be a dev_err(), not warn.

Sure

  Besides, I would prefer if
 you could improve the message. What is the difference between register
 and bind?

None.  It should say bind in both, it'll make it clearer.  I'll fix
it.

  +   tz->governor = NULL;
  +   }
  +   }
  +}
  +
  +/**
  + * thermal_set_governor() - Switch to another governor
  + * @tz:a valid pointer to a struct thermal_zone_device
  + * @new_gov:   pointer to the new governor
  + *
  + * Change the governor of thermal zone @tz.
  + *
  + * Return: 0 on success, an error if the new governor's bind_to_tz() 
  failed.
  + */
  +static int thermal_set_governor(struct thermal_zone_device *tz,
  +   struct thermal_governor *new_gov)
  +{
  +   int ret = 0;
  +
  +   if (tz->governor && tz->governor->unbind_from_tz)
  +   tz->governor->unbind_from_tz(tz);
  +
  +   if (new_gov && new_gov->bind_to_tz) {
  +   ret = new_gov->bind_to_tz(tz);
  +   if (ret) {
  +   bind_previous_governor(tz, new_gov->name);
  +
  +   return ret;
  +   }
  +   }
  +
  +   tz->governor = new_gov;
  +
  +   return ret;
  +}
  +
   int thermal_register_governor(struct thermal_governor *governor)
   {
  int err;
  @@ -104,8 +156,15 @@ int thermal_register_governor(struct thermal_governor 
  *governor)
 
  name = pos->tzp->governor_name;
 
  -   if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH))
  -   pos->governor = governor;
  +   if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH)) {
  +   int ret;
  +
  +   ret = thermal_set_governor(pos, governor);
  +   if (ret)
  +   dev_warn(&pos->device,
  +   "Failed to set governor %s for 
  thermal zone %s: %d\n",
  +   governor->name, pos->type, ret);
 
 dev_err here too.

Ok.  Thanks!
Javi 

  +   }
  }
 
  mutex_unlock(&thermal_list_lock);
  @@ -131,7 +190,7 @@ void thermal_unregister_governor(struct 
  thermal_governor *governor)
  list_for_each_entry(pos, &thermal_tz_list, node) {
  if (!strnicmp(pos->governor->name, governor->name,
  THERMAL_NAME_LENGTH))
  -   pos->governor = NULL;
  +   thermal_set_governor(pos, NULL);
  }
 
  mutex_unlock(&thermal_list_lock);
  @@ -756,8 +815,9 @@ policy_store(struct device *dev, struct 
  device_attribute *attr,
  if (!gov)
  goto exit
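
The bind/unbind sequence discussed in this thread, including the
fallback to the previous governor, can be sketched in user-space C.
This is a simplified illustration rather than the patch itself: struct
zone and struct governor are cut-down local types, set_governor() folds
bind_previous_governor() into one function and leaves out the log
messages, and the three callbacks at the end are hypothetical examples.

```c
#include <assert.h>
#include <stddef.h>

struct zone;

/* Cut-down governor: both callbacks are optional, as in the patch. */
struct governor {
	const char *name;
	int (*bind_to_tz)(struct zone *tz);
	void (*unbind_from_tz)(struct zone *tz);
};

struct zone {
	struct governor *governor;
	void *governor_data;	/* private state owned by the governor */
};

/*
 * Switch tz to new_gov.  On failure, try to bind the previous governor
 * again; if that also fails, the zone is left without a governor.
 * Returns 0 on success or the error from new_gov->bind_to_tz().
 */
static int set_governor(struct zone *tz, struct governor *new_gov)
{
	int ret = 0;

	assert(tz != NULL);

	if (tz->governor && tz->governor->unbind_from_tz)
		tz->governor->unbind_from_tz(tz);

	if (new_gov && new_gov->bind_to_tz) {
		ret = new_gov->bind_to_tz(tz);
		if (ret) {
			if (tz->governor && tz->governor->bind_to_tz &&
			    tz->governor->bind_to_tz(tz))
				tz->governor = NULL;

			return ret;
		}
	}

	tz->governor = new_gov;

	return ret;
}

/* Hypothetical example callbacks. */
static int example_state;

static int bind_ok(struct zone *tz)
{
	tz->governor_data = &example_state;
	return 0;
}

static void unbind(struct zone *tz)
{
	tz->governor_data = NULL;
}

static int bind_fail(struct zone *tz)
{
	(void)tz;
	return -22;	/* value of -EINVAL on Linux */
}
```

If a zone bound to a governor using bind_ok() is switched to one using
bind_fail(), set_governor() returns the error and the zone ends up bound
to the original governor again, with governor_data restored.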

Re: [RFC PATCH v5 08/10] thermal: introduce the Power Allocator governor

2014-08-19 Thread Javi Merino
On Tue, Aug 19, 2014 at 02:45:52PM +0100, Eduardo Valentin wrote:
 Javi,

Hi Eduardo,

 On Thu, Jul 10, 2014 at 03:18:46PM +0100, Javi Merino wrote:
  The power allocator governor is a thermal governor that controls system
  and device power allocation to control temperature.  Conceptually, the
  implementation divides the sustainable power of a thermal zone among
  all the heat sources in that zone.
 
  This governor relies on power actors, entities that represent heat
  sources.  They can report current and maximum power consumption and
  can set a given maximum power consumption, usually via a cooling
  device.
 
  The governor uses a Proportional Integral Derivative (PID) controller
  driven by the temperature of the thermal zone.  The output of the
  controller is a power budget that is then allocated to each power
  actor that can have bearing on the temperature we are trying to
  control.  It decides how much power to give each cooling device based
  on the performance they are requesting.  The PID controller ensures
  that the total power budget does not exceed the control temperature.
 
  Cc: Zhang Rui rui.zh...@intel.com
  Cc: Eduardo Valentin edubez...@gmail.com
  Signed-off-by: Punit Agrawal punit.agra...@arm.com
  Signed-off-by: Javi Merino javi.mer...@arm.com
  ---
   Documentation/thermal/power_allocator.txt |  61 
   drivers/thermal/Kconfig   |  15 +
   drivers/thermal/Makefile  |   1 +
   drivers/thermal/power_allocator.c | 467 
  ++
   drivers/thermal/thermal_core.c|   7 +-
   drivers/thermal/thermal_core.h|   8 +
   include/linux/thermal.h   |   8 +
   7 files changed, 566 insertions(+), 1 deletion(-)
   create mode 100644 Documentation/thermal/power_allocator.txt
   create mode 100644 drivers/thermal/power_allocator.c
 
  diff --git a/Documentation/thermal/power_allocator.txt 
  b/Documentation/thermal/power_allocator.txt
  new file mode 100644
  index ..1859074dadcb
  --- /dev/null
  +++ b/Documentation/thermal/power_allocator.txt
  @@ -0,0 +1,61 @@
  +Integration of the power_allocator governor in a platform
  +=========================================================
  +
  +Registering thermal_zone_device
  +-------------------------------
  +
  +An estimate of the sustainable dissipatable power (in mW) should be
  +provided while registering the thermal zone.  This is the maximum
  +sustained power for allocation at the desired maximum temperature.
  +This number can vary with conditions, but the closed loop of the
  +controller should take care of those variations, so
  +`sustainable_power` only needs to be an estimate.  Register your
  +thermal zone with `thermal_zone_params` that have a
  +`sustainable_power`.  If you weren't passing any
  +`thermal_zone_params`, then something like this will do:
  +
  + static const struct thermal_zone_params tz_params = {
  + .sustainable_power = 3500,
  + };
  +
  +and then pass `tz_params` as the 6th parameter to
  +`thermal_zone_device_register()`.
  +
  +Trip points
  +-----------
  +
  +The governor requires the following two trip points:
  +
  +1.  switch on trip point: temperature above which the governor
  +control loop starts operating
  +2.  desired temperature trip point: it should be higher than the
  +switch on trip point. It is the target temperature the governor
  +is controlling for.
  +
  +The trip points can be either active or passive.
  +
 
 I would prefer, for the sake of keeping the right concept in use, you
 state here that they are passive trip points.

Sure, I've also changed the function that checks this condition
accordingly.

  +Power actors
  +------------
  +
  +Devices controlled by this governor must be registered with the power
  +actor API.  Read `power_actor.txt` for more information about them.
  +
  +Limitations of the power allocator governor
  +===========================================
  +
  +The power allocator governor can't work with cooling devices directly.
  +A power actor can be created to interface between the governor and the
  +cooling device (see cpu_actor.c for an example).  Otherwise, if you
  +have power actors and cooling devices that are next to the same
  +thermal sensor create two thermal zones, one for each type.  Use the
  +power allocator governor for the power actor thermal zone with the
  +power actors and any other governor for the one with cooling devices.
  +
  +The power allocator governor's PID controller is highly dependent on a
  +periodic tick.  If you have a driver that calls
  +`thermal_zone_device_update()` (or anything that ends up calling the
  +governor's `throttle()` function) repetitively, the governor response
  +won't be very good.  Note that this is not particular to this
  +governor, step-wise will also misbehave if you call its throttle()
  +faster than the normal thermal framework tick (due

Re: [PATCH v5 3/6] thermal: Added Bang-bang thermal governor

2014-10-29 Thread Javi Merino
On Tue, Oct 28, 2014 at 07:33:39PM +, Peter Feuerer wrote:
 Hi Rui,
 
 I wonder whether you've had time to apply my set of patches already?  Would 
 you please be so kind to just send me a short reply?

The bang-bang governor was merged and is part of v3.18-rc2:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8264fce6de03f3915e2301f52f181a982718a8cb

Cheers,
Javi

 Peter Feuerer writes:
 
  Hi Rui,
  
  Zhang Rui writes:
  
  On Sat, 2014-07-26 at 16:14 +0200, Peter Feuerer wrote:
  Hi Rui,
  
  Peter Feuerer writes:
  
   The bang-bang thermal governor uses a hysteresis to switch abruptly on
   or off a cooling device.  It is intended to control fans, which can
   not be throttled but just switched on or off.
   Bang-bang cannot be set as default governor as it is intended for
   special devices only.  For those special devices the driver needs to
   explicitly request it.
   
   Cc: Andrew Morton a...@linux-foundation.org
   Cc: Zhang Rui rui.zh...@intel.com
  
  Anything that prevents you from giving your acked-by?
  
  NO.
  
  I'll queue them for 3.18.
  
  Are all 6 patches in for 3.18 as you promised?
  
  kind regards,
  --peter;


[PATCH v2 1/2] tracing: Add array printing helpers

2014-12-15 Thread Javi Merino
From: Dave Martin dave.mar...@arm.com

If a trace event contains an array, there is currently no standard
way to format this for text output.  Drivers currently work around
this either a) with local hacks that use the trace_seq functionality
directly, or b) by not printing that information at all.  For fixed
size arrays, formatting of the elements can be open-coded, but this
gets cumbersome for arrays of non-trivial size.

These approaches result in non-standard content of the event format
description delivered to userspace, so userland tools need to be
taught to understand and parse each array printing method
individually.

This patch implements a common __print_array() helper that
tracepoint implementations can use instead of reinventing it.  A
simple C-style syntax is used to delimit the array and its elements
{like,this}.

So that the helper can be used with large static arrays as well as
dynamic arrays, it takes a pointer, an element count and an element
size: it can be used with __get_dynamic_array() for dynamic arrays.

Cc: Steven Rostedt rost...@goodmis.org
Cc: Ingo Molnar mi...@redhat.com
Signed-off-by: Dave Martin dave.mar...@arm.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
Changes since v1[0]

- Replaced the DEFINE_PRINT_ARRAY macros with a single
  ftrace_print_array_seq() function.

[0] http://thread.gmane.org/gmane.linux.kernel/1845418/focus=54110

 include/linux/ftrace_event.h |  4 
 include/trace/ftrace.h   |  5 +
 kernel/trace/trace_output.c  | 42 ++
 3 files changed, 51 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 0bebb5c348b8..5aa4a9269547 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -44,6 +44,10 @@ const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
+const char *ftrace_print_array_seq(struct trace_seq *p,
+  const void *buf, int buf_len,
+  size_t el_size);
+
 struct trace_iterator;
 struct trace_event;
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 139b5067345b..da911289a8dd 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -263,6 +263,10 @@
 #undef __print_hex
 #define __print_hex(buf, buf_len) ftrace_print_hex_seq(p, buf, buf_len)
 
+#undef __print_array
+#define __print_array(array, count, el_size)   \
+   ftrace_print_array_seq(p, array, count, el_size)
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 static notrace enum print_line_t   \
@@ -674,6 +678,7 @@ static inline void ftrace_test_probe_##call(void)			\
 #undef __get_dynamic_array_len
 #undef __get_str
 #undef __get_bitmask
+#undef __print_array
 
 #undef TP_printk
 #define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index b77b9a697619..6cee7c36a669 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -177,6 +177,48 @@ ftrace_print_hex_seq(struct trace_seq *p, const unsigned char *buf, int buf_len)
 }
 EXPORT_SYMBOL(ftrace_print_hex_seq);
 
+const char *
+ftrace_print_array_seq(struct trace_seq *p, const void *buf, int buf_len,
+		       size_t el_size)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+	const char *prefix = "";
+	void *ptr = (void *)buf;
+
+	trace_seq_putc(p, '{');
+
+	while (ptr < buf + buf_len) {
+		switch (el_size) {
+		case 8:
+			trace_seq_printf(p, "%s0x%x", prefix,
+					 *(u8 *)ptr);
+			break;
+		case 16:
+			trace_seq_printf(p, "%s0x%x", prefix,
+					 *(u16 *)ptr);
+			break;
+		case 32:
+			trace_seq_printf(p, "%s0x%x", prefix,
+					 *(u32 *)ptr);
+			break;
+		case 64:
+			trace_seq_printf(p, "%s0x%llx", prefix,
+					 *(u64 *)ptr);
+			break;
+		default:
+			BUG();
+		}
+		prefix = ",";
+		ptr += el_size / 8;
+	}
+
+	trace_seq_putc(p, '}');
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+EXPORT_SYMBOL(ftrace_print_array_seq);
+
+
 int ftrace_raw_output_prep(struct trace_iterator *iter,
   struct trace_event *trace_event)
 {
-- 
1.9.1
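
For readers who want to try the {like,this} output format outside the
kernel, here is a minimal userspace sketch of the same loop.  It follows
the convention of the function in this patch (buf_len in bytes, el_size
in bits).  format_array() and its snprintf()-based output buffer are
hypothetical stand-ins for the trace_seq API, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an integer array as "{0x1,0x2,...}" into out.
 * buf_len is the array length in bytes, el_size the element width in
 * bits (8, 16, 32 or 64).  Hypothetical helper for illustration only.
 * Returns 0 on success, -1 if out is too small.
 */
static int format_array(char *out, size_t out_len,
			const void *buf, size_t buf_len, size_t el_size)
{
	const unsigned char *ptr = buf;
	const unsigned char *end = ptr + buf_len;
	const char *prefix = "";
	size_t step = el_size / 8;
	size_t used;
	int n;

	assert(el_size == 8 || el_size == 16 || el_size == 32 || el_size == 64);

	n = snprintf(out, out_len, "{");
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	used = (size_t)n;

	/* Stop when fewer than one whole element remains. */
	while ((size_t)(end - ptr) >= step) {
		unsigned long long val = 0;
		uint8_t v8;
		uint16_t v16;
		uint32_t v32;
		uint64_t v64;

		/* memcpy avoids unaligned loads from the raw buffer. */
		switch (el_size) {
		case 8:
			memcpy(&v8, ptr, sizeof(v8));
			val = v8;
			break;
		case 16:
			memcpy(&v16, ptr, sizeof(v16));
			val = v16;
			break;
		case 32:
			memcpy(&v32, ptr, sizeof(v32));
			val = v32;
			break;
		case 64:
			memcpy(&v64, ptr, sizeof(v64));
			val = v64;
			break;
		}

		n = snprintf(out + used, out_len - used, "%s0x%llx", prefix, val);
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += (size_t)n;

		prefix = ",";
		ptr += step;
	}

	n = snprintf(out + used, out_len - used, "}");
	if (n < 0 || (size_t)n >= out_len - used)
		return -1;

	return 0;
}
```

The prefix trick is the same as in the patch: it starts empty and
becomes "," after the first element, so no trailing separator has to be
trimmed.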


[PATCH v2 2/2] tools lib traceevent: Add support for __print_array()

2014-12-15 Thread Javi Merino
Trace can now generate traces with variable element size arrays.  Add
support to parse them.

Cc: Namhyung Kim namhy...@kernel.org
Cc: Arnaldo Carvalho de Melo a...@redhat.com
Cc: Steven Rostedt srost...@redhat.com
Cc: Jiri Olsa jo...@redhat.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 tools/lib/traceevent/event-parse.c | 127 +
 tools/lib/traceevent/event-parse.h |   8 +++
 2 files changed, 135 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c 
b/tools/lib/traceevent/event-parse.c
index cf3a44bf1ec3..00dd6213449c 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -757,6 +757,11 @@ static void free_arg(struct print_arg *arg)
 		free_arg(arg->hex.field);
 		free_arg(arg->hex.size);
 		break;
+	case PRINT_INT_ARRAY:
+		free_arg(arg->int_array.field);
+		free_arg(arg->int_array.size);
+		free_arg(arg->int_array.el_size);
+		break;
 	case PRINT_TYPE:
 		free(arg->typecast.type);
 		free_arg(arg->typecast.item);
@@ -2533,6 +2538,71 @@ process_hex(struct event_format *event, struct print_arg *arg, char **tok)
 }
 
 static enum event_type
+process_int_array(struct event_format *event, struct print_arg *arg, char **tok)
+{
+	struct print_arg *field;
+	enum event_type type;
+	char *token;
+
+	memset(arg, 0, sizeof(*arg));
+	arg->type = PRINT_INT_ARRAY;
+
+	field = alloc_arg();
+	if (!field) {
+		do_warning_event(event, "%s: not enough memory!", __func__);
+		goto out;
+	}
+
+	type = process_arg(event, field, &token);
+
+	if (test_type_token(type, token, EVENT_DELIM, ","))
+		goto out_free;
+
+	arg->int_array.field = field;
+
+	free_token(token);
+
+	field = alloc_arg();
+	if (!field) {
+		do_warning_event(event, "%s: not enough memory!", __func__);
+		goto out;
+	}
+
+	type = process_arg(event, field, &token);
+
+	if (test_type_token(type, token, EVENT_DELIM, ","))
+		goto out_free;
+
+	arg->int_array.size = field;
+
+	free_token(token);
+
+	field = alloc_arg();
+	if (!field) {
+		do_warning_event(event, "%s: not enough memory!", __func__);
+		goto out;
+	}
+
+	type = process_arg(event, field, &token);
+
+	if (test_type_token(type, token, EVENT_DELIM, ")"))
+		goto out_free;
+
+	arg->int_array.el_size = field;
+
+	free_token(token);
+	type = read_token_item(tok);
+	return type;
+
+ out_free:
+	free_arg(field);
+	free_token(token);
+out:
+	*tok = NULL;
+	return EVENT_ERROR;
+}
+
+static enum event_type
 process_dynamic_array(struct event_format *event, struct print_arg *arg, char **tok)
 {
 	struct format_field *field;
@@ -2827,6 +2897,10 @@ process_function(struct event_format *event, struct print_arg *arg,
 		free_token(token);
 		return process_hex(event, arg, tok);
 	}
+	if (strcmp(token, "__print_array") == 0) {
+		free_token(token);
+		return process_int_array(event, arg, tok);
+	}
 	if (strcmp(token, "__get_str") == 0) {
 		free_token(token);
 		return process_str(event, arg, tok);
@@ -3355,6 +3429,7 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
 		break;
 	case PRINT_FLAGS:
 	case PRINT_SYMBOL:
+	case PRINT_INT_ARRAY:
 	case PRINT_HEX:
 		break;
 	case PRINT_TYPE:
@@ -3765,6 +3840,49 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
 		}
 		break;
 
+	case PRINT_INT_ARRAY: {
+		void *num;
+		int el_size;
+
+		if (arg->int_array.field->type == PRINT_DYNAMIC_ARRAY) {
+			unsigned long offset;
+
+			offset = pevent_read_number(pevent,
+				data + arg->int_array.field->dynarray.field->offset,
+				arg->int_array.field->dynarray.field->size);
+			num = data + (offset & 0xffff);
+		} else {
+			field = arg->int_array.field->field.field;
+			if (!field) {
+				str = arg->int_array.field->field.name;
+				field = pevent_find_any_field(event, str);
+				if (!field)
+					goto out_warning_field;
+				arg->int_array.field->field.field = field;
+			}
+			num = data + field->offset;
+		}
+		len = eval_num_arg(data, size, event, arg->int_array.size);
+   el_size

[RFC PATCH v6 0/9] The power allocator thermal governor

2014-12-05 Thread Javi Merino
Hi linux-pm,

The power allocator governor allocates device power to control
temperature.  This requires transforming performance requests into
requested power, which we do with an extended cooling device API
introduced in patch 5 (thermal: extend the cooling device API to
include power information).  Patch 6 (thermal: cpu_cooling: implement
the power cooling device API) extends the cpu cooling device using a
simple power model.  The division of power between the cooling devices
ensures that power is allocated where it is needed the most, based on
the current workload.
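
To make the idea of dividing power concrete, here is a minimal sketch
that splits a budget among devices in proportion to the power each one
requests.  split_power() is a hypothetical helper written for this
explanation; it is a simplification, not the allocation code in patch 7,
which also has to deal with per-device maximums and weights.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Split a power budget (in mW) among n devices in proportion to the
 * power each one requests.  If all requests fit in the budget, every
 * device is granted what it asked for.  Hypothetical illustration,
 * not the governor's implementation.
 */
static void split_power(const uint32_t *req, uint32_t *granted,
			size_t n, uint32_t budget)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += req[i];

	for (i = 0; i < n; i++) {
		if (total <= budget)
			granted[i] = req[i];	/* everything fits */
		else
			/* 64-bit intermediate avoids overflow; rounds down */
			granted[i] = (uint32_t)(((uint64_t)req[i] * budget) / total);
	}
}
```

Because the division rounds down, the sum of the grants never exceeds
the budget; a busy device asking for three times the power of an idle
one also receives three times the share.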

Patches 1-3 add array printing helpers to ftrace, which we then use
in patch 8.

Changes since v5:
  - Addressed Stephen's review of the trace patches.
  - Removed power actors and extended the cooling device interface
instead.
  - Let platforms override the power allocator governor parameters in
their thermal zone parameters

Changes since v4:
  - Add more tracing
  - Document some of the limitations of the power allocator governor
  - Export the power_actor API and move power_actor.h to include/linux

Changes since v3:
  - Use tz->passive to poll faster when the first trip point is hit.
  - Don't make a special directory for power_actors
  - Add a DT property for sustainable-power
  - Simplify the static power interface and pass the current thermal
zone in every power_actor_ops to remove the controversial
enum power_actor_types
  - Use locks with the actor_list list
  - Use cpufreq_get() to get the frequency of the cpu instead of
using the notifiers.
  - Remove the prompt for THERMAL_POWER_ACTOR_CPU when configuring
the kernel

Changes since v2:
  - Changed the PI controller into a PID controller
  - Added static power to the cpu power model
  - tz parameter max_dissipatable_power renamed to sustainable_power
  - Register the cpufreq cooling device as part of the
power_cpu_actor registration.

Changes since v1:
  - Fixed finding cpufreq cooling devices in cpufreq_frequency_change()
  - Replaced the cooling device interface with a separate power actor
API
  - Addressed most of Eduardo's comments
  - Incorporated ftrace support for bitmask to trace cpumasks

Todo:
  - Expose thermal zone parameters in sysfs
  - Expose new governor parameters in device tree
  - Expose cooling device weights in device tree

Cheers,
Javi & Punit

Dave Martin (1):
  tracing: Add array printing helpers

Javi Merino (7):
  tools lib traceevent: Generalize numeric argument
  tools lib traceevent: Add support for __print_u{8,16,32,64}_array()
  thermal: let governors have private data for each thermal zone
  thermal: extend the cooling device API to include power information
  thermal: cpu_cooling: implement the power cooling device API
  thermal: introduce the Power Allocator governor
  thermal: add trace events to the power allocator governor

Punit Agrawal (1):
  of: thermal: Introduce sustainable power for a thermal zone

 .../devicetree/bindings/thermal/thermal.txt|   4 +
 Documentation/thermal/cpu-cooling-api.txt  | 144 +-
 Documentation/thermal/power_allocator.txt  | 223 +
 drivers/thermal/Kconfig|  15 +
 drivers/thermal/Makefile   |   1 +
 drivers/thermal/cpu_cooling.c  | 455 +-
 drivers/thermal/of-thermal.c   |   4 +
 drivers/thermal/power_allocator.c  | 528 +
 drivers/thermal/thermal_core.c | 128 -
 drivers/thermal/thermal_core.h |   8 +
 include/linux/cpu_cooling.h|  49 +-
 include/linux/ftrace_event.h   |   9 +
 include/linux/thermal.h|  61 ++-
 include/trace/events/thermal_power_allocator.h | 138 ++
 include/trace/ftrace.h |  17 +
 kernel/trace/trace_output.c|  51 ++
 tools/lib/traceevent/event-parse.c |  88 +++-
 tools/lib/traceevent/event-parse.h |   8 +-
 18 files changed, 1886 insertions(+), 45 deletions(-)
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/power_allocator.c
 create mode 100644 include/trace/events/thermal_power_allocator.h

-- 
1.9.1




[RFC PATCH v6 1/9] tracing: Add array printing helpers

2014-12-05 Thread Javi Merino
From: Dave Martin dave.mar...@arm.com

If a trace event contains an array, there is currently no standard
way to format this for text output.  Drivers currently work around
this either a) with local hacks that use the trace_seq functionality
directly, or b) by not printing that information at all.  For fixed
size arrays, formatting of the elements can be open-coded, but this
gets cumbersome for arrays of non-trivial size.

These approaches result in non-standard content of the event format
description delivered to userspace, so userland tools need to be
taught to understand and parse each array printing method
individually.

This patch implements common __print_type_array() helpers that
tracepoint implementations can use instead of reinventing them.  A
simple C-style syntax is used to delimit the array and its elements
{like,this}.

So that the helpers can be used with large static arrays as well as
dynamic arrays, they take a pointer and element count: they can be
used with __get_dynamic_array() for use with dynamic arrays.

Cc: Steven Rostedt rost...@goodmis.org
Cc: Ingo Molnar mi...@redhat.com
Signed-off-by: Dave Martin dave.mar...@arm.com
---
 include/linux/ftrace_event.h |  9 
 include/trace/ftrace.h   | 17 +++
 kernel/trace/trace_output.c  | 51 
 3 files changed, 77 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 28672e87e910..415afc53fa51 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -44,6 +44,15 @@ const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
+const char *ftrace_print_u8_array_seq(struct trace_seq *p,
+ const u8 *buf, int count);
+const char *ftrace_print_u16_array_seq(struct trace_seq *p,
+  const u16 *buf, int count);
+const char *ftrace_print_u32_array_seq(struct trace_seq *p,
+  const u32 *buf, int count);
+const char *ftrace_print_u64_array_seq(struct trace_seq *p,
+  const u64 *buf, int count);
+
 struct trace_iterator;
 struct trace_event;
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 26b4f2e13275..15bc5d417aea 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -263,6 +263,19 @@
 #undef __print_hex
 #define __print_hex(buf, buf_len) ftrace_print_hex_seq(p, buf, buf_len)
 
+#undef __print_u8_array
+#define __print_u8_array(array, count) \
+   ftrace_print_u8_array_seq(p, array, count)
+#undef __print_u16_array
+#define __print_u16_array(array, count)\
+   ftrace_print_u16_array_seq(p, array, count)
+#undef __print_u32_array
+#define __print_u32_array(array, count)\
+   ftrace_print_u32_array_seq(p, array, count)
+#undef __print_u64_array
+#define __print_u64_array(array, count)\
+   ftrace_print_u64_array_seq(p, array, count)
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 static notrace enum print_line_t   \
@@ -676,6 +689,10 @@ static inline void ftrace_test_probe_##call(void)			\
 #undef __get_dynamic_array_len
 #undef __get_str
 #undef __get_bitmask
+#undef __print_u8_array
+#undef __print_u16_array
+#undef __print_u32_array
+#undef __print_u64_array
 
 #undef TP_printk
 #define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index c6977d5a9b12..4a6ee61f30b3 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -186,6 +186,57 @@ ftrace_print_hex_seq(struct trace_seq *p, const unsigned char *buf, int buf_len)
 }
 EXPORT_SYMBOL(ftrace_print_hex_seq);
 
+static const char *
+ftrace_print_array_seq(struct trace_seq *p, const void *buf, int buf_len,
+		       bool (*iterator)(struct trace_seq *p, const char *prefix,
+					const void **buf, int *buf_len))
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+	const char *prefix = "";
+
+	trace_seq_putc(p, '{');
+
+	while (iterator(p, prefix, &buf, &buf_len))
+		prefix = ",";
+
+	trace_seq_putc(p, '}');
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
+#define DEFINE_PRINT_ARRAY(type, printk_type, format)  \
+static bool\
+ftrace_print_array_iterator_##type(struct trace_seq *p, const char *prefix, \
+  const void **buf, int *buf_len)  \
+{  \
+   const type *__src = *buf; 

[RFC PATCH v6 4/9] thermal: let governors have private data for each thermal zone

2014-12-05 Thread Javi Merino
A governor may need to store its current state between calls to
throttle().  That state depends on the thermal zone, so store it as
private data in struct thermal_zone_device.

The governors may have two new ops: bind_to_tz() and unbind_from_tz().
When provided, these functions let governors do some initialization
and teardown when they are bound/unbound to a tz and possibly store that
information in the governor_data field of the struct
thermal_zone_device.
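
As an illustration of the intended flow, here is a minimal userspace
model of the switch: unbind the old governor, bind the new one, and fall
back to the old one if the new bind fails.  struct zone, struct governor
and the two example governors are hypothetical stand-ins for the kernel
structures; only the op names bind_to_tz/unbind_from_tz and the
governor_data field come from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct zone;

/* Simplified model of a governor with the two new optional ops. */
struct governor {
	const char *name;
	int (*bind_to_tz)(struct zone *tz);
	void (*unbind_from_tz)(struct zone *tz);
};

/* Simplified model of a thermal zone with per-governor private data. */
struct zone {
	struct governor *governor;
	void *governor_data;
};

/* Switch tz to new_gov; on bind failure, try to rebind the old governor. */
static int set_governor(struct zone *tz, struct governor *new_gov)
{
	int ret;

	if (tz->governor && tz->governor->unbind_from_tz)
		tz->governor->unbind_from_tz(tz);

	if (new_gov && new_gov->bind_to_tz) {
		ret = new_gov->bind_to_tz(tz);
		if (ret) {
			/* The zone ends up with no governor if this fails too. */
			if (tz->governor && tz->governor->bind_to_tz &&
			    tz->governor->bind_to_tz(tz))
				tz->governor = NULL;
			return ret;
		}
	}

	tz->governor = new_gov;
	return 0;
}

/* Example governor that keeps one int of state per zone. */
static int counter_bind(struct zone *tz)
{
	int *state = calloc(1, sizeof(*state));

	if (!state)
		return -1;
	tz->governor_data = state;
	return 0;
}

static void counter_unbind(struct zone *tz)
{
	free(tz->governor_data);
	tz->governor_data = NULL;
}

/* Example governor whose bind always fails. */
static int failing_bind(struct zone *tz)
{
	(void)tz;
	return -1;
}

static struct governor counter_gov = { "counter", counter_bind, counter_unbind };
static struct governor broken_gov = { "broken", failing_bind, NULL };
```

Note that the old governor is unbound before the new one is bound, so a
failed switch re-runs the old governor's bind and its private state
starts from scratch.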

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 drivers/thermal/thermal_core.c | 83 ++
 include/linux/thermal.h|  9 +
 2 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 43b90709585f..9021cb72a13a 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -75,6 +75,58 @@ static struct thermal_governor *__find_governor(const char *name)
return NULL;
 }
 
+/**
+ * bind_previous_governor() - bind the previous governor of the thermal zone
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @failed_gov_name:   the name of the governor that failed to register
+ *
+ * Register the previous governor of the thermal zone after a new
+ * governor has failed to be bound.
+ */
+static void bind_previous_governor(struct thermal_zone_device *tz,
+				const char *failed_gov_name)
+{
+	if (tz->governor && tz->governor->bind_to_tz) {
+		if (tz->governor->bind_to_tz(tz)) {
+			dev_err(&tz->device,
+				"governor %s failed to bind and the previous one (%s) failed to bind again, thermal zone %s has no governor\n",
+				failed_gov_name, tz->governor->name, tz->type);
+			tz->governor = NULL;
+		}
+	}
+}
+
+
+/**
+ * thermal_set_governor() - Switch to another governor
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @new_gov:   pointer to the new governor
+ *
+ * Change the governor of thermal zone @tz.
+ *
+ * Return: 0 on success, an error if the new governor's bind_to_tz() failed.
+ */
+static int thermal_set_governor(struct thermal_zone_device *tz,
+   struct thermal_governor *new_gov)
+{
+   int ret = 0;
+
+	if (tz->governor && tz->governor->unbind_from_tz)
+		tz->governor->unbind_from_tz(tz);
+
+	if (new_gov && new_gov->bind_to_tz) {
+		ret = new_gov->bind_to_tz(tz);
+		if (ret) {
+			bind_previous_governor(tz, new_gov->name);
+
+			return ret;
+		}
+	}
+
+	tz->governor = new_gov;
+
+   return ret;
+}
+
 int thermal_register_governor(struct thermal_governor *governor)
 {
int err;
@@ -107,8 +159,15 @@ int thermal_register_governor(struct thermal_governor *governor)
 
 		name = pos->tzp->governor_name;
 
-		if (!strncasecmp(name, governor->name, THERMAL_NAME_LENGTH))
-			pos->governor = governor;
+		if (!strncasecmp(name, governor->name, THERMAL_NAME_LENGTH)) {
+			int ret;
+
+			ret = thermal_set_governor(pos, governor);
+			if (ret)
+				dev_err(&pos->device,
+					"Failed to set governor %s for thermal zone %s: %d\n",
+					governor->name, pos->type, ret);
+		}
 	}
 
 	mutex_unlock(&thermal_list_lock);
@@ -134,7 +193,7 @@ void thermal_unregister_governor(struct thermal_governor *governor)
 	list_for_each_entry(pos, &thermal_tz_list, node) {
 		if (!strncasecmp(pos->governor->name, governor->name,
 				THERMAL_NAME_LENGTH))
-			pos->governor = NULL;
+			thermal_set_governor(pos, NULL);
 	}
 
 	mutex_unlock(&thermal_list_lock);
@@ -762,8 +821,9 @@ policy_store(struct device *dev, struct device_attribute *attr,
 	if (!gov)
 		goto exit;
 
-	tz->governor = gov;
-	ret = count;
+	ret = thermal_set_governor(tz, gov);
+	if (!ret)
+		ret = count;
 
 exit:
 	mutex_unlock(&thermal_governor_lock);
@@ -1459,6 +1519,7 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
 	int result;
 	int count;
 	int passive = 0;
+	struct thermal_governor *governor;
 
 	if (type && strlen(type) >= THERMAL_NAME_LENGTH)
 		return ERR_PTR(-EINVAL);
@@ -1549,9 +1610,15 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
 	mutex_lock(&thermal_governor_lock);
 
 	if (tz->tzp)
-		tz->governor = __find_governor(tz->tzp

[RFC PATCH v6 5/9] thermal: extend the cooling device API to include power information

2014-12-05 Thread Javi Merino
Add three optional callbacks to the cooling device interface to allow
them to express power.  In addition to the callbacks, add helpers to
identify cooling devices that implement the power cooling device API.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/power_allocator.txt | 27 ++
 drivers/thermal/thermal_core.c| 38 +++
 include/linux/thermal.h   | 12 ++
 3 files changed, 77 insertions(+)
 create mode 100644 Documentation/thermal/power_allocator.txt

diff --git a/Documentation/thermal/power_allocator.txt 
b/Documentation/thermal/power_allocator.txt
new file mode 100644
index ..d3bb79050c27
--- /dev/null
+++ b/Documentation/thermal/power_allocator.txt
@@ -0,0 +1,27 @@
+Cooling device power API
+========================
+
+Cooling devices controlled by this governor must supply the additional
+power API in their `cooling_device_ops`.  It consists of three ops:
+
+1. u32 get_actual_power(struct thermal_cooling_device *cdev);
+@cdev: The `struct thermal_cooling_device` pointer
+
+`get_actual_power()` returns the power currently consumed by the
+device in milliwatts.
+
+2. u32 state2power(struct thermal_cooling_device *cdev, unsigned long
+state);
+@cdev: The `struct thermal_cooling_device` pointer
+@state: A cooling device state
+
+Convert cooling device state @state into power consumption in
+milliwatts.
+
+3. unsigned long power2state(struct thermal_cooling_device *cdev,
+u32 power);
+@cdev: The `struct thermal_cooling_device` pointer
+@power: power in milliwatts
+
+Calculate a cooling device state that would make the device consume at
+most @power mW.
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 9021cb72a13a..c490f262ea7f 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -866,6 +866,44 @@ emul_temp_store(struct device *dev, struct device_attribute *attr,
 static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
 #endif/*CONFIG_THERMAL_EMULATION*/
 
+/**
+ * power_actor_get_max_power() - get the maximum power that a cdev can consume
+ * @cdev:	pointer to thermal_cooling_device
+ *
+ * Calculate the maximum power consumption in milliwatts that the
+ * cooling device can currently consume.  If @cdev doesn't support the
+ * power_actor API, this function returns 0.
+ */
+u32 power_actor_get_max_power(struct thermal_cooling_device *cdev)
+{
+	if (!cdev_is_power_actor(cdev))
+		return 0;
+
+	return cdev->ops->state2power(cdev, 0);
+}
+
+/**
+ * power_actor_set_power() - limit the maximum power that a cooling device can consume
+ * @cdev:	pointer to thermal_cooling_device
+ * @power:	the power in milliwatts
+ *
+ * Set the cooling device to consume at most @power milliwatts.
+ *
+ * Returns: 0 on success, -EINVAL if the cooling device does not
+ * implement the power actor API or -E* for other failures.
+ */
+int power_actor_set_power(struct thermal_cooling_device *cdev, u32 power)
+{
+	unsigned long state;
+
+	if (!cdev_is_power_actor(cdev))
+		return -EINVAL;
+
+	state = cdev->ops->power2state(cdev, power);
+
+	return cdev->ops->set_cur_state(cdev, state);
+}
+
 static DEVICE_ATTR(type, 0444, type_show, NULL);
 static DEVICE_ATTR(temp, 0444, temp_show, NULL);
 static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 2c14ab1f5c0d..1155457caf52 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -142,6 +142,9 @@ struct thermal_cooling_device_ops {
int (*get_max_state) (struct thermal_cooling_device *, unsigned long *);
int (*get_cur_state) (struct thermal_cooling_device *, unsigned long *);
int (*set_cur_state) (struct thermal_cooling_device *, unsigned long);
+   u32 (*get_actual_power) (struct thermal_cooling_device *);
+   u32 (*state2power) (struct thermal_cooling_device *, unsigned long);
+   unsigned long (*power2state) (struct thermal_cooling_device *, u32);
 };
 
 struct thermal_cooling_device {
@@ -322,6 +325,15 @@ void thermal_zone_of_sensor_unregister(struct device *dev,
 }
 
 #endif
+
+static inline bool cdev_is_power_actor(struct thermal_cooling_device *cdev)
+{
+	return cdev->ops->get_actual_power && cdev->ops->state2power &&
+		cdev->ops->power2state;
+}
+
+u32 power_actor_get_max_power(struct thermal_cooling_device *);
+int power_actor_set_power(struct thermal_cooling_device *, u32);
 struct thermal_zone_device *thermal_zone_device_register(const char *, int, int,
 		void *, struct thermal_zone_device_ops *,
 		const struct thermal_zone_params *, int, int);
-- 
1.9.1
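
To show what a driver might provide for state2power() and
power2state(), here is a minimal table-driven sketch.  The power table,
its values and the simplified signatures (no cdev argument) are
hypothetical and only serve the example.  It relies on one thing the
patch does state: power_actor_get_max_power() calls state2power(cdev, 0),
so state 0 is the highest-power state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical device: power drawn (in mW) at each cooling state.
 * State 0 is unthrottled, so power decreases as the state grows.
 */
static const uint32_t power_table[] = { 2000, 1400, 900, 400 };

#define NUM_STATES (sizeof(power_table) / sizeof(power_table[0]))

/* Convert a cooling state into power; out-of-range states are clamped. */
static uint32_t state2power(unsigned long state)
{
	if (state >= NUM_STATES)
		state = NUM_STATES - 1;

	return power_table[state];
}

/*
 * Return the lowest (least throttled) state whose power does not
 * exceed @power.  If even the deepest state draws more than @power,
 * the deepest state is returned as the best effort.
 */
static unsigned long power2state(uint32_t power)
{
	unsigned long state;

	for (state = 0; state < NUM_STATES - 1; state++)
		if (power_table[state] <= power)
			break;

	return state;
}
```

Picking the least throttled state that fits the budget keeps as much
performance as the granted power allows, which matches the "at most
@power mW" wording in the documentation above.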



[RFC PATCH v6 8/9] thermal: add trace events to the power allocator governor

2014-12-05 Thread Javi Merino
Add trace events for the power allocator governor and the power actor
interface of the cpu cooling device.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Cc: Steven Rostedt rost...@goodmis.org
Cc: Frederic Weisbecker fweis...@gmail.com
Cc: Ingo Molnar mi...@redhat.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 drivers/thermal/cpu_cooling.c  |  26 -
 drivers/thermal/power_allocator.c  |  21 +++-
 include/trace/events/thermal_power_allocator.h | 138 +
 3 files changed, 182 insertions(+), 3 deletions(-)
 create mode 100644 include/trace/events/thermal_power_allocator.h

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 335d95dd7e5a..f4d453429742 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -29,6 +29,8 @@
 #include <linux/cpu.h>
 #include <linux/cpu_cooling.h>
 
+#include <trace/events/thermal_power_allocator.h>
+
 /**
  * struct power_table - frequency to power conversion
  * @frequency: frequency in KHz
@@ -644,12 +646,20 @@ static int cpufreq_set_cur_state(struct 
thermal_cooling_device *cdev,
 static u32 cpufreq_get_actual_power(struct thermal_cooling_device *cdev)
 {
 	unsigned long freq;
-	int cpu;
+	int i = 0, cpu;
 	u32 static_power, dynamic_power, total_load = 0;
 	struct cpufreq_cooling_device *cpufreq_device = cdev->devdata;
+	u32 *load_cpu = NULL;
 
 	freq = cpufreq_quick_get(cpumask_any(&cpufreq_device->allowed_cpus));
 
+	if (trace_thermal_power_cpu_get_power_enabled()) {
+		u32 ncpus = cpumask_weight(&cpufreq_device->allowed_cpus);
+
+		load_cpu = devm_kcalloc(&cdev->device, ncpus, sizeof(*load_cpu),
+					GFP_KERNEL);
+	}
+
 	for_each_cpu(cpu, &cpufreq_device->allowed_cpus) {
u32 load;
 
@@ -659,6 +669,10 @@ static u32 cpufreq_get_actual_power(struct 
thermal_cooling_device *cdev)
 			load = 0;
 
 		total_load += load;
+		if (trace_thermal_power_cpu_limit_enabled() && load_cpu)
+			load_cpu[i] = load;
+
+		i++;
 	}
 
 	cpufreq_device->last_load = total_load;
@@ -666,6 +680,14 @@ static u32 cpufreq_get_actual_power(struct 
thermal_cooling_device *cdev)
 	static_power = get_static_power(cpufreq_device, freq);
 	dynamic_power = get_dynamic_power(cpufreq_device, freq);
 
+	if (trace_thermal_power_cpu_limit_enabled() && load_cpu) {
+		trace_thermal_power_cpu_get_power(
+			&cpufreq_device->allowed_cpus,
+			freq, load_cpu, i, dynamic_power, static_power);
+
+		devm_kfree(&cdev->device, load_cpu);
+	}
+
 	return static_power + dynamic_power;
 }
 
@@ -730,6 +752,8 @@ static unsigned long cpufreq_power2state(struct 
thermal_cooling_device *cdev,
return 0;
}
 
+	trace_thermal_power_cpu_limit(&cpufreq_device->allowed_cpus,
+				target_freq, cdev_state, power);
return cdev_state;
 }
 
diff --git a/drivers/thermal/power_allocator.c 
b/drivers/thermal/power_allocator.c
index 09e98991efbb..fa725a36872e 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -19,6 +19,9 @@
 #include <linux/slab.h>
 #include <linux/thermal.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/thermal_power_allocator.h>
+
 #include "thermal_core.h"
 
 #define FRAC_BITS 10
@@ -157,7 +160,14 @@ static u32 pid_controller(struct thermal_zone_device *tz,
 	/* feed-forward the known sustainable dissipatable power */
 	power_range = tz->tzp->sustainable_power + frac_to_int(power_range);
 
-	return clamp(power_range, (s64)0, (s64)max_allocatable_power);
+	power_range = clamp(power_range, (s64)0, (s64)max_allocatable_power);
+
+	trace_thermal_power_allocator_pid(frac_to_int(err),
+					frac_to_int(params->err_integral),
+					frac_to_int(p), frac_to_int(i),
+					frac_to_int(d), power_range);
+
+	return power_range;
 }
 
 /**
@@ -238,7 +248,7 @@ static int allocate_power(struct thermal_zone_device *tz,
struct thermal_instance *instance;
u32 *req_power, *max_power, *granted_power;
u32 total_req_power, max_allocatable_power;
-   u32 power_range;
+   u32 total_granted_power, power_range;
int i, num_actors, ret = 0;
 
 	mutex_lock(&tz->lock);
@@ -301,6 +311,7 @@ static int allocate_power(struct thermal_zone_device *tz,
divvy_up_power(req_power, max_power, num_actors, total_req_power,
power_range, granted_power);
 
+   total_granted_power = 0;
i = 0;
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
 		if (instance->trip

[RFC PATCH v6 7/9] thermal: introduce the Power Allocator governor

2014-12-05 Thread Javi Merino
The power allocator governor is a thermal governor that controls system
and device power allocation to control temperature.  Conceptually, the
implementation divides the sustainable power of a thermal zone among
all the heat sources in that zone.

This governor relies on power actors, entities that represent heat
sources.  They can report current and maximum power consumption and
can set a given maximum power consumption, usually via a cooling
device.

The governor uses a Proportional Integral Derivative (PID) controller
driven by the temperature of the thermal zone.  The output of the
controller is a power budget that is then allocated to each power
actor that can have bearing on the temperature we are trying to
control.  It decides how much power to give each cooling device based
on the performance they are requesting.  The PID controller sizes the
total power budget so that the temperature of the thermal zone does
not exceed the control temperature.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Punit Agrawal punit.agra...@arm.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/power_allocator.txt | 196 
 drivers/thermal/Kconfig   |  15 +
 drivers/thermal/Makefile  |   1 +
 drivers/thermal/power_allocator.c | 511 ++
 drivers/thermal/thermal_core.c|   7 +-
 drivers/thermal/thermal_core.h|   8 +
 include/linux/thermal.h   |  40 ++-
 7 files changed, 774 insertions(+), 4 deletions(-)
 create mode 100644 drivers/thermal/power_allocator.c

diff --git a/Documentation/thermal/power_allocator.txt 
b/Documentation/thermal/power_allocator.txt
index d3bb79050c27..23b684afdc75 100644
--- a/Documentation/thermal/power_allocator.txt
+++ b/Documentation/thermal/power_allocator.txt
@@ -1,3 +1,172 @@
+Power allocator governor tunables
+=================================
+
+Trip points
+-----------
+
+The governor requires the following two passive trip points:
+
+1.  switch on trip point: temperature above which the governor
+control loop starts operating.
+2.  desired temperature trip point: it should be higher than the
+switch on trip point. It is the target temperature the governor
+is controlling for.
+
+PID Controller
+--------------
+
+The power allocator governor implements a
+Proportional-Integral-Derivative controller (PID controller) with
+temperature as the control input and power as the controlled output:
+
+P_max = k_p * e + k_i * err_integral + k_d * diff_err + sustainable_power
+
+where
+e = desired_temperature - current_temperature
+err_integral is the sum of previous errors
+diff_err = e - previous_error
+
+It is similar to the one depicted below:
+
+                                      k_d
+                                       |
+current_temp                           |
+     |                                 v
+     |                +----------+   +---+
+     |         +----->| diff_err |-->| X |------+
+     |         |      +----------+   +---+      |
+     |         |                                |      tdp        actor
+     |         |                      k_i       |       |    get_actual_power()
+     |         |                       |        |       |        |     |
+     |         |                       |        |       |        |     | ...
+     v         |                       v        v       v        v     v
+   +---+       |      +-------+      +---+    +---+   +---+   +----------+
+   | S |-------+----->| sum e |----->| X |--->| S |-->| S |-->|power     |
+   +---+       |      +-------+      +---+    +---+   +---+   |allocation|
+     ^         |                                ^             +----------+
+     |         |                                |                |     |
+     |         |        +---+                   |                |     |
+     |         +------->| X |-------------------+                v     v
+     |                  +---+                               granted performance
+desired_temperature       ^
+                          |
+                          |
+                      k_po/k_pu
+
+Sustainable power
+-----------------
+
+An estimate of the sustainable dissipatable power (in mW) should be
+provided while registering the thermal zone.  This estimates the
+sustained power that can be dissipated at the desired control
+temperature.  This is the maximum sustained power for allocation at
+the desired maximum temperature.  The actual sustained power can vary
+for a number of reasons.  The closed loop controller will take care of
+variations such as environmental conditions, and some factors related
+to the speed-grade of the silicon.  `sustainable_power` is therefore
+simply an estimate, and may be tuned to affect the aggressiveness of
+the thermal ramp.  For reference, this is 2000mW - 4500mW depending on
+screen size (4" phone - 10" tablet).
+
+If you are using device tree
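
The PID formula in the documentation text above can be illustrated with a small user-space sketch.  It is only an approximation under stated assumptions: the struct pid_sketch type and the pid_power_budget() name are hypothetical, the gains are plain integers rather than the fixed-point values the governor works with, and err_integral is assumed to include the current error.  It is not the governor's implementation.

```c
#include <stdint.h>

/* Hypothetical controller state; field names follow the formula above. */
struct pid_sketch {
	int32_t k_p;			/* proportional gain */
	int32_t k_i;			/* integral gain */
	int32_t k_d;			/* derivative gain */
	int64_t err_integral;		/* running sum of errors */
	int32_t prev_err;		/* error seen on the previous call */
	uint32_t sustainable_power;	/* feed-forward term, in mW */
};

/*
 * P_max = k_p * e + k_i * err_integral + k_d * diff_err + sustainable_power,
 * clamped to [0, max_power].  Assumption: the current error is added to
 * err_integral before the integral term is evaluated.
 */
static uint32_t pid_power_budget(struct pid_sketch *c, int32_t desired_temp,
				 int32_t current_temp, uint32_t max_power)
{
	int32_t e = desired_temp - current_temp;
	int32_t diff_err = e - c->prev_err;
	int64_t p_max;

	c->err_integral += e;
	p_max = (int64_t)c->k_p * e +
		(int64_t)c->k_i * c->err_integral +
		(int64_t)c->k_d * diff_err +
		c->sustainable_power;
	c->prev_err = e;

	/* The budget can never be negative nor exceed what the actors can use. */
	if (p_max < 0)
		p_max = 0;
	if (p_max > max_power)
		p_max = max_power;

	return (uint32_t)p_max;
}
```

When the zone is below the desired temperature the error is positive and the budget rises above sustainable_power; once the zone overshoots, the error turns negative and the budget drops below it.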

[RFC PATCH v6 9/9] of: thermal: Introduce sustainable power for a thermal zone

2014-12-05 Thread Javi Merino
From: Punit Agrawal punit.agra...@arm.com

Introduce an optional property called sustainable-power, which
represents the power (in mW) which the thermal zone can safely
dissipate.

If provided, the property is parsed and associated with the thermal
zone via the thermal zone parameters.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Punit Agrawal punit.agra...@arm.com
---
 Documentation/devicetree/bindings/thermal/thermal.txt | 4 
 drivers/thermal/of-thermal.c  | 4 
 2 files changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt 
b/Documentation/devicetree/bindings/thermal/thermal.txt
index f5db6b72a36f..c6eb9a8d2aed 100644
--- a/Documentation/devicetree/bindings/thermal/thermal.txt
+++ b/Documentation/devicetree/bindings/thermal/thermal.txt
@@ -167,6 +167,10 @@ Optional property:
by means of sensor ID. Additional coefficients are
interpreted as constant offset.
 
+- sustainable-power:   An estimate of the sustainable power (in mW) that the
+  Type: unsigned   thermal zone can dissipate.
+  Size: one cell
+
 Note: The delay properties are bound to the maximum dT/dt (temperature
 derivative over time) in two situations for a thermal zone:
 (i)  - when passive cooling is activated (polling-delay-passive); and
diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
index 62143ba31001..e032b9bf4085 100644
--- a/drivers/thermal/of-thermal.c
+++ b/drivers/thermal/of-thermal.c
@@ -794,6 +794,7 @@ int __init of_parse_thermal_zones(void)
for_each_child_of_node(np, child) {
struct thermal_zone_device *zone;
struct thermal_zone_params *tzp;
+   u32 prop;
 
/* Check whether child is enabled or not */
if (!of_device_is_available(child))
@@ -820,6 +821,9 @@ int __init of_parse_thermal_zones(void)
 		/* No hwmon because there might be hwmon drivers registering */
 		tzp->no_hwmon = true;
 
+		if (!of_property_read_u32(child, "sustainable-power", &prop))
+			tzp->sustainable_power = prop;
+
 		zone = thermal_zone_device_register(child->name, tz->ntrips,
 						    0, tz,
 						    ops, tzp,
-- 
1.9.1




[RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-05 Thread Javi Merino
Add a basic power model to the cpu cooling device to implement the
power cooling device API.  The power model uses the current frequency,
current load and OPPs for the power calculations.  The cpus must have
registered their OPPs using the OPP library.

Cc: Zhang Rui rui.zh...@intel.com
Cc: Eduardo Valentin edubez...@gmail.com
Signed-off-by: Punit Agrawal punit.agra...@arm.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 Documentation/thermal/cpu-cooling-api.txt | 144 +-
 drivers/thermal/cpu_cooling.c | 431 +-
 include/linux/cpu_cooling.h   |  49 +++-
 3 files changed, 611 insertions(+), 13 deletions(-)

diff --git a/Documentation/thermal/cpu-cooling-api.txt 
b/Documentation/thermal/cpu-cooling-api.txt
index fca24c931ec8..d438a900e374 100644
--- a/Documentation/thermal/cpu-cooling-api.txt
+++ b/Documentation/thermal/cpu-cooling-api.txt
@@ -25,8 +25,150 @@ the user. The registration APIs returns the cooling device 
pointer.
 
clip_cpus: cpumask of cpus where the frequency constraints will happen.
 
-1.1.2 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
+1.1.2 struct thermal_cooling_device *cpufreq_power_cooling_register(
+const struct cpumask *clip_cpus, u32 capacitance,
+get_static_t plat_static_func)
+
+Similar to cpufreq_cooling_register, this function registers a cpufreq
+cooling device.  Using this function, the cooling device will
+implement the power extensions by using a simple cpu power model.  The
+cpus must have registered their OPPs using the OPP library.
+
+The additional parameters are needed for the power model (See 2. Power
+models).  capacitance is the dynamic power coefficient (See 2.1
+Dynamic power).  plat_static_func is a function to calculate the
+static power consumed by these cpus (See 2.2 Static power).
+
+1.1.3 struct thermal_cooling_device *of_cpufreq_power_cooling_register(
+struct device_node *np, const struct cpumask *clip_cpus, u32 capacitance,
+get_static_t plat_static_func)
+
+Similar to cpufreq_power_cooling_register, this function registers a
+cpufreq cooling device with power extensions using the device tree
+information supplied by the np parameter.
+
+1.1.4 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
 
 This interface function unregisters the thermal-cpufreq-%x cooling 
device.
 
 cdev: Cooling device pointer which has to be unregistered.
+
+2. Power models
+
+The power API registration functions provide a simple power model for
+CPUs.  The current power is calculated as dynamic + (optionally)
+static power.  This power model requires that the operating-points of
+the CPUs are registered using the kernel's opp library and the
+`cpufreq_frequency_table` is assigned to the `struct device` of the
+cpu.  If you are using the `cpufreq-cpu0.c` driver then the
+`cpufreq_frequency_table` should already be assigned to the cpu
+device.
+
+The `plat_static_func` parameter of `cpufreq_power_cooling_register()`
+and `of_cpufreq_power_cooling_register()` is optional.  If you don't
+provide it, only dynamic power will be considered.
+
+2.1 Dynamic power
+
+The dynamic power consumption of a processor depends on many factors.
+For a given processor implementation the primary factors are:
+
+- The time the processor spends running, consuming dynamic power, as
+  compared to the time in idle states where dynamic consumption is
+  negligible.  Herein we refer to this as 'utilisation'.
+- The voltage and frequency levels as a result of DVFS.  The DVFS
+  level is a dominant factor governing power consumption.
+- In running time the 'execution' behaviour (instruction types, memory
+  access patterns and so forth) causes, in most cases, a second order
+  variation.  In pathological cases this variation can be significant,
+  but typically it is of a much lesser impact than the factors above.
+
+A high level dynamic power consumption model may then be represented as:
+
+Pdyn = f(run) * Voltage^2 * Frequency * Utilisation
+
+f(run) here represents the described execution behaviour and its
+result has units of Watts/Hz/Volt^2 (this is often expressed in
+mW/MHz/uVolt^2)
+
+The detailed behaviour for f(run) could be modelled on-line.  However,
+in practice, such an on-line model has dependencies on a number of
+implementation specific processor support and characterisation
+factors.  Therefore, in the initial implementation that contribution
+is represented as a constant coefficient.  This is a simplification
+consistent with the relative contribution to overall power variation.
+
+In this simplified representation our model becomes:
+
+Pdyn = Kd * Voltage^2 * Frequency * Utilisation
+
+Where Kd (capacitance) represents an indicative running time dynamic
+power coefficient in fundamental units of mW/MHz/uVolt^2
+
+2.2 Static power
+
+Static leakage power consumption depends on a number of factors.  For a
+given circuit implementation the primary factors
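
The simplified dynamic power model above, Pdyn = Kd * Voltage^2 * Frequency * Utilisation, can be sketched in integer arithmetic as follows.  This is an illustration only: dyn_power_mw() is a hypothetical name, utilisation is assumed to be given in percent, and the division by 10^9 that brings the MHz * mV^2 product down to mW is an assumed scaling, not a statement about what the cooling device code does.

```c
#include <stdint.h>

/*
 * Pdyn = Kd * V^2 * f * utilisation.
 *
 * kd:         dynamic power coefficient ("capacitance")
 * freq_mhz:   operating frequency in MHz
 * voltage_mv: operating voltage in millivolts
 * util_pct:   utilisation in percent (assumption, 0..100)
 *
 * The intermediate product is kept in 64 bits so that it cannot overflow
 * for realistic MHz and millivolt values.
 */
static uint32_t dyn_power_mw(uint32_t kd, uint32_t freq_mhz,
			     uint32_t voltage_mv, uint32_t util_pct)
{
	uint64_t power = (uint64_t)kd * freq_mhz * voltage_mv * voltage_mv;

	power /= 1000000000ULL;		/* assumed scaling to mW */

	return (uint32_t)(power * util_pct / 100);
}
```

Because the voltage enters squared, lowering the operating point reduces power faster than the frequency ratio alone suggests, which is why the DVFS level dominates the model.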

[RFC PATCH v6 3/9] tools lib traceevent: Add support for __print_u{8,16,32,64}_array()

2014-12-05 Thread Javi Merino
Trace can now generate traces with u8, u16, u32 and u64 dynamic
arrays.  Add support to parse them.

Cc: Arnaldo Carvalho de Melo a...@redhat.com
Cc: Steven Rostedt srost...@redhat.com
Cc: Jiri Olsa jo...@redhat.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 tools/lib/traceevent/event-parse.c | 62 +++---
 tools/lib/traceevent/event-parse.h |  4 +++
 2 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c 
b/tools/lib/traceevent/event-parse.c
index f12ea53cc83b..f67260bddd65 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -753,6 +753,10 @@ static void free_arg(struct print_arg *arg)
 		free_arg(arg->symbol.field);
 		free_flag_sym(arg->symbol.symbols);
 		break;
+	case PRINT_U8:
+	case PRINT_U16:
+	case PRINT_U32:
+	case PRINT_U64:
 	case PRINT_HEX:
 		free_arg(arg->num.field);
 		free_arg(arg->num.size);
@@ -2827,6 +2831,22 @@ process_function(struct event_format *event, struct 
print_arg *arg,
free_token(token);
return process_hex(event, arg, tok);
}
+	if (strcmp(token, "__print_u8_array") == 0) {
+		free_token(token);
+		return process_num(event, arg, tok, PRINT_U8);
+	}
+	if (strcmp(token, "__print_u16_array") == 0) {
+		free_token(token);
+		return process_num(event, arg, tok, PRINT_U16);
+	}
+	if (strcmp(token, "__print_u32_array") == 0) {
+		free_token(token);
+		return process_num(event, arg, tok, PRINT_U32);
+	}
+	if (strcmp(token, "__print_u64_array") == 0) {
+		free_token(token);
+		return process_num(event, arg, tok, PRINT_U64);
+	}
 	if (strcmp(token, "__get_str") == 0) {
free_token(token);
return process_str(event, arg, tok);
@@ -3355,6 +3375,10 @@ eval_num_arg(void *data, int size, struct event_format 
*event, struct print_arg
break;
case PRINT_FLAGS:
case PRINT_SYMBOL:
+   case PRINT_U8:
+   case PRINT_U16:
+   case PRINT_U32:
+   case PRINT_U64:
case PRINT_HEX:
break;
case PRINT_TYPE:
@@ -3660,7 +3684,7 @@ static void print_str_arg(struct trace_seq *s, void 
*data, int size,
unsigned long long val, fval;
unsigned long addr;
char *str;
-   unsigned char *hex;
+   void *num;
int print;
int i, len;
 
@@ -3739,13 +3763,17 @@ static void print_str_arg(struct trace_seq *s, void 
*data, int size,
}
}
break;
+   case PRINT_U8:
+   case PRINT_U16:
+   case PRINT_U32:
+   case PRINT_U64:
case PRINT_HEX:
 		if (arg->num.field->type == PRINT_DYNAMIC_ARRAY) {
 			unsigned long offset;
 			offset = pevent_read_number(pevent,
 				data + arg->num.field->dynarray.field->offset,
 				arg->num.field->dynarray.field->size);
-			hex = data + (offset & 0xffff);
+			num = data + (offset & 0xffff);
 		} else {
 			field = arg->num.field->field.field;
if (!field) {
@@ -3755,13 +3783,24 @@ static void print_str_arg(struct trace_seq *s, void 
*data, int size,
goto out_warning_field;
 				arg->num.field->field.field = field;
 			}
-			hex = data + field->offset;
+			num = data + field->offset;
 		}
 		len = eval_num_arg(data, size, event, arg->num.size);
 		for (i = 0; i < len; i++) {
 			if (i)
 				trace_seq_putc(s, ' ');
-			trace_seq_printf(s, "%02x", hex[i]);
+			if (arg->type == PRINT_HEX)
+				trace_seq_printf(s, "%02x",
+						((uint8_t *)num)[i]);
+			else if (arg->type == PRINT_U8)
+				trace_seq_printf(s, "%u", ((uint8_t *)num)[i]);
+			else if (arg->type == PRINT_U16)
+				trace_seq_printf(s, "%u", ((uint16_t *)num)[i]);
+			else if (arg->type == PRINT_U32)
+				trace_seq_printf(s, "%u", ((uint32_t *)num)[i]);
+			else	/* PRINT_U64 */
+				trace_seq_printf(s, "%lu",
+						((uint64_t *)num)[i]);
}
break;
 
@@ -4922,7 +4961,20 @@ static void print_args(struct print_arg *args)
 		printf(")");
break
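
The element-width dispatch that the print_str_arg() hunk above adds can be sketched outside the library like this.  It is a stand-alone illustration, not libtraceevent code: enum elem_type and format_num_array() are hypothetical names, the output goes to a caller-supplied buffer instead of a trace_seq, and PRIu64 is used for every width so that the format string does not depend on the size of long.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the PRINT_U8 .. PRINT_U64 argument types. */
enum elem_type { ELEM_U8, ELEM_U16, ELEM_U32, ELEM_U64 };

/*
 * Format len elements of the given width from data into buf, separated by
 * single spaces.  Returns the number of characters written, or -1 if the
 * buffer is too small.
 */
static int format_num_array(char *buf, size_t buflen, const void *data,
			    int len, enum elem_type type)
{
	const unsigned char *p = data;
	size_t used = 0;
	int i;

	if (buflen == 0)
		return -1;
	buf[0] = '\0';

	for (i = 0; i < len; i++) {
		uint64_t v = 0;
		uint8_t v8;
		uint16_t v16;
		uint32_t v32;
		int n;

		/* memcpy avoids assuming that data is suitably aligned. */
		switch (type) {
		case ELEM_U8:
			memcpy(&v8, p + i * sizeof(v8), sizeof(v8));
			v = v8;
			break;
		case ELEM_U16:
			memcpy(&v16, p + i * sizeof(v16), sizeof(v16));
			v = v16;
			break;
		case ELEM_U32:
			memcpy(&v32, p + i * sizeof(v32), sizeof(v32));
			v = v32;
			break;
		case ELEM_U64:
			memcpy(&v, p + i * sizeof(v), sizeof(v));
			break;
		}

		n = snprintf(buf + used, buflen - used, "%s%" PRIu64,
			     i ? " " : "", v);
		if (n < 0 || (size_t)n >= buflen - used)
			return -1;
		used += (size_t)n;
	}

	return (int)used;
}
```

Widening every element to 64 bits before printing keeps a single format string for all four widths; the cost is one extra copy per element.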

[RFC PATCH v6 2/9] tools lib traceevent: Generalize numeric argument

2014-12-05 Thread Javi Merino
Numeric arguments can be in different bases, so rename the hex
argument to num so that it can be used for formats other than
PRINT_HEX.

Cc: Steven Rostedt srost...@redhat.com
Cc: Arnaldo Carvalho de Melo a...@redhat.com
Cc: Jiri Olsa jo...@redhat.com
Signed-off-by: Javi Merino javi.mer...@arm.com
---
 tools/lib/traceevent/event-parse.c | 26 +-
 tools/lib/traceevent/event-parse.h |  4 ++--
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c 
b/tools/lib/traceevent/event-parse.c
index cf3a44bf1ec3..f12ea53cc83b 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -754,8 +754,8 @@ static void free_arg(struct print_arg *arg)
 		free_flag_sym(arg->symbol.symbols);
 		break;
 	case PRINT_HEX:
-		free_arg(arg->hex.field);
-		free_arg(arg->hex.size);
+		free_arg(arg->num.field);
+		free_arg(arg->num.size);
break;
case PRINT_TYPE:
free(arg-typecast.type);
@@ -2503,7 +2503,7 @@ process_hex(struct event_format *event, struct print_arg 
*arg, char **tok)
 	if (test_type_token(type, token, EVENT_DELIM, ","))
 		goto out_free;
 
-	arg->hex.field = field;
+	arg->num.field = field;
 
free_token(token);
 
@@ -2519,7 +2519,7 @@ process_hex(struct event_format *event, struct print_arg 
*arg, char **tok)
 	if (test_type_token(type, token, EVENT_DELIM, ")"))
 		goto out_free;
 
-	arg->hex.size = field;
+	arg->num.size = field;
 
free_token(token);
type = read_token_item(tok);
@@ -3740,24 +3740,24 @@ static void print_str_arg(struct trace_seq *s, void 
*data, int size,
}
break;
case PRINT_HEX:
-		if (arg->hex.field->type == PRINT_DYNAMIC_ARRAY) {
+		if (arg->num.field->type == PRINT_DYNAMIC_ARRAY) {
 			unsigned long offset;
 			offset = pevent_read_number(pevent,
-				data + arg->hex.field->dynarray.field->offset,
-				arg->hex.field->dynarray.field->size);
+				data + arg->num.field->dynarray.field->offset,
+				arg->num.field->dynarray.field->size);
 			hex = data + (offset & 0xffff);
 		} else {
-			field = arg->hex.field->field.field;
+			field = arg->num.field->field.field;
 			if (!field) {
-				str = arg->hex.field->field.name;
+				str = arg->num.field->field.name;
 				field = pevent_find_any_field(event, str);
 				if (!field)
 					goto out_warning_field;
-				arg->hex.field->field.field = field;
+				arg->num.field->field.field = field;
 			}
 			hex = data + field->offset;
 		}
-		len = eval_num_arg(data, size, event, arg->hex.size);
+		len = eval_num_arg(data, size, event, arg->num.size);
 		for (i = 0; i < len; i++) {
if (i)
trace_seq_putc(s, ' ');
@@ -4923,9 +4923,9 @@ static void print_args(struct print_arg *args)
break;
case PRINT_HEX:
 		printf("__print_hex(");
-		print_args(args->hex.field);
+		print_args(args->num.field);
 		printf(", ");
-		print_args(args->hex.size);
+		print_args(args->num.size);
 		printf(")");
break;
case PRINT_STRING:
diff --git a/tools/lib/traceevent/event-parse.h 
b/tools/lib/traceevent/event-parse.h
index 7a3873ff9a4f..2bf72e908a74 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -240,7 +240,7 @@ struct print_arg_symbol {
struct print_flag_sym   *symbols;
 };
 
-struct print_arg_hex {
+struct print_arg_num {
struct print_arg*field;
struct print_arg*size;
 };
@@ -291,7 +291,7 @@ struct print_arg {
struct print_arg_typecast   typecast;
struct print_arg_flags  flags;
struct print_arg_symbol symbol;
-   struct print_arg_hexhex;
+   struct print_arg_numnum;
struct print_arg_func   func;
struct print_arg_string string;
struct print_arg_bitmaskbitmask;
-- 
1.9.1



Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-08 Thread Javi Merino
On Mon, Dec 08, 2014 at 05:49:00AM +0000, Viresh Kumar wrote:
> Hi Javi,

Hi Viresh,

> Looks like ARM's exchange server screwed up your patch?
>
> This is how I see it with gmail's show-original option:
>
> +=09cpufreq_device->dyn_power_table =3D power_table;
> +=09cpufreq_device->dyn_power_table_entries =3D i;
> +
>
> I have seen this a lot, while I was in ARM. Had to adopt some work-arounds to
> get over it. :)

Sigh.  Care to share them (privately I guess)?

> On Sat, Dec 6, 2014 at 12:34 AM, Javi Merino javi.mer...@arm.com wrote:
>
> > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
>
> > +static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
> > +				 u32 capacitance)
> > +{
> > +	struct power_table *power_table;
> > +	struct dev_pm_opp *opp;
> > +	struct device *dev = NULL;
> > +	int num_opps, cpu, i, ret = 0;
>
> Why not initialize num_opps and i to 0 here?

ok

> > +	unsigned long freq;
> > +
> > +	num_opps = 0;
> > +
> > +	rcu_read_lock();
> > +
> > +	for_each_cpu(cpu, &cpufreq_device->allowed_cpus) {
>
> All these CPUs must be sharing the OPPs as they must be supplied
> from a single clock line. But probably you need to iterate over all
> because you don't know which ones share OPP. Right ? Probably
> the work I am doing around getting new OPP bindings might solve
> this..

Is this loop pointless?  I seem to recall that it was needed but I
forgot the details.  If you think it is, I can remove it.

> > +		dev = get_cpu_device(cpu);
> > +		if (!dev)
>
> Is this allowed? I understand you can continue, but this is not
> possible. Right ? So, print an error here?

Ok, now it prints an error.

> > +			continue;
> > +
> > +		num_opps = dev_pm_opp_get_opp_count(dev);
> > +		if (num_opps > 0) {
> > +			break;
> > +		} else if (num_opps < 0) {
> > +			ret = num_opps;
> > +			goto unlock;
> > +		}
> > +	}
> > +
> > +	if (num_opps == 0) {
> > +		ret = -EINVAL;
> > +		goto unlock;
> > +	}
> > +
> > +	power_table = kcalloc(num_opps, sizeof(*power_table), GFP_KERNEL);
> > +
> > +	i = 0;
>
> Either initialize i at the beginning or in the initialization part of
> for loop below.

As part of the for loop.

> > +	for (freq = 0;
> > +	     opp = dev_pm_opp_find_freq_ceil(dev, &freq), !IS_ERR(opp);
> > +	     freq++) {
> > +		u32 freq_mhz, voltage_mv;
> > +		u64 power;
> > +
> > +		freq_mhz = freq / 1000000;
> > +		voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
> > +
> > +		/*
> > +		 * Do the multiplication with MHz and millivolt so as
> > +		 * to not overflow.
> > +		 */
> > +		power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
> > +		do_div(power, 1000000000);
> > +
> > +		/* frequency is stored in power_table in KHz */
> > +		power_table[i].frequency = freq / 1000;
> > +		power_table[i].power = power;
> > +
> > +		i++;
>
> Why here and not with freq++?

As part of the for loop as well.
 
> > +	}
> > +
> > +	if (i == 0) {
> > +		ret = PTR_ERR(opp);
> > +		goto unlock;
> > +	}
> > +
> > +	cpufreq_device->dyn_power_table = power_table;
> > +	cpufreq_device->dyn_power_table_entries = i;
> > +
> > +unlock:
> > +	rcu_read_unlock();
> > +	return ret;
> > +}
> > +
> > +static u32 cpu_freq_to_power(struct cpufreq_cooling_device *cpufreq_device,
> > +			u32 freq)
>
> Because the patch is screwed up a bit, I really can't see if the 'u'
> of u32 is directly below the 's' of struct cpufreq_cooling_device.
> Running checkpatch with --strict will take care of that probably.
> Sorry if you have already taken care of that..

It wasn't.  I'll run checkpatch with --strict on next submission.

> > +{
> > +	int i;
> > +	struct power_table *pt = cpufreq_device->dyn_power_table;
> > +
> > +	for (i = 1; i < cpufreq_device->dyn_power_table_entries; i++)
> > +		if (freq < pt[i].frequency)
> > +			break;
> > +
> > +	return pt[i - 1].power;
> > +}
>
> > +static u32 get_static_power(struct cpufreq_cooling_device *cpufreq_device,
> > +			    unsigned long freq)
> > +{
> > +	struct device *cpu_dev;
> > +	struct dev_pm_opp *opp;
> > +	unsigned long voltage;
> > +	struct cpumask *cpumask = &cpufreq_device->allowed_cpus;
> > +	unsigned long freq_hz = freq * 1000;
> > +
> > +	if (!cpufreq_device->plat_get_static_power)
> > +		return 0;
> > +
> > +	cpu_dev = get_cpu_device(cpumask_any(cpumask));
>
> Similar to the way you have used for-each-cpu earlier, the cpu
> returned from above may not have opps attached to it. Right ?
>
> Probably you can keep a copy of the cpu_dev we have opps attached
> with somewhere

Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-08 Thread Javi Merino
On Mon, Dec 08, 2014 at 01:31:35PM +0000, Viresh Kumar wrote:
> On 8 December 2014 at 18:20, Javi Merino javi.mer...@arm.com wrote:
> > Is this loop pointless?  I seem to recall that it was needed but I
> > forgot the details.  If you think it is, I can remove it.
>
> Yes it is pointless. The CPUs you are iterating on, share clock lines
> and so they will have same set of OPPs. Just do this for the cpu
> we are registering the cooling device.

Ok, changed it into:

	cpu = cpumask_any(&cpufreq_device->allowed_cpus);
	dev = get_cpu_device(cpu);
	if (!dev) {
		dev_warn(&cpufreq_device->cool_dev->device,
			"No cpu device for cpu %d\n", cpu);
		ret = -EINVAL;
		goto unlock;
	}

	num_opps = dev_pm_opp_get_opp_count(dev);
	if (num_opps <= 0) {
		ret = (num_opps < 0)? num_opps : -EINVAL;
		goto unlock;
	}

Thanks!
Javi



Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-09 Thread Javi Merino
Hi Viresh,

On Tue, Dec 09, 2014 at 01:59:39AM +0000, Viresh Kumar wrote:
> On 8 December 2014 at 19:52, Javi Merino javi.mer...@arm.com wrote:
> > Ok, changed it into:
>
> > 	cpu = cpumask_any(&cpufreq_device->allowed_cpus);
> > 	dev = get_cpu_device(cpu);
> > 	if (!dev) {
> > 		dev_warn(&cpufreq_device->cool_dev->device,
> > 			"No cpu device for cpu %d\n", cpu);
> > 		ret = -EINVAL;
> > 		goto unlock;
> > 	}
>
> > 	num_opps = dev_pm_opp_get_opp_count(dev);
> > 	if (num_opps <= 0) {
> > 		ret = (num_opps < 0)? num_opps : -EINVAL;
> > 		goto unlock;
> > 	}
>
> And this might not work. This is what I said in the first reply.
>
> So, a bit lengthy reply now :)
>
> Every cpu has a device struct associated with it. When cpufreq
> core initializes drivers, they ask for mapping (initializing) the opps.
> At that point we pass policy->cpu to opp core. OPP core doesn't
> know which cores share clock line (I am trying to solve that [1]) and
> so it just initializes the OPPs for policy->cpu. Let us say it cpuX.
>
> Now there will be few more CPUs which are going to share clock
> line with it and hence will use the same OPPs. In thermal core,
> you got clip_cpus which is exactly the masks of all these CPUs
> sharing clock line.
>
> If the OPP layer is good enough, then above code can work. But
> because right now the OPPs are mapped to just cpuX, passing
> any other cpu from clip_cpus will fail as it doesn't have any associated
> OPPs.
>
> Now what I asked you is to use the CPU for which
> __cpufreq_cooling_register() is called. Normally we are calling
> __cpufreq_cooling_register() for the CPU for which OPPs are
> registered (but people might call it up for other CPUs as well)..

Sorry but I don't follow.  __cpufreq_cooling_register() is passed a
clip_cpus mask, not a single cpu.  How do I get the cpu for which
__cpufreq_cooling_register() is called if not by looping through all
the cpus in the mask?
 
> So, using that cpu *might* have worked here.
>
> Now the earlier loop you used was good to get this information,
> but it wasn't consistent and so I objected.
>
> What you should do:
>
> - Create another routine to find the cpu for which OPPs are bound to
> - And save the cpu_dev for it in the global struct for cpu_cooling

This I have done, it wasn't part of the snip that I sent.

> - reuse it wherever required.

Same as above.



Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-09 Thread Javi Merino
On Tue, Dec 09, 2014 at 10:36:46AM +0000, Viresh Kumar wrote:
> On 9 December 2014 at 16:02, Javi Merino javi.mer...@arm.com wrote:
> > Sorry but I don't follow.  __cpufreq_cooling_register() is passed a
> > clip_cpus mask, not a single cpu.  How do I get the cpu for which
> > __cpufreq_cooling_register() is called if not by looping through all
> > the cpus in the mask?
>
> Yeah, it's np that is passed instead of cpu number. So, that wouldn't
> be usable. Also because of the limitations I explained earlier, it makes
> sense to iterate over all clip_cpus and find which one owns OPPs.

Ok, how about this then?  I've pasted the whole commit so as to avoid
confusion.

diff --git a/Documentation/thermal/cpu-cooling-api.txt b/Documentation/thermal/cpu-cooling-api.txt
index fca24c931ec8..d438a900e374 100644
--- a/Documentation/thermal/cpu-cooling-api.txt
+++ b/Documentation/thermal/cpu-cooling-api.txt
@@ -25,8 +25,150 @@ the user. The registration APIs returns the cooling device pointer.
 
 clip_cpus: cpumask of cpus where the frequency constraints will happen.
 
-1.1.2 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
+1.1.2 struct thermal_cooling_device *cpufreq_power_cooling_register(
+const struct cpumask *clip_cpus, u32 capacitance,
+get_static_t plat_static_func)
+
+Similar to cpufreq_cooling_register, this function registers a cpufreq
+cooling device.  Using this function, the cooling device will
+implement the power extensions by using a simple cpu power model.  The
+cpus must have registered their OPPs using the OPP library.
+
+The additional parameters are needed for the power model (See 2. Power
+models).  capacitance is the dynamic power coefficient (See 2.1
+Dynamic power).  plat_static_func is a function to calculate the
+static power consumed by these cpus (See 2.2 Static power).
+
+1.1.3 struct thermal_cooling_device *of_cpufreq_power_cooling_register(
+struct device_node *np, const struct cpumask *clip_cpus, u32 capacitance,
+get_static_t plat_static_func)
+
+Similar to cpufreq_power_cooling_register, this function registers a
+cpufreq cooling device with power extensions using the device tree
+information supplied by the np parameter.
+
+1.1.4 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
 
 This interface function unregisters the thermal-cpufreq-%x cooling device.
 
 cdev: Cooling device pointer which has to be unregistered.
+
+2. Power models
+
+The power API registration functions provide a simple power model for
+CPUs.  The current power is calculated as dynamic + (optionally)
+static power.  This power model requires that the operating-points of
+the CPUs are registered using the kernel's opp library and the
+`cpufreq_frequency_table` is assigned to the `struct device` of the
+cpu.  If you are using the `cpufreq-cpu0.c` driver then the
+`cpufreq_frequency_table` should already be assigned to the cpu
+device.
+
+The `plat_static_func` parameter of `cpufreq_power_cooling_register()`
+and `of_cpufreq_power_cooling_register()` is optional.  If you don't
+provide it, only dynamic power will be considered.
+
+2.1 Dynamic power
+
+The dynamic power consumption of a processor depends on many factors.
+For a given processor implementation the primary factors are:
+
+- The time the processor spends running, consuming dynamic power, as
+  compared to the time in idle states where dynamic consumption is
+  negligible.  Herein we refer to this as 'utilisation'.
+- The voltage and frequency levels as a result of DVFS.  The DVFS
+  level is a dominant factor governing power consumption.
+- In running time the 'execution' behaviour (instruction types, memory
+  access patterns and so forth) causes, in most cases, a second order
+  variation.  In pathological cases this variation can be significant,
+  but typically it is of a much lesser impact than the factors above.
+
+A high level dynamic power consumption model may then be represented as:
+
+Pdyn = f(run) * Voltage^2 * Frequency * Utilisation
+
+f(run) here represents the described execution behaviour and its
+result has units of Watts/Hz/Volt^2 (this is often expressed in
+mW/MHz/uVolt^2).
+
+The detailed behaviour for f(run) could be modelled on-line.  However,
+in practice, such an on-line model has dependencies on a number of
+implementation specific processor support and characterisation
+factors.  Therefore, in the initial implementation that contribution is
+represented as a constant coefficient.  This is a simplification
+consistent with the relative contribution to overall power variation.
+
+In this simplified representation our model becomes:
+
+Pdyn = Kd * Voltage^2 * Frequency * Utilisation
+
+Where Kd (capacitance) represents an indicative running time dynamic
+power coefficient in fundamental units of mW/MHz/uVolt^2.
+
+2.2 Static power
+
+Static leakage power consumption depends on a number of factors.  For a
+given circuit implementation the primary factors are:
+
+- Time
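
The simplified dynamic power model from section 2.1 of the patch above,
Pdyn = Kd * Voltage^2 * Frequency * Utilisation, can be sketched as a
small C helper.  This is a minimal illustration and not the code in
cpu_cooling.c: dynamic_power() is a hypothetical name, floating point
is used only for readability, and the caller is assumed to pass Kd,
voltage and frequency in mutually consistent units, with utilisation
as a fraction between 0 and 1.

```c
#include <assert.h>

/*
 * Hypothetical helper: evaluate Pdyn = Kd * V^2 * f * utilisation.
 * The result is in whatever power unit Kd is expressed in, given
 * matching units for voltage and freq.
 */
double dynamic_power(double kd, double voltage, double freq,
		     double utilisation)
{
	/* utilisation is the fraction of time spent running */
	assert(utilisation >= 0.0 && utilisation <= 1.0);

	return kd * voltage * voltage * freq * utilisation;
}
```

Because voltage enters squared, doubling it at a fixed frequency and
utilisation quadruples the estimate, while an idle cpu (utilisation 0)
contributes no dynamic power in this model.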

Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-09 Thread Javi Merino
On Tue, Dec 09, 2014 at 11:06:46AM +, Viresh Kumar wrote:
 On 9 December 2014 at 16:30, Javi Merino javi.mer...@arm.com wrote:
  Ok, how about this then?  I've pasted the whole commit so as to avoid
  confusion.
 
 Yeah, the cpu_dev part looks fine now.

Great, thanks!
