Re: [v3,1/3] thermal: mediatek: Relocate driver to mediatek folder

2021-04-20 Thread Daniel Lezcano
On 12/03/2021 04:40, Michael Kao wrote:
> Add Mediatek proprietary folder to upstream more thermal zone and cooler
> drivers. Relocate the original thermal controller driver to it and rename
> as soc_temp.c to show its purpose more clearly.

We already know the purpose :)

soc_temp gives no additional information.

Either keep the name or give the hardware sensor name. It is a driver
directory.

> Signed-off-by: Michael Kao 
> ---
>  drivers/thermal/Kconfig   | 14 ---
>  drivers/thermal/Makefile  |  2 +-
>  drivers/thermal/mediatek/Kconfig  | 23 +++
>  drivers/thermal/mediatek/Makefile |  1 +
>  .../{mtk_thermal.c => mediatek/soc_temp.c}|  0

[ ... ]

vv


> +config MTK_THERMAL
> + tristate "Mediatek thermal drivers"
> + depends on THERMAL_OF
> + help
> +   This is the option for Mediatek thermal software
> +   solutions. Please enable corresponding options to
> +   get temperature information from thermal sensors or
> +   turn on throttle mechaisms for thermal mitigation.
> +
> +if MTK_THERMAL


^^

All the above not needed.

> +config MTK_SOC_THERMAL
> + tristate "Temperature sensor driver for mediatek SoCs"
> + depends on HAS_IOMEM
> + depends on NVMEM
> + depends on RESET_CONTROLLER
> + help
> +   Enable this option if you want to get SoC temperature
> +   information for Mediatek platforms. This driver
> +   configures thermal controllers to collect temperature
> +   via AUXADC interface.
> +
> +endif
> diff --git a/drivers/thermal/mediatek/Makefile 
> b/drivers/thermal/mediatek/Makefile
> new file mode 100644
> index ..f75313ddce5e
> --- /dev/null
> +++ b/drivers/thermal/mediatek/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_MTK_SOC_THERMAL)+= soc_temp.o
> diff --git a/drivers/thermal/mtk_thermal.c 
> b/drivers/thermal/mediatek/soc_temp.c
> similarity index 100%
> rename from drivers/thermal/mtk_thermal.c
> rename to drivers/thermal/mediatek/soc_temp.c



-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v2 2/2] thermal: power_allocator: update once cooling devices when temp is low

2021-04-20 Thread Daniel Lezcano
On 20/04/2021 22:01, Lukasz Luba wrote:
> 
> 
> On 4/20/21 4:24 PM, Daniel Lezcano wrote:
>> On 20/04/2021 16:21, Lukasz Luba wrote:
>>> Hi Daniel,
>>>
>>> On 4/20/21 2:30 PM, Daniel Lezcano wrote:
>>>> On 19/04/2021 10:45, Lukasz Luba wrote:
>>>
>>> [snip]

[ ... ]

> 
> That new approach should work. I can test your patch with this new
> functions and re-base my work on top of it.
> Or you like me to write such patch and send it?

At your convenience. I'm pretty busy ATM with more patches to review, so
if you can handle that change that will be nice. Otherwise, I can take
care of it but later.

Thanks

  -- Daniel

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH v2] thermal: mediatek: add sensors-support

2021-04-20 Thread Daniel Lezcano
On 20/04/2021 17:54, Frank Wunderlich wrote:
> From: Frank Wunderlich 
> 
> add HWMON-support to mediateks thermal driver to allow lm-sensors
> userspace tools read soc temperature
> 
> Signed-off-by: Frank Wunderlich 
> ---
> v2: drop ifdef and used devm_thermal_add_hwmon_sysfs
> ---
>  drivers/thermal/mtk_thermal.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/thermal/mtk_thermal.c b/drivers/thermal/mtk_thermal.c
> index 149c6d7fd5a0..32be8a715c7d 100644
> --- a/drivers/thermal/mtk_thermal.c
> +++ b/drivers/thermal/mtk_thermal.c
> @@ -23,6 +23,8 @@
>  #include 
>  #include 
>  
> +#include "thermal_hwmon.h"
> +
>  /* AUXADC Registers */
>  #define AUXADC_CON1_SET_V0x008
>  #define AUXADC_CON1_CLR_V0x00c
> @@ -1087,6 +1089,11 @@ static int mtk_thermal_probe(struct platform_device 
> *pdev)
>   goto err_disable_clk_peri_therm;
>   }
>  
> + tzdev->tzp->no_hwmon = false;

^^ you can drop this line

> + ret = devm_thermal_add_hwmon_sysfs(tzdev);
> + if (ret)
> + dev_err(>dev, "error in thermal_add_hwmon_sysfs");
> +
>   return 0;
>  
>  err_disable_clk_peri_therm:
> 


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v2 1/1] thermal: ti-soc-thermal: Remove duplicated header file inclusion

2021-04-20 Thread Daniel Lezcano
On 06/04/2021 11:19, Zhen Lei wrote:
> Delete one of the header files  that are included
> twice, all included header files are then rearranged alphabetically.

The duplicate header file inclusion has been already fixed in a previous
patch.

Applied this patch by massaging the changelog accordingly and fixing the
conflict.

Thanks

  -- Daniel

-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: Aw: Re: [PATCH] thermal: mediatek: add sensors-support

2021-04-20 Thread Daniel Lezcano
On 20/04/2021 17:24, Frank Wunderlich wrote:
> Am 20. April 2021 17:18:32 MESZ schrieb Daniel Lezcano 
> :
>>
>> Hi Frank,
> 
>> The no_hwmon usage is a bit fuzzy in the thermal core code.
> 
> Maybe add depency in Kconfig? Else we can get undefined symbols on linking
> 
> regards Frank

Nope, there are the stubs in thermal_hwmon.h



-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH v2 2/2] thermal: power_allocator: update once cooling devices when temp is low

2021-04-20 Thread Daniel Lezcano
On 20/04/2021 16:21, Lukasz Luba wrote:
> Hi Daniel,
> 
> On 4/20/21 2:30 PM, Daniel Lezcano wrote:
>> On 19/04/2021 10:45, Lukasz Luba wrote:
> 
> [snip]
> 
>>> -    instance->cdev->updated = false;
>>> +    if (update)
>>> +    instance->cdev->updated = false;
>>> +
>>>   mutex_unlock(>cdev->lock);
>>> -    (instance->cdev);
>>> +
>>> +    if (update)
>>> +    thermal_cdev_update(instance->cdev);
>>
>> This cdev update has something bad IMHO. It is protected by a mutex but
>> the 'updated' field is left unprotected before calling
>> thermal_cdev_update().
>>
>> It is not the fault of this code but how the cooling device are updated
>> and how it interacts with the thermal instances.
>>
>> IMO, part of the core code needs to revisited.
> 
> I agree, but please check my comments below.
> 
>>
>> This change tight a bit more the knot.
>>
>> Would it make sense to you if we create a function eg.
>> __thermal_cdev_update()
> 
> I'm not sure if I assume it right that the function would only have the:
> list_for_each_entry(instance, >thermal_instances, cdev_node)
> 
> loop from the thermal_cdev_update(). But if it has only this loop then
> it's too little.
> 
>>
>> And then we have:
>>
>> void thermal_cdev_update(struct thermal_cooling_device *cdev)
>> {
>>  mutex_lock(>lock);
>>  /* cooling device is updated*/
>>  if (cdev->updated) {
>>  mutex_unlock(>lock);
>>  return;
>>  }
>>
>> __thermal_cdev_update(cdev);
>>
>>  thermal_cdev_set_cur_state(cdev, target);
> 
> Here we are actually setting the 'target' state via:
> cdev->ops->set_cur_state(cdev, target)
> 
> then we notify, then updating stats.
> 
>>
>>  cdev->updated = true;
>>  mutex_unlock(>lock);
>>  trace_cdev_update(cdev, target);
> 
> Also this trace is something that I'm using in my tests...

Yeah, I noticed right after sending the comments. All that should be
moved in the lockless function.

So this function becomes:

void thermal_cdev_update(struct thermal_cooling_device *cdev)
{
mutex_lock(>lock);
if (!cdev->updated) {
__thermal_cdev_update(cdev);
cdev->updated = true;
}
mutex_unlock(>lock);

dev_dbg(>device, "set to state %lu\n", target);
}

We end up with the trace_cdev_update(cdev, target) inside the mutex
section but that should be fine.

>>  dev_dbg(>device, "set to state %lu\n", target);
>> }
>>
>> And in this file we do instead:
>>
>> -    instance->cdev->updated = false;
>> +    if (update)
>> +    __thermal_cdev_update(instance->cdev);
>>    mutex_unlock(>cdev->lock);
>> -    thermal_cdev_update(instance->cdev);
> 
> Without the line above, we are not un-throttling the devices.

Is it still true with the amended function thermal_cdev_update() ?


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: Aw: Re: [PATCH] thermal: mediatek: add sensors-support

2021-04-20 Thread Daniel Lezcano


Hi Frank,

On 20/04/2021 16:59, Frank Wunderlich wrote:
> Hi,
> 
>> Gesendet: Dienstag, 20. April 2021 um 14:07 Uhr
>> Von: "Daniel Lezcano" 
> 
>> No #ifdef in C file.
> ...
> 
>> devm_thermal_add_hwmon_sysfs() ?
> 
> based on your comments this should be enough/right?
> 
> #if IS_ENABLED(CONFIG_THERMAL_HWMON)
> tzdev->tzp->no_hwmon = false;> ret = 
> devm_thermal_add_hwmon_sysfs(tzdev);
> if (ret)
> dev_err(>dev, "error in thermal_add_hwmon_sysfs");
> #endif
> 
> if yes i send out v2, at least it works on my device

just the following lines should work:

ret = devm_thermal_add_hwmon_sysfs(tzdev);
if (ret)
dev_err(>dev, "error in thermal_add_hwmon_sysfs");


The no_hwmon usage is a bit fuzzy in the thermal core code.


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH v2 2/2] thermal: power_allocator: update once cooling devices when temp is low

2021-04-20 Thread Daniel Lezcano
On 19/04/2021 10:45, Lukasz Luba wrote:
> The cooling device state change generates an event, also when there is no
> need, because temperature is low and device is not throttled. Avoid to
> unnecessary update the cooling device which means also not sending event.
> The cooling device state has not changed because the temperature is still
> below the first activation trip point value, so we can do this.
> Add a tracking mechanism to make sure it updates cooling devices only
> once - when the temperature dropps below first trip point.
> 
> Reported-by: Daniel Lezcano 
> Signed-off-by: Lukasz Luba 
> ---
>  drivers/thermal/gov_power_allocator.c | 14 ++
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/thermal/gov_power_allocator.c 
> b/drivers/thermal/gov_power_allocator.c
> index d393409fb786..f379f1aaa3b5 100644
> --- a/drivers/thermal/gov_power_allocator.c
> +++ b/drivers/thermal/gov_power_allocator.c
> @@ -571,7 +571,7 @@ static void reset_pid_controller(struct 
> power_allocator_params *params)
>   params->prev_err = 0;
>  }
>  
> -static void allow_maximum_power(struct thermal_zone_device *tz)
> +static void allow_maximum_power(struct thermal_zone_device *tz, bool update)
>  {
>   struct thermal_instance *instance;
>   struct power_allocator_params *params = tz->governor_data;
> @@ -594,9 +594,13 @@ static void allow_maximum_power(struct 
> thermal_zone_device *tz)
>*/
>   cdev->ops->get_requested_power(cdev, _power);
>  
> - instance->cdev->updated = false;
> + if (update)
> + instance->cdev->updated = false;
> +
>   mutex_unlock(>cdev->lock);
> - (instance->cdev);
> +
> + if (update)
> + thermal_cdev_update(instance->cdev);

This cdev update has something bad IMHO. It is protected by a mutex but
the 'updated' field is left unprotected before calling
thermal_cdev_update().

It is not the fault of this code but how the cooling device are updated
and how it interacts with the thermal instances.

IMO, part of the core code needs to revisited.

This change tight a bit more the knot.

Would it make sense to you if we create a function eg.
__thermal_cdev_update()

And then we have:

void thermal_cdev_update(struct thermal_cooling_device *cdev)
{
mutex_lock(>lock);
/* cooling device is updated*/
if (cdev->updated) {
mutex_unlock(>lock);
return;
}

__thermal_cdev_update(cdev);

thermal_cdev_set_cur_state(cdev, target);

cdev->updated = true;
mutex_unlock(>lock);
trace_cdev_update(cdev, target);
dev_dbg(>device, "set to state %lu\n", target);
}

And in this file we do instead:

-   instance->cdev->updated = false;
+   if (update)
+   __thermal_cdev_update(instance->cdev);
mutex_unlock(>cdev->lock);
-   thermal_cdev_update(instance->cdev);

>   }
>   mutex_unlock(>lock);
>  }
> @@ -710,6 +714,7 @@ static int power_allocator_throttle(struct 
> thermal_zone_device *tz, int trip)
>   int ret;
>   int switch_on_temp, control_temp;
>   struct power_allocator_params *params = tz->governor_data;
> + bool update;
>  
>   /*
>* We get called for every trip point but we only need to do
> @@ -721,9 +726,10 @@ static int power_allocator_throttle(struct 
> thermal_zone_device *tz, int trip)
>   ret = tz->ops->get_trip_temp(tz, params->trip_switch_on,
>_on_temp);
>   if (!ret && (tz->temperature < switch_on_temp)) {
> + update = (tz->last_temperature >= switch_on_temp);
>   tz->passive = 0;
>   reset_pid_controller(params);
> - allow_maximum_power(tz);
> + allow_maximum_power(tz, update);
>   return 0;
>   }
>  
> 


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH] thermal: mediatek: add sensors-support

2021-04-20 Thread Daniel Lezcano
On 20/03/2021 09:06, Frank Wunderlich wrote:
> From: Frank Wunderlich 
> 
> add HWMON-support to mediateks thermanl driver to allow lm-sensors
> userspace tools read soc temperature
> 
> Signed-off-by: Frank Wunderlich 
> ---
>  drivers/thermal/mtk_thermal.c | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/drivers/thermal/mtk_thermal.c b/drivers/thermal/mtk_thermal.c
> index 149c6d7fd5a0..e22d77d57458 100644
> --- a/drivers/thermal/mtk_thermal.c
> +++ b/drivers/thermal/mtk_thermal.c
> @@ -23,6 +23,8 @@
>  #include 
>  #include 
>  
> +#include "thermal_hwmon.h"
> +
>  /* AUXADC Registers */
>  #define AUXADC_CON1_SET_V0x008
>  #define AUXADC_CON1_CLR_V0x00c
> @@ -983,6 +985,13 @@ static void mtk_thermal_release_periodic_ts(struct 
> mtk_thermal *mt,
>   writel((tmp & (~0x10e)), mt->thermal_base + TEMP_MSRCTL1);
>  }
>  
> +static void mtk_thermal_hwmon_action(void *data)
> +{
> + struct thermal_zone_device *zone = data;
> +
> + thermal_remove_hwmon_sysfs(zone);
> +}
> +
>  static int mtk_thermal_probe(struct platform_device *pdev)
>  {
>   int ret, i, ctrl_id;
> @@ -1087,6 +1096,19 @@ static int mtk_thermal_probe(struct platform_device 
> *pdev)
>   goto err_disable_clk_peri_therm;
>   }
>  
> +#ifdef CONFIG_THERMAL_HWMON

No #ifdef in C file.

> + tzdev->tzp->no_hwmon = false;
> + ret = thermal_add_hwmon_sysfs(tzdev);
> + if (ret)
> + dev_err(>dev, "error in thermal_add_hwmon_sysfs");
> +
> + ret = devm_add_action(>dev, mtk_thermal_hwmon_action, tzdev);

devm_thermal_add_hwmon_sysfs() ?

> + if (ret) {
> + dev_err(>dev, "error in devm_add_action");
> + mtk_thermal_hwmon_action(tzdev);
> + }
> +#endif
> +
>   return 0;
>  
>  err_disable_clk_peri_therm:
> 


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH] [v4, 1/1] clocksource/drivers/timer-mediatek: optimize systimer irq clear flow on shutdown

2021-04-19 Thread Daniel Lezcano
On 09/04/2021 11:22, Fengquan Chen wrote:
> mtk_syst_clkevt_shutdown is called after irq disabled in suspend flow,
> clear any pending systimer irq when shutdown to avoid suspend aborted
> due to timer irq pending
> 
> Also as for systimer in mediatek socs, there must be firstly enable
> timer before clear systimer irq
> 
> Fixes: e3af677607d9("clocksource/drivers/timer-mediatek: Add support for 
> system timer")
> Signed-off-by: Fengquan Chen 
> 
> ---
>  drivers/clocksource/timer-mediatek.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/clocksource/timer-mediatek.c 
> b/drivers/clocksource/timer-mediatek.c
> index 9318edc..6461fd3 100644
> --- a/drivers/clocksource/timer-mediatek.c
> +++ b/drivers/clocksource/timer-mediatek.c
> @@ -60,9 +60,9 @@
>   * SYST_CON_EN: Clock enable. Shall be set to
>   *   - Start timer countdown.
>   *   - Allow timeout ticks being updated.
> - *   - Allow changing interrupt functions.
> + *   - Allow changing interrupt status,like clear irq pending.
>   *
> - * SYST_CON_IRQ_EN: Set to allow interrupt.
> + * SYST_CON_IRQ_EN: Set to enable interrupt.
>   *
>   * SYST_CON_IRQ_CLR: Set to clear interrupt.
>   */
> @@ -75,6 +75,7 @@
>  static void mtk_syst_ack_irq(struct timer_of *to)
>  {
>   /* Clear and disable interrupt */
> + writel(SYST_CON_EN, SYST_CON_REG(to));
>   writel(SYST_CON_IRQ_CLR | SYST_CON_EN, SYST_CON_REG(to));
>  }

IIRC, there is a hardware issue here. If it is the case, please describe
it and refer to an errata if any.

Also Evan Benn commented your code and asked a couple of questions [1],
please answer before reposting a new version.

Comments ignored == patch ignored

> @@ -111,6 +112,9 @@ static int mtk_syst_clkevt_next_event(unsigned long ticks,
>  
>  static int mtk_syst_clkevt_shutdown(struct clock_event_device *clkevt)
>  {
> + /* Clear any irq */
> + mtk_syst_ack_irq(to_timer_of(clkevt));
> +
>   /* Disable timer */
>   writel(0, SYST_CON_REG(to_timer_of(clkevt)));

Please check out the patch sent by Evan Benn [2], if you agree, ack it.

Thanks

  -- Daniel

[1] https://patchwork.kernel.org/comment/24059277/
[2]
https://lore.kernel.org/linux-arm-kernel/20210412132200.v3.1.I1d9917047de06715da16e1620759f703fcfdcbcb@changeid/

-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v3] drivers/clocksource/mediatek: Ack and disable interrupts on suspend

2021-04-19 Thread Daniel Lezcano
On 12/04/2021 05:22, Evan Benn wrote:
> Interrupts are disabled during suspend before this driver disables its
> timers. ARM trusted firmware will abort suspend if the timer irq is
> pending, so ack and disable the timer interrupt during suspend.
> 
> Signed-off-by: Evan Benn 
> ---
> 
> Changes in v3:
> Move the ACK from the shutdown to the suspend function.
> 
> Changes in v2:
> Remove the patch that splits the drivers into 2 files.
> 
>  drivers/clocksource/timer-mediatek.c | 23 +++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/drivers/clocksource/timer-mediatek.c 
> b/drivers/clocksource/timer-mediatek.c
> index 9318edcd8963..1ae8fee639bf 100644
> --- a/drivers/clocksource/timer-mediatek.c
> +++ b/drivers/clocksource/timer-mediatek.c
> @@ -241,6 +241,27 @@ static void mtk_gpt_enable_irq(struct timer_of *to, u8 
> timer)
>  timer_of_base(to) + GPT_IRQ_EN_REG);
>  }
>  
> +static void mtk_gpt_resume(struct clock_event_device *clk)
> +{
> + struct timer_of *to = to_timer_of(clk);
> +
> + mtk_gpt_enable_irq(to, TIMER_CLK_EVT);
> +}
> +
> +static void mtk_gpt_suspend(struct clock_event_device *clk)
> +{
> + struct timer_of *to = to_timer_of(clk);
> +
> + /* Disable all interrupts */
> + writel(0x0, timer_of_base(to) + GPT_IRQ_EN_REG);
> +
> + /* This is called with interrupts disabled,

Please fix the comment style:

/*
 * This is called with interrupts disabled
 * so ...
 * ...
 */

Other than than that I'll be happy to pick it.

> +  * so we need to ack any interrupt that is pending
> +  * Or for example ATF will prevent a suspend from completing.
> +  */
> + writel(0x3f, timer_of_base(to) + GPT_IRQ_ACK_REG);
> +}
> +
>  static struct timer_of to = {
>   .flags = TIMER_OF_IRQ | TIMER_OF_BASE | TIMER_OF_CLOCK,
>  
> @@ -286,6 +307,8 @@ static int __init mtk_gpt_init(struct device_node *node)
>   to.clkevt.set_state_oneshot = mtk_gpt_clkevt_shutdown;
>   to.clkevt.tick_resume = mtk_gpt_clkevt_shutdown;
>   to.clkevt.set_next_event = mtk_gpt_clkevt_next_event;
> + to.clkevt.suspend = mtk_gpt_suspend;
> + to.clkevt.resume = mtk_gpt_resume;
>   to.of_irq.handler = mtk_gpt_interrupt;
>  
>   ret = timer_of_init(node, );
> 


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v7 2/9] reboot: thermal: Export hardware protection shutdown

2021-04-16 Thread Daniel Lezcano
On 14/04/2021 07:52, Matti Vaittinen wrote:
> Thermal core contains a logic for safety shutdown. System is attempted to
> be powered off if temperature exceeds safety limits.
> 
> Currently this can be also utilized by regulator subsystem as a final
> protection measure if PMICs report dangerous over-voltage, over-current or
> over-temperature and if per regulator counter measures fail or do not
> exist.
> 
> Move this logic to kernel/reboot.c and export the functionality for other
> subsystems to use. Also replace the mutex with a spinlock to allow using
> the function from any context.
> 
> Also the EMIF bus code has implemented a safety shut-down. EMIF does not
> attempt orderly_poweroff at all. Thus the EMIF code is not converted to use
> this new function.
> 
> Signed-off-by: Matti Vaittinen 
> ---
> Changelog
>  v7:
>   - new patch
> 
> Please note - this patch has received only a minimal amount of testing.
> (The new API call was tested to shut-down my system at driver probe but
> no odd corner-cases have been tested).
> 
> Any testing for thermal shutdown is appreciated.
> ---
>  drivers/thermal/thermal_core.c | 63 ++---
>  include/linux/reboot.h |  1 +
>  kernel/reboot.c| 86 ++

Please send a patch implementing the reboot/shutdown and then another
one replacing the thermal shutdown code by a call to the new API.

>  3 files changed, 91 insertions(+), 59 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 996c038f83a4..b1444845af38 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -36,10 +36,8 @@ static LIST_HEAD(thermal_governor_list);
>  
>  static DEFINE_MUTEX(thermal_list_lock);
>  static DEFINE_MUTEX(thermal_governor_lock);
> -static DEFINE_MUTEX(poweroff_lock);
>  
>  static atomic_t in_suspend;
> -static bool power_off_triggered;
>  
>  static struct thermal_governor *def_governor;
>  
> @@ -327,70 +325,18 @@ static void handle_non_critical_trips(struct 
> thermal_zone_device *tz, int trip)
>  def_governor->throttle(tz, trip);
>  }
>  
> -/**
> - * thermal_emergency_poweroff_func - emergency poweroff work after a known 
> delay
> - * @work: work_struct associated with the emergency poweroff function
> - *
> - * This function is called in very critical situations to force
> - * a kernel poweroff after a configurable timeout value.
> - */
> -static void thermal_emergency_poweroff_func(struct work_struct *work)
> -{
> - /*
> -  * We have reached here after the emergency thermal shutdown
> -  * Waiting period has expired. This means orderly_poweroff has
> -  * not been able to shut off the system for some reason.
> -  * Try to shut down the system immediately using kernel_power_off
> -  * if populated
> -  */
> - WARN(1, "Attempting kernel_power_off: Temperature too high\n");
> - kernel_power_off();
> -
> - /*
> -  * Worst of the worst case trigger emergency restart
> -  */
> - WARN(1, "Attempting emergency_restart: Temperature too high\n");
> - emergency_restart();
> -}
> -
> -static DECLARE_DELAYED_WORK(thermal_emergency_poweroff_work,
> - thermal_emergency_poweroff_func);
> -
> -/**
> - * thermal_emergency_poweroff - Trigger an emergency system poweroff
> - *
> - * This may be called from any critical situation to trigger a system 
> shutdown
> - * after a known period of time. By default this is not scheduled.
> - */
> -static void thermal_emergency_poweroff(void)
> +void thermal_zone_device_critical(struct thermal_zone_device *tz)
>  {
> - int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS;
>   /*
>* poweroff_delay_ms must be a carefully profiled positive value.
> -  * Its a must for thermal_emergency_poweroff_work to be scheduled
> +  * Its a must for forced_emergency_poweroff_work to be scheduled.
>*/
> - if (poweroff_delay_ms <= 0)
> - return;
> - schedule_delayed_work(_emergency_poweroff_work,
> -   msecs_to_jiffies(poweroff_delay_ms));
> -}
> + int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS;
>  
> -void thermal_zone_device_critical(struct thermal_zone_device *tz)
> -{
>   dev_emerg(>device, "%s: critical temperature reached, "
> "shutting down\n", tz->type);
>  
> - mutex_lock(_lock);
> - if (!power_off_triggered) {
> - /*
> -  * Queue a backup emergency shutdown in the event of
> -  * orderly_poweroff failure
> -  */
> - thermal_emergency_poweroff();
> - orderly_poweroff(true);
> - power_off_triggered = true;
> - }
> - mutex_unlock(_lock);
> + hw_protection_shutdown("Temperature too high", poweroff_delay_ms);
>  }
>  EXPORT_SYMBOL(thermal_zone_device_critical);
>  
> @@ -1549,7 +1495,6 

Re: [PATCH v7 2/9] reboot: thermal: Export hardware protection shutdown

2021-04-16 Thread Daniel Lezcano
On 14/04/2021 07:52, Matti Vaittinen wrote:
> Thermal core contains a logic for safety shutdown. System is attempted to
> be powered off if temperature exceeds safety limits.
> 
> Currently this can be also utilized by regulator subsystem as a final
> protection measure if PMICs report dangerous over-voltage, over-current or
> over-temperature and if per regulator counter measures fail or do not
> exist.
> 
> Move this logic to kernel/reboot.c and export the functionality for other
> subsystems to use. Also replace the mutex with a spinlock to allow using
> the function from any context.
> 
> Also the EMIF bus code has implemented a safety shut-down. EMIF does not
> attempt orderly_poweroff at all. Thus the EMIF code is not converted to use
> this new function.
> 
> Signed-off-by: Matti Vaittinen 
> ---
> Changelog
>  v7:
>   - new patch
> 
> Please note - this patch has received only a minimal amount of testing.
> (The new API call was tested to shut-down my system at driver probe but
> no odd corner-cases have been tested).
> 
> Any testing for thermal shutdown is appreciated.

You can test it easily by enabling the option CONFIG_THERMAL_EMULATION

Then in any thermal zone:

Assuming the critical temp is below the one specified in the command:

echo 10 > /sys/class/thermal/thermal_zone0/emul_temp

> ---
>  drivers/thermal/thermal_core.c | 63 ++---
>  include/linux/reboot.h |  1 +
>  kernel/reboot.c| 86 ++
>  3 files changed, 91 insertions(+), 59 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 996c038f83a4..b1444845af38 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -36,10 +36,8 @@ static LIST_HEAD(thermal_governor_list);
>  
>  static DEFINE_MUTEX(thermal_list_lock);
>  static DEFINE_MUTEX(thermal_governor_lock);
> -static DEFINE_MUTEX(poweroff_lock);
>  
>  static atomic_t in_suspend;
> -static bool power_off_triggered;
>  
>  static struct thermal_governor *def_governor;
>  
> @@ -327,70 +325,18 @@ static void handle_non_critical_trips(struct 
> thermal_zone_device *tz, int trip)
>  def_governor->throttle(tz, trip);
>  }
>  
> -/**
> - * thermal_emergency_poweroff_func - emergency poweroff work after a known 
> delay
> - * @work: work_struct associated with the emergency poweroff function
> - *
> - * This function is called in very critical situations to force
> - * a kernel poweroff after a configurable timeout value.
> - */
> -static void thermal_emergency_poweroff_func(struct work_struct *work)
> -{
> - /*
> -  * We have reached here after the emergency thermal shutdown
> -  * Waiting period has expired. This means orderly_poweroff has
> -  * not been able to shut off the system for some reason.
> -  * Try to shut down the system immediately using kernel_power_off
> -  * if populated
> -  */
> - WARN(1, "Attempting kernel_power_off: Temperature too high\n");
> - kernel_power_off();
> -
> - /*
> -  * Worst of the worst case trigger emergency restart
> -  */
> - WARN(1, "Attempting emergency_restart: Temperature too high\n");
> - emergency_restart();
> -}
> -
> -static DECLARE_DELAYED_WORK(thermal_emergency_poweroff_work,
> - thermal_emergency_poweroff_func);
> -
> -/**
> - * thermal_emergency_poweroff - Trigger an emergency system poweroff
> - *
> - * This may be called from any critical situation to trigger a system 
> shutdown
> - * after a known period of time. By default this is not scheduled.
> - */
> -static void thermal_emergency_poweroff(void)
> +void thermal_zone_device_critical(struct thermal_zone_device *tz)
>  {
> - int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS;
>   /*
>* poweroff_delay_ms must be a carefully profiled positive value.
> -  * Its a must for thermal_emergency_poweroff_work to be scheduled
> +  * Its a must for forced_emergency_poweroff_work to be scheduled.
>*/
> - if (poweroff_delay_ms <= 0)
> - return;
> - schedule_delayed_work(_emergency_poweroff_work,
> -   msecs_to_jiffies(poweroff_delay_ms));
> -}
> + int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS;
>  
> -void thermal_zone_device_critical(struct thermal_zone_device *tz)
> -{
>   dev_emerg(>device, "%s: critical temperature reached, "
> "shutting down\n", tz->type);
>  
> - mutex_lock(_lock);
> - if (!power_off_triggered) {
> - /*
> -  * Queue a backup emergency shutdown in the event of
> -  * orderly_poweroff failure
> -  */
> - thermal_emergency_poweroff();
> - orderly_poweroff(true);
> - power_off_triggered = true;
> - }
> - mutex_unlock(_lock);
> + hw_protection_shutdown("Temperature too high", 

Re: [v7,3/3] thermal: mediatek: add another get_temp ops for thermal sensors

2021-04-09 Thread Daniel Lezcano
On 16/03/2021 08:01, Michael Kao wrote:
> Provide thermal zone to read thermal sensor
> in the SoC. We can read all the thermal sensors
> value in the SoC by the node /sys/class/thermal/
> 
> In mtk_thermal_bank_temperature, return -EAGAIN instead of -EACCESS
> on the first read of sensor that often are bogus values.
> This can avoid following warning on boot:
> 
>   thermal thermal_zone6: failed to read out thermal zone (-13)

This patch is changing more things than described in the changelog.

Is it possible to share some technical details about how the sensor(s)
are working or point to some documentation if any ? and possibly the
layout ?

IIUC there is a fake thermal zone zero with the purpose of aggregating
all the other sensors by taking the max temperature of all the sensors.

This patch adds a thermal zone per sensor, and each sensor is per CPU.
CPU0 being actually the max of all the other sensors, right ?


> Signed-off-by: Michael Kao 
> Signed-off-by: Hsin-Yi Wang 
> ---
>  drivers/thermal/mtk_thermal.c | 100 +-
>  1 file changed, 75 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/thermal/mtk_thermal.c b/drivers/thermal/mtk_thermal.c
> index 149c6d7fd5a0..57e4f08a947e 100644
> --- a/drivers/thermal/mtk_thermal.c
> +++ b/drivers/thermal/mtk_thermal.c
> @@ -245,6 +245,11 @@ enum mtk_thermal_version {
>  
>  struct mtk_thermal;
>  
> +struct mtk_thermal_zone {
> + struct mtk_thermal *mt;
> + int id;
> +};
> +
>  struct thermal_bank_cfg {
>   unsigned int num_sensors;
>   const int *sensors;
> @@ -637,6 +642,30 @@ static void mtk_thermal_put_bank(struct mtk_thermal_bank 
> *bank)
>   mutex_unlock(>lock);
>  }
>  
> +static u32 _get_sensor_temp(struct mtk_thermal *mt, int id)
> +{
> + u32 raw;
> + int temp;
> +
> + raw = readl(mt->thermal_base + mt->conf->msr[id]);
> +
> + if (mt->conf->version == MTK_THERMAL_V1)
> + temp = raw_to_mcelsius_v1(mt, id, raw);
> + else
> + temp = raw_to_mcelsius_v2(mt, id, raw);
> +
> + /*
> +  * The first read of a sensor often contains very high bogus
> +  * temperature value. Filter these out so that the system does
> +  * not immediately shut down.
> +  */
> +
> + if (temp > 20)
> + return -EAGAIN;
> + else
> + return temp;
> +}
> +
>  /**
>   * mtk_thermal_bank_temperature - get the temperature of a bank
>   * @bank:The bank
> @@ -647,28 +676,11 @@ static void mtk_thermal_put_bank(struct 
> mtk_thermal_bank *bank)
>  static int mtk_thermal_bank_temperature(struct mtk_thermal_bank *bank)
>  {
>   struct mtk_thermal *mt = bank->mt;
> - const struct mtk_thermal_data *conf = mt->conf;
>   int i, temp = INT_MIN, max = INT_MIN;
> - u32 raw;
> -
> - for (i = 0; i < conf->bank_data[bank->id].num_sensors; i++) {
> - raw = readl(mt->thermal_base + conf->msr[i]);
>  
> - if (mt->conf->version == MTK_THERMAL_V1) {
> - temp = raw_to_mcelsius_v1(
> - mt, conf->bank_data[bank->id].sensors[i], raw);
> - } else {
> - temp = raw_to_mcelsius_v2(
> - mt, conf->bank_data[bank->id].sensors[i], raw);
> - }
> + for (i = 0; i < mt->conf->bank_data[bank->id].num_sensors; i++) {
>  
> - /*
> -  * The first read of a sensor often contains very high bogus
> -  * temperature value. Filter these out so that the system does
> -  * not immediately shut down.
> -  */
> - if (temp > 20)
> - temp = 0;
> + temp = _get_sensor_temp(mt, i);
>  
>   if (temp > max)
>   max = temp;
> @@ -679,7 +691,8 @@ static int mtk_thermal_bank_temperature(struct 
> mtk_thermal_bank *bank)
>  
>  static int mtk_read_temp(void *data, int *temperature)
>  {
> - struct mtk_thermal *mt = data;
> + struct mtk_thermal_zone *tz = data;
> + struct mtk_thermal *mt = tz->mt;
>   int i;
>   int tempmax = INT_MIN;
>  
> @@ -698,10 +711,28 @@ static int mtk_read_temp(void *data, int *temperature)
>   return 0;
>  }
>  
> +static int mtk_read_sensor_temp(void *data, int *temperature)
> +{
> + struct mtk_thermal_zone *tz = data;
> + struct mtk_thermal *mt = tz->mt;
> + int id = tz->id - 1;
> +
> + if (id < 0)
> + return -EACCES;
> +
> + *temperature = _get_sensor_temp(mt, id);
> +
> + return 0;
> +}
> +
>  static const struct thermal_zone_of_device_ops mtk_thermal_ops = {
>   .get_temp = mtk_read_temp,
>  };
>  
> +static const struct thermal_zone_of_device_ops mtk_thermal_sensor_ops = {
> + .get_temp = mtk_read_sensor_temp,
> +};
> +
>  static void mtk_thermal_init_bank(struct mtk_thermal *mt, int num,
> u32 apmixed_phys_base, u32 auxadc_phys_base,
>   

Re: [v7,1/3] arm64: dts: mt8183: add thermal zone node

2021-04-09 Thread Daniel Lezcano
On 16/03/2021 08:01, Michael Kao wrote:
> From: "michael.kao" 
> 
> Add thermal zone node to Mediatek MT8183 dts file.
> 
> Evaluate the thermal zone every 500ms while not cooling
> and every 100ms when passive cooling is performed.
> 
> Signed-off-by: Matthias Kaehlcke 
> Signed-off-by: Michael Kao 
> ---
>  arch/arm64/boot/dts/mediatek/mt8183.dtsi | 85 
>  1 file changed, 85 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi 
> b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> index 5b782a4769e7..d3550af06408 100644
> --- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> +++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> @@ -657,6 +657,87 @@
>   status = "disabled";
>   };
>  
> + thermal: thermal@1100b000 {
> + #thermal-sensor-cells = <1>;
> + compatible = "mediatek,mt8183-thermal";
> + reg = <0 0x1100b000 0 0x1000>;
> + clocks = < CLK_INFRA_THERM>,
> +  < CLK_INFRA_AUXADC>;
> + clock-names = "therm", "auxadc";
> + resets = <  MT8183_INFRACFG_AO_THERM_SW_RST>;
> + interrupts = <0 76 IRQ_TYPE_LEVEL_LOW>;
> + mediatek,auxadc = <>;
> + mediatek,apmixedsys = <>;
> + nvmem-cells = <_calibration>;
> + nvmem-cell-names = "calibration-data";
> + };
> +
> + thermal-zones {
> + cpu_thermal: cpu_thermal {
> + polling-delay-passive = <100>;
> + polling-delay = <500>;
> + thermal-sensors = < 0>;
> + sustainable-power = <5000>;
> + };
> +
> + /* The tzts1 ~ tzts6 don't need to polling */
> + /* The tzts1 ~ tzts6 don't need to thermal throttle */
> +
> + tzts1: tzts1 {
> + polling-delay-passive = <0>;
> + polling-delay = <0>;
> + thermal-sensors = < 1>;
> + sustainable-power = <5000>;
> + trips {};
> + cooling-maps {};
> + };

What is the point of defining the sustainable power with no cooling
device associated ?



-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [GIT PULL] cpuidle for v5.13-rc1

2021-04-08 Thread Daniel Lezcano
On 08/04/2021 19:24, Rafael J. Wysocki wrote:
> On Thu, Apr 8, 2021 at 5:10 PM Daniel Lezcano  
> wrote:
>>
>>
>> Hi Rafael,
>>
>> please consider pulling the following change for cpuidle on ARM for
>> v5.13-rc1
>>
>> Thanks
>>
>>   -- Daniel
>>
>>
>> The following changes since commit dde8740bd9b505c58ec8b2277d5d55c6951b7e42:
>>
>>   Merge branch 'acpi-processor-fixes' into linux-next (2021-04-07
>> 19:02:56 +0200)
> 
> Can you please rebase this on 5.12-rc6?  My linux-next branch is
> re-merged on a regular basis.
> 
> Generally speaking, if you want me to pull from a branch, please make
> sure that this branch is based on a commit present in the Linus' tree,
> preferably one of the commits tagged as -rc or a specific merge.
> 

Sure, here is the pull request based on v5.12-rc6 with the signed tag
cpuidle-v5.13-rc1

Thanks

  -- Daniel


The following changes since commit e49d033bddf5b565044e2abe4241353959bc9120:

  Linux 5.12-rc6 (2021-04-04 14:15:36 -0700)

are available in the Git repository at:

  https://git.linaro.org/people/daniel.lezcano/linux tags/cpuidle-v5.13-rc1

for you to fetch changes up to 498ba2a8a2756694b6f357426dbc8a5e6b6c:

  cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration (2021-04-08 19:54:14
+0200)


- Fix the C7 state on the tegra114 by setting the L2-no-flush flag
  unconditionally (Dmitry Osipenko)

- Remove the do_idle firmware call as it is not supported by the ATF
  on tegra SoC (Dmitry Osipenko)

- Add a missing dependency on CONFIG_MMU to prevent linkage error (He
  Ying)


Dmitry Osipenko (2):
  cpuidle: tegra: Fix C7 idling state on Tegra114
  cpuidle: tegra: Remove do_idle firmware call

He Ying (1):
  cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration

 drivers/cpuidle/Kconfig.arm |  2 +-
 drivers/cpuidle/cpuidle-tegra.c | 19 ---
 2 files changed, 5 insertions(+), 16 deletions(-)

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


[GIT PULL] cpuidle for v5.13-rc1

2021-04-08 Thread Daniel Lezcano


Hi Rafael,

please consider pulling the following change for cpuidle on ARM for
v5.13-rc1

Thanks

  -- Daniel


The following changes since commit dde8740bd9b505c58ec8b2277d5d55c6951b7e42:

  Merge branch 'acpi-processor-fixes' into linux-next (2021-04-07
19:02:56 +0200)

are available in the Git repository at:

  https://git.linaro.org/people/daniel.lezcano/linux tags/cpuidle-v5.13-rc1

for you to fetch changes up to 0beffa4e524f3989ac31fe8563348d45a87f7314:

  cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration (2021-04-08 16:49:19
+0200)


- Fix the C7 state on the tegra114 by setting the L2-no-flush flag
  unconditionally (Dmitry Osipenko)

- Remove the do_idle firmware call as it is not supported by the ATF
  on tegra SoC (Dmitry Osipenko)

- Add a missing dependency on CONFIG_MMU to prevent linkage error (He
  Ying)


Dmitry Osipenko (2):
  cpuidle: tegra: Fix C7 idling state on Tegra114
  cpuidle: tegra: Remove do_idle firmware call

He Ying (1):
  cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration

 drivers/cpuidle/Kconfig.arm |  2 +-
 drivers/cpuidle/cpuidle-tegra.c | 19 ---
 2 files changed, 5 insertions(+), 16 deletions(-)

-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


[GIT PULL] timer drivers for v5.13

2021-04-08 Thread Daniel Lezcano


Hi Thomas,

please consider these changes for v5.13

Thanks

  -- Daniel

The following changes since commit d4c7c28806616809e3baa0b7cd8c665516b2726d:

  timekeeping: Allow runtime PM from change_clocksource() (2021-03-29
16:41:59 +0200)

are available in the Git repository at:

  https://git.linaro.org/people/daniel.lezcano/linux tags/timers-v5.13-rc1

for you to fetch changes up to 8120891105ba32b45bc35f7dc07e6d87a8314556:

  dt-bindings: timer: nuvoton,npcm7xx: Add wpcm450-timer (2021-04-08
16:41:20 +0200)


 - Add dt bindings for the wpcm450 and the timer declaration (Jonathan
   Neuschäfer)

 - Add dt bindings for the JZ4760, the timer declaration for the
   ingenic ost and timer (Paul Cercueil)

 - Add dt bindings for the cmt r8a779a0 (Wolfram Sang)

 - Add dt bindings for the cmt r8a77961 (Niklas Söderlund)

 - Add missing dt bindings for the tmu r8a7795, r8a7796, r8a77961, r8a77965,
   r8a77990 and r8a77995 (Niklas Söderlund)

 - Check pending post before writing a new post in register for the
   timer TI dm and add the stopped callback ops to prevent any
   spurious interrupt (Tony Lindgren)

 - Fix return value check at init when calling device_node_to_regmap()
   for the Ingenic OST timer (Wei Yongjun)

 - Fix a trivial typo s/overflw/overflow/ for the pistachio timer (Drew
Fustini)

 - Don't use CMTOUT_IE with R-Car Gen2/3 (Wolfram Sang)

 - Fix rollback when the initialization fails on the dw_apb timer (Dinh
Nguyen)

 - Switch to timer TI dm on dra7 in order to prevent using the bogus
   architected timer which fails to wrap correctly after 388 days (Tony
Lindgren)

 - Add function annotation to optimize memory for the ARM architected
   timer (Jisheng Zhang)


Dinh Nguyen (1):
  clocksource/drivers/dw_apb_timer_of: Add handling for potential
memory leak

Drew Fustini (1):
  clocksource/drivers/pistachio: Fix trivial typo

Jisheng Zhang (1):
  clocksource/drivers/arm_arch_timer: Add __ro_after_init and __init

Jonathan Neuschäfer (2):
  clocksource/drivers/npcm: Add support for WPCM450
  dt-bindings: timer: nuvoton,npcm7xx: Add wpcm450-timer

Niklas Söderlund (2):
  dt-bindings: timer: renesas,tmu: Document missing Gen3 SoCs
  dt-bindings: timer: renesas,cmt: Document R8A77961

Paul Cercueil (3):
  dt-bindings: timer: ingenic: Add compatible strings for JZ4760(B)
  clocksource/drivers/ingenic: Add support for the JZ4760
  clocksource/drivers/ingenic-ost: Add support for the JZ4760B

Tony Lindgren (4):
  clocksource/drivers/timer-ti-dm: Fix posted mode status check order
  clocksource/drivers/timer-ti-dm: Add missing set_state_oneshot_stopped
  clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap
issue
  clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940

Wei Yongjun (1):
  clocksource/drivers/ingenic_ost: Fix return value check in
ingenic_ost_probe()

Wolfram Sang (2):
  dt-bindings: timer: renesas,cmt: Add r8a779a0 CMT support
  clocksource/drivers/sh_cmt: Don't use CMTOUT_IE with R-Car Gen2/3

 .../devicetree/bindings/timer/ingenic,tcu.yaml |  30 ++--
 .../bindings/timer/nuvoton,npcm7xx-timer.txt   |   3 +-
 .../devicetree/bindings/timer/renesas,cmt.yaml |   4 +
 .../devicetree/bindings/timer/renesas,tmu.yaml |   6 +
 arch/arm/boot/dts/dra7-l4.dtsi |   4 +-
 arch/arm/boot/dts/dra7.dtsi|  20 +++
 drivers/clocksource/arm_arch_timer.c   |  23 +--
 drivers/clocksource/dw_apb_timer_of.c  |  26 +++-
 drivers/clocksource/ingenic-ost.c  |   9 +-
 drivers/clocksource/ingenic-timer.c|   2 +
 drivers/clocksource/sh_cmt.c   |   5 +-
 drivers/clocksource/timer-npcm7xx.c|   1 +
 drivers/clocksource/timer-pistachio.c  |   4 +-
 drivers/clocksource/timer-ti-dm-systimer.c | 155
+
 include/linux/cpuhotplug.h |   1 +
 15 files changed, 229 insertions(+), 64 deletions(-)



-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [v3,1/3] thermal: mediatek: Relocate driver to mediatek folder

2021-04-07 Thread Daniel Lezcano
On 07/04/2021 11:08, Michael Kao wrote:
> Hi Maintainers,
>  
> Gentle pin for this patch.
>  
> Thanks

ack, pong



-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH 1/2] thermal: power_allocator: maintain the device statistics from going stale

2021-04-06 Thread Daniel Lezcano
On 06/04/2021 20:38, Lukasz Luba wrote:

[ ... ]

>> But there is still the polling delay because the governor is IPA in this
>> case? There is also an additional trip-point0 which is not a target for
>> a cooling device, just put there to ensure the IPA has enough data when
>> reaching the second trip point which is the target.
> 
> It's just a configuration which was there for years. Some who wants to
> use IPA have to be sure that it has this 2 trip points: switch_on and
> control. If you are talking about a new design, then it's not for this
> patch. The complete re-design of DT thermal zones, sensors, etc is
> a huge topic.

Right it is not for this patch. Just pointing out there is something
wrong from my POV when polling is used for sampling.

[ ... ]



-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH 1/2] thermal: power_allocator: maintain the device statistics from going stale

2021-04-06 Thread Daniel Lezcano


Hi Lukasz,


On 06/04/2021 14:25, Lukasz Luba wrote:
> 
> 
> On 4/6/21 12:24 PM, Daniel Lezcano wrote:
>> On 06/04/2021 12:39, Lukasz Luba wrote:
>>>
>>>
>>> On 4/6/21 11:16 AM, Daniel Lezcano wrote:
>>>> On 06/04/2021 10:44, Lukasz Luba wrote:
>>>>>
>>>>>
>>>>> On 4/2/21 4:54 PM, Daniel Lezcano wrote:
>>>>>> On 31/03/2021 18:33, Lukasz Luba wrote:
>>>>>>> When the temperature is below the first activation trip point the
>>>>>>> cooling
>>>>>>> devices are not checked, so they cannot maintain fresh
>>>>>>> statistics. It
>>>>>>> leads into the situation, when temperature crosses first trip
>>>>>>> point, the
>>>>>>> statistics are stale and show state for very long period.
>>>>>>
>>>>>> Can you elaborate the statistics you are referring to ?
>>>>>>
>>>>>> I can understand the pid controller needs temperature but I don't
>>>>>> understand the statistics with the cooling device.
>>>>>>
>>>>>>
>>>>>
>>>>> The allocate_power() calls cooling_ops->get_requested_power(),
>>>>> which is for CPUs cpufreq_get_requested_power() function.
>>>>> In that cpufreq implementation for !SMP we still has the
>>>>> issue of stale statistics. Viresh, when he introduced the usage
>>>>> of sched_cpu_util(), he fixed that 'long non-meaningful period'
>>>>> of the broken statistics and it can be found since v5.12-rc1.
>>>>>
>>>>> The bug is still there for the !SMP. Look at the way how idle time
>>>>> is calculated in get_load() [1]. It relies on 'idle_time->timestamp'
>>>>> for calculating the period. But when this function is not called,
>>>>> the value can be very far away in time, e.g. a few seconds back,
>>>>> when the last allocate_power() was called.
>>>>>
>>>>> The bug is there for both SMP and !SMP [2] for older kernels, which
>>>>> can
>>>>> be used in Android or ChromeOS. I've been considering to put this
>>>>> simple
>>>>> IPA fix also to some other kernels, because Viresh's change is more
>>>>> a 'feature' and does not cover both platforms.
>>>>
>>>> Ok, so IIUC, the temperature is needed as well as the power to do the
>>>> connection for the pid loop (temp vs power).
>>>>
>>>> I'm trying to figure out how to delegate the mitigation switch
>>>> on/off to
>>>> the core code instead of the governor (and kill tz->passive) but how
>>>> things are implemented for the IPA makes this very difficult.
>>>>
>>>> AFAICT, this fix is not correct.
>>>>
>>>> If the temperature is below the 'switch_on_temp' the passive is set to
>>>> zero and the throttle function is not called anymore (assuming it is
>>>> interrupt mode not polling mode), so the power number is not updated
>>>> also.
>>>
>>> IPA doesn't work well in asynchronous mode, because it needs this fixed
>>> length for the period. I have been experimenting with tsens IRQs and
>>> also both fixed polling + sporadic asynchronous IRQs, trying to fix it
>>> and have 'predictable' IPA, but without a luck.
>>> IPA needs synchronous polling events like we have for high temp e.g.
>>> 100ms and low temp e.g. 1000ms. The asynchronous events are root of
>>> undesirable states (too low or to high) calculated and set for cooling
>>> devices. It's also harder to escape these states with asynchronous
>>> events. I don't recommend using IPA with asynchronous events from IRQs,
>>> for now. It might change in future, though.
>>
>> I understand that but there is the 'switch_on_temp' trip point which is
>> supposed to begin to collect the power values ahead of the
>> 'desired_temp' (aka mitigation trip point / sustainable power).
>>
>>
>>> The patch 2/2 should calm down the unnecessary updates/notifications so
>>> your request.
>>> The longer polling, which we have for temperature below 'switch_on_temp'
>>> (e.g. every 1sec) shouldn't harm the performance of the system, but
>>> definitely makes IPA more predictable and stable.
>>
>> The change I proposed is correct then no ? The polling is still
>> effective.
> 
> In your 

Re: [PATCH 1/2] thermal: power_allocator: maintain the device statistics from going stale

2021-04-06 Thread Daniel Lezcano
On 06/04/2021 12:39, Lukasz Luba wrote:
> 
> 
> On 4/6/21 11:16 AM, Daniel Lezcano wrote:
>> On 06/04/2021 10:44, Lukasz Luba wrote:
>>>
>>>
>>> On 4/2/21 4:54 PM, Daniel Lezcano wrote:
>>>> On 31/03/2021 18:33, Lukasz Luba wrote:
>>>>> When the temperature is below the first activation trip point the
>>>>> cooling
>>>>> devices are not checked, so they cannot maintain fresh statistics. It
>>>>> leads into the situation, when temperature crosses first trip
>>>>> point, the
>>>>> statistics are stale and show state for very long period.
>>>>
>>>> Can you elaborate the statistics you are referring to ?
>>>>
>>>> I can understand the pid controller needs temperature but I don't
>>>> understand the statistics with the cooling device.
>>>>
>>>>
>>>
>>> The allocate_power() calls cooling_ops->get_requested_power(),
>>> which is for CPUs cpufreq_get_requested_power() function.
>>> In that cpufreq implementation for !SMP we still has the
>>> issue of stale statistics. Viresh, when he introduced the usage
>>> of sched_cpu_util(), he fixed that 'long non-meaningful period'
>>> of the broken statistics and it can be found since v5.12-rc1.
>>>
>>> The bug is still there for the !SMP. Look at the way how idle time
>>> is calculated in get_load() [1]. It relies on 'idle_time->timestamp'
>>> for calculating the period. But when this function is not called,
>>> the value can be very far away in time, e.g. a few seconds back,
>>> when the last allocate_power() was called.
>>>
>>> The bug is there for both SMP and !SMP [2] for older kernels, which can
>>> be used in Android or ChromeOS. I've been considering to put this simple
>>> IPA fix also to some other kernels, because Viresh's change is more
>>> a 'feature' and does not cover both platforms.
>>
>> Ok, so IIUC, the temperature is needed as well as the power to do the
>> connection for the pid loop (temp vs power).
>>
>> I'm trying to figure out how to delegate the mitigation switch on/off to
>> the core code instead of the governor (and kill tz->passive) but how
>> things are implemented for the IPA makes this very difficult.
>>
>> AFAICT, this fix is not correct.
>>
>> If the temperature is below the 'switch_on_temp' the passive is set to
>> zero and the throttle function is not called anymore (assuming it is
>> interrupt mode not polling mode), so the power number is not updated
>> also.
> 
> IPA doesn't work well in asynchronous mode, because it needs this fixed
> length for the period. I have been experimenting with tsens IRQs and
> also both fixed polling + sporadic asynchronous IRQs, trying to fix it
> and have 'predictable' IPA, but without a luck.
> IPA needs synchronous polling events like we have for high temp e.g.
> 100ms and low temp e.g. 1000ms. The asynchronous events are root of
> undesirable states (too low or to high) calculated and set for cooling
> devices. It's also harder to escape these states with asynchronous
> events. I don't recommend using IPA with asynchronous events from IRQs,
> for now. It might change in future, though.

I understand that but there is the 'switch_on_temp' trip point which is
supposed to begin to collect the power values ahead of the
'desired_temp' (aka mitigation trip point / sustainable power).


> The patch 2/2 should calm down the unnecessary updates/notifications so
> your request.
> The longer polling, which we have for temperature below 'switch_on_temp'
> (e.g. every 1sec) shouldn't harm the performance of the system, but
> definitely makes IPA more predictable and stable.

The change I proposed is correct then no ? The polling is still effective.

If the IPA needs a sampling, it may be more adequate to separate the
sampling from the polling. So the other governors won't be impacted by
the IPA specific setup, and we do not end up with polling/passive delays
tricks for a governor. The IPA would have its own self-encapsulated
sampling rate and the thermal zone setup won't depend on the governor.

What do you think ?

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH 1/2] thermal: power_allocator: maintain the device statistics from going stale

2021-04-06 Thread Daniel Lezcano
On 06/04/2021 10:44, Lukasz Luba wrote:
> 
> 
> On 4/2/21 4:54 PM, Daniel Lezcano wrote:
>> On 31/03/2021 18:33, Lukasz Luba wrote:
>>> When the temperature is below the first activation trip point the
>>> cooling
>>> devices are not checked, so they cannot maintain fresh statistics. It
>>> leads into the situation, when temperature crosses first trip point, the
>>> statistics are stale and show state for very long period.
>>
>> Can you elaborate the statistics you are referring to ?
>>
>> I can understand the pid controller needs temperature but I don't
>> understand the statistics with the cooling device.
>>
>>
> 
> The allocate_power() calls cooling_ops->get_requested_power(),
> which is for CPUs cpufreq_get_requested_power() function.
> In that cpufreq implementation for !SMP we still has the
> issue of stale statistics. Viresh, when he introduced the usage
> of sched_cpu_util(), he fixed that 'long non-meaningful period'
> of the broken statistics and it can be found since v5.12-rc1.
> 
> The bug is still there for the !SMP. Look at the way how idle time
> is calculated in get_load() [1]. It relies on 'idle_time->timestamp'
> for calculating the period. But when this function is not called,
> the value can be very far away in time, e.g. a few seconds back,
> when the last allocate_power() was called.
> 
> The bug is there for both SMP and !SMP [2] for older kernels, which can
> be used in Android or ChromeOS. I've been considering to put this simple
> IPA fix also to some other kernels, because Viresh's change is more
> a 'feature' and does not cover both platforms.

Ok, so IIUC, the temperature is needed as well as the power to do the
connection for the pid loop (temp vs power).

I'm trying to figure out how to delegate the mitigation switch on/off to
the core code instead of the governor (and kill tz->passive) but how
things are implemented for the IPA makes this very difficult.

AFAICT, this fix is not correct.

If the temperature is below the 'switch_on_temp' the passive is set to
zero and the throttle function is not called anymore (assuming it is
interrupt mode not polling mode), so the power number is not updated also.

My suggestion is to do the following:

ret = tz->ops->get_trip_temp(tz, params->trip_switch_on,

_on_temp);
if (ret)
return ret;

if ((tz->temperature < switch_on_temp)) {

/* Nothing to do */
if (tz->last_temperature < switch_on_temp)
return 0;

/* Switch off */
tz->passive = 0;
reset_pid_controller(params);
allow_maximum_power(tz);
return 0;
}

/* Collect the power numbers */
...


Two birds in a shot.



-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH v14 0/9] Add support for ipq8064 tsens

2021-04-05 Thread Daniel Lezcano
On 04/04/2021 16:48, Ansuel Smith wrote:
> This patchset convert msm8960 to reg_filed, use int_common instead 
> of a custom function and fix wrong tsens get_temp function for msm8960.
> Ipq8064 SoCs tsens driver is based on 8960 tsens driver. Ipq8064 needs
> to be registered as a gcc child as the tsens regs on this platform are
> shared with the controller.
> This is based on work and code here
> https://git.linaro.org/people/amit.kucheria/kernel.git/log/?h=wrk3/tsens-8960-breakage

Applied, the series.

Fixed a minor conflict with patch 9/9 and "dt-bindings: thermal:
qcom-tsens: Add compatible for sm8350"

Thanks

  -- Daniel

-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: linux-next: Fixes tag needs some work in the thermal tree

2021-04-05 Thread Daniel Lezcano
On 06/04/2021 00:35, Stephen Rothwell wrote:
> Hi all,
> 
> In commit
> 
>   5845dd350432 ("thermal/drivers/tsens: Fix missing put_device error")
> 
> Fixes tag
> 
>   Fixes: a7ff82976122 ("drivers: thermal: tsens: Merge tsens-common.c into
> 
> has these problem(s):
> 
>   - Subject has leading but no trailing parentheses
>   - Subject has leading but no trailing quotes
> 
> Please do not split Fixes tages over more than one line.  Also, please
> keep all the commit message tags together at the end of the commit
> message.

Fixed, thanks!

  -- Daniel

-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v2 1/8] dt-bindings: timer: Add compatible for Mediatek MT8195

2021-04-04 Thread Daniel Lezcano
On 29/03/2021 13:52, Matthias Brugger wrote:
> 
> 
> On 19/03/2021 03:34, Seiya Wang wrote:
>> This commit adds dt-binding documentation of timer for Mediatek MT8195 SoC
>> Platform.
>>
>> Signed-off-by: Seiya Wang 
> 
> Applied to v5.12-next/dts64

Usually bindings go through the subsystem maintainer.


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v3 1/2] dt-bindings: thermal: qcom-tsens: Add compatible for sm8350

2021-04-04 Thread Daniel Lezcano
On 24/03/2021 13:43, Robert Foss wrote:
> Add tsens bindings for sm8350.
> 
> Signed-off-by: Robert Foss 
> Reviewed-by: Vinod Koul 
> ---

Applied, thanks


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH 05/14] dt-bindings: timer: nuvoton,npcm7xx: Add wpcm450-timer

2021-04-04 Thread Daniel Lezcano
On 20/03/2021 19:16, Jonathan Neuschäfer wrote:
> Add a compatible string for WPCM450, which has essentially the same
> timer controller.
> 
> Signed-off-by: Jonathan Neuschäfer 
> ---

Applied, thanks


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH][next] thermal/drivers/devfreq_cooling: Fix error return if kasprintf returns NULL

2021-04-04 Thread Daniel Lezcano


Hi Colin,


On 25/03/2021 18:21, Colin King wrote:
> From: Colin Ian King 
> 
> Currently when kasprintf fails and returns NULL, the error return -ENOMEM
> is being assigned to cdev instead of err causing the return via the label
> remove_qos_re to return the incorrect error code. Fix this by explicitly
> setting err before taking the error return path.
> 
> Addresses-Coverity: ("Unused valued")
> Fixes: f8d354e821b2 ("thermal/drivers/devfreq_cooling: Use device name 
> instead of auto-numbering")
> Signed-off-by: Colin Ian King 
> ---

Thanks for your patch. It was already fixed after being reported by
kbuild-test.

  -- Daniel



-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH 1/2] thermal: power_allocator: maintain the device statistics from going stale

2021-04-02 Thread Daniel Lezcano
On 31/03/2021 18:33, Lukasz Luba wrote:
> When the temperature is below the first activation trip point the cooling
> devices are not checked, so they cannot maintain fresh statistics. It
> leads into the situation, when temperature crosses first trip point, the
> statistics are stale and show state for very long period. 

Can you elaborate the statistics you are referring to ?

I can understand the pid controller needs temperature but I don't
understand the statistics with the cooling device.


> This has impact
> on IPA algorithm calculation and wrong decisions. Thus, check the cooling
> devices even when the temperature is low, to refresh these statistics.
> 
> Signed-off-by: Lukasz Luba 
> ---
>  drivers/thermal/gov_power_allocator.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/thermal/gov_power_allocator.c 
> b/drivers/thermal/gov_power_allocator.c
> index 2802a0e13c88..0cbd10cab193 100644
> --- a/drivers/thermal/gov_power_allocator.c
> +++ b/drivers/thermal/gov_power_allocator.c
> @@ -575,15 +575,27 @@ static void allow_maximum_power(struct 
> thermal_zone_device *tz)
>  {
>   struct thermal_instance *instance;
>   struct power_allocator_params *params = tz->governor_data;
> + u32 req_power;
>  
>   mutex_lock(>lock);
>   list_for_each_entry(instance, >thermal_instances, tz_node) {
> + struct thermal_cooling_device *cdev = instance->cdev;
> +
>   if ((instance->trip != params->trip_max_desired_temperature) ||
>   (!cdev_is_power_actor(instance->cdev)))
>   continue;
>  
>   instance->target = 0;
>   mutex_lock(>cdev->lock);
> + /*
> +  * Call for updating the cooling devices local stats and avoid
> +  * periods of dozen of seconds when those have not been
> +  * maintained. The long period would come into the first check
> +  * when lower threshold is crossed. Thus, limit it to single
> +  * one longer polling period.
> +  */
> + cdev->ops->get_requested_power(cdev, _power);
> +
>   instance->cdev->updated = false;
>   mutex_unlock(>cdev->lock);
>   thermal_cdev_update(instance->cdev);
> 


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v2] drivers/clocksource/mediatek: Ack and disable interrupts on shutdown

2021-04-02 Thread Daniel Lezcano


Hi Evan,

On 27/03/2021 02:31, Evan Benn wrote:
> Hi Daniel,
> 
> That is a good point, and I did try that at first and it works fine. I
> uploaded this version because the suspend/resume callbacks were
> undocumented and mostly not used by other clocksource drivers. I
> thought a smaller diff might be preferable.
> I also thought it would be better to shut off the interrupt as soon as
> it is not needed, avoiding any other potential bugs, instead of just
> fixing the one we know about with suspend. I'm not sure how the other
> driver / timer disable flows are intended to work for example
> (shutdown, detach, etc).
> 
> That said I am happy to upload that version if people think it is better.

IMO, it is not in the normal flow to disable/enable the interrupts.

Does this timer belong to an always-on power domain ?



> https://elixir.bootlin.com/linux/latest/source/include/linux/clockchips.h#L120
> 
> 
> On Thu, 25 Mar 2021 at 19:10, Daniel Lezcano  
> wrote:
>>
>> On 25/03/2021 02:35, Evan Benn wrote:
>>> set_state_shutdown is called during system suspend after interrupts have
>>> been disabled. If the timer has fired in the meantime, there will be
>>> a pending IRQ. So we ack that now and disable the timer. Without this
>>> ARM trusted firmware will abort the suspend due to the pending
>>> interrupt.


>>> Now always disable the IRQ in state transitions, and re-enable in
>>> set_periodic and next_event.
>>
>> Why not put add the suspend/resume callbacks and put there the specific
>> code and let the irq untouched in the normal flow ?
>>
>>> Signed-off-by: Evan Benn 
>>> ---
>>>
>>> Changes in v2:
>>> Remove the patch that splits the drivers into 2 files.
>>>
>>>  drivers/clocksource/timer-mediatek.c | 49 +---
>>>  1 file changed, 30 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/drivers/clocksource/timer-mediatek.c 
>>> b/drivers/clocksource/timer-mediatek.c
>>> index 9318edcd8963..fba2f9494d90 100644
>>> --- a/drivers/clocksource/timer-mediatek.c
>>> +++ b/drivers/clocksource/timer-mediatek.c
>>> @@ -132,13 +132,33 @@ static u64 notrace mtk_gpt_read_sched_clock(void)
>>>   return readl_relaxed(gpt_sched_reg);
>>>  }
>>>
>>> +static void mtk_gpt_disable_ack_interrupts(struct timer_of *to, u8 timer)
>>> +{
>>> + u32 val;
>>> +
>>> + /* Disable interrupts */
>>> + val = readl(timer_of_base(to) + GPT_IRQ_EN_REG);
>>> + writel(val & ~GPT_IRQ_ENABLE(timer), timer_of_base(to) +
>>> +GPT_IRQ_EN_REG);
>>> +
>>> + /* Ack interrupts */
>>> + writel(GPT_IRQ_ACK(timer), timer_of_base(to) + GPT_IRQ_ACK_REG);
>>> +}
>>> +
>>>  static void mtk_gpt_clkevt_time_stop(struct timer_of *to, u8 timer)
>>>  {
>>>   u32 val;
>>>
>>> + /* Disable timer */
>>>   val = readl(timer_of_base(to) + GPT_CTRL_REG(timer));
>>>   writel(val & ~GPT_CTRL_ENABLE, timer_of_base(to) +
>>>  GPT_CTRL_REG(timer));
>>> +
>>> + /* This may be called with interrupts disabled,
>>> +  * so we need to ack any interrupt that is pending
>>> +  * Or for example ATF will prevent a suspend from completing.
>>> +  */
>>> + mtk_gpt_disable_ack_interrupts(to, timer);
>>>  }
>>>
>>>  static void mtk_gpt_clkevt_time_setup(struct timer_of *to,
>>> @@ -152,8 +172,10 @@ static void mtk_gpt_clkevt_time_start(struct timer_of 
>>> *to,
>>>  {
>>>   u32 val;
>>>
>>> - /* Acknowledge interrupt */
>>> - writel(GPT_IRQ_ACK(timer), timer_of_base(to) + GPT_IRQ_ACK_REG);
>>> + /* Enable interrupts */
>>> + val = readl(timer_of_base(to) + GPT_IRQ_EN_REG);
>>> + writel(val | GPT_IRQ_ENABLE(timer),
>>> +timer_of_base(to) + GPT_IRQ_EN_REG);
>>>
>>>   val = readl(timer_of_base(to) + GPT_CTRL_REG(timer));
>>>
>>> @@ -226,21 +248,6 @@ __init mtk_gpt_setup(struct timer_of *to, u8 timer, u8 
>>> option)
>>>  timer_of_base(to) + GPT_CTRL_REG(timer));
>>>  }
>>>
>>> -static void mtk_gpt_enable_irq(struct timer_of *to, u8 timer)
>>> -{
>>> - u32 val;
>>> -
>>> - /* Disable all interrupts */
>>> - writel(0x0, timer_of_base(to) + GPT_IRQ_EN_REG);
>>> -
>>> - /* Acknowledge 

Re: [PATCH v6 2/7] powercap/drivers/dtpm: Create a registering system

2021-04-02 Thread Daniel Lezcano
On 02/04/2021 10:02, Greg KH wrote:
> On Fri, Apr 02, 2021 at 12:08:49AM +0200, Daniel Lezcano wrote:
>>
>> Hi Greg,
>>
>> On 01/04/2021 21:28, Greg KH wrote:
>>> On Thu, Apr 01, 2021 at 08:36:49PM +0200, Daniel Lezcano wrote:
>>>> A SoC can be differently structured depending on the platform and the
>>>> kernel can not be aware of all the combinations, as well as the
>>>> specific tweaks for a particular board.
>>>>
>>>> The creation of the hierarchy must be delegated to userspace.
>>>
>>> Why?  Isn't this what DT is for?
>>
>> I've always been told the DT describes the hardware. Here we are more
>> describing a configuration, that is the reason why I've let the
>> userspace to handle that through configfs.
> 
> DT does describe how the hardware configuration is to be used.  You are
> saying that the hardware description does not work and somehow you need
> a magic userspace tool to reconfigure things instead?
> 
>>> What "userspace tool" is going to be created to manage all of this?
>>> Pointers to that codebase?
>>
>> You are certainly aware of most of it but let me give a bit more of context.
> 
> No, I am not aware of it at all, thanks :)
> 
>> The thermal framework has cooling devices which export their 'state', a
>> representation of their performance level, in sysfs. Unfortunately that
>> gives access from the user space to act on the performance as a power
>> limiter in total conflict with the in-kernel thermal governor decisions.
> 
> Why not remove that conflict?

Because the cooling device state should have not be exported its state
in a writable way, that is a weakness different applications are
abusing. Moreover, the thermal framework is not designed for that. It is
not the purpose of the cooling device to be managed as a power limiter
by the userspace.

The powercap framework is there for that.

There are devices registering themselves as a cooling device while there
is no sensor and governor for them just for the sake of acting on their
power via opaque integer values.


>> That is done from thermal daemon the different SoC vendors tweak for
>> their platform. Depending on the application running and identified as a
>> scenario, the daemon acts proactively on the different cooling devices
>> to ensure a skin temperature which is far below the thermal limit of the
>> components.
> 
> So userspace is going to try to manage power settings in order to keep
> thermal values under control?  Has no one learned from our past mistakes
> when we used to have to do this 10 years ago and it turned out so bad
> that it was just baked into the hardware instead?

I agree in a desktop/server environment, that is true but on the
embedded world, things are a bit different because we are on battery,
the cooling devices are passive and the device is carried. This is why
there are different level of thermal control:

 - In the hardware / firmware to protect physically the silicon

 - In the kernel, to protect from a hard reset or reboot. Usually each
of them is represented with a thermal zone (and its sensor) and a
governor. There is a 1:1 relationship. IOW, a governor manage a thermal
zone without any knowledge of the other thermal zones temperature, just
the one it is monitoring and acts on the power (with different
techniques) when the temperature threshold is reached. Its action is
local and it is after crossing a certain temperature for this component.

 - In the userspace, the logic will get notified about which application
is running and can set the power limitation on different devices knowing
the performances are enough for that particular application, that saves
energy and consequently reduce the temperature. Its action is on
multiple places and happens far below the thermal limits. On the other
side, the skin temperature is a particular sensor placed on the
dissipation path and its temperature must be below 45°C (30 minutes skin
contact at 45°C causes a 1st degree burn). It is the results of the
dissipation of all the devices, the logic in userspace can know which
devices to act on to have the overall power dissipation low enough to
stay below this 45°C (eg. reduce camera resolution).  So here userspace
deals with performance, temperature, application profile, etc ... and
abuse the cooling device.

I'm not sure in the ARM/ARM64 embedded ecosystem, creating a complex
hardware mechanism, supported by a firmware is really possible.

In a desktop/server environment, we do not care about having this skin
temperature (and battery savings) aspects.

> {sigh}
> 
>> This usage of the cooling devices hijacked the real purpose of the
>> thermal framework which is to protect the silicon. Nobody to blame,
>> the

Re: [PATCH v6 7/7] powercap/drivers/dtpm: Allow dtpm node device creation through configfs

2021-04-02 Thread Daniel Lezcano
On 01/04/2021 21:37, Greg KH wrote:
> On Thu, Apr 01, 2021 at 08:36:54PM +0200, Daniel Lezcano wrote:

[ ... ]

> Why have you not documented all of this in Documentation/ABI so that we
> can find it later?

You are right, I will write some documentation.


>> Signed-off-by: Daniel Lezcano 
>> ---
>>  drivers/powercap/Kconfig |   8 ++
>>  drivers/powercap/Makefile|   1 +
>>  drivers/powercap/dtpm_configfs.c | 202 +++
>>  include/linux/dtpm.h |   2 +
>>  4 files changed, 213 insertions(+)
>>  create mode 100644 drivers/powercap/dtpm_configfs.c
>>
>> diff --git a/drivers/powercap/Kconfig b/drivers/powercap/Kconfig
>> index 8242e8c5ed77..599b41e4e0a7 100644
>> --- a/drivers/powercap/Kconfig
>> +++ b/drivers/powercap/Kconfig
>> @@ -50,6 +50,14 @@ config DTPM
>>This enables support for the power capping for the dynamic
>>thermal power management userspace engine.
>>  
>> +config DTPM_CONFIGFS
>> +tristate "Dynamic Thermal Power Management configuration via configfs"
>> +depends on DTPM && CONFIGFS_FS
>> +help
>> +  This enables support for creating the DTPM device hierarchy
>> +  via configfs giving the userspace full control of the
>> +  thermal constraint representation.
> 
> Why does this have to be a module, shouldn't it just be part of the DTPM
> code itself?

All options I've seen around are tristate as well as CONFIGFS_FS, so
TBH, I don't know what is the best option.

>> +
>>  config DTPM_CPU
>>  bool "Add CPU power capping based on the energy model"
>>  depends on DTPM && ENERGY_MODEL
>> diff --git a/drivers/powercap/Makefile b/drivers/powercap/Makefile
>> index fabcf388a8d3..519cabc624c3 100644
>> --- a/drivers/powercap/Makefile
>> +++ b/drivers/powercap/Makefile
>> @@ -1,5 +1,6 @@
>>  # SPDX-License-Identifier: GPL-2.0-only
>>  obj-$(CONFIG_DTPM) += dtpm.o
>> +obj-$(CONFIG_DTPM_CONFIGFS) += dtpm_configfs.o
>>  obj-$(CONFIG_DTPM_CPU) += dtpm_cpu.o
>>  obj-$(CONFIG_POWERCAP)  += powercap_sys.o
>>  obj-$(CONFIG_INTEL_RAPL_CORE) += intel_rapl_common.o
>> diff --git a/drivers/powercap/dtpm_configfs.c 
>> b/drivers/powercap/dtpm_configfs.c
>> new file mode 100644
>> index ..b8de71e94fc3
>> --- /dev/null
>> +++ b/drivers/powercap/dtpm_configfs.c
>> @@ -0,0 +1,202 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright 2021 Linaro Limited
>> + *
>> + * Author: Daniel Lezcano 
>> + *
>> + * The DTPM framework defines a set of devices which are power capable.
>> + *
>> + * The configfs allows to create a hierarchy of devices in order
>> + * to reflect the constraints we want to apply to them.
>> + *
>> + * Each dtpm node is created via a mkdir operation in the configfs
>> + * directory. It will create the corresponding dtpm device in the
>> + * sysfs and the 'device' will contain the absolute path to the dtpm
>> + * node in the sysfs, thus allowing to do the connection between the
>> + * created dtpm node in the configfs hierarchy and the dtpm node in
>> + * the powercap framework.
> 
> Tieing configfs and sysfs is very odd, and ripe for problems.  Those are
> independant filesystems with independant devices and paths and object
> lifetimes.  How can you guarantee they all work together ok?

I think my description is not correct. We do not create a sysfs entry,
we create a powercap device. So there is not interaction, just a 1:1
relationship with create/remove and the powercap device.

But I can understand the "path" for the device can be controversial. I
don't see how it can be inconsistent as it refers always to the device
path which does not change but we describe something from one filesystem
to another filesystem.

If it is a problem, it can be removed. Userspace can still do the
connection between the configfs entry name and the device in sysfs by
looking up the name in /sys/devices/virtual/powercap/dtpm/dtpm:*/name.

[ ... ]

>> +static struct config_group *dtpm_cstrn_make_group(struct config_group *grp, 
>> const char *name)
>> +{
>> +struct dtpm *d, *p;
>> +int ret;
>> +
>> +if (dtpm_configfs_exists(cstrn_group, name))
>> +return ERR_PTR(-EEXIST);
>> +
>> +d = dtpm_lookup(name);
>> +if (!d) {
>> +d = kzalloc(sizeof(*d), GFP_KERNEL);
>> +if (!d)
>> +return ERR_PTR(-ENOMEM);
>> +dtpm_init(d, NULL);
>> +}
>> +
>> +c

Re: [PATCH v6 7/7] powercap/drivers/dtpm: Allow dtpm node device creation through configfs

2021-04-02 Thread Daniel Lezcano
On 01/04/2021 21:37, Greg KH wrote:
> On Thu, Apr 01, 2021 at 08:36:54PM +0200, Daniel Lezcano wrote:

[ ... ]

> Why have you not documented all of this in Documentation/ABI so that we
> can find it later?

You are right, I will write some documentation.


>> Signed-off-by: Daniel Lezcano 
>> ---
>>  drivers/powercap/Kconfig |   8 ++
>>  drivers/powercap/Makefile|   1 +
>>  drivers/powercap/dtpm_configfs.c | 202 +++
>>  include/linux/dtpm.h |   2 +
>>  4 files changed, 213 insertions(+)
>>  create mode 100644 drivers/powercap/dtpm_configfs.c
>>
>> diff --git a/drivers/powercap/Kconfig b/drivers/powercap/Kconfig
>> index 8242e8c5ed77..599b41e4e0a7 100644
>> --- a/drivers/powercap/Kconfig
>> +++ b/drivers/powercap/Kconfig
>> @@ -50,6 +50,14 @@ config DTPM
>>This enables support for the power capping for the dynamic
>>thermal power management userspace engine.
>>  
>> +config DTPM_CONFIGFS
>> +tristate "Dynamic Thermal Power Management configuration via configfs"
>> +depends on DTPM && CONFIGFS_FS
>> +help
>> +  This enables support for creating the DTPM device hierarchy
>> +  via configfs giving the userspace full control of the
>> +  thermal constraint representation.
> 
> Why does this have to be a module, shouldn't it just be part of the DTPM
> code itself?

All options I've seen around are tristate as well as CONFIGFS_FS, so
TBH, I don't know what is the best option.

>> +
>>  config DTPM_CPU
>>  bool "Add CPU power capping based on the energy model"
>>  depends on DTPM && ENERGY_MODEL
>> diff --git a/drivers/powercap/Makefile b/drivers/powercap/Makefile
>> index fabcf388a8d3..519cabc624c3 100644
>> --- a/drivers/powercap/Makefile
>> +++ b/drivers/powercap/Makefile
>> @@ -1,5 +1,6 @@
>>  # SPDX-License-Identifier: GPL-2.0-only
>>  obj-$(CONFIG_DTPM) += dtpm.o
>> +obj-$(CONFIG_DTPM_CONFIGFS) += dtpm_configfs.o
>>  obj-$(CONFIG_DTPM_CPU) += dtpm_cpu.o
>>  obj-$(CONFIG_POWERCAP)  += powercap_sys.o
>>  obj-$(CONFIG_INTEL_RAPL_CORE) += intel_rapl_common.o
>> diff --git a/drivers/powercap/dtpm_configfs.c 
>> b/drivers/powercap/dtpm_configfs.c
>> new file mode 100644
>> index ..b8de71e94fc3
>> --- /dev/null
>> +++ b/drivers/powercap/dtpm_configfs.c
>> @@ -0,0 +1,202 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright 2021 Linaro Limited
>> + *
>> + * Author: Daniel Lezcano 
>> + *
>> + * The DTPM framework defines a set of devices which are power capable.
>> + *
>> + * The configfs allows to create a hierarchy of devices in order
>> + * to reflect the constraints we want to apply to them.
>> + *
>> + * Each dtpm node is created via a mkdir operation in the configfs
>> + * directory. It will create the corresponding dtpm device in the
>> + * sysfs and the 'device' will contain the absolute path to the dtpm
>> + * node in the sysfs, thus allowing to do the connection between the
>> + * created dtpm node in the configfs hierarchy and the dtpm node in
>> + * the powercap framework.
> 
> Tieing configfs and sysfs is very odd, and ripe for problems.  Those are
> independant filesystems with independant devices and paths and object
> lifetimes.  How can you guarantee they all work together ok?

I think my description is not correct. We do not create a sysfs entry,
we create a powercap device. So there is not interaction, just a 1:1
relationship with create/remove and the powercap device.

But I can understand the "path" for the device can be controversial. I
don't see how it can be inconsistent as it refers always to the device
path which does not change but we describe something from one filesystem
to another filesystem.

If it is a problem, it can be removed. Userspace can still do the
connection between the configfs entry name and the device in sysfs by
looking up the name in /sys/devices/virtual/powercap/dtpm/dtpm:*/name.

[ ... ]

>> +static struct config_group *dtpm_cstrn_make_group(struct config_group *grp, 
>> const char *name)
>> +{
>> +struct dtpm *d, *p;
>> +int ret;
>> +
>> +if (dtpm_configfs_exists(cstrn_group, name))
>> +return ERR_PTR(-EEXIST);
>> +
>> +d = dtpm_lookup(name);
>> +if (!d) {
>> +d = kzalloc(sizeof(*d), GFP_KERNEL);
>> +if (!d)
>> +return ERR_PTR(-ENOMEM);
>> +dtpm_init(d, NULL);
>> +}
>> +
>> +c

Re: [RESEND PATCH] MAINTAINERS: update thermal CPU cooling section

2021-04-02 Thread Daniel Lezcano
On 02/04/2021 12:25, Lukasz Luba wrote:
> Hi Viresh, Daniel
> 
> On 2/18/21 4:18 AM, Viresh Kumar wrote:
>> On 17-02-21, 11:59, Lukasz Luba wrote:
>>> Update maintainers responsible for CPU cooling on Arm side.
>>>
>>> Signed-off-by: Lukasz Luba 
>>> ---
>>> Hi Daniel,
>>>
>>> Please ignore the previous email and that this change with 'R'.
>>> Javi will ack it later.
>>>
>>> Regards,
>>> Lukasz
>>>
>>>   MAINTAINERS | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index f32ebcff37d2..fe34f56acb0f 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -17774,7 +17774,7 @@ THERMAL/CPU_COOLING
>>>   M:    Amit Daniel Kachhap 
>>>   M:    Daniel Lezcano 
>>>   M:    Viresh Kumar 
>>> -M:    Javi Merino 
>>> +R:    Lukasz Luba 
>>>   L:    linux...@vger.kernel.org
>>>   S:    Supported
>>>   F:    Documentation/driver-api/thermal/cpu-cooling-api.rst
>>
>> Good that we have one more reviewer for this :)
>>
>> Acked-by: Viresh Kumar 
>>
> 
> I believe it has lost somewhere in people mailboxes.
> 
> Thank you Viresh for the ACK.
> 
> Could you Daniel (or you Viresh) take this patch, please?

I was expecting Javi to ack it.


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH v6 2/7] powercap/drivers/dtpm: Create a registering system

2021-04-01 Thread Daniel Lezcano


Hi Greg,

On 01/04/2021 21:28, Greg KH wrote:
> On Thu, Apr 01, 2021 at 08:36:49PM +0200, Daniel Lezcano wrote:
>> A SoC can be differently structured depending on the platform and the
>> kernel can not be aware of all the combinations, as well as the
>> specific tweaks for a particular board.
>>
>> The creation of the hierarchy must be delegated to userspace.
> 
> Why?  Isn't this what DT is for?

I've always been told the DT describes the hardware. Here we are more
describing a configuration, that is the reason why I've let the
userspace to handle that through configfs.

> What "userspace tool" is going to be created to manage all of this?
> Pointers to that codebase?

You are certainly aware of most of it but let me give a bit more of context.

The thermal framework has cooling devices which export their 'state', a
representation of their performance level, in sysfs. Unfortunately that
gives access from the user space to act on the performance as a power
limiter in total conflict with the in-kernel thermal governor decisions.

That is done from thermal daemon the different SoC vendors tweak for
their platform. Depending on the application running and identified as a
scenario, the daemon acts proactively on the different cooling devices
to ensure a skin temperature which is far below the thermal limit of the
components.

This usage of the cooling devices hijacked the real purpose of the
thermal framework which is to protect the silicon. Nobody to blame,
there is no alternative for userspace.

The use case falls under the power limitation framework prerogative and
that is the reason why we provided a new framework to limit the power
based on the powercap framework. The thermal daemon can then use it and
stop abusing the thermal framework.

This DTPM framework allows to read the power consumption and set a power
limit to a device.

While the powercap simple backend gives a single device entry, DTPM
aggregates the different devices power by summing their power and their
limits. The tree representation of the different DTPM nodes describe how
their limits are set and how the power is computed along the different
devices.

For more info, we did a presentation at ELC [1] and Linux PM
microconference [2] and there is an article talking about it [3].


To answer your questions, there is a SoC vendor thermal daemon using
DTPM and there is a tool created to watch the thermal framework and read
the DTPM values, it is available at [4]. It is currently under
development with the goal of doing power rebalancing / capping across
the different nodes when there is a violation of the parent's power limit.



[1]
https://ossna2020.sched.com/event/c3Wf/ideas-for-finer-grained-control-over-your-heat-budget-amit-kucheria-daniel-lezcano-linaro

[2]
https://www.linuxplumbersconf.org/event/7/page/80-accepted-microconferences#power-cr

[3] https://www.linaro.org/blog/using-energy-model-to-stay-in-tdp-budget/

[4] https://git.linaro.org/people/daniel.lezcano/dtpm.git


>> These changes provide a registering mechanism where the different
>> subsystems will initialize their dtpm backends and register with a
>> name the dtpm node in a list.
>>
>> The next changes will provide an userspace interface to create
>> hierarchically the different nodes. Those will be created by name and
>> found via the list filled by the different subsystem.
>>
>> If a specified name is not found in the list, it is assumed to be a
>> virtual node which will have children and the default is to allocate
>> such node.
> 
> So userspace sets the name?
> 
> Why not use the name in the device itself?  I thought I asked that last
> time...

I probably missed it, sorry for that.

When the userspace creates the directory in the configfs, there is a
lookup with the name in the device list name. If it is found, then the
device is used, otherwise a virtual node is created instead, its power
consumption is equal to the sum of the children.

The different drivers register themselves with their name and the
associated dtpm structure. The userspace pick in this list to create a
hierarchy via configfs.

For example, a big.Little system.

- little CPUs power limiter will have the name cpu0-cpufreq
- big CPUs will have the name cpu4-cpufreq
- gpu will have the name ff9a.gpu-devfreq
- charger will have the name power-supply-charge
- DDR memory controller can have the name dmc-devfreq

Userspace may want to create this hierarchy:

soc
 - package
   - cluster0
 - cpu0-cpufreq
   - cluster1
 - ff9a.gpu-devfreq
   - dmc-devfreq
 - battery
   - power-supply-charge

It will do:

mkdir soc (virtual node)
mkdir soc/cluster0 (virtual node)
mkdir soc/cluster0/cpu0-cpufreq (real device)
etc ...

The configfs does not represent the layout of the sensors or the floor
plan of the devices but only the constraints we want to tie togeth

[PATCH v6 6/7] powercap/drivers/dtpm: Export the symbols for the modules

2021-04-01 Thread Daniel Lezcano
The DTPM framework provides a generic API to register devices which
power capable. The devices may be compiled as modules while the
framework is not.

Export the necessary API to let the drivers register themselves.

Signed-off-by: Daniel Lezcano 
---
 drivers/powercap/dtpm.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/powercap/dtpm.c b/drivers/powercap/dtpm.c
index b389bc397cdf..733567bbe0be 100644
--- a/drivers/powercap/dtpm.c
+++ b/drivers/powercap/dtpm.c
@@ -289,6 +289,7 @@ int dtpm_update_power(struct dtpm *dtpm)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(dtpm_update_power);
 
 /**
  * dtpm_release_zone - Cleanup when the node is released
@@ -486,6 +487,7 @@ void dtpm_init(struct dtpm *dtpm, struct dtpm_ops *ops)
dtpm->ops = ops;
}
 }
+EXPORT_SYMBOL_GPL(dtpm_init);
 
 /**
  * dtpm_unregister - Unregister a dtpm node from the hierarchy tree
@@ -500,6 +502,7 @@ void dtpm_unregister(struct dtpm *dtpm)
 
pr_debug("Unregistered dtpm node '%s'\n", dtpm->zone.name);
 }
+EXPORT_SYMBOL_GPL(dtpm_unregister);
 
 /**
  * dtpm_register - Register a dtpm node in the hierarchy tree
@@ -574,6 +577,7 @@ int dtpm_register(const char *name, struct dtpm *dtpm, 
struct dtpm *parent)
 
return 0;
 }
+EXPORT_SYMBOL_GPL(dtpm_register);
 
 static int __init init_dtpm(void)
 {
-- 
2.17.1



[PATCH v6 7/7] powercap/drivers/dtpm: Allow dtpm node device creation through configfs

2021-04-01 Thread Daniel Lezcano
The DTPM framework is built on top of the powercap framework as a new
controller to act on the power of the devices. The approach is to
provide an unified API to do power limitation on devices which are
capable of that with different techniques.

In addition, the DTPM framework allows to create a hierarchical
representation of the devices in order to reflect the dependencies of
the power constraints we want to apply to a group of devices.

The hierarchy can be different for the same platform as it will depend
on the form factor (tablet, notebook, phone, ...), on other components
and/or a policy, and application scenario.

The kernel can not have such knowledge and only the SoC vendor can
setup its platform to fulfill the objectives of its hardware.

This patch adds the ability to create a DTPM hierarchy via
configfs. All DTPM capable devices are registered in a list, and the
creation of a configfs directory with the name of one of this device
will create a DTPM node in the DTPM powercap sysfs. If the name is not
in the list, a virtual node will be created instead. This virtual node
in the DTPM semantic is to aggregate the power characteristics of the
children.

In order to do the connection between the configfs and sysfs easily, a
'device' file contains the path to the corresponding DTPM powercap
node in sysfs (cross filesystems symlink is not supported by
configfs).

In order to not block any new features in the future, the hierarchical
constraints are stored under a top folder 'constraints', so sibling
can be created for other purposes if needed.

When the configfs was populated, the module can not be unloaded until
all the elements in the tree have been removed.

1) Resulting creation via mkdir:

root@rock960:/sys/kernel/config# tree dtpm/
dtpm/
└── constraints
└── platform
├── device
└── soc
├── device
└── pkg
 ├── device
 ├── cpu0-cpufreq
 │   └── device
 ├── cpu4-cpufreq
 │   └── device
 └── panfrost-devfreq
└── device

2) The content of the 'device' file above

root@rock960:/sys/kernel/config# find dtpm/constraints -name "device" -exec cat 
{} \;
/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:0/dtpm:0:0:1
/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:0/dtpm:0:0:0/dtpm:0:0:0:2
/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:0/dtpm:0:0:0/dtpm:0:0:0:1
/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:0/dtpm:0:0:0/dtpm:0:0:0:0
/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:0/dtpm:0:0:0
/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:0
/devices/virtual/powercap/dtpm/dtpm:0

3) The dtpm device creation node is reflected in sysfs:

root@rock960:/sys/devices/virtual/powercap/dtpm# find . -type d | grep dtpm
./dtpm:0
./dtpm:0/power
./dtpm:0/dtpm:0:0
./dtpm:0/dtpm:0:0/power
./dtpm:0/dtpm:0:0/dtpm:0:0:0
./dtpm:0/dtpm:0:0/dtpm:0:0:0/dtpm:0:0:0:1
./dtpm:0/dtpm:0:0/dtpm:0:0:0/dtpm:0:0:0:1/power
./dtpm:0/dtpm:0:0/dtpm:0:0:0/power
./dtpm:0/dtpm:0:0/dtpm:0:0:0/dtpm:0:0:0:2
./dtpm:0/dtpm:0:0/dtpm:0:0:0/dtpm:0:0:0:2/power
./dtpm:0/dtpm:0:0/dtpm:0:0:0/dtpm:0:0:0:0
./dtpm:0/dtpm:0:0/dtpm:0:0:0/dtpm:0:0:0:0/power
./dtpm:0/dtpm:0:0/dtpm:0:0:1
./dtpm:0/dtpm:0:0/dtpm:0:0:1/power

Signed-off-by: Daniel Lezcano 
---
 drivers/powercap/Kconfig |   8 ++
 drivers/powercap/Makefile|   1 +
 drivers/powercap/dtpm_configfs.c | 202 +++
 include/linux/dtpm.h |   2 +
 4 files changed, 213 insertions(+)
 create mode 100644 drivers/powercap/dtpm_configfs.c

diff --git a/drivers/powercap/Kconfig b/drivers/powercap/Kconfig
index 8242e8c5ed77..599b41e4e0a7 100644
--- a/drivers/powercap/Kconfig
+++ b/drivers/powercap/Kconfig
@@ -50,6 +50,14 @@ config DTPM
  This enables support for the power capping for the dynamic
  thermal power management userspace engine.
 
+config DTPM_CONFIGFS
+   tristate "Dynamic Thermal Power Management configuration via configfs"
+   depends on DTPM && CONFIGFS_FS
+   help
+ This enables support for creating the DTPM device hierarchy
+ via configfs giving the userspace full control of the
+ thermal constraint representation.
+
 config DTPM_CPU
bool "Add CPU power capping based on the energy model"
depends on DTPM && ENERGY_MODEL
diff --git a/drivers/powercap/Makefile b/drivers/powercap/Makefile
index fabcf388a8d3..519cabc624c3 100644
--- a/drivers/powercap/Makefile
+++ b/drivers/powercap/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_DTPM) += dtpm.o
+obj-$(CONFIG_DTPM_CONFIGFS) += dtpm_configfs.o
 obj-$(CONFIG_DTPM_CPU) += dtpm_cpu.o
 obj-$(CONFIG_POWERCAP) += powercap_sys.o
 obj-$(CONFIG_INTEL_RAPL_CORE) += intel_rapl_common.o
diff --git a/drivers/powercap/dtpm_configfs.c b/drivers/powercap/dtpm_configfs.c
new file mode 10

[PATCH v6 5/7] powercap/drivers/dtpm: Scale the power with the load

2021-04-01 Thread Daniel Lezcano
Currently the power consumption is based on the current OPP power
assuming the entire performance domain is fully loaded.

That gives very gross power estimation and we can do much better by
using the load to scale the power consumption.

Use the utilization to normalize and scale the power usage over the
max possible power.

Tested on a rock960 with 2 big CPUS, the power consumption estimation
conforms with the expected one.

Before this change:

~$ ~/dhrystone -t 1 -l 1&
~$ cat 
/sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
226

After this change:

~$ ~/dhrystone -t 1 -l 1&
~$ cat 
/sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
113

~$ ~/dhrystone -t 2 -l 1&
~$ cat 
/sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
226

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---

V3:
  - Fixed uninitialized 'cpu' in scaled_power_uw()
V2:
  - Replaced cpumask by em_span_cpus
  - Changed 'util' metrics variable types
  - Optimized utilization scaling power computation
  - Renamed parameter name for scale_pd_power_uw()
---
 drivers/powercap/dtpm_cpu.c | 46 +++--
 1 file changed, 39 insertions(+), 7 deletions(-)

diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index f4092d1b01d7..eae35ae3c42e 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -68,27 +68,59 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 
power_limit)
return power_limit;
 }
 
+static u64 scale_pd_power_uw(struct cpumask *pd_mask, u64 power)
+{
+   unsigned long max = 0, sum_util = 0;
+   int cpu;
+
+   for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
+
+   /*
+* The capacity is the same for all CPUs belonging to
+* the same perf domain, so a single call to
+* arch_scale_cpu_capacity() is enough. However, we
+* need the CPU parameter to be initialized by the
+* loop, so the call ends up in this block.
+*
+* We can initialize 'max' with a cpumask_first() call
+* before the loop but the bits computation is not
+* worth given the arch_scale_cpu_capacity() just
+* returns a value where the resulting assembly code
+* will be optimized by the compiler.
+*/
+   max = arch_scale_cpu_capacity(cpu);
+   sum_util += sched_cpu_util(cpu, max);
+   }
+
+   /*
+* In the improbable case where all the CPUs of the perf
+* domain are offline, 'max' will be zero and will lead to an
+* illegal operation with a zero division.
+*/
+   return max ? (power * ((sum_util << 10) / max)) >> 10 : 0;
+}
+
 static u64 get_pd_power_uw(struct dtpm *dtpm)
 {
struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *pd;
-   struct cpumask cpus;
+   struct cpumask *pd_mask;
unsigned long freq;
-   int i, nr_cpus;
+   int i;
 
pd = em_cpu_get(dtpm_cpu->cpu);
-   freq = cpufreq_quick_get(dtpm_cpu->cpu);
 
-   cpumask_and(, cpu_online_mask, to_cpumask(pd->cpus));
-   nr_cpus = cpumask_weight();
+   pd_mask = em_span_cpus(pd);
+
+   freq = cpufreq_quick_get(dtpm_cpu->cpu);
 
for (i = 0; i < pd->nr_perf_states; i++) {
 
if (pd->table[i].frequency < freq)
continue;
 
-   return pd->table[i].power *
-   MICROWATT_PER_MILLIWATT * nr_cpus;
+   return scale_pd_power_uw(pd_mask, pd->table[i].power *
+MICROWATT_PER_MILLIWATT);
}
 
return 0;
-- 
2.17.1



[PATCH v6 3/7] powercap/drivers/dtpm: Simplify the dtpm table

2021-04-01 Thread Daniel Lezcano
The dtpm table is an array of pointers, that forces the user of the
table to define initdata along with the declaration of the table
entry. It is more efficient to create an array of dtpm structure, so
the declaration of the table entry can be done by initializing the
different fields.

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---
 drivers/powercap/dtpm.c |  4 ++--
 drivers/powercap/dtpm_cpu.c |  4 +++-
 include/linux/dtpm.h| 20 +---
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/powercap/dtpm.c b/drivers/powercap/dtpm.c
index a707cc2965a1..d9aac107c927 100644
--- a/drivers/powercap/dtpm.c
+++ b/drivers/powercap/dtpm.c
@@ -583,7 +583,7 @@ int dtpm_register(const char *name, struct dtpm *dtpm, 
struct dtpm *parent)
 
 static int __init dtpm_init(void)
 {
-   struct dtpm_descr **dtpm_descr;
+   struct dtpm_descr *dtpm_descr;
 
pct = powercap_register_control_type(NULL, "dtpm", NULL);
if (IS_ERR(pct)) {
@@ -592,7 +592,7 @@ static int __init dtpm_init(void)
}
 
for_each_dtpm_table(dtpm_descr)
-   (*dtpm_descr)->init(*dtpm_descr);
+   dtpm_descr->init();
 
return 0;
 }
diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index 9deafd16a197..74b39a1082e5 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -204,7 +204,7 @@ static int cpuhp_dtpm_cpu_online(unsigned int cpu)
return ret;
 }
 
-int dtpm_register_cpu(struct dtpm *parent)
+static int __init dtpm_cpu_init(void)
 {
int ret;
 
@@ -241,3 +241,5 @@ int dtpm_register_cpu(struct dtpm *parent)
 
return 0;
 }
+
+DTPM_DECLARE(dtpm_cpu, dtpm_cpu_init);
diff --git a/include/linux/dtpm.h b/include/linux/dtpm.h
index 577c71d4e098..2ec282d1 100644
--- a/include/linux/dtpm.h
+++ b/include/linux/dtpm.h
@@ -33,25 +33,23 @@ struct dtpm_ops {
void (*release)(struct dtpm *);
 };
 
-struct dtpm_descr;
-
-typedef int (*dtpm_init_t)(struct dtpm_descr *);
+typedef int (*dtpm_init_t)(void);
 
 struct dtpm_descr {
-   struct dtpm *parent;
-   const char *name;
dtpm_init_t init;
 };
 
 /* Init section thermal table */
-extern struct dtpm_descr *__dtpm_table[];
-extern struct dtpm_descr *__dtpm_table_end[];
+extern struct dtpm_descr __dtpm_table[];
+extern struct dtpm_descr __dtpm_table_end[];
 
-#define DTPM_TABLE_ENTRY(name) \
-   static typeof(name) *__dtpm_table_entry_##name  \
-   __used __section("__dtpm_table") = 
+#define DTPM_TABLE_ENTRY(name, __init) \
+   static struct dtpm_descr __dtpm_table_entry_##name  \
+   __used __section("__dtpm_table") = {\
+   .init = __init, \
+   }
 
-#define DTPM_DECLARE(name) DTPM_TABLE_ENTRY(name)
+#define DTPM_DECLARE(name, init)   DTPM_TABLE_ENTRY(name, init)
 
 #define for_each_dtpm_table(__dtpm)\
for (__dtpm = __dtpm_table; \
-- 
2.17.1



[PATCH v6 4/7] powercap/drivers/dtpm: Use container_of instead of a private data field

2021-04-01 Thread Daniel Lezcano
The dtpm framework provides an API to allocate a dtpm node. However
when a backend dtpm driver needs to allocate a dtpm node it must
define its own structure and store the pointer of this structure in
the private field of the dtpm structure.

It is more elegant to use the container_of macro and add the dtpm
structure inside the dtpm backend specific structure. The code will be
able to deal properly with the dtpm structure as a generic entity,
making all this even more self-encapsulated.

The dtpm_alloc() function does no longer make sense as the dtpm
structure will be allocated when allocating the device specific dtpm
structure. The dtpm_init() is provided instead.

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---
 drivers/powercap/dtpm.c | 18 +--
 drivers/powercap/dtpm_cpu.c | 46 ++---
 include/linux/dtpm.h|  3 +--
 3 files changed, 29 insertions(+), 38 deletions(-)

diff --git a/drivers/powercap/dtpm.c b/drivers/powercap/dtpm.c
index d9aac107c927..b389bc397cdf 100644
--- a/drivers/powercap/dtpm.c
+++ b/drivers/powercap/dtpm.c
@@ -473,24 +473,18 @@ static struct powercap_zone_ops zone_ops = {
 };
 
 /**
- * dtpm_alloc - Allocate and initialize a dtpm struct
- * @name: a string specifying the name of the node
- *
- * Return: a struct dtpm pointer, NULL in case of error
+ * dtpm_init - Allocate and initialize a dtpm struct
+ * @dtpm: The dtpm struct pointer to be initialized
+ * @ops: The dtpm device specific ops, NULL for a virtual node
  */
-struct dtpm *dtpm_alloc(struct dtpm_ops *ops)
+void dtpm_init(struct dtpm *dtpm, struct dtpm_ops *ops)
 {
-   struct dtpm *dtpm;
-
-   dtpm = kzalloc(sizeof(*dtpm), GFP_KERNEL);
if (dtpm) {
INIT_LIST_HEAD(>children);
INIT_LIST_HEAD(>sibling);
dtpm->weight = 1024;
dtpm->ops = ops;
}
-
-   return dtpm;
 }
 
 /**
@@ -581,7 +575,7 @@ int dtpm_register(const char *name, struct dtpm *dtpm, 
struct dtpm *parent)
return 0;
 }
 
-static int __init dtpm_init(void)
+static int __init init_dtpm(void)
 {
struct dtpm_descr *dtpm_descr;
 
@@ -596,4 +590,4 @@ static int __init dtpm_init(void)
 
return 0;
 }
-late_initcall(dtpm_init);
+late_initcall(init_dtpm);
diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index 74b39a1082e5..f4092d1b01d7 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -25,16 +25,22 @@
 #include 
 #include 
 
-static DEFINE_PER_CPU(struct dtpm *, dtpm_per_cpu);
-
 struct dtpm_cpu {
+   struct dtpm dtpm;
struct freq_qos_request qos_req;
int cpu;
 };
 
+static DEFINE_PER_CPU(struct dtpm_cpu *, dtpm_per_cpu);
+
+static struct dtpm_cpu *to_dtpm_cpu(struct dtpm *dtpm)
+{
+   return container_of(dtpm, struct dtpm_cpu, dtpm);
+}
+
 static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
 {
-   struct dtpm_cpu *dtpm_cpu = dtpm->private;
+   struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *pd = em_cpu_get(dtpm_cpu->cpu);
struct cpumask cpus;
unsigned long freq;
@@ -64,7 +70,7 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 
power_limit)
 
 static u64 get_pd_power_uw(struct dtpm *dtpm)
 {
-   struct dtpm_cpu *dtpm_cpu = dtpm->private;
+   struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *pd;
struct cpumask cpus;
unsigned long freq;
@@ -90,7 +96,7 @@ static u64 get_pd_power_uw(struct dtpm *dtpm)
 
 static int update_pd_power_uw(struct dtpm *dtpm)
 {
-   struct dtpm_cpu *dtpm_cpu = dtpm->private;
+   struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *em = em_cpu_get(dtpm_cpu->cpu);
struct cpumask cpus;
int nr_cpus;
@@ -111,7 +117,7 @@ static int update_pd_power_uw(struct dtpm *dtpm)
 
 static void pd_release(struct dtpm *dtpm)
 {
-   struct dtpm_cpu *dtpm_cpu = dtpm->private;
+   struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
 
if (freq_qos_request_active(_cpu->qos_req))
freq_qos_remove_request(_cpu->qos_req);
@@ -129,20 +135,19 @@ static struct dtpm_ops dtpm_ops = {
 static int cpuhp_dtpm_cpu_offline(unsigned int cpu)
 {
struct em_perf_domain *pd;
-   struct dtpm *dtpm;
+   struct dtpm_cpu *dtpm_cpu;
 
pd = em_cpu_get(cpu);
if (!pd)
return -EINVAL;
 
-   dtpm = per_cpu(dtpm_per_cpu, cpu);
+   dtpm_cpu = per_cpu(dtpm_per_cpu, cpu);
 
-   return dtpm_update_power(dtpm);
+   return dtpm_update_power(_cpu->dtpm);
 }
 
 static int cpuhp_dtpm_cpu_online(unsigned int cpu)
 {
-   struct dtpm *dtpm;
struct dtpm_cpu *dtpm_cpu;
struct cpufreq_policy *policy;
struct em_perf_domain *pd;
@@ -157,27 +162,23 @@ static int cpuhp_dtpm_cpu_online(unsigned int cpu)
if (!pd)
r

[PATCH v6 1/7] powercap/drivers/dtpm: Encapsulate even more the code

2021-04-01 Thread Daniel Lezcano
In order to increase the self-encapsulation of the dtpm generic code,
the following changes are adding a power update ops to the dtpm
ops. That allows the generic code to call directly the dtpm backend
function to update the power values.

The power update function does compute the power characteristics when
the function is invoked. In the case of the CPUs, the power
consumption depends on the number of online CPUs. The online CPUs mask
is not up to date at CPUHP_AP_ONLINE_DYN state in the tear down
callback. That is the reason why the online / offline are at separate
state. As there is already an existing state for DTPM, this one is
only moved to the DEAD state, so there is no addition of new state
with these changes. The dtpm node is not removed when the cpu is
unplugged.

That simplifies the code for the next changes and results in a more
self-encapsulated code.

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---
V4:
 - Replaced s/sprintf/snprintf/ for the dtpm node name
V2:
 - Updated the changelog with the CPU node not being removed
 - Commented the cpu hotplug callbacks to explain why there are two callbacks
 - Changed 'upt_power_uw' to 'update_power_uw'
 - Removed unused cpumask variable
---
 drivers/powercap/dtpm.c |  54 ++---
 drivers/powercap/dtpm_cpu.c | 148 
 include/linux/cpuhotplug.h  |   2 +-
 include/linux/dtpm.h|   3 +-
 4 files changed, 97 insertions(+), 110 deletions(-)

diff --git a/drivers/powercap/dtpm.c b/drivers/powercap/dtpm.c
index c2185ec5f887..58433b8ef9a1 100644
--- a/drivers/powercap/dtpm.c
+++ b/drivers/powercap/dtpm.c
@@ -116,8 +116,6 @@ static void __dtpm_sub_power(struct dtpm *dtpm)
parent->power_limit -= dtpm->power_limit;
parent = parent->parent;
}
-
-   __dtpm_rebalance_weight(root);
 }
 
 static void __dtpm_add_power(struct dtpm *dtpm)
@@ -130,45 +128,45 @@ static void __dtpm_add_power(struct dtpm *dtpm)
parent->power_limit += dtpm->power_limit;
parent = parent->parent;
}
+}
+
+static int __dtpm_update_power(struct dtpm *dtpm)
+{
+   int ret;
+
+   __dtpm_sub_power(dtpm);
 
-   __dtpm_rebalance_weight(root);
+   ret = dtpm->ops->update_power_uw(dtpm);
+   if (ret)
+   pr_err("Failed to update power for '%s': %d\n",
+  dtpm->zone.name, ret);
+
+   if (!test_bit(DTPM_POWER_LIMIT_FLAG, >flags))
+   dtpm->power_limit = dtpm->power_max;
+
+   __dtpm_add_power(dtpm);
+
+   if (root)
+   __dtpm_rebalance_weight(root);
+
+   return ret;
 }
 
 /**
  * dtpm_update_power - Update the power on the dtpm
  * @dtpm: a pointer to a dtpm structure to update
- * @power_min: a u64 representing the new power_min value
- * @power_max: a u64 representing the new power_max value
  *
  * Function to update the power values of the dtpm node specified in
  * parameter. These new values will be propagated to the tree.
  *
  * Return: zero on success, -EINVAL if the values are inconsistent
  */
-int dtpm_update_power(struct dtpm *dtpm, u64 power_min, u64 power_max)
+int dtpm_update_power(struct dtpm *dtpm)
 {
-   int ret = 0;
+   int ret;
 
mutex_lock(_lock);
-
-   if (power_min == dtpm->power_min && power_max == dtpm->power_max)
-   goto unlock;
-
-   if (power_max < power_min) {
-   ret = -EINVAL;
-   goto unlock;
-   }
-
-   __dtpm_sub_power(dtpm);
-
-   dtpm->power_min = power_min;
-   dtpm->power_max = power_max;
-   if (!test_bit(DTPM_POWER_LIMIT_FLAG, >flags))
-   dtpm->power_limit = power_max;
-
-   __dtpm_add_power(dtpm);
-
-unlock:
+   ret = __dtpm_update_power(dtpm);
mutex_unlock(_lock);
 
return ret;
@@ -436,6 +434,7 @@ int dtpm_register(const char *name, struct dtpm *dtpm, 
struct dtpm *parent)
 
if (dtpm->ops && !(dtpm->ops->set_power_uw &&
   dtpm->ops->get_power_uw &&
+  dtpm->ops->update_power_uw &&
   dtpm->ops->release))
return -EINVAL;
 
@@ -455,7 +454,8 @@ int dtpm_register(const char *name, struct dtpm *dtpm, 
struct dtpm *parent)
root = dtpm;
}
 
-   __dtpm_add_power(dtpm);
+   if (dtpm->ops && !dtpm->ops->update_power_uw(dtpm))
+   __dtpm_add_power(dtpm);
 
pr_info("Registered dtpm node '%s' / %llu-%llu uW, \n",
dtpm->zone.name, dtpm->power_min, dtpm->power_max);
diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index 51c366938acd..f6076de39540 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -14,6 +14,8 @@
  * The CPU hotplug is supported an

[PATCH v6 2/7] powercap/drivers/dtpm: Create a registering system

2021-04-01 Thread Daniel Lezcano
A SoC can be differently structured depending on the platform and the
kernel can not be aware of all the combinations, as well as the
specific tweaks for a particular board.

The creation of the hierarchy must be delegated to userspace.

These changes provide a registering mechanism where the different
subsystems will initialize their dtpm backends and register with a
name the dtpm node in a list.

The next changes will provide an userspace interface to create
hierarchically the different nodes. Those will be created by name and
found via the list filled by the different subsystem.

If a specified name is not found in the list, it is assumed to be a
virtual node which will have children and the default is to allocate
such node.

Cc: Greg KH 
Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---

V6:
  - Added the EXPORT_SYMBOL_GPL
V5:
  - Decrease log level from 'info' to 'debug'
  - Remove the refcount, it is pointless, lifetime cycle is already
handled by the device refcounting. The dtpm node allocator is in
charge of freeing it.
  - Rename the functions to 'dtpm_add, dtpm_del, dtpm_lookup'
  - Fix missing kfrees when deleting the node from the list
V4:
  - Fixed typo in the commit log
V2:
  - Fixed error code path by dropping lock
---
 drivers/powercap/dtpm.c | 124 ++--
 drivers/powercap/dtpm_cpu.c |   8 +--
 include/linux/dtpm.h|   6 ++
 3 files changed, 130 insertions(+), 8 deletions(-)

diff --git a/drivers/powercap/dtpm.c b/drivers/powercap/dtpm.c
index 58433b8ef9a1..a707cc2965a1 100644
--- a/drivers/powercap/dtpm.c
+++ b/drivers/powercap/dtpm.c
@@ -34,6 +34,14 @@ static DEFINE_MUTEX(dtpm_lock);
 static struct powercap_control_type *pct;
 static struct dtpm *root;
 
+struct dtpm_node {
+   const char *name;
+   struct dtpm *dtpm;
+   struct list_head node;
+};
+
+static LIST_HEAD(dtpm_list);
+
 static int get_time_window_us(struct powercap_zone *pcz, int cid, u64 *window)
 {
return -ENOSYS;
@@ -152,6 +160,116 @@ static int __dtpm_update_power(struct dtpm *dtpm)
return ret;
 }
 
+static struct dtpm *__dtpm_lookup(const char *name)
+{
+   struct dtpm_node *node;
+
+   list_for_each_entry(node, _list, node) {
+   if (!strcmp(name, node->name))
+   return node->dtpm;
+   }
+
+   return NULL;
+}
+
+/**
+ * dtpm_lookup - Lookup for a registered dtpm node given its name
+ * @name: the name of the dtpm device
+ *
+ * The function looks up in the list of the registered dtpm
+ * devices. This function must be called to create a dtpm node in the
+ * powercap hierarchy.
+ *
+ * Return: a pointer to a dtpm structure, NULL if not found.
+ */
+struct dtpm *dtpm_lookup(const char *name)
+{
+   struct dtpm *dtpm;
+
+   mutex_lock(_lock);
+   dtpm = __dtpm_lookup(name);
+   mutex_unlock(_lock);
+
+   return dtpm;
+}
+EXPORT_SYMBOL_GPL(dtpm_lookup);
+
+/**
+ * dtpm_add - Add the dtpm in the dtpm list
+ * @name: a name used as an identifier
+ * @dtpm: the dtpm node to be registered
+ *
+ * Stores the dtpm device in a list. The list contains all the devices
+ * which are power capable in terms of limitation and power
+ * consumption measurements. Even if conceptually, a power capable
+ * device won't register itself twice, the function will check if it
+ * was already registered in order to prevent a misuse of the API.
+ *
+ * Return: 0 on success, -EEXIST if the device name is already present
+ * in the list, -ENOMEM in case of memory allocation failure.
+ */
+int dtpm_add(const char *name, struct dtpm *dtpm)
+{
+   struct dtpm_node *node;
+   int ret;
+
+   mutex_lock(_lock);
+
+   ret = -EEXIST;
+   if (__dtpm_lookup(name))
+   goto out_unlock;
+
+   ret = -ENOMEM;
+   node = kzalloc(sizeof(*node), GFP_KERNEL);
+   if (!node)
+   goto out_unlock;
+
+   node->name = kstrdup(name, GFP_KERNEL);
+   if (!node->name) {
+   kfree(node);
+   goto out_unlock;
+   }
+
+   node->dtpm = dtpm;
+
+   list_add(>node, _list);
+
+   ret = 0;
+out_unlock:
+   mutex_unlock(_lock);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(dtpm_add);
+
+/**
+ * dtpm_del - Remove the dtpm device from the list
+ * @name: the dtpm device name to be removed
+ *
+ * Remove the dtpm device from the list of the registered devices.
+ */
+void dtpm_del(const char *name)
+{
+   struct dtpm_node *node;
+
+   mutex_lock(_lock);
+
+   list_for_each_entry(node, _list, node) {
+
+   if (strcmp(name, node->name))
+   continue;
+
+   list_del(>node);
+   kfree(node->name);
+   kfree(node);
+
+   break;
+   }
+
+   mutex_unlock(_lock);
+}
+EXPORT_SYMBOL_GPL(dtpm_del);
+
 /**
  * dtpm_update_power - Update the power on the dtpm
  * @dtpm: a pointer to a dtpm structure to upd

Re: [PATCH] clocksource/arm_arch_timer: add __ro_after_init and __init

2021-04-01 Thread Daniel Lezcano
On 01/04/2021 10:31, Marc Zyngier wrote:
> On Tue, 30 Mar 2021 07:04:44 +0100,
> Jisheng Zhang  wrote:
>>
>> Some functions are not needed after booting, so mark them as __init
>> to move them to the .init section.
>>
>> Some global variables are never modified after init, so can be
>> __ro_after_init.
>>
>> Signed-off-by: Jisheng Zhang 
> 
> Acked-by: Marc Zyngier 
> 
>   M.

Thanks Marc


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v5 2/5] powercap/drivers/dtpm: Create a registering system

2021-04-01 Thread Daniel Lezcano
On 01/04/2021 08:02, Greg KH wrote:
> On Wed, Mar 31, 2021 at 10:46:48PM +0200, Daniel Lezcano wrote:
>>
>> Hi Greg,
>>
>> On 31/03/2021 20:06, Greg KH wrote:
>>> On Wed, Mar 31, 2021 at 01:00:45PM +0200, Daniel Lezcano wrote:
>>>> +struct dtpm *dtpm_lookup(const char *name);
>>>> +
>>>> +int dtpm_add(const char *name, struct dtpm *dtpm);
>>>> +
>>>> +void dtpm_del(const char *name);
>>>
>>> You can not add new kernel apis that have no user.  How do you know if
>>> they actually work or not?  We have no idea as we do not see anyone
>>> using them :(
>>>
>>> So no need to add things with no user, feel free to just drop this patch
>>> until you have one.
>>
>> I've sent a couple of patches [1] on top of the previous series. I'm
>> finishing to respin it against this new one.
>>
>>   -- Daniel
>>
>> [1] https://lkml.org/lkml/2021/3/12/1514
> 
> Please use lore.kernel.org, we do not control lkml and it has been down
> in the past and it's impossible to reply from.
> 
> Please always provide a user of a function in the patch series,
> otherwise you will end up with comments like mine above.

Ok, will do.

Thanks

  -- Daniel


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH v5 2/5] powercap/drivers/dtpm: Create a registering system

2021-03-31 Thread Daniel Lezcano


Hi Greg,

On 31/03/2021 20:06, Greg KH wrote:
> On Wed, Mar 31, 2021 at 01:00:45PM +0200, Daniel Lezcano wrote:
>> +struct dtpm *dtpm_lookup(const char *name);
>> +
>> +int dtpm_add(const char *name, struct dtpm *dtpm);
>> +
>> +void dtpm_del(const char *name);
> 
> You can not add new kernel apis that have no user.  How do you know if
> they actually work or not?  We have no idea as we do not see anyone
> using them :(
> 
> So no need to add things with no user, feel free to just drop this patch
> until you have one.

I've sent a couple of patches [1] on top of the previous series. I'm
finishing to respin it against this new one.

  -- Daniel

[1] https://lkml.org/lkml/2021/3/12/1514


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH v2] thermal: cpufreq_cooling: fix slab OOB issue

2021-03-31 Thread Daniel Lezcano
On 29/12/2020 06:08, Michael Kao wrote:
> From: brian-sy yang 
> 
> Slab OOB issue is scanned by KASAN in cpu_power_to_freq().
> If power is limited below the power of OPP0 in EM table,
> it will cause slab out-of-bound issue with negative array
> index.
> 
> Return the lowest frequency if limited power cannot found
> a suitable OPP in EM table to fix this issue.
> 
> Backtrace:
> [] die+0x104/0x5ac
> [] bug_handler+0x64/0xd0
> [] brk_handler+0x160/0x258
> [] do_debug_exception+0x248/0x3f0
> [] el1_dbg+0x14/0xbc
> [] __kasan_report+0x1dc/0x1e0
> [] kasan_report+0x10/0x20
> [] __asan_report_load8_noabort+0x18/0x28
> [] cpufreq_power2state+0x180/0x43c
> [] power_actor_set_power+0x114/0x1d4
> [] allocate_power+0xaec/0xde0
> [] power_allocator_throttle+0x3ec/0x5a4
> [] handle_thermal_trip+0x160/0x294
> [] thermal_zone_device_check+0xe4/0x154
> [] process_one_work+0x5e4/0xe28
> [] worker_thread+0xa4c/0xfac
> [] kthread+0x33c/0x358
> [] ret_from_fork+0xc/0x18
> 
> Fixes: 371a3bc79c11b ("thermal/drivers/cpufreq_cooling: Fix wrong frequency 
> converted from power")
> Signed-off-by: brian-sy yang 
> Signed-off-by: Michael Kao 
> Reviewed-by: Lukasz Luba 
> ---

Applied, thanks


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH] thermal/drivers/hisi: Use the correct HiSilicon copyright

2021-03-31 Thread Daniel Lezcano
On 30/03/2021 08:45, Hao Fang wrote:
> s/Hisilicon/HiSilicon/g.
> It should use capital S, according to
> https://www.hisilicon.com/en/terms-of-use.
> 
> Signed-off-by: Hao Fang 
> ---

Applied, thanks


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH] thermal: ti-soc-thermal: of_device.h is included twice

2021-03-31 Thread Daniel Lezcano
On 23/03/2021 03:14, Wan Jiabing wrote:
> linux/of_device.h has been included at line 25, so remove 
> the duplicate one at line 35.
> 
> Signed-off-by: Wan Jiabing 
> ---

Thanks for your patch.

It was actually already fixed [1].

   -- D.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git/commit/?h=thermal/next=7440e912b0fe755d80b958a65859ebabb5338cf8


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH] clocksource/arm_arch_timer: add __ro_after_init and __init

2021-03-31 Thread Daniel Lezcano
Hi,


On 30/03/2021 08:04, Jisheng Zhang wrote:
> Some functions are not needed after booting, so mark them as __init
> to move them to the .init section.
> 
> Some global variables are never modified after init, so can be
> __ro_after_init.
> 
> Signed-off-by: Jisheng Zhang 

Mar[ck] ? Any comment on this change ?



> ---
>  drivers/clocksource/arm_arch_timer.c | 23 ---
>  1 file changed, 12 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/clocksource/arm_arch_timer.c 
> b/drivers/clocksource/arm_arch_timer.c
> index d0177824c518..1b885964fb34 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -51,7 +51,7 @@
>  
>  static unsigned arch_timers_present __initdata;
>  
> -static void __iomem *arch_counter_base;
> +static void __iomem *arch_counter_base __ro_after_init;
>  
>  struct arch_timer {
>   void __iomem *base;
> @@ -60,15 +60,16 @@ struct arch_timer {
>  
>  #define to_arch_timer(e) container_of(e, struct arch_timer, evt)
>  
> -static u32 arch_timer_rate;
> -static int arch_timer_ppi[ARCH_TIMER_MAX_TIMER_PPI];
> +static u32 arch_timer_rate __ro_after_init;
> +u32 arch_timer_rate1 __ro_after_init;
> +static int arch_timer_ppi[ARCH_TIMER_MAX_TIMER_PPI] __ro_after_init;
>  
>  static struct clock_event_device __percpu *arch_timer_evt;
>  
> -static enum arch_timer_ppi_nr arch_timer_uses_ppi = ARCH_TIMER_VIRT_PPI;
> -static bool arch_timer_c3stop;
> -static bool arch_timer_mem_use_virtual;
> -static bool arch_counter_suspend_stop;
> +static enum arch_timer_ppi_nr arch_timer_uses_ppi __ro_after_init = 
> ARCH_TIMER_VIRT_PPI;
> +static bool arch_timer_c3stop __ro_after_init;
> +static bool arch_timer_mem_use_virtual __ro_after_init;
> +static bool arch_counter_suspend_stop __ro_after_init;
>  #ifdef CONFIG_GENERIC_GETTIMEOFDAY
>  static enum vdso_clock_mode vdso_default = VDSO_CLOCKMODE_ARCHTIMER;
>  #else
> @@ -76,7 +77,7 @@ static enum vdso_clock_mode vdso_default = 
> VDSO_CLOCKMODE_NONE;
>  #endif /* CONFIG_GENERIC_GETTIMEOFDAY */
>  
>  static cpumask_t evtstrm_available = CPU_MASK_NONE;
> -static bool evtstrm_enable = IS_ENABLED(CONFIG_ARM_ARCH_TIMER_EVTSTREAM);
> +static bool evtstrm_enable __ro_after_init = 
> IS_ENABLED(CONFIG_ARM_ARCH_TIMER_EVTSTREAM);
>  
>  static int __init early_evtstrm_cfg(char *buf)
>  {
> @@ -176,7 +177,7 @@ static notrace u64 arch_counter_get_cntvct(void)
>   * to exist on arm64. arm doesn't use this before DT is probed so even
>   * if we don't have the cp15 accessors we won't have a problem.
>   */
> -u64 (*arch_timer_read_counter)(void) = arch_counter_get_cntvct;
> +u64 (*arch_timer_read_counter)(void) __ro_after_init = 
> arch_counter_get_cntvct;
>  EXPORT_SYMBOL_GPL(arch_timer_read_counter);
>  
>  static u64 arch_counter_read(struct clocksource *cs)
> @@ -925,7 +926,7 @@ static int validate_timer_rate(void)
>   * rate was probed first, and don't verify that others match. If the first 
> node
>   * probed has a clock-frequency property, this overrides the HW register.
>   */
> -static void arch_timer_of_configure_rate(u32 rate, struct device_node *np)
> +static void __init arch_timer_of_configure_rate(u32 rate, struct device_node 
> *np)
>  {
>   /* Who has more than one independent system counter? */
>   if (arch_timer_rate)
> @@ -939,7 +940,7 @@ static void arch_timer_of_configure_rate(u32 rate, struct 
> device_node *np)
>   pr_warn("frequency not available\n");
>  }
>  
> -static void arch_timer_banner(unsigned type)
> +static void __init arch_timer_banner(unsigned type)
>  {
>   pr_info("%s%s%s timer(s) running at %lu.%02luMHz (%s%s%s).\n",
>   type & ARCH_TIMER_TYPE_CP15 ? "cp15" : "",
> 


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


[PATCH v5 4/5] powercap/drivers/dtpm: Use container_of instead of a private data field

2021-03-31 Thread Daniel Lezcano
The dtpm framework provides an API to allocate a dtpm node. However
when a backend dtpm driver needs to allocate a dtpm node it must
define its own structure and store the pointer of this structure in
the private field of the dtpm structure.

It is more elegant to use the container_of macro and add the dtpm
structure inside the dtpm backend specific structure. The code will be
able to deal properly with the dtpm structure as a generic entity,
making all this even more self-encapsulated.

The dtpm_alloc() function does no longer make sense as the dtpm
structure will be allocated when allocating the device specific dtpm
structure. The dtpm_init() is provided instead.

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---
 drivers/powercap/dtpm.c | 24 +--
 drivers/powercap/dtpm_cpu.c | 46 ++---
 include/linux/dtpm.h|  3 +--
 3 files changed, 35 insertions(+), 38 deletions(-)

diff --git a/drivers/powercap/dtpm.c b/drivers/powercap/dtpm.c
index 3bcd8374baf8..03dbabc1ffd0 100644
--- a/drivers/powercap/dtpm.c
+++ b/drivers/powercap/dtpm.c
@@ -190,6 +190,12 @@ struct dtpm *dtpm_lookup(const char *name)
dtpm = __dtpm_lookup(name);
mutex_unlock(_lock);
 
+   if (!dtpm) {
+   dtpm = kzalloc(sizeof(*dtpm), GFP_KERNEL);
+   if (dtpm)
+   dtpm_init(dtpm, NULL);
+   }
+
return dtpm;
 }
 
@@ -470,24 +476,18 @@ static struct powercap_zone_ops zone_ops = {
 };
 
 /**
- * dtpm_alloc - Allocate and initialize a dtpm struct
- * @name: a string specifying the name of the node
- *
- * Return: a struct dtpm pointer, NULL in case of error
+ * dtpm_init - Allocate and initialize a dtpm struct
+ * @dtpm: The dtpm struct pointer to be initialized
+ * @ops: The dtpm device specific ops, NULL for a virtual node
  */
-struct dtpm *dtpm_alloc(struct dtpm_ops *ops)
+void dtpm_init(struct dtpm *dtpm, struct dtpm_ops *ops)
 {
-   struct dtpm *dtpm;
-
-   dtpm = kzalloc(sizeof(*dtpm), GFP_KERNEL);
if (dtpm) {
INIT_LIST_HEAD(>children);
INIT_LIST_HEAD(>sibling);
dtpm->weight = 1024;
dtpm->ops = ops;
}
-
-   return dtpm;
 }
 
 /**
@@ -578,7 +578,7 @@ int dtpm_register(const char *name, struct dtpm *dtpm, 
struct dtpm *parent)
return 0;
 }
 
-static int __init dtpm_init(void)
+static int __init init_dtpm(void)
 {
struct dtpm_descr *dtpm_descr;
 
@@ -593,4 +593,4 @@ static int __init dtpm_init(void)
 
return 0;
 }
-late_initcall(dtpm_init);
+late_initcall(init_dtpm);
diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index 74b39a1082e5..f4092d1b01d7 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -25,16 +25,22 @@
 #include 
 #include 
 
-static DEFINE_PER_CPU(struct dtpm *, dtpm_per_cpu);
-
 struct dtpm_cpu {
+   struct dtpm dtpm;
struct freq_qos_request qos_req;
int cpu;
 };
 
+static DEFINE_PER_CPU(struct dtpm_cpu *, dtpm_per_cpu);
+
+static struct dtpm_cpu *to_dtpm_cpu(struct dtpm *dtpm)
+{
+   return container_of(dtpm, struct dtpm_cpu, dtpm);
+}
+
 static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
 {
-   struct dtpm_cpu *dtpm_cpu = dtpm->private;
+   struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *pd = em_cpu_get(dtpm_cpu->cpu);
struct cpumask cpus;
unsigned long freq;
@@ -64,7 +70,7 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 
power_limit)
 
 static u64 get_pd_power_uw(struct dtpm *dtpm)
 {
-   struct dtpm_cpu *dtpm_cpu = dtpm->private;
+   struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *pd;
struct cpumask cpus;
unsigned long freq;
@@ -90,7 +96,7 @@ static u64 get_pd_power_uw(struct dtpm *dtpm)
 
 static int update_pd_power_uw(struct dtpm *dtpm)
 {
-   struct dtpm_cpu *dtpm_cpu = dtpm->private;
+   struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *em = em_cpu_get(dtpm_cpu->cpu);
struct cpumask cpus;
int nr_cpus;
@@ -111,7 +117,7 @@ static int update_pd_power_uw(struct dtpm *dtpm)
 
 static void pd_release(struct dtpm *dtpm)
 {
-   struct dtpm_cpu *dtpm_cpu = dtpm->private;
+   struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
 
if (freq_qos_request_active(_cpu->qos_req))
freq_qos_remove_request(_cpu->qos_req);
@@ -129,20 +135,19 @@ static struct dtpm_ops dtpm_ops = {
 static int cpuhp_dtpm_cpu_offline(unsigned int cpu)
 {
struct em_perf_domain *pd;
-   struct dtpm *dtpm;
+   struct dtpm_cpu *dtpm_cpu;
 
pd = em_cpu_get(cpu);
if (!pd)
return -EINVAL;
 
-   dtpm = per_cpu(dtpm_per_cpu, cpu);
+   dtpm_cpu = per_cpu(dtpm_per_cpu, cpu);
 
-   return dtpm_update_power(dtpm);
+   return dtpm_update

[PATCH v5 5/5] powercap/drivers/dtpm: Scale the power with the load

2021-03-31 Thread Daniel Lezcano
Currently the power consumption is based on the current OPP power
assuming the entire performance domain is fully loaded.

That gives very gross power estimation and we can do much better by
using the load to scale the power consumption.

Use the utilization to normalize and scale the power usage over the
max possible power.

Tested on a rock960 with 2 big CPUS, the power consumption estimation
conforms with the expected one.

Before this change:

~$ ~/dhrystone -t 1 -l 1&
~$ cat 
/sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
226

After this change:

~$ ~/dhrystone -t 1 -l 1&
~$ cat 
/sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
113

~$ ~/dhrystone -t 2 -l 1&
~$ cat 
/sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
226

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---

V3:
  - Fixed uninitialized 'cpu' in scaled_power_uw()
V2:
  - Replaced cpumask by em_span_cpus
  - Changed 'util' metrics variable types
  - Optimized utilization scaling power computation
  - Renamed parameter name for scale_pd_power_uw()
---
 drivers/powercap/dtpm_cpu.c | 46 +++--
 1 file changed, 39 insertions(+), 7 deletions(-)

diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index f4092d1b01d7..eae35ae3c42e 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -68,27 +68,59 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 
power_limit)
return power_limit;
 }
 
+static u64 scale_pd_power_uw(struct cpumask *pd_mask, u64 power)
+{
+   unsigned long max = 0, sum_util = 0;
+   int cpu;
+
+   for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
+
+   /*
+* The capacity is the same for all CPUs belonging to
+* the same perf domain, so a single call to
+* arch_scale_cpu_capacity() is enough. However, we
+* need the CPU parameter to be initialized by the
+* loop, so the call ends up in this block.
+*
+* We can initialize 'max' with a cpumask_first() call
+* before the loop but the bits computation is not
+* worth given the arch_scale_cpu_capacity() just
+* returns a value where the resulting assembly code
+* will be optimized by the compiler.
+*/
+   max = arch_scale_cpu_capacity(cpu);
+   sum_util += sched_cpu_util(cpu, max);
+   }
+
+   /*
+* In the improbable case where all the CPUs of the perf
+* domain are offline, 'max' will be zero and will lead to an
+* illegal operation with a zero division.
+*/
+   return max ? (power * ((sum_util << 10) / max)) >> 10 : 0;
+}
+
 static u64 get_pd_power_uw(struct dtpm *dtpm)
 {
struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *pd;
-   struct cpumask cpus;
+   struct cpumask *pd_mask;
unsigned long freq;
-   int i, nr_cpus;
+   int i;
 
pd = em_cpu_get(dtpm_cpu->cpu);
-   freq = cpufreq_quick_get(dtpm_cpu->cpu);
 
-   cpumask_and(, cpu_online_mask, to_cpumask(pd->cpus));
-   nr_cpus = cpumask_weight();
+   pd_mask = em_span_cpus(pd);
+
+   freq = cpufreq_quick_get(dtpm_cpu->cpu);
 
for (i = 0; i < pd->nr_perf_states; i++) {
 
if (pd->table[i].frequency < freq)
continue;
 
-   return pd->table[i].power *
-   MICROWATT_PER_MILLIWATT * nr_cpus;
+   return scale_pd_power_uw(pd_mask, pd->table[i].power *
+MICROWATT_PER_MILLIWATT);
}
 
return 0;
-- 
2.17.1



[PATCH v5 2/5] powercap/drivers/dtpm: Create a registering system

2021-03-31 Thread Daniel Lezcano
A SoC can be differently structured depending on the platform and the
kernel can not be aware of all the combinations, as well as the
specific tweaks for a particular board.

The creation of the hierarchy must be delegated to userspace.

These changes provide a registering mechanism where the different
subsystems will initialize their dtpm backends and register with a
name the dtpm node in a list.

The next changes will provide an userspace interface to create
hierarchically the different nodes. Those will be created by name and
found via the list filled by the different subsystem.

If a specified name is not found in the list, it is assumed to be a
virtual node which will have children and the default is to allocate
such node.

Cc: Greg KH 
Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---

V5:
  - Decrease log level from 'info' to 'debug'
  - Remove the refcount, it is pointless, lifetime cycle is already
handled by the device refcounting. The dtpm node allocator is in
charge of freeing it.
  - Rename the functions to 'dtpm_add, dtpm_del, dtpm_lookup'
  - Fix missing kfrees when deleting the node from the list
V4:
  - Fixed typo in the commit log
V2:
  - Fixed error code path by dropping lock
---
 drivers/powercap/dtpm.c | 121 ++--
 drivers/powercap/dtpm_cpu.c |   8 +--
 include/linux/dtpm.h|   6 ++
 3 files changed, 127 insertions(+), 8 deletions(-)

diff --git a/drivers/powercap/dtpm.c b/drivers/powercap/dtpm.c
index 58433b8ef9a1..8df7adeed0cf 100644
--- a/drivers/powercap/dtpm.c
+++ b/drivers/powercap/dtpm.c
@@ -34,6 +34,14 @@ static DEFINE_MUTEX(dtpm_lock);
 static struct powercap_control_type *pct;
 static struct dtpm *root;
 
+struct dtpm_node {
+   const char *name;
+   struct dtpm *dtpm;
+   struct list_head node;
+};
+
+static LIST_HEAD(dtpm_list);
+
 static int get_time_window_us(struct powercap_zone *pcz, int cid, u64 *window)
 {
return -ENOSYS;
@@ -152,6 +160,113 @@ static int __dtpm_update_power(struct dtpm *dtpm)
return ret;
 }
 
+static struct dtpm *__dtpm_lookup(const char *name)
+{
+   struct dtpm_node *node;
+
+   list_for_each_entry(node, _list, node) {
+   if (!strcmp(name, node->name))
+   return node->dtpm;
+   }
+
+   return NULL;
+}
+
+/**
+ * dtpm_lookup - Lookup for a registered dtpm node given its name
+ * @name: the name of the dtpm device
+ *
+ * The function looks up in the list of the registered dtpm
+ * devices. This function must be called to create a dtpm node in the
+ * powercap hierarchy.
+ *
+ * Return: a pointer to a dtpm structure, NULL if not found.
+ */
+struct dtpm *dtpm_lookup(const char *name)
+{
+   struct dtpm *dtpm;
+
+   mutex_lock(_lock);
+   dtpm = __dtpm_lookup(name);
+   mutex_unlock(_lock);
+
+   return dtpm;
+}
+
+/**
+ * dtpm_add - Add the dtpm in the dtpm list
+ * @name: a name used as an identifier
+ * @dtpm: the dtpm node to be registered
+ *
+ * Stores the dtpm device in a list. The list contains all the devices
+ * which are power capable in terms of limitation and power
+ * consumption measurements. Even if conceptually, a power capable
+ * device won't register itself twice, the function will check if it
+ * was already registered in order to prevent a misuse of the API.
+ *
+ * Return: 0 on success, -EEXIST if the device name is already present
+ * in the list, -ENOMEM in case of memory allocation failure.
+ */
+int dtpm_add(const char *name, struct dtpm *dtpm)
+{
+   struct dtpm_node *node;
+   int ret;
+
+   mutex_lock(_lock);
+
+   ret = -EEXIST;
+   if (__dtpm_lookup(name))
+   goto out_unlock;
+
+   ret = -ENOMEM;
+   node = kzalloc(sizeof(*node), GFP_KERNEL);
+   if (!node)
+   goto out_unlock;
+
+   node->name = kstrdup(name, GFP_KERNEL);
+   if (!node->name) {
+   kfree(node);
+   goto out_unlock;
+   }
+
+   node->dtpm = dtpm;
+
+   list_add(>node, _list);
+
+   ret = 0;
+out_unlock:
+   mutex_unlock(_lock);
+
+   return ret;
+}
+
+/**
+ * dtpm_del - Remove the dtpm device from the list
+ * @name: the dtpm device name to be removed
+ *
+ * Remove the dtpm device from the list of the registered devices.
+ */
+void dtpm_del(const char *name)
+{
+   struct dtpm_node *node;
+
+   mutex_lock(_lock);
+
+   list_for_each_entry(node, _list, node) {
+
+   if (strcmp(name, node->name))
+   continue;
+
+   list_del(>node);
+   kfree(node->name);
+   kfree(node);
+
+   break;
+   }
+
+   mutex_unlock(_lock);
+}
+
 /**
  * dtpm_update_power - Update the power on the dtpm
  * @dtpm: a pointer to a dtpm structure to update
@@ -208,8 +323,6 @@ int dtpm_release_zone(struct powercap_zone *pcz)
if (root == dtpm)
root = NULL;
 
-   kf

[PATCH v5 1/5] powercap/drivers/dtpm: Encapsulate even more the code

2021-03-31 Thread Daniel Lezcano
In order to increase the self-encapsulation of the dtpm generic code,
the following changes are adding a power update ops to the dtpm
ops. That allows the generic code to call directly the dtpm backend
function to update the power values.

The power update function does compute the power characteristics when
the function is invoked. In the case of the CPUs, the power
consumption depends on the number of online CPUs. The online CPUs mask
is not up to date at CPUHP_AP_ONLINE_DYN state in the tear down
callback. That is the reason why the online / offline are at separate
state. As there is already an existing state for DTPM, this one is
only moved to the DEAD state, so there is no addition of new state
with these changes. The dtpm node is not removed when the cpu is
unplugged.

That simplifies the code for the next changes and results in a more
self-encapsulated code.

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---
V4:
 - Replaced s/sprintf/snprintf/ for the dtpm node name
V2:
 - Updated the changelog with the CPU node not being removed
 - Commented the cpu hotplug callbacks to explain why there are two callbacks
 - Changed 'upt_power_uw' to 'update_power_uw'
 - Removed unused cpumask variable
---
 drivers/powercap/dtpm.c |  54 ++---
 drivers/powercap/dtpm_cpu.c | 148 
 include/linux/cpuhotplug.h  |   2 +-
 include/linux/dtpm.h|   3 +-
 4 files changed, 97 insertions(+), 110 deletions(-)

diff --git a/drivers/powercap/dtpm.c b/drivers/powercap/dtpm.c
index c2185ec5f887..58433b8ef9a1 100644
--- a/drivers/powercap/dtpm.c
+++ b/drivers/powercap/dtpm.c
@@ -116,8 +116,6 @@ static void __dtpm_sub_power(struct dtpm *dtpm)
parent->power_limit -= dtpm->power_limit;
parent = parent->parent;
}
-
-   __dtpm_rebalance_weight(root);
 }
 
 static void __dtpm_add_power(struct dtpm *dtpm)
@@ -130,45 +128,45 @@ static void __dtpm_add_power(struct dtpm *dtpm)
parent->power_limit += dtpm->power_limit;
parent = parent->parent;
}
+}
+
+static int __dtpm_update_power(struct dtpm *dtpm)
+{
+   int ret;
+
+   __dtpm_sub_power(dtpm);
 
-   __dtpm_rebalance_weight(root);
+   ret = dtpm->ops->update_power_uw(dtpm);
+   if (ret)
+   pr_err("Failed to update power for '%s': %d\n",
+  dtpm->zone.name, ret);
+
+   if (!test_bit(DTPM_POWER_LIMIT_FLAG, >flags))
+   dtpm->power_limit = dtpm->power_max;
+
+   __dtpm_add_power(dtpm);
+
+   if (root)
+   __dtpm_rebalance_weight(root);
+
+   return ret;
 }
 
 /**
  * dtpm_update_power - Update the power on the dtpm
  * @dtpm: a pointer to a dtpm structure to update
- * @power_min: a u64 representing the new power_min value
- * @power_max: a u64 representing the new power_max value
  *
  * Function to update the power values of the dtpm node specified in
  * parameter. These new values will be propagated to the tree.
  *
  * Return: zero on success, -EINVAL if the values are inconsistent
  */
-int dtpm_update_power(struct dtpm *dtpm, u64 power_min, u64 power_max)
+int dtpm_update_power(struct dtpm *dtpm)
 {
-   int ret = 0;
+   int ret;
 
mutex_lock(_lock);
-
-   if (power_min == dtpm->power_min && power_max == dtpm->power_max)
-   goto unlock;
-
-   if (power_max < power_min) {
-   ret = -EINVAL;
-   goto unlock;
-   }
-
-   __dtpm_sub_power(dtpm);
-
-   dtpm->power_min = power_min;
-   dtpm->power_max = power_max;
-   if (!test_bit(DTPM_POWER_LIMIT_FLAG, >flags))
-   dtpm->power_limit = power_max;
-
-   __dtpm_add_power(dtpm);
-
-unlock:
+   ret = __dtpm_update_power(dtpm);
mutex_unlock(_lock);
 
return ret;
@@ -436,6 +434,7 @@ int dtpm_register(const char *name, struct dtpm *dtpm, 
struct dtpm *parent)
 
if (dtpm->ops && !(dtpm->ops->set_power_uw &&
   dtpm->ops->get_power_uw &&
+  dtpm->ops->update_power_uw &&
   dtpm->ops->release))
return -EINVAL;
 
@@ -455,7 +454,8 @@ int dtpm_register(const char *name, struct dtpm *dtpm, 
struct dtpm *parent)
root = dtpm;
}
 
-   __dtpm_add_power(dtpm);
+   if (dtpm->ops && !dtpm->ops->update_power_uw(dtpm))
+   __dtpm_add_power(dtpm);
 
pr_info("Registered dtpm node '%s' / %llu-%llu uW, \n",
dtpm->zone.name, dtpm->power_min, dtpm->power_max);
diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index 51c366938acd..f6076de39540 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -14,6 +14,8 @@
  * The CPU hotplug is supported an

[PATCH v5 3/5] powercap/drivers/dtpm: Simplify the dtpm table

2021-03-31 Thread Daniel Lezcano
The dtpm table is an array of pointers, that forces the user of the
table to define initdata along with the declaration of the table
entry. It is more efficient to create an array of dtpm structure, so
the declaration of the table entry can be done by initializing the
different fields.

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---
 drivers/powercap/dtpm.c |  4 ++--
 drivers/powercap/dtpm_cpu.c |  4 +++-
 include/linux/dtpm.h| 20 +---
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/powercap/dtpm.c b/drivers/powercap/dtpm.c
index 8df7adeed0cf..3bcd8374baf8 100644
--- a/drivers/powercap/dtpm.c
+++ b/drivers/powercap/dtpm.c
@@ -580,7 +580,7 @@ int dtpm_register(const char *name, struct dtpm *dtpm, 
struct dtpm *parent)
 
 static int __init dtpm_init(void)
 {
-   struct dtpm_descr **dtpm_descr;
+   struct dtpm_descr *dtpm_descr;
 
pct = powercap_register_control_type(NULL, "dtpm", NULL);
if (IS_ERR(pct)) {
@@ -589,7 +589,7 @@ static int __init dtpm_init(void)
}
 
for_each_dtpm_table(dtpm_descr)
-   (*dtpm_descr)->init(*dtpm_descr);
+   dtpm_descr->init();
 
return 0;
 }
diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index 9deafd16a197..74b39a1082e5 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -204,7 +204,7 @@ static int cpuhp_dtpm_cpu_online(unsigned int cpu)
return ret;
 }
 
-int dtpm_register_cpu(struct dtpm *parent)
+static int __init dtpm_cpu_init(void)
 {
int ret;
 
@@ -241,3 +241,5 @@ int dtpm_register_cpu(struct dtpm *parent)
 
return 0;
 }
+
+DTPM_DECLARE(dtpm_cpu, dtpm_cpu_init);
diff --git a/include/linux/dtpm.h b/include/linux/dtpm.h
index 577c71d4e098..2ec282d1 100644
--- a/include/linux/dtpm.h
+++ b/include/linux/dtpm.h
@@ -33,25 +33,23 @@ struct dtpm_ops {
void (*release)(struct dtpm *);
 };
 
-struct dtpm_descr;
-
-typedef int (*dtpm_init_t)(struct dtpm_descr *);
+typedef int (*dtpm_init_t)(void);
 
 struct dtpm_descr {
-   struct dtpm *parent;
-   const char *name;
dtpm_init_t init;
 };
 
 /* Init section thermal table */
-extern struct dtpm_descr *__dtpm_table[];
-extern struct dtpm_descr *__dtpm_table_end[];
+extern struct dtpm_descr __dtpm_table[];
+extern struct dtpm_descr __dtpm_table_end[];
 
-#define DTPM_TABLE_ENTRY(name) \
-   static typeof(name) *__dtpm_table_entry_##name  \
-   __used __section("__dtpm_table") = 
+#define DTPM_TABLE_ENTRY(name, __init) \
+   static struct dtpm_descr __dtpm_table_entry_##name  \
+   __used __section("__dtpm_table") = {\
+   .init = __init, \
+   }
 
-#define DTPM_DECLARE(name) DTPM_TABLE_ENTRY(name)
+#define DTPM_DECLARE(name, init)   DTPM_TABLE_ENTRY(name, init)
 
 #define for_each_dtpm_table(__dtpm)\
for (__dtpm = __dtpm_table; \
-- 
2.17.1



Re: [PATCH v2 2/2] thermal: qcom-spmi-temp-alarm: add support for GEN2 rev 1 PMIC peripherals

2021-03-30 Thread Daniel Lezcano
On 30/03/2021 19:49, Guru Das Srinagesh wrote:
> On Tue, Aug 04, 2020 at 11:48:30PM -0700, Stephen Boyd wrote:
>> Quoting Guru Das Srinagesh (2020-07-29 09:52:52)
>>> From: David Collins 
>>>
>>> Add support for TEMP_ALARM GEN2 PMIC peripherals with digital
>>> major revision 1.  This revision utilizes a different temperature
>>> threshold mapping than earlier revisions.
>>>
>>> Signed-off-by: David Collins 
>>> Signed-off-by: Guru Das Srinagesh 
>>> ---
>>
>> Reviewed-by: Stephen Boyd 
> 
> + Daniel Lezcano
> 
> Hi Daniel,
> 
> I just checked Linus' tree and discovered that this patch has not been
> applied - only the other patch in this series. Since this patch has been
> reviewed already, could you please check if it's good to be applied as
> well?

Applied, thanks


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH] thermal/drivers/netlink: Add the temperature when crossing a trip point

2021-03-28 Thread Daniel Lezcano
On 28/03/2021 21:31, Srinivas Pandruvada wrote:
> Hi Daniel,

[ ... ]

>> I don't see where is the problem. The protocol is still compatible
>> with
>> the previous version, so that does not break the existing AFAICT.
>> That
>> is done on purpose.
> 
> The size of netlink message is changed. This is not a good argument to
> just adding members at the end. The point I am trying that netlink now
> is an ABI, which should go through same process as we are
> adding/changing a sysfs attributes.
>  

Ok I understand your point. I will provide a description in
Documentation/ABI/testing, something I should have done before and take
the opportunity to provide a bigger update of the thermal netlink
messages with more commands.

Thanks
  -- Daniel

-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v4 2/5] powercap/drivers/dtpm: Create a registering system

2021-03-28 Thread Daniel Lezcano
On 28/03/2021 19:26, Greg KH wrote:

[ ... ]

>>> So why are you trying to add a kref here as the structure already has
>>> support for proper lifetimes?
>>
>> Right, I'll revisit that part. Thanks for the review.
>>
>> I've a branch which is pulled by Rafael [1]. These parts are already
>> merged in the dtpm/next branch but not yet in Rafael's tree.
> 
> I would recommend fixing that up if you can rebase it.  If not, you need
> to revert it and start over.  I'll be glad to review it if you cc: me on
> the patches.
> 
>> I think a rebase is possible but I would like to avoid that. Would be a
>> patch on top of the dtpm/next acceptable given your flow with Android ?
> 
> This has nothing to do with the Android kernel workflow, sorry.  I am
> concerned about proper kernel development and keeping bugs out of it.


Fair enough, I will fix it up and send a v5.

Thanks

  -- Daniel


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v4 2/5] powercap/drivers/dtpm: Create a registering system

2021-03-28 Thread Daniel Lezcano
On 28/03/2021 13:24, Greg KH wrote:
> On Sun, Mar 28, 2021 at 01:11:30PM +0200, Daniel Lezcano wrote:
>>
>> Hi Greg,
>>
>> On 28/03/2021 08:50, Greg KH wrote:
>>
>> [ ... ]
>>
>>>>> And any reason why you are not using "real" struct devices in this
>>>>> subsystem?  You seem to be rolling your own infrastructure for no good
>>>>> reason.  I imagine you want sysfs support next, right?
>>>>
>>>> Actually, the framework is on top of powercap, so it has de facto the
>>>> sysfs support. On the other side, the dtpm backends are tied with the
>>>> device they manage.
>>>
>>> So why are they not a "real" device in the driver model?  It looks like
>>> you almost are wanting all of that functionality and are having to
>>> implement it "by hand" instead.
>>
>> I'm sorry I misunderstanding your point. dtpm is the backend for the
>> powercap subsystem which is the generic subsystem to do power limitation.
>>
>> We have:
>>
>> struct dtpm_cpu {
>>  struct dtpm dtmp;
>>  ...
>> }
>>
>> struct dtpm {
>>  struct powercap powecap;
>> };
>>
>> struct powercap {
>>  struct device dev;
>> };
> 
> Oh nice.  So you can not use a kref here at all as you already have a
> reference counted device controling your structure.  You can not have 2
> references trying to control the same structure, that way lies madness
> and bugs :(
> 
> So why are you trying to add a kref here as the structure already has
> support for proper lifetimes?

Right, I'll revisit that part. Thanks for the review.

I've a branch which is pulled by Rafael [1]. These parts are already
merged in the dtpm/next branch but not yet in Rafael's tree.

I think a rebase is possible but I would like to avoid that. Would be a
patch on top of the dtpm/next acceptable given your flow with Android ?

  -- Daniel

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/daniel.lezcano/linux.git/log/?h=dtpm/next

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH v4 2/5] powercap/drivers/dtpm: Create a registering system

2021-03-28 Thread Daniel Lezcano


Hi Greg,

On 28/03/2021 08:50, Greg KH wrote:

[ ... ]

>>> And any reason why you are not using "real" struct devices in this
>>> subsystem?  You seem to be rolling your own infrastructure for no good
>>> reason.  I imagine you want sysfs support next, right?
>>
>> Actually, the framework is on top of powercap, so it has de facto the
>> sysfs support. On the other side, the dtpm backends are tied with the
>> device they manage.
> 
> So why are they not a "real" device in the driver model?  It looks like
> you almost are wanting all of that functionality and are having to
> implement it "by hand" instead.

I'm sorry I misunderstanding your point. dtpm is the backend for the
powercap subsystem which is the generic subsystem to do power limitation.

We have:

struct dtpm_cpu {
struct dtpm dtmp;
...
}

struct dtpm {
struct powercap powecap;
};

struct powercap {
struct device dev;
};



-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH] thermal/drivers/netlink: Add the temperature when crossing a trip point

2021-03-27 Thread Daniel Lezcano


Hi Srinivas,

On 27/03/2021 18:46, Srinivas Pandruvada wrote:
> On Fri, 2021-03-26 at 17:37 +0100, Rafael J. Wysocki wrote:
>> On Thu, Mar 25, 2021 at 8:38 PM Daniel Lezcano
>>  wrote:
>>> The slope of the temperature increase or decrease can be high and
>>> when
>>> the temperature crosses the trip point, there could be a
>>> significant
>>> difference between the trip temperature and the measured
>>> temperatures.
>>>
>>> That forces the userspace to read the temperature back right after
>>> receiving a trip violation notification.
>>>
>>> In order to be efficient, give the temperature which resulted in
>>> the
>>> trip violation.
>>>
>>> Signed-off-by: Daniel Lezcano 
>>
>> Srinivas, what do you think?
> 
> - IMO netlink message should also be treated as we treat other ABIs. So
> add only when this is a must. Although here GENL version is incremented
> , users are not that careful. At least on x86, we know users created
> their own applications.

I don't see where is the problem. The protocol is still compatible with
the previous version, so that does not break the existing AFAICT. That
is done on purpose.

There is a new attribute added, the application using the previous
version will just not be aware of its presence and parse the message
without getting the temperature.

> - Here the concern is temperature is changing so fast then netlink +
> user space processing latency is enough to change further to read
> temperature again. Atleast we assume that and read temperature again.
> So not sure that this is the right approach to add another field for
> the temperature.

I'm not sure to understand your comment. Whatever the mechanism
(interrupt based or polling), the temperature is read in any case by the
call to thermal_zone_device_update() which in turns calls
handle_trip_point and then send the message. So, why not add the
temperature as the userspace is interested on getting the temperature
anyway.

For instance, in polling mode, (usually set to 1000ms), the trip point
crossing is detected with a long delay and during this time the
temperature could be far beyond the trip temperature.

It is not about netlink latency but about avoiding a back and forth when
the trip point is crossed and the temperature wavering around.


>>> ---
>>>  drivers/thermal/thermal_core.c|  6 --
>>>  drivers/thermal/thermal_netlink.c | 11 ++-
>>>  drivers/thermal/thermal_netlink.h |  8 
>>>  include/uapi/linux/thermal.h  |  2 +-
>>>  4 files changed, 15 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/thermal/thermal_core.c
>>> b/drivers/thermal/thermal_core.c
>>> index 996c038f83a4..948020ef51b1 100644
>>> --- a/drivers/thermal/thermal_core.c
>>> +++ b/drivers/thermal/thermal_core.c
>>> @@ -430,10 +430,12 @@ static void handle_thermal_trip(struct
>>> thermal_zone_device *tz, int trip)
>>> if (tz->last_temperature != THERMAL_TEMP_INVALID) {
>>> if (tz->last_temperature < trip_temp &&
>>> tz->temperature >= trip_temp)
>>> -   thermal_notify_tz_trip_up(tz->id, trip);
>>> +   thermal_notify_tz_trip_up(tz->id, trip,
>>> + tz->temperature);
>>> if (tz->last_temperature >= trip_temp &&
>>> tz->temperature < (trip_temp - hyst))
>>> -   thermal_notify_tz_trip_down(tz->id, trip);
>>> +   thermal_notify_tz_trip_down(tz->id, trip,
>>> +   tz-
>>>> temperature);
>>> }
>>>
>>> if (type == THERMAL_TRIP_CRITICAL || type ==
>>> THERMAL_TRIP_HOT)
>>> diff --git a/drivers/thermal/thermal_netlink.c
>>> b/drivers/thermal/thermal_netlink.c
>>> index 1234dbe95895..a16dd4d5d710 100644
>>> --- a/drivers/thermal/thermal_netlink.c
>>> +++ b/drivers/thermal/thermal_netlink.c
>>> @@ -121,7 +121,8 @@ static int thermal_genl_event_tz(struct param
>>> *p)
>>>  static int thermal_genl_event_tz_trip_up(struct param *p)
>>>  {
>>> if (nla_put_u32(p->msg, THERMAL_GENL_ATTR_TZ_ID, p->tz_id)
>>> ||
>>> -   nla_put_u32(p->msg, THERMAL_GENL_ATTR_TZ_TRIP_ID, p-
>>>> trip_id))
>>> +   nla_put_u32(p->msg, THERMAL_GENL_ATTR_TZ_TRIP_ID, p-
>>>> trip_id) ||
&g

Re: [PATCH v4 2/5] powercap/drivers/dtpm: Create a registering system

2021-03-27 Thread Daniel Lezcano
On 27/03/2021 13:50, Greg KH wrote:
> On Fri, Mar 12, 2021 at 02:04:08PM +0100, Daniel Lezcano wrote:
>> A SoC can be differently structured depending on the platform and the
>> kernel can not be aware of all the combinations, as well as the
>> specific tweaks for a particular board.
>>
>> The creation of the hierarchy must be delegated to userspace.
>>
>> These changes provide a registering mechanism where the different
>> subsystems will initialize their dtpm backends and register with a
>> name the dtpm node in a list.
>>
>> The next changes will provide an userspace interface to create
>> hierarchically the different nodes. Those will be created by name and
>> found via the list filled by the different subsystem.
>>
>> If a specified name is not found in the list, it is assumed to be a
>> virtual node which will have children and the default is to allocate
>> such node.
>>
>> When the node register in the list, the function will be dtpm_register
>> where the previous semantic was to create the node. Thus, the
>> functions are renamed to reflect their purpose.
>>
>> Signed-off-by: Daniel Lezcano 
>> Reviewed-by: Lukasz Luba 
>> ---

[ ... ]

>> +static void dtpm_release(struct kref *kref)
>> +{
>> +struct dtpm *dtpm = container_of(kref, struct dtpm, kref);
>> +
>> +kfree(dtpm);
>> +}
>> +
>> +/**
>> + * dtpm_put - Release a reference on a dtpm device
>> + * @dtpm: a pointer to a dtpm structure
>> + *
>> + * Release the reference on the specified dtpm device. The last
>> + * reference leads to a memory release.
>> + */
>> +void dtpm_put(struct dtpm *dtpm)
>> +{
>> +kref_put(>kref, dtpm_release);
> 
> You forgot to also grab the dtpm_lock before calling this, right?  What
> is preventing a get and put from being called at the same time?
> 
> You protect things at get time, but not put from what I can see :(

Thanks for spotting this, I will send a fix for that.

[ ... ]

>> +list_add(>node, _list);
>> +
>> +pr_info("Registered %s\n", name);
> 
> When kernel code works properly, it is quiet.  This is debugging code a
> the most, never something that everyone should be seeing all the time,
> please remove.

Initially, a comment asked for debug traces in the code. There are more
traces in the code at the pr_debug level.

Is your suggestion to remove the pr_info as well as other debug traces
or convert those pr_info to pr_debug ?

[ ... ]

> And any reason why you are not using "real" struct devices in this
> subsystem?  You seem to be rolling your own infrastructure for no good
> reason.  I imagine you want sysfs support next, right?

Actually, the framework is on top of powercap, so it has de facto the
sysfs support. On the other side, the dtpm backends are tied with the
device they manage.



-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


[PATCH] thermal/drivers/netlink: Add the temperature when crossing a trip point

2021-03-25 Thread Daniel Lezcano
The slope of the temperature increase or decrease can be high and when
the temperature crosses the trip point, there could be a significant
difference between the trip temperature and the measured temperatures.

That forces the userspace to read the temperature back right after
receiving a trip violation notification.

In order to be efficient, give the temperature which resulted in the
trip violation.

Signed-off-by: Daniel Lezcano 
---
 drivers/thermal/thermal_core.c|  6 --
 drivers/thermal/thermal_netlink.c | 11 ++-
 drivers/thermal/thermal_netlink.h |  8 
 include/uapi/linux/thermal.h  |  2 +-
 4 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 996c038f83a4..948020ef51b1 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -430,10 +430,12 @@ static void handle_thermal_trip(struct 
thermal_zone_device *tz, int trip)
if (tz->last_temperature != THERMAL_TEMP_INVALID) {
if (tz->last_temperature < trip_temp &&
tz->temperature >= trip_temp)
-   thermal_notify_tz_trip_up(tz->id, trip);
+   thermal_notify_tz_trip_up(tz->id, trip,
+ tz->temperature);
if (tz->last_temperature >= trip_temp &&
tz->temperature < (trip_temp - hyst))
-   thermal_notify_tz_trip_down(tz->id, trip);
+   thermal_notify_tz_trip_down(tz->id, trip,
+   tz->temperature);
}
 
if (type == THERMAL_TRIP_CRITICAL || type == THERMAL_TRIP_HOT)
diff --git a/drivers/thermal/thermal_netlink.c 
b/drivers/thermal/thermal_netlink.c
index 1234dbe95895..a16dd4d5d710 100644
--- a/drivers/thermal/thermal_netlink.c
+++ b/drivers/thermal/thermal_netlink.c
@@ -121,7 +121,8 @@ static int thermal_genl_event_tz(struct param *p)
 static int thermal_genl_event_tz_trip_up(struct param *p)
 {
if (nla_put_u32(p->msg, THERMAL_GENL_ATTR_TZ_ID, p->tz_id) ||
-   nla_put_u32(p->msg, THERMAL_GENL_ATTR_TZ_TRIP_ID, p->trip_id))
+   nla_put_u32(p->msg, THERMAL_GENL_ATTR_TZ_TRIP_ID, p->trip_id) ||
+   nla_put_u32(p->msg, THERMAL_GENL_ATTR_TZ_TEMP, p->temp))
return -EMSGSIZE;
 
return 0;
@@ -285,16 +286,16 @@ int thermal_notify_tz_disable(int tz_id)
return thermal_genl_send_event(THERMAL_GENL_EVENT_TZ_DISABLE, );
 }
 
-int thermal_notify_tz_trip_down(int tz_id, int trip_id)
+int thermal_notify_tz_trip_down(int tz_id, int trip_id, int temp)
 {
-   struct param p = { .tz_id = tz_id, .trip_id = trip_id };
+   struct param p = { .tz_id = tz_id, .trip_id = trip_id, .temp = temp };
 
return thermal_genl_send_event(THERMAL_GENL_EVENT_TZ_TRIP_DOWN, );
 }
 
-int thermal_notify_tz_trip_up(int tz_id, int trip_id)
+int thermal_notify_tz_trip_up(int tz_id, int trip_id, int temp)
 {
-   struct param p = { .tz_id = tz_id, .trip_id = trip_id };
+   struct param p = { .tz_id = tz_id, .trip_id = trip_id, .temp = temp };
 
return thermal_genl_send_event(THERMAL_GENL_EVENT_TZ_TRIP_UP, );
 }
diff --git a/drivers/thermal/thermal_netlink.h 
b/drivers/thermal/thermal_netlink.h
index 828d1dddfa98..e554f76291f4 100644
--- a/drivers/thermal/thermal_netlink.h
+++ b/drivers/thermal/thermal_netlink.h
@@ -11,8 +11,8 @@ int thermal_notify_tz_create(int tz_id, const char *name);
 int thermal_notify_tz_delete(int tz_id);
 int thermal_notify_tz_enable(int tz_id);
 int thermal_notify_tz_disable(int tz_id);
-int thermal_notify_tz_trip_down(int tz_id, int id);
-int thermal_notify_tz_trip_up(int tz_id, int id);
+int thermal_notify_tz_trip_down(int tz_id, int id, int temp);
+int thermal_notify_tz_trip_up(int tz_id, int id, int temp);
 int thermal_notify_tz_trip_delete(int tz_id, int id);
 int thermal_notify_tz_trip_add(int tz_id, int id, int type,
   int temp, int hyst);
@@ -49,12 +49,12 @@ static inline int thermal_notify_tz_disable(int tz_id)
return 0;
 }
 
-static inline int thermal_notify_tz_trip_down(int tz_id, int id)
+static inline int thermal_notify_tz_trip_down(int tz_id, int id, int temp)
 {
return 0;
 }
 
-static inline int thermal_notify_tz_trip_up(int tz_id, int id)
+static inline int thermal_notify_tz_trip_up(int tz_id, int id, int temp)
 {
return 0;
 }
diff --git a/include/uapi/linux/thermal.h b/include/uapi/linux/thermal.h
index c105054cbb57..bf5d9c8ef16f 100644
--- a/include/uapi/linux/thermal.h
+++ b/include/uapi/linux/thermal.h
@@ -18,7 +18,7 @@ enum thermal_trip_type {
 
 /* Adding event notification support elements */
 #define THERMAL_GENL_FAMILY_NAME   "thermal"
-#define THERMAL_GENL_VERSION   0x01
+#define THERMA

Re: [PATCH v2] drivers/clocksource/mediatek: Ack and disable interrupts on shutdown

2021-03-25 Thread Daniel Lezcano
On 25/03/2021 02:35, Evan Benn wrote:
> set_state_shutdown is called during system suspend after interrupts have
> been disabled. If the timer has fired in the meantime, there will be
> a pending IRQ. So we ack that now and disable the timer. Without this
> ARM trusted firmware will abort the suspend due to the pending
> interrupt.
> 
> Now always disable the IRQ in state transitions, and re-enable in
> set_periodic and next_event.

Why not put add the suspend/resume callbacks and put there the specific
code and let the irq untouched in the normal flow ?

> Signed-off-by: Evan Benn 
> ---
> 
> Changes in v2:
> Remove the patch that splits the drivers into 2 files.
> 
>  drivers/clocksource/timer-mediatek.c | 49 +---
>  1 file changed, 30 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/clocksource/timer-mediatek.c 
> b/drivers/clocksource/timer-mediatek.c
> index 9318edcd8963..fba2f9494d90 100644
> --- a/drivers/clocksource/timer-mediatek.c
> +++ b/drivers/clocksource/timer-mediatek.c
> @@ -132,13 +132,33 @@ static u64 notrace mtk_gpt_read_sched_clock(void)
>   return readl_relaxed(gpt_sched_reg);
>  }
>  
> +static void mtk_gpt_disable_ack_interrupts(struct timer_of *to, u8 timer)
> +{
> + u32 val;
> +
> + /* Disable interrupts */
> + val = readl(timer_of_base(to) + GPT_IRQ_EN_REG);
> + writel(val & ~GPT_IRQ_ENABLE(timer), timer_of_base(to) +
> +GPT_IRQ_EN_REG);
> +
> + /* Ack interrupts */
> + writel(GPT_IRQ_ACK(timer), timer_of_base(to) + GPT_IRQ_ACK_REG);
> +}
> +
>  static void mtk_gpt_clkevt_time_stop(struct timer_of *to, u8 timer)
>  {
>   u32 val;
>  
> + /* Disable timer */
>   val = readl(timer_of_base(to) + GPT_CTRL_REG(timer));
>   writel(val & ~GPT_CTRL_ENABLE, timer_of_base(to) +
>  GPT_CTRL_REG(timer));
> +
> + /* This may be called with interrupts disabled,
> +  * so we need to ack any interrupt that is pending
> +  * Or for example ATF will prevent a suspend from completing.
> +  */
> + mtk_gpt_disable_ack_interrupts(to, timer);
>  }
>  
>  static void mtk_gpt_clkevt_time_setup(struct timer_of *to,
> @@ -152,8 +172,10 @@ static void mtk_gpt_clkevt_time_start(struct timer_of 
> *to,
>  {
>   u32 val;
>  
> - /* Acknowledge interrupt */
> - writel(GPT_IRQ_ACK(timer), timer_of_base(to) + GPT_IRQ_ACK_REG);
> + /* Enable interrupts */
> + val = readl(timer_of_base(to) + GPT_IRQ_EN_REG);
> + writel(val | GPT_IRQ_ENABLE(timer),
> +timer_of_base(to) + GPT_IRQ_EN_REG);
>  
>   val = readl(timer_of_base(to) + GPT_CTRL_REG(timer));
>  
> @@ -226,21 +248,6 @@ __init mtk_gpt_setup(struct timer_of *to, u8 timer, u8 
> option)
>  timer_of_base(to) + GPT_CTRL_REG(timer));
>  }
>  
> -static void mtk_gpt_enable_irq(struct timer_of *to, u8 timer)
> -{
> - u32 val;
> -
> - /* Disable all interrupts */
> - writel(0x0, timer_of_base(to) + GPT_IRQ_EN_REG);
> -
> - /* Acknowledge all spurious pending interrupts */
> - writel(0x3f, timer_of_base(to) + GPT_IRQ_ACK_REG);
> -
> - val = readl(timer_of_base(to) + GPT_IRQ_EN_REG);
> - writel(val | GPT_IRQ_ENABLE(timer),
> -timer_of_base(to) + GPT_IRQ_EN_REG);
> -}
> -
>  static struct timer_of to = {
>   .flags = TIMER_OF_IRQ | TIMER_OF_BASE | TIMER_OF_CLOCK,
>  
> @@ -292,6 +299,12 @@ static int __init mtk_gpt_init(struct device_node *node)
>   if (ret)
>   return ret;
>  
> + /* In case the firmware left the interrupts enabled
> +  * disable and ack those now
> +  */
> + mtk_gpt_disable_ack_interrupts(, TIMER_CLK_SRC);
> + mtk_gpt_disable_ack_interrupts(, TIMER_CLK_EVT);
> +
>   /* Configure clock source */
>   mtk_gpt_setup(, TIMER_CLK_SRC, GPT_CTRL_OP_FREERUN);
>   clocksource_mmio_init(timer_of_base() + GPT_CNT_REG(TIMER_CLK_SRC),
> @@ -305,8 +318,6 @@ static int __init mtk_gpt_init(struct device_node *node)
>   clockevents_config_and_register(, timer_of_rate(),
>   TIMER_SYNC_TICKS, 0x);
>  
> - mtk_gpt_enable_irq(, TIMER_CLK_EVT);
> -
>   return 0;
>  }
>  TIMER_OF_DECLARE(mtk_mt6577, "mediatek,mt6577-timer", mtk_gpt_init);
> 


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v2 0/5] arm64: sunxi: Enable the sun4i timer

2021-03-24 Thread Daniel Lezcano
On 24/03/2021 04:51, Samuel Holland wrote:
> On 3/22/21 9:18 AM, Daniel Lezcano wrote:
>> On 22/03/2021 05:47, Samuel Holland wrote:
>>> In preparation for adding CPU idle states, hook up the sun4i timer.
>>> Having a non-c3stop clockevent source available is necessary for all
>>> CPUs to simultaneously enter a local-timer-stop idle state.
>>
>> Why simultaneously ?
> Because the CPU providing (the hrtimer providing) the broadcast timer
> cannot enter an idle state which would stop that timer. So in my case,
> with 4 CPUs, I was seeing at most 3 CPUs enter idle at any given time.
> This prevented any cluster-level idle states from doing anything. After
> applying this series, I was able to observe the whole cluster powering
> down when appropriate.

Ah, ok. I did not realize this fourth CPU was acting as the broadcast
timer, so allowing at least a CPU power down. This setup is unusual.

The changes make sense.

Acked-by: Daniel Lezcano 



-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH] powercap/drivers/dtpm: Add dtpm devfreq with energy model support

2021-03-23 Thread Daniel Lezcano
On 23/03/2021 16:56, Lukasz Luba wrote:
> Hi Daniel,
> 
> On 3/19/21 4:28 PM, Daniel Lezcano wrote:
>> Currently the dtpm supports the CPUs via cpufreq and the energy
>> model. This change provides the same for the device which supports
>> devfreq.
>>
>> Each device supporting devfreq and having an energy model can register
>> themselves in the list of supported devices.
>>
>> The concept is the same as the cpufreq dtpm support: the QoS is used
>> to aggregate the requests and the energy model gives the value of the
>> instantaneous power consumption ponderated by the load of the device.
>>
> 
> 
> I've just started the review, but I have a blocking question:
> 
> Why there is no unregister function (like 'dtmp_unregister_devfreq')?
> Do you consider any devfreq drivers to be modules?
> 
> The code looks like an API that it's going to be called directly in
> e.g. GPU driver in it's probe function. In that case probably the
> module unloading should call dtmp unregister.
> 
> Could you explain this to me please? So I can continue the review.

BTW, thanks for taking the time to review the patch.


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH] powercap/drivers/dtpm: Add dtpm devfreq with energy model support

2021-03-23 Thread Daniel Lezcano
On 23/03/2021 16:56, Lukasz Luba wrote:
> Hi Daniel,
> 
> On 3/19/21 4:28 PM, Daniel Lezcano wrote:
>> Currently the dtpm supports the CPUs via cpufreq and the energy
>> model. This change provides the same for the device which supports
>> devfreq.
>>
>> Each device supporting devfreq and having an energy model can register
>> themselves in the list of supported devices.
>>
>> The concept is the same as the cpufreq dtpm support: the QoS is used
>> to aggregate the requests and the energy model gives the value of the
>> instantaneous power consumption ponderated by the load of the device.
>>
> 
> 
> I've just started the review, but I have a blocking question:
> 
> Why there is no unregister function (like 'dtmp_unregister_devfreq')?
> Do you consider any devfreq drivers to be modules?
> 
> The code looks like an API that it's going to be called directly in
> e.g. GPU driver in it's probe function. In that case probably the
> module unloading should call dtmp unregister.
> 
> Could you explain this to me please? So I can continue the review.

Just forgot the unregister function :)


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH 1/2] clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue

2021-03-22 Thread Daniel Lezcano
On 22/03/2021 17:33, Tony Lindgren wrote:
> Hi,
> 
> * Daniel Lezcano  [210322 15:56]:
>> On 04/03/2021 08:37, Tony Lindgren wrote:
>>> There is a timer wrap issue on dra7 for the ARM architected timer.
>>> In a typical clock configuration the timer fails to wrap after 388 days.
>>>
>>> To work around the issue, we need to use timer-ti-dm timers instead.
>>>
>>> Let's prepare for adding support for percpu timers by adding a common
>>> dmtimer_clkevt_init_common() and call it from dmtimer_clockevent_init().
>>> This patch makes no intentional functional changes.
>>>
>>> Signed-off-by: Tony Lindgren 
>>> ---
>>
>> [ ... ]
>>
>>> @@ -575,33 +574,60 @@ static int __init dmtimer_clockevent_init(struct 
>>> device_node *np)
>>>  */
>>> writel_relaxed(OMAP_TIMER_CTRL_POSTED, t->base + t->ifctrl);
>>>  
>>> +   if (dev->cpumask == cpu_possible_mask)
>>> +   irqflags = IRQF_TIMER;
>>> +   else
>>> +   irqflags = IRQF_TIMER | IRQF_NOBALANCING;
>>
>> Can you explain the reasoning behind the test above ?
> 
> In the per cpu case we assign one dmtimer per cpu, and we want the
> interrupt handling on the assigned CPU. In the per cpu case we have
> the cpu specified with dev->cpumask unlike for the normal clockevent
> case.
> 
> In the per cpu dmtimer case the interrupt line is not wired per cpu
> though, so I don't think we want to add IRQF_PERCPU here.

If it is per cpu, then the parameter will be cpumask_of(cpu). If there
is one cpu, no balancing can happen and then the IRQF_NOBALANCING is not
needed, neither this test and the irqflags, right?



-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH 1/2] clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue

2021-03-22 Thread Daniel Lezcano
On 04/03/2021 08:37, Tony Lindgren wrote:
> There is a timer wrap issue on dra7 for the ARM architected timer.
> In a typical clock configuration the timer fails to wrap after 388 days.
> 
> To work around the issue, we need to use timer-ti-dm timers instead.
> 
> Let's prepare for adding support for percpu timers by adding a common
> dmtimer_clkevt_init_common() and call it from dmtimer_clockevent_init().
> This patch makes no intentional functional changes.
> 
> Signed-off-by: Tony Lindgren 
> ---

[ ... ]

> @@ -575,33 +574,60 @@ static int __init dmtimer_clockevent_init(struct 
> device_node *np)
>*/
>   writel_relaxed(OMAP_TIMER_CTRL_POSTED, t->base + t->ifctrl);
>  
> + if (dev->cpumask == cpu_possible_mask)
> + irqflags = IRQF_TIMER;
> + else
> + irqflags = IRQF_TIMER | IRQF_NOBALANCING;

Can you explain the reasoning behind the test above ?

[ ... ]


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v2 0/5] arm64: sunxi: Enable the sun4i timer

2021-03-22 Thread Daniel Lezcano
On 22/03/2021 05:47, Samuel Holland wrote:
> In preparation for adding CPU idle states, hook up the sun4i timer.
> Having a non-c3stop clockevent source available is necessary for all
> CPUs to simultaneously enter a local-timer-stop idle state.

Why simultaneously ?

> Changes from v1:
>   - Removed H616 changes (depends on an unmerged patch set)
>   - Reworded the patch 4-5 commit messages for clarity
>   - Added Acked-by tags
> 
> Samuel Holland (5):
>   dt-bindings: timer: Simplify conditional expressions
>   dt-bindings: timer: Add compatibles for sun50i timers
>   arm64: dts: allwinner: a64: Sort watchdog node
>   arm64: dts: allwinner: Add sun4i MMIO timer nodes
>   arm64: sunxi: Build the sun4i timer driver
> 
>  .../timer/allwinner,sun4i-a10-timer.yaml  | 42 +--
>  arch/arm64/Kconfig.platforms  |  1 +
>  arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 25 +++
>  arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi  |  9 
>  4 files changed, 46 insertions(+), 31 deletions(-)
> 


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: drivers/clocksource/dw_apb_timer_of.c:66 timer_get_base_and_rate() warn: 'timer_clk' not released on lines: 64.

2021-03-22 Thread Daniel Lezcano
On 22/03/2021 13:21, Dinh Nguyen wrote:
> Hi Daniel,
> 
> 2/21 4:58 AM, Daniel Lezcano wrote:
>>
>> Dinh,
>>
>> is it possible to have a look at this issue?
>>
>> Thanks
>>
> 
> Sorry, but somehow I missed cc'ing you when I first sent the patch.
> 
> I've resent it just now.

Ah, thanks!

 -- D.



-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH 1/2] drivers/clocksource/mediatek: Split mediatek drivers into 2 files

2021-03-22 Thread Daniel Lezcano


Hi Evan,

On 18/03/2021 06:04, Evan Benn wrote:
> mtk_gpt and mtk_syst drivers for mt6577 and mt6765 devices were not
> sharing any code. So split them into separate files.

For the sake of consistency, keeping all in one is better.

Thanks
  -- Daniel

> Signed-off-by: Evan Benn 
> ---
> 
>  arch/arm/mach-mediatek/Kconfig|   3 +-
>  arch/arm64/Kconfig.platforms  |   3 +-
>  drivers/clocksource/Kconfig   |  13 +-
>  drivers/clocksource/Makefile  |   3 +-
>  ...mer-mediatek.c => timer-mediatek-mt6577.c} | 100 -
>  drivers/clocksource/timer-mediatek-mt6765.c   | 135 ++
>  6 files changed, 151 insertions(+), 106 deletions(-)
>  rename drivers/clocksource/{timer-mediatek.c => timer-mediatek-mt6577.c} 
> (69%)
>  create mode 100644 drivers/clocksource/timer-mediatek-mt6765.c
> 
> diff --git a/arch/arm/mach-mediatek/Kconfig b/arch/arm/mach-mediatek/Kconfig
> index 9e0f592d87d8..8686f992c4b6 100644
> --- a/arch/arm/mach-mediatek/Kconfig
> +++ b/arch/arm/mach-mediatek/Kconfig
> @@ -4,7 +4,8 @@ menuconfig ARCH_MEDIATEK
>   depends on ARCH_MULTI_V7
>   select ARM_GIC
>   select PINCTRL
> - select MTK_TIMER
> + select TIMER_MEDIATEK_MT6577
> + select TIMER_MEDIATEK_MT6765
>   select MFD_SYSCON
>   help
> Support for Mediatek MT65xx & MT81xx SoCs
> diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
> index cdfd5fed457f..d4752375ab0b 100644
> --- a/arch/arm64/Kconfig.platforms
> +++ b/arch/arm64/Kconfig.platforms
> @@ -161,7 +161,8 @@ config ARCH_MEDIATEK
>   bool "MediaTek SoC Family"
>   select ARM_GIC
>   select PINCTRL
> - select MTK_TIMER
> + select TIMER_MEDIATEK_MT6577
> + select TIMER_MEDIATEK_MT6765
>   help
> This enables support for MediaTek MT27xx, MT65xx, MT76xx
> & MT81xx ARMv8 SoCs
> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
> index 39aa21d01e05..d697c799284e 100644
> --- a/drivers/clocksource/Kconfig
> +++ b/drivers/clocksource/Kconfig
> @@ -438,13 +438,20 @@ config OXNAS_RPS_TIMER
>  config SYS_SUPPORTS_SH_CMT
>   bool
>  
> -config MTK_TIMER
> - bool "Mediatek timer driver" if COMPILE_TEST
> +config TIMER_MEDIATEK_MT6577
> + bool "Mediatek mt6577 timer driver" if COMPILE_TEST
>   depends on HAS_IOMEM
>   select TIMER_OF
>   select CLKSRC_MMIO
>   help
> -   Support for Mediatek timer driver.
> +   Enables clocksource and clockevent driver for Mediatek mt6577 timer.
> +
> +config TIMER_MEDIATEK_MT6765
> + bool "Mediatek mt6765 timer driver" if COMPILE_TEST
> + depends on HAS_IOMEM
> + select TIMER_OF
> + help
> +   Enables clockevent driver for Mediatek mt6765 timer.
>  
>  config SPRD_TIMER
>   bool "Spreadtrum timer driver" if EXPERT
> diff --git a/drivers/clocksource/Makefile b/drivers/clocksource/Makefile
> index c17ee32a7151..b1f06ce114f9 100644
> --- a/drivers/clocksource/Makefile
> +++ b/drivers/clocksource/Makefile
> @@ -49,7 +49,8 @@ obj-$(CONFIG_CLKSRC_SAMSUNG_PWM)+= samsung_pwm_timer.o
>  obj-$(CONFIG_FSL_FTM_TIMER)  += timer-fsl-ftm.o
>  obj-$(CONFIG_VF_PIT_TIMER)   += timer-vf-pit.o
>  obj-$(CONFIG_CLKSRC_QCOM)+= timer-qcom.o
> -obj-$(CONFIG_MTK_TIMER)  += timer-mediatek.o
> +obj-$(CONFIG_TIMER_MEDIATEK_MT6577)  += timer-mediatek-mt6577.o
> +obj-$(CONFIG_TIMER_MEDIATEK_MT6765)  += timer-mediatek-mt6765.o
>  obj-$(CONFIG_CLKSRC_PISTACHIO)   += timer-pistachio.o
>  obj-$(CONFIG_CLKSRC_TI_32K)  += timer-ti-32k.o
>  obj-$(CONFIG_OXNAS_RPS_TIMER)+= timer-oxnas-rps.o
> diff --git a/drivers/clocksource/timer-mediatek.c 
> b/drivers/clocksource/timer-mediatek-mt6577.c
> similarity index 69%
> rename from drivers/clocksource/timer-mediatek.c
> rename to drivers/clocksource/timer-mediatek-mt6577.c
> index 9318edcd8963..9e5241d1876d 100644
> --- a/drivers/clocksource/timer-mediatek.c
> +++ b/drivers/clocksource/timer-mediatek-mt6577.c
> @@ -47,86 +47,8 @@
>  #define GPT_CNT_REG(val)(0x08 + (0x10 * (val)))
>  #define GPT_CMP_REG(val)(0x0C + (0x10 * (val)))
>  
> -/* system timer */
> -#define SYST_BASE   (0x40)
> -
> -#define SYST_CON(SYST_BASE + 0x0)
> -#define SYST_VAL(SYST_BASE + 0x4)
> -
> -#define SYST_CON_REG(to)(timer_of_base(to) + SYST_CON)
> -#define SYST_VAL_REG(to)(timer_of_base(to) + SYST_VAL)
> -
> -/*
> - * SYST_CON_EN: Clock enable. Shall be set to
> - *   - Start timer countdown.
> - *   - Allow timeout ticks being updated.
> - *   - Allow changing interrupt functions.
> - *
> - * SYST_CON_IRQ_EN: Set to allow interrupt.
> - *
> - * SYST_CON_IRQ_CLR: Set to clear interrupt.
> - */
> -#define SYST_CON_EN  BIT(0)
> -#define SYST_CON_IRQ_EN  BIT(1)
> -#define SYST_CON_IRQ_CLR BIT(4)
> -
>  static void __iomem *gpt_sched_reg __read_mostly;
>  
> -static 

Re: [PATCH 10/14] clocksource/drivers/npcm: Add support for WPCM450

2021-03-22 Thread Daniel Lezcano
On 20/03/2021 19:16, Jonathan Neuschäfer wrote:
> Add a compatible string for WPCM450, which has essentially the same
> timer controller.
> 
> Signed-off-by: Jonathan Neuschäfer 
> ---

Applied, thanks


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH 2/2] clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940

2021-03-22 Thread Daniel Lezcano
On 04/03/2021 08:37, Tony Lindgren wrote:
> There is a timer wrap issue on dra7 for the ARM architected timer.
> In a typical clock configuration the timer fails to wrap after 388 days.
> 
> To work around the issue, we need to use timer-ti-dm percpu timers instead.
> 
> Let's configure dmtimer3 and 4 as percpu timers by default, and warn about
> the issue if the dtb is not configured properly.
> 
> Let's do this as a single patch so it can be backported to v5.8 and later
> kernels easily. 

Cc:  # v5.8+

??

> Note that this patch depends on earlier timer-ti-dm
> systimer posted mode fixes, and a preparatory clockevent patch
> "clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue".
> 
> For more information, please see the errata for "AM572x Sitara Processors
> Silicon Revisions 1.1, 2.0":
> 
> https://www.ti.com/lit/er/sprz429m/sprz429m.pdf
> 
> The concept is based on earlier reference patches done by Tero Kristo and
> Keerthy.
> 
> Cc: Keerthy 
> Cc: Tero Kristo 
> Signed-off-by: Tony Lindgren 
> ---
>  arch/arm/boot/dts/dra7-l4.dtsi |  4 +-
>  arch/arm/boot/dts/dra7.dtsi| 20 ++
>  drivers/clocksource/timer-ti-dm-systimer.c | 76 ++
>  include/linux/cpuhotplug.h |  1 +
>  4 files changed, 99 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/boot/dts/dra7-l4.dtsi b/arch/arm/boot/dts/dra7-l4.dtsi
> --- a/arch/arm/boot/dts/dra7-l4.dtsi
> +++ b/arch/arm/boot/dts/dra7-l4.dtsi
> @@ -1168,7 +1168,7 @@ timer2: timer@0 {
>   };
>   };
>  
> - target-module@34000 {   /* 0x48034000, ap 7 
> 46.0 */
> + timer3_target: target-module@34000 {/* 0x48034000, ap 7 
> 46.0 */
>   compatible = "ti,sysc-omap4-timer", "ti,sysc";
>   reg = <0x34000 0x4>,
> <0x34010 0x4>;
> @@ -1195,7 +1195,7 @@ timer3: timer@0 {
>   };
>   };
>  
> - target-module@36000 {   /* 0x48036000, ap 9 
> 4e.0 */
> + timer4_target: target-module@36000 {/* 0x48036000, ap 9 
> 4e.0 */
>   compatible = "ti,sysc-omap4-timer", "ti,sysc";
>   reg = <0x36000 0x4>,
> <0x36010 0x4>;
> diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
> --- a/arch/arm/boot/dts/dra7.dtsi
> +++ b/arch/arm/boot/dts/dra7.dtsi
> @@ -46,6 +46,7 @@ aliases {
>  
>   timer {
>   compatible = "arm,armv7-timer";
> + status = "disabled";/* See ARM architected timer wrap 
> erratum i940 */
>   interrupts =  IRQ_TYPE_LEVEL_LOW)>,
> IRQ_TYPE_LEVEL_LOW)>,
> IRQ_TYPE_LEVEL_LOW)>,
> @@ -1241,3 +1242,22 @@ timer@0 {
>   assigned-clock-parents = <_32k_ck>;
>   };
>  };
> +
> +/* Local timers, see ARM architected timer wrap erratum i940 */
> +_target {
> + ti,no-reset-on-init;
> + ti,no-idle;
> + timer@0 {
> + assigned-clocks = <_clkctrl DRA7_L4PER_TIMER3_CLKCTRL 24>;
> + assigned-clock-parents = <_sys_clk_div>;
> + };
> +};
> +
> +_target {
> + ti,no-reset-on-init;
> + ti,no-idle;
> + timer@0 {
> + assigned-clocks = <_clkctrl DRA7_L4PER_TIMER4_CLKCTRL 24>;
> + assigned-clock-parents = <_sys_clk_div>;
> + };
> +};
> diff --git a/drivers/clocksource/timer-ti-dm-systimer.c 
> b/drivers/clocksource/timer-ti-dm-systimer.c
> --- a/drivers/clocksource/timer-ti-dm-systimer.c
> +++ b/drivers/clocksource/timer-ti-dm-systimer.c
> @@ -2,6 +2,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -634,6 +635,78 @@ static int __init dmtimer_clockevent_init(struct 
> device_node *np)
>   return error;
>  }
>  
> +/* Dmtimer as percpu timer. See dra7 ARM architected timer wrap erratum i940 
> */
> +static DEFINE_PER_CPU(struct dmtimer_clockevent, dmtimer_percpu_timer);
> +
> +static int __init dmtimer_percpu_timer_init(struct device_node *np, int cpu)
> +{
> + struct dmtimer_clockevent *clkevt;
> + int error;
> +
> + if (!cpu_possible(cpu))
> + return -EINVAL;
> +
> + if (!of_property_read_bool(np->parent, "ti,no-reset-on-init") ||
> + !of_property_read_bool(np->parent, "ti,no-idle"))
> + pr_warn("Incomplete dtb for percpu dmtimer %pOF\n", np->parent);
> +
> + clkevt = per_cpu_ptr(_percpu_timer, cpu);
> +
> + error = dmtimer_clkevt_init_common(clkevt, np, CLOCK_EVT_FEAT_ONESHOT,
> +cpumask_of(cpu), "percpu-dmtimer",
> +500);
> + if (error)
> + return error;
> +
> + return 0;
> +}
> +
> +/* See TRM for timer internal resynch latency */
> +static int omap_dmtimer_starting_cpu(unsigned int cpu)
> +{
> + struct 

Re: [PATCH] clocksource/drivers/timer-mediatek: optimize systimer irq clear flow on Mediatek Socs

2021-03-22 Thread Daniel Lezcano
On 02/03/2021 08:28, Fengquan Chen wrote:
> 1)ensure systimer is enabled before clear and disable interrupt, which only
> for systimer in Mediatek Socs.
>
> 2)clear any pending timer-irq when shutdown to keep suspend flow clean,
> when use systimer as tick-broadcast timer
> 
> Change-Id: Ia3eda83324af2fdaf5cbb3569a9bf020a11f8009

Remove the above.

Add a Fixes tag.

> Signed-off-by: Fengquan Chen 
> ---
>  drivers/clocksource/timer-mediatek.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/clocksource/timer-mediatek.c 
> b/drivers/clocksource/timer-mediatek.c
> index 9318edc..9f1f095dc 100644
> --- a/drivers/clocksource/timer-mediatek.c
> +++ b/drivers/clocksource/timer-mediatek.c
> @@ -75,6 +75,7 @@
>  static void mtk_syst_ack_irq(struct timer_of *to)
>  {
>   /* Clear and disable interrupt */
> + writel(SYST_CON_EN, SYST_CON_REG(to));

SYST_CON_EN is set below, why do you have to do it before?

Is that a hw bug ?

It is confusing what the description of the SYST_CON_EN says:

/*
 * SYST_CON_EN: Clock enable. Shall be set to
 *   - Start timer countdown.
 *   - Allow timeout ticks being updated.
 *   - Allow changing interrupt functions.

What means "interrupt functions" ?

Does writing writel(SYST_CON_EN, SYST_CON_REG(to)) before
SYST_CON_IRQ_CLR allows to clear the interrupt flag?

Can you explain how the timer works regarding this part?

It sounds to me a bit strange.

 *
 * SYST_CON_IRQ_EN: Set to allow interrupt.
 *
 * SYST_CON_IRQ_CLR: Set to clear interrupt.
 */


>   writel(SYST_CON_IRQ_CLR | SYST_CON_EN, SYST_CON_REG(to));
>  }
>  
> @@ -111,6 +112,9 @@ static int mtk_syst_clkevt_next_event(unsigned long ticks,
>  
>  static int mtk_syst_clkevt_shutdown(struct clock_event_device *clkevt)
>  {
> + /* Clear any irq */
> + mtk_syst_ack_irq(to_timer_of(clkevt));
> +
>   /* Disable timer */
>   writel(0, SYST_CON_REG(to_timer_of(clkevt)));
>  
> 


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: drivers/clocksource/dw_apb_timer_of.c:66 timer_get_base_and_rate() warn: 'timer_clk' not released on lines: 64.

2021-03-22 Thread Daniel Lezcano


Dinh,

is it possible to have a look at this issue?

Thanks

  -- Daniel


On 22/02/2021 07:21, Dan Carpenter wrote:
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> master
> head:   55f62bc873477dae2c45bbbc30b86cf3e0982f3b
> commit: 5d9814df0aec56a638bbf20795abb4cfaf3cd331 
> clocksource/drivers/dw_apb_timer_of: Add error handling if no clock available
> config: arm64-randconfig-m031-20210221 (attached as .config)
> compiler: aarch64-linux-gcc (GCC) 9.3.0
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
> Reported-by: Dan Carpenter 
> 
> New smatch warnings:
> drivers/clocksource/dw_apb_timer_of.c:66 timer_get_base_and_rate() warn: 
> 'timer_clk' not released on lines: 64.
> 
> Old smatch warnings:
> drivers/clocksource/dw_apb_timer_of.c:66 timer_get_base_and_rate() warn: 
> '*base' not released on lines: 56,64.
> 
> vim +/timer_clk +66 drivers/clocksource/dw_apb_timer_of.c
> 
> 5d9814df0aec56 drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2020-12-05  17  static int __init timer_get_base_and_rate(struct device_node 
> *np,
> af75655c066621 arch/arm/mach-picoxcell/time.cJamie Iles 
> 2011-07-25  18void __iomem **base, u32 *rate)
> af75655c066621 arch/arm/mach-picoxcell/time.cJamie Iles 
> 2011-07-25  19  {
> a8b447f2a7 drivers/clocksource/dw_apb_timer_of.c Heiko Stuebner 
> 2013-06-04  20struct clk *timer_clk;
> a8b447f2a7 drivers/clocksource/dw_apb_timer_of.c Heiko Stuebner 
> 2013-06-04  21struct clk *pclk;
> 1f174a1a2cdebc drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2018-09-17  22struct reset_control *rstc;
> 5d9814df0aec56 drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2020-12-05  23int ret;
> a8b447f2a7 drivers/clocksource/dw_apb_timer_of.c Heiko Stuebner 
> 2013-06-04  24  
> af75655c066621 arch/arm/mach-picoxcell/time.cJamie Iles 
> 2011-07-25  25*base = of_iomap(np, 0);
> af75655c066621 arch/arm/mach-picoxcell/time.cJamie Iles 
> 2011-07-25  26  
> af75655c066621 arch/arm/mach-picoxcell/time.cJamie Iles 
> 2011-07-25  27if (!*base)
> 2a4849d2674b96 drivers/clocksource/dw_apb_timer_of.c Rob Herring
> 2018-08-27  28panic("Unable to map regs for %pOFn", np);
> af75655c066621 arch/arm/mach-picoxcell/time.cJamie Iles 
> 2011-07-25  29  
> 1f174a1a2cdebc drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2018-09-17  30/*
> 1f174a1a2cdebc drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2018-09-17  31 * Reset the timer if the reset control is available, wiping
> 1f174a1a2cdebc drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2018-09-17  32 * out the state the firmware may have left it
> 1f174a1a2cdebc drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2018-09-17  33 */
> 1f174a1a2cdebc drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2018-09-17  34rstc = of_reset_control_get(np, NULL);
> 1f174a1a2cdebc drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2018-09-17  35if (!IS_ERR(rstc)) {
> 1f174a1a2cdebc drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2018-09-17  36reset_control_assert(rstc);
> 1f174a1a2cdebc drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2018-09-17  37reset_control_deassert(rstc);
> 1f174a1a2cdebc drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2018-09-17  38}
> 1f174a1a2cdebc drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2018-09-17  39  
> a8b447f2a7 drivers/clocksource/dw_apb_timer_of.c Heiko Stuebner 
> 2013-06-04  40/*
> a8b447f2a7 drivers/clocksource/dw_apb_timer_of.c Heiko Stuebner 
> 2013-06-04  41 * Not all implementations use a periphal clock, so don't 
> panic
> a8b447f2a7 drivers/clocksource/dw_apb_timer_of.c Heiko Stuebner 
> 2013-06-04  42 * if it's not present
> a8b447f2a7 drivers/clocksource/dw_apb_timer_of.c Heiko Stuebner 
> 2013-06-04  43 */
> a8b447f2a7 drivers/clocksource/dw_apb_timer_of.c Heiko Stuebner 
> 2013-06-04  44pclk = of_clk_get_by_name(np, "pclk");
> a8b447f2a7 drivers/clocksource/dw_apb_timer_of.c Heiko Stuebner 
> 2013-06-04  45if (!IS_ERR(pclk))
> a8b447f2a7 drivers/clocksource/dw_apb_timer_of.c Heiko Stuebner 
> 2013-06-04  46if (clk_prepare_enable(pclk))
> 2a4849d2674b96 drivers/clocksource/dw_apb_timer_of.c Rob Herring
> 2018-08-27  47pr_warn("pclk for %pOFn is present, but 
> could not be activated\n",
> 2a4849d2674b96 drivers/clocksource/dw_apb_timer_of.c Rob Herring
> 2018-08-27  48np);
> a8b447f2a7 drivers/clocksource/dw_apb_timer_of.c Heiko Stuebner 
> 2013-06-04  49  
> 5d9814df0aec56 drivers/clocksource/dw_apb_timer_of.c Dinh Nguyen
> 2020-12-05  50if (!of_property_read_u32(np, "clock-freq", rate) &&
> 5d9814df0aec56 

Re: [PATCH] thermal/drivers/cpuidle_cooling: Fix use after error

2021-03-22 Thread Daniel Lezcano
On 22/03/2021 04:29, Viresh Kumar wrote:
> On 19-03-21, 21:25, Daniel Lezcano wrote:
>> When the function successfully finishes it logs an information about
>> the registration of the cooling device and use its name to build the
>> message. Unfortunately it was freed right before:
>>
>> drivers/thermal/cpuidle_cooling.c:218 __cpuidle_cooling_register()
>>  warn: 'name' was already freed.
>>
>> Fix this by freeing after the message happened.
>>
>> Fixes: 6fd1b186d900 ("thermal/drivers/cpuidle_cooling: Use device name 
>> instead of auto-numbering")
> 
> Why not merge this with the Fixes patch itself since it isn't there in Linus's
> tree yet ?
> 
> Or is your branch strictly immutable ?

Hi Viresh;

The changes follow the path:

testing -> linux-next -> next

The branch next is never rebased. The patch above reached it. This is
notified by the thermal-bot [1].

  -- Daniel

[1]
https://lore.kernel.org/linux-pm/20210314111333.16551-3-daniel.lezc...@linaro.org/T/#ma257519efc70ee60faca47dbd458b05de5449bf8


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH 2/3] clocksource/drivers/timer-ti-dm: Remove extra of_node_put()

2021-03-20 Thread Daniel Lezcano
On 08/03/2021 16:26, Tony Lindgren wrote:
> Hi,
> 
> * Tony Lindgren  [210305 07:58]:
>> * Grygorii Strashko  [210304 20:56]:
>>>
>>>
>>> On 04/03/2021 09:21, Tony Lindgren wrote:
 We have of_translate_address() already do of_node_put() as needed.
 I probably looked at __of_translate_address() earlier by accident
 that of_translate_address() uses.
>>>
>>> I do not see of_node_put() in of_translate_address() and
>>>  __of_translate_address() is doing of_node_get(dev);
>>> ?
>>
>> Oh right.. this is confusing.. Yeah we can ignore this patch.
>> We should have the use count set for only the system timer(s)
>> we claim.
> 
> Daniel, would you like me to repost this series with this patch dropped?

No, it is ok. I will take care of not picking it.

Thanks
  -- Daniel



-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [RESEND PATCH v4 1/2] dt-bindings: tsens: qcom: Document MDM9607 compatible

2021-03-20 Thread Daniel Lezcano
On 20/03/2021 20:01, Konrad Dybcio wrote:
> Add the compatible for MDM9607.
>> Signed-off-by: Konrad Dybcio 
>> ---
>> v4: separate from the main patch
>>
>>  Documentation/devicetree/bindings/thermal/qcom-tsens.yaml | 2 ++
>>  1 file changed, 2 insertions(+)
> 
> This change was previously a-b Rob Herring, I forgot to add the ack back in 
> after separating the patch, sorry for that.

Can you point it to a lkml@ archive ?

-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


[PATCH] thermal/drivers/cpuidle_cooling: Fix use after error

2021-03-19 Thread Daniel Lezcano
When the function successfully finishes it logs an information about
the registration of the cooling device and use its name to build the
message. Unfortunately it was freed right before:

drivers/thermal/cpuidle_cooling.c:218 __cpuidle_cooling_register()
warn: 'name' was already freed.

Fix this by freeing after the message happened.

Fixes: 6fd1b186d900 ("thermal/drivers/cpuidle_cooling: Use device name instead 
of auto-numbering")
Reported-by: Dan Carpenter 
Signed-off-by: Daniel Lezcano 
---
 drivers/thermal/cpuidle_cooling.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/cpuidle_cooling.c 
b/drivers/thermal/cpuidle_cooling.c
index f32976163bad..4f41102e8b16 100644
--- a/drivers/thermal/cpuidle_cooling.c
+++ b/drivers/thermal/cpuidle_cooling.c
@@ -208,18 +208,20 @@ static int __cpuidle_cooling_register(struct device_node 
*np,
 
cdev = thermal_of_cooling_device_register(np, name, idle_cdev,
  _cooling_ops);
-   kfree(name);
-
if (IS_ERR(cdev)) {
ret = PTR_ERR(cdev);
-   goto out_unregister;
+   goto out_kfree_name;
}
 
pr_debug("%s: Idle injection set with idle duration=%u, latency=%u\n",
 name, idle_duration_us, latency_us);
 
+   kfree(name);
+
return 0;
 
+out_kfree_name:
+   kfree(name);
 out_unregister:
idle_inject_unregister(ii_dev);
 out_kfree:
-- 
2.25.1



[PATCH] thermal/drivers/devfreq_cooling: Fix wrong return on error path

2021-03-19 Thread Daniel Lezcano
The following error is reported by kbuild:

 smatch warnings:
 drivers/thermal/devfreq_cooling.c:433 of_devfreq_cooling_register_power() 
warn: passing zero to 'ERR_PTR'

Fix the error code by the setting the 'err' variable instead of 'cdev'.

Fixes: f8d354e821b2 ("thermal/drivers/devfreq_cooling: Use device name instead 
of auto-numbering")
Reported-by: kernel test robot 
Reported-by: Dan Carpenter 
Signed-off-by: Daniel Lezcano 
---
 drivers/thermal/devfreq_cooling.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/devfreq_cooling.c 
b/drivers/thermal/devfreq_cooling.c
index fb250ac16f50..3a788ac4f525 100644
--- a/drivers/thermal/devfreq_cooling.c
+++ b/drivers/thermal/devfreq_cooling.c
@@ -402,7 +402,7 @@ of_devfreq_cooling_register_power(struct device_node *np, 
struct devfreq *df,
if (err < 0)
goto free_table;
 
-   cdev = ERR_PTR(-ENOMEM);
+   err = -ENOMEM;
name = kasprintf(GFP_KERNEL, "devfreq-%s", dev_name(dev));
if (!name)
goto remove_qos_req;
-- 
2.25.1



[PATCH] thermal: core: Fix memory leak in the error path

2021-03-19 Thread Daniel Lezcano
Fix the following error:

 smatch warnings:
 drivers/thermal/thermal_core.c:1020 __thermal_cooling_device_register() warn: 
possible memory leak of 'cdev'

by freeing the cdev when exiting the function in the error path.

Fixes: 584837618100 ("thermal/drivers/core: Use a char pointer for the cooling 
device name")
Reported-by: kernel test robot 
Reported-by: Dan Carpenter 
Signed-off-by: Daniel Lezcano 
---
 drivers/thermal/thermal_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index c8d4010940ef..3566fd291399 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1017,6 +1017,7 @@ __thermal_cooling_device_register(struct device_node *np,
 out_ida_remove:
ida_simple_remove(_cdev_ida, cdev->id);
 out_kfree_cdev:
+   kfree(cdev);
return ERR_PTR(ret);
 }
 
-- 
2.25.1



Re: [PATCH] dt: rockchip: rk3399: Add dynamic power coefficient for GPU

2021-03-19 Thread Daniel Lezcano
On 19/03/2021 19:05, Robin Murphy wrote:
> On 2021-03-19 14:35, Daniel Lezcano wrote:
>>
>> Hi Robin,
>>
>> On 19/03/2021 13:17, Robin Murphy wrote:
>>> On 2021-03-19 11:05, Daniel Lezcano wrote:
>>>> The DTPM framework is looking for upstream SoC candidates to share the
>>>> power numbers.
>>>>
>>>> We can see around different numbers but the one which seems to be
>>>> consistent with the initial post for the values on the CPUs can be
>>>> found in the patch https://lore.kernel.org/patchwork/patch/810159/
>>>
>>> The kernel hacker in me would be more inclined to trust the BSP that the
>>> vendor actively supports than a 5-year-old patch that was never pursued
>>> upstream. Apparently that was last updated more recently:
>>>
>>> https://github.com/rockchip-linux/kernel/commit/98d4505e1bd62ff028bd79fbd8284d64b6f468f8
>>>
>>
>> Yes, I've seen this value also.
>>
>>> The ex-mathematician in me can't even comment either way without
>>> evidence that whatever model expects to consume this value is even
>>> comparable to whatever "arm,mali-simple-power-model" is. >
>>> The way the
>>> latter apparently needs an explicit "static" coefficient as well as a
>>> "dynamic" one, and the value here being nearly 3 times that of a
>>> similarly-named one in active use downstream (ChromeOS appears to still
>>> be using the values from before the above commit), certainly incline me
>>> to think they may not be...
>>
>> Sorry, I'm missing the point :/
>>
>> We dropped in the kernel any static power computation because as there
>> was no value, the resulting code was considered dead. So we rely on the
>> dynamic power only.
> 
> Right, so a 2-factor model is clearly not identical to a 1-factor model,
> so how do we know that a value for one is valid for the other, even if
> it happens to have a similar name? I'm not saying that it is or isn't; I
> don't know. If someone can point to the downstream coefficient
> definition being identical to the upstream one then great, let's use
> that as justification. If not, then the justification of one arbitrary
> meaningless number over any other is a bit misleading.

That's a call :)

>>>> I don't know the precision of this value but it is better than
>>>> nothing.
>>>
>>> But is it? If it leads to some throttling mechanism kicking in and
>>> crippling GPU performance because it's massively overestimating power
>>> consumption, that would be objectively worse for most users, no?
>>
>> No because there is no sustainable power specified for the thermal zones
>> related to the GPU.
> OK, that's some reassurance at least. Does the exact value have any
> material effect? 

Yes, it has when it is combined with other devices having also power
values, like the CPUs and hopefully the DMC soon.

If we can have more or less consistent power numbers for the DMC, CPU
and GPU on the rock960, with the thermal zone having these three heating
sources, we can use the DTPM framework to act on the power of the whole.

I don't know the best coefficient, 733, 977 or 1780 [1]

The value of 977 sound to me as a starting point.



[1]
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/factory-gru-8652.B-chromeos-4.4/arch/arm64/boot/dts/rockchip/rk3399-gru-kevin-r3.dts

> If not, what's to stop us from using an obviously
> made-up value like 1, and saying so?





-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


[PATCH] powercap/drivers/dtpm: Add dtpm devfreq with energy model support

2021-03-19 Thread Daniel Lezcano
Currently the dtpm supports the CPUs via cpufreq and the energy
model. This change provides the same for the device which supports
devfreq.

Each device supporting devfreq and having an energy model can register
themselves in the list of supported devices.

The concept is the same as the cpufreq dtpm support: the QoS is used
to aggregate the requests and the energy model gives the value of the
instantaneous power consumption ponderated by the load of the device.

Cc: Chanwoo Choi 
Cc: Lukasz Luba 
Cc: Kyungmin Park 
Cc: MyungJoo Ham 
Signed-off-by: Daniel Lezcano 
---
 drivers/powercap/Kconfig|   7 ++
 drivers/powercap/Makefile   |   1 +
 drivers/powercap/dtpm_devfreq.c | 198 
 include/linux/dtpm.h|  13 +++
 4 files changed, 219 insertions(+)
 create mode 100644 drivers/powercap/dtpm_devfreq.c

diff --git a/drivers/powercap/Kconfig b/drivers/powercap/Kconfig
index 599b41e4e0a7..acdb047d8f1b 100644
--- a/drivers/powercap/Kconfig
+++ b/drivers/powercap/Kconfig
@@ -64,4 +64,11 @@ config DTPM_CPU
help
  This enables support for CPU power limitation based on
  energy model.
+
+config DTPM_DEVFREQ
+   bool "Add device power capping based on the energy model"
+   depends on DTPM && ENERGY_MODEL
+   help
+ This enables support for device power limitation based on
+ energy model.
 endif
diff --git a/drivers/powercap/Makefile b/drivers/powercap/Makefile
index 519cabc624c3..e47f4fd68fb9 100644
--- a/drivers/powercap/Makefile
+++ b/drivers/powercap/Makefile
@@ -2,6 +2,7 @@
 obj-$(CONFIG_DTPM) += dtpm.o
 obj-$(CONFIG_DTPM_CONFIGFS) += dtpm_configfs.o
 obj-$(CONFIG_DTPM_CPU) += dtpm_cpu.o
+obj-$(CONFIG_DTPM_DEVFREQ) += dtpm_devfreq.o
 obj-$(CONFIG_POWERCAP) += powercap_sys.o
 obj-$(CONFIG_INTEL_RAPL_CORE) += intel_rapl_common.o
 obj-$(CONFIG_INTEL_RAPL) += intel_rapl_msr.o
diff --git a/drivers/powercap/dtpm_devfreq.c b/drivers/powercap/dtpm_devfreq.c
new file mode 100644
index ..0f259238a45d
--- /dev/null
+++ b/drivers/powercap/dtpm_devfreq.c
@@ -0,0 +1,198 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright 2021 Linaro Limited
+ *
+ * Author: Daniel Lezcano 
+ *
+ * The devfreq device combined with the energy model and the load can
+ * give an estimation of the power consumption as well as limiting the
+ * power.
+ *
+ */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifndef HZ_PER_KHZ
+#define HZ_PER_KHZ 1000UL
+#endif
+
+struct dtpm_devfreq {
+   struct dtpm dtpm;
+   struct dev_pm_qos_request qos_req;
+   struct devfreq *devfreq;
+};
+
+struct dtpm_devfreq *to_dtpm_devfreq(struct dtpm *dtpm)
+{
+   return container_of(dtpm, struct dtpm_devfreq, dtpm);
+}
+
+static int update_pd_power_uw(struct dtpm *dtpm)
+{
+   struct dtpm_devfreq *dtpm_devfreq = to_dtpm_devfreq(dtpm);
+   struct devfreq *devfreq = dtpm_devfreq->devfreq;
+   struct device *dev = devfreq->dev.parent;
+   struct em_perf_domain *pd = em_pd_get(dev);
+
+   dtpm->power_min = pd->table[0].power;
+   dtpm->power_min *= MICROWATT_PER_MILLIWATT;
+
+   dtpm->power_max = pd->table[pd->nr_perf_states - 1].power;
+   dtpm->power_max *= MICROWATT_PER_MILLIWATT;
+
+   return 0;
+}
+
+static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
+{
+   struct dtpm_devfreq *dtpm_devfreq = to_dtpm_devfreq(dtpm);
+   struct devfreq *devfreq = dtpm_devfreq->devfreq;
+   struct device *dev = devfreq->dev.parent;
+   struct em_perf_domain *pd = em_pd_get(dev);
+   unsigned long freq;
+   u64 power;
+   int i;
+
+   for (i = 0; i < pd->nr_perf_states; i++) {
+
+   power = pd->table[i].power * MICROWATT_PER_MILLIWATT;
+   if (power > power_limit)
+   break;
+   }
+
+   freq = pd->table[i - 1].frequency;
+
+   dev_pm_qos_update_request(_devfreq->qos_req, freq);
+
+   power_limit = pd->table[i - 1].power * MICROWATT_PER_MILLIWATT;
+
+   return power_limit;
+}
+
+static void _normalize_load(struct devfreq_dev_status *status)
+{
+   if (status->total_time > 0xf) {
+   status->total_time >>= 10;
+   status->busy_time >>= 10;
+   }
+
+   status->busy_time <<= 10;
+   status->busy_time /= status->total_time ? : 1;
+
+   status->busy_time = status->busy_time ? : 1;
+   status->total_time = 1024;
+}
+
+static u64 get_pd_power_uw(struct dtpm *dtpm)
+{
+   struct dtpm_devfreq *dtpm_devfreq = to_dtpm_devfreq(dtpm);
+   struct devfreq *devfreq = dtpm_devfreq->devfreq;
+   struct device *dev = devfreq->dev.parent;
+   struct em_perf_domain *pd = em_pd_get(dev);
+   struct devfreq_dev_status status;
+   unsigned lon

Re: [PATCH] MAINTAINERS: Add co-maintainer for Qualcomm tsens thermal drivers

2021-03-19 Thread Daniel Lezcano
On 19/03/2021 16:37, Thara Gopinath wrote:
> Add myself as the maintainer for Qualcomm tsens drivers so that I
> can help Daniel by taking care of/reviewing changes to these drivers.
> 
> Signed-off-by: Thara Gopinath 
> ---

Applied, thanks


-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH] dt: rockchip: rk3399: Add dynamic power coefficient for GPU

2021-03-19 Thread Daniel Lezcano


Hi Robin,

On 19/03/2021 13:17, Robin Murphy wrote:
> On 2021-03-19 11:05, Daniel Lezcano wrote:
>> The DTPM framework is looking for upstream SoC candidates to share the
>> power numbers.
>>
>> We can see around different numbers but the one which seems to be
>> consistent with the initial post for the values on the CPUs can be
>> found in the patch https://lore.kernel.org/patchwork/patch/810159/
> 
> The kernel hacker in me would be more inclined to trust the BSP that the
> vendor actively supports than a 5-year-old patch that was never pursued
> upstream. Apparently that was last updated more recently:
> 
> https://github.com/rockchip-linux/kernel/commit/98d4505e1bd62ff028bd79fbd8284d64b6f468f8

Yes, I've seen this value also.

> The ex-mathematician in me can't even comment either way without
> evidence that whatever model expects to consume this value is even
> comparable to whatever "arm,mali-simple-power-model" is. >
> The way the
> latter apparently needs an explicit "static" coefficient as well as a
> "dynamic" one, and the value here being nearly 3 times that of a
> similarly-named one in active use downstream (ChromeOS appears to still
> be using the values from before the above commit), certainly incline me
> to think they may not be...

Sorry, I'm missing the point :/

We dropped in the kernel any static power computation because as there
was no value, the resulting code was considered dead. So we rely on the
dynamic power only.

>> I don't know the precision of this value but it is better than
>> nothing.
> 
> But is it? If it leads to some throttling mechanism kicking in and
> crippling GPU performance because it's massively overestimating power
> consumption, that would be objectively worse for most users, no?

No because there is no sustainable power specified for the thermal zones
related to the GPU.



-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH v1 2/2] arm64: dts: qcom: sm8350: Add thermal zones and throttling support

2021-03-19 Thread Daniel Lezcano
On 19/03/2021 12:24, Robert Foss wrote:
> On Fri, 19 Mar 2021 at 11:49, Robert Foss  wrote:
>>
>> sm8350 has 29 thermal sensors split across two tsens controllers. Add
>> the thermal zones to expose them and wireup the cpus to throttle their
>> frequencies on crossing passive temperature thresholds.
>>
>> Signed-off-by: Robert Foss 
>> ---

[ ... ]

>> +   };
>> +
>> +   // TODO: What is the NSP subsystem?
> 
> This comment should not have been included, will remove in v2

Please trim when replying to a large patch file.

[ ... ]

>> +   trips {
>> +   nspss1_alert0: trip-point0 {
>> +   temperature = <9>;
>> +   hysteresis = <1000>;
>> +   type = "hot";
>> +   };
>> +   };
>> +   };
>> +
>> +   // TODO: What is the NSP subsystem?
> 
> This comment should not have been included, will remove in v2

[ ... ]

>> +   trips {
>> +   nspss2_alert0: trip-point0 {
>> +   temperature = <9>;
>> +   hysteresis = <1000>;
>> +   type = "hot";
>> +   };
>> +   };
>> +   };
>> +
>> +   // TODO: What is the NSP subsystem?
> 
> This comment should not have been included, will remove in v2
> 

[ ... ]



-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


[PATCH] dt: rockchip: rk3399: Add dynamic power coefficient for GPU

2021-03-19 Thread Daniel Lezcano
The DTPM framework is looking for upstream SoC candidates to share the
power numbers.

We can see around different numbers but the one which seems to be
consistent with the initial post for the values on the CPUs can be
found in the patch https://lore.kernel.org/patchwork/patch/810159/

I don't know the precision of this value but it is better than
nothing.

Hopefully, one day SoC vendors will be more generous with the power
numbers at least for the SoC which are from the previous generation
and give the community the opportunity to develop power based
frameworks.
---
 arch/arm64/boot/dts/rockchip/rk3399.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi 
b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index edbbf35fe19e..1ab1d293d2e9 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -1933,6 +1933,7 @@
interrupt-names = "job", "mmu", "gpu";
clocks = < ACLK_GPU>;
#cooling-cells = <2>;
+   dynamic-power-coefficient = <977>;
power-domains = < RK3399_PD_GPU>;
status = "disabled";
};
-- 
2.17.1



Re: [PATCH v2 1/9] units: Add the HZ macros

2021-03-19 Thread Daniel Lezcano


Hi Rafael,

is it possible to merge this series through linux-pm ?


On 25/02/2021 12:22, Andy Shevchenko wrote:
> On Wed, Feb 24, 2021 at 03:42:11PM +0100, Daniel Lezcano wrote:
>> The macros for the unit conversion for frequency are duplicated in
>> different places.
>>
>> Provide these macros in the 'units' header, so they can be reused.
> 
> For the all that have not been tagged:
> Reviewed-by: Andy Shevchenko 
> 
> Thanks!
> 
>> Signed-off-by: Daniel Lezcano 
>> Reviewed-by: Christian Eggers 
>> Reviewed-by: Andy Shevchenko 
>> ---
>>  include/linux/units.h | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/include/linux/units.h b/include/linux/units.h
>> index dcc30a53fa93..218ec0d314b6 100644
>> --- a/include/linux/units.h
>> +++ b/include/linux/units.h
>> @@ -4,6 +4,10 @@
>>  
>>  #include 
>>  
>> +#define HZ_PER_KHZ  1000L
>> +#define KHZ_PER_MHZ 1000L
>> +#define HZ_PER_MHZ  100L
>> +
>>  #define MILLIWATT_PER_WATT  1000L
>>  #define MICROWATT_PER_MILLIWATT 1000L
>>  #define MICROWATT_PER_WATT  100L
>> -- 
>> 2.17.1
>>
> 


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


[PATCH] powercap/drivers/dtpm : Fix power limit initialization

2021-03-18 Thread Daniel Lezcano
When a DTPM node is registered its power limit must be initialized to
the power max.

Signed-off-by: Daniel Lezcano 
---
 drivers/powercap/dtpm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/powercap/dtpm.c b/drivers/powercap/dtpm.c
index a4784ac2f79b..2f028776cfa8 100644
--- a/drivers/powercap/dtpm.c
+++ b/drivers/powercap/dtpm.c
@@ -594,8 +594,10 @@ int dtpm_create(const char *name, struct dtpm *dtpm, 
struct dtpm *parent)
root = dtpm;
}
 
-   if (dtpm->ops && !dtpm->ops->update_power_uw(dtpm))
+   if (dtpm->ops && !dtpm->ops->update_power_uw(dtpm)) {
__dtpm_add_power(dtpm);
+   dtpm->power_limit = dtpm->power_max;
+   }
 
pr_info("Created dtpm node '%s' / %llu-%llu uW, \n",
dtpm->zone.name, dtpm->power_min, dtpm->power_max);
-- 
2.17.1



Re: [PATCH 1/2] csky: Enable generic clockevent broadcast

2021-03-17 Thread Daniel Lezcano
On 07/03/2021 03:24, guo...@kernel.org wrote:
> From: Guo Ren 
> 
> When percpu-timers are stopped by deep power saving mode, we need
> system timer help to broadcast IPI_TIMER.
> 
> This is first introduced by broken x86 hardware, where the local apic
> timer stops in C3 state. But many other architectures(powerpc, mips,
> arm, hexagon, openrisc, sh) have supported the infrastructure to
> deal with Power Management issues.
> 
> Signed-off-by: Guo Ren 
> Cc: Arnd Bergmann 
> Cc: Thomas Gleixner 
> Cc: Daniel Lezcano 
> ---

Acked-by: Daniel Lezcano 


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


[GIT PULL] thermal fixes for v5.12

2021-03-17 Thread Daniel Lezcano


Hi Linus,

please consider pulling this single fix for the thermal framework.

Thanks

  -- Daniel


The following changes since commit a38fd8748464831584a19438cbb3082b5a2dab15:

  Linux 5.12-rc2 (2021-03-05 17:33:41 -0800)

are available in the Git repository at:


ssh://g...@gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git
tags/thermal-v5.12-rc4

for you to fetch changes up to 2046a24ae121cd107929655a6aaf3b8c5beea01f:

  thermal/core: Add NULL pointer check before using cooling device stats
(2021-03-17 09:55:58 +0100)


- Fix NULL pointer access when the cooling device transition stats
  table failed to allocate due to a big number of states (Manaf
  Meethalavalappu Pallikunhi).


Manaf Meethalavalappu Pallikunhi (1):
  thermal/core: Add NULL pointer check before using cooling device stats

 drivers/thermal/thermal_sysfs.c | 3 +++
 1 file changed, 3 insertions(+)

-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH] thermal: power_allocator: using round the division when re-divvying up power

2021-03-16 Thread Daniel Lezcano
On 15/03/2021 10:51, Lukasz Luba wrote:
> 
> 
> On 3/15/21 8:25 AM, gao.yunxi...@gmail.com wrote:
>> From: "jeson.gao" 
>>
>> The division is used directly in re-divvying up power, the decimal
>> part will
>> be discarded, devices will get less than the extra_actor_power - 1.
>> if using round the division to make the calculation more accurate.
>>
>> For example:
>> actor0 received more than it's max_power, it has the extra_power 759
>> actor1 received less than it's max_power, it require extra_actor_power
>> 395
>> actor2 received less than it's max_power, it require extra_actor_power
>> 365
>> actor1 and actor2 require the total capped_extra_power 760
>>
>> using division in re-divvying up power
>> actor1 would actually get the extra_actor_power 394
>> actor2 would actually get the extra_actor_power 364
>>
>> if using round the division in re-divvying up power
>> actor1 would actually get the extra_actor_power 394
>> actor2 would actually get the extra_actor_power 365
>>
>> Signed-off-by: Jeson Gao 
>> ---

Applied, thanks

-- 
 Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog


Re: [PATCH v2 4/5] thermal/drivers/cpuidle_cooling: Use device name instead of auto-numbering

2021-03-14 Thread Daniel Lezcano
On 15/03/2021 04:07, Viresh Kumar wrote:
> On 12-03-21, 18:03, Daniel Lezcano wrote:
>> Currently the naming of a cooling device is just a cooling technique
>> followed by a number. When there are multiple cooling devices using
>> the same technique, it is impossible to clearly identify the related
>> device as this one is just a number.
>>
>> For instance:
>>
>>  thermal-idle-0
>>  thermal-idle-1
>>  thermal-idle-2
>>  thermal-idle-3
>>  etc ...
>>
>> The 'thermal' prefix is redundant with the subsystem namespace. This
>> patch removes the 'thermal prefix and changes the number by the device
>> name. So the naming above becomes:
>>
>>  idle-cpu0
>>  idle-cpu1
>>  idle-cpu2
>>  idle-cpu3
>>  etc ...
>>
>> Signed-off-by: Daniel Lezcano 
>> Reviewed-by: Lukasz Luba 
> 
> I acked for both the patches :(

Right, I'll add you when merging the patches.

Thanks


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Re: [PATCH v3 1/5] thermal/drivers/core: Use a char pointer for the cooling device name

2021-03-14 Thread Daniel Lezcano
On 14/03/2021 13:47, Ido Schimmel wrote:
> On Sun, Mar 14, 2021 at 12:13:29PM +0100, Daniel Lezcano wrote:
>> We want to have any kind of name for the cooling devices as we do no
>> longer want to rely on auto-numbering. Let's replace the cooling
>> device's fixed array by a char pointer to be allocated dynamically
>> when registering the cooling device, so we don't limit the length of
>> the name.
>>
>> Rework the error path at the same time as we have to rollback the
>> allocations in case of error.
>>
>> Tested with a dummy device having the name:
>>  "Llanfairpwllgwyngyllgogerychwyrndrobwantysiliogogogoch"
>>
>> A village on the island of Anglesey (Wales), known to have the longest
>> name in Europe.
>>
>> Signed-off-by: Daniel Lezcano 
>> Reviewed-by: Lukasz Luba 
> 
> Tested-by: Ido Schimmel 

Thanks !


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


[PATCH v3 5/5] thermal/drivers/cpufreq_cooling: Remove unused list

2021-03-14 Thread Daniel Lezcano
There is a list with the purpose of grouping the cpufreq cooling
device together as described in the comments but actually it is
unused, the code evolved since 2012 and the list was no longer needed.

Delete the remaining unused list related code.

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---
 drivers/thermal/cpufreq_cooling.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/drivers/thermal/cpufreq_cooling.c 
b/drivers/thermal/cpufreq_cooling.c
index 3f5f1dce1320..f3d308427665 100644
--- a/drivers/thermal/cpufreq_cooling.c
+++ b/drivers/thermal/cpufreq_cooling.c
@@ -59,7 +59,6 @@ struct time_in_idle {
  * @cdev: thermal_cooling_device pointer to keep track of the
  * registered cooling device.
  * @policy: cpufreq policy.
- * @node: list_head to link all cpufreq_cooling_device together.
  * @idle_time: idle time stats
  * @qos_req: PM QoS contraint to apply
  *
@@ -72,16 +71,12 @@ struct cpufreq_cooling_device {
unsigned int max_level;
struct em_perf_domain *em;
struct cpufreq_policy *policy;
-   struct list_head node;
 #ifndef CONFIG_SMP
struct time_in_idle *idle_time;
 #endif
struct freq_qos_request qos_req;
 };
 
-static DEFINE_MUTEX(cooling_list_lock);
-static LIST_HEAD(cpufreq_cdev_list);
-
 #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
 /**
  * get_level: Find the level for a particular frequency
@@ -602,10 +597,6 @@ __cpufreq_cooling_register(struct device_node *np,
if (IS_ERR(cdev))
goto remove_qos_req;
 
-   mutex_lock(_list_lock);
-   list_add(_cdev->node, _cdev_list);
-   mutex_unlock(_list_lock);
-
return cdev;
 
 remove_qos_req:
@@ -697,10 +688,6 @@ void cpufreq_cooling_unregister(struct 
thermal_cooling_device *cdev)
 
cpufreq_cdev = cdev->devdata;
 
-   mutex_lock(_list_lock);
-   list_del(_cdev->node);
-   mutex_unlock(_list_lock);
-
thermal_cooling_device_unregister(cdev);
freq_qos_remove_request(_cdev->qos_req);
free_idle_time(cpufreq_cdev);
-- 
2.17.1



[PATCH v3 4/5] thermal/drivers/cpuidle_cooling: Use device name instead of auto-numbering

2021-03-14 Thread Daniel Lezcano
Currently the naming of a cooling device is just a cooling technique
followed by a number. When there are multiple cooling devices using
the same technique, it is impossible to clearly identify the related
device as this one is just a number.

For instance:

 thermal-idle-0
 thermal-idle-1
 thermal-idle-2
 thermal-idle-3
 etc ...

The 'thermal' prefix is redundant with the subsystem namespace. This
patch removes the 'thermal prefix and changes the number by the device
name. So the naming above becomes:

 idle-cpu0
 idle-cpu1
 idle-cpu2
 idle-cpu3
 etc ...

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---
V2:
  - Removed idr.h header
  - Used kasprintf instead of fixed buffer length on the stack
  - Fixed typo in the log
---
 drivers/thermal/cpuidle_cooling.c | 33 +++
 1 file changed, 16 insertions(+), 17 deletions(-)

diff --git a/drivers/thermal/cpuidle_cooling.c 
b/drivers/thermal/cpuidle_cooling.c
index 7ecab4b16b29..f32976163bad 100644
--- a/drivers/thermal/cpuidle_cooling.c
+++ b/drivers/thermal/cpuidle_cooling.c
@@ -9,9 +9,9 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -26,8 +26,6 @@ struct cpuidle_cooling_device {
unsigned long state;
 };
 
-static DEFINE_IDA(cpuidle_ida);
-
 /**
  * cpuidle_cooling_runtime - Running time computation
  * @idle_duration_us: CPU idle time to inject in microseconds
@@ -174,10 +172,11 @@ static int __cpuidle_cooling_register(struct device_node 
*np,
struct idle_inject_device *ii_dev;
struct cpuidle_cooling_device *idle_cdev;
struct thermal_cooling_device *cdev;
+   struct device *dev;
unsigned int idle_duration_us = TICK_USEC;
unsigned int latency_us = UINT_MAX;
-   char dev_name[THERMAL_NAME_LENGTH];
-   int id, ret;
+   char *name;
+   int ret;
 
idle_cdev = kzalloc(sizeof(*idle_cdev), GFP_KERNEL);
if (!idle_cdev) {
@@ -185,16 +184,10 @@ static int __cpuidle_cooling_register(struct device_node 
*np,
goto out;
}
 
-   id = ida_simple_get(_ida, 0, 0, GFP_KERNEL);
-   if (id < 0) {
-   ret = id;
-   goto out_kfree;
-   }
-
ii_dev = idle_inject_register(drv->cpumask);
if (!ii_dev) {
ret = -EINVAL;
-   goto out_id;
+   goto out_kfree;
}
 
of_property_read_u32(np, "duration-us", _duration_us);
@@ -205,24 +198,30 @@ static int __cpuidle_cooling_register(struct device_node 
*np,
 
idle_cdev->ii_dev = ii_dev;
 
-   snprintf(dev_name, sizeof(dev_name), "thermal-idle-%d", id);
+   dev = get_cpu_device(cpumask_first(drv->cpumask));
 
-   cdev = thermal_of_cooling_device_register(np, dev_name, idle_cdev,
+   name = kasprintf(GFP_KERNEL, "idle-%s", dev_name(dev));
+   if (!name) {
+   ret = -ENOMEM;
+   goto out_unregister;
+   }
+
+   cdev = thermal_of_cooling_device_register(np, name, idle_cdev,
  _cooling_ops);
+   kfree(name);
+
if (IS_ERR(cdev)) {
ret = PTR_ERR(cdev);
goto out_unregister;
}
 
pr_debug("%s: Idle injection set with idle duration=%u, latency=%u\n",
-dev_name, idle_duration_us, latency_us);
+name, idle_duration_us, latency_us);
 
return 0;
 
 out_unregister:
idle_inject_unregister(ii_dev);
-out_id:
-   ida_simple_remove(_ida, id);
 out_kfree:
kfree(idle_cdev);
 out:
-- 
2.17.1



[PATCH v3 2/5] thermal/drivers/cpufreq_cooling: Use device name instead of auto-numbering

2021-03-14 Thread Daniel Lezcano
Currently the naming of a cooling device is just a cooling technique
followed by a number. When there are multiple cooling devices using
the same technique, it is impossible to clearly identify the related
device as this one is just a number.

For instance:

 thermal-cpufreq-0
 thermal-cpufreq-1
 etc ...

The 'thermal' prefix is redundant with the subsystem namespace. This
patch removes the 'thermal' prefix and changes the number by the device
name. So the naming above becomes:

 cpufreq-cpu0
 cpufreq-cpu4
 etc ...

Signed-off-by: Daniel Lezcano 
Acked-by: Viresh Kumar 
Reviewed-by: Lukasz Luba 
---
V2:
  - Use kasprintf() instead of fixed array length on the stack
  - Fixed typo in the log
  - Removed idr.h inclusion
---
 drivers/thermal/cpufreq_cooling.c | 34 +++
 1 file changed, 12 insertions(+), 22 deletions(-)

diff --git a/drivers/thermal/cpufreq_cooling.c 
b/drivers/thermal/cpufreq_cooling.c
index 10af3341e5ea..3f5f1dce1320 100644
--- a/drivers/thermal/cpufreq_cooling.c
+++ b/drivers/thermal/cpufreq_cooling.c
@@ -13,10 +13,10 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -50,8 +50,6 @@ struct time_in_idle {
 
 /**
  * struct cpufreq_cooling_device - data for cooling device with cpufreq
- * @id: unique integer value corresponding to each cpufreq_cooling_device
- * registered.
  * @last_load: load measured by the latest call to 
cpufreq_get_requested_power()
  * @cpufreq_state: integer value representing the current state of cpufreq
  * cooling devices.
@@ -69,7 +67,6 @@ struct time_in_idle {
  * cpufreq_cooling_device.
  */
 struct cpufreq_cooling_device {
-   int id;
u32 last_load;
unsigned int cpufreq_state;
unsigned int max_level;
@@ -82,7 +79,6 @@ struct cpufreq_cooling_device {
struct freq_qos_request qos_req;
 };
 
-static DEFINE_IDA(cpufreq_ida);
 static DEFINE_MUTEX(cooling_list_lock);
 static LIST_HEAD(cpufreq_cdev_list);
 
@@ -528,11 +524,11 @@ __cpufreq_cooling_register(struct device_node *np,
 {
struct thermal_cooling_device *cdev;
struct cpufreq_cooling_device *cpufreq_cdev;
-   char dev_name[THERMAL_NAME_LENGTH];
unsigned int i;
struct device *dev;
int ret;
struct thermal_cooling_device_ops *cooling_ops;
+   char *name;
 
dev = get_cpu_device(policy->cpu);
if (unlikely(!dev)) {
@@ -567,16 +563,6 @@ __cpufreq_cooling_register(struct device_node *np,
/* max_level is an index, not a counter */
cpufreq_cdev->max_level = i - 1;
 
-   ret = ida_simple_get(_ida, 0, 0, GFP_KERNEL);
-   if (ret < 0) {
-   cdev = ERR_PTR(ret);
-   goto free_idle_time;
-   }
-   cpufreq_cdev->id = ret;
-
-   snprintf(dev_name, sizeof(dev_name), "thermal-cpufreq-%d",
-cpufreq_cdev->id);
-
cooling_ops = _cooling_ops;
 
 #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
@@ -591,7 +577,7 @@ __cpufreq_cooling_register(struct device_node *np,
pr_err("%s: unsorted frequency tables are not supported\n",
   __func__);
cdev = ERR_PTR(-EINVAL);
-   goto remove_ida;
+   goto free_idle_time;
}
 
ret = freq_qos_add_request(>constraints,
@@ -601,11 +587,18 @@ __cpufreq_cooling_register(struct device_node *np,
pr_err("%s: Failed to add freq constraint (%d)\n", __func__,
   ret);
cdev = ERR_PTR(ret);
-   goto remove_ida;
+   goto free_idle_time;
}
 
-   cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
+   cdev = ERR_PTR(-ENOMEM);
+   name = kasprintf(GFP_KERNEL, "cpufreq-%s", dev_name(dev));
+   if (!name)
+   goto remove_qos_req;
+
+   cdev = thermal_of_cooling_device_register(np, name, cpufreq_cdev,
  cooling_ops);
+   kfree(name);
+
if (IS_ERR(cdev))
goto remove_qos_req;
 
@@ -617,8 +610,6 @@ __cpufreq_cooling_register(struct device_node *np,
 
 remove_qos_req:
freq_qos_remove_request(_cdev->qos_req);
-remove_ida:
-   ida_simple_remove(_ida, cpufreq_cdev->id);
 free_idle_time:
free_idle_time(cpufreq_cdev);
 free_cdev:
@@ -712,7 +703,6 @@ void cpufreq_cooling_unregister(struct 
thermal_cooling_device *cdev)
 
thermal_cooling_device_unregister(cdev);
freq_qos_remove_request(_cdev->qos_req);
-   ida_simple_remove(_ida, cpufreq_cdev->id);
free_idle_time(cpufreq_cdev);
kfree(cpufreq_cdev);
 }
-- 
2.17.1



[PATCH v3 3/5] thermal/drivers/devfreq_cooling: Use device name instead of auto-numbering

2021-03-14 Thread Daniel Lezcano
Currently the naming of a cooling device is just a cooling technique
followed by a number. When there are multiple cooling devices using
the same technique, it is impossible to clearly identify the related
device as this one is just a number.

For instance:

 thermal-devfreq-0
 thermal-devfreq-1
 etc ...

The 'thermal' prefix is redundant with the subsystem namespace. This
patch removes the 'thermal' prefix and changes the number by the device
name. So the naming above becomes:

 devfreq-500.gpu
 devfreq-1d84000.ufshc
 etc ...

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---
V2:
 - Removed idr.h header
 - Used kasprintf instead of fixed buffer length on the stack
 - Fixed typo in the log
---
 drivers/thermal/devfreq_cooling.c | 25 -
 1 file changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/thermal/devfreq_cooling.c 
b/drivers/thermal/devfreq_cooling.c
index fed3121ff2a1..fb250ac16f50 100644
--- a/drivers/thermal/devfreq_cooling.c
+++ b/drivers/thermal/devfreq_cooling.c
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -25,11 +24,8 @@
 #define HZ_PER_KHZ 1000
 #define SCALE_ERROR_MITIGATION 100
 
-static DEFINE_IDA(devfreq_ida);
-
 /**
  * struct devfreq_cooling_device - Devfreq cooling device
- * @id:unique integer value corresponding to each
  * devfreq_cooling_device registered.
  * @cdev:  Pointer to associated thermal cooling device.
  * @devfreq:   Pointer to associated devfreq device.
@@ -51,7 +47,6 @@ static DEFINE_IDA(devfreq_ida);
  * @em_pd: Energy Model for the associated Devfreq device
  */
 struct devfreq_cooling_device {
-   int id;
struct thermal_cooling_device *cdev;
struct devfreq *devfreq;
unsigned long cooling_state;
@@ -363,7 +358,7 @@ of_devfreq_cooling_register_power(struct device_node *np, 
struct devfreq *df,
struct thermal_cooling_device *cdev;
struct device *dev = df->dev.parent;
struct devfreq_cooling_device *dfc;
-   char dev_name[THERMAL_NAME_LENGTH];
+   char *name;
int err, num_opps;
 
dfc = kzalloc(sizeof(*dfc), GFP_KERNEL);
@@ -407,30 +402,27 @@ of_devfreq_cooling_register_power(struct device_node *np, 
struct devfreq *df,
if (err < 0)
goto free_table;
 
-   err = ida_simple_get(_ida, 0, 0, GFP_KERNEL);
-   if (err < 0)
+   cdev = ERR_PTR(-ENOMEM);
+   name = kasprintf(GFP_KERNEL, "devfreq-%s", dev_name(dev));
+   if (!name)
goto remove_qos_req;
 
-   dfc->id = err;
-
-   snprintf(dev_name, sizeof(dev_name), "thermal-devfreq-%d", dfc->id);
-
-   cdev = thermal_of_cooling_device_register(np, dev_name, dfc,
+   cdev = thermal_of_cooling_device_register(np, name, dfc,
  _cooling_ops);
+   kfree(name);
+
if (IS_ERR(cdev)) {
err = PTR_ERR(cdev);
dev_err(dev,
"Failed to register devfreq cooling device (%d)\n",
err);
-   goto release_ida;
+   goto remove_qos_req;
}
 
dfc->cdev = cdev;
 
return cdev;
 
-release_ida:
-   ida_simple_remove(_ida, dfc->id);
 remove_qos_req:
dev_pm_qos_remove_request(>req_max_freq);
 free_table:
@@ -527,7 +519,6 @@ void devfreq_cooling_unregister(struct 
thermal_cooling_device *cdev)
dev = dfc->devfreq->dev.parent;
 
thermal_cooling_device_unregister(dfc->cdev);
-   ida_simple_remove(_ida, dfc->id);
dev_pm_qos_remove_request(>req_max_freq);
 
em_dev_unregister_perf_domain(dev);
-- 
2.17.1



[PATCH v3 1/5] thermal/drivers/core: Use a char pointer for the cooling device name

2021-03-14 Thread Daniel Lezcano
We want to have any kind of name for the cooling devices as we do no
longer want to rely on auto-numbering. Let's replace the cooling
device's fixed array by a char pointer to be allocated dynamically
when registering the cooling device, so we don't limit the length of
the name.

Rework the error path at the same time as we have to rollback the
allocations in case of error.

Tested with a dummy device having the name:
 "Llanfairpwllgwyngyllgogerychwyrndrobwantysiliogogogoch"

A village on the island of Anglesey (Wales), known to have the longest
name in Europe.

Signed-off-by: Daniel Lezcano 
Reviewed-by: Lukasz Luba 
---
V3:
  - Inverted kfree() and put_device() when unregistering the cooling device
(Reported by Ido Schimmel)
---
 .../ethernet/mellanox/mlxsw/core_thermal.c|  2 +-
 drivers/thermal/thermal_core.c| 38 +++
 include/linux/thermal.h   |  2 +-
 3 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c 
b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
index bf85ce9835d7..7447c2a73cbd 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
@@ -141,7 +141,7 @@ static int mlxsw_get_cooling_device_idx(struct 
mlxsw_thermal *thermal,
/* Allow mlxsw thermal zone binding to an external cooling device */
for (i = 0; i < ARRAY_SIZE(mlxsw_thermal_external_allowed_cdev); i++) {
if (strnstr(cdev->type, mlxsw_thermal_external_allowed_cdev[i],
-   sizeof(cdev->type)))
+   strlen(cdev->type)))
return 0;
}
 
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 996c038f83a4..c8d4010940ef 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -960,10 +960,7 @@ __thermal_cooling_device_register(struct device_node *np,
 {
struct thermal_cooling_device *cdev;
struct thermal_zone_device *pos = NULL;
-   int result;
-
-   if (type && strlen(type) >= THERMAL_NAME_LENGTH)
-   return ERR_PTR(-EINVAL);
+   int ret;
 
if (!ops || !ops->get_max_state || !ops->get_cur_state ||
!ops->set_cur_state)
@@ -973,14 +970,17 @@ __thermal_cooling_device_register(struct device_node *np,
if (!cdev)
return ERR_PTR(-ENOMEM);
 
-   result = ida_simple_get(_cdev_ida, 0, 0, GFP_KERNEL);
-   if (result < 0) {
-   kfree(cdev);
-   return ERR_PTR(result);
+   ret = ida_simple_get(_cdev_ida, 0, 0, GFP_KERNEL);
+   if (ret < 0)
+   goto out_kfree_cdev;
+   cdev->id = ret;
+
+   cdev->type = kstrdup(type ? type : "", GFP_KERNEL);
+   if (!cdev->type) {
+   ret = -ENOMEM;
+   goto out_ida_remove;
}
 
-   cdev->id = result;
-   strlcpy(cdev->type, type ? : "", sizeof(cdev->type));
mutex_init(>lock);
INIT_LIST_HEAD(>thermal_instances);
cdev->np = np;
@@ -990,12 +990,9 @@ __thermal_cooling_device_register(struct device_node *np,
cdev->devdata = devdata;
thermal_cooling_device_setup_sysfs(cdev);
dev_set_name(>device, "cooling_device%d", cdev->id);
-   result = device_register(>device);
-   if (result) {
-   ida_simple_remove(_cdev_ida, cdev->id);
-   put_device(>device);
-   return ERR_PTR(result);
-   }
+   ret = device_register(>device);
+   if (ret)
+   goto out_kfree_type;
 
/* Add 'this' new cdev to the global cdev list */
mutex_lock(_list_lock);
@@ -1013,6 +1010,14 @@ __thermal_cooling_device_register(struct device_node *np,
mutex_unlock(_list_lock);
 
return cdev;
+
+out_kfree_type:
+   kfree(cdev->type);
+   put_device(>device);
+out_ida_remove:
+   ida_simple_remove(_cdev_ida, cdev->id);
+out_kfree_cdev:
+   return ERR_PTR(ret);
 }
 
 /**
@@ -1171,6 +1176,7 @@ void thermal_cooling_device_unregister(struct 
thermal_cooling_device *cdev)
ida_simple_remove(_cdev_ida, cdev->id);
device_del(>device);
thermal_cooling_device_destroy_sysfs(cdev);
+   kfree(cdev->type);
put_device(>device);
 }
 EXPORT_SYMBOL_GPL(thermal_cooling_device_unregister);
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 6ac7bb1d2b1f..169502164364 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -91,7 +91,7 @@ struct thermal_cooling_device_ops {
 
 struct thermal_cooling_device {
int id;
-   char type[THERMAL_NAME_LENGTH];
+   char *type;
struct device device;
struct device_node *np;
void *devdata;
-- 
2.17.1



  1   2   3   4   5   6   7   8   9   10   >