from:"Rafael J. Wysocki"

Re: [PATCH] sysfs: Unbreak the build around sysfs_bin_attr_simple_read()

2024-05-23 Thread Rafael J. Wysocki

On Thu, May 23, 2024 at 1:00 PM Lukas Wunner  wrote:
>
> Günter reports build breakage for m68k "m5208evb_defconfig" plus
> CONFIG_BLK_DEV_INITRD=y caused by commit 66bc1a173328 ("treewide:
> Use sysfs_bin_attr_simple_read() helper").
>
> The defconfig disables CONFIG_SYSFS, so sysfs_bin_attr_simple_read()
> is not compiled into the kernel.  But init/initramfs.c references
> that function in the initializer of a struct bin_attribute.
>
> Add an empty static inline to avoid the build breakage.
>
> Fixes: 66bc1a173328 ("treewide: Use sysfs_bin_attr_simple_read() helper")
> Reported-by: Guenter Roeck 
> Closes: https://lore.kernel.org/r/e12b0027-b199-4de7-b83d-668171447...@roeck-us.net
> Signed-off-by: Lukas Wunner 

Works for me.

Reviewed-by: Rafael J. Wysocki 

> ---
>  include/linux/sysfs.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
> index a7d725fbf739..c4e64dc11206 100644
> --- a/include/linux/sysfs.h
> +++ b/include/linux/sysfs.h
> @@ -750,6 +750,15 @@ static inline int sysfs_emit_at(char *buf, int at, const char *fmt, ...)
>  {
>         return 0;
>  }
> +
> +static inline ssize_t sysfs_bin_attr_simple_read(struct file *file,
> +                                                 struct kobject *kobj,
> +                                                 struct bin_attribute *attr,
> +                                                 char *buf, loff_t off,
> +                                                 size_t count)
> +{
> +       return 0;
> +}
>  #endif /* CONFIG_SYSFS */
>
>  static inline int __must_check sysfs_create_file(struct kobject *kobj,
> --
> 2.43.0
>
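
The stub approach in the patch above can be illustrated outside the kernel.
What follows is only a minimal userspace sketch, not the kernel code: the
DEMO_CONFIG_SYSFS switch, struct demo_attr, demo_simple_read() and
demo_initrd_attr are hypothetical names invented for the example. It shows
why a static initializer that references the helper keeps compiling when the
feature is configured out, as long as a stub with the same signature exists.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/*
 * DEMO_CONFIG_SYSFS is a hypothetical stand-in for CONFIG_SYSFS.
 * Leave it undefined to model a configuration with sysfs compiled out.
 */
/* #define DEMO_CONFIG_SYSFS 1 */

/* Simplified, hypothetical analogue of struct bin_attribute. */
struct demo_attr {
	const void *private;	/* buffer exposed by the attribute */
	size_t size;		/* number of valid bytes in the buffer */
	ssize_t (*read)(struct demo_attr *attr, char *buf,
			size_t off, size_t count);
};

#ifdef DEMO_CONFIG_SYSFS
/* Feature enabled: copy up to count bytes starting at offset off. */
static ssize_t demo_simple_read(struct demo_attr *attr, char *buf,
				size_t off, size_t count)
{
	if (off >= attr->size)
		return 0;
	if (count > attr->size - off)
		count = attr->size - off;
	memcpy(buf, (const char *)attr->private + off, count);
	return (ssize_t)count;
}
#else
/* Feature disabled: empty stub so that references still compile. */
static inline ssize_t demo_simple_read(struct demo_attr *attr, char *buf,
				       size_t off, size_t count)
{
	(void)attr;
	(void)buf;
	(void)off;
	(void)count;
	return 0;
}
#endif

/* Builds in both configurations because the symbol always exists. */
static struct demo_attr demo_initrd_attr = {
	.private = "initrd",
	.size = 6,
	.read = demo_simple_read,
};
```

With the switch left undefined, a call through demo_initrd_attr.read() reports
zero bytes, which mirrors the "return 0" of the stub added by the patch.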

Re: [PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-05-17 Thread Rafael J. Wysocki

On Thu, May 16, 2024 at 7:35 PM Steven Rostedt  wrote:
>
> From: "Steven Rostedt (Google)" 
>
> [
>This is a treewide change. I will likely re-create this patch again in
>the second week of the merge window of v6.10 and submit it then. Hoping
>to keep the conflicts that it will cause to a minimum.
> ]
>
> With the rework of how __string() handles dynamic strings, where it
> saves off the source string in a field in the helper structure[1], the
> value to assign to the trace event field is stored in the helper
> structure and does not need to be passed in again.
>
> This means that with:
>
>   __string(field, mystring)
>
> Which used to be assigned with __assign_str(field, mystring), no longer
> needs the second parameter and it is unused. With this, __assign_str()
> will now only get a single parameter.
>
> There are over 700 users of __assign_str() and because coccinelle does not
> handle the TRACE_EVENT() macro I ended up using the following sed script:
>
>   git grep -l __assign_str | while read a ; do
>   sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
>   mv /tmp/test-file $a;
>   done
>
> I then searched for __assign_str() that did not end with ';' as those
> were multi line assignments that the sed script above would fail to catch.
>
> Note, the same updates will need to be done for:
>
>   __assign_str_len()
>   __assign_rel_str()
>   __assign_rel_str_len()
>
> I tested this with both an allmodconfig and an allyesconfig (build only for both).
>
> [1] https://lore.kernel.org/linux-trace-kernel/2024011442.634192...@goodmis.org/
>
> Cc: Masami Hiramatsu 
> Cc: Mathieu Desnoyers 
> Cc: Linus Torvalds 
> Cc: Julia Lawall 
> Signed-off-by: Steven Rostedt (Google) 

Acked-by: Rafael J. Wysocki  # for thermal
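
To illustrate why the second parameter became redundant, here is a minimal
userspace sketch. It is not the tracing implementation: struct demo_helper,
struct demo_event, demo_string() and demo_assign_str() are hypothetical names
made up for the example. The only point is that once the declaration step
records the source pointer, the assignment step can find it without being
told again.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_STR_MAX 32

/* Hypothetical helper structure: remembers the source string. */
struct demo_helper {
	const char *src;
};

/* Hypothetical event record holding the copied string. */
struct demo_event {
	char name[DEMO_STR_MAX];
};

/* Bounded copy that always NUL-terminates the destination. */
static void demo_copy(char *dst, size_t size, const char *src)
{
	size_t n = strlen(src);

	if (n >= size)
		n = size - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';
}

/* Declaration step: save the source pointer in the helper. */
#define demo_string(helper, source)	((helper)->src = (source))

/* Assignment step: single argument for the field, source comes from helper. */
#define demo_assign_str(event, helper, field) \
	demo_copy((event)->field, sizeof((event)->field), (helper)->src)
```

A caller would run demo_string(&h, "thermal_zone0") once and then
demo_assign_str(&ev, &h, name), never repeating the source string.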

Re: [PATCH 0/2] Deduplicate bin_attribute simple read() callbacks

2024-04-08 Thread Rafael J. Wysocki

On Sat, Apr 6, 2024 at 3:52 PM Lukas Wunner  wrote:
>
> For my upcoming PCI device authentication v2 patches, I have the need
> to expose a simple buffer in virtual memory as a bin_attribute.
>
> It turns out we've duplicated the ->read() callback for such simple
> buffers a fair number of times across the tree.
>
> So instead of reinventing the wheel, I decided to introduce a common
> helper and eliminate all duplications I could find.
>
> I'm open to a bikeshedding discussion on the sysfs_bin_attr_simple_read()
> name. ;)
>
> Lukas Wunner (2):
>   sysfs: Add sysfs_bin_attr_simple_read() helper
>   treewide: Use sysfs_bin_attr_simple_read() helper
>
>  arch/powerpc/platforms/powernv/opal.c           | 10 +---
>  drivers/acpi/bgrt.c                             |  9 +---
>  drivers/firmware/dmi_scan.c                     | 12 ++
>  drivers/firmware/efi/rci2-table.c               | 10 +---
>  drivers/gpu/drm/i915/gvt/firmware.c             | 26 +
>  .../intel/int340x_thermal/int3400_thermal.c     |  9 +---
>  fs/sysfs/file.c                                 | 27 ++
>  include/linux/sysfs.h                           | 15
>  init/initramfs.c                                | 10 +---
>  kernel/module/sysfs.c                           | 13 +--
>  10 files changed, 56 insertions(+), 85 deletions(-)
>
> --

For the series

Acked-by: Rafael J. Wysocki

Re: [PATCH 10/14] suspend: add a arch_resume_nosmt() prototype

2023-05-24 Thread Rafael J. Wysocki

On Wed, May 17, 2023 at 4:52 PM Arnd Bergmann  wrote:
>
> On Wed, May 17, 2023, at 15:48, Rafael J. Wysocki wrote:
> > On Wed, May 17, 2023 at 3:12 PM Arnd Bergmann  wrote:
> >>
> >> From: Arnd Bergmann 
> >>
> >> The arch_resume_nosmt() has a __weak definition, plus an x86
> >> specific override, but no prototype that ensures the two have
> >> the same arguments. This causes a W=1 warning:
> >>
> >> arch/x86/power/hibernate.c:189:5: error: no previous prototype for 'arch_resume_nosmt' [-Werror=missing-prototypes]
> >>
> >> Add the prototype in linux/suspend.h, which is included in
> >> both places.
> >>
> >> Signed-off-by: Arnd Bergmann 
> >
> > Do you want me to pick this up?
>
> Yes, please do. Thanks,

Done, thanks!

Re: [PATCH 10/14] suspend: add a arch_resume_nosmt() prototype

2023-05-17 Thread Rafael J. Wysocki

On Wed, May 17, 2023 at 3:12 PM Arnd Bergmann  wrote:
>
> From: Arnd Bergmann 
>
> The arch_resume_nosmt() has a __weak definition, plus an x86
> specific override, but no prototype that ensures the two have
> the same arguments. This causes a W=1 warning:
>
> arch/x86/power/hibernate.c:189:5: error: no previous prototype for 'arch_resume_nosmt' [-Werror=missing-prototypes]
>
> Add the prototype in linux/suspend.h, which is included in
> both places.
>
> Signed-off-by: Arnd Bergmann 

Do you want me to pick this up?

If not

Acked-by: Rafael J. Wysocki 

> ---
>  include/linux/suspend.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/include/linux/suspend.h b/include/linux/suspend.h
> index f16653f7be32..bc911fecb8e8 100644
> --- a/include/linux/suspend.h
> +++ b/include/linux/suspend.h
> @@ -472,6 +472,8 @@ static inline int hibernate_quiet_exec(int (*func)(void *data), void *data) {
>  }
>  #endif /* CONFIG_HIBERNATION */
>
> +int arch_resume_nosmt(void);
> +
>  #ifdef CONFIG_HIBERNATION_SNAPSHOT_DEV
>  int is_hibernate_resume_dev(dev_t dev);
>  #else
> --
> 2.39.2
>
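
The situation the patch addresses, a weak default definition that an
architecture may override, can be sketched in a single userspace file. This
is only an illustration under assumptions: demo_arch_resume_nosmt() and
demo_resume() are hypothetical names, and the weak attribute is written in
the GCC/Clang spelling rather than the kernel's __weak macro. The shared
prototype is what lets the compiler check that every definition has the same
signature, and it is what silences -Wmissing-prototypes.

```c
/*
 * Shared prototype, as it would appear in a common header that both the
 * generic code and any architecture override include.
 */
int demo_arch_resume_nosmt(void);

/*
 * Generic fallback. Being weak, it is replaced at link time if another
 * object file provides a strong definition with the same name.
 */
__attribute__((weak)) int demo_arch_resume_nosmt(void)
{
	return 0;
}

/* Caller that does not care which definition was linked in. */
static int demo_resume(void)
{
	return demo_arch_resume_nosmt();
}
```

When no override is linked in, demo_resume() simply returns the value of the
weak fallback.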

Re: [PATCH 12/19] thermal: cpuidle_cooling: Adjust includes to remove of_device.h

2023-03-29 Thread Rafael J. Wysocki

On Wed, Mar 29, 2023 at 5:53 PM Rob Herring  wrote:
>
> Now that of_cpu_device_node_get() is defined in of.h, of_device.h is just
> implicitly including other includes, and is no longer needed. Adjust the
> include files with what was implicitly included by of_device.h (cpu.h and
> of.h) and drop including of_device.h.
>
> Signed-off-by: Rob Herring 
> ---
> Please ack and I will take the series via the DT tree.

Acked-by: Rafael J. Wysocki 

> ---
>  drivers/thermal/cpuidle_cooling.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/thermal/cpuidle_cooling.c b/drivers/thermal/cpuidle_cooling.c
> index 4f41102e8b16..6f6daead485e 100644
> --- a/drivers/thermal/cpuidle_cooling.c
> +++ b/drivers/thermal/cpuidle_cooling.c
> @@ -7,12 +7,13 @@
>   */
>  #define pr_fmt(fmt) "cpuidle cooling: " fmt
>
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>
>
> --
> 2.39.2
>

Re: [PATCH 14/19] cpufreq: Adjust includes to remove of_device.h

2023-03-29 Thread Rafael J. Wysocki

On Wed, Mar 29, 2023 at 5:53 PM Rob Herring  wrote:
>
> Now that of_cpu_device_node_get() is defined in of.h, of_device.h is just
> implicitly including other includes, and is no longer needed. Adjust the
> include files with what was implicitly included by of_device.h (cpu.h and
> of.h) and drop including of_device.h.
>
> Signed-off-by: Rob Herring 
> ---
> Please ack and I will take the series via the DT tree.

Acked-by: Rafael J. Wysocki 

> ---
>  drivers/cpufreq/cpufreq-dt-platdev.c | 1 -
>  drivers/cpufreq/kirkwood-cpufreq.c   | 2 +-
>  drivers/cpufreq/maple-cpufreq.c      | 2 +-
>  drivers/cpufreq/pmac32-cpufreq.c     | 2 +-
>  drivers/cpufreq/pmac64-cpufreq.c     | 2 +-
>  drivers/cpufreq/qcom-cpufreq-hw.c    | 4 ++--
>  drivers/cpufreq/spear-cpufreq.c      | 2 +-
>  drivers/cpufreq/tegra124-cpufreq.c   | 1 -
>  drivers/cpufreq/tegra20-cpufreq.c    | 2 +-
>  include/linux/cpufreq.h              | 1 -
>  10 files changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c
> index e85703651098..f9675e1a8529 100644
> --- a/drivers/cpufreq/cpufreq-dt-platdev.c
> +++ b/drivers/cpufreq/cpufreq-dt-platdev.c
> @@ -6,7 +6,6 @@
>
>  #include 
>  #include 
> -#include 
>  #include 
>
>  #include "cpufreq-dt.h"
> diff --git a/drivers/cpufreq/kirkwood-cpufreq.c b/drivers/cpufreq/kirkwood-cpufreq.c
> index 70ad8fe1d78b..95588101efbd 100644
> --- a/drivers/cpufreq/kirkwood-cpufreq.c
> +++ b/drivers/cpufreq/kirkwood-cpufreq.c
> @@ -9,7 +9,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/cpufreq/maple-cpufreq.c b/drivers/cpufreq/maple-cpufreq.c
> index 28d346062166..f9306410a07f 100644
> --- a/drivers/cpufreq/maple-cpufreq.c
> +++ b/drivers/cpufreq/maple-cpufreq.c
> @@ -23,7 +23,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>
>  #define DBG(fmt...) pr_debug(fmt)
>
> diff --git a/drivers/cpufreq/pmac32-cpufreq.c b/drivers/cpufreq/pmac32-cpufreq.c
> index 4b8ee2014da6..a28716d8fc54 100644
> --- a/drivers/cpufreq/pmac32-cpufreq.c
> +++ b/drivers/cpufreq/pmac32-cpufreq.c
> @@ -23,7 +23,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>
>  #include 
>  #include 
> diff --git a/drivers/cpufreq/pmac64-cpufreq.c b/drivers/cpufreq/pmac64-cpufreq.c
> index ba9c31d98bd6..2cd2b06849a2 100644
> --- a/drivers/cpufreq/pmac64-cpufreq.c
> +++ b/drivers/cpufreq/pmac64-cpufreq.c
> @@ -21,7 +21,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>
>  #include 
>  #include 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 2f581d2d617d..df165a078d14 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -11,8 +11,8 @@
>  #include 
>  #include 
>  #include 
> -#include 
> -#include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/cpufreq/spear-cpufreq.c b/drivers/cpufreq/spear-cpufreq.c
> index c6fdf019dbde..78b875db6b66 100644
> --- a/drivers/cpufreq/spear-cpufreq.c
> +++ b/drivers/cpufreq/spear-cpufreq.c
> @@ -18,7 +18,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/cpufreq/tegra124-cpufreq.c b/drivers/cpufreq/tegra124-cpufreq.c
> index 7a1ea6fdcab6..312ca5ddc6c4 100644
> --- a/drivers/cpufreq/tegra124-cpufreq.c
> +++ b/drivers/cpufreq/tegra124-cpufreq.c
> @@ -11,7 +11,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/cpufreq/tegra20-cpufreq.c b/drivers/cpufreq/tegra20-cpufreq.c
> index ab7ac7df9e62..5d1f5f87e46d 100644
> --- a/drivers/cpufreq/tegra20-cpufreq.c
> +++ b/drivers/cpufreq/tegra20-cpufreq.c
> @@ -12,7 +12,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 65623233ab2f..3ac4a10d4651 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -15,7 +15,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
>
> --
> 2.39.2
>

Re: [PATCH 16/19] cpuidle: Adjust includes to remove of_device.h

2023-03-29 Thread Rafael J. Wysocki

On Wed, Mar 29, 2023 at 5:52 PM Rob Herring  wrote:
>
> Now that of_cpu_device_node_get() is defined in of.h, of_device.h is just
> implicitly including other includes, and is no longer needed. Adjust the
> include files with what was implicitly included by of_device.h (cpu.h,
> cpuhotplug.h, of.h, and of_platform.h) and drop including of_device.h.
>
> Signed-off-by: Rob Herring 
> ---
> Please ack and I will take the series via the DT tree.

Acked-by: Rafael J. Wysocki 

> ---
>  drivers/cpuidle/cpuidle-psci.c  | 1 -
>  drivers/cpuidle/cpuidle-qcom-spm.c  | 3 +--
>  drivers/cpuidle/cpuidle-riscv-sbi.c | 2 +-
>  drivers/cpuidle/dt_idle_states.c    | 1 -
>  4 files changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> index 6de027f9f6f5..bf68920d038a 100644
> --- a/drivers/cpuidle/cpuidle-psci.c
> +++ b/drivers/cpuidle/cpuidle-psci.c
> @@ -16,7 +16,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/cpuidle/cpuidle-qcom-spm.c b/drivers/cpuidle/cpuidle-qcom-spm.c
> index c6e2e91bb4c3..1fc9968eae19 100644
> --- a/drivers/cpuidle/cpuidle-qcom-spm.c
> +++ b/drivers/cpuidle/cpuidle-qcom-spm.c
> @@ -11,8 +11,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> index be383f4b6855..ae0b838a0634 100644
> --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> @@ -8,6 +8,7 @@
>
>  #define pr_fmt(fmt) "cpuidle-riscv-sbi: " fmt
>
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -15,7 +16,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/cpuidle/dt_idle_states.c b/drivers/cpuidle/dt_idle_states.c
> index 02aa0b39af9d..12fec92a85fd 100644
> --- a/drivers/cpuidle/dt_idle_states.c
> +++ b/drivers/cpuidle/dt_idle_states.c
> @@ -14,7 +14,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>
>  #include "dt_idle_states.h"
>
>
> --
> 2.39.2
>

Re: [PATCH] cpufreq: pmac32: Use of_property_read_bool() for boolean properties

2023-03-27 Thread Rafael J. Wysocki

On Mon, Mar 13, 2023 at 5:26 AM Viresh Kumar  wrote:
>
> On 10-03-23, 08:47, Rob Herring wrote:
> > It is preferred to use typed property access functions (i.e.
> > of_property_read_ functions) rather than low-level
> > of_get_property/of_find_property functions for reading properties.
> > Convert reading boolean properties to of_property_read_bool().
> >
> > Signed-off-by: Rob Herring 
> > ---
> >  drivers/cpufreq/pmac32-cpufreq.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
>
> Acked-by: Viresh Kumar 

Applied as 6.4 material, thanks!
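
The difference between a low-level lookup and a typed boolean accessor can be
sketched with a toy property table. This is a userspace illustration only:
struct demo_prop, struct demo_node, demo_find_property() and
demo_property_read_bool() are hypothetical names, not the devicetree API. The
sketch assumes the convention that a boolean property is true when it is
present and false when it is absent.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical property: a name plus an optional value blob. */
struct demo_prop {
	const char *name;
	const void *value;
	int length;
};

/* Hypothetical node: a flat array of properties. */
struct demo_node {
	const struct demo_prop *props;
	size_t nprops;
};

/* Low-level lookup: the caller receives a raw pointer and must interpret it. */
static const struct demo_prop *demo_find_property(const struct demo_node *np,
						  const char *name)
{
	size_t i;

	for (i = 0; i < np->nprops; i++)
		if (strcmp(np->props[i].name, name) == 0)
			return &np->props[i];
	return NULL;
}

/* Typed accessor: states the intent (a presence check) at the call site. */
static bool demo_property_read_bool(const struct demo_node *np,
				    const char *name)
{
	return demo_find_property(np, name) != NULL;
}
```

Call sites that previously compared a raw lookup result against NULL read
more directly as if (demo_property_read_bool(np, "demo,flag")).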

Re: [PATCH v8 2/3] freezer: refactor pm_freezing into a function.

2022-12-02 Thread Rafael J. Wysocki

On Thu, Dec 1, 2022 at 12:08 PM Ricardo Ribalda  wrote:
>
> Add a way to let the drivers know if the processes are frozen.
>
> This is needed by drivers that are waiting for processes to end on their
> shutdown path.
>
> Convert pm_freezing into a function and export it, so it can be used by
> drivers that are either built-in or modules.
>
> Cc: sta...@vger.kernel.org
> Fixes: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
> Signed-off-by: Ricardo Ribalda 

Why can't you export the original pm_freezing variable and why is this
fixing anything?

> ---
>  include/linux/freezer.h |  3 ++-
>  kernel/freezer.c        |  3 +--
>  kernel/power/process.c  | 24 
>  3 files changed, 23 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> index b303472255be..3413c869d68b 100644
> --- a/include/linux/freezer.h
> +++ b/include/linux/freezer.h
> @@ -13,7 +13,7 @@
>  #ifdef CONFIG_FREEZER
>  DECLARE_STATIC_KEY_FALSE(freezer_active);
>
> -extern bool pm_freezing;   /* PM freezing in effect */
> +bool pm_freezing(void);
>  extern bool pm_nosig_freezing; /* PM nosig freezing in effect */
>
>  /*
> @@ -80,6 +80,7 @@ static inline int freeze_processes(void) { return -ENOSYS; }
>  static inline int freeze_kernel_threads(void) { return -ENOSYS; }
>  static inline void thaw_processes(void) {}
>  static inline void thaw_kernel_threads(void) {}
> +static inline bool pm_freezing(void) { return false; }
>
>  static inline bool try_to_freeze(void) { return false; }
>
> diff --git a/kernel/freezer.c b/kernel/freezer.c
> index 4fad0e6fca64..2d3530ebdb7e 100644
> --- a/kernel/freezer.c
> +++ b/kernel/freezer.c
> @@ -20,7 +20,6 @@ EXPORT_SYMBOL(freezer_active);
>   * indicate whether PM freezing is in effect, protected by
>   * system_transition_mutex
>   */
> -bool pm_freezing;
>  bool pm_nosig_freezing;
>
>  /* protects freezing and frozen transitions */
> @@ -46,7 +45,7 @@ bool freezing_slow_path(struct task_struct *p)
> if (pm_nosig_freezing || cgroup_freezing(p))
> return true;
>
> -   if (pm_freezing && !(p->flags & PF_KTHREAD))
> +   if (pm_freezing() && !(p->flags & PF_KTHREAD))
> return true;
>
> return false;
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index ddd9988327fe..8a4d0e2c8c20 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -108,6 +108,22 @@ static int try_to_freeze_tasks(bool user_only)
> return todo ? -EBUSY : 0;
>  }
>
> +/*
> + * Indicate whether PM freezing is in effect, protected by
> + * system_transition_mutex.
> + */
> +static bool pm_freezing_internal;
> +
> +/**
> + * pm_freezing - indicate whether PM freezing is in effect.
> + *
> + */
> +bool pm_freezing(void)
> +{
> +   return pm_freezing_internal;
> +}
> +EXPORT_SYMBOL(pm_freezing);

Use EXPORT_SYMBOL_GPL() instead, please.

> +
>  /**
>   * freeze_processes - Signal user space processes to enter the refrigerator.
>   * The current thread will not be frozen.  The same process that calls
> @@ -126,12 +142,12 @@ int freeze_processes(void)
> /* Make sure this task doesn't get frozen */
> current->flags |= PF_SUSPEND_TASK;
>
> -   if (!pm_freezing)
> +   if (!pm_freezing())
> static_branch_inc(&freezer_active);
>
> pm_wakeup_clear(0);
> pr_info("Freezing user space processes ... ");
> -   pm_freezing = true;
> +   pm_freezing_internal = true;
> error = try_to_freeze_tasks(true);
> if (!error) {
> __usermodehelper_set_disable_depth(UMH_DISABLED);
> @@ -187,9 +203,9 @@ void thaw_processes(void)
> struct task_struct *curr = current;
>
> trace_suspend_resume(TPS("thaw_processes"), 0, true);
> -   if (pm_freezing)
> +   if (pm_freezing())
> static_branch_dec(&freezer_active);
> -   pm_freezing = false;
> +   pm_freezing_internal = false;
> pm_nosig_freezing = false;
>
> oom_killer_enable();
>
> --
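
The two options discussed above, exporting the variable itself versus
exporting an accessor, can be compared with a small userspace sketch. All
names here (demo_freezing_internal, demo_pm_freezing(), demo_freeze_begin(),
demo_thaw()) are hypothetical. The sketch only shows the accessor shape used
by the patch: the flag stays file-local, so outside code can read the state
but cannot write it. Whether that is the motivation for the patch is not
stated in the thread.

```c
#include <stdbool.h>

/* File-local state: only this file can modify it. */
static bool demo_freezing_internal;

/* Public read-only accessor, the only symbol other code would see. */
bool demo_pm_freezing(void);

bool demo_pm_freezing(void)
{
	return demo_freezing_internal;
}

/* Writers stay private to the file that owns the state. */
static void demo_freeze_begin(void)
{
	demo_freezing_internal = true;
}

static void demo_thaw(void)
{
	demo_freezing_internal = false;
}
```

Exporting the variable directly would be shorter, at the cost of letting
every user of the symbol assign to it.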

Re: [PATCH v2 00/44] cpuidle,rcu: Clean up the mess

2022-09-19 Thread Rafael J. Wysocki

On Mon, Sep 19, 2022 at 12:17 PM Peter Zijlstra  wrote:
>
> Hi All!
>
> At long last, a respin of the cpuidle vs rcu cleanup patches.
>
> v1: https://lkml.kernel.org/r/20220608142723.103523...@infradead.org
>
> These here patches clean up the mess that is cpuidle vs rcuidle.
>
> At the end of the ride there's only one RCU_NONIDLE user left:
>
>   arch/arm64/kernel/suspend.c:RCU_NONIDLE(__cpu_suspend_exit());
>
> and 'one' trace_*_rcuidle() user:
>
>   kernel/trace/trace_preemptirq.c: trace_irq_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
>   kernel/trace/trace_preemptirq.c: trace_irq_disable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
>   kernel/trace/trace_preemptirq.c: trace_irq_enable_rcuidle(CALLER_ADDR0, caller_addr);
>   kernel/trace/trace_preemptirq.c: trace_irq_disable_rcuidle(CALLER_ADDR0, caller_addr);
>   kernel/trace/trace_preemptirq.c: trace_preempt_enable_rcuidle(a0, a1);
>   kernel/trace/trace_preemptirq.c: trace_preempt_disable_rcuidle(a0, a1);
>
> However this last is all in deprecated code that should be unused for
> GENERIC_ENTRY.
>
> I've touched a lot of code that I can't test and I might've broken something
> by accident. In particular the whole ARM cpuidle stuff was quite involved.
>
> Please all; have a look where you haven't already.
>
>
> New since v1:
>
>  - rebase on top of Frederic's rcu-context-tracking rename fest
>  - more omap goodness as per the last discussion (thanks Tony!)
>  - removed one more RCU_NONIDLE() from arm64/risc-v perf code
>  - ubsan/kasan fixes
>  - intel_idle module-param for testing
>  - a bunch of extra __always_inline, because compilers are silly.

Acked-by: Rafael J. Wysocki 

for the whole set and let me know if you want me to merge any of these
through cpuidle.

Thanks!

>
> ---
>  arch/alpha/kernel/process.c   |  1 -
>  arch/alpha/kernel/vmlinux.lds.S   |  1 -
>  arch/arc/kernel/process.c |  3 ++
>  arch/arc/kernel/vmlinux.lds.S |  1 -
>  arch/arm/include/asm/vmlinux.lds.h|  1 -
>  arch/arm/kernel/process.c |  1 -
>  arch/arm/kernel/smp.c |  6 +--
>  arch/arm/mach-gemini/board-dt.c   |  3 +-
>  arch/arm/mach-imx/cpuidle-imx6q.c |  4 +-
>  arch/arm/mach-imx/cpuidle-imx6sx.c|  5 ++-
>  arch/arm/mach-omap2/common.h  |  6 ++-
>  arch/arm/mach-omap2/cpuidle34xx.c | 16 +++-
>  arch/arm/mach-omap2/cpuidle44xx.c | 29 +++---
>  arch/arm/mach-omap2/omap-mpuss-lowpower.c | 12 +-
>  arch/arm/mach-omap2/pm.h  |  2 +-
>  arch/arm/mach-omap2/pm24xx.c  | 51 +---
>  arch/arm/mach-omap2/pm34xx.c  | 14 +--
>  arch/arm/mach-omap2/pm44xx.c  |  2 +-
>  arch/arm/mach-omap2/powerdomain.c | 10 ++---
>  arch/arm64/kernel/idle.c  |  1 -
>  arch/arm64/kernel/smp.c   |  4 +-
>  arch/arm64/kernel/vmlinux.lds.S   |  1 -
>  arch/csky/kernel/process.c|  1 -
>  arch/csky/kernel/smp.c|  2 +-
>  arch/csky/kernel/vmlinux.lds.S|  1 -
>  arch/hexagon/kernel/process.c |  1 -
>  arch/hexagon/kernel/vmlinux.lds.S |  1 -
>  arch/ia64/kernel/process.c|  1 +
>  arch/ia64/kernel/vmlinux.lds.S|  1 -
>  arch/loongarch/kernel/idle.c  |  1 +
>  arch/loongarch/kernel/vmlinux.lds.S   |  1 -
>  arch/m68k/kernel/vmlinux-nommu.lds|  1 -
>  arch/m68k/kernel/vmlinux-std.lds  |  1 -
>  arch/m68k/kernel/vmlinux-sun3.lds |  1 -
>  arch/microblaze/kernel/process.c  |  1 -
>  arch/microblaze/kernel/vmlinux.lds.S  |  1 -
>  arch/mips/kernel/idle.c   |  8 ++--
>  arch/mips/kernel/vmlinux.lds.S|  1 -
>  arch/nios2/kernel/process.c   |  1 -
>  arch/nios2/kernel/vmlinux.lds.S   |  1 -
>  arch/openrisc/kernel/process.c|  1 +
>  arch/openrisc/kernel/vmlinux.lds.S|  1 -
>  arch/parisc/kernel/process.c  |  2 -
>  arch/parisc/kernel/vmlinux.lds.S  |  1 -
>  arch/powerpc/kernel/idle.c|  5 +--
>  arch/powerpc/kernel/vmlinux.lds.S |  1 -
>  arch/riscv/kernel/process.c   |  1 -
>  arch/riscv/kernel/vmlinux-xip.lds.S   |  1 -
>  arch/riscv/kernel/vmlinux.lds.S   |  1 -
>  arch/s390/kernel/idle.c   |  1 -
>  arch/s390/kernel/vmlinux.lds.S|  1 -
>  arch/sh/kernel/idle.c

Re: [PATCH] cpuidle: move from strlcpy with unused retval to strscpy

2022-08-31 Thread Rafael J. Wysocki

On Thu, Aug 18, 2022 at 11:00 PM Wolfram Sang wrote:
>
> Follow the advice of the below link and prefer 'strscpy' in this
> subsystem. Conversion is 1:1 because the return value is not used.
> Generated by a coccinelle script.
>
> Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=v6a6g1ouzcprm...@mail.gmail.com/
> Signed-off-by: Wolfram Sang 
> ---
>  drivers/cpuidle/cpuidle-powernv.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
> index c32c600b3cf8..0b5461b3d7dd 100644
> --- a/drivers/cpuidle/cpuidle-powernv.c
> +++ b/drivers/cpuidle/cpuidle-powernv.c
> @@ -233,8 +233,8 @@ static inline void add_powernv_state(int index, const char *name,
>  unsigned int exit_latency,
>  u64 psscr_val, u64 psscr_mask)
>  {
> -   strlcpy(powernv_states[index].name, name, CPUIDLE_NAME_LEN);
> -   strlcpy(powernv_states[index].desc, name, CPUIDLE_NAME_LEN);
> +   strscpy(powernv_states[index].name, name, CPUIDLE_NAME_LEN);
> +   strscpy(powernv_states[index].desc, name, CPUIDLE_NAME_LEN);
> powernv_states[index].flags = flags;
> powernv_states[index].target_residency = target_residency;
> powernv_states[index].exit_latency = exit_latency;
> --

Applied as 6.1 material, thanks!
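
For readers unfamiliar with the difference between the two calls, here is a
userspace sketch of the return-value convention that strscpy follows.
demo_strscpy() is a hypothetical stand-in written for this example, not the
kernel implementation. It copies at most count - 1 bytes, always
NUL-terminates when count is non-zero, and reports truncation with -E2BIG.
strlcpy instead returns the length of the source string, which requires
reading the source up to its terminator.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/*
 * Copy src into dst, writing at most count bytes including the NUL.
 * Returns the number of characters copied (without the NUL), or -E2BIG
 * if count is zero or the source did not fit.
 */
static ssize_t demo_strscpy(char *dst, const char *src, size_t count)
{
	size_t i;

	if (count == 0)
		return -E2BIG;

	/* Stop at the end of the source or one byte before the end of dst. */
	for (i = 0; i < count - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* If the source has more characters left, the copy was truncated. */
	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

Because the callers in the patch above ignore the return value, the
conversion there does not change behaviour for them.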

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-30 Thread Rafael J. Wysocki

On Sat, Jul 30, 2022 at 11:48 AM Michel Lespinasse wrote:
>
> On Fri, Jul 29, 2022 at 04:59:50PM +0200, Rafael J. Wysocki wrote:
> > On Fri, Jul 29, 2022 at 12:25 PM Michel Lespinasse wrote:
> > >
> > > On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > > > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > > > Xeons") wrecked intel_idle in two ways:
> > > > > >
> > > > > >  - must not have tracing in idle functions
> > > > > >  - must return with IRQs disabled
> > > > > >
> > > > > > Additionally, it added a branch for no good reason.
> > > > > >
> > > > > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > > > > Signed-off-by: Peter Zijlstra (Intel) 
> > > > >
> > > > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > > > usage" when booting a kernel with debug options compiled in. Please
> > > > > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > > > > and is still present in v5.19-rc8.
> > > > >
> > > > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> > > >
> > > > I finally got a chance to take a quick look at this.
> > > >
> > > > The rcu_eqs_exit() function is making a lockdep complaint about
> > > > being invoked with interrupts enabled.  This function is called from
> > > > > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > > > via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
> > > > interrupts before invoking rcu_eqs_exit().
> > > >
> > > > The only other call to rcu_idle_exit() does not disable interrupts,
> > > > but it is via rcu_user_exit(), which would be a very odd choice for
> > > > cpuidle_enter_state().
> > > >
> > > > > It seems unlikely, but it might be that it is the use of local_irq_save()
> > > > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > > > the trouble.  If this is the case, then the commit shown below would
> > > > help.  Note that this commit removes the warning from lockdep, so it
> > > > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > > > equivalent debugging.
> > > >
> > > > > Could you please try your test with the -rcu commit shown below applied?
> > >
> > > Thanks for looking into it.
> > >
> > > After checking out Peter's commit 32d4fd5751ea,
> > > cherry picking your commit ed4ae5eff4b3,
> > > > and setting CONFIG_RCU_EQS_DEBUG=y in addition to my usual debug config,
> > > I am now seeing this a few seconds into the boot:
> > >
> > > [3.010650] [ cut here ]
> > > > [3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 sched_clock_tick+0x27/0x60
> > > > [3.010657] Modules linked in:
> > > > [3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc1-test-5-g1be22fea0611 #1
> > > > [3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
> > > > [3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
> > > > [3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00 89 c0 48 03 1c c5 c0 98
> > > > [3.010667] RSP: :b2803e28 EFLAGS: 00010002
> > > > [3.010670] RAX: 0001 RBX: c8ce7fa07060 RCX: 0001
> > > > [3.010671] RDX:  RSI: b268dd21 RDI: b269ab13
> > > > [3.010673] RBP: 0001 R08: ffc300d5 R09: 0002be80
> > > > [3.010674] R10: 03625b53183a R11: a012b802b7a4 R12: b2aa9e80
> > > > [3.010675] R13: b2aa9e00 R14: 0001 R15:
> > >

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-29 Thread Rafael J. Wysocki

On Fri, Jul 29, 2022 at 12:25 PM Michel Lespinasse wrote:
>
> On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > Xeons") wrecked intel_idle in two ways:
> > > >
> > > >  - must not have tracing in idle functions
> > > >  - must return with IRQs disabled
> > > >
> > > > Additionally, it added a branch for no good reason.
> > > >
> > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > > Signed-off-by: Peter Zijlstra (Intel) 
> > >
> > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > usage" when booting a kernel with debug options compiled in. Please
> > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > and is still present in v5.19-rc8.
> > >
> > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> >
> > I finally got a chance to take a quick look at this.
> >
> > The rcu_eqs_exit() function is making a lockdep complaint about
> > being invoked with interrupts enabled.  This function is called from
> > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
> > interrupts before invoking rcu_eqs_exit().
> >
> > The only other call to rcu_idle_exit() does not disable interrupts,
> > but it is via rcu_user_exit(), which would be a very odd choice for
> > cpuidle_enter_state().
> >
> > It seems unlikely, but it might be that it is the use of local_irq_save()
> > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > the trouble.  If this is the case, then the commit shown below would
> > help.  Note that this commit removes the warning from lockdep, so it
> > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > equivalent debugging.
> >
> > Could you please try your test with the -rcu commit shown below applied?
>
> Thanks for looking into it.
>
> After checking out Peter's commit 32d4fd5751ea,
> cherry picking your commit ed4ae5eff4b3,
> and setting CONFIG_RCU_EQS_DEBUG=y in addition to my usual debug config,
> I am now seeing this a few seconds into the boot:
>
> [3.010650] [ cut here ]
> [3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 sched_clock_tick+0x27/0x60
> [3.010657] Modules linked in:
> [3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc1-test-5-g1be22fea0611 #1
> [3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
> [3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
> [3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00 89 c0 48 03 1c c5 c0 98
> [3.010667] RSP: :b2803e28 EFLAGS: 00010002
> [3.010670] RAX: 0001 RBX: c8ce7fa07060 RCX: 0001
> [3.010671] RDX:  RSI: b268dd21 RDI: b269ab13
> [3.010673] RBP: 0001 R08: ffc300d5 R09: 0002be80
> [3.010674] R10: 03625b53183a R11: a012b802b7a4 R12: b2aa9e80
> [3.010675] R13: b2aa9e00 R14: 0001 R15: 
> [3.010677] FS:  () GS:a012b800() knlGS:
> [3.010678] CS:  0010 DS:  ES:  CR0: 80050033
> [3.010680] CR2: a012f81ff000 CR3: 000c99612001 CR4: 003706f0
> [3.010681] DR0:  DR1:  DR2: 
> [3.010682] DR3:  DR6: fffe0ff0 DR7: 0400
> [3.010683] Call Trace:
> [3.010685]  
> [3.010688]  cpuidle_enter_state+0xb7/0x4b0
> [3.010694]  cpuidle_enter+0x29/0x40
> [3.010697]  do_idle+0x1d4/0x210
> [3.010702]  cpu_startup_entry+0x19/0x20
> [3.010704]  rest_init+0x117/0x1a0
> [3.010708]  arch_call_rest_init+0xa/0x10
> [3.010711]  start_kernel+0x6d8/0x6ff
> [3.010716]  secondary_startup_64_no_verify+0xce/0xdb
> [3.010728]  
> [3.010729] irq event stamp: 44179
> [3.010730] hardirqs last  enabled at (44179): [] asm_sysvec_apic_timer_interrupt+0x1b/0x20
> [3.010734] hardirqs last disabled at (44177): [] __do_softirq+0x3f0/0x498
> [3.010736] softirqs last  enabled at (44178): [] __do_softirq+0x332/0x498
> [3.010738] softirqs last disabled at (44171): [] irq_exit_rcu+0xab/0xf0
> [3.010741] ---[ end trace  ]---

Can you please give this patch a go:
https://patchwork.kernel.org/project/linux-pm/patch/Yt/axpfi88new...@e126311.manchester.arm.com/
?

Re: [PATCH v4 1/3] PCI: Remove pci_get_legacy_ide_irq and asm-generic/pci.h

2022-07-21 Thread Rafael J. Wysocki

On Wed, Jul 20, 2022 at 3:20 PM Stafford Horne  wrote:
>
> The definition of the pci header function pci_get_legacy_ide_irq is only
> used in platforms that support PNP.  So many of the architectures where
> it is defined do not use it.  This also means we can remove
> asm-generic/pci.h as all it provides is a definition of
> pci_get_legacy_ide_irq.
>
> Where referenced, replace the usage of pci_get_legacy_ide_irq with the
> libata.h macros ATA_PRIMARY_IRQ and ATA_SECONDARY_IRQ which provide the
> same functionality.  This allows removing pci_get_legacy_ide_irq from
> headers where it is no longer used.
>
> Acked-by: Geert Uytterhoeven 
> Acked-by: Pierre Morel 
> Co-developed-by: Arnd Bergmann 
> Signed-off-by: Stafford Horne 

Acked-by: Rafael J. Wysocki 
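
For illustration only, the substitution described in the changelog can be sketched in user space as below. This is a sketch under assumptions, not the code from the patch: the `ATA_PRIMARY_IRQ`/`ATA_SECONDARY_IRQ` definitions are stand-ins assumed to match the generic libata values (14 and 15), `struct pci_dev` is an empty stub, and `legacy_ide_irq()` is a hypothetical name chosen for the example.

```c
/* Stub type: stands in for the kernel's struct pci_dev in this sketch. */
struct pci_dev {
	int unused;
};

/* Assumed to mirror the generic libata values; not copied from the patch. */
#define ATA_PRIMARY_IRQ(dev)	14
#define ATA_SECONDARY_IRQ(dev)	15

/* The helper being removed, in the form shown in the alpha and arm hunks. */
static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
{
	(void)dev;
	return channel ? 15 : 14;
}

/* Hypothetical replacement lookup that uses the libata macros instead. */
static inline int legacy_ide_irq(struct pci_dev *dev, int channel)
{
	(void)dev;
	return channel ? ATA_SECONDARY_IRQ(dev) : ATA_PRIMARY_IRQ(dev);
}
```

With these stand-in values both helpers return the same IRQ for channel 0 and channel 1, which is why the per-architecture copies that only returned 14 or 15 can go away. Architectures that returned something else (for example -ENODEV on arm64) are not modeled here.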

> ---
>
> Since v3:
>  - Further remove the definitions of pci_get_legacy_ide_irq from x86 and use
>    the libata macros.
>  - Add Acked-bys.
>
>  arch/alpha/include/asm/pci.h   |  6 --
>  arch/arm/include/asm/pci.h |  5 -
>  arch/arm64/include/asm/pci.h   |  6 --
>  arch/ia64/include/asm/pci.h|  6 --
>  arch/m68k/include/asm/pci.h|  2 --
>  arch/mips/include/asm/pci.h|  6 --
>  arch/parisc/include/asm/pci.h  |  5 -
>  arch/powerpc/include/asm/pci.h |  1 -
>  arch/s390/include/asm/pci.h|  1 -
>  arch/sh/include/asm/pci.h  |  6 --
>  arch/sparc/include/asm/pci.h   |  9 -
>  arch/x86/include/asm/pci.h |  3 ---
>  arch/xtensa/include/asm/pci.h  |  3 ---
>  drivers/pnp/resource.c |  5 +++--
>  include/asm-generic/pci.h  | 17 -
>  15 files changed, 3 insertions(+), 78 deletions(-)
>  delete mode 100644 include/asm-generic/pci.h
>
> diff --git a/arch/alpha/include/asm/pci.h b/arch/alpha/include/asm/pci.h
> index cf6bc1e64d66..6312656279d7 100644
> --- a/arch/alpha/include/asm/pci.h
> +++ b/arch/alpha/include/asm/pci.h
> @@ -56,12 +56,6 @@ struct pci_controller {
>
>  /* IOMMU controls.  */
>
> -/* TODO: integrate with include/asm-generic/pci.h ? */
> -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> -{
> -   return channel ? 15 : 14;
> -}
> -
>  #define pci_domain_nr(bus) ((struct pci_controller *)(bus)->sysdata)->index
>
>  static inline int pci_proc_domain(struct pci_bus *bus)
> diff --git a/arch/arm/include/asm/pci.h b/arch/arm/include/asm/pci.h
> index 68e6f25784a4..5916b88d4c94 100644
> --- a/arch/arm/include/asm/pci.h
> +++ b/arch/arm/include/asm/pci.h
> @@ -22,11 +22,6 @@ static inline int pci_proc_domain(struct pci_bus *bus)
>  #define HAVE_PCI_MMAP
>  #define ARCH_GENERIC_PCI_MMAP_RESOURCE
>
> -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> -{
> -   return channel ? 15 : 14;
> -}
> -
>  extern void pcibios_report_status(unsigned int status_mask, int warn);
>
>  #endif /* __KERNEL__ */
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index b33ca260e3c9..0aebc3488c32 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -23,12 +23,6 @@
>  extern int isa_dma_bridge_buggy;
>
>  #ifdef CONFIG_PCI
> -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> -{
> -   /* no legacy IRQ on arm64 */
> -   return -ENODEV;
> -}
> -
>  static inline int pci_proc_domain(struct pci_bus *bus)
>  {
> return 1;
> diff --git a/arch/ia64/include/asm/pci.h b/arch/ia64/include/asm/pci.h
> index 8c163d1d0189..fa8f545c24c9 100644
> --- a/arch/ia64/include/asm/pci.h
> +++ b/arch/ia64/include/asm/pci.h
> @@ -63,10 +63,4 @@ static inline int pci_proc_domain(struct pci_bus *bus)
> return (pci_domain_nr(bus) != 0);
>  }
>
> -#define HAVE_ARCH_PCI_GET_LEGACY_IDE_IRQ
> -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> -{
> -   return channel ? isa_irq_to_vector(15) : isa_irq_to_vector(14);
> -}
> -
>  #endif /* _ASM_IA64_PCI_H */
> diff --git a/arch/m68k/include/asm/pci.h b/arch/m68k/include/asm/pci.h
> index 5a4bc223743b..ccdfa0dc8413 100644
> --- a/arch/m68k/include/asm/pci.h
> +++ b/arch/m68k/include/asm/pci.h
> @@ -2,8 +2,6 @@
>  #ifndef _ASM_M68K_PCI_H
>  #define _ASM_M68K_PCI_H
>
> -#include 
> -
>  #define pcibios_assign_all_busses() 1
>
>  #define PCIBIOS_MIN_IO  0x0100
> diff --git a/arch/mips/include/asm/pci.h b/arch/mips/include/asm/pci.h
> index 9ffc8192adae..3fd6e22c108b 100644
> --- a/arch/mips/include/asm/pci.h
> +++ b/arch/mips/include/asm/pci.h
> @@ -139,10 +139,4 @@ static inline int pci_proc_domain(struct pci_bus *bus)
>  /* Do platform specific device initialization

Re: [PATCH 31/36] cpuidle,acpi: Make noinstr clean

2022-07-06 Thread Rafael J. Wysocki

ieu.desnoy...@efficios.com>, Frederic Weisbecker , Len 
Brown , linux-xte...@linux-xtensa.org, Sascha Hauer 
, Vasily Gorbik , linux-arm-msm 
, linux-al...@vger.kernel.org, linux-m68k 
, Stafford Horne , Linux ARM 
, Chris Zankel , 
Stephen Boyd , dingu...@kernel.org, Daniel Bristot de 
Oliveira , Alexander Shishkin 
, Lorenzo Pieralisi 
, Rasmus Villemoes , Joel 
Fernandes , Will Deacon , Boris 
Ostrovsky , Kevin Hilman , 
linux-c...@vger.kernel.org, pv-driv...@vmware.com, 
linux-snps-...@lists.infradead.org, Mel Gorman , Jacob Pan 
, Arnd Bergmann , ulli.kr...@googlemail.com, 
vgu...@kernel.org, linux-clk , Josh Triplett 
, Steven Rostedt , 
r...@vger.kernel.org, Borislav Petkov , bc...@quicinc.com, 
Thomas Bogendoerfer , Parisc List 
, Sudeep Holla , Shawn Guo 
, David Miller , Rich Felker 
, Tony Lindgren , amakha...@vmware.com, 
Bjorn Andersson , "H. Peter Anvin" 
, sparcli...@vger.kernel.org, linux-hexa...@vger.kernel.org, 
linux-riscv , anton.iva...@cambridgegreys.com, 
jo...@southpole.se, Yury Norov , Richard Weinberger 
, the arch/x86 maintainers , Russell King - 
ARM Linux , Ingo Molnar , Albert Ou
 , "Paul E. McKenney" <paul...@kernel.org>, Heiko Carstens , 
stefan.kristians...@saunalahti.fi, openr...@lists.librecores.org, Paul Walmsley 
, linux-tegra , 
namhy...@kernel.org, Andy Shevchenko , 
jpoim...@kernel.org, Juergen Gross , Michal Simek 
, "open list:BROADCOM NVRAM DRIVER" 
, Palmer Dabbelt , Anup Patel 
, i...@jurassic.park.msu.ru, Johannes Berg 
, linuxppc-dev 
Errors-To: linuxppc-dev-bounces+archive=mail-archive@lists.ozlabs.org
Sender: "Linuxppc-dev" 

On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra  wrote:
>
> vmlinux.o: warning: objtool: io_idle+0xc: call to __inb.isra.0() leaves .noinstr.text section
> vmlinux.o: warning: objtool: acpi_idle_enter+0xfe: call to num_online_cpus() leaves .noinstr.text section
> vmlinux.o: warning: objtool: acpi_idle_enter+0x115: call to acpi_idle_fallback_to_c1.isra.0() leaves .noinstr.text section
>
> Signed-off-by: Peter Zijlstra (Intel) 

Acked-by: Rafael J. Wysocki 

> ---
>  arch/x86/include/asm/shared/io.h |4 ++--
>  drivers/acpi/processor_idle.c|2 +-
>  include/linux/cpumask.h  |4 ++--
>  3 files changed, 5 insertions(+), 5 deletions(-)
>
> --- a/arch/x86/include/asm/shared/io.h
> +++ b/arch/x86/include/asm/shared/io.h
> @@ -5,13 +5,13 @@
>  #include 
>
>  #define BUILDIO(bwl, bw, type) \
> -static inline void __out##bwl(type value, u16 port)\
> +static __always_inline void __out##bwl(type value, u16 port)   \
>  {  \
> asm volatile("out" #bwl " %" #bw "0, %w1"   \
>  : : "a"(value), "Nd"(port));   \
>  }  \
> \
> -static inline type __in##bwl(u16 port) \
> +static __always_inline type __in##bwl(u16 port)   \
>  {  \
> type value; \
> asm volatile("in" #bwl " %w1, %" #bw "0"\
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -593,7 +593,7 @@ static int acpi_idle_play_dead(struct cp
> return 0;
>  }
>
> -static bool acpi_idle_fallback_to_c1(struct acpi_processor *pr)
> +static __always_inline bool acpi_idle_fallback_to_c1(struct acpi_processor *pr)
>  {
> return IS_ENABLED(CONFIG_HOTPLUG_CPU) && !pr->flags.has_cst &&
> !(acpi_gbl_FADT.flags & ACPI_FADT_C2_MP_SUPPORTED);
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -908,9 +908,9 @@ static inline const struct cpumask *get_
>   * concurrent CPU hotplug operations unless invoked from a cpuhp_lock held
>   * region.
>   */
> -static inline unsigned int num_online_cpus(void)
> +static __always_inline unsigned int num_online_cpus(void)
>  {
> -   return atomic_read(&__num_online_cpus);
> +   return arch_atomic_read(&__num_online_cpus);
>  }
>  #define num_possible_cpus()    cpumask_weight(cpu_possible_mask)
>  #define num_present_cpus() cpumask_weight(cpu_present_mask)
>
>

Re: [PATCH 20/36] arch/idle: Change arch_cpu_idle() IRQ behaviour

2022-07-06 Thread Rafael J. Wysocki

On Wed, Jun 8, 2022 at 4:46 PM Peter Zijlstra  wrote:
>
> Current arch_cpu_idle() is called with IRQs disabled, but will return
> with IRQs enabled.
>
> However, the very first thing the generic code does after calling
> arch_cpu_idle() is raw_local_irq_disable(). This means that
> architectures that can idle with IRQs disabled end up doing a
> pointless 'enable-disable' dance.
>
> Therefore, push this IRQ disabling into the idle function, meaning
> that those architectures can avoid the pointless IRQ state flipping.
>
> Signed-off-by: Peter Zijlstra (Intel) 

Acked-by: Rafael J. Wysocki 
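
A rough user-space model of the point made in the changelog, for illustration only. The interrupt flag is a plain variable, every `mock_*` helper and the `idle_step_*` functions are hypothetical names, and nothing here is kernel code. It simply counts how often the IRQ state flips around one idle call under the old and the new convention.

```c
#include <stdbool.h>

/* User-space mock of the CPU interrupt flag; counts every state change. */
static bool irqs_on;
static int irq_flips;

static void mock_irq_enable(void)
{
	if (!irqs_on) {
		irqs_on = true;
		irq_flips++;
	}
}

static void mock_irq_disable(void)
{
	if (irqs_on) {
		irqs_on = false;
		irq_flips++;
	}
}

/* Old convention: entered with IRQs off, returns with IRQs on. */
static void old_arch_cpu_idle(void)
{
	/* ... wait for an interrupt with IRQs still disabled ... */
	mock_irq_enable();
}

/* New convention: entered with IRQs off and returns with IRQs off. */
static void new_arch_cpu_idle(void)
{
	/* ... wait for an interrupt with IRQs still disabled ... */
}

/* One idle step under the old convention; returns the number of flips. */
static int idle_step_old(void)
{
	irqs_on = false;
	irq_flips = 0;
	old_arch_cpu_idle();
	mock_irq_disable();	/* generic code disables again right away */
	return irq_flips;
}

/* One idle step under the new convention; returns the number of flips. */
static int idle_step_new(void)
{
	irqs_on = false;
	irq_flips = 0;
	new_arch_cpu_idle();	/* nothing left for the generic code to undo */
	return irq_flips;
}
```

In this model the old convention costs two state changes per idle step for an architecture that can idle with IRQs disabled, and the new one costs none. Architectures whose idle instruction needs IRQs enabled would add their own disable at the end of the idle function, as the arc hunks do.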

> ---
>  arch/alpha/kernel/process.c  |1 -
>  arch/arc/kernel/process.c|3 +++
>  arch/arm/kernel/process.c|1 -
>  arch/arm/mach-gemini/board-dt.c  |3 ++-
>  arch/arm64/kernel/idle.c |1 -
>  arch/csky/kernel/process.c   |1 -
>  arch/csky/kernel/smp.c   |2 +-
>  arch/hexagon/kernel/process.c|1 -
>  arch/ia64/kernel/process.c   |1 +
>  arch/microblaze/kernel/process.c |1 -
>  arch/mips/kernel/idle.c  |8 +++-
>  arch/nios2/kernel/process.c  |1 -
>  arch/openrisc/kernel/process.c   |1 +
>  arch/parisc/kernel/process.c |2 --
>  arch/powerpc/kernel/idle.c   |5 ++---
>  arch/riscv/kernel/process.c  |1 -
>  arch/s390/kernel/idle.c  |1 -
>  arch/sh/kernel/idle.c|1 +
>  arch/sparc/kernel/leon_pmc.c |4 
>  arch/sparc/kernel/process_32.c   |1 -
>  arch/sparc/kernel/process_64.c   |3 ++-
>  arch/um/kernel/process.c |1 -
>  arch/x86/coco/tdx/tdx.c  |3 +++
>  arch/x86/kernel/process.c|   15 ---
>  arch/xtensa/kernel/process.c |1 +
>  kernel/sched/idle.c  |2 --
>  26 files changed, 28 insertions(+), 37 deletions(-)
>
> --- a/arch/alpha/kernel/process.c
> +++ b/arch/alpha/kernel/process.c
> @@ -57,7 +57,6 @@ EXPORT_SYMBOL(pm_power_off);
>  void arch_cpu_idle(void)
>  {
> wtint(0);
> -   raw_local_irq_enable();
>  }
>
>  void arch_cpu_idle_dead(void)
> --- a/arch/arc/kernel/process.c
> +++ b/arch/arc/kernel/process.c
> @@ -114,6 +114,8 @@ void arch_cpu_idle(void)
> "sleep %0   \n"
> :
> :"I"(arg)); /* can't be "r" has to be embedded const */
> +
> +   raw_local_irq_disable();
>  }
>
>  #else  /* ARC700 */
> @@ -122,6 +124,7 @@ void arch_cpu_idle(void)
>  {
> /* sleep, but enable both set E1/E2 (levels of interrupts) before committing */
> __asm__ __volatile__("sleep 0x3 \n");
> +   raw_local_irq_disable();
>  }
>
>  #endif
> --- a/arch/arm/kernel/process.c
> +++ b/arch/arm/kernel/process.c
> @@ -78,7 +78,6 @@ void arch_cpu_idle(void)
> arm_pm_idle();
> else
> cpu_do_idle();
> -   raw_local_irq_enable();
>  }
>
>  void arch_cpu_idle_prepare(void)
> --- a/arch/arm/mach-gemini/board-dt.c
> +++ b/arch/arm/mach-gemini/board-dt.c
> @@ -42,8 +42,9

Re: [PATCH 18/36] cpuidle: Annotate poll_idle()

2022-07-06 Thread Rafael J. Wysocki

On Wed, Jun 8, 2022 at 4:46 PM Peter Zijlstra  wrote:
>
> The __cpuidle functions will become a noinstr class, as such they need
> explicit annotations.
>
> Signed-off-by: Peter Zijlstra (Intel) 

Reviewed-by: Rafael J. Wysocki 

> ---
>  drivers/cpuidle/poll_state.c |6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> --- a/drivers/cpuidle/poll_state.c
> +++ b/drivers/cpuidle/poll_state.c
> @@ -13,7 +13,10 @@
>  static int __cpuidle poll_idle(struct cpuidle_device *dev,
>struct cpuidle_driver *drv, int index)
>  {
> -   u64 time_start = local_clock();
> +   u64 time_start;
> +
> +   instrumentation_begin();
> +   time_start = local_clock();
>
> dev->poll_time_limit = false;
>
> @@ -39,6 +42,7 @@ static int __cpuidle poll_idle(struct cp
> raw_local_irq_disable();
>
> current_clr_polling();
> +   instrumentation_end();
>
> return index;
>  }
>
>

Re: [PATCH 17/36] acpi_idle: Remove tracing

2022-07-06 Thread Rafael J. Wysocki

On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra  wrote:
>
> All the idle routines are called with RCU disabled, as such there must
> not be any tracing inside.
>
> Signed-off-by: Peter Zijlstra (Intel) 

This actually does some additional code duplication cleanup which
would be good to mention in the changelog.  Or even move to a separate
patch for that matter.

Otherwise LGTM.
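
To make the duplication cleanup concrete, here is a minimal user-space sketch of the merged helper. The port accessors are replaced by counting mocks, `in_guest` stands in for the `X86_FEATURE_HYPERVISOR` check, and `MOCK_PM_TIMER_ADDR` is an arbitrary placeholder; all of these are assumptions made for the example, not kernel interfaces.

```c
#include <stdbool.h>

static int inb_calls, inl_calls;	/* counters for the mocked port reads */
static bool in_guest;			/* stands in for X86_FEATURE_HYPERVISOR */

#define MOCK_PM_TIMER_ADDR 0x408UL	/* arbitrary placeholder address */

static void mock_inb(unsigned long addr)
{
	(void)addr;
	inb_calls++;
}

static void mock_inl(unsigned long addr)
{
	(void)addr;
	inl_calls++;
}

/* Merged helper: the C-state port read followed by the dummy wait op. */
static void io_idle(unsigned long addr)
{
	mock_inb(addr);			/* IO port based C-state */

	if (in_guest)
		return;			/* no delay is needed in a guest */

	mock_inl(MOCK_PM_TIMER_ADDR);	/* dummy wait op */
}
```

Both former call sites (the idle entry path and the play-dead path) then reduce to a single `io_idle(cx->address)` call instead of repeating the `inb()` plus `wait_for_freeze()` pair.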

> ---
>  drivers/acpi/processor_idle.c |   24 +---
>  1 file changed, 13 insertions(+), 11 deletions(-)
>
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -108,8 +108,8 @@ static const struct dmi_system_id proces
>  static void __cpuidle acpi_safe_halt(void)
>  {
> if (!tif_need_resched()) {
> -   safe_halt();
> -   local_irq_disable();
> +   raw_safe_halt();
> +   raw_local_irq_disable();
> }
>  }
>
> @@ -524,16 +524,21 @@ static int acpi_idle_bm_check(void)
> return bm_status;
>  }
>
> -static void wait_for_freeze(void)
> +static __cpuidle void io_idle(unsigned long addr)
>  {
> +   /* IO port based C-state */
> +   inb(addr);
> +
>  #ifdef CONFIG_X86
> /* No delay is needed if we are in guest */
> if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
> return;
>  #endif
> -   /* Dummy wait op - must do something useless after P_LVL2 read
> -  because chipsets cannot guarantee that STPCLK# signal
> -  gets asserted in time to freeze execution properly. */
> +   /*
> +* Dummy wait op - must do something useless after P_LVL2 read
> +* because chipsets cannot guarantee that STPCLK# signal
> +* gets asserted in time to freeze execution properly.
> +*/
> inl(acpi_gbl_FADT.xpm_timer_block.address);
>  }
>
> @@ -553,9 +558,7 @@ static void __cpuidle acpi_idle_do_entry
> } else if (cx->entry_method == ACPI_CSTATE_HALT) {
> acpi_safe_halt();
> } else {
> -   /* IO port based C-state */
> -   inb(cx->address);
> -   wait_for_freeze();
> +   io_idle(cx->address);
> }
>
> perf_lopwr_cb(false);
> @@ -577,8 +580,7 @@ static int acpi_idle_play_dead(struct cp
> if (cx->entry_method == ACPI_CSTATE_HALT)
> safe_halt();
> else if (cx->entry_method == ACPI_CSTATE_SYSTEMIO) {
> -   inb(cx->address);
> -   wait_for_freeze();
> +   io_idle(cx->address);
> } else
> return -ENODEV;
>
>
>

Re: [PATCH 05/36] cpuidle: Move IRQ state validation

2022-07-06 Thread Rafael J. Wysocki

On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra  wrote:
>
> Make cpuidle_enter_state() consistent with the s2idle variant and
> verify ->enter() always returns with interrupts disabled.
>
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  drivers/cpuidle/cpuidle.c |   10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -234,7 +234,11 @@ int cpuidle_enter_state(struct cpuidle_d
> stop_critical_timings();
> if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
> rcu_idle_enter();
> +
> entered_state = target_state->enter(dev, drv, index);
> +   if (WARN_ONCE(!irqs_disabled(), "%ps leaked IRQ state", target_state->enter))

I'm not sure if dumping a call trace here is really useful and
WARN_ON() often gets converted to panic().

I would print an error message with pr_warn_once().

Otherwise LGTM.
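
Something along these lines is what I have in mind, shown here as a user-space sketch only. The `mock_*` helpers, `leaky_enter()` and `enter_state_checked()` are hypothetical stand-ins, and the once-only printing is approximated with a single static flag rather than the per-call-site behavior of the real pr_warn_once().

```c
#include <stdbool.h>
#include <stdio.h>

static bool irqs_on;	/* mock interrupt flag */
static int warnings;	/* number of messages actually printed */

static bool mock_irqs_disabled(void)
{
	return !irqs_on;
}

static void mock_raw_local_irq_disable(void)
{
	irqs_on = false;
}

/* Minimal stand-in for pr_warn_once(): print on the first call only. */
static void mock_warn_once(const char *name)
{
	static bool warned;

	if (!warned) {
		warned = true;
		warnings++;
		fprintf(stderr, "%s leaked IRQ state\n", name);
	}
}

/* Hypothetical ->enter() callback that wrongly returns with IRQs on. */
static int leaky_enter(int index)
{
	irqs_on = true;
	return index;
}

/* Check the IRQ state after ->enter() and repair it without a backtrace. */
static int enter_state_checked(int (*enter)(int), const char *name, int index)
{
	int entered;

	irqs_on = false;
	entered = enter(index);
	if (!mock_irqs_disabled()) {
		mock_warn_once(name);
		mock_raw_local_irq_disable();
	}
	return entered;
}
```

The state is still repaired on every call, but only one line is logged and no call trace is dumped, so a panic_on_warn configuration is not affected.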

> +   raw_local_irq_disable();
> +
> if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
> rcu_idle_exit();
> start_critical_timings();
> @@ -246,12 +250,8 @@ int cpuidle_enter_state(struct cpuidle_d
> /* The cpu is no longer idle or about to enter idle. */
> sched_idle_set_state(NULL);
>
> -   if (broadcast) {
> -   if (WARN_ON_ONCE(!irqs_disabled()))
> -   local_irq_disable();
> -
> +   if (broadcast)
> tick_broadcast_exit();
> -   }
>
> if (!cpuidle_state_is_coupled(drv, index))
> local_irq_enable();
>
>

Re: [PATCH 03/36] cpuidle/poll: Ensure IRQ state is invariant

2022-07-06 Thread Rafael J. Wysocki

On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra  wrote:
>
> cpuidle_state::enter() methods should be IRQ invariant
>
> Signed-off-by: Peter Zijlstra (Intel) 

Reviewed-by: Rafael J. Wysocki 

> ---
>  drivers/cpuidle/poll_state.c |4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> --- a/drivers/cpuidle/poll_state.c
> +++ b/drivers/cpuidle/poll_state.c
> @@ -17,7 +17,7 @@ static int __cpuidle poll_idle(struct cp
>
> dev->poll_time_limit = false;
>
> -   local_irq_enable();
> +   raw_local_irq_enable();
> if (!current_set_polling_and_test()) {
> unsigned int loop_count = 0;
> u64 limit;
> @@ -36,6 +36,8 @@ static int __cpuidle poll_idle(struct cp
> }
> }
> }
> +   raw_local_irq_disable();
> +
> current_clr_polling();
>
> return index;
>
>

Re: [PATCH 02/36] x86/idle: Replace x86_idle with a static_call

2022-06-08 Thread Rafael J. Wysocki

On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra  wrote:
>
> Typical boot time setup; no need to suffer an indirect call for that.
>
> Signed-off-by: Peter Zijlstra (Intel) 
> Reviewed-by: Frederic Weisbecker 

Reviewed-by: Rafael J. Wysocki 
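
The selection logic in the patch can be modeled in user space roughly as below. This is only an analogy: a plain function pointer stands in for the static call (so the indirect call the patch removes is still present here), `x86_idle_update()` is a hypothetical wrapper playing the role of static_call_update(), the `bool prefer_mwait` parameter replaces the real CPU feature checks, and the NULL check in `arch_cpu_idle()` is an addition for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

static int halts, mwaits;	/* count which idle routine ran */

static void default_idle(void)
{
	halts++;
}

static void mwait_idle(void)
{
	mwaits++;
}

/* Plain function pointer standing in for the static call; NULL means unset. */
static void (*x86_idle)(void);

static bool x86_idle_set(void)
{
	return x86_idle != NULL;
}

/* Hypothetical wrapper playing the role of static_call_update(). */
static void x86_idle_update(void (*fn)(void))
{
	x86_idle = fn;
}

/* Boot-time selection: the first routine chosen stays in place. */
static void select_idle_routine(bool prefer_mwait)
{
	if (x86_idle_set())
		return;

	x86_idle_update(prefer_mwait ? mwait_idle : default_idle);
}

static void arch_cpu_idle(void)
{
	if (x86_idle)		/* guard added for the sketch only */
		x86_idle();
}
```

The point of the real static call is that the target chosen once at boot gets patched into the call site, which a function pointer cannot show; the sketch only mirrors the "set once, query, update" structure.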

> ---
>  arch/x86/kernel/process.c |   50 
> +-
>  1 file changed, 28 insertions(+), 22 deletions(-)
>
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -24,6 +24,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -692,7 +693,23 @@ void __switch_to_xtra(struct task_struct
>  unsigned long boot_option_idle_override = IDLE_NO_OVERRIDE;
>  EXPORT_SYMBOL(boot_option_idle_override);
>
> -static void (*x86_idle)(void);
> +/*
> + * We use this if we don't have any better idle routine..
> + */
> +void __cpuidle default_idle(void)
> +{
> +   raw_safe_halt();
> +}
> +#if defined(CONFIG_APM_MODULE) || defined(CONFIG_HALTPOLL_CPUIDLE_MODULE)
> +EXPORT_SYMBOL(default_idle);
> +#endif
> +
> +DEFINE_STATIC_CALL_NULL(x86_idle, default_idle);
> +
> +static bool x86_idle_set(void)
> +{
> +   return !!static_call_query(x86_idle);
> +}
>
>  #ifndef CONFIG_SMP
>  static inline void play_dead(void)
> @@ -715,28 +732,17 @@ void arch_cpu_idle_dead(void)
>  /*
>   * Called from the generic idle code.
>   */
> -void arch_cpu_idle(void)
> -{
> -   x86_idle();
> -}
> -
> -/*
> - * We use this if we don't have any better idle routine..
> - */
> -void __cpuidle default_idle(void)
> +void __cpuidle arch_cpu_idle(void)
>  {
> -   raw_safe_halt();
> +   static_call(x86_idle)();
>  }
> -#if defined(CONFIG_APM_MODULE) || defined(CONFIG_HALTPOLL_CPUIDLE_MODULE)
> -EXPORT_SYMBOL(default_idle);
> -#endif
>
>  #ifdef CONFIG_XEN
>  bool xen_set_default_idle(void)
>  {
> -   bool ret = !!x86_idle;
> +   bool ret = x86_idle_set();
>
> -   x86_idle = default_idle;
> +   static_call_update(x86_idle, default_idle);
>
> return ret;
>  }
> @@ -859,20 +865,20 @@ void select_idle_routine(const struct cp
> if (boot_option_idle_override == IDLE_POLL && smp_num_siblings > 1)
> pr_warn_once("WARNING: polling idle and HT enabled, performance may degrade\n");
>  #endif
> -   if (x86_idle || boot_option_idle_override == IDLE_POLL)
> +   if (x86_idle_set() || boot_option_idle_override == IDLE_POLL)
> return;
>
> if (boot_cpu_has_bug(X86_BUG_AMD_E400)) {
> pr_info("using AMD E400 aware idle routine\n");
> -   x86_idle = amd_e400_idle;
> +   static_call_update(x86_idle, amd_e400_idle);
> } else if (prefer_mwait_c1_over_halt(c)) {
> pr_info("using mwait in idle threads\n");
> -   x86_idle = mwait_idle;
> +   static_call_update(x86_idle, mwait_idle);
> } else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
> pr_info("using TDX aware idle routine\n");
> -

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-06-08 Thread Rafael J. Wysocki

On Wed, Jun 8, 2022 at 5:48 PM Peter Zijlstra  wrote:
>
> On Wed, Jun 08, 2022 at 05:01:05PM +0200, Rafael J. Wysocki wrote:
> > On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra  wrote:
> > >
> > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > Xeons") wrecked intel_idle in two ways:
> > >
> > >  - must not have tracing in idle functions
> > >  - must return with IRQs disabled
> > >
> > > Additionally, it added a branch for no good reason.
> > >
> > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > Signed-off-by: Peter Zijlstra (Intel) 
> >
> > Acked-by: Rafael J. Wysocki 
> >
> > And do I think correctly that this can be applied without the rest of
> > the series?
>
> Yeah, I don't think this relies on any of the preceding patches. If you
> want to route this through the pm/fixes tree that's fine.

OK, thanks, applied (and I moved the intel_idle() kerneldoc so it is
next to the function to avoid the docs build warning).

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-06-08 Thread Rafael J. Wysocki

On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra  wrote:
>
> Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> Xeons") wrecked intel_idle in two ways:
>
>  - must not have tracing in idle functions
>  - must return with IRQs disabled
>
> Additionally, it added a branch for no good reason.
>
> Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> Signed-off-by: Peter Zijlstra (Intel) 

Acked-by: Rafael J. Wysocki 

And do I think correctly that this can be applied without the rest of
the series?

> ---
>  drivers/idle/intel_idle.c |   48 +++---
>  1 file changed, 37 insertions(+), 11 deletions(-)
>
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -129,21 +137,37 @@ static unsigned int mwait_substates __in
>   *
>   * Must be called under local_irq_disable().
>   */
> +
> -static __cpuidle int intel_idle(struct cpuidle_device *dev,
> -   struct cpuidle_driver *drv, int index)
> +static __always_inline int __intel_idle(struct cpuidle_device *dev,
> +   struct cpuidle_driver *drv, int index)
>  {
> struct cpuidle_state *state = &drv->states[index];
> unsigned long eax = flg2MWAIT(state->flags);
> unsigned long ecx = 1; /* break on interrupt flag */
>
> -   if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
> -   local_irq_enable();
> -
> mwait_idle_with_hints(eax, ecx);
>
> return index;
>  }
>
> +static __cpuidle int intel_idle(struct cpuidle_device *dev,
> +   struct cpuidle_driver *drv, int index)
> +{
> +   return __intel_idle(dev, drv, index);
> +}
> +
> +static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
> +   struct cpuidle_driver *drv, int index)
> +{
> +   int ret;
> +
> +   raw_local_irq_enable();
> +   ret = __intel_idle(dev, drv, index);
> +   raw_local_irq_disable();
> +
> +   return ret;
> +}
> +
>  /**
>   * intel_idle_s2idle - Ask the processor to enter the given idle state.
>   * @dev: cpuidle device of the target CPU.
> @@ -1801,6 +1824,9 @@ static void __init intel_idle_init_cstat
> /* Structure copy. */
> drv->states[drv->state_count] = cpuidle_state_table[cstate];
>
> +   if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IRQ_ENABLE)
> +   drv->states[drv->state_count].enter = intel_idle_irq;
> +
> if ((disabled_states_mask & BIT(drv->state_count)) ||
> ((icpu->use_acpi || force_use_acpi) &&
>  intel_idle_off_by_default(mwait_hint) &&
>
>
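
The shape of the change quoted above can be tried out in userspace. What
follows is only a minimal sketch, not the intel_idle code: the IRQ flag is
modelled by a plain bool, and every name in it (idle_body, idle_enter,
idle_enter_irq, idle_state_init, STATE_FLAG_IRQ_ENABLE) is made up for
illustration. It shows the two points from the changelog: the common body
no longer branches on the flag, and the IRQ-enabling variant always
returns with "interrupts" disabled again because the choice is made once,
when the state table is set up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; not the real intel_idle code. */
#define STATE_FLAG_IRQ_ENABLE 0x1u

static bool irqs_enabled;        /* models the local IRQ flag */
static bool irqs_on_during_idle; /* what the idle body observed */

struct idle_state {
	unsigned int flags;
	int (*enter)(int index);
};

/* Common body: no branch on the flag, never touches the IRQ state. */
static inline int idle_body(int index)
{
	irqs_on_during_idle = irqs_enabled;
	return index;
}

/* Default callback: runs the body with "interrupts" left disabled. */
static int idle_enter(int index)
{
	return idle_body(index);
}

/* IRQ-enabling callback: must hand control back with IRQs disabled. */
static int idle_enter_irq(int index)
{
	int ret;

	irqs_enabled = true;
	ret = idle_body(index);
	irqs_enabled = false;

	return ret;
}

/* The callback is selected once, at init time, from the state flags. */
static void idle_state_init(struct idle_state *s, unsigned int flags)
{
	assert(s != NULL);
	s->flags = flags;
	s->enter = (flags & STATE_FLAG_IRQ_ENABLE) ? idle_enter_irq
						   : idle_enter;
}
```

Picking the callback at init time moves the per-entry flag test out of
the idle path, which is the "branch for no good reason" the changelog
complains about.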

Re: [PATCH v1] kernel/reboot: Fix powering off using a non-syscall code paths

2022-06-07 Thread Rafael J. Wysocki

On Mon, Jun 6, 2022 at 6:57 PM Dmitry Osipenko
 wrote:
>
> There are other methods of powering off a machine than the reboot syscall.
> Previously we failed to cover those methods, which created a power-off
> regression for some machines, like the PowerPC e500. Fix this problem
> by moving the legacy sys-off handler registration to the latest phase
> of the power-off process and making kernel_can_power_off() check for the
> presence of the legacy pm_power_off.
>
> Tested-by: Michael Ellerman  # ppce500
> Reported-by: Michael Ellerman  # ppce500
> Fixes: da007f171fc9 ("kernel/reboot: Change registration order of legacy power-off handler")
> Signed-off-by: Dmitry Osipenko 
> ---
>  kernel/reboot.c | 46 ++
>  1 file changed, 26 insertions(+), 20 deletions(-)
>
> diff --git a/kernel/reboot.c b/kernel/reboot.c
> index 3b19b123efec..b5a71d1ff603 100644
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -320,6 +320,7 @@ static struct sys_off_handler platform_sys_off_handler;
>  static struct sys_off_handler *alloc_sys_off_handler(int priority)
>  {
> struct sys_off_handler *handler;
> +   gfp_t flags;
>
> /*
>  * Platforms like m68k can't allocate sys_off handler dynamically
> @@ -330,7 +331,12 @@ static struct sys_off_handler *alloc_sys_off_handler(int priority)
> if (handler->cb_data)
> return ERR_PTR(-EBUSY);
> } else {
> -   handler = kzalloc(sizeof(*handler), GFP_KERNEL);
> +   if (system_state > SYSTEM_RUNNING)
> +   flags = GFP_ATOMIC;
> +   else
> +   flags = GFP_KERNEL;
> +
> +   handler = kzalloc(sizeof(*handler), flags);
> if (!handler)
> return ERR_PTR(-ENOMEM);
> }
> @@ -440,7 +446,7 @@ void unregister_sys_off_handler(struct sys_off_handler *handler)
>  {
> int err;
>
> -   if (!handler)
> +   if (IS_ERR_OR_NULL(handler))
> return;
>
> if (handler->blocking)
> @@ -615,7 +621,23 @@ static void do_kernel_power_off_prepare(void)
>   */
>  void do_kernel_power_off(void)
>  {
> +   struct sys_off_handler *sys_off = NULL;
> +
> +   /*
> +* Register sys-off handlers for legacy PM callback. This allows
> +* legacy PM callbacks temporary co-exist with the new sys-off API.
> +*
> +* TODO: Remove legacy handlers once all legacy PM users will be
> +*   switched to the sys-off based APIs.
> +*/
> +   if (pm_power_off)
> +   sys_off = register_sys_off_handler(SYS_OFF_MODE_POWER_OFF,
> +  SYS_OFF_PRIO_DEFAULT,
> +  legacy_pm_power_off, NULL);
> +
> atomic_notifier_call_chain(&power_off_handler_list, 0, NULL);
> +
> +   unregister_sys_off_handler(sys_off);
>  }
>
>  /**
> @@ -626,7 +648,8 @@ void do_kernel_power_off(void)
>   */
>  bool kernel_can_power_off(void)
>  {
> -   return !atomic_notifier_call_chain_is_empty(&power_off_handler_list);
> +   return !atomic_notifier_call_chain_is_empty(&power_off_handler_list) ||
> +   pm_power_off;
>  }
>  EXPORT_SYMBOL_GPL(kernel_can_power_off);
>
> @@ -661,7 +684,6 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
> void __user *, arg)
>  {
> struct pid_namespace *pid_ns = task_active_pid_ns(current);
> -   struct sys_off_handler *sys_off = NULL;
> char buffer[256];
> int ret = 0;
>
> @@ -686,21 +708,6 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
> if (ret)
> return ret;
>
> -   /*
> -* Register sys-off handlers for legacy PM callback. This allows
> -* legacy PM callbacks temporary co-exist with the new sys-off API.
> -*
> -* TODO: Remove legacy handlers once all legacy PM users will be
> -*   switched to the sys-off based APIs.
> -*/
> -   if (pm_power_off) {
> -   sys_off = register_sys_off_handler(SYS_OFF_MODE_POWER_OFF,
> -  SYS_OFF_PRIO_DEFAULT,
> -  legacy_pm_power_off, NULL);
> -   if (IS_ERR(sys_off))
> -   return PTR_ERR(sys_off);
> -   }
> -
> /* Instead of trying to make the power_off code look like
>  * halt when pm_power_off is not set do it the easy way.
>  */
> @@ -758,7 +765,6 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
> break;
> }
> mutex_unlock(&system_transition_mutex);
> -   unregister_sys_off_handler(sys_off);
> return ret;
>  }
>
> --

Applied (with a couple of edits in the changelog), thanks!
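
The idea in the patch quoted above can be modelled outside the kernel.
This is a rough sketch under stated assumptions, not kernel/reboot.c: the
notifier chain is replaced by a fixed-size array, and the names
(handlers, legacy_power_off, register_handler, can_power_off,
do_power_off, count_call) are all invented for the example. It shows the
two behaviours the changelog describes: the "can we power off" query also
looks at the legacy pointer, and the legacy callback is hooked in at the
last moment inside the power-off routine, so every caller of that routine
gets it and not only the syscall path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HANDLERS 8

typedef void (*off_fn)(void);

/* Hypothetical model of the handler chain and the legacy pointer. */
static off_fn handlers[MAX_HANDLERS];
static size_t nr_handlers;
static off_fn legacy_power_off; /* plays the role of pm_power_off */
static int calls;

/* Trivial callback used to observe that a handler actually ran. */
static void count_call(void)
{
	calls++;
}

static bool register_handler(off_fn fn)
{
	assert(fn != NULL);
	if (nr_handlers == MAX_HANDLERS)
		return false;
	handlers[nr_handlers++] = fn;
	return true;
}

/* Power-off is possible with a registered handler OR a legacy callback. */
static bool can_power_off(void)
{
	return nr_handlers > 0 || legacy_power_off != NULL;
}

static void do_power_off(void)
{
	bool temp = false;
	size_t i;

	/* Register late, so every path into this function is covered. */
	if (legacy_power_off)
		temp = register_handler(legacy_power_off);

	for (i = 0; i < nr_handlers; i++)
		handlers[i]();

	/* The temporary entry was appended last, so drop the tail. */
	if (temp)
		nr_handlers--;
}
```

A failed temporary registration is simply skipped here, which mirrors the
patch's choice to make the unregister side tolerate an error pointer
rather than fail the whole power-off.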

Re: [PATCH v4 2/2] PCI/PM: Fix pci_pm_suspend_noirq() to disable PTM

2022-04-25 Thread Rafael J. Wysocki

On Mon, Apr 25, 2022 at 8:33 PM David E. Box
 wrote:
>
> On Sat, 2022-04-23 at 10:01 -0500, Bjorn Helgaas wrote:
> > On Sat, Apr 23, 2022 at 12:43:14AM +, Jingar, Rajvi wrote:
> > > > -Original Message-
> > > > From: Bjorn Helgaas 
> > > > On Thu, Apr 14, 2022 at 07:54:02PM +0200, Rafael J. Wysocki wrote:
> > > > > On 3/25/2022 8:50 PM, Rajvi Jingar wrote:
> > > > > > PCIe devices (like nvme) that do not go into D3 state still need PTM
> > > > > > disabled on PCIe root ports to allow the port to enter a lower-power
> > > > > > PM state and the SoC to reach a lower-power idle state as a whole.
> > > > > > Move the pci_disable_ptm() call out of pci_prepare_to_sleep(), as
> > > > > > that code path is not followed for devices that do not go into D3.
> > > > > > This patch fixes the issue seen on Dell XPS 9300 (Ice Lake CPU) and
> > > > > > Dell Precision 5530 (Coffee Lake CPU) platforms and improves
> > > > > > residency in low-power idle states.
> > > > > >
> > > > > > Fixes: a697f072f5da ("PCI: Disable PTM during suspend to save power")
> > > > > > Signed-off-by: Rajvi Jingar 
> > > > > > Suggested-by: David E. Box 
> > > > > > ---
> > > > > >   drivers/pci/pci-driver.c | 10 ++
> > > > > >   drivers/pci/pci.c| 10 --
> > > > > >   2 files changed, 10 insertions(+), 10 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > > > > > index 8b55a90126a2..ab733374a260 100644
> > > > > > --- a/drivers/pci/pci-driver.c
> > > > > > +++ b/drivers/pci/pci-driver.c
> > > > > > @@ -847,6 +847,16 @@ static int pci_pm_suspend_noirq(struct device *dev)
> > > > > >   if (!pci_dev->state_saved) {
> > > > > >   pci_save_state(pci_dev);
> > > > > > + /*
> > > > > > +  * There are systems (for example, Intel mobile chips since Coffee
> > > > > > +  * Lake) where the power drawn while suspended can be significantly
> > > > > > +  * reduced by disabling PTM on PCIe root ports as this allows the
> > > > > > +  * port to enter a lower-power PM state and the SoC to reach a
> > > > > > +  * lower-power idle state as a whole.
> > > > > > +  */
> > > > > > + if (pci_pcie_type(pci_dev) == PCI_EXP_TYPE_ROOT_PORT)
> > > > > > + pci_disable_ptm(pci_dev);
> > > >
> > > > Why is disabling PTM dependent on pci_dev->state_saved?  The point of
> > > > this is to change the behavior of the device, and it seems like we
> > > > want to do that regardless of whether the driver has used
> > > > pci_save_state().
> > >
> > > Because we use the saved state to restore PTM on the root port.
> > > And it's under this condition that the root port state gets saved.
> >
> > Yes, I understand that pci_restore_ptm_state() depends on a previous
> > call to pci_save_ptm_state().
> >
> > The point I'm trying to make is that pci_disable_ptm() changes the
> > state of the device, and that state change should not depend on
> > whether the driver has used pci_save_state().
>
> We do it here because D3 depends on whether the device state was saved by the
> driver.
>
> if (!pci_dev->state_saved) {
> pci_save_state(pci_dev);
>
> /* disable PTM here */
>
> if (pci_power_manageable(pci_dev))
> pci_prepare_to_sleep(pci_dev);
> }
>
>
> If we disable PTM before the check, we will have saved "PTM disabled" as the
> restore state. And we can't do it after the check as the device will be in D3.
>
> As to disabling PTM on all devices, I see no problem with this, but the
> reasoning is different. We disabled the root port PTM for power savings.

Right.  As per the comment explaining why it is disabled.

> >
> > When we're putting a device into a low-power state, I think we want to
> > disable PTM *always*, no matter what the driver did.  And I think we
> > want to do it for all devices, not just Root Ports.
> >
> > Bjorn
>
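
The ordering argument made in the quoted discussion (save the state
first, disable PTM second, enter the low-power state last) can be checked
with a toy model. The sketch below is not the PCI core: struct fake_dev
and the helpers save_state, restore_state and suspend_noirq are
hypothetical, and "D3" is just a flag. It only illustrates why disabling
PTM after the save leaves "PTM enabled" as the state that gets restored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; field names are invented for the sketch. */
struct fake_dev {
	bool ptm_enabled;       /* live setting */
	bool saved_ptm_enabled; /* snapshot used on resume */
	bool state_saved;
	bool in_d3;
};

static void save_state(struct fake_dev *d)
{
	d->saved_ptm_enabled = d->ptm_enabled;
	d->state_saved = true;
}

static void restore_state(struct fake_dev *d)
{
	d->ptm_enabled = d->saved_ptm_enabled;
	d->state_saved = false;
	d->in_d3 = false;
}

static void suspend_noirq(struct fake_dev *d, bool power_manageable)
{
	assert(d != NULL);

	if (!d->state_saved) {
		save_state(d);

		/*
		 * After the save, so the snapshot still says "enabled";
		 * before the power transition, while the device is
		 * still accessible.
		 */
		d->ptm_enabled = false;

		if (power_manageable)
			d->in_d3 = true;
	}
}
```

Moving the `d->ptm_enabled = false` line above `save_state(d)` in this
model makes the restored value false, which is the "PTM disabled as the
restore state" problem described in the thread.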

Re: [PATCH] cpufreq: Prepare cleanup of powerpc's asm/prom.h

2022-04-13 Thread Rafael J. Wysocki

On Mon, Apr 4, 2022 at 8:27 AM Viresh Kumar  wrote:
>
> On 01-04-22, 19:24, Christophe Leroy wrote:
> > powerpc's asm/prom.h brings some headers that it doesn't
> > need itself.
> >
> > In order to clean it up, first add missing headers in
> > users of asm/prom.h
> >
> > Signed-off-by: Christophe Leroy 
> > ---
> >  drivers/cpufreq/pasemi-cpufreq.c  | 1 -
> >  drivers/cpufreq/pmac32-cpufreq.c  | 2 +-
> >  drivers/cpufreq/pmac64-cpufreq.c  | 2 +-
> >  drivers/cpufreq/ppc_cbe_cpufreq.c | 1 -
> >  drivers/cpufreq/ppc_cbe_cpufreq_pmi.c | 2 +-
> >  5 files changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/cpufreq/pasemi-cpufreq.c b/drivers/cpufreq/pasemi-cpufreq.c
> > index 815645170c4d..039a66bbe1be 100644
> > --- a/drivers/cpufreq/pasemi-cpufreq.c
> > +++ b/drivers/cpufreq/pasemi-cpufreq.c
> > @@ -18,7 +18,6 @@
> >
> >  #include 
> >  #include 
> > -#include 
> >  #include 
> >  #include 
> >
> > diff --git a/drivers/cpufreq/pmac32-cpufreq.c b/drivers/cpufreq/pmac32-cpufreq.c
> > index 4f20c6a9108d..20f64a8b0a35 100644
> > --- a/drivers/cpufreq/pmac32-cpufreq.c
> > +++ b/drivers/cpufreq/pmac32-cpufreq.c
> > @@ -24,7 +24,7 @@
> >  #include 
> >  #include 
> >  #include 
> > -#include 
> > +
> >  #include 
> >  #include 
> >  #include 
> > diff --git a/drivers/cpufreq/pmac64-cpufreq.c b/drivers/cpufreq/pmac64-cpufreq.c
> > index d7542a106e6b..ba9c31d98bd6 100644
> > --- a/drivers/cpufreq/pmac64-cpufreq.c
> > +++ b/drivers/cpufreq/pmac64-cpufreq.c
> > @@ -22,7 +22,7 @@
> >  #include 
> >  #include 
> >  #include 
> > -#include 
> > +
> >  #include 
> >  #include 
> >  #include 
> > diff --git a/drivers/cpufreq/ppc_cbe_cpufreq.c b/drivers/cpufreq/ppc_cbe_cpufreq.c
> > index c58abb4cca3a..e3313ce63b38 100644
> > --- a/drivers/cpufreq/ppc_cbe_cpufreq.c
> > +++ b/drivers/cpufreq/ppc_cbe_cpufreq.c
> > @@ -12,7 +12,6 @@
> >  #include 
> >
> >  #include 
> > -#include 
> >  #include 
> >
> >  #include "ppc_cbe_cpufreq.h"
> > diff --git a/drivers/cpufreq/ppc_cbe_cpufreq_pmi.c b/drivers/cpufreq/ppc_cbe_cpufreq_pmi.c
> > index 037fe23bc6ed..4fba3637b115 100644
> > --- a/drivers/cpufreq/ppc_cbe_cpufreq_pmi.c
> > +++ b/drivers/cpufreq/ppc_cbe_cpufreq_pmi.c
> > @@ -13,9 +13,9 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #include 
> > -#include 
> >  #include 
> >  #include 
>
> Acked-by: Viresh Kumar 

Applied as 5.19 material.

If the powerpc folks decide to take it, I can drop it, so please let me know.

Re: [PATCH 05/22] acpica: Replace comments with C99 initializers

2022-03-28 Thread Rafael J. Wysocki

On Sat, Mar 26, 2022 at 6:09 PM Benjamin Stürz  wrote:
>
> This replaces comments with C99's designated
> initializers because the kernel supports them now.

However, note that all of the ACPICA material should be submitted to
the upstream ACPICA project via https://github.com/acpica/acpica

Also please note that the set of compilers that need to be supported
by the ACPICA project is greater than the set of compilers that can
build the Linux kernel.
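
For reference, a small standalone sketch of what the conversion buys.
This is not the ACPICA table: the enum, the table and the lookup helper
(obj_type, type_names, type_name) are invented for the example. With
designated initializers the index is checked by the compiler instead of
living in a comment, entries may appear in any order, and any index left
out is zero-initialized (NULL for pointers), which the lookup has to
tolerate. Compilers without C99 support reject this syntax, which is the
portability concern raised above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical object types, standing in for a real enumeration. */
enum obj_type {
	TYPE_ANY,
	TYPE_INTEGER,
	TYPE_STRING,
	TYPE_BUFFER,
	TYPE_COUNT
};

/* Order in the initializer no longer matters; the index is explicit. */
static const char *const type_names[] = {
	[TYPE_STRING] = "String",
	[TYPE_ANY]    = "Untyped",
	[TYPE_BUFFER] = "Buffer",
	/* TYPE_INTEGER is left out on purpose: that slot becomes NULL. */
};

static const char *type_name(unsigned int type)
{
	/* The array is sized by its highest designated index. */
	assert(sizeof(type_names) / sizeof(type_names[0]) == TYPE_COUNT);

	if (type >= TYPE_COUNT || type_names[type] == NULL)
		return "UNDEFINED";

	return type_names[type];
}
```

The NULL check is the price of the flexibility: a comment-indexed table
cannot silently skip an entry, while a designated one can.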


> Signed-off-by: Benjamin Stürz 
> ---
>  drivers/acpi/acpica/utdecode.c | 183 -
>  1 file changed, 90 insertions(+), 93 deletions(-)
>
> diff --git a/drivers/acpi/acpica/utdecode.c b/drivers/acpi/acpica/utdecode.c
> index bcd3871079d7..d19868d2ea46 100644
> --- a/drivers/acpi/acpica/utdecode.c
> +++ b/drivers/acpi/acpica/utdecode.c
> @@ -156,37 +156,37 @@ static const char acpi_gbl_bad_type[] = "UNDEFINED";
>  /* Printable names of the ACPI object types */
>
>  static const char *acpi_gbl_ns_type_names[] = {
> -   /* 00 */ "Untyped",
> -   /* 01 */ "Integer",
> -   /* 02 */ "String",
> -   /* 03 */ "Buffer",
> -   /* 04 */ "Package",
> -   /* 05 */ "FieldUnit",
> -   /* 06 */ "Device",
> -   /* 07 */ "Event",
> -   /* 08 */ "Method",
> -   /* 09 */ "Mutex",
> -   /* 10 */ "Region",
> -   /* 11 */ "Power",
> -   /* 12 */ "Processor",
> -   /* 13 */ "Thermal",
> -   /* 14 */ "BufferField",
> -   /* 15 */ "DdbHandle",
> -   /* 16 */ "DebugObject",
> -   /* 17 */ "RegionField",
> -   /* 18 */ "BankField",
> -   /* 19 */ "IndexField",
> -   /* 20 */ "Reference",
> -   /* 21 */ "Alias",
> -   /* 22 */ "MethodAlias",
> -   /* 23 */ "Notify",
> -   /* 24 */ "AddrHandler",
> -   /* 25 */ "ResourceDesc",
> -   /* 26 */ "ResourceFld",
> -   /* 27 */ "Scope",
> -   /* 28 */ "Extra",
> -   /* 29 */ "Data",
> -   /* 30 */ "Invalid"
> +   [0]  = "Untyped",
> +   [1]  = "Integer",
> +   [2]  = "String",
> +   [3]  = "Buffer",
> +   [4]  = "Package",
> +   [5]  = "FieldUnit",
> +   [6]  = "Device",
> +   [7]  = "Event",
> +   [8]  = "Method",
> +   [9]  = "Mutex",
> +   [10] = "Region",
> +   [11] = "Power",
> +   [12] = "Processor",
> +   [13] = "Thermal",
> +   [14] = "BufferField",
> +   [15] = "DdbHandle",
> +   [16] = "DebugObject",
> +   [17] = "RegionField",
> +   [18] = "BankField",
> +   [19] = "IndexField",
> +   [20] = "Reference",
> +   [21] = "Alias",
> +   [22] = "MethodAlias",
> +   [23] = "Notify",
> +   [24] = "AddrHandler",
> +   [25] = "ResourceDesc",
> +   [26] = "ResourceFld",
> +   [27] = "Scope",
> +   [28] = "Extra",
> +   [29] = "Data",
> +   [30] = "Invalid"
>  };
>
>  const char *acpi_ut_get_type_name(acpi_object_type type)
> @@ -284,22 +284,22 @@ const char *acpi_ut_get_node_name(void *object)
>  /* Printable names of object descriptor types */
>
>  static const char *acpi_gbl_desc_type_names[] = {
> -   /* 00 */ "Not a Descriptor",
> -   /* 01 */ "Cached Object",
> -   /* 02 */ "State-Generic",
> -   /* 03 */ "State-Update",
> -   /* 04 */ "State-Package",
> -   /* 05 */ "State-Control",
> -   /* 06 */ "State-RootParseScope",
> -   /* 07 */ "State-ParseScope",
> -   /* 08 */ "State-WalkScope",
> -   /* 09 */ "State-Result",
> -   /* 10 */ "State-Notify",
> -   /* 11 */ "State-Thread",
> -   /* 12 */ "Tree Walk State",
> -   /* 13 */ "Parse Tree Op",
> -   /* 14 */ "Operand Object",
> -   /* 15 */ "Namespace Node"
> +   [0]  = "Not a Descriptor",
> +   [1]  = "Cached Object",
> +   [2]  = "State-Generic",
> +   [3]  = "State-Update",
> +   [4]  = "State-Package",
> +   [5]  = "State-Control",
> +   [6]  = "State-RootParseScope",
> +   [7]  = "State-ParseScope",
> +   [8]  = "State-WalkScope",
> +   [9]  = "State-Result",
> +   [10] = "State-Notify",
> +   [11] = "State-Thread",
> +   [12] = "Tree Walk State",
> +   [13] = "Parse Tree Op",
> +   [14] = "Operand Object",
> +   [15] = "Namespace Node"
>  };
>
>  const char *acpi_ut_get_descriptor_name(void *object)
> @@ -331,13 +331,13 @@ const char *acpi_ut_get_descriptor_name(void *object)
>  /* Printable names of reference object sub-types */
>
>  static const char *acpi_gbl_ref_class_names[] = {
> -   /* 00 */ "Local",
> -   /* 01 */ "Argument",
> -   /* 02 */ "RefOf",
> -   /* 03 */ "Index",
> -   /* 04 */ "DdbHandle",
> -   /* 05 */ "Named Object",
> -   /* 06 */ "Debug"
> +   [0] = "Local",
> +   [1] = "Argument",
> +   [2] = "RefOf",
> +   [3] = "Index",
> +   [4] = "DdbHandle",
> +   [5] = "Named Object",
> +   [6] = "Debug"
>  };
>
>  const char *acpi_ut_get_reference_name(union acpi_operand_object *object)
> @@ -416,25 +416,22 @@

Re: [PATCH] Docs: admin/kernel-parameters: edit a few boot options

2022-03-22 Thread Rafael J. Wysocki

On Mon, Mar 21, 2022 at 2:22 AM Randy Dunlap  wrote:
>
> Clean up some of admin-guide/kernel-parameters.txt:
>
> a. "smt" should be "smt=" (S390)
> b. add "smt-enabled" for POWERPC
> c. Sparc supports the vdso= boot option
> d. make the tp_printk options (2) formatting similar to other options
>by adding spacing
> e. add "trace_clock=" with a reference to Documentation/trace/ftrace.rst
> f. use [IA-64] as documented instead of [ia64]
> g. fix formatting and text for test_suspend=

This ->

> h. fix formatting for swapaccount=
> i. fix formatting and grammar for video.brightness_switch_enabled=

-> and the last one are fine with me, but I suppose that there will be a v2?

> Signed-off-by: Randy Dunlap 
> Cc: Heiko Carstens 
> Cc: Vasily Gorbik 
> Cc: Alexander Gordeev 
> Cc: Christian Borntraeger 
> Cc: Sven Schnelle 
> Cc: linux-s...@vger.kernel.org
> Cc: Steven Rostedt 
> Cc: Ingo Molnar 
> Cc: "David S. Miller" 
> Cc: sparcli...@vger.kernel.org
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-i...@vger.kernel.org
> Cc: "Rafael J. Wysocki" 
> Cc: Pavel Machek 
> Cc: Len Brown 
> Cc: linux...@vger.kernel.org
> Cc: linux-a...@vger.kernel.org
> Cc: Johannes Weiner 
> Cc: Andrew Morton 
> Cc: Jonathan Corbet 
> Cc: linux-...@vger.kernel.org
> ---
>  Documentation/admin-guide/kernel-parameters.txt |   33 +-
>  1 file changed, 22 insertions(+), 11 deletions(-)
>
> --- linux-next-20220318.orig/Documentation/admin-guide/kernel-parameters.txt
> +++ linux-next-20220318/Documentation/admin-guide/kernel-parameters.txt
> @@ -2814,7 +2814,7 @@
> different yeeloong laptops.
> Example: machtype=lemote-yeeloong-2f-7inch
>
> -   max_addr=nn[KMG][KNL,BOOT,ia64] All physical memory greater
> +   max_addr=nn[KMG][KNL,BOOT,IA-64] All physical memory greater
> than or equal to this physical address is ignored.
>
> maxcpus=[SMP] Maximum number of processors that an SMP kernel
> @@ -3057,7 +3057,7 @@
>
> mga=[HW,DRM]
>
> -   min_addr=nn[KMG][KNL,BOOT,ia64] All physical memory below this
> +   min_addr=nn[KMG][KNL,BOOT,IA-64] All physical memory below this
> physical address is ignored.
>
> mini2440=   [ARM,HW,KNL]
> @@ -5382,13 +5382,19 @@
> 1: Fast pin select (default)
> 2: ATC IRMode
>
> -   smt [KNL,S390] Set the maximum number of threads (logical
> +   smt=[KNL,S390] Set the maximum number of threads (logical
> CPUs) to use per physical CPU on systems capable of
> symmetric multithreading (SMT). Will be capped to the
> actual hardware limit.
> Format: 
> Default: -1 (no limit)
>
> +   smt-enabled=[PPC 64-bit] Enable SMT, disable SMT, or set the
> +   maximum number of threads. This can be used to override
> +   the Open Firmware (OF) option.
> +   Format: on | off | 
> +   Default: all threads enabled
> +
> softlockup_panic=
> [KNL] Should the soft-lockup detector generate panics.
> Format: 0 | 1
> @@ -5768,8 +5774,9 @@
> This parameter controls use of the Protected
> Execution Facility on pSeries.
>
> -   swapaccount=[0|1]
> -   [KNL] Enable accounting of swap in memory resource
> +   swapaccount=[KNL]
> +   Format: [0|1]
> +   Enable accounting of swap in memory resource
> controller if no parameter or 1 is given or disable
> it if 0 is given (See Documentation/admin-guide/cgroup-v1/memory.rst)
>
> @@ -5815,7 +5822,8 @@
>
> tdfx=   [HW,DRM]
>
> -   test_suspend=   [SUSPEND][,N]
> +   test_suspend=   [SUSPEND]
> +   Format: { "mem" | "standby" | "freeze" }[,N]
> Specify "mem" (for Suspend-to-RAM) or "standby" (for
> standby suspend) or "freeze" (for suspend type freeze)
> as the system sleep state during system startup with
> @@ -5902,6 +5910,8 @@
>

Re: [PATCH v4 05/25] reboot: Warn if restart handler has duplicated priority

2021-12-12 Thread Rafael J. Wysocki

On Fri, Dec 10, 2021 at 8:04 PM Dmitry Osipenko  wrote:
>
> 10.12.2021 21:27, Rafael J. Wysocki wrote:
> > On Mon, Nov 29, 2021 at 12:34 PM Dmitry Osipenko  wrote:
> >>
> >> 29.11.2021 03:26, Michał Mirosław wrote:
> >>> On Mon, Nov 29, 2021 at 12:06:19AM +0300, Dmitry Osipenko wrote:
> >>>> 28.11.2021 03:28, Michał Mirosław wrote:
> >>>>> On Fri, Nov 26, 2021 at 09:00:41PM +0300, Dmitry Osipenko wrote:
> >>>>>> Add sanity check which ensures that there are no two restart handlers
> >>>>>> registered with the same priority. Normally it's a direct sign of a
> >>>>>> problem if two handlers use the same priority.
> >>>>>
> >>>>> The patch doesn't ensure the property that there are no
> >>>>> duplicated-priority entries on the chain.
> >>>>
> >>>> It's not the exact point of this patch.
> >>>>
> >>>>> I'd rather see an atomic_notifier_chain_register_unique() that returns
> >>>>> -EBUSY or something instead of adding an entry with duplicate priority.
> >>>>> That way it would need only one list traversal unless you want to
> >>>>> register the duplicate anyway (then you would call the older
> >>>>> atomic_notifier_chain_register() after reporting the error).
> >>>>
> >>>> The point of this patch is to warn developers about the problem that
> >>>> needs to be fixed. We already have such troubling drivers in mainline.
> >>>>
> >>>> It's not critical to register different handlers with duplicated
> >>>> priorities, but such cases really need to be corrected. We shouldn't
> >>>> break users' machines during transition to the new API, meanwhile
> >>>> developers should take action to fix their drivers.
> >>>>
> >>>>> (Or you could return > 0 when a duplicate is registered in
> >>>>> atomic_notifier_chain_register() if the callers are prepared
> >>>>> for that. I don't really like this way, though.)
> >>>>
> >>>> I had a similar thought at some point before and decided that I'm not in
> >>>> favor of this approach. It's nicer to have a dedicated function that
> >>>> verifies the uniqueness, IMO.
> >>>
> >>> I don't like the part that it traverses the list second time to check
> >>> the uniqueness. But actually you could avoid that if
> >>> notifier_chain_register() would always add equal-priority entries in
> >>> reverse order:
> >>>
> >>>  static int notifier_chain_register(struct notifier_block **nl,
> >>>   struct notifier_block *n)
> >>>  {
> >>>   while ((*nl) != NULL) {
> >>>   if (unlikely((*nl) == n)) {
> >>>   WARN(1, "double register detected");
> >>>   return 0;
> >>>   }
> >>> - if (n->priority > (*nl)->priority)
> >>> + if (n->priority >= (*nl)->priority)
> >>>   break;
> >>>   nl = &((*nl)->next);
> >>>   }
> >>>   n->next = *nl;
> >>>   rcu_assign_pointer(*nl, n);
> >>>   return 0;
> >>>  }
> >>>
> >>> Then the check for uniqueness after adding would be:
> >>>
> >>>  WARN(nb->next && nb->priority == nb->next->priority);
> >>
> >> We can't just change the registration order because invocation order of
> >> the call chain depends on the registration order
> >
> > It doesn't if unique priorities are required and isn't that what you want?
> >
> >> and some of current
> >> users may rely on that order. I'm pretty sure that changing the order
> >> will have unfortunate consequences.
> >
> > Well, the WARN() doesn't help much then.
> >
> > Either you can make all of the users register with unique priorities,
> > and then you can make the registration reject non-unique ones, or you
> > cannot assume them to be unique.
>
> There is no strong requirement for priorities to be unique, the reboot.c
> code will work properly.

In which case adding the WARN() is not appropriate IMV.

Also I've looked at the existing code and at least in some cases the
order in which the notifiers run doesn't matter.  I'm not sure what
the purpose of this patch is TBH.

> The potential problem is on the user's side and the warning is intended
> to aid the user.

Unless somebody has the panic_on_warn mentioned previously set and
really the user need not understand what the WARN() is about.  IOW,
WARN() helps developers, not users.

> We can make it a strong requirement, but only after converting and
> testing all kernel drivers.

Right.

> I'll consider adding patches for that.

But can you avoid adding more patches to this series?
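
To make the quoted ">=" proposal concrete, here is a minimal userspace
sketch of it. It is not kernel/notifier.c: there is no RCU and no
locking, and struct nb, chain_register and has_duplicate_priority are
names made up for the example. It shows that once equal-priority entries
are inserted in front of existing ones, a duplicate can be detected by
looking only at the new entry's immediate successor, with no second list
walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down notifier block. */
struct nb {
	int priority;
	struct nb *next;
};

/*
 * Insert n so the list stays sorted by descending priority. With ">="
 * an entry of equal priority lands directly in front of the existing
 * one (the variant proposed in the quoted review).
 */
static void chain_register(struct nb **head, struct nb *n)
{
	assert(head != NULL && n != NULL);

	while (*head != NULL) {
		if (n->priority >= (*head)->priority)
			break;
		head = &(*head)->next;
	}
	n->next = *head;
	*head = n;
}

/* After registration, a duplicate can only be the direct successor. */
static bool has_duplicate_priority(const struct nb *n)
{
	return n->next != NULL && n->priority == n->next->priority;
}
```

The same sketch also shows the objection raised in the thread: with ">="
two entries of equal priority are called in reverse registration order,
so any existing user that relies on the old order would see a change in
behaviour.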

Re: [PATCH v4 06/25] reboot: Warn if unregister_restart_handler() fails

2021-12-12 Thread Rafael J. Wysocki

On Fri, Dec 10, 2021 at 7:54 PM Dmitry Osipenko  wrote:
>
> 10.12.2021 21:32, Rafael J. Wysocki wrote:
> > On Fri, Nov 26, 2021 at 7:02 PM Dmitry Osipenko  wrote:
> >>
> >> Emit warning if unregister_restart_handler() fails since it never should
> >> fail. This will ease further API development by catching mistakes early.
> >>
> >> Signed-off-by: Dmitry Osipenko 
> >> ---
> >>  kernel/reboot.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/kernel/reboot.c b/kernel/reboot.c
> >> index e6659ae329f1..f0e7b9c13f6b 100644
> >> --- a/kernel/reboot.c
> >> +++ b/kernel/reboot.c
> >> @@ -210,7 +210,7 @@ EXPORT_SYMBOL(register_restart_handler);
> >>   */
> >>  int unregister_restart_handler(struct notifier_block *nb)
> >>  {
> >> -   return atomic_notifier_chain_unregister(&restart_handler_list, nb);
> >> +   return WARN_ON(atomic_notifier_chain_unregister(&restart_handler_list, nb));
> >
> > The only reason why it can fail is if the object pointed to by nb is
> > not in the chain.
>
> I had exactly this case where object wasn't in the chain due to a bug
> and this warning was very helpful.

During the development.  In production it would be rather annoying.

> >  Why WARN() about this?  And what about systems with
> > panic_on_warn set?
>
> That warning condition will never happen normally, only when something
> is seriously wrong.
>
> Those systems with panic_on_warn will get what they asked for.

They may not be asking for panicking on bugs in the reboot notifier
code, though.  That's what your change is making them panic on.

Re: [PATCH v4 03/25] notifier: Add atomic/blocking_notifier_has_unique_priority()

2021-12-12 Thread Rafael J. Wysocki

On Fri, Dec 10, 2021 at 7:52 PM Dmitry Osipenko  wrote:
>
> 10.12.2021 21:19, Rafael J. Wysocki wrote:
> ...
> >> +bool atomic_notifier_has_unique_priority(struct atomic_notifier_head *nh,
> >> +   struct notifier_block *n)
> >> +{
> >> +   unsigned long flags;
> >> +   bool ret;
> >> +
> >> +   spin_lock_irqsave(&nh->lock, flags);
> >> +   ret = notifier_has_unique_priority(&nh->head, n);
> >> +   spin_unlock_irqrestore(&nh->lock, flags);
> >
> > This only works if the caller can prevent new entries from being added
> > to the list at this point or if the caller knows that they cannot be
> > added for some reason, but the kerneldoc doesn't mention this
> > limitation.
>
> I'll update the comment.
>
> ..
> >> +bool blocking_notifier_has_unique_priority(struct blocking_notifier_head *nh,
> >> +   struct notifier_block *n)
> >> +{
> >> +   bool ret;
> >> +
> >> +   /*
> >> +* This code gets used during boot-up, when task switching is
> >> +* not yet working and interrupts must remain disabled. At such
> >> +* times we must not call down_read().
> >> +*/
> >> +   if (system_state != SYSTEM_BOOTING)
> >
> > No, please don't do this, it makes the whole thing error-prone.
>
> What should I do then?

First of all, do you know of any users who may want to call this
during early initialization?  If so, then why may they want to do
that?

Depending on the above, I would consider adding a special mechanism for them.

> >> +   down_read(&nh->rwsem);
> >> +
> >> +   ret = notifier_has_unique_priority(&nh->head, n);
> >> +
> >> +   if (system_state != SYSTEM_BOOTING)
> >> +   up_read(&nh->rwsem);
> >
> > And still what if a new entry with a non-unique priority is added to
> > the chain at this point?
>
> If entry with a non-unique priority is added after the check, then
> obviously it won't be detected.

Why isn't this a problem?

> I don't understand the question. These
> down/up_read() are the locks that prevent the race, if that's the question.

Not really, they only prevent the race from occurring while
notifier_has_unique_priority() is running.

If anyone depends on this check for correctness, they need to lock the
rwsem, do the check, do the thing depending on the check while holding
the rwsem and then release the rwsem.  Otherwise it is racy.
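
As a rough illustration of that pattern, here is a minimal userspace
sketch (not kernel code).  It assumes a pthread mutex standing in for the
rwsem, and struct nb, struct chain, run_if_unique() and count_call() are
names made up for this example only:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct notifier_block. */
struct nb {
	int priority;
	struct nb *next;
};

/* Hypothetical chain head; the mutex plays the role of the rwsem. */
struct chain {
	pthread_mutex_t lock;
	struct nb *head;
};

/* Caller must hold c->lock. */
static bool has_unique_priority_locked(const struct chain *c,
				       const struct nb *n)
{
	const struct nb *p;

	for (p = c->head; p; p = p->next)
		if (p != n && p->priority == n->priority)
			return false;
	return true;
}

/*
 * Run action(arg) only if n's priority is unique.  The lock is held
 * across both the check and the action, so no entry can be added in
 * between.  Returns true if the action ran.
 */
static bool run_if_unique(struct chain *c, struct nb *n,
			  void (*action)(void *), void *arg)
{
	bool unique;

	assert(c && n && action);

	pthread_mutex_lock(&c->lock);
	unique = has_unique_priority_locked(c, n);
	if (unique)
		action(arg);
	pthread_mutex_unlock(&c->lock);

	return unique;
}

/* Example action: count how many times it was invoked. */
static void count_call(void *arg)
{
	++*(int *)arg;
}
```

If the lock were dropped between the check and the action, the result of
the check could already be stale by the time the action runs.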

Re: [PATCH v4 07/25] reboot: Remove extern annotation from function prototypes

2021-12-12 Thread Rafael J. Wysocki

On Fri, Dec 10, 2021 at 7:16 PM Dmitry Osipenko  wrote:
>
> 10.12.2021 21:09, Rafael J. Wysocki wrote:
> > On Fri, Nov 26, 2021 at 7:02 PM Dmitry Osipenko  wrote:
> >>
> >> There is no need to annotate function prototypes with 'extern', it makes
> >> code less readable. Remove unnecessary annotations from <linux/reboot.h>.
> >>
> >> Signed-off-by: Dmitry Osipenko 
> >
> > I'm not sure that this is really useful.
> >
> > Personally, I tend to respect the existing conventions like this.
> >
> > Surely, this change is not required for the rest of the series to work.
>
> Problem that such things start to spread all over the kernel with a
> copy-paste approach if there is nobody to clean up the code.
>
> This is not a common convention and sometimes it's getting corrected [1].
>
> [1] https://git.kernel.org/linus/6d7434931

In separate patches outside of series adding new features, if one is
so inclined.

Re: [PATCH v4 06/25] reboot: Warn if unregister_restart_handler() fails

2021-12-12 Thread Rafael J. Wysocki

On Fri, Nov 26, 2021 at 7:02 PM Dmitry Osipenko  wrote:
>
> Emit warning if unregister_restart_handler() fails since it never should
> fail. This will ease further API development by catching mistakes early.
>
> Signed-off-by: Dmitry Osipenko 
> ---
>  kernel/reboot.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/reboot.c b/kernel/reboot.c
> index e6659ae329f1..f0e7b9c13f6b 100644
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -210,7 +210,7 @@ EXPORT_SYMBOL(register_restart_handler);
>   */
>  int unregister_restart_handler(struct notifier_block *nb)
>  {
> -   return atomic_notifier_chain_unregister(&restart_handler_list, nb);
> +   return WARN_ON(atomic_notifier_chain_unregister(&restart_handler_list, nb));

The only reason why it can fail is if the object pointed to by nb is
not in the chain.  Why WARN() about this?  And what about systems with
panic_on_warn set?
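
For illustration, a minimal userspace sketch (not the kernel code) of an
unregister routine whose only failure mode is "entry not found".  The
names struct nb and chain_unregister() are hypothetical, and the -ENOENT
return value is an assumption of this sketch:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct notifier_block. */
struct nb {
	int priority;
	struct nb *next;
};

/*
 * Remove n from the singly linked list starting at *nl.
 * Returns 0 on success, or -ENOENT if n is not on the list, which is
 * the only way this function can fail.
 */
static int chain_unregister(struct nb **nl, struct nb *n)
{
	assert(nl && n);

	while (*nl) {
		if (*nl == n) {
			*nl = n->next;	/* unlink the entry */
			return 0;
		}
		nl = &(*nl)->next;
	}
	return -ENOENT;
}
```

A caller that cares can check the return value and handle the error
itself, without a WARN() in the helper.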

>  }
>  EXPORT_SYMBOL(unregister_restart_handler);
>
> --
> 2.33.1
>

Re: [PATCH v4 05/25] reboot: Warn if restart handler has duplicated priority

2021-12-12 Thread Rafael J. Wysocki

On Mon, Nov 29, 2021 at 12:34 PM Dmitry Osipenko  wrote:
>
> 29.11.2021 03:26, Michał Mirosław wrote:
> > On Mon, Nov 29, 2021 at 12:06:19AM +0300, Dmitry Osipenko wrote:
> >> 28.11.2021 03:28, Michał Mirosław wrote:
> >>> On Fri, Nov 26, 2021 at 09:00:41PM +0300, Dmitry Osipenko wrote:
> >>>> Add sanity check which ensures that there are no two restart handlers
> >>>> registered with the same priority. Normally it's a direct sign of a
> >>>> problem if two handlers use the same priority.
> >>>
> >>> The patch doesn't ensure the property that there are no
> >>> duplicated-priority entries on the chain.
> >>
> >> It's not the exact point of this patch.
> >>
> >>> I'd rather see an atomic_notifier_chain_register_unique() that returns
> >>> -EBUSY or something instead of adding an entry with duplicate priority.
> >>> That way it would need only one list traversal unless you want to
> >>> register the duplicate anyway (then you would call the older
> >>> atomic_notifier_chain_register() after reporting the error).
> >>
> >> The point of this patch is to warn developers about the problem that
> >> needs to be fixed. We already have such troubling drivers in mainline.
> >>
> >> It's not critical to register different handlers with a duplicated
> >> priorities, but such cases really need to be corrected. We shouldn't
> >> break users' machines during transition to the new API, meanwhile
> >> developers should take action of fixing theirs drivers.
> >>
> >>> (Or you could return > 0 when a duplicate is registered in
> >>> atomic_notifier_chain_register() if the callers are prepared
> >>> for that. I don't really like this way, though.)
> >>
> >> I had a similar thought at some point before and decided that I'm not in
> >> favor of this approach. It's nicer to have a dedicated function that
> >> verifies the uniqueness, IMO.
> >
> > I don't like the part that it traverses the list second time to check
> > the uniqueness. But actually you could avoid that if
> > notifier_chain_register() would always add equal-priority entries in
> > reverse order:
> >
> >  static int notifier_chain_register(struct notifier_block **nl,
> >   struct notifier_block *n)
> >  {
> >   while ((*nl) != NULL) {
> >   if (unlikely((*nl) == n)) {
> >   WARN(1, "double register detected");
> >   return 0;
> >   }
> > - if (n->priority > (*nl)->priority)
> > + if (n->priority >= (*nl)->priority)
> >   break;
> >   nl = &((*nl)->next);
> >   }
> >   n->next = *nl;
> >   rcu_assign_pointer(*nl, n);
> >   return 0;
> >  }
> >
> > Then the check for uniqueness after adding would be:
> >
> >  WARN(nb->next && nb->priority == nb->next->priority);
>
> We can't just change the registration order because invocation order of
> the call chain depends on the registration order

It doesn't if unique priorities are required and isn't that what you want?

> and some of current
> users may rely on that order. I'm pretty sure that changing the order
> will have unfortunate consequences.

Well, the WARN() doesn't help much then.

Either you can make all of the users register with unique priorities,
and then you can make the registration reject non-unique ones, or you
cannot assume them to be unique.
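
To make the first option concrete, here is a minimal userspace sketch
(not kernel code) of a registration routine that rejects non-unique
priorities in a single list traversal.  The names struct nb and
chain_register_unique() and the -EEXIST/-EBUSY return values are
assumptions made for this example, and the caller is assumed to hold
whatever lock protects the chain:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct notifier_block. */
struct nb {
	int priority;
	struct nb *next;
};

/*
 * Insert n into a list kept in descending priority order.
 * Returns 0 on success, -EEXIST if n is already on the list and
 * -EBUSY if another entry already uses n->priority.
 */
static int chain_register_unique(struct nb **nl, struct nb *n)
{
	assert(nl && n);

	while (*nl) {
		if (*nl == n)
			return -EEXIST;	/* double registration */
		if ((*nl)->priority == n->priority)
			return -EBUSY;	/* priority already taken */
		if (n->priority > (*nl)->priority)
			break;		/* found the insertion point */
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
	return 0;
}
```

Because the list is sorted, a duplicate priority is always seen before
the insertion point, so no second pass over the chain is needed.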

Re: [PATCH v4 04/25] reboot: Correct typo in a comment

2021-12-12 Thread Rafael J. Wysocki

On Fri, Nov 26, 2021 at 7:02 PM Dmitry Osipenko  wrote:
>
> Correct s/implemenations/implementations/ in <linux/reboot.h>.
>
> Signed-off-by: Dmitry Osipenko 

This patch clearly need not be part of this series.

> ---
>  include/linux/reboot.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/reboot.h b/include/linux/reboot.h
> index af907a3d68d1..7c288013a3ca 100644
> --- a/include/linux/reboot.h
> +++ b/include/linux/reboot.h
> @@ -63,7 +63,7 @@ struct pt_regs;
>  extern void machine_crash_shutdown(struct pt_regs *);
>
>  /*
> - * Architecture independent implemenations of sys_reboot commands.
> + * Architecture independent implementations of sys_reboot commands.
>   */
>
>  extern void kernel_restart_prepare(char *cmd);
> --
> 2.33.1
>

Re: [PATCH v4 03/25] notifier: Add atomic/blocking_notifier_has_unique_priority()

2021-12-12 Thread Rafael J. Wysocki

On Fri, Nov 26, 2021 at 7:02 PM Dmitry Osipenko  wrote:
>
> Add atomic/blocking_notifier_has_unique_priority() helpers which return
> true if given handler has unique priority.
>
> Signed-off-by: Dmitry Osipenko 
> ---
>  include/linux/notifier.h |  5 +++
>  kernel/notifier.c| 69 
>  2 files changed, 74 insertions(+)
>
> diff --git a/include/linux/notifier.h b/include/linux/notifier.h
> index 924c9d7c8e73..2c4036f225e1 100644
> --- a/include/linux/notifier.h
> +++ b/include/linux/notifier.h
> @@ -175,6 +175,11 @@ int raw_notifier_call_chain_robust(struct raw_notifier_head *nh,
>
>  bool blocking_notifier_call_chain_is_empty(struct blocking_notifier_head *nh);
>
> +bool atomic_notifier_has_unique_priority(struct atomic_notifier_head *nh,
> +   struct notifier_block *nb);
> +bool blocking_notifier_has_unique_priority(struct blocking_notifier_head *nh,
> +   struct notifier_block *nb);
> +
>  #define NOTIFY_DONE  0x0000  /* Don't care */
>  #define NOTIFY_OK  0x0001  /* Suits me */
>  #define NOTIFY_STOP_MASK   0x8000  /* Don't call further */
> diff --git a/kernel/notifier.c b/kernel/notifier.c
> index b20cb7b9b1f0..7a325b742104 100644
> --- a/kernel/notifier.c
> +++ b/kernel/notifier.c
> @@ -122,6 +122,19 @@ static int notifier_call_chain_robust(struct notifier_block **nl,
> return ret;
>  }
>
> +static int notifier_has_unique_priority(struct notifier_block **nl,
> +   struct notifier_block *n)
> +{
> +   while (*nl && (*nl)->priority >= n->priority) {
> +   if ((*nl)->priority == n->priority && *nl != n)
> +   return false;
> +
> +   nl = &((*nl)->next);
> +   }
> +
> +   return true;
> +}
> +
>  /*
>   * Atomic notifier chain routines.  Registration and unregistration
>   * use a spinlock, and call_chain is synchronized by RCU (no locks).
> @@ -203,6 +216,30 @@ int atomic_notifier_call_chain(struct atomic_notifier_head *nh,
>  EXPORT_SYMBOL_GPL(atomic_notifier_call_chain);
>  NOKPROBE_SYMBOL(atomic_notifier_call_chain);
>
> +/**
> + * atomic_notifier_has_unique_priority - Checks whether notifier's priority is unique
> + * @nh: Pointer to head of the atomic notifier chain
> + * @n: Entry in notifier chain to check
> + *
> + * Checks whether there is another notifier in the chain with the same priority.
> + * Must be called in process context.
> + *
> + * Returns true if priority is unique, false otherwise.
> + */
> +bool atomic_notifier_has_unique_priority(struct atomic_notifier_head *nh,
> +   struct notifier_block *n)
> +{
> +   unsigned long flags;
> +   bool ret;
> +
> +   spin_lock_irqsave(&nh->lock, flags);
> +   ret = notifier_has_unique_priority(&nh->head, n);
> +   spin_unlock_irqrestore(&nh->lock, flags);

This only works if the caller can prevent new entries from being added
to the list at this point or if the caller knows that they cannot be
added for some reason, but the kerneldoc doesn't mention this
limitation.

> +
> +   return ret;
> +}
> +EXPORT_SYMBOL_GPL(atomic_notifier_has_unique_priority);
> +
>  /*
>   * Blocking notifier chain routines.  All access to the chain is
>   * synchronized by an rwsem.
> @@ -336,6 +373,38 @@ bool blocking_notifier_call_chain_is_empty(struct blocking_notifier_head *nh)
>  }
>  EXPORT_SYMBOL_GPL(blocking_notifier_call_chain_is_empty);
>
> +/**
> + * blocking_notifier_has_unique_priority - Checks whether notifier's priority is unique
> + * @nh: Pointer to head of the blocking notifier chain
> + * @n: Entry in notifier chain to check
> + *
> + * Checks whether there is another notifier in the chain with the same priority.
> + * Must be called in process context.
> + *
> + * Returns true if priority is unique, false otherwise.
> + */
> +bool blocking_notifier_has_unique_priority(struct blocking_notifier_head *nh,
> +   struct notifier_block *n)
> +{
> +   bool ret;
> +
> +   /*
> +* This code gets used during boot-up, when task switching is
> +* not yet working and interrupts must remain disabled. At such
> +* times we must not call down_read().
> +*/
> +   if (system_state != SYSTEM_BOOTING)

No, please don't do this, it makes the whole thing error-prone.

> +           down_read(&nh->rwsem);
> +
> +   ret = notifier_has_unique_priority(&nh->head, n);
> +
> +   if (system_state != SYSTEM_BOOTING)
> +           up_read(&nh->rwsem);

And still what if a new entry with a non-unique priority is added to
the chain at this point?

> +
> +   return ret;
> +}
> +EXPORT_SYMBOL_GPL(blocking_notifier_has_unique_priority);
> +
>  /*
>   * Raw notifier chain routines.  There is no protection;
>   * the caller must provide it.  Use at your own risk!
> --
> 2.33.1
>

Re: [PATCH v4 02/25] notifier: Add blocking_notifier_call_chain_is_empty()

2021-12-12 Thread Rafael J. Wysocki

On Fri, Nov 26, 2021 at 7:01 PM Dmitry Osipenko  wrote:
>
> Add blocking_notifier_call_chain_is_empty() that returns true if call
> chain is empty.
>
> Signed-off-by: Dmitry Osipenko 
> ---
>  include/linux/notifier.h |  2 ++
>  kernel/notifier.c| 14 ++
>  2 files changed, 16 insertions(+)
>
> diff --git a/include/linux/notifier.h b/include/linux/notifier.h
> index 4b80a815b666..924c9d7c8e73 100644
> --- a/include/linux/notifier.h
> +++ b/include/linux/notifier.h
> @@ -173,6 +173,8 @@ int blocking_notifier_call_chain_robust(struct blocking_notifier_head *nh,
>  int raw_notifier_call_chain_robust(struct raw_notifier_head *nh,
> unsigned long val_up, unsigned long val_down, void *v);
>
> +bool blocking_notifier_call_chain_is_empty(struct blocking_notifier_head *nh);
> +
>  #define NOTIFY_DONE  0x0000  /* Don't care */
>  #define NOTIFY_OK  0x0001  /* Suits me */
>  #define NOTIFY_STOP_MASK   0x8000  /* Don't call further */
> diff --git a/kernel/notifier.c b/kernel/notifier.c
> index b8251dc0bc0f..b20cb7b9b1f0 100644
> --- a/kernel/notifier.c
> +++ b/kernel/notifier.c
> @@ -322,6 +322,20 @@ int blocking_notifier_call_chain(struct blocking_notifier_head *nh,
>  }
>  EXPORT_SYMBOL_GPL(blocking_notifier_call_chain);
>
> +/**
> + * blocking_notifier_call_chain_is_empty - Check whether notifier chain is empty
> + * @nh: Pointer to head of the blocking notifier chain
> + *
> + * Checks whether notifier chain is empty.
> + *
> + * Returns true if notifier chain is empty, false otherwise.
> + */
> +bool blocking_notifier_call_chain_is_empty(struct blocking_notifier_head *nh)
> +{
> +   return !rcu_access_pointer(nh->head);
> +}
> +EXPORT_SYMBOL_GPL(blocking_notifier_call_chain_is_empty);

The check is not reliable (racy) without locking, so I wouldn't export
anything like this to modules.

At least IMO it should be added along with a user.

Re: [PATCH v4 07/25] reboot: Remove extern annotation from function prototypes

2021-12-12 Thread Rafael J. Wysocki

On Fri, Nov 26, 2021 at 7:02 PM Dmitry Osipenko  wrote:
>
> There is no need to annotate function prototypes with 'extern', it makes
> code less readable. Remove unnecessary annotations from <linux/reboot.h>.
>
> Signed-off-by: Dmitry Osipenko 

I'm not sure that this is really useful.

Personally, I tend to respect the existing conventions like this.

Surely, this change is not required for the rest of the series to work.

> ---
>  include/linux/reboot.h | 38 +++---
>  1 file changed, 19 insertions(+), 19 deletions(-)
>
> diff --git a/include/linux/reboot.h b/include/linux/reboot.h
> index 7c288013a3ca..b7fa25726323 100644
> --- a/include/linux/reboot.h
> +++ b/include/linux/reboot.h
> @@ -40,36 +40,36 @@ extern int reboot_cpu;
>  extern int reboot_force;
>
>
> -extern int register_reboot_notifier(struct notifier_block *);
> -extern int unregister_reboot_notifier(struct notifier_block *);
> +int register_reboot_notifier(struct notifier_block *);
> +int unregister_reboot_notifier(struct notifier_block *);
>
> -extern int devm_register_reboot_notifier(struct device *, struct notifier_block *);
> +int devm_register_reboot_notifier(struct device *, struct notifier_block *);
>
> -extern int register_restart_handler(struct notifier_block *);
> -extern int unregister_restart_handler(struct notifier_block *);
> -extern void do_kernel_restart(char *cmd);
> +int register_restart_handler(struct notifier_block *);
> +int unregister_restart_handler(struct notifier_block *);
> +void do_kernel_restart(char *cmd);
>
>  /*
>   * Architecture-specific implementations of sys_reboot commands.
>   */
>
> -extern void migrate_to_reboot_cpu(void);
> -extern void machine_restart(char *cmd);
> -extern void machine_halt(void);
> -extern void machine_power_off(void);
> +void migrate_to_reboot_cpu(void);
> +void machine_restart(char *cmd);
> +void machine_halt(void);
> +void machine_power_off(void);
>
> -extern void machine_shutdown(void);
> +void machine_shutdown(void);
>  struct pt_regs;
> -extern void machine_crash_shutdown(struct pt_regs *);
> +void machine_crash_shutdown(struct pt_regs *);
>
>  /*
>   * Architecture independent implementations of sys_reboot commands.
>   */
>
> -extern void kernel_restart_prepare(char *cmd);
> -extern void kernel_restart(char *cmd);
> -extern void kernel_halt(void);
> -extern void kernel_power_off(void);
> +void kernel_restart_prepare(char *cmd);
> +void kernel_restart(char *cmd);
> +void kernel_halt(void);
> +void kernel_power_off(void);
>
>  extern int C_A_D; /* for sysctl */
>  void ctrl_alt_del(void);
> @@ -77,15 +77,15 @@ void ctrl_alt_del(void);
>  #define POWEROFF_CMD_PATH_LEN  256
>  extern char poweroff_cmd[POWEROFF_CMD_PATH_LEN];
>
> -extern void orderly_poweroff(bool force);
> -extern void orderly_reboot(void);
> +void orderly_poweroff(bool force);
> +void orderly_reboot(void);
>  void hw_protection_shutdown(const char *reason, int ms_until_forced);
>
>  /*
>   * Emergency restart, callable from an interrupt handler.
>   */
>
> -extern void emergency_restart(void);
> +void emergency_restart(void);
>  #include <asm/emergency-restart.h>
>
>  #endif /* _LINUX_REBOOT_H */
> --
> 2.33.1
>

Re: [PATCH v2 08/45] kernel: Add combined power-off+restart handler call chain API

2021-10-28 Thread Rafael J. Wysocki

On Wed, Oct 27, 2021 at 11:18 PM Dmitry Osipenko  wrote:
>
> SoC platforms often have multiple options of how to perform system's
> power-off and restart operations. Meanwhile today's kernel is limited to
> a single option. Add combined power-off+restart handler call chain API,
> which is inspired by the restart API. The new API provides both power-off
> and restart functionality.
>
> The old pm_power_off method will be kept around till all users are
> converted to the new API.
>
> Current restart API will be replaced by the new unified API since
> new API is its superset. The restart functionality of the power-handler
> API is built upon the existing restart-notifier APIs.
>
> In order to ease conversion to the new API, convenient helpers are added
> for the common use-cases. They will reduce amount of boilerplate code and
> remove global variables. These helpers preserve old behaviour for cases
> where only one power-off handler is executed, this is what existing
> drivers want, and thus, they could be easily converted to the new API.
> Users of the new API should explicitly enable power-off chaining by
> setting corresponding flag of the power_handler structure.
>
> Signed-off-by: Dmitry Osipenko 
> ---
>  include/linux/reboot.h   | 176 +++-
>  kernel/power/hibernate.c |   2 +-
>  kernel/reboot.c  | 601 ++-
>  3 files changed, 768 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/reboot.h b/include/linux/reboot.h
> index b7fa25726323..0ec835338c27 100644
> --- a/include/linux/reboot.h
> +++ b/include/linux/reboot.h
> @@ -8,10 +8,16 @@
>
>  struct device;
>
> -#define SYS_DOWN   0x0001  /* Notify of system down */
> -#define SYS_RESTART    SYS_DOWN
> -#define SYS_HALT   0x0002  /* Notify of system halt */
> -#define SYS_POWER_OFF  0x0003  /* Notify of system power off */
> +enum reboot_prepare_mode {
> +   SYS_DOWN = 1,   /* Notify of system down */
> +   SYS_RESTART = SYS_DOWN,
> +   SYS_HALT,   /* Notify of system halt */
> +   SYS_POWER_OFF,  /* Notify of system power off */
> +};
> +
> +#define RESTART_PRIO_RESERVED  0
> +#define RESTART_PRIO_DEFAULT   128
> +#define RESTART_PRIO_HIGH  192
>
>  enum reboot_mode {
> REBOOT_UNDEFINED = -1,
> @@ -49,6 +55,167 @@ int register_restart_handler(struct notifier_block *);
>  int unregister_restart_handler(struct notifier_block *);
>  void do_kernel_restart(char *cmd);
>
> +/*
> + * Unified poweroff + restart API.
> + */
> +
> +#define POWEROFF_PRIO_RESERVED 0
> +#define POWEROFF_PRIO_PLATFORM 1
> +#define POWEROFF_PRIO_DEFAULT  128
> +#define POWEROFF_PRIO_HIGH 192
> +#define POWEROFF_PRIO_FIRMWARE 224

Also I'm wondering why these particular numbers were chosen, here and above?

> +
> +enum poweroff_mode {
> +   POWEROFF_NORMAL = 0,
> +   POWEROFF_PREPARE,
> +};
> +
> +struct power_off_data {
> +   void *cb_data;
> +};
> +
> +struct power_off_prep_data {
> +   void *cb_data;
> +};
> +
> +struct restart_data {
> +   void *cb_data;
> +   const char *cmd;
> +   enum reboot_mode mode;
> +};
> +
> +struct reboot_prep_data {
> +   void *cb_data;
> +   const char *cmd;
> +   enum reboot_prepare_mode mode;
> +};
> +
> +struct power_handler_private_data {
> +   struct notifier_block reboot_prep_nb;
> +   struct notifier_block power_off_nb;
> +   struct notifier_block restart_nb;
> +   void (*trivial_power_off_cb)(void);
> +   void (*simple_power_off_cb)(void *data);
> +   void *simple_power_off_cb_data;
> +   bool registered;
> +};
> +
> +/**
> + * struct power_handler - Machine power-off + restart handler
> + *
> + * Describes power-off and restart handlers which are invoked by kernel
> + * to power off or restart this machine.  Supports prioritized chaining for
> + * both restart and power-off handlers.  Callback's priority must be unique.
> + * Intended to be used by device drivers that are responsible for restarting
> + * and powering off hardware which kernel is running on.
> + *
> + * Struct power_handler can be static.  Members of this structure must not be
> + * altered while handler is registered.
> + *
> + * Fill the structure members and pass it to register_power_handler().
> + */
> +struct power_handler {
> +   /**
> +* @cb_data:
> +*
> +* User data included in callback's argument.
> +*/

And here I would document the structure fields in the main kerneldoc
comment above.

As is, it is a bit hard to grasp the whole definition.

> +   void *cb_data;
> +
> +   /**
> +* @power_off_cb:
> +*
> +* Callback that should turn off machine.  Inactive if NULL.
> +*/
> +   void (*power_off_cb)(struct power_off_data *data);
> +
> +   /**
> +* @power_off_prepare_cb:
> +*
> +* Power-off preparation

Re: [PATCH v2 08/45] kernel: Add combined power-off+restart handler call chain API

2021-10-28 Thread Rafael J. Wysocki

On Wed, Oct 27, 2021 at 11:18 PM Dmitry Osipenko  wrote:
>
> SoC platforms often have multiple options of how to perform system's
> power-off and restart operations. Meanwhile today's kernel is limited to
> a single option. Add combined power-off+restart handler call chain API,
> which is inspired by the restart API. The new API provides both power-off
> and restart functionality.
>
> The old pm_power_off method will be kept around till all users are
> converted to the new API.
>
> Current restart API will be replaced by the new unified API since
> new API is its superset. The restart functionality of the power-handler
> API is built upon the existing restart-notifier APIs.
>
> In order to ease conversion to the new API, convenient helpers are added
> for the common use-cases. They will reduce amount of boilerplate code and
> remove global variables. These helpers preserve old behaviour for cases
> where only one power-off handler is executed, this is what existing
> drivers want, and thus, they could be easily converted to the new API.
> Users of the new API should explicitly enable power-off chaining by
> setting corresponding flag of the power_handler structure.
>
> Signed-off-by: Dmitry Osipenko 
> ---
>  include/linux/reboot.h   | 176 +++-
>  kernel/power/hibernate.c |   2 +-
>  kernel/reboot.c  | 601 ++-
>  3 files changed, 768 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/reboot.h b/include/linux/reboot.h
> index b7fa25726323..0ec835338c27 100644
> --- a/include/linux/reboot.h
> +++ b/include/linux/reboot.h
> @@ -8,10 +8,16 @@
>
>  struct device;
>
> -#define SYS_DOWN   0x0001  /* Notify of system down */
> -#define SYS_RESTART    SYS_DOWN
> -#define SYS_HALT   0x0002  /* Notify of system halt */
> -#define SYS_POWER_OFF  0x0003  /* Notify of system power off */
> +enum reboot_prepare_mode {
> +   SYS_DOWN = 1,   /* Notify of system down */
> +   SYS_RESTART = SYS_DOWN,
> +   SYS_HALT,   /* Notify of system halt */
> +   SYS_POWER_OFF,  /* Notify of system power off */
> +};
> +
> +#define RESTART_PRIO_RESERVED  0
> +#define RESTART_PRIO_DEFAULT   128
> +#define RESTART_PRIO_HIGH  192
>
>  enum reboot_mode {
> REBOOT_UNDEFINED = -1,
> @@ -49,6 +55,167 @@ int register_restart_handler(struct notifier_block *);
>  int unregister_restart_handler(struct notifier_block *);
>  void do_kernel_restart(char *cmd);
>
> +/*
> + * Unified poweroff + restart API.
> + */
> +
> +#define POWEROFF_PRIO_RESERVED 0
> +#define POWEROFF_PRIO_PLATFORM 1
> +#define POWEROFF_PRIO_DEFAULT  128
> +#define POWEROFF_PRIO_HIGH 192
> +#define POWEROFF_PRIO_FIRMWARE 224
> +
> +enum poweroff_mode {
> +   POWEROFF_NORMAL = 0,
> +   POWEROFF_PREPARE,
> +};
> +
> +struct power_off_data {
> +   void *cb_data;
> +};
> +
> +struct power_off_prep_data {
> +   void *cb_data;
> +};
> +
> +struct restart_data {
> +   void *cb_data;
> +   const char *cmd;
> +   enum reboot_mode mode;
> +};
> +
> +struct reboot_prep_data {
> +   void *cb_data;
> +   const char *cmd;
> +   enum reboot_prepare_mode mode;
> +};
> +
> +struct power_handler_private_data {
> +   struct notifier_block reboot_prep_nb;
> +   struct notifier_block power_off_nb;
> +   struct notifier_block restart_nb;
> +   void (*trivial_power_off_cb)(void);
> +   void (*simple_power_off_cb)(void *data);
> +   void *simple_power_off_cb_data;
> +   bool registered;
> +};
> +
> +/**
> + * struct power_handler - Machine power-off + restart handler
> + *
> + * Describes power-off and restart handlers which are invoked by kernel
> + * to power off or restart this machine.  Supports prioritized chaining for
> + * both restart and power-off handlers.  Callback's priority must be unique.
> + * Intended to be used by device drivers that are responsible for restarting
> + * and powering off hardware which kernel is running on.
> + *
> + * Struct power_handler can be static.  Members of this structure must not be
> + * altered while handler is registered.
> + *
> + * Fill the structure members and pass it to register_power_handler().
> + */
> +struct power_handler {

The name of this structure is too generic IMV.  There are many things
that it might apply to in principle.

What about calling power_off_handler or sys_off_handler as it need not
be about power at all?

Re: [PATCH v4] lockdown,selinux: fix wrong subject in some SELinux lockdown checks

2021-09-13 Thread Rafael J. Wysocki

On Mon, Sep 13, 2021 at 4:04 PM Ondrej Mosnacek  wrote:
>
> Commit 59438b46471a ("security,lockdown,selinux: implement SELinux
> lockdown") added an implementation of the locked_down LSM hook to
> SELinux, with the aim to restrict which domains are allowed to perform
> operations that would breach lockdown.
>
> However, in several places the security_locked_down() hook is called in
> situations where the current task isn't doing any action that would
> directly breach lockdown, leading to SELinux checks that are basically
> bogus.
>
> To fix this, add an explicit struct cred pointer argument to
> security_lockdown() and define NULL as a special value to pass instead
> of current_cred() in such situations. LSMs that take the subject
> credentials into account can then fall back to some default or ignore
> such calls altogether. In the SELinux lockdown hook implementation, use
> SECINITSID_KERNEL in case the cred argument is NULL.
>
> Most of the callers are updated to pass current_cred() as the cred
> pointer, thus maintaining the same behavior. The following callers are
> modified to pass NULL as the cred pointer instead:
> 1. arch/powerpc/xmon/xmon.c
>  Seems to be some interactive debugging facility. It appears that
>  the lockdown hook is called from interrupt context here, so it
>  should be more appropriate to request a global lockdown decision.
> 2. fs/tracefs/inode.c:tracefs_create_file()
>  Here the call is used to prevent creating new tracefs entries when
>  the kernel is locked down. Assumes that locking down is one-way -
>  i.e. if the hook returns non-zero once, it will never return zero
>  again, thus no point in creating these files. Also, the hook is
>  often called by a module's init function when it is loaded by
>  userspace, where it doesn't make much sense to do a check against
>  the current task's creds, since the task itself doesn't actually
>  use the tracing functionality (i.e. doesn't breach lockdown), just
>  indirectly makes some new tracepoints available to whoever is
>  authorized to use them.
> 3. net/xfrm/xfrm_user.c:copy_to_user_*()
>  Here a cryptographic secret is redacted based on the value returned
>  from the hook. There are two possible actions that may lead here:
>  a) A netlink message XFRM_MSG_GETSA with NLM_F_DUMP set - here the
> task context is relevant, since the dumped data is sent back to
> the current task.
>  b) When adding/deleting/updating an SA via XFRM_MSG_xxxSA, the
> dumped SA is broadcasted to tasks subscribed to XFRM events -
> here the current task context is not relevant as it doesn't
> represent the tasks that could potentially see the secret.
>  It doesn't seem worth it to try to keep using the current task's
>  context in the a) case, since the eventual data leak can be
>  circumvented anyway via b), plus there is no way for the task to
>  indicate that it doesn't care about the actual key value, so the
>  check could generate a lot of "false alert" denials with SELinux.
>  Thus, let's pass NULL instead of current_cred() here faute de
>  mieux.
>
> Improvements-suggested-by: Casey Schaufler 
> Improvements-suggested-by: Paul Moore 
> Fixes: 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown")
> Acked-by: Dan Williams  [cxl]
> Acked-by: Steffen Klassert  [xfrm]
> Signed-off-by: Ondrej Mosnacek 

Acked-by: Rafael J. Wysocki 

for the ACPI and hibernation changes.

> ---
>
> v4:
> - rebase on top of TODO
> - fix rebase conflicts:
>   * drivers/cxl/pci.c
> - trivial: the lockdown reason was corrected in mainline
>   * kernel/bpf/helpers.c, kernel/trace/bpf_trace.c
> - trivial: LOCKDOWN_BPF_READ was renamed to LOCKDOWN_BPF_READ_KERNEL
>   in mainline
>   * kernel/power/hibernate.c
> - trivial: !secretmem_active() was added to the condition in
>   hibernation_available()
> - cover new security_locked_down() call in kernel/bpf/helpers.c
>   (LOCKDOWN_BPF_WRITE_USER in BPF_FUNC_probe_write_user case)
>
> v3: https://lore.kernel.org/lkml/20210616085118.1141101-1-omosn...@redhat.com/
> - add the cred argument to security_locked_down() and adapt all callers
> - keep using current_cred() in BPF, as the hook calls have been shifted
>   to program load time (commit ff40e51043af ("bpf, lockdown, audit: Fix
>   buggy SELinux lockdown permission checks"))
> - in SELinux, don't ignore hook calls where cred == NULL, but use
>   SECINITSID_KERNEL as the subject instead
> - update explanations in the commit message
>
> v2: https://lore.kernel.org/lkml/20210517092006.803332-1-omosn...@redhat.com

Re: [PATCH v2 4/4] bus: Make remove callback return void

2021-07-06 Thread Rafael J. Wysocki

On Tue, Jul 6, 2021 at 5:53 PM Uwe Kleine-König
 wrote:
>
> The driver core ignores the return value of this callback because there
> is only little it can do when a device disappears.
>
> This is the final bit of a long lasting cleanup quest where several
> buses were converted to also return void from their remove callback.
> Additionally some resource leaks were fixed that were caused by drivers
> returning an error code in the expectation that the driver won't go
> away.
>
> With struct bus_type::remove returning void it's prevented that newly
> implemented buses return an ignored error code and so don't anticipate
> wrong expectations for driver authors.
>
> Acked-by: Russell King (Oracle)  (For ARM, Amba and related parts)
> Acked-by: Mark Brown 
> Acked-by: Chen-Yu Tsai  (for drivers/bus/sunxi-rsb.c)
> Acked-by: Pali Rohár 
> Acked-by: Mauro Carvalho Chehab  (for drivers/media)
> Acked-by: Hans de Goede  (For drivers/platform)
> Acked-by: Alexandre Belloni 
> Acked-By: Vinod Koul 
> Acked-by: Juergen Gross  (For Xen)
> Acked-by: Lee Jones  (For drivers/mfd)
> Acked-by: Johannes Thumshirn  (For drivers/mcb)
> Acked-by: Johan Hovold 
> Acked-by: Srinivas Kandagatla  (For drivers/slimbus)
> Acked-by: Kirti Wankhede  (For drivers/vfio)
> Acked-by: Maximilian Luz 
> Acked-by: Heikki Krogerus  (For ulpi and 
> typec)
> Acked-by: Samuel Iglesias Gonsálvez  (For ipack)
> Reviewed-by: Tom Rix  (For fpga)
> Acked-by: Geoff Levand  (For ps3)
> Signed-off-by: Uwe Kleine-König 

For the ACPI part:

Acked-by: Rafael J. Wysocki 

> ---
>
>  arch/arm/common/locomo.c  | 3 +--
>  arch/arm/common/sa.c  | 4 +---
>  arch/arm/mach-rpc/ecard.c | 4 +---
>  arch/mips/sgi-ip22/ip22-gio.c | 3 +--
>  arch/parisc/kernel/drivers.c  | 5 ++---
>  arch/powerpc/platforms/ps3/system-bus.c   | 3 +--
>  arch/powerpc/platforms/pseries/ibmebus.c  | 3 +--
>  arch/powerpc/platforms/pseries/vio.c  | 3 +--
>  drivers/acpi/bus.c| 3 +--
>  drivers/amba/bus.c| 4 +---
>  drivers/base/auxiliary.c  | 4 +---
>  drivers/base/isa.c| 4 +---
>  drivers/base/platform.c   | 4 +---
>  drivers/bcma/main.c   | 6 ++
>  drivers/bus/sunxi-rsb.c   | 4 +---
>  drivers/cxl/core.c| 3 +--
>  drivers/dax/bus.c | 4 +---
>  drivers/dma/idxd/sysfs.c  | 4 +---
>  drivers/firewire/core-device.c| 4 +---
>  drivers/firmware/arm_scmi/bus.c   | 4 +---
>  drivers/firmware/google/coreboot_table.c  | 4 +---
>  drivers/fpga/dfl.c| 4 +---
>  drivers/hid/hid-core.c| 4 +---
>  drivers/hid/intel-ish-hid/ishtp/bus.c | 4 +---
>  drivers/hv/vmbus_drv.c| 5 +
>  drivers/hwtracing/intel_th/core.c | 4 +---
>  drivers/i2c/i2c-core-base.c   | 5 +
>  drivers/i3c/master.c  | 4 +---
>  drivers/input/gameport/gameport.c | 3 +--
>  drivers/input/serio/serio.c   | 3 +--
>  drivers/ipack/ipack.c | 4 +---
>  drivers/macintosh/macio_asic.c| 4 +---
>  drivers/mcb/mcb-core.c| 4 +---
>  drivers/media/pci/bt8xx/bttv-gpio.c   | 3 +--
>  drivers/memstick/core/memstick.c  | 3 +--
>  drivers/mfd/mcp-core.c| 3 +--
>  drivers/misc/mei/bus.c| 4 +---
>  drivers/misc/tifm_core.c  | 3 +--
>  drivers/mmc/core/bus.c| 4 +---
>  drivers/mmc/core/sdio_bus.c   | 4 +---
>  drivers/net/netdevsim/bus.c   | 3 +--
>  drivers/ntb/core.c| 4 +---
>  drivers/ntb/ntb_transport.c   | 4 +---
>  drivers/nvdimm/bus.c  | 3 +--
>  drivers/pci/endpoint/pci-epf-core.c   | 4 +---
>  drivers/pci/pci-driver.c  | 3 +--
>  drivers/pcmcia/ds.c   | 4 +---
>  drivers/platform/surface/aggregator/bus.c | 4 +---
>  drivers/platform/x86/wmi.c| 4 +---
>  drivers/pnp/driver.c  | 3 +--
>  drivers/rapidio/rio-driver.c  | 4 +---
>  drivers/rpmsg/rpmsg_core.c| 4 +---
>  drivers/s390/cio/ccwgroup.c   | 4 +---
>  drivers/s390/cio/css.c| 4 +---
>  drivers/s390/cio/device.c | 4 +---
>  drivers/s390/cio/scm.c| 4 +---
>  drivers/s390/crypto/ap_bus.c  | 4 +---
>  drivers/scsi/scsi_debug.c | 3 +--
>  drivers/siox/siox-co

Re: [PATCH] cpufreq: Remove unused flag CPUFREQ_PM_NO_WARN

2021-02-04 Thread Rafael J. Wysocki

On Tue, Feb 2, 2021 at 6:42 AM Viresh Kumar  wrote:
>
> This flag is set by one of the drivers but it isn't used in the code
> otherwise. Remove the unused flag and update the driver.
>
> Signed-off-by: Viresh Kumar 

Applied as 5.12 material, thanks!

> ---
> Rebased over:
>
> https://lore.kernel.org/lkml/a59bb322b22c247d570b70a8e94067804287623b.1612241683.git.viresh.ku...@linaro.org/
>
>  drivers/cpufreq/pmac32-cpufreq.c |  3 +--
>  include/linux/cpufreq.h  | 13 +
>  2 files changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/cpufreq/pmac32-cpufreq.c 
> b/drivers/cpufreq/pmac32-cpufreq.c
> index 73621bc11976..4f20c6a9108d 100644
> --- a/drivers/cpufreq/pmac32-cpufreq.c
> +++ b/drivers/cpufreq/pmac32-cpufreq.c
> @@ -439,8 +439,7 @@ static struct cpufreq_driver pmac_cpufreq_driver = {
> .init   = pmac_cpufreq_cpu_init,
> .suspend= pmac_cpufreq_suspend,
> .resume = pmac_cpufreq_resume,
> -   .flags  = CPUFREQ_PM_NO_WARN |
> - CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
> +   .flags  = CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
> .attr   = cpufreq_generic_attr,
> .name   = "powermac",
>  };
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index c8e40e91fe9b..353969c7acd3 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -398,8 +398,11 @@ struct cpufreq_driver {
>  /* loops_per_jiffy or other kernel "constants" aren't affected by frequency 
> transitions */
>  #define CPUFREQ_CONST_LOOPS BIT(1)
>
> -/* don't warn on suspend/resume speed mismatches */
> -#define CPUFREQ_PM_NO_WARN BIT(2)
> +/*
> + * Set by drivers that want the core to automatically register the cpufreq
> + * driver as a thermal cooling device.
> + */
> +#define CPUFREQ_IS_COOLING_DEV BIT(2)
>
>  /*
>   * This should be set by platforms having multiple clock-domains, i.e.
> @@ -431,12 +434,6 @@ struct cpufreq_driver {
>   */
>  #define CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING  BIT(6)
>
> -/*
> - * Set by drivers that want the core to automatically register the cpufreq
> - * driver as a thermal cooling device.
> - */
> -#define CPUFREQ_IS_COOLING_DEV BIT(7)
> -
>  int cpufreq_register_driver(struct cpufreq_driver *driver_data);
>  int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
>
> --
> 2.25.0.rc1.19.g042ed3e048af
>

Re: [PATCH v2 2/4] PM: hibernate: make direct map manipulations more explicit

2020-10-29 Thread Rafael J. Wysocki

On Thu, Oct 29, 2020 at 5:19 PM Mike Rapoport  wrote:
>
> From: Mike Rapoport 
>
> When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may be
> not present in the direct map and has to be explicitly mapped before it
> could be copied.
>
> On arm64 it is possible that a page would be removed from the direct map
> using set_direct_map_invalid_noflush() but __kernel_map_pages() will refuse
> to map this page back if DEBUG_PAGEALLOC is disabled.
>
> Introduce hibernate_map_page() that will explicitly use
> set_direct_map_{default,invalid}_noflush() for ARCH_HAS_SET_DIRECT_MAP case
> and debug_pagealloc_map_pages() for DEBUG_PAGEALLOC case.
>
> The remapping of the pages in safe_copy_page() presumes that it only
> changes protection bits in an existing PTE and so it is safe to ignore
> return value of set_direct_map_{default,invalid}_noflush().
>
> Still, add a WARN_ON() so that future changes in set_memory APIs will not
> silently break hibernation.
>
> Signed-off-by: Mike Rapoport 

From the hibernation support perspective:

Acked-by: Rafael J. Wysocki 

> ---
>  include/linux/mm.h  | 12 
>  kernel/power/snapshot.c | 30 --
>  2 files changed, 28 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 1fc0609056dc..14e397f3752c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2927,16 +2927,6 @@ static inline bool debug_pagealloc_enabled_static(void)
>  #if defined(CONFIG_DEBUG_PAGEALLOC) || 
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
>  extern void __kernel_map_pages(struct page *page, int numpages, int enable);
>
> -/*
> - * When called in DEBUG_PAGEALLOC context, the call should most likely be
> - * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
> - */
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable)
> -{
> -   __kernel_map_pages(page, numpages, enable);
> -}
> -
>  static inline void debug_pagealloc_map_pages(struct page *page,
>  int numpages, int enable)
>  {
> @@ -2948,8 +2938,6 @@ static inline void debug_pagealloc_map_pages(struct 
> page *page,
>  extern bool kernel_page_present(struct page *page);
>  #endif /* CONFIG_HIBERNATION */
>  #else  /* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable) {}
>  static inline void debug_pagealloc_map_pages(struct page *page,
>  int numpages, int enable) {}
>  #ifdef CONFIG_HIBERNATION
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index 46b1804c1ddf..054c8cce4236 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -76,6 +76,32 @@ static inline void hibernate_restore_protect_page(void 
> *page_address) {}
>  static inline void hibernate_restore_unprotect_page(void *page_address) {}
>  #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
>
> +static inline void hibernate_map_page(struct page *page, int enable)
> +{
> +   if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> +   unsigned long addr = (unsigned long)page_address(page);
> +   int ret;
> +
> +   /*
> +* This should not fail because remapping a page here means
> +* that we only update protection bits in an existing PTE.
> +* It is still worth to have WARN_ON() here if something
> +* changes and this will no longer be the case.
> +*/
> +   if (enable)
> +   ret = set_direct_map_default_noflush(page);
> +   else
> +   ret = set_direct_map_invalid_noflush(page);
> +
> +   if (WARN_ON(ret))
> +   return;
> +
> +   flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +   } else {
> +   debug_pagealloc_map_pages(page, 1, enable);
> +   }
> +}
> +
>  static int swsusp_page_is_free(struct page *);
>  static void swsusp_set_page_forbidden(struct page *);
>  static void swsusp_unset_page_forbidden(struct page *);
> @@ -1355,9 +1381,9 @@ static void safe_copy_page(void *dst, struct page 
> *s_page)
> if (kernel_page_present(s_page)) {
> do_copy_page(dst, page_address(s_page));
> } else {
> -   kernel_map_pages(s_page, 1, 1);
> +   hibernate_map_page(s_page, 1);
> do_copy_page(dst, page_address(s_page));
> -   kernel_map_pages(s_page, 1, 0);
> +   hibernate_map_page(s_page, 0);
> }
>  }
>
> --
> 2.28.0
>

Re: [PATCH 0/5] cpuidle-pseries: Parse extended CEDE information for idle.

2020-07-27 Thread Rafael J. Wysocki

On Tue, Jul 7, 2020 at 1:32 PM Gautham R Shenoy  wrote:
>
> Hi,
>
> On Tue, Jul 07, 2020 at 04:41:34PM +0530, Gautham R. Shenoy wrote:
> > From: "Gautham R. Shenoy" 
> >
> > Hi,
> >
> >
> >
> >
> > Gautham R. Shenoy (5):
> >   cpuidle-pseries: Set the latency-hint before entering CEDE
> >   cpuidle-pseries: Add function to parse extended CEDE records
> >   cpuidle-pseries : Fixup exit latency for CEDE(0)
> >   cpuidle-pseries : Include extended CEDE states in cpuidle framework
> >   cpuidle-pseries: Block Extended CEDE(1) which adds no additional
> > value.
>
> Forgot to mention that these patches are on top of Nathan's series to
> remove extended CEDE offline and bogus topology update code :
> https://lore.kernel.org/linuxppc-dev/20200612051238.1007764-1-nath...@linux.ibm.com/

OK, so this is targeted at the powerpc maintainers, isn't it?

Re: [PATCH v3 1/2] cpuidle: Trace IPI based and timer based wakeup latency from idle states

2020-07-27 Thread Rafael J. Wysocki

On Tue, Jul 21, 2020 at 2:43 PM Pratik Rajesh Sampat
 wrote:
>
> Fire directed smp_call_function_single IPIs from a specified source
> CPU to the specified target CPU to reduce the noise we have to wade
> through in the trace log.

And what's the purpose of it?

> The module is based on the idea written by Srivatsa Bhat and maintained
> by Vaidyanathan Srinivasan internally.
>
> Queue HR timer and measure jitter. Wakeup latency measurement for idle
> states using hrtimer.  Echo a value in ns to timer_test_function and
> watch trace. A HRtimer will be queued and when it fires the expected
> wakeup vs actual wakeup is computed and the delay is printed in ns.
>
> Implemented as a module which utilizes debugfs so that it can be
> integrated with selftests.
>
> To include the module, check option and include as module
> kernel hacking -> Cpuidle latency selftests
>
> [srivatsa.b...@linux.vnet.ibm.com: Initial implementation in
>  cpidle/sysfs]
>
> [sva...@linux.vnet.ibm.com: wakeup latency measurements using hrtimer
>  and fix some of the time calculation]
>
> [e...@linux.vnet.ibm.com: Fix some whitespace and tab errors and
>  increase the resolution of IPI wakeup]
>
> Signed-off-by: Pratik Rajesh Sampat 
> Reviewed-by: Gautham R. Shenoy 
> ---
>  drivers/cpuidle/Makefile   |   1 +
>  drivers/cpuidle/test-cpuidle_latency.c | 150 +
>  lib/Kconfig.debug  |  10 ++
>  3 files changed, 161 insertions(+)
>  create mode 100644 drivers/cpuidle/test-cpuidle_latency.c
>
> diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
> index f07800cbb43f..2ae05968078c 100644
> --- a/drivers/cpuidle/Makefile
> +++ b/drivers/cpuidle/Makefile
> @@ -8,6 +8,7 @@ obj-$(CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED) += coupled.o
>  obj-$(CONFIG_DT_IDLE_STATES) += dt_idle_states.o
>  obj-$(CONFIG_ARCH_HAS_CPU_RELAX) += poll_state.o
>  obj-$(CONFIG_HALTPOLL_CPUIDLE)   += cpuidle-haltpoll.o
> +obj-$(CONFIG_IDLE_LATENCY_SELFTEST)  += test-cpuidle_latency.o
>
>  
> ##
>  # ARM SoC drivers
> diff --git a/drivers/cpuidle/test-cpuidle_latency.c 
> b/drivers/cpuidle/test-cpuidle_latency.c
> new file mode 100644
> index ..61574665e972
> --- /dev/null
> +++ b/drivers/cpuidle/test-cpuidle_latency.c
> @@ -0,0 +1,150 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Module-based API test facility for cpuidle latency using IPIs and timers

I'd like to see a more detailed description of what it does and how it
works here.
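
For reference, the timer-based measurement in the quoted code reduces to one subtraction: the elapsed time between arming the timer and the callback running, minus the requested timeout. A minimal userspace sketch of that arithmetic follows; struct timer_sample and timeout_diff_ns() are hypothetical names chosen for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the timestamps kept in struct timer_data. */
struct timer_sample {
	int64_t start_ns;   /* taken when the timer is armed */
	int64_t end_ns;     /* taken when the callback runs */
	int64_t timeout_ns; /* requested relative timeout */
};

/*
 * Extra delay beyond the requested timeout, in ns:
 * (end - start) - timeout, the same two subtractions as in timer_called().
 */
static int64_t timeout_diff_ns(const struct timer_sample *s)
{
	assert(s->end_ns >= s->start_ns);
	return (s->end_ns - s->start_ns) - s->timeout_ns;
}
```

A timer armed at t = 1000 ns with a 500 ns timeout that fires at t = 1600 ns therefore reports a delay of 100 ns.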

> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +/* IPI based wakeup latencies */
> +struct latency {
> +   unsigned int src_cpu;
> +   unsigned int dest_cpu;
> +   ktime_t time_start;
> +   ktime_t time_end;
> +   u64 latency_ns;
> +} ipi_wakeup;
> +
> +static void measure_latency(void *info)
> +{
> +   struct latency *v;
> +   ktime_t time_diff;
> +
> +   v = (struct latency *)info;
> +   v->time_end = ktime_get();
> +   time_diff = ktime_sub(v->time_end, v->time_start);
> +   v->latency_ns = ktime_to_ns(time_diff);
> +}
> +
> +void run_smp_call_function_test(unsigned int cpu)
> +{
> +   ipi_wakeup.src_cpu = smp_processor_id();
> +   ipi_wakeup.dest_cpu = cpu;
> +   ipi_wakeup.time_start = ktime_get();
> +   smp_call_function_single(cpu, measure_latency, &ipi_wakeup, 1);
> +}
> +
> +/* Timer based wakeup latencies */
> +struct timer_data {
> +   unsigned int src_cpu;
> +   u64 timeout;
> +   ktime_t time_start;
> +   ktime_t time_end;
> +   struct hrtimer timer;
> +   u64 timeout_diff_ns;
> +} timer_wakeup;
> +
> +static enum hrtimer_restart timer_called(struct hrtimer *hrtimer)
> +{
> +   struct timer_data *w;
> +   ktime_t time_diff;
> +
> +   w = container_of(hrtimer, struct timer_data, timer);
> +   w->time_end = ktime_get();
> +
> +   time_diff = ktime_sub(w->time_end, w->time_start);
> +   time_diff = ktime_sub(time_diff, ns_to_ktime(w->timeout));
> +   w->timeout_diff_ns = ktime_to_ns(time_diff);
> +   return HRTIMER_NORESTART;
> +}
> +
> +static void run_timer_test(unsigned int ns)
> +{
> +   hrtimer_init(&timer_wakeup.timer, CLOCK_MONOTONIC,
> +HRTIMER_MODE_REL);
> +   timer_wakeup.timer.function = timer_called;
> +   timer_wakeup.time_start = ktime_get();
> +   timer_wakeup.src_cpu = smp_processor_id();
> +   timer_wakeup.timeout = ns;
> +
> +   hrtimer_start(&timer_wakeup.timer, ns_to_ktime(ns),
> + HRTIMER_MODE_REL_PINNED);
> +}
> +
> +static struct dentry *dir;
> +
> +static int cpu_read_op(void *data, u64 *value)
> +{
> +   *value = ipi_wakeup.dest_cpu;
> +   return 0;
> +}
> +
> +static int cpu_write_op(void *data, u64 value)
> +{
> +   run_smp_call_function_test(value);
> +   return 0;
> +}
> +DEFINE_SIMPLE_ATTRIBUTE(ipi_ops, cpu_read_op,

Re: [PATCH -next] cpuidle/pseries: Make symbol 'pseries_idle_driver' static

2020-07-15 Thread Rafael J. Wysocki

On Tue, Jul 14, 2020 at 4:14 PM Wei Yongjun  wrote:
>
> The sparse tool complains as follows:
>
> drivers/cpuidle/cpuidle-pseries.c:25:23: warning:
>  symbol 'pseries_idle_driver' was not declared. Should it be static?
>
> 'pseries_idle_driver' is not used outside of this file, so mark
> it static.
>
> Reported-by: Hulk Robot 
> Signed-off-by: Wei Yongjun 
> ---
>  drivers/cpuidle/cpuidle-pseries.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cpuidle/cpuidle-pseries.c 
> b/drivers/cpuidle/cpuidle-pseries.c
> index 6513ef2af66a..3e058ad2bb51 100644
> --- a/drivers/cpuidle/cpuidle-pseries.c
> +++ b/drivers/cpuidle/cpuidle-pseries.c
> @@ -22,7 +22,7 @@
>  #include 
>  #include 
>
> -struct cpuidle_driver pseries_idle_driver = {
> +static struct cpuidle_driver pseries_idle_driver = {
> .name = "pseries_idle",
> .owner= THIS_MODULE,
>  };

Applied as 5.9 material, thanks!

Re: [PATCH -next] cpufreq: powernv: Make some symbols static

2020-07-15 Thread Rafael J. Wysocki

On Tue, Jul 14, 2020 at 4:14 PM Wei Yongjun  wrote:
>
> The sparse tool complains as follows:
>
> drivers/cpufreq/powernv-cpufreq.c:88:1: warning:
>  symbol 'pstate_revmap' was not declared. Should it be static?
> drivers/cpufreq/powernv-cpufreq.c:383:18: warning:
>  symbol 'cpufreq_freq_attr_cpuinfo_nominal_freq' was not declared. Should it 
> be static?
> drivers/cpufreq/powernv-cpufreq.c:669:6: warning:
>  symbol 'gpstate_timer_handler' was not declared. Should it be static?
> drivers/cpufreq/powernv-cpufreq.c:902:6: warning:
>  symbol 'powernv_cpufreq_work_fn' was not declared. Should it be static?
>
> Those symbols are not used outside of this file, so mark
> them static.
>
> Reported-by: Hulk Robot 
> Signed-off-by: Wei Yongjun 
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c 
> b/drivers/cpufreq/powernv-cpufreq.c
> index 8646eb197cd9..cf118263ec65 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -85,7 +85,7 @@ struct global_pstate_info {
>
>  static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
>
> -DEFINE_HASHTABLE(pstate_revmap, POWERNV_MAX_PSTATES_ORDER);
> +static DEFINE_HASHTABLE(pstate_revmap, POWERNV_MAX_PSTATES_ORDER);
>  /**
>   * struct pstate_idx_revmap_data: Entry in the hashmap pstate_revmap
>   *   indexed by a function of pstate id.
> @@ -380,7 +380,7 @@ static ssize_t cpuinfo_nominal_freq_show(struct 
> cpufreq_policy *policy,
> powernv_freqs[powernv_pstate_info.nominal].frequency);
>  }
>
> -struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq =
> +static struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq =
> __ATTR_RO(cpuinfo_nominal_freq);
>
>  #define SCALING_BOOST_FREQS_ATTR_INDEX 2
> @@ -666,7 +666,7 @@ static inline void  queue_gpstate_timer(struct 
> global_pstate_info *gpstates)
>   * according quadratic equation. Queues a new timer if it is still not equal
>   * to local pstate
>   */
> -void gpstate_timer_handler(struct timer_list *t)
> +static void gpstate_timer_handler(struct timer_list *t)
>  {
> struct global_pstate_info *gpstates = from_timer(gpstates, t, timer);
> struct cpufreq_policy *policy = gpstates->policy;
> @@ -899,7 +899,7 @@ static struct notifier_block powernv_cpufreq_reboot_nb = {
> .notifier_call = powernv_cpufreq_reboot_notifier,
>  };
>
> -void powernv_cpufreq_work_fn(struct work_struct *work)
> +static void powernv_cpufreq_work_fn(struct work_struct *work)
>  {
> struct chip *chip = container_of(work, struct chip, throttle);
> struct cpufreq_policy *policy;
>

Applied as 5.9 material, thanks!

Re: [PATCH V4 0/3] cpufreq: Allow default governor on cmdline and fix locking issues

2020-06-30 Thread Rafael J. Wysocki

On Mon, Jun 29, 2020 at 10:58 PM Viresh Kumar  wrote:
>
> Hi,
>
> I have picked Quentin's series over my patch, modified both and tested.
>
> V3->V4:
> - Do __module_get() for cpufreq_default_governor() case as well and get
>   rid of an extra variable.
> - Use a single character array, default_governor, instead of two of them.
>
> V2->V3:
> - default_governor is a string now and we don't set it on governor
>   registration or unregistration anymore.
> - Fixed locking issues in cpufreq_init_policy().
>
> --
> Viresh
>
> Original cover letter from Quentin:
>
> This series enables users of prebuilt kernels (e.g. distro kernels) to
> specify their CPUfreq governor of choice using the kernel command line,
> instead of having to wait for the system to fully boot to userspace to
> switch using the sysfs interface. This is helpful for 2 reasons:
>   1. users get to choose the governor that runs during the actual boot;
>   2. it simplifies the userspace boot procedure a bit (one less thing to
>  worry about).
>
> To enable this, the first patch moves all governor init calls to
> core_initcall, to make sure they are registered by the time the drivers
> probe. This should be relatively low impact as registering a governor
> is a simple procedure (it gets added to a llist), and all governors
> already load at core_initcall anyway when they're set as the default
> in Kconfig. This also allows to clean-up the governors' init/exit code,
> and reduces boilerplate.
>
> The second patch introduces the new command line parameter, inspired by
> its cpuidle counterpart. More details can be found in the respective
> patch headers.
>
> Changes in v2:
>  - added Viresh's ack to patch 01
>  - moved the assignment of 'default_governor' in patch 02 to the governor
>registration path instead of the driver registration (Viresh)
>
> Quentin Perret (2):
>   cpufreq: Register governors at core_initcall
>   cpufreq: Specify default governor on command line
>
> Viresh Kumar (1):
>   cpufreq: Fix locking issues with governors
>
>  .../admin-guide/kernel-parameters.txt |  5 ++
>  Documentation/admin-guide/pm/cpufreq.rst  |  6 +-
>  .../platforms/cell/cpufreq_spudemand.c| 26 +-
>  drivers/cpufreq/cpufreq.c | 87 ---
>  drivers/cpufreq/cpufreq_conservative.c| 22 ++---
>  drivers/cpufreq/cpufreq_ondemand.c| 24 ++---
>  drivers/cpufreq/cpufreq_performance.c | 14 +--
>  drivers/cpufreq/cpufreq_powersave.c   | 18 +---
>  drivers/cpufreq/cpufreq_userspace.c   | 18 +---
>  include/linux/cpufreq.h   | 14 +++
>  kernel/sched/cpufreq_schedutil.c  |  6 +-
>  11 files changed, 100 insertions(+), 140 deletions(-)
>
> --

All three patches applied as 5.9 material, thanks!

Re: [PATCH v2 2/2] cpufreq: Specify default governor on command line

2020-06-25 Thread Rafael J. Wysocki

On Thu, Jun 25, 2020 at 3:50 PM Quentin Perret  wrote:
>
> On Thursday 25 Jun 2020 at 15:28:43 (+0200), Rafael J. Wysocki wrote:
> > On Thu, Jun 25, 2020 at 1:53 PM Quentin Perret  wrote:
> > >
> > > On Thursday 25 Jun 2020 at 13:44:34 (+0200), Rafael J. Wysocki wrote:
> > > > On Thu, Jun 25, 2020 at 1:36 PM Viresh Kumar  
> > > > wrote:
> > > > > This change is not right IMO. This part handles the set-policy case,
> > > > > where there are no governors. Right now this code, for some reasons
> > > > > unknown to me, forcefully uses the default governor set to indicate
> > > > > the policy, which is not a great idea in my opinion TBH. This doesn't
> > > > > and shouldn't care about governor modules and should only be looking
> > > > > at strings instead of governor pointer.
> > > >
> > > > Sounds right.
> > > >
> > > > > Rafael, I even think we should remove this code completely and just
> > > > > rely on what the driver has sent to us. Using the selected governor
> > > > > for set policy drivers is very confusing and also we shouldn't be
> > > > > forced to compiling any governor for the set-policy case.
> > > >
> > > > Well, AFAICS the idea was to use the default governor as a kind of
> > > > default policy proxy, but I agree that strings should be sufficient
> > > > for that.
> > >
> > > I agree with all the above. I'd much rather not rely on the default
> > > governor name to populate the default policy, too, so +1 from me.
> >
> > So before this series the default governor was selected at the kernel
> > configuration time (pre-build) and was always built-in.  Because it
> > could not go away, its name could be used to indicate the default
> > policy for the "setpolicy" drivers.
> >
> > After this series, however, it cannot be used this way reliably, but
> > you can still pass cpufreq_param_governor to cpufreq_parse_policy()
> > instead of def_gov->name in cpufreq_init_policy(), can't you?
>
> Good point. I also need to fallback to the default builtin governor if
> the command line parameter isn't valid (or non-existent), so perhaps
> something like so?

Yes, that should work if I haven't missed anything.

> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index dad6b85f4c89..20a2020abf88 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -653,6 +653,23 @@ static unsigned int cpufreq_parse_policy(char 
> *str_governor)
> return CPUFREQ_POLICY_UNKNOWN;
>  }
>
> +static unsigned int cpufreq_default_policy(void)
> +{
> +   unsigned int pol;
> +
> +   pol = cpufreq_parse_policy(cpufreq_param_governor);
> +   if (pol != CPUFREQ_POLICY_UNKNOWN)
> +   return pol;
> +
> +   if (IS_BUILTIN(CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE))
> +   return CPUFREQ_POLICY_PERFORMANCE;
> +
> +   if (IS_BUILTIN(CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE))
> +   return CPUFREQ_POLICY_POWERSAVE;
> +
> +   return CPUFREQ_POLICY_UNKNOWN;
> +}
> +
>  /**
>   * cpufreq_parse_governor - parse a governor string only for has_target()
>   * @str_governor: Governor name.
> @@ -1085,8 +1102,8 @@ static int cpufreq_init_policy(struct cpufreq_policy 
> *policy)
> /* Use the default policy if there is no last_policy. */
> if (policy->last_policy) {
> pol = policy->last_policy;
> -   } else if (default_governor) {
> -   pol = cpufreq_parse_policy(default_governor->name);
> +   } else {
> +   pol = cpufreq_default_policy();
> /*
>  * In case the default governor is neiter 
> "performance"
>  * nor "powersave", fall back to the initial policy

Re: [PATCH v2 2/2] cpufreq: Specify default governor on command line

2020-06-25 Thread Rafael J. Wysocki

On Thu, Jun 25, 2020 at 1:53 PM Quentin Perret  wrote:
>
> On Thursday 25 Jun 2020 at 13:44:34 (+0200), Rafael J. Wysocki wrote:
> > On Thu, Jun 25, 2020 at 1:36 PM Viresh Kumar  
> > wrote:
> > > This change is not right IMO. This part handles the set-policy case,
> > > where there are no governors. Right now this code, for some reasons
> > > unknown to me, forcefully uses the default governor set to indicate
> > > the policy, which is not a great idea in my opinion TBH. This doesn't
> > > and shouldn't care about governor modules and should only be looking
> > > at strings instead of governor pointer.
> >
> > Sounds right.
> >
> > > Rafael, I even think we should remove this code completely and just
> > > rely on what the driver has sent to us. Using the selected governor
> > > for set policy drivers is very confusing and also we shouldn't be
> > > forced to compiling any governor for the set-policy case.
> >
> > Well, AFAICS the idea was to use the default governor as a kind of
> > default policy proxy, but I agree that strings should be sufficient
> > for that.
>
> I agree with all the above. I'd much rather not rely on the default
> governor name to populate the default policy, too, so +1 from me.

So before this series the default governor was selected at the kernel
configuration time (pre-build) and was always built-in.  Because it
could not go away, its name could be used to indicate the default
policy for the "setpolicy" drivers.

After this series, however, it cannot be used this way reliably, but
you can still pass cpufreq_param_governor to cpufreq_parse_policy()
instead of def_gov->name in cpufreq_init_policy(), can't you?
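
To make the suggestion concrete, here is a rough userspace sketch of the fallback: parse the command-line string first, and use the build-time default only if the string does not name a policy. It is not kernel code; enum policy, NAME_LEN, parse_policy() and default_policy() are stand-ins assumed for illustration, loosely modelled on CPUFREQ_POLICY_*, CPUFREQ_NAME_LEN and cpufreq_parse_policy().

```c
#include <assert.h>
#include <strings.h>

/* Hypothetical stand-ins for the kernel's CPUFREQ_POLICY_* values. */
enum policy { POLICY_UNKNOWN, POLICY_POWERSAVE, POLICY_PERFORMANCE };

#define NAME_LEN 16 /* assumed bound, in the spirit of CPUFREQ_NAME_LEN */

/* Map a governor name to a policy, case-insensitively. */
static enum policy parse_policy(const char *name)
{
	assert(name);
	if (!strncasecmp(name, "performance", NAME_LEN))
		return POLICY_PERFORMANCE;
	if (!strncasecmp(name, "powersave", NAME_LEN))
		return POLICY_POWERSAVE;
	return POLICY_UNKNOWN;
}

/*
 * Prefer the command-line string; fall back to the build-time default
 * when the parameter is empty or names something that is not a policy.
 */
static enum policy default_policy(const char *param, enum policy builtin)
{
	enum policy pol = parse_policy(param);

	return pol != POLICY_UNKNOWN ? pol : builtin;
}
```

Only strings are consulted here, so nothing depends on a governor object staying registered.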

Re: [PATCH v2 2/2] cpufreq: Specify default governor on command line

2020-06-25 Thread Rafael J. Wysocki

On Thu, Jun 25, 2020 at 1:36 PM Viresh Kumar  wrote:
>
> After your last email (reply to my patch), I noticed a change which
> isn't required. :)
>
> On 23-06-20, 15:21, Quentin Perret wrote:
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 0128de3603df..4b1a5c0173cf 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -50,6 +50,9 @@ static LIST_HEAD(cpufreq_governor_list);
> >  #define for_each_governor(__governor)\
> >   list_for_each_entry(__governor, &cpufreq_governor_list, governor_list)
> >
> > +static char cpufreq_param_governor[CPUFREQ_NAME_LEN];
> > +static struct cpufreq_governor *default_governor;
> > +
> >  /**
> >   * The "cpufreq driver" - the arch- or hardware-dependent low
> >   * level driver of CPUFreq support, and its spinlock. This lock
> > @@ -1055,7 +1058,6 @@ __weak struct cpufreq_governor 
> > *cpufreq_default_governor(void)
> >
> >  static int cpufreq_init_policy(struct cpufreq_policy *policy)
> >  {
> > - struct cpufreq_governor *def_gov = cpufreq_default_governor();
> >   struct cpufreq_governor *gov = NULL;
> >   unsigned int pol = CPUFREQ_POLICY_UNKNOWN;
> >
> > @@ -1065,8 +1067,8 @@ static int cpufreq_init_policy(struct cpufreq_policy 
> > *policy)
> >   if (gov) {
> >   pr_debug("Restoring governor %s for cpu %d\n",
> >policy->governor->name, policy->cpu);
> > - } else if (def_gov) {
> > - gov = def_gov;
> > + } else if (default_governor) {
> > + gov = default_governor;
> >   } else {
> >   return -ENODATA;
> >   }
>
>
> > @@ -1074,8 +1076,8 @@ static int cpufreq_init_policy(struct cpufreq_policy 
> > *policy)
> >   /* Use the default policy if there is no last_policy. */
> >   if (policy->last_policy) {
> >   pol = policy->last_policy;
> > - } else if (def_gov) {
> > - pol = cpufreq_parse_policy(def_gov->name);
> > + } else if (default_governor) {
> > + pol = cpufreq_parse_policy(default_governor->name);
>
> This change is not right IMO. This part handles the set-policy case,
> where there are no governors. Right now this code, for some reasons
> unknown to me, forcefully uses the default governor set to indicate
> the policy, which is not a great idea in my opinion TBH. This doesn't
> and shouldn't care about governor modules and should only be looking
> at strings instead of governor pointer.

Sounds right.

> Rafael, I even think we should remove this code completely and just
> rely on what the driver has sent to us. Using the selected governor
> for set policy drivers is very confusing and also we shouldn't be
> forced to compiling any governor for the set-policy case.

Well, AFAICS the idea was to use the default governor as a kind of
default policy proxy, but I agree that strings should be sufficient
for that.

I'll have a look at what to do with that code.

Re: [PATCH v2 2/2] cpufreq: Specify default governor on command line

2020-06-25 Thread Rafael J. Wysocki

On Thu, Jun 25, 2020 at 10:50 AM Viresh Kumar  wrote:
>
> On 24-06-20, 16:32, Quentin Perret wrote:
> > Right, but I must admit that, looking at this more, I'm getting a bit
> > confused with the overall locking for governors :/
> >
> > When in cpufreq_init_policy() we find a governor using
> > find_governor(policy->last_governor), what guarantees this governor is
> > not concurrently unregistered? That is, what guarantees this governor
> > doesn't go away between that find_governor() call, and the subsequent
> > call to try_module_get() in cpufreq_set_policy() down the line?
> >
> > Can we somewhat assume that whatever governor is referred to by
> > policy->last_governor will have a non-null refcount? Or are the
> > cpufreq_online() and cpufreq_unregister_governor() path mutually
> > exclusive? Or is there something else?
>
> This should be sufficient to fix pending issues I believe. Based over your
> patches.

LGTM, but can you post it in a new thread to let Patchwork pick it up?
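
The core of the fix quoted below is that lookup and pinning happen inside one critical section, so the governor cannot go away between the two steps. A minimal userspace sketch of that pattern, assuming a pthread mutex in place of the kernel mutex and an "unloading" flag plus a counter in place of try_module_get(); struct governor, register_governor() and release_governor() are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of a registered governor. */
struct governor {
	const char *name;
	bool unloading; /* stand-in for a module that is going away */
	int refcount;   /* stand-in for the module reference count */
	struct governor *next;
};

static pthread_mutex_t governor_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct governor *governor_list;

/* Walk the list; the caller must hold governor_mutex. */
static struct governor *find_governor(const char *name)
{
	struct governor *g;

	for (g = governor_list; g; g = g->next)
		if (!strcmp(g->name, name))
			return g;
	return NULL;
}

/* Look up and pin a governor without dropping the lock in between. */
static struct governor *get_governor(const char *name)
{
	struct governor *g;

	pthread_mutex_lock(&governor_mutex);
	g = find_governor(name);
	if (g && g->unloading)
		g = NULL; /* pinning failed, as when try_module_get() fails */
	if (g)
		g->refcount++;
	pthread_mutex_unlock(&governor_mutex);

	return g;
}

/* Drop a reference taken by get_governor(). */
static void release_governor(struct governor *g)
{
	pthread_mutex_lock(&governor_mutex);
	assert(g->refcount > 0);
	g->refcount--;
	pthread_mutex_unlock(&governor_mutex);
}

/* Add a governor to the list under the lock. */
static void register_governor(struct governor *g)
{
	pthread_mutex_lock(&governor_mutex);
	g->next = governor_list;
	governor_list = g;
	pthread_mutex_unlock(&governor_mutex);
}
```

A caller that gets a non-NULL pointer back owns a reference and has to release it once the governor is no longer in use.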

> -8<-
> From: Viresh Kumar 
> Date: Thu, 25 Jun 2020 13:15:23 +0530
> Subject: [PATCH] cpufreq: Fix locking issues with governors
>
> The locking around governors handling isn't adequate currently. The list
> of governors should never be traversed without locking in place. Also we
> must make sure the governor isn't removed while it is still referenced
> by code.
>
> Reported-by: Quentin Perret 
> Signed-off-by: Viresh Kumar 
> ---
>  drivers/cpufreq/cpufreq.c | 59 ---
>  1 file changed, 36 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 4b1a5c0173cf..dad6b85f4c89 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -624,6 +624,24 @@ static struct cpufreq_governor *find_governor(const char 
> *str_governor)
> return NULL;
>  }
>
> +static struct cpufreq_governor *get_governor(const char *str_governor)
> +{
> +   struct cpufreq_governor *t;
> +
> +   mutex_lock(&cpufreq_governor_mutex);
> +   t = find_governor(str_governor);
> +   if (!t)
> +   goto unlock;
> +
> +   if (!try_module_get(t->owner))
> +   t = NULL;
> +
> +unlock:
> +   mutex_unlock(&cpufreq_governor_mutex);
> +
> +   return t;
> +}
> +
>  static unsigned int cpufreq_parse_policy(char *str_governor)
>  {
> if (!strncasecmp(str_governor, "performance", CPUFREQ_NAME_LEN))
> @@ -643,28 +661,14 @@ static struct cpufreq_governor 
> *cpufreq_parse_governor(char *str_governor)
>  {
> struct cpufreq_governor *t;
>
> -   mutex_lock(&cpufreq_governor_mutex);
> -
> -   t = find_governor(str_governor);
> -   if (!t) {
> -   int ret;
> -
> -   mutex_unlock(&cpufreq_governor_mutex);
> -
> -   ret = request_module("cpufreq_%s", str_governor);
> -   if (ret)
> -   return NULL;
> -
> -   mutex_lock(&cpufreq_governor_mutex);
> +   t = get_governor(str_governor);
> +   if (t)
> +   return t;
>
> -   t = find_governor(str_governor);
> -   }
> -   if (t && !try_module_get(t->owner))
> -   t = NULL;
> -
> -   mutex_unlock(&cpufreq_governor_mutex);
> +   if (request_module("cpufreq_%s", str_governor))
> +   return NULL;
>
> -   return t;
> +   return get_governor(str_governor);
>  }
>
>  /**
> @@ -818,12 +822,14 @@ static ssize_t show_scaling_available_governors(struct 
> cpufreq_policy *policy,
> goto out;
> }
>
> +   mutex_lock(&cpufreq_governor_mutex);
> for_each_governor(t) {
> if (i >= (ssize_t) ((PAGE_SIZE / sizeof(char))
> - (CPUFREQ_NAME_LEN + 2)))
> -   goto out;
> +   break;
> i += scnprintf(&buf[i], CPUFREQ_NAME_PLEN, "%s ", t->name);
> }
> +   mutex_unlock(&cpufreq_governor_mutex);
>  out:
> i += sprintf(&buf[i], "\n");
> return i;
> @@ -1060,11 +1066,14 @@ static int cpufreq_init_policy(struct cpufreq_policy 
> *policy)
>  {
> struct cpufreq_governor *gov = NULL;
> unsigned int pol = CPUFREQ_POLICY_UNKNOWN;
> +   bool put_governor = false;
> +   int ret;
>
> if (has_target()) {
> /* Update policy governor to the one used before hotplug. */
> -   gov = find_governor(policy->last_governor);
> +   gov = get_governor(policy->last_governor);
> if (gov) {
> +   put_governor = true;
> pr_debug("Restoring governor %s for cpu %d\n",
>  policy->governor->name, policy->cpu);
> } else if (default_governor) {
> @@ -1091,7 +1100,11 @@ static int cpufreq_init_policy(struct cpufreq_policy 
> *policy)
> return -ENODATA;
> }
>
> -   return cpufreq_set_policy(policy, gov, pol);
> +

Re: [PATCH v2 2/2] cpufreq: Specify default governor on command line

2020-06-24 Thread Rafael J. Wysocki

On Wed, Jun 24, 2020 at 7:50 AM Viresh Kumar  wrote:
>
> On 23-06-20, 15:21, Quentin Perret wrote:
> > Currently, the only way to specify the default CPUfreq governor is via
> > Kconfig options, which suits users who can build the kernel themselves
> > perfectly.
> >
> > However, for those who use a distro-like kernel (such as Android, with
> > the Generic Kernel Image project), the only way to use a different
> > default is to boot to userspace, and to then switch using the sysfs
> > interface. Being able to specify the default governor on the command
> > line, like is the case for cpuidle, would enable those users to specify
> > their governor of choice earlier on, and to simplify slightly the
> > userspace boot procedure.
> >
> > To support this use-case, add a kernel command line parameter enabling
> > to specify a default governor for CPUfreq, which takes precedence over
> > the builtin default.
> >
> > This implementation has one notable limitation: the default governor
> > must be registered before the driver. This is solved for builtin
> > governors and drivers using appropriate *_initcall() functions. And in
> > the modular case, this must be reflected as a constraint on the module
> > loading order.
> >
> > Signed-off-by: Quentin Perret 
> > ---
> >  .../admin-guide/kernel-parameters.txt |  5 
> >  Documentation/admin-guide/pm/cpufreq.rst  |  6 ++---
> >  drivers/cpufreq/cpufreq.c | 23 +++
> >  3 files changed, 26 insertions(+), 8 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> > b/Documentation/admin-guide/kernel-parameters.txt
> > index fb95fad81c79..5fd3c9f187eb 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -703,6 +703,11 @@
> >   cpufreq.off=1   [CPU_FREQ]
> >   disable the cpufreq sub-system
> >
> > + cpufreq.default_governor=
> > + [CPU_FREQ] Name of the default cpufreq governor to 
> > use.
> > + This governor must be registered in the kernel before
> > + the cpufreq driver probes.
> > +
> >   cpu_init_udelay=N
> >   [X86] Delay for N microsec between assert and 
> > de-assert
> >   of APIC INIT to start processors.  This delay occurs
> > diff --git a/Documentation/admin-guide/pm/cpufreq.rst 
> > b/Documentation/admin-guide/pm/cpufreq.rst
> > index 0c74a7784964..368e612145d2 100644
> > --- a/Documentation/admin-guide/pm/cpufreq.rst
> > +++ b/Documentation/admin-guide/pm/cpufreq.rst
> > @@ -147,9 +147,9 @@ CPUs in it.
> >
> >  The next major initialization step for a new policy object is to attach a
> >  scaling governor to it (to begin with, that is the default scaling governor
> > -determined by the kernel configuration, but it may be changed later
> > -via ``sysfs``).  First, a pointer to the new policy object is passed to the
> > -governor's ``->init()`` callback which is expected to initialize all of the
> > +determined by the kernel command line or configuration, but it may be 
> > changed
> > +later via ``sysfs``).  First, a pointer to the new policy object is passed 
> > to
> > +the governor's ``->init()`` callback which is expected to initialize all 
> > of the
> >  data structures necessary to handle the given policy and, possibly, to add
> >  a governor ``sysfs`` interface to it.  Next, the governor is started by
> >  invoking its ``->start()`` callback.
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 0128de3603df..4b1a5c0173cf 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -50,6 +50,9 @@ static LIST_HEAD(cpufreq_governor_list);
> >  #define for_each_governor(__governor)\
> >   list_for_each_entry(__governor, &cpufreq_governor_list, governor_list)
> >
> > +static char cpufreq_param_governor[CPUFREQ_NAME_LEN];
> > +static struct cpufreq_governor *default_governor;
> > +
> >  /**
> >   * The "cpufreq driver" - the arch- or hardware-dependent low
> >   * level driver of CPUFreq support, and its spinlock. This lock
> > @@ -1055,7 +1058,6 @@ __weak struct cpufreq_governor 
> > *cpufreq_default_governor(void)
> >
> >  static int cpufreq_init_policy(struct cpufreq_policy *policy)
> >  {
> > - struct cpufreq_governor *def_gov = cpufreq_default_governor();
> >   struct cpufreq_governor *gov = NULL;
> >   unsigned int pol = CPUFREQ_POLICY_UNKNOWN;
> >
> > @@ -1065,8 +1067,8 @@ static int cpufreq_init_policy(struct cpufreq_policy 
> > *policy)
> >   if (gov) {
> >   pr_debug("Restoring governor %s for cpu %d\n",
> >policy->governor->name, policy->cpu);
> > - } else if (def_gov) {
> > - gov = def_gov;
> > + } else if (default_governor) {
> > + gov =

Re: [PATCH] cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_work_fn

2020-03-25 Thread Rafael J. Wysocki

On Tuesday, March 24, 2020 7:34:56 AM CET Michael Ellerman wrote:
> "Rafael J. Wysocki"  writes:
> > On Monday, March 16, 2020 2:57:43 PM CET Pratik Rajesh Sampat wrote:
> >> The patch avoids allocating cpufreq_policy on stack hence fixing frame
> >> size overflow in 'powernv_cpufreq_work_fn'
> >> 
> >> Fixes: 227942809b52 ("cpufreq: powernv: Restore cpu frequency to 
> >> policy->cur on unthrottling")
> >> Signed-off-by: Pratik Rajesh Sampat 
> >
> > Any objections or concerns here?
> >
> > If not, I'll queue it up.
> 
> I have it in my testing branch,

Great!

> but if you pick it up I can drop it.

Let it go in through your tree.

Cheers!

Re: [patch V3 05/20] acpi: Remove header dependency

2020-03-22 Thread Rafael J. Wysocki

On Sat, Mar 21, 2020 at 12:35 PM Thomas Gleixner  wrote:
>
> From: Peter Zijlstra 
>
> In order to avoid future header hell, remove the inclusion of
> proc_fs.h from acpi_bus.h. All it needs is a forward declaration of a
> struct.
>
> Signed-off-by: Peter Zijlstra (Intel) 
> Signed-off-by: Thomas Gleixner 
> Cc: Darren Hart 
> Cc: Andy Shevchenko 
> Cc: platform-driver-...@vger.kernel.org
> Cc: Greg Kroah-Hartman 
> Cc: Zhang Rui 
> Cc: "Rafael J. Wysocki" 
> Cc: linux...@vger.kernel.org
> Cc: Len Brown 
> Cc: linux-a...@vger.kernel.org

Acked-by: Rafael J. Wysocki 

> ---
>  drivers/platform/x86/dell-smo8800.c  |1 +
>  drivers/platform/x86/wmi.c   |1 +
>  drivers/thermal/intel/int340x_thermal/acpi_thermal_rel.c |1 +
>  include/acpi/acpi_bus.h  |2 +-
>  4 files changed, 4 insertions(+), 1 deletion(-)
>
> --- a/drivers/platform/x86/dell-smo8800.c
> +++ b/drivers/platform/x86/dell-smo8800.c
> @@ -16,6 +16,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  struct smo8800_device {
> u32 irq; /* acpi device irq */
> --- a/drivers/platform/x86/wmi.c
> +++ b/drivers/platform/x86/wmi.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>
>  ACPI_MODULE_NAME("wmi");
> --- a/drivers/thermal/intel/int340x_thermal/acpi_thermal_rel.c
> +++ b/drivers/thermal/intel/int340x_thermal/acpi_thermal_rel.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "acpi_thermal_rel.h"
>
>  static acpi_handle acpi_thermal_rel_handle;
> --- a/include/acpi/acpi_bus.h
> +++ b/include/acpi/acpi_bus.h
> @@ -80,7 +80,7 @@ bool acpi_dev_present(const char *hid, c
>
>  #ifdef CONFIG_ACPI
>
> -#include <linux/proc_fs.h>
> +struct proc_dir_entry;
>
>  #define ACPI_BUS_FILE_ROOT "acpi"
>  extern struct proc_dir_entry *acpi_root_dir;
>
>

Re: [PATCH] cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_work_fn

2020-03-19 Thread Rafael J. Wysocki

On Monday, March 16, 2020 2:57:43 PM CET Pratik Rajesh Sampat wrote:
> The patch avoids allocating cpufreq_policy on stack hence fixing frame
> size overflow in 'powernv_cpufreq_work_fn'
> 
> Fixes: 227942809b52 ("cpufreq: powernv: Restore cpu frequency to policy->cur 
> on unthrottling")
> Signed-off-by: Pratik Rajesh Sampat 

Any objections or concerns here?

If not, I'll queue it up.

> ---
>  drivers/cpufreq/powernv-cpufreq.c | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c 
> b/drivers/cpufreq/powernv-cpufreq.c
> index 56f4bc0d209e..20ee0661555a 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -902,6 +902,7 @@ static struct notifier_block powernv_cpufreq_reboot_nb = {
>  void powernv_cpufreq_work_fn(struct work_struct *work)
>  {
>   struct chip *chip = container_of(work, struct chip, throttle);
> + struct cpufreq_policy *policy;
>   unsigned int cpu;
>   cpumask_t mask;
>  
> @@ -916,12 +917,14 @@ void powernv_cpufreq_work_fn(struct work_struct *work)
>   chip->restore = false;
>   for_each_cpu(cpu, &mask) {
>   int index;
> - struct cpufreq_policy policy;
>  
> - cpufreq_get_policy(&policy, cpu);
> - index = cpufreq_table_find_index_c(&policy, policy.cur);
> - powernv_cpufreq_target_index(&policy, index);
> - cpumask_andnot(&mask, &mask, policy.cpus);
> + policy = cpufreq_cpu_get(cpu);
> + if (!policy)
> + continue;
> + index = cpufreq_table_find_index_c(policy, policy->cur);
> + powernv_cpufreq_target_index(policy, index);
> + cpumask_andnot(&mask, &mask, policy->cpus);
> + cpufreq_cpu_put(policy);
>   }
>  out:
>   put_online_cpus();
>

Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-15 Thread Rafael J. Wysocki

On Wed, Jan 15, 2020 at 7:35 AM Hsin-Yi Wang  wrote:
>
> Currently, system reboot uses architecture-specific code (smp_send_stop)
> to offline non-reboot CPUs. Most architectures' implementations loop
> through all non-reboot online CPUs and call an IPI function on each of them.
> Some architectures, like arm64, arm, and x86, set the offline masks for CPUs
> without really offlining them. This causes race conditions, and kernel
> warnings sometimes appear when the system reboots.
>
> This patch adds a config option, ARCH_OFFLINE_CPUS_ON_REBOOT, which offlines
> CPUs in migrate_to_reboot_cpu(). If all non-reboot CPUs are offlined here,
> the loop checking for online CPUs becomes an empty loop. If an architecture
> doesn't enable this config option, or some CPUs somehow fail to go offline,
> it falls back to the IPI function.
>
> Opt in this config for architectures that support CONFIG_HOTPLUG_CPU.
>
> Signed-off-by: Hsin-Yi Wang 
> ---
> Change from v4:
> * fix a few nits: naming, comments, remove Kconfig text...
>
> Change from v3:
> * Opt in config for architectures that support CONFIG_HOTPLUG_CPU
> * Merge function offline_secondary_cpus() and freeze_secondary_cpus()
>   with an additional flag.

This does not seem to be a very good idea, since
freeze_secondary_cpus() does much more than you need for reboot.

For reboot, you basically only need to do something like this AFAICS:

cpu_maps_update_begin();

for_each_online_cpu(i) {
        if (i != cpu)
                _cpu_down(i, 1, CPUHP_OFFLINE);
}
cpu_hotplug_disabled++;

cpu_maps_update_done();

And you may put this into a function defined outside of CONFIG_PM_SLEEP.
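
As a rough illustration only, the sequence above can be modeled in plain
userspace C.  This is a sketch under stated assumptions: every model_* name
and MODEL_NR_CPUS is made up for the example and is not a kernel interface,
and the locking done by cpu_maps_update_begin()/cpu_maps_update_done() is
left out of the model entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 8	/* hypothetical number of CPUs in the model */

/* Userspace stand-ins for kernel state; none of these are kernel symbols. */
static bool model_cpu_online[MODEL_NR_CPUS];
static int model_hotplug_disabled;

/* Stand-in for _cpu_down(): mark one CPU offline, return 0 on success. */
static int model_cpu_down(size_t cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	model_cpu_online[cpu] = false;
	return 0;
}

/*
 * Take every online CPU except @primary offline, then block further
 * hotplug, in the same order as the snippet above.
 * Returns the number of CPUs that were taken down.
 */
static int model_offline_secondary_cpus(size_t primary)
{
	int taken_down = 0;
	size_t i;

	assert(primary < MODEL_NR_CPUS);

	for (i = 0; i < MODEL_NR_CPUS; i++) {
		if (!model_cpu_online[i] || i == primary)
			continue;
		if (model_cpu_down(i) == 0)
			taken_down++;
	}
	model_hotplug_disabled++;

	return taken_down;
}
```

The model has no PM-specific logic in it at all, which is the point of
keeping this separate from freeze_secondary_cpus().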

>
> Change from v2:
> * Add another config instead of configed by CONFIG_HOTPLUG_CPU

So why exactly is this new Kconfig option needed?

Everybody supporting CPU hotplug seems to opt in anyway.

[cut]

>
> -int freeze_secondary_cpus(int primary)
> +int freeze_secondary_cpus(int primary, bool reboot)
>  {
> int cpu, error = 0;
>
> @@ -1237,11 +1237,13 @@ int freeze_secondary_cpus(int primary)
> if (cpu == primary)
> continue;
>
> -   if (pm_wakeup_pending()) {
> +#ifdef CONFIG_PM_SLEEP
> +   if (!reboot && pm_wakeup_pending()) {
> pr_info("Wakeup pending. Abort CPU freeze\n");
> error = -EBUSY;
> break;
> }
> +#endif

Please avoid using #ifdefs in function bodies.  This makes the code
hard to maintain in the long term.
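
One common way to do that is to keep the conditional compilation at the
definition of a small helper and give the disabled case a stub, so the
caller stays #ifdef-free.  A minimal userspace sketch, assuming a made-up
MODEL_PM_SLEEP switch in place of CONFIG_PM_SLEEP and made-up model_* names
in place of the real helpers:

```c
#include <assert.h>	/* only needed by callers that check the results */
#include <errno.h>
#include <stdbool.h>

/* Hypothetical build-time switch standing in for CONFIG_PM_SLEEP. */
#define MODEL_PM_SLEEP 1

#if MODEL_PM_SLEEP
/* Set by the (imaginary) wakeup machinery when an event is pending. */
static bool model_wakeup_flag;

static inline bool model_wakeup_pending(void)
{
	return model_wakeup_flag;
}
#else
/* Stub: with the feature compiled out, a wakeup is never pending. */
static inline bool model_wakeup_pending(void)
{
	return false;
}
#endif

/*
 * The function body contains no #ifdef.  Returns -EBUSY if a wakeup is
 * pending and this is not a reboot, 0 otherwise.
 */
static int model_check_before_cpu_down(bool reboot)
{
	if (!reboot && model_wakeup_pending())
		return -EBUSY;

	return 0;
}
```

With the stub in place the compiler can drop the dead branch on its own
when the feature is disabled.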

>
> trace_suspend_resume(TPS("CPU_OFF"), cpu, true);
> error = _cpu_down(cpu, 1, CPUHP_OFFLINE);
> @@ -1250,7 +1252,9 @@ int freeze_secondary_cpus(int primary)
> cpumask_set_cpu(cpu, frozen_cpus);
> else {
> pr_err("Error taking CPU%d down: %d\n", cpu, error);
> -   break;
> +   /* When rebooting, offline as many CPUs as possible. 
> */
> +   if (!reboot)
> +   break;
> }
> }
>
> diff --git a/kernel/reboot.c b/kernel/reboot.c
> index c4d472b7f1b4..12f643b66e57 100644
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -7,6 +7,7 @@
>
>  #define pr_fmt(fmt)"reboot: " fmt
>
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -220,7 +221,9 @@ void migrate_to_reboot_cpu(void)
> /* The boot cpu is always logical cpu 0 */
> int cpu = reboot_cpu;
>
> +#if !IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT)
> cpu_hotplug_disable();
> +#endif

You can write this as

if (!IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT))
        cpu_hotplug_disable();

That's what IS_ENABLED() is there for.
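
For illustration, here is a simplified userspace re-creation of the
preprocessor trick that makes this possible.  It is a sketch, not the
kernel's definition: all MODEL_* and model_* names and the two option names
are invented for the example, and the real macro in include/linux/kconfig.h
additionally treats an option built as a module (=m) as enabled.

```c
#include <assert.h>	/* only needed by callers that check the results */

/*
 * If the option is defined to 1, pasting it onto the placeholder prefix
 * yields "0," and shifts a literal 1 into second position; otherwise the
 * paste yields an unknown token and the second argument stays 0.
 */
#define MODEL_ARG_PLACEHOLDER_1 0,
#define MODEL_TAKE_SECOND_ARG(ignored, val, ...) val
#define MODEL_IS_DEFINED(x) MODEL_IS_DEFINED_1(x)
#define MODEL_IS_DEFINED_1(val) MODEL_IS_DEFINED_2(MODEL_ARG_PLACEHOLDER_##val)
#define MODEL_IS_DEFINED_2(arg1_or_junk) \
	MODEL_TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define MODEL_IS_ENABLED(option) MODEL_IS_DEFINED(option)

/* Hypothetical option that is switched on ("=y"). */
#define CONFIG_MODEL_OFFLINE_CPUS_ON_REBOOT 1
/* CONFIG_MODEL_OTHER_OPTION is deliberately left undefined. */

static int model_hotplug_disable_calls;

/* Stand-in for cpu_hotplug_disable(); it only counts invocations. */
static void model_cpu_hotplug_disable(void)
{
	model_hotplug_disable_calls++;
}

/* Both branches are parsed by the compiler; the dead one is discarded. */
static void model_migrate_to_reboot_cpu(void)
{
	if (!MODEL_IS_ENABLED(CONFIG_MODEL_OFFLINE_CPUS_ON_REBOOT))
		model_cpu_hotplug_disable();
}
```

Because the condition is an ordinary C expression, the disabled branch
still gets compile-tested, which an #if block would not give you.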

>
> /* Make certain the cpu I'm about to reboot on is online */
> if (!cpu_online(cpu))
> @@ -231,6 +234,11 @@ void migrate_to_reboot_cpu(void)
>
> /* Make certain I only run on the appropriate processor */
> set_cpus_allowed_ptr(current, cpumask_of(cpu));
> +
> +#if IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT)
> +   /* Offline other cpus if possible */
> +   freeze_secondary_cpus(cpu, true);
> +#endif

The above comment applies here too.

>  }
>
>  /**
> --

Re: [PATCH 12/23] y2038: syscalls: change remaining timeval to __kernel_old_timeval

2019-11-13 Thread Rafael J. Wysocki

On Friday, November 8, 2019 10:12:11 PM CET Arnd Bergmann wrote:
> All of the remaining syscalls that pass a timeval (gettimeofday, utime,
> futimesat) can trivially be changed to pass a __kernel_old_timeval
> instead, which has a compatible layout, but avoids ambiguity with
> the timeval type in user space.
> 
> Signed-off-by: Arnd Bergmann 

For the change in power/power.h

Acked-by: Rafael J. Wysocki 

> ---
>  arch/powerpc/include/asm/asm-prototypes.h |  3 ++-
>  arch/powerpc/kernel/syscalls.c|  4 ++--
>  fs/select.c   | 10 +-
>  fs/utimes.c   |  8 
>  include/linux/syscalls.h  | 10 +-
>  kernel/power/power.h  |  2 +-
>  kernel/time/time.c|  2 +-
>  7 files changed, 20 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/asm-prototypes.h 
> b/arch/powerpc/include/asm/asm-prototypes.h
> index 8561498e653c..2c25dc079cb9 100644
> --- a/arch/powerpc/include/asm/asm-prototypes.h
> +++ b/arch/powerpc/include/asm/asm-prototypes.h
> @@ -92,7 +92,8 @@ long sys_swapcontext(struct ucontext __user *old_ctx,
>  long sys_debug_setcontext(struct ucontext __user *ctx,
> int ndbg, struct sig_dbg_op __user *dbg);
>  int
> -ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user 
> *exp, struct timeval __user *tvp);
> +ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user 
> *exp,
> +struct __kernel_old_timeval __user *tvp);
>  unsigned long __init early_init(unsigned long dt_ptr);
>  void __init machine_init(u64 dt_ptr);
>  #endif
> diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
> index 3bfb3888e897..078608ec2e92 100644
> --- a/arch/powerpc/kernel/syscalls.c
> +++ b/arch/powerpc/kernel/syscalls.c
> @@ -79,7 +79,7 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
>   * sys_select() with the appropriate args. -- Cort
>   */
>  int
> -ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user 
> *exp, struct timeval __user *tvp)
> +ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user 
> *exp, struct __kernel_old_timeval __user *tvp)
>  {
>   if ( (unsigned long)n >= 4096 )
>   {
> @@ -89,7 +89,7 @@ ppc_select(int n, fd_set __user *inp, fd_set __user *outp, 
> fd_set __user *exp, s
>   || __get_user(inp, ((fd_set __user * __user *)(buffer+1)))
>   || __get_user(outp, ((fd_set  __user * __user *)(buffer+2)))
>   || __get_user(exp, ((fd_set  __user * __user *)(buffer+3)))
> - || __get_user(tvp, ((struct timeval  __user * __user 
> *)(buffer+4
> + || __get_user(tvp, ((struct __kernel_old_timeval  __user * 
> __user *)(buffer+4
>   return -EFAULT;
>   }
>   return sys_select(n, inp, outp, exp, tvp);
> diff --git a/fs/select.c b/fs/select.c
> index 53a0c149f528..11d0285d46b7 100644
> --- a/fs/select.c
> +++ b/fs/select.c
> @@ -321,7 +321,7 @@ static int poll_select_finish(struct timespec64 *end_time,
>   switch (pt_type) {
>   case PT_TIMEVAL:
>   {
> - struct timeval rtv;
> + struct __kernel_old_timeval rtv;
>  
>   if (sizeof(rtv) > sizeof(rtv.tv_sec) + 
> sizeof(rtv.tv_usec))
>   memset(&rtv, 0, sizeof(rtv));
> @@ -698,10 +698,10 @@ int core_sys_select(int n, fd_set __user *inp, fd_set 
> __user *outp,
>  }
>  
>  static int kern_select(int n, fd_set __user *inp, fd_set __user *outp,
> -fd_set __user *exp, struct timeval __user *tvp)
> +fd_set __user *exp, struct __kernel_old_timeval __user 
> *tvp)
>  {
>   struct timespec64 end_time, *to = NULL;
> - struct timeval tv;
> + struct __kernel_old_timeval tv;
>   int ret;
>  
>   if (tvp) {
> @@ -720,7 +720,7 @@ static int kern_select(int n, fd_set __user *inp, fd_set 
> __user *outp,
>  }
>  
>  SYSCALL_DEFINE5(select, int, n, fd_set __user *, inp, fd_set __user *, outp,
> - fd_set __user *, exp, struct timeval __user *, tvp)
> + fd_set __user *, exp, struct __kernel_old_timeval __user *, tvp)
>  {
>   return kern_select(n, inp, outp, exp, tvp);
>  }
> @@ -810,7 +810,7 @@ SYSCALL_DEFINE6(pselect6_time32, int, n, fd_set __user *, 
> inp, fd_set __user *,
>  struct sel_arg_struct {
>   unsigned long n;
>   fd_set __user *inp, *outp, *exp;
> - struct timeval __user *tvp;
> + struct __kernel_old_timeval __user *tvp;
>  };
>

Re: [PATCH 4/5] power: avs: smartreflex: Remove superfluous cast in debugfs_create_file() call

2019-11-13 Thread Rafael J. Wysocki

On Monday, October 21, 2019 4:51:48 PM CET Geert Uytterhoeven wrote:
> There is no need to cast a typed pointer to a void pointer when calling
> a function that accepts the latter.  Remove it, as the cast prevents
> further compiler checks.
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
>  drivers/power/avs/smartreflex.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/power/avs/smartreflex.c b/drivers/power/avs/smartreflex.c
> index 4684e7df833a81e9..5376f3d22f31eade 100644
> --- a/drivers/power/avs/smartreflex.c
> +++ b/drivers/power/avs/smartreflex.c
> @@ -905,7 +905,7 @@ static int omap_sr_probe(struct platform_device *pdev)
>   sr_info->dbg_dir = debugfs_create_dir(sr_info->name, sr_dbg_dir);
>  
>   debugfs_create_file("autocomp", S_IRUGO | S_IWUSR, sr_info->dbg_dir,
> - (void *)sr_info, &pm_sr_fops);
> + sr_info, &pm_sr_fops);
>   debugfs_create_x32("errweight", S_IRUGO, sr_info->dbg_dir,
>  &sr_info->err_weight);
>   debugfs_create_x32("errmaxlimit", S_IRUGO, sr_info->dbg_dir,
> 

Applying as 5.5 material, thanks!

Re: [PATCH 4/5] power: avs: smartreflex: Remove superfluous cast in debugfs_create_file() call

2019-11-08 Thread Rafael J. Wysocki

On Monday, October 21, 2019 4:51:48 PM CET Geert Uytterhoeven wrote:
> There is no need to cast a typed pointer to a void pointer when calling
> a function that accepts the latter.  Remove it, as the cast prevents
> further compiler checks.
> 
> Signed-off-by: Geert Uytterhoeven 

Greg, have you taken this one by any chance?

> ---
>  drivers/power/avs/smartreflex.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/power/avs/smartreflex.c b/drivers/power/avs/smartreflex.c
> index 4684e7df833a81e9..5376f3d22f31eade 100644
> --- a/drivers/power/avs/smartreflex.c
> +++ b/drivers/power/avs/smartreflex.c
> @@ -905,7 +905,7 @@ static int omap_sr_probe(struct platform_device *pdev)
>   sr_info->dbg_dir = debugfs_create_dir(sr_info->name, sr_dbg_dir);
>  
>   debugfs_create_file("autocomp", S_IRUGO | S_IWUSR, sr_info->dbg_dir,
> - (void *)sr_info, &pm_sr_fops);
> + sr_info, &pm_sr_fops);
>   debugfs_create_x32("errweight", S_IRUGO, sr_info->dbg_dir,
>  &sr_info->err_weight);
>   debugfs_create_x32("errmaxlimit", S_IRUGO, sr_info->dbg_dir,
>

Re: [PATCH v3] cpufreq: powernv: fix stack bloat and hard limit on num cpus

2019-11-04 Thread Rafael J. Wysocki

On Thursday, October 31, 2019 6:21:59 AM CET John Hubbard wrote:
> The following build warning occurred on powerpc 64-bit builds:
> 
> drivers/cpufreq/powernv-cpufreq.c: In function 'init_chip_info':
> drivers/cpufreq/powernv-cpufreq.c:1070:1: warning: the frame size of
> 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]
> 
> This is with a cross-compiler based on gcc 8.1.0, which I got from:
>   https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/8.1.0/
> 
> The warning is due to putting 1024 bytes on the stack:
> 
> unsigned int chip[256];
> 
> ...and it's also undesirable to have a hard limit on the number of
> CPUs here.
> 
> Fix both problems by dynamically allocating based on num_possible_cpus,
> as recommended by Michael Ellerman.
> 
> Fixes: 053819e0bf840 ("cpufreq: powernv: Handle throttling due to Pmax 
> capping at chip level")
> Cc: Michael Ellerman 
> Cc: Shilpasri G Bhat 
> Cc: Preeti U Murthy 
> Cc: Viresh Kumar 
> Cc: Rafael J. Wysocki 
> Cc: linux...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: John Hubbard 
> Acked-by: Viresh Kumar 
> ---
> 
> Changes since v2: applied fixes from Michael Ellerman's review:
> 
> * Changed from CONFIG_NR_CPUS to num_possible_cpus()
> 
> * Fixed up commit description: added a note about exactly which
>   compiler generates the warning. And softened up wording about
>   the limitation on number of CPUs.
> 
> Changes since v1: includes Viresh's review commit fixes.

Applying as 5.5 material, thanks!

Re: [PATCH v2] cpufreq: powernv: fix stack bloat and NR_CPUS limitation

2019-10-28 Thread Rafael J. Wysocki

On Friday, October 18, 2019 7:07:12 AM CET Viresh Kumar wrote:
> On 17-10-19, 21:55, John Hubbard wrote:
> > The following build warning occurred on powerpc 64-bit builds:
> > 
> > drivers/cpufreq/powernv-cpufreq.c: In function 'init_chip_info':
> > drivers/cpufreq/powernv-cpufreq.c:1070:1: warning: the frame size of 1040 
> > bytes is larger than 1024 bytes [-Wframe-larger-than=]
> > 
> > This is due to putting 1024 bytes on the stack:
> > 
> > unsigned int chip[256];
> > 
> > ...and while looking at this, it also has a bug: it fails with a stack
> > overrun, if CONFIG_NR_CPUS > 256.
> > 
> > Fix both problems by dynamically allocating based on CONFIG_NR_CPUS.
> > 
> > Fixes: 053819e0bf840 ("cpufreq: powernv: Handle throttling due to Pmax 
> > capping at chip level")
> > Cc: Shilpasri G Bhat 
> > Cc: Preeti U Murthy 
> > Cc: Viresh Kumar 
> > Cc: Rafael J. Wysocki 
> > Cc: linux...@vger.kernel.org
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Signed-off-by: John Hubbard 
> > ---
> > 
> > Changes since v1: includes Viresh's review commit fixes.
> > 
> >  drivers/cpufreq/powernv-cpufreq.c | 17 +
> >  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> Acked-by: Viresh Kumar 
> 
> 

Applying as 5.5 material, thanks!

Re: [PATCH v9 1/3] PM: wakeup: Add routine to help fetch wakeup source object.

2019-10-23 Thread Rafael J. Wysocki

On Wed, Oct 23, 2019 at 10:24 AM Ran Wang  wrote:
>
> Some users might want to go through all registered wakeup sources
> and act on them accordingly. For example, an SoC PM driver might need to
> do HW programming to prevent powering down a specific IP that a wakeup
> source depends on. So add this API to help walk through all registered
> wakeup source objects on that list and return them one by one.
>
> Signed-off-by: Ran Wang 
> Tested-by: Leonard Crestez 

OK, thanks for making all of the requested changes:

Reviewed-by: Rafael J. Wysocki 

and please feel free to push this through the appropriate
arch/platform tree.  Alternatively, please let me know if you want me
to take this series, but then I need an ACK from the appropriate
maintainer(s) on patch 3.

> ---
> Change in v9:
> - Supplement comments for wakeup_sources_read_lock(),
>   wakeup_sources_read_unlock, wakeup_sources_walk_start and
>   wakeup_sources_walk_next().
>
> Change in v8:
> - Rename wakeup_source_get_next() to wakeup_sources_walk_next().
> - Add wakeup_sources_read_lock() to take over locking job of
>   wakeup_source_get_start().
> - Rename wakeup_source_get_start() to wakeup_sources_walk_start().
> - Replace wakeup_source_get_stop() with wakeup_sources_read_unlock().
> - Define macro for_each_wakeup_source(ws).
>
> Change in v7:
> - Remove define of member *dev in wake_irq to fix conflict with commit
> c8377adfa781 ("PM / wakeup: Show wakeup sources stats in sysfs"), user
> will use ws->dev->parent instead.
> - Remove '#include ' because it is not used.
>
> Change in v6:
> - Add wakeup_source_get_start() and wakeup_source_get_stop() to align
> with wakeup_sources_stats_seq_start/next/stop.
>
> Change in v5:
> - Update commit message, add description of walk through all wakeup
> source objects.
> - Add SCU protection in function wakeup_source_get_next().
> - Rename wakeup_source member 'attached_dev' to 'dev' and move it up
> (before wakeirq).
>
> Change in v4:
> - None.
>
> Change in v3:
> - Adjust indentation of *attached_dev;.
>
> Change in v2:
> - None.
>
>  drivers/base/power/wakeup.c | 54 
> +
>  include/linux/pm_wakeup.h   |  9 
>  2 files changed, 63 insertions(+)
>
> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> index 5817b51..70a9edb 100644
> --- a/drivers/base/power/wakeup.c
> +++ b/drivers/base/power/wakeup.c
> @@ -248,6 +248,60 @@ void wakeup_source_unregister(struct wakeup_source *ws)
>  EXPORT_SYMBOL_GPL(wakeup_source_unregister);
>
>  /**
> + * wakeup_sources_read_lock - Lock wakeup source list for read.
> + *
> + * Returns an index of srcu lock for struct wakeup_srcu.
> + * This index must be passed to the matching wakeup_sources_read_unlock().
> + */
> +int wakeup_sources_read_lock(void)
> +{
> +   return srcu_read_lock(&wakeup_srcu);
> +}
> +EXPORT_SYMBOL_GPL(wakeup_sources_read_lock);
> +
> +/**
> + * wakeup_sources_read_unlock - Unlock wakeup source list.
> + * @idx: return value from corresponding wakeup_sources_read_lock()
> + */
> +void wakeup_sources_read_unlock(int idx)
> +{
> +   srcu_read_unlock(&wakeup_srcu, idx);
> +}
> +EXPORT_SYMBOL_GPL(wakeup_sources_read_unlock);
> +
> +/**
> + * wakeup_sources_walk_start - Begin a walk on wakeup source list
> + *
> + * Returns first object of the list of wakeup sources.
> + *
> + * Note that to be safe, wakeup sources list needs to be locked by calling
> + * wakeup_source_read_lock() for this.
> + */
> +struct wakeup_source *wakeup_sources_walk_start(void)
> +{
> +   struct list_head *ws_head = &wakeup_sources;
> +
> +   return list_entry_rcu(ws_head->next, struct wakeup_source, entry);
> +}
> +EXPORT_SYMBOL_GPL(wakeup_sources_walk_start);
> +
> +/**
> + * wakeup_sources_walk_next - Get next wakeup source from the list
> + * @ws: Previous wakeup source object
> + *
> + * Note that to be safe, wakeup sources list needs to be locked by calling
> + * wakeup_source_read_lock() for this.
> + */
> +struct wakeup_source *wakeup_sources_walk_next(struct wakeup_source *ws)
> +{
> +   struct list_head *ws_head = &wakeup_sources;
> +
> +   return list_next_or_null_rcu(ws_head, &ws->entry,
> +   struct wakeup_source, entry);
> +}
> +EXPORT_SYMBOL_GPL(wakeup_sources_walk_next);
> +
> +/**
>   * device_wakeup_attach - Attach a wakeup source object to a device object.
>   * @dev: Device to handle.
>   * @ws: Wakeup source object to

Re: [PATCH v9 3/3] soc: fsl: add RCPM driver

2019-10-23 Thread Rafael J. Wysocki

> +   struct rcpm *rcpm;
> +   struct device_node  *np = dev->of_node;
> +   u32 value[RCPM_WAKEUP_CELL_MAX_SIZE + 1];
> +
> +   rcpm = dev_get_drvdata(dev);
> +   if (!rcpm)
> +   return -EINVAL;
> +
> +   base = rcpm->ippdexpcr_base;
> +   idx = wakeup_sources_read_lock();
> +
> +   /* Begin with first registered wakeup source */
> +   for_each_wakeup_source(ws) {
> +
> +   /* skip object which is not attached to device */
> +   if (!ws->dev || !ws->dev->parent)
> +   continue;
> +
> +   ret = device_property_read_u32_array(ws->dev->parent,
> +   "fsl,rcpm-wakeup", value,
> +   rcpm->wakeup_cells + 1);
> +
> +   /*  Wakeup source should refer to current rcpm device */
> +   if (ret || (np->phandle != value[0])) {
> +   pr_debug("%s doesn't refer to this rcpm\n", ws->name);

I'm still quite unsure why it is useful to print this message instead
of printing one when the wakeup source does match (there may be many
wakeup source objects you don't care about in principle), but
whatever.

> +   continue;
> +   }
> +
> +   /* Property "#fsl,rcpm-wakeup-cells" of rcpm node defines the
> +* number of IPPDEXPCR register cells, and "fsl,rcpm-wakeup"
> +* of wakeup source IP contains an integer array: <phandle to
> +* RCPM node, IPPDEXPCR0 setting, IPPDEXPCR1 setting,
> +* IPPDEXPCR2 setting, etc>.
> +*
> +* So we will go through them and do programming accordingly.
> +*/
> +   for (i = 0; i < rcpm->wakeup_cells; i++) {
> +   u32 tmp = value[i + 1];
> +   void __iomem *address = base + i * 4;
> +
> +   if (!tmp)
> +   continue;
> +
> +   /* We can only OR related bits */
> +   if (rcpm->little_endian) {
> +   tmp |= ioread32(address);
> +   iowrite32(tmp, address);
> +   } else {
> +   tmp |= ioread32be(address);
> +   iowrite32be(tmp, address);
> +   }
> +   }
> +   }
> +
> +   wakeup_sources_read_unlock(idx);
> +
> +   return 0;
> +}
> +
> +static const struct dev_pm_ops rcpm_pm_ops = {
> +   .prepare =  rcpm_pm_prepare,
> +};

For the above:

Reviewed-by: Rafael J. Wysocki

Re: [PATCH 1/3] PM: wakeup: Add routine to help fetch wakeup source object.

2019-10-22 Thread Rafael J. Wysocki

On Tue, Oct 22, 2019 at 9:51 AM Ran Wang  wrote:
>
> Some users might want to go through all registered wakeup sources
> and act on them accordingly. For example, an SoC PM driver might need to
> do HW programming to prevent powering down a specific IP that a wakeup
> source depends on. So add this API to help walk through all registered
> wakeup source objects on that list and return them one by one.
>
> Signed-off-by: Ran Wang 
> Tested-by: Leonard Crestez 
> ---
> Change in v8
> - Rename wakeup_source_get_next() to wakeup_sources_walk_next().
> - Add wakeup_sources_read_lock() to take over locking job of
>   wakeup_source_get_star().
> - Rename wakeup_source_get_start() to wakeup_sources_walk_start().
> - Replace wakeup_source_get_stop() with wakeup_sources_read_unlock().
> - Define macro for_each_wakeup_source(ws).
>
> Change in v7:
> - Remove define of member *dev in wake_irq to fix conflict with commit
> c8377adfa781 ("PM / wakeup: Show wakeup sources stats in sysfs"), user
> will use ws->dev->parent instead.
> - Remove '#include ' because it is not used.
>
> Change in v6:
> - Add wakeup_source_get_star() and wakeup_source_get_stop() to aligned
> with wakeup_sources_stats_seq_start/nex/stop.
>
> Change in v5:
> - Update commit message, add decription of walk through all wakeup
> source objects.
> - Add SCU protection in function wakeup_source_get_next().
> - Rename wakeup_source member 'attached_dev' to 'dev' and move it up
> (before wakeirq).
>
> Change in v4:
> - None.
>
> Change in v3:
> - Adjust indentation of *attached_dev;.
>
> Change in v2:
> - None.
>
>  drivers/base/power/wakeup.c | 42 ++
>  include/linux/pm_wakeup.h   |  9 +
>  2 files changed, 51 insertions(+)
>
> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> index 5817b51..8c7a5f9 100644
> --- a/drivers/base/power/wakeup.c
> +++ b/drivers/base/power/wakeup.c
> @@ -248,6 +248,48 @@ void wakeup_source_unregister(struct wakeup_source *ws)
>  EXPORT_SYMBOL_GPL(wakeup_source_unregister);
>
>  /**
> + * wakeup_sources_read_lock - Lock wakeup source list for read.

Please document the return value.

> + */
> +int wakeup_sources_read_lock(void)
> +{
> +   return srcu_read_lock(&wakeup_srcu);
> +}
> +EXPORT_SYMBOL_GPL(wakeup_sources_read_lock);
> +
> +/**
> + * wakeup_sources_read_unlock - Unlock wakeup source list.

Please document the argument.

> + */
> +void wakeup_sources_read_unlock(int idx)
> +{
> +   srcu_read_unlock(&wakeup_srcu, idx);
> +}
> +EXPORT_SYMBOL_GPL(wakeup_sources_read_unlock);
> +
> +/**
> + * wakeup_sources_walk_start - Begin a walk on wakeup source list

Please document the return value and add a note that the wakeup
sources list needs to be locked for reading for this to be safe.

> + */
> +struct wakeup_source *wakeup_sources_walk_start(void)
> +{
> +   struct list_head *ws_head = &wakeup_sources;
> +
> +   return list_entry_rcu(ws_head->next, struct wakeup_source, entry);
> +}
> +EXPORT_SYMBOL_GPL(wakeup_sources_walk_start);
> +
> +/**
> + * wakeup_sources_walk_next - Get next wakeup source from the list
> + * @ws: Previous wakeup source object

Please add a note that the wakeup sources list needs to be locked for
reading for this to be safe.

> + */
> +struct wakeup_source *wakeup_sources_walk_next(struct wakeup_source *ws)
> +{
> +   struct list_head *ws_head = &wakeup_sources;
> +
> +   return list_next_or_null_rcu(ws_head, &ws->entry,
> +   struct wakeup_source, entry);
> +}
> +EXPORT_SYMBOL_GPL(wakeup_sources_walk_next);
> +
> +/**
>   * device_wakeup_attach - Attach a wakeup source object to a device object.
>   * @dev: Device to handle.
>   * @ws: Wakeup source object to attach to @dev.
> diff --git a/include/linux/pm_wakeup.h b/include/linux/pm_wakeup.h
> index 661efa0..aa3da66 100644
> --- a/include/linux/pm_wakeup.h
> +++ b/include/linux/pm_wakeup.h
> @@ -63,6 +63,11 @@ struct wakeup_source {
> bool autosleep_enabled:1;
>  };
>
> +#define for_each_wakeup_source(ws) \
> +   for ((ws) = wakeup_sources_walk_start();\
> +(ws);  \
> +(ws) = wakeup_sources_walk_next((ws)))
> +
>  #ifdef CONFIG_PM_SLEEP
>
>  /*
> @@ -92,6 +97,10 @@ extern void wakeup_source_remove(struct wakeup_source *ws);
>  extern struct wakeup_source *wakeup_source_register(struct device *dev,
> const char *name);
>  extern void wakeup_source_unregister(struct wakeup_source *ws);
> +extern int wakeup_sources_read_lock(void);
> +extern void wakeup_sources_read_unlock(int idx);
> +extern struct wakeup_source *wakeup_sources_walk_start(void);
> +extern struct wakeup_source *wakeup_sources_walk_next(struct wakeup_source *ws);
>  extern int

Re: [PATCH 3/3] soc: fsl: add RCPM driver

2019-10-22 Thread Rafael J. Wysocki

On Tue, Oct 22, 2019 at 9:52 AM Ran Wang  wrote:
>
> The NXP's QorIQ Processors based on ARM Core have RCPM module
> (Run Control and Power Management), which performs system level
> tasks associated with power management such as wakeup source control.
>
> This driver depends on PM wakeup source framework which help to
> collect wake information.
>
> Signed-off-by: Ran Wang 
> ---
> Change in v8:
> - Adjust related API usage to meet wakeup.c's update in patch 1/3.
> - Add sanity checking for the case of ws->dev or ws->dev->parent
>   is null.
>
> Change in v7:
> - Replace 'ws->dev' with 'ws->dev->parent' to get aligned with
> c8377adfa781 ("PM / wakeup: Show wakeup sources stats in sysfs")
> - Remove '+obj-y += ftm_alarm.o' since it is wrong.
> - Cosmetic work.
>
> Change in v6:
> - Adjust related API usage to meet wakeup.c's update in patch 1/3.
>
> Change in v5:
> - Fix v4 regression of the return value of wakeup_source_get_next()
> didn't pass to ws in while loop.
> - Rename wakeup_source member 'attached_dev' to 'dev'.
> - Rename property 'fsl,#rcpm-wakeup-cells' to '#fsl,rcpm-wakeup-cells'.
> please see https://lore.kernel.org/patchwork/patch/1101022/
>
> Change in v4:
> - Remove extra ',' in author line of rcpm.c
> - Update usage of wakeup_source_get_next() to be less confusing to the
> reader, code logic remain the same.
>
> Change in v3:
> - Some whitespace ajdustment.
>
> Change in v2:
> - Rebase Kconfig and Makefile update to latest mainline.
>
>  drivers/soc/fsl/Kconfig  |   8 +++
>  drivers/soc/fsl/Makefile |   1 +
>  drivers/soc/fsl/rcpm.c   | 133 +++
>  3 files changed, 142 insertions(+)
>  create mode 100644 drivers/soc/fsl/rcpm.c
>
> diff --git a/drivers/soc/fsl/Kconfig b/drivers/soc/fsl/Kconfig
> index f9ad8ad..4918856 100644
> --- a/drivers/soc/fsl/Kconfig
> +++ b/drivers/soc/fsl/Kconfig
> @@ -40,4 +40,12 @@ config DPAA2_CONSOLE
>   /dev/dpaa2_mc_console and /dev/dpaa2_aiop_console,
>   which can be used to dump the Management Complex and AIOP
>   firmware logs.
> +
> +config FSL_RCPM
> +   bool "Freescale RCPM support"
> +   depends on PM_SLEEP
> +   help
> + The NXP QorIQ Processors based on ARM Core have RCPM module
> + (Run Control and Power Management), which performs all device-level
> + tasks associated with power management, such as wakeup source control.
>  endmenu
> diff --git a/drivers/soc/fsl/Makefile b/drivers/soc/fsl/Makefile
> index 71dee8d..906f1cd 100644
> --- a/drivers/soc/fsl/Makefile
> +++ b/drivers/soc/fsl/Makefile
> @@ -6,6 +6,7 @@
>  obj-$(CONFIG_FSL_DPAA) += qbman/
>  obj-$(CONFIG_QUICC_ENGINE) += qe/
>  obj-$(CONFIG_CPM)  += qe/
> +obj-$(CONFIG_FSL_RCPM) += rcpm.o
>  obj-$(CONFIG_FSL_GUTS) += guts.o
>  obj-$(CONFIG_FSL_MC_DPIO)  += dpio/
>  obj-$(CONFIG_DPAA2_CONSOLE)+= dpaa2-console.o
> diff --git a/drivers/soc/fsl/rcpm.c b/drivers/soc/fsl/rcpm.c
> new file mode 100644
> index 000..3ed135e
> --- /dev/null
> +++ b/drivers/soc/fsl/rcpm.c
> @@ -0,0 +1,133 @@
> +// SPDX-License-Identifier: GPL-2.0
> +//
> +// rcpm.c - Freescale QorIQ RCPM driver
> +//
> +// Copyright 2019 NXP
> +//
> +// Author: Ran Wang 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define RCPM_WAKEUP_CELL_MAX_SIZE  7
> +
> +struct rcpm {
> +   unsigned int wakeup_cells;
> +   void __iomem *ippdexpcr_base;
> +   bool little_endian;
> +};
> +

Please add a kerneldoc comment describing this routine.

> +static int rcpm_pm_prepare(struct device *dev)
> +{
> +   int i, ret, idx;
> +   void __iomem *base;
> +   struct wakeup_source *ws;
> +   struct rcpm *rcpm;
> +   struct device_node  *np = dev->of_node;
> +   u32 value[RCPM_WAKEUP_CELL_MAX_SIZE + 1], tmp;
> +
> +   rcpm = dev_get_drvdata(dev);
> +   if (!rcpm)
> +   return -EINVAL;
> +
> +   base = rcpm->ippdexpcr_base;
> +   idx = wakeup_sources_read_lock();
> +
> +   /* Begin with first registered wakeup source */
> +   for_each_wakeup_source(ws) {
> +
> +   /* skip object which is not attached to device */
> +   if (!ws->dev || !ws->dev->parent)
> +   continue;
> +
> +   ret = device_property_read_u32_array(ws->dev->parent,
> +   "fsl,rcpm-wakeup", value,
> +   rcpm->wakeup_cells + 1);
> +
> +   /*  Wakeup source should refer to current rcpm device */
> +   if (ret || (np->phandle != value[0])) {
> +   dev_info(dev, "%s doesn't refer to this rcpm\n",
> +

Re: [PATCH v7 1/3] PM: wakeup: Add routine to help fetch wakeup source object.

2019-10-21 Thread Rafael J. Wysocki

On Mon, Oct 21, 2019 at 5:49 AM Ran Wang  wrote:
>
> Some user might want to go through all registered wakeup sources
> and doing things accordingly. For example, SoC PM driver might need to
> do HW programming to prevent powering down specific IP which wakeup
> source depending on. So add this API to help walk through all registered
> wakeup source objects on that list and return them one by one.
>
> Signed-off-by: Ran Wang 
> Tested-by: Leonard Crestez 
> ---
> Change in v7:
> - Remove define of member *dev in wake_irq to fix conflict with commit
> c8377adfa781 ("PM / wakeup: Show wakeup sources stats in sysfs"), user
> will use ws->dev->parent instead.
> - Remove '#include ' because it is not used.
>
> Change in v6:
> - Add wakeup_source_get_star() and wakeup_source_get_stop() to aligned
> with wakeup_sources_stats_seq_start/nex/stop.
>
> Change in v5:
> - Update commit message, add decription of walk through all wakeup
> source objects.
> - Add SCU protection in function wakeup_source_get_next().
> - Rename wakeup_source member 'attached_dev' to 'dev' and move it up
> (before wakeirq).
>
> Change in v4:
> - None.
>
> Change in v3:
> - Adjust indentation of *attached_dev;.
>
> Change in v2:
> - None.
>
>  drivers/base/power/wakeup.c | 37 +
>  include/linux/pm_wakeup.h   |  3 +++
>  2 files changed, 40 insertions(+)
>
> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> index 5817b51..dee1b09 100644
> --- a/drivers/base/power/wakeup.c
> +++ b/drivers/base/power/wakeup.c
> @@ -248,6 +248,43 @@ void wakeup_source_unregister(struct wakeup_source *ws)
>  EXPORT_SYMBOL_GPL(wakeup_source_unregister);
>
>  /**
> + * wakeup_source_get_star - Begin a walk on wakeup source list

The "get" in the name suggests acquiring a reference of some kind
which doesn't happen here.

What about renaming it to wakeup_sources_walk_start()?

> + * @srcuidx: Lock index allocated for this caller.
> + */
> +struct wakeup_source *wakeup_source_get_start(int *srcuidx)

I don't quite like the calling convention here with passing an int
pointer to get the SRCU index back.

What about splitting this into, say, wakeup_sources_read_lock() (that
will return the SRCU index) and wakeup_sources_walk_start() (that will
return the first list entry)?

Then, you could do something like

idx = wakeup_sources_read_lock();

ws = wakeup_sources_walk_start();
while (ws) {

stuff

ws = wakeup_sources_walk_next();
}

wakeup_sources_read_unlock(idx);

Or even define for_each_wakeup_source(ws) as

for (ws = wakeup_sources_walk_start(); ws; ws = wakeup_sources_walk_next())

and use that under a _read_lock()/_read_unlock() pair?
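
A minimal userspace sketch of the calling convention proposed above, for illustration only. Everything named ws_mock_* (and the for_each_ws_mock() macro) is a hypothetical stand-in and not kernel API: the mock list is a plain singly linked list and the mock lock is just a counter. The point is only the shape of the API: take the read lock, walk with start/next, then unlock with the index returned by the lock call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct wakeup_source. */
struct ws_mock {
	const char *name;
	struct ws_mock *next;
};

static struct ws_mock *ws_list_head;	/* mock list of registered sources */
static int ws_readers;			/* number of active mock read locks */

/* Mock of the proposed read-lock call: returns an index for unlock. */
static int ws_mock_read_lock(void)
{
	return ws_readers++;
}

/* Mock of the proposed read-unlock call. */
static void ws_mock_read_unlock(int idx)
{
	(void)idx;
	ws_readers--;
}

/* Mock of the proposed walk-start call: first entry or NULL. */
static struct ws_mock *ws_mock_walk_start(void)
{
	return ws_list_head;
}

/* Mock of the proposed walk-next call: entry after @ws or NULL. */
static struct ws_mock *ws_mock_walk_next(struct ws_mock *ws)
{
	return ws->next;
}

#define for_each_ws_mock(ws) \
	for ((ws) = ws_mock_walk_start(); (ws); (ws) = ws_mock_walk_next(ws))

/* Count the entries while holding the read lock for the whole walk. */
static int ws_mock_count(void)
{
	struct ws_mock *ws;
	int idx, n = 0;

	idx = ws_mock_read_lock();
	for_each_ws_mock(ws)
		n++;
	ws_mock_read_unlock(idx);

	return n;
}
```

Note that the lock is held across the entire loop, so every object returned by the walk is only touched inside the read-side section.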

Re: [PATCH v5 1/3] PM: wakeup: Add routine to help fetch wakeup source object.

2019-08-19 Thread Rafael J. Wysocki

On Monday, August 19, 2019 10:33:25 AM CEST Ran Wang wrote:
> Hi Rafael,
> 
> On Monday, August 19, 2019 16:20, Rafael J. Wysocki wrote:
> > 
> > On Mon, Aug 19, 2019 at 10:15 AM Ran Wang  wrote:
> > >
> > > Hi Rafael,
> > >
> > > On Monday, August 05, 2019 17:59, Rafael J. Wysocki wrote:
> > > >
> > > > On Wednesday, July 24, 2019 9:47:20 AM CEST Ran Wang wrote:
> > > > > Some user might want to go through all registered wakeup sources
> > > > > and doing things accordingly. For example, SoC PM driver might
> > > > > need to do HW programming to prevent powering down specific IP
> > > > > which wakeup source depending on. So add this API to help walk
> > > > > through all registered wakeup source objects on that list and return 
> > > > > them
> > one by one.
> > > > >
> > > > > Signed-off-by: Ran Wang 
> > > > > ---
> > > > > Change in v5:
> > > > > - Update commit message, add decription of walk through all wakeup
> > > > > source objects.
> > > > > - Add SCU protection in function wakeup_source_get_next().
> > > > > - Rename wakeup_source member 'attached_dev' to 'dev' and move
> > > > > it
> > > > up
> > > > > (before wakeirq).
> > > > >
> > > > > Change in v4:
> > > > > - None.
> > > > >
> > > > > Change in v3:
> > > > > - Adjust indentation of *attached_dev;.
> > > > >
> > > > > Change in v2:
> > > > > - None.
> > > > >
> > > > >  drivers/base/power/wakeup.c | 24 
> > > > >  include/linux/pm_wakeup.h   |  3 +++
> > > > >  2 files changed, 27 insertions(+)
> > > > >
> > > > > diff --git a/drivers/base/power/wakeup.c
> > > > > b/drivers/base/power/wakeup.c index ee31d4f..2fba891 100644
> > > > > --- a/drivers/base/power/wakeup.c
> > > > > +++ b/drivers/base/power/wakeup.c
> > > > > @@ -14,6 +14,7 @@
> > > > >  #include 
> > > > >  #include 
> > > > >  #include 
> > > > > +#include 
> > > > >  #include 
> > > > >  #include 
> > > > >
> > > > > @@ -226,6 +227,28 @@ void wakeup_source_unregister(struct
> > > > wakeup_source *ws)
> > > > > }
> > > > >  }
> > > > >  EXPORT_SYMBOL_GPL(wakeup_source_unregister);
> > > > > +/**
> > > > > + * wakeup_source_get_next - Get next wakeup source from the list
> > > > > + * @ws: Previous wakeup source object, null means caller want first 
> > > > > one.
> > > > > + */
> > > > > +struct wakeup_source *wakeup_source_get_next(struct wakeup_source
> > > > > +*ws) {
> > > > > > +   struct list_head *ws_head = &wakeup_sources;
> > > > > > +   struct wakeup_source *next_ws = NULL;
> > > > > > +   int idx;
> > > > > > +
> > > > > > +   idx = srcu_read_lock(&wakeup_srcu);
> > > > > > +   if (ws)
> > > > > > +   next_ws = list_next_or_null_rcu(ws_head, &ws->entry,
> > > > > > +   struct wakeup_source, entry);
> > > > > > +   else
> > > > > > +   next_ws = list_entry_rcu(ws_head->next,
> > > > > > +   struct wakeup_source, entry);
> > > > > > +   srcu_read_unlock(&wakeup_srcu, idx);
> > > > > +
> > > >
> > > > This is incorrect.
> > > >
> > > > The SRCU cannot be unlocked until the caller of this is done with
> > > > the object returned by it, or that object can be freed while it is 
> > > > still being
> > accessed.
> > >
> > > Thanks for the comment. Looks like I was not fully understanding your
> > > point on
> > > v4 discussion. So I will implement 3 APIs by referring
> > > wakeup_sources_stats_seq_start/next/stop()
> > >
> > > > Besides, this patch conflicts with some general wakeup sources
> > > > changes in the works, so it needs to be deferred and rebased on top of 
> > > > those
> > changes.
> > >
> > > Could you please tell me which is the right code base I should developing 
> > > on?
> > > I just tried applying v5 patch on latest
> > > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git branch master
> > (d1abaeb Linux 5.3-rc5) and no conflict encountered.
> > 
> > It is better to use the most recent -rc from Linus (5.3-rc5 as of
> > today) as the base unless your patches depend on some changes that are not 
> > in
> > there.
> 
> OK, So I need to implement on latest 
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git branch 
> master, am I right?
> 
> However, I just checked v5.3-rc5 code and found it has the same HEAD (d1abaeb 
> Linux 5.3-rc5
> on which I did not observe v5 patch apply conflict, did I miss something? 
> Thanks.

The conflict I mentioned earlier was with another patch series in the works
which is not in 5.3-rc5.  However, there are problems with that series and it
is not even in linux-next now, so please just base your series on top of -rc5.

Re: [PATCH v5 1/3] PM: wakeup: Add routine to help fetch wakeup source object.

2019-08-19 Thread Rafael J. Wysocki

On Mon, Aug 19, 2019 at 10:15 AM Ran Wang  wrote:
>
> Hi Rafael,
>
> On Monday, August 05, 2019 17:59, Rafael J. Wysocki wrote:
> >
> > On Wednesday, July 24, 2019 9:47:20 AM CEST Ran Wang wrote:
> > > Some user might want to go through all registered wakeup sources and
> > > doing things accordingly. For example, SoC PM driver might need to do
> > > HW programming to prevent powering down specific IP which wakeup
> > > source depending on. So add this API to help walk through all
> > > registered wakeup source objects on that list and return them one by one.
> > >
> > > Signed-off-by: Ran Wang 
> > > ---
> > > Change in v5:
> > > - Update commit message, add decription of walk through all wakeup
> > > source objects.
> > > - Add SCU protection in function wakeup_source_get_next().
> > > - Rename wakeup_source member 'attached_dev' to 'dev' and move it
> > up
> > > (before wakeirq).
> > >
> > > Change in v4:
> > > - None.
> > >
> > > Change in v3:
> > > - Adjust indentation of *attached_dev;.
> > >
> > > Change in v2:
> > > - None.
> > >
> > >  drivers/base/power/wakeup.c | 24 
> > >  include/linux/pm_wakeup.h   |  3 +++
> > >  2 files changed, 27 insertions(+)
> > >
> > > diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> > > index ee31d4f..2fba891 100644
> > > --- a/drivers/base/power/wakeup.c
> > > +++ b/drivers/base/power/wakeup.c
> > > @@ -14,6 +14,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  #include 
> > >  #include 
> > >
> > > @@ -226,6 +227,28 @@ void wakeup_source_unregister(struct
> > wakeup_source *ws)
> > > }
> > >  }
> > >  EXPORT_SYMBOL_GPL(wakeup_source_unregister);
> > > +/**
> > > + * wakeup_source_get_next - Get next wakeup source from the list
> > > + * @ws: Previous wakeup source object, null means caller want first one.
> > > + */
> > > +struct wakeup_source *wakeup_source_get_next(struct wakeup_source
> > > +*ws) {
> > > > +   struct list_head *ws_head = &wakeup_sources;
> > > > +   struct wakeup_source *next_ws = NULL;
> > > > +   int idx;
> > > > +
> > > > +   idx = srcu_read_lock(&wakeup_srcu);
> > > > +   if (ws)
> > > > +   next_ws = list_next_or_null_rcu(ws_head, &ws->entry,
> > > > +   struct wakeup_source, entry);
> > > > +   else
> > > > +   next_ws = list_entry_rcu(ws_head->next,
> > > > +   struct wakeup_source, entry);
> > > > +   srcu_read_unlock(&wakeup_srcu, idx);
> > > +
> >
> > This is incorrect.
> >
> > The SRCU cannot be unlocked until the caller of this is done with the object
> > returned by it, or that object can be freed while it is still being 
> > accessed.
>
> Thanks for the comment. Looks like I was not fully understanding your point on
> v4 discussion. So I will implement 3 APIs by referring 
> wakeup_sources_stats_seq_start/next/stop()
>
> > Besides, this patch conflicts with some general wakeup sources changes in 
> > the
> > works, so it needs to be deferred and rebased on top of those changes.
>
> Could you please tell me which is the right code base I should developing on?
> I just tried applying v5 patch on latest 
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git branch master 
> (d1abaeb Linux 5.3-rc5)
> and no conflict encountered.

It is better to use the most recent -rc from Linus (5.3-rc5 as of
today) as the base unless your patches depend on some changes that are
not in there.

Re: [PATCH v5 1/3] PM: wakeup: Add routine to help fetch wakeup source object.

2019-08-05 Thread Rafael J. Wysocki

On Wednesday, July 24, 2019 9:47:20 AM CEST Ran Wang wrote:
> Some user might want to go through all registered wakeup sources
> and doing things accordingly. For example, SoC PM driver might need to
> do HW programming to prevent powering down specific IP which wakeup
> source depending on. So add this API to help walk through all registered
> wakeup source objects on that list and return them one by one.
> 
> Signed-off-by: Ran Wang 
> ---
> Change in v5:
>   - Update commit message, add decription of walk through all wakeup
>   source objects.
>   - Add SCU protection in function wakeup_source_get_next().
>   - Rename wakeup_source member 'attached_dev' to 'dev' and move it up
>   (before wakeirq).
> 
> Change in v4:
>   - None.
> 
> Change in v3:
>   - Adjust indentation of *attached_dev;.
> 
> Change in v2:
>   - None.
> 
>  drivers/base/power/wakeup.c | 24 
>  include/linux/pm_wakeup.h   |  3 +++
>  2 files changed, 27 insertions(+)
> 
> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> index ee31d4f..2fba891 100644
> --- a/drivers/base/power/wakeup.c
> +++ b/drivers/base/power/wakeup.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -226,6 +227,28 @@ void wakeup_source_unregister(struct wakeup_source *ws)
>   }
>  }
>  EXPORT_SYMBOL_GPL(wakeup_source_unregister);
> +/**
> + * wakeup_source_get_next - Get next wakeup source from the list
> + * @ws: Previous wakeup source object, null means caller want first one.
> + */
> +struct wakeup_source *wakeup_source_get_next(struct wakeup_source *ws)
> +{
> + struct list_head *ws_head = &wakeup_sources;
> + struct wakeup_source *next_ws = NULL;
> + int idx;
> +
> + idx = srcu_read_lock(&wakeup_srcu);
> + if (ws)
> + next_ws = list_next_or_null_rcu(ws_head, &ws->entry,
> + struct wakeup_source, entry);
> + else
> + next_ws = list_entry_rcu(ws_head->next,
> + struct wakeup_source, entry);
> + srcu_read_unlock(&wakeup_srcu, idx);
> +

This is incorrect.

The SRCU cannot be unlocked until the caller of this is done
with the object returned by it, or that object can be freed
while it is still being accessed.

Besides, this patch conflicts with some general wakeup sources
changes in the works, so it needs to be deferred and rebased on
top of those changes.

> + return next_ws;
> +}
> +EXPORT_SYMBOL_GPL(wakeup_source_get_next);
>  
>  /**
>   * device_wakeup_attach - Attach a wakeup source object to a device object.
> @@ -242,6 +265,7 @@ static int device_wakeup_attach(struct device *dev, 
> struct wakeup_source *ws)
>   return -EEXIST;
>   }
>   dev->power.wakeup = ws;
> + ws->dev = dev;
>   if (dev->power.wakeirq)
>   device_wakeup_attach_irq(dev, dev->power.wakeirq);
>   spin_unlock_irq(&dev->power.lock);
> diff --git a/include/linux/pm_wakeup.h b/include/linux/pm_wakeup.h
> index 9102760..fc23c1a 100644
> --- a/include/linux/pm_wakeup.h
> +++ b/include/linux/pm_wakeup.h
> @@ -23,6 +23,7 @@ struct wake_irq;
>   * @name: Name of the wakeup source
>   * @entry: Wakeup source list entry
>   * @lock: Wakeup source lock
> + * @dev: The device it attached to
>   * @wakeirq: Optional device specific wakeirq
>   * @timer: Wakeup timer list
>   * @timer_expires: Wakeup timer expiration
> @@ -42,6 +43,7 @@ struct wakeup_source {
>   const char  *name;
>   struct list_head entry;
>   spinlock_t  lock;
> + struct device   *dev;
>   struct wake_irq *wakeirq;
>   struct timer_list   timer;
>   unsigned long   timer_expires;
> @@ -88,6 +90,7 @@ extern void wakeup_source_add(struct wakeup_source *ws);
>  extern void wakeup_source_remove(struct wakeup_source *ws);
>  extern struct wakeup_source *wakeup_source_register(const char *name);
>  extern void wakeup_source_unregister(struct wakeup_source *ws);
> +extern struct wakeup_source *wakeup_source_get_next(struct wakeup_source 
> *ws);
>  extern int device_wakeup_enable(struct device *dev);
>  extern int device_wakeup_disable(struct device *dev);
>  extern void device_set_wakeup_capable(struct device *dev, bool capable);
>
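
To illustrate the locking objection raised above, here is a small userspace sketch. It is not kernel code: the mock_* helpers, struct obj_mock and the read_lock_depth counter are hypothetical names standing in for the SRCU read-side section. It only shows the difference between dropping the lock inside the getter (the v5 shape) and holding it in the caller for as long as the returned object is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object that is only safe to touch under the read lock. */
struct obj_mock {
	int value;
};

static struct obj_mock the_obj = { 42 };
static int read_lock_depth;	/* > 0 while a mock read-side section is open */

static int mock_read_lock(void)
{
	return read_lock_depth++;
}

static void mock_read_unlock(int idx)
{
	(void)idx;
	read_lock_depth--;
}

/* True if an access made right now would be inside a read-side section. */
static bool access_is_protected(void)
{
	return read_lock_depth > 0;
}

/* v5 shape: the lock is dropped before the pointer reaches the caller. */
static struct obj_mock *get_next_unlocked(void)
{
	int idx = mock_read_lock();
	struct obj_mock *o = &the_obj;

	mock_read_unlock(idx);
	return o;
}

/* The caller uses the object after the getter has already unlocked. */
static bool use_v5_style(void)
{
	struct obj_mock *o = get_next_unlocked();

	(void)o;
	return access_is_protected();	/* not protected at this point */
}

/* Split shape: the caller holds the lock across the whole access. */
static bool use_split_style(void)
{
	int idx = mock_read_lock();
	struct obj_mock *o = &the_obj;
	bool ok = access_is_protected() && o->value == 42;

	mock_read_unlock(idx);
	return ok;
}
```

In the real code the unprotected window is where the object could be freed by a concurrent unregister; the mock only records whether the access happened inside the locked region.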

Re: [PATCH V3] cpufreq: Make cpufreq_generic_init() return void

2019-07-18 Thread Rafael J. Wysocki

On Tuesday, July 16, 2019 6:06:08 AM CEST Viresh Kumar wrote:
> It always returns 0 (success) and its return type should really be void.
> Over that, many drivers have added error handling code based on its
> return value, which is not required at all.
> 
> change its return type to void and update all the callers.
> 
> Signed-off-by: Viresh Kumar 
> ---
> V2->V3:
> - Update bmips cpufreq driver to avoid "warning: 'ret' may be used
>   uninitialized".
> - Build bot reported this issue almost after 4 days of posting this
>   patch, I was expecting this a lot earlier :)
> 
>  drivers/cpufreq/bmips-cpufreq.c | 17 ++---
>  drivers/cpufreq/cpufreq.c   |  4 +---
>  drivers/cpufreq/davinci-cpufreq.c   |  3 ++-
>  drivers/cpufreq/imx6q-cpufreq.c |  6 ++
>  drivers/cpufreq/kirkwood-cpufreq.c  |  3 ++-
>  drivers/cpufreq/loongson1-cpufreq.c |  8 +++-
>  drivers/cpufreq/loongson2_cpufreq.c |  3 ++-
>  drivers/cpufreq/maple-cpufreq.c |  3 ++-
>  drivers/cpufreq/omap-cpufreq.c  | 15 +--
>  drivers/cpufreq/pasemi-cpufreq.c|  3 ++-
>  drivers/cpufreq/pmac32-cpufreq.c|  3 ++-
>  drivers/cpufreq/pmac64-cpufreq.c|  3 ++-
>  drivers/cpufreq/s3c2416-cpufreq.c   |  9 ++---
>  drivers/cpufreq/s3c64xx-cpufreq.c   | 15 +++
>  drivers/cpufreq/s5pv210-cpufreq.c   |  3 ++-
>  drivers/cpufreq/sa1100-cpufreq.c|  3 ++-
>  drivers/cpufreq/sa1110-cpufreq.c|  3 ++-
>  drivers/cpufreq/spear-cpufreq.c |  3 ++-
>  drivers/cpufreq/tegra20-cpufreq.c   |  8 +---
>  include/linux/cpufreq.h |  2 +-
>  20 files changed, 46 insertions(+), 71 deletions(-)
> 
> diff --git a/drivers/cpufreq/bmips-cpufreq.c b/drivers/cpufreq/bmips-cpufreq.c
> index 56a4ebbf00e0..f7c23fa468f0 100644
> --- a/drivers/cpufreq/bmips-cpufreq.c
> +++ b/drivers/cpufreq/bmips-cpufreq.c
> @@ -131,23 +131,18 @@ static int bmips_cpufreq_exit(struct cpufreq_policy 
> *policy)
>  static int bmips_cpufreq_init(struct cpufreq_policy *policy)
>  {
>   struct cpufreq_frequency_table *freq_table;
> - int ret;
>  
>   freq_table = bmips_cpufreq_get_freq_table(policy);
>   if (IS_ERR(freq_table)) {
> - ret = PTR_ERR(freq_table);
> - pr_err("%s: couldn't determine frequency table (%d).\n",
> - BMIPS_CPUFREQ_NAME, ret);
> - return ret;
> + pr_err("%s: couldn't determine frequency table (%ld).\n",
> + BMIPS_CPUFREQ_NAME, PTR_ERR(freq_table));
> + return PTR_ERR(freq_table);
>   }
>  
> - ret = cpufreq_generic_init(policy, freq_table, TRANSITION_LATENCY);
> - if (ret)
> - bmips_cpufreq_exit(policy);
> - else
> - pr_info("%s: registered\n", BMIPS_CPUFREQ_NAME);
> + cpufreq_generic_init(policy, freq_table, TRANSITION_LATENCY);
> + pr_info("%s: registered\n", BMIPS_CPUFREQ_NAME);
>  
> - return ret;
> + return 0;
>  }
>  
>  static struct cpufreq_driver bmips_cpufreq_driver = {
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 4d6043ee7834..8dda62367816 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -159,7 +159,7 @@ EXPORT_SYMBOL_GPL(arch_set_freq_scale);
>   * - set policies transition latency
>   * - policy->cpus with all possible CPUs
>   */
> -int cpufreq_generic_init(struct cpufreq_policy *policy,
> +void cpufreq_generic_init(struct cpufreq_policy *policy,
>   struct cpufreq_frequency_table *table,
>   unsigned int transition_latency)
>  {
> @@ -171,8 +171,6 @@ int cpufreq_generic_init(struct cpufreq_policy *policy,
>* share the clock and voltage and clock.
>*/
>   cpumask_setall(policy->cpus);
> -
> - return 0;
>  }
>  EXPORT_SYMBOL_GPL(cpufreq_generic_init);
>  
> diff --git a/drivers/cpufreq/davinci-cpufreq.c 
> b/drivers/cpufreq/davinci-cpufreq.c
> index 3de48ae60c29..297d23cad8b5 100644
> --- a/drivers/cpufreq/davinci-cpufreq.c
> +++ b/drivers/cpufreq/davinci-cpufreq.c
> @@ -90,7 +90,8 @@ static int davinci_cpu_init(struct cpufreq_policy *policy)
>* Setting the latency to 2000 us to accommodate addition of drivers
>* to pre/post change notification list.
>*/
> - return cpufreq_generic_init(policy, freq_table, 2000 * 1000);
> + cpufreq_generic_init(policy, freq_table, 2000 * 1000);
> + return 0;
>  }
>  
>  static struct cpufreq_driver davinci_driver = {
> diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
> index 47ccfa6b17b7..648a09a1778a 100644
> --- a/drivers/cpufreq/imx6q-cpufreq.c
> +++ b/drivers/cpufreq/imx6q-cpufreq.c
> @@ -190,14 +190,12 @@ static int imx6q_set_target(struct cpufreq_policy 
> *policy, unsigned int index)
>  
>  static int imx6q_cpufreq_init(struct cpufreq_policy *policy)
>  {
> - int ret;
> -
>   policy->clk = clks[ARM].clk;
> - ret = cpufreq_generic_init(policy, freq_table, transition_latency);
> +

Re: [PATCH 00/10] cpufreq: Migrate users of policy notifiers to QoS requests

2019-07-16 Thread Rafael J. Wysocki

On Tue, Jul 16, 2019 at 12:14 PM Viresh Kumar  wrote:
>
> On 16-07-19, 12:06, Rafael J. Wysocki wrote:
> > On Tue, Jul 16, 2019 at 11:49 AM Viresh Kumar  
> > wrote:
> > >
> > > Hello,
> > >
> > > Now that cpufreq core supports taking QoS requests for min/max cpu
> > > frequencies, lets migrate rest of the users to using them instead of the
> > > policy notifiers.
> >
> > Technically, this still is linux-next only. :-)
>
> True :)
>
> > > The CPUFREQ_NOTIFY and CPUFREQ_ADJUST events of the policy notifiers are
> > > removed as a result, but we have to add CPUFREQ_CREATE_POLICY and
> > > CPUFREQ_REMOVE_POLICY events to it for the acpi stuff specifically. So
> > > the policy notifiers aren't completely removed.
> >
> > That's not entirely accurate, because arch_topology is going to use
> > CPUFREQ_CREATE_POLICY now too.
>
> Yeah, I thought about that while writing this patchset and
> coverletter. But had it not been required for ACPI, I would have done
> it differently for the arch-topology code. Maybe direct calling of
> arch-topology routine from cpufreq core. I wanted to get rid of the
> policy notifiers completely but I couldn't find a better way of doing
> it for ACPI stuff.
>
> > > Boot tested on my x86 PC and ARM hikey board. Nothing looked broken :)
> > >
> > > This has already gone through build bot for a few days now.
> >
> > So I'd prefer patches [5-8] to go right after the first one and then
> > do the cleanups on top of that, as somebody may want to backport the
> > essential changes without the cleanups.
>
> In the exceptional case where nobody finds anything wrong with the
> patches (highly unlikely), do you want me to resend with reordering or
> you can reorder them while applying? There are no dependencies between
> those patches anyway.

Please resend the reordered set when the merge window closes.

Re: [PATCH 00/10] cpufreq: Migrate users of policy notifiers to QoS requests

2019-07-16 Thread Rafael J. Wysocki

On Tue, Jul 16, 2019 at 11:49 AM Viresh Kumar  wrote:
>
> Hello,
>
> Now that cpufreq core supports taking QoS requests for min/max cpu
> frequencies, lets migrate rest of the users to using them instead of the
> policy notifiers.

Technically, this still is linux-next only. :-)

> The CPUFREQ_NOTIFY and CPUFREQ_ADJUST events of the policy notifiers are
> removed as a result, but we have to add CPUFREQ_CREATE_POLICY and
> CPUFREQ_REMOVE_POLICY events to it for the acpi stuff specifically. So
> the policy notifiers aren't completely removed.

That's not entirely accurate, because arch_topology is going to use
CPUFREQ_CREATE_POLICY now too.

> Boot tested on my x86 PC and ARM hikey board. Nothing looked broken :)
>
> This has already gone through build bot for a few days now.

So I'd prefer patches [5-8] to go right after the first one and then
do the cleanups on top of that, as somebody may want to backport the
essential changes without the cleanups.

Re: [PATCH v5] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()

2019-07-10 Thread Rafael J. Wysocki

On Tuesday, July 9, 2019 10:12:05 AM CEST Viresh Kumar wrote:
> On 09-07-19, 16:04, Wen Yang wrote:
> > The cpu variable is still being used in the of_get_property() call
> > after the of_node_put() call, which may result in use-after-free.
> > 
> > Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak")
> > Signed-off-by: Wen Yang 
> > Cc: "Rafael J. Wysocki" 
> > Cc: Viresh Kumar 
> > Cc: Michael Ellerman 
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Cc: linux...@vger.kernel.org
> > Cc: linux-ker...@vger.kernel.org

Patch applied.

> > ---
> > v5: put together the code to get, use, and release cpu device_node.
> > v4: restore the blank line.
> > v3: fix a leaked reference.
> > v2: clean up the code according to the advice of viresh.
> > 
> >  drivers/cpufreq/pasemi-cpufreq.c | 21 +
> >  1 file changed, 9 insertions(+), 12 deletions(-)
> > 
> > diff --git a/drivers/cpufreq/pasemi-cpufreq.c 
> > b/drivers/cpufreq/pasemi-cpufreq.c
> > index 6b1e4ab..1f0beb7 100644
> > --- a/drivers/cpufreq/pasemi-cpufreq.c
> > +++ b/drivers/cpufreq/pasemi-cpufreq.c
> > @@ -131,10 +131,17 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy 
> > *policy)
> > int err = -ENODEV;
> >  
> > cpu = of_get_cpu_node(policy->cpu, NULL);
> > -
> > -   of_node_put(cpu);
> > if (!cpu)
> > goto out;
> 
> I would have loved a blank line here :)

And I added the blank line.

> > +   max_freqp = of_get_property(cpu, "clock-frequency", NULL);
> > +   of_node_put(cpu);
> > +   if (!max_freqp) {
> > +   err = -EINVAL;
> > +   goto out;
> > +   }
> > +
> > +   /* we need the freq in kHz */
> > +   max_freq = *max_freqp / 1000;
> >  
> > dn = of_find_compatible_node(NULL, NULL, "1682m-sdc");
> > if (!dn)
> > @@ -171,16 +178,6 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy 
> > *policy)
> > }
> >  
> > pr_debug("init cpufreq on CPU %d\n", policy->cpu);
> > -
> > -   max_freqp = of_get_property(cpu, "clock-frequency", NULL);
> > -   if (!max_freqp) {
> > -   err = -EINVAL;
> > -   goto out_unmap_sdcpwr;
> > -   }
> > -
> > -   /* we need the freq in kHz */
> > -   max_freq = *max_freqp / 1000;
> > -
> > pr_debug("max clock-frequency is at %u kHz\n", max_freq);
> > pr_debug("initializing frequency table\n");
> 
> Though, enough versions have happened now.
> 
> Acked-by: Viresh Kumar 
> 
> 

Thanks!
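
For illustration only, the ordering issue discussed in this thread can be modeled in userspace. The sketch below is not kernel code: struct node, node_get(), node_put() and max_freq_khz() are hypothetical stand-ins (loosely modeled on a device_node, of_get_cpu_node() and of_node_put()), and it assumes a plain reference count that frees the object when it drops to zero. The only point it makes is that the node is used while the reference is still held and released afterwards.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted device node. */
struct node {
	int refcount;
	unsigned int clock_frequency;	/* in Hz */
};

static int live_nodes;	/* demonstration only: nodes not yet freed */

/* Models "get": returns a node with one reference held, or NULL. */
static struct node *node_get(unsigned int hz)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->refcount = 1;
	n->clock_frequency = hz;
	live_nodes++;
	return n;
}

/* Models "put": drops one reference and frees the node on the last one. */
static void node_put(struct node *n)
{
	if (n && --n->refcount == 0) {
		live_nodes--;
		free(n);
	}
}

/* Get, use, release, in that order. Returns kHz, or 0 on failure. */
static unsigned int max_freq_khz(unsigned int hz)
{
	struct node *cpu = node_get(hz);
	unsigned int khz;

	if (!cpu)
		return 0;

	khz = cpu->clock_frequency / 1000;	/* use while the reference is held */
	node_put(cpu);				/* release only after the last use */
	return khz;
}
```

Swapping the last two statements before the return in max_freq_khz() would read freed memory in this model, which is the kind of bug the patch addresses.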

Re: [PATCH v2] powerpc/power: Expose pfn_is_nosave prototype

2019-06-27 Thread Rafael J. Wysocki

On Friday, May 24, 2019 12:44:18 PM CEST Mathieu Malaterre wrote:
> The declaration for pfn_is_nosave is only available in
> kernel/power/power.h. Since this function can be override in arch,
> expose it globally. Having a prototype will make sure to avoid warning
> (sometime treated as error with W=1) such as:
> 
>   arch/powerpc/kernel/suspend.c:18:5: error: no previous prototype for 
> 'pfn_is_nosave' [-Werror=missing-prototypes]
> 
> This moves the declaration into a globally visible header file and add
> missing include to avoid a warning on powerpc. Also remove the
> duplicated prototypes since not required anymore.
> 
> Cc: Christophe Leroy 
> Signed-off-by: Mathieu Malaterre 
> ---
> v2: As suggestion by christophe remove duplicates prototypes
> 
>  arch/powerpc/kernel/suspend.c | 1 +
>  arch/s390/kernel/entry.h  | 1 -
>  include/linux/suspend.h   | 1 +
>  kernel/power/power.h  | 2 --
>  4 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/suspend.c b/arch/powerpc/kernel/suspend.c
> index a531154cc0f3..9e1b6b894245 100644
> --- a/arch/powerpc/kernel/suspend.c
> +++ b/arch/powerpc/kernel/suspend.c
> @@ -8,6 +8,7 @@
>   */
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  
> diff --git a/arch/s390/kernel/entry.h b/arch/s390/kernel/entry.h
> index 20420c2b8a14..b2956d49b6ad 100644
> --- a/arch/s390/kernel/entry.h
> +++ b/arch/s390/kernel/entry.h
> @@ -63,7 +63,6 @@ void __init startup_init(void);
>  void die(struct pt_regs *regs, const char *str);
>  int setup_profiling_timer(unsigned int multiplier);
>  void __init time_init(void);
> -int pfn_is_nosave(unsigned long);
>  void s390_early_resume(void);
>  unsigned long prepare_ftrace_return(unsigned long parent, unsigned long sp, 
> unsigned long ip);
>  
> diff --git a/include/linux/suspend.h b/include/linux/suspend.h
> index 6b3ea9ea6a9e..e8b8a7bede90 100644
> --- a/include/linux/suspend.h
> +++ b/include/linux/suspend.h
> @@ -395,6 +395,7 @@ extern bool system_entering_hibernation(void);
>  extern bool hibernation_available(void);
>  asmlinkage int swsusp_save(void);
>  extern struct pbe *restore_pblist;
> +int pfn_is_nosave(unsigned long pfn);
>  #else /* CONFIG_HIBERNATION */
>  static inline void register_nosave_region(unsigned long b, unsigned long e) 
> {}
>  static inline void register_nosave_region_late(unsigned long b, unsigned 
> long e) {}
> diff --git a/kernel/power/power.h b/kernel/power/power.h
> index 9e58bdc8a562..44bee462ff57 100644
> --- a/kernel/power/power.h
> +++ b/kernel/power/power.h
> @@ -75,8 +75,6 @@ static inline void hibernate_reserved_size_init(void) {}
>  static inline void hibernate_image_size_init(void) {}
>  #endif /* !CONFIG_HIBERNATION */
>  
> -extern int pfn_is_nosave(unsigned long);
> -
>  #define power_attr(_name) \
>  static struct kobj_attribute _name##_attr = {\
>   .attr   = { \
> 

Applied, thanks!

Re: [PATCH v4 1/3] PM: wakeup: Add routine to help fetch wakeup source object.

2019-06-18 Thread Rafael J. Wysocki

On Monday, May 20, 2019 11:52:36 AM CEST Ran Wang wrote:
> Some user might want to go through all registered wakeup sources
> and doing things accordingly. For example, SoC PM driver might need to
> do HW programming to prevent powering down specific IP which wakeup
> source depending on. And is user's responsibility to identify if this
> wakeup source he is interested in.

I guess the idea here is that you need to walk wakeup devices and you noticed
that there was a wakeup source object for each of them and those wakeup
source objects were on a list, so you could walk wakeup devices by walking
the list of wakeup source objects.

That is fair enough, but the changelog above doesn't even talk about that.
 
> Signed-off-by: Ran Wang 
> ---
> Change in v4:
>   - None.
> 
> Change in v3:
>   - Adjust indentation of *attached_dev;.
> 
> Change in v2:
>   - None.
> 
>  drivers/base/power/wakeup.c |   18 ++
>  include/linux/pm_wakeup.h   |3 +++
>  2 files changed, 21 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> index 5b2b6a0..6904485 100644
> --- a/drivers/base/power/wakeup.c
> +++ b/drivers/base/power/wakeup.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -226,6 +227,22 @@ void wakeup_source_unregister(struct wakeup_source *ws)
>   }
>  }
>  EXPORT_SYMBOL_GPL(wakeup_source_unregister);
> +/**
> + * wakeup_source_get_next - Get next wakeup source from the list
> + * @ws: Previous wakeup source object, null means caller want first one.
> + */
> +struct wakeup_source *wakeup_source_get_next(struct wakeup_source *ws)
> +{
> + struct list_head *ws_head = &wakeup_sources;
> +
> + if (ws)
> + return list_next_or_null_rcu(ws_head, &ws->entry,
> + struct wakeup_source, entry);
> + else
> + return list_entry_rcu(ws_head->next,
> + struct wakeup_source, entry);
> +}
> +EXPORT_SYMBOL_GPL(wakeup_source_get_next);

This needs to be arranged along the lines of 
wakeup_sources_stats_seq_start/next/stop()
because of the SRCU protection of the list.
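
As a rough userspace sketch of the start/next/stop arrangement meant here: the read-side section is entered once in "start", held across every "next" step, and left in "stop". Everything below is hypothetical and simplified. A fixed array stands in for the kernel's list, a counter stands in for srcu_read_lock()/srcu_read_unlock(), and the ws_walk_*() names are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a wakeup source object. */
struct ws_model {
	const char *name;
};

static struct ws_model sources[] = { { "alarm" }, { "usb" }, { "eth0" } };
#define NR_SOURCES (sizeof(sources) / sizeof(sources[0]))

/* Stand-in for the SRCU read-side state: > 0 means "inside a read section". */
static int read_lock_depth;

/* Enter the read-side section and return the first entry. */
static struct ws_model *ws_walk_start(int *cookie)
{
	*cookie = ++read_lock_depth;	/* models idx = srcu_read_lock(...) */
	return &sources[0];
}

/* Return the entry after ws, or NULL at the end of the table. */
static struct ws_model *ws_walk_next(struct ws_model *ws)
{
	size_t i = (size_t)(ws - sources) + 1;

	return i < NR_SOURCES ? &sources[i] : NULL;
}

/* Leave the read-side section; must pair with ws_walk_start(). */
static void ws_walk_stop(int cookie)
{
	(void)cookie;			/* models srcu_read_unlock(..., idx) */
	read_lock_depth--;
}

/* Example caller: the whole walk happens between start and stop. */
static int count_sources(void)
{
	struct ws_model *ws;
	int cookie, n = 0;

	for (ws = ws_walk_start(&cookie); ws; ws = ws_walk_next(ws))
		n++;
	ws_walk_stop(cookie);
	return n;
}
```

The design point is that a single "get next" helper, as in the patch, leaves no place to take and drop the read-side lock around the whole walk, whereas the three-function form does.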

>  
>  /**
>   * device_wakeup_attach - Attach a wakeup source object to a device object.
> @@ -242,6 +259,7 @@ static int device_wakeup_attach(struct device *dev, 
> struct wakeup_source *ws)
>   return -EEXIST;
>   }
>   dev->power.wakeup = ws;
> + ws->attached_dev = dev;
>   if (dev->power.wakeirq)
>   device_wakeup_attach_irq(dev, dev->power.wakeirq);
>   spin_unlock_irq(&dev->power.lock);
> diff --git a/include/linux/pm_wakeup.h b/include/linux/pm_wakeup.h
> index 0ff134d..913b2fb 100644
> --- a/include/linux/pm_wakeup.h
> +++ b/include/linux/pm_wakeup.h
> @@ -50,6 +50,7 @@
>   * @wakeup_count: Number of times the wakeup source might abort suspend.
>   * @active: Status of the wakeup source.
>   * @has_timeout: The wakeup source has been activated with a timeout.
> + * @attached_dev: The device it attached to
>   */
>  struct wakeup_source {
>   const char  *name;
> @@ -70,6 +71,7 @@ struct wakeup_source {
>   unsigned long   wakeup_count;
>   boolactive:1;
>   boolautosleep_enabled:1;
> + struct device   *attached_dev;

Please (a) call it just dev and (b) move it up (before wakeirq, say).

>  };
>  
>  #ifdef CONFIG_PM_SLEEP
> @@ -101,6 +103,7 @@ static inline void device_set_wakeup_path(struct device 
> *dev)
>  extern void wakeup_source_remove(struct wakeup_source *ws);
>  extern struct wakeup_source *wakeup_source_register(const char *name);
>  extern void wakeup_source_unregister(struct wakeup_source *ws);
> +extern struct wakeup_source *wakeup_source_get_next(struct wakeup_source 
> *ws);
>  extern int device_wakeup_enable(struct device *dev);
>  extern int device_wakeup_disable(struct device *dev);
>  extern void device_set_wakeup_capable(struct device *dev, bool capable);
>

Re: [PATCH 01/14] ABI: fix some syntax issues at the ABI database

2019-06-14 Thread Rafael J. Wysocki

On Fri, Jun 14, 2019 at 4:04 AM Mauro Carvalho Chehab
 wrote:
>
> From: Mauro Carvalho Chehab 
>
> On those three files, the ABI representation described at
> README are violated.
>
> - at sysfs-bus-iio-proximity-as3935:
> a ':' character is missing after "What"
>
> - at sysfs-class-devfreq:
> there's a typo at Description
>
> - at sysfs-class-cxl, it is using the ":" character at a
> file preamble, causing it to be misinterpreted as a
> tag.
>
> - On the other files, instead of "What", they use "Where".
>
> Signed-off-by: Mauro Carvalho Chehab 
> Signed-off-by: Mauro Carvalho Chehab 

Acked-by: Rafael J. Wysocki 

> ---
>  Documentation/ABI/testing/pstore  |  2 +-
>  .../sysfs-bus-event_source-devices-format |  2 +-
>  .../ABI/testing/sysfs-bus-i2c-devices-hm6352  |  6 ++---
>  .../ABI/testing/sysfs-bus-iio-distance-srf08  |  4 ++--
>  .../testing/sysfs-bus-iio-proximity-as3935|  4 ++--
>  .../ABI/testing/sysfs-bus-pci-devices-cciss   | 22 +--
>  .../testing/sysfs-bus-usb-devices-usbsevseg   | 12 +-
>  Documentation/ABI/testing/sysfs-class-cxl |  6 ++---
>  Documentation/ABI/testing/sysfs-class-devfreq |  2 +-
>  .../ABI/testing/sysfs-class-powercap  |  2 +-
>  Documentation/ABI/testing/sysfs-kernel-fscaps |  2 +-
>  .../ABI/testing/sysfs-kernel-vmcoreinfo   |  2 +-
>  12 files changed, 33 insertions(+), 33 deletions(-)
>
> diff --git a/Documentation/ABI/testing/pstore 
> b/Documentation/ABI/testing/pstore
> index 5fca9f5e10a3..8d6e48f4e8ef 100644
> --- a/Documentation/ABI/testing/pstore
> +++ b/Documentation/ABI/testing/pstore
> @@ -1,4 +1,4 @@
> -Where: /sys/fs/pstore/... (or /dev/pstore/...)
> +What:  /sys/fs/pstore/... (or /dev/pstore/...)
>  Date:  March 2011
>  Kernel Version: 2.6.39
>  Contact:   tony.l...@intel.com
> diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-format 
> b/Documentation/ABI/testing/sysfs-bus-event_source-devices-format
> index 77f47ff5ee02..b6f8748e0200 100644
> --- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-format
> +++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-format
> @@ -1,4 +1,4 @@
> -Where: /sys/bus/event_source/devices//format
> +What:  /sys/bus/event_source/devices//format
>  Date:  January 2012
>  Kernel Version: 3.3
>  Contact:   Jiri Olsa 
> diff --git a/Documentation/ABI/testing/sysfs-bus-i2c-devices-hm6352 
> b/Documentation/ABI/testing/sysfs-bus-i2c-devices-hm6352
> index feb2e4a87075..29bd447e50a0 100644
> --- a/Documentation/ABI/testing/sysfs-bus-i2c-devices-hm6352
> +++ b/Documentation/ABI/testing/sysfs-bus-i2c-devices-hm6352
> @@ -1,18 +1,18 @@
> -Where: /sys/bus/i2c/devices/.../heading0_input
> +What:  /sys/bus/i2c/devices/.../heading0_input
>  Date:  April 2010
>  Kernel Version: 2.6.36?
>  Contact:   alan@intel.com
>  Description:   Reports the current heading from the compass as a floating
> point value in degrees.
>
> -Where: /sys/bus/i2c/devices/.../power_state
> +What:  /sys/bus/i2c/devices/.../power_state
>  Date:  April 2010
>  Kernel Version: 2.6.36?
>  Contact:   alan@intel.com
>  Description:   Sets the power state of the device. 0 sets the device into
> sleep mode, 1 wakes it up.
>
> -Where: /sys/bus/i2c/devices/.../calibration
> +What:  /sys/bus/i2c/devices/.../calibration
>  Date:  April 2010
>  Kernel Version: 2.6.36?
>  Contact:   alan@intel.com
> diff --git a/Documentation/ABI/testing/sysfs-bus-iio-distance-srf08 
> b/Documentation/ABI/testing/sysfs-bus-iio-distance-srf08
> index 0a1ca1487fa9..a133fd8d081a 100644
> --- a/Documentation/ABI/testing/sysfs-bus-iio-distance-srf08
> +++ b/Documentation/ABI/testing/sysfs-bus-iio-distance-srf08
> @@ -1,4 +1,4 @@
> -What   /sys/bus/iio/devices/iio:deviceX/sensor_sensitivity
> +What:  /sys/bus/iio/devices/iio:deviceX/sensor_sensitivity
>  Date:  January 2017
>  KernelVersion: 4.11
>  Contact:   linux-...@vger.kernel.org
> @@ -6,7 +6,7 @@ Description:
> Show or set the gain boost of the amp, from 0-31 range.
> default 31
>
> -What   /sys/bus/iio/devices/iio:deviceX/sensor_max_range
> +What:  /sys/bus/iio/devices/iio:deviceX/sensor_max_range
>  Date:  January 2017
>  KernelVersion: 4.11
>  Contact:   linux-...@vger.kernel.org
> diff --git a/Documentation/ABI/testing/sysfs-bus-iio-proximity-as3935 
> b/Documentation/ABI/testing/sysfs-bus-

Re: [PATCH v2] powerpc/power: Expose pfn_is_nosave prototype

2019-05-28 Thread Rafael J. Wysocki

On Tuesday, May 28, 2019 3:16:30 AM CEST Michael Ellerman wrote:
> "Rafael J. Wysocki"  writes:
> > On Friday, May 24, 2019 12:44:18 PM CEST Mathieu Malaterre wrote:
> >> The declaration for pfn_is_nosave is only available in
> >> kernel/power/power.h. Since this function can be override in arch,
> >> expose it globally. Having a prototype will make sure to avoid warning
> >> (sometime treated as error with W=1) such as:
> >> 
> >>   arch/powerpc/kernel/suspend.c:18:5: error: no previous prototype for 
> >> 'pfn_is_nosave' [-Werror=missing-prototypes]
> >> 
> >> This moves the declaration into a globally visible header file and add
> >> missing include to avoid a warning on powerpc. Also remove the
> >> duplicated prototypes since not required anymore.
> >> 
> >> Cc: Christophe Leroy 
> >> Signed-off-by: Mathieu Malaterre 
> >> ---
> >> v2: As suggestion by christophe remove duplicates prototypes
> >> 
> >>  arch/powerpc/kernel/suspend.c | 1 +
> >>  arch/s390/kernel/entry.h  | 1 -
> >>  include/linux/suspend.h   | 1 +
> >>  kernel/power/power.h  | 2 --
> >>  4 files changed, 2 insertions(+), 3 deletions(-)
> >> 
> >> diff --git a/kernel/power/power.h b/kernel/power/power.h
> >> index 9e58bdc8a562..44bee462ff57 100644
> >> --- a/kernel/power/power.h
> >> +++ b/kernel/power/power.h
> >> @@ -75,8 +75,6 @@ static inline void hibernate_reserved_size_init(void) {}
> >>  static inline void hibernate_image_size_init(void) {}
> >>  #endif /* !CONFIG_HIBERNATION */
> >>  
> >> -extern int pfn_is_nosave(unsigned long);
> >> -
> >>  #define power_attr(_name) \
> >>  static struct kobj_attribute _name##_attr = { \
> >>.attr   = { \
> >> 
> >
> > With an ACK from the powerpc maintainers, I could apply this one.
> 
> Sent.

Thanks!

Re: [PATCH v2] powerpc/power: Expose pfn_is_nosave prototype

2019-05-27 Thread Rafael J. Wysocki

On Friday, May 24, 2019 12:44:18 PM CEST Mathieu Malaterre wrote:
> The declaration for pfn_is_nosave is only available in
> kernel/power/power.h. Since this function can be override in arch,
> expose it globally. Having a prototype will make sure to avoid warning
> (sometime treated as error with W=1) such as:
> 
>   arch/powerpc/kernel/suspend.c:18:5: error: no previous prototype for 
> 'pfn_is_nosave' [-Werror=missing-prototypes]
> 
> This moves the declaration into a globally visible header file and add
> missing include to avoid a warning on powerpc. Also remove the
> duplicated prototypes since not required anymore.
> 
> Cc: Christophe Leroy 
> Signed-off-by: Mathieu Malaterre 
> ---
> v2: As suggestion by christophe remove duplicates prototypes
> 
>  arch/powerpc/kernel/suspend.c | 1 +
>  arch/s390/kernel/entry.h  | 1 -
>  include/linux/suspend.h   | 1 +
>  kernel/power/power.h  | 2 --
>  4 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/suspend.c b/arch/powerpc/kernel/suspend.c
> index a531154cc0f3..9e1b6b894245 100644
> --- a/arch/powerpc/kernel/suspend.c
> +++ b/arch/powerpc/kernel/suspend.c
> @@ -8,6 +8,7 @@
>   */
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  
> diff --git a/arch/s390/kernel/entry.h b/arch/s390/kernel/entry.h
> index 20420c2b8a14..b2956d49b6ad 100644
> --- a/arch/s390/kernel/entry.h
> +++ b/arch/s390/kernel/entry.h
> @@ -63,7 +63,6 @@ void __init startup_init(void);
>  void die(struct pt_regs *regs, const char *str);
>  int setup_profiling_timer(unsigned int multiplier);
>  void __init time_init(void);
> -int pfn_is_nosave(unsigned long);
>  void s390_early_resume(void);
>  unsigned long prepare_ftrace_return(unsigned long parent, unsigned long sp, 
> unsigned long ip);
>  
> diff --git a/include/linux/suspend.h b/include/linux/suspend.h
> index 6b3ea9ea6a9e..e8b8a7bede90 100644
> --- a/include/linux/suspend.h
> +++ b/include/linux/suspend.h
> @@ -395,6 +395,7 @@ extern bool system_entering_hibernation(void);
>  extern bool hibernation_available(void);
>  asmlinkage int swsusp_save(void);
>  extern struct pbe *restore_pblist;
> +int pfn_is_nosave(unsigned long pfn);
>  #else /* CONFIG_HIBERNATION */
>  static inline void register_nosave_region(unsigned long b, unsigned long e) 
> {}
>  static inline void register_nosave_region_late(unsigned long b, unsigned 
> long e) {}
> diff --git a/kernel/power/power.h b/kernel/power/power.h
> index 9e58bdc8a562..44bee462ff57 100644
> --- a/kernel/power/power.h
> +++ b/kernel/power/power.h
> @@ -75,8 +75,6 @@ static inline void hibernate_reserved_size_init(void) {}
>  static inline void hibernate_image_size_init(void) {}
>  #endif /* !CONFIG_HIBERNATION */
>  
> -extern int pfn_is_nosave(unsigned long);
> -
>  #define power_attr(_name) \
>  static struct kobj_attribute _name##_attr = {\
>   .attr   = { \
> 

With an ACK from the powerpc maintainers, I could apply this one.

Re: [PATCH 10/10] docs: fix broken documentation links

2019-05-27 Thread Rafael J. Wysocki

On Mon, May 20, 2019 at 4:48 PM Mauro Carvalho Chehab
 wrote:
>
> Mostly due to x86 and acpi conversion, several documentation
> links are still pointing to the old file. Fix them.
>
> Signed-off-by: Mauro Carvalho Chehab 

For the ACPI part:

Acked-by: Rafael J. Wysocki

Re: [PATCH v5 00/23] Include linux ACPI docs into Sphinx TOC tree

2019-05-01 Thread Rafael J. Wysocki

On Thursday, April 25, 2019 5:20:35 PM CEST Changbin Du wrote:
> On Thu, Apr 25, 2019 at 10:44:14AM +0200, Rafael J. Wysocki wrote:
> > On Wed, Apr 24, 2019 at 7:54 PM Changbin Du  wrote:
> > >
> > > Hi All,
> > > The kernel now uses Sphinx to generate intelligent and beautiful 
> > > documentation
> > > from reStructuredText files. I converted all of the Linux ACPI/PCI/X86 
> > > docs to
> > > reST format in this serias.
> > >
> > > The hieararchy of ACPI docs are based on Corbet's suggestion:
> > > https://lkml.org/lkml/2019/4/3/1047
> > > I did some adjustment according to the content and finally they are 
> > > placed as:
> > > Documentation/firmware-guide/acpi/
> > 
> > I'd like to queue up this series, but it is missing a patch to create
> > Documentation/firmware-guide/acpi/index.rst.
> > 
> > Care to provide one?
> oops, the first patch is missed. Let me add it next.

I've picked up the first patch from the v6 and applied this series on top of it.

Thanks!

Re: Why is suspend with s2idle available on POWER8 systems?

2019-04-29 Thread Rafael J. Wysocki

On Mon, Apr 29, 2019 at 10:50 AM Paul Menzel  wrote:
>
> Dear Rafael,
>
>
> On 04/29/2019 09:17 AM, Rafael J. Wysocki wrote:
> > On Sat, Apr 27, 2019 at 12:54 PM Paul Menzel  wrote:
>
> >> Updating an IBM S822LC from Ubuntu 18.10 to 19.04 some user space stuff
> >> seems to have changed, so that going into sleep/suspend is enabled.
> >>
> >> That raises two questions.
> >>
> >> 1.  Is suspend actually supported on a POWER8 processor?
> >
> > Suspend-to-idle is a special variant of system suspend that does not
> > depend on any special platform support.  It works by suspending
> > devices and letting all of the CPUs in the system go idle (hence the
> > name).
> >
> > Also see 
> > https://www.kernel.org/doc/html/latest/admin-guide/pm/sleep-states.html#suspend-to-idle
>
> Thanks. I guess I mixed it up with the new S0ix-states [1].

Those can be entered via suspend-to-idle, if supported and actually
reachable on a given platform, but suspend-to-idle is more general
than that.

> >>> Apr 27 10:18:13 power NetworkManager[7534]:   [1556353093.7224] 
> >>> manager: sleep: sleep requested (sleeping: no  e
> >>> Apr 27 10:18:13 power systemd[1]: Reached target Sleep.
> >>> Apr 27 10:18:13 power systemd[1]: Starting Suspend...
> >>> Apr 27 10:18:13 power systemd-sleep[82190]: Suspending system...
> >>> Apr 27 10:18:13 power kernel: PM: suspend entry (s2idle)
> >>> -- Reboot --
> >>
> >>> $ uname -m
> >>> ppc64le
> >>> $ more /proc/version
> >>> Linux version 5.1.0-rc6+ (joey@power) (gcc version 8.3.0 (Ubuntu 
> >>> 8.3.0-6ubuntu1)) #1 SMP Sat Apr 27 10:01:48 CEST 2019
> >>> $ more /sys/power/mem_sleep
> >>> [s2idle]
> >>> $ more /sys/power/state
> >>> freeze mem
> >>> $ grep _SUSPEND /boot/config-5.0.0-14-generic # also enabled in Ubuntu’s 
> >>> configuration
> >>> CONFIG_ARCH_SUSPEND_POSSIBLE=y
> >>> CONFIG_SUSPEND=y
> >>> CONFIG_SUSPEND_FREEZER=y
> >>> # CONFIG_SUSPEND_SKIP_SYNC is not set
> >>> # CONFIG_PM_TEST_SUSPEND is not set
> >>
> >> Should the Kconfig symbol `SUSPEND` be selectable? If yes, should their
> >> be some detection during runtime?
> >>
> >> 2.  If it is supported, what are the ways to getting it to resume? What
> >> would the IPMI command be?
> >
> > That would depend on the distribution.
> >
> > Generally, you need to set up at least one device to generate wakeup
> > interrupts.
> >
> > The interface for that is the /sys/devices/.../power/wakeup files,
> > but that has to cause enable_irq_wake() to be called for the given IRQ,
> > so some support in the underlying drivers needs to be present for it to
> > work.
> >
> > USB devices generally work as wakeup sources if the controllers reside
> > on a PCI bus, for example.
>
> ```
> $ find /sys/devices/ -name wakeup | xargs grep enabled
> /sys/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:09.0/0021:0d:00.0/usb1/1-3/1-3.4/power/wakeup:enabled
> /sys/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:09.0/0021:0d:00.0/power/wakeup:enabled
> $ lsusb -t
> /:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
> /:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
> |__ Port 3: Dev 2, If 0, Class=Hub, Driver=hub/5p, 480M
> |__ Port 1: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 480M
> |__ Port 2: Dev 4, If 0, Class=Mass Storage, Driver=usb-storage, 480M
> |__ Port 3: Dev 5, If 0, Class=Mass Storage, Driver=usb-storage, 480M
> |__ Port 4: Dev 6, If 0, Class=Human Interface Device, Driver=usbhid, 
> 1.5M
> |__ Port 4: Dev 6, If 1, Class=Human Interface Device, Driver=usbhid, 
> 1.5M
> $ lsusb
> Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
> Bus 001 Device 006: ID 046b:ff10 American Megatrends, Inc. Virtual Keyboard 
> and Mouse
> Bus 001 Device 005: ID 046b:ff31 American Megatrends, Inc.
> Bus 001 Device 004: ID 046b:ff40 American Megatrends, Inc.
> Bus 001 Device 003: ID 046b:ff20 American Megatrends, Inc.
> Bus 001 Device 002: ID 046b:ff01 American Megatrends, Inc.
> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> ```

I'm not really sure what you wanted to say here, but it looks like
system wakeup is not enabled for device 6 on bus 1 which is probably
what you want.

Re: Why is suspend with s2idle available on POWER8 systems?

2019-04-29 Thread Rafael J. Wysocki

On Sat, Apr 27, 2019 at 12:54 PM Paul Menzel  wrote:
>
> Dear Linux folks,
>
>
> Updating an IBM S822LC from Ubuntu 18.10 to 19.04 some user space stuff
> seems to have changed, so that going into sleep/suspend is enabled.
>
> That raises two questions.
>
> 1.  Is suspend actually supported on a POWER8 processor?

Suspend-to-idle is a special variant of system suspend that does not
depend on any special platform support.  It works by suspending
devices and letting all of the CPUs in the system go idle (hence the
name).

Also see 
https://www.kernel.org/doc/html/latest/admin-guide/pm/sleep-states.html#suspend-to-idle

>
> > Apr 27 10:18:13 power NetworkManager[7534]:   [1556353093.7224] 
> > manager: sleep: sleep requested (sleeping: no  e
> > Apr 27 10:18:13 power systemd[1]: Reached target Sleep.
> > Apr 27 10:18:13 power systemd[1]: Starting Suspend...
> > Apr 27 10:18:13 power systemd-sleep[82190]: Suspending system...
> > Apr 27 10:18:13 power kernel: PM: suspend entry (s2idle)
> > -- Reboot --
>
> > $ uname -m
> > ppc64le
> > $ more /proc/version
> > Linux version 5.1.0-rc6+ (joey@power) (gcc version 8.3.0 (Ubuntu 
> > 8.3.0-6ubuntu1)) #1 SMP Sat Apr 27 10:01:48 CEST 2019
> > $ more /sys/power/mem_sleep
> > [s2idle]
> > $ more /sys/power/state
> > freeze mem
> > $ grep _SUSPEND /boot/config-5.0.0-14-generic # also enabled in Ubuntu’s 
> > configuration
> > CONFIG_ARCH_SUSPEND_POSSIBLE=y
> > CONFIG_SUSPEND=y
> > CONFIG_SUSPEND_FREEZER=y
> > # CONFIG_SUSPEND_SKIP_SYNC is not set
> > # CONFIG_PM_TEST_SUSPEND is not set
>
> Should the Kconfig symbol `SUSPEND` be selectable? If yes, should there
> be some detection during runtime?
>
> 2.  If it is supported, what are the ways to getting it to resume? What
> would the IPMI command be?

That would depend on the distribution.

Generally, you need to set up at least one device to generate wakeup interrupts.

The interface for that is the /sys/devices/.../power/wakeup files,
but that has to cause enable_irq_wake() to be called for the given IRQ,
so some support in the underlying drivers needs to be present for it to
work.

USB devices generally work as wakeup sources if the controllers reside
on a PCI bus, for example.
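
A minimal sketch of driving that interface from a program, assuming the attribute accepts the strings "enabled" and "disabled". The helper name set_wakeup() is made up for illustration, and the path is left to the caller because it depends on the device in question.

```c
#include <stdio.h>

/*
 * Write "enabled" or "disabled" to a power/wakeup attribute file.
 * 'path' is supplied by the caller (some .../power/wakeup file).
 * Returns 0 on success and -1 on failure.
 */
static int set_wakeup(const char *path, int enable)
{
	FILE *f = fopen(path, "w");
	int ret = 0;

	if (!f)
		return -1;
	if (fputs(enable ? "enabled\n" : "disabled\n", f) == EOF)
		ret = -1;
	if (fclose(f) == EOF)	/* write errors may only show up here */
		ret = -1;
	return ret;
}
```

Writing the file only expresses the request. Whether the device can actually wake the system up still depends on the driver support mentioned above.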

Thanks,
Rafael

Re: [PATCH v5 00/23] Include linux ACPI docs into Sphinx TOC tree

2019-04-25 Thread Rafael J. Wysocki

On Wed, Apr 24, 2019 at 7:54 PM Changbin Du  wrote:
>
> Hi All,
> The kernel now uses Sphinx to generate intelligent and beautiful documentation
> from reStructuredText files. I converted all of the Linux ACPI/PCI/X86 docs to
> reST format in this series.
>
> The hierarchy of ACPI docs is based on Corbet's suggestion:
> https://lkml.org/lkml/2019/4/3/1047
> I did some adjustment according to the content and finally they are placed as:
> Documentation/firmware-guide/acpi/

I'd like to queue up this series, but it is missing a patch to create
Documentation/firmware-guide/acpi/index.rst.

Care to provide one?

Re: [PATCH v4 00/63] Include linux ACPI/PCI/X86 docs into Sphinx TOC tree

2019-04-23 Thread Rafael J. Wysocki

On Tue, Apr 23, 2019 at 6:30 PM Changbin Du  wrote:
>
> Hi Corbet and All,
> The kernel now uses Sphinx to generate intelligent and beautiful documentation
> from reStructuredText files. I converted all of the Linux ACPI/PCI/X86 docs to
> reST format in this series.
>
> In this version I combined ACPI and PCI docs, and added new x86 docs 
> conversion.

I'm not sure if combining all three into one big patch series has been
a good idea, honestly.

It would have been easier to review and handle otherwise.

For one, I'd like to handle the ACPI part of it myself if Jon doesn't mind that.

Thanks,
Rafael

Re: [PATCH] drivers: cpuidle: This patch fix the following checkpatch warning.

2019-04-18 Thread Rafael J. Wysocki

On Wednesday, April 17, 2019 4:52:34 PM CEST Mohan Kumar wrote:
> Use pr_debug instead of printk
> 
> WARNING: Prefer [subsystem eg: netdev]_dbg([subsystem]dev, ... then
> dev_dbg(dev, ... then pr_debug(...  to printk(KERN_DEBUG ...
> 
> Signed-off-by: Mohan Kumar 
> ---
>  drivers/cpuidle/cpuidle-powernv.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpuidle/cpuidle-powernv.c 
> b/drivers/cpuidle/cpuidle-powernv.c
> index 84b1ebe..7cf9835 100644
> --- a/drivers/cpuidle/cpuidle-powernv.c
> +++ b/drivers/cpuidle/cpuidle-powernv.c
> @@ -401,7 +401,7 @@ static int __init powernv_processor_idle_init(void)
>   powernv_cpuidle_driver_init();
>   retval = cpuidle_register(&powernv_idle_driver, NULL);
>   if (retval) {
> - printk(KERN_DEBUG "Registration of powernv driver failed.\n");
> + pr_debug("Registration of powernv driver failed.\n");
>   return retval;
>   }
>  
> @@ -413,7 +413,7 @@ static int __init powernv_processor_idle_init(void)
>  "cpuidle/powernv:dead", NULL,
>  powernv_cpuidle_cpu_dead);
>   WARN_ON(retval < 0);
> - printk(KERN_DEBUG "powernv_idle_driver registered\n");
> + pr_debug("powernv_idle_driver registered\n");
>   return 0;
>  }
>  
> 

Recently, you've sent two different patches against two different drivers with
the same subject.

IMO it is fair enough to call that "confusing".

Moreover, pr_debug() is not a direct replacement for printk(KERN_DEBUG ) as the
latter does not require dynamic debug to be enabled and I'm not really sure if
you are aware of that difference.
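
The difference can be modeled in userspace as follows. This is only a sketch under the assumption that dynamic debug is configured in, where each pr_debug() call site is off until it is enabled at run time. The model_*() names and struct callsite are hypothetical and do not correspond to kernel APIs.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

static int lines_logged;	/* how many messages reached the "log" */

/* Models printk(KERN_DEBUG ...): the message always reaches the log. */
static void model_printk_debug(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	lines_logged++;
}

/* Models one dynamic debug call site; disabled unless switched on. */
struct callsite {
	bool enabled;
};

/* Models pr_debug(): prints only if this call site has been enabled. */
#define model_pr_debug(site, ...)				\
	do {							\
		if ((site)->enabled)				\
			model_printk_debug(__VA_ARGS__);	\
	} while (0)

/* Emit one message of each kind and report how many were logged. */
static int demo(bool enable_site)
{
	struct callsite site = { .enabled = enable_site };

	lines_logged = 0;
	model_printk_debug("always logged\n");
	model_pr_debug(&site, "logged only if the call site is enabled\n");
	return lines_logged;
}
```

With the call site left disabled, only the first message is logged, which is the behavior change the patch would introduce for both messages.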

Re: [PATCH v2 1/2] cpuidle : auto-promotion for cpuidle states

2019-04-09 Thread Rafael J. Wysocki

On Fri, Apr 5, 2019 at 11:17 AM Abhishek Goel
 wrote:
>
> Currently, the cpuidle governors (menu /ladder) determine what idle state

There are three governors in 5.1-rc.

> an idling CPU should enter into based on heuristics that depend on the
> idle history on that CPU. Given that no predictive heuristic is perfect,
> there are cases where the governor predicts a shallow idle state, hoping
> that the CPU will be busy soon. However, if no new workload is scheduled
> on that CPU in the near future, the CPU will end up in the shallow state.
>
> In case of POWER, this is problematic, when the predicted state in the
> aforementioned scenario is a lite stop state, as such lite states will
> inhibit SMT folding, thereby depriving the other threads in the core from
> using the core resources.
>
> To address this, such lite states need to be autopromoted.

I don't quite agree with this statement and it doesn't even match what
the patch does AFAICS.  "Autopromotion" would be going from the given
state to a deeper one without running state selection in between, but
that's not what's going on here.

> The cpuidle-core can queue timer to correspond with the residency value of 
> the next
> available state. Thus leading to auto-promotion to a deeper idle state as
> soon as possible.

No, it doesn't automatically cause a deeper state to be used next
time.  It simply kicks the CPU out of the idle state and one more
iteration of the idle loop runs on it.  Whether or not a deeper state
will be selected in that iteration depends on the governor
computations carried out in it.
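
A toy model of that point, with a made-up two-state table and a deliberately naive governor (both are assumptions for illustration and do not reflect the menu or ladder governors): a timer wakeup only causes the selection to run again, and the outcome still depends on what the governor predicts at that time.

```c
#include <stddef.h>

/* Hypothetical idle state table; the residency values are illustrative. */
struct idle_state_model {
	const char *name;
	unsigned int target_residency_us;
};

static const struct idle_state_model states[] = {
	{ "lite", 0 },
	{ "deep", 100 },
};
#define NR_STATES (sizeof(states) / sizeof(states[0]))

/* Toy governor: deepest state whose target residency fits the prediction. */
static size_t select_state(unsigned int predicted_idle_us)
{
	size_t i, best = 0;

	for (i = 0; i < NR_STATES; i++)
		if (states[i].target_residency_us <= predicted_idle_us)
			best = i;
	return best;
}

/*
 * Models one timer-induced wakeup: the idle loop runs once more and the
 * governor is simply asked again with whatever it predicts now.
 */
static size_t state_after_timer_wakeup(unsigned int new_prediction_us)
{
	return select_state(new_prediction_us);
}
```

If the prediction is still short after the wakeup, the model picks the shallow state again, so nothing is "promoted" automatically.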

Now, this appears to be almost analogous to the "polling" state used
on x86 which uses the next idle state's target residency as a timeout.

While generally I'm not a big fan of setting up timers in the idle
loop (it sort of feels like pulling your own hair in order to get
yourself out of a swamp), if idle states like these are there in your
platform, setting up a timer to get out of them in the driver's
->enter() routine might not be particularly objectionable.  Doing that
in the core is a whole different story, though.

Generally, this adds quite a bit of complexity (on the "ugly" side of
things IMO) to the core to cover a corner case present in one
platform, while IMO it can be covered in the driver for that platform
directly.

> Signed-off-by: Abhishek Goel 
> ---
>
> v1->v2 : Removed timeout_needed and rebased to current upstream kernel
>
>  drivers/cpuidle/cpuidle.c  | 68 +-
>  drivers/cpuidle/governors/ladder.c |  3 +-
>  drivers/cpuidle/governors/menu.c   | 22 +-
>  include/linux/cpuidle.h| 10 -
>  4 files changed, 99 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 7f108309e..11ce43f19 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -36,6 +36,11 @@ static int enabled_devices;
>  static int off __read_mostly;
>  static int initialized __read_mostly;
>
> +struct auto_promotion {
> +   struct hrtimer  hrtimer;
> +   unsigned long   timeout_us;
> +};
> +
>  int cpuidle_disabled(void)
>  {
> return off;
> @@ -188,6 +193,54 @@ int cpuidle_enter_s2idle(struct cpuidle_driver *drv, 
> struct cpuidle_device *dev)
>  }
>  #endif /* CONFIG_SUSPEND */
>
> +enum hrtimer_restart auto_promotion_hrtimer_callback(struct hrtimer *hrtimer)
> +{
> +   return HRTIMER_NORESTART;
> +}
> +
> +#ifdef CONFIG_CPU_IDLE_AUTO_PROMOTION
> +DEFINE_PER_CPU(struct auto_promotion, ap);
> +
> +static void cpuidle_auto_promotion_start(int cpu, struct cpuidle_state *state)
> +{
> +   struct auto_promotion *this_ap = &per_cpu(ap, cpu);
> +
> +   if (state->flags & CPUIDLE_FLAG_AUTO_PROMOTION)
> +   hrtimer_start(&this_ap->hrtimer, ns_to_ktime(this_ap->timeout_us
> +   * 1000), HRTIMER_MODE_REL_PINNED);
> +}
> +
> +static void cpuidle_auto_promotion_cancel(int cpu)
> +{
> +   struct hrtimer *hrtimer;
> +
> +   hrtimer = &per_cpu(ap, cpu).hrtimer;
> +   if (hrtimer_is_queued(hrtimer))
> +   hrtimer_cancel(hrtimer);
> +}
> +
> +static void cpuidle_auto_promotion_update(int cpu, unsigned long timeout)
> +{
> +   per_cpu(ap, cpu).timeout_us = timeout;
> +}
> +
> +static void cpuidle_auto_promotion_init(int cpu, struct cpuidle_driver *drv)
> +{
> +   struct auto_promotion *this_ap = &per_cpu(ap, cpu);
> +
> +   hrtimer_init(&this_ap->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +   this_ap->hrtimer.function = auto_promotion_hrtimer_callback;
> +}
> +#else
> +static inline void cpuidle_auto_promotion_start(int cpu, struct cpuidle_state
> +   *state) { }
> +static inline void cpuidle_auto_promotion_cancel(int cpu) { }
> +static inline void cpuidle_auto_promotion_update(int cpu, unsigned long
> +   timeout) { }
> +static inline void

Re: [PATCH v2 1/2] cpuidle : auto-promotion for cpuidle states

2019-04-09 Thread Rafael J. Wysocki

On Tue, Apr 9, 2019 at 11:29 AM Abhishek  wrote:
>
> Hi Daniel,
>
> Thanks for such a descriptive review. I will include all the suggestions
> made in my next iteration.

Please give me some time to send comments before that.

Re: [PATCH 1/2] cpuidle : auto-promotion for cpuidle states

2019-03-22 Thread Rafael J. Wysocki

On Fri, Mar 22, 2019 at 8:31 AM Abhishek Goel
 wrote:
>
> Currently, the cpuidle governors (menu/ladder) determine what idle state
> an idling CPU should enter into based on heuristics that depend on the
> idle history on that CPU. Given that no predictive heuristic is perfect,
> there are cases where the governor predicts a shallow idle state, hoping
> that the CPU will be busy soon. However, if no new workload is scheduled
> on that CPU in the near future, the CPU will end up in the shallow state.
>
> In case of POWER, this is problematic, when the predicted state in the
> aforementioned scenario is a lite stop state, as such lite states will
> inhibit SMT folding, thereby depriving the other threads in the core from
> using the core resources.
>
> To address this, such lite states need to be autopromoted. The cpuidle-
> core can queue timer to correspond with the residency value of the next
> available state. Thus leading to auto-promotion to a deeper idle state as
> soon as possible.

Isn't the tick stopping avoidance sufficient for that?

Re: [PATCH v2] cpufreq: powernv: add of_node_put()

2018-12-11 Thread Rafael J. Wysocki

On Wednesday, November 21, 2018 5:02:04 AM CET Viresh Kumar wrote:
> On 20-11-18, 11:05, Yangtao Li wrote:
> > The of_find_node_by_path() returns a node pointer with refcount
> > > incremented, but there is the lack of use of the of_node_put() when
> > > done. Add the missing of_node_put() to release the refcount.
> > 
> > Signed-off-by: Yangtao Li 
> > ---
> > Changes in v2
> > -update changelog
> > ---
> >  drivers/cpufreq/powernv-cpufreq.c | 17 +++--
> >  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> Acked-by: Viresh Kumar 

Patch applied, thanks!

Re: [PATCH v2] cpufreq: pmac64: add of_node_put()

2018-12-11 Thread Rafael J. Wysocki

On Monday, November 26, 2018 7:02:26 AM CET Viresh Kumar wrote:
> On 23-11-18, 08:33, Yangtao Li wrote:
> > of_find_node_by_path() acquires a reference to the node
> > returned by it and that reference needs to be dropped by its caller.
> > g5_neo2_cpufreq_init() doesn't do that, so fix it.
> > 
> > Signed-off-by: Yangtao Li 
> > Acked-by: Viresh Kumar 
> > ---
> > Changes in v2:
> > -update changelog
> > ---
> >  drivers/cpufreq/pmac64-cpufreq.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/cpufreq/pmac64-cpufreq.c 
> > b/drivers/cpufreq/pmac64-cpufreq.c
> > index be623dd7b9f2..1d32a863332d 100644
> > --- a/drivers/cpufreq/pmac64-cpufreq.c
> > +++ b/drivers/cpufreq/pmac64-cpufreq.c
> > @@ -411,6 +411,7 @@ static int __init g5_neo2_cpufreq_init(struct 
> > device_node *cpunode)
> > pfunc_set_vdnap0 = pmf_find_function(root, "set-vdnap0");
> > pfunc_vdnap0_complete =
> > pmf_find_function(root, "slewing-done");
> > +   of_node_put(root);
> > if (pfunc_set_vdnap0 == NULL ||
> > pfunc_vdnap0_complete == NULL) {
> > pr_err("Can't find required platform function\n");
> 
> Acked-by: Viresh Kumar 

Patch applied, thanks!

Re: [PATCH] cpufreq: powernv: add of_node_put()

2018-11-20 Thread Rafael J. Wysocki

On Tue, Nov 20, 2018 at 1:57 PM Yangtao Li  wrote:
>
> use of_node_put() to release the refcount.

Again, this changelog is as good as no changelog at all.

If you are adding a missing of_node_put(), please say that and explain
why it is necessary.
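
For reference, the kind of explanation asked for can be illustrated with a
small userspace model of the reference counting involved.  This is only a
sketch: struct node, node_get(), node_put() and the two init functions are
hypothetical stand-ins, not the OF API, and has_function is an invented
field representing whatever the caller looks up through the node.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device tree node with a reference counter. */
struct node {
	int refcount;
	int has_function;
};

/* Models a lookup that returns the node with its reference count raised. */
static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Models dropping one reference; a NULL node is ignored. */
static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Leaky variant: the reference taken by the lookup is never dropped. */
static int init_leaky(struct node *root)
{
	struct node *n = node_get(root);

	if (!n)
		return -1;
	if (!n->has_function)
		return -1;
	return 0;
}

/* Balanced variant: use the node, then drop the reference before branching. */
static int init_balanced(struct node *root)
{
	struct node *n = node_get(root);
	int found;

	if (!n)
		return -1;
	found = n->has_function;
	node_put(n);
	return found ? 0 : -1;
}
```

In the leaky variant the counter stays elevated on every path, so the node
can never be released.  The balanced variant leaves it where it started,
which is what adding the missing put achieves.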

Thanks,
Rafael

Re: [PATCH v1 2/6] mm/memory_hotplug: make add_memory() take the device_hotplug_lock

2018-09-18 Thread Rafael J. Wysocki

On Tue, Sep 18, 2018 at 1:48 PM David Hildenbrand  wrote:
>
> add_memory() currently does not take the device_hotplug_lock, however
> is already called under the lock from
> arch/powerpc/platforms/pseries/hotplug-memory.c
> drivers/acpi/acpi_memhotplug.c
> to synchronize against CPU hot-remove and similar.
>
> In general, we should hold the device_hotplug_lock when adding memory
> to synchronize against online/offline request (e.g. from user space) -
> which already resulted in lock inversions due to device_lock() and
> mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory
> hot-add deadlock"). add_memory()/add_memory_resource() will create memory
> block devices, so this really feels like the right thing to do.
>
> Holding the device_hotplug_lock makes sure that a memory block device
> can really only be accessed (e.g. via .online/.state) from user space,
> once the memory has been fully added to the system.
>
> The lock is not held yet in
> drivers/xen/balloon.c
> arch/powerpc/platforms/powernv/memtrace.c
> drivers/s390/char/sclp_cmd.c
> drivers/hv/hv_balloon.c
> So, let's either use the locked variants or take the lock.
>
> Don't export add_memory_resource(), as it once was exported to be used
> by XEN, which is never built as a module. If somebody requires it, we
> also have to export a locked variant (as device_hotplug_lock is never
> exported).
>
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Cc: "Rafael J. Wysocki" 
> Cc: Len Brown 
> Cc: Greg Kroah-Hartman 
> Cc: Boris Ostrovsky 
> Cc: Juergen Gross 
> Cc: Nathan Fontenot 
> Cc: John Allen 
> Cc: Andrew Morton 
> Cc: Michal Hocko 
> Cc: Dan Williams 
> Cc: Joonsoo Kim 
> Cc: Vlastimil Babka 
> Cc: Oscar Salvador 
> Cc: Mathieu Malaterre 
> Cc: Pavel Tatashin 
> Cc: YASUAKI ISHIMATSU 
> Reviewed-by: Pavel Tatashin 
> Signed-off-by: David Hildenbrand 
> ---
>  .../platforms/pseries/hotplug-memory.c|  2 +-
>  drivers/acpi/acpi_memhotplug.c|  2 +-
>  drivers/base/memory.c |  9 ++--
>  drivers/xen/balloon.c |  3 +++
>  include/linux/memory_hotplug.h|  1 +
>  mm/memory_hotplug.c   | 22 ---
>  6 files changed, 32 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
> b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index b3f54466e25f..2e6f41dc103a 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -702,7 +702,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
> nid = memory_add_physaddr_to_nid(lmb->base_addr);
>
> /* Add the memory */
> -   rc = add_memory(nid, lmb->base_addr, block_sz);
> +   rc = __add_memory(nid, lmb->base_addr, block_sz);
> if (rc) {
> dlpar_remove_device_tree_lmb(lmb);
> return rc;
> diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
> index 811148415993..8fe0960ea572 100644
> --- a/drivers/acpi/acpi_memhotplug.c
> +++ b/drivers/acpi/acpi_memhotplug.c
> @@ -228,7 +228,7 @@ static int acpi_memory_enable_device(struct 
> acpi_memory_device *mem_device)
> if (node < 0)
> node = memory_add_physaddr_to_nid(info->start_addr);
>
> -   result = add_memory(node, info->start_addr, info->length);
> +   result = __add_memory(node, info->start_addr, info->length);
>
> /*
>  * If the memory block has been used by the kernel, 
> add_memory()
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 817320c7c4c1..40cac122ec73 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -519,15 +519,20 @@ memory_probe_store(struct device *dev, struct 
> device_attribute *attr,
> if (phys_addr & ((pages_per_block << PAGE_SHIFT) - 1))
> return -EINVAL;
>
> +   ret = lock_device_hotplug_sysfs();
> +   if (ret)
> +   goto out;
> +
> nid = memory_add_physaddr_to_nid(phys_addr);
> -   ret = add_memory(nid, phys_addr,
> -MIN_MEMORY_BLOCK_SIZE * sections_per_block);
> +   ret = __add_memory(nid, phys_addr,
> +  MIN_MEMORY_BLOCK_SIZE * sections_per_block);
>
> if (ret)
> goto out;
>
> ret = count;
>  out:
> +   unlock_device_hotplug();
> return ret;
>  }
>

Re: [PATCH v1 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock

2018-09-18 Thread Rafael J. Wysocki

On Tue, Sep 18, 2018 at 1:48 PM David Hildenbrand  wrote:
>
> remove_memory() is exported right now but requires the
> device_hotplug_lock, which is not exported. So let's provide a variant
> that takes the lock and only export that one.
>
> The lock is already held in
> arch/powerpc/platforms/pseries/hotplug-memory.c
> drivers/acpi/acpi_memhotplug.c
> So, let's use the locked variant.
>
> The lock is not held by the caller (but taken) in
> arch/powerpc/platforms/powernv/memtrace.c
> So let's keep using the (now) locked variant.
>
> Apart from that, there are no other users in the tree.
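
The locked/unlocked split described above can be sketched in userspace as
follows.  This is a single-threaded model under stated assumptions: the
"lock" is a plain flag, and remove_block_locked() / remove_block() are
hypothetical names that play the roles of __remove_memory() (caller holds
the lock) and remove_memory() (takes the lock itself) in the patch.

```c
/* Hypothetical single-threaded model of the hotplug lock: 1 while held. */
static int hotplug_lock_held;

/* Counts successful removals so the behaviour can be observed. */
static int removed_blocks;

static void lock_hotplug(void)
{
	hotplug_lock_held = 1;
}

static void unlock_hotplug(void)
{
	hotplug_lock_held = 0;
}

/*
 * Variant for callers that already hold the lock.  The model returns -1
 * when the locking rule is violated instead of silently proceeding.
 */
static int remove_block_locked(void)
{
	if (!hotplug_lock_held)
		return -1;
	removed_blocks++;
	return 0;
}

/* Variant that takes the lock itself; safe to call without holding it. */
static int remove_block(void)
{
	int ret;

	lock_hotplug();
	ret = remove_block_locked();
	unlock_hotplug();
	return ret;
}
```

Callers that run under the lock already use the first variant, while
everyone else uses the second one and does not need access to the lock
at all.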
>
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Cc: "Rafael J. Wysocki" 
> Cc: Len Brown 
> Cc: Rashmica Gupta 
> Cc: Michael Neuling 
> Cc: Balbir Singh 
> Cc: Nathan Fontenot 
> Cc: John Allen 
> Cc: Andrew Morton 
> Cc: Michal Hocko 
> Cc: Dan Williams 
> Cc: Joonsoo Kim 
> Cc: Vlastimil Babka 
> Cc: Pavel Tatashin 
> Cc: Greg Kroah-Hartman 
> Cc: Oscar Salvador 
> Cc: YASUAKI ISHIMATSU 
> Cc: Mathieu Malaterre 
> Reviewed-by: Pavel Tatashin 
> Signed-off-by: David Hildenbrand 
> ---
>  arch/powerpc/platforms/powernv/memtrace.c   | 2 --
>  arch/powerpc/platforms/pseries/hotplug-memory.c | 6 +++---
>  drivers/acpi/acpi_memhotplug.c  | 2 +-
>  include/linux/memory_hotplug.h  | 3 ++-
>  mm/memory_hotplug.c | 9 -
>  5 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/memtrace.c 
> b/arch/powerpc/platforms/powernv/memtrace.c
> index 51dc398ae3f7..8f1cd4f3bfd5 100644
> --- a/arch/powerpc/platforms/powernv/memtrace.c
> +++ b/arch/powerpc/platforms/powernv/memtrace.c
> @@ -90,9 +90,7 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, 
> u64 nr_pages)
> walk_memory_range(start_pfn, end_pfn, (void *)MEM_OFFLINE,
>   change_memblock_state);
>
> -   lock_device_hotplug();
> remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
> -   unlock_device_hotplug();
>
> return true;
>  }
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
> b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index c1578f54c626..b3f54466e25f 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -334,7 +334,7 @@ static int pseries_remove_memblock(unsigned long base, 
> unsigned int memblock_siz
> nid = memory_add_physaddr_to_nid(base);
>
> for (i = 0; i < sections_per_block; i++) {
> -   remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
> +   __remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
> base += MIN_MEMORY_BLOCK_SIZE;
> }
>
> @@ -423,7 +423,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
> block_sz = pseries_memory_block_size();
> nid = memory_add_physaddr_to_nid(lmb->base_addr);
>
> -   remove_memory(nid, lmb->base_addr, block_sz);
> +   __remove_memory(nid, lmb->base_addr, block_sz);
>
> /* Update memory regions for memory remove */
> memblock_remove(lmb->base_addr, block_sz);
> @@ -710,7 +710,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
>
> rc = dlpar_online_lmb(lmb);
> if (rc) {
> -   remove_memory(nid, lmb->base_addr, block_sz);
> +   __remove_memory(nid, lmb->base_addr, block_sz);
> dlpar_remove_device_tree_lmb(lmb);
> } else {
> lmb->flags |= DRCONF_MEM_ASSIGNED;
> diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
> index 6b0d3ef7309c..811148415993 100644
> --- a/drivers/acpi/acpi_memhotplug.c
> +++ b/drivers/acpi/acpi_memhotplug.c
> @@ -282,7 +282,7 @@ static void acpi_memory_remove_memory(struct 
> acpi_memory_device *mem_device)
> nid = memory_add_physaddr_to_nid(info->start_addr);
>
> acpi_unbind_memory_blocks(info);
> -   remove_memory(nid, info->start_addr, info->length);
> +   __remove_memory(nid, info->start_addr, info->length);
> list_del(&info->list);
> kfree(info);
> }
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 34a28227068d..1f096852f479 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -301,6 +301,7 @@ extern bool is_mem_section_removable(unsigned long pfn, 
> unsigned long nr_pages);
>  extern v

Re: [RFC 08/15] x86: PCI: clean up pcibios_scan_root()

2018-08-20 Thread Rafael J. Wysocki

On Mon, Aug 20, 2018 at 1:17 PM Arnd Bergmann  wrote:
>
> On Mon, Aug 20, 2018 at 10:31 AM Rafael J. Wysocki  wrote:
> > On Fri, Aug 17, 2018 at 12:32 PM Arnd Bergmann  wrote:
>
> > > -static struct pci_bus *pci_scan_root_bus(struct device *parent, int bus,
> > > -   struct pci_ops *ops, void *sysdata, struct list_head 
> > > *resources)
> > > +void pcibios_scan_root(int busnum)
> > >  {
> > > +   struct pci_sysdata *sd;
> > > struct pci_host_bridge *bridge;
> > > int error;
> > >
> > > -   bridge = pci_alloc_host_bridge(0);
> > > -   if (!bridge)
> > > -   return NULL;
> > > +   bridge = pci_alloc_host_bridge(sizeof(sd));
> > > +   if (!bridge) {
> > > +   printk(KERN_ERR "PCI: OOM, skipping PCI bus %02x\n", 
> > > busnum);
> > > +   return;
> > > +   }
> > > +   sd = pci_host_bridge_priv(bridge);
> >
> > This looks fishy, as bridge->private is not set at this point AFAICS,
> > unless one of the previous patches changes that.
>
> bridge->private is what comes after the bridge structure, and it's allocated
> by pci_alloc_host_bridge() passing the size of the structure we want
> for this private area.

I see, sorry for the noise.
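
The allocation scheme Arnd describes, with the private area placed right
behind the structure, can be sketched in plain C.  Everything below is a
hypothetical miniature (struct bridge, struct sysdata, bridge_alloc(),
bridge_priv()), not the PCI core code.

```c
#include <stdlib.h>

/* Hypothetical miniature of a host bridge with a trailing private area. */
struct bridge {
	int busnr;
	/* Private data lives directly behind the structure. */
	unsigned long priv[];
};

/* Hypothetical per-bridge data stored in the private area. */
struct sysdata {
	int node;
	int domain;
};

/* Allocate the bridge and priv_size zeroed bytes of private data in one go. */
static struct bridge *bridge_alloc(size_t priv_size)
{
	return calloc(1, sizeof(struct bridge) + priv_size);
}

/* Return the private area; valid as soon as the allocation succeeded. */
static void *bridge_priv(struct bridge *b)
{
	return b->priv;
}
```

With this layout the private pointer needs no separate setup step, which
is why it is usable right after the allocation.  Note that the size passed
in has to be that of the structure (sizeof(*sd) in the terms of the quoted
patch) for the private area to hold it.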

Re: [RFC 07/15] PCI/ACPI: clean up acpi_pci_root_create()

2018-08-20 Thread Rafael J. Wysocki

On Mon, Aug 20, 2018 at 1:20 PM Arnd Bergmann  wrote:
>
> On Mon, Aug 20, 2018 at 10:23 AM Rafael J. Wysocki  wrote:
> > On Fri, Aug 17, 2018 at 12:33 PM Arnd Bergmann  wrote:
> > > @@ -909,8 +881,7 @@ struct pci_bus *acpi_pci_root_create(struct 
> > > acpi_pci_root *root,
> > > int ret, busnum = root->secondary.start;
> > > struct acpi_device *device = root->device;
> > > int node = acpi_get_node(device->handle);
> > > -   struct pci_bus *bus;
> > > -   struct pci_host_bridge *host_bridge;
> > > +   struct pci_host_bridge *bridge;
> >
> > Why "bridge" and not "host" or even something to stand for "root complex"?
> >
> > Or maybe it can still be "host_bridge"?
>
> I did this for consistency with the naming in drivers/pci/probe.c,
> which always declares the local variable as 'struct pci_host_bridge *bridge'.
> It's easy to change here if you feel strongly about it (I don't).

I would leave host_bridge here.  It would make the patch smaller too I think.

Re: [RFC 08/15] x86: PCI: clean up pcibios_scan_root()

2018-08-20 Thread Rafael J. Wysocki

On Fri, Aug 17, 2018 at 12:32 PM Arnd Bergmann  wrote:
>
> pcibios_scan_root() is now just a wrapper around pci_scan_root_bus(),
> and merging the two into one makes it shorter and more readable.
>
> We can also take advantage of pci_alloc_host_bridge() doing the
> allocation of the sysdata for us, which helps if we ever want to
> allow hot-unplugging the host bridge itself.
>
> We might be able to simplify it further using pci_host_probe(),
> but I wasn't sure about the resource registration there.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/x86/pci/common.c | 53 ++-
>  1 file changed, 17 insertions(+), 36 deletions(-)
>
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index e740d9aa4024..920d0885434c 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -453,54 +453,35 @@ void __init dmi_check_pciprobe(void)
> dmi_check_system(pciprobe_dmi_table);
>  }
>
> -static struct pci_bus *pci_scan_root_bus(struct device *parent, int bus,
> -   struct pci_ops *ops, void *sysdata, struct list_head 
> *resources)
> +void pcibios_scan_root(int busnum)
>  {
> +   struct pci_sysdata *sd;
> struct pci_host_bridge *bridge;
> int error;
>
> -   bridge = pci_alloc_host_bridge(0);
> -   if (!bridge)
> -   return NULL;
> +   bridge = pci_alloc_host_bridge(sizeof(sd));
> +   if (!bridge) {
> +   printk(KERN_ERR "PCI: OOM, skipping PCI bus %02x\n", busnum);
> +   return;
> +   }
> +   sd = pci_host_bridge_priv(bridge);

This looks fishy, as bridge->private is not set at this point AFAICS,
unless one of the previous patches changes that.

>
> -   list_splice_init(resources, &bridge->windows);
> -   bridge->dev.parent = parent;
> -   bridge->sysdata = sysdata;
> -   bridge->busnr = bus;
> -   bridge->ops = ops;
> +   sd->node = x86_pci_root_bus_node(busnum);
> +   x86_pci_root_bus_resources(busnum, &bridge->windows);
> +   bridge->sysdata = sd;
> +   bridge->busnr = busnum;
> +   bridge->ops = &pci_root_ops;
>
> +   printk(KERN_DEBUG "PCI: Probing PCI hardware (bus %02x)\n", busnum);
> error = pci_scan_root_bus_bridge(bridge);
> if (error < 0)
> goto err_out;
>
> -   return bridge->bus;
> +   pci_bus_add_devices(bridge->bus);
> +   return;
>
>  err_out:
> -   kfree(bridge);
> -   return NULL;
> -}
> -
> -void pcibios_scan_root(int busnum)
> -{
> -   struct pci_bus *bus;
> -   struct pci_sysdata *sd;
> -   LIST_HEAD(resources);
> -
> -   sd = kzalloc(sizeof(*sd), GFP_KERNEL);
> -   if (!sd) {
> -   printk(KERN_ERR "PCI: OOM, skipping PCI bus %02x\n", busnum);
> -   return;
> -   }
> -   sd->node = x86_pci_root_bus_node(busnum);
> -   x86_pci_root_bus_resources(busnum, &resources);
> -   printk(KERN_DEBUG "PCI: Probing PCI hardware (bus %02x)\n", busnum);
> -   bus = pci_scan_root_bus(NULL, busnum, &pci_root_ops, sd, &resources);
> -   if (!bus) {
> -   pci_free_resource_list(&resources);
> -   kfree(sd);
> -   return;
> -   }
> -   pci_bus_add_devices(bus);
> +   pci_free_host_bridge(bridge);
>  }
>
>  void __init pcibios_set_cache_line_size(void)
> --
> 2.18.0
>

Re: [RFC 07/15] PCI/ACPI: clean up acpi_pci_root_create()

2018-08-20 Thread Rafael J. Wysocki

On Fri, Aug 17, 2018 at 12:33 PM Arnd Bergmann  wrote:
>
> The acpi_pci_create_root_bus() can be fully integrated into
> acpi_pci_root_create(), improving a few things:
>
> * We can call pci_scan_root_bus_bridge(), which registers and
>   scans the bridge in one step.
> * After a failure in pci_register_host_bridge(), we correctly
>   clean up the resources.
> * The bridge settings (release function, flags, operations etc)
>   can get set up before registering the bridge.
> * Further cleanup would be possible, removing duplication between
>   pci_host_bridge and some ACPI structures.
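
The cleanup-on-failure point in the list above follows the usual goto
unwinding pattern, which can be modelled in a few lines.  The sketch is a
userspace model with invented names: root_create() and the two counters
are hypothetical, and the alloc_ok/scan_ok flags stand in for the
allocation and scan steps succeeding or failing.

```c
/* Counters recording which cleanup steps ran (hypothetical model). */
static int bridge_freed;
static int info_released;

/*
 * Each failure jumps to the label that undoes everything set up so far;
 * later labels fall through into earlier ones, so cleanup runs in reverse
 * order of setup.
 */
static int root_create(int alloc_ok, int scan_ok)
{
	/* Step 1: allocate the bridge; only the info needs releasing on failure. */
	if (!alloc_ok)
		goto out_release_info;

	/* Step 2: register and scan; on failure the bridge must be freed too. */
	if (!scan_ok)
		goto out_release_bridge;

	return 0;

out_release_bridge:
	bridge_freed++;
out_release_info:
	info_released++;
	return -1;
}
```

A failed scan therefore releases both the bridge and the info, while a
failed allocation releases only the info.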
>
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/acpi/pci_root.c | 68 +++--
>  1 file changed, 24 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index 85dbcf47015b..5f73de3b67c8 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -873,34 +873,6 @@ static void acpi_pci_root_release_info(struct 
> pci_host_bridge *bridge)
> __acpi_pci_root_release_info(bridge->release_data);
>  }
>
> -static struct pci_bus *acpi_pci_create_root_bus(struct device *parent, int 
> bus,
> -   struct pci_ops *ops, void *sysdata, struct list_head 
> *resources)
> -{
> -   int error;
> -   struct pci_host_bridge *bridge;
> -
> -   bridge = pci_alloc_host_bridge(0);
> -   if (!bridge)
> -   return NULL;
> -
> -   bridge->dev.parent = parent;
> -
> -   list_splice_init(resources, &bridge->windows);
> -   bridge->sysdata = sysdata;
> -   bridge->busnr = bus;
> -   bridge->ops = ops;
> -
> -   error = pci_register_host_bridge(bridge);
> -   if (error < 0)
> -   goto err_out;
> -
> -   return bridge->bus;
> -
> -err_out:
> -   kfree(bridge);
> -   return NULL;
> -}
> -
>  struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
>  struct acpi_pci_root_ops *ops,
>  struct acpi_pci_root_info *info,
> @@ -909,8 +881,7 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root 
> *root,
> int ret, busnum = root->secondary.start;
> struct acpi_device *device = root->device;
> int node = acpi_get_node(device->handle);
> -   struct pci_bus *bus;
> -   struct pci_host_bridge *host_bridge;
> +   struct pci_host_bridge *bridge;

Why "bridge" and not "host" or even something to stand for "root complex"?

Or maybe it can still be "host_bridge"?

>
> info->root = root;
> info->bridge = device;
> @@ -930,30 +901,39 @@ struct pci_bus *acpi_pci_root_create(struct 
> acpi_pci_root *root,
>
> pci_acpi_root_add_resources(info);
> pci_add_resource(&info->resources, &root->secondary);
> -   bus = acpi_pci_create_root_bus(NULL, busnum, ops->pci_ops,
> - sysdata, &info->resources);
> -   if (!bus)
> +
> +   bridge = pci_alloc_host_bridge(0);
> +   if (!bridge)
> goto out_release_info;
>
> -   host_bridge = to_pci_host_bridge(bus->bridge);
> +   list_splice_init(&info->resources, &bridge->windows);
> +   bridge->sysdata = sysdata;
> +   bridge->busnr = busnum;
> +   bridge->ops = ops->pci_ops;
> +   pci_set_host_bridge_release(bridge, acpi_pci_root_release_info,
> +   info);
> +
> if (!(root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL))
> -   host_bridge->native_pcie_hotplug = 0;
> +   bridge->native_pcie_hotplug = 0;
> if (!(root->osc_control_set & OSC_PCI_SHPC_NATIVE_HP_CONTROL))
> -   host_bridge->native_shpc_hotplug = 0;
> +   bridge->native_shpc_hotplug = 0;
> if (!(root->osc_control_set & OSC_PCI_EXPRESS_AER_CONTROL))
> -   host_bridge->native_aer = 0;
> +   bridge->native_aer = 0;
> if (!(root->osc_control_set & OSC_PCI_EXPRESS_PME_CONTROL))
> -   host_bridge->native_pme = 0;
> +   bridge->native_pme = 0;
> if (!(root->osc_control_set & OSC_PCI_EXPRESS_LTR_CONTROL))
> -   host_bridge->native_ltr = 0;
> +   bridge->native_ltr = 0;
> +
> +   ret = pci_scan_root_bus_bridge(bridge);
> +   if (ret < 0)
> +   goto out_release_bridge;
>
> -   pci_scan_child_bus(bus);
> -   pci_set_host_bridge_release(host_bridge, acpi_pci_root_release_info,
> -   info);
> if (node != NUMA_NO_NODE)
> -   dev_printk(KERN_DEBUG, &bus->dev, "on NUMA node %d\n", node);
> -   return bus;
> +   dev_printk(KERN_DEBUG, &bridge->bus->dev, "on NUMA node %d\n", node);
> +   return bridge->bus;
>
> +out_release_bridge:
> +   pci_free_host_bridge(bridge);
>  out_release_info:
> __acpi_pci_root_release_info(info);
> return NULL;
> --
> 2.18.0
>
