Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-16 Thread Thomas Gleixner
Hsin-Yi Wang  writes:
> On Thu, Jan 16, 2020 at 8:30 AM Thomas Gleixner  wrote:
> We saw this issue on regular reboot (not panic) on arm64: If tick
> broadcast and smp_send_stop() happen together and the first broadcast
> arrives to some idled CPU that hasn't already executed reboot ipi to
> run in spinloop, it would try to broadcast to another CPU, but that
> target CPU is already marked as offline by set_cpu_online() in reboot
> ipi, and a warning comes out since tick_handle_oneshot_broadcast()
> would check if it tries to broadcast to offline cpus. Most of the time
> the CPU getting the broadcast interrupt is already in the spinloop and
> thus isn't going to receive interrupts from the broadcast timer.

The timer broadcasting is obviously broken by the existing reboot unplug
mechanism as the outgoing CPU should remove itself from the broadcast.

Just addressing the broadcast issue is not sufficient as there are tons
of other places which rely on consistency of the various cpu masks.

> If system supports hotplug, _cpu_down() would properly handle tasks
> termination such as remove CPU from timer broadcasting by
> tick_offline_cpu()...etc, as well as some interrupt handling.

Well, emphasis on 'if system supports hotplug'. If not, then you are
back to square one. On ARM64 hotplug is selectable by a config option.

So either we mandate HOTPLUG_CPU for SMP and get rid of all the
ifdeffery or we need to have a mechanism which works on !HOTPLUG_CPU as
well.

That whole reboot/shutdown stuff is an unpenetrable mess of notifiers
and architecture hackery, so something generic and understandable is
really required.

Thanks,

tglx


Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-16 Thread Hsin-Yi Wang
On Wed, Jan 15, 2020 at 7:41 PM Sudeep Holla  wrote:
>
> On Wed, Jan 15, 2020 at 02:34:10PM +0800, Hsin-Yi Wang wrote:
> > Currently system reboots uses architecture specific codes (smp_send_stop)
> > to offline non reboot CPUs. Most architecture's implementation is looping
> > through all non reboot online CPUs and call ipi function to each of them. 
> > Some
> > architecture like arm64, arm, and x86... would set offline masks to cpu 
> > without
> > really offline them. This causes some race condition and kernel warning 
> > comes
> > out sometimes when system reboots.
> >
> > This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would offline 
> > cpus in
> > migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop 
> > for
> > checking online cpus would be an empty loop. If architecture don't enable 
> > this
> > config, or some cpus somehow fails to offline, it would fallback to ipi
> > function.
> >
>
> What's the timing impact on systems with large number of CPUs(say 256 or
> more) ? I remember we added some change to reduce the wait times for
> offlining CPUs in system suspend path on arm64, still not negligible.
>

This is not the final solution, but I would still provided some data
points here:

Tested on my arm64 with 4 cpu: 2 a53 and 2 a72.
Offlining 3 cpu takes about 60~65 ms
Offlining 2 cpu(a53+a72 or a72+a72) takes about 42~47 ms
Offlining 1 cpu(a53 or a72) takes about 23~25 ms.

It would take longer time for systems with large number of CPUs.


Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-16 Thread Hsin-Yi Wang
On Thu, Jan 16, 2020 at 8:30 AM Thomas Gleixner  wrote:
>
> Hsin-Yi Wang  writes:
>
> > Currently system reboots uses architecture specific codes (smp_send_stop)
> > to offline non reboot CPUs. Most architecture's implementation is looping
> > through all non reboot online CPUs and call ipi function to each of them. 
> > Some
> > architecture like arm64, arm, and x86... would set offline masks to cpu 
> > without
> > really offline them. This causes some race condition and kernel warning 
> > comes
> > out sometimes when system reboots.
>
> 'some race condition and kernel warning' is pretty useless information.
> Please describe exactly which kind of issues are caused by the current
> mechanism. Especially the race conditions are the interesting part (the
> warnings are just a consequence).
>
> > This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would
> > offline cpus in
>
> Please read Documentation/process/submitting-patches.rst and search for
> 'This patch'.
>
> > migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop 
> > for
> > checking online cpus would be an empty loop.
>
> This does not make any sense. The issues which you are trying to solve
> are going to be still there when CONFIG_HOTPLUG_CPU is disabled.
>
> > If architecture don't enable this config, or some cpus somehow fails
> > to offline, it would fallback to ipi function.
>
> This is really a half baken solution which keeps the various pointlessly
> different pseudo reboot/kexec offlining implementations around. So with
> this we have yet more code which only works depending on kernel
> configuration and has the issue of potentially not being able to offline
> a CPU. IOW this is going to fail completely in cases where a system is
> in a state which prevents regular hotplug.
>
> The existing pseudo-offline functions have timeouts and eventually a
> fallback, e.g. the NMI fallback on x86. With this proposed regular
> offline solution this will just get stuck w/o a chance to force
> recovery.
>
> While I like the idea and surely agree that the ideal solution is to
> properly shutdown the CPUs on reboot, we need to take a step back and
> look at the minimum requirements for a regular shutdown/reboot and at
> the same time have a look at the requirements for emergency shutdown and
> kexec/kcrash. Having proper information about the race conditions and
> warnings you mentioned would be a good starting point.
>

We saw this issue on regular reboot (not panic) on arm64: If tick
broadcast and smp_send_stop() happen together and the first broadcast
arrives to some idled CPU that hasn't already executed reboot ipi to
run in spinloop, it would try to broadcast to another CPU, but that
target CPU is already marked as offline by set_cpu_online() in reboot
ipi, and a warning comes out since tick_handle_oneshot_broadcast()
would check if it tries to broadcast to offline cpus. Most of the time
the CPU getting the broadcast interrupt is already in the spinloop and
thus isn't going to receive interrupts from the broadcast timer.

[   27.032080] Set kernel.core_pattern before fs.suid_dumpable.
[   27.978628] reboot: Restarting system
[   27.978919] WARNING: CPU: 3 PID: 0 at
/mnt/host/source/src/third_party/kernel/v4.19/kernel/time/tick-broadcast.c:652
tick_handle_oneshot_broadcast+0x1f8/0x214
[   27.978932] Modules linked in: rfcomm uinput bridge stp llc
nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp esp6 ah6
xfrm6_mode_tunnel xfrm6_mode_transport xfrm4_mode_tunnel
xfrm4_mode_transport ip6t_REJECT ip6t_ipv6header hci_uart btqca
bluetooth ipt_MASQUERADE ecdh_generic lzo_rle lzo_compress zram fuse
cros_ec_sensors_sync cros_ec_sensors_ring cros_ec_light_prox
cros_ec_sensors industrialio_triggered_buffer kfifo_buf
cros_ec_sensors_core ath10k_sdio ath10k_core ath mac80211 cfg80211
asix usbnet mii joydev
[   27.979102] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G S
  4.19.56 #79
[   27.979113] Hardware name: MediaTek krane sku176 board (DT)
[   27.979127] pstate: 0085 (nzcv daIf -PAN -UAO)
[   27.979140] pc : tick_handle_oneshot_broadcast+0x1f8/0x214
[   27.979154] lr : tick_handle_oneshot_broadcast+0x108/0x214
[   27.979162] sp : fff85c851610
[   27.979171] x29: fff85c851670 x28: ff9082785510
[   27.979187] x27: ff90822da700 x26: ff90826169c8
[   27.979202] x25: ff9082616000 x24: 0001
[   27.979217] x23: ff90827854f8 x22: 00067a6599b8
[   27.979232] x21:  x20: 7fff
[   27.979248] x19: ff9082785508 x18: 
[   27.979263] x17:  x16: 
[   27.979278] x15:  x14: fff85bf08040
[   27.979293] x13: ff9082785000 x12: 0069
[   27.979308] x11: ff9082785000 x10: 0018
[   27.979323] x9 : 0010 x8 : ff9082615000
[   27.979338] x7 :  x6 : 003f
[   27.979353] x5 : 0040 x4 : 0102
[   27.979367] x3 : 

Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-15 Thread Thomas Gleixner
Hsin-Yi Wang  writes:

> Currently system reboots uses architecture specific codes (smp_send_stop)
> to offline non reboot CPUs. Most architecture's implementation is looping
> through all non reboot online CPUs and call ipi function to each of them. Some
> architecture like arm64, arm, and x86... would set offline masks to cpu 
> without
> really offline them. This causes some race condition and kernel warning comes
> out sometimes when system reboots.

'some race condition and kernel warning' is pretty useless information.
Please describe exactly which kind of issues are caused by the current
mechanism. Especially the race conditions are the interesting part (the
warnings are just a consequence).

> This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would
> offline cpus in

Please read Documentation/process/submitting-patches.rst and search for
'This patch'.

> migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop 
> for
> checking online cpus would be an empty loop.

This does not make any sense. The issues which you are trying to solve
are going to be still there when CONFIG_HOTPLUG_CPU is disabled.

> If architecture don't enable this config, or some cpus somehow fails
> to offline, it would fallback to ipi function.

This is really a half baken solution which keeps the various pointlessly
different pseudo reboot/kexec offlining implementations around. So with
this we have yet more code which only works depending on kernel
configuration and has the issue of potentially not being able to offline
a CPU. IOW this is going to fail completely in cases where a system is
in a state which prevents regular hotplug.

The existing pseudo-offline functions have timeouts and eventually a
fallback, e.g. the NMI fallback on x86. With this proposed regular
offline solution this will just get stuck w/o a chance to force
recovery.

While I like the idea and surely agree that the ideal solution is to
properly shutdown the CPUs on reboot, we need to take a step back and
look at the minimum requirements for a regular shutdown/reboot and at
the same time have a look at the requirements for emergency shutdown and
kexec/kcrash. Having proper information about the race conditions and
warnings you mentioned would be a good starting point.

> Opt in this config for architectures that support CONFIG_HOTPLUG_CPU.

This is not opt-in. You force that on all architectures which support
CONFIG_HOTPLUG_CPU. The way we do this normally is to provide the
infrastructure first and then have separate patches (one per
architecture) enabling this, which allows the architecture maintainers
to decide individually.

Thanks,

tglx


Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-15 Thread Sudeep Holla
On Wed, Jan 15, 2020 at 02:34:10PM +0800, Hsin-Yi Wang wrote:
> Currently system reboots uses architecture specific codes (smp_send_stop)
> to offline non reboot CPUs. Most architecture's implementation is looping
> through all non reboot online CPUs and call ipi function to each of them. Some
> architecture like arm64, arm, and x86... would set offline masks to cpu 
> without
> really offline them. This causes some race condition and kernel warning comes
> out sometimes when system reboots.
>
> This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would offline 
> cpus in
> migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop 
> for
> checking online cpus would be an empty loop. If architecture don't enable this
> config, or some cpus somehow fails to offline, it would fallback to ipi
> function.
>

What's the timing impact on systems with large number of CPUs(say 256 or
more) ? I remember we added some change to reduce the wait times for
offlining CPUs in system suspend path on arm64, still not negligible.

--
Regards,
Sudeep


Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-15 Thread Hsin-Yi Wang
On Wed, Jan 15, 2020 at 5:49 PM Rafael J. Wysocki  wrote:
>
> On Wed, Jan 15, 2020 at 7:35 AM Hsin-Yi Wang  wrote:
> >
> > Currently system reboots uses architecture specific codes (smp_send_stop)
> > to offline non reboot CPUs. Most architecture's implementation is looping
> > through all non reboot online CPUs and call ipi function to each of them. 
> > Some
> > architecture like arm64, arm, and x86... would set offline masks to cpu 
> > without
> > really offline them. This causes some race condition and kernel warning 
> > comes
> > out sometimes when system reboots.
> >
> > This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would offline 
> > cpus in
> > migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop 
> > for
> > checking online cpus would be an empty loop. If architecture don't enable 
> > this
> > config, or some cpus somehow fails to offline, it would fallback to ipi
> > function.
> >
> > Opt in this config for architectures that support CONFIG_HOTPLUG_CPU.
> >
> > Signed-off-by: Hsin-Yi Wang 
> > ---
> > Change from v4:
> > * fix a few nits: naming, comments, remove Kconfig text...
> >
> > Change from v3:
> > * Opt in config for architectures that support CONFIG_HOTPLUG_CPU
> > * Merge function offline_secondary_cpus() and freeze_secondary_cpus()
> >   with an additional flag.
>
> This does not seem to be a very good idea, since
> freeze_secondary_cpus() does much more than you need for reboot.
>
> For reboot, you basically only need to do something like this AFAICS:
>
> cpu_maps_update_begin();
>
> for_each_online_cpu(i) {
> if (i != cpu)
> _cpu_down(i, 1, CPUHP_OFFLINE);
> }
> cpu_hotplug_disabled++;
>
> cpu_maps_update_done();
>
> And you may put this into a function defined outside of CONFIG_PM_SLEEP.
>
v2's implementation is similar to this. The conclusion in v2[1] is
that since these 2 functions are similar, we should merge them. I'm
fine both ways but slightly prefer v2's. Maybe wait for others to
comment?

[1] https://lore.kernel.org/lkml/87muarpcwm@vitty.brq.redhat.com/
> >
> > Change from v2:
> > * Add another config instead of configed by CONFIG_HOTPLUG_CPU
>
> So why exactly is this new Kconfig option needed?
>
> Everybody supporting CPU hotplug seems to opt in anyway.
>
Currently we opt-in for all arch that supports HOTPLUG_CPU, but if
some arch decides that this would make reboot slow (or maybe other
reasons), they can choose to opt-out.
I have only tested on arm64 and x86 for now.
> [cut]
>
> >
> > -int freeze_secondary_cpus(int primary)
> > +int freeze_secondary_cpus(int primary, bool reboot)
> >  {
> > int cpu, error = 0;
> >
> > @@ -1237,11 +1237,13 @@ int freeze_secondary_cpus(int primary)
> > if (cpu == primary)
> > continue;
> >
> > -   if (pm_wakeup_pending()) {
> > +#ifdef CONFIG_PM_SLEEP
> > +   if (!reboot && pm_wakeup_pending()) {
> > pr_info("Wakeup pending. Abort CPU freeze\n");
> > error = -EBUSY;
> > break;
> > }
> > +#endif
>
> Please avoid using #ifdefs in function bodies.  This makes the code
> hard to maintain in the long term.
>
> >
> > trace_suspend_resume(TPS("CPU_OFF"), cpu, true);
> > error = _cpu_down(cpu, 1, CPUHP_OFFLINE);
> > @@ -1250,7 +1252,9 @@ int freeze_secondary_cpus(int primary)
> > cpumask_set_cpu(cpu, frozen_cpus);
> > else {
> > pr_err("Error taking CPU%d down: %d\n", cpu, error);
> > -   break;
> > +   /* When rebooting, offline as many CPUs as 
> > possible. */
> > +   if (!reboot)
> > +   break;
> > }
> > }
> >
> > diff --git a/kernel/reboot.c b/kernel/reboot.c
> > index c4d472b7f1b4..12f643b66e57 100644
> > --- a/kernel/reboot.c
> > +++ b/kernel/reboot.c
> > @@ -7,6 +7,7 @@
> >
> >  #define pr_fmt(fmt)"reboot: " fmt
> >
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -220,7 +221,9 @@ void migrate_to_reboot_cpu(void)
> > /* The boot cpu is always logical cpu 0 */
> > int cpu = reboot_cpu;
> >
> > +#if !IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT)
> > cpu_hotplug_disable();
> > +#endif
>
> You can write this as
>
> if (!IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT))
> cpu_hotplug_disable();
>
> That's what IS_ENABLED() is there for.
>
> >
> > /* Make certain the cpu I'm about to reboot on is online */
> > if (!cpu_online(cpu))
> > @@ -231,6 +234,11 @@ void migrate_to_reboot_cpu(void)
> >
> > /* Make certain I only run on the appropriate processor */
> > set_cpus_allowed_ptr(current, cpumask_of(cpu));
> > +
> > +#if IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT)
> > +   /* Offline other cpus if possible */
> > +   

Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-15 Thread Rafael J. Wysocki
On Wed, Jan 15, 2020 at 7:35 AM Hsin-Yi Wang  wrote:
>
> Currently system reboots uses architecture specific codes (smp_send_stop)
> to offline non reboot CPUs. Most architecture's implementation is looping
> through all non reboot online CPUs and call ipi function to each of them. Some
> architecture like arm64, arm, and x86... would set offline masks to cpu 
> without
> really offline them. This causes some race condition and kernel warning comes
> out sometimes when system reboots.
>
> This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would offline 
> cpus in
> migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop 
> for
> checking online cpus would be an empty loop. If architecture don't enable this
> config, or some cpus somehow fails to offline, it would fallback to ipi
> function.
>
> Opt in this config for architectures that support CONFIG_HOTPLUG_CPU.
>
> Signed-off-by: Hsin-Yi Wang 
> ---
> Change from v4:
> * fix a few nits: naming, comments, remove Kconfig text...
>
> Change from v3:
> * Opt in config for architectures that support CONFIG_HOTPLUG_CPU
> * Merge function offline_secondary_cpus() and freeze_secondary_cpus()
>   with an additional flag.

This does not seem to be a very good idea, since
freeze_secondary_cpus() does much more than you need for reboot.

For reboot, you basically only need to do something like this AFAICS:

cpu_maps_update_begin();

for_each_online_cpu(i) {
if (i != cpu)
_cpu_down(i, 1, CPUHP_OFFLINE);
}
cpu_hotplug_disabled++;

cpu_maps_update_done();

And you may put this into a function defined outside of CONFIG_PM_SLEEP.

>
> Change from v2:
> * Add another config instead of configed by CONFIG_HOTPLUG_CPU

So why exactly is this new Kconfig option needed?

Everybody supporting CPU hotplug seems to opt in anyway.

[cut]

>
> -int freeze_secondary_cpus(int primary)
> +int freeze_secondary_cpus(int primary, bool reboot)
>  {
> int cpu, error = 0;
>
> @@ -1237,11 +1237,13 @@ int freeze_secondary_cpus(int primary)
> if (cpu == primary)
> continue;
>
> -   if (pm_wakeup_pending()) {
> +#ifdef CONFIG_PM_SLEEP
> +   if (!reboot && pm_wakeup_pending()) {
> pr_info("Wakeup pending. Abort CPU freeze\n");
> error = -EBUSY;
> break;
> }
> +#endif

Please avoid using #ifdefs in function bodies.  This makes the code
hard to maintain in the long term.

>
> trace_suspend_resume(TPS("CPU_OFF"), cpu, true);
> error = _cpu_down(cpu, 1, CPUHP_OFFLINE);
> @@ -1250,7 +1252,9 @@ int freeze_secondary_cpus(int primary)
> cpumask_set_cpu(cpu, frozen_cpus);
> else {
> pr_err("Error taking CPU%d down: %d\n", cpu, error);
> -   break;
> +   /* When rebooting, offline as many CPUs as possible. 
> */
> +   if (!reboot)
> +   break;
> }
> }
>
> diff --git a/kernel/reboot.c b/kernel/reboot.c
> index c4d472b7f1b4..12f643b66e57 100644
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -7,6 +7,7 @@
>
>  #define pr_fmt(fmt)"reboot: " fmt
>
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -220,7 +221,9 @@ void migrate_to_reboot_cpu(void)
> /* The boot cpu is always logical cpu 0 */
> int cpu = reboot_cpu;
>
> +#if !IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT)
> cpu_hotplug_disable();
> +#endif

You can write this as

if (!IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT))
cpu_hotplug_disable();

That's what IS_ENABLED() is there for.

>
> /* Make certain the cpu I'm about to reboot on is online */
> if (!cpu_online(cpu))
> @@ -231,6 +234,11 @@ void migrate_to_reboot_cpu(void)
>
> /* Make certain I only run on the appropriate processor */
> set_cpus_allowed_ptr(current, cpumask_of(cpu));
> +
> +#if IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT)
> +   /* Offline other cpus if possible */
> +   freeze_secondary_cpus(cpu, true);
> +#endif

The above comment applies here too.

>  }
>
>  /**
> --