* Robin Holt <[email protected]> wrote:
> On Tue, Apr 16, 2013 at 09:18:07PM +0530, Srivatsa S. Bhat wrote:
> > On 04/16/2013 05:36 PM, Robin Holt wrote:
> > > On Tue, Apr 16, 2013 at 01:32:56PM +0200, Ingo Molnar wrote:
> > >>
> > >> * Robin Holt <[email protected]> wrote:
> > >>
> > >>> We recently noticed that reboot of a 1024 cpu machine takes approx 16
> > >>> minutes of just stopping the cpus. The slowdown was tracked to commit
> > >>> f96972f.
> > >>>
> > >>> The current implementation does all the work of hot removing the cpus
> > >>> before halting the system. We are switching to just migrating to the
> > >>> boot cpu and then continuing with shutdown/reboot.
> > >>>
> > >>> This also has the effect of not breaking x86's command line parameter for
> > >>> specifying the reboot cpu. Note, this code was shamelessly copied from
> > >>> arch/x86/kernel/reboot.c with bits removed pertaining to the reboot_cpu
> > >>> command line parameter.
> > >>>
> > >>> Signed-off-by: Robin Holt <[email protected]>
> > >>> Tested-by: Shawn Guo <[email protected]>
> > >>> To: Ingo Molnar <[email protected]>
> > >>> To: Russ Anderson <[email protected]>
> > >>> To: Oleg Nesterov <[email protected]>
> > >>> Cc: Andrew Morton <[email protected]>
> > >>> Cc: "H. Peter Anvin" <[email protected]>
> > >>> Cc: Lai Jiangshan <[email protected]>
> > >>> Cc: Linus Torvalds <[email protected]>
> > >>> Cc: Linux Kernel Mailing List <[email protected]>
> > >>> Cc: Michel Lespinasse <[email protected]>
> > >>> Cc: Oleg Nesterov <[email protected]>
> > >>> Cc: "Paul E. McKenney" <[email protected]>
> > >>> Cc: Paul Mackerras <[email protected]>
> > >>> Cc: Peter Zijlstra <[email protected]>
> > >>> Cc: Robin Holt <[email protected]>
> > >>> Cc: "[email protected]" <[email protected]>
> > >>> Cc: Tejun Heo <[email protected]>
> > >>> Cc: the arch/x86 maintainers <[email protected]>
> > >>> Cc: Thomas Gleixner <[email protected]>
> > >>> Cc: <[email protected]>
> > >>>
> > >>> ---
> > >>>
> > >>> Changes since -v1.
> > >>> - Set PF_THREAD_BOUND before migrating to eliminate potential race.
> > >>> - Modified kernel_power_off to also migrate instead of using
> > >>> disable_nonboot_cpus().
> > >>> ---
> > >>> kernel/sys.c | 22 +++++++++++++++++++---
> > >>> 1 file changed, 19 insertions(+), 3 deletions(-)
> > >>>
> > >>> diff --git a/kernel/sys.c b/kernel/sys.c
> > >>> index 0da73cf..5ef7aa2 100644
> > >>> --- a/kernel/sys.c
> > >>> +++ b/kernel/sys.c
> > >>> @@ -357,6 +357,22 @@ int unregister_reboot_notifier(struct notifier_block *nb)
> > >>> }
> > >>> EXPORT_SYMBOL(unregister_reboot_notifier);
> > >>>
> > >>> +void migrate_to_reboot_cpu(void)
> > >>
> > >> It appears to be file-scope, so should be static I guess?
> > >
> > > Done.
> > >
> > >>> +{
> > >>> + /* The boot cpu is always logical cpu 0 */
> > >>> + int reboot_cpu_id = 0;
> > >>> +
> > >>> + /* Make certain the cpu I'm about to reboot on is online */
> > >>> + if (!cpu_online(reboot_cpu_id))
> > >>> + reboot_cpu_id = smp_processor_id();
> > >>
> > >> Shouldn't we pick the first online CPU instead, to make it deterministic?
> > >
> > > Done.
> > >
> > > reboot_cpu_id = cpumask_first(cpu_online_mask);
> > >
> >
> > Let me ask again: if CPU 0 (or whatever the preferred reboot cpu is)
> > is offline, then why should we even bother pinning the task to (another)
> > CPU? Why not just proceed with the reboot?
>
> No idea. I copied it from the arch/x86 code. I can not defend it.

I'd say it's a quality of implementation improvement if the choice of the CPU is
deterministic, as long as the current configuration of CPUs is deterministic.

I.e. instead of 'reboot on the first CPU, or a random CPU', make the rule 'reboot
on the first online CPU'. That's a simple rule to think about.

( On most architectures CPU#0 cannot be unplugged, so the rule will effectively be
'reboot on CPU#0'. Like the current upstream behavior. )

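To make the rule concrete, here is a rough userspace sketch of the selection logic only. It is not kernel code and not the patch itself: a 64-bit mask stands in for cpu_online_mask, and the names online_mask_t, first_online_cpu() and pick_reboot_cpu() are made up for illustration. In the kernel the fallback would be the cpumask_first(cpu_online_mask) call quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for cpu_online_mask: bit N set means CPU N is online. */
typedef uint64_t online_mask_t;

/*
 * Userspace analogue of cpumask_first(): return the index of the lowest
 * set bit, or -1 if no CPU is marked online.
 */
int first_online_cpu(online_mask_t online)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (online & ((online_mask_t)1 << cpu))
			return cpu;
	return -1;
}

/*
 * Use the preferred reboot CPU if it is online; otherwise fall back to
 * the first online CPU, so the result depends only on the online mask
 * and not on where the caller happens to be running.
 */
int pick_reboot_cpu(online_mask_t online, int preferred)
{
	assert(preferred >= 0 && preferred < 64);

	if (online & ((online_mask_t)1 << preferred))
		return preferred;
	return first_online_cpu(online);
}
```

With CPU#0 online the result is always 0, matching the current upstream behavior. With CPU#0 offline, the same online mask always yields the same answer, whichever CPU the caller was on.
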
Thanks,
Ingo