Re: [Bugme-new] [Bug 11629] New: quad G5 fails to shut down

2008-09-26 Thread Joel Schopp



From: Johannes Berg [EMAIL PROTECTED]
Subject: powerpc: fix shutdown

I tracked down the shutdown regression to CPUs not dying
when being shut down during power-off. This turns out to
be due to the system_state being SYSTEM_POWER_OFF, which
this code doesn't take as a valid state for shutting off
CPUs in.

This has never made sense to me, but when I added hotplug
code to implement hibernate I only made it work and did
not question the need to check the system_state. Thomas
Gleixner helped me dig, but the only thing we found is
that it was added with the original commit that added CPU
hotplug support.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]
Cc: Thomas Gleixner [EMAIL PROTECTED]
Cc: Joel Schopp [EMAIL PROTECTED]
Cc: Benjamin Herrenschmidt [EMAIL PROTECTED]
  
Tested this patch on a Power6 partition with cpu hotplug online/offline 
and some shutdown and reboot cycles.  Didn't see any ill effects.


Acked-by: Joel Schopp [EMAIL PROTECTED]


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [Bugme-new] [Bug 11629] New: quad G5 fails to shut down

2008-09-25 Thread Johannes Berg
I wrote:

 And here's the fix.

Well, that should of course remove that comment there too. New version
below.

From: Johannes Berg [EMAIL PROTECTED]
Subject: powerpc: fix shutdown

I tracked down the shutdown regression to CPUs not dying
when being shut down during power-off. This turns out to
be due to the system_state being SYSTEM_POWER_OFF, which
this code doesn't take as a valid state for shutting off
CPUs in.

This has never made sense to me, but when I added hotplug
code to implement hibernate I only made it work and did
not question the need to check the system_state. Thomas
Gleixner helped me dig, but the only thing we found is
that it was added with the original commit that added CPU
hotplug support.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]
Cc: Thomas Gleixner [EMAIL PROTECTED]
Cc: Joel Schopp [EMAIL PROTECTED]
Cc: Benjamin Herrenschmidt [EMAIL PROTECTED]
---
 arch/powerpc/kernel/idle.c |6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

--- everything.orig/arch/powerpc/kernel/idle.c  2008-09-01 12:28:37.0 
+0200
+++ everything/arch/powerpc/kernel/idle.c   2008-09-25 10:55:45.0 
+0200
@@ -34,11 +34,7 @@
 #include asm/smp.h
 
 #ifdef CONFIG_HOTPLUG_CPU
-/* this is used for software suspend, and that shuts down
- * CPUs even while the system is still booting... */
-#define cpu_should_die()   (cpu_is_offline(smp_processor_id())  \
-  (system_state == SYSTEM_RUNNING \
-|| system_state == SYSTEM_BOOTING))
+#define cpu_should_die()   cpu_is_offline(smp_processor_id())
 #else
 #define cpu_should_die()   0
 #endif


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [Bugme-new] [Bug 11629] New: quad G5 fails to shut down

2008-09-24 Thread Johannes Berg
On Tue, 2008-09-23 at 14:30 -0700, Andrew Morton wrote:

  Steps to reproduce:
  shut down the machine, see it fail.
  
  http://johannes.sipsolutions.net/files/dsc_2274.jpg
  
 
 That went splat in the clockevents code.

This is still strange, it appears that when the CPUs fail to offline, we
still call into the clockevents code and it rightfully crashes, but I'll
check that out separately.

I've found the problem, below is a tentative fix. The problem has been
around for quite a while, here's the whole commit that caused it
(because it's so short):

commit 4047727e5ae33f9b8d2b7766d1994ea6e5ec2991
Author: Mark Lord [EMAIL PROTECTED]
Date:   Mon Oct 1 01:20:10 2007 -0700

Fix SMP poweroff hangs

We need to disable all CPUs other than the boot CPU (usually 0) before
attempting to power-off modern SMP machines.  This fixes the
hang-on-poweroff issue on my MythTV SMP box, and also on Thomas Gleixner's
new toybox.

Signed-off-by: Mark Lord [EMAIL PROTECTED]
Acked-by: Thomas Gleixner [EMAIL PROTECTED]
Cc: Rafael J. Wysocki [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
Signed-off-by: Linus Torvalds [EMAIL PROTECTED]

diff --git a/kernel/sys.c b/kernel/sys.c
index 1b33b05..8ae2e63 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -32,6 +32,7 @@
 #include linux/getcpu.h
 #include linux/task_io_accounting_ops.h
 #include linux/seccomp.h
+#include linux/cpu.h
 
 #include linux/compat.h
 #include linux/syscalls.h
@@ -878,6 +879,7 @@ void kernel_power_off(void)
kernel_shutdown_prepare(SYSTEM_POWER_OFF);
if (pm_power_off_prepare)
pm_power_off_prepare();
+   disable_nonboot_cpus();
sysdev_shutdown();
printk(KERN_EMERG Power down.\n);
machine_power_off();


And here's the fix.

From: Johannes Berg [EMAIL PROTECTED]
Subject: powerpc: fix shutdown

I tracked down the shutdown regression to CPUs not dying
when being shut down during power-off. This turns out to
be due to the system_state being SYSTEM_POWER_OFF, which
this code doesn't take as a valid state for shutting off
CPUs in.

This has never made sense to me, but when I added hotplug
code to implement hibernate I only made it work and did
not question the need to check the system_state. Thomas
Gleixner helped me dig, but the only thing we found is
that it was added with the original commit that added CPU
hotplug support.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]
Cc: Thomas Gleixner [EMAIL PROTECTED]
Cc: Joel Schopp [EMAIL PROTECTED]
Cc: Benjamin Herrenschmidt [EMAIL PROTECTED]
---
 arch/powerpc/kernel/idle.c |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/arch/powerpc/kernel/idle.c
+++ b/arch/powerpc/kernel/idle.c
@@ -36,9 +36,7 @@
 #ifdef CONFIG_HOTPLUG_CPU
 /* this is used for software suspend, and that shuts down
  * CPUs even while the system is still booting... */
-#define cpu_should_die()   (cpu_is_offline(smp_processor_id())  \
-  (system_state == SYSTEM_RUNNING \
-|| system_state == SYSTEM_BOOTING))
+#define cpu_should_die()   cpu_is_offline(smp_processor_id())
 #else
 #define cpu_should_die()   0
 #endif


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [Bugme-new] [Bug 11629] New: quad G5 fails to shut down

2008-09-23 Thread Andrew Morton
On Tue, 23 Sep 2008 14:20:55 -0700 (PDT)
[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=11629
 
Summary: quad G5 fails to shut down
Product: Platform Specific/Hardware
Version: 2.5
  KernelVersion: 2.6.27-rc6+wireless bits
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: normal
   Priority: P1
  Component: PPC-64
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Latest working kernel version: don't remember, pretty sure 2.6.26 worked but
 have been running development kernels forever
 Earliest failing kernel version: n/a, haven't checked, has been failing for a
 while
 Distribution: debian/unstable
 Hardware Environment: quad G5 powermac
 Software Environment: ?
 Problem Description: machine fails to shut down, 
 
 Steps to reproduce:
 shut down the machine, see it fail.
 
 http://johannes.sipsolutions.net/files/dsc_2274.jpg
 

That went splat in the clockevents code.

I saw a pull request with several fixes float past this morning:

From: Ingo Molnar [EMAIL PROTECTED]
To: Linus Torvalds [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], Thomas Gleixner [EMAIL PROTECTED], H. Peter Anvin 
[EMAIL PROTECTED], Andrew Morton [EMAIL PROTECTED]
Subject: [git pull] timer fixes
Date: Tue, 23 Sep 2008 21:41:00 +0200

Fingers crossed..

I'll mark this as a post-2.6.26 regression, although you seem a bit
wobbly on that.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [Bugme-new] [Bug 11629] New: quad G5 fails to shut down

2008-09-23 Thread Johannes Berg
On Tue, 2008-09-23 at 14:30 -0700, Andrew Morton wrote:

  http://johannes.sipsolutions.net/files/dsc_2274.jpg
  
 
 That went splat in the clockevents code.
 
 I saw a pull request with several fixes float past this morning:
 
 From: Ingo Molnar [EMAIL PROTECTED]
 To: Linus Torvalds [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED], Thomas Gleixner [EMAIL PROTECTED], H. Peter Anvin 
 [EMAIL PROTECTED], Andrew Morton [EMAIL PROTECTED]
 Subject: [git pull] timer fixes
 Date: Tue, 23 Sep 2008 21:41:00 +0200
 
 Fingers crossed..

Well, that does seem to fix the regression. That is, it no longer
crashes. It still doesn't shut down, but that seems older and possibly
partially my fault as I wrote the CPU hotplug implementation (thought it
works fine for hibernate, never mind the fact that resuming from
hibernate hangs)

johannes


signature.asc
Description: This is a digitally signed message part
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [Bugme-new] [Bug 11629] New: quad G5 fails to shut down

2008-09-23 Thread Johannes Berg
On Wed, 2008-09-24 at 00:45 +0200, Johannes Berg wrote:

 Well, that does seem to fix the regression. That is, it no longer
 crashes.

Spoke too early, it seems that the crash isn't exactly reproducible all
the time nor happens at the same spot all the time. I'd suspect that the
timer code isn't to blame but something in the IPI code or so.

johannes


signature.asc
Description: This is a digitally signed message part
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev