Bug#572442: sparc 2.6.29+ NMI watchdog deadlock on Sun Fire V240 etc

2010-03-04 Thread Josip Rodin
Package: linux-2.6
Severity: serious
Tags: upstream patch

Hi there,

Ever since kernel 2.6.29 came out, several classes of sparc machines have
been unable to upgrade, because they would get stuck while initializing
the new NMI watchdog code.

The process of trying to figure it out is mostly documented in this
long-running mailing list thread that spanned many months:
http://lists.debian.org/debian-sparc/2009/08/msg5.html
http://lists.debian.org/debian-sparc/2009/09/msg00018.html
http://lists.debian.org/debian-sparc/2009/10/msg00015.html
http://lists.debian.org/debian-sparc/2009/11/msg00034.html
http://lists.debian.org/debian-sparc/2009/12/msg0.html

Had this gone unattended, sparc release requalification might have been in
trouble, because the bug affects the Fire V240 sparc buildd machines as well
as Jurij Smakov's test machine, and that's a lot in our little universe :)

Fortunately David Miller came to the rescue and personally debugged the
problem on one of the buildds, and fixed the problem. His solution, that
we are currently running on schroeder.debian.org, is attached.

Please include the patch in the sparc kernel package so that we can test
it widely, preferably ASAP. TIA.

- Forwarded message from David Miller da...@davemloft.net -

Date: Wed, 03 Mar 2010 09:11:41 -0800 (PST)
Subject: Re: Sparc release requalification


Ok, I think I fixed it.

Attached are two versions of the fix, the first attachment is
for 2.6.33 and the second one is for any kernel 2.6.32 and
previous.

Give it a good test on any machine you've seen this problem on
and let me know how it goes.

Thanks.

From 8a4fd1e4922413cfdfa6c51a59efb720d904a5eb Mon Sep 17 00:00:00 2001
From: David S. Miller da...@davemloft.net
Date: Wed, 3 Mar 2010 09:06:03 -0800
Subject: [PATCH] sparc64: Make prom entry spinlock NMI safe.

If we do something like try to print to the OF console from an NMI
while we're already in OpenFirmware, we'll deadlock on the spinlock.

Use a raw spinlock and disable NMIs when we take it.

Signed-off-by: David S. Miller da...@davemloft.net
---
 arch/sparc/prom/p1275.c |   12 +++++++-----
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/sparc/prom/p1275.c b/arch/sparc/prom/p1275.c
index 4b7c937..2d8b70d 100644
--- a/arch/sparc/prom/p1275.c
+++ b/arch/sparc/prom/p1275.c
@@ -32,10 +32,9 @@ extern void prom_cif_interface(void);
 extern void prom_cif_callback(void);
 
 /*
- * This provides SMP safety on the p1275buf. prom_callback() drops this lock
- * to allow recursuve acquisition.
+ * This provides SMP safety on the p1275buf.
  */
-DEFINE_SPINLOCK(prom_entry_lock);
+DEFINE_RAW_SPINLOCK(prom_entry_lock);
 
 long p1275_cmd(const char *service, long fmt, ...)
 {
@@ -47,7 +46,9 @@ long p1275_cmd(const char *service, long fmt, ...)

    p = p1275buf.prom_buffer;
 
-   spin_lock_irqsave(&prom_entry_lock, flags);
+   raw_local_save_flags(flags);
+   raw_local_irq_restore(PIL_NMI);
+   raw_spin_lock(&prom_entry_lock);
 
    p1275buf.prom_args[0] = (unsigned long)p;   /* service */
    strcpy (p, service);
@@ -139,7 +140,8 @@ long p1275_cmd(const char *service, long fmt, ...)
    va_end(list);
    x = p1275buf.prom_args [nargs + 3];
 
-   spin_unlock_irqrestore(&prom_entry_lock, flags);
+   raw_spin_unlock(&prom_entry_lock);
+   raw_local_irq_restore(flags);
 
    return x;
 }
-- 
1.6.6.1


sparc64: Make prom entry spinlock NMI safe.

If we do something like try to print to the OF console from an NMI
while we're already in OpenFirmware, we'll deadlock on the spinlock.

Disable NMIs when we take it.

Signed-off-by: David S. Miller da...@davemloft.net

diff --git a/arch/sparc/prom/p1275.c b/arch/sparc/prom/p1275.c
index 4b7c937..815cab6 100644
--- a/arch/sparc/prom/p1275.c
+++ b/arch/sparc/prom/p1275.c
@@ -32,8 +32,7 @@ extern void prom_cif_interface(void);
 extern void prom_cif_callback(void);
 
 /*
- * This provides SMP safety on the p1275buf. prom_callback() drops this lock
- * to allow recursuve acquisition.
+ * This provides SMP safety on the p1275buf.
  */
 DEFINE_SPINLOCK(prom_entry_lock);
 
@@ -47,7 +46,9 @@ long p1275_cmd(const char *service, long fmt, ...)

    p = p1275buf.prom_buffer;
 
-   spin_lock_irqsave(&prom_entry_lock, flags);
+   raw_local_save_flags(flags);
+   raw_local_irq_restore(PIL_NMI);
+   spin_lock(&prom_entry_lock);
 
    p1275buf.prom_args[0] = (unsigned long)p;   /* service */
    strcpy (p, service);
@@ -139,7 +140,8 @@ long p1275_cmd(const char *service, long fmt, ...)
    va_end(list);
    x = p1275buf.prom_args [nargs + 3];
 
-   spin_unlock_irqrestore(&prom_entry_lock, flags);
+   spin_unlock(&prom_entry_lock);
+   raw_local_irq_restore(flags);
 
    return x;
 }


- End forwarded message -

-- 
 2. That which causes joy or happiness.




Bug#572442: sparc 2.6.29+ NMI watchdog deadlock on Sun Fire V240 etc

2010-03-04 Thread Ben Hutchings
On Thu, 2010-03-04 at 10:19 +0100, Josip Rodin wrote:
> [...]
> Fortunately David Miller came to the rescue and personally debugged the
> problem on one of the buildds, and fixed the problem. His solution, that
> we are currently running on schroeder.debian.org, is attached.
>
> Please include the patch in the sparc kernel package so that we can test
> it widely, preferably ASAP. TIA.

This is not applicable to 2.6.32 as the spinlock API has changed.  I
assume David will send a suitable patch to stable shortly, and we'll use
that.

Ben.

-- 
Ben Hutchings
Q.  Which is the greater problem in the world today, ignorance or apathy?
A.  I don't know and I couldn't care less.




Bug#572442: sparc 2.6.29+ NMI watchdog deadlock on Sun Fire V240 etc

2010-03-04 Thread Martin Michlmayr
* Ben Hutchings b...@decadent.org.uk [2010-03-04 14:59]:
> > Please include the patch in the sparc kernel package so that we can test
> > it widely, preferably ASAP. TIA.
>
> This is not applicable to 2.6.32 as the spinlock API has changed.  I
> assume David will send a suitable patch to stable shortly, and we'll use
> that.

David's message said, "Attached are two versions of the fix, the first
attachment is for 2.6.33 and the second one is for any kernel 2.6.32
and previous", and there were indeed two patches.

-- 
Martin Michlmayr
http://www.cyrius.com/



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20100304151234.gm1...@jirafa.cyrius.com



Bug#572442: sparc 2.6.29+ NMI watchdog deadlock on Sun Fire V240 etc

2010-03-04 Thread David Miller
From: Ben Hutchings b...@decadent.org.uk
Date: Thu, 04 Mar 2010 14:59:20 +

> On Thu, 2010-03-04 at 10:19 +0100, Josip Rodin wrote:
> [...]
> > Fortunately David Miller came to the rescue and personally debugged the
> > problem on one of the buildds, and fixed the problem. His solution, that
> > we are currently running on schroeder.debian.org, is attached.
> >
> > Please include the patch in the sparc kernel package so that we can test
> > it widely, preferably ASAP. TIA.
>
> This is not applicable to 2.6.32 as the spinlock API has changed.  I
> assume David will send a suitable patch to stable shortly, and we'll use
> that.

Read my damn posting, Ben!

There are two patches, first one is for 2.6.33 and the second one is
for 2.6.32 and beforehand.






Bug#572442: sparc 2.6.29+ NMI watchdog deadlock on Sun Fire V240 etc

2010-03-04 Thread Ben Hutchings
On Thu, Mar 04, 2010 at 07:13:03AM -0800, David Miller wrote:
> From: Ben Hutchings b...@decadent.org.uk
> Date: Thu, 04 Mar 2010 14:59:20 +
>
> > On Thu, 2010-03-04 at 10:19 +0100, Josip Rodin wrote:
> > [...]
> > > Fortunately David Miller came to the rescue and personally debugged the
> > > problem on one of the buildds, and fixed the problem. His solution, that
> > > we are currently running on schroeder.debian.org, is attached.
> > >
> > > Please include the patch in the sparc kernel package so that we can test
> > > it widely, preferably ASAP. TIA.
> >
> > This is not applicable to 2.6.32 as the spinlock API has changed.  I
> > assume David will send a suitable patch to stable shortly, and we'll use
> > that.
>
> Read my damn posting, Ben!
>
> There are two patches, first one is for 2.6.33 and the second one is
> for 2.6.32 and beforehand.
>
I'm sorry, I read Josip's introduction referring to 'the patch' and skipped
straight to the first patch without reading your introduction.  I'll apply
the second patch.

Ben.

-- 
Ben Hutchings
Q.  Which is the greater problem in the world today, ignorance or apathy?
A.  I don't know and I couldn't care less.


