[PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Dongsheng Wang
From: Wang Dongsheng dongsheng.w...@freescale.com

If softirqs run on the hardirq stack, we get a kernel panic when another hard
irq arrives while __do_softirq has local irqs enabled to process softirq
actions. So we need to switch to the softirq stack when invoking softirqs.

 Task--->
    | Task stack
    |
    Interrupt->EXCEPTION->do_IRQ->
    ^                            | Hard irq stack
    |                            |
    |                            irq_exit->__do_softirq->local_irq_enable--+-->local_irq_disable
    |                                                                      | Hard irq stack
    |                                                                      |
    |                                                                      Interrupt coming again
    |                  This results in interrupt nesting                   |
    ------------------------------------------------------------------------

Trace 1: Trap 900

Kernel stack overflow in process e8152f40, r1=e8e05ec0
CPU: 0 PID: 2399 Comm: image_compress/ Not tainted 3.13.0-rc3-03475-g2e3f85b #432
task: e8152f40 ti: c080a000 task.ti: ef176000
NIP: c05bec04 LR: c0305590 CTR: 0010
REGS: e8e05e10 TRAP: 0901   Not tainted  (3.13.0-rc3-03475-g2e3f85b)
MSR: 00029000 <CE,EE,ME>  CR: 22f22722  XER: 2000

GPR00: c0305590 e8e05ec0 e8152f40 c07e1e2c 00029000 00ec fffc 0010
GPR08: 007f   b02539f3 a00ae278
NIP [c05bec04] _raw_spin_unlock_irqrestore+0x10/0x14
LR [c0305590] add_timer_randomness+0x60/0xfc
Call Trace:
[e8e05ec0] [c0305590] add_timer_randomness+0x60/0xfc (unreliable)
[e8e05ee0] [c026c9a8] blk_update_bidi_request+0x64/0x94
[e8e05f00] [c026cd00] blk_end_bidi_request+0x20/0x7c
[e8e05f20] [c032f21c] scsi_io_completion+0xe0/0x5e8
[e8e05f70] [c0272b84] blk_done_softirq+0x98/0xb8
[e8e05f90] [c004893c] __do_softirq+0xf8/0x1f8
[e8e05fe0] [c0048dbc] irq_exit+0xa4/0xc8
[e8e05ff0] [c000d5f4] call_do_irq+0x24/0x3c
[ef177d50] [c00046ec] do_IRQ+0x8c/0xf8
[ef177d70] [c000f6dc] ret_from_except+0x0/0x18
--- Exception: 501 at lzo1x_1_do_compress+0x248/0x40c
LR = lzo1x_1_compress+0x98/0x268
[ef177e30] [c07c7440] runqueues+0x0/0x540 (unreliable)
[ef177e60] []   (null)
[ef177ea0] [c0085a9c] lzo_compress_threadfn+0x6c/0x138
[ef177ef0] [c0062a00] kthread+0xc4/0xd8
[ef177f40] [c000f158] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
40a2fff0 4c00012c 2f89 419e000c 3860 4e800020 3861 4e800020
7c0004ac 3920 9123 7c800106 4e800020 7d201828 35290001 40810010
Kernel panic - not syncing: kernel stack overflow
CPU: 0 PID: 2399 Comm: image_compress/ Not tainted 3.13.0-rc3-03475-g2e3f85b #432
Call Trace:
Rebooting in 180 seconds..

Trace 2: Trap 500

VFS: Mounted root (ext2 filesystem) on device 1:0.
devtmpfs: mounted
Freeing unused kernel memory: 268K (c079a000 - c07dd000)
INIT: version 2.88 booting
Starting udev
udevd[1423]: starting version 182
random: nonblocking pool is initialized
Kernel stack overflow in process e829ca80, r1=e8badf90
CPU: 0 PID: 1553 Comm: mount.sh Not tainted 3.13.0-rc1-148228-gea7ca7c #21
task: e829ca80 ti: c081c000 task.ti: e9d28000
NIP: c00434bc LR: c0043444 CTR: c0018cec
REGS: e8badee0 TRAP: 0501   Not tainted  (3.13.0-rc1-148228-gea7ca7c)
MSR: 00029000 <CE,EE,ME>  CR: 48222422  XER: 2000

GPR00: c00439a0 e8badf90 e829ca80 0001 e80cc780 0001 b92f44af 
GPR08: 0001 010ba000 010ba000 ddd3e6d1 48222422
NIP [c00434bc] __do_softirq+0x94/0x1f8
LR [c0043444] __do_softirq+0x1c/0x1f8
Call Trace:
[e8badf90] [100f] 0x100f (unreliable)
[e8badfe0] [c00439a0] irq_exit+0xa4/0xc8
[e8badff0] [c000ccd8] call_do_irq+0x24/0x3c
[e9d29f20] [c000479c] do_IRQ+0x8c/0xf8
[e9d29f40] [c000eb54] ret_from_except+0x0/0x18
--- Exception: 501 at 0x1003d540
LR = 0x10041974
Instruction dump:
3e80c082 3f40c07e 3b6a 3a941040 3aa0 3b5a9388 7f16c378 812f0008
5529103a 7d3c482e 7eb8492e 7c008146 3ba0 7e9ea378 4814 57fff87f
Kernel panic - not syncing: kernel stack overflow
CPU: 0 PID: 1553 Comm: mount.sh Not tainted 3.13.0-rc1-148228-gea7ca7c #21
Call Trace:
Rebooting in 180 seconds..

Signed-off-by: Wang Dongsheng dongsheng.w...@freescale.com
Cc: Frederic Weisbecker fweis...@gmail.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Linus Torvalds torva...@linux-foundation.org
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@kernel.org
Cc: Peter Zijlstra pet...@infradead.org
Cc: H. Peter Anvin h...@zytor.com
Cc: Paul Mackerras pau...@au1.ibm.com
Cc: James Hogan james.ho...@imgtec.com
Cc: James E.J. Bottomley j...@parisc-linux.org
Cc: Helge Deller del...@gmx.de
Cc: Martin Schwidefsky schwidef...@de.ibm.com
Cc: Heiko Carstens heiko.carst...@de.ibm.com
Cc: David S. Miller da...@davemloft.net
Cc: Andrew Morton a...@linux-foundation.org

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 957bf34..ffde3fb 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ 

Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Kevin Hao
Could you double check if you got the following patch applied?

commit 1a18a66446f3f289b05b634f18012424d82aa63a
Author: Kevin Hao haoke...@gmail.com
Date:   Fri Jan 17 12:25:28 2014 +0800

powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack

Guenter Roeck has got the following call trace on a p2020 board:
  Kernel stack overflow in process eb3e5a00, r1=eb79df90
  CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
  task: eb3e5a00 ti: c0616000 task.ti: ef44
  NIP: c003a420 LR: c003a410 CTR: c0017518
  REGS: eb79dee0 TRAP: 0901   Not tainted (3.13.0-rc8-juniper-00146-g19eca00)
  MSR: 00029000 <CE,EE,ME>  CR: 24008444  XER: 
  GPR00: c003a410 eb79df90 eb3e5a00  eb05d900 0001 65d87646
  GPR08:  020b8000   44008442
  NIP [c003a420] __do_softirq+0x94/0x1ec
  LR [c003a410] __do_softirq+0x84/0x1ec
  Call Trace:
  [eb79df90] [c003a410] __do_softirq+0x84/0x1ec (unreliable)
  [eb79dfe0] [c003a970] irq_exit+0xbc/0xc8
  [eb79dff0] [c000cc1c] call_do_irq+0x24/0x3c
  [ef441f20] [c00046a8] do_IRQ+0x8c/0xf8
  [ef441f40] [c000e7f4] ret_from_except+0x0/0x18
  --- Exception: 501 at 0xfcda524
  LR = 0x10024900
  Instruction dump:
  7c781b78 3b4a 3a73b040 543c0024 3a80 3b3913a0 7ef5bb78 48201bf9
  5463103a 7d3b182e 7e89b92e 7c008146 3ba0 7e7e9b78 4814 57fff87f
  Kernel panic - not syncing: kernel stack overflow
  CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
  Call Trace:

The reason is that we have used the wrong register to calculate the
ksp_limit in commit cbc9565ee826 (powerpc: Remove ksp_limit on ppc64).
Just fix it.

As suggested by Benjamin Herrenschmidt, also add the C prototype of the
function in the comment in order to avoid such kind of errors in the
future.

Cc: sta...@vger.kernel.org # 3.12
Reported-by: Guenter Roeck li...@roeck-us.net
Tested-by: Guenter Roeck li...@roeck-us.net
Signed-off-by: Kevin Hao haoke...@gmail.com
Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org

Thanks,
Kevin
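
A note on why a wrong ksp_limit can produce this kind of report. The thread says only that the wrong register was used, so what follows is a rough user-space sketch under stated assumptions, not the kernel code. It assumes that the entry path flags an overflow when r1 is at or below ksp_limit, and that the bad limit was derived from a pointer on the task stack rather than from the irq stack base. The value of THREAD_INFO_GAP and the helpers limit_from() and stack_overflowed() are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical size of the area reserved at the bottom of a kernel stack. */
#define THREAD_INFO_GAP 0x80u

/* Lowest usable stack address when the stack area starts at `ptr`. */
static uint32_t limit_from(uint32_t ptr)
{
	assert(ptr <= UINT32_MAX - THREAD_INFO_GAP); /* no wrap-around */
	return ptr + THREAD_INFO_GAP;
}

/* Assumed overflow test: the stack pointer is at or below the limit. */
static bool stack_overflowed(uint32_t sp, uint32_t ksp_limit)
{
	return sp <= ksp_limit;
}
```

Taking r1=e8e05ec0 from trace 1 and assuming the irq stack starts at e8e04000, limit_from(0xe8e04000) lies below r1 and nothing is reported. A limit derived from a task-stack address such as ef177d80 (a hypothetical value near the task-stack frames in the trace) lies above r1, so the check fires although the irq stack is almost unused.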


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread dongsheng.w...@freescale.com
Thanks Kevin. Your patch works fine. :)

I am still a bit confused. If __do_softirq keeps getting interrupted, won't
the hard irq stack eventually run out?

Regards,
-Dongsheng


Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Kevin Hao
On Fri, Mar 28, 2014 at 09:00:13AM +0000, dongsheng.w...@freescale.com wrote:
> Thanks Kevin. Your patch works fine. :)
> 
> I am still a bit confused. If __do_softirq keeps getting interrupted, won't
> the hard irq stack eventually run out?

No, it won't. Please see the explanation in the following commit log.

commit cc1f027454929924471bea2f362431072e3c71be
Author: Frederic Weisbecker fweis...@gmail.com
Date:   Tue Sep 24 17:17:47 2013 +0200

irq: Optimize softirq stack selection in irq exit

If irq_exit() is called on the arch's specified irq stack,
it should be safe to run softirqs inline under that same
irq stack as it is near empty by the time we call irq_exit().

For example if we use the same stack for both hard and soft irqs here,
the worst case scenario is:
hardirq -> softirq -> hardirq. But then the softirq supersedes the
first hardirq as the stack user since irq_exit() is called in
a mostly empty stack. So the stack merge in this case looks acceptable.

Stack overrun still have a chance to happen if hardirqs have more
opportunities to nest, but then it's another problem to solve.

So lets adapt the irq exit's softirq stack on top of a new Kconfig symbol
that can be defined when irq_exit() runs on the irq stack. That way
we can spare some stack switch on irq processing and all the cache
issues that come along.

Thanks,
Kevin
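
The worst case described in that commit log can be put into numbers with a small model. This is only a sketch: shared_stack_peak() and its parameters are hypothetical names, the byte counts are invented, and it assumes a nested hardirq stays on the same stack instead of switching again.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Worst-case bytes in use on a stack shared by hard and soft irqs, for the
 * sequence hardirq -> softirq -> nested hardirq.
 *
 * hardirq_peak:  deepest usage of one hardirq handler (hypothetical figure)
 * exit_residual: bytes still in use when irq_exit() runs; the commit log
 *                calls the stack "near empty" at this point
 * softirq_peak:  deepest usage of the softirq handlers (hypothetical figure)
 */
static size_t shared_stack_peak(size_t hardirq_peak, size_t exit_residual,
				size_t softirq_peak)
{
	/* irq_exit() runs after the handler has released most of its frames. */
	assert(exit_residual <= hardirq_peak);

	/* Phase 1: the first hardirq handler runs alone. */
	size_t first = hardirq_peak;
	/* Phase 2: softirq sits on the residue and a new hardirq nests on top. */
	size_t nested = exit_residual + softirq_peak + hardirq_peak;

	return first > nested ? first : nested;
}
```

With invented figures of 2000 bytes per hardirq, 64 bytes of residue and 3000 bytes of softirq, the model gives 5064 bytes instead of the 7000 that full stacking of all three would need. It deliberately ignores deeper hardirq nesting, which the commit log treats as a separate problem.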



Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Benjamin Herrenschmidt
On Fri, 2014-03-28 at 15:38 +0800, Dongsheng Wang wrote:
> From: Wang Dongsheng dongsheng.w...@freescale.com
> 
> If softirqs run on the hardirq stack, we get a kernel panic when another hard
> irq arrives while __do_softirq has local irqs enabled to process softirq
> actions. So we need to switch to the softirq stack when invoking softirqs.

Yes, an interrupt can potentially nest, but we should be near the top of
the stack at that point. As the comment in softirq.c says, it should
be fine. And your backtrace doesn't seem to indicate a major overflow.

The code in do_IRQ() will make sure we don't switch stack again if
we were already on either hard or softirq stack.

I need a better analysis of your problem. Is that really a stack
overflow ? Or is it a false positive due to a bug in the overflow
detection ?

A while ago I moved the code that updates KSP_LIMIT on 32-bit into asm in
misc_32.S, since we don't do that on 64-bit. Maybe we are getting it
wrong...

Cheers,
Ben.
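
The "don't switch stack again" check mentioned above can be sketched in user space as follows. This is a simplified model, not the do_IRQ() source: it assumes 8 KiB kernel stacks aligned to their size, and stack_base() and must_switch_stack() are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed kernel stack size (8 KiB); stacks are aligned to this size. */
#define THREAD_SIZE 0x2000u

/* Base of the stack area that contains the stack pointer `sp`. */
static uint32_t stack_base(uint32_t sp)
{
	/* The mask trick below only works for a power-of-two size. */
	assert((THREAD_SIZE & (THREAD_SIZE - 1)) == 0);
	return sp & ~(THREAD_SIZE - 1);
}

/* True if an incoming interrupt should move to the hardirq stack. */
static bool must_switch_stack(uint32_t sp, uint32_t hardirq_base,
			      uint32_t softirq_base)
{
	uint32_t cur = stack_base(sp);

	/* Already on one of the irq stacks: keep running where we are. */
	return cur != hardirq_base && cur != softirq_base;
}
```

Under the 8 KiB assumption, r1=e8e05ec0 from trace 1 falls in the stack area starting at e8e04000, while the frame at ef177d50 falls in the task stack at ef176000, matching the task.ti value in the trace. An interrupt taken at the first address stays put; one taken at the second switches.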



Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Benjamin Herrenschmidt
On Fri, 2014-03-28 at 16:18 +0800, Kevin Hao wrote:

> powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack


Kevin. It looks like it was applied to 3.14 and sent to 3.12 stable but
not 3.13 ... can you fix that up ?

Cheers,
Ben.


Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Kevin Hao
On Sat, Mar 29, 2014 at 08:27:07AM +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2014-03-28 at 16:18 +0800, Kevin Hao wrote:
> 
> > powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack
> 
> 
> Kevin. It looks like it was applied to 3.14 and sent to 3.12 stable but
> not 3.13 ... can you fix that up ?

It has already been merged into 3.13 stable, as of 3.13.6:
  https://lkml.org/lkml/2014/3/4/787

I guess that Dongsheng didn't use the latest 3.13 stable tree.

Thanks,
Kevin

