Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Kevin Hao
On Fri, Mar 28, 2014 at 03:38:32PM +0800, Dongsheng Wang wrote:
 From: Wang Dongsheng dongsheng.w...@freescale.com
 
 If softirqs run on the hardirq stack, we get a kernel panic when another
 hard irq arrives while __do_softirq has local irqs enabled to process
 softirq actions. So we need to switch to the softirq stack when invoking
 softirqs.
 
  Task
   | Task stack
   |
   Interrupt -> EXCEPTION -> do_IRQ
   |                           | Hard irq stack
   |                           |
   |  irq_exit -> __do_softirq -> local_irq_enable ... local_irq_disable
   |                                     | Hard irq stack
   |                                     |
   |                                     Interrupt coming again
   |                                     (this results in interrupt nesting)
 
 
 Trace 1: Trap 900
 
 Kernel stack overflow in process e8152f40, r1=e8e05ec0
 CPU: 0 PID: 2399 Comm: image_compress/ Not tainted 3.13.0-rc3-03475-g2e3f85b #432
 task: e8152f40 ti: c080a000 task.ti: ef176000
 NIP: c05bec04 LR: c0305590 CTR: 0010
 REGS: e8e05e10 TRAP: 0901   Not tainted  (3.13.0-rc3-03475-g2e3f85b)

Could you double check if you got the following patch applied?

commit 1a18a66446f3f289b05b634f18012424d82aa63a
Author: Kevin Hao haoke...@gmail.com
Date:   Fri Jan 17 12:25:28 2014 +0800

powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack

Guenter Roeck has got the following call trace on a p2020 board:
  Kernel stack overflow in process eb3e5a00, r1=eb79df90
  CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
  task: eb3e5a00 ti: c0616000 task.ti: ef44
  NIP: c003a420 LR: c003a410 CTR: c0017518
  REGS: eb79dee0 TRAP: 0901   Not tainted (3.13.0-rc8-juniper-00146-g19eca00)
  MSR: 00029000 CE,EE,ME  CR: 24008444  XER: 
  GPR00: c003a410 eb79df90 eb3e5a00  eb05d900 0001 65d87646 

  GPR08:  020b8000   44008442
  NIP [c003a420] __do_softirq+0x94/0x1ec
  LR [c003a410] __do_softirq+0x84/0x1ec
  Call Trace:
  [eb79df90] [c003a410] __do_softirq+0x84/0x1ec (unreliable)
  [eb79dfe0] [c003a970] irq_exit+0xbc/0xc8
  [eb79dff0] [c000cc1c] call_do_irq+0x24/0x3c
  [ef441f20] [c00046a8] do_IRQ+0x8c/0xf8
  [ef441f40] [c000e7f4] ret_from_except+0x0/0x18
  --- Exception: 501 at 0xfcda524
  LR = 0x10024900
  Instruction dump:
  7c781b78 3b4a 3a73b040 543c0024 3a80 3b3913a0 7ef5bb78 48201bf9
  5463103a 7d3b182e 7e89b92e 7c008146 3ba0 7e7e9b78 4814 57fff87f
  Kernel panic - not syncing: kernel stack overflow
  CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
  Call Trace:

The reason is that we have used the wrong register to calculate the
ksp_limit in commit cbc9565ee826 (powerpc: Remove ksp_limit on ppc64).
Just fix it.

As suggested by Benjamin Herrenschmidt, also add the C prototype of the
function in the comment in order to avoid such kind of errors in the
future.

Cc: sta...@vger.kernel.org # 3.12
Reported-by: Guenter Roeck li...@roeck-us.net
Tested-by: Guenter Roeck li...@roeck-us.net
Signed-off-by: Kevin Hao haoke...@gmail.com
Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
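
The failure mode described in the commit log can be illustrated with a small user-space model. This is only a sketch: the real code is ppc32 assembly, and the helper names `ksp_limit_for` and `stack_overflowed`, the size of the reserved gap, and the "less than or equal" form of the check are assumptions made for the example, not the kernel's definitions. It shows how deriving the limit from the wrong pointer can make a nearly empty irq stack look overflowed.

```c
#include <stdint.h>

/* Hypothetical space reserved at the bottom of a stack (e.g. for thread_info). */
#define THREAD_INFO_GAP 0x50u

/* Lowest address the stack pointer may reach on the stack that starts at 'base'. */
static uint32_t ksp_limit_for(uint32_t base)
{
    return base + THREAD_INFO_GAP;
}

/* Modelled overflow check: a stack pointer at or below the limit is an overflow. */
static int stack_overflowed(uint32_t sp, uint32_t ksp_limit)
{
    return sp <= ksp_limit;
}
```

With r1=eb79df90 from the trace and an assumed irq stack base of 0xeb79c000, `stack_overflowed(0xeb79df90u, ksp_limit_for(0xeb79c000u))` is 0. If the limit is instead computed from an unrelated pointer, say a hypothetical address on the task stack such as 0xef441f50, the same stack pointer is reported as an overflow.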

Thanks,
Kevin


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread dongsheng.w...@freescale.com
Thanks Kevin. Your patch works fine. :)

I am still a bit confused. If __do_softirq keeps getting interrupted, the
hard irq stack will eventually run out, won't it?

Regards,
-Dongsheng


Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Kevin Hao
On Fri, Mar 28, 2014 at 09:00:13AM +, dongsheng.w...@freescale.com wrote:
 Thanks Kevin. Your patch works fine. :)
 
 I am still a bit confused. If __do_softirq keeps getting interrupted, the
 hard irq stack will eventually run out, won't it?

No, it won't. Please see the explanation in the following commit log.

commit cc1f027454929924471bea2f362431072e3c71be
Author: Frederic Weisbecker fweis...@gmail.com
Date:   Tue Sep 24 17:17:47 2013 +0200

irq: Optimize softirq stack selection in irq exit

If irq_exit() is called on the arch's specified irq stack,
it should be safe to run softirqs inline under that same
irq stack as it is near empty by the time we call irq_exit().

For example if we use the same stack for both hard and soft irqs here,
the worst case scenario is:
hardirq - softirq - hardirq. But then the softirq supersedes the
first hardirq as the stack user since irq_exit() is called in
a mostly empty stack. So the stack merge in this case looks acceptable.

Stack overrun still have a chance to happen if hardirqs have more
opportunities to nest, but then it's another problem to solve.

So lets adapt the irq exit's softirq stack on top of a new Kconfig symbol
that can be defined when irq_exit() runs on the irq stack. That way
we can spare some stack switch on irq processing and all the cache
issues that come along.
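
The reasoning in this commit log can be sketched as a small user-space model. Everything here is hypothetical and for illustration only: `choose_softirq_path` stands in for the decision made at irq exit, its boolean parameter stands in for the new Kconfig symbol, and the depth figures in the usage note are made-up numbers, not measurements.

```c
#include <stdbool.h>

/* Where pending softirqs are processed at the end of a hard interrupt. */
enum softirq_path {
    SOFTIRQ_INLINE_ON_IRQ_STACK, /* run on the current, nearly empty irq stack */
    SOFTIRQ_SWITCH_TO_OWN_STACK  /* switch to a dedicated softirq stack first */
};

/* 'irq_exit_on_irq_stack' models whether irq_exit() runs on the irq stack. */
static enum softirq_path choose_softirq_path(bool irq_exit_on_irq_stack)
{
    return irq_exit_on_irq_stack ? SOFTIRQ_INLINE_ON_IRQ_STACK
                                 : SOFTIRQ_SWITCH_TO_OWN_STACK;
}

/*
 * Worst case from the commit log: hardirq - softirq - hardirq on one stack.
 * The first hardirq has almost unwound by the time irq_exit() runs, so only
 * its small residue remains below the softirq and the nested hardirq.
 */
static unsigned worst_case_depth(unsigned irq_exit_residue,
                                 unsigned softirq_depth,
                                 unsigned nested_hardirq_depth)
{
    return irq_exit_residue + softirq_depth + nested_hardirq_depth;
}

/* True if the given depth fits in a stack of 'stack_size' bytes. */
static bool fits_in_stack(unsigned depth, unsigned stack_size)
{
    return depth <= stack_size;
}
```

For example, with a residue of 112 bytes, a softirq using 2048 bytes and a nested hardirq using 1024 bytes, `worst_case_depth` gives 3184 bytes, which fits in an 8192-byte stack. As the commit log notes, this bound no longer holds if hardirqs can nest more deeply.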

Thanks,
Kevin



Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Benjamin Herrenschmidt
On Fri, 2014-03-28 at 15:38 +0800, Dongsheng Wang wrote:
 From: Wang Dongsheng dongsheng.w...@freescale.com
 
 If softirqs run on the hardirq stack, we get a kernel panic when another
 hard irq arrives while __do_softirq has local irqs enabled to process
 softirq actions. So we need to switch to the softirq stack when invoking
 softirqs.

Yes, an interrupt can potentially nest, but we should be near the top of
the stack at that point. As the comment in softirq.c says, it should be
fine. And your backtrace doesn't seem to indicate a major overflow.
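
A rough way to check this from the r1 value in an oops is sketched below. It assumes 8 KiB, size-aligned kernel stacks (which is consistent with the frame addresses in the traces in this thread, but is still an assumption about the configuration), and the helper names are made up for the example.

```c
#include <stdint.h>

#define THREAD_SIZE 0x2000u /* assumption: 8 KiB, size-aligned kernel stacks */

/* Lowest address of the stack that contains 'sp'. */
static uint32_t stack_base(uint32_t sp)
{
    return sp & ~(THREAD_SIZE - 1u);
}

/* Bytes consumed so far; the stack grows down from base + THREAD_SIZE. */
static uint32_t stack_bytes_used(uint32_t sp)
{
    return stack_base(sp) + THREAD_SIZE - sp;
}

/* Bytes left below 'sp', ignoring whatever is reserved at the stack bottom. */
static uint32_t stack_bytes_free(uint32_t sp)
{
    return sp - stack_base(sp);
}
```

Under these assumptions, r1=e8e05ec0 corresponds to 320 bytes used and r1=eb79df90 to 112 bytes used, so in both reports the stack pointer is still close to the top of its stack.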

The code in do_IRQ() makes sure we don't switch stacks again if we are
already on either the hardirq or the softirq stack.
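
That check can be modelled in user space roughly as follows. This is a sketch, not the do_IRQ() source: it assumes 8 KiB, size-aligned stacks so that the current stack can be identified by masking the stack pointer, and `should_switch_to_irq_stack` and its parameters are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define THREAD_SIZE 0x2000u /* assumption: 8 KiB, size-aligned kernel stacks */

/* Base address of the stack that contains 'sp'. */
static uint32_t stack_base_of(uint32_t sp)
{
    return sp & ~(THREAD_SIZE - 1u);
}

/*
 * Returns true only when the interrupt arrived on some other stack
 * (typically a task stack). If we are already on the hardirq or softirq
 * stack, the handler runs in place and no further switch happens.
 */
static bool should_switch_to_irq_stack(uint32_t sp,
                                       uint32_t hardirq_stack_base,
                                       uint32_t softirq_stack_base)
{
    uint32_t cur = stack_base_of(sp);

    return cur != hardirq_stack_base && cur != softirq_stack_base;
}
```

For instance, with a hardirq stack at 0xeb79c000 and a made-up softirq stack at 0xeb79a000, a stack pointer of 0xef441f20 (a task stack) triggers a switch, while 0xeb79df90 does not.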

I need a better analysis of your problem. Is that really a stack
overflow? Or is it a false positive due to a bug in the overflow
detection?

A while ago I moved the 32-bit code that updates KSP_LIMIT into asm in
misc_32.S, since we don't do that on 64-bit. Maybe we are getting it
wrong there...

Cheers,
Ben.



Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Benjamin Herrenschmidt
On Fri, 2014-03-28 at 16:18 +0800, Kevin Hao wrote:

 powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack


Kevin, it looks like it was applied to 3.14 and sent to 3.12 stable but
not 3.13... can you fix that up?

Cheers,
Ben.




Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Kevin Hao
On Sat, Mar 29, 2014 at 08:27:07AM +1100, Benjamin Herrenschmidt wrote:
 On Fri, 2014-03-28 at 16:18 +0800, Kevin Hao wrote:
 
  powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack
 
 
 Kevin, it looks like it was applied to 3.14 and sent to 3.12 stable but
 not 3.13... can you fix that up?

It has already been in 3.13 stable since 3.13.6:
  https://lkml.org/lkml/2014/3/4/787

I guess that Dongsheng didn't use the latest 3.13 stable tree.

Thanks,
Kevin

