Re: [RFC PATCH] irqchip/gic-v3: Do not enable irqs when handling spurious interrups

2021-04-16 Thread He Ying

Hello Marc,


在 2021/4/16 22:15, Marc Zyngier 写道:

[+ Mark]

On Fri, 16 Apr 2021 07:22:17 +0100,
He Ying  wrote:

We found this problem in our kernel src tree:

[   14.816231] [ cut here ]
[   14.816231] kernel BUG at irq.c:99!
[   14.816232] Internal error: Oops - BUG: 0 [#1] SMP
[   14.816232] Process swapper/0 (pid: 0, stack limit = 0x(ptrval))
[   14.816233] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G   O  
4.19.95-1.h1.AOS2.0.aarch64 #14
[   14.816233] Hardware name: evb (DT)
[   14.816234] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[   14.816234] pc : asm_nmi_enter+0x94/0x98
[   14.816235] lr : asm_nmi_enter+0x18/0x98
[   14.816235] sp : 08003c50
[   14.816235] pmr_save: 0070
[   14.816237] x29: 08003c50 x28: 095f56c0
[   14.816238] x27:  x26: 08004000
[   14.816239] x25: 015e x24: 8008fb916000
[   14.816240] x23: 2045 x22: 080817cc
[   14.816241] x21: 08003da0 x20: 0060
[   14.816242] x19: 03ff x18: 
[   14.816243] x17: 0008 x16: 003d0900
[   14.816244] x15: 095ea6c8 x14: 8008fff5ab40
[   14.816244] x13: 8008fff58b9d x12: 
[   14.816245] x11: 08c8a200 x10: 8e31fca5
[   14.816246] x9 : 08c8a208 x8 : 000f
[   14.816247] x7 : 0004 x6 : 8008fff58b9e
[   14.816248] x5 :  x4 : 8000
[   14.816249] x3 :  x2 : 8000
[   14.816250] x1 : 0012 x0 : 095f56c0
[   14.816251] Call trace:
[   14.816251]  asm_nmi_enter+0x94/0x98
[   14.816251]  el1_irq+0x8c/0x180
[   14.816252]  gic_handle_irq+0xbc/0x2e4
[   14.816252]  el1_irq+0xcc/0x180
[   14.816253]  arch_timer_handler_virt+0x38/0x58
[   14.816253]  handle_percpu_devid_irq+0x90/0x240
[   14.816253]  generic_handle_irq+0x34/0x50
[   14.816254]  __handle_domain_irq+0x68/0xc0
[   14.816254]  gic_handle_irq+0xf8/0x2e4
[   14.816255]  el1_irq+0xcc/0x180
[   14.816255]  arch_cpu_idle+0x34/0x1c8
[   14.816255]  default_idle_call+0x24/0x44
[   14.816256]  do_idle+0x1d0/0x2c8
[   14.816256]  cpu_startup_entry+0x28/0x30
[   14.816256]  rest_init+0xb8/0xc8
[   14.816257]  start_kernel+0x4c8/0x4f4
[   14.816257] Code: 940587f1 d5384100 b9401001 36a7fd01 (d421)
[   14.816258] Modules linked in: start_dp(O) smeth(O)
[   15.103092] ---[ end trace 701753956cb14aa8 ]---
[   15.103093] Kernel panic - not syncing: Fatal exception in interrupt
[   15.103099] SMP: stopping secondary CPUs
[   15.103100] Kernel Offset: disabled
[   15.103100] CPU features: 0x36,a2400218
[   15.103100] Memory Limit: none

Urgh...


Our kernel src tree is based on 4.19.95 and backports arm64 pseudo-NMI
patches but doesn't support nested NMI. Its top relative commit is
commit 17ce302f3117 ("arm64: Fix interrupt tracing in the presence of NMIs").

Can you please reproduce it with mainline and without any backport?
It is hard to reason about something that isn't a vanilla kernel.


I think our kernel is quite like v5.3 mainline. Reproducing it in v5.3 
mainline may


be a little difficult for us because our product needs some more self 
developed


patches to work.




I look into this issue and find that it's caused by 'BUG_ON(in_nmi())'
in nmi_enter(). From the call trace, we find two 'el1_irqs' which
means an interrupt preempts the other one and the new one is an NMI.
Furthermore, by adding some prints, we find the first irq also calls
nmi_enter(), but its priority is not GICD_INT_NMI_PRI and its irq number
is 1023. It enables irq by calling gic_arch_enable_irqs() in
gic_handle_irq(). At this moment, the second irq preempts the first irq
and it's an NMI but current context is already in nmi. So that may be
the problem.

I'm not sure I get it. From the stack trace, I see this:

[   14.816251]  asm_nmi_enter+0x94/0x98
[   14.816251]  el1_irq+0x8c/0x180  (C)
[   14.816252]  gic_handle_irq+0xbc/0x2e4
[   14.816252]  el1_irq+0xcc/0x180  (B)
[   14.816253]  arch_timer_handler_virt+0x38/0x58
[   14.816253]  handle_percpu_devid_irq+0x90/0x240
[   14.816253]  generic_handle_irq+0x34/0x50
[   14.816254]  __handle_domain_irq+0x68/0xc0
[   14.816254]  gic_handle_irq+0xf8/0x2e4
[   14.816255]  el1_irq+0xcc/0x180  (A)

which indicates that we preempted a timer interrupt (A) with another
IRQ (B), itself immediately preempted by another IRQ (C)? That's
indeed at least one too many.

Can you please describe for each of (A), (B) and (C) whether they are
spurious or not, what their priorities are if they aren't spurious?


Yes. I ignored interrupt (A). (B) is spurious and its priority is 0xa0 
and PMR is 0x70.


(C) is an NMI and its priority is 0x20. Note that GIC_PRIO_IRQON is 0xe0,

GIC_PRIO_IRQOFF is 0x60, GICD_INT_DEF_PRI is 0xa0 and GICD_INT_NMI_PRI is

0x20 in our kernel.


In my opinion, when handling 

Re: [RFC PATCH] irqchip/gic-v3: Do not enable irqs when handling spurious interrups

2021-04-16 Thread Marc Zyngier
[+ Mark]

On Fri, 16 Apr 2021 07:22:17 +0100,
He Ying  wrote:
> 
> We found this problem in our kernel src tree:
> 
> [   14.816231] [ cut here ]
> [   14.816231] kernel BUG at irq.c:99!
> [   14.816232] Internal error: Oops - BUG: 0 [#1] SMP
> [   14.816232] Process swapper/0 (pid: 0, stack limit = 0x(ptrval))
> [   14.816233] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G   O  
> 4.19.95-1.h1.AOS2.0.aarch64 #14
> [   14.816233] Hardware name: evb (DT)
> [   14.816234] pstate: 80400085 (Nzcv daIf +PAN -UAO)
> [   14.816234] pc : asm_nmi_enter+0x94/0x98
> [   14.816235] lr : asm_nmi_enter+0x18/0x98
> [   14.816235] sp : 08003c50
> [   14.816235] pmr_save: 0070
> [   14.816237] x29: 08003c50 x28: 095f56c0
> [   14.816238] x27:  x26: 08004000
> [   14.816239] x25: 015e x24: 8008fb916000
> [   14.816240] x23: 2045 x22: 080817cc
> [   14.816241] x21: 08003da0 x20: 0060
> [   14.816242] x19: 03ff x18: 
> [   14.816243] x17: 0008 x16: 003d0900
> [   14.816244] x15: 095ea6c8 x14: 8008fff5ab40
> [   14.816244] x13: 8008fff58b9d x12: 
> [   14.816245] x11: 08c8a200 x10: 8e31fca5
> [   14.816246] x9 : 08c8a208 x8 : 000f
> [   14.816247] x7 : 0004 x6 : 8008fff58b9e
> [   14.816248] x5 :  x4 : 8000
> [   14.816249] x3 :  x2 : 8000
> [   14.816250] x1 : 0012 x0 : 095f56c0
> [   14.816251] Call trace:
> [   14.816251]  asm_nmi_enter+0x94/0x98
> [   14.816251]  el1_irq+0x8c/0x180
> [   14.816252]  gic_handle_irq+0xbc/0x2e4
> [   14.816252]  el1_irq+0xcc/0x180
> [   14.816253]  arch_timer_handler_virt+0x38/0x58
> [   14.816253]  handle_percpu_devid_irq+0x90/0x240
> [   14.816253]  generic_handle_irq+0x34/0x50
> [   14.816254]  __handle_domain_irq+0x68/0xc0
> [   14.816254]  gic_handle_irq+0xf8/0x2e4
> [   14.816255]  el1_irq+0xcc/0x180
> [   14.816255]  arch_cpu_idle+0x34/0x1c8
> [   14.816255]  default_idle_call+0x24/0x44
> [   14.816256]  do_idle+0x1d0/0x2c8
> [   14.816256]  cpu_startup_entry+0x28/0x30
> [   14.816256]  rest_init+0xb8/0xc8
> [   14.816257]  start_kernel+0x4c8/0x4f4
> [   14.816257] Code: 940587f1 d5384100 b9401001 36a7fd01 (d421)
> [   14.816258] Modules linked in: start_dp(O) smeth(O)
> [   15.103092] ---[ end trace 701753956cb14aa8 ]---
> [   15.103093] Kernel panic - not syncing: Fatal exception in interrupt
> [   15.103099] SMP: stopping secondary CPUs
> [   15.103100] Kernel Offset: disabled
> [   15.103100] CPU features: 0x36,a2400218
> [   15.103100] Memory Limit: none

Urgh...

> Our kernel src tree is based on 4.19.95 and backports arm64 pseudo-NMI
> patches but doesn't support nested NMI. Its top relative commit is
> commit 17ce302f3117 ("arm64: Fix interrupt tracing in the presence of NMIs").

Can you please reproduce it with mainline and without any backport?
It is hard to reason about something that isn't a vanilla kernel.

> I look into this issue and find that it's caused by 'BUG_ON(in_nmi())'
> in nmi_enter(). From the call trace, we find two 'el1_irqs' which
> means an interrupt preempts the other one and the new one is an NMI.
> Furthermore, by adding some prints, we find the first irq also calls
> nmi_enter(), but its priority is not GICD_INT_NMI_PRI and its irq number
> is 1023. It enables irq by calling gic_arch_enable_irqs() in
> gic_handle_irq(). At this moment, the second irq preempts the first irq
> and it's an NMI but current context is already in nmi. So that may be
> the problem.

I'm not sure I get it. From the stack trace, I see this:

[   14.816251]  asm_nmi_enter+0x94/0x98
[   14.816251]  el1_irq+0x8c/0x180  (C)
[   14.816252]  gic_handle_irq+0xbc/0x2e4
[   14.816252]  el1_irq+0xcc/0x180  (B)
[   14.816253]  arch_timer_handler_virt+0x38/0x58
[   14.816253]  handle_percpu_devid_irq+0x90/0x240
[   14.816253]  generic_handle_irq+0x34/0x50
[   14.816254]  __handle_domain_irq+0x68/0xc0
[   14.816254]  gic_handle_irq+0xf8/0x2e4
[   14.816255]  el1_irq+0xcc/0x180  (A)

which indicates that we preempted a timer interrupt (A) with another
IRQ (B), itself immediately preempted by another IRQ (C)? That's
indeed at least one too many.

Can you please describe for each of (A), (B) and (C) whether they are
spurious or not, what their priorities are if they aren't spurious?

> In my opinion, when handling spurious interrupts, we shouldn't enable irqs.
> My reason is that for spurious interrupts we may enter nmi context in
> el1_irq() because current PMR may be GIC_PRIO_IRQOFF. If we enable irqs
> at this time, another NMI may happen and preempt this spurious interrupt
> but the context is already in nmi. That causes a bug on if nested NMI is
> not supported. Even for nested 

[RFC PATCH] irqchip/gic-v3: Do not enable irqs when handling spurious interrups

2021-04-16 Thread He Ying
We found this problem in our kernel src tree:

[   14.816231] [ cut here ]
[   14.816231] kernel BUG at irq.c:99!
[   14.816232] Internal error: Oops - BUG: 0 [#1] SMP
[   14.816232] Process swapper/0 (pid: 0, stack limit = 0x(ptrval))
[   14.816233] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G   O  
4.19.95-1.h1.AOS2.0.aarch64 #14
[   14.816233] Hardware name: evb (DT)
[   14.816234] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[   14.816234] pc : asm_nmi_enter+0x94/0x98
[   14.816235] lr : asm_nmi_enter+0x18/0x98
[   14.816235] sp : 08003c50
[   14.816235] pmr_save: 0070
[   14.816237] x29: 08003c50 x28: 095f56c0
[   14.816238] x27:  x26: 08004000
[   14.816239] x25: 015e x24: 8008fb916000
[   14.816240] x23: 2045 x22: 080817cc
[   14.816241] x21: 08003da0 x20: 0060
[   14.816242] x19: 03ff x18: 
[   14.816243] x17: 0008 x16: 003d0900
[   14.816244] x15: 095ea6c8 x14: 8008fff5ab40
[   14.816244] x13: 8008fff58b9d x12: 
[   14.816245] x11: 08c8a200 x10: 8e31fca5
[   14.816246] x9 : 08c8a208 x8 : 000f
[   14.816247] x7 : 0004 x6 : 8008fff58b9e
[   14.816248] x5 :  x4 : 8000
[   14.816249] x3 :  x2 : 8000
[   14.816250] x1 : 0012 x0 : 095f56c0
[   14.816251] Call trace:
[   14.816251]  asm_nmi_enter+0x94/0x98
[   14.816251]  el1_irq+0x8c/0x180
[   14.816252]  gic_handle_irq+0xbc/0x2e4
[   14.816252]  el1_irq+0xcc/0x180
[   14.816253]  arch_timer_handler_virt+0x38/0x58
[   14.816253]  handle_percpu_devid_irq+0x90/0x240
[   14.816253]  generic_handle_irq+0x34/0x50
[   14.816254]  __handle_domain_irq+0x68/0xc0
[   14.816254]  gic_handle_irq+0xf8/0x2e4
[   14.816255]  el1_irq+0xcc/0x180
[   14.816255]  arch_cpu_idle+0x34/0x1c8
[   14.816255]  default_idle_call+0x24/0x44
[   14.816256]  do_idle+0x1d0/0x2c8
[   14.816256]  cpu_startup_entry+0x28/0x30
[   14.816256]  rest_init+0xb8/0xc8
[   14.816257]  start_kernel+0x4c8/0x4f4
[   14.816257] Code: 940587f1 d5384100 b9401001 36a7fd01 (d421)
[   14.816258] Modules linked in: start_dp(O) smeth(O)
[   15.103092] ---[ end trace 701753956cb14aa8 ]---
[   15.103093] Kernel panic - not syncing: Fatal exception in interrupt
[   15.103099] SMP: stopping secondary CPUs
[   15.103100] Kernel Offset: disabled
[   15.103100] CPU features: 0x36,a2400218
[   15.103100] Memory Limit: none

Our kernel src tree is based on 4.19.95 and backports arm64 pseudo-NMI
patches but doesn't support nested NMI. Its top relative commit is
commit 17ce302f3117 ("arm64: Fix interrupt tracing in the presence of NMIs").

I look into this issue and find that it's caused by 'BUG_ON(in_nmi())'
in nmi_enter(). From the call trace, we find two 'el1_irqs' which
means an interrupt preempts the other one and the new one is an NMI.
Furthermore, by adding some prints, we find the first irq also calls
nmi_enter(), but its priority is not GICD_INT_NMI_PRI and its irq number
is 1023. It enables irq by calling gic_arch_enable_irqs() in
gic_handle_irq(). At this moment, the second irq preempts the first irq
and it's an NMI but current context is already in nmi. So that may be
the problem.

In my opinion, when handling spurious interrupts, we shouldn't enable irqs.
My reason is that for spurious interrupts we may enter nmi context in
el1_irq() because current PMR may be GIC_PRIO_IRQOFF. If we enable irqs
at this time, another NMI may happen and preempt this spurious interrupt
but the context is already in nmi. That causes a bug on if nested NMI is
not supported. Even for nested nmi, I think it's not a normal scenario.

Fixes: 17ce302f3117 ("arm64: Fix interrupt tracing in the presence of NMIs")
Signed-off-by: He Ying 
---
 drivers/irqchip/irq-gic-v3.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 94b89258d045..d3b52734a2c5 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -654,15 +654,15 @@ static asmlinkage void __exception_irq_entry 
gic_handle_irq(struct pt_regs *regs
return;
}
 
+   /* Check for special IDs first */
+   if ((irqnr >= 1020 && irqnr <= 1023))
+   return;
+
if (gic_prio_masking_enabled()) {
gic_pmr_mask_irqs();
gic_arch_enable_irqs();
}
 
-   /* Check for special IDs first */
-   if ((irqnr >= 1020 && irqnr <= 1023))
-   return;
-
if (static_branch_likely(_deactivate_key))
gic_write_eoir(irqnr);
else
-- 
2.17.1