On Wed, Jan 07, 2026 at 08:02:43PM -0500, Joel Fernandes wrote:
> 
> 
> > On Jan 7, 2026, at 6:15 PM, Frederic Weisbecker <[email protected]> wrote:
> > 
> > On Thu, Jan 01, 2026 at 11:34:10AM -0500, Joel Fernandes wrote:
> >> From: Yao Kai <[email protected]>
> >> 
> >> Commit 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in
> >> __rcu_read_unlock()") removes the recursion-protection code from
> >> __rcu_read_unlock(). As a result, with ftrace enabled, we can trigger a
> >> deadloop in raise_softirq_irqoff() as follows:
> >> 
> >> WARNING: CPU: 0 PID: 0 at kernel/trace/trace.c:3021 __ftrace_trace_stack.constprop.0+0x172/0x180
> >> Modules linked in: my_irq_work(O)
> >> CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G O 6.18.0-rc7-dirty #23 PREEMPT(full)
> >> Tainted: [O]=OOT_MODULE
> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> >> RIP: 0010:__ftrace_trace_stack.constprop.0+0x172/0x180
> >> RSP: 0018:ffffc900000034a8 EFLAGS: 00010002
> >> RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000
> >> RDX: 0000000000000003 RSI: ffffffff826d7b87 RDI: ffffffff826e9329
> >> RBP: 0000000000090009 R08: 0000000000000005 R09: ffffffff82afbc4c
> >> R10: 0000000000000008 R11: 0000000000011d7a R12: 0000000000000000
> >> R13: ffff888003874100 R14: 0000000000000003 R15: ffff8880038c1054
> >> FS:  0000000000000000(0000) GS:ffff8880fa8ea000(0000) knlGS:0000000000000000
> >> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: 000055b31fa7f540 CR3: 00000000078f4005 CR4: 0000000000770ef0
> >> PKRU: 55555554
> >> Call Trace:
> >> <IRQ>
> >> trace_buffer_unlock_commit_regs+0x6d/0x220
> >> trace_event_buffer_commit+0x5c/0x260
> >> trace_event_raw_event_softirq+0x47/0x80
> >> raise_softirq_irqoff+0x6e/0xa0
> >> rcu_read_unlock_special+0xb1/0x160
> >> unwind_next_frame+0x203/0x9b0
> >> __unwind_start+0x15d/0x1c0
> >> arch_stack_walk+0x62/0xf0
> >> stack_trace_save+0x48/0x70
> >> __ftrace_trace_stack.constprop.0+0x144/0x180
> >> trace_buffer_unlock_commit_regs+0x6d/0x220
> >> trace_event_buffer_commit+0x5c/0x260
> >> trace_event_raw_event_softirq+0x47/0x80
> >> raise_softirq_irqoff+0x6e/0xa0
> >> rcu_read_unlock_special+0xb1/0x160
> >> unwind_next_frame+0x203/0x9b0
> >> __unwind_start+0x15d/0x1c0
> >> arch_stack_walk+0x62/0xf0
> >> stack_trace_save+0x48/0x70
> >> __ftrace_trace_stack.constprop.0+0x144/0x180
> >> trace_buffer_unlock_commit_regs+0x6d/0x220
> >> trace_event_buffer_commit+0x5c/0x260
> >> trace_event_raw_event_softirq+0x47/0x80
> >> raise_softirq_irqoff+0x6e/0xa0
> >> rcu_read_unlock_special+0xb1/0x160
> >> unwind_next_frame+0x203/0x9b0
> >> __unwind_start+0x15d/0x1c0
> >> arch_stack_walk+0x62/0xf0
> >> stack_trace_save+0x48/0x70
> >> __ftrace_trace_stack.constprop.0+0x144/0x180
> >> trace_buffer_unlock_commit_regs+0x6d/0x220
> >> trace_event_buffer_commit+0x5c/0x260
> >> trace_event_raw_event_softirq+0x47/0x80
> >> raise_softirq_irqoff+0x6e/0xa0
> >> rcu_read_unlock_special+0xb1/0x160
> >> __is_insn_slot_addr+0x54/0x70
> >> kernel_text_address+0x48/0xc0
> >> __kernel_text_address+0xd/0x40
> >> unwind_get_return_address+0x1e/0x40
> >> arch_stack_walk+0x9c/0xf0
> >> stack_trace_save+0x48/0x70
> >> __ftrace_trace_stack.constprop.0+0x144/0x180
> >> trace_buffer_unlock_commit_regs+0x6d/0x220
> >> trace_event_buffer_commit+0x5c/0x260
> >> trace_event_raw_event_softirq+0x47/0x80
> >> __raise_softirq_irqoff+0x61/0x80
> >> __flush_smp_call_function_queue+0x115/0x420
> >> __sysvec_call_function_single+0x17/0xb0
> >> sysvec_call_function_single+0x8c/0xc0
> >> </IRQ>
> >> 
> >> Commit b41642c87716 ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work")
> >> fixed the infinite loop in rcu_read_unlock_special() for IRQ work by
> >> setting a flag before calling irq_work_queue_on(). Fix this issue by
> >> setting the same flag before calling raise_softirq_irqoff(), and rename
> >> the flag to defer_qs_pending to reflect its more general use.
> >> 
> >> Fixes: 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in __rcu_read_unlock()")
> >> Reported-by: Tengda Wu <[email protected]>
> >> Signed-off-by: Yao Kai <[email protected]>
> >> Reviewed-by: Joel Fernandes <[email protected]>
> >> Signed-off-by: Joel Fernandes <[email protected]>
> > 
> > Looks good but, BTW, what happens if rcu_qs() is called
> > before rcu_preempt_deferred_qs() had a chance to be called?
> 
> Could you provide an example of when that can happen?

It can happen because rcu_qs() is called before rcu_preempt_deferred_qs()
in rcu_softirq_qs(). Inverting the calls could help but IRQs must be disabled
to ensure there is no read side between rcu_preempt_deferred_qs() and rcu_qs().

I'm not aware of other ways to trigger that, except perhaps this:

https://lore.kernel.org/rcu/[email protected]/T/#u

Either we fix those sites and make sure that rcu_preempt_deferred_qs() is always
called before rcu_qs() in the same IRQ-disabled section (or that other fields
are set in ->rcu_read_unlock_special for later clearing). If we do that, we
must WARN_ON_ONCE(rdp->defer_qs_pending == DEFER_QS_PENDING) in rcu_qs().

Or we reset rdp->defer_qs_pending from rcu_qs(), which sounds more robust.
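
To make the reset option concrete, here is a minimal user-space model, not
kernel code. It assumes one reading of the concern: that the flag is only
cleared on the rcu_preempt_deferred_qs() path, and that this path does nothing
once rcu_qs() has already cleared ->rcu_read_unlock_special. The struct, the
model_*() helpers and DEFER_QS_IDLE are hypothetical names used for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag states; DEFER_QS_IDLE is assumed as the counterpart of DEFER_QS_PENDING. */
enum { DEFER_QS_IDLE = 0, DEFER_QS_PENDING = 1 };

/* Hypothetical stand-in for the per-CPU and per-task state involved. */
struct qs_model {
	bool special;		/* models a non-zero ->rcu_read_unlock_special */
	int defer_qs_pending;	/* models rdp->defer_qs_pending */
	int raises;		/* how many times the softirq was "raised" */
};

/* Models the guarded raise_softirq_irqoff() in rcu_read_unlock_special(). */
static void model_unlock_special(struct qs_model *m)
{
	if (m->defer_qs_pending == DEFER_QS_PENDING)
		return;		/* already deferred: do not raise again */
	m->defer_qs_pending = DEFER_QS_PENDING;
	m->raises++;
}

/* Models rcu_qs(): reports the QS and optionally resets the flag as well. */
static void model_rcu_qs(struct qs_model *m, bool reset_in_qs)
{
	m->special = false;
	if (reset_in_qs)
		m->defer_qs_pending = DEFER_QS_IDLE;
}

/* Models rcu_preempt_deferred_qs(): only acts if something is left to clear. */
static void model_deferred_qs(struct qs_model *m)
{
	if (!m->special)
		return;
	m->special = false;
	m->defer_qs_pending = DEFER_QS_IDLE;
}

/* Returns true if the flag stays stuck at PENDING when rcu_qs() runs first. */
static bool flag_stuck_after_early_qs(bool reset_in_qs)
{
	struct qs_model m = { .special = true };

	model_unlock_special(&m);
	assert(m.raises == 1);		/* the first deferral does raise */
	model_rcu_qs(&m, reset_in_qs);	/* QS reported before the deferred path */
	model_deferred_qs(&m);		/* finds nothing left to clear */
	return m.defer_qs_pending == DEFER_QS_PENDING;
}
```

Under these assumptions, flag_stuck_after_early_qs(false) leaves the flag
pending, so later deferrals would never raise again, while
flag_stuck_after_early_qs(true) returns it to idle.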

Ah, an alternative is to make rdp::defer_qs_pending a field in union rcu_special
which, sadly, would need to be expanded to a u64.
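
A rough user-space sketch of why the union would have to grow, assuming the
current layout is four u8 fields overlaid with a u32 (field names as I recall
them from union rcu_special; the fifth field and the _sketch type name are
hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* User-space sketch only; not the kernel definition. */
union rcu_special_sketch {
	struct {
		uint8_t blocked;
		uint8_t need_qs;
		uint8_t exp_hint;
		uint8_t need_mb;
		uint8_t defer_qs_pending;	/* hypothetical fifth byte */
	} b;
	uint64_t s;	/* must cover all five bytes, so a u32 is too small */
};

/* Five bytes no longer fit in 32 bits, hence the wider aggregate word. */
static_assert(sizeof(((union rcu_special_sketch *)0)->b) > sizeof(uint32_t),
	      "field struct is expected to outgrow a u32");
static_assert(sizeof(union rcu_special_sketch) == sizeof(uint64_t),
	      "union is expected to be exactly one u64 wide");

/* Returns non-zero if any special field, including the new one, is set. */
static int special_is_set(const union rcu_special_sketch *sp)
{
	return sp->s != 0;
}
```

The point of keeping everything in one word is that a single test of ->s still
covers every field, including the new one.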

Thanks.

-- 
Frederic Weisbecker
SUSE Labs
