Le Thu, Jan 01, 2026 at 11:34:10AM -0500, Joel Fernandes a écrit : > From: Yao Kai <[email protected]> > > Commit 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in > __rcu_read_unlock()") removes the recursion-protection code from > __rcu_read_unlock(). Therefore, we could invoke the deadloop in > raise_softirq_irqoff() with ftrace enabled as follows: > > WARNING: CPU: 0 PID: 0 at kernel/trace/trace.c:3021 > __ftrace_trace_stack.constprop.0+0x172/0x180 > Modules linked in: my_irq_work(O) > CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G O 6.18.0-rc7-dirty #23 > PREEMPT(full) > Tainted: [O]=OOT_MODULE > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 > 04/01/2014 > RIP: 0010:__ftrace_trace_stack.constprop.0+0x172/0x180 > RSP: 0018:ffffc900000034a8 EFLAGS: 00010002 > RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000 > RDX: 0000000000000003 RSI: ffffffff826d7b87 RDI: ffffffff826e9329 > RBP: 0000000000090009 R08: 0000000000000005 R09: ffffffff82afbc4c > R10: 0000000000000008 R11: 0000000000011d7a R12: 0000000000000000 > R13: ffff888003874100 R14: 0000000000000003 R15: ffff8880038c1054 > FS: 0000000000000000(0000) GS:ffff8880fa8ea000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 000055b31fa7f540 CR3: 00000000078f4005 CR4: 0000000000770ef0 > PKRU: 55555554 > Call Trace: > <IRQ> > trace_buffer_unlock_commit_regs+0x6d/0x220 > trace_event_buffer_commit+0x5c/0x260 > trace_event_raw_event_softirq+0x47/0x80 > raise_softirq_irqoff+0x6e/0xa0 > rcu_read_unlock_special+0xb1/0x160 > unwind_next_frame+0x203/0x9b0 > __unwind_start+0x15d/0x1c0 > arch_stack_walk+0x62/0xf0 > stack_trace_save+0x48/0x70 > __ftrace_trace_stack.constprop.0+0x144/0x180 > trace_buffer_unlock_commit_regs+0x6d/0x220 > trace_event_buffer_commit+0x5c/0x260 > trace_event_raw_event_softirq+0x47/0x80 > raise_softirq_irqoff+0x6e/0xa0 > rcu_read_unlock_special+0xb1/0x160 > unwind_next_frame+0x203/0x9b0 > __unwind_start+0x15d/0x1c0 > arch_stack_walk+0x62/0xf0 > stack_trace_save+0x48/0x70 > __ftrace_trace_stack.constprop.0+0x144/0x180 > trace_buffer_unlock_commit_regs+0x6d/0x220 > trace_event_buffer_commit+0x5c/0x260 > trace_event_raw_event_softirq+0x47/0x80 > raise_softirq_irqoff+0x6e/0xa0 > rcu_read_unlock_special+0xb1/0x160 > unwind_next_frame+0x203/0x9b0 > __unwind_start+0x15d/0x1c0 > arch_stack_walk+0x62/0xf0 > stack_trace_save+0x48/0x70 > __ftrace_trace_stack.constprop.0+0x144/0x180 > trace_buffer_unlock_commit_regs+0x6d/0x220 > trace_event_buffer_commit+0x5c/0x260 > trace_event_raw_event_softirq+0x47/0x80 > raise_softirq_irqoff+0x6e/0xa0 > rcu_read_unlock_special+0xb1/0x160 > __is_insn_slot_addr+0x54/0x70 > kernel_text_address+0x48/0xc0 > __kernel_text_address+0xd/0x40 > unwind_get_return_address+0x1e/0x40 > arch_stack_walk+0x9c/0xf0 > stack_trace_save+0x48/0x70 > __ftrace_trace_stack.constprop.0+0x144/0x180 > trace_buffer_unlock_commit_regs+0x6d/0x220 > trace_event_buffer_commit+0x5c/0x260 > trace_event_raw_event_softirq+0x47/0x80 > __raise_softirq_irqoff+0x61/0x80 > __flush_smp_call_function_queue+0x115/0x420 > __sysvec_call_function_single+0x17/0xb0 > sysvec_call_function_single+0x8c/0xc0 > </IRQ> > > Commit b41642c87716 ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work") > fixed the infinite loop in rcu_read_unlock_special() for IRQ work by > setting a flag before calling irq_work_queue_on(). We fix this issue by > setting the same flag before calling raise_softirq_irqoff() and rename the > flag to defer_qs_pending for more common. > > Fixes: 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in > __rcu_read_unlock()") > Reported-by: Tengda Wu <[email protected]> > Signed-off-by: Yao Kai <[email protected]> > Reviewed-by: Joel Fernandes <[email protected]> > Signed-off-by: Joel Fernandes <[email protected]>
Looks good but, BTW, what happens if rcu_qs() is called before rcu_preempt_deferred_qs() had a chance to be called? current->rcu_read_unlock_special.b.need_qs is reset by rcu_qs() so subsequent calls to rcu_read_unlock() won't issue rcu_read_unlock_special() (unless the task is blocked). And further calls to rcu_preempt_deferred_qs() through rcu_core() will be ignored as well. But rdp->defer_qs_pending will remain in the DEFER_QS_PENDING state until the next grace period. And if rcu_read_unlock_special() is called again during the next GP on unfortunate place needing deferred qs, the state machine will spuriously assume that either rcu_core or the irq_work are pending, when none are anymore. The state should be reset by rcu_qs(). Thanks. -- Frederic Weisbecker SUSE Labs

