On Mon, May 20, 2024 at 10:30:17PM -0700, Andrii Nakryiko wrote:
> Recent changes made uprobe_cpu_buffer preparation lazy, and moved it
> deeper into __uprobe_trace_func(). This is problematic because
> __uprobe_trace_func() is called inside rcu_read_lock()/rcu_read_unlock()
> block, which then calls prepare_uprobe_buffer() -> uprobe_buffer_get() ->
> mutex_lock(&ucb->mutex), leading to a splat about using mutex under
> non-sleepable RCU:
>
> BUG: sleeping function called from invalid context at
> kernel/locking/mutex.c:585
> in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 98231, name:
> stress-ng-sigq
> preempt_count: 0, expected: 0
> RCU nest depth: 1, expected: 0
> ...
> Call Trace:
> <TASK>
> dump_stack_lvl+0x3d/0xe0
> __might_resched+0x24c/0x270
> ? prepare_uprobe_buffer+0xd5/0x1d0
> __mutex_lock+0x41/0x820
> ? ___perf_sw_event+0x206/0x290
> ? __perf_event_task_sched_in+0x54/0x660
> ? __perf_event_task_sched_in+0x54/0x660
> prepare_uprobe_buffer+0xd5/0x1d0
> __uprobe_trace_func+0x4a/0x140
> uprobe_dispatcher+0x135/0x280
> ? uprobe_dispatcher+0x94/0x280
> uprobe_notify_resume+0x650/0xec0
> ? atomic_notifier_call_chain+0x21/0x110
> ? atomic_notifier_call_chain+0xf8/0x110
> irqentry_exit_to_user_mode+0xe2/0x1e0
> asm_exc_int3+0x35/0x40
> RIP: 0033:0x7f7e1d4da390
> Code: 33 04 00 0f 1f 80 00 00 00 00 f3 0f 1e fa b9 01 00 00 00 e9 b2 fc ff
> ff 66 90 f3 0f 1e fa 31 c9 e9 a5 fc ff ff 0f 1f 44 00 00 <cc> 0f 1e fa b8 27
> 00 00 00 0f 05 c3 0f 1f 40 00 f3 0f 1e fa b8 6e
> RSP: 002b:00007ffd2abc3608 EFLAGS: 00000246
> RAX: 0000000000000000 RBX: 0000000076d325f1 RCX: 0000000000000000
> RDX: 0000000076d325f1 RSI: 000000000000000a RDI: 00007ffd2abc3690
> RBP: 000000000000000a R08: 00017fb700000000 R09: 00017fb700000000
> R10: 00017fb700000000 R11: 0000000000000246 R12: 0000000000017ff2
> R13: 00007ffd2abc3610 R14: 0000000000000000 R15: 00007ffd2abc3780
> </TASK>
>
> Luckily, it's easy to fix by moving prepare_uprobe_buffer() to be called
> slightly earlier: into uprobe_trace_func() and uretprobe_trace_func(), outside
> of RCU locked section. This still keeps this buffer preparation lazy and helps
> avoid the overhead when it's not needed. E.g., if there is only BPF uprobe
> handler installed on a given uprobe, buffer won't be initialized.
>
> Note, the other user of prepare_uprobe_buffer(), __uprobe_perf_func(), is not
> affected, as it doesn't prepare buffer under RCU read lock.
>
> Fixes: 1b8f85defbc8 ("uprobes: prepare uprobe args buffer lazily")
> Reported-by: Breno Leitao <[email protected]>
> Signed-off-by: Andrii Nakryiko <[email protected]>
Tested-by: Breno Leitao <[email protected]>