On Fri, Mar 19, 2021 at 06:41:06PM +0000, Mark Rutland wrote: > We recently converted arm64 to use arch_stack_walk() in commit: > > 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > > The core stacktrace code expects that (when tracing the current task) > arch_stack_walk() starts a trace at its caller, and does not include > itself in the trace. However, arm64's arch_stack_walk() includes itself, > and so traces include one more entry than callers expect. The core > stacktrace code which calls arch_stack_walk() tries to skip a number of > entries to prevent itself appearing in a trace, and the additional entry > prevents skipping one of the core stacktrace functions, leaving this in > the trace unexpectedly. > > We can fix this by having arm64's arch_stack_walk() begin the trace with > its caller. The first value returned by the trace will be > __builtin_return_address(0), i.e. the caller of arch_stack_walk(). The > first frame record to be unwound will be __builtin_frame_address(1), > i.e. the caller's frame record. To prevent surprises, arch_stack_walk() > is also marked noinline. > > While __builtin_frame_address(1) is not safe in portable code, local GCC > developers have confirmed that it is safe on arm64. To find the caller's > frame record, the builtin can safely dereference the current function's > frame record or (in theory) could stash the original FP into another GPR > at function entry time, neither of which are problematic. > > Prior to this patch, the tracing code would unexpectedly show up in > traces of the current task, e.g. > > | # cat /proc/self/stack > | [<0>] stack_trace_save_tsk+0x98/0x100 > | [<0>] proc_pid_stack+0xb4/0x130 > | [<0>] proc_single_show+0x60/0x110 > | [<0>] seq_read_iter+0x230/0x4d0 > | [<0>] seq_read+0xdc/0x130 > | [<0>] vfs_read+0xac/0x1e0 > | [<0>] ksys_read+0x6c/0xfc > | [<0>] __arm64_sys_read+0x20/0x30 > | [<0>] el0_svc_common.constprop.0+0x60/0x120 > | [<0>] do_el0_svc+0x24/0x90 > | [<0>] el0_svc+0x2c/0x54 > | [<0>] el0_sync_handler+0x1a4/0x1b0 > | [<0>] el0_sync+0x170/0x180 > > After this patch, the tracing code will not show up in such traces: > > | # cat /proc/self/stack > | [<0>] proc_pid_stack+0xb4/0x130 > | [<0>] proc_single_show+0x60/0x110 > | [<0>] seq_read_iter+0x230/0x4d0 > | [<0>] seq_read+0xdc/0x130 > | [<0>] vfs_read+0xac/0x1e0 > | [<0>] ksys_read+0x6c/0xfc > | [<0>] __arm64_sys_read+0x20/0x30 > | [<0>] el0_svc_common.constprop.0+0x60/0x120 > | [<0>] do_el0_svc+0x24/0x90 > | [<0>] el0_svc+0x2c/0x54 > | [<0>] el0_sync_handler+0x1a4/0x1b0 > | [<0>] el0_sync+0x170/0x180 > > Erring on the side of caution, I've given this a spin with a bunch of > toolchains, verifying the output of /proc/self/stack and checking that > the assembly looked sound. For GCC (where we require version 5.1.0 or > later) I tested with the kernel.org crosstool binares for versions > 5.5.0, 6.4.0, 6.5.0, 7.3.0, 7.5.0, 8.1.0, 8.3.0, 8.4.0, 9.2.0, and > 10.1.0. For clang (where we require version 10.0.1 or later) I tested > with the llvm.org binary releases of 11.0.0, and 11.0.1. > > Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > Signed-off-by: Mark Rutland <mark.rutl...@arm.com> > Cc: Catalin Marinas <catalin.mari...@arm.com> > Cc: Chen Jun <chenjun...@huawei.com> > Cc: Marco Elver <el...@google.com> > Cc: Mark Brown <broo...@kernel.org> > Cc: Will Deacon <w...@kernel.org>
Thanks Mark. I think we should add a cc stable, just with Fixes doesn't always seem to end up in a stable kernel: Cc: <sta...@vger.kernel.org> # 5.10.x With that: Reviewed-by: Catalin Marinas <catalin.mari...@arm.com>