On Thu, Mar 04, 2021 at 03:54:48PM -0600, Segher Boessenkool wrote:
> Hi!

Hi Segher,

> On Thu, Mar 04, 2021 at 02:57:30PM +0000, Mark Rutland wrote:
> > It looks like GCC is happy to give us the function-entry-time FP if we use
> > __builtin_frame_address(1),
>
> From the GCC manual:
>      Calling this function with a nonzero argument can have
>      unpredictable effects, including crashing the calling program. As
>      a result, calls that are considered unsafe are diagnosed when the
>      '-Wframe-address' option is in effect. Such calls should only be
>      made in debugging situations.
>
> It *does* warn (the warning is in -Wall btw), on both powerpc and
> aarch64. Furthermore, using this builtin causes lousy code (it forces
> the use of a frame pointer, which we normally try very hard to optimise
> away, for good reason).
>
> And, that warning is not an idle warning. Non-zero arguments to
> __builtin_frame_address can crash the program. It won't on simpler
> functions, but there is no real definition of what a simpler function
> *is*. It is meant for debugging, not for production use (this is also
> why no one has bothered to make it faster).
>
> On Power it should work, but on pretty much any other arch it won't.

I understand this is true generally, and cannot be relied upon in
portable code. However as you hint here for Power, I believe that on
arm64 __builtin_frame_address(1) shouldn't crash the program due to the
way frame records work on arm64, but I'll go check with some local
compiler folk. I agree that __builtin_frame_address(2) and beyond
certainly can, e.g. by NULL dereference and similar.

For context, why do you think this would work on power specifically? I
wonder if our rationale is similar.

Are you aware of anything in particular that breaks using
__builtin_frame_address(1) in non-portable code, or is this just a
general sentiment of this not being a supported use-case?
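
For concreteness, here is a minimal sketch of the kind of usage under
discussion. The names snapshot_fp() and struct fp_snapshot are
hypothetical and exist only for illustration. The sketch assumes GCC or
a compatible compiler, and it assumes that a level-1 request is only a
load from the current frame, which is an assumption about the target
rather than something the manual promises. The level-1 value is only
meaningful if the caller actually maintains a frame pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the two values; illustration only. */
struct fp_snapshot {
	void *self;	/* this function's own frame address (level 0) */
	void *caller;	/* whatever the compiler reports for level 1 */
};

/*
 * noinline asks the compiler to keep this as a separate function. It
 * does not stop the compiler from cloning or splitting the function,
 * so it is not a guaranteed unwind boundary.
 */
__attribute__((noinline))
void snapshot_fp(struct fp_snapshot *out)
{
	assert(out != NULL);

	/* Level 0 is the documented, supported case. */
	out->self = __builtin_frame_address(0);

	/*
	 * Level 1 is diagnosed by -Wframe-address (enabled by -Wall).
	 * Ask for the warning to be suppressed around this one call.
	 */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wframe-address"
	out->caller = __builtin_frame_address(1);
#pragma GCC diagnostic pop
}
```

A caller would declare a struct fp_snapshot, pass its address to
snapshot_fp(), and treat only the 'self' field as reliable. Whether
'caller' is trustworthy is exactly the open question above.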

> > Unless we can get some strong guarantees from compiler folk such that we
> > can guarantee a specific function acts boundary for unwinding (and
> > doesn't itself get split, etc), the only reliable way I can think to
> > solve this requires an assembly trampoline. Whatever we do is liable to
> > need some invasive rework.
>
> You cannot get such a guarantee, other than not letting the compiler
> see into the routine at all, like with assembler code (not inline asm,
> real assembler code).

If we cannot reliably ensure this then I'm happy to go write an assembly
trampoline to snapshot the state at a function call boundary (where our
procedure call standard mandates the state of the LR, FP, and frame
records pointed to by the FP). This'll require reworking a reasonable
amount of code cross-architecture, so I'll need to get some more
concrete justification (e.g. examples of things that can go wrong in
practice).

> The real way forward is to bite the bullet and to no longer pretend you
> can do a full backtrace from just the stack contents. You cannot.

I think what you mean here is that there's no reliable way to handle the
current/leaf function, right? If so I do agree.

Beyond that I believe that arm64's frame records should be sufficient.

Thanks,
Mark.