On Tue, 15 Oct 2024 10:01:49 +0200 Peter Zijlstra <pet...@infradead.org> wrote:
> On Mon, Oct 14, 2024 at 09:08:41PM -0400, Steven Rostedt wrote: > > There's a long standing issue with having fentry in weak functions that > > were overwritten. This was first caught when a "notrace" function was > > showing up in the /sys/kernel/tracing/available_filter_functions list. > > > > > > https://lore.kernel.org/all/20220412094923.0abe90955e5db486b7bca...@kernel.org/ > > > > What was happening is that ftrace uses kallsyms (what is basically listed > > by nm) > > Ah, if only. NM lists the symbol table, and that has strictly more > information than kallsyms, notably it includes symbol length. Using > symbol length it is trivial to distinguish the case you're tripping > over. > > > to map the location of __fentry to the previous symbol ahead of it. > > Then it says that function is the name of the function for that __fentry. > > But when you have weak functions, you can have this: > > > > > > int notrace do_not_watch_me(void) > > { > > return whatever; > > } > > > > void __weak define_me_in_arch(void) > > { > > return; > > } > > > > The define_me_in_arch() gets a __fentry, but because it is overwritten, it > > becomes attached to the end of do_not_watch_me(). When ftrace asks kallsyms > > for the symbol name of that __fentry location, it returns > > do_not_watch_me(), and ftrace will simply list that. > > Which is a kallsym bug that does not exists in the ELF space. > > There is a patch set that tries to fix this here: > > > https://lore.kernel.org/all/20240723063258.2240610-1-zhengyej...@huaweicloud.com/T/#u > > Instead of adding size to kallsyms it adds extra symbols to denote the > holes, which ends up being similar. Yeah, I was looking at this patch when I thought the compiler could do it for us. Especially since it now creates the __mcount_locs too. > > > But I was thinking, since gcc (and perhaps clang?) now has: > > > > -mrecord-mcount > > > > that creates the __mcount_loc section that ftrace uses to find where the > > __fentry's are. Perhaps the compiler can simply remove any of those > > __fentry's that happen to exist in a weak function that was overridden. > > > > Would this be something that the compiler could do? > > Linker, this is link time. The linker would not only have to drop the > (weak) symbol from the symbol table (which is all it really does), but > it would now have to go and muck about with other sections. Shucks, I was hoping that the linker would have enough info to know what was inside the locations of those symbols. Oh well. > > It would have to go re-write the __mcount_loc section and drop entrie(s) > that were inside the dropped symbol. But then, what about other extra > sections? For instance, we can have (and still patch) alternatives in > the 'dead' weak text. > > Where does this stop? > > LTO can do the right thing and completely eliminate the weak function, > but anything short of that is a really big ask. Esp. since this is a > problem of our own making -- kallsyms is a flawed representation of the > symbol table. Yeah, if it's too much of an ask, then forget it. But I decided to ask to find out how much of an ask it is ;-) -- Steve