Hi,

On Mon, Mar 23, 2026 at 02:06:43PM +0100, Petr Pavlu wrote:
> On 3/17/26 12:04 PM, Stanislaw Gruszka wrote:
> > Module symbol lookup via find_kallsyms_symbol() performs a linear scan
> > over the entire symtab when resolving an address. The number of symbols
> > in module symtabs has grown over the years, largely due to additional
> > metadata in non-standard sections, making this lookup very slow.
> >
> > Improve this by separating function symbols during module load, placing
> > them at the beginning of the symtab, sorting them by address, and using
> > binary search when resolving addresses in module text.
>
> Doesn't considering only function symbols break the expected behavior
> with CONFIG_KALLSYMS_ALL=y? For instance, when using kdb, is it still
> able to see all symbols in a module? The module loader should remain
> consistent with the main kallsyms code regarding which symbols can be
> looked up.

We already have a CONFIG_KALLSYMS_ALL=y inconsistency between kernel and
module symbol lookup, independent of this patch. find_kallsyms_symbol()
restricts the search to MOD_TEXT (or MOD_INIT_TEXT) address ranges, so
it cannot resolve data or rodata symbols.

This appears to be acceptable in practice, as most kallsyms_lookup()
users are interested in function symbols. Users relying on
CONFIG_KALLSYMS_ALL=y seem to use name-based lookups or iterate over
the full symtab.

kdb looks like the exception: it can resolve data symbols by address
in the kernel, but not in modules. However, I think resolving symbols
by name is more common for kdb.

To make the behavior consistent, we could either:

 - extend find_kallsyms_symbol() to cover data/rodata symbols
   (for CONFIG_KALLSYMS_ALL), or

 - restrict kallsyms_lookup() to text symbols and introduce a separate
   API for data symbol lookup for users that really need it.

I think the second option is better, as some (maybe most) users are not
interested in all symbols, even if CONFIG_KALLSYMS_ALL is set. However,
either would require substantial rework and is outside the scope of
this patch.

Regards
Stanislaw

> > This also should improve times for linear symbol name lookups, as valid
> > function symbols are now located at the beginning of the symtab.
> >
> > The cost of sorting is small relative to module load time. In repeated
> > module load tests [1], depending on .config options, this change
> > increases load time between 2% and 4%. With cold caches, the difference
> > is not measurable, as memory access latency dominates.
> >
> > The sorting theoretically could be done at compile time, but it would
> > be much more complicated, as we would have to simulate kernel address
> > resolution for symbols, and then correct relocation entries. That would
> > be risky if they get out of sync.
> >
> > The improvement can be observed when listing ftrace filter functions:
> >
> > root@nano:~# time cat /sys/kernel/tracing/available_filter_functions | wc -l
> > 74908
> >
> > real 0m1.315s
> > user 0m0.000s
> > sys 0m1.312s
> >
> > After:
> >
> > root@nano:~# time cat /sys/kernel/tracing/available_filter_functions | wc -l
> > 74911
> >
> > real 0m0.167s
> > user 0m0.004s
> > sys 0m0.175s
> >
> > (there are three more symbols introduced by the patch)
>
> This looks like a reasonable improvement.
>
> >
> > For livepatch modules, the symtab layout is preserved and the existing
> > linear search is used. For this case, it should be possible to keep
> > the original ELF symtab instead of copying it 1:1, but that is outside
> > the scope of this patch.
>
> Livepatch modules are already handled specially by the kallsyms module
> code so excluding them from this optimization is probably ok.
>
> However, it might be worth revisiting this exception. I believe that
> livepatch support requires the original symbol table for relocations to
> remain usable. It might make sense to investigate whether updating the
> relocation data with the adjusted symbol indexes would be sensible.
>
> --
> Thanks,
> Petr
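
For readers following the thread, the approach described in the quoted
patch text (function symbols moved to the front of the symtab, sorted by
address, then binary search on lookup) can be sketched in user-space C
as below. This is only an illustration under stated assumptions, not the
patch's code: struct sketch_sym, sketch_prepare() and sketch_lookup()
are hypothetical names, the symbol record is a simplified stand-in for
an ELF symtab entry, and details such as the reserved symbol at index 0,
init text, and relocation index fixups are ignored.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an ELF symtab entry. */
struct sketch_sym {
	uintptr_t addr;		/* symbol address */
	int is_func;		/* non-zero for function symbols */
	const char *name;
};

/* qsort() comparator: ascending by address. */
static int sketch_cmp_addr(const void *a, const void *b)
{
	const struct sketch_sym *x = a, *y = b;

	return (x->addr > y->addr) - (x->addr < y->addr);
}

/*
 * Move function symbols to the front of the table and sort that prefix
 * by address. Returns the number of function symbols.
 */
static size_t sketch_prepare(struct sketch_sym *syms, size_t n)
{
	size_t nfunc = 0;

	for (size_t i = 0; i < n; i++) {
		if (syms[i].is_func) {
			struct sketch_sym tmp = syms[nfunc];

			syms[nfunc] = syms[i];
			syms[i] = tmp;
			nfunc++;
		}
	}
	qsort(syms, nfunc, sizeof(*syms), sketch_cmp_addr);
	return nfunc;
}

/*
 * Binary search over the sorted function prefix: return the last symbol
 * whose address is <= addr, or NULL if addr is below the first one.
 */
static const struct sketch_sym *
sketch_lookup(const struct sketch_sym *syms, size_t nfunc, uintptr_t addr)
{
	size_t lo = 0, hi = nfunc;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (syms[mid].addr <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo ? &syms[lo - 1] : NULL;
}
```

Data symbols stay in the unsorted tail of the table, so an
address-based lookup in this sketch never returns them. That is the
same text-only limitation discussed above for find_kallsyms_symbol().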
