Hi Sasha,
On Tue, 3 Mar 2026 at 19:22, Sasha Levin <[email protected]> wrote:
> Replace the flat uncompressed parallel arrays (lineinfo_addrs[],
> lineinfo_file_ids[], lineinfo_lines[]) with a block-indexed,
> delta-encoded, ULEB128 varint compressed format.
>
> The sorted address array has small deltas between consecutive entries
> (typically 1-50 bytes), file IDs have high locality (delta often 0,
> same file), and line numbers change slowly. Delta-encoding followed
> by ULEB128 varint compression shrinks most values from 4 bytes to 1.
>
> Entries are grouped into blocks of 64. A small uncompressed block
> index (first addr + byte offset per block) enables O(log(N/64)) binary
> search, followed by sequential decode of at most 64 varints within the
> matching block. All decode state lives on the stack -- zero
> allocations, still safe for NMI/panic context.
>
> Measured on a defconfig+debug x86_64 build (3,017,154 entries, 4,822
> source files, 47,144 blocks):
>
> Before (flat arrays):
>   lineinfo_addrs[]          12,068,616 bytes (u32 x 3.0M)
>   lineinfo_file_ids[]        6,034,308 bytes (u16 x 3.0M)
>   lineinfo_lines[]          12,068,616 bytes (u32 x 3.0M)
>   Total:                    30,171,540 bytes (28.8 MiB, 10.0 bytes/entry)
>
> After (block-indexed delta + ULEB128):
>   lineinfo_block_addrs[]       188,576 bytes (184 KiB)
>   lineinfo_block_offsets[]     188,576 bytes (184 KiB)
>   lineinfo_data[]           10,926,128 bytes (10.4 MiB)
>   Total:                    11,303,280 bytes (10.8 MiB, 3.7 bytes/entry)
>
> Savings: 18.0 MiB (2.7x reduction)
>
> Booted in QEMU and verified with SysRq-l that annotations still work:
>
> default_idle+0x9/0x10 (arch/x86/kernel/process.c:767)
> default_idle_call+0x6c/0xb0 (kernel/sched/idle.c:122)
> do_idle+0x335/0x490 (kernel/sched/idle.c:191)
> cpu_startup_entry+0x4e/0x60 (kernel/sched/idle.c:429)
> rest_init+0x1aa/0x1b0 (init/main.c:760)
>
> Suggested-by: Juergen Gross <[email protected]>
> Assisted-by: Claude:claude-opus-4-6
> Signed-off-by: Sasha Levin <[email protected]>
Thanks for your patch!
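
For readers following along, the lookup described in the commit message (binary search over the block index, then a sequential decode inside one block) could be sketched roughly as below. This is a simplified illustration, not the code from the patch: the struct, the function names, and the layout are assumptions, and the table here carries address deltas only (no file IDs or line numbers), with the first address of each block stored in the index rather than in the data stream.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_ENTRIES 64	/* entries per block, per the description above */

/* Hypothetical, simplified table: address deltas only. */
struct lineinfo_sketch {
	const uint32_t *block_addrs;	/* first address of each block */
	const uint32_t *block_offsets;	/* byte offset of each block in data[] */
	uint32_t num_blocks;
	uint32_t num_entries;
	const uint8_t *data;		/* ULEB128 address deltas */
	uint32_t data_size;
};

/* Minimal bounded ULEB128 reader for the sketch. */
static uint32_t read_uleb128(const uint8_t *data, uint32_t *pos, uint32_t end)
{
	uint32_t result = 0;
	unsigned int shift = 0;

	while (*pos < end && shift < 32) {
		uint8_t byte = data[(*pos)++];

		result |= (uint32_t)(byte & 0x7f) << shift;
		if (!(byte & 0x80))
			break;
		shift += 7;
	}
	return result;
}

/*
 * Return the index of the last entry whose address is <= addr,
 * or -1 if addr precedes the first entry.
 */
static long lineinfo_sketch_find(const struct lineinfo_sketch *t, uint32_t addr)
{
	uint32_t lo = 0, hi = t->num_blocks, pos, end, cur, i, n;
	long best;

	if (!t->num_blocks || addr < t->block_addrs[0])
		return -1;

	/* Binary search for the last block with block_addrs[b] <= addr. */
	while (hi - lo > 1) {
		uint32_t mid = lo + (hi - lo) / 2;

		if (t->block_addrs[mid] <= addr)
			lo = mid;
		else
			hi = mid;
	}

	/* Sequentially decode at most BLOCK_ENTRIES - 1 deltas in that block. */
	pos = t->block_offsets[lo];
	end = (lo + 1 < t->num_blocks) ? t->block_offsets[lo + 1] : t->data_size;
	n = t->num_entries - lo * BLOCK_ENTRIES;
	if (n > BLOCK_ENTRIES)
		n = BLOCK_ENTRIES;

	cur = t->block_addrs[lo];
	best = (long)lo * BLOCK_ENTRIES;
	for (i = 1; i < n && pos < end; i++) {
		cur += read_uleb128(t->data, &pos, end);
		if (cur > addr)
			break;
		best = (long)lo * BLOCK_ENTRIES + i;
	}
	return best;
}
```

All state is held in local variables, which matches the "zero allocations" property claimed in the commit message. The real format also has to carry file ID and line deltas, which this sketch leaves out.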
> --- a/include/linux/mod_lineinfo.h
> +++ b/include/linux/mod_lineinfo.h
> +/*
> + * Read a ULEB128 varint from a byte stream.
> + * Returns the decoded value and advances *pos past the encoded bytes.
> + * If *pos would exceed 'end', returns 0 and sets *pos = end (safe for
> + * NMI/panic context -- no crash, just a missed annotation).
> + */
> +static inline u32 lineinfo_read_uleb128(const u8 *data, u32 *pos, u32 end)
> +{
> +	u32 result = 0;
> +	unsigned int shift = 0;
> +
> +	while (*pos < end) {
> +		u8 byte = data[*pos];
> +		(*pos)++;
> +		result |= (u32)(byte & 0x7f) << shift;
> +		if (!(byte & 0x80))
> +			return result;
> +		shift += 7;
> +		if (shift >= 32) {
> +			/* Malformed -- skip remaining continuation bytes */
> +			while (*pos < end && (data[*pos] & 0x80))
> +				(*pos)++;
> +			if (*pos < end)
> +				(*pos)++;
> +			return result;
> +		}
> +	}
> +	return result;
> +}
FTR, arch/arc/kernel/unwind.c, arch/sh/kernel/dwarf.c, and
tools/perf/util/genelf_debug.c already have (different) LEB128 accessors,
so there is an opportunity for consolidation.
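
To make the consolidation idea a bit more concrete, a shared decoder might look something like the sketch below. The name uleb128_decode_u32() and its signature are purely hypothetical (no such helper exists as far as this mail is concerned), and it is written as plain userspace C so it can be tried in isolation. Unlike the function quoted above, it reports truncated or over-long input to the caller instead of returning a partial value.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical shared helper (name and signature are assumptions, not an
 * existing kernel API). Decodes one ULEB128 value from [p, end).
 * Returns the number of bytes consumed, or 0 on truncated or over-long input.
 */
static size_t uleb128_decode_u32(const uint8_t *p, const uint8_t *end,
				 uint32_t *out)
{
	const uint8_t *start = p;
	uint32_t result = 0;
	unsigned int shift = 0;

	while (p < end) {
		uint8_t byte = *p++;

		/* Low 7 bits carry payload, least significant group first. */
		result |= (uint32_t)(byte & 0x7f) << shift;
		if (!(byte & 0x80)) {
			*out = result;
			return (size_t)(p - start);
		}
		shift += 7;
		if (shift >= 35)
			return 0;	/* more than 5 bytes: not a valid u32 */
	}
	return 0;			/* ran off the end of the buffer */
}
```

With this, the byte sequence e5 8e 26 decodes to 624485 and consumes three bytes. A caller that wants the "missed annotation, no crash" behaviour can treat a return value of 0 as "give up on this lookup".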
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds