Hi Sasha,

On Tue, 3 Mar 2026 at 19:22, Sasha Levin <[email protected]> wrote:
> Replace the flat uncompressed parallel arrays (lineinfo_addrs[],
> lineinfo_file_ids[], lineinfo_lines[]) with a block-indexed,
> delta-encoded, ULEB128 varint compressed format.
>
> The sorted address array has small deltas between consecutive entries
> (typically 1-50 bytes), file IDs have high locality (delta often 0,
> same file), and line numbers change slowly.  Delta-encoding followed
> by ULEB128 varint compression shrinks most values from 4 bytes to 1.
>
> Entries are grouped into blocks of 64.  A small uncompressed block
> index (first addr + byte offset per block) enables O(log(N/64)) binary
> search, followed by sequential decode of at most 64 varints within the
> matching block.  All decode state lives on the stack -- zero
> allocations, still safe for NMI/panic context.
>
> Measured on a defconfig+debug x86_64 build (3,017,154 entries, 4,822
> source files, 47,144 blocks):
>
>   Before (flat arrays):
>     lineinfo_addrs[]    12,068,616 bytes (u32 x 3.0M)
>     lineinfo_file_ids[]  6,034,308 bytes (u16 x 3.0M)
>     lineinfo_lines[]    12,068,616 bytes (u32 x 3.0M)
>     Total:              30,171,540 bytes (28.8 MiB, 10.0 bytes/entry)
>
>   After (block-indexed delta + ULEB128):
>     lineinfo_block_addrs[]    188,576 bytes (184 KiB)
>     lineinfo_block_offsets[]  188,576 bytes (184 KiB)
>     lineinfo_data[]        10,926,128 bytes (10.4 MiB)
>     Total:                 11,303,280 bytes (10.8 MiB, 3.7 bytes/entry)
>
>   Savings: 18.0 MiB (2.7x reduction)
>
> Booted in QEMU and verified with SysRq-l that annotations still work:
>
>   default_idle+0x9/0x10 (arch/x86/kernel/process.c:767)
>   default_idle_call+0x6c/0xb0 (kernel/sched/idle.c:122)
>   do_idle+0x335/0x490 (kernel/sched/idle.c:191)
>   cpu_startup_entry+0x4e/0x60 (kernel/sched/idle.c:429)
>   rest_init+0x1aa/0x1b0 (init/main.c:760)
>
> Suggested-by: Juergen Gross <[email protected]>
> Assisted-by: Claude:claude-opus-4-6
> Signed-off-by: Sasha Levin <[email protected]>

Thanks for your patch!

> --- a/include/linux/mod_lineinfo.h
> +++ b/include/linux/mod_lineinfo.h

> +/*
> + * Read a ULEB128 varint from a byte stream.
> + * Returns the decoded value and advances *pos past the encoded bytes.
> + * If *pos would exceed 'end', returns 0 and sets *pos = end (safe for
> + * NMI/panic context -- no crash, just a missed annotation).
> + */
> +static inline u32 lineinfo_read_uleb128(const u8 *data, u32 *pos, u32 end)
> +{
> +       u32 result = 0;
> +       unsigned int shift = 0;
> +
> +       while (*pos < end) {
> +               u8 byte = data[*pos];
> +               (*pos)++;
> +               result |= (u32)(byte & 0x7f) << shift;
> +               if (!(byte & 0x80))
> +                       return result;
> +               shift += 7;
> +               if (shift >= 32) {
> +                       /* Malformed -- skip remaining continuation bytes */
> +                       while (*pos < end && (data[*pos] & 0x80))
> +                               (*pos)++;
> +                       if (*pos < end)
> +                               (*pos)++;
> +                       return result;
> +               }
> +       }
> +       return result;
> +}

FTR, arch/arc/kernel/unwind.c, arch/sh/kernel/dwarf.c, and
tools/perf/util/genelf_debug.calready have (different) LEB128 accessors,
so there is an opportunity for consolidation.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

Reply via email to