On Mon, 21 Jul 2025 11:20:34 -0400 Mathieu Desnoyers <mathieu.desnoy...@efficios.com> wrote:
> Hi!
>
> I've written up an RFC for a new system call to handle sframe registration
> for shared libraries. There has been interest to cover both sframe in
> the short term, but also JIT use-cases in the long term, so I'm
> covering both here in this RFC to provide the full context. Implementation
> wise we could start by only covering the sframe use-case.
>
> I've called it "codectl(2)" for now, but I'm of course open to feedback.

Hmm, I guess I'm OK with that name. I can't really think of anything that
would be better. But kernel developers are notorious for sucking at coming
up with decent names ;-)

>
> For ELF, I'm including the optional pathname, build id, and debug link
> information which are really useful to translate from instruction pointers
> to executable/library name, symbol, offset, source file, line number.
> This is what we are using in LTTng-UST and Babeltrace debug-info filter
> plugin [1], and I think this would be relevant for kernel tracers as well
> so they can make the resulting stack traces meaningful to users.

Honestly, I'm not sure it needs to be an ELF file. Just a file that has an
sframe section in it.

>
> sys_codectl(2)
> =================
>
> * arg0: unsigned int @option:
>
> /* Additional labels can be added to enum code_opt, for extensibility. */
>
> enum code_opt {
>     CODE_REGISTER_ELF,

Perhaps the above should be: CODE_REGISTER_SFRAME,

as currently SFrame is read only via files.

>     CODE_REGISTER_JIT,

From our other conversations, JIT will likely be a completely different
format than SFRAME, so calling it just JIT should be fine.

>     CODE_UNREGISTER,

I wonder if this should be the first enum. That is, "0" is to unregister.

That way, all non-zero options will be for what is being registered, and
"0" is for unregistering any of them.

> };
>
> * arg1: void * @info
>
> /* if (@option == CODE_REGISTER_ELF) */
>
> /*
>  * text_start, text_end, sframe_start, sframe_end allow unwinding of the
>  * call stack.
>  *
>  * elf_start, elf_end, pathname, and either build_id or debug_link allows
>  * mapping instruction pointers to file, symbol, offset, and source file
>  * location.
>  */
> struct code_elf_info {
> :   __u64 elf_start;
>     __u64 elf_end;

Perhaps:

		__u64 file_start;
		__u64 file_end;

?

And call it "struct code_sframe_info"

>     __u64 text_start;
>     __u64 text_end;
>     __u64 sframe_start;
>     __u64 sframe_end;

What is the above "sframe" for?

>     __u64 pathname;              /* char *, NULL if unavailable. */
>
>     __u64 build_id;              /* char *, NULL if unavailable. */
>     __u64 debug_link_pathname;   /* char *, NULL if unavailable. */

Maybe just list the above three as "optional" ? It may be available, but
the implementer just doesn't want to implement it.

>     __u32 build_id_len;
>     __u32 debug_link_crc;
> };
>
>
> /* if (@option == CODE_REGISTER_JIT) */
>
> /*
>  * Registration of sorted JIT unwind table: The reserved memory area is
>  * of size reserved_len. Userspace increases used_len as new code is
>  * populated between text_start and text_end. This area is populated in
>  * increasing address order, and its ABI requires to have no overlapping
>  * fre. This fits the common use-case where JITs populate code into
>  * a given memory area by increasing address order. The sorted unwind
>  * tables can be chained with a singly-linked list as they become full.
>  * Consecutive chained tables are also in sorted text address order.
>  *
>  * Note: if there is an eventual use-case for unsorted jit unwind table,
>  * this would be introduced as a new "code option".
>  */
>
> struct code_jit_info {
>     __u64 text_start;      /* text_start >= addr */
>     __u64 text_end;        /* addr < text_end */
>     __u64 unwind_head;     /* struct code_jit_unwind_table * */
> };
>
> struct code_jit_unwind_fre {
>     /*
>      * Contains info similar to sframe, allowing unwind for a given
>      * code address range.
>      */
>     __u32 size;
>     __u32 ip_off;  /* offset from text_start */
>     __s32 cfa_off;
>     __s32 ra_off;
>     __s32 fp_off;
>     __u8 info;
> };
>
> struct code_jit_unwind_table {
>     __u64 reserved_len;
>     __u64 used_len; /*
>                      * Incremented by userspace (store-release), read by
>                      * the kernel (load-acquire).
>                      */
>     __u64 next;     /* Chain with next struct code_jit_unwind_table. */
>     struct code_jit_unwind_fre fre[];
> };

I wonder if we should avoid the "jit" portion completely for now until we
know what exactly we need.

Thanks,

-- Steve

>
> /* if (@option == CODE_UNREGISTER) */
>
> void *info
>
> * arg2: size_t info_size
>
> /*
>  * Size of @info structure, allowing extensibility. See
>  * copy_struct_from_user().
>  */
>
> * arg3: unsigned int flags (0)
>
> /* Flags for extensibility. */
>
> Your feedback is welcome,
>
> Thanks,
>
> Mathieu
>
> [1]
> https://babeltrace.org/docs/v2.0/man7/babeltrace2-filter.lttng-utils.debug-info.7/
>
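
For illustration only, here is a rough sketch of how the RFC's definitions
might look with the renames and reordering suggested in the replies applied:
CODE_UNREGISTER as 0, CODE_REGISTER_SFRAME instead of CODE_REGISTER_ELF, and
file_start/file_end in a "struct code_sframe_info". Nothing here is a
settled ABI. The fixed-width stdint types stand in for __u64/__u32 so the
sketch builds in plain userspace, and code_opt_is_register() is a
hypothetical helper that exists only to show what the "0 means unregister"
convention buys.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sketch: unregister is 0, every non-zero option registers something. */
enum code_opt {
	CODE_UNREGISTER = 0,
	CODE_REGISTER_SFRAME,
	CODE_REGISTER_JIT,
};

/* Sketch of the renamed struct; field order follows the RFC's layout. */
struct code_sframe_info {
	uint64_t file_start;
	uint64_t file_end;
	uint64_t text_start;
	uint64_t text_end;
	uint64_t sframe_start;
	uint64_t sframe_end;
	uint64_t pathname;		/* char *, optional */
	uint64_t build_id;		/* char *, optional */
	uint64_t debug_link_pathname;	/* char *, optional */
	uint32_t build_id_len;
	uint32_t debug_link_crc;
};

/* Nine 64-bit fields plus two 32-bit fields: no implicit padding. */
static_assert(sizeof(struct code_sframe_info) == 80,
	      "unexpected padding in struct code_sframe_info");

/* Hypothetical helper: true for any option that registers code. */
static inline bool code_opt_is_register(enum code_opt opt)
{
	return opt != CODE_UNREGISTER;
}
```

With this ordering, a later CODE_REGISTER_* label added to the enum would be
covered by the non-zero check without touching the unregister path.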