On Thu, Dec 4, 2025 at 8:59 AM Jakub Jelinek <[email protected]> wrote:
>
> On Thu, Dec 04, 2025 at 08:37:19AM -0800, Y Song via Dwarf-discuss wrote:
> > Motivation
> > ==========
> >
> > My particular use case is for bpf-based linux kernel
> > tracing. When tracing a kernel function, the user would like
> > to know the actual signature. This is critical.
> >
> > For example, if the actual signature is
> >   static int foo(int a, int c) { ... }
> > and the source signature is
> >   static int foo(int a, int b, int c) { ... }
>
> That is pretty normal function cloning.
> __attribute__((noinline)) static int foo (int a, int b, int c) { return a + c; }
> int bar (int a, int c)
> {
>   return foo (a, 0, c) + foo (1, 1, 2) + foo (2, 2, 3) + foo (a, 3, 4) + foo (5, 4, c);
> }
>
> You can just normally emit
> DW_TAG_subprogram
>   DW_AT_name "foo"
>   DW_AT_inline 1
>   ...
>   DW_TAG_formal_parameter
>     DW_AT_name "a"
>     ...
>   DW_TAG_formal_parameter
>     DW_AT_name "b"
>     ...
>   DW_TAG_formal_parameter
>     DW_AT_name "c"
> for the original user function (if it isn't emitted in that shape,
> without DW_AT_low_pc/DW_AT_high_pc/DW_AT_ranges etc.
> Then
> DW_TAG_subprogram
>   DW_AT_abstract_origin <above foo DW_TAG_subprogram>
>   DW_AT_low_pc ...
>   DW_AT_high_pc ...
>   ...
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above a DW_TAG_formal_parameter>
>     DW_AT_location ...
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above b DW_TAG_formal_parameter>
>     DW_AT_location DW_OP_GNU_parameter_ref <reference to DW_TAG_call_site_parameter>
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above c DW_TAG_formal_parameter>
>     DW_AT_location ...
>
> DW_OP_GNU_parameter_ref is an extension, see
> https://dwarfstd.org/issues/230109.1.html
> In any case, I don't see why you need something like
> DW_TAG_inlined_subroutine at DW_TAG_compile_unit scope, you can't inline a
> function into a translation unit.

In https://github.com/llvm/llvm-project/pull/157349, we tried two
different approaches to encode the new true signatures while
preserving the original signatures.

Method 1:
  See https://github.com/llvm/llvm-project/pull/157349#issuecomment-332994687
  and the subsequent comments, including
  https://github.com/llvm/llvm-project/pull/157349#issuecomment-3341626819

One example:

  DW_TAG_subprogram
    DW_AT_low_pc (0x0000000000000000)
    DW_AT_high_pc (0x0000000000000006)
    DW_AT_frame_base (DW_OP_reg7 RSP)
    DW_AT_call_all_calls (true)
    DW_AT_name ("mul")
    ...

    DW_TAG_formal_parameter
      DW_AT_location (DW_OP_reg5 RDI)
      DW_AT_name ("num1")
      DW_AT_decl_file ("/app/example.cpp")
      DW_AT_decl_line (...)
      DW_AT_type (0x0000003d "int")

    DW_TAG_inlined_subroutine
      DW_AT_abstract_origin (LOC_A "mul")
      DW_AT_low_pc (...)
      DW_AT_high_pc (...)
      DW_AT_call_file ("/app/example.cpp")
      DW_AT_call_line (...)
      DW_AT_call_column (...)

      DW_TAG_formal_parameter
        DW_AT_location (DW_OP_reg5 RDI)
        DW_AT_abstract_origin (LOC_B "num1")

      DW_TAG_formal_parameter
        DW_AT_location (DW_OP_reg5 RDI)
        DW_AT_abstract_origin (LOC_C "num2")

      NULL

    NULL

In the above, the parameters directly under DW_TAG_subprogram present
the true signature, while the DW_TAG_inlined_subroutine represents the
original signature. This seems to work, but lldb does not like it.
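
To make the Method 1 layout concrete, here is a minimal sketch of how a
consumer could read it, assuming a simplified in-memory DIE model. The
struct, enum, and function names (die, direct_params, nested_inlined)
are hypothetical and are not libdwarf, lldb, or LLVM APIs. Direct
DW_TAG_formal_parameter children give the true signature; the children
of the nested DW_TAG_inlined_subroutine give the original one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified DIE model; not a real DWARF library API. */
enum die_tag {
    TAG_SUBPROGRAM,
    TAG_INLINED_SUBROUTINE,
    TAG_FORMAL_PARAMETER
};

struct die {
    enum die_tag tag;
    const char *name;            /* DW_AT_name, or the name reached via
                                    DW_AT_abstract_origin */
    const struct die *children;  /* child DIEs, in order */
    size_t n_children;
};

/* Store the names of the direct DW_TAG_formal_parameter children of
 * `parent` into `out` (at most `cap` entries); return the number stored. */
static size_t direct_params(const struct die *parent,
                            const char **out, size_t cap)
{
    size_t n = 0;
    assert(parent != NULL);
    for (size_t i = 0; i < parent->n_children; i++) {
        const struct die *c = &parent->children[i];
        if (c->tag == TAG_FORMAL_PARAMETER && n < cap)
            out[n++] = c->name;
    }
    return n;
}

/* Return the first nested DW_TAG_inlined_subroutine, or NULL if none. */
static const struct die *nested_inlined(const struct die *subprogram)
{
    assert(subprogram != NULL);
    for (size_t i = 0; i < subprogram->n_children; i++)
        if (subprogram->children[i].tag == TAG_INLINED_SUBROUTINE)
            return &subprogram->children[i];
    return NULL;
}

/* The "mul" example, reduced to tags and names. */
static const struct die mul_original_params[] = {
    { TAG_FORMAL_PARAMETER, "num1", NULL, 0 },
    { TAG_FORMAL_PARAMETER, "num2", NULL, 0 },
};
static const struct die mul_children[] = {
    { TAG_FORMAL_PARAMETER, "num1", NULL, 0 },
    { TAG_INLINED_SUBROUTINE, "mul", mul_original_params, 2 },
};
static const struct die mul = { TAG_SUBPROGRAM, "mul", mul_children, 2 };
```

With this model, direct_params(&mul, ...) stores only num1 (the true
signature), and direct_params(nested_inlined(&mul), ...) stores num1
and num2 (the original signature).
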
See https://github.com/llvm/llvm-project/pull/157349#issuecomment-3412590751

An excerpt from one lldb session:

  (lldb) bt
  * thread #1, name = 'ex2', stop reason = step in
    * frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18 [inlined]
      frame #1: 0x0000555555555161 ex2`inc at ex2.c:0
      frame #2: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
      frame #3: 0x0000555555555140 ex2`main at ex2.c:17:12
      frame #4: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
      frame #5: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
      frame #6: 0x0000555555555075 ex2`_start + 37

This is not good: frame #0 is now marked as 'inlined', which users do
not expect.

Method 2:
  Later on I tried another approach: directly encode the true signature
  in the DISubprogram, but add a declaration to link to the abstract
  origin.
  See https://github.com/llvm/llvm-project/pull/157349#issuecomment-3420455765
  and the following comments on GitHub.

The eventual dwarf (see
https://github.com/llvm/llvm-project/pull/157349#issuecomment-3422954925):

0x0000009e:   DW_TAG_subprogram
                DW_AT_name ("foo")
                DW_AT_decl_file ("/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name.cc")
                DW_AT_decl_line (3)
                DW_AT_type (0x000000b1 "int")
                DW_AT_declaration (true)
                DW_AT_external (true)

0x000000a6:     DW_TAG_formal_parameter
                  DW_AT_type (0x000000b1 "int")

0x000000ab:     DW_TAG_formal_parameter
                  DW_AT_type (0x000000b1 "int")

0x000000b0:     NULL

0x000000b1:   DW_TAG_base_type
                DW_AT_name ("int")
                DW_AT_encoding (DW_ATE_signed)
                DW_AT_byte_size (0x04)

0x000000b5:   DW_TAG_subprogram
                DW_AT_low_pc (0x0000000000001160)
                DW_AT_high_pc (0x000000000000117f)
                DW_AT_frame_base (DW_OP_reg7 RSP)
                DW_AT_call_all_calls (true)
                DW_AT_linkage_name ("_ZL3fooii")
                DW_AT_specification (0x0000009e "foo")

0x000000c2:     DW_TAG_formal_parameter
                  DW_AT_location (indexed (0x2) loclist = 0x0000003d:
                     [0x0000000000001160, 0x0000000000001163): DW_OP_reg5 RDI
                     [0x0000000000001163, 0x000000000000117f): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)
                  DW_AT_name ("x")
                  DW_AT_decl_file ("/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name.cc")
                  DW_AT_decl_line (3)
                  DW_AT_type (0x000000b1 "int")

0x000000cb:     DW_TAG_variable
                  DW_AT_location (DW_OP_fbreg +4)
                  DW_AT_name ("t")
                  DW_AT_decl_file ("/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name.cc")
                  DW_AT_decl_line (4)
                  DW_AT_type (0x00000122 "volatile int")

0x000000d6:     DW_TAG_call_site
                  DW_AT_call_origin (0x00000108 "printf")
                  DW_AT_call_return_pc (0x0000000000001179)

0x000000dc:     NULL

In the above, the true signature is directly encoded in the
DISubprogram. I guess dwarf people may not accept it, since they may
want to keep the original parameters for debugger purposes (e.g. for
use by DW_OP_entry_value).

So it looks like neither approach will work.

The dwarf format recommended above:

> DW_TAG_subprogram
>   DW_AT_name "foo"
>   DW_AT_inline 1
>   ...
>   DW_TAG_formal_parameter
>     DW_AT_name "a"
>     ...
>   DW_TAG_formal_parameter
>     DW_AT_name "b"
>     ...
>   DW_TAG_formal_parameter
>     DW_AT_name "c"
> for the original user function (if it isn't emitted in that shape,
> without DW_AT_low_pc/DW_AT_high_pc/DW_AT_ranges etc.
> Then
> DW_TAG_subprogram
>   DW_AT_abstract_origin <above foo DW_TAG_subprogram>
>   DW_AT_low_pc ...
>   DW_AT_high_pc ...
>   ...
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above a DW_TAG_formal_parameter>
>     DW_AT_location ...
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above b DW_TAG_formal_parameter>
>     DW_AT_location DW_OP_GNU_parameter_ref <reference to DW_TAG_call_site_parameter>
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above c DW_TAG_formal_parameter>
>     DW_AT_location ...

In the above, all source-level formal parameters are present under the
concrete DW_TAG_subprogram, so there is nowhere to encode the true
signature. Also, the type of a parameter in the true signature may
differ from the source-level one. I am not sure how to represent that.
Any suggestions?

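For reference, a minimal sketch of the only inference a consumer seems
able to make from the recommended layout. It rests on an assumption
that the thread does not confirm: that a formal parameter whose
location is DW_OP_GNU_parameter_ref was dropped from the emitted
function. The names (param, loc_kind, actual_params) are hypothetical
and not part of any DWARF library. The sketch only filters parameter
names; it cannot express a parameter whose type changed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one concrete DW_TAG_formal_parameter. */
enum loc_kind {
    LOC_NORMAL,            /* register, fbreg, location list, ... */
    LOC_GNU_PARAMETER_REF  /* DW_OP_GNU_parameter_ref: the value is
                              described via the call site instead */
};

struct param {
    const char *name;      /* reached via DW_AT_abstract_origin */
    enum loc_kind loc;
};

/* Assumption (not confirmed by the thread): a parameter located through
 * DW_OP_GNU_parameter_ref is not passed to the emitted function.
 * Store the names of the remaining parameters into `out` (at most `cap`
 * entries) and return the number stored. */
static size_t actual_params(const struct param *params, size_t n,
                            const char **out, size_t cap)
{
    size_t kept = 0;
    assert(params != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (params[i].loc != LOC_GNU_PARAMETER_REF && kept < cap)
            out[kept++] = params[i].name;
    }
    return kept;
}

/* The foo(a, b, c) example: b is the parameter that was removed. */
static const struct param foo_params[] = {
    { "a", LOC_NORMAL },
    { "b", LOC_GNU_PARAMETER_REF },
    { "c", LOC_NORMAL },
};
```

For the foo example this keeps a and c, matching the actual signature
foo(int a, int c), but only because no parameter type changed.
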
-- 
Dwarf-discuss mailing list
[email protected]
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss
