Re: [Dwarf-discuss] Proposal to encode Changed Function Signatures in dwarf

Y Song via Dwarf-discuss Thu, 04 Dec 2025 20:51:35 -0800

On Thu, Dec 4, 2025 at 8:59 AM Jakub Jelinek <[email protected]> wrote:
>
> On Thu, Dec 04, 2025 at 08:37:19AM -0800, Y Song via Dwarf-discuss wrote:
> > Motivation
> > ==========
> >
> > My particular use case is for bpf-based linux kernel
> > tracing. When tracing a kernel function, the user would like
> > to know the actual signature. This is critical.
> >
> > For example, if the actual signature is
> >   static int foo(int a, int c) { ... }
> > and the source signature is
> >   static int foo(int a, int b, int c) { ... }
>
> That is pretty normal function cloning.
> __attribute__((noinline)) static int foo (int a, int b, int c) { return a + c; }
> int bar (int a, int c)
> {
>   return foo (a, 0, c) + foo (1, 1, 2) + foo (2, 2, 3) + foo (a, 3, 4) + foo (5, 4, c);
> }
>
> You can just normally emit
> DW_TAG_subprogram
>   DW_AT_name "foo"
>   DW_AT_inline 1
> ...
>   DW_TAG_formal_parameter
>     DW_AT_name "a"
> ...
>   DW_TAG_formal_parameter
>     DW_AT_name "b"
> ...
>   DW_TAG_formal_parameter
>     DW_AT_name "c"
> for the original user function (if it isn't emitted in that shape,
> without DW_AT_low_pc/DW_AT_high_pc/DW_AT_ranges etc.
> Then
> DW_TAG_subprogram
>   DW_AT_abstract_origin <above foo DW_TAG_subprogram>
>   DW_AT_low_pc ...
>   DW_AT_high_pc ...
> ...
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above a DW_TAG_formal_parameter>
>     DW_AT_location ...
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above b DW_TAG_formal_parameter>
>     DW_AT_location DW_OP_GNU_parameter_ref <reference to DW_TAG_call_site_parameter>
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above c DW_TAG_formal_parameter>
>     DW_AT_location ...
>
> DW_OP_GNU_parameter_ref is an extension, see 
> https://dwarfstd.org/issues/230109.1.html
> In any case, I don't see why you need something like
> DW_TAG_inlined_subroutine at DW_TAG_compile_unit scope, you can't inline a
> function into a translation unit.


In https://github.com/llvm/llvm-project/pull/157349, we tried two
different approaches to encode the new true signatures while
preserving the original signatures.

Method 1:
  See https://github.com/llvm/llvm-project/pull/157349#issuecomment-332994687
  and subsequent comments including below
  https://github.com/llvm/llvm-project/pull/157349#issuecomment-3341626819

  one example:

   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000000000)
                DW_AT_high_pc    (0x0000000000000006)
                DW_AT_frame_base    (DW_OP_reg7 RSP)
                DW_AT_call_all_calls    (true)
                DW_AT_name    ("mul")
                ...

              DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_reg5 RDI)
                  DW_AT_name    ("num1")
                  DW_AT_decl_file    ("/app/example.cpp")
                  DW_AT_decl_line    (...)
                  DW_AT_type    (0x0000003d "int")

              DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin    (LOC_A "mul")
                  DW_AT_low_pc    (...)
                  DW_AT_high_pc    (...)
                  DW_AT_call_file    ("/app/example.cpp")
                  DW_AT_call_line    (...)
                  DW_AT_call_column    (...)

                    DW_TAG_formal_parameter
                            DW_AT_location    (DW_OP_reg5 RDI)
                            DW_AT_abstract_origin    (LOC_B "num1")
                    DW_TAG_formal_parameter
                            DW_AT_location    (DW_OP_reg5 RDI)
                            DW_AT_abstract_origin    (LOC_C "num2")
             NULL
NULL

See the above. The parameters directly under DW_TAG_subprogram present the
true signature, while the DW_TAG_inlined_subroutine represents the original
signature.

This seems to work, but lldb does not handle it well.
See https://github.com/llvm/llvm-project/pull/157349#issuecomment-3412590751
An excerpt from an lldb session:

(lldb) bt
* thread #1, name = 'ex2', stop reason = step in
  * frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18 [inlined]
    frame #1: 0x0000555555555161 ex2`inc at ex2.c:0
    frame #2: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
    frame #3: 0x0000555555555140 ex2`main at ex2.c:17:12
    frame #4: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #5: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #6: 0x0000555555555075 ex2`_start + 37

This is not good, since frame #0 is now marked as 'inlined', which
users do not expect.
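
To make the intended reading of Method 1 concrete, here is a rough sketch
of how a consumer such as a tracer could pull both signatures out of that
layout. It is only a toy model: the `struct die` type, the `TAG_*` enum and
the helper names are hypothetical and are not a libdwarf or LLVM API. The
sample tree mirrors the "mul" example above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified DIE model; not a real DWARF library API. */
enum die_tag { TAG_SUBPROGRAM, TAG_FORMAL_PARAMETER, TAG_INLINED_SUBROUTINE };

struct die {
    enum die_tag tag;
    const char *name;           /* DW_AT_name, or the name behind DW_AT_abstract_origin */
    const struct die *children; /* array of direct child DIEs */
    size_t nchildren;
};

/* Count the DW_TAG_formal_parameter DIEs that are direct children of parent. */
static size_t count_params(const struct die *parent)
{
    size_t n = 0;
    for (size_t i = 0; i < parent->nchildren; i++)
        if (parent->children[i].tag == TAG_FORMAL_PARAMETER)
            n++;
    return n;
}

/* Return the first DW_TAG_inlined_subroutine child of sp, or NULL if none. */
static const struct die *find_inlined(const struct die *sp)
{
    for (size_t i = 0; i < sp->nchildren; i++)
        if (sp->children[i].tag == TAG_INLINED_SUBROUTINE)
            return &sp->children[i];
    return NULL;
}

/* Original signature: mul(num1, num2), kept under the inlined subroutine. */
static const struct die original_params[] = {
    { TAG_FORMAL_PARAMETER, "num1", NULL, 0 },
    { TAG_FORMAL_PARAMETER, "num2", NULL, 0 },
};

/* True signature: only num1 is a direct child of the subprogram. */
static const struct die mul_children[] = {
    { TAG_FORMAL_PARAMETER, "num1", NULL, 0 },
    { TAG_INLINED_SUBROUTINE, "mul", original_params, 2 },
};

static const struct die mul = { TAG_SUBPROGRAM, "mul", mul_children, 2 };

/* Self-check: one true parameter, two original parameters. */
static void check_method1_model(void)
{
    const struct die *orig = find_inlined(&mul);
    assert(count_params(&mul) == 1);
    assert(orig != NULL && count_params(orig) == 2);
}
```

A tracer would use the direct children for the true signature and fall
back to them alone when no DW_TAG_inlined_subroutine child is present.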

Method 2:

Later on I tried another approach: directly encode the true signature in
the DISubprogram, but add a declaration to link to the abstract origin.
See https://github.com/llvm/llvm-project/pull/157349#issuecomment-3420455765
and the following comments on GitHub.

The eventual DWARF, see
https://github.com/llvm/llvm-project/pull/157349#issuecomment-3422954925

0x0000009e:   DW_TAG_subprogram
                DW_AT_name      ("foo")
                DW_AT_decl_file ("/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name.cc")
                DW_AT_decl_line (3)
                DW_AT_type      (0x000000b1 "int")
                DW_AT_declaration       (true)
                DW_AT_external  (true)

0x000000a6:     DW_TAG_formal_parameter
                  DW_AT_type    (0x000000b1 "int")

0x000000ab:     DW_TAG_formal_parameter
                  DW_AT_type    (0x000000b1 "int")

0x000000b0:     NULL

0x000000b1:   DW_TAG_base_type
                DW_AT_name      ("int")
                DW_AT_encoding  (DW_ATE_signed)
                DW_AT_byte_size (0x04)

0x000000b5:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000001160)
                DW_AT_high_pc   (0x000000000000117f)
                DW_AT_frame_base        (DW_OP_reg7 RSP)
                DW_AT_call_all_calls    (true)
                DW_AT_linkage_name      ("_ZL3fooii")
                DW_AT_specification     (0x0000009e "foo")

0x000000c2:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x2) loclist = 0x0000003d:
                     [0x0000000000001160, 0x0000000000001163): DW_OP_reg5 RDI
                     [0x0000000000001163, 0x000000000000117f): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)
                  DW_AT_name    ("x")
                  DW_AT_decl_file       ("/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name.cc")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000b1 "int")

0x000000cb:     DW_TAG_variable
                  DW_AT_location        (DW_OP_fbreg +4)
                  DW_AT_name    ("t")
                  DW_AT_decl_file       ("/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name.cc")
                  DW_AT_decl_line       (4)
                  DW_AT_type    (0x00000122 "volatile int")

0x000000d6:     DW_TAG_call_site
                  DW_AT_call_origin     (0x00000108 "printf")
                  DW_AT_call_return_pc  (0x0000000000001179)

0x000000dc:     NULL

In the above, the true signature is encoded directly in the DISubprogram.
I guess DWARF people may not accept it, since they may want to keep the
original parameters for debugger purposes (e.g. for use by
DW_OP_entry_value).

So it looks like neither approach will work. The DWARF format
recommended above was:

> DW_TAG_subprogram
>   DW_AT_name "foo"
>   DW_AT_inline 1
> ...
>   DW_TAG_formal_parameter
>     DW_AT_name "a"
> ...
>   DW_TAG_formal_parameter
>     DW_AT_name "b"
> ...
>   DW_TAG_formal_parameter
>     DW_AT_name "c"
> for the original user function (if it isn't emitted in that shape,
> without DW_AT_low_pc/DW_AT_high_pc/DW_AT_ranges etc.
> Then
> DW_TAG_subprogram
>   DW_AT_abstract_origin <above foo DW_TAG_subprogram>
>   DW_AT_low_pc ...
>   DW_AT_high_pc ...
> ...
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above a DW_TAG_formal_parameter>
>     DW_AT_location ...
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above b DW_TAG_formal_parameter>
>     DW_AT_location DW_OP_GNU_parameter_ref <reference to DW_TAG_call_site_parameter>
>   DW_TAG_formal_parameter
>     DW_AT_abstract_origin <above c DW_TAG_formal_parameter>
>     DW_AT_location ...

In the above, all source-level formal parameters are presented under the
concrete DW_TAG_subprogram, so there is nowhere left to encode the true
signature. Also, the type of a parameter in the true signature may differ
from the source-level one. I am not sure how to represent either of these.
Any suggestions?
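
To illustrate the two kinds of difference I mean, here is a small sketch in
C. The `*_src` functions are what the user wrote; the `*_true` functions are
hand-written stand-ins for what an optimizer might emit after dropping an
unused argument or promoting a pointer argument to scalars. All names are
hypothetical, and real compiler output will vary with the compiler and flags.

```c
#include <assert.h>

struct pair { int lo; int hi; };

/* Source-level signatures, as written by the user. */
static int foo_src(int a, int b, int c) { (void)b; return a + c; }
static int sum_src(const struct pair *p) { return p->lo + p->hi; }

/* Hand-written stand-ins for possible optimized clones (assumption):
 * foo loses its unused parameter b; sum takes the two fields by value
 * instead of a pointer, so the parameter types differ from the source. */
static int foo_true(int a, int c) { return a + c; }
static int sum_true(int lo, int hi) { return lo + hi; }

/* The clones must behave like the originals for every caller. */
static void check_same_behavior(void)
{
    struct pair p = { 3, 4 };
    assert(foo_src(1, 99, 2) == foo_true(1, 2));
    assert(sum_src(&p) == sum_true(p.lo, p.hi));
}
```

A tracer attaching to the second pair needs to know that the registers hold
two `int` values rather than a `const struct pair *`, which is the part the
recommended format does not seem to express.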
-- 
Dwarf-discuss mailing list
[email protected]
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss
