Re: [Dwarf-Discuss] lambda (& other anonymous type) identification/naming

2023-02-28 Thread David Blaikie via Dwarf-Discuss
Hmm - I guess one complication of only putting the mangling number on
the type, is that you need the scope of the lambda too... which is
tricky in this case:

extern int i;
int i = []{ return 3; }();

In this case, the lambda is mangled in the scope of the global
variable `i`: i::{lambda()#1}::operator()() const
(https://godbolt.org/z/15Eqa8ajT)

Oh, and I guess you can use a lambda without ever instantiating its
operator(), and for a generic lambda there's nothing to describe...

eg:
template<typename T>
void f1(const T&){}
inline void f2() {
  f1([](auto){});
}
void f3() {
  f2();
}
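
A minimal sketch of the point above, not taken from the original message: the closure type of a generic lambda is a complete class type that can be passed around even though no specialization of its operator() is ever instantiated, so the debug info has a type to describe but no member function. The helper names `is_class_type`, `closure_is_usable_without_call` and `example_generic_lambda` are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Same shape as f1 in the example above: takes the closure by reference
// and never calls it, so operator() is never instantiated.
template <typename T>
bool is_class_type(const T&) {
    return std::is_class<T>::value;
}

// Hypothetical helper: builds a generic lambda and only inspects its type.
inline bool closure_is_usable_without_call() {
    auto generic = [](auto) {};
    return is_class_type(generic);
}

// Usage: the closure type exists even though it was never invoked.
inline void example_generic_lambda() {
    assert(closure_is_usable_without_call());
}
```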

Clang's DWARF for the anonymous type is:
0x0043: DW_TAG_class_type
  DW_AT_calling_convention  (DW_CC_pass_by_value)
  DW_AT_byte_size   (0x01)
  DW_AT_decl_file   ("/usr/local/google/home/blaikie/dev/scratch/test.cpp")
  DW_AT_decl_line   (4)

GCC's includes a dtor (called "~") but the type just has size,
file, line, and column.

So we could avoid using the whole mangled name of the anonymous type
in some cases - maybe it's worth having features (like being able to
provide the mangling number in an attribute, maybe being able to scope
the type inside a variable DIE? though that sounds a bit frightening)
to help in those cases, even if in some of the worst cases we'd have
to use the mangled name to reassociate anonymous types?
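
As a rough sketch of what "scope plus mangling number" could give a consumer (an assumption for illustration, not an existing DWARF attribute or any tool's API): two DIEs for an anonymous type would be treated as the same entity when both the enclosing scope and the number match, independent of how the file name was spelled. `AnonTypeKey` and `same_anonymous_type` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical consumer-side key for an anonymous type. Neither field is an
// existing DWARF attribute; they stand in for the "enclosing scope" and the
// "mangling number" discussed above.
struct AnonTypeKey {
    std::string scope;         // e.g. "i" for a lambda in the initializer of `i`
    unsigned mangling_number;  // e.g. 1 for {lambda()#1}
};

inline bool operator==(const AnonTypeKey& a, const AnonTypeKey& b) {
    return std::tie(a.scope, a.mangling_number) ==
           std::tie(b.scope, b.mangling_number);
}

// Two DIEs from different CUs describe the same type when their keys match.
inline bool same_anonymous_type(const AnonTypeKey& a, const AnonTypeKey& b) {
    return a == b;
}

// Usage: same scope and number match; a different number does not.
inline void example_anon_type_key() {
    const AnonTypeKey cu1{"i", 1};
    const AnonTypeKey cu2{"i", 1};
    const AnonTypeKey other{"i", 2};
    assert(same_anonymous_type(cu1, cu2));
    assert(!same_anonymous_type(cu1, other));
}
```

The scope is kept as a plain string here only for brevity; the case above, where the scope is a variable rather than a function or class, is exactly what makes representing it in DIEs awkward.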

- Dave

On Mon, Aug 22, 2022 at 12:44 PM David Blaikie  wrote:
>
> Ping - any thoughts here?
>
> On Sun, Jul 24, 2022 at 9:08 PM David Blaikie  wrote:
> >
> > Ping on this thread - would love to hear what ideas folks have for
> > addressing the naming of anonymous types (enums, structs/classes, and
> > lambdas) - especially if it'd make it easier to go back/forth between
> > the DW_AT_name of a template with an unnamed type as a parameter and
> > the actual DIEs describing the same parameter type.
> >
> > On Tue, Jun 14, 2022 at 1:02 PM David Blaikie  wrote:
> > >
> > > Looks like https://reviews.llvm.org/D122766 (-ffile-reproducible) might 
> > > solve my immediate issues in clang, but I think we should still consider 
> > > moving to a more canonical naming of lambdas that, necessarily, doesn't 
> > > include the file name (unfortunately). Probably has to include the lambda 
> > > numbering/something roughly equivalent to the mangled lambda name - it 
> > > could include type information (it'd be superfluous to a unique 
> > > identifier, but I don't think it would break consistently naming the same 
> > > type across CUs either).
> > >
> > > Anyone got ideas/preferences/thoughts on this?
> > >
> > > On Mon, Jan 24, 2022 at 5:51 PM David Blaikie  wrote:
> > >>
> > >> On Mon, Jan 24, 2022 at 5:37 PM Adrian Prantl  wrote:
> > >>>
> > >>>
> > >>>
> > >>> On Jan 23, 2022, at 2:53 PM, David Blaikie  wrote:
> > >>>
> > >>> A rather common "quality of implementation" issue seems to be lambda 
> > >>> naming.
> > >>>
> > >>> I came across this due to non-canonicalization of lambda names in 
> > >>> template parameters depending on how a source file is named in Clang, 
> > >>> and GCC's seem to be very ambiguous:
> > >>>
> > >>> $ cat tmp/lambda.h
> > >>> template<typename T>
> > >>> void f1(T) { }
> > >>> static int i = (f1([]{}), 1);
> > >>> static int j = (f1([]{}), 2);
> > >>> void f1() {
> > >>>   f1([]{});
> > >>>   f1([]{});
> > >>> }
> > >>> $ cat tmp/lambda.cpp
> > >>> #ifdef I_PATH
> > >>> #include 
> > >>> #else
> > >>> #include "lambda.h"
> > >>> #endif
> > >>> $ clang++-tot tmp/lambda.cpp -g -c -I. -DI_PATH && llvm-dwarfdump-tot 
> > >>> lambda.o | grep "f1<"
> > >>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:3:20)>")
> > >>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:4:20)>")
> > >>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:6:6)>")
> > >>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:7:6)>")
> > >>> $ clang++-tot tmp/lambda.cpp -g -c && llvm-dwarfdump-tot lambda.o | 
> > >>> grep "f1<"
> > >>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:3:20)>")
> > >>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:4:20)>")
> > >>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:6:6)>")
> > >>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:7:6)>")
> > >>> $ g++-tot tmp/lambda.cpp -g -c -I. && llvm-dwarfdump-tot lambda.o | 
> > >>> grep "f1<"
> > >>> DW_AT_name  ("f1 >")
> > >>> DW_AT_name  ("f1 >")
> > >>> DW_AT_name  ("f1< >")
> > >>>
> > >>> DW_AT_name  ("f1< >")
> > >>>
> > >>> (I came across this in the context of my simplified template names work 
> > >>> - rebuilding names from the DW_TAG description of the template 
> > >>> parameters - and while I'm not rebuilding names that have lambda 
> > >>> parameters (keep encoding the full string instead). The issue is if 
> > >>> some other type depending on a type with a lambda parameter 

Re: [Dwarf-Discuss] lambda (& other anonymous type) identification/naming

2022-08-22 Thread David Blaikie via Dwarf-Discuss
Ping - any thoughts here?

On Sun, Jul 24, 2022 at 9:08 PM David Blaikie  wrote:
>
> Ping on this thread - would love to hear what ideas folks have for
> addressing the naming of anonymous types (enums, structs/classes, and
> lambdas) - especially if it'd make it easier to go back/forth between
> the DW_AT_name of a template with an unnamed type as a parameter and
> the actual DIEs describing the same parameter type.
>
> On Tue, Jun 14, 2022 at 1:02 PM David Blaikie  wrote:
> >
> > Looks like https://reviews.llvm.org/D122766 (-ffile-reproducible) might 
> > solve my immediate issues in clang, but I think we should still consider 
> > moving to a more canonical naming of lambdas that, necessarily, doesn't 
> > include the file name (unfortunately). Probably has to include the lambda 
> > numbering/something roughly equivalent to the mangled lambda name - it 
> > could include type information (it'd be superfluous to a unique identifier, 
> > but I don't think it would break consistently naming the same type across 
> > CUs either).
> >
> > Anyone got ideas/preferences/thoughts on this?
> >
> > On Mon, Jan 24, 2022 at 5:51 PM David Blaikie  wrote:
> >>
> >> On Mon, Jan 24, 2022 at 5:37 PM Adrian Prantl  wrote:
> >>>
> >>>
> >>>
> >>> On Jan 23, 2022, at 2:53 PM, David Blaikie  wrote:
> >>>
> >>> A rather common "quality of implementation" issue seems to be lambda 
> >>> naming.
> >>>
> >>> I came across this due to non-canonicalization of lambda names in 
> >>> template parameters depending on how a source file is named in Clang, and 
> >>> GCC's seem to be very ambiguous:
> >>>
> >>> $ cat tmp/lambda.h
> >>> template<typename T>
> >>> void f1(T) { }
> >>> static int i = (f1([]{}), 1);
> >>> static int j = (f1([]{}), 2);
> >>> void f1() {
> >>>   f1([]{});
> >>>   f1([]{});
> >>> }
> >>> $ cat tmp/lambda.cpp
> >>> #ifdef I_PATH
> >>> #include 
> >>> #else
> >>> #include "lambda.h"
> >>> #endif
> >>> $ clang++-tot tmp/lambda.cpp -g -c -I. -DI_PATH && llvm-dwarfdump-tot 
> >>> lambda.o | grep "f1<"
> >>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:3:20)>")
> >>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:4:20)>")
> >>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:6:6)>")
> >>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:7:6)>")
> >>> $ clang++-tot tmp/lambda.cpp -g -c && llvm-dwarfdump-tot lambda.o | grep 
> >>> "f1<"
> >>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:3:20)>")
> >>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:4:20)>")
> >>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:6:6)>")
> >>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:7:6)>")
> >>> $ g++-tot tmp/lambda.cpp -g -c -I. && llvm-dwarfdump-tot lambda.o | grep 
> >>> "f1<"
> >>> DW_AT_name  ("f1 >")
> >>> DW_AT_name  ("f1 >")
> >>> DW_AT_name  ("f1< >")
> >>>
> >>> DW_AT_name  ("f1< >")
> >>>
> >>> (I came across this in the context of my simplified template names work - 
> >>> rebuilding names from the DW_TAG description of the template parameters - 
> >>> and while I'm not rebuilding names that have lambda parameters (keep 
> >>> encoding the full string instead). The issue is if some other type 
> >>> depending on a type with a lambda parameter - but then multiple uses of 
> >>> that inner type exist, from different translation units (using type 
> >>> units) with different ways of naming the same file - so then the expected 
> >>> name has one spelling, but the actual spelling is different due to the 
> >>> "./")
> >>>
> >>> But all this said - it'd be good to figure out a reliable naming - the 
> >>> naming we have here, while usable for humans (pointing to source files, 
> >>> etc) - they don't reliably give unique names for each lambda/template 
> >>> instantiation which would make it difficult for a consumer to know if two 
> >>> entities are the same (important for types - is some function parameter 
> >>> the same type as another type?)
> >>>
> >>> While it's expected cross-producer (eg: trying to be compatible with GCC 
> >>> and Clang debug info) you have to do some fuzzy matching (eg: "f1" 
> >>> or "f1" at the most basic - there are more complicated cases) - 
> >>> this one's not possible with the data available.
> >>>
> >>> The source file/line/column is insufficient to uniquely identify a lambda 
> >>> (multiple lambdas stamped out by a macro would get all the same 
> >>> file/line/col) and valid code (albeit unlikely) that writes the same 
> >>> definition in multiple places could make the same lambda have different 
> >>> names.
> >>>
> >>> We should probably use something more like the way various ABI manglings 
> >>> do to identify these entities.
> >>>
> >>> But we should probably also do this for other unnamed types that have 
> >>> linkage (need to/would benefit from being matched up between two CUs), 
> >>> even 

Re: [Dwarf-Discuss] lambda (& other anonymous type) identification/naming

2022-07-24 Thread David Blaikie via Dwarf-Discuss
Ping on this thread - would love to hear what ideas folks have for
addressing the naming of anonymous types (enums, structs/classes, and
lambdas) - especially if it'd make it easier to go back/forth between
the DW_AT_name of a template with an unnamed type as a parameter and
the actual DIEs describing the same parameter type.

On Tue, Jun 14, 2022 at 1:02 PM David Blaikie  wrote:
>
> Looks like https://reviews.llvm.org/D122766 (-ffile-reproducible) might solve 
> my immediate issues in clang, but I think we should still consider moving to 
> a more canonical naming of lambdas that, necessarily, doesn't include the 
> file name (unfortunately). Probably has to include the lambda 
> numbering/something roughly equivalent to the mangled lambda name - it could 
> include type information (it'd be superfluous to a unique identifier, but I 
> don't think it would break consistently naming the same type across CUs 
> either).
>
> Anyone got ideas/preferences/thoughts on this?
>
> On Mon, Jan 24, 2022 at 5:51 PM David Blaikie  wrote:
>>
>> On Mon, Jan 24, 2022 at 5:37 PM Adrian Prantl  wrote:
>>>
>>>
>>>
>>> On Jan 23, 2022, at 2:53 PM, David Blaikie  wrote:
>>>
>>> A rather common "quality of implementation" issue seems to be lambda naming.
>>>
>>> I came across this due to non-canonicalization of lambda names in template 
>>> parameters depending on how a source file is named in Clang, and GCC's seem 
>>> to be very ambiguous:
>>>
>>> $ cat tmp/lambda.h
>>> template<typename T>
>>> void f1(T) { }
>>> static int i = (f1([]{}), 1);
>>> static int j = (f1([]{}), 2);
>>> void f1() {
>>>   f1([]{});
>>>   f1([]{});
>>> }
>>> $ cat tmp/lambda.cpp
>>> #ifdef I_PATH
>>> #include 
>>> #else
>>> #include "lambda.h"
>>> #endif
>>> $ clang++-tot tmp/lambda.cpp -g -c -I. -DI_PATH && llvm-dwarfdump-tot 
>>> lambda.o | grep "f1<"
>>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:3:20)>")
>>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:4:20)>")
>>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:6:6)>")
>>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:7:6)>")
>>> $ clang++-tot tmp/lambda.cpp -g -c && llvm-dwarfdump-tot lambda.o | grep 
>>> "f1<"
>>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:3:20)>")
>>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:4:20)>")
>>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:6:6)>")
>>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:7:6)>")
>>> $ g++-tot tmp/lambda.cpp -g -c -I. && llvm-dwarfdump-tot lambda.o | grep 
>>> "f1<"
>>> DW_AT_name  ("f1 >")
>>> DW_AT_name  ("f1 >")
>>> DW_AT_name  ("f1< >")
>>>
>>> DW_AT_name  ("f1< >")
>>>
>>> (I came across this in the context of my simplified template names work - 
>>> rebuilding names from the DW_TAG description of the template parameters - 
>>> and while I'm not rebuilding names that have lambda parameters (keep 
>>> encoding the full string instead). The issue is if some other type 
>>> depending on a type with a lambda parameter - but then multiple uses of 
>>> that inner type exist, from different translation units (using type units) 
>>> with different ways of naming the same file - so then the expected name has 
>>> one spelling, but the actual spelling is different due to the "./")
>>>
>>> But all this said - it'd be good to figure out a reliable naming - the 
>>> naming we have here, while usable for humans (pointing to source files, etc) 
>>> - they don't reliably give unique names for each lambda/template 
>>> instantiation which would make it difficult for a consumer to know if two 
>>> entities are the same (important for types - is some function parameter the 
>>> same type as another type?)
>>>
>>> While it's expected cross-producer (eg: trying to be compatible with GCC 
>>> and Clang debug info) you have to do some fuzzy matching (eg: "f1" or 
>>> "f1" at the most basic - there are more complicated cases) - this 
>>> one's not possible with the data available.
>>>
>>> The source file/line/column is insufficient to uniquely identify a lambda 
>>> (multiple lambdas stamped out by a macro would get all the same 
>>> file/line/col) and valid code (albeit unlikely) that writes the same 
>>> definition in multiple places could make the same lambda have different 
>>> names.
>>>
>>> We should probably use something more like the way various ABI manglings do 
>>> to identify these entities.
>>>
>>> But we should probably also do this for other unnamed types that have 
>>> linkage (need to/would benefit from being matched up between two CUs), even 
>>> not lambdas.
>>>
>>> FWIW, at least the llvm-cxxfilt demanglings of clang's manglings for these 
>>> symbols are:
>>>
>>>  void f1<$_0>($_0)
>>>  f1<$_1>($_1)
>>>  void f1(f1()::$_2)
>>>  void f1(f1()::$_3)
>>>
>>> Should we use that instead?
>>>
>>>
>>> The only other information that the current 

Re: [Dwarf-Discuss] lambda (& other anonymous type) identification/naming

2022-06-14 Thread David Blaikie via Dwarf-Discuss
Looks like https://reviews.llvm.org/D122766 (-ffile-reproducible) might
solve my immediate issues in clang, but I think we should still consider
moving to a more canonical naming of lambdas that, necessarily, doesn't
include the file name (unfortunately). Probably has to include the lambda
numbering/something roughly equivalent to the mangled lambda name - it
could include type information (it'd be superfluous to a unique identifier,
but I don't think it would break consistently naming the same type across
CUs either).
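
To illustrate why the file-name spelling is a problem for name-based matching, here is a minimal sketch (an assumption for illustration only; it does not describe what -ffile-reproducible does): the two spellings seen earlier in the thread, "./tmp/lambda.h" and "tmp/lambda.h", compare unequal as strings and only match after some normalization. `strip_leading_dot_slash` and `same_file_spelling` are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drop any leading "./" components from a path spelling.
inline std::string strip_leading_dot_slash(std::string path) {
    while (path.compare(0, 2, "./") == 0) {
        path.erase(0, 2);
    }
    return path;
}

// Hypothetical helper: compare two spellings after that normalization.
inline bool same_file_spelling(const std::string& a, const std::string& b) {
    return strip_leading_dot_slash(a) == strip_leading_dot_slash(b);
}

// Usage: the raw strings differ, the normalized ones do not.
inline void example_file_spelling() {
    const std::string with_dot = "./tmp/lambda.h";
    const std::string without_dot = "tmp/lambda.h";
    assert(with_dot != without_dot);
    assert(same_file_spelling(with_dot, without_dot));
}
```

Normalizing paths only papers over one source of mismatch, which is the motivation for a name that leaves the file out entirely.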

Anyone got ideas/preferences/thoughts on this?

On Mon, Jan 24, 2022 at 5:51 PM David Blaikie  wrote:

> On Mon, Jan 24, 2022 at 5:37 PM Adrian Prantl  wrote:
>
>>
>>
>> On Jan 23, 2022, at 2:53 PM, David Blaikie  wrote:
>>
>> A rather common "quality of implementation" issue seems to be lambda
>> naming.
>>
>> I came across this due to non-canonicalization of lambda names in
>> template parameters depending on how a source file is named in Clang, and
>> GCC's seem to be very ambiguous:
>>
>> $ cat tmp/lambda.h
>> template<typename T>
>> void f1(T) { }
>> static int i = (f1([]{}), 1);
>> static int j = (f1([]{}), 2);
>> void f1() {
>>   f1([]{});
>>   f1([]{});
>> }
>> $ cat tmp/lambda.cpp
>> #ifdef I_PATH
>> #include 
>> #else
>> #include "lambda.h"
>> #endif
>> $ clang++-tot tmp/lambda.cpp -g -c -I. -DI_PATH && llvm-dwarfdump-tot
>> lambda.o | grep "f1<"
>> DW_AT_name  ("*f1<*(lambda at ./tmp/lambda.h:3:20)>")
>> DW_AT_name  ("*f1<*(lambda at ./tmp/lambda.h:4:20)>")
>> DW_AT_name  ("*f1<*(lambda at ./tmp/lambda.h:6:6)>")
>> DW_AT_name  ("*f1<*(lambda at ./tmp/lambda.h:7:6)>")
>> $ clang++-tot tmp/lambda.cpp -g -c && llvm-dwarfdump-tot lambda.o | grep
>> "f1<"
>> DW_AT_name  ("*f1<*(lambda at tmp/lambda.h:3:20)>")
>> DW_AT_name  ("*f1<*(lambda at tmp/lambda.h:4:20)>")
>> DW_AT_name  ("*f1<*(lambda at tmp/lambda.h:6:6)>")
>> DW_AT_name  ("*f1<*(lambda at tmp/lambda.h:7:6)>")
>> $ g++-tot tmp/lambda.cpp -g -c -I. && llvm-dwarfdump-tot lambda.o | grep
>> "f1<"
>> DW_AT_name  ("*f1<*f1():: >")
>> DW_AT_name  ("*f1<*f1():: >")
>> DW_AT_name  ("*f1<* >")
>>
>> DW_AT_name  ("*f1<* >")
>>
>> (I came across this in the context of my simplified template names work -
>> rebuilding names from the DW_TAG description of the template parameters -
>> and while I'm not rebuilding names that have lambda parameters (keep
>> encoding the full string instead). The issue is if some other type
>> depending on a type with a lambda parameter - but then multiple uses of
>> that inner type exist, from different translation units (using type units)
>> with different ways of naming the same file - so then the expected name has
>> one spelling, but the actual spelling is different due to the "./")
>>
>> But all this said - it'd be good to figure out a reliable naming - the
>> naming we have here, while usable for humans (pointing to source files, etc)
>> - they don't reliably give unique names for each lambda/template
>> instantiation which would make it difficult for a consumer to know if two
>> entities are the same (important for types - is some function parameter the
>> same type as another type?)
>>
>> While it's expected cross-producer (eg: trying to be compatible with GCC
>> and Clang debug info) you have to do some fuzzy matching (eg: "f1" or
>> "f1" at the most basic - there are more complicated cases) - this
>> one's not possible with the data available.
>>
>> The source file/line/column is insufficient to uniquely identify a lambda
>> (multiple lambdas stamped out by a macro would get all the same
>> file/line/col) and valid code (albeit unlikely) that writes the same
>> definition in multiple places could make the same lambda have different
>> names.
>>
>> We should probably use something more like the way various ABI manglings
>> do to identify these entities.
>>
>> But we should probably also do this for other unnamed types that have
>> linkage (need to/would benefit from being matched up between two CUs), even
>> not lambdas.
>>
>> FWIW, at least the llvm-cxxfilt demanglings of clang's manglings for
>> these symbols are:
>>
>>  void f1<$_0>($_0)
>>  f1<$_1>($_1)
>>  void f1(f1()::$_2)
>>  void f1(f1()::$_3)
>>
>> Should we use that instead?
>>
>>
>> The only other information that the current human-readable DWARF name
>> carries is the file and line, which is fully redundant with
>> DW_AT_decl_file and DW_AT_decl_line, so the above scheme seems reasonable
>> to me. Poorly symbolicated backtraces would be worse in this scheme, so
>> I'm expecting most pushback from users who rely on a tool that just prints
>> the human-readable name with no source info.
>>
>
> Yeah - you can always pull the file/line/col from the DW_AT_decl_* anyway,
> so encoding it in the type name does seem redundant and inefficient 

Re: [Dwarf-Discuss] lambda (& other anonymous type) identification/naming

2022-01-25 Thread Adrian Prantl via Dwarf-Discuss


> On Jan 23, 2022, at 2:53 PM, David Blaikie  wrote:
> 
> A rather common "quality of implementation" issue seems to be lambda naming.
> 
> I came across this due to non-canonicalization of lambda names in template 
> parameters depending on how a source file is named in Clang, and GCC's seem 
> to be very ambiguous:
> 
> $ cat tmp/lambda.h
> template<typename T>
> void f1(T) { }
> static int i = (f1([]{}), 1);
> static int j = (f1([]{}), 2);
> void f1() {
>   f1([]{});
>   f1([]{});
> }
> $ cat tmp/lambda.cpp
> #ifdef I_PATH
> #include 
> #else
> #include "lambda.h"
> #endif
> $ clang++-tot tmp/lambda.cpp -g -c -I. -DI_PATH && llvm-dwarfdump-tot 
> lambda.o | grep "f1<"
> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:3:20)>")
> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:4:20)>")
> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:6:6)>")
> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:7:6)>")
> $ clang++-tot tmp/lambda.cpp -g -c && llvm-dwarfdump-tot lambda.o | grep "f1<"
> DW_AT_name  ("f1<(lambda at tmp/lambda.h:3:20)>")
> DW_AT_name  ("f1<(lambda at tmp/lambda.h:4:20)>")
> DW_AT_name  ("f1<(lambda at tmp/lambda.h:6:6)>")
> DW_AT_name  ("f1<(lambda at tmp/lambda.h:7:6)>")
> $ g++-tot tmp/lambda.cpp -g -c -I. && llvm-dwarfdump-tot lambda.o | grep "f1<"
> DW_AT_name  ("f1 >")
> DW_AT_name  ("f1 >")
> DW_AT_name  ("f1< >")
> DW_AT_name  ("f1< >")
> 
> (I came across this in the context of my simplified template names work - 
> rebuilding names from the DW_TAG description of the template parameters - and 
> while I'm not rebuilding names that have lambda parameters (keep encoding the 
> full string instead). The issue is if some other type depending on a type 
> with a lambda parameter - but then multiple uses of that inner type exist, 
> from different translation units (using type units) with different ways of 
> naming the same file - so then the expected name has one spelling, but the 
> actual spelling is different due to the "./")
> 
> But all this said - it'd be good to figure out a reliable naming - the naming 
> we have here, while usable for humans (pointing to source files, etc) - they 
> don't reliably give unique names for each lambda/template instantiation which 
> would make it difficult for a consumer to know if two entities are the same 
> (important for types - is some function parameter the same type as another 
> type?)
> 
> While it's expected cross-producer (eg: trying to be compatible with GCC and 
> Clang debug info) you have to do some fuzzy matching (eg: "f1" or 
> "f1" at the most basic - there are more complicated cases) - this 
> one's not possible with the data available.
> 
> The source file/line/column is insufficient to uniquely identify a lambda 
> (multiple lambdas stamped out by a macro would get all the same 
> file/line/col) and valid code (albeit unlikely) that writes the same 
> definition in multiple places could make the same lambda have different names.
> 
> We should probably use something more like the way various ABI manglings do 
> to identify these entities.
> 
> But we should probably also do this for other unnamed types that have linkage 
> (need to/would benefit from being matched up between two CUs), even not 
> lambdas.
> 
> FWIW, at least the llvm-cxxfilt demanglings of clang's manglings for these 
> symbols are:
> 
>  void f1<$_0>($_0)
>  f1<$_1>($_1)
>  void f1(f1()::$_2)
>  void f1(f1()::$_3)
> 
> Should we use that instead?

The only other information that the current human-readable DWARF name carries 
is the file and line, which is fully redundant with DW_AT_decl_file and 
DW_AT_decl_line, so the above scheme seems reasonable to me. Poorly 
symbolicated backtraces would be worse in this scheme, so I'm expecting most 
pushback from users who rely on a tool that just prints the human-readable 
name with no source info.
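
A minimal sketch of how a tool could keep a readable label under such a scheme, assuming it has already read the declaration coordinates from the DIE: the source location is appended at display time instead of being baked into the name. `ClosureTypeInfo` and `display_name` are hypothetical, and "f1()::$_2" is used only as an example of a canonical name.

```cpp
#include <cassert>
#include <string>

// Hypothetical in-memory view of attributes a consumer already has on the
// DIE for a lambda's closure type.
struct ClosureTypeInfo {
    std::string canonical_name;  // e.g. "f1()::$_2", an assumed naming scheme
    std::string decl_file;       // from DW_AT_decl_file
    unsigned decl_line;          // from DW_AT_decl_line
    unsigned decl_column;        // from DW_AT_decl_column
};

// Build a human-readable label without storing the location in the name.
inline std::string display_name(const ClosureTypeInfo& info) {
    return info.canonical_name + " (lambda at " + info.decl_file + ":" +
           std::to_string(info.decl_line) + ":" +
           std::to_string(info.decl_column) + ")";
}

// Usage with the coordinates of the third lambda from the example header.
inline void example_display_name() {
    const ClosureTypeInfo info{"f1()::$_2", "tmp/lambda.h", 6, 6};
    assert(display_name(info) == "f1()::$_2 (lambda at tmp/lambda.h:6:6)");
}
```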

> 
> GCC's mangling's different (in these examples that's OK, since they're all 
> internal linkage):
> 
>  void f1(f1()::'lambda0'())
>  void f1(f1()::'lambda'())
> 
> If I add an example like this:
> 
> inline auto f1() { return []{}; }
> 
> and instantiate the template with the result of f1:
> 
>  void f1(f2()::'lambda'())
> 
> GCC:
> 
>  void f1(f2()::'lambda'()) 
> 
> So they consistently use the same mangling - we could use the same naming for 
> template parameters?
> 
> How should we communicate this sort of identity for unnamed types in the DIEs 
> describing the types themselves (not just the string of a template name of a 
> type instantiated with the unnamed type) so the unnamed type can be matched 
> up between translation units.
> 
> eg, if I have these two translation units:
> // header
> inline auto f1() { struct { } local; return local; }
> // unit 1:
> #include "header"
> auto f2(decltype(f1())) { }
> // unit 2:
> #include 

Re: [Dwarf-Discuss] lambda (& other anonymous type) identification/naming

2022-01-25 Thread David Blaikie via Dwarf-Discuss
On Mon, Jan 24, 2022 at 5:37 PM Adrian Prantl  wrote:

>
>
> On Jan 23, 2022, at 2:53 PM, David Blaikie  wrote:
>
> A rather common "quality of implementation" issue seems to be lambda
> naming.
>
> I came across this due to non-canonicalization of lambda names in template
> parameters depending on how a source file is named in Clang, and GCC's seem
> to be very ambiguous:
>
> $ cat tmp/lambda.h
> template<typename T>
> void f1(T) { }
> static int i = (f1([]{}), 1);
> static int j = (f1([]{}), 2);
> void f1() {
>   f1([]{});
>   f1([]{});
> }
> $ cat tmp/lambda.cpp
> #ifdef I_PATH
> #include 
> #else
> #include "lambda.h"
> #endif
> $ clang++-tot tmp/lambda.cpp -g -c -I. -DI_PATH && llvm-dwarfdump-tot
> lambda.o | grep "f1<"
> DW_AT_name  ("*f1<*(lambda at ./tmp/lambda.h:3:20)>")
> DW_AT_name  ("*f1<*(lambda at ./tmp/lambda.h:4:20)>")
> DW_AT_name  ("*f1<*(lambda at ./tmp/lambda.h:6:6)>")
> DW_AT_name  ("*f1<*(lambda at ./tmp/lambda.h:7:6)>")
> $ clang++-tot tmp/lambda.cpp -g -c && llvm-dwarfdump-tot lambda.o | grep
> "f1<"
> DW_AT_name  ("*f1<*(lambda at tmp/lambda.h:3:20)>")
> DW_AT_name  ("*f1<*(lambda at tmp/lambda.h:4:20)>")
> DW_AT_name  ("*f1<*(lambda at tmp/lambda.h:6:6)>")
> DW_AT_name  ("*f1<*(lambda at tmp/lambda.h:7:6)>")
> $ g++-tot tmp/lambda.cpp -g -c -I. && llvm-dwarfdump-tot lambda.o | grep
> "f1<"
> DW_AT_name  ("*f1<*f1():: >")
> DW_AT_name  ("*f1<*f1():: >")
> DW_AT_name  ("*f1<* >")
>
> DW_AT_name  ("*f1<* >")
>
> (I came across this in the context of my simplified template names work -
> rebuilding names from the DW_TAG description of the template parameters -
> and while I'm not rebuilding names that have lambda parameters (keep
> encoding the full string instead). The issue is if some other type
> depending on a type with a lambda parameter - but then multiple uses of
> that inner type exist, from different translation units (using type units)
> with different ways of naming the same file - so then the expected name has
> one spelling, but the actual spelling is different due to the "./")
>
> But all this said - it'd be good to figure out a reliable naming - the
> naming we have here, while usable for humans (pointing to source files, etc)
> - they don't reliably give unique names for each lambda/template
> instantiation which would make it difficult for a consumer to know if two
> entities are the same (important for types - is some function parameter the
> same type as another type?)
>
> While it's expected cross-producer (eg: trying to be compatible with GCC
> and Clang debug info) you have to do some fuzzy matching (eg: "f1" or
> "f1" at the most basic - there are more complicated cases) - this
> one's not possible with the data available.
>
> The source file/line/column is insufficient to uniquely identify a lambda
> (multiple lambdas stamped out by a macro would get all the same
> file/line/col) and valid code (albeit unlikely) that writes the same
> definition in multiple places could make the same lambda have different
> names.
>
> We should probably use something more like the way various ABI manglings
> do to identify these entities.
>
> But we should probably also do this for other unnamed types that have
> linkage (need to/would benefit from being matched up between two CUs), even
> not lambdas.
>
> FWIW, at least the llvm-cxxfilt demanglings of clang's manglings for these
> symbols are:
>
>  void f1<$_0>($_0)
>  f1<$_1>($_1)
>  void f1(f1()::$_2)
>  void f1(f1()::$_3)
>
> Should we use that instead?
>
>
> The only other information that the current human-readable DWARF name
> carries is the file and line, which is fully redundant with
> DW_AT_decl_file and DW_AT_decl_line, so the above scheme seems reasonable
> to me. Poorly symbolicated backtraces would be worse in this scheme, so
> I'm expecting most pushback from users who rely on a tool that just prints
> the human-readable name with no source info.
>

Yeah - you can always pull the file/line/col from the DW_AT_decl_* anyway,
so encoding it in the type name does seem redundant and inefficient indeed
(beyond/independent of the correctness issues).
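
The correctness issue mentioned here (lambdas stamped out by a macro sharing one source location) can be sketched as follows. This is an illustration under stated assumptions, not code from the thread: `TWO_LAMBDAS` is a hypothetical macro, and only the line number is checked, since the column a compiler records for a macro expansion is not observable from within the program.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical macro that stamps out two lambdas on one source line.
#define TWO_LAMBDAS(a, b)              \
    auto a = [] { return __LINE__; };  \
    auto b = [] { return __LINE__; }

// Returns true when both lambdas report the same line yet have distinct types.
inline bool same_line_but_distinct_types() {
    TWO_LAMBDAS(first, second);
    const bool same_line = first() == second();
    const bool same_type =
        std::is_same<decltype(first), decltype(second)>::value;
    return same_line && !same_type;
}

// Usage: a name built only from file and line could not tell these apart.
inline void example_macro_lambdas() {
    assert(same_line_but_distinct_types());
}
```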

> GCC's mangling's different (in these examples that's OK, since they're all
> internal linkage):
>
>  void f1(f1()::'lambda0'())
>  void f1(f1()::'lambda'())
>
> If I add an example like this:
>
> inline auto f1() { return []{}; }
>
> and instantiate the template with the result of f1:
>
>  void f1(f2()::'lambda'())
>
> GCC:
>
>  void f1(f2()::'lambda'())
>
> So they consistently use the same mangling - we could use the same naming
> for template parameters?
>
> How should we communicate this sort of identity for unnamed types in the
> DIEs describing the types themselves (not just the string of a template
> name of a type instantiated with the unnamed type) so the