> On Jan 23, 2022, at 2:53 PM, David Blaikie <dblai...@gmail.com> wrote:
>
> A rather common "quality of implementation" issue seems to be lambda naming.
>
> I came across this due to non-canonicalization of lambda names in template
> parameters depending on how a source file is named in Clang, and GCC's seem
> to be very ambiguous:
>
> $ cat tmp/lambda.h
> template<typename T>
> void f1(T) { }
> static int i = (f1([]{}), 1);
> static int j = (f1([]{}), 2);
> void f1() {
>   f1([]{});
>   f1([]{});
> }
> $ cat tmp/lambda.cpp
> #ifdef I_PATH
> #include <tmp/lambda.h>
> #else
> #include "lambda.h"
> #endif
> $ clang++-tot tmp/lambda.cpp -g -c -I. -DI_PATH && llvm-dwarfdump-tot lambda.o | grep "f1<"
>   DW_AT_name ("f1<(lambda at ./tmp/lambda.h:3:20)>")
>   DW_AT_name ("f1<(lambda at ./tmp/lambda.h:4:20)>")
>   DW_AT_name ("f1<(lambda at ./tmp/lambda.h:6:6)>")
>   DW_AT_name ("f1<(lambda at ./tmp/lambda.h:7:6)>")
> $ clang++-tot tmp/lambda.cpp -g -c && llvm-dwarfdump-tot lambda.o | grep "f1<"
>   DW_AT_name ("f1<(lambda at tmp/lambda.h:3:20)>")
>   DW_AT_name ("f1<(lambda at tmp/lambda.h:4:20)>")
>   DW_AT_name ("f1<(lambda at tmp/lambda.h:6:6)>")
>   DW_AT_name ("f1<(lambda at tmp/lambda.h:7:6)>")
> $ g++-tot tmp/lambda.cpp -g -c -I. && llvm-dwarfdump-tot lambda.o | grep "f1<"
>   DW_AT_name ("f1<f1()::<lambda()> >")
>   DW_AT_name ("f1<f1()::<lambda()> >")
>   DW_AT_name ("f1<<lambda()> >")
>   DW_AT_name ("f1<<lambda()> >")
>
> (I came across this in the context of my simplified template names work -
> rebuilding names from the DW_TAG description of the template parameters - and
> while I'm not rebuilding names that have lambda parameters (keep encoding the
> full string instead).
> The issue is when some other type depends on a type with a lambda
> parameter, and multiple uses of that inner type exist, from different
> translation units (using type units) with different ways of naming the same
> file - so the expected name has one spelling, but the actual spelling is
> different due to the "./")
>
> But all this said - it'd be good to figure out a reliable naming. The names
> we have here, while usable for humans (pointing to source files, etc), don't
> reliably give unique names for each lambda/template instantiation, which
> would make it difficult for a consumer to know if two entities are the same
> (important for types - is some function parameter the same type as another
> type?)
>
> While it's expected that cross-producer (eg: trying to be compatible with
> GCC and Clang debug info) you have to do some fuzzy matching (eg: "f1<int*>"
> or "f1<int *>" at the most basic - there are more complicated cases) - this
> one's not possible with the data available.
>
> The source file/line/column is insufficient to uniquely identify a lambda
> (multiple lambdas stamped out by a macro would all get the same
> file/line/col) and valid code (albeit unlikely) that writes the same
> definition in multiple places could make the same lambda have different names.
>
> We should probably use something more like the way various ABI manglings do
> to identify these entities.
>
> But we should probably also do this for other unnamed types that have linkage
> (need to/would benefit from being matched up between two CUs), even if they
> are not lambdas.
>
> FWIW, at least the llvm-cxxfilt demanglings of clang's manglings for these
> symbols are:
>
> void f1<$_0>($_0)
> f1<$_1>($_1)
> void f1<f1()::$_2>(f1()::$_2)
> void f1<f1()::$_3>(f1()::$_3)
>
> Should we use that instead?

The only other information that the current human-readable DWARF name carries
is the file+line, and that is fully redundant with
DW_AT_decl_file/DW_AT_decl_line, so the above scheme seems reasonable to me.
Poorly symbolicated backtraces would be worse in this scheme, so I'm expecting
most pushback from users who rely on a tool that just prints the
human-readable name with no source info.

>
> GCC's mangling's different (in these examples that's OK, since they're all
> internal linkage):
>
> void f1<f1()::'lambda0'()>(f1()::'lambda0'())
> void f1<f1()::'lambda'()>(f1()::'lambda'())
>
> If I add an example like this:
>
> inline auto f2() { return []{}; }
>
> and instantiate the template with the result of f2:
>
> void f1<f2()::'lambda'()>(f2()::'lambda'())
>
> GCC:
>
> void f1<f2()::'lambda'()>(f2()::'lambda'())
>
> So they consistently use the same mangling - we could use the same naming for
> template parameters?
>
> How should we communicate this sort of identity for unnamed types in the DIEs
> describing the types themselves (not just the string of a template name of a
> type instantiated with the unnamed type) so the unnamed type can be matched
> up between translation units?
>
> eg, if I have these two translation units:
> // header
> inline auto f1() { struct { } local; return local; }
> // unit 1:
> #include "header"
> auto f2(decltype(f1())) { }
> // unit 2:
> #include "header"
> decltype(f1()) v1;
>
> Currently the DWARF produced for this unnamed type is:
> 0x0000003f: DW_TAG_structure_type
>               DW_AT_calling_convention (DW_CC_pass_by_value)
>               DW_AT_byte_size (0x01)
>               DW_AT_decl_file ("/usr/local/google/home/blaikie/dev/scratch/test.cpp")
>               DW_AT_decl_line (1)
>

Is this the type of struct {}?

>
> So there's no way to know if you see that structure type definition in two
> different translation units whether they refer to the same type because there
> may be multiple types that have the same DWARF description.
> (so no way to know if the DWARF consumer should allow the user to evaluate
> an expression `f2(v1)` or not, I think?)

Does a C++ compiler usually treat structurally equivalent but differently
named types as interchangeable?

Does a C++ compiler usually treat structurally equivalent anonymous types as
interchangeable?

-- adrian

>
> I guess the only way to have an unnamed type with linkage is to use it inside
> an inline function - so within that scope you'd have to produce DWARF for any
> types consistently in all definitions of the function and then a consumer
> could match them up by counting (assuming the unnamed types were always
> emitted in the same order in the child DIE list)...
>
> But this all seems a bit subtle & maybe would benefit from a more
> robust/explicit description?
>
> Perhaps adding an integer attribute to number anonymous types? They'd need to
> differentiate between lambdas and other anonymous types, since they have
> separate numberings.
_______________________________________________
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org