dblaikie added inline comments.
================ Comment at: lib/CodeGen/CGDebugInfo.cpp:812-814 + // CodeView types with C++ mangling need a type identifier. + if (CGM.getCodeGenOpts().EmitCodeView) + return true; ---------------- rnk wrote: > rnk wrote: > > dblaikie wrote: > > > Just came across this code today - it's /probably/ a problem for LTO. > > > LLVM IR linking depends on the identifier to determine if two structs are > > > the same for linkage purposes - so if an identifier is added for a > > > non-linkage (local/not externally visible) type, LLVM will consider them > > > to be the same even though they're unrelated: > > > ``` > > > namespace { struct t1 { int i; int j; }; } > > > t1 v1; > > > void *v3 = &v1; > > > ``` > > > ``` > > > namespace { struct t1 { int i; }; } > > > t1 v1; > > > void *v2 = &v1; > > > ``` > > > ``` > > > $ clang++-tot -emit-llvm -S a.cpp b.cpp -g -gcodeview && > > > ~/dev/llvm/build/default/bin/llvm-link -o ab.bc a.ll b.ll && > > > ~/dev/llvm/build/default/bin/llvm-dis ab.bc && cat ab.ll | grep "\"t1\"" > > > !8 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", > > > scope: !9, file: !3, line: 1, size: 64, flags: DIFlagTypePassByValue, > > > elements: !10, identifier: "_ZTSN12_GLOBAL__N_12t1E") > > > $ clang++-tot -emit-llvm -S a.cpp b.cpp -g && > > > ~/dev/llvm/build/default/bin/llvm-link -o ab.bc a.ll b.ll && > > > ~/dev/llvm/build/default/bin/llvm-dis ab.bc && cat ab.ll | grep "\"t1\"" > > > !8 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", > > > scope: !9, file: !3, line: 1, size: 64, flags: DIFlagTypePassByValue, > > > elements: !10) > > > !21 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", > > > scope: !9, file: !17, line: 1, size: 32, flags: DIFlagTypePassByValue, > > > elements: !22) > > > ``` > > > > > > So in linking, now both `a.cpp` and `b.cpp` refer to a single `t1` type > > > (in this case, it looks like the first one - the 64 bit wide one). > > > > > > If CodeView actually can't represent these two distinct types with the > > > same name in the same object file, so be it? But this looks like it's > > > likely to cause problems for consumers/users. > > If you use the MSVC ABI, we will assign unique identifiers to these two > > structs: > > ``` > > !9 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", > > scope: !10, file: !7, line: 1, size: 64, flags: DIFlagTypePassByValue, > > elements: !11, identifier: ".?AUt1@?A0xF964240C@@") > > ``` > > > > The `A0xF964240C` is set up here, and it is based on the source file path > > (the hash will only be unique when source file paths are unique across the > > build): > > https://github.com/llvm/llvm-project/blob/main/clang/lib/AST/MicrosoftMangle.cpp#L464 > > ``` > > // It's important to make the mangled names unique because, when CodeView > > // debug info is in use, the debugger uses mangled type names to > > distinguish > > // between otherwise identically named types in anonymous namespaces. > > ``` > > > > Maybe this should use a "is MSVC ABI" check instead, since that is what > > controls whether the identifier will be unique, and the unique-ness is what > > matters for LTO and PDBs. > After thinking about it some more, see the comment I added here: > rG59320fc9d78237a5f9fdd6a5030d6554bb6976ce > > ``` > // If the type is not externally visible, it should be unique to the current > TU, > // and should not need an identifier to participate in type deduplication. > // However, when emitting CodeView, the format internally uses these > // unique type name identifers for references between debug info. For example, > // the method of a class in an anonymous namespace uses the identifer to refer > // to its parent class. The Microsoft C++ ABI attempts to provide unique names > // for such types, so when emitting CodeView, always use identifiers for C++ > // types. This may create problems when attempting to emit CodeView when the > MS > // C++ ABI is not in use. > ``` > > I think type identifiers are pretty crucial for CodeView functionality. The > debugger won't be able to link a method to its class without them. Therefore, > I think it's better to leave this as it is. The bad experience of duplicate > type identifiers is better than the lack of functionality from not emitting > identifiers at all for non-externally visible types. Yeah, thanks for explaining/adding the comment - that about covers it. Hmm, do these identifiers just need to only be unique within a single object file to provide the name referencing mechanics? Perhaps we could generate them later to ensure they're unique - rather than risk collapsing them while linking? (at least here's an obscure case where they seem to collide: Compiling the same file with different `-D` values, I guess the hash is only for the filename, not for other properties that might impact the output - so this ends up with two types with the same mangled name even on MSVC ABI: ``` namespace { struct t1 { int i; #ifdef SECOND int j; #endif }; } t1 v1; void* #ifdef SECOND v2 #else v3 #endif = &v1; ``` ``` clang++-tot -emit-llvm -S a.cpp -g -gcodeview -target x86_64-windows && clang++-tot -emit-llvm -S a.cpp -DSECOND -g -gcodeview -o b.ll -target x86_64-windows && ~/dev/llvm/build/default/bin/llvm-link -o ab.bc a.ll b.ll && ~/dev/llvm/build/default/bin/llvm-dis ab.bc && cat ab.ll | grep "\"t1\"" ``` ) Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D45438/new/ https://reviews.llvm.org/D45438 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits