dblaikie added inline comments.

================
Comment at: lib/CodeGen/CGDebugInfo.cpp:812-814
+  // CodeView types with C++ mangling need a type identifier.
+  if (CGM.getCodeGenOpts().EmitCodeView)
+    return true;
----------------
rnk wrote:
> rnk wrote:
> > dblaikie wrote:
> > > Just came across this code today - it's /probably/ a problem for LTO. 
> > > LLVM IR linking depends on the identifier to determine if two structs are 
> > > the same for linkage purposes - so if an identifier is added for a 
> > > non-linkage (local/not externally visible) type, LLVM will consider them 
> > > to be the same even though they're unrelated:
> > > ```
> > > namespace { struct t1 { int i; int j; }; }
> > > t1 v1;
> > > void *v3 = &v1;
> > > ```
> > > ```
> > > namespace { struct t1 { int i; }; }
> > > t1 v1;
> > > void *v2 = &v1;
> > > ```
> > > ```
> > > $ clang++-tot -emit-llvm -S a.cpp b.cpp -g -gcodeview  && 
> > > ~/dev/llvm/build/default/bin/llvm-link -o ab.bc a.ll b.ll && 
> > > ~/dev/llvm/build/default/bin/llvm-dis ab.bc && cat ab.ll | grep "\"t1\""
> > > !8 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", 
> > > scope: !9, file: !3, line: 1, size: 64, flags: DIFlagTypePassByValue, 
> > > elements: !10, identifier: "_ZTSN12_GLOBAL__N_12t1E")
> > > $ clang++-tot -emit-llvm -S a.cpp b.cpp -g && 
> > > ~/dev/llvm/build/default/bin/llvm-link -o ab.bc a.ll b.ll && 
> > > ~/dev/llvm/build/default/bin/llvm-dis ab.bc && cat ab.ll | grep "\"t1\""
> > > !8 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", 
> > > scope: !9, file: !3, line: 1, size: 64, flags: DIFlagTypePassByValue, 
> > > elements: !10)
> > > !21 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", 
> > > scope: !9, file: !17, line: 1, size: 32, flags: DIFlagTypePassByValue, 
> > > elements: !22)
> > > ```
> > > 
> > > So in linking, now both `a.cpp` and `b.cpp` refer to a single `t1` type 
> > > (in this case, it looks like the first one - the 64 bit wide one).
> > > 
> > > If CodeView actually can't represent these two distinct types with the 
> > > same name in the same object file, so be it? But this looks like it's 
> > > likely to cause problems for consumers/users.
> > If you use the MSVC ABI, we will assign unique identifiers to these two 
> > structs:
> > ```
> > !9 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", 
> > scope: !10, file: !7, line: 1, size: 64, flags: DIFlagTypePassByValue, 
> > elements: !11, identifier: ".?AUt1@?A0xF964240C@@")
> > ```
> > 
> > The `A0xF964240C` is set up here, and it is based on the source file path 
> > (the hash will only be unique when source file paths are unique across the 
> > build):
> > https://github.com/llvm/llvm-project/blob/main/clang/lib/AST/MicrosoftMangle.cpp#L464
> > ```
> >   // It's important to make the mangled names unique because, when CodeView
> >   // debug info is in use, the debugger uses mangled type names to 
> > distinguish
> >   // between otherwise identically named types in anonymous namespaces.
> > ```
> > 
> > Maybe this should use a "is MSVC ABI" check instead, since that is what 
> > controls whether the identifier will be unique, and the unique-ness is what 
> > matters for LTO and PDBs.
> After thinking about it some more, see the comment I added here: 
> rG59320fc9d78237a5f9fdd6a5030d6554bb6976ce
> 
> ```
> // If the type is not externally visible, it should be unique to the current 
> TU,
> // and should not need an identifier to participate in type deduplication.
> // However, when emitting CodeView, the format internally uses these
> // unique type name identifers for references between debug info. For example,
> // the method of a class in an anonymous namespace uses the identifer to refer
> // to its parent class. The Microsoft C++ ABI attempts to provide unique names
> // for such types, so when emitting CodeView, always use identifiers for C++
> // types. This may create problems when attempting to emit CodeView when the 
> MS
> // C++ ABI is not in use.
> ```
> 
> I think type identifiers are pretty crucial for CodeView functionality. The 
> debugger won't be able to link a method to its class without them. Therefore, 
> I think it's better to leave this as it is. The bad experience of duplicate 
> type identifiers is better than the lack of functionality from not emitting 
> identifiers at all for non-externally visible types.
Yeah, thanks for explaining/adding the comment - that about covers it.


Hmm, do these identifiers just need to only be unique within a single object 
file to provide the name referencing mechanics? Perhaps we could generate them 
later to ensure they're unique - rather than risk collapsing them while linking?

(at least here's an obscure case where they seem to collide: Compiling the same 
file with different `-D` values, I guess the hash is only for the filename, not 
for other properties that might impact the output - so this ends up with two 
types with the same mangled name even on MSVC ABI:
```
namespace { struct t1 { int i;
#ifdef SECOND
  int j;
#endif
}; }
t1 v1;
void*
#ifdef SECOND
v2
#else
v3
#endif
= &v1;
```
```
clang++-tot -emit-llvm -S a.cpp -g -gcodeview  -target x86_64-windows && 
clang++-tot -emit-llvm -S a.cpp -DSECOND -g -gcodeview -o b.ll -target 
x86_64-windows && ~/dev/llvm/build/default/bin/llvm-link -o ab.bc a.ll b.ll && 
~/dev/llvm/build/default/bin/llvm-dis ab.bc && cat ab.ll | grep "\"t1\""
```
)


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D45438/new/

https://reviews.llvm.org/D45438

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to