dblaikie added a comment.

In D93747#2481383 <https://reviews.llvm.org/D93747#2481383>, @hoy wrote:

> In D93747#2481304 <https://reviews.llvm.org/D93747#2481304>, @tmsriram wrote:
>
>> In D93747#2481223 <https://reviews.llvm.org/D93747#2481223>, @tmsriram wrote:
>>
>>> In D93747#2481163 <https://reviews.llvm.org/D93747#2481163>, @dblaikie 
>>> wrote:
>>>
>>>> In D93747#2481095 <https://reviews.llvm.org/D93747#2481095>, @hoy wrote:
>>>>
>>>>> In D93747#2481073 <https://reviews.llvm.org/D93747#2481073>, @dblaikie 
>>>>> wrote:
>>>>>
>>>>>> Suggesting that gdb isn't using the DW_AT_name at all for "break 
>>>>>> <function name>" but instead demangling and stripping off the extras 
>>>>>> from the linkage name, and since it can't demangle this uniquified name 
>>>>>> (unlike the mangled name used when using the overloadable attribute) 
>>>>>> that degrades the debugger user experience? I'd have to test that in 
>>>>>> more detail/with some hand-hacked DWARF.
>>>>>
>>>>> Yes, I think in your case the linage name can be demangled by the 
>>>>> debugger. In my previous experiment, the uniquefied names could not be 
>>>>> demangled therefore I was not able to breakpoint.
>>>>
>>>> Ah, did some more testing. Seems it's because it isn't a classically 
>>>> mangled name.
>>
>> The simplest fix I can think of is to emit the base 10 number of the 
>> md5hash.  That would preserve all the existing uniqueness and be demangler 
>> friendly.  @hoy @dblaikie  WDYT?
>
> I think using the base 10 form of md5hash is a smart workaround. Thanks for 
> the suggestion.

Sure - wouldn't mind having some wording from the Itanium ABI/mangling rules 
that explain/corroborate what we've seen from testing to ensure we are using it 
correctly and there aren't any other ways that might be more suitable, or 
lurking gotchas we haven't tested.

>>>> It might be possible for this uniquifying step to check if the name is 
>>>> mangled (does it start with "_Z") and if it isn't mangled, it could mangle 
>>>> it (check the length of the name, then "_Z<length><name>v") and add the 
>>>> unique suffix. That looks to me like it preserves the original debugging 
>>>> behavior?
>>>>
>>>> (has the problem that we don't actually know the mangling scheme at this 
>>>> point - we do know it up in clang (where it might be Itanium mangling or 
>>>> Microsoft mangling), might point again to the possibility this feature is 
>>>> more suitable to implement in the frontend rather than a middle end pass. 
>>>> And also the 'v' in the mangling is 'void return type', which is a lie - 
>>>> not sure if we could do something better there)
>
> Doing name unification in the frontend sounds like the ultimate solution and 
> since the frontend has all the knowledge about the name mangler. I think for 
> now we can go with the solution @tmsriram suggested, i.e., using the base 10 
> form of md5 hash.

Any general idea of how long "for now" would be? It doesn't seem like putting 
this off is going to make things especially better & seems like more bug 
fixes/changes/etc are being built around the solution as it is at the moment. 
So I'd be hesitant to put off this kind of restructuring too long.

>>>> I think it's pretty important that we don't break tradeoff debuggability 
>>>> for profile accuracy. It'll make for a difficult tradeoff for our/any 
>>>> users.
>>>
>>> Agreed, I didn't expect this.
>>>
>>>> (side note: using the original source file name seems problematic - I know 
>>>> in google at least, the same source file is sometimes built into more than 
>>>> one library/form with different preprocessor defines, so this may not 
>>>> produce as unique a name as you are expecting?)
>>>
>>> It was a best effort and I think the hashing can be improved by using more 
>>> signals other than just the module name if needed.  For hard data though, 
>>> this does significantly improve performance which clearly comes from better 
>>> profile attribution so it does something.

I'm OK with the idea that this helped the situation - but it doesn't seem very 
principled/rigorous. Rather than a sliding scale of "better" I think it'd be 
worth considering in more detail what would be required to make this correct.

Using the object file name may be more robust - also not perfect for all build 
systems (one could imagine a distributed build system that /might/ be able to 
get away with having the distributed builders always build a file named x.o - 
only to have the distribution system rename the file to its canonical name on 
the main machine before linking occurs - but I think that's not how most 
distributed build systems work, and most build systems provide a unique object 
file name across a build?) but maybe moreso than using the input file name? (at 
least I think for google/blaze the object filename is genuinely unique)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93747/new/

https://reviews.llvm.org/D93747

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to