[PATCH] D93747: Rename debug linkage name with -funique-internal-linkage-names

Hongtao Yu via Phabricator via cfe-commits Wed, 06 Jan 2021 12:00:13 -0800

hoy added a comment.

In D93747#2482730 <https://reviews.llvm.org/D93747#2482730>, @tmsriram wrote:


> In D93747#2482726 <https://reviews.llvm.org/D93747#2482726>, @hoy wrote:
>
>> In D93747#2482519 <https://reviews.llvm.org/D93747#2482519>, @tmsriram wrote:
>>
>>> In D93747#2481494 <https://reviews.llvm.org/D93747#2481494>, @dblaikie 
>>> wrote:
>>>
>>>> In D93747#2481383 <https://reviews.llvm.org/D93747#2481383>, @hoy wrote:
>>>>
>>>>> In D93747#2481304 <https://reviews.llvm.org/D93747#2481304>, @tmsriram 
>>>>> wrote:
>>>>>
>>>>>> In D93747#2481223 <https://reviews.llvm.org/D93747#2481223>, @tmsriram 
>>>>>> wrote:
>>>>>>
>>>>>>> In D93747#2481163 <https://reviews.llvm.org/D93747#2481163>, @dblaikie 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> In D93747#2481095 <https://reviews.llvm.org/D93747#2481095>, @hoy 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> In D93747#2481073 <https://reviews.llvm.org/D93747#2481073>, 
>>>>>>>>> @dblaikie wrote:
>>>>>>>>>
>>>>>>>>>> Suggesting that gdb isn't using the DW_AT_name at all for "break 
>>>>>>>>>> <function name>" but instead demangling and stripping off the extras 
>>>>>>>>>> from the linkage name, and since it can't demangle this uniquified 
>>>>>>>>>> name (unlike the mangled name used when using the overloadable 
>>>>>>>>>> attribute) that degrades the debugger user experience? I'd have to 
>>>>>>>>>> test that in more detail/with some hand-hacked DWARF.
>>>>>>>>>
>>>>>>>>> Yes, I think in your case the linage name can be demangled by the 
>>>>>>>>> debugger. In my previous experiment, the uniquefied names could not 
>>>>>>>>> be demangled therefore I was not able to breakpoint.
>>>>>>>>
>>>>>>>> Ah, did some more testing. Seems it's because it isn't a classically 
>>>>>>>> mangled name.
>>>>>>
>>>>>> The simplest fix I can think of is to emit the base 10 number of the 
>>>>>> md5hash.  That would preserve all the existing uniqueness and be 
>>>>>> demangler friendly.  @hoy @dblaikie  WDYT?
>>>>>
>>>>> I think using the base 10 form of md5hash is a smart workaround. Thanks 
>>>>> for the suggestion.
>>>>
>>>> Sure - wouldn't mind having some wording from the Itanium ABI/mangling 
>>>> rules that explain/corroborate what we've seen from testing to ensure we 
>>>> are using it correctly and there aren't any other ways that might be more 
>>>> suitable, or lurking gotchas we haven't tested.
>>>
>>> Yes, makes sense. Also, like @dblaikie  pointed out in D94154 
>>> <https://reviews.llvm.org/D94154> it is now important that we don't 
>>> generate the DW_AT_linkage_name for C style names using this suffix as it 
>>> will not demangle.  I think this makes it important that we only update 
>>> this field if it already exists.
>>
>> Yes, it makes sense to do the renaming only if the linkage name preexists to 
>> not break the debuggability. Unfortunately we won't be able to support C 
>> with this patch.
>
> There is nothing needed to support C right?  overloadable attribute will 
> mangle with _Z so it will work as expected?  What did I miss?

I mean from the AutoFDO point of view, C programs with static functions are not 
getting the unique debug name support.

>>>>>>>> It might be possible for this uniquifying step to check if the name is 
>>>>>>>> mangled (does it start with "_Z") and if it isn't mangled, it could 
>>>>>>>> mangle it (check the length of the name, then "_Z<length><name>v") and 
>>>>>>>> add the unique suffix. That looks to me like it preserves the original 
>>>>>>>> debugging behavior?
>>>>>>>>
>>>>>>>> (has the problem that we don't actually know the mangling scheme at 
>>>>>>>> this point - we do know it up in clang (where it might be Itanium 
>>>>>>>> mangling or Microsoft mangling), might point again to the possibility 
>>>>>>>> this feature is more suitable to implement in the frontend rather than 
>>>>>>>> a middle end pass. And also the 'v' in the mangling is 'void return 
>>>>>>>> type', which is a lie - not sure if we could do something better there)
>>>>>
>>>>> Doing name unification in the frontend sounds like the ultimate solution 
>>>>> and since the frontend has all the knowledge about the name mangler. I 
>>>>> think for now we can go with the solution @tmsriram suggested, i.e., 
>>>>> using the base 10 form of md5 hash.
>>>>
>>>> Any general idea of how long "for now" would be? It doesn't seem like 
>>>> putting this off is going to make things especially better & seems like 
>>>> more bug fixes/changes/etc are being built around the solution as it is at 
>>>> the moment. So I'd be hesitant to put off this kind of restructuring too 
>>>> long.
>>
>> I'm not sure. It looks like implementation in the front end has more 
>> complexity as discussed in the other thread D94154 
>> <https://reviews.llvm.org/D94154>.
>>
>>>>>>>> I think it's pretty important that we don't break tradeoff 
>>>>>>>> debuggability for profile accuracy. It'll make for a difficult 
>>>>>>>> tradeoff for our/any users.
>>>>>>>
>>>>>>> Agreed, I didn't expect this.
>>>>>>>
>>>>>>>> (side note: using the original source file name seems problematic - I 
>>>>>>>> know in google at least, the same source file is sometimes built into 
>>>>>>>> more than one library/form with different preprocessor defines, so 
>>>>>>>> this may not produce as unique a name as you are expecting?)
>>>>>>>
>>>>>>> It was a best effort and I think the hashing can be improved by using 
>>>>>>> more signals other than just the module name if needed.  For hard data 
>>>>>>> though, this does significantly improve performance which clearly comes 
>>>>>>> from better profile attribution so it does something.
>>>>
>>>> I'm OK with the idea that this helped the situation - but it doesn't seem 
>>>> very principled/rigorous. Rather than a sliding scale of "better" I think 
>>>> it'd be worth considering in more detail what would be required to make 
>>>> this correct.
>>>>
>>>> Using the object file name may be more robust - also not perfect for all 
>>>> build systems (one could imagine a distributed build system that /might/ 
>>>> be able to get away with having the distributed builders always build a 
>>>> file named x.o - only to have the distribution system rename the file to 
>>>> its canonical name on the main machine before linking occurs - but I think 
>>>> that's not how most distributed build systems work, and most build systems 
>>>> provide a unique object file name across a build?) but maybe moreso than 
>>>> using the input file name? (at least I think for google/blaze the object 
>>>> filename is genuinely unique)
>>>
>>> So we are using the full path name of the source.  We could also combine 
>>> the object file name into the hash to make it more robust. I can work on 
>>> this patch independently if that is alright.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93747/new/

https://reviews.llvm.org/D93747

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93747: Rename debug linkage name with -funique-internal-linkage-names

Reply via email to