[Lldb-commits] [PATCH] D122974: prevent ConstString from calling djbHash() more than once

Pavel Labath via Phabricator via lldb-commits Fri, 29 Apr 2022 07:36:00 -0700

labath added a comment.

In D122974#3482540 <https://reviews.llvm.org/D122974#3482540>, @llunak wrote:


> In D122974#3481852 <https://reviews.llvm.org/D122974#3481852>, @labath wrote:
>
>> In D122974#3481269 <https://reviews.llvm.org/D122974#3481269>, @llunak wrote:
>>
>>> But what if I don't want StringMap to compute the hash at all? There's 
>>> still that 10-15% of CPU time spent in djbHash() when debug info uses 
>>> exactly[*] that (and LLDB's index cache could be modified to store it). 
>>> Which means it can be useful for StringMap to allow accepting precomputed 
>>> hash, and then what purpose will that HashedStringRef have?
>>
>> I think that David's point was that you would use a (probably static) 
>> StringMap method to produce the hash object, and then you use the hash 
>> object to select the correct map, and also to insert the key in the map. 
>> Something like:
>
> ...
>
>> That should only produce one hash computation, right?
>
> Yes, it would, but that's not my point. If StringMap insists on computing the 
> hash itself and doesn't allow outside code to provide it, then that prevents 
> the possible LLDB optimization saving the remaing 10-15% startup CPU cost by 
> not computing the hash at all. If StringMap allows that, then this 
> HashedStringRef idea can be easily circumvented by passing the hash directly, 
> so it seems rather pointless. And the relevant LLDB code from this change is 
> so small that creating HashedStringRef just for that seems like an overkill.

Interesting. I don't know if I missed this somewhere, but could explain what 
kind of a map operation can lldb perform without computing the hash at least 
once?

>>> (*) Well, not exactly, the seed is different. Doesn't seem an impossible 
>>> problem to solve though.
>>
>> I think that's because it's trying to produce a value that is not correlated 
>> to the value used inside the individual objects (so that you don't overload 
>> some buckets of the maps while keeping others empty). However, there are 
>> other ways to achieve that. Since StringMap is always using a power of two 
>> for the number of buckets, I think it would be sufficient to use an odd 
>> number of StringMap instances (say 255). Using the modulo operation to 
>> select the StringMap should select spread out the complete hashes evenly 
>> among to individual maps.
>
> By different seed I meant that DWARF and StringMap use different seed for DJB 
> hash. And that's only because LLVM code historically used the worse 0 seed 
> and some tests rely on that (7b319df29bf4ebe690ca0c41761e46d8b0081293 
> <https://reviews.llvm.org/rG7b319df29bf4ebe690ca0c41761e46d8b0081293>). But 
> it seems that LLDB doesn't even read .debug_names entries into memory, so 
> this having anything to do with DWARF debuginfo is actually moot, it'd only 
> matter for LLDB index cache, which could use any hash it wants.
>
> But you're possibly talking about something unrelated here.

Yeah, I thought that the we used a different seed (or hash?) when computing the 
StringMap index -- I believe that was the case at some point, but I guess it 
was changed since then...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122974/new/

https://reviews.llvm.org/D122974

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D122974: prevent ConstString from calling djbHash() more than once

Reply via email to