dblaikie added a comment.

In D122974#3424654 <https://reviews.llvm.org/D122974#3424654>, @JDevlieghere 
wrote:

> The ConstString/StringPool is pretty hot so I'm very excited about making it 
> faster.
>
> I'm slightly concerned about the two hash values (the "full" hash vs the 
> single byte one). That's not something that was introduced in this patch, but 
> passing it around adds an opportunity to get it wrong.
>
> I'm wondering if we could wrap this in a struct and pass that around:
>
>   struct HashedStringRef {
>     const llvm::StringRef str;
>     const unsigned full_hash;
>     const uint8_t hash; 
>     HashedStringRef(llvm::StringRef s) : str(s), full_hash(djbHash(str)), 
> hash(hash(str)) {}
>   }
>
> It looks like we always need both except in the StringMap implementation, but 
> if I'm reading the code correctly we'll have constructed it in ConstString 
> already anyway?

I'm sort of with you here - I don't think this API feature needs to affect lldb 
much - the scope of these externally-computed hashes is small, but the 
StringMap API could be more robust/less error prone by having a struct like 
this. Specifically `StringMap::hash` could/should return an immutable struct 
that contains the `StringRef` and the hash (not that that's bulletproof - the 
data beneath the `StringRef` could still be modified externally - but the same 
is true of any hashing data structure - the keys might be shallow or have 
mutable state, etc) - and that immutable object must be passed back to the 
hash-providing insertion functions. (these functions could/should probably 
still then assert that the hash is correct in case of that underlying mutation 
- though someone could argue that's overkill and I'd be open to having that 
discussion).

By immutable I mean that the caller can't go and modify either the `StringRef` 
or hash to make these out of sync. These handles are probably still copy 
assignable. The external code shouldn't be able to create their own (ctor 
private/protected, etc) - the only way to get one is to call `StringMap` to 
produce one.



================
Comment at: lldb/source/Utility/ConstString.cpp:107-109
         llvm::sys::SmartScopedReader<false> rlock(m_string_pools[h].m_mutex);
-        auto it = m_string_pools[h].m_string_map.find(string_ref);
+        auto it = m_string_pools[h].m_string_map.find(string_ref, string_hash);
         if (it != m_string_pools[h].m_string_map.end())
----------------
total aside, but it might be nice to refactor `m_string_pools[h]` out into a 
named variable here - accessing it 3 times and in a context where it's super 
important that it's the same thing in all 3 places, seems like it'd be easier 
to read with a named variable - might make it clear which things need to happen 
under the lock and which ones don't, etc.

(oh, and the 4th and 5th use a few lines down - and I think that'd be the only 
uses of `h`, so `h` could go away and `hash` could be called inline in the 
array index instead, maybe?)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122974/new/

https://reviews.llvm.org/D122974

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to