dblaikie added a comment.
In D122974#3424654 <https://reviews.llvm.org/D122974#3424654>, @JDevlieghere
wrote:
> The ConstString/StringPool is pretty hot so I'm very excited about making it
> faster.
>
> I'm slightly concerned about the two hash values (the "full" hash vs the
> single byte one). That's not something that was introduced in this patch, but
> passing it around adds an opportunity to get it wrong.
>
> I'm wondering if we could wrap this in a struct and pass that around:
>
> struct HashedStringRef {
> const llvm::StringRef str;
> const unsigned full_hash;
> const uint8_t hash;
> HashedStringRef(llvm::StringRef s) : str(s), full_hash(djbHash(str)),
> hash(hash(str)) {}
> }
>
> It looks like we always need both except in the StringMap implementation, but
> if I'm reading the code correctly we'll have constructed it in ConstString
> already anyway?
I'm sort of with you here - I don't think this API feature needs to affect lldb
much - the scope of these externally-computed hashes is small, but the
StringMap API could be more robust/less error prone by having a struct like
this. Specifically `StringMap::hash` could/should return an immutable struct
that contains the `StringRef` and the hash (not that that's bulletproof - the
data beneath the `StringRef` could still be modified externally - but the same
is true of any hashing data structure - the keys might be shallow or have
mutable state, etc) - and that immutable object must be passed back to the
hash-providing insertion functions. (these functions could/should probably
still then assert that the hash is correct in case of that underlying mutation
- though someone could argue that's overkill and I'd be open to having that
discussion).
By immutable I mean that the caller can't go and modify either the `StringRef`
or hash to make these out of sync. These handles are probably still copy
assignable. The external code shouldn't be able to create their own (ctor
private/protected, etc) - the only way to get one is to call `StringMap` to
produce one.
================
Comment at: lldb/source/Utility/ConstString.cpp:107-109
llvm::sys::SmartScopedReader<false> rlock(m_string_pools[h].m_mutex);
- auto it = m_string_pools[h].m_string_map.find(string_ref);
+ auto it = m_string_pools[h].m_string_map.find(string_ref, string_hash);
if (it != m_string_pools[h].m_string_map.end())
----------------
total aside, but it might be nice to refactor `m_string_pools[h]` out into a
named variable here - accessing it 3 times and in a context where it's super
important that it's the same thing in all 3 places, seems like it'd be easier
to read with a named variable - might make it clear which things need to happen
under the lock and which ones don't, etc.
(oh, and the 4th and 5th use a few lines down - and I think that'd be the only
uses of `h`, so `h` could go away and `hash` could be called inline in the
array index instead, maybe?)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D122974/new/
https://reviews.llvm.org/D122974
_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits