dblaikie added a comment. In D122974#3424654 <https://reviews.llvm.org/D122974#3424654>, @JDevlieghere wrote:
> The ConstString/StringPool is pretty hot so I'm very excited about making it > faster. > > I'm slightly concerned about the two hash values (the "full" hash vs the > single byte one). That's not something that was introduced in this patch, but > passing it around adds an opportunity to get it wrong. > > I'm wondering if we could wrap this in a struct and pass that around: > > struct HashedStringRef { > const llvm::StringRef str; > const unsigned full_hash; > const uint8_t hash; > HashedStringRef(llvm::StringRef s) : str(s), full_hash(djbHash(str)), > hash(hash(str)) {} > } > > It looks like we always need both except in the StringMap implementation, but > if I'm reading the code correctly we'll have constructed it in ConstString > already anyway? I'm sort of with you here - I don't think this API feature needs to affect lldb much - the scope of these externally-computed hashes is small, but the StringMap API could be more robust/less error prone by having a struct like this. Specifically `StringMap::hash` could/should return an immutable struct that contains the `StringRef` and the hash (not that that's bulletproof - the data beneath the `StringRef` could still be modified externally - but the same is true of any hashing data structure - the keys might be shallow or have mutable state, etc) - and that immutable object must be passed back to the hash-providing insertion functions. (these functions could/should probably still then assert that the hash is correct in case of that underlying mutation - though someone could argue that's overkill and I'd be open to having that discussion). By immutable I mean that the caller can't go and modify either the `StringRef` or hash to make these out of sync. These handles are probably still copy assignable. The external code shouldn't be able to create their own (ctor private/protected, etc) - the only way to get one is to call `StringMap` to produce one. ================ Comment at: lldb/source/Utility/ConstString.cpp:107-109 llvm::sys::SmartScopedReader<false> rlock(m_string_pools[h].m_mutex); - auto it = m_string_pools[h].m_string_map.find(string_ref); + auto it = m_string_pools[h].m_string_map.find(string_ref, string_hash); if (it != m_string_pools[h].m_string_map.end()) ---------------- total aside, but it might be nice to refactor `m_string_pools[h]` out into a named variable here - accessing it 3 times and in a context where it's super important that it's the same thing in all 3 places, seems like it'd be easier to read with a named variable - might make it clear which things need to happen under the lock and which ones don't, etc. (oh, and the 4th and 5th use a few lines down - and I think that'd be the only uses of `h`, so `h` could go away and `hash` could be called inline in the array index instead, maybe?) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122974/new/ https://reviews.llvm.org/D122974 _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits