https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65641
--- Comment #8 from François Dumont <fdumont at gcc dot gnu.org> --- (In reply to Jakub Lopuszanski from comment #6) > > A) Use the lowest bit to indicate the pointer points to the first item in > its bucket. > In addition to Jonathan's feedback there is also this PR pending to open _Hashtable implementation to allocator's fancy pointer type. https://forge.sourceware.org/gcc/gcc-TEST/pulls/32 with this abstraction accessing the pointer 1st bit would be a little bit more complicated. I'm also pointing to this PR cause I wonder how it would behaves on your benchmark. With it you can define _GLIBCXX_USE_ALLOC_PTR_FOR_HASHTABLE=9002 and then benefit from a new node memory layout where cached hash code is just after next node pointer so accessible without cache miss on most platforms. Otherwise have you consider the alternatives rehash policy/range hash policy provided at the moment ? It is std::__detail::_Power2_rehash_policy / std::__detail::_Mask_range_hashing that will avoid any div operation. See testsuite/performance/23_containers/insert_erase/41975.cc to see how to use those. In my attempts to limit this problem I'm on my side exploring other paths like: - Consider that we are still in the same bucket if the next node has the same hash code, it can be done before the modulo operation. That's a nice optimization for users providing bad hash functor which is quite common I think. - Consider that we moved to a different bucket if next node is Bucket+1 before begin node. This last optimization can work only if hash functor is pretty good and hashtable container properly sized to have a load factor close to 1.
