On Thu, Jul 2, 2026 at 2:21 AM Nathan Myers <[email protected]> wrote:
> The hash tables std::unordered_set, _map, _multiset, _multimap
> as implemented take time linear in the size of the bucket array
> (because memset), rather than in the number of elements stored,
> in violation of Standard requirements. With this patch it
> performs work only for the elements present, but only for a very
> sparsely-populated table. With a bucket loading above 0.5%, the
> status-quo method tests faster, so the new method is used only
> when less.
>
Could you post your test results?
>
> It turns out libc++ also zeroes its bucket table on clear(), like
> libstdc++, but uses a regular loop. Performance differences seem
> to depend on memory organization.
>
> libstdc++-v3/Changelog:
> PR libstdc++/67922
> * include/bits/hashtable.h (clear): Special-case minimal
> population.
> ---
> libstdc++-v3/include/bits/hashtable.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/libstdc++-v3/include/bits/hashtable.h
> b/libstdc++-v3/include/bits/hashtable.h
> index eff6c31d827..d9eb0fe45e3 100644
> --- a/libstdc++-v3/include/bits/hashtable.h
> +++ b/libstdc++-v3/include/bits/hashtable.h
> @@ -2778,6 +2778,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
> clear() noexcept
> {
> + if (this->size() < this->bucket_count() / 200)
> + {
> + this->erase(this->begin(), this->end());
> + return;
> + }
> this->_M_deallocate_nodes(_M_begin());
> std::fill_n(_M_buckets, _M_bucket_count, nullptr);
> _M_element_count = 0;
> --
> 2.54.0
>
>