http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58153

--- Comment #2 from Kenton Varda <temporal at gmail dot com> ---
> The standard says average case O(1), worst case O(a.size()), so if every
> element in the container has the same key then it's O(n).

I'm not sure that follows.  Yes, the standard says "worst case O(n)", but why
should I expect worst-case performance just because I have lots of entries with
the same key?  In the absence of explicit language to the contrary, it seems
entirely reasonable to expect unordered_multimap<K, V> to perform similarly to
unordered_map<K, list<V>>, whereas in fact it turns out to be more like
unordered_map<K, slist<V>>.
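
To make that concrete, here is a minimal benchmark sketch (my own, not the
testcase from this report; the single key 42 and n = 100000 are arbitrary
choices).  With the grouped representation each duplicate insert is an O(1)
push_back; with the flat multimap, if insert walks the bucket chain to keep
equal keys adjacent, the whole loop degenerates to O(n^2):

#include <chrono>
#include <cstdio>
#include <unordered_map>
#include <vector>

int main() {
  const int n = 100000;  // arbitrary; large enough to show the difference

  // Grouped form: all values for a key share one vector, so inserting
  // another duplicate is an amortized O(1) push_back.
  std::unordered_map<int, std::vector<int>> grouped;
  auto t0 = std::chrono::steady_clock::now();
  for (int i = 0; i < n; ++i)
    grouped[42].push_back(i);
  auto t1 = std::chrono::steady_clock::now();

  // Flat form: each duplicate is a separate node; if the implementation
  // scans the chain to group equal keys, each insert is O(#duplicates).
  std::unordered_multimap<int, int> flat;
  auto t2 = std::chrono::steady_clock::now();
  for (int i = 0; i < n; ++i)
    flat.emplace(42, i);
  auto t3 = std::chrono::steady_clock::now();

  using ms = std::chrono::duration<double, std::milli>;
  std::printf("grouped: %.1f ms\nflat:    %.1f ms\n",
              ms(t1 - t0).count(), ms(t3 - t2).count());
}

The absolute timings will of course vary by machine and library version;
the point is only how the flat loop scales relative to the grouped one.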

I guess the real bug here is with all of the C++ reference sites that fail to
mention this rather large caveat: unordered_multimap supports duplicate keys,
but only as long as there aren't "too many" duplicates.

I think part of the problem might be that people are thinking: "Well, duh, hash
maps perform poorly if lots of keys have the same hash!"  But that's not quite
the same thing.  Hash maps with unique keys perform poorly in that case because
it's necessary to linearly scan through many entries to find the one that
matches the key.  But when looking for a key in a multimap, you stop at the
first match, so even if there are lots of entries with the same key, lookup can
still be O(1).  So I don't think it's intrinsically true that a hashtable-based
multimap should perform badly when many entries have the same key.
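
To illustrate, here is a toy separate-chaining multimap (my own sketch, not
libstdc++'s code; the ToyMultimap name and the fixed 16 buckets are invented
for illustration).  Its insert pushes at the head of the chain in O(1), and
its find stops at the first matching entry, so neither operation cares how
many duplicates of the key already exist:

#include <cassert>
#include <functional>
#include <list>
#include <utility>
#include <vector>

struct ToyMultimap {
  std::vector<std::list<std::pair<int, int>>> buckets;
  ToyMultimap() : buckets(16) {}  // fixed bucket count, for illustration

  std::list<std::pair<int, int>>& bucket_for(int key) {
    return buckets[std::hash<int>()(key) % buckets.size()];
  }

  void insert(int key, int value) {
    // O(1): push at the head of the chain.  Equal keys are not kept
    // adjacent, but find() below doesn't need them to be.
    bucket_for(key).emplace_front(key, value);
  }

  const std::pair<int, int>* find(int key) {
    // Stop at the *first* entry whose key matches; how many further
    // duplicates sit behind it never enters into the cost.
    for (const auto& e : bucket_for(key))
      if (e.first == key)
        return &e;
    return nullptr;
  }
};

int main() {
  ToyMultimap m;
  for (int i = 0; i < 100000; ++i)
    m.insert(42, i);              // pile up duplicates of one key
  assert(m.find(42) != nullptr);  // found at the chain head, O(1)
}

Operations like equal_range or count would still pay for the duplicates, of
course, but plain insert and find needn't.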
