I was taking a look at dictobject.c and realized that the logic controlling whether a resizedict will occur in dict_set_item_by_hash_or_entry disallows for the shrinking of a dictionary. This is contrary to what the comments directly above say:
(http://hg.python.org/cpython/file/f3032825f637/Objects/dictobject.c#l771) 771 /* If we added a key, we can safely resize. Otherwise just return! 772 * If fill >= 2/3 size, adjust size. Normally, this doubles or 773 * quaduples the size, but it's also possible for the dict to shrink 774 * (if ma_fill is much larger than ma_used, meaning a lot of dict 775 * keys have been * deleted). The "bug" occures in the following conditional since we exit out of the function without checking the relative magnitudes of ma_filled to ma_used. Instead, we only check if we still have a correct loading factor (and the "don't resize on modification" bit). This can be fixed by changing the following conditional on line 785 to: if (mp->ma_used <= n_used || (mp->ma_fill*3 < (mp->ma_mask+1)*2 && mp->ma_used*5 > mp->ma_fill)) The factor of 5 was chosen arbitrarily... I'm sure with some benchmarking we could tune it to an optimal value for the internal use of dictionaries. However, before I put this effort in I was wondering if this is in fact desired behavior or if it is indeed a bug. At the very least, the comments should be updated to reflect the actual resizing dynamics of the dictionary. Micha ----------------------------- http://micha.gd/ http://github.com/mynameisfiber/ _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com