I was taking a look at dictobject.c and realized that the logic
controlling whether a resizedict will occur in
dict_set_item_by_hash_or_entry disallows for the shrinking of a
dictionary.  This is contrary to what the comments directly above say:

(http://hg.python.org/cpython/file/f3032825f637/Objects/dictobject.c#l771)
771 /* If we added a key, we can safely resize. Otherwise just return!
772 * If fill >= 2/3 size, adjust size. Normally, this doubles or
773 * quaduples the size, but it's also possible for the dict to shrink
774 * (if ma_fill is much larger than ma_used, meaning a lot of dict
775 * keys have been * deleted).

The "bug" occures in the following conditional since we exit out of
the function without checking the relative magnitudes of ma_filled to
ma_used. Instead, we only check if we still have a correct loading
factor (and the "don't resize on modification" bit).

This can be fixed by changing the following conditional on line 785 to:

if (mp->ma_used <= n_used || (mp->ma_fill*3 < (mp->ma_mask+1)*2 &&
mp->ma_used*5 > mp->ma_fill))

The factor of 5 was chosen arbitrarily... I'm sure with some
benchmarking we could tune it to an optimal value for the internal use
of dictionaries.  However, before I put this effort in I was wondering if
this is in fact desired behavior or if it is indeed a bug.  At the
very least, the comments should be updated to reflect the actual
resizing dynamics of the dictionary.

Micha
-----------------------------
http://micha.gd/
http://github.com/mynameisfiber/
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to