[Python-Dev] Changing the order of iteration over a dictionary

Mark Shannon Fri, 20 Jan 2012 02:50:55 -0800

Hi,

One of the main sticking points over possible fixes for the hash-collision security issue seems to be a fear that changing the iteration order of a dictionary will break backwards compatibility.

The order of iteration has never been specified. In fact, not only is it arbitrary, it cannot be determined from the contents of a dict alone; it may depend on the insertion order.

Changing a hash function is not the only change that will change the iteration order; any of the following will also do so:

* Changing the minimum size of a dict.
* Changing the load factor of a dict.
* Changing the resizing policy of a dict.
* Sharing of keys between dicts.
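
To make the points above concrete, here is a minimal sketch of a toy open-addressing table with linear probing. It is not CPython's dict (ToyTable and order are hypothetical names invented for this illustration, and it never resizes), but it shows how the same keys can iterate in a different order depending on insertion order or on the table size alone. It assumes that small non-negative ints hash to themselves, as they do in CPython.

```python
class ToyTable:
    """Toy open-addressing hash table with linear probing (illustration only)."""

    def __init__(self, size=8):
        # Fixed-size table; this sketch never resizes, so keep it under-full.
        self.slots = [None] * size

    def insert(self, key):
        i = hash(key) % len(self.slots)
        # Linear probing: step to the next slot until a free one is found.
        while self.slots[i] is not None:
            if self.slots[i] == key:
                return  # key already present
            i = (i + 1) % len(self.slots)
        self.slots[i] = key

    def __iter__(self):
        # Iteration simply walks the slots, as a hash table naturally would.
        return (k for k in self.slots if k is not None)


def order(keys, size=8):
    """Insert keys into a fresh ToyTable and return its iteration order."""
    table = ToyTable(size)
    for k in keys:
        table.insert(k)
    return list(table)


# 1 and 9 collide in a table of size 8, so insertion order decides who
# gets the first slot: same contents, different iteration order.
print(order([1, 9]), order([9, 1]))

# Changing only the table size changes the order of the same keys.
print(order([3, 9], size=8), order([3, 9], size=16))
```

In this sketch nothing about the hash function changes between runs; the collision handling and the table size are enough to reorder iteration.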

By treating iteration order as part of the API we are effectively ruling out ever making any improvements to the dict.


For example, my new dictionary implementation
https://bitbucket.org/markshannon/hotpy_new_dict/

reduces memory use by 47% for gcbench, and by about 20% for the 2to3 benchmark, on my 32-bit machine.

(Nice graphs: http://tinyurl.com/7qd2nnm http://tinyurl.com/6uqvl2x )

The new dict implementation (necessarily) changes the iteration order and will break code that relies on it.

If dict iteration order is to be treated as part of the API (and I think that is a very bad idea) then it should be documented, which will be difficult since it is barely deterministic. This will also be a major problem for PyPy, Jython and IronPython, as they will have to reimplement their dicts.


So, don't be afraid to change that hash function :)

Cheers,
Mark
