In http://mail.python.org/pipermail/python-dev/2011-December/115138.html, Christian Heimes pointed out that
> ... we don't have to alter the outcome of hash ... We just need to reduce the > chance that > an attacker can produce collisions in the dict (and set?) I'll state it more strongly. hash probably should not change (at least for this), but we may want to consider a different conflict resolution strategy when the first slot is already filled. Remember that there was a fair amount of thought and timing effort put into selecting the current strategy; it is deliberately sub-optimal for random input, in order to do better with typical input. < http://hg.python.org/cpython/file/7010fa9bd190/Objects/dictnotes.txt > If there is a change, it would currently be needed in three places for each of set and dict (the lookdict functions and insertdict_clean). It may be worth adding some macros just to keep those six in sync. Once those macros are in place, that allows a compile-time switch. My personal opinion is that accepting *and parsing* enough data for this to be a problem is enough of an edge case that I don't want normal dicts slowed down at all for this; I would therefore prefer that the change be restricted to such a compile-time switch, with current behavior the default. http://hg.python.org/cpython/file/7010fa9bd190/Objects/dictobject.c#l571 583 for (perturb = hash; ep->me_key != NULL; perturb >>= PERTURB_SHIFT) { 584 i = (i << 2) + i + perturb + 1; PERTURB_SHIFT is already a private #define to 5; per dictnotes, 4 and 6 perform almost as well. Someone worried can easily make that change today, and be protected from "generic" anti-python attacks. I believe the salt suggestions have equivalent to replacing perturb = hash; with something like perturb = hash + salt; Changing i = (i << 2) + i + perturb + 1; would allow effectively replacing the initial hash, but risks spoiling performance in the non-adversary case. Would there be objections to replacing those two lines with something like: for (perturb = FIRST_PERTURB(hash, key); ep->me_key != NULL; perturb = NEXT_PERTURB(hash, key, perturb)) { i = NEXT_SLOT(i, perturb); The default macro definitions should keep things as they are #define FIRST_PERTURB(hash, key) hash #define NEXT_PERTURB(hash, key, perturb) perturb >> PERTURB_SHIFT #define NEXT_SLOT(i, perturb) (i << 2) + i + perturb + 1 while allowing #ifdefs for (slower but) safer things like adding a salt, or even using alternative hashes. -jJ _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com