On 1/19/2012 8:54 PM, Carl Meyer wrote:
> Hi Victor,
>
> On 01/19/2012 05:48 PM, Victor Stinner wrote:
> [snip]
> > Using a randomized hash may
> > also break (indirectly) real applications because the application
> > output is also somehow "randomized". For example, in the Django test
> > suite, the HTML output is different at each run. Web browsers may
> > render the web page differently, or crash, or ... I don't think that
> > Django would like to sort attributes of each HTML tag, just because we
> > wanted to fix a vulnerability.
>
> I'm a Django core developer, and if it is true that our test-suite has a
> dictionary-ordering dependency that is expressed via HTML attribute
> ordering, I consider that a bug and would like to fix it. I'd be
> grateful for, not resentful of, a change in CPython that revealed the
> bug and prompted us to fix it. (I presume that it is true, as it sounds
> like you experienced it directly; I don't have time to play around at
> the moment, but I'm surprised we haven't seen bug reports about it from
> users of 64-bit Pythons long ago). I can't speak for the core team, but
> I doubt there would be much disagreement on this point: ideally Django
> would run equally well on any implementation of Python, and as far as I
> know none of the alternative implementations guarantee hash or
> dict-ordering compatibility with CPython.
>
> I don't have the expertise to speak otherwise to the alternatives for
> fixing the collisions vulnerability, but I don't believe it's accurate
> to presume that Django would not want to fix a dict-ordering dependency,
> and use that as a justification for one approach over another.
>
> Carl
It might be a good idea to have a way to seed the hash with a chosen
value, so that tests can be run against different dict orderings. Tests
developed on one Python implementation could then be made immune to the
different orderings found on other implementations.
However, randomizing the hash does not solve the problem for
long-running applications, and it also makes performance
non-deterministic from one run to the next, even with exactly the same
data. A different (random) seed could sporadically cause collisions with
data that usually performs well, and there would be little explanation
for it and little way to reproduce the problem in order to report or
understand it.
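
As a rough illustration of the seeding idea: CPython releases newer than
this thread (3.2.3 and later, as far as I know) read a PYTHONHASHSEED
environment variable, so a fixed seed gives a reproducible ordering and
several seeds give several orderings to test against. The sketch below
assumes such an interpreter; the helper name order_for_seed and the
sample attribute names are made up for the example.

```python
import os
import subprocess
import sys

# Child program: print the iteration order of a small set of strings.
# The attribute names are arbitrary sample data.
CHILD = "print(list({'id', 'class', 'href', 'title', 'style', 'name'}))"


def order_for_seed(seed):
    """Run CHILD in a fresh interpreter with the given string-hash seed.

    Hypothetical helper: the seed has to be set before the interpreter
    starts, which is why a subprocess is used instead of changing
    os.environ in the current process.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The same seed should print the same order every time; different
    # seeds will usually (not always) print different orders.
    for seed in (1, 1, 2, 3):
        print(seed, order_for_seed(seed))
```

Running a test suite once per seed in this way would expose an ordering
dependency, and the seed that triggered a failure can be reused to
reproduce it.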