On 06/05/2012 08:02 PM, mark florisson wrote:
On 5 June 2012 18:09, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
On 06/05/2012 07:01 PM, Dag Sverre Seljebotn wrote:
On 06/05/2012 09:25 AM, Stefan Behnel wrote:
Dag Sverre Seljebotn, 04.06.2012 21:44:
This can cause crashes/stack smashes
etc. if there are lower-64bit-of-md5 collisions, but a) the
probability is incredibly small, b) it would only matter in
situations that should cause an AttributeError anyway, c) if we
really care, we can always use an interning-like mechanism to
validate on module loading that its hashes don't collide with
other hashes (and raise an exception "Congratulations, you've
discovered a phenomenal md5 collision, get in touch with cython
devs and we'll work around it right away").
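
A minimal sketch of the scheme described above, for illustration only: hash a
signature string to the lower 64 bits of its MD5 digest, and validate at
registration time that no two different signatures share a hash. The names
`sig_hash64` and `SignatureRegistry`, the signature string format, and the
choice of which digest bytes count as the "lower" 64 bits are all assumptions,
not anything Cython defines.

```python
import hashlib


def sig_hash64(signature: str) -> int:
    """Return a 64-bit hash taken from the MD5 digest of a signature string."""
    digest = hashlib.md5(signature.encode("utf-8")).digest()
    # Assumption: "lower 64 bits" means the last 8 bytes of the digest,
    # read as a big-endian integer.
    return int.from_bytes(digest[-8:], "big")


class SignatureRegistry:
    """Hypothetical interning-like table that rejects colliding signatures."""

    def __init__(self):
        self._by_hash = {}

    def register(self, signature: str) -> int:
        h = sig_hash64(signature)
        # setdefault stores the signature on first sight and returns the
        # previously stored one otherwise.
        existing = self._by_hash.setdefault(h, signature)
        if existing != signature:
            raise RuntimeError(
                f"64-bit hash collision between {existing!r} and {signature!r}"
            )
        return h
```

A module loader could call `register()` once per exported signature, so a
collision would surface as an exception at import time and not as a crash at
call time.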
I'm not a big fan of such an attitude. If this happens at runtime, it can
induce any cost from cheap-at-test-time to
hugely-expensive-in-production.
Thinking with my evil hat on, this can potentially be data triggered from
the outside (e.g. if a JIT compiler is involved at one end), thus possibly
even leading to a security hole.
We should try to produce software that others can build a business on.
Well, I'd build a business on something that fails with a 5e-7
probability any day :-) (given that you trust my estimates in the other
post; I think they were rather conservative myself)
This was put the wrong way. The chance was 5e-7 that it would fail for
anybody over the course of human history (and that was a rather pessimistic
estimate).
So a more "individual tack":
Assume that the process contains 200 MB of method definitions alone, with
each method definition being an 8-character string. (That should mean the
executable is several gigabytes :-))
That puts the probability of that process containing a 64-bit hash
collision at 10^-34.
Dag
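
One way to sanity-check estimates of this kind is the standard birthday
approximation, p ≈ 1 - exp(-n(n-1)/2^(b+1)), for n items hashed uniformly into
b bits. The sketch below is only that: the item count (200 MB divided by 8
bytes per definition) follows the figures above, but the "any pair collides"
model and the function name are assumptions, and this may not be the model
behind the figure quoted above.

```python
import math


def birthday_collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate probability that any two of n_items uniformly
    distributed hash values collide in a hash_bits-bit space."""
    # Expected number of colliding pairs: C(n, 2) / 2**bits.
    expected_pairs = n_items * (n_items - 1) / 2 / 2**hash_bits
    # 1 - exp(-x), computed via expm1 to stay accurate for tiny x.
    return -math.expm1(-expected_pairs)


# Assumed inputs: 200 MB of 8-byte definitions, hashed to 64 bits.
n_definitions = 200_000_000 // 8
p_any_pair = birthday_collision_probability(n_definitions, 64)
```

The result depends strongly on what is counted: any pair colliding within one
process is a far more likely event than one specific caller/callee pair
colliding.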
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
The point is not so much running into this problem accidentally, but
maliciously. If user input from untrusted users can somehow determine
the function signatures that are generated and called by a JIT, then a
malicious user can find collisions offline and cause some fault in a
valid user program.
This took me a while to understand. So the idea is that you're in a
completely managed environment (like Java), and you want to run
untrusted code and have it not segfault or smash the stack. Eve then
cleverly assembles a caller/callee pair with mismatching signatures but
the same hash.
Yes, in that situation 64 bits is perhaps not enough.
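
To make the offline attack concrete, here is a toy sketch that finds two
different signature strings with the same truncated MD5 hash. It uses a 16-bit
truncation so that it finishes instantly; a comparable birthday search against
a 64-bit hash needs on the order of 2^32 hash evaluations, which is well
within reach of an attacker. The signature format and function names are
made up for the example.

```python
import hashlib
from itertools import count


def truncated_hash(signature: str, bits: int) -> int:
    """Return the lowest `bits` bits of the MD5 digest of a signature."""
    digest = hashlib.md5(signature.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") & ((1 << bits) - 1)


def find_collision(bits: int):
    """Birthday search: generate distinct signatures until two share a hash."""
    seen = {}
    for i in count():
        sig = f"f{i}(int)"  # hypothetical signature strings, all distinct
        h = truncated_hash(sig, bits)
        if h in seen:
            return seen[h], sig
        seen[h] = sig


# With 16 bits, a collision is guaranteed within 2**16 + 1 attempts and
# typically shows up after a few hundred.
caller_sig, callee_sig = find_collision(16)
```

A real attack would additionally need the two colliding strings to be valid
signatures with different calling conventions, which this toy does not
attempt.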
But is this relevant to what we're trying to do here? We're discussing
APIs to talk between Python C extension modules that already have
unlimited powers. I'd think a "managed Cython" would be such a large
change that one could easily change the hash size at that point?
But I agree it's not as easily written off as I thought.
Dag