On 06/06/2012 11:00 PM, Robert Bradshaw wrote:
On Tue, Jun 5, 2012 at 2:41 PM, Dag Sverre Seljebotn
<d.s.seljeb...@astro.uio.no>  wrote:
Is the goal then to avoid having to have an interning registry?

Yes, and to avoid invoking an expensive hash function at runtime in
order to achieve good distribution.

I don't understand. Compilation of call-sites would always generate a hash. You also need them while initializing/composing the hash table.

But the storage and comparison of the hash rather than an interned string seems orthogonal to that.

If it weren't for the security concern, I would agree with you. But I think Mark and Stefan make a good point. Since you could hand a JIT-ed vtable (potentially the result of "trusted and verified user input") to a Cython function, *all* call-sites should use the full 160 bits.

Interning solves this in a better way, and preserves vtable memory to boot.

A collision registry would work against a security breach but still allow a DoS attack.

Our dependencies are already:

 - md5
 - Pagh99 algorithm

Why not throw in an interning registry as well ;-)

But then the end-result is pretty cool.

Something that hasn't come up so far is that Cython doesn't know the exact
types of external typedefs, so it can't generate the hash at Cythonize-time.
I guess some support for build systems to probe for type sizes and compute
the signature hashes in a separate header file would solve this -- with a
fallback to computing them at runtime during module loading, if you're not using a
supported build system. (But suddenly an interning registry doesn't look so
horrible..)

It all depends on how strict you want to be. It may be acceptable to
let f(int) and f(long) not hash to the same value even if sizeof(int)
== sizeof(long). We could also promote all int types to long or long
long, including extern types (assuming, with a C compile-time check, that
external types declared up to "long" are <= sizeof(long)). Another

Please no, I don't like any of those. We should not make the trouble with external typedefs worse than it already is. (Part of me wants to just declare that Cython is like Go, with no implicit conversions, to avoid inheriting the ugly coercion rules of C anyway...)

option is to let the hash be md5(sig) + hashN(sizeof(extern_arg1),
sizeof(extern_argN)) where hashN is a macro.
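A rough Python sketch of that split, for illustration only: the md5 part depends only on the signature string and could be computed at Cythonize time, while the size-dependent part has to wait until the sizes of the extern types are known. The body of `hash_n` is a made-up stand-in for the hashN macro, and reading "+" as addition modulo 2**128 is an assumption; neither is specified in this thread.

```python
import ctypes
import hashlib


def sig_digest(sig: str) -> int:
    """Size-independent part: md5 of the signature string, as an integer."""
    return int.from_bytes(hashlib.md5(sig.encode("ascii")).digest(), "little")


def hash_n(*sizes: int) -> int:
    """Hypothetical stand-in for hashN: fold sizeof() values into 64 bits.

    The multiply-and-add mixing is an assumption chosen for brevity; it only
    needs to be simple enough to express as a C macro.
    """
    h = 0
    for size in sizes:
        h = (h * 31 + size) & 0xFFFFFFFFFFFFFFFF
    return h


def full_hash(sig: str, *extern_sizes: int) -> int:
    """Combine both parts; "+" is taken to mean addition modulo 2**128."""
    return (sig_digest(sig) + hash_n(*extern_sizes)) % (1 << 128)


# Pretend the first argument of the signature is an external typedef whose
# real size is only known to the C compiler (here: whatever "long" is).
h = full_hash("mymethod:iiZd", ctypes.sizeof(ctypes.c_long))
```

With this shape, the same signature string gives different hashes when an extern type turns out to be 4 bytes on one platform and 8 bytes on another.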

Good idea. Would the following destroy all the nice properties of md5? I guess I wouldn't use it for crypto any longer...:

hash("mymethod:iiZd") =
md5("mymethod") ^ md5("i\x1") ^ md5("i\x2") ^ md5("Z\x3") ^ md5("d\x4")
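To make the construction concrete, here is a minimal Python sketch of the XOR scheme above. The helper names `md5_int` and `xor_sig_hash` are invented for this example, and the signature is assumed to have the form "name:argcodes" with one character per argument, as in "mymethod:iiZd".

```python
import hashlib


def md5_int(data: bytes) -> int:
    """md5 digest of data as a 128-bit integer."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")


def xor_sig_hash(signature: str) -> int:
    """XOR md5(name) with md5(code + position byte) for each argument."""
    name, _, args = signature.partition(":")
    h = md5_int(name.encode("ascii"))
    for pos, code in enumerate(args, start=1):
        # The position byte keeps "ii" from cancelling itself out and makes
        # argument order matter. bytes([pos]) limits this to 255 arguments.
        h ^= md5_int(code.encode("ascii") + bytes([pos]))
    return h


h = xor_sig_hash("mymethod:iiZd")
```

Each term depends on a single argument, so the term for an external typedef could be computed later and XORed in without redoing the rest. The price is that the combination is linear in its terms, which fits the guess above that the result should not be used for crypto.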

Dag
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
