On 05/31/2012 10:13 PM, Robert Bradshaw wrote:
On Thu, May 31, 2012 at 12:29 PM, Dag Sverre Seljebotn
<d.s.seljeb...@astro.uio.no>  wrote:
On 05/31/2012 08:50 PM, Robert Bradshaw wrote:

On Thu, May 31, 2012 at 7:04 AM, Dag Sverre Seljebotn
<d.s.seljeb...@astro.uio.no>    wrote:

[Discussion on numfo...@googlegroups.com please]

I've uploaded a draft-state SEP 201 (previously CEP 1000):

https://github.com/numfocus/sep/blob/master/sep201.rst

"""
Many callable objects are simply wrappers around native code. This holds
for
any Cython function, f2py functions, manually written CPython extensions,
Numba, etc.

Obviously, when native code calls other native code, it would be nice to
skip the significant cost of boxing and unboxing all the arguments.
"""


The thread about this on the Cython list is almost endless:

http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443

There was a long discussion on the key-comparison vs. interned-string
approach. I've written both up in SEP 201 since it was the major point of
contention. There was some benchmarks starting here:

http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443

And why provide a table and not a get_function_pointer starting here:

http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443

For those who followed that and don't want to read the entire spec, the
aspect of flags is new. How do we avoid to duplicate entries/check
against
two signatures for cases like a GIL-holding caller wanting to call a
nogil
function? My take: For key-comparison you can compare under a mask, for
interned-string we should have additional flags field.

The situation is a bit awkward: The Cython list consensus (well, me and
Robert Bradshaw) decided on what is "Approach 1" (key-comparison) in SEP
201. I pushed for that.

Still, now that a month has passed, I just think key-comparison is too
ugly,
and that the interning mechanism shouldn't be *that* hard to code up,
probably 500 lines of C code if one just requires the GIL in a first
iteration, and that keeping the spec simpler is more important.

So I'm tentatively proposing Approach 2.


I'm still not convinced that a hybrid approach, where signatures below
some cutoff are compiled down to keys, is not a worthwhile approach.
This gets around variable-length keys (both the complexity and
possible runtime costs for long keys) and allows simple libraries to
produce and consume fast callables without participating in the
interning mechanism.

I still think this gives us the "worst of both worlds", all the
disadvantages and none of the advantages.

It avoids the one of the primary disadvantage of keys, namely the
variable length complexity.

How many simple libraries are there really? Cython on one end, the
magnificently complicated NumPy ufuncs on the other? Thinking big, perhaps
PyPy and Julia? Cython, PyPy, Julia would all have to deal with long
signatures anyway. And NumPy ufuncs are already complicated so even more
low-level stuff wouldn't hurt.

I was thinking of, for example, a differential equation solver written
in C, C++, or Fortran that could take a PyNativeCallableTable*
directly, primarily avoiding welding this spec to Python.

I'm not sure how real-world that is in the end. But, the size of Cython generated code would be kept down for most modules as it wouldn't need to bundle an interner.

AND, a problem with interning is spreading the signature strings all over memory (in the event you actually need to look at the contents). With a smart interner I guess this can be eliminated to some extent, but much better if one doesn't have to worry, and if all short signatures are keys you don't.

Playing along:

a) It'd be very nice to avoid explicit decoding. I think one should be able to cast the key to char[]; this a) avoids having to allocate a buffer on the stack to pass to a Decode function, b) let's you inspect the table in a debugger easily.

b) Flags are needed in addition to interning; GIL status and exception return values do not require exact matches. I think more than 3 bits are needed for flags => our minimal padded table entry size is actually 24 bytes! (And this is OK, my benchmarks weren't affected by 8-byte vs 16-byte comparisons, branching is so dominating.)

Now, 16 bits seems about right for flags, so this means we can actually for free use 14-char keys (12 for signature data, one for \0, one for guard)

That pushes the number of non-interning signatures high enough to make it really fit 95% of the use-cases. I feel 6 chars is a little low, remember that a "pointer to a double complex" is "&Zd" by itself unless we play with encoding.

BUT, it then gets rather complicated to have things work on little-endian vs. big-endian though as the guard byte must be in different positions. If you want to align the pointer you get this:

typedef struct {
    void *funcptr;
    union {
        union {
            struct {
                uint16_t interned_flags;
                uint16_t padding1;
                uint32_t padding2;
                uint64_t interned_sig;
            };
            struct {
                uint16_t flags;
                char sig[14];
            };
        } big_endian;

        union {
            struct {
                uint64_t interned_sig;
                uint32_t padding1;
                uint16_t padding2;
                uint16_t interned_flags;
            };
            struct {
                char sig[14];
                uint16_t flags;
            };
        } little_endian;
    };
};

(interned_flags and flags is really the same, I just didn't want to mess with the struct alignment)

So I think this is *almost* there, but it certainly gets complicated because of endianness issues.

Of course, an alternative is to not have the interned_sig be 64-bit aligned. Or, play with adapting the string/guard bytes in the middle, but that sort of breaks a) above.

Thinking about this is psychologically difficult because it's very likely bikeshedding, but OTOH once the spec is out in the wild it will never be worth it to change it so some care is called for... oh well, at least I'm having fun!

Dag
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel

Reply via email to