On 06/01/2012 03:49 PM, Dag Sverre Seljebotn wrote:
On 05/31/2012 10:13 PM, Robert Bradshaw wrote:
On Thu, May 31, 2012 at 12:29 PM, Dag Sverre Seljebotn
<d.s.seljeb...@astro.uio.no> wrote:
On 05/31/2012 08:50 PM, Robert Bradshaw wrote:
On Thu, May 31, 2012 at 7:04 AM, Dag Sverre Seljebotn
<d.s.seljeb...@astro.uio.no> wrote:
[Discussion on numfo...@googlegroups.com please]
I've uploaded a draft-state SEP 201 (previously CEP 1000):
https://github.com/numfocus/sep/blob/master/sep201.rst
"""
Many callable objects are simply wrappers around native code. This
holds
for
any Cython function, f2py functions, manually written CPython
extensions,
Numba, etc.
Obviously, when native code calls other native code, it would be
nice to
skip the significant cost of boxing and unboxing all the arguments.
"""
The thread about this on the Cython list is almost endless:
http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
There was a long discussion on the key-comparison vs. interned-string
approach. I've written both up in SEP 201 since it was the major
point of
contention. There was some benchmarks starting here:
http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
And why provide a table and not a get_function_pointer starting here:
http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
For those who followed that and don't want to read the entire spec,
the
aspect of flags is new. How do we avoid to duplicate entries/check
against
two signatures for cases like a GIL-holding caller wanting to call a
nogil
function? My take: For key-comparison you can compare under a mask,
for
interned-string we should have additional flags field.
The situation is a bit awkward: The Cython list consensus (well, me
and
Robert Bradshaw) decided on what is "Approach 1" (key-comparison)
in SEP
201. I pushed for that.
Still, now that a month has passed, I just think key-comparison is too
ugly,
and that the interning mechanism shouldn't be *that* hard to code up,
probably 500 lines of C code if one just requires the GIL in a first
iteration, and that keeping the spec simpler is more important.
So I'm tentatively proposing Approach 2.
I'm still not convinced that a hybrid approach, where signatures below
some cutoff are compiled down to keys, is not a worthwhile approach.
This gets around variable-length keys (both the complexity and
possible runtime costs for long keys) and allows simple libraries to
produce and consume fast callables without participating in the
interning mechanism.
I still think this gives us the "worst of both worlds", all the
disadvantages and none of the advantages.
It avoids the one of the primary disadvantage of keys, namely the
variable length complexity.
How many simple libraries are there really? Cython on one end, the
magnificently complicated NumPy ufuncs on the other? Thinking big,
perhaps
PyPy and Julia? Cython, PyPy, Julia would all have to deal with long
signatures anyway. And NumPy ufuncs are already complicated so even more
low-level stuff wouldn't hurt.
I was thinking of, for example, a differential equation solver written
in C, C++, or Fortran that could take a PyNativeCallableTable*
directly, primarily avoiding welding this spec to Python.
I'm not sure how real-world that is in the end. But, the size of Cython
generated code would be kept down for most modules as it wouldn't need
to bundle an interner.
AND, a problem with interning is spreading the signature strings all
over memory (in the event you actually need to look at the contents).
With a smart interner I guess this can be eliminated to some extent, but
much better if one doesn't have to worry, and if all short signatures
are keys you don't.
Playing along:
a) It'd be very nice to avoid explicit decoding. I think one should be
able to cast the key to char[]; this a) avoids having to allocate a
buffer on the stack to pass to a Decode function, b) let's you inspect
the table in a debugger easily.
b) Flags are needed in addition to interning; GIL status and exception
return values do not require exact matches. I think more than 3 bits are
needed for flags => our minimal padded table entry size is actually 24
bytes! (And this is OK, my benchmarks weren't affected by 8-byte vs
16-byte comparisons, branching is so dominating.)
Now, 16 bits seems about right for flags, so this means we can actually
for free use 14-char keys (12 for signature data, one for \0, one for
guard)
That pushes the number of non-interning signatures high enough to make
it really fit 95% of the use-cases. I feel 6 chars is a little low,
remember that a "pointer to a double complex" is "&Zd" by itself unless
we play with encoding.
BUT, it then gets rather complicated to have things work on
little-endian vs. big-endian though as the guard byte must be in
different positions. If you want to align the pointer you get this:
typedef struct {
void *funcptr;
union {
union {
struct {
uint16_t interned_flags;
uint16_t padding1;
uint32_t padding2;
uint64_t interned_sig;
};
struct {
uint16_t flags;
char sig[14];
};
} big_endian;
union {
struct {
uint64_t interned_sig;
uint32_t padding1;
uint16_t padding2;
uint16_t interned_flags;
};
struct {
char sig[14];
uint16_t flags;
};
} little_endian;
};
};
(interned_flags and flags is really the same, I just didn't want to mess
with the struct alignment)
So I think this is *almost* there, but it certainly gets complicated
because of endianness issues.
Of course, an alternative is to not have the interned_sig be 64-bit
aligned. Or, play with adapting the string/guard bytes in the middle,
but that sort of breaks a) above.
OK, now I feel silly.
If we need flags (as I believe we do), and the flag-containing quadword
is being compared anyway, there's no reason at all to play tricks with
aligned pointers and guard bytes.
Simplest approach with a 128-bit compare (which, as I said, doesn't hurt
one bit and may be needed anyway to filter on GIL-ness), is then
struct {
union {
char *interned_sig;
char signature[8]
};
uint64_t flags; // first 8 bits always 0, for terminating \0
void *funcptr;
};
One could also complicate this again to eat a few more flag bits for
signature chars...
Dag
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel