On Thu, Oct 23, 2008 at 5:25 AM, Maciej Fijalkowski <[EMAIL PROTECTED]> wrote:
> Hey.
>
> First, sorry for the late response; we're kind of busy doing other things
> now (i.e. working on a 2.5-compatible release). That doesn't mean we don't
> appreciate input about our problems.
>
> On Fri, Oct 17, 2008 at 5:50 AM, Geoffrey Irving <[EMAIL PROTECTED]> wrote:
> <snip>
>
> That's true. PyPy is able to handle pointers to any C place.
>
>> .. with
>>
>> separate information about type and ownership.
>
> We don't provide this, since C has no notion of that at all.
At the lowest level the type is just a hashable identifier object, so
it can probably be implemented at the RPython level. E.g.,
# RPython type-safety layer
class CppObject:
    def __init__(self, ptr, type):
        self.ptr = ptr                      # pointer to the actual C++ instance
        self.type = type                    # represents the C++ type
        self.destructor = type.destructor   # function pointer to destructor

    def __traverse__(self):
        ... traverse through list of contained python object pointers ...

    def __del__(self):
        CCall(self.destructor, self.ptr)

class CppFunc:
    def __init__(self, ptr, resulttype, argtypes):
        self.ptr = ptr
        self.resulttype = resulttype
        self.argtypes = argtypes

    def __call__(self, *args):
        if len(args) != len(self.argtypes):
            raise TypeError(...)
        argptrs = []
        for a, t in zip(args, self.argtypes):
            if not isinstance(a, CppObject) or a.type != t:
                raise TypeError(...)
            argptrs.append(a.ptr)
        resultptr = Alloc(self.resulttype.size)
        try:
            CppCall(self.ptr, resultptr, *argptrs)  # assumes a specific calling convention
        except CppException, e:  # CppCall would have to generate this
            Dealloc(resultptr)
            raise CppToPythonException(e)
        return CppObject(resultptr, self.resulttype)
If this layer is written in RPython, features like overload resolution
and C++ methods can be written in application-level python without
worrying about safety.
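For concreteness, overload resolution on top of that layer could be as
small as something like this (an application-level sketch, not existing
PyPy code; the try-each-overload strategy is just one possible design,
reusing the CppFunc class above):

# Application-level (not RPython) sketch: overload resolution built on
# the type-checked CppFunc objects above.  Purely illustrative.
class CppOverload:
    def __init__(self, funcs):
        self.funcs = funcs                  # list of CppFunc, one per C++ overload

    def __call__(self, *args):
        for f in self.funcs:
            if len(args) != len(f.argtypes):
                continue                    # cheap arity pre-filter
            try:
                return f(*args)             # CppFunc does the exact type check
            except TypeError:
                continue
        raise TypeError("no matching C++ overload for these arguments")

A real version would rank implicit conversions instead of taking the
first exact match, but all of that policy can stay at the application
level because CppFunc never lets an ill-typed call through.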
>> <snip>
>>
>> As I mentioned in the blog comment, a lot of these issues come up in
>> contexts outside C++, like numpy. Internally numpy represents
>> operations like addition as a big list of optimized routines to call
>> depending on the stored data type. Functions in these tables are
>> called on raw pointers to memory, which is fundamental since numpy
>> arrays can refer to memory inside objects from C++, Fortran, mmap,
>> etc. It'd be really awesome if the type dispatch step could be
>> written in python but still call into optimized C code for the final
>> arithmetic.
>
> That's the goal. Well, not exactly - point is that you write this code
> in Python/RPython and JIT is able to generate efficient assembler out
> of it. That's a very far-reaching goal though to have nice integration
> between the yet-non-existent JIT and PyPy's yet-non-existent numpy :-)
Asking the JIT to generate efficient code might be sufficient in this
case, but in terms of this discussion it just removes numpy as a useful
thought experiment towards C++ bindings. :)  Also, for maximum speed I
doubt the JIT will be able to match hand-tuned code such as BLAS, given
that C++ compilers usually don't get there either.
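For what it's worth, the dispatch step described in the numpy paragraph
above might look roughly like this at the python level, with the
per-dtype kernels staying as opaque C function pointers (everything
here is hypothetical: the kernel table, the attribute names, and the
CCall primitive from the sketch earlier in this mail are not real numpy
or PyPy APIs):

# Hypothetical sketch of python-level dtype dispatch over C kernels.
# add_kernels maps a dtype name to a C function pointer operating on
# raw buffers; none of these names are real numpy internals.
add_kernels = {
    'float64': add_float64_ptr,             # hypothetical C function pointers
    'int32':   add_int32_ptr,
}

def add(a, b, out):
    # a, b and out are assumed to expose .dtype, .data (raw pointer) and .size
    if not (a.dtype == b.dtype == out.dtype):
        raise TypeError("a promotion/casting step would go here")
    kernel = add_kernels[a.dtype]
    # the dispatch above is plain python; only this call crosses into
    # optimized C, which is what the JIT would ideally inline around
    CCall(kernel, a.data, b.data, out.data, a.size)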
>> The other major issue is safety: if a lot of overloading and dispatch
>> code is going to be written in python, it'd be nice to shield that
>> code from segfaults. I think you can get a long way there just by
>> having a consistent scheme for boxing the three components above
>> (pointer, type, and reference info), a way to label C function
>> pointers with type information, and a small RPython layer that does simple
>> type-checked calls (with no support for overloading or type
>> conversion). I just wrote a C++ analogue to this last part as a
>> minimal replacement for Boost.Python, so I could try to formulate what
>> I mean in pseudocode if there's interest. There'd be some amount of
>> duplicate type checking if higher level layers such as overload
>> resolution were written in application level python, but that
>> duplication should be amenable to elimination by the JIT.
>
> I think for now we're happy with extra overhead. We would like to have
> *any* working C++ bindings first and then eventually think about
> speeding it up.
Another advantage of splitting the code into an RPython type-safety
layer and application-level code is that the latter could be shared
between PyPy and CPython. I haven't looked at Reflex at all, but in
Boost.Python most of the complexity goes into code that could exist at
the application level.
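As a rough illustration of the kind of wrapper code that could be
shared (again only a sketch on top of the CppObject/CppFunc layer
above, with illustrative names):

# Application-level wrapper code that could be shared between PyPy and
# CPython, given the same small type-safe layer underneath.
class CppClassProxy(object):
    def __init__(self, obj, methods):
        self._obj = obj                     # a CppObject instance
        self._methods = methods             # dict: method name -> CppFunc/CppOverload

    def __getattr__(self, name):
        try:
            func = self._methods[name]
        except KeyError:
            raise AttributeError(name)
        # bind the C++ 'this' pointer as the first argument
        return lambda *args: func(self._obj, *args)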
Geoffrey
_______________________________________________
[email protected]
http://codespeak.net/mailman/listinfo/pypy-dev