Hello,
I recently hacked together a script called entanglement.py that uses
libclang to parse C++ headers and generate a Python wrapper that can
call the C++ symbols in a .so directly. The Itanium C++ ABI is easy
enough to call from ctypes, with one exception: returning a class by
value from C++ results in a missing constructor call because Python
just calls memcpy. Otherwise references, this/self pointers, the
object life-cycle, virtual functions using single inheritance, and
overloaded C++ operators are all easy to implement.
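To make the this/self and life-cycle part concrete, here is a minimal sketch of how a ctypes.Structure can act as the storage for a C++ object. It is an illustration under assumptions, not the generated output: Widget, the 16-byte size, and the two stand-in functions are hypothetical, and a real binding would pass _Ctypes.byref(self) to the mangled constructor and destructor symbols loaded from the .so.

```python
import ctypes

calls = []  # records what the stand-in "C++" functions were asked to do

# Hypothetical stand-ins for the mangled constructor/destructor symbols.
# Real bindings would be ctypes function objects taken from the .so.
def fake_ctor(this_addr):
    calls.append(('ctor', this_addr))

def fake_dtor(this_addr):
    calls.append(('dtor', this_addr))

class Widget(ctypes.Structure):
    # Opaque storage for the C++ object; 16 bytes is an assumed sizeof(Widget).
    _fields_ = [('_storage', ctypes.c_ubyte * 16)]

    def __init__(self):
        # The address of the Python object's buffer plays the role of `this`.
        fake_ctor(ctypes.addressof(self))

    def __del__(self):
        # Run the destructor on the same storage when Python drops the object.
        fake_dtor(ctypes.addressof(self))
```

The point of the design is that Python owns the memory while C++ owns the contents, so the constructor and destructor calls have to be paired by the wrapper.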
While this is an experiment, I genuinely think C++ should be natively
supported by script interpreters that can read some kind of
pre-compiled C++ header and use that to talk directly to the .so.
This is just a crude first attempt to move in that direction, and
other ABIs would still need support. I would love feedback...
Anyway, entanglement.py handles translating the entire object model,
including most of the C++ operators and the whole C++ object
life-cycle. Overloaded functions are selected by argument count and
then by first argument type. Supporting re-opened C++ namespaces and
forward-declared nested classes did require monkey-patching in
Python. Templates have to be instantiated in C++ before they can be
called, and macros are not yet converted.
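A rough sketch of the overload selection rule described above (argument count first, then the type of the first argument). This is a hand-written illustration, not the generated code: make_overload and the stand-in functions are hypothetical names, and the real wrappers would call mangled symbols instead.

```python
def make_overload(candidates):
    """Build a dispatcher from a list of (param_types, function) pairs.

    param_types is a tuple of Python types, one per parameter.
    """
    def dispatch(*args):
        # Step 1: keep only the candidates with a matching argument count.
        by_count = [c for c in candidates if len(c[0]) == len(args)]
        if not by_count:
            raise TypeError(f'Arg count: {len(args)}')
        # A unique match (or a zero-argument call) needs no type check.
        if len(by_count) == 1 or not args:
            return by_count[0][1](*args)
        # Step 2: disambiguate on the type of the first argument only.
        for types, fn in by_count:
            if isinstance(args[0], types[0]):
                return fn(*args)
        raise TypeError(f'First arg type: {type(args[0]).__name__}')
    return dispatch

# Hypothetical stand-ins for three C++ overloads: f(), f(int), f(double).
f = make_overload([
    ((), lambda: 'f()'),
    ((int,), lambda x: ('f(int)', x)),
    ((float,), lambda x: ('f(double)', x)),
])
```

Checking only the first argument keeps the generated code small, at the cost of not distinguishing overloads that differ only in later parameters.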
So for example, the generated code for an overloaded C++ constructor
would look like this:
def __init__(self,*_Args,**_Kwargs):
    assert not _Kwargs, 'Keyword arguments.'
    match _Len(_Args):
        case 0:
            return _ZN12OperatorTestC1Ev(_Ctypes.byref(self),) # type: ignore
        case 1:
            return _ZN12OperatorTestC1Ei(_Ctypes.byref(self),_Args[0]) # type: ignore
        case _:
            assert False, f'Arg count: {_Len(_Args)}'
...
Then the "symbol table" for _ZN12OperatorTestC1Ev would look like this:
_X=_Cdll._ZN12OperatorTestC1Ev # void OperatorTest()
_X.argtypes=[_Ctypes.c_void_p,]
_X.restype=None
_ZN12OperatorTestC1Ev=_X
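The rest of the output is attached. Anyone familiar with ctypes should
be able to read it. In theory ctypes could decode the argtypes
automatically because the parameter types are explicitly encoded in
the C++ name mangling scheme. (The return type of an ordinary function
is not part of its mangled name, so restype would still have to come
from the header.)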
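As a rough illustration of reading types straight out of a mangled name, here is a small sketch that handles only simple nested names such as the two constructors above. decode_params is a hypothetical helper, not part of ctypes or entanglement.py. It recovers parameter types only, and it rejects pointers, references, templates, and substitutions.

```python
import ctypes

# Subset of the Itanium builtin type codes, mapped to ctypes types.
_BUILTIN_CODES = {
    'b': ctypes.c_bool, 'c': ctypes.c_char,
    'a': ctypes.c_byte, 'h': ctypes.c_ubyte,
    's': ctypes.c_short, 't': ctypes.c_ushort,
    'i': ctypes.c_int, 'j': ctypes.c_uint,
    'l': ctypes.c_long, 'm': ctypes.c_ulong,
    'x': ctypes.c_longlong, 'y': ctypes.c_ulonglong,
    'f': ctypes.c_float, 'd': ctypes.c_double,
}

def decode_params(mangled):
    """Split a simple _ZN...E name into name parts and ctypes parameter types."""
    if not mangled.startswith('_ZN'):
        raise ValueError('only nested names (_ZN...E) are handled')
    pos = 3
    parts = []
    # Each name component is <decimal length><identifier>.
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    # Constructors and destructors use a two-character marker (C1, D1, ...).
    if mangled[pos:pos + 1] in ('C', 'D') and mangled[pos + 1:pos + 2].isdigit():
        parts.append(mangled[pos:pos + 2])
        pos += 2
    if mangled[pos:pos + 1] != 'E':
        raise ValueError('unsupported name component')
    codes = mangled[pos + 1:]
    if codes == 'v':  # a lone 'v' means an empty parameter list
        return parts, []
    if not codes or any(code not in _BUILTIN_CODES for code in codes):
        raise ValueError(f'unsupported parameter encoding: {codes!r}')
    return parts, [_BUILTIN_CODES[code] for code in codes]
```

The implicit this pointer is not part of the mangled name, so a caller would still prepend c_void_p for constructors and member functions, as the symbol table above does.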
Where this script really shines is that it doesn't generate C++. Other
tools I tried were eating 50% of my compile time and 50% of my .so
size just to wrap 50 function calls. By comparison, this builds in a
fraction of a second and adds only a little byte-code. Meanwhile,
it allows you to write a Python wrapper for a C++ object model while
barely knowing Python, let alone writing any. I can't speak to the
overhead compared to the Python C API, however.
Here is the script:
https://github.com/whatchamacallem/hatchlingplatform/blob/master/entanglement_example/src/entanglement.py
Here is the test project. There are docs on that page.
https://github.com/whatchamacallem/hatchlingplatform/tree/master/entanglement_example
Note: testpythonbindings.sh is in the parent directory, and
the path to libclang is hard-coded to work with Ubuntu LTS. (I swear
I am more worried about finding libclang on your hard drive portably
than I am about the portability of the rest of this.)
I am tempted to write an improvement proposal asking for C++ support
in Python. (In theory a description of the C++ interface could be
added to the ELF file format and loaded that way.) Again, let me know
what you think.
Regards,
Adrian