Dear list,

A thread I started earlier was meant to discuss how to use C++ memory management methods (operator new, delete, etc.) with a Boost Python instance. Rather than dwelling on that concern, I've been (successfully) wrapping other code since, but have now arrived at a separate conundrum, which I think could be addressed by the same conceptual solution. This time I've found a working attempt at a solution here on this list[1], and was hoping that more generic, templatised versions could be introduced into Boost.

[1] - http://mail.python.org/pipermail/cplusplus-sig/2007-August/012438.html


The message has turned into a bit of an essay, so I'll summarise what I've written here:-

       * Python objects and protocols - they're not all the same.
       * Python Buffers - An example and attempt at exposing one.
       * C++ IO streams - exposing buffered object interfaces to Python.
       * Customising PyTypeObjects already used in Boost Python.
       * "There should be one-- and preferably only one --obvious way to do it."
       * Summary


What's in an Object?
--------------------

What I think it boils down to is a lack of support for the different type objects defined in the Python C-API Abstract[2] and Concrete[3] Object Layers.

The problem in [1] was related to PyBufferProcs and PyBufferObjects. How can an object representing a buffer be properly exposed to Python? The PyBuffer* structs were designed with this in mind, but are now deprecated in favour of memoryview objects[4]. Either way, a `grep` of the Boost Python header and source files shows no sign of either API being used.

[2] - http://docs.python.org/2/c-api/abstract.html
[3] - http://docs.python.org/2/c-api/concrete.html
[4] - http://docs.python.org/2/c-api/buffer.html#memoryview-objects


A buffered solution
-------------------

The solution from [1] makes it about as simple as possible for the client / Python registration code to expose a return type that is managed by a PyBuffer_Type-like PyTypeObject. A custom to-python converter is registered and return_value_policy used.

However, this is still fairly cumbersome compared to current Boost Python usage, as the Python C-API needs to be used directly and a custom PyTypeObject defined, for any return type that should use a different type protocol. The solution also goes nowhere towards providing the functionality a Python buffer expects; it just demonstrates how one might use a new PyTypeObject.


A standards-compliant solution
------------------------------

With the C++ standard library in mind, I was wondering what boost python might be able to do with IO streams. I have a family of C++ classes that use iostream-like operators to serialise objects into either XML, plain text, or binary formats. Providing this functionality via a buffered object seems to be the appropriate solution... Using boost python to expose such an interface though, looks non-trivial, to say the least.

A boost-friendly solution might be to recognise boost::asio::buffer[6] objects, perhaps using boost::mpl statements in the to-python converter registrations.

I'm still trying to get to grips with standard library templates personally, so would prefer if classes derived from ios_base could automatically have their '<<' and '>>' operators exposed at compile time, depending on whether they are read-only or read-write. An exposed seek function would also be useful, when one is available in the C++ type.
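
As a rough illustration of the compile-time selection described above, here is a minimal sketch that uses only the standard library. The names `is_readable`, `is_writable` and `exposed_methods` are hypothetical and are not part of Boost Python; the sketch only decides which Python-side method names a wrapper might expose, and does not do any wrapping itself.

```cpp
#include <fstream>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical traits (not part of Boost Python): a stream type is treated
// as readable if it derives from std::istream, and as writable if it
// derives from std::ostream.
template <class Stream>
struct is_readable : std::is_base_of<std::istream, Stream> {};

template <class Stream>
struct is_writable : std::is_base_of<std::ostream, Stream> {};

// Hypothetical helper: names of the Python-side methods that a wrapper
// generator could choose to expose for Stream.
template <class Stream>
std::vector<std::string> exposed_methods()
{
    std::vector<std::string> names;
    if (is_readable<Stream>::value) names.push_back("read");
    if (is_writable<Stream>::value) names.push_back("write");
    return names;
}

// The classification is known at compile time.
static_assert(is_readable<std::istringstream>::value,
              "istringstream is an input stream");
static_assert(!is_writable<std::istringstream>::value,
              "istringstream is not an output stream");
static_assert(is_readable<std::stringstream>::value &&
              is_writable<std::stringstream>::value,
              "stringstream is both");
```

Seeking is harder to classify this way: every std::istream has seekg and every std::ostream has seekp, so whether a seek succeeds depends on the underlying stream buffer at run time rather than on the static type.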


Specialised PyTypeObjects
-------------------------

Discussing each of the different object types is too large a subject to describe in full here, but would it not be sensible for Boost Python to make it easier to expose other PyTypeObjects?

The NumPy C-API exposes 8 public and 4 private type specialisations[5], for representing clearly different types of data. These are essentially PyTypeObjects conforming to the API defined in the C-Python object layers documentation[2,3].

With quite a lot more code, Boost Python could potentially provide capability to specialise the type objects for a number of pre-defined base types, by providing custom HolderGenerators[6] for each type specialisation. These HolderGenerators can be referred to by creating corresponding `return_value_policy`s. This is what the solution from [1] does, by defining both a new HolderGenerator and a corresponding return_value_policy.

This concept is not problem-free, however. In my case, I'd like to tie a C++ class's streaming interface directly to the PyTypeObject. For Python 2.x this would mean pointing a new PyTypeObject's tp_as_buffer attribute at a PyBufferProcs struct. The code from [1] could be modified to do this, but it would take quite a lot more work. (It has.)

For Python 2.7 and above, there are of course the new buffer and memoryview APIs, but I haven't really read up on or done anything with them yet.


[5] - http://docs.scipy.org/doc/numpy/reference/c-api.types-and-structures.html
[6] - http://www.boost.org/doc/libs/1_53_0/libs/python/doc/v2/HolderGenerator.html


A generalised solution
----------------------

To answer my question from the previous thread I started here, on how to use a custom PyTypeObject on an exposed class_<> hierarchy, I think the way to do this is to use `pytype_object_manager_traits<PyTypeObject*, object>`, as is done in str.hpp, list.hpp, etc. e.g.:-

namespace converter
{
  template <>
  struct object_manager_traits<str>
      : pytype_object_manager_traits<&PyUnicode_Type, str>
  {
  };
}

This seems to be the best way to register a PyTypeObject to a C++ class, with Boost. But it does require a tremendous amount of work when the PyTypeObject should use STL functionality.

C++ IO streams
--------------


Mapping C++ STL functions to PyTypeObject attributes[7] does not appear to have been done at all in Boost Python, as far as I can tell. Of course there are the standard objects, bp::str, list, etc., which use core Python's respective PyTypeObjects as instance managers, like above, but there doesn't seem to be a robust way to replace a PyTypeObject's function pointers with STL-conforming implementations. I suppose it is possible to edit the PyTypeObject, after getting it with `object.get_type()`, but that seems a bit of an inefficient, run-time hack.

I was playing around with the code from [1] over the weekend, and have started to map the C++ iostream template functions to a PyTypeObject's `tp_as_buffer` member struct, to expose buffered access to C++ formatted stream methods through a PyBufferProcs struct[8]. Admittedly, this was a bit of a pointless exercise, as the old buffer protocol has been removed in Python 3, but I am currently developing with Python 2.7 and wanted to try out an initial, working implementation where a custom PyTypeObject is used.

For std::i/ostream, there is some production code available that can perform Python file-like object conversions. In particular, the two subsequent replies to this message[9] here on this list mention open-source libraries that can already do this. And from the code listed in [1], I've made available yet another (partially complete) implementation[10].
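
For readers unfamiliar with the general technique, a file-like conversion of this kind is commonly built on a custom std::streambuf. The following is a minimal, standard-library-only sketch of the output direction, and is not taken from any of the implementations mentioned above. The class name `callback_streambuf` is hypothetical, and the `sink` callback merely stands in for a call to a Python object's write() method.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <ostream>
#include <streambuf>
#include <vector>

// Hypothetical adapter: buffers characters written through a std::ostream
// and hands them to `sink` in chunks. In a real binding, `sink` would call
// the write() method of a Python file-like object.
class callback_streambuf : public std::streambuf
{
public:
    typedef std::function<void(const char*, std::size_t)> sink_type;

    explicit callback_streambuf(sink_type sink, std::size_t capacity = 64)
        : sink_(sink), buffer_(capacity)
    {
        assert(capacity > 0);
        reset_put_area();
    }

    ~callback_streambuf() { flush_buffer(); }

protected:
    // Called when the put area is full: flush it, then store `ch`.
    int_type overflow(int_type ch) override
    {
        flush_buffer();
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
        {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }

    // Called by std::flush and std::endl.
    int sync() override
    {
        flush_buffer();
        return 0;
    }

private:
    void reset_put_area()
    {
        setp(buffer_.data(), buffer_.data() + buffer_.size());
    }

    void flush_buffer()
    {
        const std::size_t pending = static_cast<std::size_t>(pptr() - pbase());
        if (pending > 0)
            sink_(pbase(), pending);
        reset_put_area();
    }

    sink_type sink_;
    std::vector<char> buffer_;
};
```

A std::ostream constructed over such a buffer can then be handed to any existing `operator<<` overload, so the serialisation code itself does not need to know that the destination is a Python object.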

[7] - http://docs.python.org/2/c-api/typeobj.html#
[8] - http://docs.python.org/2/c-api/typeobj.html#PyBufferProcs
[9] - http://mail.python.org/pipermail/cplusplus-sig/2010-March/015411.html
[10] - https://github.com/alexleach/bp_helpers


Moving forward
--------------

Assuming Boost Python follows the Zen of Python, there should be one - and only one - obvious way to achieve what I want. Currently, that is to expose a future-proof, STL-compliant iostream interface through Boost Python. I don't think any of the above implementations are compatible with Python 3, since I don't think any of them use the new Python buffer or memoryview APIs, but I'd like to make the switch soon, myself.

I'm sure adding buffer support to Boost Python would be valuable for a number of users. From a backwards-compatibility perspective, it would probably be good to have both the old and the new buffer APIs included in Boost Python, to be selected with a Python preprocessor macro. Memoryviews are a relatively fancy and new feature, but buffers have been around for ages, so it would be good if they were supported, for basically all versions of Python. Ideally though, we would also have memoryview functionality in v2.7+, too.
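
A minimal sketch of how such a compile-time selection could look is given here. The `BPH_*` macro names and `preferred_buffer_api` are hypothetical; nothing like them exists in Boost Python. PY_VERSION_HEX is the real macro from Python.h, but the fallback definition is an assumption added only so that the sketch compiles without the Python headers.

```cpp
#include <string>

// PY_VERSION_HEX normally comes from <Python.h>. The fallback value below is
// an assumption made only so that this sketch compiles on its own.
#ifndef PY_VERSION_HEX
#  define PY_VERSION_HEX 0x02070000
#endif

// Old buffer interface (bf_getreadbuffer and friends): Python 2 only.
#if PY_VERSION_HEX < 0x03000000
#  define BPH_HAS_OLD_BUFFER 1
#else
#  define BPH_HAS_OLD_BUFFER 0
#endif

// New buffer interface (Py_buffer, bf_getbuffer): Python 2.6 and later.
#if PY_VERSION_HEX >= 0x02060000
#  define BPH_HAS_NEW_BUFFER 1
#else
#  define BPH_HAS_NEW_BUFFER 0
#endif

// memoryview objects: Python 2.7 and later.
#if PY_VERSION_HEX >= 0x02070000
#  define BPH_HAS_MEMORYVIEW 1
#else
#  define BPH_HAS_MEMORYVIEW 0
#endif

// Hypothetical helper: which interface a converter would prefer to target.
inline std::string preferred_buffer_api()
{
#if BPH_HAS_NEW_BUFFER
    return "new";
#else
    return "old";
#endif
}
```

With macros like these, a converter could fill in both sets of slots on Python 2.6 and 2.7, and only the new ones on Python 3.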


One way to rule them all
------------------------

Now, I've discovered a number of ways to write to_python converters, and am not sure what is the "one obvious way" to define a new PyTypeObject's API.

I would be grateful for feedback on which should be the preferred way to expose a class with a custom PyTypeObject. Here are the methods I've looked into:-


1. indexing_suite

Perhaps my favourite way to expose a to_python converter was with Boost Python's indexing_suite, as I did for std::list[11] (also attached to a message on this list, earlier this month). From the client's perspective, all that needs to be done is to instantiate a template. For examples, see the C++ test code[12]. However, I haven't really looked into how the converter is registered internally, as the base classes take care of that. Either way, the indexing suite functions are only attached to the PyObject, not its respective PyTypeObject.

[11] - https://github.com/alexleach/bp_helpers/blob/master/include/boost_helpers/make_list.hpp
[12] - https://github.com/alexleach/bp_helpers/blob/master/src/tests/test_make_list.cpp


2. class_<..>

The code for the class_ template, its bases and typedefs is really quite advanced, but it can't be said that it is inflexible. Still, I haven't found an "obvious way" to replace a class's object manager. I get a runtime warning if a to_python_converter is registered as well as a class. bp::init has an operator[] method, which can be used to specify a CallPolicy, but I haven't managed to get that to change an instance's base type.

The registry is probably the way to do this, but for me at least, the registry is very opaque, so I haven't found a good way to edit or replace a PyTypeObject, either during or after an exposed class_<> has been initialised.


3. to_python_converter<class T, class Conversion, bool has_get_pytype=false>

This is how the solution in [1] enables to-python (PyObject) conversion, and is also how I've been doing it in the testing code I modified from there[13-15]. A corresponding Conversion class seems necessary to write, for each new type of PyTypeObject. e.g. as done in return_opaque_pointer.hpp and opaque_pointer_converter.hpp.

[13] - https://github.com/alexleach/bp_helpers/blob/master/src/tests/test_buffer_object.cpp
[14] - https://github.com/alexleach/bp_helpers/blob/master/include/boost_helpers/return_buffer_object.hpp
[15] - https://github.com/alexleach/bp_helpers/blob/master/include/boost_helpers/buffer_pointer_converter.hpp


4. Return value policies and HolderGenerators

In functions and methods where a CallPolicy can be specified, as I've said already, a custom CallPolicy can be used to refer to a custom HolderGenerator. These specify which PyTypeObject is used for managing the python-converted object, but can a custom return value policy and holder be specified with the class_<> template? I sure would like to find a way...

However, alone, this doesn't seem to put a type converter in the registry. I thought that the MakeHolder::execute function should only need to be called once, but my code is currently calling it every time I want a new Python instance. So, I think I must not be registering the class' type id properly[14]...



Then there are the lvalue and rvalue Python converters, which admittedly I don't know much about. There are also some other concepts I haven't mentioned above, like install_holders[16], for example, and whatever is done when you add shared_ptr<X> to the class_ template's arguments.

[16] - http://www.boost.org/doc/libs/1_53_0/libs/python/doc/v2/instance_holder.html#instance_holder-spec


Summary
-------

Should the ability to expose C++ istreams and ostreams be added to Boost Python? How should this be done? I thought that having a chainable return_value_policy for both istreams and ostreams would be great. That way they could both be used in conjunction for an iostream, with the functionality just incrementally added to a base PyTypeObject. But I don't see how one could attach additional PyObject methods, as is done by a class_ template's def methods.

What about memoryviews? If someone were to go ahead and write converters for Python memoryviews, are there any C++ standards-compliant classes that could be accommodated? i.e. Are there any classes defined in the C++ standard for multidimensional, buffered memory access? Which, if any, would be an appropriate match to a Python memoryview? I guess that nested std::vectors and lists might be good candidates, but I stand to be corrected.
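
To make the question more concrete, here is a minimal sketch of one possible match: a single flat std::vector plus shape metadata, with byte strides computed in the same row-major convention as the `strides` field of Python's Py_buffer. The names `extents`, `c_contiguous_strides` and `flat_array` are hypothetical, and this is only one option under the assumption that the data should live in one contiguous block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::ptrdiff_t> extents;

// Hypothetical helper: strides of a C-contiguous (row-major) array, in
// units of `itemsize`. The last dimension varies fastest.
extents c_contiguous_strides(const extents& shape, std::ptrdiff_t itemsize)
{
    extents strides(shape.size());
    std::ptrdiff_t step = itemsize;
    for (std::size_t i = shape.size(); i-- > 0; )
    {
        strides[i] = step;
        step *= shape[i];
    }
    return strides;
}

// Hypothetical container: one contiguous allocation plus its shape.
template <class T>
class flat_array
{
public:
    explicit flat_array(const extents& shape)
        : shape_(shape), data_(element_count(shape))
    {}

    // Element access by multidimensional index, using element strides.
    T& at(const extents& index)
    {
        assert(index.size() == shape_.size());
        const extents strides = c_contiguous_strides(shape_, 1);
        std::ptrdiff_t offset = 0;
        for (std::size_t i = 0; i < index.size(); ++i)
        {
            assert(index[i] >= 0 && index[i] < shape_[i]);
            offset += index[i] * strides[i];
        }
        return data_[static_cast<std::size_t>(offset)];
    }

    const std::vector<T>& data() const { return data_; }
    const extents& shape() const { return shape_; }

private:
    static std::size_t element_count(const extents& shape)
    {
        std::size_t n = 1;
        for (std::size_t i = 0; i < shape.size(); ++i)
            n *= static_cast<std::size_t>(shape[i]);
        return n;
    }

    extents shape_;
    std::vector<T> data_;
};
```

By contrast, each inner vector of a nested std::vector owns a separate allocation, so a base pointer plus shape and strides alone cannot describe it, and exposing one would probably involve a copy.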


Apologies for the length this became and thank you for sticking with me this far. Any advice, suggestions, pointers to code or documentation I've probably overlooked or neglected, or even criticism would be appreciated. Further discussion on how best to improve Boost Python as it is would be great! I do like to contribute to open source communities when possible, but I am strained for time...

Thanks again!
Kind regards,
Alex