Re: [Cython] Buffer passing implementation details

Kurt Smith Wed, 06 May 2009 14:41:47 -0700

On Wed, May 6, 2009 at 11:31 AM, Dag Sverre Seljebotn
<[email protected]> wrote:
> (Kurt is the major recipient of this.)
>
> I've been thinking some more on buffer passing. In the end (and perhaps
> even in summer; I could perhaps have a go at it working alongside you)
> the scenario we are looking at is something like
>
> external_set_foo_array(foo_handle, self.arr[2::2])
>
> with fast operations all the way.
>
> That is
> a) Efficient slicing without Python overhead (#178)
> b) Storing acquired buffers in cdef class fields (#301)
> c) External functions can keep a reference to the buffer (not really
> necessary, but it is necessary for internal Cython "cdef" functions, and
> it would be nice to treat them the same)
> d) The base pointer may have to be moved (again slicing can do this)
>
> This seems to make it clear that
> a) The Py_buffer is not suitable as the primary vessel of our buffer
> data, e.g. to the external function. It can't work that well with
> slicing as we must maintain the original Py_buffer data when releasing it.
> b) We need a reference count. E.g. doing a slice, or assigning to
> self.arr, would increase this count. This must go on the heap; and so we
> might as well put the entire Py_buffer on the heap.


Doing the slicing manually on the buffer data and having a refcounted
heap-allocated Py_buffer would address the object-Py_buffer data
synchronization problem, right?  With the above changes, once the
buffer is acquired, all modifications to it would be done at the
buffer level.

Although, what about when we require a contiguous copy (or any mode,
for that matter) of a buffer?  Would we set the top-level object
reference to None, as suggested in a previous thread?  That would
prevent slicing at the Python API level, though, right?

>
> I have made a new ticket outlining the thoughts in detail in #311 and
> added a comment in #299. #311 is not really in your direct interest for
> GSoC but it is very tightly coupled with #299 so it would be good to
> keep in the back of your mind anyway.
>
> Thoughts?

General question: how difficult would it be for the Cython-side
programmer to get the underlying char *data inside the Py_buffer, in
case it needs to be accessed?  It seems there are more than a few
'layers' -- are they all necessary?  The justifications I see for each
are the following:

typedef struct {
   void *buf;
   PyObject *obj;
   Py_ssize_t len;
   Py_ssize_t itemsize;
   int readonly;
   int ndim;
   char *format;
   Py_ssize_t *shape;
   Py_ssize_t *strides;
   Py_ssize_t *suboffsets;
   void *internal;
} Py_buffer;

The "base layer" -- interface structure for tp_getbuffer from objects, etc.

typedef struct {
  size_t refcount;
  Py_buffer bufinfo;
} __Pyx_Buffer;

The Cython wrapper around Py_buffer -- includes Cython-managed
refcounting upon assignment/slicing (other cases?)

typedef struct {
  __Pyx_Buffer* bufinfo;
  char* data;
  Py_ssize_t shape0, stride0;
} __Pyx_StridedBuf_1D;

The "int[:] buf" buffer -- exists to make the new buffer syntax more
transparent (not part of GSoC, but putting in framework for it).  char
*data pointer to allow fast slicing w/o Python API (e.g. arr[10:]),
shape0 & stride0 copied from buffer for optimizations.

typedef struct {
  __Pyx_StridedBuf_1D buffer;
  PyObject* object;
} __Pyx_StridedBuf_1D_Obj;

The "object[int, mode='strided'] buf" object buffer.  This is the top
level and holds a reference to the original PyObject from which the
buffer was acquired.  The object reference is needed whenever
Python-level access or assignment is required.

So, initially, if one passes a 'buffer' to a function like so:

external_func(arr[2::2])

What is the slicing modifying/updating?  The top level, through the
Python API layer via the PyObject?  Or is it being done to the
__Pyx_StridedBuf_1D members -- offsetting the char *data pointer by two
elements (2 * stride0 bytes), and changing the shape0 and stride0
members?  (Would it be the top-layer object at first & for the GSoC,
then, when faster low-level slicing is implemented, directly on the
lower level?)
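To make the lower-level alternative concrete, here is a minimal sketch of what arr[2::2] could do to the 1D struct.  Everything named sketch_* is hypothetical and only mirrors the structs above; ptrdiff_t stands in for Py_ssize_t so the fragment compiles without Python.h, and it assumes strides are in bytes (as in Py_buffer), a non-negative start, and a positive step.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the refcounted wrapper; the real one would embed a Py_buffer. */
typedef struct {
    size_t refcount;
} sketch_buffer;

/* Mirrors __Pyx_StridedBuf_1D; stride0 is assumed to be in bytes. */
typedef struct {
    sketch_buffer *bufinfo;
    char *data;
    ptrdiff_t shape0, stride0;
} sketch_strided_1d;

/* Hypothetical helper computing src[start::step] without the Python API.
   Only the view changes; the underlying memory is never touched. */
static sketch_strided_1d
sketch_slice_1d(sketch_strided_1d src, ptrdiff_t start, ptrdiff_t step)
{
    sketch_strided_1d out = src;

    assert(start >= 0 && step > 0);
    if (start > src.shape0)
        start = src.shape0;                 /* clamp, giving an empty view */

    out.data = src.data + start * src.stride0;            /* move base pointer */
    out.shape0 = (src.shape0 - start + step - 1) / step;  /* ceiling division */
    out.stride0 = src.stride0 * step;
    out.bufinfo->refcount++;                /* the new view shares the buffer */
    return out;
}
```

For a 10-element int array, slicing with start 2 and step 2 gives a view of 4 elements whose stride is twice the original.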


Besides Python API-level slicing, why is the top-level object
reference required?  Is it for buffer reacquisition through
tp_getbuffer?  That will be handled at the next lower level through a
struct copy by value and increase of the refcount.  It isn't for
buffer release, either, since that doesn't require the original object
reference explicitly.  (There is an object reference at the Py_buffer
level).
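A rough sketch of that "struct copy by value plus refcount" idea follows.  The sketch_* names, the release callback, and the owner pointer are all assumptions made for illustration; in the real design the release step would presumably be PyBuffer_Release on the embedded Py_buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Heap-allocated, refcounted wrapper (stand-in for __Pyx_Buffer). */
typedef struct {
    size_t refcount;
    void (*release)(void *owner);  /* hypothetical stand-in for buffer release */
    void *owner;                   /* hypothetical stand-in for Py_buffer data */
} sketch_buffer;

typedef struct {
    sketch_buffer *bufinfo;
    char *data;
    ptrdiff_t shape0, stride0;
} sketch_strided_1d;

/* Example release callback: counts how often it was called. */
static void sketch_count_release(void *owner) { ++*(int *)owner; }

static sketch_buffer *sketch_buffer_new(void (*release)(void *), void *owner)
{
    sketch_buffer *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->refcount = 1;
    b->release = release;
    b->owner = owner;
    return b;
}

/* "Reacquisition": copy the small struct by value and bump the count. */
static sketch_strided_1d sketch_view_copy(sketch_strided_1d v)
{
    v.bufinfo->refcount++;
    return v;
}

/* Drop one view; the last one out releases the buffer and frees the wrapper. */
static void sketch_view_drop(sketch_strided_1d *v)
{
    sketch_buffer *b = v->bufinfo;
    assert(b != NULL && b->refcount > 0);
    if (--b->refcount == 0) {
        if (b->release != NULL)
            b->release(b->owner);
        free(b);
    }
    v->bufinfo = NULL;
    v->data = NULL;
}
```

Note that neither the copy nor the drop needs the original PyObject; only the wrapper is touched.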

Basically, nothing Cython-side would require the PyObject from which
the original buffer is acquired once all these are implemented, right?
(Again, not all part of GSoC, but at least putting in the framework.)

This is something of me thinking out loud, so let me know which parts are confusing, etc.


> It seems a shame to let go of a neatly PEP-defined Py_buffer for
> passing to external functions, but I think it won't be too bad if we
> ship, with each Cython version, nice C header and Fortran include files
> containing the appropriate structs and access macros.

I guess the access macros would make it easy for the external
functions to get to the important stuff, so my previous concern isn't
so essential (removing the 'top level' layer from the buffer struct
hierarchy).
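Purely as an illustration of what such access macros might look like, here is a sketch.  None of these names exist in Cython; the SKETCH_* macros and the sketch_strided_1d struct are assumptions modelled on the 1D struct above, with ptrdiff_t in place of Py_ssize_t and byte strides assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout an external C function would see. */
typedef struct {
    void *bufinfo;               /* opaque to the external function */
    char *data;
    ptrdiff_t shape0, stride0;   /* stride0 assumed to be in bytes */
} sketch_strided_1d;

/* Hypothetical access macros hiding the layers from the caller. */
#define SKETCH_BUF_DATA(b)    ((b).data)
#define SKETCH_BUF_SHAPE0(b)  ((b).shape0)
#define SKETCH_BUF_ITEM(b, type, i) \
    (*(type *)((b).data + (ptrdiff_t)(i) * (b).stride0))

/* Example external function: sums a strided view of doubles. */
static double sketch_sum_doubles(sketch_strided_1d buf)
{
    double total = 0.0;
    ptrdiff_t i;

    assert(SKETCH_BUF_DATA(buf) != NULL || SKETCH_BUF_SHAPE0(buf) == 0);
    for (i = 0; i < SKETCH_BUF_SHAPE0(buf); i++)
        total += SKETCH_BUF_ITEM(buf, double, i);
    return total;
}
```

With macros along these lines, the external function gets at the data pointer without caring how many wrapper layers sit above it.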

Kurt
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
