Seems I got too caught up in C++ stuff :-) Here's an answer.
Kurt Smith wrote:
> On Wed, May 6, 2009 at 11:31 AM, Dag Sverre Seljebotn
> <[email protected]> wrote:
>> (Kurt is the major recipient of this.)
>>
>> I've been thinking some more on buffer passing. In the end (and perhaps
>> even in summer; I could perhaps have a go at it working alongside you)
>> the scenario we are looking at is something like
>>
>> external_set_foo_array(foo_handle, self.arr[2::2])
>>
>> with fast operations all the way.
>>
>> That is
>> a) Efficient slicing without Python overhead (#178)
>> b) Storing acquired buffers in cdef class fields (#301)
>> c) External functions can keep a reference to the buffer (not really
>> necessary, but it is necessary for internal Cython "cdef" functions, and
>> it would be nice to treat them the same)
>> d) The base pointer may have to be moved (again slicing can do this)
>>
>> This seems to make it clear that
>> a) The Py_buffer is not suitable as the primary vessel of our buffer
>> data, e.g. to the external function. It can't work that well with
>> slicing as we must maintain the original Py_buffer data when releasing it.
>> b) We need a reference count. E.g. doing a slice, or assigning to
>> self.arr, would increase this count. This must go on the heap; and so we
>> might as well put the entire Py_buffer on the heap.
>
> Doing the slicing manually on the buffer data and having a refcounted
> heap-allocated Py_buffer would address the object-Py_buffer data
> synchronization problem, right? The above modifications would have
> that, once the buffer is acquired, all modifications to it are done at
> the buffer level.
Not sure what you mean, but: once the buffer is acquired, all new slices
etc. are done in the local structs, and the Py_buffer just sits there for
querying information that doesn't change, and for use at release time.
> Although, what about when we require a contiguous copy (or any mode,
> for that matter) of a buffer? Would we set the top-level object
> reference to None, as suggested in a previous thread? That would
> prevent slicing at the Python API level, though, right?
Hmm. I'm still thinking about contiguous copies. Let's deal with that in
the next iteration.
> General question: how difficult would it be for the Cython-side
> programmer to get the underlying char *data inside the Py_buffer, in
> case it needs to be accessed? It seems there are more than a few
> 'layers' -- are they all necessary? The justifications I see for each
> are the following:
>
> typedef struct {
>     void *buf;
>     PyObject *obj;
>     Py_ssize_t len;
>     Py_ssize_t itemsize;
>     int readonly;
>     int ndim;
>     char *format;
>     Py_ssize_t *shape;
>     Py_ssize_t *strides;
>     Py_ssize_t *suboffsets;
>     void *internal;
> } Py_buffer;
I think the presence of the obj field here is a bit unsatisfactory. It
seems that after months (not to say years) of buffer PEP discussion,
someone just added it in a bugfix *sigh*
Anyway, it is not, to my knowledge, documented anywhere, but there's a
long discussion about it here:
http://bugs.python.org/issue3139
(You don't really need to read it though.)
Things seem a bit unclear due to the lack of official documentation, but
at least for now I don't want to assume that the PyObject filled in here
is the same as the object you acquire the buffer from. The obj field is
filled in by the tp_getbuffer implementation which could do things like
make a dedicated copy of the buffer into a new object and then pass that
new object into the obj field.
Anyway, the idea (I gather from reading CPython sources) is that obj is
who should receive the release call, which may or may not be the object
you acquired the buffer from.
So really, obj here is just present as the recipient of a release. It
might even be set to NULL if there's no need to release the buffer.
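In other words, the release path dispatches on the obj field, not on the object you acquired on. A toy model of that dispatch, with no CPython involved (the exporter/proxy names and the Obj/Buffer stand-in structs are made up for illustration; the real mechanism is bf_releasebuffer via PyBuffer_Release):

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for PyObject and Py_buffer. */
typedef struct Obj { int released; } Obj;
typedef struct { void *buf; Obj *obj; } Buffer;

static Obj exporter, proxy;

/* A getbuffer that hands out a dedicated copy: it fills obj with a
   different object (proxy) than the one the buffer was acquired on. */
static void getbuffer(Obj *self, Buffer *view) {
    static char copy[16];
    (void)self;
    view->buf = copy;
    view->obj = &proxy;   /* release must go here, not to `self` */
}

/* Mirrors what PyBuffer_Release does: the call is dispatched on
   view->obj, and obj == NULL means there is nothing to release. */
static void release(Buffer *view) {
    if (view->obj != NULL)
        view->obj->released = 1;
}
```

So code holding a Buffer never needs to remember which object it originally asked; the view itself carries the release recipient.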
> The "base layer" -- interface structure for tp_getbuffer from objects, etc.
>
> typedef struct {
>     size_t refcount;
>     Py_buffer bufinfo;
> } __Pyx_Buffer;
>
> The Cython wrapper around Py_buffer -- includes cython-managed
> refcounting upon assignment/slicing (other cases?)
Basically whenever the __Pyx_Buffer* is copied.
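Copying and dropping that pointer could then be a pair of tiny helpers; a minimal sketch, assuming release of the underlying Py_buffer happens exactly when the count hits zero (the helper names are hypothetical, and the PyBuffer_Release call is elided so the sketch builds without Python.h):

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    size_t refcount;
    /* Py_buffer bufinfo;  -- elided to keep the sketch self-contained */
} PyxBuffer;

/* Hypothetical name: every struct copy (slice, assignment to a cdef
   class field, passing to a cdef function) bumps the count. */
static PyxBuffer *pyx_buffer_copy(PyxBuffer *b) {
    b->refcount++;
    return b;
}

/* Hypothetical name: the last holder releases the buffer. The real
   helper would call PyBuffer_Release(&b->bufinfo) before free(). */
static void pyx_buffer_drop(PyxBuffer *b) {
    if (--b->refcount == 0)
        free(b);
}
```

The point of putting the count on the heap next to the Py_buffer is that all holders see the same counter, so the release happens exactly once.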
> typedef struct {
>     __Pyx_Buffer *bufinfo;
>     char *data;
>     Py_ssize_t shape0, stride0;
> } __Pyx_StridedBuf_1D;
>
> The "int[:] buf" buffer -- exists to make the new buffer syntax more
> transparent (not part of GSoC, but putting in framework for it). char
> *data pointer to allow fast slicing w/o Python API (e.g. arr[10:]),
> shape0 & stride0 copied from buffer for optimizations.
Yep. (And shape0 and stride0 also allow fast slicing; we don't want to
be modifying the shape/stride arrays passed in the Py_buffer.)
To be pedantic (and to make sure things are understood correctly):
there's no Python API for slicing, only a NumPy one.
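With the data pointer and the shape0/stride0 copies local, a slice like arr[start:stop:step] reduces to a few arithmetic operations, no Python calls involved. A minimal sketch (the struct mirrors __Pyx_StridedBuf_1D with bufinfo elided, the slice helper itself is hypothetical, and it assumes 0 <= start <= stop <= shape0 and step > 0):

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the quoted __Pyx_StridedBuf_1D, minus the bufinfo pointer. */
typedef struct {
    char *data;
    ptrdiff_t shape0, stride0;  /* Py_ssize_t in the real code */
} StridedBuf1D;

/* Hypothetical helper: buf[start:stop:step] without touching Python.
   Only the local copies change; the Py_buffer's own shape/stride
   arrays stay untouched for the eventual release. */
static StridedBuf1D slice_1d(StridedBuf1D buf, ptrdiff_t start,
                             ptrdiff_t stop, ptrdiff_t step)
{
    StridedBuf1D out;
    out.data = buf.data + start * buf.stride0;     /* move the base pointer */
    out.shape0 = (stop - start + step - 1) / step; /* ceil division */
    out.stride0 = buf.stride0 * step;
    return out;
}
```

For a 10-element int buffer, arr[2::2] becomes slice_1d(buf, 2, 10, 2): the base pointer advances by two elements, shape0 becomes 4, and stride0 doubles.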
>
> typedef struct {
>     __Pyx_StridedBuf_1D buffer;
>     PyObject *object;
> } __Pyx_StridedBuf_1D_Obj;
>
> The "object[int, mode='strided'] buf" object buffer. This is the top
> level and holds a reference to the original PyObject from which the
> buffer was acquired. Object reference required when a python-level
> access/assignment is required.
Yes. a) Since this doesn't have to be the same as the one in the
Py_buffer, a separate reference seems necessary; b) storing a
function-local variable inside a pointer will make accesses to the object
part of the variable slightly slower.
> So, initially, if one passes a 'buffer' to a function like so:
>
> external_func(arr[2::2])
>
> What is the slicing modifying/updating? The top-level through the
> Python API layer via the PyObject? Or is it being done to the
> __Pyx_StridedBuf_1D members -- offsetting the char *data pointer by 2,
> and changing the shape0 and stride0 members? (Would it be the top
> layer object at first & for the GSoC, then when faster, low-level
> slicing is implemented, directly on the lower-level?)
Yes.
At first (and for GSoC) we wouldn't know anything about "arr[2::2]"; it
would just be some Python expression. It would (if arr is a NumPy object)
return a new NumPy object from which we can acquire a correct Py_buffer.
When slicing is optimized, we don't have that luxury, and must store the
info in shape0/stride0.
>
> Besides Python API-level slicing, why is the top-level object
> reference required? Is it for buffer reacquisition through
> tp_getbuffer? That will be handled at the next lower level through a
> struct copy by value and increase of the refcount. It isn't for
> buffer release, either, since that doesn't require the original object
> reference explicitly. (There is an object reference at the Py_buffer
> level).
It is needed because when you do
cdef np.ndarray[int] obj = ...
obj.fooblargh()
on which object do you execute "fooblargh()"? (And we can't rely on the
Py_buffer level one.)
> Basically, nothing Cython-side would require the PyObject from which
> the original buffer is acquired once all these are implemented, right?
> (Again, not all part of GSoC, but at least putting in the framework.)
Yes, that would happen at the same rate that the
cdef object[int] a
syntax is deprecated in favor of
cdef int[:] a
This is really just linked to syntax.
>> It seems a shame to let go of a neatly PEP-defined Py_buffer for
>> passing to external functions, but I think it won't be too bad if we
>> with each Cython version ship nice C header and Fortran include files
>> containing the appropriate structs and access macros.
>
> I guess the access macros would make it easy for the external
> functions to get to the important stuff, so my previous concern isn't
> so essential (removing the 'top level' layer from the buffer struct
> hierarchy).
Well, if you think it is easier we can just introduce the new syntax
only in a few limited spots for your GSoC and thus drop the top level.
Your pick. I don't think it makes much of a difference.
--
Dag Sverre
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev