Re: [Python-Dev] An updated extended buffer PEP

2007-03-28 Thread Carl Banks


Travis Oliphant wrote:
 Carl Banks wrote:
 Travis E. Oliphant wrote:
 I think we are getting closer.   What do you think about Greg's idea 
 of basically making the provider the bufferinfo structure and having 
 the exporter handle copying memory over for shape and strides if it 
 wants to be able to change those before the lock is released.
 It seems like it's just a different way to return the data.  You could 
 do it by setting values through pointers, or do it by returning a 
 structure.  Which way you choose is a minor detail in my opinion.  I'd 
 probably favor returning the information in a structure.

 I would consider adding two fields to the structure:

 size_t structsize; /* size of the structure */
 Why is this necessary?  can't you get that by sizeof(bufferinfo)?

In case you want to add something later.  Though if you did that, it 
would be a different major release, meaning you'd have to rebuild 
anyway.  They rashly add fields to the PyTypeObject in the same way. :) 
  So never mind.


 PyObject* releaser; /* the object you need to call releasebuffer on */ 
 Is this so that another object could be used to manage releases if desired?

Yes, that was a use case I saw for a different view object.  I don't 
think it's crucially important to have it, but for exporting objects 
that delegate management of the buffer to another object, then it would 
be very helpful if the exporter could tell consumers that the other 
object is managing the buffer.

Suppose A is an exporting object, but it uses a hidden object R to 
manage the buffer memory.  Thus you have A referring to R, like this:

A -> R

Now object B takes a view of A.  If we don't have this field, then B 
will have to hold a reference to A, like this:

B -> A -> R

A would be responsible for keeping track of views, and A could not be 
garbage collected until B disappears.  If we do have this field, then A 
could tell B to hold a reference to R instead:

B -> R
A -> R

A is no longer obliged to keep track of views, and it can be garbage 
collected even if B still exists.
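The difference between the two reference graphs can be checked with a toy model. This is only a sketch: `Obj`, `obj_new`, `obj_decref` and `a_collectable_while_b_alive` are hypothetical stand-ins written for illustration, not CPython API, and each toy object refers to at most one other object.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted object; NOT the CPython API. */
typedef struct Obj {
    int refcount;
    struct Obj *ref;   /* the single object this one refers to, or NULL */
} Obj;

Obj *obj_new(Obj *ref) {
    Obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refcount = 1;
    o->ref = ref;
    if (ref != NULL) ref->refcount++;
    return o;
}

/* Drop one reference; returns 1 if the object was freed. */
int obj_decref(Obj *o) {
    if (--o->refcount > 0) return 0;
    if (o->ref != NULL) obj_decref(o->ref);
    free(o);
    return 1;
}

/* Returns 1 if exporter A can be collected while view B is still alive.
   use_releaser != 0 models "B -> R"; use_releaser == 0 models "B -> A -> R". */
int a_collectable_while_b_alive(int use_releaser) {
    Obj *r = obj_new(NULL);                  /* R manages the memory */
    Obj *a = obj_new(r);                     /* A -> R */
    Obj *b;
    int a_freed;
    r->refcount--;                           /* hand our own reference on R to A */
    b = obj_new(use_releaser ? r : a);       /* the view B */
    a_freed = obj_decref(a);                 /* the program drops A */
    obj_decref(b);                           /* finally tear down the view */
    return a_freed;
}
```

With the releaser field A goes away as soon as the program drops it; without the field A survives until B is gone.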


Here's a concrete example of where it would be useful: consider a 
ByteBufferSlice object.  Basically, the object represents a 
shared-memory slice of a 1-D array of bytes (for example, Python 3000 
bytes object, or an mmap object).

Now, if the ByteBufferSlice object could tell the consumer that someone 
else is managing the buffer, then it wouldn't have to keep track of 
views, thus simplifying things.

P.S. In thinking about this, it occurred to me that there should be a 
way to lock the buffer without requesting details.  ByteBufferSlice 
would already know the details of the buffer, but it would need to 
increment the original buffer's lock count.  Thus, I propose a new function:

typedef int (*lockbufferproc)(PyObject* self);
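As a rough illustration of what a lock-only call could look like on the exporter side, here is a sketch of a lock count guarding a resize. `ToyBuffer`, `toy_lockbuffer`, `toy_releasebuffer` and `toy_resize` are made-up names, and the rule "no resize while locked" is an assumption about how an exporter might use the count.

```c
#include <assert.h>
#include <stddef.h>

/* Toy exporter; hypothetical, not part of any real API. */
typedef struct {
    unsigned char *data;
    size_t length;
    int lock_count;    /* number of outstanding locks */
} ToyBuffer;

/* Analogue of the proposed lockbufferproc: lock without asking for details. */
int toy_lockbuffer(ToyBuffer *self) {
    assert(self != NULL);
    self->lock_count++;
    return 0;
}

/* Analogue of releasebuffer: returns -1 if the buffer was not locked. */
int toy_releasebuffer(ToyBuffer *self) {
    assert(self != NULL);
    if (self->lock_count == 0) return -1;
    self->lock_count--;
    return 0;
}

/* A resize may move the memory, so refuse it while any lock is held. */
int toy_resize(ToyBuffer *self, size_t new_length) {
    assert(self != NULL);
    if (self->lock_count > 0) return -1;
    self->length = new_length;   /* a real exporter would reallocate here */
    return 0;
}
```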


Carl Banks
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] An updated extended buffer PEP

2007-03-28 Thread Carl Banks


Carl Banks wrote:
 Here's a concrete example of where it would be useful: consider a 
 ByteBufferSlice object.  Basically, the object represents a 
 shared-memory slice of a 1-D array of bytes (for example, Python 3000 
 bytes object, or an mmap object).
 
 Now, if the ByteBufferSlice object could tell the consumer that someone 
 else is managing the buffer, then it wouldn't have to keep track of 
 views, thus simplifying things.
 
 P.S. In thinking about this, it occurred to me that there should be a 
 way to lock the buffer without requesting details.  ByteBufferSlice 
 would already know the details of the buffer, but it would need to 
 increment the original buffer's lock count.  Thus, I propose a new function:
 
 typedef int (*lockbufferproc)(PyObject* self);


And, because real examples are better than philosophical speculations, 
here's a skeleton implementation of the ByteBufferSlice array, sans 
boilerplate and error checking, and with some educated guessing about 
future details:


typedef struct  {
   PyObject_HEAD
   PyObject* releaser;
   unsigned char* buf;
   Py_ssize_t length;
}
ByteBufferSliceObject;


PyObject* ByteBufferSlice_new(PyObject* bufobj, Py_ssize_t start, 
Py_ssize_t end) {
   ByteBufferSliceObject* self;
   BufferInfoObject* bufinfo;

   self = (ByteBufferSliceObject*)type->tp_alloc(type, 0);
   bufinfo = PyObject_GetBuffer(bufobj);

   self->releaser = bufinfo->releaser;
   self->buf = bufinfo->buf + start;
   self->length = end-start;

   /* look how soon we're done with this information */
   Py_DECREF(bufinfo);

   return self;
}


PyObject* ByteBufferSlice_dealloc(PyObject* self) {
   PyObject_ReleaseBuffer(self->releaser);
   self->ob_type->tp_free((PyObject*)self);
}


PyObject* ByteBufferSlice_getbuffer(PyObject* self, int flags) {
   BufferInfoObject* bufinfo;
   static Py_ssize_t stridesarray[] = { 1 };

   bufinfo = BufferInfo_New();
   bufinfo->releaser = self->releaser;
   bufinfo->writable = 1;
   bufinfo->buf = self->buf;
   bufinfo->length = self->length;
   bufinfo->ndims = 1;
   bufinfo->strides = stridesarray;
   bufinfo->size = self->length;
   bufinfo->subbufoffsets = NULL;

   /* Before we go, increase the original buffer's lock count */
   PyObject_LockBuffer(self->releaser);

   return bufinfo;
}


/* don't define releasebuffer or lockbuffer */
/* only objects that manage buffers themselves would define these */


/* Now look how easy this is */
/* Everything works out if ByteBufferSlice reexports the buffer */

PyObject* ByteBufferSlice_getslice(PyObject* self, Py_ssize_t start, 
Py_ssize_t end) {
   return ByteBufferSlice_new(self,start,end);
}


The implementation of this is very straightforward, and it's easy to see 
why and how bufinfo->releaser works, and why it'd be useful.

It's almost like there's two protocols here: a buffer exporter protocol 
(getbuffer) and a buffer manager protocol (lockbuffer and 
releasebuffer).  Some objects would support only exporter protocol; 
others both.
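The split between the two roles can be modelled without any Python headers. In the sketch below, `ToyManager` plays the buffer manager (lock and release) and `ToySlice` plays an object that only exports; all names are hypothetical, and the plain integer lock count stands in for whatever bookkeeping a real manager would do.

```c
#include <assert.h>
#include <stddef.h>

/* Manager role: owns the memory and counts locks. */
typedef struct {
    unsigned char *data;
    ptrdiff_t length;
    int lock_count;
} ToyManager;

/* Exporter-only role: remembers who manages the memory it points into. */
typedef struct {
    ToyManager *releaser;
    unsigned char *buf;
    ptrdiff_t length;
} ToySlice;

/* What a consumer gets back; it releases through info.releaser. */
typedef struct {
    ToyManager *releaser;
    unsigned char *buf;
    ptrdiff_t length;
} ToyInfo;

void manager_lock(ToyManager *m) { m->lock_count++; }

void manager_release(ToyManager *m) {
    assert(m->lock_count > 0);
    m->lock_count--;
}

/* A slice holds one lock on the manager for its whole lifetime. */
ToySlice slice_new(ToyManager *m, ptrdiff_t start, ptrdiff_t end) {
    ToySlice s;
    assert(0 <= start && start <= end && end <= m->length);
    manager_lock(m);
    s.releaser = m;
    s.buf = m->data + start;
    s.length = end - start;
    return s;
}

/* Exporting takes a further lock on the manager, never on the slice. */
ToyInfo slice_getbuffer(const ToySlice *s) {
    ToyInfo info;
    info.releaser = s->releaser;
    info.buf = s->buf;
    info.length = s->length;
    manager_lock(s->releaser);
    return info;
}

void slice_dealloc(ToySlice *s) {
    manager_release(s->releaser);
    s->releaser = NULL;
}
```

In this model the slice can be torn down while a consumer still holds its info, because the consumer's lock sits on the manager.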


Carl Banks


Re: [Python-Dev] An updated extended buffer PEP

2007-03-28 Thread Greg Ewing
Carl Banks wrote:

 /* don't define releasebuffer or lockbuffer */
 /* only objects that manage buffers themselves would define these */

That's an advantage, but it's a pretty small one -- the
releasebuffer implementation would be very simple in
this case.

I'm bothered that the releaser field makes the protocol
asymmetrical and thus harder to reason about. It would
cost me more mental effort to convince myself that a
releasebuffer implementation wasn't needed in any
particular case than it would to write the one-line
implementation otherwise required.

--
Greg


Re: [Python-Dev] An updated extended buffer PEP

2007-03-28 Thread Greg Ewing
Carl Banks wrote:

 Now object B takes a view of A.  If we don't have this field, then B 
 will have to hold a reference to A, like this:
 
 B -> A -> R
 
 A would be responsible for keeping track of views,

A isn't keeping track of views, it's keeping track of the
single object R, which it has to keep a reference to anyway.

 and A could not be 
 garbage collected until B disappears.

I'm not convinced that this would be a serious problem. An
object that's using a different object to manage the buffer
is probably quite small, so it doesn't matter much if it
stays around.

 Here's a concrete example of where it would be useful: consider a 
 ByteBufferSlice object.  Basically, the object represents a 
 shared-memory slice of a 1-D array of bytes (for example, Python 3000 
 bytes object, or an mmap object).

And this would be a very small object, not worth the trouble
of caring whether it stays around a bit longer than needed,
IMO.

 P.S. In thinking about this, it occurred to me that there should be a 
 way to lock the buffer without requesting details.

Perhaps you could do this by calling getbuffer with NULL
for the bufferinfo pointer, and similarly call releasebuffer
with NULL to unlock it.
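A sketch of that convention, assuming a struct-based getbuffer/releasebuffer pair: a NULL info pointer means "lock or unlock only". `ToyInfo`, `ToyExporter` and both functions are hypothetical and greatly simplified.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    void *buf;
    ptrdiff_t len;
} ToyInfo;

typedef struct {
    unsigned char data[8];
    int lock_count;
} ToyExporter;

/* Fill in the details only when the caller asked for them; always lock. */
int toy_getbuffer(ToyExporter *self, ToyInfo *info) {
    assert(self != NULL);
    if (info != NULL) {
        info->buf = self->data;
        info->len = (ptrdiff_t)sizeof self->data;
    }
    self->lock_count++;
    return 0;
}

/* Clear the details if any were handed out; always unlock. */
int toy_releasebuffer(ToyExporter *self, ToyInfo *info) {
    assert(self != NULL && self->lock_count > 0);
    if (info != NULL) {
        info->buf = NULL;
        info->len = 0;
    }
    self->lock_count--;
    return 0;
}
```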

--
Greg


Re: [Python-Dev] An updated extended buffer PEP

2007-03-27 Thread Carl Banks
Travis Oliphant wrote:
 Travis Oliphant wrote:
 Hi Carl and Greg,

 Here is my updated PEP which incorporates several parts of the 
 discussions we have been having.
 
 And here is the actual link:
 
 http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/pep_buffer.txt 


What's the purpose of void** segments in PyObject_GetBuffer?  It seems 
like it's leftover from an older incarnation?

I'd hope after more recent discussion, we'll end up simplifying 
releasebuffer.  It seems like it'd be a nightmare to keep track of what 
you've released.


Finally, the isptr thing.  It's just not sufficient.  Frankly, I'm 
having doubts whether it's a good idea to support multibuffer at all. 
Sure, it brings generality, but I'm thinking it's too hard to explain and 
too hard to get one's head around, and will lead to lots of 
misunderstanding and bugginess.  OTOH, it really doesn't burden anyone 
except those who want to export multi-buffered arrays, and we only have 
one shot to do it.  I just hope it doesn't confuse everyone so much that 
no one bothers.

Here's how I would update the isptr thing.  I've changed derefoff to 
subbufferoffsets to describe it better.


typedef PyObject *(*getbufferproc)(PyObject *obj, void **buf,
Py_ssize_t *len, int *writeable,
char **format, int *ndims,
Py_ssize_t **shape,
Py_ssize_t **strides,
Py_ssize_t **subbufferoffsets);


subbufferoffsets

   Used to export information about multibuffer arrays.  It is an
   address of a ``Py_ssize_t *`` variable that will be set to point at
   an array of ``Py_ssize_t`` of length ``*ndims``.

   [I don't even want to try a verbal description.]

   To demonstrate how subbufferoffsets works, here is an example of a
   function that returns a pointer to an element of ANY N-dimensional
   array, single- or multi-buffered.

void* get_item_pointer(int ndim, void* buf, Py_ssize_t* strides,
                       Py_ssize_t* subarrayoffs, Py_ssize_t *indices) {
    char* pointer = (char*)buf;
    int i;
    for (i = 0; i < ndim; i++) {
        pointer += strides[i]*indices[i];
        /* non-negative offset: dereference, then skip into the new buffer */
        if (subarrayoffs[i] >= 0) {
            pointer = *(char**)pointer + subarrayoffs[i];
        }
    }
    return (void*)pointer;
}

   For single buffers, subbufferoffsets is negative in every dimension
   and it reduces to normal single-buffer indexing.  For multi-buffers,
   subbufferoffsets indicates when to dereference the pointer and switch
   to the new buffer, and gives the offset into the buffer to start at.
   In most cases, the subbufferoffset would be zero (indicating it should
   start at the beginning of the new buffer), but can be a positive
   number if the following dimension has been sliced, and thus the 0th
   entry in that dimension would not be at the beginning of the new
   buffer.
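To make the rule concrete, here is a self-contained sketch that applies the same walk to a contiguous 2x3 array and to an array of row pointers whose rows have been sliced to skip their first element. `ssize_type` stands in for `Py_ssize_t`, and the two demo functions and their data are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef ptrdiff_t ssize_type;   /* stand-in for Py_ssize_t */

void *get_item_pointer(int ndim, void *buf, const ssize_type *strides,
                       const ssize_type *suboffsets,
                       const ssize_type *indices) {
    char *pointer = (char *)buf;
    int i;
    assert(ndim >= 0);
    for (i = 0; i < ndim; i++) {
        pointer += strides[i] * indices[i];
        if (suboffsets[i] >= 0) {
            /* switch to the sub-buffer and skip suboffsets[i] bytes into it */
            pointer = *(char **)pointer + suboffsets[i];
        }
    }
    return (void *)pointer;
}

/* Single buffer: every suboffset is negative, so nothing is dereferenced. */
int single_buffer_item(ssize_type row, ssize_type col) {
    static int data[2][3] = { {1, 2, 3}, {4, 5, 6} };
    ssize_type strides[2] = { (ssize_type)(3 * sizeof(int)),
                              (ssize_type)sizeof(int) };
    ssize_type suboffsets[2] = { -1, -1 };
    ssize_type indices[2];
    indices[0] = row;
    indices[1] = col;
    return *(int *)get_item_pointer(2, data, strides, suboffsets, indices);
}

/* Multi-buffer: dimension 0 indexes an array of row pointers. The positive
   suboffset models a slice that drops the first element of every row. */
int multi_buffer_item(ssize_type row, ssize_type col) {
    static int row0[4] = { 10, 11, 12, 13 };
    static int row1[4] = { 20, 21, 22, 23 };
    static char *rows[2] = { (char *)row0, (char *)row1 };
    ssize_type strides[2] = { (ssize_type)sizeof(char *),
                              (ssize_type)sizeof(int) };
    ssize_type suboffsets[2] = { (ssize_type)sizeof(int), -1 };
    ssize_type indices[2];
    indices[0] = row;
    indices[1] = col;
    return *(int *)get_item_pointer(2, rows, strides, suboffsets, indices);
}
```

In the multi-buffer case, index (1, 2) lands on `row1[3]`, because the suboffset shifts every row's start by one `int`.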



Other than that, looks good. :)


Carl Banks


Re: [Python-Dev] An updated extended buffer PEP

2007-03-27 Thread Travis E. Oliphant
Greg Ewing wrote:
 Here's another idea, to allow multiple views of the same
 buffer with different shape/stride info to coexist, but
 without extra provider objects or refcount weirdness.
 Also it avoids using calls with a brazillion arguments.
 
struct bufferinfo {
  void **buf;
  Py_ssize_t *len;
  int *writeable;
  char **format;
  int *ndims;
  Py_ssize_t **shape;
  Py_ssize_t **strides;
  int **isptr;
};
 
int (*getbuffer)(PyObject *obj, struct bufferinfo *info);
 
int (*releasebuffer)(PyObject *obj, struct bufferinfo *info);


This is not much different from my original view object.  Stick a 
PyObject_HEAD at the start of this bufferinfo and you have it.

Memory management was the big reason I wanted to do something like this.

I don't see why a PyObject_HEAD would make anything significantly 
slower.  Then we could use Python's memory management very easily to 
create and destroy these things.  This bufferinfo object would become 
the provider I was talking about.

 If the object has constant shape/stride info, it just fills
 in the info struct with pointers to its own memory, and does
 nothing when releasebuffer is called (other than unlocking
 its buffer).
 
 If its shape/stride info can change, it mallocs memory for
 them and copies them into the info struct. When releasebuffer
 is called, it frees this memory.

 
 It is the responsibility of the consumer to ensure that the
 base object remains alive until releasebuffer has been called
 on the info struct (to avoid leaking any memory that has
 been malloced for shapes/strides).

This is a reasonable design choice.  I actually prefer to place all the 
buffer information in a single object rather than the multiple argument 
design because it scales better and is easier to explain and understand.

-Travis



Re: [Python-Dev] An updated extended buffer PEP

2007-03-27 Thread Travis E. Oliphant
Carl Banks wrote:
 Travis Oliphant wrote:
 Travis Oliphant wrote:
 Hi Carl and Greg,

 Here is my updated PEP which incorporates several parts of the 
 discussions we have been having.
 And here is the actual link:

 http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/pep_buffer.txt 
 
 
 What's the purpose of void** segments in PyObject_GetBuffer?  It seems 
 like it's leftover from an older incarnation?
 

Yeah, I forgot to change that location.

 I'd hope after more recent discussion, we'll end up simplifying 
 releasebuffer.  It seems like it'd be a nightmare to keep track of what 
 you've released.

Yeah, I agree.   I think I'm leaning toward the bufferinfo structure 
which allows the exporter to copy memory for things that it wants to be 
free to change while the buffer is exported.

 
 
 Finally, the isptr thing.  It's just not sufficient.  Frankly, I'm 
 having doubts whether it's a good idea to support multibuffer at all. 
 Sure, it brings generality, but I'm thinking it's too hard to explain and 
 too hard to get one's head around, and will lead to lots of 
 misunderstanding and bugginess.  OTOH, it really doesn't burden anyone 
 except those who want to export multi-buffered arrays, and we only have 
 one shot to do it.  I just hope it doesn't confuse everyone so much that 
 no one bothers.

People used to have doubts about explaining strides in NumPy as well.  I 
sure would have hated to see them eliminate the possibility because of 
those doubts. I think the addition you discuss is not difficult once you 
get a hold of it.

I also understand now why subbufferoffsets is needed.  I was thinking 
that for slices you would just re-create a whole other array of pointers 
to contain that addition.   But, that is really not advisable.  It makes 
sense when you are talking about a single pointer variable (like in 
NumPy) but it doesn't when you have an array of pointers.

Providing the example about how to extract the pointer from the returned 
information goes a long way towards clearing up any remaining confusion.

Your ImageObject example is also helpful.   I really like the addition 
and think it is clear enough and supports a lot of use cases with very 
little effort.

 
 Here's how I would update the isptr thing.  I've changed derefoff to 
 subbufferoffsets to describe it better.
 
 
 typedef PyObject *(*getbufferproc)(PyObject *obj, void **buf,
 Py_ssize_t *len, int *writeable,
 char **format, int *ndims,
 Py_ssize_t **shape,
 Py_ssize_t **strides,
 Py_ssize_t **subbufferoffsets);
 
 
 subbufferoffsets
 
Used to export information about multibuffer arrays.  It is an
address of a ``Py_ssize_t *`` variable that will be set to point at
an array of ``Py_ssize_t`` of length ``*ndims``.
 
[I don't even want to try a verbal description.]
 
To demonstrate how subbufferoffsets works, here is an example of a
function that returns a pointer to an element of ANY N-dimensional
array, single- or multi-buffered.
 
 void* get_item_pointer(int ndim, void* buf, Py_ssize_t* strides,
                        Py_ssize_t* subarrayoffs, Py_ssize_t *indices) {
     char* pointer = (char*)buf;
     int i;
     for (i = 0; i < ndim; i++) {
         pointer += strides[i]*indices[i];
         if (subarrayoffs[i] >= 0) {
             pointer = *(char**)pointer + subarrayoffs[i];
         }
     }
     return (void*)pointer;
 }
 
For single buffers, subbufferoffsets is negative in every dimension
and it reduces to normal single-buffer indexing.  

What about just having subbufferoffsets be NULL in this case?  i.e. you 
don't need it.  If some of the dimensions did not need dereferencing 
then they would be negative (how about we just say -1 to be explicit)?
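Under that convention the lookup only needs one extra check. The following is a sketch, with `ssize_type` standing in for `Py_ssize_t` and `get_item_pointer_opt` and `item_without_suboffsets` being hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

typedef ptrdiff_t ssize_type;   /* stand-in for Py_ssize_t */

/* Same walk as before, but a NULL suboffsets array means "single buffer". */
void *get_item_pointer_opt(int ndim, void *buf, const ssize_type *strides,
                           const ssize_type *suboffsets,
                           const ssize_type *indices) {
    char *pointer = (char *)buf;
    int i;
    assert(ndim >= 0);
    for (i = 0; i < ndim; i++) {
        pointer += strides[i] * indices[i];
        if (suboffsets != NULL && suboffsets[i] >= 0) {
            pointer = *(char **)pointer + suboffsets[i];
        }
    }
    return (void *)pointer;
}

/* A contiguous 2x2 array exported with no suboffsets array at all. */
int item_without_suboffsets(ssize_type row, ssize_type col) {
    static int data[2][2] = { {1, 2}, {3, 4} };
    ssize_type strides[2] = { (ssize_type)(2 * sizeof(int)),
                              (ssize_type)sizeof(int) };
    ssize_type indices[2];
    indices[0] = row;
    indices[1] = col;
    return *(int *)get_item_pointer_opt(2, data, strides, NULL, indices);
}
```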

For multi-buffers,
subbufferoffsets indicates when to dereference the pointer and switch
to the new buffer, and gives the offset into the buffer to start at.
In most cases, the subbufferoffset would be zero (indicating it should
start at the beginning of the new buffer), but can be a positive
number if the following dimension has been sliced, and thus the 0th
entry in that dimension would not be at the beginning of the new
buffer.
 
 
 
 Other than that, looks good. :)
 

I think we are getting closer.   What do you think about Greg's idea of 
basically making the provider the bufferinfo structure and having the 
exporter handle copying memory over for shape and strides if it wants to 
be able to change those before the lock is released.

-Travis


Re: [Python-Dev] An updated extended buffer PEP

2007-03-27 Thread Travis Oliphant
Lisandro Dalcin wrote:
 On 3/26/07, Travis Oliphant [EMAIL PROTECTED] wrote:
 Here is my updated PEP which incorporates several parts of the
 discussions we have been having.

 Travis, it looks really good, below my comments
I hope you don't mind me replying to python-dev.


 1- Is it hard to EXTEND PyBufferProcs in order to be able to use all
 this machinery in Py 2.X series, not having to wait until Py3k?

No, I don't think it will be hard.  I just wanted to focus on Py3k since 
it is going to happen before Python 2.6 and I wanted it discussed in 
that world.

 2- It's not clear to me if this PEP will enable object types defined
 in the Python side to export buffer info. This is a feature I really
 like in numpy, and simplifies my life a lot when I need to export
 memory for C/C++ object wrapped with the help of tools like SWIG.
This PEP does not address that.  You will have to rely on the objects 
themselves for any such information.

 3- Why not constrain the returned 'view' object to be of a
 specific type defined in the C side (and perhaps available in the
 Python side)? This 'view' object could maintain a reference to the
 base object containing the data, could call releasebuffer using the
 base object when the view object is decref'ed, and can have a flag
 field for things like OWN_MEMORY, OWN_SHAPE, etc in order to properly
 manage memory deallocation. Does all this make sense?

Yes, that was my original thinking and we are kind of coming back to it 
after several iterations.   Perhaps, though, we can stick with an 
object-less buffer interface but have this view object as an expanded 
buffer object.

-Travis





Re: [Python-Dev] An updated extended buffer PEP

2007-03-27 Thread Greg Ewing
Travis E. Oliphant wrote:
 Greg Ewing wrote:

   struct bufferinfo {
 ...
   };

   int (*getbuffer)(PyObject *obj, struct bufferinfo *info);
   int (*releasebuffer)(PyObject *obj, struct bufferinfo *info);

 This is not much different from my original view object.  Stick a 
 PyObject_HEAD at the start of this bufferinfo and you have it.

The important difference is that it *doesn't* have
PyObject_HEAD at the start of it. :-)

 I don't see why a PyObject_HEAD would make anything significantly 
 slower.  Then we could use Python's memory management very easily to 
 create and destroy these things.

In the case where the shape/stride info is constant, and
the caller is able to allocate the struct bufferinfo on
the stack, my proposal requires no memory allocations at
all. That's got to be faster than allocating and freeing
a Python object.

When it is necessary to allocate memory for the shape/stride,
some mallocs and frees (or Python equivalents) are going to
be needed either way. I don't see how using a Python object
makes this any easier.
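The constant-shape case might look roughly like this: the consumer keeps the info struct on its own stack and the exporter only stores pointers into memory it already owns, so no allocation happens anywhere. `ToyInfo`, `ToyVector` and the functions are hypothetical, and the struct uses plain fields rather than the pointer-to-pointer fields of the proposal.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const double *buf;
    int ndims;
    const ptrdiff_t *shape;   /* points into the exporter; never freed */
} ToyInfo;

typedef struct {
    double data[4];
    ptrdiff_t shape[1];       /* constant for the life of the object */
    int lock_count;
} ToyVector;

/* No malloc: just publish pointers to the exporter's own memory. */
int toy_getbuffer(ToyVector *self, ToyInfo *info) {
    assert(self != NULL && info != NULL);
    info->buf = self->data;
    info->ndims = 1;
    info->shape = self->shape;
    self->lock_count++;
    return 0;
}

/* Nothing to free; releasing only unlocks. */
int toy_releasebuffer(ToyVector *self, ToyInfo *info) {
    (void)info;
    assert(self != NULL && self->lock_count > 0);
    self->lock_count--;
    return 0;
}

/* Consumer: the info struct lives on this function's stack frame. */
double toy_sum(ToyVector *v) {
    ToyInfo info;
    double total = 0.0;
    ptrdiff_t i;
    if (toy_getbuffer(v, &info) != 0) return 0.0;
    for (i = 0; i < info.shape[0]; i++) total += info.buf[i];
    toy_releasebuffer(v, &info);
    return total;
}
```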

--
Greg


Re: [Python-Dev] An updated extended buffer PEP

2007-03-27 Thread Greg Ewing
Travis Oliphant wrote:
 Perhaps, though we can stick with an 
 object-less buffer interface but have this view object as an expanded 
 buffer object.

I like this idea.

--
Greg


Re: [Python-Dev] An updated extended buffer PEP

2007-03-27 Thread Travis Oliphant
Carl Banks wrote:
 Travis E. Oliphant wrote:
 I think we are getting closer.   What do you think about Greg's idea 
 of basically making the provider the bufferinfo structure and having 
 the exporter handle copying memory over for shape and strides if it 
 wants to be able to change those before the lock is released.

 It seems like it's just a different way to return the data.  You could 
 do it by setting values through pointers, or do it by returning a 
 structure.  Which way you choose is a minor detail in my opinion.  I'd 
 probably favor returning the information in a structure.

 I would consider adding two fields to the structure:

 size_t structsize; /* size of the structure */
Why is this necessary?  can't you get that by sizeof(bufferinfo)?

 PyObject* releaser; /* the object you need to call releasebuffer on */ 
Is this so that another object could be used to manage releases if desired?

-Travis



[Python-Dev] An updated extended buffer PEP

2007-03-26 Thread Travis Oliphant

Hi Carl and Greg,

Here is my updated PEP which incorporates several parts of the 
discussions we have been having. 

-Travis


 


Re: [Python-Dev] An updated extended buffer PEP

2007-03-26 Thread Greg Ewing
Travis Oliphant wrote:

  Here is my updated PEP which incorporates several parts of the 
  discussions we have been having.

It looks pretty good.

However, I'm still having trouble seeing what use it is returning
a different object from getbuffer. There seems to be no rationale
set out for this in the PEP. Can you give me a concrete example of
a case where it would be necessary?

Also it appears that you're returning a borrowed reference, so
if the provider object is not the same as the main object, this
would seem to require the main object to keep references to all
the provider objects that it has handed out, until releasebuffer
has been called on them. This seems very odd to me.

--
Greg


Re: [Python-Dev] An updated extended buffer PEP

2007-03-26 Thread Greg Ewing
Here's another idea, to allow multiple views of the same
buffer with different shape/stride info to coexist, but
without extra provider objects or refcount weirdness.
Also it avoids using calls with a brazillion arguments.

   struct bufferinfo {
 void **buf;
 Py_ssize_t *len;
 int *writeable;
 char **format;
 int *ndims;
 Py_ssize_t **shape;
 Py_ssize_t **strides;
 int **isptr;
   };

   int (*getbuffer)(PyObject *obj, struct bufferinfo *info);

   int (*releasebuffer)(PyObject *obj, struct bufferinfo *info);

If the object has constant shape/stride info, it just fills
in the info struct with pointers to its own memory, and does
nothing when releasebuffer is called (other than unlocking
its buffer).

If its shape/stride info can change, it mallocs memory for
them and copies them into the info struct. When releasebuffer
is called, it frees this memory.

It is the responsibility of the consumer to ensure that the
base object remains alive until releasebuffer has been called
on the info struct (to avoid leaking any memory that has
been malloced for shapes/strides).
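A minimal sketch of the two cases, using a simplified struct with plain fields instead of the pointer fields above. `ToyInfo`, `ToyArray`, and the `owns_shape` and `shape_can_change` flags are assumptions added for illustration; the proposal does not say how the exporter remembers whether it allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    void *buf;
    int ndims;
    ptrdiff_t *shape;
    int owns_shape;         /* hypothetical: 1 if shape was malloced for us */
} ToyInfo;

typedef struct {
    double data[6];
    int ndims;
    ptrdiff_t shape[2];
    int shape_can_change;   /* hypothetical flag chosen by the exporter */
    int lock_count;
} ToyArray;

int toy_getbuffer(ToyArray *self, ToyInfo *info) {
    assert(self != NULL && info != NULL);
    info->buf = self->data;
    info->ndims = self->ndims;
    if (self->shape_can_change) {
        /* give the consumer a private snapshot of the shape */
        size_t nbytes = (size_t)self->ndims * sizeof *info->shape;
        info->shape = malloc(nbytes);
        if (info->shape == NULL) return -1;
        memcpy(info->shape, self->shape, nbytes);
        info->owns_shape = 1;
    } else {
        /* constant shape: point straight at the exporter's own memory */
        info->shape = self->shape;
        info->owns_shape = 0;
    }
    self->lock_count++;
    return 0;
}

int toy_releasebuffer(ToyArray *self, ToyInfo *info) {
    assert(self != NULL && info != NULL && self->lock_count > 0);
    if (info->owns_shape) free(info->shape);
    info->shape = NULL;
    info->owns_shape = 0;
    self->lock_count--;     /* unlock the buffer in both cases */
    return 0;
}
```

Because the consumer holds a snapshot, the exporter in this model is free to reshape itself while the buffer is still exported.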

--
Greg