[Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-19 Thread Carl Banks
Travis Oliphant wrote:
 Carl Banks wrote:
 Ok, I've thought quite a bit about this, and I have an idea that I 
 think will be ok with you, and I'll be able to drop my main 
 objection.  It's not a big change, either.  The key is to explicitly 
 say whether the flag allows or requires.  But I made a few other 
 changes as well.
 I'm good with using an identifier to differentiate between an allowed 
 flag and a require flag.   I'm not a big fan of 
 VERY_LONG_IDENTIFIER_NAMES though.  Just enough to understand what it 
 means but not so much that it takes forever to type and uses up 
 horizontal real-estate.

That's fine with me.  I'm not very particular about spellings, as long
as they're not misleading.

 Now, here is a key point: for these functions to work (indeed, for 
 PyObject_GetBuffer to work at all), you need enough information in 
 bufinfo to figure it out.  The bufferinfo struct should be 
 self-contained; you should not need to know what flags were passed to 
 PyObject_GetBuffer in order to know exactly what data you're looking at.
 Naturally.
 
 Therefore, format must always be supplied by getbuffer.  You cannot 
 tell if an array is contiguous without the format string.  (But see 
 below.)
 
 No, I don't think this is quite true.   You don't need to know what 
 kind of data you are looking at if you don't get strides.  If you use 
 the SIMPLE interface, then both consumer and exporter know the object is 
 looking at bytes which always has an itemsize of 1.

But doesn't this violate the above maxim?  Suppose these are the
contents of bufinfo:

ndim = 1
len = 20
shape = (10,)
strides = (2,)
format = NULL

How does it know whether it's looking at a contiguous array of 10 two-byte
objects, or a discontiguous array of 10 one-byte objects, without having
at least an item size?  Since item size is now in the mix, it's moot, of
course.
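
To make the ambiguity concrete, here is a minimal sketch. It assumes a
pared-down, hypothetical struct (mini_bufinfo) and helper
(is_contiguous_1d); neither name comes from the PEP. The same
len/shape/strides values come out contiguous or not depending only on the
item size, which is why that number has to be available from somewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the bufferinfo struct under discussion;
   only the fields needed for a one-dimensional contiguity check. */
struct mini_bufinfo {
    ptrdiff_t len;      /* total number of bytes exported */
    ptrdiff_t shape0;   /* number of items */
    ptrdiff_t stride0;  /* bytes between consecutive items */
};

/* A 1-D buffer has no gaps exactly when the stride equals the item size
   and the items account for every exported byte.  The item size cannot
   be derived from shape and strides alone, so it is a parameter. */
static int is_contiguous_1d(const struct mini_bufinfo *b, ptrdiff_t itemsize)
{
    assert(b != NULL && itemsize > 0);
    return b->stride0 == itemsize && b->len == b->shape0 * itemsize;
}
```

With the values above (len = 20, shape = (10,), strides = (2,)), the check
succeeds for an item size of 2 and fails for an item size of 1.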

The idea that Py_BUF_SIMPLE implies bytes is news to me.  What if you
want a contiguous, one-dimensional array of an arbitrary type?  I was
thinking this would be acceptable with Py_BUF_SIMPLE.  It seems you want
to require Py_BUF_FORMAT for that, which would suggest to me that
Py_BUF_ALLOW_ND and Py_BUF_ALLOW_NONCONTIGUOUS, etc., would imply
Py_BUF_FORMAT?  IOW, pretty much anything that's not SIMPLE implies FORMAT?

If that's the case, then most of the issues I brought up about item size
don't apply.  Also, if that's the case, you're right that Py_BUF_FORMAT
makes more sense than Py_BUF_DONT_NEED_FORMAT.

But now it seems even more unnecessary than it did before.  Wouldn't
any consumer that just wants to look at a chunk of bytes always use
Py_BUF_FORMAT, especially if there's danger of a presumptuous exporter
raising an exception?


 I'll update the PEP with my adaptation of your suggestions in a little 
 while.

Ok.  Thanks for taking the lead, and for putting up with my verbose
nitpicking. :)


Carl Banks

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-19 Thread Travis Oliphant
Carl Banks wrote:
 Travis Oliphant wrote:
 Carl Banks wrote:
 Ok, I've thought quite a bit about this, and I have an idea that I 
 think will be ok with you, and I'll be able to drop my main 
 objection.  It's not a big change, either.  The key is to explicitly 
 say whether the flag allows or requires.  But I made a few other 
 changes as well.
 I'm good with using an identifier to differentiate between an 
 allowed flag and a require flag.   I'm not a big fan of 
 VERY_LONG_IDENTIFIER_NAMES though.  Just enough to understand what it 
 means but not so much that it takes forever to type and uses up 
 horizontal real-estate.

 That's fine with me.  I'm not very particular about spellings, as long 
 as they're not misleading.

 Now, here is a key point: for these functions to work (indeed, for 
 PyObject_GetBuffer to work at all), you need enough information in 
 bufinfo to figure it out.  The bufferinfo struct should be 
 self-contained; you should not need to know what flags were passed 
 to PyObject_GetBuffer in order to know exactly what data you're 
 looking at.
 Naturally.

 Therefore, format must always be supplied by getbuffer.  You cannot 
 tell if an array is contiguous without the format string.  (But see 
 below.)

 No, I don't think this is quite true.   You don't need to know what 
 kind of data you are looking at if you don't get strides.  If you 
 use the SIMPLE interface, then both consumer and exporter know the 
 object is looking at bytes which always has an itemsize of 1.

 But doesn't this violate the above maxim?  Suppose these are the 
 contents of bufinfo:

 ndim = 1
 len = 20
 shape = (10,)
 strides = (2,)
 format = NULL

In my thinking, format/itemsize is necessary if you have strides (as you 
do here) but not needed if you don't have strides information (i.e. you 
are assuming a C_CONTIGUOUS memory-chunk).   The intent of the simple 
interface is to basically allow consumers to mimic the old buffer 
protocol, very easily. 

 How does it know whether it's looking at contiguous array of 10 
 two-byte objects, or a discontiguous array of 10 one-byte objects, 
 without having at least an item size?  Since item size is now in the 
 mix, it's moot, of course.

My only real concern is to have some way to tell the exporter that it 
doesn't need to figure out the format if the consumer doesn't care 
about it.  Given the open-ended nature of the format string, it is 
possible that a costly format-string construction step could be 
undertaken even when the consumer doesn't care about it.

I can see you are considering the buffer structure as a 
self-introspecting structure where I was considering it only in terms of 
how the consumer would be using its members (which implied it knew what 
it was asking for and wouldn't touch anything else).

How about we assume FORMAT will always be filled in but we add a 
Py_BUF_REQUIRE_PRIMITIVE flag that will only return primitive format 
strings (i.e. basic c-types)?   An exporter receiving this flag will 
have to return complicated data-types as 'bytes'.   I would add this to 
the Py_BUF_SIMPLE default.


 The idea that Py_BUF_SIMPLE implies bytes is news to me.  What if you 
 want a contiguous, one-dimensional array of an arbitrary type?  I was 
 thinking this would be acceptable with Py_BUF_SIMPLE.
Unsigned bytes are just the lowest common denominator.  They represent 
the old way of sharing memory.   Doesn't an arbitrary type mean 
bytes?  Or did you mean what if you wanted a contiguous, one-dimensional 
array of a *specific* type?

   It seems you want to require Py_BUF_FORMAT for that, which would 
 suggest to me that
 But now it seems even more unnecessary than it did before.  
 Wouldn't any consumer that just wants to look at a chunk of bytes 
 always use Py_BUF_FORMAT, especially if there's danger of a 
 presumptuous exporter raising an exception?

I'll put in the REQUIRE_PRIMITIVE_FORMAT idea in the next update to the 
PEP.  I can just check in my changes to SVN, so it should show up by 
Friday.

Thanks again,

-Travis



Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-19 Thread Terry Reedy

Travis Oliphant wrote:
| I'm good with using an identifier to differentiate between an allowed
| flag and a require flag.   I'm not a big fan of
| VERY_LONG_IDENTIFIER_NAMES though.  Just enough to understand what it
| means but not so much that it takes forever to type and uses up
| horizontal real-estate.

To save fingers and real-estate, adopt the following convention:
by default, adjectives like writable and contiguous are 'required'
unless tagged with 'OK', as in WRITABLE_OK.

Explain that in the flag doc just before the flags themselves.
And yes, ND for N_DIMENSIONAL or MULTIDIMENSIONAL
is also a great win that can also be explained in the same intro.

Terry Jan Reedy





Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-19 Thread Greg Ewing
Travis Oliphant wrote:
 you would like to make the original memory 
 read-only until you are done with the algorithm and have copied the data 
 back into memory.

Okay, I see what you mean now.

Maybe this should be called Py_BUF_LOCK_CONTENTS or
something to make the semantics clearer. Otherwise
it could mean either that *you* don't intend to
write to it, or that you require nobody ever to
write to it.

--
Greg


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-18 Thread Travis Oliphant
Greg Ewing wrote:
 Carl Banks wrote:

 Py_BUF_REQUIRE_READONLY - Raise exception if the buffer is writable.

 Is there a use case for this?

Yes.  The idea is used in NumPy all the time.

Suppose you want to write to an array but only have an algorithm that 
works with contiguous data.  Then you need to make a copy of the data 
into a contiguous buffer but you would like to make the original memory 
read-only until you are done with the algorithm and have copied the data 
back into memory.

That way when you release the GIL, the memory area will now be read-only 
and so other instances won't write to it (because any writes will be 
eradicated by the copy back when the algorithm is done).

NumPy uses this idea all the time in its UPDATE_IF_COPY flag.
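
The copy-in/copy-back pattern described above can be sketched roughly as
follows. This is only an illustration under stated assumptions: the
strided_view struct, its readonly field, and both functions are
hypothetical names, not NumPy or PEP API, and a real implementation would
also have to make other writers actually honor the read-only marker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical strided view of doubles; `readonly` models the lock. */
struct strided_view {
    double *base;
    size_t  count;     /* number of items */
    size_t  step;      /* distance between items, in elements */
    int     readonly;  /* nonzero while a copy-back is pending */
};

/* Example algorithm that only understands contiguous data. */
static void scale_contiguous(double *data, size_t n, double factor)
{
    for (size_t i = 0; i < n; i++)
        data[i] *= factor;
}

/* Copy out, lock the original, run the algorithm, copy back, unlock.
   Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
static int scale_with_copy_back(struct strided_view *v, double factor)
{
    assert(v != NULL && v->step > 0);
    if (v->count == 0)
        return 0;

    double *scratch = malloc(v->count * sizeof *scratch);
    if (scratch == NULL)
        return -1;
    for (size_t i = 0; i < v->count; i++)
        scratch[i] = v->base[i * v->step];

    v->readonly = 1;   /* writes made now would be lost at copy-back */
    scale_contiguous(scratch, v->count, factor);
    for (size_t i = 0; i < v->count; i++)
        v->base[i * v->step] = scratch[i];
    v->readonly = 0;

    free(scratch);
    return 0;
}
```

Only the strided elements are touched; the elements in the gaps keep their
values, and the view is writable again once the results are copied back.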

-Travis



Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-18 Thread Travis E. Oliphant
Jim Jewett wrote:
 Reading this message without the entire PEP in front of me showed some
 confusing usage.  (Details below)  Most (but not all) I could resolve
 from the PEP itself, but they could be clarified with different
 constant names.
 

I'm going to adapt some suggestions made by you and Carl Banks.  Look 
for an updated flags section of the PEP shortly.

-Travis


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-18 Thread Travis Oliphant
Carl Banks wrote:
 Ok, I've thought quite a bit about this, and I have an idea that I 
 think will be ok with you, and I'll be able to drop my main 
 objection.  It's not a big change, either.  The key is to explicitly 
 say whether the flag allows or requires.  But I made a few other 
 changes as well.
I'm good with using an identifier to differentiate between an allowed 
flag and a require flag.   I'm not a big fan of 
VERY_LONG_IDENTIFIER_NAMES though.  Just enough to understand what it 
means but not so much that it takes forever to type and uses up 
horizontal real-estate.

We use flags in NumPy quite a bit, and I'm obviously trying to adapt 
some of this to the general case here, but I'm biased by my 10 years of 
experience with the way I think about NumPy arrays.

Thanks for helping out and offering your fresh approach.   I like a lot 
of what you've come up with.  There are a few modifications I would 
make, though.


 First of all, let me define how I'm using the word contiguous: it's 
 a single buffer with no gaps.  So, if you were to do this: 
 memset(bufinfo->buf, 0, bufinfo->len), you would not touch any data 
 that isn't being exported.

Sure, we call this NPY_ONESEGMENT in NumPy-speak, though, because 
contiguous could be NPY_C_CONTIGUOUS or NPY_F_CONTIGUOUS.   We also 
don't use the terms ROW_MAJOR and COLUMN_MAJOR and so I'm not a big fan 
of bringing them up in the Python space because the NumPy community has 
already learned the C_ and F_ terminology which also generalizes to 
multiple-dimensions more clearly without using 2-d concepts.

 Without further ado, here is my proposal:


 --

 With no flags, the PyObject_GetBuffer will raise an exception if the 
 buffer is not direct, contiguous, and one-dimensional.  Here are the 
 flags and how they affect that:

I'm not sure what you mean by direct here.  But, this looks like the 
Py_BUF_SIMPLE case (which was a named-constant for 0) in my proposal.
The exporter receiving no flags would need to return a simple buffer 
(and it wouldn't need to fill in the format character either --- 
valuable information for the exporter to know).

 Py_BUF_REQUIRE_WRITABLE - Raise exception if the buffer isn't writable.
WRITEABLE is an alternative spelling and the one that NumPy uses.   So, 
either include both of these as alternatives or just use WRITEABLE.

 Py_BUF_REQUIRE_READONLY - Raise exception if the buffer is writable.
Or if the object memory can't be made read-only if it is writeable.

 Py_BUF_ALLOW_NONCONTIGUOUS - Allow noncontiguous buffers.  (This turns 
 on shape and strides.)

Fine.
 Py_BUF_ALLOW_MULTIDIMENSIONAL - Allow multidimensional buffers.  (Also 
 turns on shape and strides.)
Just use ND instead of MULTIDIMENSIONAL   and only turn on shape if it 
is present.

 (Neither of the above two flags implies the other.)


 Py_BUF_ALLOW_INDIRECT - Allow indirect buffers.  Implies 
 Py_BUF_ALLOW_NONCONTIGUOUS and Py_BUF_ALLOW_MULTIDIMENSIONAL. (Turns 
 on shape, strides, and suboffsets.)
If we go with this consumer-oriented naming scheme, I like indirect also.

 Py_BUF_REQUIRE_CONTIGUOUS_C_ARRAY or Py_BUF_REQUIRE_ROW_MAJOR - Raise 
 an exception if the array isn't a contiguous array in C 
 (row-major) format.

 Py_BUF_REQUIRE_CONTIGUOUS_FORTRAN_ARRAY or Py_BUF_REQUIRE_COLUMN_MAJOR 
 - Raise an exception if the array isn't a contiguous array in 
 Fortran (column-major) format.
Just name them C_CONTIGUOUS and F_CONTIGUOUS like in NumPy.

 Py_BUF_ALLOW_NONCONTIGUOUS, Py_BUF_REQUIRE_CONTIGUOUS_C_ARRAY, and 
 Py_BUF_REQUIRE_CONTIGUOUS_FORTRAN_ARRAY all conflict with each other, 
 and an exception should be raised if more than one are set.

 (I would go with ROW_MAJOR and COLUMN_MAJOR: even though the terms 
 only make sense for 2D arrays, I believe the terms are commonly 
 generalized to other dimensions.)
As I mentioned there is already a well-established history with NumPy.  
We've dealt with this issue already.

 Possible pseudo-flags:

 Py_BUF_SIMPLE = 0;
 Py_BUF_ALLOW_STRIDED = Py_BUF_ALLOW_NONCONTIGUOUS
| Py_BUF_ALLOW_MULTIDIMENSIONAL;

 --

 Now, for each flag, there should be an associated function to test the 
 condition, given a bufferinfo struct.  (Though I suppose they don't 
 necessarily have to map one-to-one, I'll do that here.)

 int PyBufferInfo_IsReadonly(struct bufferinfo*);
 int PyBufferInfo_IsWritable(struct bufferinfo*);
 int PyBufferInfo_IsContiguous(struct bufferinfo*);
 int PyBufferInfo_IsMultidimensional(struct bufferinfo*);
 int PyBufferInfo_IsIndirect(struct bufferinfo*);
 int PyBufferInfo_IsRowMajor(struct bufferinfo*);
 int PyBufferInfo_IsColumnMajor(struct bufferinfo*);

 The function PyObject_GetBuffer then has a pretty obvious 
 implementation.  Here is an excerpt:

 if ((flags & Py_BUF_REQUIRE_READONLY) &&
     !PyBufferInfo_IsReadonly(bufinfo)) {
     PyErr_SetString(PyExc_BufferError, "buffer not read-only");
     return 0;
 }

 Pretty straightforward, no?


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-17 Thread Carl Banks
Travis Oliphant wrote:
 Carl Banks wrote:
 My recommendation is, any flag should turn on some circle in the Venn
 diagram (it could be a circle I didn't draw--shaped arrays, for
 example--but it should be *some* circle).
 I don't think your Venn diagram is broad enough and it unnecessarily 
 limits the use of flags to communicate between consumer and exporter.   
 We don't have to ram these flags down that point-of-view for them to be 
 productive.  If you have a specific alternative proposal, or specific 
 criticisms, then I'm very willing to hear them.


Ok, I've thought quite a bit about this, and I have an idea that I think 
will be ok with you, and I'll be able to drop my main objection.  It's 
not a big change, either.  The key is to explicitly say whether the flag 
allows or requires.  But I made a few other changes as well.

First of all, let me define how I'm using the word contiguous: it's a 
single buffer with no gaps.  So, if you were to do this: 
memset(bufinfo->buf, 0, bufinfo->len), you would not touch any data that 
isn't being exported.

Without further ado, here is my proposal:


--

With no flags, the PyObject_GetBuffer will raise an exception if the 
buffer is not direct, contiguous, and one-dimensional.  Here are the 
flags and how they affect that:

Py_BUF_REQUIRE_WRITABLE - Raise exception if the buffer isn't writable.

Py_BUF_REQUIRE_READONLY - Raise exception if the buffer is writable.

Py_BUF_ALLOW_NONCONTIGUOUS - Allow noncontiguous buffers.  (This turns 
on shape and strides.)

Py_BUF_ALLOW_MULTIDIMENSIONAL - Allow multidimensional buffers.  (Also 
turns on shape and strides.)

(Neither of the above two flags implies the other.)

Py_BUF_ALLOW_INDIRECT - Allow indirect buffers.  Implies 
Py_BUF_ALLOW_NONCONTIGUOUS and Py_BUF_ALLOW_MULTIDIMENSIONAL. (Turns on 
shape, strides, and suboffsets.)

Py_BUF_REQUIRE_CONTIGUOUS_C_ARRAY or Py_BUF_REQUIRE_ROW_MAJOR - Raise an 
exception if the array isn't a contiguous array in C (row-major) 
format.

Py_BUF_REQUIRE_CONTIGUOUS_FORTRAN_ARRAY or Py_BUF_REQUIRE_COLUMN_MAJOR - 
Raise an exception if the array isn't a contiguous array in Fortran 
(column-major) format.

Py_BUF_ALLOW_NONCONTIGUOUS, Py_BUF_REQUIRE_CONTIGUOUS_C_ARRAY, and 
Py_BUF_REQUIRE_CONTIGUOUS_FORTRAN_ARRAY all conflict with each other, 
and an exception should be raised if more than one are set.
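
The mutual-exclusion rule could be checked with a single bit trick, as in
this sketch. The bit values and the names LAYOUT_MASK and
layout_flags_conflict are invented for illustration; the proposal does not
assign numbers to these flags.

```c
#include <assert.h>

/* Bit values are assumptions made for this sketch only. */
enum {
    Py_BUF_ALLOW_NONCONTIGUOUS              = 0x04,
    Py_BUF_REQUIRE_CONTIGUOUS_C_ARRAY       = 0x08,
    Py_BUF_REQUIRE_CONTIGUOUS_FORTRAN_ARRAY = 0x10,
    LAYOUT_MASK = 0x04 | 0x08 | 0x10
};

/* Returns nonzero when more than one of the three layout flags is set. */
static int layout_flags_conflict(int flags)
{
    assert(flags >= 0);
    int chosen = flags & LAYOUT_MASK;
    /* chosen & (chosen - 1) clears the lowest set bit, so the result is
       nonzero only if at least two bits were set. */
    return (chosen & (chosen - 1)) != 0;
}
```

A getbuffer implementation would raise its exception whenever this
returns nonzero, before looking at the buffer at all.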

(I would go with ROW_MAJOR and COLUMN_MAJOR: even though the terms only 
make sense for 2D arrays, I believe the terms are commonly generalized 
to other dimensions.)

Possible pseudo-flags:

Py_BUF_SIMPLE = 0;
Py_BUF_ALLOW_STRIDED = Py_BUF_ALLOW_NONCONTIGUOUS
| Py_BUF_ALLOW_MULTIDIMENSIONAL;

--

Now, for each flag, there should be an associated function to test the 
condition, given a bufferinfo struct.  (Though I suppose they don't 
necessarily have to map one-to-one, I'll do that here.)

int PyBufferInfo_IsReadonly(struct bufferinfo*);
int PyBufferInfo_IsWritable(struct bufferinfo*);
int PyBufferInfo_IsContiguous(struct bufferinfo*);
int PyBufferInfo_IsMultidimensional(struct bufferinfo*);
int PyBufferInfo_IsIndirect(struct bufferinfo*);
int PyBufferInfo_IsRowMajor(struct bufferinfo*);
int PyBufferInfo_IsColumnMajor(struct bufferinfo*);

The function PyObject_GetBuffer then has a pretty obvious 
implementation.  Here is an excerpt:

 if ((flags & Py_BUF_REQUIRE_READONLY) &&
     !PyBufferInfo_IsReadonly(bufinfo)) {
     PyErr_SetString(PyExc_BufferError, "buffer not read-only");
     return 0;
 }

Pretty straightforward, no?
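
A self-contained version of that check, without the Python headers, might
look like the sketch below. The flag values, the one-field struct, and
check_request are assumptions for illustration; only the
PyBufferInfo_Is* names are taken from the list above, and here they get
toy implementations.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values are invented for this sketch. */
enum {
    Py_BUF_REQUIRE_WRITABLE = 0x01,
    Py_BUF_REQUIRE_READONLY = 0x02
};

/* Hypothetical one-field stand-in for struct bufferinfo. */
struct mini_bufinfo {
    int readonly;
};

static int PyBufferInfo_IsReadonly(const struct mini_bufinfo *b)
{
    return b->readonly != 0;
}

static int PyBufferInfo_IsWritable(const struct mini_bufinfo *b)
{
    return b->readonly == 0;
}

/* Returns NULL when the buffer satisfies the request, otherwise the
   message a real getbuffer would put into its exception. */
static const char *check_request(int flags, const struct mini_bufinfo *b)
{
    assert(b != NULL);
    if ((flags & Py_BUF_REQUIRE_READONLY) && !PyBufferInfo_IsReadonly(b))
        return "buffer not read-only";
    if ((flags & Py_BUF_REQUIRE_WRITABLE) && !PyBufferInfo_IsWritable(b))
        return "buffer not writable";
    return NULL;
}
```

Each REQUIRE flag pairs with exactly one test function, so the checks are
independent of whatever flags the consumer did not pass.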

Now, here is a key point: for these functions to work (indeed, for 
PyObject_GetBuffer to work at all), you need enough information in 
bufinfo to figure it out.  The bufferinfo struct should be 
self-contained; you should not need to know what flags were passed to 
PyObject_GetBuffer in order to know exactly what data you're looking at.

Therefore, format must always be supplied by getbuffer.  You cannot tell 
if an array is contiguous without the format string.  (But see below.)

And even if the consumer isn't asking for a contiguous buffer, it has to 
know the item size so it knows what data not to step on.

(This is true even in your own proposal, BTW.  If a consumer asks for a 
non-strided array in your proposal, PyObject_GetBuffer would have to 
know the item size to determine if the array is contiguous.)


--

FAQ:

Q. Why ALLOW_NONCONTIGUOUS and ALLOW_MULTIDIMENSIONAL instead of 
ALLOW_STRIDED and ALLOW_SHAPED?

A. It's more useful to the consumer that way.  With ALLOW_STRIDED and 
ALLOW_SHAPED, there's no way for a consumer to request a general 
one-dimensional array (it can only request a non-strided one-dimensional 
array), and requesting a SHAPED array but not a STRIDED one can only 
return a C-like (row-major) array, although a consumer might reasonably 
want a Fortran-like (column-major) array.  This approach maps more 
directly to the consumer's needs, is more flexible, and still 

Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-17 Thread Greg Ewing
Carl Banks wrote:

 Py_BUF_REQUIRE_READONLY - Raise exception if the buffer is writable.

Is there a use case for this?

--
Greg


[Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-16 Thread Jim Jewett
Reading this message without the entire PEP in front of me showed some
confusing usage.  (Details below)  Most (but not all) I could resolve
from the PEP itself, but they could be clarified with different
constant names.

Counter Proposal at bottom, and specific questions in between.

Travis Oliphant wrote:
 Carl Banks wrote:

 Some of the flags RESTRICT the kind of buffers that can be
 exported (Py_BUF_WRITABLE); other flags EXPAND the
 kind of buffers that can be exported (Py_BUF_INDIRECT).
 That is highly confusing and I'm -1
 on any proposal that includes both behaviors.

 Basically, every flag corresponds to a different property of
 the buffer  that the consumer is requesting:

I had trouble seeing it like that.  Could you at least rename them
something like SHAPE_VALID?  But I would prefer the change suggested
at the bottom.

 Py_BUF_SIMPLE  --- you are requesting the simplest possible  (0x00)

??? Does this automatically assume a readable buffer?  (No hardware
write buffers?)  (This one I couldn't tell from the PEP)

 Py_BUF_WRITEABLE --  get a writeable buffer   (0x01)
 Py_BUF_READONLY --  get a read-only buffer   (0x02)

Does Py_BUF_READONLY promise that *this* consumer won't try to change
it, or request that it be immutable (and no one else will change it
either)?  (From the PEP itself, I think you wanted IMMUTABLE.)

When it comes back, does it mean "you, the consumer, promised not to
change this", or "I, the exporter, won't authorize anyone to change
it, including myself"?

If it is really a request for immutability, and the exporter can't
make that promise, then should the buffer protocol itself make and
return an immutable copy?

For this grouping to make sense, I have to assume that you really mean

Py_BUF_READABLE  (0 - no way to mark it write-only)
Py_BUF_WRITABLE (1 - meaning *this* consumer can write)
Py_BUF_IMMUTABLE (2 - meaning no one can change it.)
so 3 = compile-time error

but that still doesn't mesh with your later statement that:

 ... most people request read-or-write  buffers
 i.e. Py_BUF_SIMPLE.


 Py_BUF_FORMAT --  get a formatted buffer.   (0x04)

Is this saying "I the consumer will respect your formatting" or "I the
consumer will fail if you don't tell me the formatting"?

If this flag comes back, does that mean that understanding the
formatting is mandatory, or is it just informational?

??? To make this concrete, there are libraries that sniff and guess a
format.  Should they pass this flag or not?

 Py_BUF_SHAPE -- get a buffer with shape information  (0x08)

Wait -- what is the difference between these two again?  Is format
the internal format of a single element, and shape the dimensions of
an array?  Should Py_BUF_FORMAT be Py_BUF_ELT_FORMAT?

Do you pass Py_BUF_SHAPE to indicate that you'll accept N-Dim arrays,
or to say that you prefer them or somehow need them?

??? If Py_BUF_SHAPE was requested, but the buffer really is
1-dimensional, should this still be set on the way back?  (Presumably
setting the shape variable to point to a (one-element) array
containing len/itemsize?)

 Py_BUF_STRIDES --  get a buffer with stride information (and shape)  (0x18)
 Py_BUF_OFFSET -- get a buffer with suboffsets (and strides and shape) (0x38)

I assume that Py_BUF_OFFSET doesn't make sense without Py_BUF_STRIDES,
and Py_BUF_STRIDES doesn't make sense without Py_BUF_SHAPE.

What happens if someone *does* pass 0x20?

If you want to avoid that, it might make sense to treat this as 4
enumerated values (1-D, n-D, n-D with strides, n-D with strides and
offsets) instead of 3 flags.
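
As a rough sketch of that idea, the three cumulative flag values quoted
above can be collapsed into one of four layout requests, with anything
else (such as a bare 0x20) rejected. The enum and classify_layout are
hypothetical names; the numeric values are the ones listed in the quoted
proposal.

```c
#include <assert.h>

/* Values as listed in the quoted proposal. */
#define Py_BUF_SHAPE   0x08
#define Py_BUF_STRIDES 0x18
#define Py_BUF_OFFSET  0x38

/* The four layouts, plus a marker for inconsistent bit patterns. */
enum layout_request {
    LAYOUT_INVALID = -1,
    LAYOUT_1D,
    LAYOUT_ND,
    LAYOUT_ND_STRIDES,
    LAYOUT_ND_STRIDES_OFFSETS
};

static enum layout_request classify_layout(int flags)
{
    assert(flags >= 0);
    switch (flags & Py_BUF_OFFSET) {  /* 0x38 covers all three layout bits */
    case 0:              return LAYOUT_1D;
    case Py_BUF_SHAPE:   return LAYOUT_ND;
    case Py_BUF_STRIDES: return LAYOUT_ND_STRIDES;
    case Py_BUF_OFFSET:  return LAYOUT_ND_STRIDES_OFFSETS;
    default:             return LAYOUT_INVALID;  /* e.g. 0x20 or 0x10 alone */
    }
}
```

Bits outside the 0x38 mask (writability, format) are ignored here, so
they can still be combined freely with any of the four layouts.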

If Py_BUF_OFFSET was requested, but the return value is a single
continuous 1D array, should Py_BUF_OFFSET still come back but have the
zeros filled in?

Also, why are all negative offsets invalid?  It seems like they might
be useful for some re-orderings, and using an offset of 0 has the same
effect as marking the offset unused.

 For me, the most restrictive requests are

 Py_BUF_WRITEABLE | Py_BUF_FORMAT and
 Py_BUF_READONLY |  Py_BUF_FORMAT

 The most un-restrictive request (the largest circle in my mental Venn
 diagram) is

 Py_BUF_OFFSETS followed by Py_BUF_STRIDES followed by Py_BUF_SHAPE

I think this is what he meant when he said that the flags had opposite meanings.
Ideally, 0 and 0xFF should be shortcuts to either the most restrictive
or the least restrictive.

 Think of them as turning on members of the bufferinfo structure.

If the senses can't be reversed, could you at least rename them to
indicate this?  Something like Py_BUF_VALID_SUBOFFSETS?



Counter Proposal:


0x00: Py_BUF_RWLOCK:  Consumer is a new owner.  It can read and
write; no other code can. (most restrictive)

0x01: Py_BUF_READONLY:  Consumer doesn't need to (cannot) write.
0x02: Py_BUF_MUTABLE:  Other code can write

(implies)
0x03 = Py_BUF_UR:
Py_BUF_READONLY | Py_BUF_MUTABLE : Unconfirmed Read: Consumer
won't write, but other code might.


(skipping 0x04 just to get 

Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-13 Thread Travis Oliphant
Carl Banks wrote:

 The thing that bothers me about this whole flags setup is that 
 different flags can do opposite things.

 Some of the flags RESTRICT the kind of buffers that can be
 exported (Py_BUF_WRITABLE); other flags EXPAND the kind of buffers that
 can be exported (Py_BUF_INDIRECT).  That is highly confusing and I'm -1
 on any proposal that includes both behaviors.  (Mutually exclusive sets
 of flags are a minor exception: they can be thought of as either
 RESTRICTING or EXPANDING, so they could be mixed with either.)
The mutually exclusive set is the one example of the restriction that 
you gave. 

I think the flags setup I've described is much closer to your Venn 
diagram concept than you give it credit for.   I've re-worded some of 
the discussion (see 
http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/numpy/doc/pep_buffer.txt
 
) so that it is more clear that each flag is a description of what kind of 
buffer the consumer is prepared to deal with.

For example, if the consumer cares about what's 'in' the array, it uses 
Py_BUF_FORMAT.   Exporters are free to do what they want with this 
information.   I agree that NumPy would not force you to use its buffer 
only as a region of some specific type, but some other object may want 
to be much more restrictive and only export to consumers who will 
recognize the data stored for what it is.  I think it's up to the 
exporters to decide whether or not to raise an error when a certain kind 
of buffer is requested.

Basically, every flag corresponds to a different property of the buffer 
that the consumer is requesting:

Py_BUF_SIMPLE  --- you are requesting the simplest possible  (0x00)

Py_BUF_WRITEABLE --  get a writeable buffer   (0x01)

Py_BUF_READONLY --  get a read-only buffer   (0x02)

Py_BUF_FORMAT --  get a formatted buffer.   (0x04)

Py_BUF_SHAPE -- get a buffer with shape information  (0x08)

Py_BUF_STRIDES --  get a buffer with stride information (and shape)  (0x18)

Py_BUF_OFFSET -- get a buffer with suboffsets (and strides and shape) (0x38)

This is a logical sequence.  There is progression.  Each flag is a bit 
that indicates something about how the consumer can use the buffer.  In 
other words, the consumer states what kind of buffer is being 
requested.  The exporter obliges (and can save possibly significant time 
if the consumer is not requesting the information it must otherwise 
produce).

 I originally suggested a small set of flags that expand the set of 
 allowed buffers.  Here's a little Venn diagram of buffers to 
 illustrate what I was thinking:

 http://www.aerojockey.com/temp/venn.png

 With no flags, the only buffers allowed to be returned are in the All
 circle but no others.  Add Py_BUF_WRITABLE and now you can export
 writable buffers as well.  Add Py_BUF_STRIDED and the strided circle is
 opened to you, and so on.

 My recommendation is, any flag should turn on some circle in the Venn
 diagram (it could be a circle I didn't draw--shaped arrays, for
 example--but it should be *some* circle).
I don't think your Venn diagram is broad enough and it unnecessarily 
limits the use of flags to communicate between consumer and exporter.   
We don't have to ram these flags down that point-of-view for them to be 
productive.  If you have a specific alternative proposal, or specific 
criticisms, then I'm very willing to hear them.   

I've thought through the flags again, and I'm not sure how I would 
change them.  They make sense to me, especially in light of past 
usages of the buffer protocol (where most people request read-or-write 
buffers, i.e. Py_BUF_SIMPLE).  I'm also not sure our mental diagrams are 
both oriented the same way.  For me, the most restrictive requests are

Py_BUF_WRITEABLE | Py_BUF_FORMAT and Py_BUF_READONLY | Py_BUF_FORMAT

The most un-restrictive request (the largest circle in my mental Venn 
diagram) is

Py_BUF_OFFSETS followed by Py_BUF_STRIDES followed by Py_BUF_SHAPE

adding Py_BUF_FORMATS, Py_BUF_WRITEABLE, or Py_BUF_READONLY serves to 
restrict any of the other circles

Is this dual use of flags what bothers you?  (i.e. use of some flags for 
restricting circles in your Venn diagram that are turned on by other 
flags? --- you say Py_BUF_OFFSETS | Py_BUF_WRITEABLE to get the 
intersection of the Py_BUF_OFFSETS largest circle with the WRITEABLE 
subset?) 

Such concerns are not convincing to me.  Just don't think of the flags 
in that way.  Think of them as turning on members of the bufferinfo 
structure.  
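
Read that way, an exporter only has to fill in the members the consumer
asked for. The sketch below assumes a hypothetical four-member struct
(sketch_bufinfo) and a made-up exporter holding a 2x3 array of C ints; the
flag values are the ones listed earlier in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Values from the flag list earlier in this message. */
#define Py_BUF_FORMAT  0x04
#define Py_BUF_SHAPE   0x08
#define Py_BUF_STRIDES 0x18

/* Hypothetical subset of the bufferinfo members. */
struct sketch_bufinfo {
    const char      *format;
    const ptrdiff_t *shape;
    const ptrdiff_t *strides;
    const ptrdiff_t *suboffsets;
};

/* Made-up exporter data: a C-contiguous 2x3 array of ints. */
static const ptrdiff_t ex_shape[2]   = {2, 3};
static const ptrdiff_t ex_strides[2] = {
    (ptrdiff_t)(3 * sizeof(int)), (ptrdiff_t)sizeof(int)
};

/* Fill only the members that the flags turn on; leave the rest NULL.
   The test is (flags & X) == X because STRIDES is a multi-bit value
   that includes the SHAPE bit. */
static void fill_requested(struct sketch_bufinfo *out, int flags)
{
    assert(out != NULL);
    out->format  = (flags & Py_BUF_FORMAT)  == Py_BUF_FORMAT  ? "i" : NULL;
    out->shape   = (flags & Py_BUF_SHAPE)   == Py_BUF_SHAPE   ? ex_shape : NULL;
    out->strides = (flags & Py_BUF_STRIDES) == Py_BUF_STRIDES ? ex_strides : NULL;
    out->suboffsets = NULL;  /* this exporter has no indirection to report */
}
```

Requesting Py_BUF_STRIDES yields both shape and strides, while requesting
nothing leaves every member NULL, which is the saving the exporter is
after when the format string is expensive to build.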



 Py_BUF_FORMAT
 The consumer will be using the format string information so make 
 sure that member is filled correctly. 

 Is the idea to throw an exception if there's some other data format 
 besides b, and this flag isn't set?  It seems superfluous otherwise.

 The idea is that a consumer may not care about the format and the 
 exporter may want to know that to simplify the interface.  In other 
 words the flag is a way for the consumer to communicate that it wants 
 

Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-13 Thread Lisandro Dalcin
On 4/13/07, Travis Oliphant wrote:
  int PyObject_GetContiguous(PyObject *obj, void **buf, Py_ssize_t *len,
                             int fortran)
 
  Return a contiguous chunk of memory representing the buffer.  If a
  copy is made then return 1.  If no copy was needed return 0.
 
  8) If a copy was made, What should consumers call to free memory?

 You are right.  We need a free function.


I think now the memory perhaps should be allocated with PyMem_New and
deallocated with PyMem_Free.

Additionally, the return should perhaps indicate success or failure,
and a new argument should be passed in order to know if a copy was
made, something like

int PyObject_GetContiguous(PyObject *obj,
                           void **buf, Py_ssize_t *len,
                           int *copy, char layout)
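
A minimal sketch of how that calling convention might look from the
consumer's side, in plain C without Python.h. The names strided_src and
get_contiguous are hypothetical stand-ins, and malloc/free take the place
of PyMem_New/PyMem_Free; the layout argument is left out to keep it short.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an exporter: one-byte items with a stride. */
typedef struct {
    unsigned char *data;
    ptrdiff_t n;        /* number of one-byte items */
    ptrdiff_t stride;   /* bytes between consecutive items */
} strided_src;

/* Returns 0 on success, -1 on failure.  *copy is set to 1 when *buf is
   freshly allocated memory that the caller must free(), otherwise 0. */
static int get_contiguous(strided_src *src, void **buf,
                          ptrdiff_t *len, int *copy)
{
    if (src->stride == 1) {          /* already contiguous: no copy */
        *buf = src->data;
        *len = src->n;
        *copy = 0;
        return 0;
    }
    /* Allocate at least one byte so a zero-length source still succeeds. */
    unsigned char *out = malloc(src->n > 0 ? (size_t)src->n : 1);
    if (out == NULL)
        return -1;
    for (ptrdiff_t i = 0; i < src->n; i++)
        out[i] = src->data[i * src->stride];
    *buf = out;
    *len = src->n;
    *copy = 1;
    return 0;
}
```

With the status in the return value and the copy indicator in its own
argument, the consumer only frees the memory when *copy is 1.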


-- 
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
___
Python-Dev mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-13 Thread Greg Ewing
Travis Oliphant wrote:

 Py_BUF_SIMPLE  --- you are requesting the simplest possible  (0x00)
 
 Py_BUF_WRITEABLE --  get a writeable buffer   (0x01)
 
 Py_BUF_READONLY --  get a read-only buffer(0x02)

I don't see how these three form a progression.

 From the other things you say, it appears you mean
Py_BUF_SIMPLE to mean both readable and writeable.
But then Py_BUF_READONLY is turning off a capability
(being able to write to the buffer) rather than
turning one on.

Seems to me the simplest possible buffer is a
read-only one (assuming we don't want to allow for
write-only buffers -- or do we?), in which case
a more logical arrangement to my mind would be

   Py_BUF_READONLY  = 0x00  # simplest possible
   Py_BUF_READWRITE = 0x01

If we do want write-only buffers, then there
isn't a single simplest possible buffer, except
for one that's neither readable nor writable,
which doesn't seem very useful. So we would
have

   Py_BUF_READABLE  = 0x01
   Py_BUF_WRITABLE  = 0x02
   Py_BUF_READWRITE = 0x03
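
As a small sketch of the second arrangement, with each bit turning a
capability on: the BUF_ names and the two helper functions below are
hypothetical and only mirror the values listed above.

```c
/* Hypothetical names; the values follow the listing above. */
enum {
    BUF_READABLE  = 0x01,
    BUF_WRITABLE  = 0x02,
    BUF_READWRITE = BUF_READABLE | BUF_WRITABLE   /* 0x03 */
};

/* Each helper tests a single capability bit. */
static int buf_can_read(int flags)  { return (flags & BUF_READABLE) != 0; }
static int buf_can_write(int flags) { return (flags & BUF_WRITABLE) != 0; }
```

No flag in this scheme ever turns a capability off, which is the property
the READONLY flag in the quoted list lacks.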

--
Greg


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-12 Thread Travis Oliphant
Neil Hodgson wrote:
 Travis Oliphant:

 PEP: 3118
 ...

   I'd like to see the PEP include discussion of what to do when an
 incompatible request is received while locked. Should there be a
 standard "Can't do that: my buffer has been got" exception?
I'm not sure what standard to make a decision about that by.  Sure, why 
not?

It's not something I'd considered. 

-Travis



Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-12 Thread Travis Oliphant
Lisandro Dalcin wrote:
 On 4/9/07, Travis Oliphant [EMAIL PROTECTED] wrote:

 Travis, all this is far better and simpler than previous approaches...
 Just a few comments

Thanks for your wonderful suggestions.  I've incorporated many of them.

 1) I assume that 'bufferinfo' structure will be part of public Python
 API. In such a case, I think it should be renamed and prefixed.
 Something like 'PyBufferInfo' sounds good?

I prefer that as well. 


 2) I also think 'bufferinfo' could also have an 'itemsize' field
 filled if Py_BUF_ITEMSIZE flag is passed. What do you think? Exporters
 can possibly fill this field more efficiently than next parsing
 'format' string, it can also save consumers from an API call.
I think the itemsize member is a good idea.   I'm re-visiting what the 
flags should be after suggestions by Carl.

 3) Does it make sense to make 'format' be 'const char *'?
Yes.

 4) I am not sure about this, but perhaps 'bufferinfo' should save the
 flags passed to 'getbuffer' in a 'flags' field. This can be possibly
 needed at 'releasebuffer' call.

I think this is un-necessary.

   typedef struct {
   PyObject_HEAD
   PyObject *base;
   struct bufferinfo view;
   int itemsize;
   int flags;
   } PyMemoryViewObject;

 5) If my previous comments go in, so 'PyMemoryViewObject' will not
 need 'itemsize' and 'flags' fields (they are in 'bufferinfo'
 structure).

After suggestions by Greg, I like the idea of the PyMemoryViewObject 
holding a pointer to another object (from which it gets memory on 
request) as well as information about a slice of that memory. 

Thus, the memory view object is something like:

typedef struct {
  PyObject_HEAD
  PyObject *base; 
  int ndims;
  Py_ssize_t *offsets;/* slice starts */
  Py_ssize_t *lengths;   /* slice stops */
  Py_ssize_t *skips;   /* slice steps */
} PyMemoryViewObject;

It is more convenient to store any slicing information (so a memory view 
object could store an arbitrary slice of another object) as offsets, 
lengths, and skips which can be used to adjust the memory buffer 
returned by base.
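
A rough sketch of that adjustment, in plain C. It assumes the base object
hands back a strided description (start pointer, shape, byte strides) and
treats lengths as item counts; apply_slice is a hypothetical name, and
ptrdiff_t stands in for Py_ssize_t.

```c
#include <stddef.h>

/* Rewrite a strided buffer description in place so that it describes the
   slice given by offsets (starts), lengths (item counts) and skips
   (steps).  Returns the new start pointer.  Assumes the slice is in
   range for the original shape. */
static char *apply_slice(char *buf, int ndims,
                         ptrdiff_t *shape, ptrdiff_t *strides,
                         const ptrdiff_t *offsets,
                         const ptrdiff_t *lengths,
                         const ptrdiff_t *skips)
{
    for (int d = 0; d < ndims; d++) {
        buf += offsets[d] * strides[d];   /* move to first selected item */
        shape[d] = lengths[d];            /* slice has this many items */
        strides[d] *= skips[d];           /* step over skipped items */
    }
    return buf;
}
```

Nothing from the base object's buffer has to be kept between accesses;
the three arrays are enough to rebuild the sliced view each time.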

 int PyObject_GetContiguous(PyObject *obj, void **buf, Py_ssize_t 
 *len,
int fortran)

 Return a contiguous chunk of memory representing the buffer.  If a
 copy is made then return 1.  If no copy was needed return 0.

 8) If a copy was made, What should consumers call to free memory?

You are right.  We need a free function.

 9) What about using a char, like 'c' or 'C', and 'f' or 'F', and 0 or
 'a' or 'A' (any) ?

I'm happy with that too. 

 int PyObject_CopyToObject(PyObject *obj, void *buf, Py_ssize_t len,
   int fortran)

 10) Better name? Perhaps PyObject_CopyBuffer or PyObject_CopyMemory?
I'm not sure why those are better names.  The current name reflects the 
idea of copying the data into the object.


 int PyObject_SizeFromFormat(char *)

 int PyObject_IsContiguous(struct bufferinfo *view, int fortran);

 void PyObject_FillContiguousStrides(int *ndims, Py_ssize_t *shape,
 int itemsize,
 Py_ssize_t *strides, int 
 fortran)

 int PyObject_FillBufferInfo(struct bufferinfo *view, void *buf, 
 Py_ssize_t len,
  int readonly, int infoflags)


 11) Perhaps the 'PyObject_' prefix is wrong, as those functions do
 not operate with Python objects.

Agreed.

-Travis



Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-11 Thread Lisandro Dalcin
On 4/9/07, Travis Oliphant [EMAIL PROTECTED] wrote:

Travis, all this is far better and simpler than previous approaches...
Just a few comments

 The bufferinfo structure is::

   struct bufferinfo {
       void *buf;
       Py_ssize_t len;
       int readonly;
       char *format;
       int ndims;
       Py_ssize_t *shape;
       Py_ssize_t *strides;
       Py_ssize_t *suboffsets;
       void *internal;
   };

1) I assume that 'bufferinfo' structure will be part of public Python
API. In such a case, I think it should be renamed and prefixed.
Something like 'PyBufferInfo' sounds good?

2) I also think 'bufferinfo' could also have an 'itemsize' field
filled if Py_BUF_ITEMSIZE flag is passed. What do you think? Exporters
can possibly fill this field more efficiently than next parsing
'format' string, it can also save consumers from an API call.

3) Does it make sense to make 'format' be 'const char *'?

4) I am not sure about this, but perhaps 'bufferinfo' should save the
flags passed to 'getbuffer' in a 'flags' field. This can be possibly
needed at 'releasebuffer' call.


   typedef struct {
   PyObject_HEAD
   PyObject *base;
   struct bufferinfo view;
   int itemsize;
   int flags;
   } PyMemoryViewObject;

5) If my previous comments go in, so 'PyMemoryViewObject' will not
need 'itemsize' and 'flags' fields (they are in 'bufferinfo'
structure).

6) Perhaps 'view' field can be renamed to 'info'.


 int PyObject_SizeFromFormat(char *)

7) Why not 'const char *' here?


 int PyObject_GetContiguous(PyObject *obj, void **buf, Py_ssize_t *len,
int fortran)

 Return a contiguous chunk of memory representing the buffer.  If a
 copy is made then return 1.  If no copy was needed return 0.

8) If a copy was made, What should consumers call to free memory?

 If the object is multi-dimensional, then if
 fortran is 1, the first dimension of the underlying array will vary
 the fastest in the buffer.  If fortran is 0, then the last dimension
 will vary the fastest (C-style contiguous). If fortran is -1, then it
 does not matter and you will get whatever the object decides is more
 efficient.

9) What about using a char, like 'c' or 'C', and 'f' or 'F', and 0 or
'a' or 'A' (any) ?

 int PyObject_CopyToObject(PyObject *obj, void *buf, Py_ssize_t len,
   int fortran)

10) Better name? Perhaps PyObject_CopyBuffer or PyObject_CopyMemory?

 int PyObject_SizeFromFormat(char *)

 int PyObject_IsContiguous(struct bufferinfo *view, int fortran);

 void PyObject_FillContiguousStrides(int *ndims, Py_ssize_t *shape,
 int itemsize,
 Py_ssize_t *strides, int fortran)

 int PyObject_FillBufferInfo(struct bufferinfo *view, void *buf, 
 Py_ssize_t len,
  int readonly, int infoflags)


11) Perhaps the 'PyObject_' prefix is wrong, as those functions do
not operate on Python objects.
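
To make point 9 concrete, here is a sketch of a stride-filling helper
that takes a layout character instead of the int fortran flag. It is
plain C; fill_contiguous_strides is a hypothetical name, ptrdiff_t stands
in for Py_ssize_t, and the 'A' (any) case is not handled.

```c
#include <stddef.h>

/* Fill `strides` (in bytes) for a contiguous array of the given shape.
   layout 'C': the last dimension varies fastest.
   layout 'F': the first dimension varies fastest.
   Returns 0 on success, -1 for an unrecognised layout character. */
static int fill_contiguous_strides(int ndims, const ptrdiff_t *shape,
                                   ptrdiff_t itemsize, ptrdiff_t *strides,
                                   char layout)
{
    ptrdiff_t sd = itemsize;
    if (layout == 'C' || layout == 'c') {
        for (int d = ndims - 1; d >= 0; d--) {
            strides[d] = sd;
            sd *= shape[d];
        }
    } else if (layout == 'F' || layout == 'f') {
        for (int d = 0; d < ndims; d++) {
            strides[d] = sd;
            sd *= shape[d];
        }
    } else {
        return -1;
    }
    return 0;
}
```

For a 2 x 3 x 4 array of 8-byte items this gives strides (96, 32, 8) for
'C' and (8, 16, 48) for 'F'.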

Regards,

-- 
Lisandro Dalcín


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-11 Thread Greg Ewing
Lisandro Dalcin wrote:

 4) I am not sure about this, but perhaps 'bufferinfo' should save the
 flags passed to 'getbuffer' in a 'flags' field. This can be possibly
 needed at 'releasebuffer' call.

The object isn't necessarily providing all the things
that were requested in the flags, so it's going to
have to keep its own track of what has been allocated.

 11) Perhaps the 'PyObject_' prefix is wrong, as those functions do
 not operate with Python objects.

Yes, PyBuffer_ would be a better prefix for these,
I think.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiem! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-11 Thread Greg Ewing
 From PEP 3118:

   A memory-view object is an extended buffer object that
   should replace the buffer object in Python 3K.

   typedef struct {
 PyObject_HEAD
 PyObject *base;
 struct bufferinfo view;
 int itemsize;
 int flags;
   } PyMemoryViewObject;

If the purpose is to provide Python-level access to an
object via its buffer interface, then keeping a bufferinfo
struct in it is the wrong implementation strategy, since it
implies keeping the base object's memory locked as long as
the view object exists.

That was the mistake made by the original buffer object,
and the solution is not to hold onto the info returned by
the base object's buffer interface, but to make a new
buffer request for each Python-level access.
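
A minimal sketch of that strategy in plain C, with hypothetical names
throughout (exporter, acquire, release, view, view_getitem). The exports
counter stands in for whatever locking the real getbuffer/releasebuffer
pair would do.

```c
#include <stddef.h>

/* Hypothetical exporter that counts outstanding buffer requests. */
typedef struct {
    unsigned char *data;
    ptrdiff_t len;
    int exports;
} exporter;

static unsigned char *acquire(exporter *e) { e->exports++; return e->data; }
static void release(exporter *e)           { e->exports--; }

/* The view keeps only a reference to its base, never a pointer
   into the base object's memory. */
typedef struct {
    exporter *base;
} view;

/* Read one byte, or return -1 if the index is out of range.  The
   buffer is held only for the duration of this call. */
static int view_getitem(view *v, ptrdiff_t i)
{
    int result = -1;
    unsigned char *p = acquire(v->base);
    if (i >= 0 && i < v->base->len)
        result = p[i];
    release(v->base);
    return result;
}
```

Between calls the export count is back to zero, so the base object is
free to resize or reallocate its memory.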

If that's not the purpose of this object, you need to
explain what its purpose actually is.

Also some of what you say about this object is rather
unclear, e.g. "It exports a view using the base object."
I don't know what that is supposed to mean.



Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-11 Thread Travis Oliphant
Greg Ewing wrote:
 From PEP 3118:

   A memory-view object is an extended buffer object that
   should replace the buffer object in Python 3K.

   typedef struct {
 PyObject_HEAD
 PyObject *base;
 struct bufferinfo view;
 int itemsize;
 int flags;
   } PyMemoryViewObject;

 If the purpose is to provide Python-level access to an
 object via its buffer interface, then keeping a bufferinfo
 struct in it is the wrong implementation strategy, since it
 implies keeping the base object's memory locked as long as
 the view object exists.

Yes, but that was the intention.   The MemoryView Object is basically an 
N-d array object. 

 That was the mistake made by the original buffer object,
 and the solution is not to hold onto the info returned by
 the base object's buffer interface, but to make a new
 buffer request for each Python-level access.
I could see this approach also, but if we went this way then the memory 
view object should hold slice information so that it can be a sliced 
view of a memory area.

Because slicing NumPy arrays already does it by holding on to a view, I 
guess having an object that doesn't hold on to a view in Python, but 
re-gets it every time it is needed, would be useful. 

In that case:

typedef struct {
PyObject_HEAD
PyObject *base;
int ndims;
PyObject **slices;  /* or 3 Py_ssize_t arrays */
int flags;
} PyMemoryViewObject;

would be enough to store, I suppose.


-Travis



Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-11 Thread Carl Banks
Travis Oliphant wrote:
 Carl Banks wrote:


 Travis Oliphant wrote:
  Py_BUF_READONLY
 The returned buffer must be readonly and the underlying object 
 should make
 its memory readonly if that is possible.

 I don't like the "if possible" thing.  If it makes no guarantees, it's 
 pretty much useless over Py_BUF_SIMPLE.
 O.K.  Let's make it raise an error if it can't set it read-only.

The thing that bothers me about this whole flags setup is that different 
flags can do opposite things.

Some of the flags RESTRICT the kind of buffers that can be
exported (Py_BUF_WRITABLE); other flags EXPAND the kind of buffers that
can be exported (Py_BUF_INDIRECT).  That is highly confusing and I'm -1
on any proposal that includes both behaviors.  (Mutually exclusive sets
of flags are a minor exception: they can be thought of as either
RESTRICTING or EXPANDING, so they could be mixed with either.)

I originally suggested a small set of flags that expand the set of 
allowed buffers.  Here's a little Venn diagram of buffers to illustrate 
what I was thinking:

http://www.aerojockey.com/temp/venn.png

With no flags, the only buffers allowed to be returned are in the "All"
circle but no others.  Add Py_BUF_WRITABLE and now you can export
writable buffers as well.  Add Py_BUF_STRIDED and the strided circle is
opened to you, and so on.

My recommendation is, any flag should turn on some circle in the Venn
diagram (it could be a circle I didn't draw--shaped arrays, for
example--but it should be *some* circle).
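
Roughly, in code, the expand-only model could be checked like this. The
ALLOW_ names and export_permitted are hypothetical, and the sketch
ignores the case where an exporter presents writable memory as
read-only.

```c
/* Hypothetical flag names; each one opens one circle of the diagram. */
enum {
    ALLOW_WRITABLE = 0x01,
    ALLOW_STRIDED  = 0x02,
    ALLOW_INDIRECT = 0x04
};

/* `traits` describes the buffer the exporter actually has, using the
   same bits.  The export is permitted only if the consumer allowed
   every trait the buffer has. */
static int export_permitted(int traits, int allowed)
{
    return (traits & ~allowed) == 0;
}
```

Passing more flags can only enlarge the set of buffers that pass this
check; no flag ever shrinks it.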


 Py_BUF_FORMAT
The consumer will be using the format string information so make 
 sure that member is filled correctly. 

 Is the idea to throw an exception if there's some other data format 
 besides "b", and this flag isn't set?  It seems superfluous otherwise.
 
 The idea is that a consumer may not care about the format and the 
 exporter may want to know that to simplify the interface.  In other 
 words the flag is a way for the consumer to communicate that it wants 
 format information (or not).

I'm -1 on using the flags for this.  It's completely out of character
compared to the rest of the flags.  All other flags are there for the
benefit of the consumer; this flag is useless to the consumer.

More concretely, all the rest of the flags are there to tell the 
exporter what kind of buffer they're prepared to accept.  This flag, 
alone, does not do that.

Even the benefits to the exporter are dubious.  This flag can't reduce 
code complexity, since all buffer objects have to be prepared to furnish 
type information.  At best, this flag is a rare optimization.  In fact, 
most buffers are going to point format to a constant string, regardless 
of whether this flag was passed or not:

bufinfo->format = "b";


 Whether the exporter raises an exception when the format is not
 requested is up to the exporter.

That seems like a bad idea.  Suppose I have a contiguous numpy array of
floats and I want to view it as a sequence of bytes.  If the exporter's
allowed to raise an exception for this, any consumer that wanted a
data-neutral view of the data would still have to pass Py_BUF_FORMAT to
guard against this.  Wouldn't that be ironic?


 Py_BUF_SHAPE
    The consumer can (and might) make use of the ndims and shape 
    members of the structure, so make sure they are filled in correctly.

 Py_BUF_STRIDES (implies SHAPE)
    The consumer can (and might) make use of the strides member of the 
    structure (as well as ndims and shape).

 Is there any reasonable benefit for allowing Py_BUF_SHAPE without 
 Py_BUF_STRIDES?  Would the array be C- or Fortran-like?
 
 Yes,  I could see a consumer not being able to handle simple striding 
 but could handle shape information.  Many users of NumPy arrays like to 
 think of the array as an N-d array but want to ignore striding.

Ok, but is the indexing row-major or column-major?  That has to be decided.


Carl Banks


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-10 Thread Nick Coghlan
Carl Banks wrote:
 Another little mistake I made: looking at the Python source, it seems 
 that most C defines do not use the Py_ prefix, so probably we shouldn't 
 here.  Sorry.

Most of the #define's aren't exposed via Python.h and aren't part of the 
public C API. The public ones are meant to use the prefix.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


[Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-09 Thread Travis Oliphant



Changes:


 * added the flags variable to allow simpler calling for getbuffer.

 * added some explanation of ideas that were discussed and abandoned.

 * added examples for simple use cases.

 * added more C-API calls to allow easier usage.


Thanks for all feedback.

-Travis

PEP: 3118
Title: Revising the buffer protocol
Version: $Revision$
Last-Modified: $Date$
Authors: Travis Oliphant [EMAIL PROTECTED], Carl Banks [EMAIL PROTECTED]
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Aug-2006
Python-Version: 3000

Abstract
========

This PEP proposes re-designing the buffer interface (PyBufferProcs
function pointers) to improve the way Python allows memory sharing
in Python 3.0.

In particular, it is proposed that the character buffer portion 
of the API be eliminated and the multiple-segment portion be 
re-designed in conjunction with allowing for strided memory
to be shared.   In addition, the new buffer interface will 
allow the sharing of any multi-dimensional nature of the
memory and what data-format the memory contains. 

This interface will allow any extension module to either 
create objects that share memory or create algorithms that
use and manipulate raw memory from arbitrary objects that 
export the interface. 


Rationale
=========

The Python 2.X buffer protocol allows different Python types to
exchange a pointer to a sequence of internal buffers.  This
functionality is *extremely* useful for sharing large segments of
memory between different high-level objects, but it is too limited and
has issues:

1. There is the little used sequence-of-segments option
   (bf_getsegcount) that is not well motivated. 

2. There is the apparently redundant character-buffer option
   (bf_getcharbuffer)

3. There is no way for a consumer to tell the buffer-API-exporting
   object it is finished with its view of the memory and
   therefore no way for the exporting object to be sure that it is
   safe to reallocate the pointer to the memory that it owns (for
   example, the array object reallocating its memory after sharing
   it with the buffer object which held the original pointer led
   to the infamous buffer-object problem).

4. Memory is just a pointer with a length. There is no way to
   describe what is in the memory (float, int, C-structure, etc.)

5. There is no shape information provided for the memory.  But,
   several array-like Python types could make use of a standard
   way to describe the shape-interpretation of the memory
   (wxPython, GTK, pyQT, CVXOPT, PyVox, Audio and Video
   Libraries, ctypes, NumPy, data-base interfaces, etc.)

6. There is no way to share discontiguous memory (except through
   the sequence of segments notion).  

   There are two widely used libraries that use the concept of
   discontiguous memory: PIL and NumPy.  Their view of discontiguous
   arrays is different, though.  The proposed buffer interface allows
   sharing of either memory model.  Exporters will use only one
   approach and consumers may choose to support discontiguous 
   arrays of each type however they choose. 

   NumPy uses the notion of constant striding in each dimension as its
   basic concept of an array. With this concept, a simple sub-region
   of a larger array can be described without copying the data.
   Thus, stride information is the additional information that must be
   shared. 

   The PIL uses a more opaque memory representation. Sometimes an
   image is contained in a contiguous segment of memory, but sometimes
   it is contained in an array of pointers to the contiguous segments
   (usually lines) of the image.  The PIL is where the idea of multiple
   buffer segments in the original buffer interface came from.   

   NumPy's strided memory model is used more often in computational
   libraries and because it is so simple it makes sense to support
   memory sharing using this model.  The PIL memory model is sometimes 
   used in C-code where a 2-d array can be then accessed using double
   pointer indirection:  e.g. image[i][j].  

   The buffer interface should allow the object to export either of these
   memory models.  Consumers are free to either require contiguous memory
   or write code to handle one or both of these memory models. 
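
   The two memory models described in item 6 can be contrasted with a
   short sketch in plain C. The function names get_strided and
   get_indirect are hypothetical, items are taken to be single bytes,
   and ptrdiff_t stands in for Py_ssize_t.

```c
#include <stddef.h>

/* NumPy-style: one block of memory plus a byte stride per dimension. */
static unsigned char get_strided(const unsigned char *buf,
                                 const ptrdiff_t strides[2],
                                 ptrdiff_t i, ptrdiff_t j)
{
    return buf[i * strides[0] + j * strides[1]];
}

/* PIL-style: an array of pointers to separately stored lines,
   accessed by double pointer indirection. */
static unsigned char get_indirect(unsigned char *const *lines,
                                  ptrdiff_t i, ptrdiff_t j)
{
    return lines[i][j];
}
```

   A consumer that wants to support both models needs one access path
   of each kind; a consumer that requires contiguous memory needs
   neither.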

Proposal Overview
=================

* Eliminate the char-buffer and multiple-segment sections of the
  buffer-protocol.

* Unify the read/write versions of getting the buffer.

* Add a new function to the interface that should be called when
  the consumer object is done with the memory area.  

* Add a new variable to allow the interface to describe what is in
  memory (unifying what is currently done now in struct and
  array)

* Add a new variable to allow the protocol to share shape information

* Add a new variable for sharing stride information

* Add a new mechanism for sharing arrays that must 
  be accessed using pointer indirection. 

* Fix all objects in the core and the standard library to conform
 

Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-09 Thread Neil Hodgson
Travis Oliphant:

 PEP: 3118
 ...

   I'd like to see the PEP include discussion of what to do when an
incompatible request is received while locked. Should there be a
standard "Can't do that: my buffer has been got" exception?

   Neil


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-09 Thread Carl Banks


Travis Oliphant wrote:
  Py_BUF_READONLY
 The returned buffer must be readonly and the underlying object 
should make
 its memory readonly if that is possible.

I don't like the "if possible" thing.  If it makes no guarantees, it's 
pretty much useless over Py_BUF_SIMPLE.


 Py_BUF_FORMAT
The consumer will be using the format string information so make sure that 
member is filled correctly. 

Is the idea to throw an exception if there's some other data format 
besides "b", and this flag isn't set?  It seems superfluous otherwise.


 Py_BUF_SHAPE
    The consumer can (and might) make use of the ndims and shape members 
    of the structure, so make sure they are filled in correctly. 

 Py_BUF_STRIDES (implies SHAPE)
    The consumer can (and might) make use of the strides member of the 
    structure (as well as ndims and shape).

Is there any reasonable benefit for allowing Py_BUF_SHAPE without 
Py_BUF_STRIDES?  Would the array be C- or Fortran-like?


Another little mistake I made: looking at the Python source, it seems 
that most C defines do not use the Py_ prefix, so probably we shouldn't 
here.  Sorry.


Carl


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-09 Thread Travis Oliphant
Carl Banks wrote:


 Travis Oliphant wrote:
  Py_BUF_READONLY
 The returned buffer must be readonly and the underlying object 
 should make
 its memory readonly if that is possible.

 I don't like the "if possible" thing.  If it makes no guarantees, it's 
 pretty much useless over Py_BUF_SIMPLE.
O.K.  Let's make it raise an error if it can't set it read-only.

 Py_BUF_FORMAT
The consumer will be using the format string information so make 
 sure that member is filled correctly. 

 Is the idea to throw an exception if there's some other data format 
 besides "b", and this flag isn't set?  It seems superfluous otherwise.

The idea is that a consumer may not care about the format and the 
exporter may want to know that to simplify the interface.  In other 
words the flag is a way for the consumer to communicate that it wants 
format information (or not). 

Whether the exporter raises an exception when the format is not 
requested is up to the exporter.

 Py_BUF_SHAPE
    The consumer can (and might) make use of the ndims and shape 
    members of the structure, so make sure they are filled in correctly.

 Py_BUF_STRIDES (implies SHAPE)
    The consumer can (and might) make use of the strides member of the 
    structure (as well as ndims and shape).

 Is there any reasonable benefit for allowing Py_BUF_SHAPE without 
 Py_BUF_STRIDES?  Would the array be C- or Fortran-like?

Yes,  I could see a consumer not being able to handle simple striding 
but could handle shape information.  Many users of NumPy arrays like to 
think of the array as an N-d array but want to ignore striding.

I've made the changes in numpy's SVN.   Hopefully they will get mirrored 
over to the python PEP directory eventually.

-Travis
