
Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-03 Thread Travis Oliphant


>
> Perhaps the most relevant thing to pull from this conversation is back 
> to what Martin has asked about before: "flexible array members". A TCP 
> packet has no defined length (there isn't even a header field in the 
> packet for this, so in fairness we can talk about IP packets which 
> do). There is no way for me to describe this with the pre-PEP 
> data-formats.
>
> I feel like it is misleading of you to say "it's up to the package to 
> do manipulations," because you glossed over the fact that you can't 
> even describe this type of data. ISTM, that you're only interested in 
> describing repetitious fixed-structure arrays. 
Yes, that's right.  I'm only interested in describing binary data with a 
fixed length.  Others can help push it farther than that (if they even 
care).

> If we are going to have a "default Python way to handle data-formats", 
> then don't you feel like this falls short of the mark?
Not for me.   We can fix what needs fixing, but not if we can't get out 
of the gate.
>
> I fear that you speak about this in too grandiose terms and are now 
> trapped by people asking, "well, can I do this?" I think for a lot of 
> folks the answer is: "nope." With respect to the network packets, this 
> PEP doesn't do anything to fix the communication barrier.

Yes it could if you were interested in pushing it there.   No, I didn't 
solve that particular problem with the PEP (because I can only solve the 
problems I'm aware of), but I do think the problem could be solved.   We 
have far too many nay-sayers on this list, I think.

Right now, I don't have time to push this further.  My real interest is 
the extended buffer protocol.  I want something that works for that.  
When I have time to discuss it again, I might come back and 
push some more. 

But, not now.

-Travis



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-03 Thread Scott Dial

Travis Oliphant wrote:
> Paul Moore wrote:
>> Enough of the abstract. As a concrete example, suppose I have a (byte)
>> string in my program containing some binary data - an ID3 header, or a
>> TCP packet, or whatever. It doesn't really matter. Does your proposal
>> offer anything to me in how I might manipulate that data (assuming I'm
>> not using NumPy)? (I'm not insisting that it should, I'm just trying
>> to understand the scope of the PEP).
>>
> 
> What do you mean by "manipulate the data."  The proposal for a 
> data-format object would help you describe that data in a standard way 
> and therefore share that data between several libraries that would be able 
> to understand the data (because they all use and/or understand the 
> default Python way to handle data-formats).
> 

Perhaps the most relevant thing to pull from this conversation is back 
to what Martin has asked about before: "flexible array members". A TCP 
packet has no defined length (there isn't even a header field in the 
packet for this, so in fairness we can talk about IP packets which do). 
There is no way for me to describe this with the pre-PEP data-formats.
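
To make the "flexible array member" point concrete, here is a minimal Python 3 sketch that uses only the standard struct module (not the PEP's data-format objects). The fixed 20-byte part of an IPv4 header can be described as a fixed-length record, but the payload length is only known after the header's total-length field has been read. The names IPV4_HEADER and split_ipv4_packet are illustrative assumptions, and header options are only skipped, not decoded.

```python
import struct

# Fixed part of an IPv4 header (no options), network byte order:
# version/IHL, TOS, total length, id, flags/fragment, TTL, protocol,
# checksum, source address, destination address.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def split_ipv4_packet(data):
    """Return (header_length, payload) for one IPv4 packet in `data`."""
    fields = IPV4_HEADER.unpack_from(data, 0)
    version_ihl, total_length = fields[0], fields[2]
    header_length = (version_ihl & 0x0F) * 4  # IHL counts 32-bit words
    # The payload size is data-dependent: it comes from the header itself.
    return header_length, data[header_length:total_length]


header = IPV4_HEADER.pack(0x45, 0, 25, 0, 0, 64, 6, 0,
                          bytes([127, 0, 0, 1]), bytes([127, 0, 0, 1]))
packet = header + b"hello" + b"bytes that belong to the next packet"
header_length, payload = split_ipv4_packet(packet)
```

The header is a fixed-structure record; the payload slice is the part that a purely fixed-length description cannot express.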

I feel like it is misleading of you to say "it's up to the package to do 
manipulations," because you glossed over the fact that you can't even 
describe this type of data. ISTM, that you're only interested in 
describing repetitious fixed-structure arrays. If we are going to have a 
"default Python way to handle data-formats", then don't you feel like 
this falls short of the mark?

I fear that you speak about this in too grandiose terms and are now 
trapped by people asking, "well, can I do this?" I think for a lot of 
folks the answer is: "nope." With respect to the network packets, this 
PEP doesn't do anything to fix the communication barrier. Is this not in 
the scope of "a consistent and standard way to discuss the format of 
binary data" (which is what your PEP's abstract sets out as the task)?

-- 
Scott Dial
[EMAIL PROTECTED]
[EMAIL PROTECTED]

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-02 Thread Alexander Belopolsky

Paul Moore  gmail.com> writes:

> Somewhat. My understanding is that the python-level buffer object is
> frowned upon as not good practice, and is scheduled for removal at
> some point (Py3K, quite possibly?) Hence, any code that uses buffer()
> feels like it "needs" to be replaced by something "more acceptable".

The Python 2.x buffer object serves two distinct purposes.  First, it is a
"mutable string" object, and this functionality is definitely not going away:
it is being replaced by the bytes object. (Interestingly, this functionality is not
exposed to python, but C extension modules can call
PyBuffer_New(size) to create a buffer.)  Second, it is a "view" into any
object supporting buffer protocol.  For a while this usage was indeed
frowned upon because buffer objects held the pointer obtained from
bf_get*buffer for too long causing memory errors in situations like
this:

>>> a = array('c', "x"*10)
>>> b = buffer(a, 5, 2)
>>> a.extend('x'*1000)
>>> str(b)
'xx'

This problem was fixed more than two years ago. 

--
r35400 | nascheme | 2004-03-10 

Make buffer objects based on mutable objects (like array) safe.
--
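
The same scenario can be tried in Python 3, which postdates this thread, using memoryview in place of the 2.x buffer object. This is a sketch of the later behaviour, not of the r35400 fix itself: on CPython an array that is currently exporting a buffer refuses to be resized and raises BufferError, so the view can never point at freed memory.

```python
from array import array

a = array("b", b"x" * 10)
parent = memoryview(a)
view = parent[5:7]          # a two-item window, like buffer(a, 5, 2)

try:
    a.extend(b"y" * 1000)   # would force the array to reallocate
    resized = True
except BufferError:
    resized = False         # refused while the view is alive

view_bytes = view.tobytes()

# Once every view is released, the array can grow again.
view.release()
parent.release()
a.extend(b"y" * 3)
```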

Even though it was suggested in the past that buffer *object*
should be deprecated as unsafe, I don't remember seeing a call
to deprecate the buffer protocol.   

> So although I understand the use you suggest, it's not compelling to
> me because I am left with the feeling that I wish I knew "the way to
> do it that didn't need the buffer object" (even though I realise
> intellectually that such a way may not exist).
> 

As I explained in another post,  I used buffer object as an example of
an object that supports buffer protocol, but does not export type
information in the form usable by numpy.

Here is another way to illustrate the problem:

>>> a = numpy.array(array.array('H', [1,2,3]))
>>> b = numpy.array([1,2,3],dtype='H')
>>> a.dtype == b.dtype
False

With the extended buffer protocol it will be possible for numpy.array(..)
to realize that array.array('H', [1,2,3]) is a sequence of unsigned short
integers and convert it accordingly.  Currently numpy has to go through
the sequence protocol to create a numpy.array from an array.array and
lose the type information.
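
For comparison, this is roughly what the exported type information looks like from Python 3's memoryview, which is how the buffer protocol was later exposed at the Python level. It is a sketch using only the standard library; whether a particular NumPy version consumes this information is not shown here.

```python
from array import array

a = array("H", [1, 2, 3])
m = memoryview(a)

# The exporter describes its items, so a consumer does not need to
# fall back to the sequence protocol and guess the element type.
info = {
    "format": m.format,      # struct-style type code
    "itemsize": m.itemsize,  # bytes per item
    "shape": m.shape,        # number of items per dimension
    "values": m.tolist(),
}
```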


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-02 Thread Greg Ewing

Travis E. Oliphant wrote:
> We have T_UBYTE and T_BYTE, etc. defined 
> in structmember.h already.  Should we just re-use those #defines while 
> adding to them to make an easy to use interface for primitive types?

They're mixed up with size information, though,
which we don't want to do.

--
Greg

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-02 Thread Greg Ewing

Travis Oliphant wrote:
> or just
> 
> numpy.array(array.array('d',[1,2,3]))
> 
> and leave out the buffer object altogether.

I think the buffer object in his example was just a
placeholder for "some arbitrary object that supports
the buffer interface", not necessarily another NumPy
array.

--
Greg

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-02 Thread Ronald Oussoren



On Nov 2, 2006, at 9:35 PM, Thomas Heller wrote:

> Ronald Oussoren schrieb:
>> On Oct 31, 2006, at 6:38 PM, Thomas Heller wrote:
>>
>>> This mechanism is probably a hack because it's not possible to add
>>> C accessible fields to type objects, on the other hand it is
>>> extensible (in principle, at least).
>>
>> I better start rewriting PyObjC then :-). PyObjC stores some additional
>> information in the type objects that are used to describe Objective-C
>> classes (such as a reference to the proxied class).
>>
>> IIRC This has been possible from Python 2.3.
>
> I assume you are referring to the code in
> pyobjc/Modules/objc/objc-class.h

Yes.

> If this really is reliable I should better start rewriting ctypes
> then ;-).
>
> Hm, I always thought there was some additional magic going on with
> type objects, fields appended dynamically at the end or whatever.

There is such magic, but that magic was updated in Python 2.3 to
allow type-object extensions like this.

Ronald



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-02 Thread Thomas Heller

Ronald Oussoren schrieb:
> On Oct 31, 2006, at 6:38 PM, Thomas Heller wrote:
> 
>>
>> This mechanism is probably a hack because it's not possible to add
>> C accessible fields to type objects, on the other hand it is
>> extensible (in principle, at least).
> 
> I better start rewriting PyObjC then :-). PyObjC stores some additional
> information in the type objects that are used to describe Objective-C
> classes (such as a reference to the proxied class).
> 
> IIRC This has been possible from Python 2.3.

I assume you are referring to the code in pyobjc/Modules/objc/objc-class.h ?

If this really is reliable I should better start rewriting ctypes then ;-).

Hm, I always thought there was some additional magic going on with type
objects, fields appended dynamically at the end or whatever.

Thomas

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-02 Thread Travis Oliphant

Martin v. Löwis wrote:
> Travis E. Oliphant schrieb:
> 
>>>2. Should primitive type codes be characters or integers (from an enum) at
>>>C level?
>>>- I prefer integers
>>
>>>3. Should size be expressed in bits or bytes?
>>>- I prefer bits
>>>
>>
>>So, you want an integer enum for the "kind" and an integer for the 
>>bitsize?   That's fine with me.
>>
>>One thing I just remembered.  We have T_UBYTE and T_BYTE, etc. defined 
>>in structmember.h already.  Should we just re-use those #defines while 
>>adding to them to make an easy to use interface for primitive types?
> 
> 
> Notice that those type codes imply sizes, namely the platform sizes
> (where "platform" always means "what the C compiler does"). So if
> you want to have platform-independent codes as well, you shouldn't
> use the T_ codes.
> 

In NumPy we've found it convenient to use both.   Basically, we've set 
up a header file that "does the translation" using #defines and typedefs 
to create things like (on a 32-bit platform)

typedef int npy_int32;
#define NPY_INT32 NPY_INT

so that either the T_code-like enum or the bit-width can be used 
interchangeably.

Typically people want to specify bit-widths (and see their data-types in 
bit-widths) but in C-code that implements something you need to use one 
of the platform integers.

I don't know if we really need to bring all of that over.
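
The same "translate bit-widths to platform integers" idea can be sketched in Python with ctypes. This is only an illustration of the mapping, not NumPy's header; PLATFORM_INTS and ints_by_bit_width are hypothetical names.

```python
import ctypes

# Platform integer types in increasing C rank.
PLATFORM_INTS = (ctypes.c_byte, ctypes.c_short, ctypes.c_int,
                 ctypes.c_long, ctypes.c_longlong)


def ints_by_bit_width():
    """Map each available bit-width to the first platform type of that size."""
    table = {}
    for ctype in PLATFORM_INTS:
        bits = ctypes.sizeof(ctype) * 8
        table.setdefault(bits, ctype)  # keep the lowest-rank type
    return table


by_bits = ints_by_bit_width()
```

A user asks for by_bits[32]; the implementation gets whichever platform type has that width on the machine it runs on.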

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-02 Thread Paul Moore

On 11/2/06, Travis Oliphant <[EMAIL PROTECTED]> wrote:
> What do you mean by "manipulate the data."  The proposal for a
> data-format object would help you describe that data in a standard way
> and therefore share that data between several libraries that would be able
> to understand the data (because they all use and/or understand the
> default Python way to handle data-formats).
>
> It would be up to the other packages to "manipulate" the data.

Yes, some other messages I read since I posted this clarified it for
me. Essentially, as a Python programmer, there's nothing in the PEP
for me - it's for extension writers (and maybe writers of some
lower-level Python modules? I'm not sure about this). So as I'm not
really the target audience, I won't comment further.

> So, what you would be able to do is take your byte-string and create a
> buffer object which you could then share with other packages:
>
> Example:
>
> b = buffer(bytestr, format=data_format_object)
>
> Now.
>
> a = numpy.frombuffer(b)
> a['field1']  # prints data stored in the field named "field1"
>
> etc.
>
> Or.
>
> cobj = ctypes.frombuffer(b)
>
> # Now, cobj is a ctypes object that is basically a "structure" that can
> # be passed directly to your C-code.
>
> Does this help?

Somewhat. My understanding is that the python-level buffer object is
frowned upon as not good practice, and is scheduled for removal at
some point (Py3K, quite possibly?) Hence, any code that uses buffer()
feels like it "needs" to be replaced by something "more acceptable".
So although I understand the use you suggest, it's not compelling to
me because I am left with the feeling that I wish I knew "the way to
do it that didn't need the buffer object" (even though I realise
intellectually that such a way may not exist).

Paul.

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Martin v. Löwis

Travis E. Oliphant schrieb:
>> 2. Should primitive type codes be characters or integers (from an enum) at
>> C level?
>> - I prefer integers
> 
>> 3. Should size be expressed in bits or bytes?
>> - I prefer bits
>>
> 
> So, you want an integer enum for the "kind" and an integer for the 
> bitsize?   That's fine with me.
> 
> One thing I just remembered.  We have T_UBYTE and T_BYTE, etc. defined 
> in structmember.h already.  Should we just re-use those #defines while 
> adding to them to make an easy to use interface for primitive types?

Notice that those type codes imply sizes, namely the platform sizes
(where "platform" always means "what the C compiler does"). So if
you want to have platform-independent codes as well, you shouldn't
use the T_ codes.
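
The difference between platform sizes and platform-independent sizes can be seen with the standard struct module, whose format codes come in both flavours. This is an analogy to the T_ codes, not the T_ codes themselves.

```python
import struct

# "@" (native) sizes follow the C compiler, like the T_ codes do.
native_long_bits = struct.calcsize("@l") * 8

# "=" (standard) sizes are fixed by the struct module on every platform.
standard_bits = {code: struct.calcsize("=" + code) * 8 for code in "hilq"}
```

Here native_long_bits is 32 on some platforms and 64 on others, while standard_bits is the same everywhere.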

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Alexander Belopolsky

Travis E. Oliphant  ieee.org> writes:

> 
> Alexander Belopolsky wrote:
> > ...
> > 1. Should primitive types be associated with simple type codes
> > (short, int, long, float, double) or type/size pairs [(int,16),
> > (int, 32), (int, 64), (float, 32), (float, 64)]?
> >  - I prefer pairs
> > 
> > 2. Should primitive type codes be characters or integers (from an
> > enum) at C level?
> > - I prefer integers
> 
> Are these orthogonal?
> 

Do you mean are my questions 1 and 2 orthogonal? I guess they are.

> > 
> > 3. Should size be expressed in bits or bytes?
> > - I prefer bits
> > 
> 
> So, you want an integer enum for the "kind" and an integer for the 
> bitsize?   That's fine with me.
> 
> One thing I just remembered.  We have T_UBYTE and T_BYTE, etc. defined 
> in structmember.h already.  Should we just re-use those #defines while 
> adding to them to make an easy to use interface for primitive types?
> 

I was thinking about using something like NPY_TYPES enum, but T_* 
codes would work as well.  Let me just present both options for the
 record:

 --- numpy/ndarrayobject.h ---

enum NPY_TYPES {NPY_BOOL=0,
                NPY_BYTE, NPY_UBYTE,
                NPY_SHORT, NPY_USHORT,
                NPY_INT, NPY_UINT,
                NPY_LONG, NPY_ULONG,
                NPY_LONGLONG, NPY_ULONGLONG,
                NPY_FLOAT, NPY_DOUBLE, NPY_LONGDOUBLE,
                NPY_CFLOAT, NPY_CDOUBLE, NPY_CLONGDOUBLE,
                NPY_OBJECT=17,
                NPY_STRING, NPY_UNICODE,
                NPY_VOID,
                NPY_NTYPES,
                NPY_NOTYPE,
                NPY_CHAR,        /* special flag */
                NPY_USERDEF=256  /* leave room for characters */
};

--- structmember.h ---

/* Types */
#define T_SHORT         0
#define T_INT           1
#define T_LONG          2
#define T_FLOAT         3
#define T_DOUBLE        4
#define T_STRING        5
#define T_OBJECT        6
/* XXX the ordering here is weird for binary compatibility */
#define T_CHAR          7   /* 1-character string */
#define T_BYTE          8   /* 8-bit signed int */
/* unsigned variants: */
#define T_UBYTE         9
#define T_USHORT        10
#define T_UINT          11
#define T_ULONG         12

/* Added by Jack: strings contained in the structure */
#define T_STRING_INPLACE 13

#define T_OBJECT_EX     16  /* Like T_OBJECT, but raises AttributeError
                               when the value is NULL, instead of
                               converting to None. */
#ifdef HAVE_LONG_LONG
#define T_LONGLONG      17
#define T_ULONGLONG     18
#endif /* HAVE_LONG_LONG */





Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Travis E. Oliphant

Alexander Belopolsky wrote:
> Travis Oliphant  ieee.org> writes:
>> Don't lump those ideas together.  Shapes and strides are necessary for 
>> N-dimensional arrays (it's essentially what *defines* the N-dimensional 
>> array).   I really don't want to sacrifice those in the extended buffer 
>> protocol.  If you want to separate them into different functions then 
>> that is a possibility.
>>
> 
> I don't understand.  Do you want to discuss shapes and strides separately
> from the datatype or not? Note that in ctypes shape is a property of 
> datatype (as in c_int*2*3).   In your proposal, shapes and strides are
> communicated separately.  This presents a unique memory management
> challenge: if the object does not contain shape information in a ready to
> be pointed to form, who is responsible for deallocating the shape array?  
>  

Perhaps a "view object" should be returned like /F suggests and it 
manages the shape, strides, and data-format.
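
Python 3's memoryview later ended up playing this "view object" role: the view itself carries the shape, strides and format, so nobody else has to own or free a separate shape array. A small sketch, assuming Python 3 and the standard library only:

```python
from array import array

a = array("i", range(6))
flat = memoryview(a)

# memoryview.cast() needs a byte format on one side, so go through "B"
# before reinterpreting the same memory as a 2 x 3 grid of ints.
grid = flat.cast("B").cast("i", shape=[2, 3])

view_info = (grid.format, grid.shape, grid.strides)
rows = grid.tolist()
```

The shape and strides live in the view object and disappear with it; the array itself still only knows about six ints.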


>>> If we manage to agree on the standard way to pass primitive type 
>>> information,
>>> it will be a big achievement and immediately useful because simple arrays 
>>> are
>>> already in the standard library.
>>>
>> We could start there, I suppose.  Especially if it helps us all get on 
>> the same page.
> 
> Let's start:
> 
> 1. Should primitive types be associated with simple type codes (short, int, 
> long,
> float, double) or type/size pairs [(int,16), (int, 32), (int, 64), (float, 
> 32), 
> (float, 64)]?
>  - I prefer pairs
> 

> 2. Should primitive type codes be characters or integers (from an enum) at
> C level?
> - I prefer integers

Are these orthogonal?

> 
> 3. Should size be expressed in bits or bytes?
> - I prefer bits
> 

So, you want an integer enum for the "kind" and an integer for the 
bitsize?   That's fine with me.

One thing I just remembered.  We have T_UBYTE and T_BYTE, etc. defined 
in structmember.h already.  Should we just re-use those #defines while 
adding to them to make an easy to use interface for primitive types?

-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Alexander Belopolsky

Travis Oliphant  ieee.org> writes:
>
> Don't lump those ideas together.  Shapes and strides are necessary for 
> N-dimensional arrays (it's essentially what *defines* the N-dimensional 
> array).   I really don't want to sacrifice those in the extended buffer 
> protocol.  If you want to separate them into different functions then 
> that is a possibility.
>

I don't understand.  Do you want to discuss shapes and strides separately
from the datatype or not? Note that in ctypes shape is a property of 
datatype (as in c_int*2*3).   In your proposal, shapes and strides are
communicated separately.  This presents a unique memory management
challenge: if the object does not contain shape information in a ready to
be pointed to form, who is responsible for deallocating the shape array?  
 
> > 
> > If we manage to agree on the standard way to pass primitive type 
> > information,
> > it will be a big achievement and immediately useful because simple arrays 
> > are
> > already in the standard library.
> > 
> 
> We could start there, I suppose.  Especially if it helps us all get on 
> the same page.

Let's start:

1. Should primitive types be associated with simple type codes (short, int, 
long,
float, double) or type/size pairs [(int,16), (int, 32), (int, 64), (float, 32), 
(float, 64)]?
 - I prefer pairs

2. Should primitive type codes be characters or integers (from an enum) at
C level?
- I prefer integers

3. Should size be expressed in bits or bytes?
- I prefer bits
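
The three preferences above (kind/size pairs, integer kinds, sizes in bits) can be sketched in Python for the types found in a standard library array. Kind, _KIND_BY_TYPECODE and describe are hypothetical names for illustration; nothing here is an API proposed in the thread.

```python
import enum
from array import array


class Kind(enum.IntEnum):
    """Integer codes for the 'kind' half of a (kind, bits) pair."""
    INT = 0
    UINT = 1
    FLOAT = 2


_KIND_BY_TYPECODE = {}
_KIND_BY_TYPECODE.update((code, Kind.INT) for code in "bhilq")
_KIND_BY_TYPECODE.update((code, Kind.UINT) for code in "BHILQ")
_KIND_BY_TYPECODE.update((code, Kind.FLOAT) for code in "fd")


def describe(arr):
    """Return (kind, size in bits) for the items of an array.array."""
    return _KIND_BY_TYPECODE[arr.typecode], arr.itemsize * 8


pair_for_double = describe(array("d"))
pair_for_ubyte = describe(array("B"))
```

The size comes from the platform at run time, while the kind is platform-independent, which is the separation the questions are about.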



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Travis Oliphant

Paul Moore wrote:
> 
> 
> Enough of the abstract. As a concrete example, suppose I have a (byte)
> string in my program containing some binary data - an ID3 header, or a
> TCP packet, or whatever. It doesn't really matter. Does your proposal
> offer anything to me in how I might manipulate that data (assuming I'm
> not using NumPy)? (I'm not insisting that it should, I'm just trying
> to understand the scope of the PEP).
> 

What do you mean by "manipulate the data."  The proposal for a 
data-format object would help you describe that data in a standard way 
and therefore share that data between several libraries that would be able 
to understand the data (because they all use and/or understand the 
default Python way to handle data-formats).

It would be up to the other packages to "manipulate" the data.

So, what you would be able to do is take your byte-string and create a 
buffer object which you could then share with other packages:

Example:

b = buffer(bytestr, format=data_format_object)

Now.

a = numpy.frombuffer(b)
a['field1']  # prints data stored in the field named "field1"

etc.

Or.

cobj = ctypes.frombuffer(b)

# Now, cobj is a ctypes object that is basically a "structure" that can
# be passed directly to your C-code.
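
The buffer(..., format=...) and ctypes.frombuffer calls above are proposals, not existing functions. As a rough present-day analogue, ctypes structures have a from_buffer_copy classmethod (added after this thread) that builds a structure from a byte string. The sketch below applies it to the ID3 example; the ID3v2Header class and synchsafe_to_int helper are illustrative assumptions.

```python
import ctypes


class ID3v2Header(ctypes.Structure):
    # Every field is byte-sized, so the layout has no padding.
    _fields_ = [
        ("magic", ctypes.c_char * 3),
        ("major", ctypes.c_uint8),
        ("revision", ctypes.c_uint8),
        ("flags", ctypes.c_uint8),
        ("size", ctypes.c_uint8 * 4),
    ]


def synchsafe_to_int(four_bytes):
    """Combine four 7-bit 'synchsafe' bytes into one integer."""
    value = 0
    for byte in four_bytes:
        value = (value << 7) | (byte & 0x7F)
    return value


bytestr = b"ID3" + bytes([3, 0, 0, 0, 0, 2, 1]) + b"...tag data..."
header = ID3v2Header.from_buffer_copy(bytestr)
tag_size = synchsafe_to_int(header.size)
```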

Does this help?

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Travis Oliphant

Alexander Belopolsky wrote:
> Travis Oliphant  ieee.org> writes:
> 
> 
> >>> b = buffer(array('d', [1,2,3]))
> 
> there is not much that I can do with b.  For example, if I want to pass it to
> numpy, I will have to provide the type and shape information myself:
> 
> >>> numpy.ndarray(shape=(3,), dtype=float, buffer=b)
> array([ 1.,  2.,  3.])
> 
> With the extended buffer protocol, I should be able to do
> 
> >>> numpy.array(b)

or just

numpy.array(array.array('d',[1,2,3]))

and leave out the buffer object altogether.

> 
> 
> So let's start by solving this problem and limit it to data that can be found
> in a standard library array.  This way we can postpone the discussion of 
> shapes,
> strides and nested structs.

Don't lump those ideas together.  Shapes and strides are necessary for 
N-dimensional arrays (it's essentially what *defines* the N-dimensional 
array).   I really don't want to sacrifice those in the extended buffer 
protocol.  If you want to separate them into different functions then 
that is a possibility.

> 
> If we manage to agree on the standard way to pass primitive type information,
> it will be a big achievement and immediately useful because simple arrays are
> already in the standard library.
> 

We could start there, I suppose.  Especially if it helps us all get on 
the same page.  But, we already see the applications beyond this simple 
case so I would like to have at least an "eye" for the more difficult 
case which we already have a working solution for in the "array interface".

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Paul Moore

On 11/1/06, Alexander Belopolsky <[EMAIL PROTECTED]> wrote:
> Let's just start with that.  The way I see the problem is that buffer protocol
> is fine as long as your data is an array of bytes, but if it is an array of
> doubles, you are out of luck. So, while I can do
>
> >>> b = buffer(array('d', [1,2,3]))
>
> there is not much that I can do with b.  For example, if I want to pass it to
> numpy, I will have to provide the type and shape information myself:
>
> >>> numpy.ndarray(shape=(3,), dtype=float, buffer=b)
> array([ 1.,  2.,  3.])
>
> With the extended buffer protocol, I should be able to do
>
> >>> numpy.array(b)

As a data point, this is the first posting that has clearly explained
to me what the two PEPs are attempting to achieve. That may be my
blindness to what others find self-evident, but equally, I may not be
the only one who needed this example...

Paul.

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Alexander Belopolsky

Travis Oliphant  ieee.org> writes:
> Frankly, I'd be happy enough to start with 
> "typecodes" in the extended buffer protocol (that's where the array 
> module is now) and then move up to something more complete later.
> 

Let's just start with that.  The way I see the problem is that buffer protocol
is fine as long as your data is an array of bytes, but if it is an array of
doubles, you are out of luck. So, while I can do

>>> b = buffer(array('d', [1,2,3]))

there is not much that I can do with b.  For example, if I want to pass it to
numpy, I will have to provide the type and shape information myself:

>>> numpy.ndarray(shape=(3,), dtype=float, buffer=b)
array([ 1.,  2.,  3.])

With the extended buffer protocol, I should be able to do

>>> numpy.array(b)

So let's start by solving this problem and limit it to data that can be found
in a standard library array.  This way we can postpone the discussion of shapes,
strides and nested structs.
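
The "array of bytes is fine, array of doubles is not" problem can be reproduced with the standard library alone. In this Python 3 sketch, a plain byte string stands in for an exporter that offers no type information, and the consumer has to supply the item type by hand, much like the numpy.ndarray(..., dtype=float, buffer=b) call above.

```python
from array import array

a = array("d", [1.0, 2.0, 3.0])

# A bare byte string carries no item type: the view sees unsigned bytes.
raw = memoryview(a.tobytes())
raw_format = raw.format

# The consumer must state the item type to recover the doubles.
typed = raw.cast("d")
values = typed.tolist()
```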

I propose a simple bf_gettypeinfo(PyObject *obj, int* type, int* bitsize) method
that would return a type code and the size of the data item.

I believe it is better to have type codes free from size information for
several reasons:

1. Generic code can use size information directly without having to know
that int is 32 and double is 64 bits.

2. Odd sizes can be easily described without having to add a new type code.

3. I assume that the existing bf_ functions would still return size in bytes,
so having item size available as an int will help to get number of items.
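
Reason 3 amounts to a single division. A minimal sketch, assuming the size in bytes comes from the existing buffer functions and the bit size from the proposed bf_gettypeinfo; item_count is a hypothetical helper and array.itemsize stands in for the proposed call.

```python
from array import array


def item_count(nbytes, bitsize):
    """Number of items in a buffer of nbytes holding bitsize-bit items."""
    if bitsize % 8:
        raise ValueError("sketch only handles whole-byte item sizes")
    item_bytes = bitsize // 8
    if nbytes % item_bytes:
        raise ValueError("buffer length is not a multiple of the item size")
    return nbytes // item_bytes


a = array("d", [1.0, 2.0, 3.0])
raw = a.tobytes()                       # what the byte-oriented buffer exposes
count = item_count(len(raw), a.itemsize * 8)
```

The helper never needs to know that the items are doubles, which is reason 1 in the list as well.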

If we manage to agree on the standard way to pass primitive type information,
it will be a big achievement and immediately useful because simple arrays are
already in the standard library.


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Travis Oliphant

Martin v. Löwis wrote:
> Travis E. Oliphant schrieb:
> 
>>2) complex-valued types (you might argue that it's just a 2-array of 
>>floats, but you could say the same thing about int as an array of 
>>bytes).  The point is how do people interpret the data.  Complex-valued 
>>data-types are very common.  It is one reason Fortran is still used by 
>>scientists.
> 
> 
> Well, by the same reasoning, you could argue that pixel values (RGBA)
> are missing in the PEP. It's a convenience, sure, and it may also help
> interfacing with the platform's FORTRAN implementation - however, are
> you sure that NumPy's complex layout is consistent with the platform's
> C99 _Complex definition?
> 

I think so (it is on gcc).  And yes, where you draw the line between 
fundamental and "derived" data-type is somewhat arbitrary.  I'd rather 
include complex-numbers than not given their prevalence in the 
data-streams I'm trying to make compatible with each other.
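
C99 specifies that a complex value has the same representation as an array of two reals, real part first. Under that assumption, a complex double can be described as two adjacent platform doubles, which this standard-library sketch does with struct; pack_complex and unpack_complex are hypothetical helpers, and NumPy's own layout is not inspected here.

```python
import struct

PAIR = struct.Struct("dd")   # real part, then imaginary part


def pack_complex(z):
    """Lay out a Python complex as two consecutive C doubles."""
    return PAIR.pack(z.real, z.imag)


def unpack_complex(raw):
    """Inverse of pack_complex."""
    real, imag = PAIR.unpack(raw)
    return complex(real, imag)


raw = pack_complex(1.5 - 2j)
roundtrip = unpack_complex(raw)
```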

> 
>>3) Unicode characters
>>
>>4) What about floating-point representations that are not IEEE 754 
>>4-byte or 8-byte.
> 
> 
> Both of these are available in a platform-dependent way: if the
> platform uses non-IEEE754 formats for C float and C double, ctypes
> will interface with that just fine. It is actually vice versa:
> IEEE-754 4-byte and 8-byte is not supported in ctypes.

That's what I meant.  The 'f' kind in the data-type description is also 
intended to mean "platform float" whatever that is.  But, a complete 
data-format representation would have a way to describe other 
bit-layouts for floating point representation.  Even if you can't 
actually calculate directly with them without conversion.
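
Describing a float layout that the platform cannot compute with directly, and converting on access, is what the struct module does for its standard formats. A sketch assuming Python 3.6 or later, where the "e" (IEEE 754 binary16) code exists:

```python
import struct

# IEEE 754 binary16 encoding of 1.0, stored little-endian. C has no
# portable native type for it, so it is converted to a Python float.
half_raw = bytes([0x00, 0x3C])
(half_value,) = struct.unpack("<e", half_raw)

# An 8-byte IEEE 754 double in big-endian order, whatever the host uses.
big_endian_double = struct.pack(">d", 1.0)
```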

> Same for Unicode: the platform's wchar_t is supported (as you said),
> but not a platform-independent (say) 4-byte little-endian.

Right.

It's a matter of scope.  Frankly, I'd be happy enough to start with 
"typecodes" in the extended buffer protocol (that's where the array 
module is now) and then move up to something more complete later.

But, since we already have an array interface for record-arrays to share 
information and data with each other, and ctypes showing all of its 
power, then why not be more complete?

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Martin v. Löwis

Travis E. Oliphant schrieb:
> I was too hasty.  There are some things actually missing from ctypes:

I think Thomas can correct me if I'm wrong: I think endianness is
supported (although this support seems undocumented). There seems
to be code that checks for the presence of a _byteswapped_ attribute
on fields of a struct; presence of this field is then interpreted
as data having the "other" endianness.
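
Current ctypes documents this support through the BigEndianStructure and LittleEndianStructure base classes. The sketch below shows the observable effect on the stored bytes; it does not look at the _byteswapped_ mechanism mentioned above, and the class names are illustrative.

```python
import ctypes


class BigEndianWord(ctypes.BigEndianStructure):
    _fields_ = [("value", ctypes.c_uint16)]


class LittleEndianWord(ctypes.LittleEndianStructure):
    _fields_ = [("value", ctypes.c_uint16)]


# Same field value, opposite byte order in memory.
big_bytes = bytes(BigEndianWord(0x0102))
little_bytes = bytes(LittleEndianWord(0x0102))
```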

> 1) long double (this is not the same across platforms, but it is a 
> data-type).

That's indeed missing.

> 2) complex-valued types (you might argue that it's just a 2-array of 
> floats, but you could say the same thing about int as an array of 
> bytes).  The point is how do people interpret the data.  Complex-valued 
> data-types are very common.  It is one reason Fortran is still used by 
> scientists.

Well, by the same reasoning, you could argue that pixel values (RGBA)
are missing in the PEP. It's a convenience, sure, and it may also help
interfacing with the platform's FORTRAN implementation - however, are
you sure that NumPy's complex layout is consistent with the platform's
C99 _Complex definition?

> 3) Unicode characters
> 
> 4) What about floating-point representations that are not IEEE 754 
> 4-byte or 8-byte.

Both of these are available in a platform-dependent way: if the
platform uses non-IEEE754 formats for C float and C double, ctypes
will interface with that just fine. It is actually vice versa:
IEEE-754 4-byte and 8-byte is not supported in ctypes.
Same for Unicode: the platform's wchar_t is supported (as you said),
but not a platform-independent (say) 4-byte little-endian.
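
As a point of comparison, the struct module does offer a 
platform-independent spelling: with an explicit byte-order prefix, the 
'f' and 'd' codes are documented to use IEEE 754 binary32 and binary64 
regardless of the platform's native C float and double.  A small sketch:

```python
import struct

# "<" selects little-endian, standard sizes; ">" selects big-endian.
le_float = struct.pack("<f", 1.0)
be_double = struct.pack(">d", 1.0)

print(le_float.hex(), be_double.hex())
print(struct.calcsize("<f"), struct.calcsize(">d"))
```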

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Travis E. Oliphant

Jim Jewett wrote:
> I'm still not sure exactly what is missing from ctypes.  To make this 
> concrete:

I was too hasty.  There are some things actually missing from ctypes:

1) long double (this is not the same across platforms, but it is a 
data-type).
2) complex-valued types (you might argue that it's just a 2-array of 
floats, but you could say the same thing about int as an array of 
bytes).  The point is how do people interpret the data.  Complex-valued 
data-types are very common.  It is one reason Fortran is still used by 
scientists.
3) Unicode characters (there is w_char support but I mean a way to 
describe what kind of unicode characters you have in a cross-platform 
way).  I actually think we have a way to describe encodings in the 
data-format representation as well.

4) What about floating-point representations that are not IEEE 754 
4-byte or 8-byte?   There should be a way to at least express the 
data-format in these cases (this is actually how long double should be 
handled as well since it varies across platforms what is actually done 
with the extra bits).

So, we can't "just use ctypes" as a complete data-format representation 
because it's also missing some things.

What we need is a standard way for libraries that deal with data-formats 
to communicate with each other.  I need help with a PEP like this and 
that's what I'm asking for.  It's all I've really been after all along.

A couple of points:

* One reason to support the idea of the Python object approach (versus a 
string-syntax) is that it "is already parsed".  A list-syntax approach 
(perhaps built from strings for fundamental data-types) might also be 
considered "already parsed" as well.

* One advantage of using "kind" versus a character for every type (like 
struct and array do) is that it helps consumers and producers speed up 
the parser (a fuller branching tree).
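
To illustrate the "kind" point: in the array interface a typestring such 
as '<f8' carries byte order, kind, and item size separately, so a 
consumer can branch on the kind first and only then look at the size.  A 
rough sketch follows; parse_typestr and the KINDS table are hypothetical 
names for illustration, not part of the PEP or of any library, and only a 
subset of kinds is listed:

```python
# Hypothetical table of kinds, loosely following the array interface.
KINDS = {
    "b": "boolean",
    "i": "signed integer",
    "u": "unsigned integer",
    "f": "floating point",
    "c": "complex floating point",
    "S": "byte string",
    "U": "unicode string",
    "V": "raw bytes (void)",
}

def parse_typestr(typestr):
    """Split a typestring like '<f8' into (byteorder, kind, itemsize)."""
    if len(typestr) < 3:
        raise ValueError("typestring too short: %r" % (typestr,))
    byteorder, kind, size = typestr[0], typestr[1], typestr[2:]
    if byteorder not in ("<", ">", "|"):
        raise ValueError("unknown byte order: %r" % (byteorder,))
    if kind not in KINDS:
        raise ValueError("unknown kind: %r" % (kind,))
    if not size.isdigit():
        raise ValueError("bad item size: %r" % (size,))
    return byteorder, kind, int(size)

print(parse_typestr("<f8"))
print(parse_typestr("|S30"))
```

A consumer that only handles floats can reject everything else after 
looking at one character, without enumerating every size of every type.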

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Travis Oliphant

Jim Jewett wrote:
> I'm still not sure exactly what is missing from ctypes.  To make this 
> concrete:

I think the only thing missing from ctypes "expressiveness" as far as I 
can tell in terms of what you "can" do is the byte-order representation.

What is missing is ease of use for producers and consumers in 
interpreting the data-type.   When I speak of producers and consumers, 
I'm largely talking about writers of C (or Java or .NET) code.

Producers must basically use Python code to create classes of various 
types.   This is going to be slow in C.  Probably slower than the 
array interface (which is what we have now informally).

Consumers are going to have a hard time interpreting the result.  I'm 
not even sure how to do that, in fact.  I'd like NumPy to be able to 
understand ctypes as a means to specify data.  Would I have to check 
against all the sub-types of CDataType, pull out the fields, check the 
tp_name of the type object?  I'm not sure.
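
For what it's worth, here is roughly what that walk looks like from the 
Python side.  This is only a sketch of the idea, not how NumPy or ctypes 
actually do it: describe is a hypothetical helper, it ignores unions, 
pointers and bit-field widths, and a C-level consumer would have to do 
the equivalent through the C API.

```python
from ctypes import Array, Structure, c_char, c_int

def describe(ctype):
    """Flatten a ctypes type into nested lists/tuples of plain values."""
    if issubclass(ctype, Structure):
        # _fields_ entries are (name, type) or (name, type, bit_width)
        return [(field[0], describe(field[1])) for field in ctype._fields_]
    if issubclass(ctype, Array):
        # array types record their element type and length
        return (describe(ctype._type_), ctype._length_)
    # simple type: fall back to the class name
    return ctype.__name__

class Nested(Structure):
    _fields_ = [("name", c_char * 30),
                ("addr", c_char * 45),
                ("amount", c_int)]

class Record(Structure):
    _fields_ = [("simple", c_int), ("nested", Nested)]

description = describe(Record * 500)
print(description)
```

Note that c_int may be an alias for c_long on platforms where the two 
have the same size, so the reported name is platform dependent.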

It seems like a string with the C-structure would be better as a 
data-representation, but then a third-party library would want to parse 
that so that Python might as well have its own parser for data-types. 

So, Python might as well have its own way to describe data.  My claim 
is this default way should *not* be overloaded by using Python 
type-objects (the ctypes way).  I'm arguing for the NumPy way of 
using a different Python object to describe data-types.  I'm not saying 
the NumPy object should be used.  I'm saying we should come up with a 
single DataFormatType whose instances express the data formats in ways 
that other packages can produce and consume (or even use internally).  

It would be easy for NumPy to "use" the default Python object in its 
PyArray_Descr * structure.  It would also be easy for ctypes to "use" 
the default Python object in its StgDict object that is the tp_dict of 
every ctypes type object.

It would be easy for the struct module to allow for this data-format 
object (instead of just strings) in its methods. 

It would be easy for the array module to accept this data-format object 
(instead of just typecodes) in its constructor.

Lots of things would suddenly be more consistent throughout both the 
Python and C-Python user space.
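
To make the "instance, not class" idea a bit more tangible, here is a 
toy sketch.  DataFormat is a hypothetical stand-in written for this 
example, not the object proposed in the PEP; it just wraps struct format 
codes so that one instance can describe a record and be handed to 
struct-like code.

```python
import struct
from dataclasses import dataclass

@dataclass(frozen=True)
class DataFormat:
    """Hypothetical data-format instance: named fields plus byte order."""
    fields: tuple            # ((name, struct_code), ...)
    byteorder: str = "<"     # "<" means little-endian, no padding

    @property
    def struct_format(self):
        return self.byteorder + "".join(code for _, code in self.fields)

    @property
    def itemsize(self):
        return struct.calcsize(self.struct_format)

    def unpack(self, buffer):
        values = struct.unpack(self.struct_format, buffer)
        return dict(zip((name for name, _ in self.fields), values))

record = DataFormat((("simple", "i"), ("name", "30s"),
                     ("addr", "45s"), ("amount", "i")))

raw = struct.pack(record.struct_format, 7, b"Ada", b"12 Main St", 100)
decoded = record.unpack(raw)
print(record.itemsize, decoded["simple"], decoded["amount"])
```

No new type object is created per format here; describing a different 
record only means building another instance.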

Perhaps after discussion, it becomes clear that the ctypes approach is 
sufficient to be "that thing" that all modules use to share data-format 
information.  It's definitely expressive enough.   But, my argument is 
that NumPy data-type objects are also "pretty close", so why should they 
be rejected?  We could also make a "string-syntax" do it.

>
> You have said that creating whole classes is too much overhead, and
> the description should only be an instance.  To me, that particular
> class (arrays of 500 structs) still looks pretty lightweight.  So
> please clarify when it starts to be a problem.
>

> (1)  For simple types -- mapping
>   char name[30];  ==> ("name", c_char*30)
>
> Do you object to using the c_char type?
> Do you object to the array-of-length-30 class, instead of just having
> a repeat or shape attribute?
> Do you object to naming the field?
>
> (2)  For the complex types, nested and struct
>
> Do you object to creating these two classes even once?   For example,
> are you expecting to need different classes for each buffer, and to
> have many buffers created quickly?
I object to the way I "consume" and "produce" the ctypes interface.  
It's much too slow to be used on the C-level for sharing many small 
buffers quickly.
>
> Is creating that new class a royal pain, but frequent (and slow)
> enough that you can't just make a call into python (or ctypes)?
>
> (3)  Given that you will describe X, is X*500 (==> a type describing
> an array of 500 Xs) a royal pain in C?  If so, are you expecting to
> have to do it dynamically for many sizes, and quickly enough that you
> can't just let ctypes do it for you?

That pretty much sums it up (plus the pain of having to basically write 
Python code from "C").

-Travis


[Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Jim Jewett

I'm still not sure exactly what is missing from ctypes.  To make this concrete:

You have an array of 500 elements meeting

struct {
  int  simple;
  struct nested {
    char name[30];
    char addr[45];
    int  amount;
  } nested;
};

ctypes can describe this as

from ctypes import Structure, c_char, c_int

class nested(Structure):
    _fields_ = [("name", c_char*30),
                ("addr", c_char*45),
                ("amount", c_int)]    # C int maps to c_int

class struct(Structure):
    _fields_ = [("simple", c_int), ("nested", nested)]

desc = struct * 500
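
A short usage sketch of such a description, for readers unfamiliar with 
ctypes.  The class names Nested, Record and RecordArray are chosen here 
for illustration, and the printed sizes depend on the platform's int 
size and alignment rules:

```python
from ctypes import Structure, c_char, c_int, sizeof

class Nested(Structure):
    _fields_ = [("name", c_char * 30),
                ("addr", c_char * 45),
                ("amount", c_int)]

class Record(Structure):
    _fields_ = [("simple", c_int), ("nested", Nested)]

# The array type itself is the "description" of 500 records.
RecordArray = Record * 500

# Instantiating it allocates one zero-filled, contiguous block.
records = RecordArray()
records[0].simple = 1
records[0].nested.name = b"example"
records[0].nested.amount = 42

print(sizeof(Record), sizeof(RecordArray), len(records))
```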

You have said that creating whole classes is too much overhead, and
the description should only be an instance.  To me, that particular
class (arrays of 500 structs) still looks pretty lightweight.  So
please clarify when it starts to be a problem.

(1)  For simple types -- mapping
   char name[30];  ==> ("name", c_char*30)

Do you object to using the c_char type?
Do you object to the array-of-length-30 class, instead of just having
a repeat or shape attribute?
Do you object to naming the field?

(2)  For the complex types, nested and struct

Do you object to creating these two classes even once?   For example,
are you expecting to need different classes for each buffer, and to
have many buffers created quickly?

Is creating that new class a royal pain, but frequent (and slow)
enough that you can't just make a call into python (or ctypes)?

(3)  Given that you will describe X, is X*500 (==> a type describing
an array of 500 Xs) a royal pain in C?  If so, are you expecting to
have to do it dynamically for many sizes, and quickly enough that you
can't just let ctypes do it for you?

-jJ

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Bill Baxter

Martin v. Löwis writes:

> 
> Bill Baxter schrieb:
> > Basically in my code I want to be able to take the binary data descriptor
> > and say "give me the 'r' field of this pixel as an integer".
> > 
> > Is either one (the PEP or c-types) clearly easier to use in this case?
> > What would the code look like for handling both formats generically?
> 
> The PEP, as specified, does not support accessing individual fields from
> Python. OTOH, ctypes, as implemented, does. This comparison is not fair,
> though: an *implementation* of the PEP (say, NumPy) might also give you
> Python-level access to the fields.

I see.  So at the Python-user convenience level it's pretty much a wash.  Are
there significant differences in memory usage and/or performance?  ctypes 
sounds to be more heavyweight from the discussion.  If I have a lot of image
formats I want to support is that going to mean lots of overhead with ctypes?
Do I pay for it whether or not I actually end up having to handle an image in a
given format?

> With the PEP, you can get access to the 'r' field from C code.
> Performing this access is quite tedious; as I'm uncertain whether you
> actually wanted to see C code, I refrain from trying to formulate it.

Actually this is more what I was after.  I've written C code to interface with
Numpy arrays and found it to be not so bad.  But the data I was passing around
was just a plain N-dimensional array of doubles.  Very basic.  It *sounds* like
what Travis is saying is that handling a less simple case, like the one above
of supporting a variety of RGB image formats, would be easier with the PEP than
with ctypes.  Or maybe it's generating the data in my C code that's trickier,
as opposed to consuming it?

I'm just trying to understand what the deal is, and at the same time perhaps
inject a more concrete example into the discussion. Travis has said several
times that working with ctypes, which requires a Python type per 'element', is
more complicated from the C side, and I'd like to see more concretely how so,
as someone who may end up needing to write such code.

And I'm ok without seeing the actual code if someone can actually answer my
question.  The question is not whether it is tedious or not -- everything about
the Python C API is tedious from what I've seen.  The question is which is
*more* tedious, and how significant is the difference in tediousness to the guy
whose job it is to actually write the code.

--bb


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Nick Coghlan

Travis Oliphant wrote:
> Nick Coghlan wrote:
>> In fact, it may make sense to just use the lists/strings directly as the 
>> data exchange format definitions, and let the various libraries do their 
>> own translation into their private format descriptions instead of creating 
>> a new one-type-to-describe-them-all.
> 
> Yes, I'm open to this possibility.   I basically want two things in the 
> object passed through the extended buffer protocol:
> 
> 1) It's fast on the C-level
> 2) It covers all the use-cases.
> 
> If just a particular string or list structure were passed, then I would 
> drop the data-format PEP and just have the dataformat argument of the 
> extended buffer protocol be that thing.
> 
> Then, something that converts ctypes objects to that special format 
> would be very nice indeed.

It may make sense to have a couple distinct sections in the datatype PEP:
  a. describing data formats with basic Python types
  b. a lightweight class for parsing these data format descriptions

It's most of the way there already - part A would just be the various styles 
of arguments accepted by the datatype constructor, and part B would be the 
datatype object itself.

I personally think it makes the most sense to do both, but separating the two 
would make it clear that the descriptions can be standardised without 
*necessarily* defining a new class.
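
As a rough illustration of part (a), a record could be described with 
nothing but lists, tuples and typestrings, and a consumer could compute 
what it needs from that directly.  The notation below borrows the 
array-interface typestring style; the itemsize helper is hypothetical 
and assumes a packed layout with no padding:

```python
def itemsize(description):
    """Total size in bytes of a packed (unpadded) description."""
    if isinstance(description, str):
        # typestring such as '<i4' or '|S30': the size follows the kind
        return int(description[2:])
    # otherwise: a list of (field_name, sub_description) pairs
    return sum(itemsize(sub) for _name, sub in description)

record = [("simple", "<i4"),
          ("nested", [("name", "|S30"),
                      ("addr", "|S45"),
                      ("amount", "<i4")])]

print(itemsize(record))
```

Part (b) would then be a class that accepts exactly this kind of input 
and caches the parsed result.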

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Martin v. Löwis

Bill Baxter schrieb:
> Basically in my code I want to be able to take the binary data descriptor and
> say "give me the 'r' field of this pixel as an integer".
> 
> Is either one (the PEP or c-types) clearly easier to use in this case?  What
> would the code look like for handling both formats generically?

The PEP, as specified, does not support accessing individual fields from
Python. OTOH, ctypes, as implemented, does. This comparison is not fair,
though: an *implementation* of the PEP (say, NumPy) might also give you
Python-level access to the fields.
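
For the Python-level side, a sketch of what the ctypes spelling might 
look like for the two pixel formats.  The class names and the channel 
helper are made up for this example, and the placement of bit-fields 
within the storage unit follows the platform's C compiler conventions, so 
this says nothing about the layout of any particular image file format:

```python
from ctypes import Structure, c_uint16, c_uint32, sizeof

class RGB565(Structure):
    # the third tuple element is the bit-field width
    _fields_ = [("r", c_uint16, 5),
                ("g", c_uint16, 6),
                ("b", c_uint16, 5)]

class RGB101210(Structure):
    _fields_ = [("r", c_uint32, 10),
                ("g", c_uint32, 12),
                ("b", c_uint32, 10)]

def channel(pixel, name):
    """Return the named channel of any pixel structure as an int."""
    return int(getattr(pixel, name))

small = RGB565(r=31, g=63, b=31)
large = RGB101210(r=1023, g=4095, b=1023)

for pixel in (small, large):
    print(type(pixel).__name__, sizeof(pixel), channel(pixel, "r"))
```

The generic part is just attribute lookup by name; the same helper works 
for both formats because each class carries its own field table.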

With the PEP, you can get access to the 'r' field from C code.
Performing this access is quite tedious; as I'm uncertain whether you
actually wanted to see C code, I refrain from trying to formulate it.

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Ronald Oussoren



On Oct 31, 2006, at 6:38 PM, Thomas Heller wrote:



> This mechanism is probably a hack because it's not possible to add
> C accessible fields to type objects, on the other hand it is extensible
> (in principle, at least).


I better start rewriting PyObjC then :-). PyObjC stores some additional  
information in the type objects that are used to describe Objective-C  
classes (such as a reference to the proxied class).


IIRC This has been possible from Python 2.3.

Ronald





Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Bill Baxter

One thing I'm curious about in the ctypes vs this PEP debate is the following. 
How do the approaches differ in practice if I'm developing a library that wants
to accept various image formats that all describe the same thing: rgb data. 
Let's say for now all I want to support is two different image formats whose
pixels are described in C structs by:

struct rgb565
{
  unsigned short r:5;
  unsigned short g:6;
  unsigned short b:5; 
};

struct rgb101210
{
  unsigned int r:10;
  unsigned int g:12;
  unsigned int b:10; 
};


Basically in my code I want to be able to take the binary data descriptor and
say "give me the 'r' field of this pixel as an integer".

Is either one (the PEP or c-types) clearly easier to use in this case?  What
would the code look like for handling both formats generically?

--bb



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Ron Adam

> The only benefit I imagine would be for an extension module library 
> writer and for users of the struct and array modules.  But, other than 
> that, I don't know.  It actually doesn't have to be exposed to Python. 
> I used Python notation in the PEP to explain what is basically a 
> C-structure.  I don't care if the object ever gets exposed to Python.
> 
> Maybe that's part of the communication problem.


I get the impression that where ctypes is good for accessing native C 
libraries from within Python, the data-type object is meant to add a more 
direct and efficient way to share native Python objects' *data* with C (or 
other languages).  For data that can be represented well in contiguous 
memory, it lightens the load, so instead of a list of Python objects you 
get an "array of data for n python_type objects" without duplicating the 
Python type for every element.
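
The standard array module already shows the idea on a small scale: one 
typecode describes every element, and the data sits in a single 
contiguous block.  A brief sketch:

```python
from array import array

# One typecode ("d" means C double) describes all elements at once.
values = array("d", [0.5, 1.5, 2.5])

# The payload is a single contiguous block of itemsize-byte elements.
raw = values.tobytes()
print(values.typecode, values.itemsize, len(raw))

# The same bytes can be reinterpreted by anyone who knows the typecode.
copy = array("d")
copy.frombytes(raw)
print(copy == values)
```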

I think maybe some more complete examples demonstrating how it is to be used 
from both Python and C would be good.

Cheers,
Ron


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant

Paul Moore wrote:
> On 10/31/06, Travis Oliphant <[EMAIL PROTECTED]> wrote:
> 
>>Martin v. Löwis wrote:
>>
>>>[...] because I still don't quite understand what the PEP
>>>wants to achieve.
>>>
>>
>>Are you saying you still don't understand after having read the extended
>>buffer protocol PEP, yet?
> 
> 
> I can't speak for Martin, but I don't understand how I, as a Python
> programmer, might use the data type objects specified in the PEP. I
> have skimmed the extended buffer protocol PEP, but I'm conscious that
> no objects I currently use support the extended buffer protocol (and
> the PEP doesn't mention adding support to existing objects), so I
> don't see that as too relevant to me.

Do you use the PIL?  The PIL supports the array interface.

CVXOPT supports the array interface.

Numarray
Numeric
NumPy

all support the array interface.

> 
> I have also installed numpy, and looked at the help for numpy.dtype,
> but that doesn't add much to the PEP. 

The source-code is available.

> The freely available chapters of
> the numpy book explain how dtypes describe data structures, but not
> how to use them. The freely available Numeric documentation doesn't
> refer to dtypes, as far as I can tell. 

It kind of does, they are PyArray_Descr * structures in Numeric.  They 
just aren't Python objects.

> Is there any documentation on
> how to use dtypes, independently of other features of numpy? 

There are examples and other help pages at http://www.scipy.org

> If not,
> can you clarify where the benefit lies for a Python user of this
> proposal? (I understand the benefits of a common language for
> extensions to communicate datatype information, but why expose it to
> Python? How do Python users use it?)
> 

The only benefit I imagine would be for an extension module library 
writer and for users of the struct and array modules.  But, other than 
that, I don't know.  It actually doesn't have to be exposed to Python. 
I used Python notation in the PEP to explain what is basically a 
C-structure.  I don't care if the object ever gets exposed to Python.

Maybe that's part of the communication problem.

> This is probably all self-evident to the numpy community, but I think
> that as the PEP is aimed at a wider audience it needs a little more
> background.

It's hard to write that background because most of what I understand is 
from the NumPy community.  I can't give you all the examples but my 
concern is that you have all these third party libraries out there 
describing what is essentially binary data and using either 
string-copies or the buffer protocol + extra information obtained by 
some method or attribute that varies across the implementations.  There 
should really be a standard for describing this data.

There are attempts at it in the struct and array module.  There is the 
approach of ctypes but I claim that using Python type objects is 
over-kill for the purposes of describing data-formats.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Josiah Carlson

"Paul Moore" <[EMAIL PROTECTED]> wrote:
> On 10/31/06, Travis Oliphant <[EMAIL PROTECTED]> wrote:
> > Martin v. Löwis wrote:
> > > [...] because I still don't quite understand what the PEP
> > > wants to achieve.
> > >
> >
> > Are you saying you still don't understand after having read the extended
> > buffer protocol PEP, yet?
> 
> I can't speak for Martin, but I don't understand how I, as a Python
> programmer, might use the data type objects specified in the PEP. I
> have skimmed the extended buffer protocol PEP, but I'm conscious that
> no objects I currently use support the extended buffer protocol (and
> the PEP doesn't mention adding support to existing objects), so I
> don't see that as too relevant to me.

Presumably str in 2.x and bytes in 3.x could be extended to support the
'S' specifier, unicode in 2.x and text in 3.x could be extended to
support the 'U' specifier.  The various array.array variants could be
extended to support all relevant specifiers, etc.

> This is probably all self-evident to the numpy community, but I think
> that as the PEP is aimed at a wider audience it needs a little more
> background.

Someone correct me if I am wrong, but it makes things equivalent to the
following, which are available in C, available in Python...

typedef struct {
    char R;
    char G;
    char B;
    char A;
} pixel_RGBA;

pixel_RGBA image[1024][768];

Or even...

typedef struct {
    long long numerator;
    unsigned long long denominator;
    double approximation;
} rational;

rational ratios[1024];

The real use is that after you have your array of (packed) objects, be
it one of the above samples, or otherwise, you don't need to explicitly
pass around specifiers (like in struct, or ctypes), numpy and others can
talk to each other, and pick up the specifier with the extended buffer
protocol, and it just works.
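
As a sketch of what "picking up the specifier" can look like, here is 
how it reads in present-day Python 3, where a revised buffer protocol is 
available through memoryview.  This shows current behaviour of the 
standard library, not the API proposed in this thread:

```python
from array import array

samples = array("i", [1, 2, 3, 4])

# The consumer gets the format along with the memory, not separately.
view = memoryview(samples)
print(view.format, view.itemsize, view.shape)

# The same memory can also be viewed as plain unsigned bytes.
as_bytes = view.cast("B")
print(len(as_bytes))
```

The consumer never has to be told "this is an array of C ints"; the 
exporting object carries that information with the buffer.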

 - Josiah


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Paul Moore

On 10/31/06, Travis Oliphant <[EMAIL PROTECTED]> wrote:
> Martin v. Löwis wrote:
> > [...] because I still don't quite understand what the PEP
> > wants to achieve.
> >
>
> Are you saying you still don't understand after having read the extended
> buffer protocol PEP, yet?

I can't speak for Martin, but I don't understand how I, as a Python
programmer, might use the data type objects specified in the PEP. I
have skimmed the extended buffer protocol PEP, but I'm conscious that
no objects I currently use support the extended buffer protocol (and
the PEP doesn't mention adding support to existing objects), so I
don't see that as too relevant to me.

I have also installed numpy, and looked at the help for numpy.dtype,
but that doesn't add much to the PEP. The freely available chapters of
the numpy book explain how dtypes describe data structures, but not
how to use them. The freely available Numeric documentation doesn't
refer to dtypes, as far as I can tell. Is there any documentation on
how to use dtypes, independently of other features of numpy? If not,
can you clarify where the benefit lies for a Python user of this
proposal? (I understand the benefits of a common language for
extensions to communicate datatype information, but why expose it to
Python? How do Python users use it?)

This is probably all self-evident to the numpy community, but I think
that as the PEP is aimed at a wider audience it needs a little more
background.

Paul.

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Martin v. Löwis

Travis Oliphant schrieb:
> I think it actually is.  Perhaps I'm wrong, but a type-object is still a 
> special kind of an instance of a meta-type.  I once tried to add 
> function pointers to a type object by inheriting from it.  But, I was 
> told that Python is not set up to handle that.  Maybe I misunderstood.

I'm not quite sure what the problems are: one "obvious" problem is
that the next Python version may also extend the size of type objects.
But, AFAICT, even that should "work", in the sense that this new version
should check for the presence of a flag to determine whether the
additional fields are there. The only tricky question is how you can
find out whether your own extension is there.

If that is a common problem, I think a framework could be added to
support extensible type objects (with some kind of registry for
additional fields, and a per-type-object indicator whether a certain
extension field is present).

> Let me be very clear.  The whole reason I make any statements about 
> ctypes is because somebody else brought it up.  I'm not trying to 
> replace ctypes and the way it uses type objects to represent data 
> internally.

Ok. I understood you differently earlier.

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant

Martin v. Löwis wrote:
> Stephan Tolksdorf schrieb:
> 
>>While Travis' proposal encompasses the data format functionality within 
>>the struct module and overlaps with what ctypes has to offer, it does 
>>not aim to replace ctypes.
> 
> 
> This discussion could have been a lot shorter if he had said so.
> Unfortunately (?) he stated that it was *precisely* a motivation
> of the PEP to provide a standard data description machinery that
> can then be adopted by the struct, array, and ctypes modules.

Struct and array I was sure about.  Ctypes less sure.  I'm very sorry 
for the distraction I caused by mis-stating my objective.   My objective 
is really the extended buffer protocol.  The data-type object is a means 
to that end.

I do think ctypes could make use of the data-type object and that there 
is a real difference between using Python type objects as data-format 
descriptions and using another Python type for those descriptions.  I 
thought to go the ctypes route (before I even knew what ctypes did) but 
decided against it for a number of reasons.

But, nonetheless those are side issues.  The purpose of the PEP is to 
provide an object that the extended buffer protocol can use to share 
data-format information.  It should be considered primarily in that context.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Thomas Heller

Travis Oliphant schrieb:
> For example, I'm pretty sure you were the one who made me aware that you 
> can't just extend the PyTypeObject.  Instead you extended the tp_dict of 
> the Python typeObject to store some of the extra information that is 
> needed to describe a data-type like I'm proposing.
> 
> So, if I'm just describing data-format information, why do I need 
> all this complexity (that makes ctypes implementation easier/more 
> natural/etc)?  What if the StgDictObject is the Python data-format 
> object I'm talking about?  It actually looks closer.
> 
> But, if all I want is the StgDictObject (or something like it), then why 
> should I pass around the whole type object?

Maybe you don't need it.  ctypes certainly needs the type object because
it is also used for constructing instances (while NumPy uses factory functions,
IIUC), or for converting 'native' Python object into foreign function arguments.
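
A small sketch of that dual role, using only the public ctypes API (the 
Point class is an example name): the same type object answers layout 
questions and also acts as the constructor for instances.

```python
from ctypes import Structure, alignment, c_double, sizeof

class Point(Structure):
    _fields_ = [("x", c_double), ("y", c_double)]

# Role 1: the type object describes the memory layout.
layout = (sizeof(Point), alignment(Point))

# Role 2: the same type object constructs instances,
# either from Python values or from raw bytes.
p = Point(1.0, 2.0)
q = Point.from_buffer_copy(bytes(p))

print(layout, q.x, q.y)
```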

I know that this doesn't interest you from the NumPy perspective (and I
don't want to offend you by saying this).

> This is all I'm saying to those that want me to use ctypes to describe 
> data-formats in the extended buffer protocol.  I'm not trying to change 
> anything in ctypes.

I don't want to change anything in NumPy, either, and was not the one who
suggested to use ctypes objects, although I had thought about whether it
would be possible or not.

What I like about ctypes, and dislike about Numeric/Numarray/NumPy is
the way C compatible types are defined in ctypes.  I find the ctypes
way more natural than the numxxx or array module way, but what else would
anyone expect from me as the ctypes author...

I hope that a useful interface is developed from your proposals, and
will be happy to adapt ctypes to use it or interface ctypes with it
if this makes sense.

Thomas


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant

Thomas Heller wrote:
> 
> (I tried to read the whole thread again, but it is too large already.)
> 
> There is a (badly named, probably) api to access information
> about ctypes types and instances of this type.  The functions are
> PyObject_stgdict(obj) and PyType_stgdict(type).  Both return a
> 'StgDictObject' instance or NULL if the function fails.  This object
> is the ctypes' type object's __dict__.
> 
> StgDictObject is a subclass of PyDictObject and has fields that
> carry information about the C type (alignment requirements, size in bytes,
> plus some other stuff).  Also it contains several pointers to functions
> that implement (in C) struct-like functionality (packing/unpacking).
> 
> Of course several of these fields can only be used for ctypes-specific
> purposes, for example a pointer to the ffi_type which is used when
> calling foreign functions, or the restype, argtypes, and errcheck fields
> which are only used when the type describes a function pointer.
> 
> 
> This mechanism is probably a hack because it's not possible to add C
> accessible fields to type objects, on the other hand it is extensible
> (in principle, at least).
> 

Thank you for the description.  While I've studied the ctypes code, I 
still don't understand the purposes behind all the data-structures.

Also, I really don't have an opinion about ctypes' implementation.   All 
my comparisons simply reflect my resistance to the "unexplained" idea that 
I'm supposed to use ctypes objects in a way they weren't really designed 
to be used.

For example, I'm pretty sure you were the one who made me aware that you 
can't just extend the PyTypeObject.  Instead you extended the tp_dict of 
the Python typeObject to store some of the extra information that is 
needed to describe a data-type like I'm proposing.

So, if I'm just describing data-format information, why do I need 
all this complexity (that makes the ctypes implementation easier/more 
natural/etc)?  What if the StgDictObject is the Python data-format 
object I'm talking about?  It actually looks closer.

But, if all I want is the StgDictObject (or something like it), then why 
should I pass around the whole type object?

This is all I'm saying to those that want me to use ctypes to describe 
data-formats in the extended buffer protocol.  I'm not trying to change 
anything in ctypes.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant

Martin v. Löwis wrote:
> Travis E. Oliphant schrieb:
> 
>>But, there are distinct disadvantages to this approach compared to what 
>>I'm trying to allow.   Martin claims that the ctypes approach is 
>>*basically* equivalent but this is just not true.
> 
> 
> I may claim that, but primarily, my goal was to demonstrate that the
> proposed PEP cannot be used to describe ctypes object layouts (without
> checking, I can readily believe that the PEP covers everything in
> the array and struct modules).
> 

That's a fine argument.  You are right in terms of the PEP as it stands. 
However, I want to make clear that a single Python type object *could* 
be used to describe data including all the cases you laid out.  It would 
not be difficult to extend the PEP to cover all the cases you've 
described --- I'm not sure that's desirable.  I'm not trying to replace 
what ctypes does.  I'm just trying to get something that we can use to 
exchange data-format information through the extended buffer protocol.

It really comes down to using Python type-objects as the instances 
describing data-formats (which ctypes does) or "normal" Python objects 
as the instances describing data-formats (what the PEP proposes).

> 
>>It could be made more 
>>true if the ctypes objects inherited from a "meta-type" and if Python 
>>allowed meta-types to expand their C-structures.  But, last I checked 
>>this is not possible.
> 
> 
> That I don't understand. a) what do you think is not possible?

Extending the C-structure of PyTypeObject and having Python types use 
that as their "type-object".

> b) why is that an important difference between a datatype and a ctype?

Because with instances of C-types you are stuck with the PyTypeObject 
structure.  If you want to add anything you have to do it in the 
dictionary.

Instances of a datatype allow adding anything after the PyObject_HEAD 
structure.

> 
> If you are suggesting that, given two Python types A and B, and
> B inheriting from A, that the memory layout of B cannot extend
> the memory layout of A, then: that is certainly possible in Python,
> and there are many examples for it.
>

I know this.  I've done it for many different objects.  I'm saying it's 
not quite the same when what you are extending is the PyTypeObject and 
trying to use it as the type object for some other object.

> 
>>A Python type object is a very particular kind of Python-type.  As far 
>>as I can tell, it's not as flexible in terms of the kinds of things you 
>>can do with the "instances" of a type object (i.e. what ctypes types 
>>are) on the C-level.
> 
> 
> Ah, you are worried that NumArray objects would have to be *instances*
> of ctypes types. That wouldn't be necessary at all. Instead, if each
> NumArray object had a method get_ctype(), which returned a ctypes type,
> then you would get the same descriptiveness that you get with the
> PEP's datatype.
> 

No, I'm not worried about that (It's not NumArray by the way, it's 
NumPy.  NumPy replaces both NumArray and Numeric).

NumPy actually interfaces with ctypes quite well.  This is how I learned 
anything I might know about ctypes.  So, I'm well aware of this.

What I am concerned about is using Python type objects (i.e. Python 
objects that can be cast in C to PyTypeObject *) outside of ctypes to 
describe data-formats when you don't need it and it just complicates 
dealing with the data-format description.

> 
>>Where is the discussion that crowned the ctypes way of doing things as 
>>"the one true way"
> 
> 
> It hasn't been crowned this way. Me, personally, I just said two things
> about this PEP and ctypes:

Thanks for clarifying, but I know you didn't say this.  Others, however, 
basically did.

> a) the PEP does not support all concepts that ctypes needs

It could be extended, but I'm not sure it *needs* to be in its real 
context.  I'm very sorry for contributing to the distraction that ctypes 
should adopt the PEP.  My words were unclear.  But, I'm not pushing for 
that.  I really have no opinion how ctypes describes data.

> b) ctypes can express all examples in the PEP
> in response to your proposal that ctypes should adopt the PEP, and
> that ctypes is not good enough to be the one true way.
> 

I think it is "good enough" in the semantic sense.  But, I think using 
type objects in this fashion for general-purpose data-description is 
over-kill and will be much harder to extend and deal with.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant

Martin v. Löwis wrote:
> Travis Oliphant schrieb:
> 
>>The big difference, however, is that by going this route you are forced 
>>to use the "type object" as your data-format "instance".
> 
> 
> Since everything is an object (an "instance") in Python, this is not
> such a big difference.
> 

I think it actually is.  Perhaps I'm wrong, but a type-object is still a 
special kind of an instance of a meta-type.  I once tried to add 
function pointers to a type object by inheriting from it.  But, I was 
told that Python is not set up to handle that.  Maybe I misunderstood.

Let me be very clear.  The whole reason I make any statements about 
ctypes is because somebody else brought it up.  I'm not trying to 
replace ctypes and the way it uses type objects to represent data 
internally.   All I'm trying to do is come up with a way to describe 
data-types through a buffer protocol.  The way ctypes does it is "too" 
bulky by defining a new Python type for every data-format.

While semantically you may talk about the equivalency of types being 
instances of a "meta-type" and regular objects being instances of a 
type, my understanding is still that there are practical differences 
when it comes to implementation --- and certain things that "can't be done".

Here's what I mean by the difference.

This is akin to what I'm proposing

typedef struct {
    PyObject_HEAD
    /* whatever you need to represent your instance
       Quite a bit of flexibility
    */
} PyDataFormatObject;

A Python type object (what every C-types data-format "type" inherits 
from) has a C-structure

typedef struct _typeobject {
    PyObject_VAR_HEAD
    char *tp_name;
    int tp_basicsize, tp_itemsize;

    /* Methods to implement standard operations */

    destructor tp_dealloc;
    printfunc tp_print;
    getattrfunc tp_getattr;
    setattrfunc tp_setattr;
    cmpfunc tp_compare;
    reprfunc tp_repr;

    ...
    ...

    PyObject *tp_bases;
    PyObject *tp_mro; /* method resolution order */
    PyObject *tp_cache;
    PyObject *tp_subclasses;
    PyObject *tp_weaklist;
    destructor tp_del;

    ... /* + more under certain conditions */
} PyTypeObject;

Why in the world do we need to carry all this extra baggage around in 
each data-format instance in order to just describe data?  I can see why 
it's useful for ctypes to do it and that's fine.  But, the argument that 
every exchange of data-format information should use this type-object 
instance is hard to swallow.
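
The size argument above can be made concrete with a minimal Python-level sketch. It assumes CPython (where sys.getsizeof is available). DataFormat is a hypothetical stand-in for the kind of lightweight instance the PEP proposes, not the PEP's actual object; it is compared against the ctypes type object that describes the same 32-bit integer.

```python
import ctypes
import sys


class DataFormat:
    """Hypothetical single type whose *instances* describe data formats."""

    __slots__ = ("kind", "itemsize", "byteorder")

    def __init__(self, kind, itemsize, byteorder="="):
        self.kind = kind
        self.itemsize = itemsize
        self.byteorder = byteorder


# One small instance per format, all sharing a single Python type.
int32_format = DataFormat("i", 4)

# The ctypes description of the same data is a full type object.
instance_bytes = sys.getsizeof(int32_format)
type_bytes = sys.getsizeof(ctypes.c_int32)
print(instance_bytes, type_bytes)
```

The exact numbers depend on the Python version and platform; only the difference in scale between an instance and a type object matters here.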

So, I'm happy to let ctypes continue on doing what it's doing trusting 
its developers to have done something good.  I'd be happy to drop any 
reference to ctypes.  The only reason to have the data-type objects is 
something to pass as part of the extended buffer protocol.

> 
> 
> Can you explain why that is? In the PEP, I see two C functions:
> setitem and getitem. I think they can be implemented readily with
> ctypes' GETFUNC and SETFUNC function pointers that it uses
> all over the place.

Sure, but where do these function pointers live and where are they 
stored?  In ctypes it's in the CField_object.  Now, this is closer to 
what I'm talking about.  But why is it not the same thing?  Why yet 
another type object to talk about fields of a structure?

These are rhetorical questions.  I really don't expect or need an answer 
because I'm not questioning why ctypes did what it did for solving the 
problem it was solving.  I am questioning anyone who claims that we 
should use this mechanism for describing data-formats in the extended 
buffer protocol.

> 
> I don't see a requirement to support C structure members or
> function pointers in the datatype object.
> 
> 
>>There are a few people claiming I should use the ctypes type-hierarchy 
>>but nobody has explained how that would be possible given the 
>>attributes, C-structure members and C-function pointers that I'm proposing.
> 
> 
> Ok, here you go. Remember, I'm still not claiming that this should be
> done: I'm just explaining how it could be done.

O.K.  Thanks for putting in the effort.   It doesn't answer my real 
concerns, though.

>>It was clear to me that we were "on to something".  Now, the biggest 
>>claim against the gist of what I'm proposing (details we can argue 
>>about), seems from my perspective to be a desire to "go backwards" and 
>>carry data-type information around with a Python type.
> 
> 
> I, at least, have no such desire. I just explained that the ctypes
> model of memory layouts is just as expressive as the one in the
> PEP. 

I agree with this.  I'm very aware of what "can" be expressed.  I just 
think it's too awkward and bulky to use in the extended buffer protocol.

> Which of these is "better" for what the PEP wants to achieve,
> I can't say, because I still don't quite understand what the PEP
> wants to achieve.
>

Are you saying you still don't understand after having read the extended 
buffer protocol PEP,

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Thomas Heller

Travis Oliphant schrieb:
> Greg Ewing wrote:
>> Travis Oliphant wrote:
>> 
>> 
>>>Part of the problem is that ctypes uses a lot of different Python types 
>>>(that's what I mean by "multi-object") to accomplish its goal.  What 
>>>I'm looking for is a single Python type that can be passed around and 
>>>explains binary data.
>> 
>> 
>> It's not clear that multi-object is a bad thing in and
>> of itself. It makes sense conceptually -- if you have
>> a datatype object representing a struct, and you ask
>> for a description of one of its fields, which could
>> be another struct or array, you would expect to get
>> another datatype object describing that.
>> 
>> Can you elaborate on what would be wrong with this?
>> 
>> Also, can you clarify whether your objection is to
>> multi-object or multi-type. They're not the same thing --
>> you could have a data structure built out of multiple
>> objects that are all of the same Python type, with
>> attributes distinguishing between struct, array, etc.
>> That would be single-type but multi-object.
> 
> I've tried to clarify this in another post.  Basically, what I don't 
> like about the ctypes approach is that it is multi-type (every new 
> data-format is a Python type).
> 
> In order to talk about all these Python types together, then they must 
> all share some attribute (or else be derived from a meta-type in C with 
> a specific function-pointer entry).

(I tried to read the whole thread again, but it is too large already.)

There is a (badly named, probably) api to access information
about ctypes types and instances of this type.  The functions are
PyObject_stgdict(obj) and PyType_stgdict(type).  Both return a
'StgDictObject' instance or NULL if the function fails.  This object
is the ctypes' type object's __dict__.

StgDictObject is a subclass of PyDictObject and has fields that
carry information about the C type (alignment requirements, size in bytes,
plus some other stuff).  Also it contains several pointers to functions
that implement (in C) struct-like functionality (packing/unpacking).

Of course several of these fields can only be used for ctypes-specific
purposes, for example a pointer to the ffi_type which is used when
calling foreign functions, or the restype, argtypes, and errcheck fields
which are only used when the type describes a function pointer.

This mechanism is probably a hack because it's not possible to add
C-accessible fields to type objects; on the other hand, it is extensible
(in principle, at least).

Just to describe the implementation.
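
Part of the information described above (size, alignment, and struct-like packing and unpacking) is also reachable from Python through the public ctypes API. The following is only a sketch of that public view; it does not touch StgDictObject itself, and the variable names are illustrative.

```python
import ctypes
import struct

# Size and alignment of the C type, as reported by ctypes.
size = ctypes.sizeof(ctypes.c_int32)
alignment = ctypes.alignment(ctypes.c_int32)

# "Packing": a ctypes instance exposes its raw bytes (native byte order).
value = ctypes.c_int32(7)
packed = bytes(value)

# "Unpacking": rebuild an instance from raw bytes.
unpacked = ctypes.c_int32.from_buffer_copy(packed).value

print(size, alignment, packed, unpacked)
```

For this simple case the bytes agree with what struct.pack("=i", 7) produces.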

Thomas


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Martin v. Löwis

Stephan Tolksdorf schrieb:
> While Travis' proposal encompasses the data format functionality within 
> the struct module and overlaps with what ctypes has to offer, it does 
> not aim to replace ctypes.

This discussion could have been a lot shorter if he had said so.
Unfortunately (?) he stated that it was *precisely* a motivation
of the PEP to provide a standard data description machinery that
can then be adopted by the struct, array, and ctypes modules.

> I also do not understand why the data format type should attempt to 
> fully describe arbitrarily complex data formats, like fragmented 
> (non-continuous) data structures in memory. You'd probably need a full 
> programming language for that anyway.

For an FFI application, you need to be able to describe arbitrary
in-memory formats, since that's what the foreign function will
expect. For type safety and reuse, you better separate the
description of the layout from the creation of the actual values.
Otherwise (i.e. if you have to define the layout on each invocation),
creating the parameters for a foreign function becomes very tedious
and error-prone, with errors often being catastrophic (i.e. interpreter
crashes).
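
A minimal sketch of that separation, using ctypes: the layout is declared once as a type and then reused to create values, with field types checked on construction. The Header structure and its fields are made up for illustration.

```python
import ctypes


class Header(ctypes.Structure):
    """Layout description, written once."""

    _fields_ = [("version", ctypes.c_uint16), ("length", ctypes.c_uint16)]


# Values are created separately, reusing the same description.
first = Header(1, 512)
second = Header(version=2, length=64)

# A value that does not fit the declared field type is rejected up front.
error = None
try:
    Header("not a number", 0)
except TypeError as exc:
    error = exc

print(first.length, second.version, type(error).__name__)
```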

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Martin v. Löwis

Travis Oliphant schrieb:
> The big difference, however, is that by going this route you are forced 
> to use the "type object" as your data-format "instance".

Since everything is an object (an "instance") in Python, this is not
such a big difference.

> This is 
> fitting a square peg into a round hole in my opinion.  To really be 
> useful, you would need to add the attributes and (most importantly) 
> C-function pointers and C-structure members to these type objects. 

Can you explain why that is? In the PEP, I see two C functions:
setitem and getitem. I think they can be implemented readily with
ctypes' GETFUNC and SETFUNC function pointers that it uses
all over the place.

I don't see a requirement to support C structure members or
function pointers in the datatype object.

> There are a few people claiming I should use the ctypes type-hierarchy 
> but nobody has explained how that would be possible given the 
> attributes, C-structure members and C-function pointers that I'm proposing.

Ok, here you go. Remember, I'm still not claiming that this should be
done: I'm just explaining how it could be done.

- byteorder/isnative: I think this could be derived from the
  presence of the _swappedbytes_ field
- itemsize: can be done with ctypes.sizeof
- kind: can be created through a mapping of the _type_ field
  (I think)
- fields: can be derived from the _fields_ member
- hasobject: compare, recursively, with py_object
- name: use __name__
- base: again, created from _type_ (if _length_ is present)
- shape: recursively look at _length_
- alignment: use ctypes.alignment
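
The mapping above can be sketched in Python roughly as follows. This is an illustration under assumptions, not code from the thread: describe() is a hypothetical helper, it returns a plain dict rather than a PEP datatype object, and it handles only simple types, arrays, and structures (pointers, unions, and function pointers are left out).

```python
import ctypes


def describe(ctype):
    """Derive PEP-style attributes from a ctypes type (hypothetical helper)."""
    info = {
        "name": ctype.__name__,
        "itemsize": ctypes.sizeof(ctype),
        "alignment": ctypes.alignment(ctype),
        # Only meaningful for structures; other types report True here.
        "isnative": not hasattr(ctype, "_swappedbytes_"),
    }
    if hasattr(ctype, "_length_"):
        # Array type: walk nested arrays to collect the shape.
        shape, base = [], ctype
        while hasattr(base, "_length_"):
            shape.append(base._length_)
            base = base._type_
        info["shape"] = tuple(shape)
        info["base"] = describe(base)
        info["hasobject"] = info["base"]["hasobject"]
    elif hasattr(ctype, "_fields_"):
        # Structure type: describe each field and record its offset.
        fields = {}
        for name, field_type, *_bits in ctype._fields_:
            offset = getattr(ctype, name).offset
            fields[name] = (describe(field_type), offset)
        info["fields"] = fields
        info["hasobject"] = any(d["hasobject"] for d, _ in fields.values())
    else:
        # Simple type: _type_ is a one-character code such as 'B' or 'd'.
        info["kind"] = ctype._type_
        info["hasobject"] = ctype is ctypes.py_object
    return info


class Pixel(ctypes.Structure):
    _fields_ = [("r", ctypes.c_uint8), ("g", ctypes.c_uint8), ("b", ctypes.c_uint8)]


pixel_info = describe(Pixel)
grid_info = describe((ctypes.c_int32 * 3) * 2)
print(pixel_info["itemsize"], grid_info["shape"])
```

Returning a dict keeps the sketch neutral about what the exchanged object should be, which is the point under debate in this thread.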

> It was clear to me that we were "on to something".  Now, the biggest 
> claim against the gist of what I'm proposing (details we can argue 
> about), seems from my perspective to be a desire to "go backwards" and 
> carry data-type information around with a Python type.

I, at least, have no such desire. I just explained that the ctypes
model of memory layouts is just as expressive as the one in the
PEP. Which of these is "better" for what the PEP wants to achieve,
I can't say, because I still don't quite understand what the PEP
wants to achieve.

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant

Nick Coghlan wrote:
> Travis E. Oliphant wrote:
> 
>>However, the existence of an alternative strategy using a single Python 
>>type and multiple instances of that type to describe binary data (which 
>>is the NumPy approach and essentially the array module approach) means 
>>that we can't just a-priori assume that the way ctypes did it is the 
>>only or best way.
> 
> 
> As a hypothetical, what if there was a helper function that translated a 
> description of a data structure using basic strings and sequences (along the 
> lines of what you have in your PEP) into a ctypes data structure?
> 

That would be fine and useful in fact.  I don't see how it helps the 
problem of "what to pass through the buffer protocol".  I see passing 
c-types type objects around on the c-level as an unnecessary and 
burdensome approach unless the ctypes objects were significantly enhanced.

> 
> In fact, it may make sense to just use the lists/strings directly as the data 
> exchange format definitions, and let the various libraries do their own 
> translation into their private format descriptions instead of creating a new 
> one-type-to-describe-them-all.

Yes, I'm open to this possibility.   I basically want two things in the 
object passed through the extended buffer protocol:

1) It's fast on the C-level
2) It covers all the use-cases.

If just a particular string or list structure were passed, then I would 
drop the data-format PEP and just have the dataformat argument of the 
extended buffer protocol be that thing.

Then, something that converts ctypes objects to that special format 
would be very nice indeed.
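
A helper of the kind Nick describes might look like the sketch below. Everything in it is an assumption made for illustration: the description format (a list of (name, code) pairs with struct-style codes, nested lists for sub-structures), the _CODES table, and the to_ctypes name are not taken from the PEP.

```python
import ctypes

# Assumed mapping from struct-style codes to ctypes types (illustrative only).
_CODES = {
    "b": ctypes.c_int8,
    "B": ctypes.c_uint8,
    "h": ctypes.c_int16,
    "i": ctypes.c_int32,
    "f": ctypes.c_float,
    "d": ctypes.c_double,
}


def to_ctypes(name, description):
    """Build a ctypes Structure from a list of (field_name, spec) pairs."""
    fields = []
    for field_name, spec in description:
        if isinstance(spec, list):
            # A nested list describes a sub-structure.
            field_type = to_ctypes(field_name, spec)
        else:
            field_type = _CODES[spec]
        fields.append((field_name, field_type))
    return type(name, (ctypes.Structure,), {"_fields_": fields})


Point = to_ctypes("Point", [("x", "d"), ("y", "d")])
Segment = to_ctypes("Segment", [("start", [("x", "d"), ("y", "d")]), ("tag", "B")])
p = Point(1.0, 2.0)
print(ctypes.sizeof(Point), p.y)
```

In this arrangement the plain list is what gets exchanged, and each library translates it into its own private representation.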

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Martin v. Löwis

Travis E. Oliphant schrieb:
> But, there are distinct disadvantages to this approach compared to what 
> I'm trying to allow.   Martin claims that the ctypes approach is 
> *basically* equivalent but this is just not true.

I may claim that, but primarily, my goal was to demonstrate that the
proposed PEP cannot be used to describe ctypes object layouts (without
checking, I can readily believe that the PEP covers everything in
the array and struct modules).

> It could be made more 
> true if the ctypes objects inherited from a "meta-type" and if Python 
> allowed meta-types to expand their C-structures.  But, last I checked 
> this is not possible.

That I don't understand. a) what do you think is not possible? b)
why is that an important difference between a datatype and a ctype?

If you are suggesting that, given two Python types A and B, and
B inheriting from A, that the memory layout of B cannot extend
the memory layout of A, then: that is certainly possible in Python,
and there are many examples for it.

> A Python type object is a very particular kind of Python-type.  As far 
> as I can tell, it's not as flexible in terms of the kinds of things you 
> can do with the "instances" of a type object (i.e. what ctypes types 
> are) on the C-level.

Ah, you are worried that NumArray objects would have to be *instances*
of ctypes types. That wouldn't be necessary at all. Instead, if each
NumArray object had a method get_ctype(), which returned a ctypes type,
then you would get the same descriptiveness that you get with the
PEP's datatype.

> I'm happy to have the data-format object live separate from ctypes and 
> leave it to the ctypes author(s) to support it if desired.  But, the 
> claim that the extended buffer protocol jump through all kinds of hoops 
> to conform to the "ctypes standard" when that "standard" was designed 
> with a different idea in mind is not acceptable.

That, of course, is a reasoning I can understand. This is free software,
contributors can choose to contribute whatever they want; you can't force
anybody to do anything specific you want to get done. Acceptance of
any PEP (not just this PEP) should always be contingent on availability
of a patch implementing it.

> Where is the discussion that crowned the ctypes way of doing things as 
> "the one true way"

It hasn't been crowned this way. Me, personally, I just said two things
about this PEP and ctypes:
a) the PEP does not support all concepts that ctypes needs
b) ctypes can express all examples in the PEP
in response to your proposal that ctypes should adopt the PEP, and
that ctypes is not good enough to be the one true way.

Regards,
Martin


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Stephan Tolksdorf

Martin v. Löwis wrote:
> Travis Oliphant schrieb:
>> Function pointers are "supported" with the void data-type and could be 
>> more specifically supported if it were important.   People typically 
>> don't use the buffer protocol to send function-pointers around in a way 
>> that the void description wouldn't be enough.
> 
> As I said before, I can't tell whether it's important, as I still don't
> know what the purpose of this PEP is. If it is to support a unification
> of memory layout specifications, and if that unification is also to
> include ctypes, then yes, it is important. If it is to describe array
> elements in NumArray arrays, then it might not be important.
>
> For the usage of ctypes, the PEP void type is insufficient to describe
> function pointers: you also need a specification of the signature of
> the function pointer (parameter types and return type), or else you
> can't use the function pointer (i.e. you can't call the function).
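
The point about signatures can be illustrated with ctypes itself: a function-pointer type is built from a return type and parameter types, and that is what makes the pointer callable. A minimal sketch, with illustrative names (BinaryIntOp, add):

```python
import ctypes

# Function-pointer type: returns int, takes two int parameters.
BinaryIntOp = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)


def add(a, b):
    return a + b


# Wrap a Python callable as a C function pointer with that signature.
c_add = BinaryIntOp(add)

# Calling through the pointer converts arguments and result per the signature.
result = c_add(2, 3)
print(result)
```

A bare "void" description of the same pointer would carry its size but nothing that allows this call.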

The buffer protocol is primarily meant for describing the format of 
(large) contiguous pieces of binary data. In most cases that will be all 
kinds of numerical data for scientific applications, image and other 
media data, simple databases and similar kinds of data.
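
As an illustration of describing a contiguous block of binary data, here is a small sketch that relies on Python 3's memoryview, which reports a struct-style format string together with item size and shape. It shows how such a description can travel with a buffer; it is not the API proposed in this thread.

```python
import array

# A contiguous buffer of three C doubles.
samples = array.array("d", [1.0, 2.0, 3.0])

# The view carries a description of the data alongside the bytes.
view = memoryview(samples)
description = (view.format, view.itemsize, view.shape, view.nbytes)
print(description)
```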

There is currently no adequate data format type which sufficiently 
supports these applications, otherwise Travis wouldn't make this proposal.

While Travis' proposal encompasses the data format functionality within 
the struct module and overlaps with what ctypes has to offer, it does 
not aim to replace ctypes.

I don't think that a basic data format type necessarily should be able 
to encode all the information a foreign function interface needs to call 
a code library. From my point of view, that kind of information is one 
abstraction layer above a basic data format and should be implemented as 
an extension of or complementary to the basic data format.

I also do not understand why the data format type should attempt to 
fully describe arbitrarily complex data formats, like fragmented 
(non-continuous) data structures in memory. You'd probably need a full 
programming language for that anyway.

Regards,
   Stephan

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant

Martin v. Löwis wrote:
> Travis Oliphant schrieb:
> 
>>So, the big difference is that I think data-formats should be 
>>*instances* of a single type.
> 
> 
> This is nearly the case for ctypes as well. All layout descriptions
> are instances of the type type. Nearly, because they are instances
> of subtypes of the type type:
> 
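> py> type(ctypes.c_long)
> <type '_ctypes.SimpleType'>
> py> type(ctypes.c_double)
> <type '_ctypes.SimpleType'>
> py> type(ctypes.c_double).__bases__
> (<type 'type'>,)
> py> type(ctypes.Structure)
> <type '_ctypes.StructType'>
> py> type(ctypes.Array)
> <type '_ctypes.ArrayType'>
> py> type(ctypes.Structure).__bases__
> (<type 'type'>,)
> py> type(ctypes.Array).__bases__
> (<type 'type'>,)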
> 
> So if your requirement is "all layout descriptions ought to have
> the same type", then this is (nearly) the case: they are instances
> of type (rather than datatype, as in your PEP).
> 
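
The claim in the quoted text can be checked on a current Python 3 with a few lines. The metatype names and their exact base classes differ between Python versions, so this sketch only tests the subclass relationship; Pair is a made-up structure.

```python
import ctypes


class Pair(ctypes.Structure):
    _fields_ = [("a", ctypes.c_long), ("b", ctypes.c_double)]


# Four layout descriptions: two simple types, a structure, an array.
layouts = [ctypes.c_long, ctypes.c_double, Pair, ctypes.c_int * 4]

# Each description is itself a type object...
all_are_types = all(isinstance(t, type) for t in layouts)

# ...whose own type is a subtype of `type`, one per kind of layout.
metatypes = {type(t) for t in layouts}
all_metatypes_subclass_type = all(issubclass(m, type) for m in metatypes)

print(all_are_types, all_metatypes_subclass_type, len(metatypes))
```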

The big difference, however, is that by going this route you are forced 
to use the "type object" as your data-format "instance".  This is 
fitting a square peg into a round hole in my opinion.  To really be 
useful, you would need to add the attributes and (most importantly) 
C-function pointers and C-structure members to these type objects.  I 
don't even think that is possible in Python (even if you do create a 
meta-type that all the c-type type objects can use that carries the same 
information).

There are a few people claiming I should use the ctypes type-hierarchy 
but nobody has explained how that would be possible given the 
attributes, C-structure members and C-function pointers that I'm proposing.

In NumPy we also have a Python type for each basic data-format (we call 
them array scalars).  For a little while they carried the data-format 
information on the Python side.  This turned out to be not flexible 
enough.  So, we expanded the PyArray_Descr * structure which has always 
been a part of Numeric (and the array module array type) into an actual 
Python type and a lot of things became possible.

It was clear to me that we were "on to something".  Now, the biggest 
claim against the gist of what I'm proposing (details we can argue 
about), seems from my perspective to be a desire to "go backwards" and 
carry data-type information around with a Python type.

The data-type object did not just appear out of thin-air one day.  It 
really can be seen as an evolution from the beginnings of Numeric (and 
the Python array module).

So, this is what we came up with in the NumPy world.  Ctypes came up 
with something a bit different.  It is not "trivial" to "just use 
ctypes."  I could say the same thing and tell ctypes to just use NumPy's 
data-type object.   It could be done that way, but of course it would 
take a bit of work on the part of ctypes to make that happen.

Having ctypes in the standard library does not mean that any other 
discussion of how data-format should be represented has been decided on. 
If I had known that was what it meant to put ctypes in the standard 
library, I would have been more vocal several months ago.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis E. Oliphant

Michael Chermside wrote:
> In this email I'm responding to a series of emails from Travis
> pretty much in the order I read them:
> 
>>
>> In the mean-time, how are other packages supposed to communicate  
>> binary information about data with each other?
> 
> Here we disagree.
> 
> I haven't used C-types. I have no idea whether it is well-designed or
> horribly unusable. So if someone wanted to argue that C-types is a
> mistake and should be thrown out, I'd be willing to listen. Until
> someone tries to make that argument, I'm presuming it's good enough to
> be part of the standard library for Python.

My problem with this argument is twofold:

1) I'm not sure you really know what you're talking about since you 
apparently haven't used either ctypes or NumPy (I've used both and so 
forgive me if I claim to understand the strengths of the data-format 
representations that each uses a bit better).  Therefore, it's hard for 
me to take your opinion seriously.  I will try though. I understand you 
have a preference for not wildly expanding the ways to do similar 
things.  I share that preference with you.

2) You are assuming that because ctypes is good enough for the standard 
library, the way it describes data-formats (using a separate 
Python type for each one) is the *one true way*.  When was this 
discussed?   Frankly it's a weak argument because the struct module has 
been around for a lot longer.  Why didn't the ctypes module follow that 
standard?  Or the standard that's in the array module for describing 
data-types.  That's been there for a long time too.  Why wasn't ctypes 
forced to use that approach?

The reason it wasn't is because it made sense for ctypes to use a 
separate type for each data-format object so that you could call 
C-functions as if they were Python functions.  If this is your goal, 
then it seems like a good idea (though not strictly necessary) to use a 
separate Python type for each data-format.

But, there are distinct disadvantages to this approach compared to what 
I'm trying to allow.   Martin claims that the ctypes approach is 
*basically* equivalent but this is just not true.  It could be made more 
true if the ctypes objects inherited from a "meta-type" and if Python 
allowed meta-types to expand their C-structures.  But, last I checked 
this is not possible.

A Python type object is a very particular kind of Python-type.  As far 
as I can tell, it's not as flexible in terms of the kinds of things you 
can do with the "instances" of a type object (i.e. what ctypes types 
are) on the C-level.

The other disadvantage of what you are describing is: Who is going to 
write the code?

I'm happy to have the data-format object live separate from ctypes and 
leave it to the ctypes author(s) to support it if desired.  But, the 
claim that the extended buffer protocol jump through all kinds of hoops 
to conform to the "ctypes standard" when that "standard" was designed 
with a different idea in mind is not acceptable.

Ctypes has only been in Python since 2.5 and the array interface was 
around before that.   Numeric has been around longer than ctypes.  The 
array module and the struct module in Python have also both been around 
longer than ctypes.

Where is the discussion that crowned the ctypes way of doing things as 
"the one true way"?

> 
> In a different message, he writes:
>> It also bothers me that so many ways to describe binary data are  
>> being used out there.  This is a problem that deserves being solved.  
>>  And, no, ctypes hasn't solved it (we can't directly use the ctypes  
>> solution).
> 
> Really? Why? Is this a failing in C-types? Can C-types be "fixed"?

You can't grow C-function pointers on to an existing type object.   You 
are also carrying around a lot of weight in the Python type object that 
is un-necessary if all you are doing is describing data.

> 
> I just disagree. (1) I *DO* think we should "just use ctypes because it's
> there". After all, the problem we're trying to solve is one of
> COMPATIBILITY - you don't solve those by introducing competing standards.
> (2) From what I understand of it, I think ctypes is quite capable of
> describing data to be accessed via the buffer protocol.

Capable but not supporting all the things I'm talking about.  The ctypes 
objects don't have any of the methods or attributes (or C function 
pointers) that I've described.  Nor should they necessarily grow them.

> 
> Why? Who cares? Seriously, if we were proposing to describe the layouts
> with a collection of rubber bands and potato chips, I'd say it was a
> crazy idea. But we're proposing using data structures in a computer
> memory. Why does it matter whether those data structures are of the same
> "python type" or different "python types"? I care whether the structure
> can be created, passed around, and interrogated. I don't care what
> Python type they are.

Sure, but the flexibility you have with an instance of a Python type is 
different

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Michael Chermside

In this email I'm responding to a series of emails from Travis
pretty much in the order I read them:

Travis Oliphant writes:
> I'm saying we should introduce a single-object mechanism for  
> describing binary data so that the many-object approach of c-types  
> does not become some kind of de-facto standard.  C-types can  
> "translate" this object-instance to its internals if and when it  
> needs to.
>
> In the mean-time, how are other packages supposed to communicate  
> binary information about data with each other?

Here we disagree.

I haven't used C-types. I have no idea whether it is well-designed or
horribly unusable. So if someone wanted to argue that C-types is a
mistake and should be thrown out, I'd be willing to listen. Until
someone tries to make that argument, I'm presuming it's good enough to
be part of the standard library for Python.

Given that, I think that it *SHOULD* become a de-facto standard. I
think that the way different packages should communicate binary information
about data with each other is using C-types. Not because it's wonderful
(remember, I've never used it), but because it's STANDARD. There should
be one obvious way to do things! When there is, it makes interoperability
WAY easier, and interoperability is the main objective when dealing with
things like binary data formats.

Propose using C-types. Or propose *improving* C-types. But don't propose
ignoring it.

In a different message, he writes:
> It also bothers me that so many ways to describe binary data are  
> being used out there.  This is a problem that deserves being solved.  
>  And, no, ctypes hasn't solved it (we can't directly use the ctypes  
> solution).

Really? Why? Is this a failing in C-types? Can C-types be "fixed"?

Later he explains:
> Remember the buffer protocol is in compiled code.  So, as a result,
>
> 1) It's harder to construct a class to pass through the protocol  
> using the multiple-types approach of ctypes.
>
> 2) It's harder to interpret the object received through the buffer protocol.
>
> Sure, it would be *possible* to use ctypes, but I think it would be  
> very difficult.  Think about how you would write the get_data_format  
> C function in the extended buffer protocol for NumPy if you had to  
> import ctypes and then build a class just to describe your data.   
> How would you interpret what you get back?

Aha! So what you REALLY ought to be asking for is a C interface to the
ctypes module. That seems like a very sensible and reasonable request.

> I don't think we should just *use ctypes because it's there* when  
> the way it describes binary data was not constructed with the  
> extended buffer protocol in mind.

I just disagree. (1) I *DO* think we should "just use ctypes because it's
there". After all, the problem we're trying to solve is one of
COMPATIBILITY - you don't solve those by introducing competing standards.
(2) From what I understand of it, I think ctypes is quite capable of
describing data to be accessed via the buffer protocol.

In another email:
> In order to make sense of the data-format object that I'm proposing  
> you have to see the need to share information about data-format  
> through an extended buffer protocol (which I will be proposing  
> soon).  I'm not going to try to argue that right now because there  
> are a lot of people who can do that.

Actually, no need to convince me... I am already convinced of the
wisdom of this approach.
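
For readers coming to this thread later, the kind of format sharing being discussed can be observed through the memoryview object in current Python 3, which reports a struct-style format string for the exporter's buffer. The following is a minimal sketch, assuming Python 3 and the standard array module; it illustrates the idea and is not the interface proposed in the PEP under discussion.

```python
import array

# An exporter: a contiguous block of C doubles.
samples = array.array('d', [1.0, 2.0, 3.0])

# A consumer that knows nothing about the array module can still ask
# the buffer how its bytes are laid out.
view = memoryview(samples)
layout = (view.format, view.itemsize, view.shape)
print(layout)
```

The consumer only needs the format string, item size and shape; it never has to import the exporter's own type.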

> My view is that it is un-necessary to use a different type object to  
> describe each different data-type.
  [...]
> So, the big difference is that I think data-formats should be  
> *instances* of a single type.

Why? Who cares? Seriously, if we were proposing to describe the layouts
with a collection of rubber bands and potato chips, I'd say it was a
crazy idea. But we're proposing using data structures in a computer
memory. Why does it matter whether those data structures are of the same
"python type" or different "python types"? I care whether the structure
can be created, passed around, and interrogated. I don't care what
Python type they are.

> I'm saying that I don't like the idea of forcing this approach on  
> everybody else who wants to describe arbitrary binary data just  
> because ctypes is included.

And I'm saying that I *do*. Hey, if someone proposed getting rid of
the current syntax for the array module (for Py3K) and replacing it with
use of ctypes, I'd give it serious consideration. There should be only
one way to describe binary structures. It should be powerful enough to
describe almost any structure, easy-to-use, and most of all it should be
used consistently everywhere.

> I need some encouragement in order to continue to invest energy in  
> pushing this forward.

Please keep up the good work! Some day I'd like to see NumPy built in
to the standard Python distribution. The incremental, PEP by PEP approach
you are taking is the best route to getting there. But there may be
some changes a

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Gareth McCaughan

> > It might be better not to consider "bit" to be a
> > type at all, and come up with another way of indicating
> > that the size is in bits. Perhaps
> > 
> > 'i4'   # 4-byte signed int
> > 'i4b'  # 4-bit signed int
> > 'u4'   # 4-byte unsigned int
> > 'u4b'  # 4-bit unsigned int
> > 
> 
> I like this.  Very nice.  I think that's the right way to look at it.

I remark that 'ib4' and 'ub4' make for marginally easier
parsing and less danger of ambiguity.
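
To make the parsing remark concrete, here is a small sketch of both spellings. The grammar is an assumption made up for illustration (a kind letter, a decimal size, and a 'b' marker meaning "size is in bits"); neither spelling was ever fixed by the PEP.

```python
import re

# Hypothetical 'i4b' spelling: the bit marker trails the size.
_SUFFIX_FORM = re.compile(r'([iu])(\d+)(b?)')
# Hypothetical 'ib4' spelling: the bit marker precedes the size.
_PREFIX_FORM = re.compile(r'([iu])(b?)(\d+)')


def parse_suffix(code):
    """Parse codes such as 'i4' or 'u4b' into (signed, size, unit)."""
    match = _SUFFIX_FORM.fullmatch(code)
    if match is None:
        raise ValueError('not a valid code: %r' % code)
    kind, size, bits = match.groups()
    return kind == 'i', int(size), 'bits' if bits else 'bytes'


def parse_prefix(code):
    """Parse codes such as 'i4' or 'ub4' into (signed, size, unit)."""
    match = _PREFIX_FORM.fullmatch(code)
    if match is None:
        raise ValueError('not a valid code: %r' % code)
    kind, bits, size = match.groups()
    return kind == 'i', int(size), 'bits' if bits else 'bytes'
```

With the prefix form a hand-written parser knows the unit before it starts reading digits, which is presumably the "easier parsing" being referred to.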

-- 
g


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Nick Coghlan

Travis E. Oliphant wrote:
> However, the existence of an alternative strategy using a single Python 
> type and multiple instances of that type to describe binary data (which 
> is the NumPy approach and essentially the array module approach) means 
> that we can't just a-priori assume that the way ctypes did it is the 
> only or best way.

As a hypothetical, what if there was a helper function that translated a 
description of a data structure using basic strings and sequences (along the 
lines of what you have in your PEP) into a ctypes data structure?
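
A rough sketch of what such a helper could look like follows. The name to_ctypes and the table of type codes are hypothetical, chosen only for illustration; the sketch handles scalar codes and nested field lists and ignores arrays, byte order and packing.

```python
import ctypes

# Hypothetical mapping from PEP-style string codes to ctypes scalars.
_SCALARS = {
    'i1': ctypes.c_int8, 'i2': ctypes.c_int16,
    'i4': ctypes.c_int32, 'i8': ctypes.c_int64,
    'u1': ctypes.c_uint8, 'u2': ctypes.c_uint16,
    'u4': ctypes.c_uint32, 'u8': ctypes.c_uint64,
    'f4': ctypes.c_float, 'f8': ctypes.c_double,
}


def to_ctypes(fields, name='Record'):
    """Turn [(field_name, code_or_nested_list), ...] into a Structure."""
    members = []
    for field_name, spec in fields:
        if isinstance(spec, str):
            ctype = _SCALARS[spec]
        else:
            # A nested list describes a nested struct.
            ctype = to_ctypes(spec, name + '_' + field_name)
        members.append((field_name, ctype))
    # Every call creates a brand-new Python type, as ctypes requires.
    return type(name, (ctypes.Structure,), {'_fields_': members})


Record = to_ctypes([('simple', 'i4'),
                    ('nested', [('low', 'u2'), ('high', 'u2')])])
```

Note that the description passed in is plain strings and lists, so it can be built just as easily from C as from Python.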

> The examples of "missing features" that Martin has exposed are not 
> show-stoppers.  They can all be easily handled within the context of 
> what is being proposed.   I can modify the PEP to show this.  But, I 
> don't have the time to spend if it's just all going to be rejected in 
> the end.  I need some encouragement in order to continue to invest 
> energy in pushing this forward.

I think the most important thing in your PEP is the formats for describing 
structures in a way that is easy to construct in both C and Python 
(specifically, by using strings and sequences), and it is worth pursuing for 
that aspect alone. Whether that datatype is then implemented as a class in its 
own right or as a factory function that returns a ctypes data type object is, 
to my mind, a relatively minor implementation issue (either way has questions 
to be addressed - I'm not sure how you tell ctypes that you have a 32-bit 
integer with a non-native endian format, for example).
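
On the endianness question, ctypes does provide BigEndianStructure and LittleEndianStructure base classes, which fix the byte order of the fields regardless of the host. A minimal sketch, with made-up class and field names:

```python
import ctypes


class BigEndianHeader(ctypes.BigEndianStructure):
    _fields_ = [('length', ctypes.c_uint32)]


class LittleEndianHeader(ctypes.LittleEndianStructure):
    _fields_ = [('length', ctypes.c_uint32)]


# The same four bytes read under the two byte orders.
raw = b'\x00\x00\x01\x00'
big = BigEndianHeader.from_buffer_copy(raw).length
little = LittleEndianHeader.from_buffer_copy(raw).length
```

The byte order is a property of the whole structure here, not of an individual field code as in the string syntax.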

In fact, it may make sense to just use the lists/strings directly as the data 
exchange format definitions, and let the various libraries do their own 
translation into their private format descriptions instead of creating a new 
one-type-to-describe-them-all.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Paul Moore

On 10/31/06, Travis Oliphant <[EMAIL PROTECTED]> wrote:
> In order to make sense of the data-format object that I'm proposing you
> have to see the need to share information about data-format through an
> extended buffer protocol (which I will be proposing soon).  I'm not
> going to try to argue that right now because there are a lot of people
> who can do that.
>
> So, I'm going to assume that you see the need for it.  If you don't,
> then just suspend concern about that for the moment.  There are a lot of
> us who really see the need for it.

[...]

> Again, my real purpose is the extended buffer protocol.  This
> data-format type is a means to that end.  If the consensus is that
> nobody sees a greater use of the data-format type beyond the buffer
> protocol, then I will just write 1 PEP for the extended buffer protocol.

While I don't personally use NumPy, I can see where an extended buffer
protocol like you describe could be advantageous, and so I'm happy to
concede that benefit.

I can also vaguely see that a unified "block of memory description"
would be useful. My interest would be in the area of the struct module
(unpacking and packing data for dumping to byte streams - whether this
happens in place or not is not too important to this use case).
However, I cannot see how your proposal would help here in practice -
does it include the functionality of the struct module (or should it?)
If so, then I'd like to see examples of equivalent constructs. If not,
then isn't it yet another variation on the theme, adding to the
problem of multiple approaches rather than helping?

I can also see the parallels with ctypes. Here I feel a little less
sure that keeping the two approaches is wrong. I don't know why I feel
like that - maybe nothing more than familiarity with ctypes - but I
don't have the same reluctance to have both the ctypes data definition
stuff and the new datatype proposal.

Enough of the abstract. As a concrete example, suppose I have a (byte)
string in my program containing some binary data - an ID3 header, or a
TCP packet, or whatever. It doesn't really matter. Does your proposal
offer anything to me in how I might manipulate that data (assuming I'm
not using NumPy)? (I'm not insisting that it should, I'm just trying
to understand the scope of the PEP).
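
For comparison, this is roughly how the struct module handles the ID3 case today. The sketch assumes the ID3v2 header layout (a 3-byte tag, two version bytes, a flags byte and a 4-byte "synchsafe" size carrying 7 bits per byte); the function name is made up.

```python
import struct

# 10-byte ID3v2 header: tag, major version, revision, flags, raw size.
ID3_HEADER = struct.Struct('>3sBBB4s')


def parse_id3_header(data):
    tag, major, revision, flags, raw_size = ID3_HEADER.unpack_from(data)
    if tag != b'ID3':
        raise ValueError('not an ID3v2 header')
    size = 0
    for byte in raw_size:
        # Each size byte contributes only its low 7 bits.
        size = (size << 7) | (byte & 0x7F)
    return {'version': (major, revision), 'flags': flags, 'size': size}


header = parse_id3_header(b'ID3\x03\x00\x00\x00\x00\x02\x01' + b'payload')
```

The format string describes the layout, but the synchsafe decoding still has to be written by hand, which is the kind of gap the question above is probing.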

Paul.

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Martin v. Löwis

Travis Oliphant schrieb:
> So, the big difference is that I think data-formats should be 
> *instances* of a single type.

This is nearly the case for ctypes as well. All layout descriptions
are instances of the type type. Nearly, because they are instances
of subtypes of the type type:

py> type(ctypes.c_long)

py> type(ctypes.c_double)

py> type(ctypes.c_double).__bases__
(<type 'type'>,)
py> type(ctypes.Structure)

py> type(ctypes.Array)

py> type(ctypes.Structure).__bases__
(<type 'type'>,)
py> type(ctypes.Array).__bases__
(<type 'type'>,)

So if your requirement is "all layout descriptions ought to have
the same type", then this is (nearly) the case: they are instances
of type (rather than datatype, as in your PEP).
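
The relationship can be checked without relying on any particular repr. This sketch assumes a current Python 3; the exact metaclass names and their direct bases differ between versions, so it only tests subclass relationships.

```python
import ctypes

layouts = [ctypes.c_long, ctypes.c_double, ctypes.Structure, ctypes.Array]

for layout in layouts:
    meta = type(layout)
    # Each layout description is itself a class, and its metaclass is a
    # subtype of type rather than type itself.
    print(layout.__name__, meta.__name__, issubclass(meta, type))

# The simple C types share one metaclass.
same_meta = type(ctypes.c_long) is type(ctypes.c_double)
```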

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis E. Oliphant

M.-A. Lemburg wrote:
> Travis E. Oliphant wrote:
> 
> I understand and that's why I'm asking why you made the range
> explicit in the definition.
> 

In the case of NumPy it was so that String and Unicode arrays would both 
look like multi-length string "character" arrays and not arrays of 
arrays of some character.

But, this can change in the data-format object.  I can see that the 
Unicode description needs to be improved.

> The definition should talk about Unicode code points.
> The number of bytes then determines whether you can only
> represent the ASCII subset (1 byte), UCS2 (2 bytes, BMP only)
> or UCS4 (4 bytes, all currently assigned code points).

Yes, you are correct.  A string of unicode characters should really be 
represented in the same way that an array of integers is represented for 
a data-format object.
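
As a sketch of that idea, a string can be stored as a fixed-width array of code points, with the item size deciding which code points fit. The helper name and the little-endian choice are assumptions made for the example.

```python
import struct

# Item size in bytes -> (largest code point allowed, struct code).
_LIMITS = {1: (0x7F, 'B'), 2: (0xFFFF, 'H'), 4: (0x10FFFF, 'I')}


def pack_code_points(text, itemsize):
    """Pack text as little-endian unsigned integers of the given width."""
    limit, code = _LIMITS[itemsize]
    points = [ord(ch) for ch in text]
    if any(point > limit for point in points):
        raise ValueError('code point does not fit in %d byte(s)' % itemsize)
    return struct.pack('<%d%s' % (len(points), code), *points)
```

A 1-byte item covers only ASCII, a 2-byte item covers the BMP, and a 4-byte item covers every code point, matching the three cases listed in the quoted text.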

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis E. Oliphant

Greg Ewing wrote:
> Travis Oliphant wrote:
> 
>> The 'bit' type re-interprets the size information to be in units of "bits" 
>> and so implies a "bit-field" instead of another data-format.
> 
> Hmmm, okay, but now you've got another orthogonality
> problem, because you can't distinguish between e.g.
> a 5-bit signed int field and a 5-bit unsigned int
> field.

Good point.

> 
> It might be better not to consider "bit" to be a
> type at all, and come up with another way of indicating
> that the size is in bits. Perhaps
> 
> 'i4'   # 4-byte signed int
> 'i4b'  # 4-bit signed int
> 'u4'   # 4-byte unsigned int
> 'u4b'  # 4-bit unsigned int
> 

I like this.  Very nice.  I think that's the right way to look at it.

> (Next we can have an argument about whether bit
> fields should be packed MSB-to-LSB or vice versa...:-)

I guess we need another flag / attribute to indicate that.
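
For reference, ctypes already distinguishes signed from unsigned bit fields by taking the base type plus a bit width. A small sketch with made-up field names, storing the same 5-bit pattern in both kinds of field:

```python
import ctypes


class Flags(ctypes.Structure):
    # The third tuple element is the field width in bits.
    _fields_ = [('signed5', ctypes.c_int, 5),
                ('unsigned5', ctypes.c_uint, 5)]


flags = Flags()
# Store the bit pattern 11111 in each field.
flags.signed5 = 0b11111
flags.unsigned5 = 0b11111
```

Reading the fields back gives different values for the same bits, which is exactly the distinction a bare "bit" type cannot express. How the fields are packed within the storage unit is left to the platform's C conventions.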

The other thing that needs to be discussed at some point may be a way to 
indicate the floating-point format.  I've basically punted on this and 
just used 'f' to mean "platform float".

Thus, you can't use the data-type object to pass information between two 
platforms that don't share a common floating point representation.
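
The struct module draws the same line between native and portable formats. In the sketch below, sys.float_info describes the platform's own C double, while the '<d' code in standard (non-native) mode is documented to use the IEEE 754 binary64 layout whatever the platform uses internally.

```python
import struct
import sys

# Properties of the platform float (the meaning 'f' has been given above).
native = sys.float_info

# A portable encoding: little-endian IEEE 754 binary64.
portable = struct.pack('<d', 1.0)
round_trip = struct.unpack('<d', portable)[0]
```

A data-format object that wanted to cross platforms would need some way to say which of these two meanings is intended.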

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis E. Oliphant

Travis Oliphant wrote:
> Greg Ewing wrote:
>> Travis Oliphant wrote:
>>
>>
>>> Part of the problem is that ctypes uses a lot of different Python types 
>>> (that's what I mean by "multi-object") to accomplish its goal.  What 
>>> I'm looking for is a single Python type that can be passed around and 
>>> explains binary data.
>>
>> It's not clear that multi-object is a bad thing in and
>> of itself. It makes sense conceptually -- if you have
>> a datatype object representing a struct, and you ask
>> for a description of one of its fields, which could
>> be another struct or array, you would expect to get
>> another datatype object describing that.

Yes, exactly.  This is what the Python type I'm proposing does as well.  
So, perhaps we are misunderstanding each other.  The difference is 
that data-types are instances of the data-type (data-format) object 
instead of new Python types (as they are in ctypes).
> 
> I've tried to clarify this in another post.  Basically, what I don't 
> like about the ctypes approach is that it is multi-type (every new 
> data-format is a Python type).
> 

I should clarify that I have no opinion about the ctypes approach for 
what ctypes does with it.  I like ctypes and have adapted NumPy to make 
it easier to work with ctypes.

I'm saying that I don't like the idea of forcing this approach on 
everybody else who wants to describe arbitrary binary data just because 
ctypes is included.  Now, if it is shown that it is indeed better than 
the simpler instances-of-a-single-type approach that I'm basically 
proposing, then I'll be persuaded.

However, the existence of an alternative strategy using a single Python 
type and multiple instances of that type to describe binary data (which 
is the NumPy approach and essentially the array module approach) means 
that we can't just a-priori assume that the way ctypes did it is the 
only or best way.

The examples of "missing features" that Martin has exposed are not 
show-stoppers.  They can all be easily handled within the context of 
what is being proposed.   I can modify the PEP to show this.  But, I 
don't have the time to spend if it's just all going to be rejected in 
the end.  I need some encouragement in order to continue to invest 
energy in pushing this forward.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis Oliphant

Greg Ewing wrote:
> Travis Oliphant wrote:
> 
> 
>>Part of the problem is that ctypes uses a lot of different Python types 
>>(that's what I mean by "multi-object") to accomplish its goal.  What 
>>I'm looking for is a single Python type that can be passed around and 
>>explains binary data.
> 
> 
> It's not clear that multi-object is a bad thing in and
> of itself. It makes sense conceptually -- if you have
> a datatype object representing a struct, and you ask
> for a description of one of its fields, which could
> be another struct or array, you would expect to get
> another datatype object describing that.
> 
> Can you elaborate on what would be wrong with this?
> 
> Also, can you clarify whether your objection is to
> multi-object or multi-type. They're not the same thing --
> you could have a data structure built out of multiple
> objects that are all of the same Python type, with
> attributes distinguishing between struct, array, etc.
> That would be single-type but multi-object.

I've tried to clarify this in another post.  Basically, what I don't 
like about the ctypes approach is that it is multi-type (every new 
data-format is a Python type).

In order to talk about all these Python types together, they must 
all share some attribute (or else be derived from a meta-type in C with 
a specific function-pointer entry).

I think it is simpler to think of a single Python type whose instances 
convey information about data-format.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis Oliphant

Armin Rigo wrote:
> Hi Travis,
> 
> On Fri, Oct 27, 2006 at 02:05:31PM -0600, Travis E. Oliphant wrote:
> 
>>This PEP proposes adapting the data-type objects from NumPy for
>>inclusion in standard Python, to provide a consistent and standard
>>way to discuss the format of binary data. 
> 
> 
> How does this compare with ctypes?  Do we really need yet another,
> incompatible way to describe C-like data structures in the standard
> library?

There is a lot of subtlety in the details that IMHO clouds the central 
issue which I will try to clarify here the way I see it.

First of all:

In order to make sense of the data-format object that I'm proposing you 
have to see the need to share information about data-format through an 
extended buffer protocol (which I will be proposing soon).  I'm not 
going to try to argue that right now because there are a lot of people 
who can do that.

So, I'm going to assume that you see the need for it.  If you don't, 
then just suspend concern about that for the moment.  There are a lot of 
us who really see the need for it.

Now:

To describe data-formats ctypes uses a Python type-object defined for 
every data-format you might need.

In my view this is an 'over-use' of the type-object and in fact, to be 
useful, requires the definition of a meta-type that carries the relevant 
additions to the type-object that are needed to describe data (like 
function pointers to get data in and out of Python objects).

My view is that it is un-necessary to use a different type object to 
describe each different data-type.

The route I'm proposing is to define (in C) a *single* new Python type 
(called a data-format type) that carries the information needed to 
describe a chunk of memory.

In this way *instances* of this new type define data-formats.

In ctypes *instances* of the "meta-type" (i.e. new types) define 
data-formats (actually I'm not sure if all the new c-types are derived 
from the same meta-type).

So, the big difference is that I think data-formats should be 
*instances* of a single type.  There is no need to define a Python 
type-object for every single data-type.  In fact, not only is there no 
need, it makes the extended buffer protocol I'm proposing even more 
difficult to use and explain.
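
The contrast can be sketched in pure Python. The DataFormat class below is a hypothetical stand-in written for this illustration, not the type from the PEP: every format, scalar or struct, is an instance of the one class, and nested structs are simply further instances.

```python
class DataFormat:
    """Hypothetical single type; each data-format is an instance of it."""

    def __init__(self, kind, itemsize, fields=()):
        self.kind = kind              # 'i', 'u', 'f', 'S' or 'struct'
        self.itemsize = itemsize      # total size in bytes
        self.fields = tuple(fields)   # (name, offset, DataFormat) triples

    @classmethod
    def struct(cls, members):
        """Build a packed struct format from (name, DataFormat) pairs."""
        fields, offset = [], 0
        for name, fmt in members:
            fields.append((name, offset, fmt))
            offset += fmt.itemsize
        return cls('struct', offset, fields)


int32 = DataFormat('i', 4)
nested = DataFormat.struct([('name', DataFormat('S', 30)),
                            ('addr', DataFormat('S', 45)),
                            ('amount', int32)])
record = DataFormat.struct([('simple', int32), ('nested', nested)])
```

A consumer written in C would need to recognise only this one type and read its attributes, rather than inspect an open-ended family of type objects.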

Again, my real purpose is the extended buffer protocol.  This 
data-format type is a means to that end.  If the consensus is that 
nobody sees a greater use of the data-format type beyond the buffer 
protocol, then I will just write 1 PEP for the extended buffer protocol.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Greg Ewing

Travis Oliphant wrote:

> I'm not sure I understand what you mean by "incomplete / recursive" 
> types unless you are referring to something like a node where an element 
> of the structure is a pointer to another structure of the same kind 
> (like used in linked-lists or trees).

Yes, and more complex arrangements of types that
reference each other.

>  If that is the case, then it's 
> easily supported once support for pointers is added.

But it doesn't fit easily into the single-object
model.

--
Greg

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Greg Ewing

Travis Oliphant wrote:

> The 'bit' type re-interprets the size information to be in units of "bits" 
> and so implies a "bit-field" instead of another data-format.

Hmmm, okay, but now you've got another orthogonality
problem, because you can't distinguish between e.g.
a 5-bit signed int field and a 5-bit unsigned int
field.

It might be better not to consider "bit" to be a
type at all, and come up with another way of indicating
that the size is in bits. Perhaps

'i4'   # 4-byte signed int
'i4b'  # 4-bit signed int
'u4'   # 4-byte unsigned int
'u4b'  # 4-bit unsigned int

(Next we can have an argument about whether bit
fields should be packed MSB-to-LSB or vice versa...:-)

--
Greg

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Greg Ewing

Travis Oliphant wrote:

> Part of the problem is that ctypes uses a lot of different Python types 
> (that's what I mean by "multi-object") to accomplish its goal.  What 
> I'm looking for is a single Python type that can be passed around and 
> explains binary data.

It's not clear that multi-object is a bad thing in and
of itself. It makes sense conceptually -- if you have
a datatype object representing a struct, and you ask
for a description of one of its fields, which could
be another struct or array, you would expect to get
another datatype object describing that.

Can you elaborate on what would be wrong with this?

Also, can you clarify whether your objection is to
multi-object or multi-type. They're not the same thing --
you could have a data structure built out of multiple
objects that are all of the same Python type, with
attributes distinguishing between struct, array, etc.
That would be single-type but multi-object.

--
Greg

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Martin v. Löwis

Travis Oliphant schrieb:
> Function pointers are "supported" with the void data-type and could be 
> more specifically supported if it were important.   People typically 
> don't use the buffer protocol to send function-pointers around in a way 
> that the void description wouldn't be enough.

As I said before, I can't tell whether it's important, as I still don't
know what the purpose of this PEP is. If it is to support a unification
of memory layout specifications, and if that unification is also to
include ctypes, then yes, it is important. If it is to describe array
elements in NumArray arrays, then it might not be important.

For the usage of ctypes, the PEP void type is insufficient to describe
function pointers: you also need a specification of the signature of
the function pointer (parameter types and return type), or else you
can't use the function pointer (i.e. you can't call the function).

> Pointers are also "supported" with the void data-type.  If pointers to 
> other data-types were an important feature to support, then this could 
> be added in many ways (a flag on the data-type object for example is how 
> this is done in NumPy).

For ctypes, (I think) you need "true" pointers to other layouts, or
else you couldn't set up the memory correctly.

I don't understand how this could work with some extended buffer
protocol, though: would a buffer still have to be a contiguous piece
of memory? If you have structures with pointers in them, they
rarely point to contiguous memory.

> Unions are actually supported (just define two fields with the same 
> offset).

Ah, ok. What's the string syntax for it?

> I don't know what you mean by "packed structs" (unless you are talking 
> about alignment issues in which case there is support for it).

Yes, this is indeed about alignment; I missed it. What's the string
syntax for it?

> I'm not sure I understand what you mean by "incomplete / recursive" 
> types unless you are referring to something like a node where an element 
> of the structure is a pointer to another structure of the same kind 
> (like used in linked-lists or trees).  If that is the case, then it's 
> easily supported once support for pointers is added.

That's what I mean, yes. I'm not sure how it can easily be added,
though. Suppose you want to describe

struct item{
  int key;
  char* value;
  struct item *next;
};

How would you do that? Something like

item = datatype([('key', 'i4'), ('value', 'S*'),
                 ('next', 'what_to_put_here*')])

can't work: item hasn't been assigned, yet, so you can't
use it as the field type.
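
For comparison, ctypes gets around the same chicken-and-egg problem by creating the class first and assigning _fields_ afterwards. A sketch of the item struct in that style (the keys helper is made up for the example):

```python
import ctypes


class Item(ctypes.Structure):
    pass  # declared first so that it can refer to itself below


Item._fields_ = [('key', ctypes.c_int),
                 ('value', ctypes.c_char_p),
                 ('next', ctypes.POINTER(Item))]


def keys(node):
    """Walk the linked list and collect the keys."""
    found = []
    while True:
        found.append(node.key)
        if not node.next:          # a NULL pointer is false
            return found
        node = node.next.contents


second = Item(2, b'two')
first = Item(1, b'one', ctypes.pointer(second))
```

This works because the name Item exists before its fields are described; a single constructor call that takes the whole description at once has no such name to refer to.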

> I also don't know what you mean by "open-ended arrays."  The data-format 
> is meant to describe a fixed-size chunk of data.

I see. In C (and thus in ctypes), you sometimes have what C99 calls
a "flexible array member":

struct PyString{
  Py_ssize_t ob_refcnt;
  PyObject *ob_type;
  Py_ssize_t ob_len;
  char ob_sval[];
};

where the ob_sval field can extend arbitrarily, as it is the last
member of the struct. Of course, this will give you dynamically-sized
objects (objects in C cannot really be "variable-sized", since the
size of a memory block has to be defined at allocation time, and
can't really change afterwards).
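
In practice such data is usually read in two steps: a fixed-size header that carries the length, then a payload whose size is only known at run time. A sketch with the struct module, using a made-up header of one little-endian 32-bit length:

```python
import struct

# Hypothetical fixed part: a single unsigned 32-bit length field.
LENGTH = struct.Struct('<I')


def split_record(buf):
    """Return (length, payload) for a length-prefixed record."""
    (length,) = LENGTH.unpack_from(buf, 0)
    start = LENGTH.size
    payload = buf[start:start + length]
    if len(payload) != length:
        raise ValueError('buffer shorter than its declared length')
    return length, payload


record = split_record(LENGTH.pack(3) + b'abcXYZ')
```

The fixed part can be described by a fixed-size format; the variable part cannot, which is the limitation being discussed.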

> String syntax is not needed to support all of these things.

Ok. That's confusing in the PEP: it's not clear whether all these
forms are meant to be equivalent, and, if not, which one is the most
generic one, and what aspects are missing in what forms. Also,
if you have a datatype which cannot be expressed in the string
syntax, what is its "str" attribute?

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis Oliphant

Martin v. Löwis wrote:
> Robert Kern schrieb:
> 
>>>As a unification mechanism, I think it is insufficient. I doubt it
>>>can express all the concepts that ctypes supports.
>>
>>What do you think is missing that can't be added?
> 
> 
> I can factually only report what is missing. Whether it can be added,
> I don't know. As I just wrote in a few other messages: pointers,
> unions, function pointers, packed structs, incomplete/recursive
> types. Also "flexible array members" (i.e. open-ended arrays).
> 

I understand function pointers, pointers, and unions.

Function pointers are "supported" with the void data-type and could be 
more specifically supported if it were important.   People typically 
don't use the buffer protocol to send function-pointers around in a way 
that the void description wouldn't be enough.

Pointers are also "supported" with the void data-type.  If pointers to 
other data-types were an important feature to support, then this could 
be added in many ways (a flag on the data-type object for example is how 
this is done in NumPy).

Unions are actually supported (just define two fields with the same 
offset).
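
A sketch of the same-offset idea using only the struct module follows. The field table and helper are hypothetical and stand in for a data-format object with explicit offsets.

```python
import struct

# Hypothetical field table: name -> (offset, struct format).
# Both fields start at offset 0, so they overlay the same four bytes.
FIELDS = {'as_uint': (0, '<I'), 'as_float': (0, '<f')}


def read_field(buf, name):
    offset, fmt = FIELDS[name]
    return struct.unpack_from(fmt, buf, offset)[0]


buf = struct.pack('<f', 1.0)
```

Reading 'as_float' and 'as_uint' from the same buffer gives the two views of the bytes that a C union would give.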

I don't know what you mean by "packed structs" (unless you are talking 
about alignment issues in which case there is support for it).

I'm not sure I understand what you mean by "incomplete / recursive" 
types unless you are referring to something like a node where an element 
of the structure is a pointer to another structure of the same kind 
(like used in linked-lists or trees).  If that is the case, then it's 
easily supported once support for pointers is added.

I also don't know what you mean by "open-ended arrays."  The data-format 
is meant to describe a fixed-size chunk of data.

String syntax is not needed to support all of these things.  What I'm 
asking for and proposing is a way to construct an instance of a single 
Python type that communicates this data-format information in a 
standardized way.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis Oliphant

Greg Ewing wrote:
> Travis E. Oliphant wrote:
> 
> 
>>Greg Ewing wrote:
> 
> 
>>>What exactly does "bit" mean in that context?   
>>
>>Do you mean "big" ?
> 
> 
> No, you've got a data type there called "bit",
> which seems to imply a size, in contradiction
> to the size-independent nature of the other
> types. I'm asking what size-independent
> information it's meant to convey.

Ah.  I see what you were saying now.   I guess the 'bit' type is 
different (we actually don't have that type in NumPy so my understanding 
of it is limited).

The 'bit' type re-interprets the size information to be in units of "bits" 
and so implies a "bit-field" instead of another data-format.
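
ctypes already has a notion of sizes given in bits rather than bytes, which may help picture what a bit-field description means. A sketch with made-up field names (`mode`, `level`); it does not show the PEP's own syntax.

```python
import ctypes

class Flags(ctypes.Structure):
    # The third tuple item is a width in bits, not bytes.
    _fields_ = [("mode", ctypes.c_uint8, 3),
                ("level", ctypes.c_uint8, 5)]

f = Flags()
f.mode = 5        # fits in 3 bits (0..7)
f.level = 17      # fits in 5 bits (0..31)
total_bytes = ctypes.sizeof(Flags)
```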

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis Oliphant

Jim Jewett wrote:
> Travis E. Oliphant wrote:
> 
> 
>>Two packages need to share a chunk of memory (the package authors do not
>>know each other and only have and Python as a common reference).  They
>>both want to describe that the memory they are sharing has some
>>underlying binary structure.
> 
> 
> As a quick sanity check, please tell me where I went off track.
> 
> it sounds to me like you are assuming that:
> 
> (1)  The memory chunk represents a single object (probably an array of
> some sort)
> (2)  That subchunks can themselves be described by a (single?)
> repeating C struct.
> (3)  You can't just use the C header, since you want this at run-time.
> (4)  It would be enough if you could say
> 
> This is an array of 500 elements that look like
> 
> struct {
>   int  simple;
>   struct nested {
>char name[30];
>char addr[45];
>int  amount;
>   }
> 

Sure.  I think that's pretty much it.  I assume you mean object in the 
general sense and not as in (Python object).

> (5)  But is it not acceptable to use Martin's suggested ctypes
> equivalent of (building out from the inside):

Part of the problem is that ctypes uses a lot of different Python types 
(that's what I mean by "multi-object") to accomplish its goal.  What 
I'm looking for is a single Python type that can be passed around and 
explains binary data.

Remember the buffer protocol is in compiled code.  So, as a result,

1) It's harder to construct a class to pass through the protocol using 
the multiple-types approach of ctypes.

2) It's harder to interpret the object received through the buffer 
protocol.

Sure, it would be *possible* to use ctypes, but I think it would be very 
difficult.  Think about how you would write the get_data_format C 
function in the extended buffer protocol for NumPy if you had to import 
ctypes and then build a class just to describe your data.  How would you 
interpret what you get back?

The ctypes "format-description" approach is not as unified as a single 
Python type object that I'm proposing.

In NumPy, we have a very nice, compact description of complicated data 
already available.  Why not use what we've learned?

I don't think we should just *use ctypes because it's there* when the 
way it describes binary data was not constructed with the extended 
buffer protocol in mind.

The other option, of course, which would not introduce a new Python type 
is to use the array interface specification and pass a list of tuples. 
But, I think this is also unnecessarily wasteful because the sending 
object has to construct it and the receiving object has to de-construct 
it.  The whole point of the (extended) buffer protocol is to communicate 
this information more quickly.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis Oliphant

Martin v. Löwis wrote:
> Josiah Carlson schrieb:
> 
>>One could also toss wxPython, VTK, or any one of the other GUI libraries
>>into the mix for visualizing those images, of which wxPython just
>>acquired no-copy display of PIL images, and being able to manipulate
>>them with numpy (of which some wxPython built in classes use numpy to
>>speed up manipulation) would be very useful.
> 
> 
> I'm doubtful that this PEP alone would allow zero-copy sharing of images
> for display. Often, the libraries need the data in a different format.
> So they need to copy, even if they could understand the other format.
> However, the PEP won't allow "understanding" the format. If I know I
> have an array of 4-byte values: which of them is R, G, B, and A?
> 

You give a name to the fields: 'R', 'G', 'B', and 'A'.
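
To make "give a name to the fields" concrete, here is a sketch that uses a ctypes structure as a stand-in for a data-type object with named fields. The pixel values are invented, and `RGBA` is an illustrative name.

```python
import ctypes

class RGBA(ctypes.Structure):
    _fields_ = [("R", ctypes.c_uint8), ("G", ctypes.c_uint8),
                ("B", ctypes.c_uint8), ("A", ctypes.c_uint8)]

raw = bytearray([10, 20, 30, 255,      # pixel 0
                 200, 100, 50, 255])   # pixel 1
pixels = (RGBA * 2).from_buffer(raw)   # shares memory with raw, no copy

pixels[1].R = 128                      # writes straight into raw
```

A library using a different channel order would declare the same four names in a different sequence, and a consumer could then find each channel by name rather than by position.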


-Travis


[Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Jim Jewett

Travis E. Oliphant wrote:

> Two packages need to share a chunk of memory (the package authors do not
> know each other and only have and Python as a common reference).  They
> both want to describe that the memory they are sharing has some
> underlying binary structure.

As a quick sanity check, please tell me where I went off track.

it sounds to me like you are assuming that:

(1)  The memory chunk represents a single object (probably an array of
some sort)
(2)  That subchunks can themselves be described by a (single?)
repeating C struct.
(3)  You can't just use the C header, since you want this at run-time.
(4)  It would be enough if you could say

This is an array of 500 elements that look like

struct {
  int  simple;
  struct nested {
   char name[30];
   char addr[45];
   int  amount;
  } nested;
};

(5)  But is it not acceptable to use Martin's suggested ctypes
equivalent of (building out from the inside):

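from ctypes import Structure, c_char, c_int

class nested(Structure):
    _fields_ = [("name", c_char*30), ("addr", c_char*45),
                ("amount", c_int)]   # C "int amount" maps to c_int

class struct(Structure):
    _fields_ = [("simple", c_int), ("nested", nested)]

struct * 500   # array type holding 500 such records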

If I misunderstood, could you show me where?

If I did understand correctly, could you expand on why (5) is
unacceptable, given that ctypes is now in the core?  (New and unknown,
I would understand -- but that is also true of any datatype proposal,
for the people who haven't already used it.  I suspect that any
differences from Numpy would be a source of pain for those who *have*
used Numpy, but following Numpy exactly is ... not much simpler than
the above.)
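
The ctypes definition in item (5) can be checked against a flat description of the same bytes. The sketch below renames the classes to `Nested` and `Record` to avoid shadowing the `struct` module, and uses a `struct` format string as one possible "single object" description; neither choice comes from the PEP.

```python
import ctypes
import struct

class Nested(ctypes.Structure):
    _fields_ = [("name", ctypes.c_char * 30),
                ("addr", ctypes.c_char * 45),
                ("amount", ctypes.c_int)]

class Record(ctypes.Structure):
    _fields_ = [("simple", ctypes.c_int), ("nested", Nested)]

Records = Record * 500            # the "array of 500 elements" type
record_size = ctypes.sizeof(Record)
array_size = ctypes.sizeof(Records)

# The same layout written as one flat struct-module format string.
flat_size = struct.calcsize("@i30s45si")
```

The ctypes version needs three type objects (two classes plus the array type) where the format string is a single value, which is the distinction the thread keeps returning to.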

Or are you just saying that "anything with a buffer interface should
also have a datatype object describing the layout in a standard way"?
If so, that makes sense, but I'm inclined to prefer the ctypes way, so
that most people won't ever have to worry about things like
endianness/strides/Fortran layout.

-jJ

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Diez B. Roggisch


> ...in the cases I have seen, which includes BMP, TGA, uncompressed TIFF,
> a handful of platform-specific bitmap formats, etc.,  you _always_ get
> them in RGBA order.  If the alpha channel is to be left out, then you
> get them as RGB.

Mac OS X unfortunately uses ARGB. Writing some AltiVec code remedied that for 
passing it around to the OpenCV library.

Just my $.02 

Diez

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Jack Jansen

Would it be possible to make the data-type objects subclassable, with
the subclasses being able to override the equality test?

The range of data types that you've specified in the PEP is good
enough for most general use, and probably for NumPy as well, but
someone already came up with the example of image formats, which have
a whole range of data formats of their own. I could throw in audio
formats (bits per sample, excess-N or signed or ulaw samples,
mono/stereo/5.1/etc, order of the channels), and there's probably a
whole slew of other areas that have their own sets of formats.

If the datatype objects are subclassable, modules could initially
start by adding their own formats. So, the "jackaudio" and "jillaudio"
modules would have distinct sets of formats. But then later on it
should be fairly easy for them to recognize each other's formats. So,
jackaudio would recognize the jillaudio format "msdos linear pcm" as
being identical to its own "16-bit excess-32768".

Hopefully eventually all audio module writers would get together and
define a set of standard audio formats.

--
Jack Jansen, <[EMAIL PROTECTED]>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma
Goldman
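
A rough sketch of what subclasses with an overridden equality test might look like. Everything here is hypothetical (`AudioFormat`, `JackFormat`, `JillFormat`, and the choice of fields), and it takes the claim that the two named formats share one layout at face value.

```python
class AudioFormat:
    """Hypothetical base class: a format is its (bits, encoding, channels)."""

    def __init__(self, name, bits, encoding, channels):
        self.name = name
        self.bits = bits
        self.encoding = encoding
        self.channels = channels

    def _key(self):
        # The human-readable name is deliberately left out of the comparison.
        return (self.bits, self.encoding, self.channels)

    def __eq__(self, other):
        if not isinstance(other, AudioFormat):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


class JackFormat(AudioFormat):
    pass


class JillFormat(AudioFormat):
    pass


jack = JackFormat("16-bit excess-32768", 16, "excess-32768", 1)
jill = JillFormat("msdos linear pcm", 16, "excess-32768", 1)
other = JillFormat("8-bit ulaw", 8, "ulaw", 1)
```

Here `jack == jill` holds even though the two objects come from different modules' subclasses and carry different names, while `other` compares unequal to both.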


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Nick Coghlan

Neal Becker wrote:
> I have watched numpy with interest for a long time.  My own interest is to
> possibly use the c-api to wrap c++ algorithms to use from python.
> 
> One thing that has concerned me, and continues to concern me with this
> proposal, is that it seems to suffer from a very fat interface.  I
> certainly have not studied the options in any depth, but my gut feeling is
> that the interface is too fat and too complex.  I wonder if it's possible
> to avoid this.  I wonder if this is an example of all the methods sinking
> to the base class.

You've just described my number one concern with incorporating NumPy wholesale, 
and the reason I believe it would be nice to cherry-pick a couple of key 
components for the standard library, rather than adopting the whole thing.

Travis has done a lot of work towards that goal (the latest result of which is 
this pre-PEP for describing the individual array elements in a way that is 
more flexible than the single character codes of the current array module).
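
For readers unfamiliar with the limitation being referred to, this sketch shows what the array module's single-character codes can and cannot say. The "int followed by a double" record is just an example.

```python
import array

doubles = array.array("d", [1.0, 2.0, 3.0])
code = doubles.typecode        # one character describes every element
item_bytes = doubles.itemsize

try:
    array.array("id")          # no way to say "an int followed by a double"
    compound_ok = True
except (TypeError, ValueError):
    compound_ok = False
```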

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Greg Ewing

Josiah Carlson wrote:

> ...in the cases I have seen ... you _always_ get
> them in RGBA order.

Except when you don't. I've had cases where I've had to
convert between RGBA and BGRA (for stuffing directly into
a frame buffer on Linux, as far as I remember).
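
The kind of conversion meant here is a channel swap on every pixel. A minimal sketch on a plain bytearray with invented pixel values, assuming 8 bits per channel:

```python
rgba = bytearray([10, 20, 30, 255,
                  40, 50, 60, 128])

# Swap every R with its B; the right-hand slices are copied before assignment.
bgra = bytearray(rgba)
bgra[0::4], bgra[2::4] = rgba[2::4], rgba[0::4]
```

Without some description of the channel order, the receiving code has no way to tell that this swap is needed.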

So it may be worth including some features in the standard
for describing pixel formats.

Pygame seems to have a very detailed and flexible system
for doing this, so it might be a good idea to have a
look at that.

--
Greg

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Greg Ewing

Travis E. Oliphant wrote:
> Martin v. Löwis wrote:
> 
>>Travis E. Oliphant schrieb:

>>Is it the intent of this PEP to support such data structures,
>>and allow the user to fill in a Unicode object, and then the
>>processing is automatic?

> No, the point of the data-format object is to communicate information 
> about data-formats not to encode or decode anything.

Well, there's still the issue of how much detail you
want to be able to convey, so I think the question
is valid. Is the encoding of a Unicode string something
we want to be able to communicate via this mechanism,
or is that outside its scope?

--
Greg

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Greg Ewing

Travis E. Oliphant wrote:

> Greg Ewing wrote:

>>What exactly does "bit" mean in that context?   
> 
> Do you mean "big" ?

No, you've got a data type there called "bit",
which seems to imply a size, in contradiction
to the size-independent nature of the other
types. I'm asking what size-independent
information it's meant to convey.

--
Greg

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Josiah Carlson

"Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Josiah Carlson schrieb:
> > One could also toss wxPython, VTK, or any one of the other GUI libraries
> > into the mix for visualizing those images, of which wxPython just
> > acquired no-copy display of PIL images, and being able to manipulate
> > them with numpy (of which some wxPython built in classes use numpy to
> > speed up manipulation) would be very useful.
> 
> I'm doubtful that this PEP alone would allow zero-copy sharing of images
> for display. Often, the libraries need the data in a different format.
> So they need to copy, even if they could understand the other format.
> However, the PEP won't allow "understanding" the format. If I know I
> have an array of 4-byte values: which of them is R, G, B, and A?

...in the cases I have seen, which includes BMP, TGA, uncompressed TIFF,
a handful of platform-specific bitmap formats, etc.,  you _always_ get
them in RGBA order.  If the alpha channel is to be left out, then you
get them as RGB.

The trick with allowing zero-copy sharing is 1) to understand the format,
and 2) to manipulate/display in-place.  The former is necessary for the
latter, which is what Travis is shooting for.  Also, because wxPython
has figured out how PIL images are structured, they can do #2, and so
far no one has mentioned any examples where the standard RGB/RGBA format
hasn't worked for them.

In the case of jpegs (as you mentioned in another message), PIL
uncompresses all images it understands into some kind of 'natural'
format (from what I understand). For 24/32 bit images, that is RGB or
RGBA. For palettized images (gif, 8-bit png, 8-bit bmp, etc.) maybe it
is a palettized format, or maybe it is RGB/RGBA?  I don't know, all of
my images are 24/32 bit, but I can just about guarantee it's not an
issue for the case that Paul mentioned.

 - Josiah


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Martin v. Löwis

Josiah Carlson schrieb:
> One could also toss wxPython, VTK, or any one of the other GUI libraries
> into the mix for visualizing those images, of which wxPython just
> acquired no-copy display of PIL images, and being able to manipulate
> them with numpy (of which some wxPython built in classes use numpy to
> speed up manipulation) would be very useful.

I'm doubtful that this PEP alone would allow zero-copy sharing of images
for display. Often, the libraries need the data in a different format.
So they need to copy, even if they could understand the other format.
However, the PEP won't allow "understanding" the format. If I know I
have an array of 4-byte values: which of them is R, G, B, and A?

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Martin v. Löwis

Paul Moore schrieb:
> Here's an example. PIL handles images (in various formats) in memory,
> as blocks of binary image data. NumPy provides methods for
> manipulating in-memory blocks of data. Now, if I want to use NumPy to
> manipulate that data in place (for example, to cap the red component
> at 128, and equalise the range of the green component) my code needs
> to know the format of the memory block that PIL exposes. I am assuming
> that in-place manipulation is better, because there is no need for
> repeated copies of the data to be made (this would be true for large
> images).

Thanks, that looks like a good example. Is it possible to elaborate
that? E.g. what specific image format would I use (could that work
for jpeg, even though this format has compression in it), and
what specific NumPy routines would I use to implement the capping
and equalising? What would the datatype description look like that
those tools need to exchange?

Looking at this in more detail, PIL in-memory images (ImagingCore
objects) either have the image8 UINT8**, or the image32 INT32**;
they have separate fields for pixelsize and linesize. In the image8
case, there are three options:
- each value is an 8-bit integer (IMAGING_TYPE_UINT8) (1)
- each value is a 16-bit integer, either little (2) or big endian (3)
  (IMAGING_TYPE_SPECIAL, mode either I;16 or I;16B)
In the image32 case, there are five options:
- two 8-bit values per four bytes, namely byte 0 and byte 3 (4)
- three 8-bit values (bytes 0, 1, 2) (5)
- four 8-bit values (6)
- a single 32-bit int (7)
- a single 32-bit float (8)
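
The eight layouts listed above can each be written down as a per-element description. The sketch below uses `struct` format strings purely as a convenient notation; it is not PIL's API and not the PEP's syntax, and the byte order chosen for (7) and (8) is an assumption.

```python
import struct

# Keys follow the numbering (1)-(8) in the list above; 'x' is a pad byte.
layouts = {
    1: "=B",     # one 8-bit integer
    2: "<H",     # 16-bit integer, little endian
    3: ">H",     # 16-bit integer, big endian
    4: "=B2xB",  # bytes 0 and 3 used, bytes 1 and 2 ignored
    5: "=3Bx",   # bytes 0, 1, 2 used, byte 3 ignored
    6: "=4B",    # four 8-bit values
    7: "=i",     # one 32-bit int (native byte order assumed)
    8: "=f",     # one 32-bit float (native byte order assumed)
}
sizes = {n: struct.calcsize(fmt) for n, fmt in layouts.items()}
two_band = struct.unpack(layouts[4], bytes([7, 0, 0, 255]))
```

What such a notation still does not say is what each value means (palette index, alpha, red), which is the gap field names are meant to fill.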

Now, what would be the algorithm in NumPy that I could use to
implement capping and equalising?

> If PIL could expose a descriptor for its data structure, NumPy code
> could manipulate it in place without fear of corrupting it. Of course,
> this can be done by the end user reading the PIL documentation and
> transcribing the documented format into the NumPy code. But I would
> argue that it's better if the PIL block is self-describing in a way
> that avoids the need for a manual transcription of the format.

Without digging further, I think some of the formats simply don't allow
for the kind of manipulation you suggest, namely all palette formats
(which are the single-valued ones, plus the two-band version with
 a palette number and an alpha value), and greyscale images. So
in any case, the application has to look at the mode of the image to
find out whether the operation is even meaningful. And then, the
application has to tell NumPy somehow what fields to operate on.

> To do this *without* needing the PIL and NumPy developers to
> co-operate needs an independent standard, which is what I assume this
> PEP is intended to provide.

Ok, I now understand the goal, although I would still like to understand
this usecase better.

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Josiah Carlson

"Paul Moore" <[EMAIL PROTECTED]> wrote:
> On 10/29/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > Travis E. Oliphant schrieb:
> > > Remember the context that the data-format object is presented in.  Two
> > > packages need to share a chunk of memory (the package authors do not
> > > know each other and only have and Python as a common reference).  They
> > > both want to describe that the memory they are sharing has some
> > > underlying binary structure.
> >
> > Can you please give an example of such two packages, and an application
> > that needs them share data?
> 
> To do this *without* needing the PIL and NumPy developers to
> co-operate needs an independent standard, which is what I assume this
> PEP is intended to provide.

One could also toss wxPython, VTK, or any one of the other GUI libraries
into the mix for visualizing those images, of which wxPython just
acquired no-copy display of PIL images, and being able to manipulate
them with numpy (of which some wxPython built in classes use numpy to
speed up manipulation) would be very useful.

Of all of the intended uses, I'd say that zero-copy sharing of
information on the graphics/visualization front is the most immediate
'people will be using it tomorrow' feature.

I personally don't have my finger on the pulse of the Scientific Python
community, so I don't know about other uses, but in regard to Martin's list of
missing features: "pointers, unions, function pointers,
alignment/packing [, etc.]" I'm going to go out on a limb and say for
the majority of those YAGNI, or really, NOHAFIAFACT (no one has asked
for it, as far as I can tell).  Someone who knows the scipy community,
feel free to correct me.

 - Josiah


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Paul Moore

On 10/29/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Travis E. Oliphant schrieb:
> > Remember the context that the data-format object is presented in.  Two
> > packages need to share a chunk of memory (the package authors do not
> > know each other and only have and Python as a common reference).  They
> > both want to describe that the memory they are sharing has some
> > underlying binary structure.
>
> Can you please give an example of such two packages, and an application
> that needs them share data?

Here's an example. PIL handles images (in various formats) in memory,
as blocks of binary image data. NumPy provides methods for
manipulating in-memory blocks of data. Now, if I want to use NumPy to
manipulate that data in place (for example, to cap the red component
at 128, and equalise the range of the green component) my code needs
to know the format of the memory block that PIL exposes. I am assuming
that in-place manipulation is better, because there is no need for
repeated copies of the data to be made (this would be true for large
images).
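
As a small illustration of the in-place capping described above, here is a sketch on a plain bytearray. It hard-codes the assumption that the block is 8-bit RGBA, which is exactly the knowledge a shared descriptor would supply; the pixel values are invented.

```python
raw = bytearray([200, 20, 30, 255,   # R, G, B, A of pixel 0
                 100, 50, 60, 255])  # R, G, B, A of pixel 1
view = memoryview(raw)               # no copy: view and raw share memory

red = view[0::4]                     # every fourth byte, starting at R
for i in range(len(red)):
    if red[i] > 128:
        red[i] = 128
```

If the block were actually ARGB or BGRA, the same code would silently cap the wrong channel.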

If PIL could expose a descriptor for its data structure, NumPy code
could manipulate it in place without fear of corrupting it. Of course,
this can be done by the end user reading the PIL documentation and
transcribing the documented format into the NumPy code. But I would
argue that it's better if the PIL block is self-describing in a way
that avoids the need for a manual transcription of the format.

To do this *without* needing the PIL and NumPy developers to
co-operate needs an independent standard, which is what I assume this
PEP is intended to provide.

Paul.

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Neal Becker

I have watched numpy with interest for a long time.  My own interest is to
possibly use the c-api to wrap c++ algorithms to use from python.

One thing that has concerned me, and continues to concern me with this
proposal, is that it seems to suffer from a very fat interface.  I
certainly have not studied the options in any depth, but my gut feeling is
that the interface is too fat and too complex.  I wonder if it's possible
to avoid this.  I wonder if this is an example of all the methods sinking
to the base class.


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Edward C. Jones

Travis E. Oliphant wrote:
 > It also bothers me that so many ways to describe binary data are
 > being used out there.  This is a problem that deserves being solved.

Is there a survey paper somewhere about binary formats? What formats are 
used in particle physics, bio-informatics, astronomy, etc? What software 
is used to read and write binary data? What descriptive languages are 
used for data (SQL, XML, etc)?

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Martin v. Löwis

Robert Kern schrieb:
>> As I unification mechanism, I think it is insufficient. I doubt it
>> can express all the concepts that ctypes supports.
> 
> What do you think is missing that can't be added?

I can factually only report what is missing. Whether it can be added,
I don't know. As I just wrote in a few other messages: pointers,
unions, function pointers, packed structs, incomplete/recursive
types. Also "flexible array members" (i.e. open-ended arrays).

While it may be possible to come up with a string syntax to describe
all these things (*), I wonder whether it should be done, and whether
NumArray can then support this extended data model.

Regards,
Martin

(*) perhaps with the exception of incomplete types: C needs forward
references in its own syntax.

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Martin v. Löwis

Travis E. Oliphant schrieb:
>> As I unification mechanism, I think it is insufficient. I doubt it
>> can express all the concepts that ctypes supports.
>>
> 
> Please clarify what you mean.
> 
> Are you saying that a single object can't carry all the information 
> about binary data that ctypes allows with it's multi-object approach?

I'm not sure what you mean by "single object". If I use the tuple
syntax, e.g.

datatype((float, (3,2)))

There are also multiple objects (the float, the 3, and the 2). You
get a single "root" object back, but so do you in ctypes.

But this isn't really what I meant. Instead, I think the PEP lacks
various concepts from C data types, such as pointers, unions,
function pointers, alignment/packing.

> In the mean-time, how are other packages supposed to communicate binary 
> information about data with each other?

This is my other question. Why should they?

> Remember the context that the data-format object is presented in.  Two 
> packages need to share a chunk of memory (the package authors do not 
> know each other and only have and Python as a common reference).  They 
> both want to describe that the memory they are sharing has some 
> underlying binary structure.

Can you please give an example of two such packages, and an application
that needs them to share data?

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Martin v. Löwis

Travis E. Oliphant schrieb:
> I'm proposing to add this object to Python so that the buffer protcol 
> has a fast and efficient way to share #3.   That's really all I'm after.

I admit that I don't understand this objective. Why is it desirable to
support such an extended buffer protocol? What specific application
would be made possible if it was available and implemented in the
relevant modules and data types? What are the relevant modules and data
types that should implement it?

> It also bothers me that so many ways to describe binary data are being 
> used out there.  This is a problem that deserves being solved.  And, no, 
> ctypes hasn't solved it (we can't directly use the ctypes solution). 
> Perhaps this PEP doesn't hit all the corners, but a data-format object 
> *is* a useful thing to consider.

IMO, it is only useful if it realistically can support all the use cases
that it intends to support. If this PEP is about defining the elements
of arrays, I doubt it can realistically support everything you can
express in ctypes. There is no support for pointers (except for
PyObject*), no support for incomplete (recursive) types, no support
for function pointers, etc.

Vice versa: why exactly can't you use the data type system of ctypes?
If I want to say "int[10]", I do

py> ctypes.c_long * 10


To rewrite the examples from the PEP:

datatype(float) => ctypes.c_double
datatype(int)   => ctypes.c_long
datatype((int, 5)) => ctypes.c_long * 5
datatype((float, (3,2))) => (ctypes.c_double * 2) * 3
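
These array types can be tried directly. One detail worth checking is the nesting order: assuming the (3,2) shape is meant in C (row-major) order, the ctypes type is built from the innermost dimension outwards. The variable names are illustrative.

```python
import ctypes

five_longs = ctypes.c_long * 5
# Build from the innermost dimension outwards: 3 rows of 2 doubles each.
grid_type = (ctypes.c_double * 2) * 3

grid = grid_type()
grid[2][1] = 7.5               # last row, last column
rows, cols = len(grid), len(grid[0])
```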

struct {
  int  simple;
  struct nested {
   char name[30];
   char addr[45];
   int  amount;
  } nested;
};
=>
py> from ctypes import *
py> class nested(Structure):
...   _fields_ = [("name", c_char*30), ("addr", c_char*45),
...               ("amount", c_int)]
...
py> class struct(Structure):
...   _fields_ = [("simple", c_int), ("nested", nested)]
...

> Guido seemed to think the data-type objects were nice when he saw them 
> at SciPy 2006, and so I'm presenting a PEP.

I have no objection to including NumArray as-is into Python. I just
wonder where the rationale for this PEP comes from, i.e. why do you
need to exchange this information across different modules?

> Without the data-format object, I'm don't know how to extend the buffer 
> protocol to communicate data-format information.  Do you have a better 
> idea?

See above: I can't understand where the need for an extended buffer
protocol comes from. I can see why NumArray needs reflection, and
needs to keep information to interpret the bytes in the array.
But why is it important that the same information is exposed by
other data types?

>> Is it the intent of this PEP to support such data structures,
>> and allow the user to fill in a Unicode object, and then the
>> processing is automatic? (i.e. in ID3v1, the string gets
>> automatically Latin-1-encoded and zero-padded, in ID3v2, it
>> gets automatically UTF-8 encoded, and null-terminated)
>>
> 
> No, the point of the data-format object is to communicate information 
> about data-formats not to encode or decode anything.   Users of the 
> data-format object could decide what they wanted to do with that 
> information.   We just need a standard way to communicate it through the 
> buffer protocol.

This was actually a different sub-thread: why do you need to support
the 'U' code (or the 'S' code, for that matter)? In what application
do you have fixed size Unicode arrays, as opposed to Unicode strings?
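
One place fixed-size string fields do turn up is the ID3v1 case quoted above, where the title occupies a fixed 30-byte, zero-padded Latin-1 field. A sketch with the `struct` module, which only illustrates the layout (the example title is invented):

```python
import struct

TITLE_FIELD = "30s"   # ID3v1 stores the title in a fixed 30-byte field

packed = struct.pack(TITLE_FIELD, "Caf\u00e9 del Mar".encode("latin-1"))
title = packed.rstrip(b"\x00").decode("latin-1")
```

The format describes only the size of the field; the Latin-1 encoding and the padding convention are knowledge the application has to bring itself.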

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant

Martin v. Löwis wrote:
> Travis E. Oliphant schrieb:
>> How to handle unicode data-formats could definitely be improved. 
> 
> As before, I'm doubtful what the actual needs are. For example, is
> it desired to support generation of ID3v2 tags with such a data
> format? The tag is specified here:
> 

Perhaps I was not clear enough about what I'm trying to do.   For a long 
time a lot of people have wanted something like Numeric in Python 
itself.  There have been many hurdles to that goal.

After discussions at SciPy 2006 with Guido, we decided that the best way 
to proceed at this point was to extend the buffer protocol to allow 
packages to share array-like information with each other.

There are several things missing from the buffer protocol that NumPy 
needs in order to be able to really understand the (fixed-size) memory 
another package has allocated and is sharing.

The most important of these are

1) Shape information
2) Striding information
3) Data-format information  (how is each element perceived).

Shape and striding information can be shared with a C-array of integers.
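
As a rough illustration of how the three pieces fit together, the sketch below reads one element out of a flat byte string given a shape, byte strides, and a per-element format. The helper name read_element and the use of struct format codes to stand in for the data-format are assumptions made for this example; they are not part of the proposal.

```python
import struct

def read_element(buf, index, shape, strides, fmt):
    """Return the element at `index` from a flat byte buffer.

    shape   -- number of elements along each dimension
    strides -- bytes to skip to advance one step along each dimension
    fmt     -- struct format code describing a single element
    """
    if len(index) != len(shape):
        raise IndexError("wrong number of indices")
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError("index out of range")
    # The byte offset is the dot product of the index and the strides.
    offset = sum(i * s for i, s in zip(index, strides))
    return struct.unpack_from(fmt, buf, offset)[0]

# A 2x3 array of little-endian 4-byte unsigned integers, stored row by row.
data = struct.pack("<6I", 10, 11, 12, 20, 21, 22)
shape = (2, 3)
strides = (12, 4)   # 12 bytes per row, 4 bytes per element

value = read_element(data, (1, 2), shape, strides, "<I")
```

Shape and strides are plain integers here, which is why a C array is enough for them; the format is the only piece that needs a richer description.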

How is data-format information supposed to be shared?

We've come up with a very flexible way to do this in NumPy using a 
single Python object.  This Python object supports describing the layout 
of any fixed-size chunk of memory (right now in units of bytes --- bit 
fields could be added, though).

I'm proposing to add this object to Python so that the buffer protocol 
has a fast and efficient way to share #3.   That's really all I'm after.

It also bothers me that so many ways to describe binary data are being 
used out there.  This is a problem that deserves being solved.  And, no, 
ctypes hasn't solved it (we can't directly use the ctypes solution). 
Perhaps this PEP doesn't hit all the corners, but a data-format object 
*is* a useful thing to consider.
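
To make the "many ways" concrete, here is a small sketch that describes the same 8-byte record with three standard-library mechanisms. The record layout (a 4-byte unsigned integer followed by a 4-byte float) and the names Record, ident and value are made up for the example.

```python
import array
import ctypes
import struct

# 1) struct: a format string ('=' means native byte order, standard sizes).
record_fmt = "=If"            # 4-byte unsigned int, 4-byte float
struct_size = struct.calcsize(record_fmt)

# 2) ctypes: a class with one type object per field.
class Record(ctypes.Structure):
    _fields_ = [("ident", ctypes.c_uint32),
                ("value", ctypes.c_float)]

ctypes_size = ctypes.sizeof(Record)

# 3) array: a one-character typecode, homogeneous elements only.
floats = array.array("f", [1.5, 2.5])
array_itemsize = floats.itemsize

# The same bytes can be produced by one mechanism and read by another,
# but only because both descriptions were written by hand to agree.
packed = struct.pack(record_fmt, 7, 1.5)
rec = Record.from_buffer_copy(packed)
```

None of the three descriptions can be handed to the other two modules directly, which is the gap being discussed.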

The array object in Python already has a PyArray_Descr * structure that 
is a watered-down version of what I'm talking about.   In fact, this is 
what Numeric built from (or vice-versa actually).  And NumPy has greatly 
enhanced this object for any conceivable structure.

Guido seemed to think the data-type objects were nice when he saw them 
at SciPy 2006, and so I'm presenting a PEP.

Without the data-format object, I don't know how to extend the buffer 
protocol to communicate data-format information.  Do you have a better 
idea?

I have no trouble limiting the data-type object to the buffer protocol 
extension PEP, but I do think it could gain wider use.

> 
> Is it the intent of this PEP to support such data structures,
> and allow the user to fill in a Unicode object, and then the
> processing is automatic? (i.e. in ID3v1, the string gets
> automatically Latin-1-encoded and zero-padded, in ID3v2, it
> gets automatically UTF-8 encoded, and null-terminated)
>

No, the point of the data-format object is to communicate information 
about data-formats not to encode or decode anything.   Users of the 
data-format object could decide what they wanted to do with that 
information.   We just need a standard way to communicate it through the 
buffer protocol.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant

Martin v. Löwis wrote:
> Travis E. Oliphant schrieb:
>> What is needed is a definitive way to describe data and then have
>>
>> array
>> struct
>> ctypes
>>
>> all be compatible with that same method.  That's why I'm proposing the 
>> PEP.  It's a unification effort not yet-another-method.
> 
> As a unification mechanism, I think it is insufficient. I doubt it
> can express all the concepts that ctypes supports.
> 

Please clarify what you mean.

Are you saying that a single object can't carry all the information 
about binary data that ctypes allows with its multi-object approach?

I don't agree with you, if that is the case.  Sure, perhaps I've not 
included certain cases, so give an example.

Besides, I don't think this is the right view of "unification".  I'm not 
saying that ctypes should get rid of its many objects used for 
interfacing with C-functions.

I'm saying we should introduce a single-object mechanism for describing 
binary data so that the many-object approach of ctypes does not become 
some kind of de-facto standard.  ctypes can "translate" this 
object-instance to its internals if and when it needs to.

In the meantime, how are other packages supposed to communicate binary 
information about data with each other?

Remember the context that the data-format object is presented in.  Two 
packages need to share a chunk of memory (the package authors do not 
know each other and only have Python as a common reference).  They 
both want to describe that the memory they are sharing has some 
underlying binary structure.

How do they do that? Please explain to me how the buffer protocol can be 
extended so that information about "what is in the memory" can be shared 
without a data-format object?

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Martin v. Löwis

Travis E. Oliphant schrieb:
> How to handle unicode data-formats could definitely be improved. 

As before, I'm doubtful what the actual needs are. For example, is
it desired to support generation of ID3v2 tags with such a data
format? The tag is specified here:

http://www.id3.org/id3v2.4.0-structure.txt

In ID3v1, text fields have a specified width, and are supposed
to be encoded in Latin-1, and padded with zero bytes.

In ID3v2, text fields start with an encoding declaration
(say, \x03 for UTF-8), then followed with a null-terminated
sequence of UTF-8 bytes.
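
For concreteness, the two conventions just described could be sketched as follows. This follows only the description in this message, not the full ID3 specifications, and the function names are invented for the example.

```python
def id3v1_text(text, width):
    """Fixed-width field: Latin-1 encoded, padded with zero bytes."""
    raw = text.encode("latin-1")
    if len(raw) > width:
        raise ValueError("text does not fit in the field")
    return raw.ljust(width, b"\x00")

def id3v2_utf8_text(text):
    """Variable-width field: encoding byte 0x03, UTF-8 bytes, terminator."""
    return b"\x03" + text.encode("utf-8") + b"\x00"

fixed = id3v1_text("Café", 8)
variable = id3v2_utf8_text("Café")
```

The first result always has the declared width; the length of the second depends on the text, so it cannot be described as a fixed-size chunk of memory.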

Is it the intent of this PEP to support such data structures,
and allow the user to fill in a Unicode object, and then the
processing is automatic? (i.e. in ID3v1, the string gets
automatically Latin-1-encoded and zero-padded, in ID3v2, it
gets automatically UTF-8 encoded, and null-terminated)

If that is not to be supported, what are the use cases?

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Robert Kern

Martin v. Löwis wrote:
> Travis E. Oliphant schrieb:
>> What is needed is a definitive way to describe data and then have
>>
>> array
>> struct
>> ctypes
>>
>> all be compatible with that same method.  That's why I'm proposing the 
>> PEP.  It's a unification effort not yet-another-method.
> 
> As a unification mechanism, I think it is insufficient. I doubt it
> can express all the concepts that ctypes supports.

What do you think is missing that can't be added?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant

Greg Ewing wrote:
> Travis E. Oliphant wrote:
> 
>> How to handle unicode data-formats could definitely be improved. 
>> Suggestions are welcome.
> 
> 'U4*10'  string of 10 4-byte Unicode chars
> 

I like that.  Thanks.

-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Martin v. Löwis

Travis E. Oliphant schrieb:
> What is needed is a definitive way to describe data and then have
> 
> array
> struct
> ctypes
> 
> all be compatible with that same method.  That's why I'm proposing the 
> PEP.  It's a unification effort not yet-another-method.

As a unification mechanism, I think it is insufficient. I doubt it
can express all the concepts that ctypes supports.

Regards,
Martin


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant

Greg Ewing wrote:
> Nick Coghlan wrote:
> 
>> Greg Ewing wrote:
> 
>>> Also, what if I want to refer to fields by name
>>> but don't want to have to work out all the offsets
> 
>> Use the list definition form. With the changes I've 
>> suggested above, you wouldn't even have to name the fields you don't 
>> care about - just describe them.
> 
> That would be okay.
> 
> I still don't see a strong justification for having a
> one-big-string form as well as a list/tuple/dict form,
> though.

Compaction of representation is all. It's used quite a bit in numarray, 
which is where most of the 'kind' names came from as well.   When you 
don't want to name fields it is a really nice feature (but it doesn't 
nest well).

-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant

Greg Ewing wrote:
> Travis E. Oliphant wrote:
> 
>> The 'kind' does not specify how "big" the data-type (data-format) is.
> 
> What exactly does "bit" mean in that context?   

Do you mean "big" ?  It's how many bytes the kind is using.

So, 'u4' is a 4-byte unsigned integer and 'u2' is a 2-byte unsigned 
integer.
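
One way to see the kind-plus-byte-count convention is to map such codes onto struct format characters. The table and the function name to_struct_format below are assumptions for illustration only; the '=' prefix asks struct for its standard sizes so the byte counts match on every platform.

```python
import struct

# (kind, number of bytes) -> struct format character
_STRUCT_CODES = {
    ("u", 1): "B", ("u", 2): "H", ("u", 4): "I", ("u", 8): "Q",
    ("i", 1): "b", ("i", 2): "h", ("i", 4): "i", ("i", 8): "q",
    ("f", 4): "f", ("f", 8): "d",
}

def to_struct_format(code):
    """Translate a code such as 'u4' into a struct format such as '=I'."""
    kind, nbytes = code[0], int(code[1:])
    try:
        return "=" + _STRUCT_CODES[(kind, nbytes)]
    except KeyError:
        raise ValueError("no struct equivalent for %r" % code)

u4_size = struct.calcsize(to_struct_format("u4"))
u2_size = struct.calcsize(to_struct_format("u2"))
```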


-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant

Greg Ewing wrote:
> Nick Coghlan wrote:
>> I'd say the answer to where we put it will be dependent on what happens
>> to the idea of adding a NumArray style fixed dimension array type to the
>> standard library. If that gets exposed through the array module as
>> array.dimarray, then it would make sense to expose the associated data
>> layout descriptors as array.datatype.
> 
> Seem to me that arrays are a sub-concept of binary data,
> not the other way around. So maybe both arrays and data
> types should be in a module called 'binary' or some such.

Yes, very good point.

That's probably one reason I'm proposing the data-type first before the 
array interface in the extended buffer protocol.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Greg Ewing

Travis E. Oliphant wrote:

> How to handle unicode data-formats could definitely be improved. 
> Suggestions are welcome.

'U4*10'  string of 10 4-byte Unicode chars

Then for consistency you'd want 'S*10' rather than
just 'S10' (or at least allow it as an alternative).
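
A hypothetical parser for this notation might look like the sketch below. The grammar (one letter, optional unit size, '*', count) and the default of one byte per unit when the size is omitted are assumptions read off the two examples above, not anything the PEP specifies.

```python
import re

_PATTERN = re.compile(r"^([A-Za-z])(\d*)\*(\d+)$")

def parse_repeated(code):
    """Split a code such as 'U4*10' into (kind, unit_bytes, count)."""
    m = _PATTERN.match(code)
    if m is None:
        raise ValueError("not a 'kind[size]*count' code: %r" % code)
    kind, unit, count = m.groups()
    unit_bytes = int(unit) if unit else 1  # assumed default: 1 byte per unit
    return kind, unit_bytes, int(count)

def total_bytes(code):
    """Total size in bytes of the field described by `code`."""
    _, unit_bytes, count = parse_repeated(code)
    return unit_bytes * count

unicode_field = parse_repeated("U4*10")
string_field = parse_repeated("S*10")
```

With this form the digit after the letter always means "bytes per unit", as it does for 'u4', and the count is kept separate.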

--
Greg

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Greg Ewing

Travis E. Oliphant wrote:

> The 'kind' does not specify how "big" the data-type (data-format) is.

What exactly does "bit" mean in that context?

--
Greg

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Greg Ewing

Nick Coghlan wrote:
> I'd say the answer to where we put it will be dependent on what happens
> to the idea of adding a NumArray style fixed dimension array type to the
> standard library. If that gets exposed through the array module as
> array.dimarray, then it would make sense to expose the associated data
> layout descriptors as array.datatype.

Seem to me that arrays are a sub-concept of binary data,
not the other way around. So maybe both arrays and data
types should be in a module called 'binary' or some such.

--
Greg

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Greg Ewing

Nick Coghlan wrote:

> Greg Ewing wrote:

>> Also, what if I want to refer to fields by name
>> but don't want to have to work out all the offsets

> Use the list definition form. With the changes I've 
> suggested above, you wouldn't even have to name the fields you don't 
> care about - just describe them.

That would be okay.

I still don't see a strong justification for having a
one-big-string form as well as a list/tuple/dict form,
though.

--
Greg


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Travis E. Oliphant

Martin v. Löwis wrote:
> Travis E. Oliphant schrieb:
>> In this case, the 'kind' does not specify how large the data-type is. 
>> You can have 'u1', 'u2', 'u4', etc.
>>
>> The same is true with Unicode.  You can have 10-character unicode 
>> elements, 20-character, etc.  But, we have to be clear about what a 
>> "character" is in the data-format.
> 
> That is certainly confusing. In u1, u2, u4, the digit seems to indicate
> the size of a single value (1 byte, 2 bytes, 4 bytes). Right? Yet,
> in U20, it does *not* indicate the size of a single value but of an
> array? And then, it's not the size, but the number of elements?
> 

Good point.  In NumPy, unicode support was added "in parallel" with 
string arrays where there is not the ambiguity.   So, yes, it's true 
that the unicode case is a special-case.

The other way to handle it would be to describe the 'code'-point size 
(i.e. 'U1', 'U2', 'U4' for UCS-1, UCS-2, UCS-4) and then have the length 
be encoded as an "array" of those types.

This was not the direction we took with NumPy (which is what I'm using 
as a reference) because I wanted Unicode and string arrays to look the 
same and thought of strings differently.

How to handle unicode data-formats could definitely be improved. 
Suggestions are welcome.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread M.-A. Lemburg

Travis E. Oliphant wrote:
> M.-A. Lemburg wrote:
>> Travis E. Oliphant wrote:
>>> M.-A. Lemburg wrote:
>>>> Travis E. Oliphant wrote:
>>>>>
>>>>> PEP:
>>>>> Title: Adding data-type objects to the standard library
>>>>>   Attributes
>>>>>
>>>>>  kind  --  returns the basic "kind" of the data-type. The basic kinds
>>>>>  are:
>>>>>    't' - bit,
>>>>>    'b' - bool,
>>>>>    'i' - signed integer,
>>>>>    'u' - unsigned integer,
>>>>>    'f' - floating point,
>>>>>    'c' - complex floating point,
>>>>>    'S' - string (fixed-length sequence of char),
>>>>>    'U' - fixed length sequence of UCS4,
>>>> Shouldn't this read "fixed length sequence of Unicode" ?!
>>>> The underlying code unit format (UCS2 and UCS4) depends on the
>>>> Python version.
>>> Well, in NumPy 'U' always means UCS4.  So, I just copied that over.  See 
>>> my questions at the bottom which talk about how to handle this.  A 
>>> data-format does not necessarily have to correspond to something Python 
>>> represents with an Object.
>> Ok, but why are you being specific about UCS4 (which is an internal
>> storage format), while you are not specific about e.g. the
>> internal bit size of the integers (which could be 32 or 64 bit) ?
>>
> 
> The 'kind' does not specify how "big" the data-type (data-format) is.  A 
> number is needed to represent the number of bytes.
> 
> In this case, the 'kind' does not specify how large the data-type is. 
> You can have 'u1', 'u2', 'u4', etc.
> 
> The same is true with Unicode.  You can have 10-character unicode 
> elements, 20-character, etc.  But, we have to be clear about what a 
> "character" is in the data-format.

I understand and that's why I'm asking why you made the range
explicit in the definition.

The definition should talk about Unicode code points.
The number of bytes then determines whether you can only
represent the ASCII subset (1 byte), UCS2 (2 bytes, BMP only)
or UCS4 (4 bytes, all currently assigned code points).

This is similar to the range for integers (ie. ZZ_0), where
the number of bytes determines the range of numbers that can
be represented.
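
To illustrate the analogy with integer ranges, the sketch below picks the smallest of the three widths mentioned above that can hold every code point of a given string. The function name is made up, and the thresholds simply restate the ASCII / BMP / full-range split from this message.

```python
def min_code_unit_bytes(text):
    """Smallest of 1, 2 or 4 bytes able to hold every code point in `text`."""
    highest = max(map(ord, text), default=0)
    if highest < 0x80:        # ASCII subset
        return 1
    if highest < 0x10000:     # Basic Multilingual Plane (UCS2)
        return 2
    return 4                  # any code point (UCS4)

ascii_width = min_code_unit_bytes("abc")
bmp_width = min_code_unit_bytes("\u00e9")          # e with acute accent
astral_width = min_code_unit_bytes("\U0001F600")   # outside the BMP
```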

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Martin v. Löwis

Travis E. Oliphant schrieb:
> In this case, the 'kind' does not specify how large the data-type is. 
> You can have 'u1', 'u2', 'u4', etc.
> 
> The same is true with Unicode.  You can have 10-character unicode 
> elements, 20-character, etc.  But, we have to be clear about what a 
> "character" is in the data-format.

That is certainly confusing. In u1, u2, u4, the digit seems to indicate
the size of a single value (1 byte, 2 bytes, 4 bytes). Right? Yet,
in U20, it does *not* indicate the size of a single value but of an
array? And then, it's not the size, but the number of elements?
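
The inconsistency can be spelled out in a few lines. The sketch below computes the number of bytes per element under the convention described in this thread, where the digits after 'U' count UCS4 characters while the digits after every other kind count bytes; the function name itemsize is chosen for the example.

```python
UCS4_BYTES = 4  # per the thread, 'U' always means UCS4 in NumPy

def itemsize(code):
    """Bytes occupied by one element described by a code such as 'u4'."""
    kind, number = code[0], int(code[1:])
    if kind == "U":
        return number * UCS4_BYTES   # number counts characters
    return number                    # number counts bytes

unsigned_size = itemsize("u4")
string_size = itemsize("S10")
unicode_size = itemsize("U20")
```

The special case in the 'U' branch is exactly the ambiguity being questioned here.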

Regards,
Martin

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Travis E. Oliphant

Armin Rigo wrote:
> Hi Travis,
> 
> On Fri, Oct 27, 2006 at 02:05:31PM -0600, Travis E. Oliphant wrote:
>> This PEP proposes adapting the data-type objects from NumPy for
>> inclusion in standard Python, to provide a consistent and standard
>> way to discuss the format of binary data. 
> 
> How does this compare with ctypes?  Do we really need yet another,
> incompatible way to describe C-like data structures in the standard
> library?

Part of what the data-type, data-format object is trying to do is bring 
together all the disparate ways to represent data that *already* exist 
in the standard library.

What is needed is a definitive way to describe data and then have

array
struct
ctypes

all be compatible with that same method.  That's why I'm proposing the 
PEP.  It's a unification effort not yet-another-method.  One of the big 
reasons for it is to move something like the array interface into 
Python.  There are tens to hundreds of people mostly in the scientific 
computing community that want to see Python grow more support for 
NumPy-like things.  I keep getting requests to "do something" to make 
Python more aware of arrays.   This PEP is part of that effort.

In particular, something like the array interface should be available in 
Python.  The easiest way to do this is to extend the buffer protocol to 
allow objects to share information about shape, strides, and data-format 
of a block of memory.

But, how do you represent data-format in Python?  What will the objects 
pass back and forth to each other to do it?  ctypes has a solution 
which creates multiple objects to do it.  This is an unwieldy, 
over-complicated solution for the array interface.

The array objects have a solution using a single object that carries 
the data-format information. The solution we have for arrays deserves 
consideration.  It could be placed inside the array module if desired, 
but again, I'm really looking for something that would allow the extended 
buffer protocol (to be proposed soon) to share data-type information.

That could be done with the array-interface objects (strings, lists, and 
tuples), but then everybody who uses the interface will have to write 
their own "decoders" to process the data-format information.
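
As a sketch of what such a hand-written decoder involves, the code below turns a list of (name, code) tuples into a struct format string and a table of field offsets. The small code table, the packed little-endian layout, and the field names are all assumptions for the example; a real consumer would have to cover many more cases.

```python
import struct

# A deliberately small table; every consumer would need its own copy.
_CODES = {"u1": "B", "u2": "H", "u4": "I", "i4": "i", "f4": "f", "f8": "d"}

def decode_fields(fields):
    """Turn [(name, code), ...] into (struct_format, {name: offset})."""
    fmt = "<"   # little-endian, no padding: assumed for the example
    offsets = {}
    for name, code in fields:
        # The size of the format so far is where the next field starts.
        offsets[name] = struct.calcsize(fmt)
        fmt += _CODES[code]
    return fmt, offsets

fields = [("ident", "u4"), ("flags", "u2"), ("value", "f8")]
fmt, offsets = decode_fields(fields)
record_size = struct.calcsize(fmt)
```

Each package repeating this kind of translation is the duplication a shared data-format object is meant to avoid.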

I actually think ctypes would benefit from this data-format 
specification too.

Recognizing all these diverging ways to essentially talk about the same 
thing is part of what prompted this PEP.

-Travis


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Travis E. Oliphant

M.-A. Lemburg wrote:
> Travis E. Oliphant wrote:
>> M.-A. Lemburg wrote:
>>> Travis E. Oliphant wrote:
 

>>>> PEP:
>>>> Title: Adding data-type objects to the standard library
>>>>   Attributes
>>>>
>>>>  kind  --  returns the basic "kind" of the data-type. The basic kinds
>>>>  are:
>>>>    't' - bit,
>>>>    'b' - bool,
>>>>    'i' - signed integer,
>>>>    'u' - unsigned integer,
>>>>    'f' - floating point,
>>>>    'c' - complex floating point,
>>>>    'S' - string (fixed-length sequence of char),
>>>>    'U' - fixed length sequence of UCS4,
>>> Shouldn't this read "fixed length sequence of Unicode" ?!
>>> The underlying code unit format (UCS2 and UCS4) depends on the
>>> Python version.
>> Well, in NumPy 'U' always means UCS4.  So, I just copied that over.  See 
>> my questions at the bottom which talk about how to handle this.  A 
>> data-format does not necessarily have to correspond to something Python 
>> represents with an Object.
> 
> Ok, but why are you being specific about UCS4 (which is an internal
> storage format), while you are not specific about e.g. the
> internal bit size of the integers (which could be 32 or 64 bit) ?
> 

The 'kind' does not specify how "big" the data-type (data-format) is.  A 
number is needed to represent the number of bytes.

In this case, the 'kind' does not specify how large the data-type is. 
You can have 'u1', 'u2', 'u4', etc.

The same is true with Unicode.  You can have 10-character unicode 
elements, 20-character, etc.  But, we have to be clear about what a 
"character" is in the data-format.

-Travis





Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Fredrik Lundh

Josiah Carlson wrote:

> I think that even on 64 bit platforms, using 'int' or 'long' generally
> means 32 bit.  In order to get 64 bit ints, one needs to use 'long long'.

real 64-bit platforms use the LP64 standard, where long and pointers are 
both 64 bits:

 http://www.unix.org/version2/whatsnew/lp64_wp.html
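
The sizes in question can be checked on any given machine with a few lines. This is only a probe of the local platform, and the variable names are chosen for the example; the native size of long will differ between LP64 systems and others.

```python
import ctypes
import struct

native_long = struct.calcsize("l")        # platform C long: 4 or 8 bytes
standard_long = struct.calcsize("=l")     # struct "standard" size: always 4
long_long = struct.calcsize("q")          # C long long: 8 bytes
pointer = ctypes.sizeof(ctypes.c_void_p)  # size of a pointer

# LP64 means long and pointers are both 64 bits wide.
is_lp64 = (native_long == 8 and pointer == 8)

print("long:", native_long, "long long:", long_long,
      "pointer:", pointer, "LP64:", is_lp64)
```

Because the native size of long varies, codes that state the byte count explicitly avoid the question altogether.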




Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Josiah Carlson


"M.-A. Lemburg" wrote:
> 
> Travis E. Oliphant wrote:
> > M.-A. Lemburg wrote:
> >> Travis E. Oliphant wrote:
> >>> 
> >>>
> >>> PEP: 
> >>> Title: Adding data-type objects to the standard library
> >>>   Attributes
> >>>
> >>>  kind  --  returns the basic "kind" of the data-type. The basic 
> >>> kinds
> >>>  are:
> >>>'t' - bit, 
> >>>'b' - bool, 
> >>>'i' - signed integer, 
> >>>'u' - unsigned integer,
> >>>'f' - floating point,  
> >>>'c' - complex floating point, 
> >>>'S' - string (fixed-length sequence of char),
> >>>'U' - fixed length sequence of UCS4,
> >> Shouldn't this read "fixed length sequence of Unicode" ?!
> >> The underlying code unit format (UCS2 and UCS4) depends on the
> >> Python version.
> > 
> > Well, in NumPy 'U' always means UCS4.  So, I just copied that over.  See 
> > my questions at the bottom which talk about how to handle this.  A 
> > data-format does not necessarily have to correspond to something Python 
> > represents with an Object.
> 
> Ok, but why are you being specific about UCS4 (which is an internal
> storage format), while you are not specific about e.g. the
> internal bit size of the integers (which could be 32 or 64 bit) ?

I think that even on 64 bit platforms, using 'int' or 'long' generally
means 32 bit.  In order to get 64 bit ints, one needs to use 'long long'. 
Sharing some of the codes with the struct module, though arbitrary,
doesn't seem like a bad idea to me.  Of course offering specifically 32
and 64 bit ints would make sense to me.

 - Josiah

